Thu 24 Apr, 2003 04:14 pm
Was George Eliot a woman? New Bar-Ilan algorithm can tell you
By JUDY SIEGEL-ITZKOVICH
Graphologists claim they can identify personal characteristics by examining handwriting, but Bar-Ilan University computer experts say they can determine with an algorithm they developed whether the author of printed English text is female or male, the approximate age, and the educational and linguistic background of the author.
Moshe Koppel and Shlomo Argamon (the latter now at the Illinois Institute of Technology in Chicago), working with the linguist Jonathan Fine and graduate student Anat Shimony, say that using a handful of linguistic cues, their program correctly determines the gender of the author of a previously unseen document about five times out of six. Their study, called "Automatically Categorizing Written Texts by Author Gender," will be published next week in Literary and Linguistic Computing, a journal published by Oxford University Press.
The program could at least theoretically know that classical author George Eliot was a woman, and that even though Benjamin Franklin used the pseudonym "The Old Widow Dogwood" as a starting journalist, he could be identified as a man.
Koppel said he and his team addressed the challenge out of academic curiosity, not thinking particularly about practical applications, although there are numerous ones. They have not patented the algorithm. A clear use is forensic - to help identify a Unabomber, for example, by something he wrote, or the writer of a suicide note. It could also be used to help detect plagiarism, if, for example, a male stole the text of a female.
"It could be used for undesirable purposes, such as marketing," Koppel said. If a person sends an e-mail of significant length to a company, the gender and background of the writer could be determined. He jokes that the number of spam e-mails promoting "treatments" for penile extensions could thus be cut in half and sent only to males. But seriously, he said, the recipient of e-mail who is not available could customize an automatic reply according to a prediction of the gender and background of the sender.
Techniques from a branch of computer science known as "machine learning" are used to program a computer to analyze examples of male and female writing. The computer is programmed to learn for itself how to distinguish between them based on statistical regularities it finds in the examples. The principles that the computer learns are then applied to other documents that it has not seen before.
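To make the learn-from-examples idea concrete, here is a minimal sketch in Python. It is not the researchers' program, and the article does not say which learning method they used: this uses a simple nearest-centroid classifier as a stand-in. The cue words in CUES, the labels "A" and "B", and the training sentences are all invented for illustration; in the study, the labels would be the authors' genders and the examples would be real documents.

```python
from collections import Counter

# Hypothetical cue words, picked only for this illustration.
CUES = ["you", "she", "we", "the", "of", "two"]

# Invented toy examples; "A" and "B" are placeholder class labels.
TRAINING = [
    ("you and i know she said we would", "A"),
    ("we hope you like what she wrote", "A"),
    ("the size of the two boxes", "B"),
    ("the cost of the first of the units", "B"),
]


def features(text):
    """Relative frequency of each cue word in the text."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return [counts[cue] / total for cue in CUES]


def train(examples):
    """Learn one average feature vector (centroid) per label."""
    sums = {}
    seen = Counter()
    for text, label in examples:
        acc = sums.setdefault(label, [0.0] * len(CUES))
        for i, value in enumerate(features(text)):
            acc[i] += value
        seen[label] += 1
    return {label: [v / seen[label] for v in acc] for label, acc in sums.items()}


def predict(model, text):
    """Return the label whose centroid is closest to the text's features."""
    vec = features(text)

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))


model = train(TRAINING)
# Apply the learned model to text it has not seen before.
print(predict(model, "she told you we could go"))
print(predict(model, "the weight of the two parts"))
```

The point is only the shape of the process: count some cues in labeled examples, let the program work out the statistical regularities, then apply them to a new document. A real system would need far more text and far more cues than this toy.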
The Bar-Ilan University study was carried out on about 600 books and articles taken from the British National Corpus, a massive collection of written texts assembled for the purpose of linguistics research. The researchers found that the same differences between male and female authors held across the entire range of topics in the corpus, including art, politics, science, biography, and many other areas. In a companion paper to appear this summer in the prestigious linguistics journal Text, the authors report that in both fiction and non-fiction writing, women writers tend to use words indicating a relationship between the writer and reader more often than men, while men tend to use more words describing, specifying, and quantifying things than do women. The causes of these differences are still unclear.
Books by authors Deborah Tannen and John Gray have popularized the idea that men and women have different conversational styles. In conversation, Koppel said, women tend to apologize more, while men interrupt more. The new Bar-Ilan University findings show that gender differences extend even to writing styles in books and articles. This is surprising, since unlike participation in a conversation, writing books and articles does not involve direct social interaction. Koppel said that he also tried the program on texts from the Bar-Ilan Responsa Project, where it detected different writing styles by unidentified German and French authors.
Internet Edition of Jerusalem Post ( 5/25/03 )
© 1995-2003, The Jerusalem Post - All rights reserved.
Mary Ann Evans (George Eliot)
How is that relevant to the scientific research, as stated in the paper?
Was George Eliot A Woman?
New Haven- Very interesting study. What the researchers have done is to evaluate technologically (therefore more quickly and efficiently) what psychologists have known for a long time. Men and women DO interact differently. Their ways of relating to persons and/or objects are somewhat different. As a result, these differences would emerge when putting pen to paper, or fingers to keyboard.
Excellent response, Phoenix!
Not sure why we need an algorithm when we already know the answer to this question. Right on, dys!
Phoenix, But when I mentioned several weeks after joining A2K that I'm female, there was some surprise! And think of the women who fought, disguised as men, in the Civil War and got away with it.
There were those who thought the novels of the Bronte sisters were actually written by their ne'er-do-well brother, Branwell. Of course, the reality was that Branwell spent most of his quality time in the village pub, but still. He was a guy. Hence the supposition of his authorship.
Not, of course, that this relates to this discussion...
ehbeth?
How can you be sure?
Tartarin:
You've got it right. It's the basis for transsexuality.
I might add that more than a few women married other women, disguised as men. They never knew the difference till the very end.