Give linguists 140 characters and they’ll predict whether you’re a guy or a girl two times out of three.
“Remember when the Gay Girl in Damascus revealed himself as a middle-aged man from Georgia? On a platform like Twitter, which doesn’t ask for much biographical information, it’s easy (and fun!) to take on a fake persona, but now linguistic researchers have developed an algorithm that can predict the gender of a tweeter based solely on the 140 characters they choose to tweet. The research is based on the idea that women use language differently than men. ‘The mere fact of a tweet containing an exclamation mark or a smiley face meant that odds were a woman was tweeting, for instance,’ reports David Zax. Other research corroborates these findings, showing that women tend to use emoticons, abbreviations, repeated letters and expressions of affection more than men, and linguists have also developed a list of gender-skewed words used more often by women, including love, ha-ha, cute, omg, yay, hahaha, happy, girl, hair, lol, hubby, and chocolate. Remarkably, even when only provided with one tweet, the program could correctly identify gender 65.9% of the time. Depending on how successful the program is proven to be, it could be used for ad-targeting, or for socio-linguistic research.”
This result follows a recent spate of articles in the mainstream media arguing that language reflects how you think. While emphasizing cultural rather than gender divergences, some of this research suggests profoundly different worldviews. For example, the Pormpuraaw people of Aboriginal Australia, whose language describes space with compass directions rather than relative terms like left and right, speak of “my southwest foot” instead of “my left foot.”