The language used by micro-bloggers on the Twitter network characteristically scores lower on ‘reading ease’ criteria than other short-form communication.
Like SMS messaging before it, social network communication suffers from a reputation for breaking language rules, which some commentators see as an indication that young people are becoming almost illiterate. In non-English-language contexts, the constant use of anglicisms is thought to be exacerbating this trend. This analysis is of course simplistic at best and, according to a recent study, may even be entirely false. In partnership with the Microsoft research centre, James R.A. Davenport, a graduate student in the Department of Astronomy at the University of Washington, has just published a comparative analysis of the complexity of language used in tweets, entitled ‘The Readability of Tweets and their Geographic Correlation with Education’. His study reveals that, although there are some variations, the average reading ease of micro-bloggers’ messages remains relatively constant across the board, with no clear correlation with users’ level of education.
Lower ‘reading ease’ scores
The researchers analysed more than 17 million messages tweeted in English by users located mainly in the United States. The definition of ‘readability’ they applied to these tweets is based on a methodology known as the Flesch Reading Ease Formula. This approach was designed to predict the likelihood that a piece of writing will be easily understood by people at a given level of educational attainment. It does not compare individual terms with a set of precise terminology, but instead makes a calculation based on the number of words per sentence and the number of syllables per word, in order to arrive at a score showing how complicated a piece of text is. Mr Davenport and his colleagues then took the 2% of tweets that were geo-located and compared them with Zip Code Tabulation Area (ZCTA) education level data from the US Census. The results did not show any significant correlation between the reading ease (RE) of the tweets and levels of education among the population. However, the researchers did find that the language of Twitter is consistently more ‘sophisticated’ – i.e. it shows a lower RE score – than other similar short-form media such as chat rooms and SMS. In fact the language used in tweets seems to be more formal and complex than that used in primarily one-to-one conversations such as SMS.
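For readers curious about how such a score is worked out, here is a minimal sketch in Python. It uses the standard published constants of the Flesch Reading Ease formula (206.835 minus 1.015 times the average number of words per sentence, minus 84.6 times the average number of syllables per word). The function names and the rough vowel-group syllable counter are illustrative assumptions, not the researchers' own code, and the study may have handled tweet-specific features such as hashtags or links differently.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (an assumption, not dictionary-accurate):
    count groups of consecutive vowels, with a small silent-'e' adjustment."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent (e.g. "late"), but keep "-le" (e.g. "table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    # Sentences are approximated by splitting on ., ! and ?
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally with one internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    simple = "The cat sat on the mat."
    dense = "Institutional heterogeneity complicates interpretation considerably."
    print(round(flesch_reading_ease(simple), 3))
    print(round(flesch_reading_ease(dense), 3))
```

Short sentences made of short words push the score up, while long, polysyllabic wording pushes it down, which is why a lower RE score is read here as a sign of more ‘sophisticated’ language.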
Twitter a ‘levelling’ medium?
Far from corroborating the suggestion that standards on social networks are being systematically ‘dumbed down’, the study seems on the contrary to indicate that Twitter users are maintaining consistent – and fairly high – standards of language. One possible conclusion is that the highly public nature of these conversations encourages tweeters to pay more attention to spelling and grammar. Moreover, the restricted number of characters allowed for each tweet may well be motivating users to look for exactly the right word to express their thoughts. It might be a leap too far to suggest that the micro-blogging social network is acting as an educational tool, but the study at least suggests that the tone of social networks may actually be maturing, correlating perhaps with their inevitably ageing users.