Languages are spoken, written and, for some time now, also tweeted. That is how the micro-blogging social network became the stage for one of the latest and most innovative linguistic studies: an investigation of dialects worldwide using messages published on Twitter.
Researchers Bruno Gonçalves, from the University of Toulon in France, and David Sánchez, from the Interdisciplinary Physics and Complex Systems Institute in Spain, gathered more than 50 million tweets over a period of two years to analyze the Spanish language.
Using Twitter's geo-location tool to discover where messages were sent from, the researchers analyzed a series of concepts across millions of tweets written in Spain, Latin America and the U.S., as well as in some areas of Eastern Europe. For each concept, they searched for the tweets containing any of the words used to express it and plotted them on a map, so that each word could be tied to the regions, and therefore the dialect, in which it appears. For example, the concept “motor vehicle” can be expressed with the words auto, automobile or car.
This yielded a graphic illustration of which words are most common in each geographic area, which enabled the researchers to discover that the Spanish language is clearly divided into two 'superdialects': one that is more international and used in large cities in Spain and the Americas, and another used more in rural areas.
The first stems from the increasing homogenization of the Spanish language driven by leveling mechanisms such as the media and social networks. Within the rural superdialect, the research also detected three different varieties: one used in Spain, another in broad areas across Latin America and a third used exclusively by speakers from the Southern Cone.
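
For readers curious about how such a word map might be built, the mapping step described above can be sketched in a few lines of Python. This is only a minimal illustration, not the researchers' actual method: the one-degree grid, the function names `grid_cell` and `dominant_variants`, and the four sample tweets are all assumptions made up for this example, and the word list simply reuses the "motor vehicle" example from the article.

```python
import math
from collections import Counter, defaultdict

# Words for one concept, reusing the article's example (illustrative only).
VARIANTS = {"motor vehicle": {"auto", "automobile", "car"}}


def grid_cell(lat, lon, size=1.0):
    """Snap a coordinate to the south-west corner of a size-degree grid cell."""
    return (math.floor(lat / size) * size, math.floor(lon / size) * size)


def dominant_variants(tweets, concept, size=1.0):
    """Map each grid cell to the most frequent word for `concept` seen there.

    `tweets` is an iterable of (text, latitude, longitude) tuples.
    Cells in which no variant appears are left out of the result.
    """
    words = VARIANTS[concept]
    counts = defaultdict(Counter)
    for text, lat, lon in tweets:
        cell = grid_cell(lat, lon, size)
        for token in text.lower().split():
            token = token.strip(".,;:!?\"'()")  # drop surrounding punctuation
            if token in words:
                counts[cell][token] += 1
    return {cell: c.most_common(1)[0][0] for cell, c in counts.items()}


# Invented sample data, not taken from the study.
sample_tweets = [
    ("My car broke down", 40.4, -3.7),
    ("New car!", 40.9, -3.2),
    ("Nice auto", 40.1, -3.9),
    ("Selling my auto today", -34.6, -58.4),
]

word_map = dominant_variants(sample_tweets, "motor vehicle")
for cell, word in sorted(word_map.items()):
    print(cell, word)
```

Coloring each cell by its dominant word would give a map of the kind the article describes. A real analysis would also need to handle ties, sparse cells and words with more than one meaning, none of which this sketch attempts.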