Wednesday, 21 April 2010

Keeping Time

I'm stuck in Amsterdam because of an erupting volcano. We're staying on a barge: very quaint, but the rooms are tiny. I was lying awake, wondering what time it was because my phone had run out of battery. Was it 4am or 12 noon? When should I wake my friends? How could I tell how much time was passing? It's not that easy. I remember seeing a competition where people had to judge an hour without watches, and one guy placed his bet at half an hour.

It occurred to me that singing songs in my head was a good way of keeping time. Instead of counting seconds, I'd count songs. I'll give it two 'Beat It's and a 'Torn' before waking them, I thought. Then I thought about the Evolution of Language conference we visited this week. People were always coming up with theories about the adaptive advantage of language. What if it was useful for measuring waiting time? But why would you want to?

In the Miocene, the environment started to dry out, leading to a thinning of resources. This meant that primates either had to reduce their group size or travel further, more efficiently, in order to find enough food. Our ancestors chose the second option (this theory was put forward by Isbell & Young, 1996).

Now, imagine yourself as part of a large group that travels big distances in forests. Inevitably, you're going to get split up. You won't be able to contact the others, so you have to decide whether to wait or move on. In a foreign city with no phone battery, I've been in this situation many times this week. The best thing to do is wait for a while, then move on. But how long? And how to measure?

Singing! And the more complex the song, the fewer repetitions you have to keep track of. I remember now a piece of child psychology where you tell a child they can have one biscuit now or wait 10 minutes and have two. Intelligent kids will sing to themselves to pass the time.

So there we go, language evolved under an adaptive pressure to accurately measure small periods of time. There are a billion holes in this theory. For example, the sun is a pretty good indication of the time. Also, it's not clear that this ability is any use.

Anyway, we might get a few papers, a book and a conference out of it.

Isbell, L. A., & Young, T. P. (1996). The evolution of bipedalism in hominids and reduced group size in chimpanzees: alternative responses to decreasing resource availability. Journal of Human Evolution, 30 (5), 389-397. DOI: 10.1006/jhev.1996.0034

Monday, 12 April 2010

Catch Up

This weekend I took part in a 48-hour film-making competition! Here's the result:

Monday, 5 April 2010


I've just finished a stencil for a friend. It's Yuna from Final Fantasy X. I used too much paint on the first print, but the second came out much sharper.

Step 1: Steal picture

Step 2: Print and Cut

Step 3: Spray

Step 4: Repeat

Thursday, 1 April 2010

Cultural Variation and Social Networks

Children learn language from exposure to speakers in their social network. This learning influences the input that will be given to the next generation. The learning biases that an individual has will influence the way the language changes over generations (Kirby, Dowman & Griffiths, 2007). However, language also plays a part in constructing and maintaining social networks. Recent studies have suggested that the structure of the social network also has an effect on how a language evolves. Gong and Wang (2010) find that different network types influence the evolution of linguistic categories in an artificial categorisation game. Lupyan & Dale (2010) find that the amount of contact with other communities and a community's spatial dispersion influence the morphological complexity of a language.

I was wondering whether bilingual communities have different social network structures to monolingual communities. Real social networks are very difficult to construct, so I wanted to use some online social networking sites. Twitter seemed like an obvious choice because of its simple API, and also because 'following' someone has a genuine effect on a user's linguistic input.

I acquired some data for Twitter users. The data includes the number of followers (indegree) and people being followed (outdegree), the user's location, the number of status updates sent by a user and the amount of time since the last update. The last two features can be used to filter out people who are not active participants. The location information is optional and may be as specific as GPS coordinates or as general as a country, or even just a timezone. Following from this, communities were defined by country. Data mining techniques will be used to automatically assign users to countries.

Ideally, we would want the following statistics: average degree, clustering coefficient and average shortest path length. However, these require information on the specific links between users, which takes more time and resources to gather, so this data was not collected for this report. This is not a trivial point, because users can follow people in other countries.
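For what it's worth, here is a minimal sketch of how those three statistics could be computed if the link data were available. It is not something I ran on the Twitter data. It assumes the network is stored as a dictionary mapping each user to the set of users they are linked to, and it treats links as undirected for simplicity (Twitter 'following' is really directed). The function names and the toy network are made up for illustration.

```python
from collections import deque
from itertools import combinations


def average_degree(graph):
    """Mean number of neighbours per node."""
    return sum(len(nbrs) for nbrs in graph.values()) / len(graph)


def clustering_coefficient(graph):
    """Mean local clustering: the fraction of each node's neighbour
    pairs that are themselves linked, averaged over all nodes."""
    total = 0.0
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            continue  # fewer than two neighbours contributes 0
        links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
        total += links / (k * (k - 1) / 2)
    return total / len(graph)


def average_shortest_path_length(graph):
    """Mean breadth-first-search distance over all ordered pairs of
    distinct nodes that can reach each other."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical toy network: a triangle (a, b, c) with d hanging off c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

print(average_degree(toy))
print(clustering_coefficient(toy))
print(average_shortest_path_length(toy))
```

Running the breadth-first search from every node is fine for a toy network, but it would get slow on a network of millions of users, which is part of why these statistics need more resources.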

Next, data on the linguistic variance is needed. As I showed in a recent post, estimating the amount of bilingualism is difficult. The best source of information is Ethnologue, but numbers of speakers are underestimated, probably due to inadequate data for small linguistic communities. I decided to use two measures of bilingualism: the number of languages spoken in a country and the percentage of the population of a country that speaks the majority language. A country's nominal per-capita GDP and number of internet users are also taken into account.

Data was collected for about 31,000 users, about 25,000 of which were usable (there have been databases of up to 2.7 million users with over a billion connections between them). Although Twitter allows very many followers, for practical purposes I filtered out users with over 1,000 followers or following over 1,000 other users. This left data for 17,444 users from 119 countries.
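The filtering step is simple enough to sketch. This is an illustration rather than the script I actually used: it assumes each user is a dictionary with hypothetical keys `country`, `followers` and `following`, and the example records are invented.

```python
from collections import defaultdict

MAX_DEGREE = 1000  # the cutoff described above


def filter_users(users, max_degree=MAX_DEGREE):
    """Drop users with more than max_degree followers or who follow
    more than max_degree other users."""
    return [
        u for u in users
        if u["followers"] <= max_degree and u["following"] <= max_degree
    ]


def mean_degrees_by_country(users):
    """Return {country: (mean indegree, mean outdegree)}."""
    groups = defaultdict(list)
    for u in users:
        groups[u["country"]].append(u)
    return {
        country: (
            sum(u["followers"] for u in members) / len(members),
            sum(u["following"] for u in members) / len(members),
        )
        for country, members in groups.items()
    }


# Invented records, purely for illustration.
users = [
    {"country": "NL", "followers": 120, "following": 80},
    {"country": "NL", "followers": 5000, "following": 40},  # filtered out
    {"country": "NL", "followers": 60, "following": 100},
    {"country": "GB", "followers": 10, "following": 1000},
]

kept = filter_users(users)
per_country = mean_degrees_by_country(kept)
print(len(kept), per_country)
```

The per-country means are the kind of figure a country-level comparison would start from.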

Initial results suggest a negative correlation between the indegree for each country and the percentage of the country's population who speak the majority language (using log indegree, t = 1.88, df = 117, p = 0.06).
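As a sanity check on those numbers, here is a small sketch of the standard significance test for a Pearson correlation, t = r * sqrt(df / (1 - r^2)) with df = n - 2. With 119 countries that gives df = 117, which matches the figure above. The helper names are my own, and I'm assuming a plain Pearson correlation on country-level values.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_t(r, n):
    """t statistic and degrees of freedom for a correlation r over n points."""
    df = n - 2
    return r * math.sqrt(df / (1 - r * r)), df


def r_from_t(t, df):
    """Invert the formula: the size of correlation implied by a given t."""
    return t / math.sqrt(t * t + df)


t_value, df = correlation_t(0.5, 119)
implied_r = r_from_t(1.88, 117)
print(df, round(implied_r, 2))
```

Inverting the formula, a t of 1.88 on 117 degrees of freedom corresponds to a correlation of roughly 0.17 in magnitude, so the effect is a weak one.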

Using a linear regression, the indegree and outdegree are significant predictors of linguistic variation, even when the effects of population size and access to the internet are partialled out (R-squared = 0.19, F(4,17439) = 1063, p < 0.01; t = -2.04, p = 0.04; t = 2.54, p = 0.01). This was based on data for 17,444 users from 119 countries. Statistics for countries were taken from CIA factbook, 2010. The analysis revealed a negative correlation between linguistic variance and indegree, but a positive correlation between linguistic variance and outdegree. The same qualitative results were found by using the number of languages spoken in a country. However, there is a positive correlation between the number of languages and both indegree and outdegree. I'm not sure how to interpret this yet, or whether any of it makes any sense. In the meantime, here's a pretty uninformative map of the world, coloured by average number of Twitter friends. Darker countries have users with a higher average number of friends.

Lupyan, G., & Dale, R. (2010). Language structure is partly determined by social structure. PLoS ONE, 5 (1). PMID: 20098492
Kirby, S., Dowman, M., & Griffiths, T. L. (2007). Innateness and culture in the evolution of language. Proceedings of the National Academy of Sciences of the United States of America, 104 (12), 5241-5. PMID: 17360393