Tonight I watched the 'making-of' documentary for Ar y Tracs (see my previous blog). Co-author Catrin Dafydd spoke about the importance of bridging the gap between Welsh and English in order to capture the complexities and humor that arise from bilingualism in Wales. It was also interesting to see her and Ruth Jones discussing their co-authorship. Here's a quote from Dafydd ('English' marked in bold for readability):
"Ac on i'n dweud 'Oh what about this', ie, 'galla hi dweud hwn' you'd say 'wouldn't it be simpler if she just said this', a wnest ti cynnig pethau yng Ngymraeg. An' actually the perspective of someone who's learned the language really helped the script I think because hopefully it's going to be something, ti'n 'mod, really accessible i ddysgwyr ac i bobol sy ddim mor hyderus, confident, yng Nghymraeg."

(And I'd say 'Oh, what about this', ye, 'she could say this' you'd say 'wouldn't it be simpler if she just said this', and you'd suggest things in Welsh. And actually the perspective of someone who's learned the language really helped the script I think because hopefully it's going to be something, y'know, really accessible to learners and to people who are not so confident, confident, in Welsh)
A couple of interesting points here. First, a minor point about the difficulty of deciding which language is being spoken. In the final phrase (it's going to be something, ti'n 'mod, really accessible i ddysgwyr...), it's difficult to tell when Dafydd has switched to Welsh. The first 'Welsh' words are 'i ddysgwyr', but there's evidence to suggest that the switch has happened before this. First, 'really' is considered a loan word by many. In fact, an analysis of Dafydd's speech would be necessary to determine if she considered 'accessible' as a Welsh word, too. Furthermore, one would need to survey Dafydd's use of Welsh and English interjections to determine whether "ti'n 'mod" (y'know) was a marked cue to language alternation.

The second point is about using other people's codes. Right at the end, Dafydd sub-titles her Welsh (hyderus, confident, also just before this quote with "... dwyiethyddiaeth, bilingualism ..."), presumably for the benefit of the less-fluent Jones. In fact, you can see Dafydd struggling with having to monitor her speech to make sure it's understandable while also trying to make a complicated point. Finally, she switches into English ('And actually the perspective ...'), which they both understand. Once Dafydd has collected her thoughts, she switches back into Welsh for the remainder of the sentence ('really accessible i ddysgwyr...').

My research focuses on why humans have a great capacity for bilingualism. The quote above demonstrates that it may be easier to adopt the 'code' of another than to constantly monitor your own code to make sure they will understand you.

I've just finished watching the new comedy from Ruth Jones, co-writer and star of Gavin and Stacey. Ar y Tracs (On the Tracks) is a Welsh-language production for S4C (Tidy Productions/ Green Bay) following the lives of a train crew over the festive period. Ruth Jones, who is learning Welsh, co-wrote the script with Catrin Dafydd who has written for the Welsh-language soap Pobol y Cwm.

Having been away from Wales for a while, it's always a bit weird to see people speaking Welsh on TV, but this programme was notable for its extensive use of English. Characters moved in and out of their two languages all the time, much more than you'd usually see on Pobol y Cwm, or S4C's answer to Hollyoaks, Rownd + Rownd.

As well as switching to talk to non-Welsh speaking characters, and idiomatic borrowing ("Does dim second chance da fi nawr - I've had my lot") there was plenty of intra-sentential language alternation:
"Pan ti’n mynd trwyddo i Big Brother 10, bydd, like, masif support i ti ar y we yn barod."
(When you go through to Big Brother 10, there will be like massive support for you on the web already)
Also, a good use of marked language change to emphasise dramatic turns:
A: Pwy yw Billy Bricks?
(Who is Billy Bricks?)
B: Billy Bricks was my father.
There was also a nod to Gavin and Stacey with a typical Welshifying of the catchphrase "Beth sy'n occuro?" (What's occurring?). It was really nice to see naturalistic speech. Carolyn Hitt from the Western Mail voices the same opinion.

S4C's language scheme states that "a substantial proportion of the programmes broadcast ... must be in Welsh, and, in particular, that those programmes which are broadcast on S4C during peak viewing hours are mainly in Welsh." However, nowhere in S4C's language policy (or any other official document I've seen) is 'Welsh' actually defined. If a programme had half English words and half Welsh, would this count as a Welsh-language programme? What would a word with a Welsh stem and an English affix be counted as? What if Welsh were always the matrix language? Will broadcasters have to start hiring linguists to check their statutory obligations? I hope so (my persistence with this blog shows how far my PhD has progressed).

A possible defence for S4C in light of its language duties is to recognise that the majority of 'Welsh speakers' in Wales use Welsh as part of a cohesive Welsh-English code. Since Ar y Tracs was certainly in peak time, this may be the first endorsement by a public institution of a code-based view of communication! There's hope yet.

Last week I discussed Kovacs & Mehler (2009), here. I now understand that the participants in the study were bilingual in Italian and Slovenian. Here's an update on the data from the last post and also data for all onsets (not just 3 syllable words):

It looks like there is no reason to believe that Slovenian-speakers would have more experience of ABA structures than Italian speakers. For some reason, the proportions of ABB syllables seem to shrink in comparison to AAB and ABA when taking all onsets into account. However, the distributions over nouns, adjectives and verbs for English, Dutch and German are a lot more even when considering all onsets, suggesting there is not a huge difference in the way different languages indicate syntactic class using syllable structures.

Kovacs, A., & Mehler, J. (2009). Flexible Learning of Multiple Speech Structures in Bilingual Infants. Science, 325(5940), 611-612. DOI: 10.1126/science.1173947

A mixed week for Bilingualism in Canada

This week has seen both the good and bad side of language policies being played out in Ottawa, Canada.

On the negative side, there has been some debate over bilingualism in official roles. There's been a call for a requirement that the Fire Chief of Ottawa be bilingual. Meanwhile, a monolingual post-office worker is fighting to keep her job, which is under threat because she is not bilingual, prompting locals to organise a petition (here).

On the other hand, there are also inclusive bilingual policies being implemented. Vancouver is hosting the Winter Olympics next year, and this week it was announced that it will be a bilingual experience:
The Vancouver 2010 Olympic and Paralympic Winter Games provide an unprecedented opportunity to showcase our unique Canadian identity to the world. ... the Vancouver Organizing Committee has devoted a great deal of time and resources to ensure these Games reflect our country’s world-renowned diversity, including its linguistic duality.

I seem to have an obsession with Ghost in the Shell! Here are some stencils I made (see my first one here):

Here's a Tachikoma, an old stencil that I re-sprayed today!

Major Motoko Kusanagi. I messed up the face, so decided to do some kind of light-explosion thing. This is my first attempt at painting with acrylic. Mixing colours was fun, but I realise I didn't really understand layering, and I was still painting in a digital frame of mind.

A Laughing Man icon! The quote is from J.D. Salinger's Catcher in the Rye.

Here are some other stencils I found on the internet. There were surprisingly few, though.

An amazing stencil from Cyberdelics.

A cheeky stencil plan by Okto.

A much better Laughing Man icon from mmoroca.

This one kind of counts. From Pierre Huyghe, Philippe Parreno.

I've been thinking about nicknames and pet names. The people that mean the most to us usually have more than one name (see my post on Sliders). This appears to contradict the mutual exclusivity bias to have only one word for each object. That is, unless we see names as social tools we use to manage our relationships with people. Morgan et al. (1979) discuss the importance of managing social relationships with nicknames, for instance using a full name with your parents and a more informal diminutive with friends. Andersen (2006) discusses nicknames as adaptive innovations that serve the speaker’s emotive expressiveness. As Jhumpa Lahiri puts it in The Namesake:
In Bengali the word for pet name is daknam, meaning, literally the name by which one is called by friends, family, and other intimates, at home and in other private, unguarded moments. Pet names are a... reminder that life is not always so serious, so formal, so complicated. They are a reminder, too, that one is not all things to all people... Every pet name is paired with a good name, a bhalonam, for identification in the outside world. Consequently, good names appear on envelopes, on diplomas, in telephone directories and in all other public places.
Perhaps, then, there is a way of linking bilingualism to social grooming theory. That is, language has taken over the social role of paying attention to significant others and the more 'effort' you put into innovation, the more attentive you are perceived to be.

Although public opinion is still coming round to the idea, bilingualism does not, in fact, impede cognitive and linguistic development, but may enhance it. Bilinguals have been shown to be more aware of the use of words in social contexts (Rosenblum and Pinker, 1983), they are better at taking other speakers’ perspectives (Genesee et al., 1975), and better at monitoring the knowledge state of others (Genesee et al., 1996). Recently, Kovacs and Mehler (2009) showed that bilinguals are more flexible at processing linguistic structures.

Kovacs and Mehler (2009) ran an eye-tracking experiment on infants. Children had to learn to associate three-syllable words with either AAB (e.g. 'babaka') or ABA ('bakaba') structures with a stimulus appearing on either the right or the left side of the screen. Bilinguals successfully learned to associate both structures with the correct side, while monolinguals performed equally well for AAB structures, but could not learn the associations for ABA structures. (See Ed Yong's blog for another analysis). This is a bit confusing - it's not just that monolinguals were worse than bilinguals at processing ABA structures, monolinguals actually scored negatively!
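To make the AAB/ABA/ABB labels concrete, here's a rough Python sketch of how a three-syllable word could be sorted into one of these patterns. It assumes the word has already been split into syllables (the function name and the list-of-strings input are my own choices for illustration, not anything taken from the paper or from the corpora discussed below).

```python
def classify_structure(syllables):
    """Label a three-syllable word as 'AAB', 'ABA' or 'ABB'.

    Returns None for anything else (wrong length, all syllables the
    same, or all three syllables different).
    """
    if len(syllables) != 3:
        return None
    first, second, third = syllables
    if first == second == third:
        return None  # AAA: not one of the three patterns of interest
    if first == second:
        return "AAB"
    if first == third:
        return "ABA"
    if second == third:
        return "ABB"
    return None  # ABC: no repeated syllable


# The two example words from the experiment, plus an ABB word for comparison.
for word in (["ba", "ba", "ka"], ["ba", "ka", "ba"], ["ba", "ka", "ka"]):
    print("".join(word), classify_structure(word))
```

Comparing whole syllables is the simplest option; a version that compares only onsets would need the syllables broken down further.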

Bilinguals have a wider range of input than monolinguals, and so may be more used to different linguistic structures. Bilinguals may be better at processing two linguistic structures simultaneously. Alternatively, monolinguals may have refined their processing to fit their input language.

Part of Kovacs and Mehler’s argument relies on AAB structures being easier to process, referring to Gervain et al. (2008). Presumably it’s true that this is because less memory is needed but Gervain et al. find that ABB structures, not AAB, are easier to process than ABA structures in monolinguals. One question is whether this processing benefit is language-specific. If it is, then the result may reflect bilinguals’ experience with ABA structures. If it is not, then the result may show either a faster maturation of processing abilities in bilinguals, a slower maturation of processing abilities in monolinguals or a difference in investment of resources in the domain of word learning.

Let's look at some data! Proportions of various structures of 3 syllable words were gathered for English, Dutch, German (CELEX), Mandarin Chinese (CC-CEDICT, taking tone into consideration) and Italian (CoLFIS, although orthographic and automatically parsed for syllables):

The graph above shows that there is not much variation between proportions of word forms with AAB and ABA structures within a language. This suggests that bilinguals do indeed have a processing advantage. However, there is variation in the number of tokens of AAB and ABA across languages, and large variation in the proportions of word forms with ABB structures compared to other structures. If this is the case, then bilingual infants may be exposed to (and therefore be more efficient at processing) a greater range of structures, which may be an additional factor in breaking the Mutual Exclusivity bias.

Furthermore, the proportions of words which conform to either AAB, ABB or ABA structures are very small. Why, then, are these structures used frequently in infant-directed speech (Ferguson, 1983)? A possible answer is a kind of explicit demonstration of linguistic structure. At any rate, Kovacs & Mehler's paper is interesting, as is its companion paper, described here.

Below, I've split the data from the graph above by counts for Adjectives, Nouns and Verbs. Interestingly, German, Dutch and English have different distributions. For instance, a new word with an ABA structure (e.g. 'bakaba') may be interpreted differently by different speakers (rationally, ignoring heuristics based on morphological cues). An English speaker would assume it was an adjective, a Dutch speaker would assume it was a verb and a German speaker would assume it was a noun:

This would show that different languages have different ways of cuing children into the meanings of their words. Unfortunately, perhaps because of the very low numbers of examples, when I tested some English and Dutch speaking friends on this, they basically chose randomly. Another big, unfounded linguistic theory.

(also, see my followup post here)

Kovacs, A., & Mehler, J. (2009). Flexible Learning of Multiple Speech Structures in Bilingual Infants. Science, 325(5940), 611-612. DOI: 10.1126/science.1173947

Bilingual Puns in Bali

I came across an old paper by Joel Sherzer on bilingual puns and word play in Bali. There are several languages in use in Bali, including Sanskrit, Old and Middle Javanese, Balinese (including the various levels - alus 'refined', biasa 'ordinary', kasar 'coarse' etc.), Indonesian and English. Most people speak many of these, and the interplay between them is a common feature of dialog. Here are some examples:

X (to Y, in Indonesian): Sudah siap? 'Are you ready?' (lit. 'already ready')
Y (in Indonesian): Sudah ayam 'Already a chicken.'

Here, siap, which means 'ready' in Indonesian, means 'chicken' in Balinese Alus. Also, ayam means 'chicken' in both Indonesian and Balinese Alus.

X calls out: Wayan mejalan cara taluh 'Wayan walks like an egg'.

Taluh is Balinese biasa for 'egg'. 'Egg' in Balinese alus is adeng. Adeng in Balinese biasa is 'slow'. That is, Wayan is walking slowly. People can also conduct entire conversations where the meaning is actually based on puns:

X: Mekunyit di alas? 'turmeric in the forest?'
Y: Ketemu '(type of) spice'

Here, X is asking Y if they have a girlfriend, since Ketemu is both a kind of spice and 'acquaintance'. This might just seem annoying, but it is by far the least complex punning interaction. Here's a section on popular ways of saying 'goodbye':

Here are some examples involving the sound similarity between Balinese siu 'one thousand' and Eng. See you. A person may say Siu surat, lit. '1000 letter', but a play on Eng. 'See you later', in which B, I surat 'letter' is a pun on Eng. later. Or a person may say Siu berjumpa, with Indonesian 'meet, see'. The use of meaning equivalences in different languages to go nowhere referentially is also the basis for such comebacks as Siu one thousand, based on the fact that siu is Balinese for 'one thousand'. Or a person may say Siu seribu, in which siu stands for Eng. 'See you' or Balinese siu 'one thousand', and seribu is Indonesian for 'one thousand'.

The most baroque and recherche in this group of mock leavetakings is Siu satak, lit. 'one thousand two hundred'. Again this takes off from the similarity of Balinese siu and Eng. See you, but added to this is the fact that '1200' can also be expressed as nem bangsit, lit. 'six two-hundred' - in which bangsit, with the m/b interchange seen above, sounds like mangsit 'to stink'. Once again the play is not on an uttered word, but on an imagined or presupposed word.
Apparently, types of pig-latin are fairly common, including:
Inserting syllables with vowel echoing
Deleting all but the first Consonant-Vowel-Consonant sequence of each word
Reversing syllables
Reversing phonemes
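Just to pin down what three of these disguises do to a word, here's a quick Python sketch. Everything in it is my own simplification rather than anything from Sherzer's paper: words are plain strings, each letter stands in for one phoneme, the syllable split is supplied by hand, and the vowel set is just a, e, i, o, u. I've left out the vowel-echoing game because I don't know which syllable gets inserted.

```python
VOWELS = set("aeiou")  # simplifying assumption: one letter = one phoneme


def reverse_syllables(syllables):
    """Reverse the order of a word's syllables."""
    return "".join(reversed(syllables))


def reverse_phonemes(word):
    """Reverse the order of a word's phonemes (here: letters)."""
    return word[::-1]


def keep_first_cvc(word):
    """Keep only the first consonant-vowel-consonant sequence of a word.

    Words without any such sequence are returned unchanged.
    """
    for i in range(len(word) - 2):
        c1, v, c2 = word[i], word[i + 1], word[i + 2]
        if c1 not in VOWELS and v in VOWELS and c2 not in VOWELS:
            return word[i:i + 3]
    return word


# 'surat' (letter) is borrowed from the examples above; the syllable
# split su-rat is my own guess for illustration.
print(reverse_syllables(["su", "rat"]))
print(reverse_phonemes("surat"))
print(keep_first_cvc("surat"))
```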


Dubloons, loonies and moonies

Recently, I've been thinking about the Mutual Exclusivity bias - the tendency children and adults have to assume one meaning has only one associated word (in other words, to avoid synonymy). This bias seems to work against bilinguals, who have two words for many things. I'll get round to discussing this at some other point. In the meantime, here's an extract from an interesting article by Neil Wick on the tendency for new words for new meanings to converge on a single, conventional form. Wick charts the course of nicknames for the newly introduced $2 coin in Canada. It's worth noting that money is not a normal object - its meaning remains constant in all contexts. However, it's quite a good story:

"Speculation on a name began as soon as the new Canadian coin was announced in the February, 1995, federal budget. On March 3, The Globe and Mail reported that “the ink was barely dry on Paul Martin’s new budget before lock-up wags were bantering around a new name for the $2 coin” and doubloon was mentioned among other suggestions (“We like Ike” 1995). Both The Globe and The Star printed numerous letters hotly debating myriad possible names. One thing seemed beyond debate—writers felt strongly that a name would emerge. The Toronto Star confidently predicted on October 7, 1995, “It will acquire its own nickname” (Aaron, 1995). The Word Play columnist at The Globe implicitly acknowledged a lexical gap on February 24, 1996: “Writers of letters to The Globe and Mail have been diligent about filling this void” (Clements, 1996). One reporter even suggested that the coin needed a name. On February 16, 1996, a Star business reporter stated “All it needs is a nickname— and there is no shortage of suggestions” (Hemeon, 1996). The Mint would have no part in settling the debate. A Globe story on March 23, quoted a Mint spokesperson: “We’re not in the habit of giving names to any of our coins. For us a 10-cent piece is a 10-cent piece.” Characterizing such coin names as too unprofessional to be used by Mint workers, he declared, “The public will have to sort out [what to call the coin] on its own” (Grange, 1996). In spite of the heated debates which still continued after the coin’s launch, a consensus was already forming a month before the launch, as evidenced by responses to The Star’s request for readers to phone in their name suggestions (Stefaniuk, 1996a). Among 57 names and variations submitted by readers, four stood out. Teddy had 11 votes, Toonie/Twoonie/Twooney had 10, and two variants were tied for third place with 9 votes each: Doubloon/Doubloonie and Moonie (Stefaniuk, 1996b). 
Aaron (1996b) lamented on March 9, that “the horrible term ‘twoonie’ seems to have an edge in public acceptance.” On February 19, 1996, the official launch date for the new coin, Freeman (1996) wrote in The Globe that the new coin “has already picked up a string of unofficial names such as toonie, doubloon, bearbuck, blooney, Doosie and Loonie II.” Meanwhile, a March 14 Star article about panhandlers’ experiences with the new coin (DeMara, 1996) used the word toonie 10 times without remark, prompting an angry letter accusing The Star of “trying to shove the word ‘toonie’ down our throats” (Moshinsky, 1996). “Over here, it’s poly (polar bear – see?) or polies; always was, always will be, The Star’s decree notwithstanding,” the reader wrote. This reader may have overestimated the influence of the paper on public consensus. In fact, if the name depended on a Star decree, dubloon or dubloonie probably would have won out. This was the name used most by The Star in early stories, it was preferred by the Coins columnist (Aaron, 1995) who found toonie to be a “horrible term” (1996a) as already noted, and it was even the personal choice of the chief lexicographer of the new Gage Canadian dictionary (Grange, 1996). Toronto Star art critic Christopher Hume on March 21, 1996, recognized that the lack of a stable name put the two-dollar coin in a different category than that of the one-dollar coin introduced nine years earlier: “By contrast, the loonie has become part of the culture. The word has entered the vocabulary” (Hume, 1996). He called the two-dollar coin “still, annoyingly, nameless” yet he matter-of-factly called it a toonie twice in that same column. By the end of August, Aaron was describing the coin as “affectionately called the ‘toonie’” (Aaron, 1996a), and by September, The Star acknowledged that “its colloquial name ‘toonie’ is part of the vernacular” (Vincent, 1996). 
In March of the following year, Kesterton (1997) wrote in The Globe that “Fairly quickly and dismissively, Canadians have come to call the $2 coin the ‘toonie,’ despite the many clever terms that were suggested by word mavens …”. Clearly, cleverness alone was not enough. The winning candidate was efficiently short—less than three syllables as with the other coins’ names, incorporated an allusion to the word two, rhymed with loonie, and recalled the familiar collocation looney tunes. Importantly, it started to build momentum in public acceptance early in the process and once established in a few speakers’ lexicons, there was little chance that those speakers would accept alternatives barring major pressure from another stronger group of speakers."

I came across a series of films on YouTube with famous faces talking about growing up with two languages, and two identities, from the Equality and Human Rights Commission. For some reason, they're all subtitled in Welsh, even though the speech is in English. The famous faces include Sanjeev Bhaskar, Gok Wan and Boris Johnson.

In October I wrote about a Big-Issue hawker selling his wares. A recent article from Edinburgh's student newspaper, The Journal, interviews him, here. In addition to his 'Can I interest you in the Big Issue?', he has recently added 'Don't be shy come and buy/try!'.

Where the Welsh Things Are

Maurice Sendak's classic children's story 'Where the Wild Things Are' has been made into a movie and is released in the UK this week. I, like millions of other children, read this as a child. However, I will feel an extra level of betrayal compared to those who simply don't agree with the movie's changes to the voices/plot/characters, because I first read it in Welsh.

The Welsh-speaking characters in 'Gwlad y pethau Gwyllt' always felt closer, more special to me, in a way the characters in English books did not. Indeed, they may have been more dear to me literally - I discovered that my Welsh version of 'Where the Wild Things Are' is a collectable now, selling for $450 on Amazon! I should really try to find it...

It's not the first time this has happened. I remember seeing SuperTed in English for the first time (SuperTed was originally in Welsh), and feeling like he had abandoned Wales and the fight to 'achub yr iaith' (rescue the language). I also remember being surprised that my English speaking friends also knew about Fireman Sam.

I wonder if other children around the world have been similarly confused. In the end, I guess I shouldn't have been that surprised that my childhood heroes were bilingual, but why did they never code-switch?

Talking of code-switching cartoons, I can't help including possibly the most complicated example of code-switching ever, from Disney's The Prince Of Egypt:

Conceptions of bilingualism in Canada

I just finished watching the documentary 'Incident at Restigouche' about tensions between the Canadian native Micmac Indians and the Quebec government over fishing rights in 1981. The Quebec authorities raided the local reserve because, in their eyes, the Micmac were over-fishing the river. The Micmac deny this, pointing out that sport fishing alone took a greater number of fish each year. I was linked to it by The AQ's blog on the Micmac language and identity, here.

Several things made me upset. First was the labelling of the Micmac as 'not bilingual' because of speaking English and Micmac, not English and French.

Secondly, I was particularly struck by what Quebec Minister of Fisheries, Lucien Lessard, was heard to say:
"You cannot ask for sovereignty, because to have sovereignty, one must have one's own culture, language and land."
First of all, the Micmac have all of these, so it's not clear what on earth Lessard was thinking. It's clear, though, that people's conceptions of other languages and how they relate to culture and sovereignty can be radically different.

Bayesian Bilingualism

I've been wondering about Bayesian models of language learning and bilingualism. Models such as Griffiths & Kalish (2005) assume learners have probabilities for hypotheses of the structure of a language in a large hypothesis space, based on utterances heard. The posterior probability represents the learner’s model of a speaker’s language (compatible with a view of trying to learn the parents’ Medium). Two methods drive convergence to a best hypothesis in the learner: The MAP (maximum a posteriori) process assumes the maximally probable ‘language’ and only produces strings created by that ‘language’. The sampling approach (SAM) does not rule out any non-zero probability hypothesis and may produce mixed strings occasionally.
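Here's a toy Python sketch of the difference between the two strategies, to check my own understanding. It is not Griffiths & Kalish's model: the two-hypothesis space, the utterance labels 'w' and 'e', and all the probabilities are invented for illustration. The learner computes a posterior over 'languages' from the utterances heard, and then either commits to the most probable one (MAP) or draws one in proportion to its posterior probability (SAM).

```python
import random


def posterior(prior, likelihoods, data):
    """Bayes' rule over a small, finite hypothesis space.

    prior:       {hypothesis: P(h)}
    likelihoods: {hypothesis: {utterance: P(utterance | h)}}
    data:        list of utterances heard (assumed independent)
    Assumes at least one hypothesis gives the data non-zero probability.
    """
    unnormalised = {}
    for h, p in prior.items():
        for utterance in data:
            p *= likelihoods[h].get(utterance, 0.0)
        unnormalised[h] = p
    total = sum(unnormalised.values())
    return {h: p / total for h, p in unnormalised.items()}


def choose_map(post):
    """MAP learner: always commit to the most probable hypothesis."""
    return max(post, key=post.get)


def choose_sample(post, rng):
    """Sampling learner: pick a hypothesis with probability P(h | data)."""
    hypotheses = list(post)
    weights = [post[h] for h in hypotheses]
    return rng.choices(hypotheses, weights=weights, k=1)[0]


# Invented toy setup: 'w' = a Welsh-sounding utterance, 'e' = an English one.
PRIOR = {"welsh": 0.5, "english": 0.5}
LIKELIHOODS = {
    "welsh": {"w": 0.9, "e": 0.1},
    "english": {"w": 0.1, "e": 0.9},
}
HEARD = ["w", "w", "e"]

post = posterior(PRIOR, LIKELIHOODS, HEARD)
rng = random.Random(0)
print(post)
print("MAP:", choose_map(post))
print("SAM:", [choose_sample(post, rng) for _ in range(10)])
```

With mixed input like this, the MAP learner settles on one 'language' every time, while the sampling learner occasionally picks the other, which is the behaviour that makes it look more 'bilingual'.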

In a monolingual environment, MAP should be most efficient, but SAM is better for acquiring more than one language. A sampling approach also models observations of better task switching but worse inhibition in bilinguals than monolinguals. This may be another factor in the differences between monolingual and bilingual development.

However, I'm not completely sure about the maths, and suspect that MAP can define a best hypothesis over any number of 'languages', so they may be equivalent.

Bilingualism in Singapore

Singapore is certainly multilingual. It has four official languages, and Ethnologue catalogues 21 different languages and dialects, all within about 5 million people.

The latest post on Language Log is on Bilingualism in Singapore, charting the dubious theorising of Minister Lee Kuan Yew. Lee essentially forged an education plan based on the idea that people only have so much 'space' to store languages, and so bilingualism can only be bad. Surprisingly, this was not so far from well-received linguistic theory until Martin-Jones and Romaine (1986) argued it was 'half-baked'.

However, the minister seems to have recently changed his mind about bilingual education for the better. Maybe it'll trickle down to the guy I met the other day who congratulated me on my very good English despite my Welsh medium education.

Last week I heard someone laugh, and I thought it was such a good laugh that I would use it from then on. It was sort of halfway between a cackle and a guffaw - definitely a mocking, cruel, delighted burst of air.

This reminded me of another conscious adoption of a cultural trait that I read about recently from Papua New Guinea: McElhanon describes a community meeting in which one group decided to change their word for ‘no’ in order to distinguish themselves from another group (Kulick, 1992). Although there are many examples of people changing to conform, it's not often you find such an organised move away from the norm. I couldn't find out how successful the change was, though.

Surprisingly, my own adoption seems to have worked, and I now involuntarily use my new laugh quite a lot. Laughter, it seems, is infectious.

Edinburgh Skyline

Me and Keelin just made a stencil of the Edinburgh skyline from Princes Street. I was really pleased with it, and will definitely use sponging again. The whole thing only took about 4 hours.

Here's the original panorama. I cut out the boring trees in the middle.

The contrast on the castle wasn't great, so I ended up splicing in a separate image:
Desaturate, boost contrast, find edges, print:

Cut out!

Stencil Ready:

Sponge acrylic paint onto wall:
Stencil done!

Ha Long Time Coming

David Graddol predicted in 1996 (theory quantified by Lupyan and Dale, 2009) that, as more people learned English as a second language, native English speakers would lose their grip on the language. Indeed, there are probably far more second language speakers of English now than native speakers, so the non-natives have the power to change the language to suit themselves.

This was highlighted on a friend's travel blog recently (I've been enjoying living vicariously, especially when there are puns involved in the titles). The latest post in Eric's South East Asia blog finds the protagonist on board a ship, floating between the myriad of islands in Ha Long Bay, Vietnam. Incidentally, I have to agree that it is one of the most beautiful areas I have ever seen. Having signed up for an English tour, Eric is surprised to find that he is the only native speaker of English. However, he found himself translating between 'versions' of English. There seemed to be non-reciprocal intelligibility between them: Eric could understand them all, but they had difficulty translating between themselves. Indeed, one traveller resorts to using his iPhone for translation (see my post on Lingua Tecnologia).

So it seems that, instead of a sprawling continent, English may be eroded by the seas of time into thousands of tiny little islands.

Ghost in the Shell

This was my favourite stencil. Unfortunately, the original got lost in a move and I haven't had the heart to do another. It's inspired by Shirow Masamune's original Ghost in the Shell comic. In this scene, the hero glimpses the birth of a new, intelligent life form. The thing in the middle is some kind of neural network, and descending onto it is an angel (the shape pointing down is the shade on a foot, some people don't notice it at first, but after it's pointed out, you can't help seeing it).

The idea is similar to the Sistine Chapel ceiling fresco, where God and Adam reach out towards each other. Michelangelo captures the question of man's relationship with God. Masamune questions the relationship between man's body and man's spirit.

The medium here also hints at a direction for answers: Stencils are made up of small, abstract, isolated shapes. However, in a particular configuration, they create an impression of a unified image to human perceivers. However, not all shapes are allowable - you can't have a stencil with free standing opaque parts (e.g. the outline of a full circle). In fact, even convex shapes decrease the integrity of the stencil. So, even as the whole influences how one interprets the parts, the parts influence what that whole can be.

Codeswitching as a Move to Markedness

One advantage of having two languages is having an extra tool with which to avoid ambiguity. For example, in English, ‘Thirteen’ and ‘Thirty’ are often confused, while in German ‘dreizehn’ and ‘dreissig’ are more different, while in Chinese ‘三十’ and ‘十三’ are very different. Montanari (2008, p. 622) gives an example of this tactic in a trilingual child (KAT) interacting with their grandmother (GRA) in Spanish and Tagalog:

%sit : KAT and GRA are engaged in book reading
*KAT : [‘ota].
%gls : pelota
%eng : ball in Spanish
*GRA : ¿botas ? zapatos ? zapatos.
%eng : boots ? shoes ? shoes.
*KAT : bola bola !
%eng : ball in Tagalog
*GRA : ah la pelota ahí detrás, ahí está la pelota.
%eng : ah the ball right behind, there is the ball.

Because the child cannot pronounce the ‘pel’ of ‘pelota’ (ball), their attempt is confused with ‘botas’ (boots). Instead of attempting the word again, or using pragmatics, the child uses the word in a different language. This makes it easier to pronounce and thus easier to understand. Perhaps, then, some codeswitching can be accounted for by this tactic.

One might assume that the optimal strategy, given two different languages, is to switch at every word. However, individual languages tend to display a move to markedness (Shillcock, Hick, Cairns, Chater & Levy, 1995). This principle is ‘that when consonant interactions introduce phonological ambiguity, the ambiguity introduced is always in the direction of a less frequent phoneme’ (Tamariz & Shillcock, 2001). That is, frequently occurring words should be optimised for pronunciation within a language, while words from another language will be free from this pressure. This suggests that frequent constructions (e.g. Noun Phrases) should be most salient in the same language. However, at larger phrase/constituent boundaries, where the probability of words co-occurring is less, words from other languages may be more salient. Code-switching phenomena such as Myers-Scotton’s embedded language frames may fall out of this interaction.

A modelling approach could be used to investigate this. A list of cognates and sentence templates in two languages will be required. Sentence templates will be filled with words from either language, based on maximising the phonetic distinctness of the sentence. This will be calculated using Markov Chain assumptions, with words as nodes and transition costs as the phonetic difference between the last phone of the current word and the first phone of the next word. To model this for children, extra costs could be imposed on transitions to words with complex consonant clusters.

This will produce sentences which are maximally phonetically distinct. Inferences about the choice of language could be drawn over many sentences and many sentence types, with particular attention being paid to constituent boundaries.
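A minimal sketch of the slot-filling step, under loud assumptions: the two-language lexicon, the phone inventory and the feature sets are all invented for illustration, and 'phonetic difference' is taken to be the number of features that differ between the last phone of one word and the first phone of the next. The search over language choices is a simple Viterbi-style dynamic programme, which is one way (not necessarily the intended one) to use the Markov assumption.

```python
# Toy sketch: pick a language for each slot of a sentence template so that
# the summed phonetic difference across word boundaries is maximal.
# FEATURES and TEMPLATE are invented examples, not real data.

FEATURES = {
    "a": {"vowel", "low"},
    "i": {"vowel", "high", "front"},
    "u": {"vowel", "high", "back"},
    "o": {"vowel", "mid", "back"},
    "e": {"vowel", "mid", "front"},
    "t": {"cons", "stop", "alveolar"},
    "s": {"cons", "fricative", "alveolar"},
    "k": {"cons", "stop", "velar"},
    "m": {"cons", "nasal", "labial"},
    "p": {"cons", "stop", "labial"},
}

# One dict per template slot: language label -> word as a list of phones.
TEMPLATE = [
    {"A": ["t", "a"], "B": ["s", "i"]},
    {"A": ["a", "m"], "B": ["k", "u"]},
    {"A": ["m", "o"], "B": ["p", "e"]},
]


def phone_distance(a, b):
    """Number of features that differ between two phones (assumed metric)."""
    return len(FEATURES[a] ^ FEATURES[b])


def most_distinct(template):
    """Return (score, languages) for the most phonetically distinct filling."""
    # best[lang] = best (score, path) ending with that language's word
    best = {lang: (0, [lang]) for lang in template[0]}
    for prev_slot, slot in zip(template, template[1:]):
        step_best = {}
        for lang, word in slot.items():
            options = [
                (score + phone_distance(prev_slot[prev][-1], word[0]), path + [lang])
                for prev, (score, path) in best.items()
            ]
            step_best[lang] = max(options, key=lambda option: option[0])
        best = step_best
    return max(best.values(), key=lambda option: option[0])


score, languages = most_distinct(TEMPLATE)
print(score, languages)
```

The extra cost for consonant clusters in child speech could be added as a penalty term inside `phone_distance`, but that is left out here to keep the sketch short.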

Tuesday, 10 November 2009

Mixing into weaker language

Studies above show that bilingual children can differentiate between their languages and show sensitivity to their interlocutor’s linguistic abilities from a very early age. Yet it is still implied that mixing occurs in children for qualitatively different reasons than in adults. For instance, Cantone & Müller (2007) suggest that children mix more often into their weaker language. However, adults also have lexical gaps in weaker languages. The assumption of separate lexicons is weakened in a draft of Cantone & Müller’s paper:

“… one word und two word utterances …” (Cantone & Müller, 2007b, p.8, my emphasis).

Cantone & Müller’s theory could explain this mixing – that is, they are mixing into the language they are less ‘ready’ to speak (both authors are stronger speakers of German). The example is harsh, but demonstrates that it is unclear why their theory applies to developing children alone. The observation that both children and adults mix more into a weaker language is not surprising, and may only be a quantitative difference.

Monday, 9 November 2009

Modeling Bilingualism

When children are brought up speaking two languages, they often go through a stage of 'mixing' where they appear to be unable to separate their languages. For instance, a Welsh word might be inserted into an English sentence: As an example, when I first realised the implications of death, my parents told me that I cried and said "I don't want to go into the pridd" (earth, dirt).

Several theories have been put forward to explain this. Firstly, I may simply not have known the word for 'dirt', and had to rely on a word in another language. Back then, Welsh was probably my stronger language, so this would be an example of mixing into my weaker language. Alternatively, I had not yet learned to tell the difference properly between Welsh and English.

However, both my parents speak Welsh and both languages are used, probably with quite a lot of mixing. Therefore, I may have known the English word, and been aware that I was mixing, but I knew that using a bilingual code was permissible, given my interlocutors.

Indeed, Montanari (2008) finds that the child she studies mixes some words even when they know the word in the language of context. Does this suggest, then, that the child simply didn't know which words belonged to which language? I argue that this isn't necessarily the case.

Adults mix their languages for many reasons. In fact, it's often difficult to decide which language a word belongs to without a lot of context (e.g. 'zeitgeist'). Let's forget about languages for a minute and ask 'to what extent has the child acquired the communicative code of its parents'? By this, I mean how closely does the child's output mirror the parents' input?

To do this, let's look at Quay's (2008) study of a trilingual child. Japanese is the language of the environment; the father is strongest in English and also speaks Japanese, and the mother is strongest in Chinese and also speaks English and Japanese. Weekly recordings were made from 1;10 to 2;4 years. The utterances of both the child and the parents were coded along with the addressee. The summary of the data is very detailed - containing the proportions of mixing between any two people in Japanese/English, Japanese/Chinese, Chinese/English and Japanese/Chinese/English.

Let's model the child's mixing proportions as a function of the parents' mixing proportions. Each cell in the table below contains the correlation between the model’s predictions and the child’s actual mixing proportions. The first two models use the mother and father’s data separately. The third model is an additive model which combines the parents’ utterances and the fourth uses the difference between the parents’ mixed utterance types. The difference model was provided as a conceivable, but unlikely model. The correlations in the first column correspond to a model using the total input, whereas the last two columns correspond to a model using only utterances directed to the child (direct) and utterances directed to the other parent (indirect).
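As a rough sketch of how the four models can be compared, the snippet below correlates each model's predictions with the child's proportions. The numbers are placeholders made up for illustration, not Quay's (2008) data, and the variable names are hypothetical; only the structure of the comparison (mother, father, additive, difference) comes from the description above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Placeholder proportions of mixed utterances, in the order
# Japanese/English, Japanese/Chinese, Chinese/English, Japanese/Chinese/English.
# These values are invented for illustration only.
mother = [0.10, 0.30, 0.05, 0.01]
father = [0.20, 0.02, 0.08, 0.01]
child = [0.16, 0.15, 0.07, 0.02]

models = {
    "mother": mother,
    "father": father,
    "additive": [m + f for m, f in zip(mother, father)],
    "difference": [m - f for m, f in zip(mother, father)],
}

for name, prediction in models.items():
    print(name, round(pearson(prediction, child), 3))
```

Running the same loop on the direct and indirect subsets of the input would fill in the other columns of the table.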

Although the mother spends more time with the child than the father, the total mixing behaviour of the child is equally predicted by the mother and the father. However, the best model is an additive model of the direct utterances to the child. That is, the child's output is closest to a model which tries to imitate the mixing behaviour of both parents.

Interestingly, the highest correlation between the mixing proportions is between the parents (0.999), which is nearly perfect. Perhaps, then, the child is simply trying to acquire the adult’s mixing strategies or 'Code'.

We can look at the data in more detail by calculating the correlations between mixing proportions for each interlocutor separately:

When addressing the mother, the child's mixing proportions reflect the mother’s total mixing proportions better than the father’s and vice versa, indicating pragmatic differentiation to each parent’s mixing. When addressing the father, the child’s mixing proportions reflect the mother’s indirect input. This could indicate that the child is mimicking the mother’s interaction with the father. The opposite isn't true, but any mimicry may be masked since the child spends so much time alone with the mother.

These two analyses suggest that the child’s mixing reflects the mixing of the parents from a very young age. Modelling allows us to gain extra insights on the potential learning mechanism for the child, but it relies on detailed data, as in Quay (2008). The model could be taken further to include considerations of location, the societal status of each language and the parents' tactics (negative evidence, implicit allowance of mixing, teaching of translation equivalents etc.).

Now for the ambitious, unfounded part: Considering a communicative code, there may be no qualitative difference between mono- and bi-lingual language acquisition. How, then, do bilinguals select words? One possible solution is to use a sort of Bayesian probability distribution over the linguistic, social and pragmatic contexts for each word that represents the best estimation of when to use a word. If a mapping between words and pragmatic and social contexts is acquired, a discrete mapping between words and ‘languages’ becomes irrelevant. This approach works equally well for acquiring one ‘language’, or several levels of tone or dialect.
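To make the idea a little more concrete, here is a toy sketch of word selection by context, with no language labels anywhere. It is only one possible reading of 'a sort of Bayesian probability distribution': a naive Bayes model with add-one smoothing, where the class name `WordChooser`, the context feature strings and the smoothing constant are all assumptions made for the example.

```python
from collections import Counter, defaultdict


class WordChooser:
    """Naive Bayes over context features; no word carries a language label."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (an assumption)
        self.word_counts = defaultdict(Counter)     # meaning -> word counts
        self.feature_counts = defaultdict(Counter)  # (meaning, word) -> feature counts

    def observe(self, meaning, word, context):
        """Record one heard use of `word` for `meaning` in a set of context features."""
        self.word_counts[meaning][word] += 1
        self.feature_counts[(meaning, word)].update(set(context))

    def posterior(self, meaning, context):
        """P(word | meaning, context), treating context features as independent."""
        words = self.word_counts.get(meaning)
        if not words:
            return {}
        total = sum(words.values())
        scores = {}
        for word, n in words.items():
            score = (n + self.alpha) / (total + self.alpha * len(words))
            for feature in set(context):
                seen = self.feature_counts[(meaning, word)][feature]
                score *= (seen + self.alpha) / (n + 2 * self.alpha)
            scores[word] = score
        norm = sum(scores.values())
        return {word: s / norm for word, s in scores.items()}


chooser = WordChooser()
for _ in range(3):
    chooser.observe("EARTH", "pridd", {"addressee:parent", "place:home"})
    chooser.observe("EARTH", "dirt", {"addressee:teacher", "place:school"})

print(chooser.posterior("EARTH", {"addressee:parent"}))
```

In this toy setting the same machinery would handle register within one 'language' (terms of endearment versus boardroom vocabulary) just as it handles 'pridd' versus 'dirt'.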

In this sense, the ‘remarkable’ ability to keep languages separate (Costa & Santesteban, 2004) seems less remarkable and less specific to bilinguals: We don’t find it remarkable that an adult refrains from using terms of endearment during a boardroom speech.

This approach would be extended to syntactic acquisition by assuming that, as the mapping between words and meanings developed, strings of words themselves became a context which was encodable in the probability distributions of words. This is essentially a constructivist approach to bilingual acquisition: Before linguistic acquisition, infants first learn an embodied perceptual ‘language’ – an iconic mapping between form and meaning – which allows them to relate structure in the world to an interaction between sensory and motor activity. The mapping between structure in the world and symbolic, linguistic representations would build itself on top of this system in the same way as syntactic (Bernardini & Schlyter) and lexical (Nicoladis & Secco) acquisition can build on pre-existing structures.

Following from this, the ‘difficult’ bit of language acquisition is not the segmentation of strings into words or words into lexicons, but the initial segmentation of the world into functional concepts. The development of this more fundamental understanding of the world may be an additional factor in the qualitative differences between mixing in children and adults.

Sunday, 8 November 2009

Languages and Poetry


by Carl Sandburg

There are no handles upon a language
Whereby men take hold of it
And mark it with signs for its remembrance.
It is a river, this language,
Once in a thousand years
Breaking a new course
Changing its way to the ocean.
It is mountain effluvia
Moving to valleys
And from nation to nation
Crossing borders and mixing.
Languages die like rivers.
Words wrapped round your tongue today
And broken to shape of thought
Between your teeth and lips speaking
Now and today
Shall be faded hieroglyphics
Ten thousand years from now.
Sing—and singing—remember
Your song dies and changes
And is not here to-morrow
Any more than the wind
Blowing ten thousand years ago.

Good point, Carl. However, poetry may be a particularly bad way to make points about language change. As Paul Valéry says in The Art of Poetry, "poetry can be recognised by its ability to get us to reproduce it in its own form: it stimulates us to reconstruct it identically."

On the other hand, although poetry has a small transmission error in terms of phonetic reproduction, the fidelity of conceptual interpretation may be a different story. Show me a class of high school English Literature students, and I'll show you eleven different, badly written interpretations.

Lazy Linking Friday

Time for some lazy linking!

I've been listening to Archers of Loaf recently - especially the incredible ending to their penultimate album, White Trash Heroes. This is likely to be in my collection forever.

I have been utterly captivated by guitarist hironou2525: Videos of beautiful guitar playing together with links to mp3s and very high quality tabs. Obviously someone who knows a thing or two about cultural transmission. They have a blog here, although I couldn't read it. I especially liked I do:

The Speculative Grammarian is a collection of linguistic satire. There is a load of stuff there, but this set of puns caught my eye.

This recent paper by Novembre et al. is interesting - Genetic distance between people is correlated with the geographic distance between them. In fact, a PCA graph draws a pretty good map of Europe. I was especially interested to see the analysis of Swiss genes - they divide on primary language!

While Linguistic Exogamy (marrying someone outside your linguistic group) is common in linguistically diverse areas of the world (Papua New Guinea, Amazon basin), this analysis may suggest it wasn't practised so much in Europe.

Levels of Bilingualism

How many people in the world speak more than one language? Probably the vast majority. In the USA it's estimated at 18% (US Census Bureau), in Canada it's about 34% (Statistics Canada) and in the EU it's about 66% (European Commission). But getting data is hard - even in countries with the infrastructure to support a large scale census, the issue of bilingualism is often not prioritised. Proxies have been used instead, such as the number of languages spoken in each country (linguistic density; e.g. Nettle, 1999) and the number of neighbour groups (Lupyan & Dale, 2009). These are probably correlated with bilingualism, but they are not the same thing.

So, maybe we can estimate a different way. The Ethnologue has data on the estimated number of speakers for each language within a country, along with the number of people in a country. Subtracting the number of people from the total number of speakers gives, in theory, the maximum number of bilinguals in a country.
Maximum Number of Bilinguals = total number of speakers for all languages – total number of people
For example, if a country has 1 million people, and 500,000 speakers of language A and 750,000 speakers of language B, then 250,000 must be bilingual (if there are no other languages spoken). The figure below shows the ratio of speakers to people with darker areas indicating higher levels of bilingualism (data from Ethnologue, created with R):
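The worked example above can be written as a couple of small functions. This is a sketch only: the function names are made up, and clamping negative values to zero is an added assumption for countries where the recorded speakers fall short of the population.

```python
def max_bilinguals(population, speakers_by_language):
    """Total speakers across all languages minus the number of people.

    Negative results (speaker counts that fall short of the population)
    are clamped to zero; that clamping is an assumption of this sketch.
    """
    return max(0, sum(speakers_by_language.values()) - population)


def speaker_ratio(population, speakers_by_language):
    """Ratio of speakers to people, the quantity shaded on the map."""
    return sum(speakers_by_language.values()) / population


# The example from the text: 1 million people, two languages.
example = {"A": 500_000, "B": 750_000}
print(max_bilinguals(1_000_000, example))  # 250000
print(speaker_ratio(1_000_000, example))   # 1.25
```

A ratio below 1 signals missing speaker data rather than a country of non-speakers.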

As expected, the data is not good enough to warrant a proper analysis. The number of speakers is underestimated (total population of world = 6 billion, total number of speakers = 5.7 billion). 12% of entries in the Ethnologue have no population data and for more than half of the countries the number of speakers is less than the number of people. One exception was Saudi Arabia, with a ratio of 9.4, possibly because 23% of the population are foreign nationals or, more intriguingly, because the majority of the population were nomadic until the 1960s.

At any rate, there appears to be no correlation with latitude (r= -0.1, t = -1.4, df = 197, p-value = 0.15) or longitude (r = -0.01, t = -0.28, df = 198, p-value = 0.8).
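For anyone wanting to check figures like these, the t statistic for a correlation follows from r and the degrees of freedom as t = r * sqrt(df / (1 - r^2)). A small sketch, using the latitude figures as input (the function name is made up, and since r is quoted to limited precision the result will only match approximately):

```python
from math import sqrt


def t_from_r(r, df):
    """t statistic for a Pearson correlation r with df = n - 2 degrees of freedom."""
    return r * sqrt(df / (1 - r ** 2))


print(round(t_from_r(-0.1, 197), 2))
```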

Ah well, back to counting people instead of numbers.
Gary Lupyan & Rick Dale (in press). Linguistic Structure is Partly Determined by Social Structure.

Genes vs Language

A point about the difference between genetic and linguistic inheritance:

I can count the number of ad banners I have clicked on one mouse. But I had to click on this one:
[picture of cat and human] - "We're not that different, if you believe we're all OneKind".

As a linguist trying to understand the genetic basis of language in humans, I immediately thought "Uh-Oh".

The link took me to a sign up sheet with a picture of a dog and Paul O'Grady. This is what it said:
"Feelings. We all have them and so do animals. 9 out of 10 people agree. We’re all OneKind."
Amazing! A whole new way to approach the study of cognition in animals - tap into the 'Wisdom of Crowds' collective subconscious of the masses. It continued:
"Can an animal feel lonely, can an animal feel scared, can an animal feel pain? Common sense and experience have long implied that animals are capable of feeling. New research reveals that 91% of people believe this."
Who could argue with those statistics? I've always said that science gets things wrong sometimes - look at the classification of the tomato as a fruit! Surely this is having an adverse effect? Yes, OneKind tells me - Seals are having a bad time in Scotland, there are snares in the world and scientists are running experiments with monkeys.

What are OneKind doing about this? The only answer on the whole site? A petition! Which doesn't seem to be being sent anywhere!
"I believe that animals can feel.
I believe we’re all OneKind."
I emphatically hope that my intrigued ad-click did not boost the ratings of a brainwashing cult. Not even brainwashing - just a group of people saying a meaningless statement, a sentence with a made up word, then three categories. And probably getting money and sympathy for it.

As an evolutionary linguist, I am very happy with the idea that we share many cognitive abilities with animals, and I'm not sure whether we'll ever find out whether animals have what people would describe as feelings. And, of course, I believe that all organisms are probably related at some level. What really annoys me, as a linguist, is that this site is exploiting a strange hypocrisy that, as Nettle & Romaine (2000) point out, people care about some obscure endangered species without caring that whole human languages are going extinct at a comparable rate. I'll be keeping my ad clicking in check from now on.

Cross-Dimensional Linguistics

It's time for some Sociolinguistics! How many names do you have? What do you call other people? Would you use first names with some but never others?

So far, I have been Sean, Seanny, D, Lep and Monyn, but not yet Mr. Roberts.

Can we tell anything about people by the names they use? Let's look at a corpus. I have selected that pillar of linguistic research and popular 90s sci-fi TV programme Sliders. Let's see what the four main characters call each other while dimension hopping around alternative versions of San Francisco and Los Angeles (from EarthPrime, with Regular Expressions):

In the graph above, Quinn, Wade, Rembrandt and Professor Arturo are represented by dashed circles with the different names people use in squares. Arrows between a person and a name indicate that person using that name, with the relative frequency indicated by the thickness of the line. All arrows from the same origin sum to 100%.
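A minimal sketch of how counts like these could be pulled out of transcripts with regular expressions. The transcript format ("SPEAKER: line"), the three sample lines and the alias list are assumptions for the example; they are not taken from the EarthPrime transcripts, and this is not necessarily how the graph was produced.

```python
import re
from collections import Counter, defaultdict

# Names mentioned in this post; a real run would need a fuller alias list.
ALIASES = [
    "Quinn", "Q-Ball", "Mr. Mallory",
    "Wade",
    "Rembrandt", "Remmy", "Crying Man", "Mr. Brown",
    "Professor",
]

# Assumed transcript format: upper-case speaker label, colon, then the line.
LINE = re.compile(r"^(?P<speaker>[A-Z]+):\s*(?P<text>.*)$")

# Longest aliases first so that multi-word names win over shorter overlaps.
NAME = re.compile(
    r"\b(" + "|".join(re.escape(a) for a in sorted(ALIASES, key=len, reverse=True)) + r")\b"
)


def name_usage(lines):
    """Count which names each speaker uses: speaker -> Counter of names."""
    counts = defaultdict(Counter)
    for line in lines:
        match = LINE.match(line)
        if match:
            counts[match["speaker"]].update(NAME.findall(match["text"]))
    return counts


def proportions(counter):
    """Relative frequency of each name, so one speaker's arrows sum to 1."""
    total = sum(counter.values())
    return {name: n / total for name, n in counter.items()}


# Invented sample lines, for illustration only.
sample = [
    "WADE: Quinn, wait! Come on, Q-Ball.",
    "REMBRANDT: Easy now, Q-Ball.",
    "QUINN: Wade, Professor, this way.",
]
usage = name_usage(sample)
print({speaker: proportions(names) for speaker, names in usage.items()})
```

Attributing each name to the person it refers to, rather than just counting it per speaker, needs a mapping from aliases to characters on top of this.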

So what can we learn? The professor and Wade have the most reciprocal relationship - they use each other's first name and title in similar proportions. This indicates a courteous respect.

The Professor and Quinn, however, have opposite proportions of first names and titles - this indicates that Quinn recognises his intellectual superior (although Quinn is much better, obviously).

Next, Quinn only calls Wade "Wade", while Wade also uses Quinn's first name the most, but also plays with "Q-Ball" (Rembrandt's favourite nickname for Quinn) and "Mr. Mallory". This pattern is typical of people who fancy each other, but, in the end, will never do anything about it.

The group's most complex relationship is with Rembrandt "Crying Man" Brown. Everyone uses his first name and affectionate diminutive "Remmy", while only the youngsters use his stage name and only the Professor uses "Mr. Brown". All this confusion is, lamentably, because even dimension hoppers can struggle socially around black people.

Where do our words come from? English is Germanic, right? Ok, but what about words like Cappuccino, Revolution and Smorgasbord? Well, those were just 'borrowed' - they don't really count, since we're intending to give them back. But how valid is this view? Over the centuries, speakers have adopted words from all over the place, yet the diversity of the sources of words is under-appreciated.

Sounds like a job for ... etymology!

The general view of languages is that they are related like a family tree. English is seen as a Germanic language, along with Dutch and Flemish, while Welsh is seen as a Celtic language along with Irish and Cornish. The tree diagram below shows this idea, and gives the impression that the last 'common ancestor' of English and Welsh was way-back Proto-Indo-European:

However, this masks the complexity of languages and language change. A strict family tree marginalises the borrowing of words from other languages. For example, there are a huge number of 'English' words with roots in French, Italian and Spanish.

Hurford & Dediu (2009) encourage us to see languages as made up of sets of linguistic units (e.g. a word), each of which can have a separate ancestry. I wondered what this would look like, so I used the Online Etymology Dictionary to create one.

The Etymology Dictionary lists the heritage of English words, for example:
Cabinet: 1549, from M.Fr. cabinet "small room," dim. of O.Fr. cabane "cabin" (see cabin); perhaps infl. by It. gabbinetto, dim. of gabbia, from L. cavea "stall, stoop, cage." Sense of "private room where advisors meet" (1607) led to modern political meaning (1644).
That is, the ancestry of 'Cabinet' can be traced back through Middle French, Old French, Italian and Latin. Similarly, the word 'Tower' also comes from Latin, but via Old English. Crawling the website, the relationships for about 5000 words were processed. I used hypergraph to display them in an interactive hyperbolic graph. You can play about with it below, or visit here. Click and drag portions of the graph on the edges closer to the middle to explore. For some reason, it starts off zoomed in on Latin, but there's a lot of detail to the right (see here for abbreviations).

For ease of presentation, the graph is simplified, with lineages of words between 'languages' first going through a language node. Also, Modern English words are not represented, but all contained within the 'Mod.Eng.' node.
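Here is a rough sketch of how language-to-language links might be pulled out of a dictionary entry like the one quoted earlier. It is not the crawler that produced the graph: the abbreviation table is a small hand-picked subset, the function names are made up, and treating every language mentioned as a step in one chain (even 'perhaps infl. by') is a deliberate simplification.

```python
import re

# A small subset of abbreviations; the real list is much longer.
ABBREVIATIONS = {
    "M.Fr.": "Middle French",
    "O.Fr.": "Old French",
    "O.E.": "Old English",
    "It.": "Italian",
    "L.": "Latin",
}

# Match an abbreviation only when it does not continue a longer word or
# abbreviation (so the "l." at the end of "infl." is not read as Latin).
_ABBREVIATION = re.compile(
    r"(?<![A-Za-z.])("
    + "|".join(re.escape(a) for a in sorted(ABBREVIATIONS, key=len, reverse=True))
    + r")"
)


def languages_in_entry(entry):
    """Languages mentioned in an entry, in order of appearance."""
    return [ABBREVIATIONS[a] for a in _ABBREVIATION.findall(entry)]


def lineage_edges(entry, target="Mod.Eng."):
    """(older, newer) pairs linking each mentioned language to the previous one."""
    chain = [target] + languages_in_entry(entry)
    return [(older, newer) for newer, older in zip(chain, chain[1:])]


entry = (
    '1549, from M.Fr. cabinet "small room," dim. of O.Fr. cabane "cabin" '
    "(see cabin); perhaps infl. by It. gabbinetto, dim. of gabbia, "
    'from L. cavea "stall, stoop, cage."'
)
print(languages_in_entry(entry))
print(lineage_edges(entry))
```

Collecting these pairs over all entries and counting duplicates gives edge weights between language nodes, which is roughly the shape of data a graph viewer needs.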

Some bits of the graph are tree-like: Words with roots in Middle High German are only borrowed through (New High) German. However, in general, the graph is not tree-like at all. The lineages of English words have all sorts of routes through earlier languages. For example, words can come from Greek via German or French. And this is only for English words. Imagine etymological data from German and French was added.

Ok, so the graph is pretty useless for research - it's just way too complicated (part of the problem is that hypergraph is designed for trees). What I'm aiming at is questioning the idea of a 'language' as a stable set cut off from other 'languages'. We don't inherit a 'dictionary' from just two individuals, like our genes; we pick up individual words from a wide range of sources, and keep adding, borrowing and changing them throughout our lifetime.

Happy Scary Season

All Hallows Eve, Hallowe'en, Halloween, Calan Gaeaf, Reformation Day, call it what you will, there are pumpkins being carved around the globe. Here's my attempt:

If you're wondering, the inspiration is from the film poster for Kurosawa's Throne of Blood.

Coke Maze

I had an idea for a game: You build a maze out of cocaine, then have to snort your way to the end without dying. For an extra element of danger, some lines can be salt/anthrax.

The anti-coke campaign below had a similar idea. However, the maze seems unsolvable, so presumably the message is "Cocaine addiction is a problem, but there's no hope".

Entrance Music

Many people have told me they know which song they want played at their funeral, but few have ever had a preference for songs to play at their birth. Ok, both are a bit strange, one you can't do because it's in the past and one you won't be around to enjoy anyway.

Of course, some would argue that the choice of music should really be left to the mother.
Dr. Jeni Worden has some advice on up-beat tracks for mothers, but these can have mixed results, for example:
I spent the early part of my labour making a mixed tape of Motown, hip hop and rock to keep me energised. While this worked when I had been labouring relatively pain-free for eight hours, I wanted to destroy it as violently as possible when I was exhausted ten hours later.
This illustrates a difference between birth and death songs. While a 12 minute guitar solo might seem a bit indulgent at a funeral, there's a lot more scope for a birth song. Wagner's Ring Cycle wouldn't be totally out of the question. At any rate, choosing a song for the 'big moment' is still tricky. A balance must be struck between the rock and roll celebration of life and the choir-of-angels pretension.

In the end (or is it the beginning?), I'd like to come out rocking. Here are some songs I'd like to emerge to (unfortunately I've taken this rather seriously and none of them are funny puns):

1. It's all too much - The Beatles
2. untitled track (Kick out the Jams) - Rage Against the Machine
3. Le Cinquième As - MC Solaar
4. The Blue Mask - Lou Reed
5. Cochise - Audioslave
6. Run Rabbit Junk - Yoko Kanno/Tim Jensen

Sound symbolism describes a ‘natural’ connection between sound and meaning. For example, it has been noted that many words which sound the same also have similar meanings (e.g. Point, poke, pike, peg, pierce, prick, prod, probe). The idea is that some meanings (like the pointyness of the example words) lend themselves more naturally to certain sounds (like the explosive 'p' of the example words). It's been a controversial theory, especially when applied across languages towards explaining the origins of language (e.g. Ramachandran and Hubbard's synaesthetic bootstrapping hypothesis). More recently, the topic of 'ideophones' has taken over.

It's fun to think of more examples, but here's a study I did a little while ago:

Sound symbolism may be felt most with very visceral words. Hurford (2007) suggests that words for ‘urine’ and ‘faeces’ should sound different across languages. Specifically, 'urine' words should have higher vowels to match the high pitch of pissing, (e.g. 'pee') while 'faeces' words should have lower vowels to match the bass sounds of the behind (e.g. 'poo').

Who said Linguists were always high-minded?

I tested this theory by finding words for ‘urine’ and 'faeces' in 29 languages, including ones from Europe, Africa, North America, South America, Asia, Australasia, Australia and the Pacific islands. There was a bias towards selecting 'childish' or child-directed words over clinical terms (e.g. 'poo' over 'faeces') following arguments for the primacy of childish language (Cytowic 1995). Here are the results:

Words with...              Urine   Faeces
High front vowels only        26        5
No high front vowels           7       32
Mixed vowel types             12        5
Total                         45       42

There were significantly more words for urine with high front vowels than for faeces (Chi-Squared = 33.07, df = 2, p <>
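The test statistic can be re-derived from the table of counts. This sketch computes Pearson's chi-squared by hand and uses the fact that, with exactly 2 degrees of freedom, the upper-tail probability is exp(-x/2); the function name is made up and the shortcut for the p-value does not apply to other degrees of freedom.

```python
from math import exp


def chi_squared(table):
    """Pearson's chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Rows: high front only, no high front, mixed. Columns: urine, faeces.
observed = [[26, 5], [7, 32], [12, 5]]
statistic, df = chi_squared(observed)

# Exact upper-tail probability for df == 2 only.
p_value = exp(-statistic / 2)
print(round(statistic, 2), df, p_value)
```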

Yes We Can

Here's an old stencil made during the Obama Campaign. Yes, it's the Master Chief from Halo. I suppose I could change it now to "Yes They Did".

Monday, 26 October 2009

Lingua Tecnologia

People worry that Capitalism and Technology will force everyone into a monoculture, destroying the idiosyncrasies of local cultures and languages. Here, I argue that, although these factors may cause people's cultures to grow increasingly closer, people's languages may diverge. This argument will take us to the North-West coast of India, but first, to a not-so-distant future.

Farmers in Gujarat, India, using the Avaaj Otalo service.

Jim Hurford's mini sci-fi Desperanto is set in a future society where people use lators - small devices which translate speech from one language into another - to speak to people outside their social group. Over time, language packs have been added so that anyone can speak to anyone else through them, even between dialects. This means that the languages of each little community are cut off from each other, leaving them free from the pressure to conform. The result is an explosion of diversity in languages.

The lators are used for trade between groups. However, similar systems may not be so futuristic.

Alastair Sussock was telling me about his Weather Insurance project for the IFMR Centre for Microfinance. It's an assessment of how farmers in the state of Gujarat, India, approach the idea of buying insurance against crop failure due to bad weather. Gujarat has an unstable climate where years of good rainfall will be punctuated by severe droughts, and weather insurance would help spread this risk. However, farmers were initially resistant to the idea of paying for a service which may not materialise. In the report, Sussock identifies 'financial literacy' as a major pitfall for the sale of insurance.

Indeed, his next project looked at how a mobile SMS system for disseminating local economic information would affect the farmers. There is a high degree of inefficiency in local markets, due to the farmers not being sure what price they will receive for their goods in which market towns. SMS farming is a system whereby farmers receive information about the prices of goods in various locations (e.g. 60 Rupees for 10kg of Cabbage in Ahmedabad) so they can best judge where to sell them and for how much. There are already several services set up (e.g. here), and many seem to be succeeding (here).

The idea is that farmers can optimise their business decisions, increasing profit and lowering prices. Armed with this financial knowledge, farmers are less dependent on social networks for information.

Daniel Nettle identifies ecological risk as a major factor in linguistic diversity. Where the environmental conditions are more variable and self-sufficiency is more difficult (e.g. a desert/floodplain), groups must form close social bonds with each other in order to spread the risk of ruin. In this situation, it's more likely that linguistic norms are adopted by networked groups and, over time, everybody ends up speaking the same language. On the other hand, in ecologies where self-sufficiency is easy, there is less need to form close social bonds with other groups, therefore the languages will remain separate. Indeed, Nettle (1999) finds direct correlations between ecological risk and linguistic diversity.

Importantly, languages will only converge when groups have close social bonds with other, distant groups. Although the SMS technology may be expanding people's social networks, the new links are domain-specific and loose. Therefore, the introduction of ecological insurance and a centralised technical lingo for economics may have interesting effects on the linguistic diversity of India. If the ecological risk can be reduced by insurance, and the inter-group dependence and linguistic communication reduced through the SMS service, then the languages of individual groups will become increasingly isolated. This may, as Nettle and Hurford predict, lead to a diversification of language.

Although farmers across India may come to share a common financial language, the languages and dialects they use with those in their closest social circles will remain their own.

Cultural Transmission and Martial Arts

As I tried to think of anything else except the searing pain in my quads at my last Kung Fu class, I started wondering about how martial arts are taught. The master gradually imparts a very stratified sequence of movements, called the form, to a student who then repeats them over and over. Importantly, the master tries to minimise the idiosyncrasies in the student's form - a slightly arched back, irregular timing, poor balance etc. Because of this, I was wondering how much variation there is between different schools practising the same forms. Do different masters have different 'accents'?

The approach of most martial arts is specifically set up in order to minimise this. Eugen Herrigel, a German professor of philosophy living in Japan, describes his attempt to master the art of Archery. He claimed that, although the initial repetition of movements isolated from application was frustrating, he eventually experienced a 'direct transference of the spirit' from his master (from Singleton, 1998). How about that for anyone doubting the fidelity of memetic transfer?

The theory of evolution states that, in order for a system to improve, there must be variation and competition. In martial arts there is literal competition between sparring partners and in organised events. The best sequences will be selected. However, the system of teaching is set up so only the highest ranking teachers can alter and add to the form. This seems sensible, since their changes are likely to improve the form, but it sacrifices improvements that could be found by chance.

Can we learn anything about the cultural transmission of language by studying the way form is transferred from master to student? Moscato (1989), in one of the first articles on memetic algorithms, argues that martial arts are indeed a good candidate for the study of complex adaptive systems:

The form, like a chromosome, is not an indivisible entity. It is composed of a sequence of defensive and aggressive sub-units which can also be divided, a pattern that resembles the structure of chromosomes, genes and alleles. But within the form there are some movements which can be understood as an indivisible unit, and these are the ones that are really important. The whole is a support to let the brain transform them as reflexes that can be automatically triggered in real combat. The individuals can compute their fitness function by the evaluation of their performance in the execution of the movements of the forms and with some tournaments where they compete.
Moscato points out that most primates, including unskilled humans, tend to fight in an unorganised way, while a martial art teaches one to use simple, efficient movements. This might be compared to the difference between the chimpanzee Nim's ramblings ('give orange me give eat orange me eat orange give me eat orange give me you') and a human's utterance ('give me the orange'). The martial art form may have a 'syntax' in the sense that the order of moves is very important.

However, there are several problems with using martial arts to study language. First, an estimate of the amount of variation in martial arts forms would be necessary. In the absence of excellent historical records, the best estimate would be reached by a comparative approach. This would involve codifying the forms and then comparing them to find underlying structures. In other words, this is the whole problem of linguistics all over again, and it's not obviously easier for martial arts than for languages.
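To make the comparative idea a bit more concrete, here is a minimal sketch of what the comparison step might look like, assuming the hard part (codifying each form as a sequence of move labels) has already been done. The move labels and the two 'schools' are invented for illustration, and edit distance is just one plausible measure of variation, not something Moscato proposes.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of move labels."""
    prev = list(range(len(b) + 1))
    for i, move_a in enumerate(a, start=1):
        curr = [i]
        for j, move_b in enumerate(b, start=1):
            cost = 0 if move_a == move_b else 1
            curr.append(min(prev[j] + 1,          # delete a move from a
                            curr[j - 1] + 1,      # insert a move from b
                            prev[j - 1] + cost))  # substitute, or match for free
        prev = curr
    return prev[-1]


# Hypothetical codings of the 'same' form as taught in two schools.
school_a = ["bow", "block_low", "punch", "kick_front", "turn", "block_high"]
school_b = ["bow", "block_low", "punch", "turn", "kick_side", "block_high"]

distance = edit_distance(school_a, school_b)
# Scale by the longer form so that long and short forms can be compared.
normalised = distance / max(len(school_a), len(school_b))
print(distance, round(normalised, 2))
```

A small normalised distance across many schools would suggest high-fidelity transmission; a large one would suggest that masters really do have 'accents'. Of course, this glosses over the question of how to decide what counts as the same move in the first place.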

Secondly, it's not clear that martial arts have a semantics. The moves are not signals used for communication, but coercive actions.

Ah well, back to the searing leg pain.