Hello, I’m working with code-switched, multilingual data and would like to calculate the percentage of intraword switching in it.
I have a tier called ‘mixed by word’, in which the transcript is divided into individual words. Three child tiers annotate the parts of each word by language: ‘morpheme break English’, ‘Morpheme Break Afrikaans’ and ‘Morpheme Break Dialect’.
How, for example, can I ask ELAN to count every annotation on the parent tier ‘mixed by word’ that also has annotations on both of the child tiers ‘morpheme break English’ and ‘Morpheme Break Afrikaans’? (I realise I’ll have to run 4 queries to cover all the possible intraword switching combinations, but once I have the formula I can alter it as needed.)
From the manual I understand that the expression \W stands for ‘any’. See my attempt at the query below. ELAN returns results, but not the correct ones: it shows me intrasentential switches rather than intraword switches (and I know intraword switches exist in the transcript).
FIND
An annotation on tier “mixed by word FA” that matches regular expression \W
WITH CONSTRAINT
An annotation on tier “mb Eng FA” that matches regular expression \W in distance of -X to +X Mixed by Word FA annotations
WITH CONSTRAINT
An annotation on tier “mb Afr FA” that matches regular expression \W in distance of -X to +X Mixed by Word FA annotations
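One thing I'm no longer sure about is \W itself. Below is a small check in Python's re module, used only as a stand-in: I'm assuming ELAN's regular expressions treat these two patterns the same way. In standard regex syntax \W matches a single non-word character (not a letter, digit or underscore), while .+ matches any non-empty text. The two test strings are made up for illustration.

```python
import re

# Hypothetical annotation values, not taken from my transcript.
plain_word = "walked"
hyphenated = "walk-ed"

# \W looks for one NON-word character, so a value made only of
# letters contains no match at all.
only_letters = re.search(r"\W", plain_word)

# The hyphen is a non-word character, so here \W does find something.
with_hyphen = re.search(r"\W", hyphenated)

# .+ matches one or more of any character, i.e. any non-empty value.
any_text = re.search(r".+", plain_word)

print(only_letters)         # None
print(with_hyphen.group())  # -
print(any_text.group())     # walked
```

If ELAN behaves the same way, then \W would skip annotations that contain only letters, and .+ may be closer to what I meant by ‘any’.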
I wonder if my stereotypes are affecting the result? The ‘morpheme break’ tiers have the stereotype Symbolic Subdivision, because I wanted to be able to have gaps between annotations (because of intraword switches). The parent tier ‘mixed by word’ has the stereotype Time Subdivision, and is itself a child of the top-level tier, which is a mixed transcription with stereotype Default.
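To pin down exactly what I am trying to count, here is a minimal sketch of the same logic outside ELAN, in Python, reading the .eaf file as XML. It assumes the usual EAF layout, in which annotations on Symbolic Subdivision tiers are stored as REF_ANNOTATION elements whose ANNOTATION_REF attribute points at the parent annotation's ANNOTATION_ID. The function names and the tiny sample document are made up for illustration; the tier names are the ones from my query.

```python
import xml.etree.ElementTree as ET

# Made-up miniature EAF: two words, the first has both an English
# and an Afrikaans part, the second only an English part.
SAMPLE = """<ANNOTATION_DOCUMENT>
 <TIER TIER_ID="mixed by word FA">
  <ANNOTATION><ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
    TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2"/></ANNOTATION>
  <ANNOTATION><ALIGNABLE_ANNOTATION ANNOTATION_ID="a2"
    TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3"/></ANNOTATION>
 </TIER>
 <TIER TIER_ID="mb Eng FA" PARENT_REF="mixed by word FA">
  <ANNOTATION><REF_ANNOTATION ANNOTATION_ID="a3" ANNOTATION_REF="a1"/></ANNOTATION>
  <ANNOTATION><REF_ANNOTATION ANNOTATION_ID="a4" ANNOTATION_REF="a2"/></ANNOTATION>
 </TIER>
 <TIER TIER_ID="mb Afr FA" PARENT_REF="mixed by word FA">
  <ANNOTATION><REF_ANNOTATION ANNOTATION_ID="a5" ANNOTATION_REF="a1"/></ANNOTATION>
 </TIER>
</ANNOTATION_DOCUMENT>"""


def find_tier(root, tier_id):
    """Return the TIER element with the given TIER_ID."""
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") == tier_id:
            return tier
    raise KeyError(f"no tier named {tier_id!r}")


def annotation_ids(root, tier_id):
    """IDs of every annotation on a tier, whatever its type."""
    tier = find_tier(root, tier_id)
    return {el.get("ANNOTATION_ID") for el in tier.iter()
            if el.get("ANNOTATION_ID")}


def referenced_parents(root, tier_id):
    """IDs of the parent annotations that a child tier refers to."""
    tier = find_tier(root, tier_id)
    return {ref.get("ANNOTATION_REF") for ref in tier.iter("REF_ANNOTATION")}


def count_mixed_words(root, word_tier, child_a, child_b):
    """Count words that have annotations on BOTH child tiers."""
    words = annotation_ids(root, word_tier)
    mixed = (words
             & referenced_parents(root, child_a)
             & referenced_parents(root, child_b))
    percent = 100.0 * len(mixed) / len(words) if words else 0.0
    return len(mixed), len(words), percent


# For a real file: root = ET.parse("my_file.eaf").getroot()
root = ET.fromstring(SAMPLE)
result = count_mixed_words(root, "mixed by word FA", "mb Eng FA", "mb Afr FA")
print(result)  # (1, 2, 50.0)
```

This is the count I would like ELAN's search to give me: the number of ‘mixed by word’ annotations that have at least one annotation on each of the two child tiers, as a share of all words.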
I hope I’ve explained this reasonably well while keeping the post short. I tried to insert a screenshot, but unfortunately that doesn’t seem to be possible.
Thanks,
Christine