Challenging people on what phonemes they’re hearing, when they’re analysing a language: that’s thankless stuff. There are subtle continua of phonetics, and if you’re actually doing this kind of thing for a living, you rely on spectrograms and electropalatograms, with chocolate paste to tell where your tongue is actually moving. One’s ears? They hear what they’re used to hearing, and fail to hear what they’re not used to hearing.

I’m biased in saying so, I have to admit, because I have a tin ear. My experiences of trusting my ear have gone badly. I did a phonetics assignment as an undergrad on establishing the phonology of a random language. I picked Cantonese. Easy choice given the demographics of Melbourne University Engineering (as opposed to Arts); insane choice for linguistics. Has anyone settled on how many tones Cantonese has? I checked, and the model of Cantonese I came up with had very little to do with any reality; my model was still exhaustively enough argued to get a good mark, regardless.

Which is how science works.

Even worse was the beginning lecture of Field Methods, when I had to transcribe unreleased stops. Unreleased stops. Cantonese had them too. I mean, what’s the point of unreleased stops when the release is the only way to hear the difference. What a daft thing to do to the listener.

I should… come to the subject at hand though, shouldn’t I. My topic is the phonetics of Mariupolitan, and how it is reflected in its orthography.

There is scattered work on Mariupolitan: Sokolov’s and Sergievskij’s 1930s studies; work by Chernyshova and Beletskij in the ’50s and ’60s, which I can’t check halfway across the Pacific from my library; Zhuravliova’s 1982 thesis and work in the early ’90s, published in the Studies in Greek Linguistics series; and finally the grammar by Symeonidis and Tombaidis, the first time linguists from Greece worked on the language. My bibliography is on my web site; search for “Mariupolitan [MRP]”.

  • Συμεωνίδης, Χ. ϗ Τομπαΐδης, Δ. 1999. Η Σημερινή Ελληνική Διάλεκτος της Ουκρανίας (περιοχής Μαριούπολης). Αθήνα: Επιτροπή Ποντιακών Μελετών.

Zhuravliova and Symeonidis & Tombaidis disagree on two allophones of Mariupolitan.

  • Zh reports that Mariupolitan has a central allophone for /i/, [ɨ], which is said to occur “before hard consonants”. S&T aren’t clear what a hard consonant is meant to be (in a Slavic linguistic context, it’s pretty obvious, but I’ll argue that here it ended up circular). At any rate, they did not find the distribution that systematic in her own transcriptions: it’s frequent word-finally. But S&T did not hear [ɨ] when they were surveying Mariupolitan; in fact, they could not hear it in the recordings Zh provided, where she had already transcribed [ɨ].
  • Zh reports that /k/ palatalises to [tʲ]: so /kifali/ comes out as [tʲifaʎ]. Again, S&T note the change is not systematic in the transcriptions, and S&T did not hear this in the field or on the tapes: all they heard was /k/. S&T consider whether this is because of Standard Greek influence, but reject it as unrealistic: Standard Greek influence has been limited, especially with the elderly villagers they’ve been dealing with. S&T also call the posited change “peculiar”, and remark that “it is not the purpose of this study to determine the reasons for Zhuravliova’s mishearing.”

That’s an uncollegial thing for them to say, but it’s allowed if it’s what the data says. Still, I’m going to be uncollegial in turn:

  • The standard Greek palatalisation of /k/ is to [c], which means that what S&T are hearing is [cifaʎ]. If it was [kifaʎ], they would have mentioned the deviation from Standard Greek as noteworthy: it gets mentioned in studies of Cappadocian Greek.
  • There’s not a lot of difference between [tʲ] and [c]. There may be less, since Zh had described her [tʲ] as palatal, not palatalised alveolar—but she does distinguish [tʲ] from [kʲ], so maybe not.
  • S&T don’t mention [c] as an alternative to [tʲ]: they dismiss Zh’s phonetics, but aren’t showing the required awareness of phonetics in this case themselves. Talking about just <κ> to refute [tʲ] is lazy.
  • The change of /k/ to [c] to [tʲ] is not absurd, and in fact there is a parallel within Greek, in Tsakonian. Tsakonian /k/ has palatalised to [tɕ] while Tsakonian /t/ has palatalised to [c]. So “weather” καιρός /keros/ is in Tsakonian τχαιρέ [tɕere], while “I honour” τιμώ /timo/ is κιμού [cimu]. That means that palatalised /t/ and /k/ have actually swapped places in the palate, with [tɕ] further front than [c].
  • And if there is a [ɨ] in Mariupolitan, Greek linguists don’t have a good track record hearing central vowels in their dialects. We only found out Samothrace has [ɨ, ə] in the ’90s, and that only because Katsanis, who discovered it, is a native speaker of Aromanian—a language which unlike Standard Greek has those central vowels.
    • Κατσάνης, Ν.Α. 1996. Το γλωσσικό ιδίωμα της Σαμοθράκης. Θεσσαλονίκη: Δήμος Σαμοθράκης.

Two more things. First, why Zh “misheard” (if that’s what’s happened) may not have been of interest to S&T, but it is of interest to me. Second, Zh is not the only person to hear those allophones in Mariupolitan. From the Shevchenko poem I posted before, Kir’jakov writes скутъеты ратлых виглызу /skuθetɨ ratlɨx viɣlɨzu/. He has the palatalisation of /k/ to [tʃ], which S&T noted as occasional (прощину /prostʃinu/), but he also has тен тъэос /tʲen θeos/ (“there is no God”, Pontic ‘κ έν Θεός [kʰ en θeos], Early Modern Greek οὐκ ἔνι Θεός), and тюнурю /tʲunurʲu/ for “new” (Standard Greek καινούργιος [cenurʝos].)

The Mariupolitan of the ’30s, written in phonetic Greek, does not have a distinct letter for [ɨ]: it’s an allophone, and alphabets aren’t normally in the allophone business. But Kostoprav clearly thought he heard [tʲ] too, at least some of the time: from Φλογομινίτρες Σπίθες p. 92, κε ντρυπιαχτικά κρέμαςιν τυ τιφάλτς /ke drupiaxtika kremasin tu tifalts/ “and she hung her head in shame”, Standard Greek και ντροπιασμένα κρέμασε το κεφάλι της /ke dropiasmena kremase to kefali tis/.

As /ke/ “and” shows, the change is hardly systematic; as in Pontic, Mariupolitan has an unfortunate homophony of /ki/ < /ke/ (more frequent for “and”) and /ki/ < /uki/ “not”. Pontic deals with it by aspirating “not” as [kʰi] (and writing it as ‘κι); Mariupolitan seems to deal with it by writing them as ки and ти.

Do they sound different? Is there a distinction made between [ci] or [ki] and [tʲi]? I can’t check Zh’s thesis at the moment, and S&T’s phonology doesn’t tell me, but the transcriptions don’t differentiate them. The orthographic distinction Kir’jakov makes of ки and ти could be artificial of course, but I don’t think they made their <т> up completely.

Assuming that S&T are right, and there is just [c] in Mariupolitan, we should remember that the Mariupolitans are bilingual in Russian now, and many of them already were in the ’30s. A Cyrillic alphabet for Mariupolitan is going to bring with it Russian notions of phonetics, and of palatalisation in particular: Kir’jakov’s Cyrillic alternates between е and э [je, e] and I don’t see a consistent pattern for it—especially when the distinction is carried across to Cyrillic publications of Kostoprav, who originally wrote in phonetic Greek.

So the act of writing down Mariupolitan in the Ukraine filters Mariupolitan phonology through Russian: certainly in Cyrillic, maybe even in Kostoprav’s Greek script, which was after all phonetic. All it takes for the “mishearing” is for Mariupolitan [c] to sound closer to Russian /tʲ/ than to Russian /kʲ/. And I wouldn’t be too harsh on the “mishearing”: S&T didn’t exactly highlight that [k] and [c] are not the same either.

About [ɨ], I’m less certain. Kostoprav doesn’t have it, and wouldn’t have been in a hurry to write it down, as it is an allophone (and how would he write it anyway, <η>?). Kir’jakov has it, but S&T can’t hear it though they tried. I suspect this is influence somehow from Russian, with some Mariupolitan dialects picking up the Russian vowel, and applying it haphazardly to their Greek. Zh’s “hard consonants” don’t tell us why, because they’re only hard (unpalatalised) if there’s no [i, j] next to them to begin with in Russian.

Filtering your orthography through the majority language’s ears is nothing new. It’s an uncomfortable fact for linguists working on minority languages. Alphabets normally code just phonemes and not allophones: that’s why Greek dropped its koppa. Linguists work out the phonemic inventory of a language, reuse the spare letters for phonemes not in English, and present coherent, well-thought-out alphabets to their communities.

But the communities’ priority is not literacy in the minority language: it’s literacy. If the proposed orthography for their language looks confusingly different to the orthography of the majority language they actually need access to, they’ll reject it. Papua New Guinea communities for instance can reject reusing <q> for [ɣ], because that’s not what <q> means in English; and learning to write their tokples shouldn’t be getting in the way of learning to write English (or Tok Pisin).

At any rate, if Kir’jakov thinks Mariupolitan has [ɨ] and [tʲ], then that’s how his texts should be transcribed. [tʲ] is a particular problem for Standard Greek speakers, because it’s so far from their expectation; in transliterating Kostoprav’s poems into Greek, Ioakimidis silently emended them back to <κ>. I didn’t help by transliterating тен тъэос as τ’ έν Θεός; following Pontic ‘κι, I should have written ʼτʼ έν Θεός, but I doubt that would have been clearer.

Oh, and Kir’jakov writing тен “isn’t” as one word? Never trust native speakers on word segmentation. There’s confusion already in his text between н та “with the” and н та “when”.

  • (S&T record /min/ > /mi tun/ for “with”, but not /n/; and they record “when” as /an, anda/. So /nda/ for “when” is not “with the”, but a variant of Greek dialectal /onde/ “when”. But native speakers aren’t historical linguists, so they can’t tell when something historically was a single word.

    You should see what Tsakonians do to their clitics when they write in the dialect…)


  • David Marjanović says:

    …and indeed [c] and [tʲ] are ends of a spectrum. Slovak is said to be closer to the [c] end than to the [tʲ] end…

  • David Marjanović says:

    Sorry for being 3 years late. I have meanwhile been to Greece twice, once to Athens and once to Crete (mostly Heraclion). The only palatalized allophone of /k/ I noticed was [kʲ], the exact same thing as in Russian. Likewise, the palatalized allophones of /l n/ are [lʲ nʲ], sounding again exactly as in Russian. All three of these sounds seem to occur in central Macedonian; at least two of them, plus [gʲ], probably occur in southern Albanian; and I've heard [kʲ] in Turkish.

    It was a bit hard to find out what the IPA symbol [c] means, because different phoneticians use it for different sounds (even affricates, not just plosives)! But, by its official description, it's a dorso-palatal plosive, made with the back of the tongue (not the tip or anywhere near it!) against the middle of the palate. That means Ladefoged was (unsurprisingly) right in equating it with the Hungarian ty, soundfile accessible from here. It's quite hard to tell apart from [tʲ], so I'm completely sure that's how Russians would interpret it at first. Other than Hungarian, the only European language I've found it in is Latvian, where it's spelled ķ – rather confusing from a synchronic perspective. Of course, the sample of languages I've heard and paid attention to is quite limited.

    I'd be very surprised if [c] and [tʲ], or of course [ɟ] and [dʲ], were distinct phonemes in any sound system. They're just too similar.

    "However, I don't think Shanghainese should really be seen as the standard of Wu"

    I wasn't trying to do any such thing; I was talking purely about Shanghainese in its own terms, which has – from the descriptions I've read – a phonemic distinction between high and middle pitch in the stressed syllable provided it doesn't begin with a voiced consonant (which automatically imposes low pitch). It's easily possible that other dialects of Wu have the five tones they're traditionally credited with.

  • 28481k says:


    True, Shanghainese is pretty much more pitch-accent than tonal, but each individual syllable has its innate tone that will be adapted in a phrase; that's why you still find tone indications when you look it up in a Shanghainese dictionary. Phonetically tone still exists, just not phonologically. Also, the pitch of the first syllable of a phrase is still determined by its original tone. 😛

    However, I don't think Shanghainese should really be seen as the standard of Wu, because of the fluidity of its phonology: it seems that each generation has begotten a new way to enunciate Shanghainese. Some changes can be attributed to the influx of Mandarin, but other changes would probably just be simplification… I asked someone on Twitter about this, and he said one shouldn't worry about the flux; after all, rural dialects have retained characteristics of old (like 1920s) urban dialects, so phonological history can still be traced, somehow.

    Oh, I didn't even start to describe other dialects of Cantonese in my previous piece, simply because I don't know enough about them. The dialect of my ancestral hometown, Dongguan, for example, has a more pervasive *h/w > f phonological change than Metropolitan Cantonese; in fact I once misunderstood what one such speaker said because he said fu6 instead of wu6 (Mandarin hù 戶, as in Hukou, the household registration system)! Tones are obviously different across accents and dialects; at times they can sound jarring to my ears, even though one shouldn't judge others' way of speaking!


    Well, unsystematic tone sandhi makes Cantonese unlearnable? I know it's hyperbole, but it does have some interesting results.

    Let's take the word orange, 橙, as an example of crazy tone sandhi. It should have a "low level" (4th) tone caang4 tsʰaːŋ˨˩ as dictated by Middle Chinese (and indeed, the corresponding Mandarin pronunciation is chéng /ʈʂʰɤŋ˧˥/). But it's such a common word for the fruit and colour (unlike Mandarin, which usually calls it 橘子 mandarin orange anyway) that it gets a habitual sandhi into caang2 tsʰaːŋ˨˥ to make it more audible! Indeed, every Cantonese speaker will attest that this is the only way to pronounce it, irrespective of its historical value. This is descriptively fair and correct; even the most ardent orthophonologist who wants to "correct" our Cantonese pronunciation would not counter that claim. However, it's just funny to see that the fossilized sandhi becomes the standard because the pronunciation derived from Middle Chinese is seen as so flat and difficult to pronounce. Other sandhied syllables would be pronounced with their original tone at least in some circumstances.

    I won't chime into the debate on what /c/ is because I don't know it well myself!

  • John Cowan says:

    My reference to /cʲ/, a palatalized palatal stop, should have told you that I was brain farting. I was both reading and writing /c/ for Russian's dental slit affricate, IPA /ts/-with-ligature. This is what I said didn't have a palatal form in Russian.

    I have no idea how a Russian ear would hear a palatal stop.

  • opoudjis says:

    @28481k, @David: thank you for weighing in. Phonology is always a lot more complicated than students hear about in first year. (And my department had less emphasis on phonology than phonetics.)

    Phonology conditioned by word class! Now that's nasty, and no wonder dialects confuse it. The lack of systematic sandhi just makes the language unlearnable, surely 🙂 (or rather, makes it more plausible that there are more underlying tones than less, since they're not as predictable.)

    @David: I'm influenced by learning Greek in a region with a dialect substrate that affricates its /k/ _i; but I'm reasonably confident that it really is [c] in Standard Greek, and I've never seen a recent phonetic treatment of Greek say different.

    @Anon, @John: that's bad news for my hypothesis. But since Russian doesn't have a [c], the options for transliteration mean it's likelier to force [c] to conflate to *anything*, than to write it as a distinct letter, especially given it's a palatal stop. Greek has always gone the phonemic route, and I think even in Mariupolitan it'd be obvious on naive linguistic analysis that the [c] is an allophone of /k/. Hence my hypothesis that it was conflated with /t/ instead because of Russian.

    But if [c] is too different from [tʲ] to conflate for a Russian-speaker, that would throw the burden of mishearing back to S&T. Still, the consistent practice of Mariupolitan native speakers, both in the 1990s and 1930s, means this [tʲ] is not just in Zh's head, as S&T suggest: it's something that does need to be explained.

  • David Marjanović says:

    The English Wikipedia says Standard Greek has [c], but the German one says [kʲ] instead… there are also other languages where [c] is traditionally used in phonetic transcriptions but [kʲ] is actually pronounced. [c] is this here (link to a page with audio files from Hungarian), very similar to [tʲ]; does that sound occur in Greek?

    I can easily imagine different dialects of Cantonese having different numbers of tones. After all, some Mandarin dialects have 3 instead of 4, others lack toneless syllables ("light tone"/"neutral tone"), and I forgot if any 5-tone dialects are left.

    According to Wikipedia and other online sources, Shanghainese doesn't have tone sandhi, or tone in general – it has a pitch-accent system: the stressed syllable of a word can be high or low, voiced consonants make a syllable low, and the pitches of the other syllables of the word are completely predictable. It's just a Chinese tradition to look at the syllable and to lack the very concept of "word" as distinct from "syllable".

    4 and 10 are apparently homophones in the Mandarin of Sichuan. Standard: sì; shí – southerners don't retroflex, so they use [s] for both, and tones are a volatile affair anyway.

  • 28481k says:

    "Has anyone settled on how many tones Cantonese has?"

    The answer is, like all things in linguistics, "yes and no".

    It is a yes as far as non-native pedagogical purposes are concerned: 6 tone "contours". That means dictionaries and teaching materials now conform to the 6-tone standard instead of the 7-tone one (as allowed in Yale) or the 9-tone one (as in the traditional analysis, which includes the unreleased-stop "entering" tones).

    It is, however a no when it comes to real phonetics. As you might have read in Wikipedia, they can't tell whether the traditional 9-tone model, 6-tone model, or whatever model is "right".

    The problem starts with the variability of the first tone (high level): is it high level or high falling? It is supposed to be divided semantically (nouns at high level, verbs at high falling), but modern dialects conflate them, and you can see that Hong Kong and Guangzhou (Canton) made different choices: HKC the former, GZC the latter.

    Unlike say dialects of Wu (Shanghainese being an extreme example) and various of Min, Cantonese doesn't have a highly systematic tone sandhi. That means, when tone sandhi occurs, you can argue that a new tone emerges, like the super high tone, the raised low tones… This makes a cogent phonetic analysis a difficult thing, and we're already excluding other Cantonese dialects! Phonology in other Cantonese dialects can also be rather different, from something relatively mutually intelligible to Metropolitan Cantonese to something rather unintelligible like Taishanese!

    "Unreleased stops. Cantonese had them too. I mean, what's the point of unreleased stops when the release is the only way to hear the difference. What a daft thing to do to the listener."

    Many a speaker would agree with you! Although -p is rather apparent, -t and -k can be difficult to discern without release, but that's just how things are done, and many speakers confuse them (with interesting results vis-à-vis Middle Chinese). The worst culprit seems to be 八 /pat/ (eight) vs 百 /pak/ (hundred), and in close succession it can become a tongue twister!

  • John Cowan says:

    I can't believe that any native russophone would conflate /c/, /tʲ/, and /kʲ/, or even any two of them. It's conceivable that /cʲ/ might be misheard, since that does not occur in Russian.

  • Anonymous says:

    Speaking as a native-English speaking undergrad with 2 years' worth of college-level Russian and a phonetics course under his belt, I'd like to disagree with the idea that "there's not a lot of difference between [tʲ] and [c]." It seems like there's a lot more affrication for the former than the latter. When I look at the waveforms, there seems to be a lot more fricative-like energy at higher amplitudes for [tʲ] than for [c].

