Ἡλληνιστεύκοντος

New TLG words in DGE VII

By: Nick Nicholas | Post date: 2010-02-15 | Comments: 1 Comment
Posted in categories: Ancient Greek, Linguistics
Tags: Ancient Greek, lexicography, lexicon, TLG

As I posted last month, the new volume of DGE (Diccionario Griego–Español) has appeared, spanning ἐκπελλεύω–ἔξαυος. As with any lexicographic work of an older language, some philology and textual emendation has been involved; this paper by Eugenio Luján Martínez gives four such instances, in Epicurus, Aretaeus, Nicander, and Galen.

I have gone through this volume and the TLG texts dated from before i AD, to find words not given in the other dictionaries out there (LSJ, LSJ Supplement, Bauer, Lampe, Trapp). I’m posting them here for interest. Note that the list may be small, but that’s because the major gaps are in papyri (which are not in the TLG)—not ancient literature, which is already (nominally) well-ploughed land. (There should be quite a few more words from TLG AD texts.) DGE’s entries are still much more detailed than LSJ’s, and its coverage of antiquity broader. I’m also omitting proper names, which I have been treating differently in lemmatisation.

I’m leaving out the glosses, because the DGE is a commercial product and all. Most of them you can guess the meaning of, if you know your way around Greek vocabulary…

ἐκπυρώδης, ες.
ἐλάφινος, η, ον. (But Trapp has the variant ἐλαφινός.)
ἐλέκεβρα, ας, ἡ. (ἠλεκέβρα, ἰλλεκέβρα, ἐκλεκέβρα).
ἐλέσσω.
ἑλωρεύς, έως, ὁ.
ἕμα, ματος, τό.
ἐμβρωσί, τό.
ἐμπιεστός, ή, όν.
ἐμπύρευσις, εως, ἡ.
ἔνδεινος, ον.
ἐνναγώνιον, ου, τό.
ἔνοψ, πος, ὁ.
ἐναντιοεργός, όν.
ἐνεδρευτός, ή, όν.
ἐνιζυγίς, ίδος, ἡ.
ἐντεροκοιλιακός, ή, όν.
ἐντραπής, ές.
ἐντυπάδιον, ου, τό.
ἐξάρακτος, ον.

Etymologies and attestation of μουνί

By: Nick Nicholas | Post date: 2010-02-13 | Comments: 9 Comments
Posted in categories: Linguistics, Mediaeval Greek
Tags: Early Modern Greek, etymology, lexicography, philology

OK, let’s draw this talk of μουνίν to some sort of close. I’ll present the first attestations of the word, as given in Trapp’s and Kriaras’ dictionary; and then I’ll reproduce Moutsos’ presentation of the various proposed etymologies, with a few of my comments.

The attestations are given with date of authorship, followed by date of earliest manuscript. The manuscript date matters because, as you’ll have already seen from TAK’s comments in previous threads, you can’t trust mediaeval scribes not to interfere with the language of what they’re copying.

John Tzetzes, Theogony (12th century/ca. 1400) οὐκ αἰσχύνεσαι, αὐθέντριά μου, νὰ γαμῇ τὸ μουνίν σου παπᾶς; Aren’t you ashamed, my lady, to have a priest fuck your cunt?: The reading is preserved only in one manuscript, ca. 1400 (“turn of 14th century”). There is a possibility that the manuscript scribe has introduced this, pejorating whatever Tzetzes originally had written. The other manuscript preserving the passage, from the 15th century, clearly saw something worth censoring in the original, to have left the second half of the verse out. But given that it also censored the previous verse, mentioning a priest as lover, we can’t be sure what it censored was a translation like “fuck your cunt”, or something closer to the Proto-Ossetian’s “have a love affair”.
Entertaining Tale of Quadrupeds 467 (ca. 1364/15th century) διὰ νὰ σηκώνῃς τὴν οὐράν, νὰ δείχνῃς τὸ μουνίν σου (Sheep to Goat): You’re here to lift your tail and show your cunt off!: The translation is due to one George Baloglou and one Nick Nicholas.
Miklosich & Müller, Acta et Diplomata Graeca Medii Aevi, Vol. II p. 53: Church synod condemnation of Constantine Cabasilas (24 August, 1383) τέταρτον· ὅτι ἐβάπτιζε ποτὲ παιδίον, ἵστατο δὲ ἐκεῖσε καὶ γυνή· ἔχρισεν οὖν τὸ βρέφος τῷ ἀγίῳ μύρῳ, εἷτα λέγει πρὸς τὴν γυναῖκα· φέρε μοι τὸ μουνίν σου ἐνταῦθα, ἵνα χρίσω αὐτὸ, καὶ οὐ συγκάπτῃ Fourthly: that he was christening a child once, and there was a woman standing there too; so he anointed the infant with holy myrrh, and then said to the woman, “bring me your cunt here, for me to anoint it; it won’t swallow it up.”: I don’t quite understand what exactly συγκάπτῃ means here, and I’m not sure I want to. Yes, he was excommunicated. Legal processes, such as this, are usually boring, but at times can give invaluable linguistic evidence. Even though in this case the grammar has clearly been antiquated, the vocabulary has not.
Mass of the Beardless Man (ca. 1500/1515~1519) (Β 165, Α 499) γραίας πορδὴ μαστίχα σου, γαδάρας μουνὶ πουγγί σου an old woman’s fart is your mastic, a donkey’s cunt is your purse: The Mass is a relentlessly scatological parody; as the previous quote shows, not all churchmen were saintly, then any more than now. It was printed in 1553, but the two manuscript versions are if anything even filthier—as is shown here.
Mass of the Beardless Man (ca. 1500/1515~1519) (Α 375) Μαγαρίζομέν σε, κὺρ Φασούλη σπανέ, καὶ ὑβρίζω τὴν ὡραίαν πατσάδαν σου ὡς ἀντίτυπον γαδάρας <τὸ> μουνίν. We pollute thee, Sir Bean Beardless, and I curse thy pretty beard as a copy of a donkey’s cunt.
Glossae Graecobarbarae (end of 15th century/1614) [cited in Meursius] τῆς γυναικὸς τὸ αἰδοῖον, ὅπερ καλοῦσι μουνὴν. The woman’s pudendum, which they call μουνί.: The Glossae Graecobarbarae have survived only in citations by the lexicographers Meursius and DuCange; they’ve been claimed to originate in Cyprus, at the end of the 15th century. (Beaudouin, Mondry 1884. Dialecte chypriote. pp. 109-131 cites the glosses and compares them to Modern Cypriot.)
Stefano de Sabio, Corona Pretiosa (1527) [cited in Meursius] μουνὴ. Cunnus. αἰδοῖον γυναικὸς. μουνί. Cunt. “(Ancient Greek) woman’s pudendum.”: The Corona Pretiosa was published by Stefano de Sabio in 1527, with a reprint in 1543; it’s a glossary that translates Modern Greek into Latin, Ancient Greek and (apparently) Italian. I’d like to register my astonishment that this is the first time I’ve heard of it. I’d also like to register my astonishment that nowadays I can *expect* a 1527 book to be digitised and online. It isn’t, but all the other lexica are.
Johannes Meursius: Glossarium Graeco-barbarum (1614) Μουνή. Membrum muliebre. [cites definitions from Corona Pretiosa and Glossae Graecobarbarae] Μουνί: Female member.: Little gripe to Kriaras’ dictionary: if they’re going to cite words as being cited in Meursius, DuCange, Vlachos and Somavera, they really should also have mentioned De Sabio and the Glossae: they push the date back a lot.
Alessio da Somavera (Alexis de Sommevoire), Tesoro della lingua greca-volgare ed italiana (1709) Μοῦνα, ἡ. μαϊμοῦ μὲ τῆν οὐράν, ἡ. (ζῶον) Mona, gatto fariano. (animale) // Μουνάρα, ἡ. Natura granda di donna. // Μουνί. βλ. Σάρκα. // ἡ Σάρκα. Le parti vergognose, honestamente parlando. Μοῦνα, ἡ. Monkey with a tail (animal). // Μουνάρα, ἡ. Large feminine organ. // Μουνί. See Σάρκα “flesh”. // Σάρκα. The embarrassing parts, to speak bluntly.: Somavera is more hesitant than previous lexicographers, but he does note the augmentative μουνάρα. He also shows that the Venetian monna “monkey”, which I mentioned confused matters in Venetian, had also entered Greek at the time.

Now to the etymologies. None of them are straightforward phonologically: there is no obvious Ancient word starting in /mon/ or /mun/, which could account for it.

A friend of mine said he always assumed it was derived from μόνος “only, unique” (as in mono- in English), because there’s only one of them. That’s in contrast to testicles, presumably, but it doesn’t exactly distinguish vaginas from penises though.

No, I’m not going further with that proposal.

Faced with this difficulty, Hatzidakis arrived at the ingenious (too ingenious) parallel of /evnuxos/ “eunuch” > /munuxos/. Each of the steps posited for that transition has precedent in Greek:

eunúkʰos
evˈnuxos, through regular phonetic change
*ˈvnuxos, through aphaeresis
ˈmnuxos, through assimilation
muˈnuxos, through epenthesis

This allowed people to look for etymologies of μουνίν in something like *βνίν. I admit to some residual scepticism; as I said, the epenthetic /u/ in /munuxos/ could be copying the latter /u/, which wouldn’t apply to /vnin/; and neither /mn/ nor /vn/ is always broken up in Greek: /mnimori/ “memorial stone”, /keravnos/ “thunder”. So if we could find a less awkward etymology for μουνίν, we’d use it.
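Hatzidakis' chain can be read as an ordered list of rewrite rules. Here is a rough Python sketch of that reading; the rule functions, the stress-free broad transcription, and the name `derive` are my own illustrative assumptions, not anything found in Hatzidakis or Moutsos. Run on a hypothetical *vnin, the same rules yield munin.

```python
# Hatzidakis' steps as ordered rewrite rules over a broad, stress-free
# transcription. The rules are deliberately crude and only cover the
# word-initial contexts discussed in the post (an assumption of this sketch).
STEPS = [
    ("regular phonetic change", lambda w: w.replace("eu", "ev").replace("kʰ", "x")),
    ("aphaeresis", lambda w: w[1:] if w.startswith("ev") else w),
    ("assimilation", lambda w: "m" + w[1:] if w.startswith("vn") else w),
    ("epenthesis", lambda w: "mu" + w[1:] if w.startswith("mn") else w),
]


def derive(word):
    """Return the starting form followed by the form after each step."""
    forms = [word]
    for _label, change in STEPS:
        forms.append(change(forms[-1]))
    return forms


print(" > ".join(derive("eunukʰos")))  # eunukʰos > evnuxos > vnuxos > mnuxos > munuxos
print(derive("vnin")[-1])              # munin
```

Note that the epenthesis rule hard-codes /u/. It says nothing about whether the /u/ in /munuxos/ is a copy of the following vowel, so it does not address the doubt about /vnin/.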

Like, say, Venetian mona, as I had at first leapt at. But as I’ve argued, the evidence from Italiot Greek is that the Venetian word probably does have a Greek origin after all, so it doesn’t help get rid of the problem.

So, let’s see who Moutsos reports has had a go. I’ve already mentioned the later etymologies:

DuCange (1688): βουνή, from βουνός “hill, mound”. As in mons Veneris: Moutsos cites Psichari and Rohlfs as rejecting it, and there’s no good reason for /vun/ to go to /mun/.
Koraes (1835): μύλλον “lip”; cf. μυλλός “cake shaped like a vagina” (Athenaeus 14.647a), and μυλλάς “prostitute”.: μυλλός and μυλλάς are derived from μύλλω “to fuck”; μύλλον apparently is unrelated, and there’s no obvious reason for /myl/ to go to /mun/, either.
Hatzidakis (1892): εὐνή > *εὐνίον “bed” > *βνίν.: It seems a bit stretched, although I did point out the parallel in Modern Greek with carriola “cradle with wheels” > καριόλα “bed” > “whore”. Hatzidakis knew it was stretched too, and didn’t want to rule out mona.
Filintas (1934): μνοῦς “down” > *μνίον: Moutsos dismisses this; the proposal “drew no attention as being entirely hypothetical”. I’m not as sure: the form is semantically possible, and phonologically less indirect: we need only posit *mnin and not *vnin. The word did stick around long enough to show up in the Graeco-Latin glossaries as a gloss of pluma “feather”, both as μνοῦς and as the vernacular diminutive μνούδιον.

Moutsos’ proposal is that this is a nominalised infinitive of βινεῖν “to fuck”. We have several such fossil infinitives in Modern Greek: φαγεῖν “to eat” > φαγί > φαΐ “food”, πιεῖν “to drink” > πιεί “drink” (dialectal); φιλεῖν “to love” > φιλί “kiss”. Moutsos adds γαμήσειν “to get married > to fuck” > γαμήσι “fucking”, parallel to λύσειν “to untie” > λύσι “untying” (dialectal); that I’m not as convinced of.

On a scale of more to less phonological plausibility (counting the intermediate steps postulated), we have (1) μνοῦς > *μνίον (2 steps); (2) βινεῖν (3 steps); (3) εὐνή > *εὐνίον (4 steps).
On a scale of more to less semantic plausibility, we have (1) βινεῖν; (2) μνοῦς > *μνίον; (3) εὐνή > *εὐνίον.
On a scale of morphological plausibility, we have (1) βινεῖν; (2) μνοῦς > *μνίον; (3) εὐνή > *εὐνίον. βινεῖν uses a nominalised infinitive, which is attested as a process, but rare. The diminutives μνίον and εὐνίον are both unattested, and -ίον did stop being a productive suffix sometime in Early Middle Greek. At least μνούδιον shows the word stuck around in the vernacular for a while (the Graeco-Latin glossaries admit colloquial words); I see no evidence that εὐνή made it to the Koine.

On balance, all three have problems, but “bed” has the most problems; and I guess “fuck” has the least (though not by as much as Moutsos thinks).
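For what it’s worth, the three rankings can be tallied mechanically. The sketch below sums each proposal’s rank across the three scales. Weighting the scales equally is purely my simplifying assumption for illustration, and the names `RANKS`, `rank_sums` and `ordered` are made up for the example.

```python
# Ranks as listed in the post: 1 = most plausible on that scale.
RANKS = {
    "βινεῖν": {"phonological": 2, "semantic": 1, "morphological": 1},
    "μνοῦς > *μνίον": {"phonological": 1, "semantic": 2, "morphological": 2},
    "εὐνή > *εὐνίον": {"phonological": 3, "semantic": 3, "morphological": 3},
}


def rank_sums(ranks):
    """Sum each proposal's ranks; a lower total means fewer problems overall."""
    return {name: sum(scores.values()) for name, scores in ranks.items()}


def ordered(ranks):
    """Proposals sorted from lowest (best) to highest (worst) rank total."""
    totals = rank_sums(ranks)
    return sorted(totals, key=totals.get)


for name in ordered(RANKS):
    print(name, rank_sums(RANKS)[name])
```

An equal-weight tally puts βινεῖν only one point ahead of μνοῦς, with εὐνή well behind. That matches the verdict above, but a different weighting of the scales could easily swap the first two.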

The other evidence that Moutsos gives is:

βινέω seems to have survived into Proto-Pontic: βιντώ “to be in a rut” (< βινητιῶ), βίντος “gadfly”.
Circumstantial influence of a nominalised τὸ βινεῖν in Alexis, cited in Plutarch:

τὰς ἡδονὰς δεῖ συλλέγειν τὸν σώφρονα.
τρεῖς δ’ εἰσὶν αἵ γε τὴν δύναμιν κεκτημέναι
τὴν ὡς ἀληθῶς συντελοῦσαν τῷ βίῳ,
τὸ φαγεῖν τὸ πιεῖν τὸ τῆς Ἀφροδίτης τυγχάνειν·
τὰ δ’ ἄλλα προσθήκας ἅπαντα χρὴ καλεῖν
The wise man knows what of all things is best,
Whilst choosing pleasure he slights all the rest.
He thinks life’s joys complete in these three sorts,
To drink and eat, and follow wanton sports;
And what besides seems to pretend to pleasure,
If it betide him, counts it over measure (Alexis fr. 271 Kock; Plutarch: Moralia 21e)

Bless those old-school translators for using verse. Damn clever: “to be lucky with Venus” (Ἀφροδίτης τυγχάνειν) as a euphemism for βινεῖν, which happens to rhyme with φαγεῖν and πιεῖν “to eat and drink”—two infinitives that happen to have survived as nouns in Modern Greek. I don’t think this is overwhelming evidence that τὸ βινεῖν was a commonplace colloquial expression, let alone an expression that turned into “cunt”; but it’s cute anyway.
μουνίν and associated compounds are common in Early Modern Greek. True, but they’re mostly in the Mass of the Beardless Man (μουνιοτζακάτος, καβουριομουνομέτωπος, σκατόμουνος), which is reasonably late, and doesn’t prove anything about etymology.

Moutsos also derives Italiot munno (and Erice Sicilian monnu) from *μοῦνος, explained as an augmentative of μουνί attested in Modern Greek. The augmentative I know is the one Somavera recorded, μουνάρα; but slang.gr confirms the existence of μούνος, adding the improvised proverb κάλλιο μούνος και στο χέρι παρά κώλος και καρτέρι, “A bush in the hand is worth two arses in the bush”. (I think.) Now, it’s true that switching genders can act as an augmentative, although that’s because diminutives old and new are neuter, so this can be viewed as a back-formation. But a masculine could also be just an archaism.

To explain: Ancient masculine ποῦς, ποδός “foot” survives in Modern Greek through the diminutive neuter πόδιον > πόδι. The masculine πόδας is also found, and is how ποῦς, ποδός would have developed on its own (switching third to first declension). πόδας is normally interpreted as an augmentative: if people remember that a masculine was bigger than a neuter, and the neuter is now the normal term, then the masculine must be an augmentative of the neuter.

But πόδας is (I think!) the normal Cretan term for “foot”, which suggests it’s an independent survival, unaffected by πόδιον. And that’s a problem with munno. μουνίν looks, at first glance, like a diminutive of *μοῦνος. We know of no Ancient noun like μοῦνος. (We do have the Ionic μοῦνος “only, unique”, corresponding to μόνος in the rest of Greek; but we’ve already rejected that track.) It’s because we don’t have an Ancient *μοῦνος that we’ve ended up looking at *vnin forms.

But Bova is evidence that there was a *μοῦνος form at some stage after all. And Moutsos’ proposal has no room for a *μοῦνος. (Neither does Hatzidakis’; Filintas’ does only with accent shift.) Bova could be doing the same gender-switch as Modern Greek, forming a local augmentative. But it really does look more like an original unattested Ancient form, of which μουνίν is the derived form.

But until someone comes up with an Ancient form that can explain *μοῦνος, Moutsos’ βινεῖν is the best proposal on the table. Not overwhelmingly good or unproblematic; but that’s the thing with etymology. And scholarship. Sometimes, we only have weak hypotheses. As the great Greek humourist (and amateur student of mythology) Nikos Tsiforos once put it, “Scholars argue when they don’t know what’s going on. When they do know what’s going on, they just say ‘1 + 1 = 2’, and they’re done.”

μουνί vs. monín

By: Nick Nicholas | Post date: 2010-02-10 | Comments: 25 Comments
Posted in categories: Linguistics, Modern Greek
Tags: etymology, historical linguistics, language contact, Modern Greek, Venetian

OK, I don’t particularly intend for this blog to be turned over to the etymology of sundry four-letter words, but the etymology of μουνί which I had posted on turns out to be complicated, and interesting. It’s certainly attracted a lot of interest in comments; I don’t remember my article on πεσσός “pier” getting this much interest.

Things being complicated, I’ll break this up in two. The first post is on whether μουνί is originally Greek or Italian. The second goes through the attestation of μουνί in Early Modern Greek, and reviews the proposed etymologies. My godfather (TAK) has raised a major issue in comments, about whether the dates of any phenomena in literary texts before 1400 can be trusted; that’s a methodological issue, and will get its own post—hopefully with a lot less vulgarity.

The language advisory still applies, especially because I’ll be discussing a metaphor based on said word.

This post and the next are of course secondary scholarship, and are indebted to the linguists who have actually had a go at working out the etymology of μουνί. The latest reference in Kriaras’ Early Modern dictionary, which I’m now inclined to agree with, is:

Moutsos, Demetrios. 1975. Varia Etymologica Graecanica. Byzantion 45. 118–130.
The article also proposes etymologies for: αγνάντια “opposite”, ανακούρκουδα “crouching”, βότσαλο “pebble”, βούρδουλας “whip”, μουνί “cunt”, βυζί “breast”. The alphabetical bias suggests this resulted from lexicon work: Moutsos acknowledges Georgacas’ help, and I suspect the article came from Georgacas’ unfinished Greek–English dictionary.

The etymology of μουνί “cunt” is complicated alright. If I were vulgar, which of course I am (but not in Greek), I might go so far as to say that the etymology of μουνί… είναι μουνί. The expression doesn’t mean what it’s a cunt of an etymology would in English—meaning unpleasant, obnoxious. In Greek, it means “it’s a mess, it’s chaotic”. The expression is odd (why would vaginas be chaotic in particular?); and the reason behind the expression is actually a key point later on.

The major complication to note is, we have a very similar word in the Northwest Mediterranean, also meaning “cunt”:

mona in Venetian (and the Venetian hinterland, in Veneto Giuliano and Trentino)
a diminutive monín, which has made it to the Lingua Franca
mouni in Occitan, reported as meaning variously “cunt”, “monkey”, and “cat”.

Coincidence can happen, but this kind of coincidence probably didn’t: these are likeliest the same word. If they are the same word, and there’s no obvious common ancestor, then one language borrowed it from the other. So which came first?

I was worried about mon-a becoming μουν-ί(ο)ν: the 12th century is way too late for the Greek diminutive suffix -ίον to have remained productive, and a loanword ending in -a should have become a word ending in -α. (Contrast τιμόνι “steering wheel”, from Venetian timón.) The existence of monín in Venetian deals with that problem.

The change of vowel between /o/ and /u/ is another issue; but etymologists rarely care about vowels (as Voltaire apocryphally noted). Greek sporadically has /o/ > /u/ around labials; on the other hand, Venetian borrowed Arabic maimūn “monkey” as monna. So the vowel difference is not something to worry about.

The etymologies I had seen in the previous post were implausible enough that I was happy to accept mona came before μουνί. But there are reasons not to. For starters, we don’t actually have an accepted Latin origin of mona. The Looney Tunes Aristophanean etymology of mona from βυνεῖν, mentioned in the previous post, is proof that Italianists couldn’t come up with something closer to home. Boerio’s old dictionary of Venetian, which TAK cited in comments, left open which direction the word moved in.

But the compelling argument Moutsos mentions is where else the word shows up in Italy. Gerhard Rohlfs noted that “cunt” is munno in the Greek of Bova, Calabria; and munnu in the Siciliano of Erice. It would be odd for a Venetian form to show up in the Calabrian and Sicilian hinterland. (OK, Bova and Erice are pretty close to the sea, but still.) Southern Italian Greek is archaic, with much influence on the Romance dialects that replaced it; that’s why Rohlfs became interested in it. The word didn’t get into Southern Italy from Latin: it’s much likelier to have gotten there from Greek than from Venetian.

That doesn’t necessarily mean *Ancient* Greek: Southern Italy only became cut off from Byzantium in the 11th century, and Greek was used as a legal language for several centuries longer.

Given Southern Italy, and the lack of a Latin etymon, I’m inclined to go with a Greek origin, then. The one remaining oddity is the ending: if it was difficult to accept mon-a becoming μουν-ίν, it is also difficult to accept μουν-ίν becoming mon-a. But it’s entirely possible that mona was backformed from monín, and that monín was the original form that entered Venetian.

That’s a matter for Romanists to work out, though; at any rate, Cortelazzo & Marcato’s 1992 Dizionario Etimologico dei Dialetti italiani accepts a Greek origin, with an Arabic sideswipe:

The word corresponds to the Modern Greek mouní, and seems to belong to the wave of Grecisms that penetrated into Venice in the 14th and 15th century (Cortelazzo 1970). Alternatively, the homophony with monna “Barbary ape, monkey” (from the Arabic maimūn) could let us suspect a transition from the animal name; but there is no lack of other etymological hypotheses, including the personal name Mona, from Simona.

As often occurs in etymology, of course, coincidences matter: if Venetian had three identical words for “cunt”, “monkey”, and “Mona”, people would start conflating them in their heads. It’s plausible that Venetian monín went across to Occitan mouni, and it’s also plausible that the confusion took hold there, so that “monkey”, “pussy”, and “pussy cat” got entwined. (Is “monkey” a term of affection there? It sort of is in Greek, via “mischievous child”. Then again, that usage occasionally turns up in English too.)

So mona has picked up semantic richness in Venice and Toulouse. But that’s not inconsistent with mona being borrowed from Greek. All you need is for the word to have become common and entrenched in the receiving language.

TAK also noted the coincidence of the expression “become cunt” in Venetian (deventar una mona), as recorded in Boerio’s dictionary, and Greek (έγινε μουνί). The expression also indicates a word common and entrenched enough to support metaphor; but I don’t think it is that illuminating. Or rather, it’s illuminating, but not illuminating about how the word travelled.

The Greek expression, as I mentioned above, means “to become messed up”. To illustrate it, I here reproduce an exchange I witnessed 15 years ago, between two recent female graduates in linguistics, in Salzburg:

GRADUATE 1: I’m not going to put the Mozartkugeln in my luggage, γιατί θα γίνουν μουνί! [Because they will become cunt = they will be ruined, they will be a mess]
GRADUATE 2: *laughs very nervously, because language taboos do still count for something*
GRADUATE 1: Μα θα *γίνουν* μουνί! [Well they *will* become cunt!]

Being a creature of little imagination, I couldn’t quite place what the analogy was between “cunt” and “mess”. That’s because I wasn’t aware of the other meaning it has in Greek: “be drenched”. (My experience of Greek has been sheltered.) To my surprise, the expression isn’t defined in slang.gr, the Greek Urban Dictionary; but it needn’t be, it’s in the “proper dictionaries”.

This blog gives Babiniotis’ dictionary’s definition of μουνί, including the phrase τα κάνω μουνί/γίνομαι μουνί “make things cunt/become cunt”: “(i) drench someone/something; (ii) argue strongly with someone, ruin one’s relationship with”.
The Triantafyllidis Institute’s dictionary entry skips the verb, and makes “mess” (the missing link in Babiniotis’ entry) a secondary meaning of μουνί, optionally combined with καπέλο “hat”: “Phrase: cunt(–hat): (a) a mess, damage, or turmoil: After the party, his house was cunt(–hat); (b) a turn for the worse; (c) noisy argument: She’s become cunt(–hat) with her husband again.”

So Babiniotis records the “drenching” meaning as primary, and skips “mess” as a stepping stone between “drenching” and “argument”; OTOH Triantafyllidis skips “drenching”, and goes from “mess” to “argument”.

I trust my readers can work out the semantic transition “become [like] a cunt” > “become drenched” > “become a mess (physically)” > “become a mess (situation)” > “end up in an argument (with messy consequences)”. The metaphor builds on the taboo of μουνί, and packages it with plenty of sexism; but its starting point is the association of μουνί with wetness.

Venetian makes the opposite association: Boerio defines “become a cunt” as “become flabby, wither, dry up: lose freshness, beauty, joy; said of a man”.

Venetian’s exploiting the taboo of mona too, in the cause of colourful language; but the initial metaphor is not wetness. I’m guessing it is “shameful to behold” > “ugly to behold”; and the unseemliness is emphasised by applying it to the wrong gender. Which appears to involve a different repertoire of sexism.

So we have the same words in the expression, but different connotations leading to different senses. (And of course English has different senses again, let alone the different metaphorical meanings of cunt in American and Commonwealth English, as applied to a person.) That doesn’t really tell us which language the word came from. Metaphors cluster around concepts, however they happen to be expressed. Metaphors can travel, because people travel. But metaphors can be coined independently, and end up in different places.

So it doesn’t look like “become a cunt” necessarily travelled between Greece and Venice; it could have been coined independently. Still, the word itself clearly travelled. One further piece of evidence for monín getting around the Mediterranean is indirect evidence that it got into Lingua Franca. The *original* Lingua Franca. Kahane & Tietze’s reference work on Lingua Franca is based on common nautical loanwords through the Mediterranean, as they have ended up in Turkish. We don’t know a lot about the Lingua Franca, and we don’t need an intermediate pidgin to explain Italian nautical loanwords in Turkish. But since the Lingua Franca did exist, and was multilingual, it is the most plausible vehicle for such loans to have happened.

Now, one of the entries in Kahane & Tietze is *monín de gassa “cut splice”. We don’t have evidence for the Venetian expression (unsurprisingly, since it literally means “eye cunt”), but it did survive in Turkish as munikasa or münikasa. (Or at least it did: the Turkish Wikipedia names it as kesik örgü.)

A “cut splice” is a kind of rope splice, that, well, looks like a monín:

It looks more like a monín given that the slit closes when the rope is taut. Venetian sailors weren’t the only people who thought so: the English name of the splice is cut splice, but as Wikipedia informs us, there used to be an extra n in cut…

Tzetzes’ Theogony, continued

By: Nick Nicholas | Post date: 2010-02-06 | Comments: 16 Comments
Posted in categories: Linguistics, Mediaeval Greek
Tags: Byzantine Greek, Ossetian, philology

I have picked up Hunger’s edition of the epilogue to Tzetzes’ Theogony, so I can now fill in some of the questions left open in my previous post, and correct some misunderstandings I had. In a separate post, I’ll speculate further on the etymology of μουνί. I’ve changed my mind on it, btw.

But first, to Tzetzes.

We know of five manuscripts of the text

V₁: Vindobonensis phil.gr. 321 (second half of 13th century)
C: Casanatensis gr. 306 (1413)
P: Vaticanus Palatinus gr. 424 (16th century)
B: Vaticanus Barberinus gr. 30 (15th century)
V: Vindobonensis phil.gr. 118 (turn of 14th century)

and most of them give up on the epilogue completely. Here is the translation of the epilogue Language Hat cites from Alexander Kazhdan’s Change in Byzantine Culture in the Eleventh and Twelfth Centuries. I’ve tweaked the translation where appropriate, and added the reconstructed Proto-Ossetic:

[V₁ already gave up copying 45 verses ago: We left out the entire epilogue, because it just went on too long]
One finds me Scythian among Scythians, Latin among Latins,
And among any other tribe a member of that folk.
[P stops copying]
When I embrace a Scythian I accost him in such a way:
“Good day, my lady, good day, my lord:
Salamalek alti, salamalek altugep.”
And also to Persians I speak in Persian:
“Good day, my brother, how are you? Where are you from [Missing in C], my friend?
Asan khais kuruparza khaneazar kharandasi?”
To a Latin I speak in the Latin language:
“Welcome, my lord, welcome, my brother:
Bene venesti, domine, bene venesti, frater.
Wherefrom are you, from which theme [province] do you come?
Unde es et de quale provincia venesti?
How have you come, brother, to this city?
Quomodo, frater, venesti in istan civitatem?
[C stops copying: And there were many other verses of sundry dialects, but I omitted them as useless. We’re down to V and B]
On foot, on horse, by sea? Do you wish to stay?
Pezos, caballarius, per mare? Vis morare?”
To Alans I say in their tongue:
“Good day, my lord, my archontissa, where are you from?
Tapankhas mesfili khsina korthi kanda,” and so on.
(dæ ban xʷærz, mæ sfili, (æ)xsinjæ kurθi kændæ)
If an Alan lady has a priest as a boyfriend, she will hear such words: [verse only in V]
“Aren’t you ashamed, my lady, to have a priest fuck your cunt? [missing in B, only given in V]
(οὐκ αἰσχύνεσαι, αὐθέντριά μου, νὰ γαμῇ τὸ μουνίν σου παπᾶς;)
To farnetz kintzi mesfili kaitz fua saunge.”
(du farnitz, kintzæ mæ sfili, kajci fæ wa sawgin?)
[Literally: “Aren’t you ashamed, my lady, to have a love affair with the priest?”]
Arabs, since they are Arabs, I address in Arabic:
“Where do you dwell, where are you from, my lady? My lord, good day to you.
Alentamor menende siti mule sepakha.”
And also I welcome the Ros according to their habits:
“Be healthy, brother, sister, good day to you.
Sdra[ste], brate, sestritza,” and I say “dobra deni.”
To Jews I say in a proper manner in Hebrew:
“You blind house devoted to magic, you mouth, a chasm engulfing flies,
memakomene beth fagi beelzebul timaie,
You stony Jew, the Lord has come, lightning be upon your head.
Eber ergam, maran atha, bezek unto your khothar.”
So I talk with all of them in a proper and befitting way;
I know the skill of the best management.

The language names aren’t what they seem:

I recognised ἀλτή as Turkic, confirmed that altı is Turkish for “lady”, and so assumed “Scythian” was Turkish. It was a bit odd that the Turks were being placed in Scythia—modern Ukraine and Kazakhstan; but maybe Tzetzes was thinking of some Turkic tribe up north.
In fact he was: Hunger’s manuscript has the interlinear gloss Cuman at “when I embrace a Scythian”. And Cuman was indeed a Turkic language.
With the “Persians”, I committed a thinko. I noticed that “friend” in Persian, kharandasi, looked like Turkish kardaş “brother, friend”. I also know that by the 14th century, classicising Byzantine historians referred to the Turks as Persians, referring back to the Achaemenids. But surely, I thought, Tzetzes would have actually been familiar with Persians, being part-Georgian. (More on that later.) So he wouldn’t have made that Chalcocondylian conflation. As for kardaş, I dunno, maybe it is a Turkish loan from Persian.
Not so. Hunger’s manuscript also glosses “Persian” as “Turkish”. I’m not game to suggest a (Seljuk) Turkish rendering of ἀσὰν χαῒς κουρούπαρζα χαντάζαρ χαραντάση; I may get lucky and have a passing commenter do so.
Latin is Latin. Tzetzes’ dates are ca. 1110-1180; certainly not too late for Latin to have been spoken by scholars, at least.
Alanic is Proto-Ossetian. Ironically, Alanic *is* a Scythian language, the Scythians and Alans being Iranic peoples.
I also wouldn’t object to hearing the wisdom of the crowds on the Arabic.
The Ros are the Rus’, i.e. Russia. The manuscript actually reads sdra, but Hunger consulted a Slavicist who said it was unattested, and Hunger assumed the ste dropped out as a haplology. Tzetzes, perversely (but unsurprisingly) put the foreign language fragments in the same metre as the rest of the poem; and there is indeed a missing syllable there.
Hunger finds the orthography δόβρα δένη interesting, because the words still end in full vowels (dobra deni, cf. Modern Russian dobryj denj).
The Jews get Tzetzes’ anti-Semitism in Hebrew, although the Jews of Byzantium certainly spoke Greek as their everyday language. But admitting that would be admitting they were not space aliens who didn’t belong in Byzantium; and Tzetzes couldn’t do that. Tzetzes’ use of Latin also suggests that his language command was of the debate hall, rather than the marketplace—learnèd Hebrew, rather than spoken Judaeo-Greek. Language Hat’s comment thread has some information on Tzetzes’ Hebrew.

The Greek interest in the epilogue is in its use of μουνί, and that use is surprising, because it’s not what the Proto-Ossetian says. That’s not the only thing strange about the Greek translations, though: they are in red ink in the manuscript, and don’t fit the metre like the foreign originals do. Moreover, “fuck your cunt” looks a bit over-colloquial to us, although the rest of the translation is consistent with Tzetzes’ Koine (πόθεν εἶσαι καὶ ἀπὸ ποίου θέματος ἦλθες;), and the correlation of vulgar with colloquial that we make can be anachronistic.

As Modern Greek readers will have noticed, “that he fucks” is γαμῇ, not γαμᾷ: the original verb is γαμέω, and while the vernacular was already conflating -αω and -εω conjugations by then (something that had started in the Koine), Tzetzes knew that the original verb is γαμέω—and he’d want you to know that he knew it.

Still, it’s a reasonable question to ask: can we be sure the translations are from Tzetzes himself? Hunger agrees with Moravcsik that we can, because Tzetzes was pedantic enough to gloss everything in sight. That’s not a compelling reason in my book: if he was that pedantic, why aren’t the glosses in metre? The fact that glosses show up in all four manuscripts is more convincing to me.

In particular, whoever wrote the translation “fuck your cunt” in V knew enough Proto-Ossetian to render its meaning misleadingly. Tzetzes did; I’m less certain a random scribe would, especially when most scribes ran away as fast as they could from this Berlitz job application.

Only two scribes persevered with it, and it’s interesting what B left out: not just the νὰ γαμῇ τὸ μουνίν σου παπᾶς, but any reference to the lady shacking up with a priest at all. It wasn’t just the four-letter words that offended the scribe of the Barberinus, but the social faux pas.

It’s an odd thing to do, though, translate “to have a love affair” as “to fuck your cunt”. Nikos Sarantakos asked me whether this is some indication of Georgian–Ossetian enmity being a thousand years old.

Let’s go to Wikipedia University. Tzetzes *was* Georgian and not Ossetian, right?

John Tzetzes: “was Georgian on his mother’s side. In his works, Tzetzes states that his grandmother was a relative of the Georgian Bagratid princess Maria of Alania who came to Constantinople with her and later became the second wife of the sebastos Constantine, megas droungarios and nephew of the patriarch Michael I Cerularius. [Garland, Lynda (2006), Byzantine Women: Varieties of Experience, 800-1200, pp. 95-6. Ashgate Publishing, Ltd. ISBN 075465737X.]”
(“Of Alania”, meaning “Ossetian”. Crap.)
Maria of Alania: “was a daughter of the Georgian king Bagrat IV of the Bagrationi (1027–1072) and spouse of the Byzantine Emperor Michael VII Doukas and later also Nikephoros III Botaneiates. She is frequently known as Maria of Alania in apparent confusion with her mother Borena of Alania, the second wife of Bagrat of Georgia.”
Borena of Alania: “was a sister of the Alan king Durgulel “the Great”, and the Queen consort of Georgia, as the second wife of Bagrat IV (r. 1027-1072). […] This was just one of the several intermarriages between the medieval Georgian Bagratids and their natural allies, the royal house of Alania.”

Phew. So Borena was Alan, but her daughter was born in Georgia, and her daughter’s granddaughter was related to Tzetzes. So we’re probably safe.

Of course, instead of Wikipedia University, we could just turn to what Tzetzes himself says. In his Chiliades, Tzetzes dedicates a chapter to the proposition

ΟΤΙ Ο ΤΖΕΤΖΗΣ ΚΑΤΑ ΜΗΤΕΡΑ ΙΒΗΡ ΤΩι ΓΕΝΕΙ, ΚΑΤΑ ΔΕ ΠΑΤΕΡΑ ΚΑΘΑΡΩΣ ΕΛΛΑΔΟΣ ΓΟΝΗΣ
THAT TZETZES IS OF IBERIAN (= Georgian) STOCK ON HIS MOTHER’S SIDE, AND OF PURE GREEK DESCENT ON HIS FATHER’S (Chiliades 5.17)

It’s worth going on, because it offers a hint as to Tzetzes dissing the Ossetians:

(5.591) Τῆς Τζέτζου μητρομήτορος ἡ Ἀβασγὶς ἡ μήτηρ
σὺν τῇ δεσποίνῃ Μαριάμ, τῇ Ἀβασγίσσῃ λέγω,
ἣν οἱ πολλοὶ Ἀλάνισσαν φασὶν οὐκ ἀκριβοῦντες,
ἦλθεν εἰς μεγαλόπολιν ὡς συγγενὴς καθ’ αἷμα,
Tzetzes’ mother’s mother’s Abkhazian (Ἀβασγὶς) mother,
together with Lady Mariam—I mean, the Abkhazian (Ἀβασγίσσῃ),
whom most people incorrectly call the Alan (Ἀλάνισσαν),
came to the Great City [Constantinople] as her blood relative
[— snip two generations —]
(5.612) Τούτου θυγάτηρ σὺν δυσὶν ἑτέραις θυγατράσιν
τὴν κλῆσιν Εὐδοκία μέν, μήτηρ δ’ αὐτοῦ τοῦ Τζέτζου.
Ἔγνως κατὰ μητέρα μὲν Ἴβηρα τοῦτον ὄντα·
πατὴρ δὲ τούτου Μιχαὴλ ὃς καὶ παιδεύει τοῦτον
ἐν λόγοις καὶ τοῖς πράγμασιν ὡς τὸν υἱὸν ὁ Κάτων.
She was this man’s daughter, together with two other daughters;
she was Eudocia by name, and mother of this Tzetzes.
So now you know that he is Iberian (Georgian) from his mother.
And his father was Michael, who taught him
in word and deed like Cato taught his son.
[— snip —]

We pause here for the Modern Greek readers to stop guffawing at Ἀλάνισσα, which meant “Alan” in Byzantine Greek, and means “tramp” in Modern Greek.

Ossetians, Georgians, and now Abkhazians? What is this, a Russia-NATO impasse? But yes, Georgia had that diversity of peoples back then too. Abkhazia was part of the Kingdom of Georgia at the time, and not yet a separate principality, and Wikipedia at least says the Abkhazian nobility of the time spoke Georgian. So even if Tzetzes’ great-grandmother was Abkhazian, she would have spoken Georgian—and would not necessarily have felt affinity with the Alans.

Tzetzes names Maria of Alania as Mariam, which is the Georgian form. But was Maria Alan, Abkhaz, Georgian, or what? Rather than Wikipedia University, we can refer to the roman-emperors.org article on Maria written by Lynda Garland, who actually is a Byzantine historian (and who’s cited by Wikipedia U): as far as I can tell, the Georgian monarchy was described by both Georgians and Byzantines as “of the Ap’xaz/Abasgia”, but the Bagratid heartland was further south than Abkhazia, and the name doesn’t mean the Abkhaz were running things.

But just before Tzetzes confuses us further with the Abkhazians, he says why:

(5.585) Ἡ τοῦ δὲ μητρόμητωρ μὲν Τζέτζου τοῦ Ἰωάννου
τοῦ ἱστοριογράφου τε καὶ συγγραφέως πόσων,
μητρὸς ἧν Μασσαγέτιδος, ἤγουν ἐξ Ἀβασγίδος.
Ἴβηρες δὲ καὶ Ἀβασγοὶ καὶ Ἀλανοὶ ἓν γένος·
οἱ Ἴβηρες πρωτεύοντες, οἱ Ἀβασγοὶ δευτέραν,
οἱ Ἀλανοί δ’ ἐσχήκασι τάξιν τριῶν ὑστέραν.
Tzetzes, John, historian, and author
of so many works: his mother’s mother’s
mother was a Massageta, namely from Abkhazia.
The Iberians [Georgians] and the Abkhazians and the Alans are one race;
the Iberians hold the first rank; the Abkhaz the second;
and the Alans hold the third and last rank.

Do ignore the Massagetae: an Iranic people in Herodotus, who gave Tzetzes a classical pedigree to hang on the Abkhaz. Ammianus Marcellinus hung the label on the Alans, and Procopius of Caesarea picked the Huns; so the label doesn’t mean much.

The Alans were allies of the Georgians, which is why Borena and Maria married into Georgian royalty. Tzetzes could say they were “the same” in some sense; shortly after his death, the Alan prince married the queen of Georgia, which effectively merged the countries for a couple of decades, until they both were conquered by the Mongols.

But the Alans were not the same country yet while Tzetzes lived: they were a neighbouring country, while the Abkhaz were a province of Georgia, and their nobility was assimilated. The Abkhaz were sort of Georgian; the Alans were not, and Tzetzes is eager to put them at the bottom of the heap.

He’s also adamant that Maria was not “of Alania”; and true enough, it’s her mother that was. We know her as Maria of Alania because most Byzantine historians called her that; and they called her that, as Garland explains, because they didn’t give a toss what nowheresville principality she came from. Psellus, for instance, would just as soon not mention where she came from at all. (“Maria may have been a Georgian princess, but in fact her homeland and royal parentage cut little ice with the Byzantines as a whole.”)

So Psellus, as Garland mentions, casually disses the Alan Kingdom:

(The emperor) fell in love with a girl, as I have mentioned above, who was a hostage with us from Alania. That kingdom was not particularly distinguished in itself, nor had it any great prestige.
ἐρᾷ τινος μείρακος, ὥς μοι καὶ ἄνω που τοῦ λόγου λέλεκται, ἐξ Ἀλανίας ὁμηρευούσης ἡμῖν· βασιλεία δὲ αὐτὴ οὐ πάνυ σεμνὴ, οὐδὲ ἀξίωμα ἔχουσα. (Chronographia 6.151)

But Psellus wasn’t any more respectful to Georgians. As far as he was concerned, “all you Caucasians look alike”.

Tzetzes cared though. Yes, he said “we’re all the same race (γένος)”. But if they were all the same race, the Alans wouldn’t have had the last rank. And Tzetzes was enough of a walking rancour machine, that I wouldn’t put the uncomplimentary mistranslation “fuck your cunt” past him.

Comparison, TLG BC and AD: log-likelihood

By: Nick Nicholas | Post date: 2010-02-06 | Comments: 3 Comments
Posted in categories: Ancient Greek, Linguistics, Mediaeval Greek
Tags: Ancient Greek, Byzantine Greek, lexicon, morphology, TLG, Wordle

Helma Dik left a comment on my post on comparing TLG AD and BC through Wordle, suggesting I use Dunning’s Log-Likelihood measure of differential word frequencies in corpora, as Wordled by Martin Mueller. That lets you work out what the real shifts in frequency are, rather than trying to eyeball them through the aggregate word counts.

Here for instance is his comparison of the Iliad to the Odyssey—which words are more frequent in the one, or the other:

I looked up Ted Dunning’s paper, failed to understand it 🙁 , and used instead the walkthrough of the computation on the user manual of the Wordhoard corpus software package.
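For anyone else who bounces off the paper, here is a minimal sketch of that kind of computation in Python. It assumes the two-term form of the log-likelihood statistic that is commonly used for comparing one word's frequency across two corpora; whether it matches the walkthrough step for step is an assumption, and the function and parameter names are illustrative, not taken from Wordhoard.

```python
import math

def log_likelihood(a, b, c, d):
    """Log-likelihood (G2) score for a single word in two corpora.

    a: occurrences of the word in corpus 1 (say, BC texts)
    b: occurrences of the word in corpus 2 (say, AD texts)
    c: total number of words in corpus 1
    d: total number of words in corpus 2
    (All names are hypothetical, chosen for this sketch.)
    """
    # Expected counts if the word were equally frequent in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    # A zero count contributes nothing (x * ln x tends to 0 as x tends to 0).
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2

def more_frequent_in_first(a, b, c, d):
    """True if the word's relative frequency is higher in corpus 1."""
    return a / c > b / d
```

The score only says how surprising the difference is; the direction (red or black in the Wordle) comes from comparing the relative frequencies, which is what the second function does.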

And this is the more statistically sound Wordle comparison. Words more frequent BC are in red, words more frequent AD are in black. I’m leaving in stop words this time, and not cleaning up the ambiguity, because this says some interesting things about the changes in Greek grammar between Classical and Late Greek. Do click:

Here are my impressionistic notes that haven’t already been covered in the previous post (where I was working through rankings):

Both corpora talk about θεός God, but the big jump, of course, is Χριστός Christ. The second biggest jump is in ἅγιος holy, displacing ἱερός. (Was ἱερός too pagan-sounding?)
But the biggest discrepancy between BC and AD Greek is the avoidance of δέ but, on the other hand, followed by avoidance of μέν on the one hand. That tells you that AD Greek used different sentence structures, such as a lot more ἀλλά but. Tucked away, there’s also more καί and (i.e. more coordinating constructions) and a lot less τε and (a very archaic phrase-second construction).
There are a lot more ἤγουν and τουτέστι that is, and a lot less ἐάν if and ἄρα therefore; I’m tempted to think that says something about changing rhetoric in the genres popular in the respective periods—less logic, more exemplification. It’s foolhardy, but not impossible.
There is a lot more τίς who? being reported, and that’s an error in ambiguity, but it’s an illuminating error. τοῦ in Attic (though not Late Greek) is ambiguous between “whose?”, and the genitive definite article. And there are a lot more definite articles in Late Greek, as you can see by the black ὁ. (My friend Io Manolessou actually wrote her PhD on that shift; nice to see it visually confirmed.)
There’s also more ἵνα in order to, which suggests Late Greek was already moving towards more subjunctive constructions rather than participles and infinitives, even before Early Modern Greek made the switch completely.
Clearly less ὦ Ο!—A very Classical way of addressing people.
Some of the odder looking words more prevalent in BC Greek are there because there are a lot more geometric texts in the BC corpus: Ἄβ is actually mistakenly picking up the line ΑΒ, and you can also see in smaller print ΑΒΓ, ΒΔ, ΓΔ, ΕΖ, ΞΖ.

Hm. Yes, that was somewhat more illuminating. Thanks, Helma!

μούτζα, μουνί and Tzetzes

By: Nick Nicholas | Post date: 2010-02-03 | Comments: 27 Comments
Posted in categories: Linguistics, Mediaeval Greek, Modern Greek
Tags: Byzantine Greek, etymology, historical linguistics, Modern Greek, Ossetian, philology

I thank my esteemed commenters on the last post, and have a post-length response to them, concerning:

… Ah yes. There is a Language Advisory on this post.

The Complaint of the Anonymous Naupliot

Nauplion: Ever onward. You should add this one:
http://angiolello.net/Anonymous.html

The Complaint of the Anonymous Naupliot is not currently in the pipeline to my knowledge, but it’s a fascinating text, and I commend to everyone else your post on it.

The Byzantinicity of the Greek insulting gesture of the moutza

Peter: If I’m not mistaken, the μούντζα gesture, not the name itself, goes all the way back to classical times: Greek Sicily.

Hadn’t heard that. Everything’s possible, but does the source make it clear it’s the same gesture?

The Greek insulting gesture of the moutza, involving the spread palm directed at the target (or at oneself, in a Greek equivalent of the facepalm), is traditionally derived as cognate to μουτζούρα “smudge”, and referring to pillorying criminals by smearing ash (or worse) on them.

I did find a blog saying someone’s written the gesture is Ancient and represents the rays of Helios, which is uh, yeah. The blogger doubts the gesture is Byzantine, because if it was, wouldn’t it be attested outside Greece. Well,

who said the gesture was used throughout the Empire,
who said every part of the (increasingly shrinking) empire has had cultural continuity to this day—especially with the massive population movements since the Goths first came for a visit,
who said the gesture isn’t used outside Greece? Oh, you mean Nigeria wasn’t part of the Byzantine Empire? Damn…

(I have to wonder though: has anyone checked in Albania? Or, given Pierre’s comment, the Roma?—these phenomena don’t come to a halt at borders finalised in 1912, after all.)

The blogger also disputes that the moutza originated in pillorying, because the Dodecanesian “moutzes and ash on you” is a major curse, and pillorying was meted out for minor infractions.

It wasn’t limited to minor infractions, as this extensive excerpt from Koukoules’ encyclopaedia of Byzantine realia shows: it included adultery, theft, and rebellion; and it could be combined with blinding.
She’s underestimating the potency of shame culture.
If the moutza combined with ash isn’t about pillorying, I can’t see what else it’s about.

The controversy over the etymology of μουνί “cunt”

Pierre: In reference to your last remark, is μουνίν related to the gypsy gesture, the μούντζα? I have always believed with Colin Edmonson that it probably is. (The gesture has power. There is a wonderful story of Eugene Vanderpool, exasperated by a pestilential taxi driver while trying to give an introduction to the “white tower” on the Eleusis road. He finally gave the driver all ten, and the taxi ran, not fatally, into a power pole.)

I’d have thought, as much as anything, the taxi driver was astonished that the Frank knew the local gestures: not just the moutza, but the double moutza, at that.

Relate μουνίν to moutza? I don’t see it: I don’t know where the /dz/ would come from, and the semantics doesn’t fit either.

I’ve seen an obscure Hesychian lemma proposed for μουνί “cunt” (was it Korais?), and Venetian. The etymologies I’m finding in the dictionaries are far-fetched enough to show why scholars have been confused. Not that they’re wrong necessarily, they’re just not obvious.

Triantafyllidis dictionary: Ancient εὐνή “bed, wedding bed” > Hellenistic diminutive *εὐνίον > Mediaeval *βνίον > *μνίον (cf. εὐνοῦχος > μουνοῦχος “eunuch, gelding”, ἐλαύνω > λάμνω “arrive”) > *μουνίον (cf. *μνοῦχος > μουνοῦχος)

Hm. I mean, the developments proposed all could have actually happened in Greek: /evnion/ as a diminutive, /vnion/ with deletion of initial vowel, /mnion/ with assimilation, /munion/ with epenthesis. But /mnuxos/ > /munuxos/ is surely repeating the /u/ already there for its epenthesis, and the only mn- word I know survived into the modern vernacular, μνημόρι “memorial stone”, didn’t go to *μουνιμόρι. (Although given what μουνί means, it couldn’t.) I’m not sure /u/ is a regular epenthetic vowel in Greek, but to be honest I can’t think of epenthetic vowels in Greek right now.

The semantics seems stretched too. The word εὐνή seems to have been poetic, particularly in any marital connotation; I’d be very surprised if it survived alongside κοίτη. Modern Greek does admittedly use καριόλα “orig. wooden bed” (Italian carriola) to mean “whore”: it’s a straightforward metonymy, although the carriola was originally a cradle.

(So the Greek dictionary tells me; carriola in Italian now seems to mean “wheelbarrow”… Oh, I see, it was both: “The characteristics of a carriola were that it was a small bed and that it had wheels; this made it easy for a servant or young person to push it under the great bed occupied by the owner of the bedchamber”. Thornton, Peter. 1991 The Italian Renaissance interior, 1400-1600. H.N. Abrams. p. 153.)

But the further claimed step of *εὐνίον from “bed” to “cunt”… well, I dunno, anything’s possible.

The Triantafyllidis institute isn’t convinced by its derivation from “wedding bed” either, because they suggest another derivation:

Ancient μνοῦς “soft feather, down” > Hellenistic diminutive *μνίον > Mediaeval *μουνίον (as in the previous hypothesis) > Mediaeval μουνίν

At least that’s slightly more plausible semantically than “little bed”, although the attested diminutive (in the Latin-Greek glossaries) is μνούδιον. And, um, “fine, soft down, as on young birds”? Oooo-kay…

But then, it’s all blown skyhigh by the third option:

(But also cf. Venetian mona, same meaning)

As long as we can get a Romance etymology for mona, we can dispense with the epenthetic acrobatics… Except that Tzetzes is a bit early for Venetian loanwords.

Looking at Andriotis’ Etymological Dictionary, it turns out all three proposals are pedigree. The “bed” derivation is from Georgios Hatzidakis, the founder of Modern Greek linguistics (though not infallible). The “down” is from Menos Filintas, a good etymologist who hasn’t gotten enough attention (although you’ll see him very often in Andriotis).

The Venetian etymology? Gustav Meyer. The contemporary of Hatzidakis who performed an even more valuable service. Thanks to Hatzidakis, we know the rules which derived Modern Greek words from Ancient. Thanks to Meyer, we know that there are words in Modern Greek from other languages. 🙂 (Meyer did the pioneering work in identifying Albanian, Aromanian and Venetian loanwords in Greek.)

If Tzetzes is early enough to disprove Venetian influence (not a given), and if the Hunger manuscript is preserving Modern Greek as written from Tzetzes, and not the scribe’s ad lib on an earlier, cleaner, and more accurate rendering of the Ossetian (which is also not a given)… then I’ll go with “down” over “little bed”.

Babiniotis’ dictionary has another couple of guesses:

“*μνίον derived from Ancient βινεῖν ‘to fuck'”. There are other instances of ancient infinitives turned into modern nouns—φαγεῖν “to eat” > φαΐ “food”, φιλεῖν “to love” > φιλί “kiss”. And the verb did stick around until the Magical Papyri and Philogelos—the latter dated 4th century AD. But unlike the mn- guesses, there’s no obvious reason for /vinin/ to go to /vnin/ > /munin/.
“mona may be derived from Greek βυνῶ “to fill” (cf. βυζαίνω), in which case it would be a Rückwanderer [loanword reborrowed into source language]”. That “Rückwanderer” (αντιδάνειο) is not an innocent comment: it’s vengeance against Meyer. And ultimately it’s not that important: if the word came into the language that way, then as far as everyone was concerned, it was Venetian.
I’d defer to an Italianist on the plausibility of the derivation, but while βυνέω ~ βύω has useful semantics (“to stuff, to plug”), the βυνέω variant occurs only once in Greek literature, in Aristophanes Peace 645, in a decidedly non-sexual context: “sealed their lips with gold”. It looks like a pretty far-fetched way to account for a Venetian vulgarity to me—far-fetched enough I’m happy to blame an Italian scholar who doesn’t actually know Ancient Greek. If we’re going to look for Venetian etymologies that way, βινεῖν is far likelier than βυνεῖν.
Babiniotis’ dictionary also repeats Hatzidakis’ and Filintas’ derivations; my memory of Hesychius as an etymology must be his entry μνοιόν “soft”, used here to support *μνίον “soft down”. God alone knows what Hesychius was referring to with μνοιόν, but I haven’t changed my mind: Venetian (ultimate origin unknown) is the most plausible etymology, then “down”, then maybe “to fuck”.

OK, that’s enough four-letter words for one post.

The curious editorial fate of Tzetzes’ Theogony

Nikos Sarantakos: Curiously, the TLG text of Theogony does not contain the Ossetian verses -the showing off is cut (abruptly?) after the Latin verses, with a note that “there were many more verses in various dialects but I omitted them as useless”

Yes; I had to do some digging to work out what happened.

Tzetzes wrote an epilogue to the Theogony, showing off his command of exotic languages.
One scribe got as far as Scythian (Turkish), Persian and Latin, before deciding “screw this, I’m copying a lineage of Gods here, I don’t care about Tzetzes’ job application to Berlitz“. And left the note Nikos cited.
That scribe’s copy is what Bekker published in 1840.
Other scribes had the same reaction: “We have left the entire epilogue unwritten because it just went on too long (διὰ τὴν πολυλογίαν)”
Fortunately for Caucasian linguistics, Herbert Hunger discovered another copy of the Theogony, with the epilogue intact. He published the epilogue in: Hunger, H. 1953. Zum Epilog der Theogonie des Johannes Tzetzes. Byzantinische Zeitschrift 46, 302-7.
Thanks to Ronald Kim for putting a googleable draft of his paper online, to allow me to discover this. The final paper is Kim, R. 2003. “On the Historical Phonology of Ossetic: The Origin of the Oblique Case Suffix.” Journal of the American Oriental Society 123: 43-72. The online draft is Kim, R. 1999. “The origin of the Pre-Ossetic oblique case suffix and its implications”. U. Penn Working Papers in Linguistics 6.1.

I’ll pick up the Hunger edition when I’m next in the library (it’s passé in most circles to physically walk to consult a journal article, but Melbourne University has no motivation to fork out for a subscription of the electronic version). But this is how the epilogue starts, before the scribe fell asleep:

And you’ll find me a Scythian to the Scythians, a Latin to the Latins,
and to all other nations, as if I’m of the same race.
And embracing a Scythian, I shall address him thus:
[Good day to you, my mistress; good day to you, my lord]
salá malék altí salá malék
And Persians, I shall address in Persian thus:
[Good day to you, my brother; where are you going? Where are you from, friend?]
asaŋxáis karúparza. xatázar xarantási
And a Latin I shall address in the Latin tongue:
[Welcome, my lord, welcome, brother]
véne venésti, ðómine; véne venésti, fráter.
kómoðo, fráter, venésti in ístan tsivitátem?
[And there were many other verses of sundry dialects, but I omitted them as useless.]

Language Hat has a translation of the entire epilogue up. Which is hardly a surprise. (The “Scythian” is slightly different in that version.)

TLG updates

By: Nick Nicholas | Post date: 2010-02-02 | Comments: 5 Comments
Posted in categories: Linguistics, Mediaeval Greek, Modern Greek
Tags: Byzantine Greek, Cretan, Early Modern Greek, Ossetian, philology, TLG

The TLG has just released a new update to its corpus. As of tonight, the automatic recognition of lemmata in the TLG which I’ve been working on has just reached 95% of all wordforms. With these two milestones, I’ll be posting a few things about the current corpus; I’ve already put up some Wordles, as you will have seen.

First, about the new texts.

Early Modern Greek is well represented in the update. Readers of this blog will have noticed as much, because I’ve spent some time dealing with the peculiarities of those texts. Romances are the genre that attracts the most scholarly interest—they make the most sense as literature to contemporary readers. Accordingly, this update includes redaction α of Livistros and Rodamni as recently published by Panagiotis Agapitos of U Cyprus (who also writes Byzantine detective novels); the War of Troy (on which I’ve already posted); and the four redactions of the Tale of Belisarius.
Jumping forward a couple of centuries, the update also includes Cretan Renaissance drama: George Chortatzes’ tragedy Erophile, and the intermedios from Erophile, retelling Tasso’s Jerusalem Delivered.
1595 isn’t the latest date of TLG texts; the Chapbook of Alexander the Great (h/t Diver of Sinks) was published in 1750, and while the collections of monastic documents are normally late Byzantine and Ottoman, one collection goes up to 1968. But Chortatzes is the first representative of Cretan dialect, and I was quite happy to tweak the lemmatiser to accommodate it.
I’ll pause to note something which shouldn’t be an oddity, but is. Chortatzes writes his stage directions in the same Cretan dialect as his dialogue. Of course he would; our stage directions are in the same language as the rest of the text. But now that there is a Standard Greek, and Cretan isn’t it, that comes across as quizzical. If anyone now writes drama in dialect (not Cretan, but possibly Pontic, and certainly Cypriot), the metalanguage of the drama is going to be Standard Greek.
That’s because the metalanguage is the dramatist’s own voice; and while hillbilly dialect might be good enough for the dramatists’ characters, it will not do for their own directions. I’d imagine it’s the same for other Dachsprache languages (language variants under the “roof” of a prestige language), like say German or Italian dialect.
A few texts are recent re-editions. Moeris’ Atticist dictionary has been updated. By telling us which colloquial forms not to use, Moeris and the other Atticists (Phrynichus is the main one) tell us a lot about what their colloquial language actually looked like.
There is also a new edition of Cyril of Alexandria’s Paschal Letters, and Eudocia’s Homeric Centos. The centos are rearrangements of verses from an old poem, to tell a new story—in this case, Homer rearranged into an account of the Passion of Christ. This sounds like a very postmodern thing to do. But then again, there’s nothing new in postmodernism, apart from its bankruptcy of an intellectual programme, as it magpies any flotsam that gets it out of having to tell a story.
Not that you need to hear about my cultural conservatism.
The TLG also has an updated edition of the fragments of John of Antioch. There has been a controversy about whether the fragments attributed to John belong to one or two authors (there are two different linguistic registers in what we have); as a result, we’re in the odd situation by Classical standards of having two new editions of the author, within three years, from the same publisher: Roberto 2005 and Mariev 2008. (The Bryn Mawr Classical Review gives context in its review of Mariev.) The TLG has gone with Roberto, which considers both registers to belong to the same author; that has the added advantage of not throwing half the fragments out of the corpus.
In gathering up bits left out from Middle Byzantine authors, the update also includes Michael Psellus’ commentary on Aristotle’s Physics, and Constantine Manasses’ Moral Poem.
Finally, the update includes the irascible John Tzetzes’ retelling of the Theogony.
We already have the original Theogony from Hesiod; so Tzetzes’ retelling might have some interest for cultural history, but is not telling us much we don’t already know. Tzetzes’ retelling has attracted attention for different reasons. Tzetzes was Georgian on his mother’s side; and he sees fit to show off his command of foreign languages, in the epilogue to his poem. That makes his Theogony the earliest attestation of Ossetian. (And in his loose Greek translation, as linked, it may well be one of the first attestations of Greek μουνίν: er, “pudendum muliebre”.)

Comparison, TLG BC and AD

By: Nick Nicholas | Post date: 2010-02-01 | Comments: 6 Comments
Posted in categories: Ancient Greek, Linguistics
Tags: Ancient Greek, lexicon, morphology, TLG, Wordle

In the previous post, I used Wordle to illustrate stop words in Greek (and, by the by, the power-law distribution of function words following Zipf’s Law). After getting rid of a whole bunch of stop words, I ended up with a Wordle of the lemmata of the TLG:

But I stopped short of making sense of the Wordle, because the TLG contains both Ancient and Mediaeval texts, and they talk about different things. I promised Wordles of the texts in the TLG from BC and AD, which will give at least a rough sense of the difference.

So here they are:

Images created by the Wordle.net web application are licensed under a Creative Commons Attribution 3.0 United States License.

The Wordle images are hyperlinked to the Wordle applets hosted there, so you can play with the applets by eliminating words. The stopwords are as before, but I also got rid of πολύς “much”, which was crowding the BC texts a bit much.

A few things jump out quickly: there’s a lot more God AD, as you’d expect (θεός), slightly more talk of “people” than of “men” (ἄνθρωπος, ἀνήρ), less talk of the City and more talk of power (πόλις, δύναμις).

But I’m not really a visual person, so I’m going to use more quantitative ways of working out the changes in vocabulary.

To begin with, the two Wordles show the 150 most frequent lemmata for each period, not counting stop words. These are the differences between the two—words in the top 150 of one period, but not the other.

Ancients talked more about…	and less about…
Ἕλλην, Ἀθηναῖος, Ζεύς, ἀμφότερος, διαφέρω, ἑκάτερος, ἐλάσσων, εἶμι, εὐ, ἤλιος, ἡγέομαι, ἱερός, κεῖμαι, κύκλος, ναῦς, νέος, νομίζω, ὀρθός, οἶκος, πλέως, πλεῖστος, πλῆθος, πόλεμος, πολέμιος, ποταμός, θάλασσα, θεά, σημεῖον, ταχύς, ὔστερος, χρῆμα, χώρα, ζῷον	Χριστός, ἅγιος, ἁπλόος, ἄξιος, ἀδελφός, ἀλήθεια, βασιλεία, δέχομαι, δηλόω, δόξα, ἐκκλησία, ἐνέργεια, εἶδος, φωνή, κίνησις, κόσμος, νόος, οἰκεῖος, οὐρανός, οὐσία, πάθος, πίστις, πνεῦμα, πρόσωπον, θάνατος, θεῖος, σάρξ, τέλος, τρίτος, χάρις, ζητέω, ζωή
Greek, Athenian, Zeus, both, to differ, either, less, go, good, sun, to lead, holy, to lie, circle, ship, new, to think, right, house, full, most, crowd, war, enemy, river, sea, goddess, point, fast, last, need, land, animal	Christ, holy, simple, worthy, brother, truth, kingdom, to accept, to declare, glory, church, activity, form, voice, movement, world, mind, own, heaven, substance, passion, faith, spirit, face, death, divine, flesh, end, third, grace, to ask, life
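The comparison behind this table is just a pair of set differences over the two top-150 lists. A minimal sketch, assuming the lemma frequencies for each period are available as `Counter` objects (the names `bc_counts`, `ad_counts` and `stop_words` are hypothetical, not the actual TLG tooling):

```python
from collections import Counter

def top_n(counts, n, stop_words=frozenset()):
    """Set of the n most frequent lemmata, skipping stop words."""
    ranked = [lemma for lemma, _ in counts.most_common()
              if lemma not in stop_words]
    return set(ranked[:n])

def top_n_differences(bc_counts, ad_counts, n=150, stop_words=frozenset()):
    """Lemmata in the top n of one period but not of the other.

    Returns (only_in_bc_top, only_in_ad_top).
    """
    bc_top = top_n(bc_counts, n, stop_words)
    ad_top = top_n(ad_counts, n, stop_words)
    return bc_top - ad_top, ad_top - bc_top

# Toy counts, invented purely to show the shape of the result.
bc_counts = Counter({"καί": 500, "θεός": 50, "Ζεύς": 40, "ναῦς": 30})
ad_counts = Counter({"καί": 600, "θεός": 90, "Χριστός": 80, "Ζεύς": 5})
only_bc, only_ad = top_n_differences(bc_counts, ad_counts, n=2,
                                     stop_words={"καί"})
```

A word can drop out of one top-150 list while staying just below the cutoff, so this kind of table shows what crossed the threshold rather than how far anything moved.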


The effect of Christianity on vocabulary use is pretty obvious. A few other changes are worth noting:

Byzantines nominalised a lot more than Ancients did. That’s at least some of the reason for ἀλήθεια “truth” (instead of the more Attic τὸ ἀληθές “the true”), and it may relate to other nominalisations like κίνησις “movement” and ἐνέργεια “activity”. (βασιλεία “kingdom” has a Biblical pedigree, but that is also because the Bible was not written in Attic.)
Many of the differences are a matter of language change, rather than different ideology. For all that most Byzantines did not write in the vernacular, their language was usually more akin to Koine than to Attic. That explains the absence of εἶμι, εὐ, ναῦς, πλέως, ἐλάσσων, ἱερός, πολέμιος (replaced by στέλλω, καλός, πλοῖον, πλήρης, μικρότερος/ὀλιγότερος, ἄγιος, ἐχθρός) “send, good, ship, full, less, holy, enemy”, and presumably also the avoidance of ἀμφότερος and ἑκάτερος “both, either”.

I’ve left out from those lists words that show up in the top 150 only because they’re ambiguous with other legitimate words. (Yes, I should have pruned the Wordles.)

BC: δίκαιον, δοκεύς, ἠώς, θέα: rights, beam, dawn, view
AD: ἅγιον, βασίλειος, ἴδιον, κενόω, πρόσωπος, ζωός: sanctuary, royal, particularity, make void, face, alive

There’s one further comparison I’ll attempt: the words whose frequency changed the most between the two periods. To track this, I’m going to use the 2000 most frequent lemmata for each period—including both normal words and stop words; that constraint means we’re only looking at words that are likely to matter. I’ll go through the lemmata in those lists whose ranking changed by the greatest amount (e.g. from #1537 to #10342).

Because it’s a pretty heterogeneous list, and different kinds of words tell us different things, I’ll split them up into categories. (And I will do some silent suppressing of ill-recognised ambiguous words.)
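A minimal sketch of how such a rank comparison could be computed. The function names and toy counts are hypothetical, and two details are assumptions: the candidates are the lemmata in the top 2000 of either period, and the shift is the BC rank minus the AD rank. That sign convention is inferred from the tables, where a move from #1537 to #10342 matches the -8805 listed for εὔδοξος.

```python
from collections import Counter


def rank_lemmata(counts):
    """Map each lemma to its 1-based frequency rank (1 = most frequent)."""
    # Sort by descending count; break ties alphabetically so ranks are stable.
    ordered = sorted(counts, key=lambda lemma: (-counts[lemma], lemma))
    return {lemma: position for position, lemma in enumerate(ordered, start=1)}


def rank_shifts(bc_counts, ad_counts, top_n=2000):
    """Return {lemma: BC rank - AD rank} for lemmata in either period's top_n."""
    bc_rank = rank_lemmata(bc_counts)
    ad_rank = rank_lemmata(ad_counts)
    candidates = {lemma for lemma, rank in bc_rank.items() if rank <= top_n}
    candidates |= {lemma for lemma, rank in ad_rank.items() if rank <= top_n}
    # Assumption of this sketch: a lemma absent from one period is ranked
    # just below that period's last attested lemma.
    bc_default = len(bc_rank) + 1
    ad_default = len(ad_rank) + 1
    return {
        lemma: bc_rank.get(lemma, bc_default) - ad_rank.get(lemma, ad_default)
        for lemma in candidates
    }


# Toy counts, invented for illustration only.
bc = Counter({"ναυμαχία": 50, "λόγος": 40, "θεός": 30, "βάπτισμα": 1})
ad = Counter({"θεός": 90, "λόγος": 60, "βάπτισμα": 20, "ναυμαχία": 2})
shifts = rank_shifts(bc, ad, top_n=2)

# Most negative first: the words the earlier period used relatively more.
for lemma, shift in sorted(shifts.items(), key=lambda item: item[1]):
    print(f"{lemma}\t{shift:+d}")
```

Negative shifts then mark words that ranked higher in the BC texts, and positive shifts mark words that ranked higher in the AD texts.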

These are the biggest shifts in proper names:

Ancients talked more about…		Rank Shift
Ἔφορος	Ephorus	-8530
Ποσειδώνιος	Posidonius	-8397
Πελοποννήσιος	Peloponnesian	-6655
Αἰτωλός	Aetolian	-5399
Ἑκαταῖος	Hecataeus	-5157
Θεόπομπος	Theopompus	-5046
Ἀπολλόδωρος	Apollodorus	-4948
Φωκεύς	Phocian	-4786
Τυρρηνικός	Tyrrhenian	-4587
Χρύσιππος	Chrysippus	-4043

Two things are going on here. First, some ancient authorities—primarily historians, if I read the names right—were of interest to several ancient writers, but of less interest to the Byzantines. They tend to be the historians whose texts didn’t survive, which is related to them being of less interest to the Byzantines. (I don’t know offhand whether that’s cause or effect.)

Second, Greece was very important to Ancient Greeks, and so were the various regions of Greece. To the Byzantines though, Greece was a backwater, and the old regions did not survive into the Byzantine system of themes. So there was no reason to talk about Aetolia or Phocis outside of Ancient History; and less reason to talk about the Peloponnese than you might think, even while the name survived. The same goes for Tyrrhenians: it wasn’t Etruscans that the Byzantines were having to deal with in Italy, but Lombards.

Ancients talked less about…		Rank Shift
Κύριλλος	Cyril	+214,509
Κωνσταντινούπολις	Constantinople	+214,399
Γρηγόριος	Gregory	+214,391
Ἀθανάσιος	Athanasius	+214,154
Γεώργιος	George	+85,856
Κωνσταντῖνος	Constantine	+47,064
Πέτρος	Peter	+40,947
Χριστιανός	Christian	+36,162
Βασίλειος	Basil	+28,217
Χριστός	Christ	+23,988

The only surprise is that Christians turn up in BC texts at all; there are only 5 instances though, and the dating of texts in the corpus is porous (late citations can appear as testimonia of earlier authors).

These are the biggest shifts in common nominals:

Ancients talked more about…		Rank Shift
εὔδοξος	reputable	-8805
κύλινδρος	cylinder	-6569
ἀσύμμετρος	asymmetrical	-5939
δημοκρατία	democracy	-5389
πυραμίς	pyramid	-4714
ναυμαχία	sea battle	-4274
κῶνος	cone	-4205
παραλληλόγραμμος	parallelogram	-3837
παρεμβολή	interpolation; encampment	-3668
ψήφισμα	decree passed by vote	-3194

If the AD texts have more theology, they clearly have a lot less geometry, and a lot less to do with representational systems of government. The drop in εὔδοξος is surprising, given it’s in Plato; I wonder if the change of -δοξ- in compounds from “reputation” to “glory” made the adjective confusing for later writers.

Ancients talked less about…		Rank Shift
ἀποστολικός	apostolic	+214,282
θεοτόκος	God-bearing (Theotokos)	+85,966
βάπτισμα	baptism	+85,945
θεότης	divinity	+59,016
μόδιος	bushel	+58,616
μοναστήριον	monastery	+57,696
σεβάσμιος	reverend	+57,691
αἱρετικός	heretic	+35,602
χάρισμα	(spiritual) gift	+35,588
πατριάρχης	patriarch	+27,001

No surprises again; the only non-religious term is μόδιος “bushel”, both as a vessel and a measure.

These are the biggest shifts in verbs:

Ancients talked more about…		Rank Shift
διαπορεύω	pass across	-5939
βλώσκω	go	-5710
εἰσοράω	look upon	-4134
ἄημι	blow (wind)	-4113
κλύω	hear	-3392
ἐπιζεύγνυμι	join to	-3039
ἀμφισβητέω	doubt	-2436
ἐφάπτω	hang on	-2122
ἱκνέομαι	come	-2088
μεταπέμπω	send for	-1974

Many of the missing verbs are poetic and/or dialectal, and would not have a natural place in Byzantine prose; that includes βλώσκω, εἰσοράω, ἄημι, κλύω, ἱκνέομαι. The surprise here is the vanishing of doubt in the Middle Ages.

… Yes, yes, the jokes just write themselves, I know…

Ancients talked less about…		Rank Shift
ἐνάγω	persuade	+10,366
βαπτίζω	baptise	+8529
ψάλλω	chant	+5021
φανερόω	reveal	+3988
καταδικάζω	condemn	+3483
φωτίζω	illuminate	+3308
περισπάω	take a circumflex	+2948
βαστάζω	carry	+2911
ἀνέρχομαι	go up	+2809
προλαμβάνω	anticipate	+2769

I admit to being less sure about some of the shifts here, such as ἐνάγω and προλαμβάνω. The Christian influence is clear in βαπτίζω, ψάλλω, φανερόω and φωτίζω. Language change accounts for βαστάζω and ἀνέρχομαι replacing φέρω and ἄνειμι, and I assume καταδικάζω for “condemn” replaced what came to look like more generic verbs, in καθαιρέω or καταγιγνώσκω. And unlike the Ancients, the Byzantines had to learn about polytonic orthography; so what word took a circumflex and what word took an acute was a matter much ink was spilled about.

Finally, these are the biggest shifts in function words:

Ancients talked more about…		Rank Shift
τοτέ	at times	-5796
αὖτε	again	-4707
δισχίλιοι	two thousand	-4676
αἴ	alas	-3334
πεντακόσιοι	five hundred	-2859
ἠέ	or	-2844
νή	[I swear] by [deity]	-2663
διακόσιοι	two hundred	-2597
μά	yea	-2470
πω	yet, at all	-2466

There is some Epic dialect here, in αὖτε and ἠέ; some strictly Attic rather than Koine words in τοτέ, πω, and δισχίλιοι; and a rather different approach to exclamations, with the old oaths by the Gods dispensed with, and the ai!’s of tragedy avoided in theological discourse. (There are 2100 instances AD of φεῦ “alas”; maybe αἴ was too specific to tragedy? *shrug*) Not sure why the written-out 500 and 200 were less popular. Maybe the armies just got bigger, so historians talked in the thousands instead of 300…

Ancients talked less about…		Rank Shift
ἀμήν	amen	+19,195
νά	to (Modern Greek)	+18,984
ἀλλαχοῦ	elsewhere	+7689
δηλαδή	that is	+7587
ἤγουν	that is	+6367
ιζ΄	XVII	+4541
καθό	insofar as	+4524
ιϛʹ	XVI	+4001
ιηʹ	XVIII	+3727
ιδʹ	XIV	+3196

It’s obvious why amen is there; it’s also obvious why να, the Modern Greek equivalent of the ancient infinitive inflection, is there. ἀλλαχοῦ for “elsewhere” is attested in Sophocles and Xenophon, but it became prevalent much later, and LSJ reports that Moeris proscribed it as vernacular, in favour of ἄλλοθι. The other conjunctions are run-in phrases, which Byzantine texts in general are rather more sympathetic to treating as single words than are ancient texts: δῆλα δή “so [they are] obvious”, ἤ γε οὖν “or indeed then”, καθ’ ὅ “according to what”.

Finally, the numerals aren’t there because the Byzantines were more numerate than the Ancients. After all, the Byzantines had given up on geometry, from what the counts tell us. (And that’s a silly enough thing to conclude that you should not take much of this too seriously.) No, the reason there’s a whole lot of XVII’s and XIV’s in the AD corpus is that there are a lot more chapter headings in the theologians…

Wordle and Greek stop words

By: Nick Nicholas | Post date: 2010-01-31 | Comments: 7 Comments
Posted in categories: Ancient Greek, Linguistics
Tags: Ancient Greek, lexicon, morphology, TLG, Wordle

Some of you may be familiar with Wordle, an online tool which displays the words in a text with different sizes, depending on their frequency. Wordle is a convenient tool for seeing what the frequently mentioned concepts are in a text, so it gets a fair amount of use in blogs. It’s the same concept as Word Clouds; but done with much more typographical finesse. This, for instance, is Wordle run over the English text of Plato’s Republic:

And courtesy of The Crazy Australian, this is the ESV New Testament:

(As The Crazy Australian noted, you can learn one thing immediately from that: the Third Person of the Trinity doesn’t get as much stage presence as the Other Two in Holy Writ. Not really a surprise, but the point of Wordle is as much to visualise the obvious as it is to discover the not as obvious.)

Wordle works quite well with English, because most words don’t have a lot of inflection, to multiply the instances of the concept you’re looking for. In a language like Greek, on the other hand, lemmatisation—or as it’s more often called in search engines, stemming—is essential. Otherwise, you get not one instance of “Jesus” or “state”, but four or five, with no material difference.
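To make the point concrete, here is a small sketch that counts the same tokens first as raw forms and then as lemmata. The lookup table is hypothetical and only covers a few case forms of two nouns; a real Greek lemmatiser has to analyse morphology rather than consult a short dictionary.

```python
from collections import Counter

# Hypothetical form-to-lemma table, for illustration only.
FORM_TO_LEMMA = {
    "Ἰησοῦς": "Ἰησοῦς",  # nominative
    "Ἰησοῦν": "Ἰησοῦς",  # accusative
    "Ἰησοῦ": "Ἰησοῦς",   # genitive, dative, vocative
    "πόλις": "πόλις",    # nominative
    "πόλεως": "πόλις",   # genitive
    "πόλιν": "πόλις",    # accusative
}


def count_forms(tokens):
    """Count surface forms exactly as they appear in the text."""
    return Counter(tokens)


def count_lemmata(tokens):
    """Count dictionary headwords; unknown forms are kept as they are."""
    return Counter(FORM_TO_LEMMA.get(token, token) for token in tokens)


tokens = ["Ἰησοῦς", "Ἰησοῦν", "Ἰησοῦ", "Ἰησοῦ", "πόλις", "πόλεως", "πόλιν"]

print(count_forms(tokens))    # six separate entries, none larger than 2
print(count_lemmata(tokens))  # two entries: Ἰησοῦς 4, πόλις 3
```

A word cloud built from the first count would scatter each concept over several small words; the second count gives one word per concept at its full size.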

Funnily enough, I do lemmatising. So what happens when you put the TLG through Wordle?

Images created by the Wordle.net web application are licensed under a Creative Commons Attribution 3.0 United States License.

Well, what you get is this:

I’ve highlighted the top seven verbs in green, and the top seven nouns in green. You can see the nouns, right?

Of course you can’t, because there’s a whopping great big ὁ and another rather outsize καί there, crowding everything else out. And being told that Greek texts have a whole lot of instances of the and and is unlikely to be what most people are curious to know.

What we have here is the notion of stop words: grammatical words that don’t convey a lot of content, and which search engines traditionally ignore. Wordle also ignores them, which is why you don’t see a lot of the and and in English-language Wordles. But Wordle doesn’t happen to be configured for Classical Greek.
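Stop-word filtering itself is a small operation. The sketch below uses a few of the lemma counts given in the frequency table in this post, and removes the first two batches of stop words that the post strips out; the function and variable names are hypothetical.

```python
from collections import Counter

# Counts for a handful of lemmata, copied from the frequency table in this post.
LEMMA_COUNTS = Counter({
    "ὁ": 14_335_717,
    "καί": 5_765_491,
    "τίς": 2_624_172,
    "δέ": 2_265_028,
    "εἰμί": 1_704_651,
    "αὐτός": 1_646_014,
    "οὗτος": 1_228_627,
    "θεός": 388_933,
})

# The first two batches of stop words that the post removes.
STOP_WORD_BATCHES = [
    {"ὁ", "καί"},
    {"τίς", "δέ", "αὐτός", "εἰμί"},
]


def drop_stop_words(lemma_counts, stop_words):
    """Return a new Counter without the given stop words."""
    return Counter(
        {lemma: n for lemma, n in lemma_counts.items() if lemma not in stop_words}
    )


remaining = LEMMA_COUNTS
for batch in STOP_WORD_BATCHES:
    remaining = drop_stop_words(remaining, batch)
    largest = remaining.most_common(1)[0][0]
    print(sorted(batch), "removed; largest remaining lemma:", largest)
```

Each pass only exposes the next layer of function words, which is why the filtering has to be repeated several times before content words become visible.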

So what happens if we whittle away at the stop words? Let’s do this slowly. We’ll start by getting rid of ὁ and καί.

Woah. Where did all that come from? You can see something now: θεός, λόγος, and if you really squint, ἄνθρωπος. But that’s still making life too difficult, because there are more stop words to dismiss. I’ve highlighted the next batch in red: τίς, δέ, αὐτός, εἰμί, who?, but, he/himself, be. Of these, τίς “who?” is inflated through ambiguity with τις “someone”; because the lemmatisation is not disambiguated by context, a few word counts are more sizeable than they should be.

With those four out of the way, we have:

An improvement; you can see ἄνθρωπος now, and maybe even πατήρ “father” next to θεός “god”. But we still can do better. We have eight more stop words that we don’t really need to hear about: ἐγώ “I”, ὡς “as, that”, ὅς “who, that”, τις “someone”, οὐ “not”, γάρ “because”, ἐν “in”, and οὗτος “this”.

With them left out, we have:

Still better: you can make out ἔχις “viper” now, at the bottom left hand edge. Not that Greeks spent a lot of time talking about vipers; they just spent a lot of time using the verb ἔχει “has”, which happens to be ambiguous with the dative of ἔχις. It’s automated lemmatisation, this kind of thing can happen.
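The viper effect can be reproduced in a few lines. This sketch assumes that a lemmatiser without contextual disambiguation credits each token to every lemma it could belong to; that counting rule, the analysis table, and the token list are all assumptions made for illustration, not a description of the TLG lemmatiser.

```python
from collections import Counter

# Hypothetical analyses: each surface form lists every lemma it could belong to.
ANALYSES = {
    "ἔχει": ["ἔχω", "ἔχις"],  # "has", or dative singular of "viper"
    "ἔχομεν": ["ἔχω"],        # "we have"
    "ἔχιν": ["ἔχις"],         # accusative singular of "viper"
}


def count_undisambiguated(tokens):
    """Credit every candidate lemma of every token (no use of context)."""
    counts = Counter()
    for token in tokens:
        # Forms missing from the table are counted under themselves.
        for lemma in ANALYSES.get(token, [token]):
            counts[lemma] += 1
    return counts


# Invented sample: the verb form is frequent, the noun is rare.
tokens = ["ἔχει"] * 5 + ["ἔχομεν"] * 2 + ["ἔχιν"]
counts = count_undisambiguated(tokens)

print(counts["ἔχω"])   # 7
print(counts["ἔχις"])  # 6, although only one token is unambiguously the noun
```

Under that rule a rare noun inherits nearly the whole count of a frequent verb form, which is enough to put it on the word cloud.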

We have sixteen more stop words, and as you may have worked out, the easiest criterion is to bundle up all function words: prepositions, adverbs, conjunctions, interjections, pronouns. There is some ambiguity inherent in the venture (is πᾶς “every” a pronoun or an adjective?), but we can keep slicing nonetheless:

And again:

We’re not making as much of a difference now; but notice that the screen is being crowded out by verbs: λέγω “say” (and “pick”, as a synonym that used to be the same verb, just like “count” and “recount” in English); γίγνομαι “become”, ἔχω “have”. These are verbs, and are properly considered content words. But I already got rid of εἰμί “to be” (which as a copula is not a content word); and I’m happy to also throw out “have”, “become” (close to a copula itself), and verbs for “say”. (There is a lot of “he said she said” in the TLG, because there is a lot of narrative.)

If we get rid of those verbs?

And tidying up by getting rid of the next hundred and fifty function words, which are a distraction as you squint for content:

You could argue there’s still some guff there: ποιέω “do” doesn’t tell you much more than ἔχω “have”, and πολύς “much” doesn’t really deserve its disproportionate size. But we have enough cleaned up that we can now say something about what the texts talk about. It’s certainly a sight better than this:

So what do the TLG texts talk about? You may well be starting to come up with ideas if you can read Greek. But before you do, remember that there are a whole lot of Christian texts in the TLG, and they quantitatively crowd the ancient texts out. The texts of John Chrysostom alone in the TLG are almost as sizeable as all surviving Ancient literature between Homer and Aristotle.

So yes, the TLG as a whole talks about God and logos a fair bit. But we’d expect that of John Chrysostom; it doesn’t mean it’s what Plato or Homer talk about.

What’d be useful is to split up the corpus, say BC and AD, and see how they differ. Sounds like the next blog post to me…

Btw, I’ve been stamping out stop words, but stop words are of interest if you’re looking at grammar; and Nikos Sarantakos did ask me to pony up the word counts that I was tossing out. So, for the TLG and the lemmatiser as of last night, these are the twenty five most frequent lemmata of Greek, with their textual frequency:

πᾶς	534,845	every
ἕ	547,255	he
ἀλλά	548,203	but
διά	561,813	for
ἐπί	566,238	on
πρός	566,476	towards
κατά	643,767	by
εἰς	694,035	to
τῷ	732,938	therefore (ambiguous with “to the”)
μέν	762,890	on the one hand
ἐγώ	767,104	I
ὡς	771,416	as, that
ὅς	801,401	who, that
λέγω	811,330	say
τις	834,155	someone
οὐ	926,059	not
γάρ	951,810	because
ἐν	1,128,716	in
οὗτος	1,228,627	this
αὐτός	1,646,014	he, himself
εἰμί	1,704,651	be
δέ	2,265,028	but
τίς	2,624,172	who?
καί	5,765,491	and
ὁ	14,335,717	the

Of the lemmata we have not thrown out, θεός “god” is the 39th most frequent, with 388,933 instances.

DGE Vol VII

By: Nick Nicholas | Post date: 2010-01-18 | Comments: 5 Comments
Posted in categories: Ancient Greek, Linguistics
Tags: Ancient Greek, lexicography

Volume VII of the Diccionario Griego-Español (ἐκπελλεύω–ἔξαυος), intended to be the most comprehensive dictionary of Ancient and Early Middle Greek, was published in 2009, and is available for purchase. (I’ve just ordered it.)

I found the new volume by googling; you’d be none the wiser about that from the DGE’s own web page, which hasn’t been updated since the second edition of Volume I. I’ve already commented elsewhere on the melancholy slow pace of the DGE (begun in 1980), and on how redoing Vol I was a really really inefficient use of time. But the appearance of VII is to be welcomed, the more so since VI appeared in 2003. I’ve just harvested from DGE five lemmata missing in Lampe for Gregory of Nyssa; the DGE continues to fill in gaps the others don’t. (Gregory of Nazianzus wasn’t as lucky though.)
