Category: Ancient Greek
Lerna VIIa: Classical and Late vocabulary
Here, I’ll try making some sense of how the vocabularies of Greek have shifted between the corpora. This is where we got to.
TLG + PHI #7 (viii–XVI, +tech +christ +inscr/pap): 214,381 lemmata; 172,646 excluding proper names
TLG (viii–XVI, +tech +christ -inscr/pap): 201,823 lemmata; 162,009 excluding proper names
LSJ Corpus (viii–VI, +tech -christ +inscr/pap): 159,636 lemmata; 124,215 excluding proper names
Mostly Pagan (viii–IV, […]
Lerna VId: A correction of lemma counts
The last post had its share of egg on my face, showing systematic overcounts of word forms in the corpora. This post is another healthy serving of omelette, correcting the lemma counts given in Lerna VIa. The overall story is: There are fewer distinct word forms in the PHI #7 corpus than I thought. There are […]
Lerna VIc: A correction of word form counts
This post fixes counts given in Lerna Va and Lerna Vb, with corrected counts from the PHI #7 disc—and a couple of weeks’ work on the archaic dialects and proper names of the PHI #7 corpus. I’ve also fixed several errors in how I was counting forms as unique. The end result is that the […]
Lerna VIb: A derailing of lemma counts
You may have noticed an extended radio silence for the last couple of weeks in the series counting lemmata. The people at the Magnificent Nikos Sarantakos’ blog, where the good fight against Lerna is fought, know why: I found some problems in the way I was counting lemmata in the inscriptions and papyrus corpus (PHI […]
Lerna VIa: For Zeus’ Sake, How Many Words?
[Counts in this post have been corrected in Lerna VId] At long last, after nine posts of teasing, will I finally give the punters a count of lemmata of Greek? Why yes. Yes I will. And then for a change, I will also set to work inflating it, to extrapolate from the current corpus and […]
Lerna Vb: Forms of Good Pedigree
[Counts in this post have been corrected in Lerna VIc] In the last post, we did some pruning of the word form count of our corpora, and came up with some numbers. We also noted that, once you prune away the 137 forms of ἀνήρ, you’re still left with 42 forms of ἀνήρ. (Did I […]
Lerna Va: Word Form Counts, pruning
[Counts in this post have been corrected in Lerna VIc] So surely, after all the disclaimers in previous posts, I will now tell you how many words there are in Greek? Oh no. Not at all. Not even close. Before I alight at the burning question of how many lemmata of Greek (and when), I’m […]
Lerna IV: Corpora
So having spent four posts on why we should not count words of Greek, I will count words of Greek. The counts are only meaningful relative to a corpus, so here I detail what’s in the corpus I’ll be using, PHI #7 + TLG—and how I will end up treating it as four concentric corpora. […]
Lerna IIId: Why we do not count lemmata
Now, the whole point of any word counting venture, such as Lerna attempts and gets galumphingly wrong, is not the corpus size, which is contingent and always less than infinity; nor is it the number of word forms, which tells you about morphological happenstance but not about vocabularies. When people talk about words, they mean […]
Lerna IIIc: Why the Greek scales are rigged
Even if you allow for the fact that Greek is flexional and has lots of inflections, a literary corpus of Greek is going to have a lot more morphological variety than most other literary languages. That doesn’t tell you something about the superiority of the Greek language. But it does tell you a bit about […]