How many words does the Greek language have?

Post date: 2017-01-05
Posted in categories: Ancient Greek, Linguistics, Mediaeval Greek

I wrote an extensive set of blog posts in 2009 under Ἡλληνιστεύκοντος (read them backwards), trying to deal with this question using a fixed(ish) corpus that I was responsible for lemmatising: the TLG. The posts have a whole lot to say about the distinction between word tokens (individual instances of words), wordforms, and lemmata (dictionary words).
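To make the three counts concrete, here is a toy sketch in Python. The six-word sample and the `LEMMA_OF` lookup table are made up for illustration; a real lemmatiser for a corpus like the TLG has to deal with morphology and ambiguity, and a flat dictionary like this one is only a stand-in.

```python
# Toy sample: each whitespace-separated item is one word token.
text = "λόγος λόγου λόγον λέγω λέγει λόγος"

# Hypothetical lookup table from wordform to lemma (dictionary word).
LEMMA_OF = {
    "λόγος": "λόγος",
    "λόγου": "λόγος",
    "λόγον": "λόγος",
    "λέγω": "λέγω",
    "λέγει": "λέγω",
}

tokens = text.split()                            # every instance counts
wordforms = set(tokens)                          # distinct inflected forms
lemmata = {LEMMA_OF[form] for form in wordforms} # distinct dictionary words

counts = (len(tokens), len(wordforms), len(lemmata))
print(counts)  # (6, 5, 2)
```

Six tokens, five wordforms, two lemmata: the answer to "how many words" depends entirely on which of the three you are counting.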

It starts with several posts about how pointless this question is, which no one seems to pay attention to.

The count of lemmata for the corpus in the TLG (ancient and mediaeval literature) plus PHI (inscriptions) was 214,000 in 2009. By the time I was terminated from the TLG in 2016, I had gotten the number of recognised lemmata up to 240,000.

For the strictly classical corpus, up to the 4th century BC, it was 66,000.

If we add Modern Greek and Modern Greek dialect, it’ll be more. I’ve seen a guess by Christophoros Charalambakis, director of the Historical Dictionary of Modern Greek (dialect dictionary) at the Academy of Athens, of 600,000. I think that’s implausible. Given Zipf’s law, I think 350,000 to 400,000 for all periods of Greek is plausible.
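The Zipf intuition can be illustrated with a toy model. This is not the calculation behind the estimate above, just a sketch under loud assumptions: a closed vocabulary of 50,000 lemmata whose frequencies fall off as 1/rank, with tokens drawn independently. The function name `expected_distinct` and all the numbers are hypothetical. The point is only that doubling the amount of text adds far fewer than double the lemmata.

```python
def expected_distinct(n_tokens, vocab_size=50_000, exponent=1.0):
    """Expected number of distinct lemmata seen after n_tokens independent
    draws from a Zipfian vocabulary (assumed model, not real Greek data)."""
    weights = [1.0 / rank ** exponent for rank in range(1, vocab_size + 1)]
    total = sum(weights)
    # A lemma with probability p is seen at least once with chance 1 - (1 - p)^n.
    return sum(1.0 - (1.0 - w / total) ** n_tokens for w in weights)


sizes = [100_000, 200_000, 400_000]
seen = [expected_distinct(n) for n in sizes]

# Growth in distinct lemmata each time the corpus doubles.
growth = [later / earlier for earlier, later in zip(seen, seen[1:])]

for n, s in zip(sizes, seen):
    print(f"{n:>7} tokens -> about {s:,.0f} distinct lemmata")
```

Each doubling of the toy corpus multiplies the lemma count by a factor between 1 and 2, which is why piling more text onto an already large corpus yields diminishing numbers of new dictionary words.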

The OED has something like 600,000 for English.
