By what process(es) do complex inflection systems form in natural languages? What influences how they form?

There are languages with clean, atomic, nuggety units of meaning as separate words: isolating languages like Chinese and (mostly) English.

There are languages with suffixes as well as words, where those suffixes are still, for the most part, clean, atomic, easy to detect, and easy to take apart: agglutinative languages like Turkish.

And then you have horrid messy languages, where the inflections are laborious to learn and show only faint traces of pattern, and where a single inflectional suffix often ends up conveying two or three grammatical categories at once. Fusional languages. Like most of the old Indo-European languages, and most of the new Eastern Indo-European languages.

There’s a hypothetical cycle (or rather spiral) of Isolating > Agglutinative > Fusional … > Isolating.

Assuming that fusional languages came from something, that is, from a different type that they developed out of, that type would have to be agglutinative: inflections going from clean and discrete, to messy and mooshed together. What perverse, counterintuitive force would make that happen?

Well, language change is often a messy compromise between two contrary forces. In theory it has to be: we know that languages vary and do not all end up at the same endpoint. If there are forces pushing language in one direction, there clearly have to be forces pushing it in the opposite direction, or else every language would converge at the endpoint of that first direction.

There is a force pushing language to be clearer: more communicative, easier to learn, more iconic, clearer in structure, more logical. That force would keep language agglutinative.

The force that usually ends up pushing in the opposite direction is the force pushing language to be easier: in particular, easier to utter. It’s phonetics.

So the old Germanic i-plurals make sense: one fōt ‘foot’, many fōt-i; one mūs ‘mouse’, many mūs-i. All very clean.

Until people start making those plurals easier to pronounce.

  • fōti > föti > föt > fēt > feet
  • mūsi > müsi > müs > mīs > mice

One foot, two feet makes no sense; neither does one mouse, two mice. But they used to make sense. And the changes can all be explained as regular sound changes that make the words easier to pronounce. (That plus the Great Vowel Shift.)
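If it helps to see how mechanical this is, the chains above can be sketched as a short list of ordered rewrite rules, each applied to the output of the one before. This is only a toy, not a real reconstruction: the function names (`i_umlaut`, `drop_final_i`, `unround`, `derive`) are made up for illustration, the forms are the simplified ones used in this answer, and the last step (the vowel shift and the modern spelling) is left out.

```python
import re
import unicodedata

# Toy vowel correspondences, using the simplified notation of the chains above.
UMLAUT = {"ō": "ö", "ū": "ü"}   # back rounded vowel -> front rounded vowel
UNROUND = {"ö": "ē", "ü": "ī"}  # front rounded vowel -> front unrounded vowel


def i_umlaut(word):
    # Front ō/ū when an i follows after one or more consonants.
    return re.sub(r"[ōū](?=[^aeiouōūöü]+i)", lambda m: UMLAUT[m.group()], word)


def drop_final_i(word):
    # Lose the word-final i that triggered the umlaut.
    return re.sub(r"i$", "", word)


def unround(word):
    # Unround the front rounded vowels.
    return re.sub(r"[öü]", lambda m: UNROUND[m.group()], word)


# Order matters: the umlaut has to apply before its trigger is dropped.
STAGES = [i_umlaut, drop_final_i, unround]


def derive(word):
    """Return the word followed by its form after each successive change."""
    forms = [unicodedata.normalize("NFC", word)]
    for change in STAGES:
        forms.append(change(forms[-1]))
    return forms


if __name__ == "__main__":
    for start in ("fōti", "mūsi", "fōt"):
        print(" > ".join(derive(start)))
```

Every rule here is perfectly regular and blind to meaning. The plural marker is destroyed only because the rule that copies its frontness onto the stem vowel happens to run before the rule that deletes it; the singular, which never had the -i, passes through all three stages untouched.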

It’s the same with those complex inflections of classical languages. Those complicated verbal flexions of Ancient Greek do kind of suggest patterns; in fact, if you look at the fine print of classical grammars, you will see a section where the verb endings are taken apart letter by letter to make sense of them, in a way that tells you they used to be agglutinative. (That plus Indo-European e/o ablaut.)

But to get from that proto-Greek agglutinative pristine niceness, to the mess of Classical Greek, you go through a bunch of sound changes, many of them to do with smashing vowels together into new vowels. Dropping s between vowels is only the most irritating of those sound changes. (So irritating that Modern Greek ended up undoing it: Proto-Greek *lyesai > *lyeai > Classical Greek lyēi ‘thou art unbound’, and notice eai > ēi there; Modern Greek linese < *ly-n-esai ‘you’re untied’.)

