Is it acceptable to use “with” without an object? For example: “I’m coming with.” I hear this lately in Southern California. Is this correct?
It’s a regionally restricted colloquialism, and outside of those regions it sounds odd.
I’m surprised to hear it’s showing up in SoCal and Hawaii. I was aware of it in New York English, under Yiddish influence, and South Australian English, under German influence.
EDIT: looks like I got my Germanic-influenced American dialects mixed up: not New York, but Upper Midwest, and not Yiddish, but likelier Swedish.
And here’s the PhD dissertation on the subject: “A cross-dialectal, multi-field, variation” by John M. Spartz
History: Which cultures or societies went from being literate to illiterate? As in a script becoming extinct or some other reason.
This is a mythological rather than factual answer, but:
The Hmong people were illiterate, but they lived at the crossroads of a bunch of literate cultures—the Chinese, the Thai, the Vietnamese, the Laotians. The Hmong noticed. And they figured that they must not always have been the downtrodden illiterates that they were: surely they too used to have writing.
The legend was that, when they were literate, they were migrating across a river, and as they did, their horses ate their books.
Any resemblance to “the dog ate my homework” is fortuitous.
This meant that the Hmong invested much messianic expectation in the restoration of literacy. Which explains the enthusiasm with which the Hmong embraced the Romanized Popular Alphabet, promulgated by missionaries such as William Smalley. It also explains the curious history of the https://en.wikipedia.org/wiki/Pa… script—and the messianic cult around its martyred originator, Shong Lue Yang, the “Mother of Writing”.
Smalley cowrote the biography of Yang, Mother of Writing, with Yang’s chief disciple. It is touchingly respectful.
In linguistics, are there views other than the primacy of speech over writing?
The default thinking in linguistics is indeed that spoken language has primacy over written, and Brian has outlined the arguments for it.
But coming from another culture with the burden of diglossia and veneration for old forms of the language, I get where OP is coming from. Written language is never anterior to spoken, and babies listen before they read. But clearly there are cases where primarily written prestige registers do influence what people say. The cases in English are marginal, like pronouncing t in often.
The instances in highly diglossic or culturally conservative languages, like Greek, are not marginal. Modern Greek phonology is a mess because of a mass of spelling pronunciations that violate vernacular phonotactics.
Let me give an example. “cheapness” was pronounced [ewtʰɛːnia] in Ancient Greek. If you read it out with modern pronunciation, you get [efθinia]. Now [fθ] is hard to pronounce, and probably no one ever did pronounce it: [pʰtʰ] > *[fθ] regularly goes to [ft] in the vernacular, and it’s likely that [ewtʰ] just went straight to [ft]. So the modern word for “cheapness” is [ftiɲa].
But because of diglossia, and the primacy in prestige of the Ancient written word, Modern Greek got a bunch of Ancient words that were not adjusted for modern phonotactics, and were read out letter for letter. So “responsibility”, which was [ewtʰɛːnɛː], is pronounced as [efθini], not *[ftini]. And it gets worse: “fragile” [ewtʰrawstos] ends up as [efθrafstos]. (Ioannis Psycharis, who was the first populariser of the term “diglossia”, wanted to regularise Ancient loans like that, as a good neogrammarian. Everyone else thought that regularised Ancient loans were ridiculous, and that the resulting unpronounceable phonotactics of [efθrafstos] was not.)
(Of course in reality, no one pronounces it [efθrafstos] in rapid speech; the -fst- at least gets simplified to -st-. When a Cretan dialect retelling of Game Of Thrones went viral, the narrator called Jorah Mormont, whose actor has a Scottish accent, “the Australian tax dodger”, [afstralos]. Because all Commonwealth accents sound alike. YouTube commenters corrected him: a true Cretan would say he was an [astralos].)
Pronunciations like [efθrafstos] came into being because the pedants prioritised the written word. You can say they were fools. You can say language doesn’t work that way. (God knows Psycharis did.) But Modern Greek phonotactics is now the way it is, because of them.
Ditto the development of Chinese, or for that matter Icelandic.
Which language is closest to Greek?
Following up on Joachim Pense’s answer:
Modern Hellenic languages
If we include modern Hellenic languages, a (purely subjective) ranking of the “outlier” dialects by closeness to Standard Modern Greek, from closest to most distant, is:
- Salento Griko
- Calabria Griko
- Mariupolitan
- Pontic
- Silliot (spoken in Sille, near Konya)
- Cappadocian
- Tsakonian
The dividing point for mutual intelligibility is probably Pontic, definitely by Silliot. Of course, it’s not routine practice to consider anything but Tsakonian a distinct language.
Historical Hellenic languages
If we instead include ancient Hellenic languages, the closest is traditionally considered to be Ancient Macedonian, based on the Hesychian glosses, which show radical sound changes. The recent epigraphic finds, on the other hand, look like routine Doric.
Outside Hellenic languages
If we exclude Hellenic languages and if we trust lexicostatistics, the closest language is Armenian. (See: Graeco-Armenian)
How did names like Anatoly and Arcady become names in Russia?
Partial answer: from St Anatolius: Anatolius of Laodicea and Anatolius of Constantinople. Saints’ names are the default source of given names in Orthodoxy.
The question then becomes, why this saint’s cult was so much stronger in Russia than in Greece—I’ve never heard of a Greek called Anatolios, and the Prosopographisches Lexikon der Palaiologenzeit has only 5 known people of that name in Byzantine texts from the 13th through 15th centuries.
Historical Linguistics: In simple terms, what are the laryngeal consonants h₁, h₂, h₃? What do they have to do with the word “name” in various languages? What do they have to do with Proto-Indo-European?
This is self-indulgent of me, but this is how I presented the laryngeal theory to my poor Historical Linguistics students in 2002.
Saussure (1879): let’s look at Ablaut in proto–Indo-European:
- e:o:Ø
  Greek patéra, eupátora, patrós
  [father.ACC, of.good.father, father.GEN]
- eR:oR:R̩ where R is a resonant (j, w, r, l, m, n):
  - R=w:
    Greek eleusomai, eiléːloutʰa, éːlutʰon
    [I.will.come, I.have.come, I.came]
  - R=l:
    Greek stélloː, stólos, éstalmai
    [I.send, sending, I.have.been.sent]
    [a: epenthesis and reflex of *ə]
- ē:ō:ə
  Greek títʰeːmi, tʰoːmós, Latin facio
  [I.put, heap, I.make]
- ā:ō:ə
  Greek pʰaːmí, pʰoːnéː, pʰásis
  [I.say, voice, utterance]
- ō:ō:ə
  Greek dídoːmi, dôːron, Latin datus
  [I.give, gift, given]
Now hang on:
e o Ø
eR oR R̩ (j > i, w > u, l > l̩ > (a)l)
ē ō ə
ā ō ə
ō ō ə
Yuck. eR:oR:R̩ is just e:o:Ø put in front of R.
Why can’t ē:ō:ə be as nice? Why can’t it be… eə:oə:ə > ē:ō:ə ?
That way, we still have e:o:Ø + something, and a simple monophthongisation to wrap it up.
If we’re lucky, we might be able to account for ā:ō:ə and ō:ō:ə the same way.
Furthermore, Proto-IE phonotactics is typically (s)C(R)e(R)C:
- bʰer- ‘carry’, k̂ei- ‘lie’, kers- ‘run’, pel- ‘thrust’,
spen- ‘stretch’, sreu- ‘flow’, lewk- ‘light, bright’, melg- ‘milk’,
plek- ‘plait’, dʰwer- ‘door’
But some roots are CV or VC—often with the same suspicious long vowels:
- aĝ- ‘lead’, aug- ‘increase’, dō- ‘give’, ed- ‘eat’,
gʰrē- ‘grow, green’, gnō- ‘know’, od- ‘smell’, stā- ‘stand’
So in Ablaut, ē, ā, ō are kind of behaving like eə, oə (former diphthongs.)
In phonotactics, ē, ā, ō are kind of behaving like eC or Ce (where C can include j, w—so eC, Ce can still be a diphthong)
Saussure’s solution: there is something there: a ‘sonant coefficient’, *X
With X, we can say:
e o Ø
eR oR R̩
eX oX X > ē ō ə
aX oX X > ā ō ə
oX oX X > ō ō ə
(cf. eR:oR:R̩, e.g. ew:ow:w̩ [= u])
We can also say:
- dō- ‘give’ < **doX = CVC
- gʰrē- ‘grow, green’ < **gʰreX = CRVC
- stā- ‘stand’ < **staX = sCVC
Only one more catch: where do a and o come from in the left column in the first place? (eX; aX;oX). Why can’t we just make do with e:o:Ø for all rows?
Hey! Let’s posit two sonant coefficients:
eA oA A > aA oA A > ā ō ə
eO̬ oO̬ O̬ > oO̬ oO̬ O̬ > ō ō ə
(to which others later added E:
eE oE E > eE oE E > ē ō ə)
A is ‘something’ to which e assimilates, giving a
O̬ is ‘something’ to which e assimilates, giving o
E is ‘something’ that leaves e alone
When A, E, O̬ drop out, compensatory lengthening or something for a, e, o; schwa for Ø.
If we have assimilation going both directions, we can now say:
aĝ- ‘lead’ < **Aeĝ = CeC
ed- ‘eat’ < **Eed = CeC
od- ‘smell’ < **O̬ed = CeC
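The colouring-then-loss steps above can be sketched mechanically. This is a minimal Python illustration, not a serious model of the reconstruction: the function name `reflex` is made up here, plain `O` stands in for O̬, and pre-forms such as `steA` or `dO` are spelled out under the E/A/O analysis purely as assumptions for the demo.

```python
# Assumed notation: E, A, O are the three coefficients (O stands in for O̬).
# Each one maps to the vowel quality it imposes on a neighbouring e.
COLOUR = {"E": "e", "A": "a", "O": "o"}
LONG = {"e": "ē", "a": "ā", "o": "ō"}
VOWELS = set("eao")


def reflex(preform: str) -> str:
    """Apply colouring of e, then loss of the coefficient, to a pre-form."""
    segs = list(preform)

    # Step 1: e assimilates to an adjacent coefficient (on either side).
    # If e has a coefficient on both sides, the left one wins (arbitrary choice).
    for i, seg in enumerate(segs):
        if seg != "e":
            continue
        for j in (i - 1, i + 1):
            if 0 <= j < len(segs) and segs[j] in COLOUR:
                segs[i] = COLOUR[segs[j]]
                break

    # Step 2: the coefficient drops out.
    out = []
    for i, seg in enumerate(segs):
        if seg not in COLOUR:
            out.append(seg)
            continue
        after_vowel = i > 0 and segs[i - 1] in VOWELS
        before_vowel = i + 1 < len(segs) and segs[i + 1] in VOWELS
        if after_vowel:
            out[-1] = LONG[out[-1]]   # compensatory lengthening
        elif not before_vowel:
            out.append("ə")           # zero grade: schwa
        # otherwise (before a vowel) it simply disappears
    return "".join(out)


for form in ["eA", "oA", "A", "eO", "eE", "Aeg", "Eed", "Oed", "steA", "doO", "dO"]:
    print(form, ">", reflex(form))
```

The point of the sketch is only that one vowel series (e:o:Ø) plus three colouring segments is enough to generate every row of the table.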
What’s this buying us?
- Consistent phonotactics: all stems are CVC (plus or minus R and #s).
- Makes sense of all kinds of Ablaut: it all boils down to e:o:Ø + something.
- Ablaut like ā:ō:ə now makes sense: it’s actually originally eA:oA:A.
Saussure tells the world…
… and no one cares. Who’d buy this bunch of abstract algebra? Who cares what the phonotactics of proto-Indo-European are? What’s wrong with just saying ā ō ə? What’s the evidence that these phonemes ever even existed?
Saussure dies 1913. Around that time Hittite is deciphered: turns out to be Indo-European, though distantly related to other IE languages.
‘water’: Greek hýdoːr, Gothic watō, Old Church Slavonic voda, Gaelic u(i)sce, Albanian ujë, proto-IE *wedōr
… Hittite watar.
1927: Jerzy Kuryłowicz notices something:
Latin/Greek        Saussure     Hittite
mālum ‘apple’      < *meAl-     maḫl-
plānus ‘flat’      < *pleA-     palḫiiiš
ōs ‘bone’          < *Oes-      ḫastai
anti ‘against’     < *Aent-     ḫanti
argēs ‘white’      < *Aerg-     ḫarkis
esti ‘he is’       < *Ees-      es-
A and O correspond to Hittite ḫ.
(E is no more attested in Hittite than in the rest of Indo-European)
Semiticists had noticed that the sonant coefficients, assimilating e to a or o, were behaving like Arabic laryngeals
… ḫ is a laryngeal.
So E, A, O must have been laryngeals: h₁, h₂, h₃
Saussure was so right, even he would have been surprised.
Answered 2016-03-26
Has emergence of case system ever been observed?
Though it doesn’t look like Indo-European case, serial verb constructions have ended up turning into case markers. An instance is Chinese ba, which is primarily the verb “take”, but which has started to act like an accusative marker: “I take spear look” > “I ACC spear look”, I pick up the spear to look at it > I look at the spear.
Where did the word Nemesis originate?
Nemesis, “Greek goddess of vengeance, personification of divine wrath,” from Greek nemesis “just indignation, righteous anger,” literally “distribution” (of what is due), related to nemein “distribute, allot, apportion one’s due”.
The source goes on to note that the word is cognate to German nehmen “take”.
Conceptually, Nemesis is the same notion as one’s “lot” (allotment)—which also underlies the Greek name for the Fates, Moirai (literally: “shares”).
What is the degree of intelligibility between Standard Modern Greek and Cretan Greek?
I’ve done the Swadesh list lexicostatistics: 89 of 100 core words, which is comparable to Russian and Ukrainian. (I get the same figure for Cypriot.) Mutually intelligible, but just. Much more now that the dialect is dying out.
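As a rough sketch of the arithmetic behind a figure like that: lexicostatistics compares two varieties meaning by meaning over a fixed list, and reports the share of meanings whose words are judged cognate. The function name and the labels below are illustrative assumptions; the labels are synthetic placeholders arranged to reproduce an 89-of-100 count, not the actual Cretan or Standard Greek word lists.

```python
from typing import Hashable, Sequence


def shared_cognate_percentage(
    classes_a: Sequence[Hashable], classes_b: Sequence[Hashable]
) -> float:
    """Percentage of list meanings where both varieties fall in the same cognate class.

    Each sequence holds one cognate-class label per meaning, in the same order.
    """
    if len(classes_a) != len(classes_b):
        raise ValueError("both lists must cover the same meanings")
    if not classes_a:
        raise ValueError("lists must not be empty")
    shared = sum(1 for a, b in zip(classes_a, classes_b) if a == b)
    return 100.0 * shared / len(classes_a)


# Synthetic placeholder labels (NOT real data): 100 meanings, 89 judged cognate.
standard = [f"set{i}" for i in range(100)]
dialect = standard[:89] + [f"other{i}" for i in range(89, 100)]

print(shared_cognate_percentage(standard, dialect))
```

The hard part in practice is the cognate judgments themselves, which this sketch simply takes as given.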
I was exposed to the dialect 30 years ago when it wasn’t doing as badly; so I’m not necessarily the right person to ask; I’d be interested in others’ opinion.
Is there any NLP tool that can extract affix and stem of English words?
Yes, the Porter Stemmer is the most popular approach by far. See A survey of stemming algorithms in information retrieval for a survey, nltk.stem package for NLTK implementations, and Porter Stemming Algorithm for Porter’s own description of it. There are tweaks of it around, but no one has gone for anything different; and English being the way it is, there’s no real interest in the more powerful lemmatisers, which would do actual dictionary work.
As a linguist, I (and I’m sure many another linguist) am aghast at what the Porter Stemmer doesn’t do. For example, stupider goes to stupid, but bigger does not go to big: Porter does not touch bisyllabic words, because there’s too much risk of error. Similarly, Porter has no idea of, or interest in, irregular forms.
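To make the stupider/bigger contrast concrete: in Porter’s published algorithm, the rule that strips -er only fires when what is left has a “measure” m greater than 1, where m counts vowel-then-consonant sequences in the stem. Below is a minimal Python sketch of that single rule, not the full stemmer; the function names are made up for illustration.

```python
VOWELS = "aeiou"


def is_consonant(word: str, i: int) -> bool:
    """Porter's definition: y counts as a vowel only when it follows a consonant."""
    ch = word[i]
    if ch in VOWELS:
        return False
    if ch == "y":
        return i == 0 or not is_consonant(word, i - 1)
    return True


def measure(stem: str) -> int:
    """Porter's m: the number of VC sequences in the form [C](VC)^m[V]."""
    pattern = "".join(
        "c" if is_consonant(stem, i) else "v" for i in range(len(stem))
    )
    # Every vowel-to-consonant transition closes one VC sequence.
    return sum(1 for a, b in zip(pattern, pattern[1:]) if a == "v" and b == "c")


def strip_er(word: str) -> str:
    """Only the '-er' rule: delete 'er' when the remaining stem has m > 1."""
    if word.endswith("er"):
        stem = word[:-2]
        if measure(stem) > 1:
            return stem
    return word


for w in ["stupider", "bigger", "faster"]:
    print(w, "->", strip_er(w), "(m of remainder:", measure(w[:-2]), ")")
```

Here the remainder stupid has m = 2, so the suffix goes, while bigg and fast have m = 1, so the words are left alone.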
It is a decent compromise between doing too much and doing too little (and doing too much is a real problem). What people always forget is that it has to be customised with an exceptions list, to deal with the vocabulary you’re likely to encounter. That applies in particular to its use in Lucene/SOLR.
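One way an exceptions list can sit in front of a stemmer is sketched below in Python. Everything here is hypothetical: `with_exceptions` is an invented helper, the exception entries are examples, and `naive_plural_stripper` is a deliberately crude stand-in for whatever real stemmer you would plug in (it is not Porter, and this is not how Lucene or SOLR configure overrides).

```python
from typing import Callable, Dict


def with_exceptions(
    base_stem: Callable[[str], str], exceptions: Dict[str, str]
) -> Callable[[str], str]:
    """Wrap a stemmer so that listed words bypass it entirely."""

    def stem(word: str) -> str:
        key = word.lower()
        if key in exceptions:
            # The exceptions list wins: irregular forms, protected words, jargon.
            return exceptions[key]
        return base_stem(key)

    return stem


def naive_plural_stripper(word: str) -> str:
    """Placeholder stemmer for the demo only: drops a final 's'."""
    return word[:-1] if word.endswith("s") else word


# Hypothetical exceptions: two irregular forms and one word to protect.
stem = with_exceptions(
    naive_plural_stripper,
    {"bigger": "big", "went": "go", "news": "news"},
)

for w in ["cats", "bigger", "went", "news"]:
    print(w, "->", stem(w))
```

Keeping the list as plain data, separate from the stemmer, means it can be grown for a particular collection without touching the algorithm.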