Did ASCII and other character sets change the way people think about characters or letters?
Nice question! I believe that they have, though this is kind of speculative.
ASCII and charsets have cemented the notion of a fixed repertoire of characters available to a language or a context. Before them, specialist printers did have a little wiggle room in making up characters for specialist purposes: various iterations of sarcasm marks, one-off diacritics or phonetic symbols, and whatever other adhoccery there has been. It cost money, but it was done; so the set of characters was open-ended. There’s less room to do that now, as there is a clearer division in digital media between images and text.
The script this is likely to have the most real impact on is Chinese, which has a (very limited) ability to make up ad hoc new characters.
ASCII and Latin-1 have had contradictory effects on how people thought of letters with diacritics, both of which were unhelpful. Unicode theoretically has solved this; in practice, the damage has been done through legacy.
ASCII (and typewriters before it) often put diacritics out of users’ reach, and they often ended up dropped. So the notion started circulating that diacritics did not matter. Conversely, Latin-1 and its siblings encoded letters with diacritics as precomposed characters; this circulated the notion that diacritics are not separate from their letters (e.g. e-acute is a single unit), which is an approach some languages take, but not all. In theory, Unicode can decompose combinations like e-acute into their constituents (a base letter plus a combining accent); but users in data entry are usually not exposed to that, especially for Western European (Latin-1) combinations.
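To make the precomposed/decomposed distinction concrete, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `strip_diacritics` is my own, made up for illustration; it mimics the ASCII-era habit of dropping diacritics and is not a recommendation.

```python
import unicodedata

precomposed = "\u00e9"  # e-acute as one code point, the way Latin-1 encodes it
decomposed = unicodedata.normalize("NFD", precomposed)  # base letter + combining accent

print(len(precomposed), len(decomposed))            # 1 2
print([unicodedata.name(c) for c in decomposed])
# ['LATIN SMALL LETTER E', 'COMBINING ACUTE ACCENT']

# The two spellings are canonically equivalent: NFC recomposes them.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True


def strip_diacritics(text):
    """Hypothetical helper: decompose, then drop every combining mark."""
    nfd = unicodedata.normalize("NFD", text)
    return "".join(c for c in nfd if not unicodedata.combining(c))


print(strip_diacritics("résumé"))  # resume
```

Both spellings render identically on screen, which is why users rarely find out which one they have typed.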
Unicode, even more than ASCII, has because of its completeness promoted the notion of character above that of letter in the way people think of text. People often used to leave out numerals, punctuation, dingbats etc. when they thought of what constitutes text (see What is the last letter in the Coptic alphabet?). Being exposed to a character matrix like Unicode makes people much more aware of non-letters. An unholy side-effect of this has been the proliferation of emoji.
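As a small illustration of how Unicode puts letters and non-letters on the same footing, every character carries a general category. The sample characters below are my own picks, and this is only a sketch using Python's standard `unicodedata` module.

```python
import unicodedata

# A letter, a numeral, a punctuation mark, a dingbat and an emoji:
# all are "characters", each with its own general category.
samples = ["a", "7", ",", "\u2702", "\U0001F600"]

categories = {ch: unicodedata.category(ch) for ch in samples}

for ch, cat in categories.items():
    print(f"U+{ord(ch):04X} {cat} {unicodedata.name(ch)}")
# U+0061 Ll LATIN SMALL LETTER A
# U+0037 Nd DIGIT SEVEN
# U+002C Po COMMA
# U+2702 So BLACK SCISSORS
# U+1F600 So GRINNING FACE
```

Only the first of these is a letter (category starting with "L"); the dingbat and the emoji share the same "other symbol" category.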
With much more limited impact, Unicode has prioritised the notion of character above that of the glyph: it allows that there are contextual variants of characters, but it has promoted the platonic ideal of the character over the glyph. We rarely notice this in most contexts, especially because the really commonplace contextual variants are encoded as characters anyway (medial and final letters in Greek, Arabic and Hebrew); ligatures are the most common instance of it, and software now realises them silently when text is displayed. People are now actually less aware of ligatures than before, precisely because they are now substantially automated; so people don’t need to focus on glyphs as much as they needed to before 2000.
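Two quick checks of the character/glyph split, again as a hedged sketch with Python's standard `unicodedata` module. The Greek final sigma is a contextual variant that got its own code point. The "fi" ligature also has one (U+FB01), but only as a compatibility character, which NFKC normalisation folds back into plain "f" + "i". The variable names are mine.

```python
import unicodedata

# Contextual variant encoded as a separate character: Greek final sigma.
sigma_names = [unicodedata.name(ch) for ch in "σς"]
print(sigma_names)
# ['GREEK SMALL LETTER SIGMA', 'GREEK SMALL LETTER FINAL SIGMA']

# A ligature glyph that exists as a code point for compatibility only.
ligature = "\ufb01"
print(unicodedata.name(ligature))  # LATIN SMALL LIGATURE FI

# Compatibility normalisation treats it as the two underlying characters.
folded = unicodedata.normalize("NFKC", ligature)
print(folded, len(folded))  # fi 2
```

In ordinary text the ligature never appears as U+FB01 at all: the text stores "f" and "i", and the font substitutes the joined glyph when drawing it.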