29 September 2014

Forgotten Derivatives and Their Sexual Implications

What kind of noun is čët? What is its relationship to our hypothetical verb root? One cannot avoid asking such questions when proposing an etymology. A word is more than a root; it has a derivational history. If you add an affix to a word, you may alter the lexical category and the meaning of the base. We already know a good deal about morphological processes in the Indo-European languages, which means that we can tell plausible relationships between possibly related words from unlikely ones.

Let R be a root morpheme. In Proto-Indo-European (and in many of the languages descended from it), a root consists of a consonantal skeleton with a slot where a vowel can be inserted. For example, the verb root *{w_rǵ} ‘make, work’ is normally quoted in the form *werǵ-, called its e-grade, symbolised as R(e). Here, the slot is occupied by the vowel *e. The same root also forms an o-grade, R(o), realised as *worǵ-, and a zero grade, R(z), in which the vowel slot remains empty. In that case, the liquid *r, sandwiched between two other consonants, has to play the role of a syllable nucleus, so that the root is realised phonetically as *wr̥ǵ- (in the traditional Indo-Europeanist notation, a tiny subscript ring marks a syllabic consonant).
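For readers who like to see such things spelled out, the slot-filling idea can be put as a minimal Python sketch. The underscore notation for the vowel slot follows the text; the function name `ablaut_grade` and the grade labels "e", "o", "z" are illustrative assumptions, and the sketch deliberately does not model the syllabification of the liquid in the zero grade (it yields plain wrǵ, not wr̥ǵ).

```python
# Toy model of ablaut grades: a root is a consonantal skeleton with one
# vowel slot, written "_" as in the text. All names here are hypothetical.
GRADE_VOWELS = {"e": "e", "o": "o", "z": ""}  # z = zero grade (empty slot)


def ablaut_grade(skeleton: str, grade: str) -> str:
    """Fill the single vowel slot of a root skeleton for the given grade."""
    if skeleton.count("_") != 1:
        raise ValueError("a root skeleton must contain exactly one slot '_'")
    if grade not in GRADE_VOWELS:
        raise ValueError(f"unknown grade: {grade!r}")
    return skeleton.replace("_", GRADE_VOWELS[grade])


if __name__ == "__main__":
    for grade in ("e", "o", "z"):
        # Prints R(e), R(o) and R(z) of the root 'make, work'.
        print(f"R({grade}) = *{ablaut_grade('w_rǵ', grade)}-")
```

The zero grade is simply the skeleton with the slot left empty; deciding which consonant then becomes syllabic is a separate phonological step that this sketch leaves out.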

One of the largest and most productive classes of PIE nominals (nouns/adjectives) was that of the so-called thematic nouns (also known as o-stems). Their stem ended in the vowel *-o-, to which inflectional endings were attached. In the simplest case, the vowel was added directly to the root; in more complex cases it was part of a suffix (such as *-to-, *-no-, *-tero-, *-tlo-, etc.). Somewhat surprisingly, “simple thematic” nouns of the shape R(e)-o- were pretty rare in the protolanguage. The neuter action noun *wérǵ-o-m ‘work, activity’ is well supported by the agreement between Germanic *werka- (Old English weorc, German Werk) and Greek érgon; we also have Iranian (Avestan) varəza-, with the same stem (and meaning) but with masculine inflections. Very few such nouns, however, are truly old. More typically, the suffix *-o- was added to R(o), as in *wóiḱ-o- ‘house, dwelling’ (root *weiḱ- ‘enter, occupy’) and sometimes to R(z), as in *jug-ó- ‘yoke’ (root *jeug-, already mentioned in earlier posts).

Marc Greenberg (2001) doesn’t define the morphological status of his reconstruction *kʷet- (‘two’ > ‘pair, partner’). In some places in the article he treats it as if it were a root noun (with no suffixes), but the simplest form we actually find in Slavic is represented by Russ. čët (cf. dialectal Polish cot), which appears to reflect a thematic masculine noun *kʷet-o-s ‘even number’. How could it have originated? If *kʷet- was once a verb root (with the approximate meaning of ‘arrange in pairs, pair up’), *kʷet-o- makes sense as a kind of action noun that has acquired a resultative interpretation: by pairing objects together, you end up with an even number of them. (By the way, the verb root is not entirely conjectural: we can see it in Russian četáť ‘form pairs’.) The problem with *kʷet-o- is that it represents a rare type of stem, at least in terms of PIE morphology. Is it legitimate to posit it just like that?

On the other hand, *kʷet-o- needn’t go all the way back to PIE. The deverbal formation R(e)-o- has enjoyed increased productivity in Slavic. We even have doublets like R(o)-o- and R(e)-o-, where the o-grade variant is more conservative (and has more external cognates), while the e-grade seems to be a younger innovation (with a more restricted distribution). Thus, the root *tekʷ- ‘run, flow’ has produced Slavic *tekъ (as if from *tekʷ-o-s) ‘waterflow, leak, source’, which coexists with *tokъ (< *tokʷ-o-s) ‘stream, current, flux; (figuratively) course, sequence of events’. The former is an innovation directly connected with the Slavic verb *tekti ‘leak, flow’ (3sg. *tečetь < *tékʷ-e-ti), whereas the latter is a relict form which has drifted away from its etymological base, also semantically. Therefore, if *četъ is a relatively recent derivative of a Proto-Slavic verb, it wouldn’t be surprising if it had an o-grade cousin (possibly with a more “evolved” meaning).

As a matter of fact, Greenberg mentions *kotъ ‘offspring (of animals), litter’ and *kotiti (sę) ‘have young’ as possible members of the same word-family. A connection with the homophonous noun *kotъ ‘domestic cat’ (a European Wanderwort which spread with the introduction of cats) is folk-etymological: the verb may be used of cats, but also of mice, sheep, goats, roe deer, and a variety of other animals. It is used even in those Slavic languages that have a different word for ‘cat’ (e.g. Serbo-Croatian mačka). The verb *kotiti could be an “iterative/causative” built to the root *kʷet-. The structure of such secondary verbs is R(o)-éje/o- (the final vowel of the stem alternates depending on which conjugational ending is added). For example, the Slavic verb *gъnati (3sg. *ženetь) ‘drive on, drive away, rush’ has a corresponding o-grade iterative, *goniti (3sg. *gonitь) ‘chase, run after’. These forms ultimately reflect PIE *gʷʰén-/*gʷʰn- ‘slay, kill with blows’ (a root verb, somewhat  restructured in Slavic) and its PIE iterative *gʷʰon-éje/o-. The verb *tekti (< *tékʷ-e/o-), mentioned above, forms a pair with the causative *točiti ‘cause to flow, (cause to) roll’  (< *tokʷ-éje/o-). Note also such English pairs as lie vs. lay, or sit vs. set, where the first member is a primary verb and the second is its causative (e.g. ‘lay’ = ‘cause to lie’).
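The two derivational templates discussed so far, the simple thematic noun R(grade)-o- and the iterative/causative R(o)-éje/o-, are regular enough to be written down as a small Python sketch. Everything named here (`fill_slot`, `thematic_noun`, `causative`) is a hypothetical helper introduced for illustration; the sketch ignores accent placement and later sound changes, so it shows only the pre-forms, not the attested Slavic words.

```python
# Toy derivation of stems from a root skeleton with one vowel slot ("_").
# Accent and sound change are deliberately not modelled.
GRADE_VOWELS = {"e": "e", "o": "o", "z": ""}


def fill_slot(skeleton: str, grade: str) -> str:
    """Return the root in the requested ablaut grade (e, o or zero)."""
    return skeleton.replace("_", GRADE_VOWELS[grade], 1)


def thematic_noun(skeleton: str, grade: str) -> str:
    """Simple thematic stem: root in the given grade plus the suffix -o-."""
    return f"*{fill_slot(skeleton, grade)}-o-"


def causative(skeleton: str) -> str:
    """Iterative/causative stem: o-grade root plus the suffix -éje/o-."""
    return f"*{fill_slot(skeleton, 'o')}-éje/o-"


if __name__ == "__main__":
    # Doublet from the text: e-grade innovation vs. o-grade relict.
    print(thematic_noun("t_kʷ", "e"), "~", thematic_noun("t_kʷ", "o"))
    # Secondary verbs built to three roots mentioned in the text.
    for root in ("gʷʰ_n", "t_kʷ", "kʷ_t"):
        print(f"*{fill_slot(root, 'e')}-  ->  {causative(root)}")
```

Seen this way, the hypothetical *kʷot-éje/o- is nothing exotic: it is what the causative template yields when applied to *kʷet-, just as it yields *tokʷ-éje/o- from *tekʷ-.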

The consequences of forming a pair.
[source; © gerald reiner]
The stem *kʷot-éje/o-, originally with middle-voice inflections (whose function was taken over by the reflexive/reciprocal pronoun *sę in Slavic), would mean ‘form a couple (together)’, hence ‘mate, have sex’, and eventually ‘reproduce, have young’. If so, *kotъ ‘litter’ is not a senior synonym of *četъ (with a hard-to-explain change of meaning), but more likely a separate verbal noun back-formed from *kotiti sę (the consequence of mating), on the analogy of formally similar denominal verbs: *agniti sę ‘yean’, *teliti sę ‘calve’, *žerbiti sę ‘foal’.

The feminine *četa can hardly be a collective (at any rate in the meaning ‘pair’). Not only because it refers to just two things, but also because collectives in *-ah₂ to o-stem masculines are an archaic formation in Indo-European (as opposed to neuter collectives, co-opted as ordinary plurals of neuter nouns and adjectives), and *četъ is unlikely to be sufficiently ancient. But Indo-European *-(a)h₂ was not only a collective suffix and a marker of femininity; it was also employed to coin (formally feminine) abstracts, including action nouns. Quite a few deverbal masculines in Slavic (and more generally in Balto-Slavic) have feminine synonyms like *čarъ ~ *čara ‘sorcery, enchantment’ or *-tokъ ~ *-toka ‘flow, course’, *-sěkъ ~  *-sěka ‘cutting’ (in compounds). Note the familiar morphological formations represented by Greek tómos ‘slice’ (result of cutting) versus tomḗ ‘cut’ (an instance of cutting) – a nice parallel to *četъ (resultative) vs. *četa (an individual instance of pairing).

In the first post of this series I suggested that the stem *kʷet-w(o)r- was originally a deverbal neuter of a familiar type. Before I develop this idea, let me briefly suggest one other possible trace of the root *kʷet-: the second member of the Latin compound triquetrus ‘triangular’. The next post will be about it.

[back to the table of contents]

26 September 2014

Twos and Troops: Sifting the Evidence

Jakobson’s remark about a possible connection between Russian čët and četýre is discussed in Blažek (1999: 212-213) and especially in Greenberg (2001). Both authors mention earlier, more sketchy treatments of the problem, and they both add more Slavic material to the Russian words originally listed by Jakobson (which were čët, čëtka ‘even number’, četá ‘pair, union’, and čeť ‘quarter’). Blažek also notes an interesting potential cognate in Ossetian, an Indo-European language spoken in the north-central Caucasus (Ossetian is the only living descendant of the Northeast Iranian languages once spoken by the Scytho-Sarmatian inhabitants of the Eurasian steppe belt). The word in question is cæd ‘pair of oxen yoked together’, as if from Proto-Iranian *čatā (the Digor dialect of Ossetian has preserved a more conservative disyllabic form of the word, cædæ).

Blažek does not follow up Jakobson’s suggestion (presumably because he favours a different etymology of ‘four’, proposed by Schmid 1989; see pp. 213, 215, 331 in Blažek’s book). Greenberg, however, regards it as convincing and develops it further. Like Blažek, he considers the predominantly South Slavic *četa ‘troop, military unit’ (hence Serbo-Croatian Četnici ‘Chetniks’) to be part of the word-family of čët, and tries to explain the accentual difference between the end-stressed word četá (< *četa̍) in Russian and the root-stressed South Slavic forms – Bulgarian čéta, Serbian/Croatian čȅta, Slovene čẹ́ta (< *čèta) – in order to defend their common origin.

According to Greenberg, the word ‘four’ is derived from the root *kʷet- meaning ‘two’ extended with a multiplicative suffix, so that *kʷet-wor- means ‘(two) groups of two, twice two’. Greenberg also speculates that Proto-Indo-European *kʷotero- ‘which (of two)?’ (Greek póteros, English whether) contains the same root. This is hardly a good idea, since there is no compelling reason to question the straightforward standard analysis of *kʷo-tero- as the interrogative pronoun *kʷo- plus *-tero-, the IE suffix of binary contrast. The semantic gap between ‘two’ and ‘military unit’ is bridged by Greenberg as follows: Slavic *četa originated as the collective (in *-ah₂) of a word meaning ‘two, pair’, and ‘multitude of pairs’ evolved into ‘troop, group, band (of soldiers)’.

Arranged in pairs
[source]
There are serious problems with this derivation. First, (East/West) Slavic *četъ means ‘even number’, not ‘two’ or ‘pair’, while, on the contrary, the supposedly collective četá can mean ‘pair’ in Russian (beside some related meanings: ne četá, accompanied by a dative, means ‘not on a par with, superior to…’). What appears to be its exact cognate in Ossetian means ‘pair of oxen’, not, say, ‘herd of cattle’. Furthermore, while it’s true that the semantics of Russian četá covers not only ‘pair’ but also ‘troop’ (the latter attested already in Old Russian), we are probably dealing with a lexical merger between a native East Slavic word and a borrowing from Church Slavic (Czech četa ‘platoon’ is likewise a South Slavic loan, as are, ultimately, a number of similar “wandering words” in various neighbouring languages – Romanian, Hungarian, Albanian, and even Turkish). The non-attestation of intermediate meanings like ‘double column (of soldiers)’ makes it hard to justify the derivation of ‘troop’ from ‘pair’. Since the semantic difference is combined with a formal difference (conflicting accentuation), the etymology simply falls apart. It seems reasonable to conclude that the contrast between *četa̍ and *čèta is old and distinguishes two words of different origin (notwithstanding their merger in Russian). [See this comment, however.]

Jakobson’s final hypothetical relative of ‘four’, čeť ‘fourth part (of land), quarter’ (Old Russian četь ~ četъka), is in all likelihood a popular truncation of četverť (~ četvertka) < Proto-Slavic *četvьrtь ‘quarter’ < *kʷetwr̥-ti-, a noun corresponding to the widespread ordinal *kʷetwr̥-to- ‘fourth’. It is of course related to ‘four’, but in a rather trivial manner.

Etymological dictionaries often attempt to connect četa (in either sense) with the Slavic verb *čьtǫ (inf. *čisti) ‘count, reckon, read’, derived from PIE *kʷeit- ‘notice, recognise’. This verb has produced numerous derivatives in Slavic (e.g. *čislo ‘number’); some of them may be accidentally similar to members of the čët group both in form and in meaning, e.g. Old Czech čet ‘count, quantity’ (Modern Czech počet, with a prefix). Note, however, the gen.sg. čtu ~ čta. The disappearing root vowel reflects Proto-Slavic *ь (a reduced vowel continuing earlier short *i in the weak form of the root, *kʷit-). Despite their deceptive similarity, Russian četá (or čët) and Czech čet have different etymologies.

If we remove all the false or dubious cognates, we are left with just the initial material: *četъ ‘even number’, *četьnъ ‘even (of numbers)’ and *četa ‘pair’ – a word-family securely attested in East and West Slavic. We can safely add the Ossetian word (isolated in Iranian, as far as I know, but a perfect match for *četa, semantically and formally). There’s no evidence that the original meaning of the morpheme *čet- was ‘two’; nevertheless, it seems to have had something to do with arranging things in couples. Typologically, the Slavic “odd/even” terminology is parallel to what we have seen in Greek and Sanskrit, even if different lexical roots are involved. If so, one could expect *čet- to be semantically close to the familiar Indo-European roots *h₂ar- ‘fit together’ and *jeug- ‘yoke, connect’. I shall therefore tentatively assume that *čet- continues a verb root like *kʷet-, with the approximate meaning of ‘combine into pairs’. Let’s see if we can work from here – next time.


References


Václav Blažek. 1999. Numerals: Comparative–etymological analyses of numeral systems and their implications. Brno: Masarykova Univerzita v Brně.

Marc L. Greenberg. 2001. “Is Slavic četa an Indo-European archaism?”. International Journal of Slavic Linguistics and Poetics 43: 35-39.

21 September 2014

‘Four’: A Map

I didn’t plan it this way, but since the discussion of the etymology of ‘four’ has unfolded into a small saga in several acts, I have to organise it for convenience. Here is a map of the route:

  1. [Word of the Month: Proto-Indo-European ‘Four’]
  2. [Even and Odd]
  3. [The Name of the Game: Jakobson Reads Vasmer]
  4. [Twos and Troops: Sifting the Evidence]
  5. [Forgotten Derivatives and Their Sexual Implications] NEW!


(and more to follow)

The Name of the Game: Jakobson Reads Vasmer

With the vast and reliable etymological material put into circulation by Vasmer, a number of new questions naturally arises. I should like to dwell on some particulars.
Roman Jakobson (1955) *) 

The Slavs played at “even and odd” too. In Polish the game used to be called cetno licho (or cetno i licho). The noun licho is still used as a mild euphemism for ‘devil’. Czego chcesz, do licha? means “What the heck do you want?” Polish also has the adjective lichy ‘poor, inferior, in bad shape’. Historically, licho is a neuter form of lichy, substantivised centuries ago, when the adjective had a wider range of meaning, including ‘mean, evil’; licho was therefore ‘something wicked’. The phrase cetno i licho survives on the fringes of literary Polish (people are at best vaguely aware that it refers to some old game of chance), but cetno no longer occurs on its own, and has no obvious relatives in the modern Polish lexicon.

The man who read Vasmer's dictionary
[source]
A few hundred years ago (most examples come from 16th-century texts) cetno and licho could mean, respectively, ‘even number’ and ‘odd number’. Though often contrasted with each other, they were not yet harnessed together into a fixed phrase. Cetnem (instr.sg.) or w cetnie (loc.sg.) meant ‘(occurring) in even numbers’; likewise lichem and w lichu ‘in odd numbers’. This usage has been completely forgotten.

Licho and lichy go back to Proto-Slavic *lixъ ‘strange, irregular, rogue’. In the modern Slavic languages it usually has pejorative connotations (‘bad, lacking, defective, lonely’, etc.); it can also mean ‘excessive, superfluous’. The meaning of Russian lixój, however, ranges – somewhat schizophrenically – from ‘bad, sinister, hard’ to ‘daring, valiant’ (the common ancestor was ‘extraordinary’, whether in a positive or a negative sense). Like semantically similar words in other languages (Greek perittós, English odd), *lixъ developed the arithmetical meaning of ‘odd’, which survives here and there in the Slavic branch. For example, in Czech liché číslo means ‘odd number’. As for its origin, *lixъ < *leikʷ-so-, from the widespread Proto-Indo-European root *leikʷ- ‘leave, abandon’.

So much for licho. Where does cetno come from? The Russian term for “even and odd” is čët i néčet. Čët means ‘even number’ (= čëtnoe čisló); néčet is its antonym. The adjective čëtnyj ‘even’ (of a number) is closely related to Polish cetno. Russian č normally corresponds to Polish cz, but some regional varieties of Polish have merged the affricate cz /tʂ/ with c /ts/ for centuries, and the standard language has borrowed a number of dialectal pronunciations of this kind.

On the combined evidence of Polish and East Slavic forms we can reconstruct Proto-Slavic *četъ (n.) and *četьnъ (adj.). Russian also has the noun četá ‘pair, couple’, which is formally and semantically close to them. There are several other Slavic words that might or might not be related to *četъ, but it’s wiser at this stage to exclude more difficult material so as to avoid the risk of contaminating a reliable set of cognates with spurious ones.

Back in the 1950s, as successive volumes of Max Vasmer’s monumental Russisches etymologisches Wörterbuch were published in Heidelberg, the great linguist Roman Jakobson (then at Harvard University) read the entire dictionary (I mean, actually read it like a novel, page by page), jotting down comments on entries that attracted his attention. Those marginalia were published as a journal article (see the reference below) and reprinted in Jakobson’s Selected Writings (Volume II: Word and Language). With regard to čët and its relatives, Jakobson remarked that they “seem to be archaic relics of the same word family as četýre” (the Russian reflex of the Indo-European numeral ‘four’). Having devoted one sentence to the matter, he moved on to the next entry that had caught his eye, čex ‘Czech’. The idea that čët and četýre are somehow related has been picked up by several other authors, but hitherto published attempts to analyse *kʷetwor- in this light have the usual flaws of “root etymologies”: too little attention to morphological details, and too much imaginative semantics. Nevertheless, I think Jakobson’s idea is worth salvaging, so I’ll review those previous attempts and try to see if I can do any better.

*) Roman Jakobson. 1955. “While reading Vasmer’s dictionary”. Word 11: 611-617.

[link to a digitalised Russian translation of Vasmer's dictionary]

[to be continued]

[back to the table of contents]

19 September 2014

Even and Odd

A brief interlude before we dissect *kʷetwor- for good:

This game is simple, and is played with marbles. One player holds in his hand a number of these toys, and demands of another whether that number is even or odd. If the guess is right, the guesser wins one; if wrong, he loses one.
Edgar Allan Poe, The Purloined Letter

Hellenistic ladies playing with astragaloi
(The British Museum)
This game is not only simple, but also as old as the hills. The Romans played it, and so did the Greeks and their gods. It was played with whatever could be concealed in one’s hand: astragaloi (“knucklebones”), nuts, coins, or pebbles. The game, in some ways ancestral to roulette, was called pār impār ‘equal-unequal’ in Latin. The Greeks called it artiasmós, or ártia ḕ perittà ‘even or odd’, or zugà ḕ ázuga ‘pairs or non-pairs’. It was so popular among the Greeks that a special verb, artíazō, was coined to mean ‘play at even and odd’.

Note that the Greek word for ‘even’ is ártios, meaning also ‘perfect, complete, exactly fitted’; it contains the highly productive Proto-Indo-European root *h₂ar- ‘fit together’, which has yielded, among many other Classical words of international currency, Greek harmonía ‘connection, framework’ (hence, figuratively, ‘agreement, order, harmony’) and Latin articulus ‘joint’. Similarly, Greek zugón ‘yoke’ (hence ‘pair’) < PIE *jugóm is derived from the root *jeug- ‘to yoke, connect’. The same root is the source of the Sanskrit words for ‘even’ (yugmán-) and ‘odd’ (a-yúj-, literally ‘having no yoke-fellow’). On the other hand, the core meaning of Greek perittós ~ perissós was ‘excessive, superfluous, extraordinary’. It seems that the notion of parity or “evenness” was understood as exhaustive divisibility into pairs rather than into two equal halves. To check if a number of things was even, you removed pair after pair until either nothing or a surplus of one was left. Such a remainder, or “odd man out”, was a kind of imperfection, marring the regularity of the number.
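The pair-by-pair procedure described above is, in effect, a small algorithm, and it can be sketched in a few lines of Python. The function name `pair_off` and the sample objects are assumptions made purely for illustration; the point is only that parity is decided by what is left over, not by halving the set.

```python
# Parity understood as exhaustive divisibility into pairs: keep removing
# pairs until nothing, or a single "odd man out", is left.
def pair_off(things):
    """Return (pairs, leftover) after removing pair after pair."""
    remaining = list(things)  # work on a copy, leave the input untouched
    pairs = []
    while len(remaining) >= 2:
        first = remaining.pop()
        second = remaining.pop()
        pairs.append((first, second))
    return pairs, remaining  # leftover is [] (even) or one item (odd)


if __name__ == "__main__":
    for handful in (["nut"] * 6, ["pebble"] * 7):
        pairs, leftover = pair_off(handful)
        verdict = "even" if not leftover else "odd"
        print(f"{len(handful)} items: {len(pairs)} pairs, "
              f"{len(leftover)} left over, so the number is {verdict}")
```

Note that the procedure never counts the whole set or divides it in two; it only needs to recognise a pair and a remainder, which fits the ‘yoke’ and ‘surplus’ imagery of the Greek and Sanskrit terms.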

What has it got to do with the etymology of ‘four’? We shall see next time.

[back to the table of contents]

17 September 2014

Word of the Month: Proto-Indo-European ‘Four’

As promised in a comment to my previous blog post, I’m going to discuss an etymological question: the origin and structure of the numeral ‘4’ in the Indo-European languages.

The Proto-Indo-European numeral ‘four’ had several intriguing properties. It was the largest non-complex cardinal number that agreed grammatically with a noun it modified. Consequently, it was inflected for gender and case, like any ordinary adjective. It shared that property with the words for ‘one’, ‘two’ and ‘three’. For obvious semantic reasons, their declension was defective: ‘one’ was normally singular, ‘two’ was declined only in the dual number, and ‘three’ and ‘four’ only in the plural.

The fourth is for luck.
The basic forms of the numeral ‘4’ (as reconstructed in handbooks) were the animate “count plural” *kʷetwores and the inanimate (neuter) “collective plural” *kʷetwōr (from earlier *kʷetwor-h₂). There is some uncertainty about the accentuation of these forms: some reconstruct them with PIE stress on the first syllable, others on the second (the comparative evidence is not unambiguous).

Proto-Indo-European probably had no feminine gender as a formal category, but it had ways to express femininity in derivatives. Curiously, the numerals ‘three’ and ‘four’ seem to have had feminine forms, preserved only in Celtic and Indo-Iranian. They are reconstructed as *tisres ‘3’ and *kʷetesres ‘4’. The final *-es is the familiar nom.pl. ending of animate stems ending in a consonant, but the rest looks baffling. The suffix *-sr-, known also from the Anatolian languages, where it forms nouns denoting human females, probably reflects an archaic, almost completely abandoned word for ‘woman’ (*ser-), although the zero grade (absence of a vowel) in the nom.pl. is aberrant; the initial part (*ti-, *kʷete-) looks in either case like the badly mangled residue of an actual numeral stem. Given the normal rules of IE word-formation, we would expect something like *trí-sor-es and *kʷétwr̥-sor-es. The characteristic “defects” of the attested forms are nevertheless shared between Celtic and Indo-Iranian; they must therefore go back at least to their most recent common ancestor. Such distortions are not quite unexpected in compound words, which commonly lose their transparency through irregular simplification.

Let’s ask a stupid question: what is *kʷetwores/*kʷetwōr the plural of? I mean, if it’s really an adjective, perhaps it had an older “etymological” meaning before it became part of the numeral system? If we strip off the inflections, what remains is the stem *kʷetwor-/*kʷetwr- (the second vowel is lost in so-called “weak” case-forms like loc.pl. *kʷetwr̥sú). This “bare” stem also occurs as a compositional variant of ‘four’, sometimes with  the final segments reversed (*kʷetwr̥- ~ *kʷetru-).

An Indo-European stem with four consonants and two vowel slots must have been morphologically complex at some point. The most likely division into morphemes would be *kʷet-w(o)r-. The *-w(o)r- part looks familiar. A suffix of this form is found in a number of Indo-European nouns, typically inanimate abstracts derived from verb roots. We also find it e.g. in the PIE word for ‘fire’, *páh₂wr̥, which is not obviously deverbal (though a connection with *pah₂- ‘guard’ is thinkable). We also have at least one evidently archaic example of an adjective built in the same manner. Beside the inanimate noun *p(e)iH-wr̥ ‘fat’ (Greek pĩar) we find an adjective meaning ‘fat, fertile’ whose masculine form was *p(e)iH-won-; its neuter must have been originally identical with the noun, and a suffixed feminine *piH-wer-ih₂ was added to the paradigm as the IE gender system developed a three-way contrast (I use the cover symbol *H here for a laryngeal whose “index” is hard to determine). Note the consonant alternation in the suffix: it’s characteristic of an entire class of neuters, so-called r/n-stems. They show *-r in the nom./acc. singular and collective (e.g. *páh₂-wōr, the collective of the ‘fire’ word), but *-n- in the remaining cases (like the gen.sg. *ph₂-wén-s). The variant *-n- is also expected in related animate forms, with the strange exception of *-r- occurring before the femininising suffix *-ih₂, as illustrated by the preserved forms of the adjective ‘fat’. The striking agreement between Greek píōn (m.), píeira (f.) and Vedic pī́van- (m.), pī́varī (f.) shows that this unusual alternation is inherited.

To continue our Gedankenexperiment: so far we haven’t identified the underlying root *kʷet-. Still, if we tentatively assume that it was indeed a verb root, some predictions can be made: beside the hypothetical abstract noun *kʷét-wr̥, possible derivatives include an adjective of exactly the same form in the inanimate gender. Its expected animate form would be *kʷét-won- (nom.sg. *kʷétwō, nom.pl. *kʷétwones). The neuter noun/adjective would form the collective plural *kʷétwōr. Of these forms, two can be regarded as attested: *kʷétwōr is a possible reconstruction of the neuter numeral, and *kʷétwr̥ is its uninflected compositional variant. Conspicuous by their absence are any forms with *n instead of *r. Why, for example, is the animate (masculine) plural *kʷetwores rather than *kʷetwones? The most natural explanation is that this particular plural isn’t old enough to participate in the *-n/r- alternation.

Let’s imagine that *kʷétwr̥ was originally a neuter noun (without an accompanying adjective). Whatever its etymological meaning (let’s symbolise it ‘X’), the collective plural *kʷétwōr (meaning ‘a set of instances of X’) came to be employed as a cardinal number, at first uninflected (like ‘five’, ‘six’, etc.), but eventually attracted into the adjective system, presumably on the analogy of the already adjectival numerals ‘two’ and ‘three’. In the early history of Indo-European the accent was often shifted to the second syllable in such collectives; hence the by-form *kʷ(e)twṓr, in which the first vowel could be phonetically reduced (*kʷətwṓr) or lost altogether. Non-initial stress is reflected in Germanic (cf. Gothic fidwor, displaying the voicing effect of Verner’s Law), and vowel reduction accounts for Latin quattuor (with Lat. /a/ from *ə).

When *kʷétwōr ~ *kʷ(e)twṓr came to be interpreted (and declined) as a neuter plural adjective, an animate counterpart was analogically supplied by adding appropriate inflectional endings to the stem *kʷétwor- or *kʷ(e)twór-. Since its origin as an n/r-noun had been forgotten by that time, PIE-speakers had no reason to make their life more difficult by reviving an ancient alternation. The only case-forms requiring distinctly animate inflections (different from neuter ones) were the nom.pl. (*-es) and acc.pl. (*-n̥s from earlier *-m̥-s). The unsettled stress pattern (*kʷétwores ~ *kʷ(e)twóres) may well be an old feature of the numeral ‘four’.

Some details require more attention, but first I would like to address the question left unanswered above: what exactly was *kʷet-, the root supposedly underlying the derivation of the numeral ‘four’? I will try to suggest an answer in the next post (later this week, I hope), so please stay tuned.

[back to the table of contents]

11 August 2014

De-Extinction: The Mammoth Walks Again

A word has a definable function if speakers regularly select it to convey a certain meaning (or, more generally, to achieve a certain communicative effect). As long as they have a reason to do so, a word  remains useful and there is a good chance that it will stay in circulation. A word which is used frequently will be transmitted to new users more reliably, especially if its function is easy to infer from the way it is used. Low-frequency words are prone both to semantic change and to lexical replacement: new speakers may quite accidentally fail to hear them used, or encounter them only occasionally in a context which doesn’t quite clarify their meaning. Word death is mostly due to accidental transmission breaks happening too often.

If historical linguists had any say in the matter, I’m sure that time-honoured words, priceless as evidence of language history, would enjoy special protection, and every care would be taken that they should be saved for posterity (no matter if we still need them for everyday communication). Alas, linguists have no such authority. It’s common usage plus quirks of fate that ultimately decide whether a word will die or survive.

A word already dead in spoken language may occasionally come back to life. Talking of fate and its quirks – here is one well-known case.

The descendant of the Old English noun wyrd ‘fate, destiny, fortune’ was practically extinct by the sixteenth century, ousted by its Latinate synonyms. It lingered on in Scotland long enough to be used by John Bellenden (in the 1530s) in his Scots translation of a Latin version of the story of King Duncan and Macbeth (published by Hector Boece a decade earlier). The three prophesying fairies which appear in the narrative, thought to be the supernatural “Fates” who control human destiny (comparable to the Greek Μοῖραι, the Roman Parcae or the Scandinavian Nornir), are called weird sisteris (literally = ‘the Fate Sisters’) by Bellenden. He didn’t invent the phrase; it can be found in earlier Scots sources referring to the three classical Fates.

The story told by Boece and translated by Bellenden was in turn adapted by the English chronicler Raphael Holinshed and his collaborators, and thus the weird sisters found their way into The Chronicles of England, Scotland, and Ireland. The second edition of that work, published in 1587, was Shakespeare’s source for the plot of Macbeth. Some confusion must have taken place in the process. Shakespeare turned Holinshed’s “goddesses of destinie, or else some nymphs or feiries” into repulsive old hags with “choppy” fingers, skinny lips, and even beards to boot. Shakespeare and the compositors of the First Folio (1623) were apparently puzzled by the unfamiliar word weird. The original phrase underwent deformation into weyward or weyard sisters; the first word was possibly taken for an adjective similar to wayward, and pronounced as two syllables (although exactly how Shakespeare understood it and whether he actually confused it with wayward are moot questions). Later editors “restored” the spelling used by Holinshed and his Scots source (but not by Shakespeare), bringing back the form of weird, but not its original function. Like an Egyptian mummy from old horror films, weird rose from its tomb and strutted about, half-resurrected but not sure what to do in the modern world.

Nineteenth-century readers and playgoers deduced the meaning of weird from what they saw on the stage. They were shown three “Weird Sisters” portrayed as grotesquely hideous witches, bizarre and unearthly. “Ah,” thought the audience, “so that’s what they mean by ‘weird’.” Before long, weird became a popular adjective to describe anything strange or uncanny. Crucially for its further spread, it managed to colonise the colloquial register of English, in which there is a constant demand for new emotionally coloured words to replace those that have become hackneyed. A function was apparently there, waiting for a suitable word to express it. What remains of Old English wyrd is just the form, like an empty shell, co-opted for completely new grammatical and semantic uses. Those who would like to clone the mammoth should draw a lesson from it.

The life restoration of a 17th-c. word.
De-extinction can happen in various ways. The word twat, gone obsolete for about a century, was excavated by Robert Browning and mistaken for something entirely innocent (the context was again not clear enough and could suggest a nun’s headgear; see here and here at Language Log). Browning’s naive mistake was later exposed by the Wise Clerks of Oxenford, much to the delight of those who heard of it, and the seventeenth-century four-letter word came back to life, regaining even its high obscenity index. It’s probably far more frequent now (especially in British English) than it ever was in its former heyday. Please consider this cautionary example before you de-extinct the thylacine.

My personal favourites among the words that should have been saved (but were not) are old kinship terms. Proto-Indo-European had a large and complicated system of names for different kinds of family relations. Many of them were still used in Old English, but only a handful have survived till now – those referring to the closest biological relationships (mother, father, sister, brother, daughter, son, all of them with impeccable PIE pedigrees, even if sister was touched by Old Norse influence). A few have been replaced by terms borrowed from French (aunt, uncle, niece, nephew), also traceable back to PIE, but acquired second-hand. Note, by the way, that while Old English ēom, for example, referred specifically to a maternal uncle in the strictest sense (the brother of one’s mother), an uncle could be maternal or paternal already in Middle English. Furthermore, uncle may refer to the husband of one’s aunt (again maternal or paternal) – not even a blood relation. We are dealing here with a new system replacing an older one, not just a series of lexical replacements.

The boringly transparent “in-law” terms have replaced the Old English words for affinity relationships. Not a single one has survived. All that mattered in the late Middle Ages was the degree of affinity as defined by the Code of Canon Law (which prohibited sex and marriage between some people so related), and the “in-law” terminology made that explicit. Gone are such beautiful Old English relics as snoru ‘daughter-in-law’ (from PIE *snusós) and tācor ‘the brother of one’s husband’ (note that only a woman could have one) – one of the four kinds of brotherhood-in-law possible today. The latter word has relatives in Indo-Iranian, Balto-Slavic, Greek, Latin, and Armenian. The PIE stem is usually reconstructed as *dah₂iwér-, but the details of its development in some branches of the family (including PGmc. *taikuraz and its historical reflexes) are not quite clear, making it especially interesting.

Couldn’t we revive those forgotten kinship terms, just for fun? Well, I don’t think the two just mentioned would have much chance of success. Had snoru developed regularly, it would be *snore today, and I doubt if any woman would find such awkward homonymy acceptable. Tācor, in turn, would have become Modern English *toker. Unfortunately, such a form (orthographic and phonetic) is no longer up for grabs. We find it in the lyrics of “The Joker” (by the Steve Miller Band):
I’m a joker
I’m a smoker
I’m a midnight toker...
and it doesn’t mean an Anglo-Saxon brother-in-law.