
Dee ("Plan D")?

Appendix 1: Phonology & Orthography

How many phonemes does Plan B have?

Jacques Guy writes of Plan B that "it has 16 er... phonemes, because sixteen is a power of two, which makes it computationally desirable. Each phoneme has two allophones, one of which is a vowel, or a diphthong, or the same preceded by 'r', the other a consonant. I say: jolly good idea!"

Shortly before writing the above words, Jacques had written: "Be warned: I'm about to take the mickey out of Plan B." Therefore, we may, I think, rightly take "jolly good idea!" as implying that it is a very bad idea. But Jacques is, in my opinion, being a little unfair to Jeff Prothero. Nowhere, in fact, does Jeff claim that Plan B has sixteen phonemes, nor does he state explicitly or implicitly that each grapheme is intended to represent a phoneme.

It is true that Plan B has sixteen graphemes because sixteen is a power of two, namely 2⁴. But Jeff Prothero says of these sixteen graphemes: "(The particular sixteen letters chosen don't matter.) To encode an arbitrary bitstream efficiently, we use these sixteen letters as a hex encoding according to the following scheme. (The capital letters in the right two columns give the intended pronunciation of each letter when used as a vowel and when used as a consonant.)"

Here is an explicit statement that

  • the particular sixteen letters chosen do not matter; they are merely to encode sixteen hexadecimal values (0..F).
  • each hex value may be used as a vowel or used as a consonant.

The phrase "used as" does not mean the same as 'is an allophone of'! By any reasonable analysis of Classical Latin, /w/ and /u/ were different phonemes; certainly in medieval Latin /v/ and /u/ were different phonemes. We may say that up till the Renaissance, the letter V (in uncial script U) was used as a consonant and also used as a vowel. If such a statement is not taken to imply that /v/ and /u/ were variants of the same phoneme, why should we treat Jeff Prothero's statement differently?

We should note more carefully what Jeff Prothero actually wrote: "An alphabet .... the choice is not critical", "The particular sixteen letters chosen don't matter" and "the particular letters and pronunciations chosen don't matter much." From this it is surely clear that the written and spoken outward form given by Jeff Prothero is strictly secondary.

His stated aim of giving his language an alphabet and "a pronunciation scheme which makes all sequences of letters equally pronouncable" can be implemented in many different ways. The written form and phonology are as secondary to Plan B as bit encoding is secondary to English (or any other natural language).

Plan B per se has no one single orthography or phonology any more than English has only one method of being encoded as bits for a computer. To talk about the phonology of Plan B is not very meaningful. At best, one can discuss the merits or demerits of particular mappings of Plan B in an alphabet of graphemes or as sounds for human speech.


So what did Jeff Prothero do?

He provided Plan B with 16 graphemes and gave a scheme whereby each grapheme represents two sounds: one of which is a vowel, or a diphthong, or the same preceded by 'r', and the other a phonetically unrelated consonant. Whether the grapheme is to be used as a consonant or as a vowel depends upon where it occurs in a string of sounds. If it is in an odd-numbered position (1st, 3rd, 5th ...) it is a consonant; if in an even-numbered position it is vocalic. Neither the choice of the graphemes nor the allocation of sounds has anything to do with the bit patterns per se. The whole thing is clearly done on an ad hoc basis, as an examination of Jeff Prothero's article of May 1990 will show; the consonants and vowels are a subset of American English (for it is clear that he considers the vowels in 'fought' and 'prop' to be the same, which they are not, as far as I am aware, in any form of English outside of North America).

This clearly ad hoc arrangement has two undesirable effects:

  1. The language is given a bizarre orthography in which, for example, <b> may be pronounced [b] or [ɛ], <n> may be [n] or [ɹɪ] or <v> may be [v] or [ɹoʊ].
  2. Every morpheme has exactly two allomorphs, which are phonetically quite unrelated, depending upon what precedes the morpheme. For example, ck (she, her) may be pronounced either as [ʃiː] or as [eɪk]; cf:
    English:  I   like her driving my car
     Plan B:  G-l tk-s ck-l mg-n g-n cc-l
        IPA: [grɛ ti:s eɪkrɛ maɪn aɪn eɪʃrɛ]

    English:  She  likes me
     Plan B:  Ck-l tk-n  g-l
        IPA: [ʃi:l rukri: grɛ]

    (Note: hyphens are used only for clarity; the 'case endings', which as one can see have alternative pronunciations, are normally appended directly)
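
The alternation behind these examples can be sketched in a few lines of Python. This is only an illustration, not Jeff Prothero's own code: the function name `pronounce` and its `start` parameter are hypothetical, and the table holds just the three letters whose readings can be read off the examples above (c, k and l), not his full sixteen-letter scheme. I assume, as the first example sentence suggests, that positions are counted through the whole utterance rather than restarting at each word.

```python
# Partial table, limited to the letters whose two readings appear in the
# examples above; the full sixteen-letter scheme is not reproduced here.
# letter -> (consonant reading, vowel reading)
READINGS = {
    "c": ("ʃ", "eɪ"),
    "k": ("k", "iː"),
    "l": ("l", "rɛ"),
}


def pronounce(letters, start=1):
    """Return the sounds for `letters`.

    `start` is the 1-based position of the first letter within the whole
    utterance (a hypothetical parameter, introduced for illustration).
    """
    sounds = []
    for offset, letter in enumerate(letters):
        position = start + offset
        consonant, vowel = READINGS[letter]
        # Odd positions are read as consonants, even positions as vowels.
        sounds.append(consonant if position % 2 == 1 else vowel)
    return "".join(sounds)


# Sentence-initial "ckl", as in "Ck-l tk-n g-l".
print(pronounce("ckl", start=1))  # ʃiːl
# The same letters after "gl tks", i.e. starting at position 6.
print(pronounce("ckl", start=6))  # eɪkrɛ
```

The same three letters yield two outputs with no sound in common, which is precisely the second undesirable effect.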

These two effects have led some to postulate the bizarre idea that Plan B has 16 phonemes, each with exactly two phonetically disparate allophones: one consonantal, the other vocalic. Jacques Guy's critique in his "Plan C" shows how risible this is. The only thing the pair of "allophones", and the pairs of phonetically disparate morphemes have in common is that they have the same underlying bit pattern; at the level of the spoken language they are quite different. As I have argued in the first section above, Plan B per se has no phonology.

Why are there these phonetically unrelated pairs of allomorphs and, indeed, in the orthography Jeff Prothero adopted, pairs of phonetically unrelated pronunciations of each grapheme? It is, to quote the author, "By providing both a vowel and a consonant pronunciation for each letter, and using them alternately, we can pronounce arbitrary strings of letters without difficulty." But, as Jacques Guy rightly observed: "And I, poor sod, who thought a strict CV(V) language would do it!"


A strict CV language does do it!

Yes, indeed, languages with simple phonologies and only (C)V syllables, such as Hawaiian, Samoan and other Polynesian languages, manage this quite well without the undesirable features of Plan B. Those undesirable features could be removed at a stroke by giving each bit pattern a CV value, i.e. mapping each bit pattern to a syllable.

It seems strange to me that while bits in Plan B are used to determine both the length of morphemes and where word boundaries occur, no use of individual bits was made in mapping Plan B to a written and spoken form. All we have is a clearly ad hoc assignment of consonant and vowel pairs to bit quartets. In the system I shall set out below:

  • Each quartet will be mapped not only to a single grapheme but also to a single phonetic realization, namely a (C)V syllable.
  • The individual bits in each quartet will determine both the syllable's onset and its rhyme.

The latter, of course, has nothing to do per se with the design of a loglang. But I wish to show that a language of the Plan B type can still retain a bit stream at its lowest level and have a systematic and consistent mapping to a phonetic realization which takes account of individual bits in the bit stream.

A syllabary of 32 or 64 syllables might be deemed more desirable but, as Jeff Prothero wrote: "It is handy to have the alphabet size be a power of two. Eight letters would be less concise, thirty-two would be tough to map onto the standard twenty-six char character set." It is, indeed, difficult to map 32 onto the 26 letters of the modern Roman alphabet, and even more difficult, of course, to map 64 onto these letters. I shall restrict myself to just sixteen syllables to show there is no reason why Plan B could not have been mapped in a similar way.

Sixteen syllables is not many, but there is no reason why we could not have a language with eight consonants and two vowels. Let the two vowels be:

  • Front:   /e/ - realized as any front vowel from [ɪ] down to [ɛ] inclusive;
  • Back:    /o/ - realized as any back vowel from [ʊ] down to [ɔ] inclusive;

The eight consonants shall be sonorants and obstruents in four grades, thus:

            Sonorant   Obstruent
grade #0    (zero)     /k/
grade #1    /l/        /s/
grade #2    /n/        /t/
grade #3    /m/        /p/

Note:
  • The 'zero' consonant is realized as a semivocalic onset, i.e. [j] before /e/ and as [w] before /o/.
  • The obstruents are voiceless when initial, but may become voiced between vowels.
  • The phoneme /l/ may be realized as any dental or alveolar approximant, whether lateral or not, or as a dental/alveolar flap.

The four grades occur in four series such that series #0 & #1 are sonorants, and series #2 & #3 are obstruents, the even series having the vowel /o/ and the odd having the vowel /e/.

Our sixteen syllables are mapped to bit quartets thus:

  • The two most significant bits denote the grades 0 to 3 thus: 00 01 10 11.
  • The two least significant bits denote the series 0 to 3 thus: 00 01 10 11
    (It will thus be observed that:
    - the first of these two bits indicates whether the consonant is a sonorant [0] or an obstruent [1],
    - and the second indicates whether the vowel is /o/ [0] or /e/ [1]).
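
The bit rules just given are mechanical enough to be sketched in Python. This is a minimal illustration rather than a definitive implementation: the names `SONORANTS`, `OBSTRUENTS` and `quartet_to_syllable` are my own, and the 'zero' consonant is written as an empty string, leaving its realization as [w] or [j] to the speaker.

```python
# Consonants indexed by grade (0..3), taken from the consonant table above.
SONORANTS = ["", "l", "n", "m"]    # grade 0 is the 'zero' consonant
OBSTRUENTS = ["k", "s", "t", "p"]


def quartet_to_syllable(quartet):
    """Map a 4-bit value (0..15) to its phonemic (C)V syllable."""
    if not 0 <= quartet <= 0b1111:
        raise ValueError("a quartet must be in the range 0..15")
    grade = quartet >> 2         # two most significant bits
    series = quartet & 0b11      # two least significant bits
    is_obstruent = series >> 1   # first series bit: 0 sonorant, 1 obstruent
    vowel = "e" if series & 1 else "o"   # second series bit: 0 /o/, 1 /e/
    consonant = OBSTRUENTS[grade] if is_obstruent else SONORANTS[grade]
    return consonant + vowel


for q in (0b0000, 0b0011, 0b0101, 0b1110):
    print(format(q, "04b"), quartet_to_syllable(q))
```

Nothing here looks beyond the current quartet: each syllable is determined by its own four bits alone.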

Putting this all together we arrive at our complete syllabary. The table below shows the sixteen symbols we shall use, the phonemic values*, and the bit pattern.

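            series #0      series #1      series #2      series #3
grade #0    w /wo/ 0000    y /je/ 0001    g /ko/ 0010    k /ke/ 0011
grade #1    r /lo/ 0100    l /le/ 0101    z /so/ 0110    s /se/ 0111
grade #2    n /no/ 1000    ñ /ne/ 1001    d /to/ 1010    t /te/ 1011
grade #3    µ /mo/ 1100    m /me/ 1101    b /po/ 1110    p /pe/ 1111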

* The phonemic status of the semivocalic onsets [w] and [j] is left ambiguous or controversial, as in Modern Chinese.

We have had to introduce two symbols not normally included in the Roman alphabet: the ñ used in Spanish, and the symbol µ used to denote the metric prefix micro. The complete syllabary should be ordered gradewise, thus:
  w y g k r l z s n ñ d t µ m b p
which is also the order of the bit quartets.
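
Because the gradewise order is also the bit order, a symbol's position in the ordered syllabary is simply its quartet value. The following Python sketch shows how a byte string might be written out in the sixteen symbols and read back. The names `SYMBOLS`, `encode` and `decode` are hypothetical, and taking the high quartet of each byte first is my own assumption; nothing above prescribes it.

```python
# The syllabary in gradewise order; index in this string == quartet value.
SYMBOLS = "wygkrlzsnñdtµmbp"


def encode(data):
    """Write each byte as two symbols, high quartet first (an assumption)."""
    out = []
    for byte in data:
        out.append(SYMBOLS[byte >> 4])
        out.append(SYMBOLS[byte & 0x0F])
    return "".join(out)


def decode(text):
    """Return the list of quartet values (0..15) for a string of symbols."""
    return [SYMBOLS.index(symbol) for symbol in text]


print(encode(bytes([0x1F])))   # yp
print(decode("yp"))            # [1, 15]
```

Sorting words by `SYMBOLS.index` therefore gives the same result as sorting them by their underlying bit values.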


Other ways the 16 quartets could have been mapped to CV syllables

During the discussion on the Conlang list in September 2005, I came up with a different system; see "Deprecated scheme of September 2005" in the menu box at the top right of this page. Although this system mapped each quartet to a unique grapheme, it actually generated 32 possible CV syllables. It had the eight consonants given above, but allowed the use of four vowels, namely: /i/, /e/, /o/ and /u/, the vowel (except in the final syllable) being determined by the least significant bit of the current quartet and the most significant bit of the following quartet. Jörg Rhiemeier's X-1 language uses a similar system to my September 2005 proposal.

However, on reflection it seemed to me that such a system was complicated and computer-centric rather than anthropocentric. In March of 2006 I proposed a modified version of this which was intended to make things a little more "user friendly." Only the first vowel in a word was determined by the first two quartets (it will be found that all words in Plan B must contain at least two quartets); the other vowels followed a rule of vowel harmony, the quartets merely determining whether the vowel was high or low. A link to this now deprecated system is given in the menu box at the top right of this page.

However, in my view, this second system is still a bit cumbersome and "kludgey," and the system described in the section above is surely the simplest. Although the individual bits in each quartet generate a CV syllable in a methodical and coherent way, the syllabary may be used without any reference to the bit values. The table below shows some of the differences between the earlier systems and the one I give in the section above (July 2007):

September 2005:
  • The rules for determining vowels before and after obstruents are different from those determining vowels before and after sonorants.
  • The vowel in every syllable, except the final one, is determined by a combination of two consonant symbols.
  • The bit value order of symbols (which a computer would use when sorting) is different from the more 'human friendly' grade and series order.

March 2006:
  • The rules for determining vowels are the same for all consonants.
  • The vowel of the first syllable only is determined by two consonant symbols.

July 2007:
  • This is a true syllabary and each symbol represents a CV syllable. There is no computation needed for determining the vowel.
  • The bit value of symbols is identical to the grade and series ordering of symbols.