A Dictionary of the PinYin Language
by Zhang JuLi
Language change

This is a proposal for the development of han4yu3pin1yin1 (Chinese language spelling sounds), or PinYin for short. In the Chinese character writing system (HanZi), homophonic words are represented by different HanZi. This design takes that phenomenon into account. Standard Chinese has differed from one historical period to another, and the Chinese of the future will differ from what is spoken today. This design takes language change into account as well. The dictionary is intended to demonstrate that an alphabetic script facilitates language change towards greater phonological distinction.

In this dictionary, blue print marks a sound that is possible but not current in standard Chinese. I admit that 'possible' reflects my personal opinion. Other people certainly have their own opinions, too. I have no intention of defining what the future language will sound like; I only propose the possibility of more phonological diversification. A language is created and developed by all members of its speech community.

Three methods

Three methods are adopted in this design: synonymy, heterography and homography. Synonymy means adopting a new sound for a word. The new and current sounds will coexist. The current sound of the word is shared by at least one other word, but the new sound and its spelling are unique to the meaning. The new sound will eventually prevail and replace the current sound for the word. The language will consequently become more distinct and thus less homophonic.

Chinese is a tonal language. Each syllable has a tone, and the tone insulates a syllable from its neighboring syllables. Therefore, changing the sound of one syllable does not affect the other syllables in speech. This characteristic makes the method of synonymy feasible. As noted above, synonyms are presented in blue font. To demonstrate that the new sound is possible, a sound file is attached at the end of each synonymous entry.

Heterographic homophones are different words that share a sound but not a written form. Many Chinese words share a sound but are written with different HanZi. For instance, the Chinese words for 'one', 'clothes' and 'medicine' share the sound [i1] but are written with different characters. They are heterographic homophones. Heterographic spellings proposed in this design are presented in orange font.

It is assumed that each possible but not-currently-spoken syllable also has a standard spelling in PinYin, and thus can have heterographic spellings, too. For instance, [an2] is such a syllable. Its standard spelling is an2. If this sound takes another spelling, such as ann2, that spelling is presented in pink font, indicating that it is both a possible but not-currently-spoken sound and a heterographic spelling.
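
The font colors described so far can be read as a lookup on two yes/no properties of an entry. The Python sketch below only illustrates that reading; the function name font_color and the "default" label for ordinary entries are assumptions made here, not part of the dictionary.

```python
def font_color(sound_is_current: bool, spelling_is_standard: bool) -> str:
    """Return the font color this design assigns to an entry.

    The function name and the "default" label are hypothetical; only the
    blue / orange / pink assignments come from the text.
    """
    if sound_is_current and spelling_is_standard:
        return "default"   # ordinary entry, no special color (assumed label)
    if sound_is_current:
        return "orange"    # heterographic spelling of a current sound
    if spelling_is_standard:
        return "blue"      # possible but not-currently-spoken sound
    return "pink"          # both a new sound and a heterographic spelling


print(font_color(False, True))   # an2: possible sound, standard spelling
print(font_color(False, False))  # ann2: possible sound, another spelling
```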

The method of homography is to write two or more homophonic words with one graph or spelling. For instance, the words for 'face' and 'flour' are homophones in Chinese but are written with two different HanZi in the traditional set. In the simplified set, the HanZi for 'face' is used for both words; which word it represents depends on the context in which it appears. The HanZi for 'flour' was abolished because of its complex written form. This method is adopted in this design wherever possible. It is easy to spot, as there will be two or more numbered English words in the parentheses after the main entry. Please note that when two or more English words in parentheses are separated by a forward slash, they represent one HanZi that has two or more meanings.

Aid to understanding non-standard PinYin spellings

When the reader encounters a non-standard pinyin spelling in an example, moving the cursor to that spelling makes the standard pinyin spelling appear in a rectangular box. All such spellings are underlined with a short green dotted bar. In the following example, moving the cursor to 'qyu4' makes 'qu4' appear in a rectangular box. Also, a sound file is created for each synonymous entry. In the following example, the sound file can be opened by clicking 'listen' at the end.

an3 (I/me): ngan3 [ŋan]. ngan3 bu  qyu4 . listen.

Three New Initials

In this and the following few sections I discuss my proposals for changes to the language. Three initials are added to the PinYin language: [v], written as 'v'; [z], written as 'dz'; and [ŋ], written as 'ng'. These three initials existed in earlier standard Chinese. Though not defined as current in standard Chinese nowadays, they are current in the following Mandarin dialects, respectively:

Initial Dialects
[ŋ] shan1dong1, shan3xi1, hu2bei1, si4chuan1
[v] shan1xi1, shan3xi1
[z] shan1xi1, si4chuan1

These three initials are also widely current in non-Mandarin dialects. When speakers of some southern dialects speak Mandarin, they change [r] to [z]. For instance, the pronunciation of the word 'date' in standard Mandarin is [rɨ4 tsɨ]; speakers of southern dialects pronounce it as [zɨ4 tsɨ] when speaking Mandarin. Because the sound [ts] is already written as 'z' in PinYin, [z] is tentatively written as 'dz' in this dictionary. Interested readers can judge the feasibility of these sounds by listening to the sound files where they occur.
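
To make the spelling conventions concrete, the sketch below splits a numbered PinYin syllable into initial, final and tone, accepting the three added initials 'v', 'dz' and 'ng' alongside the standard ones. It is a minimal illustration in Python; the function name split_syllable, the pattern, and the choice to treat 'y' and 'w' as part of the final are assumptions made here, not part of the design.

```python
import re

# Two-letter initials are tried first so that 'ng' and 'dz' are not
# cut short to 'n' or 'd'. The initial list is an assumption of this sketch.
_SYLLABLE = re.compile(
    r"(zh|ch|sh|dz|ng|[bpmfdtnlgkhjqxrzcsv])?([a-z]+)([1-4])?"
)


def split_syllable(syllable):
    """Split e.g. 'ngan3' into (initial, final, tone).

    The tone is None when no tone digit is written (neutral or omitted).
    """
    match = _SYLLABLE.fullmatch(syllable.lower())
    if match is None:
        raise ValueError(f"not a numbered PinYin syllable: {syllable!r}")
    initial, final, tone = match.groups()
    return (initial or "", final, int(tone) if tone else None)


print(split_syllable("ngan3"))  # ('ng', 'an', 3)
print(split_syllable("an2"))    # ('', 'an', 2)
print(split_syllable("bu"))     # ('b', 'u', None)
```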

The ending consonant [m]

In current standard Mandarin there are two possible consonant codas for a syllable, [n] and [ŋ], written as 'n' and 'ng' respectively. [m] used to be a third, but it disappeared. It is revived in this design with two justifications. First, the sounds [m2] and [m4] are current in standard Mandarin as exclamations. Second, [m] is a coda in the standard language of all three historical linguistic periods, namely Early Mandarin, Late Middle Chinese and Early Middle Chinese. This coda is also current in several non-Mandarin dialects.

The ju, qu and xu

The sounds of these three PinYin spellings are [tçy], [tçhy] and [çy]. But [tçu], [tçhu] and [çu] are also possible sounds in Mandarin. These six sounds are written as follows in this dictionary:

Writing Pronunciation
ju tçu
qu tçhu
xu çu
jyu tçy
qyu tçhy
xyu çy
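
The correspondence above can be written down as a small lookup. The Python sketch below is illustrative only: the names IPA_BY_SPELLING, STANDARD_BY_SPELLING and to_standard_pinyin are hypothetical, and it assumes a syllable is written as lowercase letters followed by an optional tone digit. It returns the standard PinYin spelling for 'jyu', 'qyu' and 'xyu', and None for the proposed [u] readings, which have no counterpart in the current standard.

```python
import re

# Dictionary spelling -> pronunciation, as listed in the text.
IPA_BY_SPELLING = {
    "ju": "tçu", "qu": "tçhu", "xu": "çu",
    "jyu": "tçy", "qyu": "tçhy", "xyu": "çy",
}

# Dictionary spelling -> standard PinYin spelling of the same sound.
STANDARD_BY_SPELLING = {"jyu": "ju", "qyu": "qu", "xyu": "xu"}


def to_standard_pinyin(syllable):
    """Map a dictionary spelling to standard PinYin, keeping the tone digit."""
    match = re.fullmatch(r"([a-z]+)([1-4]?)", syllable)
    if match is None:
        raise ValueError(f"unexpected syllable: {syllable!r}")
    letters, tone = match.groups()
    if letters in STANDARD_BY_SPELLING:
        return STANDARD_BY_SPELLING[letters] + tone
    if letters in IPA_BY_SPELLING:
        return None  # proposed [u] reading: not a current standard sound
    return syllable  # spellings outside this table are left untouched


print(to_standard_pinyin("qyu4"))  # qu4
print(to_standard_pinyin("qu4"))   # None
```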

The retroflex vowel [ɑr]

Currently only one retroflex vowel, [ər], is defined in standard Mandarin. [ɑr] is not considered a vowel. I argue that the pronunciation of the word 'two' in Mandarin is not [ər4] but [ɑr4], so [ɑr] is a current vowel. Further evidence follows:

Word Unsuffixed Suffixed
liver [kan1] [kar1]
root [kən1] [kər1]
ring [xuan2] [xuar2]
soul [xuən2] [xuər2]

In standard Mandarin the suffixed pronunciations of 'liver' and 'root' are entirely different. So are those of 'ring' and 'soul'. This is evidence for the existence of a retroflexed [ar] that is not the same as [ər]. Though the vowel [ar] is currently used only for the word 'two', granting its existence will make transliterating foreign words more convenient, as in bar1 (bar), kar3 (card), kar4 (car), etc.

Heterographic spellings of [ə], [ər] and [ar]

The standard written forms for these sounds in pinyin are 'e', 'er' and 'ar'. In order to render these sounds with more than one written form, the following spellings are designated:

[ə] e ir or ur
[ər] er err irr urr
[ar] ar are  

The diphthong [iə]

A sound close to this one, [iə̌], existed in Early Middle Chinese and Late Middle Chinese. Furthermore, 'je', 'qe' and 'xe' are used in this dictionary for the purpose of synonymy. Their pronunciations are in fact [tçiə], [tçhiə] and [çiə], because native speakers of Chinese tend to pronounce the letters 'j', 'q' and 'x' as [tçi], [tçhi] and [çi]. As an experiment, I go one step further by combining other initials with [iə]. Its written form is 'ire', and a heterographic variant is 'yre'.

Omission of Tone Marks

Neutral tones are not marked, as in 'd', 'de', 'di', and the interjections and auxiliary words 'a', 'ma', 'ba', etc. In addition, tone marks are omitted in two other words, namely those for 'no' and 'one'. The word for 'no' has two sounds, [bu2] and [bu4]. When the tones are marked, it can look like two words when in fact it is just one, in phrases such as bu2 da4 bu4 xiao3 and bu4 hao3 bu2 huai4. I simply write them without the tone marks: bu da4 bu xiao3, bu hao3 bu huai4.

The word for 'one' has three tones, [i1], [i2] and [i4]. Which one is used depends on the tone of the following syllable. Native speakers of Mandarin choose the correct tone so naturally that they do not realize they are using different tones. This is a reason to leave the tone mark off. In this design, the word for 'one' is written with the capital letter 'I'.
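
The tone that a reader restores for the unmarked 'bu' and 'I' can be sketched as a function of the following syllable. The rules coded below are the commonly described pattern for these two words in standard Mandarin (second tone before a fourth-tone syllable, fourth tone before other toned syllables, and 'one' keeping its first tone when nothing follows). They are not spelled out in this text, and the function name restored_tone is hypothetical. The sketch ignores special cases such as ordinal uses of 'one' and following syllables written without a tone digit.

```python
def restored_tone(word, next_syllable=None):
    """Return the tone number for unmarked 'bu' (no) or 'I' (one).

    next_syllable is the following syllable with its tone digit, e.g. 'hao3',
    or None when the word stands alone or ends the phrase.
    """
    if word not in ("bu", "I"):
        raise ValueError(f"only 'bu' and 'I' are handled, got {word!r}")
    if next_syllable is None:
        # Citation tones: bu4 for 'no', first tone for 'one'.
        return 4 if word == "bu" else 1
    if next_syllable[-1:] not in ("1", "2", "3", "4"):
        raise ValueError("the following syllable must carry a tone digit")
    # Second tone before a fourth-tone syllable, fourth tone otherwise.
    return 2 if next_syllable.endswith("4") else 4


# bu hao3 bu huai4 is read bu4 hao3 bu2 huai4
print(restored_tone("bu", "hao3"), restored_tone("bu", "huai4"))  # 4 2
```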

Heterographic spellings

In the dictionary, IPA is used to indicate the sound if ambiguity may occur.


Standard Chinese used to be closely tied to the location of China's capital. The dialect of a new capital became the new standard, with the same writing. With the advent of scientific pinyin, or in fact of its predecessor zhuyin, standard Chinese no longer has to follow the relocation of the political capital. When pinyin is adopted as the primary writing, the impact of capital relocation on the language will diminish further. The language will still change, but the factor of capital location can be excluded.

Many people may ask me: How do you know that the language will change as you want it to? My answer is: I don't. Nobody does. People in the times of qieyun (AD 601), guangyun (AD 1008) or zhongyuan yinyun (AD 1271-1386) could not predict what the language would sound like in the next period of development, let alone what it would sound like today. On the other hand, all changes must be made by native speakers themselves. They initiate, evaluate and accept the changes for their own convenience. This process never stops. Therefore, as an ordinary native speaker of Chinese, I can only say that some sounds are possible when the language is examined from its diachronic and current synchronic perspectives. There are certainly other possibilities that I don't know of, and some of my ideas may be incorrect.

Though no HanZi is used, this design is for people who know HanZi well. When reading an example, the reader is expected not only to understand it but also to recognize all the HanZi under discussion, because this dictionary is about how to deal with homophonic phenomena as they are distinguished by the HanZi.

Can an artificial or more generalized language prevail? It depends on how closely the artificial language is related to a speaker's native language or dialect. Standard Mandarin is quite artificial to many people who are not native speakers of it, but it is not so hard for them to learn, because Chinese dialects are similar to one another in many respects. Other factors include how well the artificial language carries on the linguistic, literate and cultural traditions that the current writing carries, what conveniences and benefits it can bring to the speakers, whether it is a step in the natural development of the current language, and how well the sounds are represented by the writing.

I'm not satisfied with many of the synonymous and heterographic proposals in this design, but I have to provide a spelling for every frequently used HanZi. There are certainly numerous errors, omissions and inconsistencies. On the one hand, native speakers of Chinese share a great deal of linguistic experience. On the other, no two persons share exactly the same experience. Everyone is unique in a sense, and so has his or her own ideas about language and language change. But in general, the people who can make better proposals are those who know more about the language's past and current variations. They are welcome to try their hand at Chinese romanization.


Primary Sources
bei3 jing1 da4 xue2, zhong1 guo2 yu3 yan2 wen2 xue2 xi4, yu3 yan2 xue2 jiao4 yan2 shi4. 1989. han4 yu3 fang1 yin1 zi4 hui4. Bei3 jing1: wen2 zi4 gai3 ge2 chu1 ban3 she4.

Pulleyblank, Edwin G. 1991. Lexicon of Reconstructed Pronunciation in Early Middle Chinese, Late Middle Chinese and Early Mandarin. Vancouver: UBC Press.