Hanyu Pinyin, commonly called Pinyin, is the most widely used romanization system for Standard Mandarin. Hanyu means "the Chinese language", pin means "spell", and yin means "sound". It is also known as the Scheme of the Chinese Phonetic Alphabet.
Pinyin uses Roman letters to represent sounds in Standard Mandarin. The way these letters represent sounds in Standard Mandarin differs from other languages that use the Roman alphabet. For example, the sounds indicated in pinyin by b and g correspond more closely to the sounds indicated by p and k in some Western uses of the Latin script, e.g., French. Other letters, like j, q, x or zh, indicate sounds that do not correspond exactly to any sound in English. Some pinyin transcriptions, such as the ang ending, do not correspond to English pronunciations either.
By letting Roman characters refer to specific Chinese sounds, pinyin produces a compact and accurate romanization, which is convenient for native Chinese speakers and scholars. However, it also means that a person who has not studied Chinese or the pinyin system is likely to severely mispronounce words.
Hanyu Pinyin was approved in 1958 and adopted in 1979 by the government of the People's Republic of China. It superseded older romanization systems such as Wade-Giles (1859; modified 1892) and Chinese Postal Map Romanization, and replaced Zhuyin as the method of Chinese phonetic instruction in mainland China. The International Organization for Standardization (ISO) adopted Hanyu Pinyin in 1982 as the standard romanization for modern Chinese (ISO 7098, revised as ISO 7098:1991). It has also been accepted by the Government of Singapore, the Library of Congress, the American Library Association, and many other international institutions, and has become a useful tool for entering Chinese-language text into computers.
The primary purpose of pinyin in Chinese schools is to teach Standard Mandarin pronunciation. For those Chinese who speak Standard Mandarin at home, pinyin is used to help children associate characters with spoken words which they already know; however, for the many Chinese who do not use Standard Mandarin at home, pinyin is used to teach them the Standard Mandarin pronunciation of words when they learn them in elementary school.
Pinyin vowels are pronounced similarly to vowels in Romance languages, and most consonants are similar to English. A pitfall for English-speaking novices is, however, the unusual pronunciation of x, q, c, zh, and z (and sometimes i) and the unvoiced pronunciation of d, b, g, and j. More information on the pronunciation of all pinyin letters in terms of English approximations is given further below.
The pronunciation of Chinese is generally given in terms of initials and finals, which represent the segmental phonemic portion of the language. Initials are initial consonants, while finals are all possible combinations of medials (semivowels coming before the vowel), the nucleus vowel, and coda (final vowel or consonant).
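The initial/final division described above can be sketched in a few lines of Python. This is a minimal illustration, not a validator: the function name `split_syllable` is hypothetical, the input is assumed to be a single toneless pinyin syllable, and the code does not check that the remainder is a legal final. Following the convention that w and y are not true initials, syllables beginning with them are treated here as having no initial.

```python
from typing import Tuple

# Pinyin initials; the two-letter initials come first so that
# "zh", "ch" and "sh" are matched before "z", "c" and "s".
INITIALS = ["zh", "ch", "sh",
            "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s"]


def split_syllable(syllable: str) -> Tuple[str, str]:
    """Split a toneless pinyin syllable into (initial, final).

    Assumption: w and y are treated as part of the final,
    so "wen" comes back with an empty initial.
    """
    s = syllable.lower()
    for initial in INITIALS:
        if s.startswith(initial):
            return initial, s[len(initial):]
    return "", s


print(split_syllable("zhuang"))  # initial "zh", final "uang"
print(split_syllable("an"))      # no initial
```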
For a complete table of all pinyin syllables, see pinyin table.
In each cell below, the first line indicates IPA, the second indicates pinyin.
¹ r may phonetically be [ʐ] (a voiced retroflex fricative). The pronunciation varies among speakers; the variants are not two different phonemes.
² The letter "w" is not generally considered a true initial, and its pronunciation varies.
³ The letter "y" is not generally considered a true initial, and its pronunciation varies.
Conventional order (excluding w and y), derived from the Zhuyin system, is: b p m f, d t n l, g k h, j q x, zh ch sh r, z c s.
In each cell below, the first line indicates IPA, the second indicates pinyin for a standalone (no-initial) form, and the third indicates pinyin for a combination with an initial. Other than finals modified by an -r, which are omitted, the following is an exhaustive table of all possible finals.¹
The only syllable-final consonants in Standard Mandarin are -n and -ng, plus -r, which is attached as a grammatical suffix. A Chinese syllable ending in any other consonant is either from a non-Mandarin language (southern Chinese languages such as Cantonese, or minority languages of China) or written in a non-pinyin romanization system (where final consonants may be used to indicate tones).
¹ /ər/ (而, 二, etc.) is written as er. For other finals formed by the suffix -r, pinyin does not use special orthography; one simply appends -r to the final, without regard for any sound changes that may take place along the way. For information on sound changes related to final -r, see Standard Mandarin.
² "ü" is written as "u" after j, q, x, or y.
³ "uo" is written as "o" after b, p, m, or f.
⁴ It is pronounced differently when it follows an initial, and pinyin reflects this difference.
In addition, ê is used to represent certain interjections.
All rules given here in terms of English pronunciation are approximate, as several of these sounds do not correspond directly to sounds in English.
The following is an exhaustive list of all finals in Standard Mandarin. Those ending with a final -r are listed at the end.
To find a given final:
Pinyin differs from other romanizations in several aspects, such as the following:
Most of the above are used to avoid ambiguity when writing words of more than one syllable in pinyin. For example, uenian is written as wenyan because it is not clear which syllables make up uenian: uen-ian, uen-i-an and u-en-i-an are all possible combinations, whereas wenyan is unambiguous because we, nya, etc. do not exist in pinyin. For a summary of possible pinyin syllables (not including tones), see the pinyin table.
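The ambiguity argument can be checked mechanically by enumerating every way a string splits into known syllables. The sketch below is only illustrative: the two small syllable sets are assumptions chosen for this example, not the full pinyin inventory, and the function name `segmentations` is hypothetical.

```python
def segmentations(text, syllables):
    """Return every way to split `text` into items from `syllables`."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in syllables:
            for rest in segmentations(text[end:], syllables):
                results.append([head] + rest)
    return results


# Assumed inventory for a hypothetical spelling without w and y.
NO_GLIDE_LETTERS = {"u", "en", "uen", "i", "an", "ian"}
# Assumed inventory using the actual pinyin spellings with w and y.
WITH_GLIDE_LETTERS = {"wu", "wen", "yi", "yan", "en", "an"}

print(["-".join(p) for p in segmentations("uenian", NO_GLIDE_LETTERS)])
print(["-".join(p) for p in segmentations("wenyan", WITH_GLIDE_LETTERS)])
```

Under these assumed inventories, uenian has several readings (the three named in the text plus u-en-ian), while wenyan has exactly one, wen-yan.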
Although Chinese characters represent single syllables, Mandarin Chinese is a polysyllabic language. Spacing in Hanyu Pinyin is based on whole words, not single syllables. However, there are often ambiguities in partitioning a word. Orthographic rules were put into effect in 1988 by the National Educational Commission (国家教育委员会) and the National Language Commission (国家语言文字工作委员会).
The pinyin system also incorporates suprasegmental graphemes to represent the four tones of Mandarin. Each tone is indicated by a diacritical mark above a non-medial vowel. Many books printed in China mix fonts, rendering vowels with tone marks in a different font from the surrounding text, which tends to give such pinyin texts a typographically ungainly appearance. This style, most likely rooted in early technical limitations, has led many to believe that pinyin's rules call for this practice and also for the use of "ɑ" (with no curl over the top) rather than the standard style of the letter "a" found in most fonts. The official rules of Hanyu Pinyin, however, specify no such practice. Note that tone marks can also appear on consonants in certain vowelless exclamations.
These tone marks normally are only used in Mandarin textbooks or in foreign learning texts, but they are essential for correct pronunciation of Mandarin syllables, as exemplified by the following classic example of five characters whose pronunciations differ only in their tones:
Traditional characters: 媽 (mā) 麻 (má) 馬 (mǎ) 罵 (mà) 嗎 (ma)
Simplified characters: 妈 (mā) 麻 (má) 马 (mǎ) 骂 (mà) 吗 (ma)
The words are "mother", "hemp", "horse", "admonish" and a question particle, respectively.
Since most computer fonts do not contain the macron or caron accents, a common convention is to add a digit representing the tone to the end of each syllable. For example, "tóng" (tong with the rising tone) is written "tong2". The number used for each tone follows the order listed above, except for the "fifth tone", which, in addition to being numbered 5, is sometimes left unnumbered or numbered zero, as in ma0 (吗/嗎, an interrogative marker).
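The tone-number convention is easy to automate. The following sketch converts one tone-marked syllable to its numbered form using Unicode decomposition. It assumes the usual mapping of macron, acute, caron and grave to tones 1 to 4, numbers an unmarked syllable 5, and handles a single syllable at a time; the function name `to_numbered` is hypothetical.

```python
import unicodedata

# Combining diacritics used as pinyin tone marks, mapped to tone numbers.
TONE_MARKS = {
    "\u0304": 1,  # combining macron (first tone)
    "\u0301": 2,  # combining acute accent (second tone)
    "\u030C": 3,  # combining caron (third tone)
    "\u0300": 4,  # combining grave accent (fourth tone)
}


def to_numbered(syllable: str) -> str:
    """Convert one tone-marked pinyin syllable to tone-number notation."""
    tone = 5  # assumption: an unmarked syllable is numbered 5
    kept = []
    # NFD separates each base letter from its diacritics.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)  # keeps the diaeresis of ü intact
    # NFC recombines u + diaeresis back into a single ü.
    return unicodedata.normalize("NFC", "".join(kept)) + str(tone)


print(to_numbered("tóng"))  # tong2
```

Because only the four tone diacritics are stripped, the umlaut on ü survives the conversion.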
The rules for determining on which vowel the tone mark appears are as follows:
(y and w are not considered vowels for these rules.)
The reasoning behind these rules is that, in diphthongs and triphthongs, i, u, and ü (and their orthographic equivalents y and w when there is no initial consonant) are considered medial glides rather than part of the syllable nucleus in Chinese phonology. The rules ensure that the tone mark always appears on the nucleus of a syllable.
Another algorithm for determining the vowel on which the tone mark appears is as follows:
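A commonly cited procedure of this kind is: if the syllable contains a or e, that vowel takes the mark; otherwise, if it contains ou, the o takes the mark; otherwise the last vowel takes it. The sketch below implements that procedure as an illustration, assuming a single lowercase toneless syllable as input. The names `mark_position` and `add_tone_mark` are hypothetical.

```python
import unicodedata

# Combining diacritics for tones 1-4.
COMBINING = {1: "\u0304", 2: "\u0301", 3: "\u030C", 4: "\u0300"}
VOWELS = "aeiou\u00fc"  # a, e, i, o, u, ü (y and w are not vowels here)


def mark_position(syllable: str) -> int:
    """Return the index of the vowel that carries the tone mark."""
    for vowel in "ae":
        if vowel in syllable:
            return syllable.index(vowel)
    if "ou" in syllable:
        return syllable.index("ou")  # index of the o in "ou"
    for i in range(len(syllable) - 1, -1, -1):
        if syllable[i] in VOWELS:
            return i
    raise ValueError("no vowel in syllable: %r" % syllable)


def add_tone_mark(syllable: str, tone: int) -> str:
    """Place the diacritic for `tone` (1-4) on a toneless syllable."""
    s = unicodedata.normalize("NFC", syllable.lower())
    if tone not in COMBINING:
        return s  # neutral tone: no mark
    i = mark_position(s)
    marked = s[:i + 1] + COMBINING[tone] + s[i + 1:]
    return unicodedata.normalize("NFC", marked)


print(add_tone_mark("xiao", 3))  # xiǎo
print(add_tone_mark("liu", 2))   # liú
```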
An umlaut is placed over the letter u when it occurs after the initials l and n in order to represent the sound [y]. This is necessary in order to distinguish the front high rounded vowel in lü (e.g. 驴/驢 donkey) from the back high rounded vowel in lu (e.g. 炉/爐 oven). Tonal markers are added on top of the umlaut, as in lǘ.
However, the ü is not used in other contexts where it represents a front high rounded vowel, namely after the letters j, q, x and y. For example, the sound of the word 鱼/魚 (fish) is transcribed in pinyin simply as yú, not as yǘ. This practice contrasts with Wade-Giles, which always uses ü, and Tongyong Pinyin, which always uses yu. Whereas Wade-Giles needs the umlaut to distinguish between chü (pinyin ju) and chu (pinyin zhu), this ambiguity cannot arise in pinyin, so the more convenient form ju is used instead of jü. Genuine ambiguities arise only with nu/nü and lu/lü, which are therefore distinguished by an umlaut.
Many fonts or output methods do not support an umlaut for ü or cannot place tone marks on top of ü. Likewise, using ü in input methods is difficult because it is not present as a simple key on many keyboard layouts. For these reasons v is sometimes used instead by convention. Occasionally, uu (double u), u: (u followed by a colon) or U (capital u) is used in its place.
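Software that accepts typed pinyin often has to undo these substitutions. The sketch below maps the "v" and "u:" conventions back to ü and then applies the spelling rule that ü is written as plain u after j, q, x and y. It is an illustration only: the function name `normalize_u_umlaut` is hypothetical, the input is assumed to be a single syllable without tone diacritics, and the rarer "uu" and capital "U" conventions are deliberately not handled because they are ambiguous without more context.

```python
U_UMLAUT = "\u00fc"  # ü


def normalize_u_umlaut(syllable: str) -> str:
    """Normalize the 'v' and 'u:' substitutes for ü in one syllable."""
    s = syllable.lower().replace("u:", U_UMLAUT).replace("v", U_UMLAUT)
    # After j, q, x and y, pinyin writes the same sound as plain u.
    if s[:1] in ("j", "q", "x", "y"):
        s = s.replace(U_UMLAUT, "u")
    return s


print(normalize_u_umlaut("lv"))   # lü
print(normalize_u_umlaut("nu:"))  # nü
print(normalize_u_umlaut("jv"))   # ju
```

Since v is not otherwise used in pinyin spellings of Mandarin syllables, replacing it unconditionally is safe for well-formed input.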
Taiwan adopted Tongyong Pinyin at the national level in October 2002. Tongyong Pinyin is a modified version of Hanyu Pinyin. Its adoption has resulted in political controversy, much of it centered on issues of national identity: proponents of Chinese reunification favor the Hanyu Pinyin system used in the People's Republic of China, while proponents of Taiwanese independence favor Tongyong Pinyin.
Localities with governments controlled by the Kuomintang, most notably Taipei City, have overridden the 2002 administrative order and converted to Hanyu Pinyin (although with a slightly different capitalization convention than the Mainland). As a result, the use of romanization on signage in Taiwan is inconsistent, with many places using Tongyong Pinyin, some using Hanyu Pinyin, and still others not yet having had the resources to replace older Wade-Giles or MPS2 signage. This has produced the odd situation in Taipei City in which freeway signs, which are under the control of the national government, use one system, while surface street signs, which are under the control of the city government, use the other.
Elementary education in Taiwan continues to teach pronunciation using the zhuyin system. Although the ROC government has stated the desire to use romanization rather than zhuyin in education, the lack of agreement on which form of pinyin to use and the huge logistical challenge of teacher training have stalled these efforts.
Pinyin-like systems have been devised for other variants of Chinese. Guangdong Romanization is a set of romanizations devised by the government of Guangdong province for Cantonese, Teochew, Hakka (Moiyen dialect), and Hainanese. All of these are designed to use Latin letters in a similar way to pinyin.
In addition, in accordance with the "Regulation of Phonetic Transcription in Hanyu Pinyin Letters of Place Names in Minority Nationality Languages" (《少数民族语地名汉语拼音字母音译转写法》) promulgated in 1976, place names in non-Chinese languages such as Mongolian, Uyghur, and Tibetan are also officially transcribed using pinyin. The pinyin letters (26 Roman letters, ü, ê) are used to approximate the non-Chinese language in question as closely as possible. This results in spellings that differ from both the customary spelling of the place name and the pinyin spelling of the name in Chinese:
See also: Tibetan Pinyin
Debate continues about the suitability of pinyin as a romanization method for Chinese. The argument revolves around pinyin's unconventional use of Roman letters: the sound values it assigns to some letters differ considerably from those in most languages that use the Roman alphabet. Some sinologists praise this as flexibility, in that it allows the entire Roman alphabet to be adapted to the Chinese sound system (compared to Wade-Giles, which leaves out or underuses many letters). Others point out that pinyin letter values are so unconventional that a person unfamiliar with Chinese makes more mispronunciations than with Wade-Giles. However, since not only the PRC but by now most institutions and publications have adopted pinyin, the debate seems increasingly obsolete.
Pinyin, like all systems of romanization, has certain limitations that users should be aware of:
Computer systems long provided the most convincing argument in favor of pinyin; early computers were able to display nothing but 7-bit ASCII (essentially the 26 letters, the 10 digits, and a handful of punctuation marks). Most contemporary computer systems are now able to readily display characters from not only Chinese, but from many other writing systems as well. In addition, multiple input method editors exist that use standard keyboards to type them (pinyin being one such method). Now, PDAs, tablet PCs and digitizing tablets allow users to write characters with a stylus, which can then be stored and edited like any text. Thus, this justification is no longer as strong as it used to be.
Nonetheless, pinyin has gained wide acceptance, and supporters believe it is useful for students of Chinese as a second language. Also, the ability to easily convert electronic Chinese texts written in traditional or simplified characters into pinyin using computer programs such as the Chinese Pronunciation Tool has greatly increased the value of being able to read pinyin.
Some Internet users using the Internet Explorer browser may have difficulty displaying characters bearing the third tone mark. If the following character displays as an empty square box: ǔ, do the following: on the Internet Explorer menu at the top of the screen select "Tools," then "Internet Options," then "Accessibility." Check the box labeled "Ignore font styles specified on Web pages." Click "OK." After that, select "Tools," then "Internet Options," then "Fonts." In the menu at the left, select "Arial Unicode MS" (or "Arial," if this font is not available), then click "OK." It may also be necessary to select "View," then "Encoding," then "Unicode (UTF-8)."
Activate the "US Extended" keyboard in System Preferences and then do: