Social:Modern Chinese characters

From HandWiki
Revision as of 16:47, 5 February 2024 by TextAI (talk | contribs) (over-write)
Short description: Chinese characters used in modern languages

Modern Chinese characters (traditional Chinese: 現代漢字; simplified Chinese: 现代汉字; pinyin: xiàndài hànzì) are the Chinese characters used in modern languages, including Chinese, Japanese, Korean and Vietnamese. [1]

According to Ethnologue,[2] "Mandarin Chinese is the largest language in the world, if you count only native speakers. If you count both native and non-native speakers, English is the largest (with Mandarin being the 2nd largest)." Mandarin is written in modern Chinese characters.[3][4]

As a subtopic of Chinese characters,[5][6] the present article provides a more comprehensive and detailed introduction to modern Chinese characters, especially the characters used in Standard Mandarin Chinese.

Chinese characters are composed of components, which are in turn composed of strokes.[7] The 100 most frequently used characters cover (i.e., have an accumulated frequency of) over 40% of modern Chinese texts. The 1,000 most frequently used characters cover approximately 90% of the texts.[8] Modern Chinese characters can be studied from several angles, including orthography, phonology, and semantics, as well as collation and organization, statistical analysis, computer processing, and pedagogy.[9][10]

Background

Historical development

Since maturing as a complete writing system, Chinese characters have had an uninterrupted history of development over more than 3,000 years, with stages including oracle bone script, bronze script, large and small seal script, clerical script, and regular script, leading to the modern written forms,[11] as illustrated by the development of the character 馬 (horse):

Oracle Bronze Bigseal Seal Clerical Regular Simplified
馬-oracle.svg 馬-bronze.svg 馬-bigseal.svg 馬-seal.svg 馬-clerical.svg 馬-kaishu.svg 马-kaishu.svg

In 1980, Zhou Youguang, known as the "father of pinyin", published a paper entitled "Introduction to the Studies of Modern Chinese Characters", in which he detailed aspects of the numbers, orders, forms, sounds, meanings, and pedagogy of modern characters.[12] His paper was followed by Gao Jiaying's "A Brief Discussion on the Establishment of Modern Chinese Character Studies",[13] and other related writings on the subject.[14] At least five textbooks have been published in this area.[15][9][10][16][17]

Regional varieties

Chinese characters were originally invented for writing the Chinese language, and were later employed for other East Asian languages, developing as part of a shared orthographic tradition. For everyday or historical purposes, Simplified characters are primarily used in mainland China, Singapore, and Malaysia, while Traditional characters are used in Taiwan, Hong Kong, and Macau; kanji are used in Japan, hanja in Korea, and chữ Hán in Vietnam.[18] For example, the Traditional character 廣 (wide, broad) has the Simplified form 广 and the shinjitai kanji form 広.

Characteristics

In contrast with the Latin alphabet used to write many languages, including English, Chinese characters have many divergent properties, including:[19]

  • There are tens of thousands of different characters.
  • A character has a two-dimensional block structure.
  • A character may have dozens of strokes.
  • In most cases, a character denotes a morpheme.[20]
  • Characters are monosyllabic, normally one character per syllable.[21]
  • Texts written in Chinese characters are intelligible to readers of different dialects and different dynasties.

Sources

Modern Chinese characters include:[22]

Number and sets

Main page: Social:Chinese character sets

Due to the dynamic development of languages, there is no definite number of modern Chinese characters. However, a reasonable estimate can be made by surveying the character sets of the relevant standard lists and influential dictionaries in the countries and regions where Chinese characters are used.[23]

Mainland China

The important standards in the People's Republic of China include the List of Frequently Used Characters in Modern Chinese (现代汉语常用字表), totalling 3,500 characters,[24] and the List of Commonly Used Characters in Modern Chinese (现代汉语通用字表), with 7,000 characters, including the 3,500 characters in the previous list.[25] The current standard, however, is the Table of General Standard Chinese Characters, which was released by the State Council in June 2013 to replace the previous two lists and some other standards. It includes 8,105 characters of the Simplified Chinese writing system: 3,500 primary, 3,000 secondary, and 1,605 tertiary. In addition, there are 2,574 Traditional characters and 1,023 variants.[26] Also relevant are the character sets of Xinhua Zidian[27] and Xiandai Hanyu Cidian,[28] the most popular modern Chinese character dictionary and word dictionary, respectively. Each includes over 13,000 characters, comprising Simplified characters, Traditional characters and some variants.

Taiwan

In Taiwan, there are the Chart of Standard Forms of Common National Characters (常用國字標準字體表), with 4,808 characters, and the Chart of Standard Forms of Less-Than-Common National Characters (次常用國字標準字體表), with 6,341 characters. Both lists were released by the Ministry of Education, for a total of 11,149 characters of the Traditional Chinese writing system.

Hong Kong

In Hong Kong, there is the List of Graphemes of Commonly-Used Chinese Characters for elementary and junior secondary education, totalling 4,762 characters. This list was released by the Education Bureau, and is very influential in educational circles.

Japan

In Japan, there are the jōyō kanji (frequently-used Chinese characters, designated by the Japanese Ministry of Education, including 2,136 characters) and jinmeiyō kanji (for use in personal names, currently including 983 characters).

Korea

In Korea, there are the Basic Hanja for educational use (漢文敎育用基礎漢字, a subset of 1,800 hanja defined in 1972 by a South Korean educational standard), and the Table of Hanja for Personal Name Use (人名用追加漢字表), published by the Supreme Court of Korea in March 1991.[29] The list expanded gradually, and by 2015 there were 8,142 hanja permitted for use in Korean names.[30]

Overall estimates

Taking into account all the character sets mentioned above, the total number of modern Chinese characters in the world is over 10,000, probably around 15,000.[31][32] Such an estimate should not be considered too rough, given that there are in total over 90,000 Chinese characters (CJK Unified Ideographs) in Unicode, and more if every Chinese character that has ever appeared is included.[33]

A college graduate who is literate in written Chinese knows between three and four thousand characters. Specialists in classical literature or history, who would often encounter characters no longer in use, are estimated to have a working vocabulary of between 5,000 and 6,000 characters.[34]

Frequency

Main page: Social:Chinese character frequency

Chinese character frequencies are calculated from corpus data. A corpus is a collection of texts representative of one or more languages. The frequency of a character is the ratio of the number of its occurrences in the corpus to the total number of characters in the corpus. The formula for calculating frequency is

$$F_i = \frac{n_i}{N} \times 100\%,$$

where $n_i$ is the number of times a certain ($i$-th) Chinese character appears in the corpus, and $N$ is the total number of (occurrences of) characters in the corpus.[35]
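The frequency formula above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `character_frequencies` and the sample string are made up for the example, and skipping whitespace is an assumption, since real surveys define their own rules for punctuation and non-Chinese symbols.

```python
from collections import Counter


def character_frequencies(corpus: str) -> dict[str, float]:
    """Return the percentage frequency (n_i / N * 100) of each character."""
    # Assumption: whitespace is not counted as a character of the corpus.
    chars = [ch for ch in corpus if not ch.isspace()]
    total = len(chars)        # N: total number of character occurrences
    counts = Counter(chars)   # n_i: occurrences of each distinct character
    return {ch: 100 * n / total for ch, n in counts.items()}


# Hypothetical six-character sample, used only to exercise the function.
sample = "我的书是我的"
freqs = character_frequencies(sample)
for ch, f in sorted(freqs.items(), key=lambda item: -item[1]):
    print(ch, f"{f:.2f}%")
```

On a real corpus the same counts would typically be sorted in descending order to produce a frequency list.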

Origins

The first person to make a statistical study of the frequency of Chinese characters was Chen Heqin (陳鶴琴).[36] In the 1920s, he and his assistants spent two years manually counting the characters in a corpus of 554,478 characters, and obtained 4,261 different characters with frequency information. They then compiled the book Applied Lexis of Vernacular Chinese (語體文應用字彙).[37] The 10 most frequently used characters in their corpus are, in descending order of frequency,

的 (of), 不 (no, not), 一 (one, a(n)), 了 (aspect particle), 是 (to be), 我 (I/me), 上 (on, up), 他 (he/him), 有 (to have), 人 (person).

Survey by CUHK

In 2001, the Chinese University of Hong Kong (CUHK) published a number of frequency lists on the Web,[8] entitled "Hong Kong, Mainland China and Taiwan Chinese Frequency: a Trans-regional Diachronic Survey". The frequency data came from a large corpus with a number of sub-corpora representing the Chinese language in the three regions of Hong Kong, Mainland China and Taiwan, and in the two time periods of the 1960s and the 1980s/1990s. Each sub-corpus includes about 5,000 different characters, as shown by their frequency lists.

From the data of these frequency lists, some important and interesting features of Chinese can be discovered:

  1. 的, 一 and 是 are the three most frequently used characters across the regions and time periods of the corpora, and 的 is number one in all the frequency lists.
  2. The 10 most frequently-used characters across the three regions and two time periods are very consistent. That means a frequently-used character in one region or period is very likely to be frequently-used in another region or period.
  3. The 100 most frequently-used characters in the 80/90's cover (i.e., have an accumulated frequency of) 41.00% of the Hong Kong texts of that period, 41.34% of the Mainland texts, and 41.88% of the Taiwan texts. That is more than 4 out of every 10 characters for the three regions.
  4. The 1000 most frequently-used characters in the 80/90's cover 89.25% of the Hong Kong texts of that period, 90.26% of the Mainland texts, and 88.74% of the Taiwan texts.
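Coverage figures like those above are cumulative frequencies. The following sketch shows one way they could be computed; the function names and the ten-character toy corpus are hypothetical, and the CUHK survey's own counting rules are not reproduced here.

```python
from collections import Counter


def coverage_of_top(corpus: str, n: int) -> float:
    """Percentage of the corpus covered by its n most frequent characters."""
    counts = Counter(ch for ch in corpus if not ch.isspace())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(count for _, count in counts.most_common(n))
    return 100 * covered / total


def characters_needed(corpus: str, target_percent: float) -> int:
    """Smallest number of top-ranked characters reaching the target coverage."""
    counts = Counter(ch for ch in corpus if not ch.isspace())
    total = sum(counts.values())
    running = 0
    for rank, (_, count) in enumerate(counts.most_common(), start=1):
        running += count
        if 100 * running / total >= target_percent:
            return rank
    return len(counts)


# Toy corpus (not real survey data): 的 x4, 一 x3, 是 x2, 人 x1.
toy = "的的的的一一一是是人"
print(coverage_of_top(toy, 1))     # coverage of the single most frequent character
print(coverage_of_top(toy, 2))     # coverage of the two most frequent characters
print(characters_needed(toy, 90))  # how many characters reach 90% coverage
```

The second function answers the inverse question: how many of the most frequent characters are needed to reach a given coverage rate.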

Survey by the Chinese government

Large-scale surveys by the Ministry of Education and the State Language Commission of the PRC over the years have shown that the use of Chinese characters and words follows a strong distribution pattern. The number of characters used in modern Chinese has been stable at about 10,000 for a number of years. The numbers of most frequently used characters needed for coverage rates of 80%, 90%, and 99% are about 590, 960, and 2,400 respectively.[38]

Chinese character frequency is essential to quantitative research on Chinese characters, and has been applied to language teaching, dictionary compilation, character list compilation, Chinese character information processing, and other areas.[39]

Orders

Main page: Social:Chinese character orders

The orders or sorting methods of Chinese dictionaries are traditionally divided into three categories: form-based orders, sound-based orders and meaning-based orders.[40] In modern Chinese, people also use frequency orders.

Form-based

In this category of orders, characters and words are sorted according to various features of the forms or shapes of Chinese characters. Compared with sound-based orders, form-based orders have the advantages of (a) allowing lookup of characters and words without knowing their pronunciations, and (b) effective collation of large character sets without support from other sorting methods. There are two subcategories of form-based orders: stroke-based orders and component-based orders, the latter of which includes radical-based orders, among others.[41][42]

Sound-based

There are two major sound representation systems for Standard Chinese: pinyin and bopomofo. Accordingly, there is a pinyin alphabetical order and a bopomofo-based order.[43]

Meaning-based

Meaning-based orders, also called semantics-based orders, arrange characters and words in a hierarchical structure of semantic categories.[44]

Frequency-based

In this category of orders, Chinese characters are sorted by their frequency of use, normally in descending order, so that the most frequently used character is at the top of the list. A frequency list is created from a text corpus. In corpus linguistics, the frequency of a character is the ratio, expressed as a percentage, of its number of occurrences in the corpus to the total number of characters in the corpus.[35]

Orders of words

Chinese words consist of one or more characters. Single-character words can be sorted by a character order, and multi-character words can be sorted character by character in a similar way.[45]
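Character-by-character sorting of words can be sketched as follows. The example assumes a stroke-count order purely for illustration; the four-entry `STROKE_COUNT` table and the function names are hypothetical, and a real sorter would need a complete standard table plus tie-breaking rules (such as stroke form and stroke order) that are not modelled here.

```python
# Hypothetical mini-table of stroke counts, used only for this illustration.
STROKE_COUNT = {"一": 1, "人": 2, "大": 3, "天": 4}


def character_key(ch: str) -> tuple[int, str]:
    """Sort key for one character: stroke count, then code point as a tie-breaker."""
    return (STROKE_COUNT[ch], ch)


def word_key(word: str) -> list[tuple[int, str]]:
    """Sort key for a word: the keys of its characters, compared one by one."""
    return [character_key(ch) for ch in word]


words = ["天人", "大人", "一天", "人", "大", "一"]
sorted_words = sorted(words, key=word_key)
print(sorted_words)
```

Because Python compares lists element by element, a single-character word sorts before any longer word that begins with the same character, which matches the usual dictionary convention.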

Forms

Main page: Social:Chinese character forms

Modern Chinese characters appear in the form of square blocks. There are three layers or levels of structural units of Chinese characters: strokes, components, and whole characters.[46][lower-alpha 1] For example, 字 (character) has two components, each of which is composed of three strokes:

字 = 宀(㇔㇔㇇) + 子(㇇㇚㇐).

Strokes

Main page: Social:Chinese character strokes

Strokes (traditional Chinese: 筆劃; simplified Chinese: 笔画; pinyin: bǐhuà) are the smallest writing units of Chinese characters. When writing a Chinese character, the trace of a dot or a line left on the writing material (such as paper) from pen-down to pen-up is called a stroke.[48]

Stroke number is the number of strokes of a Chinese character. It varies widely: for example, the characters 一 and 乙 have only one stroke, while the character 齉 has 36 strokes, and 龘 (composed of three 龍) consists of 48 strokes.[49]

Stroke forms refer to the shapes of strokes. The stroke forms of a standard Chinese character set can be classified into a stroke table (or stroke list); for instance, the Unicode CJK strokes list has 36 types of strokes.[50]

Stroke order is the order in which strokes are written to form a Chinese character; for example, the stroke order of 千 is "㇓,㇐,㇑". [51]

Components

Main page: Social:Chinese character components

Chinese characters are composed of components (Chinese: 部件; pinyin: bùjiàn), which are in turn composed of strokes.[7] In most cases, a component is larger than a stroke (i.e., consists of more than one stroke) and smaller than the whole character (combines with some other components to form a character). For example, in character Template:Zhi, there are two components, Template:Zhi and Template:Zhi, each with more than one stroke (Template:Zhi: ㇓㇑, Template:Zhi: ㇓㇐㇐㇑). In the special cases of one-stroke characters, such as Template:Zhi and Template:Zhi, a stroke is a component and is a character.

Chinese character component analysis divides a character into components. There are two ways of dividing a character: hierarchical dividing and plane dividing. Hierarchical dividing separates the character layer by layer from larger to smaller components, finally arriving at the primitive components. Plane dividing separates out the primitive components all at once.[52]
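The difference between the two ways of dividing can be sketched with a small recursive example. The `DECOMPOSITION` table below is a hypothetical two-entry illustration (想 into 相 and 心, and 相 into 木 and 目), not an official component standard, and the function names are invented for the sketch.

```python
# Hypothetical decomposition table: each entry lists first-level components.
# Characters absent from the table are treated as primitive components.
DECOMPOSITION = {
    "想": ["相", "心"],
    "相": ["木", "目"],
}


def hierarchical(ch: str):
    """Divide layer by layer, returning a nested (component, parts) tree."""
    parts = DECOMPOSITION.get(ch)
    if parts is None:
        return ch  # primitive component: cannot be divided further
    return (ch, [hierarchical(part) for part in parts])


def plane(ch: str) -> list[str]:
    """Return all primitive components at once, in left-to-right order."""
    parts = DECOMPOSITION.get(ch)
    if parts is None:
        return [ch]
    primitives: list[str] = []
    for part in parts:
        primitives.extend(plane(part))
    return primitives


print(hierarchical("想"))
print(plane("想"))
```

The hierarchical result keeps the intermediate compound component 相, while the plane result lists only the primitive components.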

A component that can independently form a character is a character component, or a component of independent character formation (成字部件). For example, component Template:Zhi forms the character Template:Zhi on its own, and is also a component in the characters Template:Zhi, Template:Zhi and Template:Zhi. A component that cannot independently form a character is a non-character component, or a component of dependent character formation (非成字部件). An example is the component Template:Zhi in the characters Template:Zhi, Template:Zhi and Template:Zhi.[7]

A component that cannot be (further) divided into smaller components by the rules is a primitive component, or basic component (Template:Zhi). Primitive components are the final-level components of hierarchical dividing. For example, components Template:Zhi and Template:Zhi in character Template:Zhi. A component composed of two or more primitive components is a compound component (Template:Zhi). For example, component Template:Zhi in character Template:Zhi, Template:Zhi and Template:Zhi.[53]

Whole characters

Main page: Social:Chinese whole characters

'Whole characters' (Chinese: 整字; pinyin: zhěngzì) lie at the final level of the stroke–component–character composition. [54] A non-decomposable character (独体字) consists of one primitive component, which is formed directly from strokes and cannot be decomposed into smaller components. [55] A decomposable character (合体字) can be broken down into multiple components.

The structure of a Chinese character is the pattern or rule in which the character is formed by its (first level) components.[53] Chinese character structures include:

Popular typefaces or fonts of modern Chinese characters include

In Chinese, in addition to the international point system, a unique 'number' (Template:Zhi) system is used for character sizes. For example, the Simplified Chinese version of Microsoft Word allows setting font sizes by either points or numbers.[57]

Phonology

Main page: Social:Chinese character sounds

The standard pronunciation of Chinese characters is based on the Beijing dialect of Mandarin.[58]

Normally a Chinese character is read with one syllable. Some Chinese characters have more than one pronunciation (polyphonic characters). Some syllables correspond to more than one character (homophonic characters).[59]

Polyphonic characters

Polyphonic characters (多音字) are characters with two or more pronunciations, as opposed to monophonic characters with only one.

A polyphonic monosemous character (多音同义字) has two or more pronunciations with the same meaning. For example, the English word 'ton' is transliterated as 吨, with the two pronunciations dūn and dùn coexisting in some old dictionaries, both sharing the meaning of 'ton'. Since 吨 is both a character and a word, it is a polyphonic monosemous character as well as a polyphonic monosemous word.

In December 1985, the PRC announced the Table of Mandarin Words with Variant Pronunciation (普通话异读词审音表) to define the standard pronunciations for polyphonic monosemous characters.[60] In Taiwan, there is a similar official standard for Mandarin words with variant sounds, where pronunciations are expressed in bopomofo instead of pinyin.

A polyphonic polysemous character (多音多义字) has two or more pronunciations, and different pronunciations represent different meanings. For example, the character 长 is pronounced cháng with the meaning of 'long', or zhǎng with the meaning of 'grow'. The simplified character 脏 is pronounced zāng when it corresponds to the traditional character 髒 (dirty), or zàng when it corresponds to the traditional character 臟 (internal organs). The pronunciation of such characters is determined by the meaning intended. [61]

Polyphonic polysemous characters may hinder the learning and application of Chinese characters and should be reduced. There are two main methods:[62]

  • Change the pronunciation. A common approach is to replace rare and less frequent readings with frequent ones, and to replace ancient pronunciations with present-day ones.
  • Change the form. That is, reassign some sounds and meanings so that they are expressed by other characters.

Homophones

Homophonic characters (Template:Zhi) are those sharing the same pronunciation, as opposed to heterophonic characters (Template:Zhi). Homophonic characters are either narrowly understood as having identical initials, finals, and tones, or more broadly as merely having identical initials and finals, with tones possibly differing. For example, "Template:Zhi" are all pronounced , while "Template:Zhi, Template:Zhi, Template:Zhi, Template:Zhi, Template:Zhi" are homophones only in the broader sense. Usually, people understand homophony in characters as referring to the narrow sense.[63]

Homophonic characters are widespread in Mandarin: there are around 1,300 possible syllables when tonal distinctions are included, and only about 400 when tones are ignored. Meanwhile, the written language has more than 10,000 characters, for an average of roughly 7.5 characters mapped to each tonal syllable.[64]
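Given a table of readings, polyphonic characters and (narrow-sense) homophone groups can be found mechanically. The sketch below uses a hypothetical five-character `READINGS` table; it is not drawn from any standard pronunciation list, and homophony is taken in the narrow sense of identical initial, final, and tone.

```python
from collections import defaultdict

# Hypothetical mini-table of Mandarin readings, for illustration only.
READINGS = {
    "长": ["cháng", "zhǎng"],
    "常": ["cháng"],
    "尝": ["cháng"],
    "掌": ["zhǎng"],
    "脏": ["zāng", "zàng"],
}

# Polyphonic characters: more than one reading.
polyphonic = [ch for ch, readings in READINGS.items() if len(readings) > 1]

# Narrow-sense homophones: characters sharing a syllable including its tone.
by_syllable: dict[str, list[str]] = defaultdict(list)
for ch, readings in READINGS.items():
    for syllable in readings:
        by_syllable[syllable].append(ch)

homophone_groups = {
    syllable: chars for syllable, chars in by_syllable.items() if len(chars) > 1
}

print(polyphonic)
print(homophone_groups)
```

Note that a polyphonic character such as 长 can belong to more than one homophone group, one for each of its readings.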

Zhou Youguang introduced two ways homophones have been historically reduced:[65][66]

  • Differentiate character pronunciations without changing the word. For example, 癌症 (cancer) was originally pronounced yánzhèng, and was later changed to áizhèng to avoid confusion with 炎症 (inflammation).
  • Differentiate words and pronunciation. For example, 期中 (qīzhōng, midterm) was confused with 期终 (qīzhōng, end of term), so the synonym 期末 (qīmò) came to be used instead.

Others

There are two systems for phonetic notation of Chinese characters.

In pinyin, either diacritics (e.g., mā) or numbers (ma1) may be used to mark tones. The Jyutping system for Cantonese uses numbers, e.g. Template:Zhi
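Converting numbered pinyin to diacritic pinyin can be sketched as below. The function name is hypothetical, and the placement rule used (mark a or e if present, mark o in "ou", otherwise mark the last vowel) is the commonly cited convention rather than something stated in this article. The sketch assumes a single lowercase syllable ending in a tone digit and does not handle vowel-less syllables such as "ng".

```python
import unicodedata

# Combining marks for tones 1 to 4: macron, acute, caron, grave.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}
VOWELS = "aeiouü"


def numbered_to_diacritic(syllable: str) -> str:
    """Convert e.g. 'ma1' to 'mā' (assumes one lowercase syllable + tone digit)."""
    syllable = unicodedata.normalize("NFC", syllable)
    base, tone = syllable[:-1], int(syllable[-1])
    if tone not in TONE_MARKS:
        return base  # neutral tone (often written 5 or 0): no mark
    if "a" in base:
        index = base.index("a")
    elif "e" in base:
        index = base.index("e")
    elif "ou" in base:
        index = base.index("o")
    else:
        # Otherwise the mark goes on the last vowel of the syllable.
        index = max(i for i, ch in enumerate(base) if ch in VOWELS)
    marked = base[: index + 1] + TONE_MARKS[tone] + base[index + 1 :]
    # Compose the base letter and combining mark into one code point.
    return unicodedata.normalize("NFC", marked)


for numbered in ["ma1", "zhang3", "liu4", "gou4", "ma5"]:
    print(numbered, "->", numbered_to_diacritic(numbered))
```

The reverse direction (diacritics to numbers) could be handled by decomposing with `unicodedata.normalize("NFD", ...)` and mapping the combining mark back to a digit, taking care not to strip the diaeresis of ü.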

Kun'yomi are readings of kanji using native Japanese words mapped to the meanings of borrowed Chinese characters. Characters have also been borrowed with on'yomi readings, which use borrowed Sino-Japanese pronunciations. For example, when the Chinese character 山 (mountain) was borrowed into Japanese, it came to be read either with the native kun'yomi pronunciation yama or with the Sino-Japanese on'yomi pronunciation san. Comparable phenomena appear in Mandarin and English, such as "i.e." being read aloud as 'that is'. Qiu Xigui called this 同義換讀 (synonymous reading).[67]

Semantics

Main page: Social:Chinese character meanings

In modern Chinese, a character may represent a word, a morpheme in a compound word, or alternatively a meaningless syllable combined with some other syllables or characters to form a morpheme.[68] In a language, morphemes are the minimal units of meaning.[69] Some characters have only one meaning, some have multiple meanings, and some characters largely share the same meaning.[70]

Monosemous and polysemous characters

A character with only one meaning is a monosemous character, and a character with two or more meanings is a polysemous character. According to statistics from the Chinese Character Information Dictionary, among the 7,785 mainland standard Chinese characters in the dictionary, there are 4,139 monosemous characters, 3,053 polysemous characters and 593 meaningless characters. [71]

The meaning people assigned to a character when it was created is the original meaning (本义) of the character. [72] For example, the original meaning of 兵 is 'weapon' (a 斤 (jīn, cutting knife) being held with both hands, 廾).

The meaning developed from the original meaning of a character through association is the extended meaning (引申义). [72] For example, 'soldier' is an extended meaning of 兵.

The meaning added through the loan of homonymous sounds is the phonetic-loan meaning (假借义). [73] For example, the original meaning of 其 is 'dustpan': its use as the pronoun 'his', 'her', 'its' is due to a phonetic loan.

Synonyms

Synonym characters are a group of characters that have the same or similar meaning. The characters in a synonym group often differ in frequency of use and word-formation ability, and there are some (subtle) differences in meaning and emotional color. Knowledge of synonym characters helps students write Chinese more correctly and express meanings more accurately.[74] For example:

Both Template:Zhi and Template:Zhi have the meaning of 'face', but there are some differences.[75] Generally, Template:Zhi is not used as an independent word in Mandarin, but only in multi-character compounds. For example, Template:Zhi (to meet), Template:Zhi (face and eyes), Template:Zhi (red face), Template:Zhi (yellow face, with thin muscles). The Template:Zhi in these words cannot be replaced by Template:Zhi. In contrast, Template:Zhi can usually be used alone in Mandarin as its own word, as well as in compounds such as Template:Zhi (facial makeup), Template:Zhi (painted face), Template:Zhi (baby face), Template:Zhi (round face), Template:Zhi (square face) and Template:Zhi (a cute face). The Template:Zhi in these words cannot be replaced by Template:Zhi.

Meanings of characters and words

The meaning of a single-character word is its character meaning. The meaning of a multicharacter word is generally derived from the meanings of the characters. The relationships between the meaning of a compound word and of its characters are categorized as follows: [76]

  1. Synonyms (A + B = A = B), such as 声音 (sound) = 声 (sound) = 音 (sound).
  2. Synthetic meaning (A + B = AB), such as 品德 (moral character) = 品 (character) + 德 (morality).
  3. Expanded meaning (A + B = AB + ε), such as 景物 (scenery) = 景 (view) + 物 (thing) + (for sightseeing).
  4. Partial meaning (A + B = A or B, but not the other), for example 国家 (country) = 国 (country) but ≠ 家 (family), and 容易 (easy) = 易 (easy) but ≠ 容 (countenance).
  5. Complementary meaning (A + B = ε), for example 东西 (thing, stuff) is not 东 (east) + 西 (west).

According to sampling statistics, categories 2 and 3 account for 89.7% of the compound words.

Classification

Main page: Social:Chinese character internal structures

In the analysis of internal structures, Chinese characters are decomposed into internal structural components in relations with the sound and meaning of the characters.[77]

Traditional classification

In Shuowen Jiezi, Xu Shen proposed six categories (traditional Chinese: 六書; simplified Chinese: 六书; pinyin: liùshū; literally: Six Writings) of Chinese characters, including [78]

  1. Pictograms (象形; xiàngxíng) are single-semantic-component characters which are drawings of the objects they represent.
  2. Simple ideograms (指事; zhǐshì) express an abstract idea with an iconic form.
  3. Compound ideographs (會意; huìyì; 'joined meaning') combine two or more semantic components to indicate the meaning of the character.
  4. Phono-semantic characters (形聲; xíngshēng; 'form and sound') consist of phonetic components and semantic components.
  5. Derivative cognates (轉注; zhuǎnzhù) are pairs of characters that had similar Old Chinese pronunciations and may have had the same etymological root.
  6. Rebus (phonetic loan) characters (假借; jiǎjiè; 'borrowing, making use of') are characters "borrowed" to write another morpheme which is pronounced the same or nearly the same.

Modern classification

The traditional Six Writings pre-supposed that every internal component can either represent the sound or meaning of the character. But, after the long evolution of Chinese writing systems, quite a few components can no longer effectively play the roles and have become pure form components. From the internal structure point of view, modern Chinese characters are composed of semantic components, phonetic components and pure form components. And they have formed seven categories of modern Chinese characters:[79][80]

Semantic component characters are composed of semantic components and include:[81][82]

Phonetic component characters are composed of phonetic components.[81] For example,

  • Phonetic-loan characters: for example, the character "花" (flower) is borrowed to mean "spending".
  • Characters used in a transliterated foreign word, e.g., the characters in the words "打" (dá, dozen) and "马达" (mǎdá, motor).
  • Multi-phonetic component characters: for example, "新" (xīn) was originally a semantic-phonetic character, but its modern meaning of "new" has nothing to do with the original semantic component "斤" (jīn, 0.5 kg), although the sounds are similar. In this way, "新" (xīn) now has two phonetic components: "亲" (qīn) and "斤" (jīn).

Pure form characters are composed of form components, which neither represent the sound nor the meaning of the characters.[83] For example:

Semantic-phonetic characters, also called "phono-semantic characters", consist of semantic components and phonetic components.[84] There are six combinations:

  1. Left meaning (semantic) and right sound (phonetic), such as 肝 (gān, liver), 惊 (jīng, fear), 湖 (hú, lake);
  2. Right meaning and left sound, such as 鹉 (wǔ, parrot), 刚 (gāng, firm), 甥 (shēng, nephew);
  3. Upper meaning and lower sound, such as 霖 (lín, rain), 茅 (máo, cogongrass), 竿 (gān, pole);
  4. Lower meaning and upper sound, such as 盂 (yú, bowl), 岱 (dài, Mount Tai), 鲨 (shā, shark);
  5. Outer meaning and inner sound, such as 痒 (yǎng, itch), 园 (yuán, garden), 衷 (zhōng, heart), 座 (zuò, seat), 旗 (qí, flag);
  6. Inner meaning and outer sound, such as 辫 (biàn, braid), 闷 (mèn, dull), Template:Zhi (mó, imitation).

Semantic-form characters are composed of semantic components and pure form components.[85] Many of these characters were originally semantic-phonetic characters. Due to subsequent changes in the pronunciation of the phonetic components or the characters, the phonetic components could not effectively represent the pronunciation of the character and became pure form. For example: [86]

Phonetic-form characters are composed of phonetic components and pure form components.[87] They mostly came from ancient semantic-phonetic characters, where the semantic components lost their functions and became pure form. For example,

  • 球 (qiú, ball): Originally referred to a kind of beautiful jade, with the semantic component 王 (jade). Later, it was borrowed to represent a ball, and then extended to any round three-dimensional object; 王 (jade) became a pure form component, while 求 (qiú) remains a phonetic component.
  • 笨 (bèn, stupid): Originally referred to the inner white layer of bamboo, with the semantic component 竹 (bamboo) and the phonetic component 本 (běn). Later, the character was borrowed by sound to mean stupid.
  • 华 (huá, magnificent): This is a simplified character with the phonetic component 化 (huà) and a pure form component 十.

Semantic-phonetic-form characters consist of the three kinds of components. For example,[83]

Semantic-phonetic-form characters are very rare and the examples above are not quite persuasive. Whether they can be justified as an internal structural category remains to be further studied. (If not a category, then the classification above can also be called "New Six Writings")

According to Yang,[85] among the 3,500 frequently used Chinese characters in the experiment, semantic component characters form the smallest group, accounting for about 5%; pure form component characters account for about 18%; semantic-form and phonetic-form characters account for about 19%. The largest group is semantic-phonetic characters, accounting for about 58%.

Simplification

Milestones

The historical milestones of Chinese character simplification include: [88] [89]

In 1909, Lu Feikui published the article "Vulgar Chinese Characters Should Be Used in General Education" (普通教育應當採用俗體字). The May Fourth Movement further promoted Chinese character simplification.

In August 1935, the Ministry of Education of China in Nanjing published the "List of the First Batch of Simplified Chinese Characters" (第一批簡體字表), which included 324 characters.

In January 1956, the Chinese Character Simplification Scheme was approved by the State Council of China.

In May 1964, the General List of Simplified Characters (简化字总表) was published. A revised version was published in 1986.

In June 2013, the Table of General Standard Chinese Characters was released by the State Council of China. It includes 8,105 characters of the Simplified Chinese writing system. In addition, there are 2,574 corresponding Traditional characters and 1,023 variants.

Sources

There are four main sources of simplified characters: [90]

  1. Ancient characters, such as 云 (雲, cloud), 礼 (禮, etiquette), 后 (後, after).
  2. Simplified characters already popular in society, such as 体 (體, body), 声 (聲, sound), 铁 (鐵, iron).
  3. Regularized cursive characters, such as 书 (書, book), 为 (為, for), 东 (東, east).
  4. Newly coined characters, such as 国 (國, country), 拥 (擁, support), 护 (護, protect).

Methods

The methods used to simplify Chinese characters include the following: [91] [92]

Omitting

That is, to omit some components of the character, for example:

Reshaping

That is, to change the form of the original character, for example:

Replacing

That is, to replace the whole character, usually with a character of similar sound, for example:

Rationalization

Main page: Social:Chinese character rationalization

The goal of Chinese character rationalization, or Chinese character optimization (traditional Chinese: 漢字整理; simplified Chinese: 汉字整理; pinyin: hànzì zhěnglǐ), is to go beyond simplification by optimizing the characters and establishing one standard form for each of them. [93]

Processing variant characters

Variant Chinese characters are characters with the same pronunciation and meaning but different forms, such as "Template:Zhi" (gòu, enough) and "Template:Zhi" (tā, it). The existence of variant characters means that one character has multiple forms, which increases the burden of learning and using the language. As Chinese characters are used, variant characters need to be processed continually and inappropriate ones eliminated. [94]

There are two different principles for processing variant characters. One is to follow customary usage and favor simplicity. The other is to follow the original form and meaning, based on the character creation method and etymology, especially as recorded in the Shuowen Jiezi.[95]

There are two methods for processing variant characters: The selecting method is to select one of the variant characters as the standard character and eliminate the rest. [96] The splitting method is to differentiate a group of variant characters in terms of usage to eliminate the variant relationship.

In December 1955, the Ministry of Culture and the Chinese Character Reform Commission of the PRC jointly announced the "First List of Processed Variant Characters" (Template:Zhi). After some later adjustments, the list now has 796 groups of variant characters, and 1,027 characters have been eliminated. [97]

Processing printing fonts

In January 1965, the Ministry of Culture and the Chinese Character Reform Commission of the PRC jointly issued the "Template:Zhi" (General Chinese Character Forms for Printing), or "Template:Zhi" (Font Table) for short. The "Font Table" contains 6,196 commonly used Song-style characters for printing. In accordance with the principles of simplicity and convenience for learning and use, a standard form was specified for each common character, including the number of strokes, structure and stroke order. After the Cultural Revolution, the Font Table was formally published. The character forms it specifies are now customarily called "new character forms", while the forms used before are called "old character forms". The "New and Old Character Form Comparison Table" (Template:Zhi) found in many language reference books, including the Xinhua Dictionary and the Xiandai Hanyu Dictionary, is compiled and printed on the basis of the Font Table. [98][99]

Current font standards include:

Names of places

In order to make place names easier to use, the Chinese government started to process the uncommon characters used in place names in the 1950s.

The principles for choosing replacement characters are: [101] [102]

  1. Same pronunciation and clear,
  2. More commonly used,
  3. Simple and easy to write,
  4. A standard character that is popular in the local area,
  5. Not to be confused with other place names.

From March 1955 to August 1964, 35 place names of county level or above were changed with the approval of the State Council. For example:

Later, in order to maintain the stability of place names, this work was suspended. [103]

Measurement words

When the English units of measurement were translated into Chinese, there were inconsistencies in the use of characters. For example:

mile:  Template:Zhi or Template:Zhi.
foot:  Template:Zhi, Template:Zhi.
kilowatt: Template:Zhi, Template:Zhi.

These inconsistencies increased the burden on language users. "Template:Zhi" and similar characters were specially created, and they are read with more than one syllable, which does not follow the monosyllabic pattern of Chinese characters. To solve these problems, in July 1977 the Chinese Character Reform Commission and the National Bureau of Standards and Measures of the PRC jointly issued the "Notice on the Uniform Use of Characters in the Names of Some Measurement Units" (Template:Zhi), establishing the metric system as the basic measurement system. [104] [105]

Education

Main page: Social:Chinese character education

Chinese character education is the teaching and learning of Chinese characters. When written Chinese appeared in social communication, Chinese character teaching came into being. From ancient times to the present, the teaching of Chinese characters has always been the focus of Chinese language education. [106]

Ancient education

In ancient times, research on Chinese character teaching focused on the preparation of various centralized literacy textbooks and dictionaries. Among them, the ones with greater impact include: [107]

The previous three books then developed into a set of teaching materials, collectively called "Three Hundred Thousand" (Template:Zhi, about 2,000 different characters), which were used for over 1,000 years until the end of the Qing Dynasty and still have a certain influence today. "Three Hundred Thousand" is arranged in rhyme to make it catchy and easy to remember. Another influential literacy textbook is "Wenzi Mengqiu" (Template:Zhi), compiled for children by the Qing Dynasty writer Wang Jun (1784–1854), which contains 2,049 characters. [108]

Modern native language education

Modern Chinese character education is an important component of primary education in China, and an important part of literacy teaching and teaching Chinese as a foreign language. [109] [110]

The method is to teach high-frequency characters first, selected according to frequency statistics. The important character lists include:

  • "List of Frequently Used Characters in Modern Chinese" (Template:Zhi, State Language Commission, Beijing, 1988),[111] 3,500 characters.
  • Table of General Standard Chinese Characters (Template:Zhi), released by the State Council of the PRC in June 2013: the 3,500 primary characters in this list of 8,105 characters of the Simplified Chinese writing system.
  • Chart of Standard Forms of Common National Characters (Template:Zhi, 1979), including 4,808 commonly used Chinese characters.

The Chinese character literacy movement began in the early 20th century, when the literacy level of ordinary Chinese people was quite low. Intellectuals who cared about the country and its people advocated education to save the country and started a Chinese character literacy campaign. [112] In June 1952, the Ministry of Education of China published a list of commonly used literacy characters, including 2,000 characters for use in literacy textbooks. In 1993, the State Language Commission published the "Character List for Literacy", which consists of Table A and Table B. Table A contains 1,800 characters that are required for literacy in the country, and Table B contains 200 reference characters for literacy.[113] According to UNESCO, by 2015, China's illiteracy rate had dropped to 3.6 percent.

Foreign language education

In the 3rd century AD, Chinese characters were introduced to Korea, thereafter to Japan, Vietnam and other countries. Thus, teaching Chinese characters to foreigners began. By 1989, there were more than 100 colleges and universities teaching Chinese as a foreign language in China. [114]

From 1990 to 1991, the National Leading Group for Teaching Chinese as a Foreign Language and the Chinese Proficiency Test Center of Beijing Language Institute jointly developed the "Template:Zhi" (Outline of the Graded Vocabulary and Characters for HSK). The Chinese character outline contains 2,905 characters, divided into four grades: 800 Grade A characters, 804 Grade B characters, 601 Grade C characters, and 700 Grade D characters. Among these 2,905 characters, 2,485 are first-level frequently-used characters in the "Template:Zhi" (List of Frequently Used Characters in Modern Chinese).[115] Teaching Chinese characters as a foreign language has received more and more attention, and many textbooks and elective courses in this area have appeared. There are now more than 200 Confucius Institutes teaching Chinese as a foreign language in the world.

Information technology

Main page: Chinese character IT

Chinese character Information Technology (IT) is the technology of computer processing of Chinese characters. While the English writing system makes use of a few dozen different characters, the Chinese language needs a much larger character set. There are over ten thousand characters in the Xinhua Dictionary.[27] In the Unicode multilingual character set of 149,813 characters, 98,682 (about 2/3) are Chinese.[116]

Chinese character input

Computer input of Chinese characters is by no means as easy as input of English. English is written with 26 letters and a handful of other characters, and each character is assigned to a key on the keyboard. Chinese could be input in a similar way, but that would require a huge keyboard with at least thousands of keys, and searching for a character on such a keyboard would be a daunting job.[117] An alternative is to encode each Chinese character as a string of English characters, enabling Chinese input on an English keyboard. This method has in fact become predominant for Chinese computer input.

Sound-based encoding is normally based on an existing Latin character scheme for Chinese phonetics, such as the Pinyin Scheme for Mandarin Chinese or Putonghua, and the Jyutping Scheme for the Cantonese dialect. The input code of a Chinese character is its pinyin letter string followed by an optional number representing the tone. For example, the Putonghua Pinyin input code of 香港 (Hong Kong) is "xianggang" or "xiang1gang3", and the Cantonese Jyutping code is "hoenggong" or "hoeng1gong2", all of which can be easily input via an English keyboard.
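The relationship between toned and untoned input codes can be sketched in a few lines of Python. This is only an illustration: the two small tables and the function names `input_codes` and `build_index` are hypothetical, and a real input method would draw on a full pronunciation dictionary.

```python
import re

# Hypothetical mini-dictionaries: word -> toned syllables.
PINYIN = {"香港": ["xiang1", "gang3"]}      # Putonghua Pinyin
JYUTPING = {"香港": ["hoeng1", "gong2"]}    # Cantonese Jyutping


def input_codes(word, table):
    """Return the (toned, untoned) input codes of a word."""
    toned = "".join(table[word])
    untoned = re.sub(r"\d", "", toned)  # drop the optional tone digits
    return toned, untoned


def build_index(table):
    """Map every accepted input code to the words it can produce."""
    index = {}
    for word in table:
        for code in input_codes(word, table):
            index.setdefault(code, []).append(word)
    return index


print(input_codes("香港", PINYIN))
print(build_index(JYUTPING)["hoenggong"])
```

Because tone digits are optional, both the toned and the untoned string are registered as valid codes for the same word.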

A Chinese character can alternatively be input by form-based encoding. Most Chinese characters can be divided into a sequence of components in writing order. There are a few hundred basic components,[118] far fewer than the number of characters. By representing each component with an English letter and arranging the letters in the writing order of the character, the input method creator obtains a letter string that can be used as an input code on the English keyboard. The creator can also design a rule to select representative letters from the string if it is too long. For example, in the Cangjie input method, the character 疆 (border) is encoded as "NGMWM", corresponding to the components "弓土一田一", with some components omitted. Popular form-based encoding methods include Wubi (五笔) in the Mainland and Cangjie (倉頡) in Taiwan and Hong Kong.[119]
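The letter-to-component correspondence in the "NGMWM" example can be illustrated with a small Python sketch. The table below lists only the four Cangjie keys that the example needs, and the function names are hypothetical; it is not a full Cangjie implementation and does not model the rules for omitting components.

```python
# Partial key table: only the Cangjie keys used in the "NGMWM" example.
CANGJIE_KEYS = {"N": "弓", "G": "土", "M": "一", "W": "田"}

# Reverse table, from component to key letter.
COMPONENT_TO_KEY = {comp: key for key, comp in CANGJIE_KEYS.items()}


def code_to_components(code):
    """Expand an input code into the component sequence it stands for."""
    return "".join(CANGJIE_KEYS[letter] for letter in code)


def components_to_code(components):
    """Turn a component sequence (in writing order) into an input code."""
    return "".join(COMPONENT_TO_KEY[comp] for comp in components)


print(code_to_components("NGMWM"))
print(components_to_code("弓土一田一"))
```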

The most important feature of intelligent input is the application of contextual constraints for candidate character selection. For example, in Microsoft Pinyin, when the user types the input code "daxuejiaoshou", the result is "大学教授 / 大學教授" (university professor); when the user types "daxuepiaopiao", the computer suggests "大雪飘飘 / 大雪飄飄" (heavy snow flying). Though the non-toned Pinyin letters of 大学 and 大雪 are both "daxue", the computer can make a reasonable selection based on the subsequent words.[120]
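One simple way to picture context-based selection is to score each combination of candidates by how well neighbouring words go together. The sketch below is a toy model under stated assumptions: the candidate table, the pair scores and the function `best_sentence` are invented for illustration, the input is assumed to be already split into word codes, and nothing here describes how Microsoft Pinyin actually works.

```python
from itertools import product

# Hypothetical candidate words for each untoned input code.
CANDIDATES = {
    "daxue": ["大学", "大雪"],
    "jiaoshou": ["教授"],
    "piaopiao": ["飘飘"],
}

# Hypothetical co-occurrence scores for adjacent word pairs.
PAIR_SCORE = {("大学", "教授"): 10, ("大雪", "飘飘"): 8}


def best_sentence(codes):
    """Pick the candidate combination with the highest total pair score."""
    options = [CANDIDATES[code] for code in codes]

    def score(words):
        return sum(PAIR_SCORE.get(pair, 0) for pair in zip(words, words[1:]))

    return "".join(max(product(*options), key=score))


print(best_sentence(["daxue", "jiaoshou"]))
print(best_sentence(["daxue", "piaopiao"]))
```

The same code "daxue" resolves to different words depending only on what follows it, which is the behaviour the paragraph above describes.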

Chinese character encoding for information interchange

Inside a computer or mobile phone, each character is represented by an internal code. When a character is sent between two computers or other digital devices, it is represented in an information interchange code. Nowadays, information interchange codes, such as ASCII and Unicode, are often directly employed as internal codes.

The first GB Chinese character encoding standard is GB2312, which was released by the PRC in 1980. It includes 6,763 Chinese characters, with 3,755 frequently-used ones sorted by Pinyin, and the rest by radicals (indexing components). GB2312 was designed for simplified Chinese characters. Traditional characters which have been simplified are not covered. The code of a character is represented by a two-byte hexadecimal number, for instance, the GB codes of 香港 (Hong Kong) are CFE3 and B8DB respectively. GB2312 is still in use on some computers and the WWW, though newer versions with extended character sets, such as GB13000.1 and GB18030, have been released.[121] The latest version of GB encoding is GB18030, which supports both simplified and traditional Chinese characters, and is consistent with the Unicode character set.[122]
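The two-byte codes quoted above can be checked with Python's built-in `gb2312` and `gb18030` codecs. This is a minimal sketch; the helper names `gb2312_codes` and `encodable` are chosen here for illustration.

```python
def gb2312_codes(text):
    """Return the two-byte GB2312 code of each character as hex."""
    return [ch.encode("gb2312").hex().upper() for ch in text]


def encodable(ch, codec):
    """Report whether a character exists in the given character set."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


print(gb2312_codes("香港"))
# The simplified form 龙 is in GB2312; the traditional form 龍 is not.
print(encodable("龙", "gb2312"), encodable("龍", "gb2312"))
# GB18030 covers both forms.
print(encodable("龍", "gb18030"))
```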

The Big5 encoding standard was designed by five big IT companies in Taiwan in the early 1980s, and has been the de facto standard for representing traditional Chinese in computers ever since. Big5 is widely used in Taiwan, Hong Kong and Macau. The original Big5 standard included 13,053 Chinese characters, with none of the simplified characters of the Mainland. Each character is encoded with a two-byte hexadecimal code, for example, 香 (ADBB), 港 (B4E4), 龍 (C073). Chinese characters in the Big5 character set are arranged in radical order. Extended versions of Big5 include Big-5E and Big5-2003, which include some simplified characters and Hong Kong Cantonese characters.[123]
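The Big5 codes given as examples can be reproduced with Python's built-in `big5` codec, which also makes the two-byte structure visible. The function name `big5_bytes` is a hypothetical helper used only for this sketch.

```python
def big5_bytes(ch):
    """Return the (lead byte, trail byte) pair of a character in Big5 as hex."""
    raw = ch.encode("big5")
    return tuple(f"{byte:02X}" for byte in raw)


for ch in "香港龍":
    lead, trail = big5_bytes(ch)
    print(ch, lead + trail)
```

Each character yields exactly two bytes, and joining the pair gives the four-digit hexadecimal code listed in the paragraph above.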

In its full form (for example, UTF-32), the Unicode standard represents a character with a 4-byte code, providing a huge encoding space to cover all characters of all languages in the world. The Basic Multilingual Plane (BMP) is the 2-byte core of Unicode, with 2^16 = 65,536 code points for important characters of many languages. There are 27,522 characters in the CJKV (China, Japan, Korea and Vietnam) Ideographs Area, including all the simplified and traditional Chinese characters in GB2312 and Big5. Unicode 15.1 defines a multilingual character set of 149,813 characters, among which 98,682 (about 2/3) are Chinese characters sorted by Kangxi radicals. Even very rarely used characters are available. For example: H (0048), K (004B), 香 (9999), 港 (6E2F), 龍 (9F8D), 龙 (9F99), 龖 (9F96), 龘 (9F98), 𪚥 (2A6A5).[116] [124]

Unicode is becoming more and more popular. It is reported that UTF-8 (a Unicode encoding) is used by 98.1% of all websites. It is widely believed that Unicode will ultimately replace all other information interchange codes and internal codes, putting an end to confusion between codes.[125]
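The code points listed above, and whether a character falls inside the BMP, can be inspected directly in Python. This is a small sketch; `describe` is a name chosen here for illustration, and the UTF-8 byte count is included only as one common way such code points are stored.

```python
def describe(ch):
    """Return (code point label, is in the BMP, UTF-8 length in bytes)."""
    cp = ord(ch)
    return (f"U+{cp:04X}", cp <= 0xFFFF, len(ch.encode("utf-8")))


# chr(0x2A6A5) is the rare character outside the BMP cited in the examples.
for ch in ["H", "香", "港", "龍", "龙", chr(0x2A6A5)]:
    print(ch, describe(ch))
```

Characters inside the BMP have four-digit code points, while the last example needs five hexadecimal digits and therefore lies outside the 65,536 BMP code points.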

Chinese character output

Like English and other languages, Chinese characters are output on printers and screens in different fonts and styles. The most popular Chinese fonts are the Song (traditional Chinese: 宋體; simplified Chinese: 宋体), Kai (Template:Zhi), Hei (Template:Zhi) and Fangsong (Template:Zhi) families.[56]

Fonts appear in different sizes. In addition to the international measurement system of points, Chinese characters are also measured by size numbers (traditional Chinese: 字號; simplified Chinese: 字号; pinyin: zìhào), invented by an American for Chinese printing in 1859. [57]


See also

Notes

  1. In some applications, there are smaller configuration units, e.g., stroke segments, turning points, and pixels.[47]

References

Citations

  1. Su 2014, pp. 19–21.
  2. "What is the most spoken language?". https://www.ethnologue.com/insights/most-spoken-language/. 
  3. Arcodia 2021, pp. 62–71.
  4. Zhou 2003.
  5. Qiu 2000.
  6. Chen 2021.
  7. 7.0 7.1 7.2 National Language Commission 2009, p. 1.
  8. 8.0 8.1 "Chinese Character Frequency Statistics for Hong Kong, Mainland China and Taiwan – A Trans-Regional, Diachronic Survey: 香港、大陸、台灣 – 跨地區、跨年代漢語常用字頻統計". https://humanum.arts.cuhk.edu.hk//Lexis/chifreq/. 
  9. 9.0 9.1 Su 2014.
  10. 10.0 10.1 Yang 2008.
  11. Qiu 2013, pp. 45–101.
  12. Zhou 1980.
  13. Gao & Fan 1985.
  14. Su 2014, pp. 29–30.
  15. Yin & Wang 2007, p. [page needed].
  16. Gao, Fei & Fan 1993, p. [page needed].
  17. Zhang 1992.
  18. Su 2014, pp. 19–21.
  19. Peking University 2004, pp. 145–148.
  20. Norman 1988, pp. 74–75.
  21. Norman 1988, p. 74.
  22. Su 2014, pp. 51–52.
  23. Su 2014, p. 47.
  24. 现代汉语常用字表 [List of Frequently Used Characters in Modern Chinese], Ministry of Education of the People's Republic of China, 26 Jan 1988.
  25. 现代汉语通用字表 [List of Commonly Used Characters in Modern Chinese], Ministry of Education of the People's Republic of China, 26 Jan 1988.
  26. 26.0 26.1 "国务院关于公布《通用规范汉字表》的通知" (in zh). State Council of the People's Republic of China. 5 June 2013. http://www.gov.cn/zwgk/2013-08/19/content_2469793.htm. 
  27. 27.0 27.1 Language Institute 2020.
  28. Language Institute 2016.
  29. National Academy of the Korean Language (1991)
  30. "'인명용(人名用)' 한자 5761→8142자로 대폭 확대" (in ko). Chosun Ilbo. 2014-10-20. http://news.chosun.com/site/data/html_dir/2014/10/20/2014102001300.html. 
  31. Su 2014, p. 51.
  32. (Lecture notes of the subject "Modern Chinese Characters and Information Technology", Dept of Chinese and Bilingual Studies, Hong Kong Polytechnical University, by Dr. Zhang Xiaoheng, June 12, 2017.)
  33. "UAX #38: Unicode Han Database (Unihan)". https://www.unicode.org/reports/tr38/#BlockListing. 
  34. Norman 1988, p. 73.
  35. 35.0 35.1 Su 2014, p. 34.
  36. Su 2014, p. 35.
  37. Chen 1928.
  38. National Language Commission 2013.
  39. Su 2014, p. 42.
  40. Su 2014, pp. 183–207.
  41. Zhan 2008, pp. 19–24.
  42. Wang & Zou 2003, pp. 20–27.
  43. Wang & Zou 2003, pp. 27–28.
  44. Wang & Zou 2003, pp. 29–31.
  45. Su 2014, pp. 201–202.
  46. Peking University 2004, pp. 148–152.
  47. Zhang & Li 2013.
  48. Su 2014, pp. 74–75.
  49. National Language Commission 1999.
  50. "The Unicode Standard, Version 15.1: CJK Strokes". https://www.unicode.org/charts/PDF/U31C0.pdf. 
  51. Su 2014, pp. 82–84.
  52. Su 2014, p. 86.
  53. 53.0 53.1 National Language Commission 2009, p. 2.
  54. Su 2014, p. 94.
  55. National Language Commission 2009a, p. 1.
  56. 56.0 56.1 Li 2013, p. 62.
  57. 57.0 57.1 Zhang 2006.
  58. Peking University 2004, p. 169.
  59. Su 2014, pp. 160–161.
  60. National Language Commission 1985.
  61. Peking University 2004, p. 171.
  62. Su 2014, pp. 172–175.
  63. Su 2014, p. 176.
  64. Peking University 2004, p. 172.
  65. Zhou 1993.
  66. Su 2014, p. 180.
  67. Qiu 2013, pp. 210–211.
  68. Yang 2008, p. 169.
  69. Fromkin 1993, p. 41.
  70. Yang 2008, pp. 170–172.
  71. Li 1988, p. 1112.
  72. 72.0 72.1 Li 2013, p. 231.
  73. Li 2013, p. 232.
  74. Su 1994, pp. 128–129.
  75. Su 1994, p. 129.
  76. Yang 2008, pp. 173–174.
  77. Li 2013, pp. 122–124.
  78. Qiu 2013, pp. 102–108.
  79. Yin & Wang 2007, pp. 97–100.
  80. Su 2014, pp. 102–111.
  81. 81.0 81.1 Yin & Wang 2007, p. 98.
  82. Su 2014, pp. 103–105.
  83. 83.0 83.1 Yin & Wang 2007, p. 100.
  84. Yin & Wang 2007, p. 99.
  85. 85.0 85.1 Yang 2008, p. 147.
  86. Su 2014, pp. 107–108.
  87. Su 2014, p. 109.
  88. Su 2014, pp. 120–126.
  89. Li 2013, pp. 300–302.
  90. Su 2014, p. 127.
  91. Su 2014, pp. 127–128.
  92. Li 2013, p. 304.
  93. Yang 2008, p. 116.
  94. Peking University 2004, p. 163.
  95. Su 2014, p. 141.
  96. Su 2014, pp. 141–142.
  97. Chinese Language Press 1997, p. 182.
  98. Su 2014, pp. 145–146.
  99. Li 2013, pp. 301–302.
  100. "國字標準字體楷書母稿 <教育部字序>". http://language.moe.gov.tw/001/Upload/files/SITE_CONTENT/M0001/MU/c5.htm?open. 
  101. Su 2014, p. 148.
  102. Fu 1994, p. 95.
  103. Li & Fei 2001, p. 260.
  104. Su 2014, pp. 150–151.
  105. "关于部分计量单位名称统一用字的通知". http://www.china-language.gov.cn/wenziguifan/shanghi/008.htm. 
  106. Li 2013, p. 366.
  107. Su 2014, pp. 245–246.
  108. Yang 2008, p. 216.
  109. Yang 2008, p. 217.
  110. Su 2014, pp. 248–249.
  111. "List of Frequently Used Characters in Modern Chinese". https://en.wikisource.org/wiki/Translation:List_of_Frequently_Used_Characters_in_Modern_Chinese. 
  112. Yang 2008, p. 219.
  113. Yang 2008, pp. 219–220.
  114. Wang 2001, p. 147.
  115. Yang 2008, p. 220.
  116. 116.0 116.1 "Unicode Statistics". https://www.unicode.org/versions/stats/. 
  117. Su 2014, p. 218.
  118. National Language Commission 1997.
  119. Zhang 2016, p. 422.
  120. Su 2014, p. 222.
  121. Su 2014, pp. 213–215.
  122. Lunde, Ken (4 August 2022). "The GB 18030-2022 Standard" (in en). https://ken-lunde.medium.com/the-gb-18030-2022-standard-3d0ebaeb4132. 
  123. "[chinese mac] Character Sets". http://chinesemac.org/pages/character_sets.html. 
  124. Unicode Consortium 2023.
  125. "Usage Statistics and Market Share of UTF-8 for Websites, November 2023". https://w3techs.com/technologies/details/en-utf8. 

Works cited

  • Arcodia, Giorgio (and Basciano, Bianca) (2021). Chinese Linguistics. Oxford: Oxford University Press. ISBN 978-0-19-884784-7. 
  • Chen, Heqin 陳鶴琴 (1928) (in zh). Beijing: Shangwu (The Commercial Press). 
  • Chen, Maoren 陳茂仁 (2021) (in zh). Taipei: 新學林 (New Xuelin). 
  • Chinese Language Press, PRC 语文出版社 (1997) (in zh). Beijing: 语文出版社 (Chinese Language Press). 
  • Fromkin, Victoria (and Robert Rodman) (1993). An Introduction to Language (5th ed.). Orlando, USA: Harcourt Brace Jovanovich College Publishers. ISBN 0-03-075379-1. 
  • Fu, Yonghe 傅永和 (1994) (in zh). Beijing: 语文出版社 (Chinese Language Press). 
  • Fu, Yonghe 傅永和 (1999) (in zh) (3rd ed.). Guangzhou: 广东教育出版社 (Guangdong Education Press). ISBN 9-787540-640804. 
  • Gao, Jiaying 高家鶯; Fan, Keyu 范可育 (1985). 1985. 
  • Gao, Jiaying 高家鶯; Fei, Jinchang 费锦昌; Fan, Keyu 范可育 (1993) (in zh). Xiàndài hànzìxué. Beijing: 高等教育出版社 (Higher Education Press). ISBN 7040040670. 
  • Language Institute, Chinese Academy of Social Sciences (2016) (in zh) (7th ed.). Beijing: Commercial Press. ISBN 978-7-100-12450-8. 
  • Language Institute, Chinese Academy of Social Sciences (2020) (in zh). Xīnhuá zìdiǎn (12th ed.). Beijing: Shangwu yinshuguan (The Commercial Press). ISBN 978-7-100-17093-2. 
  • Li, Dasui 李大遂 (2013) (in zh) (3rd ed.). Beijing: Peking University Press. ISBN 978-7-301-21958-4. 
  • Li, Gongyi 李公宜 (1988). Liu, Rushui 劉如水. ed (in zh). Beijing: 科学出版社 (Science Press). ISBN 7-03-000862-6. 
  • Li, Xingjian 李行建; Fei, Jinchang 费锦昌 (2001) (in zh). Shanghai: 上海辞书出版社 (Shanghai Dictionary Publishing House). 
  • National Language Commission, Ministry of Education, China (1985). Beijing: National Language Commission. http://www.moe.gov.cn/jyb_sjzl/ziliao/A19/201001/W020190416497956176438.pdf. Retrieved September 15, 2023. 
  • National Language Commission, PRC (1997). Chinese Character Component Standard of GB13000.1 Character Set for Information Processing. Beijing: National Language Commission of China. http://www.moe.gov.cn/ewebeditor/uploadfile/2015/01/12/20150112165337190.pdf. 
  • National Language Commission, Ministry of Education, China (1999) (in zh). Shanghai Education Press. ISBN 7-5320-6674-6. http://www.moe.gov.cn/jyb_sjzl/ziliao/A19/201001/W020150902458280061291.pdf. 
  • National Language Commission, Ministry of Education, China (2009a). Beijing: National Language Commission. http://www.moe.gov.cn/ewebeditor/uploadfile/2015/01/13/20150113090418639.pdf. Retrieved September 8, 2023. 
  • National Language Commission, Ministry of Education, China (2009). Beijing: National Language Commission. http://www.moe.gov.cn/ewebeditor/uploadfile/2015/01/13/20150113090318445.pdf. Retrieved 3 September 2023. 
  • National Language Commission, Ministry of Education, China (2013) (in zh). Beijing: Shangwu (The Commercial Press). 
  • Norman, Jerry (1988). Chinese. Cambridge: Cambridge University Press. ISBN 978-0-521-29653-3. 
  • Peking University, Modern Chinese Language Teaching and Research Office (2004) (in zh). Xiàndài hànyǔ. Shangwu yinshuguan (The Commercial Press). ISBN 7-100-00940-5. 
  • Qiu, Xigui (2000). Chinese writing. Berkeley: Society for the Study of Early China and The Institute of East Asian Studies, University of California. ISBN 978-1-55729-071-7.  (English translation of Wénzìxué Gàiyào 文字學概要, Shangwu, 1988.)
  • Qiu, Xigui 裘锡圭 (2013) (in zh) (2nd ed.). Beijing: 商务印书馆 (Commercial Press). ISBN 978-7-100-09369-9. 
  • Su, Peicheng 苏培成 (1994) (in zh). Beijing: Peking University Press. ISBN 7-301-02597-1. 
  • Su, Peicheng 苏培成 (2014) (in zh) (3rd ed.). Beijing: 商务印书馆 (The Commercial Press, Shangwu). ISBN 978-7-100-10440-1. 
  • Unicode Consortium (2023) (in en). Unicode Standard, Version 15.1.0. Mountain View, CA: Unicode Consortium. https://www.unicode.org/versions/Unicode15.1.0/. 
  • Wang, Ning (王宁) (2001) (in zh). Beijing: Beijing Normal University Press. 
  • Wang, Ning (王寧); Zou, Xiaoli (鄒曉麗) (2003) (in zh). Hong Kong: 和平圖書有限公司. ISBN 962-238-363-7. 
  • Yang, Runlu 杨润陆 (2008) (in zh). Beijing: Beijing Normal University Press. ISBN 978-7-303-09437-0. 
  • Yin, Jiming 殷寄明; Wang, Rudong 汪如东 (2007) (in zh). Xiàndài hànyǔ wénzìxué. Shanghai: Fudan University Press. ISBN 978-7-309-05525-2. 
  • Zhan, Deyou 詹德优 (2008) (in zh). Beijing: Commercial Press. ISBN 978-7-100-01510-3. 
  • Zhang, Jingxian 张静贤 (1992) (in zh). Beijing: Modern Press. 
  • Zhang, Xiaoheng 张小衡 (2006). 42. pp. 175–177 & p 215. 
  • Zhang, Xiaoheng (张小衡); Li, Xiaotong (李笑通) (2013) (in zh). Beijing: 语文出版社 (The Language Press). ISBN 978-7-80241-670-3. 
  • Zhang, Xiaoheng (2016). "Computational Linguistics". The Routledge Encyclopedia of the Chinese Language. Oxford: Routledge. pp. 420–437. ISBN 978-0-415-53970-8. 
  • Zhou, Youguang 周有光 (1980). 2. Knowledge Press (知识出版社). 
  • Zhou, Youguang 周有光 (1993). 10. 
  • Zhou, Youguang (2003). The Historical Evolution of Chinese Languages and Scripts. Columbus: National East Asian Languages Resource Center, Ohio State University. ISBN 978-0-87415-349-1. 

External links