GB 18030-2022 English PDF (GB18030-2022)


GB 18030-2022: Information technology - Chinese coded character set

GB
NATIONAL STANDARD OF THE
PEOPLE'S REPUBLIC OF CHINA
ICS 35.040
CCS L 71
GB 18030-2022
Replacing GB 18030-2005
Information technology - Chinese coded character set
ISSUED ON: JULY 19, 2022
IMPLEMENTED ON: AUGUST 01, 2023
Issued by: State Administration for Market Regulation;
Standardization Administration of the People's Republic of China.
Table of Contents
Foreword
1 Scope
2 Normative references
3 Terms and definitions
4 Repertoire
5 Overall structure
6 Sequence of characters
7 Code point allocation
8 Explanation of some characters and codes
9 Implementation level
Annex A (normative) Character table of double-byte
Annex B (normative) Ideographic descriptors
Annex C (normative) Character table of four-byte
Annex D (informative) Explanation of some characters and codes
Annex E (informative) Code positions of Chinese characters in "General Standard Chinese Character List"
Bibliography
Foreword
This document was drafted in accordance with the rules given in GB/T 1.1-2020 "Directives for standardization - Part 1: Rules for the structure and drafting of standardizing documents".
This document replaces GB 18030-2005 "Information technology - Chinese coded character set". Compared with GB 18030-2005, in addition to the structural modifications and editorial changes, the main technical changes in this document are as follows:
a) Add the applicable objects of this document (see Chapter 1 of this Edition);
b) In the double-byte coding area, change the GB/T 13000 code positions corresponding to 10 vertical punctuation marks and 8 Chinese character components. Delete 6 repeatedly coded Chinese character components and 9 repeatedly coded Chinese characters (see Annex D of this Edition, Annex A of Edition 2005);
c) In the four-byte coding area, change 18 GB/T 13000 code positions (see Annex D of this Edition, Annex D of Edition 2005);
d) In the four-byte code range 0x82358F33~0x82359636, add the 66 Chinese characters newly added to the CJK unified Chinese characters (see Annex C of this Edition);
e) In the four-byte code range 0x9835F738~0x98399E36, add the 4149 Chinese characters of CJK unified Chinese character extension C (see Annex C of this Edition);
f) In the four-byte code range 0x98399F38~0x9839B539, add the 222 Chinese characters of CJK unified Chinese character extension D (see Annex C of this Edition);
g) In the four-byte code range 0x9839B632~0x9933FE33, add the 5762 Chinese characters of CJK unified Chinese character extension E (see Annex C of this Edition);
h) In the four-byte code range 0x99348138~0x9939F730, add the 7473 Chinese characters of CJK unified Chinese character extension F (see Annex C of this Edition);
i) In the four-byte code range 0x81398B32~0x8139A035, add the 214 Kangxi radicals (see Annex C of this Edition);
j) In the four-byte code range 0x8134F932~0x81358437, add the 83 Xishuangbanna New Dai characters (see Annex C of this Edition);
Information technology - Chinese coded character set
1 Scope
This document specifies the hexadecimal representation of Chinese graphic characters and their binary codes used in information technology.
This document applies to the processing, exchange, storage, transmission, presentation, input and output of Chinese and other graphic character information.
This document is applicable to technical products with information processing and exchange functions of Chinese and other text and graphic characters, including but not limited to the software products represented by input methods, optical character recognition (OCR), editing and proofreading, machine translation, speech synthesis, text transcription, intelligent writing, etc., as well as the hardware products represented by computers, communication terminal equipment, e-book readers, learning machines, etc.
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.
GB/T 2312-1980, Code of Chinese graphic character set for information interchange - Primary set
GB/T 11383-1989, Information processing - 8-bit code for information interchange - Structure and rules for implementation
GB/T 13000, Information technology - Universal multiple-octet coded character set (UCS)
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
3.1 character
An element in a collection of elements used to organize, control, or represent data.
3.2 coded character
Character (3.1) and its coded representation.
3.3 private use area
An area that can be specified by the user of a product conforming to this document.
3.4 repertoire
A specified set of characters (3.1) represented by a coded character (3.2) set.
3.5 reserved zone
Areas reserved for future specification by this document.
4 Repertoire
4.1 Overview
The characters included in this document are coded in single-byte, double-byte or four-byte.
4.2 Part of single-byte
In this document, the part of single-byte includes all 128 characters from 0x00 to 0x7F of GB/T 11383-1989.
4.3 Part of double-byte
The part of double-byte includes all graphic characters in GB/T 2312-1980, CJK unified Chinese characters and some graphic characters in GB/T 13000. The characters in the part of double-byte are in accordance with the provisions in Annex A. Among them, the graphics, code positions and functions of ideographic descriptors shall comply with the provisions of Annex B.
NOTE: GB/T 13000 uniformly encodes Chinese characters used in China, Japan, South Korea, Vietnam and other countries and regions. Chinese characters with unique abstract glyphs are assigned a separate code position. Chinese characters with different sources but the same abstract glyphs are given a common code position. The encoded Chinese characters are called CJK unified Chinese characters (CJK Unified Ideographs), where CJK means China, Japan, and Korea.
4.4 Part of four-byte
The part of four-byte includes 66 CJK unified Chinese characters (9FA6~9FEF, excluding 9FB4~9FBB) in GB/T 13000 other than the above-mentioned double-byte characters, CJK unified Chinese character extension A, CJK unified Chinese character extension B, CJK unified Chinese character extension C, CJK unified Chinese character extension D, CJK unified Chinese character extension E, CJK unified Chinese character extension F and the characters of ethnic minorities that have been coded in GB/T 13000. The characters in the part of four-byte follow the provisions of Annex C.
5 Overall structure
In the body text, numbers prefixed with 0x are hexadecimal and all other numbers are decimal. In the annexes, all coded representations are hexadecimal and all other numbers are decimal.
The part of single-byte adopts the encoding structure of GB/T 11383-1989 and uses code points 0x00~0x7F.
The part of double-byte uses two octets to represent a character. The first byte is in the range 0x81~0xFE. The tail byte is in the range 0x40~0x7E or 0x80~0xFE.
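The single-byte and double-byte ranges above can be checked mechanically. The following is a minimal Python sketch, not part of the standard. The function names are illustrative assumptions, and the checks are purely structural: they do not say whether a code position is actually assigned a character in Annex A.

```python
# Byte ranges are taken from the paragraphs above; function names are illustrative.

def is_single_byte(b: int) -> bool:
    """True if b lies in the single-byte range 0x00~0x7F."""
    return 0x00 <= b <= 0x7F


def is_double_byte(lead: int, trail: int) -> bool:
    """True if (lead, trail) is structurally a double-byte code."""
    lead_ok = 0x81 <= lead <= 0xFE
    trail_ok = 0x40 <= trail <= 0x7E or 0x80 <= trail <= 0xFE
    return lead_ok and trail_ok


def count_double_byte_positions() -> int:
    """Count every (lead, trail) pair accepted by is_double_byte."""
    return sum(
        is_double_byte(lead, trail)
        for lead in range(0x100)
        for trail in range(0x100)
    )


if __name__ == "__main__":
    # 126 lead bytes x (63 + 127) tail bytes
    print(count_double_byte_positions())  # 23940
```

The tail byte ranges skip 0x7F and everything below 0x40, so a tail byte in 0x30~0x39 can never belong to a double-byte code.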
The part of four-byte adopts 0x30~0x39 not used in GB/T 11383-1989 as the suffix to expand the double-byte code. The encoding range is 0x81308130~0xFE39FE39. The encoding range of the first byte of a four-byte character is 0x81~0xFE. The encoding range of the second byte is 0x30~0x39. The encoding range of the third byte is 0x81~0xFE. The encoding range of the fourth byte is 0x30~0x39. That is:
0x81308130 ~ 0x81308139;
0x81308230 ~ 0x81308239;
...
0x8130FE30 ~ 0x8130FE39;
0x81318130 ~ 0x81318139;
...
0x8131FE30 ~ 0x8131FE39;
...
0x82308130 ~ 0x82308139;
...
0x8230FE30 ~ 0x8230FE39;
...
0xFE308130 ~ 0xFE308139;
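The enumeration above advances the fourth byte fastest, then the third, then the second, then the first. One way to work with that ordering is to map each four-byte code to a position in the sequence. The following Python sketch assumes exactly that ordering; the helper names (linear_index, from_linear_index, positions_between) are hypothetical and do not come from the standard.

```python
# Sketch only: four-byte codes treated as a mixed-radix number with
# radices 126, 10, 126, 10, following the byte ranges given above.

def is_four_byte(seq: bytes) -> bool:
    """True if seq is structurally a four-byte code."""
    return (
        len(seq) == 4
        and 0x81 <= seq[0] <= 0xFE
        and 0x30 <= seq[1] <= 0x39
        and 0x81 <= seq[2] <= 0xFE
        and 0x30 <= seq[3] <= 0x39
    )


def linear_index(seq: bytes) -> int:
    """Position of a four-byte code in the enumeration (0x81308130 is 0)."""
    if not is_four_byte(seq):
        raise ValueError(f"not a four-byte code: {seq.hex()}")
    b1, b2, b3, b4 = seq
    return (((b1 - 0x81) * 10 + (b2 - 0x30)) * 126 + (b3 - 0x81)) * 10 + (b4 - 0x30)


def from_linear_index(index: int) -> bytes:
    """Inverse of linear_index."""
    if not 0 <= index < 126 * 10 * 126 * 10:
        raise ValueError(f"index out of range: {index}")
    index, b4 = divmod(index, 10)
    index, b3 = divmod(index, 126)
    b1, b2 = divmod(index, 10)
    return bytes([b1 + 0x81, b2 + 0x30, b3 + 0x81, b4 + 0x30])


def positions_between(first_hex: str, last_hex: str) -> int:
    """Number of code positions in an inclusive range given as hex strings."""
    first = linear_index(bytes.fromhex(first_hex))
    last = linear_index(bytes.fromhex(last_hex))
    return last - first + 1


if __name__ == "__main__":
    print(from_linear_index(10).hex())                # 81308230
    print(positions_between("81308130", "FE39FE39"))  # 1587600
    print(positions_between("81398B32", "8139A035"))  # 214
    print(positions_between("82358F33", "82359636"))  # 74
```

Under this assumption the range 0x81398B32~0x8139A035 spans 214 positions, which matches the 214 Kangxi radicals listed in the Foreword. The range 0x82358F33~0x82359636 spans 74 positions, which is consistent with 9FA6~9FEF (74 code positions) less the 8 excluded positions 9FB4~9FBB, giving the 66 characters stated in 4.4.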
This document specifies three implementation levels. System software products that meet the corresponding implementation level shall provide input and output functions for all characters within the corresponding implementation level.
9.2 Implementation level 1
Implementation level 1 supports the single-byte coded part and the double-byte coded part of this document and, within the four-byte coded part, the CJK unified Chinese characters (i.e., 0x82358F33~0x82359636) and CJK unified Chinese character extension A (i.e., 0x8139EE39~0x82358738).
Any product to which this document applies shall meet the requirements for implementation level 1.
NOTE: Depending on the needs of the software application, an implementation at level 1 may also support any one or more of the non-Chinese characters listed in Table 3.
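As an illustration of the level 1 coverage described above, the following Python sketch tests whether a byte sequence falls structurally inside that coverage. It is an assumption-laden sketch, not a conformance test: the name level1_covers is hypothetical, the check does not consult Annex A or Annex C to see whether a position is actually assigned, and the optional Table 3 characters mentioned in the NOTE are not modelled.

```python
# Four-byte ranges quoted from the level 1 description (inclusive).
LEVEL1_FOUR_BYTE_RANGES = [
    (0x8139EE39, 0x82358738),  # CJK unified Chinese character extension A
    (0x82358F33, 0x82359636),  # CJK unified Chinese characters
]


def level1_covers(seq: bytes) -> bool:
    """Structural check of one coded character against level 1."""
    if len(seq) == 1:
        return seq[0] <= 0x7F
    if len(seq) == 2:
        lead, trail = seq
        return 0x81 <= lead <= 0xFE and (
            0x40 <= trail <= 0x7E or 0x80 <= trail <= 0xFE
        )
    if len(seq) == 4:
        well_formed = (
            0x81 <= seq[0] <= 0xFE
            and 0x30 <= seq[1] <= 0x39
            and 0x81 <= seq[2] <= 0xFE
            and 0x30 <= seq[3] <= 0x39
        )
        if not well_formed:
            return False
        # For well-formed codes, big-endian integer order equals code order.
        value = int.from_bytes(seq, "big")
        return any(lo <= value <= hi for lo, hi in LEVEL1_FOUR_BYTE_RANGES)
    return False


if __name__ == "__main__":
    print(level1_covers(bytes.fromhex("8139EE39")))  # True: start of extension A range
    print(level1_covers(bytes.fromhex("98399F38")))  # False: outside both level 1 ranges
```

Comparing the codes as big-endian integers is safe here only because the sequence has already been checked for well-formedness; an arbitrary 32-bit value inside a range is not necessarily a valid four-byte code.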
9.3 Implementation level 2
Implementation level 2 includes implementation level 1. In addition, implementation level 2 supports the coded Chinese characters of the "General Standard Chinese Character List" that are not included in implementation level 1. See Annex E for the code positions and glyphs in this document of the Chinese characters included in the "General Standard Chinese Character List".
The system software and supporting software shall meet the requirements for implementation level 2.
NOTE: System software and supporting software include but are not limited to operating systems, database management systems, and middleware (see GB/T 36475 for information on software product classification).
9.4 Implementation level 3
Implementation level 3 contains implementation level 2. In addition, implementation level 3 also supports all Chinese characters specified in this document and Kangxi radicals in Table 3.
Products used for government services and public services shall meet the requirements of level 3.
NOTE: Government services and public service industries include but are not limited to railway transportation, road transportation, water transportation, air transportation, multimodal transportation and transportation agency, postal services, monetary and financial services, insurance, land management, health, national institutions, social security, etc. (see GB/T 4754 for industry classification information).
