GB/T 36344-2018: Information technology - Evaluation indicators for data quality
ICS 35.240.01
L70
National Standard of the People's Republic of China
Published on 2018-06-07
Implemented on 2019-01-01
Issued by the State Administration for Market Regulation and the Standardization Administration of China
Contents
Foreword
1 Scope
2 Terms and definitions
3 Indicator framework
4 Overview
5 Indicator description
5.1 Description of the header information in the evaluation form
5.2 Normative
5.3 Integrity
5.4 Accuracy
5.5 Consistency
5.6 Timeliness
5.7 Accessibility
Appendix A (informative) Data quality evaluation process
References
Foreword
This standard was drafted in accordance with the rules given in GB/T 1.1-2009.
Please note that some of the contents of this document may involve patents. The issuing organization of this document is not responsible for identifying these patents.
This standard was proposed by, and is under the jurisdiction of, the National Information Technology Standardization Technical Committee (SAC/TC28).
Drafting organizations of this standard: China Electronics Technology Standardization Institute, Yujifang (Beijing) Technology Consulting Co., Ltd., Shanghai Information Investment Co., Ltd., Computer Network Information Center of Chinese Academy of Sciences, Shenzhen Huaao Data Technology Co., Ltd., Guiyang Institute of Information Technology (Chinese Academy of Sciences Software Institute Guiyang Branch), State Grid Zhejiang Electric Power Co., Ltd.
Main drafters of this standard: Wei Fenglin, Bin Junzhi, Gan Xiangyu, Hu Lianglin, Yu Wenyuan, Li Junmao, Chen Feng, Yang Da, Wang Jing, Dong Jian, Zhang Qun, Zhang Zhanxin, Zhao Jinghua, Li Bing, Li Yiang, Qin Junning, Chen Liyue.
Information technology - Evaluation indicators for data quality
1 Scope
This standard specifies the framework and description of data quality assessment indicators.
This standard applies to data quality evaluation at all stages of the data life cycle.
2 Terms and definitions
The following terms and definitions apply to this document.
2.1
Data
A reinterpretable, formalized representation of information that is suitable for communication, interpretation, or processing.
Note: Data can be processed manually or automatically.
[GB/T 5271.1-2000, definition 01.01.02]
2.2
Metadata
Data about data or data elements (possibly including their data descriptions), as well as data about data ownership, access paths, access rights, and data volatility.
[GB/T 5271.17-2010, definition 17.06.05]
2.3
Data quality
The degree to which the characteristics of data satisfy stated and implied requirements when the data is used under specified conditions.
2.4
Raw data
Various data stored by end users that has not been processed or simplified.
Note: Raw data exists in multiple forms, such as text data, image data, audio data, or a mixture of several types of data.
2.5
Data life cycle
A set of processes that transform raw data into knowledge that can be used for action.
2.6
Data set
An identifiable collection of data on a certain theme that can be processed by computer.
2.7
Data model
A graphical and textual representation of analysis that identifies the data an organization needs in order to achieve its mission, functions, goals, objectives, and strategies, and to manage and evaluate the organization.
Note 1: When data is represented at different levels of abstraction, from high to low, a conceptual model (a model consisting of concepts relevant to the subject at hand), a logical model, and a physical model are usually distinguished.
Note 2: The formal description of the boundary of the context in which a data model is used is called the context schema.
Note 3: A data model identifies entities, domains (attributes), and relationships (associations with other data), providing a conceptual view of the data and of the relationships among data.
Example 1: A semantic data model consisting of boxes that represent a set of things meaningful to the business, such as "people" or "actions", and lines that describe the relationships between those entities.
Example 2: A relational table of an application-specific data management technology, or an Extensible Markup Language (XML) structure or the like, is a logical data model.
2.8
Data standard
Rules and benchmarks for naming, definition, structure, and value specifications of data.
3 Indicator framework
The data quality evaluation indicator framework is shown in Figure 1.
Description:
Normative - The degree to which data conforms to data standards, data models, business rules, metadata, or authoritative reference data.
Integrity - The degree to which data elements are assigned values as required by data rules.
Accuracy - The degree to which data accurately represents the true value of the real entity (actual object) it describes.
Consistency - The degree to which data does not contradict data used in other specific contexts.
Timeliness - The degree to which data is correct with respect to time.
Accessibility - The degree to which data can be accessed.
Figure 1 Data Quality Evaluation Indicator Framework
4 Overview
The six categories of evaluation indicators specified in Chapter 5 are the minimum set for implementing data quality evaluation. See Appendix A for the data quality evaluation process.
5 Indicator description
5.1 Description of the header information in the evaluation form
The headers in the evaluation form are described below.
a) Indicator number and coding rule. The indicator number is the unique number of an evaluation indicator. It consists of 4 digits: the level 1 indicator code followed by the level 2 (secondary) indicator code. The coding rule is shown in Figure 2.
×× ××
First ××: level 1 indicator, 2 digits
Second ××: level 2 indicator, 2 digits
Figure 2 Coding rule
1) Level 1 indicator. Consists of 2 digits: 01 stands for normative, 02 for integrity, 03 for accuracy, 04 for consistency, 05 for timeliness, and 06 for accessibility;
2) Level 2 indicator. A sequential code of 2 digits, ranging from 01 to 99.
b) Indicator name. The name of the evaluation indicator.
c) Indicator description. An explanation of the evaluation indicator.
d) Calculation method. The method used to calculate the evaluation indicator.
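As an illustration of the coding rule above, the following minimal Python sketch splits a 4-digit indicator number into its level 1 category and its level 2 sequence number. The function name `parse_indicator_number` and the lookup table `LEVEL1_NAMES` are hypothetical names chosen for this example; they are not defined by the standard.

```python
# Hypothetical helper: names are illustrative and not part of GB/T 36344-2018.
LEVEL1_NAMES = {
    "01": "normative",
    "02": "integrity",
    "03": "accuracy",
    "04": "consistency",
    "05": "timeliness",
    "06": "accessibility",
}


def parse_indicator_number(number: str) -> tuple[str, int]:
    """Split a 4-digit indicator number into (level 1 name, level 2 sequence)."""
    if len(number) != 4 or not (number.isascii() and number.isdigit()):
        raise ValueError(f"indicator number must be 4 digits: {number!r}")
    level1, level2 = number[:2], int(number[2:])
    if level1 not in LEVEL1_NAMES:
        raise ValueError(f"unknown level 1 code: {level1}")
    if not 1 <= level2 <= 99:
        raise ValueError(f"level 2 code must be in 01-99: {number[2:]}")
    return LEVEL1_NAMES[level1], level2


print(parse_indicator_number("0303"))  # ('accuracy', 3)
```

Rejecting unknown level 1 codes and the level 2 code 00 keeps the helper aligned with the stated ranges (01 to 06 and 01 to 99).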
5.2 Normative
The normative evaluation indicators are defined in Table 1.
Table 1 Normative evaluation indicators

Indicator number: 0101
Indicator name: Data standard
Indicator description: A measure of the degree to which data conforms to data standards.
Note 1: When evaluating data quality, it is necessary to collect the standards that the data must follow in naming, creation, definition, updating, and archiving, including international standards, national standards, industry standards, local standards, or related regulations.
Note 2: Data archiving is particularly important; a complete data specification generally includes more detailed, enforceable provisions for destroying old data.
Calculation method: X = A/B
where
A = number of elements in the data set that meet the requirements of the data standard;
B = number of elements in the data set being evaluated

Indicator number: 0102
Indicator name: Data model
Indicator description: A measure of the degree to which data conforms to the data model.
Note 1: A data model is a specification that describes an organization's data structure and data representation in a visual way.
Note 2: When evaluating data quality, it is necessary to check whether there are clear, understandable data model definitions and how the data is organized.
Calculation method: X = A/B
where
A = number of elements in the data set that meet the data model requirements;
B = number of elements in the data set being evaluated

Indicator number: 0103
Indicator name: Metadata
Indicator description: A measure of the degree to which data conforms to the metadata definition.
Note: Metadata labels, describes, or characterizes other data so that information is easier to retrieve or use. When evaluating data quality, it is necessary to check whether interpretable metadata documentation is available.
Example: A data dictionary containing field names, descriptions, types, value domains, and so on is a metadata document.
Calculation method: X = A/B
where
A = number of elements in the data set that meet the metadata definition;
B = number of elements in the data set being evaluated

Indicator number: 0104
Indicator name: Business rules
Indicator description: A measure of the degree to which data conforms to business rules.
Note 1: A business rule is an authoritative principle or guideline that describes business interactions and establishes rules for actions and for the results and integrity of data behavior.
Note 2: When evaluating data quality, it is necessary to check whether well-documented business rules exist.
Calculation method: X = A/B
where
A = number of elements in the data set that satisfy the business rules;
B = number of elements in the data set being evaluated

Indicator number: 0105
Indicator name: Authoritative reference data (authoritative reference source)
Indicator description: Reference data is a collection or classification of values referred to by systems, applications, databases, processes, and reports, as well as by transaction records and master records.
Note: A list of reference data needs to be collected when evaluating data quality.
Example: A list of valid values for a particular field is one type of reference data.
Calculation method: X = A/B
where
A = number of elements in the data set that satisfy the reference data rules;
B = number of elements in the data set being evaluated

Indicator number: 0106
Indicator name: Security specification
Indicator description: Security specifications are rules on security and privacy, including data permission management, data desensitization, and so on.
Calculation method: X = A/B
where
A = number of elements in the data set that meet the security specification;
B = number of elements in the data set being evaluated
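Every indicator in Table 1 uses the same ratio X = A/B. The sketch below shows one way that ratio might be computed in Python, using a reference-data check as the example. The function `conformance_ratio`, the predicate argument, and the sample list `VALID_COUNTRY_CODES` are assumptions made for illustration only; the standard does not prescribe an implementation.

```python
from typing import Callable, Iterable


def conformance_ratio(elements: Iterable, conforms: Callable[[object], bool]) -> float:
    """Return X = A/B, where A counts conforming elements and B counts all elements."""
    items = list(elements)
    if not items:
        raise ValueError("data set is empty; B would be 0")
    a = sum(1 for item in items if conforms(item))
    return a / len(items)


# Hypothetical reference list of valid values for one field.
VALID_COUNTRY_CODES = {"CN", "US", "DE"}
values = ["CN", "US", "XX", "DE"]

x = conformance_ratio(values, lambda v: v in VALID_COUNTRY_CODES)
print(x)  # 0.75
```

Passing the rule in as a predicate lets the same function serve the data standard, data model, metadata, business rule, and security checks, since only the definition of "conforming" changes between them.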
5.3 Integrity
The integrity evaluation indicators are defined in Table 2.
Table 2 Integrity evaluation indicators

Indicator number: 0201
Indicator name: Data element integrity
Indicator description: The degree to which the data elements that should be assigned values in the data set, as required by business rules, are actually assigned.
Calculation method: X = A/B
where
A = the number of elements in the data set that are assigned;
B = the number of elements in the data set that are expected to be assigned

Indicator number: 0202
Indicator name: Data record integrity
Indicator description: The degree to which the data records that should be assigned values in the data set, as required by business rules, are actually assigned.
Calculation method: X = A/B
where
A = the number of elements in the data set that are assigned;
B = the number of elements in the data set that are expected to be assigned
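The two integrity ratios in Table 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: records are dictionaries, a value counts as "assigned" when it is neither `None` nor an empty string, and the record-level ratio counts a record only when all of its required fields are assigned. The field list `REQUIRED_FIELDS` and all function names are hypothetical.

```python
# Hypothetical list of fields that business rules require to be assigned.
REQUIRED_FIELDS = ("id", "name", "email")


def is_assigned(value) -> bool:
    """Assumption: None and the empty string both count as unassigned."""
    return value is not None and value != ""


def element_integrity(records, required=REQUIRED_FIELDS) -> float:
    """X = assigned required elements / required elements expected to be assigned."""
    expected = len(records) * len(required)
    if expected == 0:
        raise ValueError("nothing is expected to be assigned; B would be 0")
    assigned = sum(is_assigned(r.get(f)) for r in records for f in required)
    return assigned / expected


def record_integrity(records, required=REQUIRED_FIELDS) -> float:
    """X = records with every required field assigned / all records."""
    if not records:
        raise ValueError("data set is empty; B would be 0")
    complete = sum(all(is_assigned(r.get(f)) for f in required) for r in records)
    return complete / len(records)


records = [
    {"id": 1, "name": "Li", "email": "li@example.com"},
    {"id": 2, "name": "", "email": "wang@example.com"},  # name left empty
    {"id": 3, "name": "Chen"},                           # email missing
]
print(element_integrity(records))  # 7 of 9 required elements are assigned
print(record_integrity(records))   # 1 of 3 records is complete
```

The example shows why the two ratios differ: a single missing field lowers the element-level score slightly but disqualifies the whole record at the record level.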
5.4 Accuracy
The accuracy evaluation indicators are defined in Table 3.
Table 3 Accuracy evaluation indicators

Indicator number: 0301
Indicator name: Data content correctness
Indicator description: Whether the data content is the expected data.
Calculation method: X = A/B
where
A = the number of elements in the data set that meet the data correctness requirements;
B = number of elements in the data set being evaluated

Indicator number: 0302
Indicator name: Data format compliance
Indicator description: Whether the data format (including data type, value range, data length, precision, etc.) meets the expected requirements.
Example: A gender column cannot contain values other than male/female; a certificate number cannot contain punctuation marks; restrictions such as character encoding need to be enforced by specifying the format of the content.
Calculation method: X = A/B
where
A = the number of elements in the data set that meet the format requirements;
B = number of elements in the data set being evaluated

Indicator number: 0303
Indicator name: Data repetition rate
Indicator description: A measure of unexpected repetition of a particular field, record, file, or data set.
Calculation method: X = A/B
where
A = the number of repeated elements in the data set;
B = number of elements in the data set being evaluated

Indicator number: 0304
Indicator name: Data uniqueness
Indicator description: A measure of the uniqueness of a particular field, record, file, or data set.
Calculation method: X = A/B
where
A = the number of elements in the data set that satisfy the uniqueness requirement;
B = number of elements in the data set being evaluated

Indicator number: 0305
Indicator name: Dirty data occurrence rate
Indicator description: A measure of invalid data outside the correct fields, records, files, or data sets.
Example: Dirty data may result when a transaction is rolled back and the rollback mechanism is unsound or imperfect.
Calculation method: X = A/B
where
A = the number of elements in the data set where dirty data appears;
B = number of elements in the data set being evaluated
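Three of the accuracy ratios in Table 3 (format, repetition rate 0303, and uniqueness 0304) can be sketched in Python as shown below. The table does not say how repeated elements are counted, so this sketch assumes that every occurrence after the first counts as a repeat and that an element is unique when its value occurs exactly once. The pattern `CERT_NUMBER` (letters and digits only, echoing the "no punctuation in a certificate number" example) and the function names are hypothetical.

```python
import re
from collections import Counter

# Assumed format rule: a certificate number contains only ASCII letters and digits.
CERT_NUMBER = re.compile(r"[0-9A-Za-z]+")


def _require_non_empty(values) -> None:
    if not values:
        raise ValueError("data set is empty; B would be 0")


def format_compliance(values, pattern=CERT_NUMBER) -> float:
    """X = values that fully match the format / all values."""
    _require_non_empty(values)
    return sum(1 for v in values if pattern.fullmatch(v)) / len(values)


def repetition_rate(values) -> float:
    """X = occurrences beyond the first of each value / all values (assumed reading)."""
    _require_non_empty(values)
    repeated = sum(n - 1 for n in Counter(values).values())
    return repeated / len(values)


def uniqueness(values) -> float:
    """X = values that occur exactly once / all values (assumed reading)."""
    _require_non_empty(values)
    counts = Counter(values)
    return sum(1 for v in values if counts[v] == 1) / len(values)


cert_numbers = ["A123", "B-456", "A123", "C789"]
print(format_compliance(cert_numbers))  # 0.75
print(repetition_rate(cert_numbers))    # 0.25
print(uniqueness(cert_numbers))         # 0.5
```

Note that repetition rate and uniqueness do not sum to 1 under these assumptions, because the first occurrence of a repeated value is counted in neither numerator.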
5.5 Consistency
The consistency evaluation indicators are defined in Table 4.
Table 4 Consistency evaluation indicators

Indicator number: 0401
Indicator name: Same-data consistency
Indicator description: When the same data is stored in different locations or used by different applications or users, the data is consistent; when the data changes, the copies of the same data stored in different locations are modified synchronously.
Calculation method: X = A/B
where
A = the number of elements in the data set that meet the consistency requirements;
B = number of elements in the data set being evaluated

Indicator number: 0402
Indicator name: Associated-data consistency
Indicator description: The consistency of associated data, checked according to the consistency constraint rules.
Calculation method: X = A/B
where
A = the number of elements in the data set that meet the consistency requirements;
B = number of elements in the data set being evaluated
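For the case in Table 4 where the same data is stored in different locations, the ratio X = A/B might be computed as in the following Python sketch. It assumes two stores keyed by a shared identifier and treats a key that is missing from the second store as inconsistent; the function name and the sample stores `crm` and `billing` are hypothetical.

```python
def same_data_consistency(primary: dict, replica: dict) -> float:
    """X = keys whose value is identical in both stores / keys in the primary store."""
    if not primary:
        raise ValueError("data set is empty; B would be 0")
    consistent = sum(
        1 for key, value in primary.items()
        if key in replica and replica[key] == value
    )
    return consistent / len(primary)


# Hypothetical copies of the same e-mail field held by two applications.
crm = {
    "u1": "li@example.com",
    "u2": "wang@example.com",
    "u3": "chen@example.com",
    "u4": "zhao@example.com",
}
billing = {
    "u1": "li@example.com",
    "u2": "wang@old.example.com",  # changed in one store only
    "u3": "chen@example.com",
    # u4 is missing from this store
}

print(same_data_consistency(crm, billing))  # 0.5
```

Taking the primary store as the denominator is a design choice for this sketch; an evaluation that treats both stores symmetrically would use the union of keys instead.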
5.6 Timeliness
The timeliness evaluation indicators are defined in Table 5.
Table 5 Timeliness evaluation indicators

Indicator number: 0501
Indicator name: Time-period-based correctness
Indicator description: The degree to which the number of records, or the frequency distribution, over a date range conforms to business requirements.
Calculation method: X = A/B
where
A = the number of elements in the data set that meet the validity requirements;
B = number of elements in the data set being...