
GB/T 25724-2017 English PDF (GBT25724-2017)

GB/T 25724-2017: Technical specifications for surveillance video and audio coding

This Standard specifies the decoding process of digital video and audio compression coding for public security video surveillance. This Standard applies to the audio and video real-time compression, transmission, playback and storage services of the field of public security; other fields that need audio and video coding may also refer to this Standard.
GB/T 25724-2017
GB
NATIONAL STANDARD OF THE
PEOPLE'S REPUBLIC OF CHINA
ICS 13.310
A 91
Replacing GB/T 25724-2010
Technical specifications for
surveillance video and audio coding
ISSUED ON: MARCH 9, 2017
IMPLEMENTED ON: JUNE 1, 2017
Issued by: General Administration of Quality Supervision, Inspection and Quarantine of the People's Republic of China;
Standardization Administration of the People's Republic of China.
Table of Contents
Foreword
Introduction
1 Scope
2 Normative reference
3 Terms, definitions and abbreviations
3.1 Terms and definitions
3.2 Abbreviations
4 Agreement
4.1 Arithmetic operators
4.2 Logical operators
4.3 Relational operators
4.4 Bit operators
4.5 Assignment operators
4.6 Mathematical functions
4.7 Syntax elements, variables and tables
4.8 Text description of logical operators
4.9 Process
5 Video section
5.1 Coded bitstream and output data format
5.2 Syntaxes and semantics
5.3 Decoding process
5.4 Parsing process
6 Audio part
6.1 General description
6.2 Encoder function description
6.2.1 Pre-processing
6.3 Decoder function description
6.4 Bit allocation description
6.5 Storage, transmission interface format
Appendix A (Normative) Hypothetical reference decoder (HRD)
Appendix B (Normative) Byte stream format
Appendix C (Normative) Video profile and level
Appendix D (Normative) Video usability information (VUI)
Appendix E (Normative) Supplemental enhancement information (SEI)
Appendix F (Normative) Intelligent analysis data description
Appendix G (Normative) Audio profile and level
Appendix H (Normative) Exception sound event type definition
Appendix I (Informative) VAD detection
Appendix J (Informative) Noise elimination
References
Technical specifications for
surveillance video and audio coding
1 Scope
This Standard specifies the decoding process of digital video and audio compression coding for public security video surveillance.
This Standard applies to the audio and video real-time compression, transmission, playback and storage services of the field of public security; other fields that need audio and video coding may also refer to this Standard.
2 Normative reference
The following document is indispensable for the application of this document. For dated references, only the dated edition applies to this document. For undated references, the latest edition (including all modifications) applies to this document.
RFC 3548 The Base 16, Base 32, and Base 64 Data Encodings
3 Terms, definitions and abbreviations
3.1 Terms and definitions
For the purpose of this document, the following terms and definitions apply.
3.1.1
NAL unit
A syntax structure that contains an indication of the data type and the number of bytes contained in the subsequent data. The data appears in RBSP form and, where necessary, is interspersed with emulation prevention bytes.
3.1.2
NAL unit stream
A sequence of NAL units.
3.1.3
4.9 Process
The process is used to describe the decoding of the syntax elements. All the syntax elements and uppercase variables that belong to the current syntax structure, as well as the associated syntax structures, are available in both the specification and the call of the process. The specification of the process may also contain lowercase variables that are explicitly specified as input. Each specification explicitly specifies the output. The output can be uppercase variables or lowercase variables.
In the specification of the process, a particular macroblock can be represented by a variable name whose value is equal to its macroblock index.
5 Video section
5.1 Coded bitstream and output data format
5.1.1 Bitstream format
This clause specifies the relationship between the NAL unit stream and the byte stream, both of which are referred to as bitstreams.
The NAL unit stream format consists of a series of syntax structures called NAL units, arranged by decoding order. The decoding order and contents of the NAL units in the NAL unit stream are constrained.
The byte stream can be constructed from the NAL unit stream by arranging the NAL units in decoding order and adding a start code prefix and a number of zero bytes to each NAL unit. The NAL unit stream format can be extracted from the byte stream format by searching for the unique start code prefix in the byte stream. Except for the byte stream format, other methods of constructing the NAL unit are not specified in this Standard. The byte stream format is specified in Annex B.
5.1.2 Picture format
This clause specifies the relationship between the source determined by the bitstream and the decoded frame.
The video stream represented by the bitstream is a series of frames arranged in decoding order.
Each source or decoded frame is composed of one or more video sample point arrays:
- array of only luma (Y) (monochrome);
- array of luma and two chroma (YCbCr);
The following functions are used for the syntax description. These functions assume that there is a bitstream pointer in the decoder that points to the next bit position in the bitstream where the decoding process is to be read. Specific requirements are as follows:
Specification for byte_aligned ():
- If the current position of the bitstream is at the boundary of the byte, that is, the next bit in the bitstream is the first bit of the byte, then the return value of byte_aligned () is TRUE;
- otherwise, the return value of byte_aligned () is FALSE.
Specification for get_left_ae_bits ():
- The value of the counter count in the entropy decoder is increased by 8 and then taken modulo 8. If the result is equal to 0, parsing continues with the fixed probability of 128;
- if the result is not equal to 0, parsing continues with the fixed probability of 128 to obtain the value after the modulo operation plus 8 bits.
Specification for more_data_in_byte_stream (), which is used in the byte stream NAL unit syntax specified in Annex B:
- If there is more data in the byte stream, the return value of more_data_in_byte_stream () is TRUE;
- otherwise, the return value of more_data_in_byte_stream () is FALSE.
Specification for more_rbsp_data ():
- If there is more data in RBSP before rbsp_trailing_bits (), the return value of more_rbsp_data () is TRUE;
- otherwise, the return value of more_rbsp_data () is FALSE.
The method of determining whether there is more data in RBSP is not specified in this Standard.
next_bits (n) provides the next n bits in the bitstream without changing the bitstream pointer. This function makes the next n bits in the bitstream visible. When it is used in the byte stream specified in Annex B, the return value of next_bits (n) is 0 if the remaining byte stream has less than n bits.
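The bitstream-pointer functions of this clause can be illustrated with a small reader. This is only a sketch under stated assumptions: the class name BitReader, the MSB-first bit order within each byte, and raising EOFError when read_bits runs past the end are illustrative choices, not taken from this Standard.

```python
class BitReader:
    """Illustrative MSB-first bit reader over a bytes object (not normative)."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # bitstream pointer, counted in bits from the start

    def bits_left(self) -> int:
        return len(self.data) * 8 - self.pos

    def byte_aligned(self) -> bool:
        # TRUE when the next bit is the first bit of a byte.
        return self.pos % 8 == 0

    def next_bits(self, n: int) -> int:
        # Peek at the next n bits without moving the pointer.
        # Returns 0 when fewer than n bits remain (Annex B usage in the text).
        if n == 0 or self.bits_left() < n:
            return 0
        value = 0
        for i in range(self.pos, self.pos + n):
            bit = (self.data[i // 8] >> (7 - i % 8)) & 1  # MSB first: assumption
            value = (value << 1) | bit
        return value

    def read_bits(self, n: int) -> int:
        # Return the next n bits and advance the pointer by n bits.
        # Raising at end of data is an assumption of this sketch.
        if self.bits_left() < n:
            raise EOFError("not enough bits left in the bitstream")
        value = self.next_bits(n)
        self.pos += n
        return value
```

In this sketch, next_bits followed by read_bits with the same n returns the same value twice, and only the second call moves the pointer.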
read_bits (n) reads the next n bits from the bitstream and moves the bitstream pointer forward by n bits. When n is equal to 0, the return value of read_bits (n) is 0.
... may also contain some emulation_prevention_three_byte. NumBytesInNALunit is required for decoding the NAL unit. To derive NumBytesInNALunit, the boundaries of the NAL unit need to be determined. Annex B specifies a method of delimiting NAL units for the byte stream format. Other delimiting methods may be specified outside this Standard.
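As a rough illustration of delimiting NAL units in a byte stream, the sketch below splits a buffer at start code prefixes and reports the length of each unit, which plays the role of NumBytesInNALunit. Annex B is not reproduced in this excerpt, so the prefix value 0x000001, the treatment of zero bytes before a prefix as padding, and the function names are all assumptions borrowed from common practice in other video coding standards. Emulation prevention bytes are not handled here.

```python
from typing import List

# Assumed start code prefix; the actual value is defined in Annex B.
START_CODE_PREFIX = b"\x00\x00\x01"


def build_byte_stream(nal_units: List[bytes]) -> bytes:
    """Prepend one zero byte and a start code prefix to each NAL unit."""
    return b"".join(b"\x00" + START_CODE_PREFIX + nal for nal in nal_units)


def split_byte_stream(stream: bytes) -> List[bytes]:
    """Recover the NAL units by searching for start code prefixes."""
    nal_units = []
    pos = stream.find(START_CODE_PREFIX)
    while pos != -1:
        start = pos + len(START_CODE_PREFIX)
        nxt = stream.find(START_CODE_PREFIX, start)
        end = len(stream) if nxt == -1 else nxt
        # Zero bytes in front of the next prefix are treated as padding
        # (assumption: a NAL unit never ends with a zero byte).
        nal = stream[start:end].rstrip(b"\x00")
        if nal:
            nal_units.append(nal)
        pos = nxt
    return nal_units


units = [b"\x21\xaa", b"\x42\xbb\xcc"]
recovered = split_byte_stream(build_byte_stream(units))
num_bytes_in_nal_unit = [len(nal) for nal in recovered]
```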
forbidden_zero_bit indicates the version of the SVAC standard that the video stream supports. forbidden_zero_bit shall be equal to 1.
forbidden_zero_bit equal to 0 indicates that the video stream supports the GB/T 25724-2010 standard.
When nal_ref_idc is not equal to 0, the NAL unit contains a sequence parameter set, a picture parameter set, a security parameter set, or tiles of a reference picture. When the nal_ref_idc of a tile NAL unit of a coded picture is equal to 0, the nal_ref_idc of all the tile NAL units of the coded picture shall be equal to 0.
nal_unit_type indicates the type of RBSP data structure in the NAL unit, see Table 30. The VCL NAL unit refers to those NAL units with the value of nal_unit_type equal to 1, 2, 3 or 4. All other NAL units are called non-VCL NAL units.
NOTE 1: The VCL specification is for effectively representing the contents of the video data. The NAL specification is for formatting the data and providing header information for storage or transmission over a variety of communication channels. Each NAL unit contains an integer number of bytes. The NAL unit specifies a general format that applies to both packet-oriented and bitstream systems.
Without affecting the decoding process of NAL units with nal_unit_type not equal to 5 and without affecting the consistency of this Standard, NAL units with nal_unit_type equal to 5 can be discarded by the decoder.
NOTE 2: This Standard does not specify the decoding process of NAL units whose nal_unit_type value is reserved. The decoder can ignore (remove from the bitstream and discard) all contents of NAL units whose nal_unit_type value is reserved.
When the value of nal_unit_type of a tile NAL unit is equal to 2, the value of nal_unit_type of all other tile NAL units encoding the same picture shall be the same, and the value of nal_unit_type of all the tile NAL units of the corresponding SVC enhancement layer coded picture shall be equal to 4. Such a picture is called an IDR picture. The NAL unit type is as shown in Table 30.
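The nal_unit_type rules above can be summarised in a few helper functions. This is a minimal sketch that uses only the values quoted in the text (1 to 4 for VCL, 5 as discardable, 2 and 4 for IDR pictures). Table 30 is not reproduced here, so reserved values are not modelled, and the function names are hypothetical.

```python
from typing import Iterable, List

VCL_NAL_UNIT_TYPES = {1, 2, 3, 4}   # values quoted in the text
DISCARDABLE_NAL_UNIT_TYPE = 5       # may be dropped by the decoder
IDR_BASE_TYPE = 2                   # tile NAL units of an IDR picture
IDR_ENHANCEMENT_TYPE = 4            # corresponding SVC enhancement layer tiles


def is_vcl(nal_unit_type: int) -> bool:
    """All NAL units outside types 1..4 are non-VCL NAL units."""
    return nal_unit_type in VCL_NAL_UNIT_TYPES


def drop_discardable(nal_unit_types: Iterable[int]) -> List[int]:
    """Remove type 5 NAL units; the others are unaffected by their removal."""
    return [t for t in nal_unit_types if t != DISCARDABLE_NAL_UNIT_TYPE]


def idr_types_consistent(base_tile_types: Iterable[int],
                         enhancement_tile_types: Iterable[int] = ()) -> bool:
    """Check the IDR constraint for the tile NAL units of one picture."""
    base = list(base_tile_types)
    if IDR_BASE_TYPE not in base:
        return True  # the constraint only applies once a type 2 tile appears
    return (all(t == IDR_BASE_TYPE for t in base)
            and all(t == IDR_ENHANCEMENT_TYPE for t in enhancement_tile_types))
```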
Parameters included in a sequence parameter set RBSP can be used by one or more pictures or SEI NAL units containing buffer cycle SEI messages. Each sequence parameter set RBSP takes effect as soon as it is received by the decoder, and the previously valid sequence parameter set RBSP (if any) becomes invalid. Not more than one sequence parameter set RBSP is valid at any given time in the decoding process.
When a sequence parameter set RBSP is used by the SEI NAL unit containing a buffer cycle SEI message, the SEI NAL unit shall be located after the sequence parameter set RBSP.
Parameters included in the picture parameter set RBSP can be used by the coding tile NAL unit of a coded picture. Each picture parameter set RBSP takes effect as soon as it is received by the decoder, and the previously valid sequence parameter set RBSP (if any) becomes invalid. For a layer picture of SVC, not more than one picture parameter set RBSP is valid at any given time in the decoding process. The relationships specified between syntax element values in the sequence parameter set and the picture parameter set apply only to the valid sequence parameter set and the valid picture parameter set. During the decoding process, the parameter values of the valid picture parameter set and the valid sequence parameter set shall remain valid.
5.2.4.3.2.2 Taking effect of security parameter set RBSP
Parameters included in the security parameter set RBSP can be used by one or more other types of NAL units. At the beginning of the decoding process, each security parameter set RBSP takes effect as soon as it is received by the decoder, and the previously valid sequence parameter set RBSP (if any) becomes invalid. When ldp_mode_flag of the sequence parameter set is equal to 1, the security parameter set shall only appear before the IDR picture. Not more than one security parameter set RBSP is valid at any given time in the decoding process.
NOTE: In some applications, the security parameter set can also be passed to the decoder via other reliable mechanisms.
5.2.4.3.2.3 Order of VCL NAL unit and its relationship with encoded pictures
Each VCL NAL unit is part of a coded picture.
The order of the VCL NAL units in a coded picture is defined as follows:
- the tile order of a picture shall be in ascending order of the first CTU index of the tile;
lf_mode_delta_enable [i] indicates whether the mode related loop filter parameter difference update is enabled: equal to 0 indicates that the update is disabled; equal to 1 indicates that the update is enabled.
lf_mode_deltas [i] indicates the mode related loop filter parameter difference.
lf_mode_deltas_sign [i] indicates the sign of the mode related loop filter parameter difference.
mode_deltas [i] = lf_mode_deltas [i] ?? lf_mode_deltas_sign [i]
picture_sao_enable [i] indicates whether the sample adaptive offset of the luma and chroma components is enabled. picture_sao_enable [i] equal to 0 indicates that it is disabled; equal to 1 indicates that it is enabled, where i equal to 0 indicates the luma component and i equal to 1 or 2 indicates a chroma component.
picture_alf_enable [i] is the picture adaptive loop filter enable flag, indicating whether the adaptive loop filter of the luma and chroma components of the current picture is enabled. picture_alf_enable [i] equal to 0 indicates that the ith component of the current picture shall not use the adaptive loop filter; equal to 1 indicates that the ith component of the current picture uses the adaptive loop filter, where i equal to 0 indicates the luma component and i equal to 1 or 2 indicates a chroma component.
The value of alf_filter_num_minus1 plus 1 indicates the number of adaptive loop filters of the current picture's luma component.
The value of alf_filter_num_minus1 shall be 0 to 15.
alf_region_distance [i] indicates the difference between the base unit start sign of the luma component's ith adaptive loop filter region and the base unit start sign of the luma component's (i-1)th adaptive loop filter region. The value of alf_region_distance [i] shall be 1 to 15.
If alf_region_distance [i] is not present in the bitstream: when i is equal to 0, the value of alf_region_distance [i] is 0; when i is not equal to 0 and the value of alf_filter_num_minus1 is 15, the value of alf_region_distance [i] is 1. The bitstream shall satisfy the constraint that the sum of alf_region_distance [i] (i = 0 ~ alf_filter_num_minus1) is less than or equal to 15.
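The inference and range rules for alf_region_distance can be sketched as a small resolver. This is an illustration rather than the decoding process itself: the function name is hypothetical, None is used here to mark a value that is absent from the bitstream, and cases for which the text above quotes no inference rule are rejected rather than guessed.

```python
from typing import List, Optional


def resolve_alf_region_distances(decoded: List[Optional[int]],
                                 alf_filter_num_minus1: int) -> List[int]:
    """Fill in absent values and check the constraints quoted in the text."""
    if not 0 <= alf_filter_num_minus1 <= 15:
        raise ValueError("alf_filter_num_minus1 shall be 0 to 15")
    if len(decoded) != alf_filter_num_minus1 + 1:
        raise ValueError("one entry is expected per luma filter")
    resolved = []
    for i, value in enumerate(decoded):
        if value is None:  # not present in the bitstream
            if i == 0:
                value = 0
            elif alf_filter_num_minus1 == 15:
                value = 1
            else:
                raise ValueError("no inference rule quoted for this case")
        elif not 1 <= value <= 15:
            raise ValueError("decoded alf_region_distance shall be 1 to 15")
        resolved.append(value)
    if sum(resolved) > 15:
        raise ValueError("sum of alf_region_distance shall not exceed 15")
    return resolved
```

With 16 filters and every value absent, the inferred distances are 0 followed by fifteen 1s, which sums to exactly 15 and so just satisfies the constraint.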
alf_coeff_luma [i] [j] indicates the jth coefficient of the luma component of the ith adaptive loop filter. The value range of alf_coeff_luma [i] [j] (j = 0 ~ 8) obtained from decoding in the bitstream shall be -64 to 63, and the value range of alf_coeff_luma [i] [9] shall be -1088 ~ 1071.
alf_coeff_chroma [0] [j] indicates the coefficient of the jth adaptive loop filter of the ...
sao_merge_flag equal to 0 indicates that the parameter is not merged; equal to 1 indicates that the parameter is merged, and the SAO parameter is the same as the SAO parameter of the CTU adjacent on the left or above.
sao_merge_type equal to 1 indicates that the SAO parameter of the current CTU uses the SAO parameter of the adjacent CTU on the left; equal to 0 indicates that the SAO parameter of the current CTU uses the SAO parameter of the adjacent CTU above.
sao_mode [compIdx] equal to 0 indicates that the SAO mode of the compIdxth component in the current CTU is SAO_OFF; equal to 1 indicates that the SAO mode of the compIdxth component in the current CTU is determined by sao_type [compIdx].
sao_type [compIdx] equal to 0 indicates that the SAO mode of the compIdxth component in the current CTU is SAO_BO; equal to 1 indicates that the SAO mode of the compIdxth component in the current CTU is SAO_EO.
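The two-level selection through sao_mode and sao_type can be written as a single lookup. The sketch below is a direct reading of the two sentences above; the function name and the use of strings for the mode names are illustrative choices.

```python
from typing import Optional


def resolve_sao_mode(sao_mode: int, sao_type: Optional[int] = None) -> str:
    """Map sao_mode and sao_type of one component to its SAO mode name."""
    if sao_mode == 0:
        return "SAO_OFF"  # sao_type is not consulted in this case
    if sao_mode != 1 or sao_type not in (0, 1):
        raise ValueError("sao_mode and sao_type are expected to be 0 or 1")
    return "SAO_BO" if sao_type == 0 else "SAO_EO"
```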
sao_start_band [compIdx] indicates the start compensation interval of the compIdxth component in the current CTU in SAO_BO mode, and the value shall be 0 ~ 31.
sao_offset_sign [compIdx] [j] indicates the sign of sao_offset [compIdx] [j] in SAO_BO mode. sao_offset_sign [compIdx] [j] equal to 0 indicates that the value of the corresponding sao_offset [compIdx] [j] is positive; equal to 1 indicates that the value of the corresponding sao_offset [compIdx] [j] is negative.
sao_offset_abs [compIdx] [j] indicates the absolute value of the compensation value sao_offset [compIdx] [j] in SAO_BO mode; the value shall be 0 ~ (1 << (Min (bit_depth, 10) - 5)) - 1.
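As a worked example of the sao_offset_abs range, the sketch below evaluates the upper bound for a given bit depth and combines the absolute value with sao_offset_sign. The function names are hypothetical, and combining sign and magnitude this way is an assumption based on the semantics stated above.

```python
def max_sao_offset_abs(bit_depth: int) -> int:
    """Upper bound (1 << (Min(bit_depth, 10) - 5)) - 1 from the text."""
    return (1 << (min(bit_depth, 10) - 5)) - 1


def sao_band_offset(sao_offset_abs: int, sao_offset_sign: int,
                    bit_depth: int) -> int:
    """Signed compensation value in SAO_BO mode (illustrative)."""
    if not 0 <= sao_offset_abs <= max_sao_offset_abs(bit_depth):
        raise ValueError("sao_offset_abs out of range for this bit depth")
    # sign 0 means positive, sign 1 means negative
    return -sao_offset_abs if sao_offset_sign == 1 else sao_offset_abs
```

For 8-bit video the bound evaluates to (1 << 3) - 1 = 7; for bit depths of 10 and above it stays at (1 << 5) - 1 = 31 because of the Min term.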
sao_edge_type [compIdx] indicates the angular direction of the compIdxth component in the current CTU in SAO_EO mode. sao_edge_type [compIdx] equal to 0 indicates EO_0°; equal to 1 indicates EO_90°; equal to 2 indicates EO_135°; equal to 3 indicates EO_45°.
sao_edge_offset [compIdx] [j] indicates the corresponding compensation value in SAO_EO mode.
alf_ctu_enable [compIdx] equal to 0 indicates that the compIdxth component of the current CTU does not perform adaptive loop filter. alf_ctu_enable [compIdx] equal to 1 indicates that the compIdxth component of the current CTU performs adaptive loop filter.
5.2.4.4.6 Authentication data RBSP semantics
frame_num indicates the picture that the authentication data covers; that picture is the closest picture preceding the authentication data NAL unit whose frame_num is the same as the frame_num of the authentication data. When frame_num is equal to 0, frame_num ...
inter_block indicates whether the current block is an inter coded block.
skip_flag indicates whether the current block is skipped.
coeff_value indicates the value of the block coefficients.
coeff_sign indicates the sign of the block coefficients.
tx_size indicates the size of the transform matrix used by the current block. tx_size equal to 0 indicates that the transform matrix is TX_4×4; equal to 1 indicates that the transform matrix is TX_8×8; equal to 2 indicates that the transform matrix is TX_16×16; equal to 3 indicates that the transform matrix is TX_32×32.
prev_intra_luma_pred_flag indicates whether the luma intra prediction mode is in the intra prediction mode prediction list; the prediction list contains the 5 most likely prediction modes.
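The four tx_size values follow a power-of-two pattern, so the side length of the transform matrix can be computed instead of looked up. A minimal sketch, with a hypothetical function name:

```python
def transform_side_length(tx_size: int) -> int:
    """Side length of the square transform for tx_size 0..3: 4, 8, 16, 32."""
    if not 0 <= tx_size <= 3:
        raise ValueError("tx_size is expected to be 0 to 3")
    return 4 << tx_size


side_lengths = [transform_side_length(t) for t in range(4)]
```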
mpm_idx0 equal to 0 indicates that the current luma prediction mode is the first mode in the prediction list; equal to 1 indicates that the current luma prediction mode is not the first mode in the prediction list.
mpm_idx1 is used when mpm_idx0 is equal to 1; mpm_idx1 + 1 indicates the position of the current prediction mode in the prediction list. The value of mpm_idx1 shall be 0 ~ 3.
rem_pred_intra_mode indicates the index of the current luma prediction mode among the remaining 32 prediction modes, that is, those other than the 5 prediction modes in the prediction list. The value of rem_pred_intra_mode shall be 0 ~ 31.
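The way these four syntax elements select a luma prediction mode can be sketched as follows. Several points are assumptions that this excerpt does not confirm: prev_intra_luma_pred_flag equal to 1 is taken to mean "in the list", the 37 modes (5 listed plus 32 remaining) are numbered 0 to 36, and the remaining modes are enumerated in ascending numeric order. The function name and the example list are hypothetical.

```python
from typing import List

NUM_LUMA_MODES = 5 + 32  # 5 in the prediction list plus 32 remaining modes


def select_luma_intra_mode(prediction_list: List[int],
                           prev_intra_luma_pred_flag: int,
                           mpm_idx0: int = 0,
                           mpm_idx1: int = 0,
                           rem_pred_intra_mode: int = 0) -> int:
    """Pick the luma intra prediction mode (illustrative, see assumptions)."""
    if len(set(prediction_list)) != 5:
        raise ValueError("the prediction list holds 5 distinct modes")
    if prev_intra_luma_pred_flag == 1:  # polarity is an assumption
        # mpm_idx0 == 0 selects the first entry; otherwise mpm_idx1 + 1
        # gives the position in the list (mpm_idx1 is 0..3).
        position = 0 if mpm_idx0 == 0 else mpm_idx1 + 1
        return prediction_list[position]
    # Assumption: remaining modes are ordered by ascending mode number.
    remaining = [m for m in range(NUM_LUMA_MODES) if m not in prediction_list]
    return remaining[rem_pred_intra_mode]


example_list = [0, 1, 10, 26, 34]  # made-up list for illustration only
```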
uv_fllow_y_flag equal to 1 indicates that the chroma intra prediction mode is consistent with the luma intra prediction mode of its corresponding position, and uv_fllow_y_flag equal to 0 indicates that the chroma intra prediction mode does not coincide with the luma intra prediction mode of its corresponding position.
chroma_intra_mode indicates the chroma intra prediction mode index.
block_reference_mode indicates the reference frame mode of the current block; the value is SINGLE_REFERENCE or COMPOUND_REFERENCE. If block_reference_mode is not present in the bitstream, the value of block_reference_mode is equal to frame_reference_mode. If block_reference_mode is equal to COMPOUND_REFERENCE, is_compound is equal to 1; otherwise is_compound is equal to 0.
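The inference rule for block_reference_mode and the derivation of is_compound can be sketched as below. The function name is hypothetical; the mode values are represented as strings and an absent syntax element as None, both purely for illustration.

```python
from typing import Optional, Tuple

SINGLE_REFERENCE = "SINGLE_REFERENCE"
COMPOUND_REFERENCE = "COMPOUND_REFERENCE"


def resolve_block_reference_mode(
        frame_reference_mode: str,
        block_reference_mode: Optional[str] = None) -> Tuple[str, int]:
    """Return (block_reference_mode, is_compound) for the current block."""
    # When absent from the bitstream, the frame-level value is used.
    mode = (frame_reference_mode if block_reference_mode is None
            else block_reference_mode)
    is_compound = 1 if mode == COMPOUND_REFERENCE else 0
    return mode, is_compound
```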
ref_frame indicates the current prediction block reference frame index. When block_reference_mode is equal to SINGLE_REFERENCE, ref_frame has five possible values, namely DYNAMIC_REF, STA...
