CN101223579B - Method of encoding and decoding an audio signal - Google Patents

Publication number
CN101223579B
Authority
CN
China
Prior art keywords
signal
sound
spatial information
embedded
supplementary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2006800263123A
Other languages
Chinese (zh)
Other versions
CN101223579A (en)
Inventor
吴贤午
郑亮源
房熙锡
金东秀
林宰显
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
LG Electronics Inc
Original Assignee
LG Electronics Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from KR1020060030658A (KR20060122692A)
Application filed by LG Electronics Inc
Priority claimed from PCT/KR2006/002020 (WO2006126858A2)
Publication of CN101223579A
Application granted
Publication of CN101223579B
Legal status: Active
Anticipated expiration

Abstract

An apparatus for encoding and decoding an audio signal and method thereof are disclosed, by which compatibility with a player of a general mono or stereo audio signal can be provided in coding an audio signal and by which spatial information for a multi-channel audio signal can be stored or transmitted without a presence of an auxiliary data area. The present invention includes extracting side information embedded in non-recognizable component of audio signal components and decoding the audio signal using the extracted side information.

Description

Method of encoding and decoding an audio signal
Technical field
The present invention relates to a method of encoding and decoding an audio signal.
Background art
Recently, considerable work has gone into researching and developing various coding schemes and methods for digital audio signals, and many products associated with those schemes and methods have been manufactured.
In addition, coding schemes have been developed that use the spatial information of a multi-channel audio signal to convert a mono or stereo audio signal into a multi-channel audio signal.
However, when an audio signal is stored on certain recording media, there is no auxiliary data area in which to store the spatial information. In that case only the mono or stereo audio signal is stored or transmitted, so only the mono or stereo audio signal is reproduced. The resulting sound quality is therefore monotonous.
Moreover, when the spatial information is stored or transmitted separately, there is a compatibility problem with players of general mono or stereo audio signals.
Summary of the invention
Accordingly, the present invention is directed to an apparatus for encoding and decoding an audio signal, and a method thereof, that substantially obviate one or more problems due to the limitations and disadvantages of the related art.
An object of the present invention is to provide an apparatus for encoding and decoding an audio signal, and a method thereof, by which compatibility with a player of a general mono or stereo audio signal can be provided when coding an audio signal.
Another object of the present invention is to provide an apparatus for encoding and decoding an audio signal, and a method thereof, by which the spatial information of a multi-channel audio signal can be stored or transmitted without an auxiliary data area.
Additional features and advantages of the invention will be set forth in the description that follows, and in part will be apparent from the description or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims as well as the appended drawings.
To achieve these and other advantages and in accordance with the purpose of the present invention, a method of decoding an audio signal according to the present invention includes the steps of (a) extracting side information that is embedded in the audio signal by being dispersed over at least one channel of the audio signal, and (b) decoding the audio signal using the side information.
To further achieve these and other advantages and in accordance with the purpose of the present invention, a method of encoding an audio signal according to the present invention includes the steps of (a) generating side information required for decoding the audio signal, and (b) embedding the side information in the audio signal having at least one channel by dispersing it.
To further achieve these and other advantages and in accordance with the purpose of the present invention, a data structure according to the present invention includes an audio signal and side information required for decoding the audio signal, the side information being embedded by being dispersed in a non-perceptible component of the audio signal having at least one channel.
To further achieve these and other advantages and in accordance with the purpose of the present invention, an apparatus for encoding an audio signal according to the present invention includes a side information generating unit for generating side information required for decoding the audio signal, and an embedding unit for embedding the side information in the audio signal having at least one channel by dispersing it.
To further achieve these and other advantages and in accordance with the purpose of the present invention, an apparatus for decoding an audio signal according to the present invention includes an embedded signal decoding unit for extracting side information that is embedded by being dispersed in the audio signal having at least one channel, and a multi-channel generating unit for decoding the audio signal using the side information.
It is to be understood that both the foregoing summary and the following detailed description are exemplary and explanatory and are intended to provide further explanation of the invention as claimed.
Brief description of the drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention.
In the accompanying drawings:
Fig. 1 is a diagram for explaining how a human perceives the spatial information of an audio signal according to the present invention;
Fig. 2 is a block diagram of a spatial encoder according to the present invention;
Fig. 3 is a detailed block diagram of an embedding unit configuring the spatial encoder shown in Fig. 2 according to the present invention;
Fig. 4 is a diagram of a first method of rearranging a spatial information bitstream according to the present invention;
Fig. 5 is a diagram of a second method of rearranging a spatial information bitstream according to the present invention;
Fig. 6A is a diagram of a reshaped spatial information bitstream according to the present invention;
Fig. 6B is a detailed view of the configuration of the spatial information bitstream shown in Fig. 6A;
Fig. 7 is a block diagram of a spatial decoder according to the present invention;
Fig. 8 is a detailed block diagram of an embedded signal decoder included in the spatial decoder according to the present invention;
Fig. 9 is a diagram for explaining a case in which a general PCM decoder reproduces an audio signal according to the present invention;
Fig. 10 is a flowchart of an encoding method for embedding spatial information in a downmix signal according to the present invention;
Fig. 11 is a flowchart of a method of decoding spatial information embedded in a downmix signal according to the present invention;
Fig. 12 is a diagram of the frame size of a spatial information bitstream embedded in a downmix signal according to the present invention;
Fig. 13 is a diagram of a spatial information bitstream embedded in a downmix signal with a fixed size according to the present invention;
Fig. 14A is a diagram for explaining a first method of solving the time alignment problem of a spatial information bitstream embedded with a fixed size;
Fig. 14B is a diagram for explaining a second method of solving the time alignment problem of a spatial information bitstream embedded with a fixed size;
Fig. 15 is a diagram of a method of attaching a spatial information bitstream to a downmix signal according to the present invention;
Fig. 16 is a flowchart of a method of encoding a spatial information bitstream embedded in a downmix signal with various sizes according to the present invention;
Fig. 17 is a flowchart of a method of encoding a spatial information bitstream embedded in a downmix signal with a fixed size according to the present invention;
Fig. 18 is a diagram of a first method of embedding a spatial information bitstream in an audio signal downmixed to at least one channel according to the present invention;
Fig. 19 is a diagram of a second method of embedding a spatial information bitstream in an audio signal downmixed to at least one channel according to the present invention;
Fig. 20 is a diagram of a third method of embedding a spatial information bitstream in an audio signal downmixed to at least one channel according to the present invention;
Fig. 21 is a diagram of a fourth method of embedding a spatial information bitstream in an audio signal downmixed to at least one channel according to the present invention;
Fig. 22 is a diagram of a fifth method of embedding a spatial information bitstream in an audio signal downmixed to at least one channel according to the present invention;
Fig. 23 is a diagram of a sixth method of embedding a spatial information bitstream in an audio signal downmixed to at least one channel according to the present invention;
Fig. 24 is a diagram of a seventh method of embedding a spatial information bitstream in an audio signal downmixed to at least one channel according to the present invention;
Fig. 25 is a flowchart of a method of encoding a spatial information bitstream to be embedded in an audio signal downmixed to at least one channel according to the present invention; and
Fig. 26 is a flowchart of a method of decoding a spatial information bitstream embedded in an audio signal downmixed to at least one channel according to the present invention.
Embodiments
Reference will now be made in detail to the preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings.
First of all, the present invention relates to an apparatus for embedding side information required for decoding an audio signal in that audio signal, and a method thereof. For convenience of explanation, the audio signal and the side information are represented in the following description as a downmix signal and spatial information, respectively, which does not limit the present invention. In this case, the audio signal includes a PCM signal.
Fig. 1 is a diagram for explaining how a human perceives the spatial information of an audio signal according to the present invention.
Referring to Fig. 1, based on the fact that a human can perceive an audio signal in three dimensions, a coding scheme for a multi-channel audio signal exploits the fact that the audio signal can be represented as three-dimensional spatial information by a set of parameters.
Spatial parameters for representing the spatial information of a multi-channel audio signal include CLD (channel level difference), ICC (inter-channel coherence), CTD (channel time difference), and so on. CLD is the energy difference between two channels, ICC is the correlation between two channels, and CTD is the time difference between two channels.
How a human perceives an audio signal spatially, and how the spatial parameters arise, is explained with reference to Fig. 1.
A direct sound wave 103 arrives at the left ear of a human from a remote sound source 101, while another direct sound wave 102 is diffracted around the head to reach the right ear 106.
The two sound waves 102 and 103 differ from each other in arrival time and energy level. The CTD and CLD parameters are generated from these differences.
If reflected sound waves 104 and 105 arrive at the two ears, respectively, or if the sound source is dispersed, sound waves with little correlation to each other arrive at the two ears, which gives rise to the ICC parameter.
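The three parameters can be illustrated with a minimal Python sketch. The text does not give formulas, so the definitions below are common textbook assumptions rather than the patent's method: CLD as an energy ratio in dB, CTD as the lag that maximizes the cross-correlation, and ICC as the normalized cross-correlation at that lag. The function name `spatial_parameters` and the `max_lag` parameter are hypothetical, and both channels are assumed to have non-zero energy.

```python
import math

def spatial_parameters(ch1, ch2, max_lag=32):
    """Estimate CLD (dB), ICC and CTD (in samples) for two equal-length channels.

    Assumed definitions, not taken from the patent text.
    """
    e1 = sum(x * x for x in ch1)
    e2 = sum(x * x for x in ch2)
    # CLD: energy difference between the channels, expressed in dB.
    cld = 10.0 * math.log10(e1 / e2)

    def xcorr(lag):
        # Correlation of ch1[n] with ch2[n + lag] over the overlapping region.
        start = max(0, -lag)
        stop = min(len(ch1), len(ch2) - lag)
        return sum(ch1[n] * ch2[n + lag] for n in range(start, stop))

    # CTD: lag with the largest cross-correlation (positive: ch2 lags ch1).
    ctd = max(range(-max_lag, max_lag + 1), key=xcorr)
    # ICC: cross-correlation at that lag, normalized by the channel energies.
    icc = xcorr(ctd) / math.sqrt(e1 * e2)
    return cld, icc, ctd
```

For a second channel that is simply a delayed, attenuated copy of the first, this sketch reports a positive CLD, a CTD equal to the delay, and an ICC close to 1.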
Using the spatial parameters generated according to the above principle, a multi-channel audio signal can be transmitted as a mono or stereo signal and then output as a multi-channel signal.
The present invention provides a method of embedding spatial information, i.e., spatial parameters, in a mono or stereo audio signal, transmitting the embedded signal, and reproducing the transmitted signal as a multi-channel audio signal. The present invention is not limited to multi-channel audio signals. In the following description, a multi-channel audio signal is used for convenience of explanation.
Fig. 2 is a block diagram of an encoding apparatus according to the present invention.
Referring to Fig. 2, the encoding apparatus according to the present invention receives a multi-channel audio signal 201. In this case, 'n' denotes the number of input channels.
The multi-channel audio signal 201 is converted into a downmix signal (Lo and Ro) 205 by an audio signal generating unit 203. The downmix signal includes a mono or stereo audio signal and may also be a multi-channel audio signal. In the present invention, a stereo audio signal is used as an example in the following description. However, the present invention is not limited to stereo audio signals.
The spatial information of the multi-channel audio signal, i.e., the spatial parameters, is generated from the multi-channel audio signal 201 by a side information generating unit 204. In the present invention, the spatial information is information about the audio signal channels that is used in transmitting the downmix signal 205, which is generated by downmixing a multi-channel (e.g., left, right, center, left surround, right surround, etc.) signal, and in upmixing the transmitted downmix signal into a multi-channel audio signal again. Optionally, the downmix signal 205 can be generated using a downmix signal provided directly from outside, such as an artistic downmix signal 202.
The spatial information generated by the side information generating unit 204 is encoded into a spatial information bitstream for transmission and storage by a side information encoding unit 206.
The spatial information bitstream is suitably reshaped so that an embedding unit 207 can insert it directly into the audio signal to be transmitted, i.e., the downmix signal 205. In doing so, a 'digital audio embedding method' can be used.
For example, when the downmix signal 205 is a raw PCM audio signal that is to be stored on a storage medium in which spatial information is difficult to store (e.g., a stereo compact disc) or transmitted via SPDIF (Sony/Philips Digital Interface), there is, unlike the case of compression encoding by AAC or the like, no auxiliary data field in which to store the spatial information.
In this case, if the 'digital audio embedding method' is used, the spatial information can be embedded in the raw PCM audio signal without sound quality distortion. To a general decoder, the audio signal in which the spatial information is embedded is no different from the original signal. That is, to a general PCM decoder, the output signal Lo'/Ro' 208 in which the spatial information is embedded can be regarded as the same signal as the input signal Lo/Ro 205.
Examples of the 'digital audio embedding method' include a 'bit replacement coding method', an 'echo hiding method', a 'spread-spectrum based method', and the like.
The bit replacement coding method inserts specific information by modifying the lower bits of quantized audio samples. In an audio signal, modifying the lower bits has almost no effect on the quality of the audio signal.
The echo hiding method inserts into the audio signal an echo small enough that the human ear cannot perceive it.
The spread-spectrum based method transforms the audio signal into the frequency domain via a discrete cosine transform, a discrete Fourier transform, or the like, spreads specific binary information with a PN (pseudo-noise) sequence, and adds the result to the audio signal transformed into the frequency domain.
In the present invention, the bit replacement coding method is mainly explained in the following description. However, the present invention is not limited to the bit replacement coding method.
Fig. 3 is a detailed block diagram of the embedding unit configuring the spatial encoder shown in Fig. 2 according to the present invention.
Referring to Fig. 3, when spatial information is embedded in a non-perceptible component of the downmix signal components by the bit replacement coding method, the insertion bit length used for embedding the spatial information (hereinafter the 'K value') can be K bits (K>0), chosen according to a predetermined method, instead of only the lowest 1 bit. The K bits can be the lower bits of the downmix signal but are not limited to the lower bits. In this case, the predetermined method is, for example, a method of finding a masking threshold according to a psychoacoustic model and allocating a suitable number of bits according to the masking threshold.
The downmix signal Lo/Ro 301, as shown in the figure, is transferred to an audio signal encoding unit 306 via a buffer 303 within the embedding unit.
A masking threshold computing unit 304 segments the input audio signal into predetermined sections (e.g., blocks) and then finds a masking threshold for each section.
The masking threshold computing unit 304 finds, according to the masking threshold, the insertion bit length (i.e., the K value) of the downmix signal that can be modified without causing audible distortion. That is, the number of bits used in embedding the spatial information in the downmix signal is allocated per block.
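How a masking threshold might translate into a K value can be sketched as follows. This is only an illustrative assumption: the text does not specify the psychoacoustic model or the allocation rule. The sketch models the embedding noise crudely as uniform quantization noise with step size 2**K (power about step**2 / 12) and picks the largest K whose noise power stays at or below a per-block threshold. The function name `insertion_bit_length` and both parameters are hypothetical.

```python
def insertion_bit_length(masking_threshold_power, max_k=8):
    """Return the largest K whose modeled noise power fits under the threshold.

    Crude model (an assumption, not the patent's rule): modifying K low bits
    acts like quantization with step 2**K, with noise power about step**2 / 12.
    A result of 0 means no bits can be used in this block under that model.
    """
    k = 0
    while k < max_k and (2 ** (k + 1)) ** 2 / 12.0 <= masking_threshold_power:
        k += 1
    return k
```

A louder or more strongly masking block yields a higher threshold and therefore a larger K, which matches the per-block allocation described above.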
In the description of the present invention, a block is a data unit within a frame that is inserted using one insertion bit length (i.e., K value).
At least one block can exist within one frame. If the frame length is fixed, the block length decreases as the number of blocks increases.
Once the K value is determined, the K value can be included in the spatial information bitstream. That is, a bitstream reshaping unit 305 can reshape the spatial information bitstream so that it includes the K value. In this case, a sync word, an error detection code, an error correction code, and the like can also be included in the spatial information bitstream.
The reshaped spatial information bitstream can be rearranged into an embeddable form. The rearranged spatial information bitstream is embedded in the downmix signal by the audio signal encoding unit 306 and is then output as an audio signal Lo'/Ro' 307 in which the spatial information bitstream is embedded. In this case, the spatial information bitstream can be embedded in K bits of the downmix signal. The K value can have one fixed value within a block. In any case, the K value is inserted in the spatial information bitstream during the reshaping or rearranging process and is then transferred to a decoding apparatus. The decoding apparatus can then extract the spatial information bitstream using the K value.
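On the decoding side, once the K value is known, recovering the embedded bits from the lower bits of each sample is a simple masking operation. The sketch below assumes integer PCM samples and that the bits were written into the K lowest bits with the most significant embedded bit first; the function name `extract_low_bits` is hypothetical.

```python
def extract_low_bits(samples, k):
    """Return the K low-order bits of each sample as a flat list of 0/1 values.

    Within each sample the bits are emitted MSB-first (an assumed ordering).
    """
    mask = (1 << k) - 1
    bits = []
    for s in samples:
        field = s & mask  # keep only the K embedded bits
        bits.extend((field >> i) & 1 for i in range(k - 1, -1, -1))
    return bits
```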
As mentioned in the foregoing description, the spatial information bitstream goes through a process of being embedded in the downmix signal block by block. The process is performed by one of various methods.
The first method simply replaces the K lower bits of the downmix signal with zeros and then adds the reshaped spatial information bitstream data. For example, if the K value is 3, the sample data of the downmix signal is 11101101, and the spatial information bitstream data to be embedded is 111, then the lower 3 bits of '11101101' are replaced with zeros to give '11101000'. The spatial information bitstream data '111' is then added to '11101000' to give '11101111'.
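The worked example for the first method can be reproduced with a few lines of Python. This is a minimal sketch for a single non-negative integer sample; the function name `embed_by_replacement` is hypothetical.

```python
def embed_by_replacement(sample, data, k):
    """Zero the K lower bits of the sample, then add the K data bits."""
    mask = (1 << k) - 1
    cleared = sample & ~mask        # 11101101 -> 11101000 for k = 3
    return cleared + (data & mask)  # 11101000 + 111 -> 11101111

result = embed_by_replacement(0b11101101, 0b111, 3)
print(format(result, "08b"))
```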
The second method uses dithering. First, the rearranged spatial information bitstream data is subtracted from the insertion area of the downmix signal. The downmix signal is then re-quantized based on the K value. Finally, the rearranged spatial information bitstream data is added to the re-quantized downmix signal. For example, if the K value is 3, the sample data of the downmix signal is 11101101, and the spatial information bitstream data to be embedded is 111, then '111' is subtracted from '11101101' to give '11100110'. The lower 3 bits are then re-quantized (by rounding) to give '11101000'. Finally, '111' is added to '11101000' to give '11101111'.
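The subtract, re-quantize, add sequence of the second method can be sketched as below. The sketch assumes that re-quantization means rounding to the nearest multiple of 2**K, as the worked example suggests, and it ignores clipping at the ends of the sample range. The function name `embed_with_requantization` is hypothetical.

```python
def embed_with_requantization(sample, data, k):
    """Subtract the data, round to a multiple of 2**k, then add the data back."""
    step = 1 << k
    shifted = sample - data                            # 11101101 - 111 = 11100110
    requantized = ((shifted + step // 2) >> k) << k    # round -> 11101000
    return requantized + data                          # + 111 -> 11101111

result = embed_with_requantization(0b11101101, 0b111, 3)
print(format(result, "08b"))
```

Because the re-quantized value is a multiple of 2**K, the K lower bits of the result always equal the embedded data, so a decoder can read them back by masking.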
Because the spatial information bitstream embedded in the downmix signal is a random bitstream, it may not have the characteristics of white noise. Since adding a white-noise type signal to the downmix signal is advantageous for the sound characteristics, the spatial information bitstream goes through a whitening process before being added to the downmix signal. The whitening process is applied to the spatial information bitstream except for the sync word.
In the present invention, 'whitening' means making a random signal have an equal or nearly equal signal level in all regions of the frequency domain.
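The text does not say how whitening is carried out. One common way to make a bitstream look more noise-like, shown here purely as an assumed illustration, is to XOR it with a pseudo-random sequence from a linear feedback shift register (LFSR); applying the same XOR again restores the original bits. The function names, the 16-bit tap choice, and the seed are hypothetical, and per the text the sync word would be left out of this step.

```python
def lfsr_bits(n, seed=0xACE1):
    """Generate n pseudo-random bits from a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11)."""
    state = seed
    out = []
    for _ in range(n):
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        out.append(state & 1)
    return out

def whiten(bits, seed=0xACE1):
    """XOR the payload bits with the LFSR sequence; calling it twice undoes it."""
    return [b ^ p for b, p in zip(bits, lfsr_bits(len(bits), seed))]
```

The encoder and decoder would need to agree on the seed and on where the whitened region starts.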
In addition, when the spatial information bitstream is embedded in the downmix signal, audible distortion can be minimized by applying a noise shaping method to the spatial information bitstream.
In the present invention, the 'noise shaping method' means either a process of modifying the noise characteristics so that the energy of the quantization noise generated by quantization moves to a high frequency band above the audible frequency band, or a process of generating a time-varying filter corresponding to the masking threshold obtained from the corresponding audio signal and using the generated filter to modify the characteristics of the noise generated by quantization.
Fig. 4 is a diagram of a first method of rearranging the spatial information bitstream according to the present invention.
Referring to Fig. 4, as mentioned in the foregoing description, the spatial information bitstream can be rearranged into an embeddable form using the K value. In this case, the spatial information bitstream can be embedded in the downmix signal after being rearranged in various ways. Fig. 4 shows a method of embedding the spatial information in sample-plane order.
The first method rearranges the spatial information bitstream by dispersing the spatial information bitstream of the corresponding block in units of K bits and embedding the dispersed spatial information bitstream sequentially.
If the K value is 4 and one block 405 consists of N samples 403, the spatial information bitstream 401 can be rearranged so as to be embedded sequentially in the lower 4 bits of each sample.
As mentioned in the foregoing description, the present invention is not limited to the case of embedding the spatial information bitstream in the lower 4 bits of each sample.
In addition, within the lower K bits of each sample, as shown in the figure, the spatial information bitstream can be embedded MSB (most significant bit) first or LSB (least significant bit) first.
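The sample-plane ordering can be sketched as follows: the payload is cut into groups of K bits and each group overwrites the K lower bits of the next sample. The sketch assumes MSB-first ordering inside each group, zero-pads a short final group, and leaves samples beyond the payload untouched (i.e., they keep the original downmix bits). The function name `embed_sample_order` is hypothetical.

```python
def embed_sample_order(samples, bits, k):
    """Embed bits K at a time into the K lower bits of consecutive samples."""
    if len(bits) > len(samples) * k:
        raise ValueError("payload exceeds the embeddable bits of this block")
    out = list(samples)
    mask = (1 << k) - 1
    for i in range(0, len(bits), k):
        group = bits[i:i + k]
        group = group + [0] * (k - len(group))  # zero-pad a short final group
        value = 0
        for b in group:                          # MSB-first within the group
            value = (value << 1) | b
        idx = i // k
        out[idx] = (out[idx] & ~mask) | value
    return out
```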
In Fig. 4, an arrow 404 indicates the embedding direction, and the numbers in parentheses indicate the data rearrangement order.
A bit plane is a specific bit layer made up of a plurality of bits.
If the number of bits of the spatial information bitstream to be embedded is smaller than the number of embeddable bits in the insertion area in which the spatial information bitstream is to be embedded, the remaining bits 406 are padded with zeros, filled with a random signal, or replaced with the original downmix signal.
For example, if the number of samples (N) configuring one block is 100 and the K value is 4, the number of embeddable bits (W) in the block is W=N*K=100*4=400.
If the number of bits (V) of the spatial information bitstream to be embedded is 390 (i.e., V<W), the remaining 10 bits are padded with zeros, filled with a random signal, replaced with the original downmix signal, filled with a tail sequence indicating the end of the data, or filled with a combination of these. The tail sequence is a bit sequence indicating the end of the spatial information bitstream in the corresponding block. Although Fig. 4 shows the remaining bits being padded per block, the present invention also includes padding the remaining bits per insertion frame in the above manner.
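The capacity arithmetic and two of the padding options (an optional tail sequence followed by zeros) can be sketched as below. The function name `pad_payload` and the `tail` parameter are hypothetical, and the actual tail sequence is not specified in the text.

```python
def pad_payload(bits, n_samples, k, tail=None):
    """Pad a V-bit payload up to the block capacity W = N * K.

    An optional tail sequence marking the end of the data is written first
    (truncated if it does not fit); any bits still left are set to zero.
    """
    capacity = n_samples * k            # W = N * K
    remaining = capacity - len(bits)    # W - V
    if remaining < 0:
        raise ValueError("payload exceeds the embeddable bits of this block")
    tail = list(tail or [])[:remaining]
    return list(bits) + tail + [0] * (remaining - len(tail))

padded = pad_payload([1] * 390, n_samples=100, k=4)
print(len(padded))
```

With N=100, K=4 and V=390, the padded payload has 400 bits, of which the last 10 are padding.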
Fig. 5 is a diagram of a second method of rearranging the spatial information bitstream according to the present invention.
Referring to Fig. 5, the second method rearranges the spatial information bitstream 501 in bit-plane 502 order. In this case, the spatial information bitstream can be embedded sequentially, block by block, starting from the lower bits of the downmix signal, which does not limit the present invention.
For example, if the number of samples (N) configuring one block is 100 and the K value is 4, the 100 least significant bits configuring bit plane 0 (502) are filled first, and then the 100 bits configuring bit plane 1 (502) can be filled.
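The bit-plane ordering can be sketched as follows: payload bit i goes to bit plane i // N of sample i % N, so the least significant bits of all N samples are used before any second-lowest bit. Bit positions not reached by the payload keep the original downmix bits. The function name `embed_bitplane_order` is hypothetical.

```python
def embed_bitplane_order(samples, bits, k):
    """Fill bit plane 0 of every sample first, then bit plane 1, and so on."""
    n = len(samples)
    if len(bits) > n * k:
        raise ValueError("payload exceeds the embeddable bits of this block")
    out = list(samples)
    for i, b in enumerate(bits):
        plane, idx = divmod(i, n)  # plane 0 is the LSB plane
        out[idx] = (out[idx] & ~(1 << plane)) | (b << plane)
    return out
```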
In Fig. 5, an arrow 505 indicates the embedding direction, and the numbers in parentheses indicate the data rearrangement order.
The second method is particularly advantageous for extracting a sync word at a random position. When searching the rearranged and encoded signal for the sync word of the inserted spatial information bitstream, only the LSBs need to be extracted to search for the sync word.
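Searching only the LSB plane for a sync word can be sketched as below. The sync pattern itself is not given in the text, so the caller supplies one; the function name `find_sync_in_lsb` is hypothetical.

```python
def find_sync_in_lsb(samples, sync_bits):
    """Return the sample index where sync_bits starts in the LSB plane, or -1."""
    lsb = [s & 1 for s in samples]  # only the least significant bits are needed
    m = len(sync_bits)
    for start in range(len(lsb) - m + 1):
        if lsb[start:start + m] == sync_bits:
            return start
    return -1
```

Because the K value is carried inside the bitstream, a decoder that does not yet know K can still locate the sync word this way when bit-plane ordering is used.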
It can also be expected that, depending on the number of bits (V) of the spatial information bitstream to be embedded, the second method uses only the minimum number of LSBs. In this case, if the number of bits (V) of the spatial information bitstream to be embedded is smaller than the number of embeddable bits (W) in the insertion area in which the spatial information bitstream is to be embedded, the remaining bits 506 are padded with zeros, filled with a random signal, replaced with the original downmix signal, filled with a tail sequence indicating the end of the data, or filled with a combination of these. In particular, using the downmix signal is advantageous. Although Fig. 5 shows an example of padding the remaining bits per block, the present invention also includes the case of padding the remaining bits per insertion frame in the above manner.
Fig. 6A shows a bitstream structure for embedding the spatial information bitstream in the downmix signal according to the present invention.
Referring to Fig. 6A, the spatial information bitstream 607 can be reshaped by the bitstream reshaping unit 305 to include a sync word 603 and a K value 604 for the spatial information bitstream.
In the reshaping process, at least one error detection code or error correction code 606 or 608 (the error detection code is described below) can be included in the reshaped spatial information bitstream. The error detection code makes it possible to determine whether the spatial information bitstream 607 was distorted during transmission or storage.
The error detection code includes a CRC (cyclic redundancy check). The error detection code can be included in two separate steps. An error detection code 1 for the header 601 containing the K value and an error detection code 2 for the frame data 602 of the spatial information bitstream can be included separately in the spatial information bitstream. In addition, other information 605 can be included separately in the spatial information bitstream, and information about the rearrangement method of the spatial information bitstream and the like can be included in the other information 605.
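The two-step error detection can be sketched with Python's standard library. Everything about the layout here is an assumption for illustration: the text names a sync word, K values, and a CRC but gives no field widths, so the sketch uses a hypothetical 16-bit sync word `SYNC_WORD`, one byte per K value, and CRC-32 from `zlib` for both codes. The functions `build_frame` and `check_frame` are likewise hypothetical.

```python
import struct
import zlib

SYNC_WORD = 0xA55A  # hypothetical value; the text does not specify one

def build_frame(k_values, frame_data):
    """Pack header (sync word + K values) + CRC 1 + frame data + CRC 2."""
    header = struct.pack(">H", SYNC_WORD) + bytes(k_values)
    crc1 = struct.pack(">I", zlib.crc32(header))      # protects the header
    crc2 = struct.pack(">I", zlib.crc32(frame_data))  # protects the frame data
    return header + crc1 + frame_data + crc2

def check_frame(frame, n_k):
    """Return (header_ok, data_ok) for a frame built by build_frame."""
    header_len = 2 + n_k
    header = frame[:header_len]
    (crc1,) = struct.unpack(">I", frame[header_len:header_len + 4])
    body = frame[header_len + 4:-4]
    (crc2,) = struct.unpack(">I", frame[-4:])
    return zlib.crc32(header) == crc1, zlib.crc32(body) == crc2
```

Checking the header separately lets a decoder trust the K values before it commits to extracting the frame data with them.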
Fig. 6B is a detailed view of the configuration of the spatial information bitstream shown in Fig. 6A. Fig. 6B shows an embodiment in which one frame of the spatial information bitstream 601 includes two blocks (the present invention is not limited to this).
Referring to Fig. 6B, the spatial information bitstream shown in Fig. 6B includes a sync word 612, K values (K1, K2, K3, K4) 613 to 616, remaining information 617, and error detection codes 618 and 623.
The spatial information bitstream 610 includes a pair of blocks. In the case of a stereo signal, block 1 can consist of blocks 619 and 620 for the left channel and the right channel, respectively. And block 2 can consist of blocks 621 and 622 for the left channel and the right channel, respectively.
Although a stereo signal is shown in Fig. 6B, the present invention is not limited to stereo signals.
The insertion bit lengths (K values) of these blocks are included in the header part.
K1 613 indicates the insertion bit length of the left channel of block 1. K2 614 indicates the insertion bit length of the right channel of block 1. K3 615 indicates the insertion bit length of the left channel of block 2. And K4 616 indicates the insertion bit length of the right channel of block 2.
And the error detection code can be included in two separate steps. For example, an error detection code 1 618 for the header 609 including the K values and an error detection code 2 for the frame data 611 of the spatial information bitstream can be included separately.
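By way of illustration only, the two-step error detection described above might be sketched as follows. The text names CRC but does not state its width or polynomial, so the use of CRC-32 (via Python's `zlib.crc32`), the 4-byte code fields, and the function names `build_frame` and `check_frame` are assumptions made for this sketch.

```python
import zlib

CRC_LEN = 4  # assumed width of each error detection code, in bytes


def build_frame(header, frame_data):
    """Protect the header (with K values) and the frame data separately."""
    code1 = zlib.crc32(header).to_bytes(CRC_LEN, "big")      # error detection code 1
    code2 = zlib.crc32(frame_data).to_bytes(CRC_LEN, "big")  # error detection code 2
    return header + code1 + frame_data + code2


def check_frame(frame, header_len):
    """Return (header_ok, data_ok) for a frame laid out by build_frame."""
    header = frame[:header_len]
    code1 = frame[header_len:header_len + CRC_LEN]
    data = frame[header_len + CRC_LEN:-CRC_LEN]
    code2 = frame[-CRC_LEN:]
    header_ok = zlib.crc32(header).to_bytes(CRC_LEN, "big") == code1
    data_ok = zlib.crc32(data).to_bytes(CRC_LEN, "big") == code2
    return header_ok, data_ok
```

Checking the header separately lets a decoder trust the K values before it attempts to read the frame data that depends on them.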
Fig. 7 is a block diagram of a decoding device according to the present invention.
Referring to Fig. 7, the decoding device according to the present invention receives an audio signal Lo'/Ro' 701 in which a spatial information bitstream is embedded.
The audio signal in which the spatial information bitstream is embedded can be one of mono, stereo, and multi-channel signals. For convenience of explanation, a stereo signal is used as an example, but this does not limit the present invention.
An embedded signal decoding unit 702 can extract the spatial information bitstream from the audio signal 701.
The spatial information bitstream extracted by the embedded signal decoding unit 702 is an encoded spatial information bitstream. And the encoded spatial information bitstream can be an input signal to a spatial information decoding unit 703.
The spatial information decoding unit 703 decodes the encoded spatial information bitstream and then outputs the decoded spatial information bitstream to a multi-channel generating unit 704.
The multi-channel generating unit 704 receives the downmix signal 701 and the spatial information obtained from the decoding as inputs and then outputs the received inputs as a multi-channel audio signal 705.
Fig. 8 is a detailed block diagram of the embedded signal decoding unit 702 included in the decoding device according to the present invention.
Referring to Fig. 8, the audio signal Lo'/Ro' in which the spatial information is embedded is input to the embedded signal decoding unit 702. A sync word searching unit 802 detects a sync word from the audio signal 801. In this case, the sync word can be detected from one channel of the audio signal.
After the sync word has been detected, a header decoding unit 803 decodes the header area. In this case, information of a predetermined length is extracted from the header area, and a data reverse-modifying unit 804 can apply an inverse whitening scheme to the header area information, excluding the sync word, among the extracted information.
Then, length information of the header area and the like can be obtained from the header area information to which the inverse whitening scheme has been applied.
And the data reverse-modifying unit 804 can apply the inverse whitening scheme to the rest of the spatial information bitstream. Information such as the K values can be obtained through the header decoding. The original spatial information bitstream can be obtained by arranging the rearranged spatial information bitstream again using information such as the K values. In addition, sync position information for aligning the frames of the downmix signal and the spatial information bitstream, i.e., frame arrangement information 806, can be obtained.
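The sync word search performed on one channel could be sketched as below. This is a minimal sketch under stated assumptions: the sync word is taken to occupy the single least significant bit of consecutive samples, which the text does not fix, and `find_sync` is a hypothetical helper name.

```python
def find_sync(samples, sync_bits):
    """Return the index of the first sample at which sync_bits begins, or -1.

    Assumes one payload bit per sample, carried in the least significant bit.
    """
    lsb_stream = [s & 1 for s in samples]
    n = len(sync_bits)
    for start in range(len(lsb_stream) - n + 1):
        if lsb_stream[start:start + n] == list(sync_bits):
            return start
    return -1
```

Once the sync position is known, the header that follows it can be read and the K values recovered before the remaining data is extracted.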
Fig. 9 is a diagram explaining a case in which a general PCM decoding device reproduces an audio signal according to the present invention.
Referring to Fig. 9, the audio signal Lo'/Ro' in which the spatial information bitstream is embedded is used as an input to the general PCM decoding device.
The general PCM decoding device recognizes the audio signal Lo'/Ro' in which the spatial information bitstream is embedded as a normal stereo audio signal and reproduces sound. And the reproduced sound is not distinguishable in sound quality from the audio signal 902 prior to the embedding of the spatial information.
Hence, the audio signal in which the spatial information is embedded according to the present invention is compatible with normal reproduction of a stereo signal in a general PCM decoding device and has the advantage of providing a multi-channel audio signal in a decoding device capable of multi-channel decoding.
Fig. 10 is a flowchart of a method of encoding spatial information to be embedded in a downmix signal according to the present invention.
Referring to Fig. 10, an audio signal is downmixed from a multi-channel signal (1001, 1002). In this case, the downmix signal can be one of mono, stereo, and multi-channel signals.
Then, spatial information is extracted from the multi-channel signal (1003). And a spatial information bitstream is generated using the spatial information (1004).
The spatial information bitstream is embedded in the downmix signal (1005).
And a whole bitstream including the downmix signal in which the spatial information bitstream is embedded is transmitted to a decoding device (1006).
In particular, the present invention uses the downmix signal to find the insertion bit length (i.e., the K value) of the insertion area in which the spatial information bitstream is to be embedded, and the spatial information bitstream can be embedded in this insertion area.
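As a rough illustration of embedding in an insertion area of K bits, the following sketch replaces the K low-order bits of each PCM sample with payload bits and reads them back. It assumes integer PCM samples, a fixed K for the whole run, and most-significant-first filling within each K-bit field; how K is derived from the masking threshold is not modeled. The names `embed_lsb` and `extract_lsb` are hypothetical.

```python
def embed_lsb(samples, bits, k):
    """Replace the k low-order bits of each sample with payload bits."""
    mask = (1 << k) - 1
    out, pos = [], 0
    for s in samples:
        field = 0
        for _ in range(k):
            bit = bits[pos] if pos < len(bits) else 0  # pad with zeros
            field = (field << 1) | bit
            pos += 1
        out.append((s & ~mask) | field)
    return out


def extract_lsb(samples, k, nbits):
    """Read back nbits payload bits from the k low-order bits of each sample."""
    bits = []
    for s in samples:
        for shift in range(k - 1, -1, -1):
            bits.append((s >> shift) & 1)
    return bits[:nbits]
```

For example, embedding the bits `[1, 0, 1, 1, 0, 1, 0, 0]` with K = 2 into the samples `[1000, -3, 257, 0]` changes each sample by less than 2**K, which is the property that lets a general PCM player reproduce the signal unchanged in practice.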
Fig. 11 is a flowchart of a method of decoding spatial information embedded in a downmix signal according to the present invention.
Referring to Fig. 11, a decoding device receives a whole bitstream including a downmix signal in which a spatial information bitstream is embedded (1101) and extracts the downmix signal from the bitstream (1102).
The decoding device extracts the spatial information bitstream from the whole bitstream and decodes it (1103).
The decoding device extracts the spatial information through the decoding (1104) and then decodes the downmix signal using the extracted spatial information (1105). In this case, the downmix signal can be decoded into two channels or a plurality of channels.
In particular, the present invention can extract information on the embedding method of the spatial information bitstream and information on the K value and can decode the spatial information bitstream using the extracted embedding method and K value.
Fig. 12 is a diagram of the frame length of a spatial information bitstream embedded in a downmix signal according to the present invention.
Referring to Fig. 12, a 'frame' means a unit that has one header and allows independent decoding of a predetermined length. In the description of the present invention, a 'frame' means the 'insertion frame' introduced next. In the present invention, an 'insertion frame' means a unit in which the spatial information bitstream is embedded in the downmix signal.
And the length of the insertion frame can be defined per frame, or a predetermined length can be used.
For example, the insertion frame length (N) can be equal to the frame length (S) of the unit in which the spatial information is decoded and applied (hereinafter called the 'decoding frame length') (see Fig. 12(a)), N can be a multiple of S (see Fig. 12(b)), or S can be a multiple of N (see Fig. 12(c)).
In the case of N = S, as shown in Fig. 12(a), the decoding frame length (S, 1201) coincides with the insertion frame length (N, 1202), which facilitates the decoding process.
In the case of N > S, as shown in Fig. 12(b), the number of additional bits due to headers, error detection codes (e.g., CRC), and the like can be reduced by joining a plurality of decoding frames (1203) together and transmitting them as one insertion frame (N, 1204).
In the case of N < S, as shown in Fig. 12(c), one decoding frame (S, 1205) can be configured by joining several insertion frames (N, 1206) together.
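The three cases of Fig. 12 reduce to a small piece of arithmetic, sketched below. The function name `frame_relation` and the choice to reject lengths that are not multiples of each other are assumptions for this illustration.

```python
def frame_relation(n, s):
    """Relate insertion frame length n to decoding frame length s.

    Returns (decoding_frames_per_insertion_frame,
             insertion_frames_per_decoding_frame).
    """
    if n % s == 0:          # N = S or N > S: several decoding frames per insertion frame
        return n // s, 1
    if s % n == 0:          # N < S: several insertion frames per decoding frame
        return 1, s // n
    raise ValueError("N and S are assumed to be multiples of each other")
```

With N = 4096 and S = 1024, for instance, four decoding frames share a single insertion frame header and error detection code.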
In the insertion frame header, information on the insertion bit length for embedding the spatial information, information on the insertion frame length (N), information on the number of subframes included in the insertion frame, and the like can be inserted.
Fig. 13 is a diagram of a spatial information bitstream embedded in a downmix signal by insertion frame unit according to the present invention.
First, in each of the cases shown in Figs. 12(a), 12(b), and 12(c), the insertion frame and the decoding frame are configured to be multiples of each other.
Referring to Fig. 13, a bitstream of fixed length, for example a packet 1303 in a format such as a transport stream (TS), can be configured for transmission.
In particular, the spatial information bitstream 1301 can be bounded by packet units of a predetermined length regardless of the decoding frame length of the spatial information bitstream. A packet into which information such as a TS header 1302 is inserted is transmitted to the decoding device. The length of the insertion frame can be defined per frame, or a predetermined length can be used instead of a per-frame definition.
This method is needed to vary the data rate of the spatial information bitstream, because the masking threshold of each block differs according to the characteristics of the downmix signal, and so does the maximum number of bits (K_max) that can be allocated to each block without distorting the quality of the downmix signal.
For example, if K_max is insufficient to fully represent the spatial information bitstream needed for the corresponding block, data up to K_max is transmitted and the remaining data is transmitted later through another block.
If K_max is sufficient, the spatial information bitstream for the next block can be loaded in advance.
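The carry-over and pre-loading behavior can be pictured as cutting one continuous bitstream at per-block capacity boundaries. The sketch below assumes that a block of `block_len` samples with insertion bit length K_max can hide `K_max * block_len` bits; the name `pack_blocks` and this capacity model are illustrative assumptions.

```python
def pack_blocks(bits, k_max_per_block, block_len):
    """Cut one continuous bitstream at per-block capacity boundaries.

    Returns (chunks, leftover): chunks[i] is carried by block i, and
    leftover holds bits that must wait for later blocks.
    """
    chunks, pos = [], 0
    for k_max in k_max_per_block:
        capacity = k_max * block_len          # bits this block can hide
        chunks.append(bits[pos:pos + capacity])
        pos += capacity
    return chunks, bits[pos:]
```

A block with a small K_max carries only part of its data, and a following block with a larger K_max absorbs the remainder and, if room is left, bits that belong to the next decoding frame.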
In this case, each TS packet has an independent header. The header can include a sync word, TS packet length information, information on the number of subframes included in the TS packet, information on the insertion bit length allocated within the packet, and the like.
Fig. 14A is a diagram explaining a first method for solving the timing alignment problem of a spatial information bitstream embedded by insertion frame unit.
Referring to Fig. 14A, the length of an insertion frame is defined per frame, or a predetermined length can be used.
The embedding method by insertion frame unit may cause a timing alignment problem between the insertion frame start position of the embedded spatial information bitstream and the downmix signal frame. Hence, a solution to the timing alignment problem is needed.
In the first method shown in Fig. 14A, a header 1402 for the decoding frame 1403 of the spatial information (hereinafter called the 'decoding frame header') is placed separately.
Discriminating information indicating whether position information of the audio signal to which the spatial information is to be applied exists can be included in the decoding frame header 1402.
For example, in the case of the TS packets 1404 and 1405, discriminating information 1408 (e.g., a flag) indicating whether the decoding frame header 1402 exists is included in the TS packet header 1404.
If the discriminating information 1408 is 1, i.e., if the decoding frame header 1402 exists, the discriminating information indicating whether position information of the downmix signal to which the spatial information bitstream is to be applied exists can be extracted from the decoding frame header.
Then, according to the extracted discriminating information, the position information 1409 (e.g., delay information) of the downmix signal to which the spatial information bitstream is to be applied can be extracted from the decoding frame header 1402.
If the discriminating information 1411 is 0, the position information may not be included in the header of the TS packet.
In general, the spatial information bitstream 1403 preferably comes ahead of the corresponding downmix signal 1401. Hence, the position information 1409 can be a sample value for a delay.
Meanwhile, to keep the amount of information needed to represent the sample value from growing excessively when the delay is large, a sample group unit (e.g., a granule unit) representing a group of samples is defined. The position information can then be expressed in this sample group unit.
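The saving obtained by expressing the delay in sample group units can be checked with a short calculation. The group size of 64 samples and the maximum delay of 65536 samples used in the usage note are arbitrary illustrative values, and the function names are hypothetical.

```python
def bits_needed(max_value):
    """Number of bits needed to represent every integer in 0..max_value."""
    return max(1, max_value.bit_length())


def delay_field_width(max_delay_samples, group_size=1):
    """Width of the position field when delay is counted in sample groups."""
    return bits_needed(max_delay_samples // group_size)
```

Under these assumptions a delay of up to 65536 samples needs a 17-bit field when counted in samples, but only an 11-bit field when counted in groups of 64 samples.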
As mentioned in the foregoing description, a TS sync word 1406, an insertion bit length 1407, the discriminating information indicating whether a decoding frame header exists, and remaining information 140 can be included in the TS header.
Fig. 14B is a diagram explaining a second method for solving the timing alignment problem of a spatial information bitstream embedded by an insertion frame whose length is defined per frame.
Referring to Fig. 14B, in the case of a TS packet, for example, the second method is carried out by matching the starting point 1413 of a decoding frame, the starting point of the TS packet, and the starting point of the corresponding downmix signal 1412.
For the matched part, discriminating information 1420 or 1422 (e.g., a flag) indicating that the three kinds of starting points are aligned can be included in the header 1415 of the TS packet.
Fig. 14B shows the three kinds of starting points matched at the n-th frame 1412 of the downmix signal. In this case, the discriminating information 1422 can have a value of 1.
If the three kinds of starting points are not matched, the discriminating information 1420 can have a value of 0.
To match the three kinds of starting points, the specific part 1417 following the previous TS packet is padded with zeros, has a random signal inserted into it, is replaced by the original downmix audio signal, or is filled with a combination of these.
As mentioned in the foregoing description, a TS sync word 1418, an insertion bit length 1419, and remaining information 1421 can be included in the TS packet header 1415.
Fig. 15 is a diagram of a method of attaching a spatial information bitstream to a downmix signal according to the present invention.
Referring to Fig. 15, the length of a frame to which the spatial information bitstream is attached (hereinafter called an 'attaching frame') can be a length unit defined per frame or a predetermined length unit not defined per frame.
For example, as shown in the figure, the insertion frame length can be obtained by multiplying or dividing the decoding frame length 1504 of the spatial information by N (where N is a positive integer), or the insertion frame length can have a fixed-length unit.
If the decoding frame length 1504 differs from the insertion frame length, an insertion frame having the same length as the decoding frame length 1504 can be generated, for example, without segmenting the spatial information bitstream, rather than cutting the spatial information bitstream arbitrarily to fit the insertion frame.
In this case, the spatial information bitstream can be configured to be embedded in the downmix signal or can be configured to be attached to the downmix signal instead of being embedded in it.
For a signal converted from an analog signal to a digital signal, such as a PCM signal (hereinafter called the 'first audio signal'), the spatial information bitstream can be configured to be embedded in the first audio signal.
For a further compressed digital signal, such as an MP3 signal (hereinafter called the 'second audio signal'), the spatial information bitstream can be configured to be attached to the second audio signal.
For example, in the case of using the second audio signal, the downmix signal is represented as a bitstream in compressed format. Hence, as shown in the figure, the downmix signal bitstream 1502 exists in compressed format and the spatial information of the decoding frame length 1504 is attached to the downmix signal bitstream 1502.
Hence, the spatial information bitstream can be transmitted in bursts.
A header 1503 can exist in the decoding frame. And position information of the downmix signal to which the spatial information is applied can be included in the header 1503.
Meanwhile, the present invention includes a case in which the spatial information bitstream is configured as an attaching frame in compressed format (e.g., the TS bitstream 1506) and the attaching frame is attached to the downmix signal bitstream 1502 in compressed format.
In this case, a TS header 1505 for the TS bitstream 1506 can exist. The attaching frame header (e.g., the TS header 1505) can include at least one of attaching frame sync information 1507, discriminating information 1508 indicating whether a decoding frame header exists in the attaching frame, information on the number of subframes included in the attaching frame, and remaining information 1509. Discriminating information indicating whether the starting point of the attaching frame matches the starting point of the decoding frame can also be included in the attaching frame.
If the decoding frame header exists in the attaching frame, discriminating information indicating whether position information of the downmix signal to which the spatial information is applied exists is extracted from the decoding frame header.
Then, the position information of the downmix signal to which the spatial information is applied can be extracted according to the discriminating information.
Fig. 16 is a flowchart of a method of encoding a spatial information bitstream embedded in a downmix signal by insertion frames of various sizes according to the present invention.
Referring to Fig. 16, an audio signal is downmixed from a multi-channel audio signal (1601, 1602). In this case, the downmix signal can be a mono, stereo, or multi-channel audio signal.
And spatial information is extracted from the multi-channel audio signal (1601, 1603).
A spatial information bitstream is then generated using the extracted spatial information (1604). The generated spatial information can be embedded in the downmix signal in units of insertion frames whose length corresponds to an integer multiple of the decoding frame length per frame.
If the decoding frame length (S) is greater than the insertion frame length (N) (1605), a plurality of insertion frames of length N are joined together so as to equal one S (1607).
If the decoding frame length (S) is smaller than the insertion frame length (N) (1606), a plurality of decoding frames of length S are joined together so as to equal one N (1608).
If the decoding frame length (S) is equal to the insertion frame length (N), the insertion frame length (N) is configured to be equal to the decoding frame length (S) (1609).
The spatial information bitstream configured in the above manner is embedded in the downmix signal (1610).
Finally, a whole bitstream including the downmix signal in which the spatial information bitstream is embedded is transmitted (1611).
In addition, in the present invention, information on the insertion frame length of the spatial information bitstream can be embedded in the whole bitstream.
Fig. 17 is a flowchart of a method of encoding a spatial information bitstream embedded in a downmix signal by a fixed length according to the present invention.
Referring to Fig. 17, an audio signal is downmixed from a multi-channel audio signal (1701, 1702). In this case, the downmix signal can be a mono, stereo, or multi-channel audio signal.
And spatial information is extracted from the multi-channel audio signal (1701, 1703).
A spatial information bitstream is then generated using the extracted spatial information (1704).
After the spatial information bitstream has been bounded into a bitstream of fixed length (packet units), e.g., a transport stream (TS) (1705), the fixed-length spatial information bitstream is embedded in the downmix signal (1706).
Then, a whole bitstream including the downmix signal in which the spatial information bitstream is embedded is transmitted (1707).
In addition, in the present invention, the insertion bit length (i.e., the K value) of the insertion area in which the spatial information bitstream is to be embedded is obtained using the downmix signal, and the spatial information bitstream can be embedded in the insertion area.
Fig. 18 is a diagram of a first method of embedding a spatial information bitstream in an audio signal downmixed on at least one channel according to the present invention.
When the downmix signal is configured with at least one channel, the spatial information is regarded as data common to the at least one channel. Hence, a method of embedding the spatial information by dispersing it over the at least one channel is needed.
Fig. 18 shows a method of embedding the spatial information in one channel of a downmix signal having at least one channel.
Referring to Fig. 18, the spatial information is embedded in K bits of the downmix signal. In particular, the spatial information is embedded in only one channel and not in the other channels. And the K value can differ per block or channel.
As mentioned in the foregoing description, the bits corresponding to the K value can correspond to the lower bits of the downmix signal, but the present invention is not limited to this. In this case, the spatial information bitstream can be inserted into the channel in bit-plane order starting from the LSB or in sample-plane order.
Fig. 19 is a diagram of a second method of embedding a spatial information bitstream in an audio signal downmixed on at least one channel according to the present invention. For convenience of explanation, Fig. 19 shows a downmix signal having two channels, but the present invention is not limited to this.
Referring to Fig. 19, the second method is carried out by embedding the spatial information, in turn, in block-n of one channel (e.g., the left channel), block-n of the other channel (e.g., the right channel), block-(n+1) of the former channel (the left channel), and so on. In this case, sync information can be embedded in only one channel.
Although the spatial information bitstream can be embedded in the downmix signal per block, it can also be extracted per block or per frame in the decoding process.
Since the signal characteristics of the two channels of the downmix signal differ from each other, K values can be allocated to the two channels by finding the masking threshold of each channel separately. Specifically, as shown in the figure, K1 and K2 are allocated to the two channels, respectively.
In this case, the spatial information can be embedded in each channel in bit-plane order starting from the LSB or in sample-plane order.
Fig. 20 is a diagram of a third method of embedding a spatial information bitstream in an audio signal downmixed on at least one channel according to the present invention. Fig. 20 shows a downmix signal having two channels, but the present invention is not limited to this.
Referring to Fig. 20, the third method is carried out by embedding the spatial information dispersed over the two channels. In particular, the spatial information is embedded so that the embedding order alternates between the two channels in sample units.
Since the signal characteristics of the two channels of the downmix signal differ from each other, different K values can be allocated to the two channels by finding the masking threshold of each channel separately. Specifically, as shown in the figure, K1 and K2 are allocated to the two channels, respectively.
The K values can differ from block to block. For example, the spatial information is placed, in order, in the K1 lower bits of sample-1 of one channel (e.g., the left channel), the K2 lower bits of sample-1 of the other channel (e.g., the right channel), the K1 lower bits of sample-2 of the former channel (e.g., the left channel), and the K2 lower bits of sample-2 of the latter channel (e.g., the right channel).
In the figure, the numerals in parentheses indicate the order in which the spatial information bitstream is filled. Although Fig. 20 shows the spatial information bitstream filled from the MSB, the spatial information bitstream can also be filled from the LSB.
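The sample-unit alternation of the third method might be sketched as follows. The sketch assumes two channels of equal length, integer PCM samples, and filling from the most significant of the K inserted bits; the function name `embed_sample_interleaved` is hypothetical.

```python
def embed_sample_interleaved(left, right, bits, k1, k2):
    """Alternate between channels sample by sample: L1, R1, L2, R2, ..."""
    stream = iter(bits)

    def take(k):
        # Collect the next k payload bits, most significant first (0-padded).
        value = 0
        for _ in range(k):
            value = (value << 1) | next(stream, 0)
        return value

    out_l, out_r = [], []
    for l, r in zip(left, right):
        out_l.append((l & ~((1 << k1) - 1)) | take(k1))  # K1 low bits of left
        out_r.append((r & ~((1 << k2) - 1)) | take(k2))  # K2 low bits of right
    return out_l, out_r
```

Because the payload follows the same left/right sample interleaving as the PCM stream itself, a decoder can consume it in arrival order.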
Fig. 21 is a diagram of a fourth method of embedding a spatial information bitstream in an audio signal downmixed on at least one channel according to the present invention. Fig. 21 shows a downmix signal having two channels, but the present invention is not limited to this.
Referring to Fig. 21, the fourth method is carried out by embedding the spatial information dispersed over at least one channel. Specifically, the spatial information is embedded so that the embedding order alternates between the two channels in bit-plane units starting from the LSB.
Since the signal characteristics of the two channels of the downmix signal differ from each other, different K values (K1 and K2) can be allocated to the two channels by finding the masking threshold of each channel separately. Specifically, as shown in the figure, K1 and K2 can be allocated to the two channels, respectively.
The K values can differ from block to block. For example, the spatial information is placed, in order, in the least significant 1 bit of sample-1 of one channel (e.g., the left channel), the least significant 1 bit of sample-1 of the other channel (e.g., the right channel), the least significant 1 bit of sample-2 of the former channel (e.g., the left channel), and the least significant 1 bit of sample-2 of the latter channel (e.g., the right channel). In the figure, the numerals in the blocks indicate the order in which the spatial information is filled.
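The bit-plane alternation of the fourth method could be sketched as below. The sketch assumes two channels of equal length and skips a channel on any bit plane at or above that channel's K value, which is one plausible reading when K1 and K2 differ; the name `embed_bitplane_interleaved` is hypothetical.

```python
def embed_bitplane_interleaved(left, right, bits, k1, k2):
    """Fill plane 0 (the LSB) of L1, R1, L2, R2, ..., then plane 1, and so on."""
    out = [list(left), list(right)]
    ks = (k1, k2)
    stream = iter(bits)
    for plane in range(max(ks)):
        for i in range(len(left)):
            for ch in (0, 1):
                if plane < ks[ch]:               # this channel uses this plane
                    bit = next(stream, 0)        # pad with zeros when exhausted
                    cleared = out[ch][i] & ~(1 << plane)
                    out[ch][i] = cleared | (bit << plane)
    return out[0], out[1]
```

Filling the lowest plane first means the earliest payload bits always sit in the LSBs, which suits data that has been rearranged in bit-plane units.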
When the audio signal is stored in a storage medium having no auxiliary data area (e.g., a stereo CD) or is transmitted via SPDIF or the like, the L/R channels are interleaved in sample units. Hence, if the audio signal is stored by the third or fourth method, it is advantageous for the decoder to process the audio signal in the order received.
And the fourth method is applicable to a case in which the spatial information bitstream is stored after being rearranged in bit-plane units.
As mentioned in the foregoing description, when the spatial information bitstream is embedded dispersed over two channels, different K values can be allocated to the respective channels. In this case, the K value of each channel can be transmitted separately within the bitstream. When a plurality of K values are transmitted, differential coding can be applied to encode the K values.
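Differential coding of a sequence of K values can be illustrated as follows. The text does not specify the exact differential scheme, so this sketch assumes the simplest form, in which the first K value is sent as-is and each later value is sent as the difference from its predecessor; the function names are hypothetical and the input list is assumed to be non-empty.

```python
def diff_encode(k_values):
    """Send the first K value as-is, then successive differences."""
    return [k_values[0]] + [b - a for a, b in zip(k_values, k_values[1:])]


def diff_decode(diffs):
    """Rebuild the K values by accumulating the differences."""
    k_values = [diffs[0]]
    for d in diffs[1:]:
        k_values.append(k_values[-1] + d)
    return k_values
```

Since neighboring channels and blocks tend to have similar K values, the differences are usually small and can be represented with fewer bits than the values themselves.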
Fig. 22 is a diagram of a fifth method of embedding a spatial information bitstream in an audio signal downmixed on at least one channel according to the present invention. Fig. 22 shows a downmix signal having two channels, but the present invention is not limited to this.
Referring to Fig. 22, the fifth method is carried out by embedding the spatial information dispersed over the two channels. Specifically, the fifth method is carried out by inserting the same value repeatedly in each of the two channels.
In this case, a value having the same sign can be inserted in each of the at least two channels, or values differing in sign can be inserted in the at least two channels, respectively.
For example, a value of 1 is inserted in each of the two channels, or values of 1 and -1 are inserted alternately in the two channels, respectively.
The fifth method has the advantage that transmission errors can be checked easily by comparing the least significant insertion bits (e.g., K bits) of at least one channel.
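A transmission check of the kind the fifth method enables might look like the sketch below. It covers only the case in which identical values are inserted in both channels, and it assumes the same K for both; the name `channels_agree` is hypothetical.

```python
def channels_agree(left, right, k):
    """Compare the k low-order bits of each left/right sample pair."""
    mask = (1 << k) - 1
    return [(l & mask) == (r & mask) for l, r in zip(left, right)]
```

A `False` entry marks a sample pair whose embedded bits no longer match, which points to a transmission or storage error at that position.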
In particular, when a mono audio signal is delivered on a stereo medium such as a CD, channel-L (the left channel) and channel-R (the right channel) of the downmix signal are identical to each other, so robustness and the like can be improved by equalizing the inserted spatial information. In this case, the spatial information can be embedded in each channel in bit-plane order starting from the LSB or in sample-plane order.
Fig. 23 is a diagram of a sixth method of embedding a spatial information bitstream in an audio signal downmixed on at least one channel according to the present invention.
The sixth method relates to a method of inserting spatial information in a downmix signal having at least one channel when the frame of each channel includes a plurality of blocks (each of length B).
Referring to Fig. 23, the insertion bit lengths (i.e., the K values) can have different values per channel and block, or every channel and block can have the same value.
The insertion bit lengths (e.g., K1, K2, K3, and K4) can be stored in a frame header that is transmitted once per whole frame. And the frame header can be placed in the LSB. In this case, the header can be inserted in bit-plane units. And the spatial information data can be inserted alternately in sample units or block units. In Fig. 23, the number of blocks in a frame is 2. Hence, the block length (B) is N/2. In this case, the number of bits inserted in the frame is (K1+K2+K3+K4)*B.
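The bit count (K1+K2+K3+K4)*B stated above can be reproduced with a small helper. The frame length of 1024 samples and the K values in the usage note are arbitrary illustrative numbers, and `bits_per_frame` is a hypothetical name.

```python
def bits_per_frame(k_values, frame_len, blocks_per_frame):
    """Total inserted bits when each K value applies to one block of one channel."""
    block_len = frame_len // blocks_per_frame    # B = N / (number of blocks)
    return sum(k_values) * block_len
```

For a frame of N = 1024 samples split into two blocks (B = 512) with K1..K4 = 2, 1, 3, 2, the frame carries (2+1+3+2)*512 = 4096 bits.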
Fig. 24 is a diagram of a seventh method of embedding a spatial information bitstream in an audio signal downmixed on at least one channel according to the present invention. Fig. 24 shows a downmix signal having two channels, but the present invention is not limited to this.
Referring to Fig. 24, the seventh method is carried out by embedding the spatial information dispersed over the two channels. Specifically, the seventh method is characterized by mixing a method of inserting the spatial information alternately in the two channels in bit-plane order starting from the LSB or MSB with a method of inserting the spatial information alternately in the two channels in sample-plane order.
The method can be carried out per frame or per block.
The shaded parts 1 to C shown in Fig. 24 correspond to the header and can be inserted in the LSB or MSB in bit-plane order to facilitate the search for the insertion frame sync word.
The other parts (non-shaded parts), C+1 and higher, correspond to the portion other than the header and can be inserted alternately in the two channels in sample units to facilitate extraction of the spatial information data. The insertion bit lengths (e.g., the K values) can have the same or different values per channel and block. And all the insertion bit lengths can be included in the header.
Figure 25 is a flowchart of a method of encoding spatial information to be embedded in a downmix signal having at least one channel according to the present invention.
Referring to Figure 25, an audio signal is downmixed from a multi-channel audio signal into one channel (2501, 2502), and spatial information is extracted from the multi-channel audio signal (2501, 2503).
A spatial information bitstream is then generated using the extracted spatial information (2504).
The spatial information bitstream is embedded in the downmix signal having at least one channel (2505). In this case, one of the seven methods of embedding the spatial information bitstream in at least one channel can be used.
Then, the whole bitstream, including the downmix signal in which the spatial information bitstream is embedded, is transmitted (2506). In this case, the present invention finds the K value using the downmix signal and embeds the spatial information bitstream in the K bits.
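The flow of Figure 25 can be sketched end to end in Python. Every concrete rule below is an assumption made for illustration: the averaging downmix, the single 4-bit level-difference index used as "spatial information", and in particular `choose_k`, which is only a placeholder for finding K from the downmix signal and is not the rule used by the invention.

```python
def downmix(left, right):
    """Step 2502 (assumed rule): average two channels into one channel."""
    return [(l + r) // 2 for l, r in zip(left, right)]


def extract_spatial_info(left, right):
    """Step 2503 (assumed cue): one coarse level-difference index, 0..15."""
    e_left = sum(x * x for x in left) + 1
    e_right = sum(x * x for x in right) + 1
    ratio = e_left / (e_left + e_right)      # strictly between 0 and 1
    return min(15, int(ratio * 16))


def make_bitstream(index, width=4):
    """Step 2504: serialize the index as `width` bits, MSB first."""
    return [(index >> (width - 1 - b)) & 1 for b in range(width)]


def choose_k(signal, max_k=4):
    """Hypothetical placeholder: louder frames are given a larger K."""
    peak = max(abs(x) for x in signal)
    return max(1, min(max_k, peak.bit_length() - 8))


def embed(signal, bits, k):
    """Step 2505: write the bits, K per sample, into the K LSBs."""
    out = list(signal)
    mask = (1 << k) - 1
    padded = list(bits) + [0] * (-len(bits) % k)
    for slot in range(len(padded) // k):
        value = 0
        for bit in padded[slot * k:(slot + 1) * k]:
            value = (value << 1) | bit
        out[slot] = (out[slot] & ~mask) | value
    return out


def encode(left, right):
    """Steps 2502 to 2505; the result would then be transmitted (2506)."""
    mono = downmix(left, right)
    bits = make_bitstream(extract_spatial_info(left, right))
    k = choose_k(mono)
    return embed(mono, bits, k), k


print(encode([1000, -1000, 1000, -1000], [0, 0, 0, 0]))
# ([501, -499, 501, -499], 1)
```

In the printed example the downmix is [500, -500, 500, -500], the index is 15 (bits 1, 1, 1, 1), and K is 1, so each sample has its LSB set to 1.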
Figure 26 is a flowchart of a method of decoding a spatial information bitstream embedded in a downmix signal having at least one channel according to the present invention.
Referring to Figure 26, a spatial decoder receives a bitstream including a downmix signal in which a spatial information bitstream is embedded (2601).
The downmix signal is detected from the received bitstream (2602).
The spatial information bitstream embedded in the downmix signal having at least one channel is extracted from the received bitstream and decoded (2603).
Then, the downmix signal is converted into a multi-channel signal using the spatial information obtained by decoding (2604).
The present invention extracts identification information on the order in which the spatial information bitstream is embedded, and can extract and decode the spatial information bitstream using this identification information.
In addition, the present invention extracts information on the K value from the spatial information bitstream and can decode the spatial information bitstream using the K value.
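The extraction step (2603) can be sketched as follows. The header layout is an assumption made for illustration: K - 1 is taken to be stored in the LSBs of the first two samples (the constant `K_FIELD_BITS` and both function names are hypothetical), and the payload is taken to follow in the K LSBs of the subsequent samples. The conversion to a multi-channel signal (2604) is outside the scope of this sketch.

```python
K_FIELD_BITS = 2  # hypothetical header size: K - 1 stored in 2 bits (K = 1..4)


def read_k(samples):
    """Read K from the LSBs of the first K_FIELD_BITS samples (assumed layout)."""
    value = 0
    for s in samples[:K_FIELD_BITS]:
        value = (value << 1) | (s & 1)
    return value + 1


def extract_bits(samples, num_bits):
    """Recover num_bits payload bits using the K value read from the header."""
    k = read_k(samples)
    mask = (1 << k) - 1
    bits = []
    for s in samples[K_FIELD_BITS:]:
        if len(bits) >= num_bits:
            break
        value = s & mask                      # K LSBs, also valid for negatives
        bits.extend((value >> (k - 1 - b)) & 1 for b in range(k))
    return k, bits[:num_bits]


# Header LSBs 0, 1 give K = 2; the next two samples carry bits 10 and 11.
print(extract_bits([8, 9, 102, -5], 4))  # (2, [1, 0, 1, 1])
```

Because Python integers behave as two's complement values under bitwise operations, masking a negative PCM sample with `(1 << k) - 1` returns its K low-order bits directly.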
Industrial Applicability
Accordingly, the present invention provides the following effects or advantages.
First, when a multi-channel audio signal is coded according to the present invention, spatial information is embedded in the downmix signal. Therefore, a multi-channel audio signal can be stored in, and reproduced from, a storage medium that has no auxiliary data area (e.g., a stereo CD) or an audio format that has no auxiliary data area.
Second, spatial information can be embedded in the downmix signal with a variable frame length or a fixed frame length, and can be embedded in a downmix signal having at least one channel. Therefore, the present invention improves encoding and decoding efficiency.
Although the present invention has been described and illustrated herein with reference to its preferred embodiments, it will be apparent to those skilled in the art that various modifications and variations can be made therein without departing from the spirit and scope of the invention. Thus, the present invention is intended to cover the modifications and variations that come within the scope of the appended claims and their equivalents.

Claims (15)

1. A method of decoding an audio signal, comprising:
extracting side information embedded in the audio signal, wherein the side information is dispersed over at least two channels of the audio signal; and
decoding the audio signal using the side information,
wherein:
the at least two channels comprise a left channel signal and a right channel signal,
the side information comprises a header and frame data of spatial information,
information of the header comprises insertion bit lengths corresponding to the at least two channels, and
the frame data of the spatial information is alternately embedded in the channels by sample unit.
2. The method of claim 1, wherein the side information is embedded in an insertion area of the audio signal by block unit.
3. The method of claim 1, wherein the side information in the insertion area is embedded starting from the most significant bit (MSB) or the least significant bit (LSB).
4. The method of claim 1, further comprising extracting synchronization information of the side information from the at least two channels of the audio signal.
5. The method of claim 1, wherein the side information is repeatedly inserted in the audio signal having at least two channels with the same value or with values of opposite sign.
6. The method of claim 1, wherein the header is embedded in bit-plane order in the audio signal having the at least two channels.
7. The method of claim 1, wherein the audio signal comprises a downmix audio signal of a multi-channel signal.
8. The method of claim 1, wherein the side information comprises spatial information of a multi-channel signal.
9. An apparatus for decoding an audio signal, comprising:
an embedded signal decoding unit for decoding the audio signal and extracting side information embedded in the audio signal, wherein the side information is dispersed over at least two channels of the audio signal; and
a multi-channel generating unit for decoding the audio signal using the side information,
wherein:
the at least two channels comprise a left channel signal and a right channel signal,
the side information comprises a header and frame data of spatial information,
information of the header comprises insertion bit lengths corresponding to the at least two channels, and
the frame data of the spatial information is alternately embedded in the channels by sample unit.
10. The apparatus of claim 9, wherein the side information is embedded in an insertion area of the audio signal by block unit.
11. The apparatus of claim 9, wherein the side information in the insertion area is embedded starting from the most significant bit (MSB) or the least significant bit (LSB).
12. The apparatus of claim 9, wherein the embedded signal decoding unit further extracts synchronization information of the side information from the at least two channels of the audio signal.
13. The apparatus of claim 9, wherein the embedded signal decoding unit decodes a header that is embedded in bit-plane order in the audio signal having the at least two channels.
14. A method of encoding an audio signal, comprising:
generating side information required for decoding the audio signal; and
embedding the side information in the audio signal, wherein the side information is dispersed over at least two channels of the audio signal,
wherein:
the at least two channels comprise a left channel signal and a right channel signal,
the side information comprises a header and frame data of spatial information,
information of the header comprises insertion bit lengths corresponding to the at least two channels, and
the frame data of the spatial information is alternately embedded in the channels by sample unit.
15. An apparatus for encoding an audio signal, comprising:
an audio signal generating unit for generating a downmixed audio signal from a multi-channel audio signal;
a side information generating unit for generating side information from the multi-channel audio signal;
a side information encoding unit for encoding the generated side information; and
an embedding unit for embedding the side information in the audio signal, wherein the side information is dispersed over at least two channels of the audio signal,
wherein:
the at least two channels comprise a left channel signal and a right channel signal,
the side information comprises a header and frame data of spatial information,
information of the header comprises insertion bit lengths corresponding to the at least two channels, and
the frame data of the spatial information is alternately embedded in the channels by sample unit.
CN2006800263123A 2005-05-26 2006-05-26 Method of encoding and decoding an audio signal Active CN101223579B (en)

Applications Claiming Priority (19)

Application Number Priority Date Filing Date Title
US68457805P 2005-05-26 2005-05-26
US60/684,578 2005-05-26
US75860806P 2006-01-13 2006-01-13
US60/758,608 2006-01-13
US78717206P 2006-03-30 2006-03-30
US60/787,172 2006-03-30
KR1020060030658A KR20060122692A (en) 2005-05-26 2006-04-04 Method of encoding and decoding down-mix audio signal embeded with spatial bitstream
KR1020060030658 2006-04-04
KR1020060030661A KR20060122694A (en) 2005-05-26 2006-04-04 Method of inserting spatial bitstream in at least two channel down-mix audio signal
KR1020060030660 2006-04-04
KR1020060030661 2006-04-04
KR10-2006-0030661 2006-04-04
KR10-2006-0030660 2006-04-04
KR1020060030660A KR20060122693A (en) 2005-05-26 2006-04-04 Modulation for insertion length of saptial bitstream into down-mix audio signal
KR10-2006-0030658 2006-04-04
KR1020060046972 2006-05-25
KR10-2006-0046972 2006-05-25
KR1020060046972A KR20060122734A (en) 2005-05-26 2006-05-25 Encoding and decoding method of audio signal with selectable transmission method of spatial bitstream
PCT/KR2006/002020 WO2006126858A2 (en) 2005-05-26 2006-05-26 Method of encoding and decoding an audio signal

Publications (2)

Publication Number Publication Date
CN101223579A CN101223579A (en) 2008-07-16
CN101223579B true CN101223579B (en) 2013-02-06

Family

ID=39406062

Family Applications (4)

Application Number Title Priority Date Filing Date
CN2006800263123A Active CN101223579B (en) 2005-05-26 2006-05-26 Method of encoding and decoding an audio signal
CN2006800263119A Active CN101258538B (en) 2005-05-26 2006-05-26 Method of encoding and decoding an audio signal
CN200680018078XA Active CN101180674B (en) 2005-05-26 2006-05-26 Method of encoding and decoding an audio signal
CN2006800263104A Active CN101253550B (en) 2005-05-26 2006-05-26 Method of encoding and decoding an audio signal

Country Status (1)

Country Link
CN (4) CN101223579B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2297728B1 (en) 2008-07-01 2011-12-21 Nokia Corp. Apparatus and method for adjusting spatial cue information of a multichannel audio signal
JP5258967B2 (en) 2008-07-15 2013-08-07 エルジー エレクトロニクス インコーポレイティド Audio signal processing method and apparatus
EP2146342A1 (en) 2008-07-15 2010-01-20 LG Electronics Inc. A method and an apparatus for processing an audio signal
CN101662688B (en) * 2008-08-13 2012-10-03 韩国电子通信研究院 Method and device for encoding and decoding audio signal
CN101340191B (en) * 2008-08-19 2013-07-31 无锡中星微电子有限公司 Decoder and decoding method
US9514768B2 (en) 2010-08-06 2016-12-06 Samsung Electronics Co., Ltd. Audio reproducing method, audio reproducing apparatus therefor, and information storage medium
TWI517142B (en) 2012-07-02 2016-01-11 Sony Corp Audio decoding apparatus and method, audio coding apparatus and method, and program
US10083700B2 (en) 2012-07-02 2018-09-25 Sony Corporation Decoding device, decoding method, encoding device, encoding method, and program
CA2843226A1 (en) 2012-07-02 2014-01-09 Sony Corporation Decoding device, decoding method, encoding device, encoding method, and program
WO2014011487A1 (en) * 2012-07-12 2014-01-16 Dolby Laboratories Licensing Corporation Embedding data in stereo audio using saturation parameter modulation
US9445197B2 (en) * 2013-05-07 2016-09-13 Bose Corporation Signal processing for a headrest-based audio system
EP3312835B1 (en) 2013-05-24 2020-05-13 Dolby International AB Efficient coding of audio scenes comprising audio objects
GB2515539A (en) 2013-06-27 2014-12-31 Samsung Electronics Co Ltd Data structure for physical layer encapsulation
EP4199544A1 (en) 2014-03-28 2023-06-21 Samsung Electronics Co., Ltd. Method and apparatus for rendering acoustic signal
EP3198594B1 (en) * 2014-09-25 2018-11-28 Dolby Laboratories Licensing Corporation Insertion of sound objects into a downmixed audio signal
EP3201916B1 (en) * 2014-10-01 2018-12-05 Dolby International AB Audio encoder and decoder
CN107782977A (en) * 2017-08-31 2018-03-09 苏州知声声学科技有限公司 Multiple usb data capture card input signal Time delay measurement devices and measuring method
CN112639884A (en) * 2018-08-30 2021-04-09 松下电器(美国)知识产权公司 Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
CN109785849B (en) * 2019-01-17 2020-11-27 福建歌航电子信息科技有限公司 Method for inserting unidirectional control information into pcm audio stream based on iis transmission

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5606618A (en) * 1989-06-02 1997-02-25 U.S. Philips Corporation Subband coded digital transmission system using some composite signals

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
NL8901032A (en) * 1988-11-10 1990-06-01 Philips Nv CODER FOR INCLUDING ADDITIONAL INFORMATION IN A DIGITAL AUDIO SIGNAL WITH A PREFERRED FORMAT, A DECODER FOR DERIVING THIS ADDITIONAL INFORMATION FROM THIS DIGITAL SIGNAL, AN APPARATUS FOR RECORDING A DIGITAL SIGNAL ON A CODE OF RECORD. OBTAINED A RECORD CARRIER WITH THIS DEVICE.
DD289172A5 (en) * 1988-11-29 1991-04-18 N. V. Philips' Gloeilampenfabrieken,Nl ARRANGEMENT FOR THE PROCESSING OF INFORMATION AND RECORDING RECEIVED BY THIS ARRANGEMENT
TR200002630T1 (en) * 1999-01-13 2000-12-21 Koninklijke Philips Electronics N.V. Adding complementary data to an encoded signal
JP4470322B2 (en) * 1999-03-19 2010-06-02 ソニー株式会社 Additional information embedding method and apparatus, additional information demodulation method and demodulating apparatus
DE60223067T2 (en) * 2001-10-17 2008-08-21 Koninklijke Philips Electronics N.V. DEVICE FOR CODING AUXILIARY INFORMATION IN A SIGNAL

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Hosoi, S. et al..Audio coding using the best level wavelet packet transform and auditory masking.《Signal Processing Proceedings, 1998. ICSP "98. 1998 Fourth International Conference on》.1998,第2卷 *
Stoll, G..MPEG audio layer II. A generic coding standard for two and multichannel sound for DVB, DAB and.《Broadcasting Convention, 1995. IBC 95., International》.1995, *
W. R. TH. TEN KATE et al..A New Surround-Stereo-Surround Coding Technique.《Journal of Audio Engineering Society》.1992,第40卷(第5期), *

Also Published As

Publication number Publication date
CN101258538A (en) 2008-09-03
CN101258538B (en) 2013-06-12
CN101253550A (en) 2008-08-27
CN101223579A (en) 2008-07-16
CN101253550B (en) 2013-03-27
CN101180674A (en) 2008-05-14
CN101180674B (en) 2012-01-04

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant