TITLE: IMPROVED METHOD FOR COMPRESSION OF A PULSE TRAIN
BACKGROUND OF THE INVENTION
Field of the Invention
The invention relates to a method for data compression of nonperiodic signals. More particularly, the invention relates to a method, using a novel set of mathematical basis functions, for reducing the number of bits in a digital representation of an electrocardiogram (ECG) signal.
Description of Prior Art
The desire for compression of ECG data streams is motivated by their ubiquity as well as the large amounts of data which are generated. In a typical clinical setting, a single lead (or channel) of ECG data will yield 250 samples per second of 12-bit data, yielding 32.4 Mbytes of packed binary data per day per channel. Usually a patient will have two or three leads of ECG data being taken, with many more leads in some clinical situations. Furthermore, a cardiac ward will have many beds leading to a single central station, as many as 64 in a large state-of-the-art cardiac ward. These figures imply an amount of data accumulated in a central station equivalent to tens of Gbytes per day. Although computer mass memory has become less expensive in recent years, this amount of memory is still prohibitive in practice. Data compression allows for more data to be stored using less memory and, therefore, is extremely useful.
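The per-channel figure quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function names are hypothetical, and the twelve-lead case is an assumed example of "many more leads" rather than a figure taken from the text.

```python
SAMPLES_PER_SECOND = 250
BITS_PER_SAMPLE = 12
SECONDS_PER_DAY = 24 * 60 * 60


def mbytes_per_channel_day():
    """Packed binary data per lead per day, in decimal megabytes."""
    bits = SAMPLES_PER_SECOND * BITS_PER_SAMPLE * SECONDS_PER_DAY
    return bits / 8 / 1e6


def ward_gbytes_per_day(leads_per_bed, beds):
    """Data reaching a central station per day, in decimal gigabytes."""
    return mbytes_per_channel_day() * leads_per_bed * beds / 1e3


print(mbytes_per_channel_day())        # one lead
print(ward_gbytes_per_day(3, 64))      # three leads, 64 beds
print(ward_gbytes_per_day(12, 64))     # assumed twelve leads, 64 beds
```

Under these assumptions, three leads in 64 beds give roughly 6 Gbytes per day, and the assumed twelve-lead case gives roughly 25 Gbytes per day.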
Aside from the advantages data compression affords for data storage, data compression allows significant engineering advantages for telemetered ECG data streams. In a clinical situation involving ECG monitored ambulatory patient units, a patient's recorded data must either be stored in a memory device sufficiently small to be worn comfortably or the data must be telemetered to a central station. Unfortunately, data telemetry imposes limits on the available bandwidth of data transmission and on ambulatory unit battery life. Data compression, however, can improve data telemetry by increasing the speed of data
transmission and by reducing power consumption. Transferring data in a continuous mode, as is done for uncompressed data, requires more energy than transferring data in small packets in a burst mode, as is done with compressed data. As a result, data compression before telemetering reduces telemeter power consumption. Reduction of power consumption is particularly important for ambulatory patient units operating on battery power.
For the above mentioned reasons, data compression is highly desirable from an engineering standpoint. From the clinical standpoint, however, it is highly desirable that no distortion of the ECG signal arise due to the compression/decompression process. Details of the shape of the ECG waveform are required for diagnostic techniques such as ST segment analysis. A conservative requirement for the performance of an ECG compression method is that the errors introduced by the compression/decompression process be smaller than or comparable to the level of electrical noise in the ECG signal.
Motivated by the economic pressures discussed above, many inventors have developed data compression methods of various types for ECG signals. One class of compression methods which is capable of replicating the original ECG signal exactly includes methods based on information-theoretic techniques such as Huffman coding. Huffman coding is disclosed in a 1952 article by Huffman: A method for the construction of minimum-redundancy codes.
The above mentioned techniques are well suited for computer applications, but are too computationally intensive for effective application in medical instruments. Furthermore, there is no legitimate requirement for reproducing noise in the data stream and so a bit-wise accurate compression/decompression method is not required. In any event, these methods are sufficiently computationally intensive as to be unsuitable for real-time applications.
Several approaches for lossy compression have also been developed. One well known lossy compression method is the AZTEC method (Amplitude Zone Time Epoch Coding). The AZTEC method decomposes ECG sample data points into a piece-wise continuous collection of amplitudes and slopes. The AZTEC method is disclosed in a 1968 article by
Cox: AZTEC: a preprocessing program for real-time ECG rhythm analysis. The AZTEC method was originally developed for rhythm analysis and so does not have sufficient fidelity for reproduction of the ECG waveform to allow for a more sophisticated analysis such as ST segment analysis.
Other data compression methods use a mean ECG waveform obtained by averaging many beats and then compute residuals from this mean waveform and encode them by various techniques ranging from simple processes such as run-length encoding to more complex processes such as Huffman coding. These methods are more suitable for batch processing and suffer from lack of flexibility when the waveform's basic shape changes with time, as does an ECG waveform. An example of this class of methods is discussed in a 1992 article by Reddy et al.: Data compression for storage of resting ECGs digitized at 500 samples/second.
Another group of data compression methods involve the use of truncated Fourier series, i.e. a set of sine and cosine waves, each of which individually oscillates about zero. However, a truncated Fourier series or any other mathematical option similar to it is a poor choice for representing a pulse or a series of pulses, since these series representations are very poorly convergent.
U.S. Pat. No. 5,596,658 discloses a method that involves representing data points with a polynomial expression obtained using a linear regression analysis and a least-squares regression analysis. Rather than storing the actual sample data points, only the coefficients of the function are stored. A desired data point, using this method, is approximated from the function rather than simply being retrieved from memory (as would be done if there were no compression). Data compression is accomplished if the number of coefficients necessary to identify the function is less than the number of data points being stored. This method, like the truncated Fourier series, cannot achieve the same level of data compression as the method herein disclosed.
While the above mentioned methods may be suitable for the particular purpose employed, or for general use, they would not be as suitable for the purpose of the present invention as disclosed hereafter.
Recently, a number of inventors have attempted to develop wavelet-based ECG data compression methods. Thus far, these methods have yielded only modest gains in compression ratio as compared to other methods currently in use. The compression ratio is the number of bits in the original digitized stream divided by the number of bits required for the representation of the ECG signal in terms of the parameterized mathematical representation.
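The definition of compression ratio given above can be written out as a short calculation. This is a minimal sketch; the function name, the beat length, the number of stored values, and the 16-bit width per stored value are all hypothetical and are not taken from the disclosure.

```python
def compression_ratio(n_samples, bits_per_sample, n_values, bits_per_value):
    """Bits in the original digitized stream divided by the bits needed
    for the parameterized representation."""
    original_bits = n_samples * bits_per_sample
    compressed_bits = n_values * bits_per_value
    return original_bits / compressed_bits


# Hypothetical beat: 250 samples at 12 bits, represented by 40 stored
# values (coefficients plus timing values) at an assumed 16 bits each.
ratio = compression_ratio(250, 12, 40, 16)
print(ratio)
```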
One group of data compression methods uses wavelets with shapes poorly adapted for the compression of ECG waveforms. Examples of these methods can be found in the biomedical signal processing literature: a 1992 article by Crowe: Wavelet transform as a potential tool for ECG analysis and compression; a 1994 article by Cetin: Coding of ECG signals by wavelet transform extrema; a 1995 article by Krishnamurthy et al.: Adaptive wavelet packet decomposition for ECG data compression; and a 1995 article by Bradie: Wavelet packet-based compression of single lead ECG.
Jane et al., in Adaptive Hermite Models for ECG Data Compression: Performance and Evaluation with Automatic Wave Detection (Proceedings of the Computers in Cardiology Conference, London, Sept. 5-8, 1993, pages 389-392), disclose an orthogonal transformation based on Hermite functions as a method for ECG data compression. In order to apply the procedure, four signal windows are selected in each beat, corresponding to the principal ECG features: P wave, QRS complex, ST segment, and T wave. Jane does not disclose segmenting the waveform into beats for optimal compression results.
SUMMARY OF THE INVENTION
Accordingly, it is an object of the invention to produce a data compression method and device that uses a basis set derived from Associated Hermite or Hermite-Rodriguez wavelets to reduce the number of bits in a digital representation of a nonperiodic signal such as an ECG signal, and thereby compactly represent said nonperiodic signal.
It is another object of the invention to produce a data compression method and device that computes the representation of a signal efficiently using the orthogonality properties of the Associated Hermite or Hermite-Rodriguez wavelet basis set.
It is a further object of the invention to produce a data compression method and device using a basis function, incorporating Hermite polynomials, which is better suited to the ECG waveform morphology, and can therefore reach high compression ratios while still retaining good fidelity of the reconstructed waveform.
It is a still further object of the invention to produce a data compression method and device that can be used to facilitate accurate storage and transfer of a nonperiodic signal such as an ECG waveform.
It is yet a further object of the invention to produce a data compression method and device that effectively optimize the Associated Hermite wavelet or Hermite-Rodriguez representation of a nonperiodic signal by compressing each heartbeat pulse individually rather than compressing a fixed-size buffer of data.
It is still yet a further object of the invention to produce a data compression method and device that produces a high fidelity reconstruction of the original waveform which a physician may use for diagnostic purposes.
It is still another object of the invention to produce a data compression method and device that removes noise from a data stream by limiting the components included in the Associated Hermite wavelet representation to a finite set of terms.
It is still yet another object of the invention to produce a data compression method and device that produces a compressed representation of physiological signal data points that can be used for analysis and classification of physiological signals.
The invention is a method for compressing a nonperiodic signal, such as an electrocardiograph signal, using Associated Hermite or Hermite-Rodriguez wavelets and weighting coefficients in a mathematical linear combination in which a mathematical series is constructed. Said series is indexed by the mathematical order of the wavelets, and said weighting coefficients are computed using the orthogonality properties of the wavelets. Each heartbeat is compressed individually, and the amount of noise in the reconstructed signal is controlled by varying the number of terms in the mathematical series.
To the accomplishment of the above and related objects the invention may be embodied in the form illustrated in the accompanying drawings. Attention is called to the fact, however, that the drawings are illustrative only. Variations are contemplated as being part of the invention, limited only by the scope of the claims.
BRIEF DESCRIPTION OF THE DRAWINGS
In the drawings, like elements are depicted by like reference numerals. The drawings are briefly described as follows.
Figure 1 illustrates a typical ECG signal.
Figure 2 is a block diagram illustrating the route that data takes in an ECG system which does not incorporate a data compression method.
Figure 3 is a block diagram illustrating the route that data takes in an ECG system incorporating a data compression method.
Figure 4-A illustrates a plot of a rectangular pulse reconstruction.
Figure 4-B illustrates a plot of a triangular pulse reconstruction.
Figure 4-C illustrates a plot of a function with odd symmetry.
Figure 5-A 1 is a plot of a normal ECG beat, taken from the MIT database file number 105, and the corresponding reconstructed beat.
Figure 5-A 2 is the same plot as in figure 5-A 1 but with the reconstructed beat shifted down by 100 mV for clarity.
Figure 5-B 1 is a plot of a normal ECG beat, taken from the MIT database file number 123, and the corresponding reconstructed beat.
Figure 5-B 2 is a plot of the same data as in figure 5-B 1 but with the reconstructed beat shifted down by 100 mV for clarity.
Figure 5-C 1 is a plot of an abnormal ECG beat, taken from the MIT database file number 124, and the corresponding reconstructed beat.
Figure 5-C 2 is a plot of the same data as in figure 5-C 1 but with the reconstructed beat shifted down by 100 mV for clarity.
Figure 6-A illustrates the Hermite polynomial expanded up to the 12th order.
Figure 6-B illustrates the Associated Hermite wavelets defined up to the 12th order, used here in reference to the square wave example.
Figure 6-C illustrates integrals which need to be solved in order to arrive at the wavelet coefficients, a_0 through a_12, for the square wave example.
Figure 6-D is a table of coefficient values calculated for the square wave example.
Figure 6-E illustrates a series of defining equation partial sums used for the square wave example.
Figure 7A illustrates two successive ECG beats that were reconstructed without first undergoing the disclosed transformation.
Figure 7B illustrates an ECG beat and an interpolation function, which is used to transform the ECG beat.
Figure 7C illustrates the ECG beat of FIG. 7B after undergoing the disclosed transformation.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Figure 1 illustrates a typical ECG waveform. The signal (12) consists of a number of data points (14). Each data point (14) represents the amplitude of the signal at a given moment of time.
Figure 2 illustrates the route that data takes in an ECG system (16) which does not incorporate a data compression method. An ECG (18) generates an analog signal (20). This analog signal (20) is first converted to a digital signal (24) by an analog-to-digital converter (22). The digital signal (24) comprises a piecewise numeric representation of the analog signal. A memory unit (26) stores the digitized version of each data point.
Figure 3 illustrates the route that data takes in an ECG system (28) incorporating a data compression method. As in the ECG system (16) illustrated in figure 2, an analog signal (32) output by an ECG (30) is first converted to a digital signal (36) by an analog-to-digital converter (34). In the ECG system (28) of figure 3, however, the digital signal (36) is next fed into a compression algorithm (38). The algorithm (38) executes the following operations.
The algorithm (38) segments the digital signal (36) (the ECG data stream) into a set of heartbeat pulses. This is accomplished by using a conventional QRS detection algorithm such as the algorithm published by Pan and Tompkins, but any reliable QRS detection algorithm will serve. Next, before computing a defining function, the compression algorithm (38) transforms each segment so that its first data point and its last data point are equal to zero. This transformation is necessary because it is difficult to represent a signal whose end points are offset from zero using a wavelet basis function which oscillates about zero. This step will be discussed in further detail after the general compression procedure is described.
The algorithm (38) computes the Associated Hermite wavelet transform of each segment of the ECG waveform. An appropriate zero time is assigned to each ECG waveform segment, which is taken to be the maximum of the R wave component of the ECG waveform in a typical embodiment, but can be selected according to other criteria as well. The coefficients of the Associated Hermite wavelet transform are computed according to the formula
$$a_i = \int_{\text{segment}} w(t)\, U_i(t)\, dt \qquad (1)$$
where w(t) is the waveform and U_i(t) is the ith Associated Hermite wavelet as defined below, with t the time relative to the zero point chosen for the waveform segment. In the case of discretely sampled data, as in common engineering practice, this integral is written as a summation in the conventional fashion. The compressed representation of the ECG signal (39) output by the algorithm to a memory unit (40) consists of a list of numerical values containing the beginning and ending times of the waveform segment, the zero time for the segment, and a sufficiently large set of wavelet coefficients to allow subsequent reconstruction of the ECG waveform to a desired level of accuracy. One of these lists will be generated for each data segment, i.e. each beat, in a long set of data.
In order to decompress the waveform, the algorithm (38) accesses the list of numbers in the compressed waveform of the desired time segment of the data and then uses the segment times and the coefficients of the wavelet transform to evaluate the sum
$$w(t) \approx \sum_{i=0}^{n} a_i\, U_i(t) \qquad (2)$$
between the start time and end time of the desired data segment, where the a_i are the stored wavelet coefficients. If the waveform contains more than one beat, as is usually the case, the algorithm must successively evaluate the formula (equation 2) over a set of time segments to reconstruct the entire desired waveform.
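The compression and decompression steps described above can be sketched in a few lines. This is a minimal illustration, not the disclosed implementation: it assumes uniformly sampled data, and it assumes that the Associated Hermite wavelets are the orthonormal Hermite functions scaled by λ, U_i(t) = ψ_i(t/λ)/√λ, which is a common convention and not a form quoted from this document. The names `hermite_basis`, `compress_segment`, and `decompress_segment`, the dictionary layout, and the synthetic test beat are all hypothetical.

```python
import numpy as np


def hermite_basis(t, n, lam):
    """Rows 0..n of the ASSUMED basis U_i(t) = psi_i(t/lam) / sqrt(lam),
    built with the standard three-term Hermite-function recurrence."""
    x = np.asarray(t, dtype=float) / lam
    U = np.zeros((n + 1, x.size))
    U[0] = np.pi ** -0.25 * np.exp(-0.5 * x * x)
    if n >= 1:
        U[1] = np.sqrt(2.0) * x * U[0]
    for k in range(1, n):
        U[k + 1] = (np.sqrt(2.0 / (k + 1)) * x * U[k]
                    - np.sqrt(k / (k + 1)) * U[k - 1])
    return U / np.sqrt(lam)


def compress_segment(times, samples, zero_time, n, lam):
    """Coefficient integral written as a summation over the samples."""
    t = np.asarray(times, dtype=float) - zero_time
    dt = t[1] - t[0]  # assumes uniform sampling
    coeffs = (hermite_basis(t, n, lam) @ np.asarray(samples, dtype=float)) * dt
    # Per-beat list: start time, end time, zero time, coefficients.
    return {"start": times[0], "end": times[-1],
            "zero": zero_time, "coeffs": coeffs}


def decompress_segment(record, times, lam):
    """Weighted sum of basis functions between the start and end times."""
    t = np.asarray(times, dtype=float) - record["zero"]
    n = len(record["coeffs"]) - 1
    return record["coeffs"] @ hermite_basis(t, n, lam)


# Synthetic stand-in for one beat: 250 samples at 250 samples per second,
# built from three basis functions so the expected coefficients are known.
times = np.arange(250) / 250.0
lam = 0.0625
B = hermite_basis(times - 0.5, 5, lam)
beat = 1.0 * B[0] - 0.4 * B[2] + 0.2 * B[5]

record = compress_segment(times, beat, zero_time=0.5, n=12, lam=lam)
rebuilt = decompress_segment(record, times, lam)
print(np.max(np.abs(rebuilt - beat)))
```

In this sketch λ is passed to the decompressor separately; whether λ is stored with each beat or fixed for the system is a design choice the sketch leaves open.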
The value of n used in the compression and decompression of the waveform controls the accuracy of the reconstructed waveform at the expense of a loss in compression ratio. The user decides what is an acceptable tradeoff between accuracy of reconstruction of the compressed waveform and the costs in computer memory, transmission channel bandwidth, or other parameters relevant to the system under consideration. Increasing the number of terms summed, i.e. raising the value of n, increases the accuracy of reconstruction of the compressed waveform defined by equation 2. We illustrate this effect for clarity first with a set of simplified mathematical waveforms in figures 4A, 4B, and 4C.
Figure 4A illustrates a rectangle function (bold solid lines) of unit width centered at the origin and the reconstructions of said rectangle function.
Figure 4B illustrates a plot of a triangle function (bold solid line) also centered on the origin along with reconstructions of said triangle function.
Figure 4C is a plot of another example function with odd symmetry along with reconstructions of said function.
R0(t) represents a reconstruction of F(t) using equation 1 with n = 0 and λ = 0.5.
R4(t) represents a reconstruction of F(t) using equation 1 with n = 4 and λ = 0.5.
R8(t) represents a reconstruction of F(t) using equation 1 with n = 8 and λ = 0.5.
R12(t) represents a reconstruction of F(t) using equation 1 with n = 12 and λ = 0.5.
As illustrated by the plots in figures 4A, 4B, and 4C, as the number of terms used in the defining function summation increases, the plot of the defining function (the reconstruction) approaches the shape of a plot of the original data points.
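The convergence described for the rectangle function can be reproduced numerically. The sketch below is illustrative only and rests on the same assumption as before: the basis is taken to be the orthonormal Hermite functions scaled by λ, a form not quoted from this document. The helper name `hermite_basis`, the sampling grid, and the root-mean-square error measure are hypothetical choices.

```python
import numpy as np


def hermite_basis(t, n, lam):
    """Rows 0..n of the ASSUMED basis U_i(t) = psi_i(t/lam) / sqrt(lam)."""
    x = np.asarray(t, dtype=float) / lam
    U = np.zeros((n + 1, x.size))
    U[0] = np.pi ** -0.25 * np.exp(-0.5 * x * x)
    if n >= 1:
        U[1] = np.sqrt(2.0) * x * U[0]
    for k in range(1, n):
        U[k + 1] = (np.sqrt(2.0 / (k + 1)) * x * U[k]
                    - np.sqrt(k / (k + 1)) * U[k - 1])
    return U / np.sqrt(lam)


lam = 0.5
t = np.linspace(-4.0, 4.0, 2001)
dt = t[1] - t[0]
F = (np.abs(t) <= 0.5).astype(float)   # unit-width rectangle at the origin

U = hermite_basis(t, 12, lam)
a = (U @ F) * dt                        # coefficients a_0 .. a_12

errors = {}
for n in (0, 4, 8, 12):
    R_n = a[: n + 1] @ U[: n + 1]       # partial sum with terms 0..n
    errors[n] = float(np.sqrt(np.sum((F - R_n) ** 2) * dt))

for n, err in errors.items():
    print(n, err)
```

Under these assumptions the error shrinks at each step from n = 0 to n = 12, which is the behaviour the plots of R0 through R12 are described as showing.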
It should be noted that the Hermite-Rodriguez wavelet could be used instead of the Associated Hermite wavelet.
We now define the Associated Hermite wavelet basis functions used in this compression and decompression technique according to the relation
where H_n denotes the Hermite polynomial of order n and is well known among those skilled in the relevant art. An interesting feature of the Associated Hermite wavelets is that their scaling properties are explicitly included in a scaling parameter λ which controls the overall scale of the wavelet basis.
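For readers who want to experiment, one commonly used form of such a basis is the set of orthonormal Hermite functions scaled by λ, U_n(t) = H_n(t/λ) exp(-t²/(2λ²)) / sqrt(2^n n! λ √π). The sketch below assumes that form; it is an assumption for illustration and is not a transcription of the relation referred to above. The function name `associated_hermite_basis` and the grid are hypothetical. The code evaluates the functions with a three-term recurrence and checks their orthonormality numerically.

```python
import numpy as np


def associated_hermite_basis(t, n_max, lam):
    """Rows 0..n_max of the ASSUMED form
    U_n(t) = H_n(t/lam) * exp(-t**2 / (2*lam**2)) / sqrt(2**n * n! * lam * sqrt(pi)),
    evaluated with a recurrence that avoids large factorials."""
    x = np.asarray(t, dtype=float) / lam
    basis = np.zeros((n_max + 1, x.size))
    basis[0] = np.pi ** -0.25 * np.exp(-0.5 * x * x)
    if n_max >= 1:
        basis[1] = np.sqrt(2.0) * x * basis[0]
    for k in range(1, n_max):
        basis[k + 1] = (np.sqrt(2.0 / (k + 1)) * x * basis[k]
                        - np.sqrt(k / (k + 1)) * basis[k - 1])
    return basis / np.sqrt(lam)


t = np.linspace(-8.0, 8.0, 4001)
dt = t[1] - t[0]
U = associated_hermite_basis(t, 12, lam=0.5)

# Riemann-sum approximation to the inner products of all pairs.
gram = (U @ U.T) * dt
print(np.max(np.abs(gram - np.eye(13))))
```

A Gram matrix close to the identity is what allows each coefficient to be computed independently by a single integral against its own basis function. Changing `lam` stretches or compresses every basis function in time without affecting this property.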
It should be noted that the Associated Hermite wavelet is the most useful example of a class of closely related wavelets and wavelet frames which can be expected to be effective for ECG compression because they all have shapes similar to typical ECG waveforms. The Hermite-Rodriguez wavelet is another example of this class of wavelets which could be used effectively for ECG data compression. All of the wavelets discussed here are of a class which obey a convolution algebra, i.e. they all share the properties of an abstract algebra in which convolution is the multiplicative operation. The scope of the invention described here includes compression and decompression algorithms which use wavelet bases and wavelet frames which are members of this class.
Three ECG beats taken from the MIT-BIH Arrhythmia Database are illustrated in figures 5-A 1, 5-B 1, and 5-C 1, in which amplitude, on the y-axis, is plotted against time, on the x-axis. The figures depict the original signal as well as a reconstructed signal. The latter was derived by reducing the original beats to their wavelet components and reconstructing the beats from these components using the defining function (1) and the reconstruction steps discussed above. All plots are shown superposed in each figure so that the fidelity of the wavelet reconstruction is evident. In each case fifty wavelet components are used (n=50).
A first example of a normal ECG beat (56), shown in dotted lines, taken from the MIT database file number 105, and a reconstructed version of the same beat (58) (solid line) are illustrated in figure 5-A 1. A total of 253 samples make up the original beat (56). The time scale is in seconds, with the zero point at the center of a central peak (59). The amplitude scale is in millivolts and the scaling parameter is λ = 0.0625. As can be seen from figure 5-A 1, the original beat (56) and the reconstructed beat (58) match very closely. Figure 5-A 2 is the same plot as in figure 5-A 1 but with the reconstructed beat (58) shifted down by 100 mV for clarity.
A second example of a normal ECG beat (60, dotted line), taken from MIT database file number 123, and a reconstructed version of this beat (62, solid line) are depicted in figure 5-B 1. The original beat (60) in this example contains 577 samples. As in the first example, λ = 0.0625 was used. A slightly higher degree of deviation between the original beat (60) and the reconstructed beat (62) is evident in figure 5-B 1 to either side of the central peak (68). This deviation is analogous to the "ringing" in the representation of a rectangular pulse by a truncated Fourier series. Figure 5-B 2 is the same plot as in figure 5-B 1 but with the reconstructed beat (62) shifted down by 100 mV for clarity.
Figure 5-C 1 illustrates an ectopic beat (70), taken from the MIT database file number 124, and its reconstructed version (72). The original beat (70) in this example contains 397 samples. The scaling factor, λ, is set to 0.125. Agreement between the original beat (70) and the corresponding reconstructed beat (72) is excellent. This agreement demonstrates the capability of the data compression method to represent efficiently different waveform morphologies expected in ECG data. Figure 5-C 2 depicts the same signals as in figure 5-C 1 but with the reconstructed beat (72) shifted down by 100 mV for clarity.
Figure 5-C 1 also illustrates how the method herein disclosed for compactly representing a waveform can also be used to remove noise from a signal. The original beat (70), as best shown in figure 5-C 2, is much bumpier than the smooth reconstructed beat (72). If more terms are used in equation 2 to reconstruct the original beat, a more accurate representation will be obtained, and thus a more "noisy" signal will be included in the representation. Accordingly, one method for eliminating noise (unwanted signal components) in a reconstructed signal is to limit the number of terms used in the series in equation 2.
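The noise-limiting effect of a truncated series can be illustrated with synthetic data. This is a sketch under stated assumptions: the basis is again taken to be the orthonormal Hermite functions scaled by λ (a form not quoted from this document), the "clean" beat is an artificial combination of three basis functions, and the noise is seeded pseudo-random Gaussian noise. None of these values come from the MIT database files discussed above.

```python
import numpy as np


def hermite_basis(t, n, lam):
    """Rows 0..n of the ASSUMED basis U_i(t) = psi_i(t/lam) / sqrt(lam)."""
    x = np.asarray(t, dtype=float) / lam
    U = np.zeros((n + 1, x.size))
    U[0] = np.pi ** -0.25 * np.exp(-0.5 * x * x)
    if n >= 1:
        U[1] = np.sqrt(2.0) * x * U[0]
    for k in range(1, n):
        U[k + 1] = (np.sqrt(2.0 / (k + 1)) * x * U[k]
                    - np.sqrt(k / (k + 1)) * U[k - 1])
    return U / np.sqrt(lam)


def rms(v):
    """Root-mean-square value of an array."""
    return float(np.sqrt(np.mean(np.square(v))))


fs = 250.0
t = (np.arange(250) - 125) / fs          # time relative to the zero point
dt = 1.0 / fs
U = hermite_basis(t, 12, lam=0.0625)

clean = 1.0 * U[0] - 0.4 * U[2] + 0.2 * U[5]   # artificial noise-free beat
rng = np.random.default_rng(0)
noisy = clean + 0.05 * rng.standard_normal(t.size)

coeffs = (U @ noisy) * dt                # 13 coefficients from the noisy beat
smooth = coeffs @ U                      # truncated-series reconstruction

noise_in = rms(noisy - clean)
noise_out = rms(smooth - clean)
print(noise_in, noise_out)
```

Because only 13 terms are retained out of 250 samples, most of the noise energy has no basis function to land on, so the reconstruction sits closer to the noise-free beat than the noisy input does.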
Figure 6-A illustrates the Hermite polynomial expanded up to the 12th order. Figure 6-B illustrates the Associated Hermite wavelets (see equation 3 above) defined up to the 12th order. Note that for figures 6-B to 6-E time (t) is represented by the variable x. Figure 6-C illustrates integrals which need to be solved in order to arrive at the wavelet coefficients, a_0 through a_12. The term F(x) within the integral represents the value of a data point at a given point in time. The Associated Hermite wavelet, U_n, as defined in equation 3 and illustrated in figure 6-C, is equivalent to a constant multiplied by the time value (it can be so converted mathematically). Therefore, the integral of F(x) multiplied by the Associated Hermite wavelet (also a constant at a given time) will yield a constant value. If the limit of integration, w, is set to the value of 2 (it just needs to be bigger than the function width), then the wavelet coefficient values will match those in the table illustrated in figure 6-D.
Figure 6-E illustrates a series of defining equation partial sums created using the defining equation 1. A plot of four of the defining equation partial sums is illustrated in figure 4-A. Note, once again, that time is represented on the x axis. As discussed earlier, as the number of terms used in the defining function summation (1) increases (as the number of partial sums increases), the defining function (the reconstruction) approaches the shape of the original rectangle function (2).
Returning now to the transformation of setting the first and last data points equal to zero, first discussed above: FIG. 7A illustrates two consecutive ECG beats, a first ECG beat 42 and a second ECG beat 44, that were reconstructed without first undergoing the above mentioned transformation. As shown, a spike 46 (circumscribed by a dotted circle) results at the point where the first ECG beat 42 and the second ECG beat 44 meet. The first and last points of an individual beat in an ECG signal are often not equal to zero because the ECG signal is subject to baseline effects, due to muscle noise, respiration, electrical noise, and other similar effects. Therefore, if a spike, such as the spike 46 illustrated in FIG. 7A, is to be avoided, the ECG pulses must be transformed so as to set their first and last data points to zero while retaining the excursions in the signal amplitude of interest. The above mentioned transformation is accomplished by
(a) calculating an interpolation function which starts at the first data point of a beat and ends at the last data point of the same beat, and
(b) then subtracting the value of this function from each data point of the segment.
In this example, the interpolation function is linear. However, any appropriate function may be used. FIG. 7B illustrates a beat 48 and a line 50, used to transform the ECG beat 48, which starts at a first point 52, at the amplitude and time of the first data point of the beat, and ends at a second point 54, at the amplitude and time of the last data point of the beat. Subtracting the value of the line function at each heart beat data point has the effect of setting the first and last data point of the transformed ECG beat equal to zero. FIG. 7C illustrates the ECG beat 48 of FIG. 7B after the above described transformation. As shown in FIG. 7C, the transformation has accomplished its purpose of transforming the ECG beat 48 such that the first and last data points are equal to zero.
Dashed horizontal lines, illustrated in FIGS. 7B and 7C, indicate the zero line. Most often linear interpolation will be used between the starting and ending points of the pulse, since baseline shifts will be approximately linear in time over the limited timebase of a single pulse. In principle, however, any interpolating function with the properties of starting and ending at the appropriate amplitudes may be used.
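The linear form of this transformation can be sketched as follows. The function name `remove_linear_baseline`, the synthetic pulse, and the drift values are hypothetical; returning the subtracted baseline alongside the transformed beat is a choice made here for illustration, so that the original samples can be restored.

```python
import numpy as np


def remove_linear_baseline(times, samples):
    """Subtract the straight line joining the first and last data points,
    so that the transformed beat starts and ends at zero."""
    times = np.asarray(times, dtype=float)
    samples = np.asarray(samples, dtype=float)
    slope = (samples[-1] - samples[0]) / (times[-1] - times[0])
    baseline = samples[0] + slope * (times - times[0])
    return samples - baseline, baseline


# Synthetic beat: a narrow pulse riding on a drifting, offset baseline.
times = np.arange(250) / 250.0
pulse = np.exp(-((times - 0.5) / 0.03) ** 2)
drift = 0.2 + 0.3 * times
beat = pulse + drift

transformed, baseline = remove_linear_baseline(times, beat)
print(transformed[0], transformed[-1])
```

Adding the returned baseline back to the transformed beat recovers the original samples, and the pulse itself is left essentially unchanged because the line varies slowly over the duration of one beat.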