US20090195697A1 - Noise and/or flicker reduction in video sequences using spatial and temporal processing - Google Patents

Noise and/or flicker reduction in video sequences using spatial and temporal processing Download PDF

Info

Publication number
US20090195697A1
US20090195697A1 (application US12/233,468)
Authority
US
United States
Prior art keywords
frame
sub
transform
spatial
coefficients
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US12/233,468
Other versions
US8731062B2 (en)
Inventor
Sandeep Kanumuri
Onur G. Guleryuz
M. Reha Civanlar
Akira Fujibayashi
Choong S. Boon
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NTT Docomo Inc
Original Assignee
NTT Docomo Inc
Docomo Communications Labs USA Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NTT Docomo Inc and Docomo Communications Labs USA Inc
Priority to US12/233,468 (US8731062B2)
Assigned to DOCOMO COMMUNICATIONS LABORATORIES USA, INC. Assignors: GULERYUZ, ONUR G.; BOON, CHOONG S.; FUJIBAYASHI, AKIRA; CIVANLAR, M. REHA; KANUMURI, SANDEEP
Assigned to NTT DOCOMO, INC. Assignors: DOCOMO COMMUNICATIONS LABORATORIES USA, INC.
Priority to JP2010545258A (JP5419897B2)
Priority to CN2009801039523A (CN101933330B)
Priority to PCT/US2009/032888 (WO2009100032A1)
Priority to KR1020107017838A (KR101291869B1)
Priority to EP09708388.5A (EP2243298B1)
Assigned to NTT DOCOMO, INC. Assignors: BOON, CHOONG S.; FUJIBAYASHI, AKIRA
Assigned to DOCOMO COMMUNICATIONS LABORATORIES USA, INC. Assignors: GULERYUZ, ONUR G.; CIVANLAR, M. REHA; KANUMURI, SANDEEP
Publication of US20090195697A1
Publication of US8731062B2
Application granted
Legal status: Active
Adjusted expiration

Links

Images

Classifications

    • G06F17/147 Discrete orthonormal transforms, e.g. discrete cosine transform, discrete sine transform, and variations therefrom, e.g. modified discrete cosine transform, integer transforms approximating the discrete cosine transform
    • G06F17/145 Square transforms, e.g. Hadamard, Walsh, Haar, Hough, Slant transforms
    • G06T5/10 Image enhancement or restoration by non-spatial domain filtering
    • G06T5/75
    • H04N19/103 Adaptive coding: selection of coding mode or of prediction mode
    • H04N19/117 Adaptive coding: filters, e.g. for pre-processing or post-processing
    • H04N19/122 Adaptive coding: selection of transform size, e.g. 8x8 or 2x4x8 DCT; selection of sub-band transforms of varying structure or type
    • H04N19/132 Adaptive coding: sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/14 Adaptive coding: coding unit complexity, e.g. amount of activity or edge presence estimation
    • H04N19/176 Adaptive coding where the coding unit is an image region, the region being a block, e.g. a macroblock
    • H04N19/18 Adaptive coding where the coding unit is a set of transform coefficients
    • H04N19/182 Adaptive coding where the coding unit is a pixel
    • H04N19/48 Compressed domain processing techniques other than decoding, e.g. modification of transform coefficients, variable length coding [VLC] data or run-length data
    • H04N19/59 Predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • H04N19/649 Transform coding, the transform being applied to non rectangular image segments
    • H04N19/86 Pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
    • G06T2207/10016 Video; Image sequence
    • G06T2207/20021 Dividing image into blocks, subimages or windows
    • G06T2207/20052 Discrete cosine transform [DCT]

Definitions

  • x denotes the current frame from the input video being processed by the techniques described herein.
  • ȳ denotes the past output frame produced by those techniques.
  • T̄, T_S1 and T_S2 denote threshold parameters used by the image processing process.
  • A vector denoted OP, containing other optional parameters, can be supplied.
  • the user or an algorithm can determine the most desired parameters using optimization of subjective/objective quality, using model based techniques, or using other methods. Calibration algorithms can also be used. Such algorithms can also take advantage of partial/complete knowledge of either the video processing pipeline or the input video or both.
  • all video frames are represented as vectors by arranging the pixels in raster-scan order and N represents the number of pixels in each video frame.
  • A sub-frame type S is defined as an M²×1 integer-valued vector.
  • M can be any integer greater than zero.
  • For each selected pixel i, p_i denotes the M²×1 vector of pixel locations obtained by offsetting the chosen sub-frame type to pixel i; for a pixel that is not selected, p_i is a vector of zeros.
  • the set of selected pixels can be predetermined or signaled within the vector OP.
  • a sub-frame is formed and processed for each pixel in the image. That is, the set of selected pixels is the entire set of pixels in the frame.
  • the processing may be performed only on a selected subset of the pixels and not on all the pixels in the image.
  • the subset may be predetermined or signaled as part of the side-information.
  • FIGS. 13A-E illustrate examples of such subsets; other subsets may be used with the teachings described herein.
  • FIG. 4 shows an example sub-frame z_i at pixel i when pixels are numbered in raster-scan order. Referring to FIG. 4, pixels are numbered starting from "1" in raster-scan order.
  • a sub-frame is shown pivoted at pixel i.
  • a sub-frame is organized into M vectors called warped rows. The first warped row has the sub-frame elements 1 to M in that order; the second warped row has the elements (M+1) to 2M; and so on.
  • In one embodiment, M is equal to 4 and the library of sub-frame types corresponds to the set of masks illustrated in FIGS. 3A-3M.
  • the masks correspond to different directions as shown with arrows.
  • the mask in FIG. 3A is referred to herein as a regular mask because it corresponds to the regular horizontal or vertical directions.
  • the other masks are called directional masks since they correspond to non-trivial directions.
  • C C is the number of columns one needs to move horizontally to the right starting from the column of pixel ‘a’ to get to the column of the current pixel of interest.
  • C R is the number of rows one needs to move vertically down starting from the row of pixel ‘a’ to get to the row of the current pixel of interest.
  • the sub-frame type corresponding to a mask is the vector containing the differential-positions of pixels in that mask ordered from ‘a’ to ‘p’.
  • In one embodiment, the choice of the sub-frame type for a pixel is made by always choosing the sub-frame type corresponding to the regular mask.
  • the choice of the sub-frame type for a pixel is made, for each selected pixel, (1) by evaluating, for each sub-frame type, a 2-D DCT over the sub-frame formed, and (2) by choosing, for a given threshold T, the sub-frame type that minimizes the number of non-zero transform coefficients with magnitude greater than T.
  • the choice of the sub-frame type for a pixel is made by choosing, for each selected pixel, the sub-frame type that minimizes the warped row variance of pixel values averaged over all warped rows.
  • In another embodiment, the choice of the sub-frame type for a pixel is made by having, for a block of K×L pixels, each pixel vote for a sub-frame type (based on the sub-frame type that minimizes the warped row variance of pixel values averaged over all warped rows) and choosing the sub-frame type with the most votes for all the pixels in the K×L block, where K and L can be any integers greater than 0. In one embodiment, K and L are both set to 4.
  • In yet another embodiment, the choice of the sub-frame type for a pixel is made by forming, for each pixel, a block of K×L pixels and choosing a sub-frame type by using the preceding voting scheme on this block. In each case, the chosen sub-frame type is used for the current pixel. Thus, the sub-frame type is selected using one of these statistics measured for each mask.
  • FIG. 5 is a flow diagram of one embodiment of sub-frame selection processing.
  • the process is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general purpose computer system or a dedicated machine), or a combination of both.
  • the process begins by processing logic receiving frame x and determining whether the sub-frames are pixel-adaptive (processing block 501 ). If the sub-frames are not pixel-adaptive, processing logic chooses the regular sub-frame type for all pixels (processing block 502 ). If the sub-frames of frame x are pixel adaptive, processing logic, for each pixel, marks the sub-frame type that minimizes the warped row variance (processing block 503 ). This is done using the library of sub-frame types ( 510 ) as described above. Thus, for each pixel, the sub-frame type that minimizes the warped row variance among the library of sub-frame types is marked.
  • processing logic determines whether the choice is block-based (processing block 504 ). If processing logic determines the choice is block-based, processing logic counts the number of pixels that marked each sub-frame type in each block (processing block 506 ) and, for all pixels in a block, processing logic chooses the sub-frame type marked by most pixels in that block (processing block 507 ). In other words, if the choice is block-based, the sub-frame type marked by most pixels in a block is chosen for all pixels in that block. If processing logic determines the choice is not block-based, processing logic chooses, for each pixel, the sub-frame type marked by that pixel (processing block 505 ). In other words, each pixel chooses the sub-frame type marked by itself.
  • the choice of the sub-frame types for each pixel can be signaled within the vector OP.
  • Processing logic also forms an M²×1 vector denoted z̄_i (also a sub-frame) with the pixel values of the past output frame ȳ at locations corresponding to elements of p̄_i (processing block 203).
  • The choice of m_i can be made in a number of different ways; one embodiment uses the search process described below.
  • In one embodiment, the sub-frame z̄_i is formed after the past output frame ȳ has been processed using techniques such as, but not limited to, Intensity Compensation and Non-linear Prediction Filter, to compensate for issues such as, for example, brightness changes and scene fades.
  • FIG. 6 is a flow diagram of one embodiment of a sub-frame formation process from the past output frame.
  • the process is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general purpose computer system or a dedicated machine), or a combination of both.
  • The process begins by processing logic using a search set {m_1, m_2, …} and, for each value m_j, computing p̄_i^j according to the following formula: p̄_i^j = p_i + m_j × 1̄, where 1̄ denotes the M²×1 vector with all elements equal to 1.
  • Processing logic forms sub-frame z̄_i^j from frame ȳ using p̄_i^j (processing block 602). Then, for each j, processing logic computes the p-norm ‖z_i − z̄_i^j‖_p.
  • After computing the p-norms, processing logic selects the m_k that gives the least p-norm, sets m_i equal to m_k, and sets p̄_i according to the following formula: p̄_i = p_i + m_i × 1̄.
  • Processing logic then forms sub-frame z̄_i using p̄_i (processing block 604).
  • Processing logic also performs spatial transform selection and application. More specifically, processing logic transforms the sub-frames z_i and z̄_i into e_i and ē_i respectively using a pixel-adaptive warped spatial transform H_i.
  • The transform is called "warped" because the support of the transform basis has been warped to match the sub-frame shape.
  • the transform is called pixel-adaptive because sub-frames pivoted at different pixels can use different transforms in addition to the fact that the choice of sub-frame type can vary from pixel to pixel.
  • the transform H i can be chosen from a library of transforms such as separable DCT, non-separable DCT, 2-D Gabor wavelets, Steerable pyramids, 2-D directional wavelets, Curvelets and Contourlets.
  • the spatial transform used is an orthonormal separable 2D-DCT in a non-adaptive fashion.
  • the spatial transform used is an orthonormal separable 2D-Hadamard transform in a non-adaptive fashion.
  • a separable transform becomes non-separable after it is warped.
  • the choice of the transform can be fixed apriori or can be adaptive to the different sub-frames pivoted at different pixels.
  • In one embodiment, the chosen transform is the one with the least number of coefficients in e_i whose absolute value is greater than a master threshold T_S1.
  • A flow diagram of one embodiment of a spatial transform selection process for a sub-frame is illustrated in FIG. 7.
  • the process is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general purpose computer system or a dedicated machine), or a combination of both.
  • The process begins by processing logic testing whether the transform is pixel-adaptive (processing block 701). This test may be performed by referring to a list. In one embodiment, the list can be such that the transform is non-adaptive. In another embodiment, the list can be signaled within the vector OP. If processing logic determines that the transform is not pixel-adaptive, processing logic selects a 2-D orthonormal separable DCT for use as the transform H_i, generates the transform coefficients e_i by applying the transform to the sub-frame z_i, and generates the transform coefficients ē_i by applying the transform to the sub-frame z̄_i (processing block 702).
  • If processing logic determines the transform is pixel-adaptive, then, for each transform H_j in the library of transforms {H_1, H_2, …} (processing block 704), processing logic computes the transform coefficients e_j using the formula e_j = H_j × z_i.
  • The transform coefficients e_j correspond to the transform H_j.
  • Processing logic counts the number of coefficients in e_j with an absolute value greater than the threshold T_S1 (processing block 705), chooses the transform H_k with the least count from the library, sets H_i equal to H_k, sets e_i equal to the corresponding coefficients e_k, and generates the transform coefficients ē_i by applying H_i to the sub-frame z̄_i (processing block 706).
  • the choice of the spatial transform can be signaled within the vector OP.
  • Processing logic also performs thresholding. More specifically, processing logic applies an adaptive threshold T̂_i1 on selected elements of e_i to get a_i. In one embodiment, all the elements of e_i are selected. In another embodiment, all elements except the first element (usually the DC element) are selected. In still another embodiment, none of the elements are selected.
  • The transform coefficients e_i are also thresholded using a master threshold T_S1 to get ê_i.
  • The thresholding operation can be done in a variety of ways such as, for example, hard thresholding and soft thresholding. The hard thresholding operation with threshold T is defined as
  • HT(x) = { x, if |x| ≥ T; 0, if |x| < T }.
  • The soft thresholding operation with T as the threshold is defined as
  • ST(x) = sign(x) × max(|x| − T, 0).
  • The threshold T̂_i1 is computed in one of several ways.
  • T̂_i1 can be signaled within the vector OP.
  • The choice of the option used for calculating T̂_i1 can be signaled within the vector OP.
  • An adaptive threshold T̂_i2 is applied on selected elements of ē_i to get ā_i.
  • In one embodiment, all the elements of ē_i are selected.
  • In another embodiment, all elements except the first element (usually the DC element) are selected.
  • In still another embodiment, none of the elements are selected.
  • The transform coefficients ē_i are also thresholded using a master threshold T_S2 to get ẽ_i.
  • the thresholding operation can be done in a variety of ways such as hard thresholding and soft thresholding described above.
  • The threshold T̂_i2 is computed in one of several ways; one option uses the deviation energy ‖ē_i − ā_i‖² relative to the global energy
  • E_global = Σ_{j=1}^{N} ‖ē_j − ā_j‖².
  • T̂_i2 can be part of the side-information, or default values may be used. This can be viewed as a setting for the algorithm.
  • A default value can be obtained by tuning on a training set and choosing the value that achieves a local optimum in reconstructed image/video quality.
  • In one embodiment, the value of T̂_i2 is signaled within the vector OP. In another embodiment, the choice of the option used for calculating T̂_i2 is signaled within the vector OP.
  • The function h(·) may be an identity function, a simple linear scaling of all the elements of ā_i to match brightness changes, or a more general function capturing more complex scene characteristics such as fades. The matrix â_i is formed with a_i and h(ā_i) as its columns.
  • The temporal transform G_i can be chosen from a library of transforms.
  • The transform is called pixel-adaptive because sub-frames pivoted at different pixels can use different transforms.
  • In one embodiment, the chosen transform is the one with the least number of coefficients in b_i whose absolute value is greater than a master threshold T̄.
  • FIG. 8 is a flow diagram of one embodiment of a temporal transform selection process.
  • the process is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general purpose computer system or a dedicated machine), or a combination of both.
  • The process begins by processing logic testing whether the transform is pixel-adaptive (processing block 801). This test may be performed by referring to a list. In one embodiment, the list can be such that the transform is non-adaptive. In another embodiment, the list can be signaled within the vector OP. If processing logic determines that the transform is not pixel-adaptive, processing logic selects transform G_i based on a default temporal transform and generates the transform coefficients b_i by applying G_i to the matrix â_i (processing block 802). In one embodiment, the default temporal transform is the Haar transform, i.e.
  • G_i = (1/√2) × [[1, 1], [1, −1]].
  • the choice of the temporal transform can be signaled within the vector OP.
  • If processing logic determines the transform is pixel-adaptive, then, for each transform G_j in the library of transforms {G_1, G_2, …} (processing block 804), processing logic computes the transform coefficients b_j by applying G_j to the matrix â_i.
  • The transform coefficients b_j correspond to the transform G_j.
  • Processing logic counts the number of coefficients in b_j with an absolute value greater than the master threshold T̄ (processing block 805), chooses the transform G_k with the least count from the library, sets G_i equal to G_k, and sets the coefficients b_i equal to b_k (processing block 806).
  • The transform coefficients b_i are thresholded using T̄ to get c_i (processing block 206 of FIG. 2).
  • the thresholding operation can be done in a variety of ways such as hard thresholding and soft thresholding as described above. The choice of thresholding can be signaled within the vector OP.
  • hard thresholding is used as illustrated in FIG. 9 .
  • the hard thresholding is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general purpose computer system or a dedicated machine), or a combination of both.
  • The hard thresholding begins using the master threshold T̄ and coefficients b_i as inputs; for each element b_ij ∈ b_i, processing logic computes the corresponding element c_ij ∈ c_i according to the following equation:
  • c_ij = { b_ij, if |b_ij| ≥ T̄; 0, if |b_ij| < T̄ }.
  • That is, processing logic sets to zero all coefficients with absolute values less than the master threshold T̄; the result is stored as c_i.
  • In one embodiment, some elements of b_i are not thresholded and are copied directly into their respective positions in c_i.
  • In one embodiment, the elements in the first column of b_i are not thresholded.
  • the choice of the set of elements that are not thresholded can be signaled within the vector OP.
  • the parameters can be signaled within the vector OP.
  • In one embodiment, the current frame is processed without using the past frame output by a previous iteration.
  • In this case, the vectors z̄_i, ē_i, ā_i and the matrices â_i, b_i, c_i, d̃_i are not computed.
  • In another embodiment, a set of past output frames {ȳ¹, ȳ², …} can be used instead of just the immediate past output frame ȳ.
  • Let N_PF denote the number of past frames in the set.
  • Each of the past frames in the set contributes one column of â_i in the same way as described above.
  • The output frame ȳ¹ contributes in the form of ā_i¹ to the second column, the output frame ȳ² contributes in the form of ā_i² to the third column, and so on.
  • In this case, â_i, b_i, c_i and d_i are of size M²×(N_PF+1) and G_i is of size (N_PF+1)×(N_PF+1).
  • A weight w_i is computed for each processed sub-frame ẑ_i.
  • Weights based on e_i and a_i are computed in one of the following ways:
  • w_i = 1/‖e_i − a_i‖₂ if ‖e_i − a_i‖₂ > e_min, and w_i = 1/e_min otherwise, where e_min is a constant;
  • w_i = 1/‖a_i‖_p if ‖a_i‖_p > n_min, and w_i = 1/n_min otherwise, where n_min is a constant;
  • w_i = 1/‖c_i‖_p if ‖c_i‖_p > n_min, and w_i = 1/n_min otherwise, where n_min is a constant.
  • The processed sub-frames ẑ_1, …, ẑ_N are combined together to form y in a weighted manner.
  • y j is the value of the j th pixel.
  • FIG. 10 is a flow diagram of one embodiment of a process for combining all processed sub-frames to form frame y.
  • the process is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general purpose computer system or a dedicated machine), or a combination of both.
  • processing logic initializes the pixel index j and the sub-frame index i to one (processing block 1002 ).
  • Processing logic determines whether pixel j ∈ p_i (processing block 1003). If it is, the process transitions to processing block 1004. If not, the process transitions to processing block 1005.
  • At processing block 1004, processing logic updates y_j and n_j using ẑ_ik (the value of pixel j in ẑ_i) and the weight w_i, as described above.
  • In one embodiment, the weight is calculated according to the following: w_i = 1/‖e_i − a_i‖₂ if ‖e_i − a_i‖₂ > e_min, and w_i = 1/e_min otherwise.
  • Processing logic updates y_j and n_j based on the following equations: y_j = y_j + w_i × ẑ_ik and n_j = n_j + w_i.
  • After processing logic updates y_j and n_j, the process transitions to processing block 1005.
  • Processing logic then updates y_j according to the following equation: y_j = y_j / n_j.
  • After updating y_j, processing logic sets the index i equal to 1 (processing block 1008) and checks whether the index j is equal to N (processing block 1009). If it is, the process ends. If not, the process transitions to processing block 1010, where the index j is incremented by one. After incrementing the index j by one, the process transitions to processing block 1003.
  • The frame y is the output corresponding to the current input frame x. If there are more frames to process, processing logic updates the current input frame x, copies y into ȳ, and repeats the process as shown in FIG. 2 (processing block 212).
  • the frame y undergoes further image/video processing in pixel-domain or a transform domain.
  • unsharp masking is performed on frame y to enhance high-frequency detail.
  • In one embodiment, multiple blocks of size P×P pixels are formed from frame y, where P is an integer, and each P×P block f undergoes a block transform, such as a 2-D DCT or a 2-D Hadamard transform, to produce another P×P block h.
  • The enhancement factor α(i,j) applied to the transform coefficients of h can be computed in a number of ways.
  • the process described in FIG. 2 can be modified to get a lower complexity algorithm, hereinafter referred to as the lower-complexity technique.
  • the lower-complexity technique is illustrated by the flow chart in FIG. 12 .
  • The frame y is the output of the lower-complexity technique corresponding to the current input frame x; if there are more frames to process, the current input frame x is updated, y is copied into ȳ, and the process is repeated as shown in FIG. 12.
  • The process begins by processing logic forming a frame ỹ using the current input frame x and the past output frame ȳ such that
  • ỹ(j) = w_z × x(j) + w_y × ȳ(j+m), ∀ j ∈ ℤ, 1 ≤ j ≤ H×W,
  • where w_z, w_y are real numbers, m is an integer, and H and W are the height and width of the frame (processing block 1201).
  • The notation (j) denotes the value of pixel j (numbered in raster-scan order) in the frame of interest.
  • For example, ȳ(5) represents the value of the 5th pixel of frame ȳ.
  • In one embodiment, the values w_z and w_y are signaled within the vector OP.
  • The choice of m can be made in a number of ways. In one embodiment, the choice of m is signaled within the vector OP.
  • In one embodiment, the frame ỹ is formed using a processed version of ȳ instead of ȳ itself, to compensate for issues such as brightness changes and scene fades, where the processing includes techniques such as, but not limited to, Intensity Compensation and Non-Linear Prediction Filter.
  • Processing logic forms an M²×1 vector z_i, called a sub-frame, with pixel values of frame x at locations corresponding to elements of p_i.
  • Pixel i is called the pivot for sub-frame z_i (processing block 1202).
  • An M²×1 vector denoted z̄_i (also a sub-frame) is formed with the pixel values of frame ỹ at locations corresponding to elements of p_i (processing block 1202).
  • Processing logic selects a spatial transform H_i and applies it to sub-frames z_i and z̄_i to get vectors e_i and ē_i respectively (processing block 1203).
  • Processing logic computes the adaptive threshold T̂_i1 from T_S1 using the same process described above and applies it on selected elements of e_i to get a_i (processing block 1203). In one embodiment, all the elements of e_i are selected. In another embodiment, all elements except the first element (usually the DC element) are selected.
  • The thresholding operation can be done in a variety of ways such as hard thresholding and soft thresholding, as described above.
  • After applying the adaptive threshold T̂_i1 on selected elements of e_i, processing logic forms a vector d_i using a_i, e_i, ē_i and the threshold T̄ (processing block 1204).
  • a_ij, e_ij, ē_ij and d_ij represent the j-th element in the vectors a_i, e_i, ē_i and d_i respectively, where j ∈ {1, 2, …, M²}.
  • The value d_ij can be computed in a number of ways.
  • the choice of the option used for calculating d ij is signaled within the vector OP.
  • Next, processing logic applies the inverse spatial transform to the vector d_i to produce the sub-frame ẑ_i (processing block 1205); the remaining processing blocks 1206, 1207, 1208 and 1209 operate as their respective counterparts 209, 210, 211 and 212 in FIG. 2 to complete the process.
  • The optional parameter vector OP, or parts of it, can be signaled by any module including, but not limited to, a codec, a camera, or a super-resolution processor.
  • One simple way to construct the parameter vector OP is as follows: each choice is signaled using two elements in the vector. For the nth choice,
  • the techniques described herein can be used to process a video sequence in any color representation including, but not limited to, RGB, YUV, YCbCr, YCoCg and CMYK.
  • the techniques can be applied on any subset of the color channels (including the empty set or the all channel set) in the color representation.
  • only the ‘Y’ channel in the YUV color representation is processed using the techniques described herein.
  • the U and V channels are filtered using a 2-D low-pass filter (e.g. LL band filter of Le Gall 5/3 wavelet).
  • The techniques described herein can be used to process only a pre-selected set of frames in a video sequence. In one embodiment, alternate frames are processed. In another embodiment, all frames belonging to one or more partitions of a video sequence are processed. The set of frames selected for processing can be signaled within OP.
  • the techniques can also be applied to compressed video sequences that underwent post-processing such as Non-linear Denoising Filter. Furthermore, the techniques can be applied on video sequences that are obtained by super-resolving a low-resolution compressed/uncompressed video sequence. The techniques can also be applied on video sequences that are either already processed or will be processed by a frame-rate conversion module.
  • FIG. 14 is a block diagram of an exemplary computer system that may perform one or more of the operations described herein.
  • computer system 1400 may comprise an exemplary client or server computer system.
  • Computer system 1400 comprises a communication mechanism or bus 1411 for communicating information, and a processor 1412 coupled with bus 1411 for processing information.
  • Processor 1412 includes a microprocessor, such as, for example, a Pentium™, PowerPC™, or Alpha™ processor, but is not limited to a microprocessor.
  • System 1400 further comprises a random access memory (RAM), or other dynamic storage device 1404 (referred to as main memory) coupled to bus 1411 for storing information and instructions to be executed by processor 1412 .
  • main memory 1404 also may be used for storing temporary variables or other intermediate information during execution of instructions by processor 1412 .
  • Computer system 1400 also comprises a read only memory (ROM) and/or other static storage device 1406 coupled to bus 1411 for storing static information and instructions for processor 1412 , and a data storage device 1407 , such as a magnetic disk or optical disk and its corresponding disk drive.
  • Data storage device 1407 is coupled to bus 1411 for storing information and instructions.
  • Computer system 1400 may further be coupled to a display device 1421 , such as a cathode ray tube (CRT) or liquid crystal display (LCD), coupled to bus 1411 for displaying information to a computer user.
  • An alphanumeric input device 1422 may also be coupled to bus 1411 for communicating information and command selections to processor 1412 .
  • An additional user input device is cursor control 1423 , such as a mouse, trackball, trackpad, stylus, or cursor direction keys, coupled to bus 1411 for communicating direction information and command selections to processor 1412 , and for controlling cursor movement on display 1421 .
  • Another device that may be coupled to bus 1411 is hard copy device 1424, which may be used for marking information on a medium such as paper, film, or similar types of media.
  • Another device that may be coupled to bus 1411 is wired/wireless communication capability 1425 for communicating with a phone or handheld palm device.

Abstract

A method and apparatus are disclosed herein for reducing at least one of flicker and noise in video sequences. In one embodiment, the method comprises receiving an input video and performing operations to reduce one or both of noise and flicker in the input video using spatial and temporal processing.

Description

    PRIORITY
  • The present patent application claims priority to and incorporates by reference the corresponding provisional patent application Ser. No. 61/026,453, titled, “Flicker Reduction in Video Sequences Using Temporal Processing,” filed on Feb. 5, 2008.
  • RELATED APPLICATIONS
  • This application is related to the co-pending application entitled “Image/Video Quality Enhancement and Super-Resolution Using Sparse Transformations,” filed on Jun. 17, 2008, U.S. patent application Ser. No. 12/140,829, assigned to the corporate assignee of the present invention.
  • FIELD OF THE INVENTION
  • The present invention relates generally to processing of video sequences; more particularly, the present invention is related to reducing noise and/or flicker in video sequences.
  • BACKGROUND OF THE INVENTION
  • Mosquito noise and temporal flicker are caused during acquisition due to camera limitations. Modules in the video processing pipeline such as compression, downsampling and upsampling lead to blocking artifacts, aliasing, ringing and temporal flicker. Image and video signal processing is widely used in a number of applications today. Some of these techniques have been used to reduce noise and temporal flicker.
  • SUMMARY OF THE INVENTION
  • A method and apparatus are disclosed herein for reducing at least one of flicker and noise in video sequences. In one embodiment, the method comprises receiving an input video and performing operations to reduce one or both of noise and flicker in the input video using spatial and temporal processing.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The present invention will be understood more fully from the detailed description given below and from the accompanying drawings of various embodiments of the invention, which, however, should not be taken to limit the invention to the specific embodiments, but are for explanation and understanding only.
  • FIG. 1 illustrates one embodiment of a noise and flicker reduction module to reduce noise and/or flicker in an input video.
  • FIG. 2 illustrates a flow diagram of one embodiment of a process for performing image processing on a video sequence.
  • FIGS. 3A-M illustrate examples of masks that correspond to a library of sub-frame types.
  • FIG. 4 shows an example sub-frame at pixel i when pixels are numbered in raster-scan order.
  • FIG. 5 is a flow diagram of one embodiment of a sub-frame type selection process.
  • FIG. 6 is a flow diagram of one embodiment of a sub-frame formation process from the past output frame.
  • FIG. 7 is a flow diagram of one embodiment of a spatial transform selection process.
  • FIG. 8 is a flow diagram of one embodiment of a temporal transform selection process.
  • FIG. 9 is a flow diagram of one embodiment of a thresholding process for thresholding transform coefficients.
  • FIG. 10 is a flow diagram of one embodiment of a process for combining sub-frames to create a frame.
  • FIG. 11 illustrates a monotonic decreasing stair-case function.
  • FIG. 12 is a flow diagram of another embodiment of a process for performing image processing on a video sequence.
  • FIGS. 13A-E illustrate example subsets of selected pixels.
  • FIG. 14 is a block diagram of one embodiment of a computer system.
  • DETAILED DESCRIPTION OF THE PRESENT INVENTION
  • A method and apparatus for noise and/or flicker reduction in compressed/uncompressed video sequences are described. For purposes herein, a video sequence is made up of multiple images referred to herein as frames placed in order.
  • In one embodiment, the techniques disclosed herein include, but are not limited to: selecting a sub-frame at certain pixels from the current frame of input video and finding another sub-frame from the past frame of output video that satisfies a criterion; selecting a pixel-adaptive warped spatial transform and transforming the sub-frames into a spatial transform domain; deriving a detail-preserving adaptive threshold and thresholding the transform coefficients of the sub-frames from the current frame and the past frame using hard thresholding (set to zero if magnitude of transform coefficients is less than the threshold) or other thresholding techniques such as soft-thresholding; further transforming the spatial-transform coefficients using a temporal transform and thresholding a selected sub-set of the temporal-transform coefficients; inverse transforming the temporal-transform coefficients first temporally and then spatially to get the processed sub-frames belonging to both current frame and past frame; and combining the processed sub-frames belonging to current frame from input video to obtain the current frame for output video. These operations can be repeated for all the frames of the input video.
  • In the following description, numerous details are set forth to provide a more thorough explanation of the present invention. It will be apparent, however, to one skilled in the art, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring the present invention.
  • Some portions of the detailed descriptions which follow are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
  • It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
  • The present invention also relates to apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.
  • The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will appear from the description below. In addition, the present invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the invention as described herein.
  • A machine-readable medium includes any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer). For example, a machine-readable medium includes read only memory (“ROM”); random access memory (“RAM”); magnetic disk storage media; optical storage media; flash memory devices; electrical, optical, acoustical or other form of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.); etc.
  • Overview
  • FIG. 1A illustrates one embodiment of a noise and flicker reduction module to reduce noise and/or flicker in an input video. Referring to FIG. 1A, noise and flicker reduction block 101 receives input video 100. Input video 100 includes noise and/or flicker. Noise and flicker reduction block 101 also receives a vector of optional parameters, referred to herein as OP, and threshold parameters T, T S1, T S2. In response to these inputs, noise and flicker reduction block 101 generates output video 102 with reduced noise and flicker.
  • FIG. 1B illustrates a flow diagram of one embodiment of a process for performing image processing on a video sequence. The process is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general purpose computer system or a dedicated machine), or a combination of both.
  • Referring to FIG. 1B, the process begins with processing logic receiving an input video (processing block 111).
  • In response to receiving the input video, processing logic performs operations to reduce one or both of noise and flicker in the input video using spatial and temporal processing (processing block 112). In one embodiment, these operations include applying a spatial transform and a temporal transform with adaptive thresholding of coefficients. In one embodiment, applying the spatial transform and the temporal transform comprises applying at least one warped transform to a sub-frame to create transform coefficients.
  • FIG. 2 illustrates a more detailed flow diagram of one embodiment of a process for performing image processing on a video sequence. The process is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general purpose computer system or a dedicated machine), or a combination of both.
  • In the process described below, x denotes the current frame from the input video that is being processed by the techniques described herein, y denotes the past output frame produced by the techniques described herein, and T, T S1, T S2 denote threshold parameters used by the image processing process. Furthermore, a vector denoted by OP, containing other optional parameters, can be supplied. The user or an algorithm can determine the most desired parameters using optimization of subjective/objective quality, using model-based techniques, or using other methods. Calibration algorithms can also be used. Such algorithms can also take advantage of partial/complete knowledge of either the video processing pipeline or the input video or both. In one embodiment, all video frames are represented as vectors by arranging the pixels in raster-scan order and N represents the number of pixels in each video frame.
  • After frame x has been obtained, the sub-frame selection process of processing block 202 of FIG. 2 begins. A sub-frame type S is defined as an M2×1 integer-valued vector. For purposes herein, M can be any integer greater than zero. {S1,S2,S3, . . . } is a library of sub-frame types. For each pixel i in a set of selected pixels from frame x, where pixels are numbered in raster-scan order, a sub-frame type si is selected from the library and a vector pi is formed as pi=si+i× 1, where 1 is an M2×1 vector with all elements equal to 1. In one embodiment, for pixels that are not selected, pi is a vector of zeros. The set of selected pixels can be predetermined or signaled within the vector OP. In this embodiment, a sub-frame is formed and processed for each pixel in the image; that is, the set of selected pixels is the entire set of pixels in the frame. However, in another embodiment, the processing may be performed only on a selected subset of the pixels and not on all the pixels in the image. The subset may be predetermined or signaled as part of the side-information. FIGS. 13A-E illustrate examples of such subsets; other subsets may be used with the teachings described herein. An M2×1 vector zi called a sub-frame is formed with pixel values of frame x at locations corresponding to elements of pi. Pixel i is called the pivot for sub-frame zi. FIG. 4 shows an example sub-frame zi pivoted at pixel i, with pixels numbered in raster-scan order starting from “1”. A sub-frame is organized into M vectors called warped rows. The first warped row has the sub-frame elements 1 to M in that order; the second warped row has the elements (M+1) to 2M; and so on.
  • In one embodiment, M is equal to 4 and the library of sub-frame types corresponds to the set of masks illustrated in FIGS. 3A-3M. Referring to FIGS. 3A-3M, the masks in this library correspond to different directions, as shown by the arrows. The mask in FIG. 3A is referred to herein as a regular mask because it corresponds to the regular horizontal or vertical directions. The other masks are called directional masks since they correspond to non-trivial directions. The differential-position (Ω) of a pixel (‘a’ to ‘p’) in a mask is defined as Ω=CC+W×CR, where W is the width of frame y. CC is the number of columns one needs to move horizontally to the right, starting from the column of pixel ‘a’, to get to the column of the current pixel of interest. CR is the number of rows one needs to move vertically down, starting from the row of pixel ‘a’, to get to the row of the current pixel of interest. For example, in the case of the mask in FIG. 3H, pixel ‘c’ has CC=−1 and CR=2. The sub-frame type corresponding to a mask is the vector containing the differential-positions of pixels in that mask ordered from ‘a’ to ‘p’.
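  • By way of illustration, the differential-positions of the regular mask and the gathering of a sub-frame can be sketched as follows (a minimal NumPy sketch, assuming zero-based raster-scan indexing and ignoring frame-boundary handling; all function names are illustrative):

    import numpy as np

    def regular_subframe_type(M, W):
        # Differential-positions Omega = CC + W*CR for the M x M regular
        # mask, listed in raster order ('a' first).
        cr, cc = np.meshgrid(np.arange(M), np.arange(M), indexing='ij')
        return (cc + W * cr).reshape(-1)

    def extract_subframe(frame_vec, s, i):
        # Sub-frame z_i pivoted at pixel i: gather the frame values at
        # locations p_i = s_i + i*1 (frame stored as a raster-scan vector).
        return frame_vec[s + i]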
  • In one embodiment, the choice of the sub-frame type for a pixel is made by always choosing the sub-frame type corresponding to the regular mask. In another embodiment, the choice of the sub-frame type for a pixel is made, for each selected pixel, (1) by evaluating, for each sub-frame type, a 2-D DCT over the sub-frame formed, and (2) by choosing, for a given threshold T, the sub-frame type that minimizes the number of non-zero transform coefficients with magnitude greater than T. In yet another embodiment, the choice of the sub-frame type for a pixel is made by choosing, for each selected pixel, the sub-frame type that minimizes the warped row variance of pixel values averaged over all warped rows. In still another embodiment, the choice of the sub-frame type for a pixel is made by having, for a block of K×L pixels, each pixel vote for a sub-frame type (based on the sub-frame type that minimizes the warped row variance of pixel values averaged over all warped rows) and choosing the sub-frame type with the most votes for all the pixels in the K×L block, where K and L can be any integers greater than 0. In one embodiment, K and L are both set to 4. In still another embodiment, the choice of the sub-frame type for a pixel is made by forming, for each pixel, a block of K×L pixels and choosing a sub-frame type by using the preceding voting scheme on this block. In each case, the chosen sub-frame type is used for the current pixel. Thus, by using one of these measured statistics for each mask, the selection of a sub-frame type is performed.
  • Note that masks other than those in FIGS. 3A-3M may be used.
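  • The warped-row-variance criterion above can be sketched as follows (a NumPy sketch of the per-pixel choice; the block-based voting variant simply tallies this choice over a K×L block and keeps the majority winner):

    import numpy as np

    def warped_row_variance(z, M):
        # Variance of pixel values within each of the M warped rows,
        # averaged over all warped rows.
        return z.reshape(M, M).var(axis=1).mean()

    def choose_subframe_type(frame_vec, i, library, M):
        # Pick the index of the sub-frame type in the library that
        # minimizes the averaged warped row variance at pivot pixel i.
        costs = [warped_row_variance(frame_vec[s + i], M) for s in library]
        return int(np.argmin(costs))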
  • FIG. 5 is a flow diagram of one embodiment of sub-frame selection processing. The process is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general purpose computer system or a dedicated machine), or a combination of both.
  • Referring to FIG. 5, the process begins by processing logic receiving frame x and determining whether the sub-frames are pixel-adaptive (processing block 501). If the sub-frames are not pixel-adaptive, processing logic chooses the regular sub-frame type for all pixels (processing block 502). If the sub-frames of frame x are pixel adaptive, processing logic, for each pixel, marks the sub-frame type that minimizes the warped row variance (processing block 503). This is done using the library of sub-frame types (510) as described above. Thus, for each pixel, the sub-frame type that minimizes the warped row variance among the library of sub-frame types is marked.
  • Next, processing logic determines whether the choice is block-based (processing block 504). If processing logic determines the choice is block-based, processing logic counts the number of pixels that marked each sub-frame type in each block (processing block 506) and, for all pixels in a block, processing logic chooses the sub-frame type marked by most pixels in that block (processing block 507). In other words, if the choice is block-based, the sub-frame type marked by most pixels in a block is chosen for all pixels in that block. If processing logic determines the choice is not block-based, processing logic chooses, for each pixel, the sub-frame type marked by that pixel (processing block 505). In other words, each pixel chooses the sub-frame type marked by itself.
  • The choice of the sub-frame types for each pixel can be signaled within the vector OP.
  • The sub-frame type si is used to form a vector p i=si+mi× 1, where mi is an integer and 1 is an M2×1 vector with all elements equal to 1. Processing logic also forms an M2×1 vector denoted by z i (also a sub-frame) with the pixel values of the past output frame, y, at locations corresponding to elements of p i (processing block 203).
  • The choice of mi can be made in a number of different ways. In alternative embodiments, the choice of mi is performed in one of the following ways:
    • i. mi=i
    • ii. choose mi from all possible values such that a p-norm (p≧0) between zi and z i, ∥zi− z i∥p, is minimized.
    • iii. choose mi based on ‘ii’ above, but restrict the search set to {j: j=i+jh+W×jv}, where W is the width of frame y and jh, jv ∈ {−J,−(J−1), . . . ,−1,0,1, . . . ,J−1,J}. J is any integer greater than or equal to zero. In one embodiment, when option ‘iii’ is used, the value of J is set to 2 and a 2-norm is used.
    • iv. calculate mi based on ‘iii’ above and add a value k=kh+W×kv to mi, where W is the width of frame y and kh,kv are randomly generated values from the set {−K,−(K−1), . . . ,−1,0,1, . . . ,K−1, K}. K is any integer greater than or equal to zero.
      The choice of mi can be signaled within the vector OP.
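  • Option ‘iii’ above can be sketched as follows (a minimal NumPy sketch with J=2 and a 2-norm, assuming zero-based raster-scan vectors and skipping out-of-frame candidates; names are illustrative):

    import numpy as np

    def find_mi(z_i, y_past_vec, s_i, i, W, J=2):
        # Search offsets m = i + jh + W*jv for jh, jv in [-J, J] and keep
        # the m whose past-frame sub-frame is closest to z_i in 2-norm.
        best_m, best_cost = i, np.inf
        for jv in range(-J, J + 1):
            for jh in range(-J, J + 1):
                m = i + jh + W * jv
                p = s_i + m
                if p.min() < 0 or p.max() >= y_past_vec.size:
                    continue  # candidate sub-frame falls outside the frame
                cost = np.linalg.norm(z_i - y_past_vec[p])
                if cost < best_cost:
                    best_m, best_cost = m, cost
        return best_m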
  • In another embodiment, the sub-frame z i is formed after the past output frame y has been processed using techniques such as, but not limited to, Intensity Compensation and Non-linear Prediction Filter, to compensate for issues such as, for example, brightness changes and scene fades.
  • FIG. 6 is a flow diagram of one embodiment of a sub-frame formation process from the past output frame. The process is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general purpose computer system or a dedicated machine), or a combination of both.
  • Referring to FIG. 6, the process begins by processing logic using a search set {m1,m2, . . . } and, for each value mj, computes p i j according to the following formula:

  • p i j =s i +m j× 1
  • (processing block 601).
  • Next, processing logic forms sub-frame z i j from frame y using p i j (processing block 602). Then, for each j, processing logic computes the p-norm

  • ∥zi− z i j∥p
  • (processing block 603).
  • After computing the p-norm, processing logic selects mk such that it gives the least p-norm; sets mi equal to mk, sets p i according to the following formula:

  • p i =s i +m i× 1
  • and forms sub-frame z i using p i (processing block 604).
  • Spatial Transform Selection and Application
  • As part of processing block 204 of FIG. 2, processing logic also performs spatial transform selection and application. More specifically, processing logic transforms the sub-frames zi and z i into ei and ēi respectively using a pixel-adaptive warped spatial transform Hi. The transform is called ‘warped’ because the support of the transform basis has been warped to match the sub-frame shape. The transform is called pixel-adaptive because sub-frames pivoted at different pixels can use different transforms, in addition to the fact that the choice of sub-frame type can vary from pixel to pixel. The transform Hi can be chosen from a library of transforms such as separable DCT, non-separable DCT, 2-D Gabor wavelets, Steerable pyramids, 2-D directional wavelets, Curvelets and Contourlets. In one embodiment, the spatial transform used is an orthonormal separable 2D-DCT in a non-adaptive fashion. In another embodiment, the spatial transform used is an orthonormal separable 2D-Hadamard transform in a non-adaptive fashion.
  • It should be noted that a separable transform becomes non-separable after it is warped. The choice of the transform can be fixed a priori or can be adaptive to the different sub-frames pivoted at different pixels. In the adaptive case, the chosen transform is the one that has the least number of coefficients in ei with absolute value greater than a master threshold T S1.
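  • For the regular mask, the warped transform reduces to the ordinary separable 2D-DCT acting on the vectorized sub-frame. This can be sketched as follows (a NumPy sketch assuming row-major vectorization; the selection rule at the end mirrors the least-count criterion above):

    import numpy as np

    def dct_matrix(M):
        # Orthonormal DCT-II matrix: rows index frequency k, columns index n.
        n = np.arange(M)
        D = np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * M))
        D *= np.sqrt(2.0 / M)
        D[0, :] = np.sqrt(1.0 / M)
        return D

    def separable_2d_dct_operator(M):
        # M^2 x M^2 operator: (D kron D) @ vec(Z) equals vec(D @ Z @ D.T)
        # for row-major vec, i.e. the separable 2-D DCT of the sub-frame.
        D = dct_matrix(M)
        return np.kron(D, D)

    def choose_spatial_transform(z_i, H_library, T_s1):
        # Keep the transform whose coefficients have the fewest magnitudes
        # above the master threshold T_S1.
        coeffs = [H @ z_i for H in H_library]
        counts = [np.count_nonzero(np.abs(e) > T_s1) for e in coeffs]
        k = int(np.argmin(counts))
        return H_library[k], coeffs[k]

    In the non-adaptive case, H_library could contain just separable_2d_dct_operator(M).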
  • A flow diagram of one embodiment of a spatial transform selection process for a sub-frame is illustrated in FIG. 7. The process is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general purpose computer system or a dedicated machine), or a combination of both.
  • Referring to FIG. 7, the process begins by processing logic testing whether the transform is pixel-adaptive (processing block 701). This test may be performed by referring to a list. In one embodiment, the list can be such that the transform is non-adaptive. In another embodiment, the list can be signaled within the vector OP. If processing logic determines that the transform is not pixel-adaptive, processing logic selects a 2-D orthonormal separable DCT for use as the transform Hi, generates the transform coefficients ei by applying the transform to the sub-frame zi, and generates the transform coefficients ēi by applying the transform to the sub-frame z i (processing block 702).
  • If processing logic determines the transform is pixel-adaptive, then, for each transform Hj in the library of transforms {H1,H2, . . . } (processing block 704), processing logic computes the transform coefficients ej using the formula:

  • e j =H j ×z i
  • (processing block 703).
    The transform coefficients ej correspond to the transform Hj.
  • Next, for each j, processing logic counts the number of coefficients in ej with an absolute value greater than a threshold T S1 (processing block 705), chooses the transform Hk with the least count from the library of transforms, sets the transform Hi equal to Hk, sets the coefficients ei equal to the transform coefficients ek, and generates the transform coefficients ēi by applying the transform Hi to the sub-frame z i (processing block 706).
  • The choice of the spatial transform can be signaled within the vector OP.
  • Thresholding
  • As part of processing block 204 of FIG. 2, processing logic also performs thresholding. More specifically, processing logic applies an adaptive threshold {circumflex over (T)}i1 on selected elements of ei to get ai. In one embodiment, all the elements of ei are selected. In another embodiment, all elements except the first element (usually the DC element) are selected. In still another embodiment, none of the elements are selected. The transform coefficients ei are also thresholded using a master threshold T S1 to get êi. The thresholding operation can be done in a variety of ways such as, for example, hard thresholding and soft thresholding. The hard thresholding operation is defined as
  • HT(x) = {x, |x| ≥ T; 0, |x| < T},
  • where T is the threshold used. Similarly, the soft thresholding operation with T as the threshold is defined as
  • ST(x) = {x − T, x ≥ T; x + T, x ≤ −T; 0, |x| < T}.
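  • Both operations can be sketched as follows (vectorized over the coefficients of a sub-frame; coefficients whose magnitude meets the threshold are kept, matching the definitions above):

    import numpy as np

    def hard_threshold(x, T):
        # Keep x where |x| >= T, set to zero elsewhere.
        return np.where(np.abs(x) >= T, x, 0.0)

    def soft_threshold(x, T):
        # Shrink magnitudes by T: x-T for x >= T, x+T for x <= -T, else 0.
        return np.sign(x) * np.maximum(np.abs(x) - T, 0.0)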
  • In alternative embodiments, the threshold {circumflex over (T)}i1 is computed in one of the following ways:
      • {circumflex over (T)}i1=0
      • {circumflex over (T)}i1= T S1
      • {circumflex over (T)}i1=f( T S1, Σj=1N∥ej−êj∥2), where f( ) represents a function.
      • {circumflex over (T)}i1=f( T S1, ∥ei−êi2), where f( ) represents a function.
      • {circumflex over (T)}i1= T S1×f(∥ei−êi2). The function f( ) is a monotonic decreasing stair-case function as illustrated in FIG. 11. In one embodiment, the step positions of the function (f1, f2, . . . , fn and E1, E2, . . . , En) are tuned on a training set to achieve a local optimum in reconstructed image/video quality. In one embodiment, this threshold computation is used with hard thresholding.
      • Perform a search on possible values for {circumflex over (T)}i1 to minimize the number of non-zero elements in ai such that ∥ei−ai2<Elocal. Elocal can be part of the side-information or default values may be used. This can be viewed as a setting for the algorithm. In one embodiment, a default value can be obtained by tuning on a training set and choosing the value that achieves a local optimum in reconstructed image/video quality.
      • Perform a joint search on possible values for ({circumflex over (T)}11, {circumflex over (T)}21, . . . , {circumflex over (T)}N1) to minimize the total number of non-zero elements in ak summed over all k ∈ {1,2, . . . ,N} such that
  Σj=1N∥ej−aj∥2 < Eglobal. Eglobal can be part of the side-information or default values may be used. This can be viewed as a setting for the algorithm. In one embodiment, a default value can be obtained by tuning on a training set and choosing the value that achieves a local optimum in reconstructed image/video quality.
    The value of {circumflex over (T)}i1 can be signaled within the vector OP. In another embodiment, the choice of the option used for calculating {circumflex over (T)}i1 can be signaled within the vector OP.
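  • As one concrete reading of the stair-case option, {circumflex over (T)}i1= T S1×f(∥ei−êi∥2) can be sketched as follows (the step positions and values below are illustrative placeholders, not tuned values):

    import numpy as np

    def staircase(u, E, f):
        # Monotonic decreasing stair-case: E holds increasing break points,
        # f holds the decreasing values taken on each step.
        k = np.searchsorted(E, u)
        return f[min(k, len(f) - 1)]

    def adaptive_threshold_T_i1(e_i, T_s1):
        e_hat = np.where(np.abs(e_i) >= T_s1, e_i, 0.0)  # master threshold
        u = np.linalg.norm(e_i - e_hat)
        E = [0.5, 1.0, 2.0, 4.0]       # hypothetical step positions
        f = [1.0, 0.8, 0.6, 0.4, 0.2]  # hypothetical (decreasing) values
        return T_s1 * staircase(u, E, f)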
  • An adaptive threshold {circumflex over (T)}i2 is applied on selected elements of ēi to get āi. In one embodiment, all the elements of ēi are selected. In another embodiment, all elements except the first element (usually the DC element) are selected. In still another embodiment, none of the elements are selected. The transform coefficients ēi are also thresholded using a master threshold T S2 to get {tilde over (e)}i. The thresholding operation can be done in a variety of ways such as hard thresholding and soft thresholding described above.
  • In alternative embodiments, the threshold {circumflex over (T)}i2 is computed in one of the following ways:
      • {circumflex over (T)}i2=0
      • {circumflex over (T)}i2= T S2
      • {circumflex over (T)}i2=f( T S2, Σj=1N∥ēj−{tilde over (e)}j∥2), where f( ) represents a function.
      • {circumflex over (T)}i2=f( T S2,∥ēi−{tilde over (e)}i2), where f( ) represents a function.
      • {circumflex over (T)}i2= T S2×f(∥ēi−{tilde over (e)}i2). The function f( ) is a monotonic decreasing stair-case function as illustrated in FIG. 11. The step positions of the function (f1, f2, . . . , fn and E1, E2, . . . , En) are tuned on a training set to achieve a local optimum. In one embodiment, this threshold computation is used and hard thresholding is used for the thresholding operation.
      • Perform a search on possible values for {circumflex over (T)}i2 to minimize the number of non-zero elements in āi such that ∥ēi−āi2<Elocal. Elocal can be part of the side-information or default values may be used. This can be viewed as a setting for the algorithm. In one embodiment, a default value can be obtained by tuning on a training set and choosing the value that achieves a local optimum in reconstructed image/video quality.
      • Perform a joint search on possible values for ({circumflex over (T)}12, {circumflex over (T)}22, . . . , {circumflex over (T)}N2) to minimize the total number of non-zero elements in āk summed over all k ∈ {1,2, . . . ,N} such that
  Σj=1N∥ēj−āj∥2 < Eglobal. Eglobal can be part of the side-information or default values may be used. This can be viewed as a setting for the algorithm. In one embodiment, a default value can be obtained by tuning on a training set and choosing the value that achieves a local optimum in reconstructed image/video quality.
    In one embodiment, the value of {circumflex over (T)}i2 is signaled within the vector OP. In another embodiment, the choice of the option used for calculating {circumflex over (T)}i2 is signaled within the vector OP.
  • Temporal Transform Selection and Application
  • Processing logic in processing block 205 uses the results of the thresholding, namely vectors ai and āi, to form an M2×2 matrix ãi=[ai h(āi)]. For purposes herein, the function h( ) may be an identity function, a simple linear scaling of all the elements of āi to match brightness changes, or a more general function to capture more complex scene characteristics such as fades. Processing logic transforms ãi into bi using a pixel-adaptive temporal transform Gi; bi=ãi×Gi. The transform Gi can be chosen from a library of transforms. The transform is called pixel-adaptive because sub-frames pivoted at different pixels can use different transforms. In the adaptive case, the chosen transform is the one that has the least number of coefficients in bi with absolute value greater than a master threshold T.
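  • This step can be sketched as follows for the non-adaptive Haar choice described below (a minimal NumPy sketch with h( ) taken as the identity; ãi is M2×2 and Gi is 2×2):

    import numpy as np

    def temporal_transform(a_i, a_bar_i):
        # Stack the two thresholded spatial-coefficient vectors into the
        # M^2 x 2 matrix a_tilde and apply the orthonormal Haar transform.
        G = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
        a_tilde = np.column_stack([a_i, a_bar_i])
        return a_tilde @ G  # b_i = a_tilde_i x G_i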
  • FIG. 8 is a flow diagram of one embodiment of a temporal transform selection process. The process is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general purpose computer system or a dedicated machine), or a combination of both.
  • Referring to FIG. 8, the process begins by processing logic testing whether the transform is pixel-adaptive (processing block 801). This test may be performed by referring to a list. In one embodiment, the list can be such that the transform is non-adaptive. In another embodiment, the list can be signaled within the vector OP. If processing logic determines that the transform is not pixel-adaptive, processing logic selects transform Gi based on a default temporal transform and generates the transform coefficients bi by applying the transform Gi to the matrix ãi (processing block 802). In one embodiment, the default temporal transform used is a Haar transform, i.e.
  • Gi = [1/√2 1/√2; 1/√2 −1/√2].
  • The choice of the temporal transform can be signaled within the vector OP.
  • If processing logic determines the transform is pixel-adaptive, then, for each transform Gj in the library of transforms {G1, G2, . . . } (processing block 804), processing logic computes the transform coefficients bj using the formula:

  • bj=ãi×Gj
  • (processing block 803).
    The transform coefficients bj correspond to the transform Gj.
  • Next, for each j, processing logic counts the number of coefficients in bj with an absolute value greater than a master threshold T (processing block 805), chooses the transform Gk with the least count from the library of transforms, sets the transform Gi equal to Gk, and sets the coefficients bi equal to the transform coefficients bk (processing block 806).
  • Thresholding after Temporal Transform
  • After generating the transform coefficients bi, the transform coefficients bi are thresholded using T to get ci (processing block 206 of FIG. 2). The thresholding operation can be done in a variety of ways such as hard thresholding and soft thresholding as described above. The choice of thresholding can be signaled within the vector OP.
  • In one embodiment, hard thresholding is used as illustrated in FIG. 9. Referring to FIG. 9, the hard thresholding is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general purpose computer system or a dedicated machine), or a combination of both.
  • The hard thresholding begins with a master threshold T and the coefficients bi as inputs; for each element bij ∈ bi, processing logic computes the corresponding element cij ∈ ci according to the following equation:
  • cij = {bij, |bij| ≥ T; 0, |bij| < T}
  • (processing block 901). In this manner, processing logic sets to zero all coefficients with absolute values less than the master threshold T, and the resulting coefficients are stored as ci.
  • In one embodiment, some elements of bi, selected a priori, are not thresholded and are copied directly into their respective positions in ci. In a specific embodiment, the elements in the first column of bi are not thresholded. The choice of the set of elements that are not thresholded can be signaled within the vector OP.
  • In one embodiment, the elements cij ∈ ci are optionally enhanced by using the equation cij=cij*αj0+αj1, where the parameters αj0, αj1 are tuned on a training set to achieve a local optimum in reconstructed image/video quality. Note that such an operation occurs after processing block 206 in FIG. 2. In one embodiment, the parameters can be signaled within the vector OP.
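  • These operations can be sketched together as follows (the keep-first-column behavior and the αj0, αj1 enhancement are both optional; the α values would come from tuning as described above):

    import numpy as np

    def threshold_temporal(b_i, T, keep_first_col=True,
                           alpha0=None, alpha1=None):
        # Hard-threshold the temporal coefficients with the master threshold.
        c_i = np.where(np.abs(b_i) >= T, b_i, 0.0)
        if keep_first_col:
            c_i[:, 0] = b_i[:, 0]  # copy first column through unthresholded
        if alpha0 is not None and alpha1 is not None:
            c_i = c_i * alpha0 + alpha1  # optional tuned enhancement
        return c_i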
  • Inverse Transformation
  • After thresholding, processing logic inverse transforms (with a temporal transform) the coefficients using Gi −1 to obtain {tilde over (d)}i=[di d i]=ci×Gi −1 (processing block 207). Processing logic also applies an inverse transform (spatial) Hi −1 on di to obtain the processed sub-frame {circumflex over (z)}i (processing block 208).
  • In one embodiment, the current frame is processed without using the past frame output by a previous iteration. In this embodiment, the vectors z i, ēi, āi and the matrices ãi, bi, ci, {tilde over (d)}i are not computed. The vector di is obtained as di=ai and the inverse transform (spatial) Hi −1 is applied on di to obtain the processed sub-frame {circumflex over (z)}i ({circumflex over (z)}i=Hi −1×di).
  • In another embodiment, a set of past output frames produced by the image processing can be used instead of just the immediate past output frame y. Let NPF denote the number of past frames in the set. In this case, each of the past frames in the set contributes one column of ãi in the same way as described above: the immediate past output frame contributes āi as the second column, the output frame before it contributes the third column, and so on. In one embodiment, ãi, bi, ci and {tilde over (d)}i are of size M2×(NPF+1) and Gi is of size (NPF+1)×(NPF+1).
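  • The inverse path of processing blocks 207 and 208 can be sketched as follows (a minimal NumPy sketch for the single-past-frame case; since the transforms here are orthonormal, the matrix inverses could equally be transposes):

    import numpy as np

    def inverse_transforms(c_i, G_i, H_i):
        # Undo the temporal transform, keep the column belonging to the
        # current frame, then undo the spatial transform.
        d_tilde = c_i @ np.linalg.inv(G_i)
        d_i = d_tilde[:, 0]
        return np.linalg.inv(H_i) @ d_i  # processed sub-frame z_hat_i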
  • Combining Sub-Frames
  • After applying the inverse transform to the thresholded coefficients, all of the processed sub-frames are combined in a weighted fashion to form frame y. In one embodiment, a weight wi is computed for each processed sub-frame {circumflex over (z)}i. In alternative embodiments, weights based on ei and ai are computed in one of the following ways:
      • wi=1
      • wi=f(ei,ai), where f( ) represents a function.
      • MSE option 1:
  • wi = {1/∥ei−ai∥2, ∥ei−ai∥2 > emin; 1/emin, ∥ei−ai∥2 ≤ emin},
  • where emin is a constant.
      • L-p Norm (p≧0) option 1:
  • wi = {1/∥ai∥p, ∥ai∥p > nmin; 1/nmin, ∥ai∥p ≤ nmin},
  • where nmin is a constant.
      • Tuned weights option 1: wi=ft(∥ai∥0), where ft( ) represents a mapping from the set {1,2, . . . ,M2} (set of possible values for ∥ai∥0) to [0,1]. ft( ) is tuned using optimization algorithms such as simulated annealing to get the best performance (measured using metrics such as PSNR or using subjective scores) on a set of training videos.
      • In other embodiments, weights based on bi and ci can be computed in one of the following ways:
      • wi=f(bi,ci), where f( ) represents a function.
      • MSE option 2:
  • wi = {1/∥bi−ci∥2, ∥bi−ci∥2 > emin; 1/emin, ∥bi−ci∥2 ≤ emin},
  • where emin is a constant.
      • L-p Norm (p≧0) option 2:
  • wi = {1/∥ci∥p, ∥ci∥p > nmin; 1/nmin, ∥ci∥p ≤ nmin},
  • where nmin is a constant.
      • Tuned weights option 2: wi=ft(∥ci∥0), where ft( ) represents a mapping from the set {1,2, . . . ,2M2} (set of possible values for ∥ci∥0) to [0,1]. ft( ) is tuned using optimization algorithms such as simulated annealing to get the best performance (measured using metrics such as PSNR or using subjective scores) on a set of training videos.
        The mapping ft( ) and/or the calculated weight can be signaled within the vector OP.
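  • As an example, “MSE option 1” can be sketched as follows (the emin value is an illustrative placeholder):

    import numpy as np

    def subframe_weight(e_i, a_i, e_min=1e-3):
        # Inverse of the thresholding distortion, clipped at 1/e_min.
        err = np.linalg.norm(e_i - a_i)
        return 1.0 / max(err, e_min)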
  • The processed sub-frames {circumflex over (z)}1:N (corresponding to all pixels) are combined together to form y in a weighted manner. One embodiment of this process is described below for yj, the value of the jth pixel, and sketched in code after the steps.
      • 1. Set yj=0 and nj=0, where nj is the normalization coefficient for jth pixel.
      • 2. For each processed sub-frame {circumflex over (z)}i
        • a. If pixel j is part of pi
          • i. k=index of pixel j in pi.
          • ii. yj=yj+wi×{circumflex over (z)}ik, where {circumflex over (z)}ik is the value of pixel j in the processed sub-frame {circumflex over (z)}i.
          • iii. nj=nj+wi
  • 3. yj = yj/nj
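  • Steps 1-3 above can be sketched compactly as follows (np.add.at accumulates correctly when a pixel is covered by several sub-frames; the small constant guards pixels not covered by any sub-frame and is an implementation safeguard, not part of the method):

    import numpy as np

    def combine_subframes(subframes, pivots, weights, s_types, N):
        y = np.zeros(N)
        n = np.zeros(N)
        for z_hat, i, w, s in zip(subframes, pivots, weights, s_types):
            p = s + i                    # pixel locations covered by z_hat
            np.add.at(y, p, w * z_hat)   # y_j += w_i * z_hat_ik
            np.add.at(n, p, w)           # n_j += w_i
        return y / np.maximum(n, 1e-12)  # y_j = y_j / n_j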
  • FIG. 10 is a flow diagram of one embodiment of a process for combining all processed sub-frames to form frame y. The process is performed by processing logic that may comprise hardware (circuitry, dedicated logic, etc.), software (such as is run on a general purpose computer system or a dedicated machine), or a combination of both.
  • Referring to FIG. 10, the process begins by setting the value ym and the normalizing factor nm to zero for each pixel m=1:N in frame y (processing block 1001). Next, processing logic initializes the pixel index j and the sub-frame index i to one (processing block 1002).
  • After initialization, processing logic determines whether pixel j ∈ pi (processing block 1003). If it is, the process transitions to processing block 1004. If not, process transitions to processing block 1005.
  • At processing block 1004, in one embodiment, processing logic updates yj and nj using {circumflex over (z)}ik, the value of the pixel j in {circumflex over (z)}i, and using weight wi as described above. In one embodiment, the weight is calculated according to the following:
  • wi = {1/∥ei−ai∥2, ∥ei−ai∥2 > emin; 1/emin, ∥ei−ai∥2 ≤ emin}
  • In processing block 1004, k is equal to the index of pixel j in pi. In one embodiment, processing logic updates yj and nj based on the following equation:

  • y j =y j +w i ×{circumflex over (z)} ik

  • n j =n j +w i
  • After processing logic updates yj and nj, the process transitions to processing block 1005.
  • At processing block 1005, processing logic checks whether the index i=N, the total number of pixels in the frame. If so, the process transitions to processing block 1007. If not, the process transitions to processing block 1006. At processing block 1006, the index is incremented by one and the process transitions to processing block 1003.
  • At processing block 1007, processing logic updates yj according to the following equation:
  • yj = yj/nj.
  • After updating yj, processing logic sets the index i equal to 1 (processing block 1008) and checks whether the index j is equal to N (processing block 1009). If it is, the process ends. If not, the process transitions to processing block 1010 where the index j is incremented by one. After incrementing the index j by one, the process transitions to processing block 1003.
  • The frame y is the output corresponding to the current input frame x. If there are more frames to process, processing logic updates the current input frame x, copies y into the past output frame y, and repeats the process as shown in FIG. 2 (processing block 212).
  • In one embodiment, the frame y undergoes further image/video processing in pixel-domain or a transform domain. In one embodiment, unsharp masking is performed on frame y to enhance high-frequency detail. In another embodiment, multiple blocks of size P×P pixels are formed from frame y, where P is an integer, and each P×P block f undergoes a block transform, such as a 2-D DCT or 2-D Hadamard transform, to produce another P×P block h. The elements of P×P block h, h(i,j), 0≦i, j≦P−1, are processed to form an enhanced P×P block ĥ such that ĥ(i,j)=h(i,j)*α(i,j). In alternative embodiments, the enhancement factor α(i,j) can be computed in one of the following ways:
      • a. α(i,j)=α0*(i+j)^β+δ1
      • b. α(i,j)=α0*i^β*j^δ+δ1
        where the parameters (α0, δ1, β and δ) are tuned on a training set to achieve a local optimum in reconstructed image/video quality. In one embodiment, the parameters can be signaled within the vector OP. Note that the above operations occur after processing block 210 of FIG. 2. The enhanced P×P blocks are inverse transformed and combined to form an enhanced version of frame y.
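  • Using option ‘a’ for α(i,j), the enhancement can be sketched as follows (a SciPy sketch; the parameter values are illustrative placeholders chosen so that α(0,0)=1 leaves the DC coefficient unchanged):

    import numpy as np
    from scipy.fft import dctn, idctn

    def enhance_block(f_block, alpha0=0.5, beta=1.0, delta1=1.0):
        # 2-D DCT of the P x P block, per-coefficient scaling by
        # alpha(i,j) = alpha0*(i+j)**beta + delta1, then inverse 2-D DCT.
        P = f_block.shape[0]
        h = dctn(f_block, norm='ortho')
        i, j = np.indices((P, P))
        h_hat = h * (alpha0 * (i + j) ** beta + delta1)
        return idctn(h_hat, norm='ortho')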
    An Alternative Image Processing Embodiment
  • In an alternative embodiment, the process described in FIG. 2 can be modified to obtain a lower complexity algorithm, hereinafter referred to as the lower-complexity technique. The lower-complexity technique is illustrated by the flow chart in FIG. 12. In this embodiment, the frame y is the output of the lower-complexity technique corresponding to the current input frame x; if there are more frames to process, the current input frame x is updated, y is copied into the past output frame y, and the process is repeated as shown in FIG. 12.
  • Referring to FIG. 12, the process begins by processing logic forming a frame {tilde over (y)} using the current input frame x and the past output frame y such that

  • {tilde over (y)}(j)=wz*x(j)−wy* y(j+m), j ∈ Z, 1≦j≦H*W,
  • where wz, wy are real numbers and m is an integer (processing block 1201). For purposes herein, the notation (j) denotes the value of pixel j (numbered in the raster scan order) in the frame of interest. For example, y(5) represents the value of 5th pixel of frame y. In one embodiment, wz=0.5 and wy=0.5. In one embodiment, the values wz and wy are signaled within the vector OP.
  • In alternative embodiments, the choice of m can be made in one of the following ways:
    • i. m=0
    • ii. choose m from all possible values such that the p-norm (p≧0) of {tilde over (y)}, ∥{tilde over (y)}∥p, is minimized.
    • iii. choose m based on ‘ii’ above, but restrict the search set to {j: j=jh+W×jv}, where W is the width of frame x and jh,jv ∈ {−J,−(J−1), . . . ,−1,0,1, . . . ,J−1,J}. J is any integer greater than or equal to zero.
  • In one embodiment, the choice of m can be signaled within the vector OP.
  • In another embodiment, the frame {tilde over (y)} is formed using a processed version of y instead of y to compensate for issues such as brightness changes and scene fades, where the processing includes techniques such as, but not limited to, Intensity Compensation and Non-Linear Prediction Filter.
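  • Forming {tilde over (y)} can be sketched as follows (m=0, i.e. option ‘i’, with the default weights; np.roll is a crude stand-in for the offset m and wraps at frame boundaries, which a full implementation would handle explicitly):

    import numpy as np

    def form_y_tilde(x_vec, y_past_vec, w_z=0.5, w_y=0.5, m=0):
        # y_tilde(j) = w_z*x(j) - w_y*y_past(j+m) over raster-scan vectors.
        return w_z * x_vec - w_y * np.roll(y_past_vec, -m)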
  • Processing logic forms an M2×1 vector zi called a sub-frame with pixel values of frame x at locations corresponding to elements of pi. Pixel i is called the pivot for sub-frame zi (processing block 1202). An M2×1 vector denoted by z i (also a sub-frame) is formed with the pixel values of frame {tilde over (y)} at locations corresponding to elements of pi (processing block 1202).
  • Processing logic selects a spatial transform Hi and applies the spatial transform to sub-frames zi and z i to get vectors ei and ēi respectively (processing block 1203).
  • Processing logic computes adaptive threshold {circumflex over (T)}i1 from T S1 using the same process described above and applies the adaptive threshold {circumflex over (T)}i1 on selected elements of ei to get ai (processing block 1203). In one embodiment, all the elements of ei are selected. In another embodiment, all elements except the first element (usually the DC element) are selected. The thresholding operation can be done in a variety of ways such as hard thresholding and soft thresholding, as described above.
  • After applying the adaptive threshold {circumflex over (T)}i1 on selected elements of ei, processing logic forms a vector di using ai, ei, ēi and using threshold T (processing block 1204). Let aij, eij, ēij and dij represent the jth element in the vectors ai, ei, ēi and di respectively, where j∈{1,2, . . . ,M2}. In alternative embodiments, the value dij is computed in one of the following ways:
  • i. dij = {[wy2*eij + wz*(wz*eij − ēij)]/[(wz+wy)*wy], aij ≠ 0 and |ēij| < T; eij, aij ≠ 0 and |ēij| ≥ T; 0, aij = 0}
    ii. dij = {[wy2*eij + wz*(wz*eij − ēij)]/[(wz+wy)*wy], |ēij| < T; aij, |ēij| ≥ T}
    iii. dij = eij
  • In one embodiment, the choice of the option used for calculating dij is signaled within the vector OP.
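  • Option ‘ii’ can be sketched elementwise over the M2 coefficients as follows (a minimal NumPy sketch of the formula as reconstructed above; names are illustrative):

    import numpy as np

    def fuse_coefficients(e_i, e_bar_i, a_i, T, w_z=0.5, w_y=0.5):
        # Blend current and past spatial coefficients where the past
        # coefficient is below the threshold; fall back to a_i elsewhere.
        fused = (w_y**2 * e_i + w_z * (w_z * e_i - e_bar_i)) \
                / ((w_z + w_y) * w_y)
        return np.where(np.abs(e_bar_i) < T, fused, a_i)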
  • Thereafter, processing logic applies the inverse spatial transform to the vector di to produce the sub-frame {circumflex over (z)}i (processing block 1205), and the remainder of the processing blocks 1206, 1207, 1208, and 1209 operate as their respective counterparts 209, 210, 211, and 212 in FIG. 2 to complete the process.
  • For the embodiments described above, the optional parameter vector OP, or parts of it, can be signaled by any module including, but not limited to, a codec, a camera, a super-resolution processor, etc. One simple way to construct the parameter vector OP is as follows: each choice is signaled using two elements in the vector. For the nth choice,
  • OP(2*n−1) = {0, choice is not signalled; 1, choice is signalled} and OP(2*n) = value representing the choice. OP(2*n) needs to be set and is used only when OP(2*n−1)=1.
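  • The layout can be sketched as follows (plain Python; the example values are illustrative):

    def read_choice(OP, n, default):
        # OP[2n-2] is the flag for the nth choice, OP[2n-1] its value
        # (0-based list indexing for the 1-based element numbering above).
        flag, value = OP[2 * n - 2], OP[2 * n - 1]
        return value if flag == 1 else default

    OP = [1, 3, 0, 0]                          # choice 1 signalled, value 3
    assert read_choice(OP, 1, default=0) == 3
    assert read_choice(OP, 2, default=7) == 7  # choice 2 left at its default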
  • The techniques described herein can be used to process a video sequence in any color representation including, but not limited to, RGB, YUV, YCbCr, YCoCg and CMYK. The techniques can be applied on any subset of the color channels (including the empty set or the all-channel set) in the color representation. In one embodiment, only the ‘Y’ channel in the YUV color representation is processed using the techniques described herein. The U and V channels are filtered using a 2-D low-pass filter (e.g., the LL band filter of the Le Gall 5/3 wavelet).
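  • The chroma filtering can be sketched as follows (a SciPy sketch; the kernel is one common form of the Le Gall 5/3 analysis low-pass filter, applied separably to realize the LL band):

    import numpy as np
    from scipy.ndimage import convolve1d

    def lowpass_chroma(channel):
        # Le Gall 5/3 low-pass kernel, applied along rows then columns.
        h = np.array([-1.0, 2.0, 6.0, 2.0, -1.0]) / 8.0
        tmp = convolve1d(channel, h, axis=0, mode='reflect')
        return convolve1d(tmp, h, axis=1, mode='reflect')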
  • The techniques described herein can be used to process only a pre-selected set of frames in a video sequence. In one embodiment, alternate frames are processed. In another embodiment, all frames belonging to one or more partitions of a video sequence are processed. The set of frames selected for processing can be signaled within OP.
  • In addition to the application of the techniques described herein on compressed/uncompressed video sequences, the techniques can also be applied to compressed video sequences that underwent post-processing such as Non-linear Denoising Filter. Furthermore, the techniques can be applied on video sequences that are obtained by super-resolving a low-resolution compressed/uncompressed video sequence. The techniques can also be applied on video sequences that are either already processed or will be processed by a frame-rate conversion module.
  • An Example of a Computer System
  • FIG. 14 is a block diagram of an exemplary computer system that may perform one or more of the operations described herein. Referring to FIG. 14, computer system 1400 may comprise an exemplary client or server computer system. Computer system 1400 comprises a communication mechanism or bus 1411 for communicating information, and a processor 1412 coupled with bus 1411 for processing information. Processor 1412 includes a microprocessor, such as, for example, a Pentium™, PowerPC™, or Alpha™ processor, but is not limited to a microprocessor.
  • System 1400 further comprises a random access memory (RAM), or other dynamic storage device 1404 (referred to as main memory) coupled to bus 1411 for storing information and instructions to be executed by processor 1412. Main memory 1404 also may be used for storing temporary variables or other intermediate information during execution of instructions by processor 1412.
  • Computer system 1400 also comprises a read only memory (ROM) and/or other static storage device 1406 coupled to bus 1411 for storing static information and instructions for processor 1412, and a data storage device 1407, such as a magnetic disk or optical disk and its corresponding disk drive. Data storage device 1407 is coupled to bus 1411 for storing information and instructions.
  • Computer system 1400 may further be coupled to a display device 1421, such as a cathode ray tube (CRT) or liquid crystal display (LCD), coupled to bus 1411 for displaying information to a computer user. An alphanumeric input device 1422, including alphanumeric and other keys, may also be coupled to bus 1411 for communicating information and command selections to processor 1412. An additional user input device is cursor control 1423, such as a mouse, trackball, trackpad, stylus, or cursor direction keys, coupled to bus 1411 for communicating direction information and command selections to processor 1412, and for controlling cursor movement on display 1421.
  • Another device that may be coupled to bus 1411 is hard copy device 1424, which may be used for marking information on a medium such as paper, film, or similar types of media. Another device that may be coupled to bus 1411 is a wired/wireless communication capability 1425 for communication with a phone or handheld palm device.
  • Note that any or all of the components of system 1400 and associated hardware may be used in the present invention. However, it can be appreciated that other configurations of the computer system may include some or all of the devices.
  • Whereas many alterations and modifications of the present invention will no doubt become apparent to a person of ordinary skill in the art after having read the foregoing description, it is to be understood that any particular embodiment shown and described by way of illustration is in no way intended to be considered limiting. Therefore, references to details of various embodiments are not intended to limit the scope of the claims which in themselves recite only those features regarded as essential to the invention.

Claims (39)

1. A method comprising:
receiving an input video; and
performing operations to reduce one or both of noise and flicker in the input video using spatial and temporal processing.
2. The method defined in claim 1 wherein performing the operations to reduce one or both of noise and flicker in the input video using spatial and temporal processing comprises applying a spatial transform and a temporal transform with adaptive thresholding of coefficients.
3. The method defined in claim 2 wherein applying the spatial transform and the temporal transform comprises applying at least one warped transform to a sub-frame to create transform coefficients.
4. The method defined in claim 3 wherein the at least one warped transform comprises a 2-D separable DCT or a 2-D Hadamard transform.
5. The method defined in claim 2 wherein the adaptive thresholding includes applying spatially adaptive thresholds.
6. The method defined in claim 2 further comprising computing an adaptive threshold, and wherein performing adaptive thresholding comprises thresholding transform coefficients with the adaptive threshold.
7. The method defined in claim 1 wherein performing the operations to reduce one or both of noise and flicker in the input video using spatial and temporal processing comprises:
transforming sub-frames of a current frame and a past frame using a spatial transform for each sub-frame;
thresholding spatial-transform coefficients for each transformed sub-frame with an adaptive threshold;
transforming thresholded spatial-transformed coefficients using a temporal transform;
thresholding temporal-transform coefficients for each transformed sub-frame with a threshold to create thresholded temporal-transformed coefficients;
inverse transforming the thresholded temporal-transform coefficients to form processed sub-frames in the pixel-domain; and
combining the processed sub-frames to create a new frame.
8. The method defined in claim 7 wherein the spatial transform is a warped transform.
9. The method defined in claim 7 wherein thresholding spatial-transform coefficients for each transformed sub-frame with an adaptive threshold comprises:
performing thresholding to coefficients generated from the subframe of the current frame using a first threshold;
performing thresholding to coefficients generated from the subframe of the past frame using a second threshold, the second threshold being computed independently of the first threshold.
10. The method defined in claim 7 further comprising computing one or more adaptive thresholds, and wherein thresholding transform coefficients for each transformed sub-frame with an adaptive threshold comprises thresholding transform coefficients for each transformed sub-frame with one of the one or more adaptive thresholds.
11. The method defined in claim 7 further comprising:
applying at least one forward transform to the new frame to convert data of the new frame into coefficients in the transform domain;
performing at least one data processing operation on the coefficients; and
applying at least one inverse transform to the coefficients after data processing.
12. The method defined in claim 11 wherein the at least one data processing operation includes one or more of a group consisting of unsharp masking and applying an enhancement factor to the coefficients.
13. The method defined in claim 1 wherein the operations comprise:
selecting a sub-frame at certain pixels from a current frame of input video and finding another sub-frame from a past frame of output video;
selecting a warped spatial transform and transforming the sub-frames into a spatial transform domain;
deriving an adaptive threshold and thresholding spatial-transform coefficients of the sub-frames from the current frame and the past frame;
applying a temporal transform to thresholded spatial-transform coefficients and thresholding a selected sub-set of the temporal-transform coefficients;
inverse transforming temporal-transform coefficients first temporally and then spatially to obtain a processed sub-frame; and
combining the processed sub-frame with previously processed sub-frames belonging to current frame to create a new frame of an output video.
14. The method defined in claim 13 wherein the warped spatial transform is pixel-adaptive and the adaptive threshold is detail-preserving.
15. The method defined in claim 13 wherein the sub-frame of the past frame is located based on satisfying a criterion.
16. The method defined in claim 15 wherein the criterion is based on one of a group consisting of the number of the pixel; a minimum value among all values of a p-norm between the selected sub-frame of the current frame and the selected sub-frame of the past frame; a minimum value among values, within a range limited by width of the past frame and horizontal and vertical offsets, of a p-norm between the selected sub-frame of the current frame and the selected sub-frame of the past frame; a minimum value among values, within a range limited by width of the past frame and randomly-chosen horizontal and vertical offsets, of a p-norm between the selected sub-frame of the current frame and the selected sub-frame of the past frame.
17. The method defined in claim 13 wherein deriving the adaptive threshold and thresholding spatial-transform coefficients of the sub-frames from the current frame and the past frame comprises using hard thresholding in which coefficients are set to zero if magnitude of transform coefficients is less than a threshold.
18. The method defined in claim 13 wherein deriving the adaptive threshold and thresholding spatial-transform coefficients of the sub-frames from the current frame and the past frame comprises using soft thresholding.
19. The method defined in claim 13 further comprising:
selecting an output video frame of the output video that best matches another frame from the input video; and
performing the operations using the output video frame as the past frame.
20. The method defined in claim 13 further comprising setting the sub-frames to be regular at every pixel.
21. The method defined in claim 13 further comprising selecting adaptively a transform for each sub-frame.
22. The method defined in claim 13 further comprising selecting a sub-frame adaptively at each pixel.
23. The method defined in claim 13 further comprising computing one or more adaptive thresholds, and wherein thresholding transform coefficients for each transformed sub-frame with an adaptive threshold comprises thresholding transform coefficients for each transformed sub-frame with one of the one or more adaptive thresholds.
24. The method defined in claim 23 further comprising adaptively selecting the transform for a sub-frame selected at each pixel.
25. The method defined in claim 13 further comprising sending a vector of operational parameters.
26. The method defined in claim 13 wherein applying the temporal transform to spatial-transform coefficients and thresholding a selected sub-set of the temporal-transform coefficients comprises:
forming a first matrix/vector from thresholded spatial-transform coefficients of the sub-frames from the current frame and the past frame, and
applying thresholding to a selected sub-set of coefficients in the first matrix/vector to create a second matrix/vector;
and further wherein inverse transforming temporal-transform coefficients first temporally and then spatially to obtain a processed sub-frame comprises
applying an inverse temporal transform to the second matrix/vector to generate a third matrix/vector, and
applying an inverse spatial transform to the third matrix/vector to produce the processed sub-frame.
27. The method defined in claim 1 wherein performing the operations to reduce one or both of noise and flicker in the input video using spatial and temporal processing comprises:
transforming sub-frames of a current frame and a past frame using a spatial transform for each sub-frame;
transforming spatial-transformed coefficients using a temporal transform;
thresholding temporal-transform coefficients for each transformed sub-frame with a threshold to create thresholded temporal-transformed coefficients;
inverse transforming the thresholded temporal-transform coefficients to form processed sub-frames in the pixel-domain; and
combining the processed sub-frames to create a new frame.
28. The method defined in claim 1 wherein the operations comprise:
forming a new frame from a current frame of input video and a past frame of output video;
processing sub-frames of the new frame and the current frame by:
generating first and second sub-frames using pixels from the current and new frames, respectively, using a vector formed from each pixel in the first and second sub-frame, respectively, based on a sub-frame type for each pixel;
selecting a warped spatial transform and transforming the first and second sub-frames into a spatial transform domain;
deriving an adaptive threshold and thresholding transform coefficients of the first sub-frame;
generating a matrix/vector using thresholded transform coefficients and coefficients generated from the second sub-frame;
inverse transforming coefficients in the matrix/vector to produce a processed sub-frame; and
combining the processed sub-frame with previously processed sub-frames belonging to current frame to create a new frame of an output video.
29. The method defined in claim 1 wherein the current frame and the past frame include channel information of the frames for only a subset of all channels of a multi-dimensional color representation.
30. The method defined in claim 1 wherein the operations comprise:
selecting a sub-frame at certain pixels from a current frame of input video;
selecting a warped spatial transform and transforming the sub-frame into a spatial transform domain;
deriving an adaptive threshold and thresholding spatial-transform coefficients of the sub-frame from the current frame;
inverse transforming spatial-transform coefficients to obtain a processed sub-frame; and
combining the processed sub-frame with previously processed sub-frames belonging to current frame to create a new frame of an output video.
31. An article of manufacture having one or more computer readable storage media storing instructions therein which, when executed by a system, cause the system to perform a method comprising:
receiving an input video; and
performing operations to reduce one or both of noise and flicker in the input video using spatial and temporal processing.
32. The article of manufacture defined in claim 31 wherein performing the operations to reduce one or both of noise and flicker in the input video using spatial and temporal processing comprises applying a spatial transform and a temporal transform with adaptive thresholding of coefficients.
33. The article of manufacture defined in claim 32 wherein applying the spatial transform and the temporal transform comprises applying at least one warped transform to a sub-frame to create transform coefficients.
34. The article of manufacture defined in claim 33 wherein the at least one warped transform comprises a 2-D separable DCT or a 2-D Hadamard transform.
35. The article of manufacture defined in claim 31 wherein performing the operations to reduce one or both of noise and flicker in the input video using spatial and temporal processing comprises:
transforming sub-frames of a current frame and a past frame using a spatial transform for each sub-frame;
thresholding spatial-transform coefficients for each transformed sub-frame with an adaptive threshold;
transforming thresholded spatial-transformed coefficients using a temporal transform;
thresholding temporal-transform coefficients for each transformed sub-frame with a threshold to create thresholded temporal-transformed coefficients;
inverse transforming the thresholded temporal-transform coefficients to form processed sub-frames in the pixel-domain; and
combining the processed sub-frames to create a new frame.
36. The article of manufacture defined in claim 35 wherein the spatial transform is a warped transform.
37. The article of manufacture defined in claim 35 wherein thresholding spatial-transform coefficients for each transformed sub-frame with an adaptive threshold comprises:
performing thresholding to coefficients generated from the subframe of the current frame using a first threshold;
performing thresholding to coefficients generated from the subframe of the past frame using a second threshold, the second threshold being computed independently of the first threshold.
38. The article of manufacture defined in claim 31 wherein the operations comprise:
selecting a sub-frame at certain pixels from a current frame of input video and finding another sub-frame from a past frame of output video;
selecting a warped spatial transform and transforming the sub-frames into a spatial transform domain;
deriving an adaptive threshold and thresholding spatial-transform coefficients of the sub-frames from the current frame and the past frame;
applying a temporal transform to the spatial-transform coefficients and thresholding a selected subset of the temporal-transform coefficients;
inverse transforming the temporal-transform coefficients first temporally and then spatially to obtain processed sub-frames belonging to both the current frame and the past frame; and
combining the processed sub-frames belonging to the current frame to create a new frame of an output video.
39. The article of manufacture defined in claim 38 wherein the warped spatial transform is pixel-adaptive and the adaptive threshold is detail-preserving.
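The combining step that closes claims 29, 30, and 38 can be pictured as weighted accumulation of overlapping processed sub-frames back into the frame. Uniform weights are assumed in this sketch, whereas an implementation might instead weight each sub-frame by the sparsity of its thresholded representation.

```python
import numpy as np

def combine_subframes(frame_shape, subframes, offsets):
    """Accumulate processed sub-frames into a frame, averaging where they overlap."""
    acc = np.zeros(frame_shape)
    wts = np.zeros(frame_shape)
    for patch, (y, x) in zip(subframes, offsets):
        h, w = patch.shape
        acc[y:y + h, x:x + w] += patch
        wts[y:y + h, x:x + w] += 1.0
    return acc / np.maximum(wts, 1.0)  # pixels no sub-frame covers stay zero
```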

Priority Applications (6)

Application Number Priority Date Filing Date Title
US12/233,468 US8731062B2 (en) 2008-02-05 2008-09-18 Noise and/or flicker reduction in video sequences using spatial and temporal processing
JP2010545258A JP5419897B2 (en) 2008-02-05 2009-02-02 Reduction of noise and / or flicker in video sequences using spatial and temporal processing
CN2009801039523A CN101933330B (en) 2008-02-05 2009-02-02 Noise and/or flicker reduction in video sequences using spatial and temporal processing
PCT/US2009/032888 WO2009100032A1 (en) 2008-02-05 2009-02-02 Noise and/or flicker reduction in video sequences using spatial and temporal processing
KR1020107017838A KR101291869B1 (en) 2008-02-05 2009-02-02 Noise and/or flicker reduction in video sequences using spatial and temporal processing
EP09708388.5A EP2243298B1 (en) 2008-02-05 2009-02-02 Noise and/or flicker reduction in video sequences using spatial and temporal processing

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US2645308P 2008-02-05 2008-02-05
US12/233,468 US8731062B2 (en) 2008-02-05 2008-09-18 Noise and/or flicker reduction in video sequences using spatial and temporal processing

Publications (2)

Publication Number Publication Date
US20090195697A1 (en) 2009-08-06
US8731062B2 (en) 2014-05-20

Family ID=40931208

Family Applications (2)

Application Number Title Priority Date Filing Date
US12/233,468 Active 2032-02-21 US8731062B2 (en) 2008-02-05 2008-09-18 Noise and/or flicker reduction in video sequences using spatial and temporal processing
US12/239,195 Active 2032-05-18 US8837579B2 (en) 2008-02-05 2008-09-26 Methods for fast and memory efficient implementation of transforms

Family Applications After (1)

Application Number Title Priority Date Filing Date
US12/239,195 Active 2032-05-18 US8837579B2 (en) 2008-02-05 2008-09-26 Methods for fast and memory efficient implementation of transforms

Country Status (6)

Country Link
US (2) US8731062B2 (en)
EP (2) EP2243298B1 (en)
JP (3) JP5419897B2 (en)
KR (2) KR101137753B1 (en)
CN (2) CN102378978B (en)
WO (2) WO2009100034A2 (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8311088B2 (en) * 2005-02-07 2012-11-13 Broadcom Corporation Method and system for image processing in a microprocessor for portable video communication devices
KR101682147B1 (en) 2010-04-05 2016-12-05 삼성전자주식회사 Method and apparatus for interpolation based on transform and inverse transform
EP2442567A1 (en) * 2010-10-14 2012-04-18 Morpho Inc. Image Processing Device, Image Processing Method and Image Processing Program
FR2978273B1 (en) * 2011-07-22 2013-08-09 Thales Sa METHOD OF REDUCING NOISE IN A SEQUENCE OF FLUOROSCOPIC IMAGES BY TEMPORAL AND SPATIAL FILTRATION
EA017302B1 (en) * 2011-10-07 2012-11-30 Закрытое Акционерное Общество "Импульс" Method of noise reduction of digital x-ray image series
US9357236B2 (en) * 2014-03-13 2016-05-31 Intel Corporation Color compression using a selective color transform
US9939253B2 (en) * 2014-05-22 2018-04-10 Brain Corporation Apparatus and methods for distance estimation using multiple image sensors
US10102613B2 (en) * 2014-09-25 2018-10-16 Google Llc Frequency-domain denoising
EP3557484B1 (en) * 2016-12-14 2021-11-17 Shanghai Cambricon Information Technology Co., Ltd Neural network convolution operation device and method
TWI748035B (en) * 2017-01-20 2021-12-01 日商半導體能源硏究所股份有限公司 Display system and electronic device
KR102444054B1 (en) 2017-09-14 2022-09-19 삼성전자주식회사 Image processing apparatus, method for processing image and computer-readable recording medium
EP4274207A1 (en) 2021-04-13 2023-11-08 Samsung Electronics Co., Ltd. Electronic apparatus and control method thereof
CN114020211B (en) * 2021-10-12 2024-03-15 深圳市广和通无线股份有限公司 Storage space management method, device, equipment and storage medium

Family Cites Families (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2637978B2 (en) * 1987-04-16 1997-08-06 日本ビクター株式会社 Motion adaptive image quality improvement device
JPH01201773A (en) * 1988-02-05 1989-08-14 Matsushita Electric Ind Co Ltd Digital signal processor
JP3302731B2 (en) 1992-06-02 2002-07-15 大日本印刷株式会社 Image enlargement method
JP3222273B2 (en) * 1993-07-09 2001-10-22 株式会社日立製作所 Image quality improvement method for moving images in nuclear magnetic resonance diagnostic apparatus
JPH08294001A (en) 1995-04-20 1996-11-05 Seiko Epson Corp Image processing method and image processing unit
JP3378167B2 (en) 1997-03-21 2003-02-17 シャープ株式会社 Image processing method
AUPQ156299A0 (en) * 1999-07-12 1999-08-05 Canon Kabushiki Kaisha Method and apparatus for discrete wavelet transforms and compressed bitstream ordering for block entropy coding of subband image data
KR100327385B1 (en) 2000-07-18 2002-03-13 Lg Electronics Inc Spatio-temporal three-dimensional noise filter
EP1209624A1 (en) * 2000-11-27 2002-05-29 Sony International (Europe) GmbH Method for compressed imaging artefact reduction
US6898323B2 (en) * 2001-02-15 2005-05-24 Ricoh Company, Ltd. Memory usage scheme for performing wavelet processing
JP3887178B2 (en) * 2001-04-09 2007-02-28 株式会社エヌ・ティ・ティ・ドコモ Signal encoding method and apparatus, and decoding method and apparatus
JP2003134352A (en) * 2001-10-26 2003-05-09 Konica Corp Image processing method and apparatus, and program therefor
US7120308B2 (en) * 2001-11-26 2006-10-10 Seiko Epson Corporation Iterated de-noising for image recovery
JP2005523615A (en) * 2002-04-19 2005-08-04 ドロップレット テクノロジー インコーポレイテッド Wavelet transform system, method, and computer program product
US7940844B2 (en) * 2002-06-18 2011-05-10 Qualcomm Incorporated Video encoding and decoding techniques
US7352909B2 (en) * 2003-06-02 2008-04-01 Seiko Epson Corporation Weighted overcomplete de-noising
US20050105817A1 (en) 2003-11-17 2005-05-19 Guleryuz Onur G. Inter and intra band prediction of singularity coefficients using estimates based on nonlinear approximants
KR100564592B1 (en) 2003-12-11 2006-03-28 삼성전자주식회사 Methods for noise removal of moving picture digital data
EP1800245B1 (en) * 2004-09-09 2012-01-04 Silicon Optix Inc. System and method for representing a general two dimensional spatial transformation
US8050331B2 (en) 2005-05-20 2011-11-01 Ntt Docomo, Inc. Method and apparatus for noise filtering in video coding
US20060288065A1 (en) * 2005-06-17 2006-12-21 Docomo Communications Laboratories Usa, Inc. Method and apparatus for lapped transform coding and decoding
JP4699117B2 (en) * 2005-07-11 2011-06-08 株式会社エヌ・ティ・ティ・ドコモ A signal encoding device, a signal decoding device, a signal encoding method, and a signal decoding method.
JP4743604B2 (en) * 2005-07-15 2011-08-10 株式会社リコー Image processing apparatus, image processing method, program, and information recording medium
US8135234B2 (en) 2006-01-31 2012-03-13 Thomson Licensing Method and apparatus for edge-based spatio-temporal filtering
JP4760552B2 (en) * 2006-06-06 2011-08-31 ソニー株式会社 Motion vector decoding method and decoding apparatus
US20080007649A1 (en) * 2006-06-23 2008-01-10 Broadcom Corporation, A California Corporation Adaptive video processing using sub-frame metadata
CN100454972C (en) 2006-12-28 2009-01-21 上海广电(集团)有限公司中央研究院 3D noise reduction method for video image
US8731062B2 (en) * 2008-02-05 2014-05-20 Ntt Docomo, Inc. Noise and/or flicker reduction in video sequences using spatial and temporal processing

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4447886A (en) * 1981-07-31 1984-05-08 Meeker G William Triangle and pyramid signal transforms and apparatus
US4442454A (en) * 1982-11-15 1984-04-10 Eastman Kodak Company Image processing method using a block overlap transformation procedure
US5844611A (en) * 1989-08-23 1998-12-01 Fujitsu Limited Image coding system which limits number of variable length code words
US5666209A (en) * 1993-07-15 1997-09-09 Asahi Kogaku Kogyo Kabushiki Kaisha Image signal processing device
US6141054A (en) * 1994-07-12 2000-10-31 Sony Corporation Electronic image resolution enhancement by frequency-domain extrapolation
US5859788A (en) * 1997-08-15 1999-01-12 The Aerospace Corporation Modulated lapped transform method
US6438275B1 (en) * 1999-04-21 2002-08-20 Intel Corporation Method for motion compensated frame rate upsampling based on piecewise affine warping
US20070160304A1 (en) * 2001-07-31 2007-07-12 Kathrin Berkner Enhancement of compressed images
US7284026B2 (en) * 2002-07-02 2007-10-16 Canon Kabushiki Kaisha Hadamard transformation method and device
US20050030393A1 (en) * 2003-05-07 2005-02-10 Tull Damon L. Method and device for sensor level image distortion abatement
US20080246768A1 (en) * 2004-06-30 2008-10-09 Voxar Limited Imaging Volume Data
US20060050783A1 (en) * 2004-07-30 2006-03-09 Le Dinh Chon T Apparatus and method for adaptive 3D artifact reducing for encoded image signal
US7554611B2 (en) * 2005-04-19 2009-06-30 Samsung Electronics Co., Ltd. Method and apparatus of bidirectional temporal noise reduction
US20070074251A1 (en) * 2005-09-27 2007-03-29 Oguz Seyfullah H Method and apparatus for using random field models to improve picture and video compression and frame rate up conversion
US20070299897A1 (en) * 2006-06-26 2007-12-27 Yuriy Reznik Reduction of errors during computation of inverse discrete cosine transform
US20090046995A1 (en) * 2007-08-13 2009-02-19 Sandeep Kanumuri Image/video quality enhancement and super-resolution using sparse transformations
US20090060368A1 (en) * 2007-08-27 2009-03-05 David Drezner Method and System for an Adaptive HVS Filter

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Warped Discrete Cosine Transform and Its Application in Image Compression", Cho et al, IEEE transactions on Circuits and Systems for Video Technology, Vol. 10, No. 8, December 2000. *
GUPTA N. et al.: "Wavelet domain-based video noise reduction using temporal discrete cosine transform and hierarchically adapted thresholding", lET Image Processing, vol. 1, no. 1,6 March 2007 (2007-03-06), pages 2-12, XP006028283 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8305497B2 (en) * 2007-07-27 2012-11-06 Lsi Corporation Joint mosquito and aliasing noise reduction in video signals
US20090027548A1 (en) * 2007-07-27 2009-01-29 Lsi Corporation Joint mosquito and aliasing noise reduction in video signals
US8837579B2 (en) 2008-02-05 2014-09-16 Ntt Docomo, Inc. Methods for fast and memory efficient implementation of transforms
US8731062B2 (en) * 2008-02-05 2014-05-20 Ntt Docomo, Inc. Noise and/or flicker reduction in video sequences using spatial and temporal processing
US8744199B2 (en) * 2009-04-23 2014-06-03 Morpho, Inc. Image processing device, image processing method and storage medium
US20110081085A1 (en) * 2009-04-23 2011-04-07 Morpho, Inc. Image processing device, image processing method and storage medium
US20140204996A1 (en) * 2013-01-24 2014-07-24 Microsoft Corporation Adaptive noise reduction engine for streaming video
US9924200B2 (en) * 2013-01-24 2018-03-20 Microsoft Technology Licensing, Llc Adaptive noise reduction engine for streaming video
US10542291B2 (en) 2013-01-24 2020-01-21 Microsoft Technology Licensing, Llc Adaptive noise reduction engine for streaming video
US20180191939A1 (en) * 2016-05-27 2018-07-05 Boe Technology Group Co., Ltd. Methods and devices for correcting video flicker
US10230902B2 (en) * 2016-05-27 2019-03-12 Boe Technology Group Co., Ltd. Methods and devices for correcting video flicker
US10346960B2 (en) * 2017-01-25 2019-07-09 Boe Technology Group Co., Ltd. Apparatus and method for correcting a flicker in a video, and video device
US20190007705A1 (en) * 2017-06-29 2019-01-03 Qualcomm Incorporated Memory reduction for non-separable transforms
US11134272B2 (en) * 2017-06-29 2021-09-28 Qualcomm Incorporated Memory reduction for non-separable transforms

Also Published As

Publication number Publication date
CN102378978A (en) 2012-03-14
KR20100112162A (en) 2010-10-18
CN101933330B (en) 2013-03-13
WO2009100034A2 (en) 2009-08-13
EP2243298B1 (en) 2021-10-06
JP2014112414A (en) 2014-06-19
EP2240869A2 (en) 2010-10-20
KR101137753B1 (en) 2012-04-24
WO2009100032A1 (en) 2009-08-13
EP2240869B1 (en) 2019-08-07
JP5419897B2 (en) 2014-02-19
JP2011527033A (en) 2011-10-20
JP5517954B2 (en) 2014-06-11
KR101291869B1 (en) 2013-07-31
US8837579B2 (en) 2014-09-16
EP2243298A1 (en) 2010-10-27
US20090195535A1 (en) 2009-08-06
US8731062B2 (en) 2014-05-20
CN102378978B (en) 2015-10-21
JP2011512086A (en) 2011-04-14
JP5734475B2 (en) 2015-06-17
WO2009100034A3 (en) 2012-11-01
CN101933330A (en) 2010-12-29
KR20100114068A (en) 2010-10-22

Similar Documents

Publication Publication Date Title
US8731062B2 (en) Noise and/or flicker reduction in video sequences using spatial and temporal processing
US8743963B2 (en) Image/video quality enhancement and super-resolution using sparse transformations
Heckel et al. Deep decoder: Concise image representations from untrained non-convolutional networks
JP4920599B2 (en) Nonlinear In-Loop Denoising Filter for Quantization Noise Reduction in Hybrid Video Compression
CN101959008B (en) Method and apparatus for image and video processing
US7738739B2 (en) Method and apparatus for adjusting the resolution of a digital image
EP2300982B1 (en) Image/video quality enhancement and super-resolution using sparse transformations
US7085436B2 (en) Image enhancement and data loss recovery using wavelet transforms
CN110717868B (en) Video high dynamic range inverse tone mapping model construction and mapping method and device
US9189838B2 (en) Method and apparatus for image processing
CN111612695B (en) Super-resolution reconstruction method for low-resolution face image
JP7472403B2 (en) Adaptive local reshaping for SDR to HDR upconversion - Patents.com
CN112270646B (en) Super-resolution enhancement method based on residual dense jump network
Jin et al. Full RGB just noticeable difference (JND) modelling
Bochinski et al. Regularized gradient descent training of steered mixture of experts for sparse image representation
US20060153467A1 (en) Enhancement of digital images
Zhang et al. Compression noise estimation and reduction via patch clustering
Ramsook et al. A differentiable VMAF proxy as a loss function for video noise reduction
Hui et al. Rate-Adaptive Neural Network for Image Compressive Sensing
Pang et al. Region-Adaptive Video Sharpening Via Rate-Perception Optimization
Venkataramanan et al. Joint Quality Assessment and Example-Guided Image Processing by Disentangling Picture Appearance from Content
Sasikumar An Efficient Video Denoising using Patch-based Method, Optical Flow Estimation and Multiresolution Bilateral Filter
Ibnelhaj et al. A spatiotemporal neuronal filter for channel equalization and video restoration

Legal Events

Date Code Title Description
AS Assignment

Owner name: NTT DOCOMO, INC., JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:DOCOMO COMMUNICATIONS LABORATORIES USA, INC.;REEL/FRAME:021920/0652

Effective date: 20081015

Owner name: DOCOMO COMMUNICATIONS LABORATORIES USA, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KANUMURI, SANDEEP;GULERYUZ, ONUR G;CIVANLAR, M. REHA;AND OTHERS;REEL/FRAME:021920/0353;SIGNING DATES FROM 20080918 TO 20081008

AS Assignment

Owner name: DOCOMO COMMUNICATIONS LABORATORIES USA, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KANUMURI, SANDEEP;GULERYUZ, ONUR G.;CIVANLAR, M. REHA;REEL/FRAME:022334/0030;SIGNING DATES FROM 20080918 TO 20081008

Owner name: NTT DOCOMO, INC., JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:FUJIBAYASHI, AKIRA;BOON, CHOONG S.;REEL/FRAME:022334/0033

Effective date: 20081006

Owner name: NTT DOCOMO, INC., JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:DOCOMO COMMUNICATIONS LABORATORIES USA, INC.;REEL/FRAME:022334/0066

Effective date: 20081009

AS Assignment

Owner name: NTT DOCOMO, INC., JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:DOCOMO COMMUNICATIONS LABORATORIES USA, INC.;REEL/FRAME:024413/0161

Effective date: 20100429

STCF Information on status: patent grant

Free format text: PATENTED CASE

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551)

Year of fee payment: 4

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 8