
Publication number: US 6072494 A
Publication type: Grant
Application number: US 08/951,070
Publication date: 6 Jun 2000
Filing date: 15 Oct 1997
Priority date: 15 Oct 1997
Fee status: Paid
Also published as: US 6256033, WO 1999019788 A1
Inventor: Katerina H. Nguyen
Original Assignee: Electric Planet, Inc.
Method and apparatus for real-time gesture recognition
US 6072494 A
Abstract
A system and method are disclosed for providing a gesture recognition system for recognizing gestures made by a moving subject within an image and performing an operation based on the semantic meaning of the gesture. A subject, such as a human being, enters the viewing field of a camera connected to a computer and performs a gesture, such as flapping of the arms. The gesture is then examined by the system one image frame at a time. Positional data is derived from the input frames and compared to data representing gestures already known to the system. The comparisons are done in real-time and the system can be trained to better recognize known gestures or to recognize new gestures. A frame of the input image containing the subject is obtained after a background image model has been created. An input frame is used to derive a frame data set that contains particular coordinates of the subject at a given moment in time. This series of frame data sets is examined to determine whether it conveys a gesture that is known to the system. If the subject gesture is recognizable to the system, an operation based on the semantic meaning of the gesture can be performed by a computer.
Claims(58)
What is claimed is:
1. A computer-implemented method of storing and recognizing gestures made by a moving subject within an image, the method including:
a) building a background model by obtaining at least one frame of an image;
b) obtaining a data frame containing a subject performing part of a gesture;
c) analyzing the data frame thereby determining particular coordinates of the subject at a particular time while the subject is performing the gesture;
d) adding the particular coordinates to a frame data set;
e) examining the particular coordinates such that the particular coordinates are compared to positional data making up a plurality of recognizable gestures, wherein a recognizable gesture is made up of at least one dimension such that the positional data describes dimensions of the recognized gesture;
f) repeating b through e for a plurality of data frames; and
g) determining whether the plurality of the data frames when examined in a particular sequence, conveys a subject gesture by the subject that resembles a recognizable gesture, thereby causing an operation based on a predetermined meaning of the recognizable gesture be performed by a computer.
2. A method as recited in claim 1 wherein building a background model further includes determining whether there is significant activity in the background image thereby restarting the process for building the background model.
3. A method as recited in claim 1 wherein obtaining a data frame further includes separating the subject in the data frame into a plurality of identifiable parts wherein an identifiable part is assigned particular coordinates.
4. A method as recited in claim 1 wherein a dimension represents a movement space and has at least one key point wherein a key point represents a significant direction change.
5. A method as recited in claim 4 wherein analyzing the data frame further includes:
determining a certainty score at a key point in a dimension wherein the certainty score represents a probability that another key point in the dimension has been reached; and
using the certainty score at each key point to determine whether a subject gesture matches a recognizable gesture.
6. A method as recited in claim 1 wherein determining whether the plurality of the data frames conveys a subject gesture further includes comparing the frame data set to positional data corresponding to a dimensional pattern for a recognizable gesture.
7. A method as recited in claim 1 further including:
obtaining a next data frame thereby determining whether the subject gesture has reached a next key point; and
updating a status report containing data on key points reached in a dimension.
8. A method as recited in claim 7 further including checking the status report to determine if the subject gesture is a partial completion of a recognizable gesture by comparing a previous data frame to the positional data for a recognizable gesture and determining how many key points have been reached.
9. A method as recited in claim 1 further including:
determining whether the particular coordinates in the frame data set match the positional data for a potential gesture;
resetting a data array representative of the potential gesture and resetting a status report if the particular coordinates in the frame data set severely mismatch the positional data making up a plurality of recognizable gestures;
discarding data in the data array representative of the potential gesture if the particular coordinates in the frame data set mismatch the positional data to a degree lesser than a severe mismatch; and
signaling if the particular coordinates in the frame data set match positional data for a recognizable gesture thereby indicating that requirements for a recognizable gesture have been met.
10. A method as recited in claim 9 further including discarding the data in the data array representative of the potential gesture if a predetermined amount of time has passed.
11. A method as recited in claim 1 wherein the step of examining the particular coordinates further includes extracting data from the frame data set based on characteristics of the recognizable gesture being checked.
12. A method as recited in claim 1 wherein the recognizable gesture that matches the subject gesture first is the recognizable gesture that causes an operation to be performed in a computer.
13. A method as recited in claim 1 wherein adding the particular coordinates to a frame data set further includes storing the frame data set in a plurality of arrays wherein an array corresponds to one dimension for each recognizable gesture.
14. A method as recited in claim 1 further including:
storing a plurality of samples of a subject gesture;
inputting a number of key points that fit in the subject gesture and a time value representing the time for the subject gesture to complete;
inputting a number of dimensions of the subject gesture;
determining locations of key points in a model representative of the subject gesture; and
calculating a probability distribution for key points indicating the likelihood that a certain output will be observed.
15. A method as recited in claim 14 further including refining the model such that the plurality of samples of the subject gesture fit within the model.
16. A method as recited in claim 14 further including calculating a confusion matrix wherein the subject gesture is compared with previously stored recognizable gestures so that similarities between the new gesture to previously stored recognizable gestures can be determined.
17. A method as recited in claim 1 further including pre-processing the data frame such that the subject is visually displayed on a computer display monitor.
18. A method as recited in claim 17 wherein the subject is composited onto a destination image such that the background image is subtracted from the data frame thereby isolating the subject to be composited.
19. A computer readable medium including program instructions implementing the process of claim 1.
20. A computer-implemented system for storing and recognizing gestures made by a moving subject within an image, the system comprising:
an image modeller for creating a background model by examining a plurality of frames of an input image that does not contain a subject;
a frame capturer for obtaining a data frame containing the subject performing part of a subject gesture;
a frame analyzer for analyzing the data frame thereby determining relevant coordinates of the subject at a particular time while the subject is performing the subject gesture;
a data set creator for creating a frame data set by collecting the relevant coordinates;
a data set analyzer for examining the particular coordinates in the frame data set such that the particular coordinates are compared to positional data making up a plurality of recognizable gestures, wherein each recognizable gesture is made up of at least one dimension such that the positional data describes dimensions of the recognized gesture; and
a gesture recognizer for determining whether a plurality of the data frames, wherein a data frame is represented by a frame data set, when examined in a particular sequence, conveys a gesture by the subject that resembles a recognizable gesture, thereby causing an operation based on a predetermined meaning of the recognizable gesture be performed by a computer.
21. A system as recited in claim 20 wherein the image modeller further comprises an image initializer for initializing the input image that does not contain the subject.
22. A system as recited in claim 20 wherein the frame capturer further comprises a frame separator for categorizing the subject represented in the data frame into a plurality of identifiable parts wherein an identifiable part is assigned particular coordinates.
23. A system as recited in claim 20 wherein a dimension represents a movement space and has at least one key point wherein a key point represents a significant direction change.
24. A system as recited in claim 23 wherein the frame analyzer further comprises:
a probability evaluator for determining a certainty score at a key point in a dimension wherein the certainty score represents a probability that a sequence of outputs observed belongs to a gesture model; and
a gesture recognizer for determining whether a subject gesture matches a recognizable gesture by using the certainty score at each key point.
25. A system as recited in claim 20 wherein the gesture recognizer further comprises a data comparator for comparing the frame data set to positional data corresponding to a dimension of a recognizable gesture.
26. A system as recited in claim 20 further comprising a status updater for updating a status report containing data on key points reached in a dimension after obtaining a next data frame thereby determining whether the subject gesture has reached a next key point.
27. A system as recited in claim 26 further comprising a status checker for checking the status report to determine if the subject gesture is a partial completion of a recognizable gesture by comparing a previous data frame to the positional data for a recognizable gesture and determining how many key points have been reached.
28. A system as recited in claim 20 further comprising:
a position comparator for determining whether the particular coordinates in the frame data set match the positional data for a potential gesture;
a data resetter for resetting a data array representative of the potential gesture and resetting a status report if the particular coordinates in the frame data set severely mismatch the positional data making up a plurality of recognizable gestures;
a data discarder for discarding data in the data array representative of the potential gesture if the particular coordinates in the frame data set mismatch the positional data to a degree lesser than a severe mismatch; and
a match indicator for signaling if the particular coordinates in the frame data set match positional data for a recognizable gesture thereby indicating that requirements for a recognizable gesture have been met.
29. A system as recited in claim 28 wherein the data discarder further comprises a timer for discarding the data in the data array representative of the potential gesture if a predetermined amount of time has passed.
30. A system as recited in claim 20 wherein the data set analyzer further comprises a data extractor for extracting data from the frame data set based on characteristics of the recognizable gesture being checked.
31. A system as recited in claim 20 wherein the recognizable gesture that matches the subject gesture first is the recognizable gesture that causes an operation to be performed in a computer.
32. A system as recited in claim 20 wherein the data set creator further comprises a data set allocator for storing the frame data set in a plurality of arrays wherein an array corresponds to one dimension for a recognizable gesture.
33. A system as recited in claim 20 further comprising:
a sample receiver for storing a plurality of samples of a subject gesture;
a gesture data intaker for accepting a plurality of key points that fits in the subject gesture, a time value representing the time for the subject gesture to complete and a plurality of dimensions of the subject gesture;
a key point locator for determining locations of key points in a model representative of the subject gesture; and
a probability evaluator for calculating a probability distribution at the key points indicating the likelihood of observing a particular output.
34. A system as recited in claim 33 further including refining the model such that the plurality of samples of the subject gesture fit within the model.
35. A system as recited in claim 33 further comprising a gesture confusion evaluator for calculating a confusion matrix wherein the subject gesture is compared with previously stored recognizable gestures so that similarities between the subject gesture to previously stored recognizable gestures can be determined.
36. A system as recited in claim 20 further comprising a data frame processor for pre-processing the data frame such that the subject is visually displayed on a computer display monitor.
37. A system as recited in claim 36 further comprising a subject compositor for compositing the subject onto a destination image such that the background image is subtracted from the data frame thereby isolating the subject to be composited.
38. A computer-implemented system for storing and recognizing gestures made by a moving subject within an image, the system comprising:
means for building a background model by obtaining at least one frame of an image;
means for obtaining a data frame containing a subject performing a part of a subject gesture;
means for analyzing the data frame thereby determining particular coordinates of the subject at a particular time while the subject is performing the subject gesture;
means for adding the particular coordinates to a frame data set;
means for examining the particular coordinates such that the particular coordinates are compared to positional data making up a plurality of recognizable gestures, wherein a recognizable gesture is made up of at least one dimension such that the positional data describes dimensions of the recognized gesture; and
means for determining whether a plurality of data frames, where a data frame is represented by the frame data set, when examined in a particular sequence, conveys a subject gesture by the subject that resembles a recognizable gesture, thereby causing an operation based on a predetermined meaning of the recognizable gesture be performed by a computer.
39. A system as recited in claim 38 wherein means for building a background model further includes means for determining whether there is significant activity in the background image thereby restarting the process for building the background model.
40. A system as recited in claim 38 wherein means for obtaining a data frame further includes means for separating the subject in the data frame into a plurality of identifiable parts wherein an identifiable part is assigned particular coordinates.
41. A system as recited in claim 38 wherein a dimension represents a movement space and has at least one key point wherein a key point represents a significant direction change.
42. A system as recited in claim 38 wherein means for analyzing the data frame further includes:
means for determining a certainty score at a key point in a dimension wherein the certainty score represents a probability that a sequence of previous data frames fit a gesture model; and
means for using the certainty score at each key point to determine whether a subject gesture matches a recognizable gesture.
43. A system as recited in claim 38 wherein means for determining whether the plurality of the data frames conveys a subject gesture further includes means for comparing the frame data set to positional data corresponding to a dimension for a recognizable gesture.
44. A system as recited in claim 38 further including:
means for obtaining a next data frame thereby determining whether the subject gesture has reached a next key point; and
means for updating a status report containing data on key points reached in a dimension.
45. A system as recited in claim 44 further including means for checking the status report to determine if a subject gesture is a partial completion of a recognizable gesture by comparing a previous data frame to the positional data for a recognizable gesture and determining how many key points have been reached.
46. A system as recited in claim 38 further including:
means for determining whether the particular coordinates in the frame data set match the positional data for a potential gesture;
means for resetting a data array representative of the potential gesture and resetting a status report if the particular coordinates in the frame data set severely mismatch the positional data making up a plurality of recognizable gestures;
means for discarding data in the data array representative of the potential gesture if the particular coordinates in the frame data set mismatch the positional data to a degree lesser than a severe mismatch; and
means for signaling if the particular coordinates in the frame data set match positional data for a recognizable gesture thereby indicating that requirements for a recognizable gesture have been met.
47. A system as recited in claim 46 further including means for discarding the data in the data array representative of the potential gesture if a predetermined amount of time has passed.
48. A system as recited in claim 38 wherein means for examining the particular coordinates further includes means for extracting data from the frame data set based on characteristics of the recognizable gesture being checked.
49. A system as recited in claim 38 wherein the recognizable gesture that matches the subject gesture first is the recognizable gesture that causes an operation to be performed in a computer.
50. A system as recited in claim 38 wherein means for adding the particular coordinates to a frame data set further includes means for storing the frame data set in a plurality of arrays wherein an array corresponds to one dimension for each recognizable gesture.
51. A system as recited in claim 38 further including:
means for storing a plurality of samples of a subject gesture;
means for inputting a number of key points that fit in the gesture and a time value representing the time for the subject gesture to complete;
means for inputting a number of dimensions of the subject gesture;
means for determining locations of key points in a model representative of the subject gesture; and
means for calculating a probability distribution for key points indicating the likelihood of observing a particular output.
52. A system as recited in claim 51 further including means for refining the model such that the plurality of samples of the subject gesture fit within the model.
53. A system as recited in claim 51 further including means for calculating a confusion matrix wherein the subject gesture is compared with previously stored recognizable gestures so that similarities between the subject gesture to previously stored recognizable gestures can be determined.
54. A system as recited in claim 38 further including means for preprocessing the data frame such that the subject is visually displayed on a computer display monitor.
55. A system as recited in claim 54 wherein the subject is composited onto a destination image such that the background image is subtracted from the data frame thereby isolating the subject to be composited.
56. A computer-implemented method of storing and recognizing gestures made by a moving subject within an image, the method including:
a) building a background model by obtaining at least one frame of an image including determining whether there is significant activity in the background image thereby restarting the process for building the background model;
b) obtaining a data frame containing a subject performing part of a gesture including separating the subject in the data frame into a plurality of identifiable parts wherein an identifiable part is assigned particular coordinates;
c) analyzing the data frame thereby determining particular coordinates of the subject at a particular time while the subject is performing the gesture;
d) adding the particular coordinates to a frame data set;
e) examining the particular coordinates such that the particular coordinates are compared to positional data making up a plurality of recognizable gestures, wherein a recognizable gesture is made up of at least one dimension such that the positional data describes dimensions of the recognized gesture;
f) repeating b through e for a plurality of data frames;
g) determining whether the plurality of the data frames when examined in a particular sequence, conveys a subject gesture by the subject that resembles a recognizable gesture, thereby causing an operation based on a predetermined meaning of the recognizable gesture be performed by a computer;
h) storing a plurality of samples of a subject gesture;
i) inputting a number of key points that fit in the subject gesture and a time value representing the time for the subject gesture to complete;
j) inputting a number of dimensions of the subject gesture;
k) determining locations of key points in a model representative of the subject gesture;
l) calculating a probability distribution for key points indicating the likelihood that a certain output will be observed; and
m) refining the model such that the plurality of samples of the subject gesture fit within the model.
57. A computer-implemented system for storing and recognizing gestures made by a moving subject within an image, the system comprising:
an image modeller for creating a background model by examining a plurality of frames of an input image that does not contain a subject comprising an image initializer for initializing the input image that does not contain the subject;
a frame capturer for obtaining a data frame containing the subject performing part of a subject gesture comprising a frame separator for categorizing the subject represented in the data frame into a plurality of identifiable parts wherein an identifiable part is assigned particular coordinates;
a frame analyzer for analyzing the data frame thereby determining relevant coordinates of the subject at a particular time while the subject is performing the subject gesture;
a data set creator for creating a frame data set by collecting the relevant coordinates;
a data set analyzer for examining the particular coordinates in the frame data set such that the particular coordinates are compared to positional data making up a plurality of recognizable gestures, wherein each recognizable gesture is made up of at least one dimension such that the positional data describes dimensions of the recognized gesture;
a gesture recognizer for determining whether a plurality of the data frames, wherein a data frame is represented by a frame data set, when examined in a particular sequence, conveys a gesture by the subject that resembles a recognizable gesture, thereby causing an operation based on a predetermined meaning of the recognizable gesture be performed by a computer;
a sample receiver for storing a plurality of samples of a subject gesture;
a gesture data intaker for accepting a plurality of key points that fits in the subject gesture, a time value representing the time for the subject gesture to complete and a plurality of dimensions of the subject gesture;
a key point locator for determining locations of key points in a model representative of the subject gesture;
a probability evaluator for calculating a probability distribution at the key points indicating the likelihood of observing a particular output; and
a model refiner for refining the model such that the plurality of samples of the subject gesture fit within the model.
58. A computer-implemented system for storing and recognizing gestures made by a moving subject within an image, the system comprising:
means for building a background model by obtaining at least one frame of an image including means for determining whether there is significant activity in the background image thereby restarting the process for building the background model;
means for obtaining a data frame containing a subject performing a part of a subject gesture including means for separating the subject in the data frame into a plurality of identifiable parts wherein an identifiable part is assigned particular coordinates;
means for analyzing the data frame thereby determining particular coordinates of the subject at a particular time while the subject is performing the subject gesture;
means for adding the particular coordinates to a frame data set;
means for examining the particular coordinates such that the particular coordinates are compared to positional data making up a plurality of recognizable gestures, wherein a recognizable gesture is made up of at least one dimension such that the positional data describes dimensions of the recognized gesture;
means for determining whether a plurality of data frames, where a data frame is represented by the frame data set, when examined in a particular sequence, conveys a subject gesture by the subject that resembles a recognizable gesture, thereby causing an operation based on a predetermined meaning of the recognizable gesture be performed by a computer;
means for storing a plurality of samples of a subject gesture;
means for inputting a number of key points that fit in the gesture and a time value representing the time for the subject gesture to complete;
means for inputting a number of dimensions of the subject gesture;
means for determining locations of key points in a model representative of the subject gesture;
means for calculating a probability distribution for key points indicating the likelihood of observing a particular output; and
means for refining the model such that the plurality of samples of the subject gesture fit within the model.
Description
CROSS REFERENCE TO RELATED APPLICATION

This application is related to co-pending U.S. patent application Ser. Nos. 08/951,089 and 08/950,404, both filed Oct. 15, 1997 and filed herewith, which are incorporated herein by reference for all purposes.

BACKGROUND OF THE INVENTION

1. Background

The present invention relates generally to methods and apparatus for computer-implemented real-time gesture recognition. More particularly, the present invention relates to capturing a sequence of images of a moving subject performing a particular movement or gesture, extracting relevant data points from those images, and comparing the resulting sequence of data points to patterns of data points for known gestures to determine whether there is a match.

2. Prior Art

An emerging and increasingly important procedure in the field of computer science is gesture recognition. In order to make gesture recognition systems commercially useful and widespread, they must recognize known gestures in real time and must do so with minimal or reduced use of the CPU. From a process perspective, a gesture is defined as a time-dependent trajectory following a prescribed pattern through a feature space, e.g., a bodily movement or handwriting. Prior art methods for gesture recognition typically use neural networks or Hidden Markov Models (HMMs), with HMMs being the most prevalent choice.

A Hidden Markov Model is a model made up of interconnected nodes or states. Each state contains information concerning itself and its relation to other states in the model. More specifically, each state contains (1) the probability of producing a particular observable output and (2) the probabilities of going from that state to any other state in the model. Since only the output is observed, a system based on HMMs does not know which state it is in at any given time; it only knows the probabilities that a particular model produced the outputs seen thus far. Knowledge of the state is hidden from the system or application.
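
The patent contains no source code, so the following is a minimal, hedged sketch of the HMM structure described above, written in Python with discrete observation symbols. The names HMMState and forward_probability, and the toy two-state model, are illustrative assumptions rather than anything from the patent; the sketch only shows why the state sequence stays "hidden": the model can score how likely it is to have produced the observed outputs, but it never reports which state it is actually in.

class HMMState:
    """One node in the model: its output probabilities and its transition probabilities."""
    def __init__(self, emission, transitions):
        self.emission = emission        # P(observable output | this state)
        self.transitions = transitions  # P(next state | this state), one entry per state

def forward_probability(states, initial, observations):
    """Probability that this model produced the observed output sequence.

    The caller learns only this likelihood; the underlying state sequence
    remains hidden, which is the limitation discussed in the text."""
    alpha = [initial[i] * states[i].emission.get(observations[0], 0.0)
             for i in range(len(states))]
    for obs in observations[1:]:
        alpha = [sum(alpha[i] * states[i].transitions[j] for i in range(len(states)))
                 * states[j].emission.get(obs, 0.0)
                 for j in range(len(states))]
    return sum(alpha)

# Toy example: a two-state, left-to-right style model scoring a short output sequence.
states = [
    HMMState(emission={"arms_up": 0.8, "arms_down": 0.2}, transitions=[0.6, 0.4]),
    HMMState(emission={"arms_up": 0.3, "arms_down": 0.7}, transitions=[0.0, 1.0]),
]
print(forward_probability(states, initial=[1.0, 0.0],
                          observations=["arms_up", "arms_up", "arms_down"]))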

Examples of gesture recognition systems based on Hidden Markov Models include a tennis stroke recognition system, an American Sign Language recognition system, a system for recognizing lip movements, and systems for recognizing handwriting. The statistical nature of HMMs can capture the variance in the way different people perform gestures at different times. However, the same statistical nature makes an HMM a "black box." For example, one state in the model may represent one particular point in a bodily gesture. An HMM-based application may know many things about this point, such as the probabilities that the gesturer will change position or move in other directions. However, the application will not be able to determine precisely when it has reached that point. Thus, the application is not able to determine whether the person has completed 25% or 50% of a known gesture.

Therefore, it would be desirable to have a real-time gesture recognition system that removes the "hidden" layer found in current systems that use Hidden Markov Models, while still capturing the variance in the way different people perform a gesture at different times. In addition, it would be desirable to have a system that allows more control over the training and recognition of gestures.

SUMMARY OF THE INVENTION

The present invention provides a system for recognizing gestures made by a subject within a sequence of images and performing an operation based on the semantic meaning of the gesture. In a preferred embodiment, a subject, such as a human being, enters the viewing field of a camera connected to a computer and performs a gesture. The gesture is then examined by the system one image frame at a time. Positional data is derived from the input frame and compared to previously derived data representing gestures known to the system. The comparisons are done in real time and the system can be trained to better recognize known gestures or to recognize new gestures.

In a preferred embodiment, a computer-implemented gesture recognition system is described. A background image model is created by examining frames of an average background image before the subject that will perform the gesture enters the image. A frame of the input image containing the subject, such as a human being, is obtained after the background image model has been created. The frame captures the person in the act of performing the gesture at one moment in time. The input frame is used to derive a frame data set that contains particular coordinates of the subject at that given moment. This sequence of frame data sets, taken over a period of time, is compared to sequences of positional data making up one or more recognizable gestures, i.e., gestures already known to the system. If the gesture performed by the subject is recognizable to the system, an operation based on the semantic meaning of the gesture may be performed by the system.

In another embodiment, the gesture recognition procedure includes a routine that sets its confidence level according to the degree of mismatch between the input gesture data and the patterns of positional data making up the system's recognizable gestures. If the confidence passes a threshold, a match is considered found.

In yet another preferred embodiment the gesture recognition procedure includes a partial completion query routine that updates a status report which provides information on how many of the requirements of the known gestures have been met by the input gesture. This allows queries of how much or what percentage of a known gesture is completed by probing the status report. This is done by determining how many key points of a recognizable gesture have been met.

In yet another embodiment the gesture recognition procedure includes a routine for training the system to recognize new gestures or to recognize certain gestures performed by an individual more efficiently. Several samples of the subject, i.e., individual, performing the new gesture are used by the system to extract the number of key points, the dimensions, and other relevant characteristics of the gesture. A probability distribution for each key point indicating the likelihood of producing a particular observable output at that key point is also derived. Once a characteristic data pattern is obtained for the new gesture, it can be compared to patterns of previously stored known gestures to produce a confusion matrix. The confusion matrix describes possible similarities between the new gesture and known gestures as well as the likelihood that the system will confuse these similar gestures.
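
As a rough illustration of the training procedure summarized above, the sketch below (Python, hypothetical names throughout) reduces several recorded samples of a new gesture to key points taken at significant direction changes, fits a simple mean-and-spread distribution at each key point, and computes a crude pairwise similarity of the kind that could populate one cell of a confusion matrix. The patent does not prescribe these particular formulas; treat the thresholds and the similarity measure as assumptions.

import statistics

def key_points(samples_1d):
    """Indices where the averaged one-dimensional trajectory changes direction."""
    mean_traj = [statistics.mean(vals) for vals in zip(*samples_1d)]
    points = []
    for i in range(1, len(mean_traj) - 1):
        if (mean_traj[i] - mean_traj[i - 1]) * (mean_traj[i + 1] - mean_traj[i]) < 0:
            points.append(i)
    return points

def fit_key_point_distributions(samples_1d, points):
    """Mean and spread of the samples at each key point (likelihood of an output there)."""
    return [(statistics.mean(s[i] for s in samples_1d),
             statistics.pstdev(s[i] for s in samples_1d) or 1e-6)
            for i in points]

def confusion_score(model_a, model_b):
    """Crude similarity between two key-point models; higher means more confusable."""
    n = min(len(model_a), len(model_b))
    if n == 0:
        return 0.0
    return 1.0 / (1.0 + sum(abs(model_a[i][0] - model_b[i][0]) for i in range(n)) / n)

# Three samples of a one-dimensional "arm height" trajectory for a new gesture.
samples = [[0, 3, 6, 4, 1], [0, 4, 7, 3, 0], [1, 3, 6, 4, 1]]
points = key_points(samples)
model = fit_key_point_distributions(samples, points)
print(points, model, confusion_score(model, model))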

In yet another embodiment the gesture recognition procedure visually displays the subject performing the gesture and any resulting transformations or augmentations to the subject on a computer monitor through model-based compositing. Such a compositing method includes shadow reduction and hole and gap filling routines for isolating the subject being composited.

In another aspect of the present invention, a computer-based system for extracting data to be used to recognize gestures made by a subject is described. In a preferred embodiment, an image modeller for creating a background model that does not contain the subject is used to create an initial background model. The system includes a frame capturer for obtaining an image frame and a frame analyzer for analyzing the image, thereby determining particular coordinates of the subject at a particular time. Also described are a data set creator for creating a frame data set from the particular coordinates and a data set analyzer for examining the coordinates in the frame data set and comparing them to positional data representing a known gesture.

Advantages of the methods and systems described and claimed include real-time recognition of gestures made by subjects within a dynamic background image. Gestures are recognized and processed immediately in a computer system that can also be trained to recognize new gestures or to recognize certain known gestures more efficiently. In addition, the subject is composited onto a destination image without distorting effects from shadows cast by the subject or from color uniformity between the subject and the background. This provides for a clean, well-defined composited subject on a display monitor which can be processed by the computer system according to the semantic meaning of the recognized or known gesture.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention, together with further advantages thereof, may best be understood by reference to the following description taken in conjunction with the accompanying drawings in which:

FIG. 1 is a schematic illustration of a general purpose computer system suitable for implementing the present invention.

FIG. 2 is a diagram of a preferred embodiment of the present invention showing a person with arms extended and with the image composited onto a computer monitor through the use of a camera.

FIG. 3 shows a series of screen shots showing a human figure performing a gesture, an arm flap, and the resulting function performed by the system of transforming the human figure to an image of a flying bird.

FIG. 4 shows another series of screen shots showing a human figure performing another recognizable gesture, jumping, and the system augmenting the human figure once the gesture is recognized.

FIG. 5a is a flowchart showing a process for a preferred embodiment for gesture recognition of the present invention.

FIG. 5b shows data stored in a frame data set as derived from a data or image frame containing a subject performing a gesture as described in block 502 of FIG. 5a.

FIG. 6 is a flowchart showing in greater detail block 504 of FIG. 5a in which the system runs the gesture recognition process.

FIGS. 7A and 7B are flowcharts showing in greater detail block 600 of FIG. 6 in which the system processes the frame data to determine whether it matches a recognized gesture.

FIGS. 8A and 8B are flowcharts showing a process for training the system to recognize a new gesture.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Reference will now be made in detail to a preferred embodiment of the invention. An example of the preferred embodiment is illustrated in the accompanying drawings. While the invention will be described in conjunction with a preferred embodiment, it will be understood that it is not intended to limit the invention to one preferred embodiment. To the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims.

The present invention employs various processes involving data stored in computer systems. These processes are those requiring physical manipulation of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It is sometimes convenient, principally for reasons of common usage, to refer to these signals as bits, values, elements, variables, characters, data structures, or the like. It should be remembered, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Further, the manipulations performed are often referred to in terms such as identifying, running, comparing, or detecting. In any of the operations described herein that form part of the present invention, these operations are machine operations. Useful machines for performing the operations of the present invention include general purpose digital computers or other similar devices. In all cases, it should be borne in mind the distinction between the method of operations in operating a computer and the method of computation itself. The present invention relates to method blocks for operating a computer in processing electrical or other physical signals to generate other desired physical signals.

The present invention also relates to a computer system for performing these operations. This computer system may be specially constructed for the required purposes, or it may be a general purpose computer selectively activated or reconfigured by a computer program stored in the computer. The processes presented herein are not inherently related to any particular computer or other computing apparatus. In particular, various general purpose computing machines may be used with programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized computer apparatus to perform the required method blocks.

FIG. 1 is a schematic illustration of a general purpose computer system suitable for implementing the process of the present invention. The computer system includes a central processing unit (CPU) 102, which CPU is coupled bi-directionally with random access memory (RAM) 104 and unidirectionally with read only memory (ROM) 106. Typically RAM 104 includes programming instructions and data, including text objects as described herein in addition to other data and instructions for processes currently operating on CPU 102. ROM 106 typically includes basic operating instructions, data and objects used by the computer to perform its functions. In addition, a mass storage device 108, such as a hard disk, CD ROM, magneto-optical (floptical) drive, tape drive or the like, is coupled bi-directionally with CPU 102. Mass storage device 108 generally includes additional programming instructions, data and text objects that typically are not in active use by the CPU, although the address space may be accessed by the CPU, e.g., for virtual memory or the like. Each of the above described computers further includes an input/output source 110 that typically includes input media such as a keyboard, pointer devices (e.g., a mouse or stylus) and the like. Each computer can also include a network connection 112 over which data, including, e.g., text objects, and instructions can be transferred. Additional mass storage devices (not shown) may also be connected to CPU 102 through network connection 112. It will be appreciated by those skilled in the art that the above described hardware and software elements are of standard design and construction.

As discussed above, Hidden Markov Models are typically used in current gesture recognition systems to account for variance in possible movements in a gesture. The present invention uses the HMM construct but removes the hidden nature of the model by allowing the application to determine which state in the model it is in. The present invention also forces the application to move in a certain direction by removing all the connections from a particular state to the other states except for one. For example, at state one in a Hidden Markov Model, an application may be able to go to states two, three, or four. State one would have the probabilities that, from it, the gesture would go to any one of those states. In a preferred embodiment of the present invention, the connections to states three and four are removed, thus forcing the application or system to go to state two or to stay in state one. It should be noted that the HMM construct also allows for this case, which is generally known as the left-to-right HMM. However, in an HMM implementation, state one will have two probabilities: one indicating the probability that it will stay in state one and another that it will go to state two. In the present invention, there are no transition probabilities. The application will stay in state one until it meets the criteria, such as reaching a local extremum, for moving to state two. Also included in a preferred embodiment of the present invention is a timing constraint built into the application. This timing constraint applies to individual states in the model. For example, a state may have a timing constraint such that the person cannot stay in a particular pose or position in the gesture for more than a predetermined length of time. Furthermore, by removing the hidden layer in the HMM, the system can determine at any time how much of a particular gesture has been completed since the system knows what state the gesture is in.
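
A compact sketch of the construct just described follows, assuming Python and invented names (GestureState, LeftToRightRecognizer): there are no transition probabilities, the recognizer simply stays in its current state until an explicit criterion such as a local extremum is met, each state can carry its own timing constraint, and the current state is always visible, so the application can tell at any moment how much of the gesture is complete.

import time

class GestureState:
    def __init__(self, name, advance_if, max_seconds=None):
        self.name = name
        self.advance_if = advance_if    # criterion for moving on, e.g. a local-extremum test
        self.max_seconds = max_seconds  # optional per-state timing constraint

class LeftToRightRecognizer:
    def __init__(self, states):
        self.states = states
        self.reset()

    def reset(self):
        self.index = 0
        self.entered_at = time.monotonic()

    def feed(self, coordinate):
        state = self.states[self.index]
        if state.max_seconds is not None and \
           time.monotonic() - self.entered_at > state.max_seconds:
            self.reset()                # pose held too long: start over
            return "reset"
        if state.advance_if(coordinate):
            self.index += 1
            self.entered_at = time.monotonic()
            if self.index == len(self.states):
                self.reset()
                return "recognized"
        return f"in state {self.index}"  # never hidden from the application

# Toy one-dimensional "arms high, then arms low" gesture.
recognizer = LeftToRightRecognizer([
    GestureState("arms high", advance_if=lambda y: y > 0.8, max_seconds=2.0),
    GestureState("arms low", advance_if=lambda y: y < 0.2, max_seconds=2.0),
])
for y in [0.5, 0.9, 0.6, 0.1]:
    print(recognizer.feed(y))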

In another preferred embodiment of the gesture recognition system of the present invention, a training interface is included which requires a small degree of human intervention. A person can "teach" the system new gestures for it to recognize by performing samples of the new gesture in front of a camera. The user can then enter certain information about the new gesture allowing the system to create a model of the new gesture to store in its library. FIG. 2 is a diagram of a preferred embodiment of the present invention showing a person with arms extended and having the image composited on a computer monitor through the use of a camera. It shows a computer 206 connected to a camera 200. In other preferred embodiments, the camera can be located further away from the computer. Camera 200 has within its range or field of vision, a person 202 with her arms extended, as if in the middle of an arm flap gesture. In a preferred embodiment, the image of person 202 performing the gesture is composited onto a destination image 208 which is displayed on a computer monitor as shown in FIG. 2. Assuming one of the system's recognizable gestures is arm flapping, once the system recognizes that the person is performing this gesture it will perform an operation associated with that gesture. Examples of this are shown in FIGS. 3 and 4 below. In other preferred embodiments, the person's image does not need to be composited onto a destination image or displayed on the computer monitor. The system can simply recognize the gesture and perform an operation, without having to composite the image of the person. In a preferred embodiment, although the person may be located in a room with background items that are static, such as furniture, or non-static, such as a television screen or open window showing moving objects, such items are not composited onto a destination image; only the human figure is composited.

FIG. 3 shows a series of screen shots showing a human figure performing a gesture--in this case an arm flap--and the resulting function performed by the system, i.e. transforming the human figure to other images of a flying bird. In other preferred embodiments, the human figure can perform other types of gestures and be transformed to another figure or be augmented, as shown in FIG. 4 below. At 300 of FIG. 3, the person is initially flapping her arms up and down at a rate acceptable to the system. This rate can vary in various embodiments but is generally dependent on factors such as camera frame speed or CPU clock speed. At shots 302 and 304, the person is moving her arms up and down in full range and is performing the complete gesture of arm flapping. Once this is done and the system recognizes the gesture, the system transforms the person to a bird as shown at shot 306. Transforming the human figure to a bird is one example of a function or operation the computer can perform once it recognizes the arm flapping gesture. More generally, once recognized the computer can perform any type of function that the computer was programmed to perform upon recognition, such as, changing applications or turning the computer on or off. Performing the recognized gesture is essentially the same as pressing a key on the keyboard or clicking a button on a mouse.

FIG. 4 shows another example of a preferred embodiment where the human figure performing a recognizable gesture--in this case jumping up and down--is augmented with a new hat by the system once it recognizes the gesture. In this example, the figure or subject is not transformed as in FIG. 3, but rather is augmented (i.e., a less significant change to the figure) by having an object, the hat, added to it. At shot 400 the figure is standing still. At shots 402 and 404 the figure is shown jumping straight up and down at an acceptable rate to the system as described above. Once this gesture is recognized by the system, the computer performs the function of augmenting the figure by placing a hat on the figure's head as shown at 406. As described above, this system can perform any type of function that it could normally perform from a user pressing a key or clicking a mouse, once it recognizes the gesture. This gesture recognition and training process is described in greater detail with respect to FIGS. 5 through 9.

FIG. 5a is a flow diagram showing a process for a preferred embodiment of object gesture recognition of the present invention. At 500, the system creates or digitally builds a background model by capturing several frames of a background image. The background image is essentially the setting the system is being used in, for example, a child's playroom, an office, or a living room. It is the setting in which the subject, e.g. a person, will enter and, possibly, perform a gesture. A preferred embodiment of creating a background model is described in an application titled "Method and Apparatus for Model-Based Compositing" by inventor Subutai Ahmad, assigned to Electric Planet, Inc., filed on Oct. 15, 1997 Ser. No. 08/951,089.

Once the background model is created in block 500, in a preferred embodiment, the system preprocesses an image frame within which the subject is performing a particular gesture in block 502. In a preferred embodiment, this preprocessing involves compositing the object onto a destination image and displaying the destination image on a computer monitor, as described with respect to FIG. 2 above. The compositing process can involve sub-processes for reducing the effect of shadows and filling holes and gaps in the object once composited. The destination image can be an image very different from the background image, such as an outdoor scene, outer space, or other type of imaginary scene. This gives the effect of the person performing a gesture, and being augmented or transformed, in an unusual environment or setting. A preferred embodiment of the compositing process is described in detail in co-pending application titled "Method and Apparatus for Model-Based Compositing" by inventor Subutai Ahmad, assigned to Electric Planet, Inc., filed on Oct. 15, 1997 Ser. No. 08/951,089.

At 504 the system analyzes the person's gesture by performing a gesture recognition process using as data a sequence of image frames captured in block 502. A preferred embodiment of the gesture recognition process is described in greater detail with respect to FIG. 6. The gesture recognition process is performed using a gesture database as shown in block 506. Gesture database 506 contains data arrays representing gestures known to the system and other information such as status reports, described in greater detail below. The gesture recognition process deconstructs and analyzes the gesture or gestures being made by the person. At 508 the system determines whether the gesture performed by the person is actually a recognized or known gesture. The system has a set of recognizable gestures to which the gesture being performed by the person is compared. The data representing the recognizable gestures is stored in data arrays, described in greater detail with respect to FIG. 6 below. If the gesture performed by the person is a recognizable gesture, the system proceeds to block 510. At 510 the system performs a particular function or operation based on the semantic meaning of the recognized gesture. As described above this meaning can translate to transforming the person to another figure, like a bird, or augmenting the person, for example, by adding a hat. Once the system recognizes a gesture and performs an operation based on the gesture, the system returns to block 502 and continues analyzing image frames of the person performing further gestures. That is, even though the person has performed a gesture recognizable to the system and the system has carried out an operation based on the gesture, the processing continues as long as the image frames are being sent to the system. The system will continue processing movements by the person to see if they match any of its recognizable gestures. However, if the gesture performed by the person is not recognized by the system, control also returns to block 502 where the system captures and preprocesses the next frame of the person continuing performance of a gesture (i.e. the person's continuing movements in front of the camera).
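
The skeleton below restates the flow of FIG. 5a in Python under simplifying assumptions. The capture, background-modelling, and frame-analysis routines are passed in as callables because the patent describes them only at the block level; the recognizer interface (a feed method returning "recognized") follows the earlier sketch and is likewise an assumption, not the patent's actual API.

def recognition_loop(capture_frame, build_background_model, derive_frame_data_set,
                     gesture_database, num_background_frames=30):
    """Skeleton of FIG. 5a: blocks 500, 502, 504, 508 and 510."""
    background = build_background_model(
        [capture_frame() for _ in range(num_background_frames)])   # block 500
    while True:                                                     # continues as long as frames arrive
        frame = capture_frame()                                     # block 502: capture and preprocess
        data_set = derive_frame_data_set(frame, background)
        for gesture in gesture_database:                            # block 504: per-gesture check
            if gesture.recognizer.feed(data_set) == "recognized":   # block 508: known gesture?
                gesture.perform_operation()                         # block 510: semantic meaning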

FIG. 5b shows data stored in a frame data set as derived from an image frame containing the person performing a gesture as described in block 502 of FIG. 5a. In a preferred embodiment, the frame data set shown in FIG. 5b contains x and y coordinate values of certain portions of a person performing a gesture. For example, these portions can include: a left extremity, a right extremity, a center of mass, width, top of head, and center of head. In this example, the left and right extremities can be the end of a person's right and left arms and the width can be the person's shoulder span. In other preferred embodiments, the coordinates can be of other significant or relevant portions depending on the subject performing the movements and the type of movement. The frame data set contains information on the positions (via x and y coordinates) of significant or meaningful portions of the subject's "body". What is significant or meaningful can depend on the nature and range of gestures expected to be performed by the object or that are recognized or known to the system. For example, the left and right extremities of a person are significant because one of the recognizable gestures is flapping of the arms which is determined by the movement of the ends of the person's arms. In a preferred embodiment, each image or data frame captured has a corresponding frame data set. The sequence of frame data sets is analyzed by the gesture recognition process as shown in block 504 of FIG. 5a and described in greater detail in FIGS. 6 and 7. As will be described in greater detail below, information from the frame data set is extracted in various combinations and can also be scaled as needed by the system. For example, with an arm flapping gesture the system would extract width coordinates, coordinates of right and left extremities, and center of mass coordinates, and possibly others. Essentially, the frame data set indicates the location of significant parts of the moving subject at a given moment in time.
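
A small data-structure sketch of the frame data set described above is shown below in Python. The field names follow the example portions listed in the text (extremities, center of mass, width, top and center of head); the dataclass itself and the extract helper are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class FrameDataSet:
    left_extremity: tuple    # (x, y) of the end of the left arm
    right_extremity: tuple   # (x, y) of the end of the right arm
    center_of_mass: tuple
    width: float             # e.g. shoulder span
    top_of_head: tuple
    center_of_head: tuple

    def extract(self, *fields):
        """Pull only the coordinates a particular gesture needs (compare block 702)."""
        return {f: getattr(self, f) for f in fields}

# The subset an arm-flapping check might use, scaled or normalized as needed.
frame = FrameDataSet((20.0, 90.0), (180.0, 92.0), (100.0, 120.0),
                     80.0, (100.0, 40.0), (100.0, 55.0))
print(frame.extract("left_extremity", "right_extremity", "width", "center_of_mass"))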

FIG. 6 is a flow diagram showing in greater detail block 504 of FIG. 5a. In step 600 the system processes the frame data for a known gesture (gesture #1). This process is repeated for each known gesture contained in the gesture database shown in FIG. 5 as item 506. Once the frame data has been compared to gesture data as shown in blocks 600 through 604 (known gesture #N), the system then determines whether the gesture made by the moving subject meets any of the completion requirements for the known gestures in the system in block 606. If the moving subject's gesture does not meet the requirements for any of the known gestures, control returns to block 502 of FIG. 5 in which the system preprocesses a new frame of the moving subject. If the moving subject's gesture meets the requirements of any of the known gestures, the system then performs an operation based on the semantic meaning of the recognized gesture. For example, if the gesture by the moving object is recognized to be a flapping gesture, the system can then transform the human figure on the monitor into a bird or other objects. The transformation to an image of a bird would be an example of a semantic meaning of the arm flapping gesture.

FIG. 7 is a flowchart showing in greater detail block 600 of FIG. 6 in which the system processes the frame data to determine whether it matches the completion point of a known gesture. At 700 the system begins processing a frame data set representative of a captured image frame. An example of a frame data set is shown in FIG. 5b. As described above, the frame data set contains coordinates of various significant positions of the moving subject. The frame data set contains information on the moving subject at one particular point in time. As will be described below, the system continues capturing image frames and, thus, deriving frame data sets, as long as there is movement by the subject within view of the camera.

At 702 the system will extract from the frame data set the positional coordinates it needs in order to perform a proper comparison with each of the gestures known or recognizable to the system. For example, a known gesture, such as squatting, may only have two relevant or necessary coordinates that need to be checked, such as top of head and center of mass. Other coordinates do not need to be checked in order to determine whether a person is performing a squatting movement. Thus, in block 702 the system extracts relevant coordinates from the frame data set (in some cases it may be all the available coordinates) for comparison to known gestures.

At 704 the system compares the extracted positional coordinates from the frame data set to the positional coordinates of a particular point of the characteristic pattern of each known gesture. Each of the known gestures in the system is made up of one or more dimensions. For example, the flapping gesture may have four dimensions: normalized x and y for the right arm and normalized x' and y' for the left arm. A jump may have only two dimensions: one for the normalized top of the head and another for the normalized center of mass. Each dimension traces out a characteristic pattern of positional coordinates representing the expected movements of the gesture in a particular space over time. The extracted positional coordinates from the frame data set are compared to a particular point along each of these dimensional patterns for each gesture.

Each dimensional pattern has a number of key points, also referred to as states. A key point can be a characteristic pose for a particular gesture. For example, in an arm flapping gesture, a key point can be when the arms are at the highest or lowest positions. In the case of a jump, a key point may be when the object reaches the highest point. Thus, a key point can be a point where the object has a significant change in direction. Each dimension is typically made up of a few key points and flexible zones which are the areas between the key points. At 706 the system determines whether a new state has been reached. In the course of comparing the positional data to the dimensional patterns, the system determines whether the input (potential) gesture has reached a key point for any of the known gestures. Thus, if a person bends her knees to a certain point, the system may interpret that as a key point for the jump gesture or possibly a squatting or sitting gesture. Another example is a person moving her arms up to a certain point and then moving them down. The point at which the person begins moving her arms down can be interpreted by the system as a key point for the arm flap gesture. At 708 the system will make this determination. If a new state has been reached for any of the gestures, the system updates a status report to reflect this event at 710. This informs the system that the person has performed at least a part of one known gesture.
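
Since a key point is typically a significant change of direction, one hypothetical way to picture the "new state reached" test of blocks 706 and 708 is as a peak or valley detector over the recent history of one dimension; this is a simplification for illustration, not the patent's method:

    # Hypothetical sketch of blocks 706-708: detect a key point as a significant
    # change of direction (e.g. arms reaching their highest point, then descending).
    def reached_key_point(history, min_swing=0.05):
        if len(history) < 3:
            return False
        a, b, c = history[-3], history[-2], history[-1]
        peaked = b - a > min_swing and b - c > min_swing
        bottomed = a - b > min_swing and c - b > min_swing
        return peaked or bottomed

    print(reached_key_point([0.2, 0.9, 0.3]))   # True: a key point of the arm flap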

This information can be used for a partial completion query to determine whether a person's movement is likely to be a known gesture. For example, a system can inquire or automatically be informed when an input gesture has met three-quarters or two-thirds of a known gesture. This can be determined by probing the status report to see how many states of a known gesture have been reached. The system can then begin preparing for the completion of the known event. Essentially, the system can get a head start in performing the operation associated with the known gesture.
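
The partial completion query can be pictured as a simple fraction over the status report; the dictionary layout and the counts below are assumptions for illustration only:

    # Hypothetical sketch of a partial completion query: how many states (key
    # points) of each known gesture has the subject reached so far?
    status_report = {"arm_flap": 3, "jump": 1}   # states reached (illustrative)
    total_states = {"arm_flap": 4, "jump": 2}

    def fraction_complete(gesture_name):
        return status_report[gesture_name] / total_states[gesture_name]

    if fraction_complete("arm_flap") >= 0.75:
        # The system can get a head start on the operation tied to this gesture.
        print("arm flap is three-quarters complete")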

At 712 the system checks whether there is a severe mismatch between data from the frame data set and the allowable positional coordinates for each dimensional pattern of each known gesture. A severe mismatch would result, for example, from coordinates indicating a change in direction that clearly shows that the gesture does not conform to a particular known gesture (e.g., an arm going up when the system would expect it to go down for a certain gesture). A severe mismatch would first be detected at one of a known gesture's key points. If there is a severe mismatch, the system resets the data array for the known gesture with which there was a mismatch at block 714. The system maintains data arrays for each gesture in which the system stores information regarding the "history" of the movements performed by the person and captured by the camera. This information is no longer needed if it is determined that it is highly unlikely that the movements by the person will match a particular known gesture. Once these data arrays are cleared so they can begin storing new information, the system also resets the status reports to reflect the mismatch at block 716. By clearing the status report regarding a particular gesture, the system will not provide misleading information when a partial completion query is made regarding that gesture. The status report will indicate, at the time there is a severe mismatch, that no part of the particular gesture has been completed. At 718 the system will continue obtaining and processing input image frames of the person performing movements in the range of the camera as shown generally in FIG. 5a.
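
Blocks 714 and 716 might be rendered as follows; the plain Python containers standing in for the data arrays and the status report are assumptions for illustration:

    # Hypothetical sketch of blocks 714-716: on a severe mismatch, clear the
    # gesture's movement history and zero its status report entry so a later
    # partial completion query is not misleading.
    def handle_severe_mismatch(gesture_name, history_arrays, status_report):
        history_arrays[gesture_name].clear()   # block 714: reset the data array
        status_report[gesture_name] = 0        # block 716: no part of the gesture done

    history_arrays = {"arm_flap": [0.2, 0.5, 0.9]}
    status_report = {"arm_flap": 2}
    handle_severe_mismatch("arm_flap", history_arrays, status_report)
    # The system then continues obtaining new input frames (block 718).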

Returning to block 708, if a new state has not been reached for any of the known gestures, the system continues with block 712 where it checks for any severe mismatches. If there are no severe mismatches, the system checks whether there is a match between the coordinates in the frame data set and any of the known gestures in block 720. Once again, this is done by comparing the positional coordinates from the frame data to the coordinates of a particular point along the characteristic pattern of each dimension of each of the known gestures. If there is a less-than-severe mismatch, but a mismatch nonetheless, between the positional coordinates and a known gesture, the most recent data in the known gesture's data arrays is kept and older data is discarded at 722. This is also done if a timing constraint for a state has been violated. This can occur if a person holds a position in a gesture for too long. In a preferred embodiment, the subject's gesture should be continuous. New data is then stored in the array beginning where the most recent data was kept. The system then continues obtaining new image input frames as shown in block 718.
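
Block 722's "keep the most recent data, discard the older data" step could look like the following; the window size keep is a hypothetical parameter, not a figure from the patent:

    # Hypothetical sketch of block 722: on a mild mismatch or a violated timing
    # constraint, retain only the most recent history and continue from there.
    def trim_history(history, keep=5):
        del history[:-keep]          # drop the older samples in place
        return history

    history = [0.1, 0.2, 0.4, 0.6, 0.8, 0.9, 0.7]
    trim_history(history, keep=3)    # history is now [0.8, 0.9, 0.7]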

If the system determines that the movements performed by the person match a known gesture, a recognition flag for that gesture is set at 724. A match is found when the sequence of positional coordinates from consecutive frame data sets matches each of the patterns of positional coordinates of each dimension for a known gesture. Once a match is found, the system can perform an operation associated with the known and recognized gesture, such as transforming the person to another image or augmenting the person, as shown on a computer monitor. However, the system will also continue obtaining input image frames as long as the person is moving within the range of the camera. Thus, control returns to block 718.

In a preferred embodiment of the present invention, it is possible for the user to enter new gestures into the system, thereby adding them to the system library of known or recognized gestures. One way of doing this is by training the system to recognize the new gesture. The training feature can also be used to show the system how a particular person performs one of the already known gestures, such as the arm flap. For example, a particular person may not raise her arms as high as someone with longer arms. By showing the system how a particular person performs a gesture, the system will be more likely to recognize that gesture done by that person, and to recognize it sooner and with a greater confidence level. This is a useful procedure for frequent users or for users who perform one particular gesture frequently.

FIGS. 8A and 8B are flowcharts showing a process for training the system to recognize a new gesture. At 800 the system collects samples of the new gesture. One method of providing samples of the new gesture is for a person to enter the field of view of the camera and perform the gesture a certain number of times. This, naturally, requires some user intervention. In a preferred embodiment, the user or users perform the new gesture about 30 times. The number of users and the number of samples have a direct bearing on the accuracy of the model representing the new gesture and on the accuracy of the statistics of each key point (discussed in greater detail below). The more representative samples provided to the system, the more robust the recognition process will be.

At 802 the number of key points in the gesture is entered, as well as the time it takes to complete one full gesture, from start to finish. Essentially, in blocks 800 and 802, the system is provided with a sequence of key points and flexible zones. The number of key points will vary depending on the complexity of the new gesture. The key points determine what coordinates from the input frame data set should be extracted. For example, if the new gesture is a squatting movement, the motion of the hands or arms is irrelevant. At 804 the system determines what dimensions to use to measure the frame data set. For example, a squatting gesture may have two dimensions whereas a more complex gesture may have four or five dimensions. In block 806 the system determines the location of the key points in a model representing the new gesture based on the starting and ending times provided by the user. The system does this by finding the most prominent peaks and valleys for each dimension, and then aligning these extrema across all the dimensions of the new gesture.
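
Block 806's search for peaks and valleys in each dimension could be sketched as below; the naive neighbor comparison is an assumption made for illustration and omits the prominence test and the cross-dimension alignment:

    # Hypothetical sketch of block 806: locate candidate key points as local peaks
    # and valleys in one dimension of the sampled gesture trajectory.
    def extrema_indices(samples):
        indices = []
        for i in range(1, len(samples) - 1):
            prev, cur, nxt = samples[i - 1], samples[i], samples[i + 1]
            if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
                indices.append(i)
        return indices

    right_arm_y = [0.2, 0.5, 0.9, 0.6, 0.1, 0.5, 0.9]   # illustrative trajectory
    print(extrema_indices(right_arm_y))                  # [2, 4]: a peak and a valley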

At 808 the system calculates a probability distribution of each state or key point in the model. The system has a set of routines for calculating the statistics at the key points given the set of sample gestures. The statistics of interest include the mean and variance values for each dimension of the gesture and statistics regarding the timing with respect to the start of the gesture. Using these means and variances, the system sets the allowable upper and lower bounds for the key points, which are used during the recognition phase to accept or reject the incoming input frame data sets as a possible gesture match. The system will examine the samples and derive a probability for each key point. For example, if an incoming gesture reaches the third state of a four-state gesture, the probability that the incoming gesture will match the newly entered gesture may be 90%. On the other hand, if an incoming gesture meets the newly entered gesture's first state, there may only be a 10% probability that the incoming gesture will match the newly entered gesture. This is done for each key point in each dimension for the newly entered gesture.
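
The per-key-point statistics of block 808 amount to a mean and variance over the training samples, from which allowable upper and lower bounds can be derived; the two-standard-deviation band and the sample values below are illustrative choices, not figures taken from the patent:

    # Hypothetical sketch of block 808: statistics for one dimension's value at one
    # key point across the training samples (about 30 in a preferred embodiment).
    from statistics import mean, pstdev

    samples_at_key_point = [0.88, 0.91, 0.85, 0.93, 0.90]   # illustrative values
    m = mean(samples_at_key_point)
    s = pstdev(samples_at_key_point)
    lower, upper = m - 2 * s, m + 2 * s   # bounds used to accept or reject input frames

    def acceptable(value):
        return lower <= value <= upper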

At 810 the system refines the model representing the new gesture by trying out different threshold values based on a Gaussian distribution. At this stage a first version of the model has already been created. The system then runs the same data from the initial samples, together with some extraneous data that clearly falls outside the model, through the model. The system then determines how much of the first set of data can be recognized by the initial model. The thresholds of each state are initially set narrowly and are expanded until the model can recognize all the initial samples but not any of the extraneous data entered that should not fall within the model. The purpose of this is to ensure that the refined model is sufficiently broad to recognize all the samples of the gesture but not so broad as to accept arbitrary gestures (as represented by the extraneous data). Essentially, the system is determining what is an acceptable gesture and what is not.
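
Viewed abstractly, the refinement of block 810 widens a per-state threshold until every training sample is accepted and none of the extraneous samples is; the loop below is a hypothetical rendering of that idea around a single key point:

    # Hypothetical sketch of block 810: start with a narrow threshold around a key
    # point's expected value and widen it until the model accepts all training
    # samples while still rejecting every extraneous (out-of-model) sample.
    def refine_threshold(training, extraneous, center, start=0.01, step=0.01, limit=1.0):
        threshold = start
        while threshold < limit:
            if (all(abs(v - center) <= threshold for v in training)
                    and not any(abs(v - center) <= threshold for v in extraneous)):
                return threshold
            threshold += step
        return None   # no single threshold separates the two sets

    print(refine_threshold(training=[0.88, 0.91, 0.90],
                           extraneous=[0.40, 1.60], center=0.90))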

At 812 the system checks whether there are any more new gestures to be entered into the system by examining frames of the subject's movements. If the system does not detect any additional movements by the subject, it proceeds to block 814.

At 814 the system updates a gesture confusion matrix. The matrix has an entry for each gesture known to the system. The system checks the newly trained gesture against the existing gestures in the library for confusability. If the newly trained gesture is highly confusable with one or more existing gestures, it should be retrained using more features or different features. In a preferred embodiment the matrix would be made up of rows and columns in which the columns represent the known gestures and the rows represent or contain data on each of the gestures. A cell in which the row of data for a gesture, for example a jump, intersects with the jump column should contain the highest confusability indicator. In another example, a cell in which the jump column intersects with the row for arm flap data should contain a low confusability factor or indicator. Once the confusion matrix has been set for the newly entered gesture, the system continues monitoring for additional movements by the subject starting with block 502 of FIG. 5a.

Although the foregoing invention has been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes and modifications may be practiced within the scope of the appended claims. For example, the image of the person performing the gesture does not need to be composited onto a destination image and then displayed on the computer monitor. The system can, for example, simply recognize the gesture and perform a particular function based on the semantic meaning of the gesture. In another example, the system can obtain data frames from another medium, such as a video or film created at an earlier time, instead of obtaining the data frames from a live figure whose movements are captured by a camera in real-time. In yet another example, the frame data set can contain coordinates of sections of a moving subject other than coordinates specifically for a human body. Furthermore, it should be noted that there are alternative ways of implementing both the process and apparatus of the present invention. Accordingly, the present embodiments are to be considered as illustrative and not restrictive, and the invention is not to be limited to the details given herein, but may be modified within the scope and equivalents of the appended claims.

Patent Citations
Cited PatentFiling datePublication dateApplicantTitle
US4641349 *20 Feb 19853 Feb 1987Leonard FlomIris recognition system
US4843568 *11 Apr 198627 Jun 1989Krueger Myron WReal time perception of and response to the actions of an unencumbered participant/user
US5148477 *24 Aug 199015 Sep 1992Board Of Regents Of The University Of OklahomaMethod and apparatus for detecting and quantifying motion of a body part
US5454043 *30 Jul 199326 Sep 1995Mitsubishi Electric Research Laboratories, Inc.Dynamic and static hand gesture recognition through low-level image analysis
US5548659 *23 Sep 199420 Aug 1996Kabushiki Kaisha ToshibaMethod and apparatus for detecting changes in dynamic images
US5570113 *29 Jun 199429 Oct 1996International Business Machines CorporationComputer based pen system and method for automatically cancelling unwanted gestures and preventing anomalous signals as inputs to such system
US5577179 *23 Jul 199219 Nov 1996Imageware Software, Inc.Method for combining images in three dimensions
US5581276 *8 Sep 19933 Dec 1996Kabushiki Kaisha Toshiba3D human interface apparatus using motion recognition based on dynamic image processing
US5594469 *21 Feb 199514 Jan 1997Mitsubishi Electric Information Technology Center America Inc.Hand gesture machine control system
Non-Patent Citations
Reference
1 A Model-Based Complex Background Gesture Recognition Sys. Author: Chung-Lin Huang; Ming-Shan Wu 5484042 Inspec Abstract #: B9703-6140C-083, C9703-5260B-040, IEEE, 1996.
2 Chung-Lin Huang and Ming-Shan Wu, A Model-based Complex Background Gesture Recognition System, IEEE, 1996, vol. 1, pp. 93-98, abstract.
Referenced by
Citing PatentFiling datePublication dateApplicantTitle
US6265993 *1 Oct 199824 Jul 2001Lucent Technologies, Inc.Furlable keyboard
US6351222 *30 Oct 199826 Feb 2002Ati International SrlMethod and apparatus for receiving an input by an entertainment device
US6353764 *25 Nov 19985 Mar 2002Matsushita Electric Industrial Co., Ltd.Control method
US6692259 *11 Dec 200217 Feb 2004Electric PlanetMethod and apparatus for providing interactive karaoke entertainment
US6771277 *5 Oct 20013 Aug 2004Sony Computer Entertainment Inc.Image processor, image processing method, recording medium, computer program and semiconductor device
US679506821 Jul 200021 Sep 2004Sony Computer Entertainment Inc.Prop input device and method for mapping an object from a two-dimensional camera image to a three-dimensional space for controlling action in a game program
US680439628 Mar 200112 Oct 2004Honda Giken Kogyo Kabushiki KaishaGesture recognition system
US68947145 Dec 200017 May 2005Koninklijke Philips Electronics N.V.Method and apparatus for predicting events in video conferencing and other applications
US694757115 May 200020 Sep 2005Digimarc CorporationCell phones with optical capabilities, and related applications
US697188230 Dec 20036 Dec 2005Electric Planet, Inc.Method and apparatus for providing interactive karaoke entertainment
US7036094 *31 Mar 200025 Apr 2006Cybernet Systems CorporationBehavior recognition system
US705820426 Sep 20016 Jun 2006Gesturetek, Inc.Multiple camera control system
US70719141 Sep 20004 Jul 2006Sony Computer Entertainment Inc.User input device and method for interaction with graphic images
US7089185 *27 Jun 20028 Aug 2006Intel CorporationEmbedded multi-layer coupled hidden Markov model
US7095401 *31 Oct 200122 Aug 2006Siemens Corporate Research, Inc.System and method for gesture interface
US711319326 Aug 200426 Sep 2006Sony Computer Entertainment Inc.Method for color transition detection
US7129927 *13 Sep 200231 Oct 2006Hans Arvid MattsonGesture recognition system
US71650299 May 200216 Jan 2007Intel CorporationCoupled hidden Markov model for audiovisual speech recognition
US717104311 Oct 200230 Jan 2007Intel CorporationImage recognition using hidden markov models and coupled hidden markov models
US717403117 May 20056 Feb 2007Digimarc CorporationMethods for using wireless phones having optical capabilities
US7176945 *21 Jun 200413 Feb 2007Sony Computer Entertainment Inc.Image processor, image processing method, recording medium, computer program and semiconductor device
US720026627 Aug 20033 Apr 2007Princeton UniversityMethod and apparatus for automated video activity analysis
US72033686 Jan 200310 Apr 2007Intel CorporationEmbedded bayesian network for pattern recognition
US72098839 May 200224 Apr 2007Intel CorporationFactorial hidden markov model for audiovisual speech recognition
US722499510 Jan 200129 May 2007Digimarc CorporationData entry method and system
US722752623 Jul 20015 Jun 2007Gesturetek, Inc.Video-based image control system
US72616128 Nov 200028 Aug 2007Digimarc CorporationMethods and systems for read-aloud books
US727480023 Jan 200325 Sep 2007Intel CorporationDynamic gesture recognition from stereo sequences
US7284201 *20 Sep 200116 Oct 2007Koninklijke Philips Electronics N.V.User attention-based adaptation of quality level to improve the management of real-time multi-media content delivery and distribution
US7386150 *12 Nov 200410 Jun 2008Safeview, Inc.Active subject imaging with body identification
US742109319 Dec 20052 Sep 2008Gesturetek, Inc.Multiple camera control system
US747206319 Dec 200230 Dec 2008Intel CorporationAudio-visual feature fusion and support vector machine useful for continuous speech recognition
US7499569 *23 Feb 20053 Mar 2009Mitsubishi Fuso Truck And Bus CorporationHand pattern switching apparatus
US7515735 *11 Apr 20037 Apr 2009National Institute Of Information And Communication Technology Incorporated Administrative AgencyImage recognition system and image recognition program
US75459492 Nov 20049 Jun 2009Cognex Technology And Investment CorporationMethod for setting parameters of a vision detector using production line information
US755514231 Oct 200730 Jun 2009Gesturetek, Inc.Multiple camera control system
US758381920 May 20051 Sep 2009Kyprianos PapademetriouDigital signal processing methods, systems and computer program products that identify threshold positions and values
US75989428 Feb 20066 Oct 2009Oblong Industries, Inc.System and method for gesture based control system
US759952018 Nov 20056 Oct 2009Accenture Global Services GmbhDetection of multiple targets on a plane of interest
US762311516 Jan 200424 Nov 2009Sony Computer Entertainment Inc.Method and apparatus for light input device
US76271394 May 20061 Dec 2009Sony Computer Entertainment Inc.Computer image and audio processing of intensity and input devices for interfacing with a computer program
US7629959 *20 Sep 20058 Dec 2009Victor Company Of Japan, LimitedController for electronic appliance
US763644912 Nov 200422 Dec 2009Cognex Technology And Investment CorporationSystem and method for assigning analysis parameters to vision detector using a graphical interface
US763923328 Feb 200629 Dec 2009Sony Computer Entertainment Inc.Man-machine interface using a deformable device
US764637212 Dec 200512 Jan 2010Sony Computer Entertainment Inc.Methods and systems for enabling direction detection when interfacing with a computer program
US766368916 Jan 200416 Feb 2010Sony Computer Entertainment Inc.Method and apparatus for optimizing capture device settings through depth information
US772031512 Nov 200418 May 2010Cognex Technology And Investment CorporationSystem and method for displaying and using non-numeric graphic elements to control and monitor a vision system
US7725547 *6 Sep 200625 May 2010International Business Machines CorporationInforming a user of gestures made by others out of the user's line of sight
US7760182 *21 Aug 200620 Jul 2010Subutai AhmadMethod for video enabled electronic commerce
US77602484 May 200620 Jul 2010Sony Computer Entertainment Inc.Selective sound source listening in conjunction with computer interactive processing
US7785201 *2 Sep 200531 Aug 2010Sega CorporationBackground image acquisition method, video game apparatus, background image acquisition program, and computer-readable medium containing computer program
US7788606 *14 Jun 200431 Aug 2010Sas Institute Inc.Computer-implemented system and method for defining graphics primitives
US779232812 Jan 20077 Sep 2010International Business Machines CorporationWarning a vehicle operator of unsafe operation behavior based on a 3D captured image stream
US780133212 Jan 200721 Sep 2010International Business Machines CorporationControlling a system based on user behavioral signals detected from a 3D captured image stream
US784003112 Jan 200723 Nov 2010International Business Machines CorporationTracking a range of body movement based on 3D captured image streams of a user
US7844921 *1 Jun 200730 Nov 2010Kabushiki Kaisha ToshibaInterface apparatus and interface method
US7848850 *15 Nov 20047 Dec 2010Japan Science And Technology AgencyMethod for driving robot
US787491712 Dec 200525 Jan 2011Sony Computer Entertainment Inc.Methods and systems for enabling depth and direction detection when interfacing with a computer program
US787770612 Jan 200725 Jan 2011International Business Machines CorporationControlling a document based on user behavioral signals detected from a 3D captured image stream
US78985221 Jun 20071 Mar 2011Gesturetek, Inc.Video-based image control system
US797115612 Jan 200728 Jun 2011International Business Machines CorporationControlling resource access based on user gesturing in a 3D captured image stream of the user
US799679313 Apr 20099 Aug 2011Microsoft CorporationGesture recognizer system architecture
US80356291 Dec 200611 Oct 2011Sony Computer Entertainment Inc.Hand-held computer interactive device
US8060841 *18 Mar 200815 Nov 2011NavisenseMethod and device for touchless media searching
US807247029 May 20036 Dec 2011Sony Computer Entertainment Inc.System and method for providing a real-time three-dimensional interactive environment
US8094090 *19 Oct 200710 Jan 2012Southwest Research InstituteReal-time self-visualization system
US812724727 Dec 200628 Feb 2012Cognex CorporationHuman-machine-interface and method for manipulating data in a machine vision system
US813101529 Jun 20096 Mar 2012Qualcomm IncorporatedMultiple camera control system
US814921031 Dec 20073 Apr 2012Microsoft International Holdings B.V.Pointing device and method
US816030419 May 200917 Apr 2012Digimarc CorporationInteractive systems and methods employing wireless mobile devices
US818896821 Dec 200729 May 2012Sony Computer Entertainment Inc.Methods for interfacing with a program using a light input device
US820962021 Apr 200626 Jun 2012Accenture Global Services LimitedSystem for storage and navigation of application states and interactions
US821368019 Mar 20103 Jul 2012Microsoft CorporationProxy training data for human body tracking
US823709915 Jun 20077 Aug 2012Cognex CorporationMethod and system for optoelectronic detection and location of objects
US824398626 May 200514 Aug 2012Cognex Technology And Investment CorporationMethod and apparatus for automatic visual event detection
US824929626 May 200521 Aug 2012Cognex Technology And Investment CorporationMethod and apparatus for automatic visual event detection
US824929726 May 200521 Aug 2012Cognex Technology And Investment CorporationMethod and apparatus for automatic visual event detection
US824932924 May 200521 Aug 2012Cognex Technology And Investment CorporationMethod and apparatus for detecting and characterizing an object
US825182027 Jun 201128 Aug 2012Sony Computer Entertainment Inc.Methods and systems for enabling depth and direction detection when interfacing with a computer program
US82537461 May 200928 Aug 2012Microsoft CorporationDetermine intended motions
US826453625 Aug 200911 Sep 2012Microsoft CorporationDepth-sensitive imaging via polarization-state mapping
US826534125 Jan 201011 Sep 2012Microsoft CorporationVoice-body identity correlation
US826778130 Jan 200918 Sep 2012Microsoft CorporationVisual target tracking
US826983412 Jan 200718 Sep 2012International Business Machines CorporationWarning a user about adverse behaviors of others within an environment based on a 3D captured image stream
US827453517 Aug 200725 Sep 2012Qualcomm IncorporatedVideo-based image control system
US827941817 Mar 20102 Oct 2012Microsoft CorporationRaster scanning for depth detection
US82848473 May 20109 Oct 2012Microsoft CorporationDetecting motion for a multifunction sensor device
US829023824 May 200516 Oct 2012Cognex Technology And Investment CorporationMethod and apparatus for locating objects
US829476730 Jan 200923 Oct 2012Microsoft CorporationBody scan
US829554212 Jan 200723 Oct 2012International Business Machines CorporationAdjusting a consumer experience based on a 3D captured image stream of a consumer response
US829554621 Oct 200923 Oct 2012Microsoft CorporationPose tracking pipeline
US829555229 Apr 200923 Oct 2012Cognex Technology And Investment CorporationMethod for setting parameters of a vision detector using production line information
US829615118 Jun 201023 Oct 2012Microsoft CorporationCompound gesture-speech commands
US830341112 Oct 20106 Nov 2012Sony Computer Entertainment Inc.Methods and systems for enabling depth and direction detection when interfacing with a computer program
US832061915 Jun 200927 Nov 2012Microsoft CorporationSystems and methods for tracking a model
US832062121 Dec 200927 Nov 2012Microsoft CorporationDepth projector system with integrated VCSEL array
US832310624 Jun 20084 Dec 2012Sony Computer Entertainment America LlcDetermination of controller three-dimensional location using image analysis and ultrasonic communication
US832590925 Jun 20084 Dec 2012Microsoft CorporationAcoustic echo suppression
US83259849 Jun 20114 Dec 2012Microsoft CorporationSystems and methods for tracking a model
US833013414 Sep 200911 Dec 2012Microsoft CorporationOptical fault monitoring
US83308229 Jun 201011 Dec 2012Microsoft CorporationThermally-tuned depth camera light source
US834043216 Jun 200925 Dec 2012Microsoft CorporationSystems and methods for detecting a tilt angle from a depth image
US8345918 *12 Nov 20041 Jan 2013L-3 Communications CorporationActive subject privacy imaging
US835165126 Apr 20108 Jan 2013Microsoft CorporationHand-location post-process refinement in a tracking system
US83516522 Feb 20128 Jan 2013Microsoft CorporationSystems and methods for tracking a model
US83632122 Apr 201229 Jan 2013Microsoft CorporationSystem architecture design for time-of-flight system having reduced differential pixel size, and time-of-flight systems so designed
US83744232 Mar 201212 Feb 2013Microsoft CorporationMotion detection using depth images
US837910129 May 200919 Feb 2013Microsoft CorporationEnvironment and/or target segmentation
US837991929 Apr 201019 Feb 2013Microsoft CorporationMultiple centroid condensation of probability distribution clouds
US8379987 *30 Dec 200819 Feb 2013Nokia CorporationMethod, apparatus and computer program product for providing hand segmentation for gesture analysis
US838110821 Jun 201019 Feb 2013Microsoft CorporationNatural user input for driving interactive stories
US838555719 Jun 200826 Feb 2013Microsoft CorporationMultichannel acoustic echo reduction
US838559621 Dec 201026 Feb 2013Microsoft CorporationFirst person shooter control with virtual skeleton
US83906809 Jul 20095 Mar 2013Microsoft CorporationVisual representation expression based on player expression
US8391851 *25 May 20075 Mar 2013Digimarc CorporationGestural techniques with wireless mobile phone devices
US839625220 May 201012 Mar 2013Edge 3 TechnologiesSystems and related methods for three dimensional gesture recognition in vehicles
US840122531 Jan 201119 Mar 2013Microsoft CorporationMoving object segmentation using depth images
US840124231 Jan 201119 Mar 2013Microsoft CorporationReal-time camera tracking using depth maps
US8407625 *27 Apr 200626 Mar 2013Cybernet Systems CorporationBehavior recognition system
US840772524 Apr 200826 Mar 2013Oblong Industries, Inc.Proteins, pools, and slawx in processing environments
US840870613 Dec 20102 Apr 2013Microsoft Corporation3D gaze tracker
US84119485 Mar 20102 Apr 2013Microsoft CorporationUp-sampling binary images for segmentation
US841618722 Jun 20109 Apr 2013Microsoft CorporationItem navigation using motion-capture data
US8417026 *10 Jun 20109 Apr 2013Industrial Technology Research InstituteGesture recognition methods and systems
US841808529 May 20099 Apr 2013Microsoft CorporationGesture coach
US842272927 Jun 200716 Apr 2013Cognex CorporationSystem for configuring an optoelectronic sensor
US84227695 Mar 201016 Apr 2013Microsoft CorporationImage segmentation using reduced foreground training data
US842834021 Sep 200923 Apr 2013Microsoft CorporationScreen space plane identification
US84375067 Sep 20107 May 2013Microsoft CorporationSystem for fast, probabilistic skeletal tracking
US844805617 Dec 201021 May 2013Microsoft CorporationValidation analysis of human target
US844809425 Mar 200921 May 2013Microsoft CorporationMapping a natural input device to a legacy system
US84512783 Aug 201228 May 2013Microsoft CorporationDetermine intended motions
US845205118 Dec 201228 May 2013Microsoft CorporationHand-location post-process refinement in a tracking system
US845208730 Sep 200928 May 2013Microsoft CorporationImage selection techniques
US845735318 May 20104 Jun 2013Microsoft CorporationGestures and gesture modifiers for manipulating a user-interface
US845744920 Jul 20104 Jun 2013Digimarc CorporationWireless mobile phone methods
US846757428 Oct 201018 Jun 2013Microsoft CorporationBody scan
US846759931 Aug 201118 Jun 2013Edge 3 Technologies, Inc.Method and apparatus for confusion learning
US84834364 Nov 20119 Jul 2013Microsoft CorporationSystems and methods for tracking a model
US84878711 Jun 200916 Jul 2013Microsoft CorporationVirtual desktop coordinate transformation
US8487938 *23 Feb 200916 Jul 2013Microsoft CorporationStandard Gestures
US848888828 Dec 201016 Jul 2013Microsoft CorporationClassification of posture states
US8494859 *15 Oct 200323 Jul 2013Gh, LlcUniversal processing system and methods for production of outputs accessible by people with disabilities
US849783816 Feb 201130 Jul 2013Microsoft CorporationPush actuation of interface controls
US84984817 May 201030 Jul 2013Microsoft CorporationImage segmentation using star-convexity constraints
US84992579 Feb 201030 Jul 2013Microsoft CorporationHandles interactions for human—computer interface
US85034945 Apr 20116 Aug 2013Microsoft CorporationThermal management system
US850376613 Dec 20126 Aug 2013Microsoft CorporationSystems and methods for detecting a tilt angle from a depth image
US850891914 Sep 200913 Aug 2013Microsoft CorporationSeparation of electrical and optical components
US850947916 Jun 200913 Aug 2013Microsoft CorporationVirtual object
US850954529 Nov 201113 Aug 2013Microsoft CorporationForeground subject detection
US8514251 *23 Jun 200820 Aug 2013Qualcomm IncorporatedEnhanced character input using recognized gestures
US851426926 Mar 201020 Aug 2013Microsoft CorporationDe-aliasing depth images
US85209006 Aug 201027 Aug 2013Digimarc CorporationMethods and devices involving imagery and gestures
US852366729 Mar 20103 Sep 2013Microsoft CorporationParental control settings based on body dimensions
US85267341 Jun 20113 Sep 2013Microsoft CorporationThree-dimensional background removal for vision system
US85313963 Sep 200910 Sep 2013Oblong Industries, Inc.Control system for navigating a principal dimension of a data space
US85371113 Sep 200917 Sep 2013Oblong Industries, Inc.Control system for navigating a principal dimension of a data space
US85371123 Sep 200917 Sep 2013Oblong Industries, Inc.Control system for navigating a principal dimension of a data space
US85380647 Sep 201017 Sep 2013Digimarc CorporationMethods and devices employing content identifiers
US854225229 May 200924 Sep 2013Microsoft CorporationTarget digitization, extraction, and tracking
US85429102 Feb 201224 Sep 2013Microsoft CorporationHuman tracking system
US85482704 Oct 20101 Oct 2013Microsoft CorporationTime-of-flight depth imaging
US85539348 Dec 20108 Oct 2013Microsoft CorporationOrienting the position of a sensor
US855393929 Feb 20128 Oct 2013Microsoft CorporationPose tracking pipeline
US855887316 Jun 201015 Oct 2013Microsoft CorporationUse of wavefront coding to create a depth image
US85645347 Oct 200922 Oct 2013Microsoft CorporationHuman tracking system
US85654767 Dec 200922 Oct 2013Microsoft CorporationVisual target tracking
US85654777 Dec 200922 Oct 2013Microsoft CorporationVisual target tracking
US856548513 Sep 201222 Oct 2013Microsoft CorporationPose tracking pipeline
US857126317 Mar 201129 Oct 2013Microsoft CorporationPredicting joint positions
US85770847 Dec 20095 Nov 2013Microsoft CorporationVisual target tracking
US85770857 Dec 20095 Nov 2013Microsoft CorporationVisual target tracking
US85770876 Jul 20125 Nov 2013International Business Machines CorporationAdjusting a consumer experience based on a 3D captured image stream of a consumer response
US8578282 *7 Mar 20075 Nov 2013NavisenseVisual toolkit for a virtual user interface
US85783026 Jun 20115 Nov 2013Microsoft CorporationPredictive determination
US858292512 Apr 201012 Nov 2013Cognex Technology And Investment CorporationSystem and method for displaying and using non-numeric graphic elements to control and monitor a vision system
US858758331 Jan 201119 Nov 2013Microsoft CorporationThree-dimensional environment reconstruction
US858777313 Dec 201219 Nov 2013Microsoft CorporationSystem architecture design for time-of-flight system having reduced differential pixel size, and time-of-flight systems so designed
US858846412 Jan 200719 Nov 2013International Business Machines CorporationAssisting a vision-impaired user with navigation based on a 3D captured image stream
US85884657 Dec 200919 Nov 2013Microsoft CorporationVisual target tracking
US858851715 Jan 201319 Nov 2013Microsoft CorporationMotion detection using depth images
US85927392 Nov 201026 Nov 2013Microsoft CorporationDetection of configuration changes of an optical element in an illumination system
US859714213 Sep 20113 Dec 2013Microsoft CorporationDynamic camera based practice mode
US860576331 Mar 201010 Dec 2013Microsoft CorporationTemperature measurement and control for laser and light-emitting diodes
US861066526 Apr 201317 Dec 2013Microsoft CorporationPose tracking pipeline
US861160719 Feb 201317 Dec 2013Microsoft CorporationMultiple centroid condensation of probability distribution clouds
US861366631 Aug 201024 Dec 2013Microsoft CorporationUser selection and navigation based on looped motions
US861467330 May 201224 Dec 2013May Patents Ltd.System and method for control based on face or hand gesture detection
US861467418 Jun 201224 Dec 2013May Patents Ltd.System and method for control based on face or hand gesture detection
US86184059 Dec 201031 Dec 2013Microsoft Corp.Free-space gesture musical instrument digital interface (MIDI) controller
US86191222 Feb 201031 Dec 2013Microsoft CorporationDepth camera compatibility
US862011325 Apr 201131 Dec 2013Microsoft CorporationLaser diode modes
US862493223 Aug 20127 Jan 2014Qualcomm IncorporatedVideo-based image control system
US862583716 Jun 20097 Jan 2014Microsoft CorporationProtocol and format for communicating an image from a camera to a computing environment
US86258491 Feb 20127 Jan 2014Qualcomm IncorporatedMultiple camera control system
US86258557 Feb 20137 Jan 2014Edge 3 Technologies LlcThree dimensional gesture recognition in vehicles
US86299764 Feb 201114 Jan 2014Microsoft CorporationMethods and systems for hierarchical de-aliasing time-of-flight (TOF) systems
US863045715 Dec 201114 Jan 2014Microsoft CorporationProblem states for pose tracking pipeline
US863047820 Sep 201214 Jan 2014Cognex Technology And Investment CorporationMethod and apparatus for locating objects
US86313558 Jan 201014 Jan 2014Microsoft CorporationAssigning gesture dictionaries
US863389016 Feb 201021 Jan 2014Microsoft CorporationGesture detection based on joint skipping
US8633914 *13 Jun 201321 Jan 2014Adrea, LLCUse of a two finger input on touch screens
US86356372 Dec 201121 Jan 2014Microsoft CorporationUser interface presenting an animated avatar performing a media reaction
US86389853 Mar 201128 Jan 2014Microsoft CorporationHuman body pose estimation
US863898916 Jan 201328 Jan 2014Leap Motion, Inc.Systems and methods for capturing motion in three-dimensional space
US864459920 May 20134 Feb 2014Edge 3 Technologies, Inc.Method and apparatus for spawning specialist belief propagation networks
US864460919 Mar 20134 Feb 2014Microsoft CorporationUp-sampling binary images for segmentation
US864955429 May 200911 Feb 2014Microsoft CorporationMethod to control perspective for a camera-controlled computer
US86550695 Mar 201018 Feb 2014Microsoft CorporationUpdating image segmentation following user input
US865509310 Feb 201118 Feb 2014Edge 3 Technologies, Inc.Method and apparatus for performing segmentation of an image
US86596589 Feb 201025 Feb 2014Microsoft CorporationPhysical interaction zone for gesture-based user interfaces
US866030320 Dec 201025 Feb 2014Microsoft CorporationDetection of body and props
US866031013 Dec 201225 Feb 2014Microsoft CorporationSystems and methods for tracking a model
US866614410 Feb 20114 Mar 2014Edge 3 Technologies, Inc.Method and apparatus for determining disparity of texture
US866751912 Nov 20104 Mar 2014Microsoft CorporationAutomatic passive and anonymous feedback system
US867002916 Jun 201011 Mar 2014Microsoft CorporationDepth camera illuminator with superluminescent light-emitting diode
US867598111 Jun 201018 Mar 2014Microsoft CorporationMulti-modal gender recognition including depth data
US867658122 Jan 201018 Mar 2014Microsoft CorporationSpeech recognition analysis via identification information
US868125528 Sep 201025 Mar 2014Microsoft CorporationIntegrated low power depth camera and projection device
US868132131 Dec 200925 Mar 2014Microsoft International Holdings B.V.Gated 3D camera
US86820287 Dec 200925 Mar 2014Microsoft CorporationVisual target tracking
US86869396 May 20061 Apr 2014Sony Computer Entertainment Inc.System, method, and apparatus for three-dimensional input control
US86870442 Feb 20101 Apr 2014Microsoft CorporationDepth camera compatibility
US869372428 May 20108 Apr 2014Microsoft CorporationMethod and system implementing user-centric gesture control
US870250720 Sep 201122 Apr 2014Microsoft CorporationManual and camera-based avatar control
US870587715 Nov 201122 Apr 2014Edge 3 Technologies, Inc.Method and apparatus for fast computational stereo
US87174693 Feb 20106 May 2014Microsoft CorporationFast gating photosurface
US871838712 Dec 20116 May 2014Edge 3 Technologies, Inc.Method and apparatus for enhanced stereo vision
US87231181 Oct 200913 May 2014Microsoft CorporationImager for constructing color and depth images
US87248873 Feb 201113 May 2014Microsoft CorporationEnvironmental modifications to mitigate environmental factors
US872490618 Nov 201113 May 2014Microsoft CorporationComputing pose and/or shape of modifiable entities
US874412129 May 20093 Jun 2014Microsoft CorporationDevice for identifying and tracking multiple humans over time
US87455411 Dec 20033 Jun 2014Microsoft CorporationArchitecture for controlling a computer using hand gestures
US874955711 Jun 201010 Jun 2014Microsoft CorporationInteracting with user interface via avatar
US87512154 Jun 201010 Jun 2014Microsoft CorporationMachine based sign language interpreter
US875813227 Aug 201224 Jun 2014Sony Computer Entertainment Inc.Methods and systems for enabling depth and direction detection when interfacing with a computer program
US876039531 May 201124 Jun 2014Microsoft CorporationGesture recognition techniques
US876057121 Sep 200924 Jun 2014Microsoft CorporationAlignment of lens and image sensor
US876150915 Nov 201124 Jun 2014Edge 3 Technologies, Inc.Method and apparatus for fast computational stereo
US876289410 Feb 201224 Jun 2014Microsoft CorporationManaging virtual ports
US8771206 *19 Aug 20118 Jul 2014Accenture Global Services LimitedInteractive virtual care
US877335516 Mar 20098 Jul 2014Microsoft CorporationAdaptive cursor sizing
US877591617 May 20138 Jul 2014Microsoft CorporationValidation analysis of human target
US878115610 Sep 201215 Jul 2014Microsoft CorporationVoice-body identity correlation
US878255324 Aug 201015 Jul 2014Cognex CorporationHuman-machine-interface and method for manipulating data in a machine vision system
US87825674 Nov 201115 Jul 2014Microsoft CorporationGesture recognizer system architecture
US878673018 Aug 201122 Jul 2014Microsoft CorporationImage exposure using exclusion regions
US878765819 Mar 201322 Jul 2014Microsoft CorporationImage segmentation using reduced foreground training data
US878897323 May 201122 Jul 2014Microsoft CorporationThree-dimensional gesture controlled avatar configuration interface
US20060072009 *1 Oct 20046 Apr 2006International Business Machines CorporationFlexible interaction-based computer interfacing using visible artifacts
US20100166258 *30 Dec 20081 Jul 2010Xiujuan ChaiMethod, apparatus and computer program product for providing hand segmentation for gesture analysis
US20100194762 *23 Feb 20095 Aug 2010Microsoft CorporationStandard Gestures
US20110041102 *19 Jul 201017 Feb 2011Jong Hwan KimMobile terminal and method for controlling the same
US20110156999 *10 Jun 201030 Jun 2011Industrial Technology Research InstituteGesture recognition methods and systems
US20110157009 *28 Dec 201030 Jun 2011Sungun KimDisplay device and control method thereof
US20110296505 *28 May 20101 Dec 2011Microsoft CorporationCloud-based personal trait profile data
US20120016641 *13 Jul 201019 Jan 2012Giuseppe RaffaEfficient gesture processing
US20120038637 *26 Oct 201116 Feb 2012Sony Computer Entertainment Inc.User-driven three-dimensional interactive gaming environment
US20120202569 *19 Mar 20129 Aug 2012Primesense Ltd.Three-Dimensional User Interface for Game Applications
US20120235904 *19 Mar 201120 Sep 2012The Board of Trustees of the Leland Stanford, Junior, UniversityMethod and System for Ergonomic Touch-free Interface
US20120287044 *23 Jul 201215 Nov 2012Intellectual Ventures Holding 67 LlcProcessing of gesture-based user interactions using volumetric zones
US20130046149 *19 Aug 201121 Feb 2013Accenture Global Services LimitedInteractive virtual care
US20130055163 *26 Oct 201228 Feb 2013Michael MatasTouch Screen Device, Method, and Graphical User Interface for Providing Maps, Directions, and Location-Based Information
US20140119640 *31 Oct 20121 May 2014Microsoft CorporationScenario-specific body-part tracking
USRE4435322 Dec 20109 Jul 2013Cognex Technology And Investment CorporationSystem and method for assigning analysis parameters to vision detector using a graphical interface
DE102009043277A129 Sep 200914 Oct 2010Avaya Inc.Interpretation von Gebärden, um visuelle Warteschlangen bereitzustellen
Classifications
U.S. Classification715/863, 382/218, 345/156, 715/719, 382/209
International ClassificationG06F3/00, G06K9/00, G06F3/01
Cooperative ClassificationG06F3/017, G06K9/00335, G06F3/0304
European ClassificationG06F3/03H, G06K9/00G, G06F3/01G
Legal Events
DateCodeEventDescription
9 Apr 2012ASAssignment
Effective date: 20120216
Owner name: MICROSOFT CORPORATION, WASHINGTON
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:IV GESTURE ASSETS 12, LLC;REEL/FRAME:028012/0370
15 Feb 2012ASAssignment
Owner name: IV GESTURE ASSETS 12, LLC, DELAWARE
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ELET SYSTEMS L.L.C.;REEL/FRAME:027710/0132
Effective date: 20111222
18 Nov 2011ASAssignment
Effective date: 20071211
Owner name: ELET SYSTEMS L.L.C., DELAWARE
Free format text: CORRECTION TO THE RECORDATION COVER SHEET OF THE ASSIGNMENT RECORDED AT 020986/0709 ON 05/23/2008;ASSIGNOR:ELECTRIC PLANET INTERACTIVE;REEL/FRAME:027255/0528
23 Sep 2011FPAYFee payment
Year of fee payment: 12
23 May 2008ASAssignment
Owner name: ELET SYSTEMS L.L.C., DELAWARE
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ELECTRIC PLANET INTRACTIVE;REEL/FRAME:020986/0709
Effective date: 20071211
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ELECTRIC PLANET INTRACTIVE;REEL/FRAME:20986/709
Owner name: ELET SYSTEMS L.L.C.,DELAWARE
28 Mar 2008ASAssignment
Owner name: ELECTRIC PLANET INTERACTIVE, WASHINGTON
Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE ASSIGNEE NAME PREVIOUSLY RECORDED ON REEL 009134 FRAME 0906;ASSIGNOR:SHIFFER FKA KATERINA NGUYEN, KATERINA;REEL/FRAME:020723/0008
Effective date: 20080323
6 Dec 2007FPAYFee payment
Year of fee payment: 8
8 Dec 2003FPAYFee payment
Year of fee payment: 4
7 May 2002CCCertificate of correction
13 Apr 1998ASAssignment
Owner name: ELECTRIC PLANET, CALIFORNIA
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:NGUYEN, KATERINA H.;REEL/FRAME:009134/0906
Effective date: 19980403