CN100426317C - Multiple attitude human face detection and track system and method - Google Patents

Multiple attitude human face detection and track system and method

Info

Publication number: CN100426317C
Application number: CNB200610113423XA
Authority: CN (China)
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Other versions: CN1924894A (Chinese, zh)
Inventors: 黄英, 谢东海, 王浩
Original assignee: Vimicro Corp (application filed by Vimicro Corp)
Current assignee: Beijing Vimicro Ai Chip Technology Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Priority: CNB200610113423XA
Published as CN1924894A (application); application granted and published as CN100426317C


Abstract

This invention discloses a method and system for detecting faces of different poses in a video sequence and continuously tracking them across frames, comprising the following steps: obtaining frontal and half-profile face detection models respectively through sample training, and determining an AAM face model; using these models to detect faces in the input video images and determine whether a face exists in a frame; and, if a face is detected, tracking and verifying the face in subsequent frames.

Description

Multi-pose face detection and tracking system and method
Technical field
The present invention relates to a face detection and tracking system and method, and more particularly to a multi-pose face detection and tracking method.
Background technology
The human face is one of the most convenient interfaces for human-computer interaction in computer vision systems. Face detection determines information such as the position and size of every face in an image or image sequence, while face tracking continuously follows one or more detected faces through a video sequence. Face detection is a prerequisite for technologies such as face recognition, expression recognition, and face synthesis; together with tracking, it has broad application value in fields such as intelligent human-machine interaction, video conferencing, intelligent surveillance, and video retrieval.
The images handled by the present system come from the video sequence of a camera. The applicant previously proposed a method and system for real-time detection and continuous tracking of faces in a video sequence, Chinese patent application No. 200510135668.8, hereinafter referred to as Document 1, which is hereby incorporated by reference in its entirety. That application adopted a face detection method based on an AdaBoost cascade of statistical classifiers to realize real-time detection of upright frontal faces, and combined it with a face tracking method based on Mean Shift and histogram features to realize a real-time face tracking system. Experimental results show that the system can detect faces with in-depth rotation of -20 to 20 degrees and in-plane rotation of -20 to 20 degrees, as well as faces of different skin colors, faces under different illumination conditions, faces wearing glasses, and so on. Because face tracking is realized by a skin-color algorithm, it is unaffected by face pose, so profile and rotated faces can be tracked equally well.
However, the algorithm in the above patent application has certain limitations. First, it only trains a detection model for frontal faces and cannot detect half-profile faces, which means both face detection and verification are limited to frontal faces, greatly restricting the algorithm's range of application. Second, it tracks faces only by skin-color histograms; the skin-color features of a face are easily disturbed by other skin regions such as the neck and hands, or by similarly colored regions such as yellow clothing, so the tracked region sometimes jumps onto a hand, the neck, or yellow clothes. Third, the size and position of the tracked region change rather violently; even when the face stays still, the tracking result visibly jitters. Finally, the algorithm cannot obtain further pose information about the face, such as its rotation angle or approximate current pose.
Summary of the invention
The technical problem to be solved by the present invention is to provide a multi-pose face detection and tracking system and method that can detect and track faces of multiple poses, overcome interference from non-face regions whose color is close to skin color, guarantee the continuous tracking of multi-pose faces and the stability of the detection algorithm, and obtain the rotation angle of the face and output its accurate size.
To solve the above technical problem, the invention provides a multi-pose face detection and tracking method, comprising:
(1) obtaining frontal and half-profile face detection models respectively through face sample training, and determining an Active Appearance Model (AAM) face model;
(2) performing face detection on input video images with the frontal and half-profile face detection models, and determining whether a face exists in a frame;
(3) if a face is detected in a frame, tracking and verifying that face in subsequent frames, comprising the steps of:
(31) tracking the face position in the previous frame to obtain a preliminary position of the face in the current frame;
(32) taking the obtained preliminary position as an initial value, and calculating the translation velocity of the face from the chrominance difference between the current and previous frames;
(33) estimating the approximate position of the face in the current frame from the translation velocity, and detecting near that position with the frontal and half-profile detection models to verify the face;
(34) if a face is detected near that position, the verification passes, and the AAM face model is used to calculate the affine transformation coefficients of the current face and obtain the feature parameters of the face in the current frame.
Wherein, step (3) further comprises:
(35) matching key points of the face between the current and previous frames, and further correcting the calculated translation velocity of the face and the feature parameters of the current face according to the matching result.
Wherein, step (3) further comprises:
(36) updating the feature parameters of the current face, and using these parameters for the tracking verification of the next frame.
Wherein, step (34) further comprises: if no face is detected near that position, the verification fails, and tracking verification continues in the next frame.
Wherein, step (34) further comprises: if face verification still fails in several subsequent frames, tracking is stopped.
Wherein, the method further comprises the step of:
(4) after tracking of the previous target stops, detection starts again from step (2) in subsequent images, and tracking resumes once a new face is found.
Wherein, the step in step (1) of obtaining frontal and half-profile face detection models respectively through face sample training comprises: first training a multilayer detection model with face samples of all poses, then training separately on face samples of frontal, left-profile, and right-profile poses to obtain detection models for the three poses.
Wherein, the face detection step of step (2) comprises: first searching the image with the all-pose detection model to eliminate most search windows, then feeding the remaining windows into the three pose-specific detection models respectively, and determining the approximate pose of the face from the detection results.
To solve the above technical problem, the present invention further provides a multi-pose face detection and tracking system, comprising:
a training module, for obtaining frontal and half-profile face detection models respectively through face sample training, and determining an AAM face model;
a detection module, for performing face detection on input video images according to the frontal and half-profile face detection models, and determining whether a face exists in a frame;
a tracking module, for tracking and verifying a face in subsequent frames after it is detected in a frame, comprising:
a unit for tracking the face position in the previous frame to obtain a preliminary position of the face in the current frame;
a unit for taking the obtained preliminary position as an initial value and calculating the translation velocity of the face from the chrominance difference between the current and previous frames;
a unit for estimating the approximate position of the face in the current frame from the translation velocity, and detecting near that position with the frontal and half-profile detection models to verify the face;
a unit for calculating, after a face is detected near that position, the affine transformation coefficients of the current face with the AAM face model, and obtaining the feature parameters of the face in the current frame.
Wherein, the tracking module further comprises:
a unit for matching key points of the face between the current and previous frames, and further correcting the calculated translation velocity of the face and the feature parameters of the current face according to the matching result.
Wherein, the training module trains a multilayer detection model with face samples of all poses, then trains separately on face samples of frontal, left-profile, and right-profile poses to obtain detection models for the three poses.
Wherein, the detection module searches the image with the all-pose detection model to eliminate most search windows, feeds the remaining windows into the three pose-specific detection models respectively, and determines the approximate pose of the face from the detection results.
The multi-pose face detection and tracking system and method of the present invention can detect and track faces of multiple poses; overcome interference from non-face regions close to skin color, such as the neck, hands, or yellow clothing; guarantee the continuous tracking of multi-pose faces and the stability of the detection algorithm; and obtain the rotation angle of the face and output its accurate size.
Description of drawings
Fig. 1 is a structural schematic diagram of a multi-pose face detection and tracking system according to an embodiment of the invention;
Fig. 2 is a flow chart of a multi-pose face detection and tracking method according to an embodiment of the invention;
Fig. 3 is a schematic diagram of face detection and tracking results in the method according to the embodiment of the invention;
Fig. 4 is a schematic diagram of the seven groups of micro-features selected by the face detection algorithm in the method;
Fig. 5 shows the calibration and collection of face samples in the method;
Fig. 6 is a schematic diagram of four groups of multi-pose face detection results in the method;
Fig. 7 is a flow chart of the face verification module in the method;
Fig. 8 is a schematic diagram of faces that passed the first-level verification in the method;
Fig. 9 is a schematic diagram of faces that passed the second-level verification in the method;
Fig. 10 is an example of affine coefficient calculation results of the AAM algorithm in the method;
Fig. 11 is a schematic diagram of AAM-based face tracking results in the method;
Fig. 12 is a schematic diagram of key-point selection and tracking results in the method;
Fig. 13 is an example of face detection and tracking results in the method.
Embodiment
Referring to Fig. 1, the present invention first provides a multi-pose face detection and tracking system, comprising a training module 100, a detection module 200, and a tracking module (not shown), wherein:
The training module 100 obtains frontal and half-profile face detection models (covering right-profile and left-profile poses) respectively through face sample training, and determines an AAM (Active Appearance Models) face model;
The detection module 200 performs face detection on input video images according to the frontal and half-profile face detection models, and determines whether a face exists in a frame;
The tracking module tracks and verifies a face in subsequent frames after it is detected in a frame, and comprises:
a unit for tracking the face position in the previous frame to obtain a preliminary position of the face in the current frame;
a unit for taking the obtained preliminary position as an initial value and calculating the translation velocity of the face from the color difference between the current and previous frames;
a unit for estimating the approximate position of the face in the current frame from the translation velocity, and detecting near that position with the frontal and half-profile detection models to verify the face;
a unit for calculating, after a face is detected near that position, the affine transformation coefficients of the current face with the AAM face model, and obtaining the feature parameters of the face in the current frame; and
a unit for matching key points of the face between the current and previous frames, and further correcting the calculated translation velocity of the face and the feature parameters of the current face according to the matching result.
In the embodiment shown in Fig. 1, the training module 100 first trains two groups of models: the frontal and half-profile face detection models, and the AAM face model (not shown). The face detection models can be trained as multistage classifiers based on the AdaBoost algorithm, using many frontal and half-profile face samples; the extracted face size is 12 x 12 pixels. To guarantee that the algorithm can recognize the three poses of left profile, frontal, and right profile, the present embodiment trains a left-profile face detection model, a right-profile face detection model, and a frontal face detection model, where the left-profile and right-profile models are collectively called the half-profile face detection models, and the right-profile model is obtained by mirroring the left-profile model. In addition, to accelerate detection, the present embodiment also trains a 15-layer all-pose face detection model on face samples of all poses, called the first-level detection model, which performs a preliminary detection on the input image to roughly locate faces.
In the training module 100, the purpose of training the AAM face model is to calculate, given the approximate position and size of an input face, the affine transformation coefficients that map it to a standard face, thereby obtaining its more accurate position, size, and rotation angle. The algorithm applies Principal Component Analysis (PCA) to a large number of face samples to obtain a mean face and a set of orthogonal vectors; at run time, an iterative procedure compares the gray levels of the input face with the trained model and calculates the affine transformation coefficients of the face.
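The PCA training step above can be sketched as follows; this is an illustrative sketch, not part of the original disclosure, and the function names and toy data are ours:

```python
import numpy as np

def train_pca_face_model(samples, n_components=8):
    """Toy PCA training: 'samples' is (N, H*W) flattened gray faces.

    Returns the mean face and the top orthogonal vectors, matching the
    "mean face plus orthogonal vectors" description above.
    """
    X = np.asarray(samples, dtype=np.float64)
    mean_face = X.mean(axis=0)
    centered = X - mean_face
    # SVD of the centered data yields the principal axes directly.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]  # rows are orthonormal vectors
    return mean_face, basis

def project(face, mean_face, basis):
    """Coordinates of one face in the PCA subspace."""
    return basis @ (np.asarray(face, np.float64) - mean_face)

# Tiny demo with random 12x12 "faces", echoing the 12x12 crops above.
rng = np.random.default_rng(0)
faces = rng.random((50, 144))
mean_face, basis = train_pca_face_model(faces, n_components=4)
coeffs = project(faces[0], mean_face, basis)
```

The patent's run-time AAM fit would additionally iterate, re-projecting the warped input face and updating the affine parameters until convergence, which this sketch omits.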
In the detection module 200, face detection in this embodiment first searches the input image with the all-pose detection model to eliminate most search windows, then feeds the remaining windows into the three pose-specific detection models respectively, returning the final candidate boxes and computing a weight for each candidate box from the detection results. In general, the detection model of each pose returns some candidate boxes; neighboring candidate boxes are merged, and the weights of the candidate boxes returned for each pose are accumulated. If the frontal weight within a merged box is largest, the detected face should be a frontal face; if the left-profile weight is largest, the detected face can be judged to be a left profile. The approximate pose of the face is thereby determined.
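The weight-accumulation rule above can be sketched as follows (illustrative only; the actual box merging and weighting in the patent are more involved):

```python
def decide_pose(candidates):
    """candidates: list of (pose, weight) pairs returned by the three
    pose-specific detectors for boxes merged into one cluster.

    Returns the pose with the largest accumulated weight, mirroring the
    voting rule described above.
    """
    totals = {}
    for pose, weight in candidates:
        totals[pose] = totals.get(pose, 0.0) + weight
    return max(totals, key=totals.get)
```

For example, if the merged cluster holds two frontal boxes and one left-profile box, the accumulated frontal weight wins and the face is judged frontal.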
Referring now to Fig. 2, the flow of the multi-pose face detection and tracking method according to the embodiment of the invention is as follows.
Step 201: a frame is input from the camera; before a tracking target is obtained, every frame is searched to detect the presence of faces.
301 in Fig. 3 shows a face detection result, where the boxes are the detected face boxes.
Step 202: judge whether the previous frame has a tracked face.
Step 203: if the previous frame has no tracked face, multi-pose face detection is performed on the current frame; if one or more faces are found in the current frame, go to step 204, otherwise face detection continues in subsequent images.
Step 204: in the next two frames, the face detected previously is tracked and verified. Only after a face passes verification in two consecutive frames does the algorithm consider it to really exist; if several faces pass verification, the largest face is picked and tracking begins. Face verification here means invoking the detection module 200 again on the tracked face region to judge whether it is a real face.
Step 205: tracking begins after the verification passes.
After a face is confirmed as tracked, it continues to be tracked in subsequent frames; the tracking process comprises the following steps:
Step 206: the face of the previous frame is tracked with the Mean Shift and histogram-based face tracking algorithm to obtain a preliminary position of the current face.
Step 207: the face position obtained in the previous step is inaccurate and easily disturbed by other regions close to skin color, such as the neck and hands, so the chrominance information of the current and previous frames is also used to obtain the translation velocity of the face: taking the tracking result of the previous step as an initial value, the present invention applies the Lucas-Kanade inverse algorithm to obtain a more accurate face translation velocity.
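A minimal single-step Lucas-Kanade translation estimate, simplified from the inverse algorithm mentioned above (this is a forward least-squares step on one gray patch; names and the demo data are ours):

```python
import numpy as np

def lk_translation(prev, curr):
    """Estimate a pure translation (u, v) between two equally sized gray
    patches from one Lucas-Kanade least-squares step: solve
    [Ix Iy] [u v]^T = -(curr - prev) over all pixels.
    """
    prev = np.asarray(prev, np.float64)
    curr = np.asarray(curr, np.float64)
    iy, ix = np.gradient(prev)        # spatial gradients of the template
    it = curr - prev                  # temporal difference
    A = np.stack([ix.ravel(), iy.ravel()], axis=1)
    b = -it.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Demo: a brightness ramp along x, shifted right by half a pixel.
prev = np.tile(np.arange(10.0), (10, 1))
curr = prev - 0.5                     # I(x - 0.5) for a unit-slope ramp
u, v = lk_translation(prev, curr)
```

A production tracker would iterate this step in a coarse-to-fine pyramid; one step suffices here because the demo motion is sub-pixel.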
Step 208: the approximate position of the face is estimated from the calculated translation velocity, and the face detection models are used for face verification, i.e., the neighborhood of this position is searched to judge whether a face exists in this region; the face verification method here is consistent with that described in step 205.
Step 209: judge whether the face passes verification.
If a face exists in the current region, face verification passes, and the method comprises the steps of:
Step 210: the affine transformation coefficients are calculated with the AAM algorithm to obtain the feature parameters of the current face, including its accurate position, rotation angle, and size.
Step 211: key points of the face are matched between the current and previous frames to obtain a more accurate translation velocity, scale change, and rotation coefficient between the two frames, and thereby the accurate feature parameters of the current face. Another purpose of this step is to keep the tracking result stable so that the tracked region does not jitter visibly. 302 in Fig. 3 shows a face tracking result that passed verification.
Step 212: the feature parameters of the current face are updated and used to process the next frame.
If in step 209 no face is found in the tracked region, i.e., face verification fails, the current tracked region either contains no face or the face pose has changed too much; the face continues to be tracked and verified in subsequent frames, comprising the steps of:
Step 213: judge whether several consecutive frames have failed verification.
Step 214: if the verification passes, the feature parameters are updated and tracking continues.
Step 215: if face verification still fails in several subsequent frames, the current tracking target is considered probably not a face, or its pose has changed so much that tracking it is of little value, and tracking stops. 303 in Fig. 3 shows an example of a face tracking result that failed verification.
After the previous tracking target stops being tracked, face detection is performed again in subsequent images until a new face is found, and then tracking resumes.
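The overall detect-verify-track loop of steps 201-215 can be sketched as a small state machine. The two-consecutive-frame confirmation and the give-up-after-several-misses rules follow the text above, while the MAX_MISSES value and the boolean verification stub are our assumptions:

```python
class FaceTracker:
    """Minimal state machine for the flow above: detect until a face
    passes verification in two consecutive frames, then track; drop the
    target after MAX_MISSES consecutive failed verifications and return
    to detection. Detection/verification are stubbed as a boolean.
    """
    MAX_MISSES = 3  # "several frames" in the text; exact count assumed

    def __init__(self):
        self.state = "detecting"
        self.confirm = 0
        self.misses = 0

    def step(self, verified):
        if self.state == "detecting":
            self.confirm = self.confirm + 1 if verified else 0
            if self.confirm >= 2:      # two consecutive verified frames
                self.state = "tracking"
        else:
            if verified:
                self.misses = 0
            else:
                self.misses += 1
                if self.misses >= self.MAX_MISSES:
                    self.state, self.confirm, self.misses = "detecting", 0, 0
        return self.state
```

In a real pipeline, `verified` would come from re-running the detection models on the tracked region, as step 208 describes.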
Below, several key technical points in the processing procedure of the present invention are emphasized.
First, the face detection algorithm described in step 203 of the present invention is described in further detail.
The principle of the face detection algorithm of the present invention is basically consistent with Document 1, adopting a face detection method based on an AdaBoost cascade of statistical classifiers. As described in Document 1, the AdaBoost-based face detection algorithm (P. Viola and M. Jones, Rapid object detection using a boosted cascade of simple features, Proc. on Computer Vision and Pattern Recognition, 2001; hereinafter referred to as Document 2) first trains a "face/non-face" two-class classifier from a large number of face and non-face samples; this classifier can decide whether a rectangular window of a certain scale is a face. If the rectangle is m long and n wide, the face detection flow is: first scale the image repeatedly by a fixed ratio, exhaustively search and classify all m x n pixel windows in the resulting image series, feed each window into the "face/non-face" classifier, keep the windows recognized as faces as candidates, then merge candidates at adjacent positions with a post-processing algorithm, and output the position, size, and other information of all detected faces.
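The multi-scale exhaustive search just described can be sketched as follows (illustrative only; the search step size is our assumption, while the 12-pixel window and the nine scales spaced by a factor of about 1.25 echo the embodiment below):

```python
def sliding_windows(width, height, win=12, scale0=1.5, ratio=1.25,
                    nscales=9, step=2):
    """Enumerate (x, y, size) candidate boxes in original-image
    coordinates for a win x win classifier run over a shrinking image
    pyramid, as in the cascade detection flow described above.
    """
    boxes = []
    for i in range(nscales):
        s = scale0 * ratio ** i
        w, h = int(width / s), int(height / s)   # shrunken image size
        for y in range(0, h - win + 1, step):
            for x in range(0, w - win + 1, step):
                # Map the window back to original-image coordinates.
                boxes.append((int(x * s), int(y * s), int(round(win * s))))
    return boxes
```

Each box would then be fed through the cascade; the cheap first-level model rejects most boxes so the three pose-specific models see only a few survivors.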
Document 1 only considered the detection of frontal faces (see the standard face image shown at 501 in Fig. 5 and the cropped standard face shown at 502), while the present invention also needs to detect half-profile faces, to guarantee the continuous tracking of multi-pose faces and the stability of the detection algorithm. The present invention still extracts face features with the seven groups of micro-features shown in Fig. 4, but images of faces in different poses differ greatly, so the micro-features at the same location differ greatly between poses. This means that if the algorithm of Document 1 were kept unchanged, training all AdaBoost strong classifiers with all samples as positive samples, the training algorithm would hardly converge: even with very many micro-features selected for the weak classifiers at every level, the false alarm rate on negative samples would remain fairly high. Therefore, multi-pose face detection is carried out in two steps: first a 15-layer detection model is trained with face samples of all poses, and then the samples of the three poses are trained separately, one detection model per pose.
The present invention collected about 4500 face images in total, of which about 2500 are frontal face images, about 1000 are left-profile faces, and about 1000 are right-profile faces. Following the standard face and cropping method mentioned in Document 1, the face samples are affine-transformed and cropped (see the face sample with calibration points shown at 503 in Fig. 5 and the cropping result shown at 504), and all face regions are normalized to 12 x 12 pixels. If the distance between the two eyes is r, the midpoint of the line connecting the eyes is (x_center, y_center), and the length and width of the collection rectangle are both set to 2r, i.e., twice the inter-ocular distance, then the coordinates (x_left, y_top, x_right, y_bottom) of the cropping rectangle are:
(x_left, y_top, x_right, y_bottom) = (x_center - r, y_center - 0.5r, x_center + r, y_center + 1.5r)    (1)
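As a quick check of equation (1), a short sketch (function name ours) computing the crop rectangle from the two eye positions:

```python
def face_crop_rect(left_eye, right_eye):
    """Crop rectangle per equation (1): a 2r x 2r box, where r is the
    inter-ocular distance, anchored on the midpoint of the eyes with
    0.5r above and 1.5r below it.
    """
    (x1, y1), (x2, y2) = left_eye, right_eye
    r = ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
    xc, yc = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    return (xc - r, yc - 0.5 * r, xc + r, yc + 1.5 * r)
```

Note the box height y_bottom - y_top = 2r, consistent with the stated 2r side length.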
To strengthen the classifier's robustness to a certain range of face rotation and size change, each sample is also mirrored, rotated by ±20 degrees, and enlarged 1.1 times, extending each sample into five samples, which gives about 22500 positive samples in total. The negative sample images are a large number of images containing no faces, including landscapes, animals, text, and so on, 5400 images in all. The acquisition of negative sample features in the training of each AdaBoost cascade layer is also fully consistent with Document 1: first a negative image is chosen at random, the size and position of the negative sample in the image are determined at random, the corresponding region is cropped from the image, and the crop is normalized to 12 x 12 to obtain one negative sample.
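The negative-sample cropping procedure can be sketched as follows (illustrative; the nearest-neighbour resampling and the list-of-rows image format are our simplifications):

```python
import random

def random_negative_patch(image, out=12, rng=random):
    """Cut a random square region from a non-face image and scale it to
    out x out, as in the negative-sample procedure above. 'image' is a
    list of pixel rows; nearest-neighbour resampling keeps the sketch
    dependency-free.
    """
    h, w = len(image), len(image[0])
    size = rng.randint(out, min(h, w))        # random crop size
    x0 = rng.randint(0, w - size)             # random crop position
    y0 = rng.randint(0, h - size)
    return [[image[y0 + (j * size) // out][x0 + (i * size) // out]
             for i in range(out)] for j in range(out)]
```

Repeating this over the 5400 negative images would yield as many 12 x 12 negative samples as each cascade layer needs.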
After all models are trained, the first-level detection model has 15 layers, with a false alarm rate of 0.0022 and a classification error rate of 4.8% on the positive training samples. The positive error rate is relatively high and the false alarm rate still exceeds 0.1%, which shows that the features of samples in different poses differ considerably and the model converges slowly during AdaBoost training; this is exactly why separate models must be trained for the different poses. The frontal detection model has 18 layers, with a total false alarm rate of 2.6e-6 and a classification error rate of 4.1% on training samples that passed the first-level detection. The left-profile detection model has 16 layers, with a total false alarm rate of 3.8e-7 and a classification error rate of 0.42% on training samples that passed the first-level detection. To save training time, and considering that the gray-level distributions of left-profile and right-profile faces are fully symmetric, no right-profile model is retrained; instead, the left-profile detection model is mirrored to obtain the right-profile face detection model. Frontal samples are numerous and many contain relatively strong interference, so their classification error rate is higher; profile samples are fewer and contain little interference, so their classification error rate is very low.
When performing face detection, the present invention first shrinks the image at multiple scales. For a 160 × 120 image, for example, 9 scales are considered, with image reduction factors of 1.5, 1.88, 2.34, 2.93, 3.66, 4.56, 5.72, 7.15, and 8.94; the corresponding face frames in the original image range from a minimum of 18 × 18 to a maximum of 107 × 107. The first-level detection model then searches each shrunken image and eliminates most of the search windows; the remaining windows are fed into the face detection models of the three poses, which return the final candidate frames, and a weight is computed for each candidate frame from the detection results. In general, the face detection model of each pose returns some candidate frames; neighboring candidate frames are merged, and the weights of the candidate frames returned for each pose are accumulated. If the frontal-face weight within a merged frame is larger, the detected face should be a frontal face; if the left-profile weight is larger, the detected face can be considered a left-profile face. The rough pose of the face is thus determined. Figure 6 shows several groups of multi-pose face detection results, with the results for different poses marked by frames of different gray levels.
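The two-stage multi-scale search can be sketched roughly as follows. The classifier callables are hypothetical stand-ins for the trained first-level cascade and the per-pose detectors, and the nearest-neighbour shrink and step size are assumptions:

```python
import numpy as np

# The nine reduction factors quoted above. A 12x12 detector window maps back
# to original-image face sizes of 12*1.5 = 18 up to int(12*8.94) = 107.
SCALES = [1.5, 1.88, 2.34, 2.93, 3.66, 4.56, 5.72, 7.15, 8.94]

def shrink(image, s):
    """Nearest-neighbour reduction by factor s (an assumed resampling)."""
    h, w = int(image.shape[0] / s), int(image.shape[1] / s)
    ys = (np.arange(h) * s).astype(int)
    xs = (np.arange(w) * s).astype(int)
    return image[ys][:, xs]

def detect_multipose(image, coarse_model, pose_models, window=12, step=2):
    """Two-stage search: a cheap all-pose first-level model prunes windows,
    then the per-pose detectors score the survivors."""
    candidates = []
    for s in SCALES:
        small = shrink(image, s)
        for y in range(0, small.shape[0] - window + 1, step):
            for x in range(0, small.shape[1] - window + 1, step):
                patch = small[y:y + window, x:x + window]
                if not coarse_model(patch):            # stage 1: prune window
                    continue
                for pose, model in pose_models.items():
                    weight = model(patch)              # stage 2: per-pose weight
                    if weight > 0:
                        # candidate frame mapped back to original coordinates
                        candidates.append((pose, weight,
                                           int(x * s), int(y * s),
                                           int(window * s)))
    return candidates
```

Merging neighbouring candidates and accumulating the per-pose weights, as described above, would follow as a post-processing pass over the returned list.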
Second, the Mean Shift-based face tracking algorithm described in step 206 of the present invention is explained in further detail:
The multi-pose face detection algorithm can detect frontal and partially profiled faces, but it cannot detect faces whose in-plane rotation angle is too large. Moreover, face detection is time-consuming: detecting every face in a 320 × 240 image generally takes tens of milliseconds, so face detection cannot be run on every frame of a video sequence arriving in real time. Instead, the detected faces are tracked and verified, which greatly improves the efficiency of the algorithm while guaranteeing that it does not drift onto other, non-face targets.
On the basis of multi-pose face detection, the face tracking algorithm of the present invention first tracks the detected face using the object tracking algorithm based on Mean Shift and histogram features adopted in Document 1 and by Comaniciu et al. in Document 3 (D. Comaniciu, V. Ramesh, and P. Meer. Kernel-Based Object Tracking. IEEE Trans. Pattern Analysis and Machine Intelligence, May 2003, 25(5): 564-577, hereinafter Document 3). Starting from the position and size of the face in the previous frame, it searches the current frame for the face using two groups of short-term and long-term local histogram features and obtains the coordinates of the center point of the face region. The advantage of this algorithm is its very high efficiency; it is unaffected by face rotation and pose changes, and it can still obtain a rough position of the face center when the face translates quickly in the video. Its defects are equally obvious: the tracking accuracy is not high, and although it finds the face position quickly, the center coordinates it produces are not accurate enough; even when the face is motionless, noise and other influences make the center point jitter incessantly. In addition, the algorithm uses skin color as the tracking feature, which means it may also track onto other skin-colored regions such as the hands or neck.
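A minimal illustration of the Mean Shift idea used here: weight each pixel by how over-represented its colour is in the target histogram and shift the window to the weighted centroid. This sketch assumes palette-indexed pixels and omits the kernel profile and the short/long-term histogram pairing of the actual algorithm:

```python
import numpy as np

def color_hist(patch, bins=8):
    """Normalised colour histogram of a patch of palette-indexed pixels."""
    h = np.bincount(patch.ravel(), minlength=bins).astype(float)
    return h / h.sum()

def mean_shift_track(frame, target_hist, cx, cy, half=8, iters=10, bins=8):
    """Weight each window pixel by sqrt(q_u / p_u) and move the window to
    the weighted centroid, repeating until the centre stops moving."""
    for _ in range(iters):
        cy = int(np.clip(cy, half, frame.shape[0] - half))
        cx = int(np.clip(cx, half, frame.shape[1] - half))
        y0, y1, x0, x1 = cy - half, cy + half, cx - half, cx + half
        win = frame[y0:y1, x0:x1]
        p = color_hist(win, bins)
        ratio = np.divide(target_hist, p, out=np.zeros_like(p), where=p > 0)
        w = np.sqrt(ratio)[win]                  # per-pixel weight
        if w.sum() == 0:
            break
        ys, xs = np.mgrid[y0:y1, x0:x1]
        ncx = int(np.round((w * xs).sum() / w.sum()))
        ncy = int(np.round((w * ys).sum() / w.sum()))
        if (ncx, ncy) == (cx, cy):
            break
        cx, cy = ncx, ncy
    return cx, cy
```

Because the weights depend only on colour, the window drifts toward any region matching the model, which is exactly the skin-colour ambiguity (hands, neck) noted above.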
Given the strengths and weaknesses of this tracking algorithm, the present invention builds on the Mean Shift tracking result by adding accurate estimation of face translation, continuous verification of the face image, and estimation of face scale and pose. This guarantees that the algorithm always tracks the face region, makes the tracked region more precise, and yields the accurate scale, rotation angle, and other parameters of the face.
Third, the translation estimation described in step 207 of the present invention is explained in detail:
The Mean Shift-based face tracking algorithm quickly yields a rough position of the face center in the current frame. The purpose of the translation estimation is to start from this rough position and, combining the gray-level distribution of the face with the Lucas-Kanade inverse algorithm (I. Matthews and S. Baker. Active Appearance Models Revisited. International Journal of Computer Vision, Vol. 60, No. 2, November 2004, pp. 135-164, hereinafter Document 4), accurately estimate the translation vector of the face between consecutive frames, determining the exact position of the face center.
The Lucas-Kanade algorithm can quickly compute the translation velocity of a point in a sequence of consecutive images. Given a point A with coordinate x_A, let I(x_A, t_k) be its brightness in frame k, and let the translation velocity of A between two adjacent frames be u = (u, v). Then:
I(x - u\delta t, t_k) = I(x, t_{k-1}), \quad \delta t = t_k - t_{k-1} \qquad (2)
In many cases an initial value of the velocity of A is known, denoted u_0; in a consecutive image sequence the translation velocity of this point in the previous frame can serve as the initial value, so u = u_0 + \Delta u, with \Delta u generally small. Considering the points in a neighborhood of A, whose translation velocities can be taken as very close to u, the sum of squared pixel differences over all points of the neighborhood N between the two adjacent frames is:
E = \sum_{x \in N} \left[ I(x - u_0\delta t - \Delta u\,\delta t, t_k) - I(x, t_{k-1}) \right]^2 \qquad (3)
The u that minimizes this expression serves as the estimate of the translation velocity of A. If \Delta u is very small, the expression can be expanded as a first-order Taylor series, dropping the terms above first order:
E = \sum_{x \in N} \left[ \left( \frac{\partial I(x - u_0\delta t, t_k)}{\partial x} \right)^T \Delta u + \frac{I(x, t_{k-1}) - I(x - u_0\delta t, t_k)}{\delta t} \right]^2 \qquad (4)
Differentiating the expansion with respect to \Delta u, setting the derivative to zero, and solving the equation yields:
\Delta u = H^{-1} \sum_{x \in N} \left[ \left( \frac{\partial I(x - u_0\delta t, t_k)}{\partial x} \right)^T \frac{I(x - u_0\delta t, t_k) - I(x, t_{k-1})}{\delta t} \right] \qquad (5)
Wherein, H is the Hessian matrix:
H = \sum_{x \in N} \left[ \left( \frac{\partial I(x - u_0\delta t, t_k)}{\partial x} \right)^T \left( \frac{\partial I(x - u_0\delta t, t_k)}{\partial x} \right) \right] \qquad (6)
The velocity estimation formula above can only handle very small \Delta u, since it uses an approximate first-order Taylor expansion. To let the algorithm estimate larger translation velocities, the process must be iterated several times: the translation velocity estimated in the previous iteration serves as the initial value of the next, each iteration estimates a new incremental velocity, and the increments are accumulated onto the running total, that is:
u n=u n-1+Δu n (7)
where u_n is the total velocity after the n-th iteration and \Delta u_n is the incremental velocity found by the n-th iteration. In addition, the computation is carried out at multiple resolutions: the translation velocity is first estimated at low resolution, and this estimate then serves as the initial value for the estimation at higher resolution, which computes a more accurate velocity.
According to formula (7), the initial value of each iteration is the value computed previously, so the gradient \partial I(x - u_{n-1}\delta t, t_k)/\partial x, the H matrix, and its inverse would all have to be recomputed at every iteration, which is very time-consuming. The present invention therefore adopts the Lucas-Kanade inverse algorithm to improve the efficiency of the computation.
Taking the n-th iteration as an example:
I(x - u_n\delta t, t_k) = I(x, t_{k-1}) = I(x - u_{n-1}\delta t - \Delta u_n\delta t, t_k) \qquad (8)
Moving \Delta u_n to the other side of the equation gives:
I(x - u_{n-1}\delta t, t_k) = I(x + \Delta u_n\delta t, t_{k-1}) \qquad (9)
From this, \Delta u_n can be computed as:
\Delta u_n = H^{-1} \sum_{x \in N} \left[ \left( \frac{\partial I(x, t_{k-1})}{\partial x} \right)^T \frac{I(x - u_{n-1}\delta t, t_k) - I(x, t_{k-1})}{\delta t} \right] \qquad (10)
Wherein, H is the Hessian matrix:
H = \sum_{x \in N} \left[ \left( \frac{\partial I(x, t_{k-1})}{\partial x} \right)^T \left( \frac{\partial I(x, t_{k-1})}{\partial x} \right) \right] \qquad (11)
In this formulation the H matrix is fixed throughout the iterative process, so its inverse can be computed once before the iteration begins and need not be computed again. During iteration only (I(x - u_{n-1}\delta t, t_k) - I(x, t_{k-1}))/\delta t and \Delta u_n need to be computed repeatedly, which greatly reduces the amount of computation.
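Equations (7)-(11) can be sketched for a pure translation as follows; the nearest-neighbour sampling and the convergence threshold are assumptions, and \delta t is taken as 1:

```python
import numpy as np

def inverse_lk_translation(prev, curr, u0=(0.0, 0.0), iters=20):
    """Inverse-compositional Lucas-Kanade for pure translation, after
    eqs. (10)-(11): gradients and the 2x2 Hessian come from the previous
    frame only, so H^-1 is computed once before iterating (delta-t = 1)."""
    gy, gx = np.gradient(prev.astype(float))           # template gradients
    g = np.stack([gx.ravel(), gy.ravel()], axis=1)     # |N| x 2
    Hinv = np.linalg.inv(g.T @ g)                      # eq. (11), inverted once
    u = np.array(u0, dtype=float)
    h, w = prev.shape
    ys, xs = np.mgrid[0:h, 0:w]
    for _ in range(iters):
        # sample the current frame shifted back by the running estimate u
        sx = np.clip(np.round(xs - u[0]).astype(int), 0, w - 1)
        sy = np.clip(np.round(ys - u[1]).astype(int), 0, h - 1)
        diff = (curr[sy, sx] - prev).astype(float).ravel()
        du = Hinv @ (g.T @ diff)                       # eq. (10)
        u = u + du                                     # eq. (7)
        if np.abs(du).max() < 1e-3:
            break
    return u
```

Only the warped image difference and the small matrix-vector product are recomputed per iteration, which is exactly the saving the inverse formulation provides.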
Face size changes drastically in a video sequence. To guarantee that the estimation algorithm can still compute the translation velocity quickly when the face is very large, faces of different scales are first normalized: every face is zoomed to the same size. The current frame is scaled according to the face size tracked in the previous frame so that the face region is approximately 16 × 16, and the velocity estimated by the Mean Shift algorithm serves as the initial value of the inverse algorithm when computing the translation velocity between the two shrunken frames. The image is processed at multiple resolutions: it is first shrunk by another factor of two, so that the face is approximately 8 × 8, the neighborhood N of the face center point is exactly this 8 × 8 neighborhood, and the inverse algorithm above estimates the translation velocity; the estimated velocity is then doubled, and the translation velocity is re-estimated on the 16 × 16 face region. Finally the total velocity is scaled back to the translation velocity of the face center point in the original video.
When implementing the translation estimation, not only the gray-level information but also the skin-color information of the face is taken into account: the three RGB components of the input image are converted into YUV space, and the three components are each fed into the velocity estimation formula. Furthermore, to reduce the influence of lighting changes on the face, all brightness values are divided by a fairly large number, reducing the weight of the luminance Y and emphasizing the role of the two chrominance components U and V. In practice this treatment clearly improves the accuracy of the velocity estimation when the face moves quickly.
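A sketch of this colour-space handling, assuming BT.601 conversion weights and an illustrative luminance divisor (the patent specifies neither):

```python
import numpy as np

def rgb_to_weighted_yuv(rgb, y_scale=0.25):
    """Convert an H x W x 3 RGB array to YUV (BT.601 weights) and divide the
    brightness channel by a larger number so the chroma channels dominate."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return np.stack([y * y_scale, u, v], axis=-1)
```

Each of the three resulting channels would then be fed into the velocity estimation formula, with the down-weighted Y contributing less to the residual than U and V.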
Fourth, the face verification described in steps 205 and 208 of the present invention is explained in detail:
In Document 1 mentioned earlier, the face detection algorithm can only detect frontal upright faces, and the tracking algorithm can only obtain the face region without knowing the rotation angle or pose of the face. Consequently, during face verification, tracking is declared to have left a face only after several hundred consecutive frames have been tracked without a frontal face ever being detected in the traced region, at which point tracking stops. The drawback is that when a non-face target such as the neck or a hand is tracked, the system needs tens of seconds to react, which greatly degrades its performance.
The face verification module of the present invention resolves this defect of the original system. Because the new face detection can detect upright frontal and profile faces, and the subsequent AAM-based affine coefficient estimation can obtain the rotation angle of the face, the tracked face can be verified continuously: every frame, the traced region is judged to be a face or not. If it is not a face, a non-face tracking result is output; moreover, if verification fails for several consecutive frames, tracking stops. The system can thus react within one second and stop tracking a target when it has drifted onto a non-face region.
Referring to Figure 7, the detailed flowchart of the face verification module, the process is:
Step 701: combine the scale and rotation angle of the face in the previous frame and the previously computed translation parameters with the input image of the current frame.
Step 702: roughly determine the position, size, and rotation angle of the face in the current frame.
Step 703: crop and normalize the face region to obtain a 12 × 12 image. Using the above parameters, the current frame is affine-transformed, then cropped and size-normalized, producing a 12 × 12 image.
Step 704: feed this image into the multi-pose face detection models and judge whether it is a real face; if so, go to step 705, otherwise go to step 706. The image is sent into the multi-pose face detection models, the weight returned by the detection model of each pose is computed, and the pose corresponding to the detector with the largest weight is taken as the pose of the current face. If the weight of every pose detector is zero, the input image is judged not to be a face, and the neighborhood of the current face position must then be searched.
Step 705: verification passes; return the face pose.
Step 706: search for the face again over a small range of positions and scales. Using the known face size and rotation angle, the search is confined to the smaller scales; the candidate face frames from all pose detectors are merged, and the pose with the largest weight is taken as the pose of the current frame's face. If any candidate face frame is found, go to step 707; if none is found, go to step 708.
Step 707: merge the candidate faces and return the new position, scale, and pose of the face in the original image.
Step 708: verification fails. The current search region contains no face, or the face pose has changed too much, so the face verification does not pass.
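The flow of steps 701-708 can be condensed as follows; every callable here is a hypothetical stand-in for the corresponding module of the system:

```python
def verify_face(frame, prev_face, pose_models, affine_crop, search_nearby):
    """Sketch of the two-level verification flow (steps 701-708).
    `affine_crop` stands in for steps 701-703 (warp, crop, normalise to
    12x12); `search_nearby` stands in for the step-706/707 neighbourhood
    search that returns a merged candidate or None."""
    patch = affine_crop(frame, prev_face)

    # 704: score the patch with every pose detector
    weights = {pose: model(patch) for pose, model in pose_models.items()}
    best = max(weights, key=weights.get)
    if weights[best] > 0:
        return True, best, prev_face          # 705: pass, return the pose

    # 706-707: search a small neighbourhood in position and scale
    hit = search_nearby(frame, prev_face)
    if hit is not None:
        return True, hit["pose"], hit         # merged candidate frame
    return False, None, None                  # 708: verification fails
```

The second level is only reached when every pose weight is zero, matching the flow described above.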
Two examples of face verification, illustrated with camera images, are given below.
Referring to Figure 8, a schematic of a result that passes the first-level verification: 801 shows the previous frame and its tracking result, 802 shows the current frame, and 803 is the 12 × 12 image after cropping. Although this image is not a fully frontal face, it passed the frontal face detector and its pose was recognized as frontal, because the detection algorithm can detect faces with an in-plane rotation within a certain angular range.
Referring to Figure 9, a schematic of a result that passes the second-level verification: 901 shows the previous frame and its tracking result, 902 shows the current frame, 903 shows the normalized face, and 904 shows the result of the second-level verification. This is an example in which the first-level verification fails but the second-level verification passes. Here the translation velocity estimate has a deviation, so the normalized image is shifted to the left compared with the real face, and the first-level verification fails. In the second-level verification the input image is likewise affine-transformed and cropped, but the cropped region is larger than that of the first-level verification; the face is searched for in this region and the candidate results are merged, yielding the detected face frame shown at 904.
Fifth, the AAM-based face affine coefficient estimation described in step 210 of the present invention is explained in further detail.
The face frame output by the face verification algorithm above can enclose every facial organ, but the scale and rotation angle still carry over the previous frame's result, so a face whose rotation angle is too large cannot pass the verification, and the algorithm cannot handle in-plane spinning of the face. To guarantee that the algorithm of the present invention can track faces rotated by arbitrary angles, an affine transformation coefficient estimation algorithm based on a simplified AAM is proposed, obtaining the rotation, translation, zoom factor, and other parameters of the face in the current frame.
An AAM is a parametric model of a target's shape features and color distribution features based on principal component analysis (PCA); its purpose is to obtain the shape, affine transformation coefficients, and other parameters of a target region from a well-pretrained model. AAMs are used very widely in face modeling and face alignment; Document 4, for example, uses an AAM algorithm to obtain the contour information of each facial organ.
The purpose of the AAM-based face affine coefficient estimation in the present invention is to obtain the size and rotation angle of the tracked face, that is, to compute four affine transformation coefficients a = {a_i}, i = 0, 1, 2, 3, covering only three transformations: translation, scaling, and rotation, namely:
\begin{pmatrix} x_{new} \\ y_{new} \end{pmatrix} = \begin{pmatrix} s\cos\theta & -s\sin\theta \\ s\sin\theta & s\cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} a_0 & -a_1 & a_2 \\ a_1 & a_0 & a_3 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \qquad (12)
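A quick numerical check of the two equivalent parameterizations of eq. (12), with a_0 = s cos θ, a_1 = s sin θ, a_2 = t_x, a_3 = t_y; the helper names are illustrative:

```python
import numpy as np

def affine_from_pose(s, theta, tx, ty):
    """Pack scale/rotation/translation into the four coefficients of eq. (12)."""
    return np.array([s * np.cos(theta), s * np.sin(theta), tx, ty])

def apply_affine(a, pts):
    """Map N x 2 points through [[a0, -a1, a2], [a1, a0, a3]]."""
    M = np.array([[a[0], -a[1], a[2]],
                  [a[1],  a[0], a[3]]])
    ones = np.ones((pts.shape[0], 1))
    return np.hstack([pts, ones]) @ M.T
```

For example, scaling by 2, rotating 90°, and translating by (3, 4) maps the point (1, 0) to (3, 6), as both sides of eq. (12) require.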
As this formula shows, the present invention does not need the contour information of each facial organ. The AAM of Document 4 can therefore be simplified: only a gray-level PCA model needs to be trained on the face gray-level features, and an AAM search containing only this gray-level model is applied to the input face to compute its affine transformation coefficients.
In addition, the pixel distributions of faces in different poses differ, so a separate AAM is trained for each of the three poses. The face samples from face detection are first cropped, scale-normalized, and gray-level-normalized, yielding several thousand 16 × 16 face images, cropped in the same way as in face detection; of these, about two thousand are frontal faces, about 1,000 are left profiles, and about 1,000 are right profiles. The training and alignment process of the AAM is described below, taking the frontal face as an example.
Let a face image be A(x), where x denotes a point in the 16 × 16 image. Applying the PCA transform to all training samples yields the mean face A_0, the m largest eigenvalues, and the m corresponding eigenvectors A_i, i = 1, 2, …, m. Any frontal face image can then be approximated as a linear combination of A_0 and the A_i, i = 1, 2, …, m:
A(x) = A_0(x) + \sum_{i=1}^{m} \lambda_i A_i(x) \qquad (13)
where \lambda_i are the linear weighting coefficients of A(x).
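A minimal sketch of the gray-level PCA model of eq. (13), computing the mean face and leading eigenvectors via SVD of the mean-centered samples; the helper names are illustrative:

```python
import numpy as np

def train_pca_model(samples, m):
    """Eq. (13): mean face A0 plus the m eigenvectors with the largest
    eigenvalues, obtained from the centered samples via SVD."""
    X = samples.reshape(len(samples), -1).astype(float)
    A0 = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - A0, full_matrices=False)
    return A0, Vt[:m]                # rows are A_1..A_m, orthonormal

def project(A0, A, face):
    """Coefficients lambda_i of eq. (13) for one face, plus reconstruction."""
    lam = A @ (face.ravel() - A0)    # orthonormal basis: dot products suffice
    recon = A0 + lam @ A
    return lam, recon
```

When the samples genuinely lie in an m-dimensional subspace around the mean, the reconstruction is exact; real face data is only approximated, as eq. (13) states.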
Let the face image input to the AAM alignment algorithm be I(x), obtained from the face center position and face size returned by the face verification algorithm and the face rotation angle of the previous frame. Suitable \lambda_i and affine transformation coefficients a = {a_i}, i = 0, 1, 2, 3 must be computed so that I(x) matches the trained AAM, minimizing:
E = \sum_x \left[ A_0(x) + \sum_{i=1}^{m} \lambda_i A_i(x) - I(x, a) \right]^2 \qquad (14)
where I(x, a) is the image obtained by applying the affine transformation a to I(x). As before, iterative processing with the Lucas-Kanade inverse algorithm is used to obtain a; each iteration solves for \Delta a in:
E = \sum_x \left[ A_0(x) + \sum_{i=1}^{m} \lambda_i A_i(x) - I(x, a + \Delta a) \right]^2 \qquad (15)
As described in Document 4, a space-projection technique is adopted to first eliminate the \lambda_i from this expression, simplifying the computation of the minimization iterations. Denote the space spanned by the vectors A_i as sub(A_i) and its orthogonal complement as sub(A_i)^\perp; the expression can then be written as:
\left\| A_0(x) + \sum_{i=1}^{m} \lambda_i A_i(x) - I(x, a + \Delta a) \right\|^2_{sub(A_i)^\perp} + \left\| A_0(x) + \sum_{i=1}^{m} \lambda_i A_i(x) - I(x, a + \Delta a) \right\|^2_{sub(A_i)} \qquad (16)
The first term is computed on sub(A_i)^\perp, so all the terms containing A_i can be dropped there, since their projections onto that space are zero, that is:
\left\| A_0(x) - I(x, a + \Delta a) \right\|^2_{sub(A_i)^\perp} + \left\| A_0(x) + \sum_{i=1}^{m} \lambda_i A_i(x) - I(x, a + \Delta a) \right\|^2_{sub(A_i)} \qquad (17)
The first term is independent of \lambda_i, so suitable affine coefficients can first be obtained by minimizing the first term, and \lambda_i are then computed by minimizing the second:
\lambda_i = \sum_x A_i(x) \cdot \left[ I(x, a + \Delta a) - A_0(x) \right] \qquad (18)
The minimization of the first term can be realized by the Lucas-Kanade inverse algorithm:
\left\| A_0(x) - I(x, a + \Delta a) \right\|^2_{sub(A_i)^\perp} = \left\| A_0(x, -\Delta a) - I(x, a) \right\|^2_{sub(A_i)^\perp} = \left\| A_0(x) - \frac{\partial A_0}{\partial a} \Delta a - I(x, a) \right\|^2_{sub(A_i)^\perp} \qquad (19)
where \frac{\partial A_0}{\partial a} = \nabla A_0 \frac{\partial x}{\partial a}; then:
\Delta a = H^{-1} \sum_x \left[ \nabla A_0 \frac{\partial x}{\partial a} \right]^T_{sub(A_i)^\perp} \left[ A_0(x) - I(x, a) \right] \qquad (20)
Wherein the Hessian matrix H is:
H = \sum_x \left[ \nabla A_0 \frac{\partial x}{\partial a} \right]^T_{sub(A_i)^\perp} \left[ \nabla A_0 \frac{\partial x}{\partial a} \right]_{sub(A_i)^\perp} \qquad (21)
where \left[ \nabla A_0 \frac{\partial x}{\partial a_j} \right]_{sub(A_i)^\perp} is:

\left[ \nabla A_0 \frac{\partial x}{\partial a_j} \right]_{sub(A_i)^\perp} = \nabla A_0 \frac{\partial x}{\partial a_j} - \sum_{i=1}^{m} \left[ \sum_x \left( A_i(x) \cdot \nabla A_0 \frac{\partial x}{\partial a_j} \right) \right] A_i(x) \qquad (22)
The partial derivatives of x with respect to the affine transformation coefficients a are:
\frac{\partial x}{\partial a_0} = x,\ \frac{\partial y}{\partial a_0} = y,\ \frac{\partial x}{\partial a_1} = -y,\ \frac{\partial y}{\partial a_1} = x,\ \frac{\partial x}{\partial a_2} = 1,\ \frac{\partial y}{\partial a_2} = 0,\ \frac{\partial x}{\partial a_3} = 0,\ \frac{\partial y}{\partial a_3} = 1 \qquad (23)
In these formulas, \nabla A_0 \frac{\partial x}{\partial a} is fully determined by the trained mean image of the AAM and the coordinates of each point in the 16 × 16 image, so it can be computed in advance, and hence the inverse of H can also be computed in advance. Only I(x, a) and a need to be continually updated during the iterative process, which greatly improves the efficiency of the algorithm.
The steps of the whole AAM alignment algorithm are:
Precomputation:
(1) compute the gradient \nabla A_0 of the trained mean image;
(2) compute \frac{\partial x}{\partial a} for each point in the 16 × 16 image;
(3) compute the matrix \left[ \nabla A_0 \frac{\partial x}{\partial a} \right]_{sub(A_i)^\perp};
(4) compute the Hessian matrix and its inverse;
Iterative processing:
(5) compute I(x, a) from the a of the previous frame;
(6) compute the image difference A_0(x) - I(x, a) and \Delta a;
(7) compute the new affine transformation coefficients a + \Delta a;
Subsequent computation:
(8) compute the linear coefficients \lambda_i from the a obtained when the iteration finishes.
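The precompute/iterate structure of steps (1)-(8), with the projection of eq. (22) and the Hessian of eq. (21), can be sketched on a 16 × 16 model as follows. The nearest-neighbour warp, the additive update a + \Delta a exactly as the patent writes it, and the convergence threshold are simplifying assumptions:

```python
import numpy as np

SIZE = 16
Y, X = np.mgrid[0:SIZE, 0:SIZE].astype(float)

def warp(img, a):
    """I(x, a): sample img at the eq. (12) transform of each model point
    (nearest-neighbour, an assumed interpolation)."""
    xn = a[0] * X - a[1] * Y + a[2]
    yn = a[1] * X + a[0] * Y + a[3]
    xi = np.clip(np.round(xn), 0, SIZE - 1).astype(int)
    yi = np.clip(np.round(yn), 0, SIZE - 1).astype(int)
    return img[yi, xi]

def precompute(A0, A):
    """Steps (1)-(4): gradient of the mean image, Jacobians of eq. (23),
    projected steepest-descent images of eq. (22), Hessian of eq. (21)."""
    gy, gx = np.gradient(A0)
    jac = [(X, Y), (-Y, X),
           (np.ones_like(X), np.zeros_like(X)),
           (np.zeros_like(X), np.ones_like(X))]
    sd = np.stack([(gx * jx + gy * jy).ravel() for jx, jy in jac])  # 4 x 256
    sd = sd - (sd @ A.T) @ A          # eq. (22): project out the basis
    return sd, np.linalg.inv(sd @ sd.T)

def fit(img, A0, A, a, iters=10):
    """Steps (5)-(8): iterate da by eq. (20), then lambda by eq. (18).
    A holds the orthonormal appearance vectors A_i as rows (m x 256)."""
    sd, Hinv = precompute(A0, A)
    for _ in range(iters):
        err = (A0 - warp(img, a)).ravel()
        da = Hinv @ (sd @ err)        # eq. (20)
        a = a + da
        if np.abs(da).max() < 1e-6:
            break
    lam = A @ (warp(img, a).ravel() - A0.ravel())      # eq. (18)
    return a, lam
```

Because the steepest-descent images are projected onto sub(A_i)^\perp, a purely appearance-driven difference between the input and the mean face leaves the affine coefficients untouched and is absorbed entirely by the \lambda_i.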
An AAM is likewise trained for each of the left-profile and right-profile faces. Referring to Figure 10, an example of the affine coefficients computed by the AAM algorithm: the black frames are the face frames determined by the face detection algorithm; 1001 shows a frontal face input, 1002 a left-profile face, and 1003 a right-profile face; the white frames are the frames obtained after the affine transformation. For convenient checking, the positions of the two eyes are also computed by inverting formula (1) with the white frame and marked with +. Referring to Figure 11, a schematic of AAM-based face tracking results: three images from one sequence are shown in which the face rotation angle is very large, yet the tracking algorithm still tracks these rotated faces and accurately reflects their angles.
Sixth, the face key point tracking described in step 211 of the present invention is explained in further detail.
The translation velocity estimation, face verification, and affine coefficient computation above are all carried out at reduced face resolution. This improves the efficiency of the algorithm but also reduces the precision of the resulting face parameters, since the resolution of the original image is much higher; the output face position, scale, angle, and other data therefore still deviate slightly from the true values. In practice, even when the face is motionless throughout a sequence, the face position, size, and angle obtained by the preceding modules jitter visibly. To solve this problem, a face key point tracking module is added at the end of the system. It uses the same translation estimation method based on the Lucas-Kanade inverse algorithm as the Mean Shift face tracking described above, exploiting the color information of the pixels in the neighborhood of each key point: the AAM alignment result sets the initial translation velocity, the translation velocity of each key point is then computed between the consecutive input frames, and the final face position and other parameters are determined from these points.
Referring to Figure 12, a schematic of key point selection and tracking results. The key points are chosen as shown at 1201: the frame is the face frame of the previous frame, the five points A, B, C, D, E are the key points, A is the center point, and B, C, D, E are the midpoints of the lines joining A to the four corners of the face frame. 1202 shows the current frame and the face frame determined by the AAM, with the five corresponding key points A', B', C', D', E'. Taking the coordinates of these points as initial values for the translation estimation, and considering the pixels in the 5 × 5 neighborhood of each point, the translation velocity of each key point is computed, yielding the new points A'', B'', C'', D'', E'' shown at 1203. If the face rotates markedly between adjacent frames, determining the face position solely from the key point translation velocities may fail to reflect the fast rotation, because the pixel distribution in the neighborhood of a corresponding key point no longer satisfies the translation relation (2); the translation estimation precision drops, and the estimated velocity of A'' in 1203 is accordingly not accurate enough. A compromise is therefore adopted: the coordinates of A', B', C', D', E' and A'', B'', C'', D'', E'' are combined by weighted summation into new points A''', B''', C''', D''', E''', shown at 1204, and the final position, bounding frame, rotation angle, size, and other parameters of the face are determined from these points. The square frame shown at 1204 is the face frame; the four line segments shown are the final output of the system, the intersection of their extensions is the center of the frame, and if the side length of the frame is len, the two endpoints of each segment lie at distances len/2 and len from the center.
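The key point construction and the weighted compromise can be sketched as follows; the equal 0.5 weighting is an assumption, since the patent does not give the weights used:

```python
import numpy as np

def key_points(cx, cy, half):
    """A is the face-frame centre; B..E are the midpoints between A and
    the four corners of the face frame (half = half the side length)."""
    corners = np.array([[cx - half, cy - half], [cx + half, cy - half],
                        [cx + half, cy + half], [cx - half, cy + half]],
                       dtype=float)
    a = np.array([cx, cy], dtype=float)
    return np.vstack([a, (a + corners) / 2.0])        # rows: A, B, C, D, E

def fuse(aam_pts, tracked_pts, w=0.5):
    """Compromise described above: weighted sum of the AAM-predicted points
    A'..E' and the velocity-tracked points A''..E'' (w = 0.5 assumed)."""
    return w * aam_pts + (1.0 - w) * tracked_pts
```

The fused points A'''..E''' then determine the final face centre, frame, rotation angle, and size.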
As described above, the whole multi-pose face detection and tracking system has been demonstrated in many scenes and settings, and has been combined with face recognition and three-dimensional face synthesis programs to realize several demonstration applications. The test results show from many aspects that the face detection method proposed by the present invention can detect faces with out-of-plane (depth) rotations from -50° to 50° and in-plane rotations from -20° to 20°, faces raised by 0° to 30° or lowered by 0° to 30°, faces of different skin colors, faces under different lighting conditions, faces wearing glasses, and so on. It can track frontal and half-profile faces and faces rotated in-plane by any angle; the tracking is highly stable and is not disturbed by non-face regions with skin-like color such as the neck or hands; and it obtains the rotation angle of the face and outputs its accurate size.
The efficiency of the algorithm of the present invention is very high. According to the test results, when tracking a face in 320 × 240 images on a P4 2.8 GHz computer, the processing time per frame is 8 ms to 15 ms; the CPU occupancy does not exceed 12% when processing 320 × 240 video at 10 fps, and does not exceed 18% when processing 640 × 480 video at 10 fps. Figure 13 shows an example of a group of face detection and tracking results: the first image is a face detection result, and the last image is an example of failed verification, indicated by four black line segments.
Aimed at the limitations of the original algorithm, the present invention proposes multiple improvements that resolve those defects, achieving more reliable and stable tracking results while retaining very high operational efficiency. The method detects multiple frontal and half-profile upright faces in the captured scene in real time, selects the largest face, and tracks it continuously using the Mean Shift-based tracking algorithm and the Lucas-Kanade inverse algorithm; it computes the affine transformation coefficients between the tracked face and the trained face model using the AAM-based face model, thereby determining the size and rotation angle of the tracked face; and it continually verifies the tracked face with the multi-pose face detection models, judging whether the traced region is a real face, to guarantee the stability of the face tracking.

Claims (12)

1. A multi-pose face detection and tracking method, characterized by comprising:
(1) obtaining frontal and half-profile face detection models respectively through face sample training, and determining an active appearance face model;
(2) performing face detection on an input video image using the frontal and half-profile face detection models, to determine whether a face exists in a frame of the image;
(3) if a face is detected in a frame, tracking and verifying this face in subsequent frames, comprising the steps of:
(31) tracking the face position in the previous frame image to obtain a preliminary position of the face in the current frame;
(32) taking the obtained preliminary position as an initial value, and calculating the translation velocity of the face from the chrominance difference between the current and previous frame images;
(33) estimating the approximate position of the face in the current frame from the translation velocity, and detecting near this position using the frontal face model and the half-profile detection model, so as to verify the face;
(34) if detect people's face near this position, then checking is passed through, and adopts described active outward appearance faceform to calculate the affined transformation coefficient of working as forefathers' face, obtains the characteristic parameter of present frame people face.
2. the method for claim 1 is characterized in that, described step (3) further comprises:
(35) key point of present frame and former frame image people face is mated, further revise described people's face point-to-point speed of calculating according to matching result, and the characteristic parameter of present frame people face.
3. method as claimed in claim 2 is characterized in that, described step (3) further comprises:
(36) characteristic parameter of renewal present frame people face utilizes these parameters to be used for the tracking checking of next frame image.
4. the method for claim 1 is characterized in that, described step (34) further comprises: if do not detect people's face near this position, then checking is not passed through, and follows the trail of checking in next frame.
5. method as claimed in claim 4 is characterized in that, described step (34) further comprises: if the checking of people's face is not passed through yet in follow-up several frames, then stop to follow the trail of.
6. method as claimed in claim 5 is characterized in that, further comprises step:
(4) previous follow the trail of the objective stop to follow the trail of after, in successive image, begin to detect from step (2) again, after finding new people's face, proceed to follow the trail of.
7. the method for claim 1, it is characterized in that, the described step that obtains people's face front and half side-view detection model by people's face sample training respectively of step (1), comprise: the people's face sample training multilayer detection model that at first uses all attitudes, again people's face sample of front, left surface, right flank attitude is trained respectively, obtain the detection model of three attitudes.
8. the method for claim 1, it is characterized in that, the described people's face of step (2) detects step, comprise: at first adopt the detection model of all attitudes that image is searched for, eliminate most of search window, then remaining window is input to respectively in the detection model of three attitudes, determines according to testing result that the people is bold and cause attitude.
9. A multi-pose face detection and tracking system, characterized in that it comprises:
a training module, for obtaining a frontal face detection model and a half-profile face detection model respectively through face-sample training, and determining an Active Appearance Model of the face;
a detection module, for performing face detection on an input video image according to the frontal and half-profile face detection models, to determine whether a face is present in a frame of the image;
a tracking module, for tracking and verifying a face in subsequent frames after it has been detected in a frame, comprising:
a unit for tracking the face position in the previous frame image to obtain a preliminary face position in the current frame;
a unit for taking the obtained preliminary position as an initial value and computing the translation velocity of the face from the chrominance difference between the current frame and the previous frame;
a unit for estimating the approximate position of the face in the current frame from the translation velocity, and performing detection near that position with the frontal and half-profile detection models to verify the face;
a unit for computing, after a face is detected near that position, the affine transformation coefficients of the current face using the Active Appearance Model, yielding the feature parameters of the face in the current frame.
10. The system of claim 9, wherein the tracking module further comprises:
a unit for matching key points of the face between the current frame and the previous frame, and further correcting, according to the matching result, the computed face translation velocity and the feature parameters of the face in the current frame.
11. The system of claim 9, wherein the training module trains a multi-layer detection model with face samples of all poses, and trains separately on face samples of the frontal, left-profile, and right-profile poses to obtain detection models for the three poses.
12. The system of claim 11, wherein the detection module searches the image with the all-pose detection model to eliminate most of the search windows, feeds the remaining windows into the three pose-specific detection models respectively, and determines the approximate pose of the face from the detection results.
CNB200610113423XA 2006-09-27 2006-09-27 Multiple attitude human face detection and track system and method Active CN100426317C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB200610113423XA CN100426317C (en) 2006-09-27 2006-09-27 Multiple attitude human face detection and track system and method


Publications (2)

Publication Number Publication Date
CN1924894A CN1924894A (en) 2007-03-07
CN100426317C true CN100426317C (en) 2008-10-15

Family

ID=37817523

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB200610113423XA Active CN100426317C (en) 2006-09-27 2006-09-27 Multiple attitude human face detection and track system and method

Country Status (1)

Country Link
CN (1) CN100426317C (en)

Families Citing this family (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101325691B (en) 2007-06-14 2010-08-18 清华大学 Method and apparatus for tracing a plurality of observation model with fusion of differ durations
CN101216973B (en) * 2007-12-27 2011-08-17 北京中星微电子有限公司 An ATM monitoring method, system, and monitoring device
CN101499128B (en) * 2008-01-30 2011-06-29 中国科学院自动化研究所 Three-dimensional human face action detecting and tracing method based on video stream
JP4577410B2 (en) * 2008-06-18 2010-11-10 ソニー株式会社 Image processing apparatus, image processing method, and program
CN101576953B (en) * 2009-06-10 2014-04-23 北京中星微电子有限公司 Classification method and device of human body posture
CN101739676B (en) * 2009-12-04 2012-02-22 清华大学 Method for manufacturing face effigy with ultra-low resolution
CN101794385B (en) * 2010-03-23 2012-11-21 上海交通大学 Multi-angle multi-target fast human face tracking method used in video sequence
CN101968846B (en) * 2010-07-27 2013-05-15 上海摩比源软件技术有限公司 Face tracking method
CN103544478A (en) * 2013-10-09 2014-01-29 五邑大学 All-dimensional face detection method and system
CN106605258B (en) * 2014-09-25 2021-09-07 英特尔公司 Facilitating efficient free-plane rotational landmark tracking of images on computing devices
CN104318211A (en) * 2014-10-17 2015-01-28 中国传媒大学 Anti-shielding face tracking method
CN105138956B (en) * 2015-07-22 2019-10-15 小米科技有限责任公司 Method for detecting human face and device
CN105405094A (en) * 2015-11-26 2016-03-16 掌赢信息科技(上海)有限公司 Method for processing face in instant video and electronic device
CN106251294B (en) * 2016-08-11 2019-03-26 西安理工大学 A kind of single width faces the virtual multi-pose generation method of facial image
CN106503682B (en) * 2016-10-31 2020-02-04 北京小米移动软件有限公司 Method and device for positioning key points in video data
CN106650624A (en) * 2016-11-15 2017-05-10 东软集团股份有限公司 Face tracking method and device
CN106650682B (en) * 2016-12-29 2020-05-01 Tcl集团股份有限公司 Face tracking method and device
CN106991688A (en) * 2017-03-09 2017-07-28 广东欧珀移动通信有限公司 Human body tracing method, human body tracking device and electronic installation
CN108664850B (en) * 2017-03-30 2021-07-13 展讯通信(上海)有限公司 Human face posture classification method and device
CN107993250A (en) * 2017-09-12 2018-05-04 北京飞搜科技有限公司 A kind of fast multi-target pedestrian tracking and analysis method and its intelligent apparatus
CN108875333B (en) * 2017-09-22 2023-05-16 北京旷视科技有限公司 Terminal unlocking method, terminal and computer readable storage medium
CN109754383A (en) * 2017-11-08 2019-05-14 中移(杭州)信息技术有限公司 A kind of generation method and equipment of special efficacy video
CN108197613B (en) * 2018-02-12 2022-02-08 天地伟业技术有限公司 Face detection optimization method based on deep convolution cascade network
CN108510061B (en) * 2018-03-19 2022-03-29 华南理工大学 Method for synthesizing face by multiple monitoring videos based on condition generation countermeasure network
CN109064489A (en) * 2018-07-17 2018-12-21 北京新唐思创教育科技有限公司 Method, apparatus, equipment and medium for face tracking
CN109325964B (en) * 2018-08-17 2020-08-28 深圳市中电数通智慧安全科技股份有限公司 Face tracking method and device and terminal
CN110909568A (en) * 2018-09-17 2020-03-24 北京京东尚科信息技术有限公司 Image detection method, apparatus, electronic device, and medium for face recognition
CN109670474B (en) * 2018-12-28 2023-07-25 广东工业大学 Human body posture estimation method, device and equipment based on video
CN113228626B (en) * 2018-12-29 2023-04-07 浙江大华技术股份有限公司 Video monitoring system and method
CN112101063A (en) * 2019-06-17 2020-12-18 福建天晴数码有限公司 Skew face detection method and computer-readable storage medium
WO2021248348A1 (en) * 2020-06-10 2021-12-16 Plantronics, Inc. Tracker activation and deactivation in a videoconferencing system
CN112084856A (en) * 2020-08-05 2020-12-15 深圳市优必选科技股份有限公司 Face posture detection method and device, terminal equipment and storage medium
CN112188140A (en) * 2020-09-29 2021-01-05 深圳康佳电子科技有限公司 Face tracking video chat method, system and storage medium
CN112364808A (en) * 2020-11-24 2021-02-12 哈尔滨工业大学 Living body identity authentication method based on FMCW radar and face tracking identification
CN112614168B (en) * 2020-12-21 2023-08-29 浙江大华技术股份有限公司 Target face tracking method and device, electronic equipment and storage medium
CN113705444A (en) * 2021-08-27 2021-11-26 成都玻尔兹曼智贝科技有限公司 Facial development analysis and evaluation method and system
CN114187216A (en) * 2021-11-17 2022-03-15 海南乾唐视联信息技术有限公司 Image processing method and device, terminal equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5850469A (en) * 1996-07-09 1998-12-15 General Electric Company Real time tracking of camera pose
US6741756B1 (en) * 1999-09-30 2004-05-25 Microsoft Corp. System and method for estimating the orientation of an object
CN1794264A (en) * 2005-12-31 2006-06-28 北京中星微电子有限公司 Method and system of real time detecting and continuous tracing human face in video frequency sequence
CN1794265A (en) * 2005-12-31 2006-06-28 北京中星微电子有限公司 Method and device for distinguishing face expression based on video frequency


Also Published As

Publication number Publication date
CN1924894A (en) 2007-03-07

Similar Documents

Publication Publication Date Title
CN100426317C (en) Multiple attitude human face detection and track system and method
Wang et al. Automatic laser profile recognition and fast tracking for structured light measurement using deep learning and template matching
CN102324025B (en) Human face detection and tracking method based on Gaussian skin color model and feature analysis
Boltes et al. Automatic extraction of pedestrian trajectories from video recordings
Liu et al. Detecting and counting people in surveillance applications
CN108717531B (en) Human body posture estimation method based on Faster R-CNN
Nedevschi et al. Stereo-based pedestrian detection for collision-avoidance applications
Nair et al. 3-D face detection, landmark localization, and registration using a point distribution model
CN100440246C (en) Positioning method for human face characteristic point
CN103020986B (en) A kind of motion target tracking method
Krotosky et al. Mutual information based registration of multimodal stereo videos for person tracking
US10885667B2 (en) Normalized metadata generation device, object occlusion detection device and method
CN111191629B (en) Image visibility detection method based on multiple targets
CN102609720B (en) Pedestrian detection method based on position correction model
US20110176000A1 (en) System and Method for Counting People
CN108961308B (en) Residual error depth characteristic target tracking method for drift detection
Albiol Colomer et al. Who is who at different cameras: people re-identification using depth cameras
CN109341703A (en) A kind of complete period uses the vision SLAM algorithm of CNNs feature detection
CN108062525A (en) A kind of deep learning hand detection method based on hand region prediction
CN110427797A (en) A kind of three-dimensional vehicle detection method based on geometrical condition limitation
Zhang et al. Local distance comparison for multiple-shot people re-identification
CN106599785A (en) Method and device for building human body 3D feature identity information database
CN103593639A (en) Lip detection and tracking method and device
CN107103301A (en) Video object space-time maximum stability identification color region matching process and system
Khan et al. Online domain-shift learning and object tracking based on nonlinear dynamic models and particle filters on Riemannian manifolds

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20180408

Address after: No. 607, 6th Floor, Xueyuan Road, Haidian District, Beijing 100191

Patentee after: Beijing Vimicro AI Chip Technology Co Ltd

Address before: 15th Floor, Nanjing Ning Building, No. 35 Xueyuan Road, Haidian District, Beijing 100083

Patentee before: Beijing Vimicro Corporation