CN100426317C - Multiple attitude human face detection and track system and method - Google Patents

Multiple attitude human face detection and track system and method

Info

Publication number: CN100426317C
Application number: CNB200610113423XA
Authority: CN (China)
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Other versions: CN1924894A (Chinese, zh)
Inventors: 黄英, 谢东海, 王浩
Original assignee: Vimicro Corp (application filed by Vimicro Corp)
Current assignee: Beijing Vimicro Ai Chip Technology Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Priority: CNB200610113423XA
Published as CN1924894A (application); application granted and published as CN100426317C


Abstract

This invention discloses a method and system for detecting faces of different poses in a video sequence and continuously tracking them across frames, comprising the following steps: obtaining frontal and half-profile face detection models respectively through sample training, and determining an AAM face model; using these models to detect faces in the input video images and determine whether a face exists in a frame; and, if a face is detected, tracking and verifying the face in subsequent frames.

Description

Multi-pose face detection and tracking system and method
Technical field
The present invention relates to a face detection and tracking system and method, and more particularly to a multi-pose face detection and tracking method.
Background technology
The human face is one of the most convenient interfaces for human-computer interaction in computer vision systems. Face detection determines information such as the position and size of every face in an image or image sequence, while face tracking continuously follows one or more detected faces through a video sequence. Face detection is a prerequisite for technologies such as face recognition, expression recognition, and face synthesis; together with tracking, it has broad application value in fields such as intelligent human-machine interaction, video conferencing, intelligent surveillance, and video retrieval.
The images handled by the present system come from the video sequence of a camera. The applicant previously proposed a method and system for real-time detection and continuous tracking of faces in a video sequence, Chinese patent application No. 200510135668.8, hereinafter referred to as Document 1, which is hereby incorporated by reference in its entirety. That application adopted a face detection method based on an AdaBoost cascade of statistical classifiers to realize real-time detection of upright frontal faces, and combined it with a face tracking method based on Mean Shift and histogram features to realize a real-time face tracking system. Experimental results show that the system can detect faces with in-depth rotation of -20 to 20 degrees and in-plane rotation of -20 to 20 degrees, as well as faces of different skin colors, faces under different illumination conditions, faces wearing glasses, and so on. Because face tracking is realized by a skin-color algorithm, it is unaffected by face pose, so profile and rotated faces can be tracked equally well.
However, the algorithm in the above patent application has certain limitations. First, it only trains a detection model for frontal faces and cannot detect half-profile faces, which means both face detection and verification are limited to frontal faces, greatly restricting the algorithm's range of application. Second, it tracks faces only by skin-color histograms; the skin-color features of a face are easily disturbed by other skin regions such as the neck and hands, or by similarly colored regions such as yellow clothing, so the tracked region sometimes jumps onto a hand, the neck, or yellow clothes. Third, the size and position of the tracked region change rather violently; even when the face stays still, the tracking result visibly jitters. Finally, the algorithm cannot obtain further pose information about the face, such as its rotation angle or approximate current pose.
Summary of the invention
The technical problem to be solved by the present invention is to provide a multi-pose face detection and tracking system and method that can detect and track faces of multiple poses, overcome interference from non-face regions whose color is close to skin color, guarantee the continuous tracking of multi-pose faces and the stability of the detection algorithm, and obtain the rotation angle of the face and output its accurate size.
To solve the above technical problem, the invention provides a multi-pose face detection and tracking method, comprising:
(1) obtaining frontal and half-profile face detection models respectively through face sample training, and determining an Active Appearance Model (AAM) face model;
(2) performing face detection on input video images with the frontal and half-profile face detection models, and determining whether a face exists in a frame;
(3) if a face is detected in a frame, tracking and verifying that face in subsequent frames, comprising the steps of:
(31) tracking the face position in the previous frame to obtain a preliminary position of the face in the current frame;
(32) taking the obtained preliminary position as an initial value, and calculating the translation velocity of the face from the chrominance difference between the current and previous frames;
(33) estimating the approximate position of the face in the current frame from the translation velocity, and detecting near that position with the frontal and half-profile detection models to verify the face;
(34) if a face is detected near that position, the verification passes, and the AAM face model is used to calculate the affine transformation coefficients of the current face and obtain the feature parameters of the face in the current frame.
Wherein, step (3) further comprises:
(35) matching key points of the face between the current and previous frames, and further correcting the calculated translation velocity of the face and the feature parameters of the current face according to the matching result.
Wherein, step (3) further comprises:
(36) updating the feature parameters of the current face, and using these parameters for the tracking verification of the next frame.
Wherein, step (34) further comprises: if no face is detected near that position, the verification fails, and tracking verification continues in the next frame.
Wherein, step (34) further comprises: if face verification still fails in several subsequent frames, tracking is stopped.
Wherein, the method further comprises the step of:
(4) after tracking of the previous target stops, detection starts again from step (2) in subsequent images, and tracking resumes once a new face is found.
Wherein, the step in step (1) of obtaining frontal and half-profile face detection models respectively through face sample training comprises: first training a multilayer detection model with face samples of all poses, then training separately on face samples of frontal, left-profile, and right-profile poses to obtain detection models for the three poses.
Wherein, the face detection step of step (2) comprises: first searching the image with the all-pose detection model to eliminate most search windows, then feeding the remaining windows into the three pose-specific detection models respectively, and determining the approximate pose of the face from the detection results.
To solve the above technical problem, the present invention further provides a multi-pose face detection and tracking system, comprising:
a training module, for obtaining frontal and half-profile face detection models respectively through face sample training, and determining an AAM face model;
a detection module, for performing face detection on input video images according to the frontal and half-profile face detection models, and determining whether a face exists in a frame;
a tracking module, for tracking and verifying a face in subsequent frames after it is detected in a frame, comprising:
a unit for tracking the face position in the previous frame to obtain a preliminary position of the face in the current frame;
a unit for taking the obtained preliminary position as an initial value and calculating the translation velocity of the face from the chrominance difference between the current and previous frames;
a unit for estimating the approximate position of the face in the current frame from the translation velocity, and detecting near that position with the frontal and half-profile detection models to verify the face;
a unit for calculating, after a face is detected near that position, the affine transformation coefficients of the current face with the AAM face model, and obtaining the feature parameters of the face in the current frame.
Wherein, the tracking module further comprises:
a unit for matching key points of the face between the current and previous frames, and further correcting the calculated translation velocity of the face and the feature parameters of the current face according to the matching result.
Wherein, the training module trains a multilayer detection model with face samples of all poses, then trains separately on face samples of frontal, left-profile, and right-profile poses to obtain detection models for the three poses.
Wherein, the detection module searches the image with the all-pose detection model to eliminate most search windows, feeds the remaining windows into the three pose-specific detection models respectively, and determines the approximate pose of the face from the detection results.
The multi-pose face detection and tracking system and method of the present invention can detect and track faces of multiple poses; overcome interference from non-face regions close to skin color, such as the neck, hands, or yellow clothing; guarantee the continuous tracking of multi-pose faces and the stability of the detection algorithm; and obtain the rotation angle of the face and output its accurate size.
Description of drawings
Fig. 1 is a structural schematic diagram of a multi-pose face detection and tracking system according to an embodiment of the invention;
Fig. 2 is a flow chart of a multi-pose face detection and tracking method according to an embodiment of the invention;
Fig. 3 is a schematic diagram of face detection and tracking results in the method according to the embodiment of the invention;
Fig. 4 is a schematic diagram of the seven groups of micro-features selected by the face detection algorithm in the method;
Fig. 5 shows the calibration and collection of face samples in the method;
Fig. 6 is a schematic diagram of four groups of multi-pose face detection results in the method;
Fig. 7 is a flow chart of the face verification module in the method;
Fig. 8 is a schematic diagram of faces that passed the first-level verification in the method;
Fig. 9 is a schematic diagram of faces that passed the second-level verification in the method;
Fig. 10 is an example of affine coefficient calculation results of the AAM algorithm in the method;
Fig. 11 is a schematic diagram of AAM-based face tracking results in the method;
Fig. 12 is a schematic diagram of key-point selection and tracking results in the method;
Fig. 13 is an example of face detection and tracking results in the method.
Embodiment
Referring to Fig. 1, the present invention first provides a multi-pose face detection and tracking system, comprising a training module 100, a detection module 200, and a tracking module (not shown), wherein:
The training module 100 obtains frontal and half-profile face detection models (covering right-profile and left-profile poses) respectively through face sample training, and determines an AAM (Active Appearance Models) face model;
The detection module 200 performs face detection on input video images according to the frontal and half-profile face detection models, and determines whether a face exists in a frame;
The tracking module tracks and verifies a face in subsequent frames after it is detected in a frame, and comprises:
a unit for tracking the face position in the previous frame to obtain a preliminary position of the face in the current frame;
a unit for taking the obtained preliminary position as an initial value and calculating the translation velocity of the face from the color difference between the current and previous frames;
a unit for estimating the approximate position of the face in the current frame from the translation velocity, and detecting near that position with the frontal and half-profile detection models to verify the face;
a unit for calculating, after a face is detected near that position, the affine transformation coefficients of the current face with the AAM face model, and obtaining the feature parameters of the face in the current frame; and
a unit for matching key points of the face between the current and previous frames, and further correcting the calculated translation velocity of the face and the feature parameters of the current face according to the matching result.
In the embodiment shown in Fig. 1, the training module 100 first trains two groups of models: the frontal and half-profile face detection models, and the AAM face model (not shown). The face detection models can be trained as multistage classifiers based on the AdaBoost algorithm, using many frontal and half-profile face samples; the extracted face size is 12 x 12 pixels. To guarantee that the algorithm can recognize the three poses of left profile, frontal, and right profile, the present embodiment trains a left-profile face detection model, a right-profile face detection model, and a frontal face detection model, where the left-profile and right-profile models are collectively called the half-profile face detection models, and the right-profile model is obtained by mirroring the left-profile model. In addition, to accelerate detection, the present embodiment also trains a 15-layer all-pose face detection model on face samples of all poses, called the first-level detection model, which performs a preliminary detection on the input image to roughly locate faces.
In the training module 100, the purpose of training the AAM face model is to calculate, given the approximate position and size of an input face, the affine transformation coefficients that map it to a standard face, thereby obtaining its more accurate position, size, and rotation angle. The algorithm applies Principal Component Analysis (PCA) to a large number of face samples to obtain a mean face and a set of orthogonal vectors; at run time, an iterative procedure compares the gray levels of the input face with the trained model and calculates the affine transformation coefficients of the face.
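The PCA training step above can be sketched as follows; this is an illustrative sketch, not part of the original disclosure, and the function names and toy data are ours:

```python
import numpy as np

def train_pca_face_model(samples, n_components=8):
    """Toy PCA training: 'samples' is (N, H*W) flattened gray faces.

    Returns the mean face and the top orthogonal vectors, matching the
    "mean face plus orthogonal vectors" description above.
    """
    X = np.asarray(samples, dtype=np.float64)
    mean_face = X.mean(axis=0)
    centered = X - mean_face
    # SVD of the centered data yields the principal axes directly.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]  # rows are orthonormal vectors
    return mean_face, basis

def project(face, mean_face, basis):
    """Coordinates of one face in the PCA subspace."""
    return basis @ (np.asarray(face, np.float64) - mean_face)

# Tiny demo with random 12x12 "faces", echoing the 12x12 crops above.
rng = np.random.default_rng(0)
faces = rng.random((50, 144))
mean_face, basis = train_pca_face_model(faces, n_components=4)
coeffs = project(faces[0], mean_face, basis)
```

The patent's run-time AAM fit would additionally iterate, re-projecting the warped input face and updating the affine parameters until convergence, which this sketch omits.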
In the detection module 200, face detection in this embodiment first searches the input image with the all-pose detection model to eliminate most search windows, then feeds the remaining windows into the three pose-specific detection models respectively, returning the final candidate boxes and computing a weight for each candidate box from the detection results. In general, the detection model of each pose returns some candidate boxes; neighboring candidate boxes are merged, and the weights of the candidate boxes returned for each pose are accumulated. If the frontal weight within a merged box is largest, the detected face should be a frontal face; if the left-profile weight is largest, the detected face can be judged to be a left profile. The approximate pose of the face is thereby determined.
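The weight-accumulation rule above can be sketched as follows (illustrative only; the actual box merging and weighting in the patent are more involved):

```python
def decide_pose(candidates):
    """candidates: list of (pose, weight) pairs returned by the three
    pose-specific detectors for boxes merged into one cluster.

    Returns the pose with the largest accumulated weight, mirroring the
    voting rule described above.
    """
    totals = {}
    for pose, weight in candidates:
        totals[pose] = totals.get(pose, 0.0) + weight
    return max(totals, key=totals.get)
```

For example, if the merged cluster holds two frontal boxes and one left-profile box, the accumulated frontal weight wins and the face is judged frontal.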
Referring now to Fig. 2, the flow of the multi-pose face detection and tracking method according to the embodiment of the invention is as follows.
Step 201: a frame is input from the camera; before a tracking target is obtained, every frame is searched to detect the presence of faces.
301 in Fig. 3 shows a face detection result, where the boxes are the detected face boxes.
Step 202: judge whether the previous frame has a tracked face.
Step 203: if the previous frame has no tracked face, multi-pose face detection is performed on the current frame; if one or more faces are found in the current frame, go to step 204, otherwise face detection continues in subsequent images.
Step 204: in the next two frames, the face detected previously is tracked and verified. Only after a face passes verification in two consecutive frames does the algorithm consider it to really exist; if several faces pass verification, the largest face is picked and tracking begins. Face verification here means invoking the detection module 200 again on the tracked face region to judge whether it is a real face.
Step 205: tracking begins after the verification passes.
After a face is confirmed as tracked, it continues to be tracked in subsequent frames; the tracking process comprises the following steps:
Step 206: the face of the previous frame is tracked with the Mean Shift and histogram-based face tracking algorithm to obtain a preliminary position of the current face.
Step 207: the face position obtained in the previous step is inaccurate and easily disturbed by other regions close to skin color, such as the neck and hands, so the chrominance information of the current and previous frames is also used to obtain the translation velocity of the face: taking the tracking result of the previous step as an initial value, the present invention applies the Lucas-Kanade inverse algorithm to obtain a more accurate face translation velocity.
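A minimal single-step Lucas-Kanade translation estimate, simplified from the inverse algorithm mentioned above (this is a forward least-squares step on one gray patch; names and the demo data are ours):

```python
import numpy as np

def lk_translation(prev, curr):
    """Estimate a pure translation (u, v) between two equally sized gray
    patches from one Lucas-Kanade least-squares step: solve
    [Ix Iy] [u v]^T = -(curr - prev) over all pixels.
    """
    prev = np.asarray(prev, np.float64)
    curr = np.asarray(curr, np.float64)
    iy, ix = np.gradient(prev)        # spatial gradients of the template
    it = curr - prev                  # temporal difference
    A = np.stack([ix.ravel(), iy.ravel()], axis=1)
    b = -it.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Demo: a brightness ramp along x, shifted right by half a pixel.
prev = np.tile(np.arange(10.0), (10, 1))
curr = prev - 0.5                     # I(x - 0.5) for a unit-slope ramp
u, v = lk_translation(prev, curr)
```

A production tracker would iterate this step in a coarse-to-fine pyramid; one step suffices here because the demo motion is sub-pixel.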
Step 208: the approximate position of the face is estimated from the calculated translation velocity, and the face detection models are used for face verification, i.e., the neighborhood of this position is searched to judge whether a face exists in this region; the face verification method here is consistent with that described in step 205.
Step 209: judge whether the face passes verification.
If a face exists in the current region, face verification passes, and the method comprises the steps of:
Step 210: the affine transformation coefficients are calculated with the AAM algorithm to obtain the feature parameters of the current face, including its accurate position, rotation angle, and size.
Step 211: key points of the face are matched between the current and previous frames to obtain a more accurate translation velocity, scale change, and rotation coefficient between the two frames, and thereby the accurate feature parameters of the current face. Another purpose of this step is to keep the tracking result stable so that the tracked region does not jitter visibly. 302 in Fig. 3 shows a face tracking result that passed verification.
Step 212: the feature parameters of the current face are updated and used to process the next frame.
If in step 209 no face is found in the tracked region, i.e., face verification fails, the current tracked region either contains no face or the face pose has changed too much; the face continues to be tracked and verified in subsequent frames, comprising the steps of:
Step 213: judge whether several consecutive frames have failed verification.
Step 214: if the verification passes, the feature parameters are updated and tracking continues.
Step 215: if face verification still fails in several subsequent frames, the current tracking target is considered probably not a face, or its pose has changed so much that tracking it is of little value, and tracking stops. 303 in Fig. 3 shows an example of a face tracking result that failed verification.
After the previous tracking target stops being tracked, face detection is performed again in subsequent images until a new face is found, and then tracking resumes.
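The overall detect-verify-track loop of steps 201-215 can be sketched as a small state machine. The two-consecutive-frame confirmation and the give-up-after-several-misses rules follow the text above, while the MAX_MISSES value and the boolean verification stub are our assumptions:

```python
class FaceTracker:
    """Minimal state machine for the flow above: detect until a face
    passes verification in two consecutive frames, then track; drop the
    target after MAX_MISSES consecutive failed verifications and return
    to detection. Detection/verification are stubbed as a boolean.
    """
    MAX_MISSES = 3  # "several frames" in the text; exact count assumed

    def __init__(self):
        self.state = "detecting"
        self.confirm = 0
        self.misses = 0

    def step(self, verified):
        if self.state == "detecting":
            self.confirm = self.confirm + 1 if verified else 0
            if self.confirm >= 2:      # two consecutive verified frames
                self.state = "tracking"
        else:
            if verified:
                self.misses = 0
            else:
                self.misses += 1
                if self.misses >= self.MAX_MISSES:
                    self.state, self.confirm, self.misses = "detecting", 0, 0
        return self.state
```

In a real pipeline, `verified` would come from re-running the detection models on the tracked region, as step 208 describes.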
Below, several key technical points in the processing procedure of the present invention are emphasized.
First, the face detection algorithm described in step 203 of the present invention is described in further detail.
The principle of the face detection algorithm of the present invention is basically consistent with Document 1, adopting a face detection method based on an AdaBoost cascade of statistical classifiers. As described in Document 1, the AdaBoost-based face detection algorithm (P. Viola and M. Jones, Rapid object detection using a boosted cascade of simple features, Proc. on Computer Vision and Pattern Recognition, 2001; hereinafter referred to as Document 2) first trains a "face/non-face" two-class classifier from a large number of face and non-face samples; this classifier can decide whether a rectangular window of a certain scale is a face. If the rectangle is m long and n wide, the face detection flow is: first scale the image repeatedly by a fixed ratio, exhaustively search and classify all m x n pixel windows in the resulting image series, feed each window into the "face/non-face" classifier, keep the windows recognized as faces as candidates, then merge candidates at adjacent positions with a post-processing algorithm, and output the position, size, and other information of all detected faces.
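The multi-scale exhaustive search just described can be sketched as follows (illustrative only; the search step size is our assumption, while the 12-pixel window and the nine scales spaced by a factor of about 1.25 echo the embodiment below):

```python
def sliding_windows(width, height, win=12, scale0=1.5, ratio=1.25,
                    nscales=9, step=2):
    """Enumerate (x, y, size) candidate boxes in original-image
    coordinates for a win x win classifier run over a shrinking image
    pyramid, as in the cascade detection flow described above.
    """
    boxes = []
    for i in range(nscales):
        s = scale0 * ratio ** i
        w, h = int(width / s), int(height / s)   # shrunken image size
        for y in range(0, h - win + 1, step):
            for x in range(0, w - win + 1, step):
                # Map the window back to original-image coordinates.
                boxes.append((int(x * s), int(y * s), int(round(win * s))))
    return boxes
```

Each box would then be fed through the cascade; the cheap first-level model rejects most boxes so the three pose-specific models see only a few survivors.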
Document 1 only considered the detection of frontal faces (see the standard face image shown at 501 in Fig. 5 and the cropped standard face shown at 502), while the present invention also needs to detect half-profile faces, to guarantee the continuous tracking of multi-pose faces and the stability of the detection algorithm. The present invention still extracts face features with the seven groups of micro-features shown in Fig. 4, but images of faces in different poses differ greatly, so the micro-features at the same location differ greatly between poses. This means that if the algorithm of Document 1 were kept unchanged, training all AdaBoost strong classifiers with all samples as positive samples, the training algorithm would hardly converge: even with very many micro-features selected for the weak classifiers at every level, the false alarm rate on negative samples would remain fairly high. Therefore, multi-pose face detection is carried out in two steps: first a 15-layer detection model is trained with face samples of all poses, and then the samples of the three poses are trained separately, one detection model per pose.
The present invention collected about 4500 face images in total, of which about 2500 are frontal face images, about 1000 are left-profile faces, and about 1000 are right-profile faces. Following the standard face and cropping method mentioned in Document 1, the face samples are affine-transformed and cropped (see the face sample with calibration points shown at 503 in Fig. 5 and the cropping result shown at 504), and all face regions are normalized to 12 x 12 pixels. If the distance between the two eyes is r, the midpoint of the line connecting the eyes is (x_center, y_center), and the length and width of the collection rectangle are both set to 2r, i.e., twice the inter-ocular distance, then the coordinates (x_left, y_top, x_right, y_bottom) of the cropping rectangle are:
(x_left, y_top, x_right, y_bottom) = (x_center - r, y_center - 0.5r, x_center + r, y_center + 1.5r)    (1)
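As a quick check of equation (1), a short sketch (function name ours) computing the crop rectangle from the two eye positions:

```python
def face_crop_rect(left_eye, right_eye):
    """Crop rectangle per equation (1): a 2r x 2r box, where r is the
    inter-ocular distance, anchored on the midpoint of the eyes with
    0.5r above and 1.5r below it.
    """
    (x1, y1), (x2, y2) = left_eye, right_eye
    r = ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
    xc, yc = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    return (xc - r, yc - 0.5 * r, xc + r, yc + 1.5 * r)
```

Note the box height y_bottom - y_top = 2r, consistent with the stated 2r side length.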
To strengthen the classifier's robustness to a certain range of face rotation and size change, each sample is also mirrored, rotated by ±20 degrees, and enlarged 1.1 times, extending each sample into five samples, which gives about 22500 positive samples in total. The negative sample images are a large number of images containing no faces, including landscapes, animals, text, and so on, 5400 images in all. The acquisition of negative sample features in the training of each AdaBoost cascade layer is also fully consistent with Document 1: first a negative image is chosen at random, the size and position of the negative sample in the image are determined at random, the corresponding region is cropped from the image, and the crop is normalized to 12 x 12 to obtain one negative sample.
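The negative-sample cropping procedure can be sketched as follows (illustrative; the nearest-neighbour resampling and the list-of-rows image format are our simplifications):

```python
import random

def random_negative_patch(image, out=12, rng=random):
    """Cut a random square region from a non-face image and scale it to
    out x out, as in the negative-sample procedure above. 'image' is a
    list of pixel rows; nearest-neighbour resampling keeps the sketch
    dependency-free.
    """
    h, w = len(image), len(image[0])
    size = rng.randint(out, min(h, w))        # random crop size
    x0 = rng.randint(0, w - size)             # random crop position
    y0 = rng.randint(0, h - size)
    return [[image[y0 + (j * size) // out][x0 + (i * size) // out]
             for i in range(out)] for j in range(out)]
```

Repeating this over the 5400 negative images would yield as many 12 x 12 negative samples as each cascade layer needs.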
After all models are trained, the first-level detection model has 15 layers, with a false alarm rate of 0.0022 and a classification error rate of 4.8% on the positive training samples. The positive error rate is relatively high and the false alarm rate still exceeds 0.1%, which shows that the features of samples in different poses differ considerably and the model converges slowly during AdaBoost training; this is exactly why separate models must be trained for the different poses. The frontal detection model has 18 layers, with a total false alarm rate of 2.6e-6 and a classification error rate of 4.1% on training samples that passed the first-level detection. The left-profile detection model has 16 layers, with a total false alarm rate of 3.8e-7 and a classification error rate of 0.42% on training samples that passed the first-level detection. To save training time, and considering that the gray-level distributions of left-profile and right-profile faces are fully symmetric, no right-profile model is retrained; instead, the left-profile detection model is mirrored to obtain the right-profile face detection model. Frontal samples are numerous and many contain relatively strong interference, so their classification error rate is higher; profile samples are fewer and contain little interference, so their classification error rate is very low.
When performing face detection, the present invention first shrinks the image at multiple scales. For a 160 × 120 image, for example, 9 scales are considered, with image reduction factors of 1.5, 1.88, 2.34, 2.93, 3.66, 4.56, 5.72, 7.15, and 8.94; the corresponding face frames in the original image range from a minimum of 18 × 18 to a maximum of 107 × 107. The first-level detection model then searches each shrunken image and eliminates most of the search windows; the remaining windows are fed into the face detection models of the three poses, which return the final candidate frames, and a weight is computed for each candidate frame from the detection results. In general, the face detection model of each pose returns some candidate frames; neighboring candidate frames are merged, and the weights of the candidate frames returned for each pose are accumulated. If the frontal-face weight within a merged frame is larger, the detected face should be a frontal face; if the left-profile weight is larger, the detected face can be considered a left-profile face. The rough pose of the face is thus determined. Figure 6 shows several groups of multi-pose face detection results, with the results for different poses marked by frames of different gray levels.
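The two-stage multi-scale search can be sketched roughly as follows. The classifier callables are hypothetical stand-ins for the trained first-level cascade and the per-pose detectors, and the nearest-neighbour shrink and step size are assumptions:

```python
import numpy as np

# The nine reduction factors quoted above. A 12x12 detector window maps back
# to original-image face sizes of 12*1.5 = 18 up to int(12*8.94) = 107.
SCALES = [1.5, 1.88, 2.34, 2.93, 3.66, 4.56, 5.72, 7.15, 8.94]

def shrink(image, s):
    """Nearest-neighbour reduction by factor s (an assumed resampling)."""
    h, w = int(image.shape[0] / s), int(image.shape[1] / s)
    ys = (np.arange(h) * s).astype(int)
    xs = (np.arange(w) * s).astype(int)
    return image[ys][:, xs]

def detect_multipose(image, coarse_model, pose_models, window=12, step=2):
    """Two-stage search: a cheap all-pose first-level model prunes windows,
    then the per-pose detectors score the survivors."""
    candidates = []
    for s in SCALES:
        small = shrink(image, s)
        for y in range(0, small.shape[0] - window + 1, step):
            for x in range(0, small.shape[1] - window + 1, step):
                patch = small[y:y + window, x:x + window]
                if not coarse_model(patch):            # stage 1: prune window
                    continue
                for pose, model in pose_models.items():
                    weight = model(patch)              # stage 2: per-pose weight
                    if weight > 0:
                        # candidate frame mapped back to original coordinates
                        candidates.append((pose, weight,
                                           int(x * s), int(y * s),
                                           int(window * s)))
    return candidates
```

Merging neighbouring candidates and accumulating the per-pose weights, as described above, would follow as a post-processing pass over the returned list.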
Second, the Mean Shift-based face tracking algorithm described in step 206 of the present invention is explained in further detail:
The multi-pose face detection algorithm can detect frontal and partially profiled faces, but it cannot detect faces whose in-plane rotation angle is too large. Moreover, face detection is time-consuming: detecting every face in a 320 × 240 image generally takes tens of milliseconds, so face detection cannot be run on every frame of a video sequence arriving in real time. Instead, the detected faces are tracked and verified, which greatly improves the efficiency of the algorithm while guaranteeing that it does not drift onto other, non-face targets.
On the basis of multi-pose face detection, the face tracking algorithm of the present invention first tracks the detected face using the object tracking algorithm based on Mean Shift and histogram features adopted in Document 1 and by Comaniciu et al. in Document 3 (D. Comaniciu, V. Ramesh, and P. Meer. Kernel-Based Object Tracking. IEEE Trans. Pattern Analysis and Machine Intelligence, May 2003, 25(5): 564-577, hereinafter Document 3). Starting from the position and size of the face in the previous frame, it searches the current frame for the face using two groups of short-term and long-term local histogram features and obtains the coordinates of the center point of the face region. The advantage of this algorithm is its very high efficiency; it is unaffected by face rotation and pose changes, and it can still obtain a rough position of the face center when the face translates quickly in the video. Its defects are equally obvious: the tracking accuracy is not high, and although it finds the face position quickly, the center coordinates it produces are not accurate enough; even when the face is motionless, noise and other influences make the center point jitter incessantly. In addition, the algorithm uses skin color as the tracking feature, which means it may also track onto other skin-colored regions such as the hands or neck.
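A minimal illustration of the Mean Shift idea used here: weight each pixel by how over-represented its colour is in the target histogram and shift the window to the weighted centroid. This sketch assumes palette-indexed pixels and omits the kernel profile and the short/long-term histogram pairing of the actual algorithm:

```python
import numpy as np

def color_hist(patch, bins=8):
    """Normalised colour histogram of a patch of palette-indexed pixels."""
    h = np.bincount(patch.ravel(), minlength=bins).astype(float)
    return h / h.sum()

def mean_shift_track(frame, target_hist, cx, cy, half=8, iters=10, bins=8):
    """Weight each window pixel by sqrt(q_u / p_u) and move the window to
    the weighted centroid, repeating until the centre stops moving."""
    for _ in range(iters):
        cy = int(np.clip(cy, half, frame.shape[0] - half))
        cx = int(np.clip(cx, half, frame.shape[1] - half))
        y0, y1, x0, x1 = cy - half, cy + half, cx - half, cx + half
        win = frame[y0:y1, x0:x1]
        p = color_hist(win, bins)
        ratio = np.divide(target_hist, p, out=np.zeros_like(p), where=p > 0)
        w = np.sqrt(ratio)[win]                  # per-pixel weight
        if w.sum() == 0:
            break
        ys, xs = np.mgrid[y0:y1, x0:x1]
        ncx = int(np.round((w * xs).sum() / w.sum()))
        ncy = int(np.round((w * ys).sum() / w.sum()))
        if (ncx, ncy) == (cx, cy):
            break
        cx, cy = ncx, ncy
    return cx, cy
```

Because the weights depend only on colour, the window drifts toward any region matching the model, which is exactly the skin-colour ambiguity (hands, neck) noted above.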
Given the strengths and weaknesses of this tracking algorithm, the present invention builds on the Mean Shift tracking result by adding accurate estimation of face translation, continuous verification of the face image, and estimation of face scale and pose. This guarantees that the algorithm always tracks the face region, makes the tracked region more precise, and yields the accurate scale, rotation angle, and other parameters of the face.
Third, the translation estimation described in step 207 of the present invention is explained in detail:
The Mean Shift-based face tracking algorithm quickly yields a rough position of the face center in the current frame. The purpose of the translation estimation is to start from this rough position and, combining the gray-level distribution of the face with the Lucas-Kanade inverse algorithm (I. Matthews and S. Baker. Active Appearance Models Revisited. International Journal of Computer Vision, Vol. 60, No. 2, November 2004, pp. 135-164, hereinafter Document 4), accurately estimate the translation vector of the face between consecutive frames, determining the exact position of the face center.
The Lucas-Kanade algorithm can quickly compute the translation velocity of a point in a sequence of consecutive images. Given a point A with coordinate x_A, let I(x_A, t_k) be its brightness in frame k, and let the translation velocity of A between two adjacent frames be u = (u, v). Then:
I(x - u\delta t, t_k) = I(x, t_{k-1}), \quad \delta t = t_k - t_{k-1} \qquad (2)
In many cases an initial value of the velocity of A is known, denoted u_0; in a consecutive image sequence the translation velocity of this point in the previous frame can serve as the initial value, so u = u_0 + \Delta u, with \Delta u generally small. Considering the points in a neighborhood of A, whose translation velocities can be taken as very close to u, the sum of squared pixel differences over all points of the neighborhood N between the two adjacent frames is:
E = \sum_{x \in N} \left[ I(x - u_0\delta t - \Delta u\,\delta t, t_k) - I(x, t_{k-1}) \right]^2 \qquad (3)
The u that minimizes this expression serves as the estimate of the translation velocity of A. If \Delta u is very small, the expression can be expanded as a first-order Taylor series, dropping the terms above first order:
E = \sum_{x \in N} \left[ \left( \frac{\partial I(x - u_0\delta t, t_k)}{\partial x} \right)^T \Delta u + \frac{I(x, t_{k-1}) - I(x - u_0\delta t, t_k)}{\delta t} \right]^2 \qquad (4)
Differentiating the expansion with respect to \Delta u, setting the derivative to zero, and solving the equation yields:
\Delta u = H^{-1} \sum_{x \in N} \left[ \left( \frac{\partial I(x - u_0\delta t, t_k)}{\partial x} \right)^T \frac{I(x - u_0\delta t, t_k) - I(x, t_{k-1})}{\delta t} \right] \qquad (5)
Wherein, H is the Hessian matrix:
H = \sum_{x \in N} \left[ \left( \frac{\partial I(x - u_0\delta t, t_k)}{\partial x} \right)^T \left( \frac{\partial I(x - u_0\delta t, t_k)}{\partial x} \right) \right] \qquad (6)
The velocity estimation formula above can only handle very small \Delta u, since it uses an approximate first-order Taylor expansion. To let the algorithm estimate larger translation velocities, the process must be iterated several times: the translation velocity estimated in the previous iteration serves as the initial value of the next, each iteration estimates a new incremental velocity, and the increments are accumulated onto the running total, that is:
u n=u n-1+Δu n (7)
where u_n is the total velocity after the n-th iteration and \Delta u_n is the incremental velocity found by the n-th iteration. In addition, the computation is carried out at multiple resolutions: the translation velocity is first estimated at low resolution, and this estimate then serves as the initial value for the estimation at higher resolution, which computes a more accurate velocity.
According to formula (7), the initial value of each iteration is the value computed previously, so the gradient \partial I(x - u_{n-1}\delta t, t_k)/\partial x, the H matrix, and its inverse would all have to be recomputed at every iteration, which is very time-consuming. The present invention therefore adopts the Lucas-Kanade inverse algorithm to improve the efficiency of the computation.
Taking the n-th iteration as an example:
I(x - u_n\delta t, t_k) = I(x, t_{k-1}) = I(x - u_{n-1}\delta t - \Delta u_n\delta t, t_k) \qquad (8)
Moving \Delta u_n to the other side of the equation gives:
I(x - u_{n-1}\delta t, t_k) = I(x + \Delta u_n\delta t, t_{k-1}) \qquad (9)
From this, \Delta u_n can be computed as:
\Delta u_n = H^{-1} \sum_{x \in N} \left[ \left( \frac{\partial I(x, t_{k-1})}{\partial x} \right)^T \frac{I(x - u_{n-1}\delta t, t_k) - I(x, t_{k-1})}{\delta t} \right] \qquad (10)
Wherein, H is the Hessian matrix:
H = \sum_{x \in N} \left[ \left( \frac{\partial I(x, t_{k-1})}{\partial x} \right)^T \left( \frac{\partial I(x, t_{k-1})}{\partial x} \right) \right] \qquad (11)
In this formulation the H matrix is fixed throughout the iterative process, so its inverse can be computed once before the iteration begins and need not be computed again. During iteration only (I(x - u_{n-1}\delta t, t_k) - I(x, t_{k-1}))/\delta t and \Delta u_n need to be computed repeatedly, which greatly reduces the amount of computation.
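Equations (7)-(11) can be sketched for a pure translation as follows; the nearest-neighbour sampling and the convergence threshold are assumptions, and \delta t is taken as 1:

```python
import numpy as np

def inverse_lk_translation(prev, curr, u0=(0.0, 0.0), iters=20):
    """Inverse-compositional Lucas-Kanade for pure translation, after
    eqs. (10)-(11): gradients and the 2x2 Hessian come from the previous
    frame only, so H^-1 is computed once before iterating (delta-t = 1)."""
    gy, gx = np.gradient(prev.astype(float))           # template gradients
    g = np.stack([gx.ravel(), gy.ravel()], axis=1)     # |N| x 2
    Hinv = np.linalg.inv(g.T @ g)                      # eq. (11), inverted once
    u = np.array(u0, dtype=float)
    h, w = prev.shape
    ys, xs = np.mgrid[0:h, 0:w]
    for _ in range(iters):
        # sample the current frame shifted back by the running estimate u
        sx = np.clip(np.round(xs - u[0]).astype(int), 0, w - 1)
        sy = np.clip(np.round(ys - u[1]).astype(int), 0, h - 1)
        diff = (curr[sy, sx] - prev).astype(float).ravel()
        du = Hinv @ (g.T @ diff)                       # eq. (10)
        u = u + du                                     # eq. (7)
        if np.abs(du).max() < 1e-3:
            break
    return u
```

Only the warped image difference and the small matrix-vector product are recomputed per iteration, which is exactly the saving the inverse formulation provides.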
Face size changes drastically in a video sequence. To guarantee that the estimation algorithm can still compute the translation velocity quickly when the face is very large, faces of different scales are first normalized: every face is zoomed to the same size. The current frame is scaled according to the face size tracked in the previous frame so that the face region is approximately 16 × 16, and the velocity estimated by the Mean Shift algorithm serves as the initial value of the inverse algorithm when computing the translation velocity between the two shrunken frames. The image is processed at multiple resolutions: it is first shrunk by another factor of two, so that the face is approximately 8 × 8, the neighborhood N of the face center point is exactly this 8 × 8 neighborhood, and the inverse algorithm above estimates the translation velocity; the estimated velocity is then doubled, and the translation velocity is re-estimated on the 16 × 16 face region. Finally the total velocity is scaled back to the translation velocity of the face center point in the original video.
When implementing the translation estimation, not only the gray-level information but also the skin-color information of the face is taken into account: the three RGB components of the input image are converted into YUV space, and the three components are each fed into the velocity estimation formula. Furthermore, to reduce the influence of lighting changes on the face, all brightness values are divided by a fairly large number, reducing the weight of the luminance Y and emphasizing the role of the two chrominance components U and V. In practice this treatment clearly improves the accuracy of the velocity estimation when the face moves quickly.
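A sketch of this colour-space handling, assuming BT.601 conversion weights and an illustrative luminance divisor (the patent specifies neither):

```python
import numpy as np

def rgb_to_weighted_yuv(rgb, y_scale=0.25):
    """Convert an H x W x 3 RGB array to YUV (BT.601 weights) and divide the
    brightness channel by a larger number so the chroma channels dominate."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return np.stack([y * y_scale, u, v], axis=-1)
```

Each of the three resulting channels would then be fed into the velocity estimation formula, with the down-weighted Y contributing less to the residual than U and V.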
Fourth, the face verification described in steps 205 and 208 of the present invention is explained in detail:
In Document 1 mentioned earlier, the face detection algorithm can only detect frontal upright faces, and the tracking algorithm can only obtain the face region without knowing the rotation angle or pose of the face. Consequently, during face verification, tracking is declared to have left a face only after several hundred consecutive frames have been tracked without a frontal face ever being detected in the traced region, at which point tracking stops. The drawback is that when a non-face target such as the neck or a hand is tracked, the system needs tens of seconds to react, which greatly degrades its performance.
The face verification module of the present invention resolves this defect of the original system. Because the new face detection can detect upright frontal and profile faces, and the subsequent AAM-based affine coefficient estimation can obtain the rotation angle of the face, the tracked face can be verified continuously: every frame, the traced region is judged to be a face or not. If it is not a face, a non-face tracking result is output; moreover, if verification fails for several consecutive frames, tracking stops. The system can thus react within one second and stop tracking a target when it has drifted onto a non-face region.
Referring to Figure 7, the detailed flowchart of the face verification module, the process is:
Step 701: combine the scale and rotation angle of the face in the previous frame and the previously computed translation parameters with the input image of the current frame.
Step 702: roughly determine the position, size, and rotation angle of the face in the current frame.
Step 703: crop and normalize the face region to obtain a 12 × 12 image. Using the above parameters, the current frame is affine-transformed, then cropped and size-normalized, producing a 12 × 12 image.
Step 704: feed this image into the multi-pose face detection models and judge whether it is a real face; if so, go to step 705, otherwise go to step 706. The image is sent into the multi-pose face detection models, the weight returned by the detection model of each pose is computed, and the pose corresponding to the detector with the largest weight is taken as the pose of the current face. If the weight of every pose detector is zero, the input image is judged not to be a face, and the neighborhood of the current face position must then be searched.
Step 705: verification passes; return the face pose.
Step 706: search for the face again over a small range of positions and scales. Using the known face size and rotation angle, the search is confined to the smaller scales; the candidate face frames from all pose detectors are merged, and the pose with the largest weight is taken as the pose of the current frame's face. If any candidate face frame is found, go to step 707; if none is found, go to step 708.
Step 707: merge the candidate faces and return the new position, scale, and pose of the face in the original image.
Step 708: verification fails. The current search region contains no face, or the face pose has changed too much, so the face verification does not pass.
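The flow of steps 701-708 can be condensed as follows; every callable here is a hypothetical stand-in for the corresponding module of the system:

```python
def verify_face(frame, prev_face, pose_models, affine_crop, search_nearby):
    """Sketch of the two-level verification flow (steps 701-708).
    `affine_crop` stands in for steps 701-703 (warp, crop, normalise to
    12x12); `search_nearby` stands in for the step-706/707 neighbourhood
    search that returns a merged candidate or None."""
    patch = affine_crop(frame, prev_face)

    # 704: score the patch with every pose detector
    weights = {pose: model(patch) for pose, model in pose_models.items()}
    best = max(weights, key=weights.get)
    if weights[best] > 0:
        return True, best, prev_face          # 705: pass, return the pose

    # 706-707: search a small neighbourhood in position and scale
    hit = search_nearby(frame, prev_face)
    if hit is not None:
        return True, hit["pose"], hit         # merged candidate frame
    return False, None, None                  # 708: verification fails
```

The second level is only reached when every pose weight is zero, matching the flow described above.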
Two examples of face verification, illustrated with camera images, are given below.
Referring to Figure 8, a schematic of a result that passes the first-level verification: 801 shows the previous frame and its tracking result, 802 shows the current frame, and 803 is the 12 × 12 image after cropping. Although this image is not a fully frontal face, it passed the frontal face detector and its pose was recognized as frontal, because the detection algorithm can detect faces with an in-plane rotation within a certain angular range.
Referring to Figure 9, a schematic of a result that passes the second-level verification: 901 shows the previous frame and its tracking result, 902 shows the current frame, 903 shows the normalized face, and 904 shows the result of the second-level verification. This is an example in which the first-level verification fails but the second-level verification passes. Here the translation velocity estimate has a deviation, so the normalized image is shifted to the left compared with the real face, and the first-level verification fails. In the second-level verification the input image is likewise affine-transformed and cropped, but the cropped region is larger than that of the first-level verification; the face is searched for in this region and the candidate results are merged, yielding the detected face frame shown at 904.
Fifth, the AAM-based face affine coefficient estimation described in step 210 of the present invention is explained in further detail.
The face frame output by the face verification algorithm above can enclose every facial organ, but the scale and rotation angle still carry over the previous frame's result, so a face whose rotation angle is too large cannot pass the verification, and the algorithm cannot handle in-plane spinning of the face. To guarantee that the algorithm of the present invention can track faces rotated by arbitrary angles, an affine transformation coefficient estimation algorithm based on a simplified AAM is proposed, obtaining the rotation, translation, zoom factor, and other parameters of the face in the current frame.
An AAM is a parametric model of a target's shape features and color distribution features based on principal component analysis (PCA); its purpose is to obtain the shape, affine transformation coefficients, and other parameters of a target region from a well-pretrained model. AAMs are used very widely in face modeling and face alignment; Document 4, for example, uses an AAM algorithm to obtain the contour information of each facial organ.
The purpose of the AAM-based face affine coefficient estimation in the present invention is to obtain the size and rotation angle of the tracked face, that is, to compute four affine transformation coefficients a = {a_i}, i = 0, 1, 2, 3, covering only three transformations: translation, scaling, and rotation, namely:
\begin{pmatrix} x_{new} \\ y_{new} \end{pmatrix} = \begin{pmatrix} s\cos\theta & -s\sin\theta \\ s\sin\theta & s\cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} a_0 & -a_1 & a_2 \\ a_1 & a_0 & a_3 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \qquad (12)
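A quick numerical check of the two equivalent parameterizations of eq. (12), with a_0 = s cos θ, a_1 = s sin θ, a_2 = t_x, a_3 = t_y; the helper names are illustrative:

```python
import numpy as np

def affine_from_pose(s, theta, tx, ty):
    """Pack scale/rotation/translation into the four coefficients of eq. (12)."""
    return np.array([s * np.cos(theta), s * np.sin(theta), tx, ty])

def apply_affine(a, pts):
    """Map N x 2 points through [[a0, -a1, a2], [a1, a0, a3]]."""
    M = np.array([[a[0], -a[1], a[2]],
                  [a[1],  a[0], a[3]]])
    ones = np.ones((pts.shape[0], 1))
    return np.hstack([pts, ones]) @ M.T
```

For example, scaling by 2, rotating 90°, and translating by (3, 4) maps the point (1, 0) to (3, 6), as both sides of eq. (12) require.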
As this formula shows, the present invention does not need the contour information of each facial organ. The AAM of Document 4 can therefore be simplified: only a gray-level PCA model needs to be trained on the face gray-level features, and an AAM search containing only this gray-level model is applied to the input face to compute its affine transformation coefficients.
In addition, the pixel distributions of faces in different poses differ, so a separate AAM is trained for each of the three poses. The face samples from face detection are first cropped, scale-normalized, and gray-level-normalized, yielding several thousand 16 × 16 face images, cropped in the same way as in face detection; of these, about two thousand are frontal faces, about 1,000 are left profiles, and about 1,000 are right profiles. The training and alignment process of the AAM is described below, taking the frontal face as an example.
Let a face image be A(x), where x denotes a point in the 16 × 16 image. Applying the PCA transform to all training samples yields the mean face A_0, the m largest eigenvalues, and the m corresponding eigenvectors A_i, i = 1, 2, …, m. Any frontal face image can then be approximated as a linear combination of A_0 and the A_i, i = 1, 2, …, m:
A(x) = A_0(x) + \sum_{i=1}^{m} \lambda_i A_i(x) \qquad (13)
where \lambda_i are the linear weighting coefficients of A(x).
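A minimal sketch of the gray-level PCA model of eq. (13), computing the mean face and leading eigenvectors via SVD of the mean-centered samples; the helper names are illustrative:

```python
import numpy as np

def train_pca_model(samples, m):
    """Eq. (13): mean face A0 plus the m eigenvectors with the largest
    eigenvalues, obtained from the centered samples via SVD."""
    X = samples.reshape(len(samples), -1).astype(float)
    A0 = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - A0, full_matrices=False)
    return A0, Vt[:m]                # rows are A_1..A_m, orthonormal

def project(A0, A, face):
    """Coefficients lambda_i of eq. (13) for one face, plus reconstruction."""
    lam = A @ (face.ravel() - A0)    # orthonormal basis: dot products suffice
    recon = A0 + lam @ A
    return lam, recon
```

When the samples genuinely lie in an m-dimensional subspace around the mean, the reconstruction is exact; real face data is only approximated, as eq. (13) states.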
Let the face image input to the AAM alignment algorithm be I(x), obtained from the face center position and face size returned by the face verification algorithm and the face rotation angle of the previous frame. Suitable \lambda_i and affine transformation coefficients a = {a_i}, i = 0, 1, 2, 3 must be computed so that I(x) matches the trained AAM, minimizing:
E = \sum_x \left[ A_0(x) + \sum_{i=1}^{m} \lambda_i A_i(x) - I(x, a) \right]^2 \qquad (14)
where I(x, a) is the image obtained by applying the affine transformation a to I(x). As before, iterative processing with the Lucas-Kanade inverse algorithm is used to obtain a; each iteration solves for \Delta a in:
E = \sum_x \left[ A_0(x) + \sum_{i=1}^{m} \lambda_i A_i(x) - I(x, a + \Delta a) \right]^2 \qquad (15)
As described in Document 4, a space-projection technique is adopted to first eliminate the \lambda_i from this expression, simplifying the computation of the minimization iterations. Denote the space spanned by the vectors A_i as sub(A_i) and its orthogonal complement as sub(A_i)^\perp; the expression can then be written as:
\left\| A_0(x) + \sum_{i=1}^{m} \lambda_i A_i(x) - I(x, a + \Delta a) \right\|^2_{sub(A_i)^\perp} + \left\| A_0(x) + \sum_{i=1}^{m} \lambda_i A_i(x) - I(x, a + \Delta a) \right\|^2_{sub(A_i)} \qquad (16)
The first term is computed on sub(A_i)^\perp, so all the terms containing A_i can be dropped there, since their projections onto that space are zero, that is:
\left\| A_0(x) - I(x, a + \Delta a) \right\|^2_{sub(A_i)^\perp} + \left\| A_0(x) + \sum_{i=1}^{m} \lambda_i A_i(x) - I(x, a + \Delta a) \right\|^2_{sub(A_i)} \qquad (17)
The first term is independent of \lambda_i, so suitable affine coefficients can first be obtained by minimizing the first term, and \lambda_i are then computed by minimizing the second:
\lambda_i = \sum_x A_i(x) \cdot \left[ I(x, a + \Delta a) - A_0(x) \right] \qquad (18)
The minimization of the first term can be realized by the Lucas-Kanade inverse algorithm:
\left\| A_0(x) - I(x, a + \Delta a) \right\|^2_{sub(A_i)^\perp} = \left\| A_0(x, -\Delta a) - I(x, a) \right\|^2_{sub(A_i)^\perp} = \left\| A_0(x) - \frac{\partial A_0}{\partial a} \Delta a - I(x, a) \right\|^2_{sub(A_i)^\perp} \qquad (19)
where \frac{\partial A_0}{\partial a} = \nabla A_0 \frac{\partial x}{\partial a}; then:
\Delta a = H^{-1} \sum_x \left[ \nabla A_0 \frac{\partial x}{\partial a} \right]^T_{sub(A_i)^\perp} \left[ A_0(x) - I(x, a) \right] \qquad (20)
Wherein the Hessian matrix H is:
H = \sum_x \left[ \nabla A_0 \frac{\partial x}{\partial a} \right]^T_{sub(A_i)^\perp} \left[ \nabla A_0 \frac{\partial x}{\partial a} \right]_{sub(A_i)^\perp} \qquad (21)
where \left[ \nabla A_0 \frac{\partial x}{\partial a_j} \right]_{sub(A_i)^\perp} is:

\left[ \nabla A_0 \frac{\partial x}{\partial a_j} \right]_{sub(A_i)^\perp} = \nabla A_0 \frac{\partial x}{\partial a_j} - \sum_{i=1}^{m} \left[ \sum_x \left( A_i(x) \cdot \nabla A_0 \frac{\partial x}{\partial a_j} \right) \right] A_i(x) \qquad (22)
The partial derivatives of x with respect to the affine transformation coefficients a are:
\frac{\partial x}{\partial a_0} = x,\ \frac{\partial y}{\partial a_0} = y,\ \frac{\partial x}{\partial a_1} = -y,\ \frac{\partial y}{\partial a_1} = x,\ \frac{\partial x}{\partial a_2} = 1,\ \frac{\partial y}{\partial a_2} = 0,\ \frac{\partial x}{\partial a_3} = 0,\ \frac{\partial y}{\partial a_3} = 1 \qquad (23)
In these formulas, \nabla A_0 \frac{\partial x}{\partial a} is fully determined by the trained mean image of the AAM and the coordinates of each point in the 16 × 16 image, so it can be computed in advance, and hence the inverse of H can also be computed in advance. Only I(x, a) and a need to be continually updated during the iterative process, which greatly improves the efficiency of the algorithm.
The steps of the whole AAM alignment algorithm are:
Precomputation:
(1) compute the gradient \nabla A_0 of the trained mean image;
(2) compute \frac{\partial x}{\partial a} for each point in the 16 × 16 image;
(3) compute the matrix \left[ \nabla A_0 \frac{\partial x}{\partial a} \right]_{sub(A_i)^\perp};
(4) compute the Hessian matrix and its inverse;
Iterative processing:
(5) compute I(x, a) from the a of the previous frame;
(6) compute the image difference A_0(x) - I(x, a) and \Delta a;
(7) compute the new affine transformation coefficients a + \Delta a;
Subsequent computation:
(8) compute the linear coefficients \lambda_i from the a obtained when the iteration finishes.
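The precompute/iterate structure of steps (1)-(8), with the projection of eq. (22) and the Hessian of eq. (21), can be sketched on a 16 × 16 model as follows. The nearest-neighbour warp, the additive update a + \Delta a exactly as the patent writes it, and the convergence threshold are simplifying assumptions:

```python
import numpy as np

SIZE = 16
Y, X = np.mgrid[0:SIZE, 0:SIZE].astype(float)

def warp(img, a):
    """I(x, a): sample img at the eq. (12) transform of each model point
    (nearest-neighbour, an assumed interpolation)."""
    xn = a[0] * X - a[1] * Y + a[2]
    yn = a[1] * X + a[0] * Y + a[3]
    xi = np.clip(np.round(xn), 0, SIZE - 1).astype(int)
    yi = np.clip(np.round(yn), 0, SIZE - 1).astype(int)
    return img[yi, xi]

def precompute(A0, A):
    """Steps (1)-(4): gradient of the mean image, Jacobians of eq. (23),
    projected steepest-descent images of eq. (22), Hessian of eq. (21)."""
    gy, gx = np.gradient(A0)
    jac = [(X, Y), (-Y, X),
           (np.ones_like(X), np.zeros_like(X)),
           (np.zeros_like(X), np.ones_like(X))]
    sd = np.stack([(gx * jx + gy * jy).ravel() for jx, jy in jac])  # 4 x 256
    sd = sd - (sd @ A.T) @ A          # eq. (22): project out the basis
    return sd, np.linalg.inv(sd @ sd.T)

def fit(img, A0, A, a, iters=10):
    """Steps (5)-(8): iterate da by eq. (20), then lambda by eq. (18).
    A holds the orthonormal appearance vectors A_i as rows (m x 256)."""
    sd, Hinv = precompute(A0, A)
    for _ in range(iters):
        err = (A0 - warp(img, a)).ravel()
        da = Hinv @ (sd @ err)        # eq. (20)
        a = a + da
        if np.abs(da).max() < 1e-6:
            break
    lam = A @ (warp(img, a).ravel() - A0.ravel())      # eq. (18)
    return a, lam
```

Because the steepest-descent images are projected onto sub(A_i)^\perp, a purely appearance-driven difference between the input and the mean face leaves the affine coefficients untouched and is absorbed entirely by the \lambda_i.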
An AAM is likewise trained for each of the left-profile and right-profile faces. Referring to Figure 10, an example of the affine coefficients computed by the AAM algorithm: the black frames are the face frames determined by the face detection algorithm; 1001 shows a frontal face input, 1002 a left-profile face, and 1003 a right-profile face; the white frames are the frames obtained after the affine transformation. For convenient checking, the positions of the two eyes are also computed by inverting formula (1) with the white frame and marked with +. Referring to Figure 11, a schematic of AAM-based face tracking results: three images from one sequence are shown in which the face rotation angle is very large, yet the tracking algorithm still tracks these rotated faces and accurately reflects their angles.
Sixth, the face key point tracking described in step 211 of the present invention is explained in further detail.
The translation velocity estimation, face verification, and affine coefficient computation above are all carried out at reduced face resolution. This improves the efficiency of the algorithm but also reduces the precision of the resulting face parameters, since the resolution of the original image is much higher; the output face position, scale, angle, and other data therefore still deviate slightly from the true values. In practice, even when the face is motionless throughout a sequence, the face position, size, and angle obtained by the preceding modules jitter visibly. To solve this problem, a face key point tracking module is added at the end of the system. It uses the same translation estimation method based on the Lucas-Kanade inverse algorithm as the Mean Shift face tracking described above, exploiting the color information of the pixels in the neighborhood of each key point: the AAM alignment result sets the initial translation velocity, the translation velocity of each key point is then computed between the consecutive input frames, and the final face position and other parameters are determined from these points.
Referring to Figure 12, a schematic of key point selection and tracking results. The key points are chosen as shown at 1201: the frame is the face frame of the previous frame, the five points A, B, C, D, E are the key points, A is the center point, and B, C, D, E are the midpoints of the lines joining A to the four corners of the face frame. 1202 shows the current frame and the face frame determined by the AAM, with the five corresponding key points A', B', C', D', E'. Taking the coordinates of these points as initial values for the translation estimation, and considering the pixels in the 5 × 5 neighborhood of each point, the translation velocity of each key point is computed, yielding the new points A'', B'', C'', D'', E'' shown at 1203. If the face rotates markedly between adjacent frames, determining the face position solely from the key point translation velocities may fail to reflect the fast rotation, because the pixel distribution in the neighborhood of a corresponding key point no longer satisfies the translation relation (2); the translation estimation precision drops, and the estimated velocity of A'' in 1203 is accordingly not accurate enough. A compromise is therefore adopted: the coordinates of A', B', C', D', E' and A'', B'', C'', D'', E'' are combined by weighted summation into new points A''', B''', C''', D''', E''', shown at 1204, and the final position, bounding frame, rotation angle, size, and other parameters of the face are determined from these points. The square frame shown at 1204 is the face frame; the four line segments shown are the final output of the system, the intersection of their extensions is the center of the frame, and if the side length of the frame is len, the two endpoints of each segment lie at distances len/2 and len from the center.
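The key point construction and the weighted compromise can be sketched as follows; the equal 0.5 weighting is an assumption, since the patent does not give the weights used:

```python
import numpy as np

def key_points(cx, cy, half):
    """A is the face-frame centre; B..E are the midpoints between A and
    the four corners of the face frame (half = half the side length)."""
    corners = np.array([[cx - half, cy - half], [cx + half, cy - half],
                        [cx + half, cy + half], [cx - half, cy + half]],
                       dtype=float)
    a = np.array([cx, cy], dtype=float)
    return np.vstack([a, (a + corners) / 2.0])        # rows: A, B, C, D, E

def fuse(aam_pts, tracked_pts, w=0.5):
    """Compromise described above: weighted sum of the AAM-predicted points
    A'..E' and the velocity-tracked points A''..E'' (w = 0.5 assumed)."""
    return w * aam_pts + (1.0 - w) * tracked_pts
```

The fused points A'''..E''' then determine the final face centre, frame, rotation angle, and size.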
As described above, the whole multi-pose face detection and tracking system has been demonstrated in many scenes and settings, and has been combined with face recognition and three-dimensional face synthesis programs to realize several demonstration applications. The test results show from many aspects that the face detection method proposed by the present invention can detect faces with out-of-plane (depth) rotations from -50° to 50° and in-plane rotations from -20° to 20°, faces raised by 0° to 30° or lowered by 0° to 30°, faces of different skin colors, faces under different lighting conditions, faces wearing glasses, and so on. It can track frontal and half-profile faces and faces rotated in-plane by any angle; the tracking is highly stable and is not disturbed by non-face regions with skin-like color such as the neck or hands; and it obtains the rotation angle of the face and outputs its accurate size.
The efficiency of the algorithm of the present invention is very high. According to the test results, when tracking a face in 320 × 240 images on a P4 2.8 GHz computer, the processing time per frame is 8 ms to 15 ms; the CPU occupancy does not exceed 12% when processing 320 × 240 video at 10 fps, and does not exceed 18% when processing 640 × 480 video at 10 fps. Figure 13 shows an example of a group of face detection and tracking results: the first image is a face detection result, and the last image is an example of failed verification, indicated by four black line segments.
Aimed at the limitations of the original algorithm, the present invention proposes multiple improvements that resolve those defects, achieving more reliable and stable tracking results while retaining very high operational efficiency. The method detects multiple frontal and half-profile upright faces in the captured scene in real time, selects the largest face, and tracks it continuously using the Mean Shift-based tracking algorithm and the Lucas-Kanade inverse algorithm; it computes the affine transformation coefficients between the tracked face and the trained face model using the AAM-based face model, thereby determining the size and rotation angle of the tracked face; and it continually verifies the tracked face with the multi-pose face detection models, judging whether the traced region is a real face, to guarantee the stability of the face tracking.

Claims (12)

1. A multi-pose face detection and tracking method, characterized by comprising:
(1) obtaining frontal and half-profile face detection models respectively through face sample training, and determining an active appearance face model;
(2) performing face detection on an input video image using the frontal and half-profile face detection models, to determine whether a face exists in a frame of the image;
(3) if a face is detected in a frame, tracking and verifying this face in subsequent frames, comprising the steps of:
(31) tracking the face position in the previous frame image to obtain a preliminary position of the face in the current frame;
(32) taking the obtained preliminary position as an initial value, and calculating the translation velocity of the face from the chrominance difference between the current and previous frame images;
(33) estimating the approximate position of the face in the current frame from the translation velocity, and detecting near this position using the frontal face model and the half-profile detection model, so as to verify the face;
(34) if detect people's face near this position, then checking is passed through, and adopts described active outward appearance faceform to calculate the affined transformation coefficient of working as forefathers' face, obtains the characteristic parameter of present frame people face.
2. the method for claim 1 is characterized in that, described step (3) further comprises:
(35) key point of present frame and former frame image people face is mated, further revise described people's face point-to-point speed of calculating according to matching result, and the characteristic parameter of present frame people face.
3. method as claimed in claim 2 is characterized in that, described step (3) further comprises:
(36) characteristic parameter of renewal present frame people face utilizes these parameters to be used for the tracking checking of next frame image.
4. the method for claim 1 is characterized in that, described step (34) further comprises: if do not detect people's face near this position, then checking is not passed through, and follows the trail of checking in next frame.
5. method as claimed in claim 4 is characterized in that, described step (34) further comprises: if the checking of people's face is not passed through yet in follow-up several frames, then stop to follow the trail of.
6. method as claimed in claim 5 is characterized in that, further comprises step:
(4) previous follow the trail of the objective stop to follow the trail of after, in successive image, begin to detect from step (2) again, after finding new people's face, proceed to follow the trail of.
7. the method for claim 1, it is characterized in that, the described step that obtains people's face front and half side-view detection model by people's face sample training respectively of step (1), comprise: the people's face sample training multilayer detection model that at first uses all attitudes, again people's face sample of front, left surface, right flank attitude is trained respectively, obtain the detection model of three attitudes.
8. the method for claim 1, it is characterized in that, the described people's face of step (2) detects step, comprise: at first adopt the detection model of all attitudes that image is searched for, eliminate most of search window, then remaining window is input to respectively in the detection model of three attitudes, determines according to testing result that the people is bold and cause attitude.
9. A multi-pose face detection and tracking system, characterized in that it comprises:
a training module, for obtaining a frontal face detection model and a half-profile face detection model respectively through face-sample training, and determining an Active Appearance Model of the face;
a detection module, for performing face detection on an input video image according to the frontal and half-profile face detection models, to determine whether a face is present in a frame of the image;
a tracking module, for tracking and verifying a face in subsequent frames after it has been detected in a frame, comprising:
a unit for tracking the face position in the previous frame image to obtain a preliminary face position in the current frame;
a unit for taking the obtained preliminary position as an initial value and computing the translation velocity of the face from the chrominance difference between the current frame and the previous frame;
a unit for estimating the approximate position of the face in the current frame from the translation velocity, and performing detection near that position with the frontal and half-profile detection models to verify the face;
a unit for computing, after a face is detected near that position, the affine transformation coefficients of the current face using the Active Appearance Model, yielding the feature parameters of the face in the current frame.
10. The system of claim 9, wherein the tracking module further comprises:
a unit for matching key points of the face between the current frame and the previous frame, and further correcting, according to the matching result, the computed face translation velocity and the feature parameters of the face in the current frame.
11. The system of claim 9, wherein the training module trains a multi-layer detection model with face samples of all poses, and trains separately on face samples of the frontal, left-profile, and right-profile poses to obtain detection models for the three poses.
12. The system of claim 11, wherein the detection module searches the image with the all-pose detection model to eliminate most of the search windows, feeds the remaining windows into the three pose-specific detection models respectively, and determines the approximate pose of the face from the detection results.
CNB200610113423XA 2006-09-27 2006-09-27 Multiple attitude human face detection and track system and method Active CN100426317C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB200610113423XA CN100426317C (en) 2006-09-27 2006-09-27 Multiple attitude human face detection and track system and method


Publications (2)

Publication Number Publication Date
CN1924894A CN1924894A (en) 2007-03-07
CN100426317C true CN100426317C (en) 2008-10-15

Family

ID=37817523

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB200610113423XA Active CN100426317C (en) 2006-09-27 2006-09-27 Multiple attitude human face detection and track system and method

Country Status (1)

Country Link
CN (1) CN100426317C (en)

Families Citing this family (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101325691B (en) 2007-06-14 2010-08-18 清华大学 Method and apparatus for tracing a plurality of observation model with fusion of differ durations
CN101216973B (en) * 2007-12-27 2011-08-17 北京中星微电子有限公司 An ATM monitoring method, system, and monitoring device
CN101499128B (en) * 2008-01-30 2011-06-29 中国科学院自动化研究所 Three-dimensional human face action detecting and tracing method based on video stream
JP4577410B2 (en) * 2008-06-18 2010-11-10 ソニー株式会社 Image processing apparatus, image processing method, and program
CN101576953B (en) * 2009-06-10 2014-04-23 北京中星微电子有限公司 Classification method and device of human body posture
CN101739676B (en) * 2009-12-04 2012-02-22 清华大学 Method for manufacturing face effigy with ultra-low resolution
CN101794385B (en) * 2010-03-23 2012-11-21 上海交通大学 Multi-angle multi-target fast human face tracking method used in video sequence
CN101968846B (en) * 2010-07-27 2013-05-15 上海摩比源软件技术有限公司 Face tracking method
CN103544478A (en) * 2013-10-09 2014-01-29 五邑大学 All-dimensional face detection method and system
CN106605258B (en) * 2014-09-25 2021-09-07 英特尔公司 Facilitating efficient free-plane rotational landmark tracking of images on computing devices
CN104318211A (en) * 2014-10-17 2015-01-28 中国传媒大学 Anti-shielding face tracking method
CN105138956B (en) * 2015-07-22 2019-10-15 小米科技有限责任公司 Method for detecting human face and device
CN105405094A (en) * 2015-11-26 2016-03-16 掌赢信息科技(上海)有限公司 Method for processing face in instant video and electronic device
CN106251294B (en) * 2016-08-11 2019-03-26 西安理工大学 A kind of single width faces the virtual multi-pose generation method of facial image
CN106503682B (en) * 2016-10-31 2020-02-04 北京小米移动软件有限公司 Method and device for positioning key points in video data
CN106650624A (en) * 2016-11-15 2017-05-10 东软集团股份有限公司 Face tracking method and device
CN106650682B (en) * 2016-12-29 2020-05-01 Tcl集团股份有限公司 Face tracking method and device
CN106991688A (en) * 2017-03-09 2017-07-28 广东欧珀移动通信有限公司 Human body tracing method, human body tracking device and electronic installation
CN108664850B (en) * 2017-03-30 2021-07-13 展讯通信(上海)有限公司 Human face posture classification method and device
CN107993250A (en) * 2017-09-12 2018-05-04 北京飞搜科技有限公司 A kind of fast multi-target pedestrian tracking and analysis method and its intelligent apparatus
CN108875333B (en) * 2017-09-22 2023-05-16 北京旷视科技有限公司 Terminal unlocking method, terminal and computer readable storage medium
CN109754383A (en) * 2017-11-08 2019-05-14 中移(杭州)信息技术有限公司 A kind of generation method and equipment of special efficacy video
CN108197613B (en) * 2018-02-12 2022-02-08 天地伟业技术有限公司 Face detection optimization method based on deep convolution cascade network
CN108510061B (en) * 2018-03-19 2022-03-29 华南理工大学 Method for synthesizing face by multiple monitoring videos based on condition generation countermeasure network
CN109064489A (en) * 2018-07-17 2018-12-21 北京新唐思创教育科技有限公司 Method, apparatus, equipment and medium for face tracking
CN109325964B (en) * 2018-08-17 2020-08-28 深圳市中电数通智慧安全科技股份有限公司 Face tracking method and device and terminal
CN110909568A (en) * 2018-09-17 2020-03-24 北京京东尚科信息技术有限公司 Image detection method, apparatus, electronic device, and medium for face recognition
CN109670474B (en) * 2018-12-28 2023-07-25 广东工业大学 Human body posture estimation method, device and equipment based on video
CN113228626B (en) * 2018-12-29 2023-04-07 浙江大华技术股份有限公司 Video monitoring system and method
CN112101063A (en) * 2019-06-17 2020-12-18 福建天晴数码有限公司 Skew face detection method and computer-readable storage medium
WO2021248348A1 (en) * 2020-06-10 2021-12-16 Plantronics, Inc. Tracker activation and deactivation in a videoconferencing system
CN112084856A (en) * 2020-08-05 2020-12-15 深圳市优必选科技股份有限公司 Face posture detection method and device, terminal equipment and storage medium
CN112188140A (en) * 2020-09-29 2021-01-05 深圳康佳电子科技有限公司 Face tracking video chat method, system and storage medium
CN112364808A (en) * 2020-11-24 2021-02-12 哈尔滨工业大学 Living body identity authentication method based on FMCW radar and face tracking identification
CN112614168B (en) * 2020-12-21 2023-08-29 浙江大华技术股份有限公司 Target face tracking method and device, electronic equipment and storage medium
CN113705444A (en) * 2021-08-27 2021-11-26 成都玻尔兹曼智贝科技有限公司 Facial development analysis and evaluation method and system
CN114187216A (en) * 2021-11-17 2022-03-15 海南乾唐视联信息技术有限公司 Image processing method and device, terminal equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5850469A (en) * 1996-07-09 1998-12-15 General Electric Company Real time tracking of camera pose
US6741756B1 (en) * 1999-09-30 2004-05-25 Microsoft Corp. System and method for estimating the orientation of an object
CN1794264A (en) * 2005-12-31 2006-06-28 北京中星微电子有限公司 Method and system of real time detecting and continuous tracing human face in video frequency sequence
CN1794265A (en) * 2005-12-31 2006-06-28 北京中星微电子有限公司 Method and device for distinguishing face expression based on video frequency


Also Published As

Publication number Publication date
CN1924894A (en) 2007-03-07

Similar Documents

Publication Publication Date Title
CN100426317C (en) Multiple attitude human face detection and track system and method
Wang et al. Automatic laser profile recognition and fast tracking for structured light measurement using deep learning and template matching
CN102324025B (en) Human face detection and tracking method based on Gaussian skin color model and feature analysis
Boltes et al. Automatic extraction of pedestrian trajectories from video recordings
Liu et al. Detecting and counting people in surveillance applications
CN108717531B (en) Human body posture estimation method based on Faster R-CNN
Nedevschi et al. Stereo-based pedestrian detection for collision-avoidance applications
Nair et al. 3-D face detection, landmark localization, and registration using a point distribution model
CN100440246C (en) Positioning method for human face characteristic point
CN103020986B (en) A kind of motion target tracking method
Krotosky et al. Mutual information based registration of multimodal stereo videos for person tracking
US10885667B2 (en) Normalized metadata generation device, object occlusion detection device and method
CN111191629B (en) Image visibility detection method based on multiple targets
CN102609720B (en) Pedestrian detection method based on position correction model
US20110176000A1 (en) System and Method for Counting People
CN108961308B (en) Residual error depth characteristic target tracking method for drift detection
Albiol Colomer et al. Who is who at different cameras: people re-identification using depth cameras
CN109341703A (en) A kind of complete period uses the vision SLAM algorithm of CNNs feature detection
CN108062525A (en) A kind of deep learning hand detection method based on hand region prediction
CN110427797A (en) A kind of three-dimensional vehicle detection method based on geometrical condition limitation
Zhang et al. Local distance comparison for multiple-shot people re-identification
CN106599785A (en) Method and device for building human body 3D feature identity information database
CN103593639A (en) Lip detection and tracking method and device
CN107103301A (en) Video object space-time maximum stability identification color region matching process and system
Khan et al. Online domain-shift learning and object tracking based on nonlinear dynamic models and particle filters on Riemannian manifolds

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20180408

Address after: No. 607, 6th Floor, Xueyuan Road, Haidian District, Beijing 100191

Patentee after: Beijing Vimicro AI Chip Technology Co Ltd

Address before: 15th Floor, Nanjing Ning Building, No. 35 Xueyuan Road, Haidian District, Beijing 100083

Patentee before: Beijing Vimicro Corporation