WO2002035472A2 - Creating cartoons - Google Patents

Creating cartoons

Info

Publication number
WO2002035472A2
Authority
WO
WIPO (PCT)
Prior art keywords
contour
frame
image
user
curve
Prior art date
Application number
PCT/US2001/050421
Other languages
French (fr)
Other versions
WO2002035472A3 (en)
Inventor
Aseem Agarwala
Original Assignee
Starlab Nv/Sa
Priority date
Filing date
Publication date
Application filed by Starlab Nv/Sa filed Critical Starlab Nv/Sa
Publication of WO2002035472A2 publication Critical patent/WO2002035472A2/en
Publication of WO2002035472A3 publication Critical patent/WO2002035472A3/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00Animation
    • G06T13/80 2D [Two Dimensional] animation, e.g. using sprites
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/40Filling a planar surface by adding surface attributes, e.g. colour or texture
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/12Edge-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20092Interactive image processing based on input by user
    • G06T2207/20104Interactive definition of region of interest [ROI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20116Active contour; Active surface; Snakes

Abstract

Creating cartoons includes tracing a contour around an image in a frame of media including a sequence of frames, modifying the contour to more tightly fit around the image, and coloring closed areas surrounded by the contour.

Description

CREATING CARTOONS
BACKGROUND
This invention relates to creating cartoons.
Two-dimensional (2D) cartoons can be made in a variety of ways. High quality cartoons, such as those seen in feature films and in animated television shows, are typically hand-drawn and painted by trained artists. Such cartoons often involve large teams of workers and a significant amount of time to complete. Cartoons can also be created with web animation, e.g., in Flash format. Web animation techniques are often limited to use by trained graphics designers using complicated software. Further, web animation typically requires ten to thirty hours of work per minute of animation, not counting any time spent training the web animators in animation or in using the software.
SUMMARY According to one aspect of the invention, creating cartoons includes tracing a contour around an image in a frame of media including a sequence of frames, modifying the contour to more tightly fit around the image, and coloring closed areas surrounded by the contour.
According to another aspect of the invention, creating cartoons includes causing a machine to enable a user to trace a contour around an image in a frame of media including a sequence of frames, modify the contour to more tightly fit around the image, and color closed areas surrounded by the contour.
According to another aspect of the invention, an apparatus includes a display device configured to display a graphical user interface that enables a user to trace a contour around an image in a frame of media including a sequence of frames and a mechanism configured to automatically modify the contour to more tightly fit around the image and configured to color closed areas surrounded by the contour.
According to another aspect of the invention, creating cartoons includes attempting to maximize a curve's overlap with edges of an image that the curve surrounds by maximizing an objective function including the weighted sum of terms
including an edge overlap term defining the negative of a line integral of the curve across the image, a curvature term defining an integral of the curvature along the curve, a corner sharpness term defining an angle between tangent lines of the curve where the curve connects to another curve, and a deformation term defining dissimilarity between a current position of the curve and a previous position of the curve.
According to another aspect of the invention, creating cartoons includes determining that a gap exists between a first endpoint of a first contour and a second endpoint of a second contour, moving the first endpoint to a location of the second endpoint, and moving the first contour a distance equal to a distance that the first endpoint moved to the location of the second endpoint.
One or more of the following advantages may be provided by one or more aspects of the invention.
By combining input from a human user with machine-performed analysis of video graphics, cartoons can be created quickly and easily by trained and untrained users of all ages via a graphical user interface. The interface is simple enough that no particular artistic training or computer skills are necessary to create 2D cartoons from the video graphics. The human user can create a 2D cartoon from any video sequence accessible on a computer having the video graphics analysis capabilities.
For example, the user can film a scene involving humans, inanimate objects, or a combination of both, input the video to the computer, and animate the scene. The video may even have inanimate objects manipulated by a human, e.g., a child moving an action figure through a self-created storyline, and the user can choose the objects in the video that he or she wants to animate. In this way, the hands of the child could be ignored, thereby animating only the action figure and any appropriate surroundings. The user selects the objects in the video to turn into cartoons and maintains control over how the cartoons are formed. The computer can correct user inputs, but the user can correct computer alterations at any time. Also, the user can trace new objects for the animation at any time. Other advantages and features will become apparent from the following description and from the claims.
DESCRIPTION OF DRAWINGS FIG. 1 is a flowchart showing a process of creating a cartoon. FIG. 2 is a block diagram of a computer system.
FIGS. 3-10 show frames of a video sequence.
FIG. 11 is a flowchart showing a process of creating a cartoon image. FIGS. 12-13 show an image with snake boundaries. FIGS. 14A-C illustrate snapping techniques. FIG. 15 is a flowchart showing a process of propagating a cartoon image.
FIGS. 16-18 show screens of a graphical user interface.
DETAILED DESCRIPTION
Referring to FIG. 1, a process 100 illustrates creating a 2D, cel-style cartoon from a video sequence using active contours. The video sequence can come from any source: the Internet, television, pre-recorded videos, compact disc read-only memories (CD-ROMs), camcorders (video cameras), etc. To create an animated cartoon, a single cartoon image is first created 102. Creating this single cartoon image includes a human user tracing 104 relevant contours of an image in a frame of the video sequence that will become the initial frame of an animated cartoon. The user traces the relevant contours on a graphical user interface of a machine such as a stationary or mobile computer, television, or other device having access to processing capabilities and capable of displaying images. After the user has traced the relevant contours, the machine propagates 106 the image surrounded by the relevant contours over future frames of the cartoon. To propagate the image, the machine uses one or more software and/or hardware algorithms to modify the relevant contours to tightly fit the actual contours of the image data as the image changes over adjacent frames of the video sequence and to define closed regions. Then, as the animation progresses, the machine contours 108 the user-contoured image frame-by-frame using the video data as a guide. The user can at any time input corrections to machine calculations to help the machine with tracking the image over successive frames of the 2D cartoon. The user also can trace any new items that may enter the video scene.
Referring to FIG. 2, a system 200 includes a machine 202 capable of processing the user's commands with software, hardware, or a combination of the two. The machine
202 is illustrated here as a desktop computer with internal processing capabilities, although the machine 202 could also or instead have access to an external processing mechanism 206, such as a database, a server, or a game console. Similarly, the machine 202 could also or instead include a display device such as a computer monitor or a television and an accessible processor such as a database, a server, or a game console.
The user (not shown) can interact with the machine 202 with tools such as a keyboard 208, a mouse 210, and/or a stylus 212.
Referring to FIGS. 3-10, an example of a dialogue between a user and a system to create contours around images in a video sequence is illustrated. The dialogue begins when the machine 202 displays an initial frame 300 (FIG. 3) of a video stream (not shown) of a hand moving a stuffed toy duck 302 on the display 204. The user can choose to cartoonize any object in the video stream. Assume that the user decides to cartoonize a hand 304 first. The user draws a contour 400 (FIG. 4) around the hand 304, directly on the video image using one or more of the tools. The machine 202 fits piecewise Bezier splines to the hand-drawn path 400 as explained further below. The machine 202 also closes any gaps, such as a gap 402 between the hand's thumb and the rest of the hand 304. The machine 202 also optimizes the splines to best fit the edges of the hand 304. The resulting contour 500 around the hand 304 is shown in FIG. 5.
The user advances to the next frame 600 of video (see FIG. 6). The duck 302 and the hand 304 in the new frame 600 have both moved slightly up and to the right from the first frame 300. The machine 202 calculates an estimate of the position of the contours in the new frame 600 and moves them accordingly, resulting in a calculated contour 602. The user notices an error in the calculated contour 602 at the bottom left of the hand 304 and drags this point to its correct location as shown in FIG. 7. The cartoon now includes two correct frames, frame 502 in FIG. 5 and frame 700 in FIG. 7.
The user advances to another frame 800 of video (see FIG. 8). This time the machine 202 estimates the contour motion correctly, and the user is satisfied with the frame 800 and makes no corrections.
The user now returns to the corrected initial frame 500 to add contours to the duck 302. The user draws a contour 900 around the duck 302 (see FIG. 9), and the machine 202 creates cartoon contours from the user-drawn contours. The machine also snaps endpoints as appropriate. The user then advances to the next corrected frame 700 to add contours to the duck 302. The contours of the hand remain in the position calculated above and the contours of the duck 302 are advanced. The machine 202 estimates the contour motion correctly, and the user makes no corrections to corrected frame 1000 (FIG. 10). The cartoon now includes two correct frames, frame 902 in FIG. 9 and frame 1000 in FIG. 10, with contours for the hand 304 and for the duck 302. The user continues advancing through frames of the video sequence and the machine processes the images in a similar manner until contours are created for each desired image in each desired frame of the video sequence.
Referring to FIG. 11, a process 1100 of creating a single cartoon image is shown. To create an animated 2D cartoon, a human user first creates a cartoon of the first frame of the cartoon from a frame in a video sequence. Then, successive frames of the cartoon can be created from the video sequence as described below.
Cartoon images are generally composed of solid regions of color separated into closed regions demarcated by curves. Thus, a cartoon of a photographic image would consist of curves that follow the edges in the photographic image.
The process 1100 uses a computer vision technique called active contours, also known as snakes, as the curves in the cartoons. A snake is a contour that "wiggles" around an image while trying to maximize some function. In many cases, the snake attempts to maximize its overlap with edges in an image, which can be found using edge detection techniques. Snakes traditionally consist of separate points connected by lines to form a boundary around an image. Such snakes are typically not smooth and, therefore, not aesthetically pleasing.
However, a boundary around the image can remain smooth at any resolution by using connected piecewise cubic Bezier curves as the graphics primitives for the snakes.
A Bezier curve includes at least three points: two endpoints and one or more control points. The endpoints define the start and end positions of the Bezier curve. The control point(s) define the curvature of the Bezier curve. Moving a control point changes the shape of the Bezier curve between the endpoints. Mathematically, Bezier curves are parametric splines (smooth curves) that describe a curve as:
R(t) = (X(t), Y(t)),
where t ranges from zero to one and the functions X(t) and Y(t) are cubic polynomials. As t varies, the point defined by (X(t), Y(t)) traces out the curve; moving a control point changes X(t) and Y(t) and thus repositions points along the curve. Stringing a number of Bezier curves together can create a string of curves around the image.
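As an illustration (not part of the patent), a minimal sketch of evaluating the Bernstein form of a cubic Bezier segment, with endpoints p0 and p3 and control points p1 and p2 given as (x, y) tuples:

```python
def bezier_point(p0, p1, p2, p3, t):
    """Evaluate R(t) = (X(t), Y(t)), 0 <= t <= 1, for a cubic Bezier segment."""
    s = 1.0 - t
    # Bernstein-basis weights: s^3, 3s^2t, 3st^2, t^3 for the four points.
    x = s**3 * p0[0] + 3 * s**2 * t * p1[0] + 3 * s * t**2 * p2[0] + t**3 * p3[0]
    y = s**3 * p0[1] + 3 * s**2 * t * p1[1] + 3 * s * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)
```

Sampling t at small increments yields points for drawing the segment; a snake is then a chain of such segments sharing endpoints.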
To start creating the first single cartoon image, the user traces 1102 a contour in the image that he or she wants to have in the cartoon. For example, the user can trace around an alien image 1200 as shown in FIG. 12. Note that the user-created outline 1202 resembles traditional snakes, in that lines around the image 1200 do not appear very smooth. To smooth the user-created curve, the process 1100 converts 1104 the curve into one or more piecewise Bezier splines using an iterative technique, such as the iterative technique presented in Philip J. Schneider, "An Algorithm for Automatically Fitting Digitized Curves," In Graphics Gems, 1990, pp. 612-26, 797-807. A smoothed curve
1300 is shown around the alien image 1200 in FIG. 13.
The process 1100 then closes small gaps and defines closed regions in the smoothed curve by snapping 1106 together closely adjacent contours. When the user draws a curve that is supposed to be attached to another curve, it is unlikely that the user will be able to exactly position the curve so that it is attached to the other curve. A gap or an overrun will likely occur. Identifying these situations and aligning the curves can make the resultant curves more aesthetically pleasing and can define closed regions that will contain a single solid color in the final cartoon image. There are numerous kinds of snapping techniques that can be used in snapping together closely adjacent locations. For example, snapping can be carried out by moving the user-drawn contour to close the gap and/or by adding a bit of line to the user-drawn contour so as to close the gap.
Referring to FIG. 14A, an endpoint 1400 of a user-drawn contour 1402 can be moved to meet an endpoint 1404 of a nearby curve 1406. As shown in FIG. 14B, this movement closes a gap 1408 between the user-drawn contour 1402 and the nearby curve 1406 to produce a snapped curve 1410. Moving the endpoint 1400 affects the shape of the user-drawn contour 1402, so the entire user-drawn contour 1402 is shifted. Alternatively, as shown in FIG. 14C, a gap-filling line 1412 may be added between the endpoints 1400, 1404 to close the gap 1408.
There are various scenarios in which snapping may be appropriate. In order of priority, the techniques include edge snapping 1108, close snapping 1110, endpoint snapping 1112, and curve snapping 1114. If the user starts or ends a curve very near the edge of the desired image, it is likely that the user wishes the curve to attach to the end of the image, i.e., surround and define that image. Edge snapping defines two separate regions bounded by the curve and the edge of the image.
The user may be trying to draw a closed curve, a curve that ends very near where the curve starts. If the last point of the curve is within a certain threshold distance (tolerance) of the first point of the curve, close snapping snaps the two points together to form a closed curve.
The user may want a curve to start or end at an endpoint included in another curve. Thus, endpoint snapping checks both the first and last points of a newly-drawn curve for their proximity within a certain threshold distance of another curve's endpoints. If within the threshold distance (tolerance), endpoint snapping snaps the first point and/or the last point to the appropriate other curve's endpoints. Endpoint snapping can also snap to corner points that form the intersection between connected Bezier segments.
The user may be trying to connect the first point or the last point of a newly-drawn curve to an arbitrary point along another curve. Curve snapping finds the closest point to the endpoint in question on all other curves, or on those in nearest proximity. This finding includes finding the roots of a fifth-order polynomial. One technique that performs this finding is described in Philip J. Schneider, "Solving the Nearest Point-On-Curve Problem," In Graphics Gems, 1990, pp. 607-11, 787-796. After finding the roots, curve snapping chooses the closest of the found points. If the closest found point is within a certain threshold distance (tolerance) from the first point or the last point, then the endpoint in question is snapped to that found point along the curve.
Along with modifying coordinate values to cause exact alignment, snapping includes keeping 1116 records of which points are snapped together. This recorded data structure can be searched to find closed regions (used in coloring as described below). Also, when locations of points are modified by the user, any curves snapped to the modified points should be moved as well. Thus, a pointer structure is kept that documents any snapping that occurred.
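For concreteness, a minimal sketch (not taken from the patent) of endpoint snapping with snap records, assuming curves are stored as lists of (x, y) points and that a simple list of tuples serves as the "pointer structure"; the tolerance value is an illustrative assumption:

```python
import math

SNAP_TOLERANCE = 5.0  # pixels; an illustrative threshold, not a value from the patent

def snap_endpoint(curve, endpoint_index, other_curves, snap_records, tol=SNAP_TOLERANCE):
    """Try to snap one endpoint of `curve` (index 0 or -1) onto a nearby endpoint
    of another curve.  On success the whole curve is rigidly shifted by the snap
    offset (as in FIG. 14B) and the join is recorded so that later edits can
    move snapped curves together."""
    px, py = curve[endpoint_index]
    for other in other_curves:
        for j in (0, -1):                       # check both endpoints of each curve
            qx, qy = other[j]
            if math.hypot(px - qx, py - qy) <= tol:
                dx, dy = qx - px, qy - py
                curve[:] = [(x + dx, y + dy) for (x, y) in curve]  # rigid shift
                snap_records.append((id(curve), endpoint_index, id(other), j))
                return True
    return False
```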
After snapping considerations are made, the snake is optimized 1118. The optimizing includes "wiggling" the snake. The optimization attempts to maximize the curve's overlap with edges of the image that the snake surrounds, to reflect the user's desires, and to produce an aesthetically pleasing, smooth curve. The optimization includes iterating 1120 over each control point in the snake and moving 1122 each control point in the contour to the adjacent location that maximizes an objective function. The objective function is calculated for the Bezier curve segments that are affected by the moving of the control point.
The objective function includes four terms: edge overlap, curvature, corner sharpness, and deformation. The objective function is a weighted sum of these four terms. The weights are determined experimentally, although the user can adjust the weights via the graphical user interface. The edge overlap term defines the negative of the line integral of the Bezier spline across the image after the application of a Sobel edge detection filter, normalized by the length of the spline. The edge overlap term encourages the spline to lie along edges in the underlying image.
The curvature term includes the integral of the curvature along the Bezier spline, normalized by the length of the spline. The curvature term discourages heavily convoluted splines.
The corner sharpness term defines the angle between the tangent lines of the spline where separate Bezier splines connect. The corner sharpness term discourages sharp corners. The deformation term measures dissimilarity between the current position of the spline and the spline originally drawn by the user. The deformation term may be expressed as the L2-norm between the current position of the spline and the spline originally drawn by the user, normalized by the length of the spline, or as a calculation of the difference in turning angles. This difference can be calculated in closed form as an integral along the length of "turning angles" for the current position of the spline and for the user-drawn position of the spline. A turning angle at a point on a curve is the angle made by a vector tangent to the curve at that point. The integral can be expressed as
∫(A(t) - A0(t))² dt,
the L2-norm, where A0(t) is the turning angle for the user-drawn curve. Alternatively, the deformation term can be calculated along the length of the spline as
∫(R(t) - R0(t))² dt,
where R0(t) is the initial position of the user-drawn curve. The deformation term discourages significant changes from what the user originally expressed in the user trace.
The snake continues to be optimized until a condition is met, e.g., a local maximum of the objective function is reached or the user signals that the present position of the snake is acceptable, e.g., by clicking a mouse button. Note that the optimization does not move the first point or the last point of the spline because it is expected that the user correctly positions these points. Keeping these points frozen leads to better-optimized snakes.
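A sketch of one pass of this greedy "wiggle", assuming an externally supplied objective(control_points) function that returns the weighted sum of the four terms described above; the one-pixel step and eight-neighborhood are illustrative choices, not values from the patent:

```python
def wiggle_once(control_points, objective, step=1.0):
    """One greedy pass: move each interior control point to the neighboring
    position (within +/- step in x and y) that maximizes the objective.
    The first and last points stay frozen, as the user is assumed to have
    positioned them correctly."""
    moved = False
    for i in range(1, len(control_points) - 1):
        x, y = control_points[i]
        best, best_score = (x, y), objective(control_points)
        for dx in (-step, 0.0, step):
            for dy in (-step, 0.0, step):
                control_points[i] = (x + dx, y + dy)
                score = objective(control_points)
                if score > best_score:
                    best, best_score = control_points[i], score
        control_points[i] = best
        moved = moved or best != (x, y)
    return moved  # repeat until False (a local maximum) or the user accepts
```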
Once the curves have been established, colors are selected 1124 for closed regions of the cartoon. The user can manually select the colors or can request automatic selection of the colors. If the user requests automatic selection, the process 1100 averages the pixel values in the region of the image enclosed by the boundary contours. Each pixel has an associated pixel value that defines the pixel's color. Depending on the number of bits used to store pixel information, the pixel value has different numerical ranges. Each number in the range defines a particular color. Corresponding color depths (numbers of bits) and possible pixel values include 4/16, 8/256, 16/65,536, and 24/16,777,216. The average pixel value is used to fill the closed region. The user now has a single cartoon image.
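As a sketch of the automatic selection, assuming the frame is held as a NumPy RGB array and the closed region has been rasterized into a boolean mask (a representation the patent does not prescribe):

```python
import numpy as np

def region_fill_color(image, region_mask):
    """Return the average RGB value of the pixels enclosed by a closed region.

    image:       (H, W, 3) uint8 array of pixel values.
    region_mask: (H, W) boolean array, True for pixels inside the region.
    """
    enclosed = image[region_mask]              # (N, 3) pixels inside the region
    return tuple(enclosed.mean(axis=0).astype(np.uint8))
```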
Referring to FIG. 15, once the first image of the animation has been converted into a cartoon, a process 1500 propagates the first cartoon image over future frames of animation. The propagating includes tracking 1502 contours and features across successive frames of the video sequence and allowing the snakes to optimize 1504 on each of the successive frames. The tracking includes, for each endpoint of the Bezier snake(s) in the current video image frame (including extreme endpoints and control point(s) between Bezier segments), attempting 1506 to find a location in the next video frame whose image data most closely matches the image data in the current frame. The next frame as discussed here considers adjacent frames, meaning that the current frame can be compared to the subsequent frame and/or to the previous frame in the video sequence. Two assumptions are made in attempting to find a match. First, it is assumed that the endpoints of contours correspond to distinct locations in the image data and thus are fairly easy to track. Users typically start and end contours at strong locations in an image, such as at the corner of an eye or at the tip of a bird's beak. Second, it is assumed that there is not significant movement between frames. For example, the National Television Standards Committee (NTSC) and Phase Alternating Line (PAL) video formats deliver thirty frames per second (fps) and twenty-five fps, respectively, so it could be assumed that in 1/30th of a second (NTSC) or 1/25th of a second (PAL), objects in the video sequence move only incrementally. In attempting to find a match as similar as possible to the image data in the current frame, the process 1500 can perform a block matching algorithm. The process 1500 considers
1508 a window of image data centered at the contoured image's location. The window can vary in size. Because it is assumed that little movement has occurred between consecutive video frames, the window here is relatively small, e.g., a square including a number of pixels approximating the number of pixels included in the image data in the current frame. To search for a match, the window is moved 1510 around a search space in the next frame. If a match is found in the search space, a distance function is computed 1512 between the image data in the current frame and the matched image data in the next frame. The distance function can be a variance-normalized cross-correlation, defined as:
C(f, g) = Σ(u,v) [f(i+u, j+v) - f̄][g(k+u, l+v) - ḡ] / (N √(Var(f) · Var(g))),
where the sum runs over window offsets (u, v), N is the number of pixels in a window, the window in the current frame is centered at f(i, j), the window being tested in the next frame is centered at g(k, l), f̄ and ḡ are the mean values of their respective windows, and the Var function is the variance of a window. Since these values come from color (black/white or multicolor) images, f and g return vectors of three elements representing red, blue, and green. Thus, the calculations are vector operations that in the end yield a scalar number.
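A sketch of this distance function, flattening the color windows so the sums become the vector operations described above; dividing by the window size N keeps the score in [-1, 1]. The helper name and windowing conventions are assumptions:

```python
import numpy as np

def vncc(f_window, g_window):
    """Variance-normalized cross-correlation of two equally sized color windows."""
    f = f_window.astype(float).ravel()   # flatten so RGB channels enter the sum
    g = g_window.astype(float).ravel()
    num = np.sum((f - f.mean()) * (g - g.mean()))
    den = f.size * np.sqrt(np.var(f) * np.var(g))
    return num / den if den else 0.0    # 1.0 indicates a perfect match
```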
If the search space for finding the new location of a feature is too large, locating a match can be a slow process. Thus, to minimize the search space, the process 1500 can calculate an initial guess of the feature translation. This calculation considers the acceleration and velocity of the feature in the previous frame. In beginning to track a feature, however, velocity and acceleration information likely does not exist. Thus, a
larger search space is used when first tracking a feature than in subsequent tracking of the feature, e.g., a seventy-pixel window shrunk down to sixteen pixels as the tracking progresses.
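Combining the search window with the distance function yields a block-matching sketch such as the following, which reuses the vncc helper above; the window half-size and search radius are illustrative values, not the patent's:

```python
def track_endpoint(cur_img, next_img, x, y, half=8, search=16):
    """Slide a (2*half+1)-pixel window taken around (x, y) in the current frame
    over a search neighborhood in the next frame and return the best-matching
    location.  In practice `search` would start large for the first tracked
    frame and shrink once velocity and acceleration estimates exist."""
    f = cur_img[y - half:y + half + 1, x - half:x + half + 1]
    best, best_score = (x, y), -2.0           # correlations lie in [-1, 1]
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            top, left = y + dy - half, x + dx - half
            if top < 0 or left < 0:
                continue                      # avoid negative-index wraparound
            g = next_img[top:top + 2 * half + 1, left:left + 2 * half + 1]
            if g.shape != f.shape:
                continue                      # skip windows clipped at the border
            score = vncc(f, g)
            if score > best_score:
                best, best_score = (x + dx, y + dy), score
    return best
```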
Once the translation of the endpoints of the Bezier splines is chosen for the next frame, the whole contour is moved 1514 by the translations. The contour should thus be near a local optimum of the objective in the image data in the next frame. The snakes are then optimized 1504 ("wiggled") as described above (see FIG. 11 and accompanying description). The contours in the image data are not assumed to represent rigid objects undergoing only affine transformations. The optimization process can work on rigid objects, e.g., a book being pushed across a table, and on deformable objects, e.g., a balloon being inflated.
The process 1500 continues frame to frame. Because the calculations may lead to tracking mistakes, the user remains in the loop to correct mistakes as they occur. Also, when new objects enter the scene, or when objects change shape significantly or reappear after occlusions, the user can draw or re-draw these objects. Full manual control by the user is always possible, thereby giving the user full creative control.
The manual controls available to the user should enable the user to push, pull, and move all cartoon elements to achieve any desired effects and appearances. Such manual controls can include endpoint moving, curve pulling, and smoothing. With endpoint moving, the user can click on and drag any endpoint on any contour in a given frame of animation. If there is more than one endpoint snapped to a selected endpoint, all of the endpoints move per the user's clicking and dragging. To identify whether multiple endpoints are at a user-selected endpoint, the snapping records described above are consulted. Curve pulling allows the user to push or pull on arbitrary points of the curve. To begin curve pulling, the user selects a point on the curve. The closest point on the curve to the user-selected point is found as described above with reference to curve snapping. As the user pulls the curve, the Bezier segments should be modified so that the curve passes through the point that the user desires at the same parameter value that the user originally selected. Thus, the curve pulling technique should find the smallest perturbation to the control points of a Bezier segment such that the segment passes through a specified point at a specified parameter value. Additional constraints may apply if the first and/or last points of the segment are frozen at their location. One curve pulling technique that can be used to meet these criteria is presented in Barry Fowler and Richard Bartels,
"Constraint-based Curve Manipulation," IEEE Computer Graphics and Applications, Sept. 1993, pp.43-49.
Smoothing enables the user to point to a certain section of a contour and request that the section be made smoother by, for example, positioning a smoothing tool over the contour. The snake is "wiggled" at the selected section of the contour, significantly increasing the smoothness constraint local to the indicated section. The longer that the user keeps the smoothing tool positioned over a section, the more the smoothness constraint is increased.
FIGS. 16-18 illustrate a graphical user interface that may be used to display images in a video sequence to a user and to enable the user to manipulate the images as described above to create cartoons. Referring to FIG. 16, a frame of the video sequence (the initial frame 300) is presented to the user on a display screen 1600. A toolbar 1602 includes icon buttons 1604-1616 that the user can click on using a mouse to manipulate the initial frame 300. The cursor on the graphical user interface can vary depending on the button 1604-1616 depressed by the user. Thus, the user can identify the tool currently being used by the appearance of the cursor.
A contour drawing tool button 1604 enables the user to trace a contour around an image. The user moves the cursor around images in the initial frame 300 to create contours (see FIG. 17). The contours appear in different colors and with colored dots in-between them to help the user identify individual splines. The user can click on an erase contour button 1606 and erase contours displayed on the graphical user interface. The user can also edit the contours with a contour editing button 1608 (to move contours, e.g., dragging and dropping), a split contour button 1610 (to separate contours), and a join contours button 1612 (to snap contours, e.g., close gaps). Once the contours are drawn to the user's satisfaction, the user can color the image by clicking on a coloring button 1614. To make the contoured image easier to see, either while coloring the image or while shaping the contours, the user can click on a toggle button 1616 to show the video frame or not. A colored frame 1800 in FIG. 18 shows the video frame toggled off. To move between frames in the video sequence, the user can use a frame number menu 1618. Clicking on an up arrow 1620 shows frames after the currently-displayed frame in the video sequence while clicking on a down arrow 1622 shows frames before the currently-displayed frame in the video sequence. The user can also type a frame number into the frame number menu 1618 to view that frame. The tools provided by the buttons 1604-1616 and the frame number menu 1618 need not necessarily be provided as shown. For example, some or all of the tools 1604-1618 can also or instead be provided in drop-down menu form.
Other embodiments are within the scope of the following claims.

Claims

What is claimed is:
1. A method comprising: tracing a contour around an image in a frame of media including a sequence of frames; modifying the contour to more tightly fit around the image; and coloring closed areas surrounded by the contour.
2. The method of claim 1 further comprising repeating the tracing, the modifying, and the coloring for adjacent frames in the sequence of frames.
3. The method of claim 1 further comprising tracking movement of the image from a first location in the frame to a second location in a frame adjacent to the frame in the sequence of frames, and moving the contour from the first location to the second location in the frame adjacent to the frame in the sequence of frames.
4. The method of claim 3 further comprising modifying the contour in the second location to more tightly fit around the image in the frame adjacent to the frame.
5. The method of claim 3 in which the tracking is performed using a block matching algorithm.
6. The method of claim 1 in which the modifying includes snapping the contour to another contour in the frame.
7. The method of claim 1 in which the modifying includes smoothing the contour.
8. The method of claim 1 further comprising allowing manual termination of the modifying.
9. The method of claim 1 further comprising allowing manual correction of the modifying.
10. The method of claim 1 in which the media includes video.
11. The method of claim 1 in which a human user traces the contour using a graphical user interface.
12. The method of claim 1 in which the contour includes Bezier curves.
13. The method of claim 1 in which the coloring includes manually selecting a color to use in the coloring.
14. The method of claim 1 in which the coloring includes automatically selecting a color to use in the coloring.
15. The method of claim 1 in which the modifying is performed automatically.
16. An article comprising a machine-readable medium which stores machine-executable instructions, the instructions causing a machine to: enable a user to trace a contour around an image in a frame of media including a sequence of frames; modify the contour to more tightly fit around the image; and color closed areas surrounded by the contour.
17. The article of claim 16 further causing a machine to repeat the tracing, the modifying, and the coloring for adjacent frames in the sequence of frames.
18. The article of claim 16 further causing a machine to track movement of the image from a first location in the frame to a second location in a frame adjacent to the frame in the sequence of frames, and move the contour from the first location to the second location in the frame adjacent to the frame in the sequence of frames.
19. The article of claim 18 further causing a machine to automatically modify the contour in the second location to more tightly fit around the image in the frame adjacent to the frame.
20. The article of claim 18 in which the tracking is performed using a block matching algorithm.
21. The article of claim 16 in which the modifying includes snapping the contour to another contour in the frame.
22. The article of claim 16 in which the modifying includes smoothing the contour.
23. The article of claim 16 further causing a machine to allow the user to manually terminate the modifying.
24. The article of claim 16 further causing a machine to allow the user to manually correct the modifying.
25. The article of claim 16 in which the media includes video.
26. The article of claim 16 further causing a machine to provide a graphical user interface that enables the user to trace the contour.
27. The article of claim 16 in which the contour includes Bezier curves.
28. The article of claim 16 in which the coloring includes the user manually selecting a color to use in the coloring.
29. The article of claim 16 in which the coloring includes automatically selecting a color to use in the coloring.
30. The article of claim 16 in which the modifying is performed automatically.
31. An apparatus comprising: a display device configured to display a graphical user interface that enables a user to trace a contour around an image in a frame of media including a sequence of frames; and a mechanism configured to automatically modify the contour to more tightly fit around the image and configured to color closed areas surrounded by the contour.
32. The apparatus of claim 31 in which the mechanism is also configured to track movement of the image from a first location in the frame to a second location in a frame adjacent to the frame in the sequence of frames, and move the contour from the first location to the second location in the frame adjacent to the frame in the sequence of frames.
33. The apparatus of claim 32 in which the mechanism is also configured to modify the contour in the second location to more tightly fit around the image in the frame adjacent to the frame.
34. The apparatus of claim 32 in which the tracking is performed using a block matching algorithm.
35. The apparatus of claim 31 in which the modifying includes snapping the contour to another contour in the frame.
36. The apparatus of claim 31 in which the modifying includes smoothing the contour.
37. The apparatus of claim 31 in which the media includes video.
38. The apparatus of claim 31 in which the user can manually change the mechanism's modifications using the graphical user interface.
39. The apparatus of claim 31 in which the contour includes Bezier curves.
40. The apparatus of claim 31 in which the coloring includes the user manually selecting a color to use in the coloring.
41. The apparatus of claim 31 in which the coloring includes automatically selecting a color to use in the coloring.
42. The apparatus of claim 31 in which the mechanism includes software.
43. A method comprising: attempting to maximize a curve's overlap with edges of an image that the curve surrounds by maximizing an objective function including the weighted sum of terms including an edge overlap term defining the negative of a line integral of the curve across the image, a curvature term defining an integral of the curvature along the curve, a corner sharpness term defining an angle between tangent lines of the curve where the curve connects to another curve, and a deformation term defining dissimilarity between a current position of the curve and a previous position of the curve.
44. The method of claim 43 in which the curve includes a Bezier spline.
45. The method of claim 43 further comprising applying a Sobel edge detection filter to the edges of the image before maximizing the objective function.
46. The method of claim 43 further comprising continuing to maximize the objective function until the maximizing is terminated by a human user.
47. A method comprising: determining that a gap exists between a first endpoint of a first contour and a second endpoint of a second contour; moving the first endpoint to a location of the second endpoint; and moving the first contour a distance equal to a distance that the first endpoint moved to the location of the second endpoint.
48. The method of claim 47 in which the first contour and the second contour surround an image in a frame of media.
49. The method of claim 47 further comprising, after the moving, modifying the first contour and the second contour to more tightly fit around the image.
PCT/US2001/050421 2000-10-24 2001-10-24 Creating cartoons WO2002035472A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US96412800A 2000-10-24 2000-10-24
US09/964,128 2000-10-24

Publications (2)

Publication Number Publication Date
WO2002035472A2 true WO2002035472A2 (en) 2002-05-02
WO2002035472A3 WO2002035472A3 (en) 2002-11-21

Family

ID=25508160

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/050421 WO2002035472A2 (en) 2000-10-24 2001-10-24 Creating cartoons

Country Status (1)

Country Link
WO (1) WO2002035472A2 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4357624A (en) * 1979-05-15 1982-11-02 Combined Logic Company Interactive video production system
US5966134A (en) * 1996-06-28 1999-10-12 Softimage Simulating cel animation and shading
GB2333678A (en) * 1997-08-04 1999-07-28 Sony Corp Image data processing apparatus and method, and transmission medium
WO2000075869A1 (en) * 1999-06-03 2000-12-14 Fluency Research & Development Co., Ltd. Image processing method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ROGERS L S ET AL: "A new technique for automatic contour extraction" ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, 1994. ENGINEERING ADVANCES: NEW OPPORTUNITIES FOR BIOMEDICAL ENGINEERS., PROCEEDINGS OF THE 16TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE BALTIMORE, MD, USA 3-6 NOV. 1994, NEW YORK, NY, USA,IEEE, US, 3 November 1994 (1994-11-03), pages 608-609, XP010145417 ISBN: 0-7803-2050-6 *

Also Published As

Publication number Publication date
WO2002035472A3 (en) 2002-11-21

Similar Documents

Publication Publication Date Title
CN102067173B (en) Defining a border for an image
US8451277B2 (en) Tight inbetweening
US8571326B2 (en) Defining a border for an image
US8452105B2 (en) Selecting a section of interest within an image
US8280171B2 (en) Tools for selecting a section of interest within an image
US6249285B1 (en) Computer assisted mark-up and parameterization for scene analysis
Xing et al. Autocomplete hand-drawn animations
O'Donovan et al. Anipaint: Interactive painterly animation from video
US8548251B2 (en) Defining a border for an image
US9142056B1 (en) Mixed-order compositing for images having three-dimensional painting effects
US20070154110A1 (en) Non-photorealistic sketching
CN103443826B (en) mesh animation
WO2001026050A2 (en) Improved image segmentation processing by user-guided image processing techniques
US20140210944A1 (en) Method and apparatus for converting 2d video to 3d video
US20210248799A1 (en) Path-Constrained Drawing With Visual Properties Based On Drawing Tool
WO2002035472A2 (en) Creating cartoons
JP4066585B2 (en) Shape creation method
US10964081B2 (en) Deformation mesh control for a computer animated artwork
Han et al. A video-based system for hand-driven stop-motion animation
TWI557685B (en) Mesh animation
JP2921312B2 (en) 3D animation processing system
Zheng et al. A Video-based Interface for Hand-Driven Stop Motion Animation Production
Chen Computer-assisted in between generation for cel animation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
AK Designated states

Kind code of ref document: A3

Designated state(s): US

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 69(1) EPC (EPO FORM 1205A DATED 21-08-2003).

122 Ep: pct application non-entry in european phase