WO1997033249A1 - Method and device for handwritten character recognition - Google Patents

Publication number
WO1997033249A1
Authority
WO
WIPO (PCT)
Prior art keywords
characters
character
combined
possible characters
value
Prior art date
Application number
PCT/US1997/004349
Other languages
French (fr)
Inventor
Farzad Ehsani
Liyang Zhou
Original Assignee
Motorola Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Motorola Inc.
Priority to AU22168/97A (patent AU726852B2)
Priority to EP97915155A (patent EP0896704A1)
Priority to IL12564897A (patent IL125648A0)
Publication of WO1997033249A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/14: Image acquisition
    • G06V30/148: Segmentation of character regions
    • G06V30/153: Segmentation of character regions using recognition of characters or words
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition

Definitions

  • This invention relates generally to handwriting recognition by a character recognizer, and more particularly to improving recognition of handwritten characters using a post-processing method and device.
  • Conventional character recognizers have approximately a 70 to 80 percent accuracy rate when attempting to correctly recognize handwritten characters from a digitizing tablet or other input device, yielding a 15 to 30 percent error rate. This accuracy rate is not good enough for the average user to feel confident in the ability of the recognizer.
  • character recognizers can be useful and valuable. For instance, character recognizers can be useful in conferences or seminars where a user does not bring in a keyboard but desires to electronically take notes. A character recognizer would then be used. If the character recognizer does not have a fairly high rate of accuracy, the notes taken during the seminar may become misleading.
  • character recognizers may be valuable in hospitals if the character recognizer has a high rate of accuracy.
  • Hand-held character recognizers would allow hospital personnel to check patients and enter by hand reports which may be life saving. Without a high recognition rate, lives may be endangered.
  • One very useful application for character recognizers is inputting Chinese characters for electronic processing and storage. Chinese characters do not lend themselves well to keyboard entry, making word processing in the Chinese language difficult. Chinese characters are complex, and changing a small portion of a character may entirely change the meaning of the character or word. A high rate of accuracy is necessary for Chinese character recognition. Unfortunately, conventional character recognizers and recognition processes have not achieved the high accuracy necessary for these varying applications.
  • a method comprising the steps of: choosing a number of template characters from a template character set which are likely to resemble a handwritten character thereby providing a set of possible characters, each of the possible characters having a value representing a degree of similarity with the handwritten character; and processing the possible characters according to a language model to determine which of the possible characters most resembles the handwritten character.
  • the step of processing the possible characters according to a language model preferably includes: combining each of the possible characters with a surrounding character to form combined characters; assigning a combined value to each of the combined characters where the combined value represents a probability that the surrounding character would be in combination with a respective one of the possible characters; and resorting the possible characters.
  • it includes comparing each of the possible characters with a surrounding character to determine a probability that the surrounding character would be in combination with a respective one of the possible characters; and determining from the probability for each of the possible characters which of the possible characters most resembles the handwritten character.
  • the value for each of the possible characters and the combined value of the combined characters may be weighted to determine a weighted value for each of the possible characters; and these may be ordered for each of the possible characters to determine a sequential order for resorting the possible characters.
  • a recognizer comprising: a character recognizer coupled to a handwriting input device, to choose a number of template characters from a template character set which are likely to resemble a handwritten character (possible characters), each of the possible characters having a value representing a degree of similarity with the handwritten character; a post-processor coupled to the character recognizer to process the possible characters according to a language model to determine which of the possible characters most resembles the handwritten character; and a display device coupled to the post-processor to receive the one of the possible characters most resembling the handwritten character.
  • FIG. 1 is a block diagram illustrating a preferred embodiment of the present invention.
  • FIG. 2 is a flow chart illustrating a method of performing the present invention.
  • FIG. 3 is a flow chart illustrating the method of performing the present invention according to the preferred embodiment.
  • FIG. 4 shows an example of the operation of a language modeling post-processor according to the present invention.
  • FIG. 1 illustrates, with reference also to FIG. 2, a device and method, according to the present invention, for improving the accuracy of handwritten character recognition.
  • Handwritten character recognizing devices, such as character recognizing device 100, generally include some sort of handwriting input device or tablet 110 allowing a user to enter handwritten characters to character recognizing device 100. It will be noted at this point that character recognizing devices may also receive input through devices other than tablets. For instance, handwritten characters may be input to character recognizing device 100 via facsimile or any other media in addition to tablet 110.
  • handwritten characters are input from tablet 110 to a character recognizer 120 (Step 200 of FIG. 2).
  • the character recognizer 120 chooses characters from a predetermined template character set 125 (step 210) for comparison with the handwritten character.
  • the predetermined template characters of template character set 125 are the characters used in the language for which character recognizing device 100 is designed. For instance, if English handwritten characters are being input to character recognizing device 100, template character set 125 will contain information representing English characters in some form, such as longhand, print, or a combination of styles. If, for instance, recognizer 100 is designed for Chinese character input, template character set 125 will contain information representing Chinese characters in such styles (cursive or printed) as character recognizing device 100 is designed for.
  • Character recognizer 120 compares each input handwritten character to the characters stored in template character set 125 and chooses a number of the characters, or possible characters, which most closely resemble the input character.
  • In a preferred embodiment, character recognizer 120 chooses 10 characters from the template character set 125. To each of these possible characters (10 in the preferred embodiment), character recognizer 120 assigns a score (or value) that represents the degree of similarity between the respective possible character and the input character (step 220 of FIG. 2). Character recognizer 120 then prioritizes the possible characters according to their respective scores (step 240), placing them in sequential order with the possible character whose score indicates the nearest similarity at the top of the list.
  • handwritten characters which are processed simply by a character recognizer have approximately a 15 to 30 percent error rate when choosing the top prioritized possible character.
  • the probability that the top prioritized possible character chosen by character recognizer 120 is actually the same as the handwritten input character is about 80 to 85 percent.
  • There is a 92 to 96 percent probability that the actual handwritten input character is one of the number of possible characters chosen by character recognizer 120 when the total number of possible characters is 10 pursuant to the preferred embodiment. This accuracy is nearly the same as the degree of accuracy most people have when reading handwritten characters, which accuracy is about 95 to 97 percent.
  • the accuracy of the 10 chosen possible characters is capitalized upon through the method described below to increase the probability that the character chosen as the top prioritized possible character is the same as the handwritten character.
  • the present invention contemplates further analyzing and processing the number of possible characters generated by character recognizer 120 to improve recognition accuracy.
  • the additional analysis and processing (post-processing) focuses on the 10 possible characters.
  • character recognizer 120 outputs the list of 10 possible characters to a post- processor 130 (step 250).
  • Post-processor 130 processes the 10 possible characters according to a language model to select which of the 10 possible characters is a best-fit character (step 260).
  • the language model post-processing chooses one of the 10 possible characters for output. This yields approximately a 90 to 92 percent probability that the character which is output, or best-fit character, is the same as the input handwritten character.
  • Language modeling is a process where each possible character processed is compared with a surrounding character to determine the probability that the possible character could be properly used in combination with such surrounding characters in the language being used. This process will be described in detail later.
  • After post-processor 130 has chosen a best-fit character from the 10 possible characters, post-processor 130 outputs the best-fit character (step 270). In the preferred embodiment shown in FIG. 1, the best-fit character is output to digitizing display 110 and displayed to the user.
  • the flow chart of FIG. 3 shows a preferred embodiment of the post-processing method and is described in conjunction with the preferred embodiment of FIG. 1.
  • the top prioritized possible character is chosen as the best-fit character (step 325) and output to the digitizing display 110 of the preferred embodiment (step 380). Choosing the top prioritized possible character as the best-fit character simply means that no further processing is chosen and character recognizer 120 operates in a conventional manner with the output (top prioritized possible character) sent directly to the digitizing display 110.
  • language model processor 140 which includes combiner 142, scoring device 144, and language model library 145.
  • Language model processor 140 compares each of the possible characters from character recognizer 120 with surrounding characters to determine the probability that the possible character could be in combination with the surrounding characters.
  • the surrounding characters are usually characters which have already been recognized and are stored by the computing device (Surrounding Character 141), but may also be numbers, indications of the beginning of a sentence or word, words from a different language (such as English company names used while writing Chinese characters), etc.
  • the surrounding characters may also be characters which have not been recognized, such as a character subsequent in sequence to the handwritten character currently being recognized.
  • FIG. 4 illustrates the language model post-processing method using an example of two letters.
  • two letters are assumed to have been input as handwritten characters.
  • the first character in slot b will be assumed to have been processed previously and correctly by character recognizer 120 and confirmed as the letter "h".
  • the second character in slot a is the character to be recognized.
  • character recognizer 120 generates a number of possible characters, which for the preferred embodiment is 10 possible characters, listed in FIG. 4 as a_1 to a_n where n is 10 (column 400).
  • Scoring device 144 obtains from language model library 145, for each of the combinations, a predetermined probability (combined score) that the adjacent character in slot b, "h", will be combined with each of the possible characters a_1 through a_n.
  • each of the combinations is assigned its respective combined score (step 340 and column 420). For instance, if character recognizer 120 determined a_1 to be "a", and the letter in slot b was already determined to be "h", the probability that these two letters would be combined in sequence would be very high since "h" and "a" are combined in sequence in many different words.
  • the combined score representing this probability found in language model library 145 would be high and an appropriate combined score would be assigned to "ha".
  • resorter 150 obtains the combined scores from scoring device 144, generates an order from the combined scores, and resorts the number of possible characters based upon that order. Specifically, a weighting element 152 of resorter 150 weights each of the combined scores from the scoring device 144 with the score of its corresponding number of possible characters to determine a weighted score for each of the number of possible characters (step 350). The weighting is calculated for each of the number of possible characters by: (i) multiplying the score (see previous discussion with respect to step 220 of FIG.
  • The weighting factor for the character recognizer (CR) and the weighting factor for the language model (LM) combined equal 1. Further, at optimum values for the weighting factors, the CR weighting factor is greater than the LM weighting factor, and the LM weighting factor is equal to 0.33. A user may choose a value for the LM weighting factor which is greater than or less than the optimum value, depending upon the desired output, and the choice may be input manually into weighting element 152.
  • Reorderer 154 of resorter 150 receives the weighted scores from weighting element 152 and places the weighted scores in sequential order. In the preferred embodiment, the weighted scores are ordered from highest to lowest. Reorderer 154 then resorts the number of possible characters according to this order, and chooses the best-fit character from the reordered number of possible characters (steps 360 and 370). The best-fit character is then output (step 380).
  • Post-processing of the output of character recognizers is necessary in order to improve the rate of accuracy of selecting a single possible character representing an input handwritten character. Without the additional accuracy of post-processing, character recognizers will probably not become commercially viable.
  • the probability of selecting a single possible character which is the same as a handwritten character increases from roughly 84% to approximately 90 to 92 percent. This recognition accuracy brings handwriting recognition into an acceptable range for consumer use.
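
The weighting and resorting steps described above could be sketched in Python roughly as follows. This is an illustrative sketch only, not the patented implementation: the text confirms that the two weighting factors sum to 1 and that the language model factor is 0.33 at its optimum, but the linear sum of the two weighted scores, the function name `resort`, the sample scores, and the assumption that both scores lie on a comparable 0-to-1 scale are assumptions made here.

```python
from typing import Dict, List, Tuple


def resort(
    candidates: List[Tuple[str, float]],
    combined_scores: Dict[str, float],
    lm_weight: float = 0.33,
) -> List[Tuple[str, float]]:
    """Weight recognizer scores with language model scores and reorder.

    candidates: (character, recognizer score) pairs from the character recognizer.
    combined_scores: language model score for each candidate (missing means 0.0).
    lm_weight: language model weighting factor; the recognizer factor is 1 - lm_weight.
    """
    if not 0.0 <= lm_weight <= 1.0:
        raise ValueError("lm_weight must lie between 0 and 1")
    cr_weight = 1.0 - lm_weight  # the two weighting factors sum to 1
    weighted = [
        # Assumption: the two weighted scores are added together.
        (char, cr_weight * score + lm_weight * combined_scores.get(char, 0.0))
        for char, score in candidates
    ]
    # Order from highest to lowest weighted score; the first entry is the best fit.
    weighted.sort(key=lambda pair: pair[1], reverse=True)
    return weighted


# Hypothetical scores: the recognizer slightly prefers "n", the language model prefers "a".
recognizer_output = [("n", 0.80), ("a", 0.78), ("z", 0.60)]
language_model_output = {"n": 0.01, "a": 0.60, "z": 0.0}
ranked = resort(recognizer_output, language_model_output)
best_fit_character = ranked[0][0]
```

With these invented numbers the language model score is enough to move "a" ahead of "n", which is the kind of reordering the resorter is described as performing.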

Abstract

A handwritten character recognizing device (100) and method for handwritten character recognition improve the accuracy of conventional character recognizers by using a post-processor (130). The handwritten character recognizing device (100) includes a character recognizer (120), a language model processor (140) which compares outputs from the character recognizer (120) with surrounding handwritten characters to determine a probability that the outputs from the character recognizer (120) would be in combination with the surrounding handwritten characters, and a resorter (150) which weights the probabilities determined in the language model processor (140) with the character recognizer and resorts the character recognizer (120) outputs.

Description

METHOD AND DEVICE FOR HANDWRITTEN CHARACTER RECOGNITION
Field of the Invention
This invention relates generally to handwriting recognition by a character recognizer, and more particularly to improving recognition of handwritten characters using a post-processing method and device.
Background of the Invention
Several contemporary hand-held devices have attempted to permit users to enter text characters or alphanumeric information via a stylus and a digitizing tablet. For example, the Newton™ personal digital assistant, by Apple Computer, and the Marco™ wireless personal communicator, by Motorola, accept handwritten characters (as opposed to cursive handwriting) on a display screen. Such a device attempts to convert the handwritten characters into a typewritten representation of the handwritten characters. The concept of converting handwritten entries into typewritten representations is expanding to recognition of characters written in Chinese and other languages. Part of the known process of converting handwritten characters into typewritten representations includes comparing the elements of the handwritten characters with a predetermined set of template characters. This is generally performed by a character recognizer that determines which of a set of template characters in a template character library most closely resembles a given handwritten character. The determination is generally based upon probabilities that the chosen template characters will be the same as the handwritten character. Conventional character recognizers have approximately a 70 to 80 percent accuracy rate when attempting to correctly recognize handwritten characters from a digitizing tablet or other input device, yielding a 15 to 30 percent error rate. This accuracy rate is not good enough for the average user to feel confident in the ability of the recognizer.
There are several areas where character recognizers can be useful and valuable. For instance, character recognizers can be useful in conferences or seminars where a user does not bring in a keyboard but desires to electronically take notes. A character recognizer would then be used. If the character recognizer does not have a fairly high rate of accuracy, the notes taken during the seminar may become misleading.
Use of character recognizers may be valuable in hospitals if the character recognizer has a high rate of accuracy. Hand-held character recognizers would allow hospital personnel to check patients and enter by hand reports which may be life saving. Without a high recognition rate, lives may be endangered.
One very useful application for character recognizers is inputting Chinese characters for electronic processing and storage. Chinese characters do not lend themselves well to keyboard entry, making word processing in the Chinese language difficult. Chinese characters are complex, and changing a small portion of a character may entirely change the meaning of the character or word. A high rate of accuracy is necessary for Chinese character recognition. Unfortunately, conventional character recognizers and recognition processes have not achieved the high accuracy necessary for these varying applications.
Summary of the Invention
According to a first aspect of the present invention, a method is provided comprising the steps of: choosing a number of template characters from a template character set which are likely to resemble a handwritten character thereby providing a set of possible characters, each of the possible characters having a value representing a degree of similarity with the handwritten character; and processing the possible characters according to a language model to determine which of the possible characters most resembles the handwritten character. The step of processing the possible characters according to a language model preferably includes: combining each of the possible characters with a surrounding character to form combined characters; assigning a combined value to each of the combined characters where the combined value represents a probability that the surrounding character would be in combination with a respective one of the possible characters; and resorting the possible characters. Additionally or in the alternative it includes comparing each of the possible characters with a surrounding character to determine a probability that the surrounding character would be in combination with a respective one of the possible characters; and determining from the probability for each of the possible characters which of the possible characters most resembles the handwritten character.
The value for each of the possible characters and the combined value of the combined characters may be weighted to determine a weighted value for each of the possible characters; and these may be ordered for each of the possible characters to determine a sequential order for resorting the possible characters.
According to a second aspect of the present invention, a recognizer is provided comprising: a character recognizer coupled to a handwriting input device, to choose a number of template characters from a template character set which are likely to resemble a handwritten character (possible characters), each of the possible characters having a value representing a degree of similarity with the handwritten character; a post-processor coupled to the character recognizer to process the possible characters according to a language model to determine which of the possible characters most resembles the handwritten character; and a display device coupled to the post-processor to receive the one of the possible characters most resembling the handwritten character.
Further aspects and embodiments of the invention are illustrated in the following detailed description, which is given by way of illustration and example only.
Brief Description of the Drawings
FIG. 1 is a block diagram illustrating a preferred embodiment of the present invention.
FIG. 2 is a flow chart illustrating a method of performing the present invention.
FIG. 3 is a flow chart illustrating the method of performing the present invention according to the preferred embodiment.
FIG. 4 shows an example of the operation of a language modeling post-processor according to the present invention.
Detailed Description of the Preferred Embodiment
FIG. 1 illustrates, with reference also to FIG. 2, a device and method, according to the present invention, for improving the accuracy of handwritten character recognition. Handwritten character recognizing devices, such as character recognizing device 100, generally include some sort of handwriting input device or tablet 110 allowing a user to enter handwritten characters to character recognizing device 100. It will be noted at this point that character recognizing devices may also receive input through devices other than tablets. For instance, handwritten characters may be input to character recognizing device 100 via facsimile or any other media in addition to tablet 110.
Referring to the preferred embodiment of FIG. 1, with reference to the flow chart of FIG. 2, handwritten characters are input from tablet 110 to a character recognizer 120 (Step 200 of FIG. 2). The character recognizer 120, in this preferred embodiment, chooses characters from a predetermined template character set 125 (step 210) for comparison with the handwritten character. The predetermined template characters of template character set 125 are the characters used in the language for which character recognizing device 100 is designed. For instance, if English handwritten characters are being input to character recognizing device 100, template character set 125 will contain information representing English characters in some form, such as longhand, print, or a combination of styles. If, for instance, recognizer 100 is designed for Chinese character input, template character set 125 will contain information representing Chinese characters in such styles (cursive or printed) as character recognizing device 100 is designed for.
Character recognizer 120 compares each input handwritten character to the characters stored in template character set 125 and chooses a number of the characters, or possible characters, which most closely resemble the input character. In a preferred embodiment, character recognizer 120 chooses 10 characters from the template character set 125. To each of these possible characters (10 in the preferred embodiment), character recognizer 120 assigns a score (or value) that represents the degree of similarity between the respective possible character and the input character (step 220 of FIG. 2). Character recognizer 120 then prioritizes the possible characters according to their respective scores (step 240), placing them in sequential order with the possible character whose score indicates the nearest similarity at the top of the list.
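
The selection and prioritization of possible characters might be sketched in Python as below. This is a simplified illustration under stated assumptions, not the recognizer the patent describes: the feature vectors, the `toy_similarity` measure, the `TEMPLATES` table, and the function name `top_candidates` are all hypothetical, since the text does not specify how similarity is computed.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Features = Sequence[float]


def toy_similarity(a: Features, b: Features) -> float:
    """Hypothetical similarity in (0, 1]; 1.0 means identical feature vectors."""
    return 1.0 / (1.0 + math.dist(a, b))


def top_candidates(
    handwritten: Features,
    templates: Dict[str, Features],
    similarity: Callable[[Features, Features], float] = toy_similarity,
    n: int = 10,
) -> List[Tuple[str, float]]:
    """Score every template against the input and keep the n best, best first."""
    scored = [
        (label, similarity(handwritten, features))
        for label, features in templates.items()
    ]
    # Highest score (nearest similarity) goes to the top of the list.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Invented two-dimensional feature vectors standing in for template character set 125.
TEMPLATES = {"a": (0.9, 0.1), "o": (0.8, 0.2), "h": (0.1, 0.9), "z": (0.5, 0.5)}
candidates = top_candidates((0.88, 0.12), TEMPLATES, n=3)
```

Here `n=3` keeps the toy example short; the preferred embodiment in the text uses 10 possible characters.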
As mentioned previously, handwritten characters which are processed simply by a character recognizer have approximately a 15 to 30 percent error rate when choosing the top prioritized possible character. Put another way, the probability that the top prioritized possible character chosen by character recognizer 120 is actually the same as the handwritten input character is about 80 to 85 percent. There is a 92 to 96 percent probability that the actual handwritten input character is one of the number of possible characters chosen by character recognizer 120 when the total number of possible characters is 10 pursuant to the preferred embodiment. This accuracy is nearly the same as the degree of accuracy most people have when reading handwritten characters, which accuracy is about 95 to 97 percent. According to the teachings of the present invention, the accuracy of the 10 chosen possible characters is capitalized upon through the method described below to increase the probability that the character chosen as the top prioritized possible character is the same as the handwritten character. The present invention contemplates further analyzing and processing the number of possible characters generated by character recognizer 120 to improve recognition accuracy. The additional analysis and processing (post-processing) focuses on the 10 possible characters.
Referring again to Figures 1 and 2, according to the preferred embodiment of the invention, character recognizer 120 outputs the list of 10 possible characters to a post-processor 130 (step 250). Post-processor 130 processes the 10 possible characters according to a language model to select which of the 10 possible characters is a best-fit character (step 260). In other words, the language model post-processing chooses one of the 10 possible characters for output. This yields approximately a 90 to 92 percent probability that the character which is output, or best-fit character, is the same as the input handwritten character.
Language modeling is a process where each possible character processed is compared with a surrounding character to determine the probability that the possible character could be properly used in combination with such surrounding characters in the language being used. This process will be described in detail later.
After post-processor 130 has chosen a best-fit character from the 10 possible characters, post-processor 130 outputs the best-fit character (step 270). In the preferred embodiment shown in FIG. 1, the best-fit character is output to digitizing display 110 and displayed to the user.
The flow chart of FIG. 3 shows a preferred embodiment of the post-processing method and is described in conjunction with the preferred embodiment of FIG. 1. After character recognizer 120 has performed the character recognition process to generate the list of number of possible characters (steps 300 and 310 of FIG. 3) (as taught earlier, the preferred embodiment uses 10 possible characters), a user must determine if the character recognition output is to be further processed (step 320). The choice made by the user is represented in FIG. 1 as switch 160.
There may be circumstances where the added accuracy of the post-processor 130 is not needed. Under these circumstances, a user may choose to use only the character recognizer 120 output.
If the user chooses not to further process the output of character recognizer 120, the top prioritized possible character is chosen as the best-fit character (step 325) and output to the digitizing display 110 of the preferred embodiment (step 380). Choosing the top prioritized possible character as the best-fit character simply means that no further processing is chosen and character recognizer 120 operates in a conventional manner with the output (top prioritized possible character) sent directly to the digitizing display 110.
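
The user's choice between the two paths (switch 160, step 320) could be illustrated with a short Python sketch. The function name `best_fit`, its parameters, and the callable standing in for post-processor 130 are hypothetical and are used only to show the branching described above.

```python
from typing import Callable, List, Tuple

Candidate = Tuple[str, float]  # (possible character, recognizer score)


def best_fit(
    candidates: List[Candidate],
    post_processing_enabled: bool,
    post_processor: Callable[[List[Candidate]], str],
) -> str:
    """Return the best-fit character for a prioritized list of possible characters."""
    if not candidates:
        raise ValueError("the recognizer produced no possible characters")
    if not post_processing_enabled:
        # Conventional operation: the top prioritized possible character is output directly.
        return candidates[0][0]
    # Otherwise the whole list is handed to the language model post-processing.
    return post_processor(candidates)
```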
If the user determines that additional processing, and additional accuracy of post-processor 130, is needed, the number of possible characters are passed to language model processor 140 which includes combiner 142, scoring device 144, and language model library 145. Language model processor 140 compares each of the possible characters from character recognizer 120 with surrounding characters to determine the probability that the possible character could be in combination with the surrounding characters. The surrounding characters are usually characters which have already been recognized and are stored by the computing device (Surrounding Character 141), but may also be numbers, indications of the beginning of a sentence or word, words from a different language (such as English company names used while writing Chinese characters), etc. The surrounding characters may also be characters which have not been recognized, such as a character subsequent in sequence to the handwritten character currently being recognized. The present invention looks to any information surrounding the handwritten character which is stored in previous character 141, since the surrounding information will always have some information that assists in the language model process of language model processor 140. Generally, the previous adjacent character which has been correctly recognized will be the surrounding character compared with the possible characters in language model processor 140.
FIG. 4 illustrates the language model post-processing method using an example of two letters. In the example, two letters are assumed to have been input as handwritten characters. The first character in slot b will be assumed to have been processed previously and correctly by character recognizer 120 and confirmed as the letter "h". The second character in slot a is the character to be recognized.
As explained above, character recognizer 120 generates a number of possible characters, which for the preferred embodiment is 10 possible characters, listed in FIG. 4 as $a_1$ to $a_n$ where n is 10 (column 400). These template characters are then combined in combiner 142 (step 330 of FIG. 3) with, in the preferred embodiment, the previous correctly recognized adjacent handwritten character of slot b, or the letter "h". The result is the combination of the letter in slot b, or in this example "h", with each of the possible characters $a_1$ through $a_n$ (column 410).
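The combining step (step 330) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patent's implementation: the function name `combine_with_previous` and the sample candidate letters are hypothetical, and the list is shortened from the 10 candidates of the preferred embodiment.

```python
def combine_with_previous(previous_char, candidates):
    """Pair the already-recognized character in slot b with each candidate for slot a."""
    return [previous_char + candidate for candidate in candidates]


# Hypothetical candidates for slot a; "h" is the confirmed character in slot b.
candidates = ["a", "o", "u", "e", "c"]
combined = combine_with_previous("h", candidates)
print(combined)
```

Each resulting pair corresponds to one row of column 410 in FIG. 4 and is what the scoring device receives next.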
The combinations of slot b with $a_1$ through $a_n$ are received in scoring device 144 of FIG. 1. Scoring device 144 obtains from language model library 145, for each of the combinations, a predetermined probability (combined score) that the adjacent character in slot b, "h", will be combined with the respective possible character $a_1$ through $a_n$. Next, each of the combinations is assigned its respective combined score (step 340 and column 420). For instance, if character recognizer 120 determined $a_1$ to be "a", and the letter in slot b was already determined to be "h", the probability that these two letters would be combined in sequence would be very high, since "h" and "a" are combined in sequence in many different words. The combined score representing this probability found in language model library 145 would be high, and an appropriate combined score would be assigned to "ha".
If the letters were, for instance, $a_n$ = "z" and b = "m", the probability that the two letters are combined, based upon common English words, is very low. In another language the probability may be higher. Language model library 145 is designed for specific uses and languages. The low combined score reflecting this unusual combination might be close to zero, or even zero if the probability is so low that the combination is not included in the language model library 145. This combined score would then be assigned to the combination "mz". In the above example, each of the possible characters is combined with the previous adjacent character. In fact, any combination of surrounding characters is contemplated in the present invention. The above example using the previous adjacent character in the language model process is a preferred embodiment.
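The look-up of a combined score (step 340) can be illustrated as a table keyed by character pairs. The sketch below is a hedged example only: `BIGRAM_SCORES` is a hypothetical stand-in for language model library 145, its probabilities are invented for illustration, and `combined_score` is an assumed helper name. A pair missing from the table scores zero, matching the case described above for "mz".

```python
# Hypothetical stand-in for language model library 145.
# The probabilities are invented for illustration, not taken from the patent.
BIGRAM_SCORES = {"ha": 0.30, "he": 0.45, "ho": 0.15}


def combined_score(previous_char, candidate, library=BIGRAM_SCORES):
    """Return the stored probability for the pair, or 0.0 if the pair is not in the library."""
    return library.get(previous_char + candidate, 0.0)


print(combined_score("h", "a"))  # common pair, found in the table
print(combined_score("m", "z"))  # rare pair, absent from the table
```

A library built for another language or use would simply supply a different table, which is why the same pair can score differently from one library to the next.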
Referring again to FIGS. 1, 3, and 4, resorter 150 obtains the combined scores from scoring device 144, generates an order from the combined scores, and resorts the possible characters based upon that order. Specifically, a weighting element 152 of resorter 150 weights each of the combined scores from the scoring device 144 with the score of its corresponding possible character to determine a weighted score for each of the possible characters (step 350). The weighting is calculated for each of the possible characters by: (i) multiplying the score (see previous discussion with respect to step 220 of FIG. 2) from the character recognizer 120 with a first weighting factor $\lambda_{CR}$ to obtain a weighted possible character score, (ii) multiplying the corresponding combined score with a second weighting factor $\lambda_{LM}$ to obtain a weighted combined character score, and (iii) combining the two scores to obtain a weighted score for each of the possible characters (column 430 of FIG. 4). The equation is: $S_{TOT,a_n} = \lambda_{CR} S_{CR,a_n} + \lambda_{LM} S_{LM,a_n}$.
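As a worked illustration of the weighting in step 350, the sketch below blends one recognizer score with one combined score. It rests on assumptions that are not fixed by this paragraph: the function name `weighted_score` is hypothetical, the scores are invented and taken to lie on a 0-to-1 scale where higher means a better match, the two weighting factors are assumed to sum to 1, and the default second weighting factor of 0.33 is the value the description gives for the preferred embodiment.

```python
def weighted_score(recognizer_score, combined_score, lm_weight=0.33):
    """Blend the recognizer score and the language-model (combined) score.

    The two weighting factors are assumed to sum to 1, so the
    recognizer weight is derived from lm_weight.
    """
    cr_weight = 1.0 - lm_weight
    return cr_weight * recognizer_score + lm_weight * combined_score


# Hypothetical scores for two candidates following "h".
score_a = weighted_score(0.80, 0.30)  # candidate "a"
score_o = weighted_score(0.85, 0.15)  # candidate "o"
print(round(score_a, 3), round(score_o, 3))
```

With these invented numbers the candidate "o" has the higher recognizer score, but "a" ends up with the higher weighted score because "ha" is the more probable pair.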
In the preferred embodiment, $\lambda_{CR}$ and $\lambda_{LM}$ combined equal 1. Further, at optimum values for the weighting factors, $\lambda_{CR}$ is greater than $\lambda_{LM}$, and $\lambda_{LM}$ is equal to 0.33. A user may choose a value for $\lambda_{LM}$ which is greater than or less than the optimum value, depending upon the desired output, and the choice may be input manually into weighting element 152. Reorderer 154 of resorter 150 receives the weighted scores from weighting element 152 and orders the weighted scores in sequential order. In the preferred embodiment, the weighted scores are ordered from highest to lowest. Reorderer 154 then resorts the possible characters according to this order and chooses the best-fit character from the reordered possible characters (steps 360 and 370). The best-fit character is then output (step 380).
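The reordering and selection (steps 360 and 370) amount to a descending sort followed by taking the first entry. The following is a minimal sketch, assuming the weighted scores have already been computed; the function name `resort_candidates` and the sample scores are hypothetical.

```python
def resort_candidates(weighted):
    """Order candidate characters from highest to lowest weighted score.

    `weighted` maps each candidate character to its weighted score.
    Python's sort is stable, so tied candidates keep their original order.
    """
    return sorted(weighted, key=weighted.get, reverse=True)


# Hypothetical weighted scores for three candidates following "h".
weighted = {"o": 0.619, "a": 0.635, "z": 0.201}
ordered = resort_candidates(weighted)
best_fit = ordered[0]
print(ordered, best_fit)
```

Keeping the full reordered list, rather than only the top entry, leaves the remaining candidates available in ranked order.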
Post-processing of the output of character recognizers is necessary in order to improve the rate of accuracy of selecting a single possible character representing an input handwritten character. Without the additional accuracy of post-processing, character recognizers will probably not become commercially viable. By using the language modeling process of the present invention, the probability of selecting a single possible character which is the same as the handwritten character increases from roughly 84 percent to approximately 90 to 92 percent. This recognition accuracy brings handwriting recognition into an acceptable range for consumer use.

Claims

1. A method comprising the steps of: choosing a number of template characters from a template character set which are likely to resemble a handwritten character thereby providing a set of possible characters, each of the possible characters having a value representing a degree of similarity with the handwritten character; and processing the possible characters according to a language model to determine which of the possible characters most resembles the handwritten character.
2. A method according to claim 1 wherein the step of processing the possible characters according to a language model includes: combining each of the possible characters with a surrounding character to form combined characters; assigning a combined value to each of the combined characters where the combined value represents a probability that the surrounding character would be in combination with a respective one of the possible characters; and resorting the possible characters.
3. A method according to claim 2 where the step of resorting the possible characters according to a sequential order of the combined value for each of the combined characters includes: weighting the value for each of the possible characters and the combined value of the combined characters to determine a weighted value for each of the possible characters; and ordering the weighted value for each of the possible characters to determine a sequential order for resorting the possible characters.
4. A method according to claim 3 where the step of weighting the value for each of the possible characters against a corresponding of the combined value of the combined characters to determine a weighted value for each of the possible characters includes: multiplying the score for each of the possible characters by a first weighting factor to obtain weighted possible character values; multiplying the combined value for each of the combined characters with a second weighting factor to obtain weighted combined character values; and combining each of the weighted possible character values with respective of the weighted combined character values to obtain the weighted value for each of the possible characters.
5. A method according to claim 2 wherein the step of assigning a combined value to each of the combined characters includes obtaining from a language model library the combined value.
6. A method according to claim 1 wherein the step of processing the possible characters comprises: comparing each of the possible characters with a surrounding character to determine a probability that the surrounding character would be in combination with a respective one of the possible characters; and determining from the probability for each of the possible characters which of the possible characters most resembles the handwritten character.
7. A method according to claim 6 wherein the step of comparing each of the number of template characters with a surrounding character comprises: combining each of the possible characters with a surrounding character to form combined characters; and assigning a combined value to each of the combined characters where the combined value represents the probability.
8. A method according to claim 6 or 7 where the step of comparing each of the number of template characters with a surrounding character to determine a probability includes obtaining the probability from a language model look-up table.
9. A recognizer comprising: a character recognizer coupled to a handwriting input device, to choose a number of template characters from a template character set which are likely to resemble a handwritten character thereby providing a set of possible characters, each of the possible characters having a value representing a degree of similarity with the handwritten character; a post-processor coupled to the character recognizer to process the possible characters according to a language model to determine which of the possible characters most resembles the handwritten character; and a display device coupled to the post-processor to receive the one of the possible characters most resembling the handwritten character.
10. A handwriting recognizer according to claim 9 wherein the character recognizer is coupled to a template character set to receive from the template character set the possible characters and assigning the value to each of the possible characters.
11. A handwriting recognizer according to claim 10 wherein the post-processor comprises: a combiner coupled to the character recognizer to combine each of the possible characters with a surrounding character to form combined characters; a scoring device coupled to the combiner to assign a combined value to each of the combined characters where the combined value represents a probability that the surrounding correctly recognized character would be in combination with a respective one of the possible characters; and a resorter coupled to the scoring device to resort the possible characters according to a sequential order of the combined value for each of the combined characters.
12. A handwriting recognizer according to claim 11 wherein the resorter comprises: a weighting element coupled to the scoring device to weight the value for each of the possible characters against a corresponding of the combined value of the combined characters to determine a weighted value for each of the possible characters; a reorderer coupled to the weighting element to reorder the possible characters according to a sequential order of the weighted value of each of the possible characters; and the reorderer coupled to the display device to output the one of the possible characters most resembling the handwritten character.
13. A handwriting recognizer according to claim 12 wherein the scoring device is coupled to a language model library to obtain the combined value.
14. A handwriting recognizer according to claim 11 wherein the handwriting input device and the display device are a single digitizing tablet.
PCT/US1997/004349 1996-03-08 1997-03-06 Method and device for handwritten character recognition WO1997033249A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
AU22168/97A AU726852B2 (en) 1996-03-08 1997-03-06 Method and device for handwritten character recognition
EP97915155A EP0896704A1 (en) 1996-03-08 1997-03-06 Method and device for handwritten character recognition
IL12564897A IL125648A0 (en) 1996-03-08 1997-03-06 A method and device for handwritten character recognition

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US61484696A 1996-03-08 1996-03-08
US08/614,846 1996-03-08

Publications (1)

Publication Number Publication Date
WO1997033249A1 (en) 1997-09-12

Family

ID=24462951

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1997/004349 WO1997033249A1 (en) 1996-03-08 1997-03-06 Method and device for handwritten character recognition

Country Status (6)

Country Link
EP (1) EP0896704A1 (en)
CN (1) CN1181827A (en)
AU (1) AU726852B2 (en)
CA (1) CA2247359A1 (en)
IL (1) IL125648A0 (en)
WO (1) WO1997033249A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100356392C (en) * 2005-08-18 2007-12-19 北大方正集团有限公司 Post-processing approach of character recognition

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5131053A (en) * 1988-08-10 1992-07-14 Caere Corporation Optical character recognition method and apparatus
US5436983A (en) * 1988-08-10 1995-07-25 Caere Corporation Optical character recognition method and apparatus
US5151950A (en) * 1990-10-31 1992-09-29 Go Corporation Method for recognizing handwritten characters using shape and context analysis
US5467407A (en) * 1991-06-07 1995-11-14 Paragraph International Method and apparatus for recognizing cursive writing from sequential input information
US5343537A (en) * 1991-10-31 1994-08-30 International Business Machines Corporation Statistical mixture approach to automatic handwriting recognition
US5621809A (en) * 1992-06-09 1997-04-15 International Business Machines Corporation Computer program product for automatic recognition of a consistent message using multiple complimentary sources of information
US5392363A (en) * 1992-11-13 1995-02-21 International Business Machines Corporation On-line connected handwritten word recognition by a probabilistic method
US5465309A (en) * 1993-12-10 1995-11-07 International Business Machines Corporation Method of and apparatus for character recognition through related spelling heuristics

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003034326A1 (en) * 2001-10-15 2003-04-24 Silverbrook Research Pty Ltd Character string identification
AU2002333063B2 (en) * 2001-10-15 2007-09-06 Silverbrook Research Pty Ltd Character string identification
US7444021B2 (en) 2001-10-15 2008-10-28 Silverbrook Research Pty Ltd Character string identification
US7532758B2 (en) 2001-10-15 2009-05-12 Silverbrook Research Pty Ltd Method and apparatus for generating handwriting recognition template
US7756336B2 (en) 2001-10-15 2010-07-13 Silverbrook Research Pty Ltd Processing system for identifying a string formed from a number of hand-written characters
US8000531B2 (en) 2001-10-15 2011-08-16 Silverbrook Research Pty Ltd Classifying a string formed from a known number of hand-written characters
US8285048B2 (en) 2001-10-15 2012-10-09 Silverbrook Research Pty Ltd Classifying a string formed from hand-written characters
US7873217B2 (en) 2003-02-26 2011-01-18 Silverbrook Research Pty Ltd System for line extraction in digital ink
WO2005106771A1 (en) * 2004-05-04 2005-11-10 Nokia Corporation Apparatus and method for handwriting recognition
KR100858545B1 (en) * 2004-05-04 2008-09-12 노키아 코포레이션 Apparatus and method for handwriting recognition
US8411958B2 (en) 2004-05-04 2013-04-02 Nokia Corporation Apparatus and method for handwriting recognition

Also Published As

Publication number Publication date
CN1181827A (en) 1998-05-13
CA2247359A1 (en) 1997-09-12
IL125648A0 (en) 1999-04-11
EP0896704A1 (en) 1999-02-17
AU726852B2 (en) 2000-11-23
AU2216897A (en) 1997-09-22

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase; Ref document number: 97190161.9; Country of ref document: CN
AK Designated states; Kind code of ref document: A1; Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH HU IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK TJ TM TR TT UA UG UZ VN YU
AL Designated countries for regional patents; Kind code of ref document: A1; Designated state(s): GH KE LS MW SD SZ UG AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE
121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101); Free format text: CN
ENP Entry into the national phase; Ref document number: 2247359; Country of ref document: CA; Kind code of ref document: A
WWE Wipo information: entry into national phase; Ref document number: 1997915155; Country of ref document: EP
NENP Non-entry into the national phase; Ref document number: 97532057; Country of ref document: JP
REG Reference to national code; Ref country code: DE; Ref legal event code: 8642
WWP Wipo information: published in national office; Ref document number: 1997915155; Country of ref document: EP
WWW Wipo information: withdrawn in national office; Ref document number: 1997915155; Country of ref document: EP