UNDOING SPELLING CORRECTION BY OVERRIDING DELETE AND BACKSPACE
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to the entry of text into a computer by any user input system capable of single character operation, for example, a keyboard or a pen stylus on a touch-sensitive writing surface.
2. Description of the Prior Art
When users of a computer notice a spelling or typing error as characters are being entered, they use a delete operation, such as the delete or the backspace function, to manually correct the error, sometimes after repositioning the cursor. In most text editors and word processing programs, the delete key deletes the character following the cursor and the backspace key deletes the character before the cursor. Operators often repeatedly backspace to delete all characters up to and including an error and then resume typing. They sometimes reposition the cursor at or near the error and backspace or delete to correct the error. In the case of pen stroke entry where a character is entered with a prescribed stroke or strokes, as with a hand-held pocket organizer such as the "Palm Pilot", backspacing can be effected by a dash written from right to left.
SUMMARY OF THE INVENTION
The present invention exploits these reflexes to undo automatic spelling correction, automatic word/phrase completion and other automatic substitutions (for example, grammar correction, capitalization correction or automatic formatting) when they modify the word at the current cursor position, or to invoke spelling correction, a thesaurus or other writing tool when the word in question has not been modified.
The methods disclosed herein have application to many text inputting methods for a variety of applications. For example, the methods disclosed herein are useful for word processing software, command line interfaces, text input fields and handwriting recognition systems.
As used herein, "text" comprises words stored as a string of alphanumeric character codes (for example, ASCII codes) separated by codes for word break characters.
As used in the specification and claims, a "word" is a string of characters (usually upper and lower case letters) bound by word break characters, such as a space, a tab, a carriage return or a punctuation mark. Unless otherwise specifically set forth, the term "delete operation" as used in the specification and claims refers to striking a key that either removes a character following the cursor (often marked "Del" on the keyboard) and/or removes the character preceding the cursor (often marked "Backspace" on the keyboard). It also refers to a pen stroke that is interpreted as a backspace/delete with a handwriting text entry system.
As used in the specification and claims, an "operation" refers to a rule-based procedure, an algorithm or a submission to a neural network. Therefore, for the purposes of this application, an "operation" is any procedure that accepts a word or phrase as input and produces a word or phrase as output. The output is typically substituted for the input text and can be referred to as a "substitution" or "correction".
Briefly, according to this invention, there is provided a computer implemented method for correcting text while it is being entered. According to this method, upon a delete operation, a determination is made whether: a) the word affected has not been previously corrected or completed by an operation for automatic alteration and is currently misspelled, b) the word affected has been previously corrected or completed by an operation for automatic alteration or c) neither a) nor b). In step a), if the word has not been previously corrected/completed and the word was misspelled before the removal of a character by the delete operation, spelling correction will be invoked. The spelling correction algorithm will initiate a search to determine whether an operation to invoke a unique spelling correction can be applied. If so, that operation will be applied. Otherwise, alternative substitutions (such as other spellings, synonyms, antonyms, hypernyms and hyponyms) may be selected. The spelling correction algorithm can be a rule-based spelling correction algorithm or any other spelling correction procedure so long as the procedure transforms an input word into an output word. The search to identify an applicable operation involves testing the different operations to see which, if any, applies to the input word. In a rule-based spelling correction algorithm, testing a rule involves checking whether the rule's conditions match the input word (i.e., whether the pattern of the rule matches the input word). In spelling correction algorithms that are not rule-based, testing an operation involves applying the operation to a copy of the input word and comparing the resulting word to the input word. If the two are different, then the operation is applicable.
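The applicability test for an operation that is not rule-based can be sketched in C as follows. This is only an illustration under stated assumptions: the `Operation` function-pointer type, the `operation_applies` function and the sample `fix_teh` operation are hypothetical names introduced here and are not part of the disclosed implementation.

```c
#include <assert.h>
#include <string.h>

#define MAX_WORD 64

/* Hypothetical operation type: rewrites the word held in buf in place. */
typedef void (*Operation)(char *buf, size_t cap);

/* Illustrative operation only: replaces the word "teh" with "the". */
void fix_teh(char *buf, size_t cap)
{
    if (cap >= 4 && strcmp(buf, "teh") == 0)
        strcpy(buf, "the");
}

/* Returns 1 if applying op to a copy of word changes the copy, else 0.
   The caller's word is never modified. */
int operation_applies(Operation op, const char *word)
{
    char copy[MAX_WORD];

    assert(op != NULL && word != NULL);
    if (strlen(word) >= sizeof copy)
        return 0;                 /* too long for this sketch */
    strcpy(copy, word);
    op(copy, sizeof copy);
    return strcmp(copy, word) != 0;
}
```

With these definitions, `operation_applies(fix_teh, "teh")` reports that the operation is applicable, while `operation_applies(fix_teh, "the")` reports that it is not, because the copy comes back unchanged.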
In step b), if the delete operation affects a word that has already been automatically corrected or completed, the previously implemented alteration will be undone. Preferably, the word for which an automatic alteration of the word has been overridden (undone) is marked so that the word can be treated differently upon the invocation of other word processing procedures. For example, a batch processing spell correction procedure will not attempt to correct a word for which automatic correction was undone.
In step c), if the delete operation invoked extra functionality (override of automatic alteration or invocation of spelling correction), subsequent keystrokes will operate normally for the affected word. In situations where the delete operation does not invoke extra functionality, the delete operation will function as a normal deletion (i.e., remove a character).
A delete operation may also be applied to selected (or blocked) text that may be comprised of more than one word. Selected text is text that is blocked and highlighted using some combination of keystrokes and/or the mouse. If the delete operation is performed on a selected word, spelling correction will be invoked. If the delete operation is performed on a partially selected word, spelling correction will be invoked using the cursor location as a hint to the problem with the word. Lastly, if the delete operation is performed on a selected phrase, grammar correction will be invoked.
BRIEF DESCRIPTION OF THE DRAWINGS
Further features and other objects and advantages will become apparent from the following detailed description made with reference to the drawings in which:
Fig. 1 is a flow diagram illustrating the general process according to this invention; and
Figs. 2(a) and 2(b) are flow diagrams for illustrating a computer program object for implementing one embodiment of this invention.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
With a computer word processing program, the words are stored in memory as codes and are also displayed as text on a monitor or other display. The word processing program simultaneously displays a cursor and points to the location in memory corresponding to the location of the cursor on the display. This is typical of word processing programs.
This invention has application to many text inputting methods. Not only will the invention apply to word processing or text editing programs wherein characters are entered from the keyboard, but also to command line interfaces, text input fields and typewriters, among other applications. Additionally, this invention has application to small computers where characters are entered by pen strokes with a stylus on a touch sensitive pad such as the "Palm Pilot". The "Palm Pilot" is a pocket organizer manufactured by U.S. Robotics Corporation. It has a touch sensitive display screen which responds to the touch of a stylus. It also has a writing area where pen strokes are made which can be interpreted as alphanumeric characters, punctuation marks and spaces. The pen strokes are designed to closely resemble those of the regular alphabet. A character entered in the writing space is entered in memory and displayed as text on the display screen.
Referring to Fig. 1, an initial step 10, in the method according to this invention, is the identification of the word affected by the deletion. In the case of keyboard text entry, the backspace or delete keys can be applied to the letters within the word or to word break delimiters. If the delete operation removes a letter within a word, that word is deemed the affected word. If the delete operation removes a word break delimiter, the direction of the deletion determines the affected word. A backward deletion caused by a backspace at the end of the word affects the word to the left of the cursor. A forward delete affects the word to the right of the cursor. Preferably, the deleted character is saved for use in a later step.
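Step 10 can be illustrated with the following C sketch, which locates the affected word from the position of the character about to be removed and the direction of the deletion. The names `DeleteDir`, `WordSpan`, `is_word_break` and `affected_word` are hypothetical, and the sketch assumes that any character other than a letter or digit is a word break character.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef enum { DELETE_BACKWARD, DELETE_FORWARD } DeleteDir;

/* Half-open span [start, end) of the affected word; found is 0 if none. */
typedef struct {
    int found;
    size_t start;
    size_t end;
} WordSpan;

/* Assumption: any character that is not a letter or digit breaks a word. */
int is_word_break(char ch)
{
    return !isalnum((unsigned char)ch);
}

/* text is the buffer before the deletion; pos indexes the character
   that the delete operation is about to remove. */
WordSpan affected_word(const char *text, size_t pos, DeleteDir dir)
{
    WordSpan span = { 0, 0, 0 };
    size_t len, anchor;

    assert(text != NULL);
    len = strlen(text);
    if (pos >= len)
        return span;

    if (!is_word_break(text[pos])) {
        anchor = pos;                       /* a letter inside a word */
    } else if (dir == DELETE_BACKWARD) {
        if (pos == 0 || is_word_break(text[pos - 1]))
            return span;                    /* no word to the left */
        anchor = pos - 1;
    } else {
        if (pos + 1 >= len || is_word_break(text[pos + 1]))
            return span;                    /* no word to the right */
        anchor = pos + 1;
    }

    /* Grow the span outward from the anchor to the nearest word breaks. */
    span.found = 1;
    span.start = anchor;
    while (span.start > 0 && !is_word_break(text[span.start - 1]))
        span.start--;
    span.end = anchor + 1;
    while (span.end < len && !is_word_break(text[span.end]))
        span.end++;
    return span;
}
```

For the text "the cat sat", a backspace that removes the space at index 7 yields the span of "cat", whereas a forward delete of the same space yields the span of "sat".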
A next step 11, in the method according to this invention, is to determine if the previous delete operation affected the word. If so, at step 12, the delete operation is allowed to operate normally.
A next step 13 in the method is to determine if the affected word is the result of a previous automatic correction. If the word has not been modified by automatic spelling correction or other automatic substitutions, then at step 14, spelling correction is invoked and provided with information about the letters affected by the delete operation to be used as a hint to the position of the possible error. The remaining letters in the word are also used to identify a correction. If the spelling correction algorithm provides a unique correction, that correction is
executed. Otherwise, correction may be deferred until additional letters are keyed that may identify a unique correction which is then executed. In one embodiment, a pop-up menu of alternate choices may be displayed allowing the user to select the correction to be executed. In addition to possible correction candidates, the choices may include synonyms, antonyms, hypernyms, hyponyms and other words semantically related to the word affected. Preferably, user-configurable parameters control whether the choices include synonyms, antonyms and the like, whether subsequent deletions cycle through the choices or a menu of the choices pops up, whether only unique corrections occur (with deletions operating normally in cases of ambiguous correction) and so forth. At step 15, if the word has been modified by a previous automatic correction, the correction is undone and subsequent keystrokes are allowed to operate normally upon the word. After the operator enters a word delimiter or uses cursor movement to end modification of the word, the word is flagged at step 16 to suppress any further automatic correction of the word, such as by a batch spelling correction procedure.
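The decisions made at steps 11 through 15 of Fig. 1 can be summarized by the following C sketch. The enumeration `DeleteAction` and the function `on_delete` are hypothetical names, and the two flags are assumed to be tracked elsewhere by the text entry program.

```c
#include <assert.h>

typedef enum {
    ACT_NORMAL_DELETE,     /* step 12: delete operates normally */
    ACT_INVOKE_SPELLING,   /* step 14: invoke spelling correction */
    ACT_UNDO_CORRECTION    /* step 15: undo the automatic correction */
} DeleteAction;

/* prev_delete_hit_word: nonzero if the previous delete operation already
   affected this word (step 11).
   auto_corrected: nonzero if the word is the result of a previous
   automatic correction (step 13). */
DeleteAction on_delete(int prev_delete_hit_word, int auto_corrected)
{
    assert(prev_delete_hit_word == 0 || prev_delete_hit_word == 1);
    assert(auto_corrected == 0 || auto_corrected == 1);

    if (prev_delete_hit_word)
        return ACT_NORMAL_DELETE;
    if (auto_corrected)
        return ACT_UNDO_CORRECTION;
    return ACT_INVOKE_SPELLING;
}
```

The sketch covers only the first delete operation on a word; the flagging of step 16 and the user-configurable parameters described above are left out for brevity.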
According to a preferred embodiment, external deletion, that is, deletion of word break characters adjacent a word, may be handled according to various settable parameters. For example, external delete may undo automatic correction. A following spacebar stroke may redo the correction. Control-space may enter a word without redoing the automatic correction. This permits the user to switch back and forth between two views of the word using the spacebar and a delete key. In an alternate preferred
embodiment, external delete undoes automatic correction and temporarily suppresses the automatic correction so that the spacebar does not invoke automatic correction on the word. This prevents the automatic correction process from affecting the word when the user resumes typing.
A special case may be required for embodiments that permit the method to be applied to selected text. Selected text is text that is blocked and highlighted using some combination of keystrokes and/or the mouse. If the selected text consists of one word, spelling correction is invoked on the word using the most specific spelling substitution available. If the selected text comprises multiple words, grammar correction is invoked. If the selected text consists of a part of a word, the spelling correction algorithm is invoked on the word using the selected text as a hint to the problem with the word.
According to the preferred embodiments of this invention, the user can set a parameter to control which behaviors are invoked in each situation. For example, instead of undoing the previous correction, the user could set a parameter to have the delete operation pop up a menu of high-probability alternate corrections. It may be desirable to use less than all of the functionality described above for some tasks. Thus, for example, the automatic undoing of corrections may be turned off or spelling correction may be turned off .
Various spelling correction algorithms may be used with this invention. However, the preferred spelling correction algorithm is based upon rules for matching unlikely n-grams with likely replacement n-grams. (An n-gram is a string of characters that may comprise all or part of a word.) The rules within the spelling correction algorithm are expressions that match pieces of correctly spelled words and replace incorrect pieces in the text. Examples of possible rules are as follows:
oualy -> ously
mnet -> ment
fuly -> fully
ierd -> eird
eif -> ief
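As a rough illustration of how rules of this kind might be applied, the following C sketch replaces the first unlikely n-gram found in a word with its likely replacement. It is a simplification under stated assumptions: the names `NGramRule`, `RULES`, `apply_first_rule` and `corrected` are hypothetical, and a rule is matched anywhere in the word without any of the additional conditions that the preferred algorithm may impose.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *unlikely;   /* n-gram that signals a misspelling */
    const char *likely;     /* replacement n-gram */
} NGramRule;

/* The example rules listed above. */
const NGramRule RULES[] = {
    { "oualy", "ously" },
    { "mnet",  "ment"  },
    { "fuly",  "fully" },
    { "ierd",  "eird"  },
    { "eif",   "ief"   },
};

/* Writes the corrected word to out. Returns 1 if a rule was applied and
   the result fit in out; otherwise returns 0 (with no rule applied, out
   receives a copy of word). */
int apply_first_rule(const char *word, char *out, size_t cap)
{
    size_t i;

    assert(word != NULL && out != NULL && cap > 0);
    for (i = 0; i < sizeof RULES / sizeof RULES[0]; i++) {
        const char *hit = strstr(word, RULES[i].unlikely);
        if (hit != NULL) {
            /* prefix before the match + replacement + rest of the word */
            int n = snprintf(out, cap, "%.*s%s%s",
                             (int)(hit - word), word,
                             RULES[i].likely,
                             hit + strlen(RULES[i].unlikely));
            return n >= 0 && (size_t)n < cap;
        }
    }
    snprintf(out, cap, "%s", word);
    return 0;
}

/* Convenience wrapper around a static buffer (not reentrant). */
const char *corrected(const char *word)
{
    static char buf[64];

    apply_first_rule(word, buf, sizeof buf);
    return buf;
}
```

Under these assumptions, `corrected("wierd")` yields "weird" through the rule ierd -> eird, and a word containing none of the unlikely n-grams is returned unchanged.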
The preferred spelling correction algorithm is described in considerable detail in a co-pending U.S. Patent Application Serial No. 09/312,229, filed May 14, 1999 entitled "Method for Rule-Based Correction of Spelling and Grammar Errors", and assigned to the same assignee as this patent application. However, the spelling correction algorithm can be any spelling correction procedure, so long as the procedure transforms an input word into an output word. See Kukich, Karen, "Techniques for Automatically Correcting Words in Text", ACM Computing Surveys, Vol. 24, No. 4, December 1992.
In a preferred embodiment, the spelling correction algorithm uses information about the position of the character deleted to disambiguate between several possible corrections. This only works when the affected letters are internal to the word and so represent a clear signal that the user's intent is to correct those letters. For example, if the user entered "amoung", the most likely correction is "amount". However, if the user deletes the "u", then "among" is the most likely correction. This is an example where position information indicates that deletion should be treated as regular deletion and not as a signal to change the spelling of the word.
Corrections should only be made if there exists a unique correction. The present invention finds more unique corrections as it obtains more precise information about the nature of the problem, such as the position of the error. This extra information allows the spelling correction algorithm to restrict the set of candidate corrections, allowing more errors to be corrected automatically. If the user moves the cursor into the middle of a word and begins deleting characters, this indicates the position of the problem with the word. For example, if the user moves the cursor just after "ie" in "recieve", it is likely that the user will replace those characters. Likewise, if the operator deletes all characters up to a point and resumes typing, this implies that the prefix of the word is complete and again indicates the position of the problem. For example, if the user types "quiet", deletes the "et" and types a "t", one can assume that the next character will be an "e", not a space. The fact that the user originally typed "quiet" with an "e" increases the probability that this was a transposition error as opposed to an insertion error. Thus, the most likely correction is "quite", not "quit".
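One way to picture how such hints narrow the candidates to a unique correction is the C sketch below. It is an assumption-laden simplification, not the disclosed algorithm: a candidate is kept only if it begins with the prefix the user has confirmed by retyping and is a rearrangement of the letters originally typed (a stand-in for the transposition reasoning above). The names `same_letters`, `unique_correction` and `QUIET_CANDIDATES` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if a and b contain the same characters with the same counts. */
int same_letters(const char *a, const char *b)
{
    int counts[256] = { 0 };
    size_t i;

    if (strlen(a) != strlen(b))
        return 0;
    for (i = 0; a[i] != '\0'; i++) {
        counts[(unsigned char)a[i]]++;
        counts[(unsigned char)b[i]]--;
    }
    for (i = 0; i < 256; i++)
        if (counts[i] != 0)
            return 0;
    return 1;
}

/* Returns the only candidate that starts with prefix and rearranges the
   letters of typed, or NULL if no candidate or several candidates qualify. */
const char *unique_correction(const char *const *candidates, size_t n,
                              const char *typed, const char *prefix)
{
    const char *match = NULL;
    size_t i, plen;

    assert(candidates != NULL && typed != NULL && prefix != NULL);
    plen = strlen(prefix);
    for (i = 0; i < n; i++) {
        if (strncmp(candidates[i], prefix, plen) != 0)
            continue;               /* contradicts the confirmed prefix */
        if (!same_letters(candidates[i], typed))
            continue;               /* not a transposition of the input */
        if (match != NULL)
            return NULL;            /* ambiguous: make no correction */
        match = candidates[i];
    }
    return match;
}

/* Illustrative candidate list for the "quiet" example above. */
const char *const QUIET_CANDIDATES[] = { "quit", "quite" };
```

With the originally typed word "quiet" and the confirmed prefix "quit", only "quite" survives both filters, which agrees with the example in the preceding paragraph.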
Figs. 2(a) and 2(b) together are a programming flow diagram describing a program to implement the process using a rule-based spelling correction algorithm according to this invention. Starting at Fig. 2(a), the State is initialized at step 20. At steps 21 and 23, events are processed and if a Key event is detected at step 23,
processing begins. If a rule has not been applied (in other words, the State is Looking for Rule) and the Key is Delete as determined at step 25, then at step 26 a rule is sought by using the characters remaining in the word. If a rule is found for correcting the word, it is saved as a Deferred Rule at step 27. The rule is not applied because subsequent deletions and/or character additions may provide a better rule. Program flow from here awaits the next event. If the State is Looking for Rule and the Key is not Delete at step 25, the flow moves to B in Fig. 2(b), which is the automatic spell correction routine.
With continued reference to Fig. 2(a), if a Key event takes place, the State is Applied Rule (a rule has been applied to the word) as determined at steps 27 and 28 and the Key is Delete as determined at step 29, then at step 30 the rule is undone and the State is reset at step 31. This enables an automatic correction to be overridden. If at step 29 the Key is not Delete, the State is reset at step 32.
Referring to Fig. 2(b), starting at B, the State is Looking for Rule and the Key is not Delete as determined by steps 24 and 25. If the Key is found to be Word Break at step 33, a search is made for a rule at step 34. If a rule is found at step 35, it is applied at step 36, the State is set to Rule Applied (step 46) and the rule number is saved (step 47) so that the rule can be reversed. Then the next event is awaited.
If a rule is not found at step 35, but a rule had been deferred as determined at step 37, the Deferred Rule is applied at steps 36, 46 and 47. If no rule is found and
no rule has been deferred, the State is reset at step 38 and the next event is awaited.
If the State is Looking for Rule and the Key is not Delete or Word Break (i.e., the Key is Character, the word being entered) as determined at steps 24, 25 and 33, at step 40 a search is made for a rule based upon the characters already entered in the word. If a rule is found as determined at step 41, but a more specific rule is possible based on the addition of more characters to the word as determined at step 42, then the rule is saved at step 43 as a Deferred Rule and the next event is awaited. If no rule is found at step 41 but a Deferred Rule is being held as determined at step 44, the Deferred Rule is applied at step 36, the State is set to Rule Applied at step 46 and the rule number is saved at step 47; then the program awaits the next event. If no rule is found at step 41 and no Deferred Rule is being held, the State is reset at step 45 and the next event is awaited.
The present invention has been implemented with the "Palm Pilot" for spelling correction assisted handwriting recognition. The "Palm Pilot" is a hand-held computer with a handwriting interface. On the face of the display of the "Palm Pilot", there is provided a space for hand forming characters that are then translated and displayed. The spelling correction system intercepts the characters entered by the user and uses a variety of rules and heuristics to correct errors in the input. For example, if the handwriting recognition software that comes with the "Palm Pilot" misreads a letter, an automatic spelling correction substitutes the correct letter.
Sometimes the automatic substitution, according to this
invention, produces an erroneous change. For example, the user might insert an acronym that appears to the "Palm Pilot" to be an error. The "Palm Pilot" will automatically correct the presumed error. The user may then use a delete stroke to remove the automatic correction. The delete stroke is interpreted as a command to undo the automatic change. Subsequent delete strokes operate normally. If the user enters the same sequence of letters again that led to the automatic substitution, the automatic substitution is suppressed.
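The suppression of a repeated substitution could be kept track of with a small list of letter sequences whose correction the user has undone, as in the C sketch below. The names `remember_undone` and `correction_suppressed`, the fixed table sizes and the use of a file-scope table are all assumptions made for illustration; the source does not describe how the embodiment stores this information.

```c
#include <assert.h>
#include <string.h>

#define MAX_UNDONE   32   /* assumed capacity of the table */
#define MAX_WORD_LEN 32   /* assumed maximum word length, including NUL */

/* Letter sequences whose automatic substitution the user has undone. */
static char undone_words[MAX_UNDONE][MAX_WORD_LEN];
static size_t undone_count = 0;

/* Returns 1 if automatic substitution should be skipped for typed. */
int correction_suppressed(const char *typed)
{
    size_t i;

    assert(typed != NULL);
    for (i = 0; i < undone_count; i++)
        if (strcmp(undone_words[i], typed) == 0)
            return 1;
    return 0;
}

/* Records that the user undid the substitution made for typed.
   Returns 1 on success, 0 if the table is full or the word is too long. */
int remember_undone(const char *typed)
{
    assert(typed != NULL);
    if (correction_suppressed(typed))
        return 1;                         /* already recorded */
    if (undone_count == MAX_UNDONE || strlen(typed) >= MAX_WORD_LEN)
        return 0;
    strcpy(undone_words[undone_count++], typed);
    return 1;
}
```

In this sketch, the delete stroke that undoes a substitution would call `remember_undone` with the letters the user originally entered, and the correction routine would consult `correction_suppressed` before substituting.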
The present invention has also been implemented in GNU-Emacs, which is a word processing program with keyboard input.
The following computer code in the C language sets forth portions of a function for implementing the correction of handwriting and spelling errors in the embodiment of this invention implemented with the "Palm Pilot".
Boolean HackFldHandleEvent (FieldPtr fid, EventPtr eventP)
{
    enum events eType = eventP->eType;

    if (eType == keyDownEvent) {
        return StepChar (fid, eventP);
    }
    if (eType == penDownEvent) {
        return StepPen (fid, eventP);
    }
    else
        return NextFldHandleEvent (fid, eventP);
}

Boolean StepChar (FieldPtr fid, EventPtr eventP)
{
    /* Other portions of this function are not reproduced here. */
    result = StepCharAux (fid, eventP, globalsP, trieP);
    return result;
}

Boolean StepPen (FieldPtr fid, EventPtr eventP)
{
    /* Other portions of this function are not reproduced here. */
    StepPenAux (fid, eventP, globalsP);
}

Boolean StepCharAux (FieldPtr fid, EventPtr eventP,
                     KSGlobals *globalsP, Word *trieP)
{
    RuleDesc ra;
    Word specificity = 2;
    Boolean result = true;
    char c = eventP->data.keyDown.chr;

    switch (globalsP->state)
    {
    case KSS_LOOKING_FOR_RULE:
        result = NextFldHandleEvent (fid, eventP);
        if (result)
        {
            if (globalsP->rule.index != 0)
            {
                specificity = toSpecificity (toRule (&globalsP->rule));
            }
            switch (charType (c))
            {
            case C_DELETE:
                DecCC (globalsP);
                SetRule (globalsP, 0);
                /* A rule found after a deletion is saved, not applied. */
                if (LookupRule (globalsP, trieP, &ra, fid, specificity))
                    SetRule (globalsP, &ra);
                break;
            case C_WORDBREAK:
                IncCC (globalsP);
                if (LookupRule (globalsP, trieP, &ra, fid, specificity))
                {
                    ApplyRule (globalsP, fid, toRule (&ra), -1);
                    SetState (globalsP, KSS_RULE_APPLIED);
                }
                else if (globalsP->rule.index != 0)
                {
                    /* No new rule: apply the deferred rule. */
                    ApplyRule (globalsP, fid, toRule (&globalsP->rule), -1);
                    SetState (globalsP, KSS_RULE_APPLIED);
                }
                else
                {
                    ClrCC (globalsP);
                    SetRule (globalsP, 0);
                }
                break;
            case C_TARGET:
                IncCC (globalsP);
                if (!LookupRule (globalsP, trieP, &ra, fid, specificity))
                {
                    if (globalsP->rule.index != 0)
                    {
                        ApplyRule (globalsP, fid, toRule (&globalsP->rule), -1);
                        SetState (globalsP, KSS_RULE_APPLIED);
                    }
                }
                else
                {
                    /* Save the rule; apply it unless a more specific
                       rule is still possible. */
                    SetRule (globalsP, &ra);
                    if (!LookupPartialRule (globalsP, trieP))
                    {
                        ApplyRule (globalsP, fid, toRule (&ra), 0);
                        SetState (globalsP, KSS_RULE_APPLIED);
                    }
                }
                break;
            case C_OTHER:
                SetRule (globalsP, 0);
                ClrCC (globalsP);
                break;
            }
        }
        break;
    case KSS_RULE_APPLIED:
        switch (charType (c))
        {
        case C_DELETE:
            /* Delete after an applied rule undoes the correction. */
            UndoRule (globalsP, fid, toRule (&globalsP->rule));
            ClrCC (globalsP);
            break;
        case C_TARGET:
            result = NextFldHandleEvent (fid, eventP);
            ClrCC (globalsP);
            if (result) IncCC (globalsP);
            break;
        default:
            result = NextFldHandleEvent (fid, eventP);
            ClrCC (globalsP);
            break;
        }
        SetRule (globalsP, 0);
        SetState (globalsP, KSS_LOOKING_FOR_RULE);
        break;
    }
    return result;
}

The program from which the above programming functions are taken is event driven. Thus, the object HackFldHandleEvent has for its purpose to handle events, such as the keyDownEvent, which is entry of characters to the field. The penDownEvent occurs when the pen is used to change the cursor position, for example.
A keyDownEvent passes control to the StepChar function which, among many other tasks, passes control to the StepCharAux function where the processing of characters and codes to effect the purposes of the invention takes place.
It should be understood that certain global constants represent values that define several states that determine how the StepCharAux object reacts to the entry of a particular character or code. The program is initially in the State named by the constant KSS_LOOKING_FOR_RULE. The State is also set to KSS_LOOKING_FOR_RULE if a misspelling has been identified in a word being entered and a potential rule has been found, but the application of the rule has been deferred while looking for a more specific rule. The State is set to KSS_RULE_APPLIED if a rule has been applied to a word.
Three other global constants characterize the type of character just entered: C_WORDBREAK identifies a word break character, such as a space; C_TARGET identifies a character within a word; and C_DELETE identifies a delete code.
The way in which the function StepCharAux responds to a new character or code depends upon the States defined by the five constants explained above. The function StepCharAux is structured as nested switch
statements. Of course, the method, according to this invention, can be arbitrarily implemented in hardware or software.
Having thus described our invention with the detail and particularity required by the Patent Laws, what is desired to be protected by Letters Patent is set forth in the following claims.