WO2008095021A1 - Free form voice command model interface for IVR navigation

Publication number
WO2008095021A1
Authority
WO
WIPO (PCT)
Prior art keywords
command
prompt
parameters
user
call flow
Prior art date
Application number
PCT/US2008/052501
Other languages
French (fr)
Other versions
WO2008095021A9 (en)
Inventor
Sunil Vemuri
Debra K. Miller
Donald A. Norman
Rajib Ghosh
Original Assignee
Qtech, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qtech, Inc.
Publication of WO2008095021A1
Publication of WO2008095021A9

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/487 Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
    • H04M3/4936 Speech interaction details

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)

Abstract

A method and system for dynamically presenting command prompts in the call flow of an interactive voice response application are disclosed. A single top-level command prompt is presented, and based on an analysis of a spoken command and a variable number of command parameters received from a user, an additional confirmation prompt or command prompt may be selected and presented.

Description

FREE FORM VOICE COMMAND MODEL INTERFACE FOR IVR
NAVIGATION
RELATED APPLICATIONS
[001] This application is a Nonprovisional of, claims priority to and incorporates by reference, Indian Application No.: 113/KOL/2007 filed 31 January 2007.
FIELD
[002] The invention described herein is generally related to the fields of computer telephony and schedule management applications. In particular, the present invention is related to a user interface of an interactive voice response for a schedule management application.
BACKGROUND
[003] A variety of scheduling software applications provide users with the ability to schedule meetings, create reminders, and manage contacts, etc. For example, one such application is Microsoft Outlook®. With Outlook®, users can enter appointments, contacts, events, and so on, and then view the information on a display in a variety of formats. As a software application that is primarily intended to execute on a user's personal computer, the primary user interface to Outlook® includes a display, a keyboard and a mouse. Accordingly, users input information with the keyboard and mouse, and the output is provided by means of the display.
[004] With the increasing popularity of wireless networks, people want access to their information when they are on the move. Accordingly, scheduling and contact management applications for mobile phones have been developed to provide users wireless access to information while they are on the move and away from their personal computers. However, mobile phones typically have limited input means, such as a number pad. Even in a best case scenario, entering data via a tiny QWERTY-style keyboard on a so-called smartphone can be difficult, time consuming, and prone to errors. Consequently, an improved means of interfacing with a scheduling and contact management application is desirable.
[005] One such interface is facilitated by an interactive voice response (IVR) system. An IVR system is a computerized system that allows a person, typically a telephone caller, to select an option from a voice menu and otherwise interface with a computer system. Improved speech processing technologies have made it popular to use voice commands (e.g., speech recognition) in addition to, or in place of, touch tone (e.g., DTMF) commands. However, conventional IVR systems, particularly the call prompting or menu-like interfaces of such systems, suffer from a variety of issues.
[006] FIG. 1 illustrates a typical call flow or menu system for an IVR application for an airline company. The main prompt 10 requests the caller to enter one of three inputs (e.g., a command), and based on such input, the caller is directed down a particular call path. If, for example, the caller would like to make a reservation, the caller enters or says, "one". Next, a second call prompt or menu prompts the caller to provide additional command parameters, for example, the desired date of departure. A third call prompt requests yet another command parameter - the desired time of departure, and so on, until all command parameters have been collected. One of the problems with this type of call flow is that each prompt is designed to handle a limited amount of input, such as a single command or a single command parameter. That is, a caller who wishes to make a reservation is prohibited from providing all, or some, of the necessary information at the main prompt. Instead, the caller is forced to select from one of three options, and then listen to a second call prompt before providing additional information, such as a departure date, time, originating city, and destination. This often causes confusion because the caller can lose track of where he or she is in the process, particularly when more than a few options (command parameters) are required. Furthermore, the design is rigid in the sense that the order in which the required information is input is predetermined by the layout of the call flow path. That is, the user must enter the date of departure prior to entering the time of departure, and so on. Moreover, this type of system does not permit any variation in the way that people interact with the system. That is, everyone must follow the same call flow process. In general, the design of the call flow is such that there is little, if any, flexibility for providing input (referred to herein as commands and command parameters) from the perspective of the caller. Without such flexibility, callers often get frustrated and abandon the call without completing a transaction.
BRIEF DESCRIPTION OF THE DRAWINGS
[007] The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate an implementation of the invention and, together with the description, serve to explain the advantages and principles of the invention. In the drawings,
FIG. 1 illustrates a typical call flow or menu system for a conventional IVR application for an airline company;
FIG. 2 illustrates an example of various call flow paths for adding postings to a scheduling application via an IVR interface, according to an embodiment of the invention; and
FIG. 3 illustrates an example of various call flow paths for recalling postings from a scheduling application via an IVR interface, according to an embodiment of the invention.
DETAILED DESCRIPTION
[008] Reference will now be made in detail to an implementation consistent with the present invention as illustrated in the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings and the following description to refer to the same or like parts. Although discussed with reference to these illustrations, the present invention is not limited to the implementations illustrated therein. Hence, the reader should regard these illustrations merely as examples of embodiments of the present invention, the full scope of which is measured only in terms of the claims following this description.
[009] Consistent with one embodiment of the invention, a scheduling application has a simple-to-use interactive voice response (IVR) interface. Accordingly, a user can input or add scheduling information and recall scheduling information by issuing free-form (e.g., natural language) voice commands to a scheduling management application, with or without additional command parameters. In contrast to many conventional IVR systems, the present invention allows a user to speak commands in free-form or natural speaking mode at a single call prompt, with or without additional command parameters. The presence or lack of additional command parameters entered at the main call prompt does not "trip up" the system. Instead, call flow optimization logic is built into the application such that the call flow - for example, the presentation and order of call prompts presented to the user - is dependent in part on the information provided by the caller at the main call prompt, or any subsequent call prompts. Accordingly, if a caller provides a voice command and all necessary additional command parameters for a particular transaction, the application will take the necessary action to process the voice command without the need to prompt the caller for additional command parameters.
[0010] One advantage of the present invention is its overall simplicity and user-friendliness. Because the system is implemented to work with a single top level prompt, and is designed to accept natural language commands with or without additional parameters, the system is extremely flexible and adaptable to a wide variety of users, and user styles. Another advantage of a system consistent with an embodiment of the invention is that it provides significantly greater flexibility in allowing the caller to provide information, and is therefore far more user-friendly than conventional systems or applications. It will be appreciated by those skilled in the art that, in general, the order in which a system prompts for additional required command parameters is a design choice, and will vary depending on the design requirements. Similarly, the manner in which recognized speech commands and parameters are confirmed by the system is a matter of design choice. For instance, confirmation of input may be an affirmative process where the system prompts the user to verify that a command or parameter was recognized properly. Alternatively, the system may passively confirm the caller's input. The following example is illustrative of the many features and advantages of the present invention.
[0011] FIGS. 2 and 3 illustrate an example of an IVR interface according to an embodiment of the invention, in the way of a simple scheduling application. The scheduling application illustrated in FIGS. 2 and 3 enables a caller to schedule various types of events (e.g., appointments, meetings, birthdays, notes, tasks) by issuing different commands and command parameters. For instance, as illustrated in FIG. 2, in order to add an appointment or meeting to a schedule, a caller may simply issue a verbal command with parameters to the system, such as, "add meeting for tomorrow at 5 PM." Here, the command is associated with the keyword "add" and the additional information is analyzed by the system to determine whether it includes command parameters for the command "add". In this case, the caller has provided additional parameters indicating the type of event to be added, for example, "meeting", the date, "tomorrow", and the time, "5:00 pm". In addition, the user may be prompted to provide a description of the meeting, which is recorded as an audio clip. In FIG. 2, the prompt requesting the description is represented as a "beep" and the caller's description by the ellipsis "...".
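The analysis of a free-form command described in paragraph [0011] can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the publication does not disclose a parsing algorithm, so the keyword matching, the regular expressions, and the names `ParsedCommand` and `parse_utterance` are hypothetical, and the input is assumed to be text already produced by a speech recognizer.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Command keywords and event tags named in the description.
COMMANDS = {"add", "recall"}
TAGS = {"appointment", "meeting", "birthday", "note", "task"}


@dataclass
class ParsedCommand:
    """Hypothetical container for what was recognized at the top-level prompt."""
    command: Optional[str] = None
    tag: Optional[str] = None
    date: Optional[str] = None
    time: Optional[str] = None


def parse_utterance(text: str) -> ParsedCommand:
    """Extract a command keyword and any optional parameters from recognized text."""
    lowered = text.lower()
    parsed = ParsedCommand()
    for word in re.findall(r"[a-z]+", lowered):
        if parsed.command is None and word in COMMANDS:
            parsed.command = word
        # Accept simple plurals such as "appointments".
        singular = word[:-1] if word.endswith("s") else word
        if parsed.tag is None and singular in TAGS:
            parsed.tag = singular
    # Assumption: only two relative dates are handled in this sketch.
    date_match = re.search(r"\b(today|tomorrow)\b", lowered)
    if date_match:
        parsed.date = date_match.group(1)
    # Times such as "5 PM" or "5:30 pm" are normalized to "h:mm am/pm".
    time_match = re.search(r"\b(\d{1,2})(?::(\d{2}))?\s*(am|pm)\b", lowered)
    if time_match:
        hour, minute, meridiem = time_match.groups()
        parsed.time = f"{int(hour)}:{minute or '00'} {meridiem}"
    return parsed
```

For the example command "add meeting for tomorrow at 5 PM", this sketch yields the command "add", the tag "meeting", the date "tomorrow", and the time "5:00 pm". A bare "recall" yields only the command, with every parameter left unset.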
[0012] In this simple example, two commands - "add" and "recall" - are used. In addition, several command parameters can be used with the two commands. In FIGS. 2 and 3, the word "<tag>" denotes a special command parameter representing a tag or label for the type of an event, such as "appointment, meeting, birthday, note, or task". In some embodiments of the invention, the inclusion of a particular command parameter may indicate that other command parameters are required. For instance, in one embodiment, a meeting or appointment may indicate that command parameters for the date and time are required. According to an embodiment of the invention, the call flow optimization logic will dynamically adapt, if necessary, to prompt for any required command parameters. However, in an alternative embodiment, the system may accept a meeting or appointment without additional command parameters to indicate when the meeting or appointment is to occur.
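The dependency described in paragraph [0012], where one command parameter makes others required, might be sketched as follows. This is a hedged illustration rather than the disclosed implementation: the rule table `REQUIRED_BY_TAG`, the function names, and the prompt wording are hypothetical, and only the meeting and appointment rules come from the description.

```python
from typing import Dict, List, Optional

# Hypothetical rule table. The description states that a meeting or
# appointment may require a date and time; other tags are assumed to
# require nothing in this sketch.
REQUIRED_BY_TAG: Dict[str, List[str]] = {
    "meeting": ["date", "time"],
    "appointment": ["date", "time"],
}


def missing_parameters(tag: Optional[str], params: Dict[str, str]) -> List[str]:
    """List the required parameters that the caller has not yet supplied."""
    required = REQUIRED_BY_TAG.get(tag or "", [])
    return [name for name in required if name not in params]


def next_prompt(tag: Optional[str], params: Dict[str, str]) -> Optional[str]:
    """Return a secondary prompt, or None when no further prompt is needed."""
    missing = missing_parameters(tag, params)
    if not missing:
        return None
    # The order of prompting is a design choice; here the first missing one wins.
    return f"Please say the {missing[0]}."
```

With this table, a meeting given only a date produces a prompt for the time, while a note with no parameters produces no secondary prompt at all. The alternative embodiment, which accepts a meeting without a date or time, corresponds to an empty rule table.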
[0013] In general, the scheduling application may be implemented on a computer (e.g., a server) that is integrated with or connected to an IVR system. Accordingly, a user may call a particular number to establish a session with the IVR system. The IVR system will capture and process data (e.g., touch tone inputs or voice commands), and then provide the data to the server executing the scheduling application where it may be further processed. An embodiment of the invention may be implemented with the system described in U.S. Patent Application No. 11/560,295 entitled "Memory Assistance System and Method", filed on November 15, 2006.
[0014] In another embodiment of the invention, interaction with the application may be achieved via a client application executing on a personal computer, mobile phone, smartphone, personal digital assistant, or any computing device with suitable input and processing means. For instance, in one embodiment of the invention, a client application may execute on a mobile phone and interaction with the application is achieved by simply pressing a dedicated button on the mobile phone. Accordingly, it may not be necessary to dial a number to access the application.
[0015] FIG. 2 illustrates an example call flow for a scheduling application consistent with an embodiment of the invention. Generally, to initiate a session with the scheduling application, the user will dial a particular number, or press a button on the user's client computing device. As illustrated in FIG. 2, after the user has established a session with the application, the user is prompted to provide a voice command. For example, in one embodiment of the invention, the user may be prompted to add or recall a posting. In this context, a posting represents any type of formatted data recognized by the system, and may include appointments, meetings, tasks, notes, birthdays, shopping lists, etc. Those skilled in the art will appreciate that the types of postings recognized by the system are a design choice, and may be expanded depending upon the system requirements.
[0016] As illustrated in FIG. 2, the system uses natural language speech recognition, and therefore voice commands can be entered in free-form mode, with or without additional command parameters. Furthermore, each type of posting (e.g., appointments, meetings, tasks, notes, birthdays, and shopping lists) is associated with a particular tag or label. To add a particular type of posting, the user simply issues a voice command "add" along with a command parameter indicating the appropriate tag or label. The application will analyze the voice command, determine the exact command and/or command parameters, and then determine the appropriate next action. If, for example, the command includes additional information such as the time and date, the scheduling application will automatically interpret the information based on a context determined by the particular command and one or more command parameters, such as a command parameter associated with a tag or label indicating the type of an event. For instance, if the command issued is, "add birthday on January 15, 2007", the scheduling application will appropriately interpret the date provided as a birthday.
[0017] FIG. 2 illustrates a variety of different call flow paths based on the particular voice command and parameters issued. For example, call flow path 22 illustrates a case where the voice command is "Add <tag> for tomorrow 5 pm." Here, the "<tag>" represents a special command parameter indicating the type of posting, for example, meeting, appointment, birthday, note, etc. After the command is issued, the application generates an audible noise (e.g., a beep or tone), which is a prompt directing the user to enter additional information, such as command parameters that are not required. For instance, if the posting is for a meeting or appointment, the user may provide a brief description of the meeting or appointment including the names of the other participants, a location, and any other relevant information.
[0018] After the user has provided the additional information, the scheduling application will confirm the posting by presenting what will be added to the user's schedule. For example, in the example described above with respect to call flow path 22, the scheduling application might speak back, "Adding as <tag> for tomorrow at 5 pm" where "<tag>" represents a type of event, such as meeting, etc. Assuming the scheduling application has correctly interpreted the user's command and the corresponding parameters, including those associated with a tag or label, the caller need not do anything, and the application will add the posting to the caller's schedule. However, if the scheduling application has incorrectly interpreted the voice command, the caller can issue a command, such as "no", "stop", or "cancel" and the scheduling application will re-prompt the caller to issue the voice command.
[0019] In call flow path 24, an example of a voice command without a particular tag or label is shown. In the case when the caller chooses not to speak an additional command parameter such as a specific tag or label indicating a type of event, the scheduling application may (as illustrated in FIG. 3) simply add the posting to the caller's schedule as a generic type, for example, without a specific tag or label to indicate a type. Alternatively, the scheduling application may prompt the caller to provide an additional command parameter or tag or label. Call flow paths 26, 28, 30, 32, and 34 illustrate how the scheduling application handles various voice commands, according to one embodiment of the invention. In contrast to conventional scheduling systems, an embodiment of the invention enables a caller to add or recall postings by specifying commands with or without command parameters at a top level call prompt. Accordingly, a caller may specify a command and one or more command parameters at a top level prompt, thereby completing a transaction that would require hearing and responding to several call prompts in a conventional system.
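The passive confirmation described in paragraph [0018] can be sketched in a few lines of Python. The sketch assumes the caller's reply, if any, arrives as recognized text; the function names, the "commit"/"reprompt" return values, and the fallback label "posting" for an untagged entry are hypothetical choices made for illustration.

```python
from typing import Optional

# Cancel words named in the description.
CANCEL_WORDS = {"no", "stop", "cancel"}


def confirmation_text(tag: Optional[str], when: str) -> str:
    """Build the spoken confirmation, e.g. "Adding as meeting for tomorrow at 5 pm"."""
    # Assumption: an untagged (generic) posting is announced as "posting".
    label = tag if tag else "posting"
    return f"Adding as {label} for {when}"


def resolve_confirmation(caller_reply: Optional[str]) -> str:
    """Decide what happens after the confirmation has been spoken.

    Silence (None) or any reply that is not a cancel word commits the
    posting; a cancel word sends the caller back to the command prompt.
    """
    if caller_reply is not None and caller_reply.strip().lower() in CANCEL_WORDS:
        return "reprompt"
    return "commit"
```

Treating silence as acceptance is what makes the confirmation passive: the caller acts only when the system has misunderstood. An affirmative scheme, which the description mentions as an alternative design choice, would instead require an explicit "yes" before committing.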
[0020] In one embodiment of the invention, a caller may cancel a voice command if, for example, the system attempts to confirm a command that has been misinterpreted. To cancel a command or command parameter, a caller may simply speak a particular command, such as, "cancel", "stop", or "no".
[0021] FIG. 3 illustrates various call flow paths associated with retrieving or recalling previously entered postings, according to an embodiment of the invention. After being prompted to add or recall a posting, the user simply issues a voice command including the command word "recall" in order to recall previously input postings. As with the voice commands and parameters for adding a posting, when recalling information, the user can provide a special command parameter indicating a tag or label to retrieve only postings of the type associated with the spoken tag or label. Additionally, the user can provide additional command parameters such as time and date information, to indicate to the system the exact postings to be retrieved.
[0022] As illustrated by the call flow path with reference number 36 of FIG. 3, if a user simply speaks the command "recall", the scheduling application will present the user with all postings. For example, the scheduling application may announce the number of postings, and then begin playback of all previously recorded audio postings. In one embodiment of the invention, a user may be able to control aspects of how the scheduling application plays back audio clips, for example by issuing commands (e.g., spoken or touch tone) to skip, fast forward, rewind, replay, etc.
[0023] As illustrated by the call flow path with reference number 38 of FIG. 3, when the user issues a command to recall a posting along with a particular command parameter indicating a posting type, for example, by speaking the tag or label associated with the type, the scheduling application will play back all the audio clip postings of the particular type associated with the spoken tag or label. For example, if the user requests to have all appointments recalled, the user may speak the voice command, "recall appointments". Accordingly, the scheduling application will retrieve all the audio clips associated with all previously recorded postings that are indicated as having type "appointment". The scheduling application may present the number of postings, and then begin playback of the postings.
[0024] Call flow path 40 illustrates an example where the user issues a command to recall all postings for a particular date or time range - in this case, the upcoming week. Accordingly, the scheduling application will retrieve and replay all postings, regardless of type, that are scheduled for the upcoming week. In this way, the user may hear all meetings, appointments, birthdays, notes, etc., for the upcoming week.
[0025] Call flow path 42 illustrates an example where the user issues a command to recall postings associated with a particular label and a particular date or time period. Accordingly, the scheduling application will retrieve and replay all postings associated with the "<tag>" spoken by the user, which are scheduled for the particular time period requested by the user. For example, if the user desires to hear all appointments in the next week, the user may request, "recall appointments for this week", and the scheduling application will replay all appointments for the week. The time range may be specified by date (e.g., a particular day, a particular week, a particular range of days or weeks, or months) as well as times (e.g., 8 am through 5 pm).
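The four recall paths 36, 38, 40, and 42 differ only in which optional parameters accompany the "recall" command, so they can be sketched as a single filter. The following Python is a minimal sketch under stated assumptions: the `Posting` record, its field names, and the `recall` signature are hypothetical, and date ranges are simplified to whole days.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Posting:
    """Hypothetical stored posting: an audio clip plus optional metadata."""
    tag: Optional[str]     # None represents a generic, untagged posting
    when: Optional[date]   # None when no date was supplied
    audio_clip: str        # identifier of the recorded description


def recall(postings: List[Posting],
           tag: Optional[str] = None,
           start: Optional[date] = None,
           end: Optional[date] = None) -> List[Posting]:
    """Return the postings matching whichever optional parameters were spoken."""
    results = []
    for posting in postings:
        if tag is not None and posting.tag != tag:
            continue
        if start is not None or end is not None:
            # A date filter excludes postings that carry no date.
            if posting.when is None:
                continue
            if start is not None and posting.when < start:
                continue
            if end is not None and posting.when > end:
                continue
        results.append(posting)
    return results
```

A bare "recall" corresponds to calling `recall(postings)` with no filters (path 36), "recall appointments" to passing only `tag` (path 38), a date range alone to passing `start` and `end` (path 40), and "recall appointments for this week" to passing all three (path 42). The application could then announce `len(results)` before playing back each audio clip.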
[0026] The foregoing description of various implementations of the invention has been presented for purposes of illustration and description. It is not exhaustive and does not limit the invention to the precise form or forms disclosed. Furthermore, it will be appreciated by those skilled in the art that the present invention may find practical application in a variety of alternative contexts that have not explicitly been addressed herein. Finally, the illustrative processing steps performed by a computer-implemented program (e.g., instructions) may be executed simultaneously, or in a different order than described above, and additional processing steps may be incorporated. The invention may be implemented in hardware, software, or a combination thereof. When implemented partly in software, the invention may be embodied as a set of instructions stored on a computer-readable medium. The scope of the invention is defined by the claims and their equivalents.

Claims

CLAIMS
What is claimed is:
1. An IVR system including call flow optimization logic configured to process a spoken command and one or more command parameters received at a top level command prompt, so as to dynamically determine whether a secondary command prompt is required to prompt for additional command parameters, based on an analysis of the command and command parameters received at the top level command prompt.
2. An IVR system including call flow optimization logic configured to primarily use a single top-level command prompt, such that a voice command and optional command parameters received at the command prompt are automatically processed regardless of the number of command parameters received; and said call flow optimization logic is configured to dynamically determine whether a confirmation prompt and/or secondary command prompt is necessary, based on an analysis of the voice command and optional command parameters received.
3. A scheduling application, integrated with an IVR system, configured to i) receive a voice command and one or more optional command parameters at a top level command prompt; and ii) process the received command and the one or more optional command parameters so as to dynamically determine, based on the received command and optional command parameters, whether it is necessary to prompt for additional command parameters.
4. A scheduling application, integrated with an IVR system, configured to enable a user to record and/or retrieve recorded audio clips by providing a natural language command that may or may not include additional command parameters indicating an audio clip type, date and/or time associated with the audio clip.
PCT/US2008/052501 2007-01-31 2008-01-30 Free form voice command model interface for ivr navigation WO2008095021A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN113/KOL/2007 2007-01-31
IN113KO2007 2007-01-31

Publications (2)

Publication Number Publication Date
WO2008095021A1 (en) 2008-08-07
WO2008095021A9 WO2008095021A9 (en) 2014-12-04

Family

ID=39674488

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2008/052501 WO2008095021A1 (en) 2007-01-31 2008-01-30 Free form voice command model interface for ivr navigation

Country Status (1)

Country Link
WO (1) WO2008095021A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020077829A1 (en) * 2000-12-19 2002-06-20 Brennan Paul Michael Speech based status and control user interface customisable by the user
US20050152516A1 (en) * 2003-12-23 2005-07-14 Wang Sandy C. System for managing voice files of a voice prompt server
US20060122837A1 (en) * 2004-12-08 2006-06-08 Electronics And Telecommunications Research Institute Voice interface system and speech recognition method

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9338295B2 (en) 2013-09-27 2016-05-10 At&T Intellectual Property I, L.P. Dynamic modification of automated communication systems
US9609127B2 (en) 2013-09-27 2017-03-28 At&T Intellectual Property I, L.P. Dynamic modification of automated communication systems
US9794405B2 (en) 2013-09-27 2017-10-17 At&T Intellectual Property I, L.P. Dynamic modification of automated communication systems

Also Published As

Publication number Publication date
WO2008095021A9 (en) 2014-12-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08728588

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08728588

Country of ref document: EP

Kind code of ref document: A1