US20080319757A1 - Speech processing system based upon a representational state transfer (REST) architecture that uses Web 2.0 concepts for speech resource interfaces - Google Patents

Speech processing system based upon a representational state transfer (REST) architecture that uses Web 2.0 concepts for speech resource interfaces

Info

Publication number
US20080319757A1
US20080319757A1 (application US11/765,900)
Authority
US
United States
Prior art keywords
speech
web
server
command
enabled application
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/765,900
Inventor
William V. Da Palma
Victor S. Moore
Wendi L. Nusbickel
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp filed Critical International Business Machines Corp
Priority to US11/766,002 (US7890333B2)
Priority to US11/765,900 (US20080319757A1)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. Assignors: MOORE, VICTOR S.; DA PALMA, WILLIAM V.; NUSBICKEL, WENDI L.
Priority to US11/766,210 (US8032379B2)
Priority to US11/766,335 (US7996229B2)
Priority to US11/766,255 (US9311420B2)
Priority to US11/766,157 (US8041573B2)
Priority to US11/766,139 (US7631104B2)
Priority to US11/766,291 (US8074202B2)
Priority to PCT/EP2008/057671 (WO2008155343A2)
Publication of US20080319757A1
Current legal status: Abandoned

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/28: Constructional details of speech recognition systems
    • G10L15/30: Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/957: Browsing optimisation, e.g. caching or content distillation
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/02: Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/28: Constructional details of speech recognition systems
    • G10L15/32: Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems

Definitions

  • the present invention relates to the field of speech processing technologies and, more particularly, to a speech processing system based upon Representational State Transfer (REST) architecture that uses Web 2.0 concepts for speech resource interfaces.
  • REST Representational State Transfer
  • Web 2.0 signifies a second generation of Web based services and applications that emphasize online collaboration and information sharing among users.
  • a Web 1.0 application would be one that was effectively read-only from a user perspective, whereas a Web 2.0 application would provide read, write, and update access to end-users.
  • Web 2.0 users can fundamentally change a Web 2.0 application.
  • Web 2.0 instances include WIKIs, BLOGs, social networking sites, FOLKSONOMIEs, MASHUPs, and the like. All of these Web 2.0 instances allow end-users to add content which other users are able to access. A value of a Web 2.0 Web site is enhanced by the user provided content and may even be completely dependent upon it.
  • WIKIPEDIA e.g., one Web 2.0 application
  • WIKIPEDIA is a WIKI based encyclopedia where each end-user is able to view, add, and edit content. No content would exist without end-user contributions. Information accuracy results from an end-user population constantly updating erroneous entries which other users provide. As new innovations emerge, customers update and add WIKIPEDIA entries that describe these new innovations.
  • Web 2.0 applications include MYSPACE.com, YOUTUBE.com, DEL.ICIO.US.com, CRAIGSLIST.com, and the like.
  • a problem contributing to the schism is that speech processing technologies are currently implemented using a non-uniform interface and the Web 2.0 is generally based upon a uniform interface. That is, speech processing operations are accessed via function calls, method invocations, remote procedure calls (RPC), and other messages that are only understood by a specific server or a small subset of components. A specific invocation mechanism and required parameters must be known by a client and must be integrated into an interface.
  • a non-uniform interface is characteristic of RPC based techniques, which include Simple Object Access Protocol (SOAP), Common Object Request Broker Architecture (CORBA), Distributed Component Object Model (DCOM), JINI, and the like. Without deliberate integration efforts, however, the chances that two software objects designed from an unconstrained architecture will interoperate are near nil.
  • a uniform interface exists that includes a few basic primitive commands (e.g., GET, PUT, POST, DELETE) that act upon targets, which in a Web 2.0 context are generally able to be referenced by Uniform Resource Identifiers (URIs).
  • URIs Uniform Resource Identifiers
  • a term used for this type of architecture is Representational State Transfer (REST).
  • REST based solutions simplify component implementation, reduce the complexity of connector semantics, improve the effectiveness of performance tuning, and increase the scalability of pure server components.
  • the Web e.g., hypertext technologies
  • Web 2.0 expands these REST principles to permit end users to add (HTTP PUT), update (HTTP POST), and remove (HTTP DELETE) content.
  • WIKIs, BLOGs, FOLKSONOMIEs, MASHUPs, and the like are all considered RESTful, since each generally follows REST principles.
  • the present invention discloses a RESTful speech processing system that uses Web 2.0 concepts for interfacing with server-side speech resources.
  • the RESTful speech processing system can be used to add customizable speech processing capabilities to Web 2.0 instances, such as WIKIs, BLOGs, social networking sites, FOLKSONOMIEs, MASHUPs, and the like.
  • the invention can access speech-enabled applications via introspection documents.
  • Each speech-enabled application can contain a collection of entries and resources.
  • the entries can include Web 2.0 entries, such as WIKI entries and the resources can include speech resources, such as speech recognition, speech synthesis, speech identification, and voice interpreter resources.
  • Each entry and resource can be further decomposed into sub-components specified at a lower granularity level.
  • Each application resource/entry can be introspected, customized, replaced, added, re-ordered, and/or removed by end users.
  • one aspect of the present invention can include a speech processing system that includes a client, a speech for Web 2.0 system, and a speech processing system.
  • the client can access a speech-enabled application using at least one Web 2.0 communication protocol.
  • a standard browser of the client can use a HyperText Transfer Protocol (HTTP) to communicate with the speech-enabled application executing on the speech for Web 2.0 system.
  • HTTP HyperText Transfer Protocol
  • the speech for Web 2.0 system can access a data store within which user specific speech parameters are included, wherein a user of the client is able to configure the specific speech parameters of the data store.
  • a user can configure which speech resources are available (e.g., TTS, ASR, SIV, VoiceXML interpreter, and the like), resource characteristics (language, grammar, voice gender, speaking rate, and the like), delivery characteristics (real-time or not, synchronous or not, delivery protocol, delivery codec, delivery fidelity, and the like), and other such characteristics. Suitable ones of these speech parameters are utilized whenever the user interacts with the Web 2.0 system.
  • the speech processing system can include one or more speech processing engines.
  • the speech processing system can interact with the speech for Web 2.0 system to handle speech processing tasks associated with the speech-enabled application.
  • the present invention can include a system for using Web 2.0 as an interface to speech engines.
  • the system can include a Web 2.0 server and a server-side speech processing system.
  • the Web 2.0 server can serve at least one speech-enabled application to at least one remotely located client.
  • the server-side speech processing system can handle speech processing operations for the speech-enabled applications. Communications with the server-side speech processing system can occur via a set of RESTful commands, such as GET, PUT, POST, and DELETE.
  • Still another aspect of the present invention can include a speech for Web 2.0 system that includes a Web 2.0 server.
  • the Web 2.0 server can serve at least one speech-enabled application to remotely located clients.
  • the speech-enabled application can include an introspection document, a collection of entries, and a collection of resources. At least one of the resources can be a speech resource associated with a speech engine, which adds a speech processing capability to the speech-enabled application.
  • various aspects of the invention can be implemented as a program for controlling computing equipment to implement the functions described herein, or a program for enabling computing equipment to perform processes corresponding to the steps disclosed herein.
  • This program may be provided by storing the program in a magnetic disk, an optical disk, a semiconductor memory, or any other recording medium.
  • the program can also be provided as a digitally encoded signal conveyed via a carrier wave.
  • the described program can be a single program or can be implemented as multiple subprograms, each of which interact within a single computing device or interact in a distributed fashion across a network space.
  • the methods detailed herein can also be methods performed at least in part by a service agent and/or a machine manipulated by a service agent in response to a service request.
  • FIG. 1 is a schematic diagram of a system that utilizes Web 2.0 concepts for speech processing operations in accordance with an embodiment of the inventive arrangements disclosed herein.
  • FIG. 2 is a schematic diagram of a system for a Web 2.0 for voice system in accordance with an embodiment of the inventive arrangements disclosed herein.
  • FIG. 3 is a schematic diagram showing a WIKI server adapted for communications with a Web 2.0 for voice system in accordance with an embodiment of the inventive arrangements disclosed herein.
  • FIG. 1 is a schematic diagram of a system 100 that utilizes Web 2.0 concepts for speech processing operations in accordance with an embodiment of the inventive arrangements disclosed herein.
  • a user 110 can use an interface 114 of client 112 to communicate with the speech for Web 2.0 system 120 , which can include a Web 2.0 server 122 and/or a RESTful server 130 .
  • a middleware server 116 can provide an interface 118 to system 120 .
  • Interface 114 and/or 118 can be a Web or voice browser, which communicates directly with system 120 using Web 2.0 conventions.
  • Applications 126 , which the client 112 accesses, can be voice-enabled applications stored in data store 124 .
  • a type of browser (e.g., interface 114 and/or 118 ) used to access the applications 126 can be transparent to the system 120 , or can be transparent at least to RESTful server 130 of system 120 .
  • the RESTful server 130 can provide speech processing operations for applications 126 by interfacing with speech processing system 150 .
  • Communications between the Web 2.0 server 122 and the RESTful server 130 can be REST based communications, such as those conducted using the ATOM PUBLISHING PROTOCOL (APP).
  • APP ATOM PUBLISHING PROTOCOL
  • servers 122 and 130 can be functionally integrated into a single server of speech for Web 2.0 system 120 .
  • the RESTful server 130 can utilize a set of basic commands enabling the command engine 132 to conduct speech processing operations.
  • the commands can be REST commands that include an HTTP GET, an HTTP POST, an HTTP PUT, and an HTTP DELETE command.
  • the RESTful server 130 can also include an introspection/discovery engine 134 and/or a media engine 136 as well as data store 138 .
  • Data store 138 can include a set of documents 140 , such as introspection documents 142 , entry collection documents 144 , and resource collection documents 146 .
  • the documents 140 together can link the RESTful server 130 to speech processing engines 156 of speech processing server 150 and can control behavior of speech processing server 150 .
  • the documents 140 and resulting behavior of the speech processing server 150 can be configured by user 110 in a user-specific manner. That is, different users 110 can inject their own voice characteristics, markup, behavior, and/or other features, which the speech processing system 150 utilizes.
  • the Web 2.0 system 120 can be communicatively linked to one or more enterprise servers 158 having an associated data store 160 .
  • the Web 2.0 system 120 can be a communication intermediary which provides user 110 with access to information and services of the enterprise server and data store 160 .
  • Web 2.0 system 120 can further be communicatively linked to one or more additional RESTful servers 162 , each associated with a data store 164 , within which a set of documents, approximately equivalent to documents 140 , are stored. Communications between Web 2.0 system 120 and speech processing system 150 or RESTful server 162 can be based on a RESTful protocol, such as APP.
  • RESTful servers 130 and 162 are able to operate in a stateless fashion which permits RESTful server 162 to seamlessly replace functionality of server 130 . That is, state information does not have to be transferred when control is transferred from one server 130 to another 162 .
  • system 100 provides a highly scalable solution (i.e., when under a heavy load, server 130 can transfer load to server 162 ) and can provide fault tolerance and recovery capabilities (i.e., when server 130 experiences runtime problems, a different operational server 162 can immediately perform operations previously handled by server 130 ).
  • client 112 is able to interact with the speech-enabled application 126 using Web 2.0 communication protocols only. No special client-side speech interface is required. At the same time, the user 110 is able to customize/personalize/configure speech processing behavior at low-levels.
  • Web 2.0 is a concept that refers to a cooperative Web in which end-users 110 add value by providing content, as opposed to Web systems that unidirectionally provide information from an information provider to an information consumer.
  • Web 2.0 refers to a readable, writable, and updateable Web. While a myriad of types of Web 2.0 instances exist, some currently popular ones include WIKIs, BLOGS, MASHUPs, FOLKSONOMIEs, social networking sites, and the like.
  • REST refers to a Representational State Transfer architecture.
  • a REST approach focuses on utilizing a constrained operation set, such as GET, PUT, POST, and DELETE, to act against a set of structured targets which can be URL addressable.
  • a REST architecture is a client/server architecture which is stateless, cacheable, and layered by nature. REST replaces a paradigm of do-something with a make-something-so concept. That is, instead of attempting to execute a kind of state transition for a software object, the REST concept changes a state of a software object to a user designated state.
  • a RESTful object e.g., RESTful server 130 , 162
  • a RESTful interface can be a simple interface that transmits domain-specific data using an HTTP based protocol without utilizing an additional messaging layer, such as SOAP, and without reliance on session tracking HTTP cookies.
  • the client 112 can be any computing device capable of communicating with either the system 120 or middleware server 116 .
  • client 112 can include a Web browser 114 , which operates as an interface between the user 110 and the system 120 .
  • the client 112 can be a voice communication device that communicates with the middleware server 116 , which can include a voice browser 118 .
  • specific instances of the client 112 can include a computer, a Web station, a media player, a telephone, a smart phone, and the like.
  • Web 2.0 server 120 can be a server that provides Web content to interface 114 and/or 118 and which permits a user 110 to provide additional Web content, which is made available to other users.
  • the Web 2.0 server can be a WIKI server, a BLOG server, a social networking server, a MASHUP server, a FOLKSONOMY server, and the like.
  • the Web 2.0 server 120 can be a RESTful server, in which case functionality shown for server 130 can be incorporated within server 120 .
  • a transformer can be included in the Web 2.0 server, which converts content between a server-specific format (e.g., a WIKI format) and a RESTful format, such as a format adhering to an APP based protocol.
  • RESTful servers 130 and 162 can be servers adhering to REST concepts, which link the server 120 to speech processing server 150 .
  • the RESTful server 130 can be an APP server.
  • RESTful commands can be issued by command engine 132 , which are received and processed by command interpreter 154 .
  • a media interface 136 of the RESTful server 130 can control caching, delivery, fidelity, and formatting of delivered media, which includes delivered speech. Delivery can be in accordance with a streaming protocol, a file based protocol, a real-time protocol, and the like.
  • Speech processing server 150 can be any networked server or speech processing system which is able to process speech requests using one or more speech engines 156 .
  • the speech processing server 150 can be a turn-based and/or clustered system capable of handling multiple requests in real-time.
  • speech processing server 150 can be implemented as a WEBSPHERE VOICE SERVER or other such commercially available product. Management tasks of the server 150 can be handled by the management processor 152 .
  • the various speech engines 156 can include ASR, TTS, SIV, voice markup interpreters, and the like.
  • Data stores 124 , 138 , 160 , and 164 can each be a physical or virtual storage space configured to store digital information.
  • Data stores 124 , 138 , 160 , and 164 can be physically implemented within any type of hardware including, but not limited to, a magnetic disk, an optical disk, a semiconductor memory, a digitally encoded plastic memory, a holographic memory, or any other recording medium.
  • Each of the data stores 124 , 138 , 160 , and 164 can be a stand-alone storage unit as well as a storage unit formed from a plurality of physical devices. Additionally, information can be stored within data stores 124 , 138 , 160 , and 164 in a variety of manners.
  • information can be stored within a database structure or can be stored within one or more files of a file storage system, where each file may or may not be indexed for information searching purposes.
  • data stores 124 , 138 , 160 , and 164 can utilize one or more encryption mechanisms to protect stored information from unauthorized access.
  • the components of system 100 can be communicatively linked to each other via a network (not shown).
  • the network can include any hardware, software, and firmware necessary to convey data encoded within carrier waves. Data can be contained within analog or digital signals and conveyed through data or voice channels.
  • the network can include local components and data pathways necessary for communications to be exchanged among computing device components and between integrated device components and peripheral devices.
  • the network can also include network equipment, such as routers, data lines, hubs, and intermediary servers which together form a data network, such as the Internet.
  • the network can also include circuit-based communication components and mobile communication components, such as telephony switches, modems, cellular communication towers, and the like.
  • the network can include line based and/or wireless communication pathways.
  • FIG. 2 is a schematic diagram of a system 200 for a Web 2.0 for voice system 230 in accordance with an embodiment of the inventive arrangements disclosed herein.
  • System 200 can be an alternative representation and/or an embodiment for the system 100 of FIG. 1 or for a system that provides approximately equivalent functionality as system 100 utilizing Web 2.0 concepts to provide speech processing capabilities.
  • Web 2.0 clients 240 can communicate with Web 2.0 servers 210 - 214 utilizing a REST/ATOM 250 protocol.
  • the Web 2.0 servers 210 - 214 can serve one or more speech-enabled applications 220 - 224 , where speech resources are provided by a Web 2.0 for Voice system 230 .
  • One or more of the applications 220 - 224 can include AJAX 256 or other JavaScript code.
  • the AJAX 256 code can be automatically converted from WIKI or other syntax by a transformer of a server 210 - 214 .
  • Communications between the Web 2.0 servers 210 - 214 and system 230 can be in accordance with REST/ATOM 256 protocols.
  • Each speech-enabled application 220 - 224 can be associated with an ATOM container 231 , which specifies Web 2.0 items 232 , resources 233 , and media 234 .
  • One or more resource 233 can correspond to a speech engine 238 .
  • the Web 2.0 clients 240 can be any client capable of interfacing with a Web 2.0 server 210 - 214 .
  • the clients 240 can include a Web or voice browser 241 as well as any other type of interface 244 , which executes upon a computing device.
  • the computing device can include a mobile telephone 242 , a mobile computer 243 , a laptop, a media player, a desktop computer, a two-way radio, a line-based phone, and the like.
  • the clients 240 need not have a speech-specific interface and instead only require a standard Web 2.0 interface. That is, there are no assumptions regarding the client 240 other than an ability to communicate with a Web 2.0 server 210 - 214 using Web 2.0 conventions.
  • the Web 2.0 servers 210 - 214 can be any server that provides Web 2.0 content to clients 240 and that provides speech processing capabilities through the Web 2.0 for voice system 230 .
  • the Web 2.0 servers can include a WIKI server 210 , a BLOG server 212 , a MASHUP server, a FOLKSONOMY server, a social networking server, and any other Web 2.0 server 214 .
  • the Web 2.0 for voice system 230 can utilize Web 2.0 concepts to provide speech capabilities.
  • a server-side interface is established between the voice system 230 and a set of Web 2.0 servers 210 - 214 .
  • Available speech resources can be introspected and discovered via introspection documents, which are one of the Web 2.0 items 232 .
  • Introspection can be in accordance with the APP specification or a similar protocol. The ability for dynamic configuration and installation is exposed to the servers 210 - 214 via the introspection document.
  • access to the Web 2.0 for voice system 230 can be through a Web 2.0 server that lets users (e.g., clients 240 ) provide their own customizations/personalizations.
  • use of the APP 256 opens up the application interface to speech resources using Web 2.0, JAVA 2 ENTERPRISE EDITION (J2EE), WEBSPHERE APPLICATION SERVER (WAS), and other conventions, rather than being restricted to protocols, such as media resource control protocol (MRCP), real time streaming protocol (RTSP), or real time protocol (RTP).
  • MRCP media resource control protocol
  • RTSP real time streaming protocol
  • RTP real time protocol
  • a constrained set of RESTful commands can be used to interface with the Web 2.0 for voice system 230 .
  • RESTful commands can include a GET command, a POST command, a PUT command, and a DELETE command, each of which is able to be implemented as an HTTP command.
  • GET e.g., HTTP GET
  • the GET command can also be used for submitting simplistic speech queries and for receiving query results.
  • the POST command can create media-related resources using speech engines 238 .
  • the POST command can create an audio “file” from input text using a text-to-speech (TTS) resource 233 which is linked to a TTS engine 238 .
  • TTS text-to-speech
  • the POST command can create a text representation given an audio input, using an automatic speech recognition (ASR) resource 233 which is linked to an ASR engine 238 .
  • ASR automatic speech recognition
  • the POST command can create a score given an audio input, using a Speaker Identification and Verification (SIV) resource which is linked to a SIV engine 238 . Any type of speech processing resource can be similarly accessed using the POST command.
  • SIV Speaker Identification and Verification
  • the PUT command can be used to update configuration of speech resources (e.g., default voice-name, ASR or TTS language, TTS voice, media destination, media delivery type, etc.).
  • the PUT command can also be used to add a resource or capability to a Web 2.0 server 210 - 214 (e.g., installing an SIV component).
  • the DELETE command can remove a speech resource from a configuration. For example, the DELETE command can be used to uninstall a previously installed speech component.
  • Customizable speech processing elements can include speech resource availability, request characteristics, result characteristics, media characteristics, and the like.
  • Speech resource availability can indicate whether a specific type of resource (e.g., ASR, TTS, SIV, VoiceXML interpreter) is available.
  • Request characteristics can refer to characteristics such as language, grammar, voice attributes, gender, rate of speech, and the like.
  • the result characteristics can specify whether results are to be delivered synchronously or asynchronously. Result characteristics can alternatively indicate whether a listener for callback is to be supplied with results.
  • Media characteristics can include input and output characteristics, which can vary from a URI reference to an RTP stream.
  • the media characteristics can specify a codec (e.g., G711), a sample rate (e.g., 8 kHz to 22 kHz), and the like.
  • the speech engines 238 can be provided from a J2EE environment 236 , such as a WAS environment. This environment 236 can conform to a J2EE Connector Architecture (JCA) 237 .
  • JCA J2EE Connector Architecture
  • a set of additional facades 260 can be utilized on top of Web 2.0 protocols to provide additional interface and protocol 262 options (e.g., MRCP, RTSP, RTP, Session Initiation Protocol (SIP), etc.) to the Web 2.0 for voice system 230 .
  • Use of facades 260 can enable legacy access/use of the Web 2.0 for voice system 230 .
  • the facades 260 can be designed to segment the protocol 262 from underlying details so that characteristics of the facade do not bleed through to speech implementation details.
  • Functions, such as the WAS 6.1 channel framework or a JCA container, can be used to plug in a protocol which is not native to the J2EE environment 236 .
  • the media component 234 of the container 231 can be used to handle media storage, delivery, and format conversions as necessary. Facades 260 can be used for asynchronous or synchronous protocols 262 .
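  • A facade of this kind might be sketched as follows; the interfaces, the MRCP-style speak call, and the resource path are assumptions used only to illustrate translating a legacy protocol request onto the RESTful command set.

    // Hypothetical facade sketch: adapting a legacy, protocol-specific request
    // (e.g., an MRCP-style SPEAK) onto the RESTful command set, so legacy clients
    // can reach the Web 2.0 for voice system without its REST details bleeding
    // through. All names are illustrative, not defined by this disclosure.
    public class LegacyFacadeSketch {

        /** The constrained command set exposed by the Web 2.0 for voice system. */
        interface RestfulVoiceCommands {
            byte[] post(String resourceUri, byte[] body);  // create media
            String get(String resourceUri);                // read capabilities
        }

        static class MrcpFacade {
            private final RestfulVoiceCommands rest;

            MrcpFacade(RestfulVoiceCommands rest) {
                this.rest = rest;
            }

            /** Translates a legacy synthesis request into a RESTful POST. */
            byte[] speak(String text) {
                return rest.post("/voice/resources/tts", text.getBytes());
            }
        }
    }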
  • FIG. 3 is a schematic diagram showing a WIKI server 330 adapted for communications with a Web 2.0 for voice system 310 in accordance with an embodiment of the inventive arrangements disclosed herein.
  • server 330 can be any Web 2.0 server (e.g., server 120 of system 100 or server 210 - 214 of system 200 ) including, but not limited to, a BLOG server, a MASHUP server, a FOLKSONOMY server, a social networking server, and the like.
  • a browser 320 can communicate with Web 2.0 server 330 via Representational State Transfer (REST) architecture / ATOM 304 based protocol.
  • the Web 2.0 server 330 can communicate with a speech for Web 2.0 system 310 via a REST/ATOM 302 based protocol.
  • Protocols 302 , 304 can include HTTP and similar protocols that are RESTful by nature as well as an Atom Publishing Protocol (APP) or other protocol that is specifically designed to conform to REST principles.
  • APP Atom Publishing Protocol
  • the Web 2.0 server 330 can include a data store 332 in which applications 334 , which can be speech-enabled, are stored.
  • the applications 334 can be written in a WIKI or other Web 2.0 syntax and can be stored in an APP format.
  • the contents of the application 334 can be accessed and modified using editor 350 .
  • the editor 350 can be a standard WIKI or other Web 2.0 editor having a voice plug-in or extensions 352 .
  • user-specific modifications made to the speech-enabled application 334 via the editor 350 can be stored in a customization data store as a customization profile and/or a state definition.
  • the customization profile and state definition can contain customization settings that can override entries contained within the original application 334 . Customizations can be related to a particular user or set of users.
  • the transformer 340 can convert WIKI or other Web 2.0 syntax into standard markup for browsers.
  • the transformer 340 can be an extension of a conventional transformer that supports HTML and XML.
  • the extended transformer 340 can be enhanced to handle JAVASCRIPT, such as AJAX.
  • resource links of application 332 can be converted into AJAX functions by the transformer 340 having an AJAX plug-in 342 .
  • the transformer 340 can also include a VoiceXML plug-in 344 , which generates VoiceXML markup for voice-only clients.
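  • A rough sketch of such transformer plug-ins appears below; the interface, method names, and emitted markup (including the playTts client-side function) are assumptions meant only to illustrate rendering the same resource link as AJAX-enabled markup or as VoiceXML.

    // Hypothetical sketch of transformer 340 plug-ins: the same application
    // content is rendered as AJAX-enabled markup for Web browsers or as VoiceXML
    // for voice-only clients. Interfaces and emitted strings are illustrative.
    public class TransformerSketch {

        interface RenderingPlugin {
            /** Converts a WIKI-syntax resource link into client-side markup. */
            String render(String resourceUri, String prompt);
        }

        static class AjaxPlugin implements RenderingPlugin {
            @Override
            public String render(String resourceUri, String prompt) {
                // playTts is a hypothetical client-side function that fetches
                // synthesized audio from the resource URI over HTTP.
                return "<button onclick=\"playTts('" + resourceUri + "','"
                        + prompt + "')\">" + prompt + "</button>";
            }
        }

        static class VoiceXmlPlugin implements RenderingPlugin {
            @Override
            public String render(String resourceUri, String prompt) {
                return "<vxml version=\"2.1\"><form><block><prompt>"
                        + prompt + "</prompt></block></form></vxml>";
            }
        }
    }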
  • the present invention may be realized in hardware, software, or a combination of hardware and software.
  • the present invention may be realized in a centralized fashion in one computer system or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods described herein is suited.
  • a typical combination of hardware and software may be a general purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.
  • the present invention also may be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which when loaded in a computer system is able to carry out these methods.
  • Computer program in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code or notation; b) reproduction in a different material form.

Abstract

A speech processing system can include a client, a speech for Web 2.0 system, and a speech processing system. The client can access a speech-enabled application using at least one Web 2.0 communication protocol. For example, a standard browser of the client can use a standard protocol to communicate with the speech-enabled application executing on the speech for Web 2.0 system. The speech for Web 2.0 system can access a data store within which user specific speech parameters are included, wherein a user of the client is able to configure the specific speech parameters of the data store. Suitable ones of these speech parameters are utilized whenever the user interacts with the Web 2.0 system. The speech processing system can include one or more speech processing engines. The speech processing system can interact with the speech for Web 2.0 system to handle speech processing tasks associated with the speech-enabled application.

Description

    BACKGROUND
  • 1. Field of the Invention
  • The present invention relates to the field of speech processing technologies and, more particularly, to a speech processing system based upon Representational State Transfer (REST) architecture that uses Web 2.0 concepts for speech resource interfaces.
  • 2. Description of the Related Art
  • In the past, companies having a Web presence thrived by providing as many people broad access to as much information as possible. Information flow was unidirectional, from a company to information consumers. As time has progressed, users have become inundated with too much information from too many sources. Successful Web sites began to provide user-facing information management and information filtration mechanisms designed to aid users in identifying information of interest. Even these Web sites were somewhat flawed in the sense that information still flowed in a unidirectional manner. A user was limited to information gathered and groomed by a particular information provider.
  • A new type of Web application began to emerge which emphasized user interactions and two-way information exchange. These new Web applications operated more as information marketplaces where people shared information and not as information depots where users accessed a semi-static reservoir of information. This new Web and set of Web applications can be referred to as Web 2.0, where Web 2.0 signifies a second generation of Web based services and applications that emphasize online collaboration and information sharing among users. In other words, a Web 1.0 application would be one that was effectively read-only from a user perspective, whereas a Web 2.0 application would provide read, write, and update access to end-users. Web 2.0 users can fundamentally change a Web 2.0 application.
  • Specific examples of Web 2.0 instances include WIKIs, BLOGs, social networking sites, FOLKSONOMIEs, MASHUPs, and the like. All of these Web 2.0 instances allow end-users to add content which other users are able to access. A value of a Web 2.0 Web site is enhanced by the user provided content and may even be completely dependent upon it.
  • For example, WIKIPEDIA (e.g., one Web 2.0 application) is a WIKI based encyclopedia where each end-user is able to view, add, and edit content. No content would exist without end-user contributions. Information accuracy results from an end-user population constantly updating erroneous entries which other users provide. As new innovations emerge, customers update and add WIKIPEDIA entries that describe these new innovations. Other examples of Web 2.0 applications include MYSPACE.com, YOUTUBE.com, DEL.ICIO.US.com, CRAIGSLIST.com, and the like.
  • Currently, a schism exists between speech processing technologies and Web 2.0 applications, meaning that Web 2.0 instances do not generally incorporate speech processing technologies. One reason for this is that conventional interfaces to speech resources are too complex for an average end-user to utilize. For this reason, speech technologies are typically only available from Web sites/services that provide a unidirectional flow of information. For example, speech technologies are commonly used by enterprises to handle routine customer interactions via a telephone interface, such as providing bank balances and the like.
  • One problem contributing to the schism is that speech processing technologies are currently implemented using a non-uniform interface and the Web 2.0 is generally based upon a uniform interface. That is, speech processing operations are accessed via function calls, method invocations, remote procedure calls (RPC), and other messages that are only understood by a specific server or a small subset of components. A specific invocation mechanism and required parameters must be known by a client and must be integrated into an interface. A non-uniform interface is characteristic of RPC based techniques, which include Simple Object Access Protocol (SOAP), Common Object Request Broker Architecture (CORBA), Distributed Component Object Model (DCOM), JINI, and the like. Without deliberate integration efforts, however, the chances that two software objects designed from an unconstrained architecture will interoperate are near nil. At best, an ad hoc collection of software objects having vastly different interface requirements results from the RPC style architecture. The lack of uniform interfaces makes integrating speech processing capabilities for each RPC based application a unique endeavor fraught with application specific challenges, which usually require significant speech processing design skills to overcome.
  • In contrast, a uniform interface exists that includes a few basic primitive commands (e.g., GET, PUT, POST, DELETE) that act upon targets, which in a Web 2.0 context are generally able to be referenced by Uniform Resource Identifiers (URIs). A term used for this type of architecture is Representational State Transfer (REST). REST based solutions simplify component implementation, reduce the complexity of connector semantics, improve the effectiveness of performance tuning, and increase the scalability of pure server components. The Web (e.g., hypertext technologies) in general is founded upon REST principles. Web 2.0 expands these REST principles to permit end users to add (HTTP PUT), update (HTTP POST), and remove (HTTP DELETE) content. Thus, WIKIs, BLOGs, FOLKSONOMIEs, MASHUPs, and the like are all considered RESTful, since each generally follows REST principles.
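  • For illustration purposes only, the sketch below exercises such a uniform interface against a hypothetical speech resource URI using the java.net.http client of modern Java; the path, the payloads, and the existence of a subordinate media resource are assumptions and not part of the disclosure.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class UniformInterfaceSketch {
        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            // Hypothetical URI-addressable speech resource; every RESTful target
            // is manipulated with the same small command set.
            URI resource = URI.create("http://example.com/voice/resources/tts");

            // GET: read the current representation (e.g., capabilities).
            HttpRequest get = HttpRequest.newBuilder(resource).GET().build();
            System.out.println(client.send(get, HttpResponse.BodyHandlers.ofString()).body());

            // PUT: replace the representation with a client-supplied one.
            client.send(HttpRequest.newBuilder(resource)
                    .PUT(HttpRequest.BodyPublishers.ofString("<tts language='en-US'/>"))
                    .build(), HttpResponse.BodyHandlers.ofString());

            // POST: create a subordinate resource (e.g., a synthesis request).
            client.send(HttpRequest.newBuilder(URI.create(resource + "/media"))
                    .POST(HttpRequest.BodyPublishers.ofString("Hello world"))
                    .build(), HttpResponse.BodyHandlers.ofString());

            // DELETE: remove the resource.
            client.send(HttpRequest.newBuilder(resource).DELETE().build(),
                    HttpResponse.BodyHandlers.ofString());
        }
    }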
  • What is needed to bridge the gap between speech processing resources and conventional Web 2.0 applications is a new paradigm for interfacing with speech processing resources, which makes speech processing resources more available to end-users. In this contemplated paradigm, end-users would optimally be able to cooperatively and dynamically develop speech-enabled solutions, which the end-users would then be able to integrate into Web 2.0 content. Thus, a more robust Web 2.0 environment that incorporates speech processing technologies will be allowed to evolve. This is in stark contrast to a conventional paradigm for interfacing with speech processing resources, which is decisively non-RESTful in nature.
  • SUMMARY OF THE INVENTION
  • The present invention discloses a RESTful speech processing system that uses Web 2.0 concepts for interfacing with server-side speech resources. The RESTful speech processing system can be used to add customizable speech processing capabilities to Web 2.0 instances, such as WIKIs, BLOGs, social networking sites, FOLKSONOMIEs, MASHUPs, and the like. The invention can access speech-enabled applications via introspection documents. Each speech-enabled application can contain a collection of entries and resources. The entries can include Web 2.0 entries, such as WIKI entries and the resources can include speech resources, such as speech recognition, speech synthesis, speech identification, and voice interpreter resources. Each entry and resource can be further decomposed into sub-components specified at a lower granularity level. Each application resource/entry can be introspected, customized, replaced, added, re-ordered, and/or removed by end users.
  • The present invention can be implemented in accordance with numerous aspects consistent with the material presented herein. For example, one aspect of the present invention can include a speech processing system that includes a client, a speech for Web 2.0 system, and a speech processing system. The client can access a speech-enabled application using at least one Web 2.0 communication protocol. For example, a standard browser of the client can use a HyperText Transfer Protocol (HTTP) to communicate with the speech-enabled application executing on the speech for Web 2.0 system. The speech for Web 2.0 system can access a data store within which user specific speech parameters are included, wherein a user of the client is able to configure the specific speech parameters of the data store. For example, a user can configure which speech resources are available (e.g., TTS, ASR, SIV, VoiceXML interpreter, and the like), resource characteristics (language, grammar, voice gender, speaking rate, and the like), delivery characteristics (real-time or not, synchronous or not, delivery protocol, delivery codec, delivery fidelity, and the like), and other such characteristics. Suitable ones of these speech parameters are utilized whenever the user interacts with the Web 2.0 system. The speech processing system can include one or more speech processing engines. The speech processing system can interact with the speech for Web 2.0 system to handle speech processing tasks associated with the speech-enabled application.
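  • By way of a hedged illustration, user specific speech parameters of this kind could be captured in a small document such as the following sketch; all element and attribute names are assumptions, as no concrete schema is defined by this disclosure.

    public class UserSpeechProfileSketch {
        // Hypothetical per-user speech parameters held in the data store of the
        // speech for Web 2.0 system; names and values are illustrative only.
        static final String PROFILE = """
                <speech-profile user="user-110">
                  <resources tts="enabled" asr="enabled" siv="disabled"
                             voicexml-interpreter="enabled"/>
                  <tts language="en-US" voice-gender="female" speaking-rate="medium"/>
                  <asr language="en-US" grammar="builtin:dictation"/>
                  <delivery realtime="true" synchronous="false"
                            protocol="http" codec="G.711" fidelity="8kHz"/>
                </speech-profile>
                """;

        public static void main(String[] args) {
            // Such a profile would be applied whenever the user interacts with
            // the Web 2.0 system, per the summary above.
            System.out.println(PROFILE);
        }
    }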
  • Another aspect of the present invention can include a system for using Web 2.0 as an interface to speech engines. The system can include a Web 2.0 server and a server-side speech processing system. The Web 2.0 server can serve at least one speech-enabled application to at least one remotely located client. The server-side speech processing system can handle speech processing operations for the speech-enabled applications. Communications with the server-side speech processing system can occur via a set of RESTful commands, such as GET, PUT, POST, and DELETE.
  • Still another aspect of the present invention can include a speech for Web 2.0 system that includes a Web 2.0 server. The Web 2.0 server can serve at least one speech-enabled application to remotely located clients. The speech-enabled application can include an introspection document, a collection of entries, and a collection of resources. At least one of the resources can be a speech resource associated with a speech engine, which adds a speech processing capability to the speech-enabled application.
  • It should be noted that various aspects of the invention can be implemented as a program for controlling computing equipment to implement the functions described herein, or a program for enabling computing equipment to perform processes corresponding to the steps disclosed herein. This program may be provided by storing the program in a magnetic disk, an optical disk, a semiconductor memory, or any other recording medium. The program can also be provided as a digitally encoded signal conveyed via a carrier wave. The described program can be a single program or can be implemented as multiple subprograms, each of which interact within a single computing device or interact in a distributed fashion across a network space.
  • It should also be noted that the methods detailed herein can also be methods performed at least in part by a service agent and/or a machine manipulated by a service agent in response to a service request.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • There are shown in the drawings, embodiments which are presently preferred, it being understood, however, that the invention is not limited to the precise arrangements and instrumentalities shown.
  • FIG. 1 is a schematic diagram of a system that utilizes Web 2.0 concepts for speech processing operations in accordance with an embodiment of the inventive arrangements disclosed herein.
  • FIG. 2 is a schematic diagram of a system for a Web 2.0 for voice system in accordance with an embodiment of the inventive arrangements disclosed herein.
  • FIG. 3 is a schematic diagram showing a WIKI server adapted for communications with a Web 2.0 for voice system in accordance with an embodiment of the inventive arrangements disclosed herein.
  • DETAILED DESCRIPTION OF THE INVENTION
  • FIG. 1 is a schematic diagram of a system 100 that utilizes Web 2.0 concepts for speech processing operations in accordance with an embodiment of the inventive arrangements disclosed herein. In system 100, a user 110 can use an interface 114 of client 112 to communicate with the speech for Web 2.0 system 120, which can include a Web 2.0 server 122 and/or a RESTful server 130. When the client 112 is a basic computing device (e.g., a telephone), a middleware server 116 can provide an interface 118 to system 120. Interface 114 and/or 118 can be a Web or voice browser, which communicates directly with system 120 using Web 2.0 conventions. Applications 126, which the client 112 accesses, can be voice-enabled applications stored in data store 124. A type of browser (e.g., interface 114 and/or 118) used to access the applications 126 can be transparent to the system 120, or can be transparent at least to RESTful server 130 of system 120.
  • The RESTful server 130 can provide speech processing operations for applications 126 by interfacing with speech processing system 150. Communications between the Web 2.0 server 122 and the RESTful server 130 can be REST based communications, such as those conducted using the ATOM PUBLISHING PROTOCOL (APP). In one embodiment, servers 122 and 130 can be functionally integrated into a single server of speech for Web 2.0 system 120.
  • The RESTful server 130 can utilize a set of basic commands enabling the command engine 132 to conduct speech processing operations. The commands can be REST commands that include an HTTP GET, an HTTP POST, an HTTP PUT, and an HTTP DELETE command. The RESTful server 130 can also include an introspection/discovery engine 134 and/or a media engine 136 as well as data store 138.
  • Data store 138 can include a set of documents 140, such as introspection documents 142, entry collection documents 144, and resource collection documents 146. The documents 140 together can link the RESTful server 130 to speech processing engines 156 of speech processing server 150 and can control behavior of speech processing server 150. The documents 140 and resulting behavior of the speech processing server 150 can be configured by user 110 in a user-specific manner. That is, different users 110 can inject their own voice characteristics, markup, behavior, and/or other features, which the speech processing system 150 utilizes.
  • The Web 2.0 system 120 can be communicatively linked to one or more enterprise servers 158 having an associated data store 160. Thus, the Web 2.0 system 120 can be a communication intermediary which provides user 110 with access to information and services of the enterprise server and data store 160.
  • Web 2.0 system 120 can further be communicatively linked to one or more additional RESTful servers 162, each associated with a data store 164, within which a set of documents, approximately equivalent to documents 140, are stored. Communications between Web 2.0 system 120 and speech processing system 150 or RESTful server 162 can be based on a RESTful protocol, such as APP.
  • It should be appreciated that RESTful servers 130 and 162 are able to operate in a stateless fashion which permits RESTful server 162 to seamlessly replace functionality of server 130. That is, state information does not have to be transferred when control is transferred from one server 130 to another 162. Thus, system 100 provides a highly scalable solution (i.e., when under a heavy load, server 130 can transfer load to server 162) and can provide fault tolerance and recovery capabilities (i.e., when server 130 experiences runtime problems, a different operational server 162 can immediately perform operations previously handled by server 130).
  • Another point about system 100 that should be emphasized is that client 112 is able to interact with the speech-enabled application 126 using Web 2.0 communication protocols only. No special client-side speech interface is required. At the same time, the user 110 is able to customize/personalize/configure speech processing behavior at low-levels.
  • As used herein, Web 2.0 is a concept that refers to a cooperative Web in which end-users 110 add value by providing content, as opposed to Web systems that unidirectionally provide information from an information provider to an information consumer. In other words, Web 2.0 refers to a readable, writable, and updateable Web. While a myriad of types of Web 2.0 instances exist, some currently popular ones include WIKIs, BLOGS, MASHUPs, FOLKSONOMIEs, social networking sites, and the like.
  • REST refers to a Representational State Transfer architecture. A REST approach focuses on utilizing a constrained operation set, such as GET, PUT, POST, and DELETE, to act against a set of structured targets which can be URL addressable. A REST architecture is a client/server architecture which is stateless, cacheable, and layered by nature. REST replaces a paradigm of do-something with a make-something-so concept. That is, instead of attempting to execute a kind of state transition for a software object, the REST concept changes a state of a software object to a user designated state. A RESTful object (e.g., RESTful server 130, 162) is one which primarily conforms to REST concepts. A RESTful interface can be a simple interface that transmits domain-specific data using an HTTP based protocol without utilizing an additional messaging layer, such as SOAP, and without reliance on session tracking HTTP cookies.
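  • A minimal sketch of the make-something-so idea, assuming a hypothetical configuration URI and representation, is given below; the RPC-style contrast appears only in the comments.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class MakeSomethingSoSketch {
        public static void main(String[] args) throws Exception {
            // RPC style ("do-something"): the client must know a server-specific
            // operation and its signature, e.g. ttsStub.setVoice("en-US", FEMALE).
            // REST style ("make-something-so"): the client transfers the desired
            // state of the resource and the server makes it so.
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest put = HttpRequest.newBuilder(
                    URI.create("http://example.com/voice/resources/tts/configuration"))
                    .header("Content-Type", "application/xml")
                    .PUT(HttpRequest.BodyPublishers.ofString(
                            "<tts language='en-US' voice-gender='female'/>"))
                    .build();
            HttpResponse<String> response =
                    client.send(put, HttpResponse.BodyHandlers.ofString());
            // No SOAP envelope and no session cookie: each request is self-contained,
            // which is also what lets RESTful server 162 stand in for server 130.
            System.out.println("Desired state transferred, status " + response.statusCode());
        }
    }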
  • The client 112 can be any computing device capable of communicating with either the system 120 or middleware server 116. In one embodiment, client 112 can include a Web browser 114, which operates as an interface between the user 110 and the system 120. In another embodiment, the client 112 can be a voice communication device that communicates with the middleware server 116, which can include a voice browser 118. In these embodiments, specific instances of the client 112 can include a computer, a Web station, a media player, a telephone, a smart phone, and the like.
  • Web 2.0 server 120 can be a server that provides Web content to interface 114 and/or 118 and which permits a user 110 to provide additional Web content, which is made available to other users. The Web 2.0 server can be a WIKI server, a BLOG server, a social networking server, a MASHUP server, a FOLKSONOMY server, and the like. In one embodiment, the Web 2.0 server 120 can be a RESTful server, in which case functionality shown for server 130 can be incorporated within server 120. Alternatively, a transformer can be included in the Web 2.0 server, which converts content between a server-specific format (e.g., a WIKI format) and a RESTful format, such as a format adhering to an APP based protocol.
  • RESTful servers 130 and 162 can be servers adhering to REST concepts, which link the server 120 to speech processing server 150. In one embodiment, the RESTful server 130 can be an APP server. RESTful commands can be issued by command engine 132, which are received and processed by command interpreter 154. A media interface 136 of the RESTful server 130 can control caching, delivery, fidelity, and formatting of delivered media, which includes delivered speech. Delivery can be in accordance with a streaming protocol, a file based protocol, a real-time protocol, and the like.
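  • Purely as an illustrative sketch, a command interpreter such as element 154 might route the constrained command set onto engine operations as follows; every interface and method name here is an assumption rather than an API defined by this disclosure.

    public class CommandInterpreterSketch {

        /** Hypothetical view of a speech engine 156; real engine APIs are not specified here. */
        interface SpeechEngine {
            String describeCapabilities();            // served on GET
            byte[] createMedia(byte[] input);         // served on POST (TTS/ASR/SIV)
            void configure(String xmlConfiguration);  // served on PUT
            void uninstall();                         // served on DELETE
        }

        private final SpeechEngine engine;

        CommandInterpreterSketch(SpeechEngine engine) {
            this.engine = engine;
        }

        /** Routes an incoming RESTful command to the corresponding engine operation. */
        Object handle(String httpMethod, byte[] body) {
            switch (httpMethod) {
                case "GET":    return engine.describeCapabilities();
                case "POST":   return engine.createMedia(body);
                case "PUT":    engine.configure(new String(body)); return "updated";
                case "DELETE": engine.uninstall(); return "removed";
                default:       throw new IllegalArgumentException("Unsupported command: " + httpMethod);
            }
        }
    }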
  • Speech processing server 150 can be any networked server or speech processing system which is able to process speech requests using one or more speech engines 156. In one embodiment, the speech processing server 150 can be a turn-based and/or clustered system capable of handling multiple requests in real-time. For example, speech processing server 150 can be implemented as a WEBSPHERE VOICE SERVER or other such commercially available product. Management tasks of the server 150 can be handled by the management processor 152. The various speech engines 156 can include ASR, TTS, SIV, voice markup interpreters, and the like.
  • Data stores 124, 138, 160, and 164 can each be a physical or virtual storage space configured to store digital information. Data stores 124, 138, 160, and 164 can be physically implemented within any type of hardware including, but not limited to, a magnetic disk, an optical disk, a semiconductor memory, a digitally encoded plastic memory, a holographic memory, or any other recording medium. Each of the data stores 124, 138, 160, and 164 can be a stand-alone storage unit as well as a storage unit formed from a plurality of physical devices. Additionally, information can be stored within data stores 124, 138, 160, and 164 in a variety of manners. For example, information can be stored within a database structure or can be stored within one or more files of a file storage system, where each file may or may not be indexed for information searching purposes. Further, data stores 124, 138, 160, and 164 can utilize one or more encryption mechanisms to protect stored information from unauthorized access.
  • The components of system 100 can be communicatively linked to each other via a network (not shown). The network can include any hardware, software, and firmware necessary to convey data encoded within carrier waves. Data can be contained within analog or digital signals and conveyed through data or voice channels. The network can include local components and data pathways necessary for communications to be exchanged among computing device components and between integrated device components and peripheral devices. The network can also include network equipment, such as routers, data lines, hubs, and intermediary servers which together form a data network, such as the Internet. The network can also include circuit-based communication components and mobile communication components, such as telephony switches, modems, cellular communication towers, and the like. The network can include line based and/or wireless communication pathways.
  • FIG. 2 is a schematic diagram of a system 200 for a Web 2.0 for voice system 230 in accordance with an embodiment of the inventive arrangements disclosed herein. System 200 can be an alternative representation and/or an embodiment for the system 100 of FIG. 1 or for a system that provides approximately equivalent functionality as system 100 utilizing Web 2.0 concepts to provide speech processing capabilities.
  • In system 200, Web 2.0 clients 240 can communicate with Web 2.0 servers 210-214 utilizing a REST/ATOM 250 protocol. The Web 2.0 servers 210-214 can serve one or more speech-enabled applications 220-224, where speech resources are provided by a Web 2.0 for Voice system 230. One or more of the applications 220-224 can include AJAX 256 or other JavaScript code. In one embodiment, the AJAX 256 code can be automatically converted from WIKI or other syntax by a transformer of a server 210-214.
  • Communications between the Web 2.0 servers 210-214 and system 230 can be in accordance with REST/ATOM 256 protocols. Each speech-enabled application 220-224 can be associated with an ATOM container 231, which specifies Web 2.0 items 232, resources 233, and media 234. One or more resource 233 can correspond to a speech engine 238.
  • The Web 2.0 clients 240 can be any client capable of interfacing with a Web 2.0 server 210-214. For example, the clients 240 can include a Web or voice browser 241 as well as any other type of interface 244, which executes upon a computing device. The computing device can include a mobile telephone 242, a mobile computer 243, a laptop, a media player, a desktop computer, a two-way radio, a line-based phone, and the like. Unlike conventional speech clients, the clients 240 need not have a speech-specific interface and instead only require a standard Web 2.0 interface. That is, there are no assumptions regarding the client 240 other than an ability to communicate with a Web 2.0 server 210-214 using Web 2.0 conventions.
  • The Web 2.0 servers 210-214 can be any server that provides Web 2.0 content to clients 240 and that provides speech processing capabilities through the Web 2.0 for voice system 230. The Web 2.0 servers can include a WIKI server 210, a BLOG server 212, a MASHUP server, a FOLKSONOMY server, a social networking server, and any other Web 2.0 server 214.
  • The Web 2.0 for voice system 230 can utilize Web 2.0 concepts to provide speech capabilities. A server-side interface is established between the voice system 230 and a set of Web 2.0 servers 210-214. Available speech resources can be introspected and discovered via introspection documents, which are one of the Web 2.0 items 232. Introspection can be in accordance with the APP specification or a similar protocol. The ability for dynamic configuration and installation is exposed to the servers 210-214 via the introspection document.
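  • By way of a non-limiting illustration, the following sketch shows how a Web 2.0 server might retrieve an APP-style introspection (service) document from the Web 2.0 for voice system 230 and enumerate the collections it advertises. The service URL and the presence of such a document at that path are assumptions for illustration only; the namespaces shown are those defined by the APP and ATOM specifications.

```python
# Minimal sketch (assumed endpoint): discover speech-related collections from an
# APP introspection/service document exposed by the Web 2.0 for voice system.
import requests
import xml.etree.ElementTree as ET

APP_NS = "{http://www.w3.org/2007/app}"    # Atom Publishing Protocol namespace
ATOM_NS = "{http://www.w3.org/2005/Atom}"  # Atom Syndication Format namespace

def discover_collections(service_url):
    """Fetch a service document and return (title, href) pairs for its collections."""
    root = ET.fromstring(requests.get(service_url).content)
    collections = []
    for collection in root.iter(APP_NS + "collection"):
        title = collection.findtext(ATOM_NS + "title", default="")
        collections.append((title, collection.get("href")))
    return collections

# Hypothetical usage:
# discover_collections("http://voice.example.com/service")
```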
  • That is, access to Web 2.0 for voice system 230 can be through a Web 2.0 server that lets users (e.g., clients 240) provide their own customizations/personalizations. Appreciably, use of the APP 256 opens up the application interface to speech resources using Web 2.0, JAVA 2 ENTERPRISE EDITION (J2EE), WEBSPHERE APPLICATION SERVER (WAS), and other conventions, rather than restricting it to protocols such as media resource control protocol (MRCP), real time streaming protocol (RTSP), or real-time transport protocol (RTP).
  • A constrained set of RESTful commands can be used to interface with the Web 2.0 for voice system 230. RESTful commands can include a GET command, a POST command, a PUT command, and a DELETE command, each of which is able to be implemented as an HTTP command. As applied to speech, GET (e.g., HTTP GET) can return capabilities and elements that are modifiable. The GET command can also be used for submitting simple speech queries and for receiving query results.
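  • The following is a minimal sketch of the GET usage described above, assuming a hypothetical base URI and resource paths; the response format (e.g., an ATOM entry) is left opaque because the disclosure does not mandate one.

```python
# Minimal sketch (assumed endpoints): GET for capability introspection and for a
# simple speech query. The "requests" library is used for brevity.
import requests

BASE = "http://voice.example.com/resources"  # hypothetical base URI

# Return the modifiable capabilities/elements of a TTS resource.
capabilities = requests.get(f"{BASE}/tts").text

# Submit a simple speech query (here, which languages an ASR resource supports).
languages = requests.get(f"{BASE}/asr", params={"query": "supported-languages"}).text
```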
  • The POST command can create media-related resources using speech engines 238. For example, the POST command can create an audio “file” from input text using a text-to-speech (TTS) resource 233 which is linked to a TTS engine 238. The POST command can create a text representation given an audio input, using an automatic speech recognition (ASR) resource 233 which is linked to an ASR engine 238. The POST command can create a score given an audio input, using a Speaker Identification and Verification (SIV) resource which is linked to an SIV engine 238. Any type of speech processing resource can be similarly accessed using the POST command.
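  • A minimal sketch of the POST usage follows, assuming hypothetical resource paths, content types, and a Location header pointing at the created media; only the general POST behavior (input in, created result out) comes from the description above.

```python
# Minimal sketch (assumed endpoints): POST text to a TTS resource to create audio,
# and POST audio to an ASR resource to create a text representation.
import requests

BASE = "http://voice.example.com/resources"  # hypothetical base URI

# Text-to-speech: create an audio "file" from input text.
tts_resp = requests.post(f"{BASE}/tts/media",
                         data="Welcome to the speech-enabled WIKI".encode("utf-8"),
                         headers={"Content-Type": "text/plain"})
audio_uri = tts_resp.headers.get("Location")  # URI of the created audio resource

# Speech recognition: create a text representation from audio input.
with open("utterance.wav", "rb") as audio:
    asr_resp = requests.post(f"{BASE}/asr/media",
                             data=audio,
                             headers={"Content-Type": "audio/x-wav"})
transcript = asr_resp.text
```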
  • The PUT command can be used to update the configuration of speech resources (e.g., default voice-name, ASR or TTS language, TTS voice, media destination, media delivery type, etc.). The PUT command can also be used to add a resource or capability to a Web 2.0 server 210-214 (e.g., installing an SIV component). The DELETE command can remove a speech resource from a configuration. For example, the DELETE command can be used to uninstall a previously installed speech component.
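  • A minimal sketch of PUT and DELETE follows; the XML configuration document and resource paths are illustrative assumptions built around the parameters named above (language, voice-name, media delivery type).

```python
# Minimal sketch (assumed endpoints and payload format): PUT updates a speech
# resource configuration; DELETE uninstalls a previously installed resource.
import requests

BASE = "http://voice.example.com/resources"  # hypothetical base URI

config = """<configuration>
  <language>en-US</language>
  <voice-name>default-female</voice-name>
  <media-delivery>synchronous</media-delivery>
</configuration>"""

requests.put(f"{BASE}/tts/config", data=config,
             headers={"Content-Type": "application/xml"})

# Remove (uninstall) a previously installed SIV component.
requests.delete(f"{BASE}/siv")
```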
  • The Web 2.0 for Voice system 230 is an extremely flexible solution that permits users (of clients 240) to customize numerous speech processing elements. Customizable speech processing elements can include speech resource availability, request characteristics, result characteristics, media characteristics, and the like. Speech resource availability can indicate whether a specific type of resource (e.g., ASR, TTS, SIV, VoiceXML interpreter) is available. Request characteristics can refer to characteristics such as language, grammar, voice attributes, gender, rate of speech, and the like. Result characteristics can specify whether results are to be delivered synchronously or asynchronously. Result characteristics can alternatively indicate whether a callback listener is to be supplied with results. Media characteristics can include input and output characteristics, which can vary from a URI reference to an RTP stream. The media characteristics can specify a codec (e.g., G711), a sample rate (e.g., 8 kHz to 22 kHz), and the like. In one configuration, the speech engines 238 can be provided from a J2EE environment 236, such as a WAS environment. This environment 236 can conform to a J2EE Connector Architecture (JCA) 237.
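  • To make the element classes above concrete, the following sketch shows one possible shape for a user-level customization profile; every field name and value here is an illustrative assumption rather than a format defined by this disclosure.

```python
# Minimal sketch (assumed field names): a customization profile spanning resource
# availability, request characteristics, result characteristics, and media
# characteristics for one user.
customization_profile = {
    "resources": {"asr": True, "tts": True, "siv": False, "vxml_interpreter": True},
    "request": {"language": "en-US", "gender": "female", "rate_of_speech": "medium"},
    "result": {"delivery": "asynchronous",
               "callback_listener": "http://client.example.com/listener"},
    "media": {"codec": "G711", "sample_rate_hz": 8000,
              "output": "rtp://media.example.com:5004"},
}
```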
  • In one embodiment, a set of additional facades 260 can be utilized on top of Web 2.0 protocols to provide additional interface and protocol 262 options (e.g., MRCP, RTSP, RTP, Session Initiation Protocol (SIP), etc.) to the Web 2.0 for voice system 230. Use of facades 260 can enable legacy access/use of the Web 2.0 for voice system 230. The facades 260 can be designed to segment the protocol 262 from underlying details so that characteristics of the facade do not bleed through to speech implementation details. Functions such as the WAS 6.1 channel framework or a JCA container can be used to plug in a protocol that is not native to the J2EE environment 236. The media component 234 of the container 231 can be used to handle media storage, delivery, and format conversions as necessary. Facades 260 can be used for asynchronous or synchronous protocols 262.
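  • The facade concept can be illustrated with a minimal sketch in which a legacy-style call is serviced internally through the RESTful interface; the class, method, and endpoint names are assumptions, and only the idea of hiding protocol details behind a facade comes from the description above.

```python
# Minimal sketch (assumed names): a facade that offers a legacy-style speak() call
# while internally using the RESTful POST interface of the voice system.
import requests

class LegacySpeechFacade:
    """Hide RESTful details behind a legacy-style synthesis entry point."""

    def __init__(self, base_uri):
        self.base_uri = base_uri  # e.g., "http://voice.example.com/resources"

    def speak(self, text):
        """Request synthesis of the given text; return the URI of the created audio."""
        response = requests.post(f"{self.base_uri}/tts/media",
                                 data=text.encode("utf-8"),
                                 headers={"Content-Type": "text/plain"})
        return response.headers.get("Location")
```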
  • FIG. 3 is a schematic diagram showing a WIKI server 330 adapted for communications with a Web 2.0 for voice system 310 in accordance with an embodiment of the inventive arrangements disclosed herein. Although a WIKI server 330 is illustrated, server 330 can be any Web 2.0 server (e.g., server 120 of system 100 or server 210-214 of system 200) including, but not limited to, a BLOG server, a MASHUP server, a FOLKSONOMY server, a social networking server, and the like.
  • In the system 300, a browser 320 can communicate with Web 2.0 server 330 via a Representational State Transfer (REST)/ATOM 304 based protocol. The Web 2.0 server 330 can communicate with a speech for Web 2.0 system 310 via a REST/ATOM 302 based protocol. Protocols 302, 304 can include HTTP and similar protocols that are RESTful by nature, as well as the Atom Publishing Protocol (APP) or another protocol that is specifically designed to conform to REST principles.
  • The Web 2.0 server 330 can include a data store 332 in which applications 334, which can be speech-enabled, are stored. In one embodiment, the applications 334 can be written in a WIKI or other Web 2.0 syntax and can be stored in an APP format.
  • The contents of the application 334 can be accessed and modified using editor 350. The editor 350 can be a standard WIKI or other Web 2.0 editor having a voice plug-in or extensions 352. In one implementation, user-specific modifications made to the speech-enabled application 334 via the editor 350 can be stored in a customization data store as a customization profile and/or a state definition. The customization profile and state definition can contain customization settings that can override entries contained within the original application 334. Customizations can be related to a particular user or set of users.
  • The transformer 340 can convert WIKI or other Web 2.0 syntax into standard markup for browsers. In one embodiment, the transformer 340 can be an extension of a conventional transformer that supports HTML and XML. The extended transformer 340 can be enhanced to handle JAVASCRIPT, such as AJAX. For example, resource links of application 334 can be converted into AJAX functions by the transformer 340 having an AJAX plug-in 342. The transformer 340 can also include a VoiceXML plug-in 344, which generates VoiceXML markup for voice-only clients.
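  • For illustration, a minimal sketch of such a conversion step follows; the WIKI link form and the emitted markup/JavaScript are invented for the example, since the disclosure only states that resource links can be converted into AJAX functions by the transformer.

```python
# Minimal sketch (assumed syntax): rewrite speech-resource links written as
# [[speech:<resource>|<label>]] into buttons that invoke an AJAX helper function.
import re

LINK_PATTERN = re.compile(r"\[\[speech:(\w+)\|([^\]]+)\]\]")

def to_ajax(wiki_text):
    """Replace speech-resource WIKI links with markup that calls an AJAX helper."""
    def replace(match):
        resource, label = match.group(1), match.group(2)
        return (f'<button onclick="invokeSpeechResource(\'{resource}\')">'
                f'{label}</button>')
    return LINK_PATTERN.sub(replace, wiki_text)

# to_ajax("Welcome. [[speech:tts|Read this page aloud]]")
```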
  • The present invention may be realized in hardware, software, or a combination of hardware and software. The present invention may be realized in a centralized fashion in one computer system or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods described herein is suited. A typical combination of hardware and software may be a general purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.
  • The present invention also may be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which when loaded in a computer system is able to carry out these methods. Computer program in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code or notation; b) reproduction in a different material form.
  • This invention may be embodied in other forms without departing from the spirit or essential attributes thereof. Accordingly, reference should be made to the following claims, rather than to the foregoing specification, as indicating the scope of the invention.

Claims (20)

1. A speech processing system comprising:
a client configured to access a speech-enabled application using at least one Web 2.0 communication protocol;
a speech for Web 2.0 system within which the speech-enabled application executes, said speech for Web 2.0 system accessing a data store within which user specific speech parameters are included, wherein a user of the client is able to configure the specific speech parameters of the data store associated with the user, and wherein the speech-enabled application executes in accordance with the specific speech parameters corresponding to the user of the client; and
a speech processing system comprising a plurality of speech processing engines, wherein the speech processing system interacts with the speech for Web 2.0 system to handle speech processing tasks associated with the speech-enabled application.
2. The system of claim 1, wherein the specific speech parameters specify at least one of speech resource availability, speech resource characteristics, and speech delivery characteristics.
3. The system of claim 1, wherein the Web 2.0 communication protocol is a Hypertext Transfer Protocol (HTTP) based protocol, and wherein the speech processing system interfaces with the speech for Web 2.0 system using an Atom Publication Protocol (APP) based protocol.
4. The system of claim 1, wherein interactions between the speech processing system and the speech for Web 2.0 system occur through one of four RESTful commands, said RESTful commands comprising a GET command, a POST command, a PUT command, and a DELETE command.
5. The system of claim 1, wherein said speech-enabled application comprises at least one introspection document, which is used to enable the client to configure the specific speech parameters.
6. The system of claim 1, wherein the speech-enabled application comprises two collections, one of these collections comprising at least one entry, each entry defining content that is presented to the client, the other one of the collections comprising a collection of resources that include speech processing resources, wherein a one-to-one relationship exists between the speech processing resources of the collection of resources and a type of speech processing engine of the speech processing system to which the speech processing resource corresponds, said types of speech processing engines including at least two of a recognition engine, a text-to-speech engine, a speaker identification and verification (SIV) engine, and a VoiceXML interpreter.
7. The system of claim 1, wherein the speech-enabled application is at least one of a WIKI, a BLOG, a MASHUP, a social networking application, and a FOLKSONOMY.
8. The system of claim 1, wherein the client comprises a standard Web browser through which the client interfaces with the speech for Web 2.0 system, wherein the Web 2.0 communication protocol is directly supported by the standard Web browser.
9. The system of claim 1, further comprising:
a middleware server comprising a standard voice browser, wherein said client interacts with the middleware server over a real-time voice communication channel, wherein the standard voice browser interfaces with the speech for Web 2.0 system, wherein the Web 2.0 communication protocol is directly supported by the standard voice browser.
10. The system of claim 1, further comprising:
an enterprise server comprising enterprise content, wherein the enterprise server interacts with the speech for Web 2.0 system to permit the client to access the enterprise content by interacting with the speech-enabled application.
11. A system for using Web 2.0 as an interface to speech engines comprising:
a Web 2.0 server configured to serve at least one speech-enabled application to at least one remotely located client; and
a server-side speech processing system configured to handle speech processing operations for the at least one speech-enabled application, wherein communications with the server-side speech processing system occur via a set of RESTful commands.
12. The system of claim 11, wherein the Web 2.0 server utilizes at least one introspection document associated with the speech-enabled application for introspection and discovery of speech resources and to configure the speech resources.
13. The system of claim 12, wherein the introspection document and the RESTful commands conform to an Atom Publication Protocol (APP) based specification.
14. The system of claim 11, wherein the set of RESTful commands comprise an HTTP GET command, an HTTP POST command, an HTTP PUT command, and an HTTP DELETE command.
15. The system of claim 14, wherein said GET command selectively returns modifiable speech processing capabilities and elements, said GET command also selectively returning speech query results, wherein said POST command selectively provides input to a speech engine and returns output from the speech engine, said output being a processed result of the input, wherein said PUT command selectively updates speech resources for a configuration, said PUT command also selectively installing a speech resource for a configuration, and wherein said DELETE command selectively removes a speech resource from a configuration.
16. The system of claim 11, wherein the set of RESTful commands consist of an HTTP GET command, an HTTP POST command, an HTTP PUT command, and an HTTP DELETE command.
17. A speech for Web 2.0 system comprising:
a Web 2.0 server configured to serve at least one speech-enabled application to remotely located clients, said speech-enabled application comprising an introspection document, a collection of entries, and a collection of resources, wherein at least one of the resources is a speech resource associated with a speech engine, which adds a speech processing capability to the speech-enabled application.
18. The system of claim 17, wherein the speech-enabled application conforms to an Atom Publication Protocol (APP) based specification.
19. The system of claim 17, wherein the speech engine is a turn-based speech processing engine executing within a JAVA 2 ENTERPRISE EDITION (J2EE) middleware environment.
20. The system of claim 17, wherein the Web 2.0 server is configured so that end-users are able to introspect, customize, replace, add, re-order, and remove entries and resources in the collections.
US11/765,900 2007-06-20 2007-06-20 Speech processing system based upon a representational state transfer (rest) architecture that uses web 2.0 concepts for speech resource interfaces Abandoned US20080319757A1 (en)

Priority Applications (9)

Application Number Priority Date Filing Date Title
US11/766,002 US7890333B2 (en) 2007-06-20 2007-06-20 Using a WIKI editor to create speech-enabled applications
US11/765,900 US20080319757A1 (en) 2007-06-20 2007-06-20 Speech processing system based upon a representational state transfer (rest) architecture that uses web 2.0 concepts for speech resource interfaces
US11/766,291 US8074202B2 (en) 2007-06-20 2007-06-21 WIKI application development tool that uses specialized blogs to publish WIKI development content in an organized/searchable fashion
US11/766,255 US9311420B2 (en) 2007-06-20 2007-06-21 Customizing web 2.0 application behavior based on relationships between a content creator and a content requester
US11/766,335 US7996229B2 (en) 2007-06-20 2007-06-21 System and method for creating and posting voice-based web 2.0 entries via a telephone interface
US11/766,210 US8032379B2 (en) 2007-06-20 2007-06-21 Creating and editing web 2.0 entries including voice enabled ones using a voice only interface
US11/766,157 US8041573B2 (en) 2007-06-20 2007-06-21 Integrating a voice browser into a Web 2.0 environment
US11/766,139 US7631104B2 (en) 2007-06-20 2007-06-21 Providing user customization of web 2.0 applications
PCT/EP2008/057671 WO2008155343A2 (en) 2007-06-20 2008-06-18 Speech processing system based upon a representational state transfer (rest) architecture that uses web 2.0 concepts for speech resource interfaces

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US11/765,900 US20080319757A1 (en) 2007-06-20 2007-06-20 Speech processing system based upon a representational state transfer (rest) architecture that uses web 2.0 concepts for speech resource interfaces

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US11/765,928 Continuation-In-Part US8041572B2 (en) 2007-06-20 2007-06-20 Speech processing method based upon a representational state transfer (REST) architecture that uses web 2.0 concepts for speech resource interfaces

Related Child Applications (8)

Application Number Title Priority Date Filing Date
US11/766,002 Continuation-In-Part US7890333B2 (en) 2007-06-20 2007-06-20 Using a WIKI editor to create speech-enabled applications
US11/766,291 Continuation-In-Part US8074202B2 (en) 2007-06-20 2007-06-21 WIKI application development tool that uses specialized blogs to publish WIKI development content in an organized/searchable fashion
US11/766,291 Continuation US8074202B2 (en) 2007-06-20 2007-06-21 WIKI application development tool that uses specialized blogs to publish WIKI development content in an organized/searchable fashion
US11/766,335 Continuation-In-Part US7996229B2 (en) 2007-06-20 2007-06-21 System and method for creating and posting voice-based web 2.0 entries via a telephone interface
US11/766,157 Continuation-In-Part US8041573B2 (en) 2007-06-20 2007-06-21 Integrating a voice browser into a Web 2.0 environment
US11/766,255 Continuation-In-Part US9311420B2 (en) 2007-06-20 2007-06-21 Customizing web 2.0 application behavior based on relationships between a content creator and a content requester
US11/766,210 Continuation-In-Part US8032379B2 (en) 2007-06-20 2007-06-21 Creating and editing web 2.0 entries including voice enabled ones using a voice only interface
US11/766,139 Continuation-In-Part US7631104B2 (en) 2007-06-20 2007-06-21 Providing user customization of web 2.0 applications

Publications (1)

Publication Number Publication Date
US20080319757A1 true US20080319757A1 (en) 2008-12-25

Family

ID=40039945

Family Applications (2)

Application Number Title Priority Date Filing Date
US11/765,900 Abandoned US20080319757A1 (en) 2007-06-20 2007-06-20 Speech processing system based upon a representational state transfer (rest) architecture that uses web 2.0 concepts for speech resource interfaces
US11/766,291 Expired - Fee Related US8074202B2 (en) 2007-06-20 2007-06-21 WIKI application development tool that uses specialized blogs to publish WIKI development content in an organized/searchable fashion

Family Applications After (1)

Application Number Title Priority Date Filing Date
US11/766,291 Expired - Fee Related US8074202B2 (en) 2007-06-20 2007-06-21 WIKI application development tool that uses specialized blogs to publish WIKI development content in an organized/searchable fashion

Country Status (2)

Country Link
US (2) US20080319757A1 (en)
WO (1) WO2008155343A2 (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080320079A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Customizing web 2.0 application behavior based on relationships between a content creator and a content requester
US20080319761A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Speech processing method based upon a representational state transfer (rest) architecture that uses web 2.0 concepts for speech resource interfaces
US20080320443A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Wiki application development tool that uses specialized blogs to publish wiki development content in an organized/searchable fashion
US20080319762A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Using a wiki editor to create speech-enabled applications
US20080319759A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Integrating a voice browser into a web 2.0 environment
US20080319760A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Creating and editing web 2.0 entries including voice enabled ones using a voice only interface
US20080319758A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Speech-enabled application that uses web 2.0 concepts to interface with speech engines
US20080319742A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation System and method for posting to a blog or wiki using a telephone
US20090254346A1 (en) * 2008-04-07 2009-10-08 International Business Machines Corporation Automated voice enablement of a web page
US20090254348A1 (en) * 2008-04-07 2009-10-08 International Business Machines Corporation Free form input field support for automated voice enablement of a web page
US20100199260A1 (en) * 2009-02-02 2010-08-05 Duggal Dave M Resource processing using an intermediary for context-based customization of interaction deliverables
US20110022388A1 (en) * 2009-07-27 2011-01-27 Wu Sung Fong Solomon Method and system for speech recognition using social networks
US20110041171A1 (en) * 2009-08-11 2011-02-17 Lloyd Leon Burch Techniques for virtual representational state transfer (rest) interfaces
US20130124631A1 (en) * 2011-11-04 2013-05-16 Fidelus Technologies, Llc. Apparatus, system, and method for digital communications driven by behavior profiles of participants
US20130268483A1 (en) * 2012-04-06 2013-10-10 Sony Corporation Information processing apparatus, information processing method, and computer program
US9075616B2 (en) 2012-03-19 2015-07-07 Enterpriseweb Llc Declarative software application meta-model and system for self-modification
US10129720B1 (en) * 2011-12-30 2018-11-13 Genesys Telecommunications Laboratories, Inc. Conversation assistant
CN110619101A (en) * 2018-12-29 2019-12-27 北京时光荏苒科技有限公司 Method and apparatus for processing information
US11641397B2 (en) * 2014-05-11 2023-05-02 Microsoft Technology Licensing, Llc File service using a shared file access-rest interface

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7904559B2 (en) * 2007-10-19 2011-03-08 International Business Machines Corporation HTTP-based publish-subscribe service
US20090300485A1 (en) * 2008-05-27 2009-12-03 Microsoft Corporation Techniques for automatically generating wiki content
US8285787B2 (en) * 2008-11-26 2012-10-09 Red Hat, Inc. Systems and methods for managing a collaboration space having application hosting capabilities
US8805930B2 (en) 2009-02-24 2014-08-12 Red Hat, Inc. Managing application programming interfaces in a collaboration space
US9524345B1 (en) 2009-08-31 2016-12-20 Richard VanderDrift Enhancing content using linked context
US9639707B1 (en) 2010-01-14 2017-05-02 Richard W. VanderDrift Secure data storage and communication for network computing
US8826260B2 (en) * 2011-01-11 2014-09-02 Intuit Inc. Customization of mobile-application delivery
US9058401B2 (en) * 2011-08-16 2015-06-16 Facebook, Inc. Aggregating plug-in requests for improved client performance
KR101703168B1 (en) * 2011-12-27 2017-02-07 한국전자통신연구원 Apparatus and method based Wiki for providing an information by using a user relationship
US10026051B2 (en) 2014-09-29 2018-07-17 Hartford Fire Insurance Company System for accessing business metadata within a distributed network
US11003655B2 (en) 2016-09-22 2021-05-11 Hartford Fire Insurance Company System for uploading information into a metadata repository

Citations (62)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6269336B1 (en) * 1998-07-24 2001-07-31 Motorola, Inc. Voice browser for interactive services and methods thereof
US6314402B1 (en) * 1999-04-23 2001-11-06 Nuance Communications Method and apparatus for creating modifiable and combinable speech objects for acquiring information from a speaker in an interactive voice response system
US6324511B1 (en) * 1998-10-01 2001-11-27 Mindmaker, Inc. Method of and apparatus for multi-modal information presentation to computer users with dyslexia, reading disabilities or visual impairment
US20020010756A1 (en) * 2000-07-24 2002-01-24 Kazuho Oku System and method for providing contents on a network
US20020052747A1 (en) * 2000-08-21 2002-05-02 Sarukkai Ramesh R. Method and system of interpreting and presenting web content using a voice browser
US20020098864A1 (en) * 2001-01-25 2002-07-25 Manabu Mukai Mobile radio communication apparatus capable to plurality of radio communication systems
US6442577B1 (en) * 1998-11-03 2002-08-27 Front Porch, Inc. Method and apparatus for dynamically forming customized web pages for web sites
US6529871B1 (en) * 1997-06-11 2003-03-04 International Business Machines Corporation Apparatus and method for speaker verification/identification/classification employing non-acoustic and/or acoustic models and databases
US20030055884A1 (en) * 2001-07-03 2003-03-20 Yuen Michael S. Method for automated harvesting of data from a Web site using a voice portal system
US20030088421A1 (en) * 2001-06-25 2003-05-08 International Business Machines Corporation Universal IP-based and scalable architectures across conversational applications using web services for speech and audio processing resources
US20030139928A1 (en) * 2002-01-22 2003-07-24 Raven Technology, Inc. System and method for dynamically creating a voice portal in voice XML
US20030177010A1 (en) * 2002-03-11 2003-09-18 John Locke Voice enabled personalized documents
US6636831B1 (en) * 1999-04-09 2003-10-21 Inroad, Inc. System and process for voice-controlled information retrieval
US20040083133A1 (en) * 2001-06-14 2004-04-29 Nicholas Frank C. Method and system for providing network based target advertising and encapsulation
US6865599B2 (en) * 2001-09-04 2005-03-08 Chenglin Zhang Browser-to-browser, dom-based, peer-to-peer communication with delta synchronization
US20050132056A1 (en) * 2003-12-15 2005-06-16 International Business Machines Corporation Method, system, and apparatus for generating weblogs from interactive communication client software
US20060004700A1 (en) * 2004-06-30 2006-01-05 Hofmann Helmut A Methods and systems for providing validity logic
US20060015335A1 (en) * 2004-07-13 2006-01-19 Ravigopal Vennelakanti Framework to enable multimodal access to applications
US20060085741A1 (en) * 2004-10-20 2006-04-20 Viewfour, Inc. A Delaware Corporation Method and apparatus to view multiple web pages simultaneously from network based search
US7047196B2 (en) * 2000-06-08 2006-05-16 Agiletv Corporation System and method of voice recognition near a wireline node of a network supporting cable television and/or video delivery
US20060122836A1 (en) * 2004-12-08 2006-06-08 International Business Machines Corporation Dynamic switching between local and remote speech rendering
US20060195328A1 (en) * 2005-02-15 2006-08-31 International Business Machines Corporation Enhancing web experiences using behavioral biometric data
US7143148B1 (en) * 1996-05-01 2006-11-28 G&H Nevada-Tek Method and apparatus for accessing a wide area network
US20070078884A1 (en) * 2005-09-30 2007-04-05 Yahoo! Inc. Podcast search engine
US20070118484A1 (en) * 2005-11-22 2007-05-24 International Business Machines Corporation Conveying reliable identity in electronic collaboration
US20070185927A1 (en) * 2006-01-27 2007-08-09 International Business Machines Corporation System, method and computer program product for shared user tailoring of websites
US20070188657A1 (en) * 2006-02-15 2007-08-16 Basson Sara H Synchronizing method and system
US20070213980A1 (en) * 2000-06-12 2007-09-13 Danner Ryan A Apparatus and methods for providing network-based information suitable for audio output
US20080010387A1 (en) * 2006-07-07 2008-01-10 Bryce Allen Curtis Method for defining a Wiki page layout using a Wiki page
US20080010609A1 (en) * 2006-07-07 2008-01-10 Bryce Allen Curtis Method for extending the capabilities of a Wiki environment
US20080010341A1 (en) * 2006-07-07 2008-01-10 Bryce Allen Curtis Processing model of an application wiki
US20080033739A1 (en) * 2006-08-02 2008-02-07 Facebook, Inc. Systems and methods for dynamically generating segmented community flyers
US20080040661A1 (en) * 2006-07-07 2008-02-14 Bryce Allen Curtis Method for inheriting a Wiki page layout for a Wiki page
US7334050B2 (en) * 2000-06-07 2008-02-19 Nvidia International, Inc. Voice applications and voice-based interface
US20080046976A1 (en) * 2006-07-25 2008-02-21 Facebook, Inc. Systems and methods for dynamically generating a privacy summary
US20080086689A1 (en) * 2006-10-09 2008-04-10 Qmind, Inc. Multimedia content production, publication, and player apparatus, system and method
US20080177831A1 (en) * 2007-01-19 2008-07-24 Kat Digital Corp. Communitized media application and sharing apparatus
US20080240397A1 (en) * 2007-03-29 2008-10-02 Fatdoor, Inc. White page and yellow page directories in a geo-spatial environment
US20080242221A1 (en) * 2007-03-27 2008-10-02 Shapiro Andrew J Customized Content Delivery System and Method
US20080244020A1 (en) * 2007-03-28 2008-10-02 Michael R. Dolan System and method of user definition of and participation in communities and management of individual and community information and communication
US20080320168A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Providing user customization of web 2.0 applications
US20080319742A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation System and method for posting to a blog or wiki using a telephone
US20080319759A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Integrating a voice browser into a web 2.0 environment
US20080319758A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Speech-enabled application that uses web 2.0 concepts to interface with speech engines
US20080319760A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Creating and editing web 2.0 entries including voice enabled ones using a voice only interface
US20080320079A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Customizing web 2.0 application behavior based on relationships between a content creator and a content requester
US20080319762A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Using a wiki editor to create speech-enabled applications
US20080320443A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Wiki application development tool that uses specialized blogs to publish wiki development content in an organized/searchable fashion
US20080319761A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Speech processing method based upon a representational state transfer (rest) architecture that uses web 2.0 concepts for speech resource interfaces
US7581166B2 (en) * 2006-07-21 2009-08-25 At&T Intellectual Property Ii, L.P. System and method of collecting, correlating, and aggregating structured edited content and non-edited content
US20090252159A1 (en) * 2008-04-02 2009-10-08 Jeffrey Lawson System and method for processing telephony sessions
US7669123B2 (en) * 2006-08-11 2010-02-23 Facebook, Inc. Dynamically providing a news feed about a user of a social network
US7673017B2 (en) * 2005-09-06 2010-03-02 Interpolls Network Inc. Systems and methods for integrating XML syndication feeds into online advertisement
US20100100439A1 (en) * 2008-06-12 2010-04-22 Dawn Jutla Multi-platform system apparatus for interoperable, multimedia-accessible and convertible structured and unstructured wikis, wiki user networks, and other user-generated content repositories
US7725492B2 (en) * 2005-12-23 2010-05-25 Facebook, Inc. Managing information about relationships in a social network via a social timeline
US7788260B2 (en) * 2004-06-14 2010-08-31 Facebook, Inc. Ranking search results based on the frequency of clicks on the search results by members of a social network who are within a predetermined degree of separation
US20100241507A1 (en) * 2008-07-02 2010-09-23 Michael Joseph Quinn System and method for searching, advertising, producing and displaying geographic territory-specific content in inter-operable co-located user-interface components
US7809805B2 (en) * 2007-02-28 2010-10-05 Facebook, Inc. Systems and methods for automatically locating web-based social network members
US7827265B2 (en) * 2007-03-23 2010-11-02 Facebook, Inc. System and method for confirming an association in a web-based social network
US7827208B2 (en) * 2006-08-11 2010-11-02 Facebook, Inc. Generating a feed of stories personalized for members of a social network
US20110035687A1 (en) * 2009-08-10 2011-02-10 Rebelvox, Llc Browser enabled communication device for conducting conversations in either a real-time mode, a time-shifted mode, and with the ability to seamlessly shift the conversation between the two modes
US8145472B2 (en) * 2005-12-12 2012-03-27 John Shore Language translation using a hybrid network of human and machine translators

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6938526B2 (en) 2003-07-30 2005-09-06 Black & Decker Inc. Impact wrench having an improved anvil to square driver transition
US7433876B2 (en) * 2004-02-23 2008-10-07 Radar Networks, Inc. Semantic web portal and platform
US7584268B2 (en) * 2005-02-01 2009-09-01 Google Inc. Collaborative web page authoring
US7860946B1 (en) * 2007-05-01 2010-12-28 Disintegrated Communication Systems, Llc Systems, methods, and computer-readable media for searching and concomitantly interacting with multiple information content providers, other individuals, relevant communities of individuals, and information provided over a network

Patent Citations (73)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7143148B1 (en) * 1996-05-01 2006-11-28 G&H Nevada-Tek Method and apparatus for accessing a wide area network
US6529871B1 (en) * 1997-06-11 2003-03-04 International Business Machines Corporation Apparatus and method for speaker verification/identification/classification employing non-acoustic and/or acoustic models and databases
US6269336B1 (en) * 1998-07-24 2001-07-31 Motorola, Inc. Voice browser for interactive services and methods thereof
US6324511B1 (en) * 1998-10-01 2001-11-27 Mindmaker, Inc. Method of and apparatus for multi-modal information presentation to computer users with dyslexia, reading disabilities or visual impairment
US6442577B1 (en) * 1998-11-03 2002-08-27 Front Porch, Inc. Method and apparatus for dynamically forming customized web pages for web sites
US6636831B1 (en) * 1999-04-09 2003-10-21 Inroad, Inc. System and process for voice-controlled information retrieval
US6314402B1 (en) * 1999-04-23 2001-11-06 Nuance Communications Method and apparatus for creating modifiable and combinable speech objects for acquiring information from a speaker in an interactive voice response system
US7334050B2 (en) * 2000-06-07 2008-02-19 Nvidia International, Inc. Voice applications and voice-based interface
US7047196B2 (en) * 2000-06-08 2006-05-16 Agiletv Corporation System and method of voice recognition near a wireline node of a network supporting cable television and/or video delivery
US20070213980A1 (en) * 2000-06-12 2007-09-13 Danner Ryan A Apparatus and methods for providing network-based information suitable for audio output
US20020010756A1 (en) * 2000-07-24 2002-01-24 Kazuho Oku System and method for providing contents on a network
US20020052747A1 (en) * 2000-08-21 2002-05-02 Sarukkai Ramesh R. Method and system of interpreting and presenting web content using a voice browser
US20020098864A1 (en) * 2001-01-25 2002-07-25 Manabu Mukai Mobile radio communication apparatus capable to plurality of radio communication systems
US20040083133A1 (en) * 2001-06-14 2004-04-29 Nicholas Frank C. Method and system for providing network based target advertising and encapsulation
US20030088421A1 (en) * 2001-06-25 2003-05-08 International Business Machines Corporation Universal IP-based and scalable architectures across conversational applications using web services for speech and audio processing resources
US6801604B2 (en) * 2001-06-25 2004-10-05 International Business Machines Corporation Universal IP-based and scalable architectures across conversational applications using web services for speech and audio processing resources
US20030055884A1 (en) * 2001-07-03 2003-03-20 Yuen Michael S. Method for automated harvesting of data from a Web site using a voice portal system
US6865599B2 (en) * 2001-09-04 2005-03-08 Chenglin Zhang Browser-to-browser, dom-based, peer-to-peer communication with delta synchronization
US20030139928A1 (en) * 2002-01-22 2003-07-24 Raven Technology, Inc. System and method for dynamically creating a voice portal in voice XML
US20030177010A1 (en) * 2002-03-11 2003-09-18 John Locke Voice enabled personalized documents
US20050132056A1 (en) * 2003-12-15 2005-06-16 International Business Machines Corporation Method, system, and apparatus for generating weblogs from interactive communication client software
US7788260B2 (en) * 2004-06-14 2010-08-31 Facebook, Inc. Ranking search results based on the frequency of clicks on the search results by members of a social network who are within a predetermined degree of separation
US20060004700A1 (en) * 2004-06-30 2006-01-05 Hofmann Helmut A Methods and systems for providing validity logic
US20060015335A1 (en) * 2004-07-13 2006-01-19 Ravigopal Vennelakanti Framework to enable multimodal access to applications
US20060085741A1 (en) * 2004-10-20 2006-04-20 Viewfour, Inc. A Delaware Corporation Method and apparatus to view multiple web pages simultaneously from network based search
US20060122836A1 (en) * 2004-12-08 2006-06-08 International Business Machines Corporation Dynamic switching between local and remote speech rendering
US20060195328A1 (en) * 2005-02-15 2006-08-31 International Business Machines Corporation Enhancing web experiences using behavioral biometric data
US7673017B2 (en) * 2005-09-06 2010-03-02 Interpolls Network Inc. Systems and methods for integrating XML syndication feeds into online advertisement
US20070078884A1 (en) * 2005-09-30 2007-04-05 Yahoo! Inc. Podcast search engine
US20070118484A1 (en) * 2005-11-22 2007-05-24 International Business Machines Corporation Conveying reliable identity in electronic collaboration
US8145472B2 (en) * 2005-12-12 2012-03-27 John Shore Language translation using a hybrid network of human and machine translators
US7725492B2 (en) * 2005-12-23 2010-05-25 Facebook, Inc. Managing information about relationships in a social network via a social timeline
US20070185927A1 (en) * 2006-01-27 2007-08-09 International Business Machines Corporation System, method and computer program product for shared user tailoring of websites
US20070188657A1 (en) * 2006-02-15 2007-08-16 Basson Sara H Synchronizing method and system
US20080010609A1 (en) * 2006-07-07 2008-01-10 Bryce Allen Curtis Method for extending the capabilities of a Wiki environment
US20080010341A1 (en) * 2006-07-07 2008-01-10 Bryce Allen Curtis Processing model of an application wiki
US20080040661A1 (en) * 2006-07-07 2008-02-14 Bryce Allen Curtis Method for inheriting a Wiki page layout for a Wiki page
US20080010387A1 (en) * 2006-07-07 2008-01-10 Bryce Allen Curtis Method for defining a Wiki page layout using a Wiki page
US7581166B2 (en) * 2006-07-21 2009-08-25 At&T Intellectual Property Ii, L.P. System and method of collecting, correlating, and aggregating structured edited content and non-edited content
US20080046976A1 (en) * 2006-07-25 2008-02-21 Facebook, Inc. Systems and methods for dynamically generating a privacy summary
US7797256B2 (en) * 2006-08-02 2010-09-14 Facebook, Inc. Generating segmented community flyers in a social networking system
US20080033739A1 (en) * 2006-08-02 2008-02-07 Facebook, Inc. Systems and methods for dynamically generating segmented community flyers
US7669123B2 (en) * 2006-08-11 2010-02-23 Facebook, Inc. Dynamically providing a news feed about a user of a social network
US7827208B2 (en) * 2006-08-11 2010-11-02 Facebook, Inc. Generating a feed of stories personalized for members of a social network
US20080086689A1 (en) * 2006-10-09 2008-04-10 Qmind, Inc. Multimedia content production, publication, and player apparatus, system and method
US20080177831A1 (en) * 2007-01-19 2008-07-24 Kat Digital Corp. Communitized media application and sharing apparatus
US7809805B2 (en) * 2007-02-28 2010-10-05 Facebook, Inc. Systems and methods for automatically locating web-based social network members
US7827265B2 (en) * 2007-03-23 2010-11-02 Facebook, Inc. System and method for confirming an association in a web-based social network
US20080242221A1 (en) * 2007-03-27 2008-10-02 Shapiro Andrew J Customized Content Delivery System and Method
US20080244020A1 (en) * 2007-03-28 2008-10-02 Michael R. Dolan System and method of user definition of and participation in communities and management of individual and community information and communication
US20080240397A1 (en) * 2007-03-29 2008-10-02 Fatdoor, Inc. White page and yellow page directories in a geo-spatial environment
US20080319761A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Speech processing method based upon a representational state transfer (rest) architecture that uses web 2.0 concepts for speech resource interfaces
US8074202B2 (en) * 2007-06-20 2011-12-06 International Business Machines Corporation WIKI application development tool that uses specialized blogs to publish WIKI development content in an organized/searchable fashion
US20080320168A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Providing user customization of web 2.0 applications
US20080319742A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation System and method for posting to a blog or wiki using a telephone
US8086460B2 (en) * 2007-06-20 2011-12-27 International Business Machines Corporation Speech-enabled application that uses web 2.0 concepts to interface with speech engines
US8041572B2 (en) * 2007-06-20 2011-10-18 International Business Machines Corporation Speech processing method based upon a representational state transfer (REST) architecture that uses web 2.0 concepts for speech resource interfaces
US20080319759A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Integrating a voice browser into a web 2.0 environment
US20080320443A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Wiki application development tool that uses specialized blogs to publish wiki development content in an organized/searchable fashion
US20080319762A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Using a wiki editor to create speech-enabled applications
US7631104B2 (en) * 2007-06-20 2009-12-08 International Business Machines Corporation Providing user customization of web 2.0 applications
US20080320079A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Customizing web 2.0 application behavior based on relationships between a content creator and a content requester
US20080319760A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Creating and editing web 2.0 entries including voice enabled ones using a voice only interface
US20080319758A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Speech-enabled application that uses web 2.0 concepts to interface with speech engines
US8041573B2 (en) * 2007-06-20 2011-10-18 International Business Machines Corporation Integrating a voice browser into a Web 2.0 environment
US7890333B2 (en) * 2007-06-20 2011-02-15 International Business Machines Corporation Using a WIKI editor to create speech-enabled applications
US7996229B2 (en) * 2007-06-20 2011-08-09 International Business Machines Corporation System and method for creating and posting voice-based web 2.0 entries via a telephone interface
US8032379B2 (en) * 2007-06-20 2011-10-04 International Business Machines Corporation Creating and editing web 2.0 entries including voice enabled ones using a voice only interface
US20100142516A1 (en) * 2008-04-02 2010-06-10 Jeffrey Lawson System and method for processing media requests during a telephony sessions
US20090252159A1 (en) * 2008-04-02 2009-10-08 Jeffrey Lawson System and method for processing telephony sessions
US20100100439A1 (en) * 2008-06-12 2010-04-22 Dawn Jutla Multi-platform system apparatus for interoperable, multimedia-accessible and convertible structured and unstructured wikis, wiki user networks, and other user-generated content repositories
US20100241507A1 (en) * 2008-07-02 2010-09-23 Michael Joseph Quinn System and method for searching, advertising, producing and displaying geographic territory-specific content in inter-operable co-located user-interface components
US20110035687A1 (en) * 2009-08-10 2011-02-10 Rebelvox, Llc Browser enabled communication device for conducting conversations in either a real-time mode, a time-shifted mode, and with the ability to seamlessly shift the conversation between the two modes

Cited By (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080320079A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Customizing web 2.0 application behavior based on relationships between a content creator and a content requester
US8041572B2 (en) 2007-06-20 2011-10-18 International Business Machines Corporation Speech processing method based upon a representational state transfer (REST) architecture that uses web 2.0 concepts for speech resource interfaces
US20080320443A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Wiki application development tool that uses specialized blogs to publish wiki development content in an organized/searchable fashion
US20080319762A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Using a wiki editor to create speech-enabled applications
US20080319759A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Integrating a voice browser into a web 2.0 environment
US20080319760A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Creating and editing web 2.0 entries including voice enabled ones using a voice only interface
US20080319758A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Speech-enabled application that uses web 2.0 concepts to interface with speech engines
US8086460B2 (en) 2007-06-20 2011-12-27 International Business Machines Corporation Speech-enabled application that uses web 2.0 concepts to interface with speech engines
US8074202B2 (en) 2007-06-20 2011-12-06 International Business Machines Corporation WIKI application development tool that uses specialized blogs to publish WIKI development content in an organized/searchable fashion
US20080319761A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation Speech processing method based upon a representational state transfer (rest) architecture that uses web 2.0 concepts for speech resource interfaces
US20080319742A1 (en) * 2007-06-20 2008-12-25 International Business Machines Corporation System and method for posting to a blog or wiki using a telephone
US9311420B2 (en) 2007-06-20 2016-04-12 International Business Machines Corporation Customizing web 2.0 application behavior based on relationships between a content creator and a content requester
US8041573B2 (en) 2007-06-20 2011-10-18 International Business Machines Corporation Integrating a voice browser into a Web 2.0 environment
US7890333B2 (en) 2007-06-20 2011-02-15 International Business Machines Corporation Using a WIKI editor to create speech-enabled applications
US8032379B2 (en) 2007-06-20 2011-10-04 International Business Machines Corporation Creating and editing web 2.0 entries including voice enabled ones using a voice only interface
US7996229B2 (en) 2007-06-20 2011-08-09 International Business Machines Corporation System and method for creating and posting voice-based web 2.0 entries via a telephone interface
US8831950B2 (en) * 2008-04-07 2014-09-09 Nuance Communications, Inc. Automated voice enablement of a web page
US20090254348A1 (en) * 2008-04-07 2009-10-08 International Business Machines Corporation Free form input field support for automated voice enablement of a web page
US9047869B2 (en) 2008-04-07 2015-06-02 Nuance Communications, Inc. Free form input field support for automated voice enablement of a web page
US20090254346A1 (en) * 2008-04-07 2009-10-08 International Business Machines Corporation Automated voice enablement of a web page
US20100199260A1 (en) * 2009-02-02 2010-08-05 Duggal Dave M Resource processing using an intermediary for context-based customization of interaction deliverables
US8533675B2 (en) 2009-02-02 2013-09-10 Enterpriseweb Llc Resource processing using an intermediary for context-based customization of interaction deliverables
US10824418B2 (en) 2009-02-02 2020-11-03 Enterpriseweb Llc Resource processing using an intermediary for context-based customization of interaction deliverables
WO2010088649A1 (en) * 2009-02-02 2010-08-05 Consilience International Llc Resource processing using an intermediary for context-based customization of interaction deliverables
US9182977B2 (en) 2009-02-02 2015-11-10 Enterpriseweb Llc Resource processing using an intermediary for context-based customization of interaction deliverables
US20110022388A1 (en) * 2009-07-27 2011-01-27 Wu Sung Fong Solomon Method and system for speech recognition using social networks
US9117448B2 (en) * 2009-07-27 2015-08-25 Cisco Technology, Inc. Method and system for speech recognition using social networks
US9049182B2 (en) 2009-08-11 2015-06-02 Novell, Inc. Techniques for virtual representational state transfer (REST) interfaces
US20110041171A1 (en) * 2009-08-11 2011-02-17 Lloyd Leon Burch Techniques for virtual representational state transfer (rest) interfaces
US10182074B2 (en) 2009-08-11 2019-01-15 Micro Focus Software, Inc. Techniques for virtual representational state transfer (REST) interfaces
US20130124631A1 (en) * 2011-11-04 2013-05-16 Fidelus Technologies, Llc. Apparatus, system, and method for digital communications driven by behavior profiles of participants
US10129720B1 (en) * 2011-12-30 2018-11-13 Genesys Telecommunications Laboratories, Inc. Conversation assistant
US10678518B2 (en) 2012-03-19 2020-06-09 Enterpriseweb Llc Declarative software application meta-model and system for self modification
US9483238B2 (en) 2012-03-19 2016-11-01 Enterpriseweb Llc Declarative software application meta-model and system for self-modification
US10175956B2 (en) 2012-03-19 2019-01-08 Enterpriseweb Llc Declarative software application meta-model and system for self-modification
US9075616B2 (en) 2012-03-19 2015-07-07 Enterpriseweb Llc Declarative software application meta-model and system for self-modification
US10901705B2 (en) 2012-03-19 2021-01-26 Enterpriseweb Llc System for self modification
US20130268483A1 (en) * 2012-04-06 2013-10-10 Sony Corporation Information processing apparatus, information processing method, and computer program
WO2014070238A1 (en) * 2012-11-02 2014-05-08 Fidelus Technologies, Llc Apparatus, system, and method for digital communications driven by behavior profiles of participants
US11641397B2 (en) * 2014-05-11 2023-05-02 Microsoft Technology Licensing, Llc File service using a shared file access-rest interface
CN110619101A (en) * 2018-12-29 2019-12-27 北京时光荏苒科技有限公司 Method and apparatus for processing information

Also Published As

Publication number Publication date
WO2008155343A3 (en) 2009-03-05
WO2008155343A2 (en) 2008-12-24
US8074202B2 (en) 2011-12-06
US20080320443A1 (en) 2008-12-25

Similar Documents

Publication Publication Date Title
US20080319757A1 (en) Speech processing system based upon a representational state transfer (rest) architecture that uses web 2.0 concepts for speech resource interfaces
US8086460B2 (en) Speech-enabled application that uses web 2.0 concepts to interface with speech engines
US8041572B2 (en) Speech processing method based upon a representational state transfer (REST) architecture that uses web 2.0 concepts for speech resource interfaces
US7631104B2 (en) Providing user customization of web 2.0 applications
US8041573B2 (en) Integrating a voice browser into a Web 2.0 environment
US9311420B2 (en) Customizing web 2.0 application behavior based on relationships between a content creator and a content requester
US20170046124A1 (en) Responding to Human Spoken Audio Based on User Input
US10547747B1 (en) Configurable natural language contact flow
US7028306B2 (en) Systems and methods for implementing modular DOM (Document Object Model)-based multi-modal browsers
US7996229B2 (en) System and method for creating and posting voice-based web 2.0 entries via a telephone interface
US8032379B2 (en) Creating and editing web 2.0 entries including voice enabled ones using a voice only interface
US7890333B2 (en) Using a WIKI editor to create speech-enabled applications
KR100459299B1 (en) Conversational browser and conversational systems
US9069450B2 (en) Multi-modal/multi-channel application tool architecture
US7487440B2 (en) Reusable voiceXML dialog components, subdialogs and beans
JP5179375B2 (en) Method and server for processing voice applications in a client-server computing system
US10249296B1 (en) Application discovery and selection in language-based systems
US11749276B2 (en) Voice assistant-enabled web application or web page
US20100094635A1 (en) System for Voice-Based Interaction on Web Pages
US8027839B2 (en) Using an automated speech application environment to automatically provide text exchange services
Saylor Spoke: A framework for building speech-enabled websites
Niazi et al. An ontology-based framework for discovering mobile services
US11551695B1 (en) Model training system for custom speech-to-text models
Prange et al. Easy deployment of spoken dialogue technology on smartwatches for mental healthcare
Boonstra Implementing a Dialogflow Voice Agent in Your Website or App Using the SDK

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:DA PALMA, WILLIAM V.;MOORE, VICTOR S.;NUSBICKEL, WENDI L.;REEL/FRAME:019456/0871;SIGNING DATES FROM 20070614 TO 20070620

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION