WO2017059500A1 - Frameworks and methodologies configured to enable streamlined integration of natural language processing functionality with one or more user interface environments, including assisted learning process - Google Patents

Frameworks and methodologies configured to enable streamlined integration of natural language processing functionality with one or more user interface environments, including assisted learning process

Info

Publication number
WO2017059500A1
Authority
WO
WIPO (PCT)
Prior art keywords
natural language
template
data
user
action
Prior art date
Application number
PCT/AU2016/050950
Other languages
French (fr)
Inventor
Matthew PARTRIDGE
Original Assignee
Sayity Pty Ltd
Priority date
Filing date
Publication date
Priority claimed from AU2015904121A external-priority patent/AU2015904121A0/en
Application filed by Sayity Pty Ltd filed Critical Sayity Pty Ltd
Publication of WO2017059500A1 publication Critical patent/WO2017059500A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2452 Query translation
    • G06F16/24522 Translation of natural language queries to structured queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation

Definitions

  • the present invention relates to frameworks and methodologies configured to enable streamlined integration of natural language processing functionality with one or more user interface environments.
  • Embodiments of the invention have been particularly developed for enabling providers of client-facing technology (such as websites, software applications, and controllable machinery) to implement a natural language processing functionality (including, but not limited to voice control), in a streamlined manner. While some embodiments will be described herein with particular reference to that application, it will be appreciated that the invention is not limited to such a field of use, and is applicable in broader contexts.
  • One embodiment provides a computer implemented method for providing natural language processing functionality, the method including:
  • the template management interface is configured to enable an administrator user to: (i) view data representative of the natural language based request; and (ii) define an instruction template and action template in relation to the natural language based request.
  • One embodiment provides a computer implemented method wherein the template management interface is configured to:
  • each instruction template provides data representative of one or more natural language based requests associated with the instruction template.
  • One embodiment provides a computer implemented method wherein enabling user-definition and/or modification of an action template includes: (i) defining a parametrized portion of one or more of the natural language based requests associated with the instruction template; and (ii) defining a corresponding parametrized portion of the action template.
  • One embodiment provides a computer implemented method wherein the parametrized portion of the action template is a portion of HTTP data.
  • One embodiment provides a computer implemented method wherein enabling user-definition and/or modification of an action template includes enabling the user to selectively define (i) a text-based response; or (ii) an HTTP based response.
  • One embodiment provides a computer implemented method wherein enabling user-definition and/or modification of an action template includes enabling the user to selectively define (i) a text-based response; (ii) an HTTP link based response; or (iii) an HTTP post/put/delete (or other) based response.
  • One embodiment provides a computer implemented method wherein enabling user-definition and/or modification of an action template includes enabling the user to selectively define a complex action template.
  • One embodiment provides a computer implemented method wherein the user interface is configured to enable a user to: (i) select a natural language based request associated with a first instruction template; and (ii) re-associate that natural language-based request with a second template.
  • One embodiment provides a computer implemented method wherein the second template is a new template.
  • One embodiment provides a computer implemented method wherein the natural language based request is receivable in either text form or speech form.
  • One embodiment provides a computer implemented method for providing natural language processing functionality to a plurality of technology environments, the method including:
  • each instruction template is associated with a respective action template.
  • One embodiment provides a computer implemented method wherein the method of matching includes the consideration of any one or more of the following variations: grammatical variations; synonymous variations; hyponyms of hypernyms; hyponymous, hypernymous, meronymous, holonymous variations; instantiable variations; and variations based on liberal matching of adjectives and adverbs.
  • One embodiment provides a computer implemented method wherein the matching includes a scoring process, wherein the scoring process is affected by the presence of one or more variations.
  • One embodiment provides a computer implemented method wherein the degree to which the score is affected in response to the presence of one or more variations is based on a learning algorithm.
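A minimal sketch of such a variation-sensitive scoring process follows. The penalty weights are invented for illustration; per the embodiment above, a real system would estimate them with a learning algorithm rather than set them by hand.

```python
# Illustrative penalty weights per variation type; in practice these
# would be estimated by a learning algorithm, not hand-tuned.
VARIATION_WEIGHTS = {
    "grammatical": 0.05,
    "synonym": 0.10,
    "hypernym": 0.20,
    "meronym": 0.25,
}

def match_score(variations_used) -> float:
    """Score a candidate match: start from a perfect 1.0 and apply a
    penalty for each variation needed to make the match."""
    score = 1.0
    for v in variations_used:
        score -= VARIATION_WEIGHTS.get(v, 0.3)  # unknown types penalised most
    return max(score, 0.0)
```

An exact match keeps score 1.0, while a match requiring a synonym and a hypernym substitution drops to 0.70 under these assumed weights.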
  • One embodiment provides a computer implemented method wherein a plurality of actions are chained together.
  • One embodiment provides a computer implemented method wherein one or more of the instruction templates are parametrised.
  • One embodiment provides a computer implemented method wherein the natural language processing engine is configured to, in the case that the matched instruction template is a parametrised instruction template: process the natural language based request thereby to determine parameter values for the parametrised instruction template.
  • One embodiment provides a computer implemented method wherein defining an executable instance for the identified action template associated with a parametrised instruction template includes applying the determined parameter values to corresponding parametrised portions of the identified action template.
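The parameter extraction and instantiation described in the embodiments above can be sketched as follows. This is a minimal illustration assuming a `<name>` placeholder syntax for parametrised templates; the function names and the regex-based matching strategy are assumptions, not taken from the specification.

```python
import re

def template_to_regex(template: str) -> re.Pattern:
    """Compile a parametrised instruction template such as
    "move <amount> metres forward" into a regex whose named
    groups capture the parameter values."""
    pattern = ""
    for part in re.split(r"(<\w+>)", template):
        m = re.fullmatch(r"<(\w+)>", part)
        pattern += f"(?P<{m.group(1)}>.+?)" if m else re.escape(part)
    return re.compile(pattern, re.IGNORECASE)

def instantiate_action(action_template: str, params: dict) -> str:
    """Apply determined parameter values to the corresponding
    parametrised portions of an action template."""
    for name, value in params.items():
        action_template = action_template.replace(f"<{name}>", value)
    return action_template

# "Move 5.3 metres forward" matches the instruction template, and the
# extracted value is applied to a (hypothetical) action template:
match = template_to_regex("move <amount> metres forward").fullmatch(
    "move 5.3 metres forward")
params = match.groupdict()
action = instantiate_action("forward(<amount>)", params)
```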
  • One embodiment provides a computer implemented method wherein each parameter has a parameter type, and wherein the parameter type is used in the matching.
  • One embodiment provides a computer implemented method wherein queries which are imperfectly aligned with instruction templates are handled.
  • One embodiment provides a computer implemented method wherein the technology environments include: (i) a plurality of websites; (ii) a plurality of software applications; and (iii) a plurality of controllable machines.
  • One embodiment provides a computer implemented method including providing to the client device a signal configured to cause the client device to request further information from the user to support the natural language based request.
  • One embodiment provides a computer implemented method wherein the executable instance of the action template defines a series of consecutive processes configured to be performed by the client device thereby to satisfy the natural language based request.
  • One embodiment provides a computer implemented method wherein the data representative of the natural language request includes: (i) request data; and (ii) contextual data.
  • One embodiment provides a computer implemented method wherein the request data includes either: (i) speech data received by the client device, which is converted to text by a speech recognition engine associated with the natural language processing engine; or (ii) text data inputted at the client device.
  • One embodiment provides a computer implemented method wherein the contextual data includes, in the case of a technology environment in the form of a software application, one or more of: localisation information, available objects and any objects which are displayed to the user, the current state of such objects, etc.
  • One embodiment provides a computer implemented method wherein the contextual data includes, in the case of a controllable machine, one or more of: position information, information coming from sensors, current actions in progress, world information such as location of objects, size and weight of objects, etc.
  • One embodiment provides a computer implemented method wherein, for a given technology environment, one or more of the instruction templates and/or action templates is defined, in whole or in part, by an automated process.
  • One embodiment provides a computer implemented method wherein the automated process includes analysing user interface artefacts thereby to identify instructions that are configured to be provided directly via the user interface.
  • One embodiment provides a computer implemented method including, for each technology environment as a configuration step, performing an automated process in respect of one or more databases associated with the technology environment thereby to extract parameter contextual data.
  • One embodiment provides a computer implemented method for providing natural language processing functionality, the method including:
  • an instruction template which includes parametrized portions based on the identified portions of the data representative of the natural language based request;
  • an associated action template which includes parametrized portions based on the identified portions of the data associated with the manually defined response.
  • One embodiment provides a computer implemented method wherein the step of providing an interface that is configured to enable manual defining of a response to the natural language based request is performed only in the case that the natural language based request is not matched to an existing instruction template.
  • One embodiment provides a computer implemented method wherein performing a comparison of: (i) the data representative of the natural language based request; and (ii) data associated with the manually defined response includes:
  • One embodiment provides a computer implemented method wherein the discrete grammatical artefacts include datives, accusatives, direct objects, and indirect objects.
  • One embodiment provides a computer implemented method wherein performing a comparison of: (i) the data representative of the natural language based request; and (ii) data associated with the manually defined response additionally includes:
  • One embodiment provides a computer implemented method wherein the data commonalities are identified in a URL portion of a web page associated with the manually defined response.
  • One embodiment provides a computer implemented method wherein the data commonalities are identified in context of a web page associated with the manually defined response.
  • One embodiment provides a computer implemented method wherein context of a web page associated with the manually defined response includes a manually defined field value.
  • One embodiment provides a computer implemented method wherein identifying data commonalities includes normalizing a data portion from either or both of the data representative of the natural language based request and the data associated with the manually defined response based on a predicted data type.
  • One embodiment provides a computer implemented method wherein the instruction template includes data defining a plurality of combinations between: identifiable grammatical artefact types; and identifiable data types.
  • One embodiment provides a computer program product for performing a method as described herein.
  • One embodiment provides a non-transitory carrier medium for carrying computer executable code that, when executed on a processor, causes the processor to perform a method as described herein.
  • One embodiment provides a system configured for performing a method as described herein.
  • any one of the terms "comprising", "comprised of" or "which comprises" is an open term that means including at least the elements/features that follow, but not excluding others.
  • the term "comprising", when used in the claims, should not be interpreted as being limitative to the means or elements or steps listed thereafter.
  • the scope of the expression "a device comprising A and B" should not be limited to devices consisting only of elements A and B.
  • Any one of the terms "including" or "which includes" or "that includes" as used herein is also an open term that also means including at least the elements/features that follow the term, but not excluding others. Thus, "including" is synonymous with and means "comprising".
  • "exemplary" is used in the sense of providing examples, as opposed to indicating quality. That is, an "exemplary embodiment" is an embodiment provided as an example, as opposed to necessarily being an embodiment of exemplary quality.
  • FIG. 1 schematically illustrates a framework according to one embodiment.
  • FIG. 2 illustrates a method according to one embodiment.
  • FIG. 3 illustrates a client-server framework leveraged by some embodiments.
  • FIGS. 4A to 4E illustrate exemplary partial screenshots according to an embodiment.
  • FIG. 5 illustrates a method according to one embodiment.
  • the present invention relates to frameworks and methodologies configured to enable streamlined integration of natural language processing functionality with user interface environments.
  • embodiments provide technology that enables integration of natural language query processing functionality into existing user interface environments, such as websites (although websites are used as an illustrative example below, the technology is applicable to a wide range of other user interface environments).
  • the technology allows a query to be submitted in natural language form (for example via text input and/or voice input), and for that query to be processed thereby to trigger an action that is specific to the user interface environment.
  • Some embodiments described below relate to technology whereby a learning interface is configured to enable automated and/or semi-automated generation of computer coded rules and logic that is able to perform defined actions based on analysis of natural language queries.
  • such embodiments make use of human-assisted learning, whereby a human operator defines an appropriate response to a first natural language query, and the learning interface induces from that human-defined response (either automatically or semi-automatically) a template that allows subsequent automated responding to a second natural language query with different parameters (but identifiable commonality in terms of particular structural elements).
  • Some embodiments provide technology configured to enable providers of client-facing technology (such as websites, software applications, and controllable machinery) to implement a natural language processing functionality, such as voice control, in a streamlined manner. More specifically, in some embodiments a natural language processing engine is made available via a cloud service. The cloud service maintains access to, for each client-facing technology environment, a respective data set including predefined parameterised instructions and actions specific to the relevant environment. In this manner, only a relatively small amount of environment-specific data is required for each individual environment, whereas the underlying natural language processing engine (and associated technology) scales commonly across the plurality of environments.
  • client-facing technology such as websites, software applications, and controllable machinery
  • An exemplary framework is illustrated in FIG. 1.
  • a natural language command processing server 100 is configured to interact with instances of a plurality of technology environments.
  • FIG. 1 illustrates three exemplary technology environments:
  • a website 130 which is executed at a plurality of client devices via website rendering instances 1-n. It will be appreciated that each individual website has a unique set of functions that it is configured to provide. These may be somewhat generic in function (for example navigation and the like), but unique in terms of the specific website (for example by reference to menu item names and the like).
  • a controllable equipment environment which is executed by way of a plurality of individual physical machines 1-n.
  • the technology environment is defined by reference to machine hardware and software, and in some cases user interface software that executes separately of the machine.
  • a software application, instances 1-n of which execute at respective client terminals (or, in the case of cloud hosted software, user interfaces of which execute at the client terminals).
  • although only a single environment of each category is illustrated, in practice there are a plurality of environments of each category.
  • multiple software developers and/or website administrators adapt the technology environments they provide for compatibility with server 100.
  • the technology environments may include several hundred distinct software applications, and for each software application there may be hundreds (or thousands) of individual executing instances simultaneously.
  • one or more intermediate servers are used to assist in coordinating requests from groups of client devices (for example a dedicated server to receive and optionally partially process requests from a common software application).
  • these technology environments are configured to interact with server 100 thereby to enable the provision of a natural language based request functionality.
  • this functionality may include either or both of: a speech-based request interface (whereby a user of a client device that presents an instance of the technology environment provides a request verbally) and a text-based interface (whereby a user of a client device that presents an instance of the technology environment provides a request by typing that request into a user interface object).
  • Upon a natural language based request being received at a client device in one of the supported technology environments, the client device communicates data representative of the natural language based request to server 100 (for example audio data and/or text data), where it is received by input modules 101.
  • Server 100 executes a natural language based request processing engine 102.
  • Engine 102 is available to all of the supported technology environments, and makes use of a common natural language processing engine 103, text-to-speech engine 107, and global learning data 108 (comprising data developed over time thereby to tune the effectiveness of engines 102, 103 and/or 107).
  • Server 100 is configured to maintain access to data that defines, for each of the plurality of technology environments:
  • a given action template may include a plurality of steps and/or reference one or more further action templates
  • a supported environment data store 110 which maintains instruction/action templates 111 for each of a plurality of n supported technology environments.
  • data store 110 additionally maintains other contextual data for each supported technology environment, which may include parameter field types, context to parameter meanings, and the like.
  • environment-specific learning data 113 enabling tuning of natural language processing both at a global cross-environment level via data 108, and at an environment-specific level via data 113.
  • FIG. 2 illustrates a method performed by the framework of FIG. 1.
  • Block 201 represents a process including receiving, from a client device, data representative of a natural language based request submitted by a user of the client device. This is followed, in the case of a speech command, by speech-to-text conversion at block 202.
  • Block 203 represents a process including identifying the technology environment being executed at the client device.
  • the data received at 201 in some cases includes or is associated with information identifying the client device and the technology environment executing at the client device.
  • Block 204 then includes executing a natural language processing engine, which is available across a plurality of technology environments, thereby to match the natural language based request to one of the plurality of instruction templates associated with the identified one of the technology environments.
  • the request is matched to a plurality of instructions (for example where the request combines functionalities provided by discrete instructions).
  • Block 205 represents, based on the matching, identifying the action template associated with the matched instruction template. Then, block 206 represents a process including defining an executable instance for the identified action template.
  • one or more of the instruction templates are parametrised (that is, they are defined as a template including a plurality of parametric portions that are configured to be populated with actual parameter values). That is, they are defined by reference to functional components, and portions that are parametric.
  • the method includes processing the natural language based request thereby to determine parameter values for the parametrised instruction template.
  • defining executable instances for the identified action template associated with a parametrised instruction template includes applying the determined parameter values to corresponding parametrised portions of the identified action template. In FIG. 1, this is performed using an action construction engine 104.
  • the literal value of the parametric portions is typically not of importance to the function of server 100.
  • attributes of the parametric portion are in some cases relevant to selection instructions/action templates. For instance, where a parameter is a name, the name itself is typically not relevant to the matching process, but the fact that the parameter is of the type "person" is of relevance.
  • parameter types have additional downstream bearing in terms of defining an executable instance of an action.
  • an action may be defined that replaces the literal value "Bruce” with an alternate term based on what is an acceptable parameter for the defined action template being used. For example, “Bruce”, “him”, “the tall cook” and “somebody” are all able to be matched with person-type parameters.
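A crude illustration of this type-based (rather than literal) parameter matching follows; the `PERSON_MARKERS` set and the heuristics are invented for the sketch and are far simpler than a real type system would be.

```python
# Hypothetical lexicon of words that can fill a "person"-type slot.
PERSON_MARKERS = {"him", "her", "them", "somebody", "someone"}

def is_person_like(phrase: str) -> bool:
    """Rough check that a phrase can fill a person-type parameter:
    a known pronoun/indefinite, a capitalised name such as "Bruce",
    or a definite description such as "the tall cook"."""
    phrase = phrase.strip()
    if phrase.lower() in PERSON_MARKERS:
        return True
    if phrase[:1].isupper():                    # e.g. "Bruce"
        return True
    return phrase.lower().startswith("the ")    # e.g. "the tall cook"
```

Under this sketch, "Bruce", "him", "the tall cook" and "somebody" all match a person-type parameter, while a literal such as "5.3" does not.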
  • contextual data is used to assist in resolving ambiguities with respect to parameters of known types.
  • types may have properties, which may or may not be inherent to the object (such as having a handle, or such as being within-reach).
  • Block 207 represents a process including transmitting data representative of the executable instances for the identified action template to the client device, thereby to cause the client device to perform functions based on the executable instances for the identified action template. In FIG. 1 this is achieved via action output modules 106.
  • a given action template defines a series of consecutive processes configured to be performed by the client device thereby to satisfy the natural language based request.
  • engine 102 determines that additional information is required to enable matching to an instruction template and/or construction of an executable action.
  • contact modules 105 are configured to enable two-way communication between server 100 and a client device thereby to request (and obtain) further information from the user to support the natural language based request.
  • server 100 defines a question which is intended to seek clarification and/or input from a user of the client device.
  • the data representative of the natural language request includes: (i) request data; and (ii) contextual data.
  • the request data includes either: (i) speech data received by the client device, which is converted to text by a speech recognition engine associated with the natural language processing engine; or (ii) text data inputted at the client device.
  • the contextual data includes, in the case of a technology environment in the form of a software application, one or more of: localisation information, available objects and any objects which are displayed to the user, the current state of such objects, etc.
  • the contextual data includes one or more of: position information, information coming from sensors, current actions in progress, world information such as location of objects, size and weight of objects, etc.
  • contextual data is used in the context of a reasoning process, which supports the natural language processing.
  • the natural language processing results in an ambiguity - which is not able to be resolved without recourse to reasoning. This may relate to a parameter: a request "pick up that thing” is inherently unclear as to what is meant by "that thing”. Alternately, there may be two or more possible interpretations for a given word/phrase/concept.
  • a reasoning engine is configured to apply a set of rules thereby to seek resolution of ambiguity based on static and/or the contextual data.
  • Rules defined for execution by a reasoning engine may include either or both of (i) rules defined for a specific technology environment; (ii) rules that are common across the plurality of technology environments.
  • the reasoning engine is implemented as a first stage of ambiguity resolution, with two-way communication with a user of the client device via contact modules 105 being a second stage that is utilised in the case that automated resolution via the reasoning engine is unsuccessful (or provides a result below a threshold level of accuracy probability).
  • the user query is submittable in either written or spoken form (and in some cases a user interface environment provides for both forms of input).
  • where the user query arrives in spoken form, it is preferably converted into a textual form using a speech recognition engine.
  • the speech recognition engine is configured to define multiple textual queries (or query portions), which are preferably each associated with probability data. For example, a given spoken word may be converted into two variant text forms, one being associated with an 80% probability and the other with a 60% probability. Making such variants available to downstream steps (which have increased knowledge of environment-specific terminology) leads to performance advantages.
  • the contextual information preferably includes information representative of the state of the local environment, such as a history of recent user queries (for anaphora resolution and for subtopic identification), and restrictions on available actions. In some embodiments it includes:
  • the contextual information also preferably includes (or is configured to reference) meta information describing relationships between objects, linguistic terminology, etc. For example, ontologies and other data sources representing facts and rules concerning real world objects.
  • each technology environment is configured to be compatible with server 100 by a process including defining action and instruction templates. These are collectively referred to as "rule templates”.
  • the instruction represents something which is searchable; the action is something which can be executed in some fashion to achieve some outcome.
  • the term "template" refers to the fact that part of an instruction may be specified in terms of a parameter which is allowed to vary. This allows many (sometimes infinitely many) related instructions to be represented succinctly. For example, "Move <amount> metres forward" might represent a movement template which can be matched against "Move 5.3 metres forward", but also against "Move 21 metres forward".
  • the corresponding action will typically be parameterised in the same way so that the information which is realised by the query may be utilised in the execution of the action. For example, a robot may be able to make use of an action described as a program (where amount gets instantiated to the corresponding value in the user query):
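The program referred to above is not reproduced in this text; the following is a minimal sketch of what such a parameterised robot action might look like, with `move_forward` standing in for a hypothetical robot control call.

```python
def make_move_action(amount: float):
    """Build an executable action for the instruction template
    "Move <amount> metres forward", with <amount> instantiated
    from the user query."""
    def action():
        # move_forward is a hypothetical robot control call; here we
        # just return the command string that would be issued.
        return f"move_forward({amount})"
    return action

# "Move 5.3 metres forward" instantiates the template with amount=5.3:
act = make_move_action(5.3)
```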
  • the user query gets parsed to reveal the structure of the query.
  • the parse structure is analysed to discover the semantic roles of various components of the query. (This step is optional: perhaps just the structure is used, or direct matching may be used later.)
  • Hyponymous, hypernymous, meronymous, holonymous variations e.g. "Turn on the hose” - "Turn on the tap of the hose” - "Turn on the tap”
  • the quality of the match may be determined with reference to various resources such as ontologies. These resources are, in some embodiments, general purpose; alternatively they may be specialised to the application, or even to the context of the application, as supplied in the accompanying contextual information.
  • the relative importance of the various factors determining the quality of the match may be determined manually, or may be estimated using some machine learning technique.
  • the best match is determined.
  • This "best match” may be non-unique if the best few matches have scores which are similar.
  • the query is deemed to be ambiguous, and may be reported as ambiguous, or more information may be sought from the user in order to resolve the ambiguity.
  • Matches may be made on subparts of the query. For example, a user query, "Display sales figures for March", might match "Show the results" ("Display ...") and "values for a period" ("... sales figures for March"). This might result in two or more actions being chained together. Of particular note are actions associated with noun phrases, which will often be find()-type actions. These find()-type actions then become inputs to other actions, as noun phrases are often arguments of verb phrases.
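The chaining of a find()-type action into a verb-phrase action, as in the "Display sales figures for March" example, might look like the following sketch; the stub data and function names are invented for illustration.

```python
def find_sales_figures(period):
    """find()-type action matched from the noun phrase
    "sales figures for March" (illustrative stub data)."""
    data = {"March": [120, 95, 143]}
    return data.get(period, [])

def display(values):
    """Action matched from the verb phrase "Display ..."."""
    return "Displaying: " + ", ".join(str(v) for v in values)

# The noun-phrase action's result becomes the verb-phrase action's input:
result = display(find_sales_figures("March"))
```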
  • the instruction-action template may have parts which represent parameters of the action.
  • In order to instantiate the template:
  • the user query and the instruction template are aligned. This is done by matching the (semantically) corresponding parts of the query with the instruction template. For example, this could be done by parsing both, recognising the semantic roles that each of the constituent parts play in the parse, and aligning common parts. Frame/slot representations may be useful intermediate structures for making such an alignment, but this could be done working more directly with the parse trees for example.
  • the parameters of the template are grounded by copying the values in their corresponding structure from the user query.
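By way of illustration only, the alignment and grounding steps can be sketched in Python using simple pattern-based templates in place of full parsing; the {param} placeholder syntax and function names are assumptions of this sketch, not part of the method itself.

```python
import re

def template_to_regex(template):
    # Split on {param} placeholders; odd indices hold parameter names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = "".join(
        re.escape(p) if i % 2 == 0 else f"(?P<{p}>.+?)"
        for i, p in enumerate(parts)
    )
    return re.compile(pattern + r"$", re.IGNORECASE)

def ground(template, query):
    """Align the query with the instruction template and ground the
    template's parameters by copying the corresponding query spans."""
    m = template_to_regex(template).match(query)
    return m.groupdict() if m else None

params = ground("pay {payee} ${amount}", "pay John Smith $50")
# params == {"payee": "John Smith", "amount": "50"}
```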
  • the instruction template may be represented using English words which represent an instance of the way in which the corresponding action template may be referenced. For example, "Lift up a mug.”
  • an example request is a convenient form of representation. It has the advantages that it is easy to understand and easy to create and modify for non-specialists who may be responsible for the collection of rule templates. Whilst such an instruction may be able to deal with a large number of different mugs, it may prove even more useful to generalise it further.
  • Action templates communicate the user query in a form which the software that performs the action can understand. The most appropriate representation is largely a question of what makes sense to the client software. SQL, XML, JSON, executable code, references to menus or functions in the client software may all be appropriate representations.
  • server code may be supplied to assist with disambiguation. For example, a phrase such as "the pink cup" may be disambiguated between multiple such cups available in the context by considering the distance between the speaker and each cup, or the distance between the actor (i.e. the robot) and each cup - the closer cup would normally have greater salience.
  • this calculation is best performed by reasoning on the service as part of its ambiguity resolution.
  • the server code may be directly included in each action which requires it, or may be contained in the contextual information which accompanies the request, with only a reference to the code contained in the action itself.
  • the natural language processing service may choose to hobble the generality of the execution of client code for security, stability, privacy and sandboxing considerations.
  • the action is described in terms of three other actions: it is convenient to express the action in natural language, or however else the instructions are represented.
  • the action may have components which refer to other actions, expressed in natural language, and other components which are represented in some other fashion, as described earlier. Note that some of the actions which are described elsewhere may be further described in terms of other actions. For example, "Pick up the thing" might match with "Pick up something", which is in some embodiments defined as another 3-step action:
  • Actions may also have optional parts.
  • an action associated with getting a robot to walk may have a speed parameter which may be optionally expressed.
  • Such optional parts may be handled in server code, or the representation of the action template might make this optionality explicit, for instance defining values to be used in the case that the corresponding optional part is missing.
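A minimal Python sketch of the explicit-optionality approach (the walk action and its parameter names are hypothetical): defaults are declared alongside the action template and used whenever the optional part is unexpressed.

```python
# Hypothetical action template: a robot walk action with a required
# direction and an optional speed, whose default is declared explicitly.
WALK_ACTION = {
    "required": ["direction"],
    "optional": {"speed": "normal"},  # default used when unexpressed
}

def instantiate(action, params):
    grounded = dict(action["optional"])  # start from declared defaults
    grounded.update({k: v for k, v in params.items() if v is not None})
    missing = [k for k in action["required"] if k not in grounded]
    if missing:
        raise ValueError(f"missing required parts: {missing}")
    return grounded

instantiate(WALK_ACTION, {"direction": "forward"})
# → {"speed": "normal", "direction": "forward"}
```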
  • Actions can be anything which can be executed to achieve some outcome. They may result in a different presentation of some data; they may add, delete or modify some data. They may result in a motor being activated. They may change a mode of operation of the application. They may add, delete or modify the application's understanding of the world (e.g. giving names to objects in the world). They may even add, delete or modify the available actions which the natural language processing service may operate over.

[00150] There may be special types of actions which do not require a representation like those described so far. For example, "search()" and "help()" type functions may be special-cased.
  • search() Return a list of all objects which meet some criteria.
  • search is relevant here in a couple of ways: the search facility may be most easily specified using natural language ("search", "find", "show me", ...) and the search criteria may also be most easily specified using natural language.
  • Many search facilities have complicated user interfaces to allow the user to indicate restrictions on the search ("whole word", "at the start", "within the current selection", etc.); these can often be readily expressed in natural language.
  • the search facility is greatly improved particularly in non-text applications, where the entities in the application may be hard to find. For example, “find the best salesman for last month” or asking a robot "what cutlery can you see?" or "show me all communication I've had with Mr. Benson in the last week.”
  • a natural language processing service may be able to understand entities, and their relationship between each other via supplied ontologies or other reasoning resources. In the case of categorising the results, type and other knowledge can again be useful.
  • Search is also relevant as it may be the input to another part of a natural language processing query (see discussion on action chaining earlier).
  • the presentation of the results would nonetheless require that the client application be informed of the results and asked to present them.
  • One application might choose to present them as a table on the right-hand side of the application; a robot might speak the results or even point to each of the objects in turn.
  • search is just a special type of rule, but one where a lot of the details may be assumed rather than specified. Note that as it is a type of rule, it is possible to have multiple search rules available, each of which is applicable for different types of objects being searched, for example so that results are presented in different ways, or so that the search orders the results differently, etc.
  • the collection of rules is the glue which links the capabilities of the client application with instructions; the natural language processing service generalises the invokability of these rules.
  • These rules may need to be checked so that they are properly understood by the natural language processing service, and for consistency.
  • the natural language processing service might produce example natural language queries to simulate what the user might have requested, in order to help the author of the rules understand possible actions which would in some embodiments be undertaken if the user were to express such a query.
  • the natural language processing service could try to generate examples which it found ambiguous to tease out the most appropriate action to undertake in such a situation.
  • Some embodiments provide a method wherein, for a given technology environment, one or more of the instruction templates and/or action templates is defined, in whole or in part, by an automated process.
  • the automated process includes analysing user interface artefacts thereby to identify instructions that are configured to be provided directly via the user interface.
  • the rules may be automatically generated. This is particularly the case where the client application is structured in some way. For example, consider an application which interacts heavily with a database. This database may be automatically inspected to determine the nature and relationships between tables and fields in the database. In some cases, this may be able to be translated into rules, ready for the natural language processing service. Other examples of automatic extraction of rules may include automatic analysis of the user-interface elements in a software application (menus and buttons on user interfaces often correspond to actions, and the way they are labelled corresponds to associated instructions). This is in some embodiments achieved by scanning the software code which represents the application, looking for such objects and their associated actions.
  • the natural language processing service only requires the output of this automatic extraction. This means that such extraction may be offered such that it is run behind a firewall (for instance on the client's computer) so that the natural language processing service does not have access to associated databases or software code. The client might then be able to manually inspect the extracted rules and other associated information, before submitting it to the natural language processing service. In this way, the client's privacy and security is respected.
  • If the client's software were so arranged, it is in some embodiments possible to get the software to learn from certain situations.
  • In the case of a robot equipped with object tracking and computer vision, for example, the robot is in some embodiments told to watch and learn an action. By associating the action with an instruction, a new rule could be created.
  • a desktop software application is in some embodiments able to be put in a "record" mode, where a sequence of actions are able to be combined; if those actions are then associated with an instruction, this rule could then be added to the rules available for the natural language processing service to consider.
  • learnt rules could be created by the authors of the software, or potentially by users of the software. Such rules could potentially be shared between different users, or made available only to the user who created the rule.
  • actions may be restricted in their application either by the situation or because those actions are restricted for that user. Such restrictions can be modelled in two ways. Actions may be hidden from the user in a particular situation if the user does not have adequate permission, and this hiding may be done in a way in which the rule appears to be completely absent from consideration. Alternatively, actions may be restricted from the user in a particular situation such that the user query may still be matched against the restricted actions (for example, blocking the applicability of other matching actions which have a lower score) and optionally, the user may be notified that he or she does not have permission to perform that action at this time.
  • the permission information is in some embodiments attached to each action, with the user's identity being passed in the contextual information which accompanies the user request. In some cases, it may be desirable to use some strategy such as signing the request to prevent malicious forgeries from accessing parts of the application which are restricted.
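One way to realise the request-signing strategy is with an HMAC over the contextual information; the following Python sketch assumes a hypothetical pre-shared secret between the client application and the service, and is illustrative rather than part of the specification.

```python
import hashlib
import hmac
import json

# The calling application signs the contextual information carrying the
# user's identity, so the service can reject forged requests that attempt
# to access restricted actions. SECRET is an assumed pre-shared key.
SECRET = b"shared-secret-key"

def sign(context: dict) -> str:
    # Canonicalise the context so both sides hash identical bytes.
    payload = json.dumps(context, sort_keys=True).encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()

def verify(context: dict, signature: str) -> bool:
    # Constant-time comparison guards against timing attacks.
    return hmac.compare_digest(sign(context), signature)

ctx = {"user": "jsmith", "role": "operator"}
sig = sign(ctx)
# verify(ctx, sig) succeeds; a tampered context fails verification.
```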
  • the natural language processing service may provide feedback to the user to help resolve ambiguities in the natural language processing service's understanding and applicability of the user query. But feedback may be an inherent requirement of the fulfilment of the action.
  • "wizards" in some software applications which guide the user through a series of steps to elicit the information required in order to complete some task.
  • a user query might have requested some task which is best matched to a wizard in the software: however some of the information which the user provides in their query is in some embodiments not applicable until later steps of the wizard. In some cases, this is best handled by having the natural language processing service interact with the user, effectively creating a dialogue which achieves the same end as the wizard in the extraction of the relevant inputs required to complete some task.
  • the natural language processing service could be configured in a way so as to:
  • Exemplary software: Many software applications could benefit by adding a natural language processing interface. However, the financial and time cost, as well as the risks involved in developing such software, may be prohibitive if done for each software product, and would require on-going development to stay cutting edge. However, whilst a natural language processing service is in some embodiments able to identify what action is required, there are still issues involved in getting this action to be performed, fire-walling the interface between the software and the natural language processing service (so that clients' data is not given to the service) and yet still being context aware in order to resolve natural language ambiguities.
  • some embodiments relate to technology whereby a learning interface is configured to enable automated and/or semi-automated generation of computer coded rules and logic that is able to perform defined actions based on analysis of natural language queries.
  • such embodiments make use of human-assisted learning, whereby a human operator defines an appropriate response to a first natural language query, and the learning interface induces from that human-defined response (either automatically or semi-automatically) a template that allows subsequent automated responding to a second natural language query with different parameters (but identifiable commonality in terms of particular structural elements).
  • Some embodiments provide computer implemented frameworks and methods for providing natural language processing functionality, which implement an assisted learning process as described below. This may be provided via a framework which is available for integration across a plurality of separate implementation environments, as described further above (for example providing access to global learning advantages and the like). In some embodiments it is integrated locally at a single implementation environment.
  • this method (and associated software platforms) provide a simplified and streamlined mechanism to enable construction of instruction and action templates for a given implementation environment. More specifically, some embodiments provide a software platform that tracks and records natural language based requests submitted via the implementation environment, and enables a user to associate those with action templates (for example newly created action templates). At a practical level, this enables one or more of the following:
  • a user to generate instruction templates by, in effect, reviewing and responding to queries submitted via a natural language query interface (which become associated with instruction templates), and defining responses to those questions (which become action templates).
  • the responses may include text-based responses (and/or images, videos, and the like), HTTP-based responses, and more complex responses.
  • This may be performed in a "live" mode, whereby a back-end service provider manually generates a query response (where an existing instruction/action template pair is not able to be defined) and provides that to the query-submitting user, and/or a non-live mode whereby the manual generation of the query response is performed at a later time.
  • a user to quickly generate instruction templates by, in effect, asking a plurality of common/relevant questions (which become associated with instruction templates), and defining responses to those questions (which become action templates).
  • the responses may include text-based responses, HTTP-based responses, and more complex responses.
  • the technology includes maintaining, for a given implementation environment, access to data that defines: (i) a plurality of instruction templates; and (ii) a plurality of action templates, wherein each instruction template is associated with a respective action template.
  • the collection of instruction templates is built up based on receipt of natural language based requests. For example, each time a natural language based request is received and processed, it is either associated with an existing instruction template (i.e. based on natural language processing based matching), or flagged for potential generation of a new instruction template via a template management interface.
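The match-or-flag behaviour described above can be sketched as follows (Python, with a deliberately trivial matcher standing in for the natural language processing engine; all names are illustrative):

```python
# Requests either match an existing instruction template or are queued
# as drafts awaiting attention via the template management interface.
pending_templates = []

def handle_request(request, templates, matcher):
    for template in templates:
        if matcher(request, template):
            return ("matched", template)
    pending_templates.append(request)  # flag for new-template creation
    return ("flagged", request)

templates = ["what are your opening hours"]
# Trivial stand-in matcher: exact match after lowercasing, ignoring "?".
matcher = lambda req, tpl: req.lower().strip("?") == tpl

handle_request("What are your opening hours?", templates, matcher)
# → ("matched", "what are your opening hours")
handle_request("Do you ship overseas?", templates, matcher)
# → ("flagged", "Do you ship overseas?")
```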
  • the method includes receiving, from a client device, data representative of a natural language based request submitted by a user of the client device.
  • This may be a "typical" client device that accesses the implementation environment and provides the request in that context, or an administrator client device that provides the request in the context of the template management interface.
  • the method then includes executing a natural language processing engine, thereby to attempt to match the natural language based request to one of the plurality of instruction templates. As a result of this, the request is either: (i) matched; or (ii) unable to be matched.
  • the associated action template is able to be executed (for example by being returned to the calling application for execution).
  • a new instruction template is defined (in draft form, and typically with parameter identification), and a notification generated to inform one or more administrator users that a new instruction template requires attention (e.g. in the sense of defining a new action template) via the template management interface.
  • an administrator user is available to review and define an action for the new instruction template substantially immediately (in which case the user receives a response to his/her request, albeit via manual intervention, and that response is used subsequently as a basis for a new action template).
  • the method includes making data representative of the natural language based request available in the template management interface.
  • the template management interface is configured to enable an administrator user to: (i) view data representative of the natural language based request; and (ii) define an instruction template and action template in relation to the natural language based request.
  • an automated process is performed thereby to autonomously define an instruction template and action template in relation to the natural language based request. This automated process is in some cases supplemented by manual input (for example in the context of a "review and revise" approach, by which a user inspects and optionally adjusts an autonomously defined instruction/action template pair).
  • FIG. 4A illustrates an exemplary partial screenshot from a template management interface according to one embodiment.
  • This shows a plurality of instruction templates, each described in terms of a primary natural language based request associated with that instruction template.
  • the requests are referred to here as "questions”.
  • Each instruction is associated with a count of the number of times the request has been invoked (see the right hand column), and is able to be expanded to reveal the set of individual natural language based requests that were matched to the instruction template (as shown in FIG. 4E).
  • FIG. 4B illustrates another exemplary partial screenshot from a template management interface according to one embodiment. This shows an "edit answer” interface.
  • a user selects an instruction template ("question") from the list in FIG. 4A, and is enabled to define an action template (an “answer").
  • the action template is able to take any of the following forms:
  • an HTTP response which causes the accessing of data via HTTP (for example to cause a page redirect, or rendering in an on-page object).
  • an HTTP (POST) response which invokes, for example, back-end functionality (for example updating a database, automatically completing a web form, or the like).
  • More complex forms of action templates may also be defined in further embodiments, for example using approaches described further above.
  • Other forms, such as charts and tables are made available in some embodiments, as a convenient manner for defining an action template without needing to define HTTP page content to provide dynamic functionality.
  • the user interface provides an administrator user with functionality to define an "answer" by: accessing a web page that provides an answer to the relevant query, and inputting into that page data relevant to the query (or, in some embodiments, accessing a web page that is a result corresponding to the query), and causing that web page to be delivered to the request-submitting user's terminal with any associated additional data. That is described in additional detail further below.
  • FIG. 4B also shows an "examples" interface, which displays a plurality of sample natural language based requests and how they would be answered based on the user's input via the "edit answer" interface.
  • the "edit answer" interface enables user-definition and/or modification of an action templates including: (i) defining a parameterized portion of one or more of the natural language based requests associated with the instruction template; and (ii) defining a corresponding parameterized portion of the action template.
  • As shown in FIG. 4C and FIG. 4D, a user is provided with an "edit variations" option, which allows the user to select a portion of question text and answer text (which may be text in HTTP data), and apply a variation to those. As shown in FIG. 4D, this includes assigning a "type" to the variation.
  • types include: general purpose descriptors such as dates, names, places, and times, and also application-specific types such as a direction or path (for example in a robotics application), or a disease/symptom (for example in a medical application).
  • There may be a format associated with the type (such as a date format).
  • the collection of available types may be expanded over time.
  • the user selects a "date” type.
  • the user also defines the format for the date value in the answer (which it will be appreciated is of particular importance where the answer includes HTTP data).
  • the format may be different for the question as opposed to the answer. That is, in the context of a question, the format relates to a format in which a parameter value is expected to be supplied by a user. In the context of an answer, on the other hand, the format may be quite different (for example where the value defines part of an HTTP request). In some cases a given parameter is used at multiple locations in an action template, optionally in different formats.
  • an administrator user is also enabled to specify further data sources when defining an action template.
  • a request may be converted into a form configured to cause a lookup of rows in a database or a spreadsheet.
  • the values of parameterized portions of the instruction template, when grounded by matching against the user request, are optionally fed into a WHERE clause of an SQL statement and used to identify rows in a database. The identified rows (and values defined therein) are then able to be used as parameter values in the action template when returning the template to the calling application for execution.
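As a concrete illustration of feeding grounded values into a WHERE clause, the following Python/sqlite3 sketch uses a hypothetical sales table; binding the grounded value as a query parameter also avoids SQL injection from user-supplied text.

```python
import sqlite3

# Hypothetical sales table standing in for the client's database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (rep TEXT, month TEXT, total REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("Jones", "March", 1200.0), ("Singh", "March", 950.0),
     ("Jones", "April", 400.0)],
)

# Value grounded by matching the user request against the instruction
# template, fed into the WHERE clause as a bound parameter.
grounded = {"month": "March"}
rows = conn.execute(
    "SELECT rep, total FROM sales WHERE month = ? ORDER BY rep",
    (grounded["month"],),
).fetchall()
# rows == [("Jones", 1200.0), ("Singh", 950.0)]
```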
  • contextual information is also available to an action template.
  • the calling application in some embodiments is configured to pass contextual information such as: personal information (e.g. the user's name and address); information about the state of the implementation environment and its state (e.g. what information is being displayed, what is selected, a cursor position, etc); and information concerning previous interactions (so a dialog might be realized and ambiguities resolved).
  • contextual information might be defined as inputs for an action template.
  • contextual information is additionally associated with the instruction template.
  • an administrator user is enabled to specify context-based constraints, thereby limiting applicability of a given instruction template.
  • a given instruction template may only be considered for matching in the event that, for example: the requesting user has prescribed attributes of personal information (such as an age); the implementation environment is in a prescribed state (for example the user is on a particular web page); and the like.
  • an administrator user registers to use the natural language processing technology via a signup webpage.
  • This enables the administrator user to embed code in a website (or other implementation environment) which is configured to receive natural language based requests.
  • the administrator user is also provided with an account with the template management interface, to enable managing of instruction and action templates for their implementation environment.
  • the administrator user performs initial configuration by asking a range of expected common questions relevant to the implementation environment, and via the template management interface defines answers to those questions (for example via the interface shown in FIG. 4A to FIG. 4E).
  • functionality is provided to enable provision of the questions and/or answers in speech form. This provides a mechanism to quickly and efficiently create an initial set of functionalities available via natural language requests.
  • This set of functionalities is expanded over time, as unmatchable requests are submitted, flagged, and brought to the attention of the administrator user (for example via electronic notifications), who is enabled to review those, and either configure them to in future be matched to an existing instruction template, or define a new instruction template and action template.
  • any users who made unmatched requests which would now match against these action templates are notified (and in some cases the action template is executed or made available for execution at their request). In this manner, a user who asks a question which is initially unable to be matched to an instruction template is provided with an answer at a later point in time (upon generation/modification of an instruction template having an associated action template).
  • users and/or administrator users have control over settings affecting delivery of such notifications.
  • a feedback mechanism is provided for users, such that users are able to identify instances where an executed action template is unhelpful/inappropriate to the user's needs, thereby to assist an administrator user in modifying and/or fine-tuning instruction and action templates over time (for example, notifications of feedback are made available to administrator users via the template management interface).
  • a further example embodiment is described below by reference to FIG. 5.
  • This embodiment is described by reference to a relatively straightforward practical example, in the context of a website that is configured to enable party-to-party payments (among other functionalities).
  • a user of that website wishing to make a party-to-party payment navigates, via menus, to a "payments" page, and inputs into that page details of a payee and a payment amount.
  • a user is provided with an interface via which to provide a natural language query (for example as text or as speech). For the sake of example, assume the following query is provided: "pay John Smith $50", as represented by block 501.
  • Block 502 represents a matching process, whereby "pay John Smith $50" is processed to determine whether it has a threshold match with an existing instruction template. For the present example, we assume it does not. Therefore, at block 503, the natural language query ("pay John Smith $50") is presented to an administrative user, referred to in this example as a natural language query training user (being a back-end user tasked with reviewing and responding manually to natural language queries). For the sake of this example, we assume that one or more natural language query training users are on hand to manually respond to unmatched natural language queries on an as-required basis (i.e. substantially immediately).
  • the training user prepares a web page that corresponds to the query.
  • the training interface provides user interface objects that allow the training user to access the website from which the natural language query was submitted, navigate to a desired page, and where relevant adjust that page context.
  • the way in which the training user navigates and provides context to a web page varies between embodiments, and the following are optionally used:
  • the training user navigates to a "payments" page, which provides a form for inputting details for a payment to be made, and inputs into relevant fields "John Smith” and "$50".
  • the training user then causes that payments page with its defined context (i.e. completed fields) to be rendered at the user terminal in response to the natural language request (this may be rendered in the existing window, or in a separate window/object).
  • the user is provided with a web page that provides a pre-filled form to pay $50 to John Smith (with "John Smith" in a name field, and "50.00" in a payment amount field), and is able to complete the transaction (for example by clicking a "perform transfer" button on the page).
  • Block 510 represents an optional step of normalizing a natural language query, thereby to remove redundant language and/or standardise parts of the query in a way in which the meaning of the query is not changed. For example, natural language expressions such as "I would like to", "can you please" and the like are removed. Various tools for de-complicating natural language expressions in this manner are known in the art.
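A minimal Python sketch of this normalisation step (the phrase list is illustrative, not exhaustive); prefixes are stripped repeatedly so that stacked politeness formulas are also removed:

```python
import re

# Illustrative politeness/intent prefixes whose removal does not change
# the meaning of the query.
REDUNDANT = [r"^i would like to\s+", r"^can you please\s+", r"^please\s+"]

def normalize(query: str) -> str:
    q = query.strip().lower()
    changed = True
    while changed:  # reapply until no prefix matches any more
        changed = False
        for pat in REDUNDANT:
            new_q = re.sub(pat, "", q)
            if new_q != q:
                q, changed = new_q, True
    return q

normalize("Can you please pay John Smith $50")
# → "pay john smith $50"
```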
  • Block 511 represents a process of query term tagging using NLP techniques (this is an optional selection of technology; other approaches are used in further examples).
  • Some embodiments make use of available tagging technologies, for example those provided by third parties such as SpaCy.
  • This allows identification, in the query, of grammatical artefact types such as direct and indirect objects and the like.
  • This is followed by an optional tagging simplification step at 512, which is configured to consolidate tagging (such as SpaCy type tagging) thereby to provide a simplified degree of tagging (for example combination of two-part names into a single artefact).
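The consolidation of two-part names into a single artefact can be sketched as follows (Python; the token/tag pairs imitate SpaCy-style entity tags, but the merging logic is an illustrative assumption rather than SpaCy's own API):

```python
# Consolidate token-level tags: adjacent PERSON-tagged tokens such as
# "John" and "Smith" are merged into one artefact spanning the full name.
def simplify(tagged):
    """tagged: list of (token, tag) pairs."""
    merged = []
    for token, tag in tagged:
        if merged and tag == merged[-1][1] and tag == "PERSON":
            merged[-1] = (merged[-1][0] + " " + token, tag)
        else:
            merged.append((token, tag))
    return merged

simplify([("pay", "VERB"), ("John", "PERSON"),
          ("Smith", "PERSON"), ("$50", "MONEY")])
# → [("pay", "VERB"), ("John Smith", "PERSON"), ("$50", "MONEY")]
```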
  • Block 513 represents a process including identifying artefacts associated with the web page (with context) prepared at 504. This may include data inputted into forms, portions of a page URL, displayed content, and the like.
  • Block 514 represents a matching process, whereby tagged artefacts derived from the query are matched against artefacts associated with the web page. This preferably includes:
  • Each artefact is analyzed to identify one or more predicted data types (for example a name, a date, a currency value, and so on), and the artefact is associated with a basic value based on that data type. For example, “$50.00”, “fifty dollars”, “50 dollars”, and other variations are all able to be associated with a currency value of 50.00. As another example, a date in the format "next Thursday" is associated with an actual date in a standardized date format. An artefact may have one or more types and one or more values.
  • Matched artefacts may be determined to be parameter values.
  • the one query artefact may correspond to more than one part of the action template, for example "John Smith” in the query may correspond to FIRST NAME: "John”, LAST NAME: "Smith” in the action template.
  • the reverse is not true: at most one query artefact may be deemed to correspond to any given part of the action template.
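The data-type prediction and value normalisation described in this matching process might be sketched as follows. The rules shown are illustrative only; a real embodiment would cover many more formats (such as "fifty dollars" or relative dates like "next Thursday").

```python
import re

def predict_artefact(text):
    """Associate an artefact with one or more predicted (type, value)
    pairs, normalising the value into a basic standardised form."""
    results = []
    stripped = text.strip()
    # Currency: "$50.00", "50 dollars", etc. all normalise to 50.0.
    m = re.fullmatch(r"\$?(\d+(?:\.\d{1,2})?)\s*(?:dollars)?", stripped)
    if m and ("$" in text or "dollars" in text):
        results.append(("currency", float(m.group(1))))
    # Date: "14/8/2016" normalises to the standardised form "2016-08-14".
    m = re.fullmatch(r"(\d{1,2})/(\d{1,2})/(\d{4})", stripped)
    if m:
        day, month, year = (int(g) for g in m.groups())
        results.append(("date", f"{year:04d}-{month:02d}-{day:02d}"))
    if not results:
        results.append(("text", stripped))
    return results
```

Matching can then be performed against the normalised values, so that "$50.00" and "50 dollars" are treated as the same parameter value.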
  • the exploitation of the commonality enables the automated defining of instruction and action templates at 515:
  • the instruction template includes data defining an action type, and data types as tagged language parts.
  • the action template includes data defining a web page (with context) which uses values extracted from the tagged language parts of identified data types.
  • the format of the associated artefact is analysed and recorded. This is not the format of the instruction (that is allowed to vary according to the requesting user's preferences), but is to be used for the format of the action. For example, a date corresponding to 28 September 2009 might be represented as "2009-09-28" and have a format representation "YYYY-MM-DD". By recording the format of the action template parameter, when "14/8/2016" is required to be rendered, it would be represented as "2016-08-14".
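The format-recording behaviour described above can be illustrated with standard date handling. The mapping from a human-readable format representation to a parsing pattern is an assumption for illustration.

```python
from datetime import date, datetime

# Mapping from a recorded format representation to a strptime/strftime
# pattern (an illustrative subset).
FORMAT_PATTERNS = {
    "YYYY-MM-DD": "%Y-%m-%d",
    "DD/MM/YYYY": "%d/%m/%Y",
}

def parse_with_format(text: str, source_format: str) -> date:
    """Parse a date expressed in whatever format the user chose."""
    return datetime.strptime(text, FORMAT_PATTERNS[source_format]).date()

def render_in_recorded_format(value: date, recorded_format: str) -> str:
    """Render a normalised date value using the format recorded against
    the action template parameter, regardless of the user's phrasing."""
    return value.strftime(FORMAT_PATTERNS[recorded_format])
```

So a user-supplied "14/8/2016" parsed as DD/MM/YYYY would be rendered in the action as "2016-08-14" when the recorded action format is "YYYY-MM-DD".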
  • the matching makes use of a wider collection of data, including data associated with the request-defining user's web page context (at the time of the query and/or based on the response page that is defined), data associated with the user (for example personal information, known contacts, location, and so on), general context data (for example a current date and time).
  • the technology is configured to handle situations where there is ambiguity in a parameter value required for an instruction/action template. In some embodiments this includes asking a further question to resolve the ambiguity.
  • a portion of a natural language request is parameterised, and that portion could have two potential values based on the wording of the request (for example "Sydney" could be Sydney in Australia, or Sydney in Canada).
  • a question is defined to elicit a response from the user to resolve the ambiguity.
  • Further embodiments apply alternate resolution techniques, for example using knowledge of the user, the user's location, and the like.
  • the technology is configured to handle situations where there is optionality in at least one parameter value required for an instruction/action template.
  • a request is able to be matched to an instruction template in spite of lacking parameterized portions for optional parameters.
  • a date value may be optional, allowing "I want to fly to Paris on Thursday" and "I want to fly to Paris” to be matched to the same instruction template.
  • the former will use the date value in the action template, the latter will not (and may optionally ask an additional question of the user, or use default values).
  • a training user is enabled to define multiple conditional manual responses, and associate those with conditions (for example conditions based on context, user attributes, user login status, and so on).
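The optionality handling described above (e.g. "I want to fly to Paris" matching with or without a date) might be sketched as follows, using an assumed template syntax in which <name> marks a parameterised portion and [ ... ] marks an optional segment. The syntax and function names are illustrative assumptions.

```python
import re

def template_to_pattern(template: str) -> str:
    """Compile an instruction template into an anchored regex.
    <name> marks a parameterised portion; [ ... ] marks an optional
    segment (so optional parameters may be absent from a request)."""
    pattern = re.escape(template)
    # Un-escape the optional-segment markers into a non-capturing group.
    pattern = pattern.replace(r"\[", "(?:").replace(r"\]", ")?")
    # Turn each <name> placeholder into a named capture group.
    pattern = re.sub(r"<(\w+)>", r"(?P<\1>.+?)", pattern)
    return "^" + pattern + "$"

def match_instruction(template: str, request: str):
    """Return extracted parameter values (None for absent optional
    parameters), or None if the request does not fit the template."""
    m = re.match(template_to_pattern(template), request)
    return m.groupdict() if m else None
```

With the template "I want to fly to <destination>[ on <date>]", both "I want to fly to Paris on Thursday" and "I want to fly to Paris" match the same instruction template; only the former yields a date value.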
  • a web server 302 provides a web interface 303.
  • This web interface is accessed by the parties by way of client terminals 304.
  • users access interface 303 over the Internet by way of client terminals 304, which in various embodiments include the likes of personal computers, PDAs, cellular telephones, gaming consoles, and other Internet enabled devices.
  • Server 302 includes a processor 305 coupled to a memory module 306 and a communications interface 307, such as an Internet connection, modem, Ethernet port, wireless network card, serial port, or the like.
  • distributed resources are used.
  • server 302 includes a plurality of distributed servers having respective storage, processing and communications resources.
  • Memory module 306 includes software instructions 308, which are executable on processor 305.
  • Server 302 is coupled to a database 310. In further embodiments the database leverages memory module 306.
  • web interface 303 includes a website.
  • the term "website” should be read broadly to cover substantially any source of information accessible over the Internet or another communications network (such as WAN, LAN or WLAN) via a browser application running on a client terminal.
  • a website is a source of information made available by a server and accessible over the Internet by a web-browser application running on a client terminal.
  • the web-browser application downloads code, such as HTML code, from the server. This code is executable through the web-browser on the client terminal for providing a graphical and often interactive representation of the website on the client terminal.
  • a user of the client terminal is able to navigate between and throughout various web pages provided by the website, and access various functionalities that are provided.
  • client terminals 304 maintain software instructions for a computer program product that essentially provides access to a portal via which framework 100 is accessed (for instance via an iPhone app or the like).
  • each terminal 304 includes a processor 311 coupled to a memory module 313 and a communications interface 312, such as an internet connection, modem, Ethernet port, serial port, or the like.
  • Memory module 313 includes software instructions 314, which are executable on processor 311. These software instructions allow terminal 304 to execute a software application, such as a proprietary application or web browser application and thereby render on-screen a user interface and allow communication with server 302. This user interface allows for the creation, viewing and administration of profiles, access to the internal communications interface, and various other functionalities.
  • processor may refer to any device or portion of a device that processes electronic data, e.g., from registers and/or memory to transform that electronic data into other electronic data that, e.g., may be stored in registers and/or memory.
  • a "computer” or a “computing machine” or a “computing platform” may include one or more processors.
  • the methodologies described herein are, in one embodiment, performable by one or more processors that accept computer-readable (also called machine-readable) code containing a set of instructions that when executed by one or more of the processors carry out at least one of the methods described herein.
  • Any processor capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken is included.
  • a typical processing system that includes one or more processors.
  • Each processor may include one or more of a CPU, a graphics processing unit, and a programmable DSP unit.
  • the processing system further may include a memory subsystem including main RAM and/or a static RAM, and/or ROM.
  • a bus subsystem may be included for communicating between the components.
  • the processing system further may be a distributed processing system with processors coupled by a network. If the processing system requires a display, such a display may be included, e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT) display. If manual data entry is required, the processing system also includes an input device such as one or more of an alphanumeric input unit such as a keyboard, a pointing control device such as a mouse, and so forth. Input devices may also include audio/video input devices, and/or devices configured to derive information relating to characteristics/attributes of a human user.
  • the term memory unit as used herein, if clear from the context and unless explicitly stated otherwise, also encompasses a storage system such as a disk drive unit.
  • the processing system in some configurations may include a sound output device, and a network interface device.
  • the memory subsystem thus includes a computer-readable carrier medium that carries computer-readable code (e.g., software) including a set of instructions to cause performing, when executed by one or more processors, one or more of the methods described herein. Note that when the method includes several elements, e.g., several steps, no ordering of such elements is implied, unless specifically stated.
  • the software may reside in the hard disk, or may also reside, completely or at least partially, within the RAM and/or within the processor during execution thereof by the computer system.
  • the memory and the processor also constitute computer-readable carrier medium carrying computer-readable code.
  • a computer-readable carrier medium may form, or be included in a computer program product.
  • the one or more processors operate as a standalone device or may be connected, e.g., networked, to other processor(s). In a networked deployment, the one or more processors may operate in the capacity of a server or a user machine in a server-user network environment, or as a peer machine in a peer-to-peer or distributed network environment.
  • the one or more processors may form a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a network router, switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine.
  • each of the methods described herein is in the form of a computer- readable carrier medium carrying a set of instructions, e.g., a computer program that is for execution on one or more processors, e.g., one or more processors that are part of web server arrangement.
  • a computer-readable carrier medium carrying computer readable code including a set of instructions that when executed on one or more processors cause the processor or processors to implement a method.
  • aspects of the present invention may take the form of a method, an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects.
  • the present invention may take the form of carrier medium (e.g., a computer program product on a computer-readable storage medium) carrying computer-readable program code embodied in the medium.
  • the software may further be transmitted or received over a network via a network interface device.
  • While the carrier medium is shown in an exemplary embodiment to be a single medium, the term "carrier medium" should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions.
  • the term “carrier medium” shall also be taken to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by one or more of the processors and that cause the one or more processors to perform any one or more of the methodologies of the present invention.
  • a carrier medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media.
  • Non-volatile media includes, for example, optical disks, magnetic disks, and magneto-optical disks.
  • Volatile media includes dynamic memory, such as main memory.
  • Transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise a bus subsystem. Transmission media may also take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications.
  • carrier medium shall accordingly be taken to include, but not be limited to, solid-state memories; a computer product embodied in optical and magnetic media; a medium bearing a propagated signal detectable by at least one processor of one or more processors and representing a set of instructions that, when executed, implement a method; and a transmission medium in a network bearing a propagated signal detectable by at least one processor of the one or more processors and representing the set of instructions.
  • The term "coupled", when used in the claims, should not be interpreted as being limited to direct connections only.
  • the terms “coupled” and “connected,” along with their derivatives, may be used. It should be understood that these terms are not intended as synonyms for each other.
  • the scope of the expression a device A coupled to a device B should not be limited to devices or systems wherein an output of device A is directly connected to an input of device B. It means that there exists a path between an output of A and an input of B which may be a path including other devices or means.
  • Coupled may mean that two or more elements are either in direct physical or electrical contact, or that two or more elements are not in direct contact with each other but yet still co-operate or interact with each other.

Abstract

The present disclosure relates to frameworks and methodologies configured to enable streamlined integration of natural language processing functionality with user interface environments. For example, embodiments provide technology that enables integration of a natural language query processing functionality into existing user interface environments, such as websites (although websites are used as an illustrative example below, the technology is applicable to a wide range of other user interface environments). At a broad level, the technology allows a query to be submitted in natural language form (for example via text input and/or voice input), and for that query to be processed thereby to trigger an action that is specific to the user interface environment.

Description

FRAMEWORKS AND METHODOLOGIES CONFIGURED TO ENABLE STREAMLINED INTEGRATION OF NATURAL LANGUAGE PROCESSING FUNCTIONALITY WITH ONE OR MORE USER INTERFACE ENVIRONMENTS, INCLUDING ASSISTED LEARNING PROCESS
FIELD OF THE INVENTION
[0001] The present invention relates to frameworks and methodologies configured to enable streamlined integration of natural language processing functionality with one or more user interface environments. Embodiments of the invention have been particularly developed for enabling providers of client-facing technology (such as websites, software applications, and controllable machinery) to implement a natural language processing functionality (including, but not limited to, voice control) in a streamlined manner. While some embodiments will be described herein with particular reference to that application, it will be appreciated that the invention is not limited to such a field of use, and is applicable in broader contexts.
BACKGROUND
[0002] Any discussion of the background art throughout the specification should in no way be considered as an admission that such art is widely known or forms part of common general knowledge in the field.
[0003] Natural language processing interfaces have increased in popularity over recent years. Prime examples are Apple's "Siri" and Microsoft's "Cortana", which provide virtual assistants able to provide a wide range of functionalities through their operating system environments and supported apps.
[0004] However, whilst it is accessible for giant technology companies to develop natural language processing solutions for their user interface environments, for smaller businesses the costs and efforts are largely prohibitive. This is both due to the complex nature of natural language processing technology, and challenges associated with managing small-scale implementations (for example due to a limited user base over which to perform learning-based improvements and optimisations). Scale is a key factor; the ability to tune a natural language processing engine over time is of high importance in delivering a successful product.
SUMMARY OF THE INVENTION
[0005] It is an object of the present invention to overcome or ameliorate at least one of the disadvantages of the prior art, or to provide a useful alternative.
[0006] One embodiment provides a computer implemented method for providing natural language processing functionality, the method including:
[0007] maintaining access to data that defines:
[0008] (i) a plurality of instruction templates; and
[0009] (ii) a plurality of action templates, wherein each action template is associated with a respective instruction template;
[0010] receiving, based on input at a client device, data representative of a natural language based request submitted by a user of the client device;
[0011] executing a natural language processing engine, thereby to attempt to match the natural language based request to one of the plurality of instruction templates;
[0012] in the case that the natural language based request is unable to be matched to one of the plurality of instruction templates, making data representative of the natural language based request available in a template management interface, wherein the template management interface is configured to enable an administrator user to: (i) view data representative of the natural language based request; and (ii) define an instruction template and action template in relation to the natural language based request.
[0013] One embodiment provides a computer implemented method wherein the template management interface is configured to:
[0014] display a plurality of instruction templates;
[0015] for each instruction template, provide data representative of one or more natural language based requests associated with the instruction template; and
[0016] for each instruction template, enable user-definition and/or modification of an action template.
[0017] One embodiment provides a computer implemented method wherein enabling user-definition and/or modification of an action template includes: (i) defining a parametrized portion of one or more of the natural language based requests associated with the instruction template; and (ii) defining a corresponding parametrized portion of the action template.
[0018] One embodiment provides a computer implemented method wherein the parametrized portion of the action template is a portion of HTTP data.
[0019] One embodiment provides a computer implemented method wherein enabling user-definition and/or modification of an action template includes enabling the user to selectively define (i) a text-based response; or (ii) a HTTP based response.
[0020] One embodiment provides a computer implemented method wherein enabling user-definition and/or modification of an action template includes enabling the user to selectively define (i) a text-based response; (ii) a HTTP link based response; or (iii) a HTTP post/put/delete (or other) based response.
[0021] One embodiment provides a computer implemented method wherein enabling user-definition and/or modification of an action template includes enabling the user to selectively define a complex action template.
[0022] One embodiment provides a computer implemented method wherein the user interface is configured to enable a user to: (i) select a natural language based request associated with a first instruction template; and (ii) re-associate that natural language-based request with a second template.
[0023] One embodiment provides a computer implemented method wherein the second template is a new template.
[0024] One embodiment provides a computer implemented method wherein the natural language based request is receivable in either text form or speech form.
[0025] One embodiment provides a computer implemented method for providing natural language processing functionality to a plurality of technology environments, the method including:
[0026] maintaining access to data that defines, for each of the plurality of technology environments:
[0027] (i) a plurality of instruction templates; and
[0028] (ii) an action template, wherein each instruction template is associated with a respective action template;
[0029] receiving, from a client device, data representative of a natural language based request submitted by a user of the client device;
[0030] identifying one of the plurality of technology environments being executed at the client device;
[0031] executing a natural language processing engine, which is available to the plurality of technology environments, thereby to match the natural language based request to one of the plurality of instruction templates associated with the identified one of the technology environments;
[0032] based on the matching, identifying the action template associated with the matched instruction template;
[0033] defining an executable instance for the identified action template; and
[0034] transmitting data representative of the executable instance for the client device, thereby to cause the client device to perform functions based on the executable instance of the identified action template.
[0035] One embodiment provides a computer implemented method wherein the method of matching includes the consideration of any one or more of the following variations: grammatical variations; synonymous variations; hyponyms of hypernyms; hyponymous, hypernymous, meronymous, holonymous variations; instantiable variations; and variations based on liberal matching of adjectives and adverbs.
[0036] One embodiment provides a computer implemented method wherein the matching includes a scoring process, wherein the scoring process is affected by the presence of one or more variations.
[0037] One embodiment provides a computer implemented method wherein the degree to which the score is affected in response to the presence of one or more variations is based on a learning algorithm.
[0038] One embodiment provides a computer implemented method wherein a plurality of actions are chained together.
[0039] One embodiment provides a computer implemented method wherein one or more of the instruction templates are parametrised.
[0040] One embodiment provides a computer implemented method wherein the natural language processing engine is configured to, in the case that the matched instruction template is a parametrised instruction template: process the natural language based request thereby to determine parameter values for the parametrised instruction template.
[0041] One embodiment provides a computer implemented method wherein defining an executable instance for the identified action template associated with a parametrised instruction template includes applying the determined parameter values to corresponding parametrised portions of the identified action template.
[0042] One embodiment provides a computer implemented method wherein each parameter has a parameter type, and wherein the parameter type is used in the matching.
[0043] One embodiment provides a computer implemented method wherein queries which are imperfectly aligned with instruction templates are handled.
[0044] One embodiment provides a computer implemented method wherein the technology environments include: (i) a plurality of websites; (ii) a plurality of software applications; and (iii) a plurality of controllable machines.
[0045] One embodiment provides a computer implemented method including providing to the client device a signal configured to cause the client device to request further information from the user to support the natural language based request.
[0046] One embodiment provides a computer implemented method wherein the executable instance of the action template defines a series of consecutive processes configured to be performed by the client device thereby to satisfy the natural language based request.
[0047] One embodiment provides a computer implemented method wherein the data representative of the natural language request includes: (i) request data; and (ii) contextual data.
[0048] One embodiment provides a computer implemented method wherein the request data includes either: (i) speech data received by the client device, which is converted to text by a speech recognition engine associated with the natural language processing engine; or (ii) text data inputted at the client device.
[0049] One embodiment provides a computer implemented method wherein the contextual data includes, in the case of a technology environment in the form of a software application, one or more of: localisation information, available objects and any objects which are displayed to the user, the current state of such objects, etc.
[0050] One embodiment provides a computer implemented method wherein the contextual data includes, in the case of a controllable machine, one or more of: position information, information coming from sensors, current actions in progress, world information such as location of objects, size and weight of objects, etc.
[0051] One embodiment provides a computer implemented method wherein, for a given technology environment, one or more of the instruction templates and/or action templates is defined, in whole or in part, by an automated process.
[0052] One embodiment provides a computer implemented method wherein the automated process includes analysing user interface artefacts thereby to identify instructions that are configured to be provided directly via the user interface.
[0053] One embodiment provides a computer implemented method including, for each technology environment as a configuration step, performing an automated process in respect of one or more databases associated with the technology environment thereby to extract parameter contextual data.
[0054] One embodiment provides a computer implemented method for providing natural language processing functionality, the method including:
[0055] receiving, from a client device, data representative of a natural language based request submitted by a user of a client device;
[0056] providing an interface that is configured to enable manual defining of a response to the natural language based request;
[0057] performing a comparison of: (i) the data representative of the natural language based request; and (ii) data associated with the manually defined response;
[0058] based on the comparison, identifying one or more portions of the data representative of the natural language based request that represent parameter values, and one or more portions of the data associated with the manually defined response that represent corresponding parameter values;
[0059] generating data that defines:
[0060] an instruction template, which includes parametrized portions based on the identified portions of the data representative of the natural language based request; and
[0061] an associated action template, which includes parametrized portions based on the identified portions of the data associated with the manually defined response; and
[0062] receiving, from a further client device, data representative of a further natural language based request submitted by a user of a client device;
[0063] determining that the further natural language based request matches the instruction template;
[0064] analyzing the data representative of the further natural language based request thereby to identify parameter values for the parametrized portions of the instruction template;
[0065] applying corresponding parameter values to the associated action template, such that the action template is configured to provide a response to the further natural language based request.
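The overall match-then-apply cycle defined by this method might be sketched as follows. This is a minimal sketch; the <param> placeholder syntax and the example URL are assumptions for illustration, not part of the claimed method.

```python
import re

def compile_template(instruction_template: str):
    """Turn an instruction template with <param> placeholders into an
    anchored, compiled regex with named capture groups."""
    pattern = re.escape(instruction_template)
    pattern = re.sub(r"<(\w+)>", r"(?P<\1>.+?)", pattern)
    return re.compile("^" + pattern + "$")

def respond(instruction_template: str, action_template: str, request: str):
    """If the request matches the instruction template, substitute the
    extracted parameter values into the parametrized portions of the
    action template (here, a URL). Returns None on no match, so the
    request can instead be routed to the template management interface."""
    m = compile_template(instruction_template).match(request)
    if m is None:
        return None
    action = action_template
    for name, value in m.groupdict().items():
        action = action.replace(f"<{name}>", value)
    return action
```

For instance, "show me the weather in Sydney" matched against the instruction template "show me the weather in <city>" would yield an executable action instance such as a parameterised URL with city=Sydney.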
[0066] One embodiment provides a computer implemented method wherein the step of providing an interface that is configured to enable manual defining of a response to the natural language based request is performed only in the case that the natural language based request is not matched to an existing instruction template.
[0067] One embodiment provides a computer implemented method wherein performing a comparison of: (i) the data representative of the natural language based request; and (ii) data associated with the manually defined response includes:
[0068] processing the natural language based request thereby to identify discrete grammatical artefacts.
[0069] One embodiment provides a computer implemented method wherein the discrete grammatical artefacts include datives, accusatives, direct objects, and indirect objects.
[0070] One embodiment provides a computer implemented method wherein performing a comparison of: (i) the data representative of the natural language based request; and (ii) data associated with the manually defined response additionally includes:
[0071] processing the data associated with the manually defined response thereby to identify data commonalities with the identified discrete grammatical artefacts.
[0072] One embodiment provides a computer implemented method wherein the data commonalities are identified in a URL portion of a web page associated with the manually defined response.
[0073] One embodiment provides a computer implemented method wherein the data commonalities are identified in context of a web page associated with the manually defined response.
[0074] One embodiment provides a computer implemented method wherein context of a web page associated with the manually defined response includes a manually defined field value.
[0075] One embodiment provides a computer implemented method wherein identifying data commonalities includes normalizing a data portion from either or both of the data representative of the natural language based request and the data associated with the manually defined response based on a predicted data type.
[0076] One embodiment provides a computer implemented method wherein the instruction template includes data defining a plurality of combinations between: identifiable grammatical artefact types; and identifiable data types.
[0077] One embodiment provides a computer program product for performing a method as described herein.
[0078] One embodiment provides a non-transitory carrier medium for carrying computer executable code that, when executed on a processor, causes the processor to perform a method as described herein.
[0079] One embodiment provides a system configured for performing a method as described herein.
[0080] Reference throughout this specification to "one embodiment", "some embodiments" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases "in one embodiment", "in some embodiments" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment, but may. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner, as would be apparent to one of ordinary skill in the art from this disclosure, in one or more embodiments.
[0081] As used herein, unless otherwise specified the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object merely indicates that different instances of like objects are being referred to, and is not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
[0082] In the claims below and the description herein, any one of the terms comprising, comprised of or which comprises is an open term that means including at least the elements/features that follow, but not excluding others. Thus, the term comprising, when used in the claims, should not be interpreted as being limitative to the means or elements or steps listed thereafter. For example, the scope of the expression a device comprising A and B should not be limited to devices consisting only of elements A and B. Any one of the terms including or which includes or that includes as used herein is also an open term that also means including at least the elements/features that follow the term, but not excluding others. Thus, including is synonymous with and means comprising.
[0083] As used herein, the term "exemplary" is used in the sense of providing examples, as opposed to indicating quality. That is, an "exemplary embodiment" is an embodiment provided as an example, as opposed to necessarily being an embodiment of exemplary quality.
BRIEF DESCRIPTION OF THE DRAWINGS
[0084] Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings in which:
[0085] FIG. 1 schematically illustrates a framework according to one embodiment.
[0086] FIG. 2 illustrates a method according to one embodiment.
[0087] FIG. 3 illustrates a client-server framework leveraged by some embodiments.
[0088] FIGS. 4A to 4E illustrate exemplary partial screenshots according to an embodiment.

[0089] FIG. 5 illustrates a method according to one embodiment.
DETAILED DESCRIPTION
[0090] The present invention relates to frameworks and methodologies configured to enable streamlined integration of natural language processing functionality with user interface environments. For example, embodiments provide technology that enables integration of natural language query processing functionality into existing user interface environments, such as websites (although websites are used as an illustrative example below, the technology is applicable to a wide range of other user interface environments). At a broad level, the technology allows a query to be submitted in natural language form (for example via text input and/or voice input), and for that query to be processed thereby to trigger an action that is specific to the user interface environment.
[0091] Some embodiments described below relate to technology whereby a learning interface is configured to enable automated and/or semi-automated generation of computer coded rules and logic that is able to perform defined actions based on analysis of natural language queries. In broad terms, such embodiments make use of human-assisted learning, whereby a human operator defines an appropriate response to a first natural language query, and the learning interface induces from that human-defined response (either automatically or semi-automatically) a template that allows subsequent automated responding to a second natural language query with different parameters (but identifiable commonality in terms of particular structural elements).
[0092] Some embodiments provide technology configured to enable providers of client-facing technology (such as websites, software applications, and controllable machinery) to implement a natural language processing functionality, such as voice control, in a streamlined manner. More specifically, in some embodiments a natural language processing engine is made available via a cloud service. The cloud service maintains access to, for each client-facing technology environment, a respective data set including predefined parameterised instructions and actions specific to the relevant environment. In this manner, only a relatively small amount of environment-specific data is required for each individual environment, whereas the underlying natural language processing engine (and associated technology) scales commonly across the plurality of environments.
[0093] Although various embodiments are described by reference to a cloud-hosted natural language processing engine, in further embodiments some or all elements of such an engine are hosted locally, for example on a local server or in a client device.
Exemplary Framework
[0094] An exemplary framework is illustrated in FIG. 1. In this embodiment, a natural language command processing server 100 is configured to interact with instances of a plurality of technology environments. FIG. 1 illustrates three exemplary technology environments:
• A website 130, which is executed at a plurality of client devices via website rendering instances 1-n. It will be appreciated that each individual website has a unique set of functions that it is configured to provide. These may be somewhat generic in function (for example navigation and the like), but unique in terms of the specific website (for example by reference to menu item names and the like).
• A controllable equipment environment, which is executed by way of a plurality of individual physical machines 1-n. The technology environment is defined by reference to machine hardware and software, and in some cases user interface software that executes separately from the machine.
• A software application, instances 1-n of which execute at respective client terminals (or, in the case of cloud hosted software, user interfaces of which execute at the client terminals).
[0095] Although only one technology environment of each category is illustrated, in practice there are a plurality of environments of each category. For example, multiple software developers and/or website administrators adapt the technology environments they provide for compatibility with server 100. The technology environments may include several hundred distinct software applications, and for each software application there may be hundreds (or thousands) of individual executing instances simultaneously. In some embodiments one or more intermediate servers are used to assist in coordinating requests from groups of client devices (for example a dedicated server to receive and optionally partially process requests from a common software application).
[0096] In overview, these technology environments are configured to interact with server 100 thereby to enable the provision of a natural language based request functionality. For example, this functionality may include either or both of: a speech-based request interface (whereby a user of a client device that presents an instance of the technology environment provides a request verbally) and a text-based interface (whereby a user of a client device that presents an instance of the technology environment provides a request by typing that request into a user interface object).
[0097] Upon a natural language based request being received at a client device in one of the supported technology environments, the client device communicates data representative of the natural language based request to server 100 (for example audio data and/or text data), where it is received by input modules 101.
[0098] Server 100 executes a natural language based request processing engine 102. Engine 102 is available to all of the supported technology environments, and makes use of a common natural language processing engine 103, text-to-speech engine 107, and global learning data 108 (comprising data developed over time thereby to tune the effectiveness of engines 102, 103 and/or 107).
[0099] Server 100 is configured to maintain access to data that defines, for each of the plurality of technology environments:
• a plurality of instruction templates; and
• a plurality of action templates, wherein each instruction template is associated with a respective action template. A given action template may include a plurality of steps and/or reference one or more further action templates;
[00100] In the case of FIG. 1, these are maintained in a supported environment data store 110, which maintains instruction/action templates 111 for each of a plurality of n supported technology environments. In the illustrated embodiment data store 110 additionally maintains other contextual data for each supported technology environment, which may include parameter field types, context to parameter meanings, and the like. Furthermore, there is environment-specific learning data 113, enabling tuning of natural language processing both at a global cross-environment level via data 108, and at an environment-specific level via data 113.
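By way of illustration only, the per-environment data described above may be organised as in the following Python sketch; the class and field names are hypothetical and merely mirror the elements of FIG. 1 (instruction/action templates 111, contextual data, learning data 113).

```python
from dataclasses import dataclass, field

@dataclass
class RuleTemplate:
    instruction: str   # searchable instruction template, e.g. "Move <amount> metres forward"
    action: str        # associated action template (code, JSON, menu reference, ...)

@dataclass
class EnvironmentData:
    env_id: str
    templates: list = field(default_factory=list)
    contextual_data: dict = field(default_factory=dict)   # parameter field types, etc.
    learning_data: dict = field(default_factory=dict)     # environment-specific tuning

class SupportedEnvironmentStore:
    def __init__(self):
        self._envs = {}

    def register(self, env: EnvironmentData):
        self._envs[env.env_id] = env

    def templates_for(self, env_id: str):
        # Only the small environment-specific data set is looked up per request;
        # the natural language processing engine is shared across environments.
        return self._envs[env_id].templates

store = SupportedEnvironmentStore()
website = EnvironmentData("website-1")
website.templates.append(RuleTemplate("Show the menu", "navigate('/menu')"))
store.register(website)
```

This reflects the stated design choice: only a relatively small amount of environment-specific data is stored per environment, while the underlying engine scales commonly across all of them.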
[00101] FIG. 2 illustrates a method performed by the framework of FIG. 1. Block 201 represents a process including receiving, from a client device, data representative of a natural language based request submitted by a user of the client device. This is followed, in the case of a speech command, by speech-to-text conversion at block 202.
[00102] Block 203 represents a process including identifying the technology environment being executed at the client device. For example, the data received at 201 in some cases includes or is associated with information identifying the client device and the technology environment executing at the client device. Block 204 then includes executing a natural language processing engine, which is available across a plurality of technology environments, thereby to match the natural language based request to one of the plurality of instruction templates associated with the identified one of the technology environments. In some cases the request is matched to a plurality of instructions (for example where the request combines functionalities provided by discrete instructions).
[00103] Block 205 represents, based on the matching, identifying the action template associated with the matched instruction template. Then, block 206 represents a process including defining an executable instance for the identified action template.
[00104] In some embodiments, one or more of the instruction templates are parametrised; that is, they are defined as templates including a plurality of parametric portions that are configured to be populated with actual parameter values, by reference to functional components and portions that are parametric. In the case that the matched instruction template is a parametrised instruction template, the method includes processing the natural language based request thereby to determine parameter values for the parametrised instruction template. Then, defining executable instances for the identified action template associated with a parametrised instruction template includes applying the determined parameter values to corresponding parametrised portions of the identified action template. In FIG. 1, this is performed using an action construction engine 104.
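The parametrised matching and instantiation just described can be sketched as follows. This is a minimal illustration using simple pattern substitution, not a definitive implementation of action construction engine 104; the function names are invented for the example.

```python
import re

def match_instruction(request: str, instruction_template: str):
    """Turn '<name>' parametric portions into regex groups and match the request."""
    pattern = re.sub(r"<(\w+)>", r"(?P<\1>.+?)", instruction_template) + r"$"
    m = re.match(pattern, request)
    return m.groupdict() if m else None

def instantiate_action(action_template: str, params: dict) -> str:
    """Apply the determined parameter values to the corresponding parametric portions."""
    for name, value in params.items():
        action_template = action_template.replace(f"<{name}>", value)
    return action_template

# The request populates <amount> in both the instruction and the action template.
params = match_instruction("Move 5.3 metres forward", "Move <amount> metres forward")
executable = instantiate_action("drive(distance=<amount>)", params)
```

A real engine would of course tolerate grammatical and synonymous variation rather than requiring a literal pattern match, as discussed under "Matching a user query" below.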
[00105] The literal value of the parametric portions is typically not of importance to the function of server 100. However, attributes of the parametric portion (for example the type of parameter) are in some cases relevant to selection of instruction/action templates. For instance, where a parameter is a name, the name itself is typically not relevant to the matching process, but the fact that the parameter is of the type "person" is of relevance. As an example, there may be two similar templates: Take X from Y and Take X from Z, where Y is a person type parameter, and Z is a location type parameter. A request may be "take the apple from Bruce". In that case, the processing determines that "Bruce" is a parameter, and more specifically that "Bruce" is a name which may be matched against a person type parameter, and therefore selects the Take X from Y template (as opposed to what would occur in the case of a request such as "take the apple from the box"). In some embodiments parameter types have additional downstream bearing in terms of defining an executable instance of an action. Continuing with the previous example, an action may be defined that replaces the literal value "Bruce" with an alternate term based on what is an acceptable parameter for the defined action template being used. For example, "Bruce", "him", "the tall cook" and "somebody" are all able to be matched with person-type parameters. In some embodiments contextual data is used to assist in resolving ambiguities with respect to parameters of known types. In some embodiments, types may have properties, which may or may not be inherent to the object (such as having a handle, or such as being within reach).
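The type-based selection described above can be illustrated with a minimal sketch; the type lookup tables below are invented for the example, and a practical system would draw on richer resources such as ontologies.

```python
# Hypothetical type tables: the literal value "Bruce" is irrelevant to matching,
# but its type ("person") selects "Take X from Y" over "Take X from Z".
KNOWN_PERSONS = {"bruce", "him", "the tall cook", "somebody"}
KNOWN_LOCATIONS = {"the box", "the shelf"}

def parameter_type(value: str) -> str:
    v = value.lower()
    if v in KNOWN_PERSONS:
        return "person"
    if v in KNOWN_LOCATIONS:
        return "location"
    return "unknown"

TEMPLATES = {
    "person": "Take X from Y",     # Y is a person-type parameter
    "location": "Take X from Z",   # Z is a location-type parameter
}

def select_template(source: str) -> str:
    return TEMPLATES.get(parameter_type(source), "no match")
```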
[00106] Block 207 represents a process including transmitting data representative of the executable instances for the identified action template to the client device, thereby to cause the client device to perform functions based on the executable instances for the identified action template. In FIG. 1 this is achieved via action output modules 106.
[00107] In some embodiments a given action template defines a series of consecutive processes configured to be performed by the client device thereby to satisfy the natural language based request.

[00108] In some embodiments, engine 102 determines that additional information is required to enable matching to an instruction template and/or construction of an executable action. In such cases, contact modules 105 are configured to enable two-way communication between server 100 and a client device thereby to request (and obtain) further information from the user to support the natural language based request. For example, server 100 defines a question which is intended to seek clarification and/or input from a user of the client device.
[00109] In some embodiments the data representative of the natural language request includes: (i) request data; and (ii) contextual data. The request data includes either: (i) speech data received by the client device, which is converted to text by a speech recognition engine associated with the natural language processing engine; or (ii) text data inputted at the client device. The contextual data includes, in the case of a technology environment in the form of a software application, one or more of: localisation information, available objects and any objects which are displayed to the user, the current state of such objects, etc. Alternatively, in the case of a controllable machine, the contextual data includes one or more of: position information, information coming from sensors, current actions in progress, world information such as location of objects, size and weight of objects, etc.
[00110] In some embodiments, contextual data is used in the context of a reasoning process, which supports the natural language processing. For example, in some embodiments the natural language processing results in an ambiguity which is not able to be resolved without recourse to reasoning. This may relate to a parameter: a request "pick up that thing" is inherently unclear as to what is meant by "that thing". Alternately, there may be two or more possible interpretations for a given word/phrase/concept. To resolve such ambiguities, a reasoning engine is configured to apply a set of rules thereby to seek resolution of the ambiguity based on static data and/or the contextual data. This may include the contextual data provided with the request data, or contextual data subsequently obtained subject to a query of the client device (for example a query of contextual data of types known to be available to the client device, such as sensor-derived information and the like). Rules defined for execution by a reasoning engine may include either or both of: (i) rules defined for a specific technology environment; and (ii) rules that are common across the plurality of technology environments.
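A minimal sketch of such a rule-driven reasoning stage follows. The individual rules and the contextual fields (query history, visible objects) are hypothetical examples of the static and contextual data referred to above.

```python
def rule_last_mentioned(phrase, candidates, context):
    # For "pick up that thing": prefer the object most recently mentioned.
    for obj in reversed(context.get("query_history", [])):
        if obj in candidates:
            return obj
    return None

def rule_only_visible(phrase, candidates, context):
    # If exactly one candidate is currently visible, it is the likely referent.
    visible = [c for c in candidates if c in context.get("visible_objects", [])]
    return visible[0] if len(visible) == 1 else None

# Rules common across technology environments; environment-specific rules
# may be prepended per request.
COMMON_RULES = [rule_last_mentioned, rule_only_visible]

def resolve(phrase, candidates, context, env_rules=()):
    for rule in list(env_rules) + COMMON_RULES:
        result = rule(phrase, candidates, context)
        if result is not None:
            return result
    return None  # unresolved: fall back to asking the user via contact modules
```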
[00111] In some embodiments, the reasoning engine is implemented as a first stage of ambiguity resolution, with two-way communication with a user of the client device via contact modules 105 being a second stage that is utilised in the case that automated resolution via the reasoning engine is unsuccessful (or provides a result below a threshold level of accuracy probability).
Input to the Natural Language Processing Service
[00112] As noted, there are two main input artefacts delivered to a natural language processing service according to embodiments considered herein: (i) a user query; and (ii) contextual information.

[00113] The user query is submittable in either written or spoken form (and in some cases a user interface environment provides for both forms of input).

[00114] If the user query arrives in spoken form, it is preferably converted into a textual form using a speech recognition engine. In some embodiments, the speech recognition engine is configured to define multiple textual queries (or query portions), which are preferably each associated with probability data. For example, a given spoken word may be converted into two variant text forms, one being associated with an 80% probability and the other with a 60% probability. Making such variants available to downstream steps (which have increased knowledge of environment-specific terminology) leads to performance advantages.

[00115] If the user query arrives in textual form, various transformations are preferably applied to handle issues associated with capitalisation, punctuation, hyphenation, non-standard spelling, as well as errors such as omissions, repetitions and spelling mistakes. Again, probabilities associated with such transformations (and variants) are preferably made available to subsequent steps in the analysis process.

[00116] In relation to contextual information, this preferably includes information representative of the state of the local environment, such as a history of recent user queries (for anaphora resolution and for subtopic identification), and restrictions on available actions. In some embodiments it includes:

[00117] in the case of software: localisation information, available objects and any objects which are displayed to the user, the current state of such objects, etc.;

[00118] in the case of physical devices such as robots: position information, information coming from sensors, current actions in progress, world information such as location of objects, size and weight of objects, etc.; and

[00119] information about the user, such as tone, emotional state, gaze direction, gestures, and the like.
[00120] The contextual information also preferably includes (or is configured to reference) meta information describing relationships between objects, linguistic terminology, etc. For example, ontologies and other data sources representing facts and rules concerning real world objects.
Rule Templates
[00121] As noted, each technology environment is configured to be compatible with server 100 by a process including defining action and instruction templates. These are collectively referred to as "rule templates".
[00122] The instruction represents something which is searchable; the action is something which can be executed in some fashion to achieve some outcome. The term "template" refers to the fact that part of an instruction may be specified in terms of a parameter which is allowed to vary. This allows many (sometimes infinitely many) related instructions to be represented succinctly. For example, "Move <amount> metres forward" might represent a movement template which can be matched against "Move 5.3 metres forward", but also with "Move 21 metres forward". The corresponding action will typically be parameterised in the same way so that the information which is realised by the query may be utilised in the execution of the action. For example, a robot may be able to make use of an action described as a program (where amount gets instantiated to the corresponding value in the user query):
angle = amount * 21.7;
rotate(motor1, angle)
rotate(motor2, angle)

[00123] An even more general template is possible in some embodiments: "Move <dist> <direction>", where "<dist>" encodes not only a number but also units of measurement, and "<direction>" encodes a direction. For such a more general template to be applicable, one would expect that the corresponding action would need to be modified to deal with this greater generality.
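The more general "Move <dist> <direction>" template can be illustrated as follows; the unit conversion table and direction set are assumptions, and the motor-angle factor simply mirrors the 21.7 factor in the example program above.

```python
import re

# Assumed unit and direction vocabularies for the generalised template.
UNIT_TO_METRES = {"metres": 1.0, "m": 1.0, "centimetres": 0.01, "cm": 0.01}
DIRECTIONS = {"forward", "backward", "left", "right"}

def parse_move(query: str):
    """Match 'Move <dist> <direction>', where <dist> carries a number and a unit."""
    m = re.match(r"Move (?P<num>[\d.]+) (?P<unit>\w+) (?P<direction>\w+)$", query)
    if (not m or m.group("direction") not in DIRECTIONS
            or m.group("unit") not in UNIT_TO_METRES):
        return None
    metres = float(m.group("num")) * UNIT_TO_METRES[m.group("unit")]
    return {"metres": metres, "direction": m.group("direction")}

def to_motor_angle(metres: float) -> float:
    # Corresponds to 'angle = amount * 21.7' in the example program above.
    return metres * 21.7
```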
Matching a user query
[00124] The user query is parsed to reveal the structure of the query. The parse structure is analysed to discover the semantic roles of various components of the query. (This step is optional: in some embodiments just the structure is used, or direct matching is used, as described later.)
[00125] The user query is compared with each of the available instructions in the collection of rule templates.
[00126] The comparison above yields a number representing the quality of the match. Matches are not required to be exact; variations such as the following are considered:
• Grammatical variations: e.g. "Give the ball to me" - "Give me the ball"
• Synonymous variations: e.g. "Lift the plank" - "Raise the plank". Hyponyms of hypernyms (i.e. "siblings" and "cousins" of terms) may also be considered: e.g. "walk" - "run" is in some embodiments a match in the absence of a closer match, and "walk" - "run slowly" is in some embodiments a better match still.
• Hyponymous, hypernymous, meronymous, holonymous variations: e.g. "Turn on the hose" - "Turn on the tap of the hose" - "Turn on the tap"
• Instantiate variations: e.g. "Lift up the mug" - "Lift up the vessel with a handle"
• Adjectives and adverbs may sometimes be liberally matched, e.g. "pink" - "red", or even "heavy" - "large", or may sometimes be simply ignored as the information may not be known, e.g. "Give the book to the happy robot" or "Pick up the squashy parcel" (the parcel stiffness may be unknown).
[00127] Each of these variations may be deemed acceptable. Where such a variation is present, the degree of variation may lead to a lower score, which indicates the quality of the match.
[00128] The quality of the match may be determined with reference to various resources such as ontologies. These resources are in some embodiments general purpose, or may be specialised to the application, or even specialised to the context of the application, as supplied in the accompanying contextual information.
[00129] The relative importance of the various factors determining the quality of the match may be determined manually, or may be estimated using some machine learning technique.
[00130] Once the user's query is compared with the available instruction templates, the best match is determined. This "best match" may be non-unique if the best few matches have scores which are similar. In this case, the query is deemed to be ambiguous, and may be returned ambiguously, or more information may be sought from the user in order to resolve the ambiguity. However, it may be the case that even the best match has a low score. In this case it may be determined that there is no suitable match, and either more information is sought (for example by suggesting some relaxation in the query), or no action is returned (indicating to the user that no suitable action matches their request).
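The best-match decision just described can be sketched as follows; the score threshold and ambiguity margin are illustrative assumptions rather than prescribed values.

```python
def decide(scored_matches, min_score=0.4, ambiguity_margin=0.05):
    """scored_matches: list of (template, score) pairs from the comparison step.

    Returns ("match", template), ("ambiguous", templates) when the best few
    scores are similar, or ("no_match", None) when even the best score is low.
    """
    if not scored_matches:
        return ("no_match", None)
    ranked = sorted(scored_matches, key=lambda ts: ts[1], reverse=True)
    best_template, best_score = ranked[0]
    if best_score < min_score:
        return ("no_match", None)   # e.g. suggest some relaxation in the query
    close = [t for t, s in ranked if s >= best_score - ambiguity_margin]
    if len(close) > 1:
        return ("ambiguous", close)  # seek more information from the user
    return ("match", best_template)
```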
[00131] Matches may be made on subparts of the query. For example, a user query, "Display sales figures for March", might match "Show the results" ("Display ...") and "values for a period" ("... sales figures for March"). This might result in two or more actions being chained together. Of particular note are actions associated with noun phrases, which will often be find()-type actions. These find()-type actions then become inputs to other actions, as noun phrases are often arguments of verb phrases.
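Such chaining of a find()-type action into a display action can be illustrated with a minimal sketch; the data set and function names are invented for the example.

```python
# Invented data standing in for the application's records.
SALES = {"March": [120, 95, 143], "April": [88, 101]}

def find_values_for_period(period):
    # find()-type action matched by the noun phrase "... sales figures for March"
    return SALES.get(period, [])

def show_results(values):
    # action matched by the verb phrase "Display ..."
    return "Displaying: " + ", ".join(str(v) for v in values)

def run_chain(period):
    # The find()-type action's output becomes the input of the display action.
    return show_results(find_values_for_period(period))
```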
Instantiating a template
[00132] As alluded to above, the instruction-action template may have parts which represent parameters of the action. In order to instantiate the template:
[00133] The user query and the instruction template are aligned. This is done by matching the (semantically) corresponding parts of the query with the instruction template. For example, this could be done by parsing both, recognising the semantic roles that each of the constituent parts play in the parse, and aligning common parts. Frame/slot representations may be useful intermediate structures for making such an alignment, but this could be done working more directly with the parse trees for example.
[00134] Once the constituents are aligned, the parameters of the template are grounded by copying the values in their corresponding structure from the user query.
[00135] The parameters which are found above are then applied to the action in the form of arguments to the action or by some other means.
[00136] Imperfectly aligned matches. In some cases, a perfect correspondence between the user query and the template may not be possible (there is an imperfect alignment between constituents). This may be one of the factors considered during the matching process, adversely affecting the score. Nonetheless, an imperfect alignment may still be the best match possible. If so, at least one of the following cases will arise:
[00137] It may be the case that not all the parameters in the template have been adequately grounded. This may be resolved by seeking further information from the user, or the partially-grounded action template may be returned as is to the client software, which may then take responsibility for adequately resolving such issues.
[00138] Conversely, there may exist extraneous parts of the user query which have no correspondence in the instruction template. This may be resolved as:
• further information may be sought from the user, or
• this extraneous information may be passed to further action templates in the case of chained actions (described above), or
• the extraneous information may be ignored, or
• the extraneous information may be passed back to the client software as part of the result for the software to resolve.

Representation of an instruction template
[00139] The instruction template may be represented using English words which represent an instance of the way in which the corresponding action template may be referenced. For example, "Lift up a mug." As the matching process described earlier is responsible for matching against variations of this request, an example request is a convenient form of representation. It has the advantages that it is easy to understand and easy to create and modify for non-specialists who may be responsible for the collection of rule templates. Whilst such an instruction may be able to deal with a large number of different mugs, it may prove even more useful to generalise it further. For example, "Lift up a vessel with a handle." Mugs, teacups and milk jugs might all be treated in a similar manner, in that the important facts are they are all vessels, and the lifting action is able to make use of the fact that they are known to have handles. When the phrase "a vessel with a handle" is being aligned in the instantiation phase, the type of the object is taken into consideration, along with any properties (such as having a handle).
[00140] If a hierarchy of object types is available, this may be employed to determine that mugs, teacups and milk jugs are all vessels. But similarly, there is in some embodiments another instruction template, "Lift up an object with a handle." Both "Lift up a vessel with a handle" and "Lift up an object with a handle" might match "Pick up Susan's pink mug", but it is likely that "Lift up a vessel with a handle" would be a better match as "mug" would normally be found closer to "vessel" in an object hierarchy than to a more generic term like "object". This would be reflected in the match scores, which would ultimately translate into the selection of the appropriate rule.
[00141 ] In the example just given, "a vessel with a handle" is matched and then instantiated by "Susan's pink mug". Again, this is because the type and properties are unifiable. The action will then be able to make use of "the vessel" or "the handle" which will refer to "Susan's pink mug" or "the handle of Susan's pink mug" respectively, which in turn might refer to physical objects found in the real world.
[00142] Whilst plain natural language is an effective representation for instruction templates, it is not the only one which may be considered. Indeed, earlier in this document we used representations like "Move <amount> metres forward". Whilst this was mainly intended to be for didactic purposes, it is conceivable that such a representation is in some embodiments plausible. Other representations, in terms of a programming language or in terms of a structure serialisation such as XML or JSON, may be more efficient (as they may be more directly applicable to the software which uses them) and may be less ambiguous.
Representation of an action template
[00143] Action templates communicate the user query in a form in which the software which performs the action can understand. The most appropriate representation is largely a question of what makes sense to the client software. SQL, XML, JSON, executable code, references to menus or functions in the client software may all be appropriate representations.
[00144] However, in determining the most appropriate action to return to the client (particularly in the case of chained actions as described earlier), it may be desirable to have some code which is run on the natural language processing service ("server code"). For example, a phrase such as "the pink cup" may be disambiguated between multiple such cups available in the context by considering the distance between the speaker and each cup, or the distance between the actor (i.e. the robot) and each cup - the closer cup normally would have greater salience. Provided that sufficient context is given to the natural language processing service, and provided that there is a means for the service to perform this calculation, it may be deemed that this calculation is best performed by reasoning on the service as part of its ambiguity resolution. The server code may be directly included in each action which requires it, or may be contained in the contextual information which accompanies the request, with only a reference to the code contained in the action itself. Note that the natural language processing service may choose to hobble the generality of the execution of client code for security, stability, privacy and sandboxing considerations.
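The distance-based salience calculation mentioned for "the pink cup" might, purely illustratively, take the following form; the coordinates are invented contextual information accompanying the request.

```python
import math

def closest(candidates, reference_position):
    """Return the candidate object nearest the reference (speaker or actor)."""
    def distance(obj):
        return math.dist(obj["position"], reference_position)
    return min(candidates, key=distance)

# Invented context: two pink cups and the speaker's position.
cups = [
    {"id": "cup-1", "position": (0.2, 1.5)},
    {"id": "cup-2", "position": (3.0, 0.4)},
]
speaker = (0.0, 1.0)
salient = closest(cups, speaker)   # the closer cup has greater salience
```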
[00145] To reduce repetition, it may be desirable to describe some actions in terms of others. For example:
Instruction: "Give something to someone. "
Action: (1) Pick up the thing. (2) Go to the person. (3) Hand the thing to the person.
[00146] In this case, the action is described in terms of three other actions: it is convenient to express the action in natural language, or however else the instructions are represented. The action may have components which refer to other actions, expressed in natural language, and other components which are represented in some other fashion, as described earlier. Note that some of the actions which are described elsewhere may be further described in terms of other actions. For example, "Pick up the thing" might match with "Pick up something", which is in some embodiments defined as another 3-step action:
(1) Go to the thing. (2) Extend your arm towards it. (3) Grasp the thing.
[00147] Note that when actions are expressed in terms of other actions, it is convenient to refer to parameters in the instruction using indefinite articles such as "a", "an", "some" or "any" (or related compounds such as "somebody", "anything" etc.) and natural to refer to the corresponding parameters in the action using definite articles such as "the" or "this", or even with pronouns such as "it", "them", "the first one". Standard techniques in anaphora resolution should be employed to help connect the references to their referents. Furthermore, similar comments can be made with respect to the user query, namely it is normally made using phrases with definite articles, which will need to be unified with the corresponding indefinite phrases of the instruction template.
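The expansion of actions defined in terms of other actions can be sketched as a simple recursive flattening; the action table below merely restates the worked example above, and the anaphora resolution that connects "the thing" to "something" is omitted for brevity.

```python
# Actions described in terms of other actions, as in the worked example.
ACTIONS = {
    "Give something to someone": [
        "Pick up something", "Go to the person", "Hand the thing to the person",
    ],
    "Pick up something": [
        "Go to the thing", "Extend your arm towards it", "Grasp the thing",
    ],
}

def expand(step):
    """Recursively flatten an action into primitive steps."""
    if step not in ACTIONS:
        return [step]          # primitive: execute directly
    result = []
    for sub in ACTIONS[step]:
        result.extend(expand(sub))
    return result
```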
[00148] Actions may also have optional parts. For example, an action associated with getting a robot to walk may have a speed parameter which may be optionally expressed. Such optional parts may be handled in server code, or the representation of the action template might make this optionality explicit, for instance defining values to be used in the case that the corresponding optional part is missing.
Types of actions
[00149] Actions can be anything which can be executed to achieve some outcome. They may result in a different presentation of some data; they may add, delete or modify some data. They may result in a motor being activated. They may change a mode of operation of the application. They may add, delete or modify the application's understanding of the world (e.g. giving names to objects in the world). They may even add, delete or modify the available actions which the natural language processing service may operate over.

[00150] There may be special types of actions which do not require a representation like those described so far. For example, "search()" and "help()" type functions may be special-cased.
• search(): Return a list of all objects which meet some criteria.
• help(): Document a feature of the application.
Search
[00151 ] Many software applications have a search facility. Typical desirable functionality of this facility is to:
• find objects which meet some criteria,
• determine an ordering (so that the most relevant matches are presented first),
• possibly categorise and possibly group such objects (so that a user may be better able to navigate the objects which have been found), and
• present the results in such a way as to aid the user in their understanding of the way in which the result matched the criteria (for example, highlighting matching text).
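The four desirable behaviours listed above may be sketched, under illustrative assumptions about the object data and relevance scoring, as:

```python
# Illustrative sketch of a search facility: filter, rank, group, highlight.
# The objects and the relevance heuristic are invented for illustration.
OBJECTS = [
    {"name": "sales report March", "type": "document"},
    {"name": "sales chart", "type": "image"},
    {"name": "holiday photos", "type": "image"},
]

def search(objects, term):
    # 1. find objects which meet the criteria
    hits = [o for o in objects if term in o["name"]]
    # 2. determine an ordering (earlier match position ranks higher)
    hits.sort(key=lambda o: o["name"].index(term))
    # 3. group the objects so the user can navigate the results
    grouped = {}
    for o in hits:
        grouped.setdefault(o["type"], []).append(
            # 4. highlight the matching text in the presentation
            o["name"].replace(term, f"[{term}]"))
    return grouped

print(search(OBJECTS, "sales"))
```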
[00152] Search is relevant here in a couple of ways: the search facility may be most easily specified using natural language ("search", "find", "show me", ...) and the search criteria may also be most easily specified using natural language. Many search facilities have complicated user interfaces to allow the user to indicate restrictions on the search ("whole word", "at the start", "within the current selection", etc); these can often be readily expressed in natural language. The search facility is particularly improved in non-text applications, where the entities in the application may be hard to find. For example, "find the best salesman for last month", or asking a robot "what cutlery can you see?", or "show me all communication I've had with Mr. Benson in the last week." In these cases, a natural language processing service may be able to understand entities, and their relationships with each other, via supplied ontologies or other reasoning resources. In the case of categorising the results, type and other knowledge can again be useful.
[00153] Search is also relevant as it may be the input to another part of a natural language processing query (see discussion on action chaining earlier).
[00154] Search is largely a facility which is common across applications, so it makes sense for this to be baked into a natural language processing service, rather than requiring a detailed specification of the action.
[00155] However, the presentation of the results would nonetheless require that the client application be informed of the results and asked to present them. One application might choose to present them as a table on the right-hand side of the application; a robot might speak the results or even point to each of the objects in turn.
[00156] This means that search is just a special type of rule, but one where a lot of the details may be assumed rather than specified. Note that as it is a type of rule, it is possible to have multiple search rules available, each of which is applicable for different types of objects being searched, for example, so that they are presented in different ways, or so that the search orders the results differently, etc.
Creating rules
[00157] The collection of rules is the glue which links the capabilities of the client application with instructions; the natural language processing service generalises the invokability of these rules.
[00158] In many cases the rules will be created manually by the authors of the client application software.
[00159] These rules may need to be checked so that they are properly understood by the natural language processing service, and for consistency. As part of this checking process, the natural language processing service might produce example natural language queries to simulate what the user might have requested, in order to help the author of the rules understand the possible actions which would in some embodiments be undertaken if the user were to express such a query. In particular, the natural language processing service could try to generate examples which it found ambiguous, to tease out the most appropriate action to undertake in such a situation.
[00160] Due to this checking requirement, and for efficiency and security reasons, it will often be desirable that the authors of the client software upload or similarly make the rules available to the natural language processing service before and independently of the actual user query.
[00161 ] Not all rules will be available in all situations. For example, context menus in modern software often present a limited set of options which are appropriate to the issue at hand. In such cases, the contextual information accompanying the user query should specify which instruction-actions are available, or provide a mechanism for this to be inferred from the current context.
[00162] Other information such as specialised vocabularies and ontologies is in some embodiments also developed and made available for the natural language processing service to help understand user queries.
Automatic extraction
[00163] Some embodiments provide a method wherein, for a given technology environment, one or more of the instruction templates and/or action templates is defined, in whole or in part, by an automated process. For example, the automated process includes analysing user interface artefacts thereby to identify instructions that are configured to be provided directly via the user interface.
[00164] In this regard, the rules may be automatically generated. This is especially the case where the client application is structured in some way. For example, consider an application which interacts heavily with a database. This database may be automatically inspected to determine the nature of, and relationships between, tables and fields in the database. In some cases, this may be able to be translated into rules, ready for the natural language processing service. Other examples of automatic extraction of rules may include automatic analysis of the user-interface elements in the software (menus and buttons on user interfaces often correspond to actions, and the way they are labelled corresponds to associated instructions). This is in some embodiments achieved by scanning the software code which represents the application, looking for such objects and their associated actions.

[00165] Note that whilst automatic generation would be hugely beneficial, manual modification of the resulting rules will often be desirable: special-casing some situations, modifying implied relationships between objects, increasing the coverage beyond those actions detected automatically, or deliberately removing some rules to control what the user is allowed to access through natural language means.
[00166] Other outputs of this automatic generation might include the ontological relationships between objects types found, and extended vocabularies to deal with the idiosyncrasies of the application.
[00167] Note that the natural language processing service only requires the output of this automatic extraction. This means that such extraction may be offered such that it is run behind a firewall (for instance on the client's computer) so that the natural language processing service does not have access to associated databases or software code. The client might then be able to manually inspect the extracted rules and other associated information, before submitting it to the natural language processing service. In this way, the client's privacy and security is respected.
Learnt rules
[00168] If the client's software were so arranged, it is in some embodiments possible to get the software to learn from certain situations. In the case of a robot equipped with object tracking and computer vision, for example, a robot is in some embodiments told to watch and learn an action. By associating the action with an instruction, a new rule could be created. As another example, a desktop software application is in some embodiments able to be put in a "record" mode, where a sequence of actions are able to be combined; if those actions are then associated with an instruction, this rule could then be added to the rules available for the natural language processing service to consider.
[00169] Note that these learnt rules could be created by the authors of the software, or potentially by users of the software. Such rules could potentially be shared between different users, or made available only to the user who created the rule.
[00170] In the case of learnt rules, it could be the case that such rules are (re-)created in subtly different ways. Some machine learning techniques could be applied to these rules to detect the common parts of the rules and try to unify them in that way; the uncommon parts might get translated into parameters of those rules. In this way, general rules are in some embodiments able to be derived from specific instances. In some cases parameters are inferred from context, for example subject to the operation of a reasoning engine as described above.
Permissions
[00171] Some actions may be restricted in their application either by the situation or because those actions are restricted for that user. Such restrictions can be modelled in two ways. Actions may be hidden from the user in a particular situation if the user does not have adequate permission, and this hiding may be done in a way in which the rule appears to be completely absent from consideration. Alternatively, actions may be restricted from the user in a particular situation such that the user query may still be matched against the restricted actions (for example, blocking the applicability of other matching actions which have a lower score) and optionally, the user may be notified that he or she does not have permission to perform that action at this time.

[00172] The permission information is in some embodiments attached to each action, with the user's identity being passed in the contextual information which accompanies the user request. In some cases, it may be desirable to use some strategy such as signing the request to prevent malicious forgeries from accessing parts of the application which are restricted.
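The two models for restricted actions described above (hiding versus matching-with-denial) may be sketched as follows; the rule structure, scores and permission names are illustrative assumptions:

```python
# Illustrative sketch of the two permission models: "hidden" rules are
# removed from consideration entirely; "restricted" rules still match
# (blocking lower-scoring alternatives) but trigger a permission notice.
def resolve(matches, user_perms):
    """matches: list of (rule, score) pairs for a user query.
    Each rule names a required permission ('perm', None for everyone)
    and a 'mode' saying how a denial is handled."""
    candidates = []
    for rule, score in matches:
        allowed = rule["perm"] is None or rule["perm"] in user_perms
        if allowed:
            candidates.append((rule, score, "execute"))
        elif rule["mode"] == "restricted":
            # still participates in matching, blocking lower-scored rules;
            # the user is notified of the denial
            candidates.append((rule, score, "denied"))
        # rules in "hidden" mode are treated as completely absent
    if not candidates:
        return None
    rule, score, outcome = max(candidates, key=lambda c: c[1])
    return (rule["name"], outcome)

rules = [
    ({"name": "delete all records", "perm": "admin", "mode": "restricted"}, 0.9),
    ({"name": "delete one record", "perm": None, "mode": "hidden"}, 0.7),
]
print(resolve(rules, user_perms={"basic"}))
print(resolve(rules, user_perms={"admin"}))
```

Note how, for the non-admin user, the restricted rule still wins the match (score 0.9) and so blocks the lower-scored alternative, but the outcome is a denial rather than execution.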
Feedback
[00173] In some cases the natural language processing service may provide feedback to the user to help resolve ambiguities in the natural language processing service's understanding and applicability of the user query. But feedback may be an inherent requirement of the fulfilment of the action. Consider by way of example, "wizards" in some software applications, which guide the user through a series of steps to elicit the information required in order to complete some task. In this case, a user query might have requested some task which is best matched to a wizard in the software: however some of the information which the user provides in their query is in some embodiments not applicable until later steps of the wizard. In some cases, this is best handled by having the natural language processing service interact with the user, effectively creating a dialogue which achieves the same end as the wizard in the extraction of the relevant inputs required to complete some task.
Additional Services Provided
[00174] The natural language processing service could be configured in a way so as to:
• monitor which requests are made, providing support to the author of the client software, so that they may be able to analyse which features are most used etc.
• work cross-lingually so as to allow the rules to be specified in one language, yet receive user queries potentially in another language, matching and instantiating them largely as described
• provide feedback to the author of the client software regarding user requests which are not matched against any rule. This may result in broadening the scope of the rules or in adding additional features to the client software which are demanded by users.
• continuously learn from user queries, in order to improve recognition accuracy and the speed of recognition
[00175] Whilst the natural language processing service is described here as being separated from the client software, it could be integrated with it, either directly, for example, as a library of functions accessible by the client software, or indirectly, for example, as a service hosted on the same computer or behind the same firewall.
[00176] Many different models of robots may have different capabilities in terms of what they are able to do, and in terms of their understanding of the world. It is desirable to build a natural language processing technology which reaches across different models, enjoying the commonalities of spatial and motion linguistic comprehension, but still being able to respond to the idiosyncrasies of each particular robot model.
Exemplary software

[00177] Many software applications could benefit by adding a natural language processing interface. However, the financial and time cost as well as the risks involved in developing such software may be prohibitive if done for each software product, and would require on-going development to stay cutting edge. However, whilst a natural language processing service is in some embodiments able to identify what action is required, there are still issues involving getting this action to be performed, fire-walling the interface between the software and the natural language processing service (so that clients' data is not given to the service) and yet still being context aware in order to resolve natural language processing ambiguities.
Assisted Learning Process
[00178] As noted above, some embodiments relate to technology whereby a learning interface is configured to enable automated and/or semi-automated generation of computer coded rules and logic that is able to perform defined actions based on analysis of natural language queries. In broad terms, such embodiments make use of human-assisted learning, whereby a human operator defines an appropriate response to a first natural language query, and the learning interface induces from that human-defined response (either automatically or semi-automatically) a template that allows subsequent automated responding to a second natural language query with different parameters (but identifiable commonality in terms of particular structural elements).
[00179] Some embodiments provide computer implemented frameworks and methods for providing natural language processing functionality, which implement an assisted learning process as described below. This may be provided via a framework which is available for integration across a plurality of separate implementation environments, as described further above (for example providing access to global learning advantages and the like). In some embodiments it is integrated locally at a single implementation environment.
[00180] In overview, this method (and associated software platforms) provide a simplified and streamlined mechanism to enable construction of instruction and action templates for a given implementation environment. More specifically, some embodiments provide a software platform that tracks and records natural language based requests submitted via the implementation environment, and enables a user to associate those with action templates (for example newly created action templates). At a practical level, this enables one or more of the following:
[00181] A user to generate instruction templates by, in effect, reviewing and responding to queries submitted via a natural language query interface (which become associated with instruction templates), and defining responses to those questions (which become action templates). The responses may include text-based responses (and/or images, videos, and the like), HTTP-based responses, and more complex responses. This may be performed in a "live" mode, whereby a back-end service provider manually generates a query response (where an existing instruction/action template pair is not able to be defined) and provides that to the query-submitting user, and/or a non-live mode whereby the manual generation of the query response is performed at a later time.
[00182] A user to quickly generate instruction templates by, in effect, asking a plurality of common/relevant questions (which become associated with instruction templates), and defining responses to those questions (which become action templates). The responses may include text-based responses, HTTP-based responses, and more complex responses.
[00183] The technology includes maintaining, for a given implementation environment, access to data that defines: (i) a plurality of instruction templates; and (ii) a plurality of action templates, wherein each instruction template is associated with a respective action template. Initially, when the natural language processing platform is first integrated with the implementation environment, there may be zero instruction templates and zero action templates. As discussed below:
• The collection of instruction templates is built up based on receipt of natural language based requests. For example, each time a natural language based request is received and processed, it is either associated with an existing instruction template (i.e. based on natural language processing based matching), or flagged for potential generation of a new instruction template via a template management interface.
• The collection of action templates is built up based on user interaction with the template management interface.
[00184] The method includes receiving, from a client device, data representative of a natural language based request submitted by a user of the client device. This may be a "typical" client device that accesses the implementation environment and provides the request in that context, or an administrator client device that provides the request in the context of the template management interface.
[00185] The method then includes executing a natural language processing engine, thereby to attempt to match the natural language based request to one of the plurality of instruction templates. As a result of this, the request is either: (i) matched; or (ii) unable to be matched.
[00186] In the case that the request is matched (e.g. matched with greater than a threshold level of confidence), the associated action template is able to be executed (for example by being returned to the calling application for execution).
[00187] In the case that the request is not able to be matched, a new instruction template is defined (in draft form, and typically with parameter identification), and a notification generated to inform one or more administrator users that a new instruction template requires attention (e.g. in the sense of defining a new action template) via the template management interface. In some embodiments an administrator user is available to review and define an action for the new instruction template substantially immediately (in which case the user receives a response to his/her request, albeit via manual intervention, and that response is used subsequently as a basis for a new action template).
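The match-or-flag flow of the preceding paragraphs may be sketched as follows, with a toy matcher standing in for the natural language processing engine and an assumed confidence threshold:

```python
# Illustrative sketch: a request either matches an instruction template
# with sufficient confidence (its action template is executed), or is
# flagged as a draft template for the template management interface.
THRESHOLD = 0.8  # assumed confidence threshold, for illustration

def handle_request(request, instruction_templates, match_fn):
    """Return ('execute', action) on a confident match, otherwise flag
    a draft instruction template needing administrator attention."""
    best, best_score = None, 0.0
    for template in instruction_templates:
        score = match_fn(request, template["pattern"])
        if score > best_score:
            best, best_score = template, score
    if best is not None and best_score >= THRESHOLD:
        return ("execute", best["action"])
    # unmatched: create a draft template and notify administrators
    return ("flag", {"draft_pattern": request, "status": "needs action"})

# toy matcher: exact match scores 1.0, anything else 0.0
templates = [{"pattern": "what are your opening hours", "action": "show_hours"}]
exact = lambda req, pat: 1.0 if req == pat else 0.0
print(handle_request("what are your opening hours", templates, exact))
print(handle_request("pay John Smith $50", templates, exact))
```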
[00188] From a user experience perspective, there is preferably a protocol in place for handling responses to requests where matching is unsuccessful (and where there is not an administrator user available to provide an instant manual response). This may include asking further questions (for example to derive additional information), resorting to a text-search query approach (e.g. suggesting one or more potentially relevant local web pages) and so on. However, the focus of the present disclosure is the back-end implications in the context of the template management interface.

[00189] In a preferred embodiment, in the case that the natural language based request is unable to be matched to one of the plurality of instruction templates, the method includes making data representative of the natural language based request available in the template management interface. In this regard, the template management interface is configured to enable an administrator user to: (i) view data representative of the natural language based request; and (ii) define an instruction template and action template in relation to the natural language based request. In some embodiments an automated process is performed thereby to autonomously define an instruction template and action template in relation to the natural language based request. This automated process is in some cases supplemented by manual input (for example in the context of a "review and revise" approach, by which a user inspects and optionally adjusts an autonomously defined instruction/action template pair).
[00190] FIG. 4A illustrates an exemplary partial screenshot from a template management interface according to one embodiment. This shows a plurality of instruction templates, each described in terms of a primary natural language based request associated with that instruction template. The requests are referred to here as "questions". Each instruction is associated with a count of the number of times the request has been invoked (see the right hand column), and is able to be expanded to reveal the set of individual natural language based requests that were matched to the instruction template (as shown in FIG. 4E).
[00191 ] Expanding of the list of individual natural language based requests that were matched to the instruction template (as shown in FIG. 4E) enables a user to selectively define new instruction templates for any of those requests, and/or re-associate a given request with a different existing instruction template.
[00192] FIG. 4B illustrates another exemplary partial screenshot from a template management interface according to one embodiment. This shows an "edit answer" interface. A user selects an instruction template ("question") from the list in FIG. 4A, and is enabled to define an action template (an "answer"). In this embodiment, the action template is able to take any of the following forms:
• A text-based response.
• A formatted text response.
• A HTTP response, which causes the accessing of data via HTTP (for example to cause a page redirect, or rendering in an on-page object).
• A HTTP (post) response, which causes, for example, execution of back-end functionality (for example updating a database, automatically completing a web form, or the like).
[00193] More complex forms of action templates may also be defined in further embodiments, for example using approaches described further above. Other forms, such as charts and tables are made available in some embodiments, as a convenient manner for defining an action template without needing to define HTTP page content to provide dynamic functionality.
[00194] In some embodiments, the user interface provides an administrator user with functionality to define an "answer" by: accessing a web page that provides an answer to the relevant query, and inputting into that page data relevant to the query (or, in some embodiments, accessing a web page that is a result of responding to the query), and causing that web page to be delivered to the request-submitting user's terminal with any associated additional data. That is described in additional detail further below.
[00195] The example of FIG. 4B also shows an "examples" interface, which displays a plurality of sample natural language based requests and how they would be answered based on the user's input via the "edit answer" interface.
[00196] The "edit answer" interface enables user-definition and/or modification of an action template, including: (i) defining a parameterized portion of one or more of the natural language based requests associated with the instruction template; and (ii) defining a corresponding parameterized portion of the action template. This is shown in FIG. 4C and FIG. 4D. In these examples, a user is provided with an "edit variations" option, which allows a user to select a portion of question text and answer text (which may be text in HTTP data), and apply a variation to those. As shown in FIG. 4D, this includes assigning a "type" to the variation. Examples of "types" include: general purpose descriptors such as dates, names, places, and times, and also application-specific types such as a direction or path (for example in a robotics application), or a disease/symptom (for example in a medical application). There may be a format associated with the type (such as a date format). The collection of available types may be expanded over time. In the example of FIG. 4D the user selects a "date" type. The user also defines the format for the date value in the answer (which it will be appreciated is of particular importance where the answer includes HTTP data).
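The assignment of a typed variation to corresponding portions of a question and answer, with a date format for each side, may be sketched as follows. The template structure, URL and format strings are illustrative assumptions:

```python
# Illustrative sketch: a "date"-typed variation shared between a question
# and an answer, with different formats on each side.
import datetime

QA_TEMPLATE = {
    "question": "what is on at the cinema on {when}",
    "answer":   "https://example.com/listings?date={when}",
    "variations": {
        "when": {"type": "date",
                 "question_format": "%d %B %Y",   # e.g. "5 October 2016"
                 "answer_format": "%Y-%m-%d"},    # e.g. "2016-10-05"
    },
}

def fill_answer(template, raw_values):
    """Convert each variation from its question format to its answer
    format, then instantiate the answer."""
    values = {}
    for name, spec in template["variations"].items():
        if spec["type"] == "date":
            parsed = datetime.datetime.strptime(
                raw_values[name], spec["question_format"])
            values[name] = parsed.strftime(spec["answer_format"])
    return template["answer"].format(**values)

print(fill_answer(QA_TEMPLATE, {"when": "5 October 2016"}))
```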
[00197] In relation to data format, the format may be different for the question as opposed to the answer. That is, in the context of a question, the format relates to a format in which a parameter value is expected to be supplied by a user. In the context of an answer, on the other hand, the format may be quite different (for example where the value defines part of a HTTP request). In some cases a given parameter is used at multiple locations in an action template, optionally in different formats.
[00198] In some embodiments, an administrator user is also enabled to specify further data sources when defining an action template. For example, a request may be converted into a form configured to cause a lookup of rows in a database or a spreadsheet. More specifically, in some embodiments the values of parameterized portions of the instruction template, when grounded by matching against the user request, are optionally fed into a WHERE clause of an SQL statement and used to identify rows in a database. The identified rows (and values defined therein) are then able to be used as parameter values in the action template when returning the template to the calling application for execution.
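The feeding of grounded parameter values into a WHERE clause, as described above, may be sketched with an in-memory SQLite database; the table and data are invented for illustration:

```python
# Illustrative sketch: a grounded parameter from an instruction template
# becomes a bound value in the WHERE clause of an SQL statement.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (person TEXT, month TEXT, total REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("Alice", "2016-09", 1200.0),
    ("Bob",   "2016-09", 800.0),
    ("Alice", "2016-10", 500.0),
])

def lookup(month):
    """Identify rows matching the grounded 'month' parameter; the
    returned values can feed parameters of the action template."""
    return conn.execute(
        "SELECT person, total FROM sales WHERE month = ? ORDER BY total DESC",
        (month,)).fetchall()

# e.g. "find the best salesman for last month" grounds month = "2016-09"
print(lookup("2016-09"))
```

Using a bound parameter (`?`) rather than string interpolation also avoids SQL injection via the user's natural language request.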
[00199] In some embodiments contextual information is also available to an action template. For example, the calling application in some embodiments is configured to pass contextual information such as: personal information (e.g. the user's name and address); information about the implementation environment and its state (e.g. what information is being displayed, what is selected, a cursor position, etc); and information concerning previous interactions (so a dialog might be realized and ambiguities resolved). Such contextual information might be defined as inputs for an action template.
[00200] In some embodiments, contextual information is additionally associated with the instruction template. For instance, an administrator user is enabled to specify context-based constraints, thereby limiting applicability of a given instruction template. For example, a given instruction template may only be considered for matching in the event that, for example: the requesting user has prescribed attributes of personal information (such as an age); the implementation environment is in a prescribed state (for example the user is on a particular web page); and the like.
[00201] In practice, for some embodiments, an administrator user registers to use the natural language processing technology via a signup webpage. This enables the administrator user to embed code in a website (or other implementation environment) which is configured to receive natural language based requests. The administrator user is also provided with an account with the template management interface, to enable managing of instruction and action templates for their implementation environment. In some cases, the administrator user performs initial configuration by asking a range of expected common questions relevant to the implementation environment, and via the template management interface defines answers to those questions (for example via the interface shown in FIG. 4A to FIG. 4E). In some embodiments functionality is provided to enable provision of the questions and/or answers in speech form. This provides a mechanism to quickly and efficiently create an initial set of functionalities available via natural language requests. This set of functionalities is expanded over time, as unmatchable requests are submitted, flagged, and brought to the attention of the administrator user (for example via electronic notifications), who is enabled to review those, and either configure them to be matched in future to an existing instruction template, or define a new instruction template and action template.
[00202] In some embodiments when an instruction template and action template are generated, or when they are modified by an administrator user, any users who made unmatched requests which would now match against these action templates are notified (and in some cases the action template is executed or made available for execution at their request). In this manner, a user who asks a question which is initially unable to be matched to an instruction template is provided an answer at a later point in time (upon generation/modification of an instruction template having an associated action template). In a preferred embodiment users and/or administrator users have control over settings affecting delivery of such notifications.
[00203] In some embodiments a feedback mechanism is provided for users, such that users are able to identify instances where an executed action template is unhelpful/inappropriate to the user's needs, thereby to assist an administrator user in modifying and/or fine-tuning instruction and action templates over time (for example notifications of feedback are made available to administrator users via the template management interface).
[00204] A further example embodiment is described below by reference to FIG. 5. This embodiment is described by reference to a relatively straightforward practical example, in the context of a website that is configured to enable party-to-party payments (among other functionalities). Conventionally, a user of that website wishing to make a party-to-party payment navigates, via menus, to a "payments" page, and inputs into that page details of a payee and a payment amount. By way of integrating technology described herein, a user is provided with an interface via which to provide a natural language query (for example as text or as speech). For the sake of example, assume the following query is provided: "pay John Smith $50", as represented by block 501.

[00205] Block 502 represents a matching process, whereby "pay John Smith $50" is processed to determine whether it has a threshold match with an existing instruction template. For the present example, we assume it does not. Therefore, at block 503, the natural language query ("pay John Smith $50") is presented to an administrative user, referred to in this example as a natural language query training user (being a back-end user tasked with reviewing and responding manually to natural language queries). For the sake of this example, we assume that one or more natural language query training users are on hand to manually respond to unmatched natural language queries on an as-required basis (i.e. substantially immediately).
[00206] As represented by block 504, the training user prepares a web page that corresponds to the query. In this regard, the training interface provides user interface objects that allow the training user to access the website from which the natural language query was submitted, navigate to a desired page, and where relevant adjust that page's context. The way in which the training user navigates and provides context to a web page varies between embodiments, and the following are optionally used:
• Generic navigation, without any direct knowledge of the user's personal information.
• Navigation with the same context as the user (e.g. having been logged in, and/or with access to the user's stored data such as personal information).
• Navigation with context that is a proxy for the user, such that the user's personal information is replaced by proxy information for a "dummy user". In some cases the training user is enabled to select between a set of predefined "dummy users" depending on the situation and intention.
[00207] It will be appreciated that the latter option has advantages in terms of privacy and security, but involves an added level of complication in that there is a need to translate between "dummy user" values and actual user values.
[00208] In the current example, the training user navigates to a "payments" page, which provides a form for inputting details for a payment to be made, and inputs into relevant fields "John Smith" and "$50". The training user then causes that payments page with its defined context (i.e. completed fields) to be rendered at the user terminal in response to the natural language request (this may be rendered in the existing window, or in a separate window/object). So, the user is provided with a web page that provides a pre-filled form to pay $50 to John Smith (with "John Smith" in a name field, and "50.00" in a payment amount field), and is able to complete the transaction (for example by clicking a "perform transfer" button on the page).
[00209] At this stage the requesting user has received a manual response to his/her request, and is assumed to be satisfied. Further back-end processing steps are performed thereby to remove the need for a manual response in respect of similar queries. This is performed such that a subsequent query from another user to, for instance, "give $20 to Joe Brown" automatically results in that user being presented with a web page that provides a pre-filled form to pay $20 to Joe Brown, from which the transaction is able to be completed (for example by clicking a "perform transfer" button on the page). In some cases, the action is performed without the need for the requesting user to perform a confirmation step (for example the training user progresses to a page that results from the filling in of a preceding page's form, and a "submit" button is clicked). Those processing steps are now described.
[00210] Block 510 represents an optional step of normalizing a natural language query thereby to remove redundant language and/or standardise parts of the query in a manner that does not change the meaning of the query. For example, natural language expressions such as "I would like to", "can you please" and the like are removed. Various tools for de-complicating natural language expressions in this manner are known in the art.
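The normalization step of block 510 may be sketched, by way of non-limiting illustration, as follows. The filler-phrase list below is an assumption chosen for the example and is not taken from the specification.

```python
import re

# Filler phrases whose removal does not change the meaning of a query.
# This list is illustrative only; a production system would use a more
# sophisticated de-complication tool.
FILLERS = [
    r"^i would like to\s+",
    r"^i want to\s+",
    r"^can you please\s+",
    r"^please\s+",
]

def normalize_query(query: str) -> str:
    """Strip redundant natural language from a query (block 510)."""
    q = query.strip().lower()
    for pattern in FILLERS:
        q = re.sub(pattern, "", q)
    return q
```

For example, "Can you please pay John Smith $50" normalizes to "pay john smith $50", which is then treated identically to the bare query.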
[00211] Block 511 represents a process of query term tagging using NLP techniques (this is an optional selection of technology; other approaches are used in further examples). Some embodiments make use of available tagging technologies, for example those provided by third parties such as SpaCy. This allows identification, in the query, of grammatical artefact types such as direct and indirect objects and the like. This is followed by an optional tagging simplification step at 512, which is configured to consolidate tagging (such as SpaCy type tagging) thereby to provide a simplified degree of tagging (for example combining two-part names into a single artefact). In this example, we assume that "John Smith" is tagged as dative, and "$50" is tagged as a direct object.
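The tagging simplification of block 512 may be sketched as follows. The token/label pairs below simulate the output of a dependency tagger such as SpaCy; the actual labels produced would depend on the library and model used, so the values shown are assumptions for illustration.

```python
from itertools import groupby

# Simulated dependency-tagger output for "pay John Smith $50": each
# token paired with its dependency label (labels are illustrative).
TAGGED = [("pay", "ROOT"), ("John", "dative"), ("Smith", "dative"),
          ("$", "dobj"), ("50", "dobj")]

def simplify_tags(tagged):
    """Consolidate adjacent tokens sharing a dependency label into a
    single artefact (block 512), e.g. a two-part name."""
    artefacts = []
    for dep, group in groupby(tagged, key=lambda t: t[1]):
        text = " ".join(tok for tok, _ in group)
        # Re-join a currency symbol to its amount after the word join.
        artefacts.append((text.replace("$ ", "$"), dep))
    return artefacts
```

Applied to the simulated output above, this yields "John Smith" as a single dative artefact and "$50" as a single direct-object artefact, matching the assumption in paragraph [00211].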
[00212] Block 513 represents a process including identifying artefacts associated with the web page (with context) prepared at 504. This may include data inputted into forms, portions of a page URL, displayed content, and the like.
[00213] Block 514 represents a matching process, whereby tagged artefacts derived from the query are matched against artefacts associated with the web page. This preferably includes:
• Direct text matching. For example "John Smith" matches directly to "John Smith".
• Data type normalized matching. Each artefact is analyzed to identify one or more predicted data types (for example a name, a date, a currency value, and so on), and the artefact is associated with a basic value based on that data type. For example, "$50.00", "fifty dollars", "50 dollars", and other variations are all able to be associated with a currency value of 50.00. As another example, a date in the format "next Thursday" is associated with an actual date in a standardized date format. An artefact may have one or more types and one or more values.
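The data type normalized matching described above may be sketched as follows. This is a minimal illustration covering only currency and name types; the regular expressions and the number-word list are assumptions, and a real system would recognise far more variations (dates, locations, and so on).

```python
import re

# Illustrative subset of number words; a real system would be complete.
NUMBER_WORDS = {"twenty": 20, "forty": 40, "fifty": 50}

def normalize_artefact(text):
    """Predict one or more data types for an artefact and associate it
    with basic values based on those types (per block 514)."""
    values = {}
    t = text.strip()
    # Numeric currency forms: "$50", "$50.00", "50 dollars", ...
    m = re.match(r"^\$?(\d+(?:\.\d+)?)(?:\s*dollars?)?$", t.lower())
    if m:
        values["currency"] = f"{float(m.group(1)):.2f}"
    # Word-based currency forms: "fifty dollars", "forty bucks", ...
    m = re.match(r"^(\w+)\s+(?:dollars|bucks)$", t.lower())
    if m and m.group(1) in NUMBER_WORDS:
        values["currency"] = f"{NUMBER_WORDS[m.group(1)]:.2f}"
    # A crude multi-word capitalised pattern as a name-type predictor.
    if re.match(r"^[A-Z][a-z]+(?: [A-Z][a-z]+)+$", t):
        values["name"] = t
    return values
```

Under this sketch, "$50", "$50.00" and "fifty dollars" all normalize to the same basic currency value "50.00", which is what allows them to be matched against "50.00" in a form field.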
[00214] In the current example of "pay John Smith $50", there is commonality between: "John Smith" in the query, and "John Smith" in the name field; and "$50" in the query and "50.00" in the amount field. Note that in the case of multiple types and values being associated with an artefact, the alignment between items which appear in the query and items which appear in the action template may be used to eliminate values which are not aligned. However, it may be the case that the permitted types in the query are still ambiguous. Some other method for resolving this (e.g. arbitrarily, using heuristics, or having preferences associated with the types) may be required to further reduce the ambiguity between acceptable types. This commonality may be exploited to reveal parameter candidates. Matched artefacts may be determined to be parameter values. Note that one query artefact may correspond to more than one part of the action template; for example "John Smith" in the query may correspond to FIRST NAME: "John", LAST NAME: "Smith" in the action template. However, the reverse is not true: at most one query artefact may be deemed to correspond to any given part of the action template. The exploitation of the commonality enables the automated defining of instruction and action templates at 515:
[00215] The instruction template includes data defining an action type, and data types as tagged language parts.
[00216] The action template includes data defining a web page (with context) which uses values extracted from the tagged language parts of identified data types.
[00217] For action template parameters, the format of the associated artefact is analysed and recorded. This is not the format of the instruction (which is allowed to vary according to the requesting user's preferences), but the format to be used for the action. For example, a date corresponding to 28 September 2009 might be represented as "2009-09-28" and have a format representation "YYYY-MM-DD". By recording the format of the action template parameter, when "14/8/2016" is required to be rendered, it would be represented as "2016-08-14".
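The format-recording behaviour of paragraph [00217] may be sketched as follows, assuming the format has already been detected from the exemplar artefact (format detection itself is not shown):

```python
from datetime import datetime

# Format learned from the exemplar "2009-09-28" (i.e. "YYYY-MM-DD"),
# recorded against the action template parameter.
RECORDED_FORMAT = "%Y-%m-%d"

def render_date(day: int, month: int, year: int) -> str:
    """Render a date in the recorded action format, regardless of how
    the requesting user expressed it (e.g. "14/8/2016")."""
    return datetime(year, month, day).strftime(RECORDED_FORMAT)
```

So a query value of "14/8/2016", once parsed, is rendered into the action as "2016-08-14", matching the recorded "YYYY-MM-DD" representation.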
[00218] Returning to the current example, a determination is made that the query is defined by a name type value in the dative, and a currency type value in the direct object. So, for a query determined to be a "pay" command (which may be identified using NLP techniques, which allow identification of similar terms such as "give", "transfer" or the like) having a name type value tagged as dative and a currency value tagged as direct object, parameter values are able to be extracted. That defines the instruction template: any query having a "pay" instruction, a name type value tagged as dative and a currency value tagged as direct object is matched to that instruction template.
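The matching condition described in paragraph [00218] may be sketched as a simple predicate. The fixed synonym set below is an assumption standing in for the NLP similarity techniques the specification refers to:

```python
# Hypothetical synonym group for the "pay" action type; in practice
# similar terms would be identified via NLP techniques, not a fixed set.
PAY_SYNONYMS = {"pay", "give", "transfer", "send"}

def matches_pay_instruction(action_word, tagged_slots):
    """A query matches the instruction template when its verb is a
    "pay" synonym, with a name-type value tagged as dative and a
    currency-type value tagged as direct object."""
    return (action_word.lower() in PAY_SYNONYMS
            and tagged_slots.get("dative") == "name"
            and tagged_slots.get("dobj") == "currency")
```

Under this sketch, "give" with a dative name and a direct-object currency matches the template, whereas an unrelated verb such as "book" does not.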
[00219] In relation to the action template, based on the manual action at 504, page artefacts that are independent of parameter values are known, and page artefacts that are defined by parameters are known. In this example, the page artefacts that are defined by parameters are pre-filled fields; these could just as easily be portions of a URL or the like. This defines an action template: to render a page with context using the parameter values extracted based on the instruction template.
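One hypothetical representation of the instruction and action templates defined at block 515, and of applying extracted parameter values to the action template, is sketched below. All field names and structures are illustrative assumptions, not a definitive implementation.

```python
# Instruction template: an action type plus slots mapping tagged
# grammatical parts to expected data types.
INSTRUCTION_TEMPLATE = {
    "action_type": "pay",
    "slots": {"dative": "name", "dobj": "currency"},
}

# Action template: a page (fixed artefacts) plus fields whose values
# are taken from the query's tagged parts.
ACTION_TEMPLATE = {
    "page": "/payments",
    "fields": {"payee_name": "dative", "amount": "dobj"},
}

def apply_action_template(tagged_artefacts):
    """Fill the action template's parameterised fields from the tagged
    artefacts of a matched query, yielding the page context to render."""
    params = {dep: text for text, dep in tagged_artefacts}
    return {field: params[dep]
            for field, dep in ACTION_TEMPLATE["fields"].items()}
```

For the worked example, the tagged artefacts of "pay John Smith $50" (after normalization) fill the payee-name and amount fields of the payments page.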
[00220] So, for example, assume a natural language request of "give forty bucks to Joe Brown" is subsequently received. This is matched against the template defined as described above, as the "give" portion is matched to a pay command, and there is a name value ("Joe Brown") in the dative, and a currency value ("forty bucks" normalized to "40.00") in the direct object. The parameter values are then extracted and used to complete the action template, which causes the user to be provided with a payment web page pre-filled with "Joe Brown" in the payee field and "40.00" in the amount field.
[00221] Accordingly, as a result of the manual one-off query response process performed at 501 to 505, automated steps 510 to 515 are performed thereby to enable autonomous handling of subsequent similar natural language requests ("similar" in that they are able to be matched to the instruction template due to data types of language artefacts).
[00222] Although the example above describes automated generation of instruction and action templates, further embodiments provide user interface components that enable manual intervention at various stages in the process (for example in the context of identifying, classifying and/or defining parameterised portions, and/or customisation of instruction/action templates with aspects that are not able to be autonomously determined).
[00223] In some embodiments, the matching (and/or parameter value determination) makes use of a wider collection of data, including data associated with the request-defining user's web page context (at the time of the query and/or based on the response page that is defined), data associated with the user (for example personal information, known contacts, location, and so on), and general context data (for example a current date and time). So, by way of example, there may be matching between a known data value of known type in user personal information and a portion of a URL for the generated page (and these are parametrized into instruction and action templates; the instruction template requires that the requestor has a data value of that known type in stored personal information, and the action template uses that value as a parameter value in a URL for a web page that is generated by the action template).
[00224] In some embodiments, the technology is configured to handle situations where there is ambiguity in a parameter value required for an instruction/action template. In some embodiments this includes asking a further question to resolve the ambiguity. As a practical example, a parameterised portion of a natural language request could have two potential values based on the wording of the request (for example "Sydney" could be Sydney in Australia, or Sydney in Canada). A question is defined to elicit a response from the user to resolve the ambiguity. Further embodiments apply alternate resolution techniques, for example using knowledge of the user, the user's location, and the like.
[00225] In some embodiments, the technology is configured to handle situations where there is optionality in at least one parameter value required for an instruction/action template. A request is able to be matched to an instruction template in spite of lacking parameterized portions for optional parameters. For example, a date value may be optional, allowing "I want to fly to Paris on Thursday" and "I want to fly to Paris" to be matched to the same instruction template. The former will use the date value in the action template; the latter will not (and may optionally ask an additional question of the user, or use default values).
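Optional-parameter matching of this kind may be sketched as follows; the required/optional split shown is an assumption introduced for illustration only.

```python
def match_with_optional(params, required, optional):
    """Match extracted parameters to a template with required and
    optional slots (per paragraph [00225]). Returns the filled slots,
    with absent optional values set to None, or None on no match."""
    if not required <= params.keys():
        return None  # a required parameter is missing: no match
    return {slot: params.get(slot) for slot in required | optional}
```

So a "fly to Paris" request lacking a date still matches a template whose date slot is optional, leaving the date to be resolved by a follow-up question or a default value.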
[00226] In some embodiments, a training user is enabled to define multiple conditional manual responses, and associate those with conditions (for example conditions based on context, user attributes, user login status, and so on).
Exemplary Client-Server Framework
[00227] In overview, a web server 302 provides a web interface 303, which is accessed by the parties by way of client terminals 304. Users access interface 303 over the Internet by way of client terminals 304, which in various embodiments include the likes of personal computers, PDAs, cellular telephones, gaming consoles, and other Internet enabled devices.
[00228] Server 302 includes a processor 305 coupled to a memory module 306 and a communications interface 307, such as an Internet connection, modem, Ethernet port, wireless network card, serial port, or the like. In other embodiments distributed resources are used. For example, in one embodiment server 302 includes a plurality of distributed servers having respective storage, processing and communications resources. Memory module 306 includes software instructions 308, which are executable on processor 305. [00229] Server 302 is coupled to a database 310. In further embodiments the database leverages memory module 306.
[00230] In some embodiments web interface 303 includes a website. The term "website" should be read broadly to cover substantially any source of information accessible over the Internet or another communications network (such as WAN, LAN or WLAN) via a browser application running on a client terminal. In some embodiments, a website is a source of information made available by a server and accessible over the Internet by a web-browser application running on a client terminal. The web-browser application downloads code, such as HTML code, from the server. This code is executable through the web-browser on the client terminal for providing a graphical and often interactive representation of the website on the client terminal. By way of the web-browser application, a user of the client terminal is able to navigate between and throughout various web pages provided by the website, and access various functionalities that are provided.
[00231 ] Although some embodiments make use of a website/browser-based implementation, in other embodiments proprietary software methods are implemented as an alternative. For example, in such embodiments client terminals 304 maintain software instructions for a computer program product that essentially provides access to a portal via which framework 100 is accessed (for instance via an iPhone app or the like).
[00232] In general terms, each terminal 304 includes a processor 311 coupled to a memory module 313 and a communications interface 312, such as an internet connection, modem, Ethernet port, serial port, or the like. Memory module 313 includes software instructions 314, which are executable on processor 311. These software instructions allow terminal 304 to execute a software application, such as a proprietary application or web browser application and thereby render on-screen a user interface and allow communication with server 302. This user interface allows for the creation, viewing and administration of profiles, access to the internal communications interface, and various other functionalities.
[00233] Unless specifically stated otherwise, as apparent from the following discussions, it is appreciated that throughout the specification discussions utilizing terms such as "processing," "computing," "calculating," "determining", "analyzing" or the like, refer to the action and/or processes of a computer or computing system, or similar electronic computing device, that manipulate and/or transform data represented as physical, such as electronic, quantities into other data similarly represented as physical quantities.
[00234] In a similar manner, the term "processor" may refer to any device or portion of a device that processes electronic data, e.g., from registers and/or memory to transform that electronic data into other electronic data that, e.g., may be stored in registers and/or memory. A "computer" or a "computing machine" or a "computing platform" may include one or more processors.
[00235] The methodologies described herein are, in one embodiment, performable by one or more processors that accept computer-readable (also called machine-readable) code containing a set of instructions that when executed by one or more of the processors carry out at least one of the methods described herein. Any processor capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken is included. Thus, one example is a typical processing system that includes one or more processors. Each processor may include one or more of a CPU, a graphics processing unit, and a programmable DSP unit. The processing system further may include a memory subsystem including main RAM and/or a static RAM, and/or ROM. A bus subsystem may be included for communicating between the components. The processing system further may be a distributed processing system with processors coupled by a network. If the processing system requires a display, such a display may be included, e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT) display. If manual data entry is required, the processing system also includes an input device such as one or more of an alphanumeric input unit such as a keyboard, a pointing control device such as a mouse, and so forth. Input devices may also include audio/video input devices, and/or devices configured to derive information relating to characteristics/attributes of a human user. The term memory unit as used herein, if clear from the context and unless explicitly stated otherwise, also encompasses a storage system such as a disk drive unit. The processing system in some configurations may include a sound output device, and a network interface device. The memory subsystem thus includes a computer-readable carrier medium that carries computer-readable code (e.g., software) including a set of instructions to cause performing, when executed by one or more processors, one or more of the methods described herein.
Note that when the method includes several elements, e.g., several steps, no ordering of such elements is implied, unless specifically stated. The software may reside in the hard disk, or may also reside, completely or at least partially, within the RAM and/or within the processor during execution thereof by the computer system. Thus, the memory and the processor also constitute computer-readable carrier medium carrying computer-readable code.
[00236] Furthermore, a computer-readable carrier medium may form, or be included in a computer program product.
[00237] In alternative embodiments, the one or more processors operate as a standalone device or may be connected, e.g., networked, to other processor(s). In a networked deployment, the one or more processors may operate in the capacity of a server or a user machine in a server-user network environment, or as a peer machine in a peer-to-peer or distributed network environment. The one or more processors may form a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a network router, switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine.
[00238] Note that while diagrams only show a single processor and a single memory that carries the computer-readable code, those in the art will understand that many of the components described above are included, but not explicitly shown or described in order not to obscure the inventive aspect. For example, while only a single machine is illustrated, the term "machine" shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.
[00239] Thus, one embodiment of each of the methods described herein is in the form of a computer-readable carrier medium carrying a set of instructions, e.g., a computer program that is for execution on one or more processors, e.g., one or more processors that are part of web server arrangement. Thus, as will be appreciated by those skilled in the art, embodiments of the present invention may be embodied as a method, an apparatus such as a special purpose apparatus, an apparatus such as a data processing system, or a computer-readable carrier medium, e.g., a computer program product. The computer-readable carrier medium carries computer readable code including a set of instructions that when executed on one or more processors cause the processor or processors to implement a method. Accordingly, aspects of the present invention may take the form of a method, an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of carrier medium (e.g., a computer program product on a computer-readable storage medium) carrying computer-readable program code embodied in the medium.
[00240] The software may further be transmitted or received over a network via a network interface device. While the carrier medium is shown in an exemplary embodiment to be a single medium, the term "carrier medium" should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions. The term "carrier medium" shall also be taken to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by one or more of the processors and that cause the one or more processors to perform any one or more of the methodologies of the present invention. A carrier medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media includes, for example, optical disks, magnetic disks, and magneto-optical disks. Volatile media includes dynamic memory, such as main memory. Transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise a bus subsystem. Transmission media also may also take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications. For example, the term "carrier medium" shall accordingly be taken to include, but not be limited to, solid-state memories, a computer product embodied in optical and magnetic media; a medium bearing a propagated signal detectable by at least one processor of one or more processors and representing a set of instructions that, when executed, implement a method; and a transmission medium in a network bearing a propagated signal detectable by at least one processor of the one or more processors and representing the set of instructions.
[00241 ] It will be understood that the steps of methods discussed are performed in one embodiment by an appropriate processor (or processors) of a processing (i.e., computer) system executing instructions (computer-readable code) stored in storage. It will also be understood that the invention is not limited to any particular implementation or programming technique and that the invention may be implemented using any appropriate techniques for implementing the functionality described herein. The invention is not limited to any particular programming language or operating system.
[00242] It should be appreciated that in the above description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, FIG., or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the Detailed Description are hereby expressly incorporated into this Detailed Description, with each claim standing on its own as a separate embodiment of this invention. [00243] Furthermore, while some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention, and form different embodiments, as would be understood by those skilled in the art. For example, in the following claims, any of the claimed embodiments can be used in any combination.
[00244] Furthermore, some of the embodiments are described herein as a method or combination of elements of a method that can be implemented by a processor of a computer system or by other means of carrying out the function. Thus, a processor with the necessary instructions for carrying out such a method or element of a method forms a means for carrying out the method or element of a method. Furthermore, an element described herein of an apparatus embodiment is an example of a means for carrying out the function performed by the element for the purpose of carrying out the invention.
[00245] In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In other instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
[00246] Similarly, it is to be noticed that the term coupled, when used in the claims, should not be interpreted as being limited to direct connections only. The terms "coupled" and "connected," along with their derivatives, may be used. It should be understood that these terms are not intended as synonyms for each other. Thus, the scope of the expression a device A coupled to a device B should not be limited to devices or systems wherein an output of device A is directly connected to an input of device B. It means that there exists a path between an output of A and an input of B which may be a path including other devices or means. "Coupled" may mean that two or more elements are either in direct physical or electrical contact, or that two or more elements are not in direct contact with each other but yet still co-operate or interact with each other.
[00247] Thus, while there has been described what are believed to be the preferred embodiments of the invention, those skilled in the art will recognize that other and further modifications may be made thereto without departing from the spirit of the invention, and it is intended to claim all such changes and modifications as falling within the scope of the invention. For example, any formulas given above are merely representative of procedures that may be used. Functionality may be added or deleted from the block diagrams and operations may be interchanged among functional blocks. Steps may be added or deleted to methods described within the scope of the present invention.

Claims

CLAIMS:
1. A computer implemented method for providing natural language processing functionality, the method including:
receiving, based on input at a client device, data representative of a natural language based request submitted by a user of a client device;
providing an interface that is configured to enable manual defining of a response to the natural language based request;
performing a comparison of: (i) the data representative of the natural language based request; and (ii) data associated with the manually defined response;
based on the comparison, identifying one or more portions of the data representative of the natural language based request that represent parameter values, and one or more portions of the data associated with the manually defined response that represent corresponding parameter values;
generating data that defines:
(i) an instruction template, which includes parametrized portions based on the identified portions of the data representative of the natural language based request; and
(ii) an associated action template, which includes parametrized portions based on the identified portions of the data associated with the manually defined response; receiving, based on input at a further client device, data representative of a further natural language based request submitted by a user of that client device;
determining that the further natural language based request matches the instruction template; analyzing the data representative of the further natural language based request thereby to identify parameter values for the parametrized portions of the instruction template; and
applying corresponding parameter values to the associated action template, such that the action template is configured to provide a response to the further natural language based request.
2. A method according to claim 1 wherein the step of providing an interface that is configured to enable manual defining of a response to the natural language based request is performed only in the case that the natural language based request is not matched to an existing instruction template.
3. A method according to claim 1 wherein performing a comparison of: (i) the data representative of the natural language based request; and (ii) data associated with the manually defined response includes:
processing the natural language based request thereby to identify discrete grammatical artefacts.
4. A method according to claim 3 wherein the discrete grammatical artefacts include any one or more of: datives, accusatives, direct objects, indirect objects, prepositional phrases, positional modifiers, and temporal modifiers.
5. A method according to claim 3 wherein performing a comparison of: (i) the data representative of the natural language based request; and (ii) data associated with the manually defined response additionally includes:
processing the data associated with the manually defined response thereby to identify data commonalities with the identified discrete grammatical artefacts.
6. A method according to claim 5 wherein the data commonalities are identified in a URL portion of a web page associated with the manually defined response.
7. A method according to claim 5 wherein the data commonalities are identified in context of a web page associated with the manually defined response.
8. A method according to claim 7 wherein context of a web page associated with the manually defined response includes a manually defined field value.
9. A method according to claim 5 wherein identifying data commonalities includes normalizing a data portion from either or both of the data representative of the natural language based request and the data associated with the manually defined response based on a predicted data type.
10. A method according to claim 1 wherein the instruction template includes data defining a plurality of combinations between: identifiable grammatical artefact types; and identifiable data types.
11. A computer implemented method for providing natural language processing functionality, the method including:
maintaining access to data that defines:
(i) a plurality of instruction templates; and
(ii) a plurality of action templates, wherein each instruction template is associated with a respective action template;
receiving, from a client device, data representative of a natural language based request submitted by a user of the client device;
executing a natural language processing engine, thereby to attempt to match the natural language based request to one of the plurality of instruction templates;
in the case that the natural language based request is unable to be matched to one of the plurality of instruction templates, making data representative of the natural language based request available in a template management interface, wherein the template management interface is configured to enable an administrator user to: (i) view data representative of the natural language based request; and (ii) define an instruction template and action template in relation to the natural language based request.
12. A method according to claim 11 wherein the template management interface is configured to: display a plurality of instruction templates;
for each instruction template, provide data representative of one or more natural language based requests associated with the instruction template; and
for each instruction template, enable user-definition and/or modification of an action template.
13. A method according to claim 12 wherein enabling user-definition and/or modification of an action template includes: (i) defining a parametrized portion of one or more of the natural language based requests associated with the instruction template; and (ii) defining a corresponding parametrized portion of the action template.
14. A method according to claim 13 wherein the parametrized portion of the action template is a portion of HTTP data.
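The correspondence recited in claims 13-14 — a parametrized portion of the instruction template mapping to a parametrized portion of HTTP data in the action template — can be sketched as below. The URL, parameter name, and templates are assumptions for illustration only.

```python
# Sketch of claims 13-14: "{city}" is the parametrized portion of the
# instruction template, and the corresponding parametrized portion of the
# action template is a portion of HTTP data (the query string).
import re

INSTRUCTION_TEMPLATE = r"^what is the weather in (?P<city>[a-z ]+)\??$"
ACTION_TEMPLATE = "GET https://api.example.com/weather?city={city}"  # hypothetical endpoint

def build_action(request: str):
    """Instantiate the action template with values taken from the request."""
    m = re.match(INSTRUCTION_TEMPLATE, request.lower())
    if m is None:
        return None
    return ACTION_TEMPLATE.format(**m.groupdict())
```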
15. A method according to claim 12 wherein enabling user-definition and/or modification of an action template includes enabling the user to selectively define (i) a text-based response; or (ii) a HTTP based response.
16. A method according to claim 12 wherein enabling user-definition and/or modification of an action template includes enabling the user to selectively define (i) a text-based response; (ii) a HTTP link based response; or (iii) a HTTP post based response.
17. A method according to claim 12 wherein enabling user-definition and/or modification of an action template includes enabling the user to selectively define a complex action template.
18. A method according to claim 12 wherein the user interface is configured to enable a user to: (i) select a natural language based request associated with a first instruction template; and (ii) re-associate that natural language-based request with a second template.
19. A method according to claim 18 wherein the second template is a new template.
20. A method according to claim 11 wherein the natural language based request is receivable in either text form or speech form.
21. A computer implemented method for providing natural language processing functionality to a plurality of technology environments, the method including:
maintaining access to data that defines, for each of the plurality of technology environments:
(i) a plurality of instruction templates; and
(ii) an action template, wherein each instruction template is associated with a respective action template;
receiving, from a client device, data representative of a natural language based request submitted by a user of the client device;
identifying one of the plurality of technology environments being executed at the client device;
executing a natural language processing engine, which is available to the plurality of technology environments, thereby to match the natural language based request to one of the plurality of instruction templates associated with the identified one of the technology environments;
based on the matching, identifying the action template associated with the matched instruction template;
defining an executable instance for the identified action template; and
transmitting data representative of the executable instance for the client device, thereby to cause the client device to perform functions based on the executable instance of the identified action template.
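The per-environment arrangement of claim 21 can be sketched as follows: templates are held per technology environment, and the environment identified for the client device selects which template set a shared engine matches against. The environment names, templates, and the exact-string lookup standing in for NLP matching are all illustrative assumptions.

```python
# Sketch of claim 21: one shared engine, per-environment template registries.
ENVIRONMENTS = {
    "shop_website": {"show my cart": "render_cart_page()"},
    "robot_arm": {"pick up the block": "actuate(grip, target='block')"},
}

def handle(environment_id: str, request: str):
    """Match the request against the identified environment's templates only."""
    templates = ENVIRONMENTS.get(environment_id, {})
    action = templates.get(request)  # stand-in for the NLP matching step
    if action is None:
        return None
    return {"execute": action}  # data representative of the executable instance
```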
22. A method according to claim 21 wherein the method of matching includes the consideration of any one or more of the following variations: grammatical variations; synonymous variations; hyponyms of hypernyms; hyponymous, hypernymous, meronymous, holonymous variations; instantiable variations; and variations based on liberal matching of adjectives and adverbs.
23. A method according to claim 22 wherein the matching includes a scoring process, wherein the scoring process is affected by the presence of one or more variations.
24. A method according to claim 23 wherein the degree to which the score is affected in response to the presence of one or more variations is based on a learning algorithm.
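The scoring idea of claims 22-24 — each kind of variation contributes a weight to the match score, with the weights tunable by a learning algorithm — can be sketched as below. The word lists and weight values are invented for illustration; a real system would draw variations from a lexical resource such as WordNet and learn the weights from data.

```python
# Sketch of claims 22-24: exact matches score highest, synonymous and
# hypernymous variations score progressively lower via per-variation weights.
SYNONYMS = {"purchase": "buy", "acquire": "buy"}
HYPERNYMS = {"sedan": "car", "hatchback": "car"}

# Per-variation weights; claim 24 contemplates learning these values.
WEIGHTS = {"exact": 1.0, "synonym": 0.8, "hypernym": 0.5}

def score(request_tokens, template_tokens):
    """Score a request against a template; variations reduce the contribution."""
    total = 0.0
    for req, tmpl in zip(request_tokens, template_tokens):
        if req == tmpl:
            total += WEIGHTS["exact"]
        elif SYNONYMS.get(req) == tmpl:
            total += WEIGHTS["synonym"]
        elif HYPERNYMS.get(req) == tmpl:
            total += WEIGHTS["hypernym"]
    return total / max(len(template_tokens), 1)
```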
25. A method according to claim 21 wherein a plurality of actions are chained together.
26. A method according to claim 21 wherein one or more of the instruction templates are parametrised.
27. A method according to claim 26 wherein the natural language processing engine is configured to, in the case that the matched instruction template is a parametrised instruction template: process the natural language based request thereby to determine parameter values for the parametrised instruction template.
28. A method according to claim 27 wherein defining an executable instance for the identified action template associated with a parametrised instruction template includes applying the determined parameter values to corresponding parametrised portions of the identified action template.
29. A method according to any one of claims 26 to 28 wherein each parameter has a parameter type, and wherein the parameter type is used in the matching.
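Claims 26-29 can be sketched together: a parametrised instruction template carries typed parameters, the parameter type participates in matching, and the determined values are applied to the parametrised portions of the action template. The template, type checkers, and endpoint below are assumptions for illustration.

```python
# Sketch of claims 26-29: typed parameters gate the match, then instantiate
# the parametrised action template.
import re

PARAM_TYPES = {
    "quantity": lambda s: s.isdigit(),   # hypothetical numeric parameter type
    "product": lambda s: s.isalpha(),    # hypothetical word parameter type
}

INSTRUCTION_PATTERN = r"^order (?P<quantity>\S+) (?P<product>\S+)$"
ACTION_TEMPLATE = "POST /orders body={{'item': '{product}', 'count': {quantity}}}"

def match_and_instantiate(request: str):
    """Match the request, type-check parameter values, fill the action template."""
    m = re.match(INSTRUCTION_PATTERN, request.lower())
    if m is None:
        return None
    params = m.groupdict()
    # The parameter type is used in the matching (claim 29): reject ill-typed values.
    if not all(PARAM_TYPES[name](value) for name, value in params.items()):
        return None
    return ACTION_TEMPLATE.format(**params)
```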
30. A method according to claim 21 wherein queries which are imperfectly aligned with instruction templates are handled.
31 . A method according to claim 21 wherein the technology environments include: (i) a plurality of websites; (ii) a plurality of software applications; and (iii) a plurality of controllable machines.
32. A method according to claim 21 including providing to the client device a signal configured to cause the client device to request further information from the user to support the natural language based request.
33. A method according to claim 21 wherein the executable instance of the action template defines a series of consecutive processes configured to be performed by the client device thereby to satisfy the natural language based request.
34. A method according to claim 21 wherein the data representative of the natural language request includes: (i) request data; and (ii) contextual data.
35. A method according to claim 34 wherein the request data includes either: (i) speech data received by the client device, which is converted to text by a speech recognition engine associated with the natural language processing engine; or (ii) text data inputted at the client device.
36. A method according to claim 35 wherein the contextual data includes, in the case of a technology environment in the form of a software application, one or more of: localisation information, available objects and any objects which are displayed to the user, the current state of such objects, etc.
37. A method according to claim 35 wherein the contextual data includes, in the case of a controllable machine, one or more of: position information, information coming from sensors, current actions in progress, world information such as location of objects, size and weight of objects, etc.
38. A method according to claim 21 wherein, for a given technology environment, one or more of the instruction templates and/or action templates is defined, in whole or in part, by an automated process.
39. A method according to claim 38 wherein the automated process includes analysing user interface artefacts thereby to identify instructions that are configured to be provided directly via the user interface.
40. A method according to claim 21 including, for each technology environment as a configuration step, performing an automated process in respect of one or more databases associated with the technology environment thereby to extract parameter contextual data.
41. A computer implemented method for providing natural language processing functionality, the method including:
receiving, based on input at a client device, data representative of a natural language based request submitted by a user of a client device;
providing an interface that is configured to enable manual defining of a response to the natural language based request;
performing a comparison of: (i) the data representative of the natural language based request; and (ii) data associated with the manually defined response;
based on the comparison, identifying one or more portions of the data representative of the natural language based request that represent parameter values, and one or more portions of the data associated with the manually defined response that represent corresponding parameter values;
generating data that defines:
(i) an instruction template, which includes parametrized portions based on the identified portions of the data representative of the natural language based request; and
(ii) an associated action template, which includes parametrized portions based on the identified portions of the data associated with the manually defined response;
receiving, based on input at the same or a further client device, data representative of a further natural language based request submitted by a user of that client device;
determining that the further natural language based request matches the instruction template;
analyzing the data representative of the further natural language based request thereby to identify parameter values for the parametrized portions of the instruction template; and
applying corresponding parameter values to the associated action template, such that the action template is configured to provide a response to the further natural language based request.
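The assisted-learning loop of claim 41 can be sketched as follows: a value appearing in both the request and the manually defined response is treated as a parameter, yielding a parametrized instruction/action template pair that then answers similar future requests. The whitespace tokenisation, the single-parameter assumption, and the "last shared token" heuristic standing in for the claimed comparison are all simplifications for illustration.

```python
# Sketch of claim 41: derive a template pair from one request/response example,
# then reuse it for a further request.
import re

def learn_templates(request: str, response: str):
    """Parametrize a token shared by the request and the manual response."""
    resp_tokens = set(re.split(r"\W+", response))
    shared = [t for t in request.split() if t in resp_tokens]
    if not shared:
        return None
    value = shared[-1]  # crude heuristic standing in for the claimed comparison
    instruction = request.replace(value, "{param}")
    action = response.replace(value, "{param}")
    return instruction, action

def answer(new_request: str, instruction: str, action: str):
    """Match a further request against the learned template and fill the action."""
    pre, _, post = instruction.partition("{param}")
    if new_request.startswith(pre) and new_request.endswith(post):
        value = new_request[len(pre):len(new_request) - len(post)]
        return action.replace("{param}", value)
    return None
```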
42. A computer system configured to perform a method according to any one of claims 1 to 41.
43. A computer program configured to perform a method according to any one of claims 1 to 41.
44. A non-transitory carrier medium carrying computer executable code that, when executed on a processor, causes the processor to perform a method according to any one of claims 1 to 41.
PCT/AU2016/050950 2015-10-09 2016-10-10 Frameworks and methodologies configured to enable streamlined integration of natural language processing functionality with one or more user interface environments, including assisted learning process WO2017059500A1 (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
AU2015904121A AU2015904121A0 (en) 2015-10-09 Frameworks and methodologies configured to enable streamlined integration of natural language processing functionality with a plurality of user interface environments
AU2015904121 2015-10-09
AU2015905046A AU2015905046A0 (en) 2015-12-04 Frameworks and methodologies configured to enable streamlined integration of natural language processing functionality with a plurality of user interface environments, including assisted learning process
AU2015905046 2015-12-04
AU2015905061 2015-12-07
AU2015905061A AU2015905061A0 (en) 2015-12-07 Frameworks and methodologies configured to enable streamlined integration of natural language processing functionality with a plurality of user interface environments, including assisted learning process

Publications (1)

Publication Number Publication Date
WO2017059500A1 true WO2017059500A1 (en) 2017-04-13

Family

ID=58487121

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2016/050950 WO2017059500A1 (en) 2015-10-09 2016-10-10 Frameworks and methodologies configured to enable streamlined integration of natural language processing functionality with one or more user interface environments, including assisted learning process

Country Status (1)

Country Link
WO (1) WO2017059500A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5454106A (en) * 1993-05-17 1995-09-26 International Business Machines Corporation Database retrieval system using natural language for presenting understood components of an ambiguous query on a user interface
US6766320B1 (en) * 2000-08-24 2004-07-20 Microsoft Corporation Search engine with natural language-based robust parsing for user query and relevance feedback learning
US20040220809A1 (en) * 2003-05-01 2004-11-04 Microsoft Corporation One Microsoft Way System with composite statistical and rules-based grammar model for speech recognition and natural language understanding
US7403938B2 (en) * 2001-09-24 2008-07-22 Iac Search & Media, Inc. Natural language query processing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SNEIDERS: "Automated Question Answering: Template-Based Approach", DEPARTMENT OF COMPUTER SCIENCES ROYAL INSTITUTE OF TECHNOLOGY AND STOCKHOLM UNIVERSITY, February 2002 (2002-02-01), pages 1 - 279, XP055371157, Retrieved from the Internet <URL:https://www.researchgate.net/profile/Eriks_Sneiders/publication/2560669_Automated_Question_Answering/liks/53ecf6b40cf2981ada110bd3.pdf> [retrieved on 20161121] *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111108477A (en) * 2017-07-15 2020-05-05 Cytk LLC Universal virtual professional toolkit
US20220350961A1 (en) * 2021-04-30 2022-11-03 Bank Of America Corporation Systems and methods for tool integration using cross channel digital forms
US11645454B2 (en) 2021-04-30 2023-05-09 Bank Of America Corporation Cross channel digital forms integration and presentation system
US11704484B2 (en) 2021-04-30 2023-07-18 Bank Of America Corporation Cross channel digital data parsing and generation system
US11763074B2 (en) * 2021-04-30 2023-09-19 Bank Of America Corporation Systems and methods for tool integration using cross channel digital forms

Similar Documents

Publication Publication Date Title
US11669918B2 (en) Dialog session override policies for assistant systems
US20230206087A1 (en) Techniques for building a knowledge graph in limited knowledge domains
US10713317B2 (en) Conversational agent for search
US10771406B2 (en) Providing and leveraging implicit signals reflecting user-to-BOT interaction
US10503767B2 (en) Computerized natural language query intent dispatching
WO2021077043A1 (en) Generating proactive content for assistant systems
US20070203869A1 (en) Adaptive semantic platform architecture
US20220138432A1 (en) Relying on discourse analysis to answer complex questions by neural machine reading comprehension
US10977155B1 (en) System for providing autonomous discovery of field or navigation constraints
US20230185799A1 (en) Transforming natural language to structured query language based on multi-task learning and joint training
US20220075960A1 (en) Interactive Communication System with Natural Language Adaptive Components
WO2017059500A1 (en) Frameworks and methodologies configured to enable streamlined integration of natural language processing functionality with one or more user interface environments, including assisted learning process
US20230139397A1 (en) Deep learning techniques for extraction of embedded data from documents
US20220229991A1 (en) Multi-feature balancing for natural language processors
US11966570B2 (en) Automated processing and dynamic filtering of content for display
US20240126795A1 (en) Conversational document question answering
US20240061832A1 (en) Techniques for converting a natural language utterance to an intermediate database query representation
US20230134149A1 (en) Rule-based techniques for extraction of question and answer pairs from data
US20240134850A1 (en) Output interpretation for a meaning representation language system
US20230342012A1 (en) Automated processing and dynamic filtering of content for display
CN113761143A (en) Method, apparatus, device and medium for determining answers to user questions
CN115297207A (en) Call preparation engine for customer relationship management

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16852910

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16852910

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205N DATED 15/06/2018)