US20140046977A1 - System and method for mining patterns from relationship sequences extracted from big data - Google Patents

System and method for mining patterns from relationship sequences extracted from big data

Info

Publication number
US20140046977A1
Authority
US
United States
Prior art keywords
relationship
entities
pattern
relationships
frequent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US13/755,047
Inventor
Sridhar Gopalakrishnan
Sujatha Raviprasad Upadhyaya
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
XURMO TECHNOLOGIES PVT Ltd
Original Assignee
XURMO TECHNOLOGIES PVT Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by XURMO TECHNOLOGIES PVT Ltd

Classifications

    • G06F17/30539
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G06F17/30604
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation

Definitions

  • the embodiments herein generally relate to data mining and particularly relate to mining patterns from structured, unstructured and semi-structured data from heterogeneous sources.
  • the embodiments herein more particularly relate to a system and method for mining patterns in relationship sequences extracted from big data.
  • a big data is any one or a combination of an unstructured data source, a semi-structured data source and a structured data source.
  • map reduce frameworks such as Hadoop
  • Pattern mining in structured data is a fairly well understood problem; however, pattern mining on unstructured data is much less understood.
  • the approaches for pattern mining in structured and unstructured data are completely different.
  • the co-occurrence of entities decides the pattern, given that the entities can share multiple relationships among themselves.
  • the co-occurrence of entities alone does not ensure that patterns are bound to the correct context.
  • association rule mining basically uses the co-occurrence of words that are used to describe a relationship to find new relationships.
  • the objective of this prior art is to discover new relationships between entities given that a statistical module asserts the significance of the relationship, and a relationship does not match existing relationships between the pair of entities already in the relation database.
  • the prior art adds new relationships after trying to resolve it with the existing relationships.
  • Another existing prior art provides a state-of-the-art method for effective pattern discovery for text mining, which follows a term-based approach for closed sequence pattern mining. This effort too examines sequences that are formed by term occurrences.
  • the prior art considers only the text data and the method of extracting patterns cannot be extended to other forms of data. Also, the prior art limits itself to mining patterns in entity space, where every term is considered as an entity.
  • the primary object of the embodiments herein is to provide a system and method for mining patterns in a relationship space from a collection of structured, unstructured and semi-structured data.
  • Another object of the embodiments herein is to provide a system and method for enabling pattern extraction in relationship space by storing entities and relationships, and maintaining entity hierarchy and relationship hierarchy respectively.
  • Yet another object of the embodiments herein is to provide a system and method for building relationship sequences from heterogeneous data sources to represent the order in which the relationships occur to facilitate pattern mining.
  • Yet another object of the embodiments herein is to provide a system and method for extracting relevant relationship sequences from stored relationships using entity and relationship hierarchies for pattern-mining.
  • Yet another object of the embodiments herein is to provide a system and method for generating most frequent patterns in relationship space from relationship sequences.
  • Yet another object of the embodiments herein is to provide a system and method for deriving new perspectives from big data to enable analytics of the derived data.
  • the various embodiments herein provide a system for mining frequent patterns in relationship space from a plurality of relationship sequences extracted from a big data.
  • the system comprising a data repository for collecting and storing the big data, an entity store for collecting and storing a plurality of entities from the big data, an entity hierarchy for representing a hierarchical structure of entities, a relationship store for collecting and storing relationship instances between the plurality of entities from the big data, a relationship hierarchy for representing a hierarchical structure of relationships, a language/domain model for organizing entities and relationships in a hierarchical manner, a Pattern Query Processing Module (PQPM) for processing a pattern query related to finding patterns in relationships and entities, a Pattern Generation Module (PGM) to generate frequent patterns from one or more relationship sequences from the data sources collected based on the pattern query and a Frequent Pattern Display Module (FPDM) to provide a visual presentation of the mined patterns.
  • the pattern generation module performs frequent pattern mining by gathering relevant data sources using the entity hierarchy and the relationship hierarchy. It generates
  • the big data comprises structured, unstructured and semi-structured data from heterogeneous data sources.
  • the entity store is a collection of entities extracted from the big data.
  • the entity store stores specific information with respect to each entity.
  • the entity hierarchy represents a hierarchical structure of entities resolved using Natural Language Processing (NLP) techniques with a support of the Language and Knowledge Models.
  • NLP Natural Language Processing
  • the relationship store is adapted to store information related to each relationship instance.
  • the Relationship Hierarchy represents a hierarchical arrangement of relationships by resolving the relationships through at least one of a word-sense disambiguation technique, syntactic resolution and semantic resolution in conjunction with the language/domain model for context resolution.
  • the Pattern Query Processing Module processes the pattern query by expanding the pattern query in terms of entities after consulting with the entity store and the hierarchy of entities.
  • the pattern query is a list comprising entities and relationships of the entities.
  • the Pattern Query Processing Module performs a query expansion of the pattern query to provide a relevant result by disambiguation and resolution of the entities in the pattern query.
  • the disambiguation of the entities in the pattern query is conducted by identifying explicit and implicit similar entities and ignoring the dissimilar entities.
  • the Pattern Generation Module comprises a document retriever to collect documents pertaining to the entities and relationships suggested by the query expansion, a Relationship Sequence Generator to create a relationship sequence with respect to each of the retrieved documents, and a Frequent Pattern Growth Module (FPGM) for extracting relevant relationship sequences.
  • the Relationship Sequence Generator builds the relationship sequences by treating each relationship as an item.
  • Each relationship sequence comprises the relationships in the order of appearance in the data source.
  • the Frequent Pattern Growth Pattern-Mining Module adapts a Frequent Pattern Growth (FPG) algorithm for extracting relevant relationship sequences which considers the relationship sequences as item-sets and extracts the most frequent item-sets.
  • FPG Frequent Pattern Growth
  • the embodiments herein further provide a method for mining frequent patterns from a plurality of relationship sequences extracted from a big data.
  • the method comprising, extracting a plurality of entities from the big data, storing the extracted plurality of entities in an entity store, extracting and storing one or more relationships among the plurality of entities, building an entity hierarchy by arranging the plurality of entities in a hierarchical manner, creating a relationship hierarchy by arranging the relationships in a hierarchical manner, inputting a pattern query; where the pattern query is a list of entities and the relationship of entities, processing the pattern query to find patterns in relationships and entities, retrieving relevant data sources from data using the entity hierarchy and the relationship hierarchy based on the pattern query, building relationship sequences with respect to one or more retrieved data sources and extracting frequent patterns from the relationship sequences and displaying the frequent patterns on a frequent pattern display module.
  • the big data comprises structured, unstructured and semi-structured data from heterogeneous data sources for enabling data analysis on a single view.
  • generating frequent patterns among the relationship sequences is performed using a Frequent Pattern Growth Algorithm which considers the relationship sequences as item-sets and extracts the most frequent item-sets.
  • the method of extracting frequent patterns comprises collecting data sources pertaining to one or more entities and relationships contained in a pattern query, building a relationship sequence pertaining to each data source by handling each relationship as an item in an item-set that represents a relationship sequence, building the relationship sequence in the order in which the relationships appear in the document and identifying the frequent relationship sequences.
  • the method of processing the pattern query comprises extracting the hierarchy of the plurality of entities, expanding the pattern query in terms of entities based on the entity hierarchy and expanding the pattern query in terms of relationships based on the relationship hierarchy.
  • expanding the pattern query in terms of entities comprises disambiguating the entities in the pattern query, including synonyms and implied entities in the query expansion and performing context resolution by including similar entities and discarding dissimilar entities.
  • expanding the pattern query in terms of relationships comprises resolving relationships according to the context, including the relationship which implies context similarity, including the relationships that are implied within the syntactic and semantic similarity and discarding the semantically and syntactically dissimilar relationships.
  • FIG. 1 is a block diagram illustrating a system for frequent pattern mining in relationship space, according to an embodiment of the present disclosure.
  • FIG. 2 illustrates a flow chart of a method for performing frequent pattern mining in relationship space, according to an embodiment of the present disclosure.
  • FIG. 3 is a flow diagram illustrating a method for extracting frequent patterns, according to an embodiment of the present disclosure.
  • FIG. 4 is a flow chart illustrating a method for processing the pattern query, according to an embodiment of the present disclosure.
  • the various embodiments herein provide a system for mining frequent patterns in relationship space from a plurality of relationship sequences extracted from a big data.
  • the system comprising a data repository for collecting and storing the big data, an entity store for collecting and storing a plurality of entities from the big data, an entity hierarchy that represents a hierarchical structure of entities, a relationship store for collecting and storing relationship instances between the plurality of entities from the big data, a relationship hierarchy that represents a hierarchical structure of relationships and a language/domain model for organizing entities and relationships in a hierarchical manner.
  • the system further comprises a Pattern Query Processing Module (PQPM) for processing a pattern query related to finding patterns in relationships and entities, a Pattern Generation Module (PGM) to generate frequent patterns from one or more relationship sequences from the data sources collected based on the pattern query and a Frequent Pattern Display Module (FPDM) to provide a visual presentation of the mined patterns.
  • PQPM Pattern Query Processing Module
  • PGM Pattern Generation Module
  • FPDM Frequent Pattern Display Module
  • the big data comprises structured, unstructured and semi-structured data from heterogeneous data sources for enabling data analysis on a single view.
  • the entity store is a collection of entities extracted from the big data.
  • the entity store stores specific information that helps distinguish one entity from another, so that one or more documents containing relevant entities corresponding to the pattern query can be retrieved.
  • the entity hierarchy represents a hierarchical structure of entities resolved using Natural Language Processing (NLP) techniques with the support of the language/domain models.
  • the relationship store is adapted to store information related to each relationship instance for distinguishing it from other relationship instances.
  • the Relationship Hierarchy represents a hierarchical arrangement of relationships by resolving the relationships through at least one of a word-sense disambiguation technique and context resolution technique in conjunction with the language/domain model.
  • the Pattern Query Processing Module processes the pattern query by expanding the pattern query in terms of entities after consulting with the entity store and the hierarchy of the entity.
  • the pattern query is a list comprising entities and relationships of the entities.
  • the Pattern Query Processing Module performs a context resolution of the pattern query to provide a relevant result by disambiguation of the entities in the pattern query.
  • the disambiguation of the entities in the pattern query is conducted by considering synonyms and implied entities obtained during expansion of pattern query where similar entities are included and dissimilar entities are excluded.
  • the Pattern Generation Module comprises a document retriever to collect documents pertaining to the entities and relationships contained in the pattern query.
  • a Relationship Sequence Generator to create a relationship sequence with respect to each of the retrieved documents.
  • a Frequent Pattern Growth Module for extracting relevant relationship sequences.
  • the Relationship Sequence Generator builds the relationship sequences by treating each relationship as an item.
  • Each relationship sequence comprises the relationships in the order of appearance in the document.
  • the Frequent Pattern Growth Module adapts a Frequent Pattern Growth (FPG) algorithm for extracting relevant relationship sequences which considers the relationship sequences as item-sets and extracts the most frequent item-sets.
  • the method for mining frequent patterns from a plurality of relationship sequences extracted from a big data comprising, extracting a plurality of entities from the big data.
  • An entity refers to a concept comprising a language unit having an independent meaning.
  • the plurality of entities extracted from the big data is stored in an entity store and the extracted entities are arranged in a hierarchical manner. Similarly one or more relationships among the plurality of entities are extracted and stored and a relationship hierarchy is created by arranging the relationships in a hierarchical manner.
  • a pattern query is inputted to a pattern query recognition module which processes the pattern query to find patterns in relationships and entities, retrieve relevant data sources from data using the entity hierarchy and the relationship hierarchy based on the pattern query, build relationship sequences with respect to one or more retrieved data sources, extract frequent patterns from the relationship sequences and display the frequent patterns on a frequent pattern display module.
  • the pattern query is a list of entities and the relationship of entities.
  • generating frequent patterns among the relationship sequences is performed using a Frequent Pattern Growth Algorithm which considers the relationship sequences as item-sets and extracts the most frequent item-sets.
  • the method of extracting frequent patterns comprises collecting data sources pertaining to one or more entities and relationships contained in a pattern query. Then a relationship sequence pertaining to each data source is built by handling each relationship as an item in an item-set that represents a relationship sequence. Further, each relationship sequence is built in the order in which the relationships appear in the document, and finally the frequent relationship sequences are identified.
  • the method of processing the pattern query comprises extracting the hierarchy of the plurality of entities, expanding the pattern query in terms of entities based on the entity hierarchy and expanding the pattern query in terms of relationships based on the relationship hierarchy.
  • expanding the pattern query in terms of entities comprises disambiguating the entities in the pattern query, including synonyms and implied entities in the query expansion and performing context resolution by including similar entities and discarding dissimilar entities.
  • expanding the pattern query in terms of relationships comprises resolving relationships according to the context, including the relationship which implies context similarity, including the relationships that are implied within the syntactic similarity and discarding the contextually and syntactically dissimilar relationships.
  • FIG. 1 is a block diagram illustrating a system for frequent pattern mining in relationship space, according to an embodiment of the present disclosure.
  • the system comprises a data repository 101 , a Language/Domain Model 102 , an entity store 103 , an entity hierarchy 104 , a relationship store 105 , a relationship hierarchy 106 , a query interface 107 , a Pattern Query Processing Module (PQPM) 108 , a Pattern Generation Module (PGM) 109 and a Frequent Pattern Display Module (FPDM) 110 .
  • the data repository 101 is adapted for collecting and storing big data.
  • the big data is a collection of all forms of data comprising structured, semi-structured and unstructured data from heterogeneous sources. A language/domain model 102 is used to resolve and organize entities and relationships in a hierarchy.
  • the language/domain model 102 is used to disambiguate sense in an unstructured data.
  • the language/domain model 102 also disambiguates sense in the structured and semi-structured data contexts from data repository 101 .
  • the entity store 103 is a collection of entities extracted from the data repository 101 .
  • the entity store 103 also stores certain specific information relating to entities that helps in distinguishing other entities.
  • the entity store 103 is used only to retrieve the documents containing the relevant entities corresponding to a pattern query 108 .
  • the entity hierarchy 104 is built using the Language/Domain Model 102 .
  • the entity hierarchy is a hierarchical structure of entities that is built using Natural Language Processing (NLP) techniques with the support of the Language/Domain Model 102 .
  • the Language/Domain Model 102 is used to resolve and organize entities and relationships in a hierarchy.
  • the Language/Domain Model 102 is especially used to disambiguate sense in unstructured data; it is also useful for disambiguating sense in the structured and semi-structured data contexts.
  • the relationship store 105 includes a collection of relationship instances that also stores certain information specific to relationship instances.
  • the relationship hierarchy 106 is a hierarchical arrangement of relationships that are contextually resolved by word-sense disambiguation with the help of the Language/Domain Model 102 .
  • the relationship store 105 and the relationship hierarchy 106 function in conjunction with the Pattern Query Processing Module (PQPM) 108 .
  • the Pattern Query Processing Module (PQPM) 108 receives a pattern query inputted through a query interface 107 and performs processing as per the required information.
  • the pattern query comprises a list of entities and relationships.
  • the PQPM 108 consults the entity store 103 and the entity hierarchy and expands the pattern query in terms of entities. This entity expansion process involves disambiguating the entities in the pattern query, including the synonyms and implied entities in query expansion, making a context resolution to include the similar and exclude the dissimilar entities.
  • the Pattern Generation Module (PGM) 109 comprises a Document Retriever 109 a, a Relationship Sequence Generator 109 b and a Frequent Pattern Growth Pattern Mining Module (FPGMM) 109 c .
  • the document retriever 109 a collects all documents pertaining to the entities/relationships contained in the pattern query.
  • the Relationship Sequence Generator 109 b generates a relationship sequence with respect to each document or data source by treating each relationship as an item.
  • the Relationship Sequence Generator 109 b builds a relationship sequence in the order of appearance in the document.
  • the Frequent Pattern Growth Pattern-Mining Module (FPGMM) uses a Frequent Pattern Growth (FPG) algorithm for processing the pattern query.
  • the FPG algorithm treats the relationship sequences like item-sets and extracts the most frequent item-sets/relationship sequences.
  • the Frequent Pattern Display Module (FPDM) 110 provides for visualizing the most frequent patterns extracted from relationship sequences in conjunction with the entity.
  • FIG. 2 illustrates a flow chart of a method for performing frequent pattern mining in relationship space, according to an embodiment of the present disclosure.
  • the method comprises frequent pattern mining in relationship space.
  • the method comprises processing of big data for recognizing plurality of entities.
  • the plurality of entities are then extracted and stored in an entity store.
  • the entity store stores meaningful entities extracted out of big data irrespective of the form from which the entity originates ( 201 ).
  • Entities are objects that make independent sense. Entities are named and unnamed objects, which include names of living and non-living things, concepts, theories or simply language units that make independent sense.
  • An entity is any one of the named entities, such as names of places or people, or a concept that is represented by one or more terms (for example, 'Purchase power', 'Purchase' as a noun and 'Purchase' as a verb are three different concepts).
  • the entity refers to named entities and concepts (language unit with independent meaning).
  • An entity hierarchy is then built by arranging the plurality of entities in a hierarchical manner ( 202 ). Further, a set of relationships among a plurality of entities is extracted and stored in a relationship store ( 203 ), and a relationship hierarchy is created by arranging the relationships in a hierarchical manner ( 204 ).
  • the method involves the use of the entity hierarchy and the relationship hierarchy during response to the pattern query.
  • the pattern query is inputted to a Pattern Query Processing Module (PQPM) for finding frequent patterns related to entities and relationships in the query ( 205 ).
  • the document collector collects the documents that are relevant to the pattern query ( 206 ). Based on the contents of the pattern query, the Relationship Sequence Generator generates a relationship sequence for each of the retrieved documents ( 207 ). The PGM adopts a Frequent Pattern Growth Module (FPGM) for identifying the frequent patterns among the relationship sequences ( 208 ). Finally, the identified patterns are displayed on a Frequent Pattern Display Module (FPDM) ( 209 ).
  • FPGM Frequent Pattern Growth Module
  • FIG. 3 is a flow diagram illustrating a method for extracting frequent patterns, according to an embodiment of the present disclosure.
  • the method comprises receiving a pattern query in a pattern query Processing Module (PQPM).
  • PQPM processes the pattern query and communicates with a Pattern Generation Module (PGM).
  • the PGM comprises three subunits: a Document Retriever, a Relationship Sequence Generator and a Frequent Pattern Growth Module (FPGM).
  • the relationship sequences that appear like “item-sets” enable frequent item set mining.
  • the item-sets comprise relationship sequences in an orderly manner for easy processing.
  • the Frequent Pattern Growth Module (FPGM) mines for the required pattern as desired by the pattern query ( 303 ).
  • the resulting frequent relationship sequences are then displayed by the Frequent Pattern Display Module (FPDM).
  • FIG. 4 is a flow chart illustrating a method for processing the pattern query, according to an embodiment of the present disclosure.
  • the pattern query is raised by a user which is inputted to a Pattern Query Processing Module (PQPM).
  • the PQPM expands the pattern query in terms of entities by referring to the entity list and the entity hierarchy ( 401 ). Expanding the pattern query in terms of entities includes the steps of disambiguating the entities in the pattern query, including synonyms and implied entities in the query expansion and performing context resolution by including similar entities and discarding dissimilar entities.
  • the PQPM then expands the pattern query in terms of relationships based on the relationship hierarchy ( 402 ).
  • expanding the pattern query in terms of relationships includes resolving relationships according to the context, including the relationships which imply context similarity, including the relationships that are implied within the syntactic similarity and discarding the contextually and syntactically dissimilar relationships.
  • the embodiments of the present invention disclose an approach that looks for patterns in the relationship space.
  • the embodiments of the present disclosure provide a robust approach to finding patterns and ensure effective context resolution.
  • the entities and relationships among the entities assist in understanding the big data. All the entities and relationships are derived and collected. This collection of entities and relationships serves as input to all intelligent processing of data. Data mining and data analysis applications, forecasting, predictive analytics applications and machine learning applications make use of the patterns to learn further insights.
  • the embodiments herein enable an enterprise that intends to facilitate processing of big data and build applications on top.
  • the embodiment herein also allows building of domain specific, niche applications that harness big data.
  • the embodiments herein provide immense benefit to sectors including, but not limited to, retail, health and pharmaceutical services, banking and insurance.
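
The relationship-sequence steps described above (one sequence per retrieved document, each relationship treated as an item, then extraction of the most frequent item-sets) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the relationship labels, the document ids, the `(document, position, label)` tuple layout, and the names `build_sequences` and `frequent_itemsets` are hypothetical. The mining step uses plain enumeration of item subsets rather than an FP-tree; for a given support threshold it yields the same frequent item-sets that a Frequent Pattern Growth implementation would, but its cost grows exponentially with sequence length and it is only suitable for small examples.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical relationship instances: (document id, position in document, label).
instances = [
    ("doc1", 2, "invests_in"), ("doc1", 1, "acquires"), ("doc1", 3, "partners_with"),
    ("doc2", 1, "acquires"), ("doc2", 2, "partners_with"),
    ("doc3", 1, "invests_in"), ("doc3", 2, "partners_with"), ("doc3", 3, "acquires"),
    ("doc4", 1, "competes_with"), ("doc4", 2, "acquires"),
]


def build_sequences(instances):
    """Group relationships by document and order them by position of appearance."""
    by_doc = defaultdict(list)
    for doc_id, position, label in instances:
        by_doc[doc_id].append((position, label))
    return {doc: [label for _, label in sorted(items)] for doc, items in by_doc.items()}


def frequent_itemsets(seqs, min_support):
    """Count every item subset of each sequence; keep those seen in >= min_support sequences.

    Brute-force stand-in for FP-Growth (assumption: small inputs only).
    """
    counts = Counter()
    for seq in seqs:
        items = sorted(set(seq))  # treat the relationship sequence as an item-set
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


sequences = build_sequences(instances)
frequent = frequent_itemsets(sequences.values(), min_support=3)
for itemset, n in sorted(frequent.items()):
    print(itemset, n)
```

With the sample data, only `acquires`, `partners_with`, and the pair of the two reach a support of three documents. Note that treating a sequence as an item-set discards the order of appearance, which matches the item-set treatment stated in the text.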

Abstract

The various embodiments herein provide a system and method for mining frequent patterns in relationship space from a plurality of relationship sequences extracted from a big data. The system comprises a data repository for collecting and storing the big data, an Entity Store for collecting and storing a plurality of entities from the big data, an Entity Hierarchy for representing a hierarchical structure of entities, a Relationship Store for collecting and storing relationship instances between the plurality of entities, a Relationship Hierarchy for representing a hierarchical structure of relationships, a language/domain model for organizing entities and relationships in a hierarchical manner, a Pattern Query Processing Module (PQPM) for processing a pattern query related to finding patterns in relationships and entities, a Pattern Generation Module (PGM) to generate frequent patterns and a Frequent Pattern Display Module (FPDM) to provide a visual presentation of the mined patterns.
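
As a rough illustration of how a pattern query might be expanded against an entity hierarchy, the following Python sketch adds synonyms and hierarchy descendants to each query entity. The hierarchy contents, the synonym table, and the function names `expand_entity` and `expand_query` are hypothetical, and treating descendants in the hierarchy as the "implied entities" is an assumption rather than something the application specifies. Context resolution (excluding dissimilar entities) is not modeled.

```python
# Hypothetical entity hierarchy (parent -> children) and synonym table.
ENTITY_HIERARCHY = {
    "vehicle": ["car", "truck"],
    "car": ["sedan", "hatchback"],
}
SYNONYMS = {"car": ["automobile"]}


def expand_entity(entity, hierarchy, synonyms):
    """Return the entity with its synonyms and all descendants, sorted for stable output."""
    seen = set()
    stack = [entity]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(synonyms.get(current, []))   # synonyms of the current entity
        stack.extend(hierarchy.get(current, []))  # assumed "implied" entities
    return sorted(seen)


def expand_query(entities, hierarchy, synonyms):
    """Expand every entity in the query; the result maps each entity to its expansion."""
    return {e: expand_entity(e, hierarchy, synonyms) for e in entities}


expanded = expand_query(["car", "truck"], ENTITY_HIERARCHY, SYNONYMS)
print(expanded)
```

A relationship hierarchy could be expanded with the same traversal, substituting relationship labels for entity names.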

Description

    CROSS-REFERENCE TO RELATED APPLICATION
  • The present application claims priority of Indian provisional application serial number 3286/CHE/2012 filed on Aug. 10, 2012, and that application is incorporated in its entirety at least by reference.
  • BACKGROUND
  • 1. Technical Field
  • The embodiments herein generally relate to data mining and particularly relate to mining patterns from structured, unstructured and semi-structured data from heterogeneous sources. The embodiments herein more particularly relate to a system and method for mining patterns in relationship sequences extracted from big data.
  • 2. Description of the Related Art
  • Information explosion within and outside the organization has led to exponential increase in unstructured data, while the systems currently used are especially meant for processing structured data. With the advent of big data systems such as columnar databases, map reduce frameworks such as Hadoop, it is now possible to store heterogeneous data at one point. A big data is any one or a combination of an unstructured data source, a semi-structured data source and a structured data source. However, making information available for analytics or deriving new perspectives from big data to enable analytics is something that is not understood clearly yet.
  • Pattern mining in structured data is a fairly well understood problem; however, pattern mining on unstructured data is much less understood. The approaches for pattern mining in structured and unstructured data are completely different. In both structured and unstructured pattern mining, the co-occurrence of entities decides the pattern, given that the entities can share multiple relationships among themselves. However, the co-occurrence of entities alone does not ensure that patterns are bound to the correct context.
  • One existing prior art provides a system and method for the automatic mining of new relationships which employs "association rule mining" to discover new relationships. The "association rule mining" technique uses the co-occurrence of words that describe a relationship to find new relationships. The objective of this prior art is to discover new relationships between entities, provided that a statistical module asserts the significance of the relationship and the relationship does not match existing relationships between the pair of entities already in the relation database. The prior art adds a new relationship after trying to resolve it against the existing relationships.
  • Another existing prior art provides a method for effective pattern discovery for text mining which follows a term-based approach for closed sequence pattern mining. This effort too examines sequences that are formed by term occurrences. The prior art considers only text data, and its method of extracting patterns cannot be extended to other forms of data. Also, the prior art limits itself to mining patterns in entity space, where every term is considered an entity.
  • The existing prior arts have many limitations with respect to pattern mining in relationship space. The existing systems attempt to perform pattern mining on either structured or unstructured data and not on an amalgamation of both. Also, the existing approach to pattern mining is based on the co-occurrence of two or more entities, and mines patterns in entity space only. These methods do not ensure contextual resolution of entities, as the same entities can co-occur in different contexts. The existing pattern-mining approaches do not mine patterns upon resolution of both entities and relationships, although certain aspects of entity resolution have been addressed. Further, many forms of representation of the relationships that occur between entities are rather complex and require an expensive logical inference mechanism for realizing a hierarchy of the relationships. In the unstructured data context, it is important to arrive at a suitable representation of relationships that facilitates easy resolution of relationships.
  • In view of the foregoing, there is a need for a system and method for mining patterns in relationship sequences extracted from big data. There is also a need for system and method for finding patterns based on co-occurring relationships. Further there exists a need for a system and method which can extract frequent patterns in relationship space from relationship sequences.
  • The abovementioned shortcomings, disadvantages and problems are addressed herein and which will be understood by reading and studying the following specification.
  • SUMMARY
  • The primary object of the embodiments herein is to provide a system and method for mining patterns in a relationship space from a collection of structured, unstructured and semi-structured data.
  • Another object of the embodiments herein is to provide a system and method for enabling pattern extraction in relationship space by storing entities and relationships, and maintaining an entity hierarchy and a relationship hierarchy respectively.
  • Yet another object of the embodiments herein is to provide a system and method for building relationship sequences from heterogeneous data sources to represent the order in which the relationships occur to facilitate pattern mining.
  • Yet another object of the embodiments herein is to provide a system and method for extracting relevant relationship sequences from stored relationships using entity and relationship hierarchies for pattern-mining.
  • Yet another object of the embodiments herein is to provide a system and method for generating most frequent patterns in relationship space from relationship sequences.
  • Yet another object of the embodiments herein is to provide a system and method for deriving new perspectives from big data to enable analytics of the derived data.
  • These and other objects and advantages of the present invention will become readily apparent from the following detailed description taken in conjunction with the accompanying drawings.
  • The various embodiments herein provide a system for mining frequent patterns in relationship space from a plurality of relationship sequences extracted from a big data. The system comprises a data repository for collecting and storing the big data, an entity store for collecting and storing a plurality of entities from the big data, an entity hierarchy for representing a hierarchical structure of entities, a relationship store for collecting and storing relationship instances between the plurality of entities from the big data, a relationship hierarchy for representing a hierarchical structure of relationships, a language/domain model for organizing entities and relationships in a hierarchical manner, a Pattern Query Processing Module (PQPM) for processing a pattern query related to finding patterns in relationships and entities, a Pattern Generation Module (PGM) to generate frequent patterns from one or more relationship sequences from the data sources collected based on the pattern query, and a Frequent Pattern Display Module (FPDM) to provide a visual presentation of the mined patterns. The pattern generation module performs frequent pattern mining by gathering relevant data sources using the entity hierarchy and the relationship hierarchy. It generates a relationship sequence with respect to each data source and extracts the most frequent patterns in the collection of relationship sequences.
  • According to an embodiment herein, the big data comprises structured, unstructured and semi-structured data from heterogeneous data sources.
  • According to an embodiment herein, the entity store is a collection of entities extracted from the big data. The entity store stores specific information with respect to each entity.
  • According to an embodiment herein, the entity hierarchy represents a hierarchical structure of entities resolved using Natural Language Processing (NLP) techniques with a support of the Language and Knowledge Models.
  • According to an embodiment herein, the relationship store is adapted to store information related to each relationship instance.
  • According to an embodiment herein, the Relationship Hierarchy represents a hierarchical arrangement of relationships by resolving the relationships through at least one of a word-sense disambiguation technique, syntactic resolution and semantic resolution in conjunction with the language/domain model for context resolution.
  • According to an embodiment herein, the Pattern Query Processing Module (PQPM) processes the pattern query by expanding the pattern query in terms of entities after consulting with the entity store and the hierarchy of entities. The pattern query is a list comprising entities and relationships of the entities.
  • According to an embodiment herein, the Pattern Query Processing Module (PQPM) performs a query expansion of the pattern query to provide a relevant result by disambiguation and resolution of the entities in the pattern query. The disambiguation of the entities in the pattern query is conducted by identifying explicit and implicit similar entities and ignoring the dissimilar entities.
  • According to an embodiment herein, the Pattern Generation Module (PGM) comprises a document retriever to collect documents pertaining to the entities and relationships suggested by the query expansion, a Relationship Sequence Generator to create a relationship sequence with respect to each of the retrieved documents, and a Frequent Pattern Growth Module (FPGM) for extracting relevant relationship sequences.
  • According to an embodiment herein, the Relationship Sequence Generator builds the relationship sequences by treating each relationship as an item. Each relationship sequence comprises the relationships in the order of appearance in the data source.
  • According to an embodiment herein, the Frequent Pattern Growth Pattern-Mining Module (FPGM) adapts a Frequent Pattern Growth (FPG) algorithm for extracting relevant relationship sequences which considers the relationship sequences as item-sets and extracts the most frequent item-sets.
  • The embodiments herein further provide a method for mining frequent patterns from a plurality of relationship sequences extracted from a big data. The method comprising, extracting a plurality of entities from the big data, storing the extracted plurality of entities in an entity store, extracting and storing one or more relationships among the plurality of entities, building an entity hierarchy by arranging the plurality of entities in a hierarchical manner, creating a relationship hierarchy by arranging the relationships in a hierarchical manner, inputting a pattern query; where the pattern query is a list of entities and the relationship of entities, processing the pattern query to find patterns in relationships and entities, retrieving relevant data sources from data using the entity hierarchy and the relationship hierarchy based on the pattern query, building relationship sequences with respect to one or more retrieved data sources and extracting frequent patterns from the relationship sequences and displaying the frequent patterns on a frequent pattern display module.
  • According to an embodiment herein, the big data comprises structured, unstructured and semi-structured data from heterogeneous data sources for enabling data analysis on a single view.
  • According to an embodiment herein, generating frequent patterns among the relationship sequences is performed using a Frequent Pattern Growth Algorithm which considers the relationship sequences as item-sets and extracts the most frequent item-sets.
  • According to an embodiment herein, the method of extracting frequent patterns comprises collecting data sources pertaining to one or more entities and relationships contained in a pattern query, building a relationship sequence pertaining to each data source by handling each relationship as an item in an item-set that represents a relationship sequence, building the relationship sequence in the order in which the relationships appear in the document, and identifying the frequent relationship sequences.
  • According to an embodiment herein, the method of processing the pattern query comprises extracting the hierarchy of the plurality of entities, expanding the pattern query in terms of entities based on the entity hierarchy and expanding the pattern query in terms of relationships based on the relationship hierarchy.
  • According to an embodiment herein, expanding the pattern query in terms of entities comprises disambiguating the entities in the pattern query, including synonyms and implied entities in the query expansion, and performing context resolution by including similar entities and discarding dissimilar entities.
  • According to an embodiment of the present invention, expanding the pattern query in terms of relationships comprises resolving relationships according to the context, including the relationship which implies context similarity, including the relationships that are implied within the syntactic and semantic similarity and discarding the semantically and syntactically dissimilar relationships.
  • These and the other aspects of the embodiments herein will be better appreciated and understood when considered in conjunction with the following description and the accompanying drawings. It should be understood, however, that the following descriptions, while indicating preferred embodiments and numerous specific details thereof, are given by way of illustration and not of limitation. Many changes and modifications may be made within the scope of the embodiments herein without departing from the spirit thereof, and the embodiments herein include all such modifications.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The other objects, features and advantages will occur to those skilled in the art from the following description of the preferred embodiment and the accompanying drawings in which:
  • FIG. 1 is a block diagram illustrating a system for frequent pattern mining in relationship space, according to an embodiment of the present disclosure.
  • FIG. 2 illustrates a flow chart of a method for performing frequent pattern mining in relationship space, according to an embodiment of the present disclosure.
  • FIG. 3 is a flow diagram illustrating a method for extracting frequent patterns, according to an embodiment of the present disclosure.
  • FIG. 4 is a flow chart illustrating a method for processing the pattern query, according to an embodiment of the present disclosure.
  • Although the specific features of the present invention are shown in some drawings and not in others, this is done for convenience only, as each feature may be combined with any or all of the other features in accordance with the present invention.
  • DETAILED DESCRIPTION OF DRAWINGS
  • In the following detailed description, reference is made to the accompanying drawings that form a part hereof, and in which specific embodiments that may be practiced are shown by way of illustration. These embodiments are described in sufficient detail to enable those skilled in the art to practice the embodiments, and it is to be understood that logical, mechanical and other changes may be made without departing from the scope of the embodiments. The following detailed description is therefore not to be taken in a limiting sense.
  • The various embodiments herein provide a system for mining frequent patterns in relationship space from a plurality of relationship sequences extracted from a big data. The system comprises a data repository for collecting and storing the big data, an entity store for collecting and storing a plurality of entities from the big data, an entity hierarchy that represents a hierarchical structure of entities, a relationship store for collecting and storing relationship instances between the plurality of entities from the big data, a relationship hierarchy that represents a hierarchical structure of relationships, and a language/domain model for organizing entities and relationships in a hierarchical manner. The system further comprises a Pattern Query Processing Module (PQPM) for processing a pattern query related to finding patterns in relationships and entities, a Pattern Generation Module (PGM) to generate frequent patterns from one or more relationship sequences from the data sources collected based on the pattern query, and a Frequent Pattern Display Module (FPDM) to provide a visual presentation of the mined patterns. The pattern generation module performs frequent pattern mining by extracting relevant relationship sequences from the relationship store using the entity hierarchy and the relationship hierarchy.
  • The big data comprises structured, unstructured and semi-structured data from heterogeneous data sources for enabling data analysis on a single view.
  • The entity store is a collection of entities extracted from the big data. The entity store stores specific information that helps distinguish one entity from another, so that one or more documents containing relevant entities corresponding to the pattern query can be retrieved. The entity hierarchy represents a hierarchical structure of entities resolved using Natural Language Processing (NLP) techniques with the support of the language/domain models.
  • The relationship store is adapted to store information related to each relationship instance for distinguishing one relationship instance from another. The Relationship Hierarchy represents a hierarchical arrangement of relationships by resolving the relationships through at least one of a word-sense disambiguation technique and a context resolution technique in conjunction with the language/domain model.
  • The Pattern Query Processing Module (PQPM) processes the pattern query by expanding the pattern query in terms of entities after consulting the entity store and the entity hierarchy. The pattern query is a list comprising entities and relationships of the entities.
  • The Pattern Query Processing Module (PQPM) performs a context resolution of the pattern query to provide a relevant result by disambiguation of the entities in the pattern query. The disambiguation of the entities in the pattern query is conducted by considering synonyms and implied entities obtained during expansion of pattern query where similar entities are included and dissimilar entities are excluded.
  • The Pattern Generation Module (PGM) comprises a document retriever to collect documents pertaining to the entities and relationships contained in the pattern query, a Relationship Sequence Generator to create a relationship sequence with respect to each of the retrieved documents, and a Frequent Pattern Growth Module (FPGM) for extracting relevant relationship sequences.
  • The Relationship Sequence Generator builds the relationship sequences by treating each relationship as an item. Each relationship sequence comprises the relationships in the order of appearance in the document.
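  • The sequence-building step described above can be sketched as follows. This is a minimal illustration only: the specification does not define a storage schema, so the `RelationshipInstance` record, its field names and the use of a numeric `position` to capture order of appearance are all assumptions.

```python
from collections import defaultdict
from typing import NamedTuple


class RelationshipInstance(NamedTuple):
    """One extracted relationship; every field name here is a hypothetical choice."""
    doc_id: str
    position: int   # offset of the relationship within its document
    subject: str
    relation: str
    obj: str


def build_relationship_sequences(instances):
    """Group instances by document and list relations in order of appearance."""
    by_doc = defaultdict(list)
    for inst in instances:
        by_doc[inst.doc_id].append(inst)
    return {
        doc_id: [inst.relation for inst in sorted(items, key=lambda i: i.position)]
        for doc_id, items in by_doc.items()
    }


instances = [
    RelationshipInstance("d1", 40, "customer", "returns", "product"),
    RelationshipInstance("d1", 5, "customer", "buys", "product"),
    RelationshipInstance("d2", 12, "customer", "buys", "product"),
    RelationshipInstance("d1", 20, "customer", "reviews", "product"),
]
sequences = build_relationship_sequences(instances)
```

  • Each relationship label acts as one item, and each document contributes one sequence, which is the shape a frequent item-set miner expects as input.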
  • The Frequent Pattern Growth Module (FPGM) adapts a Frequent Pattern Growth (FPG) algorithm for extracting relevant relationship sequences which considers the relationship sequences as item-sets and extracts the most frequent item-sets.
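  • A hedged sketch of the mining step follows. It uses a pattern-growth recursion over projected (conditional) transaction lists, which is the idea underlying Frequent Pattern Growth, but it does not build the compressed FP-tree that a full implementation would use. The function name, the `min_support` parameter and the sample data are assumptions for illustration. Because the sequences are treated as item-sets, order and repetition within a sequence are discarded before counting.

```python
from collections import Counter


def mine_frequent_itemsets(transactions, min_support):
    """Return {frozenset(items): support} for item-sets meeting min_support.

    Pattern-growth over projected transaction lists; a production FP-Growth
    would compress these lists into an FP-tree instead.
    """
    results = {}

    def grow(prefix, projected):
        # Each transaction is a set, so an item counts once per transaction.
        counts = Counter(item for t in projected for item in t)
        frequent = sorted(i for i, c in counts.items() if c >= min_support)
        for idx, item in enumerate(frequent):
            pattern = prefix | {item}
            results[frozenset(pattern)] = counts[item]
            # Project onto transactions containing `item`, keeping only items
            # later in the ordering so every item-set is generated once.
            later = set(frequent[idx + 1:])
            conditional = [t & later for t in projected if item in t]
            grow(pattern, [t for t in conditional if t])

    grow(frozenset(), [frozenset(t) for t in transactions])
    return results


relationship_sequences = [
    ["buys", "reviews", "returns"],
    ["buys", "reviews"],
    ["buys", "returns"],
    ["reviews"],
]
frequent = mine_frequent_itemsets(relationship_sequences, min_support=2)
```

  • With a minimum support of 2, the sample data yields the three single relationships plus the pairs {buys, reviews} and {buys, returns}. If the order of relationships must be preserved in the mined patterns, a sequential pattern miner would be needed instead of an item-set miner.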
  • The embodiments herein also provide a method for mining frequent patterns from a plurality of relationship sequences extracted from a big data. The method comprises extracting a plurality of entities from the big data. An entity refers to a concept comprising a language unit having an independent meaning. The plurality of entities extracted from the big data is stored in an entity store and the extracted entities are arranged in a hierarchical manner. Similarly, one or more relationships among the plurality of entities are extracted and stored, and a relationship hierarchy is created by arranging the relationships in a hierarchical manner. Further, a pattern query is inputted to a Pattern Query Processing Module (PQPM), which processes the pattern query to find patterns in relationships and entities. Relevant data sources are then retrieved using the entity hierarchy and the relationship hierarchy based on the pattern query, relationship sequences are built with respect to one or more retrieved data sources, frequent patterns are extracted from the relationship sequences, and the frequent patterns are displayed on a frequent pattern display module. The pattern query is a list of entities and the relationships of the entities. Here, generating frequent patterns among the relationship sequences is performed using a Frequent Pattern Growth Algorithm which considers the relationship sequences as item-sets and extracts the most frequent item-sets.
  • The method of extracting frequent patterns comprises collecting data sources pertaining to one or more entities and relationships contained in a pattern query. Then a relationship sequence pertaining to each data source is built by handling each relationship as an item in an item-set that represents a relationship sequence. Further, the relationship sequence is built in the order in which the relationships appear in the document, and finally the frequent relationship sequences are identified.
  • Similarly, the method of processing the pattern query comprises extracting the hierarchy of the plurality of entities, expanding the pattern query in terms of entities based on the entity hierarchy and expanding the pattern query in terms of relationships based on the relationship hierarchy.
  • Here, expanding the pattern query in terms of entities comprises disambiguating the entities in the pattern query, including synonyms and implied entities in the query expansion, and performing context resolution by including similar entities and discarding dissimilar entities. Similarly, expanding the pattern query in terms of relationships comprises resolving relationships according to the context, including the relationships which imply context similarity, including the relationships that are implied within the syntactic similarity and discarding the contextually and syntactically dissimilar relationships.
  • FIG. 1 is a block diagram illustrating a system for frequent pattern mining in relationship space, according to an embodiment of the present disclosure. The system comprises a data repository 101, a Language/Domain Model 102, an entity store 103, an entity hierarchy 104, a relationship store 105, a relationship hierarchy 106, a query interface 107, a Pattern Query Processing Module (PQPM) 108, a Pattern Generation Module (PGM) 109 and a Frequent Pattern Display Module (FPDM) 110.
  • The data repository 101 is adapted for collecting and storing big data. The big data is a collection of all forms of data comprising structured, semi-structured and unstructured data from heterogeneous sources. The language/domain model 102 is used to resolve and organize entities and relationships in a hierarchy. The language/domain model 102 is used to disambiguate sense in unstructured data. The language/domain model 102 also disambiguates sense in the structured and semi-structured data contexts from the data repository 101.
  • The entity store 103 is a collection of entities extracted from the data repository 101. The entity store 103 also stores certain specific information relating to each entity that helps in distinguishing it from other entities. The entity store 103 is used only to retrieve the documents containing the relevant entities corresponding to a pattern query. The entity hierarchy 104 is a hierarchical structure of entities that is built using Natural Language Processing (NLP) techniques with the support of the Language/Domain Model 102. The Language/Domain Model 102 is used to resolve and organize entities and relationships in a hierarchy. The Language/Domain Model 102 is especially used to disambiguate sense in unstructured data, and it is also useful for disambiguating sense in the structured and semi-structured data contexts. After the entity hierarchy is generated, it is made available to the Pattern Query Processing Module (PQPM) 108.
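  • One possible in-memory shape for the entity hierarchy is sketched below. The specification does not say how the hierarchy is represented, so the `EntityHierarchy` class, its method names and the sample entities are assumptions, and the sketch assumes the hierarchy is a tree with no cycles.

```python
from collections import defaultdict


class EntityHierarchy:
    """Tree of entities stored as child -> parent links (illustrative only)."""

    def __init__(self):
        self.parent = {}
        self.children = defaultdict(set)

    def add(self, child, parent):
        """Record that `child` sits directly below `parent`."""
        self.parent[child] = parent
        self.children[parent].add(child)

    def ancestors(self, entity):
        """Walk upward from the entity to the root, nearest ancestor first."""
        chain = []
        while entity in self.parent:
            entity = self.parent[entity]
            chain.append(entity)
        return chain

    def descendants(self, entity):
        """Collect every entity below the given one."""
        found = set()
        stack = [entity]
        while stack:
            for child in self.children.get(stack.pop(), ()):
                if child not in found:
                    found.add(child)
                    stack.append(child)
        return found


hierarchy = EntityHierarchy()
hierarchy.add("medication", "product")
hierarchy.add("antibiotic", "medication")
hierarchy.add("analgesic", "medication")
```

  • Looking up descendants gives the narrower entities that a query on a broad term could pull in, while looking up ancestors gives the broader context of a specific term.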
  • The relationship store 105 includes a collection of relationship instances and also stores certain information specific to the relationship instances. The relationship hierarchy 106 is a hierarchical arrangement of relationships that are contextually resolved by word-sense disambiguation with the help of the Language/Domain Model 102. The relationship store 105 and the relationship hierarchy 106 function in conjunction with the Pattern Query Processing Module (PQPM) 108.
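  • As an illustration of how a relationship hierarchy could make relationship labels from different documents comparable, the sketch below climbs from a surface label to a chosen canonical level. The `parent_of` mapping, the `canonical` set and the sample labels are hypothetical; in the described system the resolution would come from word-sense disambiguation against the language/domain model, which is not modelled here.

```python
def resolve_relationship(label, parent_of, canonical):
    """Climb the relationship hierarchy until a canonical node is reached.

    Labels with no known parent are returned unchanged.
    """
    seen = set()  # guards against accidental cycles in the mapping
    while label not in canonical and label in parent_of and label not in seen:
        seen.add(label)
        label = parent_of[label]
    return label


# Hypothetical hierarchy: surface forms -> canonical relations -> broader relation.
parent_of = {
    "bought": "buys",
    "purchased": "buys",
    "buys": "acquires",
    "rented": "leases",
    "leases": "acquires",
}
canonical = {"buys", "leases"}

raw_sequence = ["purchased", "rented", "bought"]
normalized = [resolve_relationship(r, parent_of, canonical) for r in raw_sequence]
```

  • Stopping at a canonical level rather than the root is a design choice: resolving everything to the broadest relationship would merge relationships that the pattern query may need to keep apart.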
  • The Pattern Query Processing Module (PQPM) 108 receives a pattern query inputted through a query interface 107 and performs processing as per the required information. The pattern query comprises a list of entities and relationships. The PQPM 108 consults the entity store 103 and the entity hierarchy and expands the pattern query in terms of entities. This entity expansion process involves disambiguating the entities in the pattern query, including the synonyms and implied entities in query expansion, making a context resolution to include the similar and exclude the dissimilar entities.
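  • The entity expansion described above can be sketched as a closure over synonym and narrower-term links with an exclusion set. The function name, the three lookup structures and the sample vocabulary are assumptions; in particular, the `excluded` set stands in for the outcome of context resolution, which the specification attributes to the language/domain model.

```python
def expand_entities(query_entities, synonyms, narrower, excluded=frozenset()):
    """Expand query entities with synonyms and implied entities, minus exclusions.

    synonyms: dict mapping an entity to equivalent terms
    narrower: dict mapping an entity to implied (child) entities
    excluded: terms judged dissimilar in the current context
    """
    expanded = set()
    pending = list(query_entities)
    while pending:
        term = pending.pop()
        if term in expanded or term in excluded:
            continue
        expanded.add(term)
        pending.extend(synonyms.get(term, ()))
        pending.extend(narrower.get(term, ()))
    return expanded


synonyms = {"drug": ["medication"]}
narrower = {
    "medication": ["antibiotic", "analgesic"],
    "drug": ["narcotic"],
}
# In a pharmacy retail context, the "narcotic" sense is treated as dissimilar.
expanded = expand_entities(["drug"], synonyms, narrower, excluded={"narcotic"})
```

  • An excluded term is dropped before its own synonyms or narrower terms are followed, so a dissimilar sense does not bring its sub-terms into the expanded query.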
  • The Pattern Generation Module (PGM) 109 comprises a Document Retriever 109 a, a Relationship Sequence Generator 109 b and a Frequent Pattern Growth Pattern Mining Module (FPGMM) 109 c. The document retriever 109 a collects all documents pertaining to the entities/relationships contained in the pattern query. The Relationship Sequence Generator 109 b generates a relationship sequence with respect to each document or data source by treating each relationship as an item. The Relationship Sequence Generator 109 b builds a relationship sequence in the order of appearance in the document. The Frequent Pattern Growth Pattern Mining Module (FPGMM) 109 c uses a Frequent Pattern Growth (FPG) algorithm for processing the pattern query. The FPG algorithm treats the relationship sequences like item-sets and extracts the most frequent item-sets/relationship sequences. The Frequent Pattern Display Module (FPDM) 110 provides for visualizing the most frequent patterns extracted from relationship sequences in conjunction with the entities.
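  • A minimal sketch of the document retrieval step, assuming an inverted index from terms to document identifiers. The index layout and function names are hypothetical, and requiring a document to match both an entity and a relationship is one possible reading of "pertaining to the entities/relationships"; a union of the two sets would be the looser alternative.

```python
from collections import defaultdict


def build_index(doc_terms):
    """Invert {doc_id: terms} into {term: set of doc_ids}."""
    index = defaultdict(set)
    for doc_id, terms in doc_terms.items():
        for term in terms:
            index[term].add(doc_id)
    return index


def retrieve_documents(index, entities, relations):
    """Return ids of documents mentioning a query entity and a query relation."""
    entity_docs = set().union(*(index.get(e, set()) for e in entities))
    relation_docs = set().union(*(index.get(r, set()) for r in relations))
    return entity_docs & relation_docs


doc_terms = {
    "d1": {"customer", "buys", "product"},
    "d2": {"supplier", "ships", "product"},
    "d3": {"customer", "weather"},
}
index = build_index(doc_terms)
hits = retrieve_documents(index, {"customer"}, {"buys", "returns"})
```

  • Only the retrieved documents are passed on to sequence generation, which keeps the mined patterns tied to the context of the query.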
  • FIG. 2 illustrates a flow chart of a method for performing frequent pattern mining in relationship space, according to an embodiment of the present disclosure. In particular, the method comprises processing of big data for recognizing a plurality of entities. The plurality of entities are then extracted and stored in an entity store. The entity store stores meaningful entities extracted out of big data irrespective of the form from which the entity originates (201). Entities are objects that make independent sense. Entities are named and unnamed objects, which include names of living and non-living things, concepts, theories or simply the language units that make independent sense. An entity is either a named entity, such as the name of a place or a person, or a concept that is represented by one or more terms (for example, 'Purchase power', 'Purchase' as a noun and 'Purchase' as a verb are three different concepts). In brief, the entity refers to named entities and concepts (language units with independent meaning). An entity hierarchy is then built by arranging the plurality of entities in a hierarchical manner (202). Further, a set of relationships among a plurality of entities is extracted and stored in a relationship store (203), and a relationship hierarchy is created by arranging the relationships in a hierarchical manner (204).
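  • The distinction drawn above between 'Purchase power', 'Purchase' as a noun and 'Purchase' as a verb suggests that an entity's identity depends on more than its surface text. The sketch below shows one way to key entities so that these remain three separate concepts; the `Entity` record and its `kind` and `pos` fields are assumptions, not a schema given in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical entity key: surface text plus the sense-bearing attributes."""
    text: str
    kind: str  # assumed labels, e.g. "named" or "concept"
    pos: str   # part of speech that separates noun and verb senses


# A set keeps one entry per distinct (text, kind, pos) combination.
entity_store = {
    Entity("purchase power", "concept", "noun"),
    Entity("purchase", "concept", "noun"),
    Entity("purchase", "concept", "verb"),
    Entity("purchase", "concept", "verb"),  # repeated extraction, stored once
}
```

  • Freezing the dataclass makes instances hashable, so equal extractions collapse into a single stored entity while different senses of the same word stay distinct.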
  • The method involves the use of the entity hierarchy and the relationship hierarchy when responding to the pattern query. The pattern query is inputted to a Pattern Query Processing Module (PQPM) for finding frequent patterns related to the entities and relationships in the query (205).
  • The document collector collects the documents that are relevant to the pattern query (206). Based on the contents of the pattern query, the Relationship Sequence Generator generates a relationship sequence for each of the retrieved documents (207). The PGM adopts a Frequent Pattern Growth Module (FPGM) for identifying the frequent patterns among the relationship sequences (208). Finally, the identified patterns are displayed on a Frequent Pattern Display Module (FPDM) (209).
  • FIG. 3 is a flow diagram illustrating a method for extracting frequent patterns, according to an embodiment of the present disclosure. The method comprises receiving a pattern query in a Pattern Query Processing Module (PQPM). The PQPM processes the pattern query and communicates with a Pattern Generation Module (PGM). The PGM comprises three subunits: a Document Retriever, a Relationship Sequence Generator and a Frequent Pattern Growth Module (FPGM). Once the PGM receives the command from the PQPM, the document retriever starts collecting one or more documents (301). The one or more documents are related to the one or more entities and relationships contained in the pattern query. Once the related documents are collected, the Relationship Sequence Generator builds a relationship sequence in the order in which the relationships appear in the document (302). The relationship sequences, which resemble "item-sets", enable frequent item-set mining. The item-sets comprise relationship sequences in an orderly manner for easy processing. Once an ordered item-set is built, the Frequent Pattern Growth Module (FPGM) mines for the required pattern as desired by the pattern query (303). The resulting frequent relationship sequences are then displayed by the Frequent Pattern Display Module (FPDM).
  • FIG. 4 is a flow chart illustrating a method for processing the pattern query, according to an embodiment of the present disclosure. The pattern query is raised by a user and inputted to a Pattern Query Processing Module (PQPM). Depending on the content of the pattern query, the PQPM expands the pattern query in terms of entities by referring to the entity list and the entity hierarchy (401). Expanding the pattern query in terms of entities includes the steps of disambiguating the entities in the pattern query, including synonyms and implied entities in the query expansion and performing context resolution by including similar entities and discarding dissimilar entities. The PQPM then expands the pattern query in terms of relationships based on the relationship hierarchy (402). Here, expanding the pattern query in terms of relationships includes resolving relationships according to the context, including the relationships which imply context similarity, including the relationships that are implied within the syntactic similarity and discarding the contextually and syntactically dissimilar relationships.
  • The embodiments of the present invention disclose an approach that looks for patterns in the relationship space. The embodiments of the present disclosure provide a robust approach to finding patterns and ensure effective context resolution. The entities and relationships among the entities assist in understanding the big data. All the entities and relationships are derived and collected. This collection of entities and relationships serves as input to all intelligent processing of data. Data mining and data analysis applications, forecasting, predictive analytics applications and machine learning applications make use of the patterns to learn further insights. The embodiments herein enable an enterprise that intends to facilitate processing of big data and build applications on top. The embodiments herein also allow building of domain-specific, niche applications that harness big data. The embodiments herein provide significant benefit to sectors including, but not limited to, retail, health and pharmaceutical services, banking and insurance.
  • The foregoing description of the specific embodiments will so fully reveal the general nature of the embodiments herein that others can, by applying current knowledge, readily modify and/or adapt such specific embodiments for various applications without departing from the generic concept, and, therefore, such adaptations and modifications should be, and are intended to be, comprehended within the meaning and range of equivalents of the disclosed embodiments. It is to be understood that the phraseology or terminology employed herein is for the purpose of description and not of limitation. Therefore, while the embodiments herein have been described in terms of preferred embodiments, those skilled in the art will recognize that the embodiments herein can be practiced with modification.

Claims (18)

What is claimed is:
1. A system for mining frequent patterns in relationship space from a plurality of relationship sequences extracted from a big data, the system comprising:
a data repository for collecting and storing the big data;
an entity store for collecting and storing a plurality of entities from the big data;
an entity hierarchy for representing a hierarchical structure of entities;
a relationship store for collecting and storing relationship instances between the plurality of entities from the big data;
a relationship hierarchy for representing a hierarchical structure of relationships;
a language/domain model for organizing entities and relationships in a hierarchical manner;
a pattern query Processing Module (PQPM) for expanding a pattern query related to finding patterns in relationships and entities;
a Pattern Generation Module (PGM) to generate frequent patterns from one or more relationship sequences from the data sources collected based on the pattern query; and
a Frequent Pattern Display Module (FPDM) to provide a visual presentation of the mined patterns;
where the pattern generation module performs frequent pattern mining by extracting relevant relationship sequences from the relationship store using the entity hierarchy and the relationship hierarchy.
2. The system according to claim 1, wherein the big data comprises structured, unstructured and semi-structured data from heterogeneous data sources.
3. The system according to claim 1, wherein the entity store is a collection of entities extracted from the big data, wherein the entity store stores information specific to each entity.
4. The system according to claim 1, wherein the entity hierarchy is a hierarchical structure of entities resolved using Natural Language Processing (NLP) techniques with a support of the language/domain model.
5. The system according to claim 1, wherein the relationship store is adapted to store information related to each relationship instance.
6. The system according to claim 1, wherein the Relationship Hierarchy provides a hierarchical arrangement of relationships by resolving the relationships through at least one of a word-sense disambiguation technique, syntactic and semantic similarity and context resolution technique in conjunction with the language/domain model.
7. The system according to claim 1, wherein the pattern query Processing Module (PQPM) processes the pattern query by expanding the pattern query in terms of entities after consulting the entity hierarchy, wherein the pattern query is a list comprising entities and relationships of the entities.
8. The system according to claim 1, wherein the pattern query Processing Module (PQPM) performs an expansion of the pattern query to provide a relevant result by disambiguation of the entities in the pattern query, where the disambiguation of the entities in the pattern query is conducted by identifying explicit and implicit similar entities and ignoring the dissimilar entities.
9. The system according to claim 1, wherein the Pattern Generation Module (PGM) comprises:
a document retriever to collect documents pertaining to the entities and relationships suggested by the query expansion;
a Relationship Sequence Generator to create a relationship sequence with respect to each of the retrieved documents; and
a Frequent Pattern Growth Module (FPGM) for extracting relevant relationship sequences.
10. The system according to claim 9, wherein the Relationship Sequence Generator builds the relationship sequences by treating each relationship as an item, where each relationship sequence comprises the relationships in the order of appearance in the document.
11. The system according to claim 9, wherein the Frequent Pattern Growth Pattern-Mining Module (FPGM) adapts a Frequent Pattern Growth (FPG) algorithm for extracting relevant relationship sequences which considers the relationship sequences as item-sets and extracts the most frequent item-sets.
12. A method for mining frequent patterns from a plurality of relationship sequences extracted from a big data, the method comprising:
extracting a plurality of entities from the big data, where an entity refers to a concept comprising a language unit having an independent meaning;
storing the extracted plurality of entities in an entity store;
extracting and storing one or more relationships among the plurality of entities;
building an entity hierarchy by arranging the plurality of entities in a hierarchical manner;
creating a relationship hierarchy by arranging the relationships in a hierarchical manner;
inputting a pattern query, where the pattern query is a list of entities and the relationship of entities;
expanding the pattern query to include most relevant entities and relationships and ignore irrelevant patterns and relationships;
retrieving relevant data sources from data using the pattern query;
building relationship sequences with respect to one or more retrieved data sources;
extracting frequent patterns from the relationship sequences; and
displaying the frequent patterns on a frequent pattern display module.
13. The method according to claim 12, wherein the big data comprises structured, unstructured and semi-structured data from heterogeneous data sources for enabling data analysis on a single view.
14. The method according to claim 12, wherein generating frequent patterns among the relationship sequences is performed using a Frequent Pattern Growth Algorithm which considers the relationship sequences as item-sets and extracts the most frequent item-sets.
15. The method according to claim 12, wherein the method of extracting frequent patterns comprises:
collecting data sources pertaining to one or more entities and relationships contained in a pattern query;
building a relationship sequence pertaining to each of the data sources by handling each relationship as an item in an item-set that represents a relationship sequence;
building a relationship sequence in an order the relationships appear in the document; and
identifying the frequent relationship sequences.
16. The method according to claim 12, wherein the method of processing the pattern query comprises:
extracting the hierarchy of the plurality of entities;
expanding the pattern query in terms of entities based on the entity hierarchy; and
expanding the pattern query in terms of relationships based on the relationship hierarchy.
17. The method according to claim 16, wherein expanding the pattern query in terms of entities comprises:
disambiguating the entities in the pattern query;
including synonyms and implied entities in the query expansion; and
discarding dissimilar entities.
18. The method according to claim 16, wherein expanding the pattern query in terms of relationships comprises:
resolving relationships according to the context;
including the relationships which imply context similarity;
including the relationships that are implied within the syntactic similarity; and
discarding the contextually and syntactically dissimilar relationships.
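The sequence-building and mining steps recited in claims 10, 11, 14 and 15 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation. The (subject, relationship, object) triple format, the sample documents, and the function names are hypothetical. The mining step also swaps in plain subset counting in place of the Frequent Pattern Growth algorithm named in the claims. Counting finds the same frequent item-sets on small inputs (here up to `max_size` items each) but does not scale the way an FP-tree does.

```python
from collections import Counter
from itertools import combinations


def build_relationship_sequence(triples):
    """Return one document's relationships in their order of appearance.

    `triples` is an assumed format: (subject, relationship, object) tuples
    listed in the order they occur in the document.
    """
    return [relationship for _subject, relationship, _object in triples]


def frequent_itemsets(sequences, min_support, max_size=3):
    """Count item-sets of relationships and keep those meeting min_support.

    Each relationship sequence is treated as an item-set, so order and
    repetition within a sequence are ignored during counting.
    """
    counts = Counter()
    for sequence in sequences:
        items = sorted(set(sequence))
        for size in range(1, min(max_size, len(items)) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {
        itemset: support
        for itemset, support in counts.items()
        if support >= min_support
    }


# Hypothetical retrieved documents, each reduced to extracted triples.
documents = [
    [("aspirin", "treats", "headache"), ("aspirin", "causes", "nausea")],
    [
        ("ibuprofen", "treats", "pain"),
        ("ibuprofen", "causes", "ulcer"),
        ("ulcer", "requires", "surgery"),
    ],
    [("rest", "treats", "fatigue")],
]

sequences = [build_relationship_sequence(doc) for doc in documents]
patterns = frequent_itemsets(sequences, min_support=2)
```

With these sample documents and a minimum support of 2, `patterns` contains `("treats",)` with support 3, `("causes",)` with support 2, and `("causes", "treats")` with support 2. The relationship `requires` appears in only one document and is dropped.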
US13/755,047 2012-08-10 2013-01-31 System and method for mining patterns from relationship sequences extracted from big data Abandoned US20140046977A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN3286/CHE/2012 2012-08-10
IN3286CH2012 2012-08-10

Publications (1)

Publication Number Publication Date
US20140046977A1 (en) 2014-02-13

Family

ID=50066829

Family Applications (4)

Application Number Title Priority Date Filing Date
US13/755,059 Abandoned US20140046892A1 (en) 2012-08-10 2013-01-31 Method and system for visualizing information extracted from big data
US13/755,062 Expired - Fee Related US9239830B2 (en) 2012-08-10 2013-01-31 System and method for building relationship hierarchy
US13/755,047 Abandoned US20140046977A1 (en) 2012-08-10 2013-01-31 System and method for mining patterns from relationship sequences extracted from big data
US13/755,069 Abandoned US20140046653A1 (en) 2012-08-10 2013-01-31 Method and system for building entity hierarchy from big data

Country Status (1)

Country Link
US (4) US20140046892A1 (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140046653A1 (en) * 2012-08-10 2014-02-13 Xurmo Technologies Pvt. Ltd. Method and system for building entity hierarchy from big data
US20140337086A1 (en) * 2013-05-09 2014-11-13 Rockwell Authomation Technologies, Inc. Risk assessment for industrial systems using big data
US20150278192A1 (en) * 2014-03-25 2015-10-01 Nice-Systems Ltd Language model adaptation based on filtered data
WO2016114433A1 (en) * 2015-01-16 2016-07-21 주식회사 솔트룩스 Unstructured data processing system and method
US9568908B2 (en) 2012-02-09 2017-02-14 Rockwell Automation Technologies, Inc. Industrial automation app-store
WO2017091829A1 (en) * 2015-11-29 2017-06-01 Vatbox, Ltd. System and method for automatic generation of reports based on electronic documents
US9703902B2 (en) 2013-05-09 2017-07-11 Rockwell Automation Technologies, Inc. Using cloud-based data for industrial simulation
US9709978B2 (en) 2013-05-09 2017-07-18 Rockwell Automation Technologies, Inc. Using cloud-based data for virtualization of an industrial automation environment with information overlays
US9786197B2 (en) 2013-05-09 2017-10-10 Rockwell Automation Technologies, Inc. Using cloud-based data to facilitate enhancing performance in connection with an industrial automation system
US9954972B2 (en) 2013-05-09 2018-04-24 Rockwell Automation Technologies, Inc. Industrial data analytics in a cloud platform
US9989958B2 (en) 2013-05-09 2018-06-05 Rockwell Automation Technologies, Inc. Using cloud-based data for virtualization of an industrial automation environment
US10116532B2 (en) 2012-02-09 2018-10-30 Rockwell Automation Technologies, Inc. Cloud-based operator interface for industrial automation
CN109871261A (en) * 2019-02-27 2019-06-11 云南大学 Resource environment big data methods of exhibiting and display platform
US10496061B2 (en) 2015-03-16 2019-12-03 Rockwell Automation Technologies, Inc. Modeling of an industrial automation environment in the cloud
US11042131B2 (en) 2015-03-16 2021-06-22 Rockwell Automation Technologies, Inc. Backup of an industrial automation plant in the cloud
US11163952B2 (en) 2018-07-11 2021-11-02 International Business Machines Corporation Linked data seeded multi-lingual lexicon extraction
US11243505B2 (en) 2015-03-16 2022-02-08 Rockwell Automation Technologies, Inc. Cloud-based analytics for industrial automation
US11513477B2 (en) 2015-03-16 2022-11-29 Rockwell Automation Technologies, Inc. Cloud-based industrial controller
US20230367783A1 (en) * 2021-03-30 2023-11-16 Jio Platforms Limited System and method of data ingestion and processing framework

Families Citing this family (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140122414A1 (en) * 2012-10-29 2014-05-01 Xurmo Technologies Private Limited Method and system for providing a personalization solution based on a multi-dimensional data
US9009197B2 (en) 2012-11-05 2015-04-14 Unified Compliance Framework (Network Frontiers) Methods and systems for a compliance framework database schema
US9626629B2 (en) * 2013-02-14 2017-04-18 24/7 Customer, Inc. Categorization of user interactions into predefined hierarchical categories
US9146980B1 (en) * 2013-06-24 2015-09-29 Google Inc. Temporal content selection
US20150081729A1 (en) * 2013-09-19 2015-03-19 GM Global Technology Operations LLC Methods and systems for combining vehicle data
WO2015112112A1 (en) * 2014-01-21 2015-07-30 Hewlett-Packard Development Company, L.P. Automatically discovering topology of an information technology (it) infrastructure
US20150242407A1 (en) * 2014-02-22 2015-08-27 SourceThought, Inc. Discovery of Data Relationships Between Disparate Data Sets
US9996607B2 (en) * 2014-10-31 2018-06-12 International Business Machines Corporation Entity resolution between datasets
US9892362B2 (en) 2014-11-18 2018-02-13 International Business Machines Corporation Intelligence gathering and analysis using a question answering system
US11204929B2 (en) 2014-11-18 2021-12-21 International Business Machines Corporation Evidence aggregation across heterogeneous links for intelligence gathering using a question answering system
US11244113B2 (en) 2014-11-19 2022-02-08 International Business Machines Corporation Evaluating evidential links based on corroboration for intelligence analysis
US10318870B2 (en) 2014-11-19 2019-06-11 International Business Machines Corporation Grading sources and managing evidence for intelligence analysis
US11836211B2 (en) 2014-11-21 2023-12-05 International Business Machines Corporation Generating additional lines of questioning based on evaluation of a hypothetical link between concept entities in evidential data
US9727642B2 (en) 2014-11-21 2017-08-08 International Business Machines Corporation Question pruning for evaluating a hypothetical ontological link
US10042837B2 (en) 2014-12-02 2018-08-07 International Business Machines Corporation NLP processing of real-world forms via element-level template correlation
WO2016200373A1 (en) * 2015-06-09 2016-12-15 Hewlett-Packard Development Company, L.P. Generating further groups of events based on similarity values and behavior matching using a representation of behavior
CN104933164B (en) * 2015-06-26 2018-10-09 华南理工大学 In internet mass data name entity between relationship extracting method and its system
CN105701203A (en) * 2016-01-12 2016-06-22 北京中交兴路车联网科技有限公司 Information storage and query method and system for big data clusters
US10204146B2 (en) 2016-02-09 2019-02-12 Ca, Inc. Automatic natural language processing based data extraction
US10042846B2 (en) * 2016-04-28 2018-08-07 International Business Machines Corporation Cross-lingual information extraction program
US10228916B2 (en) * 2016-06-23 2019-03-12 International Business Machines Corporation Predictive optimization of next task through asset reuse
US10331659B2 (en) 2016-09-06 2019-06-25 International Business Machines Corporation Automatic detection and cleansing of erroneous concepts in an aggregated knowledge base
US10558754B2 (en) 2016-09-15 2020-02-11 Infosys Limited Method and system for automating training of named entity recognition in natural language processing
US10606893B2 (en) 2016-09-15 2020-03-31 International Business Machines Corporation Expanding knowledge graphs based on candidate missing edges to optimize hypothesis set adjudication
US10157177B2 (en) * 2016-10-28 2018-12-18 Kira Inc. System and method for extracting entities in electronic documents
CN108536664A (en) * 2017-03-01 2018-09-14 华东师范大学 The knowledge fusion method in commodity field
US10917483B2 (en) * 2017-06-22 2021-02-09 Numberai, Inc. Automated communication-based intelligence engine
US10652592B2 (en) 2017-07-02 2020-05-12 Comigo Ltd. Named entity disambiguation for providing TV content enrichment
CN107861939B (en) * 2017-09-30 2021-05-14 昆明理工大学 Domain entity disambiguation method fusing word vector and topic model
US10204124B1 (en) * 2017-12-20 2019-02-12 Merck Sharp & Dohme Corp. Database indexing and processing
CN108763445B (en) 2018-05-25 2019-09-17 厦门智融合科技有限公司 Construction method, device, computer equipment and the storage medium in patent knowledge library
US11120227B1 (en) * 2019-07-01 2021-09-14 Unified Compliance Framework (Network Frontiers) Automatic compliance tools
US10824817B1 (en) 2019-07-01 2020-11-03 Unified Compliance Framework (Network Frontiers) Automatic compliance tools for substituting authority document synonyms
US10769379B1 (en) 2019-07-01 2020-09-08 Unified Compliance Framework (Network Frontiers) Automatic compliance tools
US11275777B2 (en) 2019-08-22 2022-03-15 International Business Machines Corporation Methods and systems for generating timelines for entities
US11651156B2 (en) * 2020-05-07 2023-05-16 Optum Technology, Inc. Contextual document summarization with semantic intelligence
US11599562B2 (en) 2020-05-07 2023-03-07 Carrier Corporation System and a method for recommending feature sets for a plurality of equipment to a user
US11386270B2 (en) 2020-08-27 2022-07-12 Unified Compliance Framework (Network Frontiers) Automatically identifying multi-word expressions
US11762896B2 (en) 2020-11-16 2023-09-19 International Business Machines Corporation Relationship discovery and quantification
US20230031040A1 (en) 2021-07-20 2023-02-02 Unified Compliance Framework (Network Frontiers) Retrieval interface for content, such as compliance-related content
US11443102B1 (en) 2021-08-13 2022-09-13 Pricewaterhousecoopers Llp Methods and systems for artificial intelligence-assisted document annotation
US11645462B2 (en) * 2021-08-13 2023-05-09 Pricewaterhousecoopers Llp Continuous machine learning method and system for information extraction

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6665669B2 (en) * 2000-01-03 2003-12-16 Db Miner Technology Inc. Methods and system for mining frequent patterns
US7734556B2 (en) * 2002-10-24 2010-06-08 Agency For Science, Technology And Research Method and system for discovering knowledge from text documents using associating between concepts and sub-concepts
US8229883B2 (en) * 2009-03-30 2012-07-24 Sap Ag Graph based re-composition of document fragments for name entity recognition under exploitation of enterprise databases
US8719308B2 (en) * 2009-02-16 2014-05-06 Business Objects, S.A. Method and system to process unstructured data

Family Cites Families (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080250020A1 (en) * 2001-01-20 2008-10-09 Pointcross, Inc Ontological representation of knowledge
US7162465B2 (en) * 2001-12-21 2007-01-09 Tor-Kristian Jenssen System for analyzing occurrences of logical concepts in text documents
US7657540B1 (en) * 2003-02-04 2010-02-02 Seisint, Inc. Method and system for linking and delinking data records
GB2417792B (en) * 2003-03-31 2007-05-09 Baker Hughes Inc Real-time drilling optimization based on mwd dynamic measurements
US20060074980A1 (en) * 2004-09-29 2006-04-06 Sarkar Pte. Ltd. System for semantically disambiguating text information
US7478192B2 (en) * 2004-11-03 2009-01-13 Saffron Technology, Inc. Network of networks of associative memory networks
US7849048B2 (en) * 2005-07-05 2010-12-07 Clarabridge, Inc. System and method of making unstructured data available to structured data analysis tools
US8145677B2 (en) * 2007-03-27 2012-03-27 Faleh Jassem Al-Shameri Automated generation of metadata for mining image and text data
US8594996B2 (en) * 2007-10-17 2013-11-26 Evri Inc. NLP-based entity recognition and disambiguation
US10157195B1 (en) * 2007-11-29 2018-12-18 Bdna Corporation External system integration into automated attribute discovery
US8027948B2 (en) * 2008-01-31 2011-09-27 International Business Machines Corporation Method and system for generating an ontology
US7882126B2 (en) * 2008-02-07 2011-02-01 International Business Machines Corporation Systems and methods for computation of optimal distance bounds on compressed time-series data
US8275608B2 (en) * 2008-07-03 2012-09-25 Xerox Corporation Clique based clustering for named entity recognition system
US20100070442A1 (en) * 2008-09-15 2010-03-18 Siemens Aktiengesellschaft Organizing knowledge data and experience data
CA2646117A1 (en) * 2008-12-02 2010-06-02 Oculus Info Inc. System and method for visualizing connected temporal and spatial information as an integrated visual representation on a user interface
DE112010000947T5 (en) * 2009-03-02 2012-06-14 Borys Evgenijovich Panchenko Method for completely modifiable framework data distribution in the data warehouse, taking into account the preliminary etymological separation of said data
US8326820B2 (en) * 2009-09-30 2012-12-04 Microsoft Corporation Long-query retrieval
US8725666B2 (en) * 2010-02-26 2014-05-13 Lawrence Livermore National Security, Llc. Information extraction system
US8954440B1 (en) * 2010-04-09 2015-02-10 Wal-Mart Stores, Inc. Selectively delivering an article
AU2010355789B2 (en) * 2010-06-24 2016-05-12 Arbitron Mobile Oy Network server arrangement for processing non-parametric, multi-dimensional, spatial and temporal human behavior or technical observations measured pervasively, and related method for the same
EP2715474A4 (en) * 2011-05-24 2015-11-18 Namesforlife Llc Semiotic indexing of digital resources
CA2837765A1 (en) * 2011-06-03 2012-12-06 Live Insite, Inc. System and method for semantic knowledge capture
US20130124193A1 (en) * 2011-11-15 2013-05-16 Business Objects Software Limited System and Method Implementing a Text Analysis Service
US8943004B2 (en) * 2012-02-08 2015-01-27 Adam Treiser Tools and methods for determining relationship values
US20150026159A1 (en) * 2012-03-05 2015-01-22 Evresearch Ltd Digital Resource Set Integration Methods, Interfaces and Outputs
US8880440B2 (en) * 2012-03-09 2014-11-04 Sap Ag Automatic combination and mapping of text-mining services
US20140025626A1 (en) * 2012-04-19 2014-01-23 Avalon Consulting, LLC Method of using search engine facet indexes to enable search-enhanced business intelligence analysis
WO2013162607A1 (en) * 2012-04-27 2013-10-31 Empire Technology Development Llc Multiple variable coverage memory for database indexing
US9189473B2 (en) * 2012-05-18 2015-11-17 Xerox Corporation System and method for resolving entity coreference
WO2013177508A2 (en) * 2012-05-24 2013-11-28 The Keyw Corporation Enterprise-scalable model-based analytics
US20140032574A1 (en) * 2012-07-23 2014-01-30 Emdadur R. Khan Natural language understanding using brain-like approach: semantic engine using brain-like approach (sebla) derives semantics of words and sentences
US20140046892A1 (en) * 2012-08-10 2014-02-13 Xurmo Technologies Pvt. Ltd. Method and system for visualizing information extracted from big data

Cited By (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10018993B2 (en) 2002-06-04 2018-07-10 Rockwell Automation Technologies, Inc. Transformation of industrial data into useful cloud information
US9965562B2 (en) 2012-02-09 2018-05-08 Rockwell Automation Technologies, Inc. Industrial automation app-store
US10749962B2 (en) 2012-02-09 2020-08-18 Rockwell Automation Technologies, Inc. Cloud gateway for industrial automation information and control systems
US10965760B2 (en) 2012-02-09 2021-03-30 Rockwell Automation Technologies, Inc. Cloud-based operator interface for industrial automation
US10139811B2 (en) 2012-02-09 2018-11-27 Rockwell Automation Technologies, Inc. Smart device for industrial automation
US9568908B2 (en) 2012-02-09 2017-02-14 Rockwell Automation Technologies, Inc. Industrial automation app-store
US9568909B2 (en) 2012-02-09 2017-02-14 Rockwell Automation Technologies, Inc. Industrial automation service templates for provisioning of cloud services
US10116532B2 (en) 2012-02-09 2018-10-30 Rockwell Automation Technologies, Inc. Cloud-based operator interface for industrial automation
US11470157B2 (en) 2012-02-09 2022-10-11 Rockwell Automation Technologies, Inc. Cloud gateway for industrial automation information and control systems
US20140046653A1 (en) * 2012-08-10 2014-02-13 Xurmo Technologies Pvt. Ltd. Method and system for building entity hierarchy from big data
US9989958B2 (en) 2013-05-09 2018-06-05 Rockwell Automation Technologies, Inc. Using cloud-based data for virtualization of an industrial automation environment
US10204191B2 (en) 2013-05-09 2019-02-12 Rockwell Automation Technologies, Inc. Using cloud-based data for industrial simulation
US9954972B2 (en) 2013-05-09 2018-04-24 Rockwell Automation Technologies, Inc. Industrial data analytics in a cloud platform
US9709978B2 (en) 2013-05-09 2017-07-18 Rockwell Automation Technologies, Inc. Using cloud-based data for virtualization of an industrial automation environment with information overlays
US10726428B2 (en) 2013-05-09 2020-07-28 Rockwell Automation Technologies, Inc. Industrial data analytics in a cloud platform
US9703902B2 (en) 2013-05-09 2017-07-11 Rockwell Automation Technologies, Inc. Using cloud-based data for industrial simulation
US10026049B2 (en) * 2013-05-09 2018-07-17 Rockwell Automation Technologies, Inc. Risk assessment for industrial systems using big data
US11676508B2 (en) 2013-05-09 2023-06-13 Rockwell Automation Technologies, Inc. Using cloud-based data for industrial automation system training
US20140337086A1 (en) * 2013-05-09 2014-11-13 Rockwell Authomation Technologies, Inc. Risk assessment for industrial systems using big data
US9786197B2 (en) 2013-05-09 2017-10-10 Rockwell Automation Technologies, Inc. Using cloud-based data to facilitate enhancing performance in connection with an industrial automation system
US11295047B2 (en) 2013-05-09 2022-04-05 Rockwell Automation Technologies, Inc. Using cloud-based data for industrial simulation
US10257310B2 (en) 2013-05-09 2019-04-09 Rockwell Automation Technologies, Inc. Industrial data analytics in a cloud platform
US10816960B2 (en) 2013-05-09 2020-10-27 Rockwell Automation Technologies, Inc. Using cloud-based data for virtualization of an industrial machine environment
US10984677B2 (en) 2013-05-09 2021-04-20 Rockwell Automation Technologies, Inc. Using cloud-based data for industrial automation system training
US10564633B2 (en) 2013-05-09 2020-02-18 Rockwell Automation Technologies, Inc. Using cloud-based data for virtualization of an industrial automation environment with information overlays
US20150278192A1 (en) * 2014-03-25 2015-10-01 Nice-Systems Ltd Language model adaptation based on filtered data
US9564122B2 (en) * 2014-03-25 2017-02-07 Nice Ltd. Language model adaptation based on filtered data
WO2016114433A1 (en) * 2015-01-16 2016-07-21 주식회사 솔트룩스 Unstructured data processing system and method
US11042131B2 (en) 2015-03-16 2021-06-22 Rockwell Automation Technologies, Inc. Backup of an industrial automation plant in the cloud
US11243505B2 (en) 2015-03-16 2022-02-08 Rockwell Automation Technologies, Inc. Cloud-based analytics for industrial automation
US11927929B2 (en) 2015-03-16 2024-03-12 Rockwell Automation Technologies, Inc. Modeling of an industrial automation environment in the cloud
US11880179B2 (en) 2015-03-16 2024-01-23 Rockwell Automation Technologies, Inc. Cloud-based analytics for industrial automation
US10496061B2 (en) 2015-03-16 2019-12-03 Rockwell Automation Technologies, Inc. Modeling of an industrial automation environment in the cloud
US11513477B2 (en) 2015-03-16 2022-11-29 Rockwell Automation Technologies, Inc. Cloud-based industrial controller
US11409251B2 (en) 2015-03-16 2022-08-09 Rockwell Automation Technologies, Inc. Modeling of an industrial automation environment in the cloud
US10614528B2 (en) 2015-11-29 2020-04-07 Vatbox, Ltd. System and method for automatic generation of reports based on electronic documents
US10235723B2 (en) * 2015-11-29 2019-03-19 Vatbox, Ltd. System and method for automatic generation of reports based on electronic documents
US20170154027A1 (en) * 2015-11-29 2017-06-01 Vatbox, Ltd. System and method for automatic generation of reports based on electronic documents
WO2017091829A1 (en) * 2015-11-29 2017-06-01 Vatbox, Ltd. System and method for automatic generation of reports based on electronic documents
US10546351B2 (en) 2015-11-29 2020-01-28 Vatbox, Ltd. System and method for automatic generation of reports based on electronic documents
US10614527B2 (en) 2015-11-29 2020-04-07 Vatbox, Ltd. System and method for automatic generation of reports based on electronic documents
US11163952B2 (en) 2018-07-11 2021-11-02 International Business Machines Corporation Linked data seeded multi-lingual lexicon extraction
CN109871261A (en) * 2019-02-27 2019-06-11 云南大学 Resource environment big data methods of exhibiting and display platform
US20230367783A1 (en) * 2021-03-30 2023-11-16 Jio Platforms Limited System and method of data ingestion and processing framework

Also Published As

Publication number Publication date
US20140046877A1 (en) 2014-02-13
US20140046653A1 (en) 2014-02-13
US20140046892A1 (en) 2014-02-13
US9239830B2 (en) 2016-01-19

Similar Documents

Publication Publication Date Title
US20140046977A1 (en) System and method for mining patterns from relationship sequences extracted from big data
Trupthi et al. Sentiment analysis on twitter using streaming API
US9880999B2 (en) Natural language relatedness tool using mined semantic analysis
US20150074112A1 (en) Multimedia Question Answering System and Method
EP3851975A1 (en) Method and apparatus for generating text topics, and electronic device
US20150081277A1 (en) System and Method for Automatically Classifying Text using Discourse Analysis
US20150120777A1 (en) System and Method for Mining Data Using Haptic Feedback
US11468070B2 (en) Method and system for performing context-based search
Yates et al. Extracting adverse drug reactions from social media
US20140108424A1 (en) Data store organizing data using semantic classification
Hamon et al. Querying biomedical linked data with natural language questions
Mahmood et al. Query based information retrieval and knowledge extraction using Hadith datasets
JP2018508075A5 (en)
WO2023211602A1 (en) Exploring entities of interest over multiple data sources using knowledge graphs
Vu et al. Graph-based interactive data federation system for heterogeneous data retrieval and analytics
Malik et al. Text mining life cycle for a spatial reading of Viet Thanh Nguyen's The Refugees (2017)
Vidal et al. Semantic data integration techniques for transforming big biomedical data into actionable knowledge
KR101374195B1 (en) Method for providing deep domain knowledge based on massive science information and apparatus thereof
Zenkert et al. Discovering contextual knowledge with associated information in dimensional structured knowledge bases
Siddiqui et al. A Comprehensive Review on Text Classification and Text Mining Techniques Using Spam Dataset Detection
Aboluwarin et al. Optimizing short message text sentiment analysis for mobile device forensics
Leotta et al. My MOoD, a Multimedia and Multilingual Ontology Driven MAS: Design and First Experiments in the Sentiment Analysis Domain.
De Maio et al. Text Mining Basics in Bioinformatics.
Hamon et al. Natural language question analysis for querying biomedical linked data
Pinto et al. Intelligent and fuzzy systems applied to language & knowledge engineering

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION