US20120278298A9 - System and method for query temporality analysis - Google Patents

Publication number
US20120278298A9
Authority
US
United States
Prior art keywords
query
objects
citations
distribution
temporality
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US13/161,143
Other versions
US20110313986A1
US8892541B2
Inventor
Rishab Aiyer Ghosh
Thomas James Emerson
Lun Ted Cui
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Apple Inc
Original Assignee
Topsy Labs Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US12/628,801 (patent US8244664B2)
Priority claimed from US12/628,791 (patent US8688701B2)
Priority claimed from US12/895,593 (patent US7991725B2)
Priority to US13/161,143 (patent US8892541B2)
Application filed by Topsy Labs Inc
Priority to PCT/US2011/040635 (WO2011159863A1)
Assigned to TOPSY LABS, INC.: assignment of assignors interest (see document for details). Assignors: CUI, LUN TED, EMERSON, THOMAS JAMES, GHOSH, RISHAB AIYER
Publication of US20110313986A1
Publication of US20120278298A9
Assigned to VENTURE LENDING & LEASING V, INC., VENTURE LENDING & LEASING VI, INC., VENTURE LENDING & LEASING VII, INC.: security agreement. Assignors: TOPSY LABS, INC.
Priority to US14/520,872 (patent US10380121B2)
Publication of US8892541B2
Application granted
Assigned to APPLE INC.: assignment of assignors interest (see document for details). Assignors: TOPSY LABS, INC.
Legal status: Active
Adjusted expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24578 Query processing with adaptation to user needs using ranking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/338 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Definitions

  • a person seeking to locate information to assist in a decision, to determine an affinity, and/or identify a dislike can leverage traditional non-electronic data sources (e.g., personal recommendations—which can be few and can be biased) and/or electronic data sources such as web sites, bulletin boards, blogs, and other sources to locate (sometimes rated) data about a particular topic/subject (e.g., where to stay when visiting San Francisco).
  • Such an approach is time consuming and often unreliable, because most electronic data lacks any indicia of the trustworthiness of the source of the information.
  • Influence accrued by persons in such a network of references is subjective. In other words, influence accrued by persons in such a network of references appears differently to each other person in the network, as each person's opinion is formed by their own individual networks of trust.
  • Real world trust networks follow a small-world pattern, that is, where everyone is not connected to everyone else directly, but most people are connected to most other people through a relatively small number of intermediaries or “connectors”. Accordingly, this means that some individuals within the network may disproportionately influence the opinion held by other individuals. In other words, some people's opinions may be more influential than other people's opinions.
  • influence is provided for augmenting reputation, which may be subjective.
  • influence is provided as an objective measure.
  • influence can be useful in filtering opinions, information, and data. It will be appreciated that reputation and influence provide unique advantages in accordance with some embodiments for the ranking of individuals or products or services of any type in any means or form.
  • FIG. 1 depicts an example of a citation graph used to support citation search.
  • FIG. 2 depicts an example of a system diagram to support query temporality analysis.
  • FIG. 3 depicts an example of a flowchart of a process to support query temporality analysis.
  • a new approach is proposed that contemplates systems and methods to determine temporality of a query in order to generate a search result including a list of objects that are not only based on matching of the objects to the query but also based on temporality analysis of the query.
  • the temporality of the query can be defined as the distribution over time of the objects matching the query, i.e., the chronology histogram of the query. Such distribution can be analyzed to provide a classification of the intent of the query.
  • a query with a constant/even distribution of objects over time is most likely seeking knowledge or canonical results, while a query whose objects are concentrated at particular points in time is most likely focused on a specific event, and a query whose object count increases over time mainly reflects the recent interest of a user.
  • Classification of the intent of the query can result either in discrete classification of the query into categories as shown by the non-limiting examples above, or in continuous classification of the query which may be a scalar or vector value resulting from transformations of the chronology histogram.
  • Such classification of the query can be directly communicated to the user or be utilized to perform further operations, which include but are not limited to, choosing different forms of displaying the search result to the user, choosing different methods to determine the search result, and as an input to the search result computation.
  • An illustrative implementation of systems and methods described herein in accordance with some embodiments includes a citation graph 100 as shown in FIG. 1 .
  • the citation graph 100 comprises a plurality of citations 104 , each describing an opinion of the object by a source/subject 102 .
  • the nodes/entities in the citation graph 100 are characterized into two categories, 1) subjects 102 capable of having an opinion or creating/making citations 104 , in which expression of such opinion is explicit, expressed, implicit, or imputed through any other technique; and 2) objects 106 cited by citations 104 , about which subjects 102 have opinions or make citations.
  • Each subject 102 or object 106 in graph 100 represents an influential entity, once an influence score for that node has been determined or estimated. More specifically, each subject 102 may have an influence score indicating the degree to which the subject's opinion influences other subjects and/or a community of subjects, and each object 106 may have an influence score indicating the collective opinions of the plurality of subjects 102 citing the object.
  • subjects 102 representing any entities or sources that make citations may correspond to one or more of the following:
  • some subjects/authors 102 who create the citations 104 can be related to each other, for a non-limiting example, via an influence network or community and influence scores can be assigned to the subjects 102 based on their authorities in the influence network.
  • objects 106 cited by the citations 104 may correspond to one or more of the following: Internet web sites, blogs, videos, books, films, music, image, video, documents, data files, objects for sale, objects that are reviewed or recommended or cited, subjects/authors, natural or legal persons, citations, or any entities that are or may be associated with a Uniform Resource Identifier (URI), or any form of product or service or information of any means or form for which a representation has been made.
  • the links or edges 104 of the citation graph 100 represent different forms of association between the subject nodes 102 and the object nodes 106 , such as citations 104 of objects 106 by subjects 102 .
  • citations 104 can be created by authors citing targets at some point in time and can be one of a link, description, keyword or phrase by a source/subject 102 pointing to a target (subject 102 or object 106).
  • citations may include one or more of the expression of opinions on objects, expressions of authors in the form of Tweets, blog posts, reviews of objects on Internet web sites, Wikipedia entries, postings to social media such as Twitter or Jaiku, postings to websites, postings in the form of reviews, recommendations, or any other form of citation made to mailing lists, newsgroups, discussion forums, comments to websites or any other form of Internet publication.
  • citations 104 can be made by one subject 102 regarding an object 106, such as a recommendation of a website, or a restaurant review, and can be treated as representing an expression of opinion or description. In some embodiments, citations 104 can be made by one subject 102 regarding another subject 102, such as a recommendation of one author by another, and can be treated as representing an expression of trustworthiness. In some embodiments, citations 104 can be made by a certain object 106 regarding other objects, wherein the object 106 is also a subject.
  • citation 104 can be described in the format of (subject, citation description, object, timestamp, type). Citations 104 can be categorized into various types based on the characteristics of subjects/authors 102 , objects/targets 106 and citations 104 themselves. Citations 104 can also reference other citations. The reference relationship among citations is one of the data sources for discovering influence network.
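The tuple format above can be sketched as a small record type. This is only an illustrative sketch: the class name `Citation`, its field names, and the sample values are assumptions made here and do not come from the patent text.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Citation:
    """One edge of the citation graph, mirroring
    (subject, citation description, object, timestamp, type)."""
    subject: str          # citing source/author (hypothetical identifier)
    description: str      # text of the citation
    cited_object: str     # cited target, e.g. something addressable by a URI
    timestamp: datetime   # when the citation was made
    citation_type: str    # e.g. "tweet", "review" (illustrative labels)


example = Citation(
    subject="author_one",
    description="Useful notes on perl",
    cited_object="perl.org",
    timestamp=datetime(2011, 6, 15, tzinfo=timezone.utc),
    citation_type="tweet",
)
```

Keeping the timestamp on every citation is what later makes a chronology histogram possible for a query, a subject, or an object.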
  • FIG. 2 depicts an example of a system diagram to support determination of quality of cited objects in search results based on the influence of the citing subjects.
  • Although the diagrams depict components as functionally separate, such depiction is merely for illustrative purposes. It will be apparent that the components portrayed in this figure can be arbitrarily combined or divided into separate software, firmware and/or hardware components. Furthermore, it will also be apparent that such components, regardless of how they are combined or divided, can execute on the same host or multiple hosts, wherein the multiple hosts can be connected by one or more networks.
  • the system 200 includes at least search engine 202, influence evaluation engine 204, and object selection engine 206.
  • the term engine refers to software, firmware, hardware, or other component that is used to effectuate a purpose.
  • the engine will typically include software instructions that are stored in non-volatile memory (also referred to as secondary memory).
  • the processor executes the software instructions in memory.
  • the processor may be a shared processor, a dedicated processor, or a combination of shared or dedicated processors.
  • a typical program will include calls to hardware components (such as I/O devices), which typically requires the execution of drivers.
  • the drivers may or may not be considered part of the engine, but the distinction is not critical.
  • each of the engines can run on one or more hosting devices (hosts).
  • a host can be a computing device, a communication device, a storage device, or any electronic device capable of running a software component.
  • a computing device can be but is not limited to a laptop PC, a desktop PC, a tablet PC, an iPod, an iPhone, an iPad, Google's Android device, a PDA, or a server machine.
  • a storage device can be but is not limited to a hard disk drive, a flash memory drive, or any portable storage device.
  • a communication device can be but is not limited to a mobile phone.
  • each of the engines, including search engine 202, has a communication interface (not shown), which is a software component that enables the engines to communicate with each other following certain communication protocols, such as the TCP/IP protocol, over one or more communication networks (not shown).
  • the communication networks can be but are not limited to, internet, intranet, wide area network (WAN), local area network (LAN), wireless network, Bluetooth, WiFi, and mobile communication network.
  • The physical connections of the network and the communication protocols are well known to those of skill in the art.
  • search engine 202 accepts a search request in the form of a query from a user and determines temporality of a query in order to generate a search result including a plurality of objects 106 that are not only based on matching of the objects to the query but also based on temporality analysis of the query.
  • FIG. 3 depicts an example of a flowchart of a process to support query temporality analysis. Although this figure depicts functional steps in a particular order for purposes of illustration, the process is not limited to any particular order or arrangement of steps. One skilled in the relevant art will appreciate that the various steps portrayed in this figure could be omitted, rearranged, combined and/or adapted in various ways.
  • the flowchart 300 starts at block 302 where a query is accepted from a user as part of a search request.
  • the flowchart 300 continues to block 304 where a plurality of objects that match the query are retrieved.
  • the flowchart 300 continues to block 306 where distribution over time of the objects (known as a chronology histogram) matching the query is determined for temporality analysis of the query based on timestamp metadata associated with the objects.
  • the flowchart 300 continues to block 308 where the distribution over time of the objects is analyzed to provide a classification of the intent of the query.
  • the flowchart 300 ends at block 310 where a search result including the objects is generated that is not only based on matching of the objects to the query but also based on the classification of the intent of the query.
  • the search engine 202 provides a discrete classification of the intent of the query into various categories, wherein a query with a constant/even distribution of objects over time is most likely seeking knowledge/canonical results and may be classified by the search engine 202 as such, while a query whose objects are concentrated at particular points in time is most likely focused on a specific event, and a query whose object count increases over time mainly reflects the recent interest of a user.
  • the search engine 202 provides a continuous classification of the intent of the query, which can be but is not limited to a scalar or vector value resulting from transformations of a chronology histogram of the query, which represents the distribution over time of the objects matching the query.
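The histogram and classification steps can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the fixed-width bucketing, the two-element feature vector, and the thresholds `peak_share` and `recent_share` are all illustrative choices made here; the text does not specify how the histogram is transformed.

```python
from datetime import datetime, timedelta


def chronology_histogram(timestamps, start, bucket, n_buckets):
    """Count matching objects per fixed-width time bucket, beginning at `start`."""
    counts = [0] * n_buckets
    for ts in timestamps:
        idx = (ts - start) // bucket  # floor division of timedeltas gives an int
        if 0 <= idx < n_buckets:
            counts[idx] += 1
    return counts


def intent_vector(counts):
    """Continuous classification (assumed form): (share of the largest bucket,
    share of objects falling in the later half of the histogram)."""
    total = sum(counts)
    if total == 0:
        return (0.0, 0.0)
    peak = max(counts) / total
    recent = sum(counts[len(counts) // 2:]) / total
    return (peak, recent)


def classify_intent(counts, peak_share=0.5, recent_share=0.7):
    """Discrete classification derived from the vector; thresholds are assumptions."""
    if sum(counts) == 0:
        return "unknown"
    peak, recent = intent_vector(counts)
    if peak >= peak_share:
        return "event"        # objects concentrated at a particular point
    if recent >= recent_share:
        return "recent"       # objects increasing over time
    return "knowledge"        # roughly even distribution
```

For instance, under these assumed thresholds `classify_intent([10, 10, 10, 10])` falls into the "knowledge" branch, while a histogram dominated by a single bucket falls into "event".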
  • the search engine 202 may utilize the temporality analysis of the query, among other criteria and/or factors, to select a “time window” that provides the best search result for a specific query.
  • the time window can be but is not limited to one of: hour (the preceding 60 minutes), day (the preceding 24 hours), week (the preceding 7 days), month (the preceding 30 days), or all (results from as far back as they have been collected). If only one of these windows can be pre-selected and displayed on the results page as the search result at a given time, the purpose of the time window selection is to choose the proper time window (H, D, W, M, or A) to be displayed.
  • the search engine 202 may select a combination of multiple time windows so that the search result may include, for a non-limiting example, mostly canonical objects with a few recent objects.
  • the search engine 202 may select the best time window based on the time distribution of the objects matching the query, i.e., the ratio of the actual count of objects (object count) during that time window to the expected object count for that time window for that specific query so that the temporality of the search result best matches that of the query.
  • the search engine 202 may compute the expected object count for a time window based on actual object counts for the preceding and succeeding time windows, including place-holder time windows that may never be selected but are used for computation purposes only.
  • the expected object count for a time window can be based on the actual time (including the time of day, or time of week) in which the query takes place.
  • the search engine 202 may weight the ratio of expected to actual object count to provide a bias for certain time windows (e.g. day preferred to month).
  • Such time window selection process is based on the assumption that a time window is likely to be of most interest to the user if there are more objects from that window in the search result than would otherwise be expected proportionately from the number of objects in the preceding (next earliest) or following (next latest) windows.
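One way to read the selection rule above is sketched below. It is an assumption-laden illustration, not the patented computation: the expected count for a window is taken here as the window's duration times the mean hourly rate of its neighbouring windows, the `bias` multipliers are hypothetical, and falling back to "A" when no window exceeds its expectation is a choice made for this sketch.

```python
WINDOW_HOURS = {"H": 1, "D": 24, "W": 24 * 7, "M": 24 * 30}
ORDER = ["H", "D", "W", "M"]


def select_time_window(counts, bias=None, fallback="A"):
    """Pick the window whose actual object count most exceeds its expected count.

    counts: dict mapping window name to the actual object count in that window.
    bias:   optional dict of multipliers favouring certain windows (assumed form).
    """
    bias = bias or {}
    rates = [counts[w] / WINDOW_HOURS[w] for w in ORDER]  # objects per hour
    best, best_score = fallback, 1.0
    for i, w in enumerate(ORDER):
        # Neighbours are the preceding and following windows, where they exist.
        neighbours = [rates[j] for j in (i - 1, i + 1) if 0 <= j < len(ORDER)]
        expected = WINDOW_HOURS[w] * sum(neighbours) / len(neighbours)
        if expected > 0:
            ratio = counts[w] / expected
        else:
            ratio = float("inf") if counts[w] > 0 else 0.0
        score = ratio * bias.get(w, 1.0)
        if score > best_score:
            best, best_score = w, score
    return best
```

With a burst in the last hour, such as `{"H": 50, "D": 60, "W": 100, "M": 300}`, the hour window has far more objects than its neighbours would predict, so this sketch selects "H".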
  • search engine 202 enables a citation search process. Unlike “classical web search” approaches, which are object/target-centric and focus only on the relevance of the objects 106 to the search criteria, the search process adopted by search engine 202 is “citation”-centric, retrieving a plurality of citations composed by a plurality of subjects citing a plurality of objects.
  • the classical web search retrieves and ranks objects 106 based on attributes of the objects, while the proposed search approach adds citation 104 and subject/author 102 dimensions.
  • the extra metadata associated with subjects 102 , citations 104 , and objects 106 provide better ranking capability, richer functionality and higher efficiency for the searches.
  • the search engine 202 may accept and enforce various criteria/terms on citation searching, retrieving and ranking, each of which can either be explicitly described by a user or best guessed by the system based on internal statistical data.
  • criteria include but are not limited to,
  • Constraints for the citations, including but not limited to,
  • Type: types of citations
  • the output can be objects, authors or citations of types including but not limited to,
  • Target types: such as web pages, images, videos, people
  • Citation types: such as tweets, comments, blog entries
  • Time bias: recent; point of time; event; general knowledge; auto
  • View point bias: such as general view or perspective of certain people.
  • Type bias: topic type, target type.
  • object selection engine 206 determines temporalities and classifications of one or more of citing subjects/sources and cited objects/targets of the citations in addition to temporality and classification of the query in order to provide a list of selected objects based on one or more of these temporalities. More specifically, the object selection engine 206 may select and rank the objects in part according to one or more of:
  • the object selection engine 206 may determine the temporality of a subject based on its chronology histogram.
  • the chronology histogram of the subject can either be query-dependent—time distribution of citations from the subject that matches a query, e.g. when the source “reuters” makes citations including the query term “nuclear”, or query independent—time distribution of all of the citations from the subject, e.g., every time “reuters” makes a citation in any context.
  • the object selection engine 206 may utilize the temporality analysis to identify whether the subjects of the citations (possibly about a particular topic or query term) are evenly distributed or concentrated over time.
  • the object selection engine 206 may also utilize such temporality analysis to classify the subjects, where classification can result either in discrete classification of the subjects into categories, such as, for non-limiting examples, “regular sources” or “sporadic sources”, or in continuous classification which may be a scalar or vector value resulting from transformations of the chronology histogram.
  • the object selection engine 206 may determine the temporality of an object on the basis of its chronology histogram.
  • the chronology histogram of the object can either be query-dependent—time distribution of citations for the object that matches a query, e.g. when the object “perl.org” is cited along with the query term “perl”, or query independent—time distribution of all of the citations for the object, e.g., every time “perl.org” is cited in any context.
  • the object selection engine 206 may utilize such temporality analysis to identify whether the objects of the citations (possibly about a particular topic or query term) are evenly distributed or concentrated over time.
  • the object selection engine 206 may also utilize such temporality analysis to classify the objects, where classification can result either in discrete classification of the objects into categories, such as, for non-limiting examples, “knowledge” or “event”, or in continuous classification which may be a scalar or vector value resulting from transformations of the chronology histogram.
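The query-dependent and query-independent histograms for a subject or an object can be sketched with one helper. The dictionary keys (`subject`, `object`, `text`, `day`), the substring match on the query term, and the sample rows are assumptions for illustration only.

```python
from collections import Counter


def entity_histogram(citations, field, entity, query=None):
    """Citations per day for one subject or object.

    field:  "subject" or "object", selecting which side of the citation to match.
    query:  if given, only citations whose text contains the term are counted
            (query-dependent); if None, all citations count (query-independent).
    """
    hist = Counter()
    for c in citations:
        if c[field] != entity:
            continue
        if query is not None and query.lower() not in c["text"].lower():
            continue
        hist[c["day"]] += 1
    return dict(hist)


# Hypothetical sample data.
SAMPLE = [
    {"subject": "reuters", "object": "example.org/a", "text": "Nuclear talks resume", "day": "2011-06-01"},
    {"subject": "reuters", "object": "example.org/b", "text": "Markets rally", "day": "2011-06-01"},
    {"subject": "reuters", "object": "example.org/c", "text": "nuclear plant update", "day": "2011-06-02"},
    {"subject": "other", "object": "example.org/a", "text": "nuclear", "day": "2011-06-02"},
]

independent = entity_histogram(SAMPLE, "subject", "reuters")
dependent = entity_histogram(SAMPLE, "subject", "reuters", query="nuclear")
```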
  • the object selection engine 206 may limit the chronology histogram to:
  • the object selection engine 206 may weigh the chronology histogram of a query, a subject or an object based on attributes associated with each citation, or attributes associated with the subject or object of each citation.
  • the attributes include but are not limited to language, location, source, and time (recency) of the citation or the subject or object of the citation.
  • the attributes may be generated, computed, acquired or may be ascribed as metadata to the citation or the subject or object.
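Attribute-based weighting of a chronology histogram might look like the sketch below. The weighting function, the 0.5 discount for non-preferred languages, and the record keys are hypothetical; the text only says that the histogram may be weighted by attributes such as language, location, source, and recency.

```python
from collections import defaultdict


def weighted_histogram(citations, weight):
    """Sum a per-citation weight into each day's bucket instead of counting 1."""
    hist = defaultdict(float)
    for c in citations:
        hist[c["day"]] += weight(c)
    return dict(hist)


def language_weight(citation, preferred="en"):
    """Illustrative weight: full weight for the preferred language, half otherwise."""
    return 1.0 if citation.get("language") == preferred else 0.5


rows = [
    {"day": "2011-06-01", "language": "en"},
    {"day": "2011-06-01", "language": "de"},
    {"day": "2011-06-02", "language": "en"},
]
weighted = weighted_histogram(rows, language_weight)
```

Passing the weight as a function keeps the histogram code unchanged when a different attribute, such as location or recency, is used instead.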
  • the classifications of the query, the subject, and/or the object can be directly communicated to the user and the object selection engine 206 may utilize such classification to perform further operations that include but are not limited to, choosing different forms of displaying the search result to the user (e.g. highlighting “events” or “knowledge”), choosing different methods to determine the search result, and as an input to the search result computation.
  • influence evaluation engine 204 calculates influence scores of entities (subjects 102 and/or objects 106 ), wherein such influence scores can be used to determine at least in part, in combination with other methods and systems, the ranking of any subset of objects 106 obtained from a plurality of citations 104 from citation search results.
  • influence evaluation engine 204 measures influence and reputation of subjects 102 that compose the plurality of citations 104 citing the plurality of objects 106 on dimensions that are related to, for non-limiting examples, one or more of the specific topic or objects (e.g., automobiles or restaurants) cited by the subjects, or form of citations (e.g., a weblog or Wikipedia entry or news article or Twitter feed), or search terms (e.g., key words or phrases specified in order to define a subset of all entities that match the search term(s)), in which a subset of the ranked entities are made available based on selection criteria, such as the rank, date or time, or geography/location associated with the entity, and/or any other selection criteria.
  • influence evaluation engine 204 determines an influence score for a first subject or source at least partly based on how often the first subject is cited or referenced by one or more other (second) subjects.
  • each of the first or the second subject can be but is not limited to an internet author or user of social media services, while each citation describes reference by the second subject to a citation of an object by the first subject.
  • the number of the citations or the citation score of the first subject by the second subjects is computed and the influence of the second subjects citing the first subject can also be optionally taken into account in the citation score.
  • the influence score of the first subject is computed as a function of some or all of: the number of citations of the first subject by second subjects, a score for each such citation, and the influence score of the second subjects.
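As one possible instance of such a function, the sketch below sums, over every citation of the first subject, the citation's score multiplied by the citing subject's influence. The multiplicative form and the default influence of 1.0 for unknown citers are assumptions; the text leaves the function open.

```python
def influence_score(citations_of_first, influence_of, default_influence=1.0):
    """Influence of a first subject from citations made by second subjects.

    citations_of_first: list of (second_subject, citation_score) pairs.
    influence_of:       dict mapping second subjects to their influence scores.
    """
    return sum(
        score * influence_of.get(second, default_influence)
        for second, score in citations_of_first
    )


# Hypothetical data: "b" cites the first subject twice, "c" once.
cites = [("b", 1.0), ("c", 2.0), ("b", 1.0)]
total = influence_score(cites, {"b": 3.0, "c": 0.5})
```

Setting every citation score and every citer influence to 1 reduces this to a plain citation count, which is the simplest case the text allows.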
  • the influence of the first subject, as reflected by the count of citations or the citation score of the first subject, can be displayed to the user at a location associated with the first subject, such as the “profile page” of the first subject, together with a list of the second subjects citing the first subject, which can be optionally ranked by the influence of the second subjects.
  • influence evaluation engine 204 allows for the attribution of influence on subjects 102 to data sources (e.g., sources of opinions, data, or referrals) to be estimated and distributed/propagated based on the citation graph 100 .
  • an entity can be directly linked to any number of other entities on any number of dimensions in the citation graph 100 , with each link possibly having an associated score.
  • a path on a given dimension between two entities, such as a subject 102 and an object 106 includes a directed or an undirected link from the source to an intermediate entity, prefixed to a directed or undirected path from the intermediate entity to the object 106 in the same or possibly a different dimension.
  • influence evaluation engine 204 estimates the influence of each entity as the count of actual requests for data, opinion, or searches relating to or originating from other entities, entities with direct links to the entity or with a path in the citation graph, possibly with a predefined maximum length, to the entity; such actual requests being counted if they occur within a predefined period of time and result in the use of the paths originating from the entity (e.g., representing opinions, reviews, citations or other forms of expression) with or without the count being adjusted by the possible weights on each link, the length of each path, and the level of each entity on each path.
  • influence evaluation engine 204 adjusts the influence of each entity by metrics relating to the citation graph comprising all entities or a subset of all linked entities.
  • metrics can include the density of the graph, defined as the ratio of the number of links to the number of linked entities in the graph; such metrics are transformed by mathematical functions optimal to the topology of the graph, such as where it is known that the distribution of links among entities in a given graph may be non-linear.
  • An example of such an adjustment would be the operation of estimating the influence of an entity as the number of directed links connecting to the entity, divided by the logarithm of the density of the citation graph comprising all linked entities. For example, such an operation can provide an optimal method of estimating influence rapidly with a limited degree of computational complexity.
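The adjustment just described can be sketched directly. Two points are assumptions of this sketch: the natural logarithm is used because the text does not name a base, and graphs with density at or below 1 are rejected because the logarithm would then be zero or negative.

```python
import math


def graph_density(edges):
    """Density as defined above: number of links over number of linked entities."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) / len(nodes)


def estimate_influence(edges, entity):
    """Number of directed links into `entity`, divided by log(density)."""
    d = graph_density(edges)
    if d <= 1.0:
        raise ValueError("log(density) is not positive; adjustment undefined here")
    in_degree = sum(1 for _src, dst in edges if dst == entity)
    return in_degree / math.log(d)


# Hypothetical directed graph: 6 links among 3 entities, so density is 2.
EDGES = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "a"), ("b", "c"), ("c", "b")]
influence_c = estimate_influence(EDGES, "c")
```

The estimate needs only one pass over the edge list, which matches the stated goal of limited computational complexity.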
  • influence evaluation engine 204 optimizes the estimation of influence for different contexts and requirements of performance, memory, graph topology, number of entities, and/or any other context and/or requirement, by any combination of the operations described above, and any similar operations involving metrics including but not limited to values comprising: the number of potential source entities to the entity for which influence is to be estimated, the number of potential target entities, the number of potential directed paths between any one entity and any other entity on any or all given dimensions, the number of potential directed paths that include the entity, the number of times within a defined period that a directed link from the entity is used for a scoring, search or other operation(s).
  • object selection engine 206 utilizes influence scores of the citing subjects 102 and the number of their citations 104 to determine the selection and ranking of objects 106 cited by the citations, wherein the objects include but are not limited to documents on the Internet, products, services, data files, legal or natural persons, or any entities in any form or means that can be searched or cited over a network.
  • object selection engine 206 selects and ranks the cited objects based on ranking criteria that include but are not limited to, influence scores of the citing subjects, date or time, geographical location associated with the objects, and/or any other selection criteria.
  • object selection engine 206 calculates and ranks the influence scores of the cited objects based on attributes of one or more of the following scoring components in combination with other attributes of objects including semantic or descriptive data regarding the objects:
  • citing subject Author One has an influence score of 10 and composes Citation 1.1 and Citation 1.2, wherein Citation 1.1 cites Target One once while Citation 1.2 cites Target Two twice; citing subject Author Two has an influence score of 5 and composes Citation 2.1, which cites Target One three times; citing subject Author Three has an influence score of 4 and composes Citation 3.2, which cites Target Two four times.
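The figures in this example can be combined in many ways, and this section does not state the combining rule. The sketch below assumes one simple rule, the sum of (author influence × number of times cited), purely to make the example concrete.

```python
from collections import defaultdict

AUTHOR_INFLUENCE = {"Author One": 10, "Author Two": 5, "Author Three": 4}

# (citation, author, target, times the target is cited), taken from the example.
CITATIONS = [
    ("Citation 1.1", "Author One", "Target One", 1),
    ("Citation 1.2", "Author One", "Target Two", 2),
    ("Citation 2.1", "Author Two", "Target One", 3),
    ("Citation 3.2", "Author Three", "Target Two", 4),
]


def score_objects(citations, influence):
    """Assumed rule: add influence(author) * times_cited for each citation."""
    scores = defaultdict(int)
    for _citation, author, target, times_cited in citations:
        scores[target] += influence[author] * times_cited
    return dict(scores)


scores = score_objects(CITATIONS, AUTHOR_INFLUENCE)
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Under this assumed rule Target One scores 10×1 + 5×3 = 25 and Target Two scores 10×2 + 4×4 = 36, so Target Two ranks first.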
  • FIG. 3 depicts an example of a flowchart of a process to support determination of quality of cited objects in search results based on the influence of the citing subjects.
  • Although FIG. 3 depicts functional steps in a particular order for purposes of illustration, the process is not limited to any particular order or arrangement of steps.
  • One skilled in the relevant art will appreciate that the various steps portrayed in this figure could be omitted, rearranged, combined and/or adapted in various ways.
  • the flowchart 300 starts at block 302 where citation searching, retrieving and ranking criteria and mechanisms are set and adjusted based on user specification and/or internal statistical data.
  • the flowchart 300 continues to block 304 where a plurality of citations of objects that fit the search criteria, such as text match, time filter, author filter, type filter, are retrieved.
  • the flowchart 300 continues to block 306 where influence scores of a plurality of subjects that compose the plurality of citations of objects are calculated.
  • the flowchart 300 continues to block 308 where influence scores of objects in the citations from the search are calculated based on the influence scores of the plurality of subjects and the ranking criteria.
  • the flowchart 300 ends at block 310 where objects are selected as the search result based on the matching of the objects with the searching criteria as well as influence scores of the objects.
  • object selection engine 206 determines the qualities of the cited objects by examining the distribution of influence scores of subjects citing the objects in the search results.
  • one measure of the influence distribution is the ratio of the number of citations from the “influential” and the “non-influential” subjects, where “influential” subjects may, for a non-limiting example, have an influence score higher than a threshold determined by the percentile distribution of all influence scores.
  • Object selection engine 206 accepts only those objects in the citation search results whose ratio of citations from “influential” to “non-influential” subjects is above a certain threshold, while objects whose ratio falls below that threshold can be marked as spam, since they are most likely cited by spam subjects.
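  • The ratio test described above can be illustrated with a minimal sketch. The function names, the fixed influence threshold, and the minimum ratio are assumptions chosen for illustration; the disclosure only states that the influence threshold may be derived from the percentile distribution of all influence scores.

```python
def citation_ratio(citing_scores, influence_threshold):
    """Ratio of citations from influential subjects to those from non-influential ones."""
    influential = sum(1 for score in citing_scores if score > influence_threshold)
    non_influential = len(citing_scores) - influential
    if non_influential == 0:
        # Every citation comes from an influential subject.
        return float("inf")
    return influential / non_influential


def is_probable_spam(citing_scores, influence_threshold, min_ratio):
    """Hypothetical acceptance rule: flag objects whose ratio falls below min_ratio."""
    return citation_ratio(citing_scores, influence_threshold) < min_ratio


# Influence scores of the subjects citing two hypothetical objects.
well_cited = [10, 5, 4, 1, 1]
suspicious = [1, 1, 1, 1, 1, 1, 1, 5]
```

  • With an assumed influence threshold of 4.5 and a minimum ratio of 0.5, the first object is accepted and the second is flagged.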
  • object selection engine 206 calculates and ranks cited objects by treating citations of the objects as connections having positive or negative weights in a weighted citation graph.
  • a citation with implicit positive weight can include, for a non-limiting example, a retweet or a link between individual blog posts or web sites, while a citation with negative weight can include, for a non-limiting example, a statement by one subject 102 that another source is a spammer.
  • object selection engine 206 uses citations with negative weights in a citation graph-based rank/influence calculation approach to propagate negative citation scores through the citation graph. Assigning and propagating citations of negative weights makes it possible to identify clusters of spammers in the citation graph without having each spammer individually identified. Furthermore, identifying subjects/sources 102 with high influence and propagating a few negative citations from such subjects is enough to mark an entire cluster of spammers negatively, thus reducing their influence on the search result.
  • object selection engine 206 presents the generated search results of cited objects to a user who issues the search request or provides the generated search results to a third party for further processing. In some embodiments, object selection engine 206 presents to the user a score computed from a function combining the count of citations and the influence of the subjects of the citations along with the search result of the objects. In some embodiments, object selection engine 206 displays multiple scores computed from functions combining the counts of subsets of citations and the influence of the source of each citation along with the search result, where each subset may be determined by criteria such as the influence of the subjects, or attributes of the subjects or the citations.
  • the following may be displayed to the user—“5 citations from Twitter; 7 citations from people in Japan; and 8 citations in English from influential users.”
  • the subsets above may be selected and/or filtered either by the object selection engine 206 or by users.
  • object selection engine 206 selects for display of every object in the search result, one or more citations and the subjects of the citations on the basis of criteria such as the recency or the influence of their citing subjects relative to the other citations in the search result. Object selection engine 206 then displays the selected citations and/or subjects in such a way that the relationship between the search result, the citations and the subjects of the citations are made transparent to a user.
  • One embodiment may be implemented using a conventional general purpose or a specialized digital computer or microprocessor(s) programmed according to the teachings of the present disclosure, as will be apparent to those skilled in the computer art.
  • Appropriate software coding can readily be prepared by skilled programmers based on the teachings of the present disclosure, as will be apparent to those skilled in the software art.
  • the invention may also be implemented by the preparation of integrated circuits or by interconnecting an appropriate network of conventional component circuits, as will be readily apparent to those skilled in the art.
  • One embodiment includes a computer program product which is a machine readable medium (media) having instructions stored thereon/in which can be used to program one or more hosts to perform any of the features presented herein.
  • the machine readable medium can include, but is not limited to, one or more types of disks including floppy disks, optical discs, DVD, CD-ROMs, micro drive, and magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, DRAMs, VRAMs, flash memory devices, magnetic or optical cards, nanosystems (including molecular memory ICs), or any type of media or device suitable for storing instructions and/or data.
  • the present invention includes software for controlling both the hardware of the general purpose/specialized computer or microprocessor, and for enabling the computer or microprocessor to interact with a human viewer or other mechanism utilizing the results of the present invention.
  • software may include, but is not limited to, device drivers, operating systems, execution environments/containers, and applications.

Abstract

A new approach is proposed that contemplates systems and methods to determine temporality of a query in order to generate a search result including a list of objects that are not only based on matching of the objects to the query but also based on temporality analysis of the query. Here, the temporality of the query can be defined as the distribution over time of the objects matching the query, i.e., the chronology histogram of the query. Such distribution can be analyzed to provide a classification of the intent of the query. Classification of the intent of the query can result either in discrete classification of the query into categories, or in continuous classification of the query which may be a scalar or vector value resulting from transformations of the chronology histogram.

Description

    RELATED APPLICATIONS
  • This application claims priority to U.S. Provisional Patent Application No. 61/355,443, filed Jun. 16, 2010, and entitled “A system and method for query temporality analysis,” and is hereby incorporated herein by reference.
  • BACKGROUND
  • Knowledge is increasingly more germane to our exponentially expanding information-based society. Perfect knowledge is the ideal that participants seek to assist in decision making and for determining preferences, affinities, and dislikes. Practically, perfect knowledge about a given topic is virtually impossible to obtain unless the inquirer is the source of all of the information about such topic (e.g., autobiographer). Armed with more information, decision makers are generally best positioned to select a choice that will lead to a desired outcome/result (e.g., which restaurant to go to for dinner). However, as more information is becoming readily available through various electronic communications modalities (e.g., the Internet), one is left to sift through what is amounting to a myriad of data to obtain relevant and, more importantly, trustworthy information to assist in decision making activities. Although there are various tools (e.g., search engines, community boards with various ratings), they lack any indicia of personal trustworthiness (e.g., measure of the source's reputation and/or influence) for the located data.
  • Currently, a person seeking to locate information to assist in a decision, to determine an affinity, and/or identify a dislike can leverage traditional non-electronic data sources (e.g., personal recommendations, which can be few and can be biased) and/or electronic data sources such as web sites, bulletin boards, blogs, and other sources to locate (sometimes rated) data about a particular topic/subject (e.g., where to stay when visiting San Francisco). Such an approach is time consuming and often unreliable because most of the electronic data lacks any indicia of trustworthiness of the source of the information. Failing to find a plethora of (or spot-on) information from immediate non-electronic and/or electronic data source(s), the person making the inquiry is left to make the decision using limited information, which can lead to less than perfect predictions of outcomes and results, and to low levels of satisfaction in undertaking one or more activities for which information was sought.
  • Current practices also do not leverage trustworthiness of information or, stated differently, attribute a value to the influence of the source of data (e.g., referral). With current practices, the entity seeking the data must make a value judgment on the influence of the data source. Such value judgment is generally based on previous experiences with the data source (e.g., rely on Mike's restaurant recommendations as he is a chef and Laura's hotel recommendations in Europe as she lived and worked in Europe for 5 years). Unless the person making the inquiry has an extensive network of references on which to rely to obtain desired data needed to make a decision, most often, the person making the decision is left to take a risk or “roll the dice” based on best available non-attributed (non-reputed) data. Such a prospect often deters certain participants from engaging in a contemplated activity. Influence accrued by persons in such a network of references is subjective. In other words, influence accrued by persons in such a network of references appears differently to each other person in the network, as each person's opinion is formed by their own individual networks of trust.
  • Real world trust networks follow a small-world pattern, that is, where everyone is not connected to everyone else directly, but most people are connected to most other people through a relatively small number of intermediaries or “connectors”. Accordingly, this means that some individuals within the network may disproportionately influence the opinion held by other individuals. In other words, some people's opinions may be more influential than other people's opinions.
  • As referred to herein, influence is provided for augmenting reputation, which may be subjective. In some embodiments, influence is provided as an objective measure. For example, influence can be useful in filtering opinions, information, and data. It will be appreciated that reputation and influence provide unique advantages in accordance with some embodiments for the ranking of individuals or products or services of any type in any means or form.
  • One issue facing an online user is the difficulty of searching for content that actually addresses his/her problem from his/her own perspective or from the perspective of someone whose opinion the user values highly. Even when the user is able to find content that is relevant to his/her problem, such content is most likely to be of the “time neutral” type, in which the search results are not categorized based on their timing.
  • The foregoing examples of the related art and limitations related therewith are intended to be illustrative and not exclusive. Other limitations of the related art will become apparent upon a reading of the specification and a study of the drawings.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 depicts an example of a citation graph used to support citation search.
  • FIG. 2 depicts an example of a system diagram to support query temporality analysis.
  • FIG. 3 depicts an example of a flowchart of a process to support query temporality analysis.
  • DETAILED DESCRIPTION OF EMBODIMENTS
  • The approach is illustrated by way of example and not by way of limitation in the figures of the accompanying drawings in which like references indicate similar elements. It should be noted that references to “an” or “one” or “some” embodiment(s) in this disclosure are not necessarily to the same embodiment, and such references mean at least one.
  • A new approach is proposed that contemplates systems and methods to determine temporality of a query in order to generate a search result including a list of objects that are not only based on matching of the objects to the query but also based on temporality analysis of the query. Here, the temporality of the query can be defined as the distribution over time of the objects matching the query, i.e., the chronology histogram of the query. Such distribution can be analyzed to provide a classification of the intent of the query. For non-limiting examples, a query with constant/even distribution of objects over time is most likely intended for knowledge or canonical, while a query with distribution of objects concentrated at particular points is most likely focused on a specific event, and a query with distribution of objects increased over time mainly reflects the recent interest of a user. Classification of the intent of the query can result either in discrete classification of the query into categories as shown by the non-limiting examples above, or in continuous classification of the query which may be a scalar or vector value resulting from transformations of the chronology histogram. Such classification of the query can be directly communicated to the user or be utilized to perform further operations, which include but are not limited to, choosing different forms of displaying the search result to the user, choosing different methods to determine the search result, and as an input to the search result computation.
  • Citation Graph
  • An illustrative implementation of systems and methods described herein in accordance with some embodiments includes a citation graph 100 as shown in FIG. 1. In the example of FIG. 1, the citation graph 100 comprises a plurality of citations 104, each describing an opinion of the object by a source/subject 102. The nodes/entities in the citation graph 100 are characterized into two categories, 1) subjects 102 capable of having an opinion or creating/making citations 104, in which expression of such opinion is explicit, expressed, implicit, or imputed through any other technique; and 2) objects 106 cited by citations 104, about which subjects 102 have opinions or make citations. Each subject 102 or object 106 in graph 100 represents an influential entity, once an influence score for that node has been determined or estimated. More specifically, each subject 102 may have an influence score indicating the degree to which the subject's opinion influences other subjects and/or a community of subjects, and each object 106 may have an influence score indicating the collective opinions of the plurality of subjects 102 citing the object.
  • In some embodiments, subjects 102 representing any entities or sources that make citations may correspond to one or more of the following:
      • Representations of a person, web log, and entities representing Internet authors or users of social media services including one or more of the following: blogs, Twitter, or reviews on Internet web sites;
      • Users of microblogging services such as Twitter;
      • Users of social networks such as MySpace or Facebook, bloggers;
      • Reviewers, who provide expressions of opinion, reviews, or other information useful for the estimation of influence.
  • In some embodiments, some subjects/authors 102 who create the citations 104 can be related to each other, for a non-limiting example, via an influence network or community and influence scores can be assigned to the subjects 102 based on their authorities in the influence network.
  • In some embodiments, objects 106 cited by the citations 104 may correspond to one or more of the following: Internet web sites, blogs, videos, books, films, music, image, video, documents, data files, objects for sale, objects that are reviewed or recommended or cited, subjects/authors, natural or legal persons, citations, or any entities that are or may be associated with a Uniform Resource Identifier (URI), or any form of product or service or information of any means or form for which a representation has been made.
  • In some embodiments, the links or edges 104 of the citation graph 100 represent different forms of association between the subject nodes 102 and the object nodes 106, such as citations 104 of objects 106 by subjects 102. For non-limiting examples, citations 104 can be created by authors citing targets at some point of time and can be one of link, description, keyword or phrase by a source/subject 102 pointing to a target (subject 102 or object 106). Here, citations may include one or more of the expression of opinions on objects, expressions of authors in the form of Tweets, blog posts, reviews of objects on Internet web sites, Wikipedia entries, postings to social media such as Twitter or Jaiku, postings to websites, postings in the form of reviews, recommendations, or any other form of citation made to mailing lists, newsgroups, discussion forums, comments to websites or any other form of Internet publication.
  • In some embodiments, citations 104 can be made by one subject 102 regarding an object 106, such as a recommendation of a website, or a restaurant review, and can be treated as representing an expression of opinion or description. In some embodiments, citations 104 can be made by one subject 102 regarding another subject 102, such as a recommendation of one author by another, and can be treated as representing an expression of trustworthiness. In some embodiments, citations 104 can be made by certain object 106 regarding other objects, wherein the object 106 is also a subject.
  • In some embodiments, citation 104 can be described in the format of (subject, citation description, object, timestamp, type). Citations 104 can be categorized into various types based on the characteristics of subjects/authors 102, objects/targets 106 and citations 104 themselves. Citations 104 can also reference other citations. The reference relationship among citations is one of the data sources for discovering influence network.
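  • The five-field citation format above can be sketched as a simple record type. The class name, field names, and sample values below are assumptions for illustration only; the disclosure specifies just the tuple (subject, citation description, object, timestamp, type).

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Citation:
    """One edge of the citation graph: (subject, description, object, timestamp, type)."""
    subject: str         # citing source/author 102, e.g. a user handle
    description: str     # citation text, keyword, or phrase
    target: str          # cited object 106 (or another subject), e.g. a URI
    timestamp: datetime  # when the citation was made
    kind: str            # citation type, e.g. "tweet", "comment", "blog entry"


# Hypothetical sample record.
example = Citation(
    subject="reuters",
    description="nuclear talks resume",
    target="http://example.com/story",
    timestamp=datetime(2011, 6, 15, 12, 0, tzinfo=timezone.utc),
    kind="tweet",
)
```

  • Because a target may itself be a subject or another citation, a fuller model would likely use identifiers that can refer to any node type rather than plain strings.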
  • FIG. 2 depicts an example of a system diagram to support determination of quality of cited objects in search results based on the influence of the citing subjects. Although the diagrams depict components as functionally separate, such depiction is merely for illustrative purposes. It will be apparent that the components portrayed in this figure can be arbitrarily combined or divided into separate software, firmware and/or hardware components. Furthermore, it will also be apparent that such components, regardless of how they are combined or divided, can execute on the same host or multiple hosts, and wherein the multiple hosts can be connected by one or more networks.
  • In the example of FIG. 2, the system 200 includes at least search engine 202, influence evaluation engine 204, and object selection engine 206. As used herein, the term engine refers to software, firmware, hardware, or other component that is used to effectuate a purpose. The engine will typically include software instructions that are stored in non-volatile memory (also referred to as secondary memory). When the software instructions are executed, at least a subset of the software instructions is loaded into memory (also referred to as primary memory) by a processor. The processor then executes the software instructions in memory. The processor may be a shared processor, a dedicated processor, or a combination of shared or dedicated processors. A typical program will include calls to hardware components (such as I/O devices), which typically requires the execution of drivers. The drivers may or may not be considered part of the engine, but the distinction is not critical.
  • In the example of FIG. 2, each of the engines can run on one or more hosting devices (hosts). Here, a host can be a computing device, a communication device, a storage device, or any electronic device capable of running a software component. For non-limiting examples, a computing device can be but is not limited to a laptop PC, a desktop PC, a tablet PC, an iPod, an iPhone, an iPad, Google's Android device, a PDA, or a server machine. A storage device can be but is not limited to a hard disk drive, a flash memory drive, or any portable storage device. A communication device can be but is not limited to a mobile phone.
  • In the example of FIG. 2, search engine 202, influence evaluation engine 204, and object selection engine 206 each has a communication interface (not shown), which is a software component that enables the engines to communicate with each other following certain communication protocols, such as TCP/IP protocol, over one or more communication networks (not shown). Here, the communication networks can be but are not limited to, internet, intranet, wide area network (WAN), local area network (LAN), wireless network, Bluetooth, WiFi, and mobile communication network. The physical connections of the network and the communication protocols are well known to those of skill in the art.
  • Temporality Analysis
  • In the example of FIG. 2, search engine 202 accepts a search request in the form of a query from a user and determines temporality of a query in order to generate a search result including a plurality of objects 106 that are not only based on matching of the objects to the query but also based on temporality analysis of the query. FIG. 3 depicts an example of a flowchart of a process to support query temporality analysis. Although this figure depicts functional steps in a particular order for purposes of illustration, the process is not limited to any particular order or arrangement of steps. One skilled in the relevant art will appreciate that the various steps portrayed in this figure could be omitted, rearranged, combined and/or adapted in various ways.
  • In the example of FIG. 3, the flowchart 300 starts at block 302 where a query is accepted from a user as part of a search request. The flowchart 300 continues to block 304 where a plurality of objects that match the query are retrieved. The flowchart 300 continues to block 306 where distribution over time of the objects (known as a chronology histogram) matching the query is determined for temporality analysis of the query based on timestamp metadata associated with the objects. The flowchart 300 continues to block 308 where the distribution over time of the objects is analyzed to provide a classification of the intent of the query. The flowchart 300 ends at block 310 where a search result including the objects is generated that is not only based on matching of the objects to the query but also based on the classification of the intent of the query.
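  • Block 306 of the flowchart can be illustrated with a minimal sketch that buckets object timestamps into a chronology histogram. The function name, the one-day bucket size, and the seven-bucket span are assumptions; the disclosure does not prescribe a bucket width.

```python
from collections import Counter
from datetime import datetime, timedelta


def chronology_histogram(timestamps, now, bucket=timedelta(days=1), n_buckets=7):
    """Count objects per time bucket; index 0 is the oldest bucket, -1 the most recent."""
    counts = Counter()
    for ts in timestamps:
        age = now - ts
        # Ignore timestamps in the future or older than the histogram span.
        if timedelta(0) <= age < bucket * n_buckets:
            counts[age // bucket] += 1  # number of whole buckets elapsed since ts
    return [counts[i] for i in reversed(range(n_buckets))]


# Hypothetical timestamps of objects matching a query.
now = datetime(2011, 6, 16)
stamps = [
    now - timedelta(hours=1),
    now - timedelta(hours=2),
    now - timedelta(days=3),
    now - timedelta(days=10),  # outside the 7-day span, so not counted
]
histogram = chronology_histogram(stamps, now)
```

  • Here `histogram` holds one count per day, oldest first, and is the input to the classification in block 308.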
  • In some embodiments, the search engine 202 provides a discrete classification of the intent of the query into various categories, wherein a query with constant/even distribution of objects over time is most likely intended for knowledge/canonical and may be classified by the search engine 202 as such, while the query with distribution of the objects concentrated at particular points is most likely focused on a specific event, and a query with distribution of the objects increased over time mainly reflects the recent interest of a user. Alternatively, the search engine 202 provides a continuous classification of the intent of the query, which can be but is not limited to a scalar or vector value resulting from transformations of a chronology histogram of the query, which represents the distribution over time of the objects matching the query.
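  • The discrete and continuous classifications described above might be sketched as follows. The feature definitions (share of the largest bucket, share of the more recent half), the category labels, and the 0.5 threshold are assumptions made for illustration and are not taken from the disclosure.

```python
def temporality_features(hist):
    """Continuous classification: (peak share, recent share) of a chronology histogram."""
    total = sum(hist)
    if total == 0 or len(hist) < 2:
        return (0.0, 0.0)
    half = len(hist) // 2
    peak_share = max(hist) / total            # how concentrated the distribution is
    recent_share = sum(hist[-half:]) / total  # weight of the most recent half
    return (peak_share, recent_share)


def classify_intent(hist, peak_threshold=0.5):
    """Discrete classification into 'knowledge', 'event', 'recent', or 'unknown'."""
    if sum(hist) == 0 or len(hist) < 2:
        return "unknown"
    # Counts that never decrease and end higher than they start: growing interest.
    rising = all(a <= b for a, b in zip(hist, hist[1:])) and hist[-1] > hist[0]
    if rising:
        return "recent"
    peak_share, _ = temporality_features(hist)
    if peak_share >= peak_threshold:
        return "event"      # objects concentrated at a particular point in time
    return "knowledge"      # roughly even distribution over time
```

  • A production classifier would likely smooth the histogram and tolerate noise in the trend test; the strict monotonicity check here is only the simplest possible stand-in.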
  • In some embodiments, the search engine 202 may utilize the temporality analysis of the query, among other criteria and/or factors, to select a “time window” that provides the best search result for a specific query. Here, the time window can be but is not limited to one of: hour (the preceding/past 60 minutes), day (the preceding 24 hours), week (the preceding 7 days), month (the preceding 30 days), or all (results from as far back as they have been collected). However, if only one of these windows can be pre-selected and displayed on the results page as the search result at a given time, the purpose of the time window selection is to choose the proper time window (H, D, W, M, or A) to be displayed. In some embodiments, the search engine 202 may select a combination of multiple time windows so that the search result may include, for a non-limiting example, mostly canonical objects with a few recent objects.
  • In some embodiments, among a given set of time windows, such as the past hour, day, week, month and “all-time”, the search engine 202 may select the best time window based on the time distribution of the objects matching the query, i.e., the ratio of the actual count of objects (object count) during that time window to the expected object count for that time window for that specific query, so that the temporality of the search result best matches that of the query. Here, the search engine 202 may compute the expected object count for a time window based on actual object counts for the preceding and succeeding time windows, including place-holder time windows that may never be selected but are used for computation purposes only. The expected object count for a time window can be based on the actual time (including the time of day, or time of week) in which the query takes place. The search engine 202 may weight the ratio of actual to expected object count to provide a bias for certain time windows (e.g., day preferred to month). Such time window selection process is based on the assumption that a time window is likely to be of most interest to the user if there are more objects from that window in the search result than would otherwise be expected proportionately from the number of objects in the preceding (next earliest) or following (next latest) windows. For a non-limiting example, a term currently having 835 matching objects in the week window should have a proportionate number of 835/7≈119 in the day window. If in fact the day window has 472 objects, far in excess of the number 119 that would have been expected based on the week window, the day window is likely to be of most interest to the user and that is the window whose matching objects should be displayed as the search result.
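  • The window selection above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the expected count for a window is derived only from the next larger window (scaled by duration), the optional `bias` weights and the fallback to “all” when no ratio exceeds 1 are hypothetical, and all counts other than the 835 and 472 from the example are invented.

```python
WINDOW_HOURS = {"hour": 1, "day": 24, "week": 24 * 7, "month": 24 * 30}
ORDER = ["hour", "day", "week", "month", "all"]


def select_time_window(counts, all_hours, bias=None):
    """Return (best_window, ratios), where ratio = weighted actual / expected count."""
    bias = bias or {}
    hours = dict(WINDOW_HOURS, all=all_hours)
    ratios = {}
    for smaller, larger in zip(ORDER, ORDER[1:]):
        # Expected count if objects were spread evenly across the larger window.
        expected = counts[larger] * hours[smaller] / hours[larger]
        if expected > 0:
            ratios[smaller] = bias.get(smaller, 1.0) * counts[smaller] / expected
    best = max(ratios, key=ratios.get, default="all")
    # Assumed fallback: show all-time results when no window beats expectation.
    return (best if ratios.get(best, 0.0) > 1.0 else "all"), ratios


# 835 objects in the past week and 472 in the past day, as in the example above;
# the remaining counts and the one-year collection span are made up.
counts = {"hour": 15, "day": 472, "week": 835, "month": 2000, "all": 10000}
best, ratios = select_time_window(counts, all_hours=24 * 365)
```

  • With these numbers the day window has roughly four times its expected count of about 119, so it is selected. Passing, for instance, `bias={"day": 1.2}` would tilt the choice further toward the day window.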
  • Citation Search
  • In some embodiments, search engine 202 enables a citation search process. Unlike “classical web search” approaches, which are object/target-centric and focus only on the relevance of the objects 106 to the searching criteria, the search process adopted by search engine 202 is “citation” centric, retrieving a plurality of citations composed by a plurality of subjects citing a plurality of objects. In addition, the classical web search retrieves and ranks objects 106 based on attributes of the objects, while the proposed search approach adds citation 104 and subject/author 102 dimensions. The extra metadata associated with subjects 102, citations 104, and objects 106 provide better ranking capability, richer functionality and higher efficiency for the searches.
  • In some embodiments, the search engine 202 may accept and enforce various criteria/terms on citation searching, retrieving and ranking, each of which can either be explicitly described by a user or best guessed by the system based on internal statistical data. Such criteria include but are not limited to,
  • a) Constraints for the citations, including but not limited to,
  • Description: usually the text search query;
  • Time range of the citations;
  • Author: such as from a particular author or subset of authors;
  • Type: types of citations;
  • b) Types of the cited objects: the output can be objects, authors or citations of the types including but not limited to,
  • Target types: such as web pages, images, videos, people
  • Author types: such as expert for certain topic
  • Citation types: such as tweets, comments, blog entries
  • c) Ranking bias of the cited objects: which can be inferred by the system or specified by the user, including but not limited to,
  • Time bias: recent; point of time; event; general knowledge; auto
  • View point bias: such as general view or perspective of certain people.
  • Type bias: topic type, target type.
  • In the example of FIG. 2, object selection engine 206 determines temporalities and classifications of one or more of citing subjects/sources and cited objects/targets of the citations in addition to temporality and classification of the query in order to provide a list of selected objects based on one or more of these temporalities. More specifically, the object selection engine 206 may select and rank the objects in part according to one or more of:
      • How closely the temporality of the objects fits particular temporality classifications, which may be discrete, such as “knowledge” or continuous scalar or vector temporality classifications;
      • How closely the temporality of the objects fits the temporality of the query;
      • How closely the temporality of the sources of each citation for each object fits particular temporalities, such as a pre-defined temporality or the temporality of the query;
      • How closely the temporality of these subjects fits particular temporalities, such as a pre-defined temporality or the temporality of the query.
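  • One way to quantify “how closely” two temporalities fit is to compare their chronology histograms. The sketch below uses cosine similarity, which is an assumption chosen for illustration; the disclosure does not name a specific similarity measure, and the function names are hypothetical.

```python
import math


def temporality_fit(hist_a, hist_b):
    """Cosine similarity of two equal-length chronology histograms (0.0 to 1.0)."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must cover the same time buckets")
    dot = sum(a * b for a, b in zip(hist_a, hist_b))
    norm = math.sqrt(sum(a * a for a in hist_a)) * math.sqrt(sum(b * b for b in hist_b))
    return dot / norm if norm else 0.0


def rank_by_fit(query_hist, object_hists):
    """Order object names by how closely their histogram fits the query's."""
    return sorted(
        object_hists,
        key=lambda name: temporality_fit(query_hist, object_hists[name]),
        reverse=True,
    )


# Hypothetical histograms: the query spikes in the third bucket.
query_hist = [0, 1, 10, 1]
object_hists = {"steady-page": [5, 5, 5, 5], "event-page": [0, 2, 20, 2]}
ranking = rank_by_fit(query_hist, object_hists)
```

  • In practice such a fit score would be only one input to ranking, combined with influence scores and the other criteria listed above.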
  • Similar to temporality analysis of a query, the object selection engine 206 may determine the temporality of a subject based on its chronology histogram. Here, the chronology histogram of the subject can either be query-dependent—time distribution of citations from the subject that matches a query, e.g. when the source “reuters” makes citations including the query term “nuclear”, or query independent—time distribution of all of the citations from the subject, e.g., every time “reuters” makes a citation in any context.
  • In some embodiments, the object selection engine 206 may utilize the temporality analysis to identify whether the subjects of the citations (possibly about a particular topic or query term) are either evenly or concentratedly distributed over time. The object selection engine 206 may also utilize such temporality analysis to classify the subjects, where classification can result either in discrete classification of the subjects into categories, such as, for non-limiting examples, “regular sources” or “sporadic sources”, or in continuous classification which may be a scalar or vector value resulting from transformations of the chronology histogram.
  • Similar to temporality analysis of a query, the object selection engine 206 may determine the temporality of an object on the basis of its chronology histogram. Here, the chronology histogram of the object can be either query-dependent (the time distribution of citations for the object that match a query, e.g., when the object “perl.org” is cited along with the query term “perl”) or query-independent (the time distribution of all of the citations for the object, e.g., every time “perl.org” is cited in any context). For a non-limiting example, the object selection engine 206 may utilize such temporality analysis to identify whether the citations of the objects (possibly about a particular topic or query term) are evenly or concentratedly distributed over time. The object selection engine 206 may also utilize such temporality analysis to classify the objects, where classification can result either in discrete classification of the objects into categories, such as, in non-limiting examples, “knowledge” or “event”, or in continuous classification, which may be a scalar or vector value resulting from transformations of the chronology histogram.
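  • The query-dependent and query-independent histograms, and the discrete and continuous classifications derived from them, can be sketched as follows. This is an illustration only: the citation record layout (`day`, `text`), the "peak share" scalar, and the 0.5 cutoff are assumptions introduced here, not values given in the disclosure.

```python
from collections import Counter


def chronology_histogram(citations, query_term=None):
    """Count citations per day. With query_term set, only citations whose text
    contains the term are counted (query-dependent); otherwise all are counted."""
    counts = Counter()
    for c in citations:
        if query_term is None or query_term.lower() in c["text"].lower():
            counts[c["day"]] += 1
    return counts


def peak_share(histogram):
    """Continuous (scalar) classification: fraction of citations in the busiest day."""
    total = sum(histogram.values())
    return max(histogram.values()) / total if total else 0.0


def classify(histogram, cutoff=0.5):
    """Discrete classification; the cutoff is a hypothetical tuning parameter."""
    return "event" if peak_share(histogram) >= cutoff else "knowledge"
```

  • The same functions apply to a subject's citations, in which case the discrete labels would be categories such as “regular sources” and “sporadic sources”.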
  • In some embodiments, the object selection engine 206 may limit the chronology histogram to:
      • citations for which subjects have influence above a threshold;
      • citations in specific languages;
      • citations for targets in specific languages;
      • citations from sources in particular locations.
  • In the approaches outlined above, the object selection engine 206 may weigh the chronology histogram of a query, a subject or an object based on attributes associated with each citation, or attributes associated with the subject or object of each citation. Here, the attributes include but are not limited to language, location, source, and time (recency) of the citation or the subject or object of the citation. The attributes may be generated, computed, acquired or may be ascribed as metadata to the citation or the subject or object.
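  • Limiting and weighting the chronology histogram can be sketched as a single pass over the citations. The field names (`day`, `influence`, `language`) and the `weight` callback are hypothetical; the disclosure does not prescribe a record layout or a weighting function.

```python
from collections import defaultdict


def weighted_histogram(citations, min_influence=None, languages=None,
                       weight=lambda citation: 1.0):
    """Build a chronology histogram from citations that pass the filters,
    adding weight(citation) to the bucket of each citation that is kept."""
    histogram = defaultdict(float)
    for c in citations:
        # Keep only citations whose subject's influence is above the threshold.
        if min_influence is not None and c["influence"] <= min_influence:
            continue
        # Keep only citations in the requested languages.
        if languages is not None and c["language"] not in languages:
            continue
        histogram[c["day"]] += weight(c)
    return dict(histogram)
```

  • A recency or location weighting would be supplied through `weight` in the same way as the influence weighting shown in the check below.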
  • In some embodiments, the classifications of the query, the subject, and/or the object can be directly communicated to the user and the object selection engine 206 may utilize such classification to perform further operations that include but are not limited to, choosing different forms of displaying the search result to the user (e.g. highlighting “events” or “knowledge”), choosing different methods to determine the search result, and as an input to the search result computation.
  • Influence Evaluation
  • In the example of FIG. 2, influence evaluation engine 204 calculates influence scores of entities (subjects 102 and/or objects 106), wherein such influence scores can be used to determine at least in part, in combination with other methods and systems, the ranking of any subset of objects 106 obtained from a plurality of citations 104 from citation search results.
  • In some embodiments, influence evaluation engine 204 measures influence and reputation of subjects 102 that compose the plurality of citations 104 citing the plurality of objects 106 on dimensions that are related to, for non-limiting examples, one or more of the specific topic or objects (e.g., automobiles or restaurants) cited by the subjects, or form of citations (e.g., a weblog or Wikipedia entry or news article or Twitter feed), or search terms (e.g., key words or phrases specified in order to define a subset of all entities that match the search term(s)), in which a subset of the ranked entities are made available based on selection criteria, such as the rank, date or time, or geography/location associated with the entity, and/or any other selection criteria.
  • In some embodiments, influence evaluation engine 204 determines an influence score for a first subject or source at least partly based on how often the first subject is cited or referenced by one or more second subjects. Here, each of the first and second subjects can be, but is not limited to, an internet author or a user of social media services, while each citation describes a reference by a second subject to a citation of an object by the first subject. The number of citations, or the citation score, of the first subject by the second subjects is computed, and the influence of the second subjects citing the first subject can optionally be taken into account in the citation score. For a non-limiting example, the influence score of the first subject is computed as a function of some or all of: the number of citations of the first subject by second subjects, a score for each such citation, and the influence scores of the second subjects. Once computed, the influence of the first subject, as reflected by its count of citations or its citation score, can be displayed to the user at a location associated with the first subject, such as the “profile page” of the first subject, together with a list of the second subjects citing the first subject, which can optionally be ranked by the influence of the second subjects.
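  • The text leaves the combining function open. One simple possibility, assumed here purely for illustration, is a sum of per-citation scores weighted by the influence of the citing (second) subject; the function names and the default influence of 1.0 for unknown citers are hypothetical.

```python
def influence_score(citations, influence, default_influence=1.0):
    """citations: list of (second_subject, citation_score) pairs citing the first subject.
    Each citation contributes its score times the citer's own influence."""
    return sum(score * influence.get(citer, default_influence)
               for citer, score in citations)


def ranked_citers(citations, influence, default_influence=1.0):
    """Second subjects citing the first subject, most influential first."""
    citers = {citer for citer, _ in citations}
    return sorted(citers,
                  key=lambda s: influence.get(s, default_influence),
                  reverse=True)
```

  • Setting every citation score and every citer influence to 1 reduces this to a plain count of citations, the other quantity the paragraph mentions for display on a profile page.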
  • In some embodiments, influence evaluation engine 204 allows for the attribution of influence on subjects 102 to data sources (e.g., sources of opinions, data, or referrals) to be estimated and distributed/propagated based on the citation graph 100. More specifically, an entity can be directly linked to any number of other entities on any number of dimensions in the citation graph 100, with each link possibly having an associated score. For a non-limiting example, a path on a given dimension between two entities, such as a subject 102 and an object 106, includes a directed or an undirected link from the source to an intermediate entity, prefixed to a directed or undirected path from the intermediate entity to the object 106 in the same or possibly a different dimension.
  • In some embodiments, influence evaluation engine 204 estimates the influence of each entity as the count of actual requests for data, opinion, or searches relating to or originating from other entities, entities with direct links to the entity or with a path in the citation graph, possibly with a predefined maximum length, to the entity; such actual requests being counted if they occur within a predefined period of time and result in the use of the paths originating from the entity (e.g., representing opinions, reviews, citations or other forms of expression) with or without the count being adjusted by the possible weights on each link, the length of each path, and the level of each entity on each path.
  • In some embodiments, influence evaluation engine 204 adjusts the influence of each entity by metrics relating to the citation graph comprising all entities or a subset of all linked entities. For a non-limiting example, such metrics can include the density of the graph, defined as the ratio of the number of links to the number of linked entities in the graph; such metrics are transformed by mathematical functions optimal to the topology of the graph, such as where it is known that the distribution of links among entities in a given graph may be non-linear. An example of such an adjustment would be the operation of estimating the influence of an entity as the number of directed links connecting to the entity, divided by the logarithm of the density of the citation graph comprising all linked entities. For example, such an operation can provide an optimal method of estimating influence rapidly with a limited degree of computational complexity.
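  • The density-adjusted estimate described above (directed in-links divided by the logarithm of graph density) can be sketched as follows. The natural logarithm is an assumption, since the text does not name a base, and the guard for density at or below 1 is added here because the logarithm is then zero or negative and the quotient is not meaningful.

```python
import math


def graph_density(links):
    """Density as defined in the text: number of links / number of linked entities."""
    entities = {entity for link in links for entity in link}
    return len(links) / len(entities)


def adjusted_influence(entity, links):
    """In-degree of entity divided by log(density); links are (source, target) pairs."""
    density = graph_density(links)
    if density <= 1.0:
        raise ValueError("log(density) is not positive for density <= 1")
    in_degree = sum(1 for _, target in links if target == entity)
    return in_degree / math.log(density)
```

  • Both quantities need only one pass over the link list, which is consistent with the stated goal of estimating influence rapidly with limited computational complexity.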
  • In some embodiments, influence evaluation engine 204 optimizes the estimation of influence for different contexts and requirements of performance, memory, graph topology, number of entities, and/or any other context and/or requirement, by any combination of the operations described in the paragraphs above, and any similar operations involving metrics including but not limited to values comprising: the number of potential source entities to the entity for which influence is to be estimated, the number of potential target entities, the number of potential directed paths between any one entity and any other entity on any or all given dimensions, the number of potential directed paths that include the entity, and the number of times within a defined period that a directed link from the entity is used for a scoring, search or other operation(s).
  • In some embodiments, object selection engine 206 utilizes influence scores of the citing subjects 102 and the number of their citations 104 to determine the selection and ranking of objects 106 cited by the citations, wherein the objects include but are not limited to documents on the Internet, products, services, data files, legal or natural persons, or any entities in any form or means that can be searched or cited over a network. Here, object selection engine 206 selects and ranks the cited objects based on ranking criteria that include but are not limited to, influence scores of the citing subjects, date or time, geographical location associated with the objects, and/or any other selection criteria.
  • In some embodiments, object selection engine 206 calculates and ranks the influence scores of the cited objects based on attributes of one or more of the following scoring components in combination with other attributes of objects including semantic or descriptive data regarding the objects:
      • Subjects of the citations: such as influence scores of the subjects/authors, expertise of the subjects on the given topic, perspective bias of the subjects of the citations.
      • Citations: such as text match quality (e.g., content of citations matching search terms), number of citations, date of the citations, and other citations related to the same cited object, time bias, type bias etc.
  • For a non-limiting example, in the example depicted in FIG. 1, citing subject Author One, who has an influence score of 10, composes Citation 1.1 and Citation 1.2, wherein Citation 1.1 cites Target One once while Citation 1.2 cites Target Two twice; citing subject Author Two, who has an influence score of 5, composes Citation 2.1, which cites Target One three times; citing subject Author Three, who has an influence score of 4, composes Citation 3.2, which cites Target Two four times. Based on the influence scores of the authors alone, object selection engine 206 calculates the influence score of Target One as 10*1+5*3=25, while the influence score of Target Two is calculated as 10*2+4*4=36. Since Target Two has a higher influence score than Target One, it should be ranked higher than Target One in the final search result.
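  • The arithmetic of this example can be reproduced with a few lines of code. The influence scores and citation counts below are taken from the text; the data layout and the function name are illustrative assumptions.

```python
from collections import defaultdict

# Influence scores of the citing subjects, as given in the FIG. 1 example.
author_influence = {"Author One": 10, "Author Two": 5, "Author Three": 4}

# (citing subject, cited object, number of times the citation cites the object)
citations = [
    ("Author One", "Target One", 1),    # Citation 1.1
    ("Author One", "Target Two", 2),    # Citation 1.2
    ("Author Two", "Target One", 3),    # Citation 2.1
    ("Author Three", "Target Two", 4),  # Citation 3.2
]


def object_influence(citations, author_influence):
    """Sum, per object, the citing author's influence times the citation count."""
    scores = defaultdict(int)
    for author, target, count in citations:
        scores[target] += author_influence[author] * count
    return dict(scores)


scores = object_influence(citations, author_influence)
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores)   # {'Target One': 25, 'Target Two': 36}
print(ranking)  # ['Target Two', 'Target One']
```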
  • FIG. 3 depicts an example of a flowchart of a process to support determination of quality of cited objects in search results based on the influence of the citing subjects. Although this figure depicts functional steps in a particular order for purposes of illustration, the process is not limited to any particular order or arrangement of steps. One skilled in the relevant art will appreciate that the various steps portrayed in this figure could be omitted, rearranged, combined and/or adapted in various ways.
  • In the example of FIG. 3, the flowchart 300 starts at block 302 where citation searching, retrieving and ranking criteria and mechanisms are set and adjusted based on user specification and/or internal statistical data. The flowchart 300 continues to block 304 where a plurality of citations of objects that fit the search criteria, such as text match, time filter, author filter, type filter, are retrieved. The flowchart 300 continues to block 306 where influence scores of a plurality of subjects that compose the plurality of citations of objects are calculated. The flowchart 300 continues to block 308 where influence scores of objects in the citations from the search are calculated based on the influence scores of the plurality of subjects and the ranking criteria. The flowchart 300 ends at block 310 where objects are selected as the search result based on the matching of the objects with the searching criteria as well as influence scores of the objects.
  • In some embodiments, object selection engine 206 determines the qualities of the cited objects by examining the distribution of influence scores of subjects citing the objects in the search results. For a non-limiting example, one measure of the influence distribution is the ratio of the number of citations from “influential” subjects to the number of citations from “non-influential” subjects, where “influential” subjects may, for a non-limiting example, have an influence score higher than a threshold determined by the percentile distribution of all influence scores. Object selection engine 206 accepts only those objects in the citation search results whose ratio of citations from “influential” subjects to citations from “non-influential” subjects is above a certain threshold, while objects whose ratio falls below that threshold can be marked as spam, indicating that they are most likely cited by spam subjects.
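  • A minimal sketch of this ratio test follows. Both thresholds are passed in as parameters; in practice the influence threshold would come from the percentile distribution of all influence scores, which is not computed here. The function names are hypothetical, and treating a ratio exactly equal to the minimum as acceptable is an assumption, since the text only covers the "above" and "below" cases.

```python
def citation_ratio(citer_scores, influence_threshold):
    """Ratio of citations from influential subjects to those from non-influential ones."""
    influential = sum(1 for s in citer_scores if s > influence_threshold)
    non_influential = len(citer_scores) - influential
    if non_influential == 0:
        # No non-influential citers: unbounded ratio if there are any citers at all.
        return float("inf") if influential else 0.0
    return influential / non_influential


def filter_objects(citers_by_object, influence_threshold, min_ratio):
    """Split objects into (accepted, spam) lists by their citation ratio."""
    accepted, spam = [], []
    for obj, citer_scores in citers_by_object.items():
        ratio = citation_ratio(citer_scores, influence_threshold)
        (accepted if ratio >= min_ratio else spam).append(obj)
    return accepted, spam
```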
  • In some embodiments, object selection engine 206 calculates and ranks cited objects by treating citations of the objects as connections having positive or negative weights in a weighted citation graph. A citation with implicit positive weight can include, for a non-limiting example, a retweet or a link between individual blog posts or web sites, while a citation with negative weight can include, for a non-limiting example, a statement by one subject 102 that another subject is a spammer.
  • In some embodiments, object selection engine 206 uses citations with negative weights in a citation graph-based rank/influence calculation approach to propagate negative citation scores through the citation graph. Assigning and propagating citations of negative weights makes it possible to identify clusters of spammers in the citation graph without having each spammer individually identified. Furthermore, identifying subjects/sources 102 with high influence and propagating a few negative citations from such subjects is enough to mark an entire cluster of spammers negatively, thus reducing their influence on the search result.
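  • How a single negative citation from an influential subject can mark a whole cluster is easiest to see in a toy propagation. The scheme below is one possible iteration chosen for illustration, not the algorithm of the disclosure: the damping factor, the fixed number of rounds, the seed scores, and the rule that negative citations issued by negatively scored subjects are ignored are all assumptions.

```python
def propagate(edges, seeds, damping=0.5, rounds=10):
    """edges: (source, target, weight) with weight > 0 for endorsements and
    weight < 0 for negative citations. seeds: known starting scores.
    Each round, a node's score is its seed plus damped, weighted scores of its citers."""
    nodes = set(seeds) | {n for s, t, _ in edges for n in (s, t)}
    scores = {n: seeds.get(n, 0.0) for n in nodes}
    for _ in range(rounds):
        updated = {}
        for n in nodes:
            incoming = 0.0
            for source, target, weight in edges:
                if target != n:
                    continue
                # Accusations made by negatively scored subjects are not trusted.
                if scores[source] < 0 and weight < 0:
                    continue
                incoming += scores[source] * weight
            updated[n] = seeds.get(n, 0.0) + damping * incoming
        scores = updated
    return scores
```

  • In the check that accompanies this sketch, one influential subject flags one member of a three-node ring of mutually endorsing subjects; after propagation all three ring members carry negative scores, while a subject the influential source endorses stays positive.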
  • In some embodiments, object selection engine 206 presents the generated search results of cited objects to a user who issues the search request or provides the generated search results to a third party for further processing. In some embodiments, object selection engine 206 presents to the user a score computed from a function combining the count of citations and the influence of the subjects of the citations along with the search result of the objects. In some embodiments, object selection engine 206 displays multiple scores computed from functions combining the counts of subsets of citations and the influence of the source of each citation along with the search result, where each subset may be determined by criteria such as the influence of the subjects, or attributes of the subjects or the citations. For non-limiting examples, the following may be displayed to the user: “5 citations from Twitter; 7 citations from people in Japan; and 8 citations in English from influential users.” The subsets above may be selected and/or filtered either by the object selection engine 206 or by users.
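  • Per-subset counts of this kind can be sketched with labeled predicates. The citation fields (`service`, `country`, `language`, `influential`) and the function names are hypothetical stand-ins for whatever attributes an implementation records.

```python
def subset_counts(citations, subsets):
    """subsets maps a display label to a predicate; returns label -> matching count."""
    return {label: sum(1 for c in citations if matches(c))
            for label, matches in subsets.items()}


def summary(counts):
    """Join the counts into one display string, in the order the subsets were given."""
    return "; ".join(f"{n} citations {label}" for label, n in counts.items())


# Example subsets mirroring the ones quoted in the text.
EXAMPLE_SUBSETS = {
    "from Twitter": lambda c: c["service"] == "twitter",
    "from people in Japan": lambda c: c["country"] == "JP",
    "in English from influential users":
        lambda c: c["language"] == "en" and c["influential"],
}
```

  • Because the subsets are plain predicates, either the engine or the user could supply them, matching the last sentence of the paragraph above.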
  • In some embodiments, object selection engine 206 selects, for display with every object in the search result, one or more citations and the subjects of those citations on the basis of criteria such as the recency of the citations or the influence of their citing subjects relative to the other citations in the search result. Object selection engine 206 then displays the selected citations and/or subjects in such a way that the relationship between the search result, the citations and the subjects of the citations is made transparent to a user.
  • One embodiment may be implemented using a conventional general purpose or a specialized digital computer or microprocessor(s) programmed according to the teachings of the present disclosure, as will be apparent to those skilled in the computer art. Appropriate software coding can readily be prepared by skilled programmers based on the teachings of the present disclosure, as will be apparent to those skilled in the software art. The invention may also be implemented by the preparation of integrated circuits or by interconnecting an appropriate network of conventional component circuits, as will be readily apparent to those skilled in the art.
  • One embodiment includes a computer program product which is a machine readable medium (media) having instructions stored thereon/in which can be used to program one or more hosts to perform any of the features presented herein. The machine readable medium can include, but is not limited to, one or more types of disks including floppy disks, optical discs, DVD, CD-ROMs, micro drive, and magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, DRAMs, VRAMs, flash memory devices, magnetic or optical cards, nanosystems (including molecular memory ICs), or any type of media or device suitable for storing instructions and/or data. Stored on any one of the computer readable medium (media), the present invention includes software for controlling both the hardware of the general purpose/specialized computer or microprocessor, and for enabling the computer or microprocessor to interact with a human viewer or other mechanism utilizing the results of the present invention. Such software may include, but is not limited to, device drivers, operating systems, execution environments/containers, and applications.
  • The foregoing description of various embodiments of the claimed subject matter has been provided for the purposes of illustration and description. It is not intended to be exhaustive or to limit the claimed subject matter to the precise forms disclosed. Many modifications and variations will be apparent to the practitioner skilled in the art. Particularly, while the concept “interface” is used in the embodiments of the systems and methods described above, it will be evident that such concept can be interchangeably used with equivalent software concepts such as class, method, type, module, component, bean, object model, process, thread, and other suitable concepts. While the concept “component” is used in the embodiments of the systems and methods described above, it will be evident that such concept can be interchangeably used with equivalent concepts such as class, method, type, interface, module, object model, and other suitable concepts. Embodiments were chosen and described in order to best describe the principles of the invention and its practical application, thereby enabling others skilled in the relevant art to understand the claimed subject matter, the various embodiments, and the various modifications that are suited to the particular use contemplated.

Claims (51)

1. A system, comprising:
a search engine, which in operation,
accepts a query from a user as part of a search request;
retrieves a plurality of objects that match the query;
determines distribution over time of the objects matching the query for temporality analysis of the query;
analyzes the distribution over time of the objects to provide a classification of the intent of the query;
generates a search result including the objects that are not only based on matching of the objects to the query but also based on the classification of the intent of the query.
2. The system of claim 1, wherein:
each of the plurality of objects is one of: Internet web sites, blogs, videos, books, films, music, image, video, documents, data files, objects for sale, objects that are reviewed or recommended or cited, subjects/authors, natural or legal persons, citations, or any entities that are associated with a Uniform Resource Identifier (URI).
3. The system of claim 1, wherein:
the search engine provides a discrete classification of the intent of the query into various categories.
4. The system of claim 3, wherein:
the search engine classifies the query with constant or even distribution of the objects over time as for knowledge/canonical.
5. The system of claim 3, wherein:
the search engine classifies the query with the distribution of the objects concentrated at particular points as focused on a specific event.
6. The system of claim 3, wherein:
the search engine classifies the query with the distribution of the objects increased over time as reflecting the recent interest of the user.
7. The system of claim 1, wherein:
the search engine provides a continuous classification of the intent of the query.
8. The system of claim 7, wherein:
the continuous classification of the intent of the query is a scalar or vector value resulting from transformations of a chronology histogram of the query, which represents the distribution over time of the objects matching the query.
9. The system of claim 1, wherein:
the search engine utilizes the temporality analysis of the query to select a time window that provides the best search result for the query.
10. The system of claim 9, wherein:
the search engine selects a combination of multiple time windows that provides the best search result for the query.
11. The system of claim 9, wherein:
the search engine selects the best time window based on the time distribution of the objects matching the query among a given set of time windows.
12. The system of claim 1, wherein:
the search engine enables a citation centric search process that retrieves a plurality of citations composed by a plurality of subjects citing the plurality of objects.
13. The system of claim 12, wherein:
each of the plurality of subjects has an opinion wherein expression of the opinion is explicit, expressed, implicit, or imputed through any other technique.
14. The system of claim 12, wherein:
each of the plurality of citations includes one or more of: expression of opinions on the objects, expressions of authors in the form of Tweets, blog posts, reviews of objects on Internet web sites, Wikipedia entries, postings to social media, postings to websites, postings in the form of reviews, recommendations, or any other form of citation made to mailing lists, newsgroups, discussion forums, comments to websites or any other form of Internet publication.
15. The system of claim 12, wherein:
the search engine accepts and enforces a plurality of criteria on citation searching, retrieving and ranking, each of which is either explicitly described by a user or best guessed by the system based on internal statistical data.
16. The system of claim 15, wherein:
the plurality of criteria include one or more of constraints for the plurality of citations, type of the plurality of objects cited, and ranking bias of the cited objects.
17. The system of claim 12, further comprising:
an object selection engine, which in operation,
determines temporalities and classifications of one or more of the citing subjects and the cited objects of the citations in addition to temporality of the query;
provides the list of selected objects based on one or more of these temporalities and classifications.
18. The system of claim 17, wherein:
the object selection engine determines the temporality of a subject based on time distribution of the citations from the subject that match the query.
19. The system of claim 17, wherein:
the object selection engine determines the temporality of a subject based on time distribution of all of the citations from the subject.
20. The system of claim 17, wherein:
the object selection engine utilizes the temporalities to identify whether the subjects of the citations are either evenly or concentratedly distributed over time.
21. The system of claim 17, wherein:
the object selection engine classifies the subjects either into discrete classification of categories or in continuous classification.
22. The system of claim 17, wherein:
the object selection engine determines the temporality of an object based on time distribution of the citations for the object that match the query.
23. The system of claim 17, wherein:
the object selection engine determines the temporality of an object based on time distribution of all of the citations for the object.
24. The system of claim 17, wherein:
the object selection engine utilizes the temporalities to identify whether the objects of the citations are either evenly or concentratedly distributed over time.
25. The system of claim 17, wherein:
the object selection engine classifies the objects either into discrete classification of categories or in continuous classification.
26. The system of claim 17, wherein:
the object selection engine weighs the time distribution of a query, a subject or an object based on attributes associated with each citation, or attributes associated with the subject or object of each citation.
27. The system of claim 26, wherein:
the attributes include one or more of language, location, source, and time of the citation or the subject or object of the citation.
28. The system of claim 17, wherein:
the object selection engine utilizes the classifications of the query, the subject, and/or the object to perform one or more of choosing different forms of displaying the search result to the user, choosing different methods to determine the search result, and as an input to the search result computation.
29. A method, comprising:
accepting a query from a user as part of a search request;
retrieving a plurality of objects that match the query;
determining distribution over time of the objects matching the query for temporality analysis of the query;
analyzing the distribution over time of the objects to provide a classification of the intent of the query;
generating a search result including the objects that are not only based on matching of the objects to the query but also based on the classification of the intent of the query.
30. The method of claim 29, further comprising:
providing a discrete classification of the intent of the query into various categories.
31. The method of claim 30, further comprising:
classifying the query with constant or even distribution of the objects over time as for knowledge/canonical.
32. The method of claim 30, further comprising:
classifying the query with the distribution of the objects concentrated at particular points as focused on a specific event.
33. The method of claim 30, further comprising:
classifying the query with the distribution of the objects increased over time as reflecting the recent interest of the user.
34. The method of claim 29, further comprising:
providing a continuous classification of the intent of the query.
35. The method of claim 29, further comprising:
utilizing the temporality analysis of the query to select a time window that provides the best search result for the query.
36. The method of claim 35, further comprising:
selecting a combination of multiple time windows that provides the best search result for the query.
37. The method of claim 35, further comprising:
selecting the best time window based on the time distribution of the objects matching the query among a given set of time windows.
38. The method of claim 29, further comprising:
enabling a citation centric search process that retrieves a plurality of citations composed by a plurality of subjects citing the plurality of objects.
39. The method of claim 38, further comprising:
accepting and enforcing a plurality of criteria on citation searching, retrieving and ranking, each of which is either explicitly described by a user or best guessed by the system based on internal statistical data.
40. The method of claim 38, further comprising:
determining temporalities and classifications of one or more of the citing subjects and the cited objects of the citations in addition to temporality of the query;
providing the list of selected objects based on one or more of these temporalities and classifications.
41. The method of claim 40, further comprising:
determining the temporality of a subject based on time distribution of the citations from the subject that match the query.
42. The method of claim 40, further comprising:
determining the temporality of a subject based on time distribution of all of the citations from the subject.
43. The method of claim 40, further comprising:
utilizing the temporalities to identify whether the subjects of the citations are either evenly or concentratedly distributed over time.
44. The method of claim 40, further comprising:
classifying the subjects either into discrete classification of categories or in continuous classification.
45. The method of claim 40, further comprising:
determining the temporality of an object based on time distribution of the citations for the object that match the query.
46. The method of claim 40, further comprising:
determining the temporality of an object based on time distribution of all of the citations for the object.
47. The method of claim 40, further comprising:
utilizing the temporalities to identify whether the objects of the citations are either evenly or concentratedly distributed over time.
48. The method of claim 40, further comprising:
classifying the objects either into discrete classification of categories or in continuous classification.
49. The method of claim 40, further comprising:
weighing the time distribution of a query, a subject or an object based on attributes associated with each citation, or attributes associated with the subject or object of each citation.
50. The method of claim 40, further comprising:
utilizing the classifications of the query, the subject, and/or the object to perform one or more of choosing different forms of displaying the search result to the user, choosing different methods to determine the search result, and as an input to the search result computation.
51. A machine readable medium having software instructions stored thereon that when executed cause a system to:
accept a query from a user as part of a search request;
retrieve a plurality of objects that match the query;
determine distribution over time of the objects matching the query for temporality analysis of the query;
analyze the distribution over time of the objects to provide a classification of the intent of the query;
generate a search result including the objects that are not only based on matching of the objects to the query but also based on the classification of the intent of the query.
US13/161,143 2009-12-01 2011-06-15 System and method for query temporality analysis Active 2030-11-21 US8892541B2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US13/161,143 US8892541B2 (en) 2009-12-01 2011-06-15 System and method for query temporality analysis
PCT/US2011/040635 WO2011159863A1 (en) 2010-06-16 2011-06-16 A system and method for query temporality analysis
US14/520,872 US10380121B2 (en) 2009-12-01 2014-10-22 System and method for query temporality analysis

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US12/628,791 US8688701B2 (en) 2007-06-01 2009-12-01 Ranking and selecting entities based on calculated reputation or influence scores
US12/628,801 US8244664B2 (en) 2008-12-01 2009-12-01 Estimating influence of subjects based on a subject graph
US35544310P 2010-06-16 2010-06-16
US12/895,593 US7991725B2 (en) 2006-06-05 2010-09-30 Intelligent reputation attribution platform
US13/161,143 US8892541B2 (en) 2009-12-01 2011-06-15 System and method for query temporality analysis

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US12/895,593 Continuation-In-Part US7991725B2 (en) 2006-06-05 2010-09-30 Intelligent reputation attribution platform

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/520,872 Continuation US10380121B2 (en) 2009-12-01 2014-10-22 System and method for query temporality analysis

Publications (3)

Publication Number Publication Date
US20110313986A1 US20110313986A1 (en) 2011-12-22
US20120278298A9 US20120278298A9 (en) 2012-11-01
US8892541B2 US8892541B2 (en) 2014-11-18

Family

ID=45329577

Family Applications (2)

Application Number Status Publication Number Priority Date Filing Date Title
US13/161,143 Active 2030-11-21 US8892541B2 (en) 2009-12-01 2011-06-15 System and method for query temporality analysis
US14/520,872 Active US10380121B2 (en) 2009-12-01 2014-10-22 System and method for query temporality analysis

Family Applications After (1)

Application Number Status Publication Number Priority Date Filing Date Title
US14/520,872 Active US10380121B2 (en) 2009-12-01 2014-10-22 System and method for query temporality analysis

Country Status (2)

Country Link
US (2) US8892541B2 (en)
WO (1) WO2011159863A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140052729A1 (en) * 2011-05-10 2014-02-20 David Manzano Macho Optimized data stream management system
CN108268652A (en) * 2018-01-29 2018-07-10 四川乐路科技有限公司 A kind of popular science knowledge commending system and method

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015057154A2 (en) 2013-10-18 2015-04-23 Biopetrolia Ab ENGINEERING OF ACETYL-CoA METABOLISM IN YEAST
US20150254576A1 (en) * 2014-03-05 2015-09-10 Black Hills Ip Holdings, Llc Systems and methods for analyzing relative priority for a group of patents
US10235454B1 (en) * 2014-04-01 2019-03-19 Google Llc Generating playlist inclusive canonical network addresses
US11210300B2 (en) * 2015-05-14 2021-12-28 NetSuite Inc. System and methods of generating structured data from unstructured data
CN110059193A (en) * 2019-06-21 2019-07-26 南京擎盾信息科技有限公司 Legal advice system based on law semanteme part and document big data statistical analysis

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050154690A1 (en) * 2002-02-04 2005-07-14 Celestar Lexico-Sciences, Inc Document knowledge management apparatus and method
US20060112146A1 (en) * 2004-11-22 2006-05-25 Nec Laboratories America, Inc. Systems and methods for data analysis and/or knowledge management
US20060248073A1 (en) * 2005-04-28 2006-11-02 Rosie Jones Temporal search results
US20080010253A1 (en) * 2006-07-06 2008-01-10 Aol Llc Temporal Search Query Personalization
US20080215557A1 (en) * 2005-11-05 2008-09-04 Jorey Ramer Methods and systems of mobile query classification

Family Cites Families (92)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5430839A (en) * 1991-01-28 1995-07-04 Reach Software Data entry screen method
US6286005B1 (en) 1998-03-11 2001-09-04 Cannon Holdings, L.L.C. Method and apparatus for analyzing data and advertising optimization
US6151585A (en) 1998-04-24 2000-11-21 Microsoft Corporation Methods and apparatus for determining or inferring influential rumormongers from resource usage data
AU5446900A (en) * 1999-05-28 2000-12-18 Immunex Corporation Novel murine and human kinases
US8050982B2 (en) 1999-06-29 2011-11-01 Priceplay, Inc. Systems and methods for transacting business over a global communications network such as the internet
US7330826B1 (en) 1999-07-09 2008-02-12 Perfect.Com, Inc. Method, system and business model for a buyer's auction with near perfect information using the internet
US7000194B1 (en) 1999-09-22 2006-02-14 International Business Machines Corporation Method and system for profiling users based on their relationships with content topics
US7415662B2 (en) * 2000-01-31 2008-08-19 Adobe Systems Incorporated Digital media management apparatus and methods
US20060074727A1 (en) 2000-09-07 2006-04-06 Briere Daniel D Method and apparatus for collection and dissemination of information over a computer network
US7185065B1 (en) 2000-10-11 2007-02-27 Buzzmetrics Ltd System and method for scoring electronic messages
KR100446289B1 (en) * 2000-10-13 2004-09-01 삼성전자주식회사 Information search method and apparatus using Inverse Hidden Markov Model
DE10247927A1 (en) 2001-10-31 2003-07-31 Ibm Improved procedure for evaluating units within a recommendation system based on additional knowledge of unit linking
JP2003288437A (en) 2002-03-28 2003-10-10 Just Syst Corp Guide information providing device and method, and program for making computer execute the same method
US7461392B2 (en) * 2002-07-01 2008-12-02 Microsoft Corporation System and method for identifying and segmenting repeating media objects embedded in a stream
US7370002B2 (en) 2002-06-05 2008-05-06 Microsoft Corporation Modifying advertisement scores based on advertisement response probabilities
US6946715B2 (en) 2003-02-19 2005-09-20 Micron Technology, Inc. CMOS image sensor and method of fabrication
AU2003252024A1 (en) * 2002-07-16 2004-02-02 Bruce L. Horn Computer system for automatic organization, indexing and viewing of information from multiple sources
US7512612B1 (en) 2002-08-08 2009-03-31 Spoke Software Selecting an optimal path through a relationship graph
US7472110B2 (en) 2003-01-29 2008-12-30 Microsoft Corporation System and method for employing social networks for information discovery
US20040225592A1 (en) 2003-05-08 2004-11-11 Churquina Eduardo Enrique Computer Implemented Method and System of Trading Indicators Based on Price and Volume
US7565454B2 (en) * 2003-07-18 2009-07-21 Microsoft Corporation State migration in multiple NIC RDMA enabled devices
US8266009B1 (en) * 2003-08-22 2012-09-11 Earthtrax, Inc. Auction optimization system
US7577655B2 (en) 2003-09-16 2009-08-18 Google Inc. Systems and methods for improving the ranking of news articles
US7240055B2 (en) 2003-12-11 2007-07-03 Xerox Corporation Method and system for expertise mapping based on user activity in recommender systems
NO321340B1 (en) 2003-12-30 2006-05-02 Telenor Asa Method of managing networks by analyzing connectivity
US8788492B2 (en) 2004-03-15 2014-07-22 Yahoo!, Inc. Search system and methods with integration of user annotations from a trust network
US20060010797A1 (en) 2004-07-13 2006-01-19 Halsey Jay F Method and device for increasing the opening size of a window
US20060074836A1 (en) 2004-09-03 2006-04-06 Biowisdom Limited System and method for graphically displaying ontology data
US7885844B1 (en) 2004-11-16 2011-02-08 Amazon Technologies, Inc. Automatically generating task recommendations for human task performers
US20060112111A1 (en) 2004-11-22 2006-05-25 Nec Laboratories America, Inc. System and methods for data analysis and trend prediction
US7409362B2 (en) 2004-12-23 2008-08-05 Diamond Review, Inc. Vendor-driven, social-network enabled review system and method with flexible syndication
US7716162B2 (en) 2004-12-30 2010-05-11 Google Inc. Classification of ambiguous geographic references
GB0428553D0 (en) 2004-12-31 2005-02-09 British Telecomm Method PF operating a network
US7665107B2 (en) 2005-03-11 2010-02-16 Microsoft Corporation Viral advertising for interactive services
US7636714B1 (en) 2005-03-31 2009-12-22 Google Inc. Determining query term synonyms within query context
WO2007002820A2 (en) 2005-06-28 2007-01-04 Yahoo! Inc. Search engine with augmented relevance ranking by community participation
US20070027751A1 (en) 2005-07-29 2007-02-01 Chad Carson Positioning advertisements on the bases of expected revenue
US8560385B2 (en) 2005-09-02 2013-10-15 Bees & Pollen Ltd. Advertising and incentives over a social network
US20080215429A1 (en) 2005-11-01 2008-09-04 Jorey Ramer Using a mobile communication facility for offline ad searching
US20090276500A1 (en) 2005-09-21 2009-11-05 Amit Vishram Karmarkar Microblog search engine system and method
US7827052B2 (en) 2005-09-30 2010-11-02 Google Inc. Systems and methods for reputation management
US20070150398A1 (en) 2005-12-27 2007-06-28 Gridstock Inc. Investor sentiment barometer
US7761436B2 (en) 2006-01-03 2010-07-20 Yahoo! Inc. Apparatus and method for controlling content access based on shared annotations for annotated users in a folksonomy scheme
US8145656B2 (en) * 2006-02-07 2012-03-27 Mobixell Networks Ltd. Matching of modified visual and audio media
US8015484B2 (en) 2006-02-09 2011-09-06 Alejandro Backer Reputation system for web pages and online entities
US20090119173A1 (en) 2006-02-28 2009-05-07 Buzzlogic, Inc. System and Method For Advertisement Targeting of Conversations in Social Media
US8930282B2 (en) 2006-03-20 2015-01-06 Amazon Technologies, Inc. Content generation revenue sharing
US7856411B2 (en) 2006-03-21 2010-12-21 21St Century Technologies, Inc. Social network aware pattern detection
US7792841B2 (en) 2006-05-30 2010-09-07 Microsoft Corporation Extraction and summarization of sentiment information
US9015569B2 (en) 2006-08-31 2015-04-21 International Business Machines Corporation System and method for resource-adaptive, real-time new event detection
US20080104225A1 (en) 2006-11-01 2008-05-01 Microsoft Corporation Visualization application for mining of social networks
US20080148061A1 (en) 2006-12-19 2008-06-19 Hongxia Jin Method for effective tamper resistance
US8166026B1 (en) 2006-12-26 2012-04-24 uAffect.org LLC User-centric, user-weighted method and apparatus for improving relevance and analysis of information sharing and searching
US20080215571A1 (en) 2007-03-01 2008-09-04 Microsoft Corporation Product review search
US20080215607A1 (en) 2007-03-02 2008-09-04 Umbria, Inc. Tribe or group-based analysis of social media including generating intelligence from a tribe's weblogs or blogs
US20100121839A1 (en) 2007-03-15 2010-05-13 Scott Meyer Query optimization
US8204856B2 (en) 2007-03-15 2012-06-19 Google Inc. Database replication
US20100174692A1 (en) 2007-03-15 2010-07-08 Scott Meyer Graph store
DE102007014692A1 (en) 2007-03-27 2008-10-02 Rohde & Schwarz Gmbh & Co. Kg Test device and mobile device and method for testing a mobile device
US7941391B2 (en) 2007-05-04 2011-05-10 Microsoft Corporation Link spam detection using smooth classification function
US8738695B2 (en) 2007-05-15 2014-05-27 International Business Machines Corporation Joint analysis of social and content networks
US20080288305A1 (en) 2007-05-15 2008-11-20 Laluzerne Joseph D Enterprise Decision Management System and Method
US8788334B2 (en) 2007-06-15 2014-07-22 Social Mecca, Inc. Online marketing platform
US7761475B2 (en) * 2007-07-13 2010-07-20 Objectivity, Inc. Method, system and computer-readable media for managing dynamic object associations as a variable-length array of object references of heterogeneous types binding
US8209171B2 (en) * 2007-08-07 2012-06-26 Aurix Limited Methods and apparatus relating to searching of spoken audio data
US20090049018A1 (en) 2007-08-14 2009-02-19 John Nicholas Gross Temporal Document Sorter and Method Using Semantic Decoding and Prediction
EP2191395A4 (en) 2007-08-17 2011-04-20 Google Inc Ranking social network objects
US8862690B2 (en) 2007-09-28 2014-10-14 Ebay Inc. System and method for creating topic neighborhood visualizations in a networked system
US8694483B2 (en) 2007-10-19 2014-04-08 Xerox Corporation Real-time query suggestion in a troubleshooting context
US7392250B1 (en) 2007-10-22 2008-06-24 International Business Machines Corporation Discovering interestingness in faceted search
US8166925B2 (en) 2007-10-26 2012-05-01 Fccl Partnership Method and apparatus for steam generation
US8804757B2 (en) * 2007-12-26 2014-08-12 Intel Corporation Configurable motion estimation
WO2009111733A2 (en) * 2008-03-07 2009-09-11 Blue Kai, Inc. Exchange for tagged user information with scarcity control
US7822753B2 (en) 2008-03-11 2010-10-26 Cyberlink Corp. Method for displaying search results in a browser interface
CN102037481A (en) 2008-03-19 2011-04-27 苹果核网络股份有限公司 Method and apparatus for detecting patterns of behavior
JP4510109B2 (en) * 2008-03-24 2010-07-21 富士通株式会社 Target content search support program, target content search support method, and target content search support device
US7849076B2 (en) * 2008-03-31 2010-12-07 Yahoo! Inc. Learning ranking functions incorporating isotonic regression for information retrieval and ranking
US20090275850A1 (en) * 2008-04-30 2009-11-05 Mehendale Anil C Electrocardiographic (ECG) Data Analysis Systems and Methods
AU2009260033A1 (en) 2008-06-19 2009-12-23 Wize Technologies, Inc. System and method for aggregating and summarizing product/topic sentiment
US8805110B2 (en) 2008-08-19 2014-08-12 Digimarc Corporation Methods and systems for content processing
US8302015B2 (en) 2008-09-04 2012-10-30 Qualcomm Incorporated Integrated display and management of data objects based on social, temporal and spatial parameters
US8176046B2 (en) 2008-10-22 2012-05-08 Fwix, Inc. System and method for identifying trends in web feeds collected from various content servers
US20100119053A1 (en) 2008-11-13 2010-05-13 Buzzient, Inc. Analytic measurement of online social media content
US9235646B2 (en) 2009-05-28 2016-01-12 Tip Top Technologies, Inc. Method and system for a search engine for user generated content (UGC)
US8635211B2 (en) * 2009-06-11 2014-01-21 Dolby Laboratories Licensing Corporation Trend analysis in content identification based on fingerprinting
US9996594B2 (en) * 2009-06-26 2018-06-12 Sap Se Method, article and system for time dependent search
US20110004465A1 (en) 2009-07-02 2011-01-06 Battelle Memorial Institute Computation and Analysis of Significant Themes
US8140541B2 (en) 2009-09-30 2012-03-20 Michael Campbell Koss Time-weighted scoring system and method
US8886641B2 (en) 2009-10-15 2014-11-11 Yahoo! Inc. Incorporating recency in network search using machine learning
US8176032B2 (en) * 2009-10-22 2012-05-08 Ebay Inc. System and method for automatically publishing data items associated with an event
US20130304818A1 (en) * 2009-12-01 2013-11-14 Topsy Labs, Inc. Systems and methods for discovery of related terms for social media content collection over social networks
US8990241B2 (en) 2010-12-23 2015-03-24 Yahoo! Inc. System and method for recommending queries related to trending topics based on a received query

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140052729A1 (en) * 2011-05-10 2014-02-20 David Manzano Macho Optimized data stream management system
US8762369B2 (en) * 2011-05-10 2014-06-24 Telefonaktiebolaget L M Ericsson (Publ) Optimized data stream management system
CN108268652A (en) * 2018-01-29 2018-07-10 四川乐路科技有限公司 A kind of popular science knowledge commending system and method

Also Published As

Publication number Publication date
WO2011159863A1 (en) 2011-12-22
US20110313986A1 (en) 2011-12-22
US20150169586A1 (en) 2015-06-18
US10380121B2 (en) 2019-08-13
US8892541B2 (en) 2014-11-18

Similar Documents

Publication Publication Date Title
US10025860B2 (en) Search of sources and targets based on relative expertise of the sources
US11036810B2 (en) System and method for determining quality of cited objects in search results based on the influence of citing subjects
US9886514B2 (en) System and method for customizing search results from user's perspective
US9454586B2 (en) System and method for customizing analytics based on users media affiliation status
US20120290551A9 (en) System And Method For Identifying Trending Targets Based On Citations
US20120284253A9 (en) System and method for query suggestion based on real-time content stream
US10380121B2 (en) System and method for query temporality analysis
US10311072B2 (en) System and method for metadata transfer among search entities
JP5902274B2 (en) Ranking and selection entities based on calculated reputation or impact scores
US8290927B2 (en) Method and apparatus for rating user generated content in search results
JP5731250B2 (en) System and method for recommending interesting content in an information stream
US20120290552A9 (en) System and method for search of sources and targets based on relative topicality specialization of the targets
US11113299B2 (en) System and method for metadata transfer among search entities
US20140149378A1 (en) Method and apparatus for determining rank of web pages based upon past content portion selections
WO2011159646A1 (en) A system and method for determining quality of cited objects in search results based on the influence of citing subjects

Legal Events

Date Code Title Description
AS Assignment

Owner name: TOPSY LABS, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GHOSH, RISHAB AIYER;EMERSON, THOMAS JAMES;CUI, LUN TED;SIGNING DATES FROM 20110801 TO 20110802;REEL/FRAME:026777/0058

AS Assignment

Owner name: VENTURE LENDING & LEASING VII, INC., CALIFORNIA

Free format text: SECURITY AGREEMENT;ASSIGNOR:TOPSY LABS, INC.;REEL/FRAME:031105/0543

Effective date: 20130815

Owner name: VENTURE LENDING & LEASING V, INC., CALIFORNIA

Free format text: SECURITY AGREEMENT;ASSIGNOR:TOPSY LABS, INC.;REEL/FRAME:031105/0543

Effective date: 20130815

Owner name: VENTURE LENDING & LEASING VI, INC., CALIFORNIA

Free format text: SECURITY AGREEMENT;ASSIGNOR:TOPSY LABS, INC.;REEL/FRAME:031105/0543

Effective date: 20130815

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCF Information on status: patent grant

Free format text: PATENTED CASE

AS Assignment

Owner name: APPLE INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:TOPSY LABS, INC.;REEL/FRAME:035333/0135

Effective date: 20150127

FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.)

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551)

Year of fee payment: 4

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 8