US20160026707A1 - Clustering multimedia search - Google Patents

Clustering multimedia search

Info

Publication number
US20160026707A1
Authority
US
United States
Prior art keywords
signature
multimedia content
recited
content item
clustering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/748,631
Inventor
Edwin Seng Eng Ong
Aleksandra R. Vikati
Michael L. Harville
Kyle C. Maxwell
Andrew S. Cantino
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
JPMorgan Chase Bank NA
Gracenote Media Services LLC
Original Assignee
JPMorgan Chase Bank NA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by JPMorgan Chase Bank NA
Priority to US14/748,631
Assigned to JPMORGAN CHASE BANK, N.A., AS AGENT (security agreement). Assignors: CastTV Inc., GRACENOTE, INC., TRIBUNE BROADCASTING COMPANY, LLC
Assigned to JPMORGAN CHASE BANK, N.A., AS COLLATERAL AGENT (security agreement). Assignors: CastTV Inc., GRACENOTE, INC., TRIBUNE BROADCASTING COMPANY, LLC, TRIBUNE MEDIA COMPANY
Assigned to JPMORGAN CHASE BANK, N.A., AS COLLATERAL AGENT (corrective assignment to correct the incorrect Appl. No. 14/282,293 previously recorded at Reel 037569, Frame 0270; assignors hereby confirm the security agreement). Assignors: CastTV Inc., GRACENOTE, INC., TRIBUNE BROADCASTING COMPANY, LLC, TRIBUNE MEDIA COMPANY
Publication of US20160026707A1
Assigned to CastTV Inc., TRIBUNE MEDIA SERVICES, LLC, TRIBUNE DIGITAL VENTURES, LLC, GRACENOTE, INC. (release of security interest in patent rights). Assignors: JPMORGAN CHASE BANK, N.A.
Assigned to GRACENOTE MEDIA SERVICES, LLC (merger; see document for details). Assignors: CastTV Inc.
Current legal status: Abandoned

Classifications

    • G06F17/30598
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G06F16/738 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G06F16/732 Query formulation
    • G06F16/7328 Query by example, e.g. a complete video frame or video sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7834 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
    • G06F17/30864
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06K9/00288
    • G06K9/6218
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Definitions

  • Many existing search engines were designed primarily for text content; when a user searches for multimedia content with them, the relatedness of search results associated with similar multimedia content is often not recognized or made apparent.
  • FIG. 1 is a block diagram illustrating an embodiment of a system for clustering a set of web search results.
  • FIG. 2 illustrates an embodiment of a search server for clustering multimedia search.
  • FIG. 3 illustrates an embodiment of an index for clustering multimedia search.
  • FIG. 4 is a diagram illustrating an embodiment of a display page configured for clustering a set of multimedia search results.
  • FIG. 5 is a flowchart illustrating an embodiment of a process for clustering a set of multimedia search results.
  • FIG. 6 is a flowchart illustrating an embodiment of a process for clustering search results.
  • FIG. 7 is a flowchart illustrating an embodiment of a process for clustering search results given signatures and metadata.
  • FIG. 8 is a flowchart illustrating an embodiment of a process for sorting results into bins.
  • the invention can be implemented in numerous ways, including as a process, an apparatus, a system, a composition of matter, a computer readable medium such as a computer readable storage medium or a computer network wherein program instructions are sent over optical or communication links.
  • these implementations, or any other form that the invention may take, may be referred to as techniques.
  • a component such as a processor or a memory described as being configured to perform a task includes both a general component that is temporarily configured to perform the task at a given time or a specific component that is manufactured to perform the task.
  • the order of the steps of disclosed processes may be altered within the scope of the invention.
  • the term ‘processor’ refers to one or more devices, circuits, and/or processing cores configured to process data, such as computer program instructions.
  • Clustering search results based on similarity of multimedia content determined based at least in part on a non-text-based analysis or representation of such content.
  • a representation of the multimedia content is generated and the respective representations used, in advance of query/search time or in real time, to determine a degree of similarity between the respective multimedia content associated with the respective results (e.g., pages).
  • the degree of similarity information is used to cluster search results, for example by presenting or otherwise associating together as a responsive cluster of results two or more responsive pages (or other results) that have been determined to have the same and/or very similar multimedia content.
  • FIG. 1 is a block diagram illustrating an embodiment of a system for clustering a set of web search results.
  • client 102 is connected through network 104 to content 106 .
  • Client 102 may represent a user, a web browser, or another search engine licensed to use the clustering system.
  • Network 104 may be a public or private network and/or combination thereof, for example the Internet, an Ethernet, serial/parallel bus, intranet, Local Area Network (“LAN”), Wide Area Network (“WAN”), and other forms of connecting multiple systems and/or groups of systems together.
  • Content 106 may include text content, graphical content, audio content, video content, web based content and/or database content.
  • Network 104 is also connected to a search server 108 , which is connected to index 110 .
  • Search server 108 may be configured to search and cluster content 106 for client 102 .
  • Search server 108 may be comprised of one or more servers.
  • Index 110 may include a database and/or cache.
  • FIG. 2 illustrates an embodiment of a search server for clustering multimedia search.
  • the search server in FIG. 2 is included as search server 108 in FIG. 1 .
  • search server 202 includes a plurality of servers, including clustering search server 204 .
  • Clustering search server 204 includes a plurality of functional engines, including an optional signature generation engine 206 and clustering logic 208 .
  • Signature generation engine 206 generates a signature representative of at least a portion of the multimedia content of a multimedia content item, such as an image or an audio and/or video clip, associated with a web page (or other actual or potential search result).
  • a signature in some embodiments comprises a representation of at least a portion of the multimedia content of a multimedia content item.
  • the signature is generated based at least in part on portions of multimedia content believed to be characteristic of and/or distinctive to the multimedia content being represented, such that there is a likelihood that another multimedia content item having the same or a very similar signature comprises multimedia content that is at least in part the same or nearly the same as corresponding content comprising the multimedia content item that the signature is generated to represent.
  • Simple examples of a signature for illustrative purposes include without limitation the average RGB or grayscale value of each quadrant of an image, the percentage of laughter in an audio track, or the number of scene transitions in a video.
  • Signature generation engine 206 may include one or more hardware elements and/or software elements.
  • hardware elements include: servers, embedded systems, printed circuit boards (“PCBs”), processors, application specific integrated circuits (“ASICs”), field programmable gate arrays (“FPGAs”), and programmable logic devices (“PLDs”)
  • software elements could include: modules, models, objects, libraries, procedures, functions, applications, applets, weblets, widgets and instructions.
  • Clustering logic 208 groups web search results associated with multimedia content items that have been determined to have the same or similar multimedia content, at least in part, based on a comparison of the respective signatures generated for each result by signature generation engine 206 .
  • clustering is based at least in part on the entropy of the signatures of the multimedia content items. For example, a signature determined to have a high level of entropy, and therefore presumed to embody more information, is in some embodiments given more weight than a signature having low entropy.
  • the foregoing approach is based on the expectation that, all else being equal, if two multimedia content items have low-entropy signatures with the same degree of similarity as the respective signatures of a second set of content items having high-entropy signatures, the second set of content items is more likely to in fact have the same or very similar multimedia content than the first two. Stated another way, a low-entropy signature is less likely to represent a particular multimedia content item uniquely, so other content that is not that similar to the first content may have a sufficiently similar signature to generate a false match.
  • Clustering logic 208 may include one or more hardware elements and/or software elements.
  • hardware elements include: servers, embedded systems, printed circuit boards (“PCBs”), processors, application specific integrated circuits (“ASICs”), field programmable gate arrays (“FPGAs”), and programmable logic devices (“PLDs”)
  • software elements could include: modules, models, objects, libraries, procedures, functions, applications, applets, weblets, widgets and instructions.
  • FIG. 3 illustrates an embodiment of an index for clustering multimedia search.
  • the index in FIG. 3 is included as index 110 in FIG. 1 .
  • index 302 includes a plurality of indices, including clustering index 304 .
  • Clustering index 304 includes a plurality of indices, including a text metadata index 306 and signature index 308 .
  • Text metadata index 306 references content 106 by metadata given for each content item.
  • a video clip may have as its metadata the producer's name, its run length and the title of the clip.
  • the video clip would be represented in text metadata index 306 by storing its address and associated metadata.
  • the video clip address may include its file location, its library reference number, its Uniform Resource Locator (“URL”), or its Uniform Resource Identifier (“URI”).
  • the associated metadata may include the metadata field descriptions, for example “producer's name”, “run-length”, and “title”, as well as field content, for example “Joe Producer”, “1:45:34” and “Drama Squirrels: The Sequel”.
  • Signature index 308 references content 106 by the signature generated by signature generation engine 206 .
  • the video clip would be represented in signature index 308 by storing its address and associated signature.
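The two indices described for FIG. 3 can be pictured as a pair of address-keyed mappings. The following is only a minimal sketch under stated assumptions: the class name `ClusteringIndex`, its methods, and the substring-based text lookup are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ClusteringIndex:
    """Hypothetical stand-in for clustering index 304."""

    # Text metadata index 306: address -> {field description: field content}
    text_metadata: Dict[str, Dict[str, str]] = field(default_factory=dict)
    # Signature index 308: address -> signature vector
    signatures: Dict[str, List[float]] = field(default_factory=dict)

    def add_item(self, address: str, metadata: Dict[str, str],
                 signature: List[float]) -> None:
        """Store one content item under its address in both indices."""
        self.text_metadata[address] = dict(metadata)
        self.signatures[address] = list(signature)

    def find_by_text(self, term: str) -> List[str]:
        """Return addresses whose metadata field content contains the term."""
        needle = term.lower()
        return [
            address
            for address, fields in self.text_metadata.items()
            if any(needle in value.lower() for value in fields.values())
        ]
```

An item would be added with its address (for example a URL), its metadata fields such as “title”, and a signature computed elsewhere; the same address then keys both lookups.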
  • FIG. 4 is a diagram illustrating an embodiment of a display page configured for clustering a set of multimedia search results.
  • Display page 402 shows the layout of the page as rendered by a browser at, for example, client 102 .
  • Display page 402 includes a search frame 404 , which includes both a field for client 102 to enter in search parameters and an active element to initiate the search, such as a search button.
  • the client 102 is searching for a “drama squirrel” multimedia content item.
  • Clustered multimedia search results are given in results frame 406 , which in the example given in FIG. 4 , shows thirty-one distinct results. Search results may be given in a ranked order, as is the case in FIG. 4 .
  • In results frame 406, the first result 408 is given for a “drama squirrel video” content item, and the clustering shows:
  • the results frame 406 also shows a second, lower-ranked result for a “drama chipmunk video” content item, and a third, still lower-ranked result for a “squirrel dance song” content item.
  • For a result, there may be two additional implementations:
  • a “find similar” button 410 that will find similar results to any given result, without considering any metadata.
  • a “find similar” button will cause a search based either only on the signature of the current result in comparison with other known signatures, or on the signature of the current result in comparison with other known signatures and metadata fields implicit to the current result.
  • a “similarity slider” 412 that allows the exploration of a spectrum from ‘duplicate’ to ‘more similar’ to ‘less similar.’
  • When the slider is placed on ‘duplicate’ for a “drama squirrel” search, only exact duplicates are found.
  • As the slider moves toward ‘less similar,’ results become increasingly non-duplicated but still related.
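One way to picture the similarity slider is as a choice of distance threshold applied to results that have already been compared with the selected result. This is a sketch only: the three slider positions come from the text, but the numeric thresholds, the name `filter_by_slider`, and the assumption that distances are precomputed are all hypothetical.

```python
from typing import Dict, List

# Hypothetical thresholds: a distance of 0.0 means an exact duplicate.
SLIDER_THRESHOLDS: Dict[str, float] = {
    "duplicate": 0.0,
    "more similar": 0.25,
    "less similar": 0.6,
}


def filter_by_slider(distances: Dict[str, float], position: str) -> List[str]:
    """Return addresses within the threshold for a slider position.

    `distances` maps each candidate address to its signature distance
    from the currently selected result; closest results come first.
    """
    threshold = SLIDER_THRESHOLDS[position]
    matches = [addr for addr, dist in distances.items() if dist <= threshold]
    return sorted(matches, key=lambda addr: distances[addr])
```

Moving the slider from ‘duplicate’ toward ‘less similar’ only raises the threshold, so each position returns a superset of the previous one.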
  • FIG. 5 is a flowchart illustrating an embodiment of a process for clustering a set of multimedia search results. The process may be implemented in search server 108 .
  • a signature of a search result is generated, based at least in part on an analysis of multimedia content associated with the web search result.
  • this step is implemented by signature generation engine 206 .
  • Multimedia includes any content that is not purely text, such as images, video and audio.
  • a characteristic of the signature is that a distance metric may be calculated between a first and second signature.
  • the signature is a vector and the distance metric is a scalar.
  • the distance metric may include one or more and/or a weighted or other combination of one or more of:
  • the signature may include a hash value based at least in part on features of the image, audio, video or other multimedia type.
  • the signature may include one or more of:
  • any relatively concise representation of the multimedia content of a content item such that another content item having the same or a very similar signature is likely to include the same or similar multimedia content and conversely content items having a relatively more dissimilar signature are unlikely to include the same or very similar multimedia content may be used.
  • the set of web search results is clustered based at least in part on the signature of each web search result.
  • the signature of each web search result is compared to another web search result's signature by analyzing their distance metric.
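The comparison of two signatures through a scalar distance can be sketched as follows. Euclidean distance for vector signatures and Hamming distance for hash-style signatures are assumptions chosen for illustration, and both function names are hypothetical.

```python
import math
from typing import Sequence


def euclidean_distance(sig_a: Sequence[float], sig_b: Sequence[float]) -> float:
    """Scalar distance between two vector signatures of equal length."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(sig_a, sig_b)))


def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Number of differing bits between two integer hash signatures."""
    return bin(hash_a ^ hash_b).count("1")
```

A small distance suggests that two results carry the same or similar multimedia content; how small is "small enough" is a separate threshold choice.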
  • the signatures of content 106 are pre-computed before performing search.
  • By pre-computing signatures, it is possible to find a multimedia content item that is similar to a web search result, for example, to:
  • FIG. 6 is a flowchart illustrating an embodiment of a process for clustering search results.
  • the process of FIG. 6 is included in 504 of FIG. 5 .
  • the process may be implemented in search server 108 .
  • the text metadata is used to find responsive records and optionally assign rankings.
  • a search for “drama squirrel” could use available text search techniques to find records with metadata that includes: “drama squirrel”, “drama”, “squirrel”, “show squirrel”, “drama chipmunk”, and other permutations from parsing the query.
  • rankings may be assigned based on the relevance of the found records to the search query using available ranking techniques.
  • the results may be organized, clustered and/or displayed.
  • the organization and clustering may be similar to the example for frame 406 .
  • FIG. 7 is a flowchart illustrating an embodiment of a process for clustering search results given signatures and metadata.
  • the process of FIG. 7 is included in 604 of FIG. 6 .
  • the process may be implemented in search server 108 .
  • In step 702, the results from the text metadata search in step 602 are coupled with the signatures generated in step 502 and sorted into bins. For example, a search for “drama squirrel” may find that the highest ranked result is a “drama squirrel video” content item, available via network 104, that has several identical copies at different addresses and several similar copies at other addresses. In this example, all of these content items would be consolidated in a single bin.
  • the bins would be ordered and displayed by bin ranking.
  • the bin ranking of a specified bin is related to the rank of each result within that specified bin.
  • the bin ranking would be directly related to the highest ranked result within each bin.
  • the bin ranking would be further weighted by the number or quality of results within a bin.
  • displaying a cluster includes labeling two web search results with similar video signatures and different audio signatures as commentary.
  • displaying a cluster includes labeling two web search results with similar audio signatures and different video signatures as remixes.
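The two labeling rules above can be sketched as a comparison of separate video and audio signature distances. The single shared threshold, its default value, and the name `label_pair` are assumptions made for illustration; cases the text does not describe return no label.

```python
from typing import Optional


def label_pair(video_distance: float, audio_distance: float,
               threshold: float = 0.2) -> Optional[str]:
    """Label a pair of results from their video and audio signature distances.

    A distance below `threshold` (a hypothetical value) counts as "similar".
    """
    video_similar = video_distance < threshold
    audio_similar = audio_distance < threshold
    if video_similar and not audio_similar:
        return "commentary"  # same pictures, different sound
    if audio_similar and not video_similar:
        return "remix"       # same sound, different pictures
    return None              # no label defined for the other cases
```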
  • the number of cluster members in a bin can be used as a ranking factor, such that a result with many duplicates would be deemed more significant than a result with very few duplicates. For example, the most popular video of a contemporary singer, Jane Smith, would have a very high number of copies circulating on the web compared with a homemade video of a Jane Smith cover.
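A bin ranking that follows the highest ranked result in each bin and is further weighted by bin size might look like the sketch below. The scoring formula, the logarithmic size bonus, the `size_weight` parameter, and the use of relevance scores where higher is better are all assumptions, not details given in the text.

```python
import math
from typing import Dict, List


def bin_score(scores: List[float], size_weight: float = 0.1) -> float:
    """Score a non-empty bin from its best result, boosted by member count."""
    return max(scores) * (1.0 + size_weight * math.log(len(scores)))


def order_bins(bins: Dict[str, List[float]],
               size_weight: float = 0.1) -> List[str]:
    """Return bin names ordered from highest to lowest bin score."""
    return sorted(bins,
                  key=lambda name: bin_score(bins[name], size_weight),
                  reverse=True)
```

With `size_weight` set to zero the order depends only on the best result in each bin; raising it lets a bin with many copies overtake a bin whose single member scored slightly higher.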
  • FIG. 8 is a flowchart illustrating an embodiment of a process for sorting results into bins.
  • the process of FIG. 8 is included in 702 of FIG. 7 .
  • the process may be implemented in search server 108 .
  • In step 802, a first ranked result is assigned as the primary result, with its signature generated from step 502.
  • In step 804, the next result is compared with the primary result by computing the distance between the two. If it is determined in step 806 that the distance is less than a predetermined threshold, then control is transferred to step 808; otherwise, control is transferred to step 810.
  • a distance less than the predetermined threshold may indicate that the two multimedia content items associated with the two results are related.
  • In step 808, the two related results are grouped together in a bin.
  • the predetermined threshold in step 806 indicates that the two results are either identical or similar, for example, a post-production modification.
  • a second comparison will be made to see if the distance is less than a predetermined smaller threshold.
  • a distance less than the predetermined smaller threshold may indicate that the two multimedia content items associated with the two results are nearly identical.
  • there may be at least two sub-bins; the first of “identical” content items to the primary result, and the second of “similar” content items to the primary result.
  • if a result is placed within a bin it may be removed from being contained within another bin.
  • In step 810, if it is determined that there are no other results to compare with the primary result, then control is transferred to step 814; otherwise, control is transferred to step 812.
  • there may be no other results to compare because every result has already been compared with the primary result.
  • there may be no other results to compare because a predetermined number of results have already been compared with the primary result.
  • In step 812, the process repeats starting with step 804, comparing the primary result with the next ranked result.
  • In step 814, if it is determined that the clustering is complete, the process ends; otherwise, control is transferred to step 816.
  • clustering is complete because every result has been placed in a bin.
  • clustering is complete based on a heuristic; for example the heuristic may determine to stop after thirty bins have been created.
  • the next available result is assigned as the primary result. In some embodiments, the next available result is the next ranked result from the primary result. In some embodiments, the next available result is the next ranked result from the primary result not already in a bin.
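The flow of FIG. 8 can be sketched as a greedy pass over the ranked results. This is one possible reading, not the application's implementation: it takes the variant in which the next primary result is the next ranked result not already in a bin, uses two thresholds to fill “identical” and “similar” sub-bins, and optionally stops after a fixed number of bins. The function name, the dictionary layout of a bin, and all parameter names are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence

Signature = Sequence[float]


def sort_into_bins(
    results: List[str],
    signatures: Dict[str, Signature],
    distance: Callable[[Signature, Signature], float],
    similar_threshold: float,
    identical_threshold: float,
    max_bins: Optional[int] = None,
) -> List[Dict[str, object]]:
    """Group ranked results (best first) into bins around primary results."""
    bins: List[Dict[str, object]] = []
    assigned = set()
    for primary in results:                      # steps 802 and 816
        if primary in assigned:
            continue                             # already placed in a bin
        if max_bins is not None and len(bins) >= max_bins:
            break                                # step 814: heuristic stop
        identical: List[str] = []
        similar: List[str] = []
        assigned.add(primary)
        for candidate in results:                # steps 804, 810, 812
            if candidate in assigned:
                continue
            d = distance(signatures[primary], signatures[candidate])
            if d < identical_threshold:          # smaller threshold
                identical.append(candidate)
                assigned.add(candidate)
            elif d < similar_threshold:          # step 806 -> step 808
                similar.append(candidate)
                assigned.add(candidate)
        bins.append({"primary": primary,
                     "identical": identical,
                     "similar": similar})
    return bins
```

Because each result is removed from consideration once it is placed, a result never appears in more than one bin, and the bins come out in the rank order of their primary results.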

Abstract

A method for clustering a set of web search results is disclosed. A first signature is compared based at least in part on an analysis of multimedia content associated with a first web search result with a second signature based at least in part on an analysis of multimedia content associated with a second web search result. The first web search result is clustered with the second web search result based at least in part on the comparison of the first signature with the second signature.

Description

    CROSS REFERENCE TO OTHER APPLICATIONS
  • This application is a continuation of co-pending U.S. patent application Ser. No. 13/608,349 entitled CLUSTERING MULTIMEDIA SEARCH filed Sep. 10, 2012, which is a continuation of U.S. patent application Ser. No. 12/317,253, now U.S. Pat. No. 8,285,718, entitled CLUSTERING MULTIMEDIA SEARCH filed Dec. 19, 2008, which claims priority to U.S. Provisional Patent Application No. 61/008,678 entitled CLUSTERING MULTIMEDIA SEARCH filed Dec. 21, 2007, all of which are incorporated herein by reference for all purposes.
  • BACKGROUND OF THE INVENTION
  • There is an increasingly large volume of image, video, audio, and other multimedia content being posted to the Internet and the World Wide Web (“web”). With increased volumes of text and multimedia content, a user must rely more on search engines to find particular content.
  • Many existing search engines were designed primarily for text content; when a user searches for multimedia content with them, the relatedness of search results associated with similar multimedia content is often not recognized or made apparent.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • Various embodiments of the invention are disclosed in the following detailed description and the accompanying drawings.
  • FIG. 1 is a block diagram illustrating an embodiment of a system for clustering a set of web search results.
  • FIG. 2 illustrates an embodiment of a search server for clustering multimedia search.
  • FIG. 3 illustrates an embodiment of an index for clustering multimedia search.
  • FIG. 4 is a diagram illustrating an embodiment of a display page configured for clustering a set of multimedia search results.
  • FIG. 5 is a flowchart illustrating an embodiment of a process for clustering a set of multimedia search results.
  • FIG. 6 is a flowchart illustrating an embodiment of a process for clustering search results.
  • FIG. 7 is a flowchart illustrating an embodiment of a process for clustering search results given signatures and metadata.
  • FIG. 8 is a flowchart illustrating an embodiment of a process for sorting results into bins.
  • DETAILED DESCRIPTION
  • The invention can be implemented in numerous ways, including as a process, an apparatus, a system, a composition of matter, a computer readable medium such as a computer readable storage medium or a computer network wherein program instructions are sent over optical or communication links. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. A component such as a processor or a memory described as being configured to perform a task includes both a general component that is temporarily configured to perform the task at a given time or a specific component that is manufactured to perform the task. In general, the order of the steps of disclosed processes may be altered within the scope of the invention. As used herein, the term ‘processor’ refers to one or more devices, circuits, and/or processing cores configured to process data, such as computer program instructions.
  • A detailed description of one or more embodiments of the invention is provided below along with accompanying figures that illustrate the principles of the invention. The invention is described in connection with such embodiments, but the invention is not limited to any embodiment. The scope of the invention is limited only by the claims and the invention encompasses numerous alternatives, modifications and equivalents. Numerous specific details are set forth in the following description in order to provide a thorough understanding of the invention. These details are provided for the purpose of example and the invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail so that the invention is not unnecessarily obscured.
  • Clustering search results based on similarity of multimedia content, determined based at least in part on a non-text-based analysis or representation of such content, is disclosed. In some embodiments, for each of a plurality of actual or potential search results, e.g., web pages, having associated multimedia content, a representation of the multimedia content is generated and the respective representations used, in advance of query/search time or in real time, to determine a degree of similarity between the respective multimedia content associated with the respective results (e.g., pages). The degree of similarity information is used to cluster search results, for example by presenting or otherwise associating together as a responsive cluster of results two or more responsive pages (or other results) that have been determined to have the same and/or very similar multimedia content.
  • FIG. 1 is a block diagram illustrating an embodiment of a system for clustering a set of web search results. In the example shown, client 102 is connected through network 104 to content 106. Client 102 may represent a user, a web browser, or another search engine licensed to use the clustering system. Network 104 may be a public or private network and/or combination thereof, for example the Internet, an Ethernet, serial/parallel bus, intranet, Local Area Network (“LAN”), Wide Area Network (“WAN”), and other forms of connecting multiple systems and/or groups of systems together. Content 106 may include text content, graphical content, audio content, video content, web based content and/or database content.
  • Network 104 is also connected to a search server 108, which is connected to index 110. Search server 108 may be configured to search and cluster content 106 for client 102. Search server 108 may be comprised of one or more servers. Index 110 may include a database and/or cache.
  • FIG. 2 illustrates an embodiment of a search server for clustering multimedia search. In some embodiments, the search server in FIG. 2 is included as search server 108 in FIG. 1. In the example shown, search server 202 includes a plurality of servers, including clustering search server 204. Clustering search server 204 includes a plurality of functional engines, including an optional signature generation engine 206 and clustering logic 208.
  • Signature generation engine 206 generates a signature representative of at least a portion of the multimedia content of a multimedia content item, such as an image or an audio and/or video clip, associated with a web page (or other actual or potential search result). A signature in some embodiments comprises a representation of at least a portion of the multimedia content of a multimedia content item.
  • In various embodiments, the signature is generated based at least in part on portions of multimedia content believed to be characteristic of and/or distinctive to the multimedia content being represented, such that there is a likelihood that another multimedia content item having the same or a very similar signature comprises multimedia content that is at least in part the same or nearly the same as corresponding content comprising the multimedia content item that the signature is generated to represent. Simple examples of a signature for illustrative purposes include without limitation the average RGB or grayscale value of each quadrant of an image, the percentage of laughter in an audio track, or the number of scene transitions in a video.
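The simplest of the illustrative signatures above, the average grayscale value of each image quadrant, can be sketched in a few lines. The function name `quadrant_signature` and the representation of an image as a list of pixel rows are assumptions made for this example.

```python
from typing import List, Sequence


def quadrant_signature(image: Sequence[Sequence[float]]) -> List[float]:
    """Return the mean grayscale value of each quadrant of an image.

    `image` is a list of rows of grayscale pixel values. The result is
    ordered top-left, top-right, bottom-left, bottom-right.
    """
    height = len(image)
    if height < 2 or len(image[0]) < 2:
        raise ValueError("image must be at least 2x2 pixels")
    width = len(image[0])
    mid_row, mid_col = height // 2, width // 2
    quadrants = [
        (0, mid_row, 0, mid_col),            # top-left
        (0, mid_row, mid_col, width),        # top-right
        (mid_row, height, 0, mid_col),       # bottom-left
        (mid_row, height, mid_col, width),   # bottom-right
    ]
    signature = []
    for row_start, row_end, col_start, col_end in quadrants:
        pixels = [image[r][c]
                  for r in range(row_start, row_end)
                  for c in range(col_start, col_end)]
        signature.append(sum(pixels) / len(pixels))
    return signature
```

The result is a four-element vector, so two images can be compared by any vector distance; a signature this coarse would be expected to collide often, which is why the text presents it only as a simple illustration.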
  • Signature generation engine 206 may include one or more hardware elements and/or software elements. Examples of such hardware elements include servers, embedded systems, printed circuit boards (“PCBs”), processors, application specific integrated circuits (“ASICs”), field programmable gate arrays (“FPGAs”), and programmable logic devices (“PLDs”). Examples of such software elements include modules, models, objects, libraries, procedures, functions, applications, applets, weblets, widgets and instructions.
  • Clustering logic 208 groups web search results associated with multimedia content items that have been determined to have the same or similar multimedia content, at least in part, based on a comparison of the respective signatures generated for each result by signature generation engine 206.
  • In some embodiments clustering is based at least in part on the entropy of the signatures of the multimedia content items. For example, a signature determined to have a high level of entropy, and which therefore presumably embodies more information, in some embodiments is given more weight than a signature having low entropy. The foregoing approach is based on the expectation that, all else being equal, if a first set of two multimedia content items have low entropy signatures with the same degree of similarity as the respective signatures of a second set of content items having high entropy signatures, the second set of content items is more likely to in fact have the same or very similar multimedia content than the first set. Stated another way, if a signature is low entropy, it is less likely to represent a particular multimedia content uniquely, and other content that is not that similar to the first content may have a sufficiently similar signature to generate a false match.
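  • One way to make the entropy weighting concrete is sketched below. The disclosure does not specify how entropy is measured or how the weight is applied. The sketch assumes the Shannon entropy of the signature's byte distribution and scales a similarity score by the lower of the two entropies; the names `shannon_entropy` and `match_confidence` are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(signature: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not signature:
        return 0.0
    total = len(signature)
    counts = Counter(signature)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def match_confidence(similarity: float, sig_a: bytes, sig_b: bytes) -> float:
    """Scale a similarity score in [0, 1] by the lower of the two entropies.

    Dividing by 8 bits normalises the weight to [0, 1]. This weighting rule
    is an assumption made for illustration, not the disclosed method.
    """
    weight = min(shannon_entropy(sig_a), shannon_entropy(sig_b)) / 8.0
    return similarity * weight
```

  • Under this rule, two equally similar pairs of signatures receive different confidence: the pair with high entropy signatures scores higher, reflecting the lower risk of a false match.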
  • Clustering logic 208 may include one or more hardware elements and/or software elements. Examples of such hardware elements include servers, embedded systems, printed circuit boards (“PCBs”), processors, application specific integrated circuits (“ASICs”), field programmable gate arrays (“FPGAs”), and programmable logic devices (“PLDs”). Examples of such software elements include modules, models, objects, libraries, procedures, functions, applications, applets, weblets, widgets and instructions.
  • FIG. 3 illustrates an embodiment of an index for clustering multimedia search. In some embodiments, the index in FIG. 3 is included as index 110 in FIG. 1. In the example shown, index 302 includes a plurality of indices, including clustering index 304. Clustering index 304 includes a plurality of indices, including a text metadata index 306 and signature index 308.
  • Text metadata index 306 references content 106 by metadata given for each content item. For example, a video clip may have as its metadata the producer's name, its run length and the title of the clip. In this example, the video clip would be represented in text metadata index 306 by storing its address and associated metadata. The video clip address may include its file location, its library reference number, its Uniform Resource Locator (“URL”), or its Uniform Resource Identifier (“URI”). The associated metadata may include the metadata field descriptions, for example “producer's name”, “run-length”, and “title”, as well as field content, for example “Joe Producer”, “1:45:34” and “Drama Squirrels: The Sequel”.
  • Signature index 308 references content 106 by the signature generated by signature generation engine 206. In the above example, the video clip would be represented in signature index 308 by storing its address and associated signature.
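  • The two indices described above can be pictured as a pair of mappings keyed by content address. The following is a minimal in-memory sketch; the class name `ClusteringIndex`, its methods, and the example URL are assumptions for illustration, and a real index 110 may instead use a database and/or cache.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ClusteringIndex:
    """A text metadata index and a signature index, both keyed by address."""

    text_metadata: Dict[str, Dict[str, str]] = field(default_factory=dict)
    signatures: Dict[str, List[float]] = field(default_factory=dict)

    def add(self, address: str, metadata: Dict[str, str],
            signature: List[float]) -> None:
        """Store a content item's metadata fields and its signature."""
        self.text_metadata[address] = dict(metadata)
        self.signatures[address] = list(signature)

    def search_text(self, term: str) -> List[str]:
        """Return addresses whose metadata field content contains the term."""
        needle = term.lower()
        return [
            address
            for address, fields in self.text_metadata.items()
            if any(needle in value.lower() for value in fields.values())
        ]


index = ClusteringIndex()
index.add(
    "http://example.com/clip",  # hypothetical URL
    {"producer's name": "Joe Producer", "run-length": "1:45:34",
     "title": "Drama Squirrels: The Sequel"},
    [0.1, 0.2, 0.3, 0.4],
)
```

  • Keeping both mappings keyed by the same address lets a text search locate candidate items and then retrieve their signatures for clustering without re-analyzing the content.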
  • FIG. 4 is a diagram illustrating an embodiment of a display page configured for clustering a set of multimedia search results. Display page 402 shows the layout of the page as rendered by a browser at, for example, client 102.
  • Display page 402 includes a search frame 404, which includes both a field for client 102 to enter in search parameters and an active element to initiate the search, such as a search button. In the example given in FIG. 4, the client 102 is searching for a “drama squirrel” multimedia content item.
  • Clustered multimedia search results are given in results frame 406, which in the example given in FIG. 4, shows thirty-one distinct results. Search results may be given in a ranked order, as is the case in FIG. 4. In results frame 406 the first result 408 is given for a “drama squirrel video” content item, and the clustering shows:
      • there are three identical videos located at addresses “video.oggle.com”, “utub.com” and “yourspace.com”, and
      • there are three variations of the “drama squirrel video” content item at addresses “facetext.com”, “utub.com/v” and “itube.com”.
  • In the example, the results frame 406 also shows a second result with a lower ranking as a “drama chipmunk video” content item, and a third result with a lower ranking as a “squirrel dance song” content item. In some embodiments, within a result there may be two additional implementations:
  • First, a “find similar” button 410 that will find similar results to any given result, without considering any metadata. In the example shown, there are 31 results for the “drama squirrel” search. Clicking “find similar” on the first result will return similar or duplicate results that may not necessarily have “drama squirrel” in their metadata.
  • In some embodiments a “find similar” button will cause a search based either only on the signature of the current result in comparison with other known signatures, or on the signature of the current result in comparison with other known signatures and metadata fields implicit to the current result.
  • Second, a “similarity slider” 412 that allows the exploration of a spectrum from ‘duplicate’ to ‘more similar’ to ‘less similar.’ In some embodiments, if the slider is placed on ‘duplicate’ for a “drama squirrel” search, only exact duplicates are found. As the slider is moved from ‘duplicate’ towards ‘less similar’, results become increasingly non-duplicated but still related.
  • For example, a search is made for a “Debra Hilton” video. A grocery shopping video with Debra Hilton is the result 408. There are four possible options by setting slider 412:
      • With the slider 412 on duplicate: Only exact matches are made to the current result, the Debra Hilton grocery shopping video;
      • With the slider 412 on more similar: Similar matches of thumbnails of the Debra Hilton grocery shopping video, including videos with possibly different shots or angles, are made;
      • With the slider 412 on similar: Similar matches of videos with Debra Hilton in comparable poses as to that in the Debra Hilton grocery shopping video, are made; and
      • With the slider 412 on less similar: Similar matches of videos with people who look like Debra Hilton as in the Debra Hilton grocery shopping video, are made.
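  • A simple way to picture slider 412 is as a selector for the maximum signature distance that still counts as a match. In the sketch below, the numeric limits, the use of Manhattan distance, and the names `SLIDER_MAX_DISTANCE` and `find_similar` are all assumptions for illustration. The disclosure describes the slider positions but not specific threshold values.

```python
from typing import Dict, List, Sequence

# Hypothetical maximum distance allowed at each slider position.
SLIDER_MAX_DISTANCE: Dict[str, int] = {
    "duplicate": 0,
    "more similar": 2,
    "similar": 5,
    "less similar": 10,
}


def manhattan(a: Sequence[int], b: Sequence[int]) -> int:
    """Manhattan (rectilinear) distance between two equal-length signatures."""
    return sum(abs(x - y) for x, y in zip(a, b))


def find_similar(current: Sequence[int],
                 known: Dict[str, Sequence[int]],
                 slider: str = "duplicate") -> List[str]:
    """Return addresses within the slider's distance limit, closest first."""
    limit = SLIDER_MAX_DISTANCE[slider]
    scored = [(manhattan(current, sig), address) for address, sig in known.items()]
    return [address for dist, address in sorted(scored) if dist <= limit]
```

  • Moving the slider toward ‘less similar’ only raises the limit, so each position returns a superset of the matches returned at the previous position.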
  • FIG. 5 is a flowchart illustrating an embodiment of a process for clustering a set of multimedia search results. The process may be implemented in search server 108.
  • In step 502, a signature of a search result is generated, based at least in part on an analysis of multimedia content associated with the web search result. In some embodiments this step is implemented by signature generation engine 206. Multimedia includes any content that is not purely text, such as images, video and audio. In some embodiments, a characteristic of the signature is that a distance metric may be calculated between a first and second signature. In some embodiments, the signature is a vector and the distance metric is a scalar. The distance metric may include one or more and/or a weighted or other combination of one or more of:
      • a Cartesian or Euclidean distance;
      • a Manhattan or rectilinear distance; and
      • a byte-wise difference between one or more of the bytes within the signature.
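  • The listed distance metrics could be sketched as follows for signatures represented as numeric vectors or byte strings. The function names and the default weights are assumptions. In particular, interpreting the “byte-wise difference” as a count of differing byte positions is one possible reading, not something the disclosure specifies.

```python
import math
from typing import Sequence


def _check_lengths(a: Sequence, b: Sequence) -> None:
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Cartesian or Euclidean distance between two signature vectors."""
    _check_lengths(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan or rectilinear distance between two signature vectors."""
    _check_lengths(a, b)
    return sum(abs(x - y) for x, y in zip(a, b))


def bytewise_difference(a: bytes, b: bytes) -> int:
    """Number of byte positions at which two signatures differ (assumed reading)."""
    _check_lengths(a, b)
    return sum(1 for x, y in zip(a, b) if x != y)


def combined_distance(a: Sequence[float], b: Sequence[float],
                      w_euclidean: float = 0.5,
                      w_manhattan: float = 0.5) -> float:
    """Weighted combination of two metrics; the weights are illustrative."""
    return w_euclidean * euclidean(a, b) + w_manhattan * manhattan(a, b)
```

  • Each function takes two signatures and returns a single scalar, matching the embodiment in which the signature is a vector and the distance metric is a scalar.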
  • The signature may include a hash value based at least in part on features of the image, audio, video or other multimedia type. In an image or video, the signature may include one or more of:
      • a recognized face, for example “The President of the United States”;
      • a recognized logo, for example “The USPTO Logo”;
      • a recognized facial feature, for example “A brown mustache”; and
      • recognizing a normalized feature, for example if all videos from a particular studio have a regular or normal form.
  • While particular types of signature and distance metrics are described above, in practice any relatively concise representation of the multimedia content of a content item may be used, provided that another content item having the same or a very similar signature is likely to include the same or similar multimedia content and, conversely, content items having a relatively more dissimilar signature are unlikely to include the same or very similar multimedia content.
  • In step 504, the set of web search results is clustered based at least in part on the signature of each web search result. In some embodiments, the signature of each web search result is compared to another web search result's signature by analyzing their distance metric.
  • In some embodiments, the signatures of content 106 are pre-computed before performing search. By pre-computing signatures, it is possible to find a multimedia content item that is similar to a web search result, for example, to:
      • find a video clip that looks like a web search result;
      • find a video clip that sounds like a web search result; and
      • find an audio clip that sounds like a web search result.
  • FIG. 6 is a flowchart illustrating an embodiment of a process for clustering search results. In some embodiments, the process of FIG. 6 is included in 504 of FIG. 5. The process may be implemented in search server 108.
  • In step 602, the text metadata is used to find responsive records and optionally assign rankings. For example, a search for “drama squirrel” could use available text search techniques to find records with metadata that includes: “drama squirrel”, “drama”, “squirrel”, “show squirrel”, “drama chipmunk”, and other permutations from parsing the query. In some embodiments, rankings may be assigned based on the relevance of the found records to the search query using available ranking techniques.
  • In step 604, with both the signatures and text metadata rankings, the results may be organized, clustered and/or displayed. In some embodiments the organization and clustering may be similar to the example for frame 406.
  • FIG. 7 is a flowchart illustrating an embodiment of a process for clustering search results given signatures and metadata. In some embodiments, the process of FIG. 7 is included in 604 of FIG. 6. The process may be implemented in search server 108.
  • In step 702, the results from the text metadata search in step 602 are coupled with the signatures generated in step 502 and sorted into bins. For example, a search for “drama squirrel” may find the highest ranked result is a “drama squirrel video” content item available by network 104 that has several identical copies at different addresses, and several similar copies at other addresses. In this example all of these content items would be consolidated in a single bin.
  • In step 704, the bins would be ordered and displayed by their bin rankings. The bin ranking of a specified bin is related to the rank of each result within that specified bin. In some embodiments, the bin ranking would be directly related to the highest ranked result within each bin. In some embodiments, the bin ranking would be further weighted by the number or quality of results within a bin. In some embodiments, displaying a cluster includes labeling two web search results with similar video signatures and different audio signatures as commentary. In some embodiments, displaying a cluster includes labeling two web search results with similar audio signatures and different video signatures as remixes. In some embodiments, the number of cluster members in a bin can be used as a ranking factor, such that a result with a high number of duplicates would be deemed more significant than a result with very few duplicates. For example, the most popular video of a contemporary singer, Jane Smith, would have a very high number of copies circulating on the web compared with a homemade video of a Jane Smith cover.
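  • The bin ranking and labeling of step 704 might look like the sketch below. It assumes each result carries a relevance score where higher is better, takes the best score in a bin as the base ranking, and adds an optional bonus per additional bin member. The additive formula, the `size_weight` parameter and the function names are assumptions, since the disclosure states only that the ranking relates to the highest ranked result and may be weighted by the number of results.

```python
from typing import List, Optional, Tuple

Result = Tuple[str, float]  # (address, relevance score; higher is better)


def bin_score(results: List[Result], size_weight: float = 0.0) -> float:
    """Best score in the bin plus an optional bonus per additional member."""
    best = max(score for _, score in results)
    return best + size_weight * (len(results) - 1)


def order_bins(bins: List[List[Result]],
               size_weight: float = 0.0) -> List[List[Result]]:
    """Return the bins sorted from highest to lowest bin score."""
    return sorted(bins, key=lambda b: bin_score(b, size_weight), reverse=True)


def relation_label(video_similar: bool, audio_similar: bool) -> Optional[str]:
    """Label a pair of results per the commentary and remix embodiments."""
    if video_similar and not audio_similar:
        return "commentary"
    if audio_similar and not video_similar:
        return "remix"
    return None
```

  • With `size_weight` left at zero, ordering depends only on the highest ranked result in each bin. Raising it lets a bin with many duplicates outrank a bin holding a single, slightly more relevant result.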
  • FIG. 8 is a flowchart illustrating an embodiment of a process for sorting results into bins. In some embodiments, the process of FIG. 8 is included in 702 of FIG. 7. The process may be implemented in search server 108.
  • In step 802, a first ranked result is assigned as the primary result, with its signature generated from step 502. In step 804, the next result is compared by computing the distance between itself and the primary result. If it is determined in step 806 that the distance is less than a predetermined threshold, then control is transferred to step 808; otherwise, control is transferred to step 810. A distance less than the predetermined threshold may indicate that the two multimedia content items associated with the two results are related.
  • In step 808, two related results will be grouped together in a bin. In some embodiments, the predetermined threshold in step 806 indicates that the two results are either identical or similar, for example, a post-production modification. In some embodiments a second comparison will be made to see if the distance is less than a predetermined smaller threshold. A distance less than the predetermined smaller threshold may indicate that the two multimedia content items associated with the two results are nearly identical. Thus, within the bin, there may be at least two sub-bins; the first of “identical” content items to the primary result, and the second of “similar” content items to the primary result. In some embodiments there may be a recursive clustering within clustered results. In some embodiments, if a result is placed within a bin, it may be removed from being contained within another bin.
  • In step 810, if it is determined that there are no other results to compare with the primary result, then control is transferred to step 814; otherwise, control is transferred to step 812. In some embodiments, there may be no other results to compare because every result has already been compared with the primary result. In some embodiments, there may be no other results to compare because a predetermined number of results have already been compared with the primary result. In step 812, the process repeats starting with step 804, this time comparing the primary result with the next ranked result.
  • In step 814, if it is determined that the clustering is complete, the process ends; otherwise, control is transferred to step 816. In some embodiments, clustering is complete because every result has been placed in a bin. In some embodiments, clustering is complete based on a heuristic; for example the heuristic may determine to stop after thirty bins have been created.
  • In step 816, the next available result is assigned as the primary result. In some embodiments, the next available result is the next ranked result from the primary result. In some embodiments, the next available result is the next ranked result from the primary result not already in a bin.
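  • The flow of FIG. 8 can be summarized in a short sketch. It follows the embodiment in which a result already placed in a bin is not reused as a primary result, and it splits each bin into “identical” and “similar” sub-bins using two thresholds. The function name `sort_into_bins`, the dictionary layout of a bin, the Manhattan distance, and the threshold values used in the test data are assumptions for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Ranked = Tuple[str, Sequence[float]]  # (address, signature), in rank order


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def sort_into_bins(ranked: List[Ranked],
                   similar_threshold: float,
                   identical_threshold: float,
                   max_bins: Optional[int] = None) -> List[Dict[str, object]]:
    """Group ranked results into bins around successive primary results."""
    bins: List[Dict[str, object]] = []
    binned = set()  # positions of results already placed in a bin
    for i, (primary_address, primary_sig) in enumerate(ranked):
        if i in binned:
            continue  # step 816: skip results already in a bin
        if max_bins is not None and len(bins) >= max_bins:
            break  # step 814: heuristic stop after a fixed number of bins
        identical: List[str] = []
        similar: List[str] = []
        binned.add(i)  # steps 802/816: this result becomes the primary
        for j in range(i + 1, len(ranked)):  # steps 804, 810 and 812
            if j in binned:
                continue
            address, sig = ranked[j]
            dist = manhattan(primary_sig, sig)
            if dist < identical_threshold:  # smaller threshold: near-identical
                identical.append(address)
                binned.add(j)
            elif dist < similar_threshold:  # steps 806/808: related
                similar.append(address)
                binned.add(j)
        bins.append({"primary": primary_address,
                     "identical": identical, "similar": similar})
    return bins
```

  • Because results are visited in rank order, each bin's primary result is its highest ranked member, which fits the bin ranking embodiment described for step 704.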
  • Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, the invention is not limited to the details provided. There are many alternative ways of implementing the invention. The disclosed embodiments are illustrative and not restrictive.

Claims (21)

What is claimed is:
1. (canceled)
2. A method, comprising:
determining a first level of entropy associated with a first signature for a first multimedia content item;
determining a second level of entropy associated with a second signature for a second multimedia content item; and
clustering the first multimedia content item with the second multimedia content item based at least in part on comparing the first level of entropy associated with the first signature and the second level of entropy associated with the first signature with that of another set of content items.
3. A method as recited in claim 2, further comprising generating a first media content signature based at least in part on an analysis of multimedia content for the first multimedia content item.
4. A method as recited in claim 3, further comprising generating a second media content signature based at least in part on an analysis of multimedia content for the second multimedia content item.
5. A method as recited in claim 4, wherein the first multimedia content item is associated with a first web search result and the second multimedia content item is associated with a second web search result.
6. A method as recited in claim 5, further comprising reducing web search results based at least in part on an analysis of textual metadata.
7. A method as recited in claim 4, wherein clustering is based on an expectation that all else being equal if a low entropy set of multimedia content items have the same degree of similarity as the respective signatures of a high entropy set of content items, the high entropy set is more likely to have similar multimedia than the low entropy set.
8. A method as recited in claim 4, wherein clustering is based on an expectation that a low entropy signature is less likely to uniquely represent a particular multimedia content.
9. A method as recited in claim 4, wherein clustering includes labeling two web search results with similar video signatures and different audio signatures as commentary.
10. A method as recited in claim 4, wherein clustering includes labeling two web search results with similar audio signatures and different video signatures as remixes.
11. A method as recited in claim 4, wherein the comparison includes calculating a distance metric between the first media content signature and the second media content signature.
12. A method as recited in claim 11, wherein the distance metric includes one or a weighted combination of: a Cartesian distance; a Manhattan distance; a Euclidean distance; and a byte difference.
13. A method as recited in claim 4, wherein each media content signature includes a hash value based at least in part on one or more of the following: image features, audio features, and video features.
14. A method as recited in claim 4, wherein each media content signature includes one or more of the following: a recognized face, a recognized logo, and a recognized facial feature.
15. A method as recited in claim 4, further comprising finding a video that sounds like a web search result.
16. A method as recited in claim 2, wherein multimedia is any non-textual data or metadata.
17. A method as recited in claim 2, wherein multimedia includes images, video and audio.
18. A method as recited in claim 2, wherein clustering includes consolidating web search results if the distance metric is below an identical-threshold.
19. A method as recited in claim 2, wherein clustering includes highlighting web search results if the distance metric is below a similar-threshold but above an identical-threshold.
20. A system, comprising:
a data store configured to store signatures of web search results; and
a processor coupled to the data store and configured to:
determine a first level of entropy associated with a first signature for a first multimedia content item;
determine a second level of entropy associated with a second signature for a second multimedia content item; and
cluster the first multimedia content item with the second multimedia content item based at least in part on comparing the first level of entropy associated with the first signature and the second level of entropy associated with the first signature with that of another set of content items.
21. A computer program product, the computer program product being embodied in a non-transitory computer readable storage medium and comprising computer instructions for:
determining a first level of entropy associated with a first signature for a first multimedia content item;
determining a second level of entropy associated with a second signature for a second multimedia content item; and
clustering the first multimedia content item with the second multimedia content item based at least in part on comparing the first level of entropy associated with the first signature and the second level of entropy associated with the first signature with that of another set of content items.
US14/748,631 2007-12-21 2015-06-24 Clustering multimedia search Abandoned US20160026707A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/748,631 US20160026707A1 (en) 2007-12-21 2015-06-24 Clustering multimedia search

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US867807P 2007-12-21 2007-12-21
US12/317,253 US8285718B1 (en) 2007-12-21 2008-12-19 Clustering multimedia search
US13/608,349 US9098585B2 (en) 2007-12-21 2012-09-10 Clustering multimedia search
US14/748,631 US20160026707A1 (en) 2007-12-21 2015-06-24 Clustering multimedia search

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US13/608,349 Continuation US9098585B2 (en) 2007-12-21 2012-09-10 Clustering multimedia search

Publications (1)

Publication Number Publication Date
US20160026707A1 true US20160026707A1 (en) 2016-01-28

Family

ID=46964316

Family Applications (3)

Application Number Title Priority Date Filing Date
US12/317,253 Active 2030-06-15 US8285718B1 (en) 2007-12-21 2008-12-19 Clustering multimedia search
US13/608,349 Active US9098585B2 (en) 2007-12-21 2012-09-10 Clustering multimedia search
US14/748,631 Abandoned US20160026707A1 (en) 2007-12-21 2015-06-24 Clustering multimedia search

Country Status (1)

Country Link
US (3) US8285718B1 (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5436653A (en) * 1992-04-30 1995-07-25 The Arbitron Company Method and system for recognition of broadcast segments
US20020038299A1 (en) * 2000-03-20 2002-03-28 Uri Zernik Interface for presenting information
US20040128511A1 (en) * 2000-12-20 2004-07-01 Qibin Sun Methods and systems for generating multimedia signature
US20070078846A1 (en) * 2005-09-30 2007-04-05 Antonino Gulli Similarity detection and clustering of images
US20070083513A1 (en) * 2005-10-12 2007-04-12 Ira Cohen Determining a recurrent problem of a computer resource using signatures
US20070085710A1 (en) * 2005-10-19 2007-04-19 Advanced Digital Forensic Solutions, Inc. Methods for searching forensic data
US7809722B2 (en) * 2005-05-09 2010-10-05 Like.Com System and method for enabling search and retrieval from image files based on recognized information
US7822700B2 (en) * 2006-12-29 2010-10-26 Brooks Roger K Method for using lengths of data paths in assessing the morphological similarity of sets of data by using equivalence signatures
US7853344B2 (en) * 2000-10-24 2010-12-14 Rovi Technologies Corporation Method and system for analyzing ditigal audio files

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6374266B1 (en) * 1998-07-28 2002-04-16 Ralph Shnelvar Method and apparatus for storing information in a data processing system
US6774917B1 (en) * 1999-03-11 2004-08-10 Fuji Xerox Co., Ltd. Methods and apparatuses for interactive similarity searching, retrieval, and browsing of video
US7127615B2 (en) * 2000-09-20 2006-10-24 Blue Spike, Inc. Security based on subliminal and supraliminal channels for data objects
US6518892B2 (en) * 2000-11-06 2003-02-11 Broadcom Corporation Stopping criteria for iterative decoding
US7681032B2 (en) * 2001-03-12 2010-03-16 Portauthority Technologies Inc. System and method for monitoring unauthorized transport of digital content
US7529659B2 (en) * 2005-09-28 2009-05-05 Audible Magic Corporation Method and apparatus for identifying an unknown work
JP4991283B2 (en) * 2003-02-21 2012-08-01 カリンゴ・インコーポレーテッド Additional hash functions in content-based addressing
US7373520B1 (en) * 2003-06-18 2008-05-13 Symantec Operating Corporation Method for computing data signatures
US7376752B1 (en) * 2003-10-28 2008-05-20 David Chudnovsky Method to resolve an incorrectly entered uniform resource locator (URL)
US20070242066A1 (en) * 2006-04-14 2007-10-18 Patrick Levy Rosenthal Virtual video camera device with three-dimensional tracking and virtual object insertion
US8004536B2 (en) * 2006-12-01 2011-08-23 Adobe Systems Incorporated Coherent image selection and modification
US8171030B2 (en) * 2007-06-18 2012-05-01 Zeitera, Llc Method and apparatus for multi-dimensional content search and video identification
US8230475B2 (en) * 2007-11-16 2012-07-24 At&T Intellectual Property I, L.P. Methods and computer program products for subcontent tagging and playback
US8135221B2 (en) * 2009-10-07 2012-03-13 Eastman Kodak Company Video concept classification using audio-visual atoms

Cited By (86)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10621988B2 (en) 2005-10-26 2020-04-14 Cortica Ltd System and method for speech to text translation using cores of a natural liquid architecture system
US9940326B2 (en) 2005-10-26 2018-04-10 Cortica, Ltd. System and method for speech to speech translation using cores of a natural liquid architecture system
US11032017B2 (en) 2005-10-26 2021-06-08 Cortica, Ltd. System and method for identifying the context of multimedia content elements
US9886437B2 (en) 2005-10-26 2018-02-06 Cortica, Ltd. System and method for generation of signatures for multimedia data elements
US10635640B2 (en) 2005-10-26 2020-04-28 Cortica, Ltd. System and method for enriching a concept database
US9953032B2 (en) 2005-10-26 2018-04-24 Cortica, Ltd. System and method for characterization of multimedia content signals using cores of a natural liquid architecture system
US10180942B2 (en) 2005-10-26 2019-01-15 Cortica Ltd. System and method for generation of concept structures based on sub-concepts
US10191976B2 (en) 2005-10-26 2019-01-29 Cortica, Ltd. System and method of detecting common patterns within unstructured data elements retrieved from big data sources
US10193990B2 (en) 2005-10-26 2019-01-29 Cortica Ltd. System and method for creating user profiles based on multimedia content
US10210257B2 (en) 2005-10-26 2019-02-19 Cortica, Ltd. Apparatus and method for determining user attention using a deep-content-classification (DCC) system
US10331737B2 (en) 2005-10-26 2019-06-25 Cortica Ltd. System for generation of a large-scale database of hetrogeneous speech
US10360253B2 (en) 2005-10-26 2019-07-23 Cortica, Ltd. Systems and methods for generation of searchable structures respective of multimedia data content
US10372746B2 (en) 2005-10-26 2019-08-06 Cortica, Ltd. System and method for searching applications using multimedia content elements
US10380623B2 (en) 2005-10-26 2019-08-13 Cortica, Ltd. System and method for generating an advertisement effectiveness performance score
US10380267B2 (en) 2005-10-26 2019-08-13 Cortica, Ltd. System and method for tagging multimedia content elements
US10691642B2 (en) 2005-10-26 2020-06-23 Cortica Ltd System and method for enriching a concept database with homogenous concepts
US10387914B2 (en) 2005-10-26 2019-08-20 Cortica, Ltd. Method for identification of multimedia content elements and adding advertising content respective thereof
US11216498B2 (en) 2005-10-26 2022-01-04 Cortica, Ltd. System and method for generating signatures to three-dimensional multimedia data elements
US10430386B2 (en) 2005-10-26 2019-10-01 Cortica Ltd System and method for enriching a concept database
US10535192B2 (en) 2005-10-26 2020-01-14 Cortica Ltd. System and method for generating a customized augmented reality environment to a user
US10552380B2 (en) 2005-10-26 2020-02-04 Cortica Ltd System and method for contextually enriching a concept database
US10585934B2 (en) 2005-10-26 2020-03-10 Cortica Ltd. Method and system for populating a concept database with respect to user identifiers
US10607355B2 (en) 2005-10-26 2020-03-31 Cortica, Ltd. Method and system for determining the dimensions of an object shown in a multimedia content item
US10614626B2 (en) 2005-10-26 2020-04-07 Cortica Ltd. System and method for providing augmented reality challenges
US9792620B2 (en) 2005-10-26 2017-10-17 Cortica, Ltd. System and method for brand monitoring and trend analysis based on deep-content-classification
US11019161B2 (en) 2005-10-26 2021-05-25 Cortica, Ltd. System and method for profiling users interest based on multimedia content analysis
US10380164B2 (en) 2005-10-26 2019-08-13 Cortica, Ltd. System and method for using on-image gestures and multimedia content elements as search queries
US10698939B2 (en) 2005-10-26 2020-06-30 Cortica Ltd System and method for customizing images
US10706094B2 (en) 2005-10-26 2020-07-07 Cortica Ltd System and method for customizing a display of a user device based on multimedia content element signatures
US11003706B2 (en) 2005-10-26 2021-05-11 Cortica Ltd System and methods for determining access permissions on personalized clusters of multimedia content elements
US10742340B2 (en) 2005-10-26 2020-08-11 Cortica Ltd. System and method for identifying the context of multimedia content elements displayed in a web-page and providing contextual filters respective thereto
US10949773B2 (en) 2005-10-26 2021-03-16 Cortica, Ltd. System and methods thereof for recommending tags for multimedia content elements based on context
US10902049B2 (en) 2005-10-26 2021-01-26 Cortica Ltd System and method for assigning multimedia content elements to users
US11758004B2 (en) 2005-10-26 2023-09-12 Cortica Ltd. System and method for providing recommendations based on user profiles
US10776585B2 (en) 2005-10-26 2020-09-15 Cortica, Ltd. System and method for recognizing characters in multimedia content
US11620327B2 (en) 2005-10-26 2023-04-04 Cortica Ltd System and method for determining a contextual insight and generating an interface with recommendations based thereon
US11604847B2 (en) 2005-10-26 2023-03-14 Cortica Ltd. System and method for overlaying content on a multimedia content element based on user interest
US11403336B2 (en) 2005-10-26 2022-08-02 Cortica Ltd. System and method for removing contextually identical multimedia content elements
US10831814B2 (en) 2005-10-26 2020-11-10 Cortica, Ltd. System and method for linking multimedia data elements to web pages
US11386139B2 (en) 2005-10-26 2022-07-12 Cortica Ltd. System and method for generating analytics for entities depicted in multimedia content
US10848590B2 (en) 2005-10-26 2020-11-24 Cortica Ltd System and method for determining a contextual insight and providing recommendations based thereon
US11361014B2 (en) 2005-10-26 2022-06-14 Cortica Ltd. System and method for completing a user profile
US9767143B2 (en) 2005-10-26 2017-09-19 Cortica, Ltd. System and method for caching of concept structures
US10733326B2 (en) 2006-10-26 2020-08-04 Cortica Ltd. System and method for identification of inappropriate multimedia content
US11195043B2 (en) 2015-12-15 2021-12-07 Cortica, Ltd. System and method for determining common patterns in multimedia content elements based on key points
US11037015B2 (en) 2015-12-15 2021-06-15 Cortica Ltd. Identification of key points in multimedia data elements
WO2017160413A1 (en) * 2016-03-13 2017-09-21 Cortica, Ltd. System and method for clustering multimedia content elements
US10402436B2 (en) 2016-05-12 2019-09-03 Pixel Forensics, Inc. Automated video categorization, value determination and promotion/demotion via multi-attribute feature computation
US11760387B2 (en) 2017-07-05 2023-09-19 AutoBrains Technologies Ltd. Driving policies determination
US11899707B2 (en) 2017-07-09 2024-02-13 Cortica Ltd. Driving policies determination
US10846544B2 (en) 2018-07-16 2020-11-24 Cartica Ai Ltd. Transportation prediction system and method
US11282391B2 (en) 2018-10-18 2022-03-22 Cartica Ai Ltd. Object detection at different illumination conditions
US11126870B2 (en) 2018-10-18 2021-09-21 Cartica Ai Ltd. Method and system for obstacle detection
US11718322B2 (en) 2018-10-18 2023-08-08 Autobrains Technologies Ltd Risk based assessment
US11181911B2 (en) 2018-10-18 2021-11-23 Cartica Ai Ltd Control transfer of a vehicle
US11087628B2 (en) 2018-10-18 2021-08-10 Cartica Ai Ltd. Using rear sensor for wrong-way driving warning
US11029685B2 (en) 2018-10-18 2021-06-08 Cartica Ai Ltd. Autonomous risk assessment for fallen cargo
US11685400B2 (en) 2018-10-18 2023-06-27 Autobrains Technologies Ltd Estimating danger from future falling cargo
US11673583B2 (en) 2018-10-18 2023-06-13 AutoBrains Technologies Ltd. Wrong-way driving warning
US10839694B2 (en) 2018-10-18 2020-11-17 Cartica Ai Ltd Blind spot alert
US11270132B2 (en) 2018-10-26 2022-03-08 Cartica Ai Ltd Vehicle to vehicle communication and signatures
US11373413B2 (en) 2018-10-26 2022-06-28 Autobrains Technologies Ltd Concept update and vehicle to vehicle communication
US11126869B2 (en) 2018-10-26 2021-09-21 Cartica Ai Ltd. Tracking after objects
US11700356B2 (en) 2018-10-26 2023-07-11 AutoBrains Technologies Ltd. Control transfer of a vehicle
US11244176B2 (en) 2018-10-26 2022-02-08 Cartica Ai Ltd Obstacle detection and mapping
US10789535B2 (en) 2018-11-26 2020-09-29 Cartica Ai Ltd Detection of road elements
US11643005B2 (en) 2019-02-27 2023-05-09 Autobrains Technologies Ltd Adjusting adjustable headlights of a vehicle
US11285963B2 (en) 2019-03-10 2022-03-29 Cartica Ai Ltd. Driver-based prediction of dangerous events
US11755920B2 (en) 2019-03-13 2023-09-12 Cortica Ltd. Method for object detection using knowledge distillation
US11694088B2 (en) 2019-03-13 2023-07-04 Cortica Ltd. Method for object detection using knowledge distillation
US11132548B2 (en) 2019-03-20 2021-09-28 Cortica Ltd. Determining object information that does not explicitly appear in a media unit signature
US11481582B2 (en) 2019-03-31 2022-10-25 Cortica Ltd. Dynamic matching a sensed signal to a concept structure
US10796444B1 (en) 2019-03-31 2020-10-06 Cortica Ltd Configuring spanning elements of a signature generator
US10748038B1 (en) 2019-03-31 2020-08-18 Cortica Ltd. Efficient calculation of a robust signature of a media unit
US10776669B1 (en) 2019-03-31 2020-09-15 Cortica Ltd. Signature generation and object detection that refer to rare scenes
US11222069B2 (en) 2019-03-31 2022-01-11 Cortica Ltd. Low-power calculation of a signature of a media unit
US11488290B2 (en) 2019-03-31 2022-11-01 Cortica Ltd. Hybrid representation of a media unit
US11275971B2 (en) 2019-03-31 2022-03-15 Cortica Ltd. Bootstrap unsupervised learning
US10789527B1 (en) 2019-03-31 2020-09-29 Cortica Ltd. Method for object detection using shallow neural networks
US11741687B2 (en) 2019-03-31 2023-08-29 Cortica Ltd. Configuring spanning elements of a signature generator
US10846570B2 (en) 2019-03-31 2020-11-24 Cortica Ltd. Scale inveriant object detection
US11593662B2 (en) 2019-12-12 2023-02-28 Autobrains Technologies Ltd Unsupervised cluster generation
US10748022B1 (en) 2019-12-12 2020-08-18 Cartica Ai Ltd Crowd separation
US11590988B2 (en) 2020-03-19 2023-02-28 Autobrains Technologies Ltd Predictive turning assistant
US11827215B2 (en) 2020-03-31 2023-11-28 AutoBrains Technologies Ltd. Method for training a driving related object detector
US11756424B2 (en) 2020-07-24 2023-09-12 AutoBrains Technologies Ltd. Parking assist

Also Published As

Publication number Publication date
US9098585B2 (en) 2015-08-04
US8285718B1 (en) 2012-10-09
US20130066856A1 (en) 2013-03-14

Similar Documents

Publication Publication Date Title
US8285718B1 (en) Clustering multimedia search
US11151145B2 (en) Tag selection and recommendation to a user of a content hosting service
USRE48791E1 (en) Scalable, adaptable, and manageable system for multimedia identification
US11693902B2 (en) Relevance-based image selection
US10552754B2 (en) Systems and methods for recognizing ambiguity in metadata
US9785708B2 (en) Scalable, adaptable, and manageable system for multimedia identification
JP5736469B2 (en) Search keyword recommendation based on user intention
US8316056B2 (en) Second-order connection search in a social networking system
US20080247610A1 (en) Apparatus, Method and Computer Program for Processing Information
US8856051B1 (en) Augmenting metadata of digital objects
US20140201180A1 (en) Intelligent Supplemental Search Engine Optimization
US20100070507A1 (en) Hybrid content recommending server, system, and method
Deldjoo et al. MMTF-14K: a multifaceted movie trailer feature dataset for recommendation and retrieval
CN107918657B (en) Data source matching method and device
WO2012001485A1 (en) Method and apparatus for managing video content
US9229958B2 (en) Retrieving visual media
CN111294660B (en) Video clip positioning method, server, client and electronic equipment
WO2015030645A1 (en) Methods, computer program, computer program product and indexing systems for indexing or updating index
CA2757771A1 (en) Similarity-based feature set supplementation for classification
CN109933691B (en) Method, apparatus, device and storage medium for content retrieval
JP4544047B2 (en) Web image search result classification presentation method and apparatus, program, and storage medium storing program
TW202011231A (en) Data analysis method and data analysis system thereof
CN105279172B (en) Video matching method and device
KR101575819B1 (en) Video search and offering method
Pedro et al. Web‐Based Multimedia Information Extraction Based on Social Redundancy

Legal Events

Date Code Title Description
AS Assignment

Owner name: JPMORGAN CHASE BANK, N.A., AS AGENT, ILLINOIS

Free format text: SECURITY AGREEMENT;ASSIGNORS:GRACENOTE, INC.;TRIBUNE BROADCASTING COMPANY, LLC;CASTTV INC.;REEL/FRAME:036354/0793

Effective date: 20150813

AS Assignment

Owner name: JPMORGAN CHASE BANK, N.A., AS COLLATERAL AGENT, IL

Free format text: SECURITY AGREEMENT;ASSIGNORS:CASTTV INC.;GRACENOTE, INC.;TRIBUNE BROADCASTING COMPANY, LLC;AND OTHERS;REEL/FRAME:037569/0270

Effective date: 20151104

AS Assignment

Owner name: JPMORGAN CHASE BANK, N.A., AS COLLATERAL AGENT, IL

Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE INCORRECT APPL. NO. 14/282,293 PREVIOUSLY RECORDED AT REEL: 037569 FRAME: 0270. ASSIGNOR(S) HEREBY CONFIRMS THE SECURITY AGREEMENT;ASSIGNORS:CASTTV INC.;GRACENOTE, INC.;TRIBUNE BROADCASTING COMPANY, LLC;AND OTHERS;REEL/FRAME:037606/0880

Effective date: 20151104

AS Assignment

Owner name: CASTTV INC., ILLINOIS

Free format text: RELEASE OF SECURITY INTEREST IN PATENT RIGHTS;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:041656/0804

Effective date: 20170201

Owner name: GRACENOTE, INC., CALIFORNIA

Free format text: RELEASE OF SECURITY INTEREST IN PATENT RIGHTS;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:041656/0804

Effective date: 20170201

Owner name: TRIBUNE DIGITAL VENTURES, LLC, ILLINOIS

Free format text: RELEASE OF SECURITY INTEREST IN PATENT RIGHTS;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:041656/0804

Effective date: 20170201

Owner name: TRIBUNE MEDIA SERVICES, LLC, ILLINOIS

Free format text: RELEASE OF SECURITY INTEREST IN PATENT RIGHTS;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:041656/0804

Effective date: 20170201

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment

Owner name: GRACENOTE MEDIA SERVICES, LLC, CONNECTICUT

Free format text: MERGER;ASSIGNOR:CASTTV INC.;REEL/FRAME:053076/0338

Effective date: 20171222