WO2001004775A2 - A method for constructing a homogeneous electronic catalog - Google Patents

Publication number
WO2001004775A2
Authority
WO
WIPO (PCT)
Prior art keywords
database
catalog
supplier
fields
homogenous
Application number
PCT/IL2000/000417
Other languages
French (fr)
Other versions
WO2001004775A3 (en)
Inventor
Tsvika Ben Porat
Luz Erez
Ziv Ofek
Original Assignee
Paragon B.V.
Application filed by Paragon B.V.
Priority to AU58431/00A
Publication of WO2001004775A2
Publication of WO2001004775A3

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases

Definitions

  • This invention relates to the generation and use of electronic catalogs.
  • a typical concept list includes: item id (standing for a key field that uniquely identifies each item) item name (e.g. Air Jordan shoe); item info (e.g. sport shoe for running).
  • the concepts include also categories such as sport shoe which include properties, e.g. size, color etc.
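  • the user has no knowledge of the identity of the relevant suppliers (or at least not of all the relevant suppliers), and hence he/she must conduct the search across the Internet in accordance with the product name and/or possible category or property identifying the product.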
  • the user invokes a query using a search engine utility (e.g. the known Yahoo) and provides query parameters that identify the desired product, category and/or property of interest.
  • the search engine finds a site or sites that include data that match the sought parameters, the data (e.g. relevant html pages) are retrieved and downloaded to the user.
  • the so retrieved data are often sorted by some relevance ranking, which is intended to approximate the degree of relevance of the resulting data to the query.
  • An exemplary (yet not exclusive) AI-based concept harmonization technique includes automatic learning of the "logic" which governs the classification of category.
  • Methods belonging to this approach utilize a set of training data, for which the correct categories are known in advance (usually as the result of manual classification of these categories).
  • a learning method may then include a learning phase, in which some model of the category is constructed.
  • a model may include terms that are highly associated with the category, and possibly some weights that quantify the degree of correlation between each term and the category.
  • a learning method may be memory based, in which case the learning method simply stores the training data in some useful format. Then, when a new item (say product) is given for classification, the method classifies it automatically by consulting or applying the category model (or by simply comparing the new data item to the training data, in case of a memory based approach).
  • Relational Model (or Database): The relational model, introduced by Codd, is a landmark in the history of database development.
  • an abstract concept has been introduced, according to which the data is represented by tables (referred to as entity or relationship “relations") in which the columns represent the fields and rows represent the records.
  • Field A column in a table of a relational database which represents an attribute of a data record (standing for a product) in the table, for example color, size, price in a table that represents clothing.
  • a product is represented by some or all of its fields.
  • Category Hierarchical structure of concepts represented as category values, to which products or group of products are classified. In the present invention, products are, typically (although not necessarily) classified to category values taken from the leaf nodes of the hierarchy;
  • Property A specific field type which signifies a characteristic of given product or products and which is normally not common to all the products in the catalog. As will be shown below, a property is assigned with property values.
  • a technique to construct a homogeneous knowledge base (constituting a homogeneous catalog) from a plurality of dispersed knowledge bases, each of which constitutes a separate catalog.
  • all or some of the fields are mapped to corresponding fields in the homogeneous catalog (constituting homogenous catalog field structure).
  • the catalog field structure which, as will be explained in greater detail below, includes, preferably, fields of "field type", "category type” and "property type"
  • the category values and property values of the separate catalogs are mapped to the homogeneous catalog.
  • a 'catalog import' step in which the contents of the supplier database is mapped to the homogenous catalog.
  • mapping selected fields in the supplier catalog database to corresponding fields in the homogenous catalog database; said fields include "field type" fields, "property type" fields and "category type" fields;
  • mapping category values in said supplier catalog database to corresponding category values in the homogenous catalog database;
  • mapping property values in said supplier catalog database to corresponding property values in the homogenous catalog database; and transferring data contained in said fields from said supplier catalog database to said homogenous catalog database.
  • the category values of the source supplier database are mapped to respective category values in the homogenous catalog using, preferably a "group by" function.
  • additional separate catalogs are mapped to the same homogeneous catalog.
  • the catalog stores in a unified and homogenous manner, data originated from said separate catalogs.
  • the knowledge base is arranged in accordance with the relational model database.
  • a typical, yet not exclusive example of a communication network being the Internet.
  • the catalog may be subject to queries, utilizing e.g. conventional query languages such as SQL.
  • the query that is applied to the homogeneous catalog uses terms which are identical (or substantially identical) to those that constitute the homogeneous catalog, and therefore the prospects of missing data due to inconsistent definitions are substantially reduced or even eliminated.
  • the specified terms may be chosen to be field names (of field, category and/or property type), category values, and/or property values. Other terms may also be used, all as required and appropriate.
  • Fig. 1 is a generalized system architecture in accordance with the invention
  • Fig. 2 is a flow chart illustrating a generalized sequence of operation, in accordance with the invention
  • Fig. 3 is a generalized flow chart illustrating a field mapping sequence in accordance with one embodiment of the invention
  • FIG. 4A-B illustrate exemplary user interface screens for realizing the field mapping sequence of Fig. 3
  • Fig. 5 is a generalized flow chart illustrating category values mapping sequence in accordance with one embodiment of the invention
  • Figs. 6A-B illustrate exemplary user interface screens for realizing the category values mapping sequence of Fig. 5;
  • Fig. 7 illustrates a typical hierarchy of categories
  • Fig. 8 is a generalized flow chart illustrating a property mapping sequence in accordance with one embodiment of the invention.
  • Figs. 9A-C illustrate the resulting homogeneous catalog after mapping the fields, category values and property values, represented in an efficient manner, in accordance with one embodiment of the invention.
  • the system (10) includes a homogenous catalog site (12) and a supplier catalog site (14), inter-linked by means of Internet network (16).
  • the separate catalog (18) (coupled to conventional desktop 19) at the supplier site (14) is mapped, in a convenient semi-automatic procedure, into homogenous catalog (20) (coupled to desktop 21) at the remote site (12), using a mapping protocol over the internet (16).
  • each catalog is held in one physical storage medium, the invention is not bound to any particular physical and/or logical representation of the catalog sites.
  • Fig. 2 there is shown a flow chart illustrating a generalized sequence of operation, in accordance with the invention.
  • the supplier (14) logs into the homogenous catalog (standing for the server) site (12) and after undergoing known per se admittance control steps (30 to 32), the server catalog "accesses" the supplier's site (standing for the client), and by means of e.g. known per se ODBC driver (33) links to the client's database.
  • the catalog database at the client's site is arranged in accordance with the relational model.
  • the invention is by no means bound by any particular high-level or low-level model for representing data.
  • a flat model where all the data is held in one table is utilized.
  • the fields (34) are mapped (including field type, category type and property type). Having mapped the fields (to thereby constitute a field structure), there follows category values mapping (35) followed by property values mapping (36). Having mapped category values and property values, the contents of the supplier catalog are mapped to the homogenous catalog (referred to as the catalog import step (37)), and thereafter integrity checking steps (38) and (39) are performed, in which errors are rejected (38) and an optional manual modification step is provided (39) (e.g. for inputting missing data, such as contents of fields, say the value of the color field for the product jeans trouser). The process terminates by providing a status summary report (39') (e.g. success or error, and in the latter case also indicating the error type).
  • Fig. 3 there is shown a generalized flow chart illustrating a field mapping sequence in accordance with one embodiment of the invention.
  • the supplier identifies the tables which are subject to mapping (and which constitute the supplier's catalog).
  • the fields of the catalog table or tables and also the contents thereof can be easily identified, all as known per se.
  • the fields of the client's catalog are mapped into fields at the server's catalog (41).
  • the fields mapping is then tested (42 and 43) and thereafter the catalog and properties (44 and 45) are mapped (see below).
  • Figs. 4A-B illustrate exemplary user interface screens for realizing the field mapping sequence of Fig. 3.
  • the communication protocol program that links the client and server sites identifies the appropriate relational tables in the server catalog and client catalog that are subject to mapping
  • ItemName is mapped to the corresponding client catalog field name ProdName.
  • the homogenous catalog field SizeName is mapped to the corresponding client catalog field name CategorName
  • the homogenous catalog field SmallPicture is mapped to the corresponding client catalog field name SmallPic.
  • the client is not obliged to map all the existing fields of the catalog into corresponding fields in his local catalog. Put differently, only those fields that are of interest are mapped.
  • fields of all types are mapped, normally, "fields”, “category” and “property” (and if desired possibly also others, all as required and appropriate, depending upon the particular application).
  • H-Catalog stands for the homogenous catalog side and Data Source Field stands for the supplier's fields as appearing in his/her local database. Focusing now on the h-catalog side, the fields 1,2 and 3 stand for "fields", since they are common attributes to all products. Put differently, every product must have a name (field no. 1), a catalog number (field number 2) and a price (field number 3). Field number 4 stands for "category” (as will be explained in greater detail below).
  • Fields 5 and 6 stand for "property".
  • property is, as a rule, an attribute of one or more products, but not of all of them.
  • size and color are attributes of some products such as shoes and shirts, but not of others, such as tyres (for cars).
  • the latter may have other properties such as (tyre) width and (tyre) diameter.
  • Category value mapping is illustrated in Figs. 6A and 6B.
  • Fig. 6A the category list as extracted from the table of the client catalog is displayed (in the left column under the title Supplier Category), and is mapped manually to corresponding category value (in the right column under the title h- category - designated also as e-FES category ) in the homogenous catalog at the server site.
  • the resulting mapping is shown in Fig. 6B.
  • the h-category category value Teamclothing trousers/men is mapped to the supplier (client) category value Trousers long Unisex/men.
  • Fig. 7 illustrating the hierarchy (tree) of categories (70), of the supplier end.
  • the hierarchy tree is a non-limiting form of representing categories.
  • the most generalized definition of categories resides (at the top of the hierarchy -referred to also as root node) and more specific definitions reside in lower levels of the tree.
  • root (71) represents the general category definition clothing.
  • Nodes (72) and (73), lower in the hierarchy, represent more specific category definition (youth clothing and kids' clothing).
  • the category values residing at the lowest level of the hierarchy represent the most specific category definition.
  • (74) represents sport pants for youth
  • category value (75) represents elegant pants for youth.
  • a category value may be viewed as concatenation of the nodes from root to leaf.
  • clothing -> youth -> pants -> sport corresponds to category value (74).
  • In order to map the category values, it is first required to "flatten" the hierarchical representation of categories in order to obtain a list of category values that are subject to mapping (from the supplier catalog to the h-catalog database).
  • the flattening results in extracting only the category values of interest, and in the specific example of Fig. 7, this means category values (74 to 78). Whilst in the specific example of Fig. 7 only leaf nodes were extracted (for category value mapping purposes), this is not necessarily always the case. Thus, by an alternative embodiment, higher levels in the hierarchy of categories may also or alternatively be used.
  • the flattening procedure is implemented manually, or in a semi-automatic manner.
  • the mapping of category values is implemented basically in the same manner as mapping the fields.
  • mapping results of some of the category values is illustrated in table 2 below:
  • a preliminary "group by" function may be utilized in order to improve the efficiency of the category values mapping.
  • a catalog at the supplier end that holds X items all classified to the Sport pants for youth category value and additional Y items all classified to the Jeans trousers for youth category value.
  • the table includes X repetitions of the Sport pants for youth category value and Y repetitions of the Jeans trousers for youth category value.
  • the property mapping may involve a preliminary "group by" function in order to extract unique property values and avoid repetitions. Having mapped the fields, the category values and the property values, the contents of the catalog at the supplier end may now be imported to the h-catalog database (step 37 in Fig. 2) so as to construct the h-catalog catalog database.
  • the data import is realized using known per se data transfer techniques. Of course, the data in the server catalog are organized under the field names of the homogeneous catalog.
  • the original catalog (at the supplier site, Table 6), and the mapped catalog (at the h-catalog site, Table 5) are, accordingly, as follows:
  • Figs. 9A-C illustrate one out of many possible variants for representing data in the h-catalog site in accordance with one embodiment of the invention.
  • the category field is represented as an integral part of table (91).
  • Table (92) (Fig. 9B) stands for category table and it includes the key field Category Id and category name. The contents of the category table is, as shown, the distinct category values, i.e. "Sport pants for youth”, “Jeans trousers for youth”, and "Swimsuits and light summer clothing for youth”.
  • Fig. 9C stands for "property” having "color” and “size” properties and their respective values green, blue and yellow (for color) and 40, 38 and 32 for size.
  • the representation of data in accordance with Fig. 9A-C is, of course, only one out of many known per se manners of representing data and by way of alternative non limiting embodiment the known ERD model may be used. As is well known, the latter enables efficient 1 :N relationship (e.g. a category can be assigned to more than one product) and N:M representation.
  • the procedure described with reference to a supplier catalog database is not bound to any specific order or scope.
  • the entire database may be mapped in one time or, if desired, the procedure described above may be applied successively to database portions, e.g. applied to each database table separately.
  • the procedure described with reference to Figs. 1 to 9 is repeated for each supplier who wishes to subscribe to the homogenous catalog.
  • the data of all the separate catalogs are represented in a unified manner in the homogenous catalog and, accordingly, querying the homogenous catalog using the common field, category and/or property nomenclature, will bring about consistent results as compared to the alternative of querying the inconsistent separate catalogs of the suppliers.
  • the actual representation of data in the h-catalog and supplier catalogs may be done in any known per se manner, taking into account e.g. volume and performance considerations.
  • Alphabetic characters and roman symbols used to designate method steps are used for convenience of explanation only and do not necessarily imply any particular order of steps.
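The "flattening" of the category hierarchy mentioned in the list above, where a category value is viewed as the concatenation of the nodes from root to leaf, can be sketched as follows. This is a minimal illustration only: the nested-dictionary encoding, the name CATEGORY_TREE and the function flatten are assumptions made for the example, and only the branch of Fig. 7 that the text names (clothing, youth, pants, sport/elegant) is reproduced.

```python
# Hypothetical nested-dict encoding of part of the Fig. 7 hierarchy.
# An empty dict marks a leaf node.
CATEGORY_TREE = {
    "clothing": {
        "youth": {
            "pants": {
                "sport": {},
                "elegant": {},
            },
        },
    },
}


def flatten(tree, path=()):
    """Yield one root-to-leaf path (a tuple of node names) per leaf node."""
    for name, children in tree.items():
        current = path + (name,)
        if children:
            # Inner node: keep descending.
            yield from flatten(children, current)
        else:
            # Leaf node: the accumulated path is one category value.
            yield current


# Each category value is the concatenation of the nodes from root to leaf.
category_values = [" -> ".join(p) for p in flatten(CATEGORY_TREE)]
print(category_values)
```

The sketch extracts leaf nodes only; an embodiment that also maps higher levels of the hierarchy would additionally yield the paths of inner nodes.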

Abstract

A method for constructing an electronic homogenous catalog database from a plurality of separate suppliers catalog databases, including performing in respect of each separate supplier catalog, the following steps. First, linking the homogeneous database to the supplier database using a communication protocol. Next, mapping selected fields in the supplier catalog database to corresponding fields in the homogenous catalog database. The fields include 'field type' fields, 'property type' fields and 'category type' fields. Next, mapping category values in the supplier catalog database to corresponding category values in the homogenous catalog database. Next, mapping property values in the supplier catalog database to corresponding property values in the homogenous catalog database. Finally, transferring data contained in the fields from said supplier catalog database to the homogenous catalog database.

Description

A METHOD FOR CONSTRUCTING A HOMOGENEOUS ELECTRONIC CATALOG
FIELD OF THE INVENTION
This invention relates to the generation and use of electronic catalogs.
BACKGROUND OF THE INVENTION
The amount of textual information that is available in computerized media has increased dramatically in recent years. The wide circulation of the Internet and the provision of relatively secure transactions (having monetary value) over the Internet have resulted in a flood of electronic catalogs that are offered by suppliers and allow subscribers to visit catalog sites, view products of interest and possibly order them. Typically, each supplier establishes his/her own catalog by constructing a knowledge base consisting of, say, a hierarchy of concepts and properties that, to the best of his/her understanding, describe the products that are included in the catalog.
Thus, for example, in a catalog of sport products, a typical concept list includes: item id (standing for a key field that uniquely identifies each item) item name (e.g. Air Jordan shoe); item info (e.g. sport shoe for running). The concepts include also categories such as sport shoe which include properties, e.g. size, color etc.
A user who seeks to locate a desired product or products and to compare proposals offered by two or more suppliers can enter the suppliers' sites and attempt to locate the product(s) of interest. However, in a typical scenario, the user has no knowledge of the identity of the relevant suppliers (or at least not of all the relevant suppliers), and hence he/she must conduct the search across the Internet in accordance with the product name and/or possible category or property identifying the product.
To this end, the user invokes a query using a search engine utility (e.g. the known Yahoo) and provides query parameters that identify the desired product, category and/or property of interest. In the case that the search engine finds a site or sites that include data that match the sought parameters, the data (e.g. relevant html pages) are retrieved and downloaded to the user. The so retrieved data are often sorted by some relevance ranking, which is intended to approximate the degree of relevance of the resulting data to the query.
As is well known to those who try to target specific data in a large knowledge source such as the Internet, the likelihood of missing data which reside in the knowledge source and nevertheless are not revealed by the search engine running the query is relatively high, which is obviously undesirable. This stems, inter alia, from the inherent characteristics of natural language, which allows a given concept (e.g. a product) to be defined in many different ways.
Thus, when different suppliers make the definitions of products in catalogs separately and independently, a variety of inconsistent definitions are brought about, which, naturally, hamper the successful targeting of the sought data. There are known in the art numerous attempts to alleviate the problem by utilizing sophisticated and very complicated artificial intelligence (AI) based techniques that aim at rendering the numerous dispersed knowledge bases into a harmonized knowledge base structure.
An exemplary (yet not exclusive) AI-based concept harmonization technique includes automatic learning of the "logic" which governs the classification of category. Methods belonging to this approach utilize a set of training data, for which the correct categories are known in advance (usually as the result of manual classification of these categories). A learning method may then include a learning phase, in which some model of the category is constructed. For example, such a model may include terms that are highly associated with the category, and possibly some weights that quantify the degree of correlation between each term and the category. Alternatively, a learning method may be memory based, in which case the learning method simply stores the training data in some useful format. Then, when a new item (say product) is given for classification, the method classifies it automatically by consulting or applying the category model (or by simply comparing the new data item to the training data, in case of a memory based approach).
However, due to the inherently extremely complex structure of natural language, these solutions are only partially successful. There is accordingly a need in the art for a technique which enables the construction of a homogenous knowledge base whilst obviating the need to apply complex AI-based techniques.
There is a further need in the art for a technique that enables suppliers to map their respective knowledge base definitions to the specified homogenous knowledge base representation, in a convenient manner utilizing substantially a semi-automatic conversion technique.
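For orientation only, the memory based prior-art approach described in the background above (store the training data, then classify a new item by comparing it to the stored examples) can be sketched as follows. The training examples, the word-overlap similarity and the names TRAINING and classify are assumptions chosen for illustration; the source does not specify any particular similarity measure.

```python
# Hypothetical, manually classified training data: (description, category).
TRAINING = [
    ("running shoe with air cushion", "sport shoe"),
    ("leather dress shoe", "elegant shoe"),
]


def classify(description):
    """Return the category of the stored example sharing the most words."""
    tokens = set(description.lower().split())

    def overlap(example):
        # Similarity = number of words shared with the stored description.
        return len(tokens & set(example[0].split()))

    best_description, best_category = max(TRAINING, key=overlap)
    return best_category


print(classify("Sport shoe for running"))
```

Even this toy version hints at the weakness noted in the text: two suppliers describing the same product with different words share few tokens, so the comparison becomes unreliable.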
GLOSSARY OF TERMS
There follows a glossary of terms, some of which are conventional and others of which have been coined:
Relational Model (or Database): The relational model, introduced by Codd, is a landmark in the history of database development. In relational databases, an abstract concept has been introduced, according to which the data is represented by tables (referred to as entity or relationship "relations") in which the columns represent the fields and rows represent the records.
The association between tables is only conceptual. It is not part of the database definition. Two tables can be implicitly associated by the fact that they have one or more fields whose values are taken from the same set of values (called "domain"). Other concepts introduced by the relational model are high level operators that operate on tables (i.e. both their parameters and results are tables) and comprehensive data languages (now called fourth-generation languages), in which one specifies what the required results are, rather than how these results are to be produced. Such non-procedural languages (e.g. SQL - Structured Query Language) have become an industry standard. Furthermore, the relational model suggests a very high level of data independence. There should not be any effect on the programs written in these languages due to changes in the manner in which data are organized, stored, indexed and ordered. The relational model has become a de-facto standard for data analysts.
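The two ideas above, implicit association of tables through a shared domain and a non-procedural query that states what is wanted rather than how to compute it, can be illustrated with a small sketch. It uses Python's standard sqlite3 module as a stand-in relational database; the table names product and category and their columns are assumptions made for the example.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- No foreign key is declared: the two tables are associated only
    -- implicitly, because category_id draws on the same domain in both.
    CREATE TABLE product  (item_id INTEGER PRIMARY KEY, item_name TEXT, category_id INTEGER);
    CREATE TABLE category (category_id INTEGER PRIMARY KEY, category_name TEXT);
    INSERT INTO category VALUES (1, 'sport shoe');
    INSERT INTO product  VALUES (100, 'Air Jordan shoe', 1);
""")

# The query names the desired result; the engine decides how to produce it.
rows = conn.execute("""
    SELECT p.item_name, c.category_name
    FROM product AS p
    JOIN category AS c ON p.category_id = c.category_id
""").fetchall()
print(rows)
```

Both the input tables and the result of the join are tables, which is the sense in which the relational operators "operate on tables".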
Field: A column in a table of a relational database which represents an attribute of a data record (standing for a product) in the table, for example color, size, price in a table that represents clothing. A product is represented by some or all of its fields.
Category: Hierarchical structure of concepts represented as category values, to which products or group of products are classified. In the present invention, products are, typically (although not necessarily) classified to category values taken from the leaf nodes of the hierarchy.
Property: A specific field type which signifies a characteristic of given product or products and which is normally not common to all the products in the catalog. As will be shown below, a property is assigned with property values.
SUMMARY OF THE INVENTION
The terms knowledge base and database are used interchangeably. Whilst for convenience of explanation the invention is described with reference to a relational database, the invention is by no means bound to this particular example.
In accordance with the invention, there is provided a technique to construct a homogeneous knowledge base (constituting a homogeneous catalog) from a plurality of dispersed knowledge bases, each of which constitutes a separate catalog. In accordance with the invention, for each one of the separate catalogs, all or some of the fields are mapped to corresponding fields in the homogeneous catalog (constituting the homogenous catalog field structure). Having constructed the catalog field structure (which, as will be explained in greater detail below, includes, preferably, fields of "field type", "category type" and "property type"), the category values and property values of the separate catalogs are mapped to the homogeneous catalog. There follows a 'catalog import' step, in which the contents of the supplier database are mapped to the homogenous catalog. Thus, there is provided in accordance with the invention a method for constructing an electronic homogenous catalog database from a plurality of separate suppliers catalog databases, comprising performing, in respect of each separate supplier catalog, the following steps:
(a) linking the homogeneous database to the supplier database using a communication protocol;
(b) mapping selected fields in the supplier catalog database to corresponding fields in the homogenous catalog database; said fields include "field type" fields, "property type" fields and "category type" fields;
(c) mapping category values in said supplier catalog database to corresponding category values in the homogenous catalog database;
(d) mapping property values in said supplier catalog database to corresponding property values in the homogenous catalog database; and transferring data contained in said fields from said supplier catalog database to said homogenous catalog database.
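The mapping and transfer steps listed above can be sketched in a few lines. This is a rough sketch under stated assumptions, not the claimed implementation: step (a), the link to the supplier database, is omitted and the supplier rows are given as plain dictionaries. The field pair ProdName/ItemName and the category pair "Trousers long Unisex/men"/"Teamclothing trousers/men" follow the examples given for Figs. 4 and 6; every other name (Cat, Colour, Stock, "grn", import_record and the three mapping tables) is hypothetical.

```python
# Step (b): supplier field name -> homogeneous catalog field name.
FIELD_MAP = {"ProdName": "ItemName", "Cat": "Category", "Colour": "Color"}

# Step (c): supplier category value -> homogeneous category value.
CATEGORY_VALUE_MAP = {"Trousers long Unisex/men": "Teamclothing trousers/men"}

# Step (d): per property field, supplier value -> homogeneous value.
PROPERTY_VALUE_MAP = {"Color": {"grn": "green"}}


def import_record(supplier_record):
    """Transfer one supplier row into the homogeneous field structure."""
    result = {}
    for source_field, value in supplier_record.items():
        target = FIELD_MAP.get(source_field)
        if target is None:
            continue  # only the selected (mapped) fields are transferred
        if target == "Category":
            # Assumption: an unmapped value is passed through unchanged.
            value = CATEGORY_VALUE_MAP.get(value, value)
        elif target in PROPERTY_VALUE_MAP:
            value = PROPERTY_VALUE_MAP[target].get(value, value)
        result[target] = value
    return result


supplier_rows = [
    {"ProdName": "Jeans", "Cat": "Trousers long Unisex/men", "Colour": "grn", "Stock": 4},
]
homogeneous_rows = [import_record(row) for row in supplier_rows]
print(homogeneous_rows)
```

Passing unmapped values through unchanged is a simplification of this sketch; a stricter import could instead reject such rows or flag them for manual correction.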
In accordance with a preferred embodiment, the category values of the source supplier database are mapped to respective category values in the homogenous catalog using, preferably, a "group by" function. In a similar manner, additional separate catalogs are mapped to the same homogeneous catalog. The catalog stores, in a unified and homogenous manner, data originating from said separate catalogs.
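The role of the "group by" function can be illustrated as follows: a supplier table repeats the same category value once per item, and grouping collapses these repetitions so that each distinct value needs to be mapped only once. The sketch uses Python's standard sqlite3 module; the table name supplier_catalog, its columns and the product names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supplier_catalog (prod_name TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO supplier_catalog VALUES (?, ?)",
    [
        ("Runner pants", "Sport pants for youth"),
        ("Track pants", "Sport pants for youth"),
        ("Slim jeans", "Jeans trousers for youth"),
    ],
)

# One row per distinct category value, with the number of items it covers.
distinct_categories = conn.execute(
    "SELECT category, COUNT(*) FROM supplier_catalog "
    "GROUP BY category ORDER BY category"
).fetchall()
print(distinct_categories)
```

Three item rows yield two category values to map, and the saving grows with the number of items per category.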
Preferably, although not necessarily, the knowledge base is arranged in accordance with the relational model database.
In accordance with a preferred embodiment of the invention, there are provided separate catalogs and at least one remote homogenous catalog inter-linked by means of a communication network. A typical, yet not exclusive, example of a communication network is the Internet. Having constructed the homogenous catalog in the manner specified, the catalog may be subject to queries, utilizing e.g. conventional query languages such as SQL.
Unlike the prior art, where queries yielded incomplete answers due to the inconsistent nomenclature utilized by each separate catalog, the query that is applied to the homogeneous catalog of the invention uses terms which are identical (or substantially identical) to those that constitute the homogeneous catalog, and therefore the prospects of missing data due to inconsistent definitions are substantially reduced or even eliminated. The specified terms may be chosen to be field names (of field, category and/or property type), category values, and/or property values. Other terms may also be used, all as required and appropriate.
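A query against the homogeneous catalog might look like the following sketch, in which a single SQL statement using the common category value returns offers from more than one supplier. The table name h_catalog, the Supplier, Color and Price columns, and all row contents are assumptions made for illustration; Python's standard sqlite3 module stands in for the catalog database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE h_catalog (ItemName TEXT, Supplier TEXT, Category TEXT, Color TEXT, Price REAL);
    INSERT INTO h_catalog VALUES
        ('Runner pants', 'Supplier A', 'Sport pants for youth',    'green', 30.0),
        ('Track pants',  'Supplier B', 'Sport pants for youth',    'blue',  25.0),
        ('Slim jeans',   'Supplier B', 'Jeans trousers for youth', 'blue',  40.0);
""")

# The query term is a category value of the homogeneous catalog itself,
# so items imported from every supplier are matched consistently.
offers = conn.execute(
    "SELECT ItemName, Supplier, Price FROM h_catalog "
    "WHERE Category = ? ORDER BY Price",
    ("Sport pants for youth",),
).fetchall()
print(offers)
```

Had the two suppliers been queried separately under their own category names, the same comparison would have required knowing each supplier's nomenclature in advance.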
BRIEF DESCRIPTION OF THE DRAWINGS
In order to understand the invention and to see how it may be carried out in practice, a preferred embodiment will now be described, by way of non-limiting example only, with reference to the accompanying drawings, in which:
Fig. 1 is a generalized system architecture in accordance with the invention;
Fig. 2 is a flow chart illustrating a generalized sequence of operation, in accordance with the invention;
Fig. 3 is a generalized flow chart illustrating a field mapping sequence in accordance with one embodiment of the invention;
Figs. 4A-B illustrate exemplary user interface screens for realizing the field mapping sequence of Fig. 3;
Fig. 5 is a generalized flow chart illustrating category values mapping sequence in accordance with one embodiment of the invention;
Figs. 6A-B illustrate exemplary user interface screens for realizing the category values mapping sequence of Fig. 5;
Fig. 7 illustrates a typical hierarchy of categories;
Fig. 8 is a generalized flow chart illustrating a property mapping sequence in accordance with one embodiment of the invention; and
Figs. 9A-C illustrate the resulting homogeneous catalog after mapping the fields, category values and property values, represented in an efficient manner, in accordance with one embodiment of the invention.
DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT
The preferred embodiment is illustrated with reference to a remote homogenous catalog database and supplier catalog databases, which are linked by the Internet. The invention is by no means bound by this specific example.
Turning now to Fig. 1, there is shown a generalized system architecture in accordance with the invention. The system (10) includes a homogenous catalog site (12) and a supplier catalog site (14), inter-linked by means of Internet network (16). In accordance with the system of Fig. 1, the separate catalog (18) (coupled to conventional desktop 19) at the supplier site (14) is mapped, in a convenient semi-automatic procedure, into homogenous catalog (20) (coupled to desktop 21) at the remote site (12), using a mapping protocol over the internet (16). Whilst in Fig. 1, each catalog is held in one physical storage medium, the invention is not bound to any particular physical and/or logical representation of the catalog sites.
Turning now to Fig. 2, there is shown a flow chart illustrating a generalized sequence of operation, in accordance with the invention. Thus, at the onset, the supplier (14) logs into the homogenous catalog (standing for the server) site (12) and after undergoing known per se admittance control steps (30 to 32), the server catalog "accesses" the supplier's site (standing for the client), and by means of e.g. known per se ODBC driver (33) links to the client's database. For convenience of explanation, it is assumed that the catalog database at the client's site is arranged in accordance with the relational model. Those versed in the art will readily appreciate that the invention is by no means bound by any particular high-level or low-level model for representing data. Thus, by way of example, in accordance with one embodiment, a flat model (where all the data is held in one table) is utilized.
Having linked to the client's catalog, the fields (34) are mapped (including field type, category type and property type). Having mapped the fields (to thereby constitute a field structure), there follows category values mapping (35) followed by property values mapping (36). Having mapped category values and property values, the contents of the supplier catalog are mapped to the homogenous catalog (referred to as catalog import step (37)), and thereafter integrity checking steps (38) and (39) are performed, in which errors are rejected (38) and an optional manual modification step is provided (39) (e.g. for inputting missing data, such as contents of fields, say the value of the color field for the product jeans trousers). The process terminates by providing a status summary report (39') (e.g. success or error, and in the latter case also indicating the error type).
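The integrity checking and status report that close the sequence of Fig. 2 may be pictured, purely as a sketch under stated assumptions, as follows. The function name check_integrity, the choice of required fields and the record layout are hypothetical; the specification does not prescribe any particular checking logic.

```python
# Assumed set of fields that every imported product must carry.
REQUIRED_FIELDS = ("product_name", "catalog_number", "price")


def check_integrity(records):
    """Split imported records into accepted and rejected ones.

    Returns (accepted, rejected, status), where each rejected entry is a
    (record, missing_fields) pair and status is "success" or "error".
    """
    accepted, rejected = [], []
    for record in records:
        missing = [
            name for name in REQUIRED_FIELDS
            if record.get(name) in (None, "")
        ]
        if missing:
            rejected.append((record, missing))  # candidate for manual modification
        else:
            accepted.append(record)
    status = "success" if not rejected else "error"
    return accepted, rejected, status


imported = [
    {"product_name": "Jeans 501", "catalog_number": "J-501", "price": 59.0},
    {"product_name": "Swimsuit", "catalog_number": "S-001", "price": None},
]
accepted, rejected, status = check_integrity(imported)
```

The rejected list carries the names of the missing fields, which is the kind of information an optional manual modification step could use to prompt for the missing data.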
Turning now to Fig. 3, there is shown a generalized flow chart illustrating a field mapping sequence in accordance with one embodiment of the invention. Thus, after the ODBC connection (41), the supplier identifies the tables which are subject to mapping (and which constitute the supplier's catalog). Considering that the data format and structure of the supplier's catalog are a priori known to the communication application, the fields of the catalog table or tables and also the contents thereof can be easily identified, all as known per se. Thus, the fields of the client's catalog are mapped into fields at the server's catalog (41). The fields mapping is then tested (42 and 43) and thereafter the catalog and properties (44 and 45) are mapped (see below).
For a better understanding of the field mapping sequence, attention is directed to Figs. 4A-B, which illustrate exemplary user interface screens for realizing the field mapping sequence of Fig. 3. At the onset, the communication protocol program that links the client and server sites identifies the appropriate relational tables in the server catalog and client catalog that are subject to mapping (known per se and not shown). After duly identifying the respective tables, the fields thereof are identified and the list of field names of the table in the homogenous catalog is presented in the left column under the title h-catalog field (where h-catalog stands for homogenous catalog). In Fig. 4A, the catalog is presented as e-FES. The supplier now manually maps the field names in his catalog (which, although not shown in Fig. 4A, are normally presented in the right column under the title Data Source Field) to the corresponding homogenous catalog fields.
The mapped result is shown in Fig. 4B. Thus, the homogenous catalog field ItemName is mapped to the corresponding client catalog field name ProdName. Likewise, the homogenous catalog field SizeName is mapped to the corresponding client catalog field name CategorName, and the homogenous catalog field SmallPicture is mapped to the corresponding client catalog field name SmallPic. As shown, the client is not obliged to map all the existing fields of the catalog into corresponding fields in his local catalog. Put differently, only those fields that are of interest are mapped.
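The field mapping of Fig. 4B can be pictured as a simple lookup table. The following Python sketch is illustrative only: the three field-name pairs are those named above, whereas the function map_record, the sample supplier row and its Stock field are hypothetical.

```python
# Field mapping from the Fig. 4B example: h-catalog field -> supplier field.
FIELD_MAP = {
    "ItemName": "ProdName",
    "SizeName": "CategorName",
    "SmallPicture": "SmallPic",
}


def map_record(supplier_record, field_map=FIELD_MAP):
    """Return the record keyed by h-catalog field names.

    Supplier fields that were not mapped are left out, reflecting that
    only the fields of interest are mapped.
    """
    return {
        h_field: supplier_record[s_field]
        for h_field, s_field in field_map.items()
        if s_field in supplier_record
    }


# Hypothetical supplier row; "Stock" has no counterpart in the mapping.
supplier_row = {"ProdName": "Air Jordan shoe", "SmallPic": "aj.gif", "Stock": 12}
mapped = map_record(supplier_row)
```

In this sketch the unmapped Stock field is simply dropped, and an h-catalog field whose supplier counterpart is absent from the row (here SizeName) does not appear in the result.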
It should be noted that in the process of field mapping, fields of all types are mapped, normally, "fields", "category" and "property" (and if desired possibly also others, all as required and appropriate, depending upon the particular application). Consider, for example, the following list of fields:
Figure imgf000011_0001
TABLE 1
wherein H-Catalog stands for the homogenous catalog side and Data Source Field stands for the supplier's fields as appearing in his/her local database. Focusing now on the h-catalog side, fields 1, 2 and 3 stand for "fields", since they are attributes common to all products. Put differently, every product must have a name (field no. 1), a catalog number (field no. 2) and a price (field no. 3). Field no. 4 stands for "category" (as will be explained in greater detail below).
Fields 5 and 6 stand for "property". As specified above, property is, as a rule, an attribute of one or more products, but not of all of them. Thus, size and color are attributes of some products such as shoes and shirts, but not of others, such as tyres (for cars). The latter may have other properties such as (tyre) width and (tyre) diameter.
Accordingly, had the catalog product list encompassed not only shoes and shirts, but also tyres, the list of fields would be as follows:
(1) product name
(2) catalog number
(3) price
(4) category
(5) size
(6) color
(7) width
(8) diameter
wherein field nos. 7 and 8 stand for the newly added properties.
Having mapped the fields (including categories and properties), the category values are mapped in a similar manner, as illustrated for example in Fig. 5. After the category mapping step (51), the appropriate data integrity checking is performed (52 and 53).
Category value mapping is illustrated in Figs. 6A and 6B. In Fig. 6A, the category list as extracted from the table of the client catalog is displayed (in the left column under the title Supplier Category), and is mapped manually to the corresponding category value (in the right column under the title h-category, designated also as e-FES category) in the homogenous catalog at the server site. The resulting mapping is shown in Fig. 6B. For example, the h-category category value Teamclothing trousers/men is mapped to the supplier (client) category value Trousers long Unisex/men. For a better understanding of the category value mapping, attention is drawn to Fig. 7, illustrating the hierarchy (tree) of categories (70) at the supplier end.
The hierarchy tree is a non-limiting form of representing categories. In the hierarchy tree, the most generalized definition of categories resides at the top of the hierarchy (referred to also as the root node) and more specific definitions reside in lower levels of the tree.
Thus, for example (see Fig. 7), root (71) represents the general category definition clothing. Nodes (72) and (73), lower in the hierarchy, represent more specific category definitions (youth clothing and kids' clothing). The category values residing at the lowest level of the hierarchy (leaf nodes) represent the most specific category definitions. Thus, (74) represents sport pants for youth, and category value (75) represents elegant pants for youth. Note that a category value may be viewed as a concatenation of the nodes from root to leaf. Thus clothing -> youth -> pants -> sport corresponds to category value (74). In order to map the category values, it is first required to "flatten" the hierarchical representation of categories in order to obtain a list of category values that are subject to mapping (from the supplier catalog to the h-catalog database). The flattening results in extracting only the category values of interest, and in the specific example of Fig. 7, this means category values (74 to 78). Whilst in the specific example of Fig. 7 only leaf nodes were extracted (for category value mapping purposes), this is not necessarily always the case. Thus, by an alternative embodiment, higher levels in the hierarchy of categories may also or alternatively be used. The flattening procedure is implemented manually, or in a semi-automatic manner. As explained above, the mapping of category values is implemented basically in the same manner as mapping the fields.
Reverting now to the latter example, the mapping results of some of the category values are illustrated in Table 2 below:
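The flattening of the category hierarchy into a list of category values may be sketched as a recursive walk over the tree. The Python fragment below is a minimal illustration, assuming the hierarchy is held as nested dictionaries; the function name flatten is hypothetical, and the tree reproduces only the part of Fig. 7 that is described in the text (the branch under kids is invented for the sake of the example).

```python
def flatten(tree, path=()):
    """Yield one ' -> '-joined root-to-leaf path per leaf node of the tree."""
    for name, children in tree.items():
        current = path + (name,)
        if children:
            # Inner node: descend, carrying the path built so far.
            yield from flatten(children, current)
        else:
            # Leaf node: the concatenated path is a category value.
            yield " -> ".join(current)


# Partial, simplified stand-in for the hierarchy of Fig. 7.
category_tree = {
    "clothing": {
        "youth": {"pants": {"sport": {}, "elegant": {}}},
        "kids": {"pants": {}},  # hypothetical branch
    }
}

category_values = list(flatten(category_tree))
```

This variant extracts leaf nodes only; an embodiment that also uses higher levels of the hierarchy would additionally emit the path at inner nodes.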
Figure imgf000013_0001
TABLE 2
In those applications where the categories form an integral part of the product table (of the catalog at the supplier's end), a preliminary "group by" function may be utilized in order to improve the efficiency of the category values mapping. Thus, consider for example a catalog at the supplier end that holds X items all classified to the Sport pants for youth category value and additional Y items all classified to the Jeans trousers for youth category value. Obviously, if a single table holds both the item records and the category values to which the items belong, it is expected that under the category field, the table includes X repetitions of the Sport pants for youth category value and Y repetitions of the Jeans trousers for youth category value.
Following a naive category mapping sequence may lead to redundant mapping of the same category value again and again (X times for the Sport pants for youth category value and Y times for the Jeans trousers for youth category value). Applying a known per se "group by" function copes with this inefficiency, since it delivers as an output only the distinct category values (in the latter example only two category values, i.e. Sport pants for youth and Jeans trousers for youth), and thereby avoids undesired repetitions.
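The effect of the "group by" function can be illustrated with a small SQL query run from Python through the standard sqlite3 module. This is only a sketch: the table name products, its columns and the item names are hypothetical, with X = 3 and Y = 2 chosen arbitrarily.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single supplier table holding both items and their category.
conn.execute("CREATE TABLE products (prod_name TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [
        ("item 1", "Sport pants for youth"),
        ("item 2", "Sport pants for youth"),
        ("item 3", "Sport pants for youth"),
        ("item 4", "Jeans trousers for youth"),
        ("item 5", "Jeans trousers for youth"),
    ],
)

# GROUP BY collapses the repetitions, so each category value appears once.
grouped = conn.execute(
    "SELECT category, COUNT(*) FROM products "
    "GROUP BY category ORDER BY category"
).fetchall()

distinct_categories = [category for category, _count in grouped]
```

Only the two entries of distinct_categories need to be presented for mapping, instead of the five rows of the table.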
Having mapped the category values (81 in Fig. 8), there follows a property value mapping step (82), followed by conventional property value checking (83 and 84).
Reverting now to the previous example, the values of the property "color" are mapped as follows:
Figure imgf000014_0001
TABLE 3
and, the values of property "size" are mapped as follows:
Figure imgf000014_0002
TABLE 4
Similar to the category mapping, the property mapping may also involve a preliminary "group by" function in order to extract unique property values and avoid repetitions. Having mapped the fields, the category values and the property values, the contents of the catalog at the supplier end may now be imported to the h-catalog database (step 37 in Fig. 2) so as to construct the h-catalog database. The data import is realized using known per se data transfer techniques. Of course, the data in the server catalog are organized under the field names of the homogeneous catalog.
The original catalog (at the supplier site, Table 6), and the mapped catalog (at the h-catalog site, Table 5) are, accordingly, as follows:
Figure imgf000015_0001
TABLE 5
Figure imgf000015_0002
TABLE 6
For convenience of explanation only three products are shown in tables 5 and 6 above. Those versed in the art will readily appreciate that the catalog representation as a single table is made for illustrative purposes only and, accordingly, any known per se technique for efficiently storing the data, is applicable.
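The translation of category and property values during import may be sketched as follows. The fragment is an illustration under stated assumptions: the category pair is the one given in the discussion of Fig. 6B, whereas the supplier color codes GRN and BLU, the function name translate_values and the sample records are hypothetical (the contents of Tables 3 to 6 are not reproduced here).

```python
# Value mappings, keyed by h-catalog field name: supplier value -> h-catalog value.
VALUE_MAPS = {
    "category": {"Trousers long Unisex/men": "Teamclothing trousers/men"},
    "color": {"GRN": "green", "BLU": "blue"},  # hypothetical supplier codes
}


def translate_values(record, value_maps=VALUE_MAPS):
    """Translate mapped values of a record already keyed by h-catalog fields.

    Returns (translated, unmapped), where unmapped lists the (field, value)
    pairs that have a value mapping defined but no entry for this value.
    """
    translated, unmapped = {}, []
    for field, value in record.items():
        mapping = value_maps.get(field)
        if mapping is None:
            translated[field] = value  # plain field, copied as-is
        elif value in mapping:
            translated[field] = mapping[value]
        else:
            unmapped.append((field, value))  # left for integrity checking
    return translated, unmapped


record = {
    "product_name": "Jeans 501",
    "category": "Trousers long Unisex/men",
    "color": "BLU",
}
translated, unmapped = translate_values(record)
```

Values without a mapping entry are reported rather than silently copied, so that they can be handled by the integrity checking and manual modification steps.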
Figs. 9A-C illustrate one out of many possible variants for representing data in the h-catalog site in accordance with one embodiment of the invention.
Thus, for example, the entity "fields" (91) (Fig. 9A) contains the fields "product name", "catalog number" and "price", as well as their contents (i.e. Bermuda shorts, Jeans 501, Jeans, LEE and Swimsuit - together with their respective catalog numbers and prices). One from among these fields (or possibly other fields) serves as a key field (product key), all as known per se. The category field is represented as an integral part of table (91). Table (92) (Fig. 9B) stands for the category table and it includes the key field Category Id and the category name. The contents of the category table are, as shown, the distinct category values, i.e. "Sport pants for youth", "Jeans trousers for youth", and "Swimsuits and light summer clothing for youth".
Table (93) (Fig. 9C) stands for "property", having "color" and "size" properties and their respective values: green, blue and yellow for color, and 40, 38 and 32 for size.
The representation in accordance with Figs. 9A-C avoids duplicating the relatively large "category name" data; only the compact category id data is duplicated.
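A minimal sketch of such a representation, using the standard sqlite3 module, is given below. The table and column names, the catalog numbers, the prices and the assignment of each product to a category are hypothetical; only the product names and category names are taken from the description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE category (
        category_id   INTEGER PRIMARY KEY,
        category_name TEXT UNIQUE NOT NULL
    );
    CREATE TABLE product_fields (
        product_name   TEXT,
        catalog_number TEXT PRIMARY KEY,
        price          REAL,
        category_id    INTEGER REFERENCES category(category_id)
    );
    """
)
conn.executemany(
    "INSERT INTO category VALUES (?, ?)",
    [
        (1, "Sport pants for youth"),
        (2, "Jeans trousers for youth"),
        (3, "Swimsuits and light summer clothing for youth"),
    ],
)
# Each product row stores only the compact category id, not the long name.
conn.executemany(
    "INSERT INTO product_fields VALUES (?, ?, ?, ?)",
    [
        ("Bermuda shorts", "C-001", 20.0, 1),
        ("Jeans 501", "C-002", 60.0, 2),
        ("Swimsuit", "C-003", 25.0, 3),
    ],
)

# The category name is recovered by joining on the id when needed.
rows = conn.execute(
    "SELECT p.product_name, c.category_name "
    "FROM product_fields AS p "
    "JOIN category AS c ON c.category_id = p.category_id "
    "ORDER BY p.product_name"
).fetchall()
```

Each category name is stored once in the category table, however many products refer to it.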
The representation of data in accordance with Figs. 9A-C is, of course, only one out of many known per se manners of representing data and, by way of an alternative non-limiting embodiment, the known ERD model may be used. As is well known, the latter enables efficient 1:N relationship (e.g. a category can be assigned to more than one product) and N:M representation.
The procedure described with reference to a supplier catalog database is not bound to any specific order or scope. Thus, for example, the entire database may be mapped at one time or, if desired, the procedure described above may be applied successively to database portions, e.g. applied to each database table separately. The procedure described with reference to Figs. 1 to 9 is repeated for each supplier who wishes to subscribe to the homogenous catalog. Thus, the data of all the separate catalogs are represented in a unified manner in the homogenous catalog and, accordingly, querying the homogenous catalog using the common field, category and/or property nomenclature will bring about consistent results, as compared to the alternative of querying the inconsistent separate catalogs of the suppliers. The actual representation of data in the h-catalog and supplier catalogs may be done in any known per se manner, depending on e.g. volume and performance considerations. Alphabetic characters and roman symbols used to designate method steps are used for convenience of explanation only and do not necessarily imply any particular order of steps.
The present invention has been described with a certain degree of particularity, but those versed in the art will readily appreciate that various alterations and modifications may be carried out without departing from the scope of the following claims:

CLAIMS:
1. A method for constructing an electronic homogenous catalog database from a plurality of separate suppliers catalog databases, comprising performing in respect of each separate supplier catalog, the following steps, that include:
(a) linking the homogeneous database to the supplier database using a communication protocol;
(b) mapping selected fields in the supplier catalog database to corresponding fields in the homogenous catalog database; said fields include "field type" fields, "property type" fields and "category type" fields;
(c) mapping category values in said supplier catalog database to corresponding category values in the homogenous catalog database;
(d) mapping property values in said supplier catalog database to corresponding property values in the homogenous catalog database; and
(e) transferring data contained in said fields from said supplier catalog database to said homogenous catalog database.
2. The method of Claim 1, wherein the homogenous database and said plurality of supplier databases being each a relational database.
3. The method of Claim 1, wherein said step (c) includes the following preceding step: grouping category values in the supplier catalog database so as to obtain unique set of category values.
4. The method of Claim 2, wherein said step (c) includes the following preceding step: grouping category values in the supplier catalog database so as to obtain unique set of category values.
5. The method of Claim 1, wherein said step (d) includes the following preceding step: grouping property values in the supplier catalog database so as to obtain unique set of property values per property.
6. The method of Claim 2, wherein said step (d) includes the following preceding step: grouping property values in the supplier catalog database so as to obtain unique set of property values per property.
7. The method according to Claim 1, wherein said steps (b) to (e) are applied separately in respect of each database portion.
8. The method according to Claim 7, wherein said database portion being a database table.
9. The method according to Claim 1, wherein the communication protocol that is used for linking the homogeneous database to the supplier database utilizes an ODBC driver.
10. The method according to Claim 1, wherein said linking step is accomplished from a remote homogeneous database to the supplier database using communication protocol over a communication network.
11. The method according to Claim 10, wherein said communication network being the Internet.
12. A storage medium containing a homogenous catalog database produced in accordance with the method of Claim 1.
13. A query language utility for querying a homogenous catalog database produced in accordance with the method of Claim 1.
PCT/IL2000/000417 1999-07-14 2000-07-14 A method for constructing a homogeneous electronic catalog WO2001004775A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU58431/00A AU5843100A (en) 1999-07-14 2000-07-14 A method for constructing a homogeneous electronic catalog

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US35345299A 1999-07-14 1999-07-14
US09/353,452 1999-07-14

Publications (2)

Publication Number Publication Date
WO2001004775A2 true WO2001004775A2 (en) 2001-01-18
WO2001004775A3 WO2001004775A3 (en) 2003-01-09

Family

ID=23389163

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IL2000/000417 WO2001004775A2 (en) 1999-07-14 2000-07-14 A method for constructing a homogeneous electronic catalog

Country Status (2)

Country Link
AU (1) AU5843100A (en)
WO (1) WO2001004775A2 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5560005A (en) * 1994-02-25 1996-09-24 Actamed Corp. Methods and systems for object-based relational distributed databases

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHOY D M ET AL: "A distributed catalog for heterogeneous distributed database resources" PARALLEL AND DISTRIBUTED INFORMATION SYSTEMS, 1991., PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON MIAMI BEACH, FL, USA 4-6 DEC. 1991, LOS ALAMITOS, CA, USA,IEEE COMPUT. SOC, US, 4 December 1991 (1991-12-04), pages 236-244, XP010025430 ISBN: 0-8186-2295-4 *
KELLER A M ET AL: "MULTI-VENDOR CATALOGS: SMART CATALOGS AND VIRTUAL CATALOGS" EDI FORUM, EDI GROUP, OAK PARK, IL, US, vol. 9, no. 3, September 1996 (1996-09), pages 87-93, XP001056432 ISSN: 1048-3047 *
SANG-GOO LEE ET AL: "Digital catalog library: a shared repository of online catalogs for electronic commerce" ADVANCE ISSUES OF E-COMMERCE AND WEB-BASED INFORMATION SYSTEMS, WECWIS, 1999. INTERNATIONAL CONFERENCE ON SANTA CLARA, CA, USA 8-9 APRIL 1999, PISCATAWAY, NJ, USA,IEEE, US, 8 April 1999 (1999-04-08), pages 84-86, XP010348777 ISBN: 0-7695-0334-9 *

Also Published As

Publication number Publication date
WO2001004775A3 (en) 2003-01-09
AU5843100A (en) 2001-01-30

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP