
In: Proceedings of the International Conference on Enterprise Information Systems (ICEIS-2001), Setúbal, Portugal, July 7-10, 2001

AN ANALYSIS OF B2B CATALOGUE INTEGRATION PROBLEMS

Division of Mathematics and Computer Science, Vrije Universiteit, De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
URL: www.cs.vu.nl/~{borys, dieter}

B2B Catalogue Integration, Content Integration, Document Integration

Content Management becomes a cornerstone of successful B2B electronic commerce. The B2B players use different document standards to represent their business documents, and different content standards to specify the products. Thousands of the players meet together at B2B marketplaces, and the marketplaces must be able to integrate numerous document and content standards. The large number of the standards and their significant complexity make the integration problems non-trivial and require development of special integration architecture. In the present paper we discuss the tasks and the problems which occur during the content and document integration, and survey possible solutions and available techniques.

INTRODUCTION
Nowadays e-business includes thousands of companies, which provide a tremendous number of products for electronic markets. Unlike the traditional markets, the electronic markets allow the participants to easily compare the offers and perform business negotiation faster due to the electronic information interchange between the market participants provided by Business-to-Business electronic commerce (B2B EC).

Forecasts for the dollar value of B2B EC in the US range from $600 billion to $2.8 trillion for 2003 (cf. (U.S. Department of Commerce, 2000)). Electronic marketplaces for B2B electronic commerce bring together many online suppliers and buyers, which participate in the business interactions (cf. (Fensel, 2001) for an overview of the field). The (U.S. Department of Commerce, 2000) estimates around 800 B2B marketplaces in early 2000, and other studies estimate around 10,000 B2B marketplaces in the very near future. However, the marketplaces have to deal with the problem of heterogeneity in the product, catalogue, and document description standards of their customers. Effective and efficient management of different description styles becomes a key task for these marketplaces.

[…] interaction between three different types of standards:
– Different standards for describing exchangeable business documents like purchase orders (e.g. […]).
– Content standards, which specify the products with the hierarchy of product categories and the attributes of each category (e.g. UN/SPSC, […]).
– Product catalogues, which specify the products according to some content standard and which are referenced by the exchangeable business documents (e.g. Ariba and CommerceOne, www.commerceone.com/solutions/business/content.html).

From the technical point of view the content management task for the B2B area includes two major types of information mapping: mapping business documents and aligning content standards. […] In addition, a number of serious mapping problems, which have to be solved to make the B2B area […]

In the paper we discuss how the above standards must be integrated to facilitate B2B e-commerce. We discuss the content standards in Section 2, and their integration problems in Section 3, followed by the document integration task discussed in Section 4. We finalize the paper with the future research directions and conclusions in Section 5.

THE CONTENT STANDARDS

The content standards provide a hierarchy of product descriptions and define the subclass-of relationship between the product categories. Each product from a product catalogue has an attached link to a certain product category, which describes the product. The content standards may be classified into 'horizontal' standards and 'vertical' standards (cf. Figure 1). As shown in the figure, the horizontal standards provide a high-level classification of all possible products and cover many domains. Each of the vertical standards provides a deep and narrow classification of a certain limited domain, e.g. the domain of IT devices. Normally, a vertical standard expands several bottom-level categories of a horizontal standard.

Figure 1. The relation between the horizontal and vertical standards

The most well-known horizontal standard, UN/SPSC, has a 5-level classification scheme with more than 12,000 categories. It is not descriptive, that is, it contains no attributes for the products but only the hierarchy of product names. Consequently, the next initiative, UCEC, provides an extension of the UN/SPSC standard with attributes. It uses only the four top levels of the UN/SPSC classification hierarchy and provides a couple of attributes for each category; for example, the category [44-12-15-05] 'Specialty envelopes' has six attributes: type, length, width, weight, colour, and composition. Another horizontal standard, ecl@ss, supports the flow of products and information along the supply chain of an industrial enterprise and is mainly used in Germany. It provides the attributes for each of more than 12,700 categories tailored to the needs of industrial customers and their suppliers.

There exist a number of 'vertical' standards. ISO provides many product coding standards specific to certain branches of industry. RosettaNet contains a catalogue of IT products with a categorization, attributes, and a mapping of each category to UN/SPSC. Large manufacturers tend to develop special product cataloguing schemes, and usually these schemes are reflected in the product coding system used by the company. Finally, the company may have its own focus and therefore require a specific product classification scheme.

Hence, the products may be classified in the supplier's catalogue according to a certain content standard, which may differ from the content standard used by the buyer to specify its needs. In consequence, a B2B mediation system must be able to reclassify a product, already classified once, according to another classification schema.

ALIGNING THE CONTENT STANDARDS

To reclassify the products a B2B marketplace must be able to perform three different types of mappings:
– Aligning two horizontal standards.
– Linking a horizontal standard to a vertical standard.
– Aligning two vertical standards.
These mappings provide different problems, as discussed in the following subsections.

Aligning Horizontal Standards

The horizontal standards provide general hierarchies of products, and we can expect many equivalence mappings to appear between them. Sometimes the content standard providers publish these mappings as a part of the standard. For example, UN/SPSC provides a direct one-to-one mapping between UN/SPSC and other 'horizontal' standards […].

Despite these published mappings, a number of problems arise in aligning the horizontal standards:
– Only few horizontal standards have officially published mappings, and most of the standards do not.
– The standards significantly differ in their classifications because of the absence of a consensus scheme for classifying all products.
– The standards differ in the granularity level in the classifications of each particular group of products. Hence, very often the published mappings list the concepts with different granularity levels.
– The equivalence of the categories is not evident from their descriptions, e.g. the NAICS code [39] 'Miscellaneous Manufacturing Industries' is mapped to the UN/SPSC code [73] 'Industrial Production and Manufacturing Services'.

An example of aligning two horizontal standards is shown in Figure 2. In general, aligning two horizontal standards has the following properties:
– It is based on the published official mappings.
– It contains additional mappings created by the user, which extend the set of official mappings.
– It contains multiple mappings if the pair of […]

'Rollerball pens' are subclassed in UN/SPSC in
  [44] Office Equipment and Accessories and Supplies
  […]
  [44121701] Rollerball pens
ecl@ss classifies rollerball pens as 'Writing material', as follows:
  [24] Communication technology
  […]
  [24-11-01] Writing and drawing materials
  [24-11-01-01] Writing material

Figure 2. The example of mapping two horizontal standards

Linking Vertical and Horizontal Standards

Linking the vertical and horizontal standards differs from the previous case. The vertical standards provide a deep and narrow classification, as opposed to the wide and shallow horizontal standards. Normally a vertical standard expands one or a few categories of the horizontal standard.

For example, consider the mapping of the RosettaNet standard for the electronic component and IT supply chain to UN/SPSC. The mapping links only 136 UN/SPSC elements out of more than 12,000, most of which belong to the bottom level in the UN/SPSC hierarchy, to the 445 categories and 2660 attributes of RosettaNet.

The vertical standards are very precise in describing the items they are focused on. At the same time they are even shallower than the horizontal standards in describing the things which lie beyond their focus. Aligning the vertical and horizontal standards requires:
– Mapping a relatively small number of top-level vertical concepts to more general concepts of the horizontal standard.
– Mapping the concepts which are outside the focus of the vertical standard to the correspondent concepts in the horizontal standard. In this case the vertical concepts may have the same granularity level as the horizontal ones.

This linking is simpler than the horizontal aligning and it has an evident top-down structure. Hence, technologically it can be treated as a light version of the horizontal mapping and it can be […]

Aligning Vertical Standards

Aligning vertical standards requires linking their categories in a similar way as it is done for the horizontal standards. In addition, the vertical standards have an extensive set of attributes, which can be even larger than the set of classes, as it is in the RosettaNet case. Each attribute can be described by:
– Attribute name, e.g. 'Screen size'.
– Name abbreviation, e.g. 'ScreenSize', which is a valid identifier produced from the attribute name.
– Attribute value type (e.g. string, integer, float, etc.). The type may be an enumerated type represented with a list of possible attribute values, e.g. the currency type actually contains a list of possible currencies.
– Attribute value format, which defines the way to interpret the attribute value. For example, 'YYYY-MM-DD' denotes that the date is represented in a year-month-day format.
– The scale for the values, e.g. 'm' stands for meters.
– Attribute domain(s), or the set of categories to which the attribute applies.

Hence, aligning the vertical standards requires:
– Mapping attribute names and attribute abbreviations.
– Transforming attribute types, e.g. transforming an integer value into a corresponding string.
– Mapping the list of possible values for the enumerated types.
– Mapping different value display formats.
– Transforming between the unit scales, e.g. translation of the length in meters into the length in feet.
– Mapping attribute domains: the list of the categories from the source standard to which the attribute applies must be translated into the list of the categories from the target standard. This translation exploits previously defined mappings between the categories.

[…] techniques have been developed. The database community provides a number of approaches for database schema integration (Poulovassilis&Brien, 1998), (Batini, Lenzerini, Navathe, 1986). The knowledge engineering community provides tool support with Protégé (Grosso et al., 2000), Chimaera (McGuinness et al., 2000), and PROMPT (Noy&Musen, 2000); and inference-based […]. The achievements from these areas must be combined together to solve the aligning problems of content standards.

DOCUMENT INTEGRATION

[…] contain a large number of different documents to be exchanged between the market participants. For example, the xCBL standard proposes a document infrastructure described with 594 XML DTDs. Other standards define a document infrastructure of a similar complexity (see (Li, 2000) for a comparison). Let us consider a fragment of a product catalogue as defined by the cXML and xCBL formats and presented in Figure 3 (a) and (b), respectively.

(a) cXML
  <Money currency="USD">1000</Money>
  <Description xml:lang="en">Armada M700 PIII 500 12GB</Description>
  <UnitOfMeasure>EA</UnitOfMeasure>
  <Classification domain="SPSC">C43171801</Classification>
  <ManufacturerPartID>140141-002</ManufacturerPartID>
  <ManufacturerName>Compaq</ManufacturerName>
  <URL>http://www.compaq.com</URL>
  <ExpirationDate>2000-06-01</ExpirationDate>
  <EffectiveDate>2000-01-01</EffectiveDate>
  <Name xml:lang="en">Notebook</Name>
  <SearchDataElement name="Processor Speed" value="500MHZ"/>
  <TerritoryAvailable>USA</TerritoryAvailable>

(b) xCBL
  <SchemaVersion>1.0</SchemaVersion>
  <SchemaStandard>UNSPSC</SchemaStandard>
  <Product Type="Good" SchemaCategoryRef="C43171801">
    <ProductID>140141-002</ProductID>
    <Manufacturer>Compaq</Manufacturer>
    <Country><CountryCoded>US</CountryCoded></Country>
    <ShortDescription xml:lang="en">Armada M700 PIII 500 12GB</ShortDescription>
    <LongDescription xml:lang="en">This light, …</LongDescription>
    <AttributeID>Processor Speed</AttributeID>
    <AttributeValue>500MHZ</AttributeValue>
    <ProductVendorData PartnerRef="Acme_Laptops">
      <VendorPartNumber>12345</VendorPartNumber>
      […]
      <CurrencyCoded>USD</CurrencyCoded> </Currency> </ProductPrice>
    </ProductVendorData>
  </Product>

Figure 3. Two fragments of product catalogues

Both catalogues contain two parts: static catalogue information and dynamic information. The static part contains the descriptions which are not updated frequently, such as the product name and description, its UN/SPSC code, and the manufacturer. The dynamic part contains the descriptions which can be updated very often and will be sent to the user on request. However, different concepts are regarded as dynamic in xCBL and cXML. According to the xCBL format, product attributes are present in the static part, while the price is regarded as a dynamic part, requested from the vendor (and it appears in the ProductVendorData section). This scenario assumes that the user accesses full descriptions of all the products, while the price can change in time.

cXML expects another implementation scenario, partially targeted to the needs of B2B website development. It assumes that the user browses through the descriptions of the first interest, such as the product name, content classification, and the price. Detailed product information, such as LeadTime, ExpirationDate and other, is available on request (and is regarded as a dynamic part and appears in the […] attributes, which are treated as the static part in the […]

In the rest of this section we discuss the direct mapping of different representations, which provides a partial solution to the integration task. Then we discuss a multi-layered framework, which eliminates some of the problems of the direct mapping.

Single-Layer Integration

The documents represented in Figure 3 represent the same information; however, several differences occur (see (Omelayenko&Fensel, 2001) for relevant discussion):
– Different terminologies are used, i.e. the tag names used to denote semantically equivalent elements (these differences are called naming conflicts in the database schema integration area). For example, the price is marked up in xCBL with the tag Amount, while cXML uses the tag Money.
– The standards can use either XML attributes or XML elements to represent equivalent information. For example, xCBL uses the XML element CurrencyCoded to encode the currency, while cXML uses the XML attribute currency; in both cases the currencies are […]
– Different value formats and encoding conventions may be used. For example, as shown in Figure 3, the reference to UN/SPSC is encoded with the attribute domain in cXML, value domain="SPSC". At the same time the xCBL standard encodes the same link with the […] Hence, the values of the attributes and elements must be translated in addition to the element names.
– Different scales may be used for the values. For example, the price in US dollars has to be scaled for comparison to the price in DM. Unlike the representational differences shown in the previous example, the latter require the scales to be properly verified and timely updated.
– Different natural languages may be used in the tag values, as marked up with the XML xml:lang attribute.
– In addition to multi-lingual tag values, we can expect that some national document standards may use other languages in the tag names.

The standards are often represented in XML (see (Li, 2000) for a survey) and this tendency dominates. The W3C consortium provides the standard architecture for XML document integration […]

Hence, the document integration task can in principle be resolved by means of the XSL-T language. This requires development of a set of XSL-T rules able to translate one XML serialization to another one. Direct document transformation with XSL-T rules is discussed in (Omelayenko&Fensel, 2001); it appeared to be only a partial solution and raised a number of problems. The problems arise from the fact that this approach mixes several independent tasks in a single batch of XSL-T rules:
– Aligning the granularity level of the representations and performing necessary attribute splits with XPath expressions. Very often, this splitting is guided by ad-hoc rules, which split based on the element values. For example, one standard may store the street name and house number components of an address in a single element, while another standard may allocate two separate elements for them.
– Transforming the attribute values.
– Restoring necessary formatting according to the target standard.

The problems of the single-layer integration appear because two tasks run together within a single bundle of transformation rules: syntactical translations between different XML representations, and semantical mapping between the terminology and granularity level of the representations. Naturally, these two types of transformations belong […]

A layered approach to information representation on the Web was proposed in (Melnik&Decker, 2000), where three layers, the syntax layer, the object layer, and the semantic layer, are proposed for information modelling on the Web. The syntax layer provides a way of serializing information content into a sequence of characters according to some standard, e.g. XML. The purpose of the object layer is to offer an object-oriented view on the information with the normalized data models of standardized triples. Finally, the semantic layer provides a conceptual model for the information. We use this partitioning to base our integration […]

Multi-layered integration provides a solution for these problems, as discussed in the next section.

Multi-Layer Integration

We separate three layers of information representation (see (Omelayenko&Fensel, submitted) for a detailed discussion): the Syntax layer, the Data Models layer, and the Ontology layer.

The Syntax layer corresponds to the instance documents represented with their XML serialization. The serialization specifies the XML elements and attributes used, and their order. Even semantically equal documents may differ in their serialization.

The Data Models layer serves as a bridge between the Ontology layer and the Syntax layer. On this layer the representations are abstracted from the differences imposed by the Syntax layer, and the products are represented by object-property-object triples, where the attributes stand for products' attributes. The normalization is done according to the corresponding ontology, which specifies the semantics of the elements at the granularity required […]

The terminology used on this layer is defined by the corresponding ontology and generally must coincide with the one used on the Syntax layer. However, the former might be more detailed than the latter, e.g. the XML serialization may allocate only one element for street name and house number, while the ontology must allocate two separate elements. We assume that different terminologies must be aligned on the Ontology layer rather than on […]

The most suitable language candidate to encode the triples on this layer is RDF (Lassila&Swick, 1999), a W3C standard for describing machine-processable semantics of data, also represented with object-attribute-value triples. Another possible candidate is the Simple Object Access Protocol (www.w3.org/TR/SOAP/; see (Haustein, 2001) for a […]).

The Ontology layer corresponds to the document ontologies used to represent the products. We assume that this layer specifies the documents at a level of detail sufficient to specify the transformations between the catalogues with one-to-one mapping rules. In addition, the ontology contains the elements specified as optional and possibly absent in the XML serialization and, therefore, helps in aligning them. Although we sometimes refer to this layer throughout the paper, we do not further discuss possible ontology mismatches or integration problems which may arise on this layer; see for example (Klein, submitted).

The Integration Process

As we mentioned before, the difficulties of the single-layered representations are caused by several integration tasks running together. Therefore, we use a 'divide-and-conquer' approach to decompose these tasks into several subtasks, each of which is […]

The decomposition is performed in a similar way to the structure of heuristic classification proposed in (Clancey, 1985). Heuristic classification assumes that the classification is performed on a layer of abstract structures, and the input data must first be abstracted, i.e. translated from some particular format into the abstract structure; after the classification it must again be refined from the abstract structure to specific solutions.

To realize this strategy (see Figure 4) we have […] (Omelayenko&Fensel, submitted), which assumes that the integration is performed via at least two layers: the syntax layer of the actual XML documents and the layer of the normalized data models for the catalogues. Accordingly, the integration process passes through three steps: the translation of the source XML catalogue into its normalized data model on the data abstraction step, the translation between a pair of data models of different catalogues on the transformation step, and the translation from the data model back into XML according to the target XML format on the refinement step.

Figure 4. The model for data transformation

On the abstraction step the XML catalogues are translated into their normalized data models encoded with RDF triples. This requires the following transformations:
– The translation of each XML element or XML attribute which refers to a product feature into an RDF property with the same name (however, […]
– The split of a single XML element into two or more RDF triples, if this is required for the […]
– The combination of multi-file descriptions into a […]
– Inclusion of the optional XML elements in the RDF triples; the values of the elements are filled […]

All inter-catalogue mappings are performed on the layer of the normalized RDF data models. We assume that all necessary element splits have been performed during the abstraction stage and necessary element merges will be done on the refinement stage. Hence, only two types of mappings may appear between the attributes of the two catalogues: one-to-one mapping and many-to-many mapping. The latter requires no attribute splits or merges and can be easily expressed with a set of one-to-one mappings. The XSL-T rules for this layer may look like the following (from the cXML element Money to the xCBL element ProductPriceAmount):

  <xsl:for-each select="rdf:Description">
    <SchemaVersion>1.0</SchemaVersion>
    […]
    <ProductPriceAmount><xsl:value-of select="Money"/></ProductPriceAmount>
    <ProductPriceCurrency><xsl:value-of select="currency"/></ProductPriceCurrency>
    […]
  </xsl:for-each>

During the refinement step all syntactical restrictions required by the target format are restored, and the necessary many-to-one transformations are performed. The rules must be able to perform the following transformations, if required by the target standard:
– Each RDF triple is translated into a corresponding XML element, XML attribute, or non-XML entity for a non-XML catalogue.
– The target XML elements are created in a […]
– […] merged into a single XML element, if required.
– The XML representation may be partitioned into […]

In consequence, only one-to-one and many-to-[…]

CONCLUSIONS

In the paper we discussed two problems which are quite important for the B2B area: content integration and document integration. Each of the problems can be solved with an ad-hoc solution. However, given the very large amount of required mappings of content standards (more than 12,000 concepts plus several times more attributes) and the large amount of documents (400 documents and 5 different standards already require around 100,000 mappings) and document standards, this approach does not scale up to the actual needs of effective and efficient […]

Therefore, we developed a conceptual model for the mapping process with two main contributions:
– Dividing the overall mapping process into […]
– Identifying different layers that represent different aspects of the overall mapping […]
[…] the complexity of the process and allow reusing simple rule patterns to actually define the mappings […]

Currently we define a simple rule pattern language on top of XSL-T, customized to the specific integration needs of electronic commerce. Instead of defining transformations directly in XSL-T by hand, they should be derivable from selecting and instantiating mappings defined at a more intuitive level. We are aiming at transforming a complex programming task into a […]

The Ontology layer for document integration has to be elaborated to handle the necessary information correspondence. Further elaboration of the integration techniques requires ontology aligning to guide the transformations on the lower layers of the […] representation layer we will overcome the exponential explosions in the number of required […]

ACKNOWLEDGEMENT

The authors would like to thank Ellen Schulten for her helpful consultancy and discussions, and […]

REFERENCES

(Batini, Lenzerini, Navathe, 1986) Batini, C., Lenzerini, M., Navathe, S., 1986, 'A comparative analysis of methodologies for database schema integration', ACM Computing Surveys, 18(4), p. 323-364.
(Chalupsky, 2000) Chalupsky, H., 2000, 'OntoMorph: A Translation System for Symbolic Knowledge', In Proceedings of the Seventh International Conference on Knowledge Representation and Reasoning (KR-2000), Breckenridge, Colorado, USA, April 12-15.
(Clancey, 1985) Clancey, W., 1985, 'Heuristic Classification', Artificial Intelligence, 27, p. 289-351.
(Clark&DeRose, 1999) Clark, J. and DeRose, S., 1999, […] Recommendation; available online at […]
(Clark, 1999) Clark, J., 1999, 'XSL Transformations (XSL-T)', W3C Recommendation; available online at […]
(Fensel, 2001) Fensel, D., 2001, Ontologies: Silver Bullet for Knowledge Management and Electronic Commerce. Springer-Verlag, Berlin.
(Grosso et al., 2000) Grosso, W., Eriksson, H., Fergerson, R., Gennari, J., Tu, S., and Musen, M., 1999, 'Knowledge modeling at the millennium (the design and evolution of Protégé-2000)', In Proceedings of the Twelfth Banff Workshop on Knowledge Acquisition, Modeling, and Management, Voyager Inn, Banff, […]
(Haustein, 2001) Haustein, S., 2001, 'Semantic Web Languages: RDF vs. SOAP Serialization', In Proceedings of the Workshop on the Semantic Web - SemWeb'2001 at the 10-th WWW Conference, Hong […]
(Klein, submitted) Klein, M., Combining ontologies: an analysis of problems and solutions, submitted; available online at http://www.cs.vu.nl/~mcaklein/
(Lassila&Swick, 1999) Lassila, O. and Swick, R., 1999, 'Resource Description Framework (RDF) Model and […]
(Li, 2000) Li, H., 2000, 'XML and Industrial Standards for Electronic Commerce', Knowledge and Information […]
(McGuinness et al., 2000) McGuinness, D., Fikes, R., Rice, J., Wilder, S., 2000, 'An Environment for Merging and Testing Large Ontologies', Proceedings of the Seventh International Conference on Principles of Knowledge Representation and Reasoning (KR-2000), Breckenridge, Colorado, April […]
(Melnik&Decker, 2000) Melnik, S. and Decker, S., 2000, 'A Layered Approach to Information Modeling and Interoperability on the Web', In Proceedings of the Workshop on the Semantic Web at the Fourth European Conference on Research and Advanced Technology for Digital Libraries (ECDL-2000), Lisbon, Portugal, September 21.
(Noy&Musen, 2000) Noy, N. and Musen, M., 2000, 'PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment', In Proceedings of the 17-th National Conference on Artificial Intelligence (AAAI-2000), Austin, Texas, July 30 – August 3.
(Omelayenko&Fensel, 2001) Omelayenko, B. and Fensel, D., 2001, 'An Analysis of the Integration Problems of […] Commerce', In Proceedings of 9th IFIP 2.6 Working Conference on Database Semantics, Hong Kong, April […]
(Omelayenko&Fensel, submitted) Omelayenko, B., and Fensel, D., 'A Two-Layered Integration Approach for Product Catalogs in B2B E-commerce', submitted; available online at http://www.cs.vu.nl/~borys/papers/
(Poulovassilis&Brien, 1998) Poulovassilis, A. and Brien, P., 1998, 'A General Formal Framework for Schema Transformation', Data & Knowledge Engineering 28, […]
(U.S. Department of Commerce, 2000) U.S. Department of Commerce, 2000, Digital Economy 2000, White […]
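
The three-step integration process described in the paper (abstraction of the source XML into triples, one-to-one transformation on the data model layer, refinement into the target XML) can be illustrated with a small Python sketch. It is not the authors' XSL-T rule set: the `<Item>` wrapper, the `product-1` subject identifier, the function names, and the flat `ProductPrice` target layout are assumptions made for illustration. Only the names Money, currency, ProductPriceAmount and ProductPriceCurrency come from the paper.

```python
import xml.etree.ElementTree as ET

SUBJECT = "product-1"  # hypothetical identifier for the single product in this sketch

# One-to-one property mappings on the data model layer (cXML name -> xCBL name).
CXML_TO_XCBL = {"Money": "ProductPriceAmount", "currency": "ProductPriceCurrency"}


def abstract_cxml(xml_text):
    """Abstraction step: turn an XML fragment into (subject, property, value) triples."""
    root = ET.fromstring(xml_text)
    triples = []
    for elem in root.iter():
        # An XML attribute that carries a product feature becomes its own property.
        for name, value in elem.attrib.items():
            triples.append((SUBJECT, name, value))
        text = (elem.text or "").strip()
        if text:
            triples.append((SUBJECT, elem.tag, text))
    return triples


def transform(triples, mapping):
    """Transformation step: rename properties one-to-one; unmapped properties are dropped."""
    return [(s, mapping[p], v) for s, p, v in triples if p in mapping]


def refine_xcbl(triples, order):
    """Refinement step: serialize triples as child elements in the order the target requires."""
    root = ET.Element("ProductPrice")  # assumed wrapper element
    values = {p: v for _, p, v in triples}
    for tag in order:
        if tag in values:
            ET.SubElement(root, tag).text = values[tag]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    source = '<Item><Money currency="USD">1000</Money></Item>'
    model = transform(abstract_cxml(source), CXML_TO_XCBL)
    print(refine_xcbl(model, ["ProductPriceAmount", "ProductPriceCurrency"]))
```

Keeping the element order and wrapper in `refine_xcbl` and the renaming in `transform` mirrors the paper's argument that syntactical restrictions and terminology mappings should not be mixed in one batch of rules.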

Source: http://borys.name/papers/OF_ICEIS01.pdf
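
The attribute-level tasks that the paper lists for aligning vertical standards (mapping names, transforming types, converting display formats and unit scales, and translating attribute domains) can be sketched as a small rule table in Python. All attribute names, target names, category codes, and the target date format below are hypothetical; the paper names the tasks but does not prescribe a rule format.

```python
from datetime import datetime

METERS_TO_FEET = 3.28084  # approximate conversion factor

# Hypothetical alignment rules: source attribute -> target name and value converter.
ATTRIBUTE_RULES = {
    # Unit scale: length in meters to length in feet.
    "Length": {
        "target_name": "LengthFt",
        "convert": lambda v: round(float(v) * METERS_TO_FEET, 3),
    },
    # Display format: 'YYYY-MM-DD' to an assumed 'DD.MM.YYYY' target format.
    "ExpirationDate": {
        "target_name": "ValidUntil",
        "convert": lambda v: datetime.strptime(v, "%Y-%m-%d").strftime("%d.%m.%Y"),
    },
    # Type transformation: integer value to the corresponding string.
    "LeadTime": {"target_name": "DeliveryDays", "convert": lambda v: str(int(v))},
}

# Hypothetical category mapping, standing in for previously defined category alignments.
CATEGORY_MAP = {"SRC-NOTEBOOK": "TGT-PORTABLE-COMPUTER"}


def align_attribute(name, value, domains):
    """Return (target name, converted value, target domains) for one source attribute."""
    rule = ATTRIBUTE_RULES[name]
    # Attribute domains are translated through the category mapping; unmapped ones are dropped.
    target_domains = [CATEGORY_MAP[d] for d in domains if d in CATEGORY_MAP]
    return rule["target_name"], rule["convert"](value), target_domains


if __name__ == "__main__":
    print(align_attribute("Length", "2", ["SRC-NOTEBOOK"]))
    print(align_attribute("ExpirationDate", "2000-06-01", ["SRC-NOTEBOOK"]))
```

In this sketch the domain translation reuses the category mapping, which reflects the paper's remark that mapping attribute domains exploits previously defined mappings.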

