Information about GGFpart2

Published on November 28, 2007

Author: FunSchool


Part II: Representation: Thesauri, Topic Maps; Frames, RDF/RDF-Schema; DAML+OIL, Reasoning; Services vs Data Structures.
Ontology Representation: Ontologies are the cornerstone of encoding understanding, BUT to be shared they need a standard representation and exchange language.
Language: desirable properties: Machine communication; model or proof theory; tractability of reasoning; strong conventions of use; human readable names; natural primitives; human communication. Support for: multiple classification. Can say simple things simply but as complex as necessary. Expressive enough to capture many ontologies. Evolution & merging. Multiple authoring. Web enabled: coding to RDF/XML.
Ontology Representation Languages: Machines need communication with formal content to restrict meaning. What makes a language formal? Model theory (1st order predicate logic); proof theory (Gentzen calculus); conventions (e.g. Java).
Ontology & its encoding rep.: Ontology: domain specific conceptualisation expressed within some representational model. Representational languages: data structuring mechanism in which the ontology is expressed, e.g. relational model, o-o model, frames, logics. An ontology can be delivered as a (static) data structure for embedding in an application or as a (dynamic) service (API).
XML for KR: Definition of self-describing data in a worldwide standardized, non-proprietary format. Structured data and knowledge exchange for enterprises in various industries. Integration of information from different sources into uniform documents. Exchange of knowledge bases between different AI languages, knowledge bases and databases, application systems, etc. But…
XML is not enough: XML defines grammars to verify and structure documents. The grammar enforces constraints on tags. Different grammars define the same content. XML lacks a semantic model; it only has a surface model, which is a tree. Example statement: “The Creator of the Resource “” is Ora Lassila”.
XML is not enough: Meaning of XML documents is intuitively clear: “semantic” markup tags are domain terms. But computers do not have intuition. Tag names per se do not provide semantics. The semantics are encoded outside the XML specification. XML makes no commitment on (1) domain specific ontological vocabulary or (2) ontological modelling primitives, so it requires pre-arranged agreement on (1) & (2). Feasible for closed collaboration: agents in a small & stable community, pages on a small & stable intranet.
Representation Paradigms (incomplete): Diagram labels: Ontologies; TopicMaps; Extended ER-Model/UML; Db schema; Thesauri; Semantic Nets; Taxonomies; Frames; RDF(S); Logics: Predicate Logics, F-Logic, Conceptual Graphs, Description Logics. Axis label: Expressivity.
Thesauri: Example terms: Fruit, Orange, Apfelsine (German), Vegetable; example edge labels: similarTo, synonymWith, NarrowerTerm. Well known in library science; cf. terminologies / classifications (Dewey). Graph with labelled edges (similar, nt, bt, synonym). Fixed set of edge labels (aka relations). No instances.
Vocabularies: Gene Ontology: Hand crafted with simple tree-like structures. Position of each concept and its relationships wholly determined by a person. Flexible, but maintenance and consistency preservation are difficult and arduous. Poor semantics. Single hierarchies are limiting; “cross products”.
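The graph view of a thesaurus from the Thesauri slide above (terms as nodes, a fixed set of edge labels, no instances) can be sketched in a few lines of Python. This is only an illustration: the class name `Thesaurus`, its methods, the `BroaderTerm` label, and the way the slide's example terms are wired together are all assumptions, since the transcript lists the terms and edge labels but not which edge connects which pair.

```python
from collections import defaultdict

# Fixed set of edge labels. "BroaderTerm" is an assumed name for the slide's "bt".
RELATIONS = {"similarTo", "synonymWith", "NarrowerTerm", "BroaderTerm"}


class Thesaurus:
    """Hypothetical minimal thesaurus: a graph of terms with labelled edges."""

    def __init__(self):
        # (term, relation) -> set of related terms
        self.edges = defaultdict(set)

    def add(self, source, relation, target):
        # A thesaurus offers no way to invent new relations.
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self.edges[(source, relation)].add(target)

    def related(self, term, relation):
        return sorted(self.edges.get((term, relation), set()))

    def narrower_closure(self, term):
        """All terms reachable from `term` by following NarrowerTerm edges."""
        seen, stack = set(), [term]
        while stack:
            current = stack.pop()
            for child in self.edges.get((current, "NarrowerTerm"), set()):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen


# Assumed wiring of the slide's example terms.
thesaurus = Thesaurus()
thesaurus.add("Fruit", "NarrowerTerm", "Orange")
thesaurus.add("Orange", "BroaderTerm", "Fruit")
thesaurus.add("Orange", "synonymWith", "Apfelsine")
thesaurus.add("Fruit", "similarTo", "Vegetable")

print(thesaurus.related("Orange", "synonymWith"))
print(thesaurus.narrower_closure("Fruit"))
```

Note what the structure cannot express: there is no place to say what makes something an Orange, so every term and every edge has to be placed by hand, which is the maintenance burden the Gene Ontology slide describes.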
Topic Maps (I): Standardized: ISO/IEC 13250:2000 (Jan 2000), an enabling standard to describe knowledge structures, electronic indices, classification schemes, ... To enable information resources to be classified and navigated in a consistent manner by representing knowledge structures for indexes. Build valuable information networks above any kind of resources / data objects. Enable the structuring of unstructured information. To make subjects addressable. The “GPS of the Web”.
Back-of-the-Book Index “British Virgin Islands”:
  Gorda Sound: see North Sound
  Little Dix Bay: 89
  North Sound: 90
  Road Harbour (see also Road Town): 73
  Road Town: 69, 71
  Spanish Town: 81, 82
  Tortola: 67
  Virgin Gorda: 77
Topic Maps (II): The electronic equivalent of table of contents, glossaries, thesauri, cross references. Subjects become objects ("topics"). Relationships between subjects are asserted. The concepts or topics that underlie a set of information objects are exposed to those people or applications processing the information.
High-level Topic Maps concepts: Diagram labels: Association, Topic, Occurrences, Scope; Subject, Name(s), Roles, Members, Template, Type. Knowledge, Occurrences, Associations, Names. Graph made of nodes and arcs, based on Semantic Nets. Typed to form groups (types are defined in the standard as topics). Topics and Occurrences play roles in relationships. Based on Plato’s notions.
In-/Semi-formal approaches: Topic Maps, Thesauri: Advantages: Capture a lot of modelling experience. Intuitive. Interesting primitives that are not available in other approaches (TM). Topic Maps are Web enabled: XML Topic Maps (XTM) are ready to use. Disadvantages: No characterization independent from a particular implementation.
May be misinterpreted (TM) / few primitives (Thesauri). No formal interpretation. No formal rigour. Hard to build and maintain large and coherent schemes. Pre-enumerate concepts.
Frames, SDM, OO models: Frames: rich set of language constructs (frames, slots, facets, defaults). Impose restrictive constraints on how they are combined or used to define a class. All frames asserted into taxonomy by hand. All concepts are primitive. Octet/GKB, Protégé, OCML, Ontolingua … OKBC (Open Knowledge Base Connectivity), OKBC-Lite. OO / Semantic Data Models (EER, UML): taxonomy/inheritance (semantics?); intuitive, lots of tools, widely used.
Frame Data Model: Frames. Classes: Genes, Reactions. Instances: Relationships. Slots: Chromosome, map-position, citations, reactants, products, Keq. Facets: Chromosome is single-valued, instance of class Chromosomes; Citations is multiple valued, set of strings.
Protégé 2000
RDF: Web based data model: Semantic Web: beyond machine readable to machine understandable. Resource Description Framework (RDF) is the W3C language for describing metadata on the Web. RDF consists of two parts: RDF Model (a set of triples) and RDF Syntax (different XML serialization syntaxes). RDF: a small set of modelling primitives + syntax. RDF does not commit to a domain vocabulary. RDF Schema is for the definition of vocabularies (simple ontologies) for RDF (and in RDF).
A simple RDF example: Resources: a thing you can reference (URI). RDF definitions are themselves Resources. Properties: slots, defining relationships to other resources or atomic values. Similar to Frames.
Statements: “Resource has Property with Value”. Values can be resources or atomic XML Schema data types. Directed graph: an edge labelled s:Creator pointing to the value Ora Lassila. Triples: Resource (subject), Property (predicate), Value (object), here with the value "Ora Lassila".
Collection Containers: Multiple occurrences of the same PropertyType do not establish a relation between the values, e.g. “The Millers own a boat, a bike, and a TV set”. RDF defines three special Resources: Bag, Sequence, Alternative. Example: “The students in course 6.001 are Amy, Tim, John, Mary, and Sue”. Diagram labels: the property students points to bagid1, which has rdf:type rdf:Bag and members rdf:_1 to rdf:_5: /Students/Amy, /Students/Tim, /Students/John, /Students/Mary, /Students/Sue.
Statements about statements: Transform them into Resources. Example: “Ralph Swick believes that the creator of the resource is Ora Lassila”.
RDF Schema (RDFS): RDF just defines the data model. Need for definition of vocabularies for the data model: an Ontology Language! RDF Schemas describe rules for using RDF properties, define a domain vocabulary for RDF, and organise this vocabulary in a typed hierarchy. RDF Schemas are Web resources (and have URIs) and can be described using RDF. They are not to be confused with XML Schemas. RDFS is the framework for a vocabulary.
RDF Schema Model: Property-centric: each property specifies what classes of subjects and objects it relates.
New properties can be added to a class without modifying the class. Primitives: resource, class, subClassOf, type; property, subPropertyOf; domain, range, constraintResource, constraintProperty. Definitions can include constraints which express validation conditions: domain constraints link properties with classes; range constraints limit property values. BUT expressive inadequacy and poorly defined semantics.
RDF Schema Model: Diagram labels: Resource, Class, Property; MotorVehicle, Truck, Person; registeredTo, ownedBy; edge labels: type, subClassOf, Domain, Range.
Frame/OO model summary: Advantages: Intuitive and popular modelling style. Many tools and examples. OKBC standard for semantics. Some reasoning. Disadvantages: Extending/evolving problematic. Hand crafting taxonomies and asserted properties. Static classifications. Pre-enumerate concepts. Little reasoning support. Difficult to build large coherent and complete ontologies (e.g. multiple classifications).
RDF Resources: RDF repositories: RDFDB, RDFSuite, Sesame … RDF query languages: RQL, RDQL, SQUISH … Annotation systems to create RDF: Annotea, CREAM, COHSE (see later) … RDF Java API: Jena.
RDF(S) Extensibility: Definition uses the data model of RDF. Defined in terms of RDF Schema. Is extension of RDF(S). Define an ontology of your language with RDF Schema (like RDF-Schema itself). Describe instance data using your new vocabulary. Advantage: all languages use the same data model (simplifies interoperability).
Stack of languages
The Ontology Language Stack: Diagram labels: OIL, HTML, XML + Name Space + XML Schema, Topic Maps, SMIL, RDF(S), DC, PICS, XOL, DAML-Ont, DAML+OIL, RDF, DAML-R, DAML-S, Unicode, URI, OWL.
Web Language Stack summary: XML: interchange syntax, no semantics. RDF: data model, some semantics & inference (recent!). RDF Schema: concept modelling, more semantics & inference. DAML+OIL / OWL: more expressive ontology language; quite expressive; expensive inference. See: Requirements for a Web Ontology Language, W3C.
History: DAML+OIL: OIL: developed by a group of (largely) European researchers. DAML-ONT: developed by a group of (largely) US researchers (in the DARPA DAML programme). Efforts merged to produce DAML+OIL. Development was overseen by a joint EU/US committee. Now submitted to W3C as basis for standardisation. WebOnt working group developing language standard, likely to be called OWL.
DAML+OIL / OWL: DAML+OIL designed to describe structure of domain (schema). Object oriented: classes (concepts) and properties (roles). A DAML+OIL ontology consists of a set of axioms asserting characteristics of classes and properties, e.g. Person is a kind of Animal whose parents are Persons. RDF used for class/property membership assertions (data), e.g. John is an instance of Person; ⟨John, Mary⟩ is an instance of parent.
DAML+OIL / OWL: DAML+OIL supports the full range of XML Schema data types: primitive (e.g. decimal) and derived (e.g. integer sub-range). DAML+OIL classes can be names (URIs) or expressions. Various constructors provided for building class expressions. Expressive power determined by the kinds of constructor provided and the kinds of axiom allowed.
Description Logics: DAML+OIL is equivalent to (an extension of) the expressive Description Logic SHIQ. DLs are the descendants of frame systems and object hierarchies via KL-ONE.
Core distinction between class definitions (T-Box ≈ schema) and instance definitions (A-Box ≈ database tuples). Many years of DL research: well defined semantics; formal properties well understood (complexity, decidability); known reasoning algorithms; implemented systems (highly optimised).
What’s in a “Logic based ontology”?: Primitive concepts, in a hierarchy: described but not defined. Properties: relations between concepts, also in a hierarchy. Constructors on concepts and properties: “some”, “only”, “at least”, “at most”, and, or, not. Defined concepts: made from primitive concepts, constructors and descriptors, e.g. Enzyme ≡ protein and catalyses reaction; reason that enzyme is a kind of protein. “is-kind-of” = “implies”: “Dog is a kind of wolf” means “All dogs are wolves”. Axioms: disjointness, further description of defined concepts. A Reasoner to organise it for you: consistency & taxonomy for defined concepts established through logical reasoning. [Rector] Model built up incrementally and descriptively based on concept’s properties.
Logic Based Ontologies: Diagram labels: Thing + (feature: pathological); red + partOf: Heart. [Rector]
Reasoning support: Consistency: check if knowledge is meaningful. Subsumption: structure knowledge, compute taxonomy. Equivalence: check if two classes denote the same set of instances. Instantiation: check if individual i is an instance of class C. Retrieval: retrieve set of individuals that instantiate C. Problems all reducible to consistency (satisfiability).
Reasoning demo using OilEd
Semantics matters:
  A hacker who studied ontology
  Was famed for his sense of frivolity
  When his program inferred
  That Clyde ISA Bird†
  He blamed – not his code – but zoology
  †Clyde ISA Elephant
  (“AI limericks” by Henry Kautz)
Why Reasoning Services I?
Ontology design: Check class consistency and (unexpected) implied relationships. Particularly important with large ontologies/multiple authors. Ontology integration: Assert inter-ontology relationships. Reasoner computes integrated class hierarchy/consistency. Ontology deployment: Determine if a set of facts is consistent w.r.t. the ontology. Determine if individuals are instances of ontology classes. Query inclusion. Service description matchmaking. Classification-based querying.
Gain of mapping?: Any RDF agent can process DAML+OIL instances. Any RDF-S agent can process DAML+OIL ontologies. Any DAML+OIL-aware agent can exploit semantics & reasoning (and materialize the DAML+OIL derivations for use by DAML+OIL-ignorant RDF agents).
Evaluation of DAML+OIL / OWL: Advantages: Decidable (if chosen carefully like DAML+OIL). Subsumption reasoning. Consistency checking. Dynamically post-coordinate rather than have to pre-enumerate. Support for evolution, merging, large scale building. Can publish the ontologies as static lattices. W3C standard (so tools etc). Disadvantages: Different modeling style if you want to take advantage of reasoning. Limited support for A-Box reasoning (on instances) in tractable DL versions. “As simple as required but as complex as necessary”.
Languages: Summary: Thesauri & Topic Maps: hand crafted, flexible but difficult to evolve, maintain and keep consistent, with poor semantics. Object-based KR, e.g. frames: extensively used, good structuring, intuitive; semantics defined by OKBC standard. Logic-based, Description Logics: very expressive, model is a set of theories, well defined semantics, reasoning; automatically derived classification taxonomies; concepts are defined and primitive; expressivity vs. computational complexity balance.
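To make "automatically derived classification" concrete, here is a minimal Python sketch of subsumption for a deliberately tiny fragment: concepts are conjunctions of atoms, and the TBox is assumed to be acyclic. It reuses the deck's Enzyme example (Enzyme ≡ protein and catalyses reaction); the `Lipase` entry, the data layout, and all function names are hypothetical, and restrictions such as "catalyses some Reaction" are treated as opaque strings. Real DL reasoners for SHIQ-style languages use far more capable algorithms than this.

```python
# Toy TBox. Each named concept maps to (kind, conjuncts):
#   "primitive": conjuncts are necessary conditions only (described, not defined)
#   "defined":   conjuncts are necessary and sufficient (a definition)
# Strings that are not keys, e.g. "catalyses some Reaction", are opaque atoms.
TBOX = {
    "Protein": ("primitive", set()),
    "Enzyme": ("defined", {"Protein", "catalyses some Reaction"}),
    # Hypothetical entry: described with the same properties, never asserted
    # to be an Enzyme.
    "Lipase": ("primitive", {"Protein", "catalyses some Reaction"}),
}


def expand(name, tbox):
    """Everything that necessarily holds of `name` (assumes an acyclic TBox)."""
    result = {name}
    _, conjuncts = tbox.get(name, ("primitive", set()))
    for conjunct in conjuncts:
        result |= expand(conjunct, tbox)
    return result


def is_subsumed_by(sub, sup, tbox):
    """True if every instance of `sub` must also be an instance of `sup`."""
    if sup in expand(sub, tbox):
        return True  # follows directly from what was stated about `sub`
    kind, conjuncts = tbox.get(sup, ("primitive", set()))
    if kind == "defined":
        # A definition is sufficient: meeting every conjunct implies membership.
        return all(is_subsumed_by(sub, c, tbox) for c in conjuncts)
    return False


def classify(tbox):
    """Inferred kind-of pairs (sub, sup) between distinct named concepts."""
    return {(a, b) for a in tbox for b in tbox
            if a != b and is_subsumed_by(a, b, tbox)}


for sub, sup in sorted(classify(TBOX)):
    print(f"{sub} is a kind of {sup}")
```

The pair (Lipase, Enzyme) is inferred rather than asserted, which is the post-coordination advantage listed above; Protein is not classified under Enzyme, because a primitive description supplies only necessary conditions.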
Common language errors: AI people's errors: it is good if it is formal; it is good if someone with a logic background may easily use it; it is good if the language allows everything. Engineer's errors: it works in my application, thus it is good; who needs formality anyway?; it did not work when I looked at it 10 years ago.
Further Reading: <supplied separately>
