Published on December 10, 2007

Author: Callia



Ontologies (and friends…)
Ann M Wrightson, Principal Consultant, CSW Group Ltd

To be explained:
- Ontology
- Taxonomy
- Vocabulary
- Thesaurus
- …and not forgetting…
  - Terms and Controlled terms
  - Subject and Object; Subject and Topic
  - Resource (and Resource)
  - Descriptors and Description
- … and back to Ontologies to finish

Ontology:
- Ontology (philosophy): a theory of what there is
- Ontologies (information systems): artefacts that express or define the extent and structure of a universe of discourse

Taxonomy:
- Taxonomy (biological): a classification of living organisms into a hierarchical structure of species, genera, families etc
- Taxonomy (information systems): a classification of concepts into a hierarchical structure according to whether the concepts are more general or more specific

Vocabulary:
- Vocabulary (linguistics): the recorded words of a natural language; the recorded words used by an individual speaker, community etc.
- Vocabulary (information systems): a set of terms used in a particular context

Thesaurus:
- Thesaurus (lexicography): a compilation of lists of words that are alternates in some context
- Thesaurus (information science): a set of terms used in a specific context, often with a hierarchical structure of broader and narrower terms, and designation of equivalent, preferred and deprecated terms

A few more Denizens of this Noble Jungle of Terms:
- Terms and Controlled Terms
- Resource, Subject and Topic
- Resource, Subject and Object
- Descriptors and Description

Terms:
- Terms are strings
- Terms are used to represent
  - Members of a vocabulary
  - Members of a thesaurus
  - Names of relationships in a thesaurus
  - Names of individuals in an ontology
  - Names of relationships in an ontology
  - …and many other things too
- A set of terms usually implicitly represents a set of equivalence classes of strings
  - Presumably under some, usually unspecified, string-equivalence algorithm

Controlled Terms:
- A set of terms whose membership is controlled by some authority
  - Local, eg named divisions of an organization
  - Industry standards, eg a standard vocabulary
    - Strictly speaking, in this case (some specified edition of) the vocabulary serves as the authority for the set of controlled terms
- Normally used to identify uses of terms outside the controlled list
  - Eg as part of data validation

Resource, Subject and Topic:
- Subject (intensional): anything that can be a subject of discourse, i.e. anything whatsoever that can be talked about
- Topic (extensional): a representation of a subject within an information system
  - pre-automation: various concepts from ancient Greek philosophers to Topic Maps
- Resource (extensional, relational): a chunk of data that bears some useful relation to a subject, and hence is related to a topic
  - pre-automation: as an analytical concept, dates at least from the emergence of a formalized notion of “sources” in mediaeval logic & rhetoric

Resource, Subject and Object:
- Resource (extensional): anything that is a member of the universe of discourse (UoD - normally assumed to be a set)
  - RDF: set-hood enforced by restricting any UoD to be a subclass of the class of items that are named using a specific kind of term (URI)
    - Presented as the only sensible way…
- Predicate (first-order logic): (a mathematical model of) an assertion about members of a UoD
  - RDF: Subject and Object are names for the two slots for individuals of the UoD that are available in a two-place Predicate; in a “triple”, the Subject, Predicate and Object are all represented by URIs.
  - RDF: explicit claim (presented as “common sense”) that a logic of two-place predicates over a denumerable set of names is sufficient to say everything that ever needs to be said
  - RDF: the implicit complexity arising from Predicates being named as members of a UoD is handled in the Model Theory by dodging it quite precisely, and then comes back into its own in the literature on semantics of RDF extensions such as OWL

Descriptors and Description:
- Descriptor (computing): a compact data value (eg integer or string) that is used to refer to something else, in a given context
  - Also called a “handle” or “tag” – probably the usage that gave rise to “tag” as an informal name for Generic Identifiers in SGML/XML
- Description (logic): a collection of properties (normally represented by terms such as property/value pairs) assigned to an individual in a UoD
  - Inferences regarding classification using properties are the central subject matter of Description Logic
  - OWL: Description Logic has been very useful in analysing computational properties of OWL artefacts, hence “OWL-DL” names a computationally tractable subset of OWL corresponding to a particular kind of Description Logic.

…and back to Ontologies

A glimpse of the future:
- What is an Ontology? … to quote:
  - …the means by which a person or other agent understands its world
  - … the means by which a person or [other] agent communicates with others
  - …whenever data are structured, the description of their structure is the ontology for the data
  - Ontologies based on XML are more specifically called markup languages because of the historical origins of XML as a means of marking up text for the purpose of typesetting documents
  (Baclawski & Niu, Ontologies for Bioinformatics, MIT Press 2006)

So what?
A right royal mess…
- Not as simple a picture as I described
- Legitimate subtleties of definition
- All the source fields are still alive and active
- Powerplays (eg for product markets and personal influence) based on terminology wars and conceptual territory grabbing
- Lots of confusion
- No simple consensus for newbies to learn
- Easy to mis-learn terms and spread the confusion
- Names (eg of standards) tend to stay the same whilst the scope drifts
- Also the usual scoping problem of complete solution vs clean component

…so watch out!
- “Ontology”, “Taxonomy” and “Vocabulary” in particular tend to be used vaguely and interchangeably
  - …so for example a so-called taxonomy may not have a clean hierarchical structure
  - …and may be just a navigation aid and not a classification of anything
- “Ontology” sometimes means “artefact coded in OWL” whether it is an ontology or not
- Relatively few writers of requirements understand the computational penalties of complexity
  - “in the worst case it can go on for ever” is a good catchphrase!
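
One way to make the warning about so-called taxonomies concrete is to test whether a broader/narrower relation really forms a clean hierarchy. The sketch below is illustrative only: the function name `hierarchy_problems`, the pair-based input format and the sample terms are assumptions made for this example, not anything defined in the slides, and "clean" is taken here to mean that every term has at most one broader term and that no term is its own ancestor.

```python
from collections import defaultdict


def hierarchy_problems(broader_of):
    """Report why a 'narrower -> broader' relation is not a clean hierarchy.

    broader_of: iterable of (narrower_term, broader_term) pairs.
    Returns a list of problem descriptions; an empty list means clean.
    (Hypothetical helper, written for illustration.)
    """
    parents = defaultdict(set)
    for narrower, broader in broader_of:
        parents[narrower].add(broader)

    problems = []

    # Assumption: a clean hierarchy gives each term at most one broader term.
    for term in sorted(parents):
        if len(parents[term]) > 1:
            problems.append(
                f"{term} has several broader terms: {sorted(parents[term])}"
            )

    # Following broader links upward must never lead back to the start.
    for start in sorted(parents):
        seen = {start}
        frontier = list(parents[start])
        while frontier:
            term = frontier.pop()
            if term == start:
                problems.append(f"{start} is its own ancestor (cycle)")
                break
            if term not in seen:
                seen.add(term)
                frontier.extend(parents.get(term, ()))
    return problems


# Sample data, invented for this sketch.
clean = [("dog", "mammal"), ("cat", "mammal"), ("mammal", "animal")]
messy = [("dog", "mammal"), ("dog", "pet"), ("pet", "dog")]

for name, relation in (("clean", clean), ("messy", messy)):
    print(name, hierarchy_problems(relation))
```

A relation that fails this check may still be a perfectly useful navigation aid; the point is only that it should not be assumed to behave like a classification.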