Published on June 27, 2014
Knowing what we’re talking about • Robert Stevens, Bio-health Informatics Group, School of Computer Science, University of Manchester, Oxford Road, Manchester, United Kingdom, M13 9PL • Robert.Stevens@manchester.ac.uk
We have an item of data • 27 • 27 what? • Units, with what is 27 associated? • Even if I told you, would we interpret what I said in the same way?
• 27 mm
• Tail of 27 mm
• Mouse tail of 27 mm • … and we can carry on: mouse strain, where it was raised, on what it was fed, times, dates, etc. • All this data is necessary to interpret my original number • Even if that metadata exists, we have to agree on the things the numbers describe
What is knowledge?
Heterogeneity is rife • We agree on units (more or less)… • We don’t agree on much else when it comes to labels for the entities in our domain • If we don’t know what we’re talking about…. • It’s difficult to interpret and exchange data and the results from data
Categories and Category Labels • GO:0000368 • U2-type nuclear mRNA 5' splice site recognition • spliceosomal E complex formation • spliceosomal E complex biosynthesis • spliceosomal CC complex formation • U2-type nuclear mRNA 5'-splice site recognition
The Ogden Triangle [Ogden, Richards, 1923] • Humans require words (or at least symbols) to communicate efficiently. The mapping of words to things is only indirectly possible. We do it by creating concepts that refer to things. • The relation between symbols and things has been described in the form of the meaning triangle (figure labels: “Roast Beef”, Concept)
We need to know what we’re talking about… • … if we don’t, our data are useless • If we are to interpret our data then we need to know what entities they describe • We need to share data and re-use it • We need to find data; compare data; analyse data • We need to know what we know….
Manchester Mercury, January 1st 1754 • List of diseases & casualties this year • 19276 burials • 15444 christenings • Diseases: Aged 1456; Consumption 3915; Convulsion 5977; Dropsy 794; Fevers 2292; Smallpox 774; Teeth 961 • Casualties: Bit by mad dogs 3; Broken Limbs 5; Bruised 5; Burnt 9; Drowned 86; Excessive Drinking 15; Executed 18; Found Dead 34; Frighted 2; Kill'd by falls and other accidents 55; Kill'd themselves 36; Murdered 3; Overlaid 40; Poisoned 1; Scalded 5; Smothered 1; Stabbed 1; Starved 7; Suffocated 5 • Figure: Deaths by centile
A World of Instances • The world (of information) is made up of things and lots of them • Instances, individuals, objects, tokens, particulars. • The Earth is a kind of Planet • Robert Stevens (NE 67 41 58 A) is a Person • All the individual Alpha Haemoglobins in my many Instances of Red Blood Cell • Each cell instance in my Body has copies of some 30,000 Genes • A Word, language, idea, etc. • This Table, those Chairs, • Any Thing with “A”, “The”, “That”, etc. before it….
We Put things into Categories • All these instances hang about making our world • Putting these things into categories is a fundamental part of human cognition • Psychologists study this as concept formation • The same instances are put into a category
We have Labels for the Categories and their Instances • We label categories with symbols: Words • “Lion” is a category of big cat with big teeth • Gene, Protein, Cell, Person, Hydrolase Activity, etc. • …and, as we’ve already seen, each category can have many labels and any particular label can refer to more than one category • Semantic Heterogeneity • “A lion” is an instance in that category • Does the category “Lion” exist? • Lions exist, but the category could just be a human way of talking about lions • … we like putting things into categories
A Controlled Vocabulary • A specified set of words and phrases for the categories in which we place instances • Natural language definitions for those words and phrases • A glossary defines, but doesn’t control • The UniProt keywords define and control • Control is placed upon which labels are used to represent the categories (concepts) we’ve used to describe the instances in the world • … but there is nothing about how things in these categories are related • Example (flat list): Biopolymer, DNA, Enzyme, Nucleic acid, mRNA, Polypeptide, snRNA, tRNA
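One way to picture "control" in software is a lookup that accepts only agreed labels and maps each to a single category identifier. The following is a minimal sketch, not how UniProt or the Gene Ontology implement it: the `VOCABULARY` table and the `normalise` and `annotate` helpers are hypothetical, and the labels are simply those shown for GO:0000368 earlier in the talk.

```python
# Hypothetical controlled vocabulary: every accepted label maps to one category ID.
# The identifier and labels are the GO:0000368 example from earlier in the talk.
VOCABULARY = {
    "U2-type nuclear mRNA 5' splice site recognition": "GO:0000368",
    "U2-type nuclear mRNA 5'-splice site recognition": "GO:0000368",
    "spliceosomal E complex formation": "GO:0000368",
    "spliceosomal E complex biosynthesis": "GO:0000368",
    "spliceosomal CC complex formation": "GO:0000368",
}


def normalise(label):
    """Return the category ID for a label, or None if the label is not controlled."""
    return VOCABULARY.get(label.strip())


def annotate(labels):
    """Split free-text labels into (accepted category IDs, rejected labels)."""
    accepted, rejected = [], []
    for label in labels:
        category = normalise(label)
        if category is None:
            rejected.append(label)
        else:
            accepted.append(category)
    return accepted, rejected
```

Different labels for the same category collapse to one identifier, and anything outside the vocabulary is flagged rather than silently stored. Note that the table says nothing about how categories relate to one another.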
We also like to Relate Things Together • Categories have subcategories • Instances in one category can be related in some way to instances in another • Can relate instances to each other in many different ways • Is-a, part-of, develops-from, etc. axes • We can use these relationships to classify categories • Things in category A are part is • If all instances in category A are also in category B then As are kinds of Bs • Figure labels: Biopolymer, Nucleic Acid, Polypeptide, Enzyme, DNA, RNA, tRNA, mRNA, snRNA
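The last rule on this slide (if all instances of A are also instances of B, then As are kinds of Bs) can be sketched with plain sets. This is only an illustration under a strong assumption: each category is represented by the instances we happen to know about, and the instance names are made up. An ontology states the relationship for all possible instances, not just the known ones.

```python
# Toy extensional view: each category is the set of its known instances.
# Instance names are invented for illustration.
CATEGORIES = {
    "Biopolymer": {"dna1", "trna1", "mrna1", "enzyme1", "peptide1"},
    "Nucleic Acid": {"dna1", "trna1", "mrna1"},
    "RNA": {"trna1", "mrna1"},
    "Polypeptide": {"enzyme1", "peptide1"},
}


def is_kind_of(a, b):
    """True if every known instance of category a is also an instance of b."""
    return CATEGORIES[a] <= CATEGORIES[b]


def kinds_of(b):
    """All other categories whose known instances are all instances of b."""
    return sorted(a for a in CATEGORIES if a != b and is_kind_of(a, b))
```

For example, `is_kind_of("RNA", "Nucleic Acid")` holds because both RNA instances are also listed under Nucleic Acid, whereas `is_kind_of("Polypeptide", "Nucleic Acid")` does not.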
Categories and subcategories • Figure labels: biopolymer, polypeptide, nucleic acid, enzyme, DNA, RNA
Describing Category Membership • We can make conditions that any instance must fulfil in order to be a member of a particular category • A Phosphatase must have a phosphatase catalytic domain • A Receptor must have a transmembrane domain • A codon has three nucleotide residues • A limb has part that is a joint • A man has a Y chromosome and an X chromosome • A woman has only an X chromosome
Relationships • These conditions are made from a property and a successor relationship • isPartOf, hasPart • isDerivedFrom • developsFrom • isHomologousTo • …and many, many more
A Structured Controlled Vocabulary • Not only can we agree on the labels we give categories • Can also agree on how the instances of categories are related • And agree on the labels we give the relations • Structure aids querying and captures knowledge with greater fidelity • Figure labels: Biopolymer, Nucleic Acid, Polypeptide, Enzyme, DNA, RNA, tRNA, mRNA, snRNA, Gene, transcribedFrom
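To see why structure aids querying, here is a small sketch in which the is-a links are stored explicitly and queries walk them. The links are an assumption read off the labels in the slide's diagram (the flattened text does not preserve the arrows), as is the direction of `transcribedFrom`; the function names are hypothetical.

```python
# Assumed is-a links (child -> parent), inferred from the slide's diagram labels.
IS_A = {
    "Nucleic Acid": "Biopolymer",
    "Polypeptide": "Biopolymer",
    "Enzyme": "Polypeptide",
    "DNA": "Nucleic Acid",
    "RNA": "Nucleic Acid",
    "tRNA": "RNA",
    "mRNA": "RNA",
}

# Other named relations, stored as (subject, relation, object) triples.
RELATIONS = [("mRNA", "transcribedFrom", "Gene")]


def ancestors(term):
    """Follow is-a links upwards; returns parents from nearest to most general."""
    result = []
    while term in IS_A:
        term = IS_A[term]
        result.append(term)
    return result


def descendants(term):
    """Every category that is, directly or indirectly, a kind of term."""
    return {t for t in IS_A if term in ancestors(t)}


def related(relation, obj):
    """Subjects linked to obj by the named relation."""
    return sorted(s for s, r, o in RELATIONS if r == relation and o == obj)
```

A query for everything under "Nucleic Acid" now also finds data labelled "mRNA" or "tRNA", which a flat controlled vocabulary could not do.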
A Stronger Definition • a set of logical axioms designed to account for the intended meaning of a formal vocabulary used to describe a certain (conceptualisation of) reality (described in an information system) [Guarino 1998] • “conceptualisation of” inserted by me • “Logical axioms” means a formal definition of the meaning of terms in a formal language • Formal language: something a computer can reason with • Use symbols to make inferences • Symbols represent things and their relationships • Making inferences about things computationally
So what is an ontology? • Figure: a spectrum of artefacts, after Chris Welty et al • Catalog/ID • Terms/glossary • Thesauri • Informal is-a • Formal is-a • Formal instance • Frames (properties) • Value restrictions • Disjointness, inverse, part-of • General logical constraints • Examples shown on the figure: Gene Ontology, Mouse Anatomy, EcoCyc, PharmGKB, TAMBIS, Arom
What does it all mean anyway • To interpret our data we need to know what it is we’re talking about • We need to decide the things that we’re talking about and agree upon them • We need to agree on how to recognise those entities • We need to know how they are related to one another • Ontologies are a mechanism for describing those entities and their definitions • There’s more to knowledge representation than ontologies…
All this knowledge needs representing • We want this knowledge in a computational form • To make the knowledge available for software (and humans) • To help us develop and manage the (often) complex artefacts • Building ontologies is hard (getting all those relationships in the right place) • The Web Ontology Language (OWL) is a W3C recommendation for ontologies on the Semantic Web and in semantically enabled applications • A knowledge representation language with strict semantics that is amenable to automated reasoning
Web Ontology Language (OWL) • W3C recommendation for ontologies for the Semantic Web • OWL-DL mapped to a decidable fragment of first-order logic • Classes, properties and instances • Boolean operators, plus existential and universal quantification • Rich class expressions used in restrictions on properties – hasDomain some (ImnunoGlobinDomain or FibronectinDomain)
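The class expression on this slide combines a Boolean operator (`or`) with existential quantification (`some`). The sketch below evaluates such expressions over a tiny, fully known set of individuals. It is an illustration only, not an OWL reasoner: OWL has open-world semantics, whereas this check assumes everything is known. The individuals, the `TYPES` and `HAS_DOMAIN` tables, and the helper names are all hypothetical, and the class names are illustrative stand-ins for the two domain classes in the slide's expression.

```python
# Closed-world toy model: the types of each individual, and hasDomain links.
TYPES = {
    "d1": {"FibronectinDomain"},
    "d2": {"ImmunoglobulinDomain"},
    "d3": {"PhosphataseCatalyticDomain"},
}
HAS_DOMAIN = {
    "proteinA": ["d1", "d3"],
    "proteinB": ["d3"],
    "proteinC": ["d1", "d2"],
}


def union(*class_names):
    """Class expression 'A or B': true for individuals typed with any of the names."""
    wanted = set(class_names)
    return lambda ind: bool(TYPES.get(ind, set()) & wanted)


def some(links, filler):
    """Existential restriction: at least one link leads to an instance of filler."""
    return lambda ind: any(filler(target) for target in links.get(ind, []))


def only(links, filler):
    """Universal restriction: every link (if there is any) leads to an instance of filler."""
    return lambda ind: all(filler(target) for target in links.get(ind, []))


ig_or_fn = union("ImmunoglobulinDomain", "FibronectinDomain")
has_some = some(HAS_DOMAIN, ig_or_fn)   # hasDomain some (... or ...)
has_only = only(HAS_DOMAIN, ig_or_fn)   # hasDomain only (... or ...)
```

Here `has_some("proteinA")` holds because one of its domains is a fibronectin domain, while `has_only("proteinA")` fails because it also has a domain of another type. An individual with no `hasDomain` links at all satisfies `only` vacuously, which is a common surprise with universal restrictions.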
What are we saying? • Figure: Man is-a Person; Woman is-a Person • Are all instances of Man instances of Person? • Can an instance of Person be both a Man and an instance of Woman? • Can there be any more kinds of Person?
What are we saying? • What kinds of class can fill “has chromosome”? • How many “Y chromosome” are present? • Does there have to be a “Y chromosome”? • What properties are sufficient to be a Man and which are simply necessary? • Figure: Man has-chromosome Y chromosome; then Man has-chromosome Y chromosome (1), X chromosome (1), autosome (44)
OWL represents classes of instances A B C
Necessity and Sufficiency • An R2A phosphatase must have a fibronectin domain • Having a fibronectin domain does not a phosphatase make • Necessity -- what must a class instance have? • Any protein that has a phosphatase catalytic domain is a phosphatase enzyme • All phosphatase enzymes have a catalytic domain • Sufficiency – how is an instance recognised to be a member of a class?
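The difference between the two kinds of condition can be sketched as two separate checks. Sufficient conditions let us recognise membership from what an instance has; necessary conditions let us check an instance that is already asserted to be in a class. This is a minimal sketch using the slide's phosphatase examples; the `SUFFICIENT` and `NECESSARY` tables and both function names are hypothetical, and a real OWL reasoner does considerably more.

```python
# Sufficient conditions: having all these domains is enough to be in the class.
SUFFICIENT = {
    "Phosphatase": {"phosphatase catalytic domain"},
}

# Necessary conditions: every member of the class must have these domains.
NECESSARY = {
    "Phosphatase": {"phosphatase catalytic domain"},
    "R2A phosphatase": {"fibronectin domain"},
}


def recognise(domains):
    """Classes whose sufficient conditions are met by the given domains."""
    present = set(domains)
    return sorted(c for c, needed in SUFFICIENT.items() if needed <= present)


def missing_necessary(asserted_class, domains):
    """Necessary domains that an instance asserted to be in the class lacks."""
    return sorted(NECESSARY.get(asserted_class, set()) - set(domains))
```

A protein with only a fibronectin domain is not recognised as anything, since that condition is merely necessary for R2A phosphatases. A protein asserted to be an R2A phosphatase but lacking a fibronectin domain is reported as violating a necessary condition.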
Uses of ontologies
Ontologies in software
Problems Ontologies in Biology Try To Solve • Provenance – where did it come from, who did it? • Reproducibility – can I repeat and find results reported? • Sharing – can others understand your data? • Integration – can I readily take multiple (thousands of) data sets and use them without preparation? • New knowledge – can we infer new knowledge as a sum of current knowledge (computationally)?
The rise and rise of ontologies
What are the prospects for ontologies