Slide1:
Frank Hartel, PhD
Enterprise Vocabulary Services
National Cancer Institute
NCI Enterprise Vocabulary Services (EVS) and Semantic Integration at NCI - An Overview

Outline:
Terminology management and semantic integration at NCI
NCI Enterprise Vocabulary Services
NCI Thesaurus (NCIt)
NCI Metathesaurus (NCI Meta)
Collaborations

NCI biomedical informatics:
Goal: A virtual web of interconnected data, individuals, and organizations redefines how research is conducted, care is provided, and patients/participants interact with the biomedical research enterprise

Interoperability:
in·ter·op·er·a·bil·i·ty: ability of a system...to use the parts or equipment of another system (Source: Merriam-Webster web site)
interoperability: ability of two or more systems or components to exchange information and to use the information that has been exchanged (Source: IEEE Standard Computer Dictionary: A Compilation of IEEE Standard Computer Glossaries, IEEE, 1990)
Interoperability: semantic interoperability, syntactic interoperability (Courtesy: Charlie Mead)

No Controlled Terminology? No Interoperability:
Systems cannot exchange or use information if they use incompatible codes or tokens to signify meaning
Terminology services provide tokens and codes
Proper use of them assures consistent meaning across the enterprise

Slide6:
Can it be done? caCORE - An Example
Vocabulary for CDE specification
Dictionary, thesaurus, ontology services
via caBIO API
via downloads

cancer Common Ontologic Representation Environment (caCORE):
Information integration
Cross-discipline reasoning
biomedical objects, common data elements, controlled vocabulary

Common Data Elements:
Structured data reporting elements
Precisely defining the questions and answers
What question are you asking, exactly?
What are the possible answers, and what do they mean?

Biomedical Information Objects:
Data service infrastructure developed using OMG’s Model Driven Architecture approach
Object models expressed in UML represent actual biomedical research entities such as genes, sequences, chromosomes, cellular pathways, ontologies, clinical protocols, etc.
The object models form the basis for uniform APIs (Java, SOAP, HTTP-XML, Perl) that provide an abstraction layer and interfaces for developers to access information without worrying about the back-end data stores

Binding Data, Metadata to Terminology - caCORE SDK:
UML Modeling Tool (provided by user)
  Information model that will define data classes, attributes and relationships
Semantic Connector
  Annotate UML model with ontology concepts: bridges the world of databases to that of structured semantics.
UML Loader (run by NCI staff)
  Loads model into the caDSR metadata registry
  Model and associated semantics are available at runtime
Code Generator
  Model and a code template are inputs into generator
  Creates the ‘caCORE-like’ n-tier software system with Java and Web Services APIs

caCORE SDK:

Extending Interoperability Beyond the Enterprise:
cancer Biomedical Informatics Grid (caBIG)
Common, widely distributed infrastructure permits cancer research community to focus on innovation
Shared vocabulary, data elements, data models facilitate information exchange
Collection of interoperable applications developed to common standard
Raw cancer research data is available for mining and integration

Slide13:
caBIG - facilitate sharing of infrastructure, applications, and data

caGrid Service-Oriented Architecture:

Enterprise Vocabulary:
NCI Metathesaurus (Cross-map standard vocabularies/ontologies, e.g. SNOMED, MedDRA, ICD)
  Semantic integration, inter-vocabulary mapping
  UMLS Metathesaurus extended with cancer-oriented vocabularies
  930,000 Concepts, 2,200,000 terms and phrases
  Mappings among over 50 vocabularies
NCI Thesaurus
  Description logic-based
  48,000 “Concepts”
  Concept is the semantic unit
  Terms are Concept labels – synonymy
  Semantic relationships between Concepts
Other standard terminologies
  MedDRA, MGED, SNOMED, GO, etc.
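
The NCI Thesaurus points above (the Concept is the semantic unit, terms are labels attached to it, and Concepts are linked by semantic relationships) can be illustrated with a minimal Python sketch. The `Concept` class and `build_term_index` function are hypothetical names invented for this illustration and are not part of any EVS or caCORE API. The sample values are taken from the Metastasis concept (C19151) that appears later in this deck.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Concept:
    """One semantic unit, identified by its code rather than by any one label."""
    code: str
    preferred_name: str
    synonyms: Set[str] = field(default_factory=set)
    # Each role is (role_name, target_concept_name); a hypothetical representation.
    roles: List[Tuple[str, str]] = field(default_factory=list)

    def labels(self) -> Set[str]:
        """All terms that label this concept (preferred name plus synonyms)."""
        return {self.preferred_name, *self.synonyms}


def build_term_index(concepts: List[Concept]) -> Dict[str, Set[str]]:
    """Map each lower-cased term to the codes of the concepts it labels."""
    index: Dict[str, Set[str]] = {}
    for concept in concepts:
        for label in concept.labels():
            index.setdefault(label.lower(), set()).add(concept.code)
    return index


# Sample data copied from the Metastasis example slide in this deck.
metastasis = Concept(
    code="C19151",
    preferred_name="Metastasis",
    synonyms={"MET", "metastasis", "Tumor Cell Migration"},
    roles=[("Biological_Process_Has_Result_Biological_Process", "Tumor Expansion")],
)

term_index = build_term_index([metastasis])
print(term_index["tumor cell migration"])
```

Because every label resolves to the same code, two systems that use different terms for the concept can still exchange the code and agree on meaning, which is the point the earlier interoperability slides make.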

NCI builds on EVS via caCORE Infrastructure:

Production EVS Servers in caCORE:

Enterprise Vocabulary Services:
Services and resources that address NCI's needs for controlled vocabulary
http://www.nci.nih.gov/EVS
A collaboration
  NCI Office of Communications: Physician Data Query (PDQ), Cancer Information Service and the NCI web portal www.cancer.gov
  NCI Center for Bioinformatics: Bioinformatics Core Infrastructure (caCORE), including metadata repository (caDSR) and object models built using EVS terminology for core semantics

NCI EVS Goal – Integration by Meaning:
Clinical, translational, and basic research terminologies have overlapping but specialized needs, so EVS helps to:
  Integrate different conceptual frameworks
  Create terminological and taxonomic conventions across systems
Vocabulary Products
  NCI Thesaurus – an ontology-like terminology
  NCI Metathesaurus – maps vocabularies
  External vocabularies maintained and served: MedDRA, HL7, NDF-RT, LOINC, etc.

Terminology Development Guidelines:
Develop a content model
Leverage existing sources where appropriate (VA NDF-RT, RxNorm, LOINC, etc. …)
Develop unique content where needed (Cancer genes and diagnoses, drugs and therapies, molecular abnormalities, clinical trial standard terminology, etc.)
Link to other information sources and standards using URLs where possible (GO, Swissprot, drug formularies, trial protocols)
Federate, merge or map with other standard terminology for semantic integration

NCI Thesaurus (NCIt):
Reference Terminology for NCI, Partners
A Federal Standard Terminology
Broad coverage of the cancer research and clinical domain including prevention and treatment trials
  Neoplastic and other Diseases
  Findings and Abnormalities
  Anatomy, Tissues, Subcellular Structures
  Agents, Drugs, Chemicals
  Genes, Gene Products, Biological Processes
  Animal Models – Mouse, other
  Research techniques and management, apparatus, clinical and lab, radiology, imagery

NCI Thesaurus (2):
Published Monthly
Public domain, open content license
Available on-line and by download (OWL, Ontylog XML, flat files)
48,000+ “Concepts” hierarchically organized
Description-logic based
“Roles” establish machine readable semantic relationships between Concepts, e.g.:
  “Carcinoma” Clinically_associated_with “Lytic Bone Lesions”
  “TP53” Gene_associated_with_Disease “Breast Carcinoma”

Slide24:
NCI Thesaurus is Deployed:
  http://nciterms.nci.nih.gov
  http://www.nci.nih.gov/EVS (full documentation)
  API: caCORE public access
Fulfills NCI and collaborators’ needs for controlled vocabulary
Public domain, open content license

Example Concept Details:
Concept Details
URI: http://nciterms.nci.nih.gov:80/NCIBrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&code=C19151
Version: August 2005 (05.09e)
Metastasis
Identifiers:
  name: Metastasis
  code: C19151
Relationships to other concepts:
  Biological_Process_Has_Result_Biological_Process: Tumor Expansion
  Biological_Process_Has_Initiator_Process: Pathologic Process
Information about this concept:
  Synonym: MET
  Synonym: metastasis
  Synonym: Tumor Cell Migration
  Synonym with source data: Metastasis|PT|CADSR
  Synonym with source data: MET|AB|CADSR
  Synonym with source data: Tumor Cell Migration|SY|NCI
  Synonym with source data: Metastasis|PT|NCI
  Synonym with source data: metastasis|SY|NCI-GLOSS|CDR0000046710
  NCI_META_CUI: CL001192
  Semantic_Type: Phenomenon or Process
  Related_Lash_Concept: metastasis
  Preferred_Name: Metastasis
  DEFINITION: NCI|Metastasis is the spread or migration of cancer cells from one part of the body (the organ in which it first appeared) to another. The secondary tumor contains cells that are like those in the original (primary) tumor. For example, breast cancer cells may spread (metastasize) to the lungs and cause the growth of a new tumor. When this happens, the disease is called metastatic breast cancer. (NCI)
  Synonym: Metastasis
  DEFINITION: NCI-GLOSS|(meh-TAS-ta-sis) The spread of cancer from one part of the body to another. A tumor formed from cells that have spread is called a secondary tumor, a metastatic tumor, or a metastasis. The secondary tumor contains cells that are like those in the original (primary) tumor. The plural form of metastasis is metastases (meh-TAS-ta-seez).
Superconcepts: Cancer Progression
Subconcepts: Distant Metastasis, Intravascular Metastasis

Other Examples:
Use URI to view Details of a Drug Concept: http://nciterms.nci.nih.gov:80/NCIBrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&code=C620
Use GUI to search for and view hierarchy: http://nciterms.nci.nih.gov
Fluvastatin Sodium

NCI Metathesaurus:
Filtered UMLS Metathesaurus extended with additional required vocabularies
930,000+ concepts, 2,200,000 terms and phrases with definitions
Mappings among over 50 vocabularies
Extensive synonymy: Over 40,000 terms for neoplasms mapped to 7,000 concepts
Used as online dictionary and thesaurus, for mapping and document indexing

NCI Metathesaurus (2):
Minor releases monthly, major releases twice a year
Provides a mapped overlap and partial inter-relation of current versions of NCI and partner required vocabularies, e.g. the ICDs, MedDRA, SNOMED, MeSH (NLM Medical Subject Headings), HCPCS (procedures), LOINC (lab values), drug terminologies (VA NDF-RT, AOD, RxNORM, Multum, NCI Thesaurus drugs, etc.)

EVS Products & Services Are Open:
NCI Thesaurus is Open Content
  ftp://ftp1.nci.nih.gov/pub/cacore/EVS/ThesaurusTermsofUse.htm
NCI Metathesaurus is Mostly Open Source
  See Each Source’s License: http://ncimeta.nci.nih.gov/MetaServlet/GenerateSourcesServlet
NCI EVS Servers Are Freely Accessible
  On the Web: http://nciterms.nci.nih.gov and http://ncimeta.nci.nih.gov
  Via API: http://ncicb.nci.nih.gov/core/caBIO
All Software Developed by NCI EVS is Public Open Source and Free for the Asking: http://ncicb.nci.nih.gov/core

EVS Collaborations:
Many Active Collaborations
  Federal: FDA, VA, CDC, and Various NIH Institutes such as NHLBI, NIDCR
  Major Standards Organizations: HL7, CDISC, W3C, FHA
  Cancer Centers and Cancer Cooperative Groups (caBIG, caGRID)
  Numerous Research collaborators such as the Microarray Gene Expression Data Society (MGED Ontology, FuGO)

Areas of Collaboration:
FDA (Terminology for Drugs, Devices, and Clinical Trial Terminology Initiatives)
VA (Drugs, Common Clinical Trials Semantics, Terminology Operations)
CDC (Cancer Incidence and Prevention, Terminology Operations)
Cancer Centers (Clinical Trials, Experimental Organism Terminology, Micronutrients, Open Terminology Servers, other (caBIG))
CDISC/HL7 RCRIM (Clinical Research Data Standards)

Contact:
Frank Hartel, PhD
NCI Center for Bioinformatics
hartel@mail.nih.gov
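
The Example Concept Details slide shows two machine-readable patterns: a concept report URI built from a dictionary name and a concept code, and pipe-delimited "Synonym with source data" values. The Python sketch below shows one way such values might be handled. The field names (`term`, `term_type`, `source`, `source_code`) are assumptions inferred from the examples on the slide, not a documented EVS format, and `parse_sourced_synonym` and `concept_report_url` are hypothetical helpers. The URL pattern is copied from the slide and may no longer resolve; the code only builds the string and makes no network call.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlencode


class SourcedSynonym(NamedTuple):
    term: str
    term_type: str              # e.g. "PT", "SY", "AB" as seen on the slide
    source: str                 # e.g. "NCI", "CADSR", "NCI-GLOSS"
    source_code: Optional[str]  # optional fourth field, e.g. "CDR0000046710"


def parse_sourced_synonym(raw: str) -> SourcedSynonym:
    """Split a 'term|type|source[|source_code]' value into named fields."""
    parts = raw.strip().split("|")
    if len(parts) not in (3, 4):
        raise ValueError(f"expected 3 or 4 pipe-delimited fields, got {len(parts)}")
    source_code = parts[3] if len(parts) == 4 else None
    return SourcedSynonym(parts[0], parts[1], parts[2], source_code)


def concept_report_url(code: str, dictionary: str = "NCI_Thesaurus") -> str:
    """Build a concept report URI following the pattern shown on the slide."""
    base = "http://nciterms.nci.nih.gov:80/NCIBrowser/ConceptReport.jsp"
    return f"{base}?{urlencode({'dictionary': dictionary, 'code': code})}"


print(parse_sourced_synonym("metastasis|SY|NCI-GLOSS|CDR0000046710"))
print(concept_report_url("C19151"))
```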

