Taxonomy Development and Digital Projects*

Published on February 9, 2009

Author: danielabarbosa

Source: slideshare.net

Description

Presentation from the ALA Midwinter 2009 (American Library Association) meeting, given as part of the Networked Resources and Metadata Interest Group (NRMIG). A discussion on taxonomy development led by Laura Dorricott, a Taxonomy Project Delivery Manager with Dow Jones Taxonomy Services, on Sunday, January 25, 2009.

Corresponding blog post with notes from the session by Laura is available here:
http://synapticacentral.com/content/notes-session-taxonomy-development-and-digital-projects

Taxonomy Development and Digital Projects

Laura Dorricott, Project Delivery Manager, Taxonomy Services, Dow Jones Client Solutions

January 25, 2009

Networked Resources and Metadata Interest Group, ALA Midwinter 2009

Introduction

Laura Dorricott, Project Delivery Manager, Taxonomy Services, Dow Jones Client Solutions

IHS, Inc. – Indexer and Lexicographer

Synapse – 1995-2005

Taxonomist and Operations Director

Dow Jones – 2005 – Project Delivery Manager

Information management needs – What do we do with this???

Dr. Seuss

Theodor Seuss Geisel

Theo LeSieg

American

Children’s writer

March 2, 1904

Springfield, MA

Articles about “Dr. Seuss”

Taxonomy’s Evolutionary Path

Diagram labels: Dictionaries & Flat Lists; Hierarchical Taxonomies; Controlled Vocabulary Thesauri; Ontologies; Structured Authority Files

Taxonomies are the building blocks for ontologies, and ontologies are semantic representations of the real world in all its rich diversity.

Taxonomy is evolving organically…

© 2007, Dow Jones

Definitions of Controlled Vocabularies

List:

“Sometimes called a pick list, a limited set of terms arranged as a simple alphabetical list or in some other logically evident way.”

Synonym ring:

“A group of terms that are considered equivalent for the purposes of retrieval.”

Taxonomy:

“A collection of controlled vocabulary terms organized into a hierarchical structure. Each term has one or more parent/child (broader/narrower) relationships to other terms.”

Thesaurus:

“A controlled vocabulary arranged in a known order and structured so that the various relationships among terms are displayed clearly and identified by standardized relationship indicators. Relationship indicators should be employed reciprocally.”
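As a rough illustration of how these four definitions differ in structure, the sketch below models each one with plain Python data. The pick-list values, the terms "plants" and "cactuses", and all helper names are made up for illustration; "cacti", "succulent plants", and "salinity/saltiness" are borrowed from examples elsewhere in this deck. It is a minimal sketch, not a representation prescribed by any standard.

```python
# Illustrative data only; see the assumptions noted above.

# List (pick list): a flat, ordered set of allowed values.
pick_list = sorted(["article", "book", "map", "photograph"])

# Synonym ring: terms treated as equivalent for retrieval, no preferred term.
synonym_ring = frozenset({"salinity", "saltiness"})

# Taxonomy: every term points to its parent(s), i.e. its broader terms.
taxonomy_parents = {
    "plants": [],
    "succulent plants": ["plants"],
    "cacti": ["succulent plants"],
}

# Thesaurus: relationships are typed with standard indicators
# (BT, NT, USE, UF) and recorded in both directions.
thesaurus = {
    "plants": {"NT": ["succulent plants"]},
    "succulent plants": {"BT": ["plants"], "NT": ["cacti"]},
    "cacti": {"BT": ["succulent plants"], "UF": ["cactuses"]},
    "cactuses": {"USE": ["cacti"]},
}


def is_allowed(value):
    """A pick list can only answer: is this value permitted?"""
    return value in pick_list


def equivalent(a, b):
    """A synonym ring can answer: do these two terms retrieve the same things?"""
    return a in synonym_ring and b in synonym_ring


def broader_terms(term):
    """A taxonomy can answer: what is the parent of this term?"""
    return taxonomy_parents.get(term, [])


def preferred_term(term):
    """A thesaurus can also redirect a non-preferred term via USE."""
    use = thesaurus.get(term, {}).get("USE")
    return use[0] if use else term
```

Each step up adds one kind of question the vocabulary can answer: membership, equivalence, parentage, and finally typed, reciprocal relationships.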

Next Generation

Ontology:

“A controlled vocabulary developed to bridge the gap between the real world and the information world, by striving to exactly model and control all the fundamentals of information concepts with the goal of building a new class of intelligent technologies and knowledge systems.”

Purposes of Controlled Vocabularies

Translation

Consistency

Provide a framework of concepts that accurately represents the real world.*

Indication of semantic relationships

Hierarchical arrangement to assist browsing

Search and retrieval

Improve precision and recall

Reduce search time

* Real world includes physical objects, databases, digital content, and abstract domains of knowledge
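Precision and recall are conventionally defined as the share of retrieved items that are relevant, and the share of relevant items that were retrieved. A minimal sketch of that arithmetic follows; the document IDs are hypothetical, and returning 0.0 for an empty set is a convenience assumed here rather than part of the definition.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    retrieved: IDs the search returned
    relevant:  IDs a human judged relevant to the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant items that were actually found
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result set: four documents returned, three truly relevant.
retrieved = {"doc1", "doc2", "doc3", "doc4"}
relevant = {"doc2", "doc4", "doc5"}

precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

Here two of the four returned documents are relevant (precision 0.5), and two of the three relevant documents were found (recall 2/3).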

SEARCH

Keyword Search

Keyword searching is insufficient

People do not always know what they want

People all have different “keywords”

People don’t perform complex keyword searches

One word can have many meanings

Two or more words can share the same meaning

one thing can have many different names

Dr. Peter Roget

one word can mean very different things

"the elasticity of language"

Taxonomy helps people filter out the noise and discover the relevant things regardless of what they are called.
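One way to picture "regardless of what they are called" is query expansion with a synonym ring. The sketch below compares a plain keyword match with a ring-expanded match over three hypothetical documents; the document texts, the "car/automobile" ring, and the function names are assumptions for illustration only.

```python
import re

# Each ring groups terms treated as equivalent for retrieval.
SYNONYM_RINGS = [
    {"salinity", "saltiness"},
    {"car", "automobile"},
]

# Hypothetical document collection.
DOCUMENTS = {
    "d1": "Measuring the salinity of coastal water",
    "d2": "Why saltiness varies between seas",
    "d3": "Used car prices this year",
}


def tokens(text):
    """Lowercase word tokens of a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def expand(term):
    """Return the term plus every member of its synonym ring, if any."""
    term = term.lower()
    for ring in SYNONYM_RINGS:
        if term in ring:
            return set(ring)
    return {term}


def keyword_search(term):
    """Match only documents that contain the exact word typed."""
    return sorted(d for d, text in DOCUMENTS.items() if term.lower() in tokens(text))


def ring_search(term):
    """Match documents that contain the word or any equivalent term."""
    wanted = expand(term)
    return sorted(d for d, text in DOCUMENTS.items() if wanted & tokens(text))
```

With this data, `keyword_search("salinity")` finds only `d1`, while `ring_search("salinity")` finds both `d1` and `d2`.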

NAVIGATE

Search and navigation are not alternative solutions; they are complementary solutions. Users expect both.

Points of view…

one point of view…

another point of view…

Different audiences will have different views and good navigation will serve all of them.

Building a Taxonomy or Controlled Vocabulary

Now that we know what taxonomies and controlled vocabularies are and can see some of the reasons we need them – what do we do next???

Basic issues and principles

One word can have multiple meanings (ambiguity)

Two words can share the same meaning (synonymy)

Semantic relationships

Facets

Warrant

Structures

Metadata

Ambiguity

Polysemes (homonyms, homographs)

cranes (birds)

cranes (equipment)

Mercury (planet)

Mercury (god)

Mercury (car)

Mercury (metal)
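The parenthetical qualifiers above are what let a system ask "which Mercury did you mean?". A minimal sketch, assuming qualifiers always appear as a trailing "(…)" and using hypothetical function names:

```python
from collections import defaultdict

# Qualified terms taken from the examples above.
QUALIFIED_TERMS = [
    "cranes (birds)",
    "cranes (equipment)",
    "Mercury (planet)",
    "Mercury (god)",
    "Mercury (car)",
    "Mercury (metal)",
]


def split_qualifier(term):
    """Split 'Mercury (planet)' into ('Mercury', 'planet')."""
    if term.endswith(")") and " (" in term:
        label, qualifier = term[:-1].rsplit(" (", 1)
        return label, qualifier
    return term, None


def build_index(terms):
    """Map each bare label (lowercased) to all of its qualified terms."""
    index = defaultdict(list)
    for term in terms:
        label, _ = split_qualifier(term)
        index[label.lower()].append(term)
    return index


INDEX = build_index(QUALIFIED_TERMS)


def candidates(user_input):
    """Return every qualified term the user's bare word could refer to."""
    return INDEX.get(user_input.strip().lower(), [])
```

A search interface could show `candidates("mercury")` as a disambiguation prompt before running the query.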

Synonymy

Two words with the same or similar meaning

Popular vs. scientific names

Generic vs. trade names

Slang vs. traditional terms

Dialectical variants

Near-synonyms

Lexical variants

Generic postings

Semantic Relationships

Basic Types:

Equivalence (USE/UF)

Hierarchical (BT/NT)

Associative (RT/RT)

Represented by standard codes/symbols

Reciprocity
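Reciprocity means that every relationship is recorded from both ends: if A has BT B, then B has NT A. The sketch below shows one way to enforce this automatically and to audit for gaps. The class and method names are hypothetical, and the terms "cactuses" and "deserts" are made up for illustration.

```python
from collections import defaultdict

# Each indicator and the indicator that must appear on the other term.
INVERSE = {"BT": "NT", "NT": "BT", "RT": "RT", "USE": "UF", "UF": "USE"}


class Thesaurus:
    def __init__(self):
        # term -> indicator -> set of target terms
        self.relations = defaultdict(lambda: defaultdict(set))

    def add(self, source, indicator, target):
        """Record a relationship together with its reciprocal."""
        if indicator not in INVERSE:
            raise ValueError(f"unknown relationship indicator: {indicator}")
        self.relations[source][indicator].add(target)
        self.relations[target][INVERSE[indicator]].add(source)

    def related(self, term, indicator):
        """Sorted targets of one indicator for a term (empty if none)."""
        return sorted(self.relations.get(term, {}).get(indicator, set()))

    def missing_reciprocals(self):
        """List (term, indicator, target) entries that should exist but do not."""
        missing = []
        for source, by_indicator in self.relations.items():
            for indicator, targets in by_indicator.items():
                for target in targets:
                    back = self.relations.get(target, {}).get(INVERSE[indicator], set())
                    if source not in back:
                        missing.append((target, INVERSE[indicator], source))
        return sorted(missing)


thesaurus = Thesaurus()
thesaurus.add("cacti", "BT", "succulent plants")   # hierarchical
thesaurus.add("cactuses", "USE", "cacti")          # equivalence
thesaurus.add("cacti", "RT", "deserts")            # associative
```

After these three calls, `thesaurus.related("succulent plants", "NT")` returns `["cacti"]` even though only the BT side was entered. The audit method is useful for vocabularies edited by hand or imported from elsewhere.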

Hierarchical Relationships

Allow for browsable structures

Information discovery

Search expansion

Three types:

Generic

Instance

Whole-part
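Search expansion can be sketched as following narrower-term (NT) links downward, so that a search on a broad term also retrieves items indexed with its narrower terms. The example below is a minimal sketch under that assumption; the top terms "plants" and "landforms", the document IDs, and the function names are hypothetical.

```python
# Narrower-term (NT) links: broader term -> list of narrower terms.
NARROWER = {
    "plants": ["succulent plants"],
    "succulent plants": ["cacti"],
    "landforms": ["mountains"],
    "mountains": ["Rocky Mountains"],
}

# Hypothetical documents, each indexed with controlled vocabulary terms.
INDEXED = {
    "d1": {"cacti"},
    "d2": {"Rocky Mountains"},
    "d3": {"succulent plants"},
}


def expand_with_narrower(term):
    """Return the term followed by all of its narrower terms, at any depth."""
    ordered, seen = [], set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against loops and repeated branches
        seen.add(current)
        ordered.append(current)
        stack.extend(NARROWER.get(current, []))
    return ordered


def search(term):
    """Find documents indexed with the term or any narrower term."""
    wanted = set(expand_with_narrower(term))
    return sorted(doc for doc, tags in INDEXED.items() if tags & wanted)
```

Searching on "plants" then returns `d1` and `d3`, although neither document carries the term "plants" itself.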

Generic Hierarchical Relationships

Between a class and its members

“IsA” relationship

A cactus IsA succulent plant, therefore:

succulent plants NT cacti

Instance Hierarchical Relationships

Between a general category of things or events and an individual instance of that category

Instance is often a proper noun

Also an “IsA” relationship type

Example:

mountains NT Rocky Mountains

Whole-Part Hierarchical Relationships

One concept inherently included in another

Examples:

Systems and organs of the body

Geographic locations

Corporate, social, or political structures

Polyhierarchy

Concept logically fits into two different hierarchical structures

Advantage of electronic structures, allows for different viewpoints

Example:

Biochemistry

BT biology

BT chemistry
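In data terms, polyhierarchy simply means a term may list more than one broader term, so there is more than one path from the top of the vocabulary down to it. A minimal sketch, assuming a hypothetical top term "science" above both parents and no circular BT links:

```python
# Broader-term (BT) links: term -> list of broader terms.
# "science" is an assumed top term added for illustration.
BROADER = {
    "biochemistry": ["biology", "chemistry"],
    "biology": ["science"],
    "chemistry": ["science"],
    "science": [],
}


def paths_to_top(term):
    """Return every path from a top term down to the given term.

    Assumes the BT links contain no cycles.
    """
    parents = BROADER.get(term, [])
    if not parents:
        return [[term]]
    paths = []
    for parent in parents:
        for path in paths_to_top(parent):
            paths.append(path + [term])
    return paths


def all_broader(term):
    """Return the set of every direct or indirect broader term."""
    found = set()
    for parent in BROADER.get(term, []):
        found.add(parent)
        found |= all_broader(parent)
    return found
```

A browsing interface could show both paths, so a user arriving from either biology or chemistry finds the same term.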

Associative Relationships

May suggest additional terms for indexing or searching

Between terms in the same hierarchy

Overlapping sibling terms

Derivational relationships

Between terms in different hierarchies

Many types

Examples: Process/agent; Action/property; Cause/effect

Form of Terms

Single word or compound terms

Grammatical forms:

Nouns and noun phrases

Singular / plural

Capitalization

Predominantly lowercase characters, except for proper names, acronyms, trade names, etc.

Punctuation

Standards

“Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies,” ANSI/NISO Z39.19-2005

“Z39.50: A Primer on the Protocol,” ANSI/NISO Z39.50

“Structured Vocabularies for Information Retrieval. Guide. Definitions, Symbols and Abbreviations,” BS 8723-1:2005

“Structured Vocabularies for Information Retrieval. Guide. Thesauri,” BS 8723-2:2005

“Guidelines for the Establishment and Development of Multilingual Thesauri,” ISO 5964-1985

“Guidelines for the Establishment and Development of Monolingual Thesauri,” ISO 2788-1986

Web Ontology Language (OWL) Overview

2007 Factiva, Inc. All Rights Reserved.

Value Proposition, or “So what?”

“40% of corporate users…cannot find the information they need to do their jobs on their intranets.”

Susan Feldman, “The High Cost of Not Finding Information,” KMWorld, March 2004

Information retrieval issues within companies

Low productivity

High frustration

Little leverage of information assets

Too many search results

Too many irrelevant hits

The more precise I get the more I miss

End-user search illiteracy

Multilingual content

Ambiguous results

The controlled vocabulary value proposition

Unlock the value of internal and external content to:

Improve productivity

“Stop searching, start finding”

Reduce cost

Make existing content actionable, not dormant

Avoid reinventing wheels

Gain competitive advantage

Be better informed, act quicker

Controlled vocabulary’s role in portal success

Drive usage

Improve user experience, leverage portal investment

Drive cultural change

Help develop a common language

Support information exchange/reuse

Leverage information management skills

Turn information officers into information architects

Value Proposition

Taxonomies make it easier to find information so people are more likely to use intranets and extranets. This results in better return on the time and effort already invested in these intranets and extranets.

Taxonomies improve “hit” rates - people find what they need

Everyone has experienced irrelevant results from internet search engines because

• Two or more words or terms can be used to represent a single concept

salinity/saltiness

• Two or more words that have the same spelling can represent different concepts

Mercury (planet)

Mercury (metal)

Mercury (automobile)

Taxonomies eliminate much of this problem

People spend less time searching and more time finding

With a common taxonomy across the organization, knowledge can be more readily shared, reused and repurposed

Controlled vocabulary can help reduce costs and increase revenue

Taxonomies can help organizations save money

Reduces the number of hours spent seeking information. Hierarchical relationships allow users to easily narrow or broaden searches as well as look for related information.

Improves productivity by reusing and repurposing content

A taxonomy can help increase revenue

Increase customer satisfaction by improving

search efficiency

findability

relevance

Provide timely information with up-to-date terminology

Provide more precise information retrieval

THANK YOU!

Laura Dorricott

[email_address]
