BioOne Keynote

Information about BioOne Keynote

Published on April 17, 2008

Author: drielinger

Source: slideshare.net

Scientific Disciplines: From Discovery to Delivery

Cathy Norton, Deputy Director, BHL

BioOne, April 18, 2008


“The launch of the Encyclopedia of Life will have a profound and creative effect in science… this effort will lay out new directions for research in every branch of biology.”

E.O. Wilson

Collaborative Tree of Life distributed semantic Biodiversity Heritage Library ever evolving TED all information Synthesis Center Oh wow! SpeciesBase ClassificationBank Education and Outreach ANTS index MacArthur Foundation taxonomic intelligence modular software communal ownership user defined AvenueA | Razorfish OBIS MBL free visualization images WorkBench sounds phylogeny web 2.0 names-based infrastructure Atlas of Living Australia February 2008 Google Marine Biological Laboratory all species Smithsonian FISHBASE Harvard Field Museum Tree of Life E. O. Wilson aggregation / mashup EDIT ScratchPad widgets MOBOT NHM AMNH NYBotancial Sloan Foundation GBIF llison l NameBank videos National Geographic any classification TDWG/BIS


EOL Hierarchy

The EOL Steering Committee comprises senior authorities from Harvard University, the Smithsonian Institution, the Field Museum of Chicago, the Marine Biological Laboratory at Woods Hole, the Biodiversity Heritage Library consortium, the Missouri Botanical Garden, and the MacArthur and Sloan Foundations.

The EOL Institutional Council contains more than 25 institutions from around the world and provides EOL with global perspectives and outreach capabilities. The Distinguished Advisory Board consists of 13 global leaders from the scientific and policy communities.

EOL Hierarchy (cont'd)

The Species Sites Group works with contributors and data providers and on IP issues.

The Biodiversity Informatics Group is responsible for the software development of tools and the open-access delivery of species information through a single portal.

The Education and Outreach Group works to ensure widespread awareness of the EOL.

The Biodiversity Synthesis Group will facilitate cross-disciplinary involvement and will explore integrative topics, including taxonomy, evolution, biogeography, phylogenetics and biodiversity informatics.

The Scanning and Digitization Group is led by the Biodiversity Heritage Library, a consortium of 10 natural history, botanical and research libraries that will scan out-of-copyright and permissioned works for the public commons.

Data Partners

FishBase (www.fishbase.org), a global information system with data on practically every fish species known to science. FishBase is serving information on more than 30,000 fish species through the EOL.

The Catalogue of Life Partnership (CoLp) (www.catalogueoflife.org), an informal partnership dedicated to creating an index of the world’s organisms. Its index contains substantial contributions of taxonomic expertise from more than fifty organizations around the world, integrated into a single work through the ongoing efforts of the CoLp partners. The EOL currently uses CoLp as its taxonomic backbone.

Tree of Life web project (ToL) (www.tolweb.org), a collaborative effort of biologists from around the world. On more than 9,000 Web pages, the project provides information about the diversity of organisms on Earth, their evolutionary history (phylogeny), and characteristics. The ToL project illustrates the genetic connections between all living things.

The Global Biodiversity Information Facility (GBIF) (www.gbif.org), the world’s premier source for information on biological specimen and observational data, providing online access to more than 135 million data records from around the world. GBIF is providing range maps for the EOL species pages.

AmphibiaWeb (http://amphibiaweb.org), an online system enabling anyone with a Web browser to search and retrieve information relating to amphibian biology and conservation.

The Solanaceae Source Web site (www.nhm.ac.uk/research-curation/projects/solanaceaesource). The aim of the project is to produce a worldwide taxonomic monograph of the species occurring within the plant genus Solanum (which includes the potato and tomato), with principal investigators from four research institutions in England and the United States.


“It is exciting to anticipate the scientific chords we might hear once 1.8 million notes are brought together through this instrument. Potential EOL users are professional and citizen scientists, teachers, students, media, environmental managers, families and artists. The site will link the public and scientific community in a collaborative way that’s without precedent in scale.”

Jim Edwards, Executive Director, EOL

Encyclopedia of Life

Major project to create a single Web page for every known species (1.8 million!)

Total funding will reach at least $50M

EOL needs the literature underpinning provided by the BHL project

BHL is now a key partner in the EOL project

Launched on 9th May, 2007

First 30,000 pages launched at TED, Feb 27th, 2008

Data Sharing [diagram labels: Plant Names, Specimens, Descriptions, Citations]

Data Sharing

Standards

Services

Cache A data point is a collection of Data sources EOL Tree http://www.eol.org auto-updates Client application Update ontologies can be used to describe and relate the contents

An Enterprise Semantic Information Fabric: using ontologies, unique identifiers, and editable views through semantic lenses

Serine Molecule Biodiversity Heritage Library Synthesis Center Field Museum Informatics Marine Biological Laboratory & MOBOT Education & Outreach Smithsonian/Harvard Secretariat Smithsonian

Woods Hole Scientific Community

This library serves the MBL, WHOI, USGS, NMFS, SEA, WHRC, and other scientific groups in the area.

NMFS - 1871; MBL - 1888; WHOI - 1930; USGS - 1960; SEA - 1971; WHRC - 1985

Facing a new dynamic phase

Biodiversity Heritage Library

 


Museums

Field Museum

Natural History Museum (London)

Smithsonian Institution

American Museum of Natural History

Botanical Gardens

Missouri Botanical Garden

New York Botanical Garden

Royal Botanic Gardens, Kew

University Libraries

Botany Libraries, Harvard University

Ernst Meyer Library of the Museum of Comparative Zoology, Harvard University

Research Institute Library

Marine Biological Laboratory / Woods Hole Oceanographic Institution Library (MBL/WHOI )

All have signed MOUs

Mission: Provide Open Access to Biodiversity Literature

Goals:

Digitize the core published literature on biodiversity and put on the Web

Agree on approaches with the global taxonomic community, rights holders and others

How big is the Biodiversity domain?

Over 5.4 million books dating back to 1469

800,000 monographs

40,000 journal titles (12,500 current)

50% pre-1923

Why now?

Low cost: 10-19 cents a page

Other projects funded recently – BL/Microsoft, Google/Big Ten

Tractable, well-defined scientific domain

Taxonomic information has exceptional longevity

Supports GBIF and other international initiatives – including CBD, ABS, Darwin Declaration

Benefits

Taxonomists and other scientists worldwide will have access to biodiversity literature

Will provide the developing world with access to the historical literature

Scientists working in many biological domains, and in other areas such as meteorology, geology, ecology and genomics, will get access

Advance objectives of the Convention on Biological Diversity

Benefits to the MBLWHOI Library

Less space needed for Library collections in Lillie; space freed for other uses

% material can be stored off-site in ‘dark storage’. FTP

Our scientists will get access at their desk or in the field

Library focus will shift to informatics

Virtual web library will increase public access

Library staff will change –

Where are we now?

Key partner of Encyclopedia of Life

Working Groups have agreed on a technical plan, metadata standards and image standards

Internet Archive to be technical partner – scanning and hosting

‘Scribe’ scanners now installed in NHM, NYC and Boston

2.5 million pages already available


Legal issues - BHL organisational structure, content licensing, contracts being developed by EFF

BHL will take responsibility for long-term sustainability of the scanned material

Blackwell Publishing/Wiley back-files possibly available through the BHL

Zoological Record will provide their index as route to BHL articles

OCR and name recognition tools identified and linked to project - Taxonomic Intelligence

Where are we now? Europe, Rest of the World

BHL is US/UK focused.

Plans to engage European partners – through projects such as EDIT and SYNTHESYS – in a similar attempt to capture the non-English language publications

G8+5 Environment Ministers identified need for ‘Global Species Information System’ – first EU meeting to address response endorsed the BHL as the way forward

Positive discussions have already taken place with the Chinese Academy of Sciences

Australian Government likely to fund scanning as part of the Atlas of Living Australia


Classes of texts

Public Domain – pre-1923

Non-profit society journals

Post-1923 monographs

some with copyright renewals

some without copyright renewals

Commercial journals


BHL Seeks Permissions

BHL will digitize learned society backfiles and mount them through the BHL Portal at no cost.

Will provide a set of files to the learned society for reuse as they see fit.

Will index the issues using Taxonomic Intelligence increasing their usability.


Benefits

Use of the articles will increase as evidenced by citation upsurge.

Long-term management of the digital assets is provided by the BHL at no cost to its contributors.

Content will be integrated into the EOL project through TI nomenclatural linking.

Macro-economic case for open access (Tom Moritz)

The cited half-life of publications in taxonomy is longer than in any other scientific discipline.

The decay rate is slower than in most scientific disciplines.

Current taxonomic literature often relies on texts and specimens >100 years old.

Illustration: Levinus Vincent, Elenchus tabularum, pinacothecarum, 1719

The Long NOW Strategy

Convention on Biological Diversity: Article 17

Institutions that are creating the BHL exist to persist through time.

The future is uncertain, the technology landscape changes, people pass on. So create consortial structures that are low-overhead, flexible, and can respond quickly.

Interoperability is the key. Repository islands will sink.

Illustration: Georges Louis Leclerc, comte de Buffon, Histoire naturelle : générale et particulière (Oiseaux), 1799-1808

Biologia Centrali-Americana: Physical Distribution… Now… you can parse data and harvest out data. The wealth of information locked on the pages is now liberated!

Most literature is in the developed world, the Northern Hemisphere

Most biodiversity is in the developing world, the Southern Hemisphere

Illustration: Henry Walter Bates, The Naturalist on the River Amazons, 1863

Library and Laboratory: the Marriage of Research, Data and Taxonomic Literature (London, February 2005)

Eighty participants from 22 countries gathered to discuss the status and future of access to the taxonomic literature and to propose an agenda for actions that would improve the research environment for taxonomy. The participants were taxonomists; librarians; publishers; representatives of learned and professional societies, private foundations and government agencies; and specialists in information and communications technology.

Scalable Mass Scanning: contracts, firewalls, security, loading docks, trucks, 180-mile round trip!

Illustration: Progne subis (Purple Martin), Illustrations of the nest and eggs of birds of Ohio, 1879-1886

Mass Scanning Workflow

Bid Lists

Pick Lists

Packing Lists

Serials Management

Monographic Management

Stickers for Media and carts

Rare Books-vaults

Illustration: Ernest Ingersoll, Hand-book to the National Museum … Smithsonian Institution, 1886

It began and begat

Open Access: all content can be reused, repurposed, reformatted, sliced, diced, scraped, harvested, integrated.

2003 Telluride. Encyclopedia of Life Meeting

2005 London. Library and Laboratory: the Marriage of Research, Data, and Taxonomic Literature

June 2006 Washington. Organization and Technical Meeting

October 2006 St Louis/San Francisco. Technical Meeting

February 2007 MCZ Harvard. Organizational Meeting

May 2007 Washington DC. Encyclopedia of Life Launch

Sept 2007 Missouri Botanical Garden. Technical Meeting

March 2008 MCZ Harvard. Organizational Technical Meeting

Illustration: Reptilia and Batrachia (1885-1902) by Albert C.L.G. Günther

Collaborators

Internet Archive: set up scanning centers in London, New York, Washington, Boston, etc. High-quality, non-destructive scanning. Image files and text derived from OCR.

Internet Archive

International Commission on Zoological Nomenclature

Open Content Alliance

European Distributed Institute of Taxonomy

Global Biodiversity Information Facility (GBIF)

Many more under negotiation

Illustration: Sanborn Tenney, Natural History of Animals . . . 1868.

BHL Portal

http://www.biodiversitylibrary.org

Serve image and text files; create volume, part, and piece metadata; ingest page-level metadata at scanning level; apply Globally Unique Identifiers (GUIDs) for linking to other taxonomic services.

Illustration: Jacob Christian Schäffer, Elementa entomologica . . . 1766.

Internet Archive Scribe: Boston

Biodiversity Heritage Library Collaborators: Internet Archive


Biodiversity Informatics

Period of explosive growth

NCL Centre for Biodiversity Informatics (India)--2000

Speciation event: Biodiversity Informatics --2004

Ocean Biodiversity Informatics conferences--2004, 2007

Species-based sites: FishBase, AntWeb, AmphibiaWeb, North American Mammals, Swedish ArtDatabanken, Atlas of Living Australia, Netherlands species compendium …

Specimen-based networks: HerpNet, MANIS, ORNIS, …

Regional networks: IABIN, OBIS, …

Biogeomancer--2005

IdentifyLife--2005

JRS Biodiversity Foundation--2005

European Distributed Institute of Taxonomy (EDIT)--2006

BDI curricula

University of Illinois Master of Science in Biological Informatics--2006

Encyclopedia of Life (EOL)--2007

An example: The Encyclopedia of Life (EOL)

An online encyclopedia composed of 1.8 million web sites

One for each known species

EOL is developing two aspects of the original GBIF work programme

SpeciesBank--assemblage of all kinds of information about species

Digital library of biodiversity literature

Web 2.0 components of the Encyclopedia of Life (EOL)

Each site consists of several components

Species page for the general public

Draft pages assembled via mashup technology

Drafts authenticated by experts (“curators”) using controlled wikis

Information protected from being changed by anyone except the curators

But anyone can comment on the information and/or suggest things to add

Curators will examine these suggestions and may move some of the information to the protected part

Web 2.0 components of the Encyclopedia of Life (EOL)

Each site consists of several components

Species page for the general public

Community-assembled spaces

E.g. taxonomists, molecular biologists, horticulturists, birdwatchers, pollinator biologists, etc., etc.

Each links in different databases and information

Can also be the focus of social networks

Spider freaks, leech aficionados, polar bear lovers, gingko groupies, microbe mavens, whatever …

Each group/network controls the information on its space

Example of a science-based community-assembled space on the EOL

Scientists working on ageing wanted access to longevity information on the EOL

Proposed to organize their community to find this information and put it on the EOL species pages

Will set up their own portal into this information and manage the changing of the information

Received USD 2 million from private foundation to fund this activity

Example of an education-based community-assembled space on the EOL

A school wishes to catalogue the biodiversity of a site near their schoolyard

EOL and GBIF supply a bioblitz tool for them to use

Use GPS-enabled phones to take pictures of organisms found on the site

Assembly software combines these into a community inventory

Students identify the organisms using EOL species pages

Prepare inventory of the site

Serve that information back to the EOL web pages (and potentially even to GBIF)

Web 2.0 components of the Encyclopedia of Life (EOL)

Each site consists of several components

Species page for the general public

Community-assembled spaces

Digitized biodiversity literature

Biodiversity Heritage Library--consortium of 10 of the largest natural history libraries

Scanning and marking up of 320,000,000 pages of literature
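For a rough sense of scale, the 320,000,000-page target can be combined with the scanning cost of 10-19 cents a page quoted earlier in the talk. The sketch below assumes that the per-page figure applies uniformly to every page and covers scanning only; the slides do not say whether markup is included, so treat the result as an order-of-magnitude illustration rather than a budget.

```python
# Figures quoted in the slides: 320,000,000 pages at 10-19 cents a page.
PAGES = 320_000_000
CENTS_PER_PAGE_LOW = 10
CENTS_PER_PAGE_HIGH = 19


def scanning_cost_usd(pages: int, cents_per_page: int) -> int:
    """Total cost in whole US dollars; integer maths avoids float rounding."""
    return pages * cents_per_page // 100


low = scanning_cost_usd(PAGES, CENTS_PER_PAGE_LOW)
high = scanning_cost_usd(PAGES, CENTS_PER_PAGE_HIGH)
print(f"Estimated scanning cost: ${low:,} to ${high:,}")
```

Under these assumptions the range works out to roughly $32 million to $61 million.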

“All accumulated information of a species is tied to a scientific name, a name that serves as a link between what has been learned in the past and what we today add to the body of knowledge.” ~ Grimaldi & Engel, 2005, Evolution of the Insects

Who knoweth not the name, knoweth not the subject. Linnaeus, 1737, Critica Botanica, no. 210.

Information about named groups (taxa) of organisms (taxon-related information)

Extends back at least 1000 years

Books, journals, surveys

Museum specimens, herbaria

Is in many languages and is distributed

Illustration: from T.E. Glover, The Fishes of Southwestern Japan, c.1870

The challenge for contemporary DIGITAL libraries

Goal: Use one name to find the content for all names

Names – the only universal metadata for Biology

Names offer a logical way to search for and index content

Names annotate data objects

All names annotate all data objects

A compilation of all names ever used is the foundation of a universal index for biology or for a semantic web for biology

Who is affected by these problems?

Libraries

Publishers

Museums

Federal Agencies

Serious challenges in federated environments

One organism, 4 scientific names, 4 maps

We want one map

Reconciliation – linking alternative names for the same organism

A query initiated with any name can be expanded to all names and will unify the data associated with each
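The query expansion described on this slide can be sketched in a few lines. This is a minimal illustration and not uBio's implementation: the reconciliation groups, the record store, and the function names are hypothetical, and the name lists are examples only.

```python
# Hypothetical reconciliation groups: each list holds alternative names
# (scientific synonyms and a vernacular name) for one organism.
RECONCILIATION_GROUPS = [
    ["Progne subis", "Hirundo subis", "Purple Martin"],
    ["Gadus morhua", "Gadus callarias", "Atlantic cod"],
]


def build_name_index(groups):
    """Map every lower-cased name to the position of its group."""
    index = {}
    for group_id, names in enumerate(groups):
        for name in names:
            index[name.lower()] = group_id
    return index


def expand_query(name, groups, index):
    """Expand a query on any one name to all names in its group."""
    group_id = index.get(name.strip().lower())
    if group_id is None:
        return [name]  # unknown name: search for it unchanged
    return list(groups[group_id])


def unified_records(name, groups, index, records_by_name):
    """Collect the records filed under every alternative name."""
    found = []
    for alternative in expand_query(name, groups, index):
        found.extend(records_by_name.get(alternative, []))
    return found


NAME_INDEX = build_name_index(RECONCILIATION_GROUPS)
# Hypothetical records, each filed by its source under a different name.
RECORDS = {
    "Progne subis": ["map A"],
    "Hirundo subis": ["map B"],
}
print(unified_records("Purple Martin", RECONCILIATION_GROUPS, NAME_INDEX, RECORDS))
```

A query on the vernacular name returns the records filed under both scientific names, which is the "one map" outcome the preceding slide asks for.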

Reuse, don’t rebuild

 

Taxonomic intelligence is the inclusion of taxonomic practices, skills and knowledge within informatics services to manage information about organisms

All names & all Classifications ClassificationBank

Alternative names reconciled

Similar names disambiguated

Exploit hierarchies to browse and search, build a comprehensive classification

Improve performance with federated systems

Read documents, web sites and databases, and taxonomically index the content

Create a unified portal to information about organisms on the internet

Specimen distribution data from remote sources

Data from various sources may be merged

Red dots on the map link back to the website that provided the geographical co-ordinates

Biodiversity Heritage Library: BHL Taxonomic Intelligence Tool

Illustration: Georges Louis Leclerc, comte de Buffon, Histoire naturelle : générale et particulière (Oiseaux), 1799-1808

uBio

10.7 Million+ Name Strings

Reconciliation Groups

http://www.ubio.org

FindIT - uBio’s Scientific Name Recognition Algorithm
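The slides do not describe how FindIT works, so the following is not uBio's algorithm. It is a deliberately naive sketch of the general idea of scientific name recognition: look for "Genus species" patterns in text and keep only those whose genus appears in a lexicon. The pattern, the lexicon, and the function name are all assumptions made for illustration.

```python
import re

# Naive pattern for a Latin binomial: a capitalised word followed by a
# lower-case word. Real name recognition needs far more than this.
BINOMIAL = re.compile(r"\b([A-Z][a-z]{2,})\s+([a-z]{3,})\b")

# Hypothetical lexicon of known genus names, used to filter false hits.
KNOWN_GENERA = {"Progne", "Solanum", "Gadus"}


def find_names(text, known_genera=KNOWN_GENERA):
    """Return candidate binomials whose genus is in the lexicon."""
    return [
        f"{genus} {species}"
        for genus, species in BINOMIAL.findall(text)
        if genus in known_genera
    ]


sample = "The Purple Martin, Progne subis, nests in Ohio. Solanum tuberosum is the potato."
print(find_names(sample))
```

Filtering against a lexicon is what keeps ordinary capitalised phrases out of the result; the size and quality of that lexicon is where a service built on millions of name strings would differ from this toy.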

Training and Improving the Algorithm

uBioRSS Taxonomically Intelligent RSS Feed Aggregator

MBL WHOI Library – Woods Hole authors’ publications

MBL WHOI Library – Woods Hole species publications

Taxonomically intelligent scientific text parsing

 

 

Search

Browse

Taxonomic intelligence works miracles

It will benefit any initiative that uses distributed and heterogeneous information about biology

Distributed content on the same species can be drawn together because different names will be standardized through reconciliation

We can read documents, find names, catalog and taxonomically index documents

Produce a framework around which we can organize and assemble remote and local content
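The "read documents, find names, index documents" step can be sketched as a lexicon lookup. This is a deliberately simple stand-in and not uBio's FindIT algorithm: it assumes a small hypothetical lexicon of binomial names and a regular expression that only catches "Genus species" patterns.

```python
import re
from collections import defaultdict

# Hypothetical lexicon; a real service would draw on millions of name strings.
LEXICON = {"Gadus morhua", "Homarus americanus"}

# Candidate binomials: a capitalized word followed by a lowercase word.
# The lookahead lets candidates overlap, so no name is skipped.
BINOMIAL = re.compile(r"(?=\b([A-Z][a-z]+ [a-z]+)\b)")


def find_names(text, lexicon):
    """Return the lexicon names that occur in `text`, sorted alphabetically."""
    return sorted({m for m in BINOMIAL.findall(text) if m in lexicon})


def build_taxonomic_index(documents, lexicon):
    """Map each recognized name to the set of document ids that mention it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for name in find_names(text, lexicon):
            index[name].add(doc_id)
    return dict(index)


if __name__ == "__main__":
    docs = {
        "doc1": "Catches of Gadus morhua declined near Woods Hole.",
        "doc2": "Homarus americanus and Gadus morhua share habitat.",
    }
    print(build_taxonomic_index(docs, LEXICON))
```

Combining such an index with reconciliation groups would let a search on one name retrieve documents that use any of its alternative names.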

Taxonomic Intelligence

Lexicon of Scientific Names

Reconciliation and Disambiguation

Hierarchical Inclusion

Integration into Information Retrieval

Linkage to Other Data Types (e.g., Molecular, Morphological, Phenotype)

AND BEYOND

EMF Biology of Aging: Ellison Medical Foundation (EMF), “Enable the Study of Aging Across the Spectrum of Life.” Officially began January 2008.

EMF Biology of Aging (FEDORA Commons): conditions, locations, organisms, genes

EMF/EOL Key Resources

Medline, BHL (Literature)

GenBank (Molecular)

EOL (Habitat & Location)

A constant

All organisms are affected by aging

Not all aging is associated with disease

The flip side: Understanding aging might give insights to regeneration

Biomedical Focus

Expand the scope of organisms beyond the “classic” models:

Goals of EMF (years one & two)

What genes are associated with aging conditions?

What are the conditions associated with these genes?

What organisms are associated with the aging genes and conditions?

What other organisms might also have aging genes?

Where do the identified organisms live, and in what types of habitats?

What are the demographic patterns associated with organisms across the spectrum of life?

What are common phenotypes associated with organisms that share common aging genes?

Status Update: Statistics

3,047 titles completed

7,669 volumes

2,945,143 pages in portal

~5.5 million pages scanned

Three 10-station scribe centers (Boston, Washington, New York)

Two 1-2 scribe stations (SI, Urbana, London)

Status today

Proven the concept of mass scanning of general collections

Proven concept of automated structured markup done in collaboration with Penn State and the Internet Archive

Built a proof-of-concept portal on a proprietary (.NET) environment

High levels of OCR accuracy in late 19th- and 20th-century printing

Applied taxonomic intelligence (species name finding) across millions of pages against nearly 11 million names in NameBank

Data mining BHL for other bioinformatics projects (EOL)

Obtained buy-in from a diverse group of learned societies for the BHL opt-in copyright model

Support and encouragement from our traditional bibliophile and scientific audiences

Collaboration with an international group of competitive organizations

Status tomorrow

Get equal cost efficiencies and speed for special collections

Nail down automated structural markup to a high level of accuracy

Port the portal from .Net to Fedora

Improve OCR for publications in other languages with little human intervention

Broaden the use of the taxonomic intelligence algorithm

Data mining BHL for other bioinformatics projects (?????)

Work with commercial publishers for fair and equitable use of their publications

Expand audiences through social networking and repurposing content for new audiences

Expand the consortium to bring in more partners, particularly in Europe, Asia, and the developing world

 

 

 

 

www.eol.org www.ubio.org www.biodiversitylibrary.org

Acknowledgments: Patrick Leary, David Remsen, Diane Rielinger, David Patterson, Neil Sarkar, A.W. Mellon Foundation, Alfred P. Sloan Foundation, John D. & Catherine T. MacArthur Foundation, Internet Archive, Jim Edwards, Christopher Freeland, Tom Garnett, Martin Kalfatovic, Graham Higley, BHL & EOL Teams

 

Gesner, 1576

 
