Trait data mining using FIGS (2006)

Published on October 10, 2009

Author: DagEndresen

Source: slideshare.net

Description

Trait Mining, prediction of agricultural traits in plant genetic resources with ecological parameters. Focused Identification of Germplasm Strategy (FIGS). For the Vavilov seminars at the IPK Gatersleben 13th June 2007. Dag Endresen, Michael Mackay, Kenneth Street.

Cover slide

Utilization of Genetic Resources

Prediction of agricultural traits in plant genetic resources with ecological parameters

June 13, 2007, IPK Gatersleben

Dag Terje Filip Endresen, Nordic Gene Bank (NGB), Sweden

Michael Mackay, Australian Winter Cereals Collection (AWCC), Tamworth Agricultural Institute, NSW DPI, Australia

Kenneth Street, Project Coordinator, Genetic Resource Unit, ICARDA

TOPICS

Utilization of genetic resources:

Prediction of agricultural trait values with ecological parameters

Distributed information network, data standards, data exchange tools

Utilization of genetic resources

Strategies to improve the utilization of the accessions in the genebank collections are of high priority to increase the genetic diversity of the food crops for enhanced food security.

Data access, interoperability

Data mining tools

Utilization of genetic resources, data access

Limited access to user-friendly, interoperable documentation on genetic resources across genebank collections may constrain wider use of the genetic resources conserved in genebanks.

No global one-stop data portal yet exists for accessing genebank accessions from all parts of the world. In Europe, the EURISCO search catalogue was developed.

Methods to improve data exchange with web services and enhanced interoperability with data standards and ontologies will be explored further.

(Examples: TDWG, GBIF, BioCASE, Bioversity International)

Utilization of genetic resources, data mining tools

A lack of powerful tools to analyze and find (mine) accessions with a higher probability of having attractive phenotypes for further crop improvement may be another constraint on wider utilization.

The new methods under development for prediction of agrobotanical traits will be the main topic of this study.

This will involve the building of ecological niche models based on the ecological parameters of the site of origin for the genetic resources included in a study.

(FIGS, openModeller, GBIF-MAPA)

Trait mining with ecological parameters

Landraces and other cultivars have been created during cultivation over centuries. The ecological parameters of this cultural landscape take part in shaping the cultivated plant material.

When searching for suitable genebank accessions for a specific crop improvement program, a breeder may have thousands of candidate accessions.

Often only a small number of the candidate accessions have previously been screened for the relevant phenotypic trait characters.

Based on the assumption that these trait characters are correlated to the ecological attributes of the site of origin, trait values can be predicted in accessions not yet screened.

Trait mining with ecological parameters

A subset of accessions with known character expression is used to build a “habitat signature” (niche model) that ranks the probability of the different “habitats” to “produce” a valuable phenotypic trait.

This niche model is then applied to the ecological parameters of the origin sites to predict desired trait values.

The first studies within the FIGS project proved this assumption and generated very interesting results.
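The prediction step described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a plain k-nearest-neighbour predictor in environmental space as a stand-in for a niche model (the slides do not say which algorithm FIGS used), and the environmental variables, accession names and all numbers are hypothetical.

```python
import math


def standardize(rows):
    """Scale each column to zero mean and unit variance.

    Returns the scaled rows and the (mean, sd) pair used for each column.
    """
    stats = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col)) or 1.0
        stats.append((mean, sd))
    scaled = [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]
    return scaled, stats


def predict_trait(train_env, train_trait, query_env, k=3):
    """Predict a trait value as the mean score of the k screened accessions
    whose collection sites are most similar in environmental space."""
    scaled, stats = standardize(train_env)
    query = [(v - m) / s for v, (m, s) in zip(query_env, stats)]
    by_distance = sorted(
        (math.dist(query, row), score) for row, score in zip(scaled, train_trait)
    )
    nearest = by_distance[:k]
    return sum(score for _, score in nearest) / len(nearest)


# Hypothetical screened accessions: (annual precipitation in mm, mean temperature in °C)
screened_env = [(250, 18.0), (300, 17.0), (280, 19.0), (800, 9.0), (750, 10.0), (900, 8.0)]
screened_score = [8.0, 7.0, 9.0, 2.0, 3.0, 1.0]  # hypothetical trait scores

# Hypothetical accessions that have not been screened yet
unscreened = {"ACC-A": (270, 18.5), "ACC-B": (820, 9.5)}

predictions = {
    name: predict_trait(screened_env, screened_score, env)
    for name, env in unscreened.items()
}
ranked = sorted(predictions, key=predictions.get, reverse=True)
```

Ranking the unscreened accessions by predicted score gives a breeder a short list to evaluate first; a real study would validate the predictions against held-out screening data.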

Ecological Niche

The fundamental ecological niche of an organism was formalized by Hutchinson [1] in 1957 as an n-dimensional hypervolume defining the ecological conditions that allow a species to exist.

Full understanding of all the environmental conditions for any organism is a monumental task [2].

By combining the occurrence localities with selected associated environmental conditions such as rainfall, temperature and day length, an approximation of the fundamental niche can be made.

Some popular software implementations for modeling the ecological niche include BioCLIM, DesktopGARP, MaxEnt, etc.
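To make the idea of a niche as a region in environmental space concrete, here is a minimal sketch of a rectangular envelope: the minimum and maximum of each environmental variable over the known occurrence sites. It is written in the spirit of bioclimatic envelope methods but is not the implementation used by any of the packages named above; the variable names and values are hypothetical.

```python
def fit_envelope(occurrences):
    """Return {variable: (min, max)} over all occurrence sites."""
    variables = occurrences[0].keys()
    return {
        var: (min(site[var] for site in occurrences),
              max(site[var] for site in occurrences))
        for var in variables
    }


def inside(envelope, site):
    """True if every variable of the site falls within the envelope."""
    return all(low <= site[var] <= high for var, (low, high) in envelope.items())


# Hypothetical occurrence sites with their environmental conditions
occurrences = [
    {"rainfall_mm": 300, "temp_c": 16.0, "day_length_h": 13.0},
    {"rainfall_mm": 350, "temp_c": 18.0, "day_length_h": 14.0},
    {"rainfall_mm": 280, "temp_c": 17.0, "day_length_h": 13.5},
]
envelope = fit_envelope(occurrences)

candidate_dry = {"rainfall_mm": 320, "temp_c": 17.0, "day_length_h": 13.2}
candidate_wet = {"rainfall_mm": 600, "temp_c": 17.0, "day_length_h": 13.2}
suitable_dry = inside(envelope, candidate_dry)
suitable_wet = inside(envelope, candidate_wet)
```

A strict min/max box is sensitive to outliers, which is one reason practical tools tend to use percentiles or probabilistic models instead.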

Ecological niche, trait mining

In this study of agrobotanical trait mining, correlation between the trait and this fundamental niche is assumed.

The distribution of the studied traits in geographic space will be correlated to an approximated fundamental ecological niche, or “distribution in the ecological space”.

The ecological niche of a trait will be used to predict trait values in the studied dataset of genebank accessions.

Biological status of sample

Types of plant genetic resources:

Wild relatives of crop species (MCPD: 100, 200)

Landraces, traditional cultivars (MCPD: 300)

Research material, genetic stocks (MCPD: 400)

Modern varieties, advanced cultivars (MCPD: 500)

Multi-Crop Passport Descriptors (SAMPSTAT). FAO, IPGRI (Alercia et al. 2001) Genetic Resources in Plants (Frankel, 1984, 1970)

It is more often the landraces and non-cultivated types of plant genetic resources that are targeted as sources of novel genetic variation for breeding activities, especially for overcoming biotic and abiotic stresses. (Mackay et al., 2005; Harlan, 1977)
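The SAMPSTAT grouping listed above can be expressed as a small lookup, which is handy when filtering a passport dataset for the landraces and wild material that breeders usually target. This is a sketch: the group labels follow the slide, the roll-up of finer codes (for example 110 or 410) to their hundred-level group is an assumption, and the accession records are hypothetical.

```python
# Hundred-level SAMPSTAT codes and the groups named on the slide
SAMPSTAT_GROUPS = {
    100: "wild relative",
    200: "wild relative",
    300: "landrace / traditional cultivar",
    400: "research material / genetic stock",
    500: "modern variety / advanced cultivar",
}


def sampstat_group(code):
    """Map a SAMPSTAT code to its group; finer codes roll up to the hundred level."""
    if code is None:
        return "unknown"
    return SAMPSTAT_GROUPS.get((code // 100) * 100, "unknown")


def breeding_candidates(accessions):
    """Keep only wild relatives and landraces."""
    wanted = {"wild relative", "landrace / traditional cultivar"}
    return [acc for acc in accessions if sampstat_group(acc["sampstat"]) in wanted]


# Hypothetical passport records
accessions = [
    {"accenumb": "X001", "sampstat": 300},
    {"accenumb": "X002", "sampstat": 500},
    {"accenumb": "X003", "sampstat": 110},
    {"accenumb": "X004", "sampstat": None},
]
candidates = [acc["accenumb"] for acc in breeding_candidates(accessions)]
```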

Focused Identification of Germplasm Strategy

The FIGS technology takes much of the guesswork out of choosing which accessions are most likely to contain the specific characteristics being sought by plant breeders to improve plant productivity across numerous challenging environments.

The development of FIGS was a joint project involving the Australian Winter Cereals Collection (AWCC), Tamworth, Australia, the International Center for Agricultural Research in the Dry Areas (ICARDA) in Syria, and the N. I. Vavilov Research Institute of Plant Industry (VIR) in St. Petersburg, Russia.

[ http://www.figstraitmine.org/ ]

[ http://www.bwldb.net/ ]

The Focused Identification of Germplasm Strategy (FIGS) exploits the relationships between genotype and environment to select sets of collected germplasm containing specified genetic variation.

The coordinates of the collection sites provide the link between germplasm and the environment where it evolved over millennia.

Using geographic information system (GIS) technology, each collection site can be individually profiled for available environmental parameters such as precipitation, humidity, temperature, agro-climatic zoning, and soil characteristics.
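Once each collection site has an environmental profile, selecting a trait-specific set amounts to filtering accessions on that profile. The sketch below assumes the profiles have already been extracted from GIS layers; the parameter names, thresholds and accessions are hypothetical and do not reproduce the criteria used for any actual FIGS set.

```python
from dataclasses import dataclass, field


@dataclass
class Accession:
    number: str
    latitude: float
    longitude: float
    profile: dict = field(default_factory=dict)  # environmental values at the site


def figs_subset(accessions, criteria):
    """Select accessions whose site profile satisfies every criterion.

    criteria maps a parameter name to (minimum, maximum); use None for an
    open bound. Accessions missing a required parameter are left out.
    """
    selected = []
    for acc in accessions:
        for param, (low, high) in criteria.items():
            value = acc.profile.get(param)
            if value is None:
                break
            if low is not None and value < low:
                break
            if high is not None and value > high:
                break
        else:
            selected.append(acc)
    return selected


# Hypothetical collection of profiled accessions
collection = [
    Accession("W1", 36.2, 37.1, {"annual_precip_mm": 220, "mean_temp_c": 18.0}),
    Accession("W2", 59.9, 30.3, {"annual_precip_mm": 650, "mean_temp_c": 5.0}),
    Accession("W3", 34.0, 36.0, {"annual_precip_mm": 290}),  # temperature missing
]

# Hypothetical criteria for a drought-oriented set: dry and warm sites
drought_criteria = {"annual_precip_mm": (None, 300), "mean_temp_c": (15.0, None)}
drought_set = [acc.number for acc in figs_subset(collection, drought_criteria)]
```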

http://www.figstraitmine.org

Long-term average precipitation for all collection sites

Logical process (flow diagram; boxes as extracted): Select Parents; Identify the Problem; Understand Problem; Information & Knowledge; Identify Likely Accs.; Evaluate Sub Set; Breeding & Selection; Cultivar

FIGS workflow (diagram): genebank data from VIR, ICARDA, AWCC, USDA (?) and IPK (?) feed a database with GIS; trait-specific selection produces FIGS sets, which go to evaluation.

FIGS salinity set

Core and FIGS drought sets (map legend: Core accessions, FIGS accessions)

After M C Mackay 1995

Distribution of 17,000 bread wheat landraces

ICARDA, Aleppo, Syria

VIR, St Petersburg, Russia

AWCC, Tamworth, Australia

A virtual collection from these gene banks: www.figstraitmine.com

FIGS: What is the Focused Identification of Germplasm Strategy?

Origin of concept: the boron toxicity example in wheat and barley from the late 1980s

Online web application

GBIF-MAPA is a similar project that inspired this PhD research.

The trait mining methods to be developed in the PhD study will be implemented as a public online web application as well as a downloadable desktop tool.

Users will be able to extract occurrence data from the GBIF index and environmental parameters for the sites in a similar manner as with the GBIF-MAPA application.

GBIF-MAPA Mapping and Analysis Portal Application

Survey Gap Analysis. The survey gap analysis (SGA) tool helps you design a biodiversity survey that will best complement the existing survey effort by identifying those areas least well surveyed in terms of environmental conditions.

Species Richness Assessment. Use this tool to provide an estimate, from GBIF data, of the number of species recorded in an area; and to gain insight into the adequacy of sampling based on abundance distributions for those species.

Environment Values Extraction. Query a range of environmental layers (e.g. climate) using GBIF species record point data to create a table showing the environmental values at those points. This data can then be used in your own statistical analyses.

The main target of GBIF-MAPA is users who focus on conservation planning and habitat conservation.

GBIF-MAPA is developed by researchers from the Australian Museum, the University of Colorado (Boulder, Colorado, USA) and the New South Wales Department of Environment and Conservation.

[http://gbifmapa.austmus.gov.au/mapa/]
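The Environment Values Extraction step can be illustrated with a toy raster lookup: for each occurrence point, find the grid cell that contains it and read the value of every layer. This is a sketch under simple assumptions (an unprojected latitude/longitude grid with square cells) and does not reflect how GBIF-MAPA is implemented; the `Grid` class, layer names and values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Grid:
    north: float      # latitude of the top edge
    west: float       # longitude of the left edge
    cell_size: float  # cell width and height in degrees
    values: list      # rows ordered from north to south

    def value_at(self, lat, lon):
        """Return the value of the cell containing the point, or None if outside."""
        row = math.floor((self.north - lat) / self.cell_size)
        col = math.floor((lon - self.west) / self.cell_size)
        if 0 <= row < len(self.values) and 0 <= col < len(self.values[row]):
            return self.values[row][col]
        return None


def extract_table(points, layers):
    """Build one table row per point with the value of every layer."""
    table = []
    for name, lat, lon in points:
        row = {"record": name, "lat": lat, "lon": lon}
        for layer_name, grid in layers.items():
            row[layer_name] = grid.value_at(lat, lon)
        table.append(row)
    return table


# Hypothetical 2 x 3 precipitation layer covering 38-40°N, 30-33°E
precip = Grid(north=40.0, west=30.0, cell_size=1.0,
              values=[[200, 250, 300],
                      [350, 400, 450]])

points = [("rec-1", 39.5, 31.2), ("rec-2", 38.2, 32.9), ("rec-3", 45.0, 31.0)]
table = extract_table(points, {"annual_precip_mm": precip})
```

Points that fall outside the layer get `None`, so they can be dropped or flagged before any statistical analysis.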

openModeller

The openModeller project aims to provide a flexible, user-friendly, cross-platform environment where the entire process of conducting a fundamental niche modeling experiment can be carried out.

The software includes facilities for reading species occurrence and environmental data, selection of environmental layers on which the model should be based, creating a fundamental niche model and projecting the model into an environmental scenario.

[ http://openmodeller.sourceforge.net/ ]

The project is currently being developed by the Centro de Referência em Informação Ambiental (CRIA), Escola Politécnica da USP (Poli), and Instituto Nacional de Pesquisas Espaciais (INPE) as an open-source initiative.

openModeller was initiated in 2003 by CRIA (Brazil).

Developed in C++ and cross-platform: MS Windows, Mac OS X, and Linux.

Open source, freely available under the GPL license.

It has a plug-in architecture, and plug-ins exist today for a number of fundamental niche modeling algorithms (Bioclim [Bioclimatic Envelopes], GARP, CSM [Climate Space Model], Environmental Distance and others).

There is a user-friendly desktop version, a web service API based on SOAP, a CGI application and a console interface for the command line.

Occurrence data can be retrieved directly from GBIF in the openModeller Desktop application to start a new experiment.

openModeller Desktop comes with a mapping module to visualize the predicted niche model (species) distribution on a map view.

Phenotype agricultural traits

Phenotype data

NGB has developed a tool to preview evaluation and characterization data. © NGB, 2003, GPL 2.0, (Morten Hulden, 2001)

Dynamic Evaluation Data Analyzer: [http://www.nordgen.org/sesto/index.php?scp=ngb&thm=observations]

When an individual trait character is selected, the results are displayed split by:

variation by observation site

variation by observation year

variation in observed taxon

variation in observed biological status of sample (wild relative, landrace, advanced cultivar…)

variation by country of origin

variation in observed accessions

… or any other useful categories as defined by the responsible administrator.
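Splitting observations of one trait by a chosen category, as described above, is essentially a group-by with summary statistics. The sketch below shows that operation on plain dictionaries; it is not the code of the NGB analyzer, and the field names and observation records are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarize(observations, category, value="score"):
    """Group observations by a category and summarize the trait values."""
    groups = defaultdict(list)
    for obs in observations:
        groups[obs[category]].append(obs[value])
    return {
        key: {"n": len(vals), "min": min(vals), "max": max(vals), "mean": mean(vals)}
        for key, vals in groups.items()
    }


# Hypothetical observations of a single trait
observations = [
    {"accession": "A1", "site": "Alnarp", "year": 2001, "score": 3.0},
    {"accession": "A1", "site": "Alnarp", "year": 2002, "score": 5.0},
    {"accession": "A2", "site": "Tamworth", "year": 2001, "score": 7.0},
    {"accession": "A2", "site": "Alnarp", "year": 2002, "score": 4.0},
]

by_site = summarize(observations, "site")
by_year = summarize(observations, "year")
```

The same function serves every split in the list (site, year, taxon, biological status, country of origin, accession) by changing the `category` argument.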

Phenotype

http://barley.ipk-gatersleben.de/genres (screenshots of the web page)

Data Standards

TDWG :: SDD

Structured Descriptive Data

In taxonomy, descriptive data takes a number of very different forms.

Natural-language descriptions are semi-structured, semi-formalised descriptions of a taxon (or occasionally of an individual specimen). They may be simple, short and written in plain language (if used for a popular field guide), or long, highly formal and using specialised terminology when used in a taxonomic monograph or other treatment.

The goal of the SDD standard is to allow capture, transport, caching and archiving of descriptive data in all the forms shown above, using a platform- and application-independent, international standard. Such a standard is crucial to enabling lossless porting of data between existing and future software platforms including identification, data-mining and analysis tools, and federated databases.

Hagedorn, G.; Thiele, K.; Morris, R. & Heidorn, P. B. 2005. The Structured Descriptive Data (SDD) w3c-xml-schema, version 1.0. [http://www.tdwg.org/standards/116/] [Retrieved 05-May-2007]

Crop Descriptors

The Bioversity International (IPGRI) crop descriptors are developed to standardize characterization and evaluation data – called “descriptive data” in TDWG context.

The MCPD (Multi-Crop Passport Descriptors) list is designed to standardize "passport data" across crops. It enables compatibility with the crop-specific descriptor lists and the FAO World Information and Early Warning System (WIEWS) and serves as a basis for data exchange.

The MCPD descriptor list was made fully compatible with ABCD 2.06.

Taxonomic Database Working Group

Darwin Core 2 - "Element definitions designed to support the sharing and integration of primary biodiversity data". [http://darwincore.calacademy.org/]

Access to Biological Collection Data (ABCD) 2.06 - "An evolving comprehensive standard for the access to and exchange of data about specimens and observations (a.k.a. primary biodiversity data)". [http://www.bgbm.org/TDWG/CODATA/Schema/]

ABCD: Access to Biological Collection Data

ABCD is a common data specification for data on biological specimens and observations (including plant genetic resources held in seed banks).

The design goal is to be both comprehensive and general (about 1200 elements).

Development of ABCD started after the 2000 meeting of the TDWG.

ABCD was developed with support from TDWG/CODATA, ENHSIN, BioCASE, and GBIF.

The MCPD descriptor list is completely mapped to, and compatible with, ABCD 2.06.

Generation Challenge Programme, GCP_Passport_1.04

The Generation Challenge Programme is a research and capacity building network that uses plant genetic diversity to produce better crop varieties for resource-poor farmers.

In the context of the GCP (Generation Challenge Programme), the GCP Passport data exchange schema was developed.

W3C :: RDF

Resource Description Framework

Scenario: You have a dataset of genebank accessions with pointers to the source datasets of the holding genebanks. You produce phenotypic evaluation data on accessions in this dataset. You find evaluation data from other sources on some of the accessions in your dataset. Some of the evaluation data are produced in areas of different day length, rainfall, soils… Some of the accessions in your dataset originate from areas of higher population densities; other accessions originate from more natural habitats. Unfortunately most of the different sources of information are located on different web sites and it is difficult to bring the information together.

You would need to go through more or less the same process as other researchers in many domains of gathering heterogeneous data from multiple sources, combining and analysing it. This is the challenge that faces the web as a whole and is being addressed by the Semantic Web project.

RDF can help you relate information from different sources.

An RDF triple has the form subject-predicate-object. In RDF/XML it looks like this:

<rdf:Description rdf:about="http://www.example.org/index.html">

<dc:creator>John Smith</dc:creator>

</rdf:Description>
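To see the subject-predicate-object structure explicitly, the fragment above can be read into triples with the Python standard library. This is a minimal sketch, not a full RDF parser: it assumes the fragment is wrapped in an `rdf:RDF` element with the usual RDF and Dublin Core namespace declarations (added here, not part of the slide) and handles only property elements with literal text values.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# The slide's fragment, wrapped so that the rdf: and dc: prefixes are declared
RDF_XML = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <dc:creator>John Smith</dc:creator>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdf_xml):
    """Return (subject, predicate, object) tuples for literal-valued properties."""
    root = ET.fromstring(rdf_xml)
    result = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes tags as "{namespace}local"; join them into one URI
            predicate = prop.tag.replace("{", "").replace("}", "")
            result.append((subject, predicate, (prop.text or "").strip()))
    return result


parsed = triples(RDF_XML)
```

Here the subject is the web page, the predicate is the Dublin Core `creator` property, and the object is the literal "John Smith". For real data a dedicated RDF library would be the safer choice.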

W3C/TDWG :: LSID

Life Science IDentifiers

LSID is a digital name tag.

LSIDs are GUIDs, Global Unique Identifiers.

[http://lsid.sourceforge.net/]

Structure: urn:lsid:authority:namespace:object:revision

Example (fictitious): urn:lsid:eurisco.org:accession:H451269

The LSID concept introduces a straightforward approach to naming and identifying data resources stored in multiple, distributed data stores.

LSID defines a simple, common way to identify and access biologically significant data; whether that data is stored in files, relational databases, in applications, or in internal or public data sources, LSID provides a naming standard to support interoperability.

Developed by OMG-LSR and W3C, implemented by IBM.
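The LSID structure shown above is simple enough to split into its parts with a few lines of code. The sketch below treats the revision as optional, as in the fictitious example, and does only basic validation; the `LSID` tuple and `parse_lsid` function are illustrative names, not part of any LSID library.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text):
    """Split urn:lsid:authority:namespace:object[:revision] into its parts."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated parts: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if any(not part for part in parts[2:]):
        raise ValueError(f"empty component in LSID: {text!r}")
    return LSID(*parts[2:])


# The fictitious example from the slide
accession_id = parse_lsid("urn:lsid:eurisco.org:accession:H451269")
```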

Biodiversity data exchange tools

Data Provider Software

DiGIR, Distributed Generic Information Retrieval. [http://digir.net]

PyWrapper, based on the BioCASE Python wrapper software. [http://www.pywrapper.org/]

Decentralized model

Outlook

The compatibility of data standards between PGR and biodiversity collections made it possible to integrate the worldwide germplasm collections into the biodiversity community.

Using GBIF technology (and contributing to its development), the PGR community can easily establish specific PGR networks without duplicating GBIF's work.

Use of GBIF technology and integration of PGR collection data into GBIF allows PGR users to simultaneously search PGR collections and other biodiversity collections, and to get access to the data (and possibly the material) of relevant biodiversity collections.

The same data sharing methods can also be applied to germplasm trait data. Agrobotanical phenotype characters could be described by a common global standard and shared with the same data exchange tools as the accession passport data.

Thank you for listening!
