kincaid

Published on September 18, 2007

Author: Seasham

Source: authorstream.com

BNS: An LDAP-based Biomolecule Naming Service
Robert Kincaid, Daniel Kluesing, Aditya Vailaya

Outline:
- Problem statement and design goals
- BNS architecture
- BNS use cases
- LDAP
- Final thoughts

Problem:
- There is an increasing need to connect related genomic and proteomic measurements
- However, no universally accepted/used identifiers exist for biomolecules (GenBank, RefSeq, Unigene, PIR, Swiss-Prot ...)
- High-throughput measurements make manual association of related measurements impractical
- We need a practical solution that uses today's data

Initial Motivating Use Cases:
Generate a "view" of data that is formed by the "join" of:
- A microarray and a protein array
- A microarray and mass spec proteomics data
- An Agilent and a brand X microarray
- A commercial oligo array and a home-brew cDNA array

Solution:
- A high-speed biomolecule Name/ID resolver
- Converts between different identifier schemes based on gene locus or transcript
- Converts between different states of transcription: gene -> transcript -> protein
- Converts between gene symbols and aliases
- Easy to deploy and to code applications against
- Platform and language neutral
- Explores the research questions of feasibility and usefulness of:
  - Name/ID resolver
  - LDAP

System Is Not:
- A sequence database
- Primarily an annotation system
- Intended to be updated by users
- An object/interface naming service
- A complete, definitive system

BNS: Biomolecule Naming Service
Research prototype:
- Based on LDAP for easy deployment and wide platform support
- Derived from LocusLink data
[Architecture diagram. Components: CLIENT APPLICATION; BNS API; LDAP API; BNS NAME/ID RESOLVER; LDAP-BASED NAME SERVER; LOCUSLINK; LDAP PROTOCOL; DOWNLOAD AND CONVERSION SCRIPTS; NCBI FTP (via HTTP proxy)]

Directory Structure and LDAP Schema:
[Figure: directory structure and LDAP schema]

Example Entry (LDIF):

    dn: locus=1,org=Homo sapiens,dc=BNS
    objectClass: bnsobject
    locus: 1
    sym: A1BG
    name: alpha-1-B glycoprotein
    ug: Hs.373554
    summary: The protein encoded by this gene is a plasma glycopro . . .
    org: Homo sapiens
    chr: 19q13.4
    altsym: A1B
    altsym: ABG
    altsym: GAB
    gbaccn: AC010642
    . . .
    gbaccn: W25099

    dn: transcript=NM_130786,locus=1,org=Homo sapiens,dc=BNS
    objectClass: bnstranscript
    locus: 1
    transcript: NM_130786
    nm: NM_130786
    np: NP_570602
    prod: alpha 1B-glycoprotein

Object Model:
BNSConnection
- Connect/Disconnect to LDAP server (local or remote)
    connect(String url, String org)
- Query, Lookup functions
    BNSObject lookupID(String id)
    String resolveTranscriptPair(String refseqID)
    List lookupSymbolList(String symbol)
BNSObject
- Returned by query/lookup methods
- Get/Set methods for attributes
- Various text output functions provided for convenience: toString(), toText(), toTabbedText(), toHTML()

Example:
Java code:

    try {
        // STEP 1: Connect to the ldap server
        conn.connect("ldap://localhost");
        // STEP 2: Do some BNS calls
        System.out.println(conn.lookupSymbol("ABL1").toText());
        // STEP 3: Disconnect - That's all there is to it!
        conn.disconnect();
    } catch (BNSException e) {
        e.printStackTrace();
    }

Output:

    LOCUS          25
    SYMBOL         ABL1
    ALIAS          ABL, JTK7, p150, c-ABL
    DESCRIPTION    v-abl Abelson murine leukemia viral oncogene homolog 1
    UNIGENE ID     Hs.14635
    GENBANK        K00009, AAA51895, M13099, AAA51896, U07563, AAB60393, AAB60394, . . .
    TRANSCRIPTS    NM_005157, NP_005148, , v-abl Abelson murine leukemia viral oncogene homolog 1 isoform a
                   NM_007313, NP_009297, , v-abl Abelson murine leukemia viral oncogene homolog 1 isoform b
    GENE ONTOLOGY  cellular component : 0005634 : nucleus
                   biological process : 0007048 : oncogenesis
                   . . .

Example:
Java code:

    try {
        // STEP 1: Connect to the ldap server
        conn.connect("ldap://localhost");
        // STEP 2: Do some BNS calls
        System.out.println(conn.resolveTranscriptionPair("NM_000018"));
        System.out.println(conn.resolveTranscriptionPair("NP_000009"));
        System.out.println(conn.resolveSymbol("PSCP"));
        System.out.println(conn.lookupSymbol("A1BG").get_description());
        // STEP 3: Disconnect - That's all there is to it!
        conn.disconnect();
    } catch (BNSException e) {
        e.printStackTrace();
    }

Output:

    NP_000009
    NM_000018
    BRCA1
    alpha-1-B glycoprotein

A real case: joining Microarray and MS data*
[Diagram: Microarray (12626 genes) and Mass Spec Proteomics (741 protein IDs) joined through BNS. Result: 359 MS IDs matched to microarray features (48%). Other counts shown: 441 (60%) and 9419 (75%). Link labels: GenBank/UniGene, RefSeq/GenBank, Locus]
* Data provided by Joel Sevinsky and Natalie Ahn, Dept. of Chemistry and Biochemistry, University of Colorado, Boulder

High Throughput Use Cases:
- Annotation of biomolecule lists
  example: microarray annotation, analysis
    bnsConnection.lookupID("NM_00018").toTabbedText();
- Ad-hoc creation of biomolecule lists via query
  example: create a theme-based microarray
    List bnsObjects = bnsConnection.query("godesc=*onco*");
- Merging biomolecule data with varied identifiers
  example: joining high throughput measurements
    bnsConnection.resolveTranscript("NM_00018");
    bnsConnection.lookupID("NM_00018").getUnigene();

High Throughput Use Cases (continued):
- Normalizing biomolecule IDs to a common scheme
  example: microarray annotation
    bnsConnection.lookupID("NM_00018").get_unigene();
- Validating gene symbols
  example: text mining
    if (bnsConnection.lookupSymbol("PSCP") != null)
- Normalizing symbols to the official/preferred symbol
  example: text mining, microarray annotation
    officialSym = bnsConnection.lookupSymbol("PSCP").get_sym();

Low Throughput Use Cases:
- Lookup single ID
    BNSObject bnsObj = bnsConnection.lookupID("NM_00018");
- Display data for single ID
  example: popup information dialog
    bnsObj.toHTML();

BNS Findings:
- A system like BNS is extremely useful and efficient:
  - Novel uses of genomic/proteomic data emerged beyond simple joins: text mining, annotation operations, chromosome mapping, etc.
  - Flexible range of associations possible: exact ID matches, transcript/product matches, or looser locus matches
  - Simpler programming model than typical database access methods
- Standardized object models and interfaces for performing "routine" name/ID operations would enable rapid development of applications

Why LDAP:
- Sequence data easily conforms to a hierarchical directory structure
- Sequence databases are often lookup only and are not updated by users (cf. SRS and flat file databases)
- LDAP is scalable from very low end systems (slow laptops) to shared high-end servers
- Cross-platform, variety of language support, flexible back-ends, open standard
- Access control and security
- Good performance for minimal cost

LDAP Issues:
- Approaches the problem in a unique way
  - Can be confusing to newcomers
  - Easily overcome with modest experience
- Potential rate and quantity of individual BNS queries is far beyond the expectations of email address book applications
  - Seems to work in practice
  - Assumed solvable by scalability
- More difficult to proxy through firewalls than HTTP-based solutions
  - Socksification possible (trivial with Java)

LDAP Supports Distributed Architecture:
[Architecture diagram. Components: CLIENT APPLICATION; BNS API; LDAP API; BNS NAME/ID RESOLVER; LDAP PROTOCOL; several LDAP-BASED NAME SERVERs; data sources LOCUSLINK, OUTASIGHT, PRIVATE DATA]
- Data is replicated from a central curation server
- Query referral enables transparent federated searching across widely distributed data servers

LDAP Findings:
LDAP appears quite suitable for deploying this kind of system:
- Performance appears to be good
  - ~20-200+ lookups/sec, usually bandwidth limited
  - queries can be roundtrip optimized
  - server-side in-memory caching possible
  - low footprint allows a client-side instance for special high-throughput needs
  - substantially faster than web-services equivalent*
- Minimal infrastructure is required
  - scalable from laptop to high-end multi-processor server
  - accessible from many environments (Java, Perl, C/C++, Matlab, etc.)
- Replication/Referral show promise for building distributed systems of biomolecule data
* Based on data from Don Gilbert, Indiana Univ. (http://iubio.bio.indiana.edu/grid/directories)

Conclusion:
- Some form of consistent, ubiquitous interface for performing BNS-like operations is useful and desirable
- Efforts to create unified identifier schemes should consider a LocusLink-like organizing principle, as these transcript/product relationships are important to emerging analyses
- Properly overloaded ID conventions could eliminate the need for ID conversions (e.g. Hs12345M6789 vs. Hs12345P6789, Hs12346M*, etc.)
- LDAP shows promise as a useful, lightweight, high-performance delivery mechanism for biomolecule information

Availability:
http://openbns.sourceforge.net

Acknowledgements:
- Agilent: Paul Wolber, Karen Shannon, Dean Thompson, Annette Adler, Amir Ben-Dor
- University of Colorado: Natalie Ahn, Joel Sevinsky
- Daniel Kluesing, Aditya Vailaya
- Indiana University: Don Gilbert
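The three kinds of association named in the slides (transcript/product pairs, alias-to-official-symbol normalization, and looser locus-level matches) can be illustrated with a small in-memory sketch. It is not the BNS API and does not talk to LDAP: the class `ResolverSketch` and all of its methods are hypothetical names chosen for illustration, and the only data loaded is the locus 1 (A1BG) record from the LDIF example entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical in-memory stand-in for a name/ID resolver; this is not the BNS API. */
class ResolverSketch {
    // RefSeq transcript (NM_) <-> protein product (NP_), stored in both directions.
    private final Map<String, String> transcriptPair = new HashMap<>();
    // Official symbol or alias -> official symbol.
    private final Map<String, String> officialSymbol = new HashMap<>();
    // Any known identifier -> locus number, used for the loosest kind of match.
    private final Map<String, Integer> locusOf = new HashMap<>();

    void addSymbol(int locus, String official, String... aliases) {
        officialSymbol.put(official, official);
        locusOf.put(official, locus);
        for (String alias : aliases) {
            officialSymbol.put(alias, official);
            locusOf.put(alias, locus);
        }
    }

    void addTranscript(int locus, String nm, String np) {
        transcriptPair.put(nm, np);
        transcriptPair.put(np, nm);
        locusOf.put(nm, locus);
        locusOf.put(np, locus);
    }

    void addAccession(int locus, String accession) {
        locusOf.put(accession, locus);
    }

    /** Transcript/product match: NM_ -> NP_ or NP_ -> NM_. */
    Optional<String> resolveTranscriptPair(String refseqId) {
        return Optional.ofNullable(transcriptPair.get(refseqId));
    }

    /** Normalizes an alias (or the official symbol itself) to the official symbol. */
    Optional<String> resolveSymbol(String symbolOrAlias) {
        return Optional.ofNullable(officialSymbol.get(symbolOrAlias));
    }

    /** Locus match: true when both identifiers are known and map to the same locus. */
    boolean sameLocus(String idA, String idB) {
        Integer a = locusOf.get(idA);
        Integer b = locusOf.get(idB);
        return a != null && a.equals(b);
    }

    /** Loads only the locus 1 record shown in the LDIF example entry. */
    static ResolverSketch withSlideExample() {
        ResolverSketch r = new ResolverSketch();
        r.addSymbol(1, "A1BG", "A1B", "ABG", "GAB");
        r.addTranscript(1, "NM_130786", "NP_570602");
        r.addAccession(1, "Hs.373554"); // UniGene ID from the entry
        r.addAccession(1, "AC010642");  // one of the GenBank accessions from the entry
        return r;
    }

    public static void main(String[] args) {
        ResolverSketch r = withSlideExample();
        System.out.println(r.resolveTranscriptPair("NM_130786").orElse("(unknown)"));
        System.out.println(r.resolveSymbol("GAB").orElse("(unknown)"));
        System.out.println(r.sameLocus("NP_570602", "AC010642"));
    }
}
```

The locus-level check is what would let a protein ID from one data set be joined to a GenBank accession from another when no exact or transcript-level match exists. Symbol matching here is exact and case-sensitive, which is a simplification made for the sketch.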
