Paper as a Research Object

Information about Paper as a Research Object
Technology

Published on February 15, 2014

Author: alexgarciac

Source: slideshare.net

Description

We need to start understanding documents within an electronic, machine-processable environment. Such a conception goes beyond PDF and HTML; it entails, I argue, understanding the document as a fluid aggregator.

Research around and about the scientific paper in the biomedical domain. Supporting Literature Based Discovery: from the paper to the data, back and forth. Alexander Garcia, PhD. FSU

350 Years and Counting
• Scientific articles have adopted electronic dissemination channels
• Scholarly communication has been complemented by the adoption of blogs, mailing lists, social networks, and other technologies
• Information remains locked up in PDFs

And so we are… Managing the publication on a postmortem basis… The paper as an interface to the Web of Data? The problem remains, so… To be born semantics… why not?

Heading towards
• A semantic document, one where human-readable knowledge is augmented to enable its interpretation by machines
• A human-interpretable document fully processable by machines
• Human interoperability and machine interoperability
• Literature Based Discovery and the paper as an interface to the Web of Data (WoD)

We all know that
• Information is locked up in discrete documents
– Mostly PDF
• Controlled vocabularies are not always available
• Text mining depends on availability of data
• Poor metadata

Agenda
• Biotea
• Citagora
• Semantic documents as scaffolds for research objects
• Human interoperability and machine interoperability

Literature Based Discovery
• The key idea is: putting together explicit assertions from different papers to form new implicit assertions
– PTSD and suicide
– Magnesium-migraine
– Fish oil-Raynaud’s or calcium-channel blockers
• Sophisticated access to online information
• Supplement document retrieval with:
– Information extraction
– Automatic summarization
– Question answering
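The key idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper IDs are placeholders, and "blood viscosity" is used as an illustrative intermediate concept for the fish oil and Raynaud's example; neither comes from the slides.

```python
from collections import defaultdict

# Explicit assertions as (paper_id, concept_a, concept_b).
# Paper IDs and the intermediate concept are illustrative assumptions.
EXPLICIT = [
    ("paper-1", "fish oil", "blood viscosity"),
    ("paper-2", "blood viscosity", "Raynaud's"),
]


def implicit_assertions(assertions):
    """Return {(a, c): [b, ...]} for concept pairs linked only via a shared b."""
    neighbours = defaultdict(set)
    for _, a, b in assertions:
        neighbours[a].add(b)
        neighbours[b].add(a)

    found = defaultdict(set)
    for b, linked in neighbours.items():
        for a in linked:
            for c in linked:
                # Keep each unordered pair once; skip pairs already asserted.
                if a < c and c not in neighbours[a]:
                    found[(a, c)].add(b)
    return {pair: sorted(via) for pair, via in found.items()}


if __name__ == "__main__":
    for (a, c), via in implicit_assertions(EXPLICIT).items():
        print(f"{a} -- {c} (via {', '.join(via)})")
```

No single paper links the two end concepts; the candidate association only appears once the assertions from both papers are combined.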

The White Paper Challenge
• Search and Retrieval
– How to get relevant documents faster
– Info Sources
– Query Builders
– Notifications
• How to “scan” the document in a meaningful manner?
• How to repurpose fragments of the documents?

Literature Discovery Process
• Search
– Usually string-based search mechanisms
– Little cognitive support
• Retrieval
– Simple list of DB entries
– Little cognitive support
• Interacting with the document
– Straight into the PDF
– Zero cognitive support
• Data availability

Challenge: Language Complexity
The average age of participants (approximately 63 years), the predominance of women, and the high prevalence of comorbid conditions (for example, hypertension and cardiovascular disease) reflect typical characteristics of patients with osteoarthritis.
Language encodes a lot of information

Words and Phrases
Words: age, approximately, average, cardiovascular, characteristics, comorbid, conditions, disease, example, high
Phrases: average age of participants; approximately 63 years; predominance of women; high prevalence; comorbid conditions
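A word list like the one on this slide can be derived from the example sentence with a very small tokenizer. The sketch below is an assumption about how such a list might be produced, not the method behind the slide; the stopword set is a hypothetical minimal one chosen for this sentence.

```python
import re

SENTENCE = (
    "The average age of participants (approximately 63 years), the "
    "predominance of women, and the high prevalence of comorbid conditions "
    "(for example, hypertension and cardiovascular disease) reflect typical "
    "characteristics of patients with osteoarthritis."
)

# Hypothetical minimal stopword list, just enough for this sentence.
STOPWORDS = {"the", "of", "and", "for", "with"}


def content_words(text):
    """Lower-case the text, keep alphabetic tokens, drop stopwords, sort."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sorted(set(tokens) - STOPWORDS)


if __name__ == "__main__":
    print(content_words(SENTENCE)[:10])
```

Single words lose most of what the sentence says, which is why the following slides move from words and phrases to semantic predications.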

Semantic Predications The average age of participants (approximately 63 years), the predominance of women, and the high prevalence of comorbid conditions (for example, hypertension and cardiovascular disease) reflect typical characteristics of patients with osteoarthritis.

Semantic Predications
• Cardiovascular Diseases CO-OCCURS_WITH Degenerative polyarthritis
• Hypertension CO-OCCURS_WITH Degenerative polyarthritis
• Suicide Ideation CO-OCCURS_WITH Suicide Risk
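Predications of this kind are subject, predicate, object triples, so they can be held in a simple data structure and queried at concept level. A minimal sketch, assuming a hypothetical `Predication` type and helper function that are not part of any tool named in the slides:

```python
from typing import NamedTuple


class Predication(NamedTuple):
    subject: str
    predicate: str
    object: str


# The three predications shown on the slide.
PREDICATIONS = [
    Predication("Cardiovascular Diseases", "CO-OCCURS_WITH", "Degenerative polyarthritis"),
    Predication("Hypertension", "CO-OCCURS_WITH", "Degenerative polyarthritis"),
    Predication("Suicide Ideation", "CO-OCCURS_WITH", "Suicide Risk"),
]


def co_occurring(concept, predications):
    """Concepts linked to `concept` by CO-OCCURS_WITH, in either direction."""
    hits = set()
    for p in predications:
        if p.predicate != "CO-OCCURS_WITH":
            continue
        if p.subject == concept:
            hits.add(p.object)
        elif p.object == concept:
            hits.add(p.subject)
    return sorted(hits)


if __name__ == "__main__":
    print(co_occurring("Degenerative polyarthritis", PREDICATIONS))
```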

What is needed
• Disambiguate text and tag/link concepts
• Meta-analyse information at concept level
• Provide meta-analysed information
• Support Information Based Knowledge Discovery (especially new associations)

In order to support Literature Based Discovery
• Ontologies
• Communities
• Annotation
• Machine-readable documents
In a nutshell… documents as interfaces to the Web of Data…
Biotea
• Machine-readable and processable documents
• Interactive documents
• Enriched metadata
• Full content management, document centric
• Social hub
Citagora
• Aggregated search
• Single entry point
• Social hub
• Citation centric

Biotea in a nutshell
• It is a knowledge model for biomedical literature
• We are semantically annotating literature with text mining and ontologies
• Delivers a network of interrelated documents
• Delivers a semantic infrastructure for PMC and scientific literature in general

PMC RDFication
[Diagram labels: PMC XML; RDFReactor; RDF Generation; References Enrichment; Metadata + Content + References]

RDF4PMC, some results
Metadata + Content + References make possible
• How similar are two articles? Based on authors, keywords, abstracts, ontological terms
• What articles use this reference in a section with title “Results”?
Annotations make possible
• How similar are two articles? Based on semantic distance
• Which annotation co-occurs more with this “YYY” annotation?
• Which articles include “TERM” but not this other “TERM”?
Some numbers, article PMC126253 “Computational method for reducing variance with Affymetrix microarrays”
• NCBO: 407 annotations, 633 topics
• Whatizit: 14 annotations, 203 topics
Delivering: the platform that makes it possible to build interactive environments for semantic publications
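Two of the questions above can be illustrated over a plain annotation index. This is a sketch under assumptions: the article IDs and term sets are made up, and Jaccard overlap is used as a simple stand-in for the semantic distance mentioned on the slide, which is not specified there.

```python
# Hypothetical annotation index: article ID -> set of ontology terms.
ANNOTATIONS = {
    "article-A": {"catalase", "microarray", "variance"},
    "article-B": {"catalase", "microarray"},
    "article-C": {"hypertension"},
}


def jaccard(terms_a, terms_b):
    """Share of annotation terms two articles have in common (0.0 to 1.0)."""
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0


def with_but_without(index, include, exclude):
    """Articles annotated with `include` but not with `exclude`."""
    return sorted(
        article
        for article, terms in index.items()
        if include in terms and exclude not in terms
    )


if __name__ == "__main__":
    print(jaccard(ANNOTATIONS["article-A"], ANNOTATIONS["article-B"]))
    print(with_but_without(ANNOTATIONS, "catalase", "variance"))
```

Against an RDF store the same questions would normally be written as SPARQL queries; the set operations here only show the logic.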

A dashboard for semantic biopublications
[Diagram labels: Semantically enriched publication; Metadata + Content + References; SPARQL; Catalase; Automatically Annotated RDF]

Cloud of Bioannotations (term + # of bioentities); Title & authors; Links; Abstract; Paragraphs containing the annotation selected by the user

Bio-entities for the annotation selected Enriched content: interactive zone for the bio-entity selected by user

Citagora
• An Agora for Citations
• From Citations to Social Web to an Interactive Document
• Aggregating activity from Social Networks, Reference Management Systems, Blogs, Publishers, etc.
• Aggregating sources from Google Scholar, Microsoft Academics, Zotero, Mendeley, etc.

What is MSRC.CITAGORA?
Corpus of documents for one specific domain
• BibRef centric
• Enrichment mechanism
• Based on heterogeneous data sources, aggregator
– Heterogeneous BibRef data sources
– Heterogeneous PDF layouts
Value in
– Enriching semantics around the BibRef
– Aggregating social activity around the BibRef (social activity as part of the BibRef)
– Making use of the content without exposing it
DATA for and compatible with the Web of Data

MSRC.CITAGORA
Data Source
• Data sources may be users uploading ENL files that have, for each record, the corresponding PDF
• Result from harvesting Mendeley, ZOTERO, Elsevier API, Microsoft Academics API, etc.
Extracting Meaningful Information by Processing the Data Source
• List of references this document cites_to
• Meaningful bag of words
• Authors, affiliations, emails
Outcome: RDF
• BibRef for the original PDF
• Annotations for the whole document
• Text
• List of cites_to

MSRC.CITAGORA
[Diagram labels: Citagora Harvester; Citation Metadata & References Database; S2T; PDFs; Basic XML; Enhanced XML; Ontology / Vocabulary; Citation References; Document Database; Query; Search Engine; RDF; SPARQL; Interface (Search + Tag Browser)]

Moving Towards OPEN.CITAGORA
Let's build the largest OPEN repository of everything around a standardized interoperable bibliographic reference
[Diagram: BibRef has_part Annotations, References, Content, PDF; living in the Web of Data]

Focus for OPEN.CITAGORA
• Data Interoperability
• Unlocking valuable information from the PDF
• Home of the largest collection of scientific bibliographic references and literature

Semantic Enrichment
Jailbreaking PDF: content is locked up
Meaningful Text
– Citations, cites_to: this paper cites_to
– Authors: this paper has_authors
– Title, DOI, etc.
– Content as text
– Bag of words describing content
[Diagram: BibRef has_part Annotations, PDF, Content, References]
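The relations named on this slide (has_part, has_authors, cites_to) can be written down as subject, predicate, object triples around a single BibRef identifier. The sketch below is illustrative only: the `http://example.org/citagora/` namespace, the function name, and the sample values are assumptions, not the actual Citagora vocabulary or data.

```python
BASE = "http://example.org/citagora/"  # hypothetical namespace, not Citagora's

# Parts attached to a BibRef, as drawn in the slide diagram.
PARTS = ("annotations", "pdf", "content", "references")


def bibref_triples(bibref_id, authors, cited_ids):
    """Build (subject, predicate, object) triples for one bibliographic reference."""
    subject = BASE + "bibref/" + bibref_id
    triples = [(subject, "has_part", f"{subject}/{part}") for part in PARTS]
    triples += [(subject, "has_authors", name) for name in authors]
    triples += [(subject, "cites_to", BASE + "bibref/" + cited) for cited in cited_ids]
    return triples


if __name__ == "__main__":
    # Placeholder identifiers and author name, for illustration only.
    for triple in bibref_triples("doc-1", ["A. Author"], ["doc-2", "doc-3"]):
        print(triple)
```

Keeping every statement anchored on one subject URI is what lets annotations, references, and social activity from different sources be aggregated around the same bibliographic reference.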

Semantic Enrichment
Jailbreaking BibRef
– Heterogeneous formats
– Diversity in APIs for collecting BibRefs
– Poor in descriptors anchored in the content
– Not just about the lousy metadata
– Standardization, all in one place, one URI, etc.
Jailbreaking PDF: content is locked up
Meaningful Text
– Citations, cites_to: this paper cites_to
– Authors: this paper has_authors
– Title, DOI, etc.
– Content as text
– Bag of words describing content
[Diagram: BibRef has_part Annotations, PDF, References, Content]

Translational Research
• How is MSRC contributing to Translational Research in Clinical Psychology?
• Data Standards
• Semantic Infrastructure
• Bridging the gap between documents and data repositories

Narrative Text Usable by humans and comp The paper as a Research Object The RO is a fluid structured grid

About data Data Processing Data Processing BibRef Object BibRef Object Data The RO is a fluid structured grid

Rhetorical structure: Header, Body. Lab Notebook

BIBLIOGRAPHIC RECORD: CiTO+FaBIO
HEAD: Bibliographic record (this paper), KeyWords, Author Contacts
AUTHOR CONTACT: FOAF
RHETORIC INFORMATION + EVIDENCE (external): SWAN-SIOC + CiTO + FaBIO
SCIENTIFIC PAPER: Head, Body, Tail
BODY: Rhetoric, Information, Evidence
METHODS & MATERIALS: REAGENTS, PROTOCOLS, EQUIPMENT, INSTRUMENTATION
INFORMATION + EVIDENCE (internal): METHODS & MATERIALS, EXPERIMENTAL DESIGN, DATA & COMPUTATIONS, INTERPRETATIONS
REAGENTS: SemRes Antibodies, SemRes Mouse Models
EXPERIMENTAL DESIGN: SWAN Data + Experiment, OBI, myExperiment
DATA & COMPUTATIONS: SWAN Data+Experiment, OBI, SWAN, myExperiment
INTERPRETATIONS: SWAN-SIOC
TAIL: Bibliographic records (papers cited as external evidence)
BIBLIOGRAPHIC RECORDS: SWAN Collections, CiTO+FaBIO

We have learned so far
• Born semantic enables the semantics to be of use to the authors, as they are present in the publication process from the start.
• To add value for readers and computational consumption, these semantics must then be “preserved” throughout the publication process; so, we need to address the publication process to achieve this goal.

Acknowledgments  Special Thanks to John Gomez, John Patterson, Dietrich Rebholz-Schuhmann, Robert Morris, Oscar Corcho, Diane Leiva and Greg Riccardi
