The Future of Research (Science and Technology)

Published on September 26, 2008

Author: dullhunk

Source: slideshare.net

Description

Talk by Carole Goble at the British Library Board Awayday 23rd September 2008

The Future of Research (Science and Technology) Carole Goble [email_address] University of Manchester, UK OMII-UK British Library Board Awayday 23rd September 2008

 

Acknowledgements

David De Roure

Michael McLennan

Noshir Contractor

Christine Borgman

Tony Linde

Cameron Neylon

Duncan Hull

Geoffrey Fox

Malcolm Atkinson

Jean Claude Bradley

Anne Trefethen

Graham Cameron

Phil Bourne

Bertram Ludaescher

Tim Wess

Roger Barga

Paul Fisher

Jane Hunter

Jeremy Frey

Tony Hey

Jim Hendler

Bob Jones

Liz Lyon

Juliana Freire

Domenico Talia

Michael Nielsen

Marco Roos

Doug Kell

Anthony Finkelstein

Peter Murray-Rust

Robert Tansley

Michael Wilson

Trypanosomiasis in Africa

http://www.genomics.liv.ac.uk/tryps/trypsindex.html

Andy Brass, Steve Kemp, Paul Fisher

Hypothesis driven research

Now we add:

Data driven

Simulation / prediction driven

Automated experiments

Open “as you go” communication

Team research

New types of research output

Data intensive Science

Data from observations

Data from predictions through simulations and computer models

Industrialised science

1070 databases, Nucleic Acids Research Jan 2008 (96 in Jan 2001)

Proteomics

Genomics

Transcriptomics

Protein sequence prediction

Phenotypic studies

Phylogeny

Sequence analysis

Protein Structure prediction

Protein-protein interaction

Metabolomics

Model organism collections

Systems Biology

Epidemiology ….

Growth of data, regardless of discipline

Raw, predicted, derived, combined, aggregated

Curated to be annotated and enriched manually or automagically

Interlinked

Large Hadron Collider [Norbert Neumeister]

Why Data intensive Science?

New high throughput experimental methods (microarrays, combinatorial chemistry, sensor networks, earth observation, sky surveys, heroic experiments ….)

Increasing scale, diversity and complexity of digital material processed separately and in combination.

Commons based production

über accessibility

Heterogeneous, Autonomous and Volatile

Why Data Intensive Science?

Small data.

Spreadsheets.

Personal lab books.

Privately held.

Increasingly publicly shared.

Through the web

Millions of them.

Born digital.

Raw and Interpretive Data

What is fact?

Revision is constantly occurring. Even primary data can be revised.

Science is interpretation.

Much scientific data consists of secondary datasets of interpretative information.

[Figure labels: Primary Data; Processed Data; Secondary Data; Secondary Curated Data; Integrated data; Capture details; Processing details; Update; revise]

 

Data collection management

Large scale community-wide global data centres – EBI, DDBJ, NCBI, NCI, CERN

Institutional data centres and labs and individuals – precarious and uncertain.

Role for data stewardship and preservation on behalf of the community

Cloud data

[email_address]

Not the end of theory!

The prevalence of data and the rise of data intensive and data driven science add to the pool of hypothesis driven and theory driven research.

They don’t replace it.

[Figure labels: Data; Theory; Prediction; Hypothesis]

200 Genotype Phenotype Metabolic pathways Literature [Paul Fisher]

Large scale data collection from multiple sites throughout the world.

The team’s own data and personal data sets.

Analytical pipelines and automated workflows with intelligent intervention.

Literature auto found and mined

If manual: it’s logged

If automated: faster, systematic, repeatable, reduced bias, auto-logged, explicit, shareable

Born digital
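The “auto-logged, explicit, shareable” property of an automated workflow can be illustrated with a minimal Python sketch. It is not taken from the talk: the step functions, the gene names and the log fields are all hypothetical, chosen only to show how a pipeline might record each step as it runs.

```python
import json
from datetime import datetime, timezone


def run_pipeline(data, steps):
    """Apply each step in order and keep a log entry per step (hypothetical log format)."""
    log = []
    for step in steps:
        records_in = len(data)
        data = step(data)
        log.append({
            "step": step.__name__,
            "records_in": records_in,
            "records_out": len(data),
            "finished_at": datetime.now(timezone.utc).isoformat(),
        })
    return data, log


def drop_missing(rows):
    # Discard measurements that were never recorded.
    return [r for r in rows if r["expression"] is not None]


def keep_high_expression(rows, threshold=2.0):
    # Keep only rows at or above an illustrative threshold.
    return [r for r in rows if r["expression"] >= threshold]


# Made-up input data, for illustration only.
rows = [
    {"gene": "geneA", "expression": 3.1},
    {"gene": "geneB", "expression": None},
    {"gene": "geneC", "expression": 0.4},
]

result, log = run_pipeline(rows, [drop_missing, keep_high_expression])
print(json.dumps(log, indent=2))
```

Because the log is plain data, it can be stored and shared alongside the results, which is one way of making a method explicit and repeatable.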

Automated processing of library content

PubMed contains ~17,787,763 articles to date

Manual searching is tedious and frustrating

It can be hard to find links between data and articles

Conclusion? Machines will be reading the library.

Link between cholesterol, patient trauma and parasite resistance in cattle revealed.

http://www.myexperiment.org/workflows/172 [Paul Fisher]

Data driven research

Was: hypothesis, then experiment, then analysis of the data.

Now: start with the data. There is so much accessible data.

[Figure labels: Ideas; Data; Hypothesis; Synthesis / Induction; Analysis / Deduction] [Kell and Oliver]

Published. Eventually.

Reproducible, or rather “fully supported”

Transparent science

Composite research components: Methods, Lab Books, Preprints, Data, Video, Blogs, Podcasts, Codes, Algorithms, Models, Presentations, Ontologies, Intermediate Results, Related Articles, Comments & Reviews, Plans

Reproducible science means context, quality and trust. It means easy access to the sources.

Methods are Scientific commodities

Scripts, workflows, simulations, experimental plans, statistical models, ...

Repeatable, reproducible, comparable and reusable research.

Sharing propagates expertise and builds reputation.

http://myexperiment.org

http://nanoHUB.org [Michael McLennan]

120 simulation tools

1,200 seminars, podcasts, etc.

77,000 users worldwide

550 contributors

Developed by the NSF Network for Computational Nanotechnology

Online since October 2002

[Jean-Claude Bradley] http://usefulchem.wikispaces.com/

BioLit

Seamless integration between data and publications

From the Public Library of Science people.

The Knowledge and Data Cycle:

0. Full text of PLoS papers stored in a database

1. A link brings up figures from the paper

2. Clicking the paper figure retrieves data from the PDB which is analyzed

3. A composite view of journal and database content results

4. The composite view has links to pertinent blocks of literature text and back to the PDB

http://biolit.ucsd.edu [Phil Bourne]

ICTP Trieste, December 10, 2007 [Phil Bourne]

[Phil Bourne]

The reproducible and interactive research documents*

Mixed stewardship research documents

The recombinant, compound research documents

The virtual research document

Multi-versioned, dynamic research document

* Papers, books, whatever.

2020

Data, image, model, process, workflow, podcast, slideset*

Finding, citation, peer review, preservation, identity, versioning, security, privacy, copyright management, format authority

Authority on metadata descriptions

Propagation of descriptions

* Insert new research commodity type here

2020

What does this mean for library services?

Seamless interlinking of data, literature and other research commodities

Integrated search across external resources

Selective quality curation

Hell is other people’s (lack of) semantic metadata

2020

Supporting Paul, The Scientist Search/Discover Serendipitous Finding Collaborative Searching Structural Search Keeping Current Gather Collecting Manage Organizing Create Annotating Review & Rate Describe Write Share Publish Sharing Rights Integrated search Automatic paper download Continual queries Paper recommendation Alert Project and Personal Internal search Refereed and Grey literature Tag, annotate, rate Templates Multi-author authoring Bibliography management Version management Copyright tools (CC and SC) Linking up data, models and other components [Roger Barga]

Collaboration

(Virtual) Team Research

Research is increasingly team-based.

Teams produce more highly cited research.

Team science is increasingly composed of co-authors located at different universities.

“Virtual communities of scholars” produce higher impact work than comparable co-located teams or solo scientists.

True for all fields and team sizes.

Studies of 19.9 million research articles over 5 decades as recorded in the Web of Science database, and an additional 2.1 million patent records from 1975-2005.

Using the Web of Science database to analyze the collaboration arrangements of over 4,000,000 papers over a 30 year period.

Sources: Wuchty, Jones, and Uzzi [Noshir Contractor]

Distributed [Helen Hulme]

Distributed and Collaborative .....skills-rich and time-poor Biologists, Geneticists, Bioinformaticians, Immunologists, Microarray specialists, Computer Scientists, Mathematicians, Physicists..... [Helen Hulme]

Personal: log books and spreadsheets, file stores

Group: shared data, methods, protocols, information, failures, insights, observations, know-how

Born digital but not very digitally processable.

[Helen Hulme]

Virtual Research Environments 1

Collaboration Environments

Science Gateways to data and computing grids

Multi-authored document preparation

Multi-disciplinary: Proteomics, Classical Genetics / QTL studies, Animal Experts, Transcriptomics, Parasite Experts, Statistical modelling, Text Analysis, Image analysis, Health Epidemiology

Crossing boundaries

Interdisciplinary Support

Expert finding

Complementary experts swarming around a problem

Transferring data, methods and know-how from one discipline to another

e.g. astronomy image analysis applied to cancer tissue microarrays

How do you find relevant material that uses a different jargon in a different discipline organised to only suit its experts?

Overlay and virtual journals are few and far between – e.g. the Virtual Journal of Quantum Information.

Where is the overlay library?

Virtual Research Environments 2

Social Professional Networking

Expert finding

[Roger Barga]

The BL’s Research Information Centre

Open Science

Collective Intelligence

Researcher participation

Commons based production

Sharing

Accelerated dissemination

Embedded in the researcher’s environment and work practices

“Long Tail” Science. “Hypo” Science

Increased scale and diversity of scientific participation

The small research team.

Niche experts.

The citizen.

Easier to work with, and get hold of, digital output.

Better tools.

Scaling effects of peer review, social working and community curation.

Open content, services and software. Social tools for the social process of science.

Publicly available data

Open services and software tools.

Science Commons, open access journals, open data and linked data*, PLoS, ...

Open notebook science

Recent US funding agencies declarations on open access

* formerly known as Semantic Web

Anyone can be a publisher as well as a consumer.

Social tools for the social process of science.

Wikis, (micro)blogs, instant messaging.

Accelerate research and reduce time-to-experiment.

Community collective intelligence network effects

Share information and know-how

Tag resources for finding

Socially curate resources

Openly review and debate

Recommendations based on usage and opinion.

http://www.wikipathways.org/

[Duncan Hull]

Growth of open access scientists: digital natives, always online, hybrids, catalysts for change [Phil Bourne]

Cameron Neylon’s chemistry notebook

Sharing reusable methods Paul Jo

Rewards and Fears: Competitive advantage. Academic vanity. Reputation. Adoption. Scrutiny. Being scooped. Misinterpretation.

New Reward Schemes

What is the role of the library?

Trusted curator

Trusted data manager

Quality arbiter

Knowledge disseminator

Format authority

Add value content provider

Metadata / controlled vocabulary provider

Add value service provider

2020

Services

Embedding into the Researcher’s Workflow

The Cloud

Personal Scientist-centric tooling

We don’t come to the library, it comes to us.

We don’t use just one library or one source.

We don’t use just one tool!

Library services embedded in our toolkits, workbenches, browsers, authoring tools.

Zotero Firefox plug-in

Hypothesis Construction from the Literature

Marco Roos, Scott Marshall, University of Amsterdam

Towards Automated Science

Inspired by Jean-Claude Bradley

[Figure labels: Human; Machine; Quality; Trust; Ease; Ubiquity; We are here]

Combining information on the web

DIY mash-ups

As the pieces become easy to use, researchers bring them together in new ways and ask new questions.

Boundaries are shifting, practice is changing.

Based on the ease of assembly and automation.

Allen Brain Atlas

http://info.scopus.com/scsearchapi/geoCitations/index.html

Take a PubMed search result

Combine it with a Google Scholar search for number of citations

Mash the results into the PubMed results with a link to Google Scholar
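The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records and citation counts are made-up, in-memory stand-ins for what the two services would return, no real PubMed or Google Scholar API is called, and the link format is assumed to be a plain Google Scholar search URL for the title.

```python
from urllib.parse import quote_plus


def normalise(title):
    """Lower-case and collapse whitespace so titles from both sources compare equal."""
    return " ".join(title.lower().split())


def scholar_link(title):
    # Assumption: an ordinary Google Scholar search URL for the paper title.
    return "https://scholar.google.com/scholar?q=" + quote_plus(title)


def mash_up(pubmed_results, citation_counts):
    """Attach a citation count and a Scholar link to each PubMed record."""
    merged = []
    for record in pubmed_results:
        key = normalise(record["title"])
        merged.append({
            **record,
            "citations": citation_counts.get(key),  # None when no count was found
            "scholar_link": scholar_link(record["title"]),
        })
    return merged


# Hypothetical search results, for illustration only.
pubmed_results = [
    {"pmid": "0000001", "title": "Example paper A"},
    {"pmid": "0000002", "title": "Example  Paper B"},
]
# Hypothetical citation counts keyed by normalised title.
citation_counts = {"example paper a": 12}

merged = mash_up(pubmed_results, citation_counts)
for row in merged:
    print(row["pmid"], row["citations"], row["scholar_link"])
```

Matching on a normalised title is a deliberately simple choice; a real mash-up would more likely join on a stable identifier where both sources provide one.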

What does this mean for library services?

With, not For

Opening up to researchers’ tools and research environments for discovery, management and curation of research commodities

Enabling and encouraging new services and new content to add new value

Remove obstacles to interoperate and share

Collaborate, don’t control

Give researchers tools and access to content

They control their own software/data apparatus and their experiments.

They are creative

Pervasive devices and the mixing up of virtual and real worlds

Prior to leaving home, Paul, a Manchester graduate student, syncs his iPhone with the latest papers, delivered overnight by the library via a news syndication feed. On the bus he reviews the stream, selecting a paper close to his interest in HIV-1 proteases.

The data shows apparent anomalies with his own work, and the method, an automated script, looks suspect.

Being online, he notices that a colleague in Madrid has also discovered the same paper through a blog discussion, and they exchange instant messages, annotating the results together.

By the time the bus stops he has recomputed the results, proven the anomaly, made a rebuttal in the form of a pubcast to the Journal Editor, sent it to the journal and annotated the article with a comment and the pubcast.

Based on an original idea by Phil Bourne

Questions? http://research.microsoft.com/towards2020science/

Extras

Other References

Duncan Hull, Steve Pettifer, Doug Kell, Defrosting the digital library: bibliographic tools for the next generation web to appear in PLoS Computational Biology

Michael Nielsen, The Future of Science http://michaelnielsen.org/blog/?p=448

Philip Bourne Will a biological database be different from a biological journal, PLOS Computational Biology 1(3) www.ploscompbiol.org

James A. Evans Electronic Publication and the Narrowing of Science and Scholarship Science 18 July 2008: Vol. 321. no. 5887, pp. 395 - 399 http://www.sciencemag.org/cgi/content/abstract/321/5887/395

James Hendler Reinventing Academic Publishing, Editorials for IEEE Intelligent Systems http://www.mindswap.org/blog/2007/08/14/reinventing-academic-publishing-%E2%80%93-part-i/ http://www.mindswap.org/blog/2007/11/23/reinventing-academic-publishing-%E2%80%93-part-ii/ http://www.mindswap.org/blog/2008/01/03/reinventing-academic-publishing-%E2%80%93-part-iii/

Cameron’s suggested open science blogs

http://www.earlham.edu/~peters/fos/2008/07/online-researchers-have-access-to-more.html

http://scienceblogs.com/clock/2008/07/electronic_publication_and_the.php

http://www.sennoma.net/main/archives/2008/07/an_open_access_partisans_view.php

http://openwetware.org/wiki/Science_2.0/Brainstorming

http://sciencex2.org/en/user/113/track
