Published on March 13, 2009
David De Roure IEEE e-Science 2008
“But the Grid is successful!”
So why are there three projects addressing lack of uptake?
...and a theme in the e-Science Institute? Adoption of e-Research Technologies
How did we get here?! • Early adopter success • Then rollout of infrastructure services • And then wondering where the users are • Heard at another repositories event... “How do we persuade researchers to populate our repositories?”
e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it. Due to the complexity of the software and the backend infrastructural requirements, e-Science projects usually involve large teams managed and developed by research laboratories, large universities or governments.
What are we really trying to achieve here? A. Everyone using the Grid/Repositories? B. Research advances on an everyday basis that would not have happened otherwise? Not just accelerated but new
How do we move from heroic scientists doing heroic science with heroic infrastructure to everyday researchers doing research they couldn’t do before? Humanists, archaeologists, geographers, musicologists ... It’s the researchers! The democratisation of e-Research
Jim Downing came up with the idea of “Long Tail Science”... So we are exploring how big science and long-tail science work together to communicate their knowledge. Long-tail science needs its domain repositories - I am not sanguine that IRs can provide the metalayers (search, metadata, domain-specific knowledge, domain data) that are needed for effective discovery and re-use. Peter Murray-Rust
Figure: The social process of science 2.0. Labels: researchers; experimentation; Undergraduate Students; Graduate Students; Virtual Learning Environment; Digital Libraries; Peer-Reviewed Journal & Conference Papers; Technical Reports; Reprints; Preprints & Metadata; Local Web Repositories; Certified Experimental Results & Analyses; Data, Metadata, Provenance, Workflows, Ontologies
1 Everyday researchers doing everyday research • Not just a specialist few doing heroic science with heroic infrastructure • Chemists are blogging the lab • Everyone is mashing up • Everyday hardware – multicore machines and mobile devices
2 A data-centric perspective, like researchers • Data is large, rich, complex and real-time • There is new value in data, through new digital artefacts and through metadata e.g. context, provenance, workflows • This isn’t “anti-computation” – design interaction around data
3 Collaborative and participatory • The social process of science revisited in the digital age • Collaborative tools – blogs and Wikis • e-Science now focuses on publishing as well as consuming • Scholarly lifecycle perspective
4 Benefitting from the scale of digital science activity to support science • This is new and powerful! • Community intelligence • Review • Usage informing recommendation • e.g. OpenWetWare • e.g. myExperiment
5 Increasingly open • Preprint servers and institutional repositories • Open journals • Open access to data • Science Commons • Object Reuse & Exchange
6 Better not Perfect • The technologies people are using are not perfect • They are better • They are easy to use • They are chosen by scientists
7 Empowering researchers • The success stories come from the researchers who have learned to use ICT • Domain ICT experts are delivering the solutions • Anything that takes away autonomy will be resisted
8 About pervasive computing • e-Science is about the intersection of the digital and physical worlds • Sensor networks • Mobile handheld devices
Onward and Upward • e-Research is now enabling researchers to do some completely new stuff! • As the individual pieces become easy to use, researchers can bring them together in new ways and ask new questions • “The next level” • “Standing on the shoulders of giants” (Everyday researchers are giants) www.w3.org/2007/Talks/www2007-AnsweringScientificQuestions-Ruttenberg.pdf
Repositories • Absolutely key role in future research. So think of a better word! • Think of a park / reserve / gardens / zoo – Visitors, rangers, wardens, gardeners, experts, security, volunteers, ... – Curation by providers, experts and consumers
Those 8 Repository points 1. Not just a specialist few doing heroic science with heroic infrastructure – repositories for all! 2. There is new value in data, through new digital artefacts and through metadata e.g. context, provenance, workflows 3. e-Science now focuses on publishing as well as consuming 4. Usage informing recommendation 5. Researchers work with collections - Object Reuse & Exchange 6. They are easy to use 7. Anything that takes away autonomy will be resisted 8. e-Science is about the intersection of the digital and physical worlds (not 1970s library catalogue interfaces) www.oreilly.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html
Curation of process: And we need to curate processes too! Goble & De Roure, Educause Review, Sep/Oct 2008 • Find a process based on what it does and find copies or similar services usable as alternates. • Understand how and when it works, how to operate it correctly and predict its performance. • Know the conditions for use: permissions, licenses, platforms, and costs. • Judge the benefits of adoption based on its reputation, provenance and validation by peers. • Estimate the risk of adoption based on its reliability and stability. • Get assistance for its incorporation into applications and workflows.
Transformation is already underway • To understand where we’re going, look at communities which have been early to embrace new technology. • e-Science is one. What can we learn? • Incidentally, so is music and broadcast! – Vinyl was like books – Now the process is digital from the studio through to playback on an iPod – People create content – People publish content – Has the business adapted?
Note to Reader. The next slides are not intended to be anti-grid. Everyone working on Grid is doing great work.
Don’t think rollout of technologies... Mass Use by Researchers Think roll-in of researchers... Mass Use by Researchers Knowledge co-production vs Service Delivery!
Diagram: N things on one side, N on the other, N² arrows between them. Without middleware we need lots of bits of software to join things together
Diagram: N things on one side, N on the other, one middleware in between, 2N arrows. With middleware there are fewer arrows!
Diagram: N things on one side, N on the other, many middlewares in between. Number of arrows? A polynomial involving N1, N2 and M. But this is what happened. Now the picture with lots of thin arrows isn’t quite so scary!
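The arrow counts in the three middleware pictures can be checked with a small sketch. The function names are hypothetical, and the multi-middleware formula is only one assumed model (every endpoint on both sides has to connect to each of the M middlewares); the slide itself leaves that polynomial as a question mark.

```python
def direct_links(n_left: int, n_right: int) -> int:
    """No middleware: every left-hand item is joined to every right-hand item."""
    return n_left * n_right


def single_middleware_links(n_left: int, n_right: int) -> int:
    """One middleware: every item on either side connects once to the hub."""
    return n_left + n_right


def multi_middleware_links(n_left: int, n_right: int, m: int) -> int:
    """Assumed model only: every item connects to each of the m middlewares."""
    return m * (n_left + n_right)


if __name__ == "__main__":
    n = 10
    print("direct:", direct_links(n, n))                        # N * N
    print("one middleware:", single_middleware_links(n, n))     # 2 * N
    for m in (1, 3, 5):
        print(f"{m} middlewares:", multi_middleware_links(n, n, m))
```

Under this assumed model, with N = 10 on each side, five middlewares already need as many connections as the direct picture, which is the sense in which the thin-arrow diagram stops looking so scary.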
Diagram: HPC, Grid, cloud and Web 2.0, with the note “use here”. Web is being embraced for usability and programmability e.g. mashups
And Grid is trying to come to terms with multicore and clouds!
A Thought Experiment • Imagine EPrints/DSpace/Fedora isn’t something you download and run on a local server • Imagine instead that you just go to the cloud and make one* • How would this repository ecosystem self-organise to support Research 2.0? • Would there be institutional repositories? * (Actually you can!)
Is it a wave or is it a particle? • Tension between data being “out on the Web” (user view) or in an institutional machine room (provider view) • What is the curator view? • Issues perceived differently for metadata servers and data servers
How Repositories can avoid Failing like the Grid 1. Understand what the users will need by going on the journey together 2. Be open-minded: are we solving the right problem? (Don’t forget curation of process!) 3. Don’t create artificial distinctions from Web 4. Beware standards as a barrier to adoption 5. Think cloud, outside the institutional box: imagine the repository factory 6. Think of a new name for repositories!
Contact: David De Roure, email@example.com. Thanks: Carole Goble, Jeremy Frey, Simon Coles, Peter Murray-Rust