Managing Metadata for Science and Technology Studies: the RISIS case

Published on July 4, 2016

Author: rinkehoekstra

Source: slideshare.net

1. Managing Metadata for Science, Technology and Innovation Studies: The RISIS Case
 Al Koudous Idrissou, Ali Khalili, Rinke Hoekstra and Peter van den Besselaar
 Vrije Universiteit Amsterdam/University of Amsterdam
 rinke.hoekstra@vu.nl

2. Science & Technology Studies
 • Study the dynamics of scientific ideas.
 • Interaction between academia, business and government.
 • Highly interdisciplinary: social sciences, economics, political science, humanities
 • Highly heterogeneous data: structured vs. unstructured, qualitative vs. quantitative

3. The RISIS Project
 • "an explosion of experimental datasets since 2000 … mostly thanks to EC supported project"
 • A distributed research infrastructure to advance science & innovation studies
 • Serving research:
   consolidate and integrate existing datasets
   complement with new datasets on key issues currently not covered
   develop software platforms to support research (extract, integrate, structure and treat semantic web data)
 • Serving society:
   A radically improved evidence base for research & innovation policies

4-5. Six Use Cases
 1. Which types of firms innovate where, how do they develop, and where do they grow fastest?
 2. How stable and large are EU-promoted networks? How do joint funding and emerging science & technologies affect Europe?
 3. What is the quality and extent of public sector research? Build registers at a European level, integrated views of excellence (Leiden Ranking etc.)
 4. Track the careers of researchers across borders
 5. Effect and impact of research & innovation studies
 6. Develop integrated data and tools for researchers in the field

6. The Types of Data in SMS Data Integration
 [Diagram]
 Datasets: CIB, ETER, EUPRO, JOREP, Leiden-Ranking, MORE I, Nano, Profile, SIPER, VICO
 Entity types: Organization, Product, Agreement, Person, Policy, Policy Evaluation, Location, Higher Education, Firm, Funding Body, Publication, Patent, Project, Investment, Funding Program

7. Semantically Mapping Science (SMS)
 [Architecture diagram. Labels: RISIS Private Data and RISIS Public Data (databases, each with VOID/RDF descriptions); [Linked] Open Data; [Linked Data] API; Data Cache (Triple Store); Data Viz. & Exploration views; Interoperability with corTEXT; Named Entity Recognition; Public Data Access Methods (SPARQL, API, RSS, …); Apps.]
 Services shown: Meta-data Services, Basic Geo Services, Innovative Geo Services, Category Services, Integration with local datasets, Integration with public datasets, Integration with social data, Access Control Service, Domain Adaptation Service, Identifier Management Service, Identity Resolution Service

8. But wait a moment… hasn't this been done before?
 • … solve a similar data integration problem: pharma (OpenPHACTS), socio-economic history, linguistics, media (CLARIAH, CEDAR, etc.).
 • … solve a similar data search, indexing and cataloguing problem: datahub.io, lodlaundromat.org
 • … solve similar metadata representation problems: DCAT, VOID, etc.
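
 As a rough illustration of what the existing vocabularies already cover, the sketch below writes a minimal dataset description in Turtle using VOID, DCAT and Dublin Core terms. It uses only the Python standard library; the dataset URI, the description text and the helper names (`literal`, `describe_dataset`) are hypothetical and are not taken from the RISIS implementation.

```python
# Minimal sketch: describe a dataset with VOID, DCAT and Dublin Core terms.
# The URI, texts and helper names are illustrative assumptions.
PREFIXES = {
    "void": "http://rdfs.org/ns/void#",
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
}


def literal(value: str) -> str:
    """Quote a string as a Turtle literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def describe_dataset(uri: str, title: str, description: str, keywords: list[str]) -> str:
    """Return a Turtle document with one dataset description."""
    lines = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    lines.append("")
    props = [
        ("a", "void:Dataset, dcat:Dataset"),
        ("dcterms:title", literal(title)),
        ("dcterms:description", literal(description)),
    ]
    props += [("dcat:keyword", literal(k)) for k in keywords]
    body = " ;\n".join(f"    {pred} {obj}" for pred, obj in props)
    lines.append(f"<{uri}>\n{body} .")
    return "\n".join(lines)


turtle = describe_dataset(
    "http://example.org/dataset/example",  # hypothetical URI
    "Example dataset",
    "Placeholder description of a dataset on higher education",
    ["higher education"],
)
print(turtle)
```

 Descriptions of this kind say what a dataset is and what it is about. Whether they also capture access restrictions and usage conditions is the question the following slides raise.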

9-12. data privacy • data licensing • data paywall • physical location

13-15. Semantically Mapping Science (SMS)
 [Revised architecture diagram. Same components as slide 7, with Access Control Points added between the RISIS private/public databases and SMS; arrows labelled "convert" and "metadata" lead from the VOID descriptions to an RDF store holding RDF metadata.]
 • How can we still provide an integrated view on this data?
 • Do existing vocabularies suffice?

16. How do experts assess the suitability of a dataset?
 • Knowledge acquisition & elicitation with experts: interviews -> first design -> user experiences -> revise & adapt
 • Distinguish between private, publicly accessible, and other public data.

17. How do experts assess the suitability of a dataset?
 1. User friendly web interface for viewing dataset metadata;
 2. Show conditions under which the data can be used;
 3. Provide detailed information about the dataset, to
 4. Enable users to gain an in depth understanding of the data;
 5. Facilitate trust (quality assessment);
 6. Allow for both simple and advanced search (background knowledge)

18. Operationalisation
 1. User interface: categorisation of different types of metadata, non-technical terms, hints
 2. Usage conditions: legal aspects, access conditions, but also technical (data format, size, model)
 3. Information & Understanding: overview, content description, temporal aspects, structure of the data
 4. Trust: provenance and origin of data, when, how, and by whom it was created
 5. Search: all of the above + the use of external knowledge sources to show connections
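
 The categories above can be read as groups of fields in a metadata record. The following sketch is one possible encoding, not the RISIS vocabulary itself: every class and field name is a hypothetical stand-in for the aspects listed on the slide. It also shows how a portal could report which aspects of a dataset are still undocumented.

```python
# Minimal sketch: group dataset metadata by the categories named on the slide.
# All class and field names are illustrative assumptions.
from dataclasses import dataclass, field, fields


@dataclass
class UsageConditions:
    license: str = ""
    access_conditions: str = ""
    data_format: str = ""
    size: str = ""


@dataclass
class Information:
    overview: str = ""
    content_description: str = ""
    temporal_coverage: str = ""
    structure: str = ""


@dataclass
class Trust:
    created_by: str = ""
    created_when: str = ""
    created_how: str = ""


@dataclass
class DatasetMetadata:
    name: str
    usage: UsageConditions = field(default_factory=UsageConditions)
    information: Information = field(default_factory=Information)
    trust: Trust = field(default_factory=Trust)


def missing_fields(record: DatasetMetadata) -> dict[str, list[str]]:
    """Map each category to the names of its fields that are still empty."""
    gaps: dict[str, list[str]] = {}
    for category in ("usage", "information", "trust"):
        group = getattr(record, category)
        empty = [f.name for f in fields(group) if not getattr(group, f.name)]
        if empty:
            gaps[category] = empty
    return gaps


record = DatasetMetadata(
    name="example-dataset",
    usage=UsageConditions(
        license="on request",
        access_conditions="on-site visit",
        data_format="SQL",
        size="unknown",
    ),
    information=Information(overview="Placeholder overview"),
)
gaps = missing_fields(record)
print(gaps)
```

 Keeping the categories as separate groups mirrors the first point on the slide: the interface can present each group under a non-technical heading and add hints where fields are empty.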

19. Operationalisation
 Fig. 2. RISIS metadata coverage overview through knowledge type categorization.

20. In more detail
 [Paper excerpt] … and generic domain of the problem, SMS is intended to be useful not only for STIS but also for the humanities and social sciences.
 5 Conclusions & Future Work
 This paper presents an approach for managing metadata in the field of science, technology and innovation studies. The approach was developed and applied in the context of the RISIS-SMS project with the goal of supporting data integration, discovery and search across datasets, maintaining privacy, and obtaining user trust while focussing on data that are not directly accessible. A contribution of this work is the requirements elicited by interviewing the stakeholders. The requirement analysis guided the design of a new vocabulary, together with a review of existing metadata vocabularies that helped us fill in part of the metadata needed to accommodate the domain needs. Additionally, to meet the requirements, we designed and implemented a user-friendly interface which allows non-expert users to easily author metadata in RDF. As future work, we envisage extending our vocabulary to cover aspects related to the quality and provenance of data. We also plan to conduct a usability evaluation with end-users of the system to ensure that our user interface and metadata specifications fulfil the user needs.
 Fig. 4. The RISIS Ontology and the vocabularies it reuses.

21. In more detail
 Fig. 3. RISIS's Ontology. A view over mapped vocabularies reused.
 [Paper excerpt] … respectively the Dublin Core Metadata Element Set, which is a "vocabulary of fifteen properties for use in resource description", the PROV Ontology, which is …

22-24. Ali Khalili, Antonis Loizou and Frank van Harmelen. Adaptive Linked Data-driven Web Components: Building Flexible and Reusable Semantic Web Interfaces

25. Discussion
 • Science & innovation studies thrives on diverse and heterogeneous data
 • Existing platforms do not take access restrictions into account, or
 • … they do not provide sufficiently descriptive metadata to support research
 • We performed a requirements analysis for minimal metadata needs
 • Resulting in a vocabulary that integrates and connects existing standards, and
 • … drives a Linked Data driven data search portal.
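
 The difference between simple and advanced search over dataset metadata can be sketched in a few lines. This is an in-memory illustration only: the records, field names and function names are hypothetical, and the slides do not describe how the portal itself implements search.

```python
# Minimal sketch: simple vs. advanced search over metadata records.
# Records and field names are illustrative assumptions.
RECORDS = [
    {"name": "dataset-a", "topic": "patents", "access": "private"},
    {"name": "dataset-b", "topic": "higher education", "access": "public"},
    {"name": "dataset-c", "topic": "patents", "access": "public"},
]


def simple_search(records, term):
    """Return records in which any field contains the term (case-insensitive)."""
    needle = term.lower()
    return [r for r in records if any(needle in str(v).lower() for v in r.values())]


def advanced_search(records, **constraints):
    """Return records whose fields equal every given field=value constraint."""
    return [r for r in records if all(r.get(k) == v for k, v in constraints.items())]


def names(records):
    """Helper for display: list the record names."""
    return [r["name"] for r in records]


print(names(simple_search(RECORDS, "Patents")))
print(names(advanced_search(RECORDS, topic="patents", access="public")))
```

 Because both functions look only at metadata, a private dataset can still be found and assessed without exposing the data itself.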
