grlc Makes GitHub Taste Like Linked Data APIs


Published on July 5, 2016

Author: albertmeronyo


1. GRLC MAKES GITHUB TASTE LIKE LINKED DATA APIS
   Chefs: Albert Meroño-Peñuela, Rinke Hoekstra
   Services and Applications over Linked APIs and Data (SALAD), ESWC, 29-05-2016
   Het begint met een idee (It begins with an idea)

2. INSTITUTIONAL SLIDE
   Vrije Universiteit Amsterdam
   • VU University Amsterdam – Computer Science (Knowledge Representation & Reasoning group)
   • International Institute of Social History (IISG), Amsterdam
   • CLARIAH – National Infrastructure for Digital Humanities
     > DataLegend: Structured Data Hub
   • Previously incubated by CEDAR – Dutch historical censuses as 5-star LOD

3. DISCLAIMER: Frustration-driven research

4. Part 1: LD-CONSUMING APPLICATIONS

5. THE CEDAR STORY
   • Publishing Dutch historical censuses as 5-star LD
     > Intensive use of RDF Data Cube
     > Harmonization rules
     > Provenance
   • First historical census data as Linked Data (1795-1971)
   • 8 million observations (sex, marital status, occupation position, housing type, residence status)
   • External links
     > Geographical: 2.7M
     > Occupations: 350K
     > Belief: 250K
   • High value for social historians

6. CENSUS DATA QUERYING INTERFACES
   • Historians can’t really write SPARQL
   • Variety of access interfaces needed

7. SCALING VARIETY
   • CLARIAH-WP4: Structured data hub for social historians
   • IPUMS, NAPP, CEDAR, etc.
     > Macro-, micro-, meso-data
     > Civil registries, occupation, religion, country-level economic indicators
     > National (Netherlands) and international
   • Mostly CSV tables turned into RDF Data Cube and CSVW
   • More than 1B triples already
   • Higher variety of humanities scholars → higher variety of data access requirements
   [Diagram: Structured Data Hub. Labels: Frequency Table; Variables (exists / variable does not yet exist); Mappings; Publish; Augment; Existing Variables & Codes; External Datasets; External (Meta)Data, including both external Linked Data and standard vocabularies, e.g. World Bank; Provenance tracking]

8. [Image-only slide]

9. FRUSTRATION 1: This is SPARQL mess!!!1one

10. [Image-only slide]

11. GITHUB AS A HUB OF SPARQL QUERIES
   • One .rq file per SPARQL query
   • Good support for query curation processes
     > Versioning
     > Branching
     > Clone-pull-push
   • Web-friendly features!
     > One URI per query
     > Uniquely identifiable
     > De-referenceable
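A minimal sketch of what "one URI per query, de-referenceable" can look like in practice, assuming the queries live in a public GitHub repository and are served through the raw.githubusercontent.com URL pattern. The owner, repository, branch, and file names are hypothetical placeholders, not taken from the slides.

```python
from urllib.parse import quote
from urllib.request import urlopen


def raw_query_url(owner, repo, path, branch="master"):
    """Build the raw-content URL of a .rq file stored in a GitHub repo.

    All argument values used in examples are hypothetical.
    """
    if not path.endswith(".rq"):
        raise ValueError("expected a .rq SPARQL query file")
    # quote() leaves "/" untouched, so nested folders in `path` survive.
    return "https://raw.githubusercontent.com/{}/{}/{}/{}".format(
        quote(owner), quote(repo), quote(branch), quote(path)
    )


def dereference_query(url):
    """Fetch the SPARQL text behind a query URI (performs a network call)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


# Hypothetical repository layout: one file per query.
url = raw_query_url("someowner", "somerepo", "queries/houses.rq")
```

Because the URL embeds the branch name, the same query can be addressed at different points of its curation history.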

12. LESSON 1: Query centralization helps maintain distributed applications

13. Part 2: THE NEED FOR APIS

14. MEANWHILE IN THE SEMANTIC WEB…
   • Linked Data APIs emerge
   • RESTful entry point to Linked Data hubs for Web applications
   • OpenPHACTS
   • …but the Linked Data API (e.g. Swagger spec, code itself) still needs to be coded and maintained

15. BASIL
   • Love story – thanks KMi!
   • Automatically builds Swagger specs and API code
   • Takes SPARQL queries as input (1 API operation = 1 SPARQL query)
     > API call functionality limited to SPARQL expressivity
   • Makes SPARQL queries uniquely referenceable by using their equivalent LDA operation
     > Stores SPARQL internally
     > But we already have uniquely referenceable SPARQL…

16. FRUSTRATION 2: Copy-pasting 200 queries!!! & Organization problem

17. GRLC
   • Cousin of BASIL in a SALAD
   • Same basic principle: 1 SPARQL query = 1 API operation
   • Automatically builds Swagger spec and UI from SPARQL
   But:
   • External query management
   • Organization of SPARQL queries in the GitHub repo matches organization of the API
   • Thin layer – nothing stored server-side
   • Maps
     > GitHub API
     > Swagger spec
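The "1 SPARQL query = 1 API operation" mapping can be sketched as follows. This is an illustration under stated assumptions, not grlc's actual code: it takes a directory listing shaped like the one the GitHub contents API returns (entries with `name` and `type` fields) and produces a skeleton Swagger 2.0 document with one path per `.rq` file. The function name, API title, and file names are hypothetical.

```python
def swagger_from_listing(listing, title="Hypothetical query API"):
    """Map each .rq file in a repo listing to one Swagger path (sketch only)."""
    paths = {}
    for entry in listing:
        name = entry["name"]
        # Only SPARQL query files become operations; other files are skipped.
        if entry.get("type") == "file" and name.endswith(".rq"):
            operation = name[: -len(".rq")]
            paths["/" + operation] = {
                "get": {
                    "summary": "Runs the query stored in " + name,
                    "responses": {"200": {"description": "Query results"}},
                }
            }
    return {
        "swagger": "2.0",
        "info": {"title": title, "version": "0.1"},
        "paths": paths,
    }


# Hypothetical listing, in the shape of a GitHub contents API response.
listing = [
    {"name": "README.md", "type": "file"},
    {"name": "houses.rq", "type": "file"},
    {"name": "occupations.rq", "type": "file"},
    {"name": "archive", "type": "dir"},
]
spec = swagger_from_listing(listing)
```

Nothing is persisted in this sketch: the spec is derived from the listing on each call, which mirrors the "thin layer, nothing stored server-side" idea.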

18. MAPPING GITHUB AND SWAGGER

19. SPARQL DECORATOR SYNTAX
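The slide's example is not preserved in this transcript. As a rough sketch, assuming decorators are written as SPARQL comment lines prefixed with `#+` and holding `key: value` metadata (the convention described in grlc's documentation), they could be read like this. The sample query, endpoint URL, and function name are hypothetical, and only single-line values are handled.

```python
def parse_decorators(query_text):
    """Collect single-line '#+ key: value' decorators from a SPARQL file."""
    decorators = {}
    for line in query_text.splitlines():
        line = line.strip()
        if not line.startswith("#+"):
            continue  # ordinary SPARQL or ordinary comment
        # Split on the first colon only, so URL values keep their own colons.
        key, sep, value = line[2:].partition(":")
        if sep and value.strip():
            decorators[key.strip()] = value.strip()
    return decorators


# Hypothetical decorated query.
sample_query = """#+ summary: Lists housing types
#+ endpoint: http://example.org/sparql
# a plain comment, not a decorator
SELECT ?type WHERE { ?s a ?type } LIMIT 10
"""
metadata = parse_decorators(sample_query)
```

Since the decorators are comments, the file remains a valid SPARQL query for any client that ignores them.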

20. THE GRLC SERVICE
   • Assuming your repo is at … and your grlc instance is at :host,
     > http://:host/:owner/:repo/spec returns the JSON Swagger spec
     > http://:host/:owner/:repo/api-docs returns the Swagger UI
     > http://:host/:owner/:repo/:operation?p_1=v_1...p_n=v_n calls the operation with the specified parameter values
     > Uses BASIL’s SPARQL variable name convention for query parameters
   • Sends requests to
     > … to look for SPARQL queries and their decorators
     > … to dereference queries, get the SPARQL, and parse it
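The three routes listed on this slide can be assembled as below. The host, owner, repository, and operation names are hypothetical. The `query_parameters` helper rests on an assumption about BASIL's convention, namely that parameter variables start with `?_` (or `?__` when optional) and may carry a type suffix such as `_integer`; treat it as an approximation rather than the exact rule.

```python
import re
from urllib.parse import urlencode


def spec_url(host, owner, repo):
    """URL of the JSON Swagger spec."""
    return "http://{}/{}/{}/spec".format(host, owner, repo)


def api_docs_url(host, owner, repo):
    """URL of the Swagger UI."""
    return "http://{}/{}/{}/api-docs".format(host, owner, repo)


def operation_url(host, owner, repo, operation, **params):
    """URL that calls one operation with the given parameter values."""
    base = "http://{}/{}/{}/{}".format(host, owner, repo, operation)
    return base + ("?" + urlencode(params) if params else "")


# Assumed convention: ?_name or ?__name, optionally followed by _type.
PARAM_VARIABLE = re.compile(r"\?_{1,2}(\w+)")


def query_parameters(sparql):
    """List parameter names found in a query, in order of first appearance."""
    names = []
    for match in PARAM_VARIABLE.finditer(sparql):
        name = match.group(1).split("_")[0]  # drop any type suffix
        if name not in names:
            names.append(name)
    return names


call = operation_url("grlc.example.org", "someowner", "somerepo",
                     "houses", year="1889")
```

A client would typically read the spec first to discover the operations and then build calls with `operation_url`.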

21. SPICED-UP SWAGGER UI

22. EVALUATION – USE CASES
   • CEDAR: Access to census data for historians
     > Hides SPARQL
     > Allows them to fill query parameters through forms
     > Co-existence of SPARQL and non-SPARQL clients
   • CLARIAH - Born Under a Bad Sign: Do prenatal and early-life conditions have an impact on socioeconomic and health outcomes later in life? (uses 1891 Canada and Sweden Linked Census Data)
     > Reduction of coupling between SPARQL libs and R
     > Shorter R code – input stream as CSV
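The second use case reads query results as a CSV stream from R. The same idea is sketched here in Python using only the standard library: the client needs an HTTP call and a CSV parser, and no SPARQL library. It assumes the API returns CSV when asked with an `Accept: text/csv` header; the column names and values in the sample are invented for illustration.

```python
import csv
import io
import urllib.request


def fetch_csv(url):
    """Request an API operation and ask for CSV (performs a network call)."""
    request = urllib.request.Request(url, headers={"Accept": "text/csv"})
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def rows_from_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Invented sample payload standing in for an API response.
sample = "municipality,total\nAmsterdam,10\nUtrecht,4\n"
rows = rows_from_csv(sample)
grand_total = sum(int(row["total"]) for row in rows)
```

The SPARQL itself stays in the query repository, so the analysis script does not change when a query is revised.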

23. CONCLUSIONS
   The spectrum of Linked Data clients: SPARQL-intensive applications vs RESTful API applications
   grlc uses decoupling of SPARQL from all client applications (including LDA) as a powerful practice
   • Separates query curation workflows from everything else
   • Allows at the same time
     > Web-friendly SPARQL queries
     > Web-friendly RESTful APIs
   • Helps you to easily organise your LDA – just organise your SPARQL repository and you’re set
   • Try it out!

