RDF Stream Processing: Let's React

Published on October 21, 2014

Author: jpcik

Source: slideshare.net

Description

RDF Stream processing and reactive systems

1. RDF Stream Processing: Let’s React! @jpcik Jean-Paul Calbimonte LSIR EPFL OrdRing 2014. International Semantic Web Conference ISWC 2014 Riva del Garda, 20.10.2014

2. The RSP Community. Research work: many papers, PhD theses, datasets, prototypes, benchmarks. Many topics: RDF streams, stream reasoning, complex event processing, stream query processing, stream compression, Semantic Sensor Web. Tons of work. The W3C RSP Community Group (http://www.w3.org/community/rsp) is an effort to discuss, standardize, combine, formalize, and evangelize our work on RDF stream processing.

3. The RSP Community

4. Why Streams? Internet of Things, sensor networks, mobile networks, smart devices, participatory sensing, transportation, financial data, social media, urban planning, health monitoring, marketing. “It’s a streaming world!” [1]
[1] Della Valle et al. It's a Streaming World! Reasoning upon Rapidly Changing Information. IEEE Intelligent Systems.

5. Why Streams? Go Web: web standards, data discovery, data sharing, web queries. Integration: semantics, vocabularies, data harvesting, data linking, matching. Reasoning: ontologies, expressivity, inference, rule processing, knowledge bases. Processing: query languages, query answering, efficient processing, query federation. Is this what we require?

6. Looking back 10 years ago… “8 requirements of real-time stream processing” [2]. The 8 requirements: keep data moving; query with stream SQL; handle imperfections; predictable outcomes; integrate stored data; data safety & availability; partition & scale; respond instantaneously. Do we address them? Do we have more requirements? Do we need to do more?
[2] Stonebraker et al. The 8 Requirements of Real-Time Stream Processing. SIGMOD Record, 2005.

7. Reactive Systems. The 8 requirements: keep data moving; query with stream SQL; handle imperfections; predictable outcomes; integrate stored data; data safety & availability; partition & scale; respond instantaneously. Reactive systems react to events (event-driven), to load (scalable), to failure (resilient), and to users (responsive). Do we address them? Do we have more requirements? Do we need to do more?
Jonas Bonér. Go Reactive: Event-Driven, Scalable, Resilient & Responsive Systems. 2013.

8. Warning: You may see code

9. ① Keep the data moving. Process data in-stream; storing it is not required. Active processing model: input streams (RDF streams) feed RSP queries/rules, which produce output streams/events.

10. RDF Stream: … Gi, Gi+1, Gi+2, …, Gi+n, … (an unbounded sequence). Each element Gi = {(s1,p1,o1), (s2,p2,o2), …} [ti] holds one or more triples with an implicit or explicit timestamp or interval. How do I code this?

    // something to run on a thread
    public class SensorsStreamer extends RdfStream implements Runnable {
      public void run() {
        ..
        while(true){
          ...
          // timestamped triple
          RdfQuadruple q=new RdfQuadruple(subject,predicate,object, System.currentTimeMillis());
          // the stream is "observable"
          this.put(q);
        }
      }
    }

Issues: data structure, execution and callbacks are mixed; observer pattern; tightly coupled listener.

11. Actor Model. Actors are lightweight objects, each with a mailbox, state and behavior. They communicate through messages (send: fire-and-forget) and respond without blocking. No shared mutable state; avoid blocking operators; loose coupling. More goodies later… Implementations: e.g. Akka for Java/Scala.

12. RDF Stream: ideas using Akka actors.

Data structure (infinite triple iterator):

    object DemoStreams {
      ...
      def streamTriples={
        Iterator.from(1) map{i=>
          ...
          new Triple(subject,predicate,object)
        }
      }
    }

Execution (asynchronous iteration):

    val f=Future(DemoStreams.streamTriples)
    f.map{a=>a.foreach{triple=>
      //do something
    }}

Message passing (send triple to actor):

    f.map{a=>a.foreach{triple=>
      someSink ! triple
    }}

Immutable RDF stream: avoid shared mutable state; avoid concurrent writes; unbounded sequence.
Futures: non-blocking composition; concurrent computations; work with not-yet-computed results.
Actors: message-based; share-nothing async; distributable.
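The same separation of data structure, execution and message passing can be approximated with the Java standard library alone. The following is a minimal sketch, not Akka: the `Triple` class, the `ex:` names and the use of a `BlockingQueue` as a stand-in for an actor mailbox are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.stream.Stream;

class StreamSketch {
    /** Immutable triple: all fields final, no setters (hypothetical class). */
    static final class Triple {
        final String subject, predicate, object;
        Triple(String s, String p, String o) { subject = s; predicate = p; object = o; }
    }

    /** Data structure: a lazy, unbounded sequence of triples. */
    static Stream<Triple> streamTriples() {
        return Stream.iterate(1, i -> i + 1)
                .map(i -> new Triple("ex:sensor" + i, "ex:hasValue", String.valueOf(i)));
    }

    /** Execution + message passing: asynchronously send the first n triples to a mailbox. */
    static CompletableFuture<Void> sendTo(BlockingQueue<Triple> mailbox, int n) {
        return CompletableFuture.runAsync(() ->
                streamTriples().limit(n).forEachOrdered(mailbox::add));
    }

    /** Receiver side: take exactly n messages, waiting for each one to arrive. */
    static List<Triple> receive(BlockingQueue<Triple> mailbox, int n) {
        List<Triple> received = new ArrayList<>();
        try {
            for (int i = 0; i < n; i++) received.add(mailbox.take());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for messages", e);
        }
        return received;
    }

    public static void main(String[] args) {
        BlockingQueue<Triple> mailbox = new LinkedBlockingQueue<>();
        sendTo(mailbox, 3); // returns immediately; the sender does not wait for the receiver
        for (Triple t : receive(mailbox, 3))
            System.out.println(t.subject + " " + t.predicate + " " + t.object);
    }
}
```

The producer never touches the consumer's state; the only shared object is the queue, which plays the role of the mailbox.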

13. RDF Stream: loose coupling; immutable data streams; asynchronous message passing; well-defined input/output. … other issues: Graph implementation? Timestamps: application vs system? Serialization?

14. ② Query using SQL on Streams. Powerful languages for continuous query processing.

Language | Model | Continuous execution | Union, Join, Optional, Filter | Aggregates | Time window | Triple window | R2S operator | Sequence, Co-occurrence | Time function
TA-SPARQL | TA-RDF | ✗ | ✔ | Limited | ✗ | ✗ | ✗ | ✗ | ✗
tSPARQL | tRDF | ✗ | ✔ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗
Streaming SPARQL | RDF Stream | ✔ | ✔ | ✗ | ✔ | ✔ | ✗ | ✗ | ✗
C-SPARQL | RDF Stream | ✔ | ✔ | ✔ | ✔ | ✔ | ✗ | ✗ | ✔
CQELS | RDF Stream | ✔ | ✔ | ✔ | ✔ | ✔ | ✗ | ✗ | ✗
SPARQLStream | (Virtual) RDF Stream | ✔ | ✔ | ✔ | ✔ | ✗ | ✔ | ✗ | ✗
EP-SPARQL | RDF Stream | ✔ | ✔ | ✔ | ✗ | ✗ | ✗ | ✔ | ✗
Instans | RDF | ✔ | ✔ | ✔ | ✗ | ✗ | ✗ | ✗ | ✗

W3C RSP: review features in existing systems; agree on fundamental operators; discuss possible semantics. https://www.w3.org/community/rsp/wiki/RSP_Query_Features
RSP is not always/only SPARQL-like querying. The SPARQL protocol is not enough. RSP RESTful interfaces?

15. RSP Querying: example with CQELS (code.google.com/p/cqels). CQELS continuous query: “Queries evaluated continuously against the changing dataset” [4]

    SELECT ?person ?loc WHERE {
      STREAM <http://deri.org/streams/rfid> [RANGE 3s]
      {?person :detectedAt ?loc}
    }

    ExecContext context=new ExecContext(HOME, false);
    String queryString =" SELECT ?person ?loc …
    // register query
    ContinuousSelect selQuery=context.registerSelect(queryString);
    // adding listener
    selQuery.register(new ContinuousListener() {
      // get result updates
      public void update(Mapping mapping){
        String result="";
        for(Iterator<Var> vars=mapping.vars();vars.hasNext();)
          result+=" "+ context.engine().decode(mapping.get(vars.next()));
        System.out.println(result);
      }
    });

Issues: tightly coupled listeners. Results delivery: push & pull?
[4] Le Phuoc et al. A Native and Adaptive Approach for Unified Processing of Linked Streams and Linked Data. ISWC 2011.
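To make the `[RANGE 3s]` clause concrete, here is a minimal sketch of a sliding time window in plain Java. It is not how CQELS implements windows: the class names are hypothetical, elements are assumed to arrive in timestamp order, and the window is taken to be the half-open interval (t - range, t] relative to the newest element, whereas real engines differ on boundary semantics.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class TimeWindowSketch {
    /** One stream element: a detection with an explicit application timestamp. */
    static final class Detection {
        final String person, location;
        final long timestampMillis;
        Detection(String person, String location, long timestampMillis) {
            this.person = person; this.location = location; this.timestampMillis = timestampMillis;
        }
    }

    /** Keeps only the elements that fall within the last rangeMillis. */
    static final class SlidingTimeWindow {
        private final long rangeMillis;
        private final Deque<Detection> content = new ArrayDeque<>();
        SlidingTimeWindow(long rangeMillis) { this.rangeMillis = rangeMillis; }

        /** Adds an element, then expires everything at or before (newest timestamp - range). */
        void add(Detection d) {
            content.addLast(d);
            long threshold = d.timestampMillis - rangeMillis;
            while (!content.isEmpty() && content.peekFirst().timestampMillis <= threshold)
                content.removeFirst();
        }

        /** Copy of the current window content, oldest first. */
        List<Detection> snapshot() { return new ArrayList<>(content); }
    }

    public static void main(String[] args) {
        SlidingTimeWindow window = new SlidingTimeWindow(3000); // 3 seconds
        window.add(new Detection("alice", "room1", 0));
        window.add(new Detection("bob", "room2", 1000));
        window.add(new Detection("carol", "room1", 2500));
        window.add(new Detection("dave", "room3", 4000)); // expires alice and bob
        for (Detection d : window.snapshot())
            System.out.println(d.person + " detectedAt " + d.location);
        // prints carol and dave only
    }
}
```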

16. Dynamic Push-Pull. Data flows from producer to consumer; demand flows from consumer to producer. Push when the consumer is faster; pull when the producer is faster; dynamically switch modes. Communication is dynamic, depending on demand vs supply.
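The switch between push and pull can be sketched with a single demand counter. This is a simplified, single-threaded illustration with hypothetical names, not a Reactive Streams implementation: when the consumer has signalled demand, a new element is pushed immediately; when it has not, the element waits in a buffer until the consumer pulls by requesting more.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

class DemandSketch {
    /** Producer side of a demand-driven link (not thread-safe; illustration only). */
    static final class Producer {
        private final Deque<String> pending = new ArrayDeque<>(); // produced but not yet delivered
        private final Consumer<String> downstream;
        private long demand = 0;                                  // elements the consumer can still accept

        Producer(Consumer<String> downstream) { this.downstream = downstream; }

        /** Data flow: a new element is available. */
        void offer(String item) { pending.addLast(item); drain(); }

        /** Demand flow: the consumer signals capacity for n more elements. */
        void request(long n) { demand += n; drain(); }

        /** Delivers buffered elements while there is outstanding demand. */
        private void drain() {
            while (demand > 0 && !pending.isEmpty()) {
                demand--;
                downstream.accept(pending.removeFirst());
            }
        }

        int buffered() { return pending.size(); }
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        Producer producer = new Producer(seen::add);
        producer.request(2);  // consumer is ahead: the next two elements are pushed
        producer.offer("a");
        producer.offer("b");
        producer.offer("c");  // no demand left: "c" is buffered
        System.out.println(seen + " buffered=" + producer.buffered()); // [a, b] buffered=1
        producer.request(1);  // consumer pulls the buffered element
        System.out.println(seen + " buffered=" + producer.buffered()); // [a, b, c] buffered=0
    }
}
```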

17. ③ Handle stream imperfections. Delayed, missing, out-of-order … “order matters!” [1] Setting timeouts in message passing:

    // timeout after 5 sec
    val future = myActor.ask("hello")(5 seconds)

    // timeout receiving messages
    context.setReceiveTimeout(100 milliseconds)
    def receive = {
      case "Hello" => //do something
      // in case of timeout
      case ReceiveTimeout => throw new RuntimeException("Timed out")
    }

Async message passing helps. Still to be studied at protocol level. Different guarantee requirements. In W3C RSP: usually ‘out-of-scope’.
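The timeout idea does not depend on Akka. As a rough equivalent, the sketch below bounds the wait for a reply with `CompletableFuture.get(timeout, unit)` from the Java standard library; the `awaitReply` helper and its marker strings are assumptions for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimeoutSketch {
    /** Waits at most timeoutMillis for a reply; returns a marker string if none arrives. */
    static String awaitReply(CompletableFuture<String> reply, long timeoutMillis) {
        try {
            return reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "TIMED_OUT";                  // delayed or missing message
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt flag
            return "INTERRUPTED";
        } catch (ExecutionException e) {
            return "FAILED: " + e.getCause();
        }
    }

    public static void main(String[] args) {
        CompletableFuture<String> missing = new CompletableFuture<>(); // never completed
        System.out.println(awaitReply(missing, 100));                  // TIMED_OUT
        System.out.println(awaitReply(CompletableFuture.completedFuture("hello"), 100)); // hello
    }
}
```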

18. ④ Generate predictable outcomes. Correctness in RSP: “RSP engines produce different but correct results” [1]. RSP queries: operational semantics; common query execution model; comparable / predictable results.
[1] Dell’Aglio, Calbimonte, Balduini, Corcho, Della Valle. On Correctness in RDF Stream Processing Benchmarking. ISWC 2013.

19. ⑤ Integrate stored and streaming data. Integration in RSP languages: a stored RDF graph queried together with an RDF stream.

    SELECT ?person1 ?person2
    FROM NAMED <http://deri.org/floorplan/>
    WHERE {
      GRAPH <http://deri.org/floorplan/> {?loc1 lv:connected ?loc2}
      STREAM <http://deri.org/streams/rfid> [NOW] {?person1 lv:detectedAt ?loc1}

Loading background knowledge:

    context.loadDataset("{URI OF GRAPH}", "{DIRECTORY TO LOAD}");

RDF stream + stored: “EP-SPARQL uses background ontologies to enable stream reasoning” [1]

    SELECT ?road ?speed WHERE {
      { ?road :slowTrafficDue ?observ }
      { ?observ rdfs:subClassOf :SlowTraffic }
      { ?observ :speed ?speed }
      FILTER (getDURATION() < "P1H"^^xsd:duration)
    }

Clean query model for stream + stored data. Web-ready inter-dataset queries. Shared vocabularies/ontologies. More than just joins with stored data: stream reasoning / inferences.
[1] Anicic et al. EP-SPARQL: a unified language for event processing and stream reasoning. WWW 2011.

20. ⑥ Guarantee data safety and availability. Supervision: automatic supervision; isolate failures; manage local failures. Responses to a failure: Restart, Suspend, Stop, Escalate, etc. Supervision strategies: All-for-One, One-for-One. Death watch handling. Supervision hierarchy (diagram): parent Actor 1 supervising Actor 2, Actor 3 and a failed Actor 4.
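A restart-then-escalate policy for a single child can be sketched in a few lines of plain Java. This is only an illustration of the idea, with hypothetical names; it does not reproduce Akka's supervision API, and it covers one child only (roughly the One-for-One case).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class SupervisionSketch {
    /**
     * Runs a child task. On failure the child is restarted as a fresh instance,
     * at most maxRestarts times; after that the failure is escalated to the caller.
     */
    static String supervise(Supplier<Callable<String>> childFactory, int maxRestarts) {
        for (int attempt = 0; attempt <= maxRestarts; attempt++) {
            Callable<String> child = childFactory.get(); // restart = new instance, clean state
            try {
                return child.call();
            } catch (Exception failure) {
                // The failure stays local to this child; siblings are not affected.
                System.out.println("child failed: " + failure.getMessage());
            }
        }
        throw new IllegalStateException("escalating: child still failing after "
                + maxRestarts + " restarts");
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Hypothetical flaky child: fails twice, then succeeds.
        String result = supervise(() -> () -> {
            if (calls.incrementAndGet() < 3) throw new RuntimeException("boom");
            return "ok after " + calls.get() + " attempts";
        }, 3);
        System.out.println(result); // ok after 3 attempts
    }
}
```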

21. Data availability. High load: input streams put RSP queries/rules under stress. “Eviction: delete from cache of operators” [1]. Diagram: a join operator over two caches of bindings (binding1, binding2, …, bindingX and binding3, binding5, …, bindingY), with evicted items marked X. Evict cache items based on: recency (LRU); likeliness of matching.
[1] Gao et al. The CLOCK Data-Aware Eviction Approach: Towards Processing Linked Data Streams with Limited Resources. ESWC 2014.
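Of the two eviction criteria, recency is the simpler one to illustrate. The sketch below bounds a cache of bindings with `LinkedHashMap` in access order; it shows plain LRU only, not the data-aware approach of the cited paper, and the class name and keys are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class EvictionSketch {
    /** Bounded cache that evicts the least recently used entry once capacity is exceeded. */
    static final class LruBindingCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        LruBindingCache(int capacity) {
            super(16, 0.75f, true); // true = iterate in access order, least recent first
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity; // called by put(); drops the least recently used entry
        }
    }

    public static void main(String[] args) {
        LruBindingCache<String, String> cache = new LruBindingCache<>(2);
        cache.put("binding1", "?loc=room1");
        cache.put("binding2", "?loc=room2");
        cache.get("binding1");               // binding1 becomes the most recently used
        cache.put("binding3", "?loc=room3"); // evicts binding2
        System.out.println(cache.keySet());  // [binding1, binding3]
    }
}
```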

22. ⑦ Partition and scale automatically. Dynamite: parallel materialization; maintaining a dynamic ontology; parallel threads.

23. Actors everywhere. No difference between one core, many cores, or many servers. Transparent remoting; locality optimization; define routing policies; define actor clusters. Existing ‘map reduce’ for streams: Storm, S4, Spark, Akka Streams. Create workflows of stream processors?
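One possible routing policy is to hash the subject of each triple, so that all triples about the same subject reach the same worker. The sketch below is an assumption made for illustration (the names and the choice of the subject as partition key are hypothetical) and does not use Akka's routers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RoutingSketch {
    /** Picks a worker index for a subject; equal subjects always map to the same worker. */
    static int route(String subject, int workers) {
        return Math.floorMod(subject.hashCode(), workers); // floorMod keeps the result non-negative
    }

    /** Splits subjects into one list per worker using route(). */
    static List<List<String>> partition(List<String> subjects, int workers) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < workers; i++) partitions.add(new ArrayList<>());
        for (String s : subjects) partitions.get(route(s, workers)).add(s);
        return partitions;
    }

    public static void main(String[] args) {
        List<String> subjects = Arrays.asList("ex:sensor1", "ex:sensor2", "ex:sensor1", "ex:sensor3");
        List<List<String>> partitions = partition(subjects, 2);
        for (int i = 0; i < partitions.size(); i++)
            System.out.println("worker " + i + ": " + partitions.get(i));
    }
}
```

Whether the workers are threads, processes or remote nodes is left open here, which is the point of the slide: the routing decision is the same in all three cases.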

24. ⑧ Process and respond instantaneously. Problems: blocking queries; sync communication; bottlenecks; no end-to-end reactivity. Morph-streams: “Query virtual RDF streams” [1]. Users and applications issue SPARQLStream queries over a virtual RDF stream; query processing rewrites the queries for the data layer (DSMS, CEP, sensor middleware, …). Push: using WebSockets! Two-way communication; responsive answers.

25. RSP Stream Reasoning: example with TrOWL (trowl.eu). Ontologies evolve over time! Adding and removing axioms over time: ONTO + axiom1 + axiom2 gives ONTO’; ONTO’ - axiom3 gives ONTO’’. “Reasoning with changing knowledge” [3]

    val onto=mgr.createOntology
    val reasoner = new RELReasonerFactory().createReasoner(onto)
    // add 10 axioms
    (1 to 10).foreach{i=>
      reasoner += (Indiv(rspEx+"sys"+i) ofClass system)
      // reclassify
      reasoner.reclassify
    }
    // get instances of ‘System’ class
    println (reasoner.getInstances(SystemClass, true))

Powerful dynamic ontology maintenance. Streaming query results? Notify new inferences?
[3] Ren, Pan, Zhao. Towards Scalable Reasoning on Ontology Streams via Syntactic Approximation. IWOD 2010.

26. A lot to do…

27. Reactive RSPs. The 8 requirements: keep data moving; query with stream SQL; handle imperfections; predictable outcomes; integrate stored data; data safety & availability; partition & scale; respond instantaneously. We go beyond only these: data heterogeneity; data modeling; stream reasoning; data discovery; stream data linking; query optimization; … more. Reactive principles: needed if we want to build relevant systems.

28. Reactive RSP workflows. Diagram: RSP components connected by streams (Morph-streams, C-SPARQL, Etalis, TrOWL, Dynamite, CQELS). Reactive RSPs: event-driven message passing; async communication; immutable streams; transparent remoting; parallel and distributed; supervised failure handling; responsive processing. Minimal agreements: standards, serialization, interfaces. Formal models for RSPs and reasoning. Working prototypes/systems!

29. We like to do things

30. Muchas gracias! @jpcik Jean-Paul Calbimonte LSIR EPFL
