GraphConnect 2014 SF: From Zero to Graph in 120: Scale


Published on October 22, 2014

Author: neo4j

Source: slideshare.net


1. SAN FRANCISCO | 10.22.2014 | Scaling Neo4j Applications | @iansrobinson

2. The Burden of Success • More users • Larger datasets • More concurrent requests • More complex queries

3. Scaling is a Feature • It doesn’t come for free • Conditions of success: – Understand current needs • Design for an order of magnitude growth – Iterative and incremental development – Unit tests • Bedrock of asserted behaviour – Performance tests

4. Overview • Scaling Reads – Latency – Throughput • Scaling Writes • Hardware

5. Scaling Reads - Latency

6. Query Latency latency = f(search_area)

12. Search Area search_area = f(domain_invariants)

13. Search Area search_area = f(domain_invariants) • Absolute: Every user has 50 friends

15. Search Area search_area = f(domain_invariants) • Absolute: Every user has 50 friends • Relative: Every user is friends with 10% of the user base
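
A rough worked example of the two invariants, assuming the search area of a friend-of-friend query can be approximated by the number of two-hop paths it follows. The `SearchArea` class and this approximation are illustrative assumptions, not part of the slides.

```java
/** Approximates search area as the number of two-hop (friend-of-friend) paths. */
final class SearchArea {
    private SearchArea() {}

    /** Absolute invariant: every user has a fixed number of friends. */
    static long absolute(long friendsPerUser) {
        return friendsPerUser * friendsPerUser;
    }

    /** Relative invariant: every user is friends with a fixed fraction of the user base. */
    static long relative(long userBase, double fraction) {
        long friendsPerUser = Math.round(userBase * fraction);
        return friendsPerUser * friendsPerUser;
    }
}
```

Under these assumptions, `SearchArea.absolute(50)` stays at 2,500 however large the user base becomes, while `SearchArea.relative(1000, 0.10)` gives 10,000 and `SearchArea.relative(10000, 0.10)` gives 1,000,000: a tenfold growth in users means a hundredfold growth in search area.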

17. Reducing Read Latency • The Blackadder solution

18. Reducing Read Latency • The Blackadder solution • Improve the Cypher query • Change the model • Use an Unmanaged Extension

19. Improve Cypher Query • Small queries, separated by WITH • Start from low-cardinality nodes • http://thought-bytes.blogspot.co.uk/2013/01/optimizing-neo4j-cypher-queries.html • http://wes.skeweredrook.com/pragmatic-cypher-optimization-2-0-m06/

20. Change the Model • Goal: Do less work (in the query) – By exploring less of the graph • How? Identify inferred relationships – Replace with use-case specific shortcuts

21. Change the Model - From
    MATCH (:Person{username:'ben'})-[:WORKED_ON]->(:Project)<-[:WORKED_ON]-(colleague:Person)

23. Change the Model - To
    MATCH (:Person{username:'ben'})-[:WORKED_WITH]-(colleague:Person)

24. Tradeoff • More expensive writes • More data • Cheaper reads • When to add the new relationship? – With tx – Queue for subsequent tx – Periodic/batch

25. Refactor Existing Data
    MATCH (p1:Person)-[:WORKED_ON]->(:Project)<-[:WORKED_ON]-(p2:Person)
    WHERE NOT ((p1)-[:WORKED_WITH]-(p2))
    WITH DISTINCT p1, p2 LIMIT 10
    MERGE (p1)-[r:WORKED_WITH]-(p2)
    RETURN count(r)

26. Select Batch (same query as slide 25) • Batch size: LIMIT 10

27. Add New Relationship (same query as slide 25)

28. Continue While count(r) > 0 (same query as slide 25)
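
The "continue while count(r) > 0" step can be sketched as a small driver loop. This is a minimal sketch, not the presenter's code: `BatchRefactoring` and `runUntilDone` are hypothetical names, and the `LongSupplier` stands in for whatever executes the batched MERGE query in its own transaction and returns `count(r)`.

```java
import java.util.function.LongSupplier;

final class BatchRefactoring {
    private BatchRefactoring() {}

    /**
     * Runs one batch at a time until a batch reports that it changed nothing.
     * Each call to runBatch is assumed to execute the batched query in its own
     * transaction and return count(r).
     * Returns the total number of relationships reported across all batches.
     */
    static long runUntilDone(LongSupplier runBatch) {
        long total = 0;
        long changed;
        do {
            changed = runBatch.getAsLong();
            total += changed;
        } while (changed > 0);
        return total;
    }
}
```

Keeping each batch in a separate, small transaction bounds the transactional state held at any one time; the loop ends with one final batch that finds nothing left to refactor.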

29. Use Unmanaged Extensions • REST API: /db/data/cypher • Extensions: /my-extension/service

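30. RESTful Resource
    // Imports for ObjectMapper (Jackson), CypherExecutor (Neo4j server) and
    // ColleagueFinder (application class) are not shown on the slide.
    import java.io.IOException;
    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.Context;
    import javax.ws.rs.core.MediaType;
    import javax.ws.rs.core.Response;
    @Path("/similar-skills")
    public class ColleagueFinderExtension {
        private static final ObjectMapper MAPPER = new ObjectMapper();
        private final ColleagueFinder colleagueFinder;
        // The server injects the CypherExecutor when it instantiates the resource.
        public ColleagueFinderExtension(@Context CypherExecutor cypherExecutor) {
            this.colleagueFinder = new ColleagueFinder(cypherExecutor.getExecutionEngine());
        }
        // Handles GET /similar-skills/{name} and returns the result as JSON.
        @GET
        @Produces(MediaType.APPLICATION_JSON)
        @Path("/{name}")
        public Response getColleagues(@PathParam("name") String name) throws IOException {
            String json = MAPPER.writeValueAsString(colleagueFinder.findColleaguesFor(name));
            return Response.ok().entity(json).build();
        }
    }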

31. JAX-RS Annotations (same code as slide 30)

32. Inject Database/Cypher Execution Engine (same code as slide 30)

33. 1. Get Close to the Data • Application: MATCH, MATCH, CREATE, DELETE, MERGE, MATCH • Single request, many operations – Reduce network latencies

34. 2. Multiple Implementation Options • REST API • Extensions • Cypher • Traversal Framework • Graph Algo Package • Core API

35. 3. Control Request/Response Format • JSON, CSV, protobuf, etc. – JSON: { users: [ { id: 1234}, { id: 9876} ] } – protobuf: 1a 03 08 96 01 • Domain-specific representations – Compact – Conserve bandwidth

36. 4. Control HTTP Headers • Application • Reverse Proxy • Request: GET /my-extension/service/top-10 • Response: HTTP/1.1 200 OK, Cache-Control: max-age=60

37. 5. Integrate with Backend Systems • Application • REST API • Extensions • RDBMS • LDAP

38. Migrating to Extensions • Re-implement original query inside extension • Modify request/response formats and headers • Refactor implementation to use lower parts of the stack where necessary • Measure, measure, measure

39. Scaling Reads - Throughput

40. Scale Horizontally For High Read Throughput • Application

41. Scale Horizontally For High Read Throughput • Application • Load Balancer • Master • Slave • Slave

42. Scale Horizontally For High Read Throughput • Application • Read Load Balancer • Write Load Balancer • Master • Slave • Slave

43. Configure HAProxy as Read Load Balancer
    global
        daemon
        maxconn 256
    defaults
        mode http
        timeout connect 5000ms
        timeout client 50000ms
        timeout server 50000ms
    frontend http-in
        bind *:80
        default_backend neo4j-slaves
    backend neo4j-slaves
        option httpchk GET /db/manage/server/ha/slave
        server s1 10.0.1.10:7474 maxconn 32 check
        server s2 10.0.1.11:7474 maxconn 32 check
        server s3 10.0.1.12:7474 maxconn 32 check
    listen admin
        bind *:8080
        stats enable

44. Configure HAProxy as Read Load Balancer (same configuration as slide 43)
    Responses to GET /db/manage/server/ha/slave:
    Instance   Status          Body
    Master     404 Not Found   false
    Slave      200 OK          true
    Unknown    404 Not Found   UNKNOWN

45. This Isn’t The Throughput You Were Looking For • Application • MATCH (c:Country{name:'Norway'}) ... • MATCH (c:Country{name:'Zambia'}) ... • MATCH (c:Country{name:'Australia'}) ... • Load Balancer • 1 • 2 • 3

46. Cache Sharding Using Consistent Routing • Application • Load Balancer • A-I → 1 • J-R → 2 • S-Z → 3 • MATCH (c:Country{name:'Brazil'}) ... • MATCH (c:Country{name:'Japan'}) ... • MATCH (c:Country{name:'Zimbabwe'}) ... • (also shown: Norway, Zambia, Australia)
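
The letter-range routing on this slide can be sketched in a few lines. This is only an illustration of the idea: `CountryRouter` and `serverFor` are hypothetical names, and the country names used with it are just examples.

```java
/** Routes a country name to one of three servers using the ranges A-I, J-R, S-Z. */
final class CountryRouter {
    private CountryRouter() {}

    /** Returns the 1-based server number for the given country name. */
    static int serverFor(String country) {
        if (country == null || country.isEmpty()) {
            throw new IllegalArgumentException("country must not be empty");
        }
        char first = Character.toUpperCase(country.charAt(0));
        if (first < 'A' || first > 'Z') {
            throw new IllegalArgumentException("unsupported country name: " + country);
        }
        if (first <= 'I') {
            return 1; // A-I
        }
        if (first <= 'R') {
            return 2; // J-R
        }
        return 3;     // S-Z
    }
}
```

Because the same country always lands on the same server, each server's cache only has to hold its own slice of the graph. HAProxy's `balance url_param` achieves the same consistency by hashing the parameter value rather than by using explicit letter ranges.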

47. Configure HAProxy for Cache Sharding
    global
        daemon
        maxconn 256
    defaults
        mode http
        timeout connect 5000ms
        timeout client 50000ms
        timeout server 50000ms
    frontend http-in
        bind *:80
        default_backend neo4j-slaves
    backend neo4j-slaves
        balance url_param country_code
        server s1 10.0.1.10:7474 maxconn 32
        server s2 10.0.1.11:7474 maxconn 32
        server s3 10.0.1.12:7474 maxconn 32
    listen admin
        bind *:8080
        stats enable

49. Scaling Writes - Throughput

50. Factors Impacting Write Performance • Managing transactional state – Creating and committing are expensive operations • Contending for locks – Nodes and relationships

51. Improving Write Throughput • Delay taking expensive locks • Batch/queue writes

52. Delay Expensive Locks • Identify contended nodes • Involve them as late as possible in a transaction

53. Add Linked List Item + Update Pointers

54. Add Linked List Item + Update Pointers Locked

57. Add Linked List Item

58. Add Linked List

61. Add Pointers Locked
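
The linked-list sequence above can be illustrated with an in-memory analogy. This is a sketch under stated assumptions, not database code: `LateLockingList` is a hypothetical class, and its `ReentrantLock` stands in for the lock a transaction would take on the contended node that holds the list pointers.

```java
import java.util.concurrent.locks.ReentrantLock;

/** In-memory analogy: build the new item first, lock the contended head last. */
final class LateLockingList {
    static final class Item {
        final String payload;
        Item next;
        Item(String payload) { this.payload = payload; }
    }

    // Stands in for the lock on the contended node that owns the list pointers.
    private final ReentrantLock headLock = new ReentrantLock();
    private Item head;
    private int size;

    void add(String rawPayload) {
        // Step 1: uncontended work (creating the item) happens with no lock held.
        Item item = new Item(rawPayload.trim());
        // Step 2: take the contended lock as late as possible, only to update pointers.
        headLock.lock();
        try {
            item.next = head;
            head = item;
            size++;
        } finally {
            headLock.unlock();
        }
    }

    int size() {
        headLock.lock();
        try { return size; } finally { headLock.unlock(); }
    }

    String newest() {
        headLock.lock();
        try { return head == null ? null : head.payload; } finally { headLock.unlock(); }
    }
}
```

The lock is held only for the three pointer assignments, so concurrent writers queue behind each other for as short a time as possible.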

62. Batch Writes • Multiple CREATE/MERGE statements per request – Good for integration with backend systems • Queue – Good for small, online transactions

63. Single-Threaded Queue • Write, Write, Write → Queue → Single Thread → Batch
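
A minimal sketch of the queue-and-batch idea, assuming writes can be represented as strings. `WriteQueue`, `submit` and `nextBatch` are hypothetical names; a real writer thread would apply each returned batch inside one database transaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class WriteQueue {
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();

    /** Called by any number of request threads; only enqueues, never writes. */
    void submit(String write) {
        pending.add(write);
    }

    /**
     * Called by the single writer thread: removes up to maxBatchSize queued
     * writes, in arrival order, so they can be applied in one transaction.
     * Returns an empty list when nothing is queued.
     */
    List<String> nextBatch(int maxBatchSize) {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch, maxBatchSize);
        return batch;
    }
}
```

Because only one thread ever applies batches, writes never contend with each other for locks; the cost is that a write waits in the queue until its batch is processed.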

64. Queue Location Options • Application • Application

65. Benefits of Batched Writes • Less transactional state management – Create/commit per batch rather than per write • No contention for locks – No deadlocks • Query consolidation – Reduce the amount of work inside the database

66. Query Consolidation
    MATCH sam
    MATCH jenny
    CREATE sam-[:KNOWS]-jenny
    MATCH sam
    MATCH sarah
    CREATE sam-[:KNOWS]-sarah
    CREATE address1
    CREATE address2
    DELETE address1
    MATCH sam
    CREATE sam-[:LIVES_AT]-address2

67. Eliminate Duplicate Lookups (same statements as slide 66)

69. Eliminate Duplicate Lookups
    MATCH sam
    MATCH jenny
    CREATE sam-[:KNOWS]-jenny
    MATCH sarah
    CREATE sam-[:KNOWS]-sarah
    CREATE address1
    CREATE address2
    DELETE address1
    CREATE sam-[:LIVES_AT]-address2

71. Eliminate Unnecessary Writes (same statements as slide 69)

73. Eliminate Unnecessary Writes
    MATCH sam
    MATCH jenny
    CREATE sam-[:KNOWS]-jenny
    MATCH sarah
    CREATE sam-[:KNOWS]-sarah
    CREATE address2
    CREATE sam-[:LIVES_AT]-address2
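
The two consolidation steps can be sketched over the simplified "VERB target" statements used on these slides. This is an illustrative sketch only: `QueryConsolidator` is a hypothetical class, it treats statements as plain strings rather than real Cypher, and it only cancels a CREATE against a later DELETE of exactly the same target.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class QueryConsolidator {
    private QueryConsolidator() {}

    /** Consolidates a batch of "VERB target" statements, preserving order. */
    static List<String> consolidate(List<String> statements) {
        // Pass 1: find targets that are created and later deleted in the same batch.
        Set<String> created = new HashSet<>();
        Set<String> createdThenDeleted = new HashSet<>();
        for (String statement : statements) {
            String[] parts = statement.split(" ", 2);
            if (parts[0].equals("CREATE")) {
                created.add(parts[1]);
            } else if (parts[0].equals("DELETE") && created.contains(parts[1])) {
                createdThenDeleted.add(parts[1]);
            }
        }
        // Pass 2: drop repeated lookups and writes that cancel each other out.
        Set<String> alreadyMatched = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String statement : statements) {
            String[] parts = statement.split(" ", 2);
            boolean isMatch = parts[0].equals("MATCH");
            if (isMatch && !alreadyMatched.add(parts[1])) {
                continue; // duplicate lookup
            }
            if (!isMatch && createdThenDeleted.contains(parts[1])) {
                continue; // unnecessary write
            }
            result.add(statement);
        }
        return result;
    }
}
```

Applied to the eleven statements on slide 66, this reduces the batch to the seven statements shown on slide 73.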

74. Tradeoff • Latency • Higher throughput • In-memory or durable queues? – Lost writes in event of crash – Transactional dequeue?

75. Further Reading • http://maxdemarzi.com/2013/09/05/scaling-writes/ • http://maxdemarzi.com/2014/07/01/scaling-concurrent-writes-in-neo4j/

76. Hardware

77. Memory • SLC (single-level cell) SSD w/SATA • Lots of RAM – 8-12G heap – Explicitly memory-map store files

78. Object Cache • 2G for 12G heap • No object cache – consistent throughput at expense of latency

79. AWS • HVM (hardware virtual machine) over PV (paravirtual) • EBS-optimized instances • Provisioned IOPS
