Cassandra Summit 2014: Launching PlayStation 4 with Apache Cassandra

Published on September 29, 2014

Author: planetcassandra

Source: slideshare.net

Description

Presenters: Alexander Filipchik and Dustin Pham, Staff Software Engineers at Sony Network Entertainment

Since the launch of the PlayStation 4, many of the PSN features have been delivered using Cassandra. We will be talking about our experience as we launched one of the most popular gaming consoles in the world on well over 300 nodes.
- Why we picked Cassandra
- Exactly what PSN features for PS4 are powered by Cassandra
- The infrastructure used to deploy our clusters
- How we monitor system health
- How we design, test and deploy
- Issues we faced and lessons learned along the way

Launching PS4 with Cassandra

Introduction • Alexander Filipchik – Staff Software Engineer at SNEI • Dustin Pham – Staff Software Engineer at SNEI

Agenda • Journey towards Cassandra • Cassandra-backed PS4 Features • Ops-y Stuff • Lessons learned

Journey towards Cassandra

Challenges • Small Team • Legacy Support • Hardware Deadline • Scaling @ Peak Time

Why Cassandra • Strong community • Horizontally scalable architecture • Good performance • Cost effective • New adventure :)

PS4 Features backed by Cassandra

Cassandra-backed PS4 features • What’s New • Video Library • My Library • PS Now • Notifications • LiveArea • Store catalog • Pre-order • PS Plus • Recommendations • Remote Download • Share • Authentication • +more

Ops-y Stuff

Infrastructure • Hosted in cloud and physical DCs • Several hundred nodes and growing • Cluster by feature • Vnodes and Assigned token clusters • Astyanax Client

Stats for PS4 cloud nodes • Data throughput: gigabytes / sec • Cassandra reads/writes: > 200,000 / sec • Data size: tens of terabytes • 10M PS4 and 80M PS3 sold

Clusters • Cluster per read/write pattern initially • Now use cluster per feature • Seeds referenced by DNS names • Size-tiered compaction • Manual compactions for some CFs

A typical node • m2.4xl + i2.2xl • 2 ephemeral disks (~ 2 x 800 GB) • Commit log on root partition • Topology defined in the topology file, managed by Chef

AWS • Nodes are interleaved between AZs – Replication factor spreads data across AZs – Minimizes downtime due to an AZ outage (Diagram labels: Availability Zone A, Availability Zone C)
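A minimal sketch of why interleaving matters, assuming a simplified placement rule in which the replicas of a key are simply the next RF nodes clockwise on the ring. The helper names, the two-nodes-per-AZ ring and RF = 2 are assumptions for illustration; only the AZ labels A and C come from the slide.

```python
def build_ring(azs, nodes_per_az, interleave=True):
    """Return the AZ label of each node, in ring (token) order."""
    if interleave:
        # A, C, A, C, ...
        return [az for _ in range(nodes_per_az) for az in azs]
    # A, A, ..., C, C, ...
    return [az for az in azs for _ in range(nodes_per_az)]


def replica_azs(ring, start, rf):
    """AZs of the rf consecutive nodes starting at ring position start."""
    return [ring[(start + i) % len(ring)] for i in range(rf)]


def min_distinct_azs(ring, rf):
    """Smallest number of distinct AZs covered by any replica set."""
    return min(len(set(replica_azs(ring, s, rf))) for s in range(len(ring)))


interleaved = build_ring(["A", "C"], nodes_per_az=2, interleave=True)
grouped = build_ring(["A", "C"], nodes_per_az=2, interleave=False)

print(interleaved, min_distinct_azs(interleaved, rf=2))
print(grouped, min_distinct_azs(grouped, rf=2))
```

With the interleaved ring every replica set spans both AZs, so losing one AZ still leaves a copy of every key. With the grouped ring some replica sets sit entirely inside one AZ.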

Disk Layout (2 ephemeral disks, Eph0 and Eph1, on AWS m2.4xl)
• Pre-Launch: 2 ephemerals in RAID 0 – Higher throughput (I/O spreads across 2 devices for reading and writing) – If you lose 1 device, you lose the array!
• Launch: 2 ephemerals in RAID 1 – Higher throughput for reading (I/O spreads across 2 devices), but not for writing – If you lose 1 device, the array stays up in degraded mode – ½ the available space
• Current: 2 individual ephemerals – Higher throughput (I/O spreads across 2 devices for reading and writing) – If you lose 1 device, Cassandra stops (configurable) – No RAID overhead
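The capacity trade-off between the three layouts can be worked through with a little arithmetic. This is a rough sketch assuming two disks of about 800 GB each, as on the "typical node" slide; the function name and the returned fields are hypothetical.

```python
def layout_summary(layout, disks=2, disk_gb=800):
    """Usable space and local data lost when one device fails."""
    total = disks * disk_gb
    if layout == "raid0":
        # Striping: all space usable, but one failed device loses the array.
        return {"usable_gb": total, "lost_on_one_failure_gb": total}
    if layout == "raid1":
        # Mirroring: half the space, nothing lost when one device fails.
        return {"usable_gb": disk_gb, "lost_on_one_failure_gb": 0}
    if layout == "individual":
        # Separate data directories: all space usable, one disk's worth lost.
        return {"usable_gb": total, "lost_on_one_failure_gb": disk_gb}
    raise ValueError(f"unknown layout: {layout}")


for name in ("raid0", "raid1", "individual"):
    print(name, layout_summary(name))
```

Individual disks keep the full capacity of RAID 0 while limiting the loss from a single device failure to that device's data, which then has to be restored from the other replicas.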

Cluster Resizing

Thrift Payload Size • thrift_framed_transport_size_in_mb • thrift_max_message_length_in_mb

Bouncing Nodes • phi_convict_threshold
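To see how phi_convict_threshold relates to nodes bouncing, here is a simplified sketch of a phi accrual failure detector. It assumes heartbeat gaps follow an exponential distribution, and it is not Cassandra's exact implementation; the function names and the one-second mean heartbeat interval are assumptions. The default threshold of 8 is Cassandra's documented default.

```python
import math


def phi(seconds_since_last_heartbeat, mean_interval_seconds):
    """Suspicion level: -log10 of the chance the peer is merely late."""
    return (seconds_since_last_heartbeat / mean_interval_seconds) / math.log(10)


def silence_before_conviction(threshold, mean_interval_seconds):
    """Seconds of silence after which phi reaches the threshold."""
    return threshold * math.log(10) * mean_interval_seconds


for threshold in (8, 12):  # 8 is the default; 12 is an illustrative higher value
    seconds = silence_before_conviction(threshold, mean_interval_seconds=1.0)
    print(f"threshold {threshold}: convicted after ~{seconds:.1f}s of silence")
```

Under this model a higher threshold tolerates a longer gap between heartbeats before a node is marked down, so brief network hiccups are less likely to make nodes flap, at the cost of slower detection of real failures.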

Inter-DC Latency

Monitor system health • Nagios • Kibana/Elasticsearch • Graphite • AWS CloudWatch • App level monitoring • OpsCenter

App level metrics

Lessons Learned

Fun with Astyanax Client • Cross-DC latencies – Multi-second latencies in JP and EE data centers – Astyanax configs to ensure the local data center is used • Imbalanced node traffic – Hashing algorithm (MD5 vs Murmur3) • DNS caching in the JVM – Stale seed nodes

A tale of 2 Nodes

Cluster lessons • A single bad node can raise app latencies significantly • Taking out an entire Cassandra cluster is easy (not so fun) – Compressing data before sending it to Cassandra helps a lot • A corrupted SSTable resulted in a cascading failure
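A minimal sketch of client-side compression before a write. The slides do not say which codec or serialization format was used, so zlib, JSON and the `pack` / `unpack` helpers are assumptions for illustration only.

```python
import json
import zlib


def pack(record: dict) -> bytes:
    """Serialize a record and compress it before it is written as a blob."""
    raw = json.dumps(record, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw, 6)


def unpack(blob: bytes) -> dict:
    """Reverse of pack(): decompress and deserialize a stored blob."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical payload with the kind of repetition that compresses well.
record = {"user": "u1", "items": [{"title": "game", "status": "installed"}] * 50}
raw_size = len(json.dumps(record, separators=(",", ":")).encode("utf-8"))
blob = pack(record)
print(raw_size, "bytes raw,", len(blob), "bytes compressed")
```

Smaller payloads mean less network traffic per request and less data to flush and compact on the cluster side, in exchange for some CPU on the application servers.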

• Monitoring – Memtable flush frequency – Hinted handoffs – Garbage collection – Compactions – Histograms

• VPNs are a dangerous bottleneck • Easier to rebuild a node than to fix it • Back up data – Replication factor helps but does not account for data corruption

• Denormalization costs • Disk is cheap but EC2 instances are not • TTL on almost everything • Adjust gc_grace_seconds based on TTL times • Transactions? Be creative • Load test with real data

• Replication strategy: – Read / write pattern – Data is source of truth or not – Data locality – User-level data vs app-level data • Cluster-wide commands should be staggered – Global repair :(
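One way to stagger a cluster-wide command is to give each node its own start time instead of launching everywhere at once. This sketch only builds the schedule. The host names, the two-hour gap and the `stagger` helper are hypothetical, and the slides do not describe how the team actually scheduled repairs.

```python
from datetime import datetime, timedelta


def stagger(hosts, first_start, gap):
    """Return (host, start_time) pairs, one host starting per gap."""
    return [(host, first_start + i * gap) for i, host in enumerate(hosts)]


hosts = ["cass-a1", "cass-c1", "cass-a2"]  # hypothetical node names
plan = stagger(hosts, datetime(2014, 9, 29, 2, 0), timedelta(hours=2))
for host, start in plan:
    print(f"{start:%Y-%m-%d %H:%M}  run maintenance on {host}")
```

The gap should be at least as long as the command usually takes on one node, so that two nodes are not doing the heavy work at the same time.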

Tokens • Vnodes vs assigned tokens – Increased chattiness on the gossip protocol with vnodes – Perceived slowness of repair and cleanup operations on vnode-enabled clusters – Astyanax client does not like vnodes…
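For assigned-token clusters, each node gets one evenly spaced initial token. The sketch below uses the commonly documented spacing formulas for Murmur3Partitioner (token range -2^63 to 2^63 - 1) and RandomPartitioner (MD5-based, 0 to 2^127); the `initial_tokens` helper and the four-node example are assumptions for illustration.

```python
def initial_tokens(node_count, partitioner="murmur3"):
    """Evenly spaced tokens, one per node, for a single-token-per-node ring."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    if partitioner == "murmur3":
        step = 2**64 // node_count
        return [i * step - 2**63 for i in range(node_count)]
    if partitioner == "random":
        step = 2**127 // node_count
        return [i * step for i in range(node_count)]
    raise ValueError(f"unknown partitioner: {partitioner}")


for node, token in enumerate(initial_tokens(4)):
    print(f"node {node}: initial_token = {token}")
```

With assigned tokens these values must be recomputed and nodes moved whenever the cluster is resized, which is the operational cost that vnodes are meant to remove.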

Compactions • Compactions are your worst enemy – Larger disk usage = high CPU and longer compactions • Leveled compaction vs size-tiered compaction – Startup time – CPU tradeoff – I/O tradeoff • Updates + removals eat up disks
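A simplified sketch of how size-tiered compaction groups SSTables of similar size. It is not Cassandra's actual code: the bucketing rule is reduced to "within 0.5x to 1.5x of the bucket average", and a bucket becomes a compaction candidate once it holds at least four tables. Those numbers mirror the usual bucket_low, bucket_high and min_threshold defaults; the function names and sizes are hypothetical.

```python
def bucket_sstables(sizes_mb, bucket_low=0.5, bucket_high=1.5):
    """Group SSTable sizes into buckets of roughly similar size."""
    buckets = []
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if avg * bucket_low < size < avg * bucket_high:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size: start a new one.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes_mb, min_threshold=4):
    """Buckets that hold enough SSTables to be compacted together."""
    return [b for b in bucket_sstables(sizes_mb) if len(b) >= min_threshold]


sizes = [10, 11, 9, 12, 100, 110, 1000]
print(bucket_sstables(sizes))
print(compaction_candidates(sizes))
```

Because large SSTables wait until enough peers of similar size exist, overwritten and deleted rows can linger in several tables at once, which is one reason updates and removals consume disk.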

We are hiring… sonyentertainmentnetwork.com/careers
