Published on March 13, 2014
Big Data and the Future of Storage
Dr. Evangelos Eleftheriou, IBM Fellow
© 2014 International Business Machines Corporation
Big Data: More than just volume
• Volume: Terabytes to exabytes of existing data to store and process
• Velocity: Streaming data, milliseconds to seconds to respond
• Variety: Structured, unstructured, text & multimedia
• Veracity: Uncertainty from inconsistency, ambiguities, etc.
Memory / Storage Stack
Significant advancements in non-volatile memories blur the boundaries between storage and memory by being fast and cost-effective.
[Figure: access time in ns on a logarithmic axis from 1 to 10^10, running from FAST to SLOW and grouped into Memory, SCM, and Storage, with a human-scale reference axis from second to century]
• CPU operation: 1 ns
• Fetch data from the L2 cache: 10 ns
• Fetch data from DRAM: 60 ns
• Read / write SCM: 100 to 2000 ns
• Read from a Flash device: 20 μs
• Write to a Flash device: 1 ms
• Read from / write to disk: 5 ms
• Read from / write to tape: 40 s
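The figure pairs nanosecond latencies with human-scale labels (second, minute, and so on). One common reading of such a chart, assumed here rather than stated on the slide, is to pretend that 1 ns lasts 1 second. A minimal Python sketch of that rescaling follows; the names `LATENCY_NS`, `UNITS`, and `human_scale` are illustrative, and a 365-day year is assumed.

```python
# Latencies from the slide, in nanoseconds. Where the slide gives a range
# (SCM: 100 to 2000 ns), both ends are kept as separate entries.
LATENCY_NS = {
    "CPU operation": 1,
    "L2 cache fetch": 10,
    "DRAM fetch": 60,
    "SCM read/write (low)": 100,
    "SCM read/write (high)": 2_000,
    "Flash read": 20_000,               # 20 microseconds
    "Flash write": 1_000_000,           # 1 ms
    "Disk read/write": 5_000_000,       # 5 ms
    "Tape read/write": 40_000_000_000,  # 40 s
}

# Human-scale units in seconds, largest first (365-day year is an assumption).
UNITS = [
    ("year", 365 * 24 * 3600),
    ("day", 24 * 3600),
    ("hour", 3600),
    ("minute", 60),
    ("second", 1),
]


def human_scale(latency_ns: int) -> str:
    """Express a latency in human time, pretending 1 ns lasts 1 second."""
    seconds = latency_ns  # assumed mapping: 1 ns -> 1 s
    for name, size in UNITS:
        if seconds >= size:
            return f"{seconds / size:.1f} {name}s"
    return f"{seconds} seconds"


for label, ns in LATENCY_NS.items():
    print(f"{label:24s} {ns:>14,d} ns  ~ {human_scale(ns)}")
```

On this scale a DRAM fetch takes about a minute, while a Flash read takes several hours, which is the gap storage-class memory is meant to fill.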
IBM FlashSystem: The Tipping Point
• The first time Flash storage outperformed hard disks in all aspects, including capacity and performance density, cost per I/O operation per second (IOPS), and energy efficiency
• With IBM FlashSystem 820 we achieved more than 6 million IOPS running an IBM DB2 workload on IBM Power servers
• 19 kilowatts vs. 4.5 megawatts with high-capacity hard disks, 236x better
• Installed in less than 48 hours; HDD would require two years
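The 236x figure can be checked directly from the two power numbers on the slide. The short Python sketch below does only that arithmetic; the variable names are illustrative.

```python
# Power figures quoted on the slide.
flash_kw = 19.0        # IBM FlashSystem configuration
hdd_kw = 4.5 * 1000.0  # 4.5 MW for high-capacity hard disks, in kW

# Ratio of HDD power draw to Flash power draw.
power_ratio = hdd_kw / flash_kw
print(f"HDD draws about {power_ratio:.1f}x the power of Flash")
```

The ratio comes out just under 237, consistent with the slide's rounded-down 236x.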
Storage infrastructure in the new era
[Diagram; labels recovered from the slide]
• High IOPS, Ultra-low Latency, Memory-speed Data Processing
• Infrastructure Elasticity, Reduced time for deployment, Bottomless Capacity, Optimized TCO
• Integrated global infrastructure, Simplicity, Security, Reliability, Pay-as-you-go, Extreme Scalability
• MicroLatency Storage: Block Interface, Novel Interfaces
• Global File System (GPFS): POSIX, NFS, CIFS, Object
• GPFS Active File Management
Tape Storage: Big Data Needs Tape
• Faster than hard disks at streaming
• Reliability: read after three decades, against five years for disks
• Zero power consumption (when idle)
• Security: 50 petabytes on an HDD can be deleted in minutes; tape would take years
• Cost: 1 GB of disk storage costs 10 cents, versus 4 cents for tape
• The next target: the 100 TB cartridge
Source: The Economist, "Magnetic tape to the rescue", 30 Nov 2013
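To put the per-gigabyte prices in context, the Python sketch below applies them to a 50 PB archive. The 50 PB capacity is borrowed from the security bullet purely as an example size, decimal units (1 PB = 1,000,000 GB) are assumed, and only raw media cost is counted; drives, libraries, power, and operations are ignored. The helper `media_cost_usd` is illustrative.

```python
# Per-gigabyte media prices quoted on the slide, in US cents.
DISK_CENTS_PER_GB = 10
TAPE_CENTS_PER_GB = 4

GB_PER_PB = 1_000_000  # decimal units assumed


def media_cost_usd(capacity_pb: int, cents_per_gb: int) -> float:
    """Raw media cost in dollars for a given capacity (media only)."""
    return capacity_pb * GB_PER_PB * cents_per_gb / 100


capacity_pb = 50  # example archive size, an assumption for illustration
disk_cost = media_cost_usd(capacity_pb, DISK_CENTS_PER_GB)
tape_cost = media_cost_usd(capacity_pb, TAPE_CENTS_PER_GB)

print(f"Disk media: ${disk_cost:,.0f}")
print(f"Tape media: ${tape_cost:,.0f}")
print(f"Difference: ${disk_cost - tape_cost:,.0f}")
```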
Multi-cloud Storage Tool Kit (live demo in CeBIT booth)
• Can connect to one or more clouds (public, private, or hybrid), including IBM SoftLayer, AWS, Azure, and Rackspace
• Enterprise features, including encryption, integrity, and resiliency; transparent to GPFS, and manages keys and metadata
• Drag-and-drop usability
Phase Change Memory (PCM)
A universal, non-volatile memory technology superior to Flash. PCM is very durable and can endure at least 10 million write cycles, compared with 30,000 cycles for current enterprise-class flash and 3,000 cycles for consumer-class flash.
Timeframe:
• DRAM: invented in 1966
• Flash: invented in 1980s
• PCM: used in smartphones; wide adoption expected by 2016
• Multi-Level PCM: 2016
[Table: DRAM, Flash, PCM, and Multi-Level PCM rated on Speed, Density, Endurance, Retention, and Scaling, on a scale of Best in class, Good / Adequate, Average / Inadequate, Bad / Worst in class]
➢ The qualification is relative and depends on the application
➢ Racetrack memory is not included in this time horizon
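The endurance figures can be turned into a rough device lifetime. The Python sketch below is a back-of-the-envelope estimate only: it assumes perfect wear leveling and a hypothetical workload of 10 full-device writes per day, neither of which comes from the slide. The names `ENDURANCE_CYCLES` and `lifetime_years` are illustrative.

```python
# Write-endurance figures quoted on the slide (cycles per cell).
ENDURANCE_CYCLES = {
    "PCM": 10_000_000,            # "at least" 10 million
    "Enterprise flash": 30_000,
    "Consumer flash": 3_000,
}

DAYS_PER_YEAR = 365


def lifetime_years(cycles: int, device_writes_per_day: float) -> float:
    """Years until cells wear out, assuming perfect wear leveling."""
    return cycles / device_writes_per_day / DAYS_PER_YEAR


writes_per_day = 10  # hypothetical workload, an assumption for illustration
for name, cycles in ENDURANCE_CYCLES.items():
    years = lifetime_years(cycles, writes_per_day)
    vs_pcm = ENDURANCE_CYCLES["PCM"] / cycles
    print(f"{name:18s} {years:10.1f} years  (PCM endures {vs_pcm:,.0f}x as many cycles)")
```

Under these assumptions enterprise flash lasts roughly eight years and consumer flash under one, while PCM's endurance stops being the limiting factor.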
Theseus Project: PCM-based PCI-e Prototype Card
Hybrid Storage/Caching subsystem based on PCM
• Design and implementation of a high-performance PCI-e card using PCM
• Consistent ultra-low latency, high IOPS even for very small operations
• Integration of PCM and Flash for hybrid use cases:
  • Caching
  • Tiering
  • Persistent Key/Value Store
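The slide does not describe how the Theseus card combines PCM and Flash. As a generic illustration of the caching use case only, the Python sketch below models a small fast tier (standing in for PCM) in front of a larger slow tier (standing in for Flash), with write-through updates and least-recently-used eviction. The class `HybridStore` and all of its details are hypothetical and are not the project's design.

```python
from collections import OrderedDict


class HybridStore:
    """Toy key/value store: a small fast tier caching a larger slow tier."""

    def __init__(self, fast_capacity: int):
        self.fast = OrderedDict()  # stand-in for the PCM tier, kept in LRU order
        self.slow = {}             # stand-in for the Flash tier
        self.fast_capacity = fast_capacity
        self.hits = 0
        self.misses = 0

    def put(self, key, value):
        # Write-through: the slow tier always holds the full data set.
        self.slow[key] = value
        self._promote(key, value)

    def get(self, key):
        if key in self.fast:
            self.hits += 1
            self.fast.move_to_end(key)  # mark as most recently used
            return self.fast[key]
        self.misses += 1
        value = self.slow[key]  # raises KeyError if the key was never stored
        self._promote(key, value)
        return value

    def _promote(self, key, value):
        # Place the entry in the fast tier and evict the least recently
        # used entries if the tier is over capacity.
        self.fast[key] = value
        self.fast.move_to_end(key)
        while len(self.fast) > self.fast_capacity:
            self.fast.popitem(last=False)


store = HybridStore(fast_capacity=2)
for k in "abc":
    store.put(k, k.upper())

for k in ("a", "c", "b"):
    store.get(k)

print(f"hits={store.hits} misses={store.misses} fast tier={list(store.fast)}")
```

A tiering policy would differ in that data lives in exactly one tier at a time, rather than the fast tier holding copies of entries that also sit in the slow tier.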