Massively Parallel Processing with Procedural Python (PyData London 2014)


Published on February 22, 2014

Author: ihuston

Source: slideshare.net

Description

The Python data ecosystem has grown beyond the confines of single machines to embrace scalability. Here we describe one of our approaches to scaling, which is already used in production systems. The goal of in-database analytics is to bring the calculations to the data, reducing transport costs and I/O bottlenecks. With PL/Python we can run parallel queries across terabytes of data using not only pure SQL but also familiar PyData packages such as scikit-learn and nltk. The same approach works with PL/R, which opens up a wide variety of R packages. We look at examples on Postgres-compatible systems such as the Greenplum Database and on Hadoop through Pivotal HAWQ. We also introduce MADlib, Pivotal’s open source library for scalable in-database machine learning, which uses Python to glue SQL queries to low-level C++ functions and is also usable through the PyMADlib package.
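To make the idea of bringing the calculations to the data concrete, here is a minimal sketch of what a PL/Python function might look like. The function name `token_count`, the `reviews` table, and its `body` column are hypothetical and do not come from the talk. The SQL is shown as strings only. The plain Python below it mirrors the function body and simulates each database segment computing a partial result locally before the partials are combined.

```python
# Illustrative sketch only: token_count, the reviews table and its body
# column are hypothetical names, not taken from the talk.

# SQL wrapper that would register the function in the database.
# plpythonu is the PL/Python language name in Greenplum and older
# PostgreSQL releases; newer PostgreSQL uses plpython3u.
CREATE_FUNCTION_SQL = """
CREATE FUNCTION token_count(body text) RETURNS integer AS $$
    if body is None:          # SQL NULL arrives as Python None
        return 0
    return len(body.split())
$$ LANGUAGE plpythonu;
"""

# Query that would apply the function to every row, next to the data.
QUERY_SQL = "SELECT sum(token_count(body)) FROM reviews;"


def token_count(body):
    """Same logic as the PL/Python body above, as plain Python."""
    if body is None:
        return 0
    return len(body.split())


def run_on_segments(segments):
    """Simulate an MPP query: each segment processes only its own rows
    and returns one partial sum, then the partial sums are combined."""
    partial_sums = [sum(token_count(row) for row in rows) for rows in segments]
    return sum(partial_sums)


if __name__ == "__main__":
    # Two pretend segments, each holding part of the table.
    segments = [
        ["bring the calculations to the data", None],
        ["parallel queries across terabytes"],
    ]
    print(run_on_segments(segments))
```

In a real deployment the database, not a Python loop, distributes the rows and runs the function on each segment. The simulation only shows why a single small partial result per segment has to travel over the network instead of the raw rows.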
