Published on March 6, 2014
Real Time Recommender System with Kiji
Jan 22, 2014
Daqing Zhao, Director of Advanced Analytics, Macy’s.com

Agenda
• Big data analytics versus traditional BI
• Macy’s Advanced Analytics Team
• Our analytics projects
• Example: site recommendations using Kiji
• High level architecture
• Kiji Schema table structure
• Model deployment using Kiji
• Key benefits of Kiji and WibiData team

Traditional BI process
[Diagram: Baseline Consulting. Levels: Knowledge Discovery; Segmentation and Predictive Modeling; Multidimensional Report; Standard Report; Schema definition, ETL into RDBMS. Annotation: "Most companies stay in this area".]
• Data can be accessed and analyzed only after ETL
• Schema definition may not be optimal

Hadoop/NoSQL: paradigm shift
[Diagram labels: Decisions, Insights, Models; Decision Agent; Segmentation and Predictive Modeling; Multidimensional Report; Reports; Standard Report; Hive, Mahout, Cascading, Scalding, Kiji, …; MapReduce; Raw data; Volume, Velocity, Variety; Write, Append, Read; Distributed storage; Computation near data; Hadoop, HBase, Avro, …]
• We can access raw data and analyze it using MapReduce
• With pros and cons

Macy’s.com Advanced Analytics Group
We are at the frontiers of Big Data science:
• Using Big Data technology
• Machine learning and statistical algorithms
We have predictive modeling, experimental design and data science teams
Our team members have very strong backgrounds in
• Quantitative fields: math, stat, physics, bioinformatics, decision sciences, and CS
• We collaborate with systems and IT teams internally as well as 3rd party vendors like WibiData, SAS Research, IBM Research…
We use a wide range of tools
• Hadoop, SAS, R, Mahout, and others, as well as Kiji Models
We are data scientists with a keen focus on domain problems

Customer acquisition and retention
Targeting the right message to the right customer at the right time
• Build predictive models of purchase behavior and identify drivers
Site recommendation algorithms
• Recommend products based on items that are added to bag for cross- and up-sell
• We also look at market basket analysis
• Most work is in batch mode, expanding slowly into real time
Rapid prototyping and testing of algorithms and policies
• All done in short development cycles
The team’s output supports other marketing teams in identifying and reaching the best customers
• Search, display, social network, affiliates, retention, customer services, …

Some other projects
Data organization or data munging
• Data collections, individual and event level, 360 degrees, …
• Segmentation of customers
• Customer value, revenue, costs
• Multiple channel attribution of marketing contacts
• Product attributes
Experimentation platform
• Success of online marketing depends highly on testing, learning and optimization
• Both for site layout as well as contents and recommendations
Forecast and optimization
• Prediction, simulation, and search and optimize
Big data refinement and scalability
• New data sources, more efficient ways of accessing data, and organizing and processing data

Example: similar and complementary products

Example: customer segmentation
• Demographic
• Socio-economic
• Behavioral
• Values and styles
• Channels
• Modality

Example: product social network
• Demographic
• Style
• Size
• Brand
• Price range
• Season

Example: site product recommendation
Customer adds one or more products to bag
We recommend similar/complementary products in real time
• Based on product associations and customer profile
We use various machine learning algorithms
• Association rules
• Collaborative filtering
• Predictive modeling
• Business rules
• And others, …
Models built offline
Real time data, real time model scoring and real time decision
Champion/challenger tests, models evolve quickly in time
Frequent model updates, add new data

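To make the add-to-bag flow concrete, here is a minimal sketch of the association-rules option only, assuming pairwise rules scored by confidence. The order data, function names, and thresholds are hypothetical illustrations, not Macy’s production models or any Kiji API. Rules are built offline from order history, then looked up at request time against the items in the bag.

```python
from collections import Counter
from itertools import combinations

# Toy order history (made-up data, for illustration only).
ORDERS = [
    {"dress", "heels", "clutch"},
    {"dress", "heels"},
    {"dress", "clutch"},
    {"jeans", "sneakers"},
    {"jeans", "sneakers", "t-shirt"},
]


def build_rules(orders, min_confidence=0.5):
    """Offline step: confidence(a -> b) = count(a and b) / count(a)."""
    item_counts = Counter()
    pair_counts = Counter()
    for order in orders:
        item_counts.update(order)
        for a, b in combinations(sorted(order), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    rules = {}
    for (a, b), n in pair_counts.items():
        confidence = n / item_counts[a]
        if confidence >= min_confidence:
            rules.setdefault(a, {})[b] = confidence
    return rules


def recommend(bag, rules, top_n=3):
    """Online step: rank candidates by the best rule confidence from any bag item."""
    scores = {}
    for item in bag:
        for candidate, confidence in rules.get(item, {}).items():
            if candidate not in bag:
                scores[candidate] = max(scores.get(candidate, 0.0), confidence)
    # Sort by descending confidence, then by name for a stable order.
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_n]


RULES = build_rules(ORDERS)
RECS = recommend({"dress"}, RULES)
```

Splitting `build_rules` from `recommend` mirrors the slide’s split between models built offline and scoring in real time: the expensive counting runs in batch, and the request path is a dictionary lookup.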
Architecture
[Diagram: data mining environments (Kiji Express, Mahout, R, SAS, others) feed Kiji Model; Kiji Scoring and Kiji REST sit on top of Hadoop and HBase and provide real time data access, scoring, and decisions.]

Kiji Schema table structure
[Diagram labels: Customer table: entity id, customer, email, metadata, order. Product table: entity id, product, category, metadata, inventory.]
• Schemas have column names and types, compared to the raw bytes stored in HBase
• Group column families are structured, while Map column families are flexible
• Accessible as collections from Kiji Express
• Scala code focuses on model and business logic
• Scalding underneath takes care of generating MapReduce jobs

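The difference between group and map column families can be sketched in plain Python. This is a conceptual model only and does not use the KijiSchema API; the class names, family names, and columns below are hypothetical. A group family declares a fixed set of typed columns up front, while a map family accepts any qualifier as long as every value has the same type.

```python
class GroupFamily:
    """Structured family: a fixed set of named, typed columns."""

    def __init__(self, columns):
        self.columns = dict(columns)  # column name -> expected Python type
        self.cells = {}

    def put(self, column, value):
        if column not in self.columns:
            raise KeyError(f"undeclared column: {column}")
        if not isinstance(value, self.columns[column]):
            raise TypeError(f"{column} expects {self.columns[column].__name__}")
        self.cells[column] = value


class MapFamily:
    """Flexible family: arbitrary qualifiers sharing one value type."""

    def __init__(self, value_type):
        self.value_type = value_type
        self.cells = {}

    def put(self, qualifier, value):
        if not isinstance(value, self.value_type):
            raise TypeError(f"values must be {self.value_type.__name__}")
        self.cells[qualifier] = value


# Hypothetical customer row: a structured "metadata" family
# and an open-ended "order" family keyed by order id.
metadata = GroupFamily({"email": str, "zip_code": str})
orders = MapFamily(float)

metadata.put("email", "customer@example.com")
orders.put("order-1001", 59.99)   # any new order id is accepted
orders.put("order-1002", 120.00)

try:
    metadata.put("shoe_size", "8")  # not declared in the group family
    rejected = False
except KeyError:
    rejected = True
```

The group family rejects the undeclared `shoe_size` column, while the map family keeps growing with each new order id, which is the structured-versus-flexible trade-off the slide describes.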
Model Build and Deployment
[Diagram: model building happens offline in Kiji Modeling, R, SAS, Mahout, …; deployment runs through Kiji Express, Kiji Scoring, Kiji PMML, and Kiji MR on top of Kiji Schema, HBase, and Hadoop.]
• Real time data update
• Real time scoring
• Real time decisions

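The offline-build, real-time-score split can be illustrated with a small sketch. It assumes a logistic model whose parameters were fit elsewhere and exported; the feature names, weights, threshold, and decision labels are made up, and none of this is the Kiji Scoring API. The point is only that scoring reads the freshest row data at request time, so a data update changes the decision without retraining.

```python
import math

# Stand-in for a model fit offline (for example in R, SAS, or Mahout) and then deployed.
# Feature names and weights are made up for illustration.
DEPLOYED_MODEL = {
    "intercept": -1.0,
    "weights": {"items_in_bag": 0.5, "visits_last_7d": 0.2},
}


def score(model, features):
    """Logistic score computed at request time from the latest feature values."""
    z = model["intercept"] + sum(
        weight * features.get(name, 0.0)
        for name, weight in model["weights"].items()
    )
    return 1.0 / (1.0 + math.exp(-z))


def decide(probability, threshold=0.5):
    """Turn a score into an action (hypothetical action names)."""
    return "show_recommendations" if probability >= threshold else "show_default"


# Latest data for one customer row.
customer_row = {"items_in_bag": 1, "visits_last_7d": 0}
before = decide(score(DEPLOYED_MODEL, customer_row))

# Real time data update: the customer adds two more items to the bag.
customer_row["items_in_bag"] = 3
after = decide(score(DEPLOYED_MODEL, customer_row))
```

Swapping in a new `DEPLOYED_MODEL` dictionary is the analogue of a frequent model update: the scoring and decision code stays the same.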
Key benefits of partnership with WibiData
Open source Kiji suite, abstracted with a focus on modeling
• Kiji Schema, KijiMR, Kiji Model, Kiji Scoring, Kiji Express, Kiji REST
• Allows a quick development cycle
Packages popular open source projects
• Hadoop, HBase, Avro, Cascading, Scalding, Scala
Better organization
• Create tables, query by field name, flexibility, …, more DB-like than HBase
WibiData professional services team helps develop, integrate, maintain, train in-house team, consult, …
• Competence, knowledge
• Support infrastructure, so that we can focus on the science
Real time, scalable model deployment environment
• Interactive
• In milliseconds

Acknowledgement
Macy’s teams
• Analytics team: Kerem Tomak, Albert Zhai
• Infrastructure team: Winslow Holmes, Rakesh Sharma, Cherry Peng
WibiData team
• Professional Services team: Adam, Christophe, Renuka, Lynn