Data scientist enablement dse 400 week 4 roadmap


Published on March 16, 2014

Author: MohanBavirisetty

Source: slideshare.net

Data Scientist Enablement
DSE 400 - Fast Track to Data Science
Week 4 Roadmap
Advanced Center of Excellence, Modern Renaissance Corporation
In collaboration with the SONO team and others
Content of this document is under Creative Commons License CC BY 4.0

Agenda
Week 4 Overview
Discussions
Learning Path
Activities
Assignment
Submission
Looking ahead
References
Citation
You can always find the latest version of this document at http://bit.ly/1g8tMKM

DSE 400 - Week 4 at a glance
Discussions: Big Data - top blog posts from 2013. Evolving Darwin - Genetic Algorithm. Optional Q&A.
Learning plan: Read R for Machine Learning by Allison Chang, Introduction to Machine Learning, etc.
Activities: Try visualization through spreadsheets. Implement functions in R. Build a personal roadmap.
Assignment 4: Survey paper - How Big Data is being used in your industry.

Social Engagement on SONO - Week 4
Discussion 1: Read Top 8 Big Data Posts from December 2013. Pick the post that interests you most. Comment on what you like most about it and how its insights can be applied.
Discussion 2: Watch the video Evolving Darwin - Genetic Algorithm and comment on it. Does it sound like a valid machine learning approach? What are its strengths and weaknesses, if any? How would you improve it?
These discussions are required. If you already have access to SONO > DSE 400, you will be required to participate in them. There will also be an Optional Q&A. Please do not create additional threads in weekly KCs.
http://getsokno.com/redvinef/controllers/cell.php?user_knocell=1004

Recommended Learning Plan
Read R for Machine Learning by Allison Chang.
Read Introduction to Machine Learning by Lars Marius Garshol.
<Optional> Watch The Learning Problem by Prof. Abu-Mostafa from the Caltech ML video series.
<Optional> Watch Machine Learning: The Basics by Ron Bekkerman.
<Optional> Watch Introduction to R for Data Mining by Joseph Rickert.
<Optional> Read Top 10 Algorithms in Data Mining by Wu et al.

Activities
<Practice> Write a user-defined function in R that takes an integer N and outputs the sum of the first N odd numbers. Using this function, verify that the sum of the first N odd integers is given by the formula N^2 (i.e., N*N, or N squared).
<Practice> Gather the data on the 2010 Winter Olympic medals. Visualize this data in a spreadsheet, showing the geographic distribution of the medals. If you use Google Spreadsheet, the pattern may look like the adjacent picture. Later on, you can repeat this exercise for the 2014 Winter Olympics.
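The odd-number exercise above asks for an R function. As a cross-check for your own R solution, here is a minimal sketch of the same logic in Python rather than R. The function names `sum_first_n_odd` and `verify_square_identity` are illustrative assumptions, not part of the course material. The identity holds because adding the k-th odd number, 2k - 1, turns (k - 1)^2 into k^2.

```python
def sum_first_n_odd(n: int) -> int:
    """Return 1 + 3 + ... + (2n - 1), the sum of the first n odd numbers."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    total = 0
    for k in range(1, n + 1):
        total += 2 * k - 1  # 2k - 1 is the k-th odd number
    return total


def verify_square_identity(max_n: int) -> bool:
    """Check that the sum equals n * n for every n from 0 to max_n."""
    return all(sum_first_n_odd(n) == n * n for n in range(max_n + 1))
```

For example, `sum_first_n_odd(5)` adds 1 + 3 + 5 + 7 + 9 and returns 25, which equals 5*5. A check over a finite range such as `verify_square_identity(100)` is evidence for the formula, not a proof of it.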

Activities
<Practice> The Sieve of Eratosthenes is an algorithm that generates all prime numbers between 1 and a given number N by eliminating the multiples of each prime. Write an R function that implements the Sieve of Eratosthenes.
<Practice> Build a personal Career Advancement Roadmap. Focus on your career over a 5-10 year horizon. Take an inventory of your current strengths and capabilities. Reflect on your career ambitions and add them to this roadmap. Use the DSE Roadmap to enhance your capabilities and move towards the desired goals. What other skills and competencies do you need to advance yourself? Use open knowledge repositories like ocw.mit.edu to examine the additional capabilities you can assimilate.
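The Sieve of Eratosthenes exercise above can be sketched as follows. This is a Python illustration of the general technique, not the R solution the activity asks for. The function name `sieve_of_eratosthenes` and starting each crossing-out pass at p*p are choices made for this sketch.

```python
from typing import List


def sieve_of_eratosthenes(n: int) -> List[int]:
    """Return all prime numbers between 1 and n (inclusive)."""
    if n < 2:
        return []
    # is_prime[i] stays True until i is found to be a multiple of a smaller prime.
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Smaller multiples of p were already eliminated by smaller primes,
            # so crossing out can start at p * p.
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]
```

For example, `sieve_of_eratosthenes(30)` returns `[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]`. The same structure carries over to R with a logical vector, keeping in mind that R vectors are indexed from 1.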

Assignment 4 - Submission Required
Prepare a short survey (i.e., overview) paper of 2-5 pages on Big Data and its impact on your industry or area of focus. If you do not have a preferred industry or area of focus, choose either the Retail or the Telecom sector. Use pictures and infographics to make your paper readable.
As an example, you may refer to the McKinsey & Company report The 'big data' revolution in healthcare. Your assignment doesn’t have to be this exhaustive. It is enough to give an overview that is readable for any audience.
You can use blogs, newspaper articles, webinars, LinkedIn forums, etc. to gather material for your survey. If you do not have access to a commercial word processing package, you can use Google Docs, OpenOffice.org, or a similar free or open-source package.

Submissions
Deadline: Saturday, 11:59 PM your local time.
Mail Assignment 4 to <dse400.datascience@gmail.com>
Submit a single PDF document containing your Big Data survey.
Use this naming convention for your document: DSE 400 - Assignment 4 - Your Full Name
Do not send document links. Just one single PDF document, please.
Please add DSE 400 > Assignment 4 in the subject line.

DSE 400 - Weeks 5-8 ahead
Week 5: Visualizations. Submit your research: Data Visualization Tools - A Comparative Study.
Weeks 6-7: Processing large data sets. Hadoop Ecosystem. Stream Computing, etc.
Week 8: Ethics, Privacy and Building Data Products.

References, Resources and Additional Reading
[MIT OCW] R for Machine Learning by Allison Chang
An Introduction to Machine Learning. Hilary Mason, O’Reilly Media Inc., 2011
Machine Learning. Tom Mitchell, McGraw-Hill, 1997
Advanced Machine Learning. Hilary Mason, O’Reilly Media Inc., 2012
Scaling Up Machine Learning. Bekkerman, Bilenko, and Langford, Cambridge University Press, 2011
[MIT OCW] Prediction: Machine Learning and Statistics
Stanford University Machine Learning Video Collection
Caltech Machine Learning Video Collection

Citation
R for Machine Learning by Allison Chang is recommended by the MIT course Prediction: Machine Learning and Statistics from the Sloan School of Management. It is adopted in DSE 400 as per OCW guidelines.
Only content that appears as-is in this document is under Creative Commons License CC BY 4.0. This license may not necessarily apply to other material referenced in this document.

For More Information
Week 4 discussions take place during this week on SONO, under DSE 400 Week 4.
<Help On Demand> You may reach out to Ms. Rachel Fleming <rachel@emodern.biz> if you have any difficulties with the assignments or are looking for more activities.
If you have any questions or suggestions on SONO, please reach out to Mr. Eric Kmeic <>
We welcome questions, thoughts and suggestions. Post these on SONO in the right forum/discussion or write to us at <dse400.datascience@gmail.com>
You can always find the latest version of this document at http://bit.ly/1g8tMKM

Fun@Work

In 1859, Charles Darwin published On the Origin of Species, which is regarded as one of the monumental works in human history. In this work, he explained that life on earth adapts to a constantly changing environment by means of natural selection.

Thank You
