Bird Feed

Published on June 27, 2016

Author: EamonKavanagh

Source: slideshare.net

1. Bird Feed: a real-time bird tracker for Central Park. Eamon Kavanagh, Insight Data Engineering Fellowship, Summer 2016

2. Motivation & Main Problems
• Birds can be fast and elusive unless you know where to look
• How do you process real-time location and trending data?
• How do you properly handle unreliable sensor data?
• Can you store data in a way that ensures accuracy in batch processing?
Pictured: Hooded Warbler, Yellow-rumped Warbler

6. Demo: eamonkavanagh.com/bird-feed

7. Pipeline. Example record: {"name": "Catbird", "family": "Thrush", "lat": …}

8. Challenges & Solutions
• Managing real-time location and trending data to have up-to-date queries
• Properly handling out-of-order real-time data so you have a sense of computational accuracy
• Using very new open-source technology (cloned Flink locally to implement a bug fix before it was officially released)

10. Challenges & Solutions
• Managing real-time location and trending data to have up-to-date 'near me' queries
Reference: Streaming Windows in Apache Flink, retrieved June 23, 2016
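The windowing idea on this slide can be sketched in plain Python. This is not Flink's API and not the pipeline from the talk; it is a minimal stand-in that assumes sightings arrive in timestamp order and keeps only the last `window_seconds` of them, so that trending and 'near me' queries reflect recent data. The class name, the method names, the coordinates, and the 50 m radius are all hypothetical.

```python
import math
from collections import Counter, deque


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


class SlidingWindowSightings:
    """Holds sightings from the last `window_seconds` (hypothetical helper).

    Assumes timestamps are non-decreasing; out-of-order data is not handled here.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, name, lat, lon) in arrival order

    def add(self, timestamp, name, lat, lon):
        self._events.append((timestamp, name, lat, lon))
        # Evict everything that has fallen out of the window ending at `timestamp`.
        cutoff = timestamp - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            self._events.popleft()

    def trending(self, top_n=3):
        """Most frequently sighted birds inside the current window."""
        counts = Counter(name for _, name, _, _ in self._events)
        return counts.most_common(top_n)

    def near_me(self, lat, lon, radius_m):
        """Names of birds sighted within `radius_m` metres, in arrival order."""
        return [
            name
            for _, name, elat, elon in self._events
            if haversine_m(lat, lon, elat, elon) <= radius_m
        ]


window = SlidingWindowSightings(window_seconds=60)
window.add(0, "Catbird", 40.7812, -73.9665)
window.add(30, "Hooded Warbler", 40.7813, -73.9666)
window.add(45, "Catbird", 40.7900, -73.9600)
window.add(70, "Catbird", 40.7812, -73.9665)  # evicts the sighting at t=0

print(window.trending(1))                      # [('Catbird', 2)]
print(window.near_me(40.7812, -73.9665, 50))   # ['Hooded Warbler', 'Catbird']
```

A streaming engine would partition this state by key and manage eviction itself; the sketch only shows why a bounded window keeps query results current.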

12. Challenges & Solutions
• Properly handling out-of-order real-time data so you have a sense of computational accuracy
Reference: Watermarks in Apache Flink, retrieved June 23, 2016
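The watermark idea can be illustrated with a small plain-Python sketch. It is a simplification, not Flink's implementation: it assumes the watermark trails the largest timestamp seen so far by a fixed `max_out_of_orderness`, closes a tumbling window once the watermark reaches the window's end, and counts any event that arrives for an already closed window as late. All names and numbers are hypothetical.

```python
from collections import defaultdict


class TumblingWindowCounter:
    """Counts events per tumbling window, tolerating bounded disorder (sketch)."""

    def __init__(self, window_size, max_out_of_orderness):
        self.window_size = window_size
        self.max_out_of_orderness = max_out_of_orderness
        self.max_timestamp = None
        self.open_windows = defaultdict(int)  # window start -> event count
        self.late_events = 0

    @property
    def watermark(self):
        """Assumed rule: no event older than this is expected any more."""
        if self.max_timestamp is None:
            return float("-inf")
        return self.max_timestamp - self.max_out_of_orderness

    def process(self, timestamp):
        """Ingest one event; return the (window_start, count) pairs it closes."""
        start = timestamp - timestamp % self.window_size
        if start + self.window_size <= self.watermark:
            # The window this event belongs to was already emitted.
            self.late_events += 1
            return []
        self.open_windows[start] += 1
        if self.max_timestamp is None or timestamp > self.max_timestamp:
            self.max_timestamp = timestamp
        closed = sorted(
            s for s in self.open_windows if s + self.window_size <= self.watermark
        )
        return [(s, self.open_windows.pop(s)) for s in closed]


counter = TumblingWindowCounter(window_size=10, max_out_of_orderness=5)
results = []
for ts in [1, 4, 12, 3, 16, 2, 27]:  # 3 is out of order but in time; 2 is too late
    results.extend(counter.process(ts))

print(results)               # [(0, 3), (10, 2)]
print(counter.late_events)   # 1
```

The count of late events is what gives "a sense of computational accuracy": a result emitted at the watermark is complete only up to the events that respected the allowed delay.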

13. About Me
• ~2 years of experience as a data scientist in ad tech
• MSc in Applied Mathematics (University of British Columbia)
• BSc in Pure Mathematics (McMaster University)
