Spark Streaming for World Domination (and other projects)

1:00pm - 1:25pm on Friday, October 6 in PennTop North

Win Suen

Audience Level:


Learn how to use Python to bend real-time data streams to your will and become a data science hero. We’ll cover the basics of Python big data wrangling with Spark’s powerful Streaming API and pyspark, exploring real-time Twitter data and applying some machine learning magic, just for kicks.


Ask not what you can do for real-time data streams, but what they can do for you. This talk gives an overview of Apache Spark and pyspark (Spark’s Python API), with an emphasis on Spark’s Streaming API. We’ll munge and visualize Twitter data streams as a motivating example. Learn how your streaming data projects can benefit from bigger, better, faster data processing and analytics. Your life will be changed for the better: master streaming data, achieve world domination!
