Slides for Stockholm
I’ve been busy preparing the data science tutorial I’m giving next week in Stockholm, and I thought I’d share my Prezi slides with you. Almost everything in these slide decks is stolen from the web, and the more I worked on my presentation, the more I realized how much of a tool the web itself has become for learning and explaining things.
The tutorial will be divided up into three parts. The first part I call “Data,” and it takes 2.5 hours. In that time I introduce the kind of data used in various fields of data science, how to get the data, how to store it, and how to do basic exploratory data analysis, cleaning, and basic statistics. Here’s the slide deck.
The second part is called “Models,” and it also takes 2.5 hours. In that section I discuss the modeling process, including defining success, finding proxies, understanding information, choosing algorithms, understanding results through visualization, and the problem of overfitting and how to avoid it. The slide deck for Models is here.
The final part, which I call “Product,” takes 1.5 hours. It addresses the various ways data science projects are published, whether through production code in higher-level languages, academic journals, or data journalism. Here I address end-product visualizations, keeping models updated with new data, building in feedback loops, and documentation. I’m not quite done with this one, but it’s close enough. That slide deck is here.
Tell me if you think I’m missing something!