
Update on the Lede Program

June 11, 2014

My schedule nowadays is to go to the Lede Program classes every morning from 10am until 1pm, then hold office hours, when I can, from 2pm to 4pm. The students are awesome and are learning a huge amount in a super short time.

So for instance, last time I mentioned we set up IPython notebooks in the cloud, on Amazon EC2 servers. After getting used to the various kinds of data structures in Python, like integers and strings and lists and dictionaries, and some simple for loops and list comprehensions, we started examining regular expressions. We played around with the old Enron emails, searching for things like social security numbers and words that had four or more vowels in a row (turns out that always means you’re really happy, as in “woooooohooooooo!!!”, or really sad, as in “aaaaaaarghghgh”).
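To give a flavor of those regular expression exercises, here is a minimal sketch using Python's built-in `re` module. The sample strings, the pattern names, and the two helper functions are all made up for illustration; they are not the actual class material, and the social security pattern only checks the familiar 3-2-4 digit shape rather than whether a number is valid.

```python
import re

# Hypothetical stand-ins for email bodies (not real Enron data).
emails = [
    "woooooohooooooo!!! we closed the deal",
    "My SSN is 123-45-6789, please keep it private.",
    "aaaaaaarghghgh, the server is down again",
    "Meeting at 10am tomorrow.",
]

# Three digits, two digits, four digits, separated by hyphens.
# This matches the shape of a social security number, nothing more.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# A whole word containing a run of four or more vowels.
VOWEL_RUN_PATTERN = re.compile(r"\b\w*[aeiou]{4,}\w*\b", re.IGNORECASE)


def find_ssn_like(text):
    """Return every substring shaped like a social security number."""
    return SSN_PATTERN.findall(text)


def find_vowel_run_words(text):
    """Return every word with four or more vowels in a row."""
    return VOWEL_RUN_PATTERN.findall(text)


for body in emails:
    print(find_ssn_like(body), find_vowel_run_words(body))
```

Compiling the patterns once and reusing them keeps the loop readable, which matters more than speed at this scale.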

Then this week we installed git and started working in an editor and using the command line, which is exciting, and then we imported pandas and started to understand dataframes and series and boolean indexes. At some point we also plotted something in matplotlib. We had a nice discussion about unsupervised learning and how such techniques relate to surveillance.
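For anyone curious what dataframes, series and boolean indexes look like in practice, here is a small sketch, assuming pandas is installed. The column names and values are invented for illustration and have nothing to do with the data we used in class.

```python
import pandas as pd

# Hypothetical toy data, invented for this example.
df = pd.DataFrame({
    "sender": ["alice", "bob", "carol", "dave"],
    "word_count": [120, 45, 300, 80],
    "has_attachment": [True, False, True, False],
})

# Pulling out a single column gives a Series.
word_counts = df["word_count"]

# Comparing a Series to a value gives a Series of True/False values.
is_long = word_counts > 100

# Indexing the DataFrame with that boolean Series keeps only the True rows.
long_emails = df[is_long]

# Conditions combine with & (and) or | (or); comparisons need parentheses.
long_with_attachment = df[(df["word_count"] > 100) & df["has_attachment"]]

print(long_emails)
print(long_with_attachment)
```

The thing that tends to trip people up is that the boolean Series is an object in its own right: you can name it, inspect it, and combine it with others before using it to filter.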

My overall conclusion so far is that when you have a class of 20 people installing git, everything that can go wrong does (whereas if you do it yourself, anything that could go wrong merely might), and also that there really should be a better viz tool than matplotlib. Plus my Lede students are awesome.

  1. JSE
    June 11, 2014 at 8:26 am

    I thought you were boycotting Amazon, Cathy!


  2. June 11, 2014 at 8:27 am

    Yeah I updated that post to concede this. Not psyched about it.


  3. Sam Penrose
    June 11, 2014 at 9:42 am

    Are you working with the Software Carpentry folks? They are developing and testing curricula for this sort of thing.


  4. Allen Riddell
    June 12, 2014 at 6:54 am

    Gains in Python graphics sanity can be made with
    – seaborn http://stanford.edu/~mwaskom/software/seaborn/index.html
    – having a matplotlib reference open in a tab at all times: https://scipy-lectures.github.io/intro/matplotlib/matplotlib.html
    (There’s a ggplot2 port that looks like it is making progress: https://github.com/yhat/ggplot/ )


  5. Orso de Teranova
    June 19, 2014 at 2:03 am

    Perhaps R should be considered in the future. I don’t think that it’s much more difficult than Python to learn. R has RStudio and easy-to-use graphics.


    • June 19, 2014 at 12:29 pm

      Now that Python has pandas, more development is happening in Python than in R. We wanted to give our students the most powerful and ubiquitous tools possible.

