Update on the Lede Program
These days my schedule is Lede Program classes every morning from 10am until 1pm, then office hours, when I can make them, from 2 to 4pm. The students are awesome and are learning a huge amount in a super short time.
So for instance, last time I mentioned that we set up IPython notebooks in the cloud, on Amazon EC2 servers. After getting used to the basic Python data types and structures, like integers, strings, lists, and dictionaries, plus some simple for loops and list comprehensions, we started examining regular expressions. We played around with the old Enron emails, searching for things like social security numbers and words with four or more vowels in a row (turns out that always means you’re really happy, as in “woooooohooooooo!!!”, or really sad, as in “aaaaaaarghghgh”).
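Here is a minimal sketch of the kind of regular expressions described above. The sample string is made up for illustration (it is not taken from the Enron emails), and the patterns are just one reasonable way to write these searches, not necessarily the ones we used in class.

```python
import re

# Hypothetical sample text standing in for an email body (not from the Enron corpus).
sample = "Woooooohooooooo!!! My SSN is 123-45-6789, aaaaaaarghghgh."

# SSN-like pattern: 3 digits, 2 digits, 4 digits, separated by hyphens.
# This only checks the shape; it does not validate a real number.
ssn_pattern = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Whole words that contain a run of four or more vowels.
vowel_run_pattern = re.compile(r"\b\w*[aeiou]{4,}\w*\b", re.IGNORECASE)

ssns = ssn_pattern.findall(sample)
vowel_words = vowel_run_pattern.findall(sample)

print(ssns)
print(vowel_words)
```

On real emails, the SSN-shaped pattern will also catch other hyphenated numbers with the same shape, so treat the matches as candidates rather than confirmed hits.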
Then this week we installed Git and started working in an editor and on the command line, which is exciting. After that we imported pandas and started to understand DataFrames, Series, and boolean indexes. At some point we also plotted something in matplotlib. We had a nice discussion about unsupervised learning and how such techniques relate to surveillance.
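For readers who haven't seen boolean indexing, here is a small sketch of the idea. The data and column names are invented for illustration, not something from the class.

```python
import pandas as pd

# Hypothetical toy data; the column names and values are made up.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cam", "Dee"],
    "age": [34, 19, 52, 27],
})

ages = df["age"]          # selecting one column gives a Series
is_over_25 = ages > 25    # comparing a Series gives a boolean Series
over_25 = df[is_over_25]  # indexing with it keeps only the rows marked True

names_over_25 = over_25["name"].tolist()
print(names_over_25)
```

The boolean Series lines up with the DataFrame by index, which is why it can be used directly as a row filter.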
My overall conclusion so far is that when you have a class of 20 people installing Git, everything that can go wrong does (whereas if you install it yourself, any one of those things merely might), and also that there really should be a better viz tool than matplotlib. Plus my Lede students are awesome.
I thought you were boycotting Amazon, Cathy!
Yeah I updated that post to concede this. Not psyched about it.
Are you working with the Software Carpentry folks? They are developing and testing curricula for this sort of thing.
Gains in Python graphics sanity can be made with
– seaborn http://stanford.edu/~mwaskom/software/seaborn/index.html
– having a matplotlib reference open in a tab at all times: https://scipy-lectures.github.io/intro/matplotlib/matplotlib.html
(There’s a ggplot2 port that looks like it is making progress: https://github.com/yhat/ggplot/ )
Perhaps R should be considered in the future. I don’t think it’s much more difficult to learn than Python. R has RStudio and easy-to-use graphics.
Now that Python has pandas, more development is happening in Python than in R. We wanted to give our students the most powerful and ubiquitous tools possible.