Archive for the ‘modeling’ Category

Great news: InBloom is shutting down

I’m trying my hardest to resist talking about Piketty’s Capital because I haven’t read it yet, even though I’ve read a million reviews and discussions about it, and I saw him a couple of weeks ago on a panel with my buddy Suresh Naidu. Suresh, who was great on the panel, wrote up his notes here.

So I’ll hold back from talking directly about Piketty, but let me talk about one of Suresh’s big points that was inspired in part by Piketty. Namely, the fact that it’s a great time to be rich. It’s even greater now to be rich than it was in the past, even when there were similar rates of inequality. Why? Because so many things have become commodified. Here’s how Suresh puts it:

We live in a world where much more of everyday life occurs on markets, large swaths of extended family and government services have disintegrated, and we are procuring much more of everything on markets. And this is particularly bad in the US. From health care to schooling to philanthropy to politicians, we have put up everything for sale. Inequality in this world is potentially much more menacing than inequality in a less commodified world, simply because money buys so much more. This nasty complementarity of market society and income inequality maybe means that the social power of rich people is higher today than in the 1920s, and one response to increasing inequality of market income is to take more things off the market and allocate them by other means.

I think about this sometimes in the field of education in particular, and to that point I’ve got a tiny bit of good news today.

Namely, InBloom is shutting down (hat tip Linda Brown). You might not remember what InBloom is, but I blogged about this company a while back in my post Big Data and Surveillance, as well as the ongoing fight against InBloom in New York state by parents here.

The basic idea is that InBloom, which was started in cooperation with the Bill and Melinda Gates Foundation and Rupert Murdoch’s Amplify, would collect huge piles of data on students and their learning and allow third party companies to mine that data to improve learning. From this New York Times article:

InBloom aimed to streamline personalized learning — analyzing information about individual students to customize lessons to them — in public schools. It planned to collect and integrate student attendance, assessment, disciplinary and other records from disparate school-district databases, put the information in cloud storage and release it to authorized web services and apps that could help teachers track each student’s progress.

It’s not unlike the idea that Uber has, of connecting drivers with people needing rides, or that AirBNB has, of connecting people needing a room with people with rooms: they are platforms, not cab companies or hoteliers, and they can use that matchmaking status as a way to duck regulations.

The problem here is that the relevant student-privacy law, FERPA (the Family Educational Rights and Privacy Act), is actually pretty strong, and InBloom and companies like it were largely bypassing it, as a Fordham Law study led by Joel Reidenberg discovered. In particular, the study found that InBloom and other companies were offering what looked like “free” educational services, but of course the real deal was an exchange for the children’s data, and the school officials agreeing to these deals had no clue what they were signing. The parents were bypassed completely. Much of the time the contracts were in direct violation of FERPA, and often the school officials didn’t even have copies of the contracts and hadn’t heard of FERPA.

Because of that report and other bad publicity, resistance grew in New York State among parents, school board members, and privacy lawyers. And thanks to that resistance, the New York State Legislature recently passed a budget that prohibits state education officials from releasing student data to amalgamators like InBloom. InBloom has subsequently decided to close down.

I’m not saying that the urge to privatize education – and profit off of it – isn’t going to continue after a short pause. For that matter look at the college system. Even so, let’s take a moment to appreciate the death of one of the more egregious ideas out there.

Warning: extremely nerdy content, harmful if taken seriously

Today I’d like to share a nerd thought experiment with you people, and since many of you are already deeply nerdy, pardon me if you’ve already thought about it. Feel free – no really, I encourage you – to argue strenuously with me if I’ve misrepresented the current thinking on this. That’s why I have comments!!

It’s called the Fermi Paradox, and it’s, loosely speaking, the puzzle relating the probability of intelligent life somewhere besides here on earth, the number of other earth-like planets, and the fact that we haven’t been contacted by our alien neighbors.

It starts with that last thing. We haven’t been contacted by aliens, so what gives? Is it because life-sustaining planets are super rare? Or is it because they are plentiful and life, or at least intelligent life, or at least intelligent life with advanced technology, just doesn’t happen on them? Or does life happen on them but once they get intelligent they immediately kill each other with atomic weapons?

There is a supremely nerdy underlying formula to this thought experiment, which has a name – the Drake Equation. I imagine it comes in various forms but here’s one:

N = R_{\ast} \cdot f_p \cdot n_e \cdot f_{\ell} \cdot f_i \cdot f_c \cdot L

where:

N = the number of civilizations in our galaxy with which radio-communication might be possible (i.e. which are on our current past light cone)
R_* = the average rate of star formation in our galaxy
f_p = the fraction of those stars that have planets
n_e = the average number of planets that can potentially support life per star that has planets
f_l = the fraction of planets that could support life that actually develop life at some point
f_i = the fraction of planets with life that actually go on to develop intelligent life (civilizations)
f_c = the fraction of civilizations that develop a technology that releases detectable signs of their existence into space
L = the length of time for which such civilizations release detectable signals into space

 

So the bad news (hat tip Suresh Naidu) is that, due to scientists discovering more earth-like planets recently, we’re probably all going to die soon.

Here’s the reasoning. Notice in the above equation that N is the product of a bunch of things. If N doesn’t change but our estimate of one of those terms goes up or down, then the other terms have to go down or up to compensate. And since finding a bunch of earth-like planets increases some combination of R_*, f_p, and n_e, we need to compensate with some combination of the other terms. But if you look at them the most obvious choice is L, the length of time civilizations release detectable signals into space.
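Here’s a minimal sketch of that compensation argument in code, with made-up illustrative values for all the terms (none of these are real estimates, they’re just there to make the arithmetic visible):

def drake_N(R_star, f_p, n_e, f_l, f_i, f_c, L):
    """Number of detectable civilizations, per the Drake equation."""
    return R_star * f_p * n_e * f_l * f_i * f_c * L

# Made-up baseline estimates, chosen only so the numbers are easy to follow.
params = dict(R_star=1.0, f_p=0.5, n_e=1.0, f_l=0.1, f_i=0.01, f_c=0.1, L=10_000)
N_fixed = drake_N(**params)  # 0.5 detectable civilizations

# Suppose new surveys triple our estimate of n_e. If N stays fixed (we still
# hear nothing), the product only balances if L shrinks by the same factor.
tripled = dict(params, n_e=3 * params["n_e"])
L_implied = params["L"] * N_fixed / drake_N(**tripled)
print(L_implied)  # 10_000 / 3: about 3,333 years of detectable signals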

And I say “most obvious” because it makes the thought experiment more fun that way. Also we exist as proof that some planets do develop intelligent life with the technology to send out signals into space but we have no idea how long we’ll last.

Anyhoo, I’m not sure there are actionable items here, except maybe deciding to stop looking for earth-like planets, or deciding to stop emitting signals to other planets so we can claim the other aliens didn’t obliterate themselves, they were simply “too busy” to call us (we need another term representing the probability of the invention of Candy Crush Saga!!!). Or maybe they took a look from afar, saw reality TV, and decided we weren’t ready, a kind of updated Star Trek first-contact theory.

Update: I can’t believe I didn’t add an xkcd comic to this, my bad. Here’s one (hat tip Suresh Naidu):

[xkcd comic: the Drake Equation]

Categories: modeling, musing

A simple mathematical model of congressional geriatric penis pumps

This is a guest post written by Stephanie Yang and reposted from her blog. Stephanie and I went to graduate school at Harvard together. She is now a quantitative analyst living in New York City, and will be joining the data science team at Foursquare next month.

Last week’s hysterical report by the Daily Show’s Samantha Bee on federally funded penis pumps contained a quote which piqued our quantitative interest. Listen carefully at the 4:00 mark, when Ilyse Hogue proclaims authoritatively:

“Statistics show that probably some of our members of Congress have a vested interest in having penis pumps covered by Medicare!”

Ilyse’s wording is vague, and intentionally so. Statistically, a lot of things are “probably” true, and many details are contained in the word “probably”. In this post we present a simple statistical model to clarify what Ilyse means.

First we state our assumptions. We assume that penis pumps are uniformly distributed among male Medicare recipients and that no man has received two pumps. These are relatively mild assumptions. We also assume that what Ilyse refers to as “members of Congress [with] a vested interest in having penis pumps covered by Medicare” specifically means male members of Congress who received a penis pump covered by federal funds. Of course, one could argue that female members of Congress could have a vested interest in penis pumps as well, but we do not want to go there.

Now the number crunching. According to the report, Medicare has spent a total of $172 million supplying penis pumps to recipients, at “360 bucks a pop.” This means roughly 478,000 penis pumps were bought from 2006 to 2011.

45% of the current 49,435,610 Medicare recipients are male. In other words, Medicare bought one penis pump for every 46.5 eligible men. Inverting this, we can say that 2.15% of male Medicare recipients received a penis pump.

There are currently 128 members of Congress (32 senators plus 96 representatives) who are male, over the age of 65, and therefore Medicare-eligible. The probability that none of them received a federally funded penis pump is:

(1-0.0215)^{128} \approx 6.19\%

In other words, the chance of at least one member of Congress having said penis pump is 93.8%, which is just shy of the 95% confidence level that most statisticians agree on as significant. In order to get to 95% confidence, we would need a total of 138 male members of Congress who are over the age of 65, and as of 2014 that has not happened. Nevertheless, the estimate is close enough for us to agree with Ilyse that there is probably some member of Congress who has one.

Is it possible that there are two or more penis pump recipients in Congress? We did notice that Ilyse’s quote refers to plural members of Congress. Under the assumptions laid out above, the probability of having at least two federally funded penis pumps in Congress is:

1- {128 \choose 0} (1- 0.0215)^{128} - {128 \choose 1}(1-0.0215)^{127} (0.0215)^1 \approx 76.3\%

Again, we would say this is probably true, though not nearly with the same amount of confidence as before. In order to reach 95% confidence that there are two or more congressional federally funded penis pumps, we would need 200 or more Medicare-eligible males in Congress, which is unlikely to happen anytime soon.
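For the record, here is the arithmetic behind both of those probabilities, using the 2.15% rate and the 128 Medicare-eligible men from above:

from math import comb

p = 0.0215  # share of male Medicare recipients with a federally funded pump
n = 128     # Medicare-eligible men in Congress

p_none = (1 - p) ** n
p_at_least_one = 1 - p_none
p_at_least_two = 1 - p_none - comb(n, 1) * (1 - p) ** (n - 1) * p

print(round(p_at_least_one, 3))  # roughly 0.94
print(round(p_at_least_two, 3))  # roughly 0.76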

Note: As a corollary to these calculations, I became the first developer in the history of mankind to type the following command: git merge --squash penispump.

The US political system serves special interests and the rich

A paper by Martin Gilens and Benjamin Page entitled Testing Theories of American Politics: Elites, Interest Groups, and Average Citizens has recently been released and reported on (h/t Michael Crimmins); it studies who has influence on policy in the United States.

Here’s an excerpt from the abstract of the paper:

Multivariate analysis indicates that economic elites and organized groups representing business interests have substantial independent impacts on U.S. government policy, while average citizens and mass-based interest groups have little or no independent influence.

A word about “little or no independent influence”: the above should be interpreted to mean that average citizens and mass-based groups only win when their interests happen to align with those of economic elites, which happens sometimes, or with business interests, which rarely happens. It doesn’t mean that average citizens and mass-based interest groups never ever get what they want.

There’s actually a lot more to the abstract, about competing theories of political influence, but I’m ignoring that to get to the data and the model.

The data

They found lots of yes/no polls on specific issues that included income information, which let them determine what poor people (10th percentile) thought about a specific issue, what an average (median income) person thought, and what a wealthy (90th percentile) person thought. They independently corroborated that their definition of wealthy was highly correlated, in terms of opinion, with other stronger (98th percentile) definitions. In fact they make the case that using the 90th percentile instead of the 98th actually underestimates the influence of wealthy people.

To measure interest groups and their opinions on public policy, they had a list of 43 interest groups (29 business groups, 11 mass-based groups, and 3 others) that they considered “powerful,” used domain expertise to estimate how many would oppose or favor a given issue, and more or less took the difference, although they actually did something a bit fancier to reduce the influence of outliers:

Net Interest Group Alignment = ln(# Strongly Favor + [0.5 * # Somewhat Favor] + 1) – ln(# Strongly Oppose + [0.5 * # Somewhat Oppose] + 1).
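The formula is easy to transcribe directly; here it is as a small function (the example counts below are made up, not from the paper):

from math import log

def net_interest_group_alignment(strongly_favor, somewhat_favor,
                                 strongly_oppose, somewhat_oppose):
    """Log-damped difference of favoring and opposing interest groups."""
    return (log(strongly_favor + 0.5 * somewhat_favor + 1)
            - log(strongly_oppose + 0.5 * somewhat_oppose + 1))

# Say 5 groups strongly favor a policy and 2 somewhat oppose it:
print(net_interest_group_alignment(5, 0, 0, 2))  # log(6) - log(2), about 1.10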

Finally, they pored over records to see what policy changes were actually made in the 4 year period after the polls.

Statistics

The different groups had opinions that were sometimes highly correlated:

[Table: correlations among the policy preferences of average citizens, economic elites, and interest groups]

Note the low correlation between mass public interest groups (like unions, pro-life, NRA, etc) and average citizens’ preferences and the negative correlation between business interests and elites’ preferences.

Next they did three bivariate regressions, measuring the influence of each of the groups separately, as well as one including all three, and got the following:

[Table: policy outcomes regressed on the preferences of average citizens, economic elites, and interest groups, Models 1–4]

This is where we get our conclusion that average citizens don’t have independent influence, because of this near-zero coefficient in Model 4. But note that if we ignore elites and interest groups, we do have 0.64 in Model 1, which indicates that preferences of the average citizens are correlated with outcomes.
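To see how a variable can look influential on its own (the 0.64) and then drop to nearly zero once a correlated variable is added, here’s a generic simulation, with simulated data rather than the paper’s: citizens’ preferences track elites’ preferences, but only elites’ preferences actually drive the outcome.

import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated preferences: citizens' preferences mostly track elites',
# but only elites' preferences move the policy outcome.
elites = rng.normal(size=n)
citizens = 0.8 * elites + 0.6 * rng.normal(size=n)
outcome = 0.7 * elites + rng.normal(size=n)

def ols(y, *predictors):
    X = np.column_stack([np.ones(n), *predictors])
    return np.linalg.lstsq(X, y, rcond=None)[0][1:]  # drop the intercept

print(ols(outcome, citizens))          # alone, citizens look influential
print(ols(outcome, citizens, elites))  # together: citizens near 0, elites near 0.7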

The overall conclusion is that policy changes are determined by the elites and the interest groups.

We can divide the interest groups into business versus mass-based and check out how the influence is divided between the four defined groups:

[Table: estimated influence of average citizens, economic elites, business interest groups, and mass-based interest groups]

Caveats

This stuff might depend a lot on various choices the modelers made as well as their proxies. It doesn’t pick up on smaller special interest groups. It doesn’t account for all possible sources of influence and so on. I’d love to see it redone with other choices. But I’m impressed anyway with all the work they put into this.

I’ll let the authors have the last word:

What do our findings say about democracy in America? They certainly constitute troubling news for advocates of “populistic” democracy, who want governments to respond primarily or exclusively to the policy preferences of their citizens. In the United States, our findings indicate, the majority does not rule — at least not in the causal sense of actually determining policy outcomes. When a majority of citizens disagrees with economic elites and/or with organized interests, they generally lose. Moreover, because of the strong status quo bias built into the U.S. political system, even when fairly large majorities of Americans favor policy change, they generally do not get it.

Categories: #OWS, modeling

Let’s experiment more

What is an experiment?

The gold standard in scientific fields is the randomized experiment. That’s when you have some “treatment” you want to impose on some population and you want to know if that treatment has positive or negative effects. In a randomized experiment, you randomly divide a population into a “treatment” group and a “control group” and give the treatment only to the first group. Sometimes you do nothing to the control group, sometimes you give them some other treatment or a placebo. Before you do the experiment, of course, you have to carefully define the population and the treatment, including how long it lasts and what you are looking out for.

Example in medicine

So for example, in medicine, you might take a bunch of people at risk of heart attacks and ask some of them – a randomized subpopulation – to take aspirin once a day. Note that doesn’t mean they all will take an aspirin every day, since plenty of people forget to do what they’re told to do, and even what they intend to do. And you might have people in the control group who happen to take aspirin every day even though they weren’t asked to.

Also, part of the design has to be a well-defined length and well-defined outcomes for the experiment: after, say, 10 years, you want to see how many people in each group have a) had heart attacks and b) died.

Now you’re starting to see that, in order for such an experiment to yield useful information, you’d better make sure the average age of each subpopulation is about the same, which should be true if they were truly randomized, and that there are plenty of people in each subpopulation, or else the results will be statistically useless.
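To make the mechanics concrete, here’s a minimal simulation of a trial like the aspirin one, with made-up event rates (nothing here is real medical data): randomize assignment, then compare event rates across the two groups.

import numpy as np

rng = np.random.default_rng(1)
n = 10_000  # hypothetical patients at risk of heart attacks

# Random assignment: exactly half get the "aspirin a day" instruction.
treated = rng.permutation(n) < n // 2

# Made-up outcomes: 10% baseline event rate, 2-point reduction if treated.
event = rng.random(n) < np.where(treated, 0.08, 0.10)

effect = event[treated].mean() - event[~treated].mean()
print(round(effect, 3))  # about -0.02; with a small n the noise would swamp this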

One last thing. There are ethics in medicine, which make experiments like the one above fraught. Namely, if you have a really good reason to think one treatment (“take aspirin once a day”) is better than another (“nothing”), then you’re not allowed to run that comparison. Instead you’d have to compare two treatments that are thought to be about equal. This of course means that, in general, you need even more people in the experiment, and it gets super expensive and takes a long time.

So, experiments are hard in medicine. But they don’t have to be hard outside of medicine! Why aren’t we doing more of them when we can?

Swedish work experiment

Let’s move on to the Swedes, who according to this article (h/t Suresh Naidu) are experimenting in their own government offices on whether working 6 hours a day instead of 8 hours a day is a good idea. They are using two different departments in their municipal council to act as their “treatment group” (6 hours a day for them) and their “control group” (the usual 8 hours a day for them).

And although you might think that the people in the control group would object to unethical treatment, it’s not the same thing: nobody thinks your life is at stake for working a regular number of hours.

The idea there is that people waste their last couple of hours at work and generally become inefficient, so maybe knowing you only have 6 hours of work a day will improve the overall office. Another possibility, of course, is that people will still waste their last couple of hours of work and get 4 hours instead of 6 hours of work done. That’s what the experiment hopes to measure, in addition to (hopefully!) whether people dig it and are healthier as a result.

Non-example in business: HR

Before I get too excited I want to mention the problems that arise with experiments that you cannot control, which is most of the time if you don’t plan ahead.

Some of you probably ran into an article from the Wall Street Journal, entitled Companies Say No to Having an HR Department. It’s about how some companies, even big ones, decided that HR is a huge waste of money and got rid of everyone in that department.

On the one hand, you’d think this is a perfect experiment: compare companies that have HR departments against companies that don’t. And you could do that, of course, but you wouldn’t be measuring the effect of an HR department. Instead, you’d be measuring the effect of a company culture that doesn’t value things like HR.

So, for example, I would never work in a company that doesn’t value HR, because, as a woman, I am very aware of the fact that women get sexually harassed by their bosses and have essentially nobody to complain to except HR. But if you read the article, it becomes clear that the companies that get rid of HR don’t think from the perspective of the harassed underling but instead from the perspective of the boss who needs help firing people. From the article:

When co-workers can’t stand each other or employees aren’t clicking with their managers, Mr. Segal expects them to work it out themselves. “We ask senior leaders to recognize any potential chemistry issues” early on, he said, and move people to different teams if those issues can’t be resolved quickly.

Former Klick employees applaud the creative thinking that drives its culture, but say they sometimes felt like they were on their own there. Neville Thomas, a program director at Klick until 2013, occasionally had to discipline or terminate his direct reports. Without an HR team, he said, he worried about liability.

“There’s no HR department to coach you,” he said. “When you have an HR person, you have a point of contact that’s confidential.”

Why does it matter that it’s not random?

Here’s the crucial difference between a randomized experiment and a non-randomized experiment. In a randomized experiment, you are setting up and testing a causal relationship, but in a non-randomized experiment like the HR companies versus the no-HR companies, you are simply observing cultural differences without getting at root causes.

So if I notice that, at the non-HR companies, they get sued for sexual harassment a lot – which was indeed mentioned in the article as happening at Outback Steakhouse, a non-HR company – is that because they don’t have an HR team or because they have a culture which doesn’t value HR? We can’t tell. We can only observe it.

Money in politics experiment

Here’s an awesome example of a randomized experiment to understand who gets access to policy makers. In an article entitled A new experiment shows how money buys access to Congress, two political science graduate students, David Broockman and Josh Kalla, describe an experiment they conducted as follows:

In the study, a political group attempting to build support for a bill before Congress tried to schedule meetings between local campaign contributors and Members of Congress in 191 congressional districts. However, the organization randomly assigned whether it informed legislators’ offices that individuals who would attend the meetings were “local campaign donors” or “local constituents.”

The letters were identical except for those two words, but the results were drastically different, as shown by the following graphic:

[Chart: rates of meetings granted to “local campaign donors” versus “local constituents”]

Conducting your own experiments with e.g. Mechanical Turk

You know how you can conduct experiments? Through an Amazon service called Mechanical Turk. It’s really not expensive and you can get a bunch of people to fill out surveys, or do tasks, or some combination, and you can design careful experiments and modify them and rerun them at your whim. You decide in advance how many people you want and how much to pay them.

So for example, that’s how then-Wall Street Journal journalist Julia Angwin, in 2012, investigated the weird appearance of Obama results interspersed between other search results, but not a similar appearance of Romney results, after users indicated party affiliation.

Conclusion

We already have a good idea of how to design and conduct useful and important experiments, and we already have good tools to do them. Other, even better tools are being developed right now to improve our abilities to conduct faster and more automated experiments.

If we think about what we can learn from these tools, and put some creative energy into experimental design, we should all be incredibly impatient and excited. And we should also think of this as an argumentation technique: if we are arguing about whether one method or policy works better than another, can we set up a transparent and reproducible experiment to test it? Let’s start applying science to our lives.

Categories: data journalism, modeling, rant

Defining poverty #OWS

I am always amazed by my Occupy group, and yesterday’s meeting was no exception. We decided to look into redefining the poverty line, and although the conversation took a moving and deeply philosophical turn, I’ll probably only have time to talk about the nuts and bolts of formulas this morning.

The poverty line, or technically speaking the “poverty threshold,” is the same as it was in 1964 when it was invented except for being adjusted for inflation via the CPI.

In the early 1960s, it was noted that poor families spent about a third of their money on food. To build an “objective” measure of poverty, then, the idea was to measure the cost of an “economic food budget” for a family of a given size and then multiply that cost by 3.

Does that make sense anymore?

Well, no. Food has gotten a lot cheaper since 1964, and other stuff hasn’t. According to the following chart, which I got from The Atlantic, poor families now spend about one sixth of their money on food:

[Chart from The Atlantic: share of spending going to food, by income group. Rich people spend even less on food.]

Now if you think about it, the formula should be more like “economic food budget” * 6, which would effectively double all the numbers.
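In code the whole update is a single multiplier; the food budget below is a made-up placeholder, not the official figure:

# Back-of-the-envelope version of the argument, with a hypothetical food budget.
annual_food_budget = 8_000  # made-up "economic food budget" for a family

poverty_line_1964_style = 3 * annual_food_budget  # food was about 1/3 of spending
poverty_line_updated = 6 * annual_food_budget     # food is now about 1/6 of spending

print(poverty_line_1964_style, poverty_line_updated)  # 24000 48000: the line doubles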

Does this matter? Well, yes. Various programs like Medicare and Medicaid determine eligibility based on poverty. Also, the U.S. census measures poverty in our country using this yardstick. If we double those numbers we will be seeing a huge surge in the official numbers.

Not that we’d be capturing everyone even then. The truth is, in some locations, like New York, rent is so high that the formula would likely need even more adjustment. Then again, food is expensive here too, so maybe the base “economic food budget” would simply need adjusting.

As usual the key questions are, what are we accomplishing with such a formula, and who is “we”?

Categories: #OWS, modeling, statistics

What Monsanto and college funds have in common

College

I recently read this letter to the editor written by Catharine Hill, the President of Vassar College, explaining why reducing family contributions in college tuition and fees isn’t a good idea. It was in response to this Op-Ed by Steve Cohen about the onerous “E.F.C.” system.

Let me dumb down the debate a bit for the sake of simplicity. Steve is on one side basically saying, “college costs too damn much, it didn’t used to cost this much!” and Catharine is on the other side saying, “colleges need to compete! If you’re not willing to pay then someone else will!”

Here’s the thing: there’s an arms race among colleges that is driving up costs. Through some perverse combination of gaming the US News & World Report ranking model, responding to the incentives of the federal loan support system, and political decisions that have methodically removed funding from state colleges, college costs have been rising wildly.

And when you have an arms race, as I’ve learned from Tom Slee, the only solution is an armistice. In this case an armistice would translate into something like an agreement among colleges to set a maximum and reasonable tuition and fee structure. Sounds good, right? But an armistice won’t happen if the players in question are benefitting from the arms race. In this case parents are suffering but colleges are largely benefitting.

Monsanto

This recent Salon article detailing the big data approach that Monsanto is taking to their massive agricultural empire is in the same boat.

The idea is that Monsanto has bought up a bunch of big data firms and satellite firms to perform predictive analytics on a massive scale for farming. And they are offering farmers who are already internal to the Monsanto empire the chance to benefit from their models.

Farmers are skeptical of using the models, because they are worried about how much data Monsanto will be able to collect about them if they do.

But here’s the thing, farmers: Monsanto already has all your data, and will have it forever, due to their surveillance. They will know exactly what you plant, where, and how densely.

And what they are offering you is probably actually a benefit to you, but of course the more important thing for them is that they are explicitly creating an arms race between Monsanto farmers and non-Monsanto farmers.

In other words, if they give Monsanto farmers an extra boost, other farmers will conclude that, without such a boost, they won’t be able to keep up, and they will be forced into the Monsanto system by economic necessity.

Again an arms race, and again no armistice in sight, since Monsanto is doing this deliberately in the service of its bottom line. Assuming their models are good, the only way for non-Monsanto farmers to avoid this is to build their own predictive models, but clearly that would require enormous investment.

Categories: arms race, modeling

Lobbyists have another reason to dominate public commenting #OWS

Before I begin this morning’s rant, I need to mention that, as I’ve taken on a new job recently and I’m still trying to write a book, I’m expecting to not be able to blog as regularly as I have been. It pains me to say it but my posts will become more intermittent until this book is finished. I’ll miss you more than you’ll miss me!

On to today’s bullshit modeling idea, which was sent to me by both Linda Brown and Michael Crimmins. It’s a new model built in part by the former chief economist for the Commodity Futures Trading Commission (CFTC) Andrei Kirilenko, who is now a finance professor at Sloan. In case you don’t know, the CFTC is the regulator in charge of futures and swaps.

I’ll excerpt this New York Times article which describes the model:

The algorithm, he says, uncovers key word clusters to measure “regulatory sentiment” as pro-regulation, anti-regulation or neutral, on a scale from -1 to +1, with zero being neutral.

If the number assigned to a final rule is different from the proposed one and closer to the number assigned to all the public comments, then it can be inferred that the agency has taken the public’s views into account, he says.

Some comments:

  1. I know really smart people that use similar sentiment algorithms on word clusters. I have no beef with the underlying NLP algorithm.
  2. What I do have a problem with is the apparent assumption that “the number assigned to all the public comments” makes any sense, and in particular that it captures “the public’s view”.
  3. It sounds like the algorithm dumps all the public comment letters into a pot and mixes it together to get an overall score. The problem with this is that the industry insiders and their lobbyists overwhelm public commenting systems.
  4. For example, go take a look at the list of public letters for the Volcker Rule. It’s not unlike this graphic on the meetings of the regulators on the Volcker Rule: [graphic: Volcker Rule meetings with regulators]
  5. Besides dominating in sheer number of letters, I’ll bet the letters from parties with very fancy lawyers are also much longer on average.
  6. Now think about how the NLP algorithm will deal with this in a big pot: it will be dominated by the language of the pro-industry insiders (see the toy illustration after this list).
  7. Moreover, if such a model were ever used directly, say to check whether public comments had been taken into account in a given case, lobbyists would have even more reason to overwhelm public commenting systems.
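To make points 3 and 6 concrete, here’s a toy version of the pooling problem, with made-up letter counts and sentiment scores (each letter scored in [-1, +1]):

# A flood of anti-regulation letters from insiders, a trickle from the public.
industry_letters = [-0.8] * 300
public_letters = [0.7] * 30

pooled = industry_letters + public_letters
print(round(sum(pooled) / len(pooled), 2))  # -0.66: the "public" score is the lobby's score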

The take-away is that this is an amazing example of a so-called objective mathematical model set up to legitimize the watering down of financial regulation by lobbyists.

 

Update: I’m willing to admit I might have spoken too soon. I look forward to reading the paper on this algorithm and taking a deeper look instead of relying on a newspaper.

Categories: #OWS, finance, modeling, rant

Let’s not replace the SAT with a big data approach

The big news about the SAT is that the College Board, which makes the SAT, has admitted there is a problem: widespread test prep and gaming. As I talked about in this post, the SAT mainly serves to sort people by income.

It shouldn’t be a surprise to anyone when a weak proxy gets gamed. Yesterday I discussed this very thing in the context of Google’s PageRank algorithm, and today it’s student learning aptitude. The question is, what do we do next?

Rick Bookstaber wrote an interesting post yesterday (hat tip Marcos Carreira) with an idea to address the SAT problem using the same approach that I’m guessing Google is taking with the PageRank problem, namely abandoning the poor proxy and going for a deeper, more involved one. Here’s Bookstaber’s suggestion:

You would think that in the emerging world of big data, where Amazon has gone from recommending books to predicting what your next purchase will be, we should be able to find ways to predict how well a student will do in college, and more than that, predict the colleges where he will thrive and reach his potential.  Colleges have a rich database at their disposal: high school transcripts, socio-economic data such as household income and family educational background, recommendations and the extra-curricular activities of every applicant, and data on performance ex post for those who have attended. For many universities, this is a database that encompasses hundreds of thousands of students.

There are differences from one high school to the next, and the sample a college has from any one high school might be sparse, but high schools and school districts can augment the data with further detail, so that the database can extend beyond those who have applied. And the data available to the colleges can be expanded by orders of magnitude if students agree to share their admission data and their college performance on an anonymized basis. There already are common applications forms used by many schools, so as far as admission data goes, this requires little more than adding an agreement in the college applications to share data; the sort of agreement we already make with Facebook or Google.

The end result, achievable in a few years, is a vast database of high school performance, drilling down to the specific high school, coupled with the colleges where each student applied, was accepted and attended, along with subsequent college performance. Of course, the nature of big data is that it is data, so students are still converted into numerical representations.  But these will cover many dimensions, and those dimensions will better reflect what the students actually do. Each college can approach and analyze the data differently to focus on what they care about.  It is the end of the SAT version of standardization. Colleges can still follow up with interviews, campus tours, and reviews of musical performances, articles, videos of sports, and the like.  But they will have a much better filter in place as they do so.

Two things about this. First, I believe this is largely already happening. I’m not an expert on the usage of student data at colleges and universities, but the peek I’ve had into this industry tells me that the analytics are highly advanced (please add related comments and links if you have them!). And they have more to do with admissions and college aid – and possibly future alumni giving – than any definition of academic success. So I think Bookstaber is being a bit naive and idealistic if he thinks colleges will use this information for good. They already have it and they’re not.

Secondly, I want to think a little bit harder about when the “big, deeper data” approach makes sense. I think it does for teachers to some extent, as I talked about yesterday, because after all it’s part of a job to get evaluated. For that matter I expect this kind of thing to be part of most jobs soon (but it will be interesting to see when and where it stops – I’m pretty sure Bloomberg will never evaluate himself quantitatively).

I don’t think it makes sense to evaluate children in the same way, though. After all, we’re basically talking about pre-consensual surveillance, not to mention the collection and mining of information far beyond the control of the individual child. And we’re proposing to mine demographic and behavioral data to predict future success. This is potentially much more invasive than just one crappy SAT test. Childhood is a time which we should try to do our best to protect, not quantify.

Also, the suggestion that this is less threatening because “the data is anonymized” is misleading. Stripping out names in historical data doesn’t change or obscure the difference between coming from a rich high school or a poor one. In the end you will be judged by how “others like you” performed, and in this regime the system gets off the hook but individuals are held accountable. If you think about it, it’s exactly the opposite of the American dream.

I don’t want to be naive. I know colleges will do what they can to learn about their students and to choose students to make themselves look good, at least as long as the US News & World Reports exists. I’d like to make it a bit harder for them to do so.

The endgame for PageRank

First there was Google Search, and then pretty quickly SEOs came into existence.

SEOs are marketing people hired by businesses to bump up the organic rankings for that business in Google Search results. That means they pay people to make their website more attractive and central to Google Search so they don’t have to pay for ads but will get visitors anyway. And since lots of customers come from search results, this is a big deal for those businesses.

Since Google Search was based on a pretty well-known, pretty open algorithm called PageRank which relies on ranking the interestingness of pages by their links, SEOs’ main jobs were to add links and otherwise fiddle with links to and from the websites of their clients. This worked pretty well at the beginning and the businesses got higher rank and they didn’t have to pay for it, except they did have to pay for the SEOs.
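PageRank at its core is just a link-based score computed by an eigenvector-style iteration. Here’s a minimal power-iteration sketch on a made-up four-page web, using the conventional 0.85 damping factor, to make concrete what all that link-fiddling is trying to game:

import numpy as np

# Toy web of four pages: links[i] lists the pages that page i links to.
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
n, d = 4, 0.85  # number of pages and the damping factor

rank = np.full(n, 1 / n)
for _ in range(100):
    new = np.full(n, (1 - d) / n)
    for page, outlinks in links.items():
        for target in outlinks:
            new[target] += d * rank[page] / len(outlinks)
    rank = new

print(rank.round(3))  # page 2, with the most inbound links, comes out on top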

But after a while Google caught on to the gaming and adjusted its search algorithm, and SEOs responded by working harder at gaming the system (see more history here). It got more expensive but still kind of worked, and nowadays SEOs are a big business. And the algorithm war is at full throttle, with some claiming that Google Search results are nowadays all a bunch of crappy, low-quality ads.

This is to be expected, of course, when you use a proxy like “link” to indicate something much deeper and more complex like “quality of website”. Since it’s so high stakes, the gaming acts to decouple the proxy entirely from its original meaning. You end up with something that is in fact the complete opposite of what you’d intended. It’s hard to address except by giving up the proxy altogether and going for something much closer to what you care about.

Recently my friend Jordan Ellenberg sent me an article entitled The Future of PageRank: 13 Experts on the Dwindling Value of the Link. It’s an insider article, interviewing 13 SEO experts on how they expect Google to respond to the ongoing gaming of the Google Search algorithm.

The experts don’t all agree on the speed at which this will happen, but there seems to be some kind of consensus that Google will stop relying on links as such and will go to user behavior, online and offline, to rank websites.

If correct, this means we can expect Google to pump all of our email, browsing, and even GPS data into understanding our behaviors in minute detail, in order to get at how we perceive “quality” and how to monetize that. Because, let’s face it, it’s all about money. Google wants good organic search results so that people won’t abandon its search engine altogether, so that it can keep selling ads.

So we’re talking GPS on your android, or sensor data, and everything else it can get its hands on through linking up various data sources (which as I read somewhere is why Google+ still exists at all, but I can’t seem to find that article on Google).

It’s kind of creepy all told, and yet I do see something good coming out of it. Namely, it’s what I’ve been saying we should be doing to evaluate teachers, instead of using crappy and gameable standardized tests. We should go deeper and try to define what we actually think makes a good teacher, which will require sensors in the classroom to see if kids are paying attention and are participating and such.

Maybe Google and other creepy tech companies can show us the way on this one, although I don’t expect them to explain their techniques in detail, since they want to stay a step ahead of SEO’s.

Categories: data science, modeling

Julia Angwin’s Dragnet Nation

I recently devoured Julia Angwin‘s new book Dragnet Nation: A Quest for Privacy, Security, and Freedom in a World of Relentless Surveillance. I actually met Julia a few months ago and talked to her briefly about her upcoming book when I visited the ProPublica office downtown, so it was an extra treat to finally get my hands on the book.

First off, let me just say this is an important book, one that provides a crucial and well-described view into the private data behind the models I get so worried about. After reading this book you have a good idea of the data landscape as well as many of the things that can currently go wrong for you personally with the associated loss of privacy. So for that reason alone I think this book should be widely read. It’s informational.

Julia takes us along her journey of trying to stay off the grid, and for me the most fascinating parts are her “data audit” (Chapter 6), where she tries to figure out what data about her is out there and who has it, and the attempts she makes to clean the web of her data and generally speaking “opt out”, which starts in Chapter 7 but extends beyond that when she makes the decision to get off of gmail and LinkedIn. Spoiler alert: her attempts do not succeed.

From the get go Julia is not a perfectionist, which is a relief. She’s a working mother with a web presence, and she doesn’t want to live in paranoid fear of being tracked. Rather, she wants to make the trackers work harder. She doesn’t want to hand herself over to them on a silver platter. That is already very very hard.

In fact, she goes pretty far, and pays for quite a few different esoteric privacy services; along the way she explores questions like how you decide to trust the weird people who offer those services. At some point she finds herself with two phones – including a “burner”, which made me think she was a character in House of Cards – and one of them was wrapped up in tin foil to avoid the GPS tracking. That was a bit far for me.

Early on in the book she compares the tracking of a U.S. citizen with what happened under Nazi Germany, and she makes the point that the Stasi would have been amazed by all this technology.

Very true, but here’s the thing. The culture of fear was very different then, and although there’s all this data out there, important distinctions need to be made: both what the data is used for and the extent to which people feel threatened by that usage are very different now.

Julia brought these up as well, and quoted sci-fi writer David Brin: the key questions are who has access, and what do they do with it?

Probably the most interesting moment in the book was when she described the so-called “Wiretapper’s Ball,” a private conference of private companies selling surveillance hardware and software to governments to track their citizens. Like maybe the Ukrainian government used such stuff when it texted warning messages to protesters.

She quoted the Wiretapper’s Ball organizer Jerry Lucas as saying, “We don’t really get into asking, ‘Is it in the public’s interest?’”

That’s the closest the book got to what I consider the critical question: to what extent is the public’s interest being pursued, if at all, by all of these data trackers and data miners?

And if the answer is “to no extent, by anyone,” what does that mean in the longer term? Julia doesn’t go much into this from an aggregate viewpoint, since her perspective is both individual and current.

At the end of the book, she makes a few interesting remarks. First, it’s just too much work to stay off the grid, and moreover it’s become entirely commoditized. In other words, you have to either be incredibly sophisticated or incredibly rich to get this done, at least right now. My guess is that, in the future, it will be more about the latter category: privacy will be enjoyed only by those people who can afford it.

Julia also mentions near the end that, even though she didn’t want to get super paranoid, she found herself increasingly inside a world based on fear and well on her way to becoming a “data survivalist,” which didn’t sound pleasant. It is not a lot of fun to be the only person caring about the tracking in a world of blithe acceptance.

Julia had some ways of measuring a tracking system, which she refers to as a “dragnet,” and they seem to me a good place to start:

[Image: Julia Angwin’s criteria for evaluating a dragnet]

It’s a good start.

SAT overhaul

There’s a good New York Times article by Todd Balf entitled The Story Behind the SAT Overhaul (hat tip Chris Wiggins).

It describes the story of the new College Board president, David Coleman, and how he decided to deal with the biggest problem with the SAT: namely, that it’s pretty easy to prepare for the test, and the result is that richer kids do better, having more resources – both time and money – to prepare.

Here’s a visual from another NY Times blog on the issue:

[Chart: SAT scores broken down by family income]

Here’s my summary of the story.

At this point the SAT serves mainly to sort people by income. It’s no longer an appropriate way to gauge “IQ” as it was supposed to be when it was invented. Not to mention that colleges themselves have been playing a crazy game with respect to gaming the US News & World Reports college ranking model via their SAT scores. So it’s one feedback loop feeding into another.

How can we deal with this? One way is to stop using it. The article describes some colleges that have made SAT scores optional. They have not suffered, and they have more diversity.

But since the College Board makes their livelihood by testing people, they were never going to just shut down. Instead they’ve decided to explicitly make the SAT about content knowledge that they think high school students should know to signal college readiness.

And that’s good, but of course one can still prepare for that test. And since they’re acknowledging that now, they’re trying to set up the prep to make it more accessible, possibly even “free”.

But here’s the thing, it’s still online, and it still involves lots of time and attention, which still saps resources. I predict we will still see incredible efforts towards gaming this new model, and it will still break down by income, although possibly not quite as much, and possibly we will be training our kids to get good at slightly more relevant stuff.

I would love to see more colleges step outside the standardized testing field altogether.

Categories: modeling, statistics

An attempt to FOIL request the source code of the Value-added model

Last November I wrote to the Department of Education to make a FOIL request for the source code for the teacher value-added model (VAM).

Motivation

To explain why I’d want something like this: I think the VAM sucks, and I’d like to explore the actual source code directly. The white paper I got my hands on is cryptically written (take a look!) and doesn’t explain, for example, how sensitive the model is to its inputs. The best way to get at that is the source code.

Plus, since the New York Times and other news outlets published teachers’ VAM scores after a long battle and a FOIA request (see details about this here), I figured it’s only fair to also publicly release the actual black box which determines those scores.

Indeed, without knowledge of what the model consists of, the VAM scoring regime is little more than a secret set of rules with tremendous power over teachers and the teachers’ union, one that also enables the outrageous public shaming described above.

I think teachers deserve better, and I want to illustrate the weaknesses of the model directly on an open models platform.

The FOIL request

Here’s the email I sent to foil@schools.nyc.gov on 11/22/13:

Dear Records Access Officer for the NYC DOE,

I’m looking to get a copy of the source code for the most recent value-added teacher model through a FOIA request. There are various publicly available descriptions of such models, for example here, but I’d like the actual underlying code.

Please tell me if I’ve written to the correct person for this FOIA request, thank you very much.

Best,
Cathy O’Neil

Since my FOIL request

In response to my request, on 12/3/13, 1/6/14, and 2/4/14 I got letters saying stuff was taking a long time since my request was so complicated. Then yesterday I got the following response:
Screen Shot 2014-03-07 at 8.49.57 AM

If you follow the link you’ll get another white paper, this time from 2012-2013, which is exactly what I said I didn’t want in my original request.

I wrote back, not that it’s likely to work, and after reminding them of the text of my original request I added the following:


What you sent me is the newer version of the publicly available description of the model, very much like my link above. I specifically asked for the underlying code. That would be in a programming language like python or C++ or java.

Can you come back to me with the actual code? Or who should I ask?

Thanks very much,
Cathy

It strikes me as strange that it took them more than 3 months to send me a link to a white paper instead of the source code I requested. Plus I’m not sure what they mean by “SED,” but I’m guessing it means these guys, and I’m not sure exactly who to send a new FOIL request to.

Am I getting the runaround? Any suggestions?

Categories: modeling, statistics

How much is your data worth?

I heard an NPR report yesterday with Emily Steel, reporter from the Financial Times, about what kind of attributes make you worth more to advertisers. She has developed an ingenious online calculator here, which you should go play with.

As you can see it cares about things like whether you’re about to have a kid or are a new parent, as well as whether you’ve got some disease for which the industry is well-developed in terms of predatory marketing.

For example, you can bump up your worth to $0.27 from the standard $0.0007 if you’re obese, and add another $0.10 if you admit to being the type to buy weight-loss products. And of course data warehouses can only get that much money for your data if they know about your weight, which they may or may not if you don’t buy weight-loss products.
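Just to spell out the arithmetic for the two attributes mentioned above (the real calculator includes many more), here’s a toy version:

def worth(obese=False, buys_weight_loss_products=False):
    value = 0.27 if obese else 0.0007  # obesity bumps you from $0.0007 to $0.27
    if buys_weight_loss_products:
        value += 0.10                  # weight-loss shoppers add another dime
    return value

print(worth())                                            # $0.0007
print(worth(obese=True, buys_weight_loss_products=True))  # $0.37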

The calculator doesn’t know everything, and you can experiment with how much it does know, but some of the default assumptions are that it knows my age, gender, education level, and ethnicity. Plenty of assumed information to, say, build an unregulated version of a credit score to bypass the Equal Credit Opportunity Act.

Here’s a price list with more information from the biggest data warehouser of all, Acxiom.

Categories: data science, modeling

The CARD Act works

Every now and then you see a published result that has exactly the right kind of data, in sufficient amounts, to make the required claim. It’s rare but it happens, and as a data lover, when it happens it is tremendously satisfying.

Today I want to share an example of that happening, namely with this paper entitled Regulating Consumer Financial Products: Evidence from Credit Cards (hat tip Suresh Naidu). Here’s the abstract:

We analyze the effectiveness of consumer financial regulation by considering the 2009 Credit Card Accountability Responsibility and Disclosure (CARD) Act in the United States. Using a difference-in-difference research design and a unique panel data set covering over 150 million credit card accounts, we find that regulatory limits on credit card fees reduced overall borrowing costs to consumers by an annualized 1.7% of average daily balances, with a decline of more than 5.5% for consumers with the lowest FICO scores. Consistent with a model of low fee salience and limited market competition, we find no evidence of an offsetting increase in interest charges or reduction in volume of credit. Taken together, we estimate that the CARD Act fee reductions have saved U.S. consumers $12.6 billion per year. We also analyze the CARD Act requirement to disclose the interest savings from paying off balances in 36 months rather than only making minimum payments. We find that this “nudge” increased the number of account holders making the 36-month payment value by 0.5 percentage points.
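The identification strategy here is a difference-in-differences comparison between consumer cards, which the CARD Act covers, and small business cards, which it doesn’t. Here’s a minimal sketch of that estimator with made-up fee rates, just to show the mechanics (the paper itself works with 150 million account-level records):

# Made-up average fee rates (% of average daily balance), not the paper's numbers.
fees = {
    ("consumer", "pre"): 2.9, ("consumer", "post"): 1.2,  # covered by the CARD Act
    ("business", "pre"): 2.8, ("business", "post"): 2.7,  # small business cards: exempt
}

did = ((fees[("consumer", "post")] - fees[("consumer", "pre")])
       - (fees[("business", "post")] - fees[("business", "pre")]))
print(round(did, 1))  # -1.7 - (-0.1) = -1.6 points attributable to the law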

That’s a big savings for the poorest people. Read the whole paper, it’s great, but first let me show you some awesome data broken down by FICO score bins:

[Chart: Rich people buy a lot, poor people pay lots of fees.]

[Chart: Interestingly, some people in the middle lose money for credit card companies. Poor people are great customers but there aren’t so many of them.]

[Chart: The study compared consumer versus small business credit cards. After CARD Act implementation, fees took a nosedive.]

 

This data, and the results in this paper, fly directly in the face of the myth that if you regulate away predatory fees in one way, they will pop up in another way. That myth is based on the assumption of a competitive market with informed participants. Unfortunately the consumer credit card industry, as well as the small business card industry, is not filled with informed participants. This is a great example of how asymmetric information causes predatory opportunities.

Categories: finance, modeling

Does making it easier to kill people result in more dead people?

A fascinating and timely study just came out about the “Stand Your Ground” laws. It was written by Cheng Cheng and Mark Hoekstra, and is available as a pdf here, although I found out about it in a Reuters column written by Hoekstra. Here’s a longish but crucial excerpt from that column:

It is fitting that much of this debate has centered on Florida, which enacted its law in October of 2005. Florida provides a case study for this more general pattern. Homicide rates in Florida increased by 8 percent from the period prior to passing the law (2000-04) to the period after the law (2006-10). By comparison, national homicide rates fell by 6 percent over the same time period. This is a crude example, but it illustrates the more general pattern that exists in the homicide data published by the FBI.

The critical question for our research is whether this relative increase in homicide rates was caused by these laws. Several factors lead us to believe that laws are in fact responsible. First, the relative increase in homicide rates occurred in adopting states only after the laws were passed, not before. Moreover, there is no history of homicide rates in adopting states (like Florida) increasing relative to other states. In fact, the post-law increase in homicide rates in states like Florida was larger than any relative increase observed in the last 40 years. Put differently, there is no evidence that states like Florida just generally experience increases in homicide rates relative to other states, even when they don’t pass these laws.

We also find no evidence that the increase is due to other factors we observe, such as demographics, policing, economic conditions, and welfare spending. Our results remain the same when we control for these factors. Along similar lines, if some other factor were driving the increase in homicides, we’d expect to see similar increases in other crimes like larceny, motor vehicle theft and burglary. We do not. We find that the magnitude of the increase in homicide rates is sufficiently large that it is unlikely to be explained by chance.

In fact, there is substantial empirical evidence that these laws led to more deadly confrontations. Making it easier to kill people does result in more people getting killed.

If you take a look at page 33 of the paper, you’ll see some graphs of the data. Here’s a rather bad picture of them but it might give you the idea:

[Figure from page 33 of the paper: log homicide rates over time, one panel per year of law adoption, treatment states versus control states]

That red line is the same in each plot and refers to the log homicide rate in states without the Stand Your Ground law. The blue lines are showing how the log homicide rates looked for states that enacted such a law in a given year. So there’s a graph for each year.

In 2009 there’s only one “treatment” state, namely Montana, which has a population of 1 million, less than one third of one percent of the country. For that reason you see much less stable data. The authors did different analyses, sometimes weighted by population, which is good.

I have to admit, looking at these plots, the main thing I see is that, besides Montana, we’re talking about states that have higher homicide rates than usual to begin with, which could potentially indicate a confounding condition. To address that (and other concerns) the authors conducted “falsification tests”: they studied whether crimes unrelated to Stand Your Ground type laws, namely larceny and motor vehicle theft, went up at the same time. They found that the answer is no.

The next point is that, although there seem to be bumps in the two years after enactment for the 2005, 2006, and 2008 cohorts, there don’t seem to be for 2007 and 2009. And even those states’ rates come down eventually, but the point is that they don’t come down as much as in the states without the laws.

It’s hard to do this analysis perfectly, with so few years of data. The problem is that, as soon as you suspect there’s a real effect, you’d want to act on it, since it directly translates into human deaths. So your natural reaction as a researcher is to “collect more data” but your natural reaction as a citizen is to abandon these laws as ineffective and harmful.
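
To make the falsification idea from a couple of paragraphs up concrete, here is a rough sketch of what such a check amounts to, run on a hypothetical state-year panel. The file and column names are invented, and this is a simplification of the paper’s actual specification.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical state-year panel; column names are invented for illustration.
# syg = 1 once a state's Stand Your Ground law is in effect, 0 otherwise.
panel = pd.read_csv("state_crime_panel.csv")

# Run the same two-way fixed effects regression on homicide and on
# "placebo" crimes that the law shouldn't plausibly affect.
for outcome in ["log_homicide", "log_larceny", "log_mv_theft"]:
    fit = smf.wls(
        f"{outcome} ~ syg + C(state) + C(year)",
        data=panel,
        weights=panel["population"],  # population weighting, one of several possible choices
    ).fit()
    print(outcome, round(fit.params["syg"], 3))

# If the syg coefficient came out just as large for larceny or motor vehicle
# theft, you'd worry the homicide result reflects something else going on in
# those states. The authors report that it doesn't.
```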

Categories: modeling, news, statistics

Intentionally misleading data from Scott Hodge of the Tax Foundation

Scott Hodge just came out with a column in the Wall Street Journal arguing that reducing income inequality is way too hard to consider. The title of his piece is Scott Hodge: Here’s What ‘Income Equality’ Would Look Like, and his basic argument is as follows.

First of all, the middle quintile already gets too much from the government as it stands. Second of all, we’d have to raise taxes on the top quintile to 74% to even stuff out. Clearly impossible, QED.

As to the first point, his argument and his supporting data are intentionally misleading, as I will explain below. As to his second point, he fails to mention that the top tax bracket has historically been much higher than 74%, even as recently as 1969, and the world didn’t end.

Hodge argues with data he took from a CBO report called The Distribution of Federal Spending and Taxes in 2006. This report distinguishes between transfers and spending. Here’s a chart to explain what that looks like, before taxes are considered, broken down by quintile, for non-elderly households (page 5 of the report):

[Chart from page 5 of the CBO report: transfers, Medicaid-type spending, and general spending by income quintile, for non-elderly households]

The stuff on the left corresponds to stuff like food stamps. The stuff in the middle is stuff like Medicaid. The stuff on the right is stuff like wars.

Here are a few things to take from the above:

  1. There’s way more general spending going on than transfers.
  2. Transfers are very skewed towards the lowest quintile, as would be expected.
  3. If you look carefully at the right-most graph, the light green version gives you a way of visualizing how much more money the top quintile has versus the rest.

Now let’s break this down a bit further to include taxes. This is a key chart from the same report that Hodge referred to (page 6):

[Chart from page 6 of the CBO report: transfers and spending received versus federal taxes paid, by income quintile]

OK, so note that in the middle chart, the middle quintile pays more in taxes than it receives in transfers. In the right chart, which includes all spending, the middle quintile comes out about even, depending on how you measure it.

Now let’s go to what Hodge says in his column (emphasis mine):

Looking at prerecession data for non-elderly households in 2006 in “The Distribution of Federal Spending and Taxes in 2006,” the CBO found that those in the bottom fifth, or quintile, of the income scale received $9.62 in federal spending for every $1 they paid in federal taxes of all kinds. This isn’t surprising, since people with low incomes pay little in taxes but receive a lot of transfers.

Nor is it surprising that households in the top fifth received 17 cents in federal spending for every $1 they paid in all federal taxes. High-income households hand over a disproportionate amount in taxes relative to what they get back in spending.

What is surprising is that the middle quintile—the middle class—also got more back from government than they paid in taxes. These households received $1.19 in government spending for every $1 they paid in federal taxes.

In the first paragraph Hodge intentionally conflates the concepts of “transfers” and “spending”. He continues to do this for the next two paragraphs, and by the last sentence it is easy to imagine a middle-quintile family paying $100 in taxes and receiving $119 in food stamps. This is of course not true at all.

What’s nuts about this is that it’s mathematically equivalent to complaining that half the population is below median intelligence. Duh.

Since we have a skewed distribution of incomes, and therefore a skewed distribution of tax receipts as well as transfers, in the context of a completely balanced budget we would expect the middle quintile – which has a below-mean average income – to pay slightly less in taxes than the government spends on it. That’s a mathematical fact, as long as our federal tax system isn’t regressive, which it’s not.
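
Here’s a toy example with made-up numbers that shows the arithmetic: five households, one per quintile, a right-skewed income distribution, a flat 20% tax, and a balanced budget spent equally on everyone.

```python
# Toy example with invented numbers: five households, one per quintile,
# with a right-skewed income distribution (mean 99,000, median 55,000).
incomes = [15_000, 35_000, 55_000, 90_000, 300_000]

tax_rate = 0.20                                   # flat (proportional) tax
taxes = [income * tax_rate for income in incomes]
budget = sum(taxes)                               # balanced budget: spend what you collect
spending_per_household = budget / len(incomes)    # spend it equally on everyone

for income, tax in zip(incomes, taxes):
    ratio = spending_per_household / tax
    print(f"income {income:>7,}: pays {tax:>7,.0f}, receives {spending_per_household:>7,.0f}, ratio {ratio:.2f}")

# The middle household pays 11,000 and receives 19,800 (a ratio of 1.80),
# simply because its income sits below the mean.
```

A received-to-paid ratio somewhat above one for a below-mean-income group is exactly what this arithmetic predicts; it isn’t evidence of a freeloading middle class.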

In other words, this guy is just framing stuff in a “middle class is lazy and selfish, what could rich people possibly be expected to do about that?” kind of way. Who is this guy anyway?

Turns out that Hodge is the President of the Tax Foundation, which touts itself as “nonpartisan” but which has gotten funding from Big Oil and the Koch brothers. I guess it’s fair to say he has an agenda.

Categories: modeling, news, rant

Diane Ravitch speaks in Westchester

One thing I learned on the “Public Facing Math” panel at the JMM was that I needed to know more about the Common Core, since so much of the audience was very interested in discussing it and since it is actually a huge factor in the public’s perception of math, both in terms of the high school math curriculum and in terms of the mathematical models associated with assessments. In fact, at that panel I promised to learn more about the Common Core, and I urged the other mathematicians in the room to do the same.

As part of my research I listened to a recent lecture that Diane Ravitch gave in Westchester which centered on the Common Core. The video of the lecture is available here.

Diane Ravitch

If you don’t know anything about Diane Ravitch, you should. She’s got a super interesting history in education – she’s an education historian – and in particular has worked high up, as the U.S. Assistant Secretary of Education and on the National Assessment Governing Board, which supervises the National Assessment of Educational Progress.

What’s most interesting about her is that, as a high-ranking person in education, she originally supported the Bush “No Child Left Behind” policy but is now an outspoken opponent of it, as well as of Obama’s “Race to the Top”, which she claims is an extension of the same bad idea.

Ravitch writes an incredibly interesting blog on education issues and, what’s most interesting to me, assessment issues.

Ravitch in Westchester

Let me summarize her remarks in a free-form and incomplete way. If you want to know exactly what she said and how she said it, watch the video, and feel free to skip the first 16 minutes of introductions.

She doesn’t like the Common Core initiative and mentions that Gates Foundation people, mostly not experienced educators and many of them associated with the testing industry, developed the Common Core standards. So there’s a suspicion right off the bat that the material is overly academic and unrealistic for actual teachers in actual classrooms.

She also objects to the idea of any fixed and untested set of standards. No standard is perfect, and this one is rigid. At the very least, if we need a “one solution for all” kind of standard, it needs to be under constant review and testing and open to revisions – a living document to change with the times and with the needs and limits of classrooms.

So now we have an unrealistic and rigid set of standards, written by outsiders with vested interests, and it’s all for the sake of being able to test everyone to death. She also made some remarks about the crappiness of the Value-Added Model similar to stuff I’ve mentioned in the past.

The Common Core initiative, she explains, exposes an underlying and incorrect mindset, which is that testing makes kids learn, and more testing makes kids learn faster. That setting a high bar makes kids suddenly be able to jump higher. The Common Core, she says, is that higher bar. But just because you raise standards doesn’t mean people suddenly know more.

In fact, she got a leaked copy of last year’s Common Core test and saw that its 5th grade version is similar to a current 8th grade standardized test. So it’s very much this “raise the bar” setup. And it points to the fact that standardized testing is used as punishment rather than as a diagnostic.

In other words, if we were interested in finding out who needs help and giving them help, we wouldn’t need harder and harder tests, we’d just look at who is struggling with the current tests and go help them. But because it’s all about punishment, we need to add causality and blame to the environment.

She claims that poverty causes kids to underperform in schools, and that blaming teachers for the effects of poverty is a huge distraction and meaningless for those kids. In fact, she asks, what is going to happen to all of those kids who fail the Common Core standards? What is going to become of them if we don’t allow them to graduate? And how do we think we are helping them? Why do we spend so much time developing these fancy tests and assessments instead of figuring out how to help those kids graduate?

She also points out that the blame game going on in this country is fueled by bad facts.

For example, there is no actual educational emergency in this country. In fact, test scores and graduation rates have never been higher for each racial group. And, although we are always made to be afraid vis-à-vis our “international competition” (great recent example of this here), we have historically never scored at the top of international rankings. But we didn’t think that meant we weren’t competitive 50 years ago, so why do we suddenly care now?

She provides the answer. Namely, if people are convinced there is an emergency in education, then private companies – test prep and testing companies, as well as companies that run charter schools – stand to make big money from our response and from straight-up privatization.

The statistical argument that poverty causes kids to underperform in school is there to be made. If we want to “fix our educational system,” then we need to address poverty, not scapegoat teachers.

Categories: math education, modeling

How do you define success for a calculus MOOC?

I’m going to strike now, while the conversational iron is hot, and ask people to define success for a calculus MOOC.

Why?

I’ve already mostly explained why in this recent post, but just in case you missed it: I think mathematics is being threatened by calculus MOOCs, and although in some possible futures this wouldn’t be a bad thing, in others it definitely would.

One way it could be a really truly bad thing is if the metric of success were as perverted as we’ve seen happen in high school teaching, where Value-Added Models have no defined metric of success and are tearing up a generation of students and teachers, creating the kind of opaque, confusing, and threatening environment where code errors lead to people getting fired.

And yes, it’s kind of weird to define success in a systematic way given that calculus has been taught in a lot of places for a long time without such a rigid concept. And it’s quite possible that flexibility should be built in to the definition, so as to acknowledge that different contexts need different outcomes.

Let’s keep things as complicated as they need to be to get things right!

The problem with large-scale models is that they are easier to build if you have some fixed definition of success against which to optimize. And if we mathematicians don’t get busy thinking this through, my fear is that administrations will do it for us, and will come up with things based strictly on money and not so much on pedagogy.

So what should we try?

Here are what I consider to be the critical ideas to get started:

  • Calculus teachers should start experimenting with teaching calculus in different ways. Do randomized experiments with different calculus sections that meet at comparable times (I say comparable because I’ve noticed that people who show up for 8am sections are typically more motivated students, so don’t pit them against 4pm sections).
  • Try out a bunch of different possible definitions of success, including the experience and attitude of the students and the teacher.
  • So for example, it could be how students perform on the final, which should be consistent for both sections (although to do that fairly you need to make sure the MOOC you’re using covers the material needed for the final); see the sketch after this list.
  • Or it could be partly an oral exam or long-form written exam, testing whether students have learned to discuss the concepts (keeping in mind that we have to compare the “MOOC” students to the standardly taught students).
  • Design the questions you will ask your students and yourself before the semester begins so as to practice good model design – we don’t want to decide on our metric after the fact. A great way to do this is to keep a blog with your plan carefully described – that will timestamp the plan and allow others to comment.
  • Of course there’s more than one way to incorporate MOOCs in the curriculum, so I’d suggest more than one experiment.
  • And of course the success of the experiment will also depend on the teaching style of the calc prof.
  • Finally, share your results with the world so we can all start thinking in terms of what works and for whom.
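
As a deliberately oversimplified version of the final-exam comparison mentioned in the list above, here is one way you might check whether the gap between a MOOC-supplemented section and a standard section could plausibly be due to chance. The scores are placeholders, and a real experiment would pre-register a lot more than this (and would care about more than the mean).

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder final exam scores from two randomized calculus sections
# meeting at comparable times; these numbers are invented.
mooc_section = np.array([72, 85, 90, 64, 78, 88, 70, 81, 95, 60])
standard_section = np.array([75, 80, 68, 77, 85, 73, 66, 79, 84, 71])

observed_gap = mooc_section.mean() - standard_section.mean()

# Permutation test: if section assignment didn't matter, randomly relabeling
# students should produce gaps at least this large reasonably often.
pooled = np.concatenate([mooc_section, standard_section])
n = len(mooc_section)
perm_gaps = []
for _ in range(10_000):
    shuffled = rng.permutation(pooled)
    perm_gaps.append(shuffled[:n].mean() - shuffled[n:].mean())

p_value = np.mean(np.abs(perm_gaps) >= abs(observed_gap))
print(f"observed gap: {observed_gap:.1f} points, permutation p-value: {p_value:.3f}")
```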

One last comment. One might complain that, if we do this, we’re actually hastening our own demise by accelerating the adoption of MOOCs in the classroom. But I think it’s important that we take control before someone else does.

Categories: math education, modeling

What is regulation for?

A couple of days ago I was listening to a recorded webinar on K-12 student data privacy. I found out about it through an education blog I sometimes read called deutsch29, where the blog writer was complaining about “data cheerleaders” on a panel and how important issues are sure to be ignored if everyone on a panel is on the same, pro-data and pro-privatization side.

Well, as it turns out, deutsch29 was almost correct. Most of the panelists were super bland and pro-data-collection by private companies. But the first panelist, Joel Reidenberg of Fordham Law School, reported on the state of data sharing in this country, the state of the law, and the gulf between the two.

I will come back to his report in another post, because it’s super fascinating, and in fact I’d love to interview that guy for my book.

One thing I wanted to mention was the high-level discussion that took place in the webinar on what regulation is for. Specifically, the following important question was asked:

Does every parent have to become a data expert in order to protect their children’s data?

The answer was different depending on who answered it, of course, but one answer that resonated with me was that this is what regulation is for: it exists so that parents can rely on it to protect their children’s privacy, just as we expect HIPAA to protect the integrity of our medical data.

I started to like this definition – or attribute, if you will – of regulation, and I wondered how it relates to other kinds of regulation, like in finance, as well as how it would work if you’re arguing with people who hate all regulation.

First of all, I think that the financial industry has figured out how to make things so goddamn complicated that nobody can figure out how to regulate anything well. Moreover, they’ve somehow, at least so far, also been able to insist things need to be this complicated. So even if regulation were meant to allow people to interact with the financial system and at the same time “not be experts,” it’s clearly not wholly working. But what I like about it anyway is the emphasis on this issue of complexity and expertise. It took me a long time to figure out how big a problem that is in finance, but with this definition it goes right to the heart of the issue.

Second, as for the people who argue for de-regulation, I think it helps there too. Most of the time they act like everyone is an omniscient free agent who spends all their time becoming an expert on everything. And if that were true, then it’s possible that regulation wouldn’t be needed (although transparency is key too). The point is that we live in a world where most people have no clue about the issues of data privacy, never mind when it’s being shielded by ridiculous and possibly illegal contracts behind their kids’ public school system.

Finally, in terms of the potential for protecting kids’ data: here private companies like InBloom are way ahead of regulators, but it’s not because of the complexity of the issues so much as the fact that regulators haven’t caught up with technology. At least that’s my optimistic feeling about it. I really think this stuff is solvable in the short term, and considering it involves kids, I think it will have bipartisan support. Plus, the educational benefits of collecting all this data have not been proven at all, nor do they really require such shitty privacy standards even if they do work.

Categories: data science, finance, modeling