I often talk about the modeling war, and I usually mean the one where the modelers are on one side and the public is on the other. The modelers are working hard trying to convince or trick the public into clicking or buying or consuming or taking out loans or buying insurance, and the public is on the other, barely aware that they’re engaging in anything at all resembling a war.
But there are plenty of other modeling wars that are being fought by two sides which are both sophisticated. To name a couple, Anonymous versus the NSA and Anonymous versus itself.
Here’s another, and it’s kind of bland but pretty simple: Twitter bots versus Twitter.
This war arose from the fact that people care about how many followers someone on Twitter has. It’s a measure of a person’s influence, albeit a crappy one for various reasons (and not just because it’s being gamed).
The high impact of the follower count means it’s in a wannabe celebrity’s best interest to juice their follower numbers, which introduces the idea of fake Twitter accounts to game the model. This has become an industry in itself, with an associated arms race of spam filters trying to get rid of the fakes. The question is, who’s winning this arms race and why?
Twitter has historically made some strides in finding and removing such fake accounts with the help of some modelers who actually bought the services of a spammer and looked carefully at what their money bought them. Recently though, at least according to this WSJ article, it looks like Twitter has spent less energy pursuing the spammers.
That raises the question: why? After all, Twitter has a lot theoretically at stake. Namely, its reputation, because if everyone knows how gamed the system is, they’ll stop trusting it. On the other hand, that argument only really holds if people have something else to use instead as a better proxy of influence.
Even so, considering that Twitter has a bazillion dollars in the bank right now, you’d think they’d spend a few hundred thousand a year to prevent their reputation from being too tarnished. And maybe they’re doing that, but the spammers seem to be happily working away in spite of that.
And judging from my experience on Twitter recently, there are plenty of spammers actively degrading the user experience. That brings up my final point, which is that the lack-of-competition argument at some point gives way to the “I don’t want to be spammed” user-experience argument. At some point, if Twitter doesn’t maintain standards, people will just not spend time on Twitter, and its proxy of influence will fall out of favor for that more fundamental reason.
The idea is that we’re analyzing metadata around a texting hotline for teens in crisis. We’re trying to see if we can use the information we have on these texts (timestamps, character length, topic – which is most often suicide – and outcome reported by both the texter and the counselor) to help the counselors improve their responses.
For example, right now counselors can be in up to 5 conversations at a time – is that too many? Can we figure that out from the data? Is there too much waiting between texts? Other questions are listed here.
Our “hackpad” is located here, and will hopefully be updated like a wiki with results and visuals from the exploration of our group. It looks like we have a pretty amazing group of nerds over here looking into this (mostly python users!), and I’m hopeful that we will be helping the good people at Crisis Text Line.
We saw what happened in finance with self-regulation and ethics. Let’s prepare for the exact same thing in big data.
Remember back in the 1970s through the 1990s, when the powers that were decided that we didn’t need to regulate banks because “they” wouldn’t put “their” best interests at risk? And then came the financial crisis, and most recently Alan Greenspan’s admission that he’d gotten it kinda wrong, but not really.
Let’s look at what the “self-regulated market” in derivatives has bestowed upon us. We’ve got a bunch of captured regulators and a huge group of bankers who insist on keeping derivatives opaque so that they can charge clients bigger fees, not to mention that they insist on not having fiduciary duties to their clients, and oh yes, they’d like to continue to bet depositors’ money on those derivatives. They wrote the regulation themselves for that one. And this is after they blew up the world and got saved by the taxpayers.
Given that the banks write the regulations, it’s arguably still kind of a self-regulated market in finance. So we can see how ethics has been and is faring in such a culture.
The answer is, not well. Just in case the last 5 years of news articles weren’t enough to persuade you of this fact, here’s what NY Fed Chief Dudley had to say recently about big banks and the culture of ethics, from this Huffington Post article:
“Collectively, these enhancements to our current regime may not solve another important problem evident within some large financial institutions — the apparent lack of respect for law, regulation and the public trust,” he said.
“There is evidence of deep-seated cultural and ethical failures at many large financial institutions,” he continued. “Whether this is due to size and complexity, bad incentives, or some other issues is difficult to judge, but it is another critical problem that needs to be addressed.”
Given that my beat is now more focused on the big data community and less on finance, mostly since I haven’t worked in finance for almost 2 years, this kind of stuff always makes me wonder how ethics is faring in the big data world, which is, again, largely self-regulated.
Examples of how awesome “transparency” is in these cases vary from letting people know what cookies are being used (BlueKai), to promising not to share certain information between vendors (Retention Science), to allowing customers a limited view into their profiling by Acxiom, the biggest consumer information warehouse. Here’s what I assume a typical reaction might be to this last one.
Wow! I know a few things Acxiom knows about me, but probably not all! How helpful. I really trust those guys now.
Not a solution
What’s great about letting customers know exactly what you’re doing with their data is that you can then turn around and complain that customers don’t understand or care about privacy policies. In any case, it’s on them to evaluate and argue their specific complaints. Which of course they don’t do, because they can’t possibly do all that work and have a life, and if they really care they just boycott the product altogether. The result in any case is a meaningless, one-sided conversation where the tech company only hears good news.
Oh, and you can also declare that customers are just really confused and don’t even know what they want:
In a recent Infosys global survey, 39% of the respondents said that they consider data mining invasive. And 72% said they don’t feel that the online promotions or emails they receive speak to their personal interests and needs.
Conclusion: people must want us to collect even more of their information so they can get really really awesome ads.
Finally, if you make the point that people shouldn’t be expected to be data mining and privacy experts to use the web, the issue of a “market solution for ethics” is raised.
“The market will provide a mechanism quicker than legislation will,” he says. “There is going to be more and more control of your data, and more clarity on what you’re getting in return. Companies that insist on not being transparent are going to look outdated.”
Back to ethics
What we’ve got here is a repeat problem. The goal of tech companies is to make money off of consumers, just as the goal of banks is to make money off of investors (and taxpayers as a last resort).
Given how much these incentives clash, the experts on the inside have figured out a way of continuing to do their thing, make money, and at the same time, keeping a facade of the consumer’s trust. It’s really well set up for that since there are so many technical terms and fancy math models. Perfect for obfuscation.
If tech companies really did care about the consumer, they’d help set up reasonable guidelines and rules on these issues, which could easily be turned into law. Instead they send lobbyists to water down any and all regulation. They’ve even recently created a new superPAC for big data (h/t Matt Stoller).
And although it’s true that policy makers are totally ignorant of the actual issues here, that might be because of the way big data professionals talk down to them and keep them ignorant. It’s obvious that tech companies are desperate for policy makers to stay out of any actual informed conversation about these issues, never mind the public.
There never has been, nor will there ever be, a market solution for ethics so long as the basic incentives between the public and an industry are so misaligned. The public needs to be represented somehow, and without rules and regulations, and without leverage of any kind, that will not happen.
Yesterday I read Alan Greenspan’s recent article in Foreign Affairs magazine (hat tip Rhoda Schermer). It is entitled “Never Saw It Coming: Why the Financial Crisis Took Economists By Surprise,” and for those of you who want to save some time, it basically goes like this:
I’ll admit it, the macroeconomic models that we used before the crisis failed, because we assumed financial firms behaved rationally. But now there are new models that assume predictable irrational behavior, and once we add those bells and whistles onto our existing models, we’ll be all good. Y’all can start trusting economists again.
Here’s the thing that drives me nuts about Greenspan. He is still talking about financial firms as if they are single people. He just didn’t really read Adam Smith’s Wealth of Nations, or at least didn’t understand it, because if he had, he’d have seen that Adam Smith argued against large firms in which the agendas of the individuals ran counter to the agenda of the company they worked for.
If you think about individuals inside the banks, in other words, then their individual incentives explain their behavior pretty damn well. But Greenspan ignores that and still insists on looking at the bank as a whole. Here’s a quote from the piece:
Financial firms accepted the risk that they would be unable to anticipate the onset of a crisis in time to retrench. However, they thought the risk was limited, believing that even if a crisis developed, the seemingly insatiable demand for exotic financial products would dissipate only slowly, allowing them to sell almost all their portfolios without loss. They were mistaken.
Let’s be clear. Financial firms were not “mistaken”, because legal contracts can’t think. As for the individuals working inside those firms, there was no such assumption about a slow exhale. Everyone was furiously getting their bonuses pumped up while the getting was good. People on the inside knew the market for exotic financial products would blow at some point, and that their personal risks were limited, so why not make systemic risk worse until then.
As a mathematical modeler myself, it bugs me to try to put a mathematical band-aid on an inherently failed model. We should instead build a totally new model, or even better remove the individual perverted incentives of the market using new rules (I’m using the word “rules” instead of “regulations” because people don’t hate rules as much as they hate regulations).
Wouldn’t it be nice if the agendas of the individuals inside a financial firm were more closely aligned with the financial firm? And if it was over a long period of time instead of just until the bonus day? Not impossible.
And, since I’m an occupier, I get to ask even more. Namely, wouldn’t it be even nicer if that agenda was also shared by the general public? Doable!
Mr. Greenspan, there are ways to address the mistake you economists made and continue to make, but they don’t involve fancier math models from behavioral economics. They involve really simple rule changes and, generally speaking, making finance much more boring and much less profitable.
I’ve been really impressed by how consistently people have gone to read my post “K-Nearest Neighbors: dangerously simple,” which I wrote back in April. Here’s a timeline of hits on that post:
I think the interest in this post is that people like having myths debunked, and are particularly interested in hearing how even the simple things that they thought they understood are possibly wrong, or at least more complicated than they’d been assuming. Either that or it’s just got a real catchy name.
Anyway, since I’m still getting hits on that post, I’m also still getting comments, and just this morning I came across a new comment by someone who calls herself “travelingactuary”. Here it is:
My understanding is that CEOs hate technical details, but do like results. So, they wouldn’t care if you used K-Nearest Neighbors, neural nets, or one that you invented yourself, so long as it actually solved a business problem for them. I guess the problem everyone faces is, if the business problem remains, is it because the analysis was lacking or some other reason? If the business is ‘solved’ is it actually solved or did someone just get lucky? That being so, if the business actually needs the classifier to classify correctly, you better hire someone who knows what they’re doing, rather than hoping the software will do it for you.
Presumably you want to sell something to Monica, and the next n Monicas who show up. If your model finds a whole lot of big spenders who then don’t, your technophobe CEO is still liable to think there’s something wrong.
I think this comment brings up the right question, namely knowing when you’ve solved your data problem, with K-Nearest Neighbors or whichever algorithms you’ve chosen to use. Unfortunately, it’s not that easy.
Here’s the thing, it’s almost never possible to tell if a data problem is truly solved. I mean, it might be a business problem where you go from losing money to making money, and in that sense you could say it’s been “solved.” But in terms of modeling, it’s very rarely a binary thing.
Why do I say that? Because, at least in my experience, it’s rare that you could possibly hope for high accuracy when you model stuff, even if it’s a classification problem. Most of the time you’re trying to achieve something better than random, some kind of edge. Often an edge is enough, but it’s nearly impossible to know if you’ve gotten the biggest edge possible.
For example, say you’re binning people who come to your site into three equally sized groups, as “high spenders,” “medium spenders,” and “low spenders.” So if the model were random, you’d expect a third to be put into each group, and someone who ends up as a big spender would be equally likely to be in any of the three bins.
Next, say you make a model that’s better than random. How would you know that? You can measure that, for example, by comparing it to the random model, or in other words by seeing how much better you do than random. So if someone who ends up being a big spender is three times more likely to have been labeled a big spender than a low spender and twice as likely as a medium spender, you know your model is “working.”
You’d use those numbers, 3x and 2x, as a way of measuring the edge your model is giving you. You might care about other related numbers more, like whether pegged low spenders are actually low spenders. It’s up to you to decide what it means that the model is working. But even when you’ve done that carefully, and set up a daily updated monitor, the model itself still might not be optimal, and you might still be losing money.
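To make the 3x-and-2x idea concrete, here’s a minimal sketch in Python of measuring that kind of edge against the random baseline. All the labels and numbers below are invented for illustration; nothing here comes from a real model.

```python
from collections import Counter

# Hypothetical data: each pair is (predicted_bin, actual_bin) for one visitor.
# Bins: "high", "medium", "low" spenders. All records are made up.
observations = [
    ("high", "high"), ("high", "high"), ("high", "medium"),
    ("medium", "high"), ("medium", "medium"), ("medium", "low"),
    ("low", "low"), ("low", "low"), ("low", "medium"),
    ("high", "high"), ("low", "low"), ("medium", "medium"),
]

# Among people who *actually* turned out to be high spenders,
# how did the model label them?
labels_of_actual_high = Counter(
    pred for pred, actual in observations if actual == "high"
)
n_high = sum(labels_of_actual_high.values())

# A random model would label a third of them "high", a third "medium",
# a third "low", so the baseline probability of any label is 1/3.
for label in ("high", "medium", "low"):
    share = labels_of_actual_high[label] / n_high
    lift = share / (1 / 3)  # how much better (or worse) than random
    print(f"actual high spenders labeled {label}: lift {lift:.1f}x")
```

The `lift` numbers are exactly the 3x/2x-style edge measurements; in practice you’d compute them for every bin and track them over time rather than eyeball them once.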
In other words, you can be a bad modeler or a good modeler, and either way when you try to solve a specific problem you won’t really know if you did the best possible job you could have, or someone else could have with their different tools and talents.
Even so, there are standards that good modelers should follow. First and most importantly, you should always set up a model monitor to keep track of the quality of the model and see how it fares over time. Because why? Because second, you should always assume that, over time, your model will degrade, even if you are updating it regularly or even automatically. It’s of course good to know how crappy things are getting so you don’t have a false sense of accomplishment.
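A model monitor doesn’t have to be fancy to be useful. Here’s a minimal sketch, with made-up daily numbers and a made-up alert threshold, of the kind of check I have in mind:

```python
# A minimal model-monitor sketch. The dates, lift values, and the
# alert floor are all invented; in real life you'd recompute the metric
# daily from each day's labeled outcomes.

ALERT_FLOOR = 1.5  # hypothetical: alert if lift over random drops below 1.5x

daily_lift = {
    "2013-11-01": 2.9,
    "2013-11-02": 2.7,
    "2013-11-03": 2.2,
    "2013-11-04": 1.8,
    "2013-11-05": 1.4,  # the model is degrading, as models do
}

# Flag every day the model's edge fell below the floor.
alerts = [day for day, lift in sorted(daily_lift.items()) if lift < ALERT_FLOOR]
for day in alerts:
    print(f"{day}: lift {daily_lift[day]} below {ALERT_FLOOR}, time to investigate")
```

The point isn’t the threshold itself, which you’d pick for your own problem; it’s that degradation becomes something you notice on a dashboard rather than something you discover when the money stops coming in.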
Keep in mind that just because it’s getting worse doesn’t mean you can easily start over again and do better. But at least you can try, and you will know when it’s worth a try. So, that’s one thing that’s good about admitting your inability to finish anything.
On to the political aspect of this issue. If you work for a CEO who absolutely hates ambiguity – and CEOs are trained to hate ambiguity, as well as trained to never hesitate – and if that CEO wants more than anything to think their data problem has been “solved,” then you might be tempted to argue that you’ve done a phenomenal job just to make her happy. But if you’re honest, you won’t say that, because it ain’t true.
Ironically and for these reasons, some of the most honest data people end up looking like crappy scientists because they never claim to be finished doing their job.
I had a great time at Harvard Wednesday giving my talk (prezi here) about modeling challenges. The audience was fantastic and truly interdisciplinary, and they pushed back and challenged me in a great way. I’m glad I went and I’m glad Tess Wise invited me.
One issue that came up is something I want to talk about today, because I hear it all the time and it’s really starting to bug me.
Namely, the fallacy that people, especially young people, are “happy to give away their private data in order to get the services they love on the internet”. The actual quote came from the IBM guy on the congressional subcommittee panel on big data, which I blogged about here (point #7), but I’ve started to hear that reasoning more and more often from people who insist on side-stepping the issue of data privacy regulation.
Here’s the thing. It’s not that people don’t click “yes” on those privacy forms. They do click yes, and I acknowledge that. The real problem is that people generally have no clue what it is they’re trading.
In other words, this idea of an omniscient market participant with perfect information making a well-informed trade, which we’ve already seen is not the case in the actual market, is doubly or triply not the case when you think about young people giving away private data for the sake of a phone app.
Just to be clear about what these market participants don’t know, I’ll make a short list:
- They probably don’t know that their data is aggregated, bought, and sold by Acxiom, which they’ve probably never heard of.
- They probably don’t know that Facebook and other social media companies sell stuff about them even if their friends don’t see it and even though it’s often “de-identified”. Think about this next time you sign up for a service like “Bang With Friends,” which works through Facebook.
- They probably don’t know how good algorithms are getting at identifying de-identified information.
- They probably don’t know how this kind of information is used by companies to profile users who ask for credit or try to get a job.
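To see why “de-identified” offers less protection than it sounds, here’s a toy sketch of the standard linkage attack: joining names-stripped data to a public dataset on quasi-identifiers like zip code, birth year, and gender. Every record and name below is invented.

```python
# Toy illustration of re-identifying "de-identified" records by joining
# on quasi-identifiers. All records and names are invented.

deidentified_purchases = [
    {"zip": "10027", "birth_year": 1985, "gender": "F", "bought": "diet pills"},
    {"zip": "10027", "birth_year": 1962, "gender": "M", "bought": "golf clubs"},
]

public_voter_roll = [
    {"name": "Jane Doe", "zip": "10027", "birth_year": 1985, "gender": "F"},
    {"name": "John Roe", "zip": "10027", "birth_year": 1962, "gender": "M"},
]

reidentified = []
for record in deidentified_purchases:
    matches = [
        v["name"] for v in public_voter_roll
        if (v["zip"], v["birth_year"], v["gender"])
        == (record["zip"], record["birth_year"], record["gender"])
    ]
    if len(matches) == 1:  # a unique match outs the "anonymous" record
        reidentified.append((matches[0], record["bought"]))

print(reidentified)
```

With only a handful of attributes, most people are unique in a dataset, so this kind of join succeeds far more often than intuition suggests.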
Conclusion: people are ignorant of what they’re giving away to play Candy Crush Saga. And whatever it is they’re giving away, it’s something way far in the future that they’re not worried about right now. In any case it’s not a fair trade by any means, and we should stop referring to it as such.
What is it instead? I’d say it’s a trick. A trick which plays on our own impulses and short-sightedness and possibly even a kind of addiction to shiny toys in the form of candy. If you give me your future, I’ll give you a shiny toy to play with right now. People who click “yes” are not signaling that they’ve thought deeply about the consequences of giving their data away, and they are certainly not making the definitive political statement that we don’t need privacy regulation.
1. I actually don’t know the data privacy rules for Candy Crush and can’t seem to find them, for example here. Please tell me if you know what they are.
I’m on an Amtrak train to Boston today to give a talk in the Applied Statistics workshop at Harvard, which is run out of the Harvard Institute for Quantitative Social Science. I was kindly invited by Tess Wise, a Ph.D. student in the Department of Government at Harvard who is organizing this workshop.
My title is “Data Skepticism in Industry” but as I wrote the talk (link to my prezi here) it transformed a bit and now it’s more about the problems not only for data professionals inside industry but for the public as well. So I talk about creepy models and how there are multiple longterm feedback loops having a degrading effect on culture and democracy in the name of short-term profits.
Since we’re on the subject of creepy, my train reading this morning is this book entitled “Murdoch’s Politics,” which talks about how Rupert Murdoch lives by design in the center of all things creepy.
I’m reading a fine book called Nobody Makes You Shop at Walmart, which dispels many of the myths surrounding market populism, otherwise described in the book as “MarketThink”, namely the rhetoric which “portrays the world (governments aside) as if it works like an ideal competitive market, even when proposing actions that contradict that portrayal,” according to the author Tom Slee.
I’ve gotten a lot out of this book, and I suggest that you guys read it, especially if you are libertarians, so we can argue about it afterwards.
One thing Slee does is distinguish between different kinds of competitive and power-dynamic systems, and fingers certain situations as “arms races”, in which there are escalating costs but no long-lasting added value for the participants. They often involve relative rankings.
Slee’s example is a neighborhood block where all the men on the block compete to have the nicest cars. Each household spends a bunch of money to rise in the rankings just to have others respond by spending money too, and at the end of a year they’ve all spent money and none of the rankings have actually changed.
One of Slee’s overall points about arms races is that the way to deal with them is through armament agreements, which everyone involved needs to sign onto. Later in the book he also talks about how hard it is to get large groups of people to agree to anything at all, especially vague social contracts, when there’s an advantage to cheating, something he calls “free riding.” (As a commenter pointed out to me, free riding is more like getting something for nothing – a worker who benefits from the work of a union without being in the union and paying dues, say – whereas breaking an agreement you signed onto is really just cheating.)
I’d argue, and I believe the book even uses this example, that education can be seen as an arms race as well. Take the statistics in this Opinionator blog from the New York Times, written by Jonathan Cowan and Jim Kessler, and entitled “The Middle Class Gets Wise.”
It describes how much more money the average high school graduate, versus two-year college, versus four-year college, versus professional degree graduate makes. In other words, it describes the payoffs to being higher ranked in that system. The money is real, of course, and everyone is aware of it as an issue even if they don’t know the exact numbers, so it is very analogous to the car status thing.
Cowan and Kessler describe in their article how, in the face of recession, lots more people have gone to college. That makes sense, since many of them didn’t have jobs and wanted to make themselves employable in the future, and at the same time people knew the job climate had become even more rank-oriented as it tightened. People responded, in other words, to the incentives.
There’s a feedback loop going on in colleges as well, of course, and paired with the federal loan program and the fact that students cannot get rid of student debt in bankruptcy, we’ve seen a predictable (in direction if not size) and dramatic increase in tuition and student debt load for the younger generation.
My reaction to this is: we need an armament agreement, but it’s really not clear how that’s going to all of a sudden appear or how it would work, considering the number of entities involved, and the free rider problems due to the cash money incentives everywhere.
From the point of view of employers, rankings are great and they can be sure to pick the highest ranked individuals from that system, even if that means – as it often does – having Ph.D. graduates working in mailrooms. So don’t expect any help from them to add sanity to this system.
From the point of view of the colleges, they’re getting to hire more and more administrators, which means growth, which they love.
Finally, from the point of view of the individual student, it makes sense to go into debt, almost without limit (there’s a break-even point, of course, but people rarely do that calculation explicitly, and if they did there’d be intense bias), to get significantly higher in the ranking.
In other words, it’s a shitshow, and possibly the only real disruption that could improve it would be widespread and universally respected basic and free-ish education. At least that would solve some of the arms race problems, for employers and for students. It would not make colleges happy.
The authors of the Opinionator piece, Cowan and Kessler, don’t agree with me. They have a goal, which is for even more people to go to school, and for tuition to be somehow magically decreased as well. In other words, up the antes for one feedback loop and hope its partner feedback loop somehow relaxes. Here’s the way they describe it:
So what can we do? Anya Kamenetz, the author of “Generation Debt,” has put together some excellent ideas for Third Way, the centrist policy organization where we both work. Let’s start by reducing the number of college administrators per 100 students, which jumped by 40 percent between 1993 and 2007. We should demand a cease-fire to the perk wars in which colleges build ever-more-luxurious living, dining and recreational facilities. Blended learning, which uses online teaching tools together with professors and teaching assistants, could also help students master coursework at less cost.
There are 37 million Americans with some college experience, but no degree. So pegging government tuition aid to college graduation rates would entice schools to find ways of keeping students in class. And eliminating some of the offerings of rarely chosen majors could bring some market efficiencies now lacking in education.
That really just doesn’t seem like a viable plan to me, and pegging government money to graduation rates is really stupid, as I described here, but maybe I’m just being negative. Cowan and Kessler, please tell me how that “demand” is going to work in practice.
Also, what’s funny about their idealistic demand is that they also think of a couple other things to do but dismiss them as unrealistic:
The most commonly discussed solutions to the problem of income inequality seem unlikely to get to the heart of the problem. Yes, we could raise additional taxes on the wealthy, but we just did that. Bumping up the minimum wage would help, but how high would lawmakers allow it to go? We should look instead at what Americans are already doing to solve this problem and help them do it far more successfully and at less cost.
Am I the only one who thinks raising the minimum wage would help more to address income inequality and is easier to imagine working?
A few of you may have read this recent New York Times op-ed (hat tip Suresh Naidu) by economist Raj Chetty entitled “Yes, Economics is a Science.” In it he defends the scienciness of economics by comparing it to the field of epidemiology. Let’s focus on these three sentences in his essay, which for me are his key points:
I’m troubled by the sense among skeptics that disagreements about the answers to certain questions suggest that economics is a confused discipline, a fake science whose findings cannot be a useful basis for making policy decisions.
That view is unfair and uninformed. It makes demands on economics that are not made of other empirical disciplines, like medicine, and it ignores an emerging body of work, building on the scientific approach of last week’s winners, that is transforming economics into a field firmly grounded in fact.
Chetty is conflating two issues in his first sentence. The first is whether economics can be approached as a science, and the second is whether, if you are an honest scientist, you push as hard as you can to implement your “results” as public policy. Because that second issue is politics, not science, and that’s where people like myself get really pissed at economists, when they treat their estimates as facts with no uncertainty.
In other words, I’d have no problem with economists if they behaved like the people in the following completely made-up story based on the infamous Reinhart-Rogoff paper with the infamous excel mistake.
Two guys tried to figure out what public policy causes GDP growth by using historical data. They collected their data and did some analysis, and they later released both the spreadsheet and the data by posting them on their Harvard webpages. They also ran the numbers a few times with slightly different countries and slightly different weighting schemes, and explained in their write-up that they got different answers depending on the initial conditions, so they couldn’t conclude much at all, because the error bars were just so big. Oh well.
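To make the made-up story even more concrete, here’s a toy sketch of the kind of robustness check I’m describing: rerun the same calculation with a slightly different country subset and watch the headline number move. All the growth figures are invented.

```python
import statistics

# Invented average-GDP-growth numbers by (fictional) country at high
# public debt; the only point is that the conclusion swings with
# which countries you include.
growth = {
    "A": 2.2, "B": -0.1, "C": 1.5, "D": -7.9,  # "D" is a single big outlier
    "E": 0.8, "F": 1.9, "G": 2.4,
}

def mean_growth(excluded=()):
    """Average growth over all countries except the excluded ones."""
    return statistics.mean(v for k, v in growth.items() if k not in excluded)

all_countries = mean_growth()
drop_one = mean_growth(excluded=("D",))

print(f"all countries:        {all_countries:.2f}%")
print(f"dropping one outlier: {drop_one:.2f}%")
# The headline answer moves by more than a percentage point on one
# inclusion decision, so honest error bars here would be very wide.
```

When one excluded row changes the answer this much, the scientific conclusion is “we can’t conclude much,” not “austerity works” or the reverse.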
You see how that works? It’s called science, and it’s not what economists are known to do. It’s what we all wish they’d do though. Instead we have economists who basically get paid to write papers pushing for certain policies.
Next, let’s talk about Chetty’s comparison of economics with medicine. It’s kind of amazing that he’d do this considering how discredited epidemiology is at this point, and how truly unscientific it’s been found to be, for essentially exactly the same reasons as above – initial conditions, even just changing which standard database you use for your tests, switch the sign of most of the results in medicine. I wrote this up here based on a lecture by David Madigan, but there’s also a chapter in my new book with Rachel Schutt based on this issue.
To briefly summarize, Madigan and his colleagues reproduced a bunch of epidemiological studies and came out with incredibly depressing “sensitivity” results. Namely, the majority of “statistically significant findings” change sign depending on seemingly trivial initial-condition changes that the authors of the original studies often didn’t even explain.
So in other words, Chetty defends economics as “just as much science” as epidemiology, which I would claim is in the category “not at all a science.” In the end I guess I’d have to agree with him, but not in a good way.
Finally, let’s be clear: it’s a good thing that economists are striving to be scientists, when they are. And it’s of course a lot easier to do science in microeconomic settings where the data is plentiful than it is to answer big, macro-economic questions where we only have a few examples.
Even so, it’s still a good thing that economists are asking the hard questions, even when they can’t answer them, like what causes recessions and what determines growth. It’s just crucial to remember that actual scientists are skeptical, even of their own work, and don’t pretend to have error bars small enough to make high-impact policy decisions based on their fragile results.
I’m on my way to D.C. today to give an alleged “distinguished lecture” to a group of mathematics enthusiasts. I misspoke in a previous post where I characterized the audience to consist of math teachers. In fact, I’ve been told it will consist primarily of people with some mathematical background, with typically a handful of high school teachers, a few interested members of the public, and a number of high school and college students included in the group.
So I’m going to try my best to explain three different ways of approaching recommendation engine building for services such as Netflix. I’ll be giving high-level descriptions of a latent factor model (this movie is violent and we’ve noticed you like violent movies), of the co-visitation model (lots of people who’ve seen stuff you’ve seen also saw this movie) and the latent topic model (we’ve noticed you like movies about the Hungarian 1956 Revolution). Then I’m going to give some indication of the issues in doing these massive-scale calculations and how they can be worked out.
And yes, I double-checked with those guys over at Netflix, I am allowed to use their name as long as I make sure people know there’s no affiliation.
In addition to the actual lecture, the MAA is having me give a 10-minute TED-like talk for their website as well as an interview. I am psyched by how easy it is to prepare my slides for that short version using prezi, since I just removed a bunch of nodes on the path of the material without removing the material itself. I will make that short version available when it comes online, and I also plan to share the longer prezi publicly.
[As an aside, and not to sound like an advertiser for prezi (no affiliation with them either!), but they have a free version and the resulting slides are pretty cool. If you want to be able to keep your prezis private you have to pay, but not as much as you'd need to pay for powerpoint. Of course there's always Open Office.]
Train reading: Wrong Answer: the case against Algebra II, by Nicholson Baker, which was handed to me emphatically by my friend Nick. Apparently I need to read this and have an opinion.
Sometimes my plan of getting up super early to write on my blog fails, and this is one of those days. But I’m still going to ask you to read this article from the New Yorker written by Lisa Servon and entitled, “The High Cost, For The Poor, Of Using A Bank.” Here’s a key passage, but the whole thing is amazing, and yes, I’ve invited her to my Occupy group already:
To understand why, consider loans of small amounts. People criticize payday loans for their high annual percentage rates (APR), which range from three hundred per cent to six hundred per cent. Payday lenders argue that APR is the wrong measure: the loans, they say, are designed to be repaid in as little as two weeks. Consumer advocates counter that borrowers typically take out nine of these loans each year, and end up indebted for more than half of each year.
But what alternative do low-income borrowers have? Banks have retreated from small-dollar credit, and many payday borrowers do not qualify anyway. It happens that banks offer a de-facto short-term, high-interest loan. It’s called an overdraft fee. An overdraft is essentially a short-term loan, and if it had a repayment period of seven days, the APR for a typical incident would be over five thousand per cent.
It makes me wonder whether, if someone did a careful analysis of the all-in costs, including time and travel, payday lenders might actually be a totally rational choice for the poor.
One thing I do a lot when I work with data is figure out how to visualize my signals, especially with respect to time.
Lots of things change over time – relationships between variables, for example – and it’s often crucial to get deeply acquainted with how exactly that works with your in-sample data.
Say I am trying to predict “y”: so for a data point at time t, we’ll say we try to predict y(t). I’ll take an “x”, a variable that is expected to predict “y”, and I’ll demean both series x and y, hopefully in a causal way, and I will rename them x’ and y’, and then, making sure I’ve ordered everything with respect to time, I’ll plot the cumulative sum of the product x’(t) * y’(t).
In the case that both x’(t) and y’(t) have the same sign – so they’re both bigger than average or both smaller than average – this product is positive, and otherwise it’s negative. So if you plot the cumulative sum, you get an upwards trend if things are positively correlated and a downwards trend if things are negatively correlated. If you think about it, you are computing the numerator of the correlation function, so it is indeed just an unscaled version of total correlation.
Plus, since you ordered everything by time first, you can see how the relationship between these variables evolved over time.
Also, in the case that you are working with financial models, you can make a simplifying assumption that both x and y are pretty well demeaned already (especially at short time scales) and this gives you the cumulative PnL plot of your model. In other words, it tells you how much money your model is making.
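Here’s a minimal sketch of the technique above, with made-up series, and using a simple full-sample mean rather than the causal (trailing) demeaning I’d actually use in practice:

```python
import numpy as np

def cumulative_covariance(x, y):
    """Cumulative sum of the product of demeaned series x' and y'.

    An upward trend over a stretch of time means x and y were
    positively correlated there; a downward trend means negative.
    The final value is the (unscaled) numerator of the correlation.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xp = x - x.mean()  # x', demeaned with the full-sample mean for simplicity
    yp = y - y.mean()  # y'
    return np.cumsum(xp * yp)

# Two made-up, positively correlated series, ordered by time:
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(size=1000)
curve = cumulative_covariance(x, y)
# Plotting `curve` against time would show the upward trend;
# a U-shape like the one described below would flag biased data
# at the start and end of the sample.
```

In practice you’d demean causally (using only past data at each point) so the plot doesn’t peek into the future.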
So I was doing this exercise of plotting the cumulative covariance with some data the other day, and I got a weird picture. It kind of looked like a “U” plot: it went down dramatically at the beginning, then was pretty flat but trending up, then it went straight up at the end. It ended up not quite as high as it started, which is to say that in terms of straight-up overall correlation, I was calculating something negative but not very large.
But what could account for that U-shape? After some time I realized that the data had been extracted from the database in such a way that, after ordering my data by date, it was hugely biased in the beginning and at the end, in different directions, and that this was unavoidable, and the picture helped me determine exactly which data to exclude from my set.
After getting rid of the biased data at the beginning and the end, I concluded that I had a positive correlation here, even though if I’d trusted the overall “dirty” correlation I would have thought it was negative.
This is good information, and confirmed my belief that it’s always better to visualize data over time than it is to believe one summary statistic like correlation.
I left finance pretty disgusted with the whole thing, and because I needed to make money and because I’m a nerd, I pretty quickly realized I could rebrand myself a “data scientist” and get a pretty cool job, and that’s what I did. Once I started working in the field, though, I was kind of shocked by how positive everyone was about the “big data revolution” and the “power of data science.”
Not to underestimate the power of data––it’s clearly powerful! And big data has the potential to really revolutionize the way we live our lives for the better––or sometimes not. It really depends.
From my perspective, this was, in tenor if not in the details, the same stuff we’d been doing in finance for a couple of decades and that fields like advertising were slow to pick up on. And, also from my perspective, people needed to be way more careful and skeptical of their powers than they currently seem to be. Because whereas in finance we need to worry about models manipulating the market, in data science we need to worry about models manipulating people, which is in fact scarier. Modelers, if anything, have a bigger responsibility now than ever before.
This is a guest post by Nicholas Diakopoulos, a Tow Fellow at the Columbia University Graduate School of Journalism where he is researching the use of data and algorithms in the news. You can find out more about his research and other projects on his website or by following him on Twitter. Crossposted from engenhonetwork with permission from the author.
How can we know the biases of a piece of software? By reverse engineering it, of course.
When was the last time you read an online review about a local business or service on a platform like Yelp? Of course you want to make sure the local plumber you hire is honest, or that even if the date is dud, at least the restaurant isn’t lousy. A recent survey found that 76 percent of consumers check online reviews before buying, so a lot can hinge on a good or bad review. Such sites have become so important to local businesses that it’s not uncommon for scheming owners to hire shills to boost themselves or put down their rivals.
To protect users from getting duped by fake reviews Yelp employs an algorithmic review reviewer which constantly scans reviews and relegates suspicious ones to a “filtered reviews” page, effectively de-emphasizing them without deleting them entirely. But of course that algorithm is not perfect, and it sometimes de-emphasizes legitimate reviews and leaves actual fakes intact—oops. Some businesses have complained, alleging that the filter can incorrectly remove all of their most positive reviews, leaving them with a lowly one- or two-stars average.
This is just one example of how algorithms are becoming ever more important in society, for everything from search engine personalization, discrimination, defamation, and censorship online, to how teachers are evaluated, how markets work, how political campaigns are run, and even how something like immigration is policed. Algorithms, driven by vast troves of data, are the new power brokers in society, both in the corporate world as well as in government.
They have biases like the rest of us. And they make mistakes. But they’re opaque, hiding their secrets behind layers of complexity. How can we deal with the power that algorithms may exert on us? How can we better understand where they might be wronging us?
Transparency is the vogue response to this problem right now. The big “open data” transparency-in-government push that started in 2009 was largely the result of an executive memo from President Obama. And of course corporations are on board too; Google publishes a biannual transparency report showing how often they remove or disclose information to governments. Transparency is an effective tool for inculcating public trust and is even the way journalists are now trained to deal with the hole where mighty Objectivity once stood.
But transparency knows some bounds. For example, though the Freedom of Information Act facilitates the public’s right to relevant government data, it has no legal teeth for compelling the government to disclose how that data was algorithmically generated or used in publicly relevant decisions (extensions worth considering).
Moreover, corporations have self-imposed limits on how transparent they want to be, since exposing too many details of their proprietary systems may undermine a competitive advantage (trade secrets), or leave the system open to gaming and manipulation. Furthermore, whereas transparency of data can be achieved simply by publishing a spreadsheet or database, transparency of an algorithm can be much more complex, resulting in additional labor costs both in creation as well as consumption of that information—a cognitive overload that keeps all but the most determined at bay. Methods for usable transparency need to be developed so that the relevant aspects of an algorithm can be presented in an understandable way.
Given the challenges to employing transparency as a check on algorithmic power, a new and complementary alternative is emerging. I call it algorithmic accountability reporting. At its core it’s really about reverse engineering—articulating the specifications of a system through a rigorous examination drawing on domain knowledge, observation, and deduction to unearth a model of how that system works.
As interest grows in understanding the broader impacts of algorithms, this kind of accountability reporting is already happening in some newsrooms, as well as in academic circles. At the Wall Street Journal a team of reporters probed e-commerce platforms to identify instances of potential price discrimination in dynamic and personalized online pricing. By polling different websites they were able to spot several, such as Staples.com, that were adjusting prices dynamically based on the location of the person visiting the site. At the Daily Beast, reporter Michael Keller dove into the iPhone spelling correction feature to help surface patterns of censorship and see which words, like “abortion,” the phone wouldn’t correct if they were misspelled. In my own investigation for Slate, I traced the contours of the editorial criteria embedded in search engine autocomplete algorithms. By collecting hundreds of autocompletions for queries relating to sex and violence I was able to ascertain which terms Google and Bing were blocking or censoring, uncovering mistakes in how these algorithms apply their editorial criteria.
All of these stories share a more or less common method. Algorithms are essentially black boxes, exposing an input and output without betraying any of their inner organs. You can’t see what’s going on inside directly, but if you vary the inputs in enough different ways and pay close attention to the outputs, you can start piecing together some likeness for how the algorithm transforms each input into an output. The black box starts to divulge some secrets.
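As a toy illustration of that probing method – the “filter” and word list here are entirely made up, standing in for any opaque system – you vary inputs and watch outputs:

```python
# We can't read the black box's code, but by feeding it many probe
# inputs and recording which ones get suppressed, we can start to
# infer its hidden rule.

def blackbox_filter(text):
    # Stand-in for an opaque system (e.g., a review filter or
    # autocomplete); in reality we'd only see its behavior, not this code.
    blocked = {"spamword", "fakereview"}
    return None if any(w in text for w in blocked) else text

probes = ["great food", "spamword deal", "honest review", "fakereview inc"]
suppressed = [p for p in probes if blackbox_filter(p) is None]
# Comparing inputs to outputs reveals which probes trip the filter,
# letting us piece together a likeness of its editorial criteria.
```

Real investigations work the same way, just with far more probes and careful attention to confounds.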
Algorithmic accountability is also gaining traction in academia. At Harvard, Latanya Sweeney has looked at how online advertisements can be biased by the racial association of names used as queries. When you search for “black names” as opposed to “white names” ads using the word “arrest” appeared more often for online background check service Instant Checkmate. She thinks the disparity in the use of “arrest” suggests a discriminatory connection between race and crime. Her method, as with all of the other examples above, does point to a weakness though: Is the discrimination caused by Google, by Instant Checkmate, or simply by pre-existing societal biases? We don’t know, and correlation does not equal intention. As much as algorithmic accountability can help us diagnose the existence of a problem, we have to go deeper and do more journalistic-style reporting to understand the motivations or intentions behind an algorithm. We still need to answer the question of why.
And this is why it’s absolutely essential to have computational journalists not just engaging in the reverse engineering of algorithms, but also reporting and digging deeper into the motives and design intentions behind algorithms. Sure, it can be hard to convince companies running such algorithms to open up in detail about how their algorithms work, but interviews can still uncover details about larger goals and objectives built into an algorithm, better contextualizing a reverse-engineering analysis. Transparency is still important here too, as it adds to the information that can be used to characterize the technical system.
Despite the fact that forward thinkers like Larry Lessig have been writing for some time about how code is a lever on behavior, we’re still in the early days of developing methods for holding that code and its influence accountable. “There’s no conventional or obvious approach to it. It’s a lot of testing or trial and error, and it’s hard to teach in any uniform way,” noted Jeremy Singer-Vine, a reporter and programmer who worked on the WSJ price discrimination story. It will always be a messy business with lots of room for creativity, but given the growing power that algorithms wield in society it’s vital to continue to develop, codify, and teach more formalized methods of algorithmic accountability. In the absence of new legal measures, it may just provide a novel way to shed light on such systems, particularly in cases where transparency doesn’t or can’t offer much clarity.
I’m preparing for a short trip to D.C. this week to take part in a day-long event held by Americans for Financial Reform. You can get the announcement here online, but I’m not sure what the finalized schedule of the day is going to be. Also, I believe it will be recorded, but I don’t know the details yet.
In any case, I’m psyched to be joining this, and the AFR are great guys doing important work in the realm of financial reform.
Opening Wall Street’s Black Box: Pathways to Improved Financial Transparency
Sponsored By Americans for Financial Reform and Georgetown University Law Center
Keynote Speaker: Gary Gensler Chair, Commodity Futures Trading Commission
October 11, 2013 10 AM – 3 PM
Georgetown Law Center, Gewirz Student Center, 12th Floor
120 F Street NW, Washington, DC (Judiciary Square Metro) (Space is limited. Please RSVP to AFRtransparencyrsvp@gmail.com)
The 2008 financial crisis revealed that regulators and many sophisticated market participants were in the dark about major risks and exposures in our financial system. The lack of financial transparency enabled large-scale fraud and deception of investors, weakened the stability of the financial system, and contributed to the market failure after the collapse of Lehman Brothers. Five years later, despite regulatory efforts, it’s not clear how much the situation has improved.
Join regulators, market participants, and academic experts for an exploration of the progress made – and the work that remains to be done – toward meaningful transparency on Wall Street. How can better information and disclosure make the financial system both fairer and safer?
|Jesse Eisinger, Pulitzer Prize-winning reporter for the New York Times and Pro Publica|
|Zach Gast, Head of financial sector research, Center on Financial Research and Analysis|
|Amias Gerety, Deputy Assistant Secretary for the FSOC, United States Treasury|
|Henry Hu, Alan Shivers Chair in the Law of Banking and Finance, University of Texas Law School|
|Albert “Pete” Kyle, Charles E. Smith Professor of Finance, University of Maryland|
|Adam Levitan, Professor of Law, Georgetown University Law Center|
|Antoine Martin, Vice President, New York Federal Reserve Bank|
|Brad Miller, Former Representative from North Carolina; Of Counsel, Grais & Ellsworth|
|Cathy O’Neil, Senior Data Scientist, Johnson Research Labs; Occupy Alternative Banking|
|Gene Phillips, Director, PF2 Securities Evaluation|
|Greg Smith, Author of “Why I Left Goldman Sachs”; former Goldman Sachs Executive Director|
I was reading this Bloomberg article about the internal risk models at JP Morgan versus Goldman Sachs, and it hit me: I too had an urge for the SEC to hire the insiders at Goldman Sachs to help them “understand risk” at every level. Why not hire a small team of Goldman Sachs experts to help the SEC combat bullshit like what happened with the London Whale?
After all, Goldman people know risk. They probably knew risk even better before 1999, when they went IPO and the partners stopped being personally liable for losses. But even now, of all the big players on the street, Goldman is known for being a few steps ahead of everyone else when it comes to a losing trade.
So it’s natural to want someone from deeply within that culture to come spread their technical risk wisdom to the other side, the regulators.
Unfortunately that’s never what actually happens. Instead of getting the technical knowledge of how to think about risk, how to model a portfolio to squirrel out black holes of mystery, the revolving door instead keeps outputting crazy freaks like Jon Corzine, who blow up firms through, ironically, taking ridiculous risks at the first opportunity.
So, why does this happen? Some possibilities:
- Goldman Sachs promotes crazy freaks because they make great leaders while constrained inside a disciplined culture of calculated risks, but when they get outside they go nuts. This is kind of the model of Mormon children who are finally allowed out into the world and engage in tons of sex and drugs.
- On the flip side, perhaps Goldman Sachs keeps the people who actually understand the technical part of risk very deep in the machine, and these guys never get to leave the building at all.
- Or maybe, people who understand risk sometimes do go through the revolving door, but they don’t share their knowledge with the other side, because their incentives have changed once they’re outside.
- In other words, they don’t help the regulators understand how banks lie and cheat to regulators, because they’re too busy watering down regulation so their buddies can continuously lie and cheat to regulators.
Whatever the case, for whatever reason we keep using the revolving door in hopes that someone will eventually tell us the magic that Goldman Sachs knows, but we never quite get anyone like that, and that means the SEC and other regulators are woefully unprepared for the kind of tricks that banks have up their sleeves.
It is available here and is based on a related essay written by Susan Webber entitled “Management’s Great Addiction: It’s time we recognized that we just can’t measure everything.” It is being published by O’Reilly as an e-book.
No, I don’t know who that woman is looking skeptical on the cover. I wish they’d asked me for a picture of a skeptical person, I think my 11-year-old son would’ve done a better job.
Did you think public radio doesn’t have advertising? Think again.
Last week Here and Now’s host Jeremy Hobson set up College Board’s James Montoya for a perfect advertisement regarding a story on SAT scores going down. The transcript and recording are here (hat tip Becky Jaffe).
To set it up, they talk about how GPA’s are going up on average over the country but how, at the same time, the average SAT score went down last year.
Somehow the interpretation of this is that there’s grade inflation and that kids must be in need of more test prep because they’re dumber.
What is the College Board?
You might think, especially if you listen to this interview, that the College Board is a thoughtful non-profit dedicated to getting kids prepared for college.
Make no mistake about it: the College Board is a big business, and much of their money comes from selling test prep stuff on top of administering tests. Here are a couple of things you might want to know about College Board through its wikipedia page:
Consumer rights organization Americans for Educational Testing Reform (AETR) has criticized College Board for violating its non-profit status through excessive profits and exorbitant executive compensation; nineteen of its executives make more than $300,000 per year, with CEO Gaston Caperton earning $1.3 million in 2009 (including deferred compensation). AETR also claims that College Board is acting unethically by selling test preparation materials, directly lobbying legislators and government officials, and refusing to acknowledge test-taker rights.
Anyhoo, let’s just say it this way: College Board has the ability to create an “emergency” about SAT scores, by say changing the test or making it harder, and then the only “reasonable response” is to pay for yet more test prep. And somehow Here and Now’s host Jeremy Hobson didn’t see this coming at all.
Here’s an excerpt:
HOBSON: It also suggests, when you look at the year-over-year scores, the averages, that things are getting worse, not better, because if I look at, for example, in critical reading in 2006, the average being 503, and now it’s 496. Same deal in math and writing. They’ve gone down.
MONTOYA: Well, at the same time that we have seen the scores go down, what’s very interesting is that we have seen the average GPAs reported going up. So, for example, when we look at SAT test takers this year, 48 percent reported having a GPA in the A range compared to 45 percent last year, compared to 44 percent in 2011, I think, suggesting that there simply have to be more rigor in core courses.
HOBSON: Well, and maybe that there’s grade inflation going on.
MONTOYA: Well, clearly, that there is grade inflation. There is no question about that. And it’s one of the reasons why standardized test scores are so important in the admission office. I know that, as a former dean of admission, test scores help gauge the meaning of a GPA, particularly given the fact that nearly half of all SAT takers are reporting a GPA in the A range.
Just to be super clear about the shilling, here’s Hobson a bit later in the interview:
HOBSON: Well – and we should say that your report noted – since you mentioned practice – that as is the case with the ACT, the students who take the rigorous prep courses do better on the SAT.
What does it really mean when SAT scores go down?
Here’s the thing. SAT scores are fucked with ALL THE TIME. Traditionally, they had to make SAT’s harder since people were getting better at them. As test-makers, they want a good bell curve, so they need to adjust the test as the population changes and as their habits of test prep change.
The result is that SAT tests are different every year, so just saying that the scores went down from year to year is meaningless. Even if the same group of kids took those two different tests in the same year, they’d have different scores.
Also, according to my friend Becky who works with kids preparing for the SAT, they really did make substantial changes recently in the math section, changing the function notation, which makes it much harder for kids to parse the questions. In other words, they switched something around to give kids reason to pay for more test prep.
Important: this has nothing to do with their knowledge, it has to do with their training for this specific test.
If you want to understand the issues outside of math, take for example the essay. According to this critique, the number one criterion for essay grade is length. Length trumps clarity of expression, relevance of the supporting arguments to the thesis, mechanics, and all other elements of quality writing. As my friend Becky says:
I have coached high school students on the SAT for years and have found time and again, much to my chagrin, that students receive top scores for long essays even if they are desultory, tangent-filled and riddled with sentence fragments, run-ons, and spelling errors.
Similarly, I have consistently seen students receive low scores for shorter essays that are thoughtful and sophisticated, logical and coherent, stylish and articulate.
As long as the number one criterion for receiving a high score on the SAT essay is length, students will be confused as to what constitutes successful college writing and scoring well on the written portion of the exam will remain essentially meaningless. High-scoring students will have to unlearn the strategies that led to success on the SAT essay and relearn the fundamentals of written expression in a college writing class.
If the College Board (the makers of the SAT) is so concerned about the dumbing down of American children, they should examine their own role in lowering and distorting the standards for written expression.
Two things. First, shame on College Board and James Montoya for acting like SAT scores are somehow beacons of truth without acknowledging the fiddling that goes on time and time again by his company. And second, shame on Here and Now and Jeremy Hobson for being utterly naive and buying in entirely to this scare tactic.
The 2013 PopTech & Rockefeller Foundation Bellagio Fellows - Kate Crawford, Patrick Meier, Claudia Perlich, Amy Luers, Gustavo Faleiros and Jer Thorp - yesterday published “Seven Principles for Big Data and Resilience Projects” on Patrick Meier’s blog iRevolution.
Although they claim that these principles are meant as “best practices for resilience building projects that leverage Big Data and Advanced Computing,” I think they’re more general than that (although I’m not sure exactly what a resilience building project is), and I really like them. They are looking for public comments too. Go to the post for the full description of each, but here is a summary:
1. Open Source Data Tools
Wherever possible, data analytics and manipulation tools should be open source, architecture independent and broadly prevalent (R, python, etc.).
2. Transparent Data Infrastructure
Infrastructure for data collection and storage should operate based on transparent standards to maximize the number of users that can interact with the infrastructure.
3. Develop and Maintain Local Skills
Make “Data Literacy” more widespread. Leverage local data labor and build on existing skills.
4. Local Data Ownership
Use Creative Commons and licenses that state that data is not to be used for commercial purposes.
5. Ethical Data Sharing
Adopt existing data sharing protocols like the ICRC’s (2013). Permission for sharing is essential. How the data will be used should be clearly articulated. An opt in approach should be the preference wherever possible, and the ability for individuals to remove themselves from a data set after it has been collected must always be an option.
6. Right Not To Be Sensed
Local communities have a right not to be sensed. Large scale city sensing projects must have a clear framework for how people are able to be involved or choose not to participate.
7. Learning from Mistakes
Big Data and Resilience projects need to be open to face, report, and discuss failures.
My friend Suresh just reminded me about this article written a couple of years ago by Malcolm Gladwell and published in the New Yorker.
It concerns various scoring models that claim to be both comprehensive (which means it covers the whole thing, not just one aspect of the thing) and heterogeneous (which means it is broad enough to cover all things in a category), say for cars or for colleges.
Weird things happen when you try to do this, like not caring much about price or exterior detailing for sports cars.
Two things. First, this stuff is actually really hard to do well. I like how Gladwell addresses this issue:
At no point, however, do the college guides acknowledge the extraordinary difficulty of the task they have set themselves.
Second of all, I think the issue of combining heterogeneity and comprehensiveness is addressable, but it has to be addressed interactively.
Specifically, what if instead of a single fixed score, there was a place where a given car-buyer or college-seeker could go to fill out a form of preferences? For each defined and rated aspect, the user would answer a question about how much they cared about that aspect and assign a weight to it. A given question would look something like this:
For colleges, some people care a lot about whether their college has a ton of alumni giving, other people care more about whether the surrounding town is urban or rural. Let’s let people create their own scoring system. It’s technically easy.
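A minimal sketch of such a personal scoring system – the aspect names, ratings, and weights here are all hypothetical, just to show the mechanics:

```python
def personal_score(ratings, weights):
    """Combine per-aspect ratings using a user's own weights (weighted average)."""
    total_weight = sum(weights.values())
    return sum(ratings[aspect] * w for aspect, w in weights.items()) / total_weight

# Made-up ratings (0-10) for one college on a few aspects:
college = {"alumni_giving": 9, "urban_setting": 2, "class_size": 7}

# Two users with different priorities produce different scores
# for the same underlying data:
city_lover = {"alumni_giving": 1, "urban_setting": 5, "class_size": 1}
donor_minded = {"alumni_giving": 5, "urban_setting": 1, "class_size": 1}

city_score = personal_score(college, city_lover)      # low: weak on what this user values
donor_score = personal_score(college, donor_minded)   # higher: strong on alumni giving
```

Same ratings, different rankings – which is the whole point: the scoring system belongs to the user, not the guide.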
I’ve suggested this before when I talked about rating math articles on various dimensions (hard, interesting, technical, well-written) and then letting people come and search based on weighting those dimensions and ranking. But honestly we can start even dumber, with car ratings and college ratings.