There’s a new book out, called The Black Box Society and written by Frank Pasquale, a lawyer focused on technology and a friend of mine. It’s published by Harvard University Press.
To be honest, when I first received it I was a bit worried that it would make my book, which I am utterly engaged in writing, entirely moot. After all, Frank and I had discussed his book and I’d seen earlier versions. I knew it contained information about racist secret algorithms in finance and tech, and there were also other issues in common with our two books.
Now that I’ve had a chance to read it, though, I’m not as worried. First of all, Frank’s book is aimed at a different audience, which is to say a somewhat more academic and technical audience. In particular his policy recommendations near the end of the book seem to be written for lawyers who know the current laws and need arguments to improve them.
Also, his focus is on secrecy itself as a means of power, whereas I focus on models as the object of interest.
I like a lot of what Frank says, and I think his metaphors work really well. For example, he talks about the early promise of the internet to expose information of all sorts, on powerful corporations as well as individuals. Then he talks about how reality has been a disappointment, and we’ve ended up with an internet that acts as a “one-way mirror,” whereby powerful corporations can see into individuals’ lives but those individuals can’t look back.
He also makes the important point that, when it comes to the NSA and other government agencies snooping around, while they might be legally prevented from gathering certain kinds of data about people, nothing prevents them from buying information and profiles from data warehouses like Acxiom, which can do the kind of collecting that they can’t. In other words, the data warehousing industry acts as a giant loophole in the set of rules protecting our civil liberties.
For another really interesting review of Frank’s book, written by a software engineer, take a look at David Auerbach’s Slate review (hat tip Jordan Ellenberg). In particular he has interesting things to say about the extent to which algorithms are intentionally evil (they’re probably not) and the extent to which engineers can fix problems (they probably can).
In any case, I recommend The Black Box Society; it’s a fascinating and important book.
I seem to have caught a break at the San Antonio airport, with free wifi. So I will take this opportunity to offer a link to my prezi talk.
One embarrassing omission from my talk is the existence of many public-facing math podcasts. Embarrassing not because I knew about them – I didn’t – but because I should have, since after all I participate in a weekly podcast myself, so of course I know it’s a new and exciting medium. Luckily, the audience member who pointed out my mistake has agreed to write a guest post surveying the math podcast landscape, so stay tuned for that.
Did you hear about TechCrunch’s leaked documents detailing the client list of Palantir, the super secretive data mining contractor (hat tip Chris Wiggins)? Palantir, founded by uberlibertarian Peter Thiel, had clients as of 2013 including the LAPD, the CIA, DHS, NSA, the FBI, and CDC. Besides data mining for government agencies, they also work in the finance sector and the legal sector.
Here’s the scariest thing about the TechCrunch article:
Samuel Reading, a former Marine who works in Afghanistan for NEK Advanced Securities Group, a U.S. military contractor, was quoted in the document as saying, “It’s the combination of every analytical tool you could ever dream of. You will know every single bad guy in your area.”
That quote, if true, betrays a lack of understanding of what data mining can actually do in terms of accuracy. No data mining tool can be both comprehensive and accurate – find all the bad guys with no accidental good guys getting caught in the net. It’s just not possible, unless you have DNA samples with markers for “bad guyness,” and even then DNA tests sometimes get mixed up.
It behooves an expensive and fancy consulting company to act like their tools are prophetic, however, even if that means false positives or false negatives happen all the time, which of course they do, with any algorithm.
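To make the false positive point concrete, here is a rough sketch in Python. Every number in it is hypothetical (the population size, the share of actual “bad guys,” and the tool’s accuracy are all made up for illustration, and have nothing to do with Palantir’s actual tools), but it shows what happens when even a very accurate flagging tool goes looking for something rare.

```python
def screening_outcomes(population, base_rate, sensitivity, specificity):
    """Expected outcome counts for a tool that flags people as 'bad guys'.

    All inputs are hypothetical illustration values:
      base_rate   - fraction of the population that really is a 'bad guy'
      sensitivity - fraction of real bad guys the tool flags
      specificity - fraction of good guys the tool correctly leaves alone
    """
    actual_bad = population * base_rate
    actual_good = population - actual_bad

    true_positives = actual_bad * sensitivity
    false_negatives = actual_bad - true_positives      # bad guys missed
    false_positives = actual_good * (1 - specificity)  # good guys caught in the net

    flagged = true_positives + false_positives
    precision = true_positives / flagged if flagged else 0.0

    return {
        "true_positives": true_positives,
        "false_negatives": false_negatives,
        "false_positives": false_positives,
        "precision": precision,
    }


# Hypothetical scenario: 1 in 1,000 people is a 'bad guy', and the tool
# is right 99% of the time in both directions.
result = screening_outcomes(
    population=1_000_000,
    base_rate=0.001,
    sensitivity=0.99,
    specificity=0.99,
)

for name, value in result.items():
    print(f"{name}: {value:,.2f}")
```

Under these made-up assumptions the tool still misses about 10 real bad guys and flags roughly 9,990 good guys, so only about 9% of the people it flags are actually who it says they are. Tightening the threshold to flag fewer good guys means missing more bad guys, and vice versa.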
It’s bad enough when stupid start-up companies claim big data solves everything, when what they’re doing is trying to solve a problem nobody cares about. It’s another thing altogether when it’s our military and military contractors and police and secret services, and when we don’t have any view into what it actually does. Scary stuff.
So I’m here at JMM, hanging out with my buddy Aaron Abrams and finagling free wifi at the Hyatt (pro tip from Jonathan Bloom: sign up to be on their gold membership plan, which is free, and as a member you get free wifi).
Aaron and I started talking about the case of MIT professor Walter Lewin, and whether his OpenCourseWare physics lectures should have been removed after he was discovered to have been a sexual harasser.
UPDATE: Here’s an article giving some idea of what Lewin did, which was basically to harass women who were taking his online class.
I’ve already asserted that it makes sense to me that they were removed, but I wasn’t happy with my explanation. I think I’ve understood it better now, and I wanted to throw it out there.
To explain it, let’s move to a more cut-and-dried example, or at least an older one, namely Harvard mathematician George Birkhoff. That guy was a hugely famous and powerful mathematician in his day, which was in the 1930s. He was also a huge anti-Semite, and prevented Harvard from hiring Jewish mathematicians fleeing the Nazis.
When it comes to doing math, I might write a paper that uses a result he proved. Will I cite him? Personally, I would feel weird about it. Citing someone, speaking their name, is not just a mathematical shortcut, a way of avoiding proving everything from basic principles, although it is that, of course. If you have no prior knowledge about someone, you might not see that, but I’ve set it up explicitly so you see more than that.
Here’s what I see. By citing him, I am doing more than giving him credit for proving something, I’m including him in the community of mathematics, which is actually an honor. And honestly I’d rather not honor the wisdom of someone I detest.
Update: to be clear I would cite him if I needed to. I just would actively feel weird about it. I might even add a note.
Going back to Walter Lewin. Supposedly he can explain certain kinds of physics really really well. People say this, and I believe them. But of course the physics is already known, he’s not inventing something, and other people can also explain it, just not quite as well, at least right now.
Why would a given person choose to watch Lewin’s lectures instead of someone else’s lectures on the same material? Well, what is the delta between those two experiences? On the one hand, it’s a better explanation, which adds, but on the other hand, it’s the knowledge that we are honoring a man with no integrity, which subtracts. If written citation is received wisdom, then actually sitting and listening to a person is even more intimate.
For me, personally, these two opposite considerations don’t add up to a net positive. I’d rather watch someone else explain the physics.
As for MIT’s OpenCourseWare (OCW) platform, they also had a “delta” computation to make, and they had to take into account the community they are trying to build through OCW. They want women in particular to feel welcomed to that community, and they decided that the videos’ presence made that more difficult (and it’s already difficult enough in physics). I think they made the right call.
This is a guest post by Tom Adams, who spent over 20 years in the securitization business and now works as an attorney and consultant and expert witness on MBS, CDO and securitization related issues.
Good news for would-be home buyers – the Obama Administration heard your concerns and has a new tool to help make homes more affordable!
Are they going to increase wages? Or reduce the price of homes? No, they’re going to attack mortgage rates for Federal Housing Administration (FHA) borrowers. Of course, mortgage rates are already at close to all-time lows, having declined significantly over the past year to about 3.7% on conventional 30-year fixed rate loans. The Administration’s main tool for doing this is to cut the annual insurance fee charged by the FHA on new mortgages by 0.50 percentage points, from 1.35% to 0.85% (on top of the interest rate charged to borrowers).
This fee is paid by borrowers into a fund that the FHA uses to protect itself against losses in case borrowers that it has insured later default. In theory, this move was somewhat controversial because the FHA’s fund had incurred higher than expected losses during the crisis and the FHA had to ask Congress for money to shore up the fund not that long ago. Around the same time, the FHA raised this insurance premium to additionally replenish the fund.
If it’s already really cheap to borrow money, is another 0.5% reduction going to make that big a difference? Probably not, because historically low interest rates haven’t been the obstacle to buying a house. I expect the number of net new home buyers produced as a result of this change will be considerably lower than the Administration is projecting (“millions of homeowners,” according to Obama’s statement today).
Rather, would-be homeowners don’t have the income to support buying the houses listed for sale in their markets – which is another way of saying that, for average Americans, homes are too expensive to afford (or wages are too uncertain for them to want to buy).
Also note that the new lower fee is primarily aimed at new home purchasers. In order for existing FHA borrowers to get the new lower premium they would have to refinance into a new loan, which means they’d have to incur new closing costs. The new closing costs would probably eat up most of the savings for a year or more. Presumably, this would discourage many existing borrowers from refinancing for the lower premium, which helps the FHA by allowing it to retain the old, higher premium on the borrowers who don’t refinance.
This highlights one of those fundamental conundrums in the housing market. Existing homeowners and home sellers want home prices to go up. Representatives of this group are great at lobbying and have convinced many people (including, by all appearances, this Administration) that rising home prices are a good thing for America. On the other hand, potential home buyers would rather not have home prices going up – because that makes buying much harder. For whatever reason, this group has about zero lobbying juice.
Making credit cheaper is a small tool the Administration has via this reduced premium, so they used it, I guess. But it’s an action that has consequences, including potentially running the risk of not having enough in the fund down the road if losses increase (not a risk I’m especially worried about – the Urban Institute did a fine analysis of why the lower fee is probably sufficient – but it’s a reasonable concern). In addition, it is somewhat disheartening that the Administration still seems to believe that the solution to consumer issues is to have the consumers take on more debt.
The most significant impact of this change is that it will make FHA loans more competitive with Fannie Mae and Freddie Mac loans. You may recall that Mel Watt, the man in charge of the Federal Housing Finance Agency (FHFA), which manages Fannie and Freddie, made a big announcement recently that the GSEs would offer 97% loan-to-value (LTV) ratio loans to qualified borrowers. Previously, that type of LTV had been mostly the territory of the FHA.
So, effectively, this is just a form of catch-up for the FHA. The various government housing agencies are competing for market share among the same limited universe of qualifying borrowers by trying to get them to take on bigger mortgages than they would qualify for previously. For the average would-be buyer of the average house, the new, lower FHA fee would be worth about $900 a year, equivalent to about a $75 reduction in monthly payment.
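For anyone who wants to check the arithmetic, here is a quick sketch in Python. The 1.35% and 0.85% fees come from the announcement; the $180,000 loan balance is an assumption, chosen because it is the balance at which a 0.50 percentage point cut works out to the $900 figure above. The fee is treated here as a flat annual charge on that balance, which is a simplification.

```python
OLD_ANNUAL_FEE = 0.0135  # FHA insurance fee before the cut (1.35%)
NEW_ANNUAL_FEE = 0.0085  # FHA insurance fee after the cut (0.85%)


def annual_premium_savings(loan_balance, old_fee=OLD_ANNUAL_FEE, new_fee=NEW_ANNUAL_FEE):
    """Yearly savings from the lower fee, treating it as a flat charge on the balance."""
    return loan_balance * (old_fee - new_fee)


# Assumed loan balance (hypothetical): the balance implied by $900 / 0.005.
loan_balance = 180_000

annual_savings = annual_premium_savings(loan_balance)
monthly_savings = annual_savings / 12

print(f"Annual savings:  ${annual_savings:,.0f}")
print(f"Monthly savings: ${monthly_savings:,.0f}")
```

A smaller loan saves proportionally less, so for a borrower with a $100,000 balance the same cut would be worth about $500 a year under the same simplification.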
It’s hard to believe that anyone in the Administration believes that this will do much for making homes more affordable for Americans. Perhaps it is a measure, however, of how seriously the Administration is taking the issue of housing affordability. There are big issues in housing and the economy that need to be taken seriously – like resolution of Fannie and Freddie, home prices that still remain beyond the reach of many Americans, stagnant wages, on-going foreclosure and mortgage servicing problems – but the Administration seems content to tinker around the edges and try to sell it as important reform.
Hey, so this is cool. The Alternative Banking group just came out with a second Huffington Post essay, this time on how the bailout isn’t over, how it didn’t work, and how we’re already preparing for the next one. I think it came out really well. You can read it here.
Also, I’ll be giving a talk at the Joint Math Meetings again this year, this time as an invited MAA speaker. My title is Making the Case for Data Journalism, and you can see the abstract here. I guess I’m speaking on Monday afternoon at 4pm in a place called the Lila Cockrell Theatre.
So, a few things. If you’re a math nerd planning to be in San Antonio this weekend, please don’t leave Sunday, because there are still talks on Monday! Also, if you want to hang out, leave a comment or send me email and I’ll try to figure out a way to meet up with you. I honestly feel like I don’t know too many mathematicians anymore, so it would be nice to see or meet a friendly face. I’m getting to San Antonio Friday.
There’s an excellent Wall Street Journal article by Joseph Walker, entitled Can a Smartphone Tell if You’re Depressed?, that describes a lot of creepy new big data projects going on now in healthcare, in partnership with hospitals and insurance companies.
Some of the models come in the form of apps, created and managed by private, third-party companies that try to predict depression in, for example, postpartum women. They don’t disclose what they are doing to many of the women, or the extent of what they’re doing, according to the article. They own the data they’ve collected at the end of the day and, presumably, can sell it to anyone interested in whether a woman is depressed. For example, future employers. To be clear, this data is generally not covered by HIPAA.
Perhaps the creepiest example is a voice analysis model:
Nurses employed by Aetna have used voice-analysis software since 2012 to detect signs of depression during calls with customers who receive short-term disability benefits because of injury or illness. The software looks for patterns in the pace and tone of voices that can predict “whether the person is engaged with activities like physical therapy or taking the right kinds of medications,” Michael Palmer, Aetna’s chief innovation and digital officer, says.
Patients aren’t informed that their voices are being analyzed, Tammy Arnold, an Aetna spokeswoman, says. The company tells patients the calls are being “recorded for quality,” she says.
“There is concern that with more detailed notification, a member may alter his or her responses or tone (intentionally or unintentionally) in an effort to influence the tool or just in anticipation of the tool,” Ms. Arnold said in an email.
In other words, in the name of “fear of gaming the model,” we are not disclosing the creepy methods we are using. Also, considering that the targets of this model are receiving disability benefits, I’m wondering if the real goal is to catch someone off their meds and disqualify them for further benefits or something along those lines. Since they don’t know they are being modeled, they will never know.
Conclusion: we need more regulation around big data in healthcare.