Archive

Archive for the ‘modeling’ Category

Advertising vs. Privacy

I’ve was away over the weekend (apologies to Aunt Pythia fans!) and super busy yesterday but this morning I finally had a chance to read Ethan Zuckerman’s Atlantic piece entitled The Internet’s Original Sin, which was sent to me by my friend Ernest Davis.

Here’s the thing, Zuckerman gets lots of things right in the article. Most importantly, the inherent conflict between privacy and the advertisement-based economy of the internet:

Demonstrating that you’re going to target more and better than Facebook requires moving deeper into the world of surveillance—tracking users’ mobile devices as they move through the physical world, assembling more complex user profiles by trading information between data brokers.

Once we’ve assumed that advertising is the default model to support the Internet, the next step is obvious: We need more data so we can make our targeted ads appear to be more effective.

This is well said, and important to understand.

Here’s where Zuckerman goes a little too far in my opinion:

Outrage over experimental manipulation of these profiles by social networks and dating companies has led to heated debates amongst the technologically savvy, but hasn’t shrunk the user bases of these services, as users now accept that this sort of manipulation is an integral part of the online experience.

It is a mistake to assume that “users accept this sort of manipulation” because not everyone has stopped using Facebook. Facebook is, after all, an hours-long daily habit for an enormous number of people, and it’s therefore sticky. People don’t give up addictive habits overnight. But it doesn’t mean they are feeling the same way about Facebook that they did 4 years ago. People are adjusting their opinion of the user experience as that user experience is increasingly manipulated and creepy.

An analogy should be drawn to something like smoking, where the rates have gone way down since we all found out it is bad for you. People stopped smoking even though it is really hard for most people (and impossible for some).

We should instead be thinking longer term about what people will be willing to leave Facebook for. What is the social networking model of the future? What kind of minimum privacy protections will convince people they are safe (enough)?

And, most importantly, will we even have reasonable minimum protections, or will privacy be entirely commoditized, whereby only premium pay members will be protected, while the rest of us will be thrown to the dogs?

Categories: data science, modeling

Illegal PayDay syndicate in New York busted

There’s an interesting and horrible New York Time story by Jessica Silver-Greenberg about a PayDay loan syndicate being run out of New York State. The syndicate consists of twelve companies owned by a single dude, Carey Vaughn Brown, with help from a corrupt lawyer and another corrupt COO. Manhattan District Attorneys are charging him and his helpers with usury under New York law.

The complexity of the operation was deliberate and intended to obscure the chain of events that would start with a New Yorker online looking for quick cash online and end with a predatory loan. They’d interface with a company called MyCashNow.com, which would immediately pass their application on to a bunch of other companies in different states or overseas.

Important context: in New York, the usury law caps interest rates at 25 percent annually, and these PayDay operations were charging between 350 and 650 percent annually. Also key, the usury laws apply to where the borrower is, not where the lender is, so even though some of the companies were located (at least on paper) in the West Indies, they were still breaking the law.

They don’t know exactly how big the operation was in New York, but one clue is that in 2012, one of the twelve companies had $50 million in proceeds from New York.

Here’s my question: how did MyCashNow.com advertise? Did it use Google ads, or Facebook ads, or something else, and if so, what were the attributes of the desperate New Yorkers that it looked for to do its predatory work?

One side of this is that vulnerable people were somehow targeted. The other side is that well-off people were not, which meant they didn’t see ads like this, which makes it harder for people like the Manhattan District Attorney to even know about shady operations like this.

Categories: data science, modeling

The problem with charter schools

Today I read this article written by Allie Gross (hat tip Suresh Naidu), a former Teach for America teacher whose former idealism has long been replaced by her experiences in the reality of education in this country. Her article is entitled The Charter School Profiteers.

It’s really important, and really well written, and just one of the articles in the online magazine Jacobin that I urge you to read and to subscribe to. In fact that article is part of a series (here’s another which focuses on charter schools in New Orleans) and it comes with a booklet called Class Action: An Activist Teacher’s Handbook. I just ordered a couple of hard copies.

I’d really like you to read the article, but as a teaser here’s one excerpt, a rant which she completely backs up with facts on the ground:

You haven’t heard of Odeo, the failed podcast company the Twitter founders initially worked on? Probably not a big deal. You haven’t heard about the failed education ventures of the person now running your district? Probably a bigger deal.

When we welcome schools that lack democratic accountability (charter school boards are appointed, not elected), when we allow public dollars to be used by those with a bottom line (such as the for-profit management companies that proliferate in Michigan), we open doors for opportunism and corruption. Even worse, it’s all justified under a banner of concern for poor public school students’ well-being.

While these issues of corruption and mismanagement existed before, we should be wary of any education reformer who claims that creating an education marketplace is the key to fixing the ills of DPS or any large city’s struggling schools. Letting parents pick from a variety of schools does not weed out corruption. And the lax laws and lack of accountability can actually exacerbate the socioeconomic ills we’re trying to root out.

Nerding out: RSA on an iPython Notebook

Yesterday was a day filled with secrets and codes. In the morning, at The Platform, we had guest speaker Columbia history professor Matthew Connelly, who came and talked to us about his work with declassified documents. Two big and slightly depressing take-aways for me were the following:

  • As records have become digitized, it has gotten easy for people to get rid of archival records in large quantities. Just press delete.
  • As records have become digitized, it has become easy to trace the access of records, and in particular the leaks. Connelly explained that, to some extent, Obama’s harsh approach to leakers and whistleblowers might be explained as simply “letting the system work.” Yet another way that technology informs the way we approach human interactions.

After class we had section, in which we discussed the Computer Science classes some of the students are taking next semester (there’s a list here) and then I talked to them about prime numbers and the RSA crypto system.

I got really into it and wrote up an iPython Notebook which could be better but is pretty good, I think, and works out one example completely, encoding and decoding the message “hello”.

The underlying file is here but if you want to view it on the web just go here.

Crime and punishment

When I was prepping for my Slate Money podcast last week I read this column by Matt Levine at Bloomberg on the Citigroup settlement. In it he raises the important question of how the fine amount of $7 billion was determined. Here’s the key part:

 Citi’s and the Justice Department’s approaches both leave something to be desired. Citi’s approach seems to be premised on the idea that the misconduct was securitizing mortgages: The more mortgages you did, the more you gotta pay, regardless of how they performed. The DOJ’s approach, on the other hand, seems to be premised on the idea that the misconduct was sending bad e-mails about mortgages: The more “culpable” you look, the more it should cost you, regardless of how much damage you did.

would have thought that the misconduct was knowingly securitizing bad mortgages, and that the penalties ought to scale with the aggregate badness of Citi’s mortgages. So, for instance, you’d want to measure how often Citi’s mortgages didn’t match up to its stated quality-control standards, and then compare the actual financial performance of the loans that didn’t meet the standards to the performance of the loans that did. Then you could say, well, if Citi had lived up to its promises, investors would have lost $X billion less than they actually did. And then you could fine Citi that amount, or some percentage of that amount. And you could do a similar exercise for the other big banks — JPMorgan, say, which already settled, or Bank of America, which is negotiating its settlement — and get comparable amounts that appropriately balance market share (how many bad mortgages did you sell?) and culpability (how bad were they?).

I think he nailed something here, which has eluded me in the past, namely the concept of what comprises evidence of wrongdoing and how that translates into punishment. It’s similar to what I talked about in this recent post, where I questioned what it means to provide evidence of something, especially when the data you are looking for to gather evidence has been deliberately suppressed by either the people committing wrongdoing or by other people who are somehow gaining from that wrongdoing but are not directly involved.

Basically the way I see Levine’s argument is that the Department of Justice used a lawyerly definition of evidence of wrongdoing – namely, through the existence of emails saying things like “it’s time to pray.” After determining that they were in fact culpable, they basically did some straight-up negotiation to determine the fee. That negotiation was either purely political or was based on information that has been suppressed, because as far as anyone knows the number was kind of arbitrary.

Levine was suggesting a more quantitative definition for evidence of wrongdoing, which involves estimating both “how much you know” and “how much damage you actually did” to determine the damage, and then some fixed transformation of that damage becomes the final fee. I will ignore Citi’s lawyers’ approach since their definition was entirely self-serving.

Here’s the thing, there are problems with both approaches. For example, with the lawyerly approach, you are basically just sending the message that you should never ever write some things on email, and most or at least many people know that by now. In other words, you are training people to game the system, and if they game it well enough, they won’t get in trouble. Of course, given that this was yet another fine and nobody went to jail, you could make the argument – and I did on the podcast – that nobody got in trouble anyway.

The problem with the quantitative approach, is that first of all you still need to estimate “how much you knew” which again often goes back to emails, although in this case could be estimated by how often the stated standards were breached, and second of all, when taken as a model, can be embedded into the overall trading model of securities.

In other words, if I’m a quant at a nasty place that wants to trade in toxic securities, and I know that there’s a chance I’d be caught but I know the formula for how much I’d have to pay if I got caught, then I could include this cost, in addition to an estimate of the likelihood for getting caught, in an optimization engine to determine exactly how many toxic securities I should sell.

To avoid this scenario, it makes sense to have an element of randomness in the punishments for getting caught. Every now and then the punishment should be much larger than the quantitative model might suggest, so that there is less of a chance that people can incorporate the whole shebang into their optimization procedure. So maybe what I’m saying is that arriving at a random number, like the DOJ did, is probably better even though it is less satisfying.

Another possibility to actually deter crimes would be to arbitrarily increasing the likelihood of catching people up to no good, but that has been bounded from above by the way the SEC and the DOJ actually work.

Categories: finance, modeling

The future of work

People who celebrate the monthly jobs report getting better nowadays often forget to mention a few facts:

  • the new jobs are often temporary or part-time, with low wages
  • the old lost jobs, which we lose each month, were often full-time with higher wages

I could go on, and I have, and mention the usual complaints about the definition of the unemployment rate. But instead I’ll take a turn into a thought experiment I’ve been having lately.

Namely, what is the future of work?

It’s important to realize that in some sense we’ve been here before. When all the farming equipment got super efficient and we lost agricultural jobs by the thousands, people swarmed to the cities and we started building things with manufacturing. So if before we had “the age of the farm,” we then entered into “the age of stuff.” And I don’t know about you but I have LOTS of stuff.

Now that all the robots have been trained and are being trained to build our stuff for us, what’s next? What age are we entering?

I kind of want to complain at this point that economists are kind of useless when it comes to questions like this. I mean, aren’t they in charge of understanding the economy? Shouldn’t they have the answer here? I don’t think they have explained it if they do.

Instead, I’m pretty much left considering various science fiction plots I’ve heard about and read about over the years. And my conclusion is that we’re entering “the age of service.”

The age of service is a kind of pyramid scheme where rich people employ individuals to service them in various ways, and then those people are paid well so they can hire slightly less rich people to service them, and so on. But of course for this particular pyramid to work out, the rich have to be SUPER rich and they have to pay their servants very well indeed for the trickle down to work out. Either that or there has to be a wealth transfer some other way.

So, as with all theories of the future, we can talk about how this is already happening.

I noticed this recent Bloomberg View article about how rich people don’t have normal doctors like you and me. They just pay out of pocket for super expensive service outside the realm of insurance. This is not new but it’s expanding.

Here’s another example of the future of jobs, which I should applaud because at least someone has a  job but instead just kind of annoys me. Namely, the increasing frequency where I try to make a coffee date with someone (outside of professional meetings) and I have to arrange it with their personal assistant. I feel like, when it comes to social meetings, if you have time to be social, you have time to arrange your social calendar. But again, it’s the future of work here and I guess it’s all good.

More generally: there will be lots of jobs helping out old people and sick people. I get that, especially as the demographics tilt towards old people. But the mathematician in me can’t help but wonder, who will take care of the old people who used to be taking care of the old people? I mean, they by definition don’t have lots of extra cash floating around because they were at the bottom of the pyramid as younger workers.

Or do we have a system where people actually change jobs and levels as they age? That’s another model, where oldish people take care of truly old people and then at some point they get taken care of.

Of course, much like the Star Trek world, none of this has strong connection to the economy as it is set up now, so it’s hard to imagine a smooth transition to a reasonable system, and I’m not even claiming my ideas are reasonable.

By the way, by my definition most people who write computer programs – especially if they’re writing video games or some such – are in a service industry as well. Pretty much anyone who isn’t farming or building stuff in manufacturing is working in service. Writers, poets, singers, and teachers included. Hell, the future could be pretty awesome if we arrange things well.

Anyhoo, a whimsical post for Thursday, and if you have other ideas for the future of work and how that will work out economically, please comment.

Categories: economics, modeling

Two great articles about standardized tests

In the past 12 hours I’ve read two fascinating articles about the crazy world of standardized testing. They’re both illuminating and well-written and you should take a look.

First, my data journalist friend Meredith Broussard has an Atlantic piece called Why Poor Schools Can’t Win At Standardized Testing wherein she tracks down the money and the books in the Philadelphia public school system (spoiler: there’s not enough of either), and she makes the connection between expensive books and high test scores.

Here’s a key phrase from her article:

Pearson came under fire last year for using a passage on a standardized test that was taken verbatim from a Pearson textbook.

The second article, in the New Yorker, is written by Rachel Aviv and is entitled Wrong Answer. It’s a close look, with interviews, of the cheating scandal from Atlanta, which I have been studying recently. The article makes the point that cheating is a predictable consequence of the high-stakes “data-driven” approach.

Here’s a key phrase from the Aviv article:

After more than two thousand interviews, the investigators concluded that forty-four schools had cheated and that a “culture of fear, intimidation and retaliation has infested the district, allowing cheating—at all levels—to go unchecked for years.” They wrote that data had been “used as an abusive and cruel weapon to embarrass and punish.”

Putting the two together, it’s pretty clear that there’s an acceptable way to cheat, which is by stocking up on expensive test prep materials in the form of testing company-sponsored textbooks, and then there’s the unacceptable way to cheat, which is where teachers change the answers. Either way the standardized test scoring regime comes out looking like a penal system rather than a helpful teaching aid.

Before I leave, some recent goodish news on the standardized testing front (hat tip Eugene Stern): Chris Christie just reduced the importance of value-added modeling for teacher evaluation down to 10% in New Jersey.

Follow

Get every new post delivered to your Inbox.

Join 1,229 other followers