Archive for the ‘modeling’ Category

The future of work

People who celebrate each improving monthly jobs report often forget to mention a few facts:

  • the new jobs are often temporary or part-time, with low wages
  • the jobs we lose each month were often full-time, with higher wages

I could go on, and I have, and mention the usual complaints about the definition of the unemployment rate. But instead I’ll take a turn into a thought experiment I’ve been having lately.

Namely, what is the future of work?

It’s important to realize that in some sense we’ve been here before. When all the farming equipment got super efficient and we lost agricultural jobs by the thousands, people swarmed to the cities and we started building things with manufacturing. So if before we had “the age of the farm,” we then entered into “the age of stuff.” And I don’t know about you but I have LOTS of stuff.

Now that robots have been, and are still being, trained to build our stuff for us, what’s next? What age are we entering?

I want to complain at this point that economists are kind of useless when it comes to questions like this. I mean, aren’t they in charge of understanding the economy? Shouldn’t they have the answer here? If they do, I don’t think they have explained it.

Instead, I’m pretty much left considering various science fiction plots I’ve heard about and read about over the years. And my conclusion is that we’re entering “the age of service.”

The age of service is a kind of pyramid scheme where rich people employ individuals to service them in various ways, and then those people are paid well so they can hire slightly less rich people to service them, and so on. But of course for this particular pyramid to work out, the rich have to be SUPER rich and they have to pay their servants very well indeed for the trickle down to work out. Either that or there has to be a wealth transfer some other way.

So, as with all theories of the future, we can talk about how this is already happening.

I noticed this recent Bloomberg View article about how rich people don’t have normal doctors like you and me. They just pay out of pocket for super expensive service outside the realm of insurance. This is not new but it’s expanding.

Here’s another example of the future of jobs, which I should applaud because at least someone has a job but instead just kind of annoys me. Namely, the increasing frequency with which I try to make a coffee date with someone (outside of professional meetings) and have to arrange it with their personal assistant. I feel like, when it comes to social meetings, if you have time to be social, you have time to arrange your social calendar. But again, it’s the future of work here and I guess it’s all good.

More generally: there will be lots of jobs helping out old people and sick people. I get that, especially as the demographics tilt towards old people. But the mathematician in me can’t help but wonder, who will take care of the old people who used to be taking care of the old people? I mean, they by definition don’t have lots of extra cash floating around because they were at the bottom of the pyramid as younger workers.

Or do we have a system where people actually change jobs and levels as they age? That’s another model, where oldish people take care of truly old people and then at some point they get taken care of.

Of course, much like the Star Trek world, none of this has a strong connection to the economy as it is set up now, so it’s hard to imagine a smooth transition to a reasonable system, and I’m not even claiming my ideas are reasonable.

By the way, by my definition most people who write computer programs – especially if they’re writing video games or some such – are in a service industry as well. Pretty much anyone who isn’t farming or building stuff in manufacturing is working in service. Writers, poets, singers, and teachers included. Hell, the future could be pretty awesome if we arrange things well.

Anyhoo, a whimsical post for Thursday, and if you have other ideas for the future of work and how that will work out economically, please comment.

Categories: economics, modeling

Two great articles about standardized tests

In the past 12 hours I’ve read two fascinating articles about the crazy world of standardized testing. They’re both illuminating and well-written and you should take a look.

First, my data journalist friend Meredith Broussard has an Atlantic piece called Why Poor Schools Can’t Win At Standardized Testing wherein she tracks down the money and the books in the Philadelphia public school system (spoiler: there’s not enough of either), and she makes the connection between expensive books and high test scores.

Here’s a key phrase from her article:

Pearson came under fire last year for using a passage on a standardized test that was taken verbatim from a Pearson textbook.

The second article, in the New Yorker, is written by Rachel Aviv and is entitled Wrong Answer. It’s a close look, with interviews, at the cheating scandal in Atlanta, which I have been studying recently. The article makes the point that cheating is a predictable consequence of the high-stakes “data-driven” approach.

Here’s a key phrase from the Aviv article:

After more than two thousand interviews, the investigators concluded that forty-four schools had cheated and that a “culture of fear, intimidation and retaliation has infested the district, allowing cheating—at all levels—to go unchecked for years.” They wrote that data had been “used as an abusive and cruel weapon to embarrass and punish.”

Putting the two together, it’s pretty clear that there’s an acceptable way to cheat, which is by stocking up on expensive test prep materials in the form of testing company-sponsored textbooks, and then there’s the unacceptable way to cheat, which is where teachers change the answers. Either way the standardized test scoring regime comes out looking like a penal system rather than a helpful teaching aid.

Before I leave, some recent goodish news on the standardized testing front (hat tip Eugene Stern): Chris Christie just reduced the importance of value-added modeling for teacher evaluation down to 10% in New Jersey.

The Platform starts today

Hey, my class starts today, I’m totally psyched!

The syllabus is up on github here and I prepared an iPython notebook here showing how to do basic statistics in python, and culminating in an attempt to understand what a statistically significant but tiny difference means, in the context of the Facebook Emotion study. Here’s a useless screenshot which I’m including because I’m proud:

Screen Shot 2014-07-15 at 7.04.05 AM

If you want to follow along, install anaconda on your machine and type “ipython notebook --pylab inline” into a terminal. Then you can just download this notebook and run it!
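
To give a flavor of what “statistically significant but tiny” means, here is a minimal sketch. It is not the code from the notebook. The function name two_sample_z and all of the numbers are made up for illustration, and none of them come from the Facebook study. It assumes two large groups and uses the normal approximation to compare their means, then reports both the p-value and the effect size (Cohen’s d).

```python
import math


def two_sample_z(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (z, two-sided p-value, Cohen's d) for a difference in means.

    Uses the normal approximation, which is reasonable for large samples.
    """
    # Standard error of the difference between the two sample means.
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = (mean_b - mean_a) / se
    # Two-sided p-value under the standard normal: P(|Z| >= |z|).
    p = math.erfc(abs(z) / math.sqrt(2))
    # Cohen's d: the difference measured in pooled standard deviations.
    pooled_sd = math.sqrt(
        ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    )
    d = (mean_b - mean_a) / pooled_sd
    return z, p, d


# Hypothetical numbers, NOT taken from the Facebook study:
# two groups of 200,000 people whose means differ by 0.02 standard deviations.
z, p, d = two_sample_z(5.00, 1.0, 200_000, 5.02, 1.0, 200_000)
print(f"z = {z:.2f}, p = {p:.2g}, Cohen's d = {d:.3f}")
```

With groups this big the p-value comes out far below any usual threshold, even though the two means differ by only a fiftieth of a standard deviation. That is the whole point: a huge sample can make almost any difference “significant,” so you have to look at the size of the effect too.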

Most of the rest of the classes will feature an awesome guest lecturer, and I’m hoping to blog about those talks with their permission, so stay tuned.

Surveillance in NYC

There’s a CNN video news story explaining how the NYC Mayor’s Office of Data Analytics is working with private start-up Placemeter to count and categorize New Yorkers, often with the help of private citizens who install cameras in their windows. Here’s a screenshot from the Placemeter website:

From placemeter.com

You should watch the video and decide for yourself whether this is a good idea.

Personally, it disturbs me, but perhaps because of my priors on how much we can trust other people with our data, especially when it’s in private hands.

To be more precise, there is, in my opinion, a contradiction coming from the Placemeter representatives. On the one hand they try to make us feel safe by saying that, after gleaning a body count with their video tapes, they dump the data. But then they turn around and say that, in addition to counting people, they will also categorize people: gender, age, whether they are carrying a shopping bag or pushing strollers.

That’s what they are talking about anyway, but who knows what else? Race? Weight? Will they use face recognition software? Who will they sell such information to? At some point, after mining videos enough, it might not matter if they delete the footage afterwards.

Since they are a private company I don’t think such information on their data methodologies will be accessible to us via Freedom of Information Laws either. Or, let me put that another way. I hope that MODA sets up their contract so that such information is accessible via FOIL requests.

Great news: for-profit college Corinthian to close

I’ve talked before about the industry of for-profit colleges which exists largely to game the federal student loan program. They survive almost entirely on federal student loans of their students, while delivering terrible services and worthless credentials.

Well, good news: one of the worst of the bunch is closing its doors. Corinthian Colleges, Inc. (CCI) got caught lying about job placement of its graduates (in some cases, they said 100% when the truth was closer to 0%). They were also caught advertising programs they didn’t actually have.

But here’s what interests me the most, which I will excerpt from the California Office of the Attorney General:

CCI’s predatory marketing efforts specifically target vulnerable, low-income job seekers and single parents who have annual incomes near the federal poverty line. In internal company documents obtained by the Department of Justice, CCI describes its target demographic as “isolated,” “impatient,” individuals with “low self-esteem,” who have “few people in their lives who care about them” and who are “stuck” and “unable to see and plan well for future.”

I’d like to know more about how they did this. I’m guessing it was substantially online, and I’m guessing they got help from data warehousing services.

After skimming the complaint I’m afraid it doesn’t include such information, although it does say that the company advertised programs it didn’t have and then tricked potential students into filling out information about themselves so CCI could follow up and try to enroll them. Talk about predatory advertising!

Update: I’m getting some information by checking out their recent marketing job postings.

Categories: feedback loop, modeling

Thanks for a great case study, Facebook!

I’m super excited about the recent “mood study” that was done on Facebook. It constitutes a great case study on data experimentation that I’ll use for my Lede Program class when it starts mid-July. It was first brought to my attention by one of my Lede Program students, Timothy Sandoval.

My friend Ernest Davis at NYU has a page of handy links to big data articles, and at the bottom (for now) there are a bunch of links about this experiment. For example, this one by Zeynep Tufekci does a great job outlining the issues, and this one by John Grohol burrows into the research methods. Oh, and here’s the original research article that’s upset everyone.

It’s got everything a case study should have: ethical dilemmas, questionable methodology, sociological implications, and questionable claims, not to mention a whole bunch of media attention and dissection.

By the way, if I sound gleeful, it’s partly because I know this kind of experiment happens on a daily basis at a place like Facebook or Google. What’s special about this experiment isn’t that it happened, but that we get to see the data. And the response to the critiques might be, sadly, that we never get another chance like this, so we have to grab the opportunity while we can.

We are not ready for health data mining

There have been two articles very recently about how great health data mining could be if we could only link up all the data sets. Larry Page from Google thinks so, which doesn’t surprise anyone. Separately, we are seeing that the new medical payment system under the ACA gives medical systems incentives to keep tabs on you through data providers and find out if you’re smoking or if you need to fill up on asthma medication.

And although many would consider this creepy stalking, that’s not actually my problem with it. I think Larry Page is right – we might be able to save lots of lives if we could mine this data which is currently siloed through various privacy laws. On the other hand, there are reasons those privacy laws exist. Let’s think about that for a second.

Now that we have the ACA, insurers are not allowed to deny Americans medical insurance coverage because of a pre-existing condition, nor are they allowed to charge more, as of 2014. That’s good news on the health insurance front. But what about other aspects of our lives?

For example, it does not generalize to employers. In other words, a large employer like Walmart might take into account your current health and your current behaviors and possibly even your DNA to predict future behaviors, and they might decide not to give jobs to anyone at risk of diabetes, say. Even if medical insurance costs were taken out of the picture, which they haven’t been, they’d have incentives not to hire unhealthy people.

Mind you, there are laws that prevent employers from looking into HIPAA-protected health data, but not Acxiom data, which is entirely unregulated. And if we “opened up all the data” then the laws would be entirely moot. It would be a world where, to get a job, the employer got to see everything about you, including your future health profile. To some extent this is already happening.

Perhaps not everyone thinks of this as bad. After all, many people think smokers should pay more for insurance, so why shouldn’t they also have to work harder to get a job? However, lots of the information gleaned from this data, even about behaviors, has much more to do with poverty levels and circumstance than with conscious choice. In other words, it’s another stratification of society along the lucky/unlucky birth lottery spectrum. And if we aren’t careful, we will make it even harder for poor people to eke out a living.

I’m all for saving lives but let’s wait for the laws to catch up with the good intentions. Although to be honest, it’s not even clear how the law should be written, since it’s not clear what “medical” data is nowadays nor how we could gather evidence that a private employer is using it against someone improperly.

Categories: modeling