We are not ready for health data mining

There have been two articles very recently about how great health data mining could be if we could only link up all the data sets. Larry Page from Google thinks so, which doesn’t surprise anyone, and separately we are seeing that the new medical payment system under the ACA is giving medical systems incentives to keep tabs on you through data providers and find out if you’re smoking or if you need to fill up on asthma medication.

And although many would consider this creepy stalking, that’s not actually my problem with it. I think Larry Page is right – we might be able to save lots of lives if we could mine this data which is currently siloed through various privacy laws. On the other hand, there are reasons those privacy laws exist. Let’s think about that for a second.

Now that we have the ACA, insurers are not allowed to deny Americans medical insurance coverage because of a pre-existing condition, nor are they allowed to charge more, as of 2014. That’s good news on the health insurance front. But what about other aspects of our lives?

For example, it does not generalize to employers. In other words, a large employer like Walmart might take into account your current health and your current behaviors and possibly even your DNA to predict future behaviors, and they might decide not to give jobs to anyone at risk of diabetes, say. Even if medical insurance costs were taken out of the picture, which they haven’t been, they’d have incentives not to hire unhealthy people.

Mind you, there are laws that prevent employers from looking into HIPAA-protected health data, but not Acxiom data, which is entirely unregulated. And if we “opened up all the data” then the laws would be entirely moot. It would be a world where, before giving you a job, the employer got to see everything about you, including your future health profile. To some extent this is already happening.

Perhaps not everyone thinks of this as bad. After all, many people think smokers should pay more for insurance, so why shouldn’t they also have to work harder to get a job? However, lots of the information gleaned from this data – even behaviors – has much more to do with poverty levels and circumstance than with conscious choice. In other words, it’s another stratification of society along the lucky/unlucky birth lottery spectrum. And if we aren’t careful, we will make it even harder for poor people to eke out a living.

I’m all for saving lives but let’s wait for the laws to catch up with the good intentions. Although to be honest, it’s not even clear how the law should be written, since it’s not clear what “medical” data is nowadays nor how we could gather evidence that a private employer is using it against someone improperly.

Categories: modeling

Unsolicited advice about having kids

You know how it’s better to have a discussion with someone when you’re calm and they haven’t just done something that drives you absolutely nuts? Well I’m going to generalize to the parenting advice realm: best time to give parenting advice is not when you’ve just seen a kid get poorly parented or a parent stress out about stupid stuff. Best time is when you’re alone in your pajamas, nowhere near other people’s kids. That way those of you who have kids won’t feel defensive.

Also, here’s another rule about parenting advice: never take parenting advice from anyone, because the people who are actually eager to give it are usually super weird. Look at Tiger Mom as Exhibit A.

In spite of that very wise second rule, I’ma go ahead and give some advice that’s pretty good, if I do say so myself in my own weird way.

  1. Before having kids, think of all the reasons not to. They’re loud, expensive, and they weigh you down immensely. You will never be able to stay up with friends after 10pm again if you do it. So don’t do it.
  2. Unless… unless you just absolutely cannot help it because of all those freaking hormones and how cute they look in summer dresses (boys included, yes, they don’t care, they’re babies). Then do it, but think hard and plan well for the noise, the expense, and the inconvenience.
  3. In terms of how you parent a baby: think long-term about stuff. Are you gonna want to get up a million times every night for the rest of your life? No, you’re not. So figure out how to get the damn baby to sleep through the night. This cannot be forced until the kid is 6 months or so, and the moment you can manipulate their sleep is characterized by the moment they can try to manipulate their sleep and stay awake to hang out with you. That’s when you start the 6pm bedtime ritual, including songs and books and 6:30 lights out. They will cry for like 10 minutes three nights in a row and after that you will be golden. Long term thinking, remember. Even if they cry for an hour, it’s an investment for a lifetime, namely yours.
  4. In terms of how you parent a little kid: think super long-term about stuff. Don’t raise your voice unless they are doing something actually dangerous, like walking into traffic or sticking a fork into an outlet. Make sure you let them get really dirty and try to eat weird things, too – their tongues are like extra hands at this age, it helps them explore the world. The only thing a little kid really needs is regular meals and a 6 or maybe 7pm bedtime ritual. They can spend 2 hours ripping up a newspaper for entertainment. Once a week baths would be good.
  5. In terms of how you parent a school age kid: think super duper long-term about stuff. If you do their homework for them, they will never do it themselves. So let them figure that out, but do remind them to do it if they’re forgetful. If you structure all their time, they will never figure out what they love to do, so make sure they get bored sometimes. Keep lots of good books and nerdy puzzles and interesting people around the house but don’t make them “do math” with you unless they ask for it. Don’t make them take music lessons. Instead, wait for them to beg for music lessons, and then say no for a while until you’re really sure they want them. Don’t just tell them to be nice, exhibit nice behavior to them and to others in front of them. Reward them for pointing out your hypocrisies, and make them watch Star Trek: The Next Generation (or equivalent) with you for its moral education and for the popcorn, and have fun listening to them pointing out the bad physics. And the most important of all: enjoy them and have fun with them, because that’s the best kind of way to role model for your kids, plus it’s fun, and they’re people who will move away pretty soon and you’ll miss them.
  6. In terms of how you parent an older kid, I have no idea because my oldest kid is 14. But so far we’re having a blast. I’m pretty sure they’re already mostly raised in terms of my role anyway by the time they’re 12.

One last, general thing for today’s anxious parents: don’t feel guilty, you’re doing your best. Guilt is a waste of time and gets in the way of enjoying the popcorn.

Categories: musing

The dark matter of big data

A tiny article in The Cap Times was recently published (hat tip Jordan Ellenberg) which describes the existence of a big data model which claims to help filter and rank school teachers based on their ability to raise student test scores. I guess it’s a kind of pre-VAM filtering system, and if it was hard to imagine a more vile model than the VAM, here you go. The article mentioned that the Madison School Board was deliberating on whether to spend $273K on this model.

One of the teachers in the district wrote her concerns about this model in her blog and then there was a debate at the school board meeting, and a journalist covered the meeting, so we know about it. But it was a close call, and this one could have easily slipped under the radar, or at least my radar.

Even so, now I know about it, and once I looked at the website of the company promoting this model, I found links to an article where they name customers, for example the Charlotte-Mecklenburg School District of North Carolina. They claim they only filter applications using their tool, they don’t make hiring decisions. Cold comfort for people who got removed by some random black box algorithm.

I wonder how many of the teachers applying to that district knew their application was being filtered through such a model? I’m going to guess none. For that matter, there are all sorts of application screening algorithms being regularly used of which applicants are generally unaware.

It’s just one example of the dark matter of big data. And by that I mean the enormous and growing clusters of big data models that are only inadvertently detectable by random small-town or small-city budget meeting journalism, or word-of-mouth reports coming out of conferences or late-night drinking parties with VCs.

The vast majority of big data dark matter is still there in the shadows. You can only guess at its existence and its usage. Since the models themselves are proprietary, and are generally deployed secretly, there’s no reason for the public to be informed.

Let me give you another example, this time speculative, but not at all unlikely.

Namely, big data health models arising from the quantified self movement data. This recent Wall Street Journal article entitled “Can Data From Your Fitbit Transform Medicine?” articulated the issue nicely:

A recent review of 43 health- and fitness-tracking apps by the advocacy group Privacy Rights Clearinghouse found that roughly one-third of apps tested sent data to a third party not disclosed by the developer. One-third of the apps had no privacy policy. “For us, this is a big trust issue,” said Kaiser’s Dr. Young.

Consumer wearables fall into a regulatory gray area. Health-privacy laws that prevent the commercial use of patient data without consent don’t apply to the makers of consumer devices. “There are no specific rules about how those vendors can use and share data,” said Deven McGraw, a partner in the health-care practice at Manatt, Phelps, and Phillips LLP.

The key is that phrase “regulatory gray area”; it should make you think “big data dark matter lives here”.

When you have unprotected data that can be used as a proxy for HIPAA-protected medical data, there’s no reason it won’t be. So anyone who stands to benefit from knowing health-related information about you – think future employers who might help pay for future insurance claims – will be interested in using big data dark matter models gleaned from this kind of unregulated data.

To be sure, most people nowadays who wear Fitbits are athletic, trying to improve their 5K run times. But the article explained that the medical profession is on the verge of suggesting a much larger population of patients use such devices. So it could get ugly real fast.

Secret big data models aren’t new, of course. I remember a friend of mine working for a credit card company a few decades ago. Her job was to model which customers to offer subprime credit cards to, and she was specifically told to target those customers who would end up paying the most in fees. But it’s become much much easier to do this kind of thing with the proliferation of so much personal data, including social media data.

I’m interested in the dark matter, partly as research for my book, and I’d appreciate help from my readers in trying to spot it when it pops up. For example, I remember being told that a certain kind of online credit score is used to keep people on hold for customer service longer, but now I can’t find a reference to it anywhere. We should really compile a list at the boundaries of this dark matter. Please help! And if you don’t feel comfortable commenting, my email address is on the About page.

You are not Google’s customer

I’m going to write one of those posts where many of you will already understand my point. In fact it might be old hat for a majority of my readers, yet it’s still important enough for me to mention just in case there are a few people out there who don’t know how the modern business model is set up.

Namely, like this. As a gmail and Google Search user, you are not a customer of Google. You are the product. The customers of Google are the ones who advertise to you. Your interaction with Google is, from the perspective of the business operation, that you give them information which they harvest so they can advertise to you in a more targeted way, thus increasing the likelihood of you clicking. The fact that you get a service from these interactions is great, because it means you’ll come back to give Google and its customers more information about you soon.

This misunderstanding, once you see it as such, can be clarifying. For example, when people talk about anti-trust and Google, they should talk about whether the customers of Google have any other serious choice. And since the customers of Google are advertisers, not gmailers or searchers, the alternatives aren’t hotmail or Bing. Rather they are other advertising outlets. And a very good case can be made that Google does violate anti-trust laws in that sense, just ask Nathan Newman.

It also explains why something like the recent European “right to be forgotten” law seems so strange and unreasonable to the powers that be at Google. It’d be like a meat farm where the cows go on strike and demand better food. Cows are the product, and they aren’t supposed to complain. They’re not even supposed to be heard. At worst we treat them better when our customers demand it, not when the cows do.

I was reminded about this ubiquitous business model yesterday, and newly enraged by its consequences, when reading this article entitled “Held Captive by Flawed Credit Reports” (hat tip Linda Brown) about the credit score agency Experian and how they utterly disregard the laws trying to protect consumers from mistakes in their credit reports. The problem here is that, to the giant company Experian, its customers are giant companies like Verizon which send credit score requests millions of times a day and pay for each score. Mere people, whose mortgage applications are being denied because of mistakes, are the product, not the customer, and they are almost by definition unimportant.

And it seems that the law which is supposed to protect these people, namely the Fair Credit Reporting Act, first passed in 1970, doesn’t have enough teeth behind it to make the big credit scoring agencies sit up and pay attention. It’s all about the scale of the fines compared to the scale of the business. This is well explained in the article (emphasis mine):

Last year, the Federal Trade Commission found that 5 percent of consumers — or an estimated 10 million people — had an error on one of their credit reports that could have resulted in higher borrowing costs.

The F.T.C., which oversees the industry along with the Consumer Financial Protection Bureau, has been busy bringing cases in this arena. Since 2000, it has filed 18 enforcement actions against reporting bureaus; 13 were district court actions that generated $25.7 million in penalties.

Consumers have also won in the courts, on occasion. Last year, an Oregon consumer was awarded $18.4 million in punitive damages by a jury after she sued Equifax for inserting errors into her credit report. But the fines, settlements and judgments paid by the larger companies are not even close to a rounding error. Experian generated $4.8 billion in revenue for the year ended March 2014, and its after-tax profit of $747 million in the period was more than twice its 2013 figure.

Million versus billion. It seems like the cows don’t have much leverage.
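
To put rough numbers on “million versus billion”, here is a quick back-of-the-envelope sketch. It uses only the figures quoted above; the variable and function names are mine. The comparison is deliberately loose: the FTC penalties cover all reporting bureaus since 2000, and the jury award was against Equifax, whereas the revenue and profit are Experian’s for a single year.

```python
# Figures are the ones quoted from the article above; nothing here is new data.
# Assumption (mine): comparing them directly is only a rough illustration, since
# the penalties span all bureaus since 2000 and the jury award was against Equifax.
FTC_PENALTIES = 25.7e6      # penalties from the 13 district court actions
JURY_AWARD = 18.4e6         # punitive damages in the Oregon case
EXPERIAN_REVENUE = 4.8e9    # revenue, year ended March 2014
EXPERIAN_PROFIT = 747e6     # after-tax profit, same period


def share(amount, base):
    """Return amount as a percentage of base."""
    return 100.0 * amount / base


def summarize():
    """Express each penalty figure as a percentage of one year of Experian's business."""
    return {
        "FTC penalties vs. revenue": share(FTC_PENALTIES, EXPERIAN_REVENUE),
        "FTC penalties vs. profit": share(FTC_PENALTIES, EXPERIAN_PROFIT),
        "jury award vs. revenue": share(JURY_AWARD, EXPERIAN_REVENUE),
    }


if __name__ == "__main__":
    for label, pct in summarize().items():
        print(f"{label}: {pct:.2f}%")
```

Even on this generous reading, fourteen years of penalties across the whole industry come to about half a percent of one year of one company’s revenue.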

Categories: economics, rant

Guest post: What is the goal of a college calculus course?

This is a guest post by Nathan, who recently finished graduate school in math, and will begin a post-doc in the fall. He loves teaching young kids, but is still figuring out how to motivate undergraduates.

The question

Like most mathematicians in academia, I’m teaching calculus in the fall. I taught in grad school, but the syllabus and assignments were already set. This time I’ll be in charge, so I need to make some design decisions, like the following:

  1. Are calculators/computers/notes allowed on the exams?
  2. Which purely technical skills must students master (by a technical skill I mean something like expanding rational functions into partial fractions: a task which is deterministic but possibly intricate)?
  3. Will students need to write explanations and/or proofs?

I have some angst about decisions like these, because it seems like each one can go in very different directions depending on what I hope the students are supposed to get from the course. If I’m listing the pros and cons of permitting calculators, I need some yardstick to measure these pros and cons.

My question is: what is the goal of a college calculus course?

I’d love to have an answer that is specific enough that I can use it to make concrete decisions like the ones above. Part of my angst is that I’ve asked many people this question, including people I respect enormously for their teaching, but often end up with a muddled answer. And there are a couple stock answers that come to mind, but none of them satisfies me for one reason or another. Here’s what I have so far.

The contenders

  1. To teach specific tasks that are necessary for other subjects.

These tasks would include computing integrals and derivatives, converting functions to power series or Fourier series, and so forth.

  2. Intuitive understanding of functions and their behavior.

This is vague, so here’s an example: a couple years ago, a friend in medical school showed me a page from his textbook. The page concerned whether a certain drug would affect heart function in one way or in the opposite way (it caused two opposite effects), and it showed a curve relating two involved parameters. It turned out that the essential feature was that this curve was concave down. The book did not use the phrase “concave down,” though, and had a rather wordy explanation of the behavior. In this situation, a student who has a good grasp of what concavity is and what its implications are is better equipped to understand the effect described in the book. So if a student has really learned how to think about concavity of functions and its implications, then she can more quickly grasp the essential parts of this medical situation.

  3. To practice communicating with precision.

I’m taking “communication” in a very wide sense here: carefully showing the steps in an integral calculation would count.

Not Satisfied

I have issues with each of these as written. I don’t buy number 1, because the bread and butter of calculus class, like computing integrals, isn’t something most doctors or scientists will ever do again. Number 2 is a noble goal, but it’s overly idealistic; if this is the goal, then our success rate is less than 10%. Number 3 also seems like a great goal, relevant for most of the students, but I think we’d have to write very different sorts of assignments than we currently do if we really want to aim for it.

I would love to have a clear and realistic answer to this question. What do you think?

Categories: education, math education

Clearwater Festival and Pete Seeger

After recording my weekly Slate Money podcast this morning I will be off to the Clearwater Festival in Croton-on-Hudson. The weather’s supposed to be gorgeous all weekend, which is good because I’m camping in a tent, and the last few times I went to bluegrass or folk festivals and camped in a tent it rained and I ended up sleeping in puddles. If you’ve never done that, let me tell you that there’s something gross and creepy about wet pillows.

My bandmate Jamie, who plays the mandolin and washboard, convinced me not only to go but to be a volunteer at this festival, which as it turns out means I’ll be preparing food in the kitchen. There are 1,000 volunteers at this festival, so who knows how many people go; I’m preparing for a lot of diced carrots and onions no matter what. Or maybe I’ll be doing dishes. I love doing dishes for some reason.

So this Clearwater Festival was Pete Seeger’s baby, he came every year, and since he passed away this past winter, the entire weekend will be a tribute to his life and his work. Some incredible musicians are going to be there to honor Pete, and I am hoping my kitchen duties don’t conflict with my old favorite, Marty Sexton (Sunday at 4pm), as well as my new favorite, John Fullbright (Saturday at 2:30).

Stuff I’ve packed for the trip: tent, sleeping bag, pillow (dry so far), bluegrass juice (of the Jack Daniels variety), my fiddle, my banjo, a wooden bowl and utensils, and some metal coffee cups and shot glasses. Oh, and some clothes.

You should totally come by for either day or for the whole weekend if you’re nearby and in the mood for some really old hippy reminiscences! And really, who isn’t.

Categories: musing

Circular arguments, eigenjesus, and climate change

No time for a post this morning but go read this post by Scott Aaronson on using a PageRank-like algorithm to understand human morality and decision making. The post is funny, clever, very thoughtful, and pretty long.

Categories: modeling, musing