Archive for the ‘news’ Category

A Code of Conduct for data scientists from the Bellagio Fellows

September 25, 2013 3 comments

The 2013 PopTech & Rockefeller Foundation Bellagio Fellows - Kate Crawford, Patrick Meier, Claudia Perlich, Amy Luers, Gustavo Faleiros, and Jer Thorp - yesterday published “Seven Principles for Big Data and Resilience Projects” on Patrick Meier’s blog iRevolution.

Although they claim that these principles are meant for “best practices for resilience building projects that leverage Big Data and Advanced Computing,” I think they’re more general than that (although I’m not sure exactly what a resilience building project is), and I really like them. They are looking for public comments too. Go to the post for the full description of each, but here is a summary:

1. Open Source Data Tools

Wherever possible, data analytics and manipulation tools should be open source, architecture independent and broadly prevalent (R, python, etc.).

2. Transparent Data Infrastructure

Infrastructure for data collection and storage should operate based on transparent standards to maximize the number of users that can interact with the infrastructure.

3. Develop and Maintain Local Skills

Make “Data Literacy” more widespread. Leverage local data labor and build on existing skills.

4. Local Data Ownership

Use Creative Commons and licenses that state that data is not to be used for commercial purposes.

5. Ethical Data Sharing

Adopt existing data sharing protocols like the ICRC’s (2013). Permission for sharing is essential. How the data will be used should be clearly articulated. An opt in approach should be the preference wherever possible, and the ability for individuals to remove themselves from a data set after it has been collected must always be an option.

6. Right Not To Be Sensed

Local communities have a right not to be sensed. Large scale city sensing projects must have a clear framework for how people are able to be involved or choose not to participate.

7. Learning from Mistakes

Big Data and Resilience projects need to be open to face, report, and discuss failures.

Are you cliterate?

September 24, 2013 1 comment

Not much time this morning for blogging, but I wanted everyone to get a chance to read this amazing Huffington Post article about learning more than you ever thought possible about the female sexual organ, and then celebrating that knowledge in style.

The article is actually more inspiring than you’d think, and I found myself weeping with joy at times. I’m an easy cry, but still.

Plus, any article that has this picture is worth reading:

SOLID-GOLD-CLIT

Categories: musing, news

Happy Birthday, Occupy Wall Street! #OWS

September 17, 2013 17 comments

Hey, what are you doing for the 2nd anniversary of the occupation of Zuccotti Park?

I know what I’m doing, namely going down to the park and handing out hundreds of copies of my Occupy group’s new book – now on Scribd!! Here’s a ridiculous gif of the pile of books that came from the printer yesterday with my kindergartner posing by it (you might need to click on it to see the animation!!):

Someone isn’t shy.

I’m also planning a small speech at 10:15am, which I’m still writing. Update: here it is:

Thank you for coming
Thank you for occupying
I am here today to announce a birth
The birth of a book
It’s called “Occupy Finance”
We wrote it
we are Alternative Banking

Who are we?
We are a working group of Occupy
we first met almost two years ago
we have been meeting ever since
we meet every Sunday afternoon
at Columbia University
our meetings are totally open
we want you to come

We discuss the financial system
we discuss financial regulation
we discuss how lobbyists destroy regulation
we discuss how Obama destroys regulation
we discuss what we can do to help
how we can make our opinions known
how we can make the system work for us
the 99%

Last year we had a project
The 52 Shades of Greed
we came here to Zuccotti Park
we gave out hundreds of packs of cards
they explained the financial system
they called out the criminals
they called out the toxic ideas
and the toxic instruments
and the toxic institutions
that started this mess

This year we’ve come back
with another present to share
it’s a book we wrote
it’s a book for all of us
it explains how the financial system works
and how it doesn’t work
it explains how the system uses us
how the bankers scam us all
how the regulators fail to do their job
how the politicians have been bought

Why did we write this book?
we wrote it for you
and we wrote it for us
we wrote it for anyone
who wants to know
how to argue against
the side of greed
the side of corruption
the side of entitlement

let me tell you something
some people call us radicals
but listen up
when the top 1%
capture 95% of the income gains
since the so-called end
of the recession,
when more than half the country thinks
that we didn’t do enough
to put bankers in jail,
when the median household income
has gone down 7.3% since 2007,
when the actual employment rate
is 5% below 2007,
when the jobs that do exist are crappy
when we get paid with prepaid debit cards
that nickel and dime us all
then what we demand is not radical
it is only a system that works
we are asking for a just system
we are asking for a fair system
we are asking for an end to too-big-to-fail
we demand banks take less risk
with our money
and we are asking lawmakers
to stop banks
once and for all
from scamming people because they are poor

Please join us
we want you to come
you don’t need to be an expert
we started out as strangers
who wanted to know how things work
we have become friends
we have become allies
we have made something
out of our curiosity
and out of our hard work
and we are here today
to share that with you
and to ask you to join us
please join us
happy birthday to us!
Thank you!!

Categories: #OWS, finance, news

Sunday morning musing: is sexism an addiction?

I’ve been reading articles about cultures of sexism at Harvard Business School and in philosophy, both articles published in the New York Times this past week. The two of them have gotten me to speculate about the different ways that men and women experience sexist behavior.

Namely, very differently. Women, being the targets of sexist remarks and behavior, are sensitive to its barbaric nature and status-oriented putdowns – they are aware of it because it so obviously stings. Men – some men, not all – consistently seem baffled by all the fuss, and if they acknowledge the behavior, it is, in their opinion, more like having fun than being mean.

“Why would people want me to stop having fun?” they ask.

It makes me wonder if sexism is addictive. Let me explain my Sunday morning theory.

Assume that, when men perform an act of sexism, they get rewarded in their pleasure center similar to when someone takes a street drug or has sex.

So for example, say some male Harvard Business School (HBS) student encounters a female HBS colleague who is a potential competitor. To establish his dominance, he puts her down publicly on the basis of her looks. As mentioned in the article, the HBS population is obsessed with status, and this is a standard way of keeping her status low and simultaneously making her anxious and distracted.

My question is, what happens inside that man’s brain when he does that? For that matter, what happens to the brains of the other men in that group who witness that? My theory is that they all experience a kind of pleasure center stimulation, whereby their entire group is nudged up in rank over some “other,” which happens to be that woman. In some sense it’s kind of irrelevant who they put down in order to be rewarded, though, which is why they don’t think of what they did as a bad thing, just something that they vaguely enjoyed.

Go back to how differently men and women describe their experiences of sexist environments after the fact. Men consistently don’t remember it as a negative event. From the article about sexism in philosophy:

I’m always hearing from stressed-out men, worrying aloud what “all this fuss” about sexual harassment means for them. I’ve heard it at training sessions on university sexual harassment policy: “Does this mean I can’t even tell a woman that she looks nice?” I’ve heard it in coffee lounges: “Make sure you keep your door open when you’re talking to a woman student — you never know what she might say later.” And I’ve had it confided to me, with a sigh of regret, at conference happy hours: “I’m afraid now to form any relationships with female students — they might take it the wrong way.”

I don’t think men are lying. I think they actually experience sexist events as positive and benign.

It also makes sense how men react when sexism is addressed by the higher authorities in the form of sensitivity training. When men are forced into a room to talk about sexism and norms of appropriate behavior, they’re super uncomfortable and don’t seem to know why they’re there (again, not all men). They for whatever reason don’t think discussions about sexism apply to them, like it’s a women’s issue.

On the other hand, as we saw in the HBS article, forcing men to talk about it at length does seem to actually help, in spite of their protests. The article focuses on women’s behavior, I think overly much, but it’s just as much about men as it is about women. True, women undermine themselves by competing with each other to be perfect and sexy and brilliant (but not too brilliant), etc., but really it’s about getting the men to stop with their nonsense, right?

And what might be happening is that, along with the positive feedback which stimulates the pleasure center, through this training they might also be developing a second, negative feedback around sexist comments, which would mean that eventually, if that second feedback grew strong enough, it would no longer feel so good to be sexist.

I mean, how do you break someone of their addictive habits? I guess you could destroy the pleasure center altogether, but that seems extreme except for the really most annoying HBS folks. Probably what you’d want to do is counteract the effect with an opposing effect. Thus sensitivity training.

Of course, this theory applies equally well to other forms of discrimination. And it’s not obvious how to address it even if it’s true. But at least, if we thought about it this way, it would throw light on the baffling disconnect whereby such problems are glaringly obvious to some while remaining utterly invisible to others.

Categories: musing, news

Simons Center for Data Analysis

Has anyone heard of the new Simons Center for Data Analysis?

Neither had I until just now. But some guy named Leslie Greengard, who is a distinguished mathematician and computer scientist, just got named its director (hat tip Peter Woit).

Please inform me if you know more about this center. I got nothing except this tiny description:

As SCDA’s director, Greengard will build and lead a team of scientists committed to analyzing large-scale, rich data sets and to developing innovative mathematical methods to examine such data.

Categories: data science, news

Short your kids, go long your neighbor: betting on people is coming soon

Yet another aspect of Gary Shteyngart’s dystopian fiction novel Super Sad True Love Story is coming true for reals this week.

Besides anticipating Occupy Wall Street, as well as Bloomberg’s sweep of Zuccotti Park (although getting it wrong on how utterly successful such sweeping would be), Shteyngart proposed the idea of instant, real-time and broadcast credit ratings.

Anyone walking around the streets of New York, as they’d pass a certain type of telephone pole – the kind that identifies you via your cell phone and communicates with data warehousing services and databases – would have their credit rating flashed onto a screen. If you went to a party, depending on how you impressed the other party go-ers, your score could plummet or rise in real time, and everyone would be able to keep track and treat you accordingly.

I mean, there were other things about the novel too, but as a data person these details certainly stuck with me since they are both extremely gross and utterly plausible.

And why do I say they are coming true now? I base my claim on two news stories I’ve been sent by my various blog readers recently.

[Aside: if you read my blog and find an awesome article that you want to send me, by all means do! My email address is available on my "About" page.]

First, coming via Suresh and Marcos, we learn that data broker Acxiom is letting people see their warehoused data. A few caveats, bien sûr:

  1. You get to see your own profile, here, starting in 2 days, but only your own.
  2. And actually, you only get to see some of your data. So they won’t tell you if you’re a suspected gambling addict, for example. It’s a curated view, and they want your help curating it more. You know, for your own good.
  3. And they’re doing it so that people have clarity on their business.
  4. Haha! Just kidding. They’re doing it because they’re trying to avoid regulations and they feel like this gesture of transparency might make people less suspicious of them.
  5. And they’re counting on people’s laziness. They’re allowing people to opt out, but of course the people who should opt out would likely never even know about that possibility.
  6. Just keep in mind that, as an individual, you won’t know what they really think they know about you, but as a corporation you can buy complete information about anyone who hasn’t opted out.

In any case those credit scores that Shteyngart talks about are already happening. The only issue is who gets flashed those numbers and when. Instead of the answers being “anyone walking down the street” and “when you walk by a pole” it’s “any corporation on the interweb” and “whenever you browse”.

After all, why would they give something away for free? Where’s the profit in showing the credit scores of anyone to everyone? Hmmmm….

That brings me to my second news story of the morning coming to me via Constantine, namely this TechCrunch story which explains how a startup called Fantex is planning to allow individuals to invest in celebrity athletes’ stocks. Yes, you too can own a tiny little piece of someone famous, for a price. From the article:

People can then buy shares of that player’s brand, like a stock, in the Fantex-consumer market. Presumably, if San Francisco 49ers tight end Vernon Davis has a monster year and looks like he’s going to get a bigger endorsement deal or a larger contract in a few years, his stock would rise and a fan could sell their Davis stock and cash out with a real, monetary profit. People would own tracking or targeted stocks in Fantex that would depend on the specific brand that they choose; these stocks would then rise and fall based on their own performance, not on the overall performance of Fantex.

Let’s put these two things together. I think it’s not too much of a stretch to acknowledge a reason for everyone to know everyone else’s credit score! Namely, we can bet on each other’s futures!

I can’t think of any set-up more exhilarating to the community of hedge fund assholes than a huge, new open market – containing profit potentials for every single citizen of earth – where you get to make money when someone goes to the wrong college, or when someone enters into an unfortunate marriage and needs a divorce, or when someone gets predictably sick. An orgy in the exact center of tech and finance.

Are you with me peoples?!

I don’t know what your Labor Day plans are, but I’m getting ready my list of people to short in this spanking new market.

College ranking models

Last week Obama began making threats regarding a new college ranking system and its connection to federal funding. Here’s an excerpt of what he was talking about, from this WSJ article:

The president called for rating colleges before the 2015 school year on measures such as affordability and graduation rates—”metrics like how much debt does the average student leave with, how easy is it to pay off, how many students graduate on time, how well do those graduates do in the workforce,” Mr. Obama told a crowd at the University at Buffalo, the first stop on a two-day bus tour.

Interesting! This means that Obama is wading directly into the field of modeling. He’s probably sick of the standard college ranking system, put out by U.S. News & World Report. I kind of don’t blame him, since that model is flawed and largely gamed. In fact, I made a case for open sourcing that model recently just so that people would look into it and lose faith in its magical properties.

So I’m with Obama, that model sucks, and it’s high time there are other competing models so that people have more than one thing to think about.

On the other hand, what Obama is focusing on seems narrow. Here’s what he supposedly wants to do with that model (again from the WSJ article):

Once a rating system is in place, Mr. Obama will ask Congress to allocate federal financial aid based on the scores by 2018. Students at top-performing colleges could receive larger federal grants and more affordable student loans. “It is time to stop subsidizing schools that are not producing good results,” he said.

His main goal seems to be “to make college more affordable”.

I’d like to make a few comments on this overall plan. The short version is that he’s suggesting something that will have strong, mostly negative effects, and that won’t solve his problem of college affordability.

Why strong negative effects?

What Obama seems to realize about the existing model is that it’s had side effects because of the way college administrators have gamed the model. Presumably, given that this new proposed model will be directly tied to federal funding, it will be high-impact and will thus be thoroughly gamed by administrators as well.

The first complaint, then, is that Obama didn’t address this inevitable gaming directly – and that doesn’t bode well for his ability to put into place a reasonable model.

But let’s not follow his lead. Let’s think about what kind of gaming will occur once such a model is in place. It’s not pretty.

Here are the attributes he’s planning to use for colleges. I’ve substituted reasonable numerical proxies for his descriptions above:

  1. Cost (less is better)
  2. Percentage of people able to pay off their loans within 10 years (more is better)
  3. Graduation rate (more is better)
  4. Percentage of people graduating within 4 years (more is better)
  5. Percentage of people who get high-paying jobs after graduating (more is better)
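To make the list above concrete, here is a minimal sketch of what a score built from those five proxies might look like. Nothing here comes from the actual proposal: the equal weighting, the `MAX_COST` scaling cap, the class and field names, and all the numbers are assumptions made up purely for illustration.

```python
from dataclasses import dataclass, replace

# Hypothetical cap, used only to scale cost into the range [0, 1].
MAX_COST = 60_000.0


@dataclass(frozen=True)
class College:
    name: str
    cost: float              # annual cost in dollars (less is better)
    loan_payoff_rate: float  # fraction paying off loans within 10 years
    graduation_rate: float   # fraction who graduate at all
    on_time_rate: float      # fraction graduating within 4 years
    high_pay_rate: float     # fraction landing high-paying jobs


def score(college: College) -> float:
    """Equal-weight average of the five proxies, each scaled to [0, 1]."""
    # Flip cost so that cheaper colleges score higher.
    cost_component = 1.0 - min(college.cost, MAX_COST) / MAX_COST
    parts = [
        cost_component,
        college.loan_payoff_rate,
        college.graduation_rate,
        college.on_time_rate,
        college.high_pay_rate,
    ]
    return sum(parts) / len(parts)


# Made-up numbers for a made-up college.
honest = College("Honest U", 30_000, 0.60, 0.70, 0.50, 0.30)

# Same college after pushing up the two graduation inputs,
# with nothing else about it changed.
gamed = replace(honest, name="Gamed U", graduation_rate=0.90, on_time_rate=0.80)

for c in (honest, gamed):
    print(f"{c.name}: {score(c):.2f}")
```

The point of the toy is only that a score like this responds to its inputs and nothing else: any way of moving a number up, legitimate or not, moves the score up with it.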

Cost

Nobody is going to argue against optimizing for lower cost. Unfortunately, what with the cultural assumption of the need for a college education, combined with the ignorance and naive optimism of young people, not to mention start-ups like Upstart that allow young people to enter indentured servitude, the pressure is upwards, not downwards.

The supply of money for college is large and growing, and the answer to rising tuition costs is not to supply more money. Colleges have already responded to the existence of federal loans, for example, by raising tuition in the amount of the loan. Ironically, much of the rise in tuition cost has gone to administrators, whose job it is to game the system for more money.

Which is to say, you can penalize certain colleges for being at the front of the pack in terms of price, but if the overall cost is rising constantly, you’re not doing much.

If you really wanted to make costs low, then fund state universities and make them really good, and make them basically free. That would actually make private colleges try to compete on cost.

Paying off loans quickly

Here’s where we get to the heart of the problem with Obama’s plan.

What are you going to do, as an administrator tasked with making sure you never lose federal funding under the new regime?

Are you going to give all the students fairer terms on their debt? Or are you going to select for students that are more likely to get finance jobs? I’m guessing the latter.

So much for liberal arts educations. So much for learning about art, philosophy, or for that matter anything that isn’t an easy entrance into the tech or finance sector. Only colleges that don’t care a whit about federal money will even have an art history department.

Graduation rate

Gaming the graduation rate is easy. Just lower your standards for degrees, duh.

How quickly people graduate

Again, a general lowering of standards is quick and easy.

How well graduates do in the workforce

Putting this into your model is toxic, and measures a given field directly in terms of market forces. Economics, Computer Science, and Business majors will be the kings of the hill. We might as well never produce writers, thinkers, or anything else creative again.

Note this pressure already exists today: many of our college presidents are becoming more and more corporate minded and less interested in education itself, mostly as a means to feed their endowments. As an example, I don’t need to look further than across my street to Barnard, where president Debora Spar somehow decided to celebrate Ina Drew as an example of success in front of a bunch of young Barnard students. I can’t help but think that was related to a hoped-for gift.

Obama needs to think this one through. Do we really want to build the college system in this country in the image of Wall Street and Silicon Valley? Do we want to intentionally skew the balance towards those industries even further?

Building a better college ranking model

The problem is that it’s actually really hard to model quality of education. The mathematical models that already exist and are being proposed are just pathetically bad at it, partly because college, ultimately, isn’t only about the facts you learn, or the job you get, or how quickly you get it. It’s actually a life experience which, in the best of cases, enlarges your world view, and gets you to strive for something you might not have known existed before going.

I’d suggest that, instead of building a new ranking system, we on the one hand identify truly fraudulent colleges (which really do exist) and on the other, invest heavily in state schools, giving them enough security so they can do without their army of expensive administrators.

Categories: modeling, news, rant

How to be a pickup artist, Silicon Valley style

You know that feeling you get when you’re reading a disembodied article on the web and it’s just so ridiculous, you get the creeping sensation that it’s either from The Onion or the Borowitz Report?

That is, I would suggest, how you’re going to feel when you read this article about a school for Silicon Valley style entrepreneurship (hat tip Peter Woit). Even just the name of the school – the Draper University of Heroes – feels like an Onion article, never mind the visuals:

Students in class at the Draper University of Heroes

So, what do these young people do to become douchebag heroes? Here’s what:

  • They pledge allegiance every morning to their personal brands,
  • They submit to a full two days of coding and Excel lessons,
  • Then they get down to the real work of sun tanning by the pool and go-kart racing,
  • They hang out with VC Tim Draper, an investor in Tesla (the new conspicuous consumption choice among pseudo-progressive capitalists, as I learned at FOO),
  • They read books, or at least they own books, including Donald Trump’s The Art of the Deal, The Wall Street MBA, and Ayn Rand’s The Fountainhead,
  • and all this for just $9,500 for an eight week program!

How does it end? From the article:

In lieu of diplomas, Draper U. students receive masks and capes printed with their superhero nicknames and are instructed to jump on each of a series of three small trampolines placed in a line in front of them. While bouncing from trampoline to trampoline, they’re told to shout, “Up, up, and away!” Then they assemble for a group photo.

“The world needs more heroes,” Draper says. “And it just got 40 more of them!”

Here’s the thing. It’s no accident that there are way more men than women here. This school is very similar in design and intent to the society built by Neil Strauss, who wrote The Game and taught a bunch of guys how to pick up “hot” women for sex – Aunt Pythia discussed it here.

Why do I say that? Because it’s fundamentally a confidence-boosting ritual, where a bunch of guys convince themselves that their prospects are good, their goals are attainable, their narcissistic world view is honorable, and it’s just a question of acquiring the right magic tricks to entrap their prey. It just happens to be about money instead of sex in this case.

There is a difference, of course. Whereas the pick up artists only needed to trick drunk women for a few hours in order to sleep with them, these “Silicon Valley Heroes” have to trick way more people, for way longer, into thinking that they should get investment. That doesn’t make it impossible for something like this to work, though, just harder.

Categories: musing, news

Minorities possibly unfairly disqualified from opening bank accounts

My friend Frank Pasquale sent me this article over twitter, about New York State attorney general Eric T. Schneiderman’s investigation into possibly unfair practices by big banks using opaque and sometimes erroneous databases to disqualify people from opening accounts.

Not much hard information is given in the article but we know that negative reports stemming from the databases have effectively banished more than a million lower-income Americans from the financial system, and we know that the number of “underbanked” people in this country has grown by 10% since 2009. Underbanked people are people who are shut out of the normal banking system and have to rely on the underbelly system including check cashing stores and payday lenders.

I can already hear the argument of my libertarian friends: if I’m a bank, and I have reason to suspect you have messed up with your finances in the past, I don’t offer you services. Done and done. Oh, and if I’m a smart bank that figures out some of these so-called “past mistakes” are actually erroneously reported, then I make extra money by serving those customers that are actually good when they look bad. And the free market works.

Two responses to this. First, at this point big banks are really not private companies, being on the taxpayer dole. In return they should reasonably be expected to provide banking services to all if not most people as part of a service. Of course this is a temporary argument, since nobody actually likes the fact that the banks aren’t truly private companies.

The second, more interesting point – at least to me – is this. We care about and defend ourselves from our constitutional rights being taken away but we have much less energy to defend ourselves against good things not happening to us.

In other words, it’s not written into the constitution that we all deserve a good checking account, nor a good college education, nor good terms on a mortgage, and so on. Even so, in a large society such as ours, such things are basic ingredients for a comfortable existence. Yet these services are rare if not nonexistent for a huge and swelling part of our society, resulting in a degradation of opportunity for the poor.

The overall effect is heinous, and at some point does seem to rise to the level of a constitutional right to opportunity, but I’m no lawyer.

In other words, instead of only worrying about the truly bad things that might happen to our vulnerable citizens, I personally spend just as much time worrying about the good things that might not happen to our vulnerable citizens, because from my perspective lots of good things not happening add up to bad things happening: they all narrow future options.

Categories: modeling, news, rant

Larry Summers being set up to fail?

I’m back from PyData, which was a lot of fun and filled with super nice nerdy people. My prezi slides are now available here.

I have time for one thought: a bunch of people have chatted me up recently with the theory that Larry Summers is being put in the running for the Fed Chair alongside Janet Yellen just so that, when Yellen gets the call, we can all breathe a sigh of relief it didn’t go to Summers.

In other words, it’s a wholly political ploy so that Obama can look like a hero for women everywhere when he chooses Yellen, and so that we can all conclude that at least Obama’s learned this one lesson with regards to dealing with the ongoing financial crisis: Summers isn’t the solution.

Depending on my mood I sometimes buy into this theory, but obviously I’m still worried.

Categories: finance, news

PyData and a few other things

So here’s the thing about being a parent of benign neglect: it’s no walk in the park. I talk a big game, but the truth is I’ve had trouble getting to sleep from the anxiety. To distract myself I’ve been watching Law & Order episodes on Netflix until the wee hours of the night.

Two things about this plan suck. First, my husband is in Amsterdam, which means he’s 6 time zones away from our oldest son whereas I’m only 3, but somehow that means I’m shouldering 99.5% of the responsibility to worry (there’s some universal geographic law of parenting at work there but I don’t know how to formulate it). Second, half of the L&O episodes involve either children getting maimed or killed or child killers. Not restful but I freaking can’t stop!

In any case, not much extra energy to spring out of bed and write the blog, so apologies for a sparse period for mathbabe. For whatever reason I woke up this morning in time to blog, however, so as to not miss an opportunity it’s gonna be in list form:

  1. I’ve been invited to keynote at PyData in Cambridge, MA at the end of the month – me and Travis Oliphant! I’m still coming up with the title and abstract for my talk, but it’s going to be something about storytelling with data using the IPython Notebook. Please make suggestions!
  2. I was in a Wall Street Journal article about Larry Summers, talking about whether he’s got a good personality to take over from Ben Bernanke, i.e. should we trust our lives and our future with him. I say nope. What’s funny is that my uncle, economist Bob Hall, is also referred to in the same article. The journalist didn’t know we’re related until after the article came out and Uncle Bob informed him.
  3. Hey, can we give it up for Eliot Spitzer? The powers that be are down about that guy presumably for having sex with prostitutes but really because he’s a threat. I say legalize prostitution, unionize the prostitutes à la the Dutch, and put Spitzer in charge of something involving money and corruption, he’s smart and fearless. Who’s with me?
  4. It looks like good news: the Consumer Financial Protection Bureau might be cracking down on illegal debt collector tactics. Update: wait, the fines are fractions of 1% of the revenue these guys made on their unfair practices. Can we please have a rule that when you get caught breaking the law, the fine will be large enough so it’s no longer profitable?

Categories: news, open source tools

How much would you pay to be my friend?

I am on my way to D.C. for a health analytics conference, where I hope to learn the state of the art for health data and modeling. So stay tuned for updates on that.

In the meantime, ponder this concept (hat tip Matt Stoller, who describes it as ‘neoliberal prostitution’). It’s a dating website called “What’s Your Price?” where suitors bid for dates.

What’s creepier, the sex-for-pay aspect of this, or the it’s-possibly-not-about-sex-it’s-about-dating aspect? I’m gonna go with the latter, personally, since it’s a new idea for me. What else can I monetize that I’ve been giving away too long for free?

Hey, kid, you want a bedtime story? It’s gonna cost you.

Categories: data science, modeling, news

Technocrats and big data

Today I’m finally getting around to reporting on the congressional subcommittee I went to a few weeks ago on big data and analytics. Needless to say it wasn’t what I’d hoped.

My observations are somewhat disjointed, since there was no coherent discussion, so I guess I’ll just make a list:

  1. The Congressmen and women seemed to know nothing more about the “Big Data Revolution” than what they’d read in the now-famous McKinsey report which talks about how we’ll need 180,000 data scientists in the next decade and how much money we’ll save and how competitive it will make our country.
  2. In other words, with one small exception I’ll discuss below, the Congresspeople were impressed, even awed, at the intelligence and power of the panelists. They were basically asking for advice on how to let big data happen on a bigger and better scale. Regulation never came up, it was all about, “how do we nurture this movement that is vital to our country’s health and future?”
  3. There were three useless panelists, all completely high on big data and making their money being like that. First there was a schmuck from the NSF who just said absolutely nothing, had been to a million panels before, and was simply angling to be invited to yet more.
  4. Next there was a guy who had started training data-ready graduates in some masters degree program. All he ever talked about is how programs like his should be funded, especially his, and how he was talking directly with employers in his area to figure out what to train his students to know.
  5. It was especially interesting to see how this second guy reacted when the single somewhat thoughtful and informed Congressman, whose name I didn’t catch because he came in and left quickly and his name tag was minuscule, asked him about whether or not he taught his students to be skeptical. The guy was like, I teach my students to be ready to deal with big data just like their employers want. The congressman was like, no that’s not what I asked, I asked whether they can be skeptical of perceived signals versus noise, whether they can avoid making huge costly mistakes with big data. The guy was like, I teach my students to deal with big data.
  6. Finally there was the head of IBM Research who kept coming up with juicy and misleading pro-data tidbits which made him sound like some kind of saint for doing his job. For example, he brought up the “premature infants are being saved” example I talked about in this post.
  7. The IBM guy was also the only person who ever mentioned privacy issues at all, and he summarized his, and presumably everyone else’s position on this subject, by saying “people are happy to give away their private information for the services they get in return.” Thanks, IBM guy!
  8. One more priceless moment was when one of the Congressmen asked the panel if industry has enough interaction with policy makers. The head of IBM Research said, “Why yes, we do!” Thanks, IBM guy!

I was reminded of this weird vibe and power dynamic, where an unchallenged mysterious power of big data rules over reason, when I read this New York Times column entitled Some Cracks in the Cult of Technocrats (hat tip Suresh Naidu). Here’s the leading paragraph:

We are living in the age of the technocrats. In business, Big Data, and the Big Brains who can parse it, rule. In government, the technocrats are on top, too. From Washington to Frankfurt to Rome, technocrats have stepped in where politicians feared to tread, rescuing economies, or at least propping them up, in the process.

The column was written by Chrystia Freeland and it discusses a recent paper entitled Economics versus Politics: Pitfalls of Policy Advice by Daron Acemoglu from M.I.T. and James Robinson from Harvard. A description of the paper from Freeland’s column:

Their critique is not the standard technocrat’s lament that wise policy is, alas, politically impossible to implement. Instead, their concern is that policy which is eminently sensible in theory can fail in practice because of its unintended political consequences.

In particular, they believe we need to be cautious about “good” economic policies that have the side effect of either reinforcing already dominant groups or weakening already frail ones.

“You should apply double caution when it comes to policies which will strengthen already powerful groups,” Dr. Acemoglu told me. “The central starting point is a certain suspicion of elites. You really cannot trust the elites when they are totally in charge of policy.”

Three examples they discuss in the paper: trade unions, financial deregulation in the U.S., privatization in Russia. These are examples where something economists suggested would make the system better also acted to reinforce the power of already powerful people.

If there’s one thing I might infer from my trip to Washington, it’s that the technocrats in charge nowadays, whose advice is being followed, may have subtly shifted away from deregulation economists and towards big data folks. Not that I’m holding my breath for Bob Rubin to be losing his grip any time soon.

Categories: data science, finance, news

Left Forum panels next weekend: #OWS Alt Banking meeting and a debate with Doug Henwood

Next weekend at Pace University in New York City I’ll be taking part in two panels at the Left Forum, a yearly conference of progressives that everybody who’s anybody seems to know about, although this will be my first year there. For example Noam Chomsky is coming this year.


First, from noon til 1:40 on Saturday June 8th, I’ll be debating how to shrink the financial sector with Doug Henwood, author of Wall Street: how it works and for whom. The panel will be moderated by my buddy Suresh Naidu, an occupier profiled in the Huffington Post. The announcement for this panel is here and includes room information.

Second, from 3:40 til 5:20, also on Saturday June 8th, I’ll be facilitating a meeting of the Alternative Banking group of OWS, which will be loads of fun. The idea is to explain to the panel audience how we roll in Alt Banking, to have a discussion about breaking up the banks, and to get the audience to participate as well. We expect them to enjoy getting on stack. The announcement for this panel is here, please come!

Registration for the Left Forum is still open and is affordable. Go here to register, and see you next weekend!

Categories: #OWS, news

Huge fan of citibikes

In spite of the nasty corporate connection to megabank Citigroup, I’m a huge fan of the new bike share program in downtown Manhattan and Brooklyn. I got my annual membership for $95 last week and activated it online and I already used it three times yesterday even though it was raining the whole time.

It helps that I work on 21st street near 6th avenue, which is one of the 300 stations so far set up with bikes. I biked downtown along Broadway to NYU to have lunch with Johan, and since we’d walked along Bleecker Street for some distance, I grabbed a bike from a different station on the way up along 6th.

Then later in the day I was meeting someone at Bryant Park so I biked up there, getting ridiculously wet but being super efficient. Now you know where my priorities are.

Here’s the map I’ve been staring at for the past week. It’s interactive, but just to give you an idea I captured a screenshot:

[Screenshot of the interactive Citi Bike station map]

Friday I’m meeting my buddy Kiri near her work in downtown Brooklyn for lunch. Yeah!!

Sign up today, people!

 

Categories: news

The Bounded Gaps Between Primes Theorem has been proved

There’s really exciting news in the world of number theory, my old field. I heard about it last month but it just hit the mainstream press.

Namely, mathematician Yitang Zhang just proved that there are infinitely many pairs of primes that differ by at most 70,000,000. His proof is available here and, unlike Mochizuki’s claim of a proof of the ABC Conjecture, this has already been understood and confirmed by the mathematical community.

Mathematician Yitang Zhang


Go take a look at number theorist Emmanuel Kowalski‘s blog post on the subject if you want to understand the tools Zhang used in his proof.

Also, my buddy and mathematical brother Jordan Ellenberg has an absolutely beautiful article in Slate explaining why mathematicians believed this theorem had to be true, due to the extent to which we can consider prime numbers to act as if they are “randomly distributed.” My favorite passage from Jordan’s article:

It’s not hard to compute that, if prime numbers behaved like random numbers, you’d see precisely the behavior that Zhang demonstrated. Even more: You’d expect to see infinitely many pairs of primes that are separated by only 2, as the twin primes conjecture claims.

(The one computation in this article follows. If you’re not onboard, avert your eyes and rejoin the text where it says “And a lot of twin primes …”)

Among the first N numbers, about N/log N of them are primes. If these were distributed randomly, each number n would have a 1/log N chance of being prime. The chance that n and n+2 are both prime should thus be about (1/log N)^2. So how many pairs of primes separated by 2 should we expect to see? There are about N pairs (n, n+2) in the range of interest, and each one has a (1/log N)^2 chance of being a twin prime, so one should expect to find about N/(log N)^2 twin primes in the interval.
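
To see how Jordan’s back-of-the-envelope estimate holds up, here is a minimal Python sketch (mine, not from the Slate article) that counts twin primes up to N with a simple sieve and compares the count with N/(log N)^2. The function names are made up for illustration. Keep in mind that the random model treats n and n+2 as independent, which they aren’t (they can’t both be even, for example), so I’d only expect the estimate to match the true count up to a constant factor, not exactly.

```python
import math


def prime_flags(n):
    """Sieve of Eratosthenes: flags[k] is True exactly when k is prime (0 <= k <= n)."""
    flags = [True] * (n + 1)
    flags[0:2] = [False, False]
    for p in range(2, math.isqrt(n) + 1):
        if flags[p]:
            # Cross off every multiple of p starting at p*p.
            for m in range(p * p, n + 1, p):
                flags[m] = False
    return flags


def count_twin_primes(n):
    """Count pairs (k, k+2) with both members prime and k+2 <= n."""
    flags = prime_flags(n)
    return sum(1 for k in range(2, n - 1) if flags[k] and flags[k + 2])


def random_model_estimate(n):
    """The heuristic from the passage above: about n / (log n)^2 twin primes up to n."""
    return n / math.log(n) ** 2


if __name__ == "__main__":
    for exponent in range(3, 7):
        n = 10 ** exponent
        actual = count_twin_primes(n)
        estimate = random_model_estimate(n)
        print(f"N = 10^{exponent}: counted {actual}, random model says about {estimate:.0f}")
```

If you run it, compare how the two columns grow rather than whether they agree digit for digit: the point of the heuristic is that the twin primes don’t seem to run out.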

Congratulations!

Categories: math, news

Fight back against surveillance using TrackMeNot, TrackMeNot mobile?

After two days of travelling to the west coast and back, I’m glad to be back to my blog (and, of course, my coffee machine, which is the real source of my ability to blog every morning without distraction: it makes coffee at the push of a button, and that coffee has a delicious amount of caffeine).

Yesterday at the hotel I grabbed a free print edition of the Wall Street Journal to read on the plane, and I was super interested in this article called Phone Firm Sells Data on Customers. They talk about how phone companies (Verizon, specifically) are selling location data and browsing data about customers, how some people might be creeped out by this, and then they say:

The new offerings are also evidence of a shift in the relationship between carriers and their subscribers. Instead of merely offering customers a trusted conduit for communication, carriers are coming to see subscribers as sources of data that can be mined for profit, a practice more common among providers of free online services like Google Inc. and Facebook Inc.

Here’s the thing. It’s one thing to make a deal with the devil when I use Facebook: you give me something free, in return I let you glean information about me. But in terms of Verizon, I pay them like $200 per month for my family’s phone usage. That’s not free! Fuck you guys for turning around and selling my data!

And how are marketers going to use such location data? They will know how desperate you are for their goods and charge you accordingly. Like this for example, but on a much wider scale.

There are two things I can do to object to this practice. First, I write this post and others, railing against such needless privacy invasion practices. Second, I can go to Verizon, my phone company, and get myself off the list. The instructions for doing so seem to be here, but I haven’t actually followed them yet.

Here’s what I wish a third option were: a mobile version of Trackmenot, which I learned about last week from Annelies Kamran.

Trackmenot, created by Daniel C. Howe and Helen Nissenbaum at what looks like the CS department of NYU, confuses the data gatherers by giving them an overload of bullshit information.

Specifically, it’s a Firefox add-on which sends you to all sorts of websites while you’re not actually using your browser. The data gatherers get endlessly confused about what kind of person you actually are this way, thereby fucking up the whole personal data information industry.

I have had this idea in the past, and I’m super happy it already exists. Now can someone do it for mobile please? Or even better, tell me it already exists?

Categories: data science, modeling, news

Dow at an all-time high, who cares?

The Dow is at an all-time high. Here’s the past 12 months:

[Screenshot: chart of the Dow over the past 12 months]

Once upon a time it might have meant something good, in a kind of “rising tide lifts all boats” sort of way. Nowadays not so much.

Of course, if you have a 401K you’ll probably be a bit happier than you were 4 years ago. Or if you’re an investor with money in the game.

On the other hand, not many people have 401K plans, and many who do don’t have a lot of money in them, partly because one in four people have needed to dip into their savings lately in spite of the huge fees they were slapped with for doing so. Go watch the recent Frontline episode about 401Ks to learn more about this scammy industry.

Let’s face it, the Dow is so high not because the economy is great, or even because it is projected to be great soon. It’s mostly inflated out of a combination of easy Fed money for banks, which translates to easy money for people who are already rich, and the fact that world-wide investors are afraid of Europe and are parking their money in the U.S. until the Euro problem gets solved.

In other words, that money is going to go away if people decide Europe looks stable, or if the Fed decides to raise interest rates. The latter might happen when the economy (or rather, if the economy) looks better, so putting that together we’re talking about a possible negative stock market response to a positive economic outlook.

The stock market has officially become decoupled from our nation’s future.

Categories: finance, news

SEC Roundtable on credit rating agency models today

I’ve discussed the broken business model that is the credit rating agency system in this country on a few occasions. It directly contributed to the opacity and fraud in the MBS market and to the ensuing financial crisis, for example. And in this post and then this one, I suggest that someone should start an open source version of credit rating agencies. Here’s my explanation:

The system of credit ratings undermines the trust of even the most fervently pro-business entrepreneur out there. The models are knowingly gamed by both sides, and it’s clearly both corrupt and important. It’s also a bipartisan issue: Republicans and Democrats alike should want transparency when it comes to modeling downgrades, at the very least so they can argue against the results in a factual way. There’s no reason I can see why there shouldn’t be broad support for a rule to force the ratings agencies to make their models publicly available. In other words, this isn’t a political game that would score points for one side or the other.

Well, it wasn’t long before Marc Joffe, who had started an open source credit rating agency, contacted me and came to my Occupy group to explain his plan, which I blogged about here. That was almost a year ago.

Today the SEC is going to have something they’re calling a Credit Ratings Roundtable. This is in response to an amendment that Senator Al Franken put on Dodd-Frank which requires the SEC to examine the credit rating industry. From their webpage description of the event:

The roundtable will consist of three panels:

  • The first panel will discuss the potential creation of a credit rating assignment system for asset-backed securities.
  • The second panel will discuss the effectiveness of the SEC’s current system to encourage unsolicited ratings of asset-backed securities.
  • The third panel will discuss other alternatives to the current issuer-pay business model in which the issuer selects and pays the firm it wants to provide credit ratings for its securities.

Marc is going to be one of something like 9 people in the third panel. He wrote this op-ed piece about his goal for the panel, a key excerpt being the following:

Section 939A of the Dodd-Frank Act requires regulatory agencies to replace references to NRSRO ratings in their regulations with alternative standards of credit-worthiness. I suggest that the output of a certified, open source credit model be included in regulations as a standard of credit-worthiness.

Just to be clear: the current problem is that not only is there wide-spread gaming, but there’s also a near monopoly by the “big three” credit rating agencies, and for whatever reason that monopoly status has been incredibly well protected by the SEC. They don’t grant “NRSRO” status to credit rating agencies unless the given agency can produce something like 10 letters from clients who will vouch for them providing credit ratings for at least 3 years. You can see why this is a hard business to break into.

The Roundtable was covered yesterday in the Wall Street Journal as well: Ratings Firms Steer Clear of an Overhaul - an unfortunate title if you are trying to be optimistic about the event today. From the WSJ article:

Mr. Franken’s amendment requires the SEC to create a board that would assign a rating firm to evaluate structured-finance deals or come up with another option to eliminate conflicts.

While lawsuits filed against S&P in February by the U.S. government and more than a dozen states refocused unflattering attention on the bond-rating industry, efforts to upend its reliance on issuers have languished, partly because of a lack of consensus on what to do.

I’m just kind of amazed that, given how dirty and obviously broken this industry is, we can’t do better than this. SEC, please start doing your job. How could allowing an open-source credit rating agency hurt our country? How could it make things worse?

How much math do scientists need to know?

I’m catching up with reading the “big data news” this morning (via Gil Press) and I came across this essay by E. O. Wilson called “Great Scientist ≠ Good at Math”. In it, he argues that most of the successful scientists he knows aren’t good at math, and he doesn’t see why people get discouraged from being scientists just because they suck at math.

Here’s an important excerpt from the essay:

Over the years, I have co-written many papers with mathematicians and statisticians, so I can offer the following principle with confidence. Call it Wilson’s Principle No. 1: It is far easier for scientists to acquire needed collaboration from mathematicians and statisticians than it is for mathematicians and statisticians to find scientists able to make use of their equations.

Given that he’s written many papers with mathematicians and statisticians, then, he is not claiming that math itself is not part of great science, just that he hasn’t been the one that supplied the mathy bits. I think this is really key.

And it resonates with me: I’ve often said that the cool thing about working on a data science team in industry, for example, is that different people bring different skills to the table. I might be an expert on some machine learning algorithms, while someone else will be a domain expert. The problem requires both skill sets, and perhaps no one person has all that knowledge. Teamwork kinda rocks.

Another thing he exposes with Wilson’s Principle No. 1, though, which doesn’t resonate with me, is a general lack of understanding of what mathematicians are actually trying to accomplish with “their equations”.

It is a common enough misconception to think of the quant as a guy with a bunch of tools but no understanding or creativity. I’ve complained about that before on this blog. But when it comes to professional mathematicians, presumably including his co-authors, a prominent scientist such as Wilson should realize that they are doing creative things inside the realm of mathematics simply for the sake of understanding mathematics.

Mathematicians, as a group, are not sitting around wishing someone could “make use of their equations.” For one thing, they often don’t even think about equations. And for another, they often think about abstract structures with no goal whatsoever of connecting it back to, say, how ants live in colonies. And that’s cool and beautiful too, and it’s not a failure of the system. That’s just math.

I’m not saying it wouldn’t be fun for mathematicians to spend more time thinking about applied science. I think it would be fun for them, actually. Moreover, as the next few years and decades unfold, we might very well see a large-scale shrinkage in math departments and basic research money, which could force the issue.

And, to be fair, there are probably some actual examples of mathy-statsy people who are thinking about equations that are supposed to relate to the real world but don’t. Those guys should learn to be better communicators and pair up with colleagues who have great data. In my experience, this is not a typical situation.

One last thing. The danger in ignoring the math yourself, if you’re a scientist, is that you probably aren’t that great at knowing the difference between someone who really knows math and someone who can throw around terminology. You can’t catch charlatans, in other words. And, given that scientists do need real math and statistics to do their research, this can be a huge problem if your work ends up being meaningless because your team got the math wrong.

Categories: modeling, news, statistics