What’s the difference between big data and business analytics?

August 16, 2013

I offend people daily. People tell me they do “big data” and that they’ve been doing big data for years. Their argument is that they’re doing business analytics on a larger and larger scale, so surely by now it must be “big data”.

No.

There’s an essential difference between true big data techniques, as actually performed at surprisingly few firms but exemplified by Google, and the human-intervention data-driven techniques referred to as business analytics.

No matter how big the data you use is, at the end of the day, if you’re doing business analytics, you have a person looking at spreadsheets or charts or numbers, making a decision after possibly a discussion with 150 other people, and then tweaking something about the way the business is run.

If you’re really doing big data, then those 150 people probably get laid off, or even more likely are never hired in the first place, and the computer is programmed to update itself via an optimization method.

That’s not to say it doesn’t also spit out monitoring charts and numbers, and it’s not to say no person takes a look every now and then to make sure the machine is humming along, but there’s no point at which the algorithm waits for human intervention.

In other words, in a true big data setup, the human has stepped outside the machine and lets the machine do its thing. That means, of course, that it takes way more to set up that machine in the first place, and probably people make huge mistakes all the time in doing this, but sometimes they don’t. Google search got pretty good at this early on.
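A minimal sketch of what "the computer updates itself" could look like, assuming a toy setting that is not from any real system: three hypothetical page variants with made-up conversion rates, and an epsilon-greedy rule standing in for "an optimization method". The function name `run_closed_loop` and every number here are illustrative assumptions. The point is only that the loop chooses, observes, and updates without ever waiting for a person, while still writing out numbers someone could glance at.

```python
import random


def run_closed_loop(true_rates, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy loop: pick a variant, observe a conversion, update, repeat.

    true_rates are hypothetical conversion rates used only to simulate visitors;
    a real system would observe actual outcomes instead.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms        # how often each variant was shown
    estimates = [0.0] * n_arms  # running estimate of each variant's conversion rate
    monitoring_log = []         # snapshots a human may look at, but nothing waits on them

    for step in range(1, steps + 1):
        # Explore a random variant occasionally, otherwise exploit the best estimate.
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Simulated visitor: converts with the variant's (hidden) true rate.
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0

        # The update happens immediately, with no sign-off from anyone.
        pulls[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]

        # Monitoring output every 1,000 steps.
        if step % 1000 == 0:
            monitoring_log.append((step, list(pulls), [round(e, 3) for e in estimates]))

    return pulls, estimates, monitoring_log


pulls, estimates, monitoring_log = run_closed_loop([0.02, 0.05, 0.03])
for snapshot in monitoring_log:
    print(snapshot)
```

Nothing in the loop blocks on the monitoring log. In the business analytics version, the same snapshots would go to a meeting, and the choice of variant would change only after the meeting.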

So with a business analytics setup we might keep track of the number of site visitors and a few sales metrics so we can later try to (and fail to) figure out whether a specific email marketing campaign had the intended effect.

But in a big data setup it’s typically much more microscopic and detail-oriented, collecting everything it can, maybe 1,000 attributes of a single customer, and figuring out what that customer is likely to do next time, how much they’ll spend, and the magic question, whether there will even be a next time.
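As a rough illustration of the "will there even be a next time" question, here is a small sketch under loud assumptions: the customers are synthetic, there are 20 made-up attributes rather than 1,000, and online logistic regression is just one plausible choice of model, not a claim about what any particular firm uses. The class name `ReturnModel` and all parameters are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


class ReturnModel:
    """Online logistic regression for P(customer comes back | attributes)."""

    def __init__(self, n_attributes, learning_rate=0.05):
        self.weights = [0.0] * n_attributes
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, attributes):
        z = self.bias + sum(w * x for w, x in zip(self.weights, attributes))
        return sigmoid(z)

    def update(self, attributes, came_back):
        # One stochastic gradient step on the log loss for a single customer.
        error = self.predict(attributes) - (1.0 if came_back else 0.0)
        self.bias -= self.learning_rate * error
        for i, x in enumerate(attributes):
            self.weights[i] -= self.learning_rate * error * x


# Synthetic data: a hidden rule (an assumption for this demo) decides who returns.
rng = random.Random(0)
n_attributes = 20
hidden_rule = [rng.gauss(0, 1) for _ in range(n_attributes)]

model = ReturnModel(n_attributes)
for _ in range(2000):
    customer = [rng.gauss(0, 1) for _ in range(n_attributes)]
    p_true = sigmoid(sum(h * x for h, x in zip(hidden_rule, customer)))
    came_back = rng.random() < p_true
    model.update(customer, came_back)  # the model adjusts after every customer

new_customer = [rng.gauss(0, 1) for _ in range(n_attributes)]
print("estimated chance of a next visit:", round(model.predict(new_customer), 3))
```

The model is updated after every single customer, so there is no batch report for anyone to review before the next prediction goes out.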

So the first thing I offend people about is that they’re not really part of the “big data revolution”. And the second thing is that, usually, their job is potentially up for grabs by an algorithm.

Categories: data science, modeling
  1. August 16, 2013 at 12:35 pm

    Good – but is this a necessary or sufficient condition? Are you arguing that any adaptive control is big data? For example, one of our clients used our platform for adaptive optimization of their website based on a few user segments, so a pretty simple partitioned multi-armed bandit problem – low cardinality in the choice variable and just a handful of targeting data. I don’t think I would call that big data. Still not clear what the actual classification rule would be – perhaps it is soft margin. Thoughts?

  2. JSE
    August 16, 2013 at 1:49 pm

    To be fair, Cathy, you were offending people daily long before you started working on big data.

    • Allen K.
      August 16, 2013 at 3:23 pm

      Beat me to it!

  3. August 16, 2013 at 1:57 pm

    Hi Cathy (long time — Gabe from HCSSiM).

    It seems like the difference here is between programming vs ‘the eye ball’, as it was once called at an old job. Put differently, between data science and business analytics.

    But what makes big data big? I go by the (also faulty) rule of whether or not it can fit in memory.

  4. August 16, 2013 at 3:45 pm

    And now, for Big Data purposes, Google has published its open source toolkit for understanding the meaning behind words: http://google-opensource.blogspot.com/2013/08/learning-meaning-behind-words.html

  5. August 16, 2013 at 4:50 pm

    This is actually a great post, except that I disagree with the “then those 150 people probably get fired, or even more likely are never hired in the first place” bit. This isn’t the algorithm or technology’s fault; this is a managerial decision. Most of those 150 could be retrained to work with the big data algorithms, moved to positions where they could still be very helpful to the company (with company knowledge and culture to boot) or at the very least be kept on in new related roles. Except that never happens, because of “productivity gains” and “efficiency”. In other words, someone wants more money. Interestingly, the decisions to make such firings are always made by people in roles that seemingly can never be disrupted by technology in the same way.

    This point is always conveniently left out when it comes to discussions of algorithms and machines “taking” human jobs away.

  6. me
    August 17, 2013 at 3:59 am

    big data is an overweight android from star trek, right?

  7. August 17, 2013 at 8:54 am

    “Big data” = Still more market and human manipulation:

    http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2309703

    I don’t find much to cheer about this.

  8. Zathras
    August 17, 2013 at 5:18 pm

    Cathy,

    If this is the true “big data revolution,” then it will never be complete. Google is a paradigm of “stylized facts” used to justify the big data revolution. Sure, it worked for them, not because they were able to put the big data paradigm into their business, but because the big data paradigm was their business.

    Your description of the big data paradigm sounds exactly like another field which aspires to cover all of or replace business analytics: operations research. The OR optimization scheme typically does not use big data (although sometimes it does), yet the OR partisans also claim that the algorithms replace useless people. Often, these plans go awry, since there are hidden variables not present in the OR models. This isn’t an issue for Google, since the data is both the means and the end, creating a closed system.

    I know you said “probably,” but just to give an example of a big data algorithm which does not replace people, an example in some cases is cross-selling. This is the big data problem of taking consumers of product A and determining which are the most likely to buy product B. In many cases, this selling still requires a warm body; algorithms don’t cold-call. Whereas before the salesperson would be picking customers more or less from their gut, now they have a targeted list. But there is still the salesperson.

  9. August 18, 2013 at 5:29 pm

    Zathras – The OR analogy is a great one. OR went through its “hype cycle” way back in the 1960s. Expert systems in the 1980s were another claim that people were not needed to make decisions. Both have value, in the right situations, but are not as general as the proponents claimed.

    Cathy, big data in your sense does not work widely. If you say that “no human judgment is needed,” this is approximately equivalent to “the relationships do not need to be supported by causal theory, just by raw correlation.” This works great in certain domains. But the underlying correlations have to be changing relatively slowly, compared to the amount of data that is available. With enough data for “this month,” an empirical relationship which holds for multiple months can be data mined and used to make decisions, without human judgment.

    But many of the world’s important problems don’t have that much stability. For example, trying to use searches to track the spread of an annual flu, at the state-by-state level, won’t be very reliable without human judgment. The correlation between search terms and flu incidence in 2012 is not likely to be the same in 2013. One reason is that news cycles vary from year to year, so in some years people are more frightened of the flu than in other years, and do more searches. Consider the following experiment: use the “big data relationships” from 2010 to track the incidence of flu in 2014. It won’t work very well, will it?
    On the other hand, if you could get accurate weekly data about flu incidence, the same methods might work much better. Using the correlations between search terms and flu in November might give reasonably accurate estimates in December.

  10. Kaleberg
    August 18, 2013 at 9:31 pm

    I get it: my thermostat does big data, but my digital camera does business analytics. Neat. I always wondered.

  11. August 19, 2013 at 4:25 am

    I think Big Data has evolved (mostly by marketing) in definition from the classic “what doesn’t fit into memory” to “what is different from classical BI”, and that means we are now dealing with a huge number of topics at once. A term with such abstraction is harmful because it disregards the massive brainpower that previously went into making the BI architectures that drive 80% of enterprise corporate workflows today. It means there is an intersection between the two that is the most challenging to articulate. It appears academia is now trying to disassemble the pieces into more fine-grained disciplines (as software engineering should) but this will take some time. Using Big Data in discussion is almost useless unless one can derive a specific context for the part of Big Data really being discussed, or unless you are in a sales and marketing situation where you are pitching to the status quo.

  12. September 2, 2013 at 2:41 pm

    I see where you are going here and I understand it. But the shift from “manual” to “automated” doesn’t happen overnight. It takes a lot of coaching for the managers to believe in the automated.

  13. Claire
    November 10, 2013 at 11:19 am

    While I agree Big Data is not a new concept, I disagree with Cathy that Big Data replaces humans. Computers have been replacing people’s jobs for decades. However, we should question what is considered a “people’s job” to begin with. We as humans are given great minds to THINK, not to perform boring tasks like collecting or organizing data. As business analysts, we should be “analyzing” the patterns and logic that surround us, not just collecting or cleansing them (hence the term “Data Scientist” vs. “Data Steward”). It’s very similar to the difference between an inventor and a librarian – an inventor creates things based on what they know, while a librarian merely keeps and organizes information. It’s not a librarian’s job to make something new. Computers, with Big Data or not, are just ways to help us think faster and create more things. Technology only demands that we perform our jobs better as humans, i.e. letting analysts actually do their jobs as humans instead of as machines.

  14. November 21, 2013 at 6:37 am

    Seems like a consequence of your definition is that the inputs, outputs, and goals of a system need to be defined concretely (and quantitatively) in advance.

    Curious what your thoughts are on whether the following constitute “doing big data”:

    – Human input at scale, e.g. Mechanical Turk + active learning
    – Infrastructure, e.g. scalable data integration, storage, knowledge management, retrieval, and computation
    – Theory, e.g. statistics/math/machine learning, which are agnostic to whether they are utilized by a human analyst or a production system

    FWIW, I think it’s sensible for someone to say they “do business analytics with big data”.

    • November 21, 2013 at 10:19 am

      I think the “in advance” thing is key. Obviously nothing starts out in advance, everyone starts out with basic data exploration. The question is whether they ever get beyond that.
