Comments on COVID-19

March 30, 2020

I am, like you, restless and having trouble coping with the tragedy going on. It’s especially hard to think through the logical details of issues that only two weeks ago seemed urgently important. So instead, like you, I find myself with an internal dialogue of how the publicized statistics are consistently biased or wrong. At the risk of simply supporting your own internal thoughts, here are a few of mine:

  1. We still aren’t testing people, even in New York, which is the most tested population in the currently most highly infected country, according to the crap data we have.
  2. What that means to me is that we can ballpark how many actual cases we have if we know what the condition is for actually getting tested. In New York, it’s something close to “needs hospitalization.” Considering that only about the worst 10% of cases in countries that do widespread testing actually need hospitalization, that means we can multiply our confirmed case count by 10 to get an estimated total case count.
  3. That means that, instead of 60K cases in New York state, which is what this webpage says this morning, we can assume it’s actually more like 600K.
  4. Similarly as a nation, we should multiply the confirmed case count of 143K by ten to get an estimated 1.43 million cases in the US.
  5. Is that an overestimate? Perhaps. It’s possible that enough testing is happening in those car wash type setups, where people are at least capable of driving a car, to make it pessimistic.
  6. On the other hand, we’ve seen plenty of examples in the NYC area of people calling their doctor with intensely bad symptoms who are told not to overburden the hospital system and to take care of themselves at home.
  7. Also, it’s worth pointing out that multiplying by 10 assumes that more than half, and perhaps up to 75%, of all actual cases are entirely asymptomatic. This is something we’ve been seeing in places that have done randomized or comprehensive testing.
  8. All the above are ballpark reckoning, but honestly I trust my numbers more than any official ones.
  9. Especially because we’ve been hearing stories told in Spain and Italy that their death counts are not including horrible fucking things that have been happening in nursing homes. That means those terrible numbers are heavily underestimating actual deaths.
  10. Also, we should not trust China’s death count numbers, which some say are underestimating actual death counts by a factor around 15.
  11. And if we don’t trust their death counts, we should also not rely on their confirmed case count, which has been tiny recently.
  12. Why this matters a lot to us right now is that China closed Wuhan on January 23rd, which means they are/were under quarantine stricter than ours for more than two months, and we’d REALLY like to know what the actual situation is right now, but we don’t.
  13. Long story short, being a skeptical data scientist means not trusting the data whatsoever. The best we can do is use the data and our real world knowledge to ballpark what might actually be happening. We will never know the true numbers.
  14. One exception might be the Netherlands, which I’m keeping my eyes on. I don’t think they lie as much as most other countries.
  15. I could be wrong about that too.
  16. I hope tomorrow’s post will be more optimistic.
  17. One last comment: deaths reported on Sundays are lower than on other days because of the way reporting happens. Doctors and others are taking a well-deserved rest. So don’t get excited about flattening curves based on Sunday data:

[Figure: daily deaths, Sunday effect]
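The ballpark reckoning in points 2 through 4 can be written out as a few lines of Python. This is only a sketch of the arithmetic: the function name `estimate_total_cases` and the parameter `tested_fraction` are invented for illustration, and the 10% default is the post's rough assumption that only cases needing hospitalization get tested.

```python
def estimate_total_cases(confirmed_cases, tested_fraction=0.10):
    """Scale a confirmed count up to a ballpark total.

    tested_fraction is the ASSUMED share of all infections that are
    severe enough to get tested (the post uses roughly 10%).
    """
    if not 0 < tested_fraction <= 1:
        raise ValueError("tested_fraction must be in (0, 1]")
    return confirmed_cases / tested_fraction


# Confirmed counts quoted in the post for the morning of March 30, 2020.
new_york_state = estimate_total_cases(60_000)
united_states = estimate_total_cases(143_000)

print(f"New York state: about {new_york_state:,.0f} cases")
print(f"United States:  about {united_states:,.0f} cases")
```

Changing `tested_fraction` shows how sensitive the estimate is: if drive-through testing means 20% of infections get tested rather than 10%, the multiplier drops from 10 to 5.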

  1. March 30, 2020 at 11:17 am

    Posted on my Facebook. Thanks. Hope you and yours are doing as well as can be hoped. You are always a welcome voice.



  2. Tom
    March 30, 2020 at 11:20 am

    So share your thoughts on the “official” statistics. I’m a 69 year old from Amsterdam NL. Hate to say it but I’m sceptical about the accuracy of our data here. The death percentage is suspect because basically too little testing is done here. The WHO urges us all to test. No management can be done without it. Stating the obvious. Hoping that in the new order after this, politicians are disqualified from making statements in these situations. Their only task and responsibility is to provide funds for health care first. Big business can look after itself. Phew, had to vent… Stay vigilant everybody


  3. TOJ
    March 30, 2020 at 11:34 am

    The one thing that gives me pause is that even though we are only testing those who are at high risk, with symptoms, and potentially needing hospitalization, the majority of test results are still negative.

    It’s clear that testing is heavily insufficient and there’s lots of positive cases not being spread, but I do wonder if the 10x figure is accurate, particularly as testing does continue to expand.


  4. mojomogoz
    March 30, 2020 at 11:39 am

    I’m not 100% sure I’m correct, but I think the miscount is more in France, where old folks’ homes are being ignored. In Spain they are closed off but I think generally counted, and in Italy I believe everyone who has C19 at time of death is counted (whether believed to be the actual cause or not).


  5. Virginia Dignum
    March 30, 2020 at 11:47 am

    Hi. Great post. Unfortunately you are wrong about 14 and right about 15. In the Netherlands only those ill enough to have severe symptoms are tested. There are many stories of local GPs who see 1 or 2 deaths of elderly people per day where they usually see 1 per week. These are not tested and thus not part of the data.


  6. constantinecostes
    March 30, 2020 at 12:01 pm

    Currently, the best COVID-19 testing data appears to be coming from Iceland, which is doing simple random sampling of the entire population. South Korea’s testing is also worth a look.




  7. Mark Mullin
    March 30, 2020 at 12:48 pm

    You’re not alone, this is a nice writeup on people arguing we’re doing bad math on faulty first principles. https://swprs.org/a-swiss-doctor-on-covid-19/


  8. March 30, 2020 at 12:49 pm

    It has been quite a long time since I last came here, and all of us math babes of the world are now gathered around the same terrible topic. So I discovered your blog post about your memory of your father. As a father myself, may I say: this is moving. I would be happy if my daughters are strong enough to tell something like that on the day it needs to be told. Now you can delete this off-topic comment!


  9. mathematrucker
    March 30, 2020 at 1:25 pm

    Near-ditto Frédéric Lefebvre-Naré’s comment (the last sentence of which I wouldn’t have written): I still read Cathy’s tweets religiously but after being an off-and-on regular here for several years, haven’t visited the blog much since November 2016. I’m not a father nor had one remotely like hers, but it doesn’t matter…I too found that post extraordinary.


  10. andeux
    March 30, 2020 at 1:37 pm

    A few scattered thoughts of my own:
    1. As another commenter already pointed out, given the positive test rate of 10-15%, it can’t be true that only those with the most severe respiratory symptoms are getting tested (unless, I suppose, a significant proportion of the population has severe respiratory symptoms for some other reason). So while you may well be right that we are underestimating current cases by 10x, that particular way of coming to that conclusion seems weak.
    2. In principle death counts should be more useful than infection counts, since they don’t rely on comprehensive testing, except
    a) Deaths lag behind infections by a week or two, which is less useful
    b) As you point out, some countries are probably lying about their death counts
    c) Even when they are not lying, they may be underestimating. There was something making the rounds last week about one Italian town, saying the number of excess deaths compared to a baseline was 3x the number officially ascribed to covid.
    3. Side note: Everyone is dunking on Marco Rubio for saying China was greatly underestimating their death count. His motives may be bad (and dunking on Marco Rubio is generally fun) but he may well be right this time.
    4. The Icelandic study is interesting, but note the update at the end of the first link saying the study had been retracted, possibly due to a high false-positive rate in the test itself.
    5. Hospitals would see the same number of cases if (e.g.) 5% of the population were infected but 90% of those were asymptomatic vs. 1% were infected but only half asymptomatic. But from the point of view of trying to predict how things would develop, those situations would likely be very different (and it isn’t even clear which would be better).
    6. Here in California cases have grown much more slowly than in NYC. Restrictions on socializing did start a little earlier here, but the divergence seems to have begun even before those measures should have been effective, so there may be other factors we just don’t understand at all.
    7. It seems clear from all of the above that there are too many parameters that we amateurs just don’t know to be able to make accurate predictions. This may be true even for the experts.
    8. Despite the preceding point, the most optimistic predictions that the epidemic would peter out on its own without even needing social distancing (from “I am not an empiricist” econ-law prof Richard Epstein this week, or the silicon valley tech bro last week) do not seem realistic based on what happened in Italy, or what is already happening in NYC.
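The two scenarios in point 5 above can be checked with a line of arithmetic each. The sketch below uses only the percentages given in the comment; the function name `symptomatic_share` is made up for illustration.

```python
def symptomatic_share(infected_fraction, asymptomatic_fraction):
    """Share of the whole population that is infected AND symptomatic."""
    return infected_fraction * (1 - asymptomatic_fraction)


# Scenario A: 5% of the population infected, 90% of those asymptomatic.
scenario_a = symptomatic_share(0.05, 0.90)
# Scenario B: 1% of the population infected, half of those asymptomatic.
scenario_b = symptomatic_share(0.01, 0.50)

print(f"Scenario A symptomatic share: {scenario_a:.2%}")
print(f"Scenario B symptomatic share: {scenario_b:.2%}")
print(f"Total infected, A vs B: {0.05 / 0.01:.0f}x")
```

Both scenarios put the same share of the population in front of doctors, yet scenario A has five times as many people infected, which is why symptomatic counts alone cannot tell the two apart.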


  11. March 30, 2020 at 1:38 pm

    Cathy, have you seen the writings by Tomas Pueyo?
    View at Medium.com
    View at Medium.com

    He goes into a lot of detail about how to estimate the number of actual cases from confirmed case count, taking into account latency lag, the huge danger of delaying effective responses, and prognoses for various policy options.


  12. March 30, 2020 at 1:46 pm

    Reblogged this on sonofbluerobot.


  13. March 30, 2020 at 2:26 pm

    Then there is Russia, which we never seem to hear about. Curious, I checked last week and discovered that some doctors in Russia were making anonymous reports, out of fear of Putin, because Putin started out sounding like Trump: he had it all under control, nothing to worry about. No change needed. Bla, bla, bla …

    However, those anonymous Russian doctors were reporting that deaths due to pneumonia had gone up more than 30 percent above what was expected based on norms.

    It was alleged that Putin was hiding the actual number of COVID-19 deaths by forcing medical authorities to list them as pneumonia deaths instead.


  14. rob
    March 30, 2020 at 2:37 pm

    The data I trust most are the Icelandic and Korean — they tested widely, not just of the symptomatic, but also likely contacts — and also the Diamond Princess cruise, which is no doubt age-skewed but has the virtue of being complete, so of some special interest. I encourage everyone to read Mark Mullin’s link above and Constantine’s above as well.

    I assume that most or all of us will eventually have contact with the virus. The Korean mortality rate remains under 1% and of that mostly >70yrs, but this won’t hold of NYC since we are not as well prepared. An overwhelmed system also means higher mortality from other conditions that, absent COVID, would not have been fatal.

    So it’s most important to avoid burdening the health care system — don’t try to get tested, avoid contact with others either to give or get any pathogen. I say this as someone in the fifth unambiguously symptomatic day of a bout of what is almost surely COVID. I have no need to be tested. It would be useful for data collection, but that luxury will have to wait in this imperfectly organized society.

    On the encouraging side, I’m pushing 66 and asthmatic, yet the symptoms were relatively mild. I was even able to continue working remotely. Same for a friend only two years younger, who was riding his bike throughout his bout (though always avoiding human contact). So I see no need to panic for oneself, but do be careful for the sake of the health care system and the vulnerable.


  15. Jan
    March 30, 2020 at 2:57 pm

    Out of the European countries, I think only Germany and Iceland are doing more widespread testing, but in Germany I think it goes with contact tracing. So not really a randomized population sample.
    The Netherlands and UK are certainly testing very little (only severe cases), and Italy does not quite have the capacity for testing beyond suspicious cases. Don’t know about the rest.

    I’ve seen an interesting comparison between Germany and Italy – the proportions of cases by 10-year age brackets. In Germany more even, in Italy clearly peaks between 60 and 70.


  16. Mark Schaeffer
    March 30, 2020 at 4:37 pm

    An important caveat about Tomas Pueyo’s analysis I linked above: he takes the official Wuhan statistics at face value. But according to this report “Estimates Show Wuhan Death Toll Far Higher Than Official Figure” https://www.rfa.org/english/news/china/wuhan-deaths-03272020182846.html on Radio Free Asia, linked by the estimable David Dayen, multiple crematoria in Wuhan have been working around the clock. China is a tight dictatorship which is known to have covered up the early stages of the outbreak. The RFA report estimates some 44,000+/-2000 deaths rather than ~2500 as of 3/27. But Pueyo uses more trustworthy data from several other countries with low mortality after strict controls went into effect, for example S Korea, and his analysis looks at multiple variables and countries around the world.


  17. Mark Schaeffer
    March 30, 2020 at 4:41 pm
  18. anonymous
  19. March 31, 2020 at 10:05 am

    The models could be improved by uncertainty analyses that add hyperparameters for hyper-distributions of under-reporting and under-testing fractions. Even though there are no data to train the parameters, it would at least incorporate uncertainty stemming from these unknowns. Because they would necessarily be between zero and one, they would inflate the projected cases and deaths for each random draw in generating uncertainty intervals.
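One way to picture the suggestion above is a small Monte Carlo sketch in Python. Everything specific here is an assumption made for illustration: the Beta prior on the reporting fraction, its parameters (2 and 18, giving a mean of 10%), and the helper names `simulate_true_cases` and `interval`. It is not a fitted model, only a demonstration of how an unknown fraction between zero and one inflates the projection and widens the interval.

```python
import random


def simulate_true_cases(confirmed_cases, alpha, beta, draws=10_000, seed=0):
    """Draw a reporting fraction from Beta(alpha, beta) and scale up.

    The Beta prior is a hypothetical stand-in for the unknown
    under-reporting fraction; its parameters are not estimated from data.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        reporting_fraction = rng.betavariate(alpha, beta)
        # Avoid dividing by a draw that is numerically zero.
        reporting_fraction = max(reporting_fraction, 1e-6)
        samples.append(confirmed_cases / reporting_fraction)
    samples.sort()
    return samples


def interval(sorted_samples, lower=0.05, upper=0.95):
    """Return the empirical (lower, upper) quantiles of sorted samples."""
    n = len(sorted_samples)
    low_index = min(n - 1, int(lower * n))
    high_index = min(n - 1, int(upper * n))
    return sorted_samples[low_index], sorted_samples[high_index]


# 143K is the US confirmed count quoted in the post.
samples = simulate_true_cases(143_000, alpha=2, beta=18)
low, high = interval(samples)
print(f"90% interval for true cases: {low:,.0f} to {high:,.0f}")
```

Because every draw of the reporting fraction is at most one, every simulated total is at least the confirmed count, which is the inflation the comment describes.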


  20. March 31, 2020 at 10:20 am

    Thanks for facilitating this helpful, albeit scary, conversation, Cathy. It feels strange, in the face of the daily life-and-death reality, to keep working remotely and still be hiring math tutors from CUNY to work this coming fall in NYC middle and high schools, but life, for those of us fortunate enough to be healthy (so far), goes on, as does math teaching and tutoring. All of our tutors are learning how to tutor remotely, btw.


  21. Quentin
    March 31, 2020 at 12:51 pm

    I wish more of the journalists reporting numbers would put them in context by focusing on per capita infection rates rather than just raw infection counts. The US is a bigger country than all of the European countries with high infection rates, yet its infection rate is still quite a bit lower. To me that suggests we may have a way to go before hitting an inflection point, although the counts over the last day or two suggest more of a linear rather than exponential trend. It’s possible it’s still exponential and we’ve managed to change the exponent, which might be the best that can be hoped for.


  22. March 31, 2020 at 5:41 pm

    Another challenge – testing sensitivities (false positives and false negatives)


    One of the more interesting sentences in the letter from the Captain of the U.S.S. Theodore Roosevelt: “The COVID-19 test cannot prove a Sailor does not have the virus, it can only prove that a Sailor does.” The Captain then goes on to state that “approximately 21% of the Sailors that tested negative and are currently moving — ashore, are currently infected, will develop symptoms over the next several days, and will proceed to infect the remainder of their shore-based restricted group.”
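The quoted 21% reads as the share of negative-testing sailors who are nonetheless infected. That quantity depends on how common infection is in the group as well as on the test itself, which Bayes' rule makes explicit. The sketch below is illustrative only: the function name and all three input values are hypothetical and are not taken from the Captain's letter.

```python
def prob_infected_given_negative(prevalence, sensitivity, specificity):
    """P(infected | negative test) via Bayes' rule.

    prevalence:  share of the tested group that is truly infected
    sensitivity: P(test positive | infected)
    specificity: P(test negative | not infected)
    """
    false_negatives = prevalence * (1 - sensitivity)
    true_negatives = (1 - prevalence) * specificity
    return false_negatives / (false_negatives + true_negatives)


# Hypothetical inputs chosen only to show the mechanics.
example = prob_infected_given_negative(
    prevalence=0.30, sensitivity=0.70, specificity=0.99
)
print(f"Share of negatives who are infected: {example:.1%}")
```

Holding the test fixed, raising the prevalence raises this share, so the same swab test is less reassuring on a crowded ship than in a lightly affected town.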



    • March 31, 2020 at 6:12 pm

      Gosh, that’s a ridiculous false negative rate.


    • March 31, 2020 at 6:15 pm

      I read a few days ago that there is now a COVID-19 antibody test out to determine how many people had it and are now immune.

      “Researchers at the Mount Sinai Health System say they’ve developed a test that can find out if you already have had or were infected with the new coronavirus.

      “The test is called ‘serological enzyme-linked immunosorbent assay,’ or ELISA for short. It checks whether or not you have antibodies in your blood to SARS-CoV-2, the scientific name of the new coronavirus that causes COVID-19. …



  23. April 1, 2020 at 10:24 am

    By the way: if some countries only display partial figures, this is not a deep concern as far as modeling / “forecasting” in another country is concerned. In order to model, you just need consistent parameters. If they leave out one part of the population (say, the countryside), but observed cases, severe cases, deaths… are measured consistently on the same population (say, urban), it’s usable. So far, data from all countries with seemingly consistent figures (say, all but Iran) point to consistent figures about propagation (R0 ~2.5 to 3.8) and about lethality (~1.4% of symptomatic cases with a Chinese demographic structure, so ~~0.7% of all infected, if hospitals are overwhelmed (= they can’t ventilate old people); that implies ~1.1% of all infected with a European demographic structure). The question mark remains about the % of a whole (national or regional) population that is infected; but as the “natural peak” was reached nowhere, this % depends actually only on national and local containment measures. Statistical analysis cannot easily discriminate between more or less effectiveness of measures within a given population, and more or less underestimation of cases. Keep in mind that just a 1-day delay in taking measures implies ~10 to ~40% more (or fewer) infection cases *altogether*.
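The closing claim about a one-day delay can be tied to doubling times with a short calculation. The sketch assumes plain exponential growth at a constant daily rate, which is a simplification introduced here, and the function names are made up for illustration.

```python
import math


def extra_cases_from_delay(daily_growth, delay_days=1):
    """Fractional increase in cases after waiting delay_days.

    Assumes constant exponential growth at daily_growth per day.
    """
    return (1 + daily_growth) ** delay_days - 1


def doubling_time_days(daily_growth):
    """Days needed for cases to double at a constant daily growth rate."""
    return math.log(2) / math.log(1 + daily_growth)


for growth in (0.10, 0.40):
    print(
        f"daily growth {growth:.0%}: "
        f"one-day delay adds {extra_cases_from_delay(growth):.0%}, "
        f"doubling time {doubling_time_days(growth):.1f} days"
    )
```

Under that assumption, a 10% to 40% cost per day of delay corresponds to cases doubling roughly every seven days at the low end and every two days at the high end.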


  24. dave jones
    April 2, 2020 at 10:44 am

    You should also look at Iceland – small population but high level of testing with lots of health background data on their population. And Zingales at http://www.promarket.org on the natural experiment between Vo and Diamond Princess.


  25. Greg Taylor
    April 2, 2020 at 12:14 pm

    Usefully testing more people requires more accurate tests than we appear to have. The false negative rates on the swab tests appear to be unacceptably high (north of 10%) for informative mass testing. False positive rates may be an issue as well.

    While popularized by the JHU dashboard, cases and deaths aren’t particularly useful metrics. Tracking local healthcare capacity – availability of hospital beds, ICU beds, ventilators, drug treatments, etc. – might help ensure resources get allocated in a way that minimizes damage. U Washington’s Institute for Health Metrics and Evaluation has a good start on such a site, although its predictions may suffer from bad input data as described in this post. You can view some available resources at the state level (click on the green bar to change US to any state) but the analyses/predictions are unfortunately not localized.



  26. Tomonthebeach
    April 2, 2020 at 11:18 pm

    Is this how Trump Makes America Great Again? 6 million unemployed on record, but probably another 10 million cannot even get a form to apply for help. Then a report today said it might be August before the unemployed see a dime. That equates to 10s of millions of starving, evicted Americans. Apply this same meltdown to our public health trainwreck.

    Because we were not testing at least a random sample of several hundred thousand [could not do that because we were too good to use the WHO test and needed to develop our own invalid one], and retesting every week, it is statistically impossible to project the infection rate or the mortality rate of COVID-19. Anybody who tells you otherwise is 6 graduate-level statistics courses behind me (10 if you count Trump and Larry Kudzu). We can guess a little by those in hospital in contrast with those who leave on their own and those who leave in a hearse. However, many people cannot even get into a hospital, so it’s a guess.

    The 2011 movie Contagion (Matt Damon) literally describes the current pandemic as its story unfolds starting in China (ironically). Had Trump seen the movie, he still would likely have cut CDC funds and fired his Pandemic Office at NSC. He’s that stupid. The sad thing is that Trump caused this trainwreck. Had we been prepared and followed our own protocols, we probably would have avoided catastrophic economic meltdowns because we would have had a handle on where the problems were and how to keep them at bay from the uninfected.

    The Trump-Gumpyist thing we have witnessed so far is watching 10s of thousands of Americans cram into jets to return from infected zones, then cram into large halls shoulder-to-shoulder for 6 hours at major international airports (LAX, JFK, ORD), and then be sent home without screening. Shockeroo – LA, NYC, and Chgo are experiencing huge outbreaks. Where was our Surgeon General? Not on TV telling us how to stay safe since Jan 1st, when he and most of us in the health professions were well aware of the viral tsunami headed in all directions.


  27. JR
    April 3, 2020 at 4:24 pm

    The Dutch do seem to do some randomized testing now. See the end of their official figures document. They split the results per lab, and claim that about 25-30% of tests are positive in their random sampling.

    Click to access Epidemiologische%20situatie%20COVID-19%20in%20Nederland%203%20april%202020.pdf

