Shekhar's Science Blog

Thursday, May 21, 2020

Number of Confirmed Case per Test for India




Dr Himanshu Shekhar


Introduction: As part of a continuing effort to analyze the daily and cumulative number of confirmed COVID-19 cases in India, several earlier posts were made on this blog. In the last post (dated 21 May 2020, entitled “India crossing 1 Lakh leads to Fresh Prediction”), the prediction was reassessed by fitting another exponential curve, in which the pre-exponential factor was increased 9 times and the activation term was halved, compared to the equation used for the same prediction on 30 April 2020. However, Shri S. Jayaraman sir, one of the very senior scientists, pointed out: “Instead of plotting total number against time, plot the number of cases divided by the number of tests against time”. This post grew out of that reader's suggestion.

Data Collection: It is not very convincing that the number of tests has any bearing on the spread of COVID-19. With the passage of time, more and more COVID-19 cases are actually present in India, whether testing is done or not. The lower numbers reported initially may not be due to fewer tests, but to fewer infections. That more testing gives more confirmed cases is a commonly drawn conclusion, but as time progresses the number of cases is also rising through various modes of spread, and this fact cannot be ruled out. So, more than the number of tests, it is the time elapsed, the control measures, the relaxations granted, the activity of those who spread the infection, etc. that matter. However, the concern about correlating the number of tests with the number of confirmed cases has to be examined through analysis.
As before, I have been following worldometer, and the site addresses were given in my previous posts. It was also mentioned there that data is now taken from the Aarogya Setu app as well. The data on number of tests is taken from the news article published at http://indianexpress.com/article/explained/Coronovirus-numbers-explained-india-covid-19-testing-6418807/, written by Amitabh Sinha and updated on 20 May 2020 at 7:11:04 pm.

Data Analysis: The daily number of tests conducted and the daily number of confirmed cases from 09 May 2020 to 21 May 2020 are plotted against time. Both quantities rise overall with time, but the details are quite interesting.
  • On 11 May 2020, a drop is reported in the number of daily tests conducted, but surprisingly, the number of confirmed cases reported was higher.
  • In contrast, between 12 May 2020 and 16 May 2020, the number of tests was higher, but the number of confirmed cases did not show a matching rise.
  • On 18 May 2020, a sudden drop in the number of tests is not reflected in the number of confirmed cases.
This type of behaviour does not suggest a correlation. The Pearson coefficient for the two data sets is 0.3626. So, mathematically, these two sets of data are only weakly correlated.



However, the data of 11 May 2020 and 18 May 2020 can be treated as outliers. On both days a sudden reduction in the daily number of tests is noted, which is due either to incomplete data compilation or to actual variation. Giving them the benefit of the doubt, these two points are removed and the Pearson coefficient is recomputed for the remaining data, giving 0.88423. The correlation has improved, and a higher number of tests probably does correlate well with a higher number of daily confirmed cases.
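The two Pearson coefficients quoted above can be reproduced with a few lines of Python. The following is only a minimal sketch: the function name `pearson` and the sample numbers are assumptions made for illustration, and the lists below are made-up values, not the data behind this post.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)

# Made-up illustrative numbers, NOT the data used in this post.
daily_tests = [80000, 85000, 60000, 90000, 95000]
daily_cases = [3300, 3200, 4300, 3500, 3700]

# Correlation over all days, including the day with a sudden dip in tests.
r_all = pearson(daily_tests, daily_cases)

# Correlation after dropping the dip day (index 2 in this toy example).
keep = [i for i in range(len(daily_tests)) if i != 2]
r_trimmed = pearson([daily_tests[i] for i in keep],
                    [daily_cases[i] for i in keep])

print(round(r_all, 4), round(r_trimmed, 4))
```

In this toy example, as in the post, removing a day with an unusually low test count raises the coefficient noticeably. With only about a dozen points, the result is very sensitive to which days are kept.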


Both curves are shown on different scales, and no clear correlation between them is visible. The daily number of tests rises by about 1740 per day, whereas the daily number of confirmed cases rises by about 213 per day. The number of tests has definitely increased at a faster pace, and it depends on technological competence and on initiatives to establish testing labs. In contrast, the rise in confirmed cases, although derived from testing, depends on the potency of the virus and on the failure of human efforts to avoid infection. Additionally, the factors responsible for spreading it at a faster pace are working simultaneously. A straight line is fitted to each data set; extrapolated to 22.05.2020, the number of tests may rise to 106409 and the number of confirmed cases to 5704. These numbers are obtained when the two parameters are considered independent of each other.
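A straight-line fit and one-day extrapolation of this kind can be sketched as below. This is an illustrative ordinary least-squares fit, not necessarily the exact procedure used for the figures above. The names `fit_line` and `predict` and the sample values are assumptions, and the numbers are made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x):
    """Evaluate the fitted line at x."""
    return intercept + slope * x

# Made-up illustrative numbers, NOT the data used in this post.
days = [1, 2, 3, 4, 5]              # day number counted from a chosen zero day
tests = [85000, 86700, 88500, 90200, 92000]

slope, intercept = fit_line(days, tests)
next_day_tests = predict(slope, intercept, 6)   # extrapolate one day ahead
print(round(slope, 1), round(next_day_tests))
```

The slope is the average daily rise (the quantity quoted as about 1740 for tests and 213 for cases), and evaluating the line one day past the last data point gives the kind of next-day estimate quoted above.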

After removal of those two outliers, the ratio of the daily number of tests to the daily number of confirmed cases is plotted against time, with 08 May 2020 taken as day zero.



The vertical axis of the plot represents the number of tests needed to get one confirmed case, on a daily basis. A higher value in the initial days indicates that more tests had to be conducted to find a confirmed case at that time. As time progressed, the number of tests needed to find a confirmed case fell. This indicates that more infections are present as time progresses: although correlated with the number of tests, the number of confirmed cases per test is higher at later times. This clearly shows that no inference of the type “more tests resulted in more cases” or “the ratio of number of tests to number of confirmed cases is invariant” can be drawn.

A straight line is also fitted to this ratio, and it has a negative slope. If the number of tests needed to get a confirmed case is denoted by TCC, and D is the number of days counted from zero at 08 May 2020, then the following approximate relation holds.
TCC = 27.77 – 0.7 × D
This relation has a coefficient of determination R² of 0.77155 and a Pearson coefficient of magnitude 0.87838, so the fit is far from perfect, but it can be treated as a rough estimate of the current state of affairs.
In this relation, 22 May is day number 14, so TCC = 27.77 – 0.7 × 14. With the rounded slope shown this evaluates to 17.97; the value used here is 17.79, which presumably reflects the unrounded slope of the fit. If the same number of tests is conducted on 22 May 2020 as on 21 May 2020, then the number of confirmed cases on 22 May 2020 will be 5761. If this turns out to be true tomorrow, the relation developed is worth exploring further.

Conclusion: The daily number of confirmed cases is analyzed in light of the daily number of tests conducted, using values from 08 May 2020 to 21 May 2020. It is observed that, as time progressed, the daily number of tests needed to get a confirmed case became lower. This means that the infected population has increased with the passage of time. Had the number of tests needed to get a confirmed case been constant, it could have been concluded that the number of confirmed cases depends on the number of tests. In the present scenario, a relation is developed which predicts that, if the number of tests is kept constant, there will be 5761 cases tomorrow, i.e. on 22.05.2020. Incidentally, the post “India Crossing 1 Lakh leads to Fresh Prediction” gives the daily confirmed cases on 22 May 2020 as 5670. We have to wait and watch to see the usefulness of these mathematical exercises.

Dr Himanshu Shekhar

6 comments:

  1. Thanks for your constant concern for the pandemic.
    Frankly speaking, I am unable to understand it in great depth. But now my mind is waiting for two types of analysis. The first is what you are doing.
    The second is rate of recovery versus confirmed cases versus days.
    Today we achieved a recovery rate of 40%. It is a very satisfying scenario. So I hope that if we reach a recovery rate of up to 90%, we will have control over the pandemic.
    And I am optimistic regarding the rate of recovery. Like any powerful destructive cyclone, any pandemic also has a limited lifetime, and it will die its own death. In the case of this pandemic, death means very few confirmed cases per day.
    So I request you to make one separate graph for the rate of recovery and present that graph also. It will work like a silver lining among the dark clouds.
    Thank you.

    1. Thanks a lot. You are a constant source of inspiration and encouragement for me. I will definitely work on the recovery rate and try to look into active cases rather than confirmed cases. Thanks.

    2. Sir,
      I read your analysis regularly. Could we have an R0 vs t analysis?
      Thank you
      Santosh Kumar

    3. Thanks. Recovery is currently a fictitious term. One of my friends, Prakash Kumar, stated that earlier three tests were required to mark recovery. Now recovery is announced after only one test. So, the data on recovery is inconsistent.

  2. Thank you, readers and friends. I posted it at 1613 hrs and by 1850 hrs, viewership had reached 50. Regards.

  3. On 22.05.2020, India registered 6011 confirmed cases, as against 5761, predicted in this post. This situation is not very comfortable.
