Covid Letter to British Government March 2020

During early 2020 the Covid-19 outbreak was in full swing. Nobody knew how serious things were at the time, so most people simply followed government guidance, which later became government mandates. This was a terrible time for many, in particular for those who lost loved ones and were unable to say goodbye because of the restrictions imposed by authoritarian hospital administrators.

One of the things that bothered me while watching the news each night was the appearance of government ministers on television claiming they had no idea what the prevalence of the virus was in the country (or the world) at the time. In response, there was a major push by government ministers to reach a testing regime of something like 100,000 people per day. I considered this and immediately thought it did not sound like a sensible policy, yet it had become the main focus of the government’s response to the pandemic. In addition, it seemed to me to be a waste of government resources in terms of money, time and people, and therefore a totally inefficient response to a crisis.

As a result, I decided to write a letter to Number 10 Downing Street and the office of the Prime Minister of the United Kingdom. The letter was sent to his Chief Advisor, and also to other government departments, including the office of the Chief Scientific Advisor, Sir Patrick Vallance. Although I received confirmation through the system that the letters had been received, I never heard back from any of these people. More recently the UK held its Covid Inquiry and I submitted my letter to that system, but again received no feedback.

The main argument that I was making in my letter of March 2020 was that there was a simple way to determine the prevalence of Covid in a population by a mathematical method known as statistical sampling and for a basic estimate it would only require the testing of around 4,000 people. I have decided to publish my letter below in full.

===================================================================

To: Dominic Cummings

Chief Advisor to Prime Minister Boris Johnson

10 Downing Street, Westminster, London, SW1A 2AA

Date: Sunday 29th March 2020

SUBJECT URGENT: STATISTICAL THEORY SUGGESTS WIDE-SCALE TESTING OF THE UK POPULATION FOR COVID-19 IS NOT REQUIRED TO ACHIEVE HIGH CONFIDENCE SAMPLING IN RAPID TIME (HOURS TO DAYS).

Dear Sir,

I am writing with regard to the requirement for testing for Covid-19 within the UK population. Although I accept that eventually wide-scale testing of the population will be required, there is a much faster way to get critical status information with high confidence: limited statistical sampling methods, which require the testing of only thousands of people and give a result within hours or days, depending on the assumptions. I have seen no evidence this is currently occurring. Because the sample size is so small, a benefit of this method is that it can be repeated on a weekly basis and so give critical information about the rise and fall of the virus as it is present in the population at the time of testing. A caveat of this method is that it will not tell us who has previously had the virus. It is argued that adoption of this method will help to inform our response strategies in a vital way. In the text below I describe this proposal in more detail, for the scrutiny of your team.

Many governments are stating the importance of testing for Covid-19 infections in the human population, and the UK is no exception, with current testing at 10,000 per day and an aim to get to 25,000 per day. The emphasis on wide-scale testing right now by governments demonstrates a lack of appreciation for statistics and may also be an inappropriate use of time, resources and personnel when they are urgently needed, particularly since data on the presence of the virus in the community is severely lacking. Please read on to understand my argument as to why this is the case. I have tried to keep this letter as short as possible while still including sufficient information to permit an evaluation.

In observing the big emphasis on wide-scale testing, I fear that governments have been poorly advised on the science by not adopting well-known methods in mathematical statistical theory. Although the UK has a large population, of order 65.7 million people, it is actually only necessary to test a small sample group of that population in order to obtain a mean of that distribution, and then to resample (bootstrap) that sample, in order to get an accurate and high-confidence measure of the levels of Covid-19 in the UK population TODAY. The purpose of resampling the same data set is to gain an insight into the variability of the mean that has been estimated. This would then enable a repeat of that sampling on a weekly basis as a metric for how our nationwide responses are working. The estimate from a statistical sample of a population depends on a few crucial parameters:

Population Size: The required sample only depends on the total population size when the population is small, or when the sample exceeds a few percent of the total population being assessed. Neither is the case for the UK, so the sampling required is effectively independent of the population size. This follows from the randomness of the sampling and from the Central Limit Theorem, which states that the distribution of the sample mean tends towards a normal distribution (bell-shaped curve) as the sample size becomes large, no matter what the shape of the underlying population distribution is. The theorem does require that the individual observations are independent random variables, and in the case of Covid-19 in the UK there will be some dependence present in the sample. However, if care is taken to ensure that the chosen sample is sufficiently random, the degree of independence will be maximised, and I think the assumption of tending towards a normal distribution would be an appropriate approximation.

Sample Size: The sample size to be adopted only needs to be in the thousands and increasing this number will only serve to make the estimate more accurate. In other words with greater numbers in a sample we can be more confident in the estimate. The size of the sample that we then require will depend on only three parameters, known as the Confidence Interval, Confidence Level and the Population Proportion.

Confidence Interval: This is the error on the estimate and is also known as the margin of error. For example an estimate for the number of people in a sample with Covid-19 might be 10% and stated with a confidence interval of +/-5%, so the actual value could be in the range 5 – 15%.

Confidence Level: This is a measure of the certainty of the estimate, that is, how likely it is that the true value lies within the confidence interval, assuming a normal distribution. It is related to the number of standard deviations from the mean: a 95% confidence level corresponds to about 2 standard deviations (1.96) and a 99% confidence level to about 2.6 standard deviations, while 3 standard deviations corresponds to roughly 99.7%. To have confidence in the sampling of the UK population for Covid-19, it is recommended that 2 standard deviations be the minimum requirement on any measurement, although obviously a higher number of standard deviations is desirable for improved confidence. This means that for the example above of an estimate of 10% +/-5%, we can be 95% certain that the true value is within the defined range.

Population Proportion: The accuracy of the estimate depends on the percentage of the sample that gives a specific answer. If 99% are negative and 1% are positive for Covid-19, then the chances of error are small, even for a modest sample. However, if the numbers are much closer together, such as 55% negative and 45% positive, then the chances of error are significantly larger. It is standard practice to assume a worst-case population proportion of 50% when determining the sample size.
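For reference, the standard textbook relation between these three parameters and the required sample size can be written as below. This is a sketch under the usual assumptions (simple random sampling and the normal approximation). The symbols are notation introduced here for illustration: z is the number of standard deviations matching the Confidence Level, p is the Population Proportion, e is the Confidence Interval (margin of error) and N is the total population size. The second expression is the usual finite population correction.

```latex
n_0 = \frac{z^2 \, p \, (1 - p)}{e^2},
\qquad
n = \frac{n_0}{1 + \dfrac{n_0 - 1}{N}}
```

With p = 0.5, e = 0.05 and z = 1.96 (95%), this gives n_0 of about 384.2, or 385 people after rounding up. For N of order 65.7 million the correction changes the result by far less than one person, which is why the population size drops out of the calculation.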

Using the above, I have constructed some basic model calculations and I conclude that the amount of testing we need to do in order to get an accurate measure of the levels of Covid-19 in the UK population AT ANY TIME is in fact relatively low. My recommendation would be that we randomly sample groups of around 385 people and that we do this 10 times around the country, so that the total number of people sampled is of order 3,850 and the total measurement time would be around 9 hours; this assumes a Confidence Interval of 5%, a Confidence Level of 95% and a proportion of 50%. Alternatively, one may accept a lower Confidence Level of, say, 90%, which would then require individual sampling groups of 273 people, assuming the same Confidence Interval of 5%, but then repeat this exercise as a resampling of the data 100 times, which would require a total sampled group of order 27,300. It is up to government statisticians to derive the optimum sampling approach, balanced with other resource needs.
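To illustrate the resampling (bootstrap) idea mentioned earlier, the following is a minimal Python sketch, not a definitive implementation. It takes one random sample of 385 test results and resamples it with replacement to see how much the estimated prevalence varies. The function name bootstrap_prevalence, the 5% prevalence used to simulate the data and the choice of 1,000 resamples are all illustrative assumptions, not real figures.

```python
import random


def bootstrap_prevalence(results, n_resamples=1000, seed=0):
    """Estimate prevalence and a 95% percentile interval by bootstrapping.

    results: list of 0/1 test outcomes (1 = positive) from one random sample.
    """
    rng = random.Random(seed)
    n = len(results)
    # Resample the same data set with replacement and record each mean.
    means = sorted(
        sum(rng.choices(results, k=n)) / n for _ in range(n_resamples)
    )
    low = means[int(0.025 * n_resamples)]
    high = means[int(0.975 * n_resamples) - 1]
    return sum(results) / n, low, high


# Hypothetical data only: 385 simulated results with an assumed 5% prevalence.
data_rng = random.Random(42)
sample = [1 if data_rng.random() < 0.05 else 0 for _ in range(385)]

estimate, low, high = bootstrap_prevalence(sample)
print(f"estimate {estimate:.3f}, 95% interval {low:.3f} to {high:.3f}")
```

The spread between the lower and upper values gives a direct feel for the variability of the mean, without relying on the normal approximation.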

Example sampling options are illustrated below, with sampling times based on the government’s currently claimed testing capability of 10,000 per day, to derive a total testing time for a given number of people per group (ppg), Confidence Level (CL) and Confidence Interval (CI). For the calculations below we only show margins of error of 5% as the likely range, although it is possible that the error could be slightly higher (~10-15%) depending on the assumptions.

(i) CL=99%, CI=5%, 666 ppg, 10 groups, Total group = 6,660, Time = 16 hours.

(ii) CL=95%, CI=5%, 385 ppg, 10 groups, Total group = 3,850, Time = 9 hours.

(iii) CL=90%, CI=5%, 273 ppg, 10 groups, Total group = 2,730, Time = 7 hours.

(iv) CL=99%, CI=5%, 666 ppg, 100 groups, Total group = 66,600, Time = 7 days.

(v) CL=95%, CI=5%, 385 ppg, 100 groups, Total group = 38,500, Time = 4 days.

(vi) CL=90%, CI=5%, 273 ppg, 100 groups, Total group = 27,300, Time = 3 days.

It is also important to ensure that the group sampled is random and that care is taken to avoid selection bias, sampling errors or systematic uncertainties. That would be sufficient to build up an accurate estimate with high confidence. However, due to the differences in the rate of infections, we may have to build a separate sampling model for the big cities like London, Birmingham, Manchester, Glasgow and Newcastle, since the levels of infection are likely to be higher there due to the larger population density, and the sampling in these locations would not be representative of the majority of the country. So it is likely necessary to create two separate sampling groups on the population, which for those five big cities would be an additional 1,925 people (5*385), bringing the total testing to only 5,775 people; this is far below the current numbers of people being tested (and would be more if other big cities were sampled separately). This exercise could be repeated weekly throughout the epidemic to inform the success of response and mitigation strategies.

A major question to be addressed is what group to sample so as to avoid selection bias. At one extreme there is sampling door to door, but it could be argued that those at home are likely okay and do not have the virus because they have been in near-isolation for some time. At the other extreme there is sampling of all patients in a hospital, but it could be argued that this group has a higher probability of having the virus because they are around infected patients or the medical professionals involved in their treatment. Therefore a good compromise between these two extremes would be those out and about, walking in parks, or those at work in the community, who are neither confined to home nor confined to hospital but are still interacting with other people, even if they are practising social distancing. This may include police officers, who at this time have a moderate level of social interaction. There is also the question of what one is testing for in the sampling. This could be direct testing for Covid-19 using the current medical process, or measurements of proxy data such as elevated temperatures as an indication of the likely prevalence of the virus within the population. Such decisions are best made by medical professionals.

I would encourage an urgent review of the above analysis by UK government statisticians and the construction of confidence models based on statistical sampling methods which can give results rapidly. If we can quickly build up an accurate picture of the levels of Covid-19 in the current population as it is present now and with only limited testing, this is bound to have dramatic implications for our political, economic and medical response strategies and also an adjustment to our current way of life in the disruptive civil society we currently endure. This would also be important for getting business back to work and providing some level of certainty to the financial markets on the actual status.

In particular, because the amount of testing within the population required to form an accurate estimate using statistical sampling techniques is limited, this means that testing can also be freed up to focus on medical health professionals who are at the front line in fighting existing infections; and they should receive the priority in any other non-sampling testing so they can safely return to work and carry out their important duties of care. The testing can also be carried out on a weekly basis to gauge the rise/fall of the virus in response to our mitigation strategies. A caveat to the proposed method is that it does not take into account those people that have previously had the virus.

To be clear, I do think that testing is very important, but it is the wide-scale manner in which it is being applied and advocated at this time which concerns me. At a time of such critical urgency, it is important not to over-allocate resources to an approach which, whilst well-meaning, may be inappropriate for addressing the immediate problems. Indeed, it is argued that current wide-scale testing methods may in fact serve to exacerbate the problem beyond reasonable control by diluting our resource capacity and creating a large data management problem that may not be helpful in the present. Thank you for your attention.

Yours Sincerely,

Kelvin F Long BEng MSc FBIS CMInsP

Aerospace Engineer, Physicist
