httprover's 2nd blog: May 2020

Sunday, May 24, 2020

Alternative Moscow Covid-19 Data Fit

I may have mistakenly identified the Moscow Infected data column as new cases. Taken at face value the compartments appear to be related as follows.

Where S is the number of Susceptibles, I the number of Infected, D those who have Died and R the Recovered. The peak value of I appears to be a little farther off, so the value of I_max was included in the search to minimize the rms error.

The zero point in the plot above is somewhat uncertain due to fluctuations in the data.

Either way it looks like the people of Moscow will soon be switching to the downward sloping road to recovery.

Supplemental (May 24): There are different ways of reporting the spread of Covid-19: the total number of cases to date, the new cases for the day and the number of currently active cases, for example. This fit assumes Infected refers to the last of these. One has to be on guard against making a mistake of this kind. The data can be corrupted if not everyone follows the same rules. The way one defines recovered can also affect the counts. "Contrarians look" can be a source of problems.

Supplemental (May 24): It seems reasonable to refer to the curved line of the data fit as a trackline.

Supplemental (May 26): The curve could also be described as a tracing.

Saturday, May 23, 2020

More Recent Moscow Covid-19 Data Fit

I downloaded more recent Moscow Covid-19 data from the following website. There's a drop-down menu in the plot of the data that will permit access to a table containing the data.

https://коронавирус-сегодня.рф/россия/москва/

The number of cases didn't behave as expected. The plot of d(lnI)/dt no longer appears to cross the horizontal axis and the peak appears to be a little farther off.

The data was more difficult to fit due to fluctuations.

The data consists of the blue points partially hidden by the red fitted curve. The data deviates slightly from the SIR model at the end.

The initial point in time, t=0, corresponds to Mar. 15, 2020.

Supplemental (May 23): The error in the peak was due to the wrong formula for I₀. That's been corrected and the fit images above have been replaced. I also changed the I data to new cases minus recovered minus deaths. It would be best if this fit was checked more thoroughly.

Friday, May 22, 2020

A Fit of Moscow Covid-19 Data

In the last blog it was shown that knowing I_max, the peak value for the number of Infected individuals, reduces the number of unknown parameters that have to be estimated for a curve fit. In the case of a Covid-19 fit the peak value is not always immediately known. At the peak we know that dI/dt=(rS-a)I=0, which implies that rS-a=0, but we don't know what the values of S are and only have values of I to work with. However, since d(lnI)/dt=(1/I)dI/dt=rS-a, we can evaluate d(lnI)/dt from the I data alone and look for a zero of this function.
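As a rough illustration of looking for a zero of d(lnI)/dt, here is a minimal Python sketch. The function names and the synthetic bell-shaped series are assumptions made for the example; they are not the spreadsheet or the Moscow data used for the plots. It estimates d(lnI)/dt with central differences and linearly interpolates the first crossing from positive to non-positive values.

```python
import math

def dlog_dt(i_values, dt=1.0):
    """Central-difference estimate of d(ln I)/dt at the interior points."""
    logs = [math.log(v) for v in i_values]
    return [(logs[k + 1] - logs[k - 1]) / (2.0 * dt)
            for k in range(1, len(logs) - 1)]

def zero_crossing(times, values):
    """Time of the first change from positive to non-positive, or None."""
    for k in range(len(values) - 1):
        if values[k] > 0 >= values[k + 1]:
            frac = values[k] / (values[k] - values[k + 1])
            return times[k] + frac * (times[k + 1] - times[k])
    return None

# Synthetic example only: a bell-shaped I(t) that peaks at t = 10 days.
days = list(range(21))
infected = [1000.0 * math.exp(-((t - 10) ** 2) / 20.0) for t in days]

slopes = dlog_dt(infected)            # defined for days 1 .. 19
t_peak = zero_crossing(days[1:-1], slopes)
print("estimated peak day:", t_peak)
```

With real daily counts the differences are noisy, so some smoothing before taking the logarithm would probably be needed, which is consistent with the uncertainty in the zero point noted in the later posts.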

In the above plot of Moscow I data we see that this function is about to cross the zero horizontal axis, if it hasn't done so already. To get an estimated set of parameters for the fit, the value of I_max was arbitrarily assumed to be 11000. Then a search of the remaining parameters was done to minimize the value of the rms error for the fit.

New data will allow us to improve on this fit and get a better estimate of the maximum value of I.

Friday, May 15, 2020

Reducing the Set of Parameters for the Fit

One needs a set of 5 parameters such as N, I₀, S₀, r, and a to do the influenza curve fit. Since N=763 is given and I₀ can be taken to be roughly the value for the 1st data point we are left with 3 unknowns for a rough fit. If the data includes the peak of the Infected values we can use it and a formula for I_max in terms of the other unknowns to get an estimate for S₀. We are then left with just 2 unknowns needed to make a rough estimate of the parameters which we will take to be r and ρ=a/r.

Given an estimate for I_max one can compute a table of values for S₀(ρ), then fit a quadratic curve to the data from the table and use the quadratic to interpolate the data for an assumed value of ρ to estimate S₀.
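The table-and-quadratic step might look something like the following Python sketch. It assumes the peak formula is the standard SIR result I_max = I₀ + S₀ - ρ + ρ ln(ρ/S₀), obtained by setting S=ρ in the integrated expression for I(S); the formula actually used in the spreadsheet is not shown in the post, so treat this as an assumption. The function names, the bisection solver, the ρ grid and the numbers (S₀=762, ρ=202) are likewise illustrative choices, not fitted values.

```python
import math
import numpy as np

def i_max(s0, i0, rho):
    """Peak of I for the SIR model, reached when S = rho (assumed formula)."""
    return i0 + s0 - rho + rho * math.log(rho / s0)

def solve_s0(i_peak, i0, rho, tol=1e-9):
    """Bisection for the S0 > rho whose peak value equals i_peak."""
    lo, hi = rho, rho + 1.0
    while i_max(hi, i0, rho) < i_peak:   # widen the bracket until it holds a root
        hi = rho + 2.0 * (hi - rho)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if i_max(mid, i0, rho) < i_peak:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative numbers only: a peak generated from S0 = 762, rho = 202.
I0 = 1.0
I_PEAK = i_max(762.0, I0, 202.0)

# Table of S0(rho), then a quadratic fit used to interpolate.
rho_grid = np.linspace(180.0, 220.0, 9)
s0_table = np.array([solve_s0(I_PEAK, I0, rho) for rho in rho_grid])
coeffs = np.polyfit(rho_grid, s0_table, 2)
s0_estimate = float(np.polyval(coeffs, 202.0))
print("interpolated S0 at rho = 202:", s0_estimate)
```

The quadratic only needs to be good over the range of ρ being searched, so a narrow grid around the assumed value of ρ is probably enough.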

Once a rough estimate is made one can adjust the set of 4 parameters to give a minimum value for the rms error.

The value of "δI rms err" is the rms error for the difference between I(fit) and I(formula). I added a switch to the spreadsheet to allow a comparison with a 1st order numerical integration.

The 2nd order calculation is definitely better.

Wednesday, May 13, 2020

A Sample Influenza Curve Fit

A curve fit can be somewhat challenging if you don't have a formula to actually fit to your data, but it's not impossible if you have a set of differential equations that model the process. I've worked out a procedure that gives fairly good results for the SIR epidemic model and applied it to the 1978 English boys' school influenza epidemic found in Murray, Mathematical Biology. The set of equations is shown in the following image along with an integrated expression for the number of infected, I, as a function of the unknown number of susceptibles, S. The 3rd compartment for the SIR model is the number of removed individuals, R, but we will not need that.

Given assumed initial values for I and S one can compute successive values using a simple numerical integration procedure. A first order calculation appears to be sufficient if the step size is small enough. I tried a second order term to help get past the peak but it didn't make much difference in the results.
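For readers who want to try the integration themselves, here is a minimal Python sketch of a first order step and a step with the second order Taylor term added. It assumes the usual SIR equations dS/dt=-rSI and dI/dt=(rS-a)I and the standard integrated form I(S)=I₀+S₀-S+ρ ln(S/S₀) with ρ=a/r; whether these match the image exactly is an assumption. The parameter values, step size and function names are illustrative, not the fitted values from the spreadsheet.

```python
import math

def simulate_sir(s0, i0, r, a, dt, steps, second_order=True):
    """Step dS/dt = -r*S*I and dI/dt = (r*S - a)*I forward in time."""
    s, i = s0, i0
    s_vals, i_vals = [s], [i]
    for _ in range(steps):
        ds = -r * s * i                  # first derivatives
        di = (r * s - a) * i
        s_new = s + ds * dt
        i_new = i + di * dt
        if second_order:
            d2s = -r * (ds * i + s * di)          # second derivatives
            d2i = r * ds * i + (r * s - a) * di
            s_new += 0.5 * d2s * dt * dt
            i_new += 0.5 * d2i * dt * dt
        s, i = s_new, i_new
        s_vals.append(s)
        i_vals.append(i)
    return s_vals, i_vals

def i_of_s(s, s0, i0, rho):
    """Integrated expression for I as a function of S (assumed form)."""
    return i0 + s0 - s + rho * math.log(s / s0)

def delta_i_rms(s_vals, i_vals, s0, i0, rho):
    """rms difference between the integrated I and the formula I(S)."""
    sq = [(i - i_of_s(s, s0, i0, rho)) ** 2 for s, i in zip(s_vals, i_vals)]
    return math.sqrt(sum(sq) / len(sq))

# Illustrative values only, not the fitted parameters.
S0, I0, R, A = 762.0, 1.0, 2.18e-3, 0.44
RHO = A / R
DT, STEPS = 0.025, 600                   # 600 steps of 0.025 day = 15 days

s1, i1 = simulate_sir(S0, I0, R, A, DT, STEPS, second_order=False)
s2, i2 = simulate_sir(S0, I0, R, A, DT, STEPS, second_order=True)
rms1 = delta_i_rms(s1, i1, S0, I0, RHO)
rms2 = delta_i_rms(s2, i2, S0, I0, RHO)
print("delta-I rms, 1st order:", rms1)
print("delta-I rms, 2nd order:", rms2)
```

Comparing the integrated I with the formula I(S) evaluated at the integrated S gives a check on the integration that does not depend on the data at all.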

The data used for the fit was obtained with the aid of vernier calipers to measure the height of the data points in the figure found in Murray's book.

The small step size chosen required about 600 iterations for the numerical integration to cover the range of the data. The fits were evaluated by comparing the rms errors of the fits and values for the set of parameters of the model were chosen by trial and error to obtain the smallest rms error.
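The trial and error search can be imitated with a crude grid search, sketched below in Python. Everything here is an assumption made for illustration: the "data" is synthetic, generated from the model itself with known values of r and a, and the grid limits, step size and function names are arbitrary. It is not the measured influenza data and not the procedure used in the spreadsheet.

```python
import math

def integrate_i(s0, i0, r, a, dt, steps_per_day, days):
    """First order integration of the SIR model; returns I once per day."""
    s, i = s0, i0
    daily = [i]
    for _ in range(days):
        for _ in range(steps_per_day):
            ds = -r * s * i * dt
            di = (r * s - a) * i * dt
            s, i = s + ds, i + di
        daily.append(i)
    return daily

def rms_error(model, data):
    """Root mean square difference between model values and data."""
    total = sum((m - d) ** 2 for m, d in zip(model, data))
    return math.sqrt(total / len(data))

# Synthetic "data" made from known parameters (illustrative values only).
S0, I0 = 762.0, 1.0
DT, STEPS_PER_DAY, DAYS = 0.025, 40, 14      # 40 steps of 0.025 day per day
TRUE_R, TRUE_A = 20 * 1e-4, 45 * 1e-2
data = integrate_i(S0, I0, TRUE_R, TRUE_A, DT, STEPS_PER_DAY, DAYS)

# Try every (r, a) pair on a coarse grid and keep the smallest rms error.
best = None
for r_k in range(15, 26):                    # r from 1.5e-3 to 2.5e-3
    for a_k in range(35, 56):                # a from 0.35 to 0.55
        r, a = r_k * 1e-4, a_k * 1e-2
        err = rms_error(integrate_i(S0, I0, r, a, DT, STEPS_PER_DAY, DAYS), data)
        if best is None or err < best[0]:
            best = (err, r, a)

best_err, best_r, best_a = best
print("best r, a:", best_r, best_a, "rms error:", best_err)
```

With real data the minimum will not be zero, and one would refine the grid around the best point or adjust S₀ and I₀ as well.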

Here's a plot showing a comparison of the fit with the data along with a comparison of the values of I(S) for both the numerical integration and the formula which was used as a check on the accuracy of the numerical integration.

Some other plots are also useful.

Supplemental (May 15): The 2nd plot from the end should be labeled I(S).