Friday, December 25, 2015

Derivation of Some Least Squares Formulas


  One can use the method of least squares to derive vector formulas for best fits. For a linear fit one can define the deviation, δk, as the vertical distance of data point k from the line and the variance, V, as the sum of the squares of the deviations. The formula for estimating the best slope through a set of data for a line going through a particular point is derived as shown below. The best slope is assumed to be that for which the sum of the squares of the deviations, the variance, is a minimum. From the theory of maxima and minima in calculus we know that at the minimum the derivative of the variance with respect to the slope is zero.


In the above the chosen point that the line goes through is (tp,ΔTp). The equation for the line is  ΔT(Δt) =  ΔTp + s Δt where Δt = t - tp. A vector Σ whose components are all 1 is needed to include the scalars tp and ΔTp in the vectors defining δ and Δt.
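The derivation itself is in a figure that is not reproduced here; a sketch consistent with the definitions above, with δ, Δt, ΔT and Σ treated as vectors over the data points, is:

```latex
\delta = \Delta T - \Delta T_p\,\Sigma - s\,\Delta t ,
\qquad
\Delta t = t - t_p\,\Sigma ,
\qquad
V = \delta\cdot\delta
```
```latex
\frac{dV}{ds}
= -2\,\Delta t\cdot\left(\Delta T - \Delta T_p\,\Sigma - s\,\Delta t\right)
= 0
\quad\Rightarrow\quad
s = \frac{\left(\Delta T - \Delta T_p\,\Sigma\right)\cdot\Delta t}
         {\Delta t\cdot\Delta t}
```

Setting the derivative of V with respect to s to zero and solving gives the best slope as a ratio of dot products.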

  One can do something similar to find vector formulas for the best straight line through a set of data.


Merry Christmas.

Sunday, December 20, 2015

Polar Warming


  NOAA Climate has recently pointed out that the arctic is warming at a faster rate than the global average. Temperature anomaly data is available from the NOAA ftp site for 30° intervals of latitude and can be used to compute global and arctic tracks for comparison.
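Combining 30° latitude-band anomalies into a global mean requires area weighting; a minimal sketch with hypothetical band values (the actual NOAA file layout is not reproduced here):

```python
import numpy as np

# Hypothetical anomalies (°C) for the six 30° latitude bands,
# listed from 60-90N down to 60-90S -- illustrative values only.
band_anomaly = np.array([1.2, 0.6, 0.4, 0.3, 0.2, 0.0])

# Band edges in degrees latitude.
edges = np.array([90, 60, 30, 0, -30, -60, -90])

# The area of a band between latitudes a and b is proportional to
# sin(a) - sin(b), which gives the proper spherical weighting.
weights = np.sin(np.radians(edges[:-1])) - np.sin(np.radians(edges[1:]))
weights = weights / weights.sum()

global_mean = np.dot(weights, band_anomaly)
```

The polar bands carry much less weight than the tropical ones, which is why a strongly warming arctic can still move the global mean only modestly.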


 The anomaly baseline for the ftp data is 1971-2000 so the vertical scale is different from NOAA's plot. The arctic track appears to have started increasing about 1970, a little earlier than the global track of the anomaly mean. The arctic anomaly data also shows much more variation in its values than the antarctic data.



This may be due to the fact that the arctic has an ocean ice pack and the antarctic a continental ice sheet. The antarctic currently appears to be cooling.

Wednesday, December 16, 2015

Tracking Global Land-Ocean Warming


  I tweaked the procedure a little for the global land-ocean anomaly by using a dotted line for the linear projection at the end of the plot.




The computed slopes seem to be a little more sensitive to changes in the step size, q, than the anomaly track.

Tracking Global Land Warming


  I did the same tracking calculation for the global land anomaly with the last few values obtained using the last estimate of the slope. Again something seems to have happened in the late 70's.




The tracking method seems to work better than the successive residual polynomial fit method with better behavior at the beginning and end of the plot.

Tuesday, December 15, 2015

Tracking Global Ocean Warming


  One can use the formula for estimating slopes to track changes in the global ocean anomaly. I fit a straight line to the first five years of data to get a starting anomaly and slope. Then I stepped forward one year along that track to get a new value for the anomaly, estimated the slope for the next 5 year interval (Δp) and repeated the process. The fit for the last five years was a linear projection based on the last slope found. Here is the calculation for the repeated steps.
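The repeated steps can be sketched in Python; the data here are synthetic and stand in for the NOAA ocean anomaly record, with annual values and a 5-year window for brevity:

```python
import numpy as np

def slope_through_point(t, y, tp, yp):
    # Least squares slope of a line constrained to pass through (tp, yp):
    # s = (y - yp)·(t - tp) / [(t - tp)·(t - tp)]
    dt = t - tp
    dy = y - yp
    return np.dot(dy, dt) / np.dot(dt, dt)

# Synthetic annual anomaly: a small trend plus noise (illustrative only).
rng = np.random.default_rng(0)
years = np.arange(1880, 1960).astype(float)
anom = 0.005 * (years - 1880) + 0.02 * rng.standard_normal(years.size)

m = 5  # window length in years
# Starting anomaly and slope from a straight-line fit to the first m years.
s, c = np.polyfit(years[:m], anom[:m], 1)
a = c + s * years[0]              # anomaly on the track at the first year
track = [(years[0], a, s)]

k = 0
while k + 2 * m <= years.size:
    a += s                        # step one year forward along the track
    k += 1
    # Estimate the slope for the next m-year interval through (year, a).
    s = slope_through_point(years[k:k + m], anom[k:k + m], years[k], a)
    track.append((years[k], a, s))
```

The fit for the last few years would then be a linear projection extending the final (anomaly, slope) pair forward.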


As the track of the anomaly and its slope show, the rate of warming is not steady.



There seems to have been a slight jump in the rate of ocean warming after the mid 70s but it's difficult to characterize the fluctuations.

Sunday, December 13, 2015

Modified Formula for the Slope


  The formula for the slope in the last blog assumed that Δtk = tk - t804 so that Δt0 = 0. The formula can be modified to give the best slope through an arbitrary point (tp,ΔTp) as follows.



This vector formula is similar to the method used to estimate the derivative of f(t) using Δf/Δt. The dot products are needed since division is not defined for vectors.

Saturday, December 12, 2015

A Check on the Same Rate Calculation Using a Different Method


  Instead of doing a search to find the slope which gives the best fit for the same starting point to the global warming record, one can use a formula to estimate the best slope. The formula for smin below uses vector notation and the products are dot products. A function is needed to compute smin since one has to use vectors whose lengths are determined by the length of the period specified by m j.


The formula gives values which are nearly identical to those found by the search as the plot shows. The calculation using the formula takes less time.

Friday, December 11, 2015

The Global Ocean Warming Rate Appears To Be 1°C per Century


  I recomputed the warming rates with a starting point of Jan 1947 and the values stabilize at 1°C per century. This appears to be the current rate of global ocean warming as determined by the least squares slope method.


As the length of the period for the least squares estimate increases the value becomes less sensitive to the fluctuations in the anomaly.

A Steady Increase in the Mean Global Ocean Warming


  In a blog two weeks ago I used a global ocean warming rate based on a least squares fit of the slope of a line through the fixed starting point of the record. There is an advantage to using the fixed starting point since it allows the comparison of periods of different lengths, avoiding a comparison of apples and oranges. As the length of the period over which the rate is determined increases one might expect it to approach some fixed value if there were some long term average for the global warming rate and its fluctuations were bounded. This definition however shows a nearly steady rate of increase since about 1945.


The changes in CO2 emissions do not seem to have significantly affected this change in the rate of global ocean warming.

Tuesday, December 8, 2015

Why Do Successive Residual Polynomial Fits Work?


  To see why the successive residual polynomial fit works so well we can compute the polynomials for the sequence of fits and measure the distances between the fits and the original curve represented by the data points. A simple definition of the distance between the curves represented by two sets of points, f and g, is the square root of the sum of the squares of the distances between corresponding points.


The calculation of the successive fits proceeds as follows.
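The successive fits can be sketched in Python, using synthetic data in place of the anomaly record and the distance measure defined above:

```python
import numpy as np

def dist(f, g):
    # Distance between two curves sampled at the same points:
    # the square root of the sum of squared pointwise differences.
    return np.sqrt(np.sum((f - g) ** 2))

# Synthetic "anomaly" data (illustrative only).
t = np.linspace(-1.0, 1.0, 200)        # time re-scaled about its mean
data = np.tanh(2 * t) + 0.1 * np.sin(9 * t)

fit = np.zeros_like(data)
distances = []
for deg in range(1, 9):
    # Fit a polynomial of the next degree to the current residual
    # and add it to the accumulated fit.
    resid = data - fit
    p = np.polyfit(t, resid, deg)
    fit = fit + np.polyval(p, t)
    distances.append(dist(data, fit))
```

Since each stage fits the residual of the previous one, the distances can never increase from one degree to the next.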


A plot of the distances shows that the successive fits steadily approach the original data points.

The root mean square error is another measure of the distance between two sets of points. One needs to be careful about the density of points and that could be taken into account by including a weight function in the sum of the squares. There may also be some error correction of the coefficients for the fit polynomial taking place as one gets closer to the desired function. Blunders may be eliminated by removing unusually large deviations from the fit.

Friday, December 4, 2015

NIST Saturation Vapor Pressure Curve Fit


  I tested the successive residual polynomial fit method against NIST saturation vapor pressure data to see how it works with more accurate data. It's easiest to fit the log of the pressure, logp, as a function of the inverse of the temperature, U, with a polynomial fit. See Karapetoff, Engineering Applications of Higher Mathematics (1916) for more details on fit procedures. The best fit that I was able to get was for a 35-degree polynomial which added roughly 3 digits of precision to the values of the pressure in the NIST tables.
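The transformation can be sketched in Python with synthetic Clausius-Clapeyron-style data in place of the NIST table; a low degree is used here, and the abscissa is re-scaled about its mean since a 35-degree fit in raw powers of U would be badly conditioned:

```python
import numpy as np

# Synthetic saturation-pressure data (illustrative, not NIST values):
# log p roughly linear in U = 1/T with a small curvature term.
T = np.linspace(273.16, 647.0, 100)    # triple point to critical point, K
U = 1.0 / T
logp = 25.0 - 5200.0 * U - 3.0e5 * U**2

# Re-scale the abscissa about its mean to keep the powers manageable.
x = (U - U.mean()) / U.std()
coef = np.polyfit(x, logp, 3)

delta = logp - np.polyval(coef, x)     # error in log p for the fit
max_err = np.max(np.abs(delta))
```

For a genuinely high-degree fit a Chebyshev basis would be the safer choice than raw powers.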


δ is the error in log(p) for the fit and appears to be just round-off error distorted by the log function. The temperature, T, ranges from the triple point of water to its critical temperature.




Saturday, November 28, 2015

Comparison of the Mean Global Ocean Anomaly Warming with the Record


  I forgot to include a plot of the mean warming rate with the record of the global ocean temperature anomaly in the last blog.


The zigzag in the record may be a little too regular to be considered purely random.

Friday, November 20, 2015

Ancient Arctic Ice Pack


   I redid the estimate of the rate of global warming for the oceans more carefully and found a value of 1.57 °C per 1000 years.


Taking into account a mean ocean temperature of 13.9 °C for the 20th Cent. and that seawater freezes at -1.89 °C we find that much of the world's oceans would have been frozen over about 10,000 years ago.


So, we have to assume that the Arctic Ice Pack would have been much greater during the last ice age. The land bridges for North America could have been farther south in the Pacific and the Atlantic.

Monday, November 16, 2015

The Global Ocean Anomaly and Its Rate of Change


  One can do a similar comparison of polynomial predictors for the global ocean temperature anomaly and the results are not much better even if one bases the prediction on a 100 year interval which better constrains the higher order polynomials.


If one starts with the global ocean anomaly for 1880.0 and asks what slope is the best predictor of the change in the anomaly one gets a surprisingly low value of 1.5 °C per 1000 years as these least squares fits show.


Since the mean temperature estimates indicate that the average for the twentieth century is 13.9 °C one would expect much of the world's oceans to be near freezing about 9000 years ago.

Wednesday, November 11, 2015

Using the Polynomials as Predictors of Global Warming


  The lower degree polynomials make better predictors since the higher degree polynomials tend to diverge from the mean value in an interval as one moves farther forward in time. In the following example the data from the 40 years prior to 1980 were used to predict the subsequent global land anomaly. The black horizontal line is the average for the interval. The blue line is the linear fit and the cyan and brown are the best 2-degree and 3-degree polynomials.


If the change were determined by a stochastic variable the horizontal line would be the best predictor. In this case it seems to be the best "predictor" of prior anomalies. The horizontal line and linear fit seem to be the best long term predictors. The quadratic curve is capable of detecting some curvature in the interval but it can go astray more quickly. Smaller intervals for the fit tend to result in higher curvatures and sometimes produce unlikely results.

Tuesday, November 10, 2015

Separating Signal from Noise in the Global Warming Data


  I stumbled upon an interesting way of doing a curve fit a few days ago that is a variant of the idea behind Chebyshev polynomials. Instead of using fixed polynomials, though, one finds a series of best fitting polynomials of increasing degree for the successive residuals of the data fits. One also needs to re-scale the time about a mean value to keep the powers involved in the polynomials manageable. In the case of the global land anomaly the fit soon reaches the point of diminishing returns.


I found that an 8-degree polynomial gives a reasonably good fit for the anomaly with an approximately linear start and finish. To limit the sensitivity of the fit to the ends I used 5 year buffers with a lower statistical weight (0.25 vs 1) in the sums involved. 
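The buffer weighting can be sketched with numpy's weighted fit on synthetic data; note that `np.polyfit`'s `w` argument multiplies the residuals, so the square root of the statistical weight is passed:

```python
import numpy as np

# Synthetic monthly anomaly series (illustrative only).
t = np.linspace(-1.0, 1.0, 1600)
y = 0.5 * t + 0.3 * t**3 + 0.05 * np.sin(40 * t)

# Statistical weight 0.25 in 5-year (60-month) buffers at each end,
# 1 elsewhere, to limit the sensitivity of the fit to the ends.
stat_w = np.ones_like(t)
stat_w[:60] = 0.25
stat_w[-60:] = 0.25

# np.polyfit's w multiplies the residuals, so pass sqrt(statistical weight).
coef = np.polyfit(t, y, 8, w=np.sqrt(stat_w))
fit = np.polyval(coef, t)

rms = np.sqrt(np.mean((y - fit) ** 2))
```

The down-weighted buffers let the interior data dominate the fit, so any end-of-record wiggles pull less on the polynomial.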


The quick approach to a limiting value for the rms error as one increases the degree of the polynomial seems to indicate that most of the data is "noise." As one increases the degree of the polynomial a point is reached where any remaining global warming is masked by the fluctuations present in the data. The 6-degree polynomial shows a sharp downturn in the fitted curve on the right side. The 10-degree polynomial shows a rapid change on the left side. The results are similar to the 20-year average used for the prediction made in October 2013 but much easier to do. But effects analogous to the Gibbs phenomenon and large fluctuations interfere with making an accurate prediction of what will happen.

  It's been reported in the news lately that the global warming anomaly will pass the 1°C mark this year. It appears the global land anomaly has already done so.

Tuesday, September 15, 2015

The Widmark Formula and Factor for the Fit


  I've been having trouble getting the Widmark formula and factor to work out properly for Widmark's 1914 data. One can use the intercept of the vertical concentration axis and the slope to determine the line which represents the Widmark formula, which in this case is B′ = B0 − IBt. The Widmark factor, r, obtained from B0 = A/(Mr), should be approximately 0.70 but the calculated value is 0.32 instead, which is too low. Widmark showed that the concentration of alcohol in urine was approximately the same as that in the blood. So it would appear that the alcohol concentration values for the data are too high.
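The factor calculation itself is a one-liner; a sketch with illustrative numbers rather than Widmark's actual 1914 values:

```python
# Widmark factor r = A / (M * B0), where A is the mass of alcohol
# consumed (g), M the body mass (kg) and B0 the BAC intercept (g/kg)
# found by extrapolating the linear descent back to t = 0.
# Illustrative numbers only -- not Widmark's data.
A = 50.0      # g of alcohol consumed
M = 70.0      # kg body mass
B0 = 1.02     # g/kg, extrapolated intercept

r = A / (M * B0)   # about 0.70 liter/kg for these illustrative values
```

A value of 0.32 would then point to B0 (and hence the concentration data) being roughly twice as large as expected.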


One can look at the de.wikipedia articles Promille, Blutalkoholkonzentration, Diurese and Urin (Harn) for more information on the German terminology used in Widmark's paper.

Thursday, September 10, 2015

Widmark's per mille BAC unit, etc.


  Widmark expresses the blood alcohol content as per mille, ‰, but this term can at first seem a little vague. By studying Table 1 in Neymark and Widmark, Kinetik des Aethylalkoholumsatzes (1935), we can assign a more meaningful unit to measure the BAC.


It appears from the data in the table that ‰ is equivalent to 1 gm/1000ml but since the density of water is 1 kg/liter this could also mean 1 gm/kg or 1 gm/1000gm which is unitless. Since by concentration we usually mean quantity per unit volume it seems more reasonable to use 1 gm/1000ml. When we do this the unit for the Widmark factor is liter/kg which is what we used before. Even though the factor here was obtained by experiments on dogs, we still get r = 0.7 liter/kg approximately.

  Another expression, per os, used in Widmark, Konzentration des Alkohols (1914), means taken by mouth so it appears he drank the alcohol in a number of different concentrations but obtained similar results for the BAC vs time. In the experiments on the dogs the alcohol was administered by intravenous injection directly into the blood.

Monday, September 7, 2015

Degenerate Solution Fit for Widmark's 1914 BAC Data


  I found the degenerate solution for the diffusion model used in the last posting and tried doing a least squares fit again. I used the trick of guessing where the initial zero, t0, and the final zero, tR, would be, as well as "relative likelihood" statistical weights for the sums involved. There was only one parameter, α, to search for since IB was found using the least squares fit. The result was closer to what one would expect from statistical errors in the observations, coming within two standard deviations of the data points.


The improvement in the fit came from using the statistical weights which were inversely proportional to the standard deviations of the data points.

Saturday, September 5, 2015

Fit of Widmark's 1914 BAC Data Assuming Constant Removal Rate


  Using a few tricks I was able to get a fit for the blood alcohol content data published by Widmark in 1914 for a 4-compartment model. The flow was modeled by the following analogous electrical circuit with I0 representing the removal rate.


  There wasn't very much data to work with and at first my fits looked a little strange, but I decided to estimate the times at which the BAC was zero and adjusted those for the best least squares fit of the general solution of the assumed model. One criterion for the fit was that the downward slope be approximately linear. Another problem was that the best fit again appears to be degenerate, suggesting that the two time constants are equal.


The peak value still appears to be a little off but the fit would probably benefit from more accurate data over a longer span of time. Only positive values for the BAC fit make any physical sense. The fitted curve can act a little weird to the left and right of the region shown but the assumed removal rate isn't valid in these regions either.

Wednesday, September 2, 2015

Adding Compartments to the Blood Diffusion Model Can Raise the Peak BAC


  I used an electrical RC circuit to model the flow of alcohol from an initial compartment to two others with no removal of the alcohol from the system to simplify the problem. Alcohol diffusing from the digestive tract into the blood and from there into body tissues can show a slightly higher peak than the 3-compartment model. The circuit used, component values and some of its initial conditions are as follows.


One can compute successive values for the charges on the capacitors which are analogous to the amount of alcohol in each compartment using the simultaneous equations or attempt to solve the system of equations analytically. The analytical equations are easier to compute but can be rather difficult to find. The solutions for the charges on the capacitors are shown in the figure below. I included the analytical equation for the charge, qB, corresponding to the blood alcohol content. Notice that the digestive tract and blood can approach equilibrium before the equilibrium is reached for the tissues accounting for the slightly raised peak.


Since qB and qT are initially zero the simultaneous equations tell us that the initial rate of change of qT is also zero and its curve is slow to change at first. The characteristic equation for the circuit is cubic but fortunately the constant term is missing so one only has to solve a quadratic equation to get the values of α1 and α2.
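Stepping the simultaneous equations can be sketched as follows; the component values are hypothetical stand-ins, not the ones from the figure:

```python
import numpy as np

# Hypothetical component values (not those of the original figure):
# capacitances play the role of compartment sizes and conductances
# the role of diffusion rates between compartments.
CD, CB, CT = 1.0, 1.0, 2.0       # digestive tract, blood, tissues
g1, g2 = 1.0, 0.5                # D-B and B-T conductances

q = np.array([1.0, 0.0, 0.0])    # all of the charge starts in D
dt = 1e-3

for _ in range(40000):
    vD, vB, vT = q[0] / CD, q[1] / CB, q[2] / CT
    iDB = g1 * (vD - vB)         # current from D to B
    iBT = g2 * (vB - vT)         # current from B to T
    q = q + dt * np.array([-iDB, iDB - iBT, iBT])

# With no removal path the total charge is conserved and all three
# voltages approach the common value q_total / (CD + CB + CT).
```

Since qB and qT start at zero, the first step gives qT a zero rate of change, reproducing the slow initial rise of the tissue curve.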

Saturday, August 29, 2015

The Rate of Oxidation of Alcohol is a Constant


  In a 1936 paper by Neymark and Widmark¹ it was shown that the rate of oxidation of alcohol in the body is a constant. To avoid the delays caused by diffusion, the alcohol was introduced by intravenous injection into the bodies of test animals and it was observed that the decrease in the blood followed a straight line descent. This is what one would expect if an enzyme were involved in the breakdown of the alcohol molecule. Since the alcohol has to combine with the catalyst, a limited amount of catalyst limits the rate of reaction. As the amount of alcohol in the blood increases above a critical limit the reaction rate tends to level off. This critical value is very small so the rate of decrease appears constant for typical concentrations of alcohol. It can be shown that the reaction rate is proportional to the product of the concentration of catalyst and the value of ρ in the formula below.
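The formula in the figure is not reproduced here; the saturation behavior described matches the standard Michaelis-Menten form, so under the assumption that ρ denotes the saturation fraction it would read:

```latex
\rho = \frac{c}{K + c} ,
\qquad
\text{rate} \;\propto\; [\text{catalyst}]\,\rho
```

For c much larger than the small constant K, ρ ≈ 1 and the removal rate is effectively constant, which gives the straight-line descent.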


A group of enzymes, alcohol dehydrogenase, is required to metabolize alcohol.
_____________________
¹ Neymark, M. and Widmark, E. M. P. (1936), Zur Frage der Kinetik des Aethylalkoholumsatzes im Organismus. Skandinavisches Archiv Für Physiologie, 73: 283–290. doi: 10.1111/j.1748-1716.1936.tb01471.x

Some Concerns About the Diffusion Model for the Widmark Fit


  I was concerned about the large deviation of the peak value of Widmark's data from the diffusion model curve for the best fit of the alcohol concentrations so I did a plot that included the three standard deviation error bounds for the data, and it just passes this statistical test.


The data used for the plot were the mean values shown in Table III of Widmark's 1914 paper. As is, one cannot reject the model based on just this experiment. One could try a more complicated, 5-compartment diffusion model that includes a compartment for the eliminated alcohol. The assumption is that the alcohol in the digestive tract flows into the blood before it flows into the remaining body tissues and the liver where it is eliminated.


The more complicated scheme may affect the peak value slightly since there will be a slightly greater concentration of alcohol in the blood than in the liver. Trying to find an equivalent circuit with fewer components will result in loss of information about the quantities of alcohol in each compartment and may confuse the situation somewhat.

To test more complicated models like the one above to see if they will produce a slightly higher peak one could put together a number of electronic circuits and measure the voltage drop across capacitor CB. An alternative would be to simulate the response of a model using an electronic analog computer.

Friday, August 28, 2015

The Simple Model's Fit To Widmark's 1914 Alcohol Content Data And More


  In a 1914 paper by Widmark some more data is given for the concentration of alcohol in blood and urine versus time. The two are approximately the same as was shown in some earlier experiments. The data is from Table III. I tried fitting the simplified formula and wasn't impressed by the results.


So I tried a simple analogy that compares diffusion with electrical conduction.


For diffusion the concentration, c, of a substance acts like a potential which causes a quantity of mass m to move from one point to another. The rate of this flow is analogous to an electrical current. And there is an equivalent of Ohm's law I = ΔV/R which is dm/dt = Γ Δc. So we can draw an electrical circuit which will mimic the behavior of the diffusion.


The capacitors act like the compartments of the 3-compartment model representing the digestive tract, D, and the blood, B. One could add an extremely large capacitor to receive the eliminated alcohol but it would act like a short circuit since it would have a minute change in voltage with the addition of a small amount of charge q. The solution for the voltage across CB is the sum of two exponentials.
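The two-exponential form can be sketched directly; the constants here are hypothetical, chosen only to illustrate the fast-rise, slow-decay shape:

```python
import math

# v_B(t) = k (e^{-a2 t} - e^{-a1 t}) with a1 > a2 > 0: a fast rise as
# D and B equalize, then a slow decay through RB.
# Hypothetical constants, for illustration only.
a1, a2, k = 2.0, 0.2, 1.0

def vB(t):
    return k * (math.exp(-a2 * t) - math.exp(-a1 * t))

# The peak occurs where dv/dt = 0, i.e. at t* = ln(a1/a2) / (a1 - a2).
t_peak = math.log(a1 / a2) / (a1 - a2)
```

Fitting such a curve to data amounts to finding the two rate constants and the overall scale.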


The new model gave a better fit. The way the circuit works is the charges on the capacitors first tend to equalize with the voltage across CD remaining slightly higher than CB and then a small current from each drains through RB. We would expect similar changes as the alcohol diffuses throughout the body.

Supplemental (Aug 29): The citation given for the article is,

Widmark, E. M. P. (1916), Über die Konzentration des genossenen Alkohols in Blut und Harn unter verschiedenen Umständen. Skandinavisches Archiv Für Physiologie, 33: 85–96. doi: 10.1111/j.1748-1716.1916.tb00107.x

Monday, August 17, 2015

Determining Recovery Times After Drinking Alcohol


  One can use the formula for BAC as a function of time to estimate the recovery time after drinking a quantity of alcohol. As an example suppose an 86 kg man drinks 2 liters of 5.5% beer. Using the data below...


...we can compute a BAC curve as a function of a parameter x = λt for a more general result and compare the results with the limiting BAC value of 0.08 gm/100ml. The equation that we need to solve is derived as follows. γ is the BAC converted to mass using the individual's effective fluid volume, Vfl, divided by the mass of alcohol consumed, Am.


Mathcad can easily solve the transcendental equation for x given γ as follows. The advantage of using x is that the curve is the same no matter what the individual's λ value is.

We can use the same method to compute a table of the peak BAC values for a given body mass and quantity of drink. For the 5.5% beer the peak BAC table using pounds and fluid ounces is,


Solving the corresponding BAC curves for the right intersection point gives the recovery parameter x values for the BACs above the BAC limit. One may have to use Newton's method for approximating the zeros of a set of equations, one for each pair of mass and drink quantity.
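The Newton iteration can be sketched in Python; the BAC curve B(x) used here is a hypothetical rise-and-fall shape, not the exact curve from the derivation above:

```python
import math

# Hypothetical BAC curve in the dimensionless variable x = lambda*t:
# a saturating rise minus a steady elimination term (illustrative only).
B0, b, limit = 0.15, 0.02, 0.08     # gm/100ml units

def B(x):
    return B0 * (1.0 - math.exp(-x)) - b * x

def dB(x):
    return B0 * math.exp(-x) - b

# Newton's method, started to the right of the peak so the iteration
# converges to the recovery intersection B(x) = limit.
x = 5.0
for _ in range(50):
    x -= (B(x) - limit) / dB(x)

# For lambda = 0.33 per hour, the recovery time in hours is x / lambda.
recovery_hours = x / 0.33
```

Starting to the right of the peak matters: an initial guess on the rising side would converge to the left intersection instead.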


If λ = 0.33/hr, the value for nondrinkers, the corresponding numbers of hours that it will take for an individual's BAC to drop below the BAC limit are as follows. The zeros in the tables are for BACs below the BAC limit so there are no "recovery times."


For comparison note that three 12 fl.oz. bottles of beer is 36 fl.oz. and a gallon is 128 fluid ounces.

Supplemental (Aug 18): The tables above were computed for men with a Widmark ρW of 0.7 liter/kg. Women can use the same tables if they use a body mass reduced by a factor of 6/7 ≈ 0.86. A gender neutral table would replace the body mass with the effective fluid volume Vfl.

Supplemental (Aug 20): The fit to Schweisheimer's data was fairly good considering the simple model assumed, but all the subjects consumed the same amount of alcohol and therefore we can't really say that Widmark's formula for the peak BAC was validated. The amount of data was also limited and as a result we didn't have accurate knowledge of the true shape of the curve to be fit. Consequently, we shouldn't be surprised if our estimates of the recovery times are off somewhat.