Friday, April 28, 2017
I've got a better handle on the error bounds for the fit of a trapazoidal distribution to the deviations of the time of the Equinox. Here's a review and a corrected plot with error bounds showing the expected deviation from expected value for δt.
The error bounds used the values of the estimate of the deviation, δf*, for the probability densities, obs_f*, for the intervals in the table above. One can get a better understanding of what the error bounds mean by looking at the expected relative frequency, fi=ni/n, for the intervals chosen. The x values indicate the center of the interval. Using the values of a and b for the fitted trapazoidal distribution we can compare the observed counts with the expected counts, k=nf and their expected rms deviation of the counts, δk=√[nf(1-f)]. The expected variation in the relative frequency will then be δf=√[f(1-f)/n]. But what does all this tell us about the observations themselves? One can look at the terms of the binomial distribution with p=f and determine the probability of observing exactly k counts in each interval. Then we can add up the probabilities for those values of k which are within a distance of δk from the expected value for k. The last column on the right shows the probability of this occurring for each interval. A calculation shows the odds aren't uniform for the intervals but equal to 0.6252 ± 0.0375. The probabilities associated with the error bounds are less than those for a normal distribution and fluctuate a little because we are dealing with a discrete probability distribution and taking sum of those values of k between <k>-δk and <k>+δk. One can show that probabilities associated with bounds for a given number of standard deviations in a normal distribution is equal to P(k)=erf(k/√2). So we would expect slightly more observations to be outside the error bounds for the binomial distribution than would be the case for a normal distribution
The binomial distributions for each of the intervals can be plotted together for comparison.
Monday, April 24, 2017
I got a little bogged down with some technical details associated with copying formulas from one Excel worksheet to another. It's a little annoying when Excel crashes, restarts and you have to redo the stuff you haven't saved. It may have been how an error crept into one of my previous pages. You literally lose track of what you are doing. One needs to constantly check one's formulas and trace dependencies when transferring material from one page to another.
I used the expected value formulas to do the fit for the deviations of the Equinox times. The results were similar. The frequencies that I've been using were relative frequencies defined in as the ratio of counts for an interval to the total count. One can also define the function f as a probability density or probability per unit interval. I had to use this definition to get the fit to work properly for the Equinox times. The value of f here is the previous value divided by the width of the interval dx.
The error bounds are nominal in the sense that they are typical of the observed variations for a trapazoidal distribution. The fit values for the trapazoidal distribution are a=4.470 min and b=15.089.
Supplemental (Apr 24): The trapazoidal distribution has an interesting series for formulas for its expected values. The pattern holds for higher powers of x. Technically this might be called a folded trapazoidal distribution since the probabilities for the positive and negative values of x are combined.
Supplemental (Apr 25): The variations in the relative frequencies are scaled down versions of the expected variation in the counts for an interval as this derivation shows.
In evaluating f and δf in the table above I used the observed values for the interval's density, obs_f (=ni/n/dx), as an approximation. It was intended as a check of the trapazoidal density formula whose maximum value is 2/(a+b)=0.1023. Using the same letter for the relative frequencies of the counts and the probability density formula may have be a little too confusing. So the error bounds in the plot are a little too large. Using f* for the density the correct formula for the expected rms error in the density would be as follows with Δx=dx.
Friday, April 21, 2017
From the definition of the trapazoidal distribution and its integral one can obtain formulas for the expected values ⟨x⟩ and ⟨x2⟩.
Then given a set of random numbers from a trapazoidal generator one can analyze the set by counting the number of values that fall within chosen intervals and then estimate the distribution for the intervals and the expected values ⟨x⟩ and ⟨x2⟩.
We now have two equations which can be solved for a and b which can be used to fit the observations and compare the results with the original values of a and b. In the example below the original values were a=0.5 and b=1.0 and the fit values were a=0.538 and b=0.993.
This was a lot easier to do since the equation for ⟨x⟩ can be transformed into a quadratic function of ρ=a/b whose solution can be found if a set of values are assumed for b. The set of values for ρ can then be used to find a zero of the expression for ⟨x2⟩.
Wednesday, April 19, 2017
The least squares fit failed primarily because it favored the majority at the expense of a minority. The histogram cells closer to the mean time had the highest probabilities while those beyond the value of b had zero probability. The result was that the data for the last cell on the right could be ignored when computing the rms error of a trapazoidal distribution if the b value was too small.
The additional constraint for the minimum maximum magnitude of the z-scores for the probability distribution assured that an unlikely situation would not occur. The probability of a histogram interval was found by using the difference of the integral of the trapazoidal distribution of its upper and lower bounds.
When setting bounds for curve fits one has to make certain that significant data is not ignored.
The trapazoidal distribution fit for MICA's Spring Equinox time deviations proved to be a little difficult. The b values were difficult to fit since they favored lower values at the expense of large z-scores for the last interval of the histogram. I tried minimizing the maximum absolute value of the z-scores while minimizing the rms error and got what appears to be a better fit.
Here are some statistics for the fit.
Judging by the z-scores its a marginal trapazoidal distribution at best.
Tuesday, April 18, 2017
Just got through doing a rough trapazoidal fit of the deviations in the time of the Spring Equinox.
This fit uses the MICA times of the 251 Spring Equinoxes from 1800 through 2050. We probably shouldn't take the trapazoidal distribution too seriously but it may be wise to keep an open mind about the actual shape of the distribution. The minimum error was for a=2.05 min and b=15.15 min. The statistics hint that a slight deviation from the mean time is the most probable situation.
Supplemental (Apr 19): This fit ended up somewhat off the mark due to an error in evaluating the probability for the histogram intervals. I found the error in data used last night and corrected it in the next blog. Here too I had problems in choosing a good value for b.
Monday, April 17, 2017
I did a video to show the trapazoid fit process in action. I was able to compensate for Google's processing somewhat but not entirely. Excel recalculates the worksheet if a cell's content is changed to moving a cell about is an easy way to recalculate the worksheet. Each time the selected box is moved the worksheet computes 1000 random trapazoidal numbers, does the data analysis and computes a fit for the data.
You can pause the video to study a particular fit. It helps to zoom in a little too.
Over the weekend I've been studying the Trapazoidal Distribution. One only needs two numbers a and b to define this distribution. Its height, h, can be found since the area under the curve equals 1.
I wrote an Excel user function to generate 1000 random numbers that fit this distribution with a=0.5 and b=1 then determined the observed frequencies for the following set of intervals.
To check the trapazoidal random number generator it was necessary to determine the distribution which best fit the data. To do this a search was necessary to find the values for a and b which minimized the root mean square error for the fit. Since the trapazoidal distribution is symmetric it can be folded horizontally to reduce the number of intervals needed in the calculation.
The fit turned out to be fairly good.
Above the blue dots represent the observed frequencies and the blue line is the fit.
Thursday, April 13, 2017
I've solved the problem I've been having with setting the intervals on Excel's histograms. I tried adjusting the bounds by setting the Overflow and Underflow bins to ±18.0 but they were ignored. Apparently, these bounds need to be less than the maximum and minimum for the data series used for the histogram. It's debatable whether or not one needs empty columns at the ends of a histogram. These entries set up the histogram in an acceptable manner.
Again, the observed counts from the histogram can be used for comparison with the statistically expected counts for a normal distribution.
The distribution above looks similar to a normal distribution and the z-scores for the intervals can't be rejected on statistical grounds but one cannot always get a clear impression of the actual distribution for deviations from a mean value with a single dataset. In another histogram the distribution appears to be more like a trapazoid.
We would need more data to come to a conclusion about what the actual distribution for the Equinox times is.
Monday, April 10, 2017
Excel's error function, ERF(x), can be used to evaluate the probabilities for the intervals in a histogram and their expected values. The probability integral, p, below is simplified by the use of a z-score eliminating the standard deviation found in a normal probability distribution. The error function is simplified even further since the factor of 1/2 is missing from the exponential so its arguments require the z-scores a and b of the interval to be divided by the square root of 2. This can be done for the deviations for the times of the Spring Equinoxes. Formulas for binomial distribution can be use to compute the expected value, k=np, for an interval once the probability is known and also the deviation, δk, for the expected value. δk2=np(1-p). The z-scores for the observed counts is found in the last column and gives us a measure of how good the counts for the intervals are. One would expect nearly all the observations to be within 3z for the interval as is the case.
So even though the Earth's orbit deviates in a complicated spiral motion from the Keplerian ellipse, the observed deviations time of the Equinoxes appears to be random.
Supplemental (Apr 10): Another way of looking at the analysis above is that we compared the distribution for the deviations in time with a normal distribution. The z-scores in the last column indicate the central peak is somewhat flattened and the "outliers" at the ends are under represented in our sample. I had a part-time job in the Physics Shop while attending university and one of my teachers used a bean machine to study the statistics. He said that the distributions he was getting were not what was expected. We considered the possibility that the balls might be gaining a little momentum as they fell and were more likely to bounce off the pins and move outwards or the pins may get bent after prolonged use. But rare events are less likely to be represented in a sample and the peak may be more likely to lose its balls than gain them if the total number of balls is small.
Friday, April 7, 2017
I redid the statistics for the deviations from fit formula and got something a little more normal using fewer intervals
Microsoft's histograms are a little difficult to do. For this one I set the underflow and overflow values at -16 and 16 respectively and used 6.4 for the interval width. One can compare the observed values with the expected values and their standard deviations.
One would expect most of the observed values to be within 3 standard deviations of the expected values. Under the circumstances one cannot rule out a normal distribution for the deviations.
Supplemental (Apr 8): I used numerical integration to evaluate the probability integrals for the expected values and their deviations. A more accurate calculation shows that for the 16< interval the expected value is 1.6.
Thursday, April 6, 2017
I converted the MICA times for the Spring Equinox between 1800 and 2050 to Julian Dates and got a linear equation for the fit. A plot of the deviations appears to be fairly random with a maximum deviation of about 16 minutes.
The "resonance" seems to have disappeared.
Supplemental (Apr 7): The probability distribution for the deviations from the linear fit doesn't appear to be normal. Its peak appears to be somewhat flattened.
Wednesday, April 5, 2017
I tried using the apsides from MICA's 2017 positions for the Earth to calculate positions on the Keplerian ellipse but got a 0.02 radian difference between the curves for MICA's radii and the ellipse radii. Using apsides based on the fit brought both curves together.
These angular separation between these two apsides was very close to π as it should be.
One can plot the deviations of MICA's Earth positions from the best fitting Keplerian ellipse positions. The result is rather curious. The motion appears to be that of compounded spirals. The smaller cycles are due to the Moon's motion.
The blue dot at the end point marks the last data point.
Supplemental (Apr 6): A 0.0002 AU distance deviation along the ecliptic path of the Sun is enough to account for the observed 16 min deviation in the time of the Spring Equinox.
Tuesday, April 4, 2017
The formula to the shift in time of the Spring Equinox could be clarified somewhat. The Sun has a mean position and mean path. The Moon and other bodies can shift the Earth's position a little and this will affect the apparent position of the Sun which travels parallel to and at the same speed as the mean Sun. So for some arbitrary deviation, δ, relative to the mean position we can calculate the extra distance the Sun needs to travel in order to cross the Celestial Equator and the maximum value.
Note that the error is magnified by small angles of inclination for the path. As stated previously the maximum error is for θ = 0 or a deviation perpendicular to the plane of the Celestial Equator. Of course when the deviated position is above the Celestial Equator the Sun will cross before the Mean Sun does.
Sunday, April 2, 2017
I used MICA to compute the Earth's heliocentric positions for 2017 and then used Excel to fit a Keplerian orbit to the resulting data.
The angle θ is in radians. The resulting orbital elements were,
The epoch is Jan 1, 2017 00:00 UT. If you look at the difference between the two curves you get a small periodic error that fits very well to 12.50 per year which corresponds to the Draconic month.
The deviations account for most but not all of the observed variation in the time of the Spring Equinox.
The speed of the Earth was approximated by dividing the circumference of the Earth's orbit, 2π AU, by the length of the year in minutes. One needs to take the tilt of the Earth's orbit into account to estimate when the Earth will cross the celestial equator.
Edit (Apr 2): Redid the last figure. Here the horizontal line is the Celestial Equator. The sloped line is the Ecliptic along which the Sun travels. The length x is proportional to the distance of a point on the circle from the equator so it is maximum for the right triangle. The numerical results remain unchanged. This calculation would be conclusive if based on the actual angular deviations. As is, the calculation shows how a vertical deviation can affect the time of the equinox.
Supplemental (Apr 2): The times of perihelion and aphelion give a little more information about the Earth's orbit. The values obtained from least squares cubic fits of r are in good agreement with published values.
Edit (Apr 2): Found an error in the linear interpolation for the aphelion. It may have been due to back stepping a fraction of a step.
Monday, March 27, 2017
Working with arrays of numbers in Excel can be a little difficult especially if one is doing cross products. But we can input the formulas for something familiar and then rearrange the cells to get the relative positions correct for the array.
I found a screen recording tool in PowerPoint 2016 today and decided to test it with a video showing how to do a cross product in Excel 2016. One first enters two vectors in adjacent columns. If one copies the first two rows underneath the original two vectors one can use drag and fill to copy the formula for the first component of the cross product in the third column. Setting the two rows at the bottom equal to the first two rows at the top allows the procedure to work even if the numbers are changed.
Hiding the 4th and 5th rows makes it easier to view the results.
Wednesday, March 22, 2017
I recomputed the times for the Spring Equinox using JPL's HORIZONS Web-Interface for the years 2010-2020 and got similar results to those for MICA for the same period. The times were all shifted by a few hours and the mean value for the period differed slightly but the plot was similar overall.
I couldn't find an explanation for magnitude of the deviations. The changes in the Earth's orientation due to nutation are of the order of a few seconds of arc. The effects of gavitational perturbations accelerate and retard the Earth in its orbit about the Sun and can produce cummulative changes.
Tuesday, March 21, 2017
I recomputed the deviations for the time between Spring Equinoxes using MICA data from 1800 to 2050 which was all that was available.
The data appears to be fairly random but there may be a beat frequency present since the data shows very little variation about 1820 and 1990. The mean value was 365.242358 days.
Monday, March 20, 2017
The Spring Equinox occurred at 10:28 UT today and the table in the Wikipedia article for the time of day that the Sun crosses the equator appears to range throughout but this is somewhat deceiving since the time between Equinoxes differs slightly from the time it takes for the Earth to complete its orbit. It's the reason we have leap years. If one looks at the deviations from the mean time of the Equinox the variation appears to be less that 10 minutes for the decade or about 1 part in 52,595. Most of this variation is probably due to the perturbing influence of the position of the Moon.
There is also a slow change in the position that the Celestial Equator crosses the Ecliptic due to the Earth's Axial Precession which in turn is due to gravitational torques of the Moon, Sun and other planetary bodies acting on the Earth's equatorial bulge.
Maia was an ancient fertility goddess who appears to have been associated with the growth and renewal of springtime. In ancient Greek maia meant lady but she was also known as a midwife. It may not be a coincidence that in poetry she is associated with magic and enchantment considering the similarity of the Greek word for magic, μαγεία. She is definitely beguiling.
Supplemental (Mar 21): The product of a little free association this morning:
Thursday, March 16, 2017
Uncritical thinking can get us into trouble. If we place too much trust in the group and take every suggestion seriously we can be induced to act improperly. We can't ignore individual responsibility and slave our actions to the will of others.
The same is true with computer systems. Every computer has a bag of tricks and hackers will try to take advantage of them if they can get away with it. Added security might be achieved if we can limit actions to a known set of instructions. The programs may have to become more like automatons that act cooperatively with assigned tasks and notify others of the problems they encounter.
If things do not go as planned a programmer will have to take remedial action.
This is why I designed the imround function to be selective in its actions. Sometimes it may be useful to pass on error messages to bring them to the attention of the programmer. To others error messages may be a distraction so I decided to add a variable to imround to allow it to change its behavior as needed.
If copy is true, this version of imround will just copy the contents of the source cell if it doesn't modify it. When copy is false, the contents are not passed on.
Sometimes we have to act to maintain order and limit the propagation of error.
Wednesday, March 15, 2017
The slideshow above came from a google html/css gadget originally posted on the side panel and moved into the blog column. The code was a slight modification of the code found on the W3.CSS Slideshow page. The code from the gadget was copied as HTML/CSS into this post. For some reason there is now a vertical shift and a lot of "space." Blogger seems to have altered some of the code.
Supplemental (Mar 15): Found the bug. There were some spurious <br> 's in the html code for the blog so I placed the entire <div> code for the container on one line. Removed the vertical shift.