httprover's 2nd blog: A More Detailed Look at the t-test for Averages

Monday, July 8, 2019

A More Detailed Look at the t-test for Averages

I redid the analysis from the last blog more carefully and got significantly improved statistics for the relative error of the test results. I also cleaned up the notation somewhat. A script "l," ℓ, is now used for the level of significance, 0.05. Subscripts are used to distinguish the various t-values. The spread of t̃_σ depends only on the spread of x̄, the average value of x, since σ is a constant, and it can be computed from the t-values shown in the following plot.

The t-value t̃_s also depends on the spread of the estimated standard deviation s, which makes its distribution broader, so more values are rejected. The two hypotheses tested were H₁: |t̃_σ| ≤ t_ℓ and H₂: |t̃_s| ≤ t_ℓ, each of which is either accepted (A) or rejected (R). The level of significance determines the t_ℓ used in both hypotheses.
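As a rough illustration of the two decisions, here is a minimal Python sketch. It assumes the usual forms t̃_σ = (x̄ − μ)/(σ/√n) and t̃_s = (x̄ − μ)/(s/√n). The sample size n = 10, the true mean μ = 0, σ = 1, the sample values, and the critical value 2.262 (the two-sided 5% Student t value for 9 degrees of freedom) are assumptions made for the example, not values taken from the spreadsheet.

```python
import math
import statistics

# Assumed for illustration: n = 10, so 9 degrees of freedom and a
# two-sided 5% Student t critical value of about 2.262.
T_CRIT = 2.262

def t_values(x, mu, sigma):
    """Return (t_sigma, t_s) for the sample x and hypothesised mean mu."""
    n = len(x)
    xbar = statistics.fmean(x)
    s = statistics.stdev(x)  # estimated standard deviation, n - 1 divisor
    t_sigma = (xbar - mu) / (sigma / math.sqrt(n))
    t_s = (xbar - mu) / (s / math.sqrt(n))
    return t_sigma, t_s

def decide(t, t_crit=T_CRIT):
    """Accept (A) when |t| <= t_crit, otherwise reject (R)."""
    return "A" if abs(t) <= t_crit else "R"

# A made-up sample of ten values, used only to exercise the functions.
sample = [0.3, -1.2, 0.8, 1.5, -0.4, 0.1, -0.9, 2.0, 0.6, -0.2]
t_sigma, t_s = t_values(sample, mu=0.0, sigma=1.0)
decisions = decide(t_sigma) + decide(t_s)
print(t_sigma, t_s, decisions)
```

Both t-values share the same numerator, so any difference between the two decisions comes only from s differing from σ.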

A FOR loop was added to the update macro to permit the collection of data for a fixed number of trials, so one no longer has to hold down the keys for the macro shortcut. On each pass it transfers the values of the set of random numbers to the x column and updates the decision tallies. Each pair of binary decisions for the tests can be represented by a pair of letters: the letter designating the H₁ decision followed by the one designating the H₂ decision. For example, AR means H₁ was accepted and H₂ was rejected.
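The loop and the tallies can be imitated outside the spreadsheet. The sketch below is not the macro itself: it assumes normally distributed random numbers, n = 10 values per trial, μ = 0, σ = 1, and the critical value 2.262 for 9 degrees of freedom, none of which are stated in the post. The function name `run_trials` is hypothetical.

```python
import math
import random
import statistics
from collections import Counter

def run_trials(trials, n=10, mu=0.0, sigma=1.0, t_crit=2.262, seed=1):
    """Tally the decision pairs AA, AR, RA, RR over a fixed number of trials.

    The defaults are illustrative assumptions, not the spreadsheet's settings.
    """
    rng = random.Random(seed)
    tally = Counter({"AA": 0, "AR": 0, "RA": 0, "RR": 0})
    root_n = math.sqrt(n)
    for _ in range(trials):
        # New set of random numbers for the x column.
        x = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(x)
        s = statistics.stdev(x)  # n - 1 divisor
        t_sigma = (xbar - mu) * root_n / sigma
        t_s = (xbar - mu) * root_n / s
        h1 = "A" if abs(t_sigma) <= t_crit else "R"
        h2 = "A" if abs(t_s) <= t_crit else "R"
        tally[h1 + h2] += 1  # first letter is H1, second is H2
    return tally

TRIALS = 20_000
tally = run_trials(TRIALS)
h1_rejection_rate = (tally["RA"] + tally["RR"]) / TRIALS
h2_rejection_rate = (tally["AR"] + tally["RR"]) / TRIALS
print(dict(tally), h1_rejection_rate, h2_rejection_rate)
```

The rejection rate for each hypothesis is the sum of the two tallies that carry an R in the corresponding position, divided by the number of trials.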

The formulas used are contained in the following figure. Using n−1 in the estimated standard deviation s gives better results, since dividing by n, as is done for the rms error, underestimates the variance on average.
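The difference between the two divisors is easy to see on a small made-up data set. In this sketch the four values are arbitrary; `statistics.pvariance` divides the sum of squared deviations by n and `statistics.variance` divides it by n − 1, so their ratio is always (n − 1)/n.

```python
import statistics

# Arbitrary values chosen only to illustrate the two divisors.
x = [1.0, 2.0, 3.0, 4.0]
n = len(x)

var_n = statistics.pvariance(x)          # sum of squared deviations / n
var_n_minus_1 = statistics.variance(x)   # sum of squared deviations / (n - 1)
ratio = var_n / var_n_minus_1            # equals (n - 1) / n

print(var_n, var_n_minus_1, ratio)
```

The deviations are measured from the sample average x̄ rather than from the true mean, which is why the n divisor comes out low on average and the n − 1 divisor corrects for it.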

With the FOR loop I was able to increase the number of trials to 250,000. The calculation took about 3½ hours. The changes gave surprisingly good results for the rates. The rejection rate for hypothesis H₂, which involves the estimated standard deviation s, agreed very well with the 5% level of significance.
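For a sense of how closely a rate measured over 250,000 trials can be expected to match 5%, here is a back-of-the-envelope estimate. It assumes the trials are independent, so that the count of rejections is binomial and the standard error of the rate is √(p(1 − p)/N). This is a generic estimate, not a figure taken from the spreadsheet.

```python
import math

trials = 250_000   # number of trials quoted in the post
p = 0.05           # level of significance

# Binomial standard error of an observed rate, assuming independent trials.
std_err = math.sqrt(p * (1 - p) / trials)
rel_err = std_err / p

print(f"standard error of the rate: {std_err:.6f}")
print(f"relative error: {rel_err:.2%}")
```

Under that assumption the expected statistical scatter in the rejection rate is a bit under 1% of the rate itself.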
