Wednesday, December 26, 2018

Some Polynomial Factoring Formulas


  I got Nio to work out some polynomial factoring formulas for me. Looks like they follow a simple pattern.


Factoring a Quartic Polynomial into Two Quadratics 2


  The previous blog neglected to mention what happens if b=0, which has to be treated as a special case since the formula for d results in division by zero. When b=0 the initial set of constraints is simplified slightly and the formula for d changes.


One can then proceed as before to find the best values for a, b, c and d.


There is only one remaining constraint which gives the same zeros.


We need only one solution to factor the quartic since the various zeros correspond to alternative permutations of the monomials in the quartic. Another special case occurs when, in addition, a=0; the altered constraints then tell us that d=A₀ and c=A₃. Zero coefficients warn about the occurrence of the special cases. One can use the same general method to factor a cubic equation into a quadratic and a monomial, where one also finds a special case for a=0 since the formula for b involves division by a.

Tuesday, December 25, 2018

Factoring a Quartic Polynomial into Two Quadratics


  Lately I've been working on an Excel spreadsheet to solve for the eigenvalues of a given 4x4 matrix M with real coefficients. These eigenvalues are the roots of the eigenvalue equation |M−μI|=0, which is actually a 4th degree polynomial equation, a quartic. But before I got around to doing a blog on solving for the roots of these equations with numerical methods, I decided to check whether it was easier to factor the quartic into two quadratic equations instead.

This turned out to be the case. To see this let's start with a set of coefficients for a quartic equation and compare them with those that result from multiplying two quadratics together. Let the first quadratic be represented by q₁(x)=x²+ax+b and the second by q₂(x)=x²+cx+d; equating their product with the quartic gives the set of four equations which need to be solved for a, b, c and d.


The first and last equations allow us to easily compute d and c if a and b are known. One can then substitute these formulas into the second and third equations to get two equations, f₁(a)=0 and f₂(a)=0, by assuming that b is a function of a. Substituting the solution for b(a)² from one into the other results in a linear equation for b(a), giving us a rational function for b(a).



One can do a one-dimensional search for the magnitude of the minimum error between the computed values for the quartic coefficients and the given values. One can alter the range of the search and change the step size to zoom in on a zero, using feedback if necessary for precise values. Evaluating the function f₁(a) gives us a check on the zeros.
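Here is a minimal Python sketch of that search, assuming the monic quartic x⁴ + A₃x³ + A₂x² + A₁x + A₀. The blog's own formulas are in the worksheet images omitted here, so the rational function b(a) below is a reconstruction obtained by eliminating c and d as described above.

```python
import numpy as np

def factor_quartic(A3, A2, A1, A0, a_grid=None):
    """Search for x^4+A3x^3+A2x^2+A1x+A0 = (x^2+ax+b)(x^2+cx+d).

    Matching coefficients gives the four constraints
        A3 = a + c,  A2 = b + d + a*c,  A1 = a*d + b*c,  A0 = b*d.
    Using c = A3 - a and d = A0/b in the middle two constraints and
    solving the resulting linear equation for b gives b(a) below.
    """
    if a_grid is None:
        a_grid = np.linspace(-10.0, 10.0, 20001)
    best_err, best = np.inf, None
    for a in a_grid:
        denom = (A3 - a) * (A2 - a * (A3 - a)) - A1
        if abs(denom) < 1e-12:
            continue
        b = A0 * (A3 - 2.0 * a) / denom
        if abs(b) < 1e-12:      # skip b = 0, the special case treated above
            continue
        c, d = A3 - a, A0 / b
        # error in reproducing the two coefficients not used to get c and d
        err = abs(b + d + a * c - A2) + abs(a * d + b * c - A1)
        if err < best_err:
            best_err, best = err, (a, b, c, d)
    return best

# (x^2 + 2x + 3)(x^2 - x + 4) = x^4 + x^3 + 5x^2 + 5x + 12
print(factor_quartic(1, 5, 5, 12))  # one of the swapped factor pairs
```

The duplicate minima of the error correspond to exchanging the two quadratic factors, as noted below.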


Substituting the four values for the zeros of a into our formulas gives the corresponding values for b, c, and d which check with the original polynomials used to generate the coefficients.


Note there are four pairs of monomials here since there is one double root. In general there would be six zeros, but as with the example given here there is duplication since the coefficients of the first and second quadratics can be exchanged. The feedback mentioned above for finding the zeros uses copy and paste to transfer the value of a for the minimum error into the first row of the a column.

Merry Christmas to All

Wednesday, October 10, 2018

True Standard & Some Silver References


  Cupellation appears to be the most accurate assay method for silver but there are some losses, on the order of 0.1%, in the process which requires an adjustment to determine the fineness on the scale of a true standard. This points out the difficulty in specifying a definitive test for the fineness of silver and the care that needs to be taken.

Here are some books and links concerning silver and its history, alloys, assay and metallurgy:

aes - Wiktionary

Harper's Dictionary of Classical Literature and Antiquities

Pliny - The Natural History of Metals c. 79 AD

Arbuthnot - Tables of Ancient Coins, Weights and Measures 1727, proportion of gold & silver in coins

Phillips - A Manual of Metallurgy 2nd Ed 1859, assay of the alloys & ores of silver

Phillips - The Mining and Metallurgy of Gold and Silver 1867, concentration of precious metals in lead, smelting

Percy - The Metallurgy of Lead 1870, lead-smelting

Hill - A Handbook of Greek & Roman Coins 1899, quality of metals used

Del Mar - A History of the Precious Metals 1902

Scientific American Cyclopedia of Formulas 1915, silver and copper alloys, dwt

Phase diagram - Wikipedia

Saturday, October 6, 2018

Provisional and Definitive Tests


  I've been trying to come up with a good example from the History of Science to illustrate the comparison of test procedures, and one is the assessment of the purity of metals. One test might be the use of Archimedes' principle to measure the specific gravity of a given sample. On the other hand one might take cupellation as the definitive test for purity, but the disadvantage of cupellation is that it is destructive, so one might prefer a provisional test like the use of Archimedes' principle.

Cupellation is used in the Trial of the Pyx for the assay of coinage. This method dates from ancient times. Modern silver standards are well regulated. For example sterling silver is required to have a fineness of 925.

A modern example of the purification of metals similar to cupellation is the Czochralski process. X-rays can also be used for assaying metals.

Monday, October 1, 2018

Using Concurrence Counts for Comparisons


  In the last couple of blogs I tried to show how the expected number of good items in a sample can be estimated if the agreement and disagreement of two testers were known for correct assessments on the same set of items. One just needs the concurrence counts, since from them one can determine what the testers' observations were, as indicated in this diagram. The counts in the last column are just the sums of the counts for the two paths leading to the combined counts.


If one tries to make an estimate of NG using the actual observations one ends up with an estimate of N₀, the number of items in the sample, instead.


So the testers' assessments themselves don't help us very much. In general we need a table of the concurrence of the actual number of good and bad items. We can alter the problem by asking how well a pass/fail function test will predict whether an item will function for a specified length of time with those that do being the number of "good" items. So we need to consider diagrams like these for the two testers.


After the true counts of concurrence have been determined one can use them to get the conditional probabilities for the testers, which in turn allow us to evaluate the testers and their tests.

Supplemental (Oct 4): The concurrence matrix referred to above would more properly be called a concurrence table. The conditional probabilities are part of a matrix since they allow one set to be used to compute the other if one set of probabilities is known. The reason for the failure of the testers' concurrence estimate is the presence of false positives in the counts used.

Friday, September 28, 2018

A Commentary on Bayesian Games etc.


  The last few blogs dealt with problems related to Bayesian inference. The chief obstacle to such an analysis is getting a good estimate of the conditional probabilities when the reliability of the observers is unknown. We found that using two observers can give an improved estimate of the expected values of the unknown counts which are needed to determine the conditional probabilities.

The formulas for the hidden probabilities require counts of the number of events that the observers agree on. Rutherford gives the formula for computing the probability of two simultaneous events P and Q occurring, but his papers don't indicate that he used these formulas to get improved estimates of rates based on the observations of two observers. Feller's proofreading problem doesn't indicate a method of solution either. I have a copy of Feller's book (3rd ed., see p. 170) and derived the formulas for the solution to the problem on my own. The proofreading problem and formulas can be found in Ross, Introduction to Probability and Statistics for Engineers and Scientists, p. 234f.

A related problem in Bayesian inference would be the determination of the likely number of false positives and false negatives for the observers using a particular method for assessments. If the conditional probabilities are known one can do so. A problem with statistical analysis is that rare events tend to be excluded from observations. Another problem is extending restricted studies to more general cases where the rates are likely not to be the same.

Thursday, September 27, 2018

Observer Bias Affects the Corrections to the Faulty Observations


  Observer bias can affect the correction to the estimates of the hidden probabilities for the occurrence of good and bad items. In the last blog the observer assessments were unbiased for both examples. In the first example below the first observer is less likely to make an error in identifying good items while the second observer makes fewer mistakes on bad items. The estimated probabilities end up slightly biased in favor of good items since there are more good items than bad in the sample. In the second example below both observers more accurately identify bad items and the estimated probabilities have a slight shift towards the occurrence of bad items.



Since there were 100 items in each sample one would expect the rms error in the estimated mean probabilities to be about 1/√100 = 0.1 times the rms error for the individual items in the sample, so the mean probability estimates should be accurate to about 3 digits.

Wednesday, September 26, 2018

Correcting Faulty Observations


  One can use the results of two independent quality tests to improve the estimate of the probability of finding a good item. The counts for Tester 1 of good and bad items are designated N₁ and N₂ and those of Tester 2 are N₃ and N₄, and their probabilities are p and q and p' and q' respectively. Based on the two observers' assessments of the items how can we determine NG, NB, pG and pB? The answer is to keep track of the number of times, N₁₃, when both the N₁ and N₃ assessments are good, and the number of times, N₂₄, when both the N₂ and N₄ assessments are bad.


Then we can borrow a trick used by Rutherford to improve on the scintillation rates determined by two different observers.
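The trick rests on the coincidence counts: if each tester independently recognizes a good item with some fixed probability, the expected counts satisfy E[N₁₃]·NG ≈ E[N₁]·E[N₃], so N₁N₃/N₁₃ estimates NG. Below is a minimal simulation sketch, assuming testers who can miss a good item but never pass a bad one (the no-false-positive case in which the coincidence estimate applies; the rates are illustrative).

```python
import numpy as np

rng = np.random.default_rng(0)

N0, pG = 100, 0.9          # sample size and true probability of a good item
p1, p2 = 0.8, 0.7          # each tester's chance of recognizing a good item

good = rng.random(N0) < pG             # hidden true quality
seen1 = good & (rng.random(N0) < p1)   # items Tester 1 marks good
seen2 = good & (rng.random(N0) < p2)   # items Tester 2 marks good

N1, N3 = seen1.sum(), seen2.sum()
N13 = (seen1 & seen2).sum()            # concurrence count

# Rutherford's trick: N1*N3/N13 estimates the true number of good items
print("true NG:", good.sum(), " estimate:", N1 * N3 / N13)
```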


Even with relatively large rates for the counting errors one still gets a good estimate of the actual rates.



The mean values indicated were found by averaging the counts for NG and NB using NG+NB=N₀.

Supplemental (Sep 26): The estimated probabilities for G and B are again the averages for the 10 sets of assessments, each involving 100 items. The stochastic variables used to generate the data correspond to the rates pG, a, d, a' and d' and were randomly set to 1 or 0 based on those rates, as was done previously.

Supplemental (Sep 26): See Feller, An Introduction to Probability Theory and Its Applications, Vol. I, 2nd ed., p. 160, prob. 23, which cites Rutherford. See also Rutherford et al., Probability Variations in the Distribution of α Particles, cited in Rutherford's book linked above.

Tuesday, September 25, 2018

Uncertainty in the Cross Terms for the Comparison of the Two Quality Assessments


  When comparing the two processes for determining the quality of items, using the more accurate estimates good (G) and bad (B) and the less accurate estimates pass (P) and fail (F), the uncertainty in the cross terms can be quite large. The conditional probabilities were determined as follows.


We can recalculate the Excel worksheet that generated the 10 sets of 100 random estimates of the quality of G, B, P, and F items to get a new set of averages and save the numerical values. With 100 of these trials the averages of b and c were determined, as well as the root mean square deviations from these averages.


One can see that there is quite a bit of variation in the cross terms even though the average turns out to be fairly accurate. A large number of tests of a given set of items is needed to get good estimates of the cross terms b and c in order to check theoretical results.

Sunday, September 23, 2018

Faulty Sorts


  Consider a faulty sort mechanism intended to separate good from bad. If the decision depends on the perception of what's good and what's bad, the sort can end up more mixed up instead of improved. Using counts instead of probabilities one can show that the "sort" has an equilibrium distribution.


If A is the matrix containing the components a, b, c, d in the last blog then the limit after repeatedly multiplying an initial distribution by A can be determined, with N₁=N and N₂=N₀−N.


So if A represents a faulty processing mechanism used to sort successive distributions then the limit of the distributions will be (90, 10)ᵀ. One can treat the column vectors of A as "distributions" with their limits being the A above. One could use the equilibrium distribution to characterize the sort mechanism.
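A quick numerical check of this equilibrium: the sketch below uses a hypothetical matrix A (chosen so the limit works out to (90, 10)ᵀ for N₀ = 100; the blog's actual components are in the omitted worksheet images) and applies it repeatedly to an arbitrary starting split.

```python
import numpy as np

# hypothetical faulty-sort matrix; each column sums to 1 and gives the
# probabilities that an item in that bin lands in the good or bad bin next
A = np.array([[0.95, 0.45],
              [0.05, 0.55]])   # eigenvector for eigenvalue 1 is (90, 10)/100

n = np.array([50.0, 50.0])     # any initial split of N0 = 100 items
for _ in range(50):
    n = A @ n                  # one pass of the faulty sort
print(n)                       # -> [90. 10.], independent of the start
```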

One can view a decision mechanism as a kind of sort. From this perspective it might be better to do an objective test of each item rather than rely on preconceived notions.

Note that the final result of the repeated sort is dependent on the process itself and not on whether or not a particular item is good or bad. It can degrade better sorts and improve poorer sorts.

Faulty Decisions


  Perhaps this might be a good time to review Bayesian inference and its application to decision making. Suppose we have two methods to assess the quality of items in a given sample. The first method is more exact but is also more difficult than the second. Let's call the results of the first test good (G) and bad (B) and those of the second pass (P) and fail (F). How might we go about comparing them?


We could assume that the probabilities of the two sets of outcomes are linearly related by a matrix transforming the first set of probabilities into the second. Since the results are either G or B in the first case and P or F in the second, only two components of the matrix are independent, so we choose the variables to be the cross terms.


So given p, q, b and c one can determine p' and q'. Alternatively one can take a statistical approach to the problem. In the example below ten sets of trials with N₀=100 were averaged to get estimates of the values for N₁, N₂, N₃, N₄, N₁₃, etc. Three stochastic variables sp, sa and sd were used, each set to 1 or 0 according to whether or not a random number between 0 and 1 inclusive was less than or equal to the probability associated with the variable.
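A sketch of one such set of trials in Python, reading sa and sd as "a good item is correctly passed" and "a bad item is correctly failed"; the rates are assumptions for illustration (pG from the calculated p' quoted in the second supplemental note, and the error rates from the paragraph after the results).

```python
import numpy as np

rng = np.random.default_rng(1)

N0, pG = 100, 0.95
a, d = 0.90, 0.80                  # assumed p(P|G) and p(F|B)
sp = rng.random(N0) <= pG          # 1 if the item is actually good
sa = rng.random(N0) <= a           # 1 if a good item would pass
sd = rng.random(N0) <= d           # 1 if a bad item would fail
passed = np.where(sp, sa, ~sd)     # P or F outcome for each item
print("good:", sp.sum(), " passed:", passed.sum())
```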


The following results were obtained.


The conditional probabilities are p(i|j) or "the probability of outcome i given j" where i and j are the number of the cells or alternatively G, B, P and F. It was assumed that there was a 10% chance that a G item would test as F and a 20% chance that a B item would test as P. There was only a 1% chance that a passed item would be bad but a 27% chance that a failed item would be bad.
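These numbers can be checked directly with Bayes' theorem, assuming pG = 0.95 (the value implied by the calculated p' = 0.865 quoted in the second supplemental note); the small differences from the 1% and 27% above reflect the fact that those were statistical estimates from the trials.

```python
# worked check of the conditional probabilities via Bayes' theorem
pG, pB = 0.95, 0.05                  # assumed prior quality rates
pF_G, pP_B = 0.10, 0.20              # p(F|G) and p(P|B) from the text
pP = pG * (1 - pF_G) + pB * pP_B     # p' = p(P) = 0.865
print(pP)                            # 0.865
print(pB * pP_B / pP)                # p(B|P) ~ 0.012, about 1%
print(pB * (1 - pP_B) / (1 - pP))    # p(B|F) ~ 0.296, near the 27% estimate
```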


The probability of passing a bad item ends up about twice the probability of failing a good item but the chance of encountering a bad item is relatively rare.

Supplemental (Sep 23): More on conditional probability can be found in Parzen, Modern Probability Theory and Its Applications. The paper cited is Bayes, An Essay towards Solving a Problem in the Doctrine of Chances.

Supplemental (Sep 23): The values above for p' and q' are the estimated values. The calculated values are p'=0.865 and q'=0.135.

Friday, August 31, 2018

An Electromagnetic Field Tensor


 From a mathematical perspective electromagnetic theory consists of laws of nature expressed in terms of vector analysis.



But what would we see if we looked at EM theory from a more physical perspective focusing on the forces and changes in momentum instead? We can start with the Lorentz force which involves electric and magnetic fields. The electric force produces changes in the motion of a charged particle in the direction of the electric field. The magnetic force involves changes in position and gives the deflection of the path of motion. So for differential changes in time and position we can associate changes in momentum or impulses acting on the particle.



It turns out that the fields are the coefficients of the differential changes in position and time that give the changes in momentum. We can add the change in energy, or work done on the particle, to the momentum changes to get a 4-dimensional picture of what's happening. And we end up with a field tensor, the result of applying the chain rule to each component of the momentum. So the field tensor can be expressed in terms of 4-dimensional gradients of the components of a momentum flow field describing the paths of identical test particles in neighboring positions. This appears to have been the approach that Maxwell adopted for EM theory.
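In modern notation (not Maxwell's own), these impulse relations can be collected into the covariant Lorentz force law. A standard form, assuming SI units, metric signature (+,−,−,−), x⁰ = ct and p⁰ = W/c, is

$$dp^{\mu} = q\,F^{\mu\nu}\,dx_{\nu},\qquad
F^{\mu\nu} =
\begin{pmatrix}
0 & -E_x/c & -E_y/c & -E_z/c\\
E_x/c & 0 & -B_z & B_y\\
E_y/c & B_z & 0 & -B_x\\
E_z/c & -B_y & B_x & 0
\end{pmatrix},$$

whose time component reproduces the work equation dW = qE·dx and whose space components give the impulse dp = q(E dt + dx×B).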

A Short Timeline for Early Electromagnetic Theory


  The introduction to electromagnetic theory usually involves mathematical statements of the physical laws governing the relations between forces, charges and their motion and the more abstract fields. Here are some Wikipedia articles dealing with some of the more important ones.

1785 Coulomb's law

1813 Gauss' Law

1820 Biot-Savart law

1820 Oersted's law

1823 Ampere's force law

1831 Faraday's law of induction

1855 Ampere's circuital law (Maxwell)

In the twenty years between 1855 and 1873 Maxwell wrote a number of works on electricity and magnetism attempting to develop mathematically Faraday's impressionistic lines of force approach to the subject.

1855 Maxwell, On Faraday's Lines of Force

1861 Maxwell, On Physical Lines of Force

1865 Maxwell, A Dynamical Theory of the Electromagnetic Field

1873 Maxwell, A Treatise on Electricity & Magnetism Vol 1, Vol 2

Part III of A Dynamical Theory of the Electromagnetic Field deals with the General Equations of the Electromagnetic Field in which Maxwell introduces such concepts as "electromotive force," "electromagnetic momentum," "magnetic force" and "electromotive force in a circuit." These deal with force fields rather than electric and magnetic fields and his approach appears to be less abstract than our modern theory of the subject. One can compare Maxwell's formulas with those in modern notation.

Tuesday, August 28, 2018

Why the Assumption of an Impulse Acting on Light May Have Worked


  If one looks at the change in phase in the plane of refraction along the boundary between the two media, one sees that the component of the wavevector along the boundary is the same for both wave functions, thus enabling us to match them along this line.


When we tried to explain refraction by the assumption of a vertical impulse acting on photons at the boundary we inadvertently assumed that the component of the wavevector on the boundary in the plane of refraction remained unchanged since the momentum of the photon is proportional to its wavevector.

Monday, August 27, 2018

A Physical Explanation of Refraction


  One can trace Snell's law of refraction to the boundary conditions acting on the electromagnetic wave equations for the electromagnetic fields. To show this we can assume that the incident, reflected and refracted rays are simple plane waves.


In what follows let ψ represent an arbitrary electric or magnetic field component. In the first medium the field is a sum of the incident and reflected fields, and there is only one field in the second medium. The relation between the angular frequency, wavenumber and phase velocity, ω = kv_ph = kc/n, allows us to express the wave vector for each medium in terms of the magnitude ω/c of the corresponding wave vector in a vacuum, giving k = nω/c. We assume the plane of incidence is the x,y-plane with horizontal axis x and vertical axis y. Along the boundary between the media y=0. For reflection we know that the reflected angle is equal to the angle of incidence so we can set β equal to α. This allows us to simplify the field equation for the first medium slightly. If a field is continuous along the boundary then its derivative along the boundary will also be the same for both media.


Equating these functions we get two equations for the wave amplitudes Aₖ, and the second equation yields Snell's law after eliminating the common factors.
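Explicitly, the continuity of ψ along y = 0 gives (with θ as an assumed label for the angle of refraction and A₁, A₂, A₃ the incident, reflected and refracted amplitudes)

$$(A_1 + A_2)\,e^{i(k_1 x \sin\alpha - \omega t)} = A_3\,e^{i(k_2 x \sin\theta - \omega t)},$$

which can hold for all x and t only if k₁ sin α = k₂ sin θ; substituting k = nω/c and cancelling ω/c gives n₁ sin α = n₂ sin θ, Snell's law.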



So the continuity of the fields appears to be the physical cause for the "broken" or refracted light path. The wave vectors do not depend on the values for the incident fields but we can expect the amplitudes of the reflected and refracted rays to depend on them.

Thursday, August 23, 2018

Interpreting Snell's Law of Refraction


  Back to Hamilton's Theory of Systems of Rays. How does one explain Snell's law of refraction, where the index of refraction n = sin i/sin r? Mathematically it's just a description of the relation between the paths of the incident ray and the refracted ray. Is there anything significant about the lengths of the lines corresponding to the sines? They're the altitudes of the triangles with vertices i, n̂₁ and the origin, and r, n̂₂ and the origin, but the ratio of the areas of these triangles is also equal to the index of refraction.


These triangles are isosceles so we could draw the altitude as perpendicular to the rays instead.


What we want is some sort of physical explanation for the law of refraction. We can find this in Feynman's Lectures on Physics, Vol. I, where the index of refraction is attributed to a phase change due to secondary waves caused by forced oscillations of the electrons in a plate of the transparent body, and not to a change in the speed of light in the refracting medium. The derivation assumed that the index of refraction differed by a small amount from that of a vacuum, which is 1. The relation for a denser medium is found in Vol. II. This explanation is essentially that of the classical dispersion theory introduced by Paul Drude a little over a century ago. A formula for normal dispersion can be found in his Lehrbuch der Optik of 1900.

Tuesday, August 21, 2018

Equivalent Quaternion Simplifies Calculations


Using an equivalent rotation quaternion requires less calculation to obtain the same results. In this slightly modified version of the previous rotation one just needs to keep track of the poles, their rotation quaternions and changes to the required data.


After computing the equivalent rotation quaternion we can rotate the data points for the vertices of the tetrahedron.


In the plot below the color code red, green, blue indicates the vertices a, b and c and the axes î, ĵ, k̂ respectively. The fourth vertex was originally at the origin before it was translated to the center of the tetrahedron.


In Excel the worksheet is automatically recalculated when the contents of a cell are changed, so when the index is changed by pressing either the shift right or shift left command button the plot is also recalculated and we can observe the resulting rotation.

Note one needs to be careful not to confuse the axes used to determine a rotation with axes that are rotated, which are treated as data. Here two of the original axes were used to determine the pole p̂' while the rotated k̂'' axis was used for p̂''.

Monday, August 20, 2018

Doing Stepped Rotations in Excel


  I've been trying to get Excel to do some simple rotation videos, using command buttons to step a plot through the rotations and PowerPoint's record tool to capture a video. This is the best system that I have been able to come up with so far.


The image in the Blogger player is a little blurry initially but clicking on it clears things up considerably. It doesn't appear that QGraphics is quite ready yet.

Sunday, August 19, 2018

Replacing a Series of Rotations With an Equivalent Rotation


 One can simplify the rotation of a set of data by replacing the series of rotations with a single equivalent rotation q̂eq=q̂''q̂'q̂. We need to follow the actions of the rotations on â and its images to compute the series of q̂s.


Next q̂eq is used to transform the data by setting x'=q̂eqxq̂eq*. This action was performed by a user function Qrot(p,q̂,x) acting on the upper left corner of the transformed data table followed by drag and fill to complete the table. One only needs to pass the pointer p, the rotation quaternion q̂, and the data point x to Qrot.
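A minimal Python sketch of what such a Qrot function does (the worksheet version is an Excel user function; the axes and angles below are illustrative rather than the post's data):

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def qaxis(axis, angle):
    """Half-angle rotation quaternion for a rotation about a unit axis."""
    axis = np.asarray(axis, float)
    axis /= np.linalg.norm(axis)
    return np.append(np.cos(angle/2.0), np.sin(angle/2.0)*axis)

def qrot(q, x):
    """Rotate the data point x by the unit quaternion q: x' = q x q*."""
    qc = q * np.array([1.0, -1.0, -1.0, -1.0])   # conjugate q*
    return qmul(qmul(q, np.append(0.0, x)), qc)[1:]

# compose three rotations into a single equivalent quaternion q_eq = q'' q' q
q1 = qaxis([0, 0, 1], np.pi/3)
q2 = qaxis([0, 1, 0], np.pi/4)
q3 = qaxis([1, 0, 0], np.pi/6)
q_eq = qmul(q3, qmul(q2, q1))
print(qrot(q_eq, [1.0, 0.0, 0.0]))
```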


One can plot the transformed data along with a set of transformed axes.


Note that the equilateral triangle is in the transformed î,ĵ-plane.

edit (Aug 19): Did some minor cleanup by changing x̂→x used to represent a data point since the data does not have to have unit magnitude.

Saturday, August 18, 2018

Results of the Previous Example Using the Hamilton Half-angle Formula


  Hamilton's quaternion half-angle rotation formula gives exactly the same results for the series of rotations in the previous blog with less calculation.

Equilateral triangle calculation,


Calculation for rotation by angle α about p̂ = k̂,


Calculation for rotation by angle β about p̂'=â'k̂,


Calculation for rotation by angle γ about p̂''=â'',


An Example Involving a Series of Rotations


  We can do a series of rotations to show that they do not affect the area of a parallelogram. We start with two vector quaternions that along with the origin form an equilateral triangle. The quaternion product gives us the area of the two parallelograms. Next we find the set of projection vectors for the vectors to be rotated, starting with a rotation of angle α about k̂. Next we compute the pole for a rotation of angle β in the â',k̂-plane. Finally we do a rotation of angle γ about â".


We first check the calculations done with the algebraic expressions against numerical calculations and compute the magnitude of the vector part of the product.


Doing the rotation of α about k̂ gives,


And a rotation of β about p̂' gives,


Finally the rotation of γ about â" gives,


Note that the magnitudes of the scalar and vector parts of the products are unchanged by the rotations. This method is rather cumbersome since we need the polar, normal and binormal parts for each vector rotated. Hamilton's half-angle triple products do not need them and the same q̂ works for all points subject to the same rotation.

Improved Quaternion Rotation Formulas


  One can use a vector quaternion to represent the pole of a rotation. To do the rotation of any vector quaternion properly one has to partition the quaternion into polar and normal parts, vp and vn. We need a third vector in the plane of rotation, the binormal, vb, to do the rotation.


Another formula for rotating quaternions was given by Hamilton and is explained in Kellog and Tait's Introduction to Quaternions. Since the square of the magnitude of q̂ equals 1, we can replace -1 by its conjugate q̂*. We start by setting =1cosφ+ sinφ. The product of the three quaternions gives essentially the same formula for the rotation but with θ replaced by 2φ.