Saturday, August 31, 2013
Karl Pearson - Lines & Planes of Closest Fit
I took another look at Cramer's Numerical Methods of Statistics and the section on Orthogonal Mean Square Regression (p. 309) seems more familiar now with the derivations of the last few blogs filling in a lot of the gaps. The section also cites Karl Pearson's 1901 paper On Lines and Planes of Closest Fit to Systems of Points in Space. This paper provides a lot of the missing details on the history of using normal errors but appears a little antiquated now. There are a lot of advantages to the approach using the Calculus of Variations. Pearson's analysis is a little different from the reduction of the variational equations to an eigenvector equation that was used here. It has come to be known as Principal Component Analysis.
Edits (Aug 31): Pearson link & Principal Component Analysis
Supplemental (Sep 17): Pearson's paper was published in the Nov 1901 issue of Philosophical Magazine. The use of perpendicular deviations from a line was mentioned in a letter to the editor in the March 1901 issue of Nature by Ravenshear.
Friday, August 30, 2013
Calculating Machines in the 19th Century
The advances in mathematics in 19th Century spurred the development of calculating machines. Here are some articles to help give a rough impression of what was happening in this area.
18-20th Cent. history of the mechanical calculator
1814 Planimeter
1822 Difference Engine
1837 Analytical Engine
1855 Maxwell's Platometer
1890 Continuous Calculating Machines - Thompson & Tait
Thursday, August 29, 2013
Wednesday, August 28, 2013
Use of "Characteristic" in 19th Cent. Math
I tried googling "characteristic" in Google Books and found a number of relevant 19th Cent. references that involve its use. One of the books also included expressions for the distances of a point from a line and a plane. One must realize that the use of vectors came later than much of this work. Gibbs and Heaviside were among the chief promoters of the use of vectors by themselves. They were originally introduced as a part of Hamilton's quaternions.
1830 Hymers, Analytical Geometry
Example: characteristic property of a surface:
differential equation valid for every pt on surface
1838 Young, Analytical Geometry
analytical expression must characterize the position of every point in the curve
characteristic equation
1904 Scott, Determinants (1st ed. 1880)
characteristic function & equation for a matrix
Also,
1830 Hymers, Analytical Geometry
distances of a point from a line and a plane
Linear Fit: Best Choice for the Reference Point
In the last blog the reference point for the line was assumed to be the average of the data points. This can be deduced by treating the components of the reference as parameters to be determined as part of the variation. In assigning r0 for the reference point the zero subscript was chosen to indicate that it was the value for the function r(ρ) when ρ was set equal to zero. To avoid the possible confusion resulting from it being mistaken for a component of the vector r a better choice for it would be rref. We can start with the result that was found for the estimate of ρ and the reduced expression for δ.
The second line up from the bottom line gives a result involving rref and the average with some additional factors involving P-bar which is a function of e and the unit vectors for the axes. This expression has to be satisfied for all choices for the unit vectors and all possible P-bars since e is still a variable. This requires that the first factor in the expression be set equal to zero and as a result we get rref =<r>.
This example illustrates how one can reduce the variation problem by solving for independent variables separately. This leaves the problem of the constraint and simultaneous variation of parameters till last but the expressions involved have been simplified. Each constraint adds a Lagrange multiplier.
Tuesday, August 27, 2013
Using Normal Errors for Line Fits
One can also derive a "characteristic" equation for the fit of a line starting with the following figure to set up the problem. An arbitrary point, r, can be written as the sum of three vectors linking r with the origin. We need an arbitrary reference point, r0, on the line and the part of r that remains can be split into a part along the line, ρe, and a perpendicular deviation from the line, δ.
For the point on the line we can use the average position of the data points so we set r0=<r>. Angle brackets are used here to indicate an average. Using the Calculus of Variations we can solve for the unknown parameter ρ and simplify the expressions for δ and the variance, V. P-bar is a projection operator that eliminates the part of a vector in the direction of the line, e. Again the expression for the variance has to be modified to Φ to allow arbitrary variations of e.
To find the change of Φ corresponding to a change in e we need to express e in terms of the unit vectors in the directions of the axes and the direction cosines of e. This allows us to determine the change in P-bar and the constraint that requires the e be a unit vector. The only way to get dΦ=0 for arbitrary changes in the directions cosines is to set the expression in square brackets equal to zero. The result is another eigenvector equation. This may be what is referred to as "the characteristic equation." Eigenvectors are sometimes referred to as "characteristic vectors" which may have come from 19th century terminology.
The expression for the matrix is rather complicated involving the data points and matrices for which the i,jth component is equal to unity and the rest are zero. Expressing the data points in in terms of the unit vectors allows us to simplify the expressions for the components of M.
We get the same result that we found for fitting planes and their nD analogues. But for the nD line giving the best fit we have to choose the largest eigenvalue and the corresponding eigenvector for e since the data points should be spread out most along the line.
Monday, August 26, 2013
Using Normal Errors for Data Fits
We found that there was a simple formula for measuring the distance of a point from a plane if its distances from the plane along direction of the axes were known. For a linear curve fit one usually does the fit for just one axis if the unit of measurement is not the same in all the dimensions. But if the measurements in all direction are made using the same unit we can try an alternative method for doing the fit using the normal distance of a data point from a line or a plane. Starting with a general definition of a line or plane, r·e=λ, we can derive a formula for the variance of the data, V=eTRe, where e is the unknown direction of the normal and R is a matrix that is just a function of the data. The Einstein convention is used where repeated subscripts in an expression indicates a summation. The derivation works for an arbitrary number of dimensions which I refer to as nD.
If we replace R and an arbitrary matrix, M, and let V=eTMe we can show that the best fit is an eigenvector using the Calculus of Variations. The direction, e, is a solution of the eigenvector equation (M+MT)e=μe and the best solution for e is the eigenvector corresponding to the smallest eigenvalue for μ. Some of the details have been left out of the derivation of this result below to save space but it shouldn't be too difficult to fill them in. It too is an nD proof.
Note that the Lagrange multiplier, μ, has been included in Φ to allow for an arbitrary variation of e since it is subject to the constraint that the magnitude of e is 1. M+MT is twice the symmetric part of the matrix M. One can show that R is symmetric so R+RT=2R and, as a consequence, e is one of its eigenvectors.
Wednesday, August 21, 2013
Ptolemy's 2nd Refraction Experiment
While reviewing some of the history of optics over the weekend I came across Ptolemy's experiments to measure the amount of refraction that light undergoes when going from one media to another. The first experiment was for light passing from air into water. His second experiment was for light passing from air into pure glass. Ptolemy did not publish a law governing refraction. That came later when Snell determined that the ratio of the sine of the angle of incidence, i, to the sine of the angle of refraction, r, was a constant, n, known as the index of refraction. We can use this law to see how good Ptolemy's measurements were. The plot for the second experiment is shown below. The angle of incidence ranged from 10° to 80° in steps of 10° measured from the vertical.
The fit indicates that for the glass that Ptolemy used the index of refraction was n=1.519 which is typical of crown glass. The Roman glass of Ptolemy's time was made from sand with soda (sodium carbonate) or natron to reduce the melting point of sand. Pliny's description of glass making confirms this. For more on the history of Roman glass see Dillon, Glass.
The History of Rays & Normal Planes in Optics.
Hamilton was not the first to study the use of rays and normal planes in optics but he put the subject on a sound mathematical ground. Newton's books on optics dwelt primarily with physical optics which was concerned with optical phenomena. When Huygens was developing his wave theory of light in the 1600's he postulated that as light propagated the surface of each wave front acted as a new source of waves which combined to form a new wave front. In the 1760 Lambert published Photometria which dealt with the variation of the intensity of light with distance. Malus in his 1808 paper, Optique, wrote about the rectilinear propagation of light with surfaces and systems of rays. So when Hamilton's works on optics were published in the late 1820's and 1830's much of its content was already known. His characteristic function and the associated differential equations were arrived at using the Calculus of Variations which Lagrange had used in Mechanics.
Hamilton was a contemporary of Faraday who developed a field theory to help understand electromagnetism. In it he used the idea of tubes of flux. Hamilton's study of systems of rays and surfaces would have put flux tubes on solid mathematical ground. This and his discovery of quaternions were later used by Maxwell to develop a mathematical theory of Electricity and Magnetism.
Computing Elliptic Arcs Using Polar Coordinates
Computing the length of elliptic arcs with polar coordinates is a little more challenging. To determine a point on an ellipse we need to know the point's distance from the origin and the direction to it. Using differential geometry allows us to derive the formula for the differential arc length in terms of the geocentric polar angle or colatitude, θ.
The radial distance of the point can be solved for using the formula for an ellipse in Cartesian coordinates and once this is known we can find the rate of its change with respect to the polar angle.
We can integrate the formula for the differential arc length using the functions derived for polar coordinates and compare with the previous results found using the parametric polar angle. In this calculation a=1.
The results show that the semi-elliptic arc length is only dependent on the ratio of the semi-minor to the semi-major axes, ε, and not the way in which the arc length is calculated.
Saturday, August 17, 2013
Hamilton's Papers on Dynamics
Hamilton's papers on dynamics followed shortly after the one on Paths of Light and the Planets. In them he used the Principal Function to determine the motion of a massive body, pairs of bodies and and also dwelt with the two-body problem with an application to the orbit of a planet about the Sun and the three-body problem. There were two parts.
General Method in Dynamics I (1834)
General Method in Dynamics II (1835)
In the second paper he uses tricks line substituting partials of the Principal Function for a second set of parameters and two sets of first order differential equations giving the time rate of change of the sets of parameters. One can compare the modern versions known as the Hamilton-Jacobi equation and Hamilton's equations.
Friday, August 16, 2013
Miller Indices
If one knows the points where a plane intercepts the coordinate axis one can easily determine the equation for the plane. Given the distances along the axes, x, y and z one can find the distance ρ of the proximal point, p, the closest point on the plane to the origin. This can then be used to find the direction, e, by finding the direction cosines, α, β and γ, and then we have the equation for the plane.
The intercepts for the plane are related to the Miller indices, (h k ℓ), used in crystallography where the planes are defined in terms of the unit cells of the crystal. A Euclidean coordinate system in which all directions can be measure using the same unit of length is analogous to a cubic crystal lattice. The formula for the distance of the plane is similar to our formula for the distance, ρ, of the plane from the origin if the lattice constant, a=1. So, given the Miller indices, we can find the distance and normal of a crystal plane.
We see that the Miller indices, (h k ℓ), are inversely proportional to distances of the intercepts from the origin.
Supplemental (Aug 16): The reason the crystallographers use Miller indices may be because they behave like wave vectors.
Hamilton Papers on Optics
Hamilton's paper on the Paths of Light and the Planets was more in the nature of a review of what he had already done and may have been intended for a broader audience than those interested in mathematical optics. The presentation appears to be overly simplistic and he may have caved in a little to his publishers. The first part of his paper was read in 1824 and it was not until 1828 that it was published in the Transactions of the Royal Irish Academy. The original plan was for three parts but time delays led him to modify this by publishing three additional supplements.
Hamilton's Theory of Systems of Rays
1st part, 1st supplement, 2nd supplement, 3rd supplement
Some of the topics mentioned were surfaces of constant action, surfaces and perpendicular rays and their relation to the characteristic function, the use of the elliptic integral of the 1st kind as a small angle approximation for the intensity of light near the focus when there is aberration present. He also gave the partial differential equation for the characteristic function with its dependence on the index of refraction. Hamilton also computed the 24 coefficients needed for extraordinary refraction.
Some loose ends:
1.) There are other uses for the elliptic integral of the 1st kind. It can be used for example to compute the nonlinear motion of a pendulum.
2.) The problem of determining the distance of a point from a plane using the length of the edges of rectangular triangular pyramid is an example of a set of covariant coordinates. One can think of them as the normal distances from the planes parallel to the coordinate planes through the point. The terms covariant and contravariant were introduced by Sylvester in a paper on Algebraic Forms publish in 1851. Covariant and contravariant coordinates are used in General Relativity.
3.) Extraordinary refraction involves crystals whose properties depend on direction and therefore involve tensors.
Hamilton clearly was ahead of his time and help point people in the direction that led to our current understanding of nature.
Wednesday, August 14, 2013
Distance of a Point from a Plane Example
In the last post we found a formula for calculating the distance of a point from a plane using the edges of a triangular pyramid which are parallel to the coordinate axes. Determining the edges of the pyramid if the plane and point are known can be found without too much difficulty. We start with a given plane and point in some coordinate system and translate the origin of the coordinate system to that of the point.
In the new coordinate system we can determine the length of the edges as follows.
To verify that the formula works we can work a simple example by selecting an arbitrary plane and a point and then translate the origin to the point. To find two unit vectors in the plane we determine the eigenvectors of the projection operator for e.
Then we can determine the unknown projections onto the plane to find the values of x, y and z.
The result for the triangular pyramid formula agrees with distance determined by using data from the original coordinate system.
Tuesday, August 13, 2013
Hamilton's Characteristic Function
Let's see if we can illuminate what Hamilton said about the characteristic function in his article on the Paths of Light and the Planets. First he assumed that characteristic function was invariant for changes in the path and this allowed him to show that the path was a straight line. One can add that since the characteristic function is independent of the path it is a "state" function and depends only on the initial and final points of the motion. To understand the derivation of the differential equation for the characteristic function let's look at the distance of a plane from the origin. The distance can be expressed as a function of the distances of the points where the plane intercepts the coordinate axes.
A translation of the origin doesn't change the distance of the point from the plane so we know the distance of the point from the in terms of its distances from the plane measured parallel to the axes. If we keep the direction defining the plane fixed and allow the distance of the plane to vary we get its change in distance in terms of the partial derivatives and this allows us to derive a differential equation for the characteristic function. Using the gradient allows us to simplify the equation some.
Hamilton's characteristic function allows us to determine a set of paths for the motion of particles starting from varying positions and set up surfaces for which the characteristic function is constant. The paths and the surfaces are normal to each other. The straight lines are a special case but it works for light when the index of refraction is constant everywhere. If we have a set of parallel paths the corresponding surfaces will be planes. If they radiate from a point the surfaces will be spherical.
Supplemental (Aug 21): I found the equation for the normal distance of a point from a plane in the 1980's while looking for an alternative to the usual way of measuring errors vertically in least squares. Something clicked when I was trying to illuminate what Hamilton was doing.
Monday, August 12, 2013
Convergence of the Complete Elliptic Integral of the 2nd Kind
As mentioned previously the power series for the complete elliptic integral of the 2nd kind doesn't converge very rapidly when the modulus κ=±1. There is also a problem with using double factorials since they grow rapidly and the number of terms is limited to 150 in Mathcad 11 which is when the numbers approach the upper limit on the numbers that can be represented.
For large n the ratio of the (2n-1)/2n approaches 1 and it is easier to use a recurrence relation to compute the new term of the sum using the previous result. The simple program shown below can be used to sum a larger number of terms of the series.
One finds that the sum of the series is nearly identical with the result the Mathcad got using numerical integration.
Even with the large number of terms there is still some error near κ=1.
The error is approximately 1/4 divided by the number of terms used which in this case was 10,000.
Plots for Hamilton's Variations of the Paths
In his paper on the Paths of Light and the Planets Hamilton was trying to show what happen to the length of the path as one varies it using circular arcs and elliptic arcs. In each case the minimum length was the length of the straight line when ε=0 and the purpose of the "transformed" expressions was to show that lengths for the nonzero ε's were greater. For both sets of arcs -1≤ε≤1 with ε=±1 corresponding to a semi-circular arc. For ε<0 the arcs are below the straight line. The plots are roughly parabolic in shape but there doesn't appear to be a power series expansion about ε=0 for the set of elliptic arcs. The integral used for the arc length is related to the complete elliptic integral of the second kind. One just has to make the substitution cos2φ=1-sin2φ and one gets k2=1-ε2 which indicates that k in the elliptic integral is the eccentricity of the ellipse. Hamilton may have used the set of ellipses since Jacobi had published a work on elliptic integrals a few years earlier.
Friday, August 9, 2013
Illuminating Hamilton on the Path of Light & Planets
In 1833 Hamilton wrote a paper on the path of light and the planets which dealt with the use of a characteristic or generating function to determine the path of motion. He included two examples which showed that a straight path was the minimum path for some alternative paths. The first example was that of a straight line of length V and the set of circular arcs, Vε, for which V is the chord. The two are shown as bold lines in the figure below. The parameter ε is determined by the ratio of the height of the arc, a, to 1/2 V.
The formula for the length of the arc, Vε, given in Eqn 6 can be derived as follows.
For small ε, tan-1ε is approximately ε and the factor on the right hand side of the bottom line involving ε reduces to ε2+1 and therefore Vε=V when ε=0.
Calculation shows that Hamilton's expression in Eqn 7 for the length of the arc is not equal to that obtained from the formula above. One can derive a similar formula that works but first one needs to derive an integral expression for θ=tan-1ε.
This expression can be integrated by parts and substituted into the formula in Eqn 6 to get a new formula for the length of the arc.
Once again Vε=V when ε=0 and the product of the two factors in the second term on the right increases with increasing positive ε. Since the formula involves functions of ε2 and their averages it is symmetric, that is, it has the same value when ε is replaced by -ε.
The second example involves a straight line and elliptic arcs with V, Vε and ε given in the figure below.
One can derive Hamilton's formula in Eqn 10 as follows allowing for differences in notation.
Again one sees that Vε=V when ε=0 and since the second term on the right hand side is a sum of positive quantities it increases with ε. The integral in Eqn 9 is the length of the elliptic arc and is smallest when ε=0 since only positive quantities are involved in the summation. One can also show that Vε is symmetric and that if 0<ε<η then, for I(ε)=Vε/V, 1<I(ε)<I(η) and so I(ε) increases with positive ε.
Hamilton was not the first to use the minimum path approach in mechanics. Over one hundred years earlier the Bernouli brothers showed that the brachistochrone is the path of minimum time for a descent between any two points.
Supplemental (Aug 20): The figure shown is not correct. Eqn. (9) in Hamilton's paper for the arc length indicates that the φ is the polar angle corresponding to the "parametric latitude." It doesn't indicate the direction of a point on the ellipse but its use simplifies the necessary equations.