In this post we are going to explore so-called orthogonal functions and some of their properties. We are also going to show that orthogonal functions are closely related to the least-squares approximation method. This alternative to the plain least-squares method can be helpful when least squares produces a linear system that is hard to solve.
Orthogonal Functions
First, we should introduce some important theory. Let's begin by stating that two \(n\)-dimensional vectors \(x\) and \(y\) are orthogonal if their components satisfy
\[ \sum_{i=1}^{n} x_i y_i = 0. \]
Now, if we let the dimension \(n\) grow to infinity, such that in the limit we can replace the vectors with continuous functions \(f_{1}(x)\) and \(f_{2}(x)\) and their sum approaches an integral, we have that two functions are orthogonal over the interval \(a \leq x \leq b\) if
\begin{equation}
\int_{a}^{b} f_{1}(x) f_{2}(x)\,dx = 0. \label{eq:1}
\end{equation}
In some cases a weight \(w(x) \geq 0\) is included, and then \eqref{eq:1} becomes
\[ \int_{a}^{b} w(x) f_{1}(x) f_{2}(x)\,dx = 0. \]
Similarly, if we have a set of functions \(f_{i}(x)\), \(i = 0, 1, \ldots\), they are said to be mutually orthogonal if
\[ \int_{a}^{b} f_{i}(x) f_{j}(x)\,dx = 0, \quad i \neq j. \]
Once again, a weight term \(w(x) \geq 0\) can be included, and then
\[ \int_{a}^{b} w(x) f_{i}(x) f_{j}(x)\,dx = 0, \quad i \neq j. \]
The above system is called orthonormal if
\[ \int_{a}^{b} w(x) f_{i}(x) f_{j}(x)\,dx = \begin{cases} 0, & i \neq j, \\ 1, & i = j. \end{cases} \]
Example
One of the most famous systems (families) of orthogonal functions is the trigonometric system
\[ 1,\ \cos{x},\ \sin{x},\ \cos{(2x)},\ \sin{(2x)},\ \ldots,\ \cos{(nx)},\ \sin{(nx)},\ \ldots \]
Part of the family is visualised below.
We can easily show that these functions are orthogonal over the interval \(0 \leq x \leq 2\pi\) because the following equations hold (by using the fundamental product-to-sum trigonometric identities)
\[ \int_{0}^{2\pi} \sin{(mx)} \cos{(nx)}\,dx = 0 \]
and
\[ \int_{0}^{2\pi} \sin{(mx)} \sin{(nx)}\,dx = \int_{0}^{2\pi} \cos{(mx)} \cos{(nx)}\,dx = \begin{cases} 0, & m \neq n, \\ \pi, & m = n \geq 1. \end{cases} \]
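For readers who want to double-check these relations numerically, here is a minimal sketch using NumPy and SciPy's quadrature (the snippet is illustrative and not part of the derivation):

```python
import numpy as np
from scipy.integrate import quad

# Check the orthogonality relations over [0, 2*pi] for a few m, n.
for m in range(1, 4):
    for n in range(1, 4):
        sin_cos = quad(lambda x: np.sin(m * x) * np.cos(n * x), 0, 2 * np.pi)[0]
        sin_sin = quad(lambda x: np.sin(m * x) * np.sin(n * x), 0, 2 * np.pi)[0]
        cos_cos = quad(lambda x: np.cos(m * x) * np.cos(n * x), 0, 2 * np.pi)[0]
        # sin_cos is ~0 for all m, n; sin_sin and cos_cos are ~0 for m != n
        # and ~pi for m == n.
        print(m, n, round(sin_cos, 8), round(sin_sin, 8), round(cos_cos, 8))
```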
Now, let's take a look at one possible application of these orthogonal functions. Let's assume that for a function \(F(x)\), \(0 \leq x \leq 2\pi\), we have the expansion
\begin{equation}
F(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos{(nx)} + b_n \sin{(nx)} \right). \label{eq:2}
\end{equation}
If we multiply \eqref{eq:2} with \(\cos{(mx)}\) and integrate over the range of \(x\), all terms but one vanish because of the orthogonality relations above, and we get
\begin{equation}
a_m = \frac{1}{\pi} \int_{0}^{2\pi} F(x) \cos{(mx)}\,dx. \label{eq:3}
\end{equation}
If we multiply \eqref{eq:2} with \(\sin{(mx)}\) and integrate over the range of \(x\), we get in the same way
\begin{equation}
b_m = \frac{1}{\pi} \int_{0}^{2\pi} F(x) \sin{(mx)}\,dx. \label{eq:4}
\end{equation}
But why did we even mention this? Because with \eqref{eq:3} and \eqref{eq:4} we can compute the coefficients in the function expansion directly. Moreover, this way of computing the \(a_m\) and \(b_m\) coefficients gives them the name Fourier coefficients. This is also valid for the general case of a system of orthogonal functions, meaning:
If we have the function expansion
\[ F(x) = \sum_{i} a_i g_i(x) \]
over a system of functions \(g_i(x)\), orthogonal with weight \(w(x)\) over \(a \leq x \leq b\), then the coefficients
\[ a_i = \frac{\int_{a}^{b} w(x) F(x) g_i(x)\,dx}{\int_{a}^{b} w(x) g_i^2(x)\,dx} \]
are the Fourier coefficients.
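To make this concrete, here is a short numerical sketch that computes the trigonometric Fourier coefficients of a sample function via \eqref{eq:3} and \eqref{eq:4} (NumPy/SciPy assumed; the helper name and the choice \(F(x) = x\) are mine, purely for illustration):

```python
import numpy as np
from scipy.integrate import quad

def fourier_coefficients(F, m_max):
    """a_m and b_m of F over [0, 2*pi], computed via eq. (3) and (4)."""
    a = [quad(lambda x: F(x) * np.cos(m * x), 0, 2 * np.pi)[0] / np.pi
         for m in range(m_max + 1)]
    b = [quad(lambda x: F(x) * np.sin(m * x), 0, 2 * np.pi)[0] / np.pi
         for m in range(m_max + 1)]
    return a, b

# F(x) = x has the known expansion x = pi - 2 * (sin x + sin(2x)/2 + ...)
a, b = fourier_coefficients(lambda x: x, 3)
print(np.round(a, 6))  # ~[6.283185, 0, 0, 0], i.e. a_0 / 2 = pi
print(np.round(b, 6))  # ~[0, -2, -1, -0.666667], i.e. b_m = -2 / m
```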
Linear Independence and Orthogonality: Connection
Here, we are going to show that linear independence and orthogonality are closely connected. For this purpose the first thing we have to show is that a system of orthogonal functions \(f_i(x)\) is linearly independent over the interval of interest.
Let's assume there exists a linear dependence between the functions \(f_i(x)\) with non-zero coefficients, or
\[ \sum_{i} c_i f_i(x) = 0 \]
for some \(c_j \neq 0\). Then, we multiply with \(w(x) f_j(x)\), \(w(x) \geq 0\), integrate over the interval, and get
\[ \sum_{i} c_i \int_{a}^{b} w(x) f_i(x) f_j(x)\,dx = 0. \]
From the orthogonality properties every term with \(i \neq j\) vanishes, so it follows that
\[ c_j \int_{a}^{b} w(x) f_j^2(x)\,dx = 0, \]
and since \(\int_{a}^{b} w(x) f_j^2(x)\,dx > 0\), we get \(c_j = 0\) for every \(j\). Thus, the assumed dependence doesn't exist, and the functions are linearly independent.
We are left with proving the opposite: that from a system of linearly independent functions we can construct an orthogonal system. We can do this with the help of the Gram-Schmidt process. Let \(f_i(x)\) be the set of linearly independent functions. We have
\[ \|f_0\| = \left( \int_{a}^{b} w(x) f_0^2(x)\,dx \right)^{1/2} > 0. \]
Then
\[ g_0(x) = \frac{f_0(x)}{\|f_0\|} \]
would be our first orthonormal function. By induction, we can assume we have constructed the first \(j\) orthonormal functions \(g_{i}(x)\), \(i = 0, 1, \ldots, j-1\). Let
\[ F_j(x) = f_j(x) + \sum_{i=0}^{j-1} a_i g_i(x). \]
We have that the functions \(f_i(x)\) are linearly independent, and that every \(g_{i}(x)\) is a linear combination of \(f_{k}(x)\), \(k \leq i\), thus
\[ F_j(x) \not\equiv 0. \]
We need
\[ \int_{a}^{b} w(x) F_j(x) g_i(x)\,dx = 0, \quad i = 0, 1, \ldots, j-1. \]
From the definition of \(F_j(x)\), and because the \(g_i(x)\) are orthonormal, the above equation becomes
\[ \int_{a}^{b} w(x) f_j(x) g_i(x)\,dx + a_i = 0. \]
From here we can determine \(a_i = -\int_{a}^{b} w(x) f_j(x) g_i(x)\,dx\), and hence \(F_j(x)\) as well. We have to "norm" \(F_j(x)\), so we compute
\[ \|F_j\| = \left( \int_{a}^{b} w(x) F_j^2(x)\,dx \right)^{1/2} \]
and let
\[ g_j(x) = \frac{F_j(x)}{\|F_j\|}. \]
This ends the induction step.
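The induction above translates almost line by line into code. Below is a minimal sketch of the Gram-Schmidt construction (NumPy/SciPy assumed; the function names are mine). Applied to \(1, x, x^2\) on \([-1, 1]\) with \(w(x) = 1\), it reproduces the normalized Legendre polynomials:

```python
import numpy as np
from scipy.integrate import quad

def gram_schmidt(fs, a=-1.0, b=1.0, w=lambda x: 1.0):
    """Orthonormalize the functions fs over [a, b] with weight w."""
    def inner(u, v):
        return quad(lambda x: w(x) * u(x) * v(x), a, b)[0]

    gs = []
    for f in fs:
        # a_i = -<f, g_i>, exactly as in the induction step above
        coeffs = [-inner(f, g) for g in gs]
        F = lambda x, f=f, cs=tuple(coeffs), gs=tuple(gs): (
            f(x) + sum(c * g(x) for c, g in zip(cs, gs)))
        gs.append(lambda x, F=F, norm=np.sqrt(inner(F, F)): F(x) / norm)
    return gs

# 1, x, x^2 over [-1, 1] -> normalized Legendre polynomials
g0, g1, g2 = gram_schmidt([lambda x: 1.0, lambda x: x, lambda x: x * x])
print(g2(0.5))  # ~ sqrt(5/8) * (3 * 0.25 - 1) ~ -0.1976
```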
If we have a finite number \(N\) of nodes \(x_m\), there exist at most \(N\) linearly independent functions \(f_j(x_m)\) over these nodes, and \(N\) such functions do exist, for example the system
\[ g_j(x_m) = x_m^j, \quad j = 0, 1, \ldots, N-1, \]
where no subset of these \(N\) functions \(g_j(x_m)\) can be linearly dependent (for distinct nodes the corresponding Vandermonde determinant is non-zero).
Least Squares and the Fourier Coefficients
Finally, we are at the point of showing that the least-squares method is closely connected to orthogonality.
Theorem 1. The Fourier coefficients \(a_j\) give the best least-squares approximation when the function \(F(x)\) is expanded over an orthogonal system of functions \(g_j(x)\).
In order to prove the theorem we have to minimize the least-squares error of an arbitrary expansion:
\[ E(c_0, \ldots, c_M) = \int_{a}^{b} w(x) \left( F(x) - \sum_{i=0}^{M} c_i g_i(x) \right)^2 dx. \]
Expanding the square and using the orthogonality of the \(g_i(x)\) together with the definition of the Fourier coefficients \(a_i\), we get
\[ E = \int_{a}^{b} w(x) F^2(x)\,dx + \sum_{i=0}^{M} (c_i - a_i)^2 \int_{a}^{b} w(x) g_i^2(x)\,dx - \sum_{i=0}^{M} a_i^2 \int_{a}^{b} w(x) g_i^2(x)\,dx. \]
The above expression achieves its minimum only when \(c_i = a_i\), which is what we wanted.
It is useful to note that in the best approximation with orthogonal functions every coefficient \(a_i\) is determined independently of the others. Thus, if we decide to change the number of functions \(g_i(x)\) used, there is no need to re-evaluate the coefficients we have already computed.
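A small numerical illustration of this property (NumPy assumed; the choice of target function and bases is mine): with an orthogonal trigonometric basis the fitted coefficients stay put when the basis grows, while with non-orthogonal monomials they all shift.

```python
import numpy as np

# Discrete least-squares fit of F(x) = exp(x / pi) on [0, 2*pi).
# Equispaced samples without the endpoint keep the trig basis orthogonal
# in the discrete inner product as well.
x = np.linspace(0, 2 * np.pi, 1000, endpoint=False)
F = np.exp(x / np.pi)

def fit(basis):
    A = np.column_stack([f(x) for f in basis])
    return np.linalg.lstsq(A, F, rcond=None)[0]

trig = [lambda t: np.ones_like(t), np.sin, lambda t: np.sin(2 * t)]
mono = [lambda t: np.ones_like(t), lambda t: t, lambda t: t ** 2]

print(fit(trig[:2]), fit(trig))  # shared coefficients agree (up to rounding)
print(fit(mono[:2]), fit(mono))  # every coefficient changes
```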
This motivates us to explore the next problem. If the coefficients \(c_i\) of the function \(F(x)\), when approximating with the least-squares method via the system \(\mu_i(x)\), do not change when we change the number of functions \(\mu_i(x)\) used, then the functions \(\mu_i(x)\) are orthogonal. Let
\[ g(c_0, \ldots, c_M) = \int_{a}^{b} w(x) \left( F(x) - \sum_{i=0}^{M} c_i \mu_i(x) \right)^2 dx. \]
Once again, we have to minimize \(g\). Thus,
\[ \frac{\partial g}{\partial c_j} = -2 \int_{a}^{b} w(x) \left( F(x) - \sum_{i=0}^{M} c_i \mu_i(x) \right) \mu_j(x)\,dx = 0, \]
or
\[ \sum_{i=0}^{M} c_i \int_{a}^{b} w(x) \mu_i(x) \mu_j(x)\,dx = \int_{a}^{b} w(x) F(x) \mu_j(x)\,dx, \quad j = 0, 1, \ldots, M. \]
If the above property is true for all \(M\), then it should be true for \(M+1\) as well:
\[ \sum_{i=0}^{M+1} c_i \int_{a}^{b} w(x) \mu_i(x) \mu_j(x)\,dx = \int_{a}^{b} w(x) F(x) \mu_j(x)\,dx, \quad j = 0, 1, \ldots, M, \]
where, by assumption, the coefficients \(c_0, \ldots, c_M\) are the same as before. Now, after subtracting the previous equation from it, we get
\[ c_{M+1} \int_{a}^{b} w(x) \mu_{M+1}(x) \mu_j(x)\,dx = 0 \]
for every \(j \leq M\). Simply said, \(\mu_{M+1}(x)\) is orthogonal to every \(\mu_j(x)\) for an arbitrary \(M\).
To summarise, we saw that orthogonal functions, the Fourier coefficients, and approximation via the least-squares method are closely connected.