Ritz method

In the study of differential equations, the Ritz method is a direct method for finding an approximate solution to boundary value problems. The method is named after Walther Ritz. Alternative formulations include the Rayleigh–Ritz method and the Ritz–Galerkin method.

In quantum mechanics, a system of particles can be described in terms of an "energy functional" or Hamiltonian, which measures the energy of any proposed configuration of those particles. Certain privileged configurations turn out to be more likely than others, and this is connected to the eigenanalysis ("analysis of characteristics") of the Hamiltonian. Because it is often impossible to analyze all of the infinitely many configurations of particles to find the one with the least energy, it becomes essential to approximate the Hamiltonian in some way for the purpose of numerical computation.

The Ritz method can be used to achieve this goal. In the language of mathematics, it is exactly the finite element method used to compute the eigenvectors and eigenvalues of a Hamiltonian system.

Definitions

As with other variational methods, a trial wave function, [math]\displaystyle{ \Psi }[/math], is tested on the system. This trial function is selected to meet the boundary conditions (and any other physical constraints). The exact function is not known; the trial function contains one or more adjustable parameters, which are varied to find the lowest-energy configuration.

It can be shown that the ground state energy, [math]\displaystyle{ E_0 }[/math], satisfies an inequality: [math]\displaystyle{ E_0 \le \frac{\langle \Psi | \hat{H}| \Psi \rangle}{\langle \Psi | \Psi \rangle}. }[/math]

That is, the ground-state energy is less than or equal to this value; the trial wave function will always give an expectation value greater than or equal to the ground-state energy.
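
To make the variational idea concrete, here is a minimal numerical sketch (not part of the original article) for the one-dimensional harmonic oscillator [math]\displaystyle{ \hat{H} = -\tfrac{1}{2}\,d^2/dx^2 + \tfrac{1}{2}x^2 }[/math] in units where [math]\displaystyle{ \hbar = m = 1 }[/math], using a Gaussian trial function [math]\displaystyle{ \Psi_a(x) = e^{-ax^2} }[/math] with a single adjustable parameter a. Minimizing the expectation value over a recovers the exact ground-state energy 1/2 here because the true ground state happens to be Gaussian; in general the minimized value remains only an upper bound.

```python
# Minimal sketch of the variational bound for the 1D harmonic oscillator
# (hbar = m = 1, H = -(1/2) d^2/dx^2 + (1/2) x^2; exact ground energy is 1/2).
# The Gaussian trial family and the parameter bounds are illustrative choices.
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar

def rayleigh_quotient(a):
    psi = lambda x: np.exp(-a * x**2)                       # trial function
    # H psi = -(1/2) psi'' + (1/2) x^2 psi, with psi'' = (4a^2 x^2 - 2a) psi
    Hpsi = lambda x: (a - 2 * a**2 * x**2 + 0.5 * x**2) * psi(x)
    num = quad(lambda x: psi(x) * Hpsi(x), -np.inf, np.inf)[0]
    den = quad(lambda x: psi(x)**2, -np.inf, np.inf)[0]
    return num / den                                        # <H> for this trial

res = minimize_scalar(rayleigh_quotient, bounds=(0.1, 2.0), method='bounded')
print(res.x, res.fun)   # a ~ 0.5, energy ~ 0.5 (equals the exact ground state)
```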

If the trial wave function is known to be orthogonal to the ground state, then it provides an upper bound for the energy of some excited state.

The Ritz ansatz function is a linear combination of N known basis functions [math]\displaystyle{ \left\lbrace\Psi_i\right\rbrace }[/math], parametrized by unknown coefficients: [math]\displaystyle{ \Psi = \sum_{i=1}^N c_i \Psi_i. }[/math]

With a known Hamiltonian, we can write its expectation value as [math]\displaystyle{ \varepsilon = \frac{\left\langle \displaystyle\sum_{i=1}^N c_i\Psi_i \right| \hat{H} \left| \displaystyle\sum_{i=1}^Nc_i\Psi_i \right\rangle}{\left\langle \left. \displaystyle\sum_{i=1}^N c_i\Psi_i \right| \displaystyle\sum_{i=1}^Nc_i\Psi_i \right\rangle} = \frac{\displaystyle\sum_{i=1}^N\displaystyle\sum_{j=1}^Nc_i^*c_jH_{ij}}{\displaystyle\sum_{i=1}^N\displaystyle\sum_{j=1}^Nc_i^*c_jS_{ij}} \equiv \frac{A}{B}. }[/math] Here [math]\displaystyle{ H_{ij} = \langle \Psi_i | \hat{H} | \Psi_j \rangle }[/math] and [math]\displaystyle{ S_{ij} = \langle \Psi_i | \Psi_j \rangle }[/math] are the Hamiltonian and overlap matrix elements, respectively.

The basis functions are usually not orthogonal, so the overlap matrix S has nonzero off-diagonal elements. Either [math]\displaystyle{ \left\lbrace c_i \right\rbrace }[/math] or [math]\displaystyle{ \left\lbrace c_i^* \right\rbrace }[/math] (the complex conjugates of the first) can be used to minimize the expectation value. For instance, by setting the partial derivatives of [math]\displaystyle{ \varepsilon }[/math] with respect to [math]\displaystyle{ \left\lbrace c_i^* \right\rbrace }[/math] to zero, the following equality is obtained for every k = 1, 2, ..., N: [math]\displaystyle{ \frac{\partial\varepsilon}{\partial c_k^*} = \frac{\displaystyle\sum_{j=1}^Nc_j(H_{kj}-\varepsilon S_{kj})}{B} = 0, }[/math] which leads to a set of N secular equations: [math]\displaystyle{ \sum_{j=1}^N c_j \left( H_{kj} - \varepsilon S_{kj} \right) = 0 \quad \text{for} \quad k = 1,2,\dots,N. }[/math]

In the above equations, the energy [math]\displaystyle{ \varepsilon }[/math] and the coefficients [math]\displaystyle{ \left\lbrace c_j \right\rbrace }[/math] are unknown. With respect to c, this is a homogeneous set of linear equations, which has a nontrivial solution only when the determinant of the coefficients of these unknowns is zero: [math]\displaystyle{ \det \left( H - \varepsilon S \right) = 0, }[/math] which in turn is true only for N values of [math]\displaystyle{ \varepsilon }[/math]. Furthermore, since the Hamiltonian is a Hermitian operator, the H matrix is also Hermitian and the values of [math]\displaystyle{ \varepsilon_i }[/math] will be real. The lowest value among the [math]\displaystyle{ \varepsilon_i }[/math] (i = 1, 2, ..., N), [math]\displaystyle{ \varepsilon_0 }[/math], will be the best approximation to the ground state for the basis functions used. The remaining N − 1 energies are estimates of excited-state energies. An approximation for the wave function of state i can be obtained by finding the coefficients [math]\displaystyle{ \left\lbrace c_j \right\rbrace }[/math] from the corresponding secular equation.
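
As a hedged numerical sketch of these secular equations, the following fragment applies the Ritz method to a particle in a box on [0, 1] (units with [math]\displaystyle{ \hbar = m = 1 }[/math], so [math]\displaystyle{ \hat{H} = -\tfrac{1}{2}\,d^2/dx^2 }[/math] and the exact energies are [math]\displaystyle{ n^2\pi^2/2 }[/math]), using the non-orthogonal polynomial basis [math]\displaystyle{ \Psi_n(x) = x^n(1-x) }[/math], an illustrative choice. Rather than locating the roots of the determinant, it solves the equivalent generalized eigenvalue problem [math]\displaystyle{ H\mathbf{c} = \varepsilon S\mathbf{c} }[/math] directly:

```python
# Ritz method for a particle in a box on [0, 1] (hbar = m = 1).
# Basis phi_n(x) = x^n (1 - x), n = 1..N: satisfies phi(0) = phi(1) = 0
# but is not orthogonal, so the overlap matrix S is needed.
import numpy as np
from scipy.integrate import quad
from scipy.linalg import eigh

N = 4

def phi(n, x):
    return x**n * (1 - x)

def dphi(n, x):
    return n * x**(n - 1) * (1 - x) - x**n

H = np.empty((N, N))
S = np.empty((N, N))
for i in range(1, N + 1):
    for j in range(1, N + 1):
        # Integrating <phi_i | -(1/2) phi_j''> by parts (the boundary terms
        # vanish) gives H_ij = (1/2) * integral of phi_i' phi_j'.
        H[i - 1, j - 1] = 0.5 * quad(lambda x: dphi(i, x) * dphi(j, x), 0, 1)[0]
        S[i - 1, j - 1] = quad(lambda x: phi(i, x) * phi(j, x), 0, 1)[0]

eps, C = eigh(H, S)              # secular equations as H c = eps S c
print(eps[:2])                   # lowest estimates, vs exact pi^2/2 and 2 pi^2
```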

Applications in mechanical engineering

The Rayleigh–Ritz method is often used in mechanical engineering for finding the approximate real resonant frequencies of multi-degree-of-freedom systems, such as spring-mass systems or flywheels on a shaft with varying cross-section. It is an extension of Rayleigh's method. It can also be used for finding buckling loads and post-buckling behaviour for columns.

Consider the case where we want to find the resonant frequency of oscillation of a system. First, write the oscillation in the form [math]\displaystyle{ y(x,t) = Y(x) \cos\omega t }[/math] with an unknown mode shape [math]\displaystyle{ Y(x) }[/math]. Next, find the total energy of the system, consisting of a kinetic energy term and a potential energy term. The kinetic energy term involves the square of the time derivative of [math]\displaystyle{ y(x,t) }[/math] and thus gains a factor of [math]\displaystyle{ \omega^2 }[/math]. Thus, we can calculate the total energy of the system and express it in the following form: [math]\displaystyle{ E = T + V \equiv A[Y(x)] \omega^2\sin^2 \omega t + B[Y(x)] \cos^2 \omega t }[/math]

By conservation of energy, the average kinetic energy must be equal to the average potential energy. Thus, [math]\displaystyle{ \omega^2 = \frac{B[Y(x)]}{A[Y(x)]} = R[Y(x)], }[/math] which is also known as the Rayleigh quotient. Thus, if we knew the mode shape [math]\displaystyle{ Y(x) }[/math], we could calculate [math]\displaystyle{ A[Y(x)] }[/math] and [math]\displaystyle{ B[Y(x)] }[/math] and in turn obtain the eigenfrequency. However, we do not yet know the mode shape. In order to find it, we can approximate [math]\displaystyle{ Y(x) }[/math] as a combination of a few approximating functions [math]\displaystyle{ Y_i(x) }[/math]: [math]\displaystyle{ Y(x) = \sum_{i=1}^N c_i Y_i(x), }[/math] where [math]\displaystyle{ c_1,c_2,\cdots,c_N }[/math] are constants to be determined. In general, an arbitrary set of [math]\displaystyle{ c_1,c_2,\cdots,c_N }[/math] describes a superposition of the actual eigenmodes of the system. However, if we seek [math]\displaystyle{ c_1,c_2,\cdots,c_N }[/math] such that the eigenfrequency [math]\displaystyle{ \omega^2 }[/math] is minimised, then the mode described by this set will be close to the actual lowest eigenmode of the system. Thus, this finds the lowest eigenfrequency. If we then find eigenmodes orthogonal to this approximated lowest eigenmode, we can also approximately find the next few eigenfrequencies.

In general, we can express [math]\displaystyle{ A[Y(x)] }[/math] and [math]\displaystyle{ B[Y(x)] }[/math] as a collection of terms quadratic in the coefficients [math]\displaystyle{ c_i }[/math]: [math]\displaystyle{ B[Y(x)] = \sum_i \sum_j c_i c_j K_{ij} = \mathbf{c}^\mathsf{T} K \mathbf{c} }[/math] [math]\displaystyle{ A[Y(x)] = \sum_i \sum_j c_i c_j M_{ij} = \mathbf{c}^\mathsf{T} M \mathbf{c} }[/math] where [math]\displaystyle{ K }[/math] and [math]\displaystyle{ M }[/math] are the stiffness matrix and mass matrix of a discrete system respectively.

The minimization of [math]\displaystyle{ \omega^2 }[/math] becomes: [math]\displaystyle{ \frac{\partial \omega^2}{\partial c_i} = \frac{\partial}{\partial c_i} \frac{\mathbf{c}^\mathsf{T} K \mathbf{c}}{\mathbf{c}^\mathsf{T} M \mathbf{c}} = 0 }[/math]

Solving this, [math]\displaystyle{ \mathbf{c}^\mathsf{T} M \mathbf{c}\frac{\partial \mathbf{c}^\mathsf{T} K \mathbf{c}}{\partial \mathbf{c}} - \mathbf{c}^\mathsf{T} K \mathbf{c} \frac{\partial \mathbf{c}^\mathsf{T} M \mathbf{c}}{\partial \mathbf{c}} = 0 }[/math] [math]\displaystyle{ K \mathbf c - \frac{\mathbf{c}^\mathsf{T} K \mathbf{c}}{\mathbf{c}^\mathsf{T} M \mathbf{c}}M\mathbf{c} = \mathbf{0} }[/math] [math]\displaystyle{ K \mathbf{c} - \omega^2 M \mathbf{c} = \mathbf{0} }[/math]

For a non-trivial solution of c, we require the determinant of the coefficient matrix of c to be zero: [math]\displaystyle{ \det(K - \omega^2 M)=0. }[/math]

This gives a solution for the first N eigenfrequencies and eigenmodes of the system, with N being the number of approximating functions.
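
For a concrete discrete case, the sketch below assumes (purely for illustration) a chain of two masses, with mass m1 attached to a wall by spring k1 and mass m2 attached to m1 by spring k2, and solves the generalized eigenvalue problem [math]\displaystyle{ K\mathbf{c} = \omega^2 M\mathbf{c} }[/math] directly:

```python
# Minimal sketch, assuming a two-degree-of-freedom chain: mass m1 attached
# to a wall by spring k1, mass m2 attached to m1 by spring k2. All numerical
# values are illustrative.
import numpy as np
from scipy.linalg import eigh

m1, m2 = 1.0, 1.0
k1, k2 = 1.0, 1.0
K = np.array([[k1 + k2, -k2],
              [-k2,      k2]])   # stiffness matrix of the chain
M = np.diag([m1, m2])            # mass matrix

w2, modes = eigh(K, M)           # generalized problem K c = w^2 M c
print(np.sqrt(w2))               # the two eigenfrequencies
print(modes)                     # mode shapes as columns
```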

Simple case of double spring-mass system

The following discussion uses the simplest case, where the system has two lumped springs and two lumped masses, and only two mode shapes are assumed. Hence [math]\displaystyle{ M = [m_1, m_2] }[/math] and [math]\displaystyle{ K = [k_1, k_2] }[/math].

A mode shape is assumed for the system, with two terms, one of which is weighted by a factor B, e.g. [math]\displaystyle{ Y = [1, 1] + B[1, -1] }[/math]. Simple harmonic motion theory says that the velocity at the time when the deflection is zero is the angular frequency [math]\displaystyle{ \omega }[/math] times the deflection (y) at the time of maximum deflection. In this example, the kinetic energy (KE) for each mass is [math]\displaystyle{ \frac{1}{2}\omega^2 Y_1^2 m_1 }[/math] etc., and the potential energy (PE) for each spring is [math]\displaystyle{ \frac{1}{2} k_1 Y_1^2 }[/math] etc.

We also know that without damping, the maximal KE equals the maximal PE. Thus, [math]\displaystyle{ \sum_{i=1}^2 \left(\frac{1}{2} \omega^2 Y_i^2 M_i\right)=\sum_{i=1}^2 \left(\frac{1}{2} K_i Y_i^2\right) }[/math]

The overall amplitude of the mode shape always cancels from each side; that is, the actual size of the assumed deflection does not matter, only the mode shape.

Mathematical manipulation then yields an expression for [math]\displaystyle{ \omega }[/math] in terms of B, which can be differentiated with respect to B to find the minimum, i.e. the value of B for which [math]\displaystyle{ d\omega/dB=0 }[/math]. This gives the value of B for which [math]\displaystyle{ \omega }[/math] is lowest. Because the mode shape is assumed, [math]\displaystyle{ \omega }[/math] is an upper bound on the true fundamental frequency of the system; varying B finds the lowest value of that upper bound attainable under our assumptions, since B selects the optimal 'mix' of the two assumed mode shape functions.
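
A short sketch of this one-parameter minimization, using the lumped energy sums from the equation above (the mass and stiffness values here are purely illustrative), might look as follows:

```python
# Sketch of the B-minimization described above, using the article's lumped
# energy sums: w^2 = sum(k_i Y_i^2) / sum(m_i Y_i^2). Values are illustrative.
import numpy as np
from scipy.optimize import minimize_scalar

m = np.array([1.0, 2.0])   # lumped masses
k = np.array([1.0, 3.0])   # lumped stiffnesses

def omega2(B):
    Y = np.array([1.0, 1.0]) + B * np.array([1.0, -1.0])   # assumed mode shape
    return np.sum(k * Y**2) / np.sum(m * Y**2)             # Rayleigh quotient

res = minimize_scalar(omega2)
print(res.x)               # optimal mix factor B
print(np.sqrt(res.fun))    # estimate of the fundamental frequency
```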

There are many tricks with this method, the most important being to choose realistic assumed mode shapes. For example, in the case of beam deflection problems it is wise to use a deformed shape that is analytically similar to the expected solution, as in the sketch below. A quartic fits most of the easy problems of simply supported beams, even if the order of the true deformed solution may be lower. The springs and masses do not have to be discrete; they can be continuous (or a mixture), and the method can easily be used in a spreadsheet to find the natural frequencies of quite complex distributed systems, provided the distributed KE and PE terms can be described easily, or else the continuous elements can be broken up into discrete parts.
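
For instance, here is a hedged sketch of the beam case. It estimates the fundamental frequency of a simply supported beam of length L, flexural rigidity EI, and mass per unit length ρA (all set to 1 purely for illustration), using the quartic static-deflection shape [math]\displaystyle{ Y(x) = x^4 - 2Lx^3 + L^3x }[/math] as the assumed mode; for a beam the Rayleigh quotient is [math]\displaystyle{ \omega^2 = \int EI\,(Y'')^2\,dx \big/ \int \rho A\,Y^2\,dx }[/math].

```python
# Hedged sketch: Rayleigh quotient for a simply supported beam with an
# assumed quartic mode shape. L, EI, and rhoA are illustrative unit values.
import numpy as np
from scipy.integrate import quad

L, EI, rhoA = 1.0, 1.0, 1.0
Y   = lambda x: x**4 - 2 * L * x**3 + L**3 * x   # assumed mode shape
Ypp = lambda x: 12 * x**2 - 12 * L * x           # its second derivative

num = EI * quad(lambda x: Ypp(x)**2, 0, L)[0]    # from the maximum PE
den = rhoA * quad(lambda x: Y(x)**2, 0, L)[0]    # from the maximum KE / w^2
print(np.sqrt(num / den))   # ~ 9.8767, vs the exact pi^2 ~ 9.8696
```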

The method can be used iteratively, adding additional mode shapes to the previous best solution, or a long expression with many B's and many mode shapes can be built up and then differentiated partially.

The relationship with the finite element method

In the language of the finite element method, the matrix [math]\displaystyle{ H_{kj} }[/math] is precisely the stiffness matrix of the Hamiltonian in the piecewise linear element space, and the matrix [math]\displaystyle{ S_{kj} }[/math] is the mass matrix. In the language of linear algebra, the value [math]\displaystyle{ \varepsilon }[/math] is an eigenvalue of the discretized Hamiltonian, and the vector [math]\displaystyle{ c }[/math] is a discretized eigenvector.
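
A minimal sketch of this correspondence (the choice of problem is illustrative, not from the original article): assemble the standard one-dimensional piecewise-linear stiffness and mass matrices for [math]\displaystyle{ \hat{H} = -\tfrac{1}{2}\,d^2/dx^2 }[/math] on [0, 1] with zero boundary values, and solve the resulting generalized eigenvalue problem. The lowest computed values converge to the exact [math]\displaystyle{ n^2\pi^2/2 }[/math] as the mesh is refined.

```python
# Sketch of the finite element connection: piecewise linear ("hat") basis
# functions on [0, 1] for H = -(1/2) d^2/dx^2 with zero boundary values.
# The assembled H is the stiffness matrix and S is the mass matrix.
import numpy as np
from scipy.linalg import eigh

n = 100                   # number of interior nodes (illustrative)
h = 1.0 / (n + 1)         # element size

# Standard tridiagonal 1D linear-element stiffness and mass matrices.
H = 0.5 * (np.diag(np.full(n, 2.0)) + np.diag(np.full(n - 1, -1.0), 1)
           + np.diag(np.full(n - 1, -1.0), -1)) / h
S = h / 6.0 * (np.diag(np.full(n, 4.0)) + np.diag(np.full(n - 1, 1.0), 1)
               + np.diag(np.full(n - 1, 1.0), -1))

eps, c = eigh(H, S)       # discretized eigenvalues and eigenvectors
print(eps[:3])            # ~ pi^2/2, 2 pi^2, 9 pi^2/2
```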
