F. A da Costa, R. T. G. de Oliveira, G. M. Viswanathan
This blog post was written in co-authorship with my physics department colleague Professor Francisco “Xico” Alexandre da Costa and Professor Roberto Teodoro Gurgel de Oliveira, of the UFRN mathematics department. Xico obtained his doctorate under Professor Sílvio Salinas at the University of São Paulo. Roberto was a student of Xico many years ago, but left physics to study mathematics at IMPA in Rio de Janeiro in 2010. During 2006–2007, Roberto and Xico had written up a short text in Portuguese that included the exact solution of the Ising model on the infinite square lattice using the method of fermion operators developed by Schultz, Mattis and Lieb. With the aim of learning this method, I adapted their text and expanded many of the calculations for my own convenience. I decided to post it on this blog since others might also find it interesting. I have previously written an introduction to the 2-D Ising model here, where I review a combinatorial method of solution.
The spins in the Ising model can only take on two values, ±1. This behavior is not unlike how the occupation number of a single-particle state for fermions can only take on two values, 0 or 1. It thus makes sense to try to solve the Ising model via fermionization. This is what Schultz, Mattis and Lieb accomplished in their well-known paper of 1964. In turn, their method of solution is a simplified version of Bruria Kaufman’s spinor analysis method, which is itself a simplification of Onsager’s original method.
We will proceed as follows. First we will set up the transfer matrix. Next we will reformulate it in terms of Pauli’s spin matrices for spin-1/2 particles. Recall that in quantum field theory boson creation and annihilation operators satisfy the well-known commutation relations of the quantum harmonic oscillator, whereas fermion operators satisfy analogous anticommutation relations. The spin annihilation and creation operators commute rather than anticommute at distinct sites, whereas fermion operators must anticommute at different sites. This problem of mixed commutation and anticommutation relations can be solved using a method known as the Jordan-Wigner transformation. This step completes the fermionic reformulation of the 2-D Ising model. To obtain the partition function in the thermodynamic limit, which is governed by the largest eigenvalue of the transfer matrix, one diagonalizes the fermionized transfer matrix using appropriate canonical transformations.
2. The transfer matrix of the 2-D Ising model
To build intuition, let us consider first the 1-D Ising model. Define the Hamiltonian as
with periodic boundary conditions (the last spin couples back to the first). We use s instead of the usual σ to avoid confusion, because the latter symbol will be used to denote the Pauli matrices further below. The s_i are classical variables, whereas the Pauli spin matrices are associated with quantum mechanical angular momentum (or spin) operators.
The partition function is given by
Using Dirac’s bra-ket notation, let us define the matrix
Here the spin variable in each ket can assume two possible values, ±1. We can now write
Observe that the partition function is thus a sum of the diagonal entries of a power of the transfer matrix. By definition of the trace, we thus have
Let λ+ and λ− be the larger and smaller eigenvalues of the transfer matrix, respectively. The eigenvalues of a power of the matrix are the corresponding powers of λ±, and so
The partition function per site is then
so that in the thermodynamic limit we have
As taught in undergraduate courses in statistical mechanics, the free energy per site is related to the logarithm of the partition function per site, so that knowledge of the partition function yields the thermodynamic properties of the system. Our goal is to find it explicitly.
The eigenvalues are the roots of the characteristic polynomial
Here I is the identity matrix. For a 2×2 matrix, the determinant is particularly easy to calculate:
from which we get immediately
The partition function per site is thus the larger eigenvalue λ+ in the thermodynamic limit.
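The 1-D calculation above is easy to check numerically. The following minimal sketch (my own, not from the original text) compares the trace of a power of the transfer matrix against a brute-force sum over all spin configurations; the coupling K = 0.7 and chain length of 8 sites are illustrative choices, and for simplicity the external field is set to zero.

```python
import itertools
import numpy as np

K = 0.7   # reduced coupling (illustrative value), zero external field
N = 8     # number of sites, periodic boundary conditions

# Transfer matrix with entries exp(K s s') for s, s' in {+1, -1}
T = np.array([[np.exp(K), np.exp(-K)],
              [np.exp(-K), np.exp(K)]])

# Z = Tr T^N = (lambda_-)^N + (lambda_+)^N
lam = np.linalg.eigvalsh(T)          # eigenvalues 2 sinh K and 2 cosh K
Z_transfer = np.sum(lam ** N)

# Brute-force sum over all 2^N spin configurations
Z_brute = 0.0
for spins in itertools.product([-1, 1], repeat=N):
    bonds = sum(spins[i] * spins[(i + 1) % N] for i in range(N))
    Z_brute += np.exp(K * bonds)

assert np.isclose(Z_transfer, Z_brute)
```

For a longer chain the ratio (λ−/λ+)^N becomes negligible, and the partition function per site approaches λ+, as claimed.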
Notice that the transfer matrix connects only nearest neighbors. Each power of the transfer matrix adds one bond and one spin site. The partition function is then obtained as the trace of a power of the transfer matrix.
This idea can be applied in higher dimensions. In two dimensions, there are at least two ways of setting up a transfer matrix. Either we can add two bonds and one site, using helical boundary conditions to sweep the entire lattice, or else we can add an entire line of sites and bonds, using periodic boundary conditions. In three dimensions, one can set up the transfer matrix in either of these two ways, but one can also add an entire plane of bonds and sites. If the transfer matrix adds m sites at a time, then the matrix has dimensions 2^m × 2^m.
We choose to set up the transfer matrix to add one line of sites. For an m × n lattice (of n columns and m rows), let us choose the transfer matrix to add one column of spins. So the matrix has dimensions 2^m × 2^m, and connects two neighboring columns of m spins each. Let μ_k denote the vector consisting of the m spin variables corresponding to column k.
Consider the 2-D Hamiltonian
where J1 and J2 are the couplings along rows and columns, respectively.
Let K1 = βJ1 and K2 = βJ2 be the reduced temperatures (or, alternatively, the reduced couplings). The partition function of the 2-D Ising model can then be written as
where the sum over each μ_k is over 2^m states. Each of the factors in parentheses is due to interactions either between neighboring columns or within a single column, so they can be made into elements of a matrix of dimension 2^m × 2^m. Then
So similarly to the 1-D case, the partition function of the 2-D Ising model can also be expressed as the trace of a power of the transfer matrix. The same is true for the cubic lattice, etc. The difficult part is to actually compute the trace! For example, Barry Cipra has referred to the challenge of solving the 3-D case as “the Holy Grail of statistical mechanics.”
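The column-by-column construction can also be sanity-checked numerically. The sketch below (my own, with illustrative couplings and a tiny 2×3 lattice) builds the 2^m × 2^m transfer matrix explicitly and verifies that the trace of its n-th power reproduces the brute-force sum over all configurations.

```python
import itertools
import numpy as np

K1, K2 = 0.4, 0.6   # row and column reduced couplings (illustrative)
m, n = 2, 3         # lattice of m rows and n columns, periodic in both directions

# Enumerate the 2^m states of one column as spin tuples
states = list(itertools.product([-1, 1], repeat=m))

# V[a, b]: Boltzmann weight of placing column b next to column a
V = np.zeros((2**m, 2**m))
for a, mu in enumerate(states):
    for b, nu in enumerate(states):
        inter = sum(mu[j] * nu[j] for j in range(m))              # bonds between columns
        intra = sum(nu[j] * nu[(j + 1) % m] for j in range(m))    # bonds inside new column
        V[a, b] = np.exp(K1 * inter + K2 * intra)

Z_transfer = np.trace(np.linalg.matrix_power(V, n))

# Brute force over all 2^(m*n) configurations
Z_brute = 0.0
for flat in itertools.product([-1, 1], repeat=m * n):
    s = np.array(flat).reshape(m, n)
    E = sum(K1 * s[i, j] * s[i, (j + 1) % n] + K2 * s[i, j] * s[(i + 1) % m, j]
            for i in range(m) for j in range(n))
    Z_brute += np.exp(E)

assert np.isclose(Z_transfer, Z_brute)
```

Note that for m = 2 with periodic boundary conditions each vertical bond appears twice, consistently in both the transfer-matrix and brute-force conventions above.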
3. The transfer matrix in terms of the Pauli spin matrices
Before we fully fermionize the problem, let us reformulate the transfer matrix in terms of the Pauli spin matrices, which have anticommuting properties. As we did above with the transfer matrix, let us first look at the 1-D analog before considering the 2-D case.
We saw earlier that the transfer matrix for the 1-D Ising model is a 2×2 matrix. It thus makes sense that we should be able to write it, and therefore the partition function, in terms of the Pauli spin matrices, which we briefly review:
Equivalently, we can define σ^x and σ^y via the raising and lowering operators (also known as ladder operators) σ^±:
The raising and lowering operators have this name because they increase and decrease the spin angular momentum in quantum mechanics. To illustrate this property, consider the two eigenvectors of σ^z, with eigenvalues +1 and −1, and
Then note that
So σ^− and σ^+ respectively lower and raise the eigenvalues of σ^z by 2.
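These raising and lowering properties take one line each to verify with explicit 2×2 matrices. A minimal sketch (my own check, using the standard matrix representations):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

splus  = (sx + 1j * sy) / 2   # raising operator sigma^+
sminus = (sx - 1j * sy) / 2   # lowering operator sigma^-

up   = np.array([1, 0], dtype=complex)   # sigma^z eigenvalue +1
down = np.array([0, 1], dtype=complex)   # sigma^z eigenvalue -1

assert np.allclose(splus @ down, up)     # raises the eigenvalue from -1 to +1
assert np.allclose(sminus @ up, down)    # lowers the eigenvalue from +1 to -1
assert np.allclose(splus @ up, 0)        # nilpotency: (sigma^+)^2 = 0
```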
These matrices have several interesting properties. It is easy to verify that the Pauli matrices satisfy the commutation relation
and circular permutations of the last.
Viewed as operators, the matrices represent involutions, i.e. they are their own inverses:
In contrast, the raising and lowering operators are nilpotent with degree 2:
We note in passing that Grassmann numbers share this last property and indeed the Ising model can also be solved using Grassmann variables.
The first follows from the involution property discussed above and the last two are easily obtained by direct substitution. The anticommutation relation for the raising and lowering operators is the signature of fermions.
Before proceeding, we first prove some useful trigonometric identities. There are versions of Euler’s identity for the Pauli matrices:
the proof of which is easily obtained from the matrix power series and then using the involution property (even powers of a Pauli matrix are the identity, and so odd powers are simply the Pauli matrix itself). The same argument gives us
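These Pauli-matrix versions of Euler's identity can be checked numerically. In the sketch below (my own check) the angle is an arbitrary test value, and the matrix exponential is computed by eigendecomposition to keep the example self-contained:

```python
import numpy as np

def expm(A):
    """Matrix exponential via eigendecomposition (fine for these diagonalizable matrices)."""
    w, V = np.linalg.eig(A)
    return V @ np.diag(np.exp(w)) @ np.linalg.inv(V)

sx = np.array([[0.0, 1.0], [1.0, 0.0]])
I2 = np.eye(2)
theta = 0.37  # arbitrary test angle

# Euler identity: exp(i theta sigma) = cos(theta) I + i sin(theta) sigma
assert np.allclose(expm(1j * theta * sx), np.cos(theta) * I2 + 1j * np.sin(theta) * sx)

# Hyperbolic version: exp(theta sigma) = cosh(theta) I + sinh(theta) sigma
assert np.allclose(expm(theta * sx), np.cosh(theta) * I2 + np.sinh(theta) * sx)
```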
Let us now return to the partition function of the 1-D Ising model and write
Previously we expressed the transfer matrix such that the factors in parentheses were its elements. Now we take a different approach and treat the two kinds of factors inside the parentheses separately. Let us define two matrices V1 and V2:
We can express V1 and V2 in terms of the Pauli matrices as follows.
For we proceed as follows. We first define and ,
Then we have
Let us denote the basis of eigenvectors of as . Then we can write the partition function as
By the invariance of the trace of a product under cyclic permutations of the factors, we can write the partition function more symmetrically as
Let us now proceed to the 2-D case of an m × n lattice. Let the ket now denote one full column of m sites, i.e.
This kind of product is sometimes called a direct product in physics textbooks. In the language of multilinear algebra, the vector belongs to the tensor product of the m two-dimensional vector spaces corresponding to the sites in a single column, so the dimension of the product space is 2^m. The symbol above does not refer to a direct sum; if we had direct sums rather than tensor products, the dimension of the space would be 2m instead.
Let σ^z_j be the Pauli spin operator which acts only on row j of the column vector, i.e. it acts only on the two-dimensional subspace associated with site j, and similarly for σ^x_j and σ^y_j. Moreover, let σ^±_j be the raising and lowering operators that act on row j. We thus have
We are now ready to express the transfer matrix in terms of these operators. Recall that in the 1-D case, we expressed the transfer matrix in terms of the Pauli matrices. In the 2-D case, the transfer matrix is no longer a 2×2 matrix, but we can still try to express its elements in terms of the Pauli matrices. The interactions between columns and within columns are “orthogonal,” in the sense that they are independent. So we can try to express the interactions in one direction in terms of, say, σ^z, and in the other in terms of σ^x. In the 1-D case, one of the two matrices is diagonal, because it contains only the contributions of the external field, not those from the interactions between the spins. In the 2-D case, similarly, the contribution to the transfer matrix from the interactions inside a column will give rise to a diagonal matrix. This fact is obvious if one considers that it makes no sense to have non-diagonal terms connecting two different columns for interactions inside a single column. Hence, it makes sense to use σ^z to represent the interactions inside a column. We can then try to represent the interactions between the columns in terms of σ^x. Note that in the 1964 paper by Schultz, Mattis and Lieb, the roles of σ^x and σ^z are exchanged.
We next proceed as described above. Consider first the diagonal part:
So we define this diagonal matrix as
Now all that remains is to find the nondiagonal contribution, arising from interactions between columns. Observe that
Next, recall that
Here it is assumed implicitly that of course. The corresponding matrix is thus seen to be
We need the factors exponentiated, so this is still not what we want. But we can easily put it into the needed form, by repeating the trick we used in the 1-D case. Specifically, let us write
where K1* is a function of K1 and vice versa. Putting everything together, for two neighboring columns we get
We can find the relation between K1* and K1 explicitly as follows. Writing out the matrix elements explicitly, we have
The last expression can be further manipulated as follows:
so that we finally have the dual relation tanh K1* = e^(−2 K1).
We thus obtain
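The dual-coupling relation that converts the inter-column bond factors into an exponential of σ^x can be verified numerically. The sketch below (my own check, with an illustrative coupling) confirms that the 2×2 bond matrix with entries e^(±K) equals (2 sinh 2K)^(1/2) exp(K* σ^x) with tanh K* = e^(−2K):

```python
import numpy as np

K = 0.8                              # reduced coupling (illustrative)
Kstar = np.arctanh(np.exp(-2 * K))   # dual coupling: tanh K* = exp(-2K)

# Left side: the "bond" matrix with entries exp(K s s')
T = np.array([[np.exp(K),  np.exp(-K)],
              [np.exp(-K), np.exp(K)]])

# Right side: (2 sinh 2K)^(1/2) exp(K* sigma_x)
#           = (2 sinh 2K)^(1/2) (cosh(K*) I + sinh(K*) sigma_x)
sx = np.array([[0.0, 1.0], [1.0, 0.0]])
rhs = np.sqrt(2 * np.sinh(2 * K)) * (np.cosh(Kstar) * np.eye(2) + np.sinh(Kstar) * sx)

assert np.allclose(T, rhs)
```

The diagonal and off-diagonal entries match separately, which is exactly the content of the two equations relating K and K* above.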
Following the 1-D example, the partition function can now be written as a trace:
4. Fermionization via the Jordan-Wigner transformation
Let us first put the two matrices into a more convenient form, by rotating in spin space about a suitable axis such that
In this new arrangement, we get
Previously, the matrix representing the interactions within a column was diagonal. But in this new form, it is the other matrix which is diagonal.
To find the exact partition function, we need to find the largest eigenvalue of the transfer matrix. We will do this by fermionizing the problem. Our Pauli operators have the correct anticommutation property (14) at a fixed site, but at different sites the operators commute, see (23). Fermion operators, however, anticommute at different sites. The physics behind the anticommutation is that when two indistinguishable fermions, such as electrons, are exchanged, their two-particle wave function changes sign. It is this property that underlies the Pauli exclusion principle. In contrast, bosons satisfy commutation relations rather than anticommutation relations. The Pauli matrices are neither fully bosonic nor fully fermionic, making diagonalization of the transfer matrix difficult. But there is a way to work around this obstruction.
Let us define new creation and annihilation operators that depend on all sites to the “left” in the following sense:
These new operators have interesting properties, which we will show below. First observe that
Hence, putting we get
From this identity and from the commutation of the Pauli operators at distinct sites, we immediately get
Hence the anticommutator at equal sites equals unity. So the new operators have the correct fermion anticommutation property at the same site. What about at distinct sites? Observe that
Adding, we thus get
Moreover, we can show similarly that
So we have a complete set of anticommutation rules.
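The Jordan-Wigner construction is easy to implement with Kronecker products. The sketch below (my own, using the common convention of a string of σ^z operators to the left of a σ^− at the target site; sign conventions vary between references) verifies the complete set of anticommutation rules for a small column of 3 sites:

```python
import numpy as np
from functools import reduce

m = 3  # number of sites in one column (illustrative)
I2 = np.eye(2)
sz = np.diag([1.0, -1.0])
sminus = np.array([[0.0, 0.0], [1.0, 0.0]])  # lowering operator sigma^-

def c(j):
    """Annihilation operator at site j: string of sigma^z on sites k < j, sigma^- at j."""
    mats = [sz] * j + [sminus] + [I2] * (m - j - 1)
    return reduce(np.kron, mats)

dim = 2 ** m
for i in range(m):
    for j in range(m):
        # {c_i, c_j^dagger} = delta_ij and {c_i, c_j} = 0
        anti = c(i) @ c(j).T + c(j).T @ c(i)   # all matrices are real here
        expected = np.eye(dim) if i == j else np.zeros((dim, dim))
        assert np.allclose(anti, expected)
        assert np.allclose(c(i) @ c(j) + c(j) @ c(i), 0)
```

The σ^z string is what converts commutation at distinct sites into anticommutation: it anticommutes with the σ^± sitting at any site it crosses.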
We can now express and in terms of these new operators. Note that
In contrast, the other matrix is not only quadratic in the Pauli matrices, but also connects different rows, i.e. different sites in a given column. Still, it is possible to re-express it easily in terms of the new operators. Observe that away from the boundary row we get from (33)
Using a similar reasoning, we get
The boundary term must be dealt with separately, due to the periodic boundary conditions. At this point, we could just forget about the boundary, which is not important in the thermodynamic limit. However, eliminating periodic boundary conditions breaks translation invariance symmetry, which would prevent us from fully exploiting the Fourier transform. So it is worthwhile to spend some time getting the boundary term just right. Let us define the number operator for the total number of fermions:
First, observe that
Next, observe that both matrices are bilinear in the fermion operators, as is the neglected boundary term. This means that they can only change the total number of fermions by an even number, which means that the oddness or evenness of the number operator must be a conserved quantity. In the language of commutators, we have
The conditions (37) can be satisfied provided we pick
for an even number of fermions and
for an odd number of fermions.
Let us now exploit translation invariance by invoking the Fourier transform. Let us define new operators via
where q is a wavenumber. To simplify the notation, from now on let us write more simply
where the sum is understood to be over the different wavenumbers.
so that, as expected, the inverse transform is given by
Note that the Fourier-transformed operators are also fermion operators, as can be seen from
We can show using these same methods that Parseval’s identity holds (sometimes called the Plancherel theorem):
We should now recast the transfer matrix in terms of the new operators. For convenience, instead of letting the wavenumber q run over values from 0 to 2π, it is more convenient to let it run from −π to π. Moreover, we assume, with little or no loss of generality, that m is even. We saw earlier that for an even number of fermions, the periodic boundary conditions force a sign change at the boundary, whereas for an odd number there is no sign change. So we have
For an even number of fermions, then, Parseval’s theorem assumes the form
Since the sum over the positive wavenumbers is over only m/2 values, we have
For an odd number of fermions, we must also consider the terms q = 0 and q = π separately.
Substituting into (36) we thus get
Now let us turn our attention to . Consider products of the form where :
If we denote sums over values of q excluding q = 0 and q = π via a primed summation symbol, we can write
Exploiting the anticommutation property for fermion operators, we can write
Now observe that
Putting everything together we get
Let us choose such that . Noting that
For an odd number of fermions we have neglected the terms with q = 0 and q = π, which we must consider explicitly. In the expression for
we need to add for
and similarly for
Now we can write the full matrix by substituting into (36):
From (46) we see that the terms with wavenumbers q and −q can be diagonalized simultaneously as follows. Let us define the “vacuum” state as representing the system with all spins in the “down” direction. Consider the set of 4 vectors obtained by acting on the vacuum in 4 different ways as follows.
Observe that for we have
Let us choose as our basis the 4 vectors in this order. Then we can write
On the other hand, the other matrix “mixes” the basis vectors, i.e. it is not diagonal. To get around this problem, we proceed as follows. From (50), we obtain
So we know that we can write
Differentiating, we get
For we have so that we can write
Let us eliminate :
This is a linear second-order ordinary differential equation, which is easily solved. Assume that the solution is a sum of exponential terms:
so that from the quadratic formula we get
So the general solution must be of the form
The initial condition at gives us so that
The other initial condition gives us
Having found we can now find from (54) as follows:
We can proceed as above and write similarly
Repeating the above method, we obtain
Meanwhile it is easy to check that
Putting everything together we obtain
We thus see that this matrix is block-diagonal. The first block corresponds to the basis vectors with zero and two fermions, whereas the second block relates to the states with one fermion. Let us call these blocks the even and odd blocks.
The odd block of is diagonal with both eigenvalues equal to . It is easy to check that the even block is symmetric. Let us calculate it explicitly.
This is a symmetric matrix. Moreover, by factoring out from , , , and , we can write the above matrix as
Note that we are using notation slightly different from the 1964 paper. Our is their and our is their and our is their .
The even block has eigenvalues satisfying the characteristic equation
and we thus obtain
So the even block has the eigenvalues
(Note that there is a sign typo inside the square root in the original 1964 article.)
These eigenvalues are centered around their arithmetic mean. Since we are more interested in the multiplicative rather than the additive properties of the partition function, it is more useful to express these eigenvalues as being centered around their geometric mean. We will next show that we can write
Simplifying, we get
so that (69) is satisfied provided
To proceed further, Schultz, Mattis and Lieb re-express the fermion operators once again in terms of “rotated” quantities, using a method that was introduced by Bogoliubov and Valatin to simplify the BCS theory of superconductivity. We refer the interested reader to the original paper. We will skip this step and instead focus directly on the thermodynamic limit.
Even without knowing the details of the basis that diagonalizes (51), we already know the eigenvalues. So in the diagonalizing basis (which we do not need to know explicitly) we can write the transfer matrix in terms of these eigenvalues. From the above, the eigenvalues are known to be of the form
Noting that the can be paired and canceled we get
In the thermodynamic limit, only the largest eigenvalue contributes, as is well known. Taking the limit of the n-th root of the above, considering only the larger eigenvalue in each factor, and noting that the sum over q above is over positive values only (so only half the full range from −π to π), we obtain the partition function per site:
5. The symmetric forms
The above expression is easy to put into the more commonly used forms. Let us first prove the following identity, which is actually found in Onsager’s seminal paper from 1944:
Amazingly, this identity is never proven in the usual textbooks of statistical mechanics. We will not satisfy ourselves with just accepting this identity on faith. Where is the proof? Huang’s graduate level textbook merely quotes the identity, giving no proof. Newell and Montroll’s well-known review article similarly just throws the identity at the reader, which is also what Onsager did. Reichl’s book on statistical physics at least contains an evaluated definite integral, but without going into details on how it is obtained. Indeed, why waste our time? Just look it up in some integral table or use Maple or Mathematica, right?
Since our goal here is to actually understand what is going on, we insist on proving the expression. The above intimidating identity is not as unapproachable as it seems. In fact, readers familiar with Mahler measures will recognize it as closely related to a simple one-variable logarithmic Mahler measure.
To prove the above identity, we first establish the following identity for complex z:
Let us evaluate the above integral using power series. For |z| < 1 the Taylor series for the logarithm converges absolutely, and we can use the Fubini-Tonelli theorem to interchange the order of sums and integrals:
The case |z| = 1 follows from taking the limit as real |z| → 1 from inside the unit disk.
We have now proven (76), but it is worth asking what it means. A sum of logarithms is the logarithm of a product, so (76) is, in fact, the complex logarithm of a geometric mean. In essence, (76) means that the geometric mean of the quantity 1 − z e^(iθ), taken over a circle of radius |z| centered at 1 in the complex plane, is 1, hence the logarithm of the geometric mean is 0. Moreover, the real part of the integral is given by the Mahler measure,
(Mahler measures are actually quite deep and connect diverse areas of mathematics and theoretical physics, such as random walks, the enumeration of spanning trees, lattice sums, determinantal processes, number theory and modular forms. Further discussion of Mahler measures is beyond the scope of this article.)
and the claim follows.
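The geometric-mean interpretation is easy to test numerically. The sketch below (my own check) averages log|1 − z e^(iθ)| over the circle for a few illustrative values of z, confirming that the mean vanishes for |z| < 1 and equals log|z| outside the unit disk (Jensen's formula):

```python
import numpy as np

def mean_log_abs(z, n=200000):
    """Average log|1 - z e^{i theta}| over theta, i.e. the log of the geometric mean."""
    theta = 2 * np.pi * (np.arange(n) + 0.5) / n
    return np.mean(np.log(np.abs(1 - z * np.exp(1j * theta))))

# |z| < 1: the geometric mean of |1 - z e^{i theta}| is 1, so the mean log vanishes
assert abs(mean_log_abs(0.5)) < 1e-6
assert abs(mean_log_abs(0.9j)) < 1e-6

# |z| > 1: the mean equals log|z| instead (Jensen's formula)
assert abs(mean_log_abs(2.0) - np.log(2.0)) < 1e-6
```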
We can now insert this identity into (74):
Let us next get rid of the starred variables:
Substituting, we get
We do not need to integrate from −π to π; by symmetry of the integrand it is enough to go from 0 to π. Taking the resulting factor of 2 outside, and renaming the integration variables to be more friendly, we finally get
This is the celebrated result of Onsager for the anisotropic 2D Ising model.
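Onsager's expression can itself be checked against a finite lattice. The sketch below (my own check, with illustrative couplings below the critical point) evaluates the double integral in the standard form −βf = ln 2 + (1/8π²) ∫∫ ln[cosh 2K1 cosh 2K2 − sinh 2K1 cos θ1 − sinh 2K2 cos θ2] dθ1 dθ2, and compares it with the largest eigenvalue of the 2^m × 2^m transfer matrix for m = 10:

```python
import itertools
import numpy as np

K1, K2 = 0.3, 0.3   # illustrative couplings, below the critical point

# Onsager's double integral for the log of the partition function per site
n = 512
th = 2 * np.pi * (np.arange(n) + 0.5) / n
T1, T2 = np.meshgrid(th, th)
integrand = np.log(np.cosh(2 * K1) * np.cosh(2 * K2)
                   - np.sinh(2 * K1) * np.cos(T1)
                   - np.sinh(2 * K2) * np.cos(T2))
onsager = np.log(2) + 0.5 * integrand.mean()   # midpoint rule on a periodic integrand

# Finite-lattice check: largest eigenvalue of a symmetrized 2^m x 2^m transfer matrix
m = 10
states = np.array(list(itertools.product([-1, 1], repeat=m)))
inter = states @ states.T                                     # bonds between columns
intra = np.sum(states * np.roll(states, -1, axis=1), axis=1)  # bonds within a column
Vsym = np.exp(K1 * inter + 0.5 * K2 * (intra[:, None] + intra[None, :]))
lam_max = np.linalg.eigvalsh(Vsym)[-1]

# The two estimates of log(lambda) per site agree closely away from criticality
assert abs(np.log(lam_max) / m - onsager) < 1e-2
```

The symmetrized matrix splits the intra-column weight evenly between the two factors; it is related to the original transfer matrix by a similarity transformation and so has the same spectrum. Away from the critical point the finite-m corrections decay exponentially, which is why m = 10 already suffices here.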
There is one other form which is worth mentioning. Let us specialize to the isotropic case K1 = K2 = K. Recall that
Let and so that
Let . The partition function for the isotropic case now becomes
One of the integrals can be carried out, in the manner usually found in the textbooks. Moreover, one of us (GMV) recently carried out the second integration in terms of a generalized hypergeometric function (see http://arxiv.org/abs/1411.2495).
6. Closing remarks
We note that there are several ways of solving the 2D Ising model. The “algebraic” method presented above is a simplified version of the original calculation of Onsager and also of Kaufman. Then, there is the “combinatorial” solution which is based on enumerating closed graphs using a Markovian random walk process on the lattice (see here). Other methods include the star-triangle transformation, the method of Pfaffians, and the Grassmann variable approach. Each method has its advantages and sheds light on a different aspect of the problem.
GMV thanks Sílvio Salinas for suggesting the study of the method of Schultz, Mattis and Lieb in an e-mail of 16 Nov 2014, and also for discussions when GMV visited USP and Salinas visited Natal during 2014.