**F. A da Costa, R. T. G. de Oliveira, G. M. Viswanathan**

This blog post was written in co-authorship with my physics department colleague Professor Francisco “Xico” Alexandre da Costa and Professor Roberto Teodoro Gurgel de Oliveira, of the UFRN mathematics department. Xico obtained his doctorate under Professor Sílvio Salinas at the University of São Paulo. Roberto was a student of Xico many years ago, but left physics to study mathematics at IMPA in Rio de Janeiro in 2010. During 2006–2007, Roberto and Xico had written up a short text in Portuguese that included the exact solution of the Ising model on the infinite square lattice using the method of fermion operators developed by Schultz, Mattis and Lieb. With the aim of learning this method, I adapted their text and expanded many of the calculations for my own convenience. I decided to post it on this blog since others might also find it interesting. I have previously written an introduction to the 2-D Ising model **here**, where I review a combinatorial method of solution.

**1. Introduction **

The spins in the Ising model can only take on two values, . This behavior is not unlike how the occupation number for some single particle state for fermions can only take on two values, . It thus makes sense to try to solve the Ising model via fermionization. This is what Schultz, Mattis and Lieb accomplished in their well-known paper of 1964. In turn, their method of solution is a simplified version of Bruria Kaufman’s spinor analysis method, which is in turn a simplification of Onsager’s original method.

We will proceed as follows. First we will set up the transfer matrix. Next we will reformulate it in terms of Pauli’s spin matrices for spin- particles. Recall that in quantum field theory boson creation and annihilation operators satisfy the well-known commutation relations of the quantum harmonic oscillator, whereas fermion operators satisfy analogous anticommutation relations. The spin annihilation and creation operators do not anticommute at distinct sites but instead commute, whereas fermion operators must anticommute at different sites. This problem of mixed commutation and anticommutation relations can be solved using a method known as the Jordan-Wigner transformation. This step completes the fermionic reformulation of the 2-D Ising model. To obtain the partition function in the thermodynamic limit, which is the largest eigenvalue of the transfer matrix, one diagonalizes the fermionized transfer matrix using appropriate canonical transformations.

**2. The transfer matrix of the 2-D Ising model **

To build intuition, let us consider first the 1-D Ising model. Define the Hamiltonian as

with periodic boundary conditions . We use instead of the usual to avoid confusion because the latter will be used to denote the Pauli matrices further below. The are classical variables whereas the Pauli spin matrices are associated with quantum mechanical angular momentum (or spin) operators.

The partition function is given by

Using Dirac’s bra-ket notation, let us define the matrix

Or, explicitly

Here the vector can assume two possible values, . We can now write

Observe that is thus a sum of the diagonal entries of the matrix . By definition of the trace, we thus have

Let and be the larger than smaller eigenvalues of . Then the eigenvalues of are and so

The partition function per site is then

so that in the thermodynamic limit we have

As taught in undergraduate courses in statistical mechanics, the free energy per site is related to the partition function per site via , so that knowledge of the partition function gives yields the thermodynamic properties of the system. Our goal is to find explicitly.

The eigenvalues are the roots of the characteristic polynomial

Here is the identity matrix. For a matrix, the determinant is particularly easy to calculate:

so that

with

from which we get immediately

The partition function per site is .

Notice that the transfer matrix connects only nearest neighbors. Each power of the transfer matrix adds one bond and one spin site. The partition function is then obtained as the trace of a power of the transfer matrix.

This idea can be applied in higher dimensions. In two dimensions, there are at least two ways of setting up a transfer matrix. Either we can add two bonds and one site, using helical boundary conditions to sweep the entire lattice, or else we can add an entire line of sites and bonds, using periodic boundary conditions. In three dimensions, one can also set up the transfer matrix in one of these two ways. But in three dimensions, one can also add an entire plane of bonds and sites. If the transfer matrix adds sites, then the matrix has dimensions .

We choose to set up the transfer matrix to add one line of sites. For a lattice (of columns and rows), let us choose the transfer matrix to add one column of spins. So the matrix has dimensions , and connects two columns of dimension each. Let denote the vector consisting of the spin variables , corresponding to column .

Consider the 2-D Hamiltonian

where and are the couplings along rows and columns respectively.

Let and be the reduced temperatures (or, alternatively, the reduced couplings). The partition function of the 2-D Ising model can then be written as

where the sum over each is over states. Each of the factors in parentheses is due to interactions between neighboring columns on within a single column, so they can be made into elements of a matrix of dimension . Then

So similarly to the 1-D case, the partition function of the 2-D Ising model can also be expressed as the trace of the transfer matrix. The same is true for cubic lattice, etc. The difficult part is to actually compute the trace! For example, Barry Cipra has referred to challenge of solving the 3-D case as “the Holy Grail of statistical mechanics.”

**3. The transfer matrix in terms of the Pauli spin matrices **

Before we fully fermionize the problem, let us reformulate the transfer matrix in terms of the Pauli spin matrices, which have anticommuting properties. As we did above with the transfer matrix, let us first look at the 1-D analog before considering the 2-D case.

We saw earlier that the transfer matrix for the 1-D Ising model is a matrix. It thus makes sense that we should thus be able to write it, and therefore the partition function, in terms of the Pauli spin matrices, which we briefly review:

Equivalently, we can define and via the raising and lowering operators (also known as ladder operators) :

The raising and lowering operators have this name because they increase and decrease the spin angular momentum in quantum mechanics. To illustrate this property, consider to be the eigenvectors of , so that and

Then note that

So lower or raise the eigenvalues of by 2.

These matrices have several interesting properties. It is easy to verify that the Pauli matrices satisfy the commutation relation

and circular permutations of the last.

Viewed as operators, the matrices represent involutions, i.e. they are their own inverses:

In contrast, the raising and lowering operators are nilpotent with degree 2:

We note in passing that Grassmann numbers share this last property and indeed the Ising model can also be solved using Grassmann variables.

Most important for our purposes are the anticommutation relations

The first follows from the involution property discussed above and the last two are easily obtained by direct substitution. The anticommutation relation for the raising and lowering operators is the signature of fermions.

Before proceeding we first prove useful trigonometric identities. There are versions of Euler’s identity for the Pauli matrices:

the proof of which is easily obtained from the matrix power series and then using the involution property (even powers of are and so odd powers are simply ). The same argument gives us

Let us now return to the partition function of the 1-D Ising model and write

where and . Previously we expressed the transfer matrix such that the factors in parentheses were its elements. Now, we take a different approach and treat the two kinds of factors inside the parentheses separately. Let us define two matrices and

or explicitly,

We can express and in terms of the Pauli matrices as follows.

so that

For we proceed as follows. We first define and ,

Then we have

Let us denote the basis of eigenvectors of as . Then we can write the partition function as

By the invariance of the trace of a product under cyclic permutations of the factors, we can write the partition more symmetrically as

Let us now proceed to the 2-D case of and lattice. Let now denote one full column of sites, i.e.

This kind of product is sometimes called a direct product in physics textbooks. In the language of multilinear algebra, the vector belongs to the space which is the direct sum of the two-dimensional vector spaces corresponding to the sites in a single column, so the dimension of the direct sum space is . The symbol above does not refer to tensor products. If we had tensor products, the dimension of the product space would be instead.

Let be the Pauli spin operator which acts vector only on the row for a , i.e. it acts only on the subspace spanned by the , and similarly for and . Moreover, let be the raising and lowering operators that act on row of . We thus have

We know the commutation relations of these operators for identical , but how do operators of distinct behave? Since they act on different subspaces, they must commute:

We are now ready to express the transfer matrix in terms of these operators. Recall that in the 1-D case, we expressed the transfer matrix in terms of the Pauli matrices. In the 2-D case, the transfer matrix is no longer a 22 matrix, but we can still try to express the elements of the transfer matrix in terms of the Pauli matrices. The interactions between columns and within columns are “orthogonal,” in the sense that they are independent. So we can try to express the interactions in one direction in terms of, say, and in the other by . In the 1-D case, the matrix is diagonal, because it only contains the contributions of the external field, but not from the interactions between the spins. In the 2-D case, similarly, the contribution to the transfer matrix from the interactions inside a column will give rise to a diagonal matrix. This fact is obvious if one considers that it makes no sense to have non-diagonal terms connecting two different columns for interactions inside a single column. Hence, it makes sense to use to represent the interactions inside a column. Then, we can try to represent the the interactions between the columns in terms of the . Note that in the 1964 paper by Schultz, Mattis and Lieb, the roles of and are exchanged and their is diagonal.

We next proceed as described above. Consider first the diagonal part:

So we define as

Now all that remains is to find the nondiagonal contribution, arising from interactions between columns. Observe that

Next, recall that

so that

Here it is assumed implicitly that of course. The corresponding matrix is thus seen to be

We need the factors exponentiated, so this is still not what we want. But we can easily put it into the needed form, by repeating the trick we used in the 1-D case. Specifically, let us write

where is a function of and vice versa. Putting everything together, for two columns and we get

We can find the relation between and explicitly as follows. Putting in , we have

so that

The last expression can be further manipulated as follows:

so that we finally have

We thus obtain

Following the 1-D example, the partition function can now we written as a trace:

Again, we can use the invariance of the trace relative to circular permutations to get the symmetric form

**4. Fermionization via the Jordan-Wigner transformation **

Let us first put and into a more convenient form, by rotating around the axis such that

In this new arrangement, we get

Previously, the matrix was diagonal. But in this new form, it is which is diagonal.

To find the exact partition function, we need to find the largest eigenvalue of the transfer matrix. We will do this by fermionizing the problem. Our Pauli operators have the correct anticommutation property (14) for fixed site , but for different sites the operators commute, see (23). But Fermion operators anticommute at different sites. The physics behind the anticommutation is that when two indistinguishable fermion particles, such as electrons, are exchanged, their two-particle wave function changes sign. It is this property that underlies the Pauli exclusion principle. In contrast, bosons satisfy commutation relations rather than anticommutation relations. The Pauli matrices are neither fully boson nor fully fermion, making diagonalization of the transfer matrix difficult. But there is a way to work around this obstruction.

Let us define new creation and annihilation operators that depend on all sites to the “left” in the following sense:

These new operators have interesting properties, which we will show below. First observe that

So

Hence, putting we get

From this identity and from the commutation of the Pauli operators at distinct sites, we immediately get

Hence, . So the new operators have the correct fermion anticommutation property at the same site . What about at distinct sites? Observe that

Adding, we thus get

Moreover, we can show similarly that

So we have a complete set of anticommutation rules.

We can now express and in terms of these new operators. Note that

In contrast, is not only quadratic in the Pauli matrices, but connects different rows, i.e. different sites in a given column. Still, it is possible to re-express easily in terms of the new operators. Observe that for , we get from (33)

Using a similar reasoning, we get

And

Hence,

Hence, neglecting the boundary, the bulk of the matrix can be written as

The boundary term for must be dealt with separately, due to the periodic boundary conditions. As this point, we could just forget about the boundary, which is not important in the thermodynamic limit. However, eliminating periodic boundary conditions breaks translation invariance symmetry, which would prevent us from fully exploiting the Fourier transform. So it is worthwhile to spend some time getting the the boundary term just right. Let us define the number operator for the total number of fermions:

First, observe that

Similarly,

and

The inverse of is itself, since . So

Next, observe that and are bilinear in the fermion operators. The neglected boundary term in is also bilinear. This means that and can only change the total number of fermions by an even number, which means that the oddness or evenness of the number operator must be a conserved quantity. In the language of commutators, we have

The conditions (37) can be satisfied provided we pick

for even and

for odd .

Let us now exploit translation invariance by invoking the Fourier transform. Let us define new operators via

where is a wavenumber. To simplify the notation, from now on let us write more simply

where the sum is understood to be over the different wavenumbers.

Consider that

so that, as expected, the inverse transform is given by

Note that the are also fermion operators, as can be seen from

We can show using these same methods that Parseval’s identity holds (sometimes called the Plancherel theorem):

We should now recast the transfer matrix in terms of . For convenience, instead of letting run over values from to , it is more convenient to let it run from to . Moreover, we assume, with little or no loss of generality, that is even. We saw earlier that for even , the periodic boundary conditions force a sign change at the boundary, whereas for odd there is no sign change. So we have

For even , then, Parseval’s theorem assumes the form

Since the sum over the positive is over only values, we have

For odd , we must also consider the terms an separately.

Substituting into (36) we thus get

Now let us turn our attention to . Consider products of the form where :

And similarly

If we denote sums over values of excluding via the symbol , we can write

Exploiting the anticommutation property for fermion operators, we can write

Now observe that

Putting everything together we get

Let us choose such that . Noting that

we get

For odd we have neglected , which we must consider explicitly. In the expression for

we need to add for

and similarly for

Now we can write by substituting into (36):

We are now in a position to recast (31). Substituting (50) and (46), we obtain

From (46) we see that can be diagonalized simultaneously with as follows. Let us define the “vacuum” state as representing the system with all spins in the “down” direction. Let be a set of 4 vectors obtained by acting on in 4 different ways as follows.

Observe that for we have

Let us choose as our basis the 4 vectors in this order. Then we can write

On the other hand, “mixes” and , i.e. it not diagonal. To get around this problem, we proceed as follows. From (50), we obtain for for ,

So we know that we can write

Differentiating, we get

For we have so that we can write

Let us eliminate :

This is a linear second order ordinary differential equation, which is easily solved. Assume that the solution is the sum of terms of the form :

so that from the quadratic formula we get

So the general solution must be of the form

The initial condition at gives us so that

The other initial condition gives us

so that

Having found we can now find from (54) as follows:

We can proceed as above and write similarly

Repeating the above method, we obtain

Meanwhile it is easy to check that

Putting everything together we obtain

We thus see that this matrix is block-diagonal. The first block corresponds to the basis vectors and , which keep count of zero and two fermions, whereas the second block relates to one fermion. Let us call these blocks the even and odd blocks.

The odd block of is diagonal with both eigenvalues equal to . It is easy to check that the even block is symmetric. Let us calculate it explicitly.

This is a symmetric matrix. Moreover, by factoring out from , , , and , we can write the above matrix as

where

Note that we are using notation slightly different from the 1964 paper. Our is their and our is their and our is their .

The matrix

has eigenvalue satisfying the characteristic equation

so that

and we thus obtain

So the even block of has eigenvalue

(Note that there is a sign typo inside the square root in the original 1964 article.)

These eigenvalues are centered around their arithmetic mean. Since we are more interested in the multiplicative rather than the additive properties of the partition function, it is more useful to express these eigenvalues as being centered around their geometric mean. We will next show that we can write

Simplifying, we get

Hence,

so that (69) is satisfied provided

To proceed further, Schultz, Mattis and Lieb re-express the fermion operators once again in terms of “rotated” quantities, using a method that was used by Bogolubov and Valatin to simplify the BCS theory of superconductivity. We refer the interested reader to the original paper. We will skip this step and instead focus directly on the thermodynamic limit.

Even without knowing the details of the basis that diagonalizes in (51), we already know the eigenvalues. So in the diagonalizing basis (which we do not need to know) we can write in terms of these eigenvalues. From the above, the eigenvalues are known to be of the form

Noting that the can be paired and canceled we get

In the thermodynamic limit, only the largest eigenvalue contributes, as is well known. Taking the limit of of the -th root of the above considering only the larger of , and noting that the sum over above is over values only (so only half the full range from to ) we obtain the partition function per site:

**5. The symmetric forms **

The above expression is easy to put into the more commonly used forms. Let us first prove the following identity, which is actually found in Onsager’s seminal paper from 1944:

Amazingly, this identity is never proven in the usual textbooks of statistical mechanics. We will not satisfy ourselves with just accepting this identity on faith. Where is the proof? Huang’s graduate level textbook merely quotes the identity, giving no proof. Newell and Montroll’s well-known review article similarly just throws the identity at the reader, which is also what Onsager did. Reichl’s book on statistical physics at least contains an evaluated definite integral, but without going into details on how it is obtained. Indeed, why waste our time? Just look it up in some integral table or use Maple or Mathematica, right?

Since our goal here is to actually understand what is going on, we insist on proving the expression. The above intimidating identity is not as unapproachable as it seems. In fact, readers familiar with Mahler measures will recognize it as closely related to a simple one-variable logarithmic Mahler measure.

To prove the above identity, we first establish the following identity for complex :

Let us evaluate the above integral using power series. For the Taylor series for the logarithm converges absolutely and we can use the Fubini-Tonelli theorem to interchange the order of sums and integrals:

The case follows from taking the limit for real from inside the unit disk.

We have now proven (76), but it is worth asking what it means. The sum of a logarithm is the logarithm of a product, so (76) is, in fact, the complex logarithm of the geometric mean of over . In essence, (76) means that the geometric mean of the quantity over a circle of radius and centered at 1 on the complex plane is 1, hence the logarithm of the geometric mean is 0. Moreover, the real part of the integral is given by the Mahler measure,

Mahler measures are actually quite deep and connect diverse areas of mathematics and theoretical physics, such as random walks, the enumeration of spanning trees, lattice sums, determinantal processes, number theory and modular forms. Discussion of Mahler measures is beyond the scope of this article.

Next we establish (75), with the help of (76). Let without loss of generality and let . Then,

and the claim follows.

We can now insert this identity into (74):

Let us next get rid of the starred variables:

Substituting, we get

We do not need to integrate from to , it is enough to go from to . Taking the factor outside, and renaming the integration variables to be more friendly, we finally get

This is the celebrated result of Onsager for the anisotropic 2D Ising model.

There is one other form which is worth mentioning. Let us specialize to the isotropic case . Recall that

Let and so that

Let . The partition function for the isotropic case now becomes

One of the integrals can be carried out, in the manner usually found in the textbooks. Moreover, one of us (GMV) recently carried out the second integration in terms of a generalized hypergeometric function (see http://arxiv.org/abs/1411.2495).

**6. Closing remarks **

We note that there are several ways of solving the 2D Ising model. The “algebraic” method presented above is a simplified version of the original calculation of Onsager and also of Kaufman. Then, there is the “combinatorial” solution which is based on enumerating closed graphs using a Markovian random walk process on the lattice (see here). Other methods include the star-triangle transformation, the method of Pfaffians, and the Grassmann variable approach. Each method has its advantages and sheds light on a different aspect of the problem.

**Acnowledgements**

GMV thanks Sílvio Salinas for suggesting the study of the method of Schultz, Mattis and Lieb in an e-mail of 16 Nov 2014, and also for discussions when GMV visited USP and Salinas visited Natal during 2014.