F. A. da Costa, R. T. G. de Oliveira, G. M. Viswanathan
This blog post was written in co-authorship with my physics department colleague Professor Francisco “Xico” Alexandre da Costa and Professor Roberto Teodoro Gurgel de Oliveira, of the UFRN mathematics department. Xico obtained his doctorate under Professor Sílvio Salinas at the University of São Paulo. Roberto was a student of Xico many years ago, but left physics to study mathematics at IMPA in Rio de Janeiro in 2010. During 2006–2007, Roberto and Xico had written up a short text in Portuguese that included the exact solution of the Ising model on the infinite square lattice using the method of fermion operators developed by Schultz, Mattis and Lieb. With the aim of learning this method, I adapted their text and expanded many of the calculations for my own convenience. I decided to post it on this blog since others might also find it interesting. I have previously written an introduction to the 2-D Ising model here, where I review a combinatorial method of solution.
1. Introduction
The spins in the Ising model can only take on two values, $s = \pm 1$. This behavior is not unlike how the occupation number $n$ for some single-particle state for fermions can only take on two values, $n = 0$ or $n = 1$. It thus makes sense to try to solve the Ising model via fermionization. This is what Schultz, Mattis and Lieb accomplished in their well-known paper of 1964. In turn, their method of solution is a simplified version of Bruria Kaufman’s spinor analysis method, which is in turn a simplification of Onsager’s original method.
We will proceed as follows. First we will set up the transfer matrix. Next we will reformulate it in terms of Pauli’s spin matrices for spin-$\tfrac{1}{2}$ particles. Recall that in quantum field theory boson creation and annihilation operators satisfy the well-known commutation relations of the quantum harmonic oscillator, whereas fermion operators satisfy analogous anticommutation relations. The spin annihilation and creation operators $\sigma_m^\pm$ do not anticommute at distinct sites $m \neq m'$ but instead commute, whereas fermion operators must anticommute at different sites. This problem of mixed commutation and anticommutation relations can be solved using a method known as the Jordan-Wigner transformation. This step completes the fermionic reformulation of the 2-D Ising model. To obtain the partition function per site in the thermodynamic limit, which is given by the largest eigenvalue of the transfer matrix, one diagonalizes the fermionized transfer matrix using appropriate canonical transformations.
2. The transfer matrix of the 2-D Ising model
To build intuition, let us consider first the 1-D Ising model. Define the Hamiltonian as

$$H = -J\sum_{i=1}^{N} s_i\,s_{i+1} - B\sum_{i=1}^{N} s_i,$$

with periodic boundary conditions $s_{N+1} = s_1$. We use $s_i$ instead of the usual $\sigma_i$ to avoid confusion, because the latter will be used to denote the Pauli matrices further below. The $s_i$ are classical variables whereas the Pauli spin matrices are associated with quantum mechanical angular momentum (or spin) operators.
The partition function is given by

$$Z = \sum_{\{s_i\}} e^{-\beta H} = \sum_{s_1 = \pm 1}\cdots\sum_{s_N = \pm 1}\;\prod_{i=1}^{N} e^{K s_i s_{i+1} + h s_i},$$

where $K = \beta J$, $h = \beta B$ and $\beta = 1/k_B T$.
Using Dirac’s bra-ket notation, let us define the matrix $T$ via

$$\langle s\,|\,T\,|\,s'\rangle = e^{K s s' + h(s+s')/2}.$$

Or, explicitly,

$$T = \begin{pmatrix} e^{K+h} & e^{-K} \\ e^{-K} & e^{K-h} \end{pmatrix}.$$

Here the vector $|s\rangle$ can assume two possible values, $|{+1}\rangle$ and $|{-1}\rangle$. We can now write

$$Z = \sum_{s_1}\cdots\sum_{s_N}\langle s_1|T|s_2\rangle\,\langle s_2|T|s_3\rangle\cdots\langle s_N|T|s_1\rangle.$$

Observe that $Z$ is thus a sum of the diagonal entries of the matrix $T^N$. By definition of the trace, we thus have

$$Z = \mathrm{Tr}\;T^N.$$
Let $\lambda_+$ and $\lambda_-$ be the larger and smaller eigenvalues of $T$. Then the eigenvalues of $T^N$ are $\lambda_\pm^N$, and so

$$Z = \lambda_+^N + \lambda_-^N.$$
The partition function per site is then

$$Z^{1/N} = \lambda_+\Big[\,1 + \big(\lambda_-/\lambda_+\big)^N\,\Big]^{1/N},$$

so that in the thermodynamic limit we have

$$\lim_{N\to\infty} Z^{1/N} = \lambda_+.$$
As taught in undergraduate courses in statistical mechanics, the free energy per site is related to the partition function per site via

$$f = -k_B T\,\lim_{N\to\infty}\frac{1}{N}\ln Z,$$

so that knowledge of the partition function yields the thermodynamic properties of the system. Our goal is to find $\lambda_+$ explicitly.
The eigenvalues are the roots of the characteristic polynomial

$$\det(T - \lambda I) = 0.$$

Here $I$ is the identity matrix. For a $2\times 2$ matrix, the determinant is particularly easy to calculate:

$$\lambda^2 - (\mathrm{Tr}\,T)\,\lambda + \det T = 0,$$

with

$$\mathrm{Tr}\,T = 2e^{K}\cosh h, \qquad \det T = e^{2K} - e^{-2K},$$

from which we get immediately

$$\lambda_\pm = e^{K}\cosh h \pm \sqrt{e^{2K}\sinh^2 h + e^{-2K}}.$$

The partition function per site is $\lim_{N\to\infty} Z^{1/N} = \lambda_+$.
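As a quick sanity check, here is a minimal numerical sketch (in Python with numpy; the values of $J$, $B$, $\beta$ and $N$ are arbitrary test choices, not anything from the text above) comparing the brute-force partition function against the trace of the transfer matrix:

```python
# Brute-force Z versus Z = Tr T^N for the 1-D Ising model with a field.
import numpy as np
from itertools import product

J, B, beta, N = 1.0, 0.3, 0.7, 10   # arbitrary test parameters
K, h = beta * J, beta * B

# Sum over all 2^N spin configurations with periodic boundaries.
Z_brute = 0.0
for s in product([-1, 1], repeat=N):
    E = -J * sum(s[i] * s[(i + 1) % N] for i in range(N)) - B * sum(s)
    Z_brute += np.exp(-beta * E)

# Transfer matrix <s|T|s'> = exp(K s s' + h (s + s')/2).
T = np.array([[np.exp(K + h), np.exp(-K)],
              [np.exp(-K),    np.exp(K - h)]])
Z_trace = np.trace(np.linalg.matrix_power(T, N))

lam_plus = np.exp(K) * np.cosh(h) + np.sqrt(np.exp(2 * K) * np.sinh(h)**2 + np.exp(-2 * K))
print(Z_brute, Z_trace)            # the two should agree
print(Z_trace**(1 / N), lam_plus)  # already close for N = 10
```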
Notice that the transfer matrix connects only nearest neighbors. Each power of the transfer matrix adds one bond and one spin site. The partition function is then obtained as the trace of a power of the transfer matrix.
This idea can be applied in higher dimensions. In two dimensions, there are at least two ways of setting up a transfer matrix. Either we can add two bonds and one site, using helical boundary conditions to sweep the entire lattice, or else we can add an entire line of sites and bonds, using periodic boundary conditions. In three dimensions, one can also set up the transfer matrix in one of these two ways. But in three dimensions, one can also add an entire plane of bonds and sites. If the transfer matrix adds $n$ sites, then the matrix has dimensions $2^n \times 2^n$.
We choose to set up the transfer matrix to add one line of sites. For an $N \times M$ lattice (of $N$ columns and $M$ rows), let us choose the transfer matrix to add one column of $M$ spins. So the matrix has dimensions $2^M \times 2^M$, and connects two columns of $M$ spins each. Let $\boldsymbol{\mu}_n$ denote the vector consisting of the $M$ spin variables $(s_{1,n}, s_{2,n}, \ldots, s_{M,n})$, corresponding to column $n$.
Consider the 2-D Hamiltonian

$$H = -J_1\sum_{m=1}^{M}\sum_{n=1}^{N} s_{m,n}\,s_{m,n+1} \;-\; J_2\sum_{m=1}^{M}\sum_{n=1}^{N} s_{m,n}\,s_{m+1,n},$$

where $J_1$ and $J_2$ are the couplings along rows and columns respectively (with periodic boundary conditions in both directions).
Let $K_1 = \beta J_1$ and $K_2 = \beta J_2$ be the reduced temperatures (or, alternatively, the reduced couplings). The partition function of the 2-D Ising model can then be written as

$$Z = \sum_{\boldsymbol{\mu}_1}\cdots\sum_{\boldsymbol{\mu}_N}\;\prod_{n=1}^{N}\left(\exp\Big[K_1\sum_{m=1}^{M} s_{m,n}\,s_{m,n+1}\Big]\,\exp\Big[K_2\sum_{m=1}^{M} s_{m,n}\,s_{m+1,n}\Big]\right),$$

where the sum over each $\boldsymbol{\mu}_n$ is over $2^M$ states. Each of the factors in parentheses is due to interactions between neighboring columns or within a single column, so they can be made into elements of a matrix $T$ of dimension $2^M \times 2^M$. Then

$$Z = \mathrm{Tr}\;T^N.$$
So similarly to the 1-D case, the partition function of the 2-D Ising model can also be expressed as the trace of a power of the transfer matrix. The same is true for the cubic lattice, etc. The difficult part is to actually compute the trace! For example, Barry Cipra has referred to the challenge of solving the 3-D case as “the Holy Grail of statistical mechanics.”
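For readers who like to verify such claims numerically, here is a small sketch (in Python with numpy; the lattice size and couplings are arbitrary small test values) that builds the $2^M\times 2^M$ column transfer matrix and checks $Z = \mathrm{Tr}\,T^N$ against brute-force enumeration:

```python
# Column transfer matrix for a small M x N lattice vs. brute force.
import numpy as np
from itertools import product

M, N, K1, K2 = 3, 3, 0.4, 0.3               # arbitrary test parameters
states = list(product([-1, 1], repeat=M))   # the 2^M column configurations

# <mu|T|mu'> = exp[K1 sum_m s_m s'_m] exp[K2 sum_m s'_m s'_{m+1}]
T = np.zeros((2**M, 2**M))
for i, mu in enumerate(states):
    for j, mup in enumerate(states):
        row = K1 * sum(mu[m] * mup[m] for m in range(M))
        col = K2 * sum(mup[m] * mup[(m + 1) % M] for m in range(M))
        T[i, j] = np.exp(row + col)
Z_trace = np.trace(np.linalg.matrix_power(T, N))

# Brute-force sum over all 2^(M N) configurations (periodic boundaries).
Z_brute = 0.0
for s in product([-1, 1], repeat=M * N):
    g = np.array(s).reshape(M, N)
    expo = sum(K1 * g[m, n] * g[m, (n + 1) % N] + K2 * g[m, n] * g[(m + 1) % M, n]
               for m in range(M) for n in range(N))
    Z_brute += np.exp(expo)
print(Z_trace, Z_brute)   # the two should agree
```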
3. The transfer matrix in terms of the Pauli spin matrices
Before we fully fermionize the problem, let us reformulate the transfer matrix in terms of the Pauli spin matrices, which have anticommuting properties. As we did above with the transfer matrix, let us first look at the 1-D analog before considering the 2-D case.
We saw earlier that the transfer matrix for the 1-D Ising model is a $2\times 2$ matrix. It thus makes sense that we should be able to write it, and therefore the partition function, in terms of the Pauli spin matrices, which we briefly review:

$$\sigma^x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma^y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma^z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Equivalently, we can define $\sigma^x$ and $\sigma^y$ via the raising and lowering operators (also known as ladder operators) $\sigma^\pm = \tfrac{1}{2}\big(\sigma^x \pm i\sigma^y\big)$:

$$\sigma^+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad \sigma^- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}.$$
The raising and lowering operators have this name because they increase and decrease the spin angular momentum in quantum mechanics. To illustrate this property, consider $|{\pm}\rangle$ to be the eigenvectors of $\sigma^z$, so that

$$\sigma^z|{+}\rangle = +|{+}\rangle$$

and

$$\sigma^z|{-}\rangle = -|{-}\rangle.$$

Then note that

$$\sigma^z\,\sigma^-|{+}\rangle = -\,\sigma^-|{+}\rangle, \qquad \sigma^z\,\sigma^+|{-}\rangle = +\,\sigma^+|{-}\rangle.$$

So $\sigma^\mp$ lower or raise the eigenvalues of $\sigma^z$ by 2.
These matrices have several interesting properties. It is easy to verify that the Pauli matrices satisfy the commutation relation

$$[\sigma^x, \sigma^y] = 2i\,\sigma^z$$

and circular permutations of the last.
Viewed as operators, the matrices represent involutions, i.e. they are their own inverses:

$$(\sigma^x)^2 = (\sigma^y)^2 = (\sigma^z)^2 = 1.$$
In contrast, the raising and lowering operators are nilpotent with degree 2:

$$(\sigma^+)^2 = (\sigma^-)^2 = 0.$$
We note in passing that Grassmann numbers share this last property and indeed the Ising model can also be solved using Grassmann variables.
Most important for our purposes are the anticommutation relations

$$\{\sigma^x, \sigma^x\} = 2, \qquad \{\sigma^+, \sigma^-\} = 1, \qquad \{\sigma^+, \sigma^+\} = \{\sigma^-, \sigma^-\} = 0. \qquad (14)$$

The first follows from the involution property discussed above and the last two are easily obtained by direct substitution. The anticommutation relation for the raising and lowering operators is the signature of fermions.
Before proceeding we first prove useful trigonometric identities. There are versions of Euler’s identity for the Pauli matrices:

$$e^{i\theta\sigma^a} = \cos\theta + i\,\sigma^a\sin\theta \qquad (a = x, y, z),$$

the proof of which is easily obtained from the matrix power series and then using the involution property (even powers of $\sigma^a$ are $1$ and so odd powers are simply $\sigma^a$). The same argument gives us

$$e^{\theta\sigma^a} = \cosh\theta + \sigma^a\sinh\theta.$$
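These identities are easy to confirm numerically; here is a short check (in Python with numpy/scipy; $\theta$ is an arbitrary test value):

```python
# Check exp(i theta sigma^x) = cos(theta) + i sigma^x sin(theta), and the
# hyperbolic version exp(theta sigma^x) = cosh(theta) + sigma^x sinh(theta).
import numpy as np
from scipy.linalg import expm

sx = np.array([[0, 1], [1, 0]], dtype=complex)
I2 = np.eye(2)
theta = 0.37  # arbitrary
print(np.allclose(expm(1j * theta * sx), I2 * np.cos(theta) + 1j * sx * np.sin(theta)))
print(np.allclose(expm(theta * sx), I2 * np.cosh(theta) + sx * np.sinh(theta)))
```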
Let us now return to the partition function of the 1-D Ising model and write

$$Z = \sum_{\{s_i\}}\;\prod_{i=1}^{N}\Big(e^{K s_i s_{i+1}}\,e^{h s_i}\Big),$$

where $K = \beta J$ and $h = \beta B$. Previously we expressed the transfer matrix such that the factors in parentheses were its elements. Now, we take a different approach and treat the two kinds of factors inside the parentheses separately. Let us define two matrices $V_1$ and $V_2$ via

$$\langle s|V_1|s'\rangle = e^{K s s'}, \qquad \langle s|V_2|s'\rangle = \delta_{s s'}\,e^{h s},$$

or explicitly,

$$V_1 = \begin{pmatrix} e^{K} & e^{-K} \\ e^{-K} & e^{K} \end{pmatrix}, \qquad V_2 = \begin{pmatrix} e^{h} & 0 \\ 0 & e^{-h} \end{pmatrix}.$$

We can express $V_1$ and $V_2$ in terms of the Pauli matrices as follows. The matrix $V_2$ is diagonal in the eigenbasis of $\sigma^z$,

$$V_2 = e^{h\sigma^z},$$

so that its diagonal elements are $e^{\pm h}$. For $V_1$ we proceed as follows. We first define $K^*$ via

$$\tanh K^* = e^{-2K}$$

and

$$A = \big(2\sinh 2K\big)^{1/2}.$$

Then we have

$$V_1 = e^{K} + e^{-K}\sigma^x = A\,e^{K^*\sigma^x},$$

as can be checked by matching $A\cosh K^* = e^{K}$ and $A\sinh K^* = e^{-K}$.
Let us denote the basis of eigenvectors of $\sigma^z$ as $|s\rangle$. Then we can write the partition function as

$$Z = \mathrm{Tr}\,(V_1 V_2)^N.$$

By the invariance of the trace of a product under cyclic permutations of the factors, we can write the partition function more symmetrically as

$$Z = \mathrm{Tr}\,\big(V_2^{1/2}\,V_1\,V_2^{1/2}\big)^N.$$
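The factorization can also be checked numerically. The sketch below (Python with numpy/scipy; $K$, $h$ and $N$ are arbitrary test values) verifies both the identity $V_1 = (2\sinh 2K)^{1/2}e^{K^*\sigma^x}$ and the equality of $\mathrm{Tr}\,(V_1V_2)^N$ with the trace of the transfer matrix used earlier:

```python
# V1 = sqrt(2 sinh 2K) exp(K* sigma^x) with tanh K* = exp(-2K); V2 = exp(h sigma^z).
import numpy as np
from scipy.linalg import expm

K, h, N = 0.6, 0.2, 8   # arbitrary test parameters
sx = np.array([[0.0, 1.0], [1.0, 0.0]])
sz = np.diag([1.0, -1.0])
Kstar = np.arctanh(np.exp(-2 * K))

V1 = np.sqrt(2 * np.sinh(2 * K)) * expm(Kstar * sx)
V2 = expm(h * sz)
print(np.allclose(V1, [[np.exp(K), np.exp(-K)], [np.exp(-K), np.exp(K)]]))  # True

T = np.array([[np.exp(K + h), np.exp(-K)],
              [np.exp(-K),    np.exp(K - h)]])
Z1 = np.trace(np.linalg.matrix_power(V1 @ V2, N))
Z2 = np.trace(np.linalg.matrix_power(T, N))
print(np.isclose(Z1, Z2))  # True
```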
Let us now proceed to the 2-D case of an $M\times N$ lattice. Let $|\boldsymbol{\mu}\rangle$ now denote one full column of $M$ sites, i.e.

$$|\boldsymbol{\mu}\rangle = |s_1\rangle\otimes|s_2\rangle\otimes\cdots\otimes|s_M\rangle.$$

This kind of product is sometimes called a direct product in physics textbooks. In the language of multilinear algebra, the vector $|\boldsymbol{\mu}\rangle$ belongs to the space which is the tensor product of the two-dimensional vector spaces corresponding to the sites in a single column, so the dimension of the product space is $2^M$.
Let $\sigma_m^z$ be the Pauli spin operator which acts on a vector $|\boldsymbol{\mu}\rangle$ only on the row $m$, i.e. it acts only on the subspace spanned by the $|s_m\rangle$, and similarly for $\sigma_m^x$ and $\sigma_m^y$. Moreover, let $\sigma_m^\pm$ be the raising and lowering operators that act on row $m$ of $|\boldsymbol{\mu}\rangle$. We thus have

$$\sigma_m^z = \underbrace{1\otimes\cdots\otimes 1}_{m-1}\;\otimes\;\sigma^z\;\otimes\;\underbrace{1\otimes\cdots\otimes 1}_{M-m},$$

and similarly for the other operators.
We know the commutation relations of these operators for identical $m$, but how do operators of distinct $m$ behave? Since they act on different subspaces, they must commute:

$$\big[\sigma_m^a, \sigma_{m'}^b\big] = 0 \qquad (m \neq m';\; a, b \in \{x, y, z, +, -\}). \qquad (23)$$
We are now ready to express the transfer matrix in terms of these operators. Recall that in the 1-D case, we expressed the transfer matrix in terms of the Pauli matrices. In the 2-D case, the transfer matrix is no longer a $2\times 2$ matrix, but we can still try to express the elements of the transfer matrix in terms of the Pauli matrices. The interactions between columns and within columns are “orthogonal,” in the sense that they are independent. So we can try to express the interactions in one direction in terms of, say, the $\sigma_m^z$ and in the other by the $\sigma_m^x$. In the 1-D case, the matrix $V_2$ is diagonal, because it only contains the contributions of the external field, but not from the interactions between the spins. In the 2-D case, similarly, the contribution to the transfer matrix from the interactions inside a column will give rise to a diagonal matrix. This fact is obvious if one considers that it makes no sense to have non-diagonal terms connecting two different columns for interactions inside a single column. Hence, it makes sense to use the $\sigma_m^z$ to represent the interactions inside a column. Then, we can try to represent the interactions between the columns in terms of the $\sigma_m^x$. Note that in the 1964 paper by Schultz, Mattis and Lieb, the roles of $\sigma^z$ and $\sigma^x$ are exchanged and their $V_1$ is diagonal.
We next proceed as described above. Consider first the diagonal part:

$$\langle\boldsymbol{\mu}|V_2|\boldsymbol{\mu}'\rangle = \delta_{\boldsymbol{\mu}\boldsymbol{\mu}'}\,\exp\Big[K_2\sum_{m=1}^{M} s_m\,s_{m+1}\Big].$$

So we define $V_2$ as

$$V_2 = \exp\Big[K_2\sum_{m=1}^{M}\sigma_m^z\,\sigma_{m+1}^z\Big].$$
Now all that remains is to find the nondiagonal contribution, arising from interactions between columns. Observe that

$$\langle\boldsymbol{\mu}|V_1|\boldsymbol{\mu}'\rangle = \exp\Big[K_1\sum_{m=1}^{M} s_m\,s_m'\Big] = \prod_{m=1}^{M} e^{K_1 s_m s_m'}.$$

Next, recall that

$$\sigma^x|s\rangle = |{-s}\rangle,$$

so that

$$\big\langle s_m\big|\,e^{K_1} + e^{-K_1}\sigma_m^x\,\big|s_m'\big\rangle = e^{K_1}\,\delta_{s_m s_m'} + e^{-K_1}\,\delta_{s_m,-s_m'} = e^{K_1 s_m s_m'}.$$

Here it is assumed implicitly that $s_m, s_m' = \pm 1$, of course. The corresponding matrix is thus seen to be

$$V_1 = \prod_{m=1}^{M}\big(e^{K_1} + e^{-K_1}\sigma_m^x\big).$$
We need the factors exponentiated, so this is still not what we want. But we can easily put it into the needed form, by repeating the trick we used in the 1-D case. Specifically, let us write

$$e^{K_1} + e^{-K_1}\sigma_m^x = \big(2\sinh 2K_1\big)^{1/2}\,e^{K_1^*\sigma_m^x},$$

where $K_1^*$ is a function of $K_1$ and vice versa. Putting everything together, for two columns $\boldsymbol{\mu}$ and $\boldsymbol{\mu}'$ we get

$$\langle\boldsymbol{\mu}|V_1|\boldsymbol{\mu}'\rangle = \big(2\sinh 2K_1\big)^{M/2}\,\Big\langle\boldsymbol{\mu}\Big|\exp\Big[K_1^*\sum_{m=1}^{M}\sigma_m^x\Big]\Big|\boldsymbol{\mu}'\Big\rangle.$$
We can find the relation between $K_1$ and $K_1^*$ explicitly as follows. Putting in the hyperbolic version of Euler’s identity, we have

$$\big(2\sinh 2K_1\big)^{1/2}\cosh K_1^* = e^{K_1}, \qquad \big(2\sinh 2K_1\big)^{1/2}\sinh K_1^* = e^{-K_1},$$

so that

$$\tanh K_1^* = e^{-2K_1}.$$

The last expression can be further manipulated as follows:

$$\sinh 2K_1^* = \frac{2\tanh K_1^*}{1-\tanh^2 K_1^*} = \frac{2e^{-2K_1}}{1-e^{-4K_1}} = \frac{1}{\sinh 2K_1},$$

so that we finally have

$$\sinh 2K_1^*\,\sinh 2K_1 = 1.$$
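This dual-coupling relation is easily verified numerically (Python with numpy; $K_1$ arbitrary):

```python
# tanh(K1*) = exp(-2 K1)  implies  sinh(2 K1*) sinh(2 K1) = 1.
import numpy as np
K1 = 0.83  # arbitrary
K1star = np.arctanh(np.exp(-2 * K1))
print(np.sinh(2 * K1star) * np.sinh(2 * K1))  # prints 1.0 (up to rounding)
```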
We thus obtain

$$V_1 = \big(2\sinh 2K_1\big)^{M/2}\exp\Big[K_1^*\sum_{m=1}^{M}\sigma_m^x\Big].$$
Following the 1-D example, the partition function can now be written as a trace:

$$Z = \mathrm{Tr}\,(V_1 V_2)^N.$$

Again, we can use the invariance of the trace relative to circular permutations to get the symmetric form

$$Z = \mathrm{Tr}\,\big(V_2^{1/2}\,V_1\,V_2^{1/2}\big)^N. \qquad (31)$$
4. Fermionization via the Jordan-Wigner transformation
Let us first put $V_1$ and $V_2$ into a more convenient form, by rotating around the $y$ axis such that

$$\sigma_m^x \to \sigma_m^z, \qquad \sigma_m^z \to -\sigma_m^x.$$

In this new arrangement, we get

$$V_1 = \big(2\sinh 2K_1\big)^{M/2}\exp\Big[K_1^*\sum_{m=1}^{M}\sigma_m^z\Big], \qquad V_2 = \exp\Big[K_2\sum_{m=1}^{M}\sigma_m^x\,\sigma_{m+1}^x\Big].$$

Previously, the matrix $V_2$ was diagonal. But in this new form, it is $V_1$ which is diagonal.
To find the exact partition function, we need to find the largest eigenvalue of the transfer matrix. We will do this by fermionizing the problem. Our Pauli operators have the correct anticommutation property (14) for fixed site $m$, but for different sites the operators commute, see (23). But fermion operators anticommute at different sites. The physics behind the anticommutation is that when two indistinguishable fermion particles, such as electrons, are exchanged, their two-particle wave function changes sign. It is this property that underlies the Pauli exclusion principle. In contrast, bosons satisfy commutation relations rather than anticommutation relations. The Pauli operators $\sigma_m^\pm$ are neither fully bosonic nor fully fermionic, making diagonalization of the transfer matrix difficult. But there is a way to work around this obstruction.
Let us define new creation and annihilation operators that depend on all sites to the “left” in the following sense:

$$c_m = \exp\Big[\pi i\sum_{j=1}^{m-1}\sigma_j^+\sigma_j^-\Big]\,\sigma_m^-, \qquad c_m^\dagger = \exp\Big[\pi i\sum_{j=1}^{m-1}\sigma_j^+\sigma_j^-\Big]\,\sigma_m^+. \qquad (33)$$
These new operators have interesting properties, which we will show below. First observe that

$$e^{\pm\pi i\,\sigma_j^+\sigma_j^-} = 1 - 2\,\sigma_j^+\sigma_j^-,$$

since $\sigma_j^+\sigma_j^-$ is a projection operator. So

$$\big(e^{\pi i\,\sigma_j^+\sigma_j^-}\big)^2 = \big(1 - 2\,\sigma_j^+\sigma_j^-\big)^2 = 1.$$

Hence, putting the phase factors together, we get

$$c_m^\dagger c_m = \sigma_m^+\sigma_m^-, \qquad c_m c_m^\dagger = \sigma_m^-\sigma_m^+.$$

From this identity and from the commutation of the Pauli operators at distinct sites, we immediately get

$$c_m c_m^\dagger + c_m^\dagger c_m = \sigma_m^-\sigma_m^+ + \sigma_m^+\sigma_m^- = 1.$$

Hence, $\{c_m, c_m^\dagger\} = 1$. So the new operators have the correct fermion anticommutation property at the same site $m$.
What about at distinct sites? Observe that, for $m < m'$,

$$c_m c_{m'}^\dagger = -\,\sigma_m^-\Big[\prod_{j=m+1}^{m'-1}\big(1-2\sigma_j^+\sigma_j^-\big)\Big]\sigma_{m'}^+, \qquad c_{m'}^\dagger c_m = +\,\sigma_m^-\Big[\prod_{j=m+1}^{m'-1}\big(1-2\sigma_j^+\sigma_j^-\big)\Big]\sigma_{m'}^+,$$

where the signs come from $\sigma_m^-\big(1-2\sigma_m^+\sigma_m^-\big) = -\sigma_m^-$ and $\big(1-2\sigma_m^+\sigma_m^-\big)\sigma_m^- = +\sigma_m^-$. Adding, we thus get

$$\{c_m, c_{m'}^\dagger\} = 0 \qquad (m \neq m').$$

Moreover, we can show similarly that

$$\{c_m, c_{m'}\} = \{c_m^\dagger, c_{m'}^\dagger\} = 0.$$

So we have a complete set of anticommutation rules.
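Since the Jordan-Wigner construction is easy to get wrong, here is a self-contained numerical sketch (Python with numpy; $M = 3$ is an arbitrary small chain length) that builds the $c_m$ explicitly and verifies the full set of anticommutation rules:

```python
# Jordan-Wigner operators for M sites and their anticommutation relations.
import numpy as np

M = 3
sp = np.array([[0, 1], [0, 0]], dtype=complex)   # sigma^+
sm = sp.conj().T                                  # sigma^-
I2 = np.eye(2)
phase = I2 - 2 * sp @ sm                          # exp(i pi sigma^+ sigma^-)

def site_op(op, m):
    """Embed a single-site operator at site m (0-based) of the M-site chain."""
    out = np.eye(1)
    for j in range(M):
        out = np.kron(out, op if j == m else I2)
    return out

def c(m):
    """Jordan-Wigner annihilation operator c_m with its string of phases."""
    out = site_op(sm, m)
    for j in range(m):
        out = site_op(phase, j) @ out
    return out

anti = lambda A, B: A @ B + B @ A
ok = all(np.allclose(anti(c(m), c(n).conj().T), np.eye(2**M) * (m == n))
         and np.allclose(anti(c(m), c(n)), 0)
         for m in range(M) for n in range(M))
print(ok)  # True
```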
We can now express $V_1$ and $V_2$ in terms of these new operators. Note that

$$\sigma_m^z = 2\,\sigma_m^+\sigma_m^- - 1 = 2\,c_m^\dagger c_m - 1,$$

so that

$$V_1 = \big(2\sinh 2K_1\big)^{M/2}\exp\Big[2K_1^*\sum_{m=1}^{M}\Big(c_m^\dagger c_m - \tfrac{1}{2}\Big)\Big]. \qquad (36)$$
In contrast, $V_2$ is not only quadratic in the Pauli matrices, but connects different rows, i.e. different sites in a given column. Still, it is possible to re-express $V_2$ easily in terms of the new operators. Observe that for $m' = m+1$, we get from (33)

$$c_m^\dagger c_{m+1} = \sigma_m^+\big(1 - 2\sigma_m^+\sigma_m^-\big)\sigma_{m+1}^- = \sigma_m^+\,\sigma_{m+1}^-.$$

Using a similar reasoning, we get

$$c_m^\dagger c_{m+1}^\dagger = \sigma_m^+\,\sigma_{m+1}^+, \qquad c_m c_{m+1}^\dagger = -\,\sigma_m^-\,\sigma_{m+1}^+.$$

And

$$c_m c_{m+1} = -\,\sigma_m^-\,\sigma_{m+1}^-.$$

Hence,

$$\sigma_m^x\,\sigma_{m+1}^x = \big(c_m^\dagger - c_m\big)\big(c_{m+1}^\dagger + c_{m+1}\big).$$

Therefore, neglecting the boundary, the bulk of the matrix $V_2$ can be written as

$$V_2 = \exp\Big[K_2\sum_{m=1}^{M-1}\big(c_m^\dagger - c_m\big)\big(c_{m+1}^\dagger + c_{m+1}\big)\Big]\times(\text{boundary term}).$$
The boundary term for $m = M$ must be dealt with separately, due to the periodic boundary conditions. At this point, we could just forget about the boundary, which is not important in the thermodynamic limit. However, eliminating periodic boundary conditions breaks translation invariance symmetry, which would prevent us from fully exploiting the Fourier transform. So it is worthwhile to spend some time getting the boundary term just right. Let us define the number operator for the total number of fermions:

$$n = \sum_{m=1}^{M} c_m^\dagger c_m.$$
First, observe that

$$\sigma_M^+\,\sigma_1^+ = -\,c_M^\dagger\,c_1^\dagger\;e^{i\pi n}.$$

Similarly,

$$\sigma_M^+\,\sigma_1^- = -\,c_M^\dagger\,c_1\;e^{i\pi n},$$

and

$$\sigma_M^-\,\sigma_1^+ = c_M\,c_1^\dagger\;e^{i\pi n}, \qquad \sigma_M^-\,\sigma_1^- = c_M\,c_1\;e^{i\pi n}.$$

The inverse of $e^{i\pi n}$ is itself, since $e^{i\pi n}\,e^{i\pi n} = e^{2\pi i n} = 1$. So

$$K_2\,\sigma_M^x\,\sigma_1^x = -\,K_2\,\big(c_M^\dagger - c_M\big)\big(c_1^\dagger + c_1\big)\,e^{i\pi n}.$$
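The boundary identity, including the $e^{i\pi n}$ factor, can be checked directly; below is a self-contained Python/numpy sketch (the values of $M$ and $K_2$ are arbitrary) comparing the exponent of the spin form of $V_2$ with its fermionized form:

```python
# K2 sum_m sigma^x_m sigma^x_{m+1} (periodic) versus the fermionized exponent
# with the bulk terms plus the boundary term carrying exp(i pi n).
import numpy as np

M, K2 = 4, 0.45   # arbitrary test parameters
sp = np.array([[0, 1], [0, 0]], dtype=complex)
sm = sp.conj().T
sx = sp + sm
I2 = np.eye(2)
phase = I2 - 2 * sp @ sm

def site_op(op, m):
    out = np.eye(1)
    for j in range(M):
        out = np.kron(out, op if j == m else I2)
    return out

def c(m):
    out = site_op(sm, m)
    for j in range(m):
        out = site_op(phase, j) @ out
    return out

dag = lambda A: A.conj().T
X = [site_op(sx, m) for m in range(M)]
spin_form = K2 * sum(X[m] @ X[(m + 1) % M] for m in range(M))

parity = np.eye(2**M)
for m in range(M):
    parity = parity @ site_op(phase, m)            # exp(i pi n)
bulk = sum((dag(c(m)) - c(m)) @ (dag(c(m + 1)) + c(m + 1)) for m in range(M - 1))
boundary = -(dag(c(M - 1)) - c(M - 1)) @ (dag(c(0)) + c(0)) @ parity
print(np.allclose(spin_form, K2 * (bulk + boundary)))  # True
```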
Next, observe that $V_1$ and $V_2$ are bilinear in the fermion operators. The neglected boundary term in $V_2$ is also bilinear. This means that $V_1$ and $V_2$ can only change the total number of fermions by an even number, which means that the oddness or evenness of the number operator $n$ must be a conserved quantity. In the language of commutators, we have

$$\big[V_1, e^{i\pi n}\big] = \big[V_2, e^{i\pi n}\big] = 0. \qquad (37)$$

The conditions (37) can be satisfied provided we pick the antiperiodic boundary condition

$$c_{M+1} = -\,c_1$$

for even $n$ and the periodic boundary condition

$$c_{M+1} = c_1$$

for odd $n$. With these choices the boundary term takes the same form as the bulk terms.
Let us now exploit translation invariance by invoking the Fourier transform. Let us define new operators $\eta_q$ via

$$c_m = \frac{e^{i\pi/4}}{\sqrt{M}}\sum_{q} e^{iqm}\,\eta_q,$$

where $q$ is a wavenumber (the overall phase $e^{i\pi/4}$ is a convenient convention that will make certain coefficients real further below). To simplify the notation, from now on let us write the sums more simply as $\sum_q$ and $\sum_m$, where the sum is understood to be over the different wavenumbers (or over the sites $m = 1, \ldots, M$).

Consider that

$$\frac{1}{M}\sum_{m=1}^{M} e^{i(q-q')m} = \delta_{qq'},$$

so that, as expected, the inverse transform is given by

$$\eta_q = \frac{e^{-i\pi/4}}{\sqrt{M}}\sum_{m=1}^{M} e^{-iqm}\,c_m.$$
Note that the $\eta_q$ are also fermion operators, as can be seen from

$$\{\eta_q, \eta_{q'}^\dagger\} = \frac{1}{M}\sum_{m,m'} e^{-iqm}\,e^{iq'm'}\,\{c_m, c_{m'}^\dagger\} = \frac{1}{M}\sum_{m} e^{i(q'-q)m} = \delta_{qq'}.$$

We can show using these same methods that Parseval’s identity holds (sometimes called the Plancherel theorem):

$$\sum_{m} c_m^\dagger c_m = \sum_{q}\eta_q^\dagger\eta_q.$$
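Both properties can be confirmed numerically with the self-contained sketch below (Python with numpy; $M = 4$ and the antiperiodic wavenumbers of the even sector are arbitrary choices; the overall phase of the transform is irrelevant for these two checks):

```python
# The Fourier-transformed operators eta_q are fermions; Parseval holds.
import numpy as np

M = 4
sp = np.array([[0, 1], [0, 0]], dtype=complex)
sm = sp.conj().T
I2 = np.eye(2)
phase = I2 - 2 * sp @ sm

def site_op(op, m):
    out = np.eye(1)
    for j in range(M):
        out = np.kron(out, op if j == m else I2)
    return out

def c(m):
    out = site_op(sm, m)
    for j in range(m):
        out = site_op(phase, j) @ out
    return out

dag = lambda A: A.conj().T
qs = [(2 * k + 1) * np.pi / M for k in range(-M // 2, M // 2)]  # antiperiodic

def eta(q):
    return sum(np.exp(-1j * q * m) * c(m) for m in range(M)) / np.sqrt(M)

# Parseval: sum_m c_m^dag c_m = sum_q eta_q^dag eta_q
print(np.allclose(sum(dag(c(m)) @ c(m) for m in range(M)),
                  sum(dag(eta(q)) @ eta(q) for q in qs)))      # True
# {eta_q, eta_{q'}^dag} = delta_{q q'}
print(all(np.allclose(eta(q) @ dag(eta(p)) + dag(eta(p)) @ eta(q),
                      np.eye(2**M) * np.isclose(q, p))
          for q in qs for p in qs))                            # True
```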
We should now recast the transfer matrix in terms of the $\eta_q$. For convenience, instead of letting $q$ run over $M$ values from $0$ to $2\pi(M-1)/M$, it is more convenient to let it run from $-\pi$ to $\pi$. Moreover, we assume, with little or no loss of generality, that $M$ is even. We saw earlier that for even $n$ the periodic boundary conditions force a sign change at the boundary, whereas for odd $n$ there is no sign change. So we have

$$q = \pm\frac{\pi}{M},\ \pm\frac{3\pi}{M},\ \ldots,\ \pm\frac{(M-1)\pi}{M} \qquad (n\ \text{even}),$$

$$q = 0,\ \pm\frac{2\pi}{M},\ \pm\frac{4\pi}{M},\ \ldots,\ \pm\frac{(M-2)\pi}{M},\ \pi \qquad (n\ \text{odd}).$$
For even $n$, then, Parseval’s theorem assumes the form

$$\sum_{m} c_m^\dagger c_m = \sum_{0<q<\pi}\big(\eta_q^\dagger\eta_q + \eta_{-q}^\dagger\eta_{-q}\big).$$

Since the sum over the positive $q$ is over only $M/2$ values, we have

$$\sum_{m} c_m^\dagger c_m - \frac{M}{2} = \sum_{0<q<\pi}\big(\eta_q^\dagger\eta_q + \eta_{-q}^\dagger\eta_{-q} - 1\big).$$

For odd $n$, we must also consider the terms $q = 0$ and $q = \pi$ separately.
Substituting into (36) we thus get

$$V_1 = \big(2\sinh 2K_1\big)^{M/2}\exp\Big[2K_1^*\sum_{0<q<\pi}\big(\eta_q^\dagger\eta_q + \eta_{-q}^\dagger\eta_{-q} - 1\big)\Big]. \qquad (46)$$
Now let us turn our attention to $V_2$. Consider products of the form $c_m^\dagger c_{m+1}$, $c_m^\dagger c_{m+1}^\dagger$, etc., where the sum over $m$ runs over the sites:

$$\sum_{m} c_m^\dagger c_{m+1} = \sum_{q} e^{iq}\,\eta_q^\dagger\eta_q, \qquad \sum_{m} c_m\,c_{m+1}^\dagger = \sum_{q} e^{-iq}\,\eta_q\,\eta_q^\dagger.$$

And similarly

$$\sum_{m} c_m^\dagger c_{m+1}^\dagger = -\,i\sum_{q} e^{iq}\,\eta_q^\dagger\eta_{-q}^\dagger, \qquad \sum_{m} c_m\,c_{m+1} = i\sum_{q} e^{-iq}\,\eta_q\,\eta_{-q}.$$

If we denote sums over values of $q$ excluding $q = 0$ and $q = \pi$ via the symbol ${\sum_q}'$, we can write

$$\sum_{m}\big(c_m^\dagger - c_m\big)\big(c_{m+1}^\dagger + c_{m+1}\big) = {\sum_q}'\Big[e^{iq}\,\eta_q^\dagger\eta_q - e^{-iq}\,\eta_q\,\eta_q^\dagger - i\,e^{iq}\,\eta_q^\dagger\eta_{-q}^\dagger - i\,e^{-iq}\,\eta_q\,\eta_{-q}\Big].$$

Exploiting the anticommutation property for fermion operators, we can write

$$\eta_q\,\eta_q^\dagger = 1 - \eta_q^\dagger\eta_q, \qquad \eta_q\,\eta_{-q} = -\,\eta_{-q}\,\eta_q, \qquad \eta_{-q}^\dagger\eta_q^\dagger = -\,\eta_q^\dagger\eta_{-q}^\dagger.$$

Now observe that, pairing the terms at $q$ and $-q$,

$$-\,i\,e^{iq}\,\eta_q^\dagger\eta_{-q}^\dagger - i\,e^{-iq}\,\eta_{-q}^\dagger\eta_q^\dagger = 2\sin q\;\eta_q^\dagger\eta_{-q}^\dagger, \qquad -\,i\,e^{-iq}\,\eta_q\,\eta_{-q} - i\,e^{iq}\,\eta_{-q}\,\eta_q = 2\sin q\;\eta_{-q}\,\eta_q.$$

Putting everything together we get

$$\sum_{m}\big(c_m^\dagger - c_m\big)\big(c_{m+1}^\dagger + c_{m+1}\big) = 2\sum_{0<q<\pi}\Big[\cos q\,\big(\eta_q^\dagger\eta_q + \eta_{-q}^\dagger\eta_{-q} - 1\big) + \sin q\,\big(\eta_q^\dagger\eta_{-q}^\dagger + \eta_{-q}\,\eta_q\big)\Big].$$

Let us choose $q$ such that $0 < q < \pi$. Noting that the terms belonging to different wavenumbers commute with one another, we get

$$V_2 = \prod_{0<q<\pi} V_{2q}, \qquad V_{2q} = \exp\Big[2K_2\Big(\cos q\,\big(\eta_q^\dagger\eta_q + \eta_{-q}^\dagger\eta_{-q} - 1\big) + \sin q\,\big(\eta_q^\dagger\eta_{-q}^\dagger + \eta_{-q}\,\eta_q\big)\Big)\Big]. \qquad (50)$$
For odd $n$ we have neglected $q = 0$ and $q = \pi$, which we must consider explicitly. In the expression for $V_2$ we need to add for $q = 0$ the factor

$$\exp\Big[2K_2\Big(\eta_0^\dagger\eta_0 - \tfrac{1}{2}\Big)\Big],$$

and similarly for $q = \pi$ the factor

$$\exp\Big[-2K_2\Big(\eta_\pi^\dagger\eta_\pi - \tfrac{1}{2}\Big)\Big].$$

Now we can write $V_2$ for the odd case by combining these two factors with the product in (50).
We are now in a position to recast (31). Substituting (50) and (46), we obtain

$$V \equiv V_2^{1/2}\,V_1\,V_2^{1/2} = \big(2\sinh 2K_1\big)^{M/2}\prod_{0<q<\pi} V_q, \qquad V_q = V_{2q}^{1/2}\,V_{1q}\,V_{2q}^{1/2}, \qquad (51)$$

where $V_{1q} = \exp\big[2K_1^*\big(\eta_q^\dagger\eta_q + \eta_{-q}^\dagger\eta_{-q} - 1\big)\big]$.
From (46) we see that $V_1$ can be diagonalized simultaneously with the number operators $\eta_q^\dagger\eta_q$ as follows. Let us define the “vacuum” state $|0\rangle$ as representing the system with all spins in the “down” direction. Let

$$|0\rangle, \qquad \eta_q^\dagger\eta_{-q}^\dagger|0\rangle, \qquad \eta_q^\dagger|0\rangle, \qquad \eta_{-q}^\dagger|0\rangle$$

be a set of 4 vectors obtained by acting on $|0\rangle$ in 4 different ways as follows. Observe that for $V_{1q}$ we have

$$V_{1q}\,|0\rangle = e^{-2K_1^*}\,|0\rangle, \qquad V_{1q}\;\eta_q^\dagger\eta_{-q}^\dagger|0\rangle = e^{2K_1^*}\;\eta_q^\dagger\eta_{-q}^\dagger|0\rangle, \qquad V_{1q}\;\eta_{\pm q}^\dagger|0\rangle = \eta_{\pm q}^\dagger|0\rangle.$$

Let us choose as our basis the 4 vectors in this order. Then we can write

$$V_{1q} = \begin{pmatrix} e^{-2K_1^*} & 0 & 0 & 0 \\ 0 & e^{2K_1^*} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
On the other hand, $V_{2q}$ “mixes” $|0\rangle$ and $\eta_q^\dagger\eta_{-q}^\dagger|0\rangle$, i.e. it is not diagonal. To get around this problem, we proceed as follows. From (50), we obtain for the exponent

$$A_q = \cos q\,\big(\eta_q^\dagger\eta_q + \eta_{-q}^\dagger\eta_{-q} - 1\big) + \sin q\,\big(\eta_q^\dagger\eta_{-q}^\dagger + \eta_{-q}\,\eta_q\big),$$

for $0 < q < \pi$,

$$A_q\,|0\rangle = -\cos q\,|0\rangle + \sin q\;\eta_q^\dagger\eta_{-q}^\dagger|0\rangle, \qquad A_q\;\eta_q^\dagger\eta_{-q}^\dagger|0\rangle = \sin q\,|0\rangle + \cos q\;\eta_q^\dagger\eta_{-q}^\dagger|0\rangle.$$

So we know that we can write

$$e^{\theta A_q}\,|0\rangle = a(\theta)\,|0\rangle + b(\theta)\;\eta_q^\dagger\eta_{-q}^\dagger|0\rangle.$$
Differentiating, we get

$$a'(\theta) = -\cos q\;a(\theta) + \sin q\;b(\theta), \qquad b'(\theta) = \sin q\;a(\theta) + \cos q\;b(\theta). \qquad (54)$$
For $\theta = 0$ we have

$$e^{0\cdot A_q}\,|0\rangle = |0\rangle,$$

so that we can write

$$a(0) = 1, \qquad b(0) = 0.$$
Let us eliminate $b$:

$$a'' = -\cos q\;a' + \sin q\;b' = -\cos q\;a' + \sin^2 q\;a + \cos q\,\big(a' + \cos q\;a\big) = a,$$

where we used $\sin q\;b = a' + \cos q\;a$ from the first equation in (54).
This is a linear second order ordinary differential equation, which is easily solved. Assume that the solution is the sum of terms of the form $e^{\lambda\theta}$:

$$\lambda^2 - 1 = 0,$$

so that from the quadratic formula we get

$$\lambda = \pm 1.$$

So the general solution must be of the form

$$a(\theta) = \alpha\,e^{\theta} + \beta\,e^{-\theta}.$$
The initial condition at $\theta = 0$ gives us

$$\alpha + \beta = 1,$$

so that

$$a(\theta) = \alpha\,e^{\theta} + (1-\alpha)\,e^{-\theta}.$$

The other initial condition, $a'(0) = -\cos q$, gives us

$$\alpha - \beta = -\cos q,$$

so that

$$a(\theta) = \cosh\theta - \cos q\,\sinh\theta.$$
Having found $a(\theta)$ we can now find $b(\theta)$ from (54) as follows:

$$b(\theta) = \frac{a'(\theta) + \cos q\;a(\theta)}{\sin q} = \frac{\big(1 - \cos^2 q\big)\sinh\theta}{\sin q} = \sin q\,\sinh\theta.$$
We can proceed as above and write similarly

$$e^{\theta A_q}\;\eta_q^\dagger\eta_{-q}^\dagger|0\rangle = \bar{b}(\theta)\,|0\rangle + \bar{a}(\theta)\;\eta_q^\dagger\eta_{-q}^\dagger|0\rangle.$$

Repeating the above method, we obtain

$$\bar{a}(\theta) = \cosh\theta + \cos q\,\sinh\theta, \qquad \bar{b}(\theta) = \sin q\,\sinh\theta.$$
Meanwhile it is easy to check that

$$e^{\theta A_q}\;\eta_{\pm q}^\dagger|0\rangle = \eta_{\pm q}^\dagger|0\rangle,$$

since $A_q$ annihilates the one-fermion states.
Putting everything together we obtain

$$e^{\theta A_q} = \begin{pmatrix} \cosh\theta - \cos q\,\sinh\theta & \sin q\,\sinh\theta & 0 & 0 \\ \sin q\,\sinh\theta & \cosh\theta + \cos q\,\sinh\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
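As a check of the ODE solution, one can compare the closed-form $2\times 2$ block against a direct matrix exponential (Python with numpy/scipy; $q$ and $\theta$ are arbitrary test values):

```python
# exp(theta A) for A = [[-cos q, sin q], [sin q, cos q]] versus the
# closed-form entries a, b, abar found above.
import numpy as np
from scipy.linalg import expm

q, theta = 1.1, 0.6   # arbitrary test values
A = np.array([[-np.cos(q), np.sin(q)],
              [ np.sin(q), np.cos(q)]])
a    = np.cosh(theta) - np.cos(q) * np.sinh(theta)
abar = np.cosh(theta) + np.cos(q) * np.sinh(theta)
b    = np.sin(q) * np.sinh(theta)
print(np.allclose(expm(theta * A), [[a, b], [b, abar]]))  # True
```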
We thus see that this matrix is block-diagonal. The first block corresponds to the basis vectors $|0\rangle$ and $\eta_q^\dagger\eta_{-q}^\dagger|0\rangle$, which contain zero and two fermions respectively, whereas the second block relates to the one-fermion states. Let us call these blocks the even and odd blocks.
The odd block of $V_q$ is diagonal with both eigenvalues equal to $1$. It is easy to check that the even block is symmetric. Let us calculate it explicitly. Writing $V_{2q}^{1/2} = e^{K_2 A_q}$ and restricting everything to the even block,

$$V_q^{\text{even}} = e^{K_2 A_q}\begin{pmatrix} e^{-2K_1^*} & 0 \\ 0 & e^{2K_1^*} \end{pmatrix}e^{K_2 A_q}.$$

This is a symmetric matrix. Moreover, by factoring out the hyperbolic functions of $2K_1^*$ and $2K_2$ from its entries, we can write the above matrix as

$$V_q^{\text{even}} = \begin{pmatrix} A - B & C \\ C & A + B \end{pmatrix},$$

where

$$A = \cosh 2K_1^*\,\cosh 2K_2 + \cos q\,\sinh 2K_1^*\,\sinh 2K_2,$$

$$B = \sinh 2K_1^*\,\big(\cos^2 q\,\cosh 2K_2 + \sin^2 q\big) + \cos q\,\cosh 2K_1^*\,\sinh 2K_2,$$

$$C = \sin q\,\Big[\cosh 2K_1^*\,\sinh 2K_2 + \cos q\,\sinh 2K_1^*\,\big(\cosh 2K_2 - 1\big)\Big].$$

Note that we are using notation slightly different from that of the 1964 paper, where the corresponding quantities carry other names.
The matrix

$$\begin{pmatrix} A - B & C \\ C & A + B \end{pmatrix}$$

has eigenvalue $\lambda$ satisfying the characteristic equation

$$\lambda^2 - 2A\,\lambda + \big(A^2 - B^2 - C^2\big) = 0,$$

so that

$$\lambda_\pm = A \pm \sqrt{B^2 + C^2},$$

and we thus obtain, using the fact that the determinant of the even block is $A^2 - B^2 - C^2 = 1$,

$$\lambda_\pm = A \pm \sqrt{A^2 - 1}.$$

So the even block of $V_q$ has these two eigenvalues. (Note that there is a sign typo inside the square root in the original 1964 article.)
These eigenvalues are centered around their arithmetic mean $A$. Since we are more interested in the multiplicative rather than the additive properties of the partition function, it is more useful to express these eigenvalues as being centered around their geometric mean, which is $\sqrt{\lambda_+\lambda_-} = 1$. We will next show that we can write

$$\lambda_\pm = e^{\pm\varepsilon_q}. \qquad (69)$$

Simplifying, we get

$$\lambda_+ + \lambda_- = e^{\varepsilon_q} + e^{-\varepsilon_q} = 2\cosh\varepsilon_q = 2A.$$

Hence,

$$\cosh\varepsilon_q = A,$$

so that (69) is satisfied provided

$$\cosh\varepsilon_q = \cosh 2K_1^*\,\cosh 2K_2 + \cos q\,\sinh 2K_1^*\,\sinh 2K_2.$$
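Here is a numerical check (Python with numpy; the couplings and $q$ are arbitrary test values) that the symmetric even block $V_{2q}^{1/2}V_{1q}V_{2q}^{1/2}$ indeed has eigenvalues $e^{\pm\varepsilon_q}$ with $\cosh\varepsilon_q$ as given above:

```python
# Eigenvalues of the even block versus exp(+-eps_q).
import numpy as np

K1, K2, q = 0.5, 0.35, 1.9           # arbitrary test parameters
K1s = np.arctanh(np.exp(-2 * K1))    # dual coupling: tanh K1* = exp(-2 K1)

def B(theta):  # the even block of exp(theta A_q) found above
    return np.array([[np.cosh(theta) - np.cos(q) * np.sinh(theta),
                      np.sin(q) * np.sinh(theta)],
                     [np.sin(q) * np.sinh(theta),
                      np.cosh(theta) + np.cos(q) * np.sinh(theta)]])

V1q = np.diag([np.exp(-2 * K1s), np.exp(2 * K1s)])
Vq = B(K2) @ V1q @ B(K2)             # V2q^(1/2) V1q V2q^(1/2)
eps = np.arccosh(np.cosh(2 * K1s) * np.cosh(2 * K2)
                 + np.cos(q) * np.sinh(2 * K1s) * np.sinh(2 * K2))
print(np.sort(np.linalg.eigvalsh(Vq)), np.exp([-eps, eps]))  # should agree
```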
To proceed further, Schultz, Mattis and Lieb re-express the fermion operators once again in terms of “rotated” quantities, using a method that was used by Bogoliubov and Valatin to simplify the BCS theory of superconductivity. We refer the interested reader to the original paper. We will skip this step and instead focus directly on the thermodynamic limit.
Even without knowing the details of the basis that diagonalizes $V$ in (51), we already know the eigenvalues. So in the diagonalizing basis (which we do not need to know) we can write $V$ in terms of these eigenvalues. From the above, the eigenvalues are known to be of the form

$$\Lambda = \big(2\sinh 2K_1\big)^{M/2}\exp\Big[\sum_{0<q<\pi}\varepsilon_q\,\big(n_q + n_{-q} - 1\big)\Big], \qquad n_{\pm q}\in\{0, 1\}.$$

Noting that the $\pm\varepsilon_q$ can be paired and canceled whenever $n_q + n_{-q} = 1$, we get for the largest eigenvalue (all modes occupied)

$$\Lambda_{\max} = \big(2\sinh 2K_1\big)^{M/2}\exp\Big[\sum_{0<q<\pi}\varepsilon_q\Big].$$
In the thermodynamic limit, only the largest eigenvalue contributes, as is well known. Taking the limit of the $MN$-th root of the above, considering only the larger of the eigenvalues, and noting that the sum over $q$ above is over $M/2$ values only (so only half the full range from $-\pi$ to $\pi$), we obtain the partition function per site:

$$\lim_{M,N\to\infty} Z^{1/MN} = \big(2\sinh 2K_1\big)^{1/2}\exp\Big[\frac{1}{2\pi}\int_0^{\pi}\varepsilon_q\;dq\Big]. \qquad (74)$$
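The thermodynamic-limit formula can be tested against the finite-$M$ transfer matrix of Section 2 (Python with numpy/scipy; $K_1$, $K_2$ and $M$ are arbitrary off-critical test values):

```python
# (2 sinh 2K1)^(1/2) exp[(1/2 pi) Int_0^pi eps_q dq]  versus  lambda_max^(1/M).
import numpy as np
from itertools import product
from scipy.integrate import quad

K1, K2, M = 0.4, 0.3, 8   # arbitrary test parameters
K1s = np.arctanh(np.exp(-2 * K1))

def eps(q):
    return np.arccosh(np.cosh(2 * K1s) * np.cosh(2 * K2)
                      + np.cos(q) * np.sinh(2 * K1s) * np.sinh(2 * K2))

analytic = np.sqrt(2 * np.sinh(2 * K1)) * np.exp(quad(eps, 0, np.pi)[0] / (2 * np.pi))

# Largest eigenvalue of the finite-M column transfer matrix (as in Section 2).
states = list(product([-1, 1], repeat=M))
T = np.zeros((2**M, 2**M))
for i, mu in enumerate(states):
    for j, mup in enumerate(states):
        row = K1 * sum(mu[m] * mup[m] for m in range(M))
        col = K2 * sum(mup[m] * mup[(m + 1) % M] for m in range(M))
        T[i, j] = np.exp(row + col)
lam_max = np.max(np.abs(np.linalg.eigvals(T)))
print(analytic, lam_max**(1 / M))   # already very close for M = 8
```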
5. The symmetric forms
The above expression is easy to put into the more commonly used forms. Let us first prove the following identity, which is actually found in Onsager’s seminal paper from 1944:

$$\varepsilon_q = \frac{1}{\pi}\int_0^{\pi}\ln\big(2\cosh\varepsilon_q - 2\cos x\big)\,dx. \qquad (75)$$
Amazingly, this identity is never proven in the usual textbooks of statistical mechanics. We will not satisfy ourselves with just accepting this identity on faith. Where is the proof? Huang’s graduate level textbook merely quotes the identity, giving no proof. Newell and Montroll’s well-known review article similarly just throws the identity at the reader, which is also what Onsager did. Reichl’s book on statistical physics at least contains an evaluated definite integral, but without going into details on how it is obtained. Indeed, why waste our time? Just look it up in some integral table or use Maple or Mathematica, right?
Since our goal here is to actually understand what is going on, we insist on proving the expression. The above intimidating identity is not as unapproachable as it seems. In fact, readers familiar with Mahler measures will recognize it as closely related to a simple one-variable logarithmic Mahler measure.
To prove the above identity, we first establish the following identity for complex $z$ with $|z| \le 1$:

$$\frac{1}{2\pi}\int_0^{2\pi}\ln\big(1 - z\,e^{i\theta}\big)\,d\theta = 0. \qquad (76)$$
Let us evaluate the above integral using power series. For $|z| < 1$ the Taylor series for the logarithm converges absolutely and we can use the Fubini-Tonelli theorem to interchange the order of sums and integrals:

$$\frac{1}{2\pi}\int_0^{2\pi}\ln\big(1 - z\,e^{i\theta}\big)\,d\theta = -\frac{1}{2\pi}\sum_{n=1}^{\infty}\frac{z^n}{n}\int_0^{2\pi} e^{in\theta}\,d\theta = 0.$$

The case $|z| = 1$ follows from taking the limit $r \to 1^-$ of $z = r\,e^{i\phi}$ for real $r$ from inside the unit disk.
We have now proven (76), but it is worth asking what it means. A sum of logarithms is the logarithm of a product, so (76) is, in fact, the complex logarithm of the geometric mean of $1 - z\,e^{i\theta}$ over $\theta$. In essence, (76) means that the geometric mean of the quantity $1 - z\,e^{i\theta}$ over a circle of radius $|z|$ and centered at 1 on the complex plane is 1, hence the logarithm of the geometric mean is 0. Moreover, the real part of the integral is given by the Mahler measure,

$$m\big(1 - z\,x\big) = \frac{1}{2\pi}\int_0^{2\pi}\ln\big|1 - z\,e^{i\theta}\big|\,d\theta = 0 \qquad (|z| \le 1).$$
Mahler measures are actually quite deep and connect diverse areas of mathematics and theoretical physics, such as random walks, the enumeration of spanning trees, lattice sums, determinantal processes, number theory and modular forms. Discussion of Mahler measures is beyond the scope of this article.
Next we establish (75), with the help of (76). Let $\varepsilon_q > 0$ without loss of generality and let $z = e^{-\varepsilon_q}$. Then,

$$2\cosh\varepsilon_q - 2\cos x = e^{\varepsilon_q}\big(1 - z\,e^{ix}\big)\big(1 - z\,e^{-ix}\big),$$

so that

$$\frac{1}{\pi}\int_0^{\pi}\ln\big(2\cosh\varepsilon_q - 2\cos x\big)\,dx = \varepsilon_q + \frac{1}{2\pi}\int_0^{2\pi}\ln\big(1 - z\,e^{ix}\big)\,dx = \varepsilon_q,$$

and the claim follows.
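A quick numerical confirmation of (75) (Python with numpy/scipy; $g$ plays the role of $\varepsilon_q$):

```python
# (1/pi) Int_0^pi ln(2 cosh g - 2 cos x) dx should equal g for g > 0.
import numpy as np
from scipy.integrate import quad

for g in (0.3, 1.0, 2.7):
    val = quad(lambda x: np.log(2 * np.cosh(g) - 2 * np.cos(x)), 0, np.pi)[0] / np.pi
    print(g, val)   # val equals g to quadrature accuracy
```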
We can now insert this identity into (74):

$$\ln\lim_{M,N\to\infty} Z^{1/MN} = \frac{1}{2}\ln\big(2\sinh 2K_1\big) + \frac{1}{2\pi^2}\int_0^{\pi}\!dq\int_0^{\pi}\!dx\;\ln\big(2\cosh\varepsilon_q - 2\cos x\big).$$
Let us next get rid of the starred variables:

$$\cosh 2K_1^* = \coth 2K_1, \qquad \sinh 2K_1^* = \frac{1}{\sinh 2K_1}.$$

Substituting, we get

$$\ln\lim_{M,N\to\infty} Z^{1/MN} = \frac{1}{2\pi^2}\int_0^{\pi}\!dq\int_0^{\pi}\!dx\;\ln\Big[4\big(\cosh 2K_1\cosh 2K_2 + \cos q\,\sinh 2K_2 - \cos x\,\sinh 2K_1\big)\Big].$$
We do not need to integrate $q$ and $x$ from $-\pi$ to $\pi$; it is enough to go from $0$ to $\pi$. Taking the factor $4$ outside (which contributes $\ln 2$), substituting $q \to \pi - q$ (which flips the sign of $\cos q$), and renaming the integration variables to be more friendly, we finally get

$$\ln\lim_{M,N\to\infty} Z^{1/MN} = \ln 2 + \frac{1}{2\pi^2}\int_0^{\pi}\!\!\int_0^{\pi}\ln\Big[\cosh 2K_1\cosh 2K_2 - \sinh 2K_1\cos\theta_1 - \sinh 2K_2\cos\theta_2\Big]\,d\theta_1\,d\theta_2.$$
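The two expressions for the partition function per site can be cross-checked numerically (Python with numpy/scipy; arbitrary off-critical couplings):

```python
# Single-integral form (74) versus Onsager's double integral.
import numpy as np
from scipy.integrate import quad, dblquad

K1, K2 = 0.4, 0.3   # arbitrary test parameters
K1s = np.arctanh(np.exp(-2 * K1))

eps = lambda q: np.arccosh(np.cosh(2 * K1s) * np.cosh(2 * K2)
                           + np.cos(q) * np.sinh(2 * K1s) * np.sinh(2 * K2))
form1 = 0.5 * np.log(2 * np.sinh(2 * K1)) + quad(eps, 0, np.pi)[0] / (2 * np.pi)

f = lambda t1, t2: np.log(np.cosh(2 * K1) * np.cosh(2 * K2)
                          - np.sinh(2 * K1) * np.cos(t1)
                          - np.sinh(2 * K2) * np.cos(t2))
form2 = np.log(2) + dblquad(f, 0, np.pi, 0, np.pi)[0] / (2 * np.pi**2)
print(form1, form2)   # should agree
```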
This is the celebrated result of Onsager for the anisotropic 2D Ising model.
There is one other form which is worth mentioning. Let us specialize to the isotropic case $K_1 = K_2 = K$. Recall that

$$\ln\lim_{M,N\to\infty} Z^{1/MN} = \ln 2 + \frac{1}{2\pi^2}\int_0^{\pi}\!\!\int_0^{\pi}\ln\Big[\cosh^2 2K - \sinh 2K\,\big(\cos\theta_1 + \cos\theta_2\big)\Big]\,d\theta_1\,d\theta_2.$$

Let $\theta_1 = \omega_1 + \omega_2$ and $\theta_2 = \omega_1 - \omega_2$, so that

$$\cos\theta_1 + \cos\theta_2 = 2\cos\omega_1\cos\omega_2.$$

Let $\kappa = 2\sinh 2K/\cosh^2 2K$. The partition function for the isotropic case now becomes

$$\ln\lim_{M,N\to\infty} Z^{1/MN} = \ln\big(2\cosh 2K\big) + \frac{1}{2\pi^2}\int_0^{\pi}\!\!\int_0^{\pi}\ln\big(1 - \kappa\,\cos\omega_1\cos\omega_2\big)\,d\omega_1\,d\omega_2.$$
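As before, the isotropic form can be cross-checked against the anisotropic expression evaluated at $K_1 = K_2 = K$ (Python with numpy/scipy; $K$ is an arbitrary off-critical test value):

```python
# Isotropic double integral versus the anisotropic formula at K1 = K2 = K.
import numpy as np
from scipy.integrate import dblquad

K = 0.35   # arbitrary test parameter
kappa = 2 * np.sinh(2 * K) / np.cosh(2 * K)**2

f_iso = lambda w1, w2: np.log(1 - kappa * np.cos(w1) * np.cos(w2))
iso = np.log(2 * np.cosh(2 * K)) + dblquad(f_iso, 0, np.pi, 0, np.pi)[0] / (2 * np.pi**2)

f_ani = lambda t1, t2: np.log(np.cosh(2 * K)**2
                              - np.sinh(2 * K) * (np.cos(t1) + np.cos(t2)))
ani = np.log(2) + dblquad(f_ani, 0, np.pi, 0, np.pi)[0] / (2 * np.pi**2)
print(iso, ani)   # should agree
```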
One of the integrals can be carried out, in the manner usually found in the textbooks. Moreover, one of us (GMV) recently carried out the second integration in terms of a generalized hypergeometric function (see http://arxiv.org/abs/1411.2495).
6. Closing remarks
We note that there are several ways of solving the 2D Ising model. The “algebraic” method presented above is a simplified version of the original calculation of Onsager and also of Kaufman. Then, there is the “combinatorial” solution which is based on enumerating closed graphs using a Markovian random walk process on the lattice (see here). Other methods include the star-triangle transformation, the method of Pfaffians, and the Grassmann variable approach. Each method has its advantages and sheds light on a different aspect of the problem.
Acknowledgements
GMV thanks Sílvio Salinas for suggesting the study of the method of Schultz, Mattis and Lieb in an e-mail of 16 Nov 2014, and also for discussions when GMV visited USP and Salinas visited Natal during 2014.