# Internal energy of the 2D Ising model

The above domain coloring plot shows an analytic continuation to the complex plane of the internal energy of the 2D Ising model. The domain variable is $z=\tanh \beta J$, which is a standard high temperature variable used to study the 2D Ising model.

I originally made the plot in 2014 soon after I first began to study the exact solution of the Ising model. A few days ago, I happened to find this old picture from 2014 and decided to share it here.

What a beautiful model! I had forgotten just how beautiful the structure is.

The 2 circles represent the thermodynamic limit of the zeroes of the partition function. In this limit, singularities appear where the zeroes would be, as can be seen from Onsager’s formula. The critical temperature of the Ising model corresponds to the singularities on the real line, of which there are 4. The two singularities closer to the origin on the real line give the well known positive and negative critical temperatures $\pm \beta_c J \approx 0.44$. The other two give physically non-realizable complex critical temperatures, which are related to the standard critical temperatures via a symmetry known as the inversion relation (see MT Jaekel and JM Maillard, Symmetry relations in exactly soluble models, Journal of Physics A: Mathematical and General, 1982).
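The quoted value $\beta_c J \approx 0.44$ is easy to check numerically. A minimal Python sketch, using the standard self-duality condition $\tanh \beta_c J = \sqrt 2 - 1$ (equivalently $\sinh 2\beta_c J = 1$):

```python
from math import atanh, sinh, sqrt

# Self-dual critical point of the square lattice Ising model:
# z_c = tanh(beta_c J) = sqrt(2) - 1, equivalently sinh(2 beta_c J) = 1.
z_c = sqrt(2) - 1
beta_c = atanh(z_c)                        # this is beta_c * J
assert abs(sinh(2 * beta_c) - 1) < 1e-12   # equivalent criticality condition
print(round(beta_c, 4))                    # 0.4407
```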

Mathematica Code:

f[z_] := (-\[Pi] (1 + z^2)^2 +
     2 (1 - 6 z^2 + z^4) EllipticK[(16 z^2 (-1 + z^2)^2)/(1 + z^2)^4])/(
    2 \[Pi] (z + z^3));

paint[z_] :=
  Module[{x = Re[z], y = Im[z], color, shade},
   color = Hue[Rescale[ArcTan[-x, -y], {-Pi, Pi}]];
   shade = Mod[Log[2, Abs[x + I y]], 1];
   (* combine the hue (phase) with a brightness modulation (modulus);
      the original combination step was lost, so this is one reasonable choice *)
   Darker[color, shade/3]];

ParametricPlot[{x, y}, {x, -3, 3}, {y, -3, 3},
 ColorFunctionScaling -> False,
 ColorFunction -> Function[{x, y}, paint[f[x + y I]]], Frame -> True,
 MaxRecursion -> 1, PlotPoints -> 300, PlotRangePadding -> 0,
 Axes -> False, Mesh -> {Range[-5, 5], Range[-5, 5]},
 MeshFunctions -> {(Re@f[#1 + I #2] &), (Im@f[#1 + I #2] &)},
 MeshStyle -> Opacity[0.3], ImageSize -> 400]
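For readers without Mathematica, the same function $f(z)$ can be evaluated in Python. This sketch computes $K(m)$ via the arithmetic-geometric mean, $K(m)=\pi/(2\,\mathrm{AGM}(1,\sqrt{1-m}))$, which agrees with Mathematica's parameter convention `EllipticK[m]`; the helper name `ellipK` and the sample point are my own, and the naive principal-branch square root is only safe away from the branch cuts:

```python
import cmath
from math import pi

def ellipK(m, tol=1e-14):
    """Complete elliptic integral K(m) = K(k^2), via the arithmetic-geometric
    mean: K(m) = pi / (2 * AGM(1, sqrt(1 - m)))."""
    a, b = 1 + 0j, cmath.sqrt(1 - m)
    while abs(a - b) > tol:
        a, b = (a + b) / 2, cmath.sqrt(a * b)
    return pi / (2 * a)

def f(z):
    """Analytic continuation of the internal energy in z = tanh(beta J)."""
    m = 16 * z**2 * (z**2 - 1)**2 / (1 + z**2)**4
    return (-pi * (1 + z**2)**2
            + 2 * (1 - 6*z**2 + z**4) * ellipK(m)) / (2 * pi * (z + z**3))

u = f(0.2 + 0j)   # a point on the real axis, high temperature side
```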

# Exact solution of the 2D Ising model via Grassmann variables

1. Introduction

In 1980 Stuart Samuel gave what I consider to be one of the most elegant exact solutions of the 2D Ising model. He used Grassmann variables to formulate the problem in terms of a free-fermion model, via the fermionic path integral approach. The relevant Grassmann action is quadratic, so that the solution can be found via well-known formulas for Gaussian integrals.

In previous articles, I derived Onsager’s celebrated expression for the partition function of the 2D Ising model, using two different methods. First, I used a combinatorial method of graph counting that exploits the theory of random walks (see here). In the second article (see here), I reviewed the method of Schultz, Mattis and Lieb of treating the 2D Ising model as a problem of many fermions. Here I further explore the connection between the Ising model and fermions.

This article is based on my study notes. I have more or less followed Samuel’s solution of the 2D Ising model using Grassmann variables, but I have expanded the calculations for my own convenience.

Whereas the behavior of bosons can be described using standard path integrals, known as bosonic path integrals, the path integral formulation for fermions requires the use of Grassmann variables. Grassmann numbers can be integrated, but not using the standard approach based on Riemann sums and Lebesgue integrals; instead, they must be integrated using Berezin integration. I have previously written about Grassmann variables and Berezin integrals here.

In what follows, I assume familiarity with the 2D Ising model. Readers who find it difficult to understand the details are referred to the two previously mentioned articles, which are introductory and easier to understand.

The partition function can be approached as a power series either in a high temperature variable or a low temperature one. Consider first the high temperature expansion. It is well known that the partition function of the 2D Ising model is proportional to the generating function of connected or disconnected graphs in which all nodes have even degree, and edges link only nearest neighbors on the infinite square lattice. Graph nodes correspond to sites and links correspond to bonds. In such graphs, every incoming bond at a site has at least one corresponding outgoing bond, because each site has an even number of bonds. To each such “closed graph,” there are corresponding directed graphs, where the links are directed. Since each bond appears at most once in any such directed graph, we can enumerate such closed directed graphs by assigning pairs of Grassmann variables to each bond. In this case, the generating function, when evaluated in a suitable high temperature variable (such as ${u=\tanh \beta J}$), gives the Ising model partition function up to a known prefactor.
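The even-degree-subgraph identity just described, $\Xi = 2^N (\cosh \beta J)^{\#\text{bonds}} \sum_{\text{even subgraphs}} u^{|E|}$, can be checked by brute force on a tiny lattice. A sketch, where the 2$\times$2 periodic lattice and the value of $\beta J$ are arbitrary choices of mine:

```python
from itertools import product
from math import cosh, exp, tanh

L = 2                  # 2 x 2 periodic lattice (illustrative size)
betaJ = 0.7            # arbitrary coupling for the check
sites = [(i, j) for i in range(L) for j in range(L)]
bonds = []
for i in range(L):
    for j in range(L):
        bonds.append(((i, j), (i, (j + 1) % L)))   # horizontal bond
        bonds.append(((i, j), ((i + 1) % L, j)))   # vertical bond

# direct partition function: sum over all spin configurations
Z = 0.0
for spins in product((1, -1), repeat=len(sites)):
    s = dict(zip(sites, spins))
    Z += exp(betaJ * sum(s[a] * s[b] for a, b in bonds))

# high temperature expansion: sum over subgraphs with all degrees even
u = tanh(betaJ)
S = 0.0
for sub in product((0, 1), repeat=len(bonds)):
    deg = {p: 0 for p in sites}
    for used, (a, b) in zip(sub, bonds):
        if used:
            deg[a] += 1
            deg[b] += 1
    if all(d % 2 == 0 for d in deg.values()):
        S += u ** sum(sub)

assert abs(Z - 2**len(sites) * cosh(betaJ)**len(bonds) * S) < 1e-9
```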

Here, I do not use the above high temperature expansion, in terms of loops or multipolygons. Instead, I use a low temperature expansion and enumerate non-overlapping magnetic domains, following Samuel’s original work. Specifically, each configuration of the Ising model corresponds to a specific domain wall configuration. So summing over all domain wall configurations is equivalent to a sum over all spin states, except for a 2-fold degeneracy, since each domain wall configuration corresponds to 2 possible spin configurations.

Since the Ising model on the square lattice is self-dual, the high temperature approach using overlapping multipolygons and the low temperature approach using non-overlapping multipolygons on the dual lattice are equivalent, of course. Either way, explicit evaluation of a Berezin integral gives the exact solution. Specifically, we assign 4 Grassmann variables to each site of the dual lattice. Equivalently, one can think of it as follows: there are two types of Bloch walls, “vertical” domain walls and “horizontal” walls, and each wall segment has two ends. A Bloch wall can thus be represented by Grassmann variables at 2 different sites. A corner of a Bloch domain can be represented by matching a vertical and horizontal variable at a single site. Finally, so-called “monomer” terms consist of 2 horizontal or 2 vertical variables. They represent the absence of a corresponding bond, and also of corners. A single monomer term at a site represents a domain wall that goes straight through a site, but perpendicularly to the monomer and without going around a corner. Two monomer terms at the same site represent a site interior to domain walls, i.e., not on a domain wall. Using the terms corresponding to Bloch walls, corners and monomers, we then construct an action.

2. Overview of the strategy

The solution strategy is as follows. We will exploit 2 properties of Grassmann variables. The first property is anticommutativity, so that the square of a Grassmann variable is zero. This nilpotency property can be exploited to count every domain wall unit (or bond in the high temperature method) no more than once. The second property we will exploit is a feature that is specific to the Berezin integral. Recall that a Berezin multiple integral is nonzero only if the Grassmann variables being integrated exactly match or correspond to the integration variables. If there is a single integration variable that is not matched, the whole multiple integral vanishes, because ${\int d\psi=0}$ for the Berezin integral of any Grassmann variable ${\psi}$. From now onwards, we will say that the integrand “saturates” the Berezin integral if all integration variables are matched by integrand variables. So the second property is the ability of the multiple Berezin integral to select only terms that saturate the integral.

The essence of the trick is to exponentiate the suitably chosen action. The Taylor series for a function of a finite number of Grassmann variables is a polynomial (because of the nilpotency). This polynomial can code all kinds of graphs and other strange groupings of parts of graphs, such as isolated corners or monomers. By Berezin integration, we can then select only the terms that saturate the integral. If the action is chosen appropriately, the saturating terms are precisely those that correspond to the non-overlapping Bloch domains. (In the high temperature variant, the action instead generates the desired multipolygons.)

It will turn out that for the 2D Ising model this action is a quadratic form, so that we essentially have a free-fermion model. Specifically, the quadratic action is partially “diagonalized” by the Fourier transform, immediately leading to the exact solution originally found by Onsager. Of all the methods of solution of the Ising model, I find this method the simplest, most beautiful and most powerful.

What I also found quite fascinating is that for the cubic lattice one obtains a quartic action instead, as is well known to experts in the field. So, in this sense, the cubic Ising model is not directly equivalent to a free-fermion model, but rather to a model of interacting fermions.

3. The Grassmann action

We assume the reader is familiar with the Ising model on the square lattice. Let the Boltzmann weight ${t= e^{-\beta J}}$ be our chosen low temperature variable. Then we can write the partition function as

$\displaystyle \Xi= \sum_{\sigma} t^{H(\sigma)} \ \ \ \ \ (1)$

where ${H}$ is the Ising model Hamiltonian and the sum is over all spin states. The choice of symbol ${\Xi}$ for the partition function was made so that the letter ${Z}$ can be reserved for the partition function per site in the thermodynamic limit, to be defined further below.

For a given spin configuration, let us consider the set of all bonds in the excited state, i.e. with antiparallel spins. These excited bonds form the Bloch walls separating domains of opposite magnetization. On the dual lattice, these Bloch walls form non-overlapping loops or polygons. Moreover, every Bloch wall configuration corresponds to exactly 2 spin configurations, so that we can rewrite the partition function as

$\displaystyle \Xi= 2\sum _{\mbox{\tiny loops}} t^{\gamma} t^{-(2N-\gamma)} = 2t^{-2N}\sum _{\mbox{\tiny loops}} t^{2\gamma} ~. \ \ \ \ \ (2)$

Here, the factor ${t^\gamma}$ is due to the Bloch walls and ${2N-\gamma}$ is the number of unexcited bonds, i.e. the bonds in the interiors of the Bloch domains. We thus see the partition function can be calculated by enumerating non-overlapping loops and summing them with proper Boltzmann weights.
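Equation (2), together with the 2-to-1 correspondence between spin and wall configurations, can be verified by brute force on a small periodic lattice. A sketch, where the 3$\times$3 size and the value of $t$ are arbitrary choices of mine:

```python
from itertools import product
from collections import Counter

L = 3          # 3 x 3 periodic lattice (illustrative size)
N = L * L      # number of sites; the number of bonds is 2N
t = 0.3        # low temperature variable t = exp(-beta J)

def wall_pattern(spins):
    """Tuple of bond excitation flags; sum(pattern) is the wall length gamma."""
    pat = []
    for i in range(L):
        for j in range(L):
            s = spins[i * L + j]
            pat.append(s != spins[i * L + (j + 1) % L])     # horizontal bond
            pat.append(s != spins[((i + 1) % L) * L + j])   # vertical bond
    return tuple(pat)

patterns = Counter(wall_pattern(s) for s in product((1, -1), repeat=N))
assert all(c == 2 for c in patterns.values())   # 2 spin configs per wall config

Xi = sum(c * t**(2 * sum(p) - 2 * N) for p, c in patterns.items())   # eq. (1)
Xi_prime = sum(t**(2 * sum(p)) for p in patterns)                    # eq. (3)
assert abs(Xi - 2 * t**(-2 * N) * Xi_prime) < 1e-9 * Xi              # eq. (2)
```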

Let us define a modified partition function for these loops by

$\displaystyle \Xi'= \sum _{\mbox{\tiny loops}} t^{2\gamma} \ \ \ \ \ (3)$

so that ${\Xi=2 t^{-2N} \Xi'}$. Our goal henceforth is thus to calculate ${\Xi'}$. To do so, we will use Grassmann numbers and Berezin integration.

Let us define at each site ${x}$ of the 2D lattice a pair of Grassmann variables in the vertical direction, ${\eta_{\pm 1}(x)}$ and another pair for the horizontal direction ${\eta_{\pm 2}(x)}$.

We can now define an action for our fermionic path integral as follows. Each configuration of nonoverlapping loops consists of (i) individual segments of loops that link neighboring sites on the dual lattice (Bloch wall units), (ii) sites where the domain wall goes through the site and (iii) sites where the Bloch domain cuts a corner. We will thus write the total action as a sum of 3 terms: (i) the Bloch wall or “line” term ${S_L}$ , (ii) the “monomer” term ${S_M}$ and (iii) the corner term ${S_C}$:

$\displaystyle S= S_L + S_M + S_C ~. \ \ \ \ \ (4)$

We will then exponentiate this action and use a Berezin integral to obtain ${\Xi'}$:

$\displaystyle \Xi'= (-1)^N \int e^{\beta S} \prod_x d\eta_{-1}(x) d\eta_{+1}(x) d\eta_{-2}(x) d\eta_{+2}(x) ~ \ \ \ \ \ (5)$

We will use the same convention used by Samuel, so

$\displaystyle \int d\eta_{-1} d\eta_{+1} \eta_{-1} \eta_{+1} =1~. \ \ \ \ \ (6)$

Let us define the Bloch wall terms by

$\displaystyle S_L= t^2 \sum_x[ \eta_{+1}(x) \eta_{-1}(x +\hat 1) + \eta_{+2}(x) \eta_{-2}(x +\hat 2) ] ~. \ \ \ \ \ (7)$

To remain consistent with this definition the corner term must be defined as

$\displaystyle S_C = \sum_x[ \eta_{+1}(x) \eta_{-2}(x) + \eta_{+2}(x) \eta_{-1}(x) + \eta_{+2}(x) \eta_{+1}(x) + \eta_{-2}(x) \eta_{-1}(x) ] ~. \ \ \ \ \ (8)$

To see why, consider a corner formed along a path going horizontally forward followed by vertically forward. You thus have 2 Bloch walls segments meeting at the corner. We want to saturate first the horizontal wall, then the vertical. The horizontal wall contributes with ${\eta_{-1}(x)}$ and the vertical wall with ${\eta_{+2}(x)}$. So we want to saturate the Berezin integral at the site ${x}$ with the corner factor ${\eta_{+1}\eta_{-2}}$. This is the first term in the corner action. The 3 other terms are similarly deduced.

Meanwhile, the monomer terms are even simpler. We want the monomer terms to “do nothing”, i.e. to contribute with a factor of 1 when (and only when) needed. From the sign convention (6) we thus obtain simply

$\displaystyle S_M= \sum_x[ \eta_{-1}(x) \eta_{+1}(x) + \eta_{-2}(x) \eta_{+2}(x) ] ~ . \ \ \ \ \ (9)$

Note that corner and monomer terms have an even number of Grassmann variables per site, while the line term has only one Grassmann variable on each of two neighboring sites. So to saturate the Berezin integral, an even number of line terms (0, 2, or 4) must come together at a given site.

The Berezin integral for a fixed site ${x}$ can only saturate in the following ways:

1. Two monomer terms, one horizontal and one vertical.
2. Two lines and a monomer.
3. Two lines and a corner.
4. Four lines.

The following are prohibited at any site:

1. An odd number of lines (because there is no way to saturate missing Grassmann variables).
2. One corner and one monomer (because one Grassmann variable will necessarily be repeated, so that the nilpotency kills the term, and similarly one variable will be missing, also leading to zero).
3. A double corner of 4 domain walls and 2 corners (because then you would have more Grassmann variables at the site than the 4 available, so some variable necessarily repeats and the nilpotency kills the term).

There is one other interesting case: 2 corner terms with no lines. In this case, each Grassmann variable at the site appears exactly once, so such terms do in fact contribute. Moreover, every such term is matched by another term with the two other kinds of corners. For example the two-corner term

$\displaystyle [\eta_{+1}(x) \eta_{-2}(x) ][ \eta_{+2}(x) \eta_{-1}(x) ]$

is matched by the term

$\displaystyle [\eta_{-2}(x) \eta_{-1}(x) ][ \eta_{+2}(x) \eta_{+1}(x) ] ~.$

But

$\displaystyle \begin{array}{rcl} [\eta_{-2}(x) \eta_{-1}(x) ][ \eta_{+2}(x) \eta_{+1}(x) ] &=& -\eta_{-2}(x) \eta_{+2}(x) \eta_{-1}(x) \eta_{+1}(x) \\ &=& +\eta_{-2}(x) \eta_{+2}(x) \eta_{+1}(x) \eta_{-1}(x) \\ &=& -\eta_{-2}(x) \eta_{+1}(x) \eta_{+2}(x) \eta_{-1}(x) \\ &=& +\eta_{+1}(x) \eta_{-2}(x) \eta_{+2}(x) \eta_{-1}(x) ~. \end{array}$

So the two ways of combining 2 corners lead to double the contribution. But the first double corner is actually the negative of the term with two monomers:

$\displaystyle \begin{array}{rcl} [\eta_{+1}(x) \eta_{-2}(x) ][ \eta_{+2}(x) \eta_{-1}(x) ] &=& \eta_{+2}(x) \eta_{-1}(x) \eta_{+1}(x) \eta_{-2}(x) \\ &=& - \eta_{-1}(x) \eta_{+2}(x) \eta_{+1}(x) \eta_{-2}(x) \\ &=& + \eta_{-1}(x) \eta_{+1}(x) \eta_{+2}(x) \eta_{-2}(x) \\ &=& - \eta_{-1}(x) \eta_{+1}(x) \eta_{-2}(x) \eta_{+2}(x) \end{array}$

so that at each site the double monomer term plus the two double corner terms produce a net contribution of ${+1-2=-1}$. If there are ${N}$ sites, then the number of sites on the lines is necessarily even, so that the number ${N'}$ of sites not on the walls satisfies ${(-1)^{N'}=(-1)^N}$. So there is an overall factor of ${(-1)^N}$, as seen in (5).
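The sign bookkeeping above can be automated. The following sketch recomputes the net site factor ${+1-2=-1}$ by counting transpositions; the labels `m1, p1, m2, p2` stand for ${\eta_{-1}, \eta_{+1}, \eta_{-2}, \eta_{+2}}$ and are my own:

```python
def perm_sign(seq, ref):
    """Sign picked up when reordering the Grassmann product `seq` into the
    reference order `ref` (each swap of neighbors contributes a factor -1)."""
    seq, sign = list(seq), 1
    for i, v in enumerate(ref):
        j = seq.index(v)
        if j != i:
            seq[j], seq[i] = seq[i], seq[j]
            sign = -sign
    return sign

ref = ["m1", "p1", "m2", "p2"]        # eta_{-1} eta_{+1} eta_{-2} eta_{+2}: integrates to +1
monomer  = ["m1", "p1", "m2", "p2"]   # double monomer term
corner_a = ["p1", "m2", "p2", "m1"]   # [eta_{+1} eta_{-2}][eta_{+2} eta_{-1}]
corner_b = ["m2", "m1", "p2", "p1"]   # [eta_{-2} eta_{-1}][eta_{+2} eta_{+1}]

net = sum(perm_sign(s, ref) for s in (monomer, corner_a, corner_b))
print(net)   # -1
```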

Meanwhile, every domain wall segment has weight ${t^2}$, so that a graph of non-overlapping loops of total loop length ${\gamma}$ will have a weight of ${t^{2\gamma}}$. There are many different ways to permute the lines, corners and monomers, but this is cancelled by the factorial in the denominator of the Taylor expansion of the exponential function. The final result is that the right hand sides of (5) and (3) are equal.

4. Diagonalization and exact solution

The quadratic action is translationally invariant, so the Fourier transform will diagonalize it, i.e. in the Fourier conjugate variable the action is diagonal. By this we mean that the action does not mix different Fourier frequencies, up to the pairing of ${k}$ with ${-k}$ discussed below.

So let us define the (unitary) Fourier transform ${\hat \eta}$ of the Grassmann variables ${\eta}$:

$\displaystyle \eta_i(x) = \frac 1 {\sqrt{N}} \sum_k e^{ik\cdot x} \hat \eta_i(k) ~. \ \ \ \ \ (10)$

Here ${x}$ and ${k}$ are 2-dimensional vectors. It will be more convenient for us to write

$\displaystyle k = k_x \hat 1 + k_y \hat 2\ \ \ \ \ (11)$

where

$\displaystyle \begin{array}{rcl} k_x&=& \frac {2\pi n_x} {N_x} \\ k_y&=& \frac {2\pi n_y} {N_y} ~, \end{array}$

where ${N_x}$ and ${N_y}$ are the number of rows and columns of the lattice, with ${N_x N_y=N}$.

We could take ${n_x}$ ranging over ${0,\ldots, N_x-1}$, but we will instead take ${N_x}$ and ${N_y}$ to be odd and use negative wavenumbers as follows. Let ${N_x=2 L_x+1}$ and ${N_y=2 L_y+1}$. Then,

$\displaystyle \begin{array}{rcl} n_x&=& -L_x, -(L_x-1), \ldots, L_x-1, L_x\\ n_y&=& -L_y, -(L_y-1), \ldots, L_y-1, L_y ~. \end{array}$

It is easy to check that the ${\hat \eta}$ are Grassmann numbers, a fact that follows from the unitarity of the Fourier transform. Unitarity also guarantees that

$\displaystyle \prod_x d\eta_{-1}(x) d\eta_{+1}(x) d\eta_{-2}(x) d\eta_{+2}(x) = \prod_k d\hat\eta_{-1}(k) d\hat\eta_{+1}(k) d\hat\eta_{-2}(k) d\hat\eta_{+2}(k) \ \ \ \ \ (12)$

so that we can explicitly evaluate the Berezin integral provided we can rewrite the Grassmann action in simple enough form in the Fourier transformed Grassmann variables. This we do next.
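The unitarity (and hence orthogonality) of the discrete Fourier phases for an odd number of sites can be checked directly. A small sketch, with an illustrative size of my own choosing:

```python
import cmath

Lx = 3
Nx = 2 * Lx + 1   # odd number of sites; wavenumbers n = -Lx, ..., Lx

def phase_overlap(n1, n2):
    """(1/Nx) * sum_x exp(i (k1 - k2) x): should equal the Kronecker delta."""
    return sum(cmath.exp(2j * cmath.pi * (n1 - n2) * x / Nx)
               for x in range(Nx)) / Nx

for n1 in range(-Lx, Lx + 1):
    for n2 in range(-Lx, Lx + 1):
        assert abs(phase_overlap(n1, n2) - (1 if n1 == n2 else 0)) < 1e-12
```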

Notice that the action is quadratic in the ${\eta}$. Specifically, the action is a quadratic form in the Grassmann variables that we can write as

$\displaystyle S= \sum_{xyij} \eta_i(x) A_{ij}(x,y) \eta_j(y) ~. \ \ \ \ \ (13)$

We will explicitly define ${A_{ij}(x,y)}$ below, but for the moment let us not worry about it, except to note that it is simply a matrix or linear operator kernel. Note that the “matrix” ${A}$ is actually a tensor of rank 4, because there are 4 indices: ${i,~j, ~x,~y}$. However, for a fixed pair of not-necessarily-distinct sites ${x}$ and ${y}$, ${A}$ is a genuine (4${\times}$4) matrix with only two indices. Similarly, for fixed ${i, ~j}$, ${A}$ is an ${N\times N}$ matrix. We will apply the Fourier transform on the site variables, ${x,~y}$, noting that the action has the form of an inner product, and by the unitarity of the Fourier transform, it is possible to rewrite the action as

$\displaystyle S= \sum_{kk'ij} \hat\eta^*_i(k) \hat A_{ij}(k,k') \hat\eta_j(k') ~. \ \ \ \ \ (14)$

All we have done here is apply Plancherel’s theorem, sometimes also known as Parseval’s theorem. Hatted symbols represent Fourier transformed quantities.

I have previously written about the magical powers of the Fourier transform. For example: (i) Convolutions become products and vice versa under Fourier transformation; (ii) differential operators transform to simple algebraic multipliers; (iii) translations become phase shifts. It is this last property which is the most important for our purposes here.

The translationally invariant quadratic action for the 2D Ising model only connects neighboring sites. The only non-diagonal part of the action is due to this shift to neighboring sites. But upon Fourier transformation, we get rid of the shifts and obtain an action which is diagonal in the Fourier transform variable ${k}$. In other words, ${\hat A_{ij}(k,k') = \hat A_{ij}(k,k) \delta(k,k')}$ where ${\delta(k,k')}$ is the Kronecker delta function.

Let us introduce more compact notation to make the calculations easier. Let us define the vector

$\displaystyle \eta= \left[\begin{array}{c} \eta_{-1} \\ \eta_{+1} \\ \eta_{-2} \\ \eta_{+2} \end{array} \right] \ \ \ \ \ (15)$

and similarly for ${\hat \eta}$. Then we can write

$\displaystyle \begin{array}{rcl} S_L&=& \sum_{xy} \eta^T(x) A_L(x,y) \eta(y) \\ S_C&=& \sum_{xy} \eta^T(x) A_C(x,y) \eta(y) \\ S_M&=& \sum_{xy} \eta^T(x) A_M(x,y) \eta(y) ~. \end{array}$

The superscript ${\cdot^T}$ denotes transpose. The matrices ${A_C}$ and ${A_M}$ are given by (8) and (9) and are diagonal in the site indices, i.e. in ${x,~y}$.

$\displaystyle \begin{array}{rcl} A_C(x,y)&=& \delta(x,y) \left[ \begin{array}{cccc} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \end{array} \right] \\ A_M(x,y)&=& \delta(x,y) \left[ \begin{array}{cccc} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{array} \right] ~, \end{array}$

where ${\delta(x,y)}$ is a Kronecker delta function, not to be confused with the Dirac delta. The matrix ${A_L}$ is not diagonal in the site indices because it connects neighboring sites:

$\displaystyle \begin{array}{rcl} A_L(x,y)&=& t^2\delta(x+\hat 1,y) \left[ \begin{array}{cccc} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array} \right] +t^2 \delta(x+\hat 2,y) \left[ \begin{array}{cccc} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{array} \right] ~. \end{array}$

In terms of ${\hat \eta}$, we can thus write the action as

$\displaystyle S= \sum_{k}\hat \eta^\dagger(k) B(k,k) \hat \eta(k) ~, \ \ \ \ \ (16)$

Here ${B=\hat A}$. The reason the sum is now over a single index ${k}$ is that ${A}$ is translationally invariant, and application of Parseval’s (or Plancherel’s) theorem allows us to convert the site shift to a phase, so that ${\hat A(k_1,k_2)=\delta(k_1,k_2) \hat A(k_1,k_1)=B(k_1,k_1)}$. In other words, ${\hat A=B}$ is diagonal in the Fourier variable ${k}$, which is conjugate to the site index ${x}$.

As an illustrative example, let us show this last point explicitly. Consider the Fourier transform of ${\sum_y C(x,y) \eta(y)}$, where ${C(x,y)=C_0 \delta(x+\hat 1,y)}$ is translationally invariant:

$\displaystyle \begin{array}{rcl} & & \frac 1 {\sqrt{N}} \sum_{x=1}^N e^{-ik\cdot x} \sum_y C_0 \delta(x+\hat 1,y) \eta(y) \\ & & = \frac {C_0} {\sqrt{N}} \sum_{x=1}^N e^{-ik \cdot x} \eta(x+\hat 1) \\ & & = \frac {C_0} {\sqrt{N}} \sum_{x=1}^N e^{-ik \cdot (x-\hat 1)} \eta(x) \\ & & = \frac {C_0 e^{i k\cdot \hat 1}} {\sqrt{N}} \sum_{x=1}^N e^{-ik \cdot x} \eta(x) \\ & & = C_0 e^{i k\cdot \hat 1}\,\hat \eta(k) ~. \end{array}$

Hence ${B}$ is diagonal in the sense previously explained. It is easily calculated to be

$\displaystyle \begin{array}{rcl} B(k,k) &=& \left[ \begin{array}{cccc} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 \end{array} \right] \nonumber \\ & & \quad +t^2 e^{ik\cdot \hat 1} \left[ \begin{array}{cccc} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array} \right] +t^2 e^{ik\cdot \hat 2} \left[ \begin{array}{cccc} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{array} \right] \\ &=& \left[ \begin{array}{cccc} 0 & 1 & 0 & 0 \\ t^2 e^{ik_x} & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 1 & t^2 e^{ik_y} & 0 \end{array} \right] ~. \end{array}$
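The mechanism behind this diagonal structure, a translation becoming a phase, can also be verified numerically by conjugating a one-step shift matrix with the unitary DFT. A sketch, with an arbitrary size $N=7$ of my own choosing:

```python
import cmath

N = 7
# unitary DFT matrix and the one-step shift operator (S eta)(x) = eta(x+1)
F = [[cmath.exp(-2j * cmath.pi * m * x / N) / N**0.5 for x in range(N)]
     for m in range(N)]
S = [[1.0 if (x + 1) % N == y else 0.0 for y in range(N)] for x in range(N)]

def mm(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

Fd = [[F[j][i].conjugate() for j in range(N)] for i in range(N)]   # F dagger
D = mm(mm(F, S), Fd)

# D should be diagonal, with the shift converted into phases exp(i k)
for m in range(N):
    for mp in range(N):
        expected = cmath.exp(2j * cmath.pi * m / N) if m == mp else 0.0
        assert abs(D[m][mp] - expected) < 1e-10
```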

Notice that a daggered variable ${\hat \eta^\dagger(k)}$ (i.e., a conjugate transpose) appears in the action. We need to take care of this. Since ${\eta(x)}$ is, in principle, a “real” Grassmann variable, its Fourier transform satisfies ${[\hat \eta(k)]^* = \hat \eta(-k)}$. This “Hermitian” property of the Fourier transform of real functions allows us to write the action as

$\displaystyle S= \sum_{k}\hat \eta^T(-k) B(k,k) \hat \eta(k) ~. \ \ \ \ \ (17)$

We can explicitly evaluate the needed Berezin integral. But some care is necessary. Observe that the action “mixes” frequencies ${k}$ and ${-k}$. When we rewrite the exponential of the sum in the above action as a product of exponentials of the summands, these exponentials will contain not only the 4 Grassmann variables that make up the vector ${\hat \eta(k)}$, but also the other 4 variables in ${\hat \eta(-k)}$. So the full Berezin integral will factor not into Berezin integrals over 4 variables, but over 8 variables. So we will regroup the Grassmann differentials in groups of 8 rather than in groups of 4, as follows:

$\displaystyle \begin{array}{rcl} & & \prod_k d\hat\eta_{-1}(k) d\hat\eta_{+1}(k) d\hat\eta_{-2}(k) d\hat\eta_{+2}(k)\\ & & = \prod_{k\geq 0} d\hat\eta_{-1}(k) d\hat\eta_{+1}(k) d\hat\eta_{-2}(k) d\hat\eta_{+2}(k) ~ d\hat\eta_{-1}(-k) d\hat\eta_{+1}(-k) d\hat\eta_{-2}(-k) d\hat\eta_{+2}(-k) ~. \end{array}$

In order to correctly factor the full Berezin integral, we need to collect all terms with ${k}$ and ${-k}$, which we can do by rewriting the sum in the action over half the values of ${k}$. We will abuse the notation and write ${k\geq 0}$ to mean that ${k}$ and ${-k}$ are not both included in the summation. For instance, we can accomplish this by taking ${k_x \geq 0}$. With this notation, we can write

$\displaystyle \begin{array}{rcl} S &=& \sum_{k\geq0}\left( \hat \eta^T(-k) B(k,k) \hat \eta(k) + \hat \eta^T(k) B(-k,-k) \hat \eta(-k) \right) \\ &=& \sum_{k\geq0}\left( \hat \eta^T(-k) B(k,k) \hat \eta(k) - \hat \eta^T(-k) B^T(-k,-k) \hat \eta(k) \right) \\ &=& \sum_{k\geq0} \hat \eta^T(-k) \left[ B(k,k) - B^T(-k,-k) \right] \hat \eta(k) ~. \end{array}$

If we define

$\displaystyle B'(k) = B(k,k)-B^T(-k,-k) \ \ \ \ \ (18)$

then we can write the action as

$\displaystyle S= \sum_{k\geq0} \hat \eta^T(-k) B'(k) \hat \eta (k) ~. \ \ \ \ \ (19)$

Let us write

$\displaystyle d\hat \eta(k) d\hat \eta(-k) = d\hat\eta_{-1}(k) d\hat\eta_{+1}(k) d\hat\eta_{-2}(k) d\hat\eta_{+2}(k) ~ d\hat\eta_{-1}(-k) d\hat\eta_{+1}(-k) d\hat\eta_{-2}(-k) d\hat\eta_{+2}(-k)~. \ \ \ \ \ (20)$

The full Berezin integral now factors and can be written

$\displaystyle \begin{array}{rcl} & & \Xi'= \prod_{k\geq 0} \int { d\hat\eta(k) d\hat\eta(-k) } e^{\hat \eta^T(-k) B'(k) \hat \eta (k)} ~. \end{array}$

Each of the Berezin integral factors is a Gaussian integral. Recall that

$\displaystyle \int e^{-\sum_{ij}x_i A_{ij} y_j} dx dy = \det(A) \ \ \ \ \ (21)$

for Grassmann numbers ${x_i}$ and ${y_i}$ and a complex matrix ${A}$ (see here for an explanation). So

$\displaystyle \Xi'= \prod_{k\geq 0} \det B'(k) ~. \ \ \ \ \ (22)$

Removing the restriction to ${k \geq 0}$ we have

$\displaystyle \Xi'= \prod_{k} \sqrt {\det B'} ~. \ \ \ \ \ (23)$

Taking logarithms and dividing by the number of sites ${N}$ we have

$\displaystyle \frac 1 N \log \Xi' = \frac 1 2 \frac 1 N \sum_k \log \det B' ~. \ \ \ \ \ (24)$

In the thermodynamic limit, the sum becomes an integral and the partition function per site

$\displaystyle Z=\lim_{N\rightarrow \infty} \Xi^{1/N} \ \ \ \ \ (25)$

is thus given by

$\displaystyle \log Z = \log t^{-2}~ + \frac 1 2 \int_{-\pi}^{\pi}\int_{-\pi}^{\pi} \log \det B'~ \frac{dk_x}{2\pi} \frac{dk_y}{2\pi} \ \ \ \ \ (26)$

The anti-Hermitian matrix ${B'}$ is easily found:

$\displaystyle B'(k)= \left[ \begin{array}{cccc} 0 & 1- t^2 e^{-ik_x} & -1 & -1 \\ -(1-t^2 e^{ik_x}) & 0 & 1 & -1 \\ 1 & -1 & 0 & 1-t^2e^{-ik_y} \\ 1 & 1 & -(1-t^2e^{ik_y}) & 0 \end{array} \right] ~. \ \ \ \ \ (27)$

Its determinant is

$\displaystyle \det B'= (1 + t^4)^2 - 2 t^2 (-1 + t^4) (\cos k_x + \cos k_y) ~. \ \ \ \ \ (28)$
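Formula (28) can be checked against (27) numerically. A sketch, where `det4` is a naive Laplace-expansion determinant and the random sampling is my own choice:

```python
import cmath
import random
from math import cos, pi

def det4(M):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1)**j * M[0][j] * det4([row[:j] + row[j+1:] for row in M[1:]])
               for j in range(len(M)))

def Bprime(kx, ky, t):
    """The matrix B'(k) of equation (27)."""
    a = lambda k: 1 - t**2 * cmath.exp(-1j * k)
    b = lambda k: 1 - t**2 * cmath.exp(1j * k)
    return [[0,      a(kx),  -1,     -1],
            [-b(kx), 0,       1,     -1],
            [1,      -1,      0,      a(ky)],
            [1,       1,     -b(ky),  0]]

random.seed(1)
for _ in range(10):
    kx, ky = random.uniform(-pi, pi), random.uniform(-pi, pi)
    t = random.uniform(0.05, 0.95)
    lhs = det4(Bprime(kx, ky, t))
    rhs = (1 + t**4)**2 - 2 * t**2 * (t**4 - 1) * (cos(kx) + cos(ky))
    assert abs(lhs - rhs) < 1e-9   # eq. (28)
```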

Substituting into the integral, we arrive at the expression for the partition function per site:

$\displaystyle \begin{array}{rcl} \log Z &=& \log t^{-2}~ + \frac 1 2 \iint_{-\pi}^{\pi} \log \big[(1 + t^4)^2 - 2 t^2 (-1 + t^4) (\cos k_x + \cos k_y)\big] ~ \frac{dk_x}{2\pi} \frac{dk_y}{2\pi} \\ &=& \frac 1 2 \iint_{-\pi}^{\pi} \log \big[(t^{-2} + t^2)^2 - 2 (-t^{-2} + t^2) (\cos k_x + \cos k_y)\big] ~ \frac{dk_x}{2\pi} \frac{dk_y}{2\pi} \\ &=& \frac 1 2 \iint_{-\pi}^{\pi} \log \big[4 \cosh^2 2\beta J - 4 \sinh 2\beta J ~(\cos k_x + \cos k_y)\big] ~ \frac{dk_x}{2\pi} \frac{dk_y}{2\pi} \end{array}$

We are done! This is Onsager’s famous result, specialized to the case of equal couplings ${J=J_x=J_y}$. (In the last step, the sign of the ${\sinh}$ term was flipped via the substitution ${k_{x,y} \rightarrow k_{x,y}+\pi}$, which leaves the integral unchanged.)
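As a final sanity check, the double integral can be evaluated numerically and compared with the leading low temperature behavior ${\log Z \approx 2\beta J + t^8}$ (one flipped spin costs ${8J}$). A sketch: the grid size and the value of ${\beta J}$ are arbitrary, and the rectangle rule is very accurate here because the integrand is smooth and periodic away from criticality:

```python
from math import cos, cosh, exp, log, pi, sinh

def log_Z(betaJ, n=128):
    """Onsager's log partition function per site, via an n x n rectangle rule."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            kx = -pi + 2 * pi * (i + 0.5) / n
            ky = -pi + 2 * pi * (j + 0.5) / n
            total += log(4 * cosh(2 * betaJ)**2
                         - 4 * sinh(2 * betaJ) * (cos(kx) + cos(ky)))
    return 0.5 * total / n**2

betaJ = 1.5
t = exp(-betaJ)
# leading low temperature series: log Z = 2 betaJ + t^8 + O(t^12)
assert abs(log_Z(betaJ) - (2 * betaJ + t**8)) < 1e-6
```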

Exercise: Repeat the calculation using the high temperature variable ${u}$ instead of the low temperature variable ${t}$. (The final answer is of course the same.)

5. The cubic Ising model

Ever since 1944 when Onsager published his seminal paper, tentative “exact solutions” have been proposed over the years for the 3D or cubic Ising model. As mentioned earlier, the Grassmann action is quartic for the cubic Ising model. In quantum field theory, quartic Grassmann actions are associated with models of interacting fermions, whereas quadratic actions are associated with free-fermion models. The latter are easily solved via Pfaffian and determinant formulas, as we have done above, but at the present time there are no methods known to be able to give exact solutions (in the thermodynamic limit) of lattice models with quartic Grassmann actions. Hence, anybody claiming an exact solution to the cubic Ising model must explain how they overcame the mathematical difficulty of dealing with quartic actions, or at least how the new method bypasses this mathematical obstruction.

Barry Cipra, in an article in Science, referred to the Ising problem as the “Holy Grail of statistical mechanics.” The article lists a number of other reasons why we may never attain the goal of finding an explicit exact solution of the cubic Ising model.

Exact solution of the cubic Ising model may be an impossible problem!

I thank Francisco “Xico” Alexandre da Costa and Sílvio Salinas for calling my attention to the Grassmann variables approach to solving the Ising model.

# Fermionization of the 2-D Ising model: The method of Schultz, Mattis and Lieb

F. A da Costa, R. T. G. de Oliveira, G. M. Viswanathan

This blog post was written in co-authorship with my physics department colleague Professor Francisco “Xico” Alexandre da Costa and Professor Roberto Teodoro Gurgel de Oliveira, of the UFRN mathematics department. Xico obtained his doctorate under Professor Sílvio Salinas at the University of São Paulo. Roberto was a student of Xico many years ago, but left physics to study mathematics at IMPA in Rio de Janeiro in 2010. During 2006–2007, Roberto and Xico had written up a short text in Portuguese that included the exact solution of the Ising model on the infinite square lattice using the method of fermion operators developed by Schultz, Mattis and Lieb. With the aim of learning this method, I adapted their text and expanded many of the calculations for my own convenience. I decided to post it on this blog since others might also find it interesting. I have previously written an introduction to the 2-D Ising model here, where I review a combinatorial method of solution.

1. Introduction

The spins in the Ising model can only take on two values, ${\pm 1}$. This behavior is not unlike how the occupation number ${n}$ for some single particle state for fermions can only take on two values, ${n=0,1}$. It thus makes sense to try to solve the Ising model via fermionization. This is what Schultz, Mattis and Lieb accomplished in their well-known paper of 1964. In turn, their method of solution is a simplified version of Bruria Kaufman’s spinor analysis method, which is in turn a simplification of Onsager’s original method.

We will proceed as follows. First we will set up the transfer matrix. Next we will reformulate it in terms of the Pauli spin matrices for spin-${\tfrac 1 2}$ particles. Recall that in quantum field theory boson creation and annihilation operators satisfy the well-known commutation relations of the quantum harmonic oscillator, whereas fermion operators satisfy analogous anticommutation relations. The spin annihilation and creation operators ${\sigma_j^\pm }$ do not anticommute at distinct sites ${j}$ but instead commute, whereas fermion operators must anticommute at different sites. This problem of mixed commutation and anticommutation relations can be solved using a method known as the Jordan-Wigner transformation. This step completes the fermionic reformulation of the 2-D Ising model. To obtain the partition function in the thermodynamic limit, which is determined by the largest eigenvalue of the transfer matrix, one diagonalizes the fermionized transfer matrix using appropriate canonical transformations.
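The Jordan-Wigner step can be illustrated concretely: attaching a string of ${\sigma^z}$ operators to ${\sigma^-}$ produces operators with genuine fermionic anticommutation relations. The following sketch (pure Python, ${n=3}$ sites; all helper names are my own) verifies ${\{c_i, c_j^\dagger\}=\delta_{ij}}$:

```python
I2 = [[1, 0], [0, 1]]
Zp = [[1, 0], [0, -1]]    # sigma^z
Sm = [[0, 0], [1, 0]]     # sigma^-

def kron(A, B):
    return [[A[i][j] * B[k][l]
             for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def mm(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def madd(A, B):
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))] for i in range(len(A))]

def dag(A):   # conjugate transpose (all entries here are real)
    return [[A[j][i] for j in range(len(A))] for i in range(len(A[0]))]

n = 3
def c(j):
    """Jordan-Wigner fermion operator c_j = (prod_{l<j} sigma^z_l) sigma^-_j."""
    M = [[1]]
    for site in range(n):
        M = kron(M, Zp if site < j else (Sm if site == j else I2))
    return M

# check the canonical anticommutation relations {c_i, c_j^dagger} = delta_ij
for i in range(n):
    for j in range(n):
        anti = madd(mm(c(i), dag(c(j))), mm(dag(c(j)), c(i)))
        for r in range(2**n):
            for s in range(2**n):
                want = 1.0 if (i == j and r == s) else 0.0
                assert abs(anti[r][s] - want) < 1e-12
```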

# Onsager’s solution of the 2-D Ising model: The combinatorial method

1. Introduction

In this blog post, I derive, step by step, the exact partition function for the ferromagnetic Ising model on the square lattice. The result is celebrated as “Onsager’s solution” of the 2-D Ising model. It was originally derived by Lars Onsager in 1942 and published in 1944 in Physical Review [1]. That paper revolutionized the study of phase transitions and what we now call critical phenomena [2].

Somewhat ironically, I first heard about the Ising model when I was working in industry. I was 20 and held a summer job at what was then known as British Telecom Research Labs (BTRL), near Ipswich in the UK. This was before I had ever seen a cell phone or heard of the Internet (although I knew about BITNET and JANET). I worked there in the summer of 1990 and again for a month or so around April 1991. My job at BT involved writing C implementations of multilayer perceptrons and Hopfield neural nets. In those days, BT was interested in implementing hardware neural networks and my boss mentioned casually to me that certain kinds of neural nets are basically just special cases of the Ising model. (Indeed, the Hopfield network is closely related to the Ising spin glass.) Thus began my fascination with the Ising model. Later, in 1994 in Boston, I took a course given by Bill Klein at BU on statistical mechanics, where we went through the solution of the 1-D ferromagnetic Ising model. Still, I never had the chance to study properly the 2-D Ising model. As a PhD student, I would almost daily pass by a poster with a background photo of Lars Onsager (with a cigarette in his hand), hung near the office door of my advisor Gene Stanley, so I was regularly reminded of the 2-D Ising model. I kept telling myself that one day I would eventually learn how Onsager managed to do what seemed to me, at the time, an “impossible calculation.” That was 1994 and I am writing this in 2015!

In what follows, I solve the Ising model on the infinite square lattice, but I do not actually follow Onsager’s original argument. There are in fact several different ways of arriving at Onsager’s expression [3–9]. The method I use below is known as the combinatorial method; it was developed by van der Waerden, Kac and Ward, among others, and relies essentially on counting certain kinds of closed graphs (see refs. [3,10–13]). I more or less follow Feynman [3] and I have also relied on the initial portions of ref. [13].

2. The 2-D Ising model

Consider the two-dimensional lattice ${\Bbb Z^2}$, with a (somewhat idealized) spin-${\tfrac{1}{2}}$ particle located at each lattice point. Consider a finite subset of this lattice, of size ${L\times L}$, and let ${i,j=1,2,\ldots, L^2}$ label the sites. Let ${\cal N}$ denote the set of pairs of integers ${(i,j)}$ such that spins ${\sigma_i}$ and ${\sigma_j}$ are nearest neighbors. In the ferromagnetic 2-D Ising model with nearest-neighbor interactions, spins ${i}$ and ${j}$ interact if and only if ${(i,j)\in \cal N}$. Each spin can assume only 2 values: ${\sigma_i=\pm 1}$.
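For concreteness, the set ${\cal N}$ can be enumerated explicitly. The Python sketch below is my own illustration (the row-by-row site numbering and open boundary conditions are choices I make here, not part of the text above); an ${L\times L}$ lattice with open boundaries has ${2L(L-1)}$ nearest-neighbor pairs.

```python
def neighbor_pairs(L):
    """The set N of nearest-neighbor pairs (i, j) with i < j on an
    L x L square lattice with open boundaries; sites are numbered
    1..L^2 row by row."""
    pairs = set()
    for row in range(L):
        for col in range(L):
            i = row * L + col + 1
            if col + 1 < L:            # bond to the right neighbor
                pairs.add((i, i + 1))
            if row + 1 < L:            # bond to the neighbor below
                pairs.add((i, i + L))
    return pairs

# An L x L open lattice has 2L(L-1) bonds: L = 3 gives 12.
print(len(neighbor_pairs(3)))  # -> 12
```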

Consider a system of ${N=L^2}$ spins. The Hamiltonian for a spin configuration ${\sigma=(\sigma_1,\sigma_2,\ldots \sigma_N)}$ is given by

$\displaystyle H(\sigma)= -J \displaystyle\sum_{(i,j)\in \cal N} \sigma_i \sigma_j ~. \ \ \ \ \ (1)$

The sum over the nearest neighbors should avoid double counting, so that ${(i,j)}$ and ${(j,i)}$ are not counted separately. For simplicity, we will set ${J=1}$; this entails no loss of generality, since ${J}$ can be absorbed into ${\beta}$.
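As a check on the sign conventions, Eq. (1) can be evaluated directly for small lattices. The sketch below is my own Python illustration (open boundary conditions, ${J=1}$); it confirms that the fully aligned configuration attains the minimum energy, ${-1}$ per bond.

```python
def energy(sigma, L):
    """Eq. (1) with J = 1: H = -(sum of sigma_i * sigma_j over nearest
    neighbors), each pair counted once.  sigma is a flat list of +/-1
    for an L x L lattice with open boundaries, numbered row by row."""
    H = 0
    for row in range(L):
        for col in range(L):
            i = row * L + col
            if col + 1 < L:                  # bond to the right
                H -= sigma[i] * sigma[i + 1]
            if row + 1 < L:                  # bond below
                H -= sigma[i] * sigma[i + L]
    return H

L = 3
all_up = [1] * (L * L)            # fully aligned ground state
print(energy(all_up, L))          # -> -12, i.e. -2L(L-1)

one_flip = [-1] + [1] * (L * L - 1)   # flip a corner spin (2 bonds)
print(energy(one_flip, L))        # -> -8
```

Flipping the corner spin frustrates its 2 bonds, raising the energy by ${2\times 2 = 4}$, as the output shows.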

3. The canonical partition function

In equilibrium statistical mechanics, the canonical partition function contains all the information needed to recover the thermodynamic properties of a system with a fixed number of particles immersed in a heat bath; details can be found in any textbook on statistical mechanics [3–6,14].

I prefer to define the partition function as the two-sided Laplace transform of the degeneracy ${\Omega(E)}$ of the energy level ${E}$. But traditionally, the partition function is defined as a sum or integral over all possible states of the system:

$\displaystyle Z(\beta) = \displaystyle\sum _{\sigma} e^{-\beta H(\sigma)} ~. \ \ \ \ \ (2)$

The two ways of thinking are equivalent: grouping the states in Eq. (2) by energy gives ${Z(\beta) = \sum_E \Omega(E)\, e^{-\beta E}}$, the (discrete) two-sided Laplace transform of the degeneracy. The Laplace transform variable ${\beta}$ is related to the thermodynamic temperature ${T}$ via ${\beta =1/k_B T}$, where ${k_B}$ is the Boltzmann constant. What follows is the exact calculation of ${Z(\beta)}$.
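The equivalence of the two definitions can be checked by brute force on a small lattice. The sketch below is my own Python illustration (a ${2\times 2}$ lattice with open boundaries and ${J=1}$, far from the thermodynamic limit): it computes ${Z(\beta)}$ once as the sum of Eq. (2) over all ${2^N}$ configurations, and once as a sum over energy levels weighted by the degeneracy ${\Omega(E)}$.

```python
import math
from collections import Counter
from itertools import product

def energy(sigma, L):
    """Eq. (1) with J = 1, open boundaries, sites numbered row by row."""
    H = 0
    for row in range(L):
        for col in range(L):
            i = row * L + col
            if col + 1 < L:
                H -= sigma[i] * sigma[i + 1]
            if row + 1 < L:
                H -= sigma[i] * sigma[i + L]
    return H

def Z_states(beta, L):
    """Eq. (2): direct sum over all 2^(L^2) spin configurations."""
    return sum(math.exp(-beta * energy(s, L))
               for s in product((1, -1), repeat=L * L))

def Z_levels(beta, L):
    """The same Z as a Laplace-transform-style sum over energy
    levels, weighted by the degeneracy Omega(E)."""
    omega = Counter(energy(s, L) for s in product((1, -1), repeat=L * L))
    return sum(g * math.exp(-beta * E) for E, g in omega.items())

for beta in (0.0, 0.2, 0.44):
    assert abs(Z_states(beta, 2) - Z_levels(beta, 2)) < 1e-9
print("state sum and level sum agree")
```

At ${\beta=0}$ every configuration has weight 1, so both expressions reduce to the total number of states, ${2^4=16}$.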