Exact solution of the 2D Ising model via Grassmann variables

1. Introduction

In 1980 Stuart Samuel gave what I consider to be one of the most elegant exact solutions of the 2D Ising model. He used Grassmann variables to formulate the problem in terms of a free-fermion model, via the fermionic path integral approach. The relevant Grassmann action is quadratic, so that the solution can be found via well known formulas for Gaussian integrals.

In previous articles, I derived Onsager’s celebrated expression for the partition function of the 2D Ising model, using two different methods. First, I used a combinatorial method of graph counting that exploits the theory of random walks (see here). In the second article (see here), I reviewed the method of Schultz, Mattis and Lieb of treating the 2D Ising model as a problem of many fermions. Here I further explore the connection between the Ising model and fermions.

This article is based on my study notes. I have more or less followed Samuel’s solution of the 2D Ising model using Grassmann variables, but I have expanded the calculations for my own convenience.

Whereas on the one hand the behavior of bosons can be described using standard path integrals, known as bosonic path integrals, on the other hand the path integral formulation for fermions requires the use of Grassmann variables. Grassmann numbers can be integrated, but not using the standard approach based on Riemann sums and Lebesgue integrals, etc. Instead, they must be integrated using Berezin integration. I have previously written about Grassmann variables and Berezin integrals here.

In what follows, I assume familiarity with the 2D Ising model. Readers who find it difficult to understand the details are referred to the two previously mentioned articles, which are introductory and easier to understand.

The partition function can be approached as a power series either in a high temperature variable or a low temperature one. Consider first the high temperature expansion. It is well known that the partition function of the 2D Ising model is proportional to the generating function of connected or disconnected graphs in which all nodes have even degree, and edges link only nearest neighbors on the infinite square lattice. Graph nodes correspond to sites and links correspond to bonds. In such graphs, every incoming bond at a site has at least one corresponding outgoing bond, because each site has an even number of bonds. To each such “closed graph,” there are corresponding directed graphs, where the links are directed. Since each bond appears at most once in any such directed graph, but never twice or more, we can enumerate such closed directed graphs by assigning pairs of Grassmann variables to each bond. In this case, the generating function, when evaluated in a suitable high temperature variable (such as {u=\tanh \beta J}), gives the Ising model partition function up to a known prefactor.

Here, I do not use the above high temperature expansion, in terms of loops or multipolygons. Instead, I use a low temperature expansion and enumerate non-overlapping magnetic domains, following Samuel’s original work. Specifically, each configuration of the Ising model corresponds to a specific domain wall configuration. So summing over all domain wall configurations is equivalent to summing over all spin states, except for a 2-fold degeneracy, since each domain wall configuration corresponds to 2 possible spin configurations.

Since the Ising model on the square lattice is self-dual, the high temperature approach using overlapping multipolygons and the low temperature approach using non-overlapping multipolygons on the dual lattice are equivalent, of course. Either way, explicit evaluation of a Berezin integral gives the exact solution. Specifically, we assign 4 Grassmann variables to each site of the dual lattice. Equivalently, one can instead think as follows: there are two types of Bloch walls, “vertical” domain walls and “horizontal” walls, and each wall segment has two ends. A Bloch wall can thus be represented by Grassmann variables at 2 different sites. A corner of a Bloch domain can be represented by matching a vertical and horizontal variable at a single site. Finally, so-called “monomer” terms consist of 2 horizontal or 2 vertical variables. They represent the absence of a corresponding bond, and also of corners. A single monomer term at a site represents a domain wall that goes straight through a site, but perpendicularly to the monomer and without going around a corner. Two monomer terms at the same site represent a site interior to domain walls, i.e., not on a domain wall. Using the terms corresponding to Bloch walls, corners and monomers, we then construct an action.

2. Overview of the strategy

The solution strategy is as follows. We will exploit 2 properties of Grassmann variables. The first property is anticommutativity, so that the square of a Grassmann variable is zero. This nilpotency property can be exploited to count every domain wall unit (or bond in the high temperature method) no more than once. The second property we will exploit is a feature that is specific to the Berezin integral. Recall that a Berezin multiple integral is nonzero only if the Grassmann variables being integrated exactly match or correspond to the integration variables. If there is a single integration variable that is not matched, the whole multiple integral vanishes, because {\int d\psi=0} for the Berezin integral of any Grassmann variable {\psi}. From now onwards, we will say that the integrand “saturates” the Berezin integral if all integration variables are matched by integrand variables. So the second property is the ability of the multiple Berezin integral to select only terms that saturate the integral.

The essence of the trick is to exponentiate the suitably chosen action. The Taylor series for a function of a finite number of Grassmann variables is a polynomial (because of the nilpotency). This polynomial can code all kinds of graphs and other strange groupings of parts of graphs, such as isolated corners or monomers. By Berezin integration, we can then select only the terms that saturate the integral. If the action is chosen appropriately, the saturating terms are precisely those that correspond to the non-overlapping Bloch domains. (In the high temperature variant, the action instead generates the desired multipolygons.)

It will turn out that for the 2D Ising model this action is a quadratic form, so that we essentially have a free-fermion model. Specifically, the quadratic action is partially “diagonalized” by the Fourier transform, immediately leading to the exact solution originally found by Onsager. Of all the methods of solution of the Ising model, I find this method to be the most simple, beautiful and powerful.

What I also found quite fascinating is that for the cubic lattice one obtains a quartic action instead, as is well known to experts in the field. So, in this sense, the cubic Ising model is not directly equivalent to a free-fermion model, but rather to a model of interacting fermions.

3. The quadratic Grassmann action

We assume the reader is familiar with the Ising model on the square lattice. Let the Boltzmann weight {t= e^{-\beta J}} be our chosen low temperature variable. Then we can write the partition function as

\displaystyle \Xi= \sum_{\sigma} t^{H(\sigma)} \ \ \ \ \ (1)

where {H} is the Ising model Hamiltonian and the sum is over all spin states. The choice of symbol {\Xi} for the partition function was made so that the letter {Z} can be reserved for the partition function per site in the thermodynamic limit, to be defined further below.

For a given spin configuration, let us consider the set of all bonds in the excited state, i.e. with antiparallel spins. These excited bonds form the Bloch walls separating domains of opposite magnetization. On the dual lattice, these Bloch walls form non-overlapping loops or polygons. Moreover, every Bloch wall configuration corresponds to exactly 2 spin configurations, so that we can rewrite the partition function as

\displaystyle \Xi= 2\sum _{\mbox{\tiny loops}} t^{\gamma} t^{-(2N-\gamma)} = 2t^{-2N}\sum _{\mbox{\tiny loops}} t^{2\gamma} ~. \ \ \ \ \ (2)

Here, the factor {t^\gamma} is due to the Bloch walls and {2N-\gamma} is the number of bonds inside the Bloch domains. We thus see that the partition function can be calculated by enumerating non-overlapping loops and summing them with the proper Boltzmann weights.
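As a sanity check of (2), the following minimal sketch enumerates all spin states of a small periodic lattice by brute force. It assumes a 3×3 torus (so that there are exactly {2N} bonds) and the Hamiltonian in units of {J}, with each antiparallel bond contributing a factor {t} and each parallel bond a factor {t^{-1}}. The function names `broken_bonds` and `compare_sums` are hypothetical, chosen only for this illustration.

```python
from itertools import product
import math

def broken_bonds(spins, L):
    """Set of antiparallel nearest-neighbour bonds on an L x L periodic lattice."""
    walls = set()
    for i in range(L):
        for j in range(L):
            s = spins[i * L + j]
            if s != spins[i * L + (j + 1) % L]:      # bond to the right neighbour
                walls.add((i, j, "right"))
            if s != spins[((i + 1) % L) * L + j]:    # bond to the neighbour below
                walls.add((i, j, "down"))
    return frozenset(walls)

def compare_sums(L=3, t=0.7):
    """Return (sum over spin states, 2 t^{-2N} sum over wall sets, multiplicities)."""
    N = L * L
    over_spins = 0.0
    multiplicity = {}   # wall configuration -> number of spin states producing it
    for spins in product((-1, 1), repeat=N):
        walls = broken_bonds(spins, L)
        gamma = len(walls)                            # number of excited bonds
        over_spins += t ** gamma * t ** (-(2 * N - gamma))
        multiplicity[walls] = multiplicity.get(walls, 0) + 1
    over_walls = 2 * t ** (-2 * N) * sum(t ** (2 * len(w)) for w in multiplicity)
    return over_spins, over_walls, multiplicity

over_spins, over_walls, multiplicity = compare_sums()
print(over_spins, over_walls, set(multiplicity.values()))
```

Every wall configuration that occurs is produced by exactly 2 spin states (related by a global flip), which is the origin of the factor of 2 in (2).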

Let us define a modified partition function for these loops by

\displaystyle \Xi'= \sum _{\mbox{\tiny loops}} t^{2\gamma} \ \ \ \ \ (3)

so that {\Xi=2 t^{-2N} \Xi'}. Our goal henceforth is thus to calculate {\Xi'}. To do so, we will use Grassmann numbers and Berezin integration.

Let us define at each site {x} of the 2D lattice a pair of Grassmann variables in the horizontal direction, {\eta_{\pm 1}(x)}, and another pair for the vertical direction, {\eta_{\pm 2}(x)}.

We can now define an action for our fermionic path integral as follows. Each configuration of nonoverlapping loops consists of (i) individual segments of loops that link neighboring sites on the dual lattice (Bloch wall units), (ii) sites where the domain wall goes through the site and (iii) sites where the Bloch domain cuts a corner. We will thus write the total action as a sum of 3 terms: (i) the Bloch wall or “line” term {S_L} , (ii) the “monomer” term {S_M} and (iii) the corner term {S_C}:

\displaystyle S= S_L + S_M + S_C ~. \ \ \ \ \ (4)

We will then exponentiate this action and use a Berezin integral to obtain {\Xi'}:

\displaystyle \Xi'= (-1)^N \int e^{S} \prod_x d\eta_{-1}(x) d\eta_{+1}(x) d\eta_{-2}(x) d\eta_{+2}(x) ~. \ \ \ \ \ (5)

We will use the same convention used by Samuel, so

\displaystyle \int d\eta_{-1} d\eta_{+1} \eta_{-1} \eta_{+1} =1~. \ \ \ \ \ (6)

Let us define the Bloch wall terms by

\displaystyle S_L= t^2 \sum_x[ \eta_{+1}(x) \eta_{-1}(x +\hat 1) + \eta_{+2}(x) \eta_{-2}(x +\hat 2) ] ~. \ \ \ \ \ (7)

To remain consistent with this definition the corner term must be defined as

\displaystyle S_C = \sum_x[ \eta_{+1}(x) \eta_{-2}(x) + \eta_{+2}(x) \eta_{-1}(x) + \eta_{+2}(x) \eta_{+1}(x) + \eta_{-2}(x) \eta_{-1}(x) ] ~. \ \ \ \ \ (8)

To see why, consider a corner formed along a path going horizontally forward followed by vertically forward. You thus have 2 Bloch wall segments meeting at the corner. We want to saturate first the horizontal wall, then the vertical. The horizontal wall contributes {\eta_{-1}(x)} and the vertical wall {\eta_{+2}(x)}. So we want to saturate the Berezin integral at the site {x} with the corner factor {\eta_{+1}\eta_{-2}}. This is the first term in the corner action. The 3 other terms are similarly deduced.

Meanwhile, the monomer terms are even simpler. We want the monomer terms to “do nothing”, i.e. to contribute with a factor of 1 when (and only when) needed. From the sign convention (6) we thus obtain simply

\displaystyle S_M= \sum_x[ \eta_{-1}(x) \eta_{+1}(x) + \eta_{-2}(x) \eta_{+2}(x) ] ~ . \ \ \ \ \ (9)

Note that corner and monomer terms have an even number of Grassmann variables per site, while the line term has only one Grassmann variable on each of two neighboring sites. So to saturate the Berezin integral, an even number of line terms (so 0,2, or 4) must come together at a given site.

The Berezin integral for a fixed site {x} can only saturate in the following ways:

  1. Two monomer terms, one horizontal and one vertical.
  2. Two lines and a monomer.
  3. Two lines and a corner.
  4. Four lines.

The following are prohibited at any site:

  1. An odd number of lines (because there is no way to saturate missing Grassmann variables).
  2. One corner and one monomer (because one Grassmann variable will necessarily be repeated, so that the nilpotency kills the term, and similarly one variable will be missing, also leading to zero).
  3. A double corner of 4 domain walls and 2 corners (because then you have 6 Grassmann variables at the site).

There is one other interesting case: 2 corner terms with no lines. In this case, each Grassmann variable at the site appears exactly once, so such terms do in fact contribute. Moreover, every such term is matched by another term with the two other kinds of corners. For example the two-corner term

\displaystyle [\eta_{+1}(x) \eta_{-2}(x) ][ \eta_{+2}(x) \eta_{-1}(x) ]

is matched by the term

\displaystyle [\eta_{-2}(x) \eta_{-1}(x) ][ \eta_{+2}(x) \eta_{+1}(x) ] ~.


These two products are in fact equal, as a reordering of the factors shows:

\displaystyle \begin{array}{rcl} [\eta_{-2}(x) \eta_{-1}(x) ][ \eta_{+2}(x) \eta_{+1}(x) ] &=& -\eta_{-2}(x) \eta_{+2}(x) \eta_{-1}(x) \eta_{+1}(x) \\ &=& +\eta_{-2}(x) \eta_{+2}(x) \eta_{+1}(x) \eta_{-1}(x) \\ &=& -\eta_{-2}(x) \eta_{+1}(x) \eta_{+2}(x) \eta_{-1}(x) \\ &=& +\eta_{+1}(x) \eta_{-2}(x) \eta_{+2}(x) \eta_{-1}(x) ~. \end{array}

So the two ways of combining 2 corners lead to double the contribution. But the first double corner is actually the negative of the term with two monomers:

\displaystyle \begin{array}{rcl} [\eta_{+1}(x) \eta_{-2}(x) ][ \eta_{+2}(x) \eta_{-1}(x) ] &=& \eta_{+2}(x) \eta_{-1}(x) \eta_{+1}(x) \eta_{-2}(x) \\ &=& - \eta_{-1}(x) \eta_{+2}(x) \eta_{+1}(x) \eta_{-2}(x) \\ &=& + \eta_{-1}(x) \eta_{+1}(x) \eta_{+2}(x) \eta_{-2}(x) \\ &=& - \eta_{-1}(x) \eta_{+1}(x) \eta_{-2}(x) \eta_{+2}(x) \end{array}

so that at each site the double monomer term plus the two double corner terms produce a net contribution of {+1-2=-1}. If there are {N} sites, then the number of sites on the lines is necessarily even, so that the number {N'} of sites not on the walls satisfies {(-1)^{N'}=(-1)^N}. So there is an overall factor of {(-1)^N}, as seen in (5).
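The single-site bookkeeping above can be checked mechanically. The sketch below is a minimal Grassmann algebra in Python, not Samuel’s construction: the generators {\eta_{-1},\eta_{+1},\eta_{-2},\eta_{+2}} are numbered 0 to 3 (an arbitrary labeling), and the Berezin integral over the site is assumed to be the coefficient of the ordered monomial {\eta_{-1}\eta_{+1}\eta_{-2}\eta_{+2}}, which extends the convention (6) to both pairs. All function names are hypothetical.

```python
from fractions import Fraction

def mono_mul(a, b):
    """Multiply two ordered monomials (tuples of distinct generator indices).

    Returns (sign, sorted tuple); sign 0 means a generator repeats (nilpotency)."""
    if set(a) & set(b):
        return 0, ()
    seq, sign = list(a + b), 1
    for i in range(len(seq)):                 # bubble sort, one sign flip per swap
        for j in range(len(seq) - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return sign, tuple(seq)

def mul(f, g):
    """Product of two elements stored as {monomial: coefficient}."""
    out = {}
    for ma, ca in f.items():
        for mb, cb in g.items():
            sign, m = mono_mul(ma, mb)
            if sign:
                out[m] = out.get(m, 0) + sign * ca * cb
    return out

def add(f, g):
    out = dict(f)
    for m, c in g.items():
        out[m] = out.get(m, 0) + c
    return out

def exp(f, n_generators):
    """exp of a quadratic element; the series stops at power n_generators // 2."""
    result = {(): Fraction(1)}
    term = {(): Fraction(1)}
    for n in range(1, n_generators // 2 + 1):
        term = {m: c / n for m, c in mul(term, f).items()}
        result = add(result, term)
    return result

def pair(i, j):
    """The quadratic monomial eta_i eta_j."""
    sign, m = mono_mul((i,), (j,))
    return {m: sign}

# Assumed numbering: 0 = eta_{-1}, 1 = eta_{+1}, 2 = eta_{-2}, 3 = eta_{+2}.
S_M = add(pair(0, 1), pair(2, 3))                                   # monomers, eq. (9)
S_C = add(add(pair(1, 2), pair(3, 0)), add(pair(3, 1), pair(2, 0)))  # corners, eq. (8)

def site_integral(S):
    """Coefficient of eta_{-1} eta_{+1} eta_{-2} eta_{+2} in exp(S)."""
    return exp(S, 4).get((0, 1, 2, 3), 0)

monomers_only = site_integral(S_M)
corners_only = site_integral(S_C)
both = site_integral(add(S_M, S_C))
print(monomers_only, corners_only, both)
```

Under these assumptions the double monomer contributes {+1}, the two double corners together contribute {-2}, and the full on-site action gives {-1}, in agreement with the counting in the text.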

Meanwhile, every domain wall segment has weight {t^2}, so that a graph of non-overlapping loops of total loop length {\gamma} will have a weight of {t^{2\gamma}}. There are many different ways to permute the lines, corners and monomers, but this is cancelled by the factorial in the denominator of the Taylor expansion of the exponential function. The final result is that the right hand sides of (5) and (3) are equal.

4. Diagonalization and exact solution

The quadratic action is translationally invariant, so the Fourier transform will diagonalize it, i.e. in the new Fourier conjugate variable the action is diagonal. By this we mean that the action does not mix different Fourier frequencies, up to sign.

So let us define the (unitary) Fourier transform {\hat \eta} of the Grassmann variables {\eta}:

\displaystyle \eta_i(x) = \frac 1 {\sqrt{N}} \sum_k e^{ik\cdot x} \hat \eta_i(k) ~. \ \ \ \ \ (10)

Here {x} and {k} are 2-dimensional vectors. It will be more convenient for us to write

\displaystyle k = k_x \hat 1 + k_y \hat 2\ \ \ \ \ (11)


with

\displaystyle \begin{array}{rcl} k_x&=& \frac {2\pi n_x} {N_x} \\ k_y&=& \frac {2\pi n_y} {N_y} ~, \end{array}

where {N_x} and {N_y} are the number of rows and columns of the lattice and {N_x N_y=N}.

We could take {n_x} ranging as {0\ldots (N_x-1)}, but we will instead take {N_x} and {N_y} as odd and use negative wavenumbers as follows. Let {N_x=2 L_x+1} and {N_y=2 L_y+1}. Then,

\displaystyle \begin{array}{rcl} n_x&=& -L_x, -(L_x-1), \ldots, L_x-1, L_x\\ n_y&=& -L_y, -(L_y-1), \ldots, L_y-1, L_y ~. \end{array}
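To make the choice of wavenumbers concrete, here is a small sketch that lists the integer pairs {(n_x,n_y)} for odd {N_x=2L_x+1} and {N_y=2L_y+1} and checks that every mode has its partner {-k} in the same set. The helper names and the lexicographic rule used to pick one representative of each pair {\{k,-k\}} are assumptions made for this illustration only.

```python
import math

def mode_numbers(Lx, Ly):
    """All integer pairs (n_x, n_y) with -Lx <= n_x <= Lx and -Ly <= n_y <= Ly."""
    return [(nx, ny) for nx in range(-Lx, Lx + 1) for ny in range(-Ly, Ly + 1)]

def wavevector(n, Lx, Ly):
    """Wavevector (k_x, k_y) = (2 pi n_x / N_x, 2 pi n_y / N_y) for odd N_x, N_y."""
    nx, ny = n
    return (2 * math.pi * nx / (2 * Lx + 1), 2 * math.pi * ny / (2 * Ly + 1))

Lx, Ly = 1, 2
N = (2 * Lx + 1) * (2 * Ly + 1)          # number of sites, here 3 * 5 = 15
modes = mode_numbers(Lx, Ly)

# Every mode k has its partner -k in the set.
partners_present = all((-nx, -ny) in modes for nx, ny in modes)

# One representative per pair {k, -k}: n_x > 0, or n_x = 0 and n_y > 0.
half = [n for n in modes if n > (0, 0)]

print(len(modes), partners_present, len(half))
print(wavevector(half[0], Lx, Ly))
```

Because {N} is odd, the {N-1} nonzero modes split into {(N-1)/2} pairs, while {k=0} is its own partner.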

It is easy to check that the {\hat \eta} are Grassmann numbers, a fact that follows from the unitarity of the Fourier transform. Unitarity also guarantees that

\displaystyle \prod_x d\eta_{-1}(x) d\eta_{+1}(x) d\eta_{-2}(x) d\eta_{+2}(x) = \prod_k d\hat\eta_{-1}(k) d\hat\eta_{+1}(k) d\hat\eta_{-2}(k) d\hat\eta_{+2}(k) \ \ \ \ \ (12)

so that we can explicitly evaluate the Berezin integral provided we can rewrite the Grassmann action in simple enough form in the Fourier transformed Grassmann variables. This we do next.

Notice that the action is quadratic in the {\eta}. Specifically, the action is a quadratic form in the Grassmann variables that we can write as

\displaystyle S= \sum_{xyij} \eta_i(x) A_{ij}(x,y) \eta_j(y) ~. \ \ \ \ \ (13)

We will explicitly define {A_{ij}(x,y)} below, but for the moment let us not worry about it, except to note that it is simply a matrix or linear operator kernel. Note that the “matrix” {A} is actually a tensor of rank 4, because there are 4 indices: {i,~j, ~x,~y}. However, for a fixed pair of not-necessarily-distinct sites {x} and {y}, {A} is a genuine (4{\times}4) matrix with only two indices. Similarly, for fixed {i, ~j}, {A} is an {N\times N} matrix. We will apply the Fourier transform on the site variables, {x,~y}, noting that the action has the form of an inner product, and by the unitarity of the Fourier transform, it is possible to rewrite the action as

\displaystyle S= \sum_{kk'ij} \hat\eta^*_i(k) \hat A_{ij}(k,k') \hat\eta_j(k') ~. \ \ \ \ \ (14)

All we have done here is apply Plancherel’s theorem, sometimes also known as Parseval’s theorem. Hatted symbols represent Fourier transformed quantities.

I have previously written about the magical powers of the Fourier transform. For example: (i) Convolutions become products and vice versa under Fourier transformation; (ii) differential operators transform to simple algebraic multipliers; (iii) translations become phase shifts. It is this last property which is the most important for our purposes here.

The translationally invariant quadratic action for the 2D Ising model only connects neighboring sites. The only non-diagonal part of the action is due to this shift to neighboring sites. But upon Fourier transformation, we get rid of the shifts and obtain an action which is diagonal in the Fourier transform variable {k}. In other words, {\hat A_{ij}(k,k') = \hat A_{ij}(k,k) \delta(k,k')} where {\delta(k,k')} is the Kronecker delta function.

Let us introduce more compact notation to make the calculations easier. Let us define the vector

\displaystyle \eta= \left[\begin{array}{c} \eta_{-1} \\ \eta_{+1} \\ \eta_{-2} \\ \eta_{+2} \end{array} \right] \ \ \ \ \ (15)

and similarly for {\hat \eta}. Then we can write

\displaystyle \begin{array}{rcl} S_L&=& \sum_{xy} \eta^T(x) A_L(x,y) \eta(y) \\ S_C&=& \sum_{xy} \eta^T(x) A_C(x,y) \eta(y) \\ S_M&=& \sum_{xy} \eta^T(x) A_M(x,y) \eta(y) ~. \end{array}

The superscript {\cdot^T} denotes transpose. The matrices {A_C} and {A_M} are given by (8) and (9) and are diagonal in the site indices, i.e. in {x,~y}.

\displaystyle \begin{array}{rcl} A_C(x,y)&=& \delta(x,y) \left[ \begin{array}{cccc} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \end{array} \right] \\ A_M(x,y)&=& \delta(x,y) \left[ \begin{array}{cccc} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{array} \right] ~, \end{array}

where {\delta(x,y)} is a Kronecker delta function, not to be confused with the Dirac delta. The matrix {A_L} is not diagonal in the site indices because it connects neighboring sites:

\displaystyle \begin{array}{rcl} A_L(x,y)&=& t^2\delta(x+\hat 1,y) \left[ \begin{array}{cccc} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array} \right] +t^2 \delta(x+\hat 2,y) \left[ \begin{array}{cccc} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{array} \right] ~. \end{array}

In terms of {\hat \eta}, we can thus write the action as

\displaystyle S= \sum_{k}\hat \eta^\dagger(k) B(k,k) \hat \eta(k) ~, \ \ \ \ \ (16)

Here {B=\hat A}. The reason the sum is now over a single index {k} is that {A} is translationally invariant, and application of Parseval’s (or Plancherel’s) theorem allows us to convert the site shift to a phase, so that {\hat A(k_1,k_2)=\delta(k_1,k_2) \hat A(k_1,k_1)=\delta(k_1,k_2) B(k_1,k_1)}. In other words, {\hat A=B} is diagonal in the Fourier variable {k}, which is conjugate to the site index {x}.

As an illustrative example, let us show this last point explicitly. Consider the Fourier transform of {\sum_y C(x,y) \eta(y)}, where {C(x,y)=C_0 \delta(x+\hat 1,y)} is translationally invariant:

\displaystyle \begin{array}{rcl} & & \frac 1 {\sqrt{N}} \sum_{x=1}^N e^{-ik\cdot x} \sum_y C_0 \delta(x+\hat 1,y) \eta(y) \\ & & = \frac {C_0} {\sqrt{N}} \sum_{x=1}^N e^{-ik \cdot x} \eta(x+\hat 1) \\ & & = \frac {C_0} {\sqrt{N}} \sum_{x=1}^N e^{-ik \cdot (x-\hat 1)} \eta(x) \\ & & = \frac {C_0 e^{i k\cdot \hat 1}} {\sqrt{N}} \sum_{x=1}^N e^{-ik \cdot x} \eta(x) \\ & & = C_0 e^{i k\cdot \hat 1}\hat \eta(k) ~. \end{array}
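The shift-to-phase property used here depends only on the linearity of the Fourier transform, so it can be illustrated with ordinary numbers standing in for {\eta(x)}. The following sketch assumes a one-dimensional periodic chain of 7 sites and the sign convention of (10); the sample values and the name `dft` are made up for the illustration.

```python
import cmath
import math

def dft(seq):
    """Unitary discrete Fourier transform, hat(k) = N^{-1/2} sum_x exp(-i k x) seq[x],

    with k = 2 pi n / N. This is the inverse of the convention in (10)."""
    N = len(seq)
    return [
        sum(cmath.exp(-2j * math.pi * n * x / N) * seq[x] for x in range(N)) / math.sqrt(N)
        for n in range(N)
    ]

N = 7
C0 = 2.5
eta = [0.3, -1.2, 0.8, 2.0, -0.5, 1.1, 0.4]   # ordinary numbers standing in for eta(x)

# sum_y C0 delta(x + 1, y) eta(y) = C0 eta(x + 1), with periodic boundary conditions
shifted = [C0 * eta[(x + 1) % N] for x in range(N)]

lhs = dft(shifted)
rhs = [C0 * cmath.exp(2j * math.pi * n / N) * h for n, h in enumerate(dft(eta))]
max_err = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_err)
```

The two sides agree to rounding error: the shift by one site becomes multiplication by the phase {e^{ik}} for each mode separately, which is why the shift operator does not mix different values of {k}.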

Hence {B} is diagonal in the sense previously explained. It is easily calculated to be

\displaystyle \begin{array}{rcl} B(k,k) &=& \left[ \begin{array}{cccc} ~~0 & 1 & 0 & 0 \\ 0 & ~~0 & 1 & 0 \\ 1 & 0 & ~~0 & 1 \\ 1 & 1 & 0 & ~~0 \end{array} \right] \nonumber \\ & & \quad +t^2 e^{ik\cdot \hat 1} \left[ \begin{array}{cccc} ~~0 & 0 & ~~0 & ~~0 \\ 1 & ~~0 & ~~0 & ~~0 \\ ~~0 & ~~0 & ~~0 & ~~0 \\ ~~0 & ~~0 & ~~0 & ~~0 \end{array} \right] +t^2 e^{ik\cdot \hat 2} \left[ \begin{array}{cccc} ~~0 & ~~0 & ~~0 & ~~0 \\ ~~0 & ~~0 & ~~0 & ~~0 \\ ~~0 & ~~0 & ~~0 & 0 \\ ~~0 & ~~0 & 1 & ~~0 \end{array} \right] \\ &=& \left[ \begin{array}{cccc} ~~0 & 1 & 0 & 0 \\ t^2 e^{ik_x} & ~~0 & 1 & 0 \\ 1 & 0 & ~~0 & 1 \\ 1 & 1 & t^2 e^{ik_y} & ~~0 \end{array} \right] ~. \end{array}

Notice that a daggered variable {\hat \eta^\dagger(k)} (i.e., a conjugate transpose) appears in the action. We need to take care of this. Since {\eta(x)} is, in principle, a “real” Grassmann variable, its Fourier transform satisfies {[\hat \eta(k)]^* = \hat \eta(-k)}. This “Hermitian” property of the Fourier transform of real functions allows us to write the action as

\displaystyle S= \sum_{k}\hat \eta^T(-k) B(k,k) \hat \eta(k) ~. \ \ \ \ \ (17)

We can explicitly evaluate the needed Berezin integral. But some care is necessary. Observe that the action “mixes” frequencies {k} and {-k}. When we rewrite the exponential of the sum in the above action as a product of exponentials of the summands, these exponentials will contain not only the 4 Grassmann variables that make up the vector {\hat \eta(k)}, but also the other 4 variables in {\hat \eta(-k)}. So the full Berezin integral will factor not into Berezin integrals over 4 variables, but over 8 variables. So we will regroup the Grassmann differentials in groups of 8 rather than in groups of four, as follows:

\displaystyle \begin{array}{rcl} & & \prod_k d\hat\eta_{-1}(k) d\hat\eta_{+1}(k) d\hat\eta_{-2}(k) d\hat\eta_{+2}(k)\\ & & = \prod_{k\geq 0} d\hat\eta_{-1}(k) d\hat\eta_{+1}(k) d\hat\eta_{-2}(k) d\hat\eta_{+2}(k) ~ d\hat\eta_{-1}(-k) d\hat\eta_{+1}(-k) d\hat\eta_{-2}(-k) d\hat\eta_{+2}(-k) ~. \end{array}

In order to correctly factor the full Berezin integral, we need to collect all terms with {k} and {-k}, which we can do by rewriting the sum in the action as a sum over half the values of {k}. We will abuse the notation and write {k\geq 0} to mean that {k} and {-k} are not both included in the summation. For instance, we can accomplish this by taking {k_x \geq 0}. With this notation, we can write

\displaystyle \begin{array}{rcl} S &=& \sum_{k\geq0}\left( \hat \eta^T(-k) B(k,k) \hat \eta(k) + \hat \eta^T(k) B(-k,-k) \hat \eta(-k) \right) \\ &=& \sum_{k\geq0}\left( \hat \eta^T(-k) B(k,k) \hat \eta(k) - \hat \eta^T(-k) B^T(-k,-k) \hat \eta(k) \right) \\ &=& \sum_{k\geq0} \hat \eta^T(-k) \left[ B(k,k) - B^T(-k,-k) \right] \hat \eta(k) ~. \end{array}

If we define

\displaystyle B'(k) = B(k,k)-B^T(-k,-k) \ \ \ \ \ (18)

then we can write the action as

\displaystyle S= \sum_{k\geq0} \hat \eta^T(-k) B'(k) \hat \eta (k) ~. \ \ \ \ \ (19)

Let us write

\displaystyle d\hat \eta(k) d\hat \eta(-k) = d\hat\eta_{-1}(k) d\hat\eta_{+1}(k) d\hat\eta_{-2}(k) d\hat\eta_{+2}(k) ~ d\hat\eta_{-1}(-k) d\hat\eta_{+1}(-k) d\hat\eta_{-2}(-k) d\hat\eta_{+2}(-k)~. \ \ \ \ \ (20)

The full Berezin integral now factors and can be written

\displaystyle \begin{array}{rcl} & & \Xi'= \prod_{k\geq 0} \int { d\hat\eta(k) d\hat\eta(-k) } e^{\hat \eta^T(-k) B'(k) \hat \eta (k)} ~. \end{array}

Each of the Berezin integral factors is a Gaussian integral. Recall that

\displaystyle \int e^{-\sum_{ij}x_i A_{ij} y_j} dx dy = \det(A) \ \ \ \ \ (21)

for Grassmann numbers {x_i} and {y_i} and a complex matrix {A} (see here for an explanation). So

\displaystyle \Xi'= \prod_{k\geq 0} \det B' \ \ \ \ \ (22)

Removing the restriction to {k \geq 0} we have

\displaystyle \Xi'= \prod_{k} \sqrt {\det B'} ~. \ \ \ \ \ (23)

Taking logarithms and dividing by the number of sites {N} we have

\displaystyle \frac 1 N \log \Xi' = \frac 1 2 \frac 1 N \sum_k \log \det B' ~. \ \ \ \ \ (24)

In the thermodynamic limit, the sum becomes an integral and the partition function per site

\displaystyle Z=\lim_{N\rightarrow \infty} \Xi^{1/N} \ \ \ \ \ (25)

is thus given by

\displaystyle \log Z = \log t^{-2}~ + \frac 1 2 \int_{-\pi}^{\pi}\int_{-\pi}^{\pi} \log \det B'~ \frac{dk_x}{2\pi} \frac{dk_y}{2\pi} \ \ \ \ \ (26)

The anti-hermitian matrix {B'} is easily found:

\displaystyle B'(k)= \left[ \begin{array}{cccc} ~~0 & 1- t^2 e^{-ik_x} & -1 & -1 \\ -(1-t^2 e^{ik_x}) & ~~0 & 1 & -1 \\ 1 & -1 & ~~0 & (1-t^2e^{-ik_y}) \\ 1 & 1 & -(1-t^2e^{ik_y}) & ~~0 \end{array} \right] ~. \ \ \ \ \ (27)

Its determinant is

\displaystyle \det B'= (1 + t^4)^2 - 2 t^2 (-1 + t^4) (\cos k_x + \cos k_y) ~. \ \ \ \ \ (28)
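The determinant (28) can be cross-checked numerically. The sketch below builds {B(k,k)} from the matrix given earlier, forms {B'(k)=B(k,k)-B^T(-k,-k)} as in (18), and compares a direct Laplace-expansion determinant with the closed form. The function names and the sample values of {k_x}, {k_y} and {t} are arbitrary choices for this check.

```python
import cmath
import math

def B(kx, ky, t):
    """The 4x4 matrix B(k,k), rows and columns ordered as in the vector (15)."""
    a = t ** 2 * cmath.exp(1j * kx)
    b = t ** 2 * cmath.exp(1j * ky)
    return [[0, 1, 0, 0],
            [a, 0, 1, 0],
            [1, 0, 0, 1],
            [1, 1, b, 0]]

def B_prime(kx, ky, t):
    """B'(k) = B(k,k) - B^T(-k,-k), as in (18)."""
    Bk = B(kx, ky, t)
    Bm = B(-kx, -ky, t)
    return [[Bk[i][j] - Bm[j][i] for j in range(4)] for i in range(4)]

def det(M):
    """Determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j in range(len(M)):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def det_closed_form(kx, ky, t):
    """Right-hand side of (28)."""
    return (1 + t ** 4) ** 2 - 2 * t ** 2 * (t ** 4 - 1) * (math.cos(kx) + math.cos(ky))

samples = [(0.3, -1.1, 0.6), (2.0, 0.7, 0.9), (0.0, 0.0, 0.5)]
errors = [abs(det(B_prime(kx, ky, t)) - det_closed_form(kx, ky, t)) for kx, ky, t in samples]
print(errors)
```

The differences are at the level of rounding error for each sample point.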

Substituting into the integral, we arrive at the expression for the partition function per site:

\displaystyle \begin{array}{rcl} \log Z &=& \log t^{-2}~ + \frac 1 2 \iint_{-\pi}^{\pi} \log \big[(1 + t^4)^2 - 2 t^2 (-1 + t^4) (\cos k_x + \cos k_y)\big] ~ \frac{dk_x}{2\pi} \frac{dk_y}{2\pi} \\ &=& \frac 1 2 \iint_{-\pi}^{\pi} \log \big[(t^{-2} + t^2)^2 - 2 (-t^{-2} + t^2) (\cos k_x + \cos k_y)\big] ~ \frac{dk_x}{2\pi} \frac{dk_y}{2\pi} \\ &=& \frac 1 2 \iint_{-\pi}^{\pi} \log \big[4 \cosh^2 2\beta J - 4 \sinh 2\beta J ~(\cos k_x + \cos k_y)\big] ~ \frac{dk_x}{2\pi} \frac{dk_y}{2\pi} \end{array}

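We are done! This is Onsager’s famous result, specialized to the case of equal couplings {J=J_x=J_y}. (In the last line the integration variables {k_x} and {k_y} were each shifted by {\pi}, which flips the sign of the cosine term without changing the value of the integral.)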
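The final formula is easy to evaluate numerically. The following sketch uses a plain midpoint rule on an {n\times n} grid (with {n} even, an arbitrary choice) to compute {\log Z} in two ways: from the determinant (28) together with the prefactor {t^{-2}}, and from the hyperbolic form in the last line. The function names are hypothetical, and the relation {\beta J=-\log t} follows from {t=e^{-\beta J}}.

```python
import math

def log_Z_from_det(t, n=64):
    """log Z = log t^{-2} + (1/2) * average of log det B' over the Brillouin zone."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        kx = -math.pi + (i + 0.5) * h
        for j in range(n):
            ky = -math.pi + (j + 0.5) * h
            d = (1 + t ** 4) ** 2 - 2 * t ** 2 * (t ** 4 - 1) * (math.cos(kx) + math.cos(ky))
            total += math.log(d)
    return -2 * math.log(t) + 0.5 * total / n ** 2

def log_Z_hyperbolic(beta_J, n=64):
    """Same quantity from the cosh/sinh form of the integrand."""
    h = 2 * math.pi / n
    c = math.cosh(2 * beta_J)
    s = math.sinh(2 * beta_J)
    total = 0.0
    for i in range(n):
        kx = -math.pi + (i + 0.5) * h
        for j in range(n):
            ky = -math.pi + (j + 0.5) * h
            total += math.log(4 * c * c - 4 * s * (math.cos(kx) + math.cos(ky)))
    return 0.5 * total / n ** 2

t = 0.5
value_det = log_Z_from_det(t)
value_hyp = log_Z_hyperbolic(-math.log(t))
print(value_det, value_hyp)
```

As a further consistency check with the loop sum (3), for small {t} the result should approach {-2\log t + t^8}, since the leading correction comes from a single flipped spin, which costs 4 wall segments and can sit on any of the {N} sites.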

Exercise: Repeat the calculation using the high temperature variable {u} instead of the low temperature variable {t}. (The final answer is of course the same.)

5. The cubic Ising model

Ever since 1944 when Onsager published his seminal paper, tentative “exact solutions” have been proposed over the years for the 3D or cubic Ising model. As mentioned earlier, the Grassmann action is quartic for the cubic Ising model. In quantum field theory, quartic Grassmann actions are associated with models of interacting fermions, whereas quadratic actions are associated with free-fermion models. The latter are easily solved via Pfaffian and determinant formulas, as we have done above, but at the present time there are no methods known to be able to give exact solutions (in the thermodynamic limit) of lattice models with quartic Grassmann actions. Hence, anybody claiming an exact solution to the cubic Ising model must explain how they overcame the mathematical difficulty of dealing with quartic actions, or at least how the new method bypasses this mathematical obstruction.

Barry Cipra, in an article in Science, referred to the Ising problem as the “Holy Grail of statistical mechanics.” The article lists a number of other reasons why we may never attain the goal of finding an explicit exact solution of the cubic Ising model.

Exact solution of the cubic Ising model may be an impossible problem!

I thank Francisco “Xico” Alexandre da Costa and Sílvio Salinas for calling my attention to the Grassmann variables approach to solving the Ising model.


Berezin integration of Grassmann variables

1. Introduction

When I first came across the presentation of linear algebra in terms of Grassmann’s exterior products, I was struck by its elegance. An introduction to linear algebra in terms of Grassmann’s ideas can be found here. The Grassmann approach is so much more intuitive that, once learned, there is no going back to the old way of thinking. For example, although in college I found determinants and permanents relatively easy to understand using the conventional approach, I only really came to understand the meaning of Pfaffians after learning exterior algebra (a.k.a Grassmann algebra).

Curiously, Grassmann did not have a university education in mathematics. Rather, he was actually trained as a linguist. Yet, his contributions to mathematics are widely recognized today. Terms such as Grassmann numbers, Grassmann algebra and Grassmannian manifold are all named in his honor.

Also fascinating is how Grassmann numbers make their appearance very naturally in quantum field theory. Specifically, they are used in the path integral formulation for fermionic fields. Readers interested in finding out more about the connection with fermionic path integrals should refer to standard textbooks, for instance the books by Weinberg or the one by Ryder.

Some years ago I derived, for my own convenience, a few of the basic identities satisfied by Grassmann variables, in the context of differentiation and integration. Integration of Grassmann variables is known as Berezin integration. This article is based on my old study notes.

2. Grassmann numbers

Consider the algebra generated by {N} Grassmann numbers {x_i}, {i=1,2,\ldots N} that anticommute according to

\displaystyle \{x_i, x_j\} := x_i x_j + x_j x_i =0 ~. \ \ \ \ \ (1)

Such elements are nilpotent with degree 2 due to the antisymmetry property: {x_i^2=0}.

What is the dimension of this algebra (as a vector space)? Consider the set of all monomials. There are {N} generators {x_1\ldots x_N} which are each nilpotent with degree 2. Hence, a general nonzero monomial can have a generator as factor only once. Hence, there are exactly {2^N} monomials. It is then easy to check that the most general function of the generators can be expressed as a linear combination of these monomials. Indeed, all power series (e.g., Taylor series) terminate, i.e. the most general function is a polynomial in the generators. The dimension of the Grassmann algebra is thus equal to the number of linearly independent monomials, {2^N}.

It is worth calling attention to an antisymmetry property of the coefficients of monomials. Consider the monomial term {x_1 x_2}. By definition, it equals {-x_2 x_1}. So the representation of a function as a linear combination of monomials is not unique. However, if we define the coefficients to be antisymmetric, then we recover uniqueness. For instance, {a_{12} x_1 x_2 = a_{21} x_2 x_1} if the coefficients satisfy {a_{12}=-a_{21}}.
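The counting of monomials and the sign rule can be made concrete with a few lines of Python. In this sketch a monomial is stored as a tuple of generator indices in increasing order, which is one way of fixing a unique representative; the function name `multiply_monomials` and the choice {N=4} are assumptions made for the illustration.

```python
from itertools import combinations

def multiply_monomials(a, b):
    """Product of two monomials given as tuples of distinct generator indices.

    Returns (sign, ordered tuple). A sign of 0 means the product vanishes
    because some generator appears twice (x_i^2 = 0)."""
    if set(a) & set(b):
        return 0, ()
    seq, sign = list(a + b), 1
    for i in range(len(seq)):                 # bubble sort: each swap of
        for j in range(len(seq) - 1 - i):     # neighbours costs a factor -1
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return sign, tuple(seq)

N = 4
# One basis monomial for every subset of {x_1, ..., x_N}, including the empty product 1.
basis = [m for r in range(N + 1) for m in combinations(range(1, N + 1), r)]

print(len(basis))                         # number of independent monomials
print(multiply_monomials((1,), (2,)))     # x_1 x_2
print(multiply_monomials((2,), (1,)))     # x_2 x_1 = - x_1 x_2
print(multiply_monomials((1,), (1,)))     # x_1^2 = 0
```

Storing only ordered monomials plays the same role as the antisymmetric coefficients in the text: it removes the ambiguity between {x_1 x_2} and {-x_2 x_1}.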

3. Differentiation

Because Grassmann variables do not commute, we can define derivatives acting from the right and from the left. Here, I consider only derivatives acting to the right. We define the derivative as follows for a single generator:

\displaystyle {\partial x_i \over \partial x_i} =1 ~. \ \ \ \ \ (2)

To extend the derivative to a monomial, we must first bring the matching generator {x_i} all the way to the left, multiplying by {(-1)^k} where {k} is the number of generators to the left of {x_i} in the original monomial. Then, the derivative is obtained by dropping {x_i}. For example,

\displaystyle {\partial \over \partial x_2} x_1 x_2 x_3 = {\partial \over \partial x_2} \left( - x_2 x_1 x_3 \right) = - x_1 x_3 ~. \ \ \ \ \ (3)

The derivative then extends to all functions via linearity, i.e. differentiation is a linear operator.

The chain rule holds in the usual manner.

The product rule holds as usual if the first factor {f_1} is of even degree in the generators:

\displaystyle {\partial \over \partial x_p} f_1 f_2 = \left ( {\partial \over \partial x_p} f_1 \right) f_2 + f_1 {\partial \over \partial x_p} f_2 ~. \ \ \ \ \ (4)

However, if the first factor has odd degree, then there is a sign change in the second term. Since a general function need not be homogeneous and may have terms of both odd and even degree, I consider it safer to assume that the product rule does not hold in general, and instead to calculate term by term explicitly, unless you know what you are doing.
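The derivative rule for monomials and the sign in the product rule can be checked with a short sketch. Monomials are written as tuples of distinct generator indices in the order in which the factors appear; the function name `d_monomial` is hypothetical.

```python
def d_monomial(i, mono):
    """Derivative with respect to x_i of a monomial (tuple of distinct indices).

    Returns (sign, remaining monomial). The sign is (-1)^k, where k is the
    number of generators standing to the left of x_i; sign 0 means x_i is absent."""
    if i not in mono:
        return 0, ()
    k = mono.index(i)
    return (-1) ** k, mono[:k] + mono[k + 1:]

# Example (3): d/dx_2 (x_1 x_2 x_3) = - x_1 x_3
example = d_monomial(2, (1, 2, 3))

# Even first factor f_1 = x_1 x_3 and f_2 = x_2: the usual product rule predicts
# (d f_1) f_2 + f_1 (d f_2) = 0 + x_1 x_3, and the direct derivative agrees.
even_case = d_monomial(2, (1, 3) + (2,))

# Odd first factor f_1 = x_1 and f_2 = x_2: the usual rule would predict + x_1,
# but the direct derivative gives - x_1, i.e. the second term changes sign.
odd_case = d_monomial(2, (1,) + (2,))

print(example, even_case, odd_case)
```

The sign picked up in the odd case is the {(-1)^k} from moving {x_2} past the single generator in {f_1}.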


4. Berezin integration

Now we come to the most interesting part of this article! Is it possible to define an integral for Grassmann numbers? The usual antiderivative of a variable {x}

\displaystyle \int x dx = \frac 1 2 x^2

would be zero if {x} were a Grassmann variable, so it does not make sense to define integration in this manner. However, one can define the equivalent of a definite Riemann integral. The definite integral

\displaystyle \int_{-\infty} ^\infty f(x) dx \ \ \ \ \ (5)

has the following properties:

(i) Translation invariance:

\displaystyle \int_{-\infty} ^\infty f(x) dx = \int_{-\infty} ^\infty f(x+y) dx \ \ \ \ \ (6)

(ii) Linearity:

\displaystyle \int (a+ b f(x)) dx = a \int dx + b \int f(x) dx ~. \ \ \ \ \ (7)

Hence, we will require that the integral of a Grassmann number also have these two properties. Let {x} and {y} now denote Grassmann numbers.

Then first we require translation invariance.

\displaystyle \int f(x) dx = \int f(x+y) dx . \ \ \ \ \ (8)

Given that {f(x)} is at most a linear function of {x}, let us write

\displaystyle f(x) = a + b x \ \ \ \ \ (9)

where {a} and {b} are complex numbers. Substituting, and invoking linearity, we get

\displaystyle \begin{array}{rcl}  & & \int f(x+y) dx \\ & & =\int (a + b(x+y)) dx \\ & & =(by)\int dx + \int (a+bx) dx\\ & & = (by)\int dx + \int f(x) dx~. \end{array}

Translation invariance therefore requires

\displaystyle \int f(x+y) dx - \int f(x) dx = b y \int dx = 0 ~, \ \ \ \ \ (10)

so that we are forced to assume

\displaystyle \int dx = 0 ~. \ \ \ \ \ (11)

Hence, we get

\displaystyle \int (a+ bx) dx= b \int x dx~. \ \ \ \ \ (12)

Berezin chose the convention that

\displaystyle \int x dx = 1 \ \ \ \ \ (13)

although other conventions are possible for the constant. Below I use Berezin’s convention.

From these observations, we define for Grassmann numbers,

\displaystyle \int dx_i =0 ~. \ \ \ \ \ (14)

(If this is too difficult or “weird” to accept, try to imagine that the supposed integrated quantity {x_i} vanishes at the boundary.)

Next we define

\displaystyle \int x_i dx_i = 1 ~. \ \ \ \ \ (15)

Note that the integral does not produce quantities such as {x_i^2/2}, which is what you would expect from a Riemann integral over normal (i.e., non-Grassmann) variables, since such terms vanish due to the nilpotent property. Instead, Berezin integration is similar to standard differentiation: the usual derivative of {x_i} is 1 and the usual derivative of {1} is 0.

Moreover, the differential {dx_i} anticommutes with {x_i}. In fact, the anticommutation property holds generally:

\displaystyle \{dx_i, dx_j\} = \{dx_i, x_j\} =0 ~. \ \ \ \ \ (16)

Multiple integrals are defined, in analogy with Fubini’s theorem, as iterated integrals:

\displaystyle  \iint x_j x_i dx_i dx_j \equiv \int x_j \left( \int x_i dx_i \right) dx_j ~. \ \ \ \ \ (17)

Note that there are other sign conventions for Berezin integrals. Physicists usually use the convention

\displaystyle  \iint x_i x_j dx_i dx_j =1 ~. \ \ \ \ \ (18)

In what follows I use the former sign convention.

Next, we come to one of the most interesting and unexpected properties of the Berezin integral. Let {f(x)} represent a function of all the generators. Recall that the most general function of the Grassmann algebra generators is a polynomial. Hence, the most general function can be written

\displaystyle f (x) = f_0 + \sum_k f_1(k) x_{k} + \sum_{k_1<k_2} f_2({k_1},{k_2}) x_{k_1} x_{k_2} + \ldots + f_N (1,2,\ldots, N) x_{1} x_2 \ldots x_{N} ~. \ \ \ \ \ (19)

Indeed, every element of the Grassmann algebra can be written as such.

Now consider the multiple Berezin integral

\displaystyle \int f(x) ~ dx_1 dx_2 \ldots dx_N ~. \ \ \ \ \ (20)

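Note that all monomial terms of degree {k} with {k<N} will vanish, because each of the {N-k} iterated integrals over the variables not appearing in the monomial vanishes separately. Only the monomial of degree {N} survives. With the ordering convention of (17), in which the innermost integral is taken first, we have {\int x_N \ldots x_2 x_1 ~dx_1 dx_2 \ldots dx_N = 1}, so that reordering {x_1 x_2 \ldots x_N} into {x_N \ldots x_2 x_1} contributes a sign:

\displaystyle \int f(x) ~ dx_1 dx_2 \ldots dx_N = (-1)^{N(N-1)/2} f_N(1,2,\ldots,N) ~. \ \ \ \ \ (21)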
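The fact that only the top-degree coefficient survives can be illustrated with a small sketch of iterated Berezin integration. I assume here the ordering convention of (17), where the differential stands to the right of the integrand and {\int x_i\, dx_i = 1}; the helper names `berezin` and `integrate_all` are hypothetical. Under this convention the overall sign of the result depends on how the top monomial is ordered: {x_1 x_2 \ldots x_N} integrates to {(-1)^{N(N-1)/2}}, which is {+1} for {N=4} and {-1} for {N=2}.

```python
def berezin(f, i):
    """Integrate a Grassmann polynomial over dx_i (differential on the right).

    f maps monomials (tuples of generator indices, in the order written)
    to coefficients. x_i is moved to the right end of each monomial, next
    to dx_i, picking up one sign per generator it passes, and is then removed.
    """
    out = {}
    for mono, coeff in f.items():
        if i not in mono:
            continue  # the integral of a term without x_i vanishes
        k = mono.index(i)
        to_the_right = len(mono) - 1 - k
        rest = mono[:k] + mono[k + 1:]
        out[rest] = out.get(rest, 0) + (-1) ** to_the_right * coeff
    return out

def integrate_all(f, n):
    """Iterated integral over dx_1 dx_2 ... dx_n, innermost (dx_1) first."""
    for i in range(1, n + 1):
        f = berezin(f, i)
    return f.get((), 0)

# f = 5 + 2 x_1 + 3 x_1 x_2 + 7 x_1 x_2 x_3 x_4 with N = 4 generators
f = {(): 5, (1,): 2, (1, 2): 3, (1, 2, 3, 4): 7}
print(integrate_all(f, 4))  # 7: only the top-degree coefficient survives

# The two orderings of the top monomial for N = 2
print(integrate_all({(2, 1): 1}, 2))  # 1, as in (17)
print(integrate_all({(1, 2): 1}, 2))  # -1
```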

5. Change of variables

Riemann integrals satisfy

\displaystyle \int f(ax) dx = \frac 1 a \int f(x) dx ~. \ \ \ \ \ (22)

We will show that Berezin integrals satisfy instead

\displaystyle \int f(ax) dx = a \int f(x) dx ~. \ \ \ \ \ (23)

The reason for this opposite behavior is that Berezin integration is actually similar to (standard) differentiation.

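Let {y=ax} for Grassmann variables {x} and {y}, where {a} is a nonzero complex number, and consider that by definition

\displaystyle \int y \,dy = \int x \,dx =1 ~. \ \ \ \ \ (24)

Substituting {y=ax} into the first integral gives

\displaystyle \int y \,dy = \int a x \,dy = \int x \,dx ~. \ \ \ \ \ (25)

Comparing the last two integrals, we find

\displaystyle a \,dy = dx ~, \ \ \ \ \ (26)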

which means that

\displaystyle dy = \frac {dx} a ~. \ \ \ \ \ (27)

In standard calculus, we would instead have {dy= a\,dx}, so Berezin differentials scale opposite to what one would expect from standard calculus.

Now let us generalize to {N} generators. Let {y_i= \sum_{j} a_{ij} x_j}. Those with some familiarity with exterior products will recognize that the product

\displaystyle y_1 y_2 \ldots y_N \ \ \ \ \ (28)

corresponds to the exterior product of maximal grade, so that we naturally expect the determinant to make an appearance:

\displaystyle y_1 y_2 \ldots y_N = \det(a) ~x_1 x_2 \ldots x_N ~ . \ \ \ \ \ (29)

Moreover, because the differentials scale inversely to the generators, we have

\displaystyle \det(a)~dy_1 dy_2 \ldots dy_N = dx_1 dx_2 \ldots dx_N ~ . \ \ \ \ \ (30)
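The appearance of the determinant in (29) can be checked directly for a small example. The following sketch multiplies out {y_1 y_2 y_3} for an arbitrarily chosen integer matrix; the helper names `gmul` and `det`, and the sample matrix, are illustrative assumptions of mine.

```python
from itertools import permutations

def gmul(f, g):
    """Product of two Grassmann polynomials (keys are sorted index tuples)."""
    out = {}
    for ma, ca in f.items():
        for mb, cb in g.items():
            if set(ma) & set(mb):
                continue  # a repeated generator gives zero
            seq = ma + mb
            inv = sum(1 for p in range(len(seq))
                      for q in range(p + 1, len(seq)) if seq[p] > seq[q])
            key = tuple(sorted(seq))
            out[key] = out.get(key, 0) + (-1) ** inv * ca * cb
    return {m: c for m, c in out.items() if c != 0}

def det(a):
    """Determinant via the Leibniz (permutation) expansion."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        inv = sum(1 for p in range(n)
                  for q in range(p + 1, n) if perm[p] > perm[q])
        term = (-1) ** inv
        for row in range(n):
            term *= a[row][perm[row]]
        total += term
    return total

a = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]
n = len(a)

# y_i = sum_j a_ij x_j, with generators x_1..x_n stored as indices 1..n
ys = [{(j + 1,): a[i][j] for j in range(n) if a[i][j] != 0} for i in range(n)]

product = {(): 1}
for y in ys:
    product = gmul(product, y)

top_coefficient = product.get(tuple(range(1, n + 1)), 0)
print(product)          # a single term proportional to x_1 x_2 x_3
print(top_coefficient, det(a))
```

All lower-degree terms are absent because each {y_i} is linear in the generators, and the single surviving coefficient agrees with {\det(a)}.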


6. Gaussian integrals

Consider the following Riemann integrals:

\displaystyle \begin{array}{rcl} \int_{\Bbb R} e^{-ax^2} ~dx & = & \sqrt {\frac \pi a} \\ \iint_{\Bbb R^2} e^{-a(x^2 + y^2)} ~dxdy & =& \frac \pi a ~. \end{array}

We will next derive an analog of the above for the Berezin integral. The analog of the first integral is zero, due to the nilpotent property. We thus look at the Berezin analog of the second:

\displaystyle \begin{array}{rcl} \iint e^{-axy} ~dx dy &=& \iint (1 -axy ) ~dx dy \\ &=& -\iint axy ~dx dy \\ &=& \iint ay x ~dx dy \\ & =& a ~. \end{array}

Note that the {a} is in the numerator rather than the denominator. Moreover, there is no longer a factor of {\pi}. (In fact, there are other conventions that I do not discuss here, as previously mentioned.)

Let {x_1,\ldots,x_N} and {y_1,\ldots,y_N} be generators, so that there are {2N} generators in total. Moreover, let

\displaystyle dx dy = dx_1 dy_1 dx_2 dy_2\ldots dx_N dy_N~.

Now consider the multiple Gaussian Berezin integral:

\displaystyle \int e^{-\sum_{ij}x_i A_{ij} y_j} dx dy ~ . \ \ \ \ \ (31)

Let us change basis in order to diagonalize the matrix {A_{ij}} via a unitary transformation (assuming, for simplicity, that {A} can be diagonalized in this way, as is the case for normal matrices). In the new variables {x'} and {y' }, the transformed matrix {A'} is diagonal, so that the exponential factors into a product of terms such as {\exp[- x'_i A'_{ii} y'_i]}. Hence, the full integral can be written as a product of simple Gaussian integrals and the value of the full Gaussian integral will simply be

\displaystyle \int e^{-\sum_{ij}x_i A_{ij} y_j} dx dy = \prod _i A'_{ii} = \det A' = \det A~. \ \ \ \ \ (32)

Here we have used the fact that unitary transformations leave the determinant invariant.
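As a sanity check that does not rely on diagonalization, the Gaussian Berezin integral can be evaluated by brute force for small matrices. This is only a sketch under my own assumptions: the generators {x_i, y_i} are stored as indices {2i, 2i+1} (counting from zero), the measure is {dx_1 dy_1 \ldots dx_N dy_N} with the innermost integral taken first as in (17), and all helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def gmul(f, g):
    """Product of two Grassmann polynomials (keys are sorted index tuples)."""
    out = {}
    for ma, ca in f.items():
        for mb, cb in g.items():
            if set(ma) & set(mb):
                continue
            seq = ma + mb
            inv = sum(1 for p in range(len(seq))
                      for q in range(p + 1, len(seq)) if seq[p] > seq[q])
            key = tuple(sorted(seq))
            out[key] = out.get(key, 0) + (-1) ** inv * ca * cb
    return {m: c for m, c in out.items() if c != 0}

def gexp(f, max_power):
    """exp(f) for f without constant term; the power series terminates."""
    result = {(): Fraction(1)}
    term = {(): Fraction(1)}
    for k in range(1, max_power + 1):
        term = {m: c / k for m, c in gmul(term, f).items()}  # f^k / k!
        for m, c in term.items():
            result[m] = result.get(m, 0) + c
    return result

def berezin(f, i):
    """Integrate over generator i, with the differential on the right."""
    out = {}
    for mono, coeff in f.items():
        if i in mono:
            k = mono.index(i)
            rest = mono[:k] + mono[k + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** (len(mono) - 1 - k) * coeff
    return out

def gaussian_integral(A):
    """Integral of exp(-sum_ij x_i A_ij y_j) over dx_1 dy_1 ... dx_N dy_N."""
    n = len(A)
    minus_S = {}
    for i in range(n):
        for j in range(n):
            x_i = {(2 * i,): Fraction(1)}
            y_j = {(2 * j + 1,): Fraction(1)}
            for m, c in gmul(x_i, y_j).items():
                minus_S[m] = minus_S.get(m, 0) - A[i][j] * c
    f = gexp(minus_S, 2 * n)
    for g in range(2 * n):  # dx_1 first, then dy_1, dx_2, dy_2, ...
        f = berezin(f, g)
    return f.get((), 0)

def det(a):
    """Determinant via the Leibniz (permutation) expansion."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        inv = sum(1 for p in range(n)
                  for q in range(p + 1, n) if perm[p] > perm[q])
        term = (-1) ** inv
        for row in range(n):
            term *= a[row][perm[row]]
        total += term
    return total

A2 = [[2, 3], [5, 7]]
A3 = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
print(gaussian_integral(A2), det(A2))
print(gaussian_integral(A3), det(A3))
```

The sample matrices are neither symmetric nor normal, which suggests that the result does not depend on the diagonalization argument used in the text.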

In contrast, for a Riemann integral the correct expression is

\displaystyle \int_{\Bbb R^N} e^{-x^T Ax} ~d^Nx = \sqrt {\frac {\pi^N} {\det(A)}} ~. \ \ \ \ \ (33)

So, besides the factors of {\pi}, the determinant of {A} appears in the numerator instead of in the denominator for the Berezin integral.

7. A Gaussian integral in terms of a Pfaffian

Let us use a change of variable and define

\displaystyle \begin{array}{rcl} x_j &=& {\frac 1 {\sqrt 2}}~ (z_j^{(1)} + i z_j^{(2)}) \\ y_j &=& {\frac 1 {\sqrt 2}} ~(z_j^{(1)} - i z_j^{(2)}) ~. \end{array}

Then the differentials transform as

\displaystyle \begin{array}{rcl} dx_j dy_j &=& \det \left[ \begin{array}{cc} \frac 1{\sqrt 2} & \frac i {\sqrt 2} \\ \frac 1{\sqrt 2} & -\frac i {\sqrt 2} \end{array} \right]^{-1} dz_j^{(1)} dz_j^{(2)} \\ &=& i dz_j^{(1)} dz_j^{(2)} ~. \end{array}

The Jacobian matrix is inverted because of the unusual way that Grassmann differentials behave under a change of variables, as explained above.

Let {A} be an antisymmetric matrix of dimension {N\times N} with {N} even. Consider

\displaystyle \sum_{ij} x_i A_{ij} y_j = \sum_{ij} \frac {A_{ij}} 2 [ z_i^{(1)} z_j^{(1)} - i z_i^{(1)} z_j^{(2)} + i z_i^{(2)} z_j^{(1)} + z_i^{(2)} z_j^{(2)} ] ~. \ \ \ \ \ (34)

Since {A} is antisymmetric, the cross terms will cancel, so that

\displaystyle \sum_{ij} x_i A_{ij} y_j = \frac 1 2 \sum_{ij} ( z_i^{(1)} A_{ij }z_j^{(1)} + z_i^{(2)} A_{ij }z_j^{(2)} )~ . \ \ \ \ \ (35)

Exponentiating the above, substituting into (32) and remembering that {N} is even, we get

\displaystyle \begin{array}{rcl} \int e^{-\sum_{ij}x_i A_{ij} y_j} dx dy &=& i^{N} \int e^{-\sum_{ij} \frac 1 2 ( z_i^{(1)} A_{ij }z_j^{(1)} + z_i^{(2)} A_{ij }z_j^{(2)} ) } dz_1^{(1)} dz_1^{(2)} \ldots dz_N^{(1)} dz_N^{(2)} \\ &=& (-1)^{N/2} \int e^{-\sum_{ij} \frac 1 2 ( z_i^{(1)} A_{ij }z_j^{(1)} + z_i^{(2)} A_{ij }z_j^{(2)} ) } (-1)^{\frac 1 2 N (N-1)} dz_1^{(1)} \ldots dz_N^{(1)} dz_1^{(2)} \ldots dz_N^{(2)} \\ &=& \left[ \int e^{-\sum_{ij} \frac 1 2 ( z_i A_{ij }z_j ) } dz \right]^2 ~. \end{array}

In the last step the two signs cancel, because {(-1)^{\frac 1 2 N(N-1)} = (-1)^{N/2}} for even {N}. By (32), the left-hand side equals {\det A}.

Recall the following identity for the Pfaffian of an antisymmetric even dimensional matrix:

\displaystyle {\rm Pf}(A)^2 = \det(A) ~. \ \ \ \ \ (36)

We thus obtain, up to an overall sign that depends on the ordering convention for {dz},

\displaystyle \int e^{-\sum_{ij} \frac 1 2 ( z_i A_{ij }z_j ) } dz = {\rm Pf}(A) ~ \ \ \ \ \ (37)

for any even dimensional antisymmetric matrix {A}.
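The relation between the Gaussian Berezin integral and the Pfaffian can be checked by brute force for a {4\times 4} antisymmetric matrix. This is a sketch under my own assumptions: the measure is {dz_1 dz_2 dz_3 dz_4} with the innermost integral taken first as in (17), the code uses the exponent {-\frac 1 2 \sum_{ij} z_i A_{ij} z_j} (for {N=4} the value does not depend on the sign of the exponent, since only the quadratic term of the exponential series contributes), and all helper names and the sample matrix are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def gmul(f, g):
    """Product of two Grassmann polynomials (keys are sorted index tuples)."""
    out = {}
    for ma, ca in f.items():
        for mb, cb in g.items():
            if set(ma) & set(mb):
                continue
            seq = ma + mb
            inv = sum(1 for p in range(len(seq))
                      for q in range(p + 1, len(seq)) if seq[p] > seq[q])
            key = tuple(sorted(seq))
            out[key] = out.get(key, 0) + (-1) ** inv * ca * cb
    return {m: c for m, c in out.items() if c != 0}

def gexp(f, max_power):
    """exp(f) for f without constant term; the power series terminates."""
    result = {(): Fraction(1)}
    term = {(): Fraction(1)}
    for k in range(1, max_power + 1):
        term = {m: c / k for m, c in gmul(term, f).items()}  # f^k / k!
        for m, c in term.items():
            result[m] = result.get(m, 0) + c
    return result

def berezin(f, i):
    """Integrate over generator i, with the differential on the right."""
    out = {}
    for mono, coeff in f.items():
        if i in mono:
            k = mono.index(i)
            rest = mono[:k] + mono[k + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** (len(mono) - 1 - k) * coeff
    return out

def pfaffian_integral(A):
    """Integral of exp(-(1/2) sum_ij z_i A_ij z_j) over dz_1 ... dz_N."""
    n = len(A)
    minus_Q = {}
    for i in range(n):
        for j in range(n):
            z_i = {(i,): Fraction(1)}
            z_j = {(j,): Fraction(1)}
            for m, c in gmul(z_i, z_j).items():  # empty when i == j
                minus_Q[m] = minus_Q.get(m, 0) - Fraction(1, 2) * A[i][j] * c
    f = gexp(minus_Q, n)
    for g in range(n):  # dz_1 is the innermost integral
        f = berezin(f, g)
    return f.get((), 0)

def det(a):
    """Determinant via the Leibniz (permutation) expansion."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        inv = sum(1 for p in range(n)
                  for q in range(p + 1, n) if perm[p] > perm[q])
        term = (-1) ** inv
        for row in range(n):
            term *= a[row][perm[row]]
        total += term
    return total

A = [[0, 1, 2, 3],
     [-1, 0, 4, 5],
     [-2, -4, 0, 6],
     [-3, -5, -6, 0]]

value = pfaffian_integral(A)
# Explicit Pfaffian of a 4x4 antisymmetric matrix: a12 a34 - a13 a24 + a14 a23
pf4 = A[0][1] * A[2][3] - A[0][2] * A[1][3] + A[0][3] * A[1][2]
print(value, pf4, det(A))
```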

8. Berezin integration as a contraction

My favorite way of thinking about Berezin integration is in terms of interior products in Grassmann algebras. (Note: interior products are not the same as inner products.) In fact, interior products are how I explicitly calculate more difficult Berezin integrals in practice. If time permits, I may write something up in the future on this topic. This idea is of course not new. It is known that Berezin integrals are a type of contraction, see here.


Writing a paper in E-prime

Many top scientists communicate clearly, sometimes seemingly effortlessly. The papers by Einstein flow elegantly in clear and logical steps, almost as if choreographed, from one idea to the next. Some articles even have qualities more commonly seen in great works of art, for example, Dirac’s seminal book on quantum mechanics or Shannon’s paper introducing his celebrated entropy. What a pleasure to read! Most physicists similarly recognize Feynman as a master of clear communication.

Before I became a grad student, I had underestimated the importance of good and effective communication. My former PhD advisor, an excellent communicator, taught me the crucial role played by communication in scientific discourse and debate.

Let me explain this point in greater detail. As an illustrative example, imagine if Einstein had not written clearly. Then it may very well have taken much longer for his ideas to percolate and gain acceptance throughout the scientific community. Indeed, Boltzmann, in contrast to Einstein, wrote lengthy and admittedly difficult-to-read texts. Some of his critics perhaps  failed to grasp his seminal ideas. Disappointed and possibly depressed, he eventually committed suicide while still in his prime. Today, the top prize in the field  of statistical physics honors his name— the Boltzmann Medal. Nevertheless, it took many years and the efforts of other scientists (e.g. Gibbs) for the physics community to recognize the full extent of Boltzmann’s contributions.    Clear exposition can make a big difference.

In this blog post, I do not give tips or advice about how to write clearly. Good tips on how to write clearly abound.  Instead, I want to draw your attention to how this article does not contain a single instance of the verb “to be” or any of its conjugations or derived words, such as “being,” “was,” “is,” and so forth — excepting this sentence, obviously. The subset of the English language that remains after the removal of these words goes by the name E-prime, often written E’. In other words, E’ equals English minus all words derived from the above-mentioned verb.

Writing in E’ usually forces a person to think more carefully. Scientists need to communicate not only clearly, but with a slightly higher degree of precision than your typical non-scientist. I have found that fluency in E’ helps me to spot certain kinds of errors of reasoning. The key error of reasoning attenuated by the use of E’ relates to identification.   Too often, the referents of the grammatical subject and object become identified in standard English, where in fact no such identification exists in the real world.  E’ helps to reduce this improper identification, or at least to call attention to it.  The topic of E’, and of related subjects, such as  its ultimate historical origins in general semantics, the study of errors of reasoning, the nature of beliefs, cognitive biases, etc., would require too broad a digression for me to discuss here, so I recommend that interested readers research such topics on their own.

In my early 30s, soon after I obtained tenure in my first faculty position, I decided to write a full article entirely in E’.  What a wonderful and interesting exercise!  Of course, I did not find it easy to write in E’, but with few exceptions, the finished paper contained only E’ sentences.  Forcing myself to think and write in E’ helped me to give a better description of what we, as scientists, really did.  I would cautiously claim that writing in E’ benefited our paper, at least as far as concerns clarity and precision.  No longer do I publish papers in E’, but I learned a lot about how to write (and think) a little bit more clearly.

That paper, about an empirical approach to music, appeared in print in 2004 in the statistical physics journal  Physica A. It eventually ended up cited very well: 33 citations according to  Thomson Reuters’  Web of Science and 60 citations on Google Scholar, as of May 2016.  Most incredibly, it even briefly shot up to the top headline at (click here to see)!  We had never expected this.

In that paper, my co-authors and I proposed a method for studying rhythmic complexity. The collaboration team included as first author Heather Jennings, a professor of music (and also my spouse). We took an empirical approach for comparing the rhythmic structures of Javanese Gamelan, Jazz, Hindustani music, Western European classical music, Brazilian popular music (MPB), techno (dance), New Age music, the northeastern Brazilian folk music known as Forró and last but not least: Rock’n Roll. Excepting a few sentences, the paper consists entirely of E’ sentences.

You can read the paper by clicking here for the PDF. A fun exercise: as you read the paper, (1) try to imagine how you would normally rephrase the E’ sentences in ordinary English; (2) try to spot the subtle difference in meaning between the English and E’ sentences.


Colloquium at USP on animal movement

Here is the link to the video of a colloquium I gave at USP on 9 April 2015. The talk is in Portuguese, although the title is in English. This subject represents the “bread and butter” (literally, the “rice and beans”) of my research in applied statistical physics.

It is also worth pointing out that the professor who introduces me at the beginning of the video is Full Professor Mario de Oliveira, author of the book on thermodynamics that has become a standard reference in Brazil. His book is frequently used as the main text in thermodynamics courses in undergraduate physics programs.

Scale invariance, random walks and complex networks

Here is the link to a YouTube video of a talk I gave at the International Institute of Physics (IIP) at UFRN, in Natal, Brazil. It is one of many talks given by invited lecturers at the school on Physics and Neuroscience, which was held at the IIP from 11 to 17 August 2014.

This talk touches on the bread and butter of my research activities.  It should be completely or almost completely understandable to anyone at least midway through an undergraduate degree in the sciences. Since the participants in the conference came from diverse backgrounds, I had made a special effort to avoid the use of jargon and to speak in as clear a language as I could. (It is probably the longest talk I have given about my research.)

An explanation about the initial statement regarding elves and hobbits, etc.:  These comments  refer to a running “inside joke” at the school, contrasting the distinct scientific cultures of the participants, for example biologists vs. applied mathematicians and physicists etc.


Fermionization of the 2-D Ising model: The method of Schultz, Mattis and Lieb

F. A. da Costa, R. T. G. de Oliveira, G. M. Viswanathan

This blog post was written in co-authorship with my physics department colleague Professor Francisco “Xico” Alexandre da Costa and Professor Roberto Teodoro Gurgel de Oliveira, of the UFRN mathematics department. Xico obtained his doctorate under Professor Sílvio Salinas at the University of São Paulo. Roberto was a student of Xico many years ago, but left physics to study mathematics at IMPA in Rio de Janeiro in 2010. During 2006–2007, Roberto and Xico had written up a short text in Portuguese that included the exact solution of the Ising model on the infinite square lattice using the method of fermion operators developed by Schultz, Mattis and Lieb. With the aim of learning this method, I adapted their text and expanded many of the calculations for my own convenience. I decided to post it on this blog since others might also find it interesting. I have previously written an introduction to the 2-D Ising model here, where I review a combinatorial method of solution.

1. Introduction

The spins in the Ising model can only take on two values, {\pm 1}. This behavior is not unlike how the occupation number {n} for some single particle state for fermions can only take on two values, {n=0,1}. It thus makes sense to try to solve the Ising model via fermionization. This is what Schultz, Mattis and Lieb accomplished in their well-known paper of 1964. In turn, their method of solution is a simplified version of Bruria Kaufman’s spinor analysis method, which is in turn a simplification of Onsager’s original method.

We will proceed as follows. First we will set up the transfer matrix. Next we will reformulate it in terms of Pauli’s spin matrices for spin-{\tfrac 1 2} particles. Recall that in quantum field theory boson creation and annihilation operators satisfy the well-known commutation relations of the quantum harmonic oscillator, whereas fermion operators satisfy analogous anticommutation relations. The spin annihilation and creation operators {\sigma_j^\pm } do not anticommute at distinct sites {j} but instead commute, whereas fermion operators must anticommute at different sites. This problem of mixed commutation and anticommutation relations can be solved using a method known as the Jordan-Wigner transformation. This step completes the fermionic reformulation of the 2-D Ising model. To obtain the partition function in the thermodynamic limit, which is determined by the largest eigenvalue of the transfer matrix, one diagonalizes the fermionized transfer matrix using appropriate canonical transformations.



Are science and religion compatible?

This blog post explores whether or not science and religion are compatible. I use the term religion in the usual sense, to mean a system of faith, worship and sacred rituals or duties. Religions typically consist of an organized code or collection of beliefs related to the origins and purpose of humanity (or subgroups thereof), together with a set of practices  based on those beliefs. Can such belief systems be compatible with science?

Since this topic is controversial, I only reluctantly decided to write about it.  Being a physics professor and a research scientist, I decided not to flee debate on this issue (which is like the third rail of science). Instead, here I detail my thoughts in writing.

Actually, I spent decades trying to reconcile science and (organized) religion, but I made little or no significant progress. Eventually, after much hesitation and discomfort, I was forced to conclude that full reconciliation between science and organized religion may not be possible, even in principle. Although this realization was initially surprising (and unpleasant) to me, I soon discovered new and more fulfilling ways of approaching issues such as ethics, morals and the purpose or meaning of life, which religion has traditionally monopolized.

1. Short answer: science and religion are incompatible

‘Religion is a culture of faith and science is a culture of doubt.’ This statement is usually attributed to Richard Feynman. Faith and doubt are indeed antagonistic, like water and fire. How can it be possible to fully reconcile religious views, which are based on faith, with the systematic doubt and the skeptical questioning that are intrinsic to the scientific method? Like many scientists, I too have concluded that full reconciliation of science and religion is not possible.

One caveat: obviously, if one removes the element of dogma and faith from religion, then reconciliation might be possible. But religion without dogma is more like a social club than a traditional religion. What would become of Christianity without faith in Jesus Christ? Can you imagine Islam without faith in the Koran?  So, by religion I always mean organized religion, with a set of teachings or dogmas.

Nevertheless, this caveat actually points to a possible way forward toward reconciliation of science and religion: if religions do become more like social clubs and less dogmatic, then disagreement with science can be minimized or even avoided. I see some movement in this direction. There is growing realization, even among people of faith, that the arbitrary divisions of race, ethnicity and religion, for example, do not have a clear and well-established scientific foundation. In this context, it is admirable that the new pope of the Catholic Church, Pope Francis, has defended interfaith dialogue. He has said, for example, that even atheists can be redeemed. This concession is a major advance, compared to the old threats about burning in hell in eternal damnation! Moreover, by claiming that he would baptize even Martians, he has (perhaps inadvertently) signaled an openness to the possibility of extraterrestrial intelligence (i.e., aliens). Similarly, his emphasis on raising awareness of climate change is also most encouraging. These are all welcome developments. Other religions have also responded positively to the challenges brought on by science. The Dalai Lama is a good example: a Buddhist religious leader who has shown interest in and kept an open mind about science. He has stated that “…if scientific analysis were conclusively to demonstrate certain claims in Buddhism to be false, then we must accept the findings of science and abandon those claims.” Pope Francis, the Dalai Lama and many others like them have contributed positively towards the reconciliation of science and religion. They are forward-thinking and broad-minded religious leaders. Maybe, in some sense, they have more in common with liberal social and political leaders than with the dogmatic defenders of religious orthodoxy and closed-minded conservatism. So I do see a ray of hope.

While I welcome the positive change in attitudes brought by religious leaders like Pope Francis and the Dalai Lama, the fact remains that their religions are still based on dogma. Religions still have too much dogma, too much superstition and too much bigotry. So, even considering the above caveat, overall I still feel generally pessimistic about science and religion being compatible.

Below I explore these issues in some detail.

2. Dirac and Feynman on religion

In the list of the all-time greatest physicists, Newton and Einstein invariably take the top positions. Paul A. M. Dirac, of Dirac equation fame, is considered to be an intellectual giant, ranking just a few notches below Einstein or Newton. And Feynman, who usually ranks just below or comparable to Dirac, has rock star status in the physics community.

Feynman had the following to say about religion:

It doesn’t seem to me that this fantastically marvelous universe, this tremendous range of time and space and different kinds of animals, and all the different planets, and all these atoms with all their motions, and so on, all this complicated thing can merely be a stage so that God can watch human beings struggle for good and evil — which is the view that religion has. The stage is too big for the drama.

Dirac had the following to say about religion:

If we are honest — and scientists have to be — we must admit that religion is a jumble of false assertions, with no basis in reality. The very idea of God is a product of the human imagination. It is quite understandable why primitive people, who were so much more exposed to the overpowering forces of nature than we are today, should have personified these forces in fear and trembling. But nowadays, when we understand so many natural processes, we have no need for such solutions. I can’t for the life of me see how the postulate of an Almighty God helps us in any way. What I do see is that this assumption leads to such unproductive questions as why God allows so much misery and injustice, the exploitation of the poor by the rich and all the other horrors He might have prevented. If religion is still being taught, it is by no means because its ideas still convince us, but simply because some of us want to keep the lower classes quiet. Quiet people are much easier to govern than clamorous and dissatisfied ones. They are also much easier to exploit. Religion is a kind of opium that allows a nation to lull itself into wishful dreams and so forget the injustices that are being perpetrated against the people. Hence the close alliance between those two great political forces, the State and the Church. Both need the illusion that a kindly God rewards — in heaven if not on earth — all those who have not risen up against injustice, who have done their duty quietly and uncomplainingly. That is precisely why the honest assertion that God is a mere product of the human imagination is branded as the worst of all mortal sins.

I do not accept arguments from authority. But it is nevertheless interesting to read about what these eminent physicists had to say.

3. Scientists abandon God and religion

Most scientists are non-religious. Many are atheist. A Pew survey from 2009 found that while over 80% of Americans believed in God, less than 50% of scientists believed in God. The percentage was actually 33% in that particular survey. These numbers are typical. For instance, among the members of the US National Academy of Sciences, more than 60% of biological scientists expressed disbelief in God (i.e., were what most people call ‘atheists’) according to a study from 1998. In the physical sciences, 79% expressed disbelief in God.

This issue is relevant in society because most politicians and people in leadership positions are, at least outwardly, sympathetic to religion if not actively religious. So there is at least this one important difference between the majority of scientists and the rest of society. Whereas most people are religious, most scientists are non-believers.

More worrying is that many politicians actively campaign against science and science education. We have all heard about attempts by the religious to eliminate (or water down) the teaching of Darwinian evolution in schools. At least in the West, these attempts have largely failed.

Fortunately, the voting population does not particularly crave a return to the dark ages. It  is easy to understand why. The experience of the last few centuries has shown that social and economic development is only possible when there is political support and commitment to science research and education. Science is responsible for the invention of the Internet, cell phones, radio, TV, cars, trains, airplanes, X-rays, MRIs, the eradication of smallpox, etc.   Rich and socially developed countries are precisely those in which science education and research are well funded. Economic pressures have thus led to investment in science and in science education.

At the same time, science has led to unintended consequences. The more a person is exposed to science, the less religious they are likely to become. (Possibly as a consequence, wealth is also negatively correlated with religiosity. In other words, on average the richer you are, the less religious you are likely to be.)

Especially among those with less science education, there is a fear that exposure to science and to “subversive” ideas such as Darwinian evolution will infect the minds of young people and turn them into “Godless infidels.” In fact, fear is a constant theme in religion: fear of God, fear of divine punishment, fear of hell, fear of forbidden knowledge, etc. Science education dispels such fears, and replaces them with the cultivation of curiosity, wonder, questioning, doubt and awe. Since fear is often used as an instrument of control and power, the loss of fear can be a setback for the power structures of organized religion. In this sense, science and science education sometimes directly threaten some religious movements.

Consider, as an example, suicide bombing as a form of jihad by Islamic militant organizations. It is perfectly fair for us to ask: is it even remotely plausible that these hapless suicide bombers correctly understood the scientific method? This is a rhetorical question, of course. A genuinely curious and scientifically literate potential candidate for suicide bombing would immediately ask questions, especially when faced with death by suicide. Is life after death a sure thing? Will Allah really reward a suicide bomber? How is it possible for the big-breasted and hot Houri girls and women to recover their sexual virginity every morning? The young man may then go on to ask: is there a remote possibility, perhaps, that such ridiculous claims are not a sign of pulling the wool over the eyes of the naïve young men in their sexual prime who crave sex and intimacy with women, but who are forbidden by religion to engage in casual sex? And why is recreational and free sex allowed in Paradise but not on earth? A scientifically literate young man would probably say ‘Thanks but no thanks, I’ll let you go first to set the example!’ Hence the fear and loathing of doubt, curiosity and questioning. Indeed, scientific illiteracy makes people gullible and easier to manipulate.

There is no denying the statistics: exposure to science is correlated with loss of religious faith. This raises two questions: (i) why does this happen and (ii) is this good or bad? I am mostly concerned here with question (i) and only briefly touch upon point (ii).
