Fermionization of the 2-D Ising model: The method of Schultz, Mattis and Lieb

F. A da Costa, R. T. G. de Oliveira, G. M. Viswanathan

This blog post was written in co-authorship with my physics department colleague Professor Francisco “Xico” Alexandre da Costa and Professor Roberto Teodoro Gurgel de Oliveira, of the UFRN mathematics department. Xico obtained his doctorate under Professor Sílvio Salinas at the University of São Paulo. Roberto was a student of Xico many years ago, but left physics to study mathematics at IMPA in Rio de Janeiro in 2010. During 2006–2007, Roberto and Xico had written up a short text in Portuguese that included the exact solution of the Ising model on the infinite square lattice using the method of fermion operators developed by Schultz, Mattis and Lieb. With the aim of learning this method, I adapted their text and expanded many of the calculations for my own convenience. I decided to post it on this blog since others might also find it interesting. I have previously written an introduction to the 2-D Ising model here, where I review a combinatorial method of solution.

1. Introduction

The spins in the Ising model can only take on two values, {\pm 1}. This behavior is not unlike how the occupation number {n} for some single particle state for fermions can only take on two values, {n=0,1}. It thus makes sense to try to solve the Ising model via fermionization. This is what Schultz, Mattis and Lieb accomplished in their well-known paper of 1964. In turn, their method of solution is a simplified version of Bruria Kaufman’s spinor analysis method, which is in turn a simplification of Onsager’s original method.

We will proceed as follows. First we will set up the transfer matrix. Next we will reformulate it in terms of Pauli’s spin matrices for spin-{\tfrac 1 2} particles. Recall that in quantum field theory boson creation and annihilation operators satisfy the well-known commutation relations of the quantum harmonic oscillator, whereas fermion operators satisfy analogous anticommutation relations. The spin annihilation and creation operators {\sigma_j^\pm } do not anticommute at distinct sites {j} but instead commute, whereas fermion operators must anticommute at different sites. This problem of mixed commutation and anticommutation relations can be solved using a method known as the Jordan-Wigner transformation. This step completes the fermionic reformulation of the 2-D Ising model. To obtain the partition function per site in the thermodynamic limit, which equals the largest eigenvalue of the transfer matrix, one diagonalizes the fermionized transfer matrix using appropriate canonical transformations.

2. The transfer matrix of the 2-D Ising model

To build intuition, let us consider first the 1-D Ising model. Define the Hamiltonian as

\displaystyle \mathcal H = -J \sum_{i=1} ^N \varsigma_i\varsigma_{i+1} -H\sum_{i=1}^N \varsigma_i ~, \ \ \ \ \ (1)

with periodic boundary conditions {\varsigma_{N+1}=\varsigma_1}. We use {\varsigma_i} instead of the usual {\sigma_i} to avoid confusion because the latter will be used to denote the Pauli matrices further below. The {\varsigma_i=\pm 1} are classical variables whereas the Pauli spin matrices are associated with quantum mechanical angular momentum (or spin) operators.

The partition function is given by

\displaystyle \begin{array}{rcl} Z_N&=& \displaystyle\sum_{\varsigma_1=\pm 1} ... \sum_{\varsigma_N=\pm 1} \exp[-\beta \mathcal H] \\ &=& \displaystyle\sum_{\varsigma_1=\pm 1} ... \sum_{\varsigma_N=\pm 1} e^{\beta J\varsigma_1\varsigma_2 +\beta H \varsigma_1} e^{\beta J\varsigma_2\varsigma_3 +\beta H \varsigma_2} ... e^{\beta J\varsigma_N\varsigma_1 +\beta H \varsigma_N} ~. \end{array}

Using Dirac’s bra-ket notation, let us define the matrix

\displaystyle \left\langle{\varsigma_i|T| \varsigma_j}\right\rangle = e^{\beta J\varsigma_i\varsigma_j +\beta H \varsigma_i} ~. \ \ \ \ \ (2)

Or, explicitly

\displaystyle T = \left[ \begin{array}{cc} e^{\beta J + \beta H } & e^{-\beta J + \beta H } \\ e^{-\beta J - \beta H } & e^{+\beta J - \beta H } \end{array} \right] ~. \ \ \ \ \ (3)

Here the vector {\left|{\varsigma} \right\rangle} can assume two possible values, {\left| {\pm}\right\rangle}. We can now write

\displaystyle \begin{array}{rcl} Z_N&=& \displaystyle\sum_{\varsigma_1} ... \sum_{\varsigma_N} \left\langle{\varsigma_1|T| \varsigma_2}\right\rangle \left\langle{\varsigma_2|T| \varsigma_3} \right\rangle ... \left\langle{\varsigma_N|T| \varsigma_1} \right\rangle \\ &=& \displaystyle\sum_{\varsigma_1=\pm 1} \left\langle{\varsigma_1|T^N| \varsigma_1}\right\rangle ~. \end{array}

Observe that {Z_N} is thus a sum of the diagonal entries of the matrix {T^N}. By definition of the trace, we thus have

\displaystyle Z_N=\mathrm{Tr~} T^N ~. \ \ \ \ \ (4)

Let {\lambda_+} and {\lambda_-} be the larger and smaller eigenvalues of {T}, respectively. Then the eigenvalues of {T^N} are {\lambda_\pm ^N} and so

\displaystyle Z_N= \lambda_+^N + \lambda_-^N= \lambda_+^N (1 + (\lambda_-/\lambda_+)^N ) ~. \ \ \ \ \ (5)

The partition function per site is then

\displaystyle \lambda_N= Z_N^{1/N } = \lambda_+ (1 + (\lambda_-/\lambda_+)^N ) ^{1/N} ~, \ \ \ \ \ (6)

so that in the thermodynamic limit we have

\displaystyle \lambda_\infty = \lim_{N\rightarrow\infty} \lambda_N= \lambda_+ ~. \ \ \ \ \ (7)

As taught in undergraduate courses in statistical mechanics, the free energy per site {f} is related to the partition function per site via {-\beta f=\ln \lambda_+}, so that knowledge of the partition function yields the thermodynamic properties of the system. Our goal is to find {\lambda_+} explicitly.

The eigenvalues {\lambda_\pm} are the roots of the characteristic polynomial

\displaystyle \mathrm{det~} (T-\lambda {\boldsymbol{1}} ) =0 ~. \ \ \ \ \ (8)

Here {{ \boldsymbol{ 1}}} is the identity matrix. For a {2\times 2} matrix, the determinant is particularly easy to calculate:

\displaystyle \mathrm{det~} (T-\lambda {~\boldsymbol{ 1}} ) = e^{2\beta J} - \lambda (e^{\beta J+\beta H}+ e^{\beta J-\beta H} ) + \lambda^2 - e^{-2\beta J} ~, \ \ \ \ \ (9)

so that

\displaystyle a\lambda^2 + b\lambda+c=0 ~, \ \ \ \ \ (10)

with

\displaystyle \begin{array}{rcl} a&=& 1\\ b&=& - (e^{\beta J+\beta H}+ e^{\beta J-\beta H} ) =- 2 e^{\beta J} \cosh \beta H \\ c&=& 2 \sinh 2\beta J ~,\end{array}

from which we get immediately

\displaystyle \begin{array}{rcl} \lambda_\pm &=& \frac{2 e^{\beta J} \cosh \beta H \pm \sqrt { 4 e^{2\beta J} \cosh^2 \beta H -8 \sinh 2\beta J}}2 \\\nonumber &=& e^{\beta J} \cosh \beta H \pm \sqrt { e^{2\beta J} \cosh^2 \beta H -2 \sinh 2\beta J } \\\nonumber &=& e^{\beta J} \cosh \beta H \pm \sqrt { e^{2\beta J} (1+\sinh ^2 \beta H) -2 \sinh 2\beta J } \\\nonumber &=& e^{\beta J} \cosh \beta H \pm \sqrt { e^{2\beta J} + e^{2\beta J} \sinh ^2 \beta H - e^{2\beta J} + e^{-2\beta J} } \\ &=& e^{\beta J} \cosh \beta H \pm \sqrt {e^{2\beta J} \sinh ^2 \beta H + e^{-2\beta J} } ~. \end{array}

The partition function per site is {\lambda_+}.
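These closed-form results are easy to check numerically. Below is a minimal sketch in Python with NumPy (the values of {\beta}, {J}, {H} and {N} are arbitrary choices for the test, not taken from the text): it compares the closed form for {\lambda_+} with a direct numerical diagonalization of {T}, and checks {Z_N=\mathrm{Tr~}T^N} against a brute-force sum over all {2^N} spin configurations of the ring.

```python
import itertools
import numpy as np

# Assumed test values (arbitrary): inverse temperature, coupling, field.
beta, J, H = 1.0, 0.8, 0.3

# Transfer matrix of Eq. (3); rows and columns ordered as (+1, -1).
T = np.array([[np.exp( beta*J + beta*H), np.exp(-beta*J + beta*H)],
              [np.exp(-beta*J - beta*H), np.exp( beta*J - beta*H)]])

# Largest eigenvalue, numerically and from the closed form derived above.
lam_max = max(np.linalg.eigvals(T).real)
lam_plus = (np.exp(beta*J)*np.cosh(beta*H)
            + np.sqrt(np.exp(2*beta*J)*np.sinh(beta*H)**2 + np.exp(-2*beta*J)))

# Z_N = Tr T^N versus the sum over all 2^N configurations of a ring of N spins.
N = 8
Z_trace = np.trace(np.linalg.matrix_power(T, N))
Z_brute = sum(np.exp(sum(beta*J*s[i]*s[(i+1) % N] + beta*H*s[i]
                         for i in range(N)))
              for s in itertools.product([1, -1], repeat=N))
```

For {N=8} the brute-force sum runs over only 256 configurations, so the check is instantaneous.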

Notice that the transfer matrix connects only nearest neighbors. Each power of the transfer matrix adds one bond and one spin site. The partition function is then obtained as the trace of a power of the transfer matrix.

This idea can be applied in higher dimensions. In two dimensions, there are at least two ways of setting up a transfer matrix. Either we can add two bonds and one site, using helical boundary conditions to sweep the entire lattice, or else we can add an entire line of sites and bonds, using periodic boundary conditions. In three dimensions, one can also set up the transfer matrix in one of these two ways. But in three dimensions, one can also add an entire plane of bonds and sites. If the transfer matrix adds {n} sites, then the matrix has dimensions {2^n\times 2^n}.

We choose to set up the transfer matrix to add one line of sites. For an {M\times N} lattice (of {N} columns and {M} rows), let us choose the transfer matrix to add one column of {M} spins. So the matrix has dimensions {2^M \times 2^M}, and connects two neighboring columns of {M} spins each. Let {\boldsymbol{\varsigma}_i} denote the vector consisting of the {M} spin variables {\varsigma_{i,1}, \varsigma_{i,2}, \ldots, \varsigma_{i,M}} corresponding to column {i}.

Consider the 2-D Hamiltonian

\displaystyle \mathcal H= -J_1 \sum_{m,n}^{M,N} \varsigma_{n,m} \varsigma_{n+1,m} -J_2 \sum_{m,n}^{M,N} \varsigma_{n,m} \varsigma_{n,m+1} ~,\ \ \ \ \ (11)

where {J_1} and {J_2} are the couplings along rows and columns respectively.

Let {K_1=\beta J_1} and {K_2=\beta J_2} be the reduced temperatures (or, alternatively, the reduced couplings). The partition function of the 2-D Ising model can then be written as

\displaystyle Z_{MN} = \displaystyle\sum_ {\boldsymbol{\varsigma}_1} \sum_ {\boldsymbol{\varsigma}_2} \ldots \sum_{\boldsymbol{\varsigma}_N} (e^{K_1 {\boldsymbol{\varsigma}_1} \cdot {\boldsymbol{\varsigma}_2}} e^{K_2 \sum_m \varsigma_{1,m} \varsigma_{1,m+1} } ) \ldots (e^{K_1 {\boldsymbol{\varsigma}_N} \cdot {\boldsymbol{\varsigma}_1} } e^{K_2 \sum_m \varsigma_{N,m} \varsigma_{N,m+1} } ) ~,\ \ \ \ \ (12)

where the sum over each {\boldsymbol\varsigma_i} is over {2^M} states. Each of the factors in parentheses is due to interactions between neighboring columns or within a single column, so they can be made into elements of a matrix {T=T_{ {\boldsymbol{\varsigma}_i} {\boldsymbol{\varsigma}_j}}} of dimension {2^M\times 2^M}. Then

\displaystyle \begin{array}{rcl} Z_{MN} &=& \displaystyle\sum_ {\boldsymbol{\varsigma}_1}\sum_ {\boldsymbol{\varsigma}_2} \ldots \sum_ {\boldsymbol{\varsigma}_N} T_{ {\boldsymbol{\varsigma}_1} {\boldsymbol{\varsigma}_2}} T_{ {\boldsymbol{\varsigma}_2} {\boldsymbol{\varsigma}_3}} \ldots T_{ {\boldsymbol{\varsigma}_N} {\boldsymbol{\varsigma}_1}} \nonumber \\ &=& \displaystyle\sum_ {\boldsymbol{\varsigma}_1} (T^N) _{ {\boldsymbol{\varsigma}_1} {\boldsymbol{\varsigma}_1}} \nonumber \\ &=& \mathrm{Tr~} T^N ~. \end{array}

So similarly to the 1-D case, the partition function of the 2-D Ising model can also be expressed as the trace of a power of the transfer matrix. The same is true for the cubic lattice, etc. The difficult part is to actually compute the trace! For example, Barry Cipra has referred to the challenge of solving the 3-D case as “the Holy Grail of statistical mechanics.”
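For a small lattice, the 2-D trace formula can be verified by brute force. The sketch below (the couplings and lattice dimensions are arbitrary test values) builds the {2^M\times 2^M} transfer matrix whose elements are the factors in parentheses in (12), and compares {\mathrm{Tr~}T^N} with the direct sum over all {2^{MN}} configurations of the torus.

```python
import itertools
import numpy as np

# Assumed test values: K1 couples neighboring columns, K2 couples within a column.
K1, K2 = 0.4, 0.6
M, N = 3, 4   # M rows, N columns, periodic in both directions

# Enumerate the 2^M states of one column.
cols = list(itertools.product([1, -1], repeat=M))

# Transfer matrix of Eq. (12): one inter-column bond plus the intra-column
# bonds of the first of the two columns.
T = np.zeros((2**M, 2**M))
for a, s in enumerate(cols):
    intra = sum(s[m]*s[(m+1) % M] for m in range(M))
    for b, t in enumerate(cols):
        inter = sum(s[m]*t[m] for m in range(M))
        T[a, b] = np.exp(K1*inter + K2*intra)

Z_trace = np.trace(np.linalg.matrix_power(T, N))

# Brute force over all 2^(M*N) spin configurations.
Z_brute = 0.0
for conf in itertools.product([1, -1], repeat=M*N):
    s = np.array(conf).reshape(M, N)
    Z_brute += np.exp(K1*np.sum(s*np.roll(s, -1, axis=1))    # between columns
                      + K2*np.sum(s*np.roll(s, -1, axis=0))) # within columns
```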

3. The transfer matrix in terms of the Pauli spin matrices

Before we fully fermionize the problem, let us reformulate the transfer matrix in terms of the Pauli spin matrices, which have anticommuting properties. As we did above with the transfer matrix, let us first look at the 1-D analog before considering the 2-D case.

We saw earlier that the transfer matrix for the 1-D Ising model is a {2\times 2} matrix. It thus makes sense that we should be able to write it, and therefore the partition function, in terms of the Pauli spin matrices, which we briefly review:

\displaystyle \begin{array}{rcl} \sigma_x&=& \left[\begin{array}{cc} 0 & ~1~ \\ 1 & 0 \end{array} \right] \\ \sigma_y&=& \left[\begin{array}{cc} 0 & -i \\ i & 0 \end{array} \right] \\ \sigma_z&=& \left[\begin{array}{cc} 1 & ~0~ \\ 0 & -1 \end{array} \right] ~~~. \end{array}

Equivalently, we can define {\sigma_x} and {\sigma_y} via the raising and lowering operators (also known as ladder operators) {\sigma^\pm}:

\displaystyle \begin{array}{rcl} \sigma^+&=& \left[\begin{array}{cc} 0 & 1 \\ 0 & 0 \end{array} \right] \\ \sigma^-&=& \left[\begin{array}{cc} 0 & 0 \\ 1 & 0 \end{array} \right] \\ \sigma_x&=& \sigma^+ + \sigma^- \\ \sigma_y&=& -i (\sigma^+ - \sigma^-) ~~~. \end{array}

The raising and lowering operators have this name because they increase and decrease the spin angular momentum in quantum mechanics. To illustrate this property, consider {\left|{\pm}\right\rangle} to be the eigenvectors of {\sigma_z}, so that {\sigma_z \left|{\pm}\right\rangle = \pm \left|{\pm}\right\rangle} and

\displaystyle \begin{array}{rcl} \left|{-}\right\rangle &=& \left[\begin{array}{c} 0 \\ 1 \end{array} \right] \\ \left|{+}\right\rangle &=& \left[\begin{array}{c} 1 \\ 0 \end{array} \right] ~. \end{array}

Then note that

\displaystyle \begin{array}{rcl} \sigma^+ \left|{-}\right\rangle &=& \left|{+} \right\rangle\\ \sigma^+ \left|{+}\right\rangle &=& 0\\ \sigma^- \left|{-}\right\rangle &=& 0\\ \sigma^- \left|{+}\right\rangle &=& \left|{-}\right\rangle ~~. \end{array}

So {\sigma^\pm} lower or raise the eigenvalues of {\sigma_z} by 2.

These matrices have several interesting properties. It is easy to verify that the Pauli matrices satisfy the commutation relations

\displaystyle [ \sigma_x ,\sigma_x ]= [ \sigma_y, \sigma_y ] = [ \sigma_z , \sigma_z ] =0 ~,

\displaystyle [ \sigma_x, \sigma_ y ] = 2 i \sigma_z ~,

and circular permutations of the last.

Viewed as operators, the matrices {\sigma_x,\sigma_y, \sigma_z} represent involutions, i.e. they are their own inverses:

\displaystyle \sigma_x^2=\sigma_y^2=\sigma_z^2 = {~\boldsymbol{ 1}} ~. \ \ \ \ \ (13)

In contrast, the raising and lowering operators are nilpotent with degree 2:

\displaystyle (\sigma^\pm)^2 = 0 ~. \ \ \ \ \ (14)

We note in passing that Grassmann numbers share this last property and indeed the Ising model can also be solved using Grassmann variables.

Most important for our purposes are the anticommutation relations

\displaystyle \begin{array}{rcl} \{\sigma_x,\sigma_x \} &=& \{\sigma_y,\sigma_y \}= \{\sigma_z,\sigma_z \} = 2 ~\boldsymbol{ 1} \\ \{\sigma_x,\sigma_y \} &=& \{\sigma_y,\sigma_z \}= \{\sigma_z,\sigma_x \} = 0 \\ \{\sigma^+\! ,\sigma^- \}&=& ~\boldsymbol{ 1} ~. \end{array}

 

The first follows from the involution property discussed above and the last two are easily obtained by direct substitution. The anticommutation relation for the raising and lowering operators is the signature of fermions.

Before proceeding, let us first prove some useful trigonometric identities. There are versions of Euler’s identity for the Pauli matrices:

\displaystyle e^{ A \sigma_z} = \cosh[A \sigma_z] + \sinh[A \sigma_z] = {~\boldsymbol{ 1}} \cosh A + \sigma_z \sinh A ~,\ \ \ \ \ (15)

the proof of which is easily obtained from the matrix power series and then using the involution property (even powers of {\sigma_z} are {{~\boldsymbol{ 1}}} and so odd powers are simply {\sigma_z}). The same argument gives us

\displaystyle e^{ A \sigma_x} = {~\boldsymbol{ 1}} \cosh A + \sigma_x \sinh A ~.\ \ \ \ \ (16)
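Both identities are easy to confirm numerically, for instance by summing the matrix power series directly. A small sketch (the value of {A} is an arbitrary test choice):

```python
import numpy as np

sx = np.array([[0., 1.], [1., 0.]])
sz = np.array([[1., 0.], [0., -1.]])
I2 = np.eye(2)

A = 0.37  # arbitrary test value

def expm_series(X, terms=40):
    """Matrix exponential by direct Taylor series (plenty of terms for small X)."""
    out, term = np.eye(2), np.eye(2)
    for n in range(1, terms):
        term = term @ X / n
        out = out + term
    return out

# Eqs. (15) and (16): e^{A sigma} = 1 cosh A + sigma sinh A.
lhs_z, rhs_z = expm_series(A*sz), np.cosh(A)*I2 + np.sinh(A)*sz
lhs_x, rhs_x = expm_series(A*sx), np.cosh(A)*I2 + np.sinh(A)*sx
```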

Let us now return to the partition function of the 1-D Ising model and write

\displaystyle Z_N= \displaystyle\sum_\varsigma (e^{L\varsigma_1} e^{K\varsigma_1\varsigma_2} ) (e^{L\varsigma_2} e^{K\varsigma_2\varsigma_3} ) \ldots (e^{L\varsigma_N} e^{K\varsigma_N\varsigma_1} ) ~, \ \ \ \ \ (17)

where {L=\beta H} and {K=\beta J}. Previously we expressed the transfer matrix such that the factors in parentheses were its elements. Now, we take a different approach and treat the two kinds of factors inside the parentheses separately. Let us define two matrices {V_1} and {V_2}

\displaystyle \begin{array}{rcl} \left\langle{\varsigma|V_1|\varsigma'}\right\rangle &=& \delta_{\varsigma,\varsigma'} ~e ^{L\varsigma} \\ \left\langle{\varsigma|V_2|\varsigma'}\right\rangle &=& e ^{K\varsigma \varsigma'} ~,\end{array}

or explicitly,

\displaystyle \begin{array}{rcl} V_1&=& \left[ \begin{array}{cc} e^L & 0 \\ 0 & e^{-L} \end{array} \right] \\ V_2&=& \left[ \begin{array}{cc} e^K & e^{-K} \\ e^{-K} & e^K \end{array} \right] ~. \end{array}

We can express {V_1} and {V_2} in terms of the Pauli matrices as follows.

\displaystyle e ^{L\sigma_z} = {~\boldsymbol{ 1}} \cosh L + \sigma _z \sinh L = \left[ \begin{array}{cc} e^L & 0 \\ 0 & e^{-L} \end{array} \right] ~, \ \ \ \ \ (18)

so that

\displaystyle V_1=e^{L \sigma_z} ~. \ \ \ \ \ (19)

For {V_2} we proceed as follows. We first define {A} and {K^*},

\displaystyle \begin{array}{rcl} A(K) \cosh K^* &=& e^K\\ A(K) \sinh K^* &=& e^{-K} ~. \end{array}

Then we have

\displaystyle \begin{array}{rcl} A(K)e^{ K^* \sigma_x} &=& A(K) (~\boldsymbol{ 1} \cosh K^* + \sigma_x \sinh K^*)\\ &=& ~\boldsymbol{ 1} e^K + \sigma_x e^{-K}\\ &=& V_2 ~. \end{array}

Let us denote the basis of eigenvectors of {\sigma_z} as {\left|{\pm}\right\rangle}. Then we can write the partition function as

\displaystyle Z_N=\displaystyle\sum_{\mu_1,\ldots,\mu_N=\pm} \left\langle{\mu_1|V_1V_2|\mu_2} \right\rangle \left\langle{\mu_2|V_1V_2|\mu_3}\right\rangle \ldots \left\langle{\mu_N|V_1V_2|\mu_1} \right\rangle =\mathrm{Tr~} (V_1 V_2)^N ~. \ \ \ \ \ (20)

By the invariance of the trace of a product under cyclic permutations of the factors, we can write the partition function more symmetrically as

\displaystyle Z_N = \mathrm{Tr~} (V_2^{1/2} V_1 V_2^{1/2} )^N ~. \ \ \ \ \ (21)

Let us now proceed to the 2-D case of an {M\times N} lattice. Let {\left|{\mu}\right\rangle} now denote one full column of sites, i.e.

\displaystyle \left|{\mu}\right\rangle = \left|{\mu_{1} }\right\rangle \otimes \left|{\mu_{2} } \right\rangle\ldots \otimes \left|{\mu_{M} }\right\rangle ~. \ \ \ \ \ (22)

This kind of product is sometimes called a direct product in physics textbooks; in the language of multilinear algebra, it is a tensor product. The vector {\left|{\mu}\right\rangle} thus belongs to the tensor product of the {M} two-dimensional vector spaces corresponding to the sites in a single column, so the dimension of the product space is {2^M}, in agreement with the {2^M\times 2^M} transfer matrix found above. (Had we instead taken the direct sum of these spaces, the dimension would be only {2M}.)

Let {\sigma^z_j} be the Pauli spin operator {\sigma_z} that acts on the vector {\left|{\mu}\right\rangle} only at row {j}, i.e. it acts only on the factor {\left|{\mu_j}\right\rangle}, and similarly for {\sigma^x_j} and {\sigma^y_j}. Moreover, let {\sigma_j^\pm} be the raising and lowering operators that act on row {j}. We thus have

\displaystyle \begin{array}{rcl} \sigma^z_j \left|{\mu}\right\rangle &=& \mu_j \left|{\mu}\right\rangle, \quad \mu_j=\pm 1\\ \sigma^\pm_j \left|{\mu}\right\rangle &=& \delta_{\mp 1,\mu_j} ~ \big(~ \left|{\mu_{1} }\right\rangle \otimes \ldots \otimes \left|{\mu_{j} \pm 2}\right\rangle \otimes \ldots \otimes \left|{\mu_{M} }\right\rangle ~\big) ~~. \end{array}

We know the commutation relations of these operators for identical {j}, but how do operators of distinct {j} behave? Since they act on different subspaces, they must commute:

\displaystyle [\sigma^\alpha_i,\sigma^\beta_j] =0, \quad i\neq j, \quad \alpha,\beta = \pm,z ~. \ \ \ \ \ (23)

 

We are now ready to express the transfer matrix in terms of these operators. Recall that in the 1-D case, we expressed the transfer matrix in terms of the Pauli matrices. In the 2-D case, the transfer matrix is no longer a 2{\times}2 matrix, but we can still try to express the elements of the transfer matrix in terms of the Pauli matrices. The interactions between columns and within columns are “orthogonal,” in the sense that they are independent. So we can try to express the interactions in one direction in terms of, say, {\sigma^x_j} and in the other by {\sigma_j^z}. In the 1-D case, the matrix {V_1} is diagonal, because it only contains the contributions of the external field, but not from the interactions between the spins. In the 2-D case, similarly, the contribution to the transfer matrix from the interactions inside a column will give rise to a diagonal matrix. This fact is obvious if one considers that it makes no sense to have non-diagonal terms connecting two different columns for interactions inside a single column. Hence, it makes sense to use {\sigma_j^z} to represent the interactions inside a column. Then, we can try to represent the interactions between the columns in terms of the {\sigma_j^x}. Note that in the 1964 paper by Schultz, Mattis and Lieb, the roles of {V_1} and {V_2} are exchanged and their {V_2} is diagonal.

We next proceed as described above. Consider first the diagonal part:

\displaystyle \begin{array}{rcl} \left\langle{\mu' |e^{K_2 \sum_m\sigma^z_m\sigma^z_{m+1}}|\mu }\right\rangle &=& e^{K_2\sum_m \mu_m \mu_{m+1}} \left\langle{\mu'|\mu}\right\rangle \\ &=&e^{K_2\sum_m \mu_m \mu_{m+1}} \left\langle{\mu'_1|\mu_1}\right\rangle \left\langle{\mu'_2|\mu_2}\right\rangle \ldots \left\langle{\mu'_M|\mu_M}\right\rangle ~. \end{array}

So we define {V_1} as

\displaystyle V_1=e^{K_2 \sum_m\sigma^z_m\sigma^z_{m+1}} ~. \ \ \ \ \ (24)

Now all that remains is to find the nondiagonal contribution, arising from interactions between columns. Observe that

\displaystyle \begin{array}{rcl} \left\langle{\mu' |e^{K_1^* \sum_m \sigma^x_m}|\mu }\right\rangle &=& \left\langle{\mu'_1| e^{K_1^* \sigma^x_1}|\mu_1 }\right\rangle \left\langle{\mu'_2| e^{K_1^* \sigma^x_2}|\mu_2 }\right\rangle \ldots \left\langle{\mu'_M| e^{K_1^* \sigma^x_M}|\mu_M }\right\rangle ~. \end{array}

Next, recall that

\displaystyle \sigma_x= \sigma^+ + \sigma^- ~,\ \ \ \ \ (25)

so that

\displaystyle \begin{array}{rcl} e^{K_1^* \sigma^x_j}\left|{\mu_j} \right\rangle &=& \cosh K_1^* {~\boldsymbol{ 1}}_j \left|{\mu_j}\right\rangle + \sinh K_1^* \sigma^x_j \left|{\mu_j}\right\rangle \nonumber \\ &=& \cosh K_1^* {~\boldsymbol{ 1}}_j \left|{\mu_j}\right\rangle + \sinh K_1^* ( \sigma^+_j +\sigma^-_j) \left|{\mu_j}\right\rangle \nonumber \\ &=& \cosh K_1^* {~\boldsymbol{ 1}}_j \left|{\mu_j}\right\rangle + \sinh K_1^* ( \left|{\mu_j+2}\right\rangle + \left|{\mu_j-2}\right\rangle ) ~. \end{array}

Here it is assumed implicitly that {\left|{\pm 3}\right\rangle=0} of course. The corresponding {2\times 2} matrix is thus seen to be

\displaystyle \begin{array}{rcl} \nonumber \left\langle{\mu'_j|e^{K_1^* \sigma^x_j} |\mu_j} \right\rangle &=& \cosh K^*_1 \delta_ {\mu_j\mu_j'} + \sinh K_1^* (1-\delta_ {\mu_j\mu_j'}) \\ &=& \tfrac 1 2 (e^{K_1^*} + \mu_j \mu'_j e^{-K_1^*}) ~. \end{array}

We need the factors {\mu_j \mu'_j=\pm} exponentiated, so this is still not what we want. But we can easily put it into the needed form, by repeating the trick we used in the 1-D case. Specifically, let us write

\displaystyle {e^{K_1^*} + \mu_j \mu'_j e^{-K_1^*} \over 2} = A(K_1) e^{K_1\mu_j\mu'_j } ~,\ \ \ \ \ (26)

where {K_1} is a function of {K^*_1} and vice versa. Putting everything together, for two columns {\left|{\mu}\right\rangle} and {\left|{\mu'}\right\rangle} we get

\displaystyle \left\langle{\mu' |e^{K_1^* \sum_m \sigma^x_m}|\mu }\right\rangle = (A(K_1))^M e^{K_1 \sum _{j=1}^M \mu_j \mu'_j} ~. \ \ \ \ \ (27)

We can find the relation between {K_1} and {K_1^*} explicitly as follows. Putting in {\mu_j\mu_j'=\pm 1}, we have

\displaystyle \begin{array}{rcl} \cosh K_1^* &=& A(K_1) e^{K_1} \\ \sinh K_1^* &=& A(K_1) e^{-K_1} ~,\end{array}

so that

\displaystyle \begin{array}{rcl} \tanh K_1^*&=& e^{-2K_1} \\ \cosh K_1^* \sinh K_1^* &=& (A(K_1)) ^2 ~. \end{array}

The last expression can be further manipulated as follows:

\displaystyle \begin{array}{rcl} A(K_1)^2\nonumber &=&\cosh K_1^* \sinh K_1^* \\\nonumber &=& \cosh^2 K_1^* \tanh K_1^*\\\nonumber &=& \tanh K_1^* / \mathrm{ sech\/ } ^2 K_1^*\\\nonumber &=& \tanh K_1^* / (1- \tanh^2 K_1^*)\\\nonumber &=& e^{-2K_1} / (1- e^{-4 K_1})\\\nonumber &=& 1/ (e^{2K_1 }- e^{-2 K_1})\\ &=& 1/ (2 \sinh 2 K_1) ~,\end{array}

so that we finally have

\displaystyle A(K_1)= (2 \sinh 2 K_1 )^{-1/2 } ~. \ \ \ \ \ (28)

We thus obtain

\displaystyle V_2= (2\sinh 2K_1)^{M/2} e ^{K_1^* \sum_m \sigma_m^x} ~. \ \ \ \ \ (29)
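As a quick numerical sanity check of the relations between {K_1}, {K_1^*} and {A(K_1)} derived above (the value of {K_1} is an arbitrary test choice):

```python
import numpy as np

K1 = 0.7                          # arbitrary test value
K1s = np.arctanh(np.exp(-2*K1))   # from tanh K1* = exp(-2 K1)
A = (2*np.sinh(2*K1))**-0.5       # Eq. (28)

# The defining relations cosh K1* = A e^{K1}, sinh K1* = A e^{-K1},
# plus Eq. (26) for both values of mu_j mu_j'.
checks = [
    np.isclose(np.cosh(K1s), A*np.exp(K1)),
    np.isclose(np.sinh(K1s), A*np.exp(-K1)),
]
for mm in (1, -1):
    checks.append(np.isclose((np.exp(K1s) + mm*np.exp(-K1s))/2,
                             A*np.exp(K1*mm)))
```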

Following the 1-D example, the partition function can now be written as a trace:

\displaystyle Z_{MN} = \mathrm{ Tr~} (V_2 V_1)^N ~. \ \ \ \ \ (30)

Again, we can use the invariance of the trace relative to circular permutations to get the symmetric form

\displaystyle Z_{MN} = \mathrm{ Tr~} (V_2^{1/2} V_1 V_2^{1/2})^N ~. \ \ \ \ \ (31)
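Everything in this section can be verified numerically for a small lattice: build {V_1} and {V_2} as explicit {2^M\times 2^M} matrices and compare {\mathrm{Tr~}(V_2V_1)^N} with the brute-force partition function. A sketch (couplings and lattice dimensions are arbitrary test values); since the {\sigma^x_m} commute, the exponential in {V_2} factorizes into a Kronecker product of {2\times 2} blocks:

```python
import itertools
from functools import reduce
import numpy as np

K1, K2 = 0.4, 0.6  # assumed reduced couplings (arbitrary test values)
M, N = 3, 4        # M rows per column, N columns

sx = np.array([[0., 1.], [1., 0.]])

# All 2^M column states, ordered consistently with the Kronecker products below.
states = np.array(list(itertools.product([1, -1], repeat=M)))

# V1 (Eq. 24) is diagonal: exponent K2 * sum_m mu_m mu_{m+1}, periodic in m.
intra = (states*np.roll(states, -1, axis=1)).sum(axis=1)
V1 = np.diag(np.exp(K2*intra))

# V2 (Eq. 29): e^{K1* sum_m sigma^x_m} = kron product of cosh K1* + sigma_x sinh K1*.
K1s = np.arctanh(np.exp(-2*K1))
blk = np.cosh(K1s)*np.eye(2) + np.sinh(K1s)*sx
V2 = (2*np.sinh(2*K1))**(M/2) * reduce(np.kron, [blk]*M)

Z_op = np.trace(np.linalg.matrix_power(V2 @ V1, N))

# Brute-force partition function on the M x N torus.
Z_brute = 0.0
for conf in itertools.product([1, -1], repeat=M*N):
    s = np.array(conf).reshape(M, N)
    Z_brute += np.exp(K1*np.sum(s*np.roll(s, -1, axis=1))
                      + K2*np.sum(s*np.roll(s, -1, axis=0)))
```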

 

4. Fermionization via the Jordan-Wigner transformation

Let us first put {V_1} and {V_2} into a more convenient form, by rotating around the {y} axis such that

\displaystyle \begin{array}{rcl} \sigma_x &\mapsto& -\sigma_z \\ \sigma_z &\mapsto& \sigma_x ~. \end{array}

In this new arrangement, we get

\displaystyle \begin{array}{rcl} V_1&=& e^{K_2 \sum_m \sigma^x_m \sigma^x_{m+1}} \\ V_2&=& (2\sinh 2K_1)^{M/2}\, e^{-K_1^* \sum_m \sigma_m^z} ~. \end{array}

Previously, the matrix {V_1} was diagonal. But in this new form, it is {V_2} which is diagonal.

To find the exact partition function, we need to find the largest eigenvalue of the transfer matrix. We will do this by fermionizing the problem. Our Pauli operators {\sigma_j^\alpha} have the correct fermionic anticommutation properties at any fixed site {j}, but operators at different sites commute, see (23). Fermion operators, in contrast, anticommute at different sites. The physics behind the anticommutation is that when two indistinguishable fermions, such as electrons, are exchanged, their two-particle wave function changes sign. It is this property that underlies the Pauli exclusion principle. In contrast, bosons satisfy commutation relations rather than anticommutation relations. The Pauli matrices {\sigma_j^\alpha} are neither fully bosonic nor fully fermionic, making diagonalization of the transfer matrix difficult. But there is a way to work around this obstruction.

Let us define new creation and annihilation operators that depend on all sites to the “left” in the following sense:

\displaystyle \begin{array}{rcl} c_m &=& \exp \left[ i \pi \sum_{j=1}^{m-1} \sigma^+_j \sigma^-_j\right] \sigma_m^- \\ c_m^\dagger &=& \exp \left[ i \pi \sum_{j=1}^{m-1} \sigma^+_j \sigma^-_j\right] \sigma_m^+ ~. \end{array}

These new operators have interesting properties, which we will show below. First observe that

\displaystyle \begin{array}{rcl} (\sigma_j^+ \sigma_j^-)^2 \nonumber &=& \sigma_j^+ \sigma_j^- \sigma_j^+ \sigma_j^- \\\nonumber &=& \sigma_j^+ ({~\boldsymbol{ 1}}_j - \sigma_j^+ \sigma_j^-) \sigma_j^- \\\nonumber &=& \sigma_j^+ \sigma_j^- - \sigma_j^+ \sigma_j^+ \sigma_j^- \sigma_j^- \\ &=& \sigma_j^+ \sigma_j^- ~. \end{array}

So

\displaystyle \begin{array}{rcl} e^{\theta \sigma_j^+ \sigma_j^- } \nonumber &=& ~\boldsymbol{ 1}_j + \sum_{n=1}^\infty {\theta^n (\sigma_j^+ \sigma_j^-)^n \over n!} \\\nonumber &=& ~\boldsymbol{ 1}_j + \sigma_j^+ \sigma_j^- \sum_{n=1}^\infty {\theta^n \over n!} \\ &=& ~\boldsymbol{ 1}_j + \sigma_j^+ \sigma_j^- (e^{\theta} -1 ) ~. \end{array}

Hence, putting {\theta=2\pi i} we get

\displaystyle e^{2 \pi i \sigma_j^+ \sigma_j^- } = {~\boldsymbol{ 1}}_j ~. \ \ \ \ \ (32)

From this identity and from the commutation of the Pauli operators at distinct sites, we immediately get

\displaystyle \begin{array}{rcl} c_m^\dagger c_m &=& \sigma_m^+ \sigma_m^- \\ c_m c_m^\dagger &=& \sigma_m^- \sigma_m^+ ~. \end{array}

Hence,

\displaystyle \{ c_m^\dagger, c_m \} = {~\boldsymbol{ 1}} ~.

So the new operators have the correct fermion anticommutation property at the same site {m}. What about at distinct sites? Observe that

\displaystyle \sigma_m^+e^{\theta \sigma_m^+\sigma_m^-} = \sigma_m^+ ~,\ \ \ \ \ (33)

so that for {n>m} we have

\displaystyle \begin{array}{rcl} c_m^\dagger c_n \nonumber &=& \sigma_m^+ e^{\pi i \sum_{j=1}^{m-1} \sigma_j^+ \sigma_j^- } e^{\pi i \sum_{j=1}^{n-1} \sigma_j^+ \sigma_j^- } \sigma_n^- \\\nonumber &=& \sigma_m^+ e^{2\pi i \sum_{j=1}^{m-1} \sigma_j^+ \sigma_j^- } e^{\pi i \sigma_m^+\sigma_m^- } e^{\pi i \sum_{j=m+1}^{n-1} \sigma_j^+ \sigma_j^- } \sigma_n^- \\ &=& e^{\pi i \sum_{j=m+1}^{n-1} \sigma_j^+ \sigma_j^- } \sigma_m^+ \sigma_n^- ~. \end{array}

 

Similarly,

\displaystyle \begin{array}{rcl} \nonumber c_n c_m^\dagger &=& \sigma_n^- e^{\pi i \sum_{j=1}^{n-1} \sigma_j^+ \sigma_j^- } e^{\pi i \sum_{j=1}^{m-1} \sigma_j^+ \sigma_j^- } \sigma_m^+ \\\nonumber &=& \sigma_n^- e^{2\pi i \sum_{j=1}^{m-1} \sigma_j^+ \sigma_j^- } e^{\pi i \sigma_m^+\sigma_m^- } e^{\pi i \sum_{j=m+1}^{n-1} \sigma_j^+ \sigma_j^- } \sigma_m^+ \\\nonumber &=& e^{\pi i \sum_{j=m+1}^{n-1} \sigma_j^+ \sigma_j^- } \sigma_n^- e^{\pi i \sigma_m^+\sigma_m^- } \sigma_m^+ \\\nonumber &=& e^{\pi i \sum_{j=m+1}^{n-1} \sigma_j^+ \sigma_j^- } \sigma_n^- [{~\boldsymbol{ 1}}_m + \sigma_m^+\sigma_m^- (e^{\pi i} -1 ) ] \sigma_m^+ \\\nonumber &=& e^{\pi i \sum_{j=m+1}^{n-1} \sigma_j^+ \sigma_j^- } \sigma_n^- \left(\sigma_m^+ - 2 \sigma_m^+\sigma_m^- \sigma_m^+ \right) \\\nonumber &=& e^{\pi i \sum_{j=m+1}^{n-1} \sigma_j^+ \sigma_j^- } \sigma_n^- \left(\sigma_m^+ - 2 \sigma_m^+ (1-\sigma_m^+ \sigma_m^- ) \right) \\\nonumber &=& e^{\pi i \sum_{j=m+1}^{n-1} \sigma_j^+ \sigma_j^- } \sigma_n^- \left(\sigma_m^+ - 2 \sigma_m^+ \right)\\ &=& - e^{\pi i \sum_{j=m+1}^{n-1} \sigma_j^+ \sigma_j^- } \sigma_n^- \sigma_m^+ ~. \end{array}

Adding, we thus get

\displaystyle \{ c_n , c_m^\dagger \} =0 ~, \quad n\neq m~. \ \ \ \ \ (34)

Moreover, we can show similarly that

\displaystyle \begin{array}{rcl} \{ c_n , c_m \} &=&0 \\ \{ c_n^\dagger , c_m^\dagger \} &=&0 ~. \end{array}

So we have a complete set of anticommutation rules.
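The whole set of anticommutation rules can be checked numerically by building the {c_m} as explicit matrices. In the sketch below (with an arbitrary small chain length {M}), the string factor uses {e^{i\pi\sigma^+_j\sigma^-_j}=\mathrm{diag}(-1,1)} at each site to the left of {m}:

```python
from functools import reduce
import numpy as np

M = 4  # assumed chain length; small is enough for an exact check

I2 = np.eye(2)
sm = np.array([[0., 0.], [1., 0.]])   # sigma^-
string = np.diag([-1., 1.])           # exp(i*pi*sigma^+ sigma^-) at one site

def c(j):
    """Jordan-Wigner operator c_j (0-based site index) on the M-site chain."""
    return reduce(np.kron, [string]*j + [sm] + [I2]*(M - 1 - j))

cs = [c(j) for j in range(M)]
cds = [op.conj().T for op in cs]

def anti(a, b):
    return a @ b + b @ a

# Full set of fermionic anticommutation relations: {c_m, c_n^dag} = delta_{mn},
# {c_m, c_n} = 0, {c_m^dag, c_n^dag} = 0.
ok = all(np.allclose(anti(cs[m], cds[n]), (m == n)*np.eye(2**M))
         and np.allclose(anti(cs[m], cs[n]), 0)
         and np.allclose(anti(cds[m], cds[n]), 0)
         for m in range(M) for n in range(M))
```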

We can now express {V_1} and {V_2} in terms of these new operators. Note that

\displaystyle \sigma^z_j = 2 \sigma_j^+ \sigma_j^- - {~\boldsymbol{ 1}}_j = 2 c_j^\dagger c_j -{~\boldsymbol{ 1}}_j ~,\ \ \ \ \ (35)

so that

\displaystyle V_2= (2 \sinh 2K_1)^{M/2 } e^{-2 K_1^* \sum_m (c_m^\dagger c_m -1/2) } ~. \ \ \ \ \ (36)

 

In contrast, {V_1} is not only quadratic in the Pauli matrices, but connects different rows, i.e. different sites in a given column. Still, it is possible to re-express {V_1} easily in terms of the new operators. Observe that for {m<M}, we get from (33)

\displaystyle \begin{array}{rcl} c^\dagger_m c_{m+1} &=& \sigma_m^+ \sigma^-_{m+1} ~. \end{array}

Using a similar reasoning, we get

\displaystyle \begin{array}{rcl} \nonumber c_m c_{m+1} &=& \sigma_m^- e^{i \pi \sigma^+_m \sigma^-_m} \sigma_{m+1}^- \\\nonumber &=& \sigma_m^- ({~\boldsymbol{ 1}}_m -2 \sigma^+_m \sigma^-_m) \sigma_{m+1}^- \\\nonumber &=& -2 \sigma_m^- \sigma^+_m \sigma^-_m \sigma_{m+1}^- + \sigma_m^-\sigma_{m+1}^- \\\nonumber &=& -2 ({~\boldsymbol{ 1}}_m -\sigma_m^+ \sigma^-_m) \sigma^-_m \sigma_{m+1}^- + \sigma_m^-\sigma_{m+1}^- \\\nonumber &=& -2 \sigma^-_m \sigma_{m+1}^- + \sigma_m^-\sigma_{m+1}^- \\ &=& -\sigma^-_m \sigma_{m+1}^- ~. \end{array}

And

\displaystyle \begin{array}{rcl} \nonumber c_m^\dagger c_{m+1} ^\dagger &=& \sigma_m^+ e^{i \pi \sigma^+_m \sigma^-_m} \sigma_{m+1}^ + \\\nonumber &=& \sigma_m^+ ({~\boldsymbol{ 1}}_m -2 \sigma^+_m \sigma^-_m) \sigma_{m+1}^+ \\ &=& \sigma_m^+ \sigma_{m+1}^+ ~. \end{array}

Hence,

\displaystyle \begin{array}{rcl} \nonumber \sigma^x_m\sigma^x_{m+1} &=& (\sigma_m^+ + \sigma_m^-) (\sigma_{m+1}^+ + \sigma_{m+1}^-) \\ &=& (c_m^\dagger - c_m) (c_{m+1}^\dagger + c_{m+1}) ~. \end{array}

Hence, neglecting the boundary, the bulk of the matrix {V_1} can be written as

\displaystyle \begin{array}{rcl} V_1 = e^{K_2\sum_m (c_m^\dagger - c_m) (c_{m+1}^\dagger + c_{m+1}) } ~. \end{array}
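The bilinear identity {\sigma^x_m\sigma^x_{m+1}=(c_m^\dagger-c_m)(c_{m+1}^\dagger+c_{m+1})} for the bulk bonds can also be verified directly (a sketch, with an arbitrary small chain length {M}):

```python
from functools import reduce
import numpy as np

M = 3  # assumed small chain length
I2 = np.eye(2)
sx = np.array([[0., 1.], [1., 0.]])
sm = np.array([[0., 0.], [1., 0.]])
string = np.diag([-1., 1.])           # exp(i*pi*sigma^+ sigma^-)

def embed(ops):
    return reduce(np.kron, ops)

def c(j):
    return embed([string]*j + [sm] + [I2]*(M - 1 - j))

def sx_at(j):
    return embed([sx if k == j else I2 for k in range(M)])

# Bulk bonds only (m < M in the notation of the text).
checks = []
for m in range(M - 1):
    cm, cn = c(m), c(m + 1)
    lhs = sx_at(m) @ sx_at(m + 1)
    rhs = (cm.conj().T - cm) @ (cn.conj().T + cn)
    checks.append(np.allclose(lhs, rhs))
```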

 

The boundary term {m=M} for {V_1} must be dealt with separately, due to the periodic boundary conditions. At this point, we could just forget about the boundary, which is not important in the thermodynamic limit. However, eliminating periodic boundary conditions breaks translation invariance symmetry, which would prevent us from fully exploiting the Fourier transform. So it is worthwhile to spend some time getting the boundary term just right. Let us define the number operator for the total number of fermions:

\displaystyle n= \displaystyle\sum_{j=1}^M c^\dagger_j c_j = \sum_{j=1}^M \sigma^+_j \sigma_j^- ~. \ \ \ \ \ (37)

First, observe that

\displaystyle \begin{array}{rcl} \nonumber c_M^\dagger c_1 &=& e^{\pi i \sum_{j=1}^{M-1} \sigma_j^+\sigma_j^-} \sigma_M^+ \sigma_1^-\\\nonumber &=& e^{\pi i [(\sum_{j=1}^M \sigma_j^+\sigma_j^-) -\sigma^+_M \sigma_M^-]} \sigma_M^+ \sigma_1^-\\\nonumber &=& e^{\pi i \sum_{j=1}^M \sigma_j^+\sigma_j^- } e^{-\pi i \sigma^+_M \sigma_M^-} \sigma_M^+ \sigma_1^-\\\nonumber &=& e^{\pi i n} ({~\boldsymbol{ 1}}_M - 2\sigma_M^+ \sigma_M^-) \sigma_M^+ \sigma_1^-\\\nonumber &=& (-1)^n ({~\boldsymbol{ 1}}_M - 2({~\boldsymbol{ 1}}_M-\sigma_M^- \sigma_M^+) ) \sigma_M^+ \sigma_1^-\\\nonumber &=& (-1)^n (-{~\boldsymbol{ 1}}_M) \sigma_M^+ \sigma_1^-\\ &=& -(-1)^n \sigma_M^+ \sigma_1^- ~. \end{array}

Similarly,

\displaystyle \begin{array}{rcl} \nonumber c_M^\dagger c_1^\dagger &=& e^{\pi i \sum_{j=1}^{M-1} \sigma_j^+\sigma_j^-} \sigma_M^+ \sigma_1^+\\ &=& -(-1)^n \sigma_M^+ \sigma_1^+ ~, \end{array}

and

\displaystyle \begin{array}{rcl} \nonumber c_M c_1 &=& e^{\pi i \sum_{j=1}^{M-1} \sigma_j^+\sigma_j^-} \sigma_M^- \sigma_1^-\\\nonumber &=& e^{\pi i [(\sum_{j=1}^{M} \sigma_j^+\sigma_j^-) -\sigma^+_M \sigma_M^-]} \sigma_M^- \sigma_1^-\\\nonumber &=& e^{\pi i \sum_{j=1}^{M} \sigma_j^+\sigma_j^-} ({~\boldsymbol{ 1}}_M-2 \sigma^+_M \sigma_M^-) \sigma_M^- \sigma_1^-\\\nonumber &=& e^{\pi i \sum_{j=1}^{M} \sigma_j^+\sigma_j^-} \sigma_M^- \sigma_1^-\\ &=& (-1)^n \sigma_M^- \sigma_1^- ~. \end{array}

The inverse of {(-1)^n} is itself, since {(-1)^n (-1)^n = 1}. So

\displaystyle \begin{array}{rcl} \sigma_M^- \sigma_1^- \nonumber &=& (-1)^n c_M c_1 \\ \sigma_M^+ \sigma_1^- \nonumber &=& - (-1)^n c_M^\dagger c_1 \\ \sigma_M^+ \sigma_1^+ &=& - (-1)^n c_M^\dagger c_1^\dagger ~. \end{array}
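Since the Jordan-Wigner string factors {({~\boldsymbol{1}}-2\sigma_j^+\sigma_j^-)} are just diagonal matrices, all of the above operator identities can be checked by brute force on a small chain. The following sketch is our own sanity check (not from the SML paper; numpy assumed): it builds the spin and fermion operators for {M=3} and verifies the anticommutation relations and the boundary identity for {c_M^\dagger c_1}.

```python
import numpy as np
from functools import reduce

M = 3                                  # small chain; the identities hold for any M
I2 = np.eye(2)
sp = np.array([[0., 1.], [0., 0.]])    # sigma^+
sm = sp.T                              # sigma^-

def embed(op, j):
    """Embed a single-site 2x2 operator at site j of the M-site chain."""
    return reduce(np.kron, [op if k == j else I2 for k in range(M)])

Sp = [embed(sp, j) for j in range(M)]
Sm = [embed(sm, j) for j in range(M)]
D = 2 ** M

def c(m):
    """Jordan-Wigner fermion c_m = [prod_{j<m} (1 - 2 n_j)] sigma_m^-."""
    string = np.eye(D)
    for j in range(m):
        string = string @ (np.eye(D) - 2 * Sp[j] @ Sm[j])
    return string @ Sm[m]

C = [c(m) for m in range(M)]

parity = np.eye(D)                     # (-1)^n = prod_j (1 - 2 n_j)
for j in range(M):
    parity = parity @ (np.eye(D) - 2 * Sp[j] @ Sm[j])

# boundary identity: c_M^dagger c_1 = -(-1)^n sigma_M^+ sigma_1^-
lhs = C[M - 1].T @ C[0]
rhs = -parity @ Sp[M - 1] @ Sm[0]
```

All matrices here are real, so the Hermitian conjugate is just the transpose.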

 

Next, observe that {V_1} and {V_2} are bilinear in the fermion operators. The neglected boundary term in {V_1} is also bilinear. This means that {V_1} and {V_2} can only change the total number of fermions by an even number, which means that the oddness or evenness of the number operator {n} must be a conserved quantity. In the language of commutators, we have

\displaystyle {[}(-1)^n,V_1 {]} = {[}(-1)^n,V_2 {]} = 0 ~. \ \ \ \ \ (38)

The boundary terms above can be brought to the same form as the bulk terms provided we pick the antiperiodic condition

\displaystyle \begin{array}{rcl} c_{M+1}^\dagger &=& - c_1^\dagger \\ c_{M+1} &=& - c_1 ~,\end{array}

for even {n}, and the periodic condition

\displaystyle \begin{array}{rcl} c_{M+1}^\dagger &=& c_1^\dagger \\ c_{M+1} &=& c_1 \end{array}

for odd {n}.

Let us now exploit translation invariance by invoking the Fourier transform. Let us define new operators {\eta} via

\displaystyle c_m = \frac{e^{i\alpha}}{\sqrt M} \displaystyle\sum _{j=1}^M e^{2\pi ijm/M} \eta_{q(j)} ~,\quad \alpha \in ~ \Bbb R ~,\ \ \ \ \ (39)

where {q(j)= 2 \pi j /M} is a wavenumber. To simplify the notation, from now on we write

\displaystyle c_m = \frac{e^{i\alpha}}{\sqrt M} \displaystyle\sum _{q} e^{iqm} \eta_{q} ~,\quad \alpha \in ~\Bbb R ~,\ \ \ \ \ (40)

where the sum is understood to be over the {M} different wavenumbers.

Consider that

\displaystyle \begin{array}{rcl} \nonumber \frac{e^{-i\alpha}}{\sqrt M} \sum _{m=1}^M e^{-iqm} c_m & =& \frac{1}{M} \sum _{m=1}^M e^{-iqm} \sum _{q'} e^{iq'm} \eta_{q'} \\\nonumber & =& \frac{1}{M} \sum _{q'} \sum _{m=1}^M e^{im(q'-q)} \eta_{q'} \\\nonumber & =& \sum _{q'} \delta_{q',q} \eta_{q'} \\ & =& \eta_{q} ~,\end{array}

so that, as expected, the inverse transform is given by

\displaystyle \eta_{q} = \frac{e^{-i\alpha}}{\sqrt M} \sum _{m=1}^M e^{-iqm} c_m ~. \ \ \ \ \ (41)

Note that the {\eta} are also fermion operators, as can be seen from

\displaystyle \begin{array}{rcl} \nonumber \{\eta_{q'}^\dagger , \eta_q\} &=& \frac 1 M \sum _m e^{-iqm} \sum _{m'} e^{+iq'm'} \{c_m^\dagger , c_{m'}\} \\\nonumber &=& \frac 1 M \sum _m e^{im(q'-q)} \\ &=& \delta_{q,q'} \\ \{\eta_{q'} , \eta_q\} &=& e^{-2i\alpha}\frac 1 M \sum _m e^{-iqm} \sum _{m'} e^{-iq'm'} \{c_m , c_{m'}\} = 0 \\ \{\eta_{q'}^\dagger , \eta_q^\dagger\} &=& e^{2i\alpha} \frac 1 M \sum _m e^{+iqm} \sum _{m'} e^{+iq'm'} \{c_m^\dagger , c_{m'}^\dagger\} = 0 ~. \end{array}

We can show using these same methods that Parseval’s identity holds (sometimes called the Plancherel theorem):

\displaystyle \sum_q \eta_q^\dagger \eta_q= \sum_m c_m^\dagger c_m ~. \ \ \ \ \ (42)
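In matrix form, (40) says {c = e^{i\alpha} W \eta} with {W} a unitary matrix, and unitarity is exactly what gives the inversion formula (41), the fermionic anticommutators for {\eta}, and Parseval's identity (42). A quick numerical check of the unitarity (a sketch of ours; numpy assumed):

```python
import numpy as np

M = 8
m = np.arange(1, M + 1)                         # site index m = 1, ..., M
q = (2 * np.arange(M) + 1 - M) * np.pi / M      # M wavenumbers spaced by 2 pi / M
W = np.exp(1j * np.outer(m, q)) / np.sqrt(M)    # c = e^{i alpha} W eta, cf. (40)
gram = W.conj().T @ W                           # should be the M x M identity
```

Any set of {M} wavenumbers spaced by {2\pi/M} works; here we used the "even {n}" set of (43) below.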

We should now recast the transfer matrix in terms of {\eta}. Instead of letting {q} run over {M} values from {0} to {2\pi}, it is more convenient to let it run from {-\pi} to {+\pi}. Moreover, we assume, with little or no loss of generality, that {M} is even. We saw earlier that for even {n} the periodic boundary conditions force a sign change at the boundary, whereas for odd {n} there is no sign change. So we have

\displaystyle q=\left\{ \begin{array}{ll} 0, \pm \tfrac 2 M \pi , \pm \tfrac 4 M \pi , \ldots, \pm \tfrac {M-2} M \pi , + \tfrac M M \pi ~,~~~~~~~& n ~\rm odd \\ \pm \tfrac 1 M \pi , \pm \tfrac 3 M \pi , \ldots , \pm \tfrac {M-1} M \pi ~,~~~~~& n ~\rm even \end{array}\right. ~. \ \ \ \ \ (43)

For even {n}, then, Parseval’s theorem assumes the form

\displaystyle \sum_{m=1}^M c_m^\dagger c_m=\sum_{q} \eta_q^\dagger \eta_q =\sum_{q> 0} \left( \eta_q^\dagger \eta_q + \eta_{-q}^\dagger \eta_{-q} \right) ~. \ \ \ \ \ (44)

Since the sum over the positive {q} is over only {M/2} values, we have

\displaystyle \sum_{m=1}^M (c_m^\dagger c_m -1/ 2)=\sum_{q>0} (\eta_q^\dagger \eta_q + \eta_{-q}^\dagger \eta_{-q} -1) ~.\ \ \ \ \ (45)

For odd {n}, we must also consider the terms {q=0} and {q=\pi} separately.

Substituting into (36) we thus get

\displaystyle V_2= (2 \sinh 2K_1)^{M/2 } \prod _{q\geq 0}V_{2,q} ~, \ \ \ \ \ (46)

 

where

\displaystyle \begin{array}{rcl} V_{2,0} &=& \exp [ -2 K_1^* (\eta_0^\dagger \eta_0 -1/2) ] \\ V_{2,\pi} &=& \exp [ -2 K_1^* (\eta_{\pi}^\dagger \eta_{\pi} -1/2) ] \\ V_{2,q} &=& \exp [ -2 K_1^* (\eta_{q}^\dagger \eta_{q} + \eta_{-q}^\dagger \eta_{-q} -1) ] ~.\end{array}

 

Now let us turn our attention to {V_1}. Consider products of the form {a_m a_{m+1}} where {a_m=c_m, ~c_m^\dagger}:

\displaystyle \begin{array}{rcl} c_m^\dagger c_{m+1}^\dagger \nonumber &=& \frac 1 M e^{-2i\alpha} \sum _{q,q'} e^{-iqm -iq'(m+1)} \eta_q^\dagger \eta_{q'}^\dagger \\ &=& \frac 1 M e^{-2i\alpha} \sum _{q,q'} e^{-im(q+q') -iq'} \eta_q^\dagger \eta_{q'}^\dagger \\ \sum _m c_m^\dagger c_{m+1}^\dagger &=& \nonumber \frac 1 M e^{-2i\alpha} \sum_m \sum _{q,q'} e^{-im(q+q') -iq'} \eta_q^\dagger \eta_{q'}^\dagger \\\nonumber &=& e^{-2i\alpha} \sum _{q,q'} \delta_{q,-q'}e^{-iq'} \eta_q^\dagger \eta_{q'}^\dagger \\ &=& e^{-2i\alpha} \sum _{q} e^{iq} \eta_q^\dagger \eta_{-q}^\dagger \end{array}

\displaystyle \begin{array}{rcl} c_m^\dagger c_{m+1} \nonumber &=& \frac 1 M \sum _{q,q'} e^{-iqm +iq'(m+1)} \eta_q^\dagger \eta_{q'} \\ &=& \frac 1 M \sum _{q,q'} e^{-im(q-q') +iq'} \eta_q^\dagger \eta_{q'} \\ \sum _m c_m^\dagger c_{m+1} &=& \nonumber \frac 1 M \sum_m \sum _{q,q'} e^{-im(q-q') +iq'} \eta_q^\dagger \eta_{q'} \\ &=& \nonumber \sum _{q,q'} \delta_{q,q'}e^{iq'} \eta_q^\dagger \eta_{q'} \\ &=& \sum _{q} e^{iq} \eta_q^\dagger \eta_{q} ~.\end{array}

And similarly

\displaystyle \begin{array}{rcl} \sum _m c_m c_{m+1} &=& e^{2i\alpha} \sum_q e^{-iq} \eta_q \eta_{-q} \\ \sum _m c_m c_{m+1}^\dagger &=& \sum_q e^{-iq} \eta_q \eta_{q}^\dagger ~.\end{array}

If we denote sums over values of {q} excluding {q=0,\pi} via the symbol {\sum^*}, we can write

\displaystyle \begin{array}{rcl} {\sum_q}^* e^{iq} \eta_q^\dagger \eta_{-q}^\dagger &=& \sum_{q>0} \ \left( e^{iq} \eta_q^\dagger \eta_{-q}^\dagger + e^{-iq} \eta_{-q}^\dagger \eta_{q}^\dagger \right) \\ {\sum_q}^* e^{iq} \eta_q^\dagger \eta_{q} &=& \sum_{q>0} \ \left( e^{iq} \eta_q^\dagger \eta_{q} + e^{-iq} \eta_{-q}^\dagger \eta_{-q} \right) \\ {\sum_q}^* e^{-iq} \eta_q \eta_{q}^\dagger &=& \sum_{q>0} \ \left( e^{-iq} \eta_q \eta_{q}^\dagger + e^{iq} \eta_{-q} \eta_{-q}^\dagger \right) \\ {\sum_q}^* e^{-iq} \eta_q \eta_{-q} &=& \sum_{q>0} \ \left( e^{-iq} \eta_q \eta_{-q} + e^{+iq} \eta_{-q} \eta_{q} \right) ~.\end{array}

Exploiting the anticommutation property for fermion operators, we can write

\displaystyle \begin{array}{rcl} {\sum_q}^* e^{iq} \eta_q^\dagger \eta_{-q}^\dagger &=& \sum_{q>0} \left( - e^{iq} \eta_{-q}^\dagger \eta_{q}^\dagger + e^{-iq} \eta_{-q}^\dagger \eta_{q}^\dagger \right) = \sum_{q>0} - 2 i \sin q ~ \eta_{-q}^\dagger \eta_{q}^\dagger \\ {\sum_q}^* e^{-iq} \eta_q \eta_{-q} &=& \sum_{q>0} \left( e^{-iq} \eta_q \eta_{-q} - e^{+iq} \eta_{q} \eta_{-q} \right) = \sum_{q>0} -2 i \sin q ~ \eta_q \eta_{-q} ~.\end{array}

Now observe that

\displaystyle (c_m^\dagger-c_m) (c_{m+1}^\dagger + c_{m+1} ) = c_m^\dagger c_{m+1}^\dagger + c_m^\dagger c_{m+1} -c_m c_{m+1}^\dagger -c_m c_{m+1} ~.\ \ \ \ \ (47)

Putting everything together we get

\displaystyle \begin{array}{rcl} \nonumber &\displaystyle \sum_m & (c_m^\dagger-c_m) (c_{m+1}^\dagger + c_{m+1} ) = \\\nonumber & & e^{-2i\alpha} \sum_{q>0} - 2 i \sin q ~ \eta_{-q}^\dagger \eta_{q}^\dagger - e^{2i\alpha} \sum_{q>0} - 2 i \sin q ~ \eta_{q} \eta_{-q} \\\nonumber & & + \sum_{q>0} \left[\left( e^{iq} \eta_q^\dagger \eta_{q} + e^{-iq} \eta_{-q}^\dagger \eta_{-q} \right) -\left( e^{-iq} \eta_q \eta_{q}^\dagger + e^{iq} \eta_{-q} \eta_{-q}^\dagger \right)\right] \\\nonumber &=& \sum_{q>0}\bigg[ e^{-2i\alpha} (- 2 i \sin q ~ \eta_{-q}^\dagger \eta_{q}^\dagger) - e^{2i\alpha} (- 2 i \sin q ~ \eta_{q} \eta_{-q}) \\\nonumber & & \quad \quad \quad + e^{iq} \eta_q^\dagger \eta_{q} + e^{-iq} \eta_{-q}^\dagger \eta_{-q} - e^{-iq} \eta_q \eta_{q}^\dagger - e^{iq} \eta_{-q} \eta_{-q}^\dagger \bigg] \\\nonumber &=& \sum_{q>0}\bigg[ e^{-2i\alpha} (- 2 i \sin q ~ \eta_{-q}^\dagger \eta_{q}^\dagger) - e^{2i\alpha} (- 2 i \sin q ~ \eta_{q} \eta_{-q}) \\\nonumber & & \quad \quad \quad + e^{iq} \eta_q^\dagger \eta_{q} + e^{-iq} \eta_{-q}^\dagger \eta_{-q} + e^{-iq} \eta_q^\dagger \eta_{q} - e^{-iq} + e^{iq} \eta_{-q}^\dagger \eta_{-q} - e^{iq} \bigg] \\\nonumber &=& \sum_{q>0}\bigg[ e^{-2i\alpha} (- 2 i \sin q ~ \eta_{-q}^\dagger \eta_{q}^\dagger) - e^{2i\alpha} (- 2 i \sin q ~ \eta_{q} \eta_{-q}) \\ & & \quad \quad \quad + 2\cos q~ (\eta_q^\dagger \eta_{q} + \eta_{-q}^\dagger \eta_{-q} ) -2 \cos q \bigg] ~. \end{array}

Let us choose {\alpha} such that {e^{2i\alpha}=-i}, i.e. {\alpha = -\pi/4}. Noting that

\displaystyle \sum_{q>0} \cos q =0 ~, \ \ \ \ \ (48)

we get

\displaystyle \sum_m (c_m^\dagger-c_m) (c_{m+1}^\dagger + c_{m+1} ) = \sum_{q>0}\bigg[ 2 \sin q ~ (\eta_{-q}^\dagger \eta_{q}^\dagger + \eta_{q} \eta_{-q}) + 2\cos q~ (\eta_q^\dagger \eta_{q} + \eta_{-q}^\dagger \eta_{-q} ) \bigg] ~.

For odd {n} we have neglected {q=0,~\pi}, which we must consider explicitly. In the expression for

\displaystyle \sum_m (c_m^\dagger-c_m) (c_{m+1}^\dagger + c_{m+1} )

we need to add for {q=0} the contribution (the pair terms drop out, since {(\eta_0^\dagger)^2 = \eta_0^2 = 0})

\displaystyle \eta_0^\dagger\eta_0 - \eta_0 \eta_0^\dagger = 2 \eta_0^\dagger\eta_0 -1 ~, \ \ \ \ \ (49)

and similarly for {q=\pi}

\displaystyle -(2 \eta_\pi^\dagger\eta_\pi -1) ~. \ \ \ \ \ (50)

Now we can write {V_1} by substituting into (36):

\displaystyle \begin{array}{rcl} V_1 &=& \prod_{q\geq 0} V_{1,q} \\ V_{1,q} &=& \left\{\begin{array}{ll} \exp\left[2K_2 (\eta_0^\dagger\eta_0 -1/2)\right]~, & q=0 \\ \exp\left[-2K_2 (\eta_\pi^\dagger\eta_\pi -1/2)\right]~, & q=\pi \\ \exp\left[2K_2 \left( \sin q ~ (\eta_{-q}^\dagger \eta_{q}^\dagger + \eta_{q} \eta_{-q}) + \cos q~ (\eta_q^\dagger \eta_{q} + \eta_{-q}^\dagger \eta_{-q} ) \right) \right ]~, & 0<q<\pi \end{array}\right. ~. \end{array}

 

We are now in a position to recast (31). Substituting the above factorization of {V_1} together with (46), we obtain

\displaystyle V = (2 \sinh 2K_1)^{M/2} \prod_{q\geq 0} V_{2,q}^{1/2} V_{1,q} V_{2,q}^{1/2} ~.\ \ \ \ \ (51)

 

From (46) we see that {V_{2,q}} can be diagonalized simultaneously with {V_{1,q}} as follows. Let us define the “vacuum” state {\left|{0}\right\rangle} as representing the system with all spins in the “down” direction. Let {\phi} be a set of 4 vectors obtained by acting on {\left|{0}\right\rangle} in 4 different ways as follows.

\displaystyle \begin{array}{rcl} \phi_{00} &=& \left|{0}\right\rangle \\ \phi_{10} &=& \eta_{q}^\dagger\left|{0}\right\rangle \\ \phi_{01} &=& \eta_{-q}^\dagger\left|{0}\right\rangle \\ \phi_{11} &=& \eta_{q}^\dagger \eta_{-q}^\dagger \left|{0}\right\rangle ~. \end{array}

Observe that for {0<q<\pi} we have

\displaystyle \begin{array}{rcl} \nonumber V_{2,q} \phi_{00} &=& e^{-2K_1^*(\eta_q^\dagger\eta_q + \eta_{-q}^\dagger\eta_{-q} -1 )} \phi_{00}\\\nonumber &=& e^{-2K_1^*(-1 )} \phi_{00}\\ &=& e^{+2K_1^*} \phi_{00}\\ \nonumber V_{2,q} \phi_{10} &=& e^{-2K_1^*(\eta_q^\dagger\eta_q + \eta_{-q}^\dagger\eta_{-q} -1 )} \phi_{10}\\\nonumber &=& e^{-2K_1^*(\eta_q^\dagger\eta_q -1 )} \phi_{10}\\\nonumber &=& e^{-2K_1^*(1 -1 )} \phi_{10}\\ &=& \phi_{10}\\\nonumber V_{2,q} \phi_{01} &=& e^{-2K_1^*(\eta_q^\dagger\eta_q + \eta_{-q}^\dagger\eta_{-q} -1 )} \phi_{01}\\\nonumber &=& e^{-2K_1^*(\eta_{-q}^\dagger\eta_{-q} -1 )} \phi_{01}\\ &=& \phi_{01}\\\nonumber V_{2,q} \phi_{11} &=& e^{-2K_1^*(\eta_q^\dagger\eta_q + \eta_{-q}^\dagger\eta_{-q} -1 )} \phi_{11}\\\nonumber &=& e^{-2K_1^*(1 + 1 -1 )} \phi_{11}\\ &=& e^{-2K_1^*}\phi_{11} ~.\end{array}

Let us choose as our basis the 4 vectors {\phi_{00}, \phi_{11},\phi_{10}, \phi_{01} } in this order. Then we can write

\displaystyle V_{2,q} = \left[ \begin{array}{cccc} e^{2K_1^*} & 0 & ~~0~~ & ~~0~~ \\ 0 & e^{-2K_1^*} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array} \right] ~.\ \ \ \ \ (52)

On the other hand, {V_{1,q}} “mixes” {\phi_{00}} and {\phi_{11}}, i.e. it is not diagonal. To get around this problem, we proceed as follows. From the expression for {V_{1,q}} above, we obtain for {0<q<\pi},

\displaystyle V_{1,q}^{1/2}= \exp\left[K_2 \left( \sin q ~ (\eta_{-q}^\dagger \eta_{q}^\dagger + \eta_{q} \eta_{-q}) + \cos q~ (\eta_q^\dagger \eta_{q} + \eta_{-q}^\dagger \eta_{-q} ) \right) \right ]~.\ \ \ \ \ (53)

So we know that we can write

\displaystyle V_{1,q}^{1/2} \phi_{00} = g(K_2) \phi_{00} + f(K_2) \phi_{11} ~.\ \ \ \ \ (54)

Differentiating, we get

\displaystyle \begin{array}{rcl} {dg\over dK_2} \phi_{00} &+& \frac{df}{ dK_2} \phi_{11} \nonumber \\ &=& \left( \sin q ~ (\eta_{-q}^\dagger \eta_{q}^\dagger + \eta_{q} \eta_{-q} ) + \cos q~ (\eta_q^\dagger \eta_{q} + \eta_{-q}^\dagger \eta_{-q} ) \right) V_{1,q}^{1/2} \phi_{00} \nonumber \\ &=& \left( - \sin q ~ (\eta_{q}^\dagger \eta_{-q}^\dagger + \eta_{-q} \eta_{q} ) + \cos q~ (\eta_q^\dagger \eta_{q} + \eta_{-q}^\dagger \eta_{-q} ) \right) V_{1,q}^{1/2} \phi_{00} \nonumber \\ \nonumber &=& \left( - \sin q ~ (\eta_{q}^\dagger \eta_{-q}^\dagger + \eta_{-q} \eta_{q}) + \cos q~ (\eta_q^\dagger \eta_{q} + \eta_{-q}^\dagger \eta_{-q} ) \right)( g(K_2) \phi_{00} + f(K_2) \phi_{11}) \nonumber \\ \nonumber &=& g(K_2)[-\sin q~\phi_{11} + (0+0) \cos q~\phi_{00}] + f(K_2) [-\sin q~\phi_{00} + (1+1) \cos q~\phi_{11}] \nonumber \\ &=& -\sin q ~f(K_2) \phi_{00} + (-g(K_2)\sin q + 2 f(K_2)\cos q) \phi_{11} ~. \end{array}

Hence,

\displaystyle \begin{array}{rcl} {dg\over dK_2} + f(K_2) \sin q &=& 0\\ {df\over dK_2} + g(K_2) \sin q - 2 f(K_2) \cos q&=& 0 \end{array}

 

For {K_2=0} we have {V_{1,q}=1} so that we can write

\displaystyle \begin{array}{rcl} g(0)&=&1\\ f(0)&=&0 ~. \end{array}

Let us eliminate {g}:

\displaystyle \begin{array}{rcl} {d^2f\over dK_2^2} + {dg\over dK_2} \sin q - 2 {df \over dK_2} \cos q&=& 0 \\ {d^2f\over dK_2^2} - f(K_2) \sin^2 q - 2 {df \over dK_2} \cos q&=& 0 ~.\end{array}

This is a linear second order ordinary differential equation, which is easily solved. Assume that the solution is the sum of terms of the form {f_B= A e^{BK_2}}:

\displaystyle B^2 - \sin^2 q - 2 B \cos q =0 ~, \ \ \ \ \ (55)

so that from the quadratic formula we get

\displaystyle B= {2 \cos q \pm \sqrt{4 \cos^2 q + 4 \sin^2 q} \over 2} = \cos q \pm 1 ~.\ \ \ \ \ (56)

So the general solution must be of the form

\displaystyle \begin{array}{rcl} \nonumber f&=& A e^{(1+ \cos q)K_2} + A' e^{(-1+ \cos q)K_2} \\ &=& e^{K_2 \cos q} (A e^{K_2} + A'e^{-K_2}) ~.\end{array}

The initial condition {f=0} at {K_2=0} gives us {A'=-A} so that

\displaystyle f = e^{K_2 \cos q} (2 A \sinh K_2) ~.\ \ \ \ \ (57)

The other initial condition gives us

\displaystyle {df \over dK_2} \bigg|_{K_2=0} = 2 A = -\sin q ~, \ \ \ \ \ (58)

so that

\displaystyle f(K_2)= -\sin q ~ e^{K_2 \cos q} \sinh K_2 ~.\ \ \ \ \ (59)

Having found {f} we can now find {g} from the second equation of the differential system above:

\displaystyle \begin{array}{rcl} g &=& \frac{-1}{ \sin q}\left[ \frac{ df }{ dK_2} - 2 \cos q ~f\right] \\ &=& \frac{-1}{ \sin q}\left[ { -\sin q (\cosh K_2 ~ e^{K_2 \cos q} + \cos q~\sinh K_2~ e^{K_2 \cos q}) } - 2 \cos q ~f\right] \\ &=& \cosh K_2 ~ e^{K_2 \cos q} + \cos q~\sinh K_2~ e^{K_2 \cos q} - 2 \cos q~ e^{K_2 \cos q} \sinh K_2 \\ &=& e^{K_2 \cos q} (\cosh K_2 - \cos q~ \sinh K_2 ) ~.\end{array}
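These closed forms for {f} and {g} are easy to check numerically against the differential system and the initial conditions. A quick sketch (ours, with an arbitrarily chosen wavenumber; numpy assumed), using central differences for the derivatives:

```python
import numpy as np

q, K, h = 0.9, 0.6, 1e-5   # arbitrary wavenumber, coupling K2, and step size

f = lambda K2: -np.sin(q) * np.exp(K2 * np.cos(q)) * np.sinh(K2)
g = lambda K2: np.exp(K2 * np.cos(q)) * (np.cosh(K2) - np.cos(q) * np.sinh(K2))

# central-difference derivatives at K2 = K
df = (f(K + h) - f(K - h)) / (2 * h)
dg = (g(K + h) - g(K - h)) / (2 * h)

res1 = dg + f(K) * np.sin(q)                         # dg/dK2 + f sin q
res2 = df + g(K) * np.sin(q) - 2 * f(K) * np.cos(q)  # df/dK2 + g sin q - 2 f cos q
```

Both residuals should vanish up to the finite-difference error of order {h^2}.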

We can proceed as above and write similarly

\displaystyle V_{1,q}^{1/2} \phi_{11} = r(K_2) \phi_{00} + s(K_2) \phi_{11} ~.\ \ \ \ \ (60)

Repeating the above method, we obtain

\displaystyle \begin{array}{rcl} r&=&f\\ s&=& e^{K_2 \cos q} (\cosh K_2 + \cos q~ \sinh K_2 ) ~.\end{array}

Meanwhile it is easy to check that

\displaystyle \begin{array}{rcl} V_{1,q}^{1/2} \phi_{01} &=& e^{K_2 \cos q} \phi_{01}\\ V_{1,q}^{1/2} \phi_{10} &=& e^{K_2 \cos q} \phi_{10} ~.\end{array}

Putting everything together we obtain

\displaystyle V_{1,q}^{1/2} = \left[ \begin{array}{llcc} g~~~~~~~~ & f~~~ & 0 & 0 \\ f & s & 0 & 0 \\ 0 & 0 & e^{K_2 \cos q} & 0 \\ 0 & 0 & 0 & e^{K_2 \cos q} \end{array} \right] ~.\ \ \ \ \ (61)
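The whole derivation of (61) can be verified directly by representing the {\eta} operators as matrices on the 4-dimensional Fock space of the two modes {\pm q} and exponentiating the quadratic form in (53). The sketch below is our own check (arbitrary test values of {q} and {K_2}; numpy assumed); since the generator is real symmetric, we compute its exponential by eigendecomposition:

```python
import numpy as np

# Two fermion modes (q and -q) on the 4-dim Fock space.
# Basis ordering: index = 2*n_q + n_{-q}, so
# phi_00 -> 0, phi_01 -> 1, phi_10 -> 2, phi_11 -> 3.
cr = np.array([[0., 0.], [1., 0.]])     # single-mode creation operator
I2, Z = np.eye(2), np.diag([1., -1.])   # Z = 1 - 2n carries the fermion sign

eq_dag  = np.kron(cr, I2)               # eta_q^dagger
emq_dag = np.kron(Z, cr)                # eta_{-q}^dagger
eq, emq = eq_dag.T, emq_dag.T           # real matrices: dagger = transpose

q, K2 = 1.1, 0.45                       # arbitrary test values
H = np.sin(q) * (emq_dag @ eq_dag + eq @ emq) \
    + np.cos(q) * (eq_dag @ eq + emq_dag @ emq)
w, U = np.linalg.eigh(K2 * H)           # H is real symmetric
V1half = U @ np.diag(np.exp(w)) @ U.T   # matrix exponential of K2*H, i.e. (53)

g = np.exp(K2 * np.cos(q)) * (np.cosh(K2) - np.cos(q) * np.sinh(K2))
f = -np.sin(q) * np.exp(K2 * np.cos(q)) * np.sinh(K2)
s = np.exp(K2 * np.cos(q)) * (np.cosh(K2) + np.cos(q) * np.sinh(K2))
```

The matrix elements of `V1half` in the {\phi} basis should reproduce (61) entry by entry.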

We thus see that this matrix is block-diagonal. The first block corresponds to the basis vectors {\phi_{00}} and {\phi_{11}}, which keep count of zero and two fermions, whereas the second block relates to one fermion. Let us call these blocks the even and odd blocks.

The odd block of {V_{1,q}^{1/2} V_{2,q} V_{1,q}^{1/2} } is diagonal with both eigenvalues equal to {e^{2K_2 \cos q}}. It is easy to check that the even block is symmetric. Let us calculate it explicitly.

\displaystyle \left[ \begin{array}{lll} g & & f \\ f & & s \\ \end{array} \right] \left[ \begin{array}{ll} e^{2K_1^*} & 0 \\ 0 & e^{-2K_1^*} \\ \end{array} \right] \left[ \begin{array}{lll} g & & f \\ f & & s \\ \end{array} \right] = \left[ \begin{array}{lll} f^2 e^{-2K_1^*} + g^2 e^{2K_1^*} & & fs e^{-2K_1^*} + fg e^{2K_1^*} \\ fs e^{-2K_1^*} + fg e^{2K_1^*} & ~& s^2 e^{-2K_1^*} + f^2 e^{2K_1^*} \end{array} \right] ~.\ \ \ \ \ (62)

This is a symmetric matrix. Moreover, by factoring out {e^{2 K_2 \cos q}} from {f^2}, {g^2}, {s^2}, {fs} and {fg}, we can write the above matrix as

\displaystyle e^{2 K_2 \cos q} \left[ \begin{array}{lll} A & & B \\ B & & C \\ \end{array} \right] ~, \ \ \ \ \ (63)

where

\displaystyle \begin{array}{rcl} A &=& e^{-2K_1^*} \sinh^2 K_2 ~\sin^2 q + e^{2K_1^*} (\cosh K_2 - \cos q \sinh K_2)^2 \\ B &=& -\sinh K_2 ~\sin q [e^{2K_1^*} (\cosh K_2 - \cos q~ \sinh K_2 ) + e^{-2K_1^*} (\cosh K_2 + \cos q~ \sinh K_2 ) ]\nonumber \\ &=& -2 \sinh K_2 ~\sin q ~ (\cosh 2K_1^* \cosh K_2 - \sinh 2K_1^* \cos q~ \sinh K_2 ) \\ C &=& e^{2K_1^*} \sinh^2 K_2 ~\sin^2 q + e^{-2K_1^*} (\cosh K_2 +\cos q \sinh K_2)^2 ~.\end{array}

Note that we are using notation slightly different from the 1964 paper. Our {A} is their {B_q} and our {B} is their {C_q} and our {C} is their {A_q}.

The matrix

\displaystyle \left[ \begin{array}{lll} A & & B \\ B & & C \\ \end{array} \right] \ \ \ \ \ (64)

has eigenvalue {\varphi} satisfying the characteristic equation

\displaystyle (A-\varphi)(C-\varphi)-B^2=0 ~, \ \ \ \ \ (65)

so that

\displaystyle \varphi^2 - \varphi(A+C) + AC -B^2=0 ~, \ \ \ \ \ (66)

and we thus obtain

\displaystyle \varphi= \tfrac 1 2 [(A+C) \pm \sqrt{(A-C)^2+4B^2} ] ~. \ \ \ \ \ (67)

So the even block of {V_{1,q}^{1/2} V_{2,q} V_{1,q}^{1/2} } has eigenvalue

\displaystyle \tfrac 1 2 e^{2 K_2 \cos q} [(A+C) \pm \sqrt{(A-C)^2+4B^2} ] ~.\ \ \ \ \ (68)

(Note that there is a sign typo inside the square root in the original 1964 article.)

These eigenvalues are centered around their arithmetic mean. Since we are more interested in the multiplicative rather than the additive properties of the partition function, it is more useful to express these eigenvalues as being centered around their geometric mean. Note that their product equals the determinant {AC-B^2 = e^{-4K_2\cos q}\,(gs-f^2)^2 = (\cosh^2 K_2 - \sinh^2 K_2)^2 = 1}, so the two eigenvalues of (64) are reciprocals of one another. We will next show that we can write

\displaystyle \tfrac 1 2 e^{2 K_2 \cos q} [(A+C) \pm \sqrt{(A-C)^2+4B^2} ] = e^{2 K_2 \cos q \pm \epsilon_q} ~.\ \ \ \ \ (69)

 

Simplifying, we get

\displaystyle \tfrac1 2 [(A+C) \pm \sqrt{(A-C)^2+4B^2}] = e^{\pm \epsilon_q} ~.\ \ \ \ \ (70)

Hence,

\displaystyle \begin{array}{rcl} \nonumber 2\cosh \epsilon_q &=& A+C \\\nonumber &=& e^{-2K_1^*} \sinh^2 K_2 ~\sin^2 q + e^{2K_1^*} (\cosh K_2 - \cos q \sinh K_2)^2 \nonumber \\\nonumber & & + e^{2K_1^*} \sinh^2 K_2 ~\sin^2 q + e^{-2K_1^*} (\cosh K_2 +\cos q \sinh K_2)^2 \\\nonumber & =& 2\cosh 2K_1^* (\cosh^2 K_2 + \sinh^2 K_2) - 4\sinh 2K_1^* \cosh K_2 ~\sinh K_2 ~\cos q \nonumber \\ & =& 2\cosh 2K_1^* \cosh 2K_2 - 2\sinh 2K_1^* \sinh 2K_2 ~\cos q ~, \end{array}

so that (69) is satisfied provided

\displaystyle \cosh \epsilon_q = \cosh 2K_1^* \cosh 2K_2 - \sinh 2K_1^* \sinh 2K_2 ~\cos q ~.\ \ \ \ \ (71)
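Both the unit determinant {AC-B^2=1} and the expression (71) for {\cosh\epsilon_q} can be confirmed numerically. A quick sketch of ours (arbitrary couplings; numpy assumed), where {K_1^*} is obtained from the duality relation {\tanh K_1^* = e^{-2K_1}}:

```python
import numpy as np

K1, K2 = 0.3, 0.5
K1s = 0.5 * np.log(1.0 / np.tanh(K1))       # dual coupling: tanh K1* = e^{-2 K1}
e2, em2 = np.exp(2 * K1s), np.exp(-2 * K1s)

q = np.linspace(0.1, 3.0, 7)                # several wavenumbers in (0, pi)
A = em2 * np.sinh(K2)**2 * np.sin(q)**2 \
    + e2 * (np.cosh(K2) - np.cos(q) * np.sinh(K2))**2
C = e2 * np.sinh(K2)**2 * np.sin(q)**2 \
    + em2 * (np.cosh(K2) + np.cos(q) * np.sinh(K2))**2
B = -2 * np.sinh(K2) * np.sin(q) * (np.cosh(2 * K1s) * np.cosh(K2)
                                    - np.sinh(2 * K1s) * np.cos(q) * np.sinh(K2))

det = A * C - B**2                          # should be identically 1
cosh_eps = np.cosh(2 * K1s) * np.cosh(2 * K2) \
    - np.sinh(2 * K1s) * np.sinh(2 * K2) * np.cos(q)   # right-hand side of (71)
```

The two assertions below check {AC-B^2=1} and {A+C = 2\cosh\epsilon_q} at all sampled wavenumbers.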

To proceed further, Schultz, Mattis and Lieb re-express the fermion operators once again in terms of “rotated” quantities, using the canonical transformation introduced by Bogoliubov and Valatin to simplify the BCS theory of superconductivity. We refer the interested reader to the original paper. We will skip this step and instead focus directly on the thermodynamic limit.

Even without knowing the details of the basis that diagonalizes {V} in (51), we already know the eigenvalues. So in the diagonalizing basis (which we do not need to know) we can write {V} in terms of these eigenvalues.  From the above, the eigenvalues are known to be of the form

\displaystyle (2 \sinh 2K_1)^{M/2} \exp\left[ \sum_{q\geq 0} (2K_2 \cos q \pm \epsilon_q)\right] ~. \ \ \ \ \ (72)

Noting that the {\cos q} terms sum to zero, we get

\displaystyle (2 \sinh 2K_1)^{M/2} \exp\left[ \sum_{q\geq 0} \pm \epsilon_q\right] ~.\ \ \ \ \ (73)

In the thermodynamic limit, only the largest eigenvalue contributes, as is well known. Taking the {M}-th root in the limit {M\rightarrow +\infty}, keeping the larger choice {+|\epsilon_q|} for each {q}, and noting that the sum over {q} above runs over only {M/2} values (so only half the full range from {-\pi} to {+\pi}), we obtain the partition function per site:

\displaystyle \lambda = \sqrt{2\sinh 2K_1} ~ \exp\left[ \frac 1 {4\pi} \int_{-\pi}^\pi |\epsilon_q| ~dq \right] \ \ \ \ \ (74)

or

\displaystyle \begin{array}{rcl} \nonumber \log \lambda &=& \tfrac 1 2 \log 2\sinh 2K_1 + \frac 1 {4\pi } \int _{-\pi}^\pi |\epsilon_q| ~dq \\ &=& \tfrac 1 2 \log 2\sinh 2K_1 \nonumber \\ & & \quad + \frac 1 {4\pi } \int _{-\pi}^\pi \big|~ \mathrm{ arccosh~} [\cosh 2K_1^* \cosh 2K_2 - \sinh 2K_1^* \sinh 2K_2 ~\cos q] ~\big| ~dq \nonumber ~. \\ \end{array}

 

We are done!

5. The symmetric forms

The above expression is easy to put into the more commonly used forms. Let us first prove the following identity, which is actually found in Onsager’s seminal paper from 1944:

\displaystyle {1\over 2\pi }\int_0^{2\pi} \log [2\cosh x - 2 \cos \omega] ~d\omega = |x| ~. \ \ \ \ \ (75)

 

Amazingly, this identity is never proven in the usual textbooks of statistical mechanics. We will not satisfy ourselves with just accepting this identity on faith. Where is the proof? Huang’s graduate level textbook merely quotes the identity, giving no proof. Newell and Montroll’s well-known review article similarly just throws the identity at the reader, which is also what Onsager did. Reichl’s book on statistical physics at least contains an evaluated definite integral, but without going into details on how it is obtained. Indeed, why waste our time? Just look it up in some integral table or use Maple or Mathematica, right?

Since our goal here is to actually understand what is going on, we insist on proving the expression. The above intimidating identity is not as unapproachable as it seems. In fact, readers familiar with Mahler measures will recognize it as closely related to a simple one-variable logarithmic Mahler measure.

To prove the above identity, we first establish the following identity for complex {|r|\leq 1}:

\displaystyle \frac 1 {2\pi } \int_0^{2\pi} \log (1+re^{i\theta})~d\theta =0 ~. \ \ \ \ \ (76)

 

Let us evaluate the above integral using power series. For { |r|<1} the Taylor series for the logarithm converges absolutely and we can use the Fubini-Tonelli theorem to interchange the order of sums and integrals:

\displaystyle \begin{array}{rcl} \nonumber & & \int_0^{2\pi} \log (1+re^{i\theta})~d\theta \\\nonumber & & = \int_0^{2\pi} \sum_{n=1}^{\infty} {1\over n} {(-1)^{n-1} r^n } e^{in\theta}~d\theta \\\nonumber & & = \sum_{n=1}^{\infty} {1\over n} {(-1)^{n-1} r^n }\int_0^{2\pi} e^{in\theta}~d\theta \\\nonumber & & = \sum_{n=1}^{\infty} {1\over n} {(-1)^{n-1} r^n} ~~ {e^{in\theta}\over {i n}} \bigg|_0^{2\pi} \\ & & = 0 ~.\end{array}
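Identity (76) is also easy to test numerically: for a smooth periodic integrand, a simple average over a uniform grid converges extremely fast. A sketch of ours (numpy assumed; note that {1+re^{i\theta}} stays in the right half-plane for {|r|<1}, so the principal branch of the logarithm suffices):

```python
import numpy as np

theta = (np.arange(4096) + 0.5) * 2 * np.pi / 4096   # midpoint grid on [0, 2 pi]
rs = (0.5, -0.7, 0.3 + 0.4j)                         # a few radii inside the unit disk
means = [np.mean(np.log(1 + r * np.exp(1j * theta))) for r in rs]
```

Each mean approximates {\frac{1}{2\pi}\int_0^{2\pi}\log(1+re^{i\theta})\,d\theta} and should vanish.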

The case {|r|=1} follows from taking the limit {r\rightarrow e^{i\phi}} for real {\phi} from inside the unit disk.

We have now proven (76), but it is worth asking what it means. The sum of a logarithm is the logarithm of a product, so (76) is, in fact, the complex logarithm of the geometric mean of {1+re^{i\theta}} over {0\leq \theta \leq 2\pi}. In essence, (76) means that the geometric mean of the quantity {z= 1+re^{i\theta}} over a circle of radius {r\leq 1} and centered at 1 on the complex plane is 1, hence the logarithm of the geometric mean is 0. Moreover, the real part of the integral is given by the Mahler measure,

\displaystyle \frac 1 {2\pi } \int_0^{2\pi} \log |1+re^{i\theta}|~d\theta ~.\ \ \ \ \ (77)

Mahler measures are actually quite deep and connect diverse areas of mathematics and theoretical physics, such as random walks, the enumeration of spanning trees, lattice sums, determinantal processes, number theory and modular forms. Discussion of Mahler measures is beyond the scope of this article.

Next we establish (75), with the help of (76). Let {x\geq 0} without loss of generality and let {r=e^{-x} \leq 1}. Then,

\displaystyle \begin{array}{rcl} \nonumber & & \frac{1}{2\pi} \int_0^{2\pi} \log [2\cosh x - 2 \cos \omega] ~d\omega \\\nonumber & & = \frac{1}{2\pi} \int_0^{2\pi} \log [e^x + e^{-x} - 2 \cos \omega] ~d\omega \\\nonumber & & = x + \frac{1}{2\pi} \int_0^{2\pi} \log [1 - 2 r \cos \omega + r^2] ~d\omega \\\nonumber & & = x + \frac{1}{2\pi} \int_0^{2\pi} \log (1-re^{i\omega}) (1-re^{-i\omega}) ~d\omega \\\nonumber & & = x + \frac{1}{2\pi} \int_0^{2\pi} \log (1-re^{i\omega}) ~d\omega + \frac{1}{2\pi} \int_0^{2\pi} \log (1-re^{-i\omega}) ~d\omega \\ & & = x ~, \end{array}

and the claim follows.
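The same uniform-grid trick gives a direct numerical test of (75) itself (our own check; numpy assumed):

```python
import numpy as np

def onsager_lhs(x, n=4096):
    # midpoint grid over one period; spectrally accurate for smooth periodic integrands
    w = (np.arange(n) + 0.5) * 2 * np.pi / n
    return np.mean(np.log(2 * np.cosh(x) - 2 * np.cos(w)))
```

For any nonzero {x}, `onsager_lhs(x)` should agree with {|x|} to essentially machine precision.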

We can now insert this identity into (74):

\displaystyle \begin{array}{rcl} \nonumber \log \lambda &=& \tfrac 1 2 \log 2\sinh 2K_1 + \frac 1 {4\pi } \int _{-\pi}^\pi |\epsilon_q| ~dq \\\nonumber &=& \tfrac 1 2 \log 2\sinh 2K_1 + \frac 1 {2(2\pi)^2 } \int _{-\pi}^\pi \int_0^{2\pi} \log[ 2\cosh \epsilon_q -2\cos \omega] ~d\omega dq \nonumber\\\nonumber &=& \tfrac 1 2 \log 2\sinh 2K_1 \\\nonumber & & + \tfrac 1 {2(2\pi)^2 }\! \int _{-\pi}^\pi \!\int_0^{2\pi} \!\!\log[ 2(\cosh 2K_1^* \cosh 2K_2 - \sinh 2K_1^* \sinh 2K_2 ~\cos q ) -2\cos \omega] ~d\omega dq\nonumber ~. \\ \end{array}

Let us next get rid of the starred {^*} variables:

\displaystyle \begin{array}{rcl} \sinh 2K_1^* &=& 2\sinh K_1^*\cosh K_1^* \\ &=& \mathrm{csch~} 2K_1\\ \cosh 2K_1^* &=& \cosh^2K_1^* + \sinh^2 K_1^*\\ &=& 2 [A(K_1)]^2 \cosh 2K_1 \\ &=& \mathrm{csch~} 2K_1 \cosh 2K_1 \\ &=& \mathrm{coth~} 2K_1 ~.\end{array}

Substituting, we get

\displaystyle \begin{array}{rcl} \nonumber & & \log \lambda\\ \nonumber & &= \tfrac 1 2 \log 2\sinh 2K_1 \\ & & + \frac 1 {2(2\pi)^2 }\int _{-\pi}^\pi \int_0^{2\pi} \log[ 2(\coth 2K_1 \cosh 2K_2 - \mathrm{csch~} 2K_1 \sinh 2K_2 ~\cos q ) -2\cos \omega] ~d\omega dq\nonumber \\ & &= \frac 1 {2(2\pi)^2 }\int _{-\pi}^\pi \int_0^{2\pi} \log[ 4\cosh 2K_1 \cosh 2K_2 - 4\sinh 2K_2 ~\cos q - 4\sinh 2K_1 \cos \omega] ~d\omega dq\nonumber ~. \\ \end{array}

We do not need to integrate from {0} to {2\pi}; it is enough to go from {0} to {\pi}. Taking the factor {4} outside, and renaming the integration variables to be more friendly, we finally get

\displaystyle \log \lambda = \log 2 + \frac 1 {2\pi^2 }\int _{0}^\pi \int_0^{\pi} \log[ \cosh 2K_1 \cosh 2K_2 - \sinh 2K_2 ~\cos x - \sinh 2K_1 \cos y] ~dx dy ~. \ \ \ \ \ (78)

This is the celebrated result of Onsager for the anisotropic 2D Ising model.
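The equality of the spectral form (74) and Onsager's double-integral form (78) makes a nice end-to-end numerical check of the whole calculation. The sketch below is our own check (arbitrary couplings; numpy assumed); both integrals are evaluated with simple midpoint sums, which converge very fast for these smooth periodic integrands, and {\cosh 2K_1^* = \coth 2K_1}, {\sinh 2K_1^* = \mathrm{csch~}2K_1} as derived above:

```python
import numpy as np

K1, K2 = 0.3, 0.5                                    # arbitrary positive couplings
c1s, s1s = 1 / np.tanh(2 * K1), 1 / np.sinh(2 * K1)  # cosh 2K1*, sinh 2K1*

# form (74): single integral over the excitation spectrum eps_q
n1 = 4096
q = (np.arange(n1) + 0.5) * 2 * np.pi / n1 - np.pi
eps = np.arccosh(c1s * np.cosh(2 * K2) - s1s * np.sinh(2 * K2) * np.cos(q))
loglam_spec = 0.5 * np.log(2 * np.sinh(2 * K1)) + 0.5 * np.mean(np.abs(eps))

# form (78): Onsager's symmetric double integral
n2 = 600
x = (np.arange(n2) + 0.5) * np.pi / n2
X, Y = np.meshgrid(x, x)
integrand = np.log(np.cosh(2 * K1) * np.cosh(2 * K2)
                   - np.sinh(2 * K2) * np.cos(X) - np.sinh(2 * K1) * np.cos(Y))
loglam_ons = np.log(2) + 0.5 * np.mean(integrand)
```

The two estimates of {\log\lambda} should coincide to high accuracy.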

There is one other form which is worth mentioning. Let us specialize to the isotropic case {K_1=K_2=K}. Recall that

\displaystyle \cos(\phi_1+\phi_2) + \cos(\phi_1-\phi_2) = 2 \cos \phi_1~\cos \phi_2 ~. \ \ \ \ \ (79)

Let {x=\phi_1+\phi_2} and {y=\phi_1-\phi_2} so that

\displaystyle \cos x + \cos y = 2 \cos \phi_1~\cos \phi_2 ~. \ \ \ \ \ (80)

Let {\kappa = 2\sinh 2K /\cosh^2 2K}. Under the substitution above (which covers the torus twice, with Jacobian {2}), the partition function for the isotropic case becomes

\displaystyle \begin{array}{rcl} \nonumber \log \lambda &=& \log 2 + \frac 1 {2\pi^2 }\int _{0}^\pi \int_0^{\pi} \log[ \cosh^2 2K - \sinh 2K (\cos x + \cos y)] ~dx dy \\\nonumber &=& \log 2 + \frac 1 {2(2\pi)^2 }\int _{-\pi}^\pi \int_{-\pi}^{\pi} \log[ \cosh^2 2K - 2 \sinh 2K ~\cos\phi_1~\cos \phi_2 ] ~d\phi_1 d\phi_2 \\\nonumber &=& \log (2 \cosh 2K) + \frac 1 {2(2\pi)^2 }\int _{-\pi}^\pi \int_{-\pi}^{\pi} \log[ 1 - \kappa ~\cos\phi_1~\cos \phi_2 ] ~d\phi_1 d\phi_2 \\ &=& \log (2 \cosh 2K) + \frac 1 {2\pi^2 }\int _{0}^\pi \int_{0}^{\pi} \log[ 1 - \kappa ~\cos\phi_1~\cos \phi_2 ] ~d\phi_1 d\phi_2 ~. \end{array}
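Our bookkeeping of the factors of {2} in this change of variables can itself be checked numerically, by comparing the isotropic specialization of (78) with the {\kappa} form (a sketch of ours; numpy assumed):

```python
import numpy as np

K = 0.3
kappa = 2 * np.sinh(2 * K) / np.cosh(2 * K)**2
n = 600
x = (np.arange(n) + 0.5) * np.pi / n     # midpoint grid on [0, pi]
X, Y = np.meshgrid(x, x)

# anisotropic form (78) specialized to K1 = K2 = K
form78 = np.log(2) + 0.5 * np.mean(
    np.log(np.cosh(2 * K)**2 - np.sinh(2 * K) * (np.cos(X) + np.cos(Y))))
# isotropic kappa form
form_kappa = np.log(2 * np.cosh(2 * K)) + 0.5 * np.mean(
    np.log(1 - kappa * np.cos(X) * np.cos(Y)))
```

The two numbers should agree to high accuracy for any {K} away from criticality.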

One of the integrals can be carried out, in the manner usually found in the textbooks. Moreover, one of us (GMV) recently carried out the second integration in terms of a {_4F_3} generalized hypergeometric function (see http://arxiv.org/abs/1411.2495).

6. Closing remarks

We note that there are several ways of solving the 2D Ising model. The “algebraic” method presented above is a simplified version of the original calculation of Onsager and also of Kaufman. Then, there is the “combinatorial” solution which is based on enumerating closed graphs using a Markovian random walk process on the lattice (see here). Other methods include the star-triangle transformation, the method of Pfaffians, and the Grassmann variable approach. Each method has its advantages and sheds light on a different aspect of the problem.

 


Acknowledgements

GMV thanks Sílvio Salinas for suggesting the study of the method of Schultz, Mattis and Lieb in an e-mail of 16 Nov 2014, and also for discussions when GMV visited USP and Salinas visited Natal during 2014.
