# Where does the Boltzmann factor come from?

Most textbooks on statistical mechanics derive the Boltzmann factor ${e^{-E/kT}}$ by assuming a large composite system that consists of a very small system together with a heat bath. They assume Boltzmann’s formula for the entropy $\displaystyle S=k \ln \Omega \ \ \ \ \ (1)$

and then typically expand ${\ln \Omega}$ in a Taylor series. There is not much discussion about the necessary and sufficient conditions for Boltzmann statistics to be applicable. Below are the relevant sections taken from various textbooks:

1. Linda E. Reichl’s A Modern Course in Statistical Physics.

2. Kerson Huang’s Introduction to Statistical Physics.

3. Raj Kumar Pathria’s Statistical Mechanics, 3rd edition.

4. Walter Greiner et al.’s Thermodynamics and Statistical Mechanics.

5. Kerson Huang’s Statistical Mechanics, 2nd edition.

These are all good textbooks that I have personally used and learned from. However, I found them to be insufficiently clear about the theoretical foundations of the canonical ensemble in equilibrium statistical mechanics. Where does the Boltzmann factor really come from? Is it always applicable to systems that can exchange energy, but not particles, with a heat bath? For example, should we expect the Boltzmann factor to be applicable to Hamiltonian systems with long-range interactions, such as galaxies composed of stars? Can we expect it to appear in Hamiltonian systems that are not ergodic? None of the above books provides much of an answer to such questions. The books also leave another important question unanswered: can we instead Taylor expand ${\Omega}$? Or ${\sin \Omega}$? Why ${\ln \Omega}$ and not some other function of ${\Omega}$? In short, textbooks often do not attempt to clarify or discuss the necessary and sufficient conditions for the Boltzmann factor to be applicable.

I thus decided to rederive the Boltzmann factor and the canonical partition function from well known first principles, making clear the assumptions. (My approach is somewhat similar to a fleshed out version of the one found in Franz Mandl’s Statistical Physics.)

Consider an isolated large “total” system ${\mathcal S_T}$ that consists of a thermal reservoir (heat bath) ${\mathcal S_R}$ in thermal contact with a very small system ${\mathcal S_S}$. The total system ${\mathcal S_T}$ is in thermodynamic equilibrium. Since it is isolated, the total energy ${E_T}$ is constant. Since there is no net heat flow and since there can be no irreversible process under equilibrium conditions, the total entropy ${S_T}$ is also constant (in fact, it is maximized).

The total energy ${E_T}$ can be decomposed into 3 parts, viz., the energy ${E_R}$ of the heat bath, the energy ${E_S}$ of the small system, and an interaction energy ${\Delta E}$ that arises from (possibly long-range) interactions between the particles in the small system ${\mathcal S_S}$ and the bath ${\mathcal S_R}$. We can thus write $\displaystyle E_T= E_R + E_S + \Delta E ~. \ \ \ \ \ (2)$

In the thermodynamic limit of systems that do not have significant long-range interactions, the term ${\Delta E}$ grows in proportion to the surface area of the boundary of the system ${\mathcal S_S}$, whereas ${E_S}$ grows in proportion to its volume, so that $\displaystyle \lim _ {N \rightarrow \infty} \Delta E/ E_S=0 \ \ \ \ \ (3)$

where ${N\rightarrow \infty}$ denotes the thermodynamic limit. In this case we can write $\displaystyle E_T\approx E_R + E_S ~. \ \ \ \ \ (4)$
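As a rough illustration of (3), consider a toy model that is not part of the argument above: an $L \times L \times L$ block of sites cut from a simple cubic lattice with nearest-neighbour bonds only. Assuming each bond carries a comparable energy, the bonds inside the block stand in for ${E_S}$ and the bonds crossing the boundary stand in for ${\Delta E}$. The function names are hypothetical and chosen only for this sketch.

```python
def bond_counts(L):
    """Nearest-neighbour bond counts for an L x L x L block of sites
    embedded in a larger simple cubic lattice (illustrative toy model)."""
    internal = 3 * L * L * (L - 1)  # bonds with both ends inside the block
    crossing = 6 * L * L            # bonds with exactly one end inside (6 faces)
    return internal, crossing


def interaction_ratio(L):
    """Stand-in for Delta E / E_S: crossing bonds divided by internal bonds."""
    internal, crossing = bond_counts(L)
    return crossing / internal


if __name__ == "__main__":
    for L in (3, 11, 101, 1001):
        N = L ** 3  # number of sites in the block
        print(f"L={L:5d}  N={N:11d}  ratio={interaction_ratio(L):.6f}")
```

In this toy model the ratio equals $2/(L-1)$, so it falls off like $N^{-1/3}$. With long-range interactions every pair of sites would contribute, and no such surface-to-volume suppression should be expected.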

In such a scenario, we can also write $\displaystyle S_T\approx S_R + S_S \ \ \ \ \ (5)$

for the entropy. We are thus assuming that the entropy is also additive.

Let us now consider the total system in the microcanonical ensemble, so that we can invoke the postulate of equal a priori probabilities. The entropy of the small system ${\mathcal S_S}$, for fixed energy ${E_S}$, is given by Boltzmann’s formula $\displaystyle S_S= k \ln \Omega_S(E_S) \ \ \ \ \ (6)$

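where ${\Omega_S}$ is the number of microstates. Since all microstates of the total system are equally probable, the probability of finding ${\mathcal S_S}$ in a given microstate with energy ${E_S}$ is proportional to the number of bath microstates compatible with it. With the number of microstates multiplicative (the counterpart of the additive entropy (5)) and the total number fixed, this probability is given by $\displaystyle p(E_S) \propto \frac 1 {\Omega_S(E_S)} \ \ \ \ \ (7)$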

Substituting (6) and using (5), we get $\displaystyle p(E_S) \propto e^{-S_S/k} = e^{(-S_T+S_R)/k} \ \ \ \ \ (8)$

Since ${S_T}$ is a constant, we have $\displaystyle p(E_S) \propto e^{S_R/k} ~. \ \ \ \ \ (9)$

Recall that we can expand a function $f(x)$ according to $\displaystyle \begin{array}{rcl} f(x) &=& f(a) + (x-a) f'(a) + \frac 1 {2!} (x-a)^2 {f''(a)} + \ldots \end{array}$

We can Taylor expand ${S_R}$ about ${E_T}$ and write, to second order, $\displaystyle \begin{array}{rcl} S_R(E_R) &=& S_R(E_T-E_S)\\ &=& S_R(E_T) + (-E_S) {\partial S_R(E_T) \over \partial E} + \frac 1 2 (-E_S)^2 {\partial^2 \over \partial E^2} S_R(E_T) + \ldots \end{array}$

The first partial derivative is given by $\displaystyle {\partial S_R(E_T) \over \partial E} = \frac 1 T \ \ \ \ \ (10)$

from the definitions of internal energy, entropy and temperature. In our case, there is only one relevant temperature, viz., the temperature of the heat bath. Next, observe that the second-order derivative involves a derivative of the temperature with respect to the energy, ${\partial^2 S_R/\partial E^2 = -T^{-2}\, \partial T/\partial E}$. However, by the idealized definition of a heat bath, the temperature stays constant when the bath exchanges a finite amount of energy, hence this derivative vanishes in the thermodynamic limit.
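To see how the second-order term dies out, here is a minimal numerical sketch under an assumption that is not in the derivation itself: the bath is modelled as an ideal-gas-like system with $S_R/k = \frac{3N}{2} \ln E$ plus a constant. The total energy is chosen so that (10) gives a fixed ${kT}$, and the exact logarithm of the weight in (9) is compared with the first-order (Boltzmann) result. All function names are hypothetical.

```python
import math


def log_weight_exact(E_S, N, kT=1.0):
    """[S_R(E_T - E_S) - S_R(E_T)] / k for the assumed S_R/k = (3N/2) ln E."""
    a = 1.5 * N        # prefactor of ln E in S_R/k
    E_T = a * kT       # chosen so that dS_R/dE = 1/T at E = E_T
    return a * math.log1p(-E_S / E_T)


def log_weight_boltzmann(E_S, kT=1.0):
    """First-order term of the Taylor expansion: -E_S / kT."""
    return -E_S / kT


def second_order_term(E_S, N, kT=1.0):
    """(1/2) E_S^2 (d^2 S_R / dE^2) / k evaluated at E_T; scales as 1/N."""
    return -0.5 * E_S ** 2 / (1.5 * N * kT ** 2)


if __name__ == "__main__":
    E_S = 2.0  # energy of the small system in units of kT
    for N in (10, 1000, 100000):
        diff = log_weight_exact(E_S, N) - log_weight_boltzmann(E_S)
        print(f"N={N:7d}  exact-minus-Boltzmann={diff:.3e}  "
              f"second-order term={second_order_term(E_S, N):.3e}")
```

For this assumed bath the correction to the exponent shrinks like $1/N$ at fixed ${E_S}$ and ${kT}$, which is the sense in which the derivative of the temperature can be dropped.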

Notice above that we have not expanded ${\ln \Omega_R}$, where ${\Omega_R}$ is the number of states of the bath, but rather ${S_R}$. The textbooks expand ${\ln \Omega_R}$, rather than some other function of ${\Omega_R}$, because it is proportional to ${S_R}$. From thermodynamics, we know precisely how the entropy of a heat bath behaves. Specifically, we know how to Taylor expand the entropy of a heat bath.

We can now continue where we left off and thus write (9) as $\displaystyle \begin{array}{rcl} p(E_S) &\propto& \exp [ S_R(E_T) /k+ (-E_S)/kT] \\ &\propto& \exp [ -E_S/kT] ~. \end{array}$

In the above simplifications, we have used the fact that ${E_T}$ is a constant.

We thus obtain the Boltzmann factor: $\displaystyle \begin{array}{rcl} p(E_S) &=& \displaystyle \frac 1 Z e^{-E_S/kT} \\ Z&=& \sum e^{-E_S/kT} ~, \end{array}$

where ${Z}$ is the canonical partition function and the sum is over all microstates of ${\mathcal S_S}$.
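As a small sketch of how these two formulas are used in practice, the following computes ${Z}$ and the probabilities for a finite list of microstate energies. The function names and the two-level example are illustrative assumptions, not taken from any of the textbooks above.

```python
import math


def partition_function(energies, kT):
    """Z = sum over microstates of exp(-E / kT)."""
    return sum(math.exp(-e / kT) for e in energies)


def boltzmann_probabilities(energies, kT):
    """Normalized p = exp(-E / kT) / Z for each microstate.

    Energies are shifted by their minimum before exponentiating; the shift
    cancels between numerator and Z and only avoids overflow.
    """
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical two-level system with an energy gap equal to kT.
    levels = [0.0, 1.0]
    probs = boltzmann_probabilities(levels, kT=1.0)
    print("Z =", partition_function(levels, kT=1.0))
    print("p =", probs, " ratio p1/p0 =", probs[1] / probs[0])
```

The ratio of the two probabilities depends only on the energy difference divided by ${kT}$, which is why the constant ${S_R(E_T)}$ could be dropped above.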

Let us make clear what has been assumed:

1. We need finite size effects to be negligible. In other words, there is no reason to expect Boltzmann statistics to be valid far from the thermodynamic limit, e.g., for small systems or heat baths.

2. We have assumed that the total system can be described in the microcanonical ensemble. We have not, however, explicitly assumed ergodicity. The microcanonical ensemble is consistent with the maximum entropy principle, which characterizes equilibrium, so Boltzmann statistics may not be applicable far from equilibrium.

3. Additivity of the energy. We have assumed that the interaction energy between the heat bath and the smaller system can be ignored in the thermodynamic limit. When the Hamiltonian contains strong long-range interactions, this assumption does not hold and there is again no reason to expect Boltzmann statistics to be applicable.

4. The entropy must also be additive. If there are no long-range interactions, then in the thermodynamic limit we can expect to have $\displaystyle \Omega_T \approx \Omega_R \Omega_S \ \ \ \ \ (11)$

so that the entropy indeed becomes additive. If the Boltzmann entropy is not additive, however, our derivation above is no longer valid.
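The first two assumptions can be probed with an exactly countable toy model, chosen here purely for illustration: the small system is one harmonic oscillator and the bath is an Einstein solid of $N$ identical oscillators, with $q$ energy quanta shared among all $N+1$ oscillators and no interaction energy. Equal a priori probabilities give the exact $p(n)$ by counting bath microstates, and this is compared with the Boltzmann form using the large-$N$ bath temperature $e^{-\epsilon/kT} = q/(q+N)$. The function names are hypothetical.

```python
from math import comb


def exact_probability(n, N, q):
    """Microcanonical probability that the single oscillator holds n quanta.

    The bath (N oscillators) must hold the remaining q - n quanta, which it
    can do in C(q - n + N - 1, N - 1) ways; the total system of N + 1
    oscillators has C(q + N, N) microstates.
    """
    if n < 0 or n > q:
        return 0.0
    return comb(q - n + N - 1, N - 1) / comb(q + N, N)


def boltzmann_probability(n, N, q):
    """Boltzmann form (1 - x) x^n with x = exp(-eps / kT) = q / (q + N)."""
    x = q / (q + N)
    return (1.0 - x) * x ** n


def max_deviation(N, q, n_max=10):
    """Largest |exact - Boltzmann| over n = 0 .. n_max."""
    return max(abs(exact_probability(n, N, q) - boltzmann_probability(n, N, q))
               for n in range(n_max + 1))


if __name__ == "__main__":
    for N in (10, 100, 1000, 10000):
        q = N  # one quantum per bath oscillator on average
        print(f"N={N:6d}  max deviation={max_deviation(N, q):.3e}")
```

In this model the deviation from the Boltzmann form shrinks as the bath grows, in line with assumption 1. It says nothing about assumptions 3 and 4, since the toy model has no interaction energy by construction.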

So now we know when to expect and when not to expect the Boltzmann factor to be applicable. A “gas” of stars in a galaxy, for example, has long-range gravitational interactions that cannot be ignored even in the thermodynamic limit. Hence, we should not expect the Boltzmann factor to be applicable. On the other hand, a gas of, say, helium atoms has no significant long-range interactions at room temperature, i.e., the mean interaction energies are negligible compared to ${kT}$, so there we do expect the Boltzmann factor to apply.

Let me conclude by saying that although I am sure that Boltzmann statistics is not universal, I am entirely skeptical of alternative formulations that have become fashionable, such as those based on the Tsallis entropy. My collaborators and I have previously made our opinions clear on this question (see here).

I thank V.M. Kenkre for first calling my attention to the intricacies involved in the proper derivation of the Boltzmann factor, during 2008-09, when I was a visitor at the Consortium of the Americas for Interdisciplinary Science, at the University of New Mexico, in Albuquerque. I thank Renê Montenegro for feedback.