LOG#003. Entropy.

“I propose to name the quantity S the entropy of the system, after the Greek word [τροπη trope], the transformation. I have deliberately chosen the word entropy to be as similar as possible to the word energy: the two quantities to be named by these words are so closely related in physical significance that a certain similarity in their names appears to be appropriate.”  Clausius (1865).

Entropy is one of the strangest and most wonderful concepts in Physics. It is essential for the whole of Thermodynamics, and it is also essential for understanding thermal machines. It is essential for Statistical Mechanics and for the atomic picture of matter, from molecules down to fundamental particles. From the Microcosmos to the Macrocosmos, entropy is everywhere: in the kinetic theory of gases, in information theory, as we learned from the previous post, and in the realm of General Relativity, where equations of state for relativistic and non-relativistic particles arise too. Even more, entropy arises in Black Hole Thermodynamics in a most mysterious form that nobody understands yet.

On the other hand, in Quantum Mechanics entropy arises in Von Neumann’s density matrix approach, the quantum incarnation of the classical entropy, ultimately related to the notion of quantum entanglement. I know of no other concept in Physics that appears in such different branches. The true power of the concept of entropy is its generality.

There are generally three foundations for entropy, three roads that physicists follow to reach its meaning:

– Thermodynamical Entropy. In Thermodynamics, entropy arises after integrating out the heat with an integrating factor that is nothing but the inverse of the temperature. That is:

$\boxed{dS=\dfrac{\delta Q}{T}\rightarrow \Delta S=\int_\gamma\dfrac{\delta Q}{T}= \dfrac{\Delta Q}{T}\;\;(T=\mbox{constant})}$

The study of thermal machines, a logical consequence of the Industrial Revolution during the 19th century, produced the first definition of entropy. Indeed, following Clausius, the entropy change $\Delta S$ of a thermodynamic system absorbing a quantity of heat $\Delta Q$ at absolute temperature T is simply the ratio between the two, as the above formula shows! Armed with this definition and concept, Clausius was able to recast Carnot’s statement that steam engines cannot exceed a specific theoretical optimum efficiency into a much grander principle that we know as the “2nd law of Thermodynamics” (sometimes called the Maximum Entropy, MAX-ENT, principle by other authors):

$\boxed{\mbox{The entropy of the Universe tends to a maximum}}$

The problem with this definition and this principle is that it leaves unanswered the most important question: what really is the meaning of entropy? Indeed, the answer to this question had to await the revival of atomic theories of matter at the end of the 19th century.

– Statistical Entropy. Ludwig Boltzmann was the scientist who provided a fundamental theoretical basis for the concept of entropy. His key observation was that absolute temperature is nothing more than the average energy per molecular degree of freedom. This fact strongly implies that Clausius’s ratio between absorbed energy and absolute temperature is nothing more than the number of molecular degrees of freedom. That is, Boltzmann’s greatest idea can indeed be put very simply into words:

$\boxed{S=\mbox{Number of microscopical degrees of freedom}= N_{dof}}$

We can see a difference with respect to the thermodynamical picture of entropy: Boltzmann was able to show that the number of degrees of freedom of a physical system can be easily linked to the number of microstates $\Omega$ of that system. And the link is a relatively simple expression from the mathematical viewpoint (using the 7th elementary arithmetical operation, beyond the better known addition, subtraction, multiplication, division, powers, roots,…)

$\boxed{S \propto \log \Omega}$

The base of the logarithm is purely conventional. Generally, the natural base is used (or the binary base, see below).

Why does it work? Why is the number of degrees of freedom related to the logarithm of the total number of available microscopical states? Imagine a system with a single, simple two-valued degree of freedom: a coin. Copy it until you have N such systems. Then we have a system of N coins, each showing heads or tails. Each coin contributes one degree of freedom that can take two distinct values, so in total we have N binary (heads or tails) degrees of freedom. Simple counting tells us that each coin (each degree of freedom) contributes a factor of two to the total number of distinct states the system can be in. In other words, $\Omega = 2^N$. Taking the base-2 logarithm of both sides of this equation shows that the logarithm of the total number of states equals the number of degrees of freedom: $\log_2 \Omega = N$.

This argument can be made completely general. The key point is that the total number of states $\Omega$ follows from multiplying together the number of states for each degree of freedom. By taking the logarithm of $\Omega$, this product gets transformed into a sum over degrees of freedom. The result is an additive entropy definition: adding up the entropies of two independent subsystems gives the entropy of the total system.
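The coin argument above can be checked by brute force. The following is a minimal Python sketch (not part of the original derivation; the function names are just illustrative) that enumerates every heads/tails configuration, counts the microstates and verifies that the base-2 logarithm of the count is additive for two independent sets of coins.

```python
import math
from itertools import product

def count_microstates(n_coins):
    """Enumerate every heads/tails configuration of n_coins coins and count them."""
    return sum(1 for _ in product("HT", repeat=n_coins))

def entropy_bits(omega):
    """Entropy in bits: the base-2 logarithm of the number of microstates."""
    return math.log2(omega)

# Two independent subsystems of 3 and 5 coins, and the combined 8-coin system.
n1, n2 = 3, 5
omega1 = count_microstates(n1)
omega2 = count_microstates(n2)
omega_total = count_microstates(n1 + n2)

# Microstate counts multiply, so their logarithms add.
print(omega_total == omega1 * omega2)
print(entropy_bits(omega1), entropy_bits(omega2), entropy_bits(omega_total))
```

The enumeration is exponential in the number of coins, so this only makes sense for small N; the point is just to see the product of states turn into a sum of entropies.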

– Information Entropy.
Time machine forward, to the 20th century. In 1948, Claude Shannon, an electrical engineer at Bell Telephone Laboratories, managed to mathematically quantify the concept of “information”. The key result he derived is that describing the precise state of a system that can be in states labelled by numbers $1,2,...,n$ with probabilities $p_1, p_2,...,p_n$ requires a well-defined minimum number of bits. In fact, the best one can do is to assign $\log_2 (1/p_i)$ bits to the event that the system is in state $i$. Result: statistically speaking, the minimum number of bits one needs to be able to specify the system, regardless of its precise state, is

$\displaystyle{\mbox{Minimum number of bits} = \sum_{i=1}^{n}p_i\log_2 \dfrac{1}{p_i} = -\left(p_1\log_2 p_1+p_2\log_2 p_2+...+p_n\log_2 p_n\right)}$

When applied to a system that can be in $\Omega$ states, each with equal  probability $p= 1/\Omega$, we get that

$\mbox{Minimum number of bits} = \log_2 \Omega$
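As a quick numerical illustration, here is a small Python sketch (the function name `shannon_bits` and the example distributions are mine, chosen only for illustration). It averages $\log_2(1/p_i)$ bits over the states and shows that $\Omega$ equally likely states need $\log_2 \Omega$ bits, while a biased distribution over fewer states needs fewer.

```python
import math

def shannon_bits(probs):
    """Average number of bits, sum_i p_i * log2(1/p_i).

    States with zero probability are skipped, since they contribute nothing.
    """
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

omega = 8
uniform = [1.0 / omega] * omega          # Omega equally likely states
biased = [0.5, 0.25, 0.125, 0.125]       # a non-uniform example

print(shannon_bits(uniform), math.log2(omega))
print(shannon_bits(biased))
```

For the biased example, the states with probability 1/2, 1/4, 1/8, 1/8 get codes of 1, 2, 3 and 3 bits respectively, which averages to 1.75 bits.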

We got it. Nearly a century after the thermodynamic and statistical research, we were led to the simple conclusion that the Boltzmann expression $S = \log \Omega$ is nothing more than an alternative way to express:

$S = \mbox{number of bits required to define some (sub)system}$

Entropy is therefore a simple bit (or trit, quatrit, pentit,…, p-it) counting of your system: the number of bits required to completely determine the actual microscopic configuration among the total number of allowed microstates. In these terms, the second law of thermodynamics tells us that closed systems tend to be characterized by a growing bit count. Does it work? Yes, it does. Very well, as far as we know… Even in quantum information theory you have an analogue with the density matrix. It even works in GR, and it strangely works with Black Hole Thermodynamics, where entropy is given by the area of the horizon and temperature by the surface gravity at the horizon. Mysteriously, BH entropy is proportional not to the volume, as one could expect from conventional thermodynamics (where entropy scales as the volume of the container), but to the area of the horizon. Incredible, isn’t it? That scaling of the Black Hole entropy with the area was the origin of the holographic principle. But that is far from where this post wants to go today.

Indeed, there is a subtle difference between the statistical and the informational entropy: a minus sign in the definition. (Thermodynamical) entropy can be understood as “missing” information:

$\boxed{Entropy = - Information}$

or mathematically

$S= - I$, or, if you prefer, $I+S=0$.

That is, entropy is the same thing as information, except for a minus sign! So, if you add a thing to its opposite, you get zero.

The question that we naturally face in this entry is the following one: what is the most general mathematical formula/equation for “microscopic” entropy? Well, as with many other great problems in Physics, it depends on the axiomatics and on your assumptions! Let’s follow Boltzmann during the 19th century. He cleverly suggested a deep connection between thermodynamical entropy and the microscopical degrees of freedom of the considered system. He suggested that there was a connection between the entropy S of a thermodynamical system and the probability $\Omega$ of a given thermodynamical state. How can the functional relationship between S and $\Omega$ be found? Suppose we have $S=f(\Omega)$. In addition, suppose that we have a system that can be divided into two pieces, with their respective entropies and probabilities $S_1,S_2,\Omega_1,\Omega_2$. If we assume that the entropy is additive, meaning that

$S(\Omega)=S_1(\Omega_1)+S_2(\Omega_2)$

with the additional hypothesis that the subsystems are independent, i.e., $\Omega=\Omega_1\Omega_2$, then we can fix the functional form of the entropy in a very easy way: $S(\Omega)=f(\Omega_1\Omega_2)=f(\Omega_1)+f(\Omega_2)$. Do you recognize this functional equation from your High-School days? Yes! Logarithms are involved. If you are not convinced, you can do the work with simple calculus, following Fermi’s lectures on Thermodynamics. Let $x=\Omega_1, y=\Omega_2$, so that

$f(xy)=f(x)+f(y)$

Now write $y=1+\epsilon$, so that $f(x+\epsilon x)=f(x)+f(1+\epsilon)$, where $\epsilon$ is an infinitesimal quantity of first order. Taylor expanding both sides and neglecting terms beyond first order in $\epsilon$, we get

$f(x)+\epsilon x f'(x)=f(x)+f(1)+\epsilon f'(1)$

For $\epsilon=0$ we obtain $f(1)=0$, and therefore $xf'(x)=f'(1)=k$, where k is a constant, nowadays called Boltzmann’s constant. We integrate the differential equation:

$f'(x)=k/x$ in order to obtain the celebrated Boltzmann’s equation for entropy: $S=k\log \Omega$. To be precise, $\Omega$ is not a probability; it is the number of microstates compatible with the given thermodynamical state. To obtain the so-called Shannon-Gibbs-Boltzmann entropy, we must trade $\Omega$ for the probabilities $p_i$ of the individual microstates (for equally likely microstates, $p_i=1/\Omega$). The Shannon entropy functional form is then generally written as follows:

$\displaystyle{\boxed{S=-k \sum_i p_i\log p_i}}$

It attains its maximum value when $p_i=1/\Omega$, i.e., when the probability distribution is uniform. There is a subtle issue related to the additive constant obtained from the above argument that is important in classical and quantum thermodynamics, but we will discuss that in the future.

Now, we could be happy with this entropy functional, but the real issue is that we derived it from some a priori axioms that look natural but are not the most general set of axioms. And so our fascinating trip continues here today! The previous considerations follow, more or less formally, the so-called “Khinchin axioms” of information theory. That is, the Khinchin axioms are enough to derive the Shannon-Gibbs-Boltzmann entropy we wrote before. However, as happened with the axioms of Euclidean geometry, we can modify our axioms in order to obtain more general “geometries”, here more general “statistical mechanics”.

We are now going to explore some of the best-known generalizations of the Shannon entropy. In what follows, for simplicity, we set Boltzmann’s constant to one (i.e., we work in a system of units with k=1). Is the above definition of entropy/information the only one that is interesting from the physical viewpoint? No. Indeed, there has been increasing activity in “generalized entropies” in the past years. Note, however, that we should recover the basic and simpler entropy (that of Shannon-Gibbs-Boltzmann) in some limit. I will review here some of the entropic functionals most studied during the last decades.
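Before moving on, the statement that the Shannon-Gibbs-Boltzmann entropy is largest for the uniform distribution, where it equals $k\log \Omega$, can be checked numerically. The Python sketch below is only an illustration under my own assumptions (k=1, natural logarithm, a fixed random seed, six microstates); it compares the uniform value with the entropies of a thousand randomly drawn distributions.

```python
import math
import random

def sgb_entropy(probs, k=1.0):
    """Shannon-Gibbs-Boltzmann entropy S = -k * sum_i p_i ln p_i (natural log)."""
    return -k * sum(p * math.log(p) for p in probs if p > 0)

def random_distribution(n, rng):
    """Draw n positive weights and normalize them so that they sum to 1."""
    weights = [rng.random() + 1e-12 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

omega = 6
rng = random.Random(0)  # fixed seed so the experiment is repeatable

s_uniform = sgb_entropy([1.0 / omega] * omega)
s_random = [sgb_entropy(random_distribution(omega, rng)) for _ in range(1000)]

print(s_uniform, math.log(omega))            # both equal ln(Omega)
print(max(s_random) <= s_uniform + 1e-12)    # no random draw beats the uniform case
```

A random search is of course not a proof; the actual proof uses the concavity of $-p\log p$ (or a Lagrange multiplier for the normalization constraint).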

The Rényi entropy.

It is a set of uniparametric entropies, now becoming more and more popular in works on entanglement and thermodynamics, with the following functional form:

$\displaystyle{ \boxed{S_q^R=\dfrac{1}{1-q}\ln \sum_i p_{i}^{q}}}$

where the sum extends over every microstate with nonzero probability $p_i$. It is quite easy to see that in the limit $q\rightarrow 1$ the Rényi entropy becomes the Shannon-Gibbs-Boltzmann entropy (this can be checked with a perturbative expansion around $q=1+\epsilon$ or using L’Hôpital’s rule).
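The limit $q\rightarrow 1$ is also easy to watch numerically. Here is a minimal Python sketch (function names and the test distribution are illustrative choices of mine, with k=1 and natural logarithms) that evaluates the Rényi entropy for $q=1+\epsilon$ with shrinking $\epsilon$ and compares it with the Shannon-Gibbs-Boltzmann value.

```python
import math

def shannon(probs):
    """Shannon-Gibbs-Boltzmann entropy with k = 1 and natural logarithm."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def renyi(probs, q):
    """Rényi entropy (1/(1-q)) * ln(sum_i p_i^q), defined here for q != 1."""
    if q == 1:
        raise ValueError("q = 1 is the Shannon limit; use shannon() instead")
    return math.log(sum(p ** q for p in probs if p > 0)) / (1.0 - q)

probs = [0.5, 0.3, 0.2]
for eps in (1e-1, 1e-3, 1e-5):
    # The gap to the Shannon value shrinks roughly linearly with eps.
    print(eps, renyi(probs, 1.0 + eps), shannon(probs))
```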

The Tsallis entropy.

Tsallis entropies, also called q-entropies by some researchers, are the uniparametric family of entropies defined by:

$\displaystyle{ \boxed{S_{q}^{T}=\dfrac{1}{1-q}\left( \sum_{i} p_{i}^{q}-1\right)}}$.

Tsallis entropy is related to Rényi entropy through a nice equation:

$\boxed{S_q^ T=\dfrac{1}{q-1}(1-e^{(1-q)S_q^R})}$

and again, taking the limit $q\rightarrow 1$, Tsallis entropies reduce to the Shannon-Gibbs-Boltzmann entropy. Why consider a Statistical Mechanics based on Tsallis entropy and not on Rényi’s? Without entering into mathematical details, the properties of Tsallis entropy make it more suitable for a generalized Statistical Mechanics of complex systems (in particular, because of the concavity of Tsallis entropy), as the seminal work of C. Tsallis showed. Indeed, Tsallis entropies had already appeared, unnoticed by Tsallis, in another unexpected place: in an earlier paper, Havrda and Charvát introduced the so-called “structural $\alpha$ entropy”, related to some cybernetical problems in computing.
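With the two definitions as written above, $\sum_i p_i^q=e^{(1-q)S_q^R}$, so the Tsallis entropy can be computed either directly or from the Rényi entropy. The Python sketch below (illustrative names, k=1, natural logarithms) checks that both routes agree for a few values of q and that the Tsallis entropy approaches the Shannon value near q=1.

```python
import math

def shannon(probs):
    """Shannon-Gibbs-Boltzmann entropy with k = 1."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def renyi(probs, q):
    """Rényi entropy (1/(1-q)) * ln(sum_i p_i^q), q != 1."""
    return math.log(sum(p ** q for p in probs if p > 0)) / (1.0 - q)

def tsallis(probs, q):
    """Tsallis entropy (1/(1-q)) * (sum_i p_i^q - 1), q != 1."""
    return (sum(p ** q for p in probs if p > 0) - 1.0) / (1.0 - q)

def tsallis_from_renyi(s_renyi, q):
    """Tsallis entropy rebuilt from the Rényi one, using sum p^q = exp((1-q) S_R)."""
    return (1.0 - math.exp((1.0 - q) * s_renyi)) / (q - 1.0)

probs = [0.5, 0.3, 0.2]
for q in (0.5, 2.0, 3.0):
    print(q, tsallis(probs, q), tsallis_from_renyi(renyi(probs, q), q))

print(tsallis(probs, 1.0 + 1e-5), shannon(probs))
```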

Interestingly, Tsallis entropies are non-additive, meaning that they satisfy a “pseudo-additivity” property:

$\boxed{S_{q}^{\Omega}=S_q^{\Omega_1}+S_q^{\Omega_2}-(q-1)S_q^{\Omega_1}S_q^{\Omega_2}}$

This means that a Statistical Mechanics built on the Tsallis entropy is itself non-additive: the entropies of independent subsystems do not simply add up. However, these entropies are usually called “non-extensive”. Why? The definition of extensivity is different: the entropy of a given system is extensive if, in the so-called thermodynamical limit $N\rightarrow \infty$, $S\propto N$, where N is the number of elements of the given thermodynamical system. Therefore, additivity depends only on the functional relation between the entropy and the probabilities, whereas extensivity depends also on the nature of the correlations between the elements of the system. The additivity test is quite trivial, but checking extensivity for a specific system can be very complicated. Indeed, Tsallis entropies can be additive for certain systems, and for some correlated systems they can become extensive, as in usual Thermodynamics/Statistical Mechanics. However, in the broader sense, they are generally non-additive and non-extensive. And it is the latter feature, the behaviour in the thermodynamical limit, from which the name “non-extensive” Thermodynamics arises.
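The "quite trivial" additivity test can be made concrete. In the Python sketch below (an illustration with arbitrary example distributions and q=2; names are mine), two independent subsystems are combined through the product of their probabilities, and the Tsallis entropy of the joint system is compared with both the naive sum and the pseudo-additive rule.

```python
import math

def tsallis(probs, q):
    """Tsallis entropy (1/(1-q)) * (sum_i p_i^q - 1), q != 1, with k = 1."""
    return (sum(p ** q for p in probs if p > 0) - 1.0) / (1.0 - q)

def joint_independent(p1, p2):
    """Joint distribution of two independent subsystems: p_ij = p_i * p_j."""
    return [a * b for a in p1 for b in p2]

p1 = [0.7, 0.3]
p2 = [0.5, 0.25, 0.25]
q = 2.0

s1, s2 = tsallis(p1, q), tsallis(p2, q)
s12 = tsallis(joint_independent(p1, p2), q)

print(s12, s1 + s2)                          # plain additivity fails for q != 1
print(s12, s1 + s2 - (q - 1.0) * s1 * s2)    # pseudo-additivity holds
```

Note that this only tests additivity. Extensivity is a statement about how S grows with the number of elements N, and it depends on the correlations of the specific system.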

Landsberg-Vedral entropy.

They are also called “normalized Tsallis entropies”. They form the uniparametric family of entropies with functional form:

$\displaystyle{ \boxed{S_q^{LV} =\dfrac{1}{1-q} \left( 1-\dfrac{1}{\sum_i p_{i}^{q}}\right)}}$

They are related to Tsallis entropy through the equation:

$\displaystyle{ S_q^{LV}= \dfrac{S_q^T}{\sum_i p_i ^q}}$

It explains their alternate name as “normalized” Tsallis entropies. They satisfy a modified “pseudoadditivity” property:

$S_q^\Omega=S_q^{\Omega_1}+S_q^{\Omega_2}+(q-1)S_q^{\Omega_1}S_q^{\Omega_2}$

That is, in the case of normalized Tsallis entropies the rôles of (q-1) and -(q-1) are exchanged, i.e., -(q-1) becomes (q-1) in the transition from Tsallis to Landsberg-Vedral entropy.
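Both statements about the Landsberg-Vedral entropy, the normalization by $\sum_i p_i^q$ and the flipped sign in the pseudo-additivity rule, can be verified with a few lines of Python. This is only a sketch with illustrative names and arbitrary example distributions (k=1).

```python
import math

def tsallis(probs, q):
    """Tsallis entropy (1/(1-q)) * (sum_i p_i^q - 1), q != 1."""
    return (sum(p ** q for p in probs if p > 0) - 1.0) / (1.0 - q)

def landsberg_vedral(probs, q):
    """Landsberg-Vedral entropy (1/(1-q)) * (1 - 1/sum_i p_i^q), q != 1."""
    x = sum(p ** q for p in probs if p > 0)
    return (1.0 - 1.0 / x) / (1.0 - q)

def joint_independent(p1, p2):
    """Joint distribution of two independent subsystems: p_ij = p_i * p_j."""
    return [a * b for a in p1 for b in p2]

p1 = [0.7, 0.3]
p2 = [0.5, 0.25, 0.25]
q = 1.5

# "Normalized" Tsallis entropy: divide by sum_i p_i^q.
x1 = sum(p ** q for p in p1)
print(landsberg_vedral(p1, q), tsallis(p1, q) / x1)

# Pseudo-additivity with +(q-1) instead of -(q-1).
s1, s2 = landsberg_vedral(p1, q), landsberg_vedral(p2, q)
s12 = landsberg_vedral(joint_independent(p1, p2), q)
print(s12, s1 + s2 + (q - 1.0) * s1 * s2)
```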

Abe entropy.

This kind of uniparametric entropy is very symmetric. It is also related to some issues in quantum groups and fractal (non-integer) analysis. It is defined by the following entropic functional, built from $q$ and $q^{-1}$:

$\displaystyle{ \boxed{S_q^{Abe}=-\sum_i \dfrac{p_i^q-p_i^{q^{-1}}}{q-q^{-1}}}}$

Abe entropy can be obtained from Tsallis entropy as follows:

$\boxed{S_q^{Abe}=\dfrac{(q-1)S_q^T-(q^{-1}-1)S_{q^{-1}}^{T}}{q-q^{-1}}}$

Abe entropy is also concave from the mathematical viewpoint, like the Tsallis entropy. It has some kind of “duality” or mirror symmetry due to its invariance under swapping q and 1/q.
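The $q \leftrightarrow 1/q$ symmetry and the construction of the Abe entropy from two Tsallis entropies can both be checked numerically. A minimal Python sketch, with illustrative names and an arbitrary example distribution (k=1, q different from 1):

```python
import math

def tsallis(probs, q):
    """Tsallis entropy (1/(1-q)) * (sum_i p_i^q - 1), q != 1."""
    return (sum(p ** q for p in probs if p > 0) - 1.0) / (1.0 - q)

def abe(probs, q):
    """Abe entropy -sum_i (p_i^q - p_i^(1/q)) / (q - 1/q), q != 1."""
    return -sum((p ** q - p ** (1.0 / q)) / (q - 1.0 / q) for p in probs if p > 0)

def abe_from_tsallis(probs, q):
    """Abe entropy assembled from the Tsallis entropies at q and 1/q."""
    q_inv = 1.0 / q
    numerator = (q - 1.0) * tsallis(probs, q) - (q_inv - 1.0) * tsallis(probs, q_inv)
    return numerator / (q - q_inv)

probs = [0.5, 0.3, 0.2]
q = 2.0

print(abe(probs, q), abe(probs, 1.0 / q))        # mirror symmetry q <-> 1/q
print(abe(probs, q), abe_from_tsallis(probs, q)) # same value from Tsallis entropies
```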

Another well-known uniparametric family is the Kaniadakis entropy or $\kappa$-entropy. Related to relativistic kinematics, it has the functional form:

$\displaystyle{ \boxed{S_\kappa^{K}=-\sum_i \dfrac{p_i^{1+\kappa }-p_i^{1-\kappa}}{2\kappa}}}$

In the limit $\kappa \rightarrow 0$ Kaniadakis entropy becomes Shannon entropy. Also, writing $q=1+\kappa$ and $\dfrac{1}{q}=1-\kappa$ (an identification that holds to first order in $\kappa$), Kaniadakis entropy becomes Abe entropy. Kaniadakis entropy, in addition to being concave, has further subtle properties, like being something called Lesche stable. See the references below for details!
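The limit $\kappa \rightarrow 0$ follows from $(p^{1+\kappa}-p^{1-\kappa})/(2\kappa)\rightarrow p\ln p$, and it can be watched numerically. The Python sketch below is an illustration only (names and the example distribution are mine, k=1); it also shows that the functional does not change under $\kappa \rightarrow -\kappa$.

```python
import math

def shannon(probs):
    """Shannon-Gibbs-Boltzmann entropy with k = 1."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def kaniadakis(probs, kappa):
    """Kaniadakis entropy -sum_i (p_i^(1+kappa) - p_i^(1-kappa)) / (2 kappa), kappa != 0."""
    return -sum(
        (p ** (1.0 + kappa) - p ** (1.0 - kappa)) / (2.0 * kappa)
        for p in probs
        if p > 0
    )

probs = [0.5, 0.3, 0.2]
for kappa in (0.5, 1e-2, 1e-4):
    # The gap to the Shannon value shrinks quadratically with kappa.
    print(kappa, kaniadakis(probs, kappa), shannon(probs))

print(kaniadakis(probs, 0.3), kaniadakis(probs, -0.3))  # even in kappa
```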

Sharma-Mittal entropies.

Finally, we end our tour along entropy functionals with a biparametric family of entropies called Sharma-Mittal entropies. They have the following definition:

$\displaystyle{ \boxed{S_{\kappa,r}^{SM}=-\sum_i p_i^{r}\left( \dfrac{p_i^{1+\kappa}-p_i^{1-\kappa}}{2\kappa}\right)}}$

It can be shown that these entropy species contain many entropies as special subtypes. For instance, Tsallis entropy is recovered if $r=\kappa$ and $q=1+2\kappa$. Kaniadakis entropy is obtained if we set r=0. Abe entropy is the subcase with $\kappa=\frac{1}{2}(q-q^{-1})$ and $r=\frac{1}{2}(q+q^{-1})-1$. Isn’t it wonderful? There is an alternative expression of Sharma-Mittal entropy, taking the following form:

$\displaystyle{ \boxed{S_{r,q}^{SM}=\dfrac{1}{1-r}\left[\left(\sum_i p_i^q\right)^{\frac{1-r}{1-q}}-1\right]}}$

In this functional form, SM entropy recovers Rényi entropy for $r\rightarrow 1$, SM entropy becomes Tsallis entropy if $r\rightarrow q$. Finally, when both parameters approach 1, i.e., $r,q\rightarrow 1$, we recover the classical Shannon-Gibbs-Boltzmann. It is left as a nice exercise for the reader to relate the above 2 SM entropy functional forms and to derive Kaniadakis entropy, Abe entropy and Landsberg-Vedral entropy for some particular values of $r,q$ from the second definition of SM entropy.
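These limits can be checked numerically. The Python sketch below is an illustration under a stated assumption: it uses the standard two-parameter form in which the exponent $(1-r)/(1-q)$ acts on the whole sum $\sum_i p_i^q$, since that is the form for which the $r\rightarrow 1$ limit gives the Rényi entropy. Function names and the example distribution are mine, with k=1.

```python
import math

def shannon(probs):
    """Shannon-Gibbs-Boltzmann entropy with k = 1."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def renyi(probs, q):
    """Rényi entropy (1/(1-q)) * ln(sum_i p_i^q), q != 1."""
    return math.log(sum(p ** q for p in probs if p > 0)) / (1.0 - q)

def tsallis(probs, q):
    """Tsallis entropy (1/(1-q)) * (sum_i p_i^q - 1), q != 1."""
    return (sum(p ** q for p in probs if p > 0) - 1.0) / (1.0 - q)

def sharma_mittal(probs, q, r):
    """Sharma-Mittal entropy (1/(1-r)) * ((sum_i p_i^q)^((1-r)/(1-q)) - 1).

    Assumes q != 1 and r != 1; the exponent acts on the whole sum.
    """
    x = sum(p ** q for p in probs if p > 0)
    return (x ** ((1.0 - r) / (1.0 - q)) - 1.0) / (1.0 - r)

probs = [0.5, 0.3, 0.2]
q = 2.0

print(sharma_mittal(probs, q, 1.0 + 1e-6), renyi(probs, q))   # r -> 1: Rényi
print(sharma_mittal(probs, q, q), tsallis(probs, q))          # r = q: Tsallis
print(sharma_mittal(probs, 1.0 + 1e-4, 1.0 + 1e-4), shannon(probs))  # r, q -> 1
```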

However, entropy as a concept is still very mysterious. Indeed, it is not yet clear whether we have exhausted every functional form for entropy!

Non-extensive Statistical Mechanics and its applications are becoming more and more important and known among theoretical physicists. It has a growing number of uses in High-Energy Physics, condensed matter, Quantum Information and Physics in general. The Nobel laureate Murray Gell-Mann has dedicated his last years of research to the world of Non-Extensive entropy. At least since his book The Quark and the Jaguar, Murray Gell-Mann has progressively moved into this fascinating topic. In parallel, the field has also produced some other interesting approaches to Statistical Mechanics, such as the so-called “superstatistics”. Superstatistics is some kind of superposition of statistics that was invented by the physicist Christian Beck.

The latest research on the foundations of entropy functionals is related to something called “group entropies”, to the transformation group of superstatistics and to the rôle of group transformations on non-extensive entropies. It provides feedback between different branches of knowledge: group theory, number theory, Statistical Mechanics, and Quantum Statistics… And a connection with the classical Riemann zeta function even arises!

WHERE DO I LEARN ABOUT THIS STUFF and MORE if I am interested in it? You can study these topics in the following references:

The main entry is based on the following article by Christian Beck:

1) Generalized information and entropy measures in physics by Christian Beck. http://arXiv.org/abs/0902.1235v2

If you get interested in Murray Gell-Mann’s work on superstatistics and its group of transformations, here is the place to begin:

2) Generalized entropies and the transformation group of superstatistics. Rudolf Hanel, Stefan Thurner, Murray Gell-Mann

http://arxiv.org/abs/1103.0580

If you want to see a really nice paper on group entropies and zeta functions, you can read this one by P. Tempesta:

3) Group entropies, correlation laws and zeta functions. http://arxiv.org/abs/1105.1935

C.Tsallis himself has a nice bibliography related to non-extensive entropies in his web page:

The “Khinchin axioms” of information/entropy functionals can be found, for instance, here:

5) Mathematical Foundations of Information Theory, A. Y. Khinchin. Dover. Pub.

Three questions to be answered by current and future scientists:

A) What is the most general entropy (entropy functional) that can be built from microscopic degrees of freedom? Are they classical/quantum or is that distinction irrelevant for the ultimate substrate of reality?

B) Is every fundamental interaction related to some kind of entropy? How and why?

C) If entropy is “information loss” or “information” (only a minus sign makes the difference), and Quantum Mechanics is about information (the current and modern interpretation of QM is based on it), is there some hidden relationship between mass-energy, information and entropy? Could it be used to build Relativity and QM from a common framework? Are QM and (General) Relativity then emergent, and likely the two sides of a more fundamental theory based on information only?