“I propose to name the quantity S the entropy of the system, after the Greek word [τροπη trope], the transformation. I have deliberately chosen the word entropy to be as similar as possible to the word energy: the two quantities to be named by these words are so closely related in physical significance that a certain similarity in their names appears to be appropriate.” Clausius (1865).
Entropy is one of the strangest and most wonderful concepts in Physics. It is essential for the whole of Thermodynamics, and it is also essential to understand thermal machines. It is essential for Statistical Mechanics and for the atomic structure of molecules and fundamental particles. From the Microcosmos to the Macrocosmos, entropy is everywhere: in the kinetic theory of gases, in information theory (as we learned from the previous post), and in the realm of General Relativity, where equations of state for relativistic and non-relativistic particles arise too. Even more, entropy arises in Black Hole Thermodynamics in a most mysterious form that nobody fully understands yet.
On the other hand, in Quantum Mechanics, entropy arises in von Neumann’s approach to the density matrix, the quantum incarnation of the classical entropy, ultimately related to the notion of quantum entanglement. I know of no other concept in Physics that appears in such different branches of Physics. The true power of the concept of entropy is its generality.
There are generally three foundations for entropy, three roads that physicists have to the meaning of entropy:
– Thermodynamical Entropy. In Thermodynamics, entropy arises after integrating the heat with an integrating factor that is nothing but the inverse of the temperature. That is:
$latex dS=\dfrac{\delta Q}{T}$
The study of the thermal machines that arose as a logical consequence of the Industrial Revolution during the 19th century produced the first definition of entropy. Indeed, following Clausius, the entropy change of a thermodynamic system absorbing a quantity of heat at absolute temperature T is simply the ratio between the two, as the above formula shows! Armed with this definition and concept, Clausius was able to recast Carnot’s statement that steam engines cannot exceed a specific theoretical optimum efficiency into a much grander principle that we now know as the “2nd law of Thermodynamics” (sometimes called the Maximum Entropy, MAX-ENT, principle by other authors).
The problem with this definition and this principle is that they leave unanswered the most important question: “what really is the meaning of entropy?” Indeed, the answer to this question had to await the revival of atomic theories of matter at the end of the 19th century.
– Statistical Entropy. Ludwig Boltzmann was the scientist who provided a fundamental theoretical basis for the concept of entropy. His key observation was that absolute temperature is nothing more than the average energy per molecular degree of freedom. This fact strongly implies that Clausius’ ratio between absorbed energy and absolute temperature is nothing more than the number of molecular degrees of freedom. That is, Boltzmann’s greatest idea can indeed be put very simply into words:
$latex \mbox{Entropy}\propto \mbox{Number of molecular degrees of freedom}$
We can see a difference with respect to the thermodynamical picture of entropy: Boltzmann was able to show that the number of degrees of freedom of a physical system can be easily linked to the number of microstates $latex \Omega$ of that system. And it comes with a relatively simple expression from the mathematical viewpoint (using the 7th elementary arithmetical operation, the logarithm, beyond the better known addition, subtraction, multiplication, division, powers, roots,…):
$latex S=k\log \Omega$
where k is a constant. The base of the logarithm is really just a convention. Generally, the natural base is used (or the binary base, see below).
Why does it work? Why is the number of degrees of freedom related to the logarithm of the total number of available microscopic states? Imagine a simple system with one two-valued degree of freedom: a coin. Clone/copy it until you have N of those systems. Then, we have got a system of N coins, each showing head or tail. Each coin contributes one degree of freedom that can take two distinct values. So in total we have N (binary, i.e., head or tail) degrees of freedom. Simple counting tells us that each coin (each degree of freedom) contributes a factor of two to the total number of distinct states the system can be in. In other words, $latex \Omega=2^N$. Taking the base-2 logarithm of both sides of this equation shows that the logarithm of the total number of states equals the number of degrees of freedom: $latex \log_2 \Omega=N$.
This argument can be made completely general. The key point is that the total number of states follows from multiplying together the number of states for each degree of freedom. By taking the logarithm of $latex \Omega$, this product gets transformed into a sum over degrees of freedom. The result is an additive entropy definition: adding up the entropies of two independent subsystems gives us the entropy of the total system.
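The coin counting can be checked by brute force. The following is a minimal Python sketch, not part of the original argument: the function names and the subsystem sizes (3 and 5 coins) are arbitrary choices made for illustration. It enumerates every head/tail configuration, and compares the base-2 logarithm of the count with the number of coins.

```python
import math
from itertools import product

def count_microstates(n_coins):
    """Enumerate every head/tail configuration of n_coins and count them."""
    return sum(1 for _ in product("HT", repeat=n_coins))

def entropy_bits(n_states):
    """Base-2 logarithm of the number of microstates."""
    return math.log2(n_states)

# Two independent subsystems (sizes chosen only for illustration).
n_a, n_b = 3, 5
omega_a = count_microstates(n_a)
omega_b = count_microstates(n_b)
omega_total = count_microstates(n_a + n_b)

# Microstate counts multiply, their logarithms add.
print(omega_a, omega_b, omega_total)
print(entropy_bits(omega_a), entropy_bits(omega_b), entropy_bits(omega_total))
```

The counts multiply (8 times 32 gives 256) while the logarithms add (3 plus 5 gives 8), which is exactly the additivity described above.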
– Information Entropy.
Let’s take the time machine forward, into the 20th century. In 1948, Claude Shannon, an electrical engineer at Bell Telephone Laboratories, managed to mathematically quantify the concept of “information”. The key result he derived is that describing the precise state of a system that can be in states labelled by numbers $latex 1,2,\ldots,n$ with probabilities $latex p_1,p_2,\ldots,p_n$ requires a well-defined minimum number of bits. In fact, the best one can do is to assign $latex \log_2 (1/p_i)$ bits to the event with state $latex i$. Result: statistically speaking, the minimum number of bits one needs to be capable of specifying the system, regardless of its precise state, will be
$latex \mbox{Minimum number of bits}=\displaystyle{\sum_{i=1}^{n}p_i\log_2\dfrac{1}{p_i}}$
When applied to a system that can be in $latex \Omega$ states, each with equal probability $latex p=1/\Omega$, we get that
$latex \mbox{Minimum number of bits}=\log_2 \Omega$
We got it. A full century after the thermodynamic and statistical research, we were led to the simple conclusion that the Boltzmann expression $latex S=k\log \Omega$ is nothing more than an alternative way to express:
$latex S=\mbox{Number of bits required to specify the microstate of the system}$
Entropy is therefore a simple bit (or trit, quatrit, pentit,…, p-it) count of your system: the number of bits required to completely determine the actual microscopic configuration among the total number of allowed microstates. In these terms, the second law of thermodynamics tells us that closed systems tend to be characterized by a growing bit count. Does it work? Yes, it does. Very well, as far as we know… Even in quantum information theory you have an analogue with the density matrix. It even works in GR, and it even works, strangely, in Black Hole Thermodynamics, where entropy is proportional to the area of the horizon and temperature is given by the surface gravity at the horizon. Mysteriously, BH entropy is proportional not to the volume, as one could expect from conventional thermodynamics (where entropy scales as the volume of the container), but to the area of the horizon. Incredible, isn’t it? That scaling of the Black Hole entropy with the area was the origin of the holographic principle. But that is far away from where this post wants to go today.
Indeed, there is a subtle difference between the statistical and the informational entropy: a minus sign in the definition. (Thermodynamical) entropy can be understood as “missing” information:
$latex S=-I$, or do you maybe prefer $latex S+I=0$?
That is, entropy is the same thing as information, except for a minus sign! So, if you add the same thing to its opposite, you get zero.
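The bit-counting picture of information entropy can be tried out numerically. This is a minimal Python sketch, assuming the usual Shannon form in which a state of probability p_i costs log2(1/p_i) bits; the function names and the two example distributions are my own illustrative choices.

```python
import math

def shannon_bits(probabilities):
    """Average number of bits, sum_i p_i * log2(1/p_i), skipping p_i = 0."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

def equal_probability_bits(n_states):
    """Bit count for n_states equally likely states: log2(n_states)."""
    return math.log2(n_states)

# Illustrative distributions (assumptions, not data from the post).
uniform = [1.0 / 8] * 8
biased = [0.5, 0.25, 0.125, 0.125]

print(shannon_bits(uniform), equal_probability_bits(8))  # 3.0 3.0
print(shannon_bits(biased), equal_probability_bits(4))   # 1.75 2.0
```

For equally likely states the average bit count reduces to the logarithm of the number of states, while any bias in the probabilities lowers the number of bits needed on average.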
The question that we naturally face in this entry is the following one: what is the most general mathematical formula/equation for “microscopic” entropy? Well, as with many other great problems in Physics, it depends on the axiomatics and on your assumptions! Let’s follow Boltzmann during the 19th century. He cleverly suggested a deep connection between thermodynamical entropy and the microscopic degrees of freedom of the system considered. He suggested that there was a connection between the entropy S of a thermodynamical system and the probability $latex \Omega$ of a given thermodynamical state. How can the functional relationship between S and $latex \Omega$ be found? Suppose we have $latex S=f(\Omega)$. In addition, suppose that we have a system that can be divided into two pieces, with their respective entropies and probabilities $latex S_1,S_2,\Omega_1,\Omega_2$. If we assume that the entropy is additive, meaning that
$latex S=S_1+S_2$
with the additional hypothesis that the subsystems are independent, i.e., $latex \Omega=\Omega_1\Omega_2$, then we can fix the functional form of the entropy in a very easy way: $latex f(\Omega_1\Omega_2)=f(\Omega_1)+f(\Omega_2)$. Do you recognize this functional equation from your High-School? Yes! Logarithms are involved in it. If you are not convinced, you can do the work with simple calculus, following the Fermi lectures on Thermodynamics. Let
$latex f(xy)=f(x)+f(y)$
Now write $latex y=1+\epsilon$, so that $latex f(x+\epsilon x)=f(x)+f(1+\epsilon)$, where $latex \epsilon$ is a tiny first-order infinitesimal. Taylor expanding both sides and neglecting terms beyond first order in the infinitesimal, we get
$latex f(x)+\epsilon xf'(x)=f(x)+f(1)+\epsilon f'(1)$
For $latex \epsilon=0$ we obtain $latex f(1)=0$, and therefore $latex xf'(x)=f'(1)=k$, where k is a constant, nowadays called Boltzmann’s constant. We integrate the differential equation:
$latex f'(x)=\dfrac{k}{x}$
in order to obtain the celebrated Boltzmann’s equation for entropy: $latex S=k\log \Omega$ (the integration of the differential equation also allows an additive constant). To be precise, $latex \Omega$ is not the probability; it is the number of microstates compatible with the given thermodynamical state. To obtain the so-called Shannon-Gibbs-Boltzmann entropy, we must divide by the total number of possible dynamical states, so that we work with the probabilities $latex p_i$ of the microstates. The Shannon entropy functional form is then generally written as follows:
$latex S=-k\displaystyle{\sum_i p_i\log p_i}$
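The calculus argument can be sanity-checked numerically. This is a minimal Python sketch under stated assumptions: I set the constant k to 1 (a choice of units), take f(x) = k log(x) as the candidate solution, and estimate the derivative with a central finite difference. The helper names are hypothetical.

```python
import math

K = 1.0  # assumption: units in which the constant k equals 1

def f(x):
    """Candidate solution f(x) = k log(x) of the functional equation."""
    return K * math.log(x)

def central_difference(func, x, step=1e-6):
    """Numerical estimate of func'(x) by a central finite difference."""
    return (func(x + step) - func(x - step)) / (2.0 * step)

def additivity_gap(x, y):
    """f(xy) - f(x) - f(y); it vanishes if the functional equation holds."""
    return f(x * y) - f(x) - f(y)

# Arbitrary test points: the gap should be ~0 and x * f'(x) should be ~k.
for x, y in [(2.0, 3.0), (10.0, 0.5), (7.0, 7.0)]:
    print(additivity_gap(x, y), x * central_difference(f, x))
```

Within floating-point accuracy, the gap is zero and the product x f'(x) stays equal to k for every x tried, which are the two properties used in the derivation.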
It approaches a maximum value when $latex p_i=1/\Omega$ for each of the $latex \Omega$ accessible microstates, i.e., when the probability is a uniform distribution. There is a subtle issue related to the additive constant obtained from the above argument that is important in classical and quantum thermodynamics. But we will discuss that in the future. Now, we could be happy with this entropy functional, but the real issue is that we derived it from some a priori axioms that could look natural, but are not the most general set of axioms. And, then, our fascinating trip continues here today!
The previous considerations have been following, more or less formally, the so-called “Khinchin axioms” of information theory. That is, the Khinchin axioms are enough to derive the Shannon-Gibbs-Boltzmann entropy we wrote before. However, as happened with the axioms of Euclidean geometry, we can modify our axioms in order to obtain more general “geometries”, here more general “statistical mechanics”. We are now going to explore some of the best known generalizations of Shannon entropy. In what follows, for simplicity, we set Boltzmann’s constant to one (i.e., we work with a k=1 system of units).
Is the above definition of entropy/information the only one that is interesting from the physical viewpoint? No. Indeed, there has been increasing activity in “generalized entropies” in the past years. Note, however, that we should recover the basic and simpler entropy (that of Shannon-Gibbs-Boltzmann) in some limit. I will review here some of the entropic functionals that have been most studied during the last decades.
The Rényi entropy.
It is a set of uniparametric entropies, now becoming more and more popular in works on entanglement and thermodynamics, with the following functional form:
$latex S_q^{(R)}=\dfrac{1}{1-q}\log \displaystyle{\sum_i p_i^q}$
where the sum extends over every microstate with nonzero probability $latex p_i$. It is quite easy to see that in the limit $latex q\rightarrow 1$ the Rényi entropy becomes the Shannon-Gibbs-Boltzmann entropy (it can be checked with a perturbative expansion around $latex q=1$ or using L’Hôpital’s rule).
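The q → 1 limit can also be seen numerically. This is a minimal Python sketch, assuming the standard form of the Rényi entropy, log(sum_i p_i^q)/(1-q), with natural logarithms and k=1; the function names and the test distribution are illustrative assumptions.

```python
import math

def shannon_entropy(p):
    """Shannon-Gibbs-Boltzmann entropy -sum_i p_i log p_i (k = 1)."""
    return -sum(x * math.log(x) for x in p if x > 0)

def renyi_entropy(p, q):
    """Renyi entropy log(sum_i p_i^q) / (1 - q); Shannon at exactly q = 1."""
    if q == 1.0:
        return shannon_entropy(p)
    return math.log(sum(x ** q for x in p if x > 0)) / (1.0 - q)

p = [0.5, 0.3, 0.2]  # illustrative distribution (an assumption)

# The Renyi values drift towards the Shannon value as q approaches 1.
for q in (2.0, 1.5, 1.1, 1.001, 1.000001):
    print(q, renyi_entropy(p, q))
print("Shannon:", shannon_entropy(p))
```

For a uniform distribution every member of the family gives the same value, the logarithm of the number of states; the parameter q only matters when the probabilities are unequal.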
The Tsallis entropy.
Tsallis entropies, also called q-entropies by some researchers, are the uniparametric family of entropies defined by:
$latex S_q^{(T)}=\dfrac{1}{q-1}\left(1-\displaystyle{\sum_i p_i^q}\right)$
Tsallis entropy is related to the Rényi entropies through a nice equation:
$latex S_q^{(R)}=\dfrac{1}{1-q}\log \left[1-(q-1)S_q^{(T)}\right]$
and again, taking the limit $latex q\rightarrow 1$, Tsallis entropies provide the Shannon-Gibbs-Boltzmann entropy. Why consider a Statistical Mechanics based on the Tsallis entropy and not on Rényi’s? Without entering into mathematical details, the properties of the Tsallis entropy make it more suitable for a generalized Statistical Mechanics of complex systems (in particular, due to the concavity of the Tsallis entropy), as the seminal work of C. Tsallis showed. Indeed, unknown to Tsallis, these entropies had already been found in another, unexpected place: in an earlier paper, Havrda and Charvát introduced the so-called “structural entropy”, related to some cybernetical problems in computing.
Interestingly, Tsallis entropies are non-additive, meaning that for two independent subsystems A and B they satisfy a “pseudo-additivity” property:
$latex S_q^{(T)}(A+B)=S_q^{(T)}(A)+S_q^{(T)}(B)-(q-1)S_q^{(T)}(A)S_q^{(T)}(B)$
This means that a Statistical Mechanics built on the Tsallis entropy is itself non-additive: the entropies of independent subsystems generally do not simply add up. However, these entropies are usually called “non-extensive”. Why? The definition of extensivity is different: the entropy of a given system is extensive if, in the so-called thermodynamical limit $latex N\rightarrow \infty$, we have $latex S\propto N$, where N is the number of elements of the given thermodynamical system. Therefore, additivity depends only on the functional relation between the entropy and the probabilities, but extensivity depends not only on that, but also on the nature of the correlations between the elements of the system. The additivity test is quite trivial, but checking extensivity for a specific system can be very complicated. Indeed, Tsallis entropies can be additive for certain systems, and for some correlated systems they can become extensive, as in usual Thermodynamics/Statistical Mechanics. However, in the broader sense, they are generally non-additive and non-extensive. And it is the latter feature, the behaviour in the thermodynamical limit, from which the name “non-extensive” Thermodynamics arises.
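Pseudo-additivity is easy to test on a small example. This is a minimal Python sketch, assuming the standard Tsallis form (1 - sum_i p_i^q)/(q-1) and the standard Rényi form log(sum_i p_i^q)/(1-q) with k=1; the two subsystem distributions and the value q=2 are arbitrary illustrative choices.

```python
import math
from itertools import product

def tsallis_entropy(p, q):
    """Tsallis entropy (1 - sum_i p_i^q) / (q - 1), for q != 1."""
    return (1.0 - sum(x ** q for x in p if x > 0)) / (q - 1.0)

def renyi_entropy(p, q):
    """Renyi entropy log(sum_i p_i^q) / (1 - q), for q != 1."""
    return math.log(sum(x ** q for x in p if x > 0)) / (1.0 - q)

def joint_independent(p_a, p_b):
    """Joint distribution of two independent subsystems: products p_i * p_j."""
    return [a * b for a, b in product(p_a, p_b)]

q = 2.0
p_a = [0.5, 0.5]       # illustrative subsystem A
p_b = [0.7, 0.2, 0.1]  # illustrative subsystem B

s_a = tsallis_entropy(p_a, q)
s_b = tsallis_entropy(p_b, q)
s_ab = tsallis_entropy(joint_independent(p_a, p_b), q)

# Pseudo-additivity: the cross term -(q - 1) * S(A) * S(B) is required.
pseudo_sum = s_a + s_b - (q - 1.0) * s_a * s_b
print(s_ab, pseudo_sum, s_a + s_b)

# Renyi entropy recovered from the Tsallis entropy of the same distribution.
renyi_from_tsallis = math.log(1.0 - (q - 1.0) * s_b) / (1.0 - q)
print(renyi_entropy(p_b, q), renyi_from_tsallis)
```

With these numbers the joint entropy is 0.73, while the plain sum of the parts is 0.96; only the pseudo-additive combination reproduces the joint value.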
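The Landsberg-Vedral entropy.
These are also called “normalized Tsallis entropies”. Their functional form is the uniparametric family of entropies:
$latex S_q^{(LV)}=\dfrac{1}{q-1}\left(\dfrac{1}{\sum_i p_i^q}-1\right)$
They are related to Tsallis entropy through the equation:
$latex S_q^{(LV)}=\dfrac{S_q^{(T)}}{\sum_i p_i^q}$
This explains their alternate name, “normalized” Tsallis entropies. For two independent subsystems A and B, they satisfy a modified “pseudo-additivity” property:
$latex S_q^{(LV)}(A+B)=S_q^{(LV)}(A)+S_q^{(LV)}(B)+(q-1)S_q^{(LV)}(A)S_q^{(LV)}(B)$
That is, in the case of normalized Tsallis entropies the rôles of (q-1) and -(q-1) are exchanged, i.e., -(q-1) becomes (q-1) in the transition from Tsallis to Landsberg-Vedral entropy.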
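The sign flip in the cross term can be verified directly. This is a minimal Python sketch, assuming the standard Landsberg-Vedral form (1/sum_i p_i^q - 1)/(q-1) and the standard Tsallis form (1 - sum_i p_i^q)/(q-1); the distributions and q=2 are illustrative assumptions.

```python
import math
from itertools import product

def power_sum(p, q):
    """sum_i p_i^q over states with nonzero probability."""
    return sum(x ** q for x in p if x > 0)

def tsallis_entropy(p, q):
    """Tsallis entropy (1 - sum_i p_i^q) / (q - 1), for q != 1."""
    return (1.0 - power_sum(p, q)) / (q - 1.0)

def landsberg_vedral_entropy(p, q):
    """Normalized Tsallis entropy (1 / sum_i p_i^q - 1) / (q - 1), q != 1."""
    return (1.0 / power_sum(p, q) - 1.0) / (q - 1.0)

q = 2.0
p_a = [0.5, 0.5]       # illustrative subsystem A
p_b = [0.7, 0.2, 0.1]  # illustrative subsystem B
p_ab = [a * b for a, b in product(p_a, p_b)]  # independent joint system

lv_a = landsberg_vedral_entropy(p_a, q)
lv_b = landsberg_vedral_entropy(p_b, q)
lv_ab = landsberg_vedral_entropy(p_ab, q)

# Cross term enters with +(q - 1), the opposite sign to the Tsallis case.
print(lv_ab, lv_a + lv_b + (q - 1.0) * lv_a * lv_b)

# Normalization relation: Tsallis entropy divided by sum_i p_i^q.
print(lv_b, tsallis_entropy(p_b, q) / power_sum(p_b, q))
```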
The Abe entropy.
This kind of uniparametric entropy is very symmetric. It is also related to some issues in quantum groups and fractal (non-integer) analysis. They are defined by the q-1/q entropic functional:
$latex S_q^{(Abe)}=-\displaystyle{\sum_i \dfrac{p_i^q-p_i^{q^{-1}}}{q-q^{-1}}}$
Abe entropy can be obtained from Tsallis entropy as follows:
$latex S_q^{(Abe)}=\dfrac{(q-1)S_q^{(T)}-(q^{-1}-1)S_{1/q}^{(T)}}{q-q^{-1}}$
Abe entropy is also concave from the mathematical viewpoint, like the Tsallis entropy. It has some kind of “duality” or mirror symmetry, since it is invariant under the swap of q and 1/q.
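The mirror symmetry under q ↔ 1/q can be checked in a few lines. This is a minimal Python sketch, assuming the standard Abe form -sum_i (p_i^q - p_i^(1/q))/(q - 1/q) and the standard Tsallis form; the distribution and q=2 are arbitrary illustrative choices.

```python
import math

def tsallis_entropy(p, q):
    """Tsallis entropy (1 - sum_i p_i^q) / (q - 1), for q != 1."""
    return (1.0 - sum(x ** q for x in p if x > 0)) / (q - 1.0)

def abe_entropy(p, q):
    """Abe entropy -sum_i (p_i^q - p_i^(1/q)) / (q - 1/q), for q != 1."""
    inverse_q = 1.0 / q
    return -sum((x ** q - x ** inverse_q) / (q - inverse_q) for x in p if x > 0)

p = [0.5, 0.3, 0.2]  # illustrative distribution (an assumption)
q = 2.0

# Mirror symmetry: swapping q and 1/q leaves the entropy unchanged.
print(abe_entropy(p, q), abe_entropy(p, 1.0 / q))

# Abe entropy rebuilt from the Tsallis entropies at q and at 1/q.
from_tsallis = ((q - 1.0) * tsallis_entropy(p, q)
                - (1.0 / q - 1.0) * tsallis_entropy(p, 1.0 / q)) / (q - 1.0 / q)
print(from_tsallis)
```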
The Kaniadakis entropy.
Another well-known uniparametric family of entropies is the Kaniadakis entropy or $latex \kappa $-entropy. Related to relativistic kinematics, it has the functional form:
$latex S_\kappa=-\displaystyle{\sum_i \dfrac{p_i^{1+\kappa}-p_i^{1-\kappa}}{2\kappa}}$
In the limit $latex \kappa\rightarrow 0$ Kaniadakis entropy becomes Shannon entropy. Also, writing $latex q=1+\kappa$ and $latex q^{-1}\approx 1-\kappa$, Kaniadakis entropy becomes Abe entropy. Kaniadakis entropy, in addition to being concave, has further subtle properties, like being something called Lesche stable. See references below for details!
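The small-κ behaviour can be explored numerically. This is a minimal Python sketch, assuming the standard Kaniadakis form -sum_i (p_i^(1+κ) - p_i^(1-κ))/(2κ) with k=1; the function names and the distribution are illustrative assumptions.

```python
import math

def shannon_entropy(p):
    """Shannon-Gibbs-Boltzmann entropy -sum_i p_i log p_i (k = 1)."""
    return -sum(x * math.log(x) for x in p if x > 0)

def kaniadakis_entropy(p, kappa):
    """Kaniadakis entropy -sum_i (p_i^(1+k) - p_i^(1-k)) / (2k), k != 0."""
    return -sum((x ** (1.0 + kappa) - x ** (1.0 - kappa)) / (2.0 * kappa)
                for x in p if x > 0)

p = [0.5, 0.3, 0.2]  # illustrative distribution (an assumption)

# The values approach the Shannon entropy as kappa shrinks.
for kappa in (0.5, 0.1, 0.01, 0.0001):
    print(kappa, kaniadakis_entropy(p, kappa))
print("Shannon:", shannon_entropy(p))

# The functional is even in kappa: kappa and -kappa give the same value.
print(kaniadakis_entropy(p, 0.3), kaniadakis_entropy(p, -0.3))
```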
The Sharma-Mittal entropy.
Finally, we end our tour of entropy functionals with a biparametric family of entropies called Sharma-Mittal entropies. They have the following definition:
$latex S_{\kappa,r}=-\displaystyle{\sum_i p_i^{1+r}\left(\dfrac{p_i^{\kappa}-p_i^{-\kappa}}{2\kappa}\right)}$
It can be shown that this entropy species contains many entropies as special subtypes. For instance, Tsallis entropy is recovered if $latex r=\kappa$ and $latex q=1+2\kappa$. Kaniadakis entropy is obtained if we set r=0. Abe entropy is the subcase with $latex \kappa=\frac{1}{2}(q-q^{-1})$ and $latex r=\frac{1}{2}(q+q^{-1})-1$. Isn’t it wonderful? There is an alternative expression of Sharma-Mittal entropy, which reads:
$latex S_{q,r}=\dfrac{1}{1-r}\left[\left(\displaystyle{\sum_i p_i^q}\right)^{\frac{1-r}{1-q}}-1\right]$
In this functional form, SM entropy recovers Rényi entropy for $latex r\rightarrow 1$, and it becomes Tsallis entropy if $latex r=q$. Finally, when both parameters approach 1, i.e., $latex q,r\rightarrow 1$, we recover the classical Shannon-Gibbs-Boltzmann entropy. It is left as a nice exercise for the reader to relate the above two SM entropy functional forms and to derive Kaniadakis entropy, Abe entropy and Landsberg-Vedral entropy for some particular values of $latex q,r$ from the second definition of SM entropy.
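Some of these special cases can be checked numerically. This is a minimal Python sketch under stated assumptions: I take the two-parameter form -sum_i p_i^(1+r) (p_i^κ - p_i^(-κ))/(2κ) and the (q, r) form [(sum_i p_i^q)^((1-r)/(1-q)) - 1]/(1-r), which are the versions consistent with the special cases listed above, and the standard Tsallis, Rényi and Kaniadakis forms with k=1. All function names and parameter values are hypothetical illustrations.

```python
import math

def power_sum(p, exponent):
    """sum_i p_i^exponent over states with nonzero probability."""
    return sum(x ** exponent for x in p if x > 0)

def shannon_entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0)

def renyi_entropy(p, q):
    return math.log(power_sum(p, q)) / (1.0 - q)

def tsallis_entropy(p, q):
    return (1.0 - power_sum(p, q)) / (q - 1.0)

def kaniadakis_entropy(p, kappa):
    return -(power_sum(p, 1.0 + kappa) - power_sum(p, 1.0 - kappa)) / (2.0 * kappa)

def sharma_mittal_kappa_r(p, kappa, r):
    """First form: -sum_i p_i^(1+r) (p_i^kappa - p_i^(-kappa)) / (2 kappa)."""
    return -sum(x ** (1.0 + r) * (x ** kappa - x ** (-kappa)) / (2.0 * kappa)
                for x in p if x > 0)

def sharma_mittal_q_r(p, q, r):
    """Second form: [(sum_i p_i^q)^((1-r)/(1-q)) - 1] / (1 - r); q, r != 1."""
    return (power_sum(p, q) ** ((1.0 - r) / (1.0 - q)) - 1.0) / (1.0 - r)

p = [0.5, 0.3, 0.2]  # illustrative distribution (an assumption)
kappa = 0.3
near_one = 1.0 + 1e-6

# First form: r = 0 gives Kaniadakis; r = kappa gives Tsallis at q = 1 + 2 kappa.
print(sharma_mittal_kappa_r(p, kappa, 0.0), kaniadakis_entropy(p, kappa))
print(sharma_mittal_kappa_r(p, kappa, kappa), tsallis_entropy(p, 1.0 + 2.0 * kappa))

# Second form: r = q gives Tsallis, r -> 1 gives Renyi, q, r -> 1 gives Shannon.
print(sharma_mittal_q_r(p, 2.0, 2.0), tsallis_entropy(p, 2.0))
print(sharma_mittal_q_r(p, 2.0, near_one), renyi_entropy(p, 2.0))
print(sharma_mittal_q_r(p, near_one, near_one), shannon_entropy(p))
```

The limits are approached with parameters close to 1 rather than equal to 1, because both forms are singular expressions exactly at that point.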
However, entropy as a concept is still very mysterious. Indeed, it is not yet clear whether we have exhausted every functional form for entropy!
Non-extensive Statistical Mechanics and its applications are becoming more and more important and better known among theoretical physicists. It has a growing number of uses in High-Energy Physics, condensed matter, Quantum Information and Physics. The Nobel laureate Murray Gell-Mann has dedicated his last years of research to the world of non-extensive entropy. At least since his book The Quark and the Jaguar, Murray Gell-Mann has progressively moved into this fascinating topic. In parallel, it has also produced some other interesting approaches to Statistical Mechanics, such as the so-called “superstatistics”. Superstatistics is a kind of superposition of statistics that was invented by the physicist Christian Beck.
The latest research on the foundations of entropy functionals is related to something called “group entropies”, to the transformation group of superstatistics and to the rôle of group transformations on non-extensive entropies. It provides feedback between different branches of knowledge: group theory, number theory, Statistical Mechanics, and Quantum Statistics… And a connection with the classical Riemann zeta function even arises!
WHERE DO I LEARN ABOUT THIS STUFF and MORE if I am interested in it? You can study these topics in the following references:
The main entry is based on the following article by Christian Beck:
1) Generalized information and entropy measures in physics by Christian Beck. http://arXiv.org/abs/0902.1235v2
If you are interested in Murray Gell-Mann’s work on superstatistics and its group of transformations, here is the place to begin:
2) Generalized entropies and the transformation group of superstatistics. Rudolf Hanel, Stefan Thurner, Murray Gell-Mann.
If you want to see a really nice paper on group entropies and zeta functions, you can read this one by P. Tempesta:
3) Group entropies, correlation laws and zeta functions. http://arxiv.org/abs/1105.1935
C.Tsallis himself has a nice bibliography related to non-extensive entropies in his web page:
The “Khinchin axioms” of information/entropy functionals can be found, for instance, here:
5) Mathematical Foundations of Information Theory, A. Y. Khinchin. Dover. Pub.
Three questions to be answered by current and future scientists:
A) What is the most general entropy (entropy functional) that can be built from microscopic degrees of freedom? Are they classical/quantum, or is that distinction irrelevant for the ultimate substrate of reality?
B) Is every fundamental interaction related to some kind of entropy? How and why?
C) If entropy is “information loss” or “information” (only a minus sign makes the difference), and Quantum Mechanics says that Quantum Mechanics is about information (the current and modern interpretation of QM is based on it), is there some hidden relationship between mass-energy, information and entropy? Could it be used to build Relativity and QM from a common framework? Are QM and (General) Relativity then emergent, and likely the two sides of a more fundamental theory based on information only?
We live in the information era. Everything in your surroundings and environment is bound and related to some kind of “information processing”. Information can also be recorded and transmitted. Therefore, crudely put, information is something which is processed, stored and transmitted. Your computer is processing information right now, while you read these words. You also record and save your favourite pages and files on your computer. There are many tools to store digital information: HDs, CDs, DVDs, USBs,… And you can transmit that information to your buddies by e-mail, old-fashioned postcards and letters, MSN, phone,… You are even processing information with your brain and senses whenever you read this text. Thus, the idea of information is abstract and very general. The following diagram shows you how large and multidisciplinary information theory (IT) is:
As a teenager I enjoyed that old game in which you are told a message in your ear, and you pass it on to another person, who passes it on to another, and so on. Today, you can see it on a big scale on Twitter. Hey! The final message is generally very different from the original one! This simple example explains the other side of communication or information transmission: “noise”, although “efficiency” is also used. The storage or transmission of information is generally not completely efficient. You can lose information. Roughly speaking, every amount of information carries some quantity of noise that depends upon how you transmit the information (you can include noiseless transmission as a subtype of information process in which there is no lost information). Indeed, this is also why we age. Our DNA, which is continuously replicating itself thanks to the metabolism (ultimately possible thanks to solar light), gets progressively corrupted by free radicals and different “chemicals” that make our cellular replication more and more inefficient. Doesn’t this remind you of something you know from High-School? Yes! I am thinking about Thermodynamics. Indeed, the reason why Thermodynamics has been a main topic from the 19th century till now is simple: the quantity of energy is constant but its quality is not. Hence, we must be careful to build machines/engines that are energy-efficient for the available energy sources.
Before going into further details, you are likely wondering what information is! It is a set of symbols, signs or objects with some well-defined order. That is what information is. For instance, the word ORDER is giving you information. A random permutation of those letters, like ORRDE or OERRD, is generally meaningless. I said information was “something” but I didn’t go any further! Well, here is where Mathematics and Physics appear. Don’t run away! The beauty of Physics and Maths, or as I like to call them, Physmatics, is that concepts, intuitions and definitions, rigorously made, are good enough to satisfy your general requirements. Something IS a general object, or a set of objects with a certain order. It can be a certain DNA sequence coding how to produce a certain substance (e.g., a protein) our body needs. It can be a simple or complex message hidden in a highly advanced cryptographic code. It is whatever you are recording on your DVD (a new OS, a movie, your favourite music,…) or any other storage device. It can also be what your brain is learning how to do. That is “something”, or really whatever. You can say it is an obscure and weird definition. It really is! It can also be what electromagnetic waves transmit. Is it magic? Maybe! It has always seemed magic to me how you can browse the internet thanks to your Wi-Fi network! Of course, it is not magic. It is Science. Digital or analog information can be seen as large ordered strings of 1’s and 0’s, making “bits” of information. We will not discuss bits in this log. Future logs will…
Now, we have to introduce the concepts through some general ideas we have mentioned and that we know from High-School. Firstly, Thermodynamics. As everybody knows, and as you have experienced, energy cannot be completely turned into useful “work”. There is a quality in energy. Heat is the most degraded form of energy. When you start your car and burn fuel, you know that some of the energy is transformed into mechanical energy and a lot of energy is dissipated as heat into the atmosphere. I will not talk about the details of the different cycles engines can perform, but you can learn more about them in the references below. Symbolically, we can state that
$latex \mbox{AVAILABLE ENERGY}=\mbox{TOTAL ENERGY}-\mbox{UNAVAILABLE ENERGY}$
The great thing is that an analogous relation in information theory does exist! The relation is:
$latex \mbox{INFORMATION}=\mbox{SIGNAL}-\mbox{NOISE}$
Therefore, there is some subtle analogy, and likely some deeper idea, behind all this stuff. How do physicists play this game? It is easy. They invent a “thermodynamic potential”! A thermodynamic potential is a gadget (mathematically, a function) that relates a set of different thermodynamic variables. For all practical purposes, we will focus here on the so-called Gibbs “free energy”. It allows us to measure how useful a “chemical reaction” or “process” is. Moreover, it also gives a criterion of spontaneity for processes at constant pressure and temperature. But that is not important for the present discussion. Let’s define Gibbs free energy G as follows:
$latex G=H-TS$
where H is called enthalpy, T is the temperature and S is the entropy. You can identify these terms with the previous concepts. Can you see the similarity with the relations written above in terms of energy and communication concepts? Information is something like “free energy” (do you like freedom? Sure! You will love free energy!). Thus, noise is related to entropy and temperature, to randomness, i.e., to something that does not store “useful information”.
The Internet is also a source of information and noise. There are lots of good readings, but there is also spam. Spam is not really useful for you, is it? Recalling our thermodynamic analogy, since the first law of thermodynamics says that the “quantity of energy” is constant and the second law says something like “the quality of energy, in general, decreases”, we have to be careful with information/energy processing. You find that there are signals and noise out there. This is also important, for instance, in High Energy Physics or particle Physics. In a collision process, you have to distinguish which events are a “signal” from a generally big “background”.
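To see how the Gibbs combination of enthalpy, temperature and entropy works as a spontaneity criterion, here is a minimal Python sketch. It is only an illustration: the function name is hypothetical, and the numbers are approximate textbook values for the melting of ice (about 6.01 kJ/mol and 22.0 J/(mol K)), used here as assumptions.

```python
def gibbs_free_energy_change(delta_h, temperature, delta_s):
    """Delta G = Delta H - T * Delta S at constant pressure and temperature."""
    return delta_h - temperature * delta_s

# Approximate textbook values for melting ice (assumptions for illustration).
DELTA_H_FUSION = 6010.0  # J/mol
DELTA_S_FUSION = 22.0    # J/(mol K)

# Below 0 Celsius melting is not spontaneous (Delta G > 0); above it is.
for temperature in (263.15, 283.15):  # kelvin
    delta_g = gibbs_free_energy_change(DELTA_H_FUSION, temperature, DELTA_S_FUSION)
    verdict = "spontaneous" if delta_g < 0 else "not spontaneous"
    print(temperature, round(delta_g, 1), verdict)
```

The sign of the change flips close to 273 K: the entropy term, weighted by the temperature, decides whether the available (free) energy goes down.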
We will learn more about information(or entropy) and noise in my next log entries. Hopefully, my blog and microblog will become signals and not noise in the whole web.
Where could you get more information? 😀 You have some good ideas and suggestions in the following references:
1) Many years ago I found the analogy between Thermodynamics and Information in this cool book (easy to read even for non-experts):
Applied Chaos Theory: A paradigm for complexity. Ali Bulent Cambel (Author). Publisher: Academic Press; 1st edition (November 19, 1992)
Unfortunately, in those times, as an undergraduate student, my teachers were not very interested in this subject. What a pity!
2) There are some good books on Thermodynamics, I love (and fortunately own) these jewels:
Concepts in Thermal Physics, by Stephen Blundell, OUP. 2009.
A really self-contained book on Thermodynamics, Statistical Physics and topics not included in standard books. I really like it very much. It includes some issues related to the global warming and interesting Mathematics. I enjoy how it introduces polylogarithms in order to handle closed formulae for the Quantum Statistics.
Thermodynamics and Statistical Mechanics. (Dover Books on Physics & Chemistry). Peter T. Landsberg
A really old-fashioned and weird book. But it has some insights to make you think about the foundations of Thermodynamics.
Thermodynamics, Dover Pub. Enrico Fermi
This really tiny book is delicious. I learned a lot of fun stuff from it. Basic, concise and completely original, like Fermi himself. Are you afraid of him? Me too! E. Fermi was a really exceptional physicist and lecturer. Don’t lose the opportunity to read his lectures on Thermodynamics.
Mere Thermodynamics. Don S. Lemons. Johns Hopkins University Press.
Another great little book if you really need a crash course on Thermodynamics.
Introduction to Modern Statistical Physics: A Set of Lectures. Zaitsev, R.O. URSS publishings.
I have read and learned some extra stuff from URSS ed. books like this one. Russian books on Science are generally great and uncommon. And I enjoy some great but poorly known books written by generally unknown Russian scientists. Of course, you have surely heard of the Landau and Lifshitz books, but there are many other Russian authors who deserve your attention.
3) Information Theory books. Classical information theory books for your curious minds are
An Introduction to Information Theory: Symbols, Signals and Noise. Dover Pub. 2nd Revised ed. 1980. John. R. Pierce.
A really nice and basic book about classical Information Theory.
An introduction to Information Theory. Dover Books on Mathematics. F.M.Reza. Basic book for beginners.
The Mathematical Theory of Communication. Claude E. Shannon and W. Weaver. Univ. of Illinois Press.
A classical book by one of the fathers of information and communication theory.
Mathematical Foundations of Information Theory. Dover Books on Mathematics. A.Y.Khinchin.
A “must read” if you are interested in the mathematical foundations of IT.