LOG#113. Bohr’s legacy (I).

Dedicated to Niels Bohr

and his atomic model

(1913-2013)

1st part: A centenary model


This is a blog entry devoted to the memory of a great scientist, N. Bohr, one of the greatest masterminds of the 20th century and one of the fathers of the current Quantum model of atoms and molecules.

[Figure: Niels Bohr]

One century ago, Bohr pioneered the introduction of “quantization” rules into the atomic realm, 8 years after the epic Annus Mirabilis of A. Einstein (1905). Please, don’t forget that Einstein himself was the first physicist to apply Planck’s hypothesis to “serious” physics problems, explaining the photoelectric effect in a simple way with the aid of “quanta of light” (a.k.a. photons!). Therefore, it is not correct to assert that N. Bohr was the “first” quantum physicist. Indeed, Planck and Einstein were the first. That said, Bohr was the first to apply the quantum hypothesis to the atomic domain, changing forever the naive picture of atoms coming from “classical” physics. I decided that this year I would write something to honour the centenary of his atomic model (for the hydrogen atom).

I hope you will enjoy the next (short) thread…

Atomic mysteries

When I was young, and the Periodic Table (the ordered list or catalogue of elements) was explained and shown to me for the first time, I wondered how many elements there could be in Nature. Are there 103? 118? Maybe 212? 1000? 10^{23}? Or 10^{100}? \infty, Infinity?

We must remember what an atom is… Atom comes from a Greek word, \alpha\tau o\mu o\sigma, meaning “with no parts”. That is, an atom is (at least in its original conception) something that cannot be broken into smaller parts. Nice concept, isn’t it?

Greek philosophers wondered millennia ago whether there is a limit to the divisibility of matter, and whether there is an “ultimate principle” or “arche” ruling the whole Universe (remarkably, this is not very different from the questions that theoretical physicists are trying to solve even now, and will in the future!). Different schools and ideas arose. I am not very interested today in discussing Philosophy (even though it is interesting in its own way), so let me simplify the general mainstream ideas of several thousand years ago (!!!!):

1st. There is a well-defined ultimate “element”/”substance” and an ultimate “principle”. Matter is infinitely divisible. There are deep laws that govern the Universe and the physical Universe, in a cosmic harmony.

2nd. There is a well-defined ultimate “element”/”substance” and an ultimate “principle”. Matter is FINITELY divisible. There are deep laws that govern the Universe and the physical Universe, in a cosmic harmony.

3rd. There is no well-defined ultimate “element”/”substance” or ultimate principle. Chaos rules the Universe. Matter is infinitely divisible.

4th. There is no well-defined ultimate “element”/”substance” or ultimate principle. Chaos rules the Universe. Matter is finitely divisible.

Remark: Please, note the striking “similarity” with some of the (still) current problems of Physics. The existence of a Theory Of Everything (TOE) is the analogue of the quest for the first principle/fundamental element of the ancient Greek philosophers, or of any other philosophy all over the world. S.W. Hawking himself provided in his A Brief History of Time the following (3!) alternative approaches:

1st. There is not a TOE. There is only a chaotic pattern of regularities we call “physical laws”. But Nature itself is ultimately chaotic and the finite human mind can not understand its ultimate description.

2nd. There is no TOE. There is only an increasing number of theories, more and more precise and/or more and more accurate, without any limit. As we are finite beings, we can only try to guess better and better approximations to the ultimate reality (beyond our imagination), and the TOE cannot be reached in our whole lifetime or even in the whole lifetime of our species/civilization.

3rd. There is a well-defined TOE, with its own principles and consequences. We will find it if we are persistent enough and if we are clever enough. All the physical events could be derived from this theory. If we don’t find the “ultimate theory and its principles”, it is not because it is non-existent; it is only that we are not smart enough. Try harder (if you can…)!

If I added other (non-Greek) philosophies, I could create some other combinations but, as I told you above, I am not going to discuss Philosophy here, at least not more than necessary.

As you probably know, the atomic idea was mainly defended by Leucippus and Democritus, based on previous ideas by Anaxagoras. It is quite likely that Anaxagoras himself learned them from India (or even from China), but that is quite speculative… Well, the key point of the atomic idea is that you cannot keep smashing matter into smaller and smaller bits forever. Somewhere, the process of breaking down the fundamental constituents of matter must end… But where? And mostly, how can we find an atom or “see” what an atom looks like? Obviously, the ancient Greeks had no idea of how to do that: even knowing the “ground idea” of what an atom is, they had no experimental device to search for them. Thus, the atomic idea was put into the freezer until the 18th and 19th centuries, when the advances in experimental (and theoretical) Chemistry revived the concept and the whole theory. But Nature had many surprises ready for us… Let me continue this a bit later…

In the 19th century, with the discovery of the ponderal (weight) laws of Chemistry, Dalton and other chemists were stunned. Dalton was the man who brought atomism back into “real” theoretical Science, although the existence of atoms remained controversial until the 20th century. Dalton concluded that there was a unique atom for each element, using Lavoisier’s definition of an element as a substance that could not be analyzed into something simpler. Thus, Dalton arrived at an important conclusion:

“(…)Chemical analysis and synthesis go no farther than to the separation of particles one from another, and to their reunion. No new creation or destruction of matter is within the reach of chemical agency. We might as well attempt to introduce a new planet into the solar system, or to annihilate one already in existence, as to create or destroy a particle of hydrogen. All the changes we can produce, consist in separating particles that are in a state of cohesion or combination, and joining those that were previously at a distance(…)”.

The reality of atoms was a highly debated topic during the whole 19th century. It is worth remarking that it was Einstein himself (yes, him… again) who went further and, with his studies of Brownian motion, established their physical existence. It was a brilliant contribution to this area, even though, in time, he turned against (the interpretation of) Quantum Mechanics… But that is a different story, not to be told today.

Dalton’s atoms or Dalton atomic model was very simple.

[Figure: Dalton’s “A New System of Chemical Philosophy”]

Atoms had no parts and thus they were truly indivisible particles. However, the electrical studies of matter and the electromagnetic theory put this naive atomic model into doubt. After the study of cathode rays and the discovery of the electron by J.J. Thomson in 1897 (no, it is not J.J. Abrams), it became clear that atoms were NOT indivisible after all! Surprising, isn’t it? It is! Chemical atoms are NOT indivisible. They do have PARTS.

Thomson’s model, or “plum pudding” model, came to the rescue… Dalton believed that atoms were solid spheres, but J.J. Thomson was forced (due to the existence of the electron) to elaborate a “more complex” atomic model. He suggested that atoms were a spherical “fluid” mass with positive charge, and that electrons were placed inside that sphere as in a “plum pudding” cake. I have to admit that I was impressed by this model when I was 14… It seemed too ugly to me to be true, but anyway it has its virtues (it can explain the cathode ray experiment!).

[Figure: formation of cathode rays]

[Figure: the Thomson and Nagaoka atomic models]

The next big step was the Rutherford experiment! Thomson KNEW that electrons were smaller pieces inside the atom but, despite his efforts to find the positive particles (he pursued his own path here, since he had found the explanation of the canal rays), he could not find them (and they should be there, since atoms were electrically neutral particles). However, clever people were already investigating radioactivity and atomic structure with other ideas… Around 1909-1911, E. Rutherford, with the aid of his assistants, Geiger and Marsden, performed and interpreted the celebrated gold foil experiment.

[Figure: the Rutherford experiment]

To Rutherford’s surprise, his assistants and collaborators provided a shocking set of results. To explain all the observations, Rutherford drew the following set of hypotheses from the experiment:

1st. Atoms are mostly vacuum space.

2nd. Atoms have a dense zone of positive charge, much smaller than the whole atom. It is the atomic nucleus!

3rd. Nuclei have positive charge, and electrons have negative charge.

Rutherford did not know from the beginning how the charge was arranged and distributed inside the atom. He had to improve the analysis and perform additional experiments in order to propose his “solar” (planetary) atomic model and to get an estimate of the size of nuclei (about 1fm or 10^{-15}m). In fact, years before him, the Japanese physicist Nagaoka had proposed a “saturnian” atomic model with a similar look. It was unstable, though, due to the electric repulsion of the electronic “rings”, and it had been abandoned (previously there was even a “cubic” model of the atom, but it was also unable to explain every atomic experiment).

And this is the point where theory becomes “hard” again. Rutherford supposed that the electron orbits around the nucleus were circular (or almost circular), so that electrons experienced a centripetal force due to the electric attraction of the nucleus. Classical electromagnetic theory says that any accelerated charged particle (and you do have acceleration with a centripetal force) should emit electromagnetic waves, losing energy, and so electrons should fall onto the nucleus (indeed, the predicted time of the fall is ridiculously small). We do not observe that, so something is wrong with our “classical” picture of atoms and radiation (this was also hinted at by the photoelectric effect and by blackbody physics, so it was not too surprising, but it was challenging to find the rules and the “new mechanics” that explain the atomic stability of matter). Moreover, atomic spectra had been known to be discrete (not continuous) since the 19th century. Finding the new dynamics and its principles became one of the outstanding issues in the theoretical (and experimental) community.

The first scientist to determine a semiclassical but almost “quantum” and realistic atomic spectrum (for the simplest atom, hydrogen) was Niels Bohr. The Bohr model of the hydrogen atom is still explained at schools, not only due to its historical interest but also due to the no less important fact that it provides the right energy levels for the simplest atom, and that its equations are useful and valid from a quantitative viewpoint (Quantum Mechanics reproduces the Bohr formulae). Of course, the Bohr model does not explain the Stark effect, the Zeeman effect, the hyperfine structure of the hydrogen atom or some other important “quantum/relativistic” effects, but it is a really useful toy model and analytical machine to think about the challenges and limits of the Quantum Mechanics of atoms and molecules. The Bohr model cannot be applied to helium and the other elements in the Periodic Table (their structure is described by Quantum Mechanics), so it may look very boring but, as we will see, it has many secrets and unexpected surprises in its core…

Bohr model for the hydrogen atom

[Figures: Bohr transitions, the Bohr atom model, and Bohr and Balmer]

Bohr model hypotheses/postulates:

1st. Electrons describe circular orbits around the proton (in the hydrogen atom). The centripetal force is provided by the electrostatic force of the proton.

2nd. Electrons, while in “stationary” orbits with a fixed energy, do NOT radiate electromagnetic waves (note that this postulate goes against the classical theory of electromagnetism as it was known in the 19th century).

3rd. When a single electron passes from one energetic level to another, the energy transitions/energy differences satisfy the Planck law. That is, during level transitions, \Delta E=hf.

In summary, we have:

[Figures: the Bohr postulates and the hypotheses of the Bohr model]

Firstly, we begin with the equality between the electron-proton electrostatic force and the centripetal force in the atom:

\begin{pmatrix}\mbox{Centripetal}\\ \mbox{Force}\end{pmatrix}=\begin{pmatrix}\mbox{Electron-proton}\\ \mbox{electric force}\end{pmatrix}

Mathematically speaking, in this first postulate/ansatz we take \vert q_1\vert=\vert q_2\vert=e, where e=1\mbox{.}602\cdot 10^{-19}C is the elementary electric charge (the absolute value of both the electron charge and the proton charge) and m_e=9.11\cdot 10^{-31}kg is the electron mass:

F_c=\dfrac{m_ev^2}{R} and F_{el,C}=K_C\dfrac{q_1q_2}{R^2}=K_C\dfrac{e^2}{R^2} imply that

(1) \boxed{F_c=F_{el,C}}\leftrightarrow \boxed{\dfrac{m_ev^2}{R}=\dfrac{K_Ce^2}{R^2}}\leftrightarrow \boxed{v^2=\left(\dfrac{K_C}{m_e}\right)\left(\dfrac{e^2}{R}\right)}

Remark: Instead of having the electron mass, it would be more precise to use the “reduced” mass for this two body problem. The reduced mass is, by definition,

\mu=m_{red}=\dfrac{m_1m_2}{m_1+m_2}=\dfrac{m_em_p}{m_e+m_p}

However, it is easy to realize that the reduced mass is essentially the electron mass (since m_p\approx 1836m_e)

\mu=\dfrac{m_e}{1+\left(\dfrac{m_e}{m_p}\right)}\approx m_e(1-\dfrac{m_e}{m_p}+\ldots)=m_e+\mathcal{O} \left(\dfrac{m_e^2}{m_p}\right)
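As a quick numerical check of this remark, here is a minimal Python sketch comparing the exact reduced mass with the first-order expansion above. The mass values are CODATA-style numbers that are assumed here (the text only quotes m_e to three digits), and the variable names are just illustrative.

```python
# Reduced mass of the electron-proton system versus the bare electron mass.
# Assumed inputs: CODATA-style masses in kg (not quoted in the text above).
m_e = 9.1093837015e-31    # electron mass
m_p = 1.67262192369e-27   # proton mass

mu = m_e * m_p / (m_e + m_p)          # exact reduced mass
ratio_exact = mu / m_e                # mu / m_e
ratio_first_order = 1.0 - m_e / m_p   # first-order expansion in m_e/m_p

print(f"m_p/m_e             = {m_p / m_e:.2f}")
print(f"mu/m_e (exact)      = {ratio_exact:.8f}")
print(f"mu/m_e (1st order)  = {ratio_first_order:.8f}")
```

The correction is about 5 parts in 10^4, which is why using m_e instead of \mu is harmless at the precision of the numbers quoted in this post.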

Bohr’s second great idea was to quantize the angular momentum. Classically, angular momentum can take ANY value, but Bohr’s great intuition suggested that it could only take integer multiples of some fundamental constant, the (reduced) Planck constant. In fact, assuming stationary circular orbits, the quantization rule provides

(2) \boxed{m_ev(2\pi R)=nh} or, equivalently, \boxed{L=m_evR=n\dfrac{h}{2\pi}=n\hbar} with \hbar=\dfrac{h}{2\pi} and n=1,2,3,\ldots,\infty a positive integer.

Remark: h=6\mbox{.}63\cdot 10^{-34}Js and \hbar=\dfrac{h}{2\pi}=1\mbox{.}055\cdot 10^{-34}Js are the Planck constant and the reduced Planck constant, respectively.

From this quantization rule (2), we can easily get

vR=\left(\dfrac{n\hbar}{m_e}\right) and then v^2R^2=\left(\dfrac{n\hbar}{m_e}\right)^2

Thus, we have

R^2=\left(\dfrac{n\hbar}{m_e}\right)^2\dfrac{1}{v^2}

Using the result we got in (1) for the squared velocity of the electron in the circular orbit, we deduce the quantization rule for the orbits in the hydrogen atom according to Bohr’s hypotheses:

R^2=\left(\dfrac{n\hbar}{m_e}\right)^2\left(\dfrac{m_eR}{K_Ce^2}\right)

R=\dfrac{n^2\hbar^2}{m_e^2}\dfrac{m_e}{K_Ce^2}

(3) \boxed{R_n=R(n)=\left(\dfrac{\hbar^2}{m_eK_Ce^2}\right)n^2}\leftrightarrow \boxed{R_n=a_Bn^2}

where n=1,2,3,\ldots,\infty again and the Bohr radius a_B is defined to be

(4) \boxed{a_B=\dfrac{\hbar^2}{m_eK_Ce^2}}

Inserting values into (4), we obtain the celebrated value of the Bohr radius

a_B\approx 0\mbox{.}53\AA=53pm=5\mbox{.}3\cdot 10^{-11}m
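For readers who want to insert the values into (4) themselves, this is a minimal sketch in Python. The constants are CODATA-style values assumed here (the Coulomb constant K_C in particular is not quoted numerically in the text).

```python
import math

# Assumed SI values of the constants (CODATA-style).
K_C = 8.9875517923e9      # Coulomb constant, N m^2 C^-2
e = 1.602176634e-19       # elementary charge, C
m_e = 9.1093837015e-31    # electron mass, kg
h = 6.62607015e-34        # Planck constant, J s
hbar = h / (2 * math.pi)  # reduced Planck constant, J s

# Equation (4): a_B = hbar^2 / (m_e K_C e^2)
a_B = hbar**2 / (m_e * K_C * e**2)

print(f"a_B = {a_B:.4e} m = {a_B * 1e10:.3f} angstrom = {a_B * 1e12:.1f} pm")
```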

The third important consequence is the spectrum of energy levels of the hydrogen atom. To obtain the energy spectrum, there are two equivalent paths (in fact, they are the same): use the virial theorem or substitute (1) into the total energy of the electron-proton system. The total energy of the hydrogen atom can be written

E=\mbox{Kinetic Energy}+\mbox{(electrostatic) Potential Energy}

E=\dfrac{p^2}{2m_e}-\dfrac{K_Ce^2}{R}=\dfrac{m_ev^2}{2}-\dfrac{K_Ce^2}{R}

Substituting (1) into this, i.e. m_ev^2=K_Ce^2/R, we get exactly the expression expected from the virial theorem for a 1/r potential, that is, a 1/r^2 force (i.e. E=E_p/2):

E=\dfrac{m_ev^2}{2}-\dfrac{K_Ce^2}{R}=-K_C\dfrac{e^2}{2R}

(5) \boxed{E=-K_C\dfrac{e^2}{2R}}

Inserting into (5) the quantized values of the orbit, we deduce the famous and well-known formula for the spectrum of the hydrogen atom (known to Balmer and the spectroscopists at the end of the 19th century and the beginning of the 20th century):

(6) \boxed{E_n=E(n)=-\dfrac{m_eK_C^2e^4}{2\hbar^2n^2}=-\dfrac{m_e}{2}\left(\dfrac{K_Ce^2}{n\hbar}\right)^2=-\dfrac{\mbox{Ry}}{n^2}} \;\;\forall n=1,2,3,\ldots,\infty

and where we have defined the Rydberg (constant) as

(7) \boxed{\mbox{Ry}=\dfrac{m_e(K_Ce^2)^2}{2\hbar^2}=\dfrac{m_eK_C^2e^4}{2\hbar^2}=\dfrac{1}{2}\alpha^2 m_ec^2}

Its value is Ry=R_H=2.18\cdot 10^{-18}J=13\mbox{.}6eV. Here, the electromagnetic fine structure constant (alpha) is

\alpha=K_C\dfrac{e^2}{\hbar c}

and c is the speed of light. In fact, using the quantum relation

E=\dfrac{hc}{\lambda}

we can deduce that the Rydberg corresponds to a wavenumber (in the spectroscopic sense, k=1/\lambda)

k=1\mbox{.}097\cdot 10^{7}m^{-1}

or a frequency

f=\nu=3\mbox{.}29\cdot 10^{15}Hz

and a wavelength

\lambda =912\AA=91\mbox{.}2nm

Please, check it yourself! :D.
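One way to do that check is the following Python sketch, which evaluates (7) and the derived wavelength, frequency and wavenumber. The constants are assumed CODATA-style SI values, and the wavenumber is taken in the spectroscopic sense 1/\lambda.

```python
import math

# Assumed SI values of the constants (CODATA-style).
K_C = 8.9875517923e9      # Coulomb constant, N m^2 C^-2
e = 1.602176634e-19       # elementary charge, C
m_e = 9.1093837015e-31    # electron mass, kg
h = 6.62607015e-34        # Planck constant, J s
hbar = h / (2 * math.pi)  # reduced Planck constant, J s
c = 299792458.0           # speed of light, m/s

# Equation (7), in its two equivalent forms.
Ry = m_e * (K_C * e**2) ** 2 / (2 * hbar**2)
alpha = K_C * e**2 / (hbar * c)        # fine structure constant
Ry_alpha = 0.5 * alpha**2 * m_e * c**2

# Quantities associated with a photon of energy Ry, using E = h c / lambda.
wavelength = h * c / Ry                # m
frequency = Ry / h                     # Hz
wavenumber = 1.0 / wavelength          # m^-1 (spectroscopic convention)

print(f"Ry       = {Ry:.4e} J = {Ry / e:.3f} eV")
print(f"1/alpha  = {1 / alpha:.3f}")
print(f"lambda   = {wavelength * 1e9:.2f} nm")
print(f"f        = {frequency:.3e} Hz")
print(f"1/lambda = {wavenumber:.4e} m^-1")
```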

The above results allowed Bohr to explain the spectral series of the hydrogen atom. He won the Nobel Prize due to this wonderful achievement…
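To see how the spectral series come out of (6) together with the third postulate \Delta E=hf, here is a small Python sketch. The numerical values of Ry and of hc in eV nm are assumed, rounded constants, and the function name is only illustrative.

```python
# Wavelengths of hydrogen transitions from E_n = -Ry / n^2 and Delta E = h c / lambda.
RY_EV = 13.6057      # Rydberg energy in eV (assumed rounded value)
HC_EV_NM = 1239.84   # h*c in eV nm (assumed rounded value)

def transition_wavelength_nm(n_low, n_high):
    """Wavelength (nm) of the photon emitted when the electron drops from n_high to n_low."""
    delta_e = RY_EV * (1.0 / n_low**2 - 1.0 / n_high**2)  # photon energy in eV
    return HC_EV_NM / delta_e

# Lyman series (n_low = 1) and Balmer series (n_low = 2), first three lines each.
for name, n_low in (("Lyman", 1), ("Balmer", 2)):
    lines = [transition_wavelength_nm(n_low, n_high) for n_high in range(n_low + 1, n_low + 4)]
    print(name, [f"{wl:.1f} nm" for wl in lines])
```

The first Balmer line comes out near 656 nm, the red line of hydrogen that Balmer’s empirical formula already described.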

Hydrogenic atoms

(and positronium, muonium,…)

In fact, it is straightforward to extend all these results to “hydrogenic” (“hydrogenoid”) atoms, i.e., to atoms with only a single electron BUT a nucleus with charge equal to Ze, where Z>1 is an integer (the atomic number)! The easiest way to obtain the results is not to repeat the deduction but to rescale the nuclear charge, i.e., you plug in q_2=Ze instead of q_2=e, so that the product q_1q_2=e^2 becomes Ze^2 (be careful to make the right rescaling in the formulae: it is e^2\longrightarrow Ze^2, not e\longrightarrow Ze in every factor). The final result for the radius and the energy spectrum is as follows:

A) From R_n=\left(\dfrac{\hbar^2}{m_eK_Ce^2}\right)n^2, with e^2\longrightarrow Ze^2, you get

(8) \boxed{\bar{R}_n=\bar{R}(n)=\dfrac{\hbar^2}{m_eK_CZe^2}n^2=\dfrac{a_Bn^2}{Z}}

B) From E_n=-m_e\dfrac{(K_Ce^2)^2}{2\hbar^2n^2}, with the rescaling e^2\longrightarrow Ze^2, you get

(9) \boxed{\bar{E}_n=\bar{E}(n)=-m_e\dfrac{Z^2(K_Ce^2)^2}{2\hbar^2n^2}=-\dfrac{Z^2\alpha^2m_ec^2}{2n^2}=-\dfrac{Z^2Ry}{n^2}}

Therefore, the consequence of the rescaling of the nuclear charge is that energy levels are “enlarged” by a factor Z^2 and that the orbits are “squeezed” or “contracted” by a factor 1/Z.
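The scaling laws (8) and (9) can be sketched in a few lines of Python. The values of a_B and Ry are assumed rounded constants, and the function names are illustrative.

```python
# Hydrogenic scaling: R_n = a_B n^2 / Z and E_n = -Z^2 Ry / n^2.
A_B_ANGSTROM = 0.529177   # Bohr radius in angstrom (assumed rounded value)
RY_EV = 13.6057           # Rydberg energy in eV (assumed rounded value)

def orbit_radius(n, Z=1):
    """Radius of the n-th Bohr orbit, in angstrom, for nuclear charge Z."""
    return A_B_ANGSTROM * n**2 / Z

def energy_level(n, Z=1):
    """Energy of the n-th level, in eV, for nuclear charge Z."""
    return -(Z**2) * RY_EV / n**2

# Example: hydrogen (Z=1) versus the helium ion He+ (Z=2).
for Z in (1, 2):
    print(f"Z={Z}: R_1 = {orbit_radius(1, Z):.4f} angstrom, E_1 = {energy_level(1, Z):.2f} eV")
```

Note that, in this model, the n=2 level of He+ has the same energy as the n=1 level of hydrogen, since Z^2/n^2=1 in both cases.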

Exercise: Can you obtain the energy levels and the radius for positronium (an electron-positron system, instead of an electron-proton one)? What happens with muonium (a strange substance formed by an electron orbiting an antimuon)? And the muonic atom (a muon orbiting a proton)? And a muon orbiting an antimuon? And the tau particle orbiting an antitau, or the electron orbiting an antitau, or a tau orbiting a proton (supposing that it were possible, of course, since the tau particle is unstable)? Calculate the “Bohr radius” and the “Rydberg” constant for positronium, muonium, the muonic atom (or the muon-antimuon atom) and tauonium (or the tau-antitau atom). Hint: think about the reduced mass for positronium and muonium, then make a good mass/energy or radius rescaling.
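If you want to check your answers to the exercise (spoiler alert!), the following Python sketch applies the hint: it replaces m_e by the reduced mass \mu, so that the radius scales as m_e/\mu and the Rydberg as \mu/m_e. The mass ratios and the rounded values of a_B and Ry are assumptions made here, and only two of the systems are worked out.

```python
# Exotic "Bohr atoms" via the reduced-mass rescaling suggested in the hint.
# All masses are in units of the electron mass (assumed rounded ratios).
M_ELECTRON = 1.0
M_MUON = 206.76828
M_PROTON = 1836.15267

A_B_ANGSTROM = 0.529177   # Bohr radius for mass m_e, in angstrom (assumed)
RY_EV = 13.6057           # Rydberg for mass m_e, in eV (assumed)

def reduced_mass(m1, m2):
    """Reduced mass of a two-body system, in units of m_e."""
    return m1 * m2 / (m1 + m2)

def bohr_radius(mu):
    """'Bohr radius' in angstrom when m_e is replaced by mu (in units of m_e)."""
    return A_B_ANGSTROM / mu

def rydberg(mu):
    """'Rydberg' in eV when m_e is replaced by mu (in units of m_e)."""
    return RY_EV * mu

systems = {
    "positronium": reduced_mass(M_ELECTRON, M_ELECTRON),
    "muonic hydrogen": reduced_mass(M_MUON, M_PROTON),
}
for name, mu in systems.items():
    print(f"{name}: mu = {mu:.3f} m_e, radius = {bohr_radius(mu):.5f} angstrom, Ry = {rydberg(mu):.1f} eV")
```

The remaining systems of the exercise follow by adding the corresponding masses to the dictionary.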

Now, we can also calculate the velocity of an electron in the quantized orbits for the Bohr atom and the hydrogenic atom. Using the quantization rule (2), together with (3) and (8),

m_evR=n\hbar\leftrightarrow vR=\dfrac{n\hbar}{m_e}\leftrightarrow v^2R^2=\dfrac{n^2\hbar^2}{m_e^2}

or

v^2=\left(\dfrac{n\hbar}{m_e}\right)^2\dfrac{1}{R^2}

and inserting the quantized values of the orbit radius

v_n^2=\dfrac{K_Ce^2}{m_eR_n}=\dfrac{m_e(K_Ce^2)^2}{m_en^2\hbar^2}

so, for the Bohr atom (hydrogen)

(10) \boxed{v_n=v(n)=\dfrac{K_Ce^2}{\hbar n}=\dfrac{\alpha c}{n}}

In the case of hydrogenic atoms, the rescaling of the electric charge yields

(11) \boxed{\bar{v}_n=\bar {v}(n)=\dfrac{ZK_Ce^2}{\hbar n}=\dfrac{Z\alpha c}{n}}

so the hydrogenic atoms have an “enlarged” electron velocity in the orbits, by a factor of Z.
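A short Python sketch of (10) and (11) gives a feeling for the numbers. The value of \alpha is an assumed rounded constant, and the choice of example ions (hydrogen, He+ and a one-electron uranium ion) is just illustrative.

```python
# Electron speed in the n-th Bohr orbit of a hydrogenic atom: v_n = Z alpha c / n.
ALPHA = 1.0 / 137.035999   # fine structure constant (assumed rounded value)
C = 299792458.0            # speed of light, m/s

def orbital_speed(n, Z=1):
    """Orbital speed in m/s for principal quantum number n and nuclear charge Z."""
    return Z * ALPHA * C / n

for Z in (1, 2, 92):
    v1 = orbital_speed(1, Z)
    print(f"Z={Z:3d}: v_1 = {v1:.3e} m/s = {v1 / C:.4f} c")
```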

The feynmanium

This result for velocities is very interesting. Suppose we consider the fundamental level n=1 (or the orbital 1s in Quantum Mechanics, since, magically or not, Quantum Mechanics reproduces the results for the Bohr atom and the hydrogenic atoms we have seen here, plus other effects we will not discuss today related to spin and some energy splittings for perturbed atoms). Then, the last formula yields, in the hydrogenic case,

v_1=Z\alpha c

Furthermore, suppose now in addition that we have some “superheavy” (hydrogenic) atom with Z>137 (note that \alpha\approx 1/137 at ordinary energies), say Z=138 or greater. Then, the electron moves faster than the speed of light!!!!! That is, for hydrogenic atoms with Z>137, and considering the fundamental level, the electron would move with v>c. This fact is “surprising”. The element with Z=137 is called untriseptium (Uts) by the IUPAC rules, but it is often called the feynmanium (Fy), since R.P. Feynman often remarked on the importance of this result and mystery. Of course, Special Relativity forbids this option. Therefore, something is wrong or Z=137 is the last element allowed by the Quantum Rules (and/or the Bohr atom). Obviously, we could claim that this result is “wrong” since we have not considered the relativistic quantum corrections or we have not made a good relativistic treatment of this system. It is not as simple as you might think or imagine, since using a “naive” relativistic treatment, e.g., using the Dirac equation, we obtain for the fundamental level of the hydrogenic atom the energy

(12) \boxed{E_1=E=m_ec^2\sqrt{1-Z^2\alpha^2}}

This result can be obtained, setting n=1 and \vert k\vert=1, from the Dirac equation spectrum for the hydrogenic atom (in a Coulomb potential):

(13) \boxed{E_{n,k;Z,\alpha}=E(n,k;Z,\alpha)=mc^2\left[1+\left(\dfrac{Z\alpha}{n-\vert k\vert+\sqrt{k^2-Z^2\alpha^2}}\right)^2\right]^{-1/2}}

where n=N+\vert k\vert is a positive integer (with N a nonnegative integer) and k^2=(j+\frac{1}{2})^2. Putting numbers into these formulae, we get

[Figure: first levels of the hydrogen atom spectrum from the Dirac equation]

or equivalently (I add comments from the slides)

[Figure: first levels of the hydrogenic atom from the Dirac equation]

If you plug Z=138 or more into the above equation from the Dirac spectrum, you obtain an imaginary value of the energy, and thus an oscillating (unbound) system! Therefore, the problem for atoms with high Z persists even after taking the relativistic corrections into account! What is the solution? Nobody is sure. Greiner et al. suggest that, taking into account the finite (extended) size of the nuclei, the problem is “solved” up to Z\approx 172. Beyond that, i.e., with Z>172, you cannot rule out that quantum fluctuations of strong fields introduce vacuum pair creation effects that make the nuclei, and thus the atoms, unstable at those high values of Z. Some people believe that the issues arise even earlier, around Z=150, or even that strong field effects can make atoms below Z=137 non-existent. That is why the search for superheavy elements (SHE) is interesting not only from the chemical viewpoint but also from the fundamental physics viewpoint: it challenges our understanding of Quantum Mechanics and Special Relativity (and their combination!!!!).
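The breakdown at Z=138 can be seen numerically with a minimal Python sketch of (12) and (13), for a point-like nucleus only. The value of \alpha is an assumed rounded constant, energies are given in units of m_ec^2, and the function names are illustrative. The complex square root from the standard `cmath` module is used so that the imaginary energy shows up explicitly instead of raising an error.

```python
import cmath
import math

ALPHA = 1.0 / 137.035999   # fine structure constant (assumed rounded value)

def bohr_speed_ratio(Z, n=1):
    """v/c in the Bohr model, from v_n = Z alpha c / n."""
    return Z * ALPHA / n

def dirac_ground_energy(Z):
    """Equation (12) in units of m_e c^2; complex-valued so that Z alpha > 1 is visible."""
    return cmath.sqrt(1.0 - (Z * ALPHA) ** 2)

def dirac_energy(n, k, Z):
    """Equation (13) in units of m c^2; only valid (real) while Z alpha < |k|."""
    gamma = math.sqrt(k**2 - (Z * ALPHA) ** 2)
    return (1.0 + (Z * ALPHA / (n - abs(k) + gamma)) ** 2) ** -0.5

for Z in (1, 92, 137, 138):
    print(f"Z={Z:3d}: Bohr v_1/c = {bohr_speed_ratio(Z):.4f}, Dirac E_1/(m c^2) = {dirac_ground_energy(Z):.4f}")
```

For Z up to 137 the ground state energy is real and agrees with (13) evaluated at n=1, \vert k\vert=1; from Z=138 on, the square root becomes purely imaginary.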

Is the feynmanium (Z=137) the last element? This hypothetical element and other superheavy elements (SHE) seem to hint at the end of the Periodic Table. Is it true? Options:

1st. The feynmanium (Fy) or untriseptium (Uts) is the last element of the Periodic Table.

2nd. Greiner et al. limit around Z=172. References:

(i) B Fricke, W Greiner and J T Waber, Theor. Chim. Acta, 1971, 21, 235.

(ii) W Greiner and J Reinhardt, Quantum Electrodynamics, 4th edn (Springer, Berlin, 2009).

3rd. Other predictions of an end to the periodic table include Z = 128 (John Emsley) and Z = 155 (Albert Khazan). Even Seaborg, from his knowledge and prediction of an island of stability around Z,N= 126, 184,\ldots , left this question open to interpretation and experimental search!

4th. There is no end of the Periodic Table. According to Greiner et al., even though superheavy nuclei pose a challenge for Quantum Mechanics and Special Relativity, there are always electrons in the orbitals (a condition for an element to be a well-defined object), so there is no end of the Periodic Table (even when there is some probability for a positron-electron pair to be produced near a superheavy nucleus, the presence of electrons does not allow for it; but strong field effects are important there, and it would be great to produce these elements and to know their properties, both quantum and relativistic!). Therefore, it would be very, very interesting to test the superheavy element “zone” of the Periodic Table, since it is a place where (strong) quantum effects and (non-negligible) relativistic effects both matter. Then, if both theories are right, superheavy elements are a beautiful and wonderful arena to understand how to combine the two greatest theories and (unfinished?) revolutions of the 20th century. What an awesome role for the “elementary” and “fundamental” superheavy (composite) elements!

Probably, there is no limit to the number of (chemical) elements in our Universe… But we DO NOT KNOW!

In conclusion: what will happen for superheavy elements with Z>173 (or Z>126, 128, 137, etc.) remains unresolved with our current knowledge. And it is one of the last great mysteries in theoretical Chemistry!

More about the fine structure constant, the Sommerfeld corrections and the Dirac equation+QED (Quantum ElectroDynamics) corrections to the hydrogen spectrum, in slides (think about it yourself!):

[Slides: the Bohr-Sommerfeld idea and the one-electron spectrum from the Dirac equation]

Final remarks (for experts only): Some comments about the self-adjointness of the Dirac operator for high values of Z in Coulomb potentials. It is a well-known fact that the Dirac operator for the hydrogenic problem is essentially self-adjoint if Z<119. Therefore, it is valid for all the currently known elements (circa June 2013, every element in the 7th period of the Periodic Table has been created, so we know that chemical elements exist at least up to Z=118, and searches for superheavy elements beyond that Z have given negative results until now). However, for 119\leq Z\leq 137 any “self-adjoint extension” requires a precise physical meaning. A good idea could be to require that the expectation value of every component of the Hamiltonian is finite in the selected basis. Indeed, the solution to the Coulomb potential for the hydrogenic atom using the Dirac equation makes use of hypergeometric functions that are well-posed for any Z\leq 137. If Z is greater than that critical value, we face the oscillating energy problem we discussed above. So, we have to consider the effect of the finite size of the nucleus and/or handle relativistic corrections more carefully. It is important to realize this in order to understand the main idea of all this crazy stuff. This means that the s states start to be destroyed above Z = 137, and that the p states begin to be destroyed above Z = 274. Note that this differs from the result of the Klein-Gordon equation, which predicts s states being destroyed above Z = 68 and p states destroyed above Z = 82. In summary, the superheavy elements are interesting because they challenge our knowledge of both Quantum Mechanics and Special Relativity. What a wonderful (final) fate for the chemical elements: the superheavy elements will test whether the “marriage” between Quantum Mechanics and Special Relativity goes further or ends in divorce!

Epilogue: What do you think about the following questions? This is a test for you, eager readers…

1) Is there an ultimate element?

2) Is there a theory of everything (TOE)?

3) Is there an ultimate chemical element?

4) Is there a single “ultimate” principle?

5) How many elements does the Periodic Table have?

6) Is the feynmanium the last element?

7) Are Quantum Mechanics and Special Relativity consistent with each other?

8) Is Quantum Mechanics a fundamental and “ultimate” theory for atoms and molecules?

9) Is Special Relativity a fundamental and “ultimate” theory for “quick” particles?

10) Are the atomic shells and atomic structure completely explained by QM and SR?

11) Are the nuclei and their shell structure completely explained by QM and SR?

12) Do you think all this stuff is somehow important and relevant for Physics or Chemistry (or even for Mathematics)?

13) Will we find superheavy elements in the next decade?

14) Will we find superheavy elements this century?

15) Will we find that there are some superheavy elements stable in the island of stability (Seaborg) with amazing properties and interesting applications?

16) Did you like/enjoy this post?

17) When you were a teenager, how many chemical elements did you know? How many chemical elements were known?

18) Did you learn/memorize the whole Periodic Table? In the case you did not, would you?

19) What is your favourite chemical element?

20) Did you know that every element in the 7th period of the Periodic Table has been established to exist, but the elements E113, E115, E117 and E118 are not named yet (circa 30th June 2013) and they keep their systematic (IUPAC) names ununtrium, ununpentium, ununseptium and ununoctium? By the way, the last named elements were copernicium (E112, Cn), flerovium (E114, Fl) and livermorium (E116, Lv)…


LOG#080. A Bug-Rivet “paradox”.

Imagine that an idealised bug of negligible dimensions is hiding at the end of a hole of length L. A rivet has a shaft length of a<L.

Clearly the bug is “safe” when the rivet head is flush with the (very resilient) surface. The problem arises as follows. Consider what happens when the rivet slams into the surface at a speed of v=\beta c, where c is the speed of light and 0<\beta<1. One of the essences of the special theory of relativity is that objects moving relative to our frame of reference are shortened in the direction of motion by a factor \gamma^{-1}=\sqrt{1-\beta^2}, where \gamma is generally called the Lorentz (dilation) factor, as readers of this blog already know. Thus, from the point of view (frame of reference) of the bug, the rivet shaft is even shorter and therefore the bug should continue to be safe, no matter how fast the rivet is moving.


Apparently, we have:

a_{app}=\dfrac{a}{\gamma}=a\sqrt{1-\beta^2}

Remark: this idea assumes that both objects are ideally rigid! We will return to this “fact” later.

From the frame of reference of the rivet, the rivet is stationary and unchanged, but the hole is moving fast and is shortened by the Lorentz contraction to

L_{app}=\dfrac{L}{\gamma}=L\sqrt{1-\beta^2}


If the approach speed is fast enough, so that L_{app}<a, then the end of the hole slams into the tip of the rivet before the surface can reach the head of the rivet. The bug is squashed! This is the “paradox”: is the bug squashed or not?

There are many good sources for this paradox (a relative of the pole-barn paradox), such as:

1) http://en.wikipedia.org/wiki/Wikipedia:Reference_desk/Archives/Science/2006_October_19#Bug_Rivet_Paradox

2) A nice animation can be found here: http://math.ucr.edu/~jdp/Relativity/Bug_Rivet.html

In this blog post we are going to solve this “paradox” in the framework of special relativity.

SOLUTION

One of the consequences of special relativity is that two events that are simultaneous in one frame of reference are no longer simultaneous in other frames of reference. Perfectly rigid objects are impossible.

In the frame of reference of the bug, the entire rivet cannot come to a complete stop all at the same instant. Information cannot travel faster than the speed of light. It takes time for knowledge that the rivet head has slammed into the surface to travel down the shaft of the rivet. Until each part of the shaft receives the information that the rivet head has stopped, that part keeps going at speed v=\beta c. The information proceeds down the shaft at speed c while the tip continues to move at speed v=\beta c.

bugrivet4

The tip cannot stop until a time

T_1=\dfrac{\dfrac{a}{\gamma}}{c-\beta c}=\dfrac{a}{\gamma c (1-\beta)}

after the head has stopped. During that time the tip travels a distance D_1=vT_1. The bug will be squashed if

vT_1>L-\dfrac{a}{\gamma}

This implies that

\dfrac{\beta c a}{\gamma c (1-\beta)}>L-\dfrac{a}{\gamma}\leftrightarrow \dfrac{a}{\gamma}\left(\dfrac{\beta}{1-\beta}+1\right) >L\leftrightarrow \dfrac{a}{\gamma}\left(\dfrac{\beta+1-\beta}{1-\beta}\right) >L

From \gamma^{-1}=\sqrt{1-\beta^2} we can calculate that

\dfrac{1}{\gamma (1-\beta)}=\dfrac{\sqrt{1-\beta^2}}{1-\beta}=\dfrac{\sqrt{(1+\beta)(1-\beta)}}{1-\beta}=\sqrt{\dfrac{1+\beta}{1-\beta}}

The bug will be squashed if the following condition holds

a\sqrt{\dfrac{1+\beta}{1-\beta}}>L\leftrightarrow \dfrac{a}{L}>\sqrt{\dfrac{1-\beta}{1+\beta}}\leftrightarrow \left(\dfrac{a}{L}\right)^2> \dfrac{1-\beta}{1+\beta}

or equivalently, after some algebraic manipulations, the bug will be squashed if:

\beta>\dfrac{1-\left(\dfrac{a}{L}\right)^2}{1+\left(\dfrac{a}{L}\right)^2}

Conclusion (in bug’s reference frame): the bug will definitely be squashed when v>v_{min}=\beta_{min}c, where

\boxed{\beta_{min}=\dfrac{1-\left(\dfrac{a}{L}\right)^2}{1+\left(\dfrac{a}{L}\right)^2}}

Check: It can be verified that the limits \displaystyle{\lim_{a\rightarrow 0^+}\beta_{min}=1^{-}} and \displaystyle{\lim_{a\rightarrow L^-}\beta_{min}=0^{+}} are valid and physically meaningful.

Note that the impact of the rivet head always happens before the bug is squashed.

In the frame of reference of the rivet, the bug is definitely squashed whenever \dfrac{L}{\gamma}<a.

bugrivet5

Then,

L\sqrt{1-\beta^2}<a\leftrightarrow 1-\beta^2<\left(\dfrac{a}{L}\right)^2

or equivalently

\beta>\sqrt{1-\left(\dfrac{a}{L}\right)^2}

or

\beta>\beta_{min2} where \boxed{\beta_{min2}=\sqrt{1-\left(\dfrac{a}{L}\right)^2}}

In this case the bug is squashed before the impact of the surface on the rivet head. Note that the threshold \beta_{min2} obtained in this last equation is a higher speed than \beta_{min}.

Conclusion (in rivet’s reference frame): The entire surface cannot come to an abrupt stop at the same instant. It takes time for the information about the impact of the rivet tip on the end of the hole to reach the surface that is rushing towards the rivet head. Let us now examine the case where the speed is not high enough for the Lorentz-contracted hole to be shorter than the rivet shaft in the frame of reference of the rivet. Now the observers agree that the impact of the rivet head happens first. When the surface slams into contact with the head of the rivet, it takes time for information about that impact to travel down to the end of the hole. During this time the hole continues to move towards the tip of the rivet.

bugrivet6

The time it takes for the propagating information to reach the tip of the stationary rivet is

T_2=\dfrac{a}{c}

during which time the bug moves a distance D_2=vT_2=\dfrac{\beta c a}{c}=\beta a

In the rivet’s reference frame, therefore, the bug is squashed if the following condition holds

vT_2>\dfrac{L}{\gamma}-a\leftrightarrow \beta a>\dfrac{L}{\gamma}-a\leftrightarrow (1+\beta)a>\dfrac{L}{\gamma}\leftrightarrow \dfrac{a}{L}>\dfrac{1}{1+\beta}\sqrt{1-\beta^2}

and then

\dfrac{\sqrt{(1+\beta)(1-\beta)}}{1+\beta}<\dfrac{a}{L}\leftrightarrow \sqrt{\dfrac{1-\beta}{1+\beta}}<\dfrac{a}{L}

and from this equation, we get the same minimum speed that guarantees the squashing of the bug as in the frame of reference of the bug! That is:

\boxed{\beta_{min}=\dfrac{1-\left(\dfrac{a}{L}\right)^2}{1+\left(\dfrac{a}{L}\right)^2}}
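As a quick numerical cross-check of the two derivations, here is a minimal Python sketch (it is not part of the original argument). It works in units with c=1; the function names and the sample values a=0.5, L=1 are assumptions chosen only for illustration.

```python
import math

def beta_min(a, L):
    """Threshold speed (in units of c) above which the bug is squashed."""
    r = a / L
    return (1 - r**2) / (1 + r**2)

def squashed_in_bug_frame(beta, a, L):
    """Bug frame: the contracted shaft a/gamma keeps moving for a time T_1."""
    gamma = 1 / math.sqrt(1 - beta**2)
    t1 = (a / gamma) / (1 - beta)      # time for the stop signal to reach the tip (c = 1)
    return beta * t1 > L - a / gamma   # tip travel versus remaining gap

def squashed_in_rivet_frame(beta, a, L):
    """Rivet frame: the contracted hole L/gamma keeps moving for a time T_2."""
    gamma = 1 / math.sqrt(1 - beta**2)
    t2 = a                             # signal travel time along the rivet at rest (c = 1)
    return beta * t2 > L / gamma - a   # hole-end travel versus remaining gap

a, L = 0.5, 1.0                        # illustrative values with a < L
threshold = beta_min(a, L)
for beta in (threshold - 0.01, threshold + 0.01):
    print(beta, squashed_in_bug_frame(beta, a, L), squashed_in_rivet_frame(beta, a, L))
```

Just below the threshold both functions report a safe bug, and just above it both report a squashed bug, so the two frames give the same verdict.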

Note that observers travelling with each of the two frames of reference (bug and rivet) agree that the bug is squashed IF \beta>\beta_{min}, and that resolves the “paradox”. They also agree that the impact of rivet head on surface happens before the bug is squashed, provided that the following condition is satisfied:

\beta_{min}<\beta<\beta_{min2}

i.e., both observers find that the rivet head hits the surface before the bug is squashed whenever

\boxed{\dfrac{1-\left(\dfrac{a}{L}\right)^2}{1+\left(\dfrac{a}{L}\right)^2}<\beta<\sqrt{1-\left(\dfrac{a}{L}\right)^2}}

Otherwise, they disagree on which event happens first. This is the case when

\beta>\beta_{min2}=\sqrt{1-\left(\dfrac{a}{L}\right)^2}

For speeds this high, the observer in the bug’s frame of reference still deduces that the rivet-head impact happens first, but the other observer deduces that the bug is squashed first. This is consistent with the relativity of simultaneity! At the critical speed, when \beta=\beta_c=\beta_{min2}, the two events are simultaneous in the frame of the rivet (the rivet fits perfectly in the shortened hole), but they are not simultaneous in the other frame of reference.
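The three speed regimes discussed above can be summarized in a few lines of Python. This is only a sketch: the function name and the returned labels are made up for illustration, and units with c=1 are assumed.

```python
import math

def regime(beta, a, L):
    """Classify a speed beta (units of c) for a rivet of length a and a hole of depth L."""
    r = a / L
    b_min = (1 - r**2) / (1 + r**2)     # below this speed the bug survives
    b_min2 = math.sqrt(1 - r**2)        # above this speed the event order is frame dependent
    if beta <= b_min:
        return "safe"
    if beta < b_min2:
        return "squashed, head impact first in both frames"
    # at beta == b_min2 the two events are simultaneous in the rivet frame
    return "squashed, order of events depends on the frame"

for beta in (0.5, 0.7, 0.9):
    print(beta, regime(beta, 0.5, 1.0))
```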

See you in the next blog post!


LOG#075. Batmobile “paradox”.

The Batmobile “fake paradox” helps us to understand Special Relativity a little better. The problem consists of the following thought experiment:

There are two observers. Alfred, the external observer, and Batman moving with his Batmobile.

Now, we will suppose that the Batmobile is moving at a very fast constant speed with respect to the garage. Let us suppose that v=0.866c=\dfrac{\sqrt{3}}{2}c. Then, we have the following situation from the external observer:


However, with respect to the Batmobile reference frame, we have:

The question is: who is right, Alfred or Batman? The surprising answer from Special Relativity is that both are correct. Alfred and Batman are right! Let’s see why. For Alfred, there is a time during which the Batmobile is completely inside the garage with both doors closed:

On the other hand, for Batman, the front and rear doors are not closed simultaneously! So there is never a time during which the Batmobile is completely inside the garage with both doors closed.

So, there is no paradox at all, if you are aware of the notion of simultaneity and its relativity!
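The loss of simultaneity can be made concrete with a short Python sketch. The garage length (1 in units with c=1), the door labels and the function name are assumptions for illustration only; the post itself does not give these numbers. The two door-closing events are simultaneous for Alfred and are then Lorentz-transformed into Batman's frame.

```python
import math

def lorentz_transform(t, x, beta):
    """Map an event (t, x) in Alfred's frame to the frame moving at +beta (c = 1)."""
    gamma = 1 / math.sqrt(1 - beta**2)
    return gamma * (t - beta * x), gamma * (x - beta * t)

beta = math.sqrt(3) / 2          # v = 0.866c, so gamma = 2
garage_length = 1.0              # hypothetical value, measured in Alfred's frame

# In Alfred's frame both doors close at t = 0
entrance_door = (0.0, 0.0)
exit_door = (0.0, garage_length)

t_entrance, _ = lorentz_transform(*entrance_door, beta)
t_exit, _ = lorentz_transform(*exit_door, beta)
print(t_entrance, t_exit)        # different times in Batman's frame
```

In Batman's frame the two closing times differ by \gamma\beta times the garage length, so the doors are never closed at the same instant for him.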


LOG#052. Chewbacca’s exam.

I found this fun (Spanish) exam about Special Relativity at a Spanish website:

Solutions:

1) v=25/29 c

2) 1.836 \times 10^{12} m = 12 A.U.

3) t=13.6 months = 13 months and 18 days.

Calculations:

1) We use the relativistic addition of velocities rule. That is,

V=(u-v)/(1-(uv/c^2))

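where u is the Millennium Falcon velocity (the unknown), v=c/5 is the imperial cruiser velocity, and V=4c/5 is the relative speed.

Using units with c=1:

4/5=(u-1/5)/(1-u/5)

4/5(1-u/5)=u-1/5

4/5-4/25u=u-1/5

29/25 u=1

u=25/29

Then, reinserting units, the Millennium Falcon velocity is u=25/29 c, which is the velocity v quoted in solution 1) and used in the parts below.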
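The algebra of part 1) can be double-checked with exact rational arithmetic. The following Python sketch uses only the standard library; the variable and function names are illustrative assumptions.

```python
from fractions import Fraction

def add_velocities(V, v):
    """Relativistic composition of collinear velocities (units with c = 1)."""
    return (V + v) / (1 + V * v)

cruiser = Fraction(1, 5)     # imperial cruiser velocity
relative = Fraction(4, 5)    # speed of the Falcon relative to the cruiser

# Inverting V = (u - v)/(1 - u v) for the unknown gives u = (V + v)/(1 + V v)
falcon = add_velocities(relative, cruiser)

# Substitute back into the original formula as a consistency check
check = (falcon - cruiser) / (1 - falcon * cruiser)
print(falcon, check)
```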

2) This part is solved with the length contraction formula and the velocity calculated in the previous part (1). Moreover, we obtain:

\Delta x'=\Delta x/\gamma

Using the result we got from (1), plugging in that velocity v and the fact that \Delta t' is equal to one hour, we have

\Delta x'=v\Delta t'=\Delta x/\gamma , and from this

\Delta x=\gamma v\Delta t'

Substituting the numerical values, we obtain the given solution easily.

\Delta x =1.97 \left( 25/29\, c \right) (1 \mbox{ hour}) \approx 1.7 \mbox{ light-hours}=1.836 \times 10^{12} \mbox{ m}\approx 12 \mbox{ A.U.}
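For readers who want to reproduce the number, here is a small Python sketch of the computation \Delta x=\gamma v\Delta t'. The constants are the standard SI values of c and of the astronomical unit; the variable names are illustrative.

```python
import math

C = 299_792_458.0        # speed of light in m/s
AU = 1.495978707e11      # astronomical unit in m

beta = 25 / 29                        # velocity from part 1), in units of c
gamma = 1 / math.sqrt(1 - beta**2)    # Lorentz factor, about 1.97
dt_prime = 3600.0                     # one hour, in seconds

dx = gamma * beta * C * dt_prime      # distance in metres
print(gamma, dx, dx / AU)
```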

3) Simple application of time dilation formula provides:

\Delta t'=\gamma \Delta t

Inserting, in this case, our given velocity, we obtain the solution we wrote above:

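\Delta t' = \gamma ( 9 months) = 13.6 months = 13 months 18 days, which corresponds to \gamma\approx 1.51 for the velocity given in this part (the \gamma\approx 1.97 of part (1) would instead give about 17.7 months).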


LOG#049. Ludicrous speed.

We are going to learn about the different notions of velocity that the special theory of relativity provides.

The special theory of relativity is a simple, wonderful theory, but it comes with many misconceptions due to bad teaching/science popularization. It is not easy to master the full theory of relativity without the proper mathematical background and physical insight. In the internet era, where knowledge is shared, a fundamental issue is to understand things properly. There are many people who think they understand the theory of relativity when they don’t, even in academia.

Moreover, you can find many people in the blogosphere/websphere trying to sell false and wrong theories. It is the same as with so-called alternative medicine: it is not medicine at all. Bad science is not science; it is simply a lie. It is religion. Science can be criticized, but nobody can dispute that the Earth revolves around the Sun; it is common knowledge and truth. So, we can criticize scientists, but not the scientific method and well-established theories. We can try to understand them better or in a novel way, but we cannot deny facts and experiments. Gerard ‘t Hooft, Nobel Prize winner, explains it on his web page www.phys.uu.nl/~thooft/.

It is important to remark that scientific revolutions come when we extend the theories we know to be correct, like special relativity, and not from a full destruction of the current and well-tested theories. Newtonian gravity is a limit of General Relativity. Galilean relativity is a limit of Special Relativity. Quantum Mechanics is a limit of QFT, and so on. The issue is not that. That said, I am quite sure that scientists, and particularly physicists, wish to supersede current theories with new ones. However, the process of creating a new theory is not easy, especially if you don’t understand the traps and the theories that have passed every known test until now.

What is velocity? Classically, the answer is short and very clear/neat: velocity is the rate of change of position with respect to time. It is a vector magnitude. Mathematically speaking, it is the quotient between the displacement vector and the time interval or, in the infinitesimal limit, the derivative of the position vector with respect to time.

\boxed{\mathbf{v_m}=\dfrac{\Delta \mathbf{r}(t)}{\Delta t}\leftrightarrow \mbox{Average velocity}}

\boxed{\mathbf{v}=\dfrac{d\mathbf{r}(t)}{dt}\leftrightarrow \mbox{Instantaneous velocity}}

In the special theory of relativity, due to the fact that time is not universal but relative, we can build different notions of velocity. And it matters. There are some clear concepts from relativity you should have mastered by now:

a) You can attach a clock to any yardstick you could physically use for measurements of space and time.

b) You must distinguish the notions of coordinate velocity (map coordinate is another commonly used notion/concept) and proper velocity. The latter is sometimes called hyperbolic (or imaginary) velocity. These two notions are caused by the presence of two “natural” choices of time: the proper time and the coordinate time.

c) Due to the previous two facts, you must also distinguish between proper acceleration and geometric acceleration. Proper accelerations are caused by the tug of external forces, while geometric accelerations are caused by the choice of a reference frame that is not geodesic, i.e., a local reference coordinate system that is not “in free-fall”. Proper accelerations are felt through their points of action, e.g., through forces on the bottom of your feet. On the other hand, geometric accelerations give rise to inertial forces that act on every ounce of an object’s being. They either vanish when seen from the vantage point of a local free-float frame, or give rise to non-local force effects on your mass distribution that cannot be made to disappear. Coordinate acceleration goes to zero whenever proper acceleration is exactly canceled by that connection term, and thus when physical and inertial forces add to zero.

People who are not aware of the previous comments don’t understand relativity and the physics behind it. They do not even understand what experiments and their data say.

Let me review the main magnitudes, 3-vectors and 4-vectors which the special theory of relativity studies in the next tables:

The two notions of 3-velocity we do have from the special theory of relativity, i.e., from the 4-velocity \mathbb{U}=\dfrac{d\mathbb{X}}{d\tau},  are:

1) Coordinate velocity, \mathbf{v}:

\mathbf{v}=\dfrac{d\mathbf{r}}{dt}

It is the common notion of 3-velocity, measured from an inertial observer with respect to the coordinate time t. Note that the coordinate time is not a true invariant in SR!

2) Proper velocity (or the hyperbolic velocity/imaginary angle velocity related to it):

\mathbf{w}\equiv \dfrac{d\mathbf{r}}{d\tau}=\gamma \mathbf{v}

where \tau is the proper time. This velocity can be intuitively defined as the distance per unit traveler-time, and it retains many of the properties that ordinary velocity loses at high speed. In addition to these two definitions, we also have:

1) Proper acceleration \alpha is the acceleration experienced relative to a locally co-moving free-float frame, and it helps when we are accelerating, speeding, and in curved space-time.

2) The way in which part of the space-like effect of sideways “felt” forces moves into the reference frame’s time domain at high speed, which yields the relatively unknown bound (from special relativity!)

\dfrac{dp}{dt}\leq m\alpha

With the above definitions, the relativistic momentum can be expressed in terms of coordinate velocity or proper velocity as follows:

\mathbf{P}=m\mathbf{w}=M\mathbf{v}=m\gamma \mathbf{v}

where

\gamma=\dfrac{dt}{d\tau}=\dfrac{1}{\sqrt{1-\dfrac{v^2}{c^2}}}=\sqrt{1+\dfrac{w^2}{c^2}}

is the Lorentz factor. The last equal sign in the previous equation can be easily derived from the relativistic relationship:

\left(c\dfrac{dt}{d\tau}\right)^2-\left(\dfrac{d\mathbf{r}}{d\tau}\right)^2=c^2

and the definition of \gamma above.

Thanks to the metric-equation’s assignment of a frame-invariant traveler or proper-time \tau to the displacement between events in context of a single map-frame of comoving yardsticks and synchronized clocks, proper velocity becomes one of three related derivatives in special relativity (coordinate velocity \mathbf{v}, proper-velocity \mathbf{w}, and Lorentz factor \gamma) that describe an object’s rate of travel. For unidirectional motion, in units of lightspeed c (i.e. c=1 if we want to) each of these is also simply related to  a traveling object’s hyperbolic velocity angle or rapidity \eta by the next set of equations:

\eta=\sinh^{-1}\left( \dfrac{w}{c}\right)=\tanh^{-1}\left(\dfrac{v}{c}\right)=\pm \cosh^{-1}\left(\gamma\right)
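The relations between coordinate velocity, proper velocity, Lorentz factor and rapidity are easy to verify numerically. The following Python sketch (the function name is an illustrative assumption, and everything is in units with c=1) computes the three derived quantities from \beta=v/c and checks that the three expressions for \eta agree.

```python
import math

def kinematics_from_beta(beta):
    """Return (gamma, w/c, eta) for a coordinate velocity beta = v/c."""
    gamma = 1 / math.sqrt(1 - beta**2)   # Lorentz factor
    w = gamma * beta                     # proper velocity in units of c
    eta = math.atanh(beta)               # rapidity
    return gamma, w, eta

gamma, w, eta = kinematics_from_beta(0.6)
print(gamma, w, eta)
# The other two expressions for the rapidity give the same number
print(math.asinh(w), math.acosh(gamma))
```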

The next table illustrates how the proper-velocity of w_0 \equiv c or “one map-lightyear per traveler-year” is a natural benchmark for the transition from a sub-relativistic coordinate frame to a (fake) auxiliary super-relativistic motion (in imaginary units of i=\sqrt{-1}). Note that the velocity angle or rapidity \eta and the proper-velocity w run from 0 to infinity and track the physical coordinate-velocity when w<<c. On the other hand, when w>>c, the (hyperbolic or imaginary) proper-velocity tracks the Lorentz factor \gamma, while the velocity angle \eta is logarithmic and hence increases much more slowly:

LUDICROUS SPEED AND WARP SPEED

Hyperbolic velocities CAN exceed c! They can even reach the ludicrous speed of \infty when the coordinate velocity approaches c! However, you must never forget the fact that the velocity-angle/hyperbolic velocity IS imaginary in value. It is quite clear from the above table. Indeed, if you are somewhat of a “trekkie” or a Sci-Fi “romantic” person, you could “define” warp-speeds as “imaginary/hyperbolic” velocities, i.e., in terms of proper velocity. In that case, you could get the correspondence

\mbox{WARP}0.25=\mbox{WARP}1/4=\dfrac{\sqrt{17}}{17}c\approx 0.24c

\mbox{WARP}0.5=\mbox{WARP}1/2=\dfrac{\sqrt{5}}{5}c\approx 0.45c

\mbox{WARP}1=\dfrac{\sqrt{2}}{2}c\approx 0.71c

\mbox{WARP}2=\dfrac{2\sqrt{5}}{5}c\approx 0.89c

\mbox{WARP}3=\dfrac{3\sqrt{10}}{10}c\approx 0.95c

\mbox{WARP}7=\dfrac{7\sqrt{2}}{10}c\approx 0.99c

\mbox{WARP}9=\dfrac{9\sqrt{82}}{82}c\approx 0.994c

\mbox{WARP}10=\dfrac{10\sqrt{101}}{101}c\approx 0.995c

\mbox{WARP}\infty\equiv c

In general, we can define the WARP speed as W=w/c and so, the proper velocity can be expressed in terms of the warp speed W in a very simple way w=Wc. Thus, the real or coordinate velocity would be connected with warp-speed through the relativistic equation:

\boxed{v=c\tanh\sinh^{-1}(W)=c\tanh\sinh^{-1}\left(\dfrac{w}{c}\right)}
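The warp table above follows directly from this boxed formula. A minimal Python sketch (the function name is made up for this fun example) that regenerates the coordinate velocities, in units of c, could be:

```python
import math

def warp_to_beta(W):
    """Coordinate velocity v/c for a 'warp factor' W = w/c (proper velocity over c)."""
    return math.tanh(math.asinh(W))   # algebraically equal to W / sqrt(1 + W**2)

for W in (0.25, 0.5, 1, 2, 3, 7, 9, 10):
    print(W, round(warp_to_beta(W), 4))
```

Since \tanh is bounded by 1, no finite warp factor ever gives a coordinate velocity above c.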

Of course, the point is that, unlike in the Sci-Fi franchise, the real velocity never exceeds c; only the hyperbolic velocity and the proper velocity do. Note that, in terms of SR, velocities approaching c imply very boosted frames: we could travel to any point of the Universe within the traveler’s proper time (one human life) just by approaching c very closely, but in the “Earth” (or rest) reference frame millions of years would have passed!

When the coordinate speeds approach c, the coordinate velocities deviate from the simple (Galilean) addition rule v_{13}=v_{12}+v_{23}, in that rapidities (hyperbolic velocity angle boosts) add instead of velocities, i.e. \eta_{12}=\eta_1+\eta_2. Coordinate velocities add non-linearly. This is a well-tested consequence of the Special Theory of Relativity. For highly relativistic objects (i.e. those with momentum per unit mass much larger than lightspeed) the result of the coordinate-velocity expression familiar from most textbooks is rather uninteresting, since the coordinate velocities all top out at c, i.e., as everybody knows, in special relativity 1c\boxplus 1c=1c, because applying the relativistic addition of velocities rule, we get

c\boxplus c=\dfrac{ (c + c)}{(1 + 1)}=c

And it is a fact from both theory and experiment! It will remain so as long as SR remains a valid theory. SR still holds with an astonishing degree of precision and accuracy. So, you cannot deny all the data and experiments that confirm SR. That is complete nonsense, but there are some people and pseudo-scientists out there building their own theories AGAINST the achievements and explanations that SR provides for every experiment we have done until the current time. I am sorry for all of them. They are totally wrong. Science is not what they say it is. Any theory going beyond SR HAS to explain every experiment and data that SR does explain, and it is not easy to build such a theory or to say, e.g., why we have not observed (apparently) superluminal objects. I will discuss more about superluminal objects in a forthcoming post/log entry, some posts after the special 50th post/log that is coming after this one! Stay tuned!

Coming back to our discussion… Why is all this stuff important? High Energy Physics is the natural domain of SR! And there, SR has not provided ANY wrong result so far. Although some research going beyond the Standard Model includes modified dispersion relationships that reduce to SR in the low energy regime, we have not yet seen ANY deviation from SR.

For unidirectional motion, at low speeds the coordinate velocity v_{13} of object 1 from the point of view of oncoming object 3 might be described as the sum of the velocity v_{12} of object 1 with respect to lab frame 2 plus the velocity v_{23} of the lab frame 2 with respect to object 3, that is:

v_{13}=v_{12}+v_{23}

Compare this expression to the previously obtained expression for rapidities! Rapidities always add, while coordinate velocities add (linearly) only at low velocities. In conclusion, you must be careful about what you mean by velocity in a boosted system!
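The statement that rapidities add while coordinate velocities do not can be checked with a few lines of Python. This is only a sketch for collinear motion in units with c=1; the function names are illustrative.

```python
import math

def add_velocities(b1, b2):
    """Relativistic addition of collinear coordinate velocities (units of c)."""
    return (b1 + b2) / (1 + b1 * b2)

def add_via_rapidities(b1, b2):
    """Same composition, done by adding rapidities and converting back."""
    return math.tanh(math.atanh(b1) + math.atanh(b2))

print(add_velocities(0.5, 0.5), add_via_rapidities(0.5, 0.5))  # both routes agree
print(add_velocities(1e-4, 2e-4))   # at low speed this is nearly the plain sum
print(add_velocities(1.0, 1.0))     # c "plus" c is still c
```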

On the other hand, for relative proper-velocity, the result is:

w_{13}=\gamma_{13}v_{13}=\gamma_{12}\gamma_{23}(v_{12}+v_{23})

This expression shows how the momentum per unit mass as well as the map-distance traveled per unit traveler time of object 1, as seen in the frame of oncoming particle 3, goes as the sum of the coordinate-velocities times the product of the gamma (energy) factors. The proper velocity equation is especially important in high energy physics, because colliders enable one to explore proper-speed and energy ranges much higher than accessible with fixed-target collisions. For instance each of two electrons (traveling with frames 1 and 3) in a head-on collision traveling in the lab frame (2) at

\gamma_{12}mc^2=45\mbox{GeV}

or equivalently w_{12}=w_{23}=\gamma v\approx 88000 lightseconds per traveler second, would see the other coming toward them at coordinate velocity v_{13}\approx c and w_{13}=88000^2(1+1) \approx 1.55\cdot 10^{10} lightseconds per traveler second or \gamma_{13}mc^2\approx 7.9 \mbox{PeV}. From the target’s view, that is an incredible increase in both energy and momentum per unit of mass.
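These collider numbers can be reproduced with a short Python sketch. It assumes the electron rest energy of about 0.511 MeV (a standard value, not quoted in the text) and uses illustrative variable names; proper velocities are in units of c.

```python
import math

ELECTRON_REST_ENERGY_EV = 0.51099895e6   # m c^2 for the electron, in eV
beam_energy_ev = 45e9                    # 45 GeV per beam in the lab frame

gamma12 = beam_energy_ev / ELECTRON_REST_ENERGY_EV
beta12 = math.sqrt(1 - 1 / gamma12**2)
w12 = gamma12 * beta12                   # proper velocity of each beam, units of c

# Head-on collision with equal beams: gamma23 = gamma12 and beta23 = beta12
w13 = gamma12 * gamma12 * (beta12 + beta12)
gamma13 = math.sqrt(1 + w13**2)
energy13_pev = gamma13 * ELECTRON_REST_ENERGY_EV / 1e15

print(w12, w13, energy13_pev)
```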

Other magnitudes and their frame dependence in SR can be read from the following table:

CAUTION: These results don’t mean that the “real” energy is that. Energy is relative and it depends on the frame! The fact that in colliders, seen from the target reference frame, the energy can be greater than the center of mass energy is not an accident. It is a consequence of the formalism of special relativity. A similar observation can be made for velocities. Coordinate velocities, IN THE FRAMEWORK OF SPECIAL RELATIVITY, can never exceed the speed of light. As long as SR holds, there is no particle whose COORDINATE velocity can exceed the speed of light. However, we have seen that PROPER velocities are other monsters. They serve as a tool to handle rotations involving the temporal axis, i.e., to handle boosts mixing space and time coordinates. Proper (or hyperbolic) velocities CAN be greater than the speed of light. But this does not contradict the special theory of relativity at all, since hyperbolic velocities ARE NOT REAL: they are imaginary quantities and they are not physical. We can only measure momentum and real quantities! Moreover, remember that, in fact, the group or phase velocities we have found before can ALSO be greater than c. So, you must be careful about what you mean by velocity in SR or in any theory. Furthermore, you must distinguish the notion of particle velocity from that of the relative velocity between two inertial frames, since the particle velocities (coordinate or proper) always refer to some concrete frame! In summary, beware of people saying that there are superluminal particles in our colliders or astrophysical processes. It is simply not true. Superluminal objects would have observable consequences, and they have never been observed (the last example was the superluminal neutrino affair by the OPERA collaboration, whose result is now in agreement with SR).

Remark (I): From the last table we observe that in SR, the rotation angle is imaginary. Therefore, we are forced to use this gadget of hyperbolic velocity in order to avoid “imaginary velocities”.

Remark (II): Hyperbolic velocities would become imaginary velocities if we used the imaginary formalism of SR, the infamous ict=x_4.

Remark (III): Hyperbolic velocities are not coordinate velocities, so they are not physical at all. They are just a tool to provide the right answers in terms of rapidities, or the hyperbolic angle, whose units are imaginary radians! Hyperbolic velocities are measured in imaginary units of velocity!

Remark (IV): About the imaginary issues you may have now. The spacetime separation formula s^2=-c^2t^2+x^2+y^2+z^2 means that the time t can often be treated mathematically as if it were an imaginary spatial dimension. That is, you can define ct=iw so -c^2t^2=w^2, where i is the square root of -1, and w is a “fourth spatial coordinate”. Of course, it is not a spatial coordinate at all. It is only a trick to treat the problem in a clever way. On the other hand, a Lorentz boost by a velocity v can likewise be treated as a rotation by an imaginary angle. Consider a normal spatial rotation in which a primed frame is rotated in the wx-plane clockwise by an angle \varphi about the origin, relative to the unprimed frame. The relation between the coordinates (w',x') and (w,x) of a point in the two frames is:

\begin{pmatrix}w'\\ x'\end{pmatrix}=\begin{pmatrix}\cos\varphi & -\sin\varphi\\ \sin\varphi & \cos\varphi\end{pmatrix}\begin{pmatrix}w\\ x\end{pmatrix}

Now set ct=iw and \theta=i\varphi, with t,\theta both real. In other words, take the spatial coordinate w to be imaginary, and the rotation angle \varphi likewise to be imaginary. Then the rotation formula above becomes

\begin{pmatrix}ct'\\ x'\end{pmatrix}=\begin{pmatrix}\cosh\theta & -\sinh\theta\\ -\sinh\theta & \cosh\theta\end{pmatrix}\begin{pmatrix}ct\\ x\end{pmatrix}

This agrees with the usual Lorentz transformation formula if the boost velocity v and boost angle \theta are related by the known formula \tanh\theta=v/c=\beta. We realize that if we identify the imaginary angle with the rapidity, we are back to Special Relativity. Indeed, it is only the rotations involving the time axis which can cause confusion, because they are so different from our everyday experience. That is, we experience rotations about some direction in our daily life, so we are familiar with rotations and their (real) rotation angles. However, rotations involving a time axis, mixing space and time, are weird creatures. They use imaginary numbers or, if we avoid them, we have to use hyperbolic (pseudo)-rotations.
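The “rotation by an imaginary angle” trick can be tested numerically with complex arithmetic. The sketch below is only an illustration with made-up function names: it applies an ordinary rotation in the wx-plane with the substitutions ct=iw and \theta=i\varphi, and compares the result with the hyperbolic boost formula.

```python
import cmath
import math

def boost_from_imaginary_rotation(ct, x, theta):
    """Rotate (w, x) by the imaginary angle phi = -i*theta, with w = -i*ct."""
    phi = -1j * theta                  # from theta = i * phi
    w = -1j * ct                       # from ct = i * w
    w_new = w * cmath.cos(phi) - x * cmath.sin(phi)
    x_new = w * cmath.sin(phi) + x * cmath.cos(phi)
    return 1j * w_new, x_new           # back to ct' = i * w'

def lorentz_boost(ct, x, theta):
    """Standard boost written with hyperbolic functions of the rapidity theta."""
    return (ct * math.cosh(theta) - x * math.sinh(theta),
            -ct * math.sinh(theta) + x * math.cosh(theta))

theta = math.atanh(0.6)                # rapidity for v = 0.6c
print(boost_from_imaginary_rotation(2.0, 1.0, theta))
print(lorentz_boost(2.0, 1.0, theta))
```

Up to rounding, the complex results have vanishing imaginary parts and coincide with the real boost.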

SUMMARY OF MAIN IDEAS

A) Lorentz factor \gamma=\dfrac{E}{mc^2}

\boxed{\gamma \equiv \frac{dt}{d\tau}= \sqrt{1+\left(\frac{w}{c}\right)^2} = \frac{1}{\sqrt{1-(\frac{v}{c})^2}} = \cosh[\eta] \equiv \frac{e^{\eta} + e^{-\eta}}{2}}

B) Proper-velocity or momentum per unit mass.

\boxed{\frac{w}{c}\equiv \frac{1}{c} \frac{dx}{d\tau}=\frac{v}{c} \frac{1}{\sqrt{1-(\frac{v}{c})^2}}=\sinh[\eta]\equiv \frac{e^{\eta} - e^{-\eta}}{2} =\pm\sqrt{\gamma^2 - 1}}

C) Coordinate velocity v\leq c.

\boxed{\frac{v}{c} \equiv \frac{1}{c}\frac{dx}{dt}=\frac{w}{c}\frac{1}{\sqrt{1 + (\frac{w}{c})^2}} = \tanh[\eta] \equiv \frac{e^{2\eta} - 1} {e^{2\eta} + 1}= \pm \sqrt{1 - \left(\frac{1}{\gamma}\right)^2}}

D) Hyperbolic velocity angle or rapidity.

\boxed{\eta =\sinh^{-1}[\frac{w}{c}] = \tanh^{-1}[\frac{v}{c}] = \pm \cosh^{-1}[\gamma]}

or in terms of logarithms:

\boxed{\eta = \ln\left[\frac{w}{c} + \sqrt{\left(\frac{w}{c}\right)^2 + 1}\right] = \frac{1}{2} \ln\left[\frac{1+\frac{v}{c}}{1-\frac{v}{c}}\right] = \pm \ln\left[\gamma + \sqrt{\gamma^2 - 1}\right]}

E) Warp speed (just for fun):

\boxed{v=c\tanh\sinh^{-1}(W)=c\tanh\sinh^{-1}\left(\dfrac{w}{c}\right)}


LOG#048. Thomas precession.

LORENTZ TRANSFORMATIONS IN NON-STANDARD FORM

Let me begin this post with an uncommon representation of Lorentz transformations in terms of “uncommon matrices”. A Lorentz transformation can be written symbolically, as we have seen before, as the set of linear transformations leaving invariant

ds^2=d\mathbf{x}^2-c^2dt^2

Therefore, the Lorentz transformations are naively X'=\mathbb{L}X. Let \mathbf{A}, \mathbf{B} be 3-rowed column matrices and let M, R, \mathbb{I} represent 3\times 3 matrices; T will be used (unless stated otherwise) to denote matrix transposition (the interchange of rows and columns in the matrix).

The invariance of ds'^2=ds^2 implies the following results from the previous definitions:

\gamma^2-\mathbf{B}^2=1

M^TM =\mathbf{A}\mathbf{A}^T+\mathbb{I}

M^T\mathbf{B}=\gamma \mathbf{A}\leftrightarrow \mathbf{B}^T M=\gamma \mathbf{A}^T

Then, we can write the matrix for a Lorentz transformation (boost) in the following non-standard manner:

\boxed{\mathbb{L}=\begin{pmatrix}\gamma & -\mathbf{A}^T\\ -\mathbf{B} & M\end{pmatrix}}

and the inverse transformation will be

\boxed{\mathbb{L}^{-1}=\begin{pmatrix}\gamma & \mathbf{B}^T\\ \mathbf{A} & M^T\end{pmatrix}}

Thus, we have \mathbb{L}\mathbb{L}^{-1}=\mathbb{I}_{4x4}\equiv \mathbb{E}, where we also have

\gamma^2-\mathbf{A}^2=1

M\mathbf{A}=\gamma \mathbf{B}

MM^T=\mathbf{B}\mathbf{B}^T+\mathbb{I}_{3x3}

Let us define, in addition to this stuff, the reference frames S, \overline{S}', corresponding to the coordinates \mathbf{X} and \overline{\mathbb{X}}'. Then, the boost matrix can be recast, if the velocity reads \mathbf{v}=\mathbf{A}/\gamma, as

L_{v}=\begin{pmatrix}\gamma & -\gamma \mathbf{v}^T\\ -\gamma \mathbf{v} & \mathbb{I}+\frac{\gamma^2}{1+\gamma}\mathbf{v}\mathbf{v}^T\end{pmatrix}=\begin{pmatrix}\gamma & -\mathbf{A}^T\\ -\mathbf{A} & \mathbb{I}+\frac{\mathbf{A}\mathbf{A}^T}{1+\gamma}\end{pmatrix}
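As a sanity check of this boost matrix, the following pure-Python sketch builds L_v for an arbitrary 3-velocity (units with c=1) and verifies that it leaves the metric diag(-1,1,1,1) invariant, which is the matrix form of ds'^2=ds^2. The helper names and the sample velocity are assumptions for illustration.

```python
import math

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(row) for row in zip(*P)]

def boost(v):
    """4x4 boost matrix L_v for a 3-velocity v with |v| < 1."""
    g = 1 / math.sqrt(1 - sum(x * x for x in v))
    rows = [[g] + [-g * x for x in v]]
    for i in range(3):
        rows.append([-g * v[i]] +
                    [(1.0 if i == j else 0.0) + g * g / (1 + g) * v[i] * v[j]
                     for j in range(3)])
    return rows

# Metric with signature (-, +, +, +), matching ds^2 = dx^2 - c^2 dt^2
ETA = [[-1.0 if i == j == 0 else (1.0 if i == j else 0.0) for j in range(4)]
       for i in range(4)]

def max_deviation(P, Q):
    """Largest absolute difference between the entries of two matrices."""
    return max(abs(P[i][j] - Q[i][j])
               for i in range(len(P)) for j in range(len(P[0])))

L = boost([0.3, -0.4, 0.5])
print(max_deviation(matmul(transpose(L), matmul(ETA, L)), ETA))  # rounding-level number
```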

Remark: a Lorentz transformation will differ from boosts only by rotations in the general case. That is, with these conventions, the most general Lorentz transformations include both boosts and rotations.

For all \gamma>0, the above transformation is well-defined, but if \gamma<0, then we face transformations containing the reversal of time (the time reversal operation is a different thing from matrix transposition, even though both are commonly denoted by T; I will denote time reversal by \mathbb{T} in order to distinguish them, although there is little danger of confusion in general). The time reversal can indeed be written as:

\mathbb{T}=\begin{pmatrix}-1 & \mathbf{0}^T\\ \mathbf{0} & \mathbb{I}\end{pmatrix}

In that case, (\gamma<0), after the boost L_{v}, we have to make the changes \gamma \rightarrow \vert \gamma\vert and \mathbf{A}\rightarrow -\mathbf{A}. If these shifts are done, the reference frames \overline{S} and \overline{S}' can be easily related

\overline{X}'=LX=LL^{-1}_{v}\overline{X}

in such a way that

LL^{-1}_{v}=\begin{pmatrix}1 & \mathbf{0}^T\\ \mathbf{0} & R\end{pmatrix}=L_R

where the rotation matrix is given formally by the next equation:

R=M-\dfrac{\mathbf{B}\mathbf{A}^T}{1+\gamma}

R must be an orthogonal matrix, i.e., R^TR=\mathbb{I}_{3x3}. Then (\det (R))^2=1, or \det R=\pm 1. For \det R=-1 we have the parity matrix

\mathbb{P}=\begin{pmatrix}1 & \mathbf{0}^T\\ \mathbf{0} & -\mathbb{I}_{3x3}\end{pmatrix}

and it will transform right-handed frames to left-handed frames \overline{S} or \overline{S}'. The rotation vector \alpha can be defined as well:

1+2\cos \alpha=Tr (R)\rightarrow \cos\alpha=\dfrac{Tr R-1}{2}

so \alpha^\mu=\dfrac{1}{2}\epsilon^{\mu\nu\lambda}R^\nu_{\lambda}\dfrac{\alpha}{\sin\alpha}, \forall 0\leq \alpha<\pi. The rotation acting on 3-rowed matrices:

R\mathbf{A}=\mathbf{B}

implies that \overline{X}'=R\overline{X}, and it changes -\mathbf{A}/\gamma of the frame S into \overline{S}. Passing from one frame into another, \overline{S}' to S', it implies we can define a boost with L_{-\mathbf{B}/\gamma}. In fact,

L_{-\mathbf{B}/\gamma}L=\begin{pmatrix}1 & \mathbf{0}^T\\ \mathbf{0} & R\end{pmatrix}=L_R

Q.E.D.

Remark(I): Without the time reversal, we would get L_{R\mathbf{v}}L_R=L=L_RL_{\mathbf{v}}

with \mathbf{v}=\mathbf{A}/\gamma and R=M-\dfrac{\mathbf{BA}^T}{1+\gamma}.

Remark (II): L_RL_v\rightarrow L^T=L^T_vL_R^T=L_vL_{R^T}. If L^T=L=L_{R\mathbf{v}}L_R, then the uniqueness of R\mathbf{v} provides that R=R^T=R^{-1}, i.e., that R is an orthogonal matrix. If R is an orthogonal matrix and a proper Lorentz transformation ( det R=+1), then we would get \sin\alpha=0, and thus \alpha=0 or \alpha=\pi, and so, R=I or R=2\mathbf{n}\mathbf{n}^T-1, with the unimodular vector \mathbf{n}, i.e., \vert \mathbf{n}\vert=1. That would be the case \forall \mathbf{v}\neq 0 and \mathbf{n}=\mathbf{v}/\vert\vert \mathbf{v}\vert\vert. Otherwise, if \mathbf{v}=0, then \mathbf{n} would be an arbitrary vector.
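The decomposition L=L_RL_{\mathbf{v}}, with \mathbf{v}=\mathbf{A}/\gamma and R=M-\mathbf{B}\mathbf{A}^T/(1+\gamma), can be illustrated numerically. The pure-Python sketch below composes two non-collinear boosts (sample velocities chosen arbitrarily, units with c=1), reads off \gamma, \mathbf{A}, \mathbf{B}, M from the product, and extracts the rotation R. All helper names are assumptions for illustration.

```python
import math

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def boost(v):
    """4x4 boost matrix L_v for a 3-velocity v with |v| < 1."""
    g = 1 / math.sqrt(1 - sum(x * x for x in v))
    rows = [[g] + [-g * x for x in v]]
    for i in range(3):
        rows.append([-g * v[i]] +
                    [(1.0 if i == j else 0.0) + g * g / (1 + g) * v[i] * v[j]
                     for j in range(3)])
    return rows

def decompose(L):
    """Read gamma, A, B, M from L = [[gamma, -A^T], [-B, M]]."""
    gamma = L[0][0]
    A = [-L[0][j] for j in range(1, 4)]
    B = [-L[i][0] for i in range(1, 4)]
    M = [row[1:] for row in L[1:]]
    return gamma, A, B, M

def rotation_part(L):
    """R = M - B A^T / (1 + gamma)."""
    gamma, A, B, M = decompose(L)
    return [[M[i][j] - B[i] * A[j] / (1 + gamma) for j in range(3)]
            for i in range(3)]

def embed_rotation(R):
    """4x4 matrix L_R acting trivially on time."""
    return [[1.0, 0.0, 0.0, 0.0]] + [[0.0] + list(row) for row in R]

# Two perpendicular boosts: their product is a boost times a rotation
L = matmul(boost([0.0, 0.6, 0.0]), boost([0.6, 0.0, 0.0]))
gamma, A, B, M = decompose(L)
R = rotation_part(L)
cos_alpha = (sum(R[i][i] for i in range(3)) - 1) / 2
rebuilt = matmul(embed_rotation(R), boost([a / gamma for a in A]))
print(cos_alpha)
```

For this particular pair of boosts the rotation angle is non-zero, which is the kinematical origin of the Thomas precession.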

ADDITION OF VELOCITIES REVISITED

The second step prior to our treatment of Thomas precession is to review (setting c=1) the addition of velocities in the special relativistic realm. Suppose a point particle moves with velocity \overline{\mathbf{w}} in the reference frame \overline{S}. With respect to the S-frame (at rest) we will write:

\mathbf{x}=\overline{\mathbf{x}}+\dfrac{\gamma^2}{\gamma+1}(\overline{\mathbf{x}}\mathbf{v})\mathbf{v}+\gamma \mathbf{v}\overline{t}

and

t=\gamma \overline{t}+\gamma (\mathbf{v}\overline{\mathbf{x}})

and with \overline{\mathbf{x}}=\overline{\mathbf{w}}\overline{t} we can calculate the ratio \mathbf{u}=\mathbf{x}/t:

\mathbf{u}=\dfrac{\dfrac{\overline{\mathbf{w}}}{\gamma}+\dfrac{\gamma}{1+\gamma}(\mathbf{v}\overline{\mathbf{w}})\mathbf{v}+\mathbf{v}}{1+\mathbf{v}\overline{\mathbf{w}}}

and thus

\mathbf{u}\equiv \dfrac{\mathbf{v}+\mathbf{w}_\parallel+(\mathbf{w}_\perp/\gamma)}{1+\mathbf{v}\overline{\mathbf{w}}}

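where \mathbf{w}_\parallel and \mathbf{w}_\perp are the components of \overline{\mathbf{w}} parallel and orthogonal to \mathbf{v}:

\mathbf{w}_\parallel\equiv \dfrac{(\mathbf{v}\overline{\mathbf{w}})}{\mathbf{v}^2}\mathbf{v}

and

\mathbf{w}_\perp\equiv \overline{\mathbf{w}}-\mathbf{w}_\parallel

Both expressions for \mathbf{u} agree because \dfrac{1}{\gamma}+\dfrac{\gamma\mathbf{v}^2}{1+\gamma}=1.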

Comment: the composition law for 3-velocities in special relativity is both non-linear AND non-associative.

There are two special cases of motion we usually consider in (special) relativity and inertial frames:

1st. The case of parallel motion between frames. In this case \overline{\mathbf{w}}=\lambda \mathbf{v}, i.e., \overline{\mathbf{w}}\times \mathbf{v}=0. Therefore,

\mathbf{u}=\dfrac{\mathbf{v}+\overline{\mathbf{w}}}{1+\mathbf{v}\overline{\mathbf{w}}}

This is the usual non-linear rule to add velocities in Special Relativity.

2nd. The case of orthogonal motion between frames, where \mathbf{v}\perp\mathbf{w}. It means \mathbf{v}\mathbf{w}=0. Then,

\mathbf{u}=\mathbf{v}+\mathbf{w}/\gamma= \mathbf{v}+\overline{\mathbf{w}}\sqrt{1-\mathbf{v}^2}

This motion orthogonal to the direction of the relative velocity has an interesting phenomenology: it appears slowed down by time dilation, because the spatial distances that are orthogonal to \mathbf{v} are equal in both reference frames.

Furthermore, we get also:

\mathbf{u}^2=1-\dfrac{(1-\overline{\mathbf{w}}^2)(1-\mathbf{v}^2)}{(1+\mathbf{v}\overline{\mathbf{w}})^2}\leq 1

Indeed, the condition \mathbf{u}^2=1 implies that \overline{\mathbf{w}}^2=1 or \mathbf{v}^2=1, and the latter condition is actually forbidden because of our interpretation of \mathbf{v} as a relative velocity between different frames. Thus, this last equation shows that Lorentz invariance in Special Relativity does not allow us to reach superluminal motion by composing subluminal velocities, although, a priori, the formula could also be used for superluminal speeds \overline{\mathbf{w}}, since no restriction applies to them beyond those imposed by the principle of relativity.
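As a quick numerical sanity check of the addition rule, its two special cases and the bound on \mathbf{u}^2, here is a minimal Python sketch in units with c=1. The helper names `gamma` and `add_velocities` and the sample velocities are illustrative assumptions of mine, not part of the derivation above; the identity is checked in the form with the squared denominator (1+\mathbf{v}\overline{\mathbf{w}})^2.

```python
import numpy as np

def gamma(v):
    """Lorentz factor of a 3-velocity v (units with c = 1)."""
    return 1.0 / np.sqrt(1.0 - np.dot(v, v))

def add_velocities(v, w_bar):
    """Velocity u seen in S of a particle moving with w_bar in S-bar,
    where S-bar moves with velocity v relative to S."""
    g = gamma(v)
    vw = np.dot(v, w_bar)
    return (w_bar / g + g / (1.0 + g) * vw * v + v) / (1.0 + vw)

v = np.array([0.6, 0.0, 0.0])

# Parallel case: reduces to (v + w) / (1 + v.w)
w_par = np.array([0.5, 0.0, 0.0])
u_par = add_velocities(v, w_par)

# Orthogonal case: reduces to v + w * sqrt(1 - v^2)
w_perp = np.array([0.0, 0.5, 0.0])
u_perp = add_velocities(v, w_perp)

# Generic case: check 1 - u^2 = (1 - w^2)(1 - v^2) / (1 + v.w)^2
w_gen = np.array([0.3, -0.4, 0.5])
u_gen = add_velocities(v, w_gen)
lhs = 1.0 - np.dot(u_gen, u_gen)
rhs = ((1.0 - np.dot(w_gen, w_gen)) * (1.0 - np.dot(v, v))
       / (1.0 + np.dot(v, w_gen)) ** 2)

print(u_par, u_perp, np.isclose(lhs, rhs))
```

The composed speed stays below 1 for any pair of subluminal inputs, which is exactly what the inequality states.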

THOMAS PRECESSION

We are ready to study the Thomas precession and its meaning. Suppose an inertial frame \overline{\overline{S}} is obtained from another inertial frame \overline{S} by boosting with the velocity \overline{\mathbf{w}}. Therefore, \overline{\overline{S}} has, with respect to S, the relative velocity \mathbf{u} given by the addition rule we have seen in the previous section. Moreover, we have:

\overline{\overline{x}}=L_{\overline{w}}\overline{x}=L_{\overline{w}}L_{\mathbf{v}}x

Then, we get

L_{\mathbf{v}}=\begin{pmatrix}\gamma_v & -\gamma_v \mathbf{v}^T\\ -\gamma_v \mathbf{v} & \mathbf{1}+\dfrac{\gamma_v^2}{1+\gamma_v}\mathbf{v}\mathbf{v}^T\end{pmatrix}

L_{\overline{\mathbf{w}}}=\begin{pmatrix}\gamma_{\overline{\mathbf{w}}} & -\gamma_{\overline{\mathbf{w}}} \overline{\mathbf{w}}^T\\ -\gamma_{\overline{\mathbf{w}}} \overline{\mathbf{w}} & \mathbf{1}+\dfrac{\gamma_{\overline{\mathbf{w}}}^2}{1+\gamma_{\overline{\mathbf{w}}}}\overline{\mathbf{w}}\overline{\mathbf{w}}^T\end{pmatrix}

where

\gamma_{v}=\dfrac{1}{\sqrt{1-\mathbf{v}^2}}

\gamma_{\overline{\mathbf{w}}}=\dfrac{1}{\sqrt{1-\overline{\mathbf{w}}^2}}

and then

\boxed{L\equiv L_{\overline{\mathbf{w}}}L_v=\begin{pmatrix}\gamma & -\mathbf{A}^T\\ -\mathbf{B} & M\end{pmatrix}}

with

\gamma (\mathbf{v},\overline{\mathbf{w}})=\gamma_v\gamma_{\overline{w}}(1+\mathbf{v}\overline{\mathbf{w}})\equiv \gamma (\overline{\mathbf{w}},\mathbf{v})

\mathbf{A}=\gamma (\mathbf{v},\overline{\mathbf{w}})\overline{\mathbf{w}}o \mathbf{v}

\mathbf{B}=\gamma (\overline{\mathbf{w}},\mathbf{v})\mathbf{v}o\overline{\mathbf{w}}

M=M(\overline{\mathbf{w}},\mathbf{v})=\mathbf{1}+\dfrac{\gamma_v^2}{1+\gamma_v}\mathbf{v}\mathbf{v}^T+\dfrac{\gamma_{\overline{\mathbf{w}}}^2}{1+\gamma_{\overline{\mathbf{w}}}}\overline{\mathbf{w}}\overline{\mathbf{w}}^T+\gamma_v\gamma_{\overline{\mathbf{w}}}\left( 1+\dfrac{\gamma_v\gamma_{\overline{\mathbf{w}}}}{(1+\gamma_v)(1+\gamma_{\overline{\mathbf{w}}})}\mathbf{v}\overline{\mathbf{w}}\right)\overline{\mathbf{w}}\mathbf{v}^T

Here, we have defined:

\boxed{\overline{\mathbf{w}}o \mathbf{v}\equiv \dfrac{\left( \gamma_{\overline{\mathbf{w}}}\gamma_v\mathbf{v}+\gamma_{\overline{\mathbf{w}}}\overline{\mathbf{w}}+\gamma_{\overline{\mathbf{w}}}\dfrac{\gamma_v^2}{1+\gamma_v}(\overline{\mathbf{w}}\mathbf{v})\mathbf{v}\right)}{\gamma (\mathbf{v},\overline{\mathbf{w}})}}

Remark (I): The matrix L given by

\begin{pmatrix}\gamma & -\mathbf{A}^T\\ -\mathbf{B} & M\end{pmatrix}

is NOT symmetric as we would expect from a boost. According to our decomposition for the matrix M it can be rewritten in the following way

\boxed{R=R(\overline{\mathbf{w}},\mathbf{v})=M(\overline{\mathbf{w}},\mathbf{v})-\dfrac{\mathbf{B}\mathbf{A}^T}{1+\gamma}}

This last equation is called the Thomas precession (or Thomas rotation) associated with the 3-vectors \mathbf{v},\overline{\mathbf{w}}. We observe that R is a proper orthogonal matrix from the multiplicative property of determinants and the fact that all boosts have determinant one. Equivalently, it follows from the condition \det R=\pm 1 for every orthogonal matrix R, together with the continuous dependence of R on the velocities and the initial condition R(0,0)=\mathbf{1}.
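The claims of this remark can be checked numerically. The following Python sketch (c=1) builds the two boost matrices, reads \gamma, \mathbf{A}, \mathbf{B} and M off the product, and forms R=M-\mathbf{B}\mathbf{A}^T/(1+\gamma). The function names `boost` and `thomas_rotation` and the sample velocities are my own illustrative choices, and the code is only a sketch under those assumptions.

```python
import numpy as np

def boost(v):
    """4x4 boost matrix L_v for a 3-velocity v (c = 1), acting on (t, x)."""
    v = np.asarray(v, dtype=float)
    g = 1.0 / np.sqrt(1.0 - v @ v)
    L = np.eye(4)
    L[0, 0] = g
    L[0, 1:] = -g * v
    L[1:, 0] = -g * v
    L[1:, 1:] = np.eye(3) + g**2 / (1.0 + g) * np.outer(v, v)
    return L

def thomas_rotation(w_bar, v):
    """Return (R, u) such that L_w L_v = L_R L_u, with R = M - B A^T / (1 + gamma)."""
    L = boost(w_bar) @ boost(v)
    g = L[0, 0]          # gamma(v, w_bar)
    A = -L[0, 1:]        # top row gives -A^T
    B = -L[1:, 0]        # first column gives -B
    M = L[1:, 1:]        # spatial block
    R = M - np.outer(B, A) / (1.0 + g)
    return R, A / g      # u = A / gamma is the composed velocity

v = np.array([0.5, 0.1, 0.0])
w_bar = np.array([-0.2, 0.6, 0.3])
R, u = thomas_rotation(w_bar, v)

L_R = np.eye(4)
L_R[1:, 1:] = R
axis = np.cross(v, w_bar)

print(np.allclose(R.T @ R, np.eye(3)))                      # R is orthogonal
print(np.isclose(np.linalg.det(R), 1.0))                    # and proper
print(np.allclose(boost(w_bar) @ boost(v), L_R @ boost(u)))  # L = L_R L_u
print(np.allclose(R @ axis, axis))                          # v x w_bar is left fixed
```

The third check shows the non-symmetric product of two boosts splitting into a pure boost with velocity \mathbf{u} followed by the rotation R.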

Remark (II): From the definitions of M, and the vectors \mathbf{A},\mathbf{B}, we deduce that \mathbf{v}\times \overline{\mathbf{w}} is an eigenvector of R with eigenvalue +1 and this gives the axis of rotation. The rotation angle \alpha as calculated from Tr R=1+2\cos\alpha is a complicated expression, and only after some clever manipulations or the use of the geometric algebra framework, it simplifies to

1+\cos\alpha=\dfrac{(1+\gamma_u+\gamma_v+\gamma_{\overline{w}})^2}{(1+\gamma_u)(1+\gamma_v)(1+\gamma_{\overline{w}})}>0

In order to understand what this equation means, we have to observe that the components \mathbf{v} and \overline{\mathbf{w}} refer to different reference frames, and then, the scalar product \mathbf{v}\mathbf{\overline{w}} and the cross product \mathbf{v}\times\overline{\mathbf{w}} must be given good analytic expressions before the geometric interpretation can be accomplished. Moreover, if we want to interpret the cross product as an axis in the reference frame S, and correspondingly we want to split L=L_{R\mathbf{v}}L_R, by the definition of \overline{\mathbf{w}}o\mathbf{v} we deduce that

\mathbf{v}\times\mathbf{u}=\dfrac{\mathbf{v}\times\overline{\mathbf{w}}}{\gamma_v(1+\mathbf{v}\overline{\mathbf{w}})}

and thus, the Thomas rotation of the inertial frame S has its axis orthogonal to the relative velocity vectors \mathbf{v},\mathbf{u} of the reference frames \overline{S}, \overline{\overline{S}} against S.

On the other hand, if we interpret the last equation above as an axis in the reference frame \overline{\overline{S}}, associated with the split L=L_RL_\mathbf{u}=L_{R\mathbf{u}}L_R, we deduce the following consequence. The reference frame \overline{\overline{S}} is obtained by boosting a certain frame S’ that is itself obtained from a rotation of S by R. Then, \overline{\overline{S}} acquires (compared with S or S’) a velocity whose components are R\mathbf{u} in the inertial frame S’. Reciprocally, the components of the velocity of S or S’ against the frame \overline{\overline{S}} are given, in \overline{\overline{S}}, by \overline{\overline{\mathbf{u}}}=-R\mathbf{u}. Moreover, from the Thomas precession formula for R we observe that R\mathbf{u} differs from \mathbf{u} only by linear combinations of the vectors \mathbf{v} and \overline{\mathbf{w}}. With all these results we easily derive:

\overline{\overline{\mathbf{u}}}\times \overline{\overline{\mathbf{w}}}=(-R\mathbf{u})\times (-\overline{\mathbf{w}})\propto \mathbf{v}\times \overline{\mathbf{w}}

i.e., the axis for the Thomas rotation matrix of \overline{\overline{S}} is orthogonal to the relative velocities \overline{\overline{\mathbf{u}}}, \overline{\overline{\mathbf{w}}} of the inertial frames S, \overline{S} against \overline{\overline{S}}. Finally, to find the rotation matrix, it is enough to restrict the problem to the case where \overline{\mathbf{w}} is small so that squares of it may be neglected. In this simple case, R becomes:

\boxed{R\approx \mathbf{1}+\dfrac{\gamma_v}{1+\gamma_v}\left(\overline{\mathbf{w}}\mathbf{v}^T-\mathbf{v}\overline{\mathbf{w}}^T\right)}

and where the rotation vector (its modulus is the rotation angle and its direction is the axis) is given by

\boxed{\alpha\approx -\dfrac{\gamma_v}{1+\gamma_v}\mathbf{v}\times\overline{\mathbf{w}}\approx -\dfrac{\gamma_v^2}{1+\gamma_v}\mathbf{v}\times\mathbf{u}}
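Both the closed form for the rotation angle and the small-\overline{\mathbf{w}} approximation of R can be tested against the exact matrix. The Python sketch below (c=1) compares \cos\alpha obtained from Tr R=1+2\cos\alpha with the expression 1+\cos\alpha=(1+\gamma_u+\gamma_v+\gamma_{\overline{w}})^2/[(1+\gamma_u)(1+\gamma_v)(1+\gamma_{\overline{w}})], and then compares the exact R with the first-order formula for a tiny \overline{\mathbf{w}}. The helper names and the sample velocities are hypothetical choices for illustration only.

```python
import numpy as np

def gamma(v):
    """Lorentz factor of a 3-velocity v (c = 1)."""
    return 1.0 / np.sqrt(1.0 - v @ v)

def boost(v):
    """4x4 boost matrix L_v acting on (t, x)."""
    g = gamma(v)
    L = np.eye(4)
    L[0, 0] = g
    L[0, 1:] = -g * v
    L[1:, 0] = -g * v
    L[1:, 1:] += g**2 / (1.0 + g) * np.outer(v, v)
    return L

def thomas_rotation(w_bar, v):
    """Return (R, gamma_u) for L = L_w L_v, with R = M - B A^T / (1 + gamma)."""
    L = boost(w_bar) @ boost(v)
    g_u = L[0, 0]
    R = L[1:, 1:] - np.outer(-L[1:, 0], -L[0, 1:]) / (1.0 + g_u)
    return R, g_u

v = np.array([0.7, 0.0, 0.0])
w_bar = np.array([0.1, 0.5, -0.2])
g_v, g_w = gamma(v), gamma(w_bar)

# Rotation angle: trace of R versus the closed form in the gammas
R, g_u = thomas_rotation(w_bar, v)
cos_from_trace = (np.trace(R) - 1.0) / 2.0
cos_from_gammas = ((1.0 + g_u + g_v + g_w) ** 2
                   / ((1.0 + g_u) * (1.0 + g_v) * (1.0 + g_w)) - 1.0)

# Small w_bar: exact R versus the first-order expression
w_small = 1e-6 * np.array([0.3, 1.0, -0.5])
R_small, _ = thomas_rotation(w_small, v)
R_approx = np.eye(3) + g_v / (1.0 + g_v) * (np.outer(w_small, v) - np.outer(v, w_small))

print(cos_from_trace, cos_from_gammas)
print(np.max(np.abs(R_small - R_approx)))  # second order in w_small
```

The residual in the last line is quadratic in the small velocity, as expected from neglecting squares of \overline{\mathbf{w}}.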

In order to understand the Physics behind the Thomas precession, we will consider one single experiment. Imagine a reference frame S in accelerated motion with respect to an inertial frame I. The spatial axes of S remain parallel at any time, in the sense that the instantaneous inertial frames coinciding with S at times t and t+\Delta t are related by a pure boost in the limit \Delta t\rightarrow 0. This may be arranged if we orient S with the aid of a very fast spinning torque-free gyroscope. Then, from the inertial frame I, S seems to be rotated at each instant of time and there is a continuous rotation of S against I, since the velocity of S changes continuously. This gyroscopic rotation of S relative to I IS the Thomas precession. We can determine the angular velocity of this motion in a straightforward manner. During the small interval of time \Delta t measured from I, the instantaneous velocity \mathbf{v} of S changes by a certain quantity \Delta \mathbf{v}, measured from I. In that case,

\Delta \alpha=-\gamma_v^2\mathbf{v}\times\dfrac{\Delta\mathbf{v}}{(1+\gamma_v)}

for the rotation vector during a time interval \Delta t. Thus, the angular velocity for the Thomas precession will be given by:

\boxed{\omega_T=-\dfrac{\gamma_v^2}{1+\gamma_v}\mathbf{v}\times\dfrac{d\mathbf{v}}{dt}}

or reintroducing the speed of light we get

\boxed{\omega_T=-\dfrac{\gamma_v^2}{1+\gamma_v}\mathbf{v}\times\dfrac{1}{c^2}\dfrac{d\mathbf{v}}{dt}=\dfrac{\gamma_v^2}{1+\gamma_v}\dfrac{1}{c^2}\mathbf{a}\times\mathbf{v}}
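A simple worked case is uniform circular motion. There \mathbf{a}\perp\mathbf{v} and \vert\mathbf{a}\vert=\omega v, so the boxed formula gives a precession of magnitude (\gamma_v-1)\omega, opposite to the sense of the orbit. The Python sketch below evaluates the formula for one point of a circular orbit; the function name `thomas_angular_velocity` and the numbers (orbital angular velocity 2, speed 0.6 in units of c) are arbitrary illustrative assumptions.

```python
import numpy as np

def thomas_angular_velocity(v, a, c=1.0):
    """omega_T = gamma^2 / (1 + gamma) * (a x v) / c^2."""
    g = 1.0 / np.sqrt(1.0 - (v @ v) / c**2)
    return g**2 / (1.0 + g) * np.cross(a, v) / c**2

omega = 2.0            # orbital angular velocity (counterclockwise, about +z)
speed = 0.6            # orbital speed in units of c
radius = speed / omega
phi = 0.3              # arbitrary position angle on the circle

pos = radius * np.array([np.cos(phi), np.sin(phi), 0.0])
vel = speed * np.array([-np.sin(phi), np.cos(phi), 0.0])
acc = -omega**2 * pos  # centripetal acceleration

omega_T = thomas_angular_velocity(vel, acc)
g = 1.0 / np.sqrt(1.0 - speed**2)

print(omega_T)             # points along -z: retrograde precession
print(-(g - 1.0) * omega)  # expected z-component, -(gamma - 1) * omega
```

For speed 0.6 one has \gamma_v=1.25, so the frame precesses at one quarter of the orbital angular velocity, in the opposite sense.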

Remark(I): The special relativistic effect given by the Thomas precession was used by Thomas himself to remove a discrepancy between the non-relativistic theory of the spinning electron and the experimental value of the fine structure. His observation was, in fact, that the gyromagnetic ratio of the electron calculated from the anomalous Zeeman effect led to a wrong value of the fine structure splitting (twice the observed one). The Thomas precession introduces a correction to the equation of motion of an electron in an external electromagnetic field, and such a correction changes the spin-orbit coupling, explaining the correct value of the fine structure.

Remark (II): In the framework of the relativistic quantum theory of the electron, Dirac realized that the effect of Thomas precession was automatically included!

Remark (III): Inside the Thomas paper, we find these interesting words

“(…)It seems that Abraham (1903) was the first to consider in any detail an electron with an axis. Many have since then considered spinning electron, ring electrons, and the like. Compton (1921) in particular suggested a quantized spin for the electron. It remained for Uhlenbeck and Goudsmit (1925) to show how this idea can be used to explain the anomalous Zeeman effect. The assumptions they had to make seemed to lead to optical and relativity doublet separations twice larger than those we observe. The purpose of the following paper, which contains the results mentioned in my recent letter to Nature (1926), is to investigate the kinematics of an electron with an axis on the basis of the restricted theory of relativity. The main fact used is that the combination of two Lorentz transformations without rotation in general is not of the same form(…)”.

From the historical viewpoint it should also be remarked that the precession effect was known by the end of 1912 to the mathematician E.Borel (C.R.Acad.Sci.,156. 215 (1913)). It was described by him (Borel, 1914) as well as by L.Silberstein (1914) in textbooks already in 1914. It seems that the effect was even known to A.Sommerfeld in 1909 and before him, perhaps even to H.Poincaré. The importance of Thomas’ work and papers on this subject was thus not only the rediscovery but also the relevant application to a pressing problem at that time, namely the structure of the atomic spectra and their fine structure!

Remark (IV): Not every Lorentz transformation can be written as the product of two boosts due to the Thomas precession!

THE LORENTZ GROUP AS A QUASIDIRECT PRODUCT: QUASIGROUPS, LOOPS AND GYROGROUPS

Even though we have not studied group theory in this blog, I feel the need to explain some group theory stuff related to the Thomas precession here.

The kinematical differences between Galilean and Einsteinian relativity theories are observed at many levels. The essential differences become apparent already at the level of the homogeneous groups without reversals (i.e., without space or time inversions). Let me first consider the Galileo group. It is generated by space rotations G_R=L_R and galilean boosts in any number and order. Using the notation we have developed in this post, we could write X'=G_\mathbf{v}X in this way:

G_\mathbf{v}=\begin{pmatrix}1 & \mathbf{0}^T\\ -\mathbf{v} & \mathbf{1}\end{pmatrix}

The following relationships are deduced:

G_RG_\mathbf{v}=G_{R\mathbf{v}}G_{R}

G_{R_1}G_{R_2}=G_{R_1R_2}

G_{\mathbf{v}_1}G_{\mathbf{v}_2}=G_{\mathbf{v}_1+\mathbf{v}_2}=G_{\mathbf{v}_2}G_{\mathbf{v}_1}

In the case of the Lorentz group, these equations are “generalized” into

L_RL_\mathbf{v}=L_{R\mathbf{v}}L_{R}

L_{R_1}L_{R_2}=L_{R_1R_2}

L_{\mathbf{v}_1}L_{\mathbf{v}_2}=L_{R(\mathbf{v}_1,\mathbf{v}_2)}L_{\mathbf{v}_1 o \mathbf{v}_2}

where R(\mathbf{v}_1,\mathbf{v}_2) is the Thomas precession and the circle denotes the nonlinear relativistic velocity addition. Be aware that the domain of velocities in special relativity is \vert v\vert<1, in units with c set to unity.

Both groups (Galileo and Lorentz) contain as a subgroup the group of all spatial rotations G_R\equiv L_R. The sets of galilean or lorentzian boosts G_v and L_v are invariant under conjugation by G_R=L_R, since

G_RG_vG_R^{-1}=G_{Rv}

L_RL_vL_R^{-1}=L_{Rv}

are boosts as well. In the case of the Galileo group, the set of (galilean) boosts forms an (abelian) subgroup and then, it provides an invariant subgroup. We can calculate the factor group with respect to it and we will obtain a group isomorphic to the subgroup of space rotations. Using the group law for the Galileo group:

\underbrace{G_{R_1}G_{v_1}}\underbrace{G_{R_2}G_{v_2}}=G_{R_1R_2}G_{R_2^{-1}v_1+v_2}=G_{R_3}G_{v_3}

with R_3=R_1R_2 and v_3=R_2^{-1}v_1+v_2. As a consequence, the homogeneous Galileo group (without reversals) is called a semidirect product of the rotation group with the Abelian group \mathbb{R}^3 of all boosts given by \mathbf{v}.

The case of the Lorentz group is more complicated. The reason is the Thomas precession. Indeed, the set of boosts does NOT form a subgroup of the Lorentz group! We can define a product on this set:
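The Galilean group law can be verified with explicit 4x4 matrices. In the Python sketch below, the product of two rotation-boost pairs is compared with a single pair built from R_3=R_1R_2 (the order read off from G_{R_1R_2} on the left-hand side) and v_3=R_2^{-1}v_1+v_2. The helper names `rot_x`, `rot_z`, `G_rot`, `G_boost` and all numerical values are my own illustrative assumptions.

```python
import numpy as np

def rot_z(angle):
    """Rotation by `angle` about the z-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(angle):
    """Rotation by `angle` about the x-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def G_rot(R):
    """4x4 matrix G_R acting on (t, x)."""
    G = np.eye(4)
    G[1:, 1:] = R
    return G

def G_boost(v):
    """4x4 Galilean boost G_v: t' = t, x' = x - v t."""
    G = np.eye(4)
    G[1:, 0] = -np.asarray(v, dtype=float)
    return G

R1 = rot_z(0.4) @ rot_x(-1.1)
R2 = rot_x(0.7) @ rot_z(2.0)
v1 = np.array([0.3, -1.2, 0.5])
v2 = np.array([2.0, 0.1, -0.7])

lhs = G_rot(R1) @ G_boost(v1) @ G_rot(R2) @ G_boost(v2)

R3 = R1 @ R2
v3 = R2.T @ v1 + v2          # R2^{-1} = R2^T for a rotation
rhs = G_rot(R3) @ G_boost(v3)

print(np.allclose(lhs, rhs))
# Galilean boosts simply add and commute:
print(np.allclose(G_boost(v1) @ G_boost(v2), G_boost(v1 + v2)))
```

The rotation only acts on the boost parameter, which is the hallmark of a semidirect product.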

\boxed{L_{v_1} oL_{v_2}=L_{v_1 o v_2}}

but, contrary to the result we got with the Galileo group, this product does NOT define a group structure. In fact, mathematicians call a set equipped with such a binary operation a groupoid (or magma). The domain of velocities becomes a groupoid under the multiplication v_1 o v_2. This has dramatic consequences. In particular, the associative law does not hold for this multiplication! Anyway, a weaker form of it is true, involving the Thomas precession/rotation formula:

\boxed{(v_1 o v_2) o v_3=(R^{-1}(v_2,v_3)v_1) o (v_2 o v_3)}
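This weak associativity can be tested numerically by reading both the composed velocity and the Thomas rotation off the product of two boost matrices, L_{a}L_{b}=L_{R(a,b)}L_{a o b}. The Python sketch below (c=1) checks the boxed identity and also shows that naive associativity fails. The function names `boost` and `compose` and the three sample velocities are hypothetical, chosen only for illustration.

```python
import numpy as np

def boost(v):
    """4x4 boost matrix L_v for a 3-velocity v (c = 1)."""
    v = np.asarray(v, dtype=float)
    g = 1.0 / np.sqrt(1.0 - v @ v)
    L = np.eye(4)
    L[0, 0] = g
    L[0, 1:] = -g * v
    L[1:, 0] = -g * v
    L[1:, 1:] += g**2 / (1.0 + g) * np.outer(v, v)
    return L

def compose(a, b):
    """Return (a o b, R(a, b)) defined by L_a L_b = L_R L_{a o b}."""
    L = boost(a) @ boost(b)
    g = L[0, 0]
    A = -L[0, 1:]
    B = -L[1:, 0]
    R = L[1:, 1:] - np.outer(B, A) / (1.0 + g)
    return A / g, R

v1 = np.array([0.3, 0.0, 0.0])
v2 = np.array([0.0, 0.4, 0.0])
v3 = np.array([0.2, 0.1, 0.5])

v12, _ = compose(v1, v2)
lhs, _ = compose(v12, v3)              # (v1 o v2) o v3

v23, R23 = compose(v2, v3)
rhs, _ = compose(R23.T @ v1, v23)      # (R^{-1}(v2, v3) v1) o (v2 o v3)
naive, _ = compose(v1, v23)            # v1 o (v2 o v3)

print(np.allclose(lhs, rhs))           # weak associativity holds
print(np.linalg.norm(lhs - naive))     # plain associativity does not
```

The inverse rotation is taken as the transpose, which is legitimate because R is orthogonal.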

Analogously, the multiplication is not commutative in general either, but it satisfies a weaker form of commutativity. While in general groupoids one has to distinguish between right and left unit elements (if any), we have indeed \mathbf{v}=\mathbf{0} as a “two-sided” unit element for the velocity groupoid. In the same manner, while in general groupoids right and left inverses (if any) may differ, in the case of the Lorentz group the groupoid associated with the Thomas precession has a unique two-sided inverse -\mathbf{v} for any \mathbf{v} relative to the groupoid multiplication law. It is NON-trivial (due to non-associativity), albeit true, that the equation given by

v_1 o v_2=v_3

may be solved uniquely for v_2 given v_1, v_3 and, given v_2, v_3, it may be solved uniquely for v_1. A groupoid satisfying this property (i.e., a groupoid that allows such a uniqueness in the solutions of its equation) is called a quasigroup.

In conclusion, we can say that the Lorentz group IS, in sharp contrast to the Galileo group, in no way a semidirect product, being what mathematicians and physicists call a simple group, i.e., it is a noncommutative group having no nontrivial invariant subgroup! Instead, the multiplication rule of the Lorentz group without reversals makes it, in the sense of our previous definitions, the quasidirect product of the rotation group (as a subgroup of the automorphism group of the velocity groupoid) with the so-called “weakly associative groupoid of velocities”. Here, a weakly associative(-commutative) groupoid means the following: a groupoid with a left-sided unit and left-sided inverses with the following properties:

1. Weak associativeness: R(\mathbf{0},\mathbf{v})=R(-\mathbf{v},\mathbf{v})=\mathbf{1}

2. Loop property (from Thomas precession formula): R(v_1,v_2)=R(v_1,v_1 o v_2)

and where the automorphism group of the velocity groupoid is defined by the following equation

Definition (Automorphism group of the velocity groupoid): (Sv_1)o(Sv_2)=S(v_1 o v_2)

Note: an associative groupoid is called a semigroup, and a semigroup with a two-sided unit element is called a monoid.

This algebraic structure hidden in the Lorentz group has been rediscovered several times throughout the history of mathematical physics. A groupoid satisfying the loop property has been named in other ways. For instance, in 1988, A. A. Ungar derived the above composition laws and the automorphism group of the Thomas precession R. Independently, A. Nesterov and coworkers in the Soviet Union had studied the same problem and quasigroup since 1986. And we can trace this structure even further back. 20 years before the Ungar “rediscovery”, H. Karzel had postulated a version of the same abstract object, and it was integrated into a richer one with two compositions (laws). He called it “near-domain”, where the automorphisms R (Thomas precessions) were to be realized by the (distributive) left multiplication with suitable elements of the near-domain (the reference is Abh. Math.Sem.Uni. Hamburg, 1968).

However, Ungar himself developed a more systematic treatment and description for the Thomas precession “groupoid” that is behind all this weird non-associative stuff in the Lorentz group in 3+1 dimensions. According to his new approach and terminology, the structure is called “gyrocommutative gyrogroup” and it includes the Thomas precession as “Thomas gyration” in this framework. If you want to learn more about gyrogroups and gyrovector spaces, read this article

http://en.wikipedia.org/wiki/Gyrovector_space

Some other authors, like Wefelscheid and coworkers, called these gyrogroups K-loops. Even more, there are two extra sources for this nontrivial mathematical structure.

Firstly, in Japan, M.Kikkawa had studied certain loops with a compatible differentiable structure called “homogeneous symmetric Lie groups” (Hiroshima Math. J.5, 141 (1975)). Even though he did not discuss any concrete example, it is natural from his definitions that it was the same structure Karzel found. Being romantic, we can see a certain justice in calling gyrogroups K-loops (since Kikkawa and Karzel discovered them first!). The second source can be traced back in time, since the same ideas were already known by L.Sabinin et alii circa 1972 (Sov. Math. Dokl.13,970(1972)). Their relation to symmetric homogeneous spaces of noncompact type has been discussed some years ago by W. Krammer and H.K.Urbantke, e.g., in Res. Math.33, 310 (1998).

Finally, a purely algebraic loop theory approach (with motivations far away from geometry or physics) was introduced by D. A. Robinson in 1966. In 1995, A. Kreuzer showed that it was indeed identical to K-loops, again adding some extra nomenclature (Math.Proc.Camb. Philos.Soc.123, 53 (1998)).

THOMAS PRECESSION: EASY DEDUCTION

We have seen that the composition of 2 Lorentz boosts, generally with 2 non-collinear velocities, results in a Lorentz transformation that IS NOT a pure boost but the composition of a single boost and a single spatial rotation. Indeed, this phenomenon is also called Wigner-Thomas rotation. As a final consequence, any body moving on a curvilinear trajectory experiences a rotational precession, first noted by Thomas in the relativistic theory of the spinning electron.

In this final section, I am going to review the really simple deduction of the Thomas precession formula given in the paper http://arxiv.org/abs/1211.1854

Imagine 3 different inertial observers Anna, Bob and Charles and their respective inertial frames A, B, and C attached to them. We choose A as a non-rotated frame with respect to B, and B as a non-rotated reference frame w.r.t. C. However, surprisingly, C is going to be rotated w.r.t. A and it is inevitable! We are going to understand it better. Let Bob embrace Charles and let them move together with constant velocity \mathbf{v} w.r.t. Anna. At some point, Charles decides to run away from Bob with a tiny velocity \mathbf{dv'} w.r.t. Bob. Then, Bob is moving with relative velocity -\mathbf{dv'} w.r.t. C and Anna is moving with relative velocity -\mathbf{v} w.r.t. B. We can show these events with the following diagram:

Now, we can write Charles’ velocity in Anna’s frame as the sum \mathbf{v}+d\mathbf{v}. Since the frame C is rotated with respect to the A frame, Anna’s velocity in the C frame, \hat{\mathbf{v}}, is not simply the opposite of this vector, and it will be calculated step by step as follows. Firstly, we remark that

\hat{\mathbf{v}}\neq -\mathbf{v}-d\mathbf{v}

Secondly, the angle d\mathbf{\Omega} of an infinitesimal rotation is given by:

d\mathbf{\Omega}=-\dfrac{\hat{\mathbf{v}}}{\vert \hat{\mathbf{v}}\vert }\times \dfrac{\mathbf{v}+d\mathbf{v}}{\vert \mathbf{v}+d\mathbf{v}\vert}\approx -\dfrac{\hat{\mathbf{v}}}{v^2}\times (\mathbf{v}+ d\mathbf{v})\;\;\; (1)

The precession rate in the A frame will be provided using the general nonlinear composition rule in SR. If the motion is parallel to the x-axis with velocity V, we do know that

u'_x=\dfrac{u_x-V}{1-\dfrac{u_x V}{c^2}}

u'_y=\dfrac{u_y\sqrt{1-\dfrac{V^2}{c^2}}}{1-\dfrac{u_x V}{c^2}}

u'_z=\dfrac{u_z\sqrt{1-\dfrac{V^2}{c^2}}}{1-\dfrac{u_x V}{c^2}}

and where \mathbf{u}=(u_x,u_y,u_z) and \mathbf{u}'=(u'_x,u'_y,u'_z) are the velocities of some object in the rest frame and the moving frame, respectively. For a relative velocity \mathbf{V}=(V_x,V_y,V_z) pointing in an arbitrary direction (not necessarily along the x-axis) we obtain the transformations

\boxed{\mathbf{u}'=\dfrac{\sqrt{1-\dfrac{V^2}{c^2}}\left(\mathbf{u}-\dfrac{\mathbf{u}\cdot\mathbf{V}}{V^2}\mathbf{V}\right)-\left( \mathbf{V}-\dfrac{\mathbf{u}\cdot\mathbf{V}}{V^2}\mathbf{V}\right)}{1-\dfrac{\mathbf{u}\cdot\mathbf{V}}{c^2}}\;\;\; (2)}

and where the unprimed and primed frames are mutually non-rotated to each other. Using this last equation, (2), we can easily describe the transition from the frame A to the frame B. It involves the substitutions:

\mathbf{V}\rightarrow \mathbf{v}

\mathbf{u}\rightarrow \mathbf{v}+d\mathbf{v}

\mathbf{u}'\rightarrow d\mathbf{v}'

Keeping only the first order terms in d\mathbf{v}, we get the following expansion from eq.(2):

d\mathbf{v}'\approx \dfrac{1}{\sqrt{1-\dfrac{v^2}{c^2}}}\left(d\mathbf{v}-\dfrac{\mathbf{v}\cdot d\mathbf{v}}{v^2}\mathbf{v}\right)+\dfrac{1}{1-\dfrac{v^2}{c^2}}\dfrac{\mathbf{v}\cdot d\mathbf{v}}{v^2}\mathbf{v}\;\;\; (3)

Using again eq.(2) to make the transition between the B frame to the C frame, i.e., making the substitutions:

\mathbf{V}\rightarrow d\mathbf{v}'

\mathbf{u}\rightarrow -\mathbf{v}

\mathbf{u}'\rightarrow \hat{\mathbf{v}}

and dropping higher order terms in d\mathbf{v}', we obtain the following formula

\boxed{\hat{\mathbf{v}}\approx -\mathbf{v}+\dfrac{\mathbf{v}\cdot d\mathbf{v}'}{c^2}\mathbf{v}-d\mathbf{v}'\;\;\; (4)}

The final step is easy: we plug eq.(3) into eq.(4) and the resulting expression into eq.(1). Then, we divide by the differential dt in the final formula to obtain the celebrated Thomas precession formula:

\boxed{\dot{\Omega}=\dfrac{d\Omega}{dt}=\omega_T=-\dfrac{1}{v^2}\left(\dfrac{1}{\sqrt{1-\dfrac{v^2}{c^2}}}-1\right)\mathbf{v}\times \dot{\mathbf{v}}\;\;\; (5)}

or equivalently

\boxed{\dot{\Omega}=\dfrac{d\Omega}{dt}=\omega_T=-\dfrac{1}{v^2}\left(\gamma_{\mathbf{v}}-1\right)\mathbf{v}\times \mathbf{a}\;\;\; (6)}

It can easily be shown that these formulae are the same as the one given previously above, writing v^2 in terms of \gamma and performing some elementary algebraic manipulations.
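The whole chain of steps can also be followed numerically. The Python sketch below (c=1) applies the exact transformation (2) twice, first from A to B and then from B to C, evaluates the infinitesimal rotation (1), and compares it with eq.(6) multiplied by dt and with the earlier form with the factor \gamma^2/(1+\gamma). The function name `transform_velocity`, the velocity \mathbf{v} and the small increment d\mathbf{v} are illustrative assumptions; agreement is only expected up to second order terms in d\mathbf{v}.

```python
import numpy as np

def transform_velocity(u, V):
    """Eq. (2) with c = 1: velocity u as seen from a non-rotated frame moving with V."""
    V2 = V @ V
    u_par = (u @ V) / V2 * V   # component of u along V
    return (np.sqrt(1.0 - V2) * (u - u_par) - (V - u_par)) / (1.0 - u @ V)

v = np.array([0.6, 0.0, 0.0])
dv = 1e-6 * np.array([0.3, 1.0, -0.5])

# A -> B: Charles' velocity relative to Bob
dv_prime = transform_velocity(v + dv, v)
# B -> C: Anna's velocity relative to Charles
v_hat = transform_velocity(-v, dv_prime)

# Eq. (1): infinitesimal rotation vector
d_omega = -np.cross(v_hat / np.linalg.norm(v_hat),
                    (v + dv) / np.linalg.norm(v + dv))

g = 1.0 / np.sqrt(1.0 - v @ v)
pred_easy = -(g - 1.0) / (v @ v) * np.cross(v, dv)   # eq. (6) times dt
pred_boxed = -g**2 / (1.0 + g) * np.cross(v, dv)     # earlier formula times dt

rel_err = np.linalg.norm(d_omega - pred_easy) / np.linalg.norm(pred_easy)
print(rel_err)                                        # small, of the order of |dv|
print(np.allclose(pred_easy, pred_boxed, rtol=1e-10, atol=0.0))
```

The last line is just the algebraic identity (\gamma-1)/v^2=\gamma^2/(1+\gamma), valid in units with c=1.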

Aren’t you fascinated by how these wonderful mathematical structures emerge from the physical world? I can say it: Fascinating is not enough for my surprised mind!