LOG#113. Bohr’s legacy (I).

Dedicated to Niels Bohr

and his atomic model

(1913-2013)

1st part: A centenary model


This is a blog entry devoted to the memory of a great scientist, N. Bohr, one of the greatest minds of the 20th century and one of the fathers of the current quantum model of atoms and molecules.

[Image: Niels Bohr]

One century ago, Bohr pioneered the introduction of “quantization” rules into the atomic realm, 8 years after the epic Annus Mirabilis of A. Einstein (1905). Please, don’t forget that Einstein himself was the first physicist to apply the Planck hypothesis to “serious” physics problems, explaining the photoelectric effect in a simple way with the aid of “quanta of light” (a.k.a. photons!). Therefore, it is not correct to assert that N. Bohr was the “first” quantum physicist. Indeed, Planck and Einstein were the first. That said, Bohr was the first to apply the quantum hypothesis to the atomic domain, changing forever the naive picture of atoms coming from “classical” physics. I decided that this year I would write something to honour the centenary of his atomic model (for the hydrogen atom).

I hope you will enjoy the next (short) thread…

Atomic mysteries

When I was young, and the Periodic Table (the ordered list or catalogue of elements) was shown and explained to me for the first time, I wondered how many elements there could be in Nature. Are there 103? 118? Maybe 212? 1000? 10^{23}? Or 10^{100}? \infty, Infinity?

We must remember what an atom is… Atom comes from the Greek word \alpha\tau o\mu o\varsigma, meaning “with no parts”. That is, an atom is (at least in its original idea) something that cannot be broken into smaller parts. Nice concept, isn’t it?

Greek philosophers wondered millennia ago whether there is a limit to the divisibility of matter, and whether there is an “ultimate principle” or “arche” ruling the whole Universe (remarkably, this is not very different from the questions that theoretical physicists are trying to answer even now, and will in the future!). Different schools and ideas arose. I am not very interested today in discussing Philosophy (even though it is interesting in its own way), so let me simplify the mainstream ideas of several thousand years ago (!!!!):

1st. There is a well-defined ultimate “element”/”substance” and an ultimate “principle”. Matter is infinitely divisible. There are deep laws that govern the Universe and the physical Universe, in a cosmic harmony.

2nd. There is a well-defined ultimate “element”/”substance” and an ultimate “principle”. Matter is FINITELY divisible. There are deep laws that govern the Universe and the physical Universe, in a cosmic harmony.

3rd. There is no well-defined ultimate “element”/”substance” or ultimate principle. Chaos rules the Universe. Matter is infinitely divisible.

4th. There is no well-defined ultimate “element”/”substance” or ultimate principle. Chaos rules the Universe. Matter is finitely divisible.

Remark: Please, note the striking “similarity” with some of the still-open problems of Physics. The existence of a Theory Of Everything (TOE) is the analogue of the quest for a first principle/fundamental element by the ancient Greek philosophers, or by any other philosophy all over the world. S.W. Hawking himself provided in his A Brief History of Time the following (3!) alternative approaches:

1st. There is no TOE. There is only a chaotic pattern of regularities we call “physical laws”. But Nature itself is ultimately chaotic and the finite human mind cannot understand its ultimate description.

2nd. There is no TOE. There is only an increasing number of theories, more and more precise and/or more and more accurate, without any limit. As we are finite beings, we can only try to guess better and better approximations to the ultimate reality (beyond our imagination), and the TOE cannot be reached in our whole lifetime or even in the whole lifetime of our species/civilization.

3rd. There is a well-defined TOE, with its own principles and consequences. We will find it if we are persistent enough and clever enough. All physical events could be derived from this theory. If we don’t find the “ultimate theory and its principles”, it is not because it is non-existent; it is only that we are not smart enough. Try harder (if you can…)!

If I added other (non-Greek) philosophies, I could create some other combinations but, as I told you above, I am not going to discuss Philosophy here, at least not more than necessary.

As you probably know, the atomic idea was mainly defended by Leucippus and Democritus, based on previous ideas by Anaxagoras. It is quite likely that Anaxagoras himself learned them from India (or even from China), but that is quite speculative… Well, the key point of the atomic idea is that you cannot keep smashing bits of matter into smaller and smaller pieces forever. Somewhere, the process of breaking down the fundamental constituents of matter must end… But where? And above all, how can we find an atom or “see” what an atom looks like? Obviously, the ancient Greeks had no idea of how to do that: even knowing the “ground idea” of what an atom is, they had no experimental device to search for one. Thus, the atomic idea was put into the freezer until the 18th and 19th centuries, when the advances in experimental (and theoretical) Chemistry revived the concept and the whole theory. But Nature had many surprises ready for us… Let me continue this a bit later…

In the 19th century, with the discovery of the ponderal (weight) laws of Chemistry, Dalton and other chemists were stunned. Dalton was the man who finally brought atomism back into “real” theoretical Science, although the existence of atoms remained controversial until the 20th century. Dalton concluded that there was a unique atom for each element, using Lavoisier’s definition of an element as a substance that could not be analyzed into something simpler. Thus, Dalton arrived at an important conclusion:

“(…)Chemical analysis and synthesis go no farther than to the separation of particles one from another, and to their reunion. No new creation or destruction of matter is within the reach of chemical agency. We might as well attempt to introduce a new planet into the solar system, or to annihilate one already in existence, as to create or destroy a particle of hydrogen. All the changes we can produce, consist in separating particles that are in a state of cohesion or combination, and joining those that were previously at a distance(…)”.

The reality of atoms was a highly debated topic throughout the 19th century. It is worth remarking that it was Einstein himself (yes, him… again) who went further and, with his studies of Brownian motion, established their physical existence. It was a brilliant contribution to this area, even though, in time, he turned against (the interpretation of) Quantum Mechanics… But that is a different story, not to be told today.

Dalton’s atoms or Dalton atomic model was very simple.

[Figure: Dalton’s “A New System of Chemical Philosophy”]

Atoms had no parts and thus, they were truly indivisible particles. However, the electrical studies of matter and the electromagnetic theory put this naive atomic model into doubt. After the study of cathode rays and the discovery of the electron by J.J. Thomson in 1897 (no, it is not J.J. Abrams), it became clear that atoms were NOT indivisible after all! Surprising, isn’t it? It is! Chemical atoms are NOT indivisible. They do have PARTS.

Thomson’s model, or “plum pudding” model, came to the rescue… Dalton believed that atoms were solid spheres, but J.J. Thomson was forced (due to the existence of the electron) to elaborate a “more complex” atomic model. He suggested that atoms were a spherical “fluid” mass with positive charge, and that electrons were embedded in that sphere like the plums in a “plum pudding” cake. I have to admit that I was impressed by this model when I was 14… It seemed too ugly to me to be true, but anyway it has its virtues (it can explain the cathode ray experiment!).

[Figure: formation of cathode rays]

[Figure: the Thomson and Nagaoka atomic models]

The next big step was the Rutherford experiment! Thomson KNEW that electrons were smaller pieces inside the atom but, despite his efforts to find the positive particles (he pursued his own path here, since he had found the reason for the canal rays), he could not find them (and they should be there, since atoms are electrically neutral). However, clever people were already investigating radioactivity and atomic structure with other ideas… In 1911, E. Rutherford explained the celebrated gold foil experiment, performed with the aid of his assistants, Geiger and Marsden.

[Figure: the Rutherford gold foil experiment]

To Rutherford’s surprise, his assistants and collaborators provided a shocking set of results. To explain all the observations, the Rutherford experiment led to the following set of hypotheses:

1st. Atoms are mostly vacuum space.

2nd. Atoms have a dense zone of positive charge, much smaller than the whole atom. It is the atomic nucleus!

3rd. Nuclei have positive charge, and electrons negative charge.

Rutherford did not know from the beginning how the charge was arranged and distributed inside the atom. He had to improve the analysis and perform additional experiments in order to propose his “solar” (planetary) atomic model and to get an estimate of the size of nuclei (of the order of a few fm, with 1fm=10^{-15}m). In fact, years before him, the Japanese physicist Nagaoka had proposed a similar-looking “saturnian” atomic model. It was unstable, though, due to the electric repulsion between the electronic “rings”, and it had been abandoned (previously there had even been a “cubic” model of the atom, but it too failed to explain every atomic experiment).

And this is the point where the theory becomes “hard” again. Rutherford supposed that the electron orbits around nuclei were circular (or almost circular), so that electrons experienced centripetal forces due to the electrical attraction of the nucleus. The classical electromagnetic theory says that any accelerated charged particle (and you do have acceleration with a centripetal force) should emit electromagnetic waves, losing energy, and then electrons should fall onto the nucleus (indeed, the predicted fall time is ridiculously tiny). We do not observe that, so something is wrong with our “classical” picture of atoms and radiation (this was also hinted at by the photoelectric effect and blackbody physics, so it was not too surprising, but it was challenging to find the rules and the “new mechanics” that explain the atomic stability of matter). Moreover, atomic spectra had been known to be discrete (not continuous) since the 19th century. Finding the new dynamics and its principles became one of the outstanding issues in the theoretical (and experimental) community.

The first scientist to determine a semiclassical but almost “quantum” and realistic atomic spectrum (for the simplest atom, hydrogen) was Niels Bohr. The Bohr model of the hydrogen atom is still explained at schools, not only due to its historical interest, but also due to the no less important fact that it provides the right energy levels for the simplest atom (indeed, Quantum Mechanics reproduces the Bohr formulae) and that its equations are useful and valid from a quantitative viewpoint. Of course, the Bohr model does not explain the Stark effect, the Zeeman effect, the hyperfine structure of the hydrogen atom or some other important “quantum/relativistic” effects, but it is a really useful toy model and analytical machine to think about the challenges and limits of the Quantum Mechanics of atoms and molecules.

The Bohr model cannot be applied to helium and the other elements in the Periodic Table (their structure is described by Quantum Mechanics), so it might seem very boring but, as we will see, it has many secrets and unexpected surprises in its core…

Bohr model for the hydrogen atom

[Figures: the Bohr atom model, its level transitions, and the Balmer series]

Bohr model hypotheses/postulates:

1st. Electrons describe circular orbits around the proton (in the hydrogen atom). The centripetal force is provided by the electrostatic force of the proton.

2nd. Electrons, while in “stationary” orbits with a fixed energy, do NOT radiate electromagnetic waves (note that this postulate goes against the classical theory of electromagnetism as it was known in the 19th century).

3rd. When a single electron passes from one energetic level to another, the energy transitions/energy differences satisfy the Planck law. That is, during level transitions, \Delta E=hf.

In summary, we have:

[Figure: summary of the Bohr model postulates/hypotheses]

Firstly, we begin with the equality between the electron-proton electrostatic force and the centripetal force in the atom:

\begin{pmatrix}\mbox{Centripetal}\\ \mbox{Force}\end{pmatrix}=\begin{pmatrix}\mbox{Electron-proton}\\ \mbox{electric force}\end{pmatrix}

Mathematically speaking, this first postulate/ansatz requires that q_1=q_2=e (in absolute value), where e=1\mbox{.}602\cdot 10^{-19}C is the elementary electric charge (the absolute value of both the electron and the proton charge) and m_e=9.11\cdot 10^{-31}kg is the electron mass:

F_c=\dfrac{m_ev^2}{R} and F_{el,C}=K_C\dfrac{q_1q_2}{R^2}=K_C\dfrac{e^2}{R^2} imply that

(1) \boxed{F_c=F_{el,C}}\leftrightarrow \boxed{\dfrac{m_ev^2}{R}=\dfrac{K_Ce^2}{R^2}}\leftrightarrow \boxed{v^2=\left(\dfrac{K_C}{m_e}\right)\left(\dfrac{e^2}{R}\right)}

Remark: Instead of having the electron mass, it would be more precise to use the “reduced” mass for this two body problem. The reduced mass is, by definition,

\mu=m_{red}=\dfrac{m_1m_2}{m_1+m_2}=\dfrac{m_em_p}{m_e+m_p}

However, it is easy to realize that the reduced mass is essentially the electron mass (since m_p\approx 1836m_e)

\mu=\dfrac{m_e}{1+\left(\dfrac{m_e}{m_p}\right)}\approx m_e(1-\dfrac{m_e}{m_p}+\ldots)=m_e+\mathcal{O} \left(\dfrac{m_e^2}{m_p}\right)
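
To get a feeling for how small the reduced-mass correction is, here is a minimal Python sketch. The mass values are rounded values that I am assuming here (the post only quotes m_e to three digits), and the variable names are purely illustrative.

```python
# Reduced mass of the electron-proton system versus the bare electron mass.
# Assumed (rounded) values in kg; they are not quoted to this precision in the post.
m_e = 9.1093837e-31   # electron mass
m_p = 1.6726219e-27   # proton mass

mu = m_e * m_p / (m_e + m_p)        # reduced mass, by definition
relative_shift = (mu - m_e) / m_e   # exact relative correction
first_order = -m_e / m_p            # leading term of the expansion in m_e/m_p

print(f"mu/m_e           = {mu / m_e:.6f}")
print(f"relative shift   = {relative_shift:.3e}")
print(f"first-order term = {first_order:.3e}")
```

The correction is about five parts in ten thousand, which is why using m_e instead of \mu is a good approximation for hydrogen.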

Bohr’s second great idea was to quantize the angular momentum. Classically, angular momentum can take ANY value; Bohr’s great intuition suggested that it could only take integer multiples of a fundamental constant, the (reduced) Planck constant. In fact, assuming stationary circular orbits, the quantization rule provides

(2) \boxed{m_ev(2\pi R)=nh} or \boxed{L=m_evR=n\dfrac{h}{2\pi}=n\hbar} with \hbar=\dfrac{h}{2\pi} and n=1,2,3,\ldots,\infty a positive integer.

Remark: h=6\mbox{.}63\cdot 10^{-34}Js and \hbar=\dfrac{h}{2\pi}=1\mbox{.}055\cdot 10^{-34}Js are the Planck constant and the reduced Planck constant, respectively.

From this quantization rule (2), we can easily get

vR=\left(\dfrac{n\hbar}{m_e}\right) and then v^2R^2=\left(\dfrac{n\hbar}{m_e}\right)^2

Thus, we have

R^2=\left(\dfrac{n\hbar}{m_e}\right)^2\dfrac{1}{v^2}

Using the result we got in (1) for the squared velocity of the electron in the circular orbit, we deduce the quantization rule for the orbits in the hydrogen atom according to Bohr’s hypotheses:

R^2=\left(\dfrac{n\hbar}{m_e}\right)^2\left(\dfrac{m_eR}{K_Ce^2}\right)

R=\dfrac{n^2\hbar^2}{m_e^2}\dfrac{m_e}{K_Ce^2}

(3) \boxed{R_n=R(n)=\left(\dfrac{\hbar^2}{m_eK_Ce^2}\right)n^2}\leftrightarrow \boxed{R_n=a_Bn^2}

where n=1,2,3,\ldots,\infty again and the Bohr radius a_B is defined to be

(4) \boxed{a_B=\dfrac{\hbar^2}{m_eK_Ce^2}}

Inserting values into (4), we obtain the celebrated value of the Bohr radius

a_B\approx 0\mbox{.}53\AA=53pm=5\mbox{.}3\cdot 10^{-11}m
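
As a quick numerical check of (3) and (4), the following Python sketch evaluates the Bohr radius and the first few orbit radii. The constants are rounded SI values that I am assuming for illustration, and the function name orbit_radius is hypothetical.

```python
# Bohr radius a_B = hbar^2 / (m_e K_C e^2) and orbit radii R_n = a_B n^2.
# Constants in SI units; rounded values assumed for illustration.
hbar = 1.054571817e-34   # reduced Planck constant, J s
m_e = 9.1093837e-31      # electron mass, kg
K_C = 8.9875517923e9     # Coulomb constant, N m^2 / C^2
e = 1.602176634e-19      # elementary charge, C

a_B = hbar**2 / (m_e * K_C * e**2)   # equation (4)

def orbit_radius(n):
    """Radius of the n-th Bohr orbit of hydrogen, equation (3)."""
    return a_B * n**2

for n in (1, 2, 3):
    print(f"n={n}: R_n = {orbit_radius(n) * 1e10:.2f} angstrom")
```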

The third important consequence is the spectrum of energy levels in the hydrogen atom. To obtain the energy spectrum, there are two equivalent paths (in fact, they are the same): use the virial theorem, or insert (1) into the total energy of the electron-proton system. The total energy of the hydrogen atom can be written

E=\mbox{Kinetic Energy}+\mbox{(electrostatic) Potential Energy}

E=\dfrac{p^2}{2m_e}-\dfrac{K_Ce^2}{R}=\dfrac{m_ev^2}{2}-\dfrac{K_Ce^2}{R}

Substituting (1) into this, we get exactly the expression expected from the virial theorem for a 1/r potential (i.e. E=E_p/2):

E=\dfrac{m_ev^2}{2}-\dfrac{K_Ce^2}{R}=-K_C\dfrac{e^2}{2R}

(5) \boxed{E=-K_C\dfrac{e^2}{2R}}

Inserting into (5) the quantized values of the orbit, we deduce the famous and well-known formula for the spectrum of the hydrogen atom (known to Balmer and the spectroscopists at the end of the 19th century and the beginning of the 20th century):

(6) \boxed{E_n=E(n)=-\dfrac{m_eK_C^2e^4}{2\hbar^2n^2}=-\dfrac{m_e}{2}\left(\dfrac{K_Ce^2}{n\hbar}\right)^2=-\dfrac{\mbox{Ry}}{n^2}} \;\;\forall n=1,2,3,\ldots,\infty

and where we have defined the Rydberg (constant) as

(7) \boxed{\mbox{Ry}=\dfrac{m_e(K_Ce^2)^2}{2\hbar^2}=\dfrac{m_eK_C^2e^4}{2\hbar^2}=\dfrac{1}{2}\alpha^2 m_ec^2}

Its value is Ry=R_H=2.18\cdot 10^{-18}J=13\mbox{.}6eV. Here, the electromagnetic fine structure constant (alpha) is

\alpha=K_C\dfrac{e^2}{\hbar c}

and c is the speed of light. In fact, using the quantum relation

E=\dfrac{hc}{\lambda}

we can deduce that the Rydberg corresponds to a (spectroscopic, i.e. k=1/\lambda) wavenumber

k=1\mbox{.}097\cdot 10^{7}m^{-1}

or a frequency

f=\nu=3\mbox{.}29\cdot 10^{15}Hz

and a wavelength

\lambda =912\AA=91\mbox{.}2nm

Please, check it yourself! :D.
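
One possible way to do that check is the short Python sketch below. It assumes rounded SI values for the constants (my choice, not the post’s) and simply evaluates (7) together with E=hc/\lambda and E=hf.

```python
# Numerical check of the Rydberg energy and its equivalent wavenumber,
# frequency and wavelength. SI constants, rounded values assumed.
import math

hbar = 1.054571817e-34   # reduced Planck constant, J s
h = 2 * math.pi * hbar   # Planck constant, J s
m_e = 9.1093837e-31      # electron mass, kg
K_C = 8.9875517923e9     # Coulomb constant, N m^2 / C^2
e = 1.602176634e-19      # elementary charge, C
c = 299792458.0          # speed of light, m / s

alpha = K_C * e**2 / (hbar * c)              # fine structure constant
Ry = m_e * (K_C * e**2)**2 / (2 * hbar**2)   # equation (7), in joules
Ry_alt = 0.5 * alpha**2 * m_e * c**2         # last form of (7)

wavelength = h * c / Ry       # from E = h c / lambda
wavenumber = 1 / wavelength   # spectroscopic wavenumber 1/lambda
frequency = Ry / h            # from E = h f

print(f"1/alpha    = {1 / alpha:.3f}")
print(f"Ry         = {Ry:.3e} J = {Ry / e:.2f} eV")
print(f"wavenumber = {wavenumber:.4e} 1/m")
print(f"frequency  = {frequency:.3e} Hz")
print(f"wavelength = {wavelength * 1e9:.1f} nm")
```

With the infinite-mass (m_e) formula the wavelength comes out slightly below the rounded 912\AA quoted above; the last digit depends on the rounding and on whether the reduced mass is used.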

The above results allowed Bohr to explain the spectral series of the hydrogen atom. He won the Nobel Prize due to this wonderful achievement…
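
To see how (6) and the third postulate \Delta E=hf produce a spectral series, here is a minimal Python sketch for the Balmer lines (transitions ending on n=2). The rounded values of Ry in eV and of hc in eV nm are assumptions of mine, and the function names are hypothetical.

```python
# Wavelengths of hydrogen emission lines from E_n = -Ry / n^2 and Delta E = h f.
# Assumed rounded values: Ry in eV and the product h*c in eV nm.
RY_EV = 13.605693     # Rydberg energy, eV
HC_EV_NM = 1239.842   # h * c, eV nm

def energy_level(n):
    """Bohr energy of level n in eV, equation (6)."""
    return -RY_EV / n**2

def transition_wavelength_nm(n_initial, n_final):
    """Wavelength (nm) of the photon emitted in the jump n_initial -> n_final."""
    delta_e = energy_level(n_initial) - energy_level(n_final)  # positive for emission
    return HC_EV_NM / delta_e

# Balmer series: transitions ending on n = 2
for n in range(3, 7):
    print(f"{n} -> 2: {transition_wavelength_nm(n, 2):.1f} nm")
```

The first line lands in the red and the following ones move towards the violet, as in the visible hydrogen lines fitted by Balmer.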

Hydrogenic atoms

(and positronium, muonium,…)

In fact, it is straightforward to extend all these results to “hydrogenic” (“hydrogenoid”) atoms, i.e., to atoms with only a single electron BUT a nucleus with charge equal to Ze, where Z>1 is an integer (the atomic number)! The easiest way to obtain the results is not to repeat the deduction but to rescale the proton charge, i.e., you plug in q_2=Ze, so that only the product q_1q_2=e^2\longrightarrow Ze^2 changes (be careful to make the right scaling in the formulae, since the electron charge q_1=e is untouched). The final result for the radius and the energy spectrum is as follows:

A) From R_n=\left(\dfrac{\hbar^2}{m_eK_Ce^2}\right)n^2, with e^2\longrightarrow Ze^2, you get

(8) \boxed{\bar{R}_n=\bar{R}(n)=\dfrac{\hbar^2}{m_eK_CZe^2}n^2=\dfrac{a_Bn^2}{Z}}

B) From E_n=-m_e\dfrac{(K_Ce^2)^2}{2\hbar^2n^2}, with the rescaling e^2\longrightarrow Ze^2, you get

(9) \boxed{\bar{E}_n=\bar{E}(n)=-m_e\dfrac{Z^2(K_Ce^2)^2}{2\hbar^2n^2}=-\dfrac{Z^2\alpha^2m_ec^2}{2n^2}=-\dfrac{Z^2Ry}{n^2}}

Therefore, the consequence of the rescaling of the nuclear charge is that energy levels are “enlarged” by a factor Z^2 and that the orbits are “squeezed” or “contracted” by a factor 1/Z.
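
The scalings in (8) and (9) can be sketched in a few lines of Python. The rounded values of a_B and Ry are assumptions, the function names are hypothetical, and the ions listed are just illustrative one-electron examples.

```python
# Hydrogenic scaling: radii shrink as 1/Z, energies grow as Z^2.
# Assumed rounded values for the Bohr radius and the Rydberg energy.
A_B_ANGSTROM = 0.529177   # Bohr radius, angstrom
RY_EV = 13.605693         # Rydberg energy, eV

def hydrogenic_radius(n, Z=1):
    """Orbit radius in angstrom, equation (8)."""
    return A_B_ANGSTROM * n**2 / Z

def hydrogenic_energy(n, Z=1):
    """Energy level in eV, equation (9)."""
    return -Z**2 * RY_EV / n**2

for name, Z in (("H", 1), ("He+", 2), ("Li2+", 3)):
    print(f"{name:5s} R_1 = {hydrogenic_radius(1, Z):.4f} angstrom, "
          f"E_1 = {hydrogenic_energy(1, Z):.2f} eV")
```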

Exercise: Can you obtain the energy levels and the radius for positronium (an electron-positron system instead of an electron-proton one)? What happens with muonium (the exotic atom formed by an electron orbiting an antimuon)? And the muonic atom (a muon orbiting a proton)? And a muon orbiting an antimuon? And the tau particle orbiting an antitau, or the electron orbiting an antitau, or a tau orbiting a proton (supposing that it were possible, of course, since the tau particle is unstable)? Calculate the “Bohr radius” and the “Rydberg” constant for positronium, muonium, the muonic atom (or the muon-antimuon atom) and tauonium (or the tau-antitau atom). Hint: think about the reduced mass for positronium and muonium, then make a suitable mass/energy or radius rescaling.

Now, we can also calculate the velocity of an electron in the quantized orbits for the Bohr atom and the hydrogenic atom. Using (2), together with (3) and (8),

m_evR=n\hbar\leftrightarrow vR=\dfrac{n\hbar}{m_e}\leftrightarrow v^2R^2=\dfrac{n^2\hbar^2}{m_e^2}

or

v^2=\left(\dfrac{n\hbar}{m_e}\right)^2\dfrac{1}{R^2}

and inserting the quantized values of the orbit radius

v_n^2=\dfrac{K_Ce^2}{m_eR_n}=\dfrac{m_e(K_Ce^2)^2}{m_en^2\hbar^2}

so, for the Bohr atom (hydrogen)

(10) \boxed{v_n=v(n)=\dfrac{K_Ce^2}{\hbar n}=\dfrac{\alpha c}{n}}

In the case of hydrogenic atoms, the rescaling of the electric charge yields

(11) \boxed{\bar{v}_n=\bar {v}(n)=\dfrac{ZK_Ce^2}{\hbar n}=\dfrac{Z\alpha c}{n}}

so, the hydrogenic atoms have an “enlarged” electron velocity in the orbits, by a factor of Z.
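
A small Python sketch of (10) and (11) follows. The value of \alpha is a rounded assumption, the function names are hypothetical, and the values of Z are arbitrary illustrative choices.

```python
# Orbital speed in the Bohr / hydrogenic atom: v_n = Z * alpha * c / n.
ALPHA = 7.2973525693e-3   # fine structure constant (assumed rounded value)
C = 299792458.0           # speed of light, m / s

def orbital_beta(n, Z=1):
    """v_n / c = Z * alpha / n, equations (10) and (11)."""
    return Z * ALPHA / n

def orbital_speed(n, Z=1):
    """Orbital speed in m / s."""
    return orbital_beta(n, Z) * C

for Z in (1, 26, 92):
    print(f"Z={Z:3d}: v_1 = {orbital_speed(1, Z):.3e} m/s "
          f"(v_1/c = {orbital_beta(1, Z):.4f})")
```

For hydrogen the ground-state speed is below one percent of c, but the ratio v_1/c grows linearly with Z.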

The feynmanium

This result for velocities is very interesting. Suppose we consider the fundamental level n=1 (or the orbital 1s in Quantum Mechanics, since, magically or not, Quantum Mechanics reproduces the results for the Bohr atom and the hydrogenic atoms we have seen here, plus other effects we will not discuss today, related to spin and to the energy splittings of perturbed atoms). Then, the last formula yields, in the hydrogenic case,

v_1=Z\alpha c

Furthermore, suppose now in addition that we have some “superheavy” (hydrogenic) atom with Z>137 (note that \alpha\approx 1/137 at ordinary energies), say Z=138 or greater. Then, the electron moves faster than the speed of light!!!!! That is, for hydrogenic atoms with Z>137, and considering the fundamental level, the electron would move with v>c. This fact is “surprising”. The element with Z=137 is called untriseptium (Uts) by the IUPAC rules, but it is often called feynmanium (Fy), since R.P. Feynman often remarked on the importance of this result and mystery. Of course, Special Relativity forbids this option. Therefore, either something is wrong or Z=137 is the last element allowed by the Quantum Rules (or/and the Bohr atom). Obviously, we could claim that this result is “wrong” since we have not considered the relativistic quantum corrections, i.e., we have not made a good relativistic treatment of this system. It is not as simple as you might think, since using a “naive” relativistic treatment, e.g., using the Dirac equation, we obtain for the fundamental level of the hydrogenic atom the energy

(12) \boxed{E_1=E=m_ec^2\sqrt{1-Z^2\alpha^2}}. This result can be obtained from the Dirac equation spectrum for the hydrogen atom (in a Coulomb potential):

(13) \boxed{E_{n,k;Z,\alpha}=E(n,k;Z,\alpha)=mc^2\left[1+\left(\dfrac{Z\alpha}{n-\vert k\vert+\sqrt{k^2-Z^2\alpha^2}}\right)^2\right]^{-1/2}}

where n=N+\vert k\vert is a positive integer (with N a nonnegative integer) and k^2=(j+\frac{1}{2})^2. Putting these into numbers, we get

[Figure: first levels of the hydrogen atom spectrum from the Dirac equation]

or equivalently (I add comments from the slides)

[Figure: first levels of the hydrogenic atom from the Dirac equation]

If you plug Z=138 or more into the above equation from the Dirac spectrum, you obtain an imaginary value of the energy, and thus an oscillating (unbound) system! Therefore, the problem for atoms with high Z persists even after taking the relativistic corrections into account! What is the solution? Nobody is sure. Greiner et al. suggest that, taking into account the finite (extended) size of the nuclei, the problem is “solved” up to Z\approx 172. Beyond that, i.e., with Z>172, you cannot exclude that quantum fluctuations of the strong fields introduce vacuum pair creation effects that make the nuclei, and thus the atoms, unstable at those high values of Z. Some people believe that the issues arise even earlier, around Z=150, or even that strong field effects can make atoms below Z=137 non-existent. That is why the search for superheavy elements (SHE) is interesting not only from the chemical viewpoint but also from the fundamental physics viewpoint: it challenges our understanding of Quantum Mechanics and Special Relativity (and their combination!!!!).

Is the feynmanium (Z=137) the last element? This hypothetical element and other superheavy elements (SHE) seem to hint at the end of the Periodic Table. Is it true? Options:

1st. The feynmanium (Fy) or untriseptium (Uts) is the last element of the Periodic Table.
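
The jump from a real to an imaginary ground-state energy can be seen numerically with the following Python sketch of (12) for a point-like nucleus. The values of \alpha and of m_ec^2 are rounded assumptions, and the function names are hypothetical.

```python
# Ground-state energy of a hydrogenic atom from the Dirac equation (point nucleus):
# E_1 / (m_e c^2) = sqrt(1 - Z^2 alpha^2), which turns imaginary when Z * alpha > 1.
import cmath

ALPHA = 7.2973525693e-3   # fine structure constant (assumed rounded value)
ME_C2_EV = 510998.95      # electron rest energy, eV (assumed rounded value)

def dirac_ground_energy(Z):
    """Return E_1 / (m_e c^2) as a complex number, equation (12)."""
    return cmath.sqrt(1 - (Z * ALPHA)**2)

def bohr_ground_beta(Z):
    """Bohr-model ground-state speed v_1 / c = Z * alpha."""
    return Z * ALPHA

for Z in (1, 92, 137, 138):
    ratio = dirac_ground_energy(Z)
    kind = "real" if ratio.imag == 0 else "imaginary"
    print(f"Z={Z:3d}: v_1/c = {bohr_ground_beta(Z):.4f}, "
          f"E_1/(m_e c^2) = {ratio:.5f} ({kind})")

# Binding energy of hydrogen from (12): m_e c^2 - E_1, in eV
binding_energy_eV = (1 - dirac_ground_energy(1).real) * ME_C2_EV
print(f"hydrogen binding energy = {binding_energy_eV:.2f} eV")
```

For Z=1 the binding energy agrees with the Rydberg to the precision shown, and the first imaginary value appears exactly where the Bohr speed exceeds c.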

2nd. The Greiner et al. limit, around Z=172. References:

(i) B. Fricke, W. Greiner and J.T. Waber, Theor. Chim. Acta, 1971, 21, 235.

(ii) W. Greiner and J. Reinhardt, Quantum Electrodynamics, 4th edn (Springer, Berlin, 2009).

3rd. Other predictions of an end to the periodic table include Z = 128 (John Emsley) and Z = 155 (Albert Khazan). Even Seaborg, from his knowledge and prediction of an island of stability around Z,N= 126, 184,\ldots , left this question open to interpretation and experimental search!

4th. There is no end of the Periodic Table. In fact, according to Greiner et al., even though superheavy nuclei pose a challenge for Quantum Mechanics and Special Relativity, there is no end of the Periodic Table as long as there are always electrons in the orbitals (a condition for an element to be a well-defined object): even when there is some probability for a positron-electron pair to be produced by a superheavy nucleus, the presence of the electrons does not allow for it. Strong field effects are important there, though, and it would be great to produce these elements and to know their properties, both quantum and relativistic! Therefore, it would be very, very interesting to test the superheavy element “zone” of the Periodic Table, since it is a place where (strong) quantum effects and (non-negligible) relativistic effects both matter. Then, if both theories are right, superheavy elements are a beautiful and wonderful arena to understand how to combine the two greatest theories and (unfinished?) revolutions of the 20th century. What an awesome role for the “elementary” and “fundamental” superheavy (composite) elements!

Probably, there is no limit to the number of (chemical) elements in our Universe… But we DO NOT KNOW!

In conclusion: what will happen for superheavy elements with Z>173 (or Z>126, 128, 137, etc.) remains unresolved with our current knowledge. And it is one of the last great mysteries in theoretical Chemistry!

More about the fine structure constant, the Sommerfeld corrections and the Dirac equation+QED (Quantum ElectroDynamics) corrections to the hydrogen spectrum, in slides (think it yourself!):

[Figure: the Bohr-Sommerfeld idea]

[Figures: Dirac equation and one-electron spectrum slides]

Final remarks (for experts only): Some comments about the self-adjointness of the Dirac operator for high values of Z in Coulomb potentials. It is a well-known fact that the Dirac operator for the hydrogen problem is essentially self-adjoint if Z<119. Therefore, it is valid for all the currently known elements (as of June 2013, every element in the 7th period of the Periodic Table has been created, so we know that chemical elements do exist at least up to Z=118, and searches for superheavy elements beyond that Z have given negative results until now). However, for 119\leq Z\leq 137 any “self-adjoint extension” requires a precise physical meaning. A good criterion could be that the expectation value of every component of the Hamiltonian is finite in the selected basis. Indeed, the solution of the Coulomb problem for the hydrogenic atom using the Dirac equation makes use of hypergeometric functions that are well-posed for any Z\leq 137. If Z is greater than that critical value, we face the oscillating energy problem we discussed above. So, we have to consider the effect of the finite size of the nucleus and/or handle relativistic corrections more carefully. It is important to realize this and to understand the main idea behind all this crazy stuff: the s states start to be destroyed above Z = 137, and the p states begin to be destroyed above Z = 274. Note that this differs from the result of the Klein-Gordon equation, which predicts s states being destroyed above Z = 68 and p states destroyed above Z = 82. In summary, the superheavy elements are interesting because they challenge our knowledge of both Quantum Mechanics and Special Relativity. What a wonderful (final) fate for the chemical elements: the superheavy elements will test whether the “marriage” between Quantum Mechanics and Special Relativity goes on or ends in divorce!

Epilogue: What do you think about the following questions? This is a test for you, eager readers…

1) Is there an ultimate element?

2) Is there a theory of everything (TOE)?

3) Is there an ultimate chemical element?

4) Is there a single “ultimate” principle?

5) How many elements does the Periodic Table have?

6) Is the feynmanium the last element?

7) Are Quantum Mechanics and Special Relativity consistent with each other?

8) Is Quantum Mechanics a fundamental and “ultimate” theory for atoms and molecules?

9) Is Special Relativity a fundamental and “ultimate” theory for “quick” particles?

10) Are the atomic shells and atomic structure completely explained by QM and SR?

11) Are the nuclei and their shell structure completely explained by QM and SR?

12) Do you think all this stuff is somehow important and relevant for Physics or Chemistry (or even for Mathematics)?

13) Will we find superheavy elements in the next decade?

14) Will we find superheavy elements this century?

15) Will we find that there are some superheavy elements stable in the island of stability (Seaborg) with amazing properties and interesting applications?

16) Did you like/enjoy this post?

17) When you were a teenager, how many chemical elements did you know? How many chemical elements were known?

18) Did you learn/memorize the whole Periodic Table? In the case you did not, would you?

19) What is your favourite chemical element?

20) Did you know that every element in the 7th period of the Periodic Table has been established to exist, but the elements E113, E115, E117 and E118 are not named yet (as of 30th June 2013) and they keep their systematic (IUPAC) names ununtrium, ununpentium, ununseptium and ununoctium? By the way, the last named elements were copernicium (E112, Cn), flerovium (E114, Fl) and livermorium (E116, Lv)…


LOG#080. A Bug-Rivet “paradox”.

Imagine that an idealised bug of negligible dimensions is hiding at the end of a hole of length L. A rivet has a shaft length of a<L.

Clearly the bug is “safe” when the rivet head is flush with the (very resilient) surface. The problem arises as follows. Consider what happens when the rivet slams into the surface at a speed of v=\beta c, where c is the speed of light and 0<\beta<1. One of the essences of the special theory of relativity is that objects moving relative to our frame of reference are shortened in the direction of motion by a factor \gamma^{-1}=\sqrt{1-\beta^2}, where \gamma is generally called the Lorentz factor, as readers of this blog already know. Thus, from the point of view (frame of reference) of the bug, the rivet shaft is even shorter and therefore the bug should continue to be safe, however fast the rivet is moving.


Apparently, we have:

a_{app}=\dfrac{a}{\gamma}=a\sqrt{1-\beta^2}

Remark: this idea assumes that both objects are ideally rigid! We will return to this “fact” later.

From the frame of reference of the rivet, the rivet is stationary and unchanged, but the hole is moving fast and is shortened by the Lorentz contraction to

L_{app}=\dfrac{L}{\gamma}=L\sqrt{1-\beta^2}


If the approach speed is fast enough, so that L_{app}<a, then the end of the hole slams into the tip of the rivet before the surface can reach the head of the rivet. The bug is squashed! This is the “paradox”: is the bug squashed or not?

There are many good sources for this paradox (a relative of the pole-barn paradox), such as:

1) http://en.wikipedia.org/wiki/Wikipedia:Reference_desk/Archives/Science/2006_October_19#Bug_Rivet_Paradox

2) A nice animation can be found here: http://math.ucr.edu/~jdp/Relativity/Bug_Rivet.html

In this blog post we are going to solve this “paradox” in the framework of special relativity.

SOLUTION

One of the consequences of special relativity is that two events that are simultaneous in one frame of reference are no longer simultaneous in other frames of reference. Another is that perfectly rigid objects are impossible.

In the frame of reference of the bug, the entire rivet cannot come to a complete stop all at the same instant. Information cannot travel faster than the speed of light. It takes time for knowledge that the rivet head has slammed into the surface to travel down the shaft of the rivet. Until each part of the shaft receives the information that the rivet head has stopped, that part keeps going at speed v=\beta c. The information proceeds down the shaft at speed c while the tip continues to move at speed v=\beta c.

bugrivet4

The tip cannot stop until a time

T_1=\dfrac{\dfrac{a}{\gamma}}{c-\beta c}=\dfrac{a}{\gamma c (1-\beta)}

after the head has stopped. During that time the tip travels a distance D_1=vT_1. The bug will be squashed if

vT_1>L-\dfrac{a}{\gamma}

This implies that

\dfrac{\beta c a}{\gamma c (1-\beta)}>L-\dfrac{a}{\gamma}\leftrightarrow \dfrac{a}{\gamma}\left(\dfrac{\beta}{1-\beta}+1\right) >L\leftrightarrow \dfrac{a}{\gamma}\left(\dfrac{\beta+1-\beta}{1-\beta}\right) >L

From \gamma^{-1}=\sqrt{1-\beta^2} we can calculate that

\dfrac{1}{\gamma (1-\beta)}=\dfrac{\sqrt{1-\beta^2}}{1-\beta}=\dfrac{\sqrt{(1+\beta)(1-\beta)}}{1-\beta}=\sqrt{\dfrac{1+\beta}{1-\beta}}

The bug will be squashed if the following condition holds

a\sqrt{\dfrac{1+\beta}{1-\beta}}>L\leftrightarrow \dfrac{a}{L}>\sqrt{\dfrac{1-\beta}{1+\beta}}\leftrightarrow \left(\dfrac{a}{L}\right)^2> \dfrac{1-\beta}{1+\beta}

or equivalently, after some algebraic manipulations, the bug will be squashed if:

\beta>\dfrac{1-\left(\dfrac{a}{L}\right)^2}{1+\left(\dfrac{a}{L}\right)^2}

Conclusion (in bug’s reference frame): the bug will definitely be squashed whenever v>v_{min}=\beta_{min}c, where

\boxed{\beta_{min}=\dfrac{1-\left(\dfrac{a}{L}\right)^2}{1+\left(\dfrac{a}{L}\right)^2}}

Check: It can be verified that the limits \displaystyle{\lim_{a\rightarrow 0^+}\beta_{min}=1^{-}} and \displaystyle{\lim_{a\rightarrow L^-}\beta_{min}=0^{+}} are valid and physically meaningful.
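The bug-frame result can be checked numerically. The following is a minimal sketch, assuming only the formulas derived above; the function names beta_min and squashed_bug_frame, and the variable ratio (standing for a/L), are illustrative choices and not part of the original derivation.

```python
import math

def beta_min(ratio):
    """Threshold speed (in units of c) for a given ratio = a/L, with 0 < ratio < 1."""
    return (1 - ratio**2) / (1 + ratio**2)

def squashed_bug_frame(ratio, beta):
    """Bug-frame squashing condition: a*sqrt((1+beta)/(1-beta)) > L."""
    return ratio * math.sqrt((1 + beta) / (1 - beta)) > 1

# Just above the threshold the bug is squashed, just below it is safe.
for ratio in (0.01, 0.5, 0.99):
    b = beta_min(ratio)
    print(ratio, b, squashed_bug_frame(ratio, b + 1e-6), squashed_bug_frame(ratio, b - 1e-6))
```

For a/L close to 0 the threshold approaches 1, and for a/L close to 1 it approaches 0, in agreement with the two limits quoted in the check.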

Note that the impact of the rivet head always happens before the bug is squashed.

In the frame of reference of the rivet, the bug is definitively squashed whenever \dfrac{L}{\gamma}<a.

bugrivet5

Then,

L\sqrt{1-\beta^2}<a\leftrightarrow 1-\beta^2<\left(\dfrac{a}{L}\right)^2

or equivalently

\beta>\sqrt{1-\left(\dfrac{a}{L}\right)^2}

or

\beta>\beta_{min2} where \boxed{\beta_{min2}=\sqrt{1-\left(\dfrac{a}{L}\right)^2}}

The bug is squashed before the impact of the surface on the rivet head. Note that this threshold, \beta_{min2}, is higher than \beta_{min}.

Conclusion (in rivet’s reference frame): The entire surface cannot come to an abrupt stop at the same instant. It takes time for the information about the impact of the rivet tip on the end of the hole to reach the surface that is rushing towards the rivet head. Let us now examine the case where the speed is not high enough for the Lorentz-contracted hole to be shorter than the rivet shaft in the frame of reference of the rivet. Now the observers agree that the impact of the rivet head happens first. When the surface slams into contact with the head of the rivet, it takes time for information about that impact to travel down to the end of the hole. During this time the hole continues to move towards the tip of the rivet.

bugrivet6

The time it takes for the propagating information to reach the tip of the stationary rivet is

T_2=\dfrac{a}{c}

during which time the bug moves a distance D_2=vT_2=\dfrac{\beta c a}{c}=\beta a

In the rivet’s reference frame, therefore, the bug is squashed if the following condition holds

vT_2>\dfrac{L}{\gamma}-a\leftrightarrow \beta a>\dfrac{L}{\gamma}-a\leftrightarrow (1+\beta)a>\dfrac{L}{\gamma}\leftrightarrow \dfrac{a}{L}>\dfrac{1}{1+\beta}\sqrt{1-\beta^2}

and then

\dfrac{\sqrt{(1+\beta)(1-\beta)}}{1+\beta}<\dfrac{a}{L}\leftrightarrow \sqrt{\dfrac{1-\beta}{1+\beta}}<\dfrac{a}{L}

and from this equation, we get the same minimum speed that guarantees the squashing of the bug as was the case in the frame of reference of the bug! That is:

\boxed{\beta_{min}=\dfrac{1-\left(\dfrac{a}{L}\right)^2}{1+\left(\dfrac{a}{L}\right)^2}}

Note that observers travelling with each of the two frames of reference (bug and rivet) agree that the bug is squashed IF \beta>\beta_{min}, and that resolves the “paradox”. They also agree that the impact of rivet head on surface happens before the bug is squashed, provided that the following condition is satisfied:

\beta_{min}<\beta<\beta_{min2}

i.e., they agree that the impact of the rivet head on the surface happens before the bug is squashed whenever

\boxed{\dfrac{1-\left(\dfrac{a}{L}\right)^2}{1+\left(\dfrac{a}{L}\right)^2}<\beta<\sqrt{1-\left(\dfrac{a}{L}\right)^2}}

Otherwise, they disagree on which event happens first.  For instance, if

\beta>\beta_{min2}=\sqrt{1-\left(\dfrac{a}{L}\right)^2}

then the observer in the bug’s frame of reference still deduces that the rivet-head impact happens first, but the other observer deduces that the bug is squashed first. This is consistent with the relativity of simultaneity! At the critical speed, when \beta=\beta_c=\beta_{min2}, the two events are simultaneous in the frame of the rivet (the rivet fits perfectly in the shortened hole), but they are not simultaneous in the other frame of reference.
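The three regimes discussed above can be summarized in a few lines of code. This is only a sketch under the assumptions of this post (signals travelling at c, 0<a/L<1); the function name classify and the returned labels are hypothetical.

```python
import math

def classify(ratio, beta):
    """Classify the outcome for ratio = a/L (0 < ratio < 1) and beta = v/c."""
    b_min = (1 - ratio**2) / (1 + ratio**2)   # squashing threshold, same in both frames
    b_min2 = math.sqrt(1 - ratio**2)          # speed at which L/gamma equals a
    if beta <= b_min:
        return "not squashed"
    if beta < b_min2:
        return "squashed, both frames see the head impact first"
    return "squashed, frames do not agree on the order"

for beta in (0.5, 0.7, 0.9):
    print(beta, classify(0.5, beta))
```

For a/L=0.5 the two thresholds are 0.6 and about 0.866, so the three sample speeds fall into the three different regimes.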

See you in the next blog post!


LOG#075. Batmobile “paradox”.

The Batmobile “fake paradox” helps us to understand Special Relativity a little bit. This problem consists of the following thought experiment:

There are two observers. Alfred, the external observer, and Batman moving with his Batmobile.

Now, we will suppose that the Batmobile is moving at a very fast constant speed with respect to the garage. Let us suppose that v=0.866c=\dfrac{\sqrt{3}}{2}c. Then, we have the following situation from the external observer:


However, with respect to the Batmobile reference frame, we have:

The question is: who is right? Alfred or Batman? The surprising answer from Special Relativity is that both are correct. Alfred and Batman are right! Let’s see why this is true. For Alfred, there is a time during which the Batmobile is completely inside the garage with both doors closed:

On the other hand, for Batman, the front and rear doors are not closed simultaneously! So there is never a time during which the Batmobile is completely inside the garage with both doors closed.

So, there is no paradox at all, if you are aware of the notion of simultaneity and its relativity!
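A short numerical sketch may help to see the non-simultaneity. It assumes units with c=1, a hypothetical garage length of 1 (the actual lengths are not given in the text), and that Alfred closes both doors at his time t=0, with the entrance door at x=0 and the exit door at x equal to the garage length. The helper name lorentz_time is illustrative.

```python
import math

def lorentz_time(t, x, beta):
    """Time of the event (t, x) as measured in the frame moving with speed beta (c = 1)."""
    gamma = 1 / math.sqrt(1 - beta**2)
    return gamma * (t - beta * x)

beta = math.sqrt(3) / 2        # v = 0.866c, so gamma = 2
garage_length = 1.0            # hypothetical value, in arbitrary length units

# Both doors close at t = 0 in Alfred's frame.
t_entrance = lorentz_time(0.0, 0.0, beta)
t_exit = lorentz_time(0.0, garage_length, beta)

# A negative difference means the exit door closes earlier for Batman.
print(t_exit - t_entrance)
```

In Batman's frame the two closings are separated by \gamma\beta times the garage length (in units with c=1), so they are never simultaneous for him.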


LOG#052. Chewbacca’s exam.

I found this fun (Spanish) exam about Special Relativity at a Spanish website:

Solutions:

1) v=25/29 c

2) 1.836 \times 10^{12} m = 12 A.U.

3) t=13.6 months = 13 months and 18 days.

Calculations:

1) We use the relativistic addition of velocities rule. That is,

V=(u-v)/(1-(uv/c^2))

where u=Millennium Falcon velocity, v=imperial cruiser velocity= c/5, and V=relative speed=4c/5.

Using units with c=1 (and writing v for the unknown Millennium Falcon velocity from now on):

4/5=(v-1/5)/(1-v/5)

4/5(1-v/5)=v-1/5

4/5-4/25v=v-1/5

29/25 v=1

v=25/29

Then, v=25/29 c reinserting units.

2) This part is solved with the length contraction formula and the velocity calculated in the previous part (1). Moreover, we obtain:

\Delta x'=\Delta x/\gamma

Using the result we got from (1), and plugging in that velocity v and the fact that \Delta t' is equal to one hour, we have

\Delta x'=v\Delta t'=\Delta x/\gamma , and from this

\Delta x=\gamma v\Delta t'

Substituting the numerical values, we obtain the given solution easily.

\Delta x =1.97 ( 25/29 c )(1 hour) =1.7 \mbox{ light-hours}=1.836 \times 10^{12} m \approx 12 A.U.
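The arithmetic of parts (1) and (2) can be verified with a few lines of code. This is a sketch that only reproduces the calculations written above (the exam statement itself is not reproduced here); the variable names and the value used for the astronomical unit are assumptions introduced for the check.

```python
from fractions import Fraction
import math

V = Fraction(4, 5)   # relative speed, in units of c
v = Fraction(1, 5)   # imperial cruiser velocity, in units of c

# Inverting V = (u - v)/(1 - u v) gives u = (V + v)/(1 + V v).
u = (V + v) / (1 + V * v)

gamma = 1 / math.sqrt(1 - float(u) ** 2)

c = 299_792_458.0            # speed of light in m/s
AU = 1.495978707e11          # astronomical unit in m
one_hour = 3600.0            # seconds

distance_m = gamma * float(u) * c * one_hour

print(u)                     # Falcon velocity as an exact fraction of c
print(gamma)                 # Lorentz factor for that velocity
print(distance_m, distance_m / AU)
```

Using exact fractions for part (1) avoids any rounding in the value 25/29, and the distance comes out slightly above 12 A.U., consistent with the rounded solution.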

3) Simple application of time dilation formula provides:

\Delta t'=\gamma \Delta t

Inserting, in this case, our given velocity, we obtain the solution we wrote above:

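\Delta t' = \gamma ( 9 months) = 13.6 months = 13 months 18 days, which corresponds to \gamma\approx 1.51 for the velocity given in this part (note that the factor 1.97 from parts 1 and 2 would give 17.7 months instead).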


LOG#049. Ludicrous speed.

We are going to learn about the different notions of velocity that the special theory of relativity provides.

The special theory of relativity is a simple, wonderful theory, but it comes with many misconceptions due to bad teaching/science popularization. It is not easy to master the full theory of relativity without the proper mathematical background and physical insight. In the internet era where knowledge is shared, a fundamental issue is to understand things properly. There are many people who think they understand the theory of relativity when they don’t. Even in academia.

Moreover, you can find many people in the blogosphere/websphere trying to sell false and wrong theories. It is the same as with so-called alternative medicine: it is not medicine at all. Bad science is not science, it is simply a lie. It is religion. Science can be criticized, but nobody can dispute that the Earth revolves around the Sun; it is common knowledge and truth. So, we can criticize scientists, but not the scientific method and well-established theories. We can try to understand better or in a novel way, but we can not deny facts and experiments. Gerard ‘t Hooft, Nobel Prize winner, explains it on his web page www.phys.uu.nl/~thooft/.

It is important to remark that scientific revolutions come when we extend the theories we know to be correct, like special relativity, and not with a full destruction of the current and well-tested theories. Newtonian gravity is a limit of General Relativity. Galilean relativity is a limit of Special Relativity. Quantum Mechanics is a limit of QFT and so on. The issue is not that. Having said this, I am quite sure that scientists and particularly physicists wish to overcome current theories with new ones. However, the process to create a new theory is not easy. Especially if you don’t understand the traps and theories that have passed every known test till now.

What is velocity? Classically, the answer is short and very clear/neat: velocity is the rate of change of position with respect to time. It is a vector magnitude. Mathematically speaking, it is the quotient between the displacement vector and the time interval, or in the infinitesimal limit, the derivative of the position vector with respect to time.

\boxed{\mathbf{v_m}=\dfrac{\Delta \mathbf{r}(t)}{\Delta t}\leftrightarrow \mbox{Average velocity}}

\boxed{\mathbf{v}=\dfrac{d\mathbf{r}(t)}{dt}\leftrightarrow \mbox{Instantaneous velocity}}

In the special theory of relativity, due to the fact that time is not universal but relative, we can build different notions of velocity. And it matters. There are some clear concepts from relativity you should have mastered by now:

a) You can attach a clock to any yardstick you could physically use for measurements of space and time.

b) You must distinguish the notions of coordinate velocity (map coordinate is another commonly used notion/concept) and proper velocity. The latter is sometimes called hyperbolic (or imaginary) velocity. These two notions are caused by the presence of two “natural” choices of time: the proper time and the coordinate time.

c) Due to the previous two facts, you must also distinguish between proper acceleration and geometric acceleration. Proper accelerations are caused by the tug of external forces, and geometric accelerations are caused by the choice of a reference frame that is not geodesic, i.e., a local reference coordinate system that is not “in free-fall”. Proper accelerations are felt through their points of action, e.g., through forces on the bottom of your feet. On the other hand, geometric accelerations give rise to inertial forces that act on every ounce of an object’s being. They either vanish when seen from the vantage point of a local free-float frame, or give rise to non-local force effects on your mass distribution that cannot be made to disappear. Coordinate acceleration goes to zero whenever proper acceleration is exactly canceled by that connection term, and thus when physical and inertial forces add to zero.

People who are not aware of the previous comments don’t understand relativity and the physics behind it. They don’t even understand what experiments and their data say.

Let me review the main magnitudes, 3-vectors and 4-vectors which the special theory of relativity studies in the next tables:

The two notions of 3-velocity we do have from the special theory of relativity, i.e., from the 4-velocity \mathbb{U}=\dfrac{d\mathbb{X}}{d\tau},  are:

1) Coordinate velocity, \mathbf{v}:

\mathbf{v}=\dfrac{d\mathbf{r}}{dt}

It is the common notion of 3-velocity, measured from an inertial observer with respect to the coordinate time t. Note that the coordinate time is not a true invariant in SR!

2) Proper velocity (or the hyperbolic velocity/imaginary angle velocity related to it):

\mathbf{w}\equiv \dfrac{d\mathbf{r}}{d\tau}=\gamma \mathbf{v}

where \tau is the proper time. This velocity can intuitively be defined as the distance per unit traveler-time, and it retains many of the properties that ordinary velocity loses at high speed. In addition to these two definitions, we also have:

1)Proper-acceleration \alpha, is the acceleration experienced relative to a locally co-moving free-float-frame, and it helps when we are accelerating, speeding, and in curvy space-time.

2) The way some of the space-like effect of sideways “felt” forces moves into the reference-frame’s time-domain at high speed, giving the relatively unknown bound (from special relativity!)

\dfrac{dp}{dt}\leq m\alpha

With the above definitions, the relativistic momentum can be expressed in terms of coordinate velocity or proper velocity as follows:

\mathbf{P}=m\mathbf{w}=M\mathbf{v}=m\gamma \mathbf{v}

where

\gamma=\dfrac{dt}{d\tau}=\dfrac{1}{\sqrt{1-\dfrac{v^2}{c^2}}}=\sqrt{1+\dfrac{w^2}{c^2}}

is the Lorentz factor. The last equal sign in the previous equation can be easily derived from the relativistic relationship:

\left(c\dfrac{dt}{d\tau}\right)^2-\left(\dfrac{d\mathbf{r}}{d\tau}\right)^2=c^2

and the definition of \gamma above.

Thanks to the metric-equation’s assignment of a frame-invariant traveler or proper-time \tau to the displacement between events in context of a single map-frame of comoving yardsticks and synchronized clocks, proper velocity becomes one of three related derivatives in special relativity (coordinate velocity \mathbf{v}, proper-velocity \mathbf{w}, and Lorentz factor \gamma) that describe an object’s rate of travel. For unidirectional motion, in units of lightspeed c (i.e. c=1 if we want to) each of these is also simply related to  a traveling object’s hyperbolic velocity angle or rapidity \eta by the next set of equations:

\eta=\sinh^{-1}\left( \dfrac{w}{c}\right)=\tanh^{-1}\left(\dfrac{v}{c}\right)=\pm \cosh^{-1}\left(\gamma\right)
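The relations between coordinate velocity, proper velocity, Lorentz factor and rapidity are easy to check numerically. The following is a minimal sketch in units with c=1; the function name kinematics is an illustrative choice.

```python
import math

def kinematics(beta):
    """Return (gamma, w/c, eta) for unidirectional motion with coordinate velocity beta = v/c."""
    gamma = 1 / math.sqrt(1 - beta**2)
    w = gamma * beta            # proper velocity in units of c
    eta = math.atanh(beta)      # rapidity
    return gamma, w, eta

gamma, w, eta = kinematics(0.6)
# The three expressions for the rapidity coincide.
print(eta, math.asinh(w), math.acosh(gamma))
# The Lorentz factor expressed through the proper velocity.
print(gamma, math.sqrt(1 + w**2))
```

For v=0.6c one gets \gamma=1.25, w=0.75c and \eta=\ln 2, and all three inverse hyperbolic functions return the same rapidity.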

The next table illustrates how the proper-velocity of w_0 \equiv c or “one map-lightyear per traveler-year” is a natural benchmark for the transition from a sub-relativistic coordinate frame to a (fake) auxiliary super-relativistic motion (in imaginary units of i=\sqrt{-1}). Note that the velocity angle or rapidity \eta and the proper-velocity w run from 0 to infinity and track the physical coordinate-velocity when w<<c. On the other hand when w>>c, the (hyperbolic or imaginary) proper-velocity tracks the Lorentz factor \gamma while the velocity angle \eta is logarithmic and hence increases much more slowly:

LUDICROUS SPEED AND WARP SPEED

 Hyperbolic velocities CAN exceed c! They can reach even the ludicrous speed of \infty when the coordinate velocity approaches c! However, you must never forget the fact that the velocity-angle/hyperbolic velocity IS imaginary in value. It is quite clear from the above table. Indeed, being somehow “trekkie” or a Sci-Fi “romantic” person, you could “define” warp-speeds as “imaginary/hyperbolic” velocities, i.e., in terms of proper velocity. In that case, you could get the correspondence

\mbox{WARP}0.25=\mbox{WARP}1/4=\dfrac{\sqrt{17}}{17}c\approx 0.24c

\mbox{WARP}0.5=\mbox{WARP}1/2=\dfrac{\sqrt{5}}{5}c\approx 0.45c

\mbox{WARP}1=\dfrac{\sqrt{2}}{2}c\approx 0.71c

\mbox{WARP}2=\dfrac{2\sqrt{5}}{5}c\approx 0.89c

\mbox{WARP}3=\dfrac{3\sqrt{10}}{10}c\approx 0.95c

\mbox{WARP}7=\dfrac{7\sqrt{2}}{10}c\approx 0.99c

\mbox{WARP}9=\dfrac{9\sqrt{82}}{82}c\approx 0.994c

\mbox{WARP}10=\dfrac{10\sqrt{101}}{101}c\approx 0.995c

\mbox{WARP}\infty\equiv c

In general, we can define the WARP speed as W=w/c and so, the proper velocity can be expressed in terms of the warp speed W in a very simple way w=Wc. Thus, the real or coordinate velocity would be connected with warp-speed through the relativistic equation:

\boxed{v=c\tanh\sinh^{-1}(W)=c\tanh\sinh^{-1}\left(\dfrac{w}{c}\right)}
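The warp-speed correspondence listed above can be reproduced directly from the boxed formula. This is just a sketch for fun, in units with c=1; the function name warp_to_beta is hypothetical.

```python
import math

def warp_to_beta(W):
    """Coordinate velocity v/c for a given warp factor W = w/c."""
    return math.tanh(math.asinh(W))

for W in (0.25, 0.5, 1, 2, 3, 7, 9, 10):
    # The closed form W/sqrt(1 + W^2) is printed next to it for comparison.
    print(W, warp_to_beta(W), W / math.sqrt(1 + W**2))
```

Both columns coincide, since \tanh\sinh^{-1}(W)=W/\sqrt{1+W^2}, and the values never reach 1 for any finite W.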

Of course, the point is that, unlike in the Sci-Fi franchise, the real velocity never exceeds c; only the hyperbolic velocity and the proper velocity do. Note that, in terms of SR, velocities approaching c imply very boosted frames: by approaching c very closely we could travel to any point of the Universe within the traveler’s proper time (one human life), but in the “Earth” (or rest) reference frame millions of years would have passed!

When the coordinate-speeds approach c, the respective coordinate velocities deviate from the simple (linear) addition rule of low speeds, in that rapidities (hyperbolic velocity angle boosts) add instead of velocities, i.e. \eta_{12}=\eta_1+\eta_2. Coordinate velocities add non-linearly. And it is a well-tested consequence of the Special Theory of Relativity.  For highly relativistic objects (i.e. those with momentum per unit mass much larger than lightspeed) the result of the coordinate-velocity expression familiar from most textbooks is rather uninteresting since the coordinate-velocities all peak out at c, i.e., as everybody knows, in special relativity 1c\boxplus 1c=1c, because applying the relativistic addition of velocities rule, we get

c\boxplus c=\dfrac{ (c + c)}{(1 + 1)}=c

And it is a fact from both theory and experiment! It will remain as long as SR remains a valid theory. SR still holds with an astonishing degree of precision and accuracy. So, you can not deny all the data and experiments that confirm SR. That would be complete nonsense, but there are some people and pseudo-scientists out there building their own theories AGAINST the achievements and explanations that SR provides for every experiment we have done until the current time. I am sorry for all of them. They are totally wrong. Science is not what they say it is. Any theory going beyond SR HAS to explain every experiment and data that SR does explain, and it is not easy to build such a theory or to say, e.g., why we have not observed (apparently) superluminal objects. I will discuss superluminal objects further in a forthcoming post/log entry, some posts after the special 50th post/log that is coming after this one! Stay tuned!

Coming back to our discussion… Why is all this stuff important? High Energy Physics is the natural domain of SR! And there, SR has not provided ANY wrong result so far. Although some research going beyond the Standard Model includes modified dispersion relationships that reduce to SR in the low energy regime, we have not yet seen ANY deviation from SR.

For unidirectional motion, at low speeds the coordinate velocity v_{13} of object 1 from the point of view of oncoming object 3 might be described as the sum of the velocity v_{12} of object 1 with respect to lab frame 2 plus the velocity v_{23} of the lab frame 2 with respect to object 3, that is:

v_{13}=v_{12}+v_{23}

Compare this expression to the previously obtained expression for rapidities! Rapidities always add, coordinate velocities add (linearly) only at low velocities. In conclusion, you must be careful about what you mean by velocity in a boosted system!
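The statement that rapidities add while coordinate velocities do not can be illustrated with a short sketch in units with c=1. The function names below are illustrative.

```python
import math

def add_velocities(b1, b2):
    """Relativistic composition of two collinear coordinate velocities (c = 1)."""
    return (b1 + b2) / (1 + b1 * b2)

def add_via_rapidity(b1, b2):
    """Same composition, obtained by adding the rapidities."""
    return math.tanh(math.atanh(b1) + math.atanh(b2))

# At high speed the result stays below c and differs from the naive sum 1.2.
print(add_velocities(0.6, 0.6), add_via_rapidity(0.6, 0.6))

# At low speed the linear rule is recovered to very good accuracy.
print(add_velocities(1e-4, 1e-4), 2e-4)
```

Both routes give 15/17 for two boosts of 0.6c, while for small velocities the result is indistinguishable from the Galilean sum.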

On the other hand, for relative proper-velocity, the result is:

w_{13}=\gamma_{13}v_{13}=\gamma_{12}\gamma_{23}(v_{12}+v_{23})

This expression shows how the momentum per unit mass as well as the map-distance traveled per unit traveler time of object 1, as seen in the frame of oncoming particle 3, goes as the sum of the coordinate-velocities times the product of the gamma (energy) factors. The proper velocity equation is especially important in high energy physics, because colliders enable one to explore proper-speed and energy ranges much higher than accessible with fixed-target collisions. For instance each of two electrons (traveling with frames 1 and 3) in a head-on collision traveling in the lab frame (2) at

\gamma_{12}mc^2=45\mbox{GeV}

or equivalently w_{12}=w_{23}=\gamma v\approx 88000 lightseconds per traveler second, would see the other coming toward them at coordinate velocity v_{13}\approx c and w_{13}=88000^2(1+1) \approx 1.55\cdot 10^{10} lightseconds per traveler second or \gamma_{13}mc^2\approx 7.9 \mbox{PeV}. From the target’s view, that is an incredible increase in both energy and momentum per unit of mass.
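These collider numbers can be reproduced with a short calculation. The sketch below assumes an electron rest energy of about 0.511 MeV (a standard value, not quoted in the text) and uses hypothetical variable names.

```python
import math

m_e = 0.51099895e-3     # electron rest energy in GeV (assumed standard value)
E_lab = 45.0            # energy of each electron in the lab frame, in GeV

gamma12 = E_lab / m_e
w12 = math.sqrt(gamma12**2 - 1)     # proper velocity in units of c
beta12 = w12 / gamma12              # coordinate velocity in units of c

# Composition with gamma23 = gamma12 and beta23 = beta12 (head-on collision).
gamma13 = gamma12**2 * (1 + beta12**2)
w13 = gamma12**2 * (beta12 + beta12)
E13 = gamma13 * m_e                 # energy seen from the other electron, in GeV

print(gamma12, w13, E13 / 1e6)      # the last value is in PeV
```

The Lorentz factor comes out close to 88000, the relative proper velocity close to 1.55\cdot 10^{10} c, and the energy close to 7.9 PeV, as quoted above.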

Other magnitudes and their frame dependence in SR can be read from the following table:

CAUTION: These results don’t mean that the “real” energy is that. Energy is relative and it depends on the frame! The fact that in colliders, seen from the target reference frame, the energy can be greater than the center of mass energy is not an accident. It is a consequence of the formalism of special relativity. A similar observation can be made for velocities. Coordinate velocities, IN THE FRAMEWORK OF SPECIAL RELATIVITY, can never exceed the speed of light. As long as SR holds, there is no particle whose COORDINATE velocity can exceed the speed of light. However, we have seen that PROPER velocities are other monsters. They serve as a tool to handle rotations along the temporal axis, i.e., to handle boosts mixing space and time coordinates. Proper (or hyperbolic) velocities CAN be greater than the speed of light. But this does not contradict the special theory of relativity at all, since hyperbolic velocities ARE NOT REAL: they are imaginary quantities and they are not physical. We can only measure momentum and real quantities!  Moreover, remember that, in fact, the group or phase velocities we have found before can ALSO be greater than c. So, you must be careful about what you mean by velocity in SR or in any theory. Furthermore, you must distinguish the notion of particle velocity from that of the relative velocity between two inertial frames, since the particle velocities (coordinate or proper) always refer to some concrete frame! In summary, beware of people saying that there are superluminal particles in our colliders or astrophysical processes. It is simply not true. Superluminal objects have observable consequences, and they have failed to be observed (the last example was the superluminal neutrino affair by the OPERA collaboration, now in agreement with SR).

Remark (I): From the last table we observe that in SR, the rotation angle is imaginary. Therefore, we are forced to use this gadget of hyperbolic velocity in order to avoid “imaginary velocities”.

Remark (II): Hyperbolic velocities would become imaginary velocities if we used the imaginary formalism of SR, the infamous ict=x_4.

Remark (III): Hyperbolic velocities are not coordinate velocities, so they are not physical at all. They are just a tool to provide the right answers in terms of rapidities, or the hyperbolic angle, whose units are imaginary radians! Hyperbolic velocities are measured in imaginary units of velocity!

Remark (IV): About the imaginary issues you can have now. The spacetime separation formula s^2=-c^2t^2+x^2+y^2+z^2 means that the time t can often be treated mathematically as if it were an imaginary spatial dimension. That is, you can define ct=iw so -c^2t^2=w^2, where i  is the square root of  -1, and w is a “fourth spatial coordinate”. Of course it is not at all. It is only a trick to treat the problem in a clever way.  By the other hand, a Lorentz boost by a velocity v can likewise be treated as a rotation by an imaginary angle. Consider a normal spatial rotation in which a primed frame is rotated in the wx-plane clockwise by an angle \varphi about the origin, relative to the unprimed frame. The relation between the coordinates (w',x') and (w,x) of a point in the two frames is:

\begin{pmatrix}w'\\ x'\end{pmatrix}=\begin{pmatrix}\cos\varphi & -\sin\varphi\\ \sin\varphi & \cos\varphi\end{pmatrix}\begin{pmatrix}w\\ x\end{pmatrix}

Now set ct=iw and \theta=i\varphi, with t,\theta both real. In other words, take the spatial coordinate w to be imaginary, and the rotation angle \varphi likewise to be imaginary. Then the rotation formula above becomes

\begin{pmatrix}ct'\\ x'\end{pmatrix}=\begin{pmatrix}\cosh\theta & -\sinh\theta\\ -\sinh\theta & \cosh\theta\end{pmatrix}\begin{pmatrix}ct\\ x\end{pmatrix}

This agrees with the usual Lorentz transformation formula if the boost velocity v and boost angle \theta are related by the known formula \tanh\theta=v/c=\beta. We realize that if we identify the imaginary angle with the rapidity, we are back to Special Relativity. Indeed, it is only the rotations involving the time axis which can cause confusion because they are so different from our everyday experience. That is, we experience rotations along some direction in our daily experience, so we are familiar with rotations and their (real) rotation angles. However, a rotation along a time axis mixing space and time is a weird creature. It uses imaginary numbers or, if we avoid them, we have to use hyperbolic (pseudo)-rotations.
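The imaginary-rotation trick can be checked with complex arithmetic. The following sketch applies an ordinary rotation by the angle \varphi to (w,x), using the substitutions ct=iw and \theta=i\varphi from the text, and compares the result with the hyperbolic form of the boost; the function names are hypothetical.

```python
import cmath
import math

def boost_via_imaginary_rotation(ct, x, theta):
    """Ordinary rotation in the (w, x) plane with imaginary w and imaginary angle."""
    phi = -1j * theta            # from theta = i*phi
    w = -1j * ct                 # from ct = i*w
    w_p = w * cmath.cos(phi) - x * cmath.sin(phi)
    x_p = w * cmath.sin(phi) + x * cmath.cos(phi)
    return 1j * w_p, x_p         # back to (ct', x')

def boost_hyperbolic(ct, x, theta):
    """Standard Lorentz boost written with hyperbolic functions."""
    return (ct * math.cosh(theta) - x * math.sinh(theta),
            -ct * math.sinh(theta) + x * math.cosh(theta))

theta = math.atanh(0.6)          # rapidity for v = 0.6c
print(boost_via_imaginary_rotation(2.0, 1.0, theta))
print(boost_hyperbolic(2.0, 1.0, theta))
```

Up to rounding, the imaginary parts vanish and both functions return the same transformed coordinates, which is the content of the identification of the imaginary angle with the rapidity.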

SUMMARY OF MAIN IDEAS

A) Lorentz factor \gamma=\dfrac{E}{mc^2}

\boxed{\gamma \equiv \frac{dt}{d\tau}= \sqrt{1+\left(\frac{w}{c}\right)^2} = \frac{1}{\sqrt{1-(\frac{v}{c})^2}} = \cosh[\eta] \equiv \frac{e^{\eta} + e^{-\eta}}{2}}

B) Proper-velocity or momentum per unit mass.

\boxed{\frac{w}{c}\equiv \frac{1}{c} \frac{dx}{d\tau}=\frac{v}{c} \frac{1}{\sqrt{1-(\frac{v}{c})^2}}=\sinh[\eta]\equiv \frac{e^{\eta} - e^{-\eta}}{2} =\pm\sqrt{\gamma^2 - 1}}

C) Coordinate velocity v\leq c.

\boxed{\frac{v}{c} \equiv \frac{1}{c}\frac{dx}{dt}=\frac{w}{c}\frac{1}{\sqrt{1 + (\frac{w}{c})^2}} = \tanh[\eta] \equiv \frac{e^{2\eta} - 1} {e^{2\eta} + 1}= \pm \sqrt{1 - \left(\frac{1}{\gamma}\right)^2}}

D) Hyperbolic velocity angle or rapidity.

\boxed{\eta =\sinh^{-1}[\frac{w}{c}] = \tanh^{-1}[\frac{v}{c}] = \pm \cosh^{-1}[\gamma]}

or in terms of logarithms:

\boxed{\eta = \ln\left[\frac{w}{c} + \sqrt{\left(\frac{w}{c}\right)^2 + 1}\right] = \frac{1}{2} \ln\left[\frac{1+\frac{v}{c}}{1-\frac{v}{c}}\right] = \pm \ln\left[\gamma + \sqrt{\gamma^2 - 1}\right]}

E) Warp speed (just for fun):

\boxed{v=c\tanh\sinh^{-1}(W)=c\tanh\sinh^{-1}\left(\dfrac{w}{c}\right)}


LOG#048. Thomas precession.

LORENTZ TRANSFORMATIONS IN NON-STANDARD FORM

Let me begin this post with an uncommon representation of Lorentz transformations in terms of “uncommon matrices”. A Lorentz transformation can be written symbolically, as we have seen before, as the set of linear transformations leaving invariant

ds^2=d\mathbf{x}^2-c^2dt^2

Therefore, the Lorentz transformations are naively X'=\mathbb{L}X. Let \mathbf{A}, \mathbf{B} be 3-rowed column matrices and let M, R, \mathbb{I} represent 3\times 3 matrices; T will be used (unless stated otherwise) to denote matrix transposition (the interchange of rows and columns in the matrix).

The invariance of ds'^2=ds^2 implies the following results from the previous definitions:

\gamma^2-\mathbf{B}^2=1

M^TM =\mathbf{A}\mathbf{A}^T+\mathbb{I}

M^T\mathbf{B}=\gamma \mathbf{A}\leftrightarrow \mathbf{B}^T M=\gamma \mathbf{A}^T

Then, we can write the matrix for a Lorentz transformation (boost) in the following non-standard manner:

\boxed{\mathbb{L}=\begin{pmatrix}\gamma & -\mathbf{A}^T\\ -\mathbf{B} & M\end{pmatrix}}

and the inverse transformation will be

\boxed{\mathbb{L}^{-1}=\begin{pmatrix}\gamma & \mathbf{B}^T\\ \mathbf{A} & M^T\end{pmatrix}}

Thus, we have \mathbb{L}\mathbb{L}^{-1}=\mathbb{I}_{4x4}\equiv \mathbb{E}, where we also have

\gamma^2-\mathbf{A}^2=1

M\mathbf{A}=\gamma \mathbf{B}

MM^T=\mathbf{B}\mathbf{B}^T+\mathbb{I}_{3x3}

Let us define, in addition to this stuff, the reference frames S, \overline{S}', corresponding to the coordinates \mathbf{X} and \overline{\mathbb{X}}'. Then, the boost matrix can be recast, if the velocity reads \mathbf{v}=\mathbf{A}/\gamma, as

L_{v}=\begin{pmatrix}\gamma & -\gamma \mathbf{v}^T\\ -\gamma \mathbf{v} & \mathbb{I}+\frac{\gamma^2}{1+\gamma}\mathbf{v}\mathbf{v}^T\end{pmatrix}=\begin{pmatrix}\gamma & -\mathbf{A}^T\\ -\mathbf{A} & \mathbb{I}+\frac{\mathbf{A}\mathbf{A}^T}{1+\gamma}\end{pmatrix}

Remark: a Lorentz transformation will differ from boosts only by rotations in the general case. That is, with these conventions, the most general Lorentz transformations include both boosts and rotations.
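As a sanity check of the boost matrix L_v written above, one can build it numerically and verify that it preserves the interval and that the boost with the opposite velocity is its inverse. This is a minimal pure-Python sketch in units with c=1, with the time coordinate placed first and the metric taken as diag(-1,1,1,1); the helper names and the sample velocity are assumptions made for the illustration.

```python
import math

ETA = [[-1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def boost_matrix(v):
    """4x4 boost for a 3-velocity v (c = 1), in the block form given in the text."""
    gamma = 1 / math.sqrt(1 - sum(c * c for c in v))
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = gamma
    for i in range(3):
        L[0][i + 1] = -gamma * v[i]
        L[i + 1][0] = -gamma * v[i]
        for j in range(3):
            L[i + 1][j + 1] = (1.0 if i == j else 0.0) + gamma**2 / (1 + gamma) * v[i] * v[j]
    return L

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))

v = (0.3, 0.4, 0.5)                       # sample velocity with |v| < 1
L = boost_matrix(v)
L_inv = boost_matrix(tuple(-c for c in v))

print(max_abs_diff(matmul(matmul(transpose(L), ETA), L), ETA))   # interval preserved
print(max_abs_diff(matmul(L, L_inv), IDENTITY))                   # opposite boost is the inverse
```

Both printed numbers should be at the level of floating-point rounding.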

For all \gamma>0, the above transformation is well-defined, but if \gamma<0, then we face transformations containing the reversal of time (the time reversal operation T is a different thing from matrix transposition; please do not confuse them, even though they share the same symbol here. I will denote it by \mathbb{T} in order to distinguish them, although there is no danger of confusion in general). The time reversal can indeed be written as:

\mathbb{T}=\begin{pmatrix}-1 & \mathbf{0}^T\\ \mathbf{0} & \mathbb{I}\end{pmatrix}

In that case, (\gamma<0), after the boost L_{v}, we have to make the changes \gamma \rightarrow \vert \gamma\vert and \mathbf{A}\rightarrow -\mathbf{A}. If these shifts are done, the reference frames \overline{S} and \overline{S}' can be easily related

\overline{X}'=LX=LL^{-1}_{v}\overline{X}

in such a way that

LL^{-1}_{v}=\begin{pmatrix}1 & \mathbf{0}\\ \mathbf{0} & R\end{pmatrix}=L_R

where the rotation matrix is given formally by the next equation:

R=M-\dfrac{\mathbf{B}\mathbf{A}^T}{1+\gamma}

R must be an orthogonal matrix, i.e., R^TR=\mathbb{I}_{3x3}. Then (\det (R))^2=1, or \det R=\pm 1. For \det R=-1 we have the parity matrix

\mathbb{P}=\begin{pmatrix}1 & \mathbf{0}^T\\ \mathbf{0} & -\mathbb{I}_{3x3}\end{pmatrix}

and it will transform right-handed frames to left-handed frames \overline{S} or \overline{S}'. The rotation vector \alpha can be defined as well:

1+2\cos \alpha=Tr (R)\rightarrow \cos\alpha=\dfrac{Tr R-1}{2}

so \alpha^\mu=\dfrac{1}{2}\epsilon^{\mu\nu\lambda}R^\nu_{\lambda}\dfrac{\alpha}{\sin\alpha}, \forall 0\leq \alpha<\pi. The rotation acting on 3-rowed matrices:

R\mathbf{A}=\mathbf{B}

implies that \overline{X}'=R\overline{X}, and it changes -\mathbf{A}/\gamma of the frame S into \overline{S}. Passing from one frame into another, \overline{S}' to S', it implies we can define a boost with L_{-\mathbf{B}/\gamma}. In fact,

L_{-\mathbf{B}/\gamma}L=\begin{pmatrix}1 & \mathbf{0}^T\\ \mathbf{0} & R\end{pmatrix}=L_R

Q.E.D.

Remark(I): Without the time reversal, we would get L_{R\mathbf{v}}L_R=L=L_RL_{\mathbf{v}}

with \mathbf{v}=\mathbf{A}/\gamma and R=M-\dfrac{\mathbf{BA}^T}{1+\gamma}.

Remark (II): L_RL_v\rightarrow L^T=L^T_vL_R^T=L_vL_{R^T}. If L^T=L=L_{R\mathbf{v}}L_R, then the uniqueness of R\mathbf{v} provides that R=R^T=R^{-1}, i.e., that R is an orthogonal matrix. If R is an orthogonal matrix and a proper Lorentz transformation ( det R=+1), then we would get \sin\alpha=0, and thus \alpha=0 or \alpha=\pi, and so, R=I or R=2\mathbf{n}\mathbf{n}^T-1, with the unimodular vector \mathbf{n}, i.e., \vert \mathbf{n}\vert=1. That would be the case \forall \mathbf{v}\neq 0 and \mathbf{n}=\mathbf{v}/\vert\vert \mathbf{v}\vert\vert. Otherwise, if \mathbf{v}=0, then \mathbf{n} would be an arbitrary vector.

ADDITION OF VELOCITIES REVISITED

The second step prior to our treatment of Thomas precession is to review (setting c=1) the addition of velocities in the special relativistic realm. Suppose a point particle moves with velocity \overline{\mathbf{w}} in the reference frame \overline{S}. With respect to the S-frame (at rest) we will write:

\mathbf{x}=\overline{\mathbf{x}}+\dfrac{\gamma^2}{\gamma+1}(\overline{\mathbf{x}}\mathbf{v})\mathbf{v}+\gamma \mathbf{v}\overline{t}

and

t=\gamma \overline{t}+\gamma (\mathbf{v}\overline{\mathbf{x}})

and with \overline{x}=\overline{\mathbf{w}}\overline{t} we can calculate the ratio \mathbf{u}=\mathbf{x}/t:

\mathbf{u}=\dfrac{\dfrac{\overline{\mathbf{w}}}{\gamma}+\dfrac{\gamma}{1+\gamma}(\mathbf{v}\overline{\mathbf{w}})\mathbf{v}+\mathbf{v}}{1+\mathbf{v}\overline{\mathbf{w}}}

and thus

\mathbf{u}\equiv \dfrac{\mathbf{v}+\mathbf{w}_\parallel+(\mathbf{w}_\perp/\gamma)}{1+\mathbf{v}\overline{\mathbf{w}}}

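where we have defined the components of \overline{\mathbf{w}} parallel and orthogonal to \mathbf{v}:

\mathbf{w}_\parallel\equiv \dfrac{(\mathbf{v}\overline{\mathbf{w}})}{\mathbf{v}^2}\mathbf{v}

and

\mathbf{w}_\perp\equiv\overline{\mathbf{w}}-\mathbf{w}_\parallel

so that \mathbf{w}_\parallel+\mathbf{w}_\perp/\gamma=\dfrac{\overline{\mathbf{w}}}{\gamma}+\dfrac{\gamma}{1+\gamma}(\mathbf{v}\overline{\mathbf{w}})\mathbf{v}, using \mathbf{v}^2=(\gamma^2-1)/\gamma^2.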

Comment: the composition law for 3-velocities in special relativity is both non-linear AND non-associative.

There are two special cases of motion we usually consider in (special) relativity and inertial frames:

1st. The case of parallel motion between frames. In this case \overline{\mathbf{w}}=\lambda \mathbf{v}, i.e., \overline{\mathbf{w}}\times \mathbf{v}=0. Therefore,

\mathbf{u}=\dfrac{\mathbf{v}+\overline{\mathbf{w}}}{1+\mathbf{v}\overline{\mathbf{w}}}

This is the usual non-linear rule to add velocities in Special Relativity.

2nd. The case of orthogonal motion between frames, where \mathbf{v}\perp\overline{\mathbf{w}}. It means \mathbf{v}\overline{\mathbf{w}}=0. Then,

\mathbf{u}=\mathbf{v}+\overline{\mathbf{w}}/\gamma= \mathbf{v}+\overline{\mathbf{w}}\sqrt{1-\mathbf{v}^2}

This motion orthogonal to the direction of the relative velocity has an interesting phenomenology: the transverse velocity is slowed down by the factor 1/\gamma because of time dilation, while the spatial distances that are orthogonal to \mathbf{v} are equal in both reference frames.

Furthermore, we get also:

\mathbf{u}^2=1-\dfrac{(1-\overline{\mathbf{w}}^2)(1-\mathbf{v}^2)}{(1+\mathbf{v}\overline{\mathbf{w}})^2}\leq 1

Indeed, the condition \mathbf{u}^2=1 implies that \overline{\mathbf{w}}^2=1 or \mathbf{v}^2=1, and the latter condition is actually forbidden because of our interpretation of \mathbf{v} as a relative velocity between different frames. Thus, this last equation shows that Lorentz invariance in special relativity does not allow subluminal velocities to add up to a superluminal one, although, a priori, the composition law could also be used for superluminal speeds, since no restriction applies to them beyond those imposed by the principle of relativity.
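The composition rule of this section can be checked numerically. What follows is a minimal Python sketch under the assumption c=1; the helper names `gamma` and `add_velocities` are illustrative choices and not part of the original derivation. It evaluates the general formula for \mathbf{u}, recovers the parallel and orthogonal special cases, and compares \mathbf{u}^2 with the identity whose denominator is the square (1+\mathbf{v}\overline{\mathbf{w}})^2.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gamma(v):
    """Lorentz factor of a 3-velocity v (units with c = 1)."""
    return 1.0 / math.sqrt(1.0 - dot(v, v))

def add_velocities(v, w_bar):
    """Velocity u in S of a particle with velocity w_bar in S-bar,
    where S-bar moves with velocity v relative to S."""
    g = gamma(v)
    vw = dot(v, w_bar)
    k = 1.0 + g / (1.0 + g) * vw          # coefficient multiplying v
    return tuple((wi / g + k * vi) / (1.0 + vw) for vi, wi in zip(v, w_bar))

v = (0.6, 0.0, 0.0)

# Parallel case: reduces to (v + w) / (1 + v w).
u_par = add_velocities(v, (0.5, 0.0, 0.0))

# Orthogonal case: u = v + w / gamma.
u_orth = add_velocities(v, (0.0, 0.5, 0.0))

# General case: compare u^2 with 1 - (1 - w^2)(1 - v^2) / (1 + v.w)^2.
w_bar = (0.3, -0.4, 0.5)
u = add_velocities(v, w_bar)
u2_identity = 1.0 - (1.0 - dot(w_bar, w_bar)) * (1.0 - dot(v, v)) / (1.0 + dot(v, w_bar)) ** 2

print(u_par, u_orth, dot(u, u), u2_identity)
```

The three inputs are arbitrary subluminal velocities; any other choice with \vert\mathbf{v}\vert<1 and \vert\overline{\mathbf{w}}\vert<1 should behave in the same way.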

THOMAS PRECESSION

We are ready to study the Thomas precession and its meaning. Suppose an inertial frame \overline{\overline{S}} is obtained from another inertial frame \overline{S} by a boost with velocity \overline{\mathbf{w}}. Therefore, \overline{\overline{S}} has, relative to S, the velocity \mathbf{u} given by the addition rule we have seen in the previous section. Moreover, we have:

\overline{\overline{x}}=L_{\overline{w}}\overline{x}=L_{\overline{w}}L_{\mathbf{v}}x

Then, we get

L_{\mathbf{v}}=\begin{pmatrix}\gamma_v & -\gamma_v \mathbf{v}^T\\ -\gamma_v \mathbf{v} & \mathbf{1}+\dfrac{\gamma_v^2}{1+\gamma_v}\mathbf{v}\mathbf{v}^T\end{pmatrix}

L_{\overline{\mathbf{w}}}=\begin{pmatrix}\gamma_{\overline{\mathbf{w}}} & -\gamma_{\overline{\mathbf{w}}} \overline{\mathbf{w}}^T\\ -\gamma_{\overline{\mathbf{w}}} \overline{\mathbf{w}} & \mathbf{1}+\dfrac{\gamma_{\overline{\mathbf{w}}}^2}{1+\gamma_{\overline{\mathbf{w}}}}\overline{\mathbf{w}}\overline{\mathbf{w}}^T\end{pmatrix}

where

\gamma_{v}=\dfrac{1}{\sqrt{1-\mathbf{v}^2}}

\gamma_{\overline{\mathbf{w}}}=\dfrac{1}{\sqrt{1-\overline{\mathbf{w}}^2}}

and then

\boxed{L\equiv L_{\overline{\mathbf{w}}}L_v=\begin{pmatrix}\gamma & -\mathbf{A}^T\\ -\mathbf{B} & M\end{pmatrix}}

with

\gamma (\mathbf{v},\overline{\mathbf{w}})=\gamma_v\gamma_{\overline{w}}(1+\mathbf{v}\overline{\mathbf{w}})\equiv \gamma (\overline{\mathbf{w}},\mathbf{v})

\mathbf{A}=\gamma (\mathbf{v},\overline{\mathbf{w}})\overline{\mathbf{w}}o \mathbf{v}

\mathbf{B}=\gamma (\overline{\mathbf{w}},\mathbf{v})\mathbf{v}o\overline{\mathbf{w}}

M=M(\overline{\mathbf{w}},\mathbf{v})=\mathbf{1}+\dfrac{\gamma_v^2}{1+\gamma_v}\mathbf{v}\mathbf{v}^T+\dfrac{\gamma_{\overline{\mathbf{w}}}^2}{1+\gamma_{\overline{\mathbf{w}}}}\overline{\mathbf{w}}\overline{\mathbf{w}}^T+\gamma_v\gamma_{\overline{\mathbf{w}}}\left( 1+\dfrac{\gamma_v\gamma_{\overline{\mathbf{w}}}}{(1+\gamma_v)(1+\gamma_{\overline{\mathbf{w}}})}\mathbf{v}\overline{\mathbf{w}}\right)\overline{\mathbf{w}}\mathbf{v}^T

Here, we have defined:

\boxed{\overline{\mathbf{w}}o \mathbf{v}\equiv \dfrac{\left( \gamma_{\overline{\mathbf{w}}}\gamma_v\mathbf{v}+\gamma_{\overline{\mathbf{w}}}\overline{\mathbf{w}}+\gamma_{\overline{\mathbf{w}}}\dfrac{\gamma_v^2}{1+\gamma_v}(\overline{\mathbf{w}}\mathbf{v})\mathbf{v}\right)}{\gamma (\mathbf{v},\overline{\mathbf{w}})}}

Remark (I): The matrix L given by

\begin{pmatrix}\gamma & -\mathbf{A}^T\\ -\mathbf{B} & M\end{pmatrix}

is NOT symmetric as we would expect from a boost. According to our decomposition for the matrix M it can be rewritten in the following way

\boxed{R=R(\overline{\mathbf{w}},\mathbf{v})=M(\overline{\mathbf{w}},\mathbf{v})-\dfrac{\mathbf{B}\mathbf{A}^T}{1+\gamma}}

This last equation is called the Thomas precession (or Thomas rotation) associated with the 3-vectors \mathbf{v},\overline{\mathbf{w}}. We observe that R is a proper orthogonal matrix from the multiplicative property of determinants and the fact that all boosts have determinant one. Equivalently, it follows from the condition \det R=\pm 1, valid for every orthogonal matrix R, together with the continuous dependence of R on the velocities and the initial condition R(0,0)=\mathbf{1}.
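Remark (I) can be illustrated with a few lines of Python. This is only a sketch under the assumption c=1; the helpers `boost`, `matmul`, `split_boost_product`, `det3` and `embed_rotation` are hypothetical names introduced here. It multiplies two non-collinear boosts in the block form used above, reads off \gamma, \mathbf{A}, \mathbf{B} and M, builds R=M-\mathbf{B}\mathbf{A}^T/(1+\gamma), and checks that R is proper orthogonal and that L=L_RL_\mathbf{u} with \mathbf{u}=\mathbf{A}/\gamma.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def boost(v):
    """4x4 boost L_v (c = 1): [[g, -g v^T], [-g v, 1 + g^2/(1+g) v v^T]]."""
    g = 1.0 / math.sqrt(1.0 - dot(v, v))
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = g
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = -g * v[i]
        for j in range(3):
            L[i + 1][j + 1] = (i == j) + g * g / (1.0 + g) * v[i] * v[j]
    return L

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def split_boost_product(w_bar, v):
    """Return (L, R, u) for L = L_wbar L_v, with R = M - B A^T/(1+gamma), u = A/gamma."""
    L = matmul(boost(w_bar), boost(v))
    g = L[0][0]
    A = [-L[0][j + 1] for j in range(3)]
    B = [-L[i + 1][0] for i in range(3)]
    R = [[L[i + 1][j + 1] - B[i] * A[j] / (1.0 + g) for j in range(3)] for i in range(3)]
    return L, R, [a / g for a in A]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def embed_rotation(R):
    """4x4 matrix L_R carrying the rotation R in its spatial block."""
    return [[1.0, 0.0, 0.0, 0.0]] + [[0.0] + list(row) for row in R]

v = (0.6, 0.0, 0.0)
w_bar = (0.2, 0.5, 0.1)
L, R, u = split_boost_product(w_bar, v)

# Largest deviation of R R^T from the identity (rounding level expected).
RRt = matmul(R, [list(col) for col in zip(*R)])
orth_err = max(abs(RRt[i][j] - (i == j)) for i in range(3) for j in range(3))

# L is not symmetric, yet it factorizes as L_R L_u.
asymmetry = max(abs(L[i][j] - L[j][i]) for i in range(4) for j in range(4))
L_rebuilt = matmul(embed_rotation(R), boost(u))
split_err = max(abs(L[i][j] - L_rebuilt[i][j]) for i in range(4) for j in range(4))

print(orth_err, det3(R), asymmetry, split_err)
```

Plain nested lists are used instead of a linear algebra library only to keep the sketch dependency-free.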

Remark (II): From the definitions of M, and the vectors \mathbf{A},\mathbf{B}, we deduce that \mathbf{v}\times \overline{\mathbf{w}} is an eigenvector of R with eigenvalue +1 and this gives the axis of rotation. The rotation angle \alpha as calculated from Tr R=1+2\cos\alpha is a complicated expression, and only after some clever manipulations or the use of the geometric algebra framework, it simplifies to

1+\cos\alpha=\dfrac{(1+\gamma_u+\gamma_v+\gamma_{\overline{w}})^2}{(1+\gamma_u)(1+\gamma_v)(1+\gamma_{\overline{w}})}>0

In order to understand what this equation means, we have to observe that the components \mathbf{v} and \overline{\mathbf{w}} refer to different reference frames, and then, the scalar product \mathbf{v}\mathbf{\overline{w}} and the cross product \mathbf{v}\times\overline{\mathbf{w}} must be given good analytic expressions before the geometric interpretation can be accomplished. Moreover, if we want to interpret the cross product as an axis in the reference frame S, and correspondingly we want to split L=L_{R\mathbf{v}}L_R,  by the definition \overline{\mathbf{w}}o\mathbf{v} we deduce that

\mathbf{v}\times\mathbf{u}=\dfrac{\mathbf{v}\times\overline{\mathbf{w}}}{\gamma_v(1+\mathbf{v}\overline{\mathbf{w}})}

and thus, the Thomas rotation of the inertial frame S has its axis orthogonal to the relative velocity vectors \mathbf{v},\mathbf{u} of the reference frames \overline{S}, \overline{\overline{S}} against S.

On the other hand, if we interpret the above last equation as an axis in the reference frame \overline{\overline{S}}, associated with the split L=L_RL_\mathbf{u}, we would deduce that L=L_{R\mathbf{u}}L_R implies the following consequence. The reference frame \overline{\overline{S}} is obtained by boosting a certain frame S’ that is itself obtained from S by the rotation R. Then, \overline{\overline{S}} has (relative to S or S’) a velocity whose components are R\mathbf{u} in the inertial frame S’. Reciprocally, the components of the velocity of S or S’ against the frame \overline{\overline{S}} are provided, in \overline{\overline S}, by \overline{\overline{\mathbf{u}}}=-R\mathbf{u}. Therefore, from the Thomas precession formula for R we observe that R\mathbf{u} differs from \mathbf{u} only by linear combinations of the vectors \mathbf{v} and \overline{\mathbf{w}}. With all these results we easily derive:

\overline{\overline{\mathbf{u}}}\times \overline{\overline{\mathbf{w}}}=(-R\mathbf{u})\times (-\overline{\mathbf{w}})\propto \mathbf{v}\times \overline{\mathbf{w}}

i.e., the axis for the Thomas rotation matrix of \overline{\overline{S}} is orthogonal to the relative velocities \overline{\overline{\mathbf{u}}}, \overline{\overline{\mathbf{w}}} of the inertial frames S, \overline{S} against \overline{\overline{S}}. Finally, to find the rotation matrix, it is enough to restrict the problem to the case where \overline{\mathbf{w}} is small, so that squares of it may be neglected. In this simple case, R becomes:

\boxed{R\approx \mathbf{1}+\dfrac{\gamma_v}{1+\gamma_v}\left(\overline{\mathbf{w}}\mathbf{v}^T-\mathbf{v}\overline{\mathbf{w}}^T\right)}

and where the rotation vector (axis times angle) is given by

\boxed{\boldsymbol{\alpha}\approx -\dfrac{\gamma_v}{1+\gamma_v}\mathbf{v}\times\overline{\mathbf{w}}\approx -\dfrac{\gamma_v^2}{1+\gamma_v}\mathbf{v}\times\mathbf{u}}
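The statements of Remark (II) lend themselves to a numerical check. The sketch below assumes c=1 and uses hypothetical helper names (`boost`, `matmul`, `thomas_rotation`). It verifies three things for sample velocities: that \mathbf{v}\times\overline{\mathbf{w}} is left invariant by R, that the angle obtained from Tr R=1+2\cos\alpha agrees with the closed expression in the \gamma factors when the numerator (1+\gamma_u+\gamma_v+\gamma_{\overline{w}}) enters squared, and that for small \overline{\mathbf{w}} the matrix R approaches \mathbf{1}+\dfrac{\gamma_v}{1+\gamma_v}(\overline{\mathbf{w}}\mathbf{v}^T-\mathbf{v}\overline{\mathbf{w}}^T).

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def gamma(v):
    return 1.0 / math.sqrt(1.0 - dot(v, v))

def boost(v):
    """4x4 boost L_v (c = 1) in the block form used in the text."""
    g = gamma(v)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = g
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = -g * v[i]
        for j in range(3):
            L[i + 1][j + 1] = (i == j) + g * g / (1.0 + g) * v[i] * v[j]
    return L

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def thomas_rotation(w_bar, v):
    """R = M - B A^T / (1 + gamma), read off from L = L_wbar L_v."""
    L = matmul(boost(w_bar), boost(v))
    g = L[0][0]
    return [[L[i + 1][j + 1] - L[i + 1][0] * L[0][j + 1] / (1.0 + g) for j in range(3)]
            for i in range(3)]

v = (0.6, 0.0, 0.0)
w_bar = (0.2, 0.5, 0.1)
R = thomas_rotation(w_bar, v)

# 1) v x w_bar is an eigenvector of R with eigenvalue +1.
n = cross(v, w_bar)
axis_err = max(abs(dot(R[i], n) - n[i]) for i in range(3))

# 2) Angle from the trace versus the closed formula in the gamma factors.
g_v, g_w = gamma(v), gamma(w_bar)
g_u = g_v * g_w * (1.0 + dot(v, w_bar))
cos_from_trace = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
cos_from_gammas = (1.0 + g_u + g_v + g_w) ** 2 / ((1.0 + g_u) * (1.0 + g_v) * (1.0 + g_w)) - 1.0

# 3) Small w_bar: first-order approximation of R.
w_small = (0.0, 1e-4, 0.0)
R_small = thomas_rotation(w_small, v)
k = g_v / (1.0 + g_v)
R_approx = [[(i == j) + k * (w_small[i] * v[j] - v[i] * w_small[j]) for j in range(3)]
            for i in range(3)]
approx_err = max(abs(R_small[i][j] - R_approx[i][j]) for i in range(3) for j in range(3))

print(axis_err, cos_from_trace, cos_from_gammas, approx_err)
```

The residual `approx_err` is of second order in \overline{\mathbf{w}}, which is why it is tiny but not at rounding level.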

In order to understand the Physics behind the Thomas precession, we will consider one single thought experiment. Imagine a reference frame S in accelerated motion with respect to an inertial frame I. The spatial axes of S remain parallel at any time, in the sense that the instantaneous inertial frames coinciding with S at times t and t+\Delta t are related by a pure boost in the limit \Delta t\rightarrow 0. This may be achieved if we orient S with the aid of very fast spinning torque-free gyroscopes. Then, from the inertial frame I, S seems to be rotated at each instant of time, and there is a continuous rotation of S against I since the velocity of S changes continuously. This gyroscopic rotation of S relative to I IS the Thomas precession.  We can determine the angular velocity of this motion in a straightforward manner. During the small interval of time \Delta t measured from I, the instantaneous velocity \mathbf{v} of S changes by a certain quantity \Delta \mathbf{v}, measured from I. In that case,

\Delta \alpha=-\gamma_v^2\mathbf{v}\times\dfrac{\Delta\mathbf{v}}{(1+\gamma_v)}

for the rotation vector during a time interval \Delta t. Thus, the angular velocity for the Thomas precession will be given by:

\boxed{\omega_T=-\dfrac{\gamma_v^2}{1+\gamma_v}\mathbf{v}\times\dfrac{d\mathbf{v}}{dt}}

or reintroducing the speed of light we get

\boxed{\omega_T=-\dfrac{\gamma_v^2}{1+\gamma_v}\mathbf{v}\times\dfrac{1}{c^2}\dfrac{d\mathbf{v}}{dt}=\dfrac{\gamma_v^2}{1+\gamma_v}\dfrac{1}{c^2}\mathbf{a}\times\mathbf{v}}
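As a worked example of the boxed formula, consider uniform circular motion, where the result can be anticipated by hand: with \vert\mathbf{a}\vert=\omega v and \gamma_v^2v^2/c^2=\gamma_v^2-1, the precession has magnitude (\gamma_v-1)\omega and is opposite to the orbital rotation. The Python sketch below assumes c=1 and an arbitrary orbit (radius 1, speed 0.6); the function names are illustrative only.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def thomas_factor(speed, c=1.0):
    """Prefactor gamma^2 / (1 + gamma) of the Thomas angular velocity."""
    g = 1.0 / math.sqrt(1.0 - (speed / c) ** 2)
    return g * g / (1.0 + g)

def thomas_angular_velocity(v, a, c=1.0):
    """omega_T = gamma^2 / (1 + gamma) * (a x v) / c^2."""
    speed = math.sqrt(sum(x * x for x in v))
    k = thomas_factor(speed, c) / c ** 2
    return tuple(k * x for x in cross(a, v))

# Uniform circular motion in the xy-plane (c = 1), counterclockwise.
r, omega = 1.0, 0.6
v = (0.0, omega * r, 0.0)            # velocity at the point (r, 0, 0)
a = (-omega ** 2 * r, 0.0, 0.0)      # centripetal acceleration at that point
omega_T = thomas_angular_velocity(v, a)

gamma = 1.0 / math.sqrt(1.0 - (omega * r) ** 2)
print(omega_T[2], -(gamma - 1.0) * omega)   # both close to -0.15

# Slow-motion limit: the prefactor tends to 1/2, the famous "Thomas half".
print(thomas_factor(0.01))
```

The slow-motion value 1/2 of the prefactor is what ends up halving the spin-orbit coupling discussed in the remarks that follow.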

Remark (I): The special relativistic effect given by the Thomas precession was used by Thomas himself to remove a discrepancy between the non-relativistic theory of the spinning electron and the experimental value of the fine structure. His observation was, in fact, that the gyromagnetic ratio of the electron obtained from the anomalous Zeeman effect led to fine-structure doublet separations twice as large as those observed. The Thomas precession introduces a correction to the equation of motion of an electron in an external electromagnetic field, and this in turn corrects the spin-orbit coupling, explaining the correct value of the fine structure.

Remark (II): In the framework of the relativistic quantum theory of the electron, Dirac realized that the effect of Thomas precession was automatically included!

Remark (III): Inside the Thomas paper, we find these interesting words

“(…)It seems that Abraham (1903) was the first to consider in any detail an electron with an axis. Many have since then considered spinning electron, ring electrons, and the like. Compton (1921) in particular suggested a quantized spin for the electron. It remained for Uhlenbeck and Goudsmit (1925) to show how this idea can be used to explain the anomalous Zeeman effect. The assumptions they had to make seemed to lead to optical and relativity doublet separations twice larger than those we observe. The purpose of the following paper, which contains the results mentioned in my recent letter to Nature (1926), is to investigate the kinematics of an electron with an axis on the basis of the restricted theory of relativity. The main fact used is that the combination of two Lorentz transformations without rotation in general is not of the same form(…)”.

From the historical viewpoint it should also be remarked that the precession effect was known by the end of 1912 to the mathematician E.Borel (C.R.Acad.Sci.,156. 215 (1913)). It was described by him (Borel, 1914) as well as by L.Silberstein (1914) in textbooks as early as 1914. It seems that the effect was even known to A.Sommerfeld in 1909 and before him, perhaps even to H.Poincaré. The importance of Thomas’ work and papers on this subject was thus not only the rediscovery but the relevant application to a pressing problem at that time, namely the structure of the atomic spectra and their fine structure!

Remark (IV): Not every Lorentz transformation can be written as the product of two boosts due to the Thomas precession!

THE LORENTZ GROUP AS A QUASIDIRECT PRODUCT: QUASIGROUPS, LOOPS AND GYROGROUPS

Even though we have not studied group theory in this blog, I feel the need to explain some group theory stuff related to the Thomas precession here.

The kinematical differences between the Galilean and Einsteinian relativity theories are observed at many levels. The essential differences become apparent already at the level of the homogeneous groups without reversals (i.e., without space or time inversions). Let me first consider the Galileo group. It is generated by space rotations G_R=L_R and Galilean boosts in any number and order. Using the notation we have developed in this post, we could write X'=G_\mathbf{v}X in this way:

G_\mathbf{v}=\begin{pmatrix}1 & \mathbf{0}^T\\ -\mathbf{v} & \mathbf{1}\end{pmatrix}

The following relationships are deduced:

G_RG_\mathbf{v}=G_{R\mathbf{v}}G_{R}

G_{R_1}G_{R_2}=G_{R_1R_2}

G_{\mathbf{v}_1}G_{\mathbf{v}_2}=G_{\mathbf{v}_1+\mathbf{v}_2}=G_{\mathbf{v}_2}G_{\mathbf{v}_1}

In the case of the Lorentz group, these equations are “generalized” into

L_RL_\mathbf{v}=L_{R\mathbf{v}}L_{R}

L_{R_1}L_{R_2}=L_{R_1R_2}

L_{\mathbf{v}_1}L_{\mathbf{v}_2}=L_{R(\mathbf{v}_1,\mathbf{v}_2)}L_{\mathbf{v}_1 o \mathbf{v}_2}

where R(\mathbf{v}_1,\mathbf{v}_2) is the Thomas precession and the circle denotes the nonlinear relativistic velocity addition. Be aware that the domain of velocities in special relativity is \vert \mathbf{v}\vert<1, in units with c set to unity.

Both groups (Galileo and Lorentz) contain as a subgroup the group of all spatial rotations G_R\equiv L_R. The sets of Galilean or Lorentzian boosts G_v and L_v are invariant under conjugation by G_R=L_R, since

G_RG_vG_R^{-1}=G_{Rv}

L_RL_vL_R^{-1}=L_{Rv}

are boosts as well. In the case of the Galileo group, the set of (Galilean) boosts forms an (abelian) subgroup, and therefore an invariant (normal) subgroup. We can calculate the factor group with respect to it, and we obtain a group isomorphic to the subgroup of space rotations. Using the group law for the Galileo group:

\underbrace{G_{R_1}G_{v_1}}\underbrace{G_{R_2}G_{v_2}}=G_{R_1R_2}G_{R_2^{-1}v_1+v_2}=G_{R_3}G_{v_3}

with R_3=R_1R_2 and v_3=R_2^{-1}v_1+v_2. As a consequence, the homogeneous Galileo group (without reversals) is called a semidirect product of the rotation group with the Abelian group \mathbb{R}^3 of all boosts given by \mathbf{v}.
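The Galilean group law can be verified with explicit 4x4 matrices. The following Python sketch is only an illustration: the rotation angles, the velocities and the helper names (`rot_z`, `rot_x`, `g_rot`, `g_boost`) are arbitrary choices made here. The rotation of the product is taken as R_1R_2, as read off from the subscript of G_{R_1R_2}, and the boost velocity as R_2^{-1}v_1+v_2.

```python
import math

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def g_rot(R):
    """G_R: rotation embedded in the spatial block of a 4x4 matrix."""
    return [[1.0, 0.0, 0.0, 0.0]] + [[0.0] + list(row) for row in R]

def g_boost(v):
    """G_v = [[1, 0^T], [-v, 1]] (Galilean boost)."""
    return [[1.0, 0.0, 0.0, 0.0]] + [[-v[i]] + [1.0 if i == j else 0.0 for j in range(3)]
                                     for i in range(3)]

def apply(R, x):
    return [sum(R[i][k] * x[k] for k in range(3)) for i in range(3)]

def transpose(R):
    return [list(col) for col in zip(*R)]

def max_diff(P, Q):
    return max(abs(P[i][j] - Q[i][j]) for i in range(len(P)) for j in range(len(P[0])))

R1, R2 = rot_z(0.3), rot_x(-1.1)
v1, v2 = [0.2, -0.5, 1.0], [0.7, 0.1, -0.4]

# Left-hand side: (G_R1 G_v1)(G_R2 G_v2).
lhs = matmul(matmul(g_rot(R1), g_boost(v1)), matmul(g_rot(R2), g_boost(v2)))

# Right-hand side: G_R3 G_v3 with R3 = R1 R2 and v3 = R2^{-1} v1 + v2.
R3 = matmul(R1, R2)
v3 = [a + b for a, b in zip(apply(transpose(R2), v1), v2)]
law_err = max_diff(lhs, matmul(g_rot(R3), g_boost(v3)))

# Galilean boosts simply add (and therefore commute).
boost_err = max_diff(matmul(g_boost(v1), g_boost(v2)),
                     g_boost([a + b for a, b in zip(v1, v2)]))

print(law_err, boost_err)
```

Since Galilean velocities are unbounded, the sample velocities need not satisfy any speed limit, in contrast with the Lorentzian case.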

The case of the Lorentz group is more complicated. The reason is the Thomas precession. Indeed, the set of boosts does NOT form a subgroup of the Lorentz group! We can define a product on this set:

\boxed{L_{v_1} oL_{v_2}=L_{v_1 o v_2}}

but, contrary to the result we got with the Galileo group, this product does NOT define a group structure. In fact, mathematicians call a set equipped with such a binary operation a groupoid (in the older algebraic sense of the word, nowadays also called a magma). The domain of velocities becomes a groupoid under the multiplication v_1 o v_2. This has dramatic consequences. In particular, associativity does not hold for this multiplication! Anyway, a weaker form of it is true, involving the Thomas precession/rotation formula:

\boxed{(v_1 o v_2) o v_3=(R^{-1}(v_2,v_3)v_1) o (v_2 o v_3)}

Analogously, the multiplication is not commutative in general either, but it satisfies a weaker form of commutativity. While in general groupoids one has to distinguish between right and left unit elements (if any), we have indeed \mathbf{v}=\mathbf{0} as a “two-sided” unit element for the velocity groupoid. In the same manner, while in general groupoids right and left inverses (if any) may differ, in the case of the Lorentz group the groupoid associated with the Thomas precession has a unique two-sided inverse -\mathbf{v} for any \mathbf{v} relative to the groupoid multiplication law. It is NON-trivial (due to non-associativity), albeit true, that the equation given by

v_1 o v_2=v_3

may be solved uniquely for v_2 given v_1 and v_3, and, given v_2 and v_3, it may be solved uniquely for v_1. A groupoid satisfying this property (i.e., a groupoid that allows such uniqueness in the solutions of its equation) is called a quasigroup.
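These algebraic properties of the velocity composition can be probed numerically. The sketch below assumes c=1 and the convention L_{v_1}L_{v_2}=L_{R(v_1,v_2)}L_{v_1 o v_2}; the helpers `compose` and `thomas_rotation` are hypothetical names, and the three sample velocities are arbitrary. It exhibits the failure of commutativity and associativity, the two-sided unit and inverse, and the weak associativity law with R^{-1}(v_2,v_3)=R^T(v_2,v_3).

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gamma(v):
    return 1.0 / math.sqrt(1.0 - dot(v, v))

def compose(a, b):
    """a o b, defined through L_a L_b = L_R L_{a o b} (the boost L_b acts first)."""
    g, ab = gamma(b), dot(a, b)
    k = 1.0 + g / (1.0 + g) * ab
    return [(ai / g + k * bi) / (1.0 + ab) for ai, bi in zip(a, b)]

def boost(v):
    g = gamma(v)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = g
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = -g * v[i]
        for j in range(3):
            L[i + 1][j + 1] = (i == j) + g * g / (1.0 + g) * v[i] * v[j]
    return L

def thomas_rotation(a, b):
    """R(a, b) = M - B A^T / (1 + gamma), read off from L_a L_b."""
    La, Lb = boost(a), boost(b)
    L = [[sum(La[i][k] * Lb[k][j] for k in range(4)) for j in range(4)] for i in range(4)]
    g = L[0][0]
    return [[L[i + 1][j + 1] - L[i + 1][0] * L[0][j + 1] / (1.0 + g) for j in range(3)]
            for i in range(3)]

def apply_inverse(R, x):
    """R^{-1} x = R^T x for an orthogonal matrix R."""
    return [sum(R[k][i] * x[k] for k in range(3)) for i in range(3)]

def gap(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

v1, v2, v3 = [0.5, 0.1, 0.0], [0.0, 0.6, 0.2], [-0.3, 0.0, 0.7]
zero = [0.0, 0.0, 0.0]

comm_gap = gap(compose(v1, v2), compose(v2, v1))                        # not commutative
assoc_gap = gap(compose(compose(v1, v2), v3), compose(v1, compose(v2, v3)))  # not associative

# Weak associativity: (v1 o v2) o v3 = (R^{-1}(v2, v3) v1) o (v2 o v3).
R23 = thomas_rotation(v2, v3)
weak_gap = gap(compose(compose(v1, v2), v3),
               compose(apply_inverse(R23, v1), compose(v2, v3)))

unit_gap = max(gap(compose(zero, v1), v1), gap(compose(v1, zero), v1))
inverse_gap = gap(compose([-x for x in v1], v1), zero)

print(comm_gap, assoc_gap, weak_gap, unit_gap, inverse_gap)
```

Only the first two gaps are expected to be sizeable; the remaining three should sit at rounding level.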

In conclusion, we can say that the Lorentz group IS, in sharp contrast to the Galileo group, in no way a semidirect product, being what mathematicians and physicists call a simple group, i.e., it is a noncommutative group having no nontrivial invariant subgroup! Instead, the multiplication rule of the Lorentz group without reversals makes it, in the sense of our previous definitions, the quasidirect product of the rotation group (as a subgroup of the automorphism group of the velocity groupoid) with the so-called “weakly associative groupoid of velocities”. Here, weakly associative(-commutative) groupoid means the following: a groupoid with a left-sided unit and left-sided inverses with the following properties:

1. Weak associativeness: R(\mathbf{0},\mathbf{v})=R(-\mathbf{v},\mathbf{v})=\mathbf{1}

2. Loop property (from Thomas precession formula): R(v_1,v_2)=R(v_1,v_1 o v_2)

and where the automorphism group of the velocity groupoid is defined by the following equation

Definition (Automorphism group of the velocity groupoid): (Sv_1)o(Sv_2)=S(v_1 o v_2)

Note: an associative groupoid is called a semigroup, and a semigroup with a two-sided unit element is called a monoid.

This algebraic structure hidden in the Lorentz group has been rediscovered several times throughout the history of mathematical physics. A groupoid satisfying the loop property has been named in several ways. For instance, in 1988, A. A. Ungar derived the above composition laws and the automorphism group of the Thomas precession R. Independently, A. Nesterov and coworkers in the Soviet Union had studied the same problem and quasigroup since 1986. And we can trace this structure back even further. 20 years before the Ungar “rediscovery”, H. Karzel had postulated a version of the same abstract object, and it was integrated into a richer one with two composition laws. He called it “near-domain”, where the automorphisms R (Thomas precessions) were to be realized by the (distributive) left multiplication with suitable elements of the near-domain (the reference is Abh. Math.Sem.Uni. Hamburg, 1968).

However, Ungar himself developed a more systematic treatment and description for the Thomas precession “groupoid” that is behind all this weird non-associative stuff in the Lorentz group in 3+1 dimensions. According to his new approach and terminology, the structure is called “gyrocommutative gyrogroup” and it includes the Thomas precession as “Thomas gyration” in this framework. If you want to learn more about gyrogroups and gyrovector spaces, read this article

http://en.wikipedia.org/wiki/Gyrovector_space

Some other authors, like Wefelscheid and coworkers, call these gyrogroups K-loops. Moreover, there are two extra sources of this nontrivial mathematical structure.

Firstly, in Japan, M.Kikkawa had studied certain loops with a compatible differentiable structure called “homogeneous symmetric Lie groups” ( Hiroshima Math. J.5, 141 (1975)). Even though he did not discuss any concrete example, it is natural from his definitions that it was the same structure Karzel found. Being romantic, we can see a certain justice in calling gyrogroups K-loops (since Kikkawa and Karzel discovered them first!). The second source can be traced back in time, since the same ideas were already known by L.Sabinin et alii circa 1972 ( Sov. Math. Dokl.13,970(1972)). Their relation to symmetric homogeneous spaces of noncompact type has been discussed some years ago by W. Krammer and H.K.Urbantke, e.g., in Res. Math.33, 310 (1998).

Finally, a purely algebraic loop theory approach (with motivations far away from geometry or physics) was introduced by D. A. Robinson in 1966. In 1995, A. Kreuzer showed that it was indeed identical to K-loops, again adding some extra nomenclature ( Math.Proc.Camb. Philos.Soc.123, 53 (1998)).

THOMAS PRECESSION: EASY DEDUCTION

We have seen that the composition of 2 Lorentz boosts, generally with 2 non-collinear velocities, results in a Lorentz transformation that IS NOT a pure boost but the composition of a single boost and a single spatial rotation. Indeed, this phenomenon is also called Wigner-Thomas rotation. As a final consequence, any body moving on a curvilinear trajectory experiences a rotational precession, first noted by Thomas in the relativistic theory of the spinning electron.

In this final section, I am going to review the really simple deduction of the Thomas precession formula given in the paper http://arxiv.org/abs/1211.1854

Imagine 3 different inertial observers Anna, Bob and Charles and their respective inertial frames A, B, and C attached to them. We choose A as a non-rotated frame with respect to B, and B as a non-rotated reference frame w.r.t. C. However, surprisingly, C is going to be rotated w.r.t. A and it is inevitable! We are going to understand it better. Let Bob embrace Charles and let them move together with constant velocity \mathbf{v} w.r.t. Anna. At some point, Charles decides to run away from Bob with a tiny velocity \mathbf{dv'} w.r.t. Bob. Then, Bob is moving with relative velocity -\mathbf{dv'} w.r.t. C and Anna is moving with relative velocity -\mathbf{v} w.r.t. B. We can show these events with the following diagram:

Now, we can write Charles’ velocity in Anna’s frame as the sum \mathbf{v}+d\mathbf{v}. Since the frame C is rotated with respect to the A frame, Anna’s velocity in the C frame, \hat{\mathbf{v}}, has to be calculated step by step as follows. Firstly, we remark that

\hat{\mathbf{v}}\neq -\mathbf{v}-d\mathbf{v}

Secondly, the angle d\mathbf{\Omega} of an infinitesimal rotation is given by:

d\mathbf{\Omega}=-\dfrac{\hat{\mathbf{v}}}{\vert \hat{\mathbf{v}}\vert }\times \dfrac{\mathbf{v}+d\mathbf{v}}{\vert \mathbf{v}+d\mathbf{v}\vert}\approx -\dfrac{\hat{\mathbf{v}}}{v^2}\times (\mathbf{v}+ d\mathbf{v})\;\;\; (1)

The precession rate in the A frame will be provided using the general nonlinear composition rule in SR. If the motion is parallel to the x-axis with velocity V, we do know that

u'_x=\dfrac{u_x-V}{1-\dfrac{u_x V}{c^2}}

u'_y=\dfrac{u_y\sqrt{1-\dfrac{V^2}{c^2}}}{1-\dfrac{u_x V}{c^2}}

u'_z=\dfrac{u_z\sqrt{1-\dfrac{V^2}{c^2}}}{1-\dfrac{u_x V}{c^2}}

and where \mathbf{u}=(u_x,u_y,u_z) and \mathbf{u}'=(u'_x,u'_y,u'_z) are the velocities of some object in the rest frame and the moving frame, respectively. For an arbitrary velocity \mathbf{V}=(V_x,V_y,V_z), not necessarily parallel to the x-axis, we obtain the transformations

\boxed{\mathbf{u}'=\dfrac{\sqrt{1-\dfrac{V^2}{c^2}}\left(\mathbf{u}-\dfrac{\mathbf{u}\cdot\mathbf{V}}{V^2}\mathbf{V}\right)-\left( \mathbf{V}-\dfrac{\mathbf{u}\cdot\mathbf{V}}{V^2}\mathbf{V}\right)}{1-\dfrac{\mathbf{u}\cdot\mathbf{V}}{c^2}}\;\;\; (2)}

and where the unprimed and primed frames are mutually non-rotated to each other. Using this last equation, (2), we can easily describe the transition from the frame A to the frame B. It involves the substitutions:

\mathbf{V}\rightarrow \mathbf{v}

\mathbf{u}\rightarrow \mathbf{v}+d\mathbf{v}

\mathbf{u}'\rightarrow d\mathbf{v}'

Keeping only the first order terms in d\mathbf{v}, we can get the following expansion from eq.(2):

d\mathbf{v}'\approx \dfrac{1}{\sqrt{1-\dfrac{v^2}{c^2}}}\left(d\mathbf{v}-\dfrac{\mathbf{v}\cdot d\mathbf{v}}{v^2}\mathbf{v}\right)+\dfrac{1}{1-\dfrac{v^2}{c^2}}\dfrac{\mathbf{v}\cdot d\mathbf{v}}{v^2}\mathbf{v}\;\;\; (3)

Using again eq.(2) to make the transition between the B frame to the C frame, i.e., making the substitutions:

\mathbf{V}\rightarrow d\mathbf{v}'

\mathbf{u}\rightarrow -\mathbf{v}

\mathbf{u}'\rightarrow \hat{\mathbf{v}}

and dropping the higher order terms in d\mathbf{v}', we obtain the following formula

\boxed{\hat{\mathbf{v}}\approx -\mathbf{v}+\dfrac{\mathbf{v}\cdot d\mathbf{v}'}{c^2}\mathbf{v}-d\mathbf{v}'\;\;\; (4)}

The final step is easy: we plug eq.(3) into eq.(4) and the resulting expression into eq.(1). Then, we divide by the differential dt in the final formula to provide the celebrated Thomas precession formula:

\boxed{\dot{\Omega}=\dfrac{d\Omega}{dt}=\omega_T=-\dfrac{1}{v^2}\left(\dfrac{1}{\sqrt{1-\dfrac{v^2}{c^2}}}-1\right)\mathbf{v}\times \dot{\mathbf{v}}\;\;\; (5)}

or equivalently

\boxed{\dot{\Omega}=\dfrac{d\Omega}{dt}=\omega_T=-\dfrac{1}{v^2}\left(\gamma_{\mathbf{v}}-1\right)\mathbf{v}\times \mathbf{a}\;\;\; (6)}

It can easily be shown that these formulae are the same as those given previously above, writing v^2 in terms of \gamma and performing some elementary algebraic manipulations.
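The whole chain of this section can also be followed numerically, without expanding anything by hand. The Python sketch below assumes c=1 and small, arbitrarily chosen values for \mathbf{v} and d\mathbf{v}; the function `transform_velocity` is a hypothetical helper implementing eq.(2) exactly. It applies eq.(2) twice (A to B, then B to C), evaluates the infinitesimal rotation of eq.(1), and compares it with -\dfrac{\gamma_{\mathbf{v}}-1}{v^2}\mathbf{v}\times d\mathbf{v}, i.e. eq.(5) multiplied by dt. It also checks the identity (\gamma-1)/v^2=\gamma^2/(1+\gamma), which links eq.(6) to the earlier form of \omega_T.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def scale(a, s):
    return tuple(s * x for x in a)

def unit(a):
    return scale(a, 1.0 / math.sqrt(dot(a, a)))

def transform_velocity(u, V):
    """Eq. (2) with c = 1: velocity u seen from a non-rotated frame moving with V."""
    uV, V2 = dot(u, V), dot(V, V)
    u_par = scale(V, uV / V2)                 # part of u along V
    u_perp = add(u, scale(u_par, -1.0))       # part of u orthogonal to V
    numerator = add(scale(u_perp, math.sqrt(1.0 - V2)), add(u_par, scale(V, -1.0)))
    return scale(numerator, 1.0 / (1.0 - uV))

v = (0.6, 0.0, 0.0)
dv = (2e-7, 1e-6, -5e-7)

dv_prime = transform_velocity(add(v, dv), v)            # A -> B: Charles as seen by Bob
v_hat = transform_velocity(scale(v, -1.0), dv_prime)    # B -> C: Anna as seen by Charles

# Eq. (1): infinitesimal rotation between the frames A and C.
d_omega = scale(cross(unit(v_hat), unit(add(v, dv))), -1.0)

# Eq. (5) multiplied by dt.
gamma = 1.0 / math.sqrt(1.0 - dot(v, v))
predicted = scale(cross(v, dv), -(gamma - 1.0) / dot(v, v))

diff = add(d_omega, scale(predicted, -1.0))
rel_err = math.sqrt(dot(diff, diff) / dot(predicted, predicted))
print(d_omega, predicted, rel_err)
```

The relative error is of the order of \vert d\mathbf{v}\vert, as expected for a first-order formula.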

Aren’t you fascinated by how these wonderful mathematical structures emerge from the physical world? I can say it: Fascinating is not enough for my surprised mind!


LOG#047. The Askaryan effect.


I discussed and reviewed the important Cherenkov effect and radiation in the previous post, here:

https://thespectrumofriemannium.wordpress.com/2012/10/16/log046-the-cherenkov-effect/

Today we are going to study a relatively new effect (new experimentally speaking, because it was first detected when I was an undergraduate student, in 2000), although it is not so new from the theoretical side (it was predicted in 1962). This effect is closely related to the Cherenkov effect. It is named the Askaryan effect or Askaryan radiation; we will get to it below, after a brief recapitulation of the Cherenkov effect from the last post.

We do know that charged particles moving through a medium faster than the phase velocity of light in that medium emit Cherenkov radiation. How can a particle move faster than light? The speed of a charged particle can exceed the phase velocity of light in the medium, c/n, while staying below the vacuum speed of light c. That is all. As for some speculations about the so-called tachyonic gamma ray emissions, let me say that the existence of superluminal energy transfer has not been established so far, and one may ask why. There are two options:

1) The simplest solution is that superluminal quanta just do not exist, the vacuum speed of light being the definitive upper bound.

2) The second solution is that the interaction of superluminal radiation with matter is very small, the quotient of tachyonic and electric fine-structure constants being q_{tach}^2/e^2<10^{-11}. Therefore superluminal quanta and their substratum are hard to detect.

A related and very interesting question could be asked now about the Cherenkov radiation we have studied here. What about neutral particles? Is there some analogue of Cherenkov radiation valid for chargeless or neutral particles? Because neutrinos are electrically neutral, conventional Cherenkov radiation of superluminal neutrinos does not arise or it is otherwise weakened. However, neutrinos do carry electroweak charge and may emit certain Cherenkov-like radiation via weak interactions when traveling at superluminal speeds. The Askaryan effect/radiation is this Cherenkov-like effect for neutrinos, and we are going to shed some light on it in this entry.

We are being bombarded by cosmic rays, and even more, we are being bombarded by neutrinos. Indeed, we expect that ultra-high energy (UHE) neutrinos or extreme ultra-high energy (EHE) neutrinos will hit us too. When neutrinos interact with matter, they create showers, especially in dense media. Thus, we expect that the electrons and positrons in those showers, which travel faster than the speed of light in these media or even in the air, should emit (coherent) Cherenkov-like radiation.

Who was Gurgen Askaryan?

Let me quote what Wikipedia says about him: Gurgen Askaryan (December 14, 1928-1997) was a prominent Soviet (Armenian) physicist, famous for his discovery of the self-focusing of light, pioneering studies of light-matter interactions, and the discovery and investigation of the interaction of high-energy particles with condensed matter. He published more than 200 papers about different topics in high-energy physics.

Other interesting ideas by Askaryan: the bubble chamber (he came up with the idea independently of Glaser, but he did not publish it, so he did not win the Nobel Prize), laser self-focusing (one of the main contributions of Askaryan to non-linear optics was the self-focusing of light), and the acoustic UHECR detection proposal. Askaryan was the first to note that the outer few metres of the Moon’s surface, known as the regolith, would be a sufficiently transparent medium for detecting microwaves from the charge excess in particle showers. The radio transparency of the regolith has since been confirmed by the Apollo missions.

If you want to learn more about Askaryan ideas and his biography, you can read them here: http://en.wikipedia.org/wiki/Gurgen_Askaryan

What is the Askaryan effect?

The next figure is from the Askaryan radiation detected by the ANITA experiment:

The Askaryan effect is the phenomenon whereby a particle traveling faster than the phase velocity of light in a dense dielectric medium (such as salt, ice or the lunar regolith) produces a shower of secondary charged particles which contains a charge anisotropy and thus emits a cone of coherent radiation in the radio or microwave part of the electromagnetic spectrum. It is similar to, or more precisely it is based on, the Cherenkov effect.

High energy processes such as Compton, Bhabha and Møller scattering along with positron annihilation rapidly lead to about a 20%-30% negative charge asymmetry in the electron-photon part of a cascade. For instance, they can be initiated by UHE (higher than, e.g., 100 PeV) neutrinos.

In 1962, Askaryan first hypothesized this effect and suggested that it should lead to strong coherent radio and microwave Cherenkov emission for showers propagating within the dielectric. Since the dimensions of the clump of charged particles are small compared to the wavelength of the radio waves, the shower radiates coherent radio Cherenkov radiation whose power is proportional to the square of the net charge in the shower. The net charge in the shower is proportional to the primary energy, so the radiated power scales quadratically with the shower energy, P_{RF}\propto E^2.

Indeed, this coherent radio emission originates from the Cherenkov effect. We do know that:

\dfrac{dP_{CR}}{d\nu}\propto \nu

for a charged particle in a dense (refractive) medium emitting Cherenkov radiation (CR). Every charge emits a field E\propto \exp (i\mathbf{k}\cdot\mathbf{r}), and the power is proportional to \vert E\vert^2. In a dense medium, the transverse size of the shower (the Molière radius) is:

R_{M}\sim 10cm

We have two different experimental and interesting cases:

A) The optical case, with \lambda <<R_M. Then, we expect random phases and P\propto N.

B) The microwave case, with \lambda>>R_M. In this situation, we expect coherent radiation/waves with P\propto N^2.
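The difference between the two cases is just the difference between adding N phasors with random phases and adding them in phase. The toy Python sketch below illustrates only that scaling; it is not a shower simulation, and the number of emitters, the number of trials and the function name `radiated_power` are arbitrary assumptions. Each emitter contributes a unit-amplitude field \exp(i\phi), and the "power" is the squared modulus of the sum.

```python
import cmath
import math
import random

def radiated_power(phases):
    """|sum of unit phasors|^2, proportional to the power of equal-amplitude emitters."""
    total = sum(cmath.exp(1j * p) for p in phases)
    return abs(total) ** 2

random.seed(0)   # fixed seed so that the run is reproducible
N = 1000         # number of emitters (illustrative)
trials = 400     # number of random realizations to average over

# Case B (lambda >> R_M): all emitters in phase, power grows like N^2.
coherent = radiated_power([0.0] * N)

# Case A (lambda << R_M): random phases, average power grows like N.
incoherent = sum(
    radiated_power([random.uniform(0.0, 2.0 * math.pi) for _ in range(N)])
    for _ in range(trials)
) / trials

print(coherent / N ** 2, incoherent / N)
```

The first ratio is exactly one, while the second only fluctuates around one, which is the sense in which P\propto N holds on average for random phases.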

We can exploit this effect in large natural volumes transparent to radio (dry): pure ice, salt formations, lunar regolith,…The peak of this coherent radiation for sand is produced at a frequency around 5GHz, while the peak for ice is obtained around 2GHz.

The first experimental confirmations of the Askaryan effect came from the following two experiments:

1) 2000 Saltzberg et al., SLAC. They used silica sand as the target. The paper is this one http://arxiv.org/abs/hep-ex/0011001

2) 2002 Gorham et al., SLAC. They used a synthetic salt target. The paper appeared in this place http://arxiv.org/abs/hep-ex/0108027

Indeed, in 1965, Askaryan himself proposed ice and salt as possible target media. The reasons are easy to understand:
1st. They provide high densities, which means a higher probability of neutrino interaction.
2nd. They have a high refractive index. Therefore, the Cherenkov emission becomes important.
3rd. Salt and ice are radio transparent, and of course, they can be supplied in large volumes available throughout the world.

The advantages of radio detection of UHE neutrinos provided by the Askaryan effect are very interesting:

1) Low attenuation: clear signals from large detection volumes.
2) We can observe distant and inclined events.
3) It has a high duty cycle: good statistics in less time.
4) It has a relatively low cost: large areas covered.
5) It is available for neutrinos and/or any other chargeless/neutral particle!

There are, though, some problems with Askaryan effect detection: radio interference, a still unclear correlation with the shower parameters, and the fact that it is limited to particles with very large energies, about E>10^{17}eV.

In summary:

Askaryan effect = coherent Cherenkov radiation from a charge excess induced by (likely) neutral/chargeless particles like (especially highly energetic) neutrinos passing through a dense medium.

Why does the Askaryan effect matter?

It matters since it allows for the detection of UHE neutrinos, and it is “universal” for chargeless/neutral particles like neutrinos, just in the same way that the Cherenkov effect is universal for charged particles. Tracking UHE neutrinos is important because they point back towards their sources, and it is suspected they can help us to solve the riddle of the origin and composition of cosmic rays, the acceleration mechanism of cosmic radiation, the nuclear interactions of astrophysical objects, and to track the highest energy emissions of the Universe we can observe at the current time.

Is it real? Has it been detected? Yes, after 38 years, it has been detected. This effect was first demonstrated in sand (2000), rock salt (2004) and ice (2006), all done in a laboratory at SLAC, and later it has been checked in several independent experiments around the world. Indeed, I remember having heard about this effect during my darker years as an undergraduate student. Fortunately or not, I forgot about it till now. In spite of the beauty of it!

Moreover, it has extra applications to neutrino detection using the Moon as target: GLUE (detectors are Goldstone RTs), NuMoon (Westerbork array; LOFAR), RESUN (EVLA), or the LUNASKA project. Using ice as target, there have been other experiments checking the reality of this effect: FORTE (satellite observing the Greenland ice sheet), RICE (co-deployed on AMANDA strings, viewing Antarctic ice), and the celebrated ANITA (balloon-borne over Antarctica, viewing Antarctic ice) experiment.

Furthermore, some experiments have even used the Moon as a neutrino detector (and it is likely some others will be built in the near future) by means of the Askaryan radiation (the analogue for neutral particles of the Cherenkov effect, don’t forget the spot!).

Askaryan effect and the mysterious cosmic rays.

Askaryan radiation is important because it is one of the portals to the observation of UHE neutrinos coming from cosmic rays. The mysteries of cosmic rays continue today. We have indeed detected extremely energetic cosmic rays beyond the 10^{20}eV scale. Their origin is still unsolved. We hope that by tracking neutrinos we will discover the sources of those rays and their nature/composition. We don’t understand or know any mechanism able to accelerate particles up to those incredible energies. At the current time, IceCube has not detected UHE neutrinos, and this is a serious issue for current theories and models. It is a challenge if we don’t observe as many UHE neutrinos as the Standard Model would predict. Would it mean that cosmic rays are exclusively composed of heavy nuclei or protons? Are we modelling badly the spectrum of the sources and the nuclear models of stars, as happened before neutrino oscillations were detected at Kamiokande and SuperKamiokande (e.g., SN1987A)? Is there some kind of new Physics living at those scales and avoiding the GZK limit we would naively expect from our current theories?


LOG#046. The Cherenkov effect.

The Cherenkov effect/Cherenkov radiation, sometimes also called Vavilov-Cherenkov radiation, is our topic here in this post.

In 1934, P.A. Cherenkov was a postgraduate student of S.I. Vavilov. He was investigating the luminescence of uranyl salts under the incidence of gamma rays from radium, and he discovered a new type of luminescence which could not be explained by the ordinary theory of fluorescence. It is well known that fluorescence arises as the result of transitions between excited states of atoms or molecules. The average duration of fluorescent emissions is about \tau>10^{-9}s, and the transition probability is altered by the addition of “quenching agents”, by some purification process of the material, by some change in the ambient temperature, etc. It turned out that none of these methods was able to quench the new radiation discovered by Cherenkov. A subsequent investigation of the new radiation (named Cherenkov radiation by other scientists after Cherenkov’s discovery) revealed some interesting features:

1st. The polarization of the luminescence changes sharply when we apply a magnetic field. Cherenkov luminescence is therefore caused by charged particles rather than by photons, the \gamma-ray quanta! Cherenkov’s experiment showed that these particles could be electrons produced by the interaction of \gamma-photons with the medium through the photoelectric effect or the Compton effect.

2nd. The intensity of the Cherenkov’s radiation is independent of the charge Z of the medium. Therefore, it can not be of radiative origin.

3rd. The radiation is observed at certain angle (specifically forming a cone) to the direction of motion of charged particles.

The Cherenkov radiation was explained in 1937 by Frank and Tamm on the foundations of classical electrodynamics. For the discovery and explanation of the Cherenkov effect, Cherenkov, Frank and Tamm were awarded the Nobel Prize in 1958. We will discuss the Frank-Tamm formula later, but let me first explain how classical electrodynamics handles the Vavilov-Cherenkov radiation.

The main conclusion that Frank and Tamm obtained comes from the following observation. They observed that the statement of classical electrodynamics concerning the impossibility of energy loss by radiation for a charged particle moving uniformly along a straight line in vacuum is no longer valid if we go over from the vacuum to a medium with a certain refractive index n>1. They went further with the aid of an easy argument based on the laws of conservation of momentum and energy, a principle that rests at the core of Physics, as everybody knows. Imagine a charged particle moving uniformly in a straight line, and suppose it can lose energy and momentum through radiation. In that case, the next equation holds:

\left(\dfrac{dE}{dp}\right)_{particle}=\left(\dfrac{dE}{dp}\right)_{radiation}

This equation can not be satisfied in vacuum, but it MAY be valid for a medium with a refractive index greater than one, n>1. We will simplify our discussion if we consider that the refractive index is constant (but similar conclusions would be obtained if the refractive index is some function of the frequency).

On the other hand, the total energy E of a particle having a non-null mass m\neq 0 and moving freely in vacuum with some momentum p and velocity v will be:

E=\sqrt{p^2c^2+m^2c^4}

and then

\left(\dfrac{dE}{dp}\right)_{particle}=\dfrac{pc^2}{E}=\beta c=v

Moreover, electromagnetic radiation in vacuum obeys the relativistic relationship

E_{rad}=pc

From this equation, we easily get that

\left(\dfrac{dE}{dp}\right)_{radiation}=c

Since the particle velocity is v<c, we obtain that

\left(\dfrac{dE}{dp}\right)_{particle}<\left(\dfrac{dE}{dp}\right)_{radiation}

In conclusion: the laws of conservation of energy and momentum prevent a charged particle moving with a rectilinear and uniform motion in vacuum from giving away its energy and momentum in the form of electromagnetic radiation! The electromagnetic radiation can not accept the entire momentum given away by the charged particle.

Anyway, we realize that this restriction is removed when the particle moves in a medium with a refractive index n>1. In this case, the velocity of light in the medium would be

c'=c/n<c

and the velocity v of the particle may not only become equal to the velocity of light c' in the medium, but even exceed it when the following phenomenological condition is satisfied:

\boxed{v\geq c'=c/n}

It is obvious that, when v=c' the condition

\left(\dfrac{dE}{dp}\right)_{particle}=\left(\dfrac{dE}{dp}\right)_{radiation}

will be satisfied for electromagnetic radiation emitted strictly in the direction of motion of the particle, i.e., in the direction of the angle \theta=0\textdegree. If v>c', this equation is verified for some direction \theta along which v'=c', where

v'=v\cos\theta

is the projection of the particle velocity v on the observation direction. Then, in a medium with n>1, the conservation laws of energy and momentum allow a charged particle in rectilinear and uniform motion with v\geq c'=c/n to lose fractions of energy and momentum dE and dp, provided that the lost energy and momentum are carried away by electromagnetic radiation propagating in the medium at an angle (on a cone) given by:

\boxed{\theta=\arccos\left(\dfrac{1}{n\beta}\right)=\cos^{-1}\left(\dfrac{1}{n\beta}\right)}

with respect to the direction of the particle motion.

These arguments, based on the conservation laws of momenergy, do not provide any idea about the real mechanism by which energy and momentum are lost during the Cherenkov radiation. However, this mechanism must be associated with processes happening in the medium, since the losses can not occur (apparently) in vacuum under normal circumstances (we will also discuss later the vacuum Cherenkov effect, and what it means in terms of Physics and symmetry breaking).

We have learned that Cherenkov radiation is of the same nature as certain other processes we do know and observe, for instance, in various media when bodies move in these media at a velocity exceeding that of the wave propagation. This is a remarkable result! Have you ever seen a V-shaped wave in the wake of a ship? Have you ever seen a conical wave caused by a supersonic boom of a plane or missile? In these examples, the wave field of the superfast object is found to be strongly perturbed in comparison with the field of a “slow” object (in terms of the “velocity of sound” of the medium). It begins to decelerate the object!

Question: What is then the mechanism behind the superfast motion of a charged particle in a medium with a refractive index n>1 producing the Cherenkov effect/radiation?

Answer:  The mechanism under the Cherenkov effect/radiation is the coherent emission by the dipoles formed due to the polarization of the medium atoms by the charged moving particle!

The idea is as follows. Dipoles are formed under the action of the electric field of the particle, which displaces the electrons of the surrounding atoms relative to their nuclei. The return of the dipoles to the normal state (after the particle has left the given region) is accompanied by the emission of an electromagnetic signal or beam. If a particle moves slowly, the resulting polarization will be distributed symmetrically with respect to the particle position, since the electric field of the particle manages to polarize all the atoms in the near neighbourhood, including those lying ahead in its path. In that case, the resultant field of all the dipoles far away from the particle is equal to zero, and their radiations cancel one another.

Then, if the particle moves in a medium with a velocity exceeding the velocity of propagation of the electromagnetic field in that medium, i.e., whenever v>c'=c/n, a delayed polarization of the medium is observed, and consequently the resulting dipoles will be preferably oriented along the direction of motion of the particle. See the next figure:

It is evident that, if this occurs, there must be a direction along which coherent radiation from the dipoles emerges, since the waves emitted by the dipoles at different points along the path of the particle may turn out to be in phase. This direction can be easily found experimentally and it can be easily obtained theoretically too. Let us imagine that a charged particle moves from left to right with some velocity v in a medium with refractive index n>1, with c'=c/n. We can apply the Huygens principle to build the wave front of the emitted radiation. If, at instant t, the particle is at the point x=vt, the wave front is the surface enveloping the spherical waves emitted by the particle along its path from the origin at x=0 to the point x. The radius of the wave emitted at the point x=0 is, at such an instant t, equal to R_0=c't. At the same moment, the wave radius at the point x is equal to R_x=c'(t-(x/v))=0. At any intermediate point x', the wave radius at instant t will be R_{x'}=c'(t-(x'/v)). Then, the radius decreases linearly with increasing x'. Thus, the enveloping surface is a cone with opening angle 2\varphi, where the angle \varphi satisfies in addition

\sin\varphi=\dfrac{R_0}{x}=\dfrac{c't}{vt}=\dfrac{c'}{v}=\dfrac{c}{vn}=\dfrac{1}{\beta n}

The normal to the enveloping surface fixes the direction of propagation of the Cherenkov radiation. The angle \theta between the normal and the x-axis is equal to \pi/2-\varphi, and it is defined by the condition

\boxed{\cos\theta=\dfrac{1}{\beta n}}

or equivalently

\boxed{\tan\theta=\sqrt{\beta^2n^2-1}}

This is the result we anticipated before. Indeed, it is completely general, and Quantum Mechanics introduces only a slight and subtle correction to this classical result. From this last equation, we observe that the Cherenkov radiation propagates along the generators of a cone whose axis coincides with the direction of motion of the particle, and the cone angle is equal to 2\theta. This radiation can be registered on a colour film placed perpendicularly to the direction of motion of the particle. Radiation flowing from a radiator of this type leaves a blue ring on the photographic film. These blue rings are the archetypical fingerprints of Vavilov-Cherenkov radiation!

The sharp directivity of the Cherenkov radiation makes it possible to determine the particle velocity \beta from the value of the Cherenkov’s angle \theta. From the Cherenkov’s formula above, it follows that the range of measurement of \beta is equal to

1/n\leq\beta<1

For \beta=1/n, the radiation is observed at an angle \theta=0\textdegree, while for the extreme with \beta=1, the angle \theta reaches a maximum value

\theta_{max}=\cos^{-1}\left(\dfrac{1}{n}\right)=\arccos \left(\dfrac{1}{n}\right)

For instance, in the case of water, n=1.33 and \beta_{min}=1/1.33=0.75. Therefore, the Cherenkov radiation is observed in water whenever \beta\geq 0.75. For electrons being the charged particles passing through the water, this condition is satisfied if

T_e=m_ec^2\left(\dfrac{1}{\sqrt{1-\beta^2}}-1\right)=0.511\,\mathrm{MeV}\times\left( \dfrac{1}{\sqrt{1-0.75^2}}-1\right)\approx 0.26\,\mathrm{MeV}
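The threshold estimate can be reproduced with a few lines of code. This is just a sketch under stated assumptions: the function names are hypothetical, the refractive index is taken as constant (n=1.33 for water), and the electron rest energy is rounded to 0.511 MeV.

```python
import math

ELECTRON_REST_ENERGY_MEV = 0.511  # m_e c^2, rounded


def threshold_beta(n):
    """Minimum velocity (in units of c) for Cherenkov emission: beta = 1/n."""
    return 1.0 / n


def threshold_kinetic_energy(rest_energy_mev, n):
    """Kinetic energy T = m c^2 (gamma - 1) evaluated at beta = 1/n."""
    beta = threshold_beta(n)
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return rest_energy_mev * (gamma - 1.0)


def cherenkov_angle_deg(beta, n):
    """Cherenkov angle from cos(theta) = 1/(n beta), in degrees."""
    cos_theta = 1.0 / (n * beta)
    if cos_theta > 1.0 + 1e-12:
        raise ValueError("below the Cherenkov threshold: beta * n < 1")
    # Clamp tiny rounding excesses right at threshold before calling acos.
    return math.degrees(math.acos(min(1.0, cos_theta)))


if __name__ == "__main__":
    n_water = 1.33
    print("beta_min =", threshold_beta(n_water))
    print("T_e (MeV) =", threshold_kinetic_energy(ELECTRON_REST_ENERGY_MEV, n_water))
    print("theta at beta=1 (deg) =", cherenkov_angle_deg(1.0, n_water))
```

The same functions can be reused for heavier particles by passing their rest energy: the threshold velocity is the same, but the threshold kinetic energy grows in proportion to the mass.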

As a consequence of this, the Cherenkov effect should be observed in water even for low-energy electrons (for instance, in the case of electrons produced by beta decay, or Compton electrons, or photoelectrons resulting from the interaction between water and gamma rays from radioactive products, the above energy can be easily reached and surpassed!). The maximum angle at which the Cherenkov effect can be observed in water can be calculated from the condition previously seen:

\cos\theta_{max}=1/n=0.75

This angle (for water) turns out to be about \theta\approx 41.5\textdegree=41\textdegree 30'. According to the so-called Frank-Tamm formula (please, see below what that formula is and means), the number of photons in the frequency interval between \nu and \nu+d\nu emitted by a particle with charge Zq moving with a velocity \beta in a medium with refractive index n is provided by the next equation:

\boxed{N(\nu) d\nu=4\pi^2\dfrac{(Zq)^2}{hc^2}\left(1-\dfrac{1}{n^2\beta^2}\right) d\nu}

This formula has some striking features:

1st. The spectrum is identical for all particles with the same Z, i.e., it is exactly the same irrespective of the nature of the particle. For instance, it could be produced by protons, electrons, pions, muons or their antiparticles!

2nd. As Z increases, the number of emitted photons increases as Z^2.

3rd. N(\nu) increases with \beta, the particle velocity, from zero ( with \beta=1/n) to

N=4\pi^2\left(\dfrac{q^2Z^2}{hc^2}\right)\left(1-\dfrac{1}{n^2}\right)

with \beta\approx 1.

4th. N(\nu) is approximately independent of \nu. We observe that dN(\nu)\propto d\nu.

5th. As the spectrum is uniform in frequency, and E=h\nu, this means that the main energy of radiation is concentrated in the extreme short-wave region of the spectrum, i.e.,

\boxed{dE_{Cherenkov}\propto \nu d\nu}

And then, this feature explains the bluish-violet-like colour of the Cherenkov radiation!

Indeed, this feature also indicates the necessity of choosing materials for practical applications that are “transparent” up to the highest frequencies ( even the ultraviolet region). As a rule, it is known that n<1 in the X-ray region and hence the Cherenkov condition can not be satisfied! However, it was also shown by clever experimentalists that in some narrow regions of the X-ray spectrum the refractive index is n>1 ( the refractive index depends on the frequency in any reasonable materials. Practical Cherenkov materials are, thus, dispersive! ) and the Cherenkov radiation is effectively observed in apparently forbidden regions.

The Cherenkov effect is currently widely used in diverse applications. For instance, it is useful to determine the velocity of fast charged particles (e.g., neutrino detectors obviously can not detect neutrinos directly, but they can detect muons and other secondary particles produced in the interaction with some polarizable medium, even when they are produced by (electro)weak interactions like those induced by chargeless neutrinos). The selection of the medium for generating the Cherenkov radiation depends on the range of velocities \beta over which measurements have to be made with the aid of such a “Cherenkov counter”. Cherenkov detectors/counters are filled with liquids and gases and they are found, e.g., in Kamiokande, SuperKamiokande and many other neutrino detectors and “telescopes”. It is worth mentioning that velocities of ultrarelativistic particles are measured with Cherenkov detectors filled with some special gaseous medium with a refractive index just slightly higher than unity. This value of the refractive index can be changed by regulating the gas pressure in the counter! So, Cherenkov detectors and counters are very flexible tools for particle physicists!

Remark: As I mentioned before, it is important to remember that (most of) the practical Cherenkov radiators/materials ARE dispersive. It means that if \omega is the photon frequency, and k=2\pi/\lambda is the wavenumber, then the photons propagate with some group velocity v_g=d\omega/dk, i.e.,

\boxed{v_g=\dfrac{d\omega}{dk}=\dfrac{c}{\left[n(\omega)+\omega \frac{dn}{d\omega}\right]}}

Note that if the medium is non-dispersive, this formula simplifies to the well-known formula v_g=c/n, and to v_g=c for vacuum (n=1), as it should be.
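To see how much dispersion changes the group velocity, the boxed formula can be evaluated numerically. This is a minimal sketch: toy_index is a hypothetical dispersion law invented only for illustration (it is not a fit to any real material), and the derivative dn/d\omega is approximated by a central finite difference.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def group_velocity(n_of_omega, omega, rel_step=1e-6):
    """v_g = c / (n + omega dn/domega), with dn/domega from a central difference."""
    h = omega * rel_step
    dn_domega = (n_of_omega(omega + h) - n_of_omega(omega - h)) / (2.0 * h)
    return C / (n_of_omega(omega) + omega * dn_domega)


def toy_index(omega):
    """Hypothetical dispersion law n(omega) = 1.33 + a omega^2 (illustration only)."""
    return 1.33 + 1.0e-33 * omega ** 2


if __name__ == "__main__":
    omega = 3.77e15  # rad/s, roughly green light
    print("phase velocity c/n :", C / toy_index(omega))
    print("group velocity     :", group_velocity(toy_index, omega))
    # A constant index has dn/domega = 0, so v_g reduces to c/n.
    print("non-dispersive     :", group_velocity(lambda w: 1.5, omega), C / 1.5)
```

For normal dispersion (dn/d\omega>0), the group velocity comes out smaller than the phase velocity c/n, which is the reason why the radiation cone and the Cherenkov angle no longer add up to 90 degrees in dispersive media.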

Accordingly, following the PDG, Tamm showed in a classical paper that for dispersive media the Cherenkov radiation is concentrated in a thin conical shell whose vertex is at the moving charge and whose opening half-angle \eta is given by the expression

\boxed{\cot \eta=\left[\dfrac{d}{d\omega}\left(\omega\tan\theta_c\right)\right]_{\omega_0}=\left(\tan\theta_c+\beta^2\omega n(\omega) \dfrac{dn}{d\omega} \cot \theta_c\right)\bigg|_{\omega_0}}

where \theta_c is the critical Cherenkov angle seen before, and \omega_0 is the central value of the small frequency range under consideration under the Cherenkov condition. This cone has an opening half-angle \eta (please, compare with the previous convention with \varphi for consistency), and unless the medium is non-dispersive (i.e. dn/d\omega=0, n=constant), we get \theta_c+\eta\neq 90\textdegree. Typical Cherenkov radiation imaging produces blue rings.

THE CHERENKOV EFFECT: QUANTUM FORMULAE

When we consider the Cherenkov effect in the framework of QM, in particular the quantum theory of radiation, we can deduce the following formula for the Cherenkov effect, which includes the quantum corrections due to the backreaction of the particle to the radiation:

\boxed{\cos\theta=\dfrac{1}{\beta n}+\dfrac{\Lambda}{2\lambda}\left(1-\dfrac{1}{n^2}\right)}

where, like before, \beta=v/c, n is the refraction index, \Lambda=\dfrac{h}{p}=\dfrac{h}{mv} is the De Broglie wavelength of the moving particle and \lambda is the wavelength of the emitted radiation.

Cherenkov radiation is observed whenever \beta n>1 (i.e. if v>c/n), and the limit of the emission is on the short wave bands (explaining the typical blue radiation of this effect). Moreover, \lambda_{min} corresponds to \cos\theta\approx 1.
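It is instructive to estimate how small the quantum term is for a typical case. The sketch below evaluates the correction (\Lambda/2\lambda)(1-1/n^2) for an electron. It assumes, as labelled in the code, that the momentum entering \Lambda=h/p is the relativistic one, p=\gamma m v; the function names and the sample values (\beta=0.9, n=1.33, \lambda=400 nm) are chosen only for illustration.

```python
import math

HC_EV_NM = 1239.841984                    # h*c in eV*nm
ELECTRON_REST_ENERGY_EV = 0.51099895e6    # m_e c^2 in eV


def de_broglie_wavelength_nm(beta, rest_energy_ev):
    """Lambda = h/p, with the relativistic momentum p = gamma m v (assumption)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    pc_ev = gamma * beta * rest_energy_ev
    return HC_EV_NM / pc_ev


def classical_cos_theta(beta, n):
    """Classical term of the formula: 1/(beta n)."""
    return 1.0 / (beta * n)


def quantum_correction(beta, n, photon_wavelength_nm, rest_energy_ev):
    """Quantum term of the formula: (Lambda / 2 lambda)(1 - 1/n^2)."""
    big_lambda = de_broglie_wavelength_nm(beta, rest_energy_ev)
    return big_lambda / (2.0 * photon_wavelength_nm) * (1.0 - 1.0 / n ** 2)


if __name__ == "__main__":
    beta, n, lam = 0.9, 1.33, 400.0
    print("classical cos(theta):", classical_cos_theta(beta, n))
    print("quantum correction  :",
          quantum_correction(beta, n, lam, ELECTRON_REST_ENERGY_EV))
```

For optical photons the de Broglie wavelength of the electron is many orders of magnitude below \lambda, so the correction is tiny compared with the classical term, in line with the statement that Quantum Mechanics only slightly modifies the classical angle.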

On the other hand, the radiated energy per particle per unit of time is equal to:

\boxed{-\dfrac{dE}{dt}=\dfrac{e^2v}{c^2}\int_0^{\omega_{max}}\omega\left[1-\dfrac{1}{n^2\beta^2}-\dfrac{\Lambda}{n\beta\lambda}\left(1-\dfrac{1}{n^2}\right)-\dfrac{\Lambda^2}{4\lambda^2}\left(1-\dfrac{1}{n^2}\right)\right]d\omega}

where \omega=2\pi c/n\lambda is the angular frequency of the radiation, with a maximum value of \omega_{max}=2\pi c/n\lambda_{min}.
Remark: In the non-relativistic case, v\ll c, and the condition \beta n>1 implies that n\gg 1. Therefore, neglecting the quantum corrections (the charged particle self-interaction/backreaction to radiation), we can take the limit \Lambda/\lambda\rightarrow 0 and the previous equations simplify into:

\boxed{\cos\theta=\dfrac{1}{n\beta}=\dfrac{c}{nv}}

\boxed{-\dfrac{dE}{dt}=\dfrac{e^2 v}{c^2}\int_0^{\omega_{max}}\omega\left(1-\dfrac{c^2}{n^2v^2}\right)d\omega}

Remember: \omega_{max} is determined with the condition \beta n(\omega_{max})=1, where n(\omega_{max}) represents the dispersive effect of the material/medium through the refraction index.

THE FRANK-TAMM FORMULA

The number of photons produced per unit path length and per unit of photon energy by a charged particle (with charge equal to Zq) is given by the celebrated Frank-Tamm formula:

\boxed{\dfrac{d^2N}{dEdx}=\dfrac{\alpha Z^2}{\hbar c}\sin^2\theta_c=\dfrac{\alpha^2 Z^2}{r_em_ec^2}\left(1-\dfrac{1}{\beta^2n^2(E)}\right)}

In terms of common values of fundamental constants, it takes the value:

\boxed{\dfrac{d^2N}{dEdx}\approx 370\,Z^2\sin^2\theta_c(E)\ \mathrm{eV}^{-1}\cdot \mathrm{cm}^{-1}}

or equivalently it can be written as follows

\boxed{\dfrac{d^2N}{dxd\lambda}=\dfrac{2\pi \alpha Z^2}{\lambda^2}\left(1-\dfrac{1}{\beta^2n^2(\lambda)}\right)}
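The numerical prefactor of about 370 eV^{-1} cm^{-1} can be checked directly from \alpha/\hbar c. The following sketch does that and evaluates the photon yield for a constant refractive index; the function names are hypothetical, and the values of \alpha and \hbar c are the usual rounded reference values.

```python
ALPHA = 1.0 / 137.035999          # fine-structure constant
HBAR_C_EV_CM = 1.973269804e-5     # hbar*c in eV*cm (197.327 MeV*fm)


def frank_tamm_prefactor():
    """alpha / (hbar c) in photons per eV per cm."""
    return ALPHA / HBAR_C_EV_CM


def photons_per_ev_per_cm(beta, n, z=1):
    """d^2N/dEdx = (alpha Z^2 / hbar c) sin^2(theta_c), zero below threshold."""
    sin2_theta = 1.0 - 1.0 / (beta * n) ** 2
    if sin2_theta <= 0.0:
        return 0.0
    return frank_tamm_prefactor() * z ** 2 * sin2_theta


if __name__ == "__main__":
    print("alpha/(hbar c) [1/(eV cm)]:", frank_tamm_prefactor())
    # Ultrarelativistic singly charged particle in water (n = 1.33, assumed constant).
    print("yield in water [1/(eV cm)]:", photons_per_ev_per_cm(1.0, 1.33))
```

Multiplying the result by the width of the detected photon energy window (in eV) and by the path length (in cm) gives a rough photon count before any detection efficiency is applied.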

The refraction index is a function of the photon energy E=\hbar \omega, and so is the sensitivity of the transducer used to detect the light from the Cherenkov effect! Therefore, for practical uses, the Frank-Tamm formula must be multiplied by the transducer response function and integrated over the region for which we have \beta n(\omega)>1.

Remark: When two particles are close together (to be close here means to be separated by a distance d<1 wavelength), the electromagnetic fields from the particles may add coherently and affect the Cherenkov radiation. The Cherenkov radiation for an electron-positron pair at close separation is suppressed compared to two independent leptons!

Remark (II): Coherent radio Cherenkov radiation from electromagnetic showers is significant and it has been applied to the study of cosmic ray air showers. In addition to this, it has been used to search for showers induced by electron neutrinos from cosmic rays.

CHERENKOV DETECTOR: MAIN FORMULA AND USES

The applications of Cherenkov detectors for particle identification (generally labelled as PID Cherenkov detectors) go well beyond the range of high-energy Physics. Their uses include: A) Fast particle counters. B) Hadronic particle identification. C) Tracking detectors performing complete event reconstruction. The PDG gives some examples of each category: a) the polarization detector of SLD, b) the hadronic PID detectors at B factories like BABAR or the aerogel threshold Cherenkov in Belle, c) large water Cherenkov counters like those in SuperKamiokande and other neutrino detector facilities.

Cherenkov detectors contain two main elements: 1) a radiator/material through which the particle passes, and 2) a photodetector. As Cherenkov radiation is a weak source of photons, light collection and detection must be as efficient as possible. A refractive material specifically chosen for the particles to be detected is almost always required.

The number of photoelectrons detected in a given Cherenkov radiation detector device is provided by the following formula (derived from the Frank-Tamm formula by simply taking into account the efficiency in a straightforward manner):

\boxed{N=L\dfrac{\alpha^2 Z^2}{r_em_ec^2}\int \epsilon (E)\sin^2\theta_c(E)dE}

where L is the path length of the particle in the radiator/material, \epsilon (E) is the efficiency for collecting the Cherenkov light and transducing it into photoelectrons, and

\boxed{\dfrac{\alpha^2}{r_em_ec^2}=370\ \mathrm{eV}^{-1}\mathrm{cm}^{-1}}

Remark: The efficiencies and the Cherenkov critical angle are functions of the photon energy, generally speaking. However, since the typical energy-dependent variation of the refraction index is modest, a quantity sometimes called the Cherenkov detector quality factor N_0 can be defined as follows

\boxed{N_0=\dfrac{\alpha^2Z^2}{r_em_ec^2}\int \epsilon dE}

In this case, we can write

\boxed{N\approx LN_0\langle\sin^2\theta_c\rangle}
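As an illustration of the photoelectron formula, the integral over the photon energy can be done numerically with the trapezoidal rule. This is only a sketch: the function name photoelectron_yield, the flat 20% efficiency, the constant index n=1.33 and the 2-3 eV window are all hypothetical choices, not the parameters of any real detector.

```python
def photoelectron_yield(length_cm, beta, n_of_e, efficiency, e_min, e_max,
                        z=1, steps=1000):
    """N = L * 370 Z^2 * integral of eps(E) sin^2(theta_c(E)) dE (energies in eV)."""
    prefactor = 370.0 * z ** 2  # alpha^2 / (r_e m_e c^2) in 1/(eV cm)
    de = (e_max - e_min) / steps
    total = 0.0
    for i in range(steps + 1):
        e = e_min + i * de
        # sin^2(theta_c) vanishes below the Cherenkov threshold.
        sin2_theta = max(0.0, 1.0 - 1.0 / (beta * n_of_e(e)) ** 2)
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoidal end points
        total += weight * efficiency(e) * sin2_theta * de
    return length_cm * prefactor * total


if __name__ == "__main__":
    n_photoelectrons = photoelectron_yield(
        length_cm=1.0,
        beta=1.0,
        n_of_e=lambda e: 1.33,      # constant index (assumption)
        efficiency=lambda e: 0.2,   # flat 20% efficiency (assumption)
        e_min=2.0,
        e_max=3.0,
    )
    print("photoelectrons per cm:", n_photoelectrons)
```

With a constant index and a flat efficiency, the integral factorizes exactly as N=LN_0\sin^2\theta_c, with N_0=370\times 0.2\times 1 eV in this toy case, so the numerical result can be compared with the approximate formula above.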

Remark(II): Cherenkov detectors are classified into imaging or threshold types, depending on their ability to make use of Cherenkov angle information. Imaging counters may be used to track particles as well as to identify them.

Other main uses/applications of the Vavilov-Cherenkov effect are:

1st. Detection of labeled biomolecules. Cherenkov radiation is widely used to facilitate the detection of small amounts and low concentrations of biomolecules. For instance, radioactive atoms such as phosphorus-32 are readily introduced into biomolecules by enzymatic and synthetic means and subsequently may be easily detected in small quantities for the purpose of elucidating biological pathways and in characterizing the interaction of biological molecules such as affinity constants and dissociation rates.

2nd. Nuclear reactors. Cherenkov radiation is used to detect high-energy charged particles. In pool-type nuclear reactors, the intensity of Cherenkov radiation is related to the frequency of the fission events that produce high-energy electrons, and hence is a measure of the intensity of the reaction. Similarly, Cherenkov radiation is used to characterize the remaining radioactivity of spent fuel rods.

3rd. Astrophysical experiments. The Cherenkov radiation from the charged particles produced by cosmic rays is used to determine the source and intensity of the cosmic rays, as is done for example in the different classes of cosmic ray detection experiments: for instance, IceCube, Pierre Auger, VERITAS, HESS, MAGIC, SNO, and many others. Cherenkov radiation can also be used to determine properties of high-energy astronomical objects that emit gamma rays, such as supernova remnants and blazars. In this last class of experiments we place STACEE, in New Mexico.

4th. High-energy experiments. We have already mentioned this, and there are many examples at the current LHC, for instance, in the ALICE experiment.

VACUUM CHERENKOV RADIATION

Vacuum Cherenkov radiation (VCR) is the alleged and conjectured phenomenon which refers to the Cherenkov radiation/effect of a charged particle propagating in the physical vacuum. You can ask: why should it be possible? It is quite straightforward to understand the answer.

The classical (non-quantum) theory of relativity (both special and general)  clearly forbids any superluminal phenomena/propagating degrees of freedom for material particles, including this one (the vacuum case) because a particle with non-zero rest mass can reach speed of light only at infinite energy (besides, the nontrivial vacuum itself would create a preferred frame of reference, in violation of one of the relativistic postulates).

However, according to modern views coming from the quantum theory, specially our knowledge of Quantum Field Theory, physical vacuum IS a nontrivial medium which affects the particles propagating through, and the magnitude of the effect increases with the energies of the particles!

Then, a natural consequence follows: the actual speed of a photon becomes energy-dependent and thus can be less than the fundamental constant c=299792458m/s, the speed of light, such that sufficiently fast particles can overcome it and start emitting Cherenkov radiation. In summary, any charged particle surpassing the speed of light in the physical vacuum should emit (vacuum) Cherenkov radiation. Note that it is an inevitable consequence of the non-trivial nature of the physical vacuum in Quantum Field Theory. Indeed, some crazy people saying that superluminal particles arise in jets from supernovae, or in colliders like the LHC, fail to explain why those particles don’t emit Cherenkov radiation. It is not true that real particles become superluminal in space or collider rings. It is also wrong in the case of neutrino propagation because, in spite of being chargeless, neutrinos should experience an effect analogous to the Cherenkov radiation, called the Askaryan effect. Another (alternative) possibility or scenario arises in some Lorentz-violating theories (or even CPT-violating theories, which may or may not be equivalent to such Lorentz violations) when the speed of a propagating particle becomes higher than c, which turns this particle into a tachyon. A tachyon with an electric charge would lose energy as Cherenkov radiation, just as ordinary charged particles do when they exceed the local speed of light in a medium. A charged tachyon traveling in a vacuum therefore undergoes a constant proper-time acceleration and, by necessity, its worldline would form a hyperbola in space-time. This last type of vacuum Cherenkov effect can arise in theories like the Standard Model Extension, where Lorentz-violating terms do appear.

One of the simplest kinematic frameworks for Lorentz-violating theories is to postulate some modified dispersion relations (MODRE) for particles, while keeping the usual energy-momentum conservation laws. In this way, we can provide and work out an effective field theory for breaking the Lorentz invariance. There are several alternative definitions of MODRE, since there is no general guide yet to discriminate among the different theoretical models. Thus, we could consider a general expansion in integer powers of the momentum, in the next manner (we set units in which c=1):

\boxed{E^2=f(p,m,c_n)=p^2+m^2+\sum_{n=-\infty}^{\infty}c_n p^n}

However, a softer expansion, depending only on positive powers of the momentum, is generally used in the MODRE. In such a case,

\boxed{E^2=f(p,m,a_n)=p^2+m^2+\sum_{n=1}^{\infty}a_n p^n}

and where p=\vert \mathbf{p}\vert. If Lorentz violations are associated to the yet undiscovered quantum theory of gravity, we would get that ordinary deviations of the dispersion relations in the special theory of relativity should appear at the natural scale of the quantum gravity, say the Planck mass/energy. In units where c=1 we obtain that Planck mass/energy is:

\boxed{M_P=\sqrt{\hbar c^5/G_N}=1.22\cdot 10^{19}GeV=1.22\cdot 10^{16}TeV}

Let us parametrize the Lorentz violations induced by the fundamental scale of quantum gravity (naively, this Planck mass scale) by:

\boxed{a_n=\dfrac{\Xi_n}{M_P^{n-2}}}

Here, \Xi_n is a dimensionless quantity that can differ from one particle (type) to another. Consider, for instance, n=3,4, since n<3 seems to be ruled out by previous terrestrial experiments. At higher energies, the lowest non-null term with n\geq 3 will dominate the expansion. The MODRE reads:

E^2=p^2+m^2+\dfrac{\Xi_a p^n}{M_P^{n-2}}

and where the label a in the term \Xi_a is specific to the particle type. Such corrections might only become important at the Planck scale, but there are two exceptions:

1st. Particles that propagate over cosmological distances can show differences in their propagation speed.
2nd. Energy thresholds for particle reactions can be shifted, or even forbidden processes can be allowed, if the p^n-term is comparable to the m^2-term in the MODRE. Threshold reactions can then be significantly altered or shifted, because they are determined by the particle masses. So a threshold shift should appear at scales where:

\boxed{p_{dev}\approx\left(\dfrac{m^2M_P^{n-2}}{\Xi}\right)^{1/n}}

Imposing/postulating that \Xi\approx 1, the typical scales for the thresholds for different kinds of particles can be calculated. Their values for some species are given in the next table:
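As a rough numerical cross-check of the formula for p_{dev}, here is a minimal Python sketch. It assumes \Xi=1, the Planck mass quoted above, and standard rest masses for the electron and the proton (these masses are not quoted in the text; the function name `p_dev` is just an illustrative choice):

```python
# Threshold momentum p_dev = (m^2 * M_P^(n-2) / Xi)^(1/n), in units with c = 1.
M_PLANCK_GEV = 1.22e19  # Planck mass quoted in the text, in GeV

# Rest masses in GeV (standard values, assumed here, not quoted in the text).
MASSES_GEV = {"electron": 0.511e-3, "proton": 0.938}


def p_dev(mass_gev, n=3, xi=1.0, m_planck_gev=M_PLANCK_GEV):
    """Momentum (GeV) at which the Xi*p^n/M_P^(n-2) term becomes comparable to m^2."""
    return (mass_gev**2 * m_planck_gev ** (n - 2) / xi) ** (1.0 / n)


for name, mass in MASSES_GEV.items():
    for n in (3, 4):
        print(f"{name:8s} n={n}: p_dev ~ {p_dev(mass, n):.2e} GeV")
```

Under these assumptions, the n=3 deviation scale comes out at roughly 15 TeV for electrons and roughly 2 PeV for protons, far below the Planck scale itself.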

We can even study some different sources of modified dispersion relationships:

1. Measurements of time of flight.

2. Threshold creation for: A) Vacuum Cherenkov effect, B) Photon decay in vacuum.

3. Shift in the so-called GZK cut-off.

4. Modified dispersion relationships induced by non-commutative theories of spacetime. In particular, there are time shifts/delays of photon signals induced by non-commutative spacetime theories.

We will analyse these four cases separately, in a very short and clear fashion. I wish!

Case 1. Time of flight. This is similar to the recent and controversial OPERA experiment results. The OPERA experiment, and other similar set-ups, measure the neutrino time of flight. I dedicated a post to it earlier in this blog:

https://thespectrumofriemannium.wordpress.com/2012/06/08/

In fact, we can measure the time of flight of any particle, even photons. A modified dispersion relation, like the one we introduced above, would lead to an energy-dependent speed of light. The idea of the time of flight (TOF) approach is to detect a shift in the arrival time of photons (or any other massless/ultra-relativistic particle like neutrinos) with different energies, produced simultaneously in a distant object, where the large distance amplifies the otherwise Planck-suppressed effect. In the following we use the dispersion relation for n=3 only, as modifications at higher orders are far below the sensitivity of current or planned experiments. The modified group velocity becomes:

v=\dfrac{\partial E}{\partial p}

and then, for photons (m=0, n=3), to first order in p/M and with the sign convention of the MODRE above,

v\approx 1+\Xi_\gamma\dfrac{p}{M}

The shift in the photon detection time will be:

\Delta t=\Xi_\gamma \dfrac{p}{M}D

where D is the distance, multiplied (when needed) by the redshift factor (1+z) to correct the energy for the redshift. In recent years, several measurements of different objects in various energy bands have led to constraints on \Xi of the order of 100. They can be summarized in the next table (note that the best constraint comes from a short flare of the Active Galactic Nucleus (AGN) Mrk 421, detected in the TeV band by the Whipple Imaging Air Cherenkov telescope):

There is still room for improvements with current or planned experiments, although the distance for TeV-observations is limited by absorption of TeV photons in low energy metagalactic radiation fields. Depending on the energy density of the target photon field one gets an energy dependent mean free path length, leading to an energy and redshift dependent cut off energy (the cut off energy is defined as the energy where the optical depth is one).
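To get a feeling for the size of the time-of-flight shift \Delta t=\Xi_\gamma (p/M) D, here is a minimal Python sketch. The photon energy and the source distance used in the example are purely illustrative assumptions (they are not taken from the measurements mentioned above), the redshift factor is ignored, and the function name `tof_delay_s` is hypothetical:

```python
M_PLANCK_TEV = 1.22e16    # Planck mass from the text, in TeV
MPC_IN_M = 3.0857e22      # metres per megaparsec
C_M_PER_S = 299792458.0   # speed of light, as quoted earlier in the text


def tof_delay_s(energy_tev, distance_mpc, xi=1.0):
    """Delay Delta t = Xi * (p / M_P) * D / c for a photon with p ~ E (n = 3 case)."""
    light_travel_time_s = distance_mpc * MPC_IN_M / C_M_PER_S
    return xi * (energy_tev / M_PLANCK_TEV) * light_travel_time_s


# Illustrative (assumed) numbers: a 1 TeV photon from a source 130 Mpc away.
print(f"delay for Xi = 1: {tof_delay_s(1.0, 130.0):.2f} s")
```

With these assumed inputs the delay is of the order of one second per TeV for \Xi=1, which is why fast flares of distant TeV sources are useful for this kind of test.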

2. Threshold creation for: A) Vacuum Cherenkov effect, B) Photon decay in vacuum. On the other hand, the interaction vertex in quantum electrodynamics (QED) couples one photon with two leptons. We assume for photons and leptons the following dispersion relations (for simplicity we adopt units with M=1):

\omega_k^2=k^2+\xi k^n, \qquad E^2_p=p^2+m^2+\Xi p^n

Let us write the photon four-momentum as \mathbb{K}=(\omega_k,\mathbf{k}) and the lepton four-momenta as \mathbb{P}=(E_p,\mathbf{p}) and \mathbb{Q}=(E_q,\mathbf{q}). It can be shown that energy-momentum conservation at the vertex implies

\xi k^n+\Xi p^n-\Xi q^n=2(E_p\omega_k-\mathbf{p}\cdot\mathbf{k})

where the r.h.s. is always positive. In the Lorentz invariant case the parameters \xi, \Xi are zero, so that this equation can’t be solved and all processes of the single vertex are forbidden. If these parameters are non-zero, there can exist a solution and so these processes can be allowed. We now consider two of these interactions to derive constraints on the parameters \Xi, \xi: the vacuum Cherenkov effect e^-\rightarrow \gamma e^- and the spontaneous photon decay \gamma\rightarrow e^+e^-.
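For readers who want the intermediate step, here is a sketch of how the relation between \xi, \Xi and the momenta given above can be obtained. It assumes the lepton with momentum \mathbf{p} emits the photon, so that the outgoing lepton carries \mathbb{Q}=\mathbb{P}-\mathbb{K}, where \mathbb{K}=(\omega_k,\mathbf{k}) denotes the photon four-momentum and the metric signature is (+,-,-,-):

```latex
\mathbb{Q}=\mathbb{P}-\mathbb{K}
\;\Longrightarrow\;
\mathbb{Q}^2=\mathbb{P}^2+\mathbb{K}^2-2\,\mathbb{P}\cdot\mathbb{K}

\mathbb{P}^2=E_p^2-p^2=m^2+\Xi p^n,\qquad
\mathbb{Q}^2=E_q^2-q^2=m^2+\Xi q^n,\qquad
\mathbb{K}^2=\omega_k^2-k^2=\xi k^n

\mathbb{P}\cdot\mathbb{K}=E_p\omega_k-\mathbf{p}\cdot\mathbf{k}

m^2+\Xi q^n=m^2+\Xi p^n+\xi k^n-2\left(E_p\omega_k-\mathbf{p}\cdot\mathbf{k}\right)
\;\Longrightarrow\;
\xi k^n+\Xi p^n-\Xi q^n=2\left(E_p\omega_k-\mathbf{p}\cdot\mathbf{k}\right)
```

The squared four-momenta are simply the MODRE of each particle with the p^2 term moved to the left, so the mass terms cancel and only the Lorentz-violating pieces survive on the l.h.s.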

A) As we have studied here, the vacuum Cherenkov effect is the spontaneous emission of a photon by a charged particle, with 0<E_\gamma<E_{par}. This effect occurs if the particle moves faster than the slowest possible radiated photon in vacuum!
In the case of \Xi>0, the maximal attainable speed for the particle, c_{max}, is faster than c. This means that the particle can always be faster than a zero energy photon with

\displaystyle{c_{\gamma_0}=c\lim_{k\rightarrow 0}\dfrac{\partial \omega}{\partial k}=c\lim_{k\rightarrow 0}\dfrac{2k+n\xi k^{n-1}}{2\sqrt{k^2+\xi k^n}}=c}

and it is independent of \xi. In the case of \Xi<0, i.e., when c_{par} decreases with energy, you need a photon with c_\gamma<c_{par}<c. This is only possible if \xi<\Xi.

Therefore, due to the radiation of photons, such an electron loses energy. The observation of highly energetic electrons allows us to derive constraints on \Xi and \xi. In the case of \Xi>0 with n=3, we have the bound

\Xi<\dfrac{m^2}{2p^3_{max}}

Moreover, from the observation of 50 TeV photons in the Crab Nebula (and its pulsar) one can infer the existence of 50 TeV electrons, since those photons are produced by inverse Compton scattering of such electrons. This leads to a constraint on \Xi of about

\Xi<1.2\times 10^{-2}

where we have used \Xi>0 in this case.
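The quoted number can be roughly reproduced from the bound \Xi<m^2/(2p^3_{max}). The following Python sketch restores the Planck mass that was set to M=1 in the formula and assumes the standard electron mass (not quoted in the text); the function name is an illustrative choice:

```python
M_PLANCK_TEV = 1.22e16         # Planck mass from the text, in TeV
ELECTRON_MASS_TEV = 0.511e-6   # 0.511 MeV expressed in TeV (assumed standard value)


def cherenkov_bound(p_max_tev, mass_tev=ELECTRON_MASS_TEV, m_planck_tev=M_PLANCK_TEV):
    """Upper bound on Xi for n = 3: Xi < m^2 * M_P / (2 * p_max^3)."""
    return mass_tev**2 * m_planck_tev / (2.0 * p_max_tev**3)


# 50 TeV electrons inferred from the Crab Nebula observations mentioned above.
print(f"Xi < {cherenkov_bound(50.0):.2e}")
```

With these inputs the bound comes out at about 10^{-2}, in line with the value quoted above.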

B) The decay of photons into positrons and electrons, \gamma\rightarrow e^+e^-, should be a very rapid spontaneous decay process. Gamma rays from the Crab Nebula are observed on Earth with energies up to E\sim 50TeV. Thus, we can reason that this rapid decay doesn’t occur at energies below 50 TeV. For the constraints on \Xi and \xi, this condition means (again we impose n=3):

\xi<\dfrac{\Xi}{2}+0.08, \mbox{for}\; \xi\geq 0

\xi<\Xi+\sqrt{-0.16\Xi}, \mbox{for}\;\Xi<\xi<0.

3. Shift in the GZK cut-off. As the energy of a proton increases, the pion production reaction can happen with low energy photons of the Cosmic Microwave Background (CMB).

This leads to an energy dependent mean free path length of the particles, resulting in a cutoff at energies around E_{GZK}\approx 10^{20}eV. This is the celebrated Greisen-Zatsepin-Kuzmin (GZK) cutoff. The resonance for the GZK pion photoproduction with the CMB background can be read from the next condition (I will derive this condition in a future post):

\boxed{E_{GZK}\approx\dfrac{m_p m_\pi}{2E_\gamma}=3\times 10^{20}eV\left(\dfrac{2.7K}{E_\gamma}\right)}
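As a quick numerical sanity check of the boxed estimate, the following Python sketch evaluates m_p m_\pi/(2E_\gamma) with E_\gamma taken as k_B T for the CMB temperature. The proton mass, the charged pion mass and the Boltzmann constant are standard values assumed here, not quoted in the text:

```python
K_BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K
PROTON_MASS_EV = 938.27e6        # proton rest energy in eV
PION_MASS_EV = 139.57e6          # charged pion rest energy in eV (assumed choice)


def gzk_energy_ev(temperature_k=2.7):
    """Estimate E_GZK ~ m_p * m_pi / (2 * E_gamma) with E_gamma = k_B * T."""
    e_gamma_ev = K_BOLTZMANN_EV_PER_K * temperature_k
    return PROTON_MASS_EV * PION_MASS_EV / (2.0 * e_gamma_ev)


print(f"E_GZK ~ {gzk_energy_ev():.1e} eV")
```

For T=2.7 K this gives a few times 10^{20} eV, consistent with the prefactor in the boxed formula; a hotter photon bath lowers the threshold in inverse proportion.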

Thus, in a Lorentz invariant world, the mean free path of a particle of energy 5\cdot 10^{19} eV is 50 Mpc, i.e., particles above this energy are readily absorbed due to the pion photoproduction reaction. But most of the sources of ultra high energy particles are farther than 50 Mpc. So, one expects no trace of particles of energy above 10^{20}eV on Earth. From the experimental point of view, AGASA has found a few particles with energy higher than the GZK cutoff limit, and claimed to disprove the presence of the GZK cutoff or at least to point to a different threshold for it, whereas HiRes is consistent with the GZK effect. So, there are two main questions, not yet completely solved:

i) How can one get definite proof of the non-existence of the GZK cutoff?
ii) If the GZK cutoff doesn’t exist, what is the reason?

The first question could be answered by the observation of a large sample of events at these energies, which is necessary for a final conclusion, since the GZK cutoff is a statistical phenomenon. The current AUGER experiment, still under construction, may clarify whether the GZK cutoff exists or not. The existence of the GZK cutoff would also yield new limits on Lorentz or CPT violation. For the second question, one explanation can be derived from Lorentz violation. If we do the calculation of the GZK cutoff in a Lorentz-violating world, we have to use the modified proton dispersion relation described in our previous equations with MODRE.

4. Modified dispersion relationships induced by non-commutative theories of spacetime. As we said above, there are time shifts/delays of photon signals induced by non-commutative spacetime theories. Noncommutative spacetime theories introduce a new source of MODRE: the fuzzy nature of the discreteness of the fundamental quantum spacetime. Then, the general ansatz of this type of theory comes from:

\boxed{\left[\hat{x}^\mu,\hat{x}^\nu\right]=i\dfrac{\theta^{\mu\nu}}{\Lambda_{NC}^2}}

where \theta^{\mu\nu} are the components of an antisymmetric Lorentz-like tensor whose components are of order one. The fundamental scale of non-commutativity \Lambda_{NC} is supposed to be of the order of the Planck scale. However, there are models with large extra dimensions that induce non-commutative spacetime models with a scale near the TeV scale! This is interesting from the phenomenological side as well, not only from the theoretical viewpoint. Indeed, we can investigate in the following whether astrophysical observations are able to constrain a certain class of models with noncommutative spacetimes which are broken at the TeV scale or higher. However, due to the antisymmetric character of the noncommutative tensor, we need a magnetic and electric background field in order to study this kind of model (generally speaking, we need some kind of field inducing/producing antisymmetric field backgrounds), and then the dispersion relation for photons remains the same as in a commutative spacetime. Furthermore, there is no photon energy dependence of the dispersion relation. Consequently, the time-of-flight experiments are inappropriate, because they rely on an energy-dependent dispersion. Therefore, we suggest the next alternative scenario: suppose there exists a strong magnetic field (for instance, from a star or a cluster of stars) on the path of photons emitted by a light source (e.g. gamma-ray bursts). Then, analogously to gravitational lensing, the photons experience a deflection and/or a change in time-of-arrival, compared to the same path without a magnetic background field. Some estimations for several known objects/examples are shown in this final table:

In summary:

1st. Vacuum Cherenkov and related effects modifying the dispersion relations of special relativity are natural in many scenarios beyond the Standard Relativity (BSR) and beyond the Standard Model (BSM).

2nd. Any theory allowing for superluminal propagation has to explain the null-results from the observation of the vacuum Cherenkov effect. Otherwise, they are doomed.

3rd. There are strong bounds coming from astrophysical processes and even neutrino oscillation experiments that severely constrain and kill many models. However, it is true that current MODRE bounds are far from being the most general bounds. We expect to improve these bounds with the next generation of experiments.

4th. Theories that can not pass these tests (SR obviously does) have to be banned.

5th. Superluminality has observable consequences, both in classical and quantum physics, both in standard theories and in theories beyond them. So, if you build a theory allowing superluminal stuff, you must be very careful with what kind of predictions it can and cannot make. Otherwise, your theory is complete nonsense.

As a final closing, let me include some nice Cherenkov rings from the Super-Kamiokande and MiniBooNE experiments. True experimental physics in action. And a final challenge…

FINAL CHALLENGE: Are you able to identify the kind of particles producing those beautiful figures? Let me know your guesses ( I do know the answer, of course).

Figure 1. Typical Super-Kamiokande ring. I dedicate this picture to my admired Japanese scientists there. I really, really admire that country and its people, especially after disasters like the 2011 earthquake and the Fukushima accident. If you are a Japanese reader/follower, you must know we support you from abroad. You were not, you are not and you shall not be alone.

Figure 2. Typical MiniBooNE ring. History: I used this nice picture on the first page of my Master Thesis, as the main picture of the cover/title page!


LOG#045. Fake superluminality.

Before becoming apparent superluminal readers, we are going to remember and review some elementary notation and concepts from the relativistic Doppler effect and the starlight aberration we have already studied in this blog.

Let us consider and imagine the next gedankenexperiment/thought experiment. Some moving object emits pulses of light during some time interval, denoted by \Delta \tau_e in its own frame. Its distance from us is very large, say

D\gg c\Delta \tau_e

Question: Does it (light) arrive at time t=D/c? Suppose the object moves forming a certain angle \theta with the line of sight, according to the following picture

Time dilation means that a second pulse would be emitted after a time \Delta t_e=\gamma \Delta \tau_e (measured in our frame) following the previous pulse. By that time the object would have travelled a further distance \Delta x=v\Delta t_e\cos\theta away from the receiver, so the second pulse would need an additional time \Delta x/c to arrive at its destination. The reception time between pulses would be:

\Delta t_r=\Delta t_e+\beta \Delta t_e\cos\theta=\gamma (1+\beta \cos\theta)\Delta \tau_e

i.e.

\boxed{\Delta t_r=(1+\beta\cos\theta)\gamma \Delta \tau_e}

In the range 0\leq\theta\leq\pi/2, the time interval between both pulses measured in the rest frame on Earth will be longer than in the rest frame of the moving object. This analysis remains valid even if the 2 events are not light beams/pulses but successive packets or “maxima” of electromagnetic waves (electromagnetic radiation).

Astronomers define the dimensionless redshift

\boxed{(1+z)\equiv \dfrac{\Delta t_r}{\Delta \tau_e}=\gamma (1+\beta \cos\theta)}

where, as it is common in special relativity, \beta=v/c, \gamma^2=\dfrac{1}{1-\beta^2}

The 3 interesting limits of the above expression are:

1st. Receding emitter case. The moving object moves away from the receiver. Then, we have \theta=0 supposing a completely radial motion in the line of sight, and then a literal “redshift” ( lower frequencies than the proper frequencies)

(1+z)=\sqrt{\dfrac{1+\beta}{1-\beta}}

2nd. Approaching emitter case. The moving object approaches the observer. Then, we get \theta=\pi, or inward motion along the radial direction, and then a “blueshift” (higher frequencies than the proper frequencies)

(1+z)=\sqrt{\dfrac{1-\beta}{1+\beta}}

3rd. Tangential or transversal motion of the source. This is also called second-order redshift. It has been observed in extremely precise velocity measurements of pulsars in our Galaxy.

(1+z)=\gamma

Furthermore, these redshifts have all been observed in different astrophysical observations and, in addition, they have to be taken into account for tracking the position via GPS, geolocating satellites and/or following their relative positions with respect to time or calculating their revolution periods around our planet.
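The three limits above can be checked numerically with a few lines of Python. This is only a sketch of the boxed formula (1+z)=\gamma(1+\beta\cos\theta); the function name `one_plus_z` and the sample value \beta=0.6 are illustrative choices:

```python
import math


def one_plus_z(beta, theta):
    """Relativistic Doppler factor (1+z) = gamma * (1 + beta*cos(theta))."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (1.0 + beta * math.cos(theta))


beta = 0.6  # illustrative speed in units of c
print("receding   :", one_plus_z(beta, 0.0))          # sqrt((1+beta)/(1-beta))
print("approaching:", one_plus_z(beta, math.pi))      # sqrt((1-beta)/(1+beta))
print("transverse :", one_plus_z(beta, math.pi / 2))  # gamma
```

For \beta=0.6 the receding and approaching factors are reciprocal to each other, and the transverse one reduces to \gamma, as the three formulas state.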

Remark: Quantum Mechanics and Special Relativity would be mutually inconsistent IF we did not find the same formula for the ratios between energies and between frequencies in different reference frames.

EXAMPLE: The emission line of the oxygen (II) [O(II)] is, in its rest frame, \lambda_0=3727\AA. It is observed in a distant galaxy to be at \lambda=9500\AA. What is the redshift z and the recession velocity of this galaxy?

Solution. From the definition of wavelength in electromagnetism, cT=\lambda and c\tau=\lambda_0. Then,

(1+z)=\dfrac{T}{\tau}=\dfrac{\lambda}{\lambda_0}=\dfrac{9500}{3727}=2.55, and thus z=1.55

From the radial velocity hypothesis, we get

(1+z)=\sqrt{\dfrac{1+\beta}{1-\beta}} or

\beta=\dfrac{(1+z)^2-1}{(1+z)^2+1}=0.73

and thus \beta=0.73 or v=0.73c.
Note that this result follows from the hypothesis of the expansion of the Universe, and it holds in the relativistic theory of gravity, General Relativity, and it should also hold in extensions of it, even in Quantum Gravity somehow!
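The arithmetic of this example is easy to reproduce. Here is a minimal Python sketch, assuming purely radial recession as in the solution above (the function names are illustrative):

```python
def redshift(lambda_obs, lambda_rest):
    """Redshift z from observed and rest-frame wavelengths (same units)."""
    return lambda_obs / lambda_rest - 1.0


def beta_from_redshift(z):
    """Recession speed in units of c, inverting (1+z) = sqrt((1+beta)/(1-beta))."""
    s = (1.0 + z) ** 2
    return (s - 1.0) / (s + 1.0)


z = redshift(9500.0, 3727.0)  # [O II] line, wavelengths in angstroms
print(f"z = {z:.2f}, beta = {beta_from_redshift(z):.2f}")
```

The same two functions can be reused for any other emission line, as long as the motion is assumed to be along the line of sight.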

Remember: Stellar aberration causes the positions of celestial objects on the sky to change as the Earth moves around the Sun. As the Earth’s velocity is about v_E\approx 30km/s, and then \beta_E\approx 10^{-4}, it implies an angular separation of about \Delta \theta\approx 10^{-4}rad. Anyway, it is worth mentioning that the astronomer Bradley observed this starlight aberration in 1729! A moving observer sees the light from stars at different positions with respect to an observer at rest, and the new position does not depend on the distance to the star. Thus, as the relative velocity increases, stars are “displaced” further and further towards the direction of motion of the observer.
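To translate the estimate \Delta\theta\approx\beta_E into the units astronomers usually quote, here is a small Python sketch. It uses the rounded Earth speed from the text and the small-angle approximation; the function name is an illustrative choice:

```python
import math

C_KM_PER_S = 299792.458  # speed of light in km/s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def aberration_arcsec(speed_km_per_s):
    """Small-angle aberration Delta theta ~ beta (in radians), returned in arcseconds."""
    beta = speed_km_per_s / C_KM_PER_S
    return beta * ARCSEC_PER_RAD


print(f"Earth orbital aberration ~ {aberration_arcsec(30.0):.1f} arcsec")
```

With v_E\approx 30 km/s this gives about 20 arcseconds, a shift large enough to be measured with eighteenth century instruments.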

Now, we are going to the main subject of the post. I decided to review these two important effects because it is useful to remember them and to understand that they are measured and they are real effects. They are not mere artifacts of the special theory of relativity masking some unknown reality. They are the reality in the sense that they are measured. Alternative theories trying to explain these effects exist, but they are more complicated, and they remind me of those people trying to defend the geocentric model of the Universe with those weird constructions known as epicycles, in order to defend what cannot be defended from the experimental viewpoint.

In order to make our discussion visual and phenomenological, I am going to consider a practical example. A certain radio galaxy, denoted by 3C 273, is observed to move across the sky with an angular velocity

\omega=0.8\;\mbox{milliarcsec/yr}\approx 4\cdot 10^{-9}\dfrac{rad}{yr}

Note that 1\;\mbox{milliarcsec}=\left(\dfrac{10^{-3}}{3600}\right)^{\textdegree}

Knowing the expansion rate of the universe and the redshift of the radiogalaxy, its distance is calculated to be about 2.6\cdot 10^9 lyr. To obtain the relative tangential velocity, we simply multiply the angular velocity by the distance, i.e. v_{r\perp}=\omega D.
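The unit conversion and the product v_{r\perp}=\omega D can be sketched in Python as follows. Measuring distances in light years and times in years gives the speed directly in units of c; the function name is an illustrative choice:

```python
import math

# 1 milliarcsecond = (1e-3 / 3600) degrees, converted to radians.
MAS_IN_RAD = math.radians(1e-3 / 3600.0)


def apparent_transverse_speed_c(omega_mas_per_yr, distance_lyr):
    """v = omega * D, with omega in mas/yr and D in light years; result in units of c."""
    return omega_mas_per_yr * MAS_IN_RAD * distance_lyr


print(f"1 mas = {MAS_IN_RAD:.3e} rad")
print(f"v = {apparent_transverse_speed_c(0.8, 2.6e9):.1f} c")
```

With the data quoted above for 3C 273 the result is close to ten times the speed of light.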

From the above data, we get that the apparent tangential velocity of our radiogalaxy would be about v_{r\perp}\approx 10c. Indeed, this observation is not isolated. There are even jets of matter flowing from some stars at apparent superluminal velocities. Of course, this looks like a problem for SR. How can we explain it? How is it possible in the SR framework to obtain a superluminal velocity? We will show that there is no contradiction with SR. The (fake and apparent) superluminal effect CAN BE EXPLAINED naturally in the SR framework in a very elegant way. Look at the following picture:

It shows:

-A moving object with velocity v=\vert \mathbf{v}\vert with respect to Earth, approaching to Earth.

-There is some angle \theta in the direction of observation. And as it moves towards Earth, with our conventions, \theta\approx\pi=180\textdegree

-The moving object emits flashes of light at two different points, A and B, separated by some time interval \Delta t_e in the Earth reference frame.

-The distance between those two points A and B, is very small compared with the distance object-Earth, i.e., d(A,B)<< D.

Question: What is the time separation \Delta t_r between the receptions of the pulses at the Earth surface?

The solution is very cool and intelligent. We get

A: the pulse emitted at A (at time t=0) arrives at t_A=\dfrac{D}{c}

B: the pulse emitted at B (at time \Delta t_e) arrives at t_B=\Delta t_e+\dfrac{D+v\Delta t_e\cos\theta}{c}

Note that \cos\theta<0!

From these equations, taking the difference t_B-t_A, we get a combined equation for the time separation of pulses on Earth

\boxed{\Delta t_r=\Delta t_e (1+\beta \cos\theta)}

The tangential separation is defined to be

\Delta Y=Y_B-Y_A=v\Delta t_e\sin\theta

so, the apparent velocity of the source, seen from the Earth frame, is shown to be:

\boxed{v_a=\dfrac{\Delta Y}{\Delta t_r}=\dfrac{\beta\sin\theta}{1+\beta\cos\theta}c}

Remark (I): v_a>>c IFF \beta\approx 1 AND \cos\theta\approx -1!
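The boxed formula for v_a is easy to explore numerically. The following Python sketch uses the sign convention of this post (\theta\approx\pi for an approaching source) and an illustrative value of \beta. It also checks a standard result that is not derived in the text: for fixed \beta, v_a is maximal when \cos\theta=-\beta, where it equals \beta\gamma c. The function names are illustrative:

```python
import math


def apparent_velocity(beta, theta):
    """Apparent transverse speed in units of c: beta*sin(theta) / (1 + beta*cos(theta))."""
    return beta * math.sin(theta) / (1.0 + beta * math.cos(theta))


def max_apparent_velocity(beta):
    """Maximum over theta, reached at cos(theta) = -beta: beta * gamma."""
    return beta / math.sqrt(1.0 - beta**2)


beta = 0.99  # illustrative value, not fitted to any source
theta_star = math.acos(-beta)
print(f"v_a at theta*: {apparent_velocity(beta, theta_star):.3f} c")
print(f"beta * gamma : {max_apparent_velocity(beta):.3f} c")
```

Even though \beta<1, the apparent speed exceeds c by a large factor for these values, which is the whole point of the “fake” superluminality.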

Remark (II): There are some other sources of fake superluminality in special relativity or general relativity (the relativistic theory of gravity). One example is that the phase velocity or the group velocity can indeed exceed the speed of light, since from the equation v_{ph}v_{g}=c^2 it is obvious that whenever one of those two velocities (group or phase velocity) is lower than the speed of light in vacuum, the other has to exceed the speed of light. That is not observable, but it has an important rôle in the de Broglie wave-particle portrait of the atom. Another important example of apparent and fake superluminal motion is caused by gravitational (micro)lensing in General Relativity. Due to the effect of intense gravitational fields (i.e., big concentrations of mass-energy), light beams from slow-moving objects can be magnified to make them, apparently, superluminal. In this sense, gravity acts in a way analogous to a lens, i.e., as if there were a refraction index modifying the propagation of the light emitted by the sources.

Remark (III): In spite of appearances, I am not opposed to the idea of superluminal entities, if they don’t break established knowledge that we know works. Tachyons have problems that are not completely solved, and many physicists think (for good reasons) that they are “unphysical”. However, my own experience working with theories beyond special/general relativity and allowing superluminal stuff (again, we should be careful with what we mean by superluminality and by “velocity” in general) has shown me that if superluminal objects do exist, they have observable consequences. And as has been shown here, not every apparent superluminal motion is superluminal! Indeed, it can be handled in the SR framework. So, beware of crackpots claiming that there are superluminal jets of matter out there, that neutrinos are effectively superluminal entities (again, an observation refuted by OPERA, MINOS and ICARUS, and in complete disagreement with the theory of neutrino oscillations and the real mass that neutrinos do have!), or even saying that there are superluminal protons and particles in the LHC or passing through the atmosphere without any effect that should be visible with current technology. It is simply not true, as every good astronomer, astrophysicist or theoretical physicist knows! Superluminality, if it exists, is a very subtle thing, and it has observable consequences that we have not observed until now. As far as I know, there is no (accepted) observation of any superluminal particle, as every physicist knows. I have discussed the issue of neutrino time of flight here before:

https://thespectrumofriemannium.wordpress.com/2012/06/08/

Final challenge: With the data given above, what would the minimal value of \beta be in order to account for the observed motion and apparent (fake) superluminal velocity of the radiogalaxy 3C 273?