A physics story of infinities, goats and colours

When I was writing in August about physicist Sheldon Glashow’s objection to Abdus Salam being awarded a share of the 1979 physics Nobel Prize, I learnt that it was because Salam had derived, by a different route, a theory that Glashow had also derived – and in both cases the final product was non-renormalisable. A year or so later, Steven Weinberg derived the same theory but this time also ensured that it was renormalisable. Glashow said Salam shouldn’t have won the prize because Salam hadn’t brought anything new to the table, whereas Glashow had derived the initial theory and Weinberg had made it renormalisable.

His objections aside, the episode brought to my mind the work of Kenneth Wilson, who made important contributions to the renormalisation toolkit. Specifically, using these tools, physicists ensure that the equations that they’re using to model reality don’t get out of hand and predict impossible values. An equation might be useful to solve problems in 99 scenarios but in one, it might predict an infinity (i.e. the value of a physical variable grows without bound), rendering the equation useless. In such cases, physicists use renormalisation techniques to ensure the equation works in the 100th scenario as well, without predicting infinities. (This is a simplistic description that I will finesse throughout this post.)

In 2013, when Kenneth Wilson died, I wrote about the “Indian idea of infiniteness” – including how scholars in ancient India had contemplated very large numbers and their origins, only for this knowledge to have all but disappeared from the public imagination today because of the country’s failure to preserve it. On both occasions, I never quite fully understood what renormalisation really entailed. The following post is an attempt to fill that gap.

You know electrons. Electrons have mass. Not all of this mass comes from a single source. Some of it is the mass of the particle itself, sometimes called the shell mass. The electron also has an electric charge and casts a small electromagnetic field around itself. This field has some energy. According to the mass-energy equivalence (E = mc²), this energy should correspond to some mass. This is called the electron’s electromagnetic mass.

Now, there is an equation to calculate how much a particle’s electromagnetic mass will be – and this equation shows that this mass is inversely proportional to the particle’s radius. That is, the smaller the particle, the greater its electromagnetic mass. This is why the proton, which is larger than the electron, draws a smaller share of its mass from its electromagnetic mass.
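For a feel of the numbers: for a spherical shell of charge e and radius r, the standard textbook expressions (quoted here rather than derived; the exact numerical prefactor depends on how the charge is distributed and on conventions) are

$$U_{\text{field}} = \frac{e^2}{8\pi\varepsilon_0 r}, \qquad m_{\text{em}} \sim \frac{U_{\text{field}}}{c^2} \;\propto\; \frac{1}{r}$$

so the electromagnetic mass grows as the radius shrinks.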

So far so good – but quickly a problem arises. As the particle becomes smaller, according to the equation, its electromagnetic mass will increase. In technical terms, as the particle radius approaches zero, its mass will approach infinity. If its mass approaches infinity, the particle will be harder to move from rest, or accelerate, because a very large and increasing amount of energy will be required to do so. So the equation predicts that smaller charged particles, like quarks, should be nearly impossible to move around. Yet this is not what we see in experiments, where these particles do move around.

In the first decade of the 20th century (when the equation existed but quarks had not yet been discovered), Max Abraham and Hendrik Lorentz resolved this problem by assuming that the shell mass of the particle is negative. It was the earliest (recorded) instance of such a tweak – made so that the equations we use to model reality don’t lose touch with that reality – and was called renormalisation. Assuming the shell mass is negative is silly, of course, but it doesn’t affect the final result in a way that breaks the theory. To renormalise, in this context, is to assume that our mathematical knowledge of the event being modelled is not complete enough, or that insisting on such completeness would make most other problems intractable.

There is another route physicists take to make sure equations and reality match, called regularisation. This is arguably more intuitive. Here, the physicist modifies the equation to include a ‘cutoff factor’ that represents what the physicist assumes is their incomplete knowledge of the phenomenon to which the equation is being applied. By applying a modified equation in this way, the physicist argues that some ‘new physics’ will be discovered in future that will complete the theory and the equation to perfectly account for the mass.

(I personally prefer regularisation because it seems more modest, but this is an aesthetic choice that has nothing to do with the physics itself and is thus moot.)

It is sometimes the case that once a problem is solved by regularisation, the cutoff factor disappears from the final answer – so it effectively helps solve the problem while its presence or absence doesn’t affect the answer.

This brings to mind the famous folk tale of the goat negotiation problem, doesn’t it? A fellow in a village dies and bequeaths his 17 goats to three sons thus: the eldest gets half, the middle gets a third and the youngest gets one-ninth. Obviously the sons get into a fight: the eldest claims nine instead of 8.5 goats, the middle claims six instead of 5.67 and the youngest claims two instead of 1.89. But then a wise old woman turns up and figures it out. She adds one of her own goats to the father’s 17 to make up a total of 18. Now, the eldest son gets nine goats, the middle son gets six goats and the youngest son gets two goats. Problem solved? When the sons tally up the goats they received, they realise that the total is still 17. The old woman’s goat is left over, which she takes back before getting on her way. The one additional goat was the cutoff factor here: you add it to the problem, solve it, get a solution and move on.
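For the record, the arithmetic that makes the trick work:

$$\frac{1}{2} + \frac{1}{3} + \frac{1}{9} = \frac{9 + 6 + 2}{18} = \frac{17}{18}$$

The three fractions don’t add up to a whole, which is why dividing 18 goats according to them hands out exactly 9 + 6 + 2 = 17 goats and leaves the borrowed goat untouched.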

The example of the electron was suitable but also convenient: the need to renormalise particle masses originally arose in the context of classical electrodynamics – the first theory developed to study the behaviour of charged particles. Theories that physicists developed later, in each case to account for some phenomena that other theories couldn’t, also required renormalisation in different contexts, but for the same purpose: to keep the equations from predicting infinities. Infinity is a strange number that compromises our ability to make sense of the natural universe because it spreads itself like an omnipresent screen, obstructing our view of the things beyond. To get to them, you must scale an unscaleable barrier.

While the purpose of renormalisation has stayed the same, it took on new forms in different contexts. For example, quantum electrodynamics (QED) studies the behaviour of charged particles using the rules of quantum physics – as opposed to classical electrodynamics, which is an extension of Newtonian physics. In QED, the ‘bare’ charge of an electron actually comes out to be infinite. This is because QED doesn’t have a way to explain why the force exerted by a charged particle decreases as you move away. But in reality electrons and protons have finite charges. How do we fix the discrepancy?

The path of renormalisation here is as follows: Physicists assume that any empty space is not really empty. There may be no matter there, sure, but at the microscopic scale, the vacuum is said to be teeming with virtual particles. These are pairs of particles that pop in and out of existence over very short time scales. The energy that produces them, and the energy that they release when they annihilate each other and vanish, is what physicists assume to be the energy inherent to space itself.

Now, say an electron-positron pair, called ‘e’ and ‘p’, pops up near an independent electron, ‘E’. The positron is the antiparticle of the electron and has a positive charge, so it will move closer to E. As a result, the electromagnetic force exerted by E’s electric charge becomes screened at a certain distance away, and the reduced force implies a lower effective charge. As the virtual particle pairs constantly flicker around the electron, QED says that we can observe only the effects of its screened charge.
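The textbook way to summarise this screening (the standard one-loop result for a single charged particle species, quoted here rather than derived) is an effective coupling that depends on the energy Q at which you probe the electron:

$$\alpha_{\text{eff}}(Q^2) = \frac{\alpha(\mu^2)}{1 - \dfrac{\alpha(\mu^2)}{3\pi}\,\ln\dfrac{Q^2}{\mu^2}}$$

At larger Q (shorter distances) you punch through more of the screening cloud and see a larger effective charge; at low energies you measure the familiar finite value.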

By the 1960s, physicists had found several fundamental particles and were trying to group them in a way that made sense – i.e. that said something about why these were the fundamental particles and not others, and whether an incomplete pattern might suggest the presence of particles still to be discovered. Subsequently, in 1964, two physicists working independently – George Zweig and Murray Gell-Mann – proposed that protons and neutrons were not fundamental particles but were made up of smaller particles called quarks and gluons. They also said that there were three kinds of quarks and that the quarks could bind together using the gluons (thus the name). Each of these particles had an electric charge and a spin, just like electrons.

Within a year, Oscar Greenberg proposed that the quarks would also have an additional ‘colour charge’ to explain why they don’t violate Pauli’s exclusion principle. (The term ‘colour’ has nothing to do with colours; it is just the label that unimaginative physicists selected when they were looking for one.) Around the same time, James Bjorken and Sheldon Glashow also proposed that there would have to be a fourth kind of quark, because with it the new quark-gluon model could explain three more problems that were unsolved at the time. In 1968, experiments turned up the first evidence for quarks, vindicating Zweig, Gell-Mann, Glashow, Bjorken, Greenberg, etc. But as usual, there was a problem.

Quantum chromodynamics (QCD) is the study of quarks and gluons. In QED, if an electron and a positron interact at higher energies, their coupling will be stronger. But physicists who designed experiments in which they could observe the presence of quarks found the opposite was true: at higher energies, the quarks in a bound state behaved more and more like individual particles, but at lower energies, the effects of the individual quarks didn’t show, only that of the bound state. Seen another way: if you move an electron and a positron apart, the force between them gradually drops off to zero, but if you try to pull two quarks apart, the force between them doesn’t fade away – it stays strong, which is why quarks are never observed alone – whereas at very short distances they behave almost as if they were free. It seemed that QCD would defy the kind of renormalisation that had worked for QED.

A breakthrough came in 1973. If a quark ‘Q’ is surrounded by virtual quark-antiquark pairs ‘q’ and ‘q*’, then q* would move closer to Q and screen Q’s colour charge. However, the gluons have the dubious distinction of being their own antiparticles. So some of these virtual pairs are also gluon-gluon pairs. And gluons also carry colour charge. When the two quarks are moved apart, the space in between is occupied by gluon-gluon pairs that bring in more and more colour charge, leading to the counterintuitive effect.
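The standard leading-order expression for the strength of the QCD coupling captures this anti-screening (again quoted, not derived):

$$\alpha_s(Q^2) = \frac{12\pi}{(33 - 2n_f)\,\ln(Q^2/\Lambda_{\text{QCD}}^2)}$$

Here n_f is the number of quark flavours; as long as n_f is 16 or fewer, the coupling shrinks as the energy Q grows – quarks probed at very short distances look almost free – and swells at low energies, where only bound states are visible.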

However, QCD has had need of renormalisation in other areas, such as with the quark self-energy. Recall the electron and its electromagnetic mass in classical electrodynamics? That mass came from the energy of the electromagnetic field the electron cast around itself. This energy is called self-energy. Similarly, quarks bear an electric charge as well as a colour charge and cast a chromo-electric field around themselves. The resulting self-energy, as in the classical electron example, threatens to blow up to an extremely high value – at odds with reality, where quarks have a much lower, certainly finite, self-energy.

However, the simple addition of virtual particles wouldn’t solve the problem either, because of the counterintuitive effects of the colour charge and the presence of gluons. So physicists are forced to adopt a more convoluted path in which they use both renormalisation and regularisation, and ensure that the latter turns out like the goats – the new factor introduced into the equations doesn’t remain in the ultimate solution. The mathematics of QCD is a lot more complicated than that of QED (it is notoriously hard even for specially trained physicists), so the renormalisation and regularisation process is correspondingly inaccessible to non-physicists. More than anything, it is steeped in specialised mathematical techniques.

All this said, renormalisation is obviously quite inelegant. The famous British physicist Paul A.M. Dirac, who pioneered its use in particle physics, called it “ugly”. This attitude changed in large part due to the work of Kenneth Wilson. (By the way, his PhD supervisor was Gell-Mann.)

Quarks and gluons together make up protons and neutrons. Protons, neutrons and electrons, plus the forces between them, make up atoms. Atoms make up molecules, molecules make up compounds and many compounds together, in various quantities, make up the objects we see all around us.

This description encompasses three broad scales: the microscopic, the mesoscopic and the macroscopic. Wilson developed a theory to act like a bridge – between the forces that quarks experience at the microscopic scale and the forces that cause larger objects to undergo phase transitions (i.e. go from solid to liquid or liquid to vapour, etc.). When a quark enters or leaves a bound state or if it is acted on by other particles, its energy changes, which is also what happens in phase transitions: objects gain or lose energy, and reorganise themselves (liquid → vapour) to hold or shed that energy.

By establishing this relationship, Wilson could bring insights gleaned at one scale to bear on difficult problems at another, and thus make corrections that were more streamlined and more elegant. This is quite clever because renormalisation is, at bottom, the act of substituting what we are modelling with what we are able to observe – and Wilson improved on it by dropping the direct substitution in favour of something more mathematically robust. After this point in history, physicists adopted renormalisation as a tool more widely across several branches of physics. As physicist Leo Kadanoff wrote in his obituary for Wilson in Nature, “It could … be said that Wilson has provided scientists with the single most relevant tool for understanding the basis of physics.”
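For a concrete, if highly simplified, feel for what bridging scales by coarse-graining means, here is a toy sketch in Python using the textbook decimation of the one-dimensional Ising model – an illustration of the renormalisation-group idea, not anything from Wilson’s own calculations:

```python
import math

def decimate(K):
    """One coarse-graining step for the 1D Ising chain: sum out every other
    spin. The surviving spins interact with an effective coupling K' that
    satisfies tanh(K') = tanh(K)**2 (a standard textbook result)."""
    return math.atanh(math.tanh(K) ** 2)

# Start with a fairly strong coupling and watch it 'flow' as we repeatedly
# zoom out to coarser scales.
K = 1.0
for step in range(6):
    print(f"step {step}: effective coupling K = {K:.4f}")
    K = decimate(K)

# The coupling flows towards zero: at coarse scales the chain looks disordered,
# which is the renormalisation-group way of saying the 1D Ising model has no
# finite-temperature phase transition.
```

The point of the toy model is only this: the same physical system looks different depending on the scale at which you describe it, and the renormalisation group tracks exactly how the description changes from one scale to the next.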

This said, however, the importance of renormalisation – or anything like it that compensates for the shortcomings of observation-based theories – was known earlier as well, so much so that physicists considered a theory that couldn’t be renormalised to be inferior to one that could be. This was responsible for at least a part of Sheldon Glashow’s objection to Abdus Salam winning a share of the physics Nobel Prize.

Sources:

  1. Introduction to QCD, Michelangelo L. Mangano
  2. Lectures on QED and QCD, Andrey Grozin
  3. Lecture notes – Particle Physics II, Michiel Botje
  4. Lecture 5: QED
  5. Introduction to QCD, P.Z. Skands
  6. Renormalization: Dodging Infinities, John G. Cramer

‘Surface of last screaming’

This has nothing to do with anything in the news. I was reading up about the Big Bang for a blog post when I came across this lucid explanation – so good it’s worth sharing for that reason alone – for the surface of last scattering, the site of an important event in the history of the universe. A lot happens by this moment, even if it happens only 379,000 years after the bang, and it’s easy to get lost in the details. But as the excerpt below shows, coming at it from the PoV of phase transitions considerably simplifies the picture (assuming of course that you’re comfortable with phase transitions).

To visualise how this effect arises, imagine that you are in a large field filled with people screaming. You are screaming too. At some time t = 0 everyone stops screaming simultaneously. What will you hear? After 1 second you will still be able to hear the distant screaming of people more than 330 metres away (the speed of sound in air, v, is about 330 m/s). After 3 seconds you will be able to hear distant screams from people more than 1 kilometre away (even though those distant people stopped screaming when you did). At any time t, assuming a suitably heightened sense of hearing, you will hear some faint screams, but the closest and loudest will be coming from people a distance v*t away. This distance defines the ‘surface of last screaming’ and this surface is receding from you at the speed of sound. …

When something is hot and cools down it can undergo a phase transition. For example, hot steam cools down to become water, and when cooled further it becomes ice. The Universe went through similar phase transitions as it expanded and cooled. One such phase transition … produced the last scattering surface. When the Universe was cool enough to allow the electrons and protons to fall together, they ‘recombined’ to form neutral hydrogen. […] photons do not interact with neutral hydrogen, so they were free to travel through the Universe without being scattered. They decoupled from matter. The opaque Universe then became transparent.

Imagine you are living 15 billion years ago. You would be surrounded by a very hot opaque plasma of electrons and protons. The Universe is expanding and cooling. When the Universe cools down below a critical temperature, the fog clears instantaneously everywhere. But you would not be able to see that it has cleared everywhere because, as you look into the far distance, you would be seeing into the opaque past of distant parts of the Universe. As the Universe continues to expand and cool you would be able to see farther, but you would always see the bright opaque fog in the distance, in the past. That bright fog is the surface of last scattering. It is the boundary between a transparent and an opaque universe and you can still see it today, 15 billion years later.

A tale of vortices, skyrmions, paths and shapes

There are many types of superconductors. Some of them can be explained by an early theory of superconductivity called Bardeen-Cooper-Schrieffer (BCS) theory.

In these materials, vibrations in the atomic lattice force the electrons in the material to overcome their mutual repulsion and team up in pairs, if the material’s temperature is below a particular (very low) threshold. These pairs of electrons, called Cooper pairs, have some properties that individual electrons can’t have. One of them is that all the Cooper pairs together form an exotic state of matter called a Bose-Einstein condensate, which can flow through the material with much less resistance than individual electrons experience. This is the gist of BCS theory.

When the Cooper pairs are involved in the transmission of an electric current through the material, the material is an electrical superconductor.

Some of the properties of the two electrons in each Cooper pair can influence the overall superconductivity itself. One of them is the pair’s relative orbital angular momentum. If the two electrons’ orbital angular momenta are equal in magnitude but point in opposite directions, the relative orbital angular momentum is 0. Such materials are called s-wave superconductors.

Sometimes, in s-wave superconductors, some of the electric current – or supercurrent – starts flowing in a vortex within the material. If these vortices can be coupled with a magnetic structure called a skyrmion, physicists believe they can give rise to some new behaviour previously not seen in materials, some of them with important applications in quantum computing. Coupling here implies that a change in the properties of the vortex should induce changes in the skyrmion, and vice versa.

However, physicists have had a tough time creating a vortex-skyrmion coupling that they can control. As Gustav Bihlmayer, a staff scientist at the Jülich Research Centre, Germany, wrote for APS Physics, “experimental studies of these systems are still rare. Both parts” of the structures bearing these features “must stay within specific ranges of temperature and magnetic-field strength to realise the desired … phase, and the length scales of skyrmions and vortices must be similar in order to study their coupling.”

In a new paper, a research team from Nanyang Technological University, Singapore, has reported achieving just such a coupling: the researchers created a skyrmion in a chiral magnet and used it to induce the formation of a supercurrent vortex in an s-wave superconductor. In their observations, they found this coupling to be stable and controllable – important attributes to have if the setup is to find practical application.

A chiral magnet is a material whose internal magnetic field typically has a spiral or swirling pattern. A supercurrent vortex in an electrical superconductor is analogous to a skyrmion in a chiral magnet; a skyrmion is a “knot of twisting magnetic field lines” (source).

The researchers sandwiched an s-wave superconductor and a chiral magnet together. When the magnetic field of a skyrmion in the chiral magnet interacted with the superconductor at the interface, it induced a spin-polarised supercurrent (i.e. the participating electrons’ spins are aligned along a certain direction). This phenomenon is called the Rashba-Edelstein effect, and it essentially converts electric charge to electron spin and vice versa. To do so, the effect requires the two materials to be in contact and depends among other things on the properties of the skyrmion’s magnetic field.

There’s another mechanism of interaction in which the chiral magnet and the superconductor don’t have to be in touch, and which the researchers successfully attempted to recreate. They preferred this mechanism, called stray-field coupling, to demonstrate a skyrmion-vortex system for a variety of practical reasons. For example, the chiral magnet is placed in an external magnetic field during the experiment. Taking the Rashba-Edelstein route means that, to achieve “stable skyrmions at low temperatures in thin films”, the field needs to be stronger than 1 T. (Earth’s magnetic field measures 25-65 µT.) Such a strong field could damage the s-wave superconductor.

For the stray-field coupling mechanism, the researchers inserted an insulator between the chiral magnet and the superconductor. Then, when they applied a small magnetic field, Bihlmayer wrote, the field “nucleated” skyrmions in the structure. “Stray magnetic fields from the skyrmions [then] induced vortices in the [superconducting] film, which were observed with scanning tunnelling spectroscopy.”


Experiments like this one reside at the cutting edge of modern condensed-matter physics. A lot of their complexity lies in closely controlling the conditions in which different quantum effects play out, using similarly advanced tools and techniques to understand what could be going on inside the materials, and picking the right combination of materials to use.

For example, the heterostructure the physicists used to manifest the stray-field coupling mechanism had the following composition, from top to bottom:

  • Platinum, 2 nm (layer thickness)
  • Niobium, 25 nm
  • Magnesium oxide, 5 nm
  • Platinum, 2 nm

The next four layers are repeated 10 times in this order:

  • Platinum, 1 nm
  • Cobalt, 0.5 nm
  • Iron, 0.5 nm
  • Iridium, 1 nm

Back to the overall stack:

  • Platinum, 10 nm
  • Tantalum, 2 nm
  • Silicon dioxide (substrate)

The platinum and niobium layers at the top make up the superconductor, the magnesium oxide is the insulator, and the rest (except the substrate) make up the chiral magnet.
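As a rough way to keep track of such a stack, here is a small Python sketch that encodes the layers listed above as data and tallies the total thickness (the grouping follows the description in the previous paragraph; nothing here comes from the paper beyond the layer list):

```python
# Layer names and thicknesses (in nm) as listed above, top to bottom.
superconductor = [("Pt", 2), ("Nb", 25)]
insulator = [("MgO", 5)]
chiral_magnet = (
    [("Pt", 2)]
    + [("Pt", 1), ("Co", 0.5), ("Fe", 0.5), ("Ir", 1)] * 10  # repeated block
    + [("Pt", 10), ("Ta", 2)]
)

stack = superconductor + insulator + chiral_magnet  # SiO2 substrate excluded
total_nm = sum(thickness for _, thickness in stack)
print(f"{len(stack)} layers, about {total_nm:.0f} nm thick")  # ~76 nm
```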

It’s possible to erect a stack like this through trial and error, with no deeper understanding dictating the choice of materials. But when the universe of possibilities – of elements, compounds and alloys, their shapes and dimensions, and the ambient conditions in which they interact – is so vast, the exercise could take many decades. Yet here we are, at a time when scientists have explored various properties of materials and their interactions, and are able to engineer novel behaviours into existence, blurring the line between discovery and invention. Even in the absence of applications, such observations are nothing short of fascinating.

Applications aren’t wanting, however.


A quasiparticle is a packet of energy that behaves like a particle in a specific context even though it isn’t actually one. For example, the proton is a quasiparticle because it’s really a clump of smaller particles (quarks and gluons) that together behave in a fixed, predictable way. A phonon is a quasiparticle that represents some vibrational (or sound) energy being transmitted through a material. A magnon is a quasiparticle that represents some magnetic energy being transmitted through a material.

On the other hand, an electron is said to be a particle, not a quasiparticle – as are neutrinos, photons, Higgs bosons, etc.

Now and then physicists abstract packets of energy as particles in order to simplify their calculations.

(Aside: I’m aware of the blurred line between particles and quasiparticles. For a technical but – if you’re prepared to Google a few things – fascinating interview with condensed-matter physicist Vijay Shenoy on this topic, see here.)

We understand how these quasiparticles behave in three-dimensional space – the space we ourselves occupy. Their properties are likely to change if we study them in lower or higher dimensions. (Even if directly studying them in such conditions is hard, we know their behaviour will change because the theory describing their behaviour predicts it.) But there is one kind of quasiparticle that exists in two dimensions and differs from the others in a strange way. These are called anyons.

Say you have two electrons in an atom orbiting the nucleus. If you exchanged their positions with each other, the measurable properties of the atom would stay the same. If you swapped the electrons once more to bring them back to their original positions, the properties would still remain unchanged. However, if you switched the positions of two anyons in a quantum system, something about the system would change. More broadly, if you started with a bunch of anyons in a system and successively exchanged their positions until they had a specific final arrangement, the system’s properties would have changed differently depending on the sequence of exchanges.

This is called path dependency, and anyons are unique in possessing this property. In technical language, anyons with this property are called non-Abelian quasiparticles. They’re interesting for many reasons, but one application stands out. Quantum computers are devices that use the quantum mechanical properties of particles, or quasiparticles, to execute logical decisions (the same way ‘classical’ computers use semiconductors). Anyons’ path dependency is useful here. Arranging anyons in one sequence to achieve a final arrangement can be mapped to one piece of information (e.g. 1), and arranging anyons in a different sequence to achieve the same final arrangement can be mapped to different information (e.g. 0). This way, the information that can be encoded depends on the availability of different paths to a common final state.
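A toy sketch of why the order of exchanges can matter: represent two hypothetical exchange operations as matrices that don’t commute. These are not the braid matrices of any real anyon model, just an illustration of non-commutativity:

```python
import numpy as np

# Two hypothetical 'exchange' operations acting on a two-level system.
# For non-Abelian anyons, exchanges are represented by matrices that
# need not commute, so the order of exchanges changes the final state.
U1 = np.array([[np.cos(np.pi / 4), -np.sin(np.pi / 4)],
               [np.sin(np.pi / 4),  np.cos(np.pi / 4)]], dtype=complex)
U2 = np.array([[1, 0],
               [0, np.exp(1j * np.pi / 3)]], dtype=complex)

state = np.array([1.0, 0.0], dtype=complex)  # some initial state

path_a = U2 @ U1 @ state  # exchange 1 first, then exchange 2
path_b = U1 @ U2 @ state  # same exchanges, opposite order

print(np.allclose(path_a, path_b))  # False: the path taken matters
```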

In addition, an important issue with existing quantum computers is that they are too fragile: even a slight interaction with the environment can cause the devices to malfunction. Using anyons for the qubits could overcome this problem because the information stored doesn’t depend on the qubits’ instantaneous states but on the paths the anyons have taken to get there. So as long as the paths have been executed properly, environmental interactions that may disturb the anyons’ final states won’t matter.

However, creating such anyons isn’t easy.

Now, recall that s-wave superconductors are characterised by the relative orbital angular momentum of electrons in the Cooper pairs being 0 (i.e. equal but in opposite directions). In some other materials, it’s possible that the relative value is 1. These are the p-wave superconductors. And at the centre of a supercurrent vortex in a p-wave superconductor, physicists expect to find non-Abelian anyons.

So the ability to create and manipulate these vortices in superconductors, as well as, more broadly, explore and understand how magnet-superconductor heterostructures work, is bound to be handy.


The Nanyang team’s paper calls the vortices and skyrmions “topological excitations”. An ‘excitation’ here is an accumulation of energy in a system over and above what the system has in its ground state. Ergo, it’s excited. A topological excitation refers to energy manifested in changes to the system’s topology.

On this subject, one of my favourite bits of science is topological phase transitions.

I usually don’t quote from Wikipedia but communicating condensed-matter physics is exacting. According to Wikipedia, “topology is concerned with the properties of a geometric object that are preserved under continuous deformations, such as stretching, twisting, crumpling and bending”. For example, no matter how much you squeeze or stretch a donut (without breaking it), it’s going to be a ring with one hole. Going one step further, your coffee mug and a donut are topologically similar: they’re both objects with one hole.

I also don’t like the Nobel Prizes but some of the research that they spotlight is nonetheless awe-inspiring. In 2016, the prize was awarded to Duncan Haldane, John Kosterlitz and David Thouless for “theoretical discoveries of topological phase transitions and topological phases of matter”.

David Thouless in 1995. Credit: Mary Levin/University of Washington

Quoting myself from 2016:

There are four popularly known phases of matter: plasma, gas, liquid and solid. If you cooled plasma, its phase would transit to that of a gas; if you cooled gases, you’d get a liquid; if you cooled liquids, you’d get a solid. If you kept cooling a solid until you were almost at absolute zero, you’d find substances behaving strangely because, suddenly, quantum mechanical effects show up. These phases of matter are broadly called quantum phases. And their phase transitions are different from when plasma becomes a gas, a gas becomes a liquid, and so on.

A Kosterlitz-Thouless transition describes a type of quantum phase transition. A substance in the quantum phase, like all substances, tries to possess as low energy as possible. When it gains some extra energy, it sheds it. And how it sheds it depends on what the laws of physics allow. Kosterlitz and Thouless found that, at times, the surface of a flat quantum phase – like the surface of liquid helium – develops vortices, akin to a flattened tornado. These vortices always formed in pairs, so the surface always had an even number of vortices. And at very low temperatures, the vortices were always tightly coupled: they remained close to each other even when they moved across the surface.

The bigger discovery came next. When Kosterlitz and Thouless raised the temperature of the surface, the vortices moved apart and moved around freely, as if they no longer belonged to each other. In terms of thermodynamics alone, the vortices being alone or together wouldn’t depend on the temperature, so something else was at play. The duo had found a kind of phase transition – because it did involve a change in temperature – that didn’t change the substance itself but only a topological shift in how it behaved. In other words, the substance was able to shed energy by coupling the vortices.
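The textbook back-of-the-envelope way to see why temperature decides whether the vortices stay paired (a standard argument, not something from my 2016 post): a single free vortex in a system of linear size L costs an energy that grows logarithmically with L, but it also gains entropy that grows the same way. With J the stiffness of the medium and a the size of the vortex core,

$$E \approx \pi J \ln\frac{L}{a}, \qquad S \approx 2k_B \ln\frac{L}{a}, \qquad F = E - TS \approx (\pi J - 2k_B T)\ln\frac{L}{a}$$

so below $T_{KT} = \pi J/2k_B$ a lone vortex is prohibitively costly and vortices appear only in bound pairs, while above it free vortices proliferate – the topological transition Kosterlitz and Thouless described.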

Reality is so wonderfully weird. It’s also curious that some concepts that seemed significant when I was learning science in school (like invention versus discovery) and in college (like particle versus quasiparticle) – concepts that seemed meaningful and necessary to understand what was really going on – don’t really matter in the larger scheme of things.