EBook Overview
The importance and the beauty of modern quantum field theory resides in the power and variety of its methods and ideas, which find application in domains as different as particle physics, cosmology, condensed matter, statistical mechanics and critical phenomena. This book introduces the reader to the modern developments in a manner which assumes no previous knowledge of quantum field theory. Along with standard topics like Feynman diagrams, the book discusses effective lagrangians, renormalization group equations, the path integral formulation, spontaneous symmetry breaking and nonabelian gauge theories. The inclusion of more advanced topics will also make this a most useful book for graduate students and researchers.
EBook Content
OXFORD MASTER SERIES IN STATISTICAL, COMPUTATIONAL, AND THEORETICAL PHYSICS
OXFORD MASTER SERIES IN PHYSICS

The Oxford Master Series is designed for final year undergraduate and beginning graduate students in physics and related disciplines. It has been driven by a perceived gap in the literature today. While basic undergraduate physics texts often show little or no connection with the huge explosion of research over the last two decades, more advanced and specialized texts tend to be rather daunting for students. In this series, all topics and their consequences are treated at a simple level, while pointers to recent developments are provided at various stages. The emphasis is on clear physical principles like symmetry, quantum mechanics, and electromagnetism which underlie the whole of physics. At the same time, the subjects are related to real measurements and to the experimental techniques and devices currently used by physicists in academe and industry. Books in this series are written as course books, and include ample tutorial material, examples, illustrations, revision points, and problem sets. They can likewise be used as preparation for students starting a doctorate in physics and related fields, or for recent graduates starting research in one of these fields in industry.

CONDENSED MATTER PHYSICS
1. M. T. Dove: Structure and dynamics: an atomic view of materials
2. J. Singleton: Band theory and electronic properties of solids
3. A. M. Fox: Optical properties of solids
4. S. J. Blundell: Magnetism in condensed matter
5. J. F. Annett: Superconductivity
6. R. A. L. Jones: Soft condensed matter

ATOMIC, OPTICAL, AND LASER PHYSICS
7. C. J. Foot: Atomic physics
8. G. A. Brooker: Modern classical optics
9. S. M. Hooker, C. E. Webb: Laser physics

PARTICLE PHYSICS, ASTROPHYSICS, AND COSMOLOGY
10. D. H. Perkins: Particle astrophysics
11. Ta-Pei Cheng: Relativity, gravitation, and cosmology

STATISTICAL, COMPUTATIONAL, AND THEORETICAL PHYSICS
12. M. Maggiore: A modern introduction to quantum field theory
13. W. Krauth: Statistical mechanics: algorithms and computations
14. J. P. Sethna: Entropy, order parameters, and complexity
A Modern Introduction to Quantum Field Theory
Michele Maggiore
Département de Physique Théorique
Université de Genève
Great Clarendon Street, Oxford OX2 6DP

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide in

Oxford New York
Auckland Bangkok Buenos Aires Cape Town Chennai Dar es Salaam Delhi Hong Kong Istanbul Karachi Kolkata Kuala Lumpur Madrid Melbourne Mexico City Mumbai Nairobi São Paulo Shanghai Taipei Tokyo Toronto

Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries

Published in the United States by Oxford University Press Inc., New York

© Oxford University Press 2005

The moral rights of the author have been asserted
Database right Oxford University Press (maker)

First published 2005

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above

You must not circulate this book in any other binding or cover and you must impose this same condition on any acquirer

A catalogue record for this title is available from the British Library

Library of Congress Cataloging in Publication Data (Data available)

ISBN 0 19 852073 5 (Hbk)
ISBN 0 19 852074 3 (Pbk)

10 9 8 7 6 5 4 3 2 1

Printed in Great Britain on acid-free paper by Antony Rowe, Chippenham
To Maura, Sara and Ilaria
Contents

Preface
Notation

1 Introduction
  1.1 Overview
  1.2 Typical scales in high-energy physics
  Further reading
  Exercises

2 Lorentz and Poincaré symmetries in QFT
  2.1 Lie groups
  2.2 The Lorentz group
  2.3 The Lorentz algebra
  2.4 Tensor representations
    2.4.1 Decomposition of Lorentz tensors under SO(3)
  2.5 Spinorial representations
    2.5.1 Spinors in non-relativistic quantum mechanics
    2.5.2 Spinors in the relativistic theory
  2.6 Field representations
    2.6.1 Scalar fields
    2.6.2 Weyl fields
    2.6.3 Dirac fields
    2.6.4 Majorana fields
    2.6.5 Vector fields
  2.7 The Poincaré group
    2.7.1 Representation on fields
    2.7.2 Representation on one-particle states
  Summary of chapter
  Further reading
  Exercises

3 Classical field theory
  3.1 The action principle
  3.2 Noether's theorem
    3.2.1 The energy–momentum tensor
  3.3 Scalar fields
    3.3.1 Real scalar fields; Klein–Gordon equation
    3.3.2 Complex scalar field; U(1) charge
  3.4 Spinor fields
    3.4.1 The Weyl equation; helicity
    3.4.2 The Dirac equation
    3.4.3 Chiral symmetry
    3.4.4 Majorana mass
  3.5 The electromagnetic field
    3.5.1 Covariant form of the free Maxwell equations
    3.5.2 Gauge invariance; radiation and Lorentz gauges
    3.5.3 The energy–momentum tensor
    3.5.4 Minimal and non-minimal coupling to matter
  3.6 First quantization of relativistic wave equations
  3.7 Solved problems
    The fine structure of the hydrogen atom
    Relativistic energy levels in a magnetic field
  Summary of chapter
  Exercises

4 Quantization of free fields
  4.1 Scalar fields
    4.1.1 Real scalar fields. Fock space
    4.1.2 Complex scalar field; antiparticles
  4.2 Spin 1/2 fields
    4.2.1 Dirac field
    4.2.2 Massless Weyl field
    4.2.3 C, P, T
  4.3 Electromagnetic field
    4.3.1 Quantization in the radiation gauge
    4.3.2 Covariant quantization
  Summary of chapter
  Exercises

5 Perturbation theory and Feynman diagrams
  5.1 The S-matrix
  5.2 The LSZ reduction formula
  5.3 Setting up the perturbative expansion
  5.4 The Feynman propagator
  5.5 Wick's theorem and Feynman diagrams
    5.5.1 A few very explicit computations
    5.5.2 Loops and divergences
    5.5.3 Summary of Feynman rules for a scalar field
    5.5.4 Feynman rules for fermions and gauge bosons
  5.6 Renormalization
  5.7 Vacuum energy and the cosmological constant problem
  5.8 The modern point of view on renormalizability
  5.9 The running of coupling constants
  Summary of chapter
  Further reading
  Exercises

6 Cross-sections and decay rates
  6.1 Relativistic and non-relativistic normalizations
  6.2 Decay rates
  6.3 Cross-sections
  6.4 Two-body final states
  6.5 Resonances and the Breit–Wigner distribution
  6.6 Born approximation and non-relativistic scattering
  6.7 Solved problems
    Three-body kinematics and phase space
    Inelastic scattering of non-relativistic electrons on atoms
  Summary of chapter
  Further reading
  Exercises

7 Quantum electrodynamics
  7.1 The QED Lagrangian
  7.2 One-loop divergences
  7.3 Solved problems
    $e^+ e^- \to \gamma \to \mu^+ \mu^-$
    Electromagnetic form factors
  Summary of chapter
  Further reading
  Exercises

8 The low-energy limit of the electroweak theory
  8.1 A four-fermion model
  8.2 Charged and neutral currents in the Standard Model
  8.3 Solved problems: weak decays
    $\mu^- \to e^- \bar{\nu}_e \nu_\mu$
    $\pi^+ \to l^+ \nu_l$
    Isospin and flavor SU(3)
    $K^0 \to \pi^- l^+ \nu_l$
  Summary of chapter
  Further reading
  Exercises

9 Path integral quantization
  9.1 Path integral formulation of quantum mechanics
  9.2 Path integral quantization of scalar fields
  9.3 Perturbative evaluation of the path integral
  9.4 Euclidean formulation
  9.5 QFT and critical phenomena
  9.6 QFT at finite temperature
  9.7 Solved problems
    Instantons and tunneling
  Summary of chapter
  Further reading

10 Non-abelian gauge theories
  10.1 Non-abelian gauge transformations
  10.2 Yang–Mills theory
  10.3 QCD
  10.4 Fields in the adjoint representation
  Summary of chapter
  Further reading

11 Spontaneous symmetry breaking
  11.1 Degenerate vacua in QM and QFT
  11.2 SSB of global symmetries and Goldstone bosons
  11.3 Abelian gauge theories: SSB and superconductivity
  11.4 Non-abelian gauge theories: the masses of $W^\pm$ and $Z^0$
  Summary of chapter
  Further reading

12 Solutions to exercises
  12.1 Chapter 1
  12.2 Chapter 2
  12.3 Chapter 3
  12.4 Chapter 4
  12.5 Chapter 5
  12.6 Chapter 6
  12.7 Chapter 7
  12.8 Chapter 8

Bibliography

Index
Preface

This book grew out of the notes of the course on quantum field theory that I give at the University of Geneva, for students in the fourth year. Most courses on quantum field theory focus on teaching the student how to compute cross-sections and decay rates in particle physics. This is, and will remain, an important part of the preparation of a high-energy physicist. However, the importance and the beauty of modern quantum field theory resides also in the great power and variety of its methods and ideas. These methods are of great generality and provide a unifying language that one can apply to domains as different as particle physics, cosmology, condensed matter, statistical mechanics and critical phenomena. It is this power and generality that makes quantum field theory a fundamental tool for any theoretical physicist, independently of his/her domain of specialization, as well as, of course, for particle physics experimentalists.

In spite of the existence of many textbooks on quantum field theory, I decided to write these notes because I think that it is difficult to find a book that has a modern approach to quantum field theory, in the sense outlined above, and at the same time is written with the level of fourth-year students in mind, who are being exposed to the subject for the first time. The book is self-contained and can be covered in a two-semester course, possibly skipping some of the more advanced topics. Indeed, my aim is to propose a selection of topics that can really be covered in a course, but in which the students are introduced to many modern developments of quantum field theory.

At the end of some chapters there is a Solved Problems section where some especially instructive computations are presented in great detail, in order to give a model of how one really performs nontrivial computations. More exercises, sometimes quite demanding, are provided for Chapters 1 to 8, and their solutions are discussed at the end of the book.
Chapters 9, 10 and 11 are meant as a bridge toward more advanced courses at the PhD level. A few parts which are more technical and can be skipped at a ﬁrst reading are written in smaller characters.
Acknowledgments. I am very grateful to Stefano Foffa, Florian Dubath, Alice Gasparini, Alberto Nicolis and Riccardo Sturani for their help and for their careful reading of the manuscript. I also thank Jean-Pierre Eckmann for useful comments, and Sonke Adlung, of Oxford University Press, for his friendly and useful advice.
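Notation

Our notation is the same as Peskin and Schroeder (1995). We use units $\hbar = c = 1$; their meaning and usefulness is illustrated in Section 1.2. The metric signature is

$$\eta_{\mu\nu} = (+, -, -, -)\,.$$

Indices. Greek indices take values $\mu = 0, \ldots, 3$, while spatial indices are denoted by Latin letters, $i, j, \ldots = 1, 2, 3$. The totally antisymmetric tensor $\epsilon^{\mu\nu\rho\sigma}$ has $\epsilon^{0123} = +1$ (therefore $\epsilon_{0123} = -1$). Observe that, e.g. $\epsilon^{1230} = -1$ since, to recover the reference sequence 0123, the index zero has to jump three positions. Therefore $\epsilon^{\mu\nu\rho\sigma}$ is anticyclic. Repeated upper and lower Lorentz indices are summed over, e.g. $A_\mu B^\mu \equiv \sum_{\mu=0}^{3} A_\mu B^\mu$. When the equations contain only spatial indices, we will keep all indices as upper indices,¹ and we will sum over repeated upper indices; e.g. the angular momentum commutation relations are written as $[J^k, J^l] = i\epsilon^{klm} J^m$, and the totally antisymmetric tensor $\epsilon^{ijk}$ is normalized as $\epsilon^{123} = +1$. The notation $\mathbf{A}$ denotes a spatial vector whose components have upper indices, $\mathbf{A} = (A^1, A^2, A^3)$.

¹ We will never use lower spatial indices, to avoid the possible ambiguity due to the fact that in equations with only spatial indices it would be natural to use $\delta_{ij}$ to raise and lower them, while with our signature it is rather $\eta_{ij} = -\delta_{ij}$.

The partial derivative is denoted by $\partial_\mu = \partial/\partial x^\mu$ and the (flat space) d'Alembertian is $\Box = \partial_\mu \partial^\mu = \partial_0^2 - \nabla^2$. With our choice of signature the four-momentum operator is represented on functions of the coordinates as $p^\mu = +i\partial^\mu$, so $p^0 = i\partial/\partial x^0 = i\partial/\partial t$ and $p^i = i\partial^i = -i\partial_i = -i\partial/\partial x^i$. Therefore $p^i = -i\nabla^i$ with $\nabla^i = \partial/\partial x^i = \partial_i$ or, in vector notation, $\mathbf{p} = -i\nabla$ and $\nabla = \partial/\partial\mathbf{x}$.

The symbol $\overleftrightarrow{\partial_\mu}$ is defined by $f \overleftrightarrow{\partial_\mu} g = f \partial_\mu g - (\partial_\mu f) g$. We also use the Feynman slash notation: for a four-vector $A^\mu$, we define $\slashed{A} = A_\mu \gamma^\mu$. In particular, $\slashed{\partial} = \gamma^\mu \partial_\mu$.

Dirac matrices. Dirac $\gamma$ matrices satisfy

$$\{\gamma^\mu, \gamma^\nu\} \equiv \gamma^\mu \gamma^\nu + \gamma^\nu \gamma^\mu = 2\eta^{\mu\nu}\,.$$

Therefore $\gamma_0^2 = 1$ and, for each $i$, $(\gamma^i)^2 = -1$; $\gamma^0$ is hermitian while, for each $i$, $\gamma^i$ is antihermitian,

$$(\gamma^0)^\dagger = \gamma^0\,, \qquad (\gamma^i)^\dagger = -\gamma^i\,,$$

or, more compactly, $(\gamma^\mu)^\dagger = \gamma^0 \gamma^\mu \gamma^0$. The matrix $\gamma^5$ is defined as $\gamma^5 = +i\gamma^0 \gamma^1 \gamma^2 \gamma^3$, and satisfies

$$(\gamma^5)^2 = 1\,, \qquad (\gamma^5)^\dagger = \gamma^5\,, \qquad \{\gamma^5, \gamma^\mu\} = 0\,.$$

We also define

$$\sigma^{\mu\nu} = \frac{i}{2}\,[\gamma^\mu, \gamma^\nu]\,.$$

Two particularly useful representations of the $\gamma$ matrix algebra are

$$\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}, \qquad \gamma^5 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$$

(here 1 denotes the $2 \times 2$ identity matrix), which is called the chiral or Weyl representation, and

$$\gamma^0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}, \qquad \gamma^5 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},$$

which is called the ordinary, or standard, representation. The Pauli matrices are

$$\sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$

and satisfy $\sigma^i \sigma^j = \delta^{ij} + i\epsilon^{ijk} \sigma^k$. We also define

$$\sigma^\mu = (1, \sigma^i)\,, \qquad \bar{\sigma}^\mu = (1, -\sigma^i)\,.$$

In the calculation of cross-sections and decay rates we often need the following traces of products of $\gamma$ matrices,

$$\mathrm{Tr}(\gamma^\mu \gamma^\nu) = 4\,\eta^{\mu\nu}\,,$$
$$\mathrm{Tr}(\gamma^\mu \gamma^\nu \gamma^\rho \gamma^\sigma) = 4\,(\eta^{\mu\nu}\eta^{\rho\sigma} - \eta^{\mu\rho}\eta^{\nu\sigma} + \eta^{\mu\sigma}\eta^{\nu\rho})\,,$$
$$\mathrm{Tr}(\gamma^5 \gamma^\mu \gamma^\nu \gamma^\rho \gamma^\sigma) = -4i\,\epsilon^{\mu\nu\rho\sigma}\,.$$

Fourier transform. The four-dimensional Fourier transform is

$$f(x) = \int \frac{d^4k}{(2\pi)^4}\, e^{-ikx} \tilde{f}(k)\,, \qquad \tilde{f}(k) = \int d^4x\, e^{ikx} f(x)\,,$$

and, because of our choice of signature, the three-dimensional Fourier transform is defined as

$$f(\mathbf{x}) = \int \frac{d^3k}{(2\pi)^3}\, e^{+i\mathbf{k}\cdot\mathbf{x}} \tilde{f}(\mathbf{k})\,, \qquad \tilde{f}(\mathbf{k}) = \int d^3x\, e^{-i\mathbf{k}\cdot\mathbf{x}} f(\mathbf{x})\,.$$

For arbitrary $n$, the $n$-dimensional Dirac delta satisfies

$$\int d^n x\, e^{ikx} = (2\pi)^n \delta^{(n)}(k)\,.$$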
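The Dirac-matrix conventions listed above can be checked numerically. The following is a minimal sketch in pure Python, assuming the chiral representation quoted in this section; all helper names (`matmul`, `block`, `clifford_algebra_holds`, and so on) are illustrative and are not part of the book. It tests the anticommutation relation, the two-matrix trace, the explicit form of $\gamma^5$, and the value of $\mathrm{Tr}(\gamma^5\gamma^0\gamma^1\gamma^2\gamma^3)$, which the trace formula with $\epsilon^{0123} = +1$ fixes to $-4i$.

```python
# Pure-Python check of the gamma-matrix conventions (chiral representation).
# Helper names are illustrative assumptions, not notation from the book.

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],       # sigma^1
    [[0, -1j], [1j, 0]],    # sigma^2
    [[1, 0], [0, -1]],      # sigma^3
]
ETA = [1, -1, -1, -1]       # diagonal of eta^{mu nu}, signature (+,-,-,-)


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


IDENTITY4 = block(I2, Z2, Z2, I2)
GAMMA = [block(Z2, I2, I2, Z2)] + [block(Z2, s, scale(-1, s), Z2) for s in SIGMA]

# gamma^0 gamma^1 gamma^2 gamma^3, and gamma^5 = +i times that product
G0123 = matmul(matmul(matmul(GAMMA[0], GAMMA[1]), GAMMA[2]), GAMMA[3])
GAMMA5 = scale(1j, G0123)


def clifford_algebra_holds():
    """Check {g^mu, g^nu} = 2 eta^{mu nu} and Tr(g^mu g^nu) = 4 eta^{mu nu}."""
    for mu in range(4):
        for nu in range(4):
            eta = ETA[mu] if mu == nu else 0
            prod = matmul(GAMMA[mu], GAMMA[nu])
            anti = add(prod, matmul(GAMMA[nu], GAMMA[mu]))
            if not close(anti, scale(2 * eta, IDENTITY4)):
                return False
            if abs(trace(prod) - 4 * eta) > 1e-12:
                return False
    return True


print(clifford_algebra_holds())                          # anticommutator and trace
print(close(GAMMA5, block(scale(-1, I2), Z2, Z2, I2)))   # gamma^5 = diag(-1, 1) in blocks
print(trace(matmul(GAMMA5, G0123)))                      # Tr(g5 g0 g1 g2 g3)
```

The same checks could be repeated for the standard representation by changing only the definitions of `GAMMA[0]`; the algebraic relations do not depend on the representation.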
Electromagnetism. The electron charge is denoted by $e$, and $e < 0$. As is customary in quantum field theory and particle physics, we use the Heaviside–Lorentz system of units for electromagnetism (also called rationalized Gaussian c.g.s. units). This means that the fine structure constant $\alpha = 1/137.035\,999\,11(46)$ is related to the electron charge by

$$\alpha = \frac{e^2}{4\pi\hbar c}\,,$$

or simply $\alpha = e^2/(4\pi)$ when we set $\hbar = c = 1$. With this definition of the unit of charge there is no factor of $4\pi$ in the Maxwell equations,

$$\nabla\cdot\mathbf{E} = \rho\,, \qquad \nabla\times\mathbf{B} - \partial_0\mathbf{E} = \mathbf{J}\,,$$

while the Coulomb potential between two static particles of charges $Q_1 = q_1 e$ and $Q_2 = q_2 e$ is

$$V(r) = \frac{Q_1 Q_2}{4\pi r} = q_1 q_2\,\frac{\alpha}{r} \tag{1}$$

(where in the last equality we have used $\hbar = c = 1$), and the energy density of the electromagnetic field is

$$\varepsilon = \frac{1}{2}\,(\mathbf{E}^2 + \mathbf{B}^2)\,.$$

In quantum electrodynamics nowadays these conventions on the electric charge are almost universally used, but it is useful to remark that they differ from the (unrationalized) Gaussian units commonly used in classical electrodynamics; see, e.g. Jackson (1975) or Landau and Lifshitz, vol. II (1979), where the electron charge is rather defined so that $\alpha = e_{\rm unrat}^2/(\hbar c) \simeq 1/137$, and therefore $e_{\rm unrat} = e/\sqrt{4\pi}$. The unrationalized electric and magnetic fields, $\mathbf{E}_{\rm unrat}$, $\mathbf{B}_{\rm unrat}$, by definition are related to the rationalized electric and magnetic fields, $\mathbf{E}$, $\mathbf{B}$, by $\mathbf{E}_{\rm unrat} = \sqrt{4\pi}\,\mathbf{E}$, $\mathbf{B}_{\rm unrat} = \sqrt{4\pi}\,\mathbf{B}$, i.e. $A^\mu_{\rm unrat} = \sqrt{4\pi}\,A^\mu$. The form of the Lorentz force equation is therefore unchanged, since with these definitions $e\mathbf{E} = e_{\rm unrat}\mathbf{E}_{\rm unrat}$ and $e\mathbf{B} = e_{\rm unrat}\mathbf{B}_{\rm unrat}$. However, a factor $4\pi$ appears in the Maxwell equations, $\nabla\cdot\mathbf{E}_{\rm unrat} = 4\pi\rho_{\rm unrat}$ and $\nabla\times\mathbf{B}_{\rm unrat} - \partial_0\mathbf{E}_{\rm unrat} = 4\pi\mathbf{J}_{\rm unrat}$; the Coulomb potential becomes $V(r) = (Q_1 Q_2)_{\rm unrat}/r$, and the electromagnetic energy density becomes $\varepsilon = (\mathbf{E}_{\rm unrat}^2 + \mathbf{B}_{\rm unrat}^2)/(8\pi)$.

In quantum electrodynamics, since $eA^\mu = e_{\rm unrat}A^\mu_{\rm unrat}$, the interaction vertex is $-ie\gamma^\mu$ in rationalized units and $-ie_{\rm unrat}\gamma^\mu$ in unrationalized units. However, in unrationalized units the gauge field is not canonically normalized, as we see for instance from the form of the energy density. Therefore in unrationalized units the factor associated to an incoming photon in a Feynman graph becomes $\sqrt{4\pi}\,\epsilon_\mu$ rather than just $\epsilon_\mu$, to an outgoing photon it is $\sqrt{4\pi}\,\epsilon^*_\mu$ rather than just $\epsilon^*_\mu$, and in the photon propagator the factor $1/k^2$ becomes $4\pi/k^2$. In quantum theory it is more convenient to have a canonically normalized gauge field, which is the reason why, except in Landau and Lifshitz, vol. IV (1982), rationalized units are always used.²

² Observe that, once the result is written in terms of $\alpha$, it is independent of the conventions on $e$, since $\alpha$ is always the same constant $\simeq 1/137$. For instance, the Coulomb potential between two electrons (in units $\hbar = c = 1$) is always $V(r) = \alpha/r$.
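As a small numerical illustration of the two charge conventions discussed above, the sketch below (in units $\hbar = c = 1$) computes the magnitude of the electron charge in rationalized and unrationalized units from the value of $\alpha$ quoted in the text, and checks that the Coulomb potential is the same in both. The function names are hypothetical and chosen only for this example.

```python
import math

ALPHA = 1 / 137.03599911  # fine structure constant, value quoted in the text


def charge_rationalized(alpha=ALPHA):
    """|e| in Heaviside-Lorentz units with hbar = c = 1: alpha = e^2 / (4 pi)."""
    return math.sqrt(4 * math.pi * alpha)


def charge_unrationalized(alpha=ALPHA):
    """|e| in unrationalized Gaussian units with hbar = c = 1: alpha = e_unrat^2."""
    return math.sqrt(alpha)


def coulomb_rationalized(q1, q2, r):
    """V(r) = Q1 Q2 / (4 pi r) with Q = q e (rationalized units)."""
    e = charge_rationalized()
    return (q1 * e) * (q2 * e) / (4 * math.pi * r)


def coulomb_unrationalized(q1, q2, r):
    """V(r) = Q1 Q2 / r with Q = q e_unrat (unrationalized units)."""
    e = charge_unrationalized()
    return (q1 * e) * (q2 * e) / r


print(charge_rationalized())      # a pure number of order 0.3
print(charge_unrationalized())    # smaller by a factor sqrt(4 pi)
print(coulomb_rationalized(1, 1, 2.0), coulomb_unrationalized(1, 1, 2.0))
```

Both potentials reduce to $q_1 q_2\,\alpha/r$, which is the convention-independent statement made in the footnote above.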
Experimental data. Unless explicitly speciﬁed otherwise, our experimental data are taken from the 2004 edition of the Review of Particle Physics of the Particle Data Group, S. Eidelman et al., Phys. Lett. B592, 1 (2004), also available online at http://pdg.lbl.gov.
1 Introduction

1.1 Overview
Quantum field theory is a synthesis of quantum mechanics and special relativity, and it is one of the great achievements of modern physics. Quantum mechanics, as formulated by Bohr, Heisenberg, Schrödinger, Pauli, Dirac, and many others, is an intrinsically non-relativistic theory. To make it consistent with special relativity, the real problem is not to find a relativistic generalization of the Schrödinger equation.¹ Wave equations, relativistic or not, cannot account for processes in which the number and the type of particles change, as in almost all reactions of nuclear and particle physics. Even the process of an atomic transition from an excited atomic state $A^*$ to a state $A$ with emission of a photon, $A^* \to A + \gamma$, is in principle inaccessible to this treatment (although in this case, describing the electromagnetic field classically and the atom quantum mechanically, one can get some correct results, even if in a not very convincing manner). Furthermore, relativistic wave equations suffer from a number of pathologies, like negative-energy solutions. A proper resolution of these difficulties implies a change of viewpoint, from wave equations, where one quantizes a single particle in an external classical potential, to quantum field theory, where one identifies the particles with the modes of a field, and quantizes the field itself. The procedure also goes under the name of second quantization.

¹ Actually, Schrödinger first found a relativistic equation, that today we call the Klein–Gordon equation. He then discarded it because it gave the wrong fine structure for the hydrogen atom, and he retained only the non-relativistic limit. See Weinberg (1995), page 4.

The methods of quantum field theory (QFT) have great generality and flexibility and are not restricted to the domain of particle physics. In a sense, field theory is a universal language, and it permeates many branches of modern research. In general, field theory is the correct language whenever we face collective phenomena, involving a large number of degrees of freedom, and this is the underlying reason for its unifying power. For example, in condensed matter the excitations in a solid are quanta of fields, and can be studied with field theoretical methods. An especially interesting example of the unifying power of QFT is given by the phenomenon of superconductivity which, expressed in the field theory language, turns out to be conceptually the same as the Higgs mechanism in particle physics. As another example we can mention that the Feynman path integral, which is a basic tool of modern quantum field theory, provides a formal analogy between field theory and statistical mechanics, which has stimulated very important exchanges between these two areas. Besides playing a crucial role for physicists, quantum field theory even plays a role in pure mathematics, and in the last 20 years the physicists' intuition stemming in particular from the path integral formulation of QFT has been at the basis of striking and unexpected advances in pure mathematics.

QFT obtains its most spectacular successes when the interaction is small and can be treated perturbatively. In quantum electrodynamics (QED) the theory can be treated order by order in the fine structure constant $\alpha = e^2/(4\pi\hbar c) \simeq 1/137$. Given the smallness of this parameter, a perturbative treatment is adequate in almost all situations, and the agreement between theoretical predictions and experiments can be truly spectacular. For example, the electron has a magnetic moment of modulus $g e\hbar/(4 m_e c)$, where $g$ is called the gyromagnetic ratio. While classical electrodynamics erroneously suggests $g = 1$, the Dirac equation gives $g = 2$, and QED predicts a small deviation from this value; the experimentally measured value is

$$\left.\frac{g-2}{2}\right|_{\rm exp} = 0.001\,159\,652\,187(4) \tag{1.1}$$

(the digit in parentheses is the experimental error on the last figure), and the theoretical prediction, computed perturbatively up to order $\alpha^4$, is

$$\left.\frac{g-2}{2}\right|_{\rm th} = \frac{\alpha}{2\pi} - (0.328\,478\,965\ldots)\left(\frac{\alpha}{\pi}\right)^2 + (1.176\,11\ldots)\left(\frac{\alpha}{\pi}\right)^3 - (1.434\ldots)\left(\frac{\alpha}{\pi}\right)^4 = 0.001\,159\,652\,140(5)(4)(27)\,.$$
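The perturbative series for $(g-2)/2$ can be summed numerically as a quick check. This is only a sketch: it uses the truncated coefficients printed in the text and the value $\alpha = 1/137.035\,999\,11$ quoted in the Notation section, so it should not be expected to reproduce the last digits of the full theoretical prediction. The names `COEFFS` and `anomaly` are illustrative.

```python
import math

ALPHA = 1 / 137.03599911  # fine structure constant, as quoted in the Notation section

# Coefficients of (alpha/pi)^n for n = 1..4, truncated as printed in the text.
# The first term alpha/(2 pi) corresponds to a coefficient 1/2.
COEFFS = [0.5, -0.328478965, 1.17611, -1.434]


def anomaly(alpha=ALPHA, order=4):
    """Sum the series for (g-2)/2 up to the given order in alpha/pi."""
    x = alpha / math.pi
    return sum(c * x ** (n + 1) for n, c in enumerate(COEFFS[:order]))


for order in range(1, 5):
    print(order, anomaly(order=order))
```

The first term alone already gives about 0.00116, and the higher orders move the result toward the quoted value 0.001 159 652 14; each successive term is suppressed by roughly a factor $\alpha/\pi \approx 2 \times 10^{-3}$.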
Different sources of errors on the last figures are written separately in parentheses. The theoretical error is due partly to the numerical evaluation of Feynman diagrams (there are 891 of them at order $\alpha^4$!) and partly to the fact that, at this level of precision, hadronic contributions come into play. We also need to know $\alpha$ with sufficient accuracy; this is provided by the quantum Hall effect. The gyromagnetic ratio has been measured very precisely also for the muon, and the accuracy of this measurement has been improved recently,² with the result $(g-2)/2|_{\rm exp} = 0.001\,165\,9208(6)$, and a theoretical prediction $(g-2)/2|_{\rm th} = 0.001\,165\,9181(7)$. The remaining discrepancy has aroused much interest, in the hope that it might be a signal of new physical effects, but to see whether this is actually the case requires first a better theoretical understanding of hadronic contributions, which are more difficult to compute. In any case, an agreement between theory and experiment at the level of 10 decimal figures for the electron (or eight for the muon) is spectacular, and it is among the most precise in physics.

² See http://www.g2.bnl.gov/. This value updates the value reported in the 2004 edition of the Review of Particle Physics.

As we know today, QED is only a part of a larger theory. As we approach the scales of nuclear physics, i.e. length scales $r \sim 10^{-13}$ cm or energies $E \sim 200$ MeV, the existence of new interactions becomes evident: strong interactions are responsible for instance for binding together neutrons and protons into nuclei, and weak interactions are responsible for a number of decays, like the beta decay of the neutron into the proton, electron and antineutrino, $n \to p e^- \bar{\nu}_e$. A successful theory of beta decay was already proposed by Fermi in 1934. We now understand the Fermi theory as a low energy approximation to a more complete theory, that unifies the weak and electromagnetic interactions into a single conceptual framework, the electroweak theory. This theory, developed in the early 1970s, together with the fundamental theory of strong interactions, quantum chromodynamics (QCD), has such spectacular experimental successes that it now goes under the name of the Standard Model. In the last decade of the 20th century the LEP machine at CERN performed a large number of precision measurements, at the level of one part in $10^4$, which are all completely reproduced by the theoretical predictions of the Standard Model. These results show that we do understand the laws of Nature down to the scale of $10^{-17}$ cm, i.e. four orders of magnitude below the size of a nucleus and nine orders of magnitude below the size of an atom. Part of the activity of high energy physicists nowadays is devoted to the search of physics beyond the Standard Model. The best hint for new physics presently comes from the recent experimental evidence for neutrino oscillations. These oscillations imply that neutrinos have a very small mass, whose deeper origin is suspected to be related to physics beyond the Standard Model. The Standard Model has a beautiful theoretical structure; its discovery and development, due among others to Glashow, Weinberg, Salam and 't Hooft, requires a number of new concepts compared to QED.
A detailed explanation of the Standard Model is beyond the scope of this course, but we will discuss two of its main ingredients: non-abelian gauge fields, or Yang–Mills theories, and spontaneous symmetry breaking through the Higgs mechanism. In spite of the remarkable successes of the Standard Model, the search for the fundamental laws governing the microscopic world is still very far from being completed. In the Standard Model itself there is still a missing piece, since it predicts a particle, the Higgs boson, which plays a crucial role and which has not yet been observed. LEP, after 11 years of glorious activity, was closed in November 2000, after reaching a maximum center-of-mass energy of 209 GeV. The new machine, LHC, is now under construction at CERN, and together with the Tevatron collider at Fermilab aims at exploring the TeV ($= 10^3$ GeV $= 10^{12}$ eV) energy range. It is hoped that they will find the Higgs boson and that they will test theoretical ideas like supersymmetry that, if correct, are expected to give observable signals at this energy scale.

Looking much beyond the Standard Model, there is a very substantial reason for believing that we are still far from a true understanding of the fundamental laws of Nature. This is because gravity cannot be included in the conceptual schemes that we have discussed so far. General relativity is incompatible with quantum field theory. From an experimental point of view, at present, this causes no real worry; the energy scale at which quantum gravity effects are expected to become important is so huge (of order $10^{19}$ GeV) that we can forget them altogether in accelerator experiments.³ There remains the conceptual need for a new theoretical scheme where these two pillars of modern physics, quantum field theory and general relativity, merge consistently. And, of course, one should also be subtle enough to find situations where this can give testable predictions. A consistent theoretical scheme is perhaps slowly emerging in the form of string theory; but this would lead us very far from the scope of this course.

³ However, this could change in theories with large extra dimensions. In fact, both in quantum field theory and in string theory, mechanisms have been devised such that some extra dimensions are accessible only to gravitational interactions, and not to electromagnetic, weak or strong interactions. In this case, it turns out that the extra dimensions could even be as large as the millimeter without conflicting with any experimental result, and the huge value $10^{19}$ GeV of the gravitational scale would emerge from a combination of the large volume of the extra dimensions and a much smaller mass-scale which characterizes the energy where genuine quantum gravity effects set in. This new gravitational mass-scale might even be as low as a few tens of TeV, and in this case it could be within the reach of future particle physics experiments.
1.2 Typical scales in high-energy physics
Before entering into the technical aspects of quantum field theory, it is important to have a physical understanding of the typical scales of atomic and particle physics and to be able to estimate the orders of magnitude involved. Often this can be done just with elementary dimensional considerations, supplemented by some very basic physical inputs. We will therefore devote this section to an overview of order of magnitude estimates in particle physics.

These estimates are much simplified by the use of units $\hbar = c = 1$. To understand the meaning of these units, observe first of all that $\hbar$ and $c$ are universal constants, i.e. they have the same numerical value for all observers. The speed of light has the value $c = 299\,792\,458$ m/s, with no error because, after having defined the unit of time from a particular atomic transition (a hyperfine transition of cesium-133), this value of $c$ is taken as the definition of the meter. However, instead of using the meter, we can decide to use a new unit of length (or a new unit of time) defined by the statement that in these units $c = 1$. Then, the velocity $v$ of a particle is measured in units of the speed of light, which is very natural since in particle physics we typically deal with relativistic objects. In these units $0 \leq v < 1$ for massive particles, and $v = 1$ for massless particles. The Planck constant $\hbar$ is another universal constant, and it has dimensions [energy] × [time] or [length] × [momentum] as we see for instance from the uncertainty principle. We can therefore choose units of energy such that $\hbar = 1$. Then all multiplicative factors of $\hbar$ and $c$ disappear from our equations and formally, from the point of view of dimensional analysis,

$$[\text{velocity}] = \text{pure number}\,, \tag{1.2}$$
$$[\text{energy}] = [\text{momentum}] = [\text{mass}]\,, \tag{1.3}$$
$$[\text{length}] = [\text{mass}]^{-1}\,. \tag{1.4}$$

The first two equations follow immediately from $c = 1$ while the third follows from the fact that $\hbar/(mc)$ is a length. Thus all physical quantities have dimensions that can be expressed as powers of mass or, equivalently, as powers of length. For instance an energy density, [energy]/[length]$^3$, becomes a [mass]$^4$. Units $\hbar = c = 1$ are called natural units. The fine structure constant $\alpha = e^2/(4\pi\hbar c) \simeq 1/137$ is a pure number, and therefore in natural units the electric charge $e$ becomes a pure number.

To make numerical estimates, it is useful to observe that $\hbar c$, in ordinary units, has dimensions [energy × time] × [velocity] = [energy] × [length]. In particle physics a useful unit of energy is the MeV ($= 10^6$ eV) and a typical length-scale is the fermi: 1 fm $= 10^{-13}$ cm; one fm is the typical size of a proton. Expressing $\hbar c$ in MeV × fm, one gets

$$\hbar c \simeq 200\ \text{MeV fm}\,. \tag{1.5}$$
(The precise value is 197.326 968 (17) MeV fm.) Then, in natural units, 1 fm $\simeq$ 1/(200 MeV). The following examples will show that sometimes we can go quite far in the understanding of physics with just very simple dimensional estimates. If we want to make dimensional estimates in QED, the two parameters that enter are the fine structure constant $\alpha \simeq 1/137$ and the electron mass, $m_e \simeq 0.5$ MeV/$c^2$. Note that in units $c = 1$ masses are expressed simply in MeV, as energies. We now consider a few examples.

The Compton radius. The simplest length-scale associated with a particle of mass $m$ in its rest frame is its Compton radius, $r_C = 1/m$. In particular, for the electron
$$ r_C = \frac{1}{m_e} \simeq \frac{200\ \text{MeV fm}}{0.5\ \text{MeV}} = 4 \times 10^{-11}\ \text{cm} . \tag{1.6} $$
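The conversion used in eq. (1.6) can be sketched in a few lines of Python. This is only an illustrative check: the function and constant names are not from the text, and the electron mass value 0.511 MeV is an assumed, slightly more precise input than the rounded 0.5 MeV used above.

```python
# Sketch: convert a length given in natural units (MeV^-1) to centimetres
# by multiplying with hbar*c, as done in eq. (1.6).
HBAR_C_MEV_FM = 197.326968  # MeV fm, the precise value quoted in the text
FM_IN_CM = 1e-13            # 1 fm = 10^-13 cm

def inverse_mev_to_cm(length_in_inverse_mev):
    """Convert a length expressed in MeV^-1 to cm."""
    return length_in_inverse_mev * HBAR_C_MEV_FM * FM_IN_CM

M_ELECTRON_MEV = 0.511  # assumed input; the text rounds this to 0.5 MeV

# Compton radius of the electron, r_C = 1/m_e
r_compton_cm = inverse_mev_to_cm(1.0 / M_ELECTRON_MEV)
print(f"r_C = {r_compton_cm:.2e} cm")
```

With the rounded inputs of the text (200 MeV fm and 0.5 MeV) the same function reproduces the estimate $4 \times 10^{-11}$ cm.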
Since $r_C$ does not depend on $\alpha$, it is the relevant length-scale in situations in which there is no dependence on the strength of the interaction. Historically, $r_C$ made its first appearance in the Compton scattering of X-rays off electrons. Classically, the wavelength of the scattered X-rays should be the same as that of the incoming waves, since the process is described in terms of forced oscillations. Quantum mechanically, treating the X-rays as photons, we understand that part of the momentum $h\nu$ of the incoming photon is used to produce the recoil of the electron, so the momentum of the outgoing photon is smaller, and its wavelength is larger. The wavelength of the outgoing photon is fixed by energy–momentum conservation, and therefore is independent of $\alpha$, so the relevant length-scale must be $r_C$. Indeed, a simple computation gives
$$ \lambda' - \lambda = r_C\, (1 - \cos\theta) , \tag{1.7} $$
where $\lambda$, $\lambda'$ are the initial and final X-ray wavelengths and $\theta$ is the scattering angle.

The hydrogen atom. Let us first estimate the Bohr radius $r_B$. The only mass that enters the problem is the reduced mass of the electron–proton system; since $m_p \simeq 938$ MeV is much bigger than $m_e$ we can identify the reduced mass with $m_e$, within a precision of 0.05 per cent. Dimensionally, again $r_B \sim 1/m_e$, but now $\alpha$ enters. Clearly, the radius of the bound state is smaller if the interaction responsible for the binding is stronger, while it must go to infinity in the limit $\alpha \to 0$, so $\alpha$ must be in the denominator and it is very natural to guess that $r_B \sim 1/(m_e \alpha)$. This is indeed the case, as can be seen with the following argument: by the uncertainty principle, an electron confined in a radius $r$ has a momentum $p \sim 1/r$. If the electron in the hydrogen atom is nonrelativistic (we will verify the consistency of this hypothesis a posteriori) its kinetic energy is $p^2/(2m_e) \sim 1/(2 m_e r^2)$. This kinetic energy must be balanced by the Coulomb potential, so at the equilibrium radius $1/(2 m_e r^2) \sim \alpha/r$, which indeed gives $r_B \sim 1/(m_e \alpha)$. In principle factors of 2 are beyond the power of dimensional estimates, but here it is quite tempting to observe that the virial theorem of classical mechanics states that, for a potential proportional to $1/r$, at equilibrium the kinetic energy is one half of the absolute value of the potential energy, so we would guess, more precisely, that $1/(2 m_e r_B^2) = \alpha/(2 r_B)$, i.e.
$$ r_B = \frac{1}{m_e \alpha} \simeq 0.5 \times 10^{-8}\ \text{cm} , \tag{1.8} $$
which is indeed the definition of the Bohr radius as found in the quantum mechanical treatment. The typical potential energy of the hydrogen atom is then
$$ V \sim V(r_B) = -\frac{\alpha}{r_B} = -m_e \alpha^2 , \tag{1.9} $$
and, again using the virial theorem, the kinetic energy is
$$ E = -\frac{1}{2} V \sim \frac{1}{2} m_e \alpha^2 . \tag{1.10} $$
This is the kinetic energy of a nonrelativistic electron with typical velocity
$$ v \sim \alpha . \tag{1.11} $$
Since $\alpha \ll 1$, our approximation of a nonrelativistic electron is indeed consistent. This of course was expected, since we know that, in a first approximation, the nonrelativistic Schrödinger equation gives a good description of the hydrogen atom. The sum of the kinetic and potential energy is $-(1/2) m_e \alpha^2$, so the binding energy of the hydrogen atom is
$$ \text{binding energy} = \frac{1}{2} m_e \alpha^2 \simeq \frac{1}{2}\, 0.5\ \text{MeV} \left(\frac{1}{137}\right)^2 \simeq 13.6\ \text{eV} . \tag{1.12} $$
The Rydberg energy is indeed defined as $(1/2) m_e \alpha^2$, and the Schrödinger equation gives the energy levels
$$ E_n = -\frac{m_e \alpha^2}{2 n^2} . \tag{1.13} $$
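The hydrogen-atom estimates of eqs. (1.8), (1.11) and (1.13) can be checked numerically with a minimal Python sketch. The numerical inputs ($\alpha = 1/137.036$, $m_e = 0.511$ MeV, $\hbar c = 197.327$ MeV fm) are assumed values, slightly more precise than the rounded ones used in the text, and the names are illustrative.

```python
# Sketch: Bohr radius, typical electron velocity and energy levels of hydrogen
# from m_e and alpha, restoring ordinary units with hbar*c.
ALPHA = 1.0 / 137.036               # fine structure constant (assumed input)
M_E_EV = 0.511e6                    # electron mass in eV (assumed input)
HBAR_C_EV_CM = 197.327e6 * 1e-13    # hbar*c in eV cm

def bohr_radius_cm():
    """r_B = 1/(m_e alpha), eq. (1.8), converted to cm."""
    return HBAR_C_EV_CM / (M_E_EV * ALPHA)

def level_energy_ev(n):
    """E_n = -m_e alpha^2 / (2 n^2), eq. (1.13), in eV."""
    return -0.5 * M_E_EV * ALPHA**2 / n**2

typical_velocity = ALPHA  # v ~ alpha in units of c, eq. (1.11)

print(f"r_B = {bohr_radius_cm():.3e} cm")
print(f"E_1 = {level_energy_ev(1):.2f} eV")
print(f"v   = {typical_velocity:.4f} c")
```

The ground-state value of about $-13.6$ eV agrees with eq. (1.12), and $v \ll 1$ confirms that the nonrelativistic treatment is self-consistent.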
In QED, eq. (1.13) is just the first term of an expansion in $\alpha$; at next order one finds the fine structure of the hydrogen atom,
$$ E_{n,j} = m_e \left[ -\frac{\alpha^2}{2n^2} - \frac{\alpha^4}{2n^4} \left( \frac{n}{j + \frac{1}{2}} - \frac{3}{4} \right) + \ldots \right] , \tag{1.14} $$
where $j$ is the total angular momentum and, to be more accurate, the electron mass should be replaced by the reduced mass $m_e m_p/(m_e + m_p)$. We will derive eq. (1.14) in Solved Problem 3.1. The fine structure constant $\alpha$ gets its name from this formula. From eq. (1.11) we understand that, in the hydrogen atom, the expansion in $\alpha$ is the same as an expansion in powers of $v$, and the fine structure of the hydrogen atom is just the first relativistic correction.

Electron–photon scattering. We want to estimate the cross-section for the scattering of a photon by an electron, which we take initially at rest, $e^- \gamma \to e^- \gamma$. We denote by $\omega$ the initial photon energy (in natural units the energy of the photon $E = \hbar\omega$ becomes simply $\omega$). The energy of the final photon is fixed by the initial energy $\omega$ and by the scattering angle $\theta$, so the total cross-section (i.e. the cross-section integrated over the scattering angle) can depend only on two energy scales, $m_e$ and $\omega$, and on the dimensionless coupling $\alpha$. The dependence on $\alpha$ is determined by observing that the scattering process takes place via the absorption of the incoming photon and the emission of the outgoing photon. As we will study in detail in Chapters 5 and 7, this is a process of second order in perturbation theory and its amplitude is $O(e^2)$, so the cross-section, which is proportional to the squared amplitude, is $O(e^4)$, i.e. $O(\alpha^2)$. For a generic incoming photon energy $\omega$, we have two different scales in the problem and we cannot go very far with dimensional considerations. Things simplify in the limit $\omega \ll m_e$. In this limit we can neglect $\omega$ compared to $m_e$ and we have basically only one mass-scale, $m_e$. Since the cross-section has dimensions [length]$^2$, we can estimate $\sigma \sim \alpha^2/m_e^2$. It is therefore useful to define $r_0$,
$$ r_0 = \frac{\alpha}{m_e} \simeq 2.8 \times 10^{-13}\ \text{cm} , \tag{1.15} $$
so that the cross-section is $\sigma \sim r_0^2$. The exact computation gives the result
$$ \sigma_T = \frac{8}{3} \pi r_0^2 \tag{1.16} $$
and the factor of $\pi$ is also easily understood, since a cross-section is an effective area, so it is $\sim \pi r_0^2$. The electron–photon cross-section at $\omega \ll m_e$ is known as the Thomson cross-section and can be computed just with classical electrodynamics, since when $\omega \ll m_e$ the photons are well described by a classical electromagnetic field; $r_0$ is therefore called the classical electron radius, and gives a measure of the size of an electron, as seen using classical electromagnetic fields as a probe.
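As a numerical illustration of eqs. (1.15) and (1.16), the following Python sketch evaluates the classical electron radius and the Thomson cross-section. The input values ($\alpha = 1/137.036$, $m_e = 0.511$ MeV, $\hbar c = 197.327$ MeV fm) are assumptions, and the names are illustrative rather than taken from the text.

```python
import math

# Sketch: classical electron radius r_0 = alpha/m_e and Thomson cross-section
# sigma_T = (8/3) pi r_0^2, converted to fm and barn.
ALPHA = 1.0 / 137.036     # assumed input
M_E_MEV = 0.511           # assumed input
HBAR_C_MEV_FM = 197.327   # hbar*c in MeV fm
FM2_PER_BARN = 100.0      # 1 barn = 1e-24 cm^2 = 100 fm^2

r0_fm = ALPHA * HBAR_C_MEV_FM / M_E_MEV
sigma_thomson_fm2 = (8.0 / 3.0) * math.pi * r0_fm**2
sigma_thomson_barn = sigma_thomson_fm2 / FM2_PER_BARN

print(f"r_0     = {r0_fm:.3f} fm")
print(f"sigma_T = {sigma_thomson_barn:.3f} barn")
```

The result, $r_0 \simeq 2.8$ fm, agrees with eq. (1.15), and the corresponding Thomson cross-section is a fraction of a barn.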
4 In general, not every quantum computation has a well-defined classical limit; just think of what happens to the black body spectrum when $\hbar \to 0$ (indeed, this example was just the original motivation of Planck for introducing $\hbar$!). However, reinstating $\hbar$ and $c$ explicitly, the classical electron radius is $r_0 = \alpha\,(\hbar/m_e c) = (e^2/4\pi\hbar c)(\hbar/m_e c)$ and $\hbar$ cancels, so the limit $\hbar \to 0$ is well defined.
Consider now the opposite limit $\omega \gg m_e$. In this case the cross-section must have a dependence on the energy of the photon and, because of Lorentz invariance, the cross-section integrated over the angles will depend on the energy of the photon through the energy in the center of mass system. If $k$ is the initial four-momentum of the photon and $p_e$ is the initial four-momentum of the electron, the total initial four-momentum is $p = k + p_e$ and the square of the energy in the center of mass is $s = p^2$. In the rest frame of the electron $p_e = (m_e, 0, 0, 0)$ and $k = (\omega, 0, 0, \omega)$, so $s = (m_e + \omega)^2 - \omega^2 = 2 m_e \omega + m_e^2$. In the limit $\omega \gg m_e$ we have $s \gg m_e^2$ and we would expect that we can neglect $m_e$. Then the only energy scale is provided by $\sqrt{s}$, and we would expect that $\sigma \sim \alpha^2/s$. Here however there is a subtlety. In the previous case, $\omega \ll m_e$, we have implicitly assumed that in the limit $\omega \to 0$ the cross-section is finite. This is indeed the case, since in this limit the electromagnetic field can be treated classically, and the classical computation gives a finite answer (see footnote 4). If instead $\omega \gg m_e$, we are effectively taking the limit $m_e \to 0$; it turns out that this limit is problematic in QED, and taking $m_e \to 0$ one finds so-called infrared divergences. In fact, from the explicit computation one finds that the correct high-energy limit of the cross-section is
$$ \sigma \simeq \frac{2\pi\alpha^2}{s} \log\frac{s}{m_e^2} . \tag{1.17} $$
This is an example of the fact that divergences, which are typical of quantum field theory, can spoil naive dimensional analysis. We will examine this issue in a more general context in Section 5.9.

In conclusion, we have found three different scales that can be constructed with $m_e$ and $\alpha$. The largest is $r_B = 1/(m_e \alpha)$ and gives the characteristic size of an electron bound by the Coulomb potential of a proton; $r_C = 1/m_e$ is the characteristic length-scale associated with a free electron in its rest frame, and the smallest, $r_0 = \alpha/m_e$, is associated with classical $e\gamma$ scattering.

Nucleons and strong interactions.
Nuclei are bound states of nucleons, i.e. of protons and neutrons, with a radius $r \sim A^{1/3} \times 1$ fm, where $A$ is the total number of nucleons (so that the volume is proportional to $A$). From the uncertainty principle, a particle confined within 1 fm has a momentum $p \sim 1/(1\ \text{fm}) \simeq 200$ MeV. If the nucleons in the nucleus are nonrelativistic, their kinetic energy is
$$ E_N \simeq \frac{p_N^2}{2 m_N} \simeq 20\ \text{MeV} , \tag{1.18} $$
so this must be the typical scale of nuclear binding energies; the typical velocity is
$$ v_N \simeq \frac{p_N}{m_N} \simeq 0.2 . \tag{1.19} $$
This value of $v$ shows that the nonrelativistic approximation is roughly correct, but relativistic corrections in nuclei are numerically more important than in atoms. Since the corrections are proportional to $v^2$ (compare eqs. (1.11) and (1.14)), in nuclei they are of order 4%.
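The nuclear estimates of eqs. (1.18) and (1.19) follow from the same conversion factor $\hbar c$. A minimal Python sketch, assuming $m_N \simeq 938$ MeV and $\hbar c = 197.327$ MeV fm (the names are illustrative):

```python
# Sketch: momentum, kinetic energy and velocity of a nucleon confined
# within about 1 fm, using p ~ 1/r from the uncertainty principle.
HBAR_C_MEV_FM = 197.327   # hbar*c in MeV fm
M_NUCLEON_MEV = 938.0     # nucleon mass (assumed input)

def confined_nucleon(radius_fm):
    """Return (momentum in MeV, kinetic energy in MeV, velocity in units of c)."""
    momentum = HBAR_C_MEV_FM / radius_fm                  # p ~ 1/r
    kinetic_energy = momentum**2 / (2.0 * M_NUCLEON_MEV)  # nonrelativistic
    velocity = momentum / M_NUCLEON_MEV
    return momentum, kinetic_energy, velocity

p_n, e_n, v_n = confined_nucleon(1.0)
print(f"p_N ~ {p_n:.0f} MeV, E_N ~ {e_n:.0f} MeV, v_N ~ {v_n:.2f}")
print(f"relativistic corrections ~ v^2 = {v_n**2:.3f}")
```

The value of $v_N^2$ of a few per cent is the size of the relativistic corrections mentioned above.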
It is also interesting to estimate the analogue of $\alpha$ for the strong interactions. For this we need to know that the nucleon–nucleon strong potential is not Coulomb-like, but rather decays exponentially at large distances,
$$ V \simeq -\frac{\alpha_s}{r}\, e^{-m_\pi r} , \tag{1.20} $$
where $\alpha_s$ is the coupling constant of strong interactions and $m_\pi \simeq 140$ MeV is the mass of a particle, the pion, that at length-scales $l \gtrsim 1$ fm can be considered the mediator of the strong interaction (we will derive this result in Section 6.6). Consider for instance a proton–neutron system, which makes a bound state (the nucleus of deuterium) of radius $r \sim 1$ fm. At equilibrium, $(-1/2)V$ must be equal to the kinetic energy $p^2/(2m) \sim 1/(2 m r^2)$, where $m \simeq m_p/2$ is the reduced mass of the two-nucleon system (and the $-1/2$ comes again from the virial theorem). Since we already know that the equilibrium radius is at $r \simeq 1$ fm, we find $\alpha_s \sim 2 (m_p r)^{-1} \exp\{m_\pi r\}|_{r = 1\,\text{fm}} \sim 0.8$. The precise numerical value is not of great significance, since we are making order-of-magnitude estimates, but anyway this shows that the coupling $\alpha_s$ is not a small number, and strong interactions cannot be treated perturbatively in the same way as QED (see footnote 5).

Lifetime and cross-sections of strong interactions. Hadrons are defined as particles which have strong interactions. If a particle decays by strong interactions it is possible to estimate its lifetime $\tau$ as follows. The quantities that can enter the computation of the lifetime are the coupling $\alpha_s$, the masses of the particles involved, and the typical interaction radius of the strong interactions. However, these particles have typical masses in the GeV range, and the interaction range of the strong interaction is $\sim 1$ fm $\simeq (200\ \text{MeV})^{-1}$. Then all energy scales in the problem are between a few hundred MeV and a few GeV, so in a first approximation we can say that the only length-scale in the problem is of the order of the fermi. Furthermore, we have seen that $\alpha_s = O(1)$.
This means that, in order of magnitude, the lifetimes of particles which decay by strong interactions are in the ballpark of $\tau \sim 1\ \text{fm}/c \sim 3 \times 10^{-24}$ s. Particles with such a small lifetime only show up as peaks in a plot of a scattering cross-section against the energy, and are called resonances, since the mechanism that produces the peak is conceptually the same as the resonance in classical mechanics (we will discuss resonances in detail in Section 6.5). The width $\Gamma$ of the peak is related to the lifetime by $\Gamma = \hbar/\tau$ or, in natural units,
$$ \Gamma = \frac{1}{\tau} \sim \frac{1}{1\ \text{fm}} \simeq 200\ \text{MeV} . \tag{1.21} $$
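The strong-interaction estimates above (the coupling $\alpha_s$ from the virial argument, the lifetime $\tau \sim 1\ \text{fm}/c$, and the width of eq. (1.21)) can be reproduced with a short Python sketch. The inputs $m_p = 938$ MeV, $m_\pi = 140$ MeV and $\hbar c = 197.327$ MeV fm are assumed values, and the function name is hypothetical.

```python
import math

# Sketch: order-of-magnitude estimates for strong interactions at r ~ 1 fm.
HBAR_C_MEV_FM = 197.327       # hbar*c in MeV fm
C_FM_PER_S = 2.99792458e23    # speed of light in fm/s

def strong_coupling_estimate(radius_fm=1.0, m_proton_mev=938.0, m_pion_mev=140.0):
    """alpha_s ~ 2 (m_p r)^-1 exp(m_pi r), with r converted to MeV^-1."""
    m_p_r = m_proton_mev * radius_fm / HBAR_C_MEV_FM   # dimensionless
    m_pi_r = m_pion_mev * radius_fm / HBAR_C_MEV_FM    # dimensionless
    return 2.0 / m_p_r * math.exp(m_pi_r)

alpha_s = strong_coupling_estimate()
lifetime_s = 1.0 / C_FM_PER_S        # tau ~ 1 fm / c
width_mev = HBAR_C_MEV_FM / 1.0      # Gamma ~ 1/(1 fm) in natural units

print(f"alpha_s ~ {alpha_s:.2f}")
print(f"tau     ~ {lifetime_s:.1e} s")
print(f"Gamma   ~ {width_mev:.0f} MeV")
```

The coupling comes out slightly below one, consistent in order of magnitude with the value quoted in the text; the point is only that it is not a small number.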
5 We will see in Section 5.9 that the coupling constants actually are not constant at all, but rather depend on the length-scale at which they are measured. We will see that the correct statement is that the theory of strong interactions, QCD, cannot be treated perturbatively at length-scales $l \gtrsim 1$ fm, while $\alpha_s$ becomes small at $l \ll 1$ fm, and there perturbation theory works well.

We can estimate similarly the typical cross-sections of processes mediated by strong interactions. Since a cross-section is an effective area, we must typically have $\sigma \sim \pi\, (1\ \text{fm})^2 \sim 3 \times 10^{-26}\ \text{cm}^2$. A common unit for cross-sections is the barn, 1 barn $= 10^{-24}\ \text{cm}^2$. Therefore a typical strong interaction cross-section, in the absence of dynamical phenomena like resonances, is of the order of 30 millibarns. Here we have implicitly assumed that the particles are relativistic, i.e. their relative speed is close to one. Otherwise we must take into account that the relevant length-scale for a particle of mass $m$ and velocity $v \ll 1$ is given by the De Broglie wavelength $\lambda = 1/(mv) \gg 1/m$, and a typical nuclear cross-section for slow particles, in the absence of resonances, is of the order $\sigma \sim \pi\lambda^2$, see Exercise 1.3.
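The geometric estimate $\sigma \sim \pi r^2$ and its conversion to millibarns can be written as a small Python helper. This is only a sketch of the arithmetic; the function name is illustrative.

```python
import math

# Sketch: geometric cross-section pi r^2 for an interaction radius r,
# expressed in millibarns.
FM2_IN_CM2 = 1e-26    # (1 fm)^2 = (1e-13 cm)^2
BARN_IN_CM2 = 1e-24   # 1 barn = 1e-24 cm^2

def geometric_cross_section_mb(radius_fm):
    """Return pi * r^2 in millibarns for a radius given in fm."""
    area_cm2 = math.pi * radius_fm**2 * FM2_IN_CM2
    return area_cm2 / BARN_IN_CM2 * 1e3

sigma_mb = geometric_cross_section_mb(1.0)
print(f"sigma ~ {sigma_mb:.0f} mb")
```

For slow particles the same helper can be applied with the radius replaced by the De Broglie wavelength $\lambda = 1/(mv)$, converted to fm.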
Table 1.1 Examples of electroweak decays. In the left column we give the main decay mode of the decaying particle and in the right column its lifetime. Observe the broad range of lifetimes. For lifetimes as small as that of the $Z^0$, it is more convenient to give the decay width. For the $Z^0$, the full width is $\Gamma = 2.4952(23)$ GeV.

main mode                                  lifetime (sec)
$n \to p\, e^- \bar\nu_e$                  $0.8857(8) \times 10^{3}$
$\mu^- \to e^- \bar\nu_e \nu_\mu$          $2.19703(4) \times 10^{-6}$
$\pi^+ \to \mu^+ \nu_\mu$                  $2.6033(5) \times 10^{-8}$
$\Lambda^0 \to p\, \pi^-$                  $2.632(20) \times 10^{-10}$
$K_S^0 \to \pi^+ \pi^-$                    $0.8958(6) \times 10^{-10}$
$\pi^0 \to \gamma\gamma$                   $0.84(6) \times 10^{-16}$
$\Sigma^0 \to \Lambda\gamma$               $0.74(7) \times 10^{-19}$
$Z^0 \to \text{hadrons}$                   $2.6379(24) \times 10^{-25}$
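The relation $\Gamma = \hbar/\tau$ used around eq. (1.21) connects the width and the lifetime quoted for the $Z^0$ in Table 1.1. The following Python sketch performs the conversion; the value $\hbar \simeq 6.582119 \times 10^{-25}$ GeV s is an assumed input not given in the text, and the function names are illustrative.

```python
# Sketch: convert between a decay width (GeV) and a lifetime (s)
# using tau = hbar / Gamma.
HBAR_GEV_S = 6.582119e-25   # hbar in GeV s (assumed input)

def lifetime_from_width(width_gev):
    """Lifetime in seconds for a full width given in GeV."""
    return HBAR_GEV_S / width_gev

def width_from_lifetime(lifetime_s):
    """Full width in GeV for a lifetime given in seconds."""
    return HBAR_GEV_S / lifetime_s

tau_z = lifetime_from_width(2.4952)          # Z0 full width from Table 1.1
gamma_muon = width_from_lifetime(2.19703e-6) # muon lifetime from Table 1.1

print(f"tau(Z0)   = {tau_z:.4e} s")
print(f"Gamma(mu) = {gamma_muon:.3e} GeV")
```

The $Z^0$ lifetime obtained in this way agrees with the entry in the table, while the muon width is many orders of magnitude smaller, which illustrates the broad range of electroweak lifetimes.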
Electroweak decays. Leptons do not have strong interactions and either are stable or decay through electroweak interactions. Furthermore, strong interactions obey a number of conservation laws, with the result that many hadrons also cannot decay via the strong interaction; in this case they decay through electroweak interactions (except for the proton, which in the Standard Model is stable) and their lifetime is considerably longer than the typical lifetimes $\tau \sim 10^{-24}$ s of strong decays. Weak decays span a broad range of lifetimes because they depend on quite different mass-scales: the electroweak scale, the mass of the decaying particle, and the masses of the decay products. While in the case of hadronic resonances the scales which are involved are all between a few hundred MeV and a few GeV, for weak decays these scales can be very different from each other: the electroweak scale is $O(100)$ GeV, while the masses of the decaying particle or of the decay products can be anywhere between zero (for the photon) or less than a few eV (for the electron neutrino) up to hundreds of GeV. Furthermore the electroweak coupling constants are not of order one. Rather, the electromagnetic coupling is $\alpha \sim 1/137 \simeq 0.007$ while, as we will discuss in Chapter 8, weak interactions are characterized by two coupling constants $g^2/(4\pi)$ and $\bar g^2/(4\pi)$, both numerically of order 0.1. For these reasons the electroweak lifetimes, even in order of magnitude, vary from case to case. Some examples are given in Table 1.1. The lifetime can be written as
$$ \tau = \frac{\hbar}{\Gamma} = \frac{\hbar}{\sum_i \Gamma_i} \tag{1.22} $$
where in the last equality the sum runs over all decay channels. $\Gamma$ is called the full width, while the $\Gamma_i$ are the partial widths relative to the decay mode labeled by $i$. In the first column of Table 1.1 we give the dominant decay mode, i.e. the mode with the largest partial width. In the second column we give the lifetime, i.e. the inverse of the full width. The quantity $\Gamma_i/\Gamma$ is called the branching ratio of the mode labeled by $i$. We will compute explicitly many weak decays in the Solved Problems section of Chapter 8.

The Planck mass. Using simple dimensional estimates we can also understand the statement made at the end of Section 1.1 that, in the realm of particle physics, gravity enters into play only at huge energies. Comparing the Newton potential $V = -G_N m^2/r$ with a Coulomb potential $V = -e^2/(4\pi r) = -(\alpha\hbar c)/r$, we see that $G_N$ times a mass squared
has the dimensions of $\hbar c$. Therefore from the fundamental constants $\hbar$, $c$, $G_N$ we can build a mass-scale
$$ M_{\rm Pl} = \sqrt{\frac{\hbar c}{G_N}} , \tag{1.23} $$
known as the Planck mass, whose numerical value is $M_{\rm Pl} \simeq 1.2 \times 10^{19}$ GeV/$c^2$. In natural units, then, $G_N = 1/M_{\rm Pl}^2$ and we see, comparing the Newton and Coulomb laws, that the gravitational analogue of the fine structure constant is $(m/M_{\rm Pl})^2$. More precisely, since in general relativity any form of energy is a source for the gravitational field, particles with an energy $E$ have an effective gravitational coupling
$$ \alpha_G = \frac{E^2}{M_{\rm Pl}^2} . \tag{1.24} $$
At the typical energies of particle physics, say $E \sim 1$ GeV, we have $\alpha_G \sim 10^{-38}$ and gravity is completely irrelevant. In the realm of particle physics, gravity becomes important only at energies comparable to the Planck scale. These considerations only apply to the microscopic domain. On the macroscopic scale, gravity can become more important than electric interactions because it is always attractive, so it has a cumulative effect, while on a large scale the electrostatic forces are screened by the formation of electrically neutral objects, and the residual force decreases faster than $1/r^2$. Since $M_{\rm Pl}$ provides a natural mass-scale, in quantum gravity it is customary to use units in which not only $\hbar$ and $c$ but also $M_{\rm Pl}$ are set equal to one. These are called Planck units, and in these units all physical quantities are dimensionless. We will not use them in this book.
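The numerical value of the Planck mass in eq. (1.23) and the smallness of $\alpha_G$ in eq. (1.24) can be verified with a few lines of Python. The SI values of $\hbar$, $c$, $G_N$ and of the GeV are assumed inputs that do not appear in the text, and the names are illustrative.

```python
import math

# Sketch: Planck mass M_Pl = sqrt(hbar c / G_N) in GeV/c^2 and the
# effective gravitational coupling alpha_G = E^2 / M_Pl^2.
HBAR_J_S = 1.054571817e-34    # J s (assumed input)
C_M_PER_S = 2.99792458e8      # m/s
G_NEWTON_SI = 6.674e-11       # m^3 kg^-1 s^-2 (assumed input)
JOULE_PER_GEV = 1.602176634e-10

planck_mass_kg = math.sqrt(HBAR_J_S * C_M_PER_S / G_NEWTON_SI)
planck_mass_gev = planck_mass_kg * C_M_PER_S**2 / JOULE_PER_GEV

def gravitational_coupling(energy_gev):
    """alpha_G = E^2 / M_Pl^2 for a particle of energy E (in GeV)."""
    return (energy_gev / planck_mass_gev) ** 2

print(f"M_Pl = {planck_mass_gev:.2e} GeV")
print(f"alpha_G(1 GeV) = {gravitational_coupling(1.0):.1e}")
```

At $E \sim 1$ GeV the coupling is of order $10^{-38}$, as stated above, and it becomes of order one only when $E$ approaches $M_{\rm Pl}$.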
Further reading

• A historical introduction to quantum field theory is given in Weinberg (1995), Chapter 1.

• The standard compilation of experimental data for high-energy physics is the Review of Particle Physics of the Particle Data Group. Unless explicitly stated otherwise, our experimental data are taken from the 2004 edition, S. Eidelman et al., Phys. Lett. B592, 1 (2004), also available online at http://pdg.lbl.gov.

• Precision measurements are a fascinating field by themselves; the experimentally minded student might enjoy browsing the detailed article by F. J. M. Farley and E. Picasso, The muon g−2 experiment, in T. Kinoshita ed., Quantum Electrodynamics, World Scientific, Singapore 1990. Recently the measurement of the g − 2 of the muon has been further improved by an experiment in Brookhaven, see the link http://www.g2.bnl.gov/

• A well-written popular book, which gives a flavor of modern research in quantum gravity and string theory, is B. Greene, The elegant universe: superstrings, hidden dimensions, and the quest for the ultimate theory, Norton, New York 1999.

• QFT is a domain where there can be an interplay between frontier research in theoretical physics and in pure mathematics, and in the last decades this has generated important advances in both fields. The physicist who wishes an introduction to the application to physics of important concepts of geometry and topology (like cohomology groups, complex manifolds, fibre bundles, characteristic classes, etc.) can consult, for instance, Nakahara (1990). These concepts find many applications in the theory of nonabelian gauge fields and in string theory. Conversely, the mathematician interested in the mathematical applications of QFT, supersymmetry and string theory is referred to P. Deligne et al. eds., Quantum Fields and Strings: A Course for Mathematicians, AMS IAS 1999.
Exercises

(1.1) The Universe is permeated by a thermal background of electromagnetic radiation at a temperature $T = 2.725(1)$ K (the cosmic microwave background radiation, or CMB). Estimate with dimensional arguments the energy density of this gas of photons and compare it with the critical density for closing the Universe, $\rho_c \sim 0.5 \times 10^{-5}\ \text{GeV/cm}^3$. [Hint: a useful mnemonic for $k_B$ is given by the fact that, at room temperature $T = 300$ K, $k_B T \simeq (1/40)$ eV. In the energy density, the numerical constant in front of $(k_B T)^4$ turns out to be $(\pi^2/30)\, g(T)$, where $g(T)$ is of the order of the number of particles which are relativistic at a temperature $T$, i.e. which have $m \ll T$. With $T \simeq 2.7$ K, only the photon and at most three neutrinos are relativistic and $g(T)$ is between 3 and 4. Then, for the purpose of this exercise, the only thing that matters is that the constant $(\pi^2/30)\, g(T)$ is of order one.]

(1.2) Model the Sun as an ionized plasma of electrons and protons, with an average temperature $T \simeq 4.5 \times 10^6$ K and an average mass density $\rho \simeq 1.4\ \text{gm/cm}^3$. Estimate the mean free path of photons in the Sun's interior, and compare the contribution to the mean free path coming from the scattering on electrons with that from the scattering on protons. Knowing that the radius of the Sun is $R \simeq 6.96 \times 10^{10}$ cm, estimate the total time that a photon takes to escape from the Sun. [Hint: recall that the mean free path $l$ of a particle scattering off an ensemble of targets with number density (i.e. particles per unit volume) $n$ and cross-section $\sigma$ is
$$ l = \frac{1}{n\sigma} \tag{1.25} $$
or, if there are different species of targets, $l = 1/\sum_i n_i \sigma_i$.]

(1.3) Estimate the cross-section for a nonrelativistic neutron with kinetic energy $E \sim 1$ MeV, scattering on a proton at rest.
2 Lorentz and Poincaré symmetries in QFT

We mentioned in the Introduction that quantum field theory (QFT) is a synthesis of the principles of quantum mechanics and of special relativity. Our first task will be to understand how Lorentz symmetry is implemented in field theory. We will study the representations of the Lorentz group in terms of fields and we will introduce scalar, spinor, and vector fields. We will then examine the information coming from Poincaré invariance. This chapter is rather mathematical and formal. The effort will pay off, however, since an understanding of this group theoretical approach greatly simplifies the construction of the Lagrangians for the various fields in Chapter 3 and gives in general a deeper understanding of various aspects of QFT. From now on we always use natural units $\hbar = c = 1$.
2.1 Lie groups
Lie groups play a central role in physics, and in this section we recall some of their main properties. In the next sections we will apply these concepts to the study of the Lorentz and Poincaré groups. A Lie group is a group whose elements $g$ depend in a continuous and differentiable way on a set of real parameters $\theta^a$, $a = 1, \ldots, N$. Therefore a Lie group is at the same time a group and a differentiable manifold. We write a generic element as $g(\theta)$ and without loss of generality we choose the coordinates $\theta^a$ such that the identity element $e$ of the group corresponds to $\theta^a = 0$, i.e. $g(0) = e$. A (linear) representation $R$ of a group is an operation that assigns to a generic, abstract element $g$ of a group a linear operator $D_R(g)$ defined on a linear space,
$$ g \mapsto D_R(g) , \tag{2.1} $$
with the properties that (i): $D_R(e) = 1$, where 1 is the identity operator, and (ii): $D_R(g_1) D_R(g_2) = D_R(g_1 g_2)$, so that the mapping preserves the group structure. The space on which the operators $D_R$ act is called the basis for the representation $R$. A typical example of a representation is a matrix representation. In this case the basis is a vector space of finite dimension
$n$, and an abstract group element $g$ is represented by an $n \times n$ matrix $(D_R(g))^i{}_j$, with $i, j = 1, \ldots, n$. The dimension of the representation is defined as the dimension $n$ of the base space. Writing a generic element of the base space as $(\phi^1, \ldots, \phi^n)$, a group element $g$ induces a transformation of the vector space
$$ \phi^i \to (D_R(g))^i{}_j\, \phi^j . \tag{2.2} $$
Equation (2.2) allows us to attach a physical meaning to a group element: before introducing the concept of representation, a group element $g$ is just an abstract mathematical object, defined by its composition rules with the other group members. Choosing a specific representation instead allows us to interpret $g$ as a transformation on a certain space; for instance, taking as group $SO(3)$ and as base space the spatial vectors $\mathbf{v}$, an element $g \in SO(3)$ can be interpreted physically as a rotation in three-dimensional space. A representation $R$ is called reducible if it has an invariant subspace, i.e. if the action of any $D_R(g)$ on the vectors in the subspace gives another vector of the subspace. Conversely, a representation with no invariant subspace is called irreducible. A representation is completely reducible if, for all elements $g$, the matrices $D_R(g)$ can be written, with a suitable choice of basis, in block diagonal form. In other words, in a completely reducible representation the basis vectors $\phi^i$ can be chosen so that they split into subsets that do not mix with each other under eq. (2.2). This means that a completely reducible representation can be written, with a suitable choice of basis, as the direct sum of irreducible representations. Two representations $R$, $R'$ are called equivalent if there is a matrix $S$, independent of $g$, such that for all $g$ we have $D_R(g) = S^{-1} D_{R'}(g) S$. Comparing with eq. (2.2), we see that equivalent representations correspond to a change of basis in the vector space spanned by the $\phi^i$. When we change the representation, in general the explicit form and even the dimensions of the matrices $D_R(g)$ will change. However, there is an important property of a Lie group that is independent of the representation. This is its Lie algebra, which we now introduce. By the assumption of smoothness, for $\theta^a$ infinitesimal, i.e. in the neighborhood of the identity element, we have
$$ D_R(\theta) \simeq 1 + i\theta_a T_R^a , \tag{2.3} $$
with
$$ T_R^a \equiv -i \left. \frac{\partial D_R}{\partial \theta_a} \right|_{\theta = 0} . \tag{2.4} $$
The $T_R^a$ are called the generators of the group in the representation $R$. It can be shown that, with an appropriate choice of the parametrization far from the identity, the generic group elements $g(\theta)$ can always be represented by (see footnote 1)
$$ D_R(g(\theta)) = e^{i\theta_a T_R^a} , \tag{2.5} $$
whose infinitesimal form reproduces eq. (2.3). The factor $i$ in the definition (2.4) is chosen so that, if in the representation $R$ the generators are hermitian, then the matrices $D_R(g)$ are unitary. In this case $R$ is a unitary representation. Given two matrices $D_R(g_1) = \exp(i\alpha_a T_R^a)$ and $D_R(g_2) = \exp(i\beta_a T_R^a)$, their product is equal to $D_R(g_1 g_2)$ and therefore must be of the form $\exp(i\delta_a T_R^a)$, for some $\delta_a(\alpha, \beta)$,
$$ e^{i\alpha_a T_R^a}\, e^{i\beta_a T_R^a} = e^{i\delta_a T_R^a} . \tag{2.6} $$

1 To be precise, this is only true for the component of the group manifold connected with the identity.
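Observe that $T_R^a$ is a matrix. If $A$, $B$ are matrices, in general $e^A e^B \neq e^{A+B}$, so in general $\delta_a \neq \alpha_a + \beta_a$. Taking the logarithm and expanding up to second order in $\alpha$ and $\beta$ we get
$$ i\delta_a T_R^a = \log\left\{ \left[1 + i\alpha_a T_R^a + \tfrac{1}{2}(i\alpha_a T_R^a)^2\right] \left[1 + i\beta_a T_R^a + \tfrac{1}{2}(i\beta_a T_R^a)^2\right] \right\} $$
$$ \qquad = \log\left[ 1 + i(\alpha_a + \beta_a) T_R^a - \tfrac{1}{2}(\alpha_a T_R^a)^2 - \tfrac{1}{2}(\beta_a T_R^a)^2 - \alpha_a \beta_b T_R^a T_R^b \right] . \tag{2.7} $$
Expanding the logarithm, $\log(1 + x) \simeq x - x^2/2$, and paying attention to the fact that the $T_R^a$ do not commute, we get
$$ \alpha_a \beta_b \left[ T_R^a, T_R^b \right] = i\gamma_c(\alpha, \beta)\, T_R^c , \tag{2.8} $$
with $\gamma_c(\alpha, \beta) = -2(\delta_c(\alpha, \beta) - \alpha_c - \beta_c)$. Since this must be true for all $\alpha$ and $\beta$, $\gamma_c$ must be linear in $\alpha_a$ and in $\beta_a$, so the relation between $\gamma$ and $\alpha$, $\beta$ must be of the general form $\gamma_c = \alpha_a \beta_b f^{ab}{}_c$ for some constants $f^{ab}{}_c$. Therefore
$$ [T^a, T^b] = i f^{ab}{}_c\, T^c . \tag{2.9} $$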
This is called the Lie algebra of the group under consideration. Two important points must be noted here. The first is that, even if the explicit form of the generators $T^a$ depends on the representation used, the structure constants $f^{ab}{}_c$ are independent of the representation. In fact, if $f^{ab}{}_c$ were to depend on the representation, $\gamma^a$ and therefore $\delta^a$ would also depend on $R$, so it would be of the form $\delta_R^a(\alpha, \beta)$. Then from eq. (2.6) we would conclude that the product of the group elements $g_1$ and $g_2$ gives a result which depends on the representation. This is impossible, since the result of the multiplication of two abstract group elements $g_1 g_2$ is a property of the group, defined at the abstract group level without any reference to the representations. Therefore, we conclude that the $f^{ab}{}_c$ are independent of the representation (see footnote 2). The second important point is that this equation has been derived requiring the consistency of eq. (2.6) to second order; however, once this is satisfied, it can be proved that no further requirement comes from the expansion at higher orders. Thus the structure constants define the Lie algebra, and the problem of finding all matrix representations of a Lie algebra amounts to the algebraic problem of finding all possible matrix solutions $T_R^a$ of eq. (2.9).
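The statement that the structure constants do not depend on the representation can be illustrated on the rotation group, whose algebra $[T^a, T^b] = i\epsilon^{abc} T^c$ is quoted later in this section. The Python sketch below checks that two different sets of matrices satisfy eq. (2.9) with the same $f^{ab}{}_c = \epsilon^{abc}$. The choice of example (Pauli matrices divided by two, and the $3 \times 3$ matrices $(T^a)_{bc} = -i\epsilon_{abc}$) and all function names are ours, introduced only for illustration.

```python
# Sketch: the same structure constants epsilon^{abc} appear in a
# two-dimensional and in a three-dimensional representation.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) / 2

# Two-dimensional representation: T^a = sigma^a / 2 (Pauli matrices).
spin_half = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

# Three-dimensional representation: (T^a)_{bc} = -i epsilon_{abc}.
spin_one = [[[-1j * levi_civita(a, b, c) for c in range(3)] for b in range(3)]
            for a in range(3)]

def satisfies_rotation_algebra(gens, tol=1e-12):
    """Check [T^a, T^b] = i epsilon^{abc} T^c entry by entry."""
    n = len(gens[0])
    for a in range(3):
        for b in range(3):
            lhs = commutator(gens[a], gens[b])
            for i in range(n):
                for j in range(n):
                    rhs = sum(1j * levi_civita(a, b, c) * gens[c][i][j]
                              for c in range(3))
                    if abs(lhs[i][j] - rhs) > tol:
                        return False
    return True

print(satisfies_rotation_algebra(spin_half), satisfies_rotation_algebra(spin_one))
```

The matrices have different dimensions and explicit forms, but both checks succeed with the same constants $\epsilon^{abc}$.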
2 Actually, the generators of a Lie group can even be defined without making any reference to a specific representation. One makes use of the fact that a Lie group is also a manifold, parametrized by the coordinates $\theta^a$, and defines the generators as a basis of the tangent space at the origin. One then proves that their commutator (defined as a Lie bracket) is again a tangent vector, and therefore it must be a linear combination of the basis vectors. In this approach no specific representation is ever mentioned, so it becomes obvious that the structure constants are independent of the representation. See, e.g., Nakahara (1990), Section 5.6.
A group is called abelian if all its elements commute between themselves, otherwise the group is nonabelian. For an abelian Lie group the structure constants vanish, since in this case in eq. (2.6) we have $\delta_a = \alpha_a + \beta_a$. The representation theory of abelian Lie algebras is very simple: any $d$-dimensional abelian Lie algebra is isomorphic to the direct sum of $d$ one-dimensional abelian Lie algebras. In other words, all irreducible representations of abelian groups are one-dimensional. The non-trivial part of the representation theory of Lie algebras is related to the nonabelian structure. In the study of the representations, an important role is played by the Casimir operators. These are operators constructed from the $T^a$ that commute with all the $T^a$. In each irreducible representation, the Casimir operators are proportional to the identity matrix, and the proportionality constant labels the representation. For example, the angular momentum algebra is $[J^i, J^j] = i\epsilon^{ijk} J^k$ and the Casimir operator is $\mathbf{J}^2$. On an irreducible representation, $\mathbf{J}^2$ is equal to $j(j+1)$ times the identity matrix, with $j = 0, \frac{1}{2}, 1, \ldots$. A Lie group that, considered as a manifold, is a compact manifold is called a compact group. Spatial rotations are an example of a compact Lie group, while we will see that the Lorentz group is noncompact. A theorem states that noncompact groups have no unitary representations of finite dimension, except for representations in which the noncompact generators are represented trivially, i.e. as zero. The physical relevance of this theorem is due to the fact that in a unitary representation the generators are hermitian operators and, according to the rules of quantum mechanics, only hermitian operators can be identified with observables. If a group is noncompact, in order to identify its generators with physical observables we need an infinite-dimensional representation. We will see in this chapter that the Lorentz and Poincaré groups are noncompact, and that infinite-dimensional representations are obtained by introducing the Hilbert space of one-particle states.
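The property that the Casimir operator $\mathbf{J}^2$ equals $j(j+1)$ times the identity can be checked explicitly for small $j$. The Python sketch below builds the spin-$j$ matrices from the standard ladder-operator matrix elements of quantum mechanics, $J_+ |j, m\rangle = \sqrt{j(j+1) - m(m+1)}\, |j, m+1\rangle$, which are assumed here as known input rather than derived in the text; the function names are illustrative.

```python
import math

# Sketch: J^2 = J_z^2 + (J_+ J_- + J_- J_+)/2 is j(j+1) times the identity
# in the (2j+1)-dimensional representation.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def spin_matrices(j):
    """Return (J_z, J_+, J_-) in the basis m = j, j-1, ..., -j."""
    dim = int(round(2 * j)) + 1
    m_values = [j - k for k in range(dim)]
    jz = [[m_values[r] if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    jplus = [[0.0] * dim for _ in range(dim)]
    for k in range(1, dim):
        m = m_values[k]
        # J_+ raises m to m + 1, i.e. maps basis vector k to basis vector k - 1
        jplus[k - 1][k] = math.sqrt(j * (j + 1) - m * (m + 1))
    jminus = [[jplus[c][r] for c in range(dim)] for r in range(dim)]  # transpose
    return jz, jplus, jminus

def casimir(j):
    jz, jp, jm = spin_matrices(j)
    jz2, pm, mp = matmul(jz, jz), matmul(jp, jm), matmul(jm, jp)
    dim = len(jz)
    return [[jz2[r][c] + 0.5 * (pm[r][c] + mp[r][c]) for c in range(dim)]
            for r in range(dim)]

for spin in (0.5, 1.0, 1.5):
    diagonal = [round(casimir(spin)[r][r], 10) for r in range(int(2 * spin) + 1)]
    print(spin, diagonal)
```

In each case every diagonal entry equals $j(j+1)$ and the off-diagonal entries vanish, so the value of the Casimir operator labels the representation.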
2.2 The Lorentz group
The Lorentz group is defined as the group of linear coordinate transformations,
$$ x^\mu \to x'^\mu = \Lambda^\mu{}_\nu\, x^\nu , \tag{2.10} $$
which leave invariant the quantity
$$ \eta_{\mu\nu} x^\mu x^\nu = t^2 - x^2 - y^2 - z^2 . \tag{2.11} $$
The group of transformations of a space with coordinates $(y_1, \ldots, y_m, x_1, \ldots, x_n)$, which leaves invariant the quadratic form $(y_1^2 + \ldots + y_m^2) - (x_1^2 + \ldots + x_n^2)$, is called the orthogonal group $O(n, m)$, so the Lorentz group is $O(3, 1)$. The condition that the matrix $\Lambda$ must satisfy in order to leave invariant the quadratic form (2.11) is
$$ \eta_{\mu\nu} x'^\mu x'^\nu = \eta_{\mu\nu} (\Lambda^\mu{}_\rho x^\rho)(\Lambda^\nu{}_\sigma x^\sigma) = \eta_{\rho\sigma} x^\rho x^\sigma . \tag{2.12} $$
Since this must hold for $x$ generic, we must have
$$ \eta_{\rho\sigma} = \eta_{\mu\nu} \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma . \tag{2.13} $$
In matrix notation, this can be rewritten as $\eta = \Lambda^T \eta \Lambda$. Taking the determinant of both sides, we therefore have $(\det \Lambda)^2 = 1$ or $\det \Lambda = \pm 1$. Transformations with $\det \Lambda = -1$ can always be written as the product of a transformation with $\det \Lambda = 1$ and of a discrete transformation that reverses the sign of an odd number of coordinates, e.g. a parity transformation $(t, x, y, z) \to (t, -x, -y, -z)$, or a reflection around a single spatial axis $(t, x, y, z) \to (t, -x, y, z)$, or a time-reversal transformation, $(t, x, y, z) \to (-t, x, y, z)$. Transformations with $\det \Lambda = +1$ are called proper Lorentz transformations. The subgroup of $O(3, 1)$ with $\det \Lambda = 1$ is denoted by $SO(3, 1)$. Writing explicitly the 00 component of eq. (2.13) we find
$$ 1 = (\Lambda^0{}_0)^2 - \sum_{i=1}^{3} (\Lambda^i{}_0)^2 , \tag{2.14} $$
which implies that $(\Lambda^0{}_0)^2 \geq 1$. Therefore the proper Lorentz group has two disconnected components, one with $\Lambda^0{}_0 \geq 1$ and one with $\Lambda^0{}_0 \leq -1$, called orthochronous and nonorthochronous, respectively. Any nonorthochronous transformation can be written as the product of an orthochronous transformation and a discrete inversion of the type $(t, x, y, z) \to (-t, -x, -y, -z)$, or $(t, x, y, z) \to (-t, -x, y, z)$, etc. It is convenient to factor out all these discrete transformations, and to redefine the Lorentz group as the component of $SO(3, 1)$ for which $\Lambda^0{}_0 \geq 1$. If we consider an infinitesimal transformation
$\Lambda^\mu{}_\nu = \delta^\mu_\nu + \omega^\mu{}_\nu$, (2.15)
eq. (2.13) gives
$\omega_{\mu\nu} = -\omega_{\nu\mu}$. (2.16)
An antisymmetric 4 × 4 matrix has six independent elements, so the Lorentz group has six parameters. These are easily identified: first of all we have the transformations which leave $t$ invariant. This is just the SO(3) rotation group, generated by the three rotations in the $(x, y)$, $(x, z)$ and $(y, z)$ planes. Furthermore, we have three transformations in the $(t, x)$, $(t, y)$ and $(t, z)$ planes that leave invariant $t^2 - x^2$, etc. A transformation that leaves $t^2 - x^2$ invariant is called a boost along the $x$ axis, and can be written as
$t \to \gamma (t + v x)$, $x \to \gamma (x + v t)$, (2.17)
with $\gamma = (1 - v^2)^{-1/2}$ and $-1 < v < 1$. Its physical meaning is understood by looking at the small-$v$ limit, where it reduces to the velocity transformation of classical mechanics. It is therefore the relativistic generalization of a velocity transformation. The six independent parameters of the Lorentz group can therefore be taken as the three rotation angles and the three components of the velocity $\mathbf{v}$.
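The statements above lend themselves to a quick numerical sanity check. The following Python sketch (standard library only; the helper names `preserves_metric`, `boost_x`, `PARITY` and `TIME_REVERSAL` are illustrative choices, not notation from the text) builds the boost of eq. (2.17) for the arbitrary value v = 0.6 together with the discrete transformations, and prints, for each, whether the matrix form of eq. (2.13) holds, the determinant, and $\Lambda^0{}_0$.

```python
import math

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]  # metric, signature (+,-,-,-)

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def det(a):
    """Determinant by cofactor expansion along the first row (fine for 4x4)."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def preserves_metric(lam, tol=1e-12):
    """Check eta = Lambda^T eta Lambda, i.e. eq. (2.13) in matrix form."""
    lhs = matmul(transpose(lam), matmul(ETA, lam))
    return all(abs(lhs[i][j] - ETA[i][j]) < tol for i in range(4) for j in range(4))

def boost_x(v):
    """Boost along x, eq. (2.17), acting on the column (t, x, y, z)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, g * v, 0, 0], [g * v, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

PARITY = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
TIME_REVERSAL = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

boost = boost_x(0.6)  # arbitrary test velocity
cases = [("boost", boost), ("parity", PARITY), ("time reversal", TIME_REVERSAL),
         ("parity * time reversal", matmul(PARITY, TIME_REVERSAL))]
for name, lam in cases:
    # columns: name, satisfies (2.13)?, determinant, Lambda^0_0
    print(name, preserves_metric(lam), round(det(lam), 12), lam[0][0])
```

All four matrices satisfy eq. (2.13), but only the boost lies in the component with determinant +1 and $\Lambda^0{}_0 \geq 1$ that the text retains as the Lorentz group.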
Since $-1 < v < 1$, we can write $v = \tanh\eta$, with $-\infty < \eta < +\infty$. Then $\gamma = \cosh\eta$ and eq. (2.17) can be written as a hyperbolic rotation,
$t \to (\cosh\eta)\, t + (\sinh\eta)\, x$, $x \to (\sinh\eta)\, t + (\cosh\eta)\, x$. (2.18)
The variable $\eta$ is called the rapidity. We see that the Lorentz group is parametrized in a continuous and differentiable way by six parameters, and it is therefore a Lie group. However, in the Lorentz group one of the parameters is the modulus of the boost velocity, $v$, which ranges over the noncompact interval $0 \leq v < 1$. Therefore the Lorentz group is noncompact.
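One practical advantage of the rapidity, not spelled out in the text but a standard fact, is that it is additive for boosts along the same axis, whereas velocities combine through the relativistic addition law. A minimal Python sketch of this, with arbitrary test rapidities and illustrative helper names:

```python
import math

def boost(eta):
    """2x2 (t, x) block of the hyperbolic rotation of eq. (2.18)."""
    c, s = math.cosh(eta), math.sinh(eta)
    return [[c, s], [s, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

eta1, eta2 = 0.3, 1.1                          # arbitrary test rapidities
composed = matmul(boost(eta1), boost(eta2))    # boost by eta2, then by eta1
direct = boost(eta1 + eta2)
max_diff = max(abs(composed[i][j] - direct[i][j]) for i in range(2) for j in range(2))

v1, v2 = math.tanh(eta1), math.tanh(eta2)
v_sum = (v1 + v2) / (1 + v1 * v2)              # relativistic velocity addition
gamma1 = 1.0 / math.sqrt(1.0 - v1 * v1)        # should equal cosh(eta1)

print(max_diff, v_sum - math.tanh(eta1 + eta2), gamma1 - math.cosh(eta1))
```

The three printed differences vanish up to rounding: composing two boosts adds their rapidities, and $\gamma = \cosh\eta$ as stated above.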
2.3
The Lorentz algebra
We have seen that the Lorentz group has six parameters, the six independent elements of the antisymmetric matrix $\omega_{\mu\nu}$, to which correspond six generators. It is convenient to label the generators as $J^{\mu\nu}$, with a pair of antisymmetric indices $(\mu, \nu)$, so that $J^{\mu\nu} = -J^{\nu\mu}$. A generic element $\Lambda$ of the Lorentz group is therefore written as
$\Lambda = e^{-\frac{i}{2}\omega_{\mu\nu}J^{\mu\nu}}$. (2.19)
The factor 1/2 in the exponent compensates for the fact that we are summing over all $\mu, \nu$ rather than over the independent pairs with $\mu < \nu$, and therefore each generator is counted twice. By definition a set of objects $\phi^i$, with $i = 1, \ldots, n$, transforms in a representation $R$ of dimension $n$ of the Lorentz group if, under a Lorentz transformation,
$\phi^i \to \left[ e^{-\frac{i}{2}\omega_{\mu\nu}J_R^{\mu\nu}} \right]^i{}_j\, \phi^j$, (2.20)
where $\exp\{-(i/2)\omega_{\mu\nu}J_R^{\mu\nu}\}$ is a matrix representation of dimension $n$ of the abstract element (2.19) of the Lorentz group; $J_R^{\mu\nu}$ are the Lorentz generators in the representation $R$, and are $n \times n$ matrices. Under an infinitesimal transformation with infinitesimal parameters $\omega_{\mu\nu}$, the variation of $\phi^i$ is
$\delta\phi^i = -\frac{i}{2}\omega_{\mu\nu}(J_R^{\mu\nu})^i{}_j\, \phi^j$. (2.21)
In $(J_R^{\mu\nu})^i{}_j$ the pair of indices $\mu, \nu$ identifies the generator while the indices $i, j$ are the matrix indices of the representation that we are considering. All physical quantities can be classified according to their transformation properties under the Lorentz group. A scalar is a quantity that is invariant under the transformation. A typical Lorentz scalar in particle physics is the rest mass of a particle. A contravariant four-vector $V^\mu$ is defined as an object that satisfies the transformation law
$V^\mu \to \Lambda^\mu{}_\nu V^\nu$, (2.22)
with $\Lambda^\mu{}_\nu$ defined by the condition (2.13). A covariant four-vector $V_\mu$ transforms as $V_\mu \to \Lambda_\mu{}^\nu V_\nu$, with $\Lambda_\mu{}^\nu = \eta_{\mu\rho}\eta^{\nu\sigma}\Lambda^\rho{}_\sigma$. One immediately verifies that, if $V^\mu$ is a contravariant four-vector, then $V_\mu \equiv \eta_{\mu\nu}V^\nu$ is a covariant four-vector. We refer generically to covariant and contravariant four-vectors simply as four-vectors. The space-time coordinates $x^\mu$ are the simplest example of a four-vector. Another particularly important example is given by the four-momentum $p^\mu = (E, \mathbf{p})$.
The explicit form of the generators $(J_R^{\mu\nu})^i{}_j$ as $n \times n$ matrices depends on the particular representation that we are considering. For a scalar $\phi$, the index $i$ takes only one value, so it is a one-dimensional representation, and $(J^{\mu\nu})^i{}_j$ is a 1 × 1 matrix, i.e. a number, for each given pair $(\mu, \nu)$. But in fact, by definition, on a scalar a Lorentz transformation is the identity transformation, so $\delta\phi = 0$ and $J^{\mu\nu} = 0$. A representation in which all generators are equal to zero is trivially a solution of eq. (2.9), for any Lie group, and so it is called the trivial representation. The four-vector representation is more interesting. In this case $i, j$ are themselves Lorentz indices, so each generator $J^{\mu\nu}$ is represented by a 4 × 4 matrix $(J^{\mu\nu})^\rho{}_\sigma$. The explicit form of this matrix is
$(J^{\mu\nu})^\rho{}_\sigma = i\,(\eta^{\mu\rho}\delta^\nu_\sigma - \eta^{\nu\rho}\delta^\mu_\sigma)$. (2.23)
This can be shown by observing that, from eqs. (2.22) and (2.15), the variation of a four-vector $V^\mu$ under an infinitesimal Lorentz transformation is $\delta V^\mu = \omega^\mu{}_\nu V^\nu$, which can be rewritten as
$\delta V^\rho = -\frac{i}{2}\omega_{\mu\nu}(J^{\mu\nu})^\rho{}_\sigma V^\sigma$, (2.24)
with $(J^{\mu\nu})^\rho{}_\sigma$ given by eq. (2.23) (this solution for $J^{\mu\nu}$ is unique because we require antisymmetry under $\mu \leftrightarrow \nu$). This representation is irreducible since a generic Lorentz transformation mixes all four components of a four-vector and therefore there is no change of basis that allows us to write $(J^{\mu\nu})^\rho{}_\sigma$ in block diagonal form. We can now use the explicit expression (2.23) to compute the commutators, and we find
$[J^{\mu\nu}, J^{\rho\sigma}] = i\,(\eta^{\nu\rho}J^{\mu\sigma} - \eta^{\mu\rho}J^{\nu\sigma} - \eta^{\nu\sigma}J^{\mu\rho} + \eta^{\mu\sigma}J^{\nu\rho})$. (2.25)
This is the Lie algebra of SO(3, 1). It is convenient to rearrange the six components of $J^{\mu\nu}$ into two spatial vectors,
$J^i = \frac{1}{2}\epsilon^{ijk}J^{jk}$, $K^i = J^{i0}$. (2.26)
In terms of $J^i$, $K^i$ the Lie algebra of the Lorentz group (2.25) becomes
$[J^i, J^j] = i\epsilon^{ijk}J^k$, (2.27)
$[J^i, K^j] = i\epsilon^{ijk}K^k$, (2.28)
$[K^i, K^j] = -i\epsilon^{ijk}J^k$. (2.29)
Equation (2.27) is the Lie algebra of SU(2) and this shows that $J^i$, defined in eq. (2.26), is the angular momentum. Instead eq. (2.28) expresses the fact that $\mathbf{K}$ is a spatial vector.
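The commutators quoted above can be checked by brute force from the explicit four-vector generators. The Python sketch below (standard library only; helper names such as `lin` and `eps_sum` are illustrative) builds the matrices of eq. (2.23), tests eq. (2.25) for every choice of indices, forms $J^i$ and $K^i$ as in eq. (2.26), and tests eqs. (2.27)-(2.29). It also checks that, in this finite-dimensional representation, the $J^i$ are hermitian while the $K^i$ are antihermitian, in line with the theorem on noncompact groups quoted in Section 2.1.

```python
ETA = [1, -1, -1, -1]  # diagonal of eta^{mu nu}, signature (+,-,-,-)

def eta(m, n):
    return ETA[m] if m == n else 0

def J(mu, nu):
    """4x4 matrix (J^{mu nu})^rho_sigma of eq. (2.23); rows are rho, columns sigma."""
    return [[1j * (eta(mu, r) * (nu == s) - eta(nu, r) * (mu == s)) for s in range(4)]
            for r in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

def lin(*terms):
    """Linear combination sum_k c_k M_k, given as (c_k, M_k) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(4)] for i in range(4)]

def equal(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

def dagger(a):
    return [[a[j][i].conjugate() for j in range(4)] for i in range(4)]

def eps(i, j, k):
    """Levi-Civita symbol for three distinct-or-repeated consecutive integer labels."""
    return (i - j) * (j - k) * (k - i) // 2

# Eq. (2.25), tested for all 4^4 index choices.
algebra_ok = all(
    equal(comm(J(m, n), J(r, s)),
          lin((1j * eta(n, r), J(m, s)), (-1j * eta(m, r), J(n, s)),
              (-1j * eta(n, s), J(m, r)), (1j * eta(m, s), J(n, r))))
    for m in range(4) for n in range(4) for r in range(4) for s in range(4))

# Eq. (2.26): rotation and boost generators, labelled by i = 1, 2, 3.
Jv = {1: J(2, 3), 2: J(3, 1), 3: J(1, 2)}
Kv = {i: J(i, 0) for i in (1, 2, 3)}

def eps_sum(i, j, gens, c):
    """c * eps^{ijk} G^k summed over k."""
    return lin(*[(c * eps(i, j, k), gens[k]) for k in (1, 2, 3)])

pairs = [(i, j) for i in (1, 2, 3) for j in (1, 2, 3)]
jj_ok = all(equal(comm(Jv[i], Jv[j]), eps_sum(i, j, Jv, 1j)) for i, j in pairs)   # (2.27)
jk_ok = all(equal(comm(Jv[i], Kv[j]), eps_sum(i, j, Kv, 1j)) for i, j in pairs)   # (2.28)
kk_ok = all(equal(comm(Kv[i], Kv[j]), eps_sum(i, j, Jv, -1j)) for i, j in pairs)  # (2.29)

j_hermitian = all(equal(dagger(Jv[i]), Jv[i]) for i in Jv)
k_antihermitian = all(equal(dagger(Kv[i]), lin((-1, Kv[i]))) for i in Kv)

print(algebra_ok, jj_ok, jk_ok, kk_ok, j_hermitian, k_antihermitian)
```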
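We also introduce the definitions $\theta^i = (1/2)\epsilon^{ijk}\omega^{jk}$ and $\eta^i = \omega^{i0}$. Then
$\frac{1}{2}\omega_{\mu\nu}J^{\mu\nu} = \omega_{12}J^{12} + \omega_{13}J^{13} + \omega_{23}J^{23} + \sum_{i=1}^{3}\omega_{i0}J^{i0} = \boldsymbol{\theta}\cdot\mathbf{J} - \boldsymbol{\eta}\cdot\mathbf{K}$, (2.30)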
where we used $\omega_{i0} = -\omega^{i0} = -\eta^i$ while $\omega_{12} = \omega^{12} = \theta^3$, etc. Then a Lorentz transformation can be written as
$\Lambda = \exp\{-i\boldsymbol{\theta}\cdot\mathbf{J} + i\boldsymbol{\eta}\cdot\mathbf{K}\}$. (2.31)
With our definitions $\theta^i = +(1/2)\epsilon^{ijk}\omega^{jk}$ and $\eta^i = +\omega^{i0}$, a rotation by an angle $\theta > 0$ in the $(x, y)$ plane rotates counterclockwise the position of a point $P$ with respect to a fixed reference frame (see note 3), while performing a boost of velocity $\mathbf{v}$ on a particle at rest we get a particle with velocity $+\mathbf{v}$. To check these signs, we can consider infinitesimal transformations, and use the explicit form (2.23) of the generators. Performing a rotation by an angle $\theta$ around the $z$ axis, eqs. (2.31) and (2.23) give
$\delta x^\mu = -i\theta\,(J^{12})^\mu{}_\nu x^\nu = \theta\,(\eta^{1\mu}\delta^2_\nu - \eta^{2\mu}\delta^1_\nu)\,x^\nu$ (2.32)
and therefore $\delta x = -\theta y$ and $\delta y = +\theta x$, corresponding to a counterclockwise rotation. Similarly, performing a boost along the $x$ axis,
$\delta x^\mu = +i\eta\,(J^{10})^\mu{}_\nu x^\nu = -\eta\,(\eta^{1\mu}\delta^0_\nu - \eta^{0\mu}\delta^1_\nu)\,x^\nu$ (2.33)
and therefore $\delta t = +\eta x$ and $\delta x = +\eta t$, which is the infinitesimal form of eq. (2.18).
Note 3. This is the “active” point of view. Alternatively, we can say that we keep $P$ fixed and we rotate the reference frame clockwise; this is the “passive” point of view.
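The sign checks above are infinitesimal. As a complement, the following Python sketch exponentiates the generators of eq. (2.23) according to eq. (2.31) and compares with the expected finite rotation and with the hyperbolic rotation (2.18). The matrix exponential is computed with a plain truncated Taylor series, which is an implementation shortcut assumed to be adequate for parameters of order one; the parameter values are arbitrary.

```python
import math

ETA = [1, -1, -1, -1]  # diagonal of eta^{mu nu}

def J(mu, nu):
    """Four-vector generators of eq. (2.23); rows are rho, columns sigma."""
    return [[1j * ((ETA[mu] if mu == r else 0) * (nu == s)
                   - (ETA[nu] if nu == r else 0) * (mu == s)) for s in range(4)]
            for r in range(4)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def expm(a, terms=40):
    """Matrix exponential from a truncated Taylor series (sketch, small matrices only)."""
    n = len(a)
    total = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

theta, rap = 0.7, 0.4   # arbitrary rotation angle and rapidity
rotation = expm([[-1j * theta * x for x in row] for row in J(1, 2)])  # exp(-i theta J^3)
boost = expm([[1j * rap * x for x in row] for row in J(1, 0)])        # exp(+i eta K^1)

# Image of the point (t, x, y, z) = (0, 1, 0, 0) under the rotation:
rotated_x_axis = [rotation[r][1].real for r in range(4)]
# Velocity acquired by a particle at rest, whose worldline is (t, 0, 0, 0):
velocity = (boost[1][0] / boost[0][0]).real

print(rotated_x_axis, velocity, math.tanh(rap))
```

The unit vector along $x$ is mapped to $(\cos\theta, \sin\theta)$ in the $(x, y)$ plane, i.e. it is rotated counterclockwise, and the particle at rest acquires velocity $+\tanh\eta$.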
2.4
Tensor representations
By definition a tensor $T^{\mu\nu}$ with two contravariant (i.e. upper) indices is an object that transforms as
$T^{\mu\nu} \to \Lambda^\mu{}_{\mu'}\Lambda^\nu{}_{\nu'}\,T^{\mu'\nu'}$. (2.34)
In general, a tensor with an arbitrary number of upper and lower indices transforms with a factor $\Lambda^\mu{}_{\mu'}$ for each upper index and a factor $\Lambda_\mu{}^{\mu'}$ for each lower index. Tensors are examples of representations of the Lorentz group. For instance, a generic tensor $T^{\mu\nu}$ with two indices has 16 components and eq. (2.34) shows that these 16 components transform among themselves, i.e. they are a basis for a representation of dimension 16. However, this representation is reducible. From eq. (2.34) we see that, if $T^{\mu\nu}$ is antisymmetric, after a Lorentz transformation it remains antisymmetric, while if it is symmetric it remains symmetric. So the symmetric and antisymmetric parts of a tensor $T^{\mu\nu}$ do not mix, and the 16-dimensional representation is reducible into a six-dimensional antisymmetric representation $A^{\mu\nu} = (1/2)(T^{\mu\nu} - T^{\nu\mu})$ and a 10-dimensional symmetric representation $S^{\mu\nu} = (1/2)(T^{\mu\nu} + T^{\nu\mu})$. Furthermore, the trace of a symmetric tensor is also invariant,
$S \equiv \eta_{\mu\nu}S^{\mu\nu} \to \eta_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma S^{\rho\sigma} = S$, (2.35)
where in the last step we used the defining property of the Lorentz group, eq. (2.13). This means, in particular, that a traceless tensor remains traceless after a Lorentz transformation, and thus the 10-dimensional symmetric representation decomposes further into a nine-dimensional irreducible symmetric traceless representation, $S^{\mu\nu} - (1/4)\eta^{\mu\nu}S$, and the one-dimensional scalar representation $S$. The following notation is commonly used: an irreducible representation is denoted by its dimensionality, written in boldface. Thus the scalar representation is denoted as $\mathbf{1}$, the four-vector representation as $\mathbf{4}$, the antisymmetric tensor as $\mathbf{6}$ and the traceless symmetric tensor as $\mathbf{9}$ (see note 4). The tensor representation (2.34) is a tensor product of two four-vector representations, which means that each of the two indices of $T^{\mu\nu}$ transforms separately as a four-vector index, i.e. with the matrix $\Lambda$. The tensor product of two representations is denoted by the symbol $\otimes$. We have found above that the tensor product of two four-vector representations decomposes into the direct sum of the $\mathbf{1}$, $\mathbf{6}$, and $\mathbf{9}$ representations. Denoting the direct sum by $\oplus$, we have (see note 5)
$\mathbf{4} \otimes \mathbf{4} = \mathbf{1} \oplus \mathbf{6} \oplus \mathbf{9}$. (2.36)
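The decomposition (2.36) can be illustrated numerically. In the Python sketch below (standard library only; the test tensor, the Lorentz matrix and the helper names are arbitrary illustrative choices) a generic $T^{\mu\nu}$ is split into its antisymmetric part, its symmetric traceless part and its trace, and we check that the split commutes with a Lorentz transformation, so that the three pieces indeed do not mix.

```python
import math

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def transform(t, lam):
    """T^{mu nu} -> Lambda^mu_a Lambda^nu_b T^{ab}, eq. (2.34), written as Lambda T Lambda^T."""
    return matmul(lam, matmul(t, transpose(lam)))

def trace(t):
    """Lorentz-invariant trace eta_{mu nu} T^{mu nu}, as in eq. (2.35)."""
    return sum(ETA[m][m] * t[m][m] for m in range(4))

def decompose(t):
    """Return (antisymmetric part, symmetric traceless part, trace)."""
    s = trace(t)
    anti = [[(t[m][n] - t[n][m]) / 2 for n in range(4)] for m in range(4)]
    sym0 = [[(t[m][n] + t[n][m]) / 2 - ETA[m][n] * s / 4 for n in range(4)] for m in range(4)]
    return anti, sym0, s

def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

# A Lorentz matrix: boost along x with v = 0.6 (gamma = 1.25) times a rotation about z.
g, v, th = 1.25, 0.6, 0.3
lam = matmul([[g, g * v, 0, 0], [g * v, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],
             [[1, 0, 0, 0], [0, math.cos(th), -math.sin(th), 0],
              [0, math.sin(th), math.cos(th), 0], [0, 0, 0, 1]])

t = [[float(4 * m + n + (m * n) % 3) for n in range(4)] for m in range(4)]  # arbitrary tensor

anti, sym0, s = decompose(t)
anti_new, sym0_new, s_new = decompose(transform(t, lam))

parts_do_not_mix = (close(anti_new, transform(anti, lam))
                    and close(sym0_new, transform(sym0, lam))
                    and abs(s_new - s) < 1e-9)
dimensions = (4 * 3 // 2, 4 * 5 // 2 - 1, 1)   # antisymmetric, symmetric traceless, trace
print(parts_do_not_mix, dimensions, sum(dimensions))
```

The component counts 6 + 9 + 1 add up to the 16 components of $T^{\mu\nu}$, matching eq. (2.36).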
The decomposition into irreducible representations of tensors with more than two indices can be obtained similarly. The most general irreducible tensor representations of the Lorentz group are found starting from a generic tensor with an arbitrary number of indices, removing first all traces, and then symmetrizing or antisymmetrizing over all pairs of indices. Note that, using $\eta_{\mu\nu}$, we can always restrict to contravariant tensors; for instance $V^\mu$ and $V_\mu$ are equivalent representations. All tensor representations are in a sense derived from the four-vector representation, since the transformation law of a tensor is obtained by applying separately on each Lorentz index the matrix $\Lambda^\mu{}_\nu$ that defines the transformation of four-vectors. This means that (as the name suggests) tensor representations are tensor products of the four-vector representation. For this reason, the four-vector representation plays a distinguished role and is called the fundamental representation of SO(3, 1) (see note 6). Another representation of special importance is the adjoint representation. It is a representation which has the same dimension as the number of generators. This means that we can use the same type of indices $a, b, c$ for labeling the generator and its matrix elements, and for any Lie group it can be written in full generality in terms of the structure constants, as
$(T^a_{\rm adj})^b{}_c = -i f^{ab}{}_c$. (2.37)
The Lie algebra (2.9) is automatically satisfied by (2.37). This follows from the fact that, for all matrices $A, B, C$, there is an algebraic identity known as the Jacobi identity,
$[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0$, (2.38)
which is easily verified by writing the commutators explicitly. Setting in this identity $A = T^a$, $B = T^b$ and $C = T^c$ we find that the structure constants of any Lie group obey the identity
$f^{ab}{}_d f^{cd}{}_e + f^{bc}{}_d f^{ad}{}_e + f^{ca}{}_d f^{bd}{}_e = 0$. (2.39)
If we substitute eq. (2.37) into eq. (2.9), we see that the Lie algebra is automatically satisfied because of eq. (2.39). For the Lorentz group, the adjoint representation has dimension six, so it is given by the antisymmetric tensor $A^{\mu\nu}$. The adjoint representation plays an especially important role in nonabelian gauge theories, as we will see in Chapter 10. All the representation theory on tensors that we have developed with SO(3, 1) in mind carries over to generic SO(n) or SO(n, m), simply replacing $\eta_{\mu\nu}$ with $\delta_{\mu\nu}$ for SO(n), or with a diagonal matrix with n minus signs and m plus signs for SO(n, m).
Note 4. If two inequivalent representations happen to have the same dimensionality one can use a prime or an index to distinguish between them.
Note 5. In Exercise 2.5 we discuss the separation of the representation $\mathbf{6}$, i.e. the antisymmetric tensor, into its self-dual and anti-self-dual parts, both in Minkowski space and in a Euclidean space with metric $\delta_{\mu\nu}$. We will see that in the Euclidean case the antisymmetric tensor $A^{\mu\nu}$ is reducible and decomposes into two three-dimensional representations corresponding to self-dual and anti-self-dual tensors, while in Minkowski space an antisymmetric tensor $A^{\mu\nu}$ with real components is irreducible.
Note 6. To avoid all misunderstanding, we anticipate that in Section 2.5 we will enlarge the definition of the Lorentz group to include spinorial representations. With this enlarged definition, four-vectors are no longer the fundamental representation of the Lorentz group. Instead, all representations of the Lorentz group will be built from the spinorial representations (1/2, 0) and (0, 1/2) that will be defined in Section 2.5.
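As a concrete illustration of eqs. (2.37) and (2.39), consider the simplest nonabelian case, the angular momentum algebra, whose structure constants are $f^{ab}{}_c = \epsilon^{abc}$. The Python sketch below (an example chosen here for illustration, not one worked out in the text) builds the three 3 × 3 adjoint matrices from the structure constants and checks both the Jacobi identity for the structure constants and the commutation relations.

```python
def eps(a, b, c):
    """Levi-Civita symbol for labels 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2

R = range(3)

# Adjoint generators from the structure constants, eq. (2.37): (T^a)^b_c = -i f^{ab}_c.
T = [[[-1j * eps(a, b, c) for c in R] for b in R] for a in R]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in R) for j in R] for i in R]

def comm(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in R] for i in R]

# Identity (2.39) for f^{ab}_c = eps^{abc}.
jacobi_ok = all(
    sum(eps(a, b, d) * eps(c, d, e) + eps(b, c, d) * eps(a, d, e)
        + eps(c, a, d) * eps(b, d, e) for d in R) == 0
    for a in R for b in R for c in R for e in R)

# Lie algebra [T^a, T^b] = i f^{ab}_c T^c in the adjoint representation.
def algebra_holds(a, b):
    lhs = comm(T[a], T[b])
    return all(abs(lhs[r][s] - sum(1j * eps(a, b, c) * T[c][r][s] for c in R)) < 1e-12
               for r in R for s in R)

algebra_ok = all(algebra_holds(a, b) for a in R for b in R)
print(jacobi_ok, algebra_ok)
```

The adjoint matrices obtained in this way are the familiar spin-1 generators acting on spatial vectors, which is why the three-dimensional vector representation of rotations coincides with the adjoint.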
2.4.1
Decomposition of Lorentz tensors under SO(3)
Since we know how a tensor behaves under a generic Lorentz transformation, we know in particular its transformation properties under the SO(3) rotation subgroup, and we can therefore ask what is the angular momentum $j$ of the various tensor representations. Recall that the representations of SO(3) are labeled by an index $j$ which takes integer values $j = 0, 1, 2, \ldots$, and the dimension of the representation labeled by $j$ is $2j + 1$. Within each representation, these $2j + 1$ states are labeled by $j_z = -j, \ldots, j$. For SO(3), it is more common to denote the representation as $\mathbf{j}$, i.e. to label it with the angular momentum rather than with the dimension of the representation, $2j + 1$. In this notation, $\mathbf{0}$ is the scalar (also called the singlet), $\mathbf{1}$ is a triplet with components $j_z = -1, 0, 1$, while $\mathbf{2}$ is a representation of dimension 5, etc. (if we instead use the same convention as in the case of the Lorentz group, i.e. we label them by their dimensionality, we should write $\mathbf{1}, \mathbf{3}, \mathbf{5}, \ldots$). A Lorentz scalar is of course also a scalar under rotations, so it has $j = 0$. A four-vector $V^\mu = (V^0, \mathbf{V})$ is an irreducible representation of the Lorentz group, since a generic Lorentz transformation mixes all four components, but from the point of view of the SO(3) subgroup it is reducible: spatial rotations do not mix $V^0$ with $\mathbf{V}$; $V^0$ is invariant under spatial rotations, so it has $j = 0$, while the three spatial components $V^i$ form an irreducible three-dimensional representation of SO(3), so they have $j = 1$. In group theory language we say that, from the point of view of spatial rotations, a four-vector decomposes into the direct sum of a scalar and a $j = 1$ representation,
$V^\mu \in \mathbf{0} \oplus \mathbf{1}$ (2.40)
or, if we prefer to label the representations by their dimension, rather than by $j$, we write $\mathbf{4} = \mathbf{1} \oplus \mathbf{3}$. The former notation indicates more clearly which spins are involved, while the latter makes apparent that the number of degrees of freedom on the left-hand side matches that on the right-hand side. We now want to understand what angular momenta appear in a generic tensor $T^{\mu\nu}$ with two indices. By definition a tensor $T^{\mu\nu}$ transforms as the tensor product of two four-vector representations. Since, from the point of view of SO(3), a four-vector is $\mathbf{0} \oplus \mathbf{1}$, a generic tensor with two indices has the following decomposition in angular momenta
$T^{\mu\nu} \in (\mathbf{0} \oplus \mathbf{1}) \otimes (\mathbf{0} \oplus \mathbf{1}) = (\mathbf{0} \otimes \mathbf{0}) \oplus (\mathbf{0} \otimes \mathbf{1}) \oplus (\mathbf{1} \otimes \mathbf{0}) \oplus (\mathbf{1} \otimes \mathbf{1}) = \mathbf{0} \oplus \mathbf{1} \oplus \mathbf{1} \oplus (\mathbf{0} \oplus \mathbf{1} \oplus \mathbf{2})$. (2.41)
In the last step we used the usual rule of composition of angular momenta, which says that composing two angular momenta $j_1$ and $j_2$ we get all angular momenta between $|j_1 - j_2|$ and $j_1 + j_2$, so $\mathbf{0} \otimes \mathbf{0} = \mathbf{0}$, $\mathbf{0} \otimes \mathbf{1} = \mathbf{1}$ and $\mathbf{1} \otimes \mathbf{1} = \mathbf{0} \oplus \mathbf{1} \oplus \mathbf{2}$. Thus, in the decomposition of a generic tensor $T^{\mu\nu}$ in representations of the rotation group, the $j = 0$ representation appears twice, the $j = 1$ representation appears three times, and the $j = 2$ once. It is interesting to see how these representations are shared between the symmetric traceless part, the trace and the antisymmetric part of the tensor $T^{\mu\nu}$, since these are the irreducible Lorentz representations. The trace is a Lorentz scalar, so it is in particular a scalar under rotations and therefore is a $\mathbf{0}$ representation. An antisymmetric tensor $A^{\mu\nu}$ has six components, which can be written as $A^{0i}$ and $(1/2)\epsilon^{ijk}A^{jk}$. These are two spatial vectors and therefore
$A^{\mu\nu} \in \mathbf{1} \oplus \mathbf{1}$. (2.42)
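The bookkeeping behind eq. (2.41) is mechanical and can be sketched in a few lines of Python. The function names `compose`, `tensor_product` and `dimension` are illustrative, and the sketch is restricted to integer spins, which is all that is needed for tensors.

```python
from collections import Counter

def compose(j1, j2):
    """Spins contained in j1 (x) j2: every integer from |j1 - j2| to j1 + j2."""
    return list(range(abs(j1 - j2), j1 + j2 + 1))

def tensor_product(rep_a, rep_b):
    """Multiplicity of each spin in the product of two direct sums of integer spins."""
    return Counter(j for ja in rep_a for jb in rep_b for j in compose(ja, jb))

def dimension(rep):
    """Total number of components, counting 2j + 1 states per spin-j multiplet."""
    return sum(mult * (2 * j + 1) for j, mult in rep.items())

four_vector = [0, 1]                                    # eq. (2.40)
two_index = tensor_product(four_vector, four_vector)    # eq. (2.41)
print(dict(two_index), dimension(two_index))
```

The multiplicities reproduce the counting in the text: $j = 0$ twice, $j = 1$ three times and $j = 2$ once, for a total of 2 · 1 + 3 · 3 + 1 · 5 = 16 components.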
As an example of eq. (2.42), an important antisymmetric tensor in electromagnetism is the field strength tensor $F^{\mu\nu}$, and in this case the two vectors are $E^i = -F^{0i}$ and $B^i = -(1/2)\epsilon^{ijk}F^{jk}$, i.e. the electric and magnetic fields. Another example of an antisymmetric tensor is given by the Lorentz generators $J^{\mu\nu}$ themselves; in this case the two spatial vectors are the angular momentum and the boost generators that have been introduced in eq. (2.26). Since we have identified the trace $S$ with a $\mathbf{0}$ and $A^{\mu\nu}$ with $\mathbf{1} \oplus \mathbf{1}$, comparison with eq. (2.41) shows that the nine components of a symmetric traceless tensor $S^{\mu\nu}$ decompose, from the point of view of spatial rotations, as
$S^{\mu\nu} \in \mathbf{0} \oplus \mathbf{1} \oplus \mathbf{2}$. (2.43)
Observe that, when in eq. (2.41) we write $T^{\mu\nu}$ as $(\mathbf{0} \oplus \mathbf{1}) \otimes (\mathbf{0} \oplus \mathbf{1})$, the first $\mathbf{0}$ corresponds to taking the index $\mu = 0$, the first $\mathbf{1}$ corresponds to taking the index $\mu = i$, and similarly for the second factor $(\mathbf{0} \oplus \mathbf{1})$ and the index $\nu$. Therefore the term $(\mathbf{0} \otimes \mathbf{0})$ in eq. (2.41) corresponds to $T^{00}$, $(\mathbf{0} \otimes \mathbf{1})$ is $T^{0i}$, $(\mathbf{1} \otimes \mathbf{0})$ is $T^{i0}$ and $(\mathbf{1} \otimes \mathbf{1})$ is $T^{ij}$. It is clear that $T^{00}$ is
a scalar under spatial rotations, while $T^{0i}$ and $T^{i0}$ are spatial vectors. As for $T^{ij}$, the antisymmetric part $A^{ij} = T^{ij} - T^{ji}$ is a vector, as can be seen considering $\epsilon^{ijk}A^{jk}$; this gives the third $\mathbf{1}$ representation. The symmetric part $S^{ij} = T^{ij} + T^{ji}$ can be separated into its trace, which gives the second $\mathbf{0}$ representation, and the traceless symmetric part, which therefore must have $j = 2$. For example, gravitational waves can be described by a traceless symmetric spatial tensor (transverse to the propagation direction) and therefore have spin 2, see Exercise 2.6. In general, a symmetric tensor with $N$ indices contains angular momenta up to $j = N$. In four dimensions, higher antisymmetric tensors are instead less interesting, because the index $\mu$ takes only four values $0, \ldots, 3$ and therefore we cannot antisymmetrize over more than four indices, otherwise we get zero. Furthermore, a totally antisymmetric tensor with four indices, $A^{\mu\nu\rho\sigma}$, has only one independent component $A^{0123}$, so it must be a Lorentz scalar. An antisymmetric tensor with three indices, $A^{\mu\nu\rho}$, has $4 \cdot 3 \cdot 2/3! = 4$ components and it has the same transformation properties as a four-vector. The last point can be better understood by introducing the totally antisymmetric tensor $\epsilon^{\mu\nu\rho\sigma}$, defined as follows. In a given reference frame $\epsilon^{\mu\nu\rho\sigma}$ is defined by $\epsilon^{0123} = +1$ and by the condition of total antisymmetry, so it vanishes if any two indices are equal and it changes sign for any exchange of indices, e.g. $\epsilon^{1023} = -1$, etc. Normally, if one gives the numerical value of the components of a tensor in a given frame, in another frame they will be different. The tensor $\epsilon^{\mu\nu\rho\sigma}$ is however special, because under (proper) Lorentz transformations
$\epsilon^{\mu\nu\rho\sigma} \to \Lambda^\mu{}_{\mu'}\Lambda^\nu{}_{\nu'}\Lambda^\rho{}_{\rho'}\Lambda^\sigma{}_{\sigma'}\,\epsilon^{\mu'\nu'\rho'\sigma'} = (\det\Lambda)\,\epsilon^{\mu\nu\rho\sigma} = \epsilon^{\mu\nu\rho\sigma}$. (2.44)
So, the components of the tensor $\epsilon^{\mu\nu\rho\sigma}$ have the same numerical value in all Lorentz frames. In terms of this tensor, it is immediate to understand that the four independent components of $A^{\mu\nu\rho}$ can be rearranged in a four-vector $A_\mu = \epsilon_{\mu\nu\rho\sigma}A^{\nu\rho\sigma}$, and that $A^{0123}$, which equals $(1/4!)\,\epsilon_{\mu\nu\rho\sigma}A^{\mu\nu\rho\sigma}$ up to a convention-dependent sign, is a scalar. A tensor which is invariant under all group transformations (i.e. for the Lorentz group, a tensor which has the same form in all Lorentz frames) is called an invariant tensor. The only other invariant tensor of the Lorentz group is $\eta_{\mu\nu}$; its invariance follows from the defining property of the Lorentz group, eq. (2.13).
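Equation (2.44) can be checked component by component. The following Python sketch (standard library only; the Lorentz matrix and the helper names are arbitrary illustrative choices) transforms all 256 components of $\epsilon^{\mu\nu\rho\sigma}$ with a proper transformation and with parity, and compares the result with $(\det\Lambda)\,\epsilon^{\mu\nu\rho\sigma}$.

```python
import math
from itertools import permutations, product

def sign(perm):
    """Sign of a permutation of (0, 1, 2, 3), from its number of inversions."""
    inversions = sum(perm[i] > perm[j] for i in range(4) for j in range(i + 1, 4))
    return -1 if inversions % 2 else 1

# Nonzero components of the totally antisymmetric tensor, with eps^{0123} = +1.
EPS = {p: sign(p) for p in permutations(range(4))}

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transform_eps(lam):
    """All components of Lambda Lambda Lambda Lambda eps, the left-hand side of eq. (2.44)."""
    return {idx: sum(e * lam[idx[0]][p[0]] * lam[idx[1]][p[1]]
                     * lam[idx[2]][p[2]] * lam[idx[3]][p[3]] for p, e in EPS.items())
            for idx in product(range(4), repeat=4)}

def matches(transformed, factor, tol=1e-12):
    """True if the transformed tensor equals factor * eps in every component."""
    return all(abs(val - factor * EPS.get(idx, 0)) < tol for idx, val in transformed.items())

# Proper orthochronous example: boost along x (v = 0.6, gamma = 1.25) times rotation about z.
g, v, th = 1.25, 0.6, 0.3
lam = matmul([[g, g * v, 0, 0], [g * v, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],
             [[1, 0, 0, 0], [0, math.cos(th), -math.sin(th), 0],
              [0, math.sin(th), math.cos(th), 0], [0, 0, 0, 1]])
PARITY = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

invariant_under_proper = matches(transform_eps(lam), 1)
flips_under_parity = matches(transform_eps(PARITY), -1)
print(invariant_under_proper, flips_under_parity)
```

Under parity, which has determinant -1, every component changes sign, which is why the invariance in eq. (2.44) is stated for proper transformations only.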
2.5
Spinorial representations
2.5.1
Spinors in nonrelativistic quantum mechanics
Tensor representations do not exhaust all physically interesting finite-dimensional representations of the Lorentz group. We can understand the issue by considering spatial rotations, i.e. the SO(3) subgroup of the Lorentz group. The tensor representations of SO(3) are constructed exactly as before, with scalars $\phi$, spatial vectors $v^i$, tensors $T^{ij}$, etc. with $i = 1, 2, 3$. However we know from nonrelativistic quantum mechanics that, besides the tensor representations, there are other representations of great physical interest. These are the spinorial representations. Strictly speaking, these are not SO(3) representations, because under a rotation of $2\pi$ a spinor changes sign, while an SO(3) rotation by $2\pi$ is the same as the identity transformation. However, since the observables are quadratic in the wave function, this sign ambiguity is perfectly acceptable physically, and these representations must be included. In more formal terms, this means that, for spatial rotations, the physically relevant group is not SO(3) but rather SU(2). We recall some facts about SU(2) representations, well known from nonrelativistic quantum mechanics. The Lie algebras of SU(2) and of SO(3) are the same, and are given by the angular momentum algebra
$[J^i, J^j] = i\epsilon^{ijk}J^k$. (2.45)
From the discussion in Section 2.1, we see that the Lie algebra knows only about the properties of a group near the identity element, and the fact that SU(2) and SO(3) have the same Lie algebra means that they are indistinguishable at the level of infinitesimal transformations. However, SU(2) and SO(3) differ at the global level, i.e. far from the identity. In SO(3) a rotation by $2\pi$ is the same as the identity. Instead, it can be shown that SU(2) is periodic only under rotations by $4\pi$. This means that an object that picks up a minus sign under a rotation by $2\pi$ is an acceptable representation of SU(2), while it is not an acceptable representation of SO(3). Therefore when we consider SU(2) we include the solutions of eq. (2.45) that correspond to half-integer spin, while for SO(3) we only retain representations with integer spin. Thus, the representations of SU(2) are labeled by an index $j$ which takes values $0, \frac{1}{2}, 1, \frac{3}{2}, \ldots$ and gives the spin of the state, in units of $\hbar$. The spin-$j$ representation has dimension $2j + 1$, and the various states within it are labeled by $j_z$, which takes the values $-j, \ldots, j$ in integer steps. The representation $j = 1/2$ is called the spinorial representation, and has dimension 2: on it the $J^i$ are represented as
$J^i = \frac{\sigma^i}{2}$, (2.46)
where $\sigma^i$ are the Pauli matrices,
$\sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, $\sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$. (2.47)
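That the matrices $\sigma^i/2$ really provide a representation of the algebra (2.45), with spin $j = 1/2$, can be confirmed directly. The Python sketch below (standard library only, illustrative names) checks the commutation relations and evaluates the Casimir operator $\mathbf{J}^2$, which should equal $j(j+1) = 3/4$ times the identity.

```python
# Pauli matrices of eq. (2.47), labelled by i = 1, 2, 3.
SIGMA = {1: [[0, 1], [1, 0]], 2: [[0, -1j], [1j, 0]], 3: [[1, 0], [0, -1]]}

def eps(i, j, k):
    """Levi-Civita symbol for labels 1, 2, 3."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(2)] for r in range(2)]

# Spin-1/2 generators, eq. (2.46).
Jhalf = {i: [[x / 2 for x in row] for row in SIGMA[i]] for i in SIGMA}

def algebra_holds(i, j):
    """Check [J^i, J^j] = i eps^{ijk} J^k, eq. (2.45), for one pair (i, j)."""
    lhs = comm(Jhalf[i], Jhalf[j])
    return all(abs(lhs[r][c] - sum(1j * eps(i, j, k) * Jhalf[k][r][c] for k in SIGMA)) < 1e-12
               for r in range(2) for c in range(2))

algebra_ok = all(algebra_holds(i, j) for i in SIGMA for j in SIGMA)

# Casimir operator J^2 = sum_i J^i J^i.
squares = [matmul(Jhalf[i], Jhalf[i]) for i in SIGMA]
casimir = [[sum(sq[r][c] for sq in squares) for c in range(2)] for r in range(2)]
print(algebra_ok, casimir)
```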
The Pauli matrices satisfy the algebraic identity
$\sigma^i\sigma^j = \delta^{ij} + i\epsilon^{ijk}\sigma^k$, (2.48)
from which it follows immediately that $\sigma^i/2$ obey the commutation relations (2.45). The spinorial representation is the fundamental representation of SU(2), since all representations can be constructed with tensor products of spinors. In physical terms, this means that with spin 1/2 particles we can construct composite systems with all possible integer or half-integer spin. For instance, the composition of two spin 1/2 states gives spin zero and spin 1,
$\frac{\mathbf{1}}{\mathbf{2}} \otimes \frac{\mathbf{1}}{\mathbf{2}} = \mathbf{0} \oplus \mathbf{1}$. (2.49)
If we denote by $\uparrow$ and $\downarrow$ the $j = 1/2$ states with $j_z = +1/2$ and $j_z = -1/2$, respectively, then the three states with $j = 1$ are given by
$(\uparrow\uparrow)$, $\frac{1}{\sqrt{2}}(\uparrow\downarrow + \downarrow\uparrow)$, $(\downarrow\downarrow)$ (2.50)
while the singlet (i.e. the scalar state) is
$\frac{1}{\sqrt{2}}(\uparrow\downarrow - \downarrow\uparrow)$. (2.51)
2.5.2
Spinors in the relativistic theory
We certainly want to keep spinors in the relativistic theory. This means that we must enlarge the set of representations of the Lorentz group, compared to the tensor representations discussed above. This is most easily done starting from the Lorentz algebra in the form given by eqs. (2.27)-(2.29), and defining
$\mathbf{J}^\pm = \frac{\mathbf{J} \pm i\mathbf{K}}{2}$. (2.52)
The Lie algebra becomes
$[J^{+,i}, J^{+,j}] = i\epsilon^{ijk}J^{+,k}$, (2.53)
$[J^{-,i}, J^{-,j}] = i\epsilon^{ijk}J^{-,k}$, (2.54)
$[J^{+,i}, J^{-,j}] = 0$. (2.55)
Therefore we have two copies of the angular momentum algebra, which commute between themselves (see note 7). Having written the Lorentz algebra in this form, it is now easy to include spinorial representations: we simply take all solutions of the algebra (2.53)-(2.55), including spinor representations. Since we know the representations of SU(2), and here we have two commuting SU(2) factors, we find that:
• The representations of the Lorentz algebra can be labeled by two half-integers: $(j_-, j_+)$.
• The dimension of the representation $(j_-, j_+)$ is $(2j_- + 1)(2j_+ + 1)$.
• The generator of rotations $\mathbf{J}$ is related to $\mathbf{J}^+$ and $\mathbf{J}^-$ by $\mathbf{J} = \mathbf{J}^+ + \mathbf{J}^-$; therefore, by the usual addition of angular momenta in quantum mechanics, in the representation $(j_-, j_+)$ we have states with all possible spin $j$ in integer steps between the values $|j_+ - j_-|$ and $j_+ + j_-$.
Note 7. The fact that the Lorentz algebra can be written as the algebra of SU(2) × SU(2) does not mean that the Lorentz group SO(3, 1) is the same as SU(2) × SU(2). First of all, the Lie algebra only reflects the properties of the group close to the identity. Furthermore, $\mathbf{J}^\pm$ are complex combinations of $\mathbf{J}$ and $\mathbf{K}$. Observe that, because of the factor $i$ in eq. (2.52), a representation of SU(2) × SU(2) with $\mathbf{J}^\pm$ hermitian induces a representation of SO(3, 1) with $\mathbf{J}$ hermitian but $\mathbf{K}$ antihermitian. For the more mathematical reader: SU(2) × SU(2) is the universal covering group of SO(4) (similarly to the fact that SU(2) is the universal covering group of SO(3)) and SO(4) is the Euclidean version of the Lorentz group, i.e. it is obtained taking the time variable $t$ purely imaginary. The universal covering group of SO(3, 1) is SL(2, C).
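The decoupling of the algebra into two commuting copies of the angular momentum algebra can be verified explicitly in the four-vector representation. The Python sketch below (standard library only; helper names are illustrative) builds $\mathbf{J}^\pm$ from the generators (2.23) and checks eqs. (2.53)-(2.55).

```python
ETA = [1, -1, -1, -1]  # diagonal of eta^{mu nu}

def J(mu, nu):
    """Four-vector generators of eq. (2.23); rows are rho, columns sigma."""
    return [[1j * ((ETA[mu] if mu == r else 0) * (nu == s)
                   - (ETA[nu] if nu == r else 0) * (mu == s)) for s in range(4)]
            for r in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

def lin(*terms):
    """Linear combination sum_k c_k M_k, given as (c_k, M_k) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(4)] for i in range(4)]

def equal(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

def eps(i, j, k):
    return (i - j) * (j - k) * (k - i) // 2

IDX = (1, 2, 3)
Jv = {1: J(2, 3), 2: J(3, 1), 3: J(1, 2)}      # eq. (2.26)
Kv = {i: J(i, 0) for i in IDX}

# Eq. (2.52): J^{+-} = (J +- iK) / 2.
Jp = {i: lin((0.5, Jv[i]), (0.5j, Kv[i])) for i in IDX}
Jm = {i: lin((0.5, Jv[i]), (-0.5j, Kv[i])) for i in IDX}

ZERO = [[0] * 4 for _ in range(4)]
pairs = [(i, j) for i in IDX for j in IDX]

def su2_ok(gens):
    """Check [G^i, G^j] = i eps^{ijk} G^k for a triplet of generators."""
    return all(equal(comm(gens[i], gens[j]),
                     lin(*[(1j * eps(i, j, k), gens[k]) for k in IDX])) for i, j in pairs)

plus_ok = su2_ok(Jp)                                                  # (2.53)
minus_ok = su2_ok(Jm)                                                 # (2.54)
cross_ok = all(equal(comm(Jp[i], Jm[j]), ZERO) for i, j in pairs)     # (2.55)
print(plus_ok, minus_ok, cross_ok)
```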
The representations are in general complex and the dimension of the representation is the number of independent complex components. In some cases we can impose a reality condition and $(2j_- + 1)(2j_+ + 1)$ becomes the number of independent real components. The representations $(j_-, j_+)$ must include all tensor representations discussed in the previous section, plus spinorial representations. We examine the simplest cases.
(0, 0). This representation has dimension one. On it, $\mathbf{J}^\pm = 0$ so also $\mathbf{J}$, $\mathbf{K}$ are zero. Therefore it is the scalar representation.
(1/2, 0) and (0, 1/2). These representations both have dimension two and spin 1/2, so they are spinorial representations. We denote by $(\psi_L)_\alpha$, with $\alpha = 1, 2$, a spinor in (1/2, 0) and by $(\psi_R)_\alpha$ a spinor in (0, 1/2) (sometimes in the literature the index of $\psi_L$ is instead denoted by $\dot\alpha$ to stress that it is an index in a different representation compared to the index of $\psi_R$). $\psi_L$ is called a left-handed Weyl spinor and $\psi_R$ is called a right-handed Weyl spinor:
Weyl spinors: $\psi_L \in \left(\frac{1}{2}, 0\right)$, $\psi_R \in \left(0, \frac{1}{2}\right)$. (2.56)
We want to determine the explicit form of the generators $\mathbf{J}$, $\mathbf{K}$ on Weyl spinors. Consider first the representation (1/2, 0). By definition, on this representation $\mathbf{J}^-$ is represented by a 2 × 2 matrix, while $\mathbf{J}^+ = 0$. The solution of (2.54) in terms of 2 × 2 matrices is of course $\mathbf{J}^- = \boldsymbol{\sigma}/2$, and therefore
$\mathbf{J} = \mathbf{J}^+ + \mathbf{J}^- = \frac{\boldsymbol{\sigma}}{2}$, (2.57)
$\mathbf{K} = -i(\mathbf{J}^+ - \mathbf{J}^-) = i\frac{\boldsymbol{\sigma}}{2}$. (2.58)
Observe that in this representation the generators $K^i$ are not hermitian, in agreement with the comment in note 7. This is a consequence of the fact that the Lorentz group is noncompact and of the theorem mentioned on page 16, which states that noncompact groups have no unitary representations of finite dimension, except for representations in which the noncompact generators (in this case the $K^i$) are represented trivially, i.e. $K^i = 0$. We can now write explicitly how a Weyl spinor transforms under Lorentz transformations, using eq. (2.31),
$\psi_L \to \Lambda_L\psi_L = \exp\left\{(-i\boldsymbol{\theta} - \boldsymbol{\eta})\cdot\frac{\boldsymbol{\sigma}}{2}\right\}\psi_L$. (2.59)
Repeating the argument for the (0, 1/2) representation, we find again $\mathbf{J} = \boldsymbol{\sigma}/2$ but $\mathbf{K} = -i\boldsymbol{\sigma}/2$ and
$\psi_R \to \Lambda_R\psi_R = \exp\left\{(-i\boldsymbol{\theta} + \boldsymbol{\eta})\cdot\frac{\boldsymbol{\sigma}}{2}\right\}\psi_R$. (2.60)
Note that $\Lambda_{L,R}$ are complex matrices, and therefore necessarily the two components of a Weyl spinor are complex numbers. Using the property of the Pauli matrices $\sigma^2\sigma^i\sigma^2 = -\sigma^{i*}$ and the explicit form of $\Lambda_{L,R}$ it is easy to show that
$\sigma^2\Lambda_L^*\sigma^2 = \Lambda_R$. (2.61)
From this it follows that
$\sigma^2\psi_L^* \to \sigma^2(\Lambda_L\psi_L)^* = (\sigma^2\Lambda_L^*\sigma^2)\,\sigma^2\psi_L^* = \Lambda_R(\sigma^2\psi_L^*)$, (2.62)
where we used the fact that $\sigma^2\sigma^2 = 1$. Therefore, if $\psi_L \in (1/2, 0)$, then $\sigma^2\psi_L^*$ is a right-handed Weyl spinor,
$\sigma^2\psi_L^* \in \left(0, \frac{1}{2}\right)$. (2.63)
We define the operation of charge conjugation on Weyl spinors as an operation that transforms $\psi_L$ into a new spinor $\psi_L^c$ defined as
$\psi_L^c = i\sigma^2\psi_L^*$. (2.64)
Then charge conjugation transforms a left-handed Weyl spinor into a right-handed one. Taking the complex conjugate of eq. (2.64) and denoting the right-handed spinor $\psi_L^c$ by $\psi_R$, we have $\psi_L = -i\sigma^2\psi_R^*$ (having used the fact that $\sigma^2$ is purely imaginary and $\sigma^2\sigma^2 = 1$). Therefore we define charge conjugation on a right-handed spinor $\psi_R$ as
$\psi_R^c = -i\sigma^2\psi_R^*$, (2.65)
so that charge conjugation transforms a right-handed Weyl spinor into a left-handed one. The factor $i$ in eq. (2.64) is chosen so that, iterating the transformation twice, we get the identity operation,
$(\psi_L^c)^c = (i\sigma^2\psi_L^*)^c = -i\sigma^2(i\sigma^2\psi_L^*)^* = \psi_L$. (2.66)
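The relations (2.61)-(2.66) are easy to test numerically. In the Python sketch below (standard library only; parameter values, the test spinor and helper names are arbitrary illustrative choices, and the matrix exponential is a truncated Taylor series) we build $\Lambda_L$ and $\Lambda_R$ from eqs. (2.59)-(2.60) and check that $\sigma^2\Lambda_L^*\sigma^2 = \Lambda_R$, that the charge conjugate of a left-handed spinor transforms with $\Lambda_R$, and that conjugating twice gives back the original spinor.

```python
SIGMA = {1: [[0, 1], [1, 0]], 2: [[0, -1j], [1j, 0]], 3: [[1, 0], [0, -1]]}

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

def expm(a, terms=40):
    """Matrix exponential from a truncated Taylor series (sketch, small matrices only)."""
    total = [[complex(r == c) for c in range(2)] for r in range(2)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        total = [[total[r][c] + term[r][c] for c in range(2)] for r in range(2)]
    return total

def weyl(theta, rap, sign):
    """exp{(-i theta + sign * eta) . sigma / 2}: sign = -1 is (2.59), sign = +1 is (2.60)."""
    gen = [[sum((-1j * theta[k] + sign * rap[k]) * SIGMA[k + 1][r][c] / 2 for k in range(3))
            for c in range(2)] for r in range(2)]
    return expm(gen)

def conj(a):
    return [[x.conjugate() for x in row] for row in a]

def apply(m, psi):
    return [sum(m[r][c] * psi[c] for c in range(2)) for r in range(2)]

def charge_conj_left(psi):
    """psi^c = i sigma^2 psi^*, eq. (2.64)."""
    return apply([[1j * x for x in row] for row in SIGMA[2]], [z.conjugate() for z in psi])

def charge_conj_right(psi):
    """psi^c = -i sigma^2 psi^*, eq. (2.65)."""
    return apply([[-1j * x for x in row] for row in SIGMA[2]], [z.conjugate() for z in psi])

def close(u, w, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(u, w))

theta, rap = (0.3, -0.5, 0.2), (0.4, 0.1, -0.7)   # arbitrary rotation and boost parameters
lam_l, lam_r = weyl(theta, rap, -1), weyl(theta, rap, +1)

lhs = matmul(SIGMA[2], matmul(conj(lam_l), SIGMA[2]))           # sigma^2 Lambda_L^* sigma^2
eq_261_ok = all(close(lhs[r], lam_r[r]) for r in range(2))      # eq. (2.61)

psi_l = [0.3 + 0.1j, -0.7j]                                     # arbitrary left-handed spinor
transforms_as_right = close(charge_conj_left(apply(lam_l, psi_l)),
                            apply(lam_r, charge_conj_left(psi_l)))   # eqs. (2.62)-(2.63)
twice_is_identity = close(charge_conj_right(charge_conj_left(psi_l)), psi_l)  # eq. (2.66)
print(eq_261_ok, transforms_as_right, twice_is_identity)
```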
We will understand the physical meaning of charge conjugation in Chapter 4.
(1/2, 1/2). This representation has (complex) dimension four and $|1/2 - 1/2| \leq j \leq 1/2 + 1/2$, i.e. $j = 0, 1$. Comparing with eq. (2.40) we see that it is a complex four-vector representation. A generic element of the (1/2, 1/2) representation can be written as a pair $((\psi_L)_\alpha, (\xi_R)_\beta)$, where $\psi_L$ and $\xi_R$ are two independent Weyl spinors, left-handed and right-handed, respectively, and $\alpha, \beta$ take the values 1, 2. We want to make explicit the relation between these four (complex) quantities and the four components of a (complex) four-vector. First of all, we have seen above that, given a right-handed spinor $\xi_R$, we can form a left-handed spinor $\xi_L \equiv -i\sigma^2\xi_R^*$, and similarly from $\psi_L$ we can build $\psi_R \equiv i\sigma^2\psi_L^*$. We define the matrices $\sigma^\mu$ and $\bar\sigma^\mu$ as
$\sigma^\mu = (1, \sigma^i)$, $\bar\sigma^\mu = (1, -\sigma^i)$, (2.67)
where $\sigma^i$ are the Pauli matrices and 1 is the 2 × 2 identity matrix. Then, it is easy to show (see Exercise 2.3) that
$\xi_R^\dagger\sigma^\mu\psi_R$ (2.68)
and
$\xi_L^\dagger\bar\sigma^\mu\psi_L$ (2.69)
are contravariant four-vectors. These four-vectors are by construction complex. Since the matrix $\Lambda^\mu{}_\nu$ that represents the Lorentz transformation of a four-vector is real, given a complex four-vector $V^\mu$ it is consistent with Lorentz invariance to impose on it a reality condition, $V_\mu = V_\mu^*$, because, if we impose it in a given frame, it will remain true in all Lorentz frames. Therefore we obtain the real four-vector representation.
(1, 0) and (0, 1). These correspond to self-dual and anti-self-dual antisymmetric tensors $A^{\mu\nu}$, and each have complex dimension three, i.e. real dimension six. We discuss them in Exercise 2.5.
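The statement that the bilinears (2.68) and (2.69) are four-vectors is left to Exercise 2.3; a numerical spot check is nevertheless straightforward. The Python sketch below (standard library only; parameters, test spinors and helper names are arbitrary illustrative choices, and the matrix exponential is a truncated Taylor series) transforms the spinors with $\Lambda_R$ or $\Lambda_L$ from eqs. (2.59)-(2.60) and compares with the four-vector matrix $\Lambda$ obtained from eq. (2.31) with the generators (2.23), using the same $\boldsymbol{\theta}$ and $\boldsymbol{\eta}$.

```python
ETA = [1, -1, -1, -1]
# sigma^mu = (1, sigma^i) and sigma-bar^mu = (1, -sigma^i), eq. (2.67).
S = [[[1, 0], [0, 1]], [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
S_BAR = [S[0]] + [[[-x for x in row] for row in S[i]] for i in (1, 2, 3)]

def J(mu, nu):
    """Four-vector generators of eq. (2.23); rows are rho, columns sigma."""
    return [[1j * ((ETA[mu] if mu == r else 0) * (nu == s)
                   - (ETA[nu] if nu == r else 0) * (mu == s)) for s in range(4)]
            for r in range(4)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def expm(a, terms=40):
    """Matrix exponential from a truncated Taylor series (sketch, small matrices only)."""
    n = len(a)
    total = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

theta, rap = (0.3, -0.5, 0.2), (0.4, 0.1, -0.7)   # arbitrary rotation and boost parameters
Jv = [J(2, 3), J(3, 1), J(1, 2)]                  # J^1, J^2, J^3 of eq. (2.26)
Kv = [J(1, 0), J(2, 0), J(3, 0)]                  # K^1, K^2, K^3

# Four-vector transformation, eq. (2.31).
lam = expm([[sum(-1j * theta[k] * Jv[k][r][c] + 1j * rap[k] * Kv[k][r][c] for k in range(3))
             for c in range(4)] for r in range(4)])

def weyl(sign):
    """sign = -1 gives Lambda_L of eq. (2.59), sign = +1 gives Lambda_R of eq. (2.60)."""
    return expm([[sum((-1j * theta[k] + sign * rap[k]) * S[k + 1][r][c] / 2 for k in range(3))
                  for c in range(2)] for r in range(2)])

lam_l, lam_r = weyl(-1), weyl(+1)

def apply(m, vec):
    return [sum(m[r][c] * vec[c] for c in range(len(vec))) for r in range(len(m))]

def bilinear(xi, mats, psi):
    """Components xi^dagger M^mu psi for mu = 0, ..., 3."""
    return [sum(xi[r].conjugate() * m[r][c] * psi[c] for r in range(2) for c in range(2))
            for m in mats]

def close(u, w, tol=1e-10):
    return all(abs(a - b) < tol for a, b in zip(u, w))

xi, psi = [0.3 + 0.1j, -0.7j], [1.0 - 0.2j, 0.5 + 0.4j]   # arbitrary test spinors

# Eq. (2.68): read xi, psi as right-handed spinors.
right_ok = close(bilinear(apply(lam_r, xi), S, apply(lam_r, psi)),
                 apply(lam, bilinear(xi, S, psi)))
# Eq. (2.69): read xi, psi as left-handed spinors.
left_ok = close(bilinear(apply(lam_l, xi), S_BAR, apply(lam_l, psi)),
                apply(lam, bilinear(xi, S_BAR, psi)))
print(right_ok, left_ok)
```

Swapping $\sigma^\mu$ and $\bar\sigma^\mu$ in either check makes it fail for a generic boost, which is one way to see why each handedness comes with its own set of matrices.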
2.6
Field representations
Our main motivation for studying Lorentz symmetry is to construct a Lorentz-invariant field theory. A field $\phi(x)$ is a function of the coordinates with some definite transformation properties under the Lorentz group. In general, if
$x^\mu \to x'^\mu = \Lambda^\mu{}_\nu x^\nu$ (2.70)
the field $\phi(x)$ will transform into a new function of the new coordinates,
$\phi(x) \to \phi'(x')$. (2.71)
To define how a field transforms means to state how $\phi'(x')$ is related to $\phi(x)$.
2.6.1 Scalar fields
The simplest possible transformation is that of a scalar field,
$$\phi'(x') = \phi(x). \tag{2.72}$$
In other words, the numerical value of a scalar field at a point is Lorentz invariant: a point $P$ has coordinates $x$ in a reference frame and $x'$ in the transformed frame, and the functional form of the field changes so that its numerical value in $P$ is the same, independently of how $P$ is labeled. Consider now an infinitesimal Lorentz transformation
$$x^\rho \to x'^\rho = x^\rho + \delta x^\rho \tag{2.73}$$
with
$$\delta x^\rho = \omega^\rho{}_\sigma x^\sigma = -\frac{i}{2}\, \omega_{\mu\nu} (J^{\mu\nu})^\rho{}_\sigma x^\sigma, \tag{2.74}$$
and $(J^{\mu\nu})^\rho{}_\sigma = i(\eta^{\mu\rho} \delta^\nu_\sigma - \eta^{\nu\rho} \delta^\mu_\sigma)$, as in eqs. (2.23) and (2.24). Under this transformation $\delta\phi \equiv \phi'(x') - \phi(x) = 0$ by definition. This corresponds to the fact that the scalar representation gives a trivial representation of the generators, $J^{\mu\nu} = 0$. However, in the case of fields we have a more interesting possibility, namely we can consider an infinitesimal variation at fixed coordinate $x$ (rather than at a given point $P$),
$$\delta_0 \phi \equiv \phi'(x) - \phi(x). \tag{2.75}$$
To understand the difference between $\delta\phi$ and $\delta_0\phi$ we observe that, when we compute $\delta\phi = \phi'(x') - \phi(x)$, we are studying how a single degree of freedom (the field evaluated at the point $P$) changes when we change the label of the point $P$ from $x$ to $x'$. However the point $P$ is kept fixed, so the base space is made by the single degree of freedom $\phi(P)$ and therefore is one-dimensional. More generally, when in the next subsections we consider spinor or vector fields, we will see that $\delta\psi$ or $\delta A^\mu$ always provides a finite-dimensional representation of the generators. For instance the four degrees of freedom $A^\mu(P)$ provide a four-dimensional base space. Instead, when we compute $\delta_0\phi$, we are comparing the fields at two different spacetime points $P$ and $P'$, so we are comparing different degrees of freedom. The base space now becomes the set of $\phi(P)$ with $P$ varying over all of spacetime, or in other words is a space of functions, and therefore it is an infinite-dimensional base space. We then obtain an infinite-dimensional representation of the generators. To find the generators in this representation, we expand eq. (2.75) to first order in $\delta x$,
$$\delta_0 \phi = \phi'(x' - \delta x) - \phi(x) = -\delta x^\rho \partial_\rho \phi(x). \tag{2.76}$$
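Eq. (2.76) can be illustrated with a purely spatial rotation, which is a special case of the Lorentz transformation (2.73). The following sketch assumes an arbitrary smooth test function in the $(x, y)$ plane as the "scalar field" and a small rotation angle `theta`; both choices are hypothetical and only serve to compare the exact $\delta_0\phi = \phi'(x) - \phi(x)$ with the first-order expression $-\delta x^\rho \partial_\rho \phi$.

```python
import math


def phi(x, y):
    """Arbitrary smooth test function playing the role of a scalar field."""
    return x * x * y + math.sin(y)


def grad_phi(x, y):
    """Analytic gradient (d/dx, d/dy) of the test function."""
    return (2.0 * x * y, x * x + math.cos(y))


theta = 1e-4          # small rotation angle (assumed value)
x, y = 0.7, -0.3      # coordinate at which the variation is evaluated

# Rotation x' = R x; to first order delta x = (-theta * y, theta * x).
dx, dy = -theta * y, theta * x

# Scalar field: phi'(x') = phi(x), hence phi'(x) = phi(R^{-1} x).
c, s = math.cos(theta), math.sin(theta)
x_old, y_old = c * x + s * y, -s * x + c * y

delta0_exact = phi(x_old, y_old) - phi(x, y)
gx, gy = grad_phi(x, y)
delta0_first_order = -(dx * gx + dy * gy)

print(delta0_exact, delta0_first_order)
```

The two numbers agree up to terms of order `theta**2`, while $\delta\phi = \phi'(x') - \phi(x)$ vanishes identically by construction.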
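Using eq. (2.74) for $\delta x^\rho$, this can be rewritten as
$$\delta_0 \phi = \frac{i}{2}\, \omega_{\mu\nu} (J^{\mu\nu})^\rho{}_\sigma x^\sigma \partial_\rho \phi \equiv -\frac{i}{2}\, \omega_{\mu\nu} L^{\mu\nu} \phi, \tag{2.77}$$
where we defined
$$L^{\mu\nu} = -(J^{\mu\nu})^\rho{}_\sigma x^\sigma \partial_\rho = i(x^\mu \partial^\nu - x^\nu \partial^\mu). \tag{2.78}$$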
We can easily check that the operators $L^{\mu\nu}$ satisfy the Lie algebra (2.25) and therefore give a representation of the generators of the Lorentz group. As discussed above, the basis for the representation is the space of scalar fields. This is a space of functions, so it is infinite-dimensional, and therefore this is an infinite-dimensional representation of the Lorentz algebra. We have not yet specified what is the scalar product in the field space, so we cannot yet ask whether this representation is unitary. We postpone the issue to the next chapter. Recalling that with our metric signature $p^\mu = +i\partial^\mu$ (see the Notation), we find $L^{\mu\nu} = x^\mu p^\nu - x^\nu p^\mu$. In particular, for spatial rotations we have $L^{ij} = x^i p^j - x^j p^i$ and $L^i = (1/2)\epsilon^{ijk} L^{jk} = \epsilon^{ijk} x^j p^k$, and we recognize that $L^i$ is the orbital angular momentum.
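As a quick check of this identification, the $z$ component can be written out explicitly. The lines below assume only the conventions already used in the text, namely $x^i = (x, y, z)$, $\partial^i = -\partial_i$ for spatial indices, and $p^i = i\partial^i$; they are meant as a worked illustration rather than an additional result.

```latex
\begin{align}
L^{3} &= \tfrac{1}{2}\,\epsilon^{3jk} L^{jk} = L^{12}
       = i\left( x^{1}\partial^{2} - x^{2}\partial^{1} \right) \\
      &= -i\left( x\,\partial_{y} - y\,\partial_{x} \right)
       = x\,p^{y} - y\,p^{x},
\qquad p^{x} = -i\,\partial_{x}, \quad p^{y} = -i\,\partial_{y} .
\end{align}
```

This is the familiar quantum-mechanical expression for $L_z$ in units $\hbar = 1$.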
2.6.2 Weyl fields
A left-handed Weyl field $\psi_L(x)$ is defined as a field that, under $x^\mu \to x'^\mu = \Lambda^\mu{}_\nu x^\nu$, transforms as
$$\psi_L(x) \to \psi'_L(x') = \Lambda_L \psi_L(x), \tag{2.79}$$
with $\Lambda_L$ given by eq. (2.59). Similarly a right-handed Weyl field $\psi_R$ transforms with $\Lambda_R$ given in eq. (2.60). In the classical theory we will consider $\psi_L$, $\psi_R$ as ordinary, commuting, c-numbers. The representation of the Lorentz generators on $\psi_L$ can be found computing
$$\delta_0 \psi_L \equiv \psi'_L(x) - \psi_L(x) = \psi'_L(x' - \delta x) - \psi_L(x) = \psi'_L(x') - \delta x^\rho \partial_\rho \psi_L(x) - \psi_L(x) = (\Lambda_L - 1)\psi_L(x) - \delta x^\rho \partial_\rho \psi_L(x). \tag{2.80}$$
We see that $\delta_0 \psi_L$ is made of two parts; one comes from the variation of the coordinate $\delta x^\rho$ and is the same as for scalar fields. Exactly as in eqs. (2.76) and (2.77), we have
$$-\delta x^\rho \partial_\rho \psi_L = -\frac{i}{2}\, \omega_{\mu\nu} L^{\mu\nu} \psi_L, \tag{2.81}$$
with $L^{\mu\nu}$ given in eq. (2.78). We write $\Lambda_L$ in the form
$$\Lambda_L = e^{-\frac{i}{2} \omega_{\mu\nu} S^{\mu\nu}}. \tag{2.82}$$
Then eq. (2.80) becomes
$$\delta_0 \psi_L = -\frac{i}{2}\, \omega_{\mu\nu} J^{\mu\nu} \psi_L \tag{2.83}$$
with
$$J^{\mu\nu} = L^{\mu\nu} + S^{\mu\nu}. \tag{2.84}$$
Comparing eq. (2.82) with eq. (2.59) we see that
$$S^i = \frac{1}{2}\, \epsilon^{ijk} S^{jk} = \frac{\sigma^i}{2}, \tag{2.85}$$
while
$$S^{i0} = i\, \frac{\sigma^i}{2}. \tag{2.86}$$
We recognize in eq. (2.84) the separation of the angular momentum into the orbital and the spin contributions. It is clear that this separation is completely general, and holds for any representation. The orbital part $L^{\mu\nu}$ always has the form (2.78) independently of the representation, while $S^{\mu\nu}$ depends on the specific representation used. For instance, for right-handed Weyl fields $S^i$ are still given by eq. (2.85) while $S^{i0} = -i\sigma^i/2$, as we see from eq. (2.60).
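The spin matrices (2.85) and (2.86) can be checked against the Lorentz algebra in the form of eqs. (2.101) and (2.102). The sketch below assumes the identification $J^i = S^i$ and $K^i = S^{i0}$ (the boost generators as the $i0$ components, as in the earlier part of the chapter) and tests both the left-handed choice $K^i = i\sigma^i/2$ and the right-handed one $K^i = -i\sigma^i/2$. Helper names are hypothetical.

```python
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def comm(a, b):
    """Commutator [a, b]."""
    ab, ba = mul(a, b), mul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def eps_sum(coeff, i, j, gens):
    """coeff * sum_k eps(i, j, k) * gens[k]."""
    total = [[0, 0], [0, 0]]
    for k in range(3):
        total = [[total[r][c] + coeff * eps(i, j, k) * gens[k][r][c]
                  for c in range(2)] for r in range(2)]
    return total


def satisfies_lorentz_algebra(J, K):
    """Check [J,J] = i eps J, [J,K] = i eps K, [K,K] = -i eps J."""
    for i in range(3):
        for j in range(3):
            if not close(comm(J[i], J[j]), eps_sum(1j, i, j, J)):
                return False
            if not close(comm(J[i], K[j]), eps_sum(1j, i, j, K)):
                return False
            if not close(comm(K[i], K[j]), eps_sum(-1j, i, j, J)):
                return False
    return True


S_ROT = [scale(0.5, s) for s in PAULI]        # S^i = sigma^i / 2
K_LEFT = [scale(0.5j, s) for s in PAULI]      # S^{i0} = +i sigma^i / 2
K_RIGHT = [scale(-0.5j, s) for s in PAULI]    # S^{i0} = -i sigma^i / 2

print(satisfies_lorentz_algebra(S_ROT, K_LEFT),
      satisfies_lorentz_algebra(S_ROT, K_RIGHT))
```

Both sign choices close the same algebra, which is why the two Weyl representations differ only in how boosts act.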
2.6.3 Dirac fields
Consider a parity transformation $(t, \mathbf{x}) \to (t, -\mathbf{x})$. Under this operation the boost generators behave as true vectors and change sign, $\mathbf{K} \to -\mathbf{K}$, since the parity transformation reverses the velocity $\mathbf{v}$ of the boost. The angular momentum generator is instead a pseudovector, $\mathbf{J} \to \mathbf{J}$. Therefore a parity operation exchanges the $J^i_\pm$ generators, $J^i_+ \leftrightarrow J^i_-$. This means that under a parity transformation an object in the $(j_-, j_+)$ representation is transformed into an object in the $(j_+, j_-)$ representation. Therefore the representation $(j_-, j_+)$ of the Lorentz group is not at the same time a basis for a representation of the parity transformation, unless $j_- = j_+$. In particular, $\psi_L$ and $\psi_R$, separately, are not a basis for a representation of the parity transformation.

In Nature, we know experimentally that parity is violated by weak interactions. At the theoretical level, this is reflected in the fact that in the Standard Model the left- and right-handed components of the spin-1/2 particles enter the theory in a very different way, as we will see in Chapter 8. However, we saw in Section 1.2 that the typical scale of weak interactions is O(100) GeV, much higher than the scale of strong and of electromagnetic interactions. At sufficiently low energies, therefore, the effect of weak interactions is small, and the dominant contributions come from the electromagnetic and the strong interactions, which both conserve parity. In this case, it is convenient to work with fields which provide a representation of Lorentz and parity transformations. We then define a Dirac field as (footnote 8)
$$\Psi = \begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix}. \tag{2.87}$$
Footnote 8: More precisely, this is the Dirac field written in the chiral basis, see Section 3.4.2.
A Dirac field therefore has four complex components, and it provides a basis for a representation of both Lorentz and parity transformations. In fact, under a Lorentz transformation, $\Psi \to \Lambda_D \Psi$ with
$$\Lambda_D = \begin{pmatrix} \Lambda_L & 0 \\ 0 & \Lambda_R \end{pmatrix}, \tag{2.88}$$
and $\Lambda_L$, $\Lambda_R$ given in eqs. (2.59) and (2.60) (footnote 9). Under a parity transformation $P$ the coordinates change as $x^\mu \to x'^\mu = (t, -\mathbf{x})$ while
$$\begin{pmatrix} \psi_L(x) \\ \psi_R(x) \end{pmatrix} \to \begin{pmatrix} \psi_R(x') \\ \psi_L(x') \end{pmatrix} \tag{2.89}$$
and therefore
$$\Psi(x) \to \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \Psi(x'). \tag{2.90}$$
Footnote 9: In Section 3.4.2, after introducing the Dirac matrices, we will see how to write $\Lambda_D$ in terms of the commutator of Dirac matrices, and the result will be independent of the chiral basis that we have used here.
When we study the quantized Dirac ﬁeld we will examine the possibility and the meaning of an overall phase η = ±1 in the transformation law (2.90), see Section 4.2.3.
In eqs. (2.64) and (2.65) we defined the operation of charge conjugation on Weyl spinors. Given a Dirac spinor $\Psi$ as in eq. (2.87), charge conjugation allows us to define a new Dirac spinor
$$\Psi^c = \begin{pmatrix} -i\sigma^2 \psi_R^* \\ i\sigma^2 \psi_L^* \end{pmatrix} = -i \begin{pmatrix} 0 & \sigma^2 \\ -\sigma^2 & 0 \end{pmatrix} \Psi^*, \tag{2.91}$$
and, as for Weyl spinors, iterating charge conjugation twice one finds the identity transformation,
$$(\Psi^c)^c = \Psi. \tag{2.92}$$
Note that the coordinates xµ are unchanged under charge conjugation. We will understand the importance of charge conjugation when we quantize the theory and we will ﬁnd particles and antiparticles. Dirac spinors are the basic objects in quantum electrodynamics (QED). Since QED preserves parity and charge conjugation, the Weyl spinors always appear in the combination Ψ. On Ψ parity is a welldeﬁned operation, and we can use it to construct a parityinvariant theory while, having for instance only ψL at our disposal, it is impossible to build a theory invariant under parity. We will see that in the Standard Model, parity and charge conjugation are not symmetries and ψL , ψR appear separately, in a nonsymmetric way. Therefore, Weyl spinors are more fundamental objects than Dirac spinors.
2.6.4 Majorana fields
A Majorana spinor is a Dirac spinor in which $\psi_L$ and $\psi_R$ are not independent, but rather $\psi_R = i\sigma^2 \psi_L^*$,
$$\Psi_M = \begin{pmatrix} \psi_L \\ i\sigma^2 \psi_L^* \end{pmatrix}. \tag{2.93}$$
So, it has the same number of degrees of freedom as a Weyl spinor, although it is written in the form of a Dirac spinor. From this definition it follows that a Majorana spinor is invariant under charge conjugation,
$$\Psi_M^c = \Psi_M. \tag{2.94}$$
Observe that, if we have a complex scalar field $\phi(x)$, we can impose on it a reality condition $\phi(x) = \phi^*(x)$, and this is a Lorentz-invariant condition: since $\phi$ and $\phi^*$ are both Lorentz invariant, if we impose $\phi = \phi^*$ in a frame, we will have $\phi = \phi^*$ in any other frame. The same is true for the four-vector representation, as we already discussed for the $(1/2, 1/2)$ representation. For a Dirac spinor $\Psi$ the situation is different; $\Psi$ is a complex field, and the condition $\Psi = \Psi^*$ is not Lorentz invariant, since the matrix $\Lambda_D$ in eq. (2.88) is not real. Therefore, if we impose the relation $\Psi = \Psi^*$ in a frame, it will not hold in general in another Lorentz frame. Instead, the condition (2.94) is by construction Lorentz invariant, since it is a consequence of the definition (2.93), which in turn expresses the Lorentz-invariant statement that $i\sigma^2 \psi_L^*$ is a right-handed spinor. Since $\Psi_M^c$ involves complex conjugation, see eq. (2.91), the condition (2.94) is a Lorentz-invariant relation between $\Psi$ and $\Psi^*$, and in this sense it is called a reality condition. So we can see Majorana fields as "real" Dirac fields, with respect to the only possible Lorentz-invariant reality condition, eq. (2.94). It is possible that Majorana spinors play an important role in the description of the neutrino. We will come to this issue later.
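The two statements $(\Psi^c)^c = \Psi$, eq. (2.92), and $\Psi_M^c = \Psi_M$, eq. (2.94), follow from $(i\sigma^2)^2 = -1$ and from the fact that $i\sigma^2$ is a real matrix. A minimal numerical sketch, treating the spinor components as ordinary commuting complex numbers as in the classical theory, is given below; the function names are hypothetical and the random components are arbitrary.

```python
import random

I_SIGMA2 = [[0, 1], [-1, 0]]  # i * sigma^2, a real matrix


def apply2(m, v):
    """Act with a 2x2 matrix on a two-component spinor."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def conj(v):
    return [z.conjugate() for z in v]


def charge_conjugate(psi):
    """Psi = (psi_L, psi_R) -> Psi^c = (-i s2 psi_R^*, i s2 psi_L^*), eq. (2.91)."""
    psi_l, psi_r = psi[:2], psi[2:]
    upper = [-z for z in apply2(I_SIGMA2, conj(psi_r))]
    lower = apply2(I_SIGMA2, conj(psi_l))
    return upper + lower


def majorana(psi_l):
    """Psi_M = (psi_L, i s2 psi_L^*), eq. (2.93)."""
    return list(psi_l) + apply2(I_SIGMA2, conj(psi_l))


def distance(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


rng = random.Random(1)
PSI = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(4)]
PSI_M = majorana(PSI[:2])

print(distance(charge_conjugate(charge_conjugate(PSI)), PSI))
print(distance(charge_conjugate(PSI_M), PSI_M))
```

A generic Dirac spinor is not invariant under a single charge conjugation, whereas the Majorana combination is, which is the sense in which it is "real".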
2.6.5 Vector fields
The definition of vector fields at this point is obvious. A (contravariant) vector field $V^\mu(x)$ is defined as a field that, under $x^\mu \to x'^\mu = \Lambda^\mu{}_\nu x^\nu$, transforms as
$$V^\mu(x) \to V'^\mu(x') = \Lambda^\mu{}_\nu V^\nu(x). \tag{2.95}$$
From the discussion in Section 2.4.1 we see that a general vector field has a spin-0 and a spin-1 component. An example of a vector field that will be important for us is the gauge field $A^\mu(x)$ in electromagnetism. We will see in Section 4.3.1 that $A^\mu(x)$ is subject to some conditions, stemming from gauge invariance, that eliminate the spin-0 component and the state with $(j = 1, j_z = 0)$, where $z$ is the propagation direction. Since a vector field belongs to the $(1/2, 1/2)$ representation, it has $j_- = j_+$ and therefore it is a basis for the representation of parity. A true vector transforms as $(V^0, \mathbf{V}) \to (V^0, -\mathbf{V})$ while a pseudovector (or axial vector) transforms as $(V^0, \mathbf{V}) \to (-V^0, \mathbf{V})$. Tensor fields are defined similarly.
2.7 The Poincaré group
Besides invariance under Lorentz transformations, we also require invariance under spacetime translations. A generic element of the translation group is written as
$$\exp\{-iP^\mu a_\mu\} \tag{2.96}$$
where $a^\mu$ are the parameters of the translation, $x^\mu \to x^\mu + a^\mu$, and the components of the four-momentum operator $P^\mu$ are the generators. Translations plus Lorentz transformations form a group, called the Poincaré group, or the inhomogeneous Lorentz group (it is sometimes denoted as ISO(3, 1), where "I" stands for inhomogeneous). Since the translations commute, we have
$$[P^\mu, P^\nu] = 0. \tag{2.97}$$
To find the commutator between $P^\mu$ and $J^{\rho\sigma}$ we can start from the commutators
$$[J^i, P^j] = i\epsilon^{ijk} P^k, \tag{2.98}$$
$$[J^i, P^0] = 0, \tag{2.99}$$
which express the facts that $P^i$ is a vector under rotations and that the energy is a scalar under rotations. The unique Lorentz-covariant generalization of eqs. (2.98) and (2.99) is
$$[P^\mu, J^{\rho\sigma}] = i(\eta^{\mu\rho} P^\sigma - \eta^{\mu\sigma} P^\rho). \tag{2.100}$$
Together with the Lorentz algebra (2.25), eqs. (2.97) and (2.100) define the Poincaré algebra. In terms of $J^i$, $K^i$, $P^0 = H$ and $P^i$ it reads
$$[J^i, J^j] = i\epsilon^{ijk} J^k, \qquad [J^i, K^j] = i\epsilon^{ijk} K^k, \qquad [J^i, P^j] = i\epsilon^{ijk} P^k, \tag{2.101}$$
$$[K^i, K^j] = -i\epsilon^{ijk} J^k, \qquad [P^i, P^j] = 0, \qquad [K^i, P^j] = iH\delta^{ij}, \tag{2.102}$$
$$[J^i, H] = 0, \qquad [P^i, H] = 0, \qquad [K^i, H] = iP^i. \tag{2.103}$$
Equations (2.101) express the fact that the J i generate spatial rotations and K i , P i are vectors under rotations. Equations (2.103) state that J i and P i commute with the generator of time translations and therefore are conserved quantities; the K i instead are not conserved, and this is the reason why the eigenvalues of K are not used for labeling physical states.
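The entries of eqs. (2.101) to (2.103) that involve $P^\mu$ can be recovered from the covariant form (2.100) by choosing index values. The short derivation below assumes the identifications $K^i = J^{i0}$ and $J^i = \tfrac{1}{2}\epsilon^{ijk} J^{jk}$ introduced earlier in the chapter, together with $\eta^{\mu\nu} = \mathrm{diag}(+,-,-,-)$; it is a consistency sketch rather than a new result.

```latex
\begin{align}
[P^{j}, J^{i0}] &= i\left( \eta^{ji} P^{0} - \eta^{j0} P^{i} \right) = -i\,\delta^{ij} H
  &&\Longrightarrow\quad [K^{i}, P^{j}] = i H \delta^{ij}, \\
[P^{0}, J^{i0}] &= i\left( \eta^{0i} P^{0} - \eta^{00} P^{i} \right) = -i\,P^{i}
  &&\Longrightarrow\quad [K^{i}, H] = i P^{i}, \\
[P^{l}, J^{jk}] &= i\left( \eta^{lj} P^{k} - \eta^{lk} P^{j} \right)
                 = -i\left( \delta^{lj} P^{k} - \delta^{lk} P^{j} \right)
  &&\Longrightarrow\quad [J^{i}, P^{l}] = i\,\epsilon^{ilk} P^{k}, \\
[P^{0}, J^{jk}] &= i\left( \eta^{0j} P^{k} - \eta^{0k} P^{j} \right) = 0
  &&\Longrightarrow\quad [J^{i}, H] = 0 .
\end{align}
```

In the third line the contraction with $\tfrac{1}{2}\epsilon^{ijk}$ gives $[P^{l}, J^{i}] = -i\,\epsilon^{ilk} P^{k}$, and exchanging the order of the commutator produces the sign shown.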
2.7.1 Representation on fields
We saw in Section 2.6 that fields provide an infinite-dimensional representation of the Lorentz group, and that on fields the generators $J^{\mu\nu}$ are represented as
$$J^{\mu\nu} = L^{\mu\nu} + S^{\mu\nu} \tag{2.104}$$
where
$$L^{\mu\nu} = i(x^\mu \partial^\nu - x^\nu \partial^\mu) \tag{2.105}$$
and $S^{\mu\nu}$ depends on the spin of the field in question, but not on the coordinates $x^\mu$. To obtain a representation of the full Poincaré group on fields, we must now find how to represent the four-momentum operator $P^\mu$, i.e. we have to specify the transformation law of fields under translations. We require that all fields, independently of their transformation property under the Lorentz group, behave as scalars under spacetime translation. Let us label by $\phi$ a generic field, either a Lorentz scalar field, or a component of a spinor field $\xi_\alpha$ with $\alpha$ given, or a given component $V^\mu$ of a vector field, etc. Then, under a translation $x \to x' = x + a$, all fields, independently of their Lorentz properties, transform as
$$\phi'(x') = \phi(x). \tag{2.106}$$
Under an infinitesimal translation $x^\mu \to x'^\mu = x^\mu + \epsilon^\mu$ we have, to first order in $\epsilon$,
$$\delta_0 \phi \equiv \phi'(x) - \phi(x) = \phi'(x' - \epsilon) - \phi(x) = -\epsilon^\mu \partial_\mu \phi(x). \tag{2.107}$$
On the other hand, from the form (2.96) of the translation operator, it follows that
$$\phi'(x' - \epsilon) = e^{-iP^\mu (-\epsilon_\mu)} \phi'(x') = e^{i\epsilon_\mu P^\mu} \phi(x) \tag{2.108}$$
and therefore to first order in $\epsilon$
$$\delta_0 \phi = i\epsilon_\mu P^\mu \phi(x). \tag{2.109}$$
Comparing eqs. (2.107) and (2.109) we see that the momentum operator is represented as
$$P^\mu = +i\partial^\mu. \tag{2.110}$$
Therefore
$$H = i\frac{\partial}{\partial x^0} = i\frac{\partial}{\partial t}, \qquad P^i = i\partial^i = -i\partial_i = -i\frac{\partial}{\partial x^i}. \tag{2.111}$$
The explicit form of $J^{\mu\nu}$ and of $P^\mu$ has been found requiring that the fields have well-defined transformation properties under the Poincaré group; therefore these explicit expressions must automatically satisfy the Poincaré algebra. We can check this easily observing that $S^{\mu\nu}$ does not depend on the coordinates and therefore commutes with $\partial^\mu$, while $[\partial^\mu, x^\nu] = \eta^{\mu\nu}$. Therefore
$$[P^\mu, J^{\rho\sigma}] = [i\partial^\mu, i(x^\rho \partial^\sigma - x^\sigma \partial^\rho)] = -\eta^{\mu\rho} \partial^\sigma + \eta^{\mu\sigma} \partial^\rho = i(\eta^{\mu\rho} P^\sigma - \eta^{\mu\sigma} P^\rho), \tag{2.112}$$
in agreement with eq. (2.100). The commutator $[P^\mu, P^\nu] = 0$ is also satisfied by $P^\mu = i\partial^\mu$ and we already know that the commutator $[J^{\mu\nu}, J^{\rho\sigma}]$ is correctly reproduced, so the full Poincaré algebra is obeyed.
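With $P^\mu = i\partial^\mu$ the finite translation operator $e^{i a_\mu P^\mu}$ becomes $e^{-a^\mu \partial_\mu}$, i.e. a Taylor series that shifts the argument of the field. The sketch below illustrates this in a single coordinate, under the assumption that the field is a polynomial so that the exponential series terminates and the comparison is exact; the polynomial, the shift `a` and all function names are hypothetical.

```python
from math import factorial


def derivative(coeffs):
    """Derivative of the polynomial c0 + c1 x + c2 x^2 + ... given as [c0, c1, ...]."""
    return [n * c for n, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    return sum(c * x ** n for n, c in enumerate(coeffs))


def translate(coeffs, a, x):
    """Apply exp(-a d/dx) to the polynomial and evaluate the result at x."""
    total, term, n = 0.0, list(coeffs), 0
    while term:  # the series stops once all derivatives vanish
        total += (-a) ** n / factorial(n) * evaluate(term, x)
        term = derivative(term)
        n += 1
    return total


PHI = [1.0, -2.0, 0.0, 3.0]   # phi(x) = 1 - 2x + 3x^3
a, x = 0.4, 1.5

shifted_by_operator = translate(PHI, a, x)   # (e^{-a d/dx} phi)(x)
shifted_directly = evaluate(PHI, x - a)      # phi'(x) = phi(x - a)

print(shifted_by_operator, shifted_directly)
```

The agreement of the two values is the finite version of eqs. (2.107) to (2.109): under $x \to x' = x + a$ with $\phi'(x') = \phi(x)$, the transformed field at fixed coordinate is $\phi'(x) = \phi(x - a)$.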
2.7.2 Representation on one-particle states
The representation of the Poincaré group on fields allows us to construct Poincaré-invariant Lagrangians, as we will study in the next chapter. At the classical level, a Lagrangian description is all that we need in order to specify the dynamics of the system. At the quantum level, however, one of our aims will be to understand how the concept of particle emerges from field quantization. It is therefore useful to see how the Poincaré group can be represented using as a basis the Hilbert space of a free particle. We will denote the states of a free particle with momentum $\mathbf{p}$ as $|\mathbf{p}, s\rangle$, where $s$ labels collectively all other quantum numbers. Since $\mathbf{p}$ is a continuous and unbounded variable, this base space is infinite-dimensional. A theorem by Wigner (see Weinberg (1995), Chapter 2) states that on this Hilbert space any symmetry transformation can be represented by a unitary operator (footnote 10). Therefore in this base space a Poincaré transformation is represented by a unitary matrix, and the generators $J^i$, $K^i$, $P^i$ and $H$ by hermitian operators. The representations are labeled by the Casimir operators. One is easily found, and is $P_\mu P^\mu$. On a one-particle state it has the value $m^2$, where $m$ is the mass of the particle. Using the commutation relations of the Poincaré group one can verify that there is a second Casimir operator given by $W_\mu W^\mu$, where
$$W^\mu = -\frac{1}{2}\, \epsilon^{\mu\nu\rho\sigma} J_{\nu\rho} P_\sigma \tag{2.113}$$
is called the Pauli–Lubanski four-vector. To prove that $W_\mu W^\mu$ is a Casimir operator is straightforward. First of all, $W^\mu$ is clearly a four-vector, so $W_\mu W^\mu$ is Lorentz-invariant and therefore commutes with $J^{\mu\nu}$. From the explicit form it also follows that
$$[W^\mu, P^\nu] = 0, \tag{2.114}$$
Footnote 10: Actually there is also the possibility of an antiunitary operator; the only symmetry transformation where this happens is time-reversal, and we postpone the definition of antiunitary operators to Chapter 4.
(using eq. (2.100) and the antisymmetry of $\epsilon^{\mu\nu\rho\sigma}$), and then $W_\mu W^\mu$ commutes also with $P^\nu$. Since $W_\mu W^\mu$ is Lorentz-invariant, we can compute it in the frame that we prefer. If $m \neq 0$, it is convenient to choose the rest frame of the particle; in this frame $W^\mu = (-m/2)\epsilon^{\mu\nu\rho 0} J_{\nu\rho} = (m/2)\epsilon^{0\mu\nu\rho} J_{\nu\rho}$, so $W^0 = 0$ while
$$W^i = \frac{m}{2}\, \epsilon^{0ijk} J^{jk} = \frac{m}{2}\, \epsilon^{ijk} J^{jk} = mJ^i. \tag{2.115}$$
Therefore on a one-particle state with mass $m$ and spin $j$ we have
$$-W_\mu W^\mu = m^2 j(j + 1), \qquad (m \neq 0). \tag{2.116}$$
If instead $m = 0$ the rest frame does not exist, but we can choose a frame where $P^\mu = (\omega, 0, 0, \omega)$; in this frame a straightforward computation gives $W^0 = W^3 = \omega J^3$, $W^1 = \omega(J^1 - K^2)$ and $W^2 = \omega(J^2 + K^1)$. Therefore
$$-W_\mu W^\mu = \omega^2 [(K^2 - J^1)^2 + (K^1 + J^2)^2], \qquad (m = 0). \tag{2.117}$$
Comparing eqs. (2.116) and (2.117) we see that the limit $m \to 0$ is quite subtle, and we must study separately the massive and massless representations.

Massive representations: In this case on the one-particle states we have $P^\mu P_\mu = m^2$ while $W_\mu W^\mu = -m^2 j(j + 1)$. We will restrict to $m$ real (footnote 11) and positive. Therefore the representations are labeled by the mass $m$ and by the spin $j$. We can understand this better observing that, if $m \neq 0$, with a Lorentz transformation we can bring $P^\mu$ into the form $P^\mu = (m, 0, 0, 0)$. This choice of $P^\mu$ still leaves us with the freedom of performing spatial rotations. In other words, the space of one-particle states with momentum $P^\mu = (m, 0, 0, 0)$ is still a basis for the representation of spatial rotations. The group of transformations which leaves invariant a given choice of $P^\mu$ is called the little group. In this case, since we want to include spinor representations, the little group is SU(2). The massive representations are therefore labeled by the mass $m$ and by the spin $j = 0, 1/2, 1, \ldots$, and states within each representation are labeled by $j_z = -j, -j + 1, \ldots, j$. This means that massive particles of spin $j$ have $2j + 1$ degrees of freedom.

Footnote 11: In principle there is also the possibility of representations with $m^2 < 0$, known as tachyons. In field theory the emergence of a tachyonic mode is the signal of an instability, and reflects the fact that we have expanded around the wrong vacuum, e.g. around a maximum rather than a minimum of a potential.

Massless representations: When $P^2 = 0$ the rest frame does not exist, but we can reduce $P^\mu$ to the form $P^\mu = (\omega, 0, 0, \omega)$. The little group is the set of Poincaré transformations that leaves this vector unchanged. One sees immediately that the rotations in the $(x, y)$ plane leave this $P^\mu$ invariant; this is an SO(2) group, generated by $J^3$. The part that follows is more technical and can be omitted at a first reading. Just assume that the little group is SO(2) and skip the derivation, which ends with the statement that on the states with $a = b = 0$ the little group is simply SO(2).
Furthermore there are two less evident Lorentz transformations that do not change $P^\mu$; to find the most general solution, it is sufficient to restrict to infinitesimal Lorentz transformations $\Lambda^\mu{}_\nu = \delta^\mu_\nu + \omega^\mu{}_\nu$, and to look for the most general matrix $\omega^{\mu\nu}$ which satisfies $\omega^{\mu\nu} = -\omega^{\nu\mu}$ (in order to have a Lorentz transformation) and
$$\omega^{\mu\nu} P_\nu = 0, \tag{2.118}$$
for $P_\nu = (\omega, 0, 0, -\omega)$. Therefore
$$\begin{pmatrix} 0 & \omega^{01} & \omega^{02} & \omega^{03} \\ -\omega^{01} & 0 & \omega^{12} & \omega^{13} \\ -\omega^{02} & -\omega^{12} & 0 & \omega^{23} \\ -\omega^{03} & -\omega^{13} & -\omega^{23} & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 0 \\ -1 \end{pmatrix} = 0, \tag{2.119}$$
which gives $\omega^{03} = 0$, $\omega^{01} + \omega^{13} = 0$ and $\omega^{02} + \omega^{23} = 0$. Denoting $\omega^{01} = \alpha$, $\omega^{02} = \beta$ and $\omega^{12} = \theta$ we see that the most general Lorentz transformation that leaves $P^\mu$ invariant can be written as
$$\Lambda = e^{-i(\alpha A + \beta B + \theta C)} \tag{2.120}$$
where (lowering the second Lorentz index)
$$A^\mu{}_\nu = i \begin{pmatrix} 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}, \qquad B^\mu{}_\nu = i \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix}, \tag{2.121}$$
$$C^\mu{}_\nu = i \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \tag{2.122}$$
Comparison with eq. (2.23) shows that $C^\mu{}_\nu$ is nothing but $(J^3)^\mu{}_\nu$, i.e. the explicit expression of $J^3$ as a $4 \times 4$ matrix in the four-vector representation. Similarly we find that $A^\mu{}_\nu = (K^1 + J^2)^\mu{}_\nu$ and $B^\mu{}_\nu = (K^2 - J^1)^\mu{}_\nu$. These are just the combinations that appear in eq. (2.117), so in the massless case
$$-W_\mu W^\mu = \omega^2 (A^2 + B^2). \tag{2.123}$$
Using the commutation rules of the Lorentz group, or directly the explicit expressions given above, one finds that the operators $J^3$, $A$ and $B$ close an algebra:
$$[J^3, A] = +iB, \qquad [J^3, B] = -iA, \qquad [A, B] = 0. \tag{2.124}$$
Formally this is the same algebra generated by the operators $p^x$, $p^y$ and $L^z = x p^y - y p^x$, which describe the translations and rotations of a Euclidean plane, with $A$ and $B$ playing the role of the translation operators. This algebra is denoted by ISO(2). The matrices $A^\mu{}_\nu$ and $B^\mu{}_\nu$ given in eq. (2.121) are not hermitian (footnote 12). This is as it should be, since they are $4 \times 4$ matrices, and therefore are a finite-dimensional representation of non-compact Lorentz generators.

Footnote 12: They would be hermitian if we write them as $A^{\mu\nu}$, $B^{\mu\nu}$ and $C^{\mu\nu}$. However, $\delta x^\rho$ is proportional to $\omega_{\mu\nu} (J^{\mu\nu})^\rho{}_\sigma x^\sigma$, so the representation is provided by the matrices with one upper and one lower index, and it is for these matrices that the algebra (2.124) holds.
We can however represent the algebra (2.124) taking as the base space the one-particle states with momentum $\mathbf{p}$. In this representation $A$ and $B$ are hermitian operators because of Wigner's theorem and, since they are commuting, they can be diagonalized simultaneously. We denote by $a$, $b$ the respective eigenvalues. Then
$$A|\mathbf{p}; a, b\rangle = a|\mathbf{p}; a, b\rangle, \qquad B|\mathbf{p}; a, b\rangle = b|\mathbf{p}; a, b\rangle. \tag{2.125}$$
However, if $a$ and $b$ are non-zero, we can now find a continuous set of eigenvalues! Consider in fact the state
$$|\mathbf{p}; a, b, \theta\rangle \equiv e^{-i\theta J^3} |\mathbf{p}; a, b\rangle, \tag{2.126}$$
with $\theta$ an arbitrary angle. We have
$$A\, e^{-i\theta J^3} |\mathbf{p}; a, b\rangle = e^{-i\theta J^3} \left( e^{i\theta J^3} A\, e^{-i\theta J^3} \right) |\mathbf{p}; a, b\rangle. \tag{2.127}$$
Using the commutation rules (2.124) we find that
$$e^{i\theta J^3} A\, e^{-i\theta J^3} = A \cos\theta - B \sin\theta \tag{2.128}$$
(this can be proved expanding the exponentials in power series) and therefore
$$A|\mathbf{p}; a, b, \theta\rangle = (a \cos\theta - b \sin\theta)|\mathbf{p}; a, b, \theta\rangle, \tag{2.129}$$
and similarly
$$B|\mathbf{p}; a, b, \theta\rangle = (a \sin\theta + b \cos\theta)|\mathbf{p}; a, b, \theta\rangle. \tag{2.130}$$
This means that, unless $a = b = 0$, we find representations corresponding to massless particles with a continuous internal degree of freedom $\theta$. These representations do not so far find physical applications, and we therefore restrict to states with $a = b = 0$. Since for massless particles we found $-W_\mu W^\mu = \omega^2 (A^2 + B^2)$, on these states (and only on these states) we have $-W_\mu W^\mu = 0$, which agrees with the $m \to 0$ limit of eq. (2.116). On the states with $a = b = 0$ the little group is simply SO(2) or, equivalently, U(1).
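As an alternative to the power-series expansion mentioned for eq. (2.128), the rotated operators can be obtained from a pair of first-order differential equations in $\theta$. The derivation below is a sketch that uses only the algebra (2.124); the symbols $A(\theta)$ and $B(\theta)$ are introduced here for illustration.

```latex
\begin{align}
A(\theta) &\equiv e^{i\theta J^{3}} A\, e^{-i\theta J^{3}}, \qquad
B(\theta) \equiv e^{i\theta J^{3}} B\, e^{-i\theta J^{3}}, \\
\frac{dA}{d\theta} &= i\, e^{i\theta J^{3}} [J^{3}, A]\, e^{-i\theta J^{3}} = -B(\theta), \qquad
\frac{dB}{d\theta} = i\, e^{i\theta J^{3}} [J^{3}, B]\, e^{-i\theta J^{3}} = +A(\theta), \\
A(0) &= A, \quad B(0) = B
\quad\Longrightarrow\quad
A(\theta) = A\cos\theta - B\sin\theta, \qquad
B(\theta) = A\sin\theta + B\cos\theta .
\end{align}
```

Acting on $|\mathbf{p}; a, b\rangle$, these two combinations give the eigenvalues $a\cos\theta - b\sin\theta$ and $a\sin\theta + b\cos\theta$, i.e. the pair $(a, b)$ is rotated by the angle $\theta$.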
As for any abelian group, the irreducible representations of SO(2) are one-dimensional. The generator of the group SO(2) of rotations in the $(x, y)$ plane is the angular momentum $J^3$ and therefore the one-dimensional representations are labeled by the eigenvalue $h$ of $J^3$; it represents the angular momentum in the direction of propagation of the particle (in this case, the $z$ axis), and is called the helicity. It can be shown that $h$ is quantized, $h = 0, \pm 1/2, \pm 1, \ldots$. Actually, there is a subtle technical point in the quantization of $h$: the elementary proof that, for SU(2), $j_z$ is quantized is of an algebraic nature. One defines $\lambda_m = \langle j, m+1|(J_x + iJ_y)|j, m\rangle$ and, using the commutation relations between the three $J_i$, one finds a recursion relation $|\lambda_{m-1}|^2 - |\lambda_m|^2 = 2m$. The condition that this recursion relation does not produce a negative $|\lambda_m|^2$ provides the quantization of $m = j_z$ (footnote 13). In the case of the little group of massless particles we do not have $J_x$, $J_y$ at our disposal, but only the single SO(2) generator $J_z$ and therefore this algebraic proof does not go through. There is however a topological proof, based on the fact that the universal covering of the Lorentz group is SL(2, C); this is a double covering, therefore any Lorentz rotation by $4\pi$ is the same as the identity matrix. A detailed discussion can be found in Weinberg (1995), pages 86–90. This analysis shows that massless particles have only one degree of freedom, and are characterized by the value $h$ of their helicity. On a state of helicity $h$, a U(1) rotation of the little group is represented by
$$U(\theta) = \exp\{-ih\theta\}. \tag{2.131}$$
Footnote 13: See any book on quantum mechanics, e.g. L. Schiff, Quantum Mechanics, third edition, McGraw-Hill, New York 1968, eq. (27.23).
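Both ingredients of the quantization argument can be illustrated with a few lines of arithmetic. The sketch below assumes the standard result $|\lambda_m|^2 = j(j+1) - m(m+1)$ for the SU(2) ladder coefficient (in units $\hbar = 1$), which is not derived in the text, and checks the recursion $|\lambda_{m-1}|^2 - |\lambda_m|^2 = 2m$; it then evaluates the little-group phase (2.131) for rotations by $2\pi$ and $4\pi$. Function names are hypothetical.

```python
import cmath
import math
from fractions import Fraction


def lam_sq(j, m):
    """|lambda_m|^2 for the raising operator J_x + i J_y (assumed standard result)."""
    return j * (j + 1) - m * (m + 1)


def recursion_holds(j):
    """Check |lambda_{m-1}|^2 - |lambda_m|^2 = 2m for m = -j+1, ..., j."""
    m = -j + 1
    while m <= j:
        if lam_sq(j, m - 1) - lam_sq(j, m) != 2 * m:
            return False
        m += 1
    return True


def little_group_phase(h, theta):
    """U(theta) = exp(-i h theta), eq. (2.131)."""
    return cmath.exp(-1j * float(h) * theta)


half = Fraction(1, 2)
print(recursion_holds(Fraction(3, 2)), lam_sq(Fraction(3, 2), Fraction(3, 2)))
print(little_group_phase(half, 2 * math.pi), little_group_phase(half, 4 * math.pi))
```

For half-integer $h$ the phase is $-1$ after a rotation by $2\pi$ and returns to $+1$ only after $4\pi$, which is compatible with the double covering by SL(2, C); any $h$ that is not a multiple of $1/2$ would fail the $4\pi$ condition.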
From the point of view of the representations of the Poincaré group, a massless particle with helicity $+h$ and a massless particle with helicity $-h$ are logically two different species of particles, since they belong to two different representations of the Poincaré group. However, the helicity is the projection of the angular momentum along the direction of motion, so it can be written as
$$h = \hat{\mathbf{p}} \cdot \mathbf{J} \tag{2.132}$$
where $\hat{\mathbf{p}}$ is the unit vector in the direction of propagation. We see from eq. (2.132) that the helicity is a pseudoscalar, i.e. it changes sign under parity. If the interaction conserves parity, to each particle of helicity $h$ there must correspond another particle with helicity $-h$, and these two helicity states must enter into the theory in a symmetric way. Since the electromagnetic interaction conserves parity, it is more natural to define the photon as a representation of the Poincaré group and of parity, i.e. to assemble together the two states of helicity $h = \pm 1$. The two states $h = \pm 1$ are then referred to as left-handed ($h = -1$) and right-handed ($h = +1$) photons. Similarly the two states with helicity $h = \pm 2$ that mediate the gravitational interaction are better considered as two polarization states of the same particle, the graviton:

Photon: $m^2 = 0$, two polarization states $h = \pm 1$.
Graviton: $m^2 = 0$, two polarization states $h = \pm 2$.

On the contrary, neutrinos have only weak interactions (apart from the much smaller gravitational interaction), which do not conserve parity, and the two states with helicity $h = \pm 1/2$ are given different names: neutrino is reserved for $h = -1/2$, and antineutrino for $h = +1/2$.
Summary of chapter
In this chapter we have introduced a number of mathematical tools that will greatly simplify our construction of classical and quantum field theories in the next chapters. We recall some important points.
• Lie groups, Lie algebras and their representations have been discussed in Section 2.1. They are central concepts in modern theoretical physics, independently of our applications to the Lorentz and Poincaré groups. Basically, Lie groups are the correct language for describing continuous symmetries.
• The Lorentz group is generated by rotations and boosts, and its algebra is given in eqs. (2.27)-(2.29). We have discussed its tensor representations in Section 2.4 and its spinorial representations in Section 2.5. This leads in particular to the introduction of Weyl spinors, eq. (2.56); Dirac spinors are obtained assembling a left-handed and a right-handed Weyl spinor, and are a representation of Lorentz and of parity transformations.
• Fields are functions of the coordinates with well-defined transformation properties under Poincaré transformations. Depending on their transformation properties under the Lorentz group, we have scalar fields, Weyl fields, Dirac fields, vector fields, etc.
• The study of the representations of the Poincaré group using as base space the Hilbert space of one-particle states leads to massive particles, characterized by the spin $j$ and having $2j + 1$ degrees of freedom, and massless particles, which have one degree of freedom and a definite helicity $h$. For the photon and for the graviton, parity considerations suggest assembling the two states with helicity $h = \pm 1$ (for the photon) and $h = \pm 2$ (for the graviton) into a single particle.
Further reading
• For Weyl and Dirac spinors see Ramond (1990), Chapter 1 and Peskin and Schroeder (1995), Chapter 3. Observe that our definitions of $\psi_L$ and $\psi_R$ are inverted compared to Ramond (but agree with Peskin and Schroeder). In particular, for us the boost generator on $\psi_L$ is $+i\sigma/2$ while for Ramond it is $-i\sigma/2$, and as a consequence for us the four-vector made with left-handed spinors is $\xi_L^\dagger \bar\sigma^\mu \psi_L$, see eq. (2.69), while for Ramond it is $\xi_L^\dagger \sigma^\mu \psi_L$ (the fact that we both say that $\psi_L$ belongs to the $(1/2, 0)$ representation is due to the fact that we write $(j_-, j_+)$ while Ramond writes $(j_+, j_-)$). In the next chapter we will see that with our definition $\psi_L$ has helicity $-1/2$ (and therefore with the definition of Ramond it has $h = +1/2$).
• A very clear book on Lie groups for physicists is Georgi (1999). The second edition contains many improvements of the already 'classical' first edition. For a more geometrical approach to Lie groups, see, e.g., Nakahara (1990), Section 5.6. An advanced book is J. Fuchs and C. Schweigert, Symmetries, Lie Algebras and Representations, Cambridge University Press 1997.
• For the representations of the Poincaré group see Sections 2.4 and 2.5 of Weinberg (1995).
Exercises
(2.1) Consider a massive particle moving with velocity $v = \tanh\eta$. Show that, if $E$ is the energy of the particle and $p$ its momentum along the propagation direction, then
$$\eta = \frac{1}{2} \log\frac{E + p}{E - p}, \tag{2.133}$$
and verify that, under a boost in the direction of motion with velocity $v'$, $\eta$ transforms additively: $\eta \to \eta + \mathrm{arctanh}(v')$. Therefore $d\eta$ is invariant under longitudinal boosts.
(2.2) Show that a totally symmetric and traceless spatial tensor $T^{i_1 \ldots i_N}$ with $N$ spatial indices has angular momentum $j = N$. Discuss some physically interesting examples.
(2.3) Prove that, if $\psi_R$ and $\xi_R$ are right-handed Weyl spinors, $\xi_R^\dagger \sigma^\mu \psi_R$ is a four-vector, and similarly for $\xi_L^\dagger \bar\sigma^\mu \psi_L$, where $\xi_L$ and $\psi_L$ are left-handed Weyl spinors.
(2.4) Find the explicit form of the variation of an antisymmetric tensor $F^{\mu\nu}$ under an infinitesimal Lorentz transformation. Writing $F^{0i} = -E^i$ and $F^{ij} = -\epsilon^{ijk} B^k$, find the infinitesimal transformation of $E^i$ and $B^i$.
(2.5) Consider an antisymmetric tensor $A^{\mu\nu}$. (i) Prove that, with Minkowski signature $\eta^{\mu\nu}$, if we try to impose the condition $A^{\mu\nu} = (1/2)\epsilon^{\mu\nu\rho\sigma} A_{\rho\sigma}$, or the condition $A^{\mu\nu} = -(1/2)\epsilon^{\mu\nu\rho\sigma} A_{\rho\sigma}$, then the only solution is $A^{\mu\nu} = 0$. (ii) Repeat the same exercise in a Euclidean space with metric $\delta^{\mu\nu}$. A Euclidean tensor is called self-dual if $A^{\mu\nu} = (1/2)\epsilon^{\mu\nu\rho\sigma} A_{\rho\sigma}$, or anti-self-dual if $A^{\mu\nu} = -(1/2)\epsilon^{\mu\nu\rho\sigma} A_{\rho\sigma}$. Verify that self-dual and anti-self-dual tensors are irreducible representations of (real) dimension three of the Euclidean group SO(4), and verify that the six-dimensional representation $A^{\mu\nu}$ of SO(4) decomposes into its self-dual and anti-self-dual parts. (iii) With Minkowski signature, an antisymmetric tensor $A^{\mu\nu}$ with complex components is called self-dual if $A^{\mu\nu} = (i/2)\epsilon^{\mu\nu\rho\sigma} A_{\rho\sigma}$ and anti-self-dual if $A^{\mu\nu} = -(i/2)\epsilon^{\mu\nu\rho\sigma} A_{\rho\sigma}$. Prove that the self-dual and anti-self-dual parts are irreducible representations of the Lorentz group of complex dimension three (i.e. real dimension six) and identify these representations with the appropriate $(j_-, j_+)$ representations. (iv) Write the Maxwell tensor $F^{\mu\nu}$ as a sum of a self-dual and anti-self-dual part. Realize that a real antisymmetric tensor, such as $F^{\mu\nu}$, is an irreducible representation of SO(3, 1).
(2.6) (i) A classical electromagnetic wave propagating in the direction $\hat{\mathbf{n}} = (0, 0, 1)$ is described by the linear polarization vectors $\mathbf{e}_1 = (1, 0, 0)$ and $\mathbf{e}_2 = (0, 1, 0)$. Define the circular polarizations as $\mathbf{e}_\pm = \mathbf{e}_1 \pm i\mathbf{e}_2$. Compute the transformation of $\mathbf{e}_\pm$ under a rotation in the $(x, y)$ plane and conclude that electromagnetic waves are made of massless spin-1 particles. (ii) A classical gravitational wave propagating in the direction $\hat{\mathbf{n}}$ is described by a polarization tensor $h_{ij}$ symmetric, traceless, and transverse to the propagation direction, $\hat{n}^i h_{ij} = 0$, i.e. (setting $\hat{\mathbf{n}} = \hat{\mathbf{z}}$) by a matrix of the general form
$$h_{ij} = \begin{pmatrix} h_+ & h_\times & 0 \\ h_\times & -h_+ & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{2.134}$$
and $h_{+,\times}$ are called the plus and cross polarizations, respectively. Compute the transformation properties of $h_{+,\times}$ under a rotation in the $(x, y)$ plane, and the transformation properties of the circular polarization tensors $h_\times \pm i h_+$. Conclude that gravitational waves are made of massless spin-2 particles.
3 Classical field theory

In the previous chapter we defined our basic objects, the fields. We now introduce their dynamics, first of all at the classical level. We will discuss the Lagrangian and the Hamiltonian formalism and we will present the Noether theorem, which provides the relation between symmetries and conservation laws. We will see that at the classical level the fields obey relativistic wave equations, such as the Klein–Gordon, Dirac, and Maxwell equations. Finally, even if it is a subject logically distinct from classical field theory, we will include in this chapter a discussion of the first quantization of the relativistic wave equations and we will see that, despite the intrinsic limitations of the method, this can nevertheless be useful to compute the lowest-order relativistic corrections to the Schrödinger equation. In the Solved Problems section we will present some explicit computations using first quantized relativistic wave equations, and in particular we will compute the fine structure of the hydrogen atom using the Dirac equation.
3.1 The action principle
We first briefly recall the basic principles of classical mechanics in the Lagrangian formalism. A classical system with $N$ degrees of freedom is described by a set of coordinates $q_i(t)$, with $i = 1, \ldots, N$, which we often denote collectively simply as $q$. The Lagrangian $L$ is a function of the $q_i$'s and of their first time derivatives $\dot q_i$, i.e. $L = L(q, \dot q)$; in the simplest case it is given by a kinetic term minus a potential term, $L(q, \dot q) = \sum_i (m_i/2)\,\dot q_i^2 - V(q)$. The action $S$ is
$$ S = \int dt\, L(q, \dot q) . \tag{3.1} $$
The action principle states that, if we fix the values of the coordinates at the initial time $t_{\rm in}$ and at the final time $t_f$, so that $q(t_{\rm in}) = q_{\rm in}$, $q(t_f) = q_f$, then the classical trajectory which satisfies these boundary conditions is an extremum of the action,
$$ \delta \int_{t_{\rm in}}^{t_f} dt\, L(q, \dot q) = 0 . \tag{3.2} $$
The variation is performed holding fixed the boundary conditions, i.e. one must find the function $q(t)$ which extremizes the action, within the space of functions that satisfy $q(t_{\rm in}) = q_{\rm in}$, $q(t_f) = q_f$. The variation of the Lagrangian is
$$ \delta L = \sum_i \left[ \frac{\partial L}{\partial q_i}\,\delta q_i + \frac{\partial L}{\partial \dot q_i}\,\delta \dot q_i \right] . \tag{3.3} $$
$\delta q_i$, $\delta \dot q_i$ are variations in the space of functions of $t$ (with the given boundary conditions), and therefore in an infinite-dimensional space. The time derivative commutes$^{1}$ with the variation operator $\delta$,
$$ \delta \dot q_i = \frac{\partial}{\partial t}\,\delta q_i . \tag{3.4} $$
The variation of the action then becomes
$$ \delta S = \int_{t_{\rm in}}^{t_f} dt \sum_i \left[ \frac{\partial L}{\partial q_i}\,\delta q_i + \frac{\partial L}{\partial \dot q_i}\,\frac{\partial}{\partial t}\,\delta q_i \right] = 0 . \tag{3.5} $$

$^{1}$ This can be understood by discretizing this functional space (for instance taking $t$ discrete rather than continuous), so that it becomes a finite-dimensional space, described by a finite number of parameters that we denote collectively by $\alpha$. Then $\delta q_i = (\partial q_i/\partial\alpha)\,\delta\alpha$ becomes an ordinary variation of a function of a finite number of variables $\alpha$ and of time, and similarly $\delta \dot q_i = (\partial \dot q_i/\partial\alpha)\,\delta\alpha$; on a sufficiently regular function $q(\alpha, t)$ we can interchange the derivatives with respect to $t$ and to $\alpha$, $\delta \dot q_i = \frac{\partial}{\partial\alpha}\left(\frac{\partial q_i}{\partial t}\right)\delta\alpha = \frac{\partial}{\partial t}\left(\frac{\partial q_i}{\partial\alpha}\right)\delta\alpha = \frac{\partial}{\partial t}\,\delta q_i$.
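Integrating the second term by parts, the boundary term does not contribute because the boundary conditions are held fixed: $\delta q_i(t_{\rm in}) = \delta q_i(t_f) = 0$. Therefore we get
$$ \int_{t_{\rm in}}^{t_f} dt \sum_i \left[ \frac{\partial L}{\partial q_i} - \frac{\partial}{\partial t}\frac{\partial L}{\partial \dot q_i} \right] \delta q_i = 0 . \tag{3.6} $$
This must be true for any functional form of the variations $\delta q_i$ (we are considering systems which are not subject to constraints, therefore all $q_i$ are independent), so we obtain the Euler–Lagrange equations,
$$ \frac{\partial L}{\partial q_i} - \frac{\partial}{\partial t}\frac{\partial L}{\partial \dot q_i} = 0 , \tag{3.7} $$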
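The statement that a solution of eq. (3.7) makes the action stationary under variations with fixed endpoints can be checked numerically. The following is only a minimal sketch under assumptions that are not in the text: a single harmonic oscillator with $m = \omega = 1$, a midpoint discretization of the action, and hypothetical helper names (`action`, `first_variation`). It compares the first-order variation of $S$ around the classical path with that around a straight line joining the same endpoints.

```python
import math

def action(path, dt, m=1.0, omega=1.0):
    """Discretized S = sum dt [ m/2 v^2 - m omega^2/2 q^2 ] (midpoint rule)."""
    s = 0.0
    for k in range(len(path) - 1):
        v = (path[k + 1] - path[k]) / dt          # velocity on the k-th interval
        q_mid = 0.5 * (path[k + 1] + path[k])     # coordinate at the midpoint
        s += dt * (0.5 * m * v * v - 0.5 * m * omega**2 * q_mid**2)
    return s

def first_variation(path, eta, dt, eps=1e-4):
    """Central-difference estimate of dS/d(eps) for q -> q + eps * eta."""
    plus = [q + eps * e for q, e in zip(path, eta)]
    minus = [q - eps * e for q, e in zip(path, eta)]
    return (action(plus, dt) - action(minus, dt)) / (2 * eps)

N, T = 2000, 1.0
dt = T / N
ts = [k * dt for k in range(N + 1)]

# Both paths obey q(0) = 0, q(T) = 1; only the first solves q'' = -q.
classical = [math.sin(t) / math.sin(T) for t in ts]
straight = [t / T for t in ts]

# The variation vanishes at the endpoints, as required by the action principle.
eta = [math.sin(math.pi * t / T) for t in ts]

dS_classical = first_variation(classical, eta, dt)
dS_straight = first_variation(straight, eta, dt)
print(dS_classical, dS_straight)
```

For the straight path the first variation is $-\int_0^1 dt\, t \sin(\pi t) = -1/\pi$, while for the classical path it vanishes up to the discretization error.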
with $i = 1, \ldots, N$. These are the equations of motion in the Lagrangian formulation. In order to pass to the Hamiltonian formalism, one defines the conjugate momenta
$$ p_i = \frac{\partial L}{\partial \dot q_i} , \tag{3.8} $$
and the Hamiltonian
$$ H(p, q) = \sum_i p_i \dot q_i - L , \tag{3.9} $$
where $\dot q_i$ is expressed in terms of $p_i$ and possibly $q_i$ using eq. (3.8).

In the previous chapter we introduced the fields, defined as functions of the space-time point $x$ with given transformation properties under the Poincaré group. The classical dynamics of a field can be defined by extending the Lagrangian formalism from the case of functions of time, $q_i(t)$, to the case of functions of space-time $\phi_i(x)$. We will only be interested in local field theories, in which case the Lagrangian has the general form
$$ L = \int d^3x\, \mathcal{L}(\phi, \partial_\mu\phi) , \tag{3.10} $$
where $\mathcal{L}$ is called the Lagrangian density (however, following standard use in field theory, we will often refer to $\mathcal{L}$ simply as the Lagrangian) and depends only on a finite number of derivatives. We will often denote collectively the fields $\phi_i$ simply as $\phi$. To make contact with the variables $q_i(t)$ of classical mechanics, we can think of $\phi_i(t, \mathbf{x})$ as functions of time labeled both by the index $i$ and by the continuous label $\mathbf{x}$ (the analogy becomes exact if we discretize space and we put the system in a finite box). In most cases of interest, $\mathcal{L}$ depends only on the first derivative.$^{2}$ In a Lorentz-covariant theory $\mathcal{L}$ depends on the time and space derivatives of $\phi$ only through the four-vector $\partial_\mu\phi$. For the moment we do not assume anything about the transformation properties of the field, so for instance $\phi_i$ could denote a set of scalars, or the four components of a vector field, etc. The action has the general form
$$ S = \int dt\, L = \int d^4x\, \mathcal{L}(\phi, \partial_\mu\phi) . \tag{3.11} $$
While for point particles we considered the time integral between two values $t_{\rm in}$, $t_f$, in classical field theory we will rather be interested in the situation where the integral extends over all of space-time, and the boundary conditions are that all fields decrease sufficiently fast at infinity so that, in particular, all boundary terms can be neglected. The classical dynamics is again defined by the principle that the action is stationary. The same manipulations performed above in the case of a function $q(t)$ are immediately generalized to the case of a function $\phi(x)$, and
$$ \delta S = \int d^4x \sum_i \left[ \frac{\partial\mathcal{L}}{\partial\phi_i}\,\delta\phi_i + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_i)}\,\delta(\partial_\mu\phi_i) \right] = \int d^4x \sum_i \left[ \frac{\partial\mathcal{L}}{\partial\phi_i} - \partial_\mu \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_i)} \right] \delta\phi_i = 0 . \tag{3.12} $$
Therefore the equations of motion, or Euler–Lagrange equations, are
$$ \frac{\partial\mathcal{L}}{\partial\phi_i} - \partial_\mu \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_i)} = 0 , \tag{3.13} $$
with $i = 1, \ldots, N$. Consider now the theory obtained by replacing the original Lagrangian density $\mathcal{L}$ by a new Lagrangian density $\mathcal{L}'$ which differs from $\mathcal{L}$ only by a four-divergence,
$$ \mathcal{L}' = \mathcal{L} + \partial_\mu K^\mu , \tag{3.14} $$
with $K^\mu = K^\mu(\phi)$. In a finite volume $V$ bounded by a surface $\Sigma$ we have, by Stokes' theorem,
$$ \int_V d^4x\, \partial_\mu K^\mu = \int_\Sigma dA\, n_\mu K^\mu , \tag{3.15} $$
where $dA$ is the surface element and $n_\mu$ is the outward normal to the surface. This is a boundary term and therefore vanishes on field configurations that go to zero sufficiently fast at infinity. More generally, we can consider the variational principle in a finite four-dimensional volume $V$ but in this case, similarly to the situation discussed in classical mechanics, in order to have a well-defined variational principle we must impose the boundary condition that the fields are kept constant on $\Sigma$. In any case, the term $\int_V d^4x\, \partial_\mu K^\mu$ either vanishes or is anyway a constant, and therefore the condition that $S'$ is stationary is equivalent to the condition that $S$ is stationary. This means that two Lagrangian densities which differ by a total derivative give rise to the same equations of motion and therefore are classically equivalent.

$^{2}$ The variational principle can be generalized to Lagrangians containing higher derivatives. For instance, if $L = L(q, \dot q, \ddot q)$, then $\delta L = (\delta L/\delta q)\,\delta q + (\delta L/\delta \dot q)\,\delta \dot q + (\delta L/\delta \ddot q)\,\delta \ddot q$. At the boundaries we must hold fixed both $q$ and $\dot q$ and, after integrations by parts, the equation of motion is $(\delta L/\delta q) - d/dt\,(\delta L/\delta \dot q) + d^2/dt^2\,(\delta L/\delta \ddot q) = 0$.

In the Hamiltonian formalism one defines the conjugate momenta as
$$ \Pi_i(x) = \frac{\partial\mathcal{L}}{\partial(\partial_0\phi_i(x))} . \tag{3.16} $$
The Hamiltonian density $\mathcal{H}$ is then defined as
$$ \mathcal{H}(x) = \sum_i \Pi_i(x)\,\partial_0\phi_i(x) - \mathcal{L} , \tag{3.17} $$
and the total Hamiltonian is
$$ H = \int d^3x\, \mathcal{H} . \tag{3.18} $$
The Lagrangian formalism has the advantage of keeping the Lorentz covariance explicit at each stage. In the Hamiltonian formalism, instead, Lorentz invariance is less explicit, since the time variable plays a special role in defining the conjugate momenta.
3.2 Noether's theorem
The relation between symmetries and conservation laws is extremely important in field theory and, at the classical level, it is expressed by Noether's theorem. We consider a field theory with fields $\phi_i$ and action $S$, and we perform an infinitesimal transformation of the coordinates and of the fields, parametrized by a set of infinitesimal parameters $\epsilon^a$, with $a = 1, \ldots, N$, of the general form
$$ x^\mu \to x'^\mu = x^\mu + \epsilon^a A^\mu_a(x) , \tag{3.19} $$
$$ \phi_i(x) \to \phi'_i(x') = \phi_i(x) + \epsilon^a F_{i,a}(\phi, \partial\phi) , \tag{3.20} $$
with $A^\mu_a(x)$ and $F_{i,a}(\phi, \partial\phi)$ given. Equations (3.19, 3.20) define a symmetry transformation if they leave the action $S(\phi)$ invariant, for any $\phi$. Note that we are not assuming that the fields $\phi_i$ satisfy the classical equations of motion. A symmetry by definition leaves the action invariant for every field configuration, solution or not of the equations of motion.

There are two important distinctions to be drawn. The first is between local and global symmetries. If in eqs. (3.19) and (3.20) the parameters $\epsilon^a$ are constants, we have a global symmetry. Instead, the above transformation is a local symmetry if it leaves the action invariant even when $\epsilon$ is allowed to be an arbitrary function of $x$. Of course, a local symmetry gives rise also to a global symmetry.$^{3}$ The second important distinction is between internal and space-time symmetries. Internal symmetries do not change the coordinates, so they have $A^\mu_a(x) = 0$, while space-time symmetries involve also a change in the coordinates. For internal symmetries (and also for Lorentz transformations and for translations) $d^4x$ is invariant and the condition of invariance of the action is equivalent to the condition of invariance of $\mathcal{L}$, but in the general case the symmetries are given by invariances of $S$, not of $\mathcal{L}$.

Now, suppose that eqs. (3.19) and (3.20) are a global, but not a local, symmetry of our theory. Consider what happens if, starting from a given field configuration $\phi_i$, we perform the above transformation, but with the $\epsilon^a$ slowly varying functions of $x$, i.e. $|\epsilon^a| \ll 1$ and $l\,|\partial_\mu\epsilon^a| \ll |\epsilon^a|$, where $l$ is the characteristic scale of variation of $\phi_i$. Since the $\epsilon^a$ depend on $x$, and we are assuming that eqs. (3.19) and (3.20) are not a local symmetry, this transformation will not leave the action invariant, and $\delta S$ will have a non-vanishing term at $O(\epsilon)$. However, since the $\epsilon^a$ are slowly varying, we can expand this $O(\epsilon)$ term in powers of the derivatives,
$$ S(\phi') = S(\phi) + \int d^4x \left[ \epsilon^a(x) K_a(\phi) - (\partial_\mu\epsilon^a)\, j^\mu_a(\phi) + O(\partial\partial\epsilon) \right] + O(\epsilon^2) , \tag{3.21} $$
where we have denoted by $K_a$ the coefficient of $\epsilon^a$ and by $-j^\mu_a$ the coefficient of $\partial_\mu\epsilon^a$. This equation holds for any slowly varying $\epsilon$.$^{4}$ We can then apply it also to the case where the $\epsilon^a$ are all constants. However we know that, if the $\epsilon^a$ are constants, the variation of the action must be zero because in this case $\epsilon$ parametrizes a global symmetry. Therefore we learn that, in eq. (3.21), the function $K_a(\phi)$ is actually zero, for any $\phi$.

Observe that $K_a(\phi)$ is by definition a function of $\phi$ but is independent of $\epsilon$; all the $\epsilon$ dependence is written explicitly in eq. (3.21). Therefore, even if to show that $K_a(\phi) = 0$ we have looked at the limiting case of constant $\epsilon$, once we have shown that $K_a$ vanishes, it vanishes independently of $\epsilon$. Then, for any slowly varying function $\epsilon(x)$, we have the expansion
$$ S(\phi') = S(\phi) - \int d^4x \left[ (\partial_\mu\epsilon^a)\, j^\mu_a(\phi) + O(\partial\partial\epsilon) \right] + O(\epsilon^2) . \tag{3.22} $$
We now take $\epsilon$ to be a function which goes to zero sufficiently fast at infinity. This allows us to integrate the above equation by parts (without making any assumptions on $\phi$) and the boundary term vanishes, so we get
$$ S(\phi') = S(\phi) + \int d^4x \left[ \epsilon^a(x)\, \partial_\mu j^\mu_a(\phi) + O(\partial\partial\epsilon) \right] + O(\epsilon^2) . \tag{3.23} $$
We have derived the above result independently of our choice of $\phi$, and for $\epsilon$ slowly varying and vanishing at infinity.

$^{3}$ However, it can happen that the corresponding global symmetry is trivial, i.e. it is just the identity transformation. For example, we will see in Section 3.5 that the free electromagnetic field is invariant under the gauge transformation $A_\mu \to A_\mu - \partial_\mu\theta$. For $\theta$ constant the corresponding transformation is just the identity and does not give rise to conserved charges (as long as we do not include matter fields).

$^{4}$ We generically denote by $\epsilon$ the set of $\epsilon^a$ with $a = 1, \ldots, N$. The statement "$\epsilon$ slowly varying" means that all $\epsilon^a$ are slowly varying.
Suppose now that, for $\phi$, we choose a classical solution of the equations of motion, $\phi^{\rm cl}$. Observe, first of all, that in the action the $x$ are integration variables so, after performing the transformation (3.19, 3.20), we can rename $x'$ as $x$. Then, as long as we are only interested in studying the variation of the action, the infinitesimal transformation in eqs. (3.19) and (3.20) is equivalent to a transformation in which the coordinates are unchanged, $x^\mu \to x^\mu$, while
$$ \phi_i(x) \to \phi_i(x^\mu - \epsilon^a A^\mu_a) + \epsilon^a F_{i,a} = \phi_i(x) + \epsilon^a F_{i,a} - \epsilon^a A^\mu_a\, \partial_\mu\phi_i . \tag{3.24} $$
Thus we have rewritten the transformations (3.19, 3.20) in a form that does not change the coordinates, and with $\phi_i(x) \to \phi_i(x) + \delta\phi_i(x)$ with $\delta\phi_i(x)$ some variation that goes to zero at infinity. This is the kind of variation that is used in the derivation of the equations of motion. By definition, a classical solution is an extremum of the action and therefore if we now take $\phi_i = \phi_i^{\rm cl}$ the linear term in the variation of $S$ in eq. (3.23) must be identically zero for any $\epsilon$ that vanishes at infinity, independently of whether or not $\epsilon$ depends on $x$, and of whether or not it is a symmetry transformation (the condition that $\epsilon$ has to vanish at infinity follows from the fact that the classical equations of motion are obtained from a variation with fixed boundary conditions, i.e. keeping the fields fixed at infinity). Therefore we arrive at the important conclusion that, on a classical solution of the equations of motion, the $N$ currents $j^\mu_a$ are conserved,
$$ \partial_\mu j^\mu_a(\phi^{\rm cl}) = 0 , \tag{3.25} $$
which is the content of Noether's theorem. In other words, there is a conserved current $j^\mu_a$ (with $a = 1, \ldots, N$) for each generator of a symmetry transformation. The fact that the symmetry was global but not local means that $S(\phi') - S(\phi)$ in eq. (3.22) must be non-vanishing, and therefore the $j^\mu_a$ themselves are non-vanishing.$^{5}$ We now define the charges $Q_a$,
$$ Q_a \equiv \int d^3x\, j^0_a(\mathbf{x}, t) . \tag{3.26} $$
The current conservation (in the sense of eq. (3.25)) implies that $Q_a$ is conserved (in the sense that it is time-independent). In fact
$$ \partial_0 Q_a = \int d^3x\, \partial_0 j^0_a(\mathbf{x}, t) = - \int d^3x\, \partial_i j^i_a(\mathbf{x}, t) . \tag{3.27} $$
This is the integral of a total divergence, and vanishes since we assume a sufficiently fast decrease of the fields at infinity. More generally, in a finite volume the variation of the charge is given by a boundary term representing the incoming or outgoing flux.

$^{5}$ The transformations that we consider depend on the $\epsilon^a$ in a continuous and differentiable way, therefore they form a Lie group. If the linear term $j^\mu_a$ in eq. (3.22) vanished, integrating the infinitesimal transformation we would find that the finite transformation is the identity; compare with eqs. (2.3) and (2.5).

The explicit form of the current can be obtained by performing the variation of the action with $\epsilon$ slowly varying, collecting the terms proportional to $\partial_\mu\epsilon$, and comparing with eq. (3.22). This can be done in full generality. Denoting by $\delta_\epsilon$ the variations induced by the transformation (3.19, 3.20), we have
$$ \delta_\epsilon S = \delta_\epsilon \int d^4x\, \mathcal{L} = \int \left[ \delta_\epsilon(d^4x)\, \mathcal{L} + d^4x \left( \frac{\partial\mathcal{L}}{\partial\phi_i}\,\delta_\epsilon\phi_i + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_i)}\,\delta_\epsilon(\partial_\mu\phi_i) \right) \right] . \tag{3.28} $$
Computing the Jacobian of the transformation (3.19, 3.20) to linear order, we find $d^4x \to d^4x\,(1 + A^\mu_a \partial_\mu\epsilon^a)$ plus a term $\sim \epsilon$; $\delta_\epsilon\phi_i$ does not produce terms $\sim \partial\epsilon$ while
$$ \delta_\epsilon(\partial_\mu\phi_i) = \frac{\partial\phi'_i}{\partial x'^\mu} - \frac{\partial\phi_i}{\partial x^\mu} = \frac{\partial x^\nu}{\partial x'^\mu}\frac{\partial}{\partial x^\nu}\left(\phi_i + \epsilon^a F_{i,a}\right) - \frac{\partial\phi_i}{\partial x^\mu} . \tag{3.29} $$
This produces a term proportional to $\partial_\mu\epsilon$ and equal to
$$ -(\partial_\mu\epsilon^a)\left(A^\nu_a\, \partial_\nu\phi_i - F_{i,a}\right) . \tag{3.30} $$
(Observe that, since $\delta_\epsilon$ is a variation that also changes the coordinates, $\delta_\epsilon(\partial_\mu\phi_i) \neq \partial_\mu(\delta_\epsilon\phi_i)$.) Putting together all terms $\sim \partial_\mu\epsilon^a$ gives
$$ j^\mu_a = \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_i)}\left[ A^\nu_a(x)\, \partial_\nu\phi_i - F_{i,a}(\phi, \partial\phi) \right] - A^\mu_a(x)\, \mathcal{L} . \tag{3.31} $$
For internal symmetries $A^\mu_a = 0$ and the above expression simplifies to
$$ j^\mu_a = - \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_i)}\, F_{i,a} \qquad \text{(internal symmetries)} . \tag{3.32} $$
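As a worked illustration of eq. (3.32), which is not part of the original derivation, one can take two real scalar fields $\phi_1, \phi_2$ of equal mass and the infinitesimal rotation that mixes them. The Lagrangian density and the choice of $F_i$ below are assumptions made only for this example; the equations of motion used in the last step are the Euler–Lagrange equations (3.13) for this Lagrangian.

```latex
% Assumed example: two real scalars with an internal SO(2) symmetry.
\begin{align*}
\mathcal{L} &= \tfrac12\,\partial_\mu\phi_i\,\partial^\mu\phi_i
             - \tfrac12\, m^2 \phi_i\phi_i , \qquad i = 1,2 , \\
\phi_1 &\to \phi_1 - \epsilon\,\phi_2 , \quad
\phi_2 \to \phi_2 + \epsilon\,\phi_1
\quad\Longrightarrow\quad F_1 = -\phi_2 ,\; F_2 = \phi_1 , \\
j^\mu &= -\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_i)}\,F_i
       = -\left(\partial^\mu\phi_1\right)(-\phi_2)
         -\left(\partial^\mu\phi_2\right)\phi_1
       = \phi_2\,\partial^\mu\phi_1 - \phi_1\,\partial^\mu\phi_2 , \\
\partial_\mu j^\mu &= \phi_2\,\partial_\mu\partial^\mu\phi_1
                    - \phi_1\,\partial_\mu\partial^\mu\phi_2
                   = -m^2\phi_2\phi_1 + m^2\phi_1\phi_2 = 0 .
\end{align*}
```

The cross terms $\partial_\mu\phi_2\,\partial^\mu\phi_1$ cancel identically, and the remaining terms cancel only after using $\partial_\mu\partial^\mu\phi_i = -m^2\phi_i$, in agreement with the statement that the current is conserved on the classical solutions.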
Quite often one is interested in linear transformations of the fields, in which case $F_{i,a}(\phi, \partial\phi) = (M_a)_i{}^j \phi_j$, where $(M_a)_i{}^j$ are $N$ constant matrices.

Finally, let us see what happens if the transformation (3.19, 3.20) is not a global symmetry. In this case eq. (3.21) still holds, since it is the most general expansion of the variation of the action when $\epsilon$ is slowly varying; however, now $K_a(\phi)$ does not vanish, and indeed it represents the variation of the Lagrangian under the global transformation, $K_a \equiv (\delta_a \mathcal{L})_{\rm global}$, where $(\delta_a \mathcal{L})_{\rm global}$ is defined by $(\delta\mathcal{L})_{\rm global} = \epsilon^a (\delta_a \mathcal{L})_{\rm global}$. Following the same steps that lead to eq. (3.25) we now find that, on the classical solutions,
$$ \partial_\mu j^\mu_a = -(\delta_a \mathcal{L})_{\rm global} , \tag{3.33} $$
where $j^\mu_a$ is still given by eq. (3.31). This expression is particularly useful when $(\delta_a \mathcal{L})_{\rm global}$ is small compared to the relevant scales, so that the current is approximately conserved.
3.2.1 The energy–momentum tensor
Space-time translations are a symmetry which is present in all the theories that we will consider. In this case the index $a$ appearing in $\epsilon^a$ is a Lorentz index and, as explained in Section 2.7, all fields are scalars under translations. Therefore
$$ x^\mu \to x'^\mu = x^\mu + \epsilon^\mu = x^\mu + \epsilon^\nu \delta^\mu_\nu , \qquad \phi_i(x) \to \phi'_i(x') = \phi_i(x) , \tag{3.34} $$
and in eqs. (3.19) and (3.20) we have $A^\mu_\nu = \delta^\mu_\nu$ and $F_{i,a} = 0$. The four conserved currents $j^\mu_{(\nu)} \equiv \theta^\mu{}_\nu$ therefore form a Lorentz tensor, known as the energy–momentum tensor. Using eq. (3.31) and raising the $\nu$ index, $\theta^{\mu\nu} = \eta^{\nu\rho}\theta^\mu{}_\rho$, we get
$$ \theta^{\mu\nu} = \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_i)}\, \partial^\nu\phi_i - \eta^{\mu\nu}\mathcal{L} . \tag{3.35} $$
Equation (3.25) becomes
$$ \partial_\mu \theta^{\mu\nu} = 0 \tag{3.36} $$
on the solutions of the classical equations of motion. The conserved charge associated to the energy–momentum tensor is the four-momentum,
$$ P^\nu \equiv \int d^3x\, \theta^{0\nu} . \tag{3.37} $$
This is the definition of four-momentum in field theory. A field configuration that solves the equations of motion carries an energy $E = P^0$ and a spatial momentum $P^i$ which can be calculated using eqs. (3.35) and (3.37).

The energy–momentum tensor defined from eq. (3.35) in general is not automatically symmetric in the two indices $\mu, \nu$, so for instance in eq. (3.36) one should be careful to contract $\partial_\mu$ with the first index of $\theta^{\mu\nu}$. However, consider the "improved" energy–momentum tensor
$$ T^{\mu\nu} = \theta^{\mu\nu} + \partial_\rho A^{\rho\mu\nu} , \tag{3.38} $$
where $A^{\rho\mu\nu}$ is an arbitrary tensor antisymmetric in the indices $\rho, \mu$. This new tensor is still conserved: $\partial_\mu\partial_\rho A^{\rho\mu\nu} = 0$ because of the antisymmetry in $\rho, \mu$. Furthermore, for $\mu = 0$, $\partial_\rho A^{\rho 0\nu} = \partial_i A^{i0\nu}$ is a spatial divergence, and therefore this term does not contribute to the four-momentum (3.37), if the fields vanish sufficiently fast at infinity. Therefore $T^{\mu\nu}$ and $\theta^{\mu\nu}$ are physically equivalent, and one can choose $A^{\rho\mu\nu}$ such that $T^{\mu\nu}$ is symmetric. The reader with some knowledge of general relativity may compare this definition with the definition of energy–momentum tensor in general relativity, which is given by the variation of the action with respect to the metric. Since the metric is symmetric, this definition automatically gives the symmetric form of $T^{\mu\nu}$.
3.3 Scalar fields

3.3.1 Real scalar fields; Klein–Gordon equation
We now have all the elements for writing down Poincaré invariant actions. We start with the theory of a single, real scalar field $\phi$. An action describing a non-trivial dynamics must contain $\partial_\mu\phi$. In order to have a Lorentz invariant action the index $\mu$ must be saturated and, for a scalar field, the only possibility is to contract it with another factor $\partial^\mu\phi$. Therefore the kinetic term must be proportional to $\partial_\mu\phi\,\partial^\mu\phi$. We first consider the action
$$ S = \frac{1}{2} \int d^4x \left( \partial_\mu\phi\,\partial^\mu\phi - m^2\phi^2 \right) . \tag{3.39} $$
The Euler–Lagrange equation gives
$$ (\Box + m^2)\,\phi = 0 , \tag{3.40} $$
with $\Box = \partial_\mu\partial^\mu$. This is the free Klein–Gordon (KG) equation. A plane wave $e^{\pm ipx}$ is a solution of eq. (3.40) if $p^2 = m^2$, i.e. if
$$ (p^0)^2 - \mathbf{p}^2 = m^2 . \tag{3.41} $$
Therefore, the classical KG equation imposes the relativistic dispersion relation, and the parameter $m$ appearing in the action is the mass (we take by definition $m > 0$). Taking into account that $\phi$ must be real, the most general solution is a real superposition of plane waves,
$$ \phi(x) = \int \frac{d^3p}{(2\pi)^3 \sqrt{2E_{\mathbf{p}}}} \left( a_{\mathbf{p}}\, e^{-ipx} + a^*_{\mathbf{p}}\, e^{ipx} \right) \Big|_{p^0 = E_{\mathbf{p}}} , \tag{3.42} $$
where $E_{\mathbf{p}} \equiv +\sqrt{\mathbf{p}^2 + m^2}$. The factor $\sqrt{2E_{\mathbf{p}}}$ is a convenient choice of normalization of the coefficients $a_{\mathbf{p}}$. The solution is evaluated on $p^0 = +E_{\mathbf{p}}$, i.e. on the positive solution of eq. (3.41). Note however that in eq. (3.42) we have both solutions that oscillate as $e^{-iE_{\mathbf{p}}t}$ (positive frequency modes) and as $e^{+iE_{\mathbf{p}}t}$ (negative frequency modes). The proper interpretation of the latter modes will only come after quantization of the theory.$^{6}$

The overall sign (and normalization) of the action (3.39) is irrelevant as long as we are interested in the equations of motion, but is important for obtaining a positive definite Hamiltonian (and the correct choice depends on our convention for the metric). The momentum conjugate to $\phi$ is
$$ \Pi_\phi = \frac{\partial\mathcal{L}}{\partial(\partial_0\phi)} = \partial_0\phi , \tag{3.43} $$
and the Hamiltonian density is
$$ \mathcal{H} = \Pi_\phi\, \partial_0\phi - \mathcal{L} = \frac{1}{2}\left[ \Pi_\phi^2 + (\nabla\phi)^2 + m^2\phi^2 \right] . \tag{3.44} $$

$^{6}$ Of course, after one has included both the solutions $e^{ipx}$ and $e^{-ipx}$ with $p^0 = +E_{\mathbf{p}}$ one does not get anything new by including solutions with $p^0 = -E_{\mathbf{p}}$. For instance, a term $e^{-ipx}$, when $p^0 = -E_{\mathbf{p}}$, is equal to $\exp\{-ip^0 t + i\mathbf{p}\cdot\mathbf{x}\} = \exp\{iE_{\mathbf{p}} t + i\mathbf{p}\cdot\mathbf{x}\}$. After changing the dummy integration variable $\mathbf{p}$ to $-\mathbf{p}$ we get back to $\exp\{iE_{\mathbf{p}} t - i\mathbf{p}\cdot\mathbf{x}\}$, which is just $e^{ipx}$ with $p^0 = +E_{\mathbf{p}}$.
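The claim that a plane wave solves the KG equation only on the mass shell (3.41) can be illustrated numerically. The sketch below is not from the text: it assumes one space dimension, uses the real combination $\cos(Et - px)$ of the two exponentials in eq. (3.42), and approximates the second derivatives by central differences; the function name `kg_residual` and the sample values of $p$, $m$, $t$, $x$ are hypothetical.

```python
import math

def kg_residual(E, p, m, t=0.3, x=0.5, h=1e-3):
    """Finite-difference value of (d_t^2 - d_x^2 + m^2) phi for phi = cos(E t - p x)."""
    def phi(t_, x_):
        return math.cos(E * t_ - p * x_)
    d2t = (phi(t + h, x) - 2 * phi(t, x) + phi(t - h, x)) / h**2
    d2x = (phi(t, x + h) - 2 * phi(t, x) + phi(t, x - h)) / h**2
    return d2t - d2x + m**2 * phi(t, x)

p, m = 0.7, 1.2
E_on_shell = math.sqrt(p**2 + m**2)   # positive root of the dispersion relation

on_shell = kg_residual(E_on_shell, p, m)
off_shell = kg_residual(2.0, p, m)    # energy that violates E^2 = p^2 + m^2
print(on_shell, off_shell)
```

The on-shell residual vanishes up to the finite-difference error, while for the off-shell energy it equals $(-E^2 + p^2 + m^2)\,\phi$, which is of order one for these values.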
The energy–momentum tensor is found from eq. (3.35),
$$ \theta^{\mu\nu} = \partial^\mu\phi\, \partial^\nu\phi - \eta^{\mu\nu}\mathcal{L} , \tag{3.45} $$
so that
$$ \theta^{00} = (\partial_0\phi)^2 - \mathcal{L} = \frac{1}{2}\left[ (\partial_0\phi)^2 + (\nabla\phi)^2 + m^2\phi^2 \right] . \tag{3.46} $$
We see that $\theta^{00} = \mathcal{H}$, as expected, and $H = \int d^3x\, \theta^{00}$. The Hamiltonian is the conserved charge related to the invariance under time translations.

We now compute the conserved currents associated to Lorentz invariance. In this case the parameters $\epsilon^a$ which appear in eqs. (3.19) and (3.20) are better labeled by an antisymmetric pair of Lorentz indices; we will denote them by $\omega^{\rho\sigma}$, and $\delta x^\mu = \omega^\mu{}_\nu x^\nu$ can be rewritten as
$$ \delta x^\mu = \sum_{\rho<\sigma} A^\mu_{(\rho\sigma)}\, \omega^{\rho\sigma} , \tag{3.47} $$
[...] $> 1/L$. If we take $L \ll \lambda$, the uncertainty on the momentum becomes much bigger than the momentum itself.
a massless vector field $A_\mu$. Gauge invariance is a guiding principle in building the theory of fundamental interactions, and the corresponding theories are known as gauge theories. A very general method for writing a gauge invariant action is the following. We start from a theory with a global $U(1)$ invariance. Let us consider for definiteness the Dirac action (3.89), but the procedure is completely general. We consider the transformation
$$ \Psi \to e^{iq\theta}\, \Psi . \tag{3.165} $$
This transformation is a symmetry of the free Dirac action if $\theta$ is a constant, but we want to consider a generic function $\theta(x)$. The free Dirac action is not invariant if $\theta$ depends on $x$, because then the factor $e^{iq\theta}$ coming from $\Psi$ does not commute with $\partial_\mu$ and so it cannot be canceled by the factor $e^{-iq\theta}$ coming from $\bar\Psi$. We have assigned to the field $\Psi$ a parameter $q$ that defines its transformation properties. We will see that it has the meaning of the electric charge of the particles described by the field, in units of $e$. At the same time, the action of the free electromagnetic field is invariant under
$$ A_\mu \to A_\mu - \partial_\mu\theta . \tag{3.166} $$
We then define the covariant derivative of $\Psi$ as
$$ D_\mu\Psi = (\partial_\mu + iqA_\mu)\,\Psi \tag{3.167} $$
and we immediately verify that, under the combined transformations (3.165) and (3.166), with $\theta = \theta(x)$,
$$ D_\mu\Psi \to e^{iq\theta}\, D_\mu\Psi , \tag{3.168} $$
i.e. $D_\mu\Psi$ transforms in the same way as $\Psi$, even when $\theta$ is a function of $x$. It is now easy to construct a Lagrangian with a local $U(1)$ invariance. It suffices to replace all derivatives $\partial_\mu$ with covariant derivatives $D_\mu$. This procedure is expressed by saying that we have gauged the global $U(1)$ symmetry, promoting it to a local symmetry. The resulting theory is called a gauge theory and $A_\mu$ is called a gauge field. More precisely, it is a $U(1)$, or abelian, gauge field, since we have gauged a $U(1)$ symmetry. In Chapter 10 we will study how to gauge nonabelian groups, like $SU(N)$, and this will lead to nonabelian gauge theories and to the Standard Model.

It is important to note that the form of the covariant derivative depends on the transformation properties of the field on which it acts. For instance, for fields transforming as in eq. (3.165), $D_\mu$ depends on the parameter $q$. One can consider more general transformation laws, however. As an example, a gauge field transforms as in eq. (3.166) rather than being multiplied by a phase, and on a gauge field we simply define $D_\mu A_\nu = \partial_\mu A_\nu$ since $F_{\mu\nu}$ is already gauge invariant. We will find more general transformation properties, and more general definitions of the covariant derivative, when we study nonabelian gauge theories.

It is now straightforward to couple a Dirac field of charge $q$ to the electromagnetic field: we just replace $\partial_\mu$ by $D_\mu$ in the Dirac Lagrangian and we have
$$ \mathcal{L}_D = \bar\Psi\,(i\gamma^\mu D_\mu - m)\,\Psi \tag{3.169} $$
or, more explicitly,
$$ \mathcal{L}_D = \bar\Psi\,(i\gamma^\mu\partial_\mu - m)\,\Psi - qA_\mu \bar\Psi\gamma^\mu\Psi . \tag{3.170} $$
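The covariance property (3.168) can be verified numerically in a simplified setting. The sketch below is only an illustration under assumptions that are not in the text: a single coordinate $x$, a one-component complex function in place of the Dirac spinor, arbitrarily chosen sample functions for $\Psi$, $A$ and $\theta$, and derivatives approximated by central differences. All function names are hypothetical.

```python
import cmath
import math

q = 2.0  # charge parameter entering eq. (3.165); the value is arbitrary

def psi(x):      # sample matter field (one complex component)
    return cmath.exp(1j * 1.3 * x) * (1 + 0.5 * math.sin(x))

def A(x):        # sample gauge field component along x
    return 0.4 * math.cos(2 * x)

def theta(x):    # sample x-dependent gauge parameter
    return 0.7 * math.sin(3 * x) + 0.2 * x

def deriv(f, x, h=1e-5):
    """Central-difference approximation of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)

def cov_deriv(f, gauge, x):
    """D f = (d/dx + i q A) f, as in eq. (3.167)."""
    return deriv(f, x) + 1j * q * gauge(x) * f(x)

def psi_transformed(x):   # eq. (3.165) with theta = theta(x)
    return cmath.exp(1j * q * theta(x)) * psi(x)

def A_transformed(x):     # eq. (3.166)
    return A(x) - deriv(theta, x)

x0 = 0.37
phase = cmath.exp(1j * q * theta(x0))

# Covariant derivative: transforms with the same phase as psi.
lhs = cov_deriv(psi_transformed, A_transformed, x0)
rhs = phase * cov_deriv(psi, A, x0)

# Ordinary derivative: picks up an extra term i q (d theta/dx) psi.
naive_mismatch = abs(deriv(psi_transformed, x0) - phase * deriv(psi, x0))
print(abs(lhs - rhs), naive_mismatch)
```

The first printed number is zero up to finite-difference error, while the second is of order $q\,|\partial_x\theta|\,|\Psi|$, which is the term that the shift of $A_\mu$ is designed to cancel.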
Thus, the electrodynamics of a spinor field is obtained by coupling $A_\mu$ to the current $\bar\Psi\gamma^\mu\Psi$. The resulting theory has by construction the local $U(1)$ symmetry defined by eqs. (3.165) and (3.166), with $\theta$ an arbitrary function of $x$, and therefore it obviously also has the global symmetry $\Psi \to e^{iq\theta}\Psi$, $A_\mu \to A_\mu$ with $\theta$ a constant. Applying the Noether theorem to this global symmetry we see that (modulo of course an arbitrary normalization) the conserved current is the vector current that we already met in Section 3.4.3,
$$ j_V^\mu = \bar\Psi\gamma^\mu\Psi . \tag{3.171} $$
Therefore the electromagnetic field is coupled to a conserved current. The conserved charge is
$$ Q = \int d^3x\, \bar\Psi\gamma^0\Psi = \int d^3x\, \Psi^\dagger\Psi \tag{3.172} $$
and, as we will see after quantization of the theory, it has the meaning of the electric charge, in units of $e$. The equation of motion is
$$ (i\gamma^\mu D_\mu - m)\,\Psi = 0 . \tag{3.173} $$
This is the Dirac equation describing a spin-1/2 charged particle interacting with an electromagnetic field. We will discuss some of its consequences in the next section.

Consider now a complex scalar field $\phi$ transforming under gauge transformations as $\phi \to e^{iq\theta}\phi$. Again $D_\mu\phi$ is defined as $(\partial_\mu + iqA_\mu)\phi$ and the complex KG Lagrangian becomes
$$ \mathcal{L} = (D_\mu\phi)^* D^\mu\phi - m^2\phi^*\phi = \partial_\mu\phi\,\partial^\mu\phi^* + iqA^\mu\left(\phi\,\partial_\mu\phi^* - \phi^*\partial_\mu\phi\right) + q^2|\phi|^2 A_\mu A^\mu - m^2\phi^*\phi . \tag{3.174} $$
This is the Lagrangian of scalar electrodynamics. As we discussed in Section 3.3.2, the complex Klein–Gordon theory has a $U(1)$ symmetry, whose conserved current is given in eq. (3.62). We see from eq. (3.174) that $A_\mu$ couples to this current, and there is also a term in the Lagrangian proportional to $A_\mu A^\mu |\phi|^2$. The latter term plays an important role in the Higgs mechanism and in superconductivity, as we will see in Section 11.3.

Note that it is not possible to couple in this way a real scalar field to the electromagnetic field. For a real field necessarily $q = 0$, otherwise $e^{iq\theta}\phi$ becomes complex. After quantization, a real scalar field describes particles which are neutral under electromagnetism. However, one can have a neutral scalar particle formed by charged constituents, and this particle will interact with the electromagnetic field not through its electric charge, which is zero, but through its higher electric and magnetic multipoles, exactly as the hydrogen atom is neutral but interacts with the electromagnetic fields through its electric and magnetic dipole moments, quadrupole moments, etc. This means that it must be possible to write a gauge-invariant coupling to the electromagnetic field also for a real scalar field. For example, a possible interaction term is
$$ \mathcal{L}_S = a_S\, \phi\, F_{\mu\nu}F^{\mu\nu} , \tag{3.175} $$
where $a_S$ is a coupling constant. Another possibility is
$$ \mathcal{L}_{PS} = a_{PS}\, \phi\, \epsilon_{\mu\nu\rho\sigma} F^{\mu\nu}F^{\rho\sigma} , \tag{3.176} $$
with another coupling constant $a_{PS}$. Observe that under parity $F_{\mu\nu}F^{\mu\nu}$ is invariant while $\epsilon_{\mu\nu\rho\sigma}F^{\mu\nu}F^{\rho\sigma}$ is a pseudoscalar. Therefore the interaction Lagrangian $\mathcal{L}_S$ preserves parity only if $\phi$ is a scalar field, while $\mathcal{L}_{PS}$ is invariant only if $\phi$ is a pseudoscalar field. For example the neutral pion $\pi^0$ is a pseudoscalar, and it decays into two photons; the Lagrangian $\mathcal{L}_{PS}$ gives a good phenomenological description of its interaction with the electromagnetic field.

The coupling to the electromagnetic field which is obtained by performing in the free Lagrangian the replacement $\partial_\mu \to D_\mu$ is called the minimal coupling. Otherwise the coupling is called non-minimal. Similarly, for a Dirac fermion we can in principle write non-minimal couplings. For example we can add to the Lagrangian an interaction term
$$ \mathcal{L}_{\rm int} = a\, \bar\Psi\sigma^{\mu\nu}\Psi\, F_{\mu\nu} , \tag{3.177} $$
with a coupling constant $a$. After quantization of the theory we will find that the interaction term $qA_\mu\bar\Psi\gamma^\mu\Psi$ indeed describes the coupling of the gauge field with the electric charge of the particle. We will instead show in Solved Problem 7.2 that the coupling (3.177) corresponds to a magnetic dipole interaction. We leave as an exercise to verify that the non-minimal coupling constants $a_S$, $a_{PS}$ and $a$ are not dimensionless, but rather have the dimension of the inverse of a mass. We will understand in Section 5.6 why interaction terms of this sort have a less fundamental significance than interaction terms in which the coupling constant is dimensionless, as in the minimal coupling.
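The dimension counting behind this last statement can be sketched as follows. The block assumes natural units $\hbar = c = 1$ in four space-time dimensions, so that the action is dimensionless and $[\mathcal{L}] = 4$ in units of mass; the field dimensions are read off from the kinetic terms. This is one way of organizing the exercise, not necessarily the route intended by the text.

```latex
% Mass dimensions in natural units (assumption: hbar = c = 1, d = 4).
\begin{align*}
[d^4x] &= -4 \;\Rightarrow\; [\mathcal{L}] = 4 , \\
[\partial_\mu\phi\,\partial^\mu\phi] = 4 &\;\Rightarrow\; [\phi] = 1 , \qquad
[F_{\mu\nu}F^{\mu\nu}] = 4 \;\Rightarrow\; [A_\mu] = 1 ,\; [F_{\mu\nu}] = 2 , \\
[\bar\Psi\, i\gamma^\mu\partial_\mu \Psi] = 4 &\;\Rightarrow\; [\Psi] = \tfrac32 , \\
[q] + [A_\mu] + 2[\Psi] = 4 &\;\Rightarrow\; [q] = 0 \quad \text{(minimal coupling)} , \\
[a_S] + [\phi] + 2[F_{\mu\nu}] = 4 &\;\Rightarrow\; [a_S] = [a_{PS}] = -1 , \\
[a] + 2[\Psi] + [F_{\mu\nu}] = 4 &\;\Rightarrow\; [a] = -1 .
\end{align*}
```

The minimal coupling $q$ is therefore dimensionless, while $a_S$, $a_{PS}$ and $a$ each carry the dimension of an inverse mass.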
3.6 First quantization of relativistic wave equations
As we already discussed in the Introduction, a ﬁrst quantization of relativistic wave equations cannot be performed consistently. In particular, we have seen that both the free Klein–Gordon and the free Dirac equation have solutions proportional to e−ipx and solutions proportional to e+ipx . The former oscillate in time as e−iEt and therefore in a ﬁrst quantized formalism are eigenfunctions of the Hamiltonian H = i∂/∂t with eigenvalue E, while the latter have eigenvalue −E and therefore correspond to negative energy solutions. The proper interpretation of these solutions comes only after the ﬁeld quantization that we will discuss in the next chapter.13 However, as we will show below and in Solved Problem 3.1, in the nonrelativistic limit the Dirac equation in an external electromagnetic ﬁeld reduces to a Schr¨ odinger equation, i∂ψ/∂t = Hψ, with a Hamiltonian H which contains an expansion in powers of the velocity of the particle, i.e. relativistic corrections. A posteriori, the ﬁeld theoretical treatment shows that the ﬁrstorder correction produced by the relativistic wave equations is correct, i.e. it coincides with the ﬁeld theory result.14 It is therefore useful to examine the nonrelativistic limit of the Dirac equation, and to treat it in ﬁrst quantization, promoting the classical ﬁeld to a wave function. One should be aware, however, that higherorder corrections are not correctly given by the relativistic wave equations, and the full QFT treatment is needed. The Dirac equation (3.173) for an electron of charge q = e (with e < 0 in our notation) in an external electromagnetic ﬁeld Aµ is [γ µ (i∂µ − eAµ ) − m]Ψ = 0 .
(3.178)
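To study the nonrelativistic limit, it is convenient to use the standard representation, eqs. (3.95) and (3.96), and to define
$$\chi'(\mathbf{x}, t) = e^{imt}\chi(\mathbf{x}, t) , \qquad \phi'(\mathbf{x}, t) = e^{imt}\phi(\mathbf{x}, t) , \tag{3.179}$$
so that, if $\phi$, $\chi$ have a time dependence $e^{-iEt}$ with $E$ the relativistic energy, then $\phi'$ and $\chi'$ oscillate as $e^{-iE_{\rm NR}t}$ with $E_{\rm NR} = E - m$. Then the Dirac equation reads
$$(i\partial_0 - eA_0)\phi' = -\boldsymbol{\sigma}\cdot(i\nabla + e\mathbf{A})\chi' , \tag{3.180}$$
$$(i\partial_0 - eA_0 + 2m)\chi' = -\boldsymbol{\sigma}\cdot(i\nabla + e\mathbf{A})\phi' . \tag{3.181}$$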
Observe that in eq. (3.180) the mass term obtained acting with $\partial_0$ on $e^{imt}$ cancels with the mass term originally present in the Dirac equation, while in eq. (3.181) they add up (recall also that $\partial_i = \nabla^i$, see the Notation). In the nonrelativistic limit we have
$$|i\partial_0\chi'| \ll m|\chi'| , \qquad |eA_0| \ll m , \tag{3.182}$$
and to lowest order the equation for $\chi'$ is easily solved,
$$\chi' \simeq -\frac{1}{2m}\,\boldsymbol{\sigma}\cdot(i\nabla + e\mathbf{A})\phi' . \tag{3.183}$$
13 For fermions, Dirac found an ingenious solution to the negative energy problem using the Pauli principle and assuming that all states with negative energies are ﬁlled. However, this solution does not work for bosons, and today in high energy physics this “ﬁlled Dirac sea” has only historical interest. In condensed matter, however, it leads to an interpretation in terms of electrons and holes which is still useful, see Exercise 4.6.
14 In the language of Feynman graphs that we will discuss in Chapter 5, the relativistic wave equations reproduce the result of tree level graphs.
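This solution is the lowest order in a relativistic expansion, and further corrections will be computed in Solved Problem 3.1. We now insert this expression for $\chi'$ into eq. (3.180) and use
$$\sigma^i\sigma^j (i\nabla^i + eA^i)(i\nabla^j + eA^j)\phi' = \left[(i\nabla + e\mathbf{A})^2 + i\epsilon^{ijk}\sigma^k\, ie(\nabla^i A^j)\right]\phi' , \tag{3.184}$$
which follows from $\sigma^i\sigma^j = \delta^{ij} + i\epsilon^{ijk}\sigma^k$. Finally, $\epsilon^{ijk}\nabla^i A^j = (\nabla\times\mathbf{A})^k = B^k$ is the magnetic field, and therefore (writing $\mathbf{p} = -i\nabla$) eq. (3.180) becomes
$$i\partial_0\phi' = \left[\frac{(\mathbf{p} - e\mathbf{A})^2}{2m} - \frac{e}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{B} + eA_0\right]\phi' . \tag{3.185}$$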
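The Pauli-matrix identity behind eq. (3.184) is easy to check numerically. The following is a minimal sketch in plain Python; the helper names (`matmul`, `levi_civita`, `sigma_dot`, `pauli_identity_holds`) are illustrative and do not come from the text, and the matrices are the usual Pauli matrices.

```python
# Pauli matrices as nested lists of (complex) numbers.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
IDENTITY = [[1, 0], [0, 1]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices taking the values 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def sigma_dot(v):
    """The 2x2 matrix sigma . v for a 3-vector v of ordinary numbers."""
    return [[sum(v[k] * SIGMA[k][r][c] for k in range(3)) for c in range(2)]
            for r in range(2)]


def pauli_identity_holds(tol=1e-12):
    """Check sigma^i sigma^j = delta^ij + i eps^ijk sigma^k for all i, j."""
    for i in range(3):
        for j in range(3):
            lhs = matmul(SIGMA[i], SIGMA[j])
            for r in range(2):
                for c in range(2):
                    rhs = (i == j) * IDENTITY[r][c] + sum(
                        1j * levi_civita(i, j, k) * SIGMA[k][r][c]
                        for k in range(3))
                    if abs(lhs[r][c] - rhs) > tol:
                        return False
    return True
```

For a vector of ordinary numbers the antisymmetric part drops out and `matmul(sigma_dot(v), sigma_dot(v))` is just $|\mathbf{v}|^2$ times the identity. In eq. (3.184) the components of $i\nabla + e\mathbf{A}$ do not commute, which is why the extra term proportional to $\boldsymbol{\sigma}\cdot\mathbf{B}$ survives.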
Therefore in the nonrelativistic limit the Dirac equation reduces to a Schrödinger equation for the two-component spinor $\phi'$, with a minimal coupling to the gauge field $A_\mu$, plus an interaction term with a magnetic field. We see that the contribution to the energy due to the term $\boldsymbol{\sigma}\cdot\mathbf{B}$ can be written as $-\boldsymbol{\mu}\cdot\mathbf{B}$ with a magnetic moment $\boldsymbol{\mu}$ given by
$$\boldsymbol{\mu} = \frac{e}{2m}\,\boldsymbol{\sigma} = \frac{e}{m}\,\mathbf{S} , \tag{3.186}$$
where $\mathbf{S} = \boldsymbol{\sigma}/2$ is the spin of the electron. In nonrelativistic mechanics, a charged particle with charge $e$ and angular momentum $\mathbf{L}$ has a magnetic moment
$$\boldsymbol{\mu} = \frac{e}{2m}\,\mathbf{L} . \tag{3.187}$$
It is then customary to write the magnetic moment due to the spin as
$$\boldsymbol{\mu} = \frac{ge}{2m}\,\mathbf{S} , \tag{3.188}$$
where $g$ is called the gyromagnetic ratio, and we see that the Dirac equation predicts $g = 2$ while nonrelativistic physics erroneously suggests $g = 1$. The present experimental value is (see the Introduction) $(g - 2)/2 = 0.001\,159\,652\,187(4)$, and the deviations from $g = 2$ come from loop corrections that we will discuss in Chapter 7 and, in detail, in Solved Problem 7.2.
3.7
Solved problems
Problem 3.1. The fine structure of the hydrogen atom
In this problem we use the Dirac equation to compute the fine structure of the hydrogen atom. In this case the external potential $A_\mu$ is just the Coulomb potential of the nucleus of charge $-Ze$ (with $Z = 1$ for hydrogen, but it takes no effort to keep $Z$ generic; recall also that $e < 0$), therefore $\mathbf{A} = 0$ and $A_0 = -Ze/(4\pi r)$. The Coulomb potential is $V(r) = eA_0 = -Z\alpha/r$. The Dirac equation in the standard representation becomes
$$(i\partial_0 - V - m)\phi = -i\boldsymbol{\sigma}\cdot\nabla\chi , \tag{3.189}$$
$$(i\partial_0 - V + m)\chi = -i\boldsymbol{\sigma}\cdot\nabla\phi . \tag{3.190}$$
We look for a solution $\phi(\mathbf{x}, t) = e^{-iEt}\phi(\mathbf{x})$, $\chi(\mathbf{x}, t) = e^{-iEt}\chi(\mathbf{x})$ and we define $\varepsilon = E - m$. Then
$$(\varepsilon - V)\phi = -i\boldsymbol{\sigma}\cdot\nabla\chi , \tag{3.191}$$
$$(2m + \varepsilon - V)\chi = -i\boldsymbol{\sigma}\cdot\nabla\phi . \tag{3.192}$$
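We now want to perform an expansion in powers of $p^2/m^2$ of the Dirac equation, keeping corrections $O(p^2/m^2)$ to the kinetic term $p^2/(2m)$ and to the potential $V$, i.e. we want to keep terms up to $O(p^4/m^3)$ and $O(Vp^2/m^2)$. Equation (3.192) allows us to eliminate $\chi$ using
$$\chi = \frac{1}{2m + \varepsilon - V}\,(-i\boldsymbol{\sigma}\cdot\nabla\phi) = \frac{1}{2m}\left(1 + \frac{\varepsilon - V}{2m}\right)^{-1}\boldsymbol{\sigma}\cdot\mathbf{p}\,\phi \simeq \frac{1}{2m}\left(1 - \frac{\varepsilon - V}{2m}\right)\boldsymbol{\sigma}\cdot\mathbf{p}\,\phi . \tag{3.193}$$
We can then obtain an equation of the Schrödinger type for the two-component spinor $\phi$. In order to make contact with a Schrödinger equation, however, we must also ensure that the wave function that appears in the Schrödinger equation is properly normalized. To this purpose, we observe that the total charge, in units of $e$, is given by eq. (3.172),
$$Q = \int d^3x\, \Psi^\dagger\Psi = \int d^3x \left[|\phi|^2 + |\chi|^2\right] . \tag{3.194}$$
Observe that, in first quantization, $Q$ is positive definite. This will not be true in second quantization, where $Q$ will be the number of electrons minus the number of positrons, as we will see in eq. (4.43). In the first-quantized formalism, we require that the wave function expresses the condition that there is one electron in a volume $V$ (with $V \to \infty$). We therefore define a Schrödinger wave function $\phi_S$ (again a two-component spinor) requiring that
$$\int_V d^3x\, |\phi_S|^2 = \int_V d^3x \left[|\phi|^2 + |\chi|^2\right] . \tag{3.195}$$
Since $\chi = O(\frac{p}{m}\phi)$, at zeroth order $\phi_S = \phi$. However, we want to substitute eq. (3.193) into eq. (3.191) keeping the first-order correction, so for consistency we must use $\chi \simeq (1/2m)(-i\boldsymbol{\sigma}\cdot\nabla)\phi$ in eq. (3.195). Then to this order
$$\begin{aligned}
\int_V d^3x\, |\phi_S|^2 &= \int_V d^3x \left[|\phi|^2 + \frac{1}{4m^2}(\boldsymbol{\sigma}\cdot\nabla\phi^*)(\boldsymbol{\sigma}\cdot\nabla\phi)\right] \\
&= \int_V d^3x \left[|\phi|^2 - \frac{1}{4m^2}\,\phi^*(\boldsymbol{\sigma}\cdot\nabla)(\boldsymbol{\sigma}\cdot\nabla)\phi\right] \\
&= \int_V d^3x \left[|\phi|^2 - \frac{1}{4m^2}\,\phi^*\nabla^2\phi\right] \\
&= \int_V d^3x\, \phi^*\left(1 + \frac{p^2}{4m^2}\right)\phi .
\end{aligned} \tag{3.196}$$
Therefore
$$\phi_S = \left(1 + \frac{p^2}{8m^2} + O\!\left(\frac{p^4}{m^4}\right)\right)\phi \tag{3.197}$$
and
$$\phi = \left(1 - \frac{p^2}{8m^2} + O\!\left(\frac{p^4}{m^4}\right)\right)\phi_S . \tag{3.198}$$
In terms of $\phi_S$, keeping only the first-order corrections, eq. (3.193) reads
$$\chi \simeq \frac{1}{2m}\left(1 - \frac{\varepsilon - V}{2m}\right)\boldsymbol{\sigma}\cdot\mathbf{p}\left(1 - \frac{p^2}{8m^2}\right)\phi_S \simeq \frac{1}{2m}\left[\boldsymbol{\sigma}\cdot\mathbf{p}\left(1 - \frac{p^2}{8m^2}\right) - \frac{\varepsilon - V}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{p}\right]\phi_S . \tag{3.199}$$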
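We now substitute eqs. (3.199) and (3.198) into eq. (3.191). Performing some simple algebra (and paying attention to the fact that $\mathbf{p} = -i\nabla$ does not commute with the potential $V(r)$!) we get
$$\left[\varepsilon - V - \frac{p^2}{2m} + \varepsilon\frac{p^2}{8m^2} + \frac{p^4}{16m^3} + \frac{Vp^2}{8m^2} - \frac{1}{4m^2}(\boldsymbol{\sigma}\cdot\mathbf{p})V(\boldsymbol{\sigma}\cdot\mathbf{p})\right]\phi_S = 0 . \tag{3.200}$$
At lowest order, we have of course $\varepsilon\phi_S \simeq (\frac{p^2}{2m} + V)\phi_S$. Therefore the term $\varepsilon\frac{p^2}{8m^2}$ in the above equation can be rewritten as
$$\varepsilon\frac{p^2}{8m^2} = \frac{p^2}{8m^2}\,\varepsilon \simeq \frac{p^2}{8m^2}\left(\frac{p^2}{2m} + V\right) . \tag{3.201}$$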
Of course $\varepsilon$ is a c-number and we can write it either to the left or to the right of $p^2$. When we substitute it with $p^2/(2m) + V$ the difference between writing it to the left or to the right is $O(p^6/m^4)$ and therefore can be neglected, at the order at which we are working. Equation (3.200) then becomes
$$\varepsilon\phi_S = \left[\frac{p^2}{2m} + V - \frac{p^4}{8m^3} + \frac{1}{4m^2}\left((\boldsymbol{\sigma}\cdot\mathbf{p})V(\boldsymbol{\sigma}\cdot\mathbf{p}) - \frac12 (p^2V + Vp^2)\right)\right]\phi_S . \tag{3.202}$$
The correction term involving the potential can be rewritten in a more transparent form using the identity $\sigma^i\sigma^j = \delta^{ij} + i\epsilon^{ijk}\sigma^k$, together with
$$[p^i, V] = -i(\nabla^i V) = ieE^i \tag{3.203}$$
(where $\mathbf{E}$ is the electric field) and
$$p^i V p^i = (Vp^i + [p^i, V])p^i = Vp^2 + ie\mathbf{E}\cdot\mathbf{p} . \tag{3.204}$$
Then
$$\begin{aligned}
\sigma^i\sigma^j p^i V p^j - \frac12 (p^2V + Vp^2) &= p^i V p^i + i\epsilon^{ijk}\sigma^k p^i V p^j - \frac12 (p^2V + Vp^2) \\
&= Vp^2 + ie\mathbf{E}\cdot\mathbf{p} - \frac12 p^2V - \frac12 Vp^2 + i\epsilon^{ijk}\sigma^k([p^i, V] + Vp^i)p^j \\
&= ie\mathbf{E}\cdot\mathbf{p} + \frac12 (Vp^2 - p^2V) - e\,\epsilon^{ijk}E^i p^j\sigma^k .
\end{aligned} \tag{3.205}$$
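(In the last line we used the fact that $\epsilon^{ijk}Vp^ip^j = 0$ because $p^ip^j$ is a symmetric tensor; note that this could not be used directly on $\epsilon^{ijk}p^iVp^j$ because $p^i$ and $V$ do not commute.) Using eq. (3.203) it is easy to see that
$$Vp^2 - p^2V = -ie(\mathbf{E}\cdot\mathbf{p} + \mathbf{p}\cdot\mathbf{E}) \tag{3.206}$$
(again, one has to be careful since $\mathbf{E}$ and $\mathbf{p}$ do not commute!). Inserting this into eq. (3.205) we find
$$\begin{aligned}
\sigma^i\sigma^j p^i V p^j - \frac12 (p^2V + Vp^2) &= ie\mathbf{E}\cdot\mathbf{p} - \frac{ie}{2}(\mathbf{E}\cdot\mathbf{p} + \mathbf{p}\cdot\mathbf{E}) - e\,\epsilon^{ijk}E^i p^j\sigma^k \\
&= \frac{ie}{2}(\mathbf{E}\cdot\mathbf{p} - \mathbf{p}\cdot\mathbf{E}) - e\,\epsilon^{ijk}E^i p^j\sigma^k \\
&= -\frac{e}{2}(\nabla\cdot\mathbf{E}) - e(\mathbf{E}\times\mathbf{p})\cdot\boldsymbol{\sigma} .
\end{aligned} \tag{3.207}$$
Plugging this into eq. (3.202) we get
$$\varepsilon\phi_S = \left[\frac{p^2}{2m} + V - \frac{p^4}{8m^3} - \frac{e}{4m^2}\,\boldsymbol{\sigma}\cdot(\mathbf{E}\times\mathbf{p}) - \frac{e}{8m^2}(\nabla\cdot\mathbf{E})\right]\phi_S . \tag{3.208}$$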
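All the manipulations that we have performed until now hold for a generic potential $V(\mathbf{x})$. We now use the fact that in the hydrogen atom $V = V(r)$ and therefore
$$e\mathbf{E} = -\nabla V = -\mathbf{r}\left(\frac{1}{r}\frac{dV}{dr}\right) , \tag{3.209}$$
so that
$$-\frac{e}{4m^2}\,\boldsymbol{\sigma}\cdot(\mathbf{E}\times\mathbf{p}) = \frac{1}{2m^2}\frac{1}{r}\frac{dV}{dr}\,\mathbf{S}\cdot(\mathbf{r}\times\mathbf{p}) = \frac{1}{2m^2}\frac{1}{r}\frac{dV}{dr}\,\mathbf{S}\cdot\mathbf{L} , \tag{3.210}$$
where $\mathbf{S} = \boldsymbol{\sigma}/2$ is the spin of the electron and $\mathbf{L}$ is the orbital angular momentum. Therefore, in a radial potential $V(r)$, the first relativistic correction to the Schrödinger equation is given by
$$\varepsilon\phi_S = \left[\frac{p^2}{2m} + V - \frac{p^4}{8m^3} + \frac{1}{2m^2}\frac{1}{r}\frac{dV}{dr}\,\mathbf{S}\cdot\mathbf{L} - \frac{e}{8m^2}(\nabla\cdot\mathbf{E})\right]\phi_S . \tag{3.211}$$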
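The correction term $-p^4/(8m^3)$ is easily understood, since it comes from the expansion of the relativistic expression $E = (\mathbf{p}^2 + m^2)^{1/2}$, with $\varepsilon = E - m$. The term $\sim \mathbf{S}\cdot\mathbf{L}$ is the spin–orbit coupling and the term $\sim \nabla\cdot\mathbf{E}$ is known as the Darwin term. Restricting now to the Coulomb potential
$$V(r) = -\frac{Z\alpha}{r} \tag{3.212}$$
we have
$$\frac{1}{r}\frac{dV}{dr} = \frac{Z\alpha}{r^3} . \tag{3.213}$$
Using
$$\nabla^2\frac{1}{r} = -4\pi\delta^{(3)}(\mathbf{x}) \tag{3.214}$$
(see, e.g. Jackson (1975), Section 1.7 for the proof) we find
$$-e\nabla\cdot\mathbf{E} = +\nabla^2 V = -Z\alpha\nabla^2\frac{1}{r} = 4\pi Z\alpha\,\delta^{(3)}(\mathbf{x}) . \tag{3.215}$$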
We can therefore write
$$\varepsilon\phi_S = (H_0 + H_{\rm pert})\phi_S \tag{3.216}$$
where $H_0 = p^2/(2m) + V$ is the unperturbed Hamiltonian of the hydrogen atom and
$$H_{\rm pert} = -\frac{p^4}{8m^3} + \frac{Z\alpha}{2m^2r^3}\,\mathbf{S}\cdot\mathbf{L} + \frac{\pi Z\alpha}{2m^2}\,\delta^{(3)}(\mathbf{x}) . \tag{3.217}$$
Denoting by $|njl\rangle$ the unperturbed states of the hydrogen atom, to first order in perturbation theory the correction to the energy levels is given by
$$(\Delta E)_{njl} = \langle njl|H_{\rm pert}|njl\rangle . \tag{3.218}$$
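We must therefore compute the following expectation values:
(1) $\langle njl|p^4|njl\rangle$: if $\psi_{njl}$ is a solution of the unperturbed Schrödinger equation, then by definition $(p^2/(2m) + V)\psi_{njl} = \epsilon_n\psi_{njl}$, or
$$\frac{p^2}{2m}\,\psi_{njl} = \left(\epsilon_n + \frac{Z\alpha}{r}\right)\psi_{njl} \tag{3.219}$$
where
$$\epsilon_n = -\frac{mZ^2\alpha^2}{2n^2} \tag{3.220}$$
are the unperturbed energy levels. Therefore
$$\int d^3x\, \psi^*_{njl}\, p^4\, \psi_{njl} = 4m^2\,\langle njl|\left(\epsilon_n + \frac{Z\alpha}{r}\right)^2|njl\rangle . \tag{3.221}$$
For a Coulomb potential $V = -Z\alpha/r$ one has
$$\langle njl|\frac{1}{r}|njl\rangle = \frac{m\alpha Z}{n^2} , \qquad \langle njl|\frac{1}{r^2}|njl\rangle = \frac{(m\alpha Z)^2}{n^3(l + \frac12)} \tag{3.222}$$
and therefore
$$\langle njl|p^4|njl\rangle = 4(mZ\alpha)^4\left(-\frac{3}{4n^4} + \frac{1}{n^3(l + \frac12)}\right) . \tag{3.223}$$
(2) $\langle njl|\mathbf{S}\cdot\mathbf{L}/r^3|njl\rangle$: from $\mathbf{J} = \mathbf{L} + \mathbf{S}$ it follows that
$$j(j + 1) = l(l + 1) + s(s + 1) + 2\,\mathbf{S}\cdot\mathbf{L} \tag{3.224}$$
with $s = 1/2$, and using the wave function of the hydrogen atom one has
$$\langle njl|\frac{1}{r^3}|njl\rangle = \frac{(m\alpha Z)^3}{n^3\, l(l + \frac12)(l + 1)} , \qquad \text{if } l \neq 0 , \tag{3.225}$$
while for $l = 0$ the spin–orbit term does not contribute, since $\mathbf{S}\cdot\mathbf{L}$ vanishes on these states. Therefore
$$\langle njl|\frac{1}{r^3}\,\mathbf{S}\cdot\mathbf{L}|njl\rangle = (1 - \delta_{l,0})\,\frac{(m\alpha Z)^3}{2n^3\, l(l + \frac12)(l + 1)}\left[j(j + 1) - l(l + 1) - \frac34\right] . \tag{3.226}$$
(3) $\langle njl|\delta^{(3)}(\mathbf{x})|njl\rangle$: this is easily computed:
$$\langle njl|\delta^{(3)}(\mathbf{x})|njl\rangle = \int d^3x\, |\psi_{njl}(\mathbf{x})|^2\,\delta^{(3)}(\mathbf{x}) = |\psi_{njl}(0)|^2 = \frac{(m\alpha Z)^3}{\pi n^3}\,\delta_{l,0} . \tag{3.227}$$
Putting all contributions together and considering the two cases $j = l \pm 1/2$ when $l \neq 0$, and $j = 1/2$ when $l = 0$, we find that the result can always be expressed only in terms of $n$, $j$, and there is no separate dependence on $l$. The final result is
$$(\Delta E)_{njl} = -\frac{m(Z\alpha)^4}{2n^3}\left[\frac{1}{j + \frac12} - \frac{3}{4n}\right] . \tag{3.228}$$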
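The statement that the three contributions combine into a function of $n$ and $j$ only can be verified with exact rational arithmetic. The sketch below is illustrative rather than part of the text: the function names are hypothetical, energies are expressed in units of $m(Z\alpha)^4$, and the numerical values of the electron mass and of $\alpha$ used at the end are rounded standard values supplied here as assumptions.

```python
from fractions import Fraction


def fine_structure_shift(n, l, two_j):
    """First-order shift <H_pert> in units of m (Z alpha)^4, as an exact Fraction.

    two_j is 2j (an odd integer), so that j = l +/- 1/2 is represented exactly.
    """
    j = Fraction(two_j, 2)
    half = Fraction(1, 2)
    # -<p^4>/(8 m^3), with <p^4> taken from eq. (3.223)
    kinetic = -Fraction(1, 8) * 4 * (
        Fraction(-3, 4 * n**4) + 1 / (n**3 * (l + half)))
    # (Z alpha / 2 m^2) <S.L / r^3>, from eq. (3.226); no contribution for l = 0
    if l == 0:
        spin_orbit = Fraction(0)
    else:
        s_dot_l = (j * (j + 1) - l * (l + 1) - Fraction(3, 4)) / 2
        spin_orbit = half * s_dot_l / (n**3 * l * (l + half) * (l + 1))
    # (pi Z alpha / 2 m^2) <delta^3(x)>, from eq. (3.227); only l = 0 contributes
    darwin = half / n**3 if l == 0 else Fraction(0)
    return kinetic + spin_orbit + darwin


def closed_form(n, two_j):
    """Eq. (3.228) in the same units."""
    j = Fraction(two_j, 2)
    return -Fraction(1, 2 * n**3) * (1 / (j + Fraction(1, 2)) - Fraction(3, 4 * n))


# Assumed numerical inputs (rounded): electron mass in eV and fine-structure constant.
ELECTRON_MASS_EV = 510998.95
ALPHA = 1 / 137.035999

# Separation between 2P_{3/2} and 2P_{1/2} for Z = 1, in eV.
splitting_ev = float(
    fine_structure_shift(2, 1, 3) - fine_structure_shift(2, 1, 1)
) * ELECTRON_MASS_EV * ALPHA**4
```

With these inputs `fine_structure_shift(2, 0, 1)` and `fine_structure_shift(2, 1, 1)` coincide, which is the $2S_{1/2}$–$2P_{1/2}$ degeneracy, and `splitting_ev` corresponds to $m\alpha^4/32$.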
Therefore the fine structure removes the degeneracy between states with the same principal quantum number $n$ but different values of $j$. However, states with the same $n$, $j$ and different $l$, such as the states $2S_{1/2}$ and $2P_{1/2}$, are still degenerate at the level of the Dirac equation, i.e. at the level of the first relativistic correction. In principle one might look for higher-order corrections coming from the Dirac equation, using perturbation theory with respect to $H_{\rm pert}$ at higher orders (indeed, it is even possible to find a closed form for the energy levels predicted by the Dirac equation to all orders in $\alpha$), but physically this is not meaningful since, starting from the next order, the corrections due to the quantum nature of the electromagnetic field come into play, and the correct framework for computing these corrections is quantum electrodynamics, rather than the Dirac equation where $A_\mu$ is treated as an external, given, classical field.
The structure of the energy levels of the hydrogen atom, including the fine structure correction (3.228), is shown in Fig. 3.1. For instance, the separation between the states $2P_{3/2}$ and $2P_{1/2}$ of hydrogen is, from eq. (3.228),
$$E_{2P_{3/2}} - E_{2P_{1/2}} = -\frac{m\alpha^4}{16}\,\frac18 + \frac{m\alpha^4}{16}\,\frac58 = \frac{m\alpha^4}{32} \simeq 4.53\times10^{-5}\ {\rm eV} , \tag{3.229}$$
corresponding to a frequency $f = \omega/(2\pi) \simeq 10.9$ GHz, in the domain of microwaves. Actually, the levels $2S_{1/2}$ and $2P_{1/2}$ are not exactly degenerate, as predicted by eq. (3.228), but rather have a splitting
$$E_{2S_{1/2}} - E_{2P_{1/2}} \simeq 1057\ {\rm MHz} \tag{3.230}$$
known as the Lamb shift. The explanation for this splitting was, historically, one of the first successes of QED. At a comparable level, we find the hyperfine structure, due to the interaction between the spin of the nucleus and the spin of the electron. Each level then splits into a triplet and a singlet and, for instance,
$$E_{1S_{1/2},{\rm triplet}} - E_{1S_{1/2},{\rm singlet}} \simeq 1420.4\ {\rm MHz} . \tag{3.231}$$
The corresponding wavelength is $\lambda = c/f \simeq 21.105$ cm, in the radio waves. This line is of great importance in astrophysics for investigating the presence of neutral hydrogen in our and in other galaxies because radio waves, compared to most other wavelengths, are much less affected by absorption in the interstellar medium, and propagate to a very large distance.
Fig. 3.1 The lowest lying energy levels of the hydrogen atom. Note that the figure is not to scale. In reality, the fine structure splittings are smaller by a factor $\sim 10^{-5}$ compared to the separation between the levels $2S_{1/2}$ and $1S_{1/2}$, and the Lamb shift and the hyperfine structure are smaller by a factor $\sim 10$ compared to the fine structure.
Problem 3.2. Relativistic energy levels in a magnetic field
We consider now an electron in a magnetic field $\mathbf{B} = B\hat{\mathbf{z}}$ (we take $B > 0$). For the gauge field we can take $A_0 = A_x = A_z = 0$ and $A_y = Bx$. It is even technically simpler to solve the Dirac equation in this external magnetic field exactly, rather than performing a nonrelativistic expansion. However, one should recall that only the first-order nonrelativistic correction is really correct, and at higher orders effects from quantum field theory come into play. We write the Dirac equation in the standard representation as
$$(i\partial_0 - m)\phi = \boldsymbol{\sigma}\cdot(\mathbf{p} - e\mathbf{A})\chi \tag{3.232}$$
$$(i\partial_0 + m)\chi = \boldsymbol{\sigma}\cdot(\mathbf{p} - e\mathbf{A})\phi \tag{3.233}$$
where, as usual, $\mathbf{p} = -i\nabla$. We look for a solution of the form
$$\phi(x) = \phi(\mathbf{x})e^{-iEt} , \qquad \chi(x) = \chi(\mathbf{x})e^{-iEt} . \tag{3.234}$$
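Then eqs. (3.232) and (3.233) become
$$(E - m)\phi(\mathbf{x}) = \boldsymbol{\sigma}\cdot(\mathbf{p} - e\mathbf{A})\chi(\mathbf{x}) \tag{3.235}$$
$$(E + m)\chi(\mathbf{x}) = \boldsymbol{\sigma}\cdot(\mathbf{p} - e\mathbf{A})\phi(\mathbf{x}) \tag{3.236}$$
15 Observe that with our choice of $\mathbf{A}$ we have $[p^i, A^i] = 0$ and therefore in $(\mathbf{p} - e\mathbf{A})^2$ we do not have to be careful about ordering. Otherwise the mixed term in $(\mathbf{p} - e\mathbf{A})^2$ is $-e(\mathbf{p}\cdot\mathbf{A} + \mathbf{A}\cdot\mathbf{p})$.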
Substituting $\chi(\mathbf{x})$ from eq. (3.236) into eq. (3.235) and performing basically the same manipulations as in Section 3.6, we find15
$$(E^2 - m^2)\phi(\mathbf{x}) = \left[(\mathbf{p} - e\mathbf{A})^2 - e\boldsymbol{\sigma}\cdot\mathbf{B}\right]\phi(\mathbf{x}) = \left[\mathbf{p}^2 + e^2B^2x^2 - 2ep_yBx - e\sigma_zB\right]\phi(\mathbf{x}) . \tag{3.237}$$
Since $p_y$, $p_z$ commute with $x$, we can search for a solution of the form
$$\phi(\mathbf{x}) = e^{i(p_yy + p_zz)}f(x) , \tag{3.238}$$
where $p_y$ and $p_z$ are c-numbers and $f(x)$, as $\phi(\mathbf{x})$, is a two-component spinor. The equation for $f(x)$ becomes
$$\left[-\frac{d^2}{dx^2} + (p_y - eBx)^2 - eB\sigma_z\right]f(x) = (E^2 - m^2 - p_z^2)f(x) . \tag{3.239}$$
We take $f(x)$ to be an eigenfunction of $\sigma_z$ with eigenvalue $\sigma = \pm1$, $\sigma_zf = \sigma f$. Then
$$\left[-\frac{d^2}{dx^2} + \frac12(2e^2B^2)\left(x - \frac{p_y}{eB}\right)^2\right]f(x) = (E^2 - m^2 - p_z^2 + eB\sigma)f(x) . \tag{3.240}$$
This is formally identical to the Schrödinger equation of a harmonic oscillator with frequency $2|e|B$. The energy levels therefore are given by
$$E^2 - m^2 - p_z^2 + eB\sigma = \left(n + \frac12\right)2|e|B , \tag{3.241}$$
or (using $e = -|e|$)
$$E(n, p_z, \sigma) = \left[m^2 + p_z^2 + (2n + 1 + \sigma)|e|B\right]^{1/2} . \tag{3.242}$$
We observe that there is a continuous degeneracy in $p_y$, on which the energy does not depend, as well as a discrete degeneracy $E(n, p_z, \sigma = +1) = E(n + 1, p_z, \sigma = -1)$. In the nonrelativistic limit $p_z^2 \ll m^2$, $(2n + 1)|e|B \ll m^2$, the expansion of eq. (3.242) gives
$$E(n, p_z, \sigma) \simeq m + \frac{p_z^2}{2m} + \left(n + \frac{1 + \sigma}{2}\right)\omega_B \tag{3.243}$$
with $\omega_B = |e|B/m$, and we recover the Landau levels of nonrelativistic quantum mechanics.
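As a quick numerical cross-check of eqs. (3.242) and (3.243), the following sketch evaluates both expressions in natural units. The function names and the convention of passing the product $|e|B$ as a single parameter `abs_e_b` are illustrative assumptions, not notation from the text.

```python
import math


def dirac_landau_energy(n, p_z, sigma, m, abs_e_b):
    """Eq. (3.242): E = sqrt(m^2 + p_z^2 + (2n + 1 + sigma) |e|B)."""
    return math.sqrt(m**2 + p_z**2 + (2 * n + 1 + sigma) * abs_e_b)


def nonrelativistic_energy(n, p_z, sigma, m, abs_e_b):
    """Eq. (3.243): m + p_z^2/(2m) + (n + (1 + sigma)/2) omega_B."""
    omega_b = abs_e_b / m
    return m + p_z**2 / (2 * m) + (n + (1 + sigma) / 2) * omega_b


# Discrete degeneracy: level n with sigma = +1 matches level n + 1 with sigma = -1.
degenerate_pair = (
    dirac_landau_energy(0, 0.3, +1, 1.0, 0.2),
    dirac_landau_energy(1, 0.3, -1, 1.0, 0.2),
)

# Weak field and small p_z: the exact and the expanded energies nearly agree.
exact = dirac_landau_energy(2, 1e-3, +1, 1.0, 1e-6)
expanded = nonrelativistic_energy(2, 1e-3, +1, 1.0, 1e-6)
```

The difference `exact - expanded` is of second order in the small quantities $p_z^2/m^2$ and $|e|B/m^2$, as expected for the first neglected term in the expansion of the square root.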
Summary of chapter
• The classical dynamics of a field theory is given by the Euler–Lagrange equation (3.13).
• Noether's theorem states that for any continuous global symmetry there is a current $j^\mu$ which is conserved, i.e. $\partial_\mu j^\mu = 0$. Given the symmetry transformation in the form (3.19, 3.20), the current can be computed using eq. (3.31). Given a conserved current, the charge given in eq. (3.26) is time-independent.
• Invariance under space-time translations leads, via Noether's theorem, to the conservation of the energy–momentum tensor, given in eq. (3.35). The corresponding conserved charges are energy and momentum, see eq. (3.37).
• The kinetic terms of the actions for the scalar or spinor fields are derived from the requirement of Poincaré invariance. This leads to the free Klein–Gordon equation for scalar particles (eq. (3.40)) and to the free Weyl or Dirac equations for spin 1/2 particles, eqs. (3.67) and (3.87). For the vector fields, there is also an issue of gauge invariance; the equations of motion give a pair of Maxwell equations, while the second pair is a consequence of the definition of $F^{\mu\nu}$, see eqs. (3.141) and (3.144).
• There is large freedom in the choice of interaction terms. For instance, we can add a generic potential $V(\phi)$ to the action of a scalar field, eq. (3.58). For the electromagnetic field, we have seen minimal and non-minimal couplings in Section 3.5.4. When we study the quantum theory, we will see that some choices of interaction terms can be of more fundamental significance than others.
• Relativistic wave equations reduce, in the nonrelativistic limit, to a Schrödinger equation plus corrections. We can then treat them in first quantization. We have seen that in this way two remarkable predictions are obtained from the Dirac equation: the gyromagnetic factor of the electron is predicted to be $g = 2$, in contrast with $g = 1$ from classical physics, and the fine structure of the hydrogen atom can be correctly computed. However, higher-order relativistic corrections can only be computed in the framework of second quantization.
Exercises
(3.1) Actions have the same dimensions as $\hbar$, so they are dimensionless in our $\hbar = 1$ units. Find the dimensions of scalar, spinor, and gauge fields in $d = 4$ dimensions. Repeat the analysis in $d$ dimensions, with $S = \int d^dx\, \mathcal{L}$ and $\mathcal{L}$ the same as in the $d = 4$ case.
(3.2) We saw on page 59 that the solution of the massive Dirac equation in the rest frame is $u_L = u_R = \sqrt{m}\,\xi$. Perform a boost on this solution along the $z$ axis and verify that the result is given by eq. (3.103).
(3.3) Consider the KG action with the mass term set to zero,
$$S = \frac12\int d^4x\, \eta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi , \tag{3.244}$$
and a dilatation transformation with parameter $\alpha$,
$$x^\mu \to x'^\mu = e^\alpha x^\mu , \qquad \phi(x) \to \phi'(x') = \phi(x)\exp\{-d_\phi\alpha\} . \tag{3.245}$$
(i) Show that this transformation is a global symmetry, for an appropriate choice of the parameter $d_\phi$. Find the Noether current associated to this symmetry and verify that it is conserved on the equations of motion.
(ii) Show that, if in the KG action we have also a non-vanishing mass term, then the above transformation is not a symmetry. Show that instead a term $V(\phi) = \lambda\phi^4$ does not spoil the dilatation symmetry. What are the dimensions of $\lambda$?
(3.4) (i) Consider the Lagrangian of QED with the electron mass $m$ set to zero, and consider the dilatation $x^\mu \to e^\alpha x^\mu$, $A_\mu \to e^{-d_A\alpha}A_\mu$, $\psi \to e^{-d_\psi\alpha}\psi$. Find the values of $d_A$, $d_\psi$ for which this transformation is a symmetry.
(ii) Compute the conserved current and express it in terms of the energy–momentum tensor of the theory. Verify that the conservation of the dilatation current follows from the fact that the trace of the energy–momentum tensor vanishes in the massless theory.
(iii) Include the electron mass term in the Lagrangian. Compute the dilatation current using eq. (3.31) with the new Lagrangian. Verify that it is not conserved and relate its divergence to the trace of the energy–momentum tensor.
(3.5) Consider the two Dirac Lagrangians
$$\mathcal{L} = \bar\psi(i\not\!\partial - m)\psi , \qquad \mathcal{L}' = \bar\psi\left(\frac{i}{2}\overset{\leftrightarrow}{\not\!\partial} - m\right)\psi . \tag{3.246}$$
Verify that they are classically equivalent. Compute the energy–momentum tensor in the two cases. Verify that they are different, but they give rise to the same conserved charges.
(3.6) Consider a five-dimensional space-time labeled by $(t, \mathbf{x}, y)$ where $(t, \mathbf{x})$ are the usual coordinates of four-dimensional space-time and $y$ is a compact coordinate which parametrizes the extra dimension, with $-R/2 \leq y \leq R/2$. Consider the free KG equation in this space, $(\Box_5 + m^2)\phi = 0$, where $\Box_5 \equiv \Box - \partial^2/\partial y^2$ and $\Box = \partial_0^2 - \nabla^2$ is the usual four-dimensional d'Alembertian. Show that, from the point of view of a four-dimensional observer, this equation describes an infinite set of massive particles, and compute their masses. These particles are known as Kaluza–Klein modes. What do you think is the experimental bound on the size $R$ of the extra dimension?
4
Quantization of free fields
4.1
Scalar fields
4.1.1
Real scalar fields. Fock space
From the basic principles of quantum mechanics we know that, to quantize a classical system with coordinates $q^i$ and momenta $p^i$, in the Schrödinger picture, we promote $q^i$, $p^i$ to operators and we impose the commutation relation $[q^i, p^j] = i\delta^{ij}$. In the Heisenberg picture, where the operators depend on time, the commutation relation is imposed at equal time. The same principle can be applied to a scalar field theory, where the coordinates $q^i(t)$ are replaced by the fields $\phi(t, \mathbf{x})$ while $p^i(t)$ are replaced by the conjugate momenta $\Pi(t, \mathbf{x})$, and we interpret $\mathbf{x}$ as a label that distinguishes the "coordinates" $\phi(t, \mathbf{x})$ of our system. Since $\mathbf{x}$ is a continuous variable, $\delta^{ij}$ must be replaced by a Dirac delta. Thus, the basic principle of canonical quantization is to promote the field $\phi$ and its conjugate momentum to operators, and to impose the equal-time commutation relation
$$[\phi(t, \mathbf{x}), \Pi(t, \mathbf{y})] = i\delta^{(3)}(\mathbf{x} - \mathbf{y}) , \tag{4.1}$$
while, at equal time, we impose $[\phi(t, \mathbf{x}), \phi(t, \mathbf{y})] = [\Pi(t, \mathbf{x}), \Pi(t, \mathbf{y})] = 0$. Furthermore, a real field is promoted to a hermitian operator. Let us first apply this procedure to a free real scalar field. The mode expansion of a free real scalar field is given in eq. (3.42). Promoting the real field $\phi$ to a hermitian operator means promoting $a_{\mathbf{p}}$ to an operator while $a^*_{\mathbf{p}}$ becomes the hermitian conjugate operator $a^\dagger_{\mathbf{p}}$; thus
$$\phi(x) = \int\frac{d^3p}{(2\pi)^3\sqrt{2E_{\mathbf{p}}}}\left(a_{\mathbf{p}}e^{-ipx} + a^\dagger_{\mathbf{p}}e^{ipx}\right) , \tag{4.2}$$
with $p^0 = E_{\mathbf{p}}$. The conjugate momentum is given by eq. (3.43). Using these expressions it is easy to verify that, in terms of $a_{\mathbf{p}}$, $a^\dagger_{\mathbf{p}}$, the commutation relation (4.1) reads
$$[a_{\mathbf{p}}, a^\dagger_{\mathbf{q}}] = (2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{q}) , \tag{4.3}$$
while
$$[a_{\mathbf{p}}, a_{\mathbf{q}}] = 0 , \qquad [a^\dagger_{\mathbf{p}}, a^\dagger_{\mathbf{q}}] = 0 . \tag{4.4}$$
It is sometimes convenient to put the system into a box of size $L$, so that the total volume $V = L^3$ is finite. This procedure regularizes divergences coming from the infinite-volume limit or, equivalently, from the small momentum region, and is an example of an infrared cutoff. In a finite box of size $L$, imposing periodic boundary conditions on the fields, the momenta take the discrete values $p^i = (2\pi/L)n^i$ with $n^i = 0, \pm1, \pm2, \ldots$, and therefore
$$\int d^3p \to \left(\frac{2\pi}{L}\right)^3\sum_{\mathbf{n}} . \tag{4.5}$$
The condition $\int d^3p\, \delta^{(3)}(\mathbf{p} - \mathbf{q}) = 1$ then gives
$$\delta^{(3)}(\mathbf{p} - \mathbf{q}) \to \left(\frac{L}{2\pi}\right)^3\delta_{\mathbf{p},\mathbf{q}} . \tag{4.6}$$
In particular, this implies that
$$(2\pi)^3\delta^{(3)}(\mathbf{p} = 0) \to V . \tag{4.7}$$
Recalling the standard commutation relation of the creation and annihilation operators of a harmonic oscillator, $[a, a^\dagger] = 1$, we see from eq. (4.3) that the commutation relations of the real scalar field are equivalent to those of a collection of harmonic oscillators, with one oscillator for each value of the momentum $\mathbf{p}$ (apart from a normalization factor $1/\sqrt{V}$ in $a_{\mathbf{p}}$ and $a^\dagger_{\mathbf{p}}$). We can now construct the Fock space following the standard procedure for the harmonic oscillator: we interpret $a_{\mathbf{p}}$ as destruction operators and $a^\dagger_{\mathbf{p}}$ as creation operators, and we define a vacuum state $|0\rangle$ as the state annihilated by all destruction operators, so for all $\mathbf{p}$
$$a_{\mathbf{p}}|0\rangle = 0 . \tag{4.8}$$
We normalize the vacuum with $\langle0|0\rangle = 1$. The generic state of the Fock space is obtained acting on the vacuum with the creation operators,
$$|\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle \equiv (2E_{\mathbf{p}_1})^{1/2}\cdots(2E_{\mathbf{p}_n})^{1/2}\, a^\dagger_{\mathbf{p}_1}\cdots a^\dagger_{\mathbf{p}_n}|0\rangle . \tag{4.9}$$
The factors $(2E_{\mathbf{p}})^{1/2}$ are a convenient choice of normalization. In particular, the one-particle states are
$$|\mathbf{p}\rangle = (2E_{\mathbf{p}})^{1/2}a^\dagger_{\mathbf{p}}|0\rangle . \tag{4.10}$$
From the commutation relations and eq. (4.8) we find that
$$\langle\mathbf{p}_1|\mathbf{p}_2\rangle = (2E_{\mathbf{p}_1})^{1/2}(2E_{\mathbf{p}_2})^{1/2}\langle0|a_{\mathbf{p}_1}a^\dagger_{\mathbf{p}_2}|0\rangle = (2E_{\mathbf{p}_1})^{1/2}(2E_{\mathbf{p}_2})^{1/2}\langle0|[a_{\mathbf{p}_1}, a^\dagger_{\mathbf{p}_2}]|0\rangle = 2E_{\mathbf{p}_1}(2\pi)^3\delta^{(3)}(\mathbf{p}_1 - \mathbf{p}_2) . \tag{4.11}$$
The factors $(2E_{\mathbf{p}})^{1/2}$ in eq. (4.10) have been chosen so that in the above scalar product the combination $E_{\mathbf{p}}\delta^{(3)}(\mathbf{p} - \mathbf{q})$ appears, which is Lorentz invariant (see Exercise 4.2).
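We now compute the energy of these states. The Hamiltonian can be written in terms of $a_{\mathbf{p}}$, $a^\dagger_{\mathbf{p}}$ substituting eq. (4.2) and the corresponding expression for $\Pi$ into the Hamiltonian density (3.44), and integrating over $d^3x$; one finds
$$H = \int\frac{d^3p}{(2\pi)^3}\,\frac{E_{\mathbf{p}}}{2}\left(a^\dagger_{\mathbf{p}}a_{\mathbf{p}} + a_{\mathbf{p}}a^\dagger_{\mathbf{p}}\right) = \int\frac{d^3p}{(2\pi)^3}\,E_{\mathbf{p}}\left(a^\dagger_{\mathbf{p}}a_{\mathbf{p}} + \frac12[a_{\mathbf{p}}, a^\dagger_{\mathbf{p}}]\right) . \tag{4.12}$$
The second term is the sum of the zero-point energy of all oscillators, and it is proportional to $(2\pi)^3\delta^{(3)}(0)$. In a finite volume we see from eq. (4.7) that $(2\pi)^3\delta^{(3)}(0) \to V$. The zero-point energy is therefore
$$E_{\rm vac} = \frac12 V\int\frac{d^3p}{(2\pi)^3}\,E_{\mathbf{p}} , \tag{4.13}$$
and the energy density of the vacuum is
$$\rho_{\rm vac} \equiv \frac{1}{V}E_{\rm vac} = \frac12\int\frac{d^3p}{(2\pi)^3}\,E_{\mathbf{p}} . \tag{4.14}$$
For large $|\mathbf{p}|$, $E_{\mathbf{p}} = \sqrt{\mathbf{p}^2 + m^2} \simeq |\mathbf{p}|$ and the integral diverges. We can regulate the divergence putting a cutoff $\Lambda$ in the integration over large momenta, so that we integrate only over $|\mathbf{p}| < \Lambda$. This is an example of an ultraviolet cutoff. The vacuum energy density then diverges as
$$\rho_{\rm vac} \sim \int^\Lambda p^3\,dp \sim \Lambda^4 . \tag{4.15}$$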
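The quartic growth in eq. (4.15) can be seen numerically. After the angular integration, eq. (4.14) with a cutoff becomes $\rho_{\rm vac} = \frac{1}{4\pi^2}\int_0^\Lambda p^2\sqrt{p^2 + m^2}\,dp$. The sketch below evaluates this radial integral with Simpson's rule; the function name, the choice of integration rule and the number of steps are illustrative assumptions.

```python
import math


def vacuum_energy_density(cutoff, m, steps=2000):
    """(1/(4 pi^2)) * integral_0^cutoff of p^2 sqrt(p^2 + m^2) dp (Simpson's rule)."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = cutoff / steps

    def integrand(p):
        return p * p * math.sqrt(p * p + m * m)

    total = integrand(0.0) + integrand(cutoff)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3 / (4 * math.pi**2)


# Doubling the cutoff at fixed mass multiplies the result by roughly 2^4 = 16.
ratio = vacuum_energy_density(200.0, 1.0) / vacuum_energy_density(100.0, 1.0)
```

For $m = 0$ the integral is exactly $\Lambda^4/(16\pi^2)$, and for $\Lambda \gg m$ the mass only gives subleading corrections of relative order $m^2/\Lambda^2$.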
This is our first encounter with an ultraviolet divergence. In quantum field theory we will get used to divergences and we will see under what conditions these can be cured. In this case, however, the divergence apparently is relatively harmless. Since what we measure are energy differences, we can simply discard this zero-point energy1 and declare that our Hamiltonian is
$$H = \int\frac{d^3p}{(2\pi)^3}\,E_{\mathbf{p}}\,a^\dagger_{\mathbf{p}}a_{\mathbf{p}} . \tag{4.16}$$
We can formalize this statement introducing the concept of normal ordering: given an operator $O$, its normal ordered form, denoted by $:O:$, is obtained writing by hand all creation operators to the left of all destruction operators. Thus, for instance, $:a_{\mathbf{p}}a^\dagger_{\mathbf{p}}: = a^\dagger_{\mathbf{p}}a_{\mathbf{p}}$ and we can say that the quantum Hamiltonian (4.16) is obtained from the classical expression (3.44) promoting $\phi$ to an operator and performing the normal ordering,
$$H = \frac12\int d^3x\, :\Pi^2 + (\nabla\phi)^2 + m^2\phi^2: . \tag{4.17}$$
We can now compute the energy of the various states of the Fock space. The vacuum state $|0\rangle$ now, by definition, has zero energy. The operator $a^\dagger_{\mathbf{p}}a_{\mathbf{p}}$ is just the number operator of the oscillator labeled by $\mathbf{p}$ and therefore the energy of the generic state (4.9) is given by the sum of the energies $E_{\mathbf{p}_i}$ of the various particles,
$$H|\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle = (E_{\mathbf{p}_1} + \ldots + E_{\mathbf{p}_n})\,|\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle . \tag{4.18}$$
1 This is not true when we include gravity, since any form of energy contributes to the gravitational interaction. In Section 5.7, after having studied the renormalization of field theory, we will come back to this zero-point energy and we will discuss its relation with the cosmological constant problem.
Similarly we can compute the spatial momentum of these states. From the Noether theorem, we know how to write the spatial momentum as the conserved charge associated to spatial translations. For the real scalar field we found it in Section 3.3.1. Performing the normal ordering we have the quantum expression
$$P^i = \int d^3x\, :\theta^{0i}: = \int d^3x\, :\partial_0\phi\,\partial^i\phi: . \tag{4.19}$$
Substituting $\phi$ from eq. (4.2) we see that the terms quadratic in the destruction operators vanish because they are given by an integral over $d^3p$ of the function $p^ia_{-\mathbf{p}}a_{\mathbf{p}}$, which is odd under $\mathbf{p} \to -\mathbf{p}$. Similarly for the terms quadratic in the creation operators, and we are left with2
$$P^i = \int\frac{d^3p}{(2\pi)^3}\,p^i\,a^\dagger_{\mathbf{p}}a_{\mathbf{p}} . \tag{4.20}$$
Therefore the states $a^\dagger_{\mathbf{p}}|0\rangle$ are one-particle states with momentum $\mathbf{p}$, energy $E_{\mathbf{p}} = \sqrt{\mathbf{p}^2 + m^2}$ and mass $m$. The generic state of the Fock space (4.9) is a multiparticle state, and its energy and momentum are the sum of the individual energies and momenta. From the fact that the creation operators commute between themselves we see that the multiparticle states (4.9) are symmetric under the exchange of any two particles, and therefore obey Bose–Einstein statistics. This is an example of the spin-statistics theorem, which states that particles with integer spin are bosons and particles with half-integer spin are fermions.
Finally, we can examine the angular momentum of these states. From the Noether theorem we found that for scalar fields the angular momentum operator has a part interpreted as orbital angular momentum, eq. (3.50), and that there is no intrinsic spin part. Therefore the quanta of the scalar field are spin-0 particles.
2 Actually, for the momentum operator it is not really necessary to perform the normal ordering, since the terms that come out from the commutators are odd under $\mathbf{p} \to -\mathbf{p}$ and cancel when we integrate over $d^3p$.
4.1.2
Complex scalar ﬁeld; antiparticles
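We now consider a free complex scalar field. Eq. (3.60) becomes
$$\phi(x) = \int\frac{d^3p}{(2\pi)^3\sqrt{2E_{\mathbf{p}}}}\left(a_{\mathbf{p}}e^{-ipx} + b^\dagger_{\mathbf{p}}e^{ipx}\right) \tag{4.21}$$
and the complex conjugate field $\phi^*$ becomes the hermitian conjugate operator,
$$\phi^\dagger(x) = \int\frac{d^3p}{(2\pi)^3\sqrt{2E_{\mathbf{p}}}}\left(a^\dagger_{\mathbf{p}}e^{ipx} + b_{\mathbf{p}}e^{-ipx}\right) . \tag{4.22}$$
Imposing the canonical commutation relation (4.1) gives
$$[a_{\mathbf{p}}, a^\dagger_{\mathbf{q}}] = [b_{\mathbf{p}}, b^\dagger_{\mathbf{q}}] = (2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{q}) , \tag{4.23}$$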
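while all other commutators $[a, a]$, $[a^\dagger, a^\dagger]$, $[b, b]$, $[b^\dagger, b^\dagger]$ and all commutators between the $a$, $a^\dagger$ and $b$, $b^\dagger$ are equal to zero. The Fock space is constructed defining the vacuum state as the state annihilated by all $a_{\mathbf{p}}$ and $b_{\mathbf{p}}$, for each $\mathbf{p}$,
$$a_{\mathbf{p}}|0\rangle = b_{\mathbf{p}}|0\rangle = 0 . \tag{4.24}$$
Acting with $a^\dagger_{\mathbf{p}}$, $b^\dagger_{\mathbf{p}}$ we generate the Fock space. After normal ordering, the Hamiltonian and spatial momentum are given by
$$H = \int\frac{d^3p}{(2\pi)^3}\,E_{\mathbf{p}}\left(a^\dagger_{\mathbf{p}}a_{\mathbf{p}} + b^\dagger_{\mathbf{p}}b_{\mathbf{p}}\right) , \tag{4.25}$$
$$P^i = \int\frac{d^3p}{(2\pi)^3}\,p^i\left(a^\dagger_{\mathbf{p}}a_{\mathbf{p}} + b^\dagger_{\mathbf{p}}b_{\mathbf{p}}\right) . \tag{4.26}$$
We see that the quanta of a complex scalar are given by two different particle species with the same mass, created by the $a^\dagger_{\mathbf{p}}$ and $b^\dagger_{\mathbf{p}}$ operators respectively. The $U(1)$ charge is given in eq. (3.63). We compute it explicitly as a prototype of many similar calculations in this section,
$$\begin{aligned}
Q_{U(1)} &= i\int d^3x\, \phi^\dagger\overset{\leftrightarrow}{\partial_0}\phi = i\int d^3x\int\frac{d^3q}{(2\pi)^3\sqrt{2E_{\mathbf{q}}}}\,\frac{d^3p}{(2\pi)^3\sqrt{2E_{\mathbf{p}}}} \\
&\quad\times\Big[\left(a^\dagger_{\mathbf{q}}e^{iqx} + b_{\mathbf{q}}e^{-iqx}\right)\partial_0\left(a_{\mathbf{p}}e^{-ipx} + b^\dagger_{\mathbf{p}}e^{ipx}\right) - \left(\partial_0\left(a^\dagger_{\mathbf{q}}e^{iqx} + b_{\mathbf{q}}e^{-iqx}\right)\right)\left(a_{\mathbf{p}}e^{-ipx} + b^\dagger_{\mathbf{p}}e^{ipx}\right)\Big] \\
&= \int d^3x\int\frac{d^3q}{(2\pi)^3\sqrt{2E_{\mathbf{q}}}}\,\frac{d^3p}{(2\pi)^3\sqrt{2E_{\mathbf{p}}}} \\
&\quad\times\Big[\left(a^\dagger_{\mathbf{q}}e^{iqx} + b_{\mathbf{q}}e^{-iqx}\right)E_{\mathbf{p}}\left(a_{\mathbf{p}}e^{-ipx} - b^\dagger_{\mathbf{p}}e^{ipx}\right) + E_{\mathbf{q}}\left(a^\dagger_{\mathbf{q}}e^{iqx} - b_{\mathbf{q}}e^{-iqx}\right)\left(a_{\mathbf{p}}e^{-ipx} + b^\dagger_{\mathbf{p}}e^{ipx}\right)\Big] .
\end{aligned} \tag{4.27}$$
The integration over $d^3x$ produces $(2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{q})$ on the terms $\sim a^\dagger_{\mathbf{q}}a_{\mathbf{p}}$ and $b_{\mathbf{q}}b^\dagger_{\mathbf{p}}$, and $(2\pi)^3\delta^{(3)}(\mathbf{p} + \mathbf{q})$ on the terms $\sim b_{\mathbf{q}}a_{\mathbf{p}}$ and $a^\dagger_{\mathbf{q}}b^\dagger_{\mathbf{p}}$; in both cases $|\mathbf{p}| = |\mathbf{q}|$ and therefore $E_{\mathbf{q}} = E_{\mathbf{p}}$. Using this fact it is straightforward to find that the terms $\sim b_{\mathbf{q}}a_{\mathbf{p}}$ and $a^\dagger_{\mathbf{q}}b^\dagger_{\mathbf{p}}$ cancel and we are left with the terms $a^\dagger_{\mathbf{q}}a_{\mathbf{p}}$ and $b_{\mathbf{q}}b^\dagger_{\mathbf{p}}$. In these terms the exponentials are of the type $\exp\{\pm i(q - p)x\} = \exp\{\pm i[(E_{\mathbf{q}} - E_{\mathbf{p}})t - (\mathbf{q} - \mathbf{p})\cdot\mathbf{x}]\}$, and since $E_{\mathbf{q}} = E_{\mathbf{p}}$ the time dependence cancels, as it should for a conserved charge. Therefore
$$Q_{U(1)} = \int\frac{d^3q}{(2\pi)^3\sqrt{2E_{\mathbf{q}}}}\,\frac{d^3p}{(2\pi)^3\sqrt{2E_{\mathbf{p}}}}\,(2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{q})\,2E_{\mathbf{p}}\left(a^\dagger_{\mathbf{q}}a_{\mathbf{p}} - b_{\mathbf{q}}b^\dagger_{\mathbf{p}}\right) = \int\frac{d^3p}{(2\pi)^3}\left(a^\dagger_{\mathbf{p}}a_{\mathbf{p}} - b_{\mathbf{p}}b^\dagger_{\mathbf{p}}\right) . \tag{4.28}$$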
Again we normal order this expression,3 and we obtain
$$Q_{U(1)} = \int\frac{d^3p}{(2\pi)^3}\left(a^\dagger_{\mathbf{p}}a_{\mathbf{p}} - b^\dagger_{\mathbf{p}}b_{\mathbf{p}}\right) . \tag{4.29}$$
Since $a^\dagger a$ is the number operator of a harmonic oscillator, we see that the $U(1)$ charge is equal to the number of quanta created by the operators $a^\dagger_{\mathbf{p}}$ minus the number of quanta created by $b^\dagger_{\mathbf{p}}$, integrated over all momenta. Therefore the states $a^\dagger_{\mathbf{p}}|0\rangle$ and $b^\dagger_{\mathbf{p}}|0\rangle$ represent particles with momentum $\mathbf{p}$, mass $m$, spin zero and opposite charge; $a^\dagger_{\mathbf{p}}|0\rangle$ has $Q_{U(1)} = +1$ while $b^\dagger_{\mathbf{p}}|0\rangle$ has $Q_{U(1)} = -1$ and is called the antiparticle of $a^\dagger_{\mathbf{p}}|0\rangle$.4 We now understand what is the proper interpretation of the negative energy solutions of the KG equation. The coefficient of the positive energy solution $e^{-ipx}$ after quantization becomes the destruction operator of a particle and the coefficient of $e^{ipx}$ becomes the creation operator of its antiparticle. In the case of the real scalar field the reality condition requires $a_{\mathbf{p}} = b_{\mathbf{p}}$ and therefore the particle is its own antiparticle, and it is neutral under any $U(1)$ symmetry.
3 Even if one accepts the normal ordering of the Hamiltonian on the grounds that a vacuum energy is unobservable, one cannot accept the normal ordering of the charge on the same grounds, since a charged vacuum would have observable effects. Rather, in this case one can understand the need for normal ordering observing that the classical expression for the charge involves the product of $\phi^*$ with $\partial_0\phi$, eq. (3.63). When promoted to quantum operators, $\phi^\dagger$ and $\partial_0\phi$ do not commute, and therefore there is an ordering ambiguity already in the starting expression (4.27); for instance, we could write $\phi^\dagger\partial_0\phi$, or $(\partial_0\phi)\phi^\dagger$, or we could take the symmetric combination. The ambiguity is removed requiring that the charge of the vacuum vanishes.
4 Of course the overall sign (and normalization) of the Noether charge are arbitrary, since if a current $j^\mu$ is conserved also $-j^\mu$ is conserved, and it is also an arbitrary convention what state we call a particle and what an antiparticle.
4.2 Spin 1/2 fields
4.2.1 Dirac field
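We start from the Lagrangian (3.89), $\mathcal L = \bar\Psi(i\gamma^\mu\partial_\mu - m)\Psi$. The conjugate momentum is

$$\Pi_\Psi = \frac{\delta\mathcal L}{\delta(\partial_0\Psi)} = i\bar\Psi\gamma^0 = i\Psi^\dagger. \tag{4.30}$$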
A basic principle of quantum field theory is the spin-statistics theorem, which requires that fields with half-integer spin are quantized imposing equal-time anticommutation relations, while fields with integer spin are quantized imposing equal-time commutation relations. We will not discuss this theorem in full generality, but will see below how the need for anticommutators arises in the case of Dirac fields. So we impose

$$\{\Psi_a(\mathbf x,t), \Psi_b^\dagger(\mathbf y,t)\} = \delta^{(3)}(\mathbf x-\mathbf y)\,\delta_{ab}, \tag{4.31}$$
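where $\{\ ,\ \}$ is the anticommutator and $a, b = 1, \ldots, 4$ are the Dirac indices. The expansion of the free Dirac field in plane waves is written

$$\Psi(x) = \int \frac{d^3p}{(2\pi)^3\sqrt{2E_{\mathbf p}}} \sum_{s=1,2}\left(a_{\mathbf p,s}\,u^s(p)\,e^{-ipx} + b^\dagger_{\mathbf p,s}\,v^s(p)\,e^{ipx}\right), \tag{4.32}$$

and therefore

$$\bar\Psi(x) = \int \frac{d^3p}{(2\pi)^3\sqrt{2E_{\mathbf p}}} \sum_{s=1,2}\left(b_{\mathbf p,s}\,\bar v^s(p)\,e^{-ipx} + a^\dagger_{\mathbf p,s}\,\bar u^s(p)\,e^{ipx}\right). \tag{4.33}$$

The wave functions $u^s(p)$, $v^s(p)$ are given in eqs. (3.103) and (3.107). Writing the anticommutation relations (4.31) in terms of the $a$, $b$ operators we find

$$\{a^r_{\mathbf p}, a^{s\dagger}_{\mathbf q}\} = \{b^r_{\mathbf p}, b^{s\dagger}_{\mathbf q}\} = (2\pi)^3\delta^{(3)}(\mathbf p-\mathbf q)\,\delta^{rs}, \tag{4.34}$$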
with all other anticommutators equal to zero. The Fock space is constructed defining first a vacuum state annihilated by all destruction operators

$$a_{\mathbf p,s}|0\rangle = b_{\mathbf p,s}|0\rangle = 0. \tag{4.35}$$

Then multiparticle states are obtained acting on the vacuum with $a^\dagger_{\mathbf p,s}$ or $b^\dagger_{\mathbf p,s}$. Since these operators anticommute between themselves, the resulting multiparticle state is antisymmetric under the exchange of two particles, so spin 1/2 particles (as in general all half-integer spin particles) obey Fermi–Dirac statistics. The one-particle states are normalized as in the case of the scalar field,

$$(2E_{\mathbf p})^{1/2}\,a^\dagger_{\mathbf p,s}|0\rangle, \qquad (2E_{\mathbf p})^{1/2}\,b^\dagger_{\mathbf p,s}|0\rangle \tag{4.36}$$
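and depend on the momentum as well as on the spin degree of freedom $s$, which takes the values $s = 1, 2$. The Hamiltonian density is obtained computing first the classical expression,

$$\mathcal H = \Pi_\Psi\partial_0\Psi - \mathcal L = i\Psi^\dagger\partial_0\Psi - \bar\Psi\left(i\gamma^0\partial_0 + i\gamma^i\partial_i - m\right)\Psi = \bar\Psi\left(-i\gamma^i\partial_i + m\right)\Psi, \tag{4.37}$$

and therefore we get the Dirac Hamiltonian

$$H = \int d^3x\,\bar\Psi\left(-i\gamma^i\partial_i + m\right)\Psi = \int d^3x\,\bar\Psi\left(-i\boldsymbol\gamma\cdot\nabla + m\right)\Psi. \tag{4.38}$$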
We then substitute the mode expansion (4.32) and we perform the normal ordering, which in this case means that we put all $a^\dagger_{\mathbf p,s}$ to the left of all $a_{\mathbf p,s}$ and all $b^\dagger_{\mathbf p,s}$ to the left of all $b_{\mathbf p,s}$, adding a minus sign each time we exchange the position of any destruction or creation operator, but without paying the price of the Dirac delta; e.g. $:a_{\mathbf p,s}a^\dagger_{\mathbf p,s}: \,= -a^\dagger_{\mathbf p,s}a_{\mathbf p,s}$. The final result is

$$H = \int \frac{d^3p}{(2\pi)^3} \sum_{s=1,2} E_{\mathbf p}\left(a^\dagger_{\mathbf p,s}a_{\mathbf p,s} + b^\dagger_{\mathbf p,s}b_{\mathbf p,s}\right). \tag{4.39}$$

If we were to quantize the Dirac field in terms of commutators, at this point we would have found a minus sign in front of the term $b^\dagger_{\mathbf p,s}b_{\mathbf p,s}$, and therefore the energy would have been unbounded from below. In this way, instead, we see that the situation becomes completely analogous to the complex scalar field, and the coefficients of the negative energy solutions $e^{ipx}$ become the creation operators of another type of particle.

Let us now study the momentum, spin and charge of these particles. The momentum operator is again obtained from the Noether theorem,

$$\mathbf P = \int \frac{d^3p}{(2\pi)^3} \sum_{s=1,2} \mathbf p\left(a^\dagger_{\mathbf p,s}a_{\mathbf p,s} + b^\dagger_{\mathbf p,s}b_{\mathbf p,s}\right). \tag{4.40}$$

The new aspect compared to the complex scalar field is the spin degree of freedom. The angular momentum is the Noether charge associated
to spatial rotations and, as discussed in Section 2.6.2, it is made up of the orbital contribution plus a spin term. Using the expressions for the Noether current given in Section 2.6.2, the spin part is

$$\mathbf S = \frac12\int d^3x\,\Psi^\dagger\,\boldsymbol\Sigma\,\Psi, \tag{4.41}$$

where

$$\Sigma^i = \begin{pmatrix}\sigma^i & 0\\ 0 & \sigma^i\end{pmatrix}. \tag{4.42}$$

Substituting again the mode expansion one finds that, in the rest frame (i.e. when $\mathbf p = 0$), the state created by $a^\dagger_{\mathbf p,s}$ with $s = 1$ has $J_z = +1/2$ and that with $s = 2$ has $J_z = -1/2$. For the state created by $b^\dagger_{\mathbf p,s}$ the situation is reversed and $s = 1$ has $J_z = -1/2$ while $s = 2$ has $J_z = +1/2$. Performing a boost in the $z$ direction, $J_z$ is unchanged and therefore a state created by $a^\dagger_{\mathbf p,s}$ with $\mathbf p = (0, 0, p_z)$ and $s = 1$ has helicity $h = +1/2$, etc.

Finally, we saw in Section 3.4.3 that the Dirac action has a $U(1)$ global symmetry $\Psi \to e^{i\alpha}\Psi$. The conserved charge is

$$Q_{U(1)} = \int \frac{d^3p}{(2\pi)^3} \sum_{s=1,2}\left(a^\dagger_{\mathbf p,s}a_{\mathbf p,s} - b^\dagger_{\mathbf p,s}b_{\mathbf p,s}\right). \tag{4.43}$$
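Table 4.1 The quantum numbers of the one-particle states created by $a^\dagger_{\mathbf p,s}$, $b^\dagger_{\mathbf p,s}$. The momentum $\mathbf p$ is directed by definition along the $z$ direction. The electric charge is in units of $e$, with $e < 0$.

state                                  $J_z$      $U(1)$ charge
$a^\dagger_{\mathbf p,1}|0\rangle$     $+1/2$     $+1$
$a^\dagger_{\mathbf p,2}|0\rangle$     $-1/2$     $+1$
$b^\dagger_{\mathbf p,1}|0\rangle$     $-1/2$     $-1$
$b^\dagger_{\mathbf p,2}|0\rangle$     $+1/2$     $-1$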
The one-particle states and their quantum numbers are summarized in Table 4.1. The states created by $a^\dagger_{\mathbf p,s}$ are called particles and the states created by $b^\dagger_{\mathbf p,s}$ are called antiparticles. In particular, in electrodynamics, we identify $a^\dagger_{\mathbf p,s}|0\rangle$ with the electron and $b^\dagger_{\mathbf p,s}|0\rangle$ with the positron. The $U(1)$ charge is equal to the number of particles minus the number of antiparticles.
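The algebra used in this section can be checked on a small toy model. The sketch below is not taken from the text: it assumes a single momentum and spin label, represents the two fermionic modes $a$ and $b$ by $4\times 4$ matrices through a Jordan-Wigner construction (an illustrative choice), and uses an arbitrary energy value `E_p`. It verifies the anticommutators, the antisymmetry of two-particle states, the identity $-b\,b^\dagger = b^\dagger b - 1$ that turns the would-be negative term into a positive one up to a constant, and the charges of the one-particle states.

```python
import numpy as np

# Single-mode lowering operator in the basis (|empty>, |occupied>).
lower = np.array([[0, 1], [0, 0]], dtype=complex)
sign_string = np.diag([1, -1]).astype(complex)  # Jordan-Wigner sign factor
one = np.eye(2, dtype=complex)

# Two fermionic modes at fixed momentum and spin: a (particle), b (antiparticle).
a = np.kron(lower, one)
b = np.kron(sign_string, lower)
a_dag, b_dag = a.conj().T, b.conj().T
identity = np.eye(4, dtype=complex)


def anticommutator(x, y):
    return x @ y + y @ x


vacuum = np.zeros(4, dtype=complex)
vacuum[0] = 1.0  # both modes empty, so a|0> = b|0> = 0

E_p = 1.5  # arbitrary illustrative energy (assumption)
H = E_p * (a_dag @ a + b_dag @ b)  # one-mode analogue of eq. (4.39)
Q = a_dag @ a - b_dag @ b          # one-mode analogue of eq. (4.43)

checks = {
    "{a, a^dag} = 1": np.allclose(anticommutator(a, a_dag), identity),
    "{b, b^dag} = 1": np.allclose(anticommutator(b, b_dag), identity),
    "{a, b} = {a, b^dag} = 0": np.allclose(anticommutator(a, b), 0)
    and np.allclose(anticommutator(a, b_dag), 0),
    "(a^dag)^2 = 0": np.allclose(a_dag @ a_dag, 0),
    "a^dag b^dag|0> = -b^dag a^dag|0>": np.allclose(
        a_dag @ b_dag @ vacuum, -(b_dag @ a_dag @ vacuum)
    ),
    "-b b^dag = b^dag b - 1": np.allclose(-(b @ b_dag), b_dag @ b - identity),
    "H has no negative eigenvalue": bool(np.linalg.eigvalsh(H).min() >= -1e-12),
    "Q a^dag|0> = + a^dag|0>": np.allclose(Q @ a_dag @ vacuum, a_dag @ vacuum),
    "Q b^dag|0> = - b^dag|0>": np.allclose(Q @ b_dag @ vacuum, -(b_dag @ vacuum)),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

The toy model only illustrates the operator identities; the statement that commutators would make the energy unbounded from below relies on bosonic modes having unbounded occupation numbers, which a finite fermionic matrix model cannot display.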
4.2.2 Massless Weyl field
In this section we consider a massless Weyl field. Its quantization follows immediately from the quantization of the Dirac field. It is convenient to use a Dirac notation for the Weyl fields so, if $\psi_L$ is a left-handed two-component Weyl field and $\psi_R$ a right-handed Weyl field, we write, in the chiral representation

$$\Psi_L = \begin{pmatrix}\psi_L\\ 0\end{pmatrix}, \qquad \Psi_R = \begin{pmatrix}0\\ \psi_R\end{pmatrix}. \tag{4.44}$$

We consider first $\Psi_L$. As with any Dirac field, we can expand it as in eq. (4.32),

$$\Psi_L(x) = \int \frac{d^3p}{(2\pi)^3\sqrt{2E_{\mathbf p}}} \sum_{s=1,2}\left(a_{\mathbf p,s}\,u^s_L(p)\,e^{-ipx} + b^\dagger_{\mathbf p,s}\,v^s_L(p)\,e^{ipx}\right). \tag{4.45}$$

However, by definition $u^s_L(p)$ and $v^s_L(p)$ are Dirac spinors which, in the chiral representation, have the two lowest components vanishing,

$$u^s_L = \begin{pmatrix}U^s_L\\ 0\end{pmatrix}, \qquad v^s_L = \begin{pmatrix}V^s_L\\ 0\end{pmatrix}. \tag{4.46}$$
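Comparing with eqs. (3.105), (3.106) and (3.107) we see that when $s = 1$ we have $u^s_L = v^s_L = 0$, so only the term with $s = 2$ contributes in eq. (4.45). Therefore

$$\Psi_L(x) = \int \frac{d^3p}{(2\pi)^3\sqrt{2E_{\mathbf p}}}\left(a_{\mathbf p,2}\,u^2_L(p)\,e^{-ipx} + b^\dagger_{\mathbf p,2}\,v^2_L(p)\,e^{ipx}\right), \tag{4.47}$$

and

$$\bar\Psi_L(x) = \int \frac{d^3p}{(2\pi)^3\sqrt{2E_{\mathbf p}}}\left(a^\dagger_{\mathbf p,2}\,\bar u^2_L(p)\,e^{ipx} + b_{\mathbf p,2}\,\bar v^2_L(p)\,e^{-ipx}\right). \tag{4.48}$$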
We see from Table 4.1 that $b^\dagger_{\mathbf p,2}$ creates an antiparticle with $h = +1/2$ while $a^\dagger_{\mathbf p,2}$ creates a particle with $h = -1/2$. Therefore: in the massless case, the operators $\Psi_L$, $\bar\Psi_L$ create or destroy a particle with $h = -1/2$ and its antiparticle with $h = +1/2$. If we neglect the small masses indicated by the oscillation experiments, the neutrinos in the Standard Model are described by massless left-handed Weyl fields. The neutrino has $h = -1/2$ and the antineutrino has $h = +1/2$. Repeating the analysis for $\Psi_R$ we see that now only the $s = 1$ term survives and therefore the situation is reversed. A right-handed massless Weyl field describes a particle with $h = +1/2$ and its antiparticle with $h = -1/2$.
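The statement that only $s = 2$ survives in the left-handed field can be checked numerically for the $u$ spinors. The sketch below rests on assumptions that are not spelled out in this section: it takes the chiral-representation form $u^s(p) = (\sqrt{p\cdot\sigma}\,\xi^s, \sqrt{p\cdot\bar\sigma}\,\xi^s)$ with $\xi^1 = (1,0)$, $\xi^2 = (0,1)$ as a stand-in for eq. (3.103), and $\gamma^5 = \mathrm{diag}(-1,-1,1,1)$, so that the left-handed projector keeps the two upper components. The helper names are illustrative.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
one = np.eye(2, dtype=complex)


def sqrt_hermitian(m):
    """Square root of a positive semi-definite Hermitian matrix."""
    w, v = np.linalg.eigh(m)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T


def u_spinor(p, mass, xi):
    """Assumed form of u(p): upper block sqrt(p.sigma) xi, lower sqrt(p.sigma-bar) xi."""
    p = np.asarray(p, dtype=float)
    energy = np.sqrt(p @ p + mass**2)
    p_vec_sigma = sum(pi * si for pi, si in zip(p, sigma))
    upper = sqrt_hermitian(energy * one - p_vec_sigma) @ xi
    lower = sqrt_hermitian(energy * one + p_vec_sigma) @ xi
    return np.concatenate([upper, lower])


gamma5 = np.diag([-1, -1, 1, 1]).astype(complex)  # assumed chiral-representation gamma^5
P_L = (np.eye(4) - gamma5) / 2                    # keeps the two upper components
Sigma3 = np.kron(one, sigma[2])                   # Sigma^3 of eq. (4.42)

p_massless = [0.0, 0.0, 2.0]                      # momentum along +z, m = 0
u1 = u_spinor(p_massless, 0.0, np.array([1, 0], dtype=complex))
u2 = u_spinor(p_massless, 0.0, np.array([0, 1], dtype=complex))

print("P_L u^1 = 0:", np.allclose(P_L @ u1, 0))
print("P_L u^2 = u^2:", np.allclose(P_L @ u2, u2))
print("helicity of u^2 is -1/2:", np.allclose(0.5 * Sigma3 @ u2, -0.5 * u2))
print("helicity of u^1 is +1/2:", np.allclose(0.5 * Sigma3 @ u1, +0.5 * u1))
```

The $v$ spinors are deliberately left out, since their spin labelling depends on the convention chosen for the two-component spinors in eq. (3.107), which is not reproduced here.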
4.2.3 C, P, T
In Section 2.6.3 we studied how parity and charge conjugation act on a classical Dirac field. We now want to understand how they act on one-particle states or, equivalently, on the operator $\Psi$ that represents a quantized Dirac field.

Parity

Let us start with the parity operator $P$. Under parity the momentum $\mathbf p \to -\mathbf p$ while the spin $s$ is unchanged since the angular momentum is a pseudovector. Then for a particle of type $a$ we must have

$$P|\mathbf p, s; a\rangle = \eta_a|-\mathbf p, s; a\rangle. \tag{4.49}$$

We have inserted an index $a$ to label the type of particle and we have included the possibility of a constant phase factor $\eta_a$, since vectors in the Fock space which differ by a phase still represent the same physical state. We will call $\eta_a$ the intrinsic parity of the particle $a$. Performing twice the parity transformation on a physical observable gives the identity operation; this is not yet sufficient to conclude that $\eta_a^2 = 1$, since the observables are built from an even number of fermionic operators. Therefore, the condition that $P^2$ is the identity on the physical observables only implies that either $\eta_a^2 = +1$ or $\eta_a^2 = -1$. However, it can be shown (see Weinberg (1995), page 125) that for all spin 1/2 particles except Majorana fermions it is possible to redefine the parity operation so that $\eta_a^2 = +1$, and therefore $\eta_a = \pm 1$. In the following we will restrict to this case. We will come back to Majorana fermions on page 94, and we will see that the intrinsic parity of Majorana fermions satisfies instead $\eta_a^2 = -1$, i.e. $\eta_a = \pm i$.

In order to implement (4.49) on a generic multiparticle state, the operators $a^\dagger_{\mathbf p,s}$, $b^\dagger_{\mathbf p,s}$ must satisfy $P a^\dagger_{\mathbf p,s} = \eta_a a^\dagger_{-\mathbf p,s}P$ and $P b^\dagger_{\mathbf p,s} = \eta_b b^\dagger_{-\mathbf p,s}P$, so that, for instance,

$$P\,a^\dagger_{\mathbf p,s}b^\dagger_{\mathbf q,s}|0\rangle = \eta_a\,a^\dagger_{-\mathbf p,s}\,P\,b^\dagger_{\mathbf q,s}|0\rangle = \eta_a\eta_b\,a^\dagger_{-\mathbf p,s}b^\dagger_{-\mathbf q,s}|0\rangle \tag{4.50}$$

and similarly on a generic multiparticle state.⁵ Using $P^2 = 1$, this means that

$$P a^\dagger_{\mathbf p,s} P = \eta_a a^\dagger_{-\mathbf p,s}, \qquad P b^\dagger_{\mathbf p,s} P = \eta_b b^\dagger_{-\mathbf p,s}. \tag{4.51}$$

⁵ We are assuming that the vacuum is nondegenerate and invariant under parity, and therefore $P|0\rangle = \eta|0\rangle$. In this case, as part of the definition of the operator $P$, we can choose $\eta = +1$, so $P|0\rangle = |0\rangle$. The situation in which the vacuum state is degenerate and the parity operation sends a vacuum state into a different vacuum state is an example of spontaneous symmetry breaking, and will be treated in Chapter 11.
As mentioned in Section 2.7.2, the Wigner theorem states that we can implement this symmetry transformation by means of a unitary operator $P$. Then the conditions $PP^\dagger = P^\dagger P = 1$, together with $PP = 1$, give $P^\dagger = P$. Taking the hermitian conjugate of (4.51) and taking into account that we are restricting to $\eta_{a,b}$ real, we find

$$P a_{\mathbf p,s} P = \eta_a a_{-\mathbf p,s}, \qquad P b_{\mathbf p,s} P = \eta_b b_{-\mathbf p,s}. \tag{4.52}$$

Therefore

$$\Psi(x) \to \Psi'(x') = P\Psi(x)P, \tag{4.53}$$

with

$$P\Psi(x)P = \int \frac{d^3p}{(2\pi)^3\sqrt{2E_{\mathbf p}}} \sum_{s=1,2}\left(\eta_a\,a_{-\mathbf p,s}\,u^s(p)\,e^{-ipx} + \eta_b\,b^\dagger_{-\mathbf p,s}\,v^s(p)\,e^{ipx}\right). \tag{4.54}$$

We now change the integration variable $\mathbf p$ to $\mathbf p' = -\mathbf p$. This transformation does not change $p^0$, which is quadratic in $\mathbf p$, so $p'^0 = p^0$. Then $\exp(ipx) = \exp(ip^0t - i\mathbf p\cdot\mathbf x) = \exp(ip'^0t + i\mathbf p'\cdot\mathbf x) = \exp(ip'x')$ with $x' = (t, -\mathbf x)$, and similarly $\exp(-ipx) = \exp(-ip'x')$. To understand how $u^s(p)$ and $v^s(p)$ transform if we change their argument from $p$ to $p'$ we can, without loss of generality, choose $\mathbf p$ along the $z$ axis and use the explicit form given in eqs. (3.103) and (3.107). We see that the transformation $p^3 \to -p^3$ exchanges the upper and lower components in the chiral representation, and therefore can be written in terms of $\gamma^0$,

$$u^s(p) = \gamma^0 u^s(p'), \qquad v^s(p) = -\gamma^0 v^s(p'). \tag{4.55}$$

Therefore, renaming the integration variable $\mathbf p' \to \mathbf p$,

$$P\Psi(x)P = \gamma^0\int \frac{d^3p}{(2\pi)^3\sqrt{2E_{\mathbf p}}} \sum_{s=1,2}\left(\eta_a\,a_{\mathbf p,s}\,u^s(p)\,e^{-ipx'} - \eta_b\,b^\dagger_{\mathbf p,s}\,v^s(p)\,e^{ipx'}\right). \tag{4.56}$$

We now require that the quantum operator $\Psi$ is a representation of parity, up to a phase. From the above equation, we see that this is possible if and only if

$$\eta_a = -\eta_b. \tag{4.57}$$
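The relations (4.55), which are the origin of the relative sign in (4.57), can be checked numerically for a generic momentum. The sketch below assumes, as a stand-in for eqs. (3.103) and (3.107), the chiral-representation forms $u = (\sqrt{p\cdot\sigma}\,\xi, \sqrt{p\cdot\bar\sigma}\,\xi)$ and $v = (\sqrt{p\cdot\sigma}\,\eta, -\sqrt{p\cdot\bar\sigma}\,\eta)$ with $\gamma^0$ off-diagonal; the two-component spinors, the mass and the momentum are arbitrary test values.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
one = np.eye(2, dtype=complex)
zero = np.zeros((2, 2), dtype=complex)
gamma0 = np.block([[zero, one], [one, zero]])  # chiral representation (assumed)


def sqrt_hermitian(m):
    """Square root of a positive semi-definite Hermitian matrix."""
    w, v = np.linalg.eigh(m)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T


def p_dot(p, mass, sign):
    """E - sign * (p . sigma): sign=+1 gives p.sigma, sign=-1 gives p.sigma-bar."""
    energy = np.sqrt(np.dot(p, p) + mass**2)
    return energy * one - sign * sum(pi * si for pi, si in zip(p, sigma))


def u_spinor(p, mass, xi):
    return np.concatenate([sqrt_hermitian(p_dot(p, mass, +1)) @ xi,
                           sqrt_hermitian(p_dot(p, mass, -1)) @ xi])


def v_spinor(p, mass, eta):
    return np.concatenate([sqrt_hermitian(p_dot(p, mass, +1)) @ eta,
                           -sqrt_hermitian(p_dot(p, mass, -1)) @ eta])


p = np.array([0.4, -0.7, 1.1])   # arbitrary three-momentum
mass = 0.5                       # arbitrary mass
xi = np.array([0.6, 0.8j])       # arbitrary two-component spinors
eta = np.array([0.8, -0.6])

u_relation = np.allclose(u_spinor(p, mass, xi), gamma0 @ u_spinor(-p, mass, xi))
v_relation = np.allclose(v_spinor(p, mass, eta), -gamma0 @ v_spinor(-p, mass, eta))
print("u(p) = gamma0 u(p'):", u_relation)
print("v(p) = -gamma0 v(p'):", v_relation)
```

Within these assumptions the two relations hold for any choice of the two-component spinors, so the sign difference between $u$ and $v$ does not depend on how the spin states are labelled.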
This shows that the intrinsic parity of a spin 1/2 particle and of its antiparticle are opposite.⁶ The transformation law of the operator $\Psi$ then becomes

$$\Psi(x) \to \Psi'(x') = \eta_a\gamma^0\Psi(x'), \tag{4.58}$$
which, once we recall the form of $\gamma^0$ in the chiral representation, is in agreement with the classical result (2.90), plus the novel quantum effect of the intrinsic parity factor $\eta_a$.⁷ Of course $\eta_a$ cancels in any fermion bilinear involving only particles of one type. However, the relative phase factors of different particles can be observables,⁸ and in particular the opposite sign of the parity of the particle and its antiparticle is observable. An interesting application is to the case of positronium, which is the bound state of an electron and positron, and is discussed in Exercise 4.1.

The situation should be compared with what happens to a complex scalar field. In this case repeating the same arguments we have again $P a_{\mathbf p} P = \eta_a a_{-\mathbf p}$ and $P b_{\mathbf p} P = \eta_b b_{-\mathbf p}$. We can go through the same steps, with the only difference that in $\phi(x)$ the annihilation and creation operators are multiplied simply by $e^{-ipx}$ and $e^{ipx}$, respectively, while in $\Psi(x)$ they were multiplied by $u(p)e^{-ipx}$ and $v(p)e^{ipx}$, respectively. Therefore for scalar fields we do not get the relative minus sign between $\eta_a$ and $\eta_b$, which for Dirac fields originated in the different transformation properties of $u(p)$ and $v(p)$, see eq. (4.55). Then we find that the quantized complex scalar field $\phi$ gives a representation of parity if $\eta_a = +\eta_b$, so that the intrinsic parity of a spin-0 particle and of its antiparticle are equal.

Charge conjugation

The effect of charge conjugation on the classical Dirac field has been obtained in eq. (2.91), working in the chiral representation. In terms of $\gamma$ matrices, eq. (2.91) reads $\Psi \to -i\gamma^2\Psi^*$. We now study how the charge conjugation $C$ acts on one-particle states. Let us consider the following transformation of the operators $a_{\mathbf p,s}$, $b_{\mathbf p,s}$:

$$C a_{\mathbf p,s} C = \eta_C\,b_{\mathbf p,s}, \qquad C b_{\mathbf p,s} C = \eta_C\,a_{\mathbf p,s}. \tag{4.59}$$

We limit for simplicity to $\eta_C = \pm 1$. As we saw in eq. (2.91), charge conjugation relates $\Psi$ and $\Psi^*$. Therefore we need to know how $u(p)$, $v(p)$ transform under complex conjugation. The result is⁹

$$u^s(p) = -i\gamma^2\,(v^s(p))^* \tag{4.60}$$

and therefore (since $-i\gamma^2$ is real and $(\gamma^2)^2 = -1$) we also have

$$v^s(p) = -i\gamma^2\,(u^s(p))^*. \tag{4.61}$$

We can now write

$$\begin{aligned}
C\Psi(x)C &= \eta_C\int \frac{d^3p}{(2\pi)^3\sqrt{2E_{\mathbf p}}} \sum_{s=1,2}\left(b_{\mathbf p,s}\,u^s(p)\,e^{-ipx} + a^\dagger_{\mathbf p,s}\,v^s(p)\,e^{ipx}\right)\\
&= -i\eta_C\gamma^2\int \frac{d^3p}{(2\pi)^3\sqrt{2E_{\mathbf p}}} \sum_{s=1,2}\left(b_{\mathbf p,s}\,(v^s(p))^*\,e^{-ipx} + a^\dagger_{\mathbf p,s}\,(u^s(p))^*\,e^{ipx}\right)\\
&= -i\eta_C\gamma^2\Psi^*.
\end{aligned} \tag{4.62}$$

⁶ If we consider also Majorana fermions the phase $\eta$ is no longer restricted to be real (see page 92), and repeating the above steps one finds $\eta_a = -\eta_b^*$.

⁷ We have derived the transformation of $\Psi$ working in the chiral representation. As already remarked below eq. (3.122), once the transformation has been written in terms of $\gamma$ matrices, it holds in any representation; in this case, if $\Psi \to \gamma^0\Psi$, we have $U\Psi \to U\gamma^0\Psi = (U\gamma^0U^{-1})U\Psi$. However, $\Psi' = U\Psi$ is the spinor in the new representation and $(U\gamma^0U^{-1})$ is $\gamma^0$ in the new representation.

⁸ More precisely, redefining a new parity operator $P' = P\exp\{i\alpha B + i\beta L + i\gamma Q\}$, where $B$, $L$, $Q$ are the baryon number, lepton number and electric charge, respectively, we can always set the intrinsic parity of the neutron, proton and electron to the value $+1$. The intrinsic parities of other particles are then fixed, see Weinberg (1995), page 125.

⁹ One can check this result on the explicit expressions (3.103, 3.107) with $\mathbf p = (0, 0, p^3)$. However in this frame only $\sigma^3$ appears, which is real, so $u(p)$, $v(p)$ are real. To check that $u^s(p)$ is indeed equal to $-i\gamma^2(v^s(p))^*$ rather than to $-i\gamma^2v^s(p)$ it suffices to consider also the case where $\mathbf p = (0, p^2, 0)$.
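The algebraic facts used between eqs. (4.60) and (4.61) can be verified with explicit matrices. The sketch below assumes the chiral-representation $\gamma^2$ with off-diagonal blocks $\pm\sigma^2$ and the same assumed $u$, $v$ forms as a stand-in for eqs. (3.103) and (3.107); momentum, mass and the two-component spinor `eta` are arbitrary test values. It checks that $-i\gamma^2$ is real, that $(\gamma^2)^2 = -1$, that applying $-i\gamma^2(\cdot)^*$ twice gives the identity (so (4.61) follows from (4.60)), and that $-i\gamma^2 v^*$ has the form of a $u$ spinor built on the two-component spinor $i\sigma^2\eta^*$.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
one = np.eye(2, dtype=complex)
zero = np.zeros((2, 2), dtype=complex)
gamma2 = np.block([[zero, sigma[1]], [-sigma[1], zero]])  # chiral representation (assumed)
C_matrix = -1j * gamma2


def sqrt_hermitian(m):
    """Square root of a positive semi-definite Hermitian matrix."""
    w, v = np.linalg.eigh(m)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T


def p_dot(p, mass, sign):
    """E - sign * (p . sigma): sign=+1 gives p.sigma, sign=-1 gives p.sigma-bar."""
    energy = np.sqrt(np.dot(p, p) + mass**2)
    return energy * one - sign * sum(pi * si for pi, si in zip(p, sigma))


def u_spinor(p, mass, xi):
    return np.concatenate([sqrt_hermitian(p_dot(p, mass, +1)) @ xi,
                           sqrt_hermitian(p_dot(p, mass, -1)) @ xi])


def v_spinor(p, mass, eta):
    return np.concatenate([sqrt_hermitian(p_dot(p, mass, +1)) @ eta,
                           -sqrt_hermitian(p_dot(p, mass, -1)) @ eta])


p = np.array([0.3, 0.9, -0.5])    # includes a p^2 component, cf. footnote 9
mass = 0.7
eta = np.array([0.6, 0.8j])

v = v_spinor(p, mass, eta)
u_from_v = C_matrix @ np.conj(v)       # right-hand side of eq. (4.60)
v_back = C_matrix @ np.conj(u_from_v)  # right-hand side of eq. (4.61)

is_real = np.allclose(C_matrix.imag, 0)
squares_to_minus_one = np.allclose(gamma2 @ gamma2, -np.eye(4))
round_trip = np.allclose(v_back, v)
has_u_form = np.allclose(u_from_v, u_spinor(p, mass, 1j * sigma[1] @ np.conj(eta)))
print(is_real, squares_to_minus_one, round_trip, has_u_form)
```

Which spin label $s$ the resulting $u$ spinor carries depends on the convention adopted for the two-component spinors, which is not fixed by this sketch.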
We see that this transformation is just the charge conjugation operation defined on the classical field, apart from the quantum phase $\eta_C = \pm 1$ which depends on the particle type. From eq. (4.59) we see that charge conjugation exchanges the particle with the antiparticle. The momentum $\mathbf p$ is unchanged by $C$ and also the index $s$. Recall however from the previous section that the state created by $a^\dagger_{\mathbf p,s}$ describes a particle with $J_z = +1/2$ when $s = 1$ and $J_z = -1/2$ when $s = 2$, while the state created by $b^\dagger_{\mathbf p,s}$ describes a particle with $J_z = -1/2$ when $s = 1$ and $J_z = +1/2$ when $s = 2$ (see Table 4.1); $J_z$ was defined as the spin in the rest frame, but if a particle has spin $J_z$ in its rest frame, it also has the same value of $J_z$ if we make a boost along the $z$ direction, since the generators $J_z$ and $K_z$ commute. Then we see that charge conjugation transforms a fermion with momentum $\mathbf p = (0, 0, p^3)$ and $J_z = +1/2$ (hence helicity $h = +1/2$) into an antifermion with the same momentum but $J_z = -1/2$, which means $h = -1/2$. Therefore charge conjugation reverses the helicity. Using the definition (4.62), one can verify (see Exercise 4.3) that the current changes sign under charge conjugation,

$$C\,\bar\Psi\gamma^\mu\Psi\,C = -\bar\Psi\gamma^\mu\Psi. \tag{4.63}$$
For Majorana spinors we found in eqs. (2.91) and (2.94) that $\Psi^c_M = \Psi_M$, i.e. $-i\gamma^2\Psi^*_M = \Psi_M$. Then eq. (4.62) becomes $C\Psi_MC = \eta_C\Psi_M$. Expanding $\Psi_M$ in terms of creation and annihilation operators and using eq. (4.59) we find $a_{\mathbf p,s} = \eta_C b_{\mathbf p,s}$. Therefore for a Majorana spinor the particle and the antiparticle are identical. As we already remarked in Section 2.6.4, the relation between Majorana spinors and Dirac spinors is similar to the relation between real scalar fields and complex scalar fields. In both cases we have a reality condition ($\phi = \phi^*$ for a scalar field and $\Psi = -i\gamma^2\Psi^*$ for a Dirac field) which eliminates one half of the degrees of freedom, and identifies the particle with the antiparticle.¹⁰ As we mentioned in note 6, in the general case where the intrinsic parity $\eta$ is not assumed to be real, the parity of a fermion and an antifermion are related by $\eta_a = -\eta_b^*$. Since for Majorana fermions the particle is the same as the antiparticle, we have $\eta_a = -\eta_a^*$ and therefore $\eta_a = \pm i$.

¹⁰ As already remarked in Section 2.6.4, a reality condition on a Dirac spinor cannot be imposed in the form $\Psi = \Psi^*$. In terms of Weyl spinors such a condition would imply $\psi_L = \psi_L^*$ and $\psi_R = \psi_R^*$; however, these conditions are not Lorentz invariant, as we see from eqs. (2.59) and (2.60).

Time reversal

Finally, we consider the time-reversal transformation $T$. The implementation of time reversal in quantum field theory is somewhat peculiar. In fact, the Wigner theorem states that a symmetry transformation can be implemented either by a linear unitary operator, which is the case that we have met until now, or by an antiunitary and antilinear operator, i.e. by an operator $U$ that, given two states $|a\rangle$ and $|b\rangle$ with scalar product $\langle a|b\rangle$, satisfies $\langle Ua|Ub\rangle = \langle a|b\rangle^*$ (instead of being equal to $\langle a|b\rangle$ as for a unitary operator) and, for $c$ a complex constant, $U c|a\rangle = c^*U|a\rangle$. Time reversal is indeed the case of a symmetry that can be implemented only by an antiunitary and antilinear operator (see Peskin and Schroeder (1995), page 67). We want to define $T$ in such a way that $T\Psi T$ satisfies the time-reversed Dirac equation. Using the antilinearity of $T$, it can be shown that this can be obtained defining

$$T a_{\mathbf p,s} T = a_{-\mathbf p,-s}, \qquad T b_{\mathbf p,s} T = b_{-\mathbf p,-s}, \tag{4.64}$$

where $a_{\mathbf p,-s} \equiv (a_{\mathbf p,2}, -a_{\mathbf p,1})$ and $b_{\mathbf p,-s} \equiv (b_{\mathbf p,2}, -b_{\mathbf p,1})$. Therefore $T$ changes the sign of the momentum and flips the spin, as we expect for time reversal. On the Dirac field this gives

$$T\Psi(t, \mathbf x)T = -\gamma^1\gamma^3\,\Psi(-t, \mathbf x). \tag{4.65}$$
We leave it as an exercise to the reader to show that $-\gamma^1\gamma^3\Psi(-t, \mathbf x)$ indeed verifies the Dirac equation with $t \to -t$.

Now that we have defined $C$, $P$, and $T$ on the field $\Psi$, we can ask whether the Lagrangian governing the dynamics of $\Psi$ is invariant under these transformations. For the free Dirac action, one immediately sees that $C$, $P$ and $T$ are indeed symmetry operations, but it is easy to construct interaction terms with fermion bilinears and possibly with derivatives that violate $C$, $P$ or $T$ separately. However, it is impossible to write a Lorentz-invariant term that violates $CPT$. In fact, under the combined action of $C$, $P$ and $T$, the fermion bilinears $\bar\Psi\Psi$, $i\bar\Psi\gamma^5\Psi$, and $\bar\Psi\sigma^{\mu\nu}\Psi$ are invariant while $\bar\Psi\gamma^\mu\Psi$ and $\bar\Psi\gamma^5\gamma^\mu\Psi$ change sign. To construct a quadratic Lorentz-invariant term, the free Lorentz indices in $\bar\Psi\sigma^{\mu\nu}\Psi$, $\bar\Psi\gamma^\mu\Psi$ and $\bar\Psi\gamma^5\gamma^\mu\Psi$ must be contracted with a derivative $\partial_\mu$, while in quartic and higher-order terms the indices can also be contracted between the various fermion bilinears. Of course $\partial_\mu$ is invariant under $C$ while under the combined action of $P$, $T$ we have $\partial_\mu \to -\partial_\mu$. We see that each free Lorentz index in a fermion bilinear constructed with $\bar\Psi$ and $\Psi$ carries a minus sign under $CPT$, and the same is true for the Lorentz index in $\partial_\mu$. Therefore all possible Lorentz-invariant terms, where all indices are contracted and therefore are even in number, are invariant under $CPT$. The fact that $CPT$ is conserved, therefore, follows from the fact that it is impossible even to write down a Lorentz-invariant term that violates $CPT$. This is an example of the $CPT$ theorem, which states that, independently of the spin of the particle, a (local) Lorentz-invariant field theory with a hermitian Hamiltonian cannot violate $CPT$.

Since $CPT$ exchanges a particle with the antiparticle, and is an exact symmetry, i.e. it commutes with the Hamiltonian, it implies that the mass of a particle and of its antiparticle must be exactly equal. Experimentally, this is verified to an extraordinary accuracy in the $K^0\bar K^0$ system, where the bound on the mass difference is

$$\frac{|m_{K^0} - m_{\bar K^0}|}{m_{K^0}} < 10^{-18}. \tag{4.66}$$
4.3 Electromagnetic field
4.3.1 Quantization in the radiation gauge
The quantization of the electromagnetic field presents new aspects. The core of the problem is that, because of gauge invariance, the field $A_\mu$ gives a redundant description. We have therefore two choices in the quantization procedure. The first possibility is to choose from the beginning a gauge such as the radiation gauge (3.150), which fixes completely the gauge freedom; in this case we work directly with the physical degrees of freedom, but the price that we have to pay is a loss of explicit Lorentz covariance and at the end of the quantization procedure we must verify that we have not really lost Lorentz symmetry. This quantization scheme will be discussed in this section. The second possibility is to work with the full gauge field $A_\mu$. This will introduce some spurious degrees of freedom, which we will have to get rid of. This second quantization procedure will be discussed in the next section.

Thus in this section we choose the radiation gauge (3.150), that we recall here

$$A_0 = 0, \qquad \nabla\cdot\mathbf A = 0. \tag{4.67}$$

We have seen that in this gauge the equation of motion for the three residual components $A^i$ is simply $\Box A^i = 0$, so the most general classical solution is

$$\mathbf A = \int \frac{d^3p}{(2\pi)^3\sqrt{2\omega_{\mathbf p}}} \sum_{\lambda=1,2}\left(\boldsymbol\epsilon(\mathbf p,\lambda)\,a_{\mathbf p,\lambda}\,e^{-ipx} + \boldsymbol\epsilon^*(\mathbf p,\lambda)\,a^*_{\mathbf p,\lambda}\,e^{ipx}\right), \tag{4.68}$$

where, for the electromagnetic field, we use the notation $\omega_{\mathbf p} \equiv p^0$. Inserting this expansion in the equation of motion $\Box A^i = 0$ we get $p^2 = 0$ and therefore $\omega_{\mathbf p} = |\mathbf p|$; the gauge fixing condition $\nabla\cdot\mathbf A = 0$ requires instead $\boldsymbol\epsilon\cdot\mathbf p = 0$; this equation, for each fixed $\mathbf p$, has of course two independent solutions, the two orthogonal vectors, which we label by an index $\lambda = 1, 2$. The physical degrees of freedom of the electromagnetic field are described by two independent polarization vectors $\boldsymbol\epsilon(\mathbf p, 1)$ and $\boldsymbol\epsilon(\mathbf p, 2)$. We now promote $\mathbf A$ to a hermitian operator, and we write

$$\mathbf A(x) = \int \frac{d^3p}{(2\pi)^3\sqrt{2\omega_{\mathbf p}}} \sum_{\lambda=1,2}\left(\boldsymbol\epsilon(\mathbf p,\lambda)\,a_{\mathbf p,\lambda}\,e^{-ipx} + \boldsymbol\epsilon^*(\mathbf p,\lambda)\,a^\dagger_{\mathbf p,\lambda}\,e^{ipx}\right), \tag{4.69}$$

where now $a_{\mathbf p,\lambda}$, $a^\dagger_{\mathbf p,\lambda}$ are operators. Since for scalar fields we have understood that imposing on $a_{\mathbf p,\lambda}$, $a^\dagger_{\mathbf p,\lambda}$ the commutation relation of the harmonic oscillator allows us to interpret the states of the Fock space as particles, we impose the commutation relations

$$[a_{\mathbf p,\lambda}, a^\dagger_{\mathbf q,\lambda'}] = (2\pi)^3\delta^{(3)}(\mathbf p-\mathbf q)\,\delta_{\lambda\lambda'} \tag{4.70}$$

and

$$[a_{\mathbf p,\lambda}, a_{\mathbf q,\lambda'}] = [a^\dagger_{\mathbf p,\lambda}, a^\dagger_{\mathbf q,\lambda'}] = 0. \tag{4.71}$$
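Equations (4.70) and (4.71) are our defining rules for the quantization of the electromagnetic field in the radiation gauge. We now want to understand what this definition means in terms of the commutation relations of the fields $A^i$ with their conjugate momenta. The momentum conjugate to $A_0$ is

$$\Pi^0 = \frac{\delta}{\delta(\partial_0A_0)}\left(-\frac14F_{\mu\nu}F^{\mu\nu}\right) = 0. \tag{4.72}$$

In this quantization scheme $A_0$ and $\Pi^0$ are equal to zero, and are not dynamical variables. The momentum conjugate to $A_i$ is instead

$$\Pi^i = \frac{\delta}{\delta(\partial_0A_i)}\left(-\frac14F_{\mu\nu}F^{\mu\nu}\right) = \frac{\delta}{\delta(\partial_0A_i)}\left(-\frac12F_{0i}F^{0i}\right) = -F^{0i} = E^i. \tag{4.73}$$

One can verify that from the commutation relations (4.70) it follows that

$$[A^i(t,\mathbf x), E^j(t,\mathbf y)] = -i\int \frac{d^3k}{(2\pi)^3}\left(\delta^{ij} - \frac{k^ik^j}{\mathbf k^2}\right)e^{i\mathbf k\cdot(\mathbf x-\mathbf y)}. \tag{4.74}$$

In the derivation one uses the relation

$$\frac12\sum_{\lambda=1,2}\left[\epsilon^i(\mathbf k,\lambda)\,\epsilon^{j*}(\mathbf k,\lambda) + \epsilon^{i*}(-\mathbf k,\lambda)\,\epsilon^j(-\mathbf k,\lambda)\right] = \delta^{ij} - \frac{k^ik^j}{\mathbf k^2}. \tag{4.75}$$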
This identity can be verified choosing a frame where $\mathbf k = (0, 0, k)$. In this frame we can choose as orthogonal vectors the linear polarization vectors, i.e. $\boldsymbol\epsilon(\mathbf k, 1) = (1, 0, 0)$ and $\boldsymbol\epsilon(\mathbf k, 2) = (0, 1, 0)$, and eq. (4.75) is then trivially checked. The validity in any frame follows from the fact that both sides transform as tensors under rotations.¹¹ The integral on the right-hand side of eq. (4.74) is called a "transverse" Dirac delta and is denoted by $\delta^{ij}_{\rm tr}(\mathbf x-\mathbf y)$, so we can write

$$[A^i(t,\mathbf x), E^j(t,\mathbf y)] = -i\,\delta^{ij}_{\rm tr}(\mathbf x-\mathbf y). \tag{4.76}$$
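The transverse projector $\delta^{ij} - k^ik^j/\mathbf k^2$ that appears in eqs. (4.74) and (4.75) can be checked numerically for a momentum that does not point along the $z$ axis. The sketch below is illustrative: the momentum `k` and the way the transverse basis is built are arbitrary choices. It verifies the completeness relation $\sum_\lambda \epsilon^i\epsilon^{j*} = \delta^{ij} - k^ik^j/\mathbf k^2$ for a linear and for a circular basis at fixed $\mathbf k$, and that the projector is annihilated by contraction with $\mathbf k$. Since the projector is even in $\mathbf k$, the symmetrized combination in eq. (4.75) reduces to the same object whenever $\boldsymbol\epsilon(-\mathbf k, \lambda)$ is again an orthonormal transverse basis.

```python
import numpy as np

k = np.array([0.3, -1.2, 2.0])          # arbitrary momentum, not along z
k_hat = k / np.linalg.norm(k)

# Build two real unit vectors orthogonal to k (one possible choice).
trial = np.array([1.0, 0.0, 0.0])
e1 = trial - (trial @ k_hat) * k_hat
e1 /= np.linalg.norm(e1)
e2 = np.cross(k_hat, e1)

linear_basis = [e1, e2]
circular_basis = [(e1 + 1j * e2) / np.sqrt(2), (e1 - 1j * e2) / np.sqrt(2)]

transverse_projector = np.eye(3) - np.outer(k_hat, k_hat)


def polarization_sum(basis):
    """Sum over lambda of eps^i eps^{j*}."""
    return sum(np.outer(eps, np.conj(eps)) for eps in basis)


linear_ok = np.allclose(polarization_sum(linear_basis), transverse_projector)
circular_ok = np.allclose(polarization_sum(circular_basis), transverse_projector)
divergence_free = np.allclose(k @ transverse_projector, 0)
print(linear_ok, circular_ok, divergence_free)
```

The last check is the momentum-space version of the statement that the divergence of the transverse delta vanishes.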
Note that, were it not for the term $k^ik^j/\mathbf k^2$, the integral over $d^3k$ in eq. (4.74) would give an ordinary Dirac delta, and we would have the standard equal time commutation relations.¹² The necessity of the term $k^ik^j/\mathbf k^2$ can be understood taking the divergence with respect to $\mathbf x$ of both sides of eq. (4.74); on the left-hand side we find $[\nabla\cdot\mathbf A(t,\mathbf x), \mathbf E(t,\mathbf y)]$. However, since we have imposed the gauge condition $\nabla\cdot\mathbf A = 0$, this must vanish. Indeed, the divergence of the right-hand side vanishes thanks to the additional term $-k^ik^j/\mathbf k^2$ since, taking the divergence, inside the integral we get

$$k^i\left(\delta^{ij} - \frac{k^ik^j}{\mathbf k^2}\right) = k^j - k^j = 0. \tag{4.77}$$

For the same reason, taking now the divergence of eq. (4.76) with respect to $\mathbf y$, we get $[A^i(t,\mathbf x), \nabla\cdot\mathbf E(t,\mathbf y)] = 0$. This means that $\nabla\cdot\mathbf E$ commutes with all operators and therefore, even in the quantum theory of the free electromagnetic field, it is a c-number, so it is consistent to impose $\nabla\cdot\mathbf E = 0$ as an operator equation. Classically, $\nabla\cdot\mathbf E = 0$ is just a Maxwell equation in the absence of sources.¹³

We can now proceed with the standard construction of the Fock space. We define the vacuum of the Fock space from

$$a_{\mathbf p,\lambda}|0\rangle = 0 \tag{4.78}$$

for all $\mathbf p$ and $\lambda = 1, 2$; the Fock space is then generated acting with the creation operators $a^\dagger_{\mathbf p,\lambda}$. The quantum Hamiltonian is obtained normal ordering the classical expression (3.158),

$$H = \frac12\int d^3x\,:\mathbf E^2 + \mathbf B^2: \;= \int \frac{d^3k}{(2\pi)^3} \sum_{\lambda=1,2}\omega_{\mathbf k}\,a^\dagger_{\mathbf k,\lambda}a_{\mathbf k,\lambda}, \tag{4.79}$$

where $\omega_{\mathbf k} = |\mathbf k|$. The momentum is obtained from the normal ordering of (3.159),

$$\mathbf P = \int d^3x\,:\mathbf E\times\mathbf B: \;= \int \frac{d^3k}{(2\pi)^3} \sum_{\lambda=1,2}\mathbf k\,a^\dagger_{\mathbf k,\lambda}a_{\mathbf k,\lambda}. \tag{4.80}$$

This shows that the state $a^\dagger_{\mathbf k,\lambda}|0\rangle$ describes a particle with energy $\omega_{\mathbf k}$, momentum $\mathbf k$ and two polarization states $\lambda = 1, 2$. Since the dispersion relation is $\omega_{\mathbf k} = |\mathbf k|$, it has zero mass. To compute its spin we must first compute the angular momentum operator of the electromagnetic field using the Noether theorem, and then we can study its action on the one-particle states. The reader who wishes to skip the explicit calculation can go directly to eq. (4.88).

¹¹ The linear polarizations are real and therefore in this basis the above identity simplifies to $\sum_{\lambda=1,2}\epsilon^i(\mathbf k,\lambda)\,\epsilon^j(\mathbf k,\lambda) = \delta^{ij} - k^ik^j/\mathbf k^2$. However the form (4.75) holds also choosing as a basis the circular polarizations, i.e. $\boldsymbol\epsilon(\mathbf k, 1) = (1/\sqrt2)(1, i, 0)$ and $\boldsymbol\epsilon(\mathbf k, 2) = (1/\sqrt2)(1, -i, 0)$.

¹² Observe that $E^i$ is the momentum conjugate to $A_i = -A^i$. Therefore the sign in eq. (4.76) is in agreement with eq. (4.1).

¹³ In particular, it is the equation of motion obtained performing the variation with respect to $A_0$; note that to choose the gauge $A_0 = 0$ means that we set $A_0 = 0$ in the solutions of the equations of motion, not directly in the action, otherwise we lose this equation of motion. The classical solutions are defined by the fact that the action must be stationary with respect to all fields, including $A_0$. Observe also that the equation of motion $\nabla\cdot\mathbf E = 0$ contains no time derivative. Therefore it is not an equation that determines the evolution of an initial field configuration, but rather a constraint on the possible initial field configurations.
It is instructive to perform the computation explicitly. We consider a rotation in the $(jk)$ plane and we call $J^{jk}$ the associated conserved charge. The angular momentum along the $i$ axis $J^i$ is then given by $J^i = (1/2)\epsilon^{ijk}J^{jk}$. From Noether's theorem $J^{jk}$ is given by the integral of the $\mu = 0$ component of a current $j^{\mu(jk)}$, given by eq. (3.31),

$$J^{jk} = \int d^3x\,j^{0(jk)} = \int d^3x\left[\frac{\partial\mathcal L}{\partial(\partial_0A_i)}\left(a^{\nu(jk)}\partial_\nu A_i - F_i^{(jk)}\right) - a^{0(jk)}\mathcal L\right]. \tag{4.81}$$

For a spacetime rotation the coefficients $a^{\mu(jk)}$ (here we denote them by a lower case letter in order not to create confusion with the gauge field) have been found in eq. (3.48), $a^\mu_{(\rho\sigma)} = \delta^\mu_\rho x_\sigma - \delta^\mu_\sigma x_\rho$, so in particular $a^{0(jk)} = 0$ and $a^{l(jk)} = \delta^{lj}x^k - \delta^{lk}x^j$. The coefficients $a^{l(jk)}$ measure the variation of the vector $x^l$ under rotation, $\delta x^l = \omega^{jk}a^{l(jk)}$. The coefficients $F^{l(jk)}$ similarly measure the variation of the gauge field $A^i$ under a rotation, and since $A^i$ is a spatial vector its transformation law is the same as $x^i$, so that $F^{i(jk)} = \delta^{ij}A^k - \delta^{ik}A^j$. Therefore

$$\begin{aligned}
J^{jk} &= \int d^3x\,\partial_0A^i\left[-(\delta^{lj}x^k - \delta^{lk}x^j)\,\partial^lA^i - (\delta^{ij}A^k - \delta^{ik}A^j)\right]\\
&= \int d^3x\left[\partial_0A^i\,(x^j\partial^k - x^k\partial^j)A^i - (A^k\partial_0A^j - A^j\partial_0A^k)\right].
\end{aligned} \tag{4.82}$$
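The first term in the bracket is clearly the contribution from the orbital angular momentum. More precisely, it is the matrix element of the orbital angular momentum operator $L^{jk} = i(x^j\partial^k - x^k\partial^j)$ with the same scalar product used in the Klein–Gordon case; compare with eq. (3.50) and the discussion below it. We are now interested in the second term which, according to the discussion after eq. (2.84), is the spin part $S^{ij}$. Inserting the expansion (4.69) and performing the normal ordering we get

$$\begin{aligned}
S^{ij} &= \int d^3x\,:A^i\partial_0A^j - A^j\partial_0A^i:\\
&= \int d^3x\int \frac{d^3k}{(2\pi)^3\sqrt{2\omega_{\mathbf k}}}\,\frac{d^3q}{(2\pi)^3\sqrt{2\omega_{\mathbf q}}} \sum_{\lambda',\lambda''}(i\omega_{\mathbf q})\\
&\quad\times :\Big\{\left[\epsilon^i(\mathbf k,\lambda')\,a_{\mathbf k,\lambda'}\,e^{-ikx} + \epsilon^{i*}(\mathbf k,\lambda')\,a^\dagger_{\mathbf k,\lambda'}\,e^{ikx}\right]\\
&\qquad\times\left[-\epsilon^j(\mathbf q,\lambda'')\,a_{\mathbf q,\lambda''}\,e^{-iqx} + \epsilon^{j*}(\mathbf q,\lambda'')\,a^\dagger_{\mathbf q,\lambda''}\,e^{iqx}\right]\Big\}: \;-\; (i\leftrightarrow j).
\end{aligned} \tag{4.83}$$

The integration over $\mathbf x$ of the various terms gives $(2\pi)^3\delta^{(3)}(\mathbf k\pm\mathbf q)$; then one finds that the terms $\sim aa$ and $a^\dagger a^\dagger$ cancel, while the terms obtained exchanging $(i\leftrightarrow j)$ give a factor of two, so, after relabeling the summed polarization indices,

$$S^{ij} = i\int \frac{d^3q}{(2\pi)^3} \sum_{\lambda',\lambda''}\left[\epsilon^i(\mathbf q,\lambda'')\,\epsilon^{j*}(\mathbf q,\lambda') - \epsilon^{i*}(\mathbf q,\lambda')\,\epsilon^j(\mathbf q,\lambda'')\right]a^\dagger_{\mathbf q,\lambda'}\,a_{\mathbf q,\lambda''}. \tag{4.84}$$

Now we apply this operator to the one-particle state $a^\dagger_{\mathbf k,\lambda}|0\rangle$. Using

$$a_{\mathbf q,\lambda''}\,a^\dagger_{\mathbf k,\lambda}|0\rangle = [a_{\mathbf q,\lambda''}, a^\dagger_{\mathbf k,\lambda}]\,|0\rangle = (2\pi)^3\delta^{(3)}(\mathbf k-\mathbf q)\,\delta_{\lambda''\lambda}\,|0\rangle \tag{4.85}$$

we find

$$S^{ij}\,a^\dagger_{\mathbf k,\lambda}|0\rangle = i\sum_{\lambda'=1,2}\left[\epsilon^i(\mathbf k,\lambda)\,\epsilon^{j*}(\mathbf k,\lambda') - \epsilon^{i*}(\mathbf k,\lambda')\,\epsilon^j(\mathbf k,\lambda)\right]a^\dagger_{\mathbf k,\lambda'}|0\rangle. \tag{4.86}$$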
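We choose $\mathbf k = (0, 0, k)$ and we compute the spin along the $z$ axis, i.e. $S^3 = S^{12}$. As a basis for the polarization vectors we choose the linear polarizations, $\boldsymbol\epsilon(\mathbf k, 1) = (1, 0, 0)$, $\boldsymbol\epsilon(\mathbf k, 2) = (0, 1, 0)$; in components $\epsilon^i(\mathbf k,\lambda) = \delta^i_\lambda$ and

$$S^3\,a^\dagger_{\mathbf k,\lambda}|0\rangle = i\sum_{\lambda'=1,2}\left(\delta^1_\lambda\delta^2_{\lambda'} - \delta^2_\lambda\delta^1_{\lambda'}\right)a^\dagger_{\mathbf k,\lambda'}|0\rangle. \tag{4.87}$$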
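The final result of this calculation is therefore

$$S^3\,a^\dagger_{\mathbf k,1}|0\rangle = i\,a^\dagger_{\mathbf k,2}|0\rangle, \qquad S^3\,a^\dagger_{\mathbf k,2}|0\rangle = -i\,a^\dagger_{\mathbf k,1}|0\rangle, \tag{4.88}$$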
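with $\mathbf k = (0, 0, k)$. We see that the linear polarizations are not eigenstates of the helicity. The eigenstates are given by the circular polarizations,

$$S^3\,a^\dagger_{\mathbf k,+}|0\rangle = +a^\dagger_{\mathbf k,+}|0\rangle, \qquad S^3\,a^\dagger_{\mathbf k,-}|0\rangle = -a^\dagger_{\mathbf k,-}|0\rangle, \tag{4.89}$$

where

$$a^\dagger_{\mathbf k,\pm} = \frac{1}{\sqrt2}\left(a^\dagger_{\mathbf k,1} \pm i\,a^\dagger_{\mathbf k,2}\right). \tag{4.90}$$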
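The last steps of this computation amount to diagonalizing a $2\times 2$ matrix, which can be reproduced directly. The sketch below builds the matrix of $S^3 = S^{12}$ on the two linearly polarized one-photon states from eq. (4.86), using the polarization vectors chosen in the text for $\mathbf k$ along $z$; the array names are illustrative.

```python
import numpy as np

# Linear polarization vectors for k along z: eps[lam] = epsilon(k, lam + 1).
eps = np.array([[1, 0, 0], [0, 1, 0]], dtype=complex)

# M[lam_prime, lam] is defined by
#   S^{12} a^dag_lam |0> = sum over lam_prime of M[lam_prime, lam] a^dag_lam_prime |0>,
# read off from eq. (4.86) with i = 1, j = 2 (array indices 0 and 1).
M = np.zeros((2, 2), dtype=complex)
for lam in range(2):
    for lam_prime in range(2):
        M[lam_prime, lam] = 1j * (
            eps[lam, 0] * np.conj(eps[lam_prime, 1])
            - np.conj(eps[lam_prime, 0]) * eps[lam, 1]
        )

helicities = np.linalg.eigvalsh(M)      # M is Hermitian

# Coefficient vectors of a^dag_+ and a^dag_- in the linear basis, eq. (4.90).
plus = np.array([1, 1j]) / np.sqrt(2)
minus = np.array([1, -1j]) / np.sqrt(2)

print("matrix of S^3:\n", M)
print("eigenvalues:", helicities)
print("S^3 on '+' state gives +1:", np.allclose(M @ plus, plus))
print("S^3 on '-' state gives -1:", np.allclose(M @ minus, -minus))
```

The off-diagonal entries reproduce eq. (4.88), and the eigenvectors are the circular combinations of eq. (4.90) with eigenvalues $\pm 1$ as in eq. (4.89); no eigenvalue $0$ appears because only the two transverse polarizations enter.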
14 A note for the advanced reader. In general, it is quite common in ﬁeld theory that, in the quantization procedure, a symmetry of the classical Lagrangian is not explicitly preserved in the intermediate steps, and at the end of the quantization procedure one must check whether the symmetry is still present in the quantum theory. It turns out that it is not at all automatic that such a symmetry is recovered. If this does not happen, the symmetry is called anomalous, and the theory is said to have an anomaly. In QFT this can only happen as a consequence of the divergences of the interacting theory, that will be the subject of the next chapter, and in a free ﬁeld theory, such as the free electromagnetic ﬁeld that we are considering here, no anomaly can appear. For this reason, the recovery of Lorentz invariance in the quantization of the free electromagnetic ﬁeld is guaranteed. It is however interesting to observe that in string theory there is no distinction between a free Lagrangian and an interaction term, i.e. the free propagation of the string ﬁxes also the interaction. It is possible to quantize the theory in a way very similar to the quantization in radiation gauge of the electromagnetic ﬁeld, breaking the explicit Lorentz covariance, and one ﬁnds that at the end Lorentz invariance is recovered only if the theory lives in 26 spacetime dimensions (for the bosonic string) or in 10 spacetime dimensions (for superstrings). See Polchinski (1998), Section 1.3, for a clear discussion.
The conclusion is that the states $a^\dagger_{\mathbf k,\pm}|0\rangle$ describe particles with momentum $\mathbf k$, energy $\omega_{\mathbf k} = |\mathbf k|$, mass zero, spin 1, and helicity $\pm 1$. These quanta are the photons. The fact that massless particles are helicity eigenstates and that there is no state with $J_z = 0$ is in agreement with our general discussion of the representation of the Poincaré group in Section 2.7.2.

Our quantization procedure did not maintain Lorentz covariance, since we broke it from the beginning with our gauge choice. We must therefore now ask whether at the end Lorentz invariance is recovered. The fact that we found a particle which fits within the representations of the Poincaré group already indicates that the final result is compatible with Poincaré (and therefore Lorentz) invariance. To make sure that indeed the theory has Lorentz invariance what we actually have to do is to construct all generators of the Poincaré group in terms of the creation and annihilation operators. We already wrote explicitly the energy, momentum and the spin part of the angular momentum in eqs. (4.79), (4.80) and (4.84) and the reader can complete it computing the orbital part of the angular momentum, and the boost generator. Using the commutation relations of the creation and annihilation operators one can then check that these generators indeed close the Lorentz algebra, and that the one-particle states, under the transformations generated by these generators, transform as expected for a spin-1 massless particle. This proves the covariance of the quantization in the radiation gauge.¹⁴

Finally, we can define on the photon states the operations of parity and charge conjugation. Concerning parity, we have understood that the physical photon states are described by a vector field $\mathbf A(t, \mathbf x)$, subject to the condition $\nabla\cdot\mathbf A = 0$.
The gauge field $\mathbf{A}$ is a true vector, as follows for instance from the fact that the electric field is a true vector and the magnetic field is a pseudovector, so under parity it transforms as

$\mathbf{A}(t,\mathbf{x}) \to -\mathbf{A}(t,-\mathbf{x})$ . (4.91)

Expanding each of the three components of $\mathbf{A}(t,\mathbf{x})$ in spherical harmonics, under parity the terms with orbital angular momentum $L$ get the usual factor $(-1)^L$ from the transformation of the spherical harmonic $Y_{LM}$ under $\mathbf{x}\to-\mathbf{x}$, plus an overall minus sign from the fact that $\mathbf{A}$ is a vector. In terms of photon states, this means that

$P\,|\gamma;\mathbf{k},s\rangle = -|\gamma;-\mathbf{k},s\rangle$ , (4.92)

where $\mathbf{k}$, $s$ are the momentum and spin of the photon $\gamma$. Therefore the intrinsic parity of a physical photon state is $-1$. We saw in eq. (4.63) that the fermionic current changes sign under charge conjugation. Therefore, if we define $C$ on the gauge field as

$C A_\mu C = -A_\mu$ , (4.93)

then charge conjugation is a symmetry of the QED Lagrangian. On the creation and annihilation operators of the photon, the above equation means that

$C a_{\mathbf{p},\lambda} C = -a_{\mathbf{p},\lambda}$ . (4.94)
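By definition, we take $C|0\rangle = +|0\rangle$. Then, since $C^2 = 1$ and the hermitian conjugate of eq. (4.94) gives the same relation for $a^\dagger_{\mathbf{p},\lambda}$, we have

$C a^\dagger_{\mathbf{p},\lambda}|0\rangle = (C a^\dagger_{\mathbf{p},\lambda} C)\, C|0\rangle = -a^\dagger_{\mathbf{p},\lambda}|0\rangle$ . (4.95)

Therefore the photon has charge conjugation $-1$.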
4.3.2 Covariant quantization
In this section we take a different route for the quantization of the electromagnetic field. We do not want to spoil the covariance, so we do not impose the radiation gauge and we accept working with the redundant field $A_\mu$. However, if we try to perform straightforwardly a covariant quantization of the Maxwell Lagrangian,

$\mathcal{L} = -\frac{1}{4} F_{\mu\nu}F^{\mu\nu}$ , (4.96)

we fail immediately. In fact, in a naive covariant quantization, we would first of all define the conjugate momenta as

$\Pi^\mu(x) = \dfrac{\partial\mathcal{L}}{\partial(\partial_0 A_\mu)}$ . (4.97)
Consider first the spatial components. The momentum conjugate to $A^i$ is $\Pi_i = -\Pi^i$, with $\Pi^i = E^i$. Equation (4.1) would therefore suggest imposing the equal time commutation relations15

$[A^i(t,\mathbf{x}), \Pi^j(t,\mathbf{y})] = -i\delta^{ij}\delta^{(3)}(\mathbf{x}-\mathbf{y})$ (4.98)

(the minus sign is due to the fact that the momentum conjugate to $A^i$ is $\Pi_i$, and $\Pi_i = -\Pi^i$), while

$[A^i(t,\mathbf{x}), A^j(t,\mathbf{y})] = 0$ . (4.99)

These commutation relations have the covariant generalization

$[A^\mu(t,\mathbf{x}), A^\nu(t,\mathbf{y})] = 0$ , (4.100)

$[A^\mu(t,\mathbf{x}), \Pi^\nu(t,\mathbf{y})] = i\eta^{\mu\nu}\delta^{(3)}(\mathbf{x}-\mathbf{y})$ . (4.101)

The metric $\eta^{\mu\nu}$ is forced upon us by the condition of Lorentz covariance, since the left-hand side is a tensor. In a covariant quantization, one would therefore use eqs. (4.97), (4.100) and (4.101) as the starting point. However, eqs. (4.97) and (4.101) are incompatible, because in the Maxwell Lagrangian there is no dependence on $\partial_0 A_0$ and therefore $\Pi^0$ vanishes identically, and cannot have a non-trivial commutator with $A^0$. To tackle this problem we proceed as follows. We start from a modified Lagrangian,

$\mathcal{L} = -\frac{1}{4} F_{\mu\nu}F^{\mu\nu} - \frac{1}{2}(\partial_\mu A^\mu)^2$ . (4.102)

This Lagrangian at first sight seems to describe a very different theory compared to the Maxwell Lagrangian (4.96). Indeed, the Lagrangian
15 Observe that in the quantization in radiation gauge of the previous section the commutator $[A^i(t,\mathbf{x}), \Pi^j(t,\mathbf{y})]$ was rather given in terms of a transverse Dirac delta, see eq. (4.76). This was a consequence of our gauge fixing, which eliminated from the beginning the longitudinal polarization vector, i.e. the vector $\epsilon^i(\mathbf{k},3)$ which, in the frame where $\mathbf{k} = (0,0,k)$, has the form $\epsilon^i(\mathbf{k},3) = (0,0,1)$. Because of this, the sum over the polarizations gave the transverse tensor $\delta^{ij} - k^i k^j/\mathbf{k}^2$, see eq. (4.75). The difference with the covariant quantization is that now we are not fixing the gauge, and we keep for the moment all polarization vectors. Therefore, the sum over the spatial polarization vectors now gives $\delta^{ij}$ rather than $\delta^{ij} - k^i k^j/\mathbf{k}^2$.
(4.102) is not even gauge invariant. For the moment we postpone the question of what the Lagrangian (4.102) has to do with (4.96), and we proceed to its quantization. The conjugate momenta can now be defined straightforwardly,

$\Pi^\mu(x) = \dfrac{\partial\mathcal{L}}{\partial(\partial_0 A_\mu)}$ , (4.103)

so that $\Pi^i = -F^{0i} = E^i$ as in the usual Maxwell Lagrangian (4.96), while $\Pi^0 = -\partial_\mu A^\mu$ is non-vanishing. It therefore makes perfect sense to impose the canonical commutation relations (4.101). The equation of motion derived from eq. (4.102) is simply $\Box A_\mu = 0$. The operators $A_\mu$ can therefore be expanded as

$A_\mu(x) = \displaystyle\int \frac{d^3p}{(2\pi)^3\sqrt{2\omega_{\mathbf{p}}}} \sum_{\lambda=0}^{3} \left[ \epsilon_\mu(\mathbf{p},\lambda)\, a_{\mathbf{p},\lambda}\, e^{-ipx} + \epsilon^*_\mu(\mathbf{p},\lambda)\, a^\dagger_{\mathbf{p},\lambda}\, e^{ipx} \right]$ , (4.104)

and the equation of motion $\Box A_\mu = 0$ translates into $p^2 = 0$. The important difference compared to the canonical quantization discussed in the previous section is that now there is no constraint on $\epsilon_\mu$; in the canonical quantization the conditions $\epsilon_0 = 0$ and $p^\mu\epsilon_\mu = 0$ came from the gauge choice, while here we start from a Lagrangian which is not even gauge invariant, and no constraint has been imposed on $A_\mu$. Therefore we have four independent solutions for $\epsilon_\mu(\mathbf{p},\lambda)$, labeled by $\lambda = 0, 1, 2, 3$. In the frame where $p^\mu = (p, 0, 0, p)$ we will choose as a basis

$\epsilon^\mu(\mathbf{p},0) = (1,0,0,0)$ , $\epsilon^\mu(\mathbf{p},1) = (0,1,0,0)$ , $\epsilon^\mu(\mathbf{p},2) = (0,0,1,0)$ , $\epsilon^\mu(\mathbf{p},3) = (0,0,0,1)$ , (4.105)

or, more compactly, $\epsilon^\mu(\mathbf{p},\lambda) = \delta^\mu_\lambda$. In a generic frame the form of $\epsilon^\mu(\mathbf{p},\lambda)$ is found by performing the appropriate Lorentz transformation. The two vectors $\epsilon^\mu(\mathbf{p},1)$ and $\epsilon^\mu(\mathbf{p},2)$ satisfy $\epsilon^\mu p_\mu = 0$, i.e. they are transverse. Instead $\epsilon^\mu(\mathbf{p},0)$ and $\epsilon^\mu(\mathbf{p},3)$ have $\epsilon^\mu p_\mu \neq 0$. From the expansion (4.104) and the canonical commutation relations (4.101) it follows that

$[a_{\mathbf{p},\lambda}, a^\dagger_{\mathbf{q},\lambda'}] = -(2\pi)^3\delta^{(3)}(\mathbf{p}-\mathbf{q})\,\eta_{\lambda\lambda'}$ , (4.106)
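The statement about which polarization vectors are transverse can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the metric signature $(+,-,-,-)$ used in the text and an arbitrary illustrative value for $|\mathbf{p}|$ (the variable names are not from the text):

```python
import numpy as np

# Metric with signature (+, -, -, -), as used in the text.
eta = np.diag([1.0, -1.0, -1.0, -1.0])

p = 2.5                                  # illustrative value of |p| (assumption)
p_up = np.array([p, 0.0, 0.0, p])        # p^mu = (p, 0, 0, p)
p_down = eta @ p_up                      # p_mu = eta_{mu nu} p^nu

# Basis (4.105): row lam holds the components eps^mu(p, lam) = delta^mu_lam.
eps = np.eye(4)

# Contractions eps^mu(p, lam) p_mu for lam = 0, 1, 2, 3.
contractions = eps @ p_down

# Mass-shell check: p^mu p_mu should vanish for a massless particle.
p_squared = p_up @ p_down

print("eps.p for lam = 0..3:", contractions)
print("p^2 =", p_squared)
```

In this frame the contraction vanishes for $\lambda = 1, 2$ and equals $+p$ and $-p$ for $\lambda = 0$ and $\lambda = 3$ respectively, which is the source of the combination $c_0 + c_3$ that appears in the physical state condition later in the section.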
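with $\lambda, \lambda' = 0, 1, 2, 3$, and $[a_{\mathbf{p},\lambda}, a_{\mathbf{q},\lambda'}] = [a^\dagger_{\mathbf{p},\lambda}, a^\dagger_{\mathbf{q},\lambda'}] = 0$. The crucial new point here is that the commutator (4.106) with $\lambda = \lambda' = 0$ has the "wrong" sign, because $\eta_{00} = +1$. The consequence of this sign is apparently a disaster. Consider the states

$|\mathbf{p},\lambda\rangle \equiv (2\omega_{\mathbf{p}})^{1/2}\, a^\dagger_{\mathbf{p},\lambda}|0\rangle$ (4.107)

and try to interpret them as one-particle states. The norm of these states is

$\langle\mathbf{p},\lambda|\mathbf{p},\lambda\rangle = (2\omega_{\mathbf{p}})\langle 0|a_{\mathbf{p},\lambda}a^\dagger_{\mathbf{p},\lambda}|0\rangle = (2\omega_{\mathbf{p}})\langle 0|[a_{\mathbf{p},\lambda}, a^\dagger_{\mathbf{p},\lambda}]|0\rangle = -\eta_{\lambda\lambda}\, 2\omega_{\mathbf{p}} V$ (4.108)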
(where, in a finite volume, $(2\pi)^3\delta^{(3)}(0) = V$, see eq. (4.7)). Therefore, the state created by the oscillator with $\lambda = 0$ has a negative norm! Since the scalar products in quantum mechanics are interpreted as probability amplitudes, a Fock space with a scalar product which is not positive definite has no probabilistic interpretation. On the other hand, we must observe that the states that create the problem have no counterpart in the canonical quantization of the electromagnetic field discussed in the previous section; we have seen in fact that the physical states are only those associated with transverse polarization vectors. The states created by $a^\dagger_{\mathbf{p},0}$ and by $a^\dagger_{\mathbf{p},3}$ (in the frame where $\mathbf{p}$ is along the third axis) are unphysical. We must now recall that the theory that we have quantized so far is not electrodynamics, because of the extra term $(\partial_\mu A^\mu)^2$ in the Lagrangian. The basic idea of the covariant quantization of the electromagnetic field, or Gupta–Bleuler quantization, is to start from the apparently different theory (4.102) and to recover a quantum theory of the electromagnetic field by imposing a restriction on the Fock space: we define the subspace of physical states requiring that for any two physical states $|{\rm phys}\rangle$, $|{\rm phys}'\rangle$

$\langle{\rm phys}'|\,\partial_\mu A^\mu\,|{\rm phys}\rangle = 0$ . (4.109)
In other words, $\partial_\mu A^\mu = 0$, rather than being imposed at the level of the Lagrangian (4.102), is recovered as an operator equation on physical states, and we expect that the quantization of the theory (4.102), supplemented with the condition (4.109), is equivalent to the canonical quantization studied in the previous section. We must therefore study whether imposing the condition (4.109) is sufficient to eliminate the states with negative norm from the physical space and whether the final result is a Fock space of transverse photons, as we expect from the previous section. We first observe that the operator $\partial_\mu A^\mu$ can be separated into its positive and negative frequency parts,

$\partial_\mu A^\mu = (\partial_\mu A^\mu)^+ + (\partial_\mu A^\mu)^-$ (4.110)

where $(\partial_\mu A^\mu)^+$ contains only the positive frequency part, i.e. the annihilation operators, and $(\partial_\mu A^\mu)^-$ contains the creation operators,

$(\partial_\mu A^\mu)^+ = -i \displaystyle\int \frac{d^3p}{(2\pi)^3\sqrt{2\omega_{\mathbf{p}}}} \sum_{\lambda=0}^{3} p^\mu\epsilon_\mu(\mathbf{p},\lambda)\, a_{\mathbf{p},\lambda}\, e^{-ipx}$ , (4.111)

$(\partial_\mu A^\mu)^- = i \displaystyle\int \frac{d^3p}{(2\pi)^3\sqrt{2\omega_{\mathbf{p}}}} \sum_{\lambda=0}^{3} p^\mu\epsilon^*_\mu(\mathbf{p},\lambda)\, a^\dagger_{\mathbf{p},\lambda}\, e^{ipx}$ . (4.112)

Since $(\partial_\mu A^\mu)^- = [(\partial_\mu A^\mu)^+]^\dagger$, eq. (4.109) is satisfied if we define the physical states from the condition

$(\partial_\mu A^\mu)^+|{\rm phys}\rangle = 0$ , (4.113)
since then automatically $\langle{\rm phys}|(\partial_\mu A^\mu)^- = 0$ for any physical state. Equation (4.113) will be taken as the definition of the physical subspace. Since it has the form of a linear operator applied to a state, it preserves the linear structure of the physical Hilbert space: if $|{\rm phys}_1\rangle$ and $|{\rm phys}_2\rangle$ are physical states, then $\alpha|{\rm phys}_1\rangle + \beta|{\rm phys}_2\rangle$ is a physical state. Let us examine what this condition means for one-particle states. We consider a state $|\psi\rangle = \sum_\lambda c_\lambda\, a^\dagger_{\mathbf{k},\lambda}|0\rangle$, i.e. the most general superposition of polarization states with a given momentum $\mathbf{k}$, and we choose $\mathbf{k}$ along the third axis, $k^\mu = (k, 0, 0, k)$. On this state the physical state condition (4.113) becomes $c_0 + c_3 = 0$. We see that the two transverse photons $a^\dagger_{\mathbf{k},1}|0\rangle$ and $a^\dagger_{\mathbf{k},2}|0\rangle$, and any linear combination of them, are physical states. This is good news, since we know from the previous section that these are the true degrees of freedom of the photon. Consider now the subspace generated by $a^\dagger_{\mathbf{k},0}|0\rangle$ and $a^\dagger_{\mathbf{k},3}|0\rangle$. We see that neither $a^\dagger_{\mathbf{k},0}|0\rangle$ nor $a^\dagger_{\mathbf{k},3}|0\rangle$ is a physical state. This is also good news, since the state $a^\dagger_{\mathbf{k},0}|0\rangle$ is just the negative norm state, and $a^\dagger_{\mathbf{k},3}|0\rangle$, even if it has a positive norm, does not correspond to a physical polarization state. However the combination

$|\phi\rangle = (a^\dagger_{\mathbf{k},0} - a^\dagger_{\mathbf{k},3})|0\rangle$ (4.114)

has $c_0 = +1$, $c_3 = -1$ and satisfies the physical state condition $c_0 + c_3 = 0$. Therefore the most general one-particle state of the physical subspace, with momentum $\mathbf{k}$, is of the form

$|\psi_T\rangle + c|\phi\rangle$ (4.115)

where $|\psi_T\rangle$ is an arbitrary linear combination of the transverse states $a^\dagger_{\mathbf{k},1}|0\rangle$ and $a^\dagger_{\mathbf{k},2}|0\rangle$, $|\phi\rangle$ is given by eq. (4.114) and $c$ is an arbitrary constant. The question now is: what shall we do with $|\phi\rangle$, which has no counterpart in the canonical quantization? First of all, observe that $|\phi\rangle$ has zero norm,

$\langle\phi|\phi\rangle = \langle 0|(a_{\mathbf{k},0} - a_{\mathbf{k},3})(a^\dagger_{\mathbf{k},0} - a^\dagger_{\mathbf{k},3})|0\rangle = \langle 0|a_{\mathbf{k},0}a^\dagger_{\mathbf{k},0} + a_{\mathbf{k},3}a^\dagger_{\mathbf{k},3}|0\rangle = \langle 0|[a_{\mathbf{k},0}, a^\dagger_{\mathbf{k},0}] + [a_{\mathbf{k},3}, a^\dagger_{\mathbf{k},3}]|0\rangle = 0$ , (4.116)
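because the commutator $[a_{\mathbf{k},0}, a^\dagger_{\mathbf{k},0}]$ has the opposite sign of $[a_{\mathbf{k},3}, a^\dagger_{\mathbf{k},3}]$. This means that $|\phi\rangle$ is orthogonal to all physical states, since it is trivially orthogonal to all states of the form $|\psi_T\rangle$, and is also orthogonal to itself. Therefore all scalar products of $|\psi_T\rangle + c|\phi\rangle$ with any other physical state are the same as the scalar products of $|\psi_T\rangle$. Let us next look at the contribution of $|\phi\rangle$ to the energy and momentum. The energy and momentum in the covariant quantization are found as usual from the Noether theorem and are

$H = \displaystyle\int \frac{d^3k}{(2\pi)^3}\, \omega_{\mathbf{k}} \Big( -a^\dagger_{\mathbf{k},0}a_{\mathbf{k},0} + \sum_{\lambda=1,2,3} a^\dagger_{\mathbf{k},\lambda}a_{\mathbf{k},\lambda} \Big)$ , (4.117)

$\mathbf{P} = \displaystyle\int \frac{d^3k}{(2\pi)^3}\, \mathbf{k} \Big( -a^\dagger_{\mathbf{k},0}a_{\mathbf{k},0} + \sum_{\lambda=1,2,3} a^\dagger_{\mathbf{k},\lambda}a_{\mathbf{k},\lambda} \Big)$ , (4.118)

and the minus sign in front of $a^\dagger_{\mathbf{k},0}a_{\mathbf{k},0}$ is simply a consequence of Lorentz covariance; the terms in brackets can in fact be written as $-\eta^{\lambda\lambda'}a^\dagger_{\mathbf{k},\lambda}a_{\mathbf{k},\lambda'}$. Computing the matrix element of the energy and momentum operators between physical states, the contribution from the term $-a^\dagger_{\mathbf{k},0}a_{\mathbf{k},0}$ cancels that from the term $a^\dagger_{\mathbf{k},3}a_{\mathbf{k},3}$. In fact, the condition $c_0 + c_3 = 0$ can be rewritten as

$(a_{\mathbf{k},0} - a_{\mathbf{k},3})|\psi\rangle = 0$ . (4.119)

Then

$\langle{\rm phys}'|\, {-a^\dagger_{\mathbf{k},0}a_{\mathbf{k},0}} + a^\dagger_{\mathbf{k},3}a_{\mathbf{k},3}\,|{\rm phys}\rangle = \langle{\rm phys}'|(-a^\dagger_{\mathbf{k},0} + a^\dagger_{\mathbf{k},3})\,a_{\mathbf{k},3}|{\rm phys}\rangle = 0$ , (4.120)

where the first equality follows from eq. (4.119) and the second from its hermitian conjugate. This means that the contribution to the energy and to the momentum comes only from the transverse oscillators, and therefore, for one-particle states, it is determined completely by the transverse part $|\psi_T\rangle$ in eq. (4.115), and is independent of $c|\phi\rangle$. In conclusion, the states $|\psi_T\rangle + c|\phi\rangle$ and $|\psi_T\rangle$ have the same energy, momentum (and angular momentum, as can be checked similarly), and they have the same scalar product with all physical states. Therefore they are physically indistinguishable. We therefore introduce an equivalence relation, saying that $|\psi'_T\rangle$ is equivalent to $|\psi_T\rangle$,

$|\psi'_T\rangle \sim |\psi_T\rangle$ (4.121)

if for some constant $c$

$|\psi'_T\rangle = |\psi_T\rangle + c|\phi\rangle$ . (4.122)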
We then identify the photons as the equivalence classes with respect to this relation. As a representative of the equivalence class we can conveniently take the purely transverse state ψT . In any case, no physical result depends on this choice. The photon is described by two transverse degrees of freedom, and the energy, momentum (and angular momentum) coincide with those found performing the quantization in the radiation gauge. The generic multiparticle state is obtained tensoring this physical oneparticle state. As long as we quantize the free theory, this gives a consistent quantization scheme. In an interacting theory, one must however be careful to check that the interaction between physical states does not produce unphysical states.
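The Gupta–Bleuler construction for one-particle states can be illustrated with a small numerical sketch. It assumes a fixed momentum along the third axis, represents a state $\sum_\lambda c_\lambda a^\dagger_{\mathbf{k},\lambda}|0\rangle$ by its coefficient vector $(c_0, c_1, c_2, c_3)$, and drops the common positive factor $V$ from the scalar product; the function names and the sample coefficients are illustrative choices, not part of the text.

```python
import numpy as np

# By (4.106) the Gram matrix of the states a^dagger_{k,lam}|0> is -eta_{lam lam'}
# (the common positive factor (2 pi)^3 delta^(3)(0) = V is dropped here).
gram = np.diag([-1.0, 1.0, 1.0, 1.0])

def inner(c, d):
    """Indefinite scalar product <c|d> = sum conj(c_lam) (-eta_{lam lam'}) d_lam'."""
    return np.vdot(c, gram @ d)          # np.vdot conjugates its first argument

def is_physical(c, tol=1e-12):
    """One-particle form of the physical state condition (4.113): c0 + c3 = 0."""
    return abs(c[0] + c[3]) < tol

e0 = np.array([1, 0, 0, 0], dtype=complex)      # timelike polarization
e3 = np.array([0, 0, 0, 1], dtype=complex)      # longitudinal polarization
phi = e0 - e3                                   # the state (4.114)
psi_T = np.array([0, 1.0, 2.0j, 0], dtype=complex)   # sample transverse state
c = 0.4 - 1.3j                                  # arbitrary constant
psi = psi_T + c * phi                           # general physical state (4.115)

print("norm of e0        :", inner(e0, e0).real)
print("e0, e3 physical?  :", is_physical(e0), is_physical(e3))
print("phi physical?     :", is_physical(phi))
print("norm of phi       :", inner(phi, phi).real)
print("norm of psi_T     :", inner(psi_T, psi_T).real)
print("norm of psi_T+c*phi:", inner(psi, psi).real)
```

The sketch reproduces the pattern described above: the timelike state has negative norm and is excluded by the condition, $|\phi\rangle$ is physical but has zero norm, and adding $c|\phi\rangle$ to a transverse state does not change its norm.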
Summary of chapter

• The basic principle of the canonical quantization of a free scalar field is to promote the field $\phi$ and its conjugate momentum to operators, and to impose the equal time commutation relation (4.1). Expanding the field in plane waves, the coefficients $a_{\mathbf{p}}$ of the expansion become operators, and their complex conjugates $a^*_{\mathbf{p}}$ become the hermitian conjugate operators $a^\dagger_{\mathbf{p}}$. The commutation relation between $a_{\mathbf{p}}$ and $a^\dagger_{\mathbf{p}}$ is given by eq. (4.3) and shows that a free scalar field theory is equivalent to a collection of harmonic oscillators, one for each degree of freedom, labeled by the momentum $\mathbf{p}$.

• The Fock space is constructed in eqs. (4.8) and (4.9). It describes a multiparticle space. The operator $a_{\mathbf{p}}$, acting on a state of the Fock space, destroys a particle with momentum $\mathbf{p}$, while $a^\dagger_{\mathbf{p}}$ creates it. This is a crucial aspect of quantum field theory. The transition amplitudes between different states of the Fock space (that we will learn to compute in the following chapters) describe processes in which the number and the type of particles change, something which is impossible to describe using only first-quantized wave equations.

• The Hamiltonian and momentum operators are obtained from the classical expressions, performing the normal ordering. Equations (4.16) and (4.20) show that the state $a^\dagger_{\mathbf{p}}|0\rangle$ is a one-particle state with momentum $\mathbf{p}$ and energy $E_{\mathbf{p}} = \sqrt{\mathbf{p}^2 + m^2}$.

• The quantization of complex fields gives rise to two different kinds of quanta, i.e. each particle has its antiparticle, which has the same mass but opposite $U(1)$ charge.

• Spinor fields are quantized imposing anticommutation relations, eq. (4.31), and obey Fermi–Dirac statistics.

• The quantization of the electromagnetic field has a complication due to gauge invariance. One can choose between: (i) a description in which only the physical degrees of freedom appear, at the price of dealing in the intermediate steps with equations which are not explicitly Lorentz covariant (Section 4.3.1), and (ii) a description where Lorentz covariance is explicit, but in the intermediate steps we must deal with spurious degrees of freedom (Section 4.3.2). In any case, one ends up with a fully Lorentz and Poincaré invariant theory, describing a massless particle with spin 1 and two helicity states, the photon.
Exercises

(4.1) Positronium is a hydrogen-like bound state of an electron and a positron.
(i) Show that the parity of a positronium state with orbital angular momentum $L$ is $P = (-1)^{L+1}$.
(ii) The charge conjugation operator $\hat{C}$ exchanges the electron with the positron, and therefore positronium is an eigenstate of the charge conjugation operator, $\hat{C}|{\rm pos}\rangle = C|{\rm pos}\rangle$. Show that, on a positronium state with angular momentum $L$ and total spin $S$, the eigenvalue is $C = (-1)^{L+S}$.
(iii) In QED parity and charge conjugation are conserved, and therefore the positronium states have well-defined values of $C$ and $P$. From the results obtained above it follows that positronium states also have $L$, $S$ defined. The state with $S = 0$ is called parapositronium while $S = 1$ is called orthopositronium. Show that the ground state of parapositronium can decay into two photons while the ground state of orthopositronium cannot decay into two photons but can decay into three photons. We will compute explicitly the decay rate for the annihilation in two photons in Exercise 7.2.

(4.2) Show that the quantity $E_{\mathbf{p}}\,\delta^{(3)}(\mathbf{p} - \mathbf{k})$ is Lorentz invariant and therefore the one-particle states have a Lorentz-invariant normalization.

(4.3) Show that under charge conjugation $\bar{\Psi}\gamma^\mu\Psi$ changes sign, where $\Psi$ is the quantized field operator.

(4.4) Consider the Proca Lagrangian

$\mathcal{L} = -\frac{1}{4} F_{\mu\nu}F^{\mu\nu} + \frac{1}{2} m^2 A_\mu A^\mu$ , (4.123)

with $m \neq 0$.
(i) Verify that this theory is not gauge invariant and show that the equations of motion derived from this Lagrangian are

$(\Box + m^2)A^\mu = 0$ , $\partial_\mu A^\mu = 0$ . (4.124)
(ii) Perform the canonical quantization and verify that the theory describes a massive spin-1 particle.

(4.5) (i) Let $H$ be the second quantized Hamiltonian of a free real scalar field, see eq. (4.16), and $\beta$ a number. Prove the identity

$e^{-\beta H} a^\dagger_{\mathbf{p}} = a^\dagger_{\mathbf{p}}\, e^{-\beta(H + E_{\mathbf{p}})}$ . (4.125)

(ii) According to the rules of quantum mechanics, in a mixed state described by a density matrix $\rho$ the expectation value of any operator $O$ is given by ${\rm Tr}\,(\rho O)$, when $\rho$ is normalized by ${\rm Tr}\,\rho = 1$. On a thermal state with temperature $T = 1/\beta$ the density matrix is $\rho = e^{-\beta H}/{\rm Tr}\, e^{-\beta H}$ and therefore the thermal expectation values are defined as

$\langle O\rangle_\beta = \dfrac{{\rm Tr}\,(O e^{-\beta H})}{{\rm Tr}\, e^{-\beta H}}$ , (4.126)

where the trace is over the Fock space. Using the result obtained in (i), show that

$\langle a^\dagger_{\mathbf{p}} a_{\mathbf{q}}\rangle_\beta = \dfrac{(2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{q})}{e^{\beta E_{\mathbf{p}}} - 1}$ (4.127)

and therefore, in a finite volume $V$,

$\langle a^\dagger_{\mathbf{p}} a_{\mathbf{p}}\rangle_\beta = \dfrac{V}{e^{\beta E_{\mathbf{p}}} - 1}$ . (4.128)

This shows that $a^\dagger_{\mathbf{p}}/\sqrt{V}$ create quanta that obey the Bose–Einstein distribution.
(iii) Repeat the exercise for anticommuting operators and verify that one obtains Fermi–Dirac statistics.

(4.6) (i) Consider a gas of $N$ electrons in a box of volume $V$. Show that, in the ground state, the electrons fill all states with a momentum $\mathbf{p}$ such that $|\mathbf{p}| \leq p_F$, with $p_F$ given by

$\dfrac{N}{V} = \dfrac{p_F^3}{3\pi^2}$ . (4.129)

The state with all levels filled up to $p_F$ is called the Fermi vacuum and $p_F$ is called the Fermi momentum. We denote the Fermi vacuum by $|0\rangle_F$ to distinguish it from the Fock vacuum $|0\rangle$, where all levels are empty. The Fermi vacuum is the vacuum state of the system subject to the constraint of a fixed number of particles, $N$, while the Fock vacuum is the vacuum state of the system with no constraint on the particle number.
(ii) Let $a_{\mathbf{p},s}$, $a^\dagger_{\mathbf{p},s}$ be the usual annihilation and creation operators of the electron, introduced in Section 4.2.1. By definition $a_{\mathbf{p},s}|0\rangle = 0$ for all $\mathbf{p}$. Verify that it is not true that $a_{\mathbf{p},s}|0\rangle_F = 0$ for all $\mathbf{p}$. As a consequence, $a^\dagger_{\mathbf{p},s}$ is not the appropriate operator to describe the excitations above $|0\rangle_F$. Define

$A_{\mathbf{p},s} = \theta(|\mathbf{p}| - p_F)\, a_{\mathbf{p},s} + \theta(p_F - |\mathbf{p}|)\, a^\dagger_{-\mathbf{p},-s}$ ,
$A^\dagger_{\mathbf{p},s} = \theta(|\mathbf{p}| - p_F)\, a^\dagger_{\mathbf{p},s} + \theta(p_F - |\mathbf{p}|)\, a_{-\mathbf{p},-s}$ , (4.130)

where $\theta$ is the step function. Verify that $A_{\mathbf{p},s}|0\rangle_F = 0$ for all $\mathbf{p}$ and that $A_{\mathbf{p},s}$ and $A^\dagger_{\mathbf{q},r}$ still satisfy canonical anticommutation relations. Give a physical interpretation of the action of $A^\dagger_{\mathbf{q},r}$ on $|0\rangle_F$.
(iii) Equation (4.130) is a special case of a Bogoliubov transformation, which can be defined both on bosonic and on fermionic operators, as

$A_{\mathbf{p},s} = \alpha_{\mathbf{p}}\, a_{\mathbf{p},s} - \beta_{\mathbf{p}}\, a^\dagger_{-\mathbf{p},-s}$ ,
$A^\dagger_{\mathbf{p},s} = \alpha^*_{\mathbf{p}}\, a^\dagger_{\mathbf{p},s} - \beta^*_{\mathbf{p}}\, a_{-\mathbf{p},-s}$ , (4.131)

with $\alpha_{\mathbf{p}}$, $\beta_{\mathbf{p}}$ complex coefficients (for the spin zero case just omit the spin index $s$). Show that, in the bosonic case, the condition that $A_{\mathbf{p},s}$ and $A^\dagger_{\mathbf{p},s}$ satisfy the canonical commutation relations requires that

$|\alpha_{\mathbf{p}}|^2 - |\beta_{\mathbf{p}}|^2 = 1$ , $\alpha_{\mathbf{p}}\beta_{-\mathbf{p}} - \alpha_{-\mathbf{p}}\beta_{\mathbf{p}} = 0$ , (4.132)

while for fermions

$|\alpha_{\mathbf{p}}|^2 + |\beta_{\mathbf{p}}|^2 = 1$ , $\alpha_{\mathbf{p}}\beta_{-\mathbf{p}} + \alpha_{-\mathbf{p}}\beta_{\mathbf{p}} = 0$ . (4.133)

(iv) Considering for notational simplicity the spin zero case, let $|0, a\rangle$ be the vacuum state annihilated by the operators $a_{\mathbf{p}}$ and $|0, A\rangle$ be the vacuum state annihilated by the operators $A_{\mathbf{p}}$. Correspondingly we have two different types of particles, the "a"-particles obtained acting with $a^\dagger_{\mathbf{p}}$ on $|0, a\rangle$ and the "A"-particles obtained acting with $A^\dagger_{\mathbf{p}}$ on $|0, A\rangle$. The respective particle number operators (setting for simplicity the spatial volume $V = 1$) are $A^\dagger_{\mathbf{p}}A_{\mathbf{p}}$ and $a^\dagger_{\mathbf{p}}a_{\mathbf{p}}$. Let $|n_{\mathbf{p}}\rangle$ be an eigenstate of the operator $a^\dagger_{\mathbf{p}}a_{\mathbf{p}}$ with eigenvalue $n_{\mathbf{p}}$ and define

$N_{\mathbf{p}} \equiv \langle n_{\mathbf{p}}|A^\dagger_{\mathbf{p}}A_{\mathbf{p}}|n_{\mathbf{p}}\rangle$ . (4.134)

Show that

$N_{\mathbf{p}} = n_{\mathbf{p}} + 2|\beta_{\mathbf{p}}|^2\left(n_{\mathbf{p}} + \frac{1}{2}\right)$ . (4.135)

In particular, on the vacuum of the "a"-particles, $N_{\mathbf{p}} = |\beta_{\mathbf{p}}|^2$. This means that, in terms of "A"-particles, the vacuum of the "a"-particles is a multiparticle state. Bogoliubov transformations of this type are used in condensed matter physics, in the context of superfluidity or superconductivity, and also in cosmology, to compute particle production by gravitational fields.
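As a numerical cross-check of eq. (4.135), and not a substitute for the analytic proof requested in Exercise 4.6, the following sketch builds the bosonic operators for the two modes $\mathbf{p}$ and $-\mathbf{p}$ on a truncated Fock space. It rests on several assumptions that are not in the text: the cutoff `D`, the parametrization $\alpha = \cosh r$, $\beta = e^{i\theta}\sinh r$ taken equal for $\mathbf{p}$ and $-\mathbf{p}$, and a state with the same occupation number $n$ in both modes.

```python
import numpy as np

D = 8  # hypothetical cutoff per mode; occupation numbers n <= D - 2 are exact here

a = np.diag(np.sqrt(np.arange(1, D)), k=1)   # single-mode annihilation operator
identity = np.eye(D)
a_p = np.kron(a, identity)                   # annihilates a quantum in mode  p
a_m = np.kron(identity, a)                   # annihilates a quantum in mode -p

def number_state(n_p, n_m):
    """Basis vector |n_p, n_m> in the two-mode truncated Fock space."""
    v = np.zeros(D * D, dtype=complex)
    v[n_p * D + n_m] = 1.0
    return v

def A_occupation(n, alpha, beta):
    """<n, n| A_p^dagger A_p |n, n> with A_p = alpha a_p - beta a_{-p}^dagger, as in (4.131)."""
    A_p = alpha * a_p - beta * a_m.conj().T
    w = A_p @ number_state(n, n)
    return np.vdot(w, w).real

r, theta = 0.7, 0.3                          # illustrative parameters
alpha, beta = np.cosh(r), np.exp(1j * theta) * np.sinh(r)

print("|alpha|^2 - |beta|^2 =", abs(alpha) ** 2 - abs(beta) ** 2)
for n in range(4):
    print(n, A_occupation(n, alpha, beta), n + 2 * abs(beta) ** 2 * (n + 0.5))
```

For each $n$ the two printed numbers should agree up to rounding, and for $n = 0$ both reduce to $|\beta_{\mathbf{p}}|^2$, the number of "A"-particles in the vacuum of the "a"-particles.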
5 Perturbation theory and Feynman diagrams

5.1 The S-matrix
5.2 LSZ reduction formula
5.3 Perturbative expansion
5.4 Feynman propagator
5.5 Feynman diagrams
5.6 Renormalization
5.7 Vacuum energy and the cosmological constant
5.8 The modern point of view on renormalizability
5.9 Running couplings

In Chapter 4 we studied the quantization of free fields. We now introduce the interaction. In the canonical quantization, perturbation theory is developed more easily using the Hamiltonian formalism (the Lagrangian formalism is instead more useful in the path integral quantization that will be discussed in Chapter 9). We therefore consider a general field theory with a Hamiltonian

$H = H_0 + H_{\rm int}$ (5.1)

where $H_0$ is the free Hamiltonian and $H_{\rm int}$ is the interaction term. The interaction term will be considered small. For instance in QED

$H_{\rm int} = \displaystyle\int d^3x\, \mathcal{H}_{\rm int} = -\int d^3x\, \mathcal{L}_{\rm int}$ (5.2)

with

$\mathcal{L}_{\rm int} = -e A_\mu \bar{\Psi}\gamma^\mu\Psi$ (5.3)

as discussed in Section 3.5.4. (Note that the identity $\mathcal{H}_{\rm int} = -\mathcal{L}_{\rm int}$ holds only when the interaction Lagrangian does not contain derivatives of the fields.) The smallness of the interaction follows from the fact that the parameter which turns out to be relevant for the perturbative expansion is $\alpha = e^2/(4\pi) \simeq 1/137$. A useful toy model for learning the basic techniques is a quartic self-interaction of a scalar field. In this case $H_0$ corresponds to the free Klein–Gordon theory, and

$\mathcal{H}_{\rm int} = \dfrac{\lambda}{4!}\phi^4$ , (5.4)

with $\lambda$ a dimensionless coupling constant. Perturbation theory will be meaningful in the weak coupling regime, $\lambda \ll 1$.

5.1 The S-matrix

In the Schrödinger picture we consider a state $|a(t)\rangle$ which, at an initial time $T_i$, is an eigenstate of a set of commuting operators, with eigenvalues labeled collectively by $a$. Typically, $a$ will be the set of momenta and spins of the incoming particles. Let us denote $|a(T_i)\rangle$ simply by
$|a\rangle$. Similarly, we consider a state $|b(t)\rangle$ that, at a final time $T_f$, is an eigenstate with eigenvalues $b$, and we denote $|b(T_f)\rangle$ simply as $|b\rangle$. The state $|a(t)\rangle$ evolves as $|a(t)\rangle = e^{-iH(t-T_i)}|a\rangle$ and therefore at the final time $T_f$ it has evolved into $e^{-iH(T_f-T_i)}|a\rangle$. The amplitude for the process in which the initial state $|a\rangle$ evolves into the final state $|b\rangle$ is therefore given by

$\langle b|e^{-iH(T_f-T_i)}|a\rangle$ . (5.5)

In the limit $T_f - T_i \to \infty$ the evolution operator $e^{-iH(T_f-T_i)}$, with $H$ the second quantized Hamiltonian of field theory, is called the S-matrix. Therefore $S$ is an operator that maps an initial state to a final state,

$|a\rangle \to S|a\rangle$ , (5.6)

and the scattering amplitudes are given by its matrix elements, $\langle b|S|a\rangle$. Observe that $S$ is a unitary operator, $SS^\dagger = S^\dagger S = 1$. In fact, if $|a\rangle$ is an initial state, normalized as $\langle a|a\rangle = 1$, and $|n\rangle$ is a complete set of states, the probability that $|a\rangle$ evolves into $|n\rangle$, summed over all $|n\rangle$, must be 1,

$\displaystyle\sum_n |\langle n|S|a\rangle|^2 = 1$ . (5.7)

On the other hand we can write

$\displaystyle\sum_n |\langle n|S|a\rangle|^2 = \sum_n \langle a|S^\dagger|n\rangle\langle n|S|a\rangle = \langle a|S^\dagger S|a\rangle$ , (5.8)

since $|n\rangle$ is a complete set and therefore $\sum_n |n\rangle\langle n| = 1$. This means that $\langle a|S^\dagger S|a\rangle = 1$ for $|a\rangle$ arbitrary, and we conclude that $S^\dagger S = SS^\dagger = 1$. We see that the unitarity of the S-matrix expresses the conservation of probability. It is also convenient to define the T-matrix, separating the identity operator,

$S = 1 + iT$ . (5.9)

In terms of $T$ the condition $SS^\dagger = 1$ becomes

$-i(T - T^\dagger) = TT^\dagger$ . (5.10)

Denoting the matrix element $\langle b|T|a\rangle$ by $T_{ba}$, and inserting a complete set of states, the above equation reads

$-i(T_{ba} - T^*_{ab}) = \displaystyle\sum_n T_{bn}T^*_{an}$ , (5.11)

and in particular, if $a = b$,

$2\,{\rm Im}\, T_{aa} = \displaystyle\sum_n |T_{an}|^2$ . (5.12)

Therefore unitarity relates the imaginary part of the diagonal matrix element $T_{aa}$ to the squared modulus $|T_{an}|^2$, summed over all possible intermediate states. In quantum field theory the Heisenberg representation is often more useful than the Schrödinger representation. The reason is that in QFT
the operators are just the fields, so in the Heisenberg representation the quantum fields depend both on $\mathbf{x}$ and $t$ while in the Schrödinger representation they depend only on $\mathbf{x}$. The Heisenberg representation is therefore more natural from the point of view of Lorentz covariance. Given a state $|a(t)\rangle$ in the Schrödinger representation, in the Heisenberg picture we define the state $|a\rangle_H$ as $|a\rangle_H = e^{iHt}|a(t)\rangle$. If $A$ is an operator in the Schrödinger representation, the corresponding Heisenberg operator $A_H$ is defined as $A_H(t) = e^{iHt}Ae^{-iHt}$. Since $|a(t)\rangle$ evolves with $e^{-iHt}$, and $A$ is time-independent, by definition in the Heisenberg picture the states $|a\rangle_H$ are independent of $t$ while the operators $A_H$ evolve with time. Writing $|a\rangle_H = e^{iHt}|a(t)\rangle$ at time $t = T_i$ and recalling that we denoted $|a(T_i)\rangle$ simply as $|a\rangle$, we can write

$|a, T_i\rangle_H = e^{iHT_i}|a\rangle$ . (5.13)

Note that, even if it is time-independent, the Heisenberg state $|a\rangle_H$ carries a label $T_i$ which was implicit in the definition of $|a\rangle$, and therefore we have denoted it as $|a, T_i\rangle_H$. This label tells us of what Heisenberg operator the state $|a, T_i\rangle_H$ is an eigenvector. For instance, suppose that in the Schrödinger representation the state $|x_0\rangle$, at $t = t_0$, is an eigenvector of the position operator $\hat{x}$, and let $\hat{x}_H(t) = e^{iHt}\hat{x}e^{-iHt}$. Then the state $|x_0, t_0\rangle_H = e^{iHt}|x_0(t)\rangle$ is an eigenvector of the Heisenberg position operator $\hat{x}_H(t_0)$ but it is not an eigenvector of the operator $\hat{x}_H(t_1)$ with $t_1 \neq t_0$. Similarly to eq. (5.13) (and omitting hereafter the subscript "H" on states in the Heisenberg representation), we have

$|b, T_f\rangle = e^{iHT_f}|b\rangle$ , (5.14)

and in terms of the states in the Heisenberg picture the matrix element (5.5) is written as

$\langle b|S|a\rangle = \langle b, T_f|a, T_i\rangle$ . (5.15)
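The unitarity relations (5.9)–(5.12) of this section hold for any unitary operator, so they can be checked on a finite-dimensional toy model. The sketch below assumes a random Hermitian matrix standing in for $H(T_f - T_i)$ on a hypothetical six-dimensional space; it illustrates the algebra only and is not a field-theory computation.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 6   # hypothetical number of states |n> in the toy model

# Random Hermitian matrix playing the role of H (T_f - T_i).
M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
H = (M + M.conj().T) / 2

# S = exp(-i H), built from the eigen-decomposition of H.
w, U = np.linalg.eigh(H)
S = U @ np.diag(np.exp(-1j * w)) @ U.conj().T

# T-matrix defined by S = 1 + i T, eq. (5.9).
T = -1j * (S - np.eye(dim))

unitarity_defect = np.abs(S @ S.conj().T - np.eye(dim)).max()
lhs_510 = -1j * (T - T.conj().T)            # left-hand side of (5.10)
rhs_510 = T @ T.conj().T                    # right-hand side of (5.10)

# Eq. (5.12): 2 Im T_aa = sum_n |T_an|^2 for every a.
lhs_512 = 2 * np.imag(np.diag(T))
rhs_512 = np.sum(np.abs(T) ** 2, axis=1)

print("max |S S^dagger - 1| =", unitarity_defect)
print("max defect in (5.10) =", np.abs(lhs_510 - rhs_510).max())
print("max defect in (5.12) =", np.abs(lhs_512 - rhs_512).max())
```

All three printed defects should be at the level of floating-point rounding, and the right-hand side of (5.12) is manifestly non-negative, so the diagonal elements $T_{aa}$ have non-negative imaginary parts.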
5.2 The LSZ reduction formula

Consider a generic S-matrix element written in the Heisenberg picture,

$\langle p_1, p_2, \ldots, p_n; T_f|k_1, k_2, \ldots, k_m; T_i\rangle$ . (5.16)

It is understood that at the end of the computation $T_f \to +\infty$ and $T_i \to -\infty$. For notational simplicity we consider a single species of neutral scalar particle, so the states are labeled just by their momenta, but all our considerations can be generalized to particles with spin. Our first step will be to relate this matrix element to the expectation value of some operator on the vacuum state. We begin by observing that the expansion of a free real scalar field in terms of creation and annihilation operators, eq. (4.2), can be inverted
to give

$(2E_{\mathbf{k}})^{1/2}\, a_{\mathbf{k}} = i \displaystyle\int d^3x\, e^{ikx}\, \overleftrightarrow{\partial_0}\, \phi_{\rm free}$ , (5.17)

$(2E_{\mathbf{k}})^{1/2}\, a^\dagger_{\mathbf{k}} = -i \displaystyle\int d^3x\, e^{-ikx}\, \overleftrightarrow{\partial_0}\, \phi_{\rm free}$ , (5.18)

as one easily verifies substituting eq. (4.2) in the above equations and performing the integration over $d^3x$. Note that in eqs. (5.17) and (5.18) the integrands are time-dependent but the integrals are independent of $t$. We have denoted the field by $\phi_{\rm free}$ to stress that eqs. (5.17) and (5.18) hold only if the field is free. When the field is not free, it cannot be expanded in terms of creation and annihilation operators as in eq. (4.2), and eqs. (5.17) and (5.18) do not hold. However, as $t \to -\infty$ we intuitively expect that the theory reduces to a free theory, since all incoming particles are infinitely far apart and, if the interaction decreases sufficiently fast with the distance, there will be no difference between a free and an interacting theory.1 These intuitive considerations are formalized by the hypothesis that, as $t \to -\infty$,

$\phi(x) \to Z^{1/2}\phi_{\rm in}(x)$ , (5.19)

where $\phi_{\rm in}(x)$ is a free field and $Z$ is a c-number, known as wave function renormalization. We will discuss later the physical meaning of $Z$, and how to compute it. Similarly we assume that, as $t \to +\infty$,

$\phi(x) \to Z^{1/2}\phi_{\rm out}(x)$ , (5.20)

with $\phi_{\rm out}$ again a free field, and the same constant $Z$. The limits in eqs. (5.19) and (5.20) must be understood in the weak sense, i.e. they are assumed to hold not as operator equations, but only when we take matrix elements.2

1 The most important example where the interaction does not decrease at large distances is the interaction of quarks in QCD. As a consequence, quarks are not seen as free particles (they are "confined" inside hadrons), and the free particles seen at $t \to \pm\infty$ are rather the hadrons. We will discuss in Problem 8.2 how to proceed in these cases.

2 Using a technique known as Källén–Lehmann representation (see Weinberg (1995), Section 10.7) one can show that eq. (5.19) cannot hold as an operator equation, since otherwise one would find that $Z = 1$ and that $\phi$ is a free field; see, e.g., Itzykson and Zuber (1980), Section 5.1.2.

We now consider eq. (5.18) with $\phi_{\rm in}$ playing the role of the free field $\phi_{\rm free}$. As we observed above, the integrand in eq. (5.18) is time-dependent, but the result of the integration is independent of $t$. We can therefore perform it at $t \to -\infty$, and use eq. (5.19) to write

$(2E_{\mathbf{k}})^{1/2}\, a^{\dagger,({\rm in})}_{\mathbf{k}} = -i \displaystyle\int d^3x\, e^{-ikx}\, \overleftrightarrow{\partial_0}\, \phi_{\rm in}\Big|_{t\to-\infty} = -iZ^{-1/2} \lim_{t\to-\infty} \int d^3x\, e^{-ikx}\, \overleftrightarrow{\partial_0}\, \phi$ , (5.21)

where the superscript "in" means that the operator $a^\dagger_{\mathbf{k}}$ acts on the space of initial states at $T_i = -\infty$. Similarly, we define creation operators acting on the final states as

$(2E_{\mathbf{k}})^{1/2}\, a^{\dagger,({\rm out})}_{\mathbf{k}} = -i \displaystyle\int d^3x\, e^{-ikx}\, \overleftrightarrow{\partial_0}\, \phi_{\rm out}\Big|_{t\to+\infty} = -iZ^{-1/2} \lim_{t\to+\infty} \int d^3x\, e^{-ikx}\, \overleftrightarrow{\partial_0}\, \phi$ . (5.22)
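The inversion formula (5.17), and the fact that the integral on its right-hand side is independent of $t$ for a free field, can be checked numerically in a simplified setting. The sketch below assumes one space dimension, a periodic box of length `L` with a finite-volume normalization of the mode sum, and c-number coefficients standing in for the operators $a_{\mathbf{p}}$; all of these choices, and the function names, are assumptions made for illustration.

```python
import numpy as np

# Hypothetical setup: 1 space dimension, periodic box of length L, mass m.
L, N, m = 10.0, 64, 1.0
x = np.arange(N) * L / N                    # grid used for the x-integral
n = np.arange(-3, 4)                        # a few allowed modes, p = 2 pi n / L
p = 2 * np.pi * n / L
E = np.sqrt(p ** 2 + m ** 2)

rng = np.random.default_rng(1)
a = rng.normal(size=n.size) + 1j * rng.normal(size=n.size)   # stand-ins for a_p

def field(t):
    """phi(t, x) and d_0 phi(t, x) for the free mode sum
    phi = (1/L) sum_p [a_p e^{-ipx} + c.c.] / sqrt(2 E_p), with px = E_p t - p x."""
    pos = (a / np.sqrt(2 * E))[:, None] * np.exp(
        -1j * (E[:, None] * t - p[:, None] * x[None, :]))
    phi = (pos + pos.conj()).sum(axis=0) / L
    dphi = (-1j * E[:, None] * pos + 1j * E[:, None] * pos.conj()).sum(axis=0) / L
    return phi, dphi

def extract_a(j, t):
    """Finite-volume analogue of (5.17) for the mode k = p[j], evaluated at time t:
    i * int dx [e^{ikx} d_0 phi - (d_0 e^{ikx}) phi] / sqrt(2 E_k)."""
    phi, dphi = field(t)
    wave = np.exp(1j * (E[j] * t - p[j] * x))      # e^{ikx}
    dwave = 1j * E[j] * wave                       # d_0 e^{ikx}
    integral = np.sum(wave * dphi - dwave * phi) * (L / N)
    return 1j * integral / np.sqrt(2 * E[j])

for t in (0.0, 0.3, 5.7):
    print(t, abs(extract_a(2, t) - a[2]))
```

The printed differences should be at rounding level for every $t$, which illustrates both the inversion and the time independence. For an interacting field the same integral would depend on $t$, which is why the limits $t \to \pm\infty$ appear in eqs. (5.21) and (5.22).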
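Observe that in eqs. (5.21) and (5.22) the final integral depends on time, since it is performed with $\phi$ rather than with a free field; $a^{\dagger,({\rm in})}_{\mathbf{k}}$ is defined taking the limit $t \to -\infty$ of this integral while $a^{\dagger,({\rm out})}_{\mathbf{k}}$ is defined taking the limit $t \to +\infty$, and the relation between in and out creation operators is non-trivial. Recalling our normalization (4.10) for one-particle states, we see that we can eliminate the particle with momentum $k_1$ from the initial state writing

$\langle p_1, p_2, \ldots, p_n; T_f|k_1, k_2, \ldots, k_m; T_i\rangle$
$\quad = (2E_{\mathbf{k}_1})^{1/2}\, \langle p_1, p_2, \ldots, p_n; T_f|\, a^{\dagger,({\rm in})}_{\mathbf{k}_1}\, |k_2, \ldots, k_m; T_i\rangle$
$\quad = -iZ^{-1/2} \displaystyle\lim_{t\to-\infty} \int d^3x\, e^{-ik_1x}\, \overleftrightarrow{\partial_0}\, \langle p_1, p_2, \ldots, p_n; T_f|\phi(x)|k_2, \ldots, k_m; T_i\rangle$ . (5.23)

The idea is to iterate the process removing all particles from the initial and final states. We perform the computation in detail. First of all, eq. (5.23) can be written in an explicitly covariant form. We use the fact that, for any integrable function $f(t, \mathbf{x})$, we have the identity

$\Big(\displaystyle\lim_{t\to+\infty} - \lim_{t\to-\infty}\Big) \int d^3x\, f(t,\mathbf{x}) = \int_{-\infty}^{\infty} dt\, \frac{\partial}{\partial t} \int d^3x\, f(t,\mathbf{x})$ . (5.24)

Applying this identity to the function $f(t,\mathbf{x}) = -iZ^{-1/2}e^{-ikx}\,\overleftrightarrow{\partial_0}\,\phi$ and using eqs. (5.21) and (5.22) we find

$(2E_{\mathbf{k}})^{1/2}\big(a^{\dagger,({\rm in})}_{\mathbf{k}} - a^{\dagger,({\rm out})}_{\mathbf{k}}\big) = iZ^{-1/2} \displaystyle\int d^4x\, \partial_0\big(e^{-ikx}\, \overleftrightarrow{\partial_0}\, \phi\big)$ . (5.25)

The integral in this equation can be written in a covariant form observing that

$\displaystyle\int d^4x\, \partial_0\big(e^{-ikx}\, \overleftrightarrow{\partial_0}\, \phi\big) = \int d^4x\, \partial_0\big(e^{-ikx}\partial_0\phi - \phi\,\partial_0 e^{-ikx}\big)$
$\quad = \displaystyle\int d^4x\, \big(e^{-ikx}\partial_0^2\phi - \phi\,\partial_0^2 e^{-ikx}\big)$
$\quad = \displaystyle\int d^4x\, \big[e^{-ikx}\partial_0^2\phi - \phi\,(\nabla^2 - m^2)\, e^{-ikx}\big]$ , (5.26)

where in the last line we used the fact that $k^2 = m^2$, since $k^\mu$ is the four-momentum of an initial or final particle with mass $m$, and therefore $\partial_0^2 e^{-ikx} = (\nabla^2 - m^2)e^{-ikx}$. It is understood that our initial and final particle states, which we have written simply as states with definite momentum, i.e. plane waves, will be convoluted to form wave packets, so at each given time they are localized in space. This means that we can integrate $\nabla^2$ twice by parts (while $\partial_0$ cannot be integrated by parts, since $\phi$ is not localized in time), and we find

$(2E_{\mathbf{k}})^{1/2}\big(a^{\dagger,({\rm in})}_{\mathbf{k}} - a^{\dagger,({\rm out})}_{\mathbf{k}}\big) = iZ^{-1/2} \displaystyle\int d^4x\, e^{-ikx}(\Box + m^2)\phi(x)$ . (5.27)

Therefore

$(2E_{\mathbf{k}_1})^{1/2}\, \langle p_1, p_2, \ldots, p_n; T_f|\big(a^{\dagger,({\rm in})}_{\mathbf{k}_1} - a^{\dagger,({\rm out})}_{\mathbf{k}_1}\big)|k_2, \ldots, k_m; T_i\rangle$
$\quad = iZ^{-1/2} \displaystyle\int d^4x\, e^{-ik_1x}(\Box + m^2)\, \langle p_1, p_2, \ldots, p_n; T_f|\phi(x)|k_2, \ldots, k_m; T_i\rangle$ . (5.28)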
The reader uninterested in the derivation can just take note of the definition of timeordered product in eq. (5.32) and then can jump directly to eq. (5.40).
3 In the language of Feynman diagrams that we will explain below, this means that we can restrict to connected diagrams.
The operator $a_{k_1}^{\dagger,(\rm out)}$ acts on the state to its left, destroying an out particle with momentum $k_1$. We assume that none of the final momenta $p_j$ coincides with an initial momentum $k_i$. This eliminates processes in which one of the particles behaves as a “spectator” and does not interact with the other particles.³ Then $a_{k_1}^{\dagger,(\rm out)}$ acting on the state on its left gives zero, because the particle that it would annihilate is absent, and the left-hand side of eq. (5.28) coincides with the expression that appears in eq. (5.23). The conclusion is that we can remove the particle with momentum $k_1$ from the initial state, at the price of inserting the operator
$$iZ^{-1/2}\int d^4x\; e^{-ik_1x}(\Box+m^2)\phi(x) \tag{5.29}$$
in the matrix element, i.e.
$$\begin{aligned}
&\langle p_1,p_2,\ldots,p_n;T_f|k_1,k_2,\ldots,k_m;T_i\rangle \\
&\qquad = iZ^{-1/2}\int d^4x\; e^{-ik_1x}(\Box+m^2)\,\langle p_1,p_2,\ldots,p_n;T_f|\phi(x)|k_2,\ldots,k_m;T_i\rangle .
\end{aligned}\tag{5.30}$$
Now we would like to iterate the procedure, eliminating all initial and final particles and remaining with the vacuum expectation value of some combination of fields. For instance, we next eliminate the final particle with momentum $p_1$. Following the same strategy adopted before, we write
$$\langle p_1,p_2,\ldots,p_n;T_f|\phi(x)|k_2,\ldots,k_m;T_i\rangle = (2E_{p_1})^{1/2}\,\langle p_2,\ldots,p_n;T_f|\,a_{p_1}^{(\rm out)}\phi(x)\,|k_2,\ldots,k_m;T_i\rangle . \tag{5.31}$$
We now define the time-ordered product, or simply the T product, of two fields as follows,
$$T\{\phi(y)\phi(x)\} = \begin{cases}\phi(y)\phi(x) & y^0>x^0 \\ \phi(x)\phi(y) & y^0<x^0\end{cases}\tag{5.32}$$
or
$$T\{\phi(y)\phi(x)\} = \theta(y^0-x^0)\,\phi(y)\phi(x)+\theta(x^0-y^0)\,\phi(x)\phi(y) , \tag{5.33}$$
where $\theta(x^0)$ is the step function: $\theta(x^0)=1$ if $x^0>0$ and $\theta(x^0)=0$ if $x^0<0$. Taking the hermitian conjugate of eq. (5.21) we see that $a_{p_1}^{(\rm in)}$ is constructed in terms of $\phi(y)$ with $y^0\to-\infty$, and therefore
$$T\{a_{p_1}^{(\rm in)}\phi(x)\} = \phi(x)\,a_{p_1}^{(\rm in)} . \tag{5.34}$$
Similarly, $a_{p_1}^{(\rm out)}$ is constructed in terms of $\phi(y)$ with $y^0\to+\infty$ and
$$T\{a_{p_1}^{(\rm out)}\phi(x)\} = a_{p_1}^{(\rm out)}\phi(x) . \tag{5.35}$$
We can use this to write the right-hand side of eq. (5.31) as
$$(2E_{p_1})^{1/2}\,\langle p_2,\ldots,p_n;T_f|\,T\{(a_{p_1}^{(\rm out)}-a_{p_1}^{(\rm in)})\phi(x)\}\,|k_2,\ldots,k_m;T_i\rangle . \tag{5.36}$$
In fact, the first term in the T product is the same as the original expression in eq. (5.31), while the second gives zero since we have seen that $T\{a_{p_1}^{(\rm in)}\phi(x)\}=\phi(x)\,a_{p_1}^{(\rm in)}$ and then $a_{p_1}^{(\rm in)}$ annihilates the state on its right (recall that we are assuming that the final momenta $p_j$ are different from any of the initial momenta $k_i$).
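The ordering rule of the T product can be illustrated with a minimal Python sketch. The representation of a field as a `(label, time)` pair and the function name `time_ordered` are assumptions made only for this illustration; the in and out operators are modelled by assigning them times minus and plus infinity.

```python
from typing import List, Tuple

# Hypothetical stand-in for an operator: (label, time coordinate x^0).
Field = Tuple[str, float]

def time_ordered(fields: List[Field]) -> List[str]:
    """Return the labels ordered by decreasing time, latest time leftmost."""
    times = [t for _, t in fields]
    if len(set(times)) != len(times):
        # The step functions in the definition leave equal times undefined.
        raise ValueError("equal times are not covered by this sketch")
    ordered = sorted(fields, key=lambda f: f[1], reverse=True)
    return [label for label, _ in ordered]

# Two fields: the later one ends up on the left.
print(time_ordered([("phi(y)", 0.3), ("phi(x)", 1.2)]))
# An in operator (time -> -infinity) is moved to the right of phi(x),
# an out operator (time -> +infinity) stays on the left.
print(time_ordered([("a_in", float("-inf")), ("phi(x)", 0.0)]))
print(time_ordered([("phi(x)", 0.0), ("a_out", float("inf"))]))
```

The last two calls mirror the statements about in and out operators made in the text: the in operator is ordered to the right of the field and the out operator to its left.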
The advantage of the form (5.36) is that the combination $a_{p_1}^{(\rm out)}-a_{p_1}^{(\rm in)}$ is given in terms of a covariant expression involving the $\phi$ field, which is just the hermitian conjugate of eq. (5.27),
$$(2E_{p_1})^{1/2}\big(a_{p_1}^{(\rm out)}-a_{p_1}^{(\rm in)}\big) = iZ^{-1/2}\int d^4y\; e^{ip_1y}(\Box_y+m^2)\phi(y) . \tag{5.37}$$
Therefore
$$\begin{aligned}
&\langle p_1,p_2,\ldots,p_n;T_f|\phi(x)|k_2,\ldots,k_m;T_i\rangle \\
&\qquad = iZ^{-1/2}\int d^4y\; e^{ip_1y}(\Box_y+m^2)\,\langle p_2,\ldots,p_n;T_f|T\{\phi(y)\phi(x)\}|k_2,\ldots,k_m;T_i\rangle ,
\end{aligned}\tag{5.38}$$
where $\Box_y=\dfrac{\partial}{\partial y^\mu}\dfrac{\partial}{\partial y_\mu}$.⁴ Putting together eqs. (5.30) and (5.38) we find the result of eliminating the particles with momenta $k_1$ and $p_1$,
$$\begin{aligned}
&\langle p_1,p_2,\ldots,p_n;T_f|k_1,k_2,\ldots,k_m;T_i\rangle \\
&\qquad = (iZ^{-1/2})^2\int d^4x\; e^{-ik_1x}(\Box_x+m^2)\int d^4y\; e^{+ip_1y}(\Box_y+m^2) \\
&\qquad\quad\times\langle p_2,\ldots,p_n;T_f|T\{\phi(y)\phi(x)\}|k_2,\ldots,k_m;T_i\rangle .
\end{aligned}\tag{5.39}$$
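The procedure can now be iterated in a straightforward way, and the result is
$$\begin{aligned}
&\langle p_1,\ldots,p_n;T_f|k_1,\ldots,k_m;T_i\rangle \\
&\qquad = (iZ^{-1/2})^{n+m}\int\prod_{i=1}^{m}d^4x_i\prod_{j=1}^{n}d^4y_j\;\exp\Big(i\sum_{j=1}^{n}p_jy_j-i\sum_{i=1}^{m}k_ix_i\Big) \\
&\qquad\quad\times(\Box_{x_1}+m^2)\cdots(\Box_{y_n}+m^2)\,\langle 0|T\{\phi(x_1)\ldots\phi(y_n)\}|0\rangle ,
\end{aligned}\tag{5.40}$$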
where the T product $T\{\phi(x_1)\ldots\phi(y_n)\}$ by definition orders the $n+m$ fields $\phi(x_1),\ldots,\phi(y_n)$ according to decreasing times, so that larger times are leftmost. The vacuum at $t=\pm\infty$ is the perturbative vacuum, i.e. the vacuum used in the construction of the Fock space of the free theory.⁵ As we explained in Section 5.1, $\langle p_1\ldots p_n;T_f|k_1\ldots k_m;T_i\rangle$ is the matrix element in the Heisenberg representation. In the Schrödinger representation we write instead
$$\langle p_1\ldots p_n|S|k_1\ldots k_m\rangle . \tag{5.41}$$
We have also defined the operator $T$ from $S=1+iT$. Since in eq. (5.40) we restricted to the situation in which no initial and final momenta coincide, the matrix element of the identity operator between these states vanishes, and we have actually computed the matrix element of $iT$, i.e. of the nontrivial part of the evolution operator,
$$\begin{aligned}
&\langle p_1\ldots p_n|iT|k_1\ldots k_m\rangle \\
&\qquad = (iZ^{-1/2})^{n+m}\int\prod_{i=1}^{m}d^4x_i\prod_{j=1}^{n}d^4y_j\;\exp\Big(i\sum_{j=1}^{n}p_jy_j-i\sum_{i=1}^{m}k_ix_i\Big) \\
&\qquad\quad\times(\Box_{x_1}+m^2)\cdots(\Box_{y_n}+m^2)\,\langle 0|T\{\phi(x_1)\ldots\phi(y_n)\}|0\rangle .
\end{aligned}\tag{5.42}$$
4 A very technical remark: writing eq. (5.38) we have extracted $\Box_y$ from the T product; strictly speaking this is not correct, because $\partial/\partial y^0$ does not commute with the theta function that enters in the definition of the T product, since $\partial_x\theta(x)=\delta(x)$. However, a simple calculation shows that the additional term is proportional to $\delta(x^0-y^0)[\partial_0\phi(y),\phi(x)]\sim\delta^{(4)}(x-y)$, and the inclusion of this Dirac delta (and of its derivatives, coming from acting on it with the $\Box_x$ operators present in the LSZ formula) modifies the final result for the LSZ formula, eq. (5.46), by the addition of terms which are polynomial in the four-momenta. Since however both the left-hand side and the right-hand side of eq. (5.46) are pole-like in the four-momenta, i.e. proportional to factors $1/(p^2-m^2)$, the addition of a regular term is irrelevant when we go on mass shell, i.e. when we set $p^2=m^2$; see the discussion below eq. (5.46).
5 Observe that initial one-particle states are defined from $|k\rangle=(2E_k)^{1/2}a_k^{\dagger,(\rm in)}|0\rangle$ and final states from $|k\rangle=(2E_k)^{1/2}a_k^{\dagger,(\rm out)}|0\rangle$, with the same state $|0\rangle$ in both cases, including its phase.
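We now define the N-point Green's function
$$G(x_1,\ldots,x_N) = \langle 0|T\{\phi(x_1)\ldots\phi(x_N)\}|0\rangle . \tag{5.43}$$
In terms of its Fourier transform $\tilde G$, we have
$$G(x_1,\ldots,x_N) = \int\prod_{i=1}^{N}\frac{d^4k_i}{(2\pi)^4}\; e^{-i\sum_{i=1}^{N}x_ik_i}\,\tilde G(k_1,\ldots,k_N) . \tag{5.44}$$
Using
$$(\Box_{x_j}+m^2)\,G(x_1,\ldots,x_N) = -\int\prod_{i=1}^{N}\frac{d^4k_i}{(2\pi)^4}\,(k_j^2-m^2)\,e^{-i\sum_{i=1}^{N}x_ik_i}\,\tilde G(k_1,\ldots,k_N) , \tag{5.45}$$
eq. (5.42) can be rewritten as
$$\begin{aligned}
&\prod_{i=1}^{m}\int d^4x_i\; e^{-ik_ix_i}\prod_{j=1}^{n}\int d^4y_j\; e^{+ip_jy_j}\;\langle 0|T\{\phi(x_1)\ldots\phi(x_m)\phi(y_1)\ldots\phi(y_n)\}|0\rangle \\
&\qquad = \Bigg(\prod_{i=1}^{m}\frac{i\sqrt{Z}}{k_i^2-m^2}\Bigg)\Bigg(\prod_{j=1}^{n}\frac{i\sqrt{Z}}{p_j^2-m^2}\Bigg)\langle p_1\ldots p_n|iT|k_1\ldots k_m\rangle .
\end{aligned}\tag{5.46}$$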
This is the Lehmann–Symanzik–Zimmermann (LSZ) reduction formula. It is important to understand the meaning of the factors $k_i^2-m^2$ and $p_j^2-m^2$ in the denominator. Of course for a physical particle with four-momentum $p^\mu$ we have $p^2-m^2=0$ (which is often expressed saying that the particle is “on mass shell”). The meaning of these factors is that we must first compute the left-hand side of eq. (5.46) working off mass shell, i.e. without using any relation between $p_0^2$ and $\mathbf{p}^2$. In the limit in which we send the particles on mass shell, the left-hand side develops poles of the form $1/(k_i^2-m^2)$ for each incoming particle and $1/(p_j^2-m^2)$ for each outgoing particle. These factors cancel the same pole factors which appear explicitly on the right-hand side, and we remain with an equation between quantities that are finite when the particles are on mass shell. We have therefore succeeded in relating the scattering amplitude to the vacuum expectation value of a time-ordered product of fields. In the next section we will see how the latter can be computed order by order in perturbation theory.
5.3 Setting up the perturbative expansion
At the classical level, the ﬁeld φ(x) satisﬁes a complicated nonlinear equation of motion, determined by the full Lagrangian L0 + Lint which
corresponds to the full Hamiltonian $H_0+H_{\rm int}$. The exact form of the solution will in general be very difficult to obtain, but certainly it will not be given just by simple plane waves, and so $\phi(x)$ does not have a simple expansion in plane waves with coefficients that in the quantum theory can be interpreted as creation and annihilation operators. In order to set up the perturbative expansion, we want to relate $\phi$ to a field $\phi_I$ whose time evolution is instead determined just by the free Hamiltonian $H_0$. We therefore define a quantum field $\phi_I(t,\mathbf{x})$ stating that, if at some reference time $t=t_0$ it is equal to $\phi_I(t_0,\mathbf{x})$, then at generic $t$ it is given by
$$\phi_I(t,\mathbf{x}) = e^{iH_0(t-t_0)}\,\phi_I(t_0,\mathbf{x})\,e^{-iH_0(t-t_0)} . \tag{5.47}$$
The field $\phi_I$ that evolves with the free Hamiltonian is called the interaction picture field. By definition this is a free field and we can expand it as
$$\phi_I(t,\mathbf{x}) = \int\frac{d^3p}{(2\pi)^3\sqrt{2E_p}}\,\big(a_p\,e^{-ipx}+a_p^\dagger\,e^{ipx}\big) , \tag{5.48}$$
with $a_p$, $a_p^\dagger$ the usual destruction and creation operators. Now we want to express the full Heisenberg field $\phi$ in terms of $\phi_I$. At time $t_0$ the field $\phi(x)$ is a given function of the spatial coordinates, $\phi(t_0,\mathbf{x})=f(\mathbf{x})$. Let $\phi_I(t,\mathbf{x})$ be the interaction picture field that at $t=t_0$ is equal to the same function $f(\mathbf{x})$. Then (setting $t-t_0\equiv\tau$)
$$\begin{aligned}
\phi(t,\mathbf{x}) &= e^{iH\tau}\,\phi(t_0,\mathbf{x})\,e^{-iH\tau} \\
&= e^{iH\tau}e^{-iH_0\tau}\big(e^{+iH_0\tau}\phi(t_0,\mathbf{x})e^{-iH_0\tau}\big)e^{iH_0\tau}e^{-iH\tau} \\
&= e^{iH\tau}e^{-iH_0\tau}\,\phi_I(t,\mathbf{x})\,e^{iH_0\tau}e^{-iH\tau} .
\end{aligned}\tag{5.49}$$
It is therefore useful to define the time evolution operator
$$U(t,t_0) \equiv e^{iH_0(t-t_0)}\,e^{-iH(t-t_0)} , \tag{5.50}$$
which evolves from time $t_0$ to time $t$, and is unitary. Then
$$\phi(t,\mathbf{x}) = U^\dagger(t,t_0)\,\phi_I(t,\mathbf{x})\,U(t,t_0) . \tag{5.51}$$
Note that $U(t,t_0)\neq\exp\{i(H_0-H)(t-t_0)\}=\exp\{-iH_{\rm int}(t-t_0)\}$ because $H_0$ and $H_{\rm int}$ do not commute, and therefore we cannot combine the exponentials in this way. However, one can observe that
$$i\frac{\partial U}{\partial t} = e^{iH_0(t-t_0)}(H-H_0)\,e^{-iH(t-t_0)} = e^{iH_0(t-t_0)}H_{\rm int}\,e^{-iH_0(t-t_0)}\,U(t,t_0) . \tag{5.52}$$
We define the interaction picture Hamiltonian $H_I$ as
$$H_I(t) = e^{iH_0(t-t_0)}\,H_{\rm int}\,e^{-iH_0(t-t_0)} . \tag{5.53}$$
The solution of eq. (5.52) with the boundary condition $U(t_0,t_0)=1$ is
$$U(t,t_0) = T\exp\Big\{-i\int_{t_0}^{t}dt'\,H_I(t')\Big\} . \tag{5.54}$$
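The statements around eqs. (5.50), (5.52) and (5.53) can be checked numerically in a toy model. The sketch below is only an illustration under stated assumptions: the two-level Hamiltonians $H_0=(\omega/2)\sigma_z$ and $H_{\rm int}=g\,\sigma_x$, the values of `OMEGA` and `G`, and all function names are hypothetical choices, not taken from the text. It verifies that $U$ obeys $i\,\partial U/\partial t=H_IU$ and that $U$ differs from the naive $\exp(-iH_{\rm int}\tau)$.

```python
import cmath
import math

OMEGA, G = 1.0, 0.3                      # hypothetical splitting and coupling
BIG = math.sqrt(OMEGA**2 / 4 + G**2)     # H^2 = BIG^2 * identity for this model

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_minus_i_h(tau):
    """exp(-i H tau) = cos(BIG tau) - i sin(BIG tau) H / BIG, H = H0 + H_int."""
    c, s = math.cos(BIG * tau), math.sin(BIG * tau) / BIG
    return [[c - 1j * s * OMEGA / 2, -1j * s * G],
            [-1j * s * G, c + 1j * s * OMEGA / 2]]

def exp_plus_i_h0(tau):
    """exp(+i H0 tau) for the diagonal H0 = (OMEGA/2) sigma_z."""
    ph = cmath.exp(1j * OMEGA * tau / 2)
    return [[ph, 0], [0, ph.conjugate()]]

def evolution(tau):
    """U(t, t0) of eq. (5.50), with tau = t - t0."""
    return matmul(exp_plus_i_h0(tau), exp_minus_i_h(tau))

def h_interaction(tau):
    """H_I(t) of eq. (5.53): exp(i H0 tau) (G sigma_x) exp(-i H0 tau)."""
    ph = cmath.exp(1j * OMEGA * tau)
    return [[0, G * ph], [G * ph.conjugate(), 0]]

def residual(tau, h=1e-5):
    """Largest entry of i dU/dt - H_I U, using a central finite difference."""
    up, um = evolution(tau + h), evolution(tau - h)
    rhs = matmul(h_interaction(tau), evolution(tau))
    return max(abs(1j * (up[i][j] - um[i][j]) / (2 * h) - rhs[i][j])
               for i in range(2) for j in range(2))

def naive_mismatch(tau):
    """Largest entry of U - exp(-i H_int tau); nonzero since [H0, H_int] != 0."""
    c, s = math.cos(G * tau), math.sin(G * tau)
    naive = [[c, -1j * s], [-1j * s, c]]
    u = evolution(tau)
    return max(abs(u[i][j] - naive[i][j]) for i in range(2) for j in range(2))

print("residual of eq. (5.52):", residual(0.7))
print("distance from naive exponential:", naive_mismatch(2.0))
```

The first number is at the level of the finite-difference error, while the second is of order 0.1 for these parameters, which is the non-commutativity effect the text warns about.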
We recall that the exponential of an operator is defined by its Taylor expansion. Then the time-ordering T of the exponential means that all terms in the Taylor expansion are time ordered. The fact that this is a solution of eq. (5.52) can be checked expanding the exponential and comparing order by order in $H_I$. Equations (5.51), (5.53) and (5.54) express the field $\phi(t,\mathbf{x})$ in terms of the interaction picture field. Our task is to compute the n-point Green's function, i.e.
$$\langle 0|\phi(x_1)\phi(x_2)\ldots\phi(x_n)|0\rangle \tag{5.55}$$
when the $x_i$ are T ordered, i.e. $t_1>t_2>\ldots>t_n$. Then, using eq. (5.51), we can rewrite it as
$$\langle 0|\big[U^\dagger(t_1,t_0)\phi_I(x_1)U(t_1,t_0)\big]\big[U^\dagger(t_2,t_0)\phi_I(x_2)U(t_2,t_0)\big]\ldots\big[U^\dagger(t_n,t_0)\phi_I(x_n)U(t_n,t_0)\big]|0\rangle . \tag{5.56}$$
Observe now that $U^\dagger(t_2,t_0)=U(t_0,t_2)$, since the hermitian conjugation changes $i\to-i$ in the exponent in eq. (5.54), and this is reabsorbed inverting the integration limits. Furthermore, $U(t_1,t_0)U(t_0,t_2)=U(t_1,t_2)$. A simple derivation of this identity (valid independently of the ordering between $t_0,t_1,t_2$) is obtained observing that
$$i\frac{\partial}{\partial t}\big[U(t,t_0)U(t_0,t_2)\big] = \Big(i\frac{\partial}{\partial t}U(t,t_0)\Big)U(t_0,t_2) = H_I(t)\big[U(t,t_0)U(t_0,t_2)\big] . \tag{5.57}$$
Therefore the equation satisfied by $[U(t,t_0)U(t_0,t_2)]$ is the same as that satisfied by $U(t,t_0)$, eq. (5.52), but the boundary condition is
$$\big[U(t,t_0)U(t_0,t_2)\big]_{t=t_0} = U(t_0,t_2) , \tag{5.58}$$
since $U(t_0,t_0)=1$. The solution with this boundary condition is
$$U(t,t_0)U(t_0,t_2) = T\exp\Big\{-i\int_{t_2}^{t}dt'\,H_I(t')\Big\} . \tag{5.59}$$
However, this is nothing but $U(t,t_2)$, as we see comparing with eq. (5.54). Using these identities we can combine the various factors $U$, $U^\dagger$ in eq. (5.56) and we get
$$\langle 0|U^\dagger(t_1,t_0)\phi_I(x_1)U(t_1,t_2)\phi_I(x_2)U(t_2,t_3)\ldots U(t_{n-1},t_n)\phi_I(x_n)U(t_n,t_0)|0\rangle . \tag{5.60}$$
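The time-ordered exponential and the composition rule $U(t_1,t_0)U(t_0,t_2)=U(t_1,t_2)$ can also be illustrated numerically. The following sketch again assumes a hypothetical two-level model, $H_0=(\omega/2)\sigma_z$ and $H_{\rm int}=g\,\sigma_x$ with $t_0=0$; the parameter values and function names are illustrative only. The time-ordered exponential is approximated by a product of short-time factors in which later times stand to the left.

```python
import cmath
import math

OMEGA, G = 1.0, 0.3                      # hypothetical model parameters
BIG = math.sqrt(OMEGA**2 / 4 + G**2)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def short_step(t_mid, dt):
    """exp(-i H_I(t_mid) dt); exact for fixed t_mid because H_I^2 = G^2."""
    ph = cmath.exp(1j * OMEGA * t_mid)   # H_I = G [[0, ph], [conj(ph), 0]]
    c, s = math.cos(G * dt), math.sin(G * dt)
    return [[c, -1j * s * ph], [-1j * s * ph.conjugate(), c]]

def ordered_exponential(t_final, t_initial, steps=2000):
    """Approximate T exp(-i int H_I dt') by a product, later times leftmost."""
    dt = (t_final - t_initial) / steps
    u = [[1, 0], [0, 1]]
    for n in range(steps):
        t_mid = t_initial + (n + 0.5) * dt
        u = matmul(short_step(t_mid, dt), u)   # new factor goes on the left
    return u

def closed_form(t):
    """U(t, 0) = exp(i H0 t) exp(-i H t), i.e. eq. (5.50) with t0 = 0."""
    c, s = math.cos(BIG * t), math.sin(BIG * t) / BIG
    ph = cmath.exp(1j * OMEGA * t / 2)
    return [[ph * (c - 1j * s * OMEGA / 2), ph * (-1j * s * G)],
            [ph.conjugate() * (-1j * s * G),
             ph.conjugate() * (c + 1j * s * OMEGA / 2)]]

def distance(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

t1, t2 = 1.5, 0.4
# Eq. (5.54): the ordered product reproduces the closed form of U(t1, 0).
print(distance(ordered_exponential(t1, 0.0), closed_form(t1)))
# Eq. (5.59): U(t1, 0) U(0, t2) equals the ordered exponential from t2 to t1.
composed = matmul(closed_form(t1), dagger(closed_form(t2)))
print(distance(ordered_exponential(t1, t2), composed))
```

Both printed distances should be small and shrink further when `steps` is increased, since the only approximation is the discretization of the time integral.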
We now introduce a new variable $t$ with a very large value, so that $t\gg t_1>t_2>\ldots>t_n\gg-t$. Using $U(t_n,t_0)=U(t_n,-t)U(-t,t_0)$ and $U^\dagger(t_1,t_0)=U(t_0,t_1)=U(t_0,t)U(t,t_1)=U^\dagger(t,t_0)U(t,t_1)$, we rewrite (5.60) as
$$\langle 0|U^\dagger(t,t_0)\big[U(t,t_1)\phi_I(x_1)U(t_1,t_2)\phi_I(x_2)U(t_2,t_3)\ldots U(t_{n-1},t_n)\phi_I(x_n)U(t_n,-t)\big]U(-t,t_0)|0\rangle . \tag{5.61}$$
Observe that the term in brackets is automatically time-ordered. In fact, e.g. $U(t_1,t_2)$ contains powers of the integral of $H_I(t)$ between $t_1$ and $t_2$, time ordered, so terms like $\phi_I(x_1)U(t_1,t_2)\phi_I(x_2)$ are sums of terms of the form $\phi_I(x_1)H_I(t'_1)\ldots H_I(t'_k)\phi_I(x_2)$ with $t_1>t'_1>\ldots>t'_k>t_2$, so everything is automatically time ordered. Therefore the term in square brackets can be rewritten as
$$[\ldots] = T\{\phi_I(x_1)\ldots\phi_I(x_n)\,U(t,t_1)U(t_1,t_2)\ldots U(t_n,-t)\} . \tag{5.62}$$
We have rewritten the various factors in a convenient order, since anyway the order in which they appeared in eq. (5.61) is implemented by the T product symbol. Now however we can combine the $U$ factors into a single factor $U(t,-t)$. We therefore arrive at
$$\langle 0|U^\dagger(t,t_0)\,T\Big\{\phi_I(x_1)\ldots\phi_I(x_n)\exp\Big[-i\int_{-t}^{t}dt'\,H_I(t')\Big]\Big\}\,U(-t,t_0)|0\rangle . \tag{5.63}$$
Note that the T product symbol in eq. (5.54) need not be repeated inside (5.63) because the outermost T product symbol already instructs to time order all its arguments. These manipulations hold for $t_0$ arbitrary. We now choose $t_0=-t$ and we send $t\to\infty$. Then $U(-t,t_0)=1$ while $U^\dagger(t,t_0)\to U^\dagger(\infty,-\infty)$. The term $\langle 0|U^\dagger(\infty,-\infty)$ in eq. (5.63) is the hermitian conjugate of $U(\infty,-\infty)|0\rangle$, which is the state obtained evolving the vacuum state from time $-\infty$ to $+\infty$. Physically it is clear that, if the vacuum state is stable, applying to it the evolution operator $U(\infty,-\infty)$ we still find the vacuum. Recall however that in quantum mechanics state vectors that differ by a phase still represent the same physical state. Therefore we will have in general
$$U(\infty,-\infty)|0\rangle = e^{i\alpha}|0\rangle , \tag{5.64}$$
with $\alpha$ a phase. The explicit form of this phase can be obtained taking the scalar product of the above equation with $\langle 0|$ and using the explicit form (5.54) of the evolution operator,
$$e^{i\alpha} = \langle 0|T\exp\Big\{-i\int_{-\infty}^{+\infty}dt'\,H_I(t')\Big\}|0\rangle . \tag{5.65}$$
In eq. (5.63) the hermitian conjugate, $\langle 0|U^\dagger(\infty,-\infty)=e^{-i\alpha}\langle 0|$, appears, and from eq. (5.65) we have
$$e^{-i\alpha} = \Big(\langle 0|T\exp\Big\{-i\int_{-\infty}^{+\infty}dt'\,H_I(t')\Big\}|0\rangle\Big)^{-1} . \tag{5.66}$$
So we finally get our basic formula
$$\langle 0|T\{\phi(x_1)\ldots\phi(x_n)\}|0\rangle = \frac{\langle 0|T\big\{\phi_I(x_1)\ldots\phi_I(x_n)\exp\big[-i\int d^4x\,\mathcal{H}_I\big]\big\}|0\rangle}{\langle 0|T\big\{\exp\big[-i\int d^4x\,\mathcal{H}_I\big]\big\}|0\rangle} , \tag{5.67}$$
where we have written $\int_{-\infty}^{\infty}dt\,H_I=\int d^4x\,\mathcal{H}_I$. The left-hand side of eq. (5.67) is the Green's function which enters in the LSZ formula, and the right-hand side shows how we can compute it in terms of the free field $\phi_I$. Observe furthermore that $\mathcal{H}_I$ is expressed very simply in terms of $\phi_I$, since the functional dependence of $\mathcal{H}_I$ on $\phi_I$ is exactly the same as the functional dependence of $\mathcal{H}_{\rm int}$ on $\phi$. To understand this point, consider for instance a scalar field theory with the quartic self-interaction, $\mathcal{H}_{\rm int}=(\lambda/4!)\phi^4$. Then, using eqs. (5.53) and (5.47),
$$\begin{aligned}
\mathcal{H}_I(t) &= e^{iH_0(t-t_0)}\,\frac{\lambda}{4!}\phi^4\,e^{-iH_0(t-t_0)} \\
&= \frac{\lambda}{4!}\big(e^{iH_0(t-t_0)}\phi\,e^{-iH_0(t-t_0)}\big)\big(e^{iH_0(t-t_0)}\phi\,e^{-iH_0(t-t_0)}\big) \\
&\quad\times\big(e^{iH_0(t-t_0)}\phi\,e^{-iH_0(t-t_0)}\big)\big(e^{iH_0(t-t_0)}\phi\,e^{-iH_0(t-t_0)}\big) \\
&= \frac{\lambda}{4!}\,\phi_I^4 .
\end{aligned}\tag{5.68}$$
At this point the perturbative strategy is clear. We expand the exponential in eq. (5.67) in powers of $\mathcal{H}_I$, and we are left with the task of computing time-ordered products of free fields. In principle, it is clear that these can be written in terms of creation and annihilation operators, and therefore we know how to compute them. Such a brute force computation, however, would quickly become too cumbersome, and in the following we will study a technique, based on Wick's theorem and Feynman graphs, which makes these computations feasible. In the next section we will start from the simplest case, the T product of two fields.
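As a worked illustration of this strategy, the first terms obtained by expanding the exponential in the numerator of eq. (5.67) can be written out explicitly. The block below only spells out the Taylor expansion of the exponential under the T product; the integration variables $x$ and $y$ are labels introduced here.

```latex
\begin{aligned}
\langle 0|T\Big\{\phi_I(x_1)\ldots\phi_I(x_n)\,
   \exp\Big[-i\!\int d^4x\,\mathcal{H}_I\Big]\Big\}|0\rangle
&= \langle 0|T\{\phi_I(x_1)\ldots\phi_I(x_n)\}|0\rangle \\
&\quad - i\int d^4x\;
   \langle 0|T\{\phi_I(x_1)\ldots\phi_I(x_n)\,\mathcal{H}_I(x)\}|0\rangle \\
&\quad + \frac{(-i)^2}{2!}\int d^4x\,d^4y\;
   \langle 0|T\{\phi_I(x_1)\ldots\phi_I(x_n)\,
   \mathcal{H}_I(x)\mathcal{H}_I(y)\}|0\rangle + \ldots
\end{aligned}
```

Each term is a vacuum expectation value of a time-ordered product of free fields, which is exactly the kind of object treated in the following sections.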
5.4 The Feynman propagator
We want to compute the Feynman propagator, defined as
$$\langle 0|T\{\phi_I(x)\phi_I(y)\}|0\rangle . \tag{5.69}$$
When we study perturbation theory we always use the interaction picture field $\phi_I$. The original field $\phi$ which evolves with the full Hamiltonian will never appear again. Therefore, to make the notation simpler, from now on we will denote the interaction picture field simply by $\phi$, omitting the subscript “I”. We first separate $\phi$ into its creation and annihilation parts,
$$\phi(x) = \phi^+(x)+\phi^-(x) , \tag{5.70}$$
where
$$\phi^+(x) = \int\frac{d^3p}{(2\pi)^3\sqrt{2E_p}}\;a_p\,e^{-ipx} , \tag{5.71}$$
$$\phi^-(x) = \int\frac{d^3p}{(2\pi)^3\sqrt{2E_p}}\;a_p^\dagger\,e^{+ipx} . \tag{5.72}$$
Of course $\phi^+(x)|0\rangle=0$ and $\langle 0|\phi^-(x)=0$. Consider first the case $x^0>y^0$. Then
$$\begin{aligned}
T\{\phi(x)\phi(y)\} &= \phi^+(x)\phi^+(y)+\phi^+(x)\phi^-(y)+\phi^-(x)\phi^+(y)+\phi^-(x)\phi^-(y) \\
&= \phi^+(x)\phi^+(y)+\phi^-(y)\phi^+(x)+\phi^-(x)\phi^+(y)+\phi^-(x)\phi^-(y)+[\phi^+(x),\phi^-(y)] \\
&= \;:\!\phi(x)\phi(y)\!:\;+\;[\phi^+(x),\phi^-(y)] ,
\end{aligned}\tag{5.73}$$
where as usual the colons denote normal ordering. Similarly for $y^0>x^0$ we get $T\{\phi(x)\phi(y)\}=\,:\!\phi(x)\phi(y)\!:+[\phi^+(y),\phi^-(x)]$. Therefore
$$T\{\phi(x)\phi(y)\} = \;:\!\phi(x)\phi(y)\!:\;+\;D(x-y) , \tag{5.74}$$
where
$$D(x-y) = \theta(x^0-y^0)\,[\phi^+(x),\phi^-(y)]+\theta(y^0-x^0)\,[\phi^+(y),\phi^-(x)] . \tag{5.75}$$
Now observe that the expectation value of the normal ordered term $:\!\phi(x)\phi(y)\!:$ is zero, because there is always either one annihilation operator acting on $|0\rangle$ or a creation operator acting on $\langle 0|$. Instead the commutator of $a_p$ and $a_p^\dagger$ is a c-number, so also $D(x-y)$ is a c-number, and $\langle 0|D(x-y)|0\rangle=D(x-y)\langle 0|0\rangle=D(x-y)$. Therefore $D(x-y)$ is just the Feynman propagator,
$$\langle 0|T\{\phi(x)\phi(y)\}|0\rangle = D(x-y) . \tag{5.76}$$
Computing the commutators we find
$$D(x-y) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{2E_p}\Big[\theta(x^0-y^0)\,e^{-ip(x-y)}+\theta(y^0-x^0)\,e^{ip(x-y)}\Big] . \tag{5.77}$$
The integral over $d^3p$ can be computed explicitly, but it is more useful to rewrite (5.77) as a four-dimensional integral,
$$D(x-y) = \int\frac{d^4p}{(2\pi)^4}\,\frac{i}{p^2-m^2+i\epsilon}\,e^{-ip(x-y)} , \tag{5.78}$$
where $\epsilon\to0^+$. To prove the equivalence of eqs. (5.77) and (5.78), observe that the integral in eq. (5.78) can be written as
$$\int\frac{d^3p}{(2\pi)^3}\,e^{i\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})}\int_{-\infty}^{+\infty}\frac{dp^0}{2\pi}\,\frac{i}{(p^0)^2-E_p^2+i\epsilon}\,e^{-ip^0(x^0-y^0)} , \tag{5.79}$$
where $E_p=+(\mathbf{p}^2+m^2)^{1/2}$. The integral over $p^0$ can be computed going in the complex $p^0$ plane. The $i\epsilon$ factor displaces slightly the poles from the real axis. The poles are at $p^0\simeq\pm E_p\,(1-i\epsilon/(2E_p^2))$. Thus the pole at $p^0=E_p$ is slightly displaced below the real axis and the pole at $p^0=-E_p$ is slightly displaced above the real axis, as shown in Fig. 5.1.
Fig. 5.1 The position of the poles in the complex $p^0$ plane.
For $x^0-y^0>0$ we can close the contour in the lower half-plane and we only get the contribution of the pole at $p^0=E_p$. The pole is encircled clockwise, so it gives
$$(-2\pi i)\,\frac{i}{2\pi}\,\frac{e^{-iE_p(x^0-y^0)}}{2E_p} = +\frac{1}{2E_p}\,e^{-iE_p(x^0-y^0)} . \tag{5.80}$$
If instead $x^0-y^0<0$ we can close the contour in the upper half-plane and we only get the contribution of the pole at $p^0=-E_p$, which now is encircled counterclockwise, so it gives
$$(2\pi i)\,\frac{i}{2\pi}\,\frac{e^{iE_p(x^0-y^0)}}{-2E_p} = +\frac{1}{2E_p}\,e^{iE_p(x^0-y^0)} . \tag{5.81}$$
In both cases, we have reproduced eq. (5.77) (in the second term we must also rename the integration variable $\mathbf{p}\to-\mathbf{p}$). From eq. (5.78) we read off the Feynman propagator in momentum space,
$$\tilde D(p) = \frac{i}{p^2-m^2+i\epsilon} . \tag{5.82}$$
It is also easy to see that the Feynman propagator is just a Green's function of the operator $\Box+m^2$. In fact, from eq. (5.78),
$$(\Box_x+m^2)\,D(x-y) = \int\frac{d^4p}{(2\pi)^4}\,\frac{i}{p^2-m^2+i\epsilon}\,(-p^2+m^2)\,e^{-ip(x-y)} = -i\,\delta^{(4)}(x-y) . \tag{5.83}$$
Observe that this result holds independently of the prescription for going around the poles. There are in principle four different prescriptions (each of the two poles can be slightly displaced above or below the real axis) and different Green's functions are obtained with different prescriptions, and obey different boundary conditions (see Exercise 5.1).
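The contour argument leading to eqs. (5.80) and (5.81) can be cross-checked by brute-force numerical integration over $p^0$ at small but finite $\epsilon$. The sketch below is an illustration under stated assumptions: the values of the energy, of $\epsilon$, of the cutoff and of the step are arbitrary choices, and the function names are hypothetical. At finite $\epsilon$ the poles sit at $\pm\sqrt{E_p^2-i\epsilon}$, so the residue result is compared at the same $\epsilon$.

```python
import cmath
import math

def p0_integral(t, energy=1.0, eps=0.1, cutoff=200.0, step=0.005):
    """Trapezoidal estimate of int dp0/(2 pi) i exp(-i p0 t)/(p0^2 - E^2 + i eps)."""
    n = int(round(2 * cutoff / step))
    total = 0j
    for k in range(n + 1):
        p0 = -cutoff + k * step
        weight = 0.5 if k in (0, n) else 1.0      # trapezoid end-point weights
        total += weight * 1j * cmath.exp(-1j * p0 * t) / (p0 * p0 - energy**2 + 1j * eps)
    return total * step / (2 * math.pi)

def residue_result(t, energy=1.0, eps=0.1):
    """Contour result at finite eps: exp(-i E' |t|) / (2 E'), E' = sqrt(E^2 - i eps)."""
    shifted = cmath.sqrt(energy**2 - 1j * eps)    # pole displaced below the real axis
    return cmath.exp(-1j * shifted * abs(t)) / (2 * shifted)

for t in (1.0, -1.5):                             # t plays the role of x^0 - y^0
    print(t, abs(p0_integral(t) - residue_result(t)))

# For eps -> 0 the residue result reduces to exp(-i E |t|) / (2 E), eqs. (5.80)-(5.81).
print(abs(residue_result(1.0, eps=1e-9) - cmath.exp(-1j) / 2))
```

The same function handles both signs of $x^0-y^0$, which reflects the fact that the two step-function terms of eq. (5.77) come from closing the contour in opposite half-planes.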
5.5 Wick’s theorem and Feynman diagrams
Wick’s theorem is a very useful tool for reducing the expectation value of a generic T product of fields
$$\langle 0|T\{\phi(x_1)\ldots\phi(x_n)\}|0\rangle \tag{5.84}$$
to a combination of Feynman propagators. It generalizes the identity (5.74) to an n-point function, and states that $T\{\phi(x_1)\ldots\phi(x_n)\}$ is equal to the normal ordered product $:\!\phi(x_1)\ldots\phi(x_n)\!:$ plus all possible combinations of normal ordering and contractions of fields, where a contraction of two fields $\phi(x_1)$, $\phi(x_2)$ is defined to be equal to the Feynman propagator $D(x_1-x_2)$. For instance (using the notation $\phi(x_i)=\phi_i$ and $D(x_i-x_j)=D_{ij}$),
$$\begin{aligned}
T\{\phi_1\phi_2\phi_3\phi_4\} &= \;:\!\phi_1\phi_2\phi_3\phi_4\!:\;+D_{12}:\!\phi_3\phi_4\!:+D_{13}:\!\phi_2\phi_4\!:+D_{14}:\!\phi_2\phi_3\!: \\
&\quad+D_{23}:\!\phi_1\phi_4\!:+D_{24}:\!\phi_1\phi_3\!:+D_{34}:\!\phi_1\phi_2\!: \\
&\quad+D_{12}D_{34}+D_{13}D_{24}+D_{14}D_{23} .
\end{aligned}\tag{5.85}$$
The proof of the theorem can be given by induction on the number of fields, see, e.g. Peskin and Schroeder (1995), pages 88–90. When we take the vacuum expectation value, all terms with a normal ordering factor give zero, and only the terms where all fields have been contracted survive. Thus in our example
$$\langle 0|T\{\phi_1\phi_2\phi_3\phi_4\}|0\rangle = D_{12}D_{34}+D_{13}D_{24}+D_{14}D_{23} . \tag{5.86}$$
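The combinatorics of Wick's theorem is easy to enumerate by machine. The following sketch, with hypothetical function names `wick_terms` and `as_text`, lists every term of the expansion for a given set of field labels as a pair (contractions, uncontracted fields); it tracks only which fields are contracted, not the operators themselves.

```python
from typing import Iterator, List, Tuple

Pair = Tuple[int, int]

def wick_terms(labels: List[int]) -> Iterator[Tuple[List[Pair], List[int]]]:
    """Yield (contractions, uncontracted labels) for every term of the expansion."""
    if not labels:
        yield [], []
        return
    first, rest = labels[0], labels[1:]
    # Case 1: the first field stays inside the normal-ordered product.
    for pairs, free in wick_terms(rest):
        yield pairs, [first] + free
    # Case 2: the first field is contracted with one of the later fields.
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for pairs, free in wick_terms(remaining):
            yield [(first, partner)] + pairs, free

def as_text(pairs: List[Pair], free: List[int]) -> str:
    """Render a term in the notation D_ij :phi_k ...: used in eq. (5.85)."""
    props = " ".join(f"D{i}{j}" for i, j in pairs)
    normal = ":" + " ".join(f"phi{i}" for i in free) + ":" if free else ""
    return " ".join(part for part in (props, normal) if part)

terms = list(wick_terms([1, 2, 3, 4]))
for pairs, free in terms:
    print(as_text(pairs, free))

# Only fully contracted terms survive in the vacuum expectation value.
vev = sorted(as_text(pairs, free) for pairs, free in terms if not free)
print(len(terms), vev)
# prints 10 ['D12 D34', 'D13 D24', 'D14 D23']
```

The ten terms correspond one to one to the right-hand side of eq. (5.85), and the three fully contracted ones to eq. (5.86).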
The above equation has a vivid physical interpretation. If we interpret D(x1 − x2 ) as the amplitude for the propagation of a particle from the spacetime point x1 to x2 , then D(x1 − x2 )D(x3 − x4 ) is the amplitude for the process in which one particle goes from x1 to x2 and another from x3 to x4 , without interacting with each other. We can now associate a Feynman graph in position space to each nonvanishing contribution. We simply draw a line connecting points xi and xj for each propagator D(xi − xj ). For instance, the term D(x1 − x2 )D(x3 − x4 ) can be associated to the (rather trivial) Feynman diagram in position space given in Fig. 5.2. When we expand the exponential in eq. (5.67) in powers of HI , each term HI contains ﬁelds at the same spacetime point. As we will see explicitly below, this gives rise to less trivial Feynman graphs. The best way to understand all this machinery is to put it to work, and to start computing. Therefore, in the next two subsections, we will perform a few computations in all details. The important point that will emerge, however, is that it is not necessary every time to go through the rather involved steps that we will present, since the results can be summarized very compactly by a set of rules, the Feynman rules, that allow us to associate to each amplitude a set of Feynman diagrams, and to write down almost immediately the contribution of each Feynman diagram. Still, once in a lifetime, it might be useful to go through all the detailed steps, before starting to use the Feynman rules as an automatic machinery. The reader who does not wish to follow the computations in all details can go quickly through the next two subsections and ﬁnd a summary of Feynman rules in Section 5.5.3.
5.5.1 A few very explicit computations
We begin with the scattering amplitude for a process with two initial particles with momenta $k_1,k_2$ into two final particles with momenta $p_1,p_2$, in the theory with $\mathcal{H}_I=(\lambda/4!)\phi^4$. The general formulas (5.46, 5.67) give
$$\begin{aligned}
&\Bigg(\prod_{j=1}^{2}\frac{i\sqrt{Z}}{p_j^2-m^2}\Bigg)\Bigg(\prod_{i=1}^{2}\frac{i\sqrt{Z}}{k_i^2-m^2}\Bigg)\langle p_1p_2|iT|k_1k_2\rangle \\
&\qquad = \int d^4x_1\,d^4x_2\,d^4x_3\,d^4x_4\;e^{i(p_1x_1+p_2x_2-k_1x_3-k_2x_4)} \\
&\qquad\quad\times\frac{\langle 0|T\big\{\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)\exp\big[-i\frac{\lambda}{4!}\int d^4x\,\phi^4\big]\big\}|0\rangle}{\langle 0|T\big\{\exp\big[-i\frac{\lambda}{4!}\int d^4x\,\phi^4\big]\big\}|0\rangle} .
\end{aligned}\tag{5.87}$$
Fig. 5.2 The diagrammatic representation of $D(x_1-x_2)D(x_3-x_4)$.
We work at the lowest nontrivial order in perturbation theory in $\lambda$. As we will see later, in $\lambda\phi^4$ theory $Z=1+O(\lambda^2)$ and therefore, since we will work up to $O(\lambda)$, we set $Z=1$. Zero-order term: First of all there is a term of order $\lambda^0$, which is given simply setting $\lambda=0$ in eq. (5.87). Of course, if there is no coupling, there is no scattering, and we must find a trivial amplitude at this order. Let us nevertheless check explicitly how this comes out. At $\lambda=0$, using eq. (5.86),
$$\begin{aligned}
&\int\prod_{i=1}^{4}d^4x_i\;e^{i(p_1x_1+p_2x_2-k_1x_3-k_2x_4)}\,\langle 0|T\{\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)\}|0\rangle \\
&\qquad = \int\prod_{i=1}^{4}d^4x_i\;e^{i(p_1x_1+p_2x_2-k_1x_3-k_2x_4)}\,\big(D(x_1-x_2)D(x_3-x_4)+\ldots\big) ,
\end{aligned}\tag{5.88}$$
where the dots represent the other two terms in eq. (5.86). In terms of Feynman graphs in position space, the contribution written explicitly is shown in Fig. 5.2 and represents a particle traveling from $x_1$ to $x_2$ and a second particle traveling from $x_3$ to $x_4$, without interacting with the first (and similarly for the other two terms denoted by the dots in eq. (5.88)). Changing integration variables to $x=x_1-x_2$ and $X=(x_1+x_2)/2$ and similarly for $x_3,x_4$ we have
$$\begin{aligned}
&\int d^4x_1\,d^4x_2\,d^4x_3\,d^4x_4\;e^{i(p_1x_1+p_2x_2-k_1x_3-k_2x_4)}\,D(x_1-x_2)D(x_3-x_4) \\
&\qquad = \Big[\int d^4x\,d^4X\;e^{i(p_1+p_2)X+i(p_1-p_2)x/2}\,D(x)\Big]\times\Big[\int d^4x\,d^4X\;e^{-i(k_1+k_2)X-i(k_1-k_2)x/2}\,D(x)\Big] \\
&\qquad = (2\pi)^4\delta^{(4)}(p_1+p_2)\,(2\pi)^4\delta^{(4)}(k_1+k_2)\,\frac{i}{p_1^2-m^2}\,\frac{i}{k_1^2-m^2} .
\end{aligned}\tag{5.89}$$
6 Observe that the factor $i\epsilon$ in $p^2-m^2+i\epsilon$ should be kept when $p$ is an integration variable, because in this case it gives the prescription for going around the poles, but if $p$ is the momentum on an external leg, and is therefore fixed, we can set directly $\epsilon=0$.
7 There is a subtlety here connected with graphs known as tadpoles, that we will introduce below. In principle inserting tadpole graphs in the lines of a disconnected diagram we can obtain the appropriate number of pole factors. However, the tadpoles are simply reabsorbed in the mass renormalization, as we will see in Section 5.6.
Comparing with eq. (5.87) we see that on the right-hand side we have obtained only two pole factors,⁶ which originated from the two propagators in momentum space, while on the left-hand side of eq. (5.87) there are four pole factors. Therefore only two of them cancel, and we get a contribution to the T matrix element
$$\langle p_1p_2|iT|k_1k_2\rangle = -(2\pi)^8\,(p_2^2-m^2)(k_2^2-m^2)\,\delta^{(4)}(p_1+p_2)\,\delta^{(4)}(k_1+k_2) \tag{5.90}$$
and when we go on mass shell this gives zero. The same happens for the other contributions indicated by the dots in eq. (5.88). Therefore at zero order in $\lambda$ there is no contribution to the scattering amplitude, as expected. This is a general situation which repeats for the $n\to m$ scattering amplitudes, and disconnected graphs do not contribute, simply because the Feynman graphs do not provide enough pole factors to cancel those that appear in the LSZ formula.⁷
Term O(λ): The first nontrivial contribution comes expanding the exponential in eq. (5.87) to first order in $\lambda$. Let us for the moment neglect the denominator in eq. (5.87). Then at this order we have
$$\begin{aligned}
&\Bigg(\prod_{j=1}^{2}\frac{i}{p_j^2-m^2}\Bigg)\Bigg(\prod_{i=1}^{2}\frac{i}{k_i^2-m^2}\Bigg)\langle p_1p_2|iT|k_1k_2\rangle \\
&\qquad = \int d^4x_1\,d^4x_2\,d^4x_3\,d^4x_4\;\exp\{i(p_1x_1+p_2x_2-k_1x_3-k_2x_4)\} \\
&\qquad\quad\times\Big(-i\frac{\lambda}{4!}\Big)\int d^4x\;\langle 0|T\{\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)\phi^4(x)\}|0\rangle .
\end{aligned}\tag{5.91}$$
As before, the disconnected graphs are unable to cancel all pole factors and therefore, when we go on mass shell, they give zero. The only connected Feynman graphs are obtained contracting each of the four $\phi(x_i)$ with one of the four $\phi(x)$; there are 4! possible contractions of this type, and therefore the right-hand side of eq. (5.91) is equal to
$$\int d^4x\,d^4x_1\,d^4x_2\,d^4x_3\,d^4x_4\;\exp\{i(p_1x_1+p_2x_2-k_1x_3-k_2x_4)\}\;(-i\lambda)\,D(x_1-x)D(x_2-x)D(x_3-x)D(x_4-x) . \tag{5.92}$$
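The counting of contractions behind eq. (5.92) can be verified by enumeration. In the sketch below the eight fields in eq. (5.91) are represented by string labels; the labels `xa` ... `xd` for the four factors of $\phi^4(x)$ and the function names are assumptions made for the illustration. A full contraction is counted as connected when every propagator joins an external field to a vertex field.

```python
from itertools import permutations
from math import factorial

def perfect_pairings(items):
    """Yield every way of grouping the items into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        for tail in perfect_pairings(rest[:k] + rest[k + 1:]):
            yield [(first, partner)] + tail

EXTERNAL = ["x1", "x2", "x3", "x4"]   # phi(x1) ... phi(x4)
VERTEX = ["xa", "xb", "xc", "xd"]     # the four factors in phi^4(x)

def is_connected(pairing):
    """True when every contraction joins an external field to a vertex field."""
    return all((a in EXTERNAL) != (b in EXTERNAL) for a, b in pairing)

all_pairings = list(perfect_pairings(EXTERNAL + VERTEX))
connected = [p for p in all_pairings if is_connected(p)]

print(len(all_pairings), len(connected))
# prints 105 24
# The connected contractions are the bijections external -> vertex fields.
print(len(connected) == len(list(permutations(VERTEX))))
# The 4! contractions cancel the 1/4! in the coupling, leaving (-i lambda).
print(len(connected) / factorial(4))
```

The remaining 81 full contractions contain at least one propagator between two external points or a vertex field contracted with another vertex field; the first kind is disconnected, while the second kind is the tadpole-type term discussed in the footnote above.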
We can represent it with the Feynman graph shown in Fig. 5.3. Setting $y_i=x_i-x$ and performing first the integration over the $y_i$ we get
$$\begin{aligned}
&\int d^4x\;e^{i(p_1+p_2-k_1-k_2)x}\,(-i\lambda)\,\tilde D(p_1)\tilde D(p_2)\tilde D(k_1)\tilde D(k_2) \\
&\qquad = (-i\lambda)(2\pi)^4\delta^{(4)}(p_1+p_2-k_1-k_2)\Bigg(\prod_{j=1}^{2}\frac{i}{p_j^2-m^2}\Bigg)\Bigg(\prod_{i=1}^{2}\frac{i}{k_i^2-m^2}\Bigg) .
\end{aligned}\tag{5.93}$$
Therefore
$$\langle p_1p_2|iT|k_1k_2\rangle = (-i\lambda)(2\pi)^4\delta^{(4)}(p_1+p_2-k_1-k_2) . \tag{5.94}$$
We note several aspects which can be generalized to all Feynman graphs:
• Only connected graphs contribute.
• For connected graphs, the propagators associated to the external legs cancel exactly the pole factors in the LSZ formula.
• Each interaction vertex gives a factor $-i\lambda$.
• There is an integration over a variable $x$ which gives the overall energy–momentum conservation.
Finally, we have to consider the effect of the denominator in eq. (5.87). The denominator gives only vacuum-to-vacuum graphs, i.e. Feynman diagrams with no external lines. However each contribution from the numerator can be “dressed” with all possible vacuum-to-vacuum graphs, considering all possible disconnected graphs made with the original graph plus all possible vacuum-to-vacuum graphs. This is shown in the first line of Fig. 5.4, where we give examples of disconnected graphs with four external legs. As shown graphically in the second line of the figure, the connected graph with four external legs factorizes, and the term in the parentheses is the sum of all vacuum-to-vacuum graphs. This is nothing but the perturbative expansion of the denominator, so the term in parentheses exactly cancels the denominator. This means that we can simply set the vacuum expectation value at the denominator to one, if at the same time we neglect in the numerator all disconnected diagrams in which a disconnected component is a vacuum diagram.
Fig. 5.3 The Feynman diagram in position space representing the 2 → 2 scattering amplitude at O(λ).
Fig. 5.4 A diagrammatic representation of the effect of vacuum-to-vacuum graphs.
Fig. 5.5 The Feynman graph in position space corresponding to the contraction described in the text.
The Feynman graphs that we have considered until now have no internal lines. To illustrate what happens with internal lines it is simpler to consider a theory with $\mathcal{H}_I=(\lambda/3!)\phi^3$. In this case three lines instead of four meet at each vertex. To compute the 2 → 2 scattering amplitude we consider
$$\int d^4x_1\,d^4x_2\,d^4x_3\,d^4x_4\;\exp\{i(p_1x_1+p_2x_2-k_1x_3-k_2x_4)\}\;\langle 0|T\Big\{\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)\exp\Big[-i\frac{\lambda}{3!}\int d^4x\,\phi^3(x)\Big]\Big\}|0\rangle_c . \tag{5.95}$$
We have seen that the denominator in eq. (5.67) is canceled by disconnected graphs in which a disconnected component is a vacuum-to-vacuum graph, while disconnected graphs with external legs in more than one disconnected component do not contribute because they do not provide the right pole factors. We therefore omit the vacuum-to-vacuum amplitude in the denominator hereafter, and we add a subscript, $\langle 0|\ldots|0\rangle_c$, to remind us that we must consider only connected graphs. In this theory at $O(\lambda)$ there is no contribution to the four-point amplitude, because expanding the exponential to first order in $\lambda$ we get the Green's function $\langle 0|T\{\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)\phi^3(x)\}|0\rangle$, which has an odd number of fields, so we cannot contract all of them and the vacuum expectation value is zero. Therefore the leading term is $O(\lambda^2)$, and is
$$\begin{aligned}
&\int d^4x_1\,d^4x_2\,d^4x_3\,d^4x_4\;\exp\{i(p_1x_1+p_2x_2-k_1x_3-k_2x_4)\} \\
&\qquad\times\frac{1}{2!}\Big(-i\frac{\lambda}{3!}\Big)^2\int d^4x\int d^4y\;\langle 0|T\{\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)\phi^3(x)\phi^3(y)\}|0\rangle_c .
\end{aligned}\tag{5.96}$$
Fig. 5.6 A Feynman graph corresponding to a different set of contractions.
The new aspect here is that, in order to contract all ﬁelds and to obtain a connected diagram, we must contract one ﬁeld φ(x) with one φ(y), where x, y are the positions of the two vertices. We therefore have graphs with two vertices connected by an internal line, while in the previous example all lines were external, i.e. related to the initial or ﬁnal particles. To understand how to treat these graphs we consider the case in which we contract φ(x1 ) and φ(x2 ) each with one of the φ(x), φ(x3 ) and φ(x4 ) each with one of the φ(y), and the remaining φ(x) with the remaining φ(y), as in Fig. 5.5. (With diﬀerent contractions, we can also obtain the graph in Fig. 5.6, and an analogous graph with x3 and x4 interchanged.) Taking into account the combinatorial factor, the factor 1/2! from the expansion of the exponential, and the fact that there is an equal contribution with x and y interchanged, the graph in Fig. 5.5 gives Z (5.97) d4 x1 d4 x2 d4 x3 d4 x4 exp{i(p1 x1 + p2 x2 − k1 x3 − k2 x4 )}
5.5
Wick’s theorem and Feynman diagrams 127
Z
Z
d4 y(−iλ)2 D(x1 − x)D(x2 − x)D(x − y)D(y − x3 )D(y − x4 ) Z Z ˜ 2 )D(k ˜ 1 )D(k ˜ 2 ) d4 x d4 y ei(p1 +p2 )x−i(k1 +k2 )y D(x − y) ˜ 1 )D(p = (−iλ)2 D(p ×
d4 x
$$= (-i\lambda)^2 (2\pi)^4 \delta^{(4)}(p_1+p_2-k_1-k_2)\,\tilde D(p_1)\tilde D(p_2)\tilde D(p_1+p_2)\tilde D(k_1)\tilde D(k_2)\,.$$

Again we find the momentum space propagators associated to the external legs, which are canceled by the pole terms in the LSZ formula, and the energy–momentum conservation. The new factor is $\tilde D(p_1+p_2)$, associated to the internal line.

It now becomes clear that it is more convenient to work in momentum space, rather than in position space. Then to each line of a graph is associated the momentum space propagator given in eq. (5.82). The external lines give factors which cancel the pole terms in the LSZ formula. The Feynman diagrams in momentum space corresponding to Figs. 5.3 and 5.5 are shown in Figs. 5.7 and 5.8.

At this point it is not difficult to see how to compute the most general amplitude associated with tree graphs (a tree graph is a graph with internal and external lines, but with no closed internal loop). The technique can be summarized by the Feynman rules in momentum space, for a scalar field theory (some generalization will be needed for fermions, gauge fields, etc., and will be discussed later):

• Draw all connected graphs corresponding to the given initial and final states. The number of lines that meet at each vertex is determined by the interaction term; e.g. three lines in $\phi^3$ theory and four lines in $\phi^4$ theory. Disconnected graphs do not contribute.

• To each external leg is associated a factor which compensates the pole factor in the LSZ reduction formula, eq. (5.46). Therefore we can simply omit all these factors from the graph, and we will obtain directly the matrix element of $iT$. This is often expressed saying that we consider the graphs "with external legs amputated".

• There is an overall Dirac delta imposing energy–momentum conservation. In order not to write explicitly the Dirac delta each time we compute a Feynman graph, it is convenient to define a matrix element $\mathcal{M}_{fi}$ from
$$\langle p_1 \ldots p_n | iT | k_1 \ldots k_m \rangle = (2\pi)^4 \delta^{(4)}\Big(\sum_i p_i - \sum_j k_j\Big)\, i\mathcal{M}_{fi}\,. \tag{5.98}$$
The labels $i, f$ refer to the initial and final states or more explicitly, for a scalar theory, $\mathcal{M}_{fi} = \mathcal{M}(p_1, \ldots, p_n; k_1, \ldots, k_m)$ (more generally, the initial and final states are labeled also by the spin states of the initial and final particles).

• Energy–momentum conservation must be imposed separately at each vertex. Note for instance that in the internal line in Fig. 5.8 we had two external momenta $p_1$ and $p_2$ flowing into a vertex, and the momentum associated to the internal line flowing out of this vertex is $p_1 + p_2$. The "virtual particle" associated with this internal line decays in the final states with momenta $k_1$, $k_2$, and the overall Dirac delta ensures $p_1 + p_2 = k_1 + k_2$, so momentum is conserved also at the other vertex.

• To each vertex associate a factor $-i$ times the coupling constant.
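Fig. 5.7 The graph in momentum space corresponding to Fig. 5.3 (external momenta $p_1$, $p_2$, $k_1$, $k_2$).

Fig. 5.8 The graph in momentum space corresponding to Fig. 5.5 (incoming momenta $p_1$, $p_2$; internal line $p_1 + p_2$; outgoing momenta $k_1$, $k_2$).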
• To each internal line associate a propagator, with the value of the four-momentum given by energy–momentum conservation.

• There is a combinatorial factor which combines the number of equivalent contractions, the factors $1/n!$ from the expansion of the exponential at order $n$, and numerical factors associated to the definition of the coupling constant, such as the factor $1/4!$ in $(\lambda/4!)\phi^4$.

In the next subsection we will understand what happens when there are internal loops in a graph.
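The tree-level rules can be tried out on the single graph of Fig. 5.8. The following is only a minimal sketch: the function names are ours, the coupling and momenta are arbitrary illustrative numbers, and the $i\epsilon$ term is dropped, which is legitimate only away from the pole of the internal propagator.

```python
import math


def minkowski_square(p):
    """Return p^2 = (p^0)^2 - |p|^2 for a four-vector, signature (+,-,-,-)."""
    energy, px, py, pz = p
    return energy**2 - px**2 - py**2 - pz**2


def scalar_propagator(p_squared, mass):
    """Momentum-space propagator i/(p^2 - m^2); the i*epsilon is dropped (off the pole)."""
    return 1j / (p_squared - mass**2)


def s_channel_amplitude(p1, p2, coupling, mass):
    """i*M for the graph of Fig. 5.8 only: two vertices joined by one internal line."""
    # Momentum conservation at the first vertex fixes the internal momentum.
    internal = tuple(a + b for a, b in zip(p1, p2))
    vertex = -1j * coupling  # factor -i times the coupling at each vertex
    return vertex * scalar_propagator(minkowski_square(internal), mass) * vertex


# Illustrative on-shell incoming momenta for m = 1 (hypothetical numbers).
mass = 1.0
energy = math.sqrt(mass**2 + 1.0)
p1 = (energy, 0.0, 0.0, 1.0)
p2 = (energy, 0.0, 0.0, -1.0)
i_amplitude = s_channel_amplitude(p1, p2, coupling=3.0, mass=mass)
print(i_amplitude)
```

The external legs are amputated and the overall delta function is stripped off, so the result is directly the contribution of this one graph to $i\mathcal{M}_{fi}$.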
5.5.2 Loops and divergences

We have by now understood how to write a Feynman graph when there are internal lines whose momentum is fixed by energy–momentum conservation at a vertex. Consider however the $O(\lambda^2)$ corrections to the 2 → 2 scattering amplitude in $\lambda\phi^4$. The possible connected graphs that can be constructed with four external legs (since we study a 2 → 2 amplitude), four lines meeting at each vertex (since we are in $\phi^4$ theory) and two vertices (since we want to study the terms $O(\lambda^2)$) are given in Fig. 5.9. The three graphs correspond to different types of contractions, which we will discuss in more detail below. The important new point is that the momenta of the internal lines are not completely fixed imposing energy–momentum conservation at each vertex. We can assign an arbitrary momentum $k$ to one internal line and the other has a momentum fixed by energy–momentum conservation at each vertex. The fact that not all momenta are fixed takes place each time we have loops in a Feynman graph. We must therefore understand how to treat these graphs.

Fig. 5.9 The three contributions to the one-loop 2 → 2 amplitude in momentum space. In each graph one internal line carries the loop momentum $k$; the other carries $p_1 + p_2 - k$, $k_1 + k - p_1$ and $k_2 + k - p_1$, respectively.

Again, we discuss some examples in full detail. Let us work out the 2 → 2 amplitude in $\lambda\phi^4$ at one-loop order. We start as usual from
$$\int d^4x_1\, d^4x_2\, d^4x_3\, d^4x_4\, \exp\{i(p_1x_1 + p_2x_2 - k_1x_3 - k_2x_4)\} \times \langle 0|T\Big\{\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4) \exp\Big(-i\frac{\lambda}{4!}\int d^4x\, \phi^4\Big)\Big\}|0\rangle_c\,. \tag{5.99}$$
The term of order $\lambda^2$ is
$$\frac{1}{2!}\int d^4x_1\, d^4x_2\, d^4x_3\, d^4x_4\, \exp\{i(p_1x_1 + p_2x_2 - k_1x_3 - k_2x_4)\} \times \Big(-i\frac{\lambda}{4!}\Big)^2 \int d^4x\, d^4y\, \langle 0|T\{\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)\phi^4(x)\phi^4(y)\}|0\rangle_c\,. \tag{5.100}$$

Fig. 5.10 A contribution to the one-loop amplitude in position space (external points $x_1$, $x_2$ joined to the vertex $x$; external points $x_3$, $x_4$ joined to the vertex $y$).
The Feynman graph in position space shown in Fig. 5.10 is obtained contracting $\phi(x_1)$ with one of the $\phi(x)$, $\phi(x_2)$ with another $\phi(x)$ (there are $4 \cdot 3$ possible ways to do it), $\phi(x_3)$ with one $\phi(y)$, $\phi(x_4)$ with another of the $\phi(y)$ (again $4 \cdot 3$ combinations), and the remaining two of the $\phi(x)$ with the remaining two $\phi(y)$ (there are 2 possible ways to do it). An equal contribution is obtained exchanging $x \leftrightarrow y$, i.e. contracting $\phi(x_1)$ and $\phi(x_2)$ with two of the $\phi(y)$ and $\phi(x_3)$, $\phi(x_4)$ with two of the $\phi(x)$; this gives an additional factor of 2, so we finally get
$$\begin{aligned}
&\frac{1}{2}(-i\lambda)^2 \int d^4x_1\, d^4x_2\, d^4x_3\, d^4x_4\, d^4x\, d^4y\; e^{i(p_1x_1+p_2x_2-k_1x_3-k_2x_4)} \\
&\qquad \times D(x_1-x)D(x_2-x)D(x-y)D(x-y)D(y-x_3)D(y-x_4) \\
&= \frac{1}{2}(-i\lambda)^2 \tilde D(p_1)\tilde D(p_2)\tilde D(k_1)\tilde D(k_2) \int d^4x\, d^4y\; e^{i(p_1+p_2)x - i(k_1+k_2)y} D^2(x-y) \\
&= \frac{1}{2}(-i\lambda)^2 \tilde D(p_1)\tilde D(p_2)\tilde D(k_1)\tilde D(k_2)\, (2\pi)^4 \delta^{(4)}(p_1+p_2-k_1-k_2) \int \frac{d^4k}{(2\pi)^4}\, \tilde D(k)\tilde D(p_1+p_2-k)\,.
\end{aligned} \tag{5.101}$$
Similar expressions are obtained if $x_1$ and $x_3$ are joined to $x$ while $x_2$ and $x_4$ to $y$, and if $x_1$ and $x_4$ are joined to $x$ while $x_2$ and $x_3$ to $y$. These three possibilities in position space give rise to the three graphs of Fig. 5.9 in momentum space. In eq. (5.101) we recognize the usual factors associated to the external legs and the overall energy–momentum conservation. The new aspect here is the integration over the momentum $k$ of the internal line which was not fixed by the energy–momentum conservation at the vertices. We can therefore add to our list of Feynman rules in momentum space:

• associate a propagator to each internal line in a loop, use momentum conservation at the vertices to reduce the number of independent momenta, and integrate over the remaining unfixed momenta, with the measure $d^4k/(2\pi)^4$.

We define
$$A(p) \equiv \frac{(-i\lambda)^2}{2} \int \frac{d^4k}{(2\pi)^4}\, \frac{i}{k^2 - m^2 + i\epsilon}\, \frac{i}{(p-k)^2 - m^2 + i\epsilon}\,. \tag{5.102}$$
The 2 → 2 scattering amplitude at one-loop level is then
$$i\mathcal{M}_{2\to 2} = -i\lambda + A(p_1+p_2) + A(p_1-k_1) + A(p_1-k_2)\,, \tag{5.103}$$
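The factor 1/2 in eq. (5.101) is pure bookkeeping, and it can be reproduced by multiplying the counting factors quoted above. The snippet below is a minimal sketch of that arithmetic; the function and variable names are ours.

```python
from fractions import Fraction
from math import factorial


def one_loop_weight():
    """Combine the counting factors that lead to the 1/2 in eq. (5.101)."""
    expansion = Fraction(1, factorial(2))            # 1/2! from the second-order term
    coupling_norm = Fraction(1, factorial(4)) ** 2   # (1/4!)^2 from the two vertices
    legs_at_x = 4 * 3          # phi(x1), phi(x2) contracted with two of the phi(x)
    legs_at_y = 4 * 3          # phi(x3), phi(x4) contracted with two of the phi(y)
    internal_pairings = 2      # remaining two phi(x) with remaining two phi(y)
    exchange_x_y = 2           # equal contribution from x <-> y
    return (expansion * coupling_norm * legs_at_x * legs_at_y
            * internal_pairings * exchange_x_y)


weight = one_loop_weight()
print(weight)
```

Exact rational arithmetic is used so that the cancellation of the $4!$ factors is visible without rounding.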
where the three contributions correspond to the three ways in which the process can take place, shown in Fig. 5.9. Now however we discover that Feynman diagrams containing loops can be divergent! Indeed, the integral in eq. (5.102) diverges at large $k$, and is an example of an ultraviolet (UV) divergence. To study this integral we proceed as follows. First of all, we will limit ourselves for simplicity to the calculation of $A(p)$ when $p = 0$.$^{8}$ Recall that the $i\epsilon$ prescription means that, in the complex $k^0$ plane, the pole at $k^0 > 0$ is below the real axis and the pole at $k^0 < 0$ is above the real axis, see Fig. 5.1. Therefore we can change the integration path in the complex $k^0$ plane, rotating counterclockwise from the real axis to the imaginary axis.

$^{8}$ For this graph, this is sufficient for extracting the divergent part, because the divergence comes from the region $k \to \infty$, where $(p-k)^2 \to k^2$, and there are no subleading divergences.
This is called the Wick rotation. The integration variable is the complex variable $k^0$; on the real axis we write $dk^0 = dk^0_M$ while on the imaginary axis we write $dk^0 = i\,dk^0_E$, where $k^0_M$ and $k^0_E$ are real variables, and the subscript denotes Minkowskian and Euclidean space, respectively. Then we obtain
$$A(0) = i\,\frac{\lambda^2}{2} \int \frac{d^4k}{(2\pi)^4}\, \frac{1}{(k^2+m^2)^2}\,. \tag{5.104}$$
Here $k$ is a Euclidean momentum, $k^2 = (k^0_E)^2 + \mathbf{k}^2$. The overall factor of $i$ comes from the rotation.$^{9}$ For our purposes the exact computation of the integral is not necessary. The important point is that the integral diverges in the UV. We introduce a cutoff $\Lambda$ stating that we integrate only over Euclidean momenta with $k^2 < \Lambda^2$, and we extract the divergent part:
$$\begin{aligned}
A(0) &= i\,\frac{\lambda^2}{2} \int \frac{d^4k}{(2\pi)^4}\, \frac{1}{k^4} + \text{finite parts} \\
&= i\,\frac{\lambda^2}{2}\, \frac{(2\pi^2)}{(2\pi)^4} \int^{\Lambda} \frac{dk}{k} + \text{finite parts} \\
&= i\lambda^2\, \frac{1}{16\pi^2} \log\Lambda + \text{finite parts}\,.
\end{aligned} \tag{5.105}$$

$^{9}$ To check the sign consider the complex $z$-plane, with $z = x + iy$; let $C_1$ denote the real axis running from $x = -\infty$ toward $x = +\infty$ and $C_2$ the imaginary axis again oriented from $y = -\infty$ to $y = +\infty$; then, closing the contour at infinity in the first and third quadrants, we have $\oint_{C_1 - C_2} dz = 0$ so that $\int_{C_1} dz = \int_{C_2} dz$. But $\int_{C_1} dz = \int dx$ and $\int_{C_2} dz = i\int dy$.
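The logarithmic growth in eq. (5.105) can be checked numerically on the Euclidean integral of eq. (5.104). The sketch below is only illustrative: the Simpson rule, the function names and the numerical values of the cutoff and mass are our choices, and the closed form used for comparison is the elementary radial integral, not a formula quoted in the text.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def loop_integral_numeric(cutoff, mass):
    """Euclidean integral of 1/(k^2+m^2)^2 over d^4k/(2pi)^4 with |k| < cutoff."""
    # Angular integration gives 2*pi^2, so the measure is k^3 dk / (8 pi^2).
    radial = simpson(lambda k: k**3 / (k**2 + mass**2) ** 2, 0.0, cutoff)
    return radial / (8 * math.pi**2)


def loop_integral_closed(cutoff, mass):
    """Same integral done analytically (substitution u = k^2 + m^2)."""
    x = cutoff**2 / mass**2
    return (math.log(1 + x) + 1 / (1 + x) - 1) / (16 * math.pi**2)


numeric = loop_integral_numeric(100.0, 1.0)
closed = loop_integral_closed(100.0, 1.0)
# Raising the cutoff by a factor 10 should add log(10)/(8 pi^2) at large cutoff.
step = loop_integral_closed(1.0e4, 1.0) - loop_integral_closed(1.0e3, 1.0)
print(numeric, closed, step)
```

Multiplying the integral by $i\lambda^2/2$ gives $A(0)$, so a step of $\log 10/(8\pi^2)$ in the integral corresponds to the coefficient $1/(16\pi^2)$ of $\log\Lambda$ in eq. (5.105).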
The factor $2\pi^2$ is the solid angle in four dimensions. At finite $p$ the calculation is more complicated, and depending on the value of $p$ one can have poles also in the first and third quadrants. In general, when one has to perform more complicated computations, it is not convenient to use as regularization a cutoff over Euclidean momenta; there are techniques, in particular dimensional regularization, which are much more convenient (see Chapter 7). In any case, in this graph the divergent part is independent of $p$ and the $p$ dependence only enters in the finite parts. Since the divergence is independent of $p$, the three graphs in Fig. 5.9 give the same contribution to the divergence. As a result, at one loop, the 2 → 2 scattering amplitude in $\lambda\phi^4$ is
$$i\mathcal{M}_{2\to 2} = -i\lambda + i\lambda^2 (\beta_0 \log\Lambda + \text{finite parts})\,, \tag{5.106}$$
with
$$\beta_0 = \frac{3}{16\pi^2}\,. \tag{5.107}$$

Fig. 5.11 A tadpole graph (external momentum $p$, loop momentum $k$).
The sign of $\beta_0$ plays a very important role and will be discussed in detail in Section 5.9. These divergences are typical of loop graphs. In the next section we will understand how they can be cured. First however we discuss another example of divergence, considering the two-point function in $\lambda\phi^4$ theory. In momentum space, at zero order in $\lambda$, this is just the Feynman propagator $\tilde D(p)$. At order $\lambda$ we have the graph shown in Fig. 5.11. This graph is known as a tadpole graph.
Using the Feynman rules in momentum space and performing the Wick rotation $k^0 \to ik^0$, this is given by
$$-iB \equiv \frac{-i\lambda}{2} \int \frac{d^4k}{(2\pi)^4}\, \frac{1}{k^2+m^2} = -i\,\frac{\lambda}{32\pi^2} \Big(\Lambda^2 - m^2 \log\frac{\Lambda^2+m^2}{m^2}\Big)\,, \tag{5.108}$$
so this term has a quadratic and a logarithmic divergence, both coming from the large $k$ integration region, so they are again UV divergences. Observe also that the tadpole graph (with external legs amputated) is independent of the external momentum $p$. Actually, one can even resum the whole class of graphs shown in Fig. 5.12. Including also the propagators from the external legs, the result for the two-point function in momentum space is
$$\begin{aligned}
&\tilde D(p) + \tilde D(p)(-iB)\tilde D(p) + \tilde D(p)(-iB)\tilde D(p)(-iB)\tilde D(p) + \ldots \\
&= \tilde D(p)\Big[1 + (-iB\tilde D(p)) + (-iB\tilde D(p))^2 + \ldots\Big] \\
&= \tilde D(p)\, \frac{1}{1 + iB\tilde D(p)} = \frac{i}{p^2-m^2}\, \frac{1}{1 - \frac{B}{p^2-m^2}} = \frac{i}{p^2 - m^2 - B}\,.
\end{aligned} \tag{5.109}$$
We see that the net effect is to shift the mass from $m^2$ to $m^2 + B$. We will make use of this fact when we study the renormalization of the theory.

At $O(\lambda^2)$ there are further contributions to the two-point function. One possible graph is shown in Fig. 5.13. Iterating graphs of this type we obtain again a geometric series, see Fig. 5.14, so again the result goes in the denominator. The Feynman graph in Fig. 5.13 gives
$$i\,\frac{\lambda^2}{6} \int \frac{d^4k_1}{(2\pi)^4} \frac{d^4k_2}{(2\pi)^4}\, \frac{1}{[(p-k_1-k_2)^2 - m^2](k_1^2 - m^2)(k_2^2 - m^2)}\,. \tag{5.110}$$
The integral is somewhat more difficult to compute, compared to the previous examples, and the result turns out to be proportional to
$$\lambda^2 p^2 \Big(\log\frac{\Lambda^2}{p^2} + C\Big)\,, \tag{5.111}$$
so in this case the divergence depends on $p^2$. After resumming the geometrical series, the result for the two-point function in momentum space becomes of the form
$$\frac{i}{A(\Lambda, p^2)\,p^2 - m^2 - B(\Lambda)} \tag{5.112}$$
with $A(\Lambda, p^2) = 1 + \lambda^2\big(c_1 \log\frac{\Lambda^2}{p^2} + c_2\big)$, and $c_1$, $c_2$ some constants.
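The resummation in eq. (5.109) is an ordinary geometric series, and it is easy to verify numerically that the partial sums approach $i/(p^2 - m^2 - B)$. The sketch below assumes illustrative values of the coupling, cutoff, mass and $p^2$ (chosen by us so that $|B/(p^2-m^2)| < 1$ and the series converges), and drops the $i\epsilon$.

```python
import math


def tadpole_B(coupling, cutoff, mass):
    """B as given by the closed form on the right-hand side of eq. (5.108)."""
    log_term = math.log((cutoff**2 + mass**2) / mass**2)
    return coupling / (32 * math.pi**2) * (cutoff**2 - mass**2 * log_term)


def free_propagator(p_squared, mass_squared):
    """Momentum-space propagator i/(p^2 - m^2), i*epsilon dropped (off the pole)."""
    return 1j / (p_squared - mass_squared)


def resummed_partial(p_squared, mass_squared, B, n_terms):
    """Sum the first n_terms of D + D(-iB)D + D(-iB)D(-iB)D + ..."""
    d = free_propagator(p_squared, mass_squared)
    ratio = -1j * B * d          # each extra tadpole insertion multiplies by (-iB) D
    total, term = 0j, d
    for _ in range(n_terms):
        total += term
        term *= ratio
    return total


def resummed_closed(p_squared, mass_squared, B):
    """Closed form of the series: propagator with m^2 shifted to m^2 + B."""
    return 1j / (p_squared - mass_squared - B)


B = tadpole_B(coupling=0.1, cutoff=10.0, mass=1.0)
partial = resummed_partial(2.0, 1.0, B, n_terms=40)
closed = resummed_closed(2.0, 1.0, B)
print(B, partial, closed)
```

Outside the radius of convergence the series itself is meaningless, but the closed form is still what one uses; this is the usual sense in which the tadpoles "go in the denominator".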
Fig. 5.12 The resummation of tadpole graphs.

Fig. 5.13 A two-loop correction to the propagator (external momentum $p$; internal momenta $k_1$, $k_2$ and $k_1 + k_2 - p$).

Fig. 5.14 The resummation of the graph shown in Fig. 5.13.

5.5.3 Summary of Feynman rules for a scalar field

It is now clear that it is not necessary to go each time explicitly through the process of developing the exponential of the interaction Hamiltonian to the desired order and performing the contractions using the Wick theorem, since the result can always be summarized by a simple set of rules. First of all, we have seen that it is usually more convenient to work in momentum space, and we summarize here the Feynman rules in momentum space. We consider for definiteness a real scalar field with an interaction term
$$H_{\rm int} = \frac{\lambda}{n!}\,\phi^n\,. \tag{5.113}$$
The first step is to draw all connected graphs corresponding to the given initial and final states, with $n$ lines meeting at each vertex. For each graph proceed as follows:

• Neglect the external legs.

• Energy–momentum conservation must be imposed separately at each vertex.

• To each vertex associate a factor $-i\lambda$.

• To each internal line with momentum $p$ associate a propagator
$$\tilde D(p) = \frac{i}{p^2 - m^2 + i\epsilon}\,. \tag{5.114}$$

• Integrate over the four-momenta $k_i$ which are not fixed by energy–momentum conservation at each vertex, with a measure $d^4k_i/(2\pi)^4$.

• Include the appropriate combinatorial factor, which combines a factor $1/N!$ from the expansion of the exponential at order $N$, the number of equivalent contractions, and the factor $1/n!$ from the normalization of the coupling in the theory with $H_{\rm int} = (\lambda/n!)\phi^n$.

The sum of the contributions of all Feynman diagrams gives $i\mathcal{M}_{fi}$. This is related to the matrix element of the $T$ operator by
$$\langle p_1 \ldots p_n | iT | k_1 \ldots k_m \rangle = (2\pi)^4 \delta^{(4)}\Big(\sum_i p_i - \sum_j k_j\Big)\, i\mathcal{M}_{fi}\,, \tag{5.115}$$
and $T$ is related to the S-matrix by $S = 1 + iT$.
5.5.4 Feynman rules for fermions and gauge bosons

We now want to understand the Feynman rules for more interesting theories, like QED, which contains fermions and gauge bosons. The derivation is conceptually similar to what we have seen for the scalar fields. We will therefore just collect the relevant results, referring to Peskin and Schroeder (1995), Sections 4.7 and 4.8 for the derivations.

The fermion propagator. Wick's theorem can be generalized to fermionic fields, if we define the $T$ product of two Dirac fields as
$$T\{\Psi(x)\bar\Psi(y)\} = \begin{cases} \Psi(x)\bar\Psi(y) & x^0 > y^0 \\ -\bar\Psi(y)\Psi(x) & x^0 < y^0\,. \end{cases} \tag{5.116}$$
The Feynman propagator for the Dirac field is
$$S(x-y) = \langle 0|T\{\Psi(x)\bar\Psi(y)\}|0\rangle\,. \tag{5.117}$$
Observe that $S(x-y)$ is a $4 \times 4$ matrix in the Dirac indices. It can be computed explicitly expanding $\Psi$ in creation and destruction operators and using the anticommutation relations (4.34). The result is
$$S(x-y) = \int \frac{d^4p}{(2\pi)^4}\, \tilde S(p)\, e^{-ip(x-y)}\,, \tag{5.118}$$
where the momentum space propagator is
$$\tilde S(p) = \frac{i(\not{p} + m)}{p^2 - m^2 + i\epsilon}\,. \tag{5.119}$$
Observe that
$$(\not{p} - m)(\not{p} + m) = \gamma^\mu \gamma^\nu p_\mu p_\nu - m^2 = p^2 - m^2\,, \tag{5.120}$$
since $p_\mu p_\nu$ is symmetric under $\mu \leftrightarrow \nu$ and therefore we can replace $\gamma^\mu\gamma^\nu \to (1/2)\{\gamma^\mu, \gamma^\nu\} = \eta^{\mu\nu}$. Then we can multiply by $(\not{p} - m)$ both the numerator and the denominator in eq. (5.119) (where dividing by $(\not{p} - m)$ means to multiply by the $4 \times 4$ Dirac matrix $(\not{p} - m)^{-1}$) and rewrite $\tilde S(p)$ in the form
$$\tilde S(p) = \frac{i}{\not{p} - m}\,, \tag{5.121}$$
where the prescription for going around the poles is understood. An alternative way to compute the fermion propagator is to observe that, from the definition, $S(x-y)$ is a Green's function of the Dirac operator,
$$(i\not{\partial} - m)\,S(x-y) = i\delta^{(4)}(x-y)\,. \tag{5.122}$$
It is then straightforward to check that eqs. (5.118) and (5.121) provide a solution, and the prescription for going around the poles is the same as in the scalar case, and corresponds to the Feynman propagator.

The photon propagator. By definition the photon propagator is
$$D_{\mu\nu}(x-y) = \langle 0|T\{A_\mu(x)A_\nu(y)\}|0\rangle\,. \tag{5.123}$$
Using the covariant quantization of the gauge field discussed in Section 4.3.2, the calculation is a simple generalization of the calculation for the KG field, with the only difference that now we have four different creation operators $a^\dagger_{\mathbf{p},\lambda}$ labeled by a Lorentz index $\lambda = 0, \ldots, 3$, and the commutator is given by eq. (4.106). Then the propagator in momentum space is simply
$$\tilde D_{\mu\nu}(k) = \frac{-i}{k^2 + i\epsilon}\,\eta_{\mu\nu}\,. \tag{5.124}$$
In other words, the spatial components Ai have the same propagator as a massless scalar ﬁeld, while A0 has the “wrong” sign. A more general form of the photon propagator will be given in Chapter 7.
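Returning to the fermion propagator, the identity (5.120), on which the passage from eq. (5.119) to eq. (5.121) rests, can be verified with explicit matrices. The sketch below assumes the Dirac representation of the $\gamma$ matrices (a choice of ours; the identity itself is representation independent) and uses plain nested lists, with helper names that are purely illustrative.

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
ETA = [1, -1, -1, -1]  # diagonal of the metric, signature (+,-,-,-)


def negate(a):
    return [[-x for x in row] for row in a]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def combine(a, b, ca, cb):
    """Return ca*a + cb*b for 4x4 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]


IDENTITY = block(ONE2, ZERO2, ZERO2, ONE2)
# Dirac representation (assumed): gamma^0 = diag(1,1,-1,-1), gamma^i off-diagonal.
GAMMA = [block(ONE2, ZERO2, ZERO2, negate(ONE2))] + [
    block(ZERO2, s, negate(s), ZERO2) for s in SIGMA
]


def max_abs_difference(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))


def clifford_violation():
    """Largest entry of {gamma^mu, gamma^nu} - 2 eta^{mu nu} over all mu, nu."""
    worst = 0.0
    for mu in range(4):
        for nu in range(4):
            anti = combine(matmul(GAMMA[mu], GAMMA[nu]),
                           matmul(GAMMA[nu], GAMMA[mu]), 1, 1)
            target = combine(IDENTITY, IDENTITY, 2 * ETA[mu] * (mu == nu), 0)
            worst = max(worst, max_abs_difference(anti, target))
    return worst


def slash(p):
    """p-slash = gamma^mu p_mu, with p given by its upper-index components."""
    result = combine(IDENTITY, IDENTITY, 0, 0)
    for mu in range(4):
        result = combine(result, GAMMA[mu], 1, ETA[mu] * p[mu])
    return result


def propagator_identity_violation(p, mass):
    """Largest entry of (pslash - m)(pslash + m) - (p^2 - m^2) * identity."""
    p_squared = sum(ETA[mu] * p[mu] ** 2 for mu in range(4))
    left = matmul(combine(slash(p), IDENTITY, 1, -mass),
                  combine(slash(p), IDENTITY, 1, mass))
    right = combine(IDENTITY, IDENTITY, p_squared - mass ** 2, 0)
    return max_abs_difference(left, right)


print(clifford_violation())
print(propagator_identity_violation((3.0, 1.0, 2.0, 0.5), 1.2))
```

Both printed numbers should vanish up to rounding; the momentum and mass in the last line are arbitrary and need not be on shell.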
The interaction vertex. While the propagators are fixed by the kinetic terms, i.e. by the free theory, the interaction vertices depend of course on the specific theory that we are considering. In QED the interaction term in the Hamiltonian is $eA_\mu\bar\Psi\gamma^\mu\Psi$. Let us recall from Section 4.2 that the expansion of the fields $\Psi$, $\bar\Psi$ in terms of creation and annihilation operators is (see eqs. (4.32) and (4.33))
$$\Psi(x) = \int \frac{d^3p}{(2\pi)^3\sqrt{2E_{\mathbf{p}}}} \sum_{s=1,2} \Big(a_{\mathbf{p},s}\,u^s(p)\,e^{-ipx} + b^\dagger_{\mathbf{p},s}\,v^s(p)\,e^{+ipx}\Big)\,, \tag{5.125}$$
$$\bar\Psi(x) = \int \frac{d^3p}{(2\pi)^3\sqrt{2E_{\mathbf{p}}}} \sum_{s=1,2} \Big(b_{\mathbf{p},s}\,\bar v^s(p)\,e^{-ipx} + a^\dagger_{\mathbf{p},s}\,\bar u^s(p)\,e^{+ipx}\Big)\,, \tag{5.126}$$
where $a_{\mathbf{p},s}$ destroys an electron (in a spin state labeled by $s$), $a^\dagger_{\mathbf{p},s}$ creates an electron, $b_{\mathbf{p},s}$ destroys a positron and $b^\dagger_{\mathbf{p},s}$ creates a positron. Therefore $\Psi$ can destroy an electron or create a positron while $\bar\Psi$ can destroy a positron or create an electron. Similarly the gauge field, in the covariant quantization, has the expansion (4.104),
$$A_\mu(x) = \int \frac{d^3p}{(2\pi)^3\sqrt{2\omega_{\mathbf{p}}}} \sum_{\lambda=0}^{3} \Big(\epsilon_\mu(p,\lambda)\,a_{\mathbf{p},\lambda}\,e^{-ipx} + \epsilon^*_\mu(p,\lambda)\,a^\dagger_{\mathbf{p},\lambda}\,e^{+ipx}\Big)\,, \tag{5.127}$$
and can destroy or create a photon. Therefore in $eA_\mu\bar\Psi\gamma^\mu\Psi$ there are all possible terms with two fermion lines and one photon line, which conserve the electric charge: for instance, we can destroy an electron with $\Psi$ and create it back with $\bar\Psi$ while at the same time emitting a photon, corresponding to a vertex $e^- \to e^-\gamma$; or we can absorb the photon, corresponding to a vertex $e^-\gamma \to e^-$; or we can destroy an electron with $\Psi$, destroy a positron with $\bar\Psi$ and create a photon, $e^+e^- \to \gamma$, etc. All these possibilities are summarized associating a factor
$$-ie\gamma^\mu \tag{5.128}$$
to the interaction vertex of Fig. 5.15. As in the scalar field theory, the factor $-i$ in eq. (5.128) comes from the fact that in the $T$ product appears the exponential of $-iH_I$. In Fig. 5.15 the solid line can represent either an electron propagating in the direction of the arrow or a positron propagating in the opposite direction. If we imagine that time runs from left to right, then Fig. 5.15 actually describes the process $e^+e^- \to \gamma$, while $e^-\gamma \to e^-$ will be drawn as in Fig. 5.16, etc.$^{10}$ The interaction vertex is proportional to $\gamma^\mu$ and therefore is a matrix in the Dirac indices and carries a Lorentz index.

Fig. 5.15 The QED vertex: the solid lines represent the fermions and the wavy line the photon.

Fig. 5.16 The same interaction vertex, describing $e^-\gamma \to e^-$.

$^{10}$ Observe that for the physical process $e^+e^- \to \gamma$ the matrix element $\mathcal{M}_{fi}$ is non-vanishing, $i\mathcal{M}_{fi} = ie\gamma^\mu$, but the matrix element of $iT$ is zero because the Dirac delta in eq. (5.98) cannot be satisfied, so the process is forbidden by energy–momentum conservation. However, the vertex of Fig. 5.15 enters as a building block in all other Feynman diagrams of QED.

The external lines. In the case of the scalar field, acting with the field operator $\phi$ on the vacuum to create a particle brings a factor $e^{ipx}$ while destroying a particle brings a factor $e^{-ipx}$, see eqs. (4.21) and (4.22). This is the origin of the factors $e^{ip_ix_i}$ for each final particle and $e^{-ik_jy_j}$ for each initial particle in the LSZ formula (5.46). From eqs. (5.125), (5.126) and (5.127) we see that for fermions and gauge bosons, together with the exponential factors (which, as we have seen, combine to give an overall energy–momentum conservation and transform the position space propagators into momentum space propagators) there are further factors associated to the external legs. Namely

• A factor $\epsilon^*_\mu(k)$ for each final photon with momentum $k$ and polarization given by $\epsilon_\mu(k)$.

• A factor $\epsilon_\mu(k)$ for each initial photon with momentum $k$ and polarization given by $\epsilon_\mu(k)$.

• A factor $u^s(p)$ for each initial electron with momentum $p$ and spin state $s$.

• A factor $v^s(p)$ for each final positron with momentum $p$ and spin state $s$.

• A factor $\bar u^s(p)$ for each final electron with momentum $p$ and spin state $s$.

• A factor $\bar v^s(p)$ for each initial positron with momentum $p$ and spin state $s$.

In other words, to each initial particle is associated its wave function, and to each final particle is associated the complex conjugate of the wave function (or the Dirac adjoint, for Dirac spinors). For an elementary scalar field the wave function is just the plane wave $e^{-ipx}$ while for particles with spin there is also the spin wave function, e.g. $\epsilon_\mu(k)$ for a photon or $u^s(p)$ for an electron.

Closed fermionic loops. Finally, from the anticommuting nature of fermionic fields, it follows that for each closed fermionic loop there is an additional minus sign.
5.6 Renormalization

The basic idea of renormalization is the following. We have seen that some diagrams give divergent contributions. The first step is therefore to regularize the theory. For instance, we can put a cutoff $\Lambda$ over the modulus of the Euclidean momenta, as we have done above (for technical reasons, especially in gauge theories, there are more convenient choices of the regularization scheme; however, for understanding the general ideas, we will use this cutoff). Eventually we want to send $\Lambda \to \infty$ but, as long as we take $\Lambda$ finite, our theory has a dependence on the cutoff, and therefore we begin by defining the theory at finite $\Lambda$ admitting that even the couplings, the masses and the fields depend on $\Lambda$, in a way which for the moment we leave unspecified. In $\lambda\phi^4$ theory we will therefore write the Lagrangian at finite $\Lambda$ as
$$\mathcal{L} = \frac{1}{2}(\partial\phi_0)^2 - \frac{1}{2}m_0^2\phi_0^2 - \frac{\lambda_0}{4!}\phi_0^4\,, \tag{5.129}$$
where the subscript 0 indicates that these quantities depend on the cutoff $\Lambda$: $\phi_0 = \phi_0(x; \Lambda)$, $m_0 = m_0(\Lambda)$, $\lambda_0 = \lambda_0(\Lambda)$. We call $\phi_0(x; \Lambda)$ the bare field, $m_0(\Lambda)$ the bare mass and $\lambda_0(\Lambda)$ the bare coupling.

Consider first the two-point amplitude. In this new notation the two-point function of the bare field, eq. (5.112), is written as
$$\langle 0|T\{\phi_0(x,\Lambda)\phi_0(y,\Lambda)\}|0\rangle_c = \int \frac{d^4p}{(2\pi)^4}\, \frac{i}{A(\Lambda, p^2)\,p^2 - m_0^2(\Lambda) - B(\Lambda)}\, e^{-ip(x-y)}\,. \tag{5.130}$$
The $i\epsilon$ prescription in the denominator is understood. We saw in eq. (5.108) that $B(\Lambda)$ is divergent as $\Lambda^2$ at the one-loop level,
$$B(\Lambda) = \frac{\lambda_0(\Lambda)}{32\pi^2}\,\Lambda^2 + O(\log\Lambda) + \text{finite parts}\,, \tag{5.131}$$
and $A$ diverges as $\log\Lambda$ at the two-loop level,
$$A(\Lambda, p^2) = 1 + \lambda_0^2(\Lambda)\Big(c_1 \log\frac{\Lambda^2}{p^2} + c_2\Big)\,. \tag{5.132}$$
Observe that $A$ and $B$ also have an implicit, and as yet unspecified, dependence on $\Lambda$ through $\lambda_0(\Lambda)$. For simplicity, let us at first examine eq. (5.130) at the one-loop level. Then the two-point function in momentum space is
$$\frac{i}{p^2 - m_0^2(\Lambda) - B(\Lambda)}\,, \tag{5.133}$$
i.e. $A = 1$, and we recall that $B$ is independent of $p^2$. The basic idea is that neither $m_0(\Lambda)$ nor $B(\Lambda)$ are physically observable. Rather, the physical or renormalized mass $m_R$ is defined by
$$m_R^2 = m_0^2(\Lambda) + B(\Lambda)\,. \tag{5.134}$$
In other words, we fix the physical mass requiring that the propagator has a pole at $p^2 = m_R^2$. Since $m_0(\Lambda)$ is a parameter completely in our hands, we choose it such that it cancels the divergence in $B(\Lambda)$, and it leaves us with a value of $m_R$ finite and equal to the measured physical value.

At the two-loop level the situation is slightly more complicated because there is also the divergence coming from $A$. However, we still define $m_R$ as the position of the pole of the propagator, i.e. by the condition
$$\big[A(\Lambda, p^2)\,p^2 - m_0^2(\Lambda) - B(\Lambda)\big]_{p^2 = m_R^2} = 0\,. \tag{5.135}$$
This is one condition, and is not yet sufficient to eliminate the two divergences coming from $A$ and $B$. However, expanding the function $A(p^2)p^2 - m_0^2 - B$ near $p^2 = m_R^2$, we find that close to the pole
$$\int d^4x\; e^{ipx}\, \langle 0|T\{\phi_0(x,\Lambda)\phi_0(0,\Lambda)\}|0\rangle_c = \frac{iZ}{p^2 - m_R^2} + \ldots\,, \tag{5.136}$$
where
$$Z = Z\Big(\lambda_0(\Lambda), \frac{\Lambda}{m_R}\Big) \equiv \Big[\frac{d}{d(p^2)}\big(A(\Lambda, p^2)\,p^2\big)\Big|_{p^2 = m_R^2}\Big]^{-1} \tag{5.137}$$
and the dots represent terms that are finite for $p^2 = m_R^2$. For later reference, we have also written explicitly that $Z$ depends also on the bare coupling $\lambda_0$. Furthermore, being dimensionless, $Z$ can depend on $\Lambda$ and $m_R$ only through the combination $\Lambda/m_R$. Now we define the renormalized field $\phi_R$ from
$$\phi_0(x, \Lambda) = Z^{1/2}\Big(\lambda_0(\Lambda), \frac{\Lambda}{m_R}\Big)\, \phi_R(x)\,. \tag{5.138}$$
By definition $\phi_R$ is independent of the cutoff, and this fixes the dependence of $\phi_0$ on $\Lambda$. The factor $Z^{1/2}$ is called the wave function renormalization, or field renormalization. We see from eq. (5.136) that in terms of $\phi_R$ the two-point function is the same as that of a free field with mass $m_R$, and therefore $Z$ is the same factor that appeared in the LSZ reduction formula (5.46). In other words, $Z$ disappears from the LSZ formula if, instead of using the bare field $\phi_0$, as we did in eq. (5.46), we use the "physical", renormalized field. Thus, after mass and wave function renormalization, the on-shell two-point function is finite.$^{11}$

$^{11}$ The fact that to see the wave function renormalization we had to go to two loops is a peculiarity of $\lambda\phi^4$ theory. In general theories $Z \neq 1$ already at one loop.

Now that we have made the two-point function finite, we turn our attention to the four-point function. At one loop there are two types of divergences in the four-point function. The first is associated to the graphs in Fig. 5.9, and we have seen that it is a logarithmic divergence. The second is associated with graphs like Fig. 5.17, i.e. tadpoles on external legs. The crucial point is that the divergence due to tadpoles is automatically cured by the renormalization of the two-point function, i.e. by the mass and wave function renormalization, because it is a divergence that concerns only a subgraph of Fig. 5.17, corresponding to a two-point function, and we have already made the two-point function finite.

Fig. 5.17 A tadpole on an external line.

The graphs in Fig. 5.9 give instead a genuinely new divergence, and we computed it in eq. (5.106). Actually, to renormalize the divergence, we have first to be more careful in specifying the kinematical configuration. A simple choice is to consider the scattering amplitude in the limit of zero spatial momentum, i.e. $p_1 = p_2 = k_1 = k_2 = (m_R, \mathbf{0})$. With this choice it is clear that, in eq. (5.106), the only scale that can be combined with $\Lambda$ to give a dimensionless argument of the logarithm is $m_R$, so we rewrite eq. (5.106) (with our new notation $\lambda_0$ for the bare coupling)
$$i\mathcal{M}_{2\to 2}(\mathbf{p}_i = \mathbf{k}_i = 0) \equiv -i\lambda_R = -i\lambda_0(\Lambda)\Big[1 - \lambda_0(\Lambda)\Big(\beta_0 \log\frac{\Lambda}{m_R} + \text{finite parts}\Big)\Big] + O(\lambda_0^3)\,. \tag{5.139}$$
Now, $\lambda_R$ is the quantity that is measured performing a scattering experiment and which therefore must be finite. We call it the physical, or renormalized coupling. We therefore choose the parameter $\lambda_0(\Lambda)$, which is completely in our hands, requiring that $\lambda_R$ is finite, and equal to the desired value. If we consider the scattering amplitude in a different kinematical regime, for instance when $(p_1 + p_2)^2 \equiv q^2 \gg m_R^2$, we find instead an amplitude
$$i\mathcal{M}_{2\to 2}(q^2) = -i\lambda_0(\Lambda)\Big[1 - \lambda_0(\Lambda)\Big(\frac{\beta_0}{2} \log\frac{\Lambda^2}{q^2} + \text{finite parts}\Big)\Big] + O(\lambda_0^3)\,. \tag{5.140}$$
This result follows from the explicit computation, but it is easily understood observing that in the limit $q^2 \gg m_R^2$ the relevant dimensional scale is provided by $q^2$ rather than $m_R^2$, so it is this scale that combines with $\Lambda$ to provide a dimensionless argument of the logarithm. Writing $\log(\Lambda^2/q^2) = \log(\Lambda^2/m_R^2) + \log(m_R^2/q^2)$ and using the definition of $\lambda_R$ from eq. (5.139), we get
$$i\mathcal{M}_{2\to 2}(q^2) = -i\lambda_R\Big[1 + \lambda_R\,\frac{\beta_0}{2} \log\frac{q^2}{m_R^2}\Big] + O(\lambda_R^3)\,, \tag{5.141}$$
where we could replace $\lambda_0^2$ with $\lambda_R^2$ since terms $O(\lambda_0^3)$ are neglected anyway. The important point that we understand from eq. (5.141) is that, once we have made finite the four-point amplitude at a given value of the external momenta, it is finite for all momenta.

In this way we have cured the divergences of the two-point and four-point amplitudes. We can continue and examine the six-point amplitude and so on. In principle, what can happen is that in the Feynman graphs that determine the six-point amplitude there are divergent subgraphs that are automatically cured by the renormalization of the two- and four-point functions plus possibly some genuinely new divergence. Figure 5.18 shows an example of a graph contributing to the six-point amplitude with a divergent subgraph that is automatically cured by the renormalization of the four-point function. Observe that it is possible to separate this graph into two disconnected parts by cutting a single line, along the dashed line in Fig. 5.18. Such graphs are called one-particle reducible, and cannot carry genuinely new divergences. If instead it is not possible to make the graph disconnected by cutting just one line, the graph is called one-particle irreducible (often abbreviated 1PI), and can in principle carry genuinely new divergences.

Fig. 5.18 A one-particle reducible graph. Cutting along the dashed line, the graph is separated in two disconnected pieces.

In the case of $\lambda\phi^4$ theory we saw that in the four-point function there were graphs cured by the renormalization of the two-point function and a genuinely new divergence which required the renormalization of $\lambda_R$. If this were the case also for the six-point function, after the renormalization of the field, of the mass and of $\lambda$ we would still be left with a divergent result for the six-point function. To cure it, we could introduce a new term proportional to $\phi^6$ in the Lagrangian, with a new bare coupling $\lambda_{(6),0}(\Lambda)$. This would give a further contribution to the six-point amplitude, and we could choose $\lambda_{(6),0}(\Lambda)$ so that it cancels the divergence that was left. This means that we should again fix the renormalized coupling $\lambda_{(6),R}$ by comparison with experiment.
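The step from eqs. (5.139) and (5.140) to eq. (5.141) can be illustrated numerically. The sketch below rests on several assumptions of ours: the finite parts are dropped everywhere, eq. (5.139) is inverted only to one-loop order, and the values of $\lambda_R$, $\Lambda$ and $q^2$ are arbitrary (in units with $m_R = 1$). It checks that, once $\lambda_0(\Lambda)$ is tuned to reproduce $\lambda_R$, the bare expression (5.140) agrees with the cutoff-independent form (5.141) up to terms of order $\lambda_R^3$.

```python
import math

BETA0 = 3 / (16 * math.pi**2)  # eq. (5.107)


def bare_coupling(lam_r, cutoff, m_r):
    """Invert eq. (5.139) to one-loop order, finite parts dropped (assumption)."""
    return lam_r * (1 + lam_r * BETA0 * math.log(cutoff / m_r))


def amplitude_bare(lam_0, cutoff, q2):
    """M(q^2) read off from eq. (5.140), finite parts dropped (assumption)."""
    return -lam_0 * (1 - lam_0 * 0.5 * BETA0 * math.log(cutoff**2 / q2))


def amplitude_renormalized(lam_r, q2, m_r):
    """M(q^2) read off from eq. (5.141): no reference to the cutoff."""
    return -lam_r * (1 + lam_r * 0.5 * BETA0 * math.log(q2 / m_r**2))


def mismatch(lam_r, cutoff, q2, m_r=1.0):
    """Difference between the two forms; it should be of order lam_r**3."""
    lam_0 = bare_coupling(lam_r, cutoff, m_r)
    return abs(amplitude_bare(lam_0, cutoff, q2)
               - amplitude_renormalized(lam_r, q2, m_r))


small = mismatch(0.01, cutoff=1.0e6, q2=100.0)
large = mismatch(0.02, cutoff=1.0e6, q2=100.0)
print(small, large, large / small)
```

Doubling $\lambda_R$ multiplies the mismatch by approximately $2^3 = 8$, which is the numerical signature of a remainder of order $\lambda_R^3$.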
If this process never terminates, and each time that we consider a new amplitude with a larger number of external legs we must introduce a new coupling and fix the amplitude at a certain energy by comparison with experiment, then the theory that we have constructed by this renormalization procedure is finite, because all divergences have been reabsorbed, but (apparently) has little predictive power, because we have introduced an infinite number of parameters, to be fixed by experiment. Then the theory is called non-renormalizable. Actually, even non-renormalizable theories can be very useful, but we postpone their discussion until Section 5.8. If instead at some point the process terminates, we just need to eliminate the divergences from a few amplitudes, fixing a few parameters by comparison with experiment, and then all the other amplitudes are automatically finite. In this case the theory is called renormalizable.

The criterion for understanding when a theory is renormalizable turns out to be quite simple and holds not only for the scalar theory that we are considering, but more generally (although the actual proof of renormalizability for gauge theories, and especially for non-abelian gauge theory, is far from trivial!). Consider for example a theory with interaction Hamiltonian $\lambda\phi^n$, with $n \geq 3$, integer, in four space-time dimensions. In a general Feynman diagram there will be an integration over $d^4k$ for each loop in the graph and a factor of the type $1/((k-p)^2 - m^2)$ for each propagator on an internal line, where $k$ is one of the integration variables (or a linear combination of integration variables) and $p$ is a combination of external momenta. So each loop integration carries four powers of momenta at the numerator and each internal line two powers at the denominator. In this theory the superficial degree of divergence $D$ is then defined as
$$D = 4L - 2N_i \tag{5.142}$$
where $L$ is the number of loops and $N_i$ the number of internal lines. The number of loops can be expressed as
$$L = N_i - V + 1 \tag{5.143}$$
where $V$ is the number of vertices of the graph. We can check it observing that the simplest tree level graph has one vertex and no internal line, so $V = 1$, $N_i = 0$ and eq. (5.143) correctly gives $L = 0$. Adding a second vertex and connecting it to the first with just one propagator still gives a tree level graph, and eq. (5.143) still correctly gives $L = 0$, since we have increased both $N_i$ and $V$ by one. Similarly we construct the most general tree level graph adding each time one vertex and one internal line. Instead, each time a new vertex is joined by two lines, we have added a loop and correctly eq. (5.143) shows that $L$ increases by one. Finally, if the theory is $\lambda\phi^n$, there are $n$ lines at each vertex, so
$$2N_i + N_e = nV \tag{5.144}$$
where $N_e$ is the number of external lines and the factor 2 reflects the fact that one internal line connects two vertices. Combining these expressions we find
$$D = (n-4)V + 4 - N_e\,. \tag{5.145}$$
If D 0 we expect that the diagram is divergent, unless some numerical in the leading term appears (D = 0 corresponds to 4 cancellation d k/k 4 , i.e. to a logarithmic divergence). If D < 0 the diagram is not necessarily convergent. In fact, the various integrations and propagators could be distributed in such a way that there is a divergent subgraph. However, these divergences are cured by the renormalization of the Green’s function with a smaller number of points, and do not bother us. Genuinely new divergences instead do not appear if D < 0. The condition for renormalizability therefore is that only a ﬁnite number of Green’s functions have D 0. Consider ﬁrst the case n = 4, i.e. λφ4 theory. Then eq. (5.145) gives D = 4 − Ne . Therefore the only genuinely divergent graphs are those with no external legs, i.e. the vacuumtovacuum amplitude which diverges as Λ4 (and that will be examined more closely in Section 5.7 in connection with the cosmological constant problem), the twopoint function that, as we have seen, diverges as Λ2 , and the fourpoint function that diverges as log Λ. After renormalizing these divergences, there is no other divergence to be cured, so the theory is renormalizable. Similarly, a theory λφ3 is renormalizable since (n − 4)V = −V gives a negative contribution to D, and therefore helps the convergence. If instead n > 4, for each Ne given there are graphs with a suﬃciently large number of vertices which have D 0. Therefore all Green’s functions, at a suﬃciently large order in perturbation theory, have genuinely new divergences, and the theory is not renormalizable. The criterion n 4 for λφn theory can be understood in a way that can be generalized to other theories. The ﬁeld φ has dimensions of mass, 4 2 since the action is dimensionless, the kinetic term is ∼ d x(∂φ) , and 4 n ∂ ∼ 1/length = mass. Requiring that d x λn φ is dimensionless, we see that the coupling λn has dimensions of (mass)4−n . 
Then the criterion $n \leq 4$ means that:
Terms in the Lagrangian whose coefficients either have a positive mass dimension or are dimensionless are renormalizable. Terms with negative mass dimension are not renormalizable.
In this form the criterion for renormalizability turns out to hold quite generally, and is not restricted to $\phi^n$ theories; the proofs however can be very complicated and depend on the details of the theory. The intuitive reason for this is however easily understood. If the coupling constant has for instance dimensions $1/M^2$ (as would be the case for a term $\phi^6$ in four space-time dimensions) each new vertex brings in a new factor $1/M^2$. For dimensional reasons, this must be compensated by some parameter with dimensions of mass squared. Barring cancellations, we therefore expect to find divergences with higher and higher powers of $\Lambda^2/M^2$.
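The power counting above can be tried out numerically. The following is a minimal sketch, assuming only eq. (5.145) and the statement that $\lambda_n$ has mass dimension $4 - n$; the function names are illustrative, and the graph-existence test (an even, non-negative number of legs left over for internal lines) is a simplification that does not check connectivity. It is only superficial counting: as the text stresses, $D \geq 0$ does not guarantee a divergence and $D < 0$ does not exclude divergent subgraphs.

```python
def superficial_degree(n: int, vertices: int, external_legs: int) -> int:
    """Superficial degree of divergence D = (n - 4) V + 4 - N_e, eq. (5.145)."""
    return (n - 4) * vertices + 4 - external_legs


def coupling_mass_dimension(n: int) -> int:
    """Mass dimension 4 - n of the coupling lambda_n in four dimensions."""
    return 4 - n


def divergent_leg_counts(n: int, max_vertices: int, max_legs: int) -> list[int]:
    """Values of N_e for which some graph with at most max_vertices has D >= 0.

    A graph needs n * V - N_e to be even and non-negative, since every
    internal line uses up two of the n * V legs supplied by the vertices.
    """
    found = []
    for legs in range(max_legs + 1):
        for vertices in range(1, max_vertices + 1):
            free_legs = n * vertices - legs
            if free_legs < 0 or free_legs % 2:
                continue  # no graph with this (V, N_e) exists
            if superficial_degree(n, vertices, legs) >= 0:
                found.append(legs)
                break
    return found


if __name__ == "__main__":
    for n in (3, 4, 6):
        print(n, coupling_mass_dimension(n), divergent_leg_counts(n, 10, 10))
```

For $n = 3$ and $n = 4$ the list of divergent Green's functions stays the same however large `max_vertices` is taken, while for $n = 6$ it keeps growing with the number of vertices allowed, which is the counting statement behind non-renormalizability.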
5.7
Vacuum energy and the cosmological constant problem
In Section 4.1.1 we found that the vacuum energy is divergent. It was our first example of a divergence in field theory, and we simply disposed of it by eliminating the infinity by hand, with the physical argument that only energy differences can be observed. This physical argument is however incorrect when we include gravity, since in general relativity any form of energy contributes to the gravitational interaction. To understand this point, we recall a few basic facts of general relativity and of cosmology (see e.g. Kolb and Turner (1990)). The energy density of the vacuum, $\rho_{\rm vac}$, is conveniently written as
$$\rho_{\rm vac} = \frac{\Lambda}{8\pi G_N}, \tag{5.146}$$
where $\Lambda$ is called the cosmological constant (in this section we reserve the symbol $\Lambda$ for the cosmological constant and we denote the cutoff by $\Lambda_{\rm cut}$). This normalization is chosen so that the Einstein equations of general relativity, in the presence of a cosmological constant, read$^{12}$
$$G_{\mu\nu} = 8\pi G_N T_{\mu\nu} + \Lambda g_{\mu\nu} \tag{5.147}$$
where $g_{\mu\nu}$ is the metric, $G_{\mu\nu}$ is the Einstein tensor, which contains up to two derivatives of the metric, and $T_{\mu\nu}$ is the energy–momentum tensor of matter. The Universe on very large scales is in a first approximation homogeneous, and can be described by the Friedmann–Robertson–Walker (FRW) metric,
$$ds^2 = dt^2 - R^2(t)\,(dx^2 + dy^2 + dz^2) \tag{5.148}$$
i.e. $g_{\mu\nu} = {\rm diag}(1, -R^2, -R^2, -R^2)$ in these coordinates; $R(t)$ is known as the scale factor (actually we have restricted for simplicity to a spatially flat Universe, otherwise the spatial part of the metric is more complicated). The expansion of the Universe is encoded in the fact that $R(t)$ is growing. The energy–momentum tensor of a fluid of matter is of the form $T^\mu{}_\nu = {\rm diag}(\rho, -p, -p, -p)$, where $\rho$ is the energy density and $p$ is the pressure. Then, using the above form of the metric, one can show that eq. (5.147) implies an equation for the acceleration $\ddot{R}$,
$$\frac{\ddot{R}}{R} = -\frac{4\pi G_N}{3}\,(\rho + 3p). \tag{5.149}$$
For ordinary matter $\rho$ and $p$ are positive, therefore $\rho + 3p > 0$ and $\ddot{R} < 0$: the effect of ordinary matter is to produce a deceleration of the expansion of the Universe. Instead, we see from eq. (5.147) that the effect of $\Lambda$ on the equation for $\ddot{R}$ is formally equivalent to that of a fluid with energy–momentum tensor $T^\mu{}_\nu = (\Lambda/8\pi G_N)\,\delta^\mu_\nu = \rho_{\rm vac}\,\delta^\mu_\nu$, and therefore to a fluid with $\rho = \rho_{\rm vac}$ and $p = -\rho_{\rm vac}$. This means that $\rho + 3p = -2\rho_{\rm vac} < 0$: a positive vacuum energy density contributes to accelerate, rather than decelerate, the Universe!
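The sign argument based on eq. (5.149) can be checked with a few lines of code. This is a small sketch in units where $G_N = 1$ by default; the function name is illustrative, and the radiation case $p = \rho/3$ is a standard equation of state that is assumed here rather than taken from the text.

```python
import math


def acceleration_over_scale_factor(rho: float, p: float, newton_g: float = 1.0) -> float:
    """Right-hand side of eq. (5.149): R''/R = -(4 pi G_N / 3) (rho + 3 p)."""
    return -(4.0 * math.pi * newton_g / 3.0) * (rho + 3.0 * p)


if __name__ == "__main__":
    rho = 1.0
    fluids = {
        "pressureless matter (p = 0)": 0.0,
        "radiation (p = rho/3, assumed)": rho / 3.0,
        "vacuum energy (p = -rho)": -rho,
    }
    for name, pressure in fluids.items():
        print(name, acceleration_over_scale_factor(rho, pressure))
```

Only the vacuum-energy entry comes out positive, in agreement with $\rho + 3p = -2\rho_{\rm vac} < 0$.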
12
The signs depend on a number of conventions on the metric signature, Riemann tensors, etc. Here we are following the conventions used for instance in Kolb and Turner (1990) and Landau and Lifshitz, vol. II (1979).
A detection of the vacuum energy is therefore in principle possible from cosmological measurements. Without entering into the details that are beyond the scope of this course, we just mention that the most important observations have been obtained from type Ia supernovae and from the fluctuations of the cosmic microwave background (CMB). The result is usually expressed introducing the quantity
$$\Omega_\Lambda = \frac{\rho_{\rm vac}}{\rho_c}, \tag{5.150}$$
where $\rho_c$ is the critical density for closing the Universe. The combination of supernovae and CMB measurements indicates a non-zero value $\Omega_\Lambda \simeq 0.7 \pm 0.1$. Using the known value of $\rho_c$, this means that
$$\rho_{\rm vac}^{1/4} \simeq 2 \times 10^{-3}\ {\rm eV}. \tag{5.151}$$
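The step from $\Omega_\Lambda$ to eq. (5.151) is a unit conversion, sketched below. The numerical inputs are assumptions that do not appear in the text: the critical density $\rho_c \simeq 1.054 \times 10^{-5}\,h^2$ GeV/cm$^3$, the reduced Hubble parameter $h = 0.7$, and $\hbar c \simeq 1.973 \times 10^{-5}$ eV cm, used to turn cm$^{-3}$ into eV$^3$ in units $\hbar = c = 1$.

```python
# Assumed inputs (not given in the text).
HBAR_C_EV_CM = 1.97327e-5                # hbar * c in eV * cm
RHO_CRIT_OVER_H2_GEV_CM3 = 1.05375e-5    # critical density / h^2 in GeV / cm^3


def critical_density_ev4(h: float) -> float:
    """Critical density in eV^4, in units hbar = c = 1."""
    rho_c_ev_per_cm3 = RHO_CRIT_OVER_H2_GEV_CM3 * h ** 2 * 1.0e9  # GeV -> eV
    return rho_c_ev_per_cm3 * HBAR_C_EV_CM ** 3                   # cm^-3 -> eV^3


def vacuum_energy_scale_ev(omega_lambda: float, h: float = 0.7) -> float:
    """rho_vac^(1/4) in eV, with rho_vac = Omega_Lambda * rho_c as in eq. (5.150)."""
    return (omega_lambda * critical_density_ev4(h)) ** 0.25


if __name__ == "__main__":
    for omega in (0.6, 0.7, 0.8):
        print(omega, vacuum_energy_scale_ev(omega))
```

Because of the fourth root, the result is insensitive to the uncertainty on $\Omega_\Lambda$: the whole range $0.6$ to $0.8$ gives values close to $2 \times 10^{-3}$ eV.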
This result stimulates us to look in more detail into the problem of the vacuum energy, using the language of renormalization that we have just developed. As we saw in Section 4.1.1, when we compute the vacuum energy density we find a result that diverges quartically with the cutoff,
$$\rho_{\rm vac} = c\,\Lambda_{\rm cut}^4, \tag{5.152}$$
with $c$ some numerical constant. According to the general rules of renormalization of QFT, we then say that the correct starting point is a Lagrangian which also has a bare vacuum energy density. So, for instance, for a $\lambda\phi^4$ theory we would generalize eq. (5.129) to
$$\mathcal{L} = \frac{1}{2}(\partial\phi_0)^2 - \frac{1}{2}m_0^2\phi_0^2 - \frac{\lambda_0}{4!}\phi_0^4 - \rho_0, \tag{5.153}$$
where the bare vacuum energy $\rho_0$ depends on the cutoff $\Lambda_{\rm cut}$, similarly to $m_0$, $\lambda_0$, $\phi_0$. Computing the vacuum energy density with this Lagrangian we now find
$$\rho_{\rm vac} = \rho_0(\Lambda_{\rm cut}) + c\,\Lambda_{\rm cut}^4. \tag{5.154}$$
The bare quantity $\rho_0$ is a parameter completely in our hands; we can choose it positive or negative, as we wish, and we fix it requiring that the physical energy density of the vacuum $\rho_{\rm vac}$ has the value determined by experiment. It is important to make clear two points:
• In field theory quantities like the cosmological constant, the elementary particle masses, the couplings (such as the fine structure coupling), etc., cannot be predicted. They are just fixed, by the renormalization procedure, to the experimentally observed values. Questions such as why the cosmological constant has a certain value, or why the electron mass is about 0.5 MeV, or why the fine structure constant is about 1/137, strictly speaking are ill defined in the framework of quantum field theory.$^{13}$
• The bare quantities, such as $\rho_0(\Lambda_{\rm cut})$, are objects which are useful in the intermediate steps of the calculations, but they have no physical meaning. They are just chosen so that they cancel the divergences and leave us with the desired renormalized quantity.
13 In string theory, at least as far as the masses and couplings are concerned, the question of why they have certain values becomes equivalent to why the extra dimensions have a certain geometry (compare with Exercise 3.6) or why certain fields acquire a given vacuum expectation value, so in this sense they become at least meaningful dynamical questions, which however presently we do not know how to answer. Instead, for the cosmological constant, even string theory presently does not shed any new light.
For these reasons, a possible attitude could be simply to take notice of the value (5.151) and observe that in QFT we can fix the renormalized vacuum energy density to this value. However, this value of the vacuum energy density is probably trying to tell us something very important that we do not yet understand. There are two main reasons for this.
(i) A fine tuning argument: even if strictly speaking in the framework of QFT we are not really allowed to ask why a renormalized quantity has a given value, still we can make the following observation. We know experimentally that, at least up to an energy scale of the order of one TeV, Nature is well described by a quantum field theory, the Standard Model. This means that in the Standard Model the cutoff $\Lambda_{\rm cut}$ can be taken to be at least 1 TeV ($= 10^{12}$ eV), and eq. (5.154) numerically reads something like
$$(2 \times 10^{-3}\ {\rm eV})^4 = \rho_0(\Lambda_{\rm cut}) + c\,(10^{12}\ {\rm eV})^4. \tag{5.155}$$
Therefore $\rho_0(\Lambda_{\rm cut})$ must be chosen so that it cancels something of order $10^{48}\ {\rm eV}^4$, leaving something of order $10^{-11}\ {\rm eV}^4$. Even if $\rho_0(\Lambda_{\rm cut})$ is a parameter that we can choose at will, and of no physical meaning, still this requires an incredible fine tuning, at the level of 60 decimal figures, and in this sense it seems very unnatural. Furthermore, this fine tuning does not show up for most other observables in QFT, and is really specific to the vacuum energy (and to the Higgs mass, see below). The point is that most quantities have logarithmic, rather than power-like, divergences. For instance, in QED at one loop the renormalized electron mass $m$ is related to the bare electron mass $m_0(\Lambda_{\rm cut})$ by
$$m = m_0\left[1 + \frac{3\alpha}{4\pi}\log\frac{\Lambda_{\rm cut}^2}{m_0^2}\right]. \tag{5.156}$$
The cutoff $\Lambda_{\rm cut}$ appears only inside the log, and in front of the log we have $\alpha$, which is small. Therefore, even if we are so bold as to push the cutoff to the Planck scale, $\Lambda_{\rm cut} \sim 10^{19}$ GeV, with $m_0 \sim$ MeV we have $(3\alpha/4\pi)\log(\Lambda_{\rm cut}^2/m_0^2) \sim 0.1$, so this is really a small correction and to reproduce a physical electron mass $m \simeq 0.5$ MeV we must indeed take a value $m_0$ of the same order of magnitude. Therefore here there is no fine tuning problem.$^{14}$
(ii) The really crucial point, however, is that the milli-eV scale indicated by the experimental results does not remind us of anything meaningful in particle physics (except possibly neutrino masses, but there is no reason why the mass of any specific particle should be related to $\rho_{\rm vac}$), so it is difficult to see how such a value could be derived from fundamental physics. And especially, why, of all possible energy scales, does this value turn out to be just comparable to the energy density needed for closing the Universe?
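The contrast drawn in point (i) between a quartic and a logarithmic cutoff dependence can be made quantitative. The sketch below assumes $c = 1$ in eq. (5.155) and $\alpha = 1/137.036$ in eq. (5.156); the function names are illustrative. Exact decimal arithmetic is used for $\rho_0$ because ordinary double-precision floats carry only about 16 significant figures and cannot represent the cancellation at all.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 80  # enough digits to resolve 1e48 against 1e-11


def tuning_digits(observed_scale_ev: float, cutoff_ev: float, c: float = 1.0) -> float:
    """Number of decimal figures over which rho_0 must cancel c * cutoff^4."""
    return -math.log10((observed_scale_ev / cutoff_ev) ** 4 / c)


def bare_vacuum_energy(observed_scale_ev: str, cutoff_ev: str) -> Decimal:
    """rho_0 from eq. (5.155) with c = 1 (assumed), in eV^4."""
    return Decimal(observed_scale_ev) ** 4 - Decimal(cutoff_ev) ** 4


def qed_log_correction(cutoff_gev: float, bare_mass_gev: float,
                       alpha: float = 1.0 / 137.036) -> float:
    """Relative one-loop shift (3 alpha / 4 pi) log(cutoff^2 / m_0^2), eq. (5.156)."""
    return 3.0 * alpha / (4.0 * math.pi) * math.log(cutoff_gev ** 2 / bare_mass_gev ** 2)


if __name__ == "__main__":
    print(tuning_digits(2e-3, 1e12))           # vacuum energy, cutoff at 1 TeV
    print(bare_vacuum_energy("2e-3", "1e12"))  # -9.99...984E+47
    print(qed_log_correction(1e19, 1e-3))      # electron mass, cutoff at the Planck scale
```

With the cutoff at 1 TeV the vacuum energy needs a cancellation over almost 59 decimal figures, whereas the electron mass correction stays a fraction of order 0.1 to 0.2 even with the cutoff at the Planck scale.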
14 The only other important situation where one is confronted with a fine tuning problem similar to that of the vacuum energy is when we consider the renormalization of the mass of a scalar field. As we have seen in eqs. (5.131) and (5.134), the divergence in the mass of a scalar field is quadratic, rather than logarithmic. If we say that the cutoff is given by the Planck scale $M_{\rm Pl} = 10^{19}$ GeV, this poses a fine tuning problem for the Higgs field, which is a scalar field predicted by the Standard Model, and is expected to have a mass around a few hundred GeV. In fact in this case the bare mass $m_0$ should satisfy something like $(10^2\ {\rm GeV})^2 = m_0^2 + (10^{19}\ {\rm GeV})^2$. However, the problem here is different from the cosmological constant problem, since it really depends on the form of the vacuum fluctuations from the TeV scale up to the Planck scale, about which we know nothing experimentally. In particular, in supersymmetric extensions of the Standard Model this fine tuning problem disappears because above a few TeV the contribution of the vacuum energy due to bosons is canceled by the contribution of fermions.
5.8
The modern point of view on renormalizability
The fact that quantum field theory is plagued by divergences was already realized around 1929–1930 by Heisenberg and Pauli and, quite understandably, it was considered a major problem. It was then realized (by Dyson in 1949, building on work of Feynman, Schwinger, Tomonaga, and others) that in some theories these divergences can be reabsorbed into the redefinition of a finite number of parameters. These theories were then called renormalizable, and considered "honest" theories, while non-renormalizable theories were considered intrinsically sick.
The modern point of view (largely stimulated by the work of K. Wilson) is quite different. First of all, it is important to realize that the problem with non-renormalizable theories is not mathematical consistency, but rather predictivity. As we have seen in Section 5.6, we can reabsorb the infinities of a non-renormalizable theory, but the price is that any $N$-point amplitude $A_N(p_1, \ldots, p_N)$, at a sufficiently large order $n$ in perturbation theory, will develop divergences which are not automatically cured by the renormalization of amplitudes with fewer than $N$ external legs. To cure them we must therefore introduce a new term, and a new coupling, in the Lagrangian, and then fix this new coupling comparing the contribution to $A_N$ at perturbative order $n$ with the experimental value.
The fact that in principle we have to fix an infinite number of quantities by comparison with experiment is however only an apparent disaster. The point is that, as we have seen, in a typical non-renormalizable theory the coupling is not dimensionless, but rather has the dimensions of inverse powers of mass. We suppose for definiteness that the coupling $\lambda$ is dimensionally the inverse of a mass squared, so that it can be written as $1/M^2$, for some mass-scale $M$; we also assume for simplicity that we have just one typical momentum (or energy) scale, and we denote it by $E$.
Then the renormalized perturbative expansion of an $N$-point amplitude $A_N$ up to order $(\lambda^2)^n$ reads
$$A_N(E) = A_N^0(E)\left[1 + c_1\frac{E^2}{M^2} + c_2\frac{E^4}{M^4} + \ldots + c_n\frac{E^{2n}}{M^{2n}}\right]. \tag{5.157}$$
The quantities $c_1, \ldots, c_{n-1}$ are finite and calculable, once we have renormalized the amplitudes with fewer than $N$ points. Because of the genuinely new divergence at some order $n$, the coefficient $c_n$ must instead be fixed by comparison with experiment, and this is the source of the loss of predictivity. However, we can now realize that at low energy, $E \ll M$, the lack of predictivity on $c_n$ becomes completely irrelevant, because anyhow $c_n$ is multiplied by the very small quantity $(E/M)^{2n}$. Therefore, if we want to make computations with a given accuracy, we can just renormalize a corresponding finite number of amplitudes and, as long as $E \ll M$, our ignorance about higher-order divergences is beyond the desired accuracy.
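As a rough illustration of the last statement, one can ask at which order the terms of eq. (5.157) drop below a chosen accuracy. The sketch below assumes, purely for illustration, that all coefficients $c_k$ are of order one; the function name and the sample numbers are not from the text.

```python
def first_negligible_order(energy: float, heavy_scale: float, accuracy: float) -> int:
    """Smallest n such that (E/M)^(2n) < accuracy, assuming c_n of order one.

    Terms of order n and higher can then be dropped, so only the
    coefficients c_1, ..., c_(n-1) are needed at that accuracy.
    """
    x = (energy / heavy_scale) ** 2
    if not 0.0 < x < 1.0:
        raise ValueError("the expansion in E/M is only meaningful for 0 < E < M")
    n = 1
    while x ** n >= accuracy:
        n += 1
    return n


if __name__ == "__main__":
    # Same target accuracy, two different ratios E/M.
    print(first_negligible_order(10.0, 100.0, 1e-5))
    print(first_negligible_order(50.0, 100.0, 1e-5))
```

The closer $E$ gets to $M$, the more coefficients have to be fixed for the same accuracy, and for $E \geq M$ the function refuses to answer, mirroring the breakdown of the expansion.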
This means that:
Non-renormalizable theories are perfectly acceptable low-energy theories.
At $E \sim M$ the expansion (5.157) blows up, signaling that the theory is no longer meaningful, and a more complete theory must take its place. We therefore see that non-renormalizable theories have a built-in scale $M$ that provides their limit of validity. Renormalizable theories, in contrast, have no built-in mass-scale $M$ which tells us from the beginning that we cannot trust them at $E > M$, and in principle they are mathematically consistent and predictive at any energy scale.
The distinction between renormalizable and non-renormalizable theories, however, can be more mathematical than physical. For instance QED is a renormalizable theory, and so we could naively hope that it correctly describes Nature up to arbitrary energy scales. But in fact, experimentally we know that at a mass-scale $M \sim 100$ GeV QED merges with weak interactions in a larger theory, the Standard Model, and so pure QED is in any case a low-energy approximation to a more complete theory. In other words, renormalizability is related to the behavior of the theory at infinitely large energies or, equivalently, at infinitesimally small distances. As such, it reflects rather formal mathematical properties of the field theory, since we can never test a theory up to infinitely large energies, not only practically, but even in principle, since quantum gravity must come into play at the latest at the Planck scale, $M_{\rm Pl} \sim 10^{19}$ GeV, i.e. at distances $l \sim 10^{-33}$ cm.
Rather than focus on the concept of renormalizability, the modern approach focuses on the concept of effective field theory. The really crucial point is that if we want to compute, with a given precision, processes that take place at a given length-scale $l$ (or, equivalently, at a corresponding energy scale $E$), we do not need to know the full theory at infinitely small distances or infinitely high energies.
To study what happens in an atom, at $l \sim 10^{-8}$ cm, we do not need to know what happens at the scale $l \sim 10^{-17}$ cm, typical of weak interactions, except if we are looking for extremely fine effects, and certainly we do not need to know what happens at $l \sim 10^{-33}$ cm where quantum gravity becomes important. Once we have fixed the scale $l$ which we are interested in and the level of precision that we want to get from our computations, all we need to know is the effective theory down to a length-scale $l_*$ a few orders of magnitude smaller than $l$. How many orders of magnitude depends, of course, on the level of precision at which we aim.
In Chapter 8 we will discuss the Fermi theory of weak interactions, which is the low-energy limit of the electroweak theory. We will see that it is an extremely useful low-energy theory, despite the fact that it is not renormalizable. A very important example of a non-renormalizable theory is the theory obtained quantizing general relativity. The coupling constant in this case is the Newton constant which dimensionally, in our units $\hbar = c = 1$, is the inverse of mass squared, $G_N = 1/M_{\rm Pl}^2$. Therefore quantum gravity is not renormalizable. However, the loss of predictivity comes into play only in processes that probe space-time down to distances of order of the Planck length $l_{\rm Pl} = 1/M_{\rm Pl} \simeq 10^{-33}$ cm. Such small scales will never be reached in accelerator physics, and probably our best hopes to get information on such small length-scales, or correspondingly high energies, come from some relics of the Big Bang. In any case, in "normal" conditions, classical general relativity gives a completely adequate description, and the lack of renormalizability of the quantum theory is irrelevant.
In Section 9.5 we will come back to the effective Lagrangian approach, and we will see that it is rooted in the concept of universality for critical phenomena.
5.9
The running of coupling constants
A surprising effect of the renormalization procedure is that, after renormalization, the coupling "constants" are not constant at all, but they depend on the energy. We have already seen this in eq. (5.141), where we computed the $2 \to 2$ scattering amplitude when the center-of-mass energy squared is $q^2$. This is nothing but the effective coupling constant at $E^2 = q^2$, and we see that it is different from the value $\lambda_R$ at $E = 0$. We found in eq. (5.141) that at energies $E \gg m_R$ the amplitude can be written as
$$i\mathcal{M}_{2\to 2}(q^2) = -i\lambda_{\rm eff}(E), \tag{5.158}$$
where the effective one-loop coupling constant $\lambda_{\rm eff}(E)$ is given by
$$\lambda_{\rm eff}(E) = \lambda_R + \lambda_R^2\,\beta_0\log\frac{E}{m_R} + O(\lambda_R^3). \tag{5.159}$$
The same formula holds in the limit E mR , as one can see from explicit calculation. We see that the sign of the coeﬃcient β0 plays an especially important role. If β0 > 0 the coupling increases in the UV (until it becomes large and we cannot trust the perturbative expression). If instead β0 < 0 the coupling becomes smaller when we go to higher energies. Theories where the coupling becomes small in the UV are called asymptotically free. Asymptotic freedom means that at large energies (and therefore at short distances) the ﬁelds which appear in the Lagrangian can be treated perturbatively. The other side of the coin, however, is that in an asymptotically free theory at low energies (i.e. large distances) the coupling becomes strong, and the perturbative treatment is inadequate. In this regime, the degrees of freedom that one actually observes are not described by the ﬁelds that enter in the Lagrangian, but are rather composite objects built from these more fundamental degrees of freedom, and bound by the strong interaction. The most important example of this situation is QCD. As we will see in Chapter 10, in QCD the basic ﬁelds which enter the Lagrangian are
quarks and gluons, and at distances $l \ll 1$ fm their interaction can be treated perturbatively. At $l \sim 1$ fm, however, the interaction becomes strong and quarks and gluons do not even appear as free particles, but rather they are confined into hadrons.
In this section we study the dependence of the coupling constants on the energy (the "running of the coupling constants" as it is usually called) in more detail. The rest of this section is more advanced and can be skipped at a first reading.
First of all, let us be more general: rather than focusing on the four-point function, we consider a generic $n$-point function. A general renormalized $n$-point function $\Gamma_R$ depends on the external momenta $p_i$ (or simply on just one invariant $q^2$ if we choose a simpler kinematical situation, rather than the most general), on the renormalized coupling $g_R$ (for a general theory, not necessarily of a scalar field, we use the notation $g$ rather than $\lambda$ for the coupling; we also assume for simplicity that there is just one coupling, but the generalization is straightforward), and on the scale $\mu$ used to define the renormalization procedure,
$$\Gamma_R = \Gamma_R(p_i; g_R, \mu). \tag{5.160}$$
In the previous section we renormalized the theory choosing $\mu = m_R$: we first defined $m_R$ as the position of the pole in the propagator; we then defined the renormalized fields requiring that their two-point function has residue $+i$ at the pole $p^2 = m_R^2$. We finally fixed the finite value of the four-point function at zero momentum, i.e. when the square of the center-of-mass energy $s$ was equal to $4m_R^2$. So $m_R$ was always our mass-scale used to fix the finite values of the renormalized quantities. However we can be more general, and use a generic mass-scale $\mu$ to define the renormalization procedure. For instance, we can decide to fix the value of the renormalized constant looking at the four-point amplitude at a value $q^2 = \mu^2 \gg m_R^2$, or even at a space-like value $q^2 = -\mu^2$.
It is quite useful to keep $\mu$ generic and see what are the consequences of a change in $\mu$. The relation between $\Gamma_R$ and the bare $n$-point function $\Gamma_0$ is$^{15}$
$$\Gamma_R(p_i; g_R, \mu) = Z^{-n/2}\!\left(g_0(\Lambda), \frac{\Lambda}{\mu}\right)\Gamma_0(p_i; g_0(\Lambda), \Lambda). \tag{5.161}$$
The important point is that $\Gamma_R$, by definition, does not depend on the cutoff $\Lambda$. We have adjusted the bare coupling, bare mass and field renormalization just in such a way that all renormalized Green's functions are finite. In the following we will be interested in the situation where the typical energy scales are much bigger than the masses, and we will neglect all mass dependences. $\Gamma_0$ instead has been computed in the regularized theory, using the bare coupling and the bare masses, and therefore depends on the cutoff $\Lambda$ both explicitly and, implicitly, through the bare coupling $g_0(\Lambda)$ and the bare masses, while of course it is independent of $\mu$, because the scale $\mu$ enters only afterwards, when we fix the value of the renormalized Green's function with a renormalization prescription at momenta given by $\mu$, e.g. at $q^2 = \mu^2$ for the four-point function.
The factor $Z^{-n/2}$ is the contribution from the wave function renormalization of the $n$ fields. As we have seen, it is obtained computing first the two-point amplitude in the bare theory. The result of this first step is therefore a function of $\Lambda$, of $g_0(\Lambda)$ and of $p^2$; then $Z$ is defined fixing the value of the numerator of the two-point function at a given value of $p^2 = \mu^2$. For instance, in Section 5.6 we fixed it choosing $\mu = m_R$ and requiring that the residue of the pole of the propagator at $p^2 = m_R^2$ is $+i$. So, in general, $Z$ is a function of $\Lambda$, $g_0(\Lambda)$ and $\mu$ (and not of $p^2$), see e.g. eq. (5.137). Since $Z$ is a dimensionless quantity, it can only depend on $\Lambda$ and $\mu$ through their ratio $\Lambda/\mu$, in the high-energy limit in which all masses can be neglected. Since $\Gamma_R$ is independent of $\Lambda$, we can write
$$\Lambda\frac{d\Gamma_R}{d\Lambda} = 0 \tag{5.162}$$
and using eq. (5.161) we obtain
$$\left[\Lambda\frac{\partial}{\partial\Lambda} + \beta(g_0)\frac{\partial}{\partial g_0} - n\,\eta(g_0)\right]\Gamma_0(p_i; g_0, \Lambda) = 0, \tag{5.163}$$
where
$$\beta(g_0) \equiv \Lambda\frac{dg_0}{d\Lambda}, \tag{5.164}$$
$$\eta(g_0) \equiv \frac{1}{2}\Lambda\frac{d}{d\Lambda}\log Z. \tag{5.165}$$
15 We assume for simplicity that in the theory there is only one type of field. Otherwise, the wave function renormalization factors depend on the field, and in eq. (5.161) there will be a factor $Z^{-1/2}$ for each field.
16 In principle, $\eta$ depends also on $\Lambda/\mu$. However $Z$ depends on $\Lambda/\mu$ through terms $\sim \log\Lambda/\mu$, which after taking the derivative with respect to $\log\Lambda$ are independent of $\mu$. There are also subleading terms $\log\log\Lambda/\mu$ in $Z$; after taking the derivative these become $1/(\log\Lambda/\mu)$ and disappear in the limit $\Lambda/\mu \to \infty$, leaving a finite function $\eta(g_0)$.
Equation (5.163) is called a renormalization group equation, and eqs. (5.164) and (5.165) define the beta function and the eta function of the theory.$^{16}$ Equations (5.163)–(5.165) can be solved by the method of characteristics. We introduce a dilatation parameter $u$ and the solution is given by
$$\Gamma_0\!\left(p_i; g_0, \frac{\Lambda}{u}\right) = Z_{\rm eff}^{-n/2}(u)\,\Gamma_0(p_i; g_{\rm eff}(u), \Lambda) \tag{5.166}$$
where $g_{\rm eff}(u)$ is defined as the solution of the equation
$$u\frac{dg_{\rm eff}}{du} = \beta(g_{\rm eff}(u)) \tag{5.167}$$
with the initial condition $g_{\rm eff}(1) = g_0$, and $Z_{\rm eff}(u)$ is defined as the solution of
$$\frac{1}{2}u\frac{d}{du}\log Z_{\rm eff} = \eta(g_{\rm eff}(u)) \tag{5.168}$$
with the initial condition $Z_{\rm eff}(1) = 1$. We see that $g_{\rm eff}$ plays the role of an effective bare coupling constant, and a change in the cutoff is equivalent to a change in $g_{\rm eff}$ and in $Z_{\rm eff}$. To study what happens as we remove the cutoff we must take the limit $u \to 0$. Equation (5.167) can be written in the integrated form
$$\int_{g_0}^{g_{\rm eff}(u)}\frac{dg'}{\beta(g')} = \log u. \tag{5.169}$$
We see that in the limit $u \to 0$ the integral on the left-hand side must diverge, and this is possible only if, as $u \to 0$, $g_{\rm eff}(u)$ approaches a zero of the beta function. In general the renormalization of the coupling has the form
$$g_R = g_0 - \beta_0 g_0^2\log\Lambda + O(g_0^3). \tag{5.170}$$
The dependence of $g_0$ on the cutoff is obtained inverting the above relation,
$$g_0 = g_R + \beta_0 g_R^2\log\Lambda + O(g_R^3) \tag{5.171}$$
and therefore
$$\beta(g_0) \equiv \frac{dg_0}{d\log\Lambda} = \beta_0 g_0^2 + O(g_0^3). \tag{5.172}$$
This shows that there is always a zero of the beta function at $g_0 = 0$, and it is possible to remove the cutoff while at the same time sending $g_0(\Lambda) \to 0$. In other words, given a regularized theory, with a cutoff in momentum space or, for example, on a space-time lattice (which is another possible UV regulator) we find the limit $\Lambda \to \infty$ (or the continuum limit in the case of a lattice) tuning the bare couplings toward a zero of the beta function. This is a way to see things that has very important applications to statistical mechanics and critical phenomena, as well as in lattice gauge theory, and we will come back to it in Section 9.5.
There is another way to extract information from eq. (5.161), which is more useful from the point of view of particle physics. We rather write the equation as $\Gamma_0 = Z^{n/2}\Gamma_R$ and we use the fact that $\Gamma_0$ is independent of the renormalization point $\mu$. Instead $\Gamma_R$ depends on $\mu$ explicitly, and also through the renormalized mass and coupling. Let us again neglect all mass terms. Then we write
$$0 = \mu\frac{d\Gamma_0}{d\mu} = \left[\mu\frac{\partial}{\partial\mu} + \beta(g_R)\frac{\partial}{\partial g_R} + n\,\gamma(g_R)\right]\Gamma_R(p_i; g_R, \mu), \tag{5.173}$$
where now
$$\beta(g_R) = \mu\frac{dg_R}{d\mu} \tag{5.174}$$
and
$$\gamma(g_R) = \frac{1}{2}\mu\frac{d}{d\mu}\log Z. \tag{5.175}$$
Equation (5.173) is the Callan–Symanzik equation. This equation is formally very similar to the one previously studied, but now we have the renormalized coupling $g_R$ rather than the bare coupling $g_0$. It tells us how $\Gamma_R$ changes if we change the renormalization point $\mu$. The reason why this equation is very useful is that, in the high-energy limit where all masses can be neglected, using dimensional arguments the dependence on $\mu$ can be translated into a dependence on the energy. In fact, if $d_\Gamma$ is the mass dimension of $\Gamma_R$, then $\Gamma_R(p_i; g_R, \mu)$ for dimensional reasons must have the form
$$\Gamma_R(p_i; g_R, \mu) = \mu^{d_\Gamma}F\!\left(g_R, \frac{p_i}{\mu}\right) \tag{5.176}$$
with $F$ a dimensionless function. Using again the method of characteristics, eq. (5.173) implies that
$$\Gamma_R\!\left(p_i; g_R, \frac{\mu}{u}\right) = Z_{\rm eff}^{-n/2}(u)\,\Gamma_R(p_i; g_{\rm eff}(u), \mu) \tag{5.177}$$
where now $g_{\rm eff}(u)$ is defined as the solution of the equation
$$u\frac{dg_{\rm eff}}{du} = \beta(g_{\rm eff}(u)) \tag{5.178}$$
with the initial condition $g_{\rm eff}(1) = g_R$, where $g_R$ is the value of the renormalized coupling constant at the reference scale $\mu$. Similarly, $Z_{\rm eff}(u)$ is defined as the solution of
$$\frac{1}{2}u\frac{d}{du}\log Z_{\rm eff} = -\gamma(g_{\rm eff}(u)) \tag{5.179}$$
with the initial condition $Z_{\rm eff}(1) = 1$. Using eq. (5.176) we see that
$$\Gamma_R\!\left(p_i; g_R, \frac{\mu}{u}\right) = \frac{\mu^{d_\Gamma}}{u^{d_\Gamma}}F\!\left(g_R, \frac{u p_i}{\mu}\right) = \frac{1}{u^{d_\Gamma}}\Gamma_R(u p_i; g_R, \mu). \tag{5.180}$$
Combining eqs. (5.177) and (5.180) we find
$$\Gamma_R(u p_i; g_R, \mu) = u^{d_\Gamma}Z_{\rm eff}^{-n/2}(u)\,\Gamma_R(p_i; g_{\rm eff}(u), \mu). \tag{5.181}$$
Writing eq. (5.179) in the integrated form, we can rewrite the above expression as
$$\Gamma_R(u p_i; g_R, \mu) = u^{d_\Gamma}\exp\left\{n\int_0^{\log u}\gamma(g_{\rm eff}(u'))\,d\log u'\right\}\Gamma_R(p_i; g_{\rm eff}(u), \mu). \tag{5.182}$$
We see that the rescaling of energies (in the limit when masses can be neglected) is summarized by two effects. First of all, naive dimensional analysis does not work anymore. Instead of a simple overall factor $u^{d_\Gamma}$ we also get a modification determined by the $\gamma$ function; $\gamma$ is then called the anomalous dimension. Its origin is in the divergences of field theory, which force us to introduce a new mass-scale (the cutoff, which is sent to infinity and is replaced by the renormalization point $\mu$), and which spoil naive dimensional analysis. Second, the coupling $g_R$ at the scale $\mu$ is replaced by $g_{\rm eff}(u)$, or $g_{\rm eff}(E)$ with $E = u\mu$, which is called the running coupling constant. Therefore $g_{\rm eff}(E)$ plays the role of an effective renormalized coupling constant.
We see that the beta function $\beta(g)$ contains important information. In particular, the sign of the beta function near $g = 0$ is crucial, as we will see below and in Section 9.5. In the case of $\lambda\phi^4$ theory the explicit one-loop computation is very simple, since there is no wave function renormalization at one loop, and the result comes just from the graph of Fig. 5.9. We computed it in Section 5.5.2, where we found
$$\lambda_R = \lambda_0 - \frac{3}{16\pi^2}\lambda_0^2\log\Lambda. \tag{5.183}$$
Therefore the one-loop beta function is
$$\beta(\lambda_0) = \beta_0\lambda_0^2 + O(\lambda_0^3), \qquad \beta_0 = \frac{3}{16\pi^2}. \tag{5.184}$$
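Limiting ourselves to one loop, the explicit integration of
$$E\frac{d\lambda_{\rm eff}}{dE} = \beta_0\lambda_{\rm eff}^2, \tag{5.185}$$
with the initial condition $\lambda_{\rm eff}(E = \mu) = \lambda_*$, gives
$$\lambda_{\rm eff}(E) = \frac{\lambda_*}{1 - \beta_0\lambda_*\log(E/\mu)}. \tag{5.186}$$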
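The integration leading to eq. (5.186) can be cross-checked numerically. The following sketch compares the closed form with a fourth-order Runge–Kutta integration of eq. (5.185) in the variable $t = \log(E/\mu)$; the function names, the step count and the sample values of $\lambda_*$ and $E/\mu$ are illustrative choices, and $\beta_0 = 3/16\pi^2$ is taken from eq. (5.184).

```python
import math

BETA0_PHI4 = 3.0 / (16.0 * math.pi ** 2)  # eq. (5.184)


def lambda_eff(energy: float, mu: float, lam_star: float,
               beta0: float = BETA0_PHI4) -> float:
    """Closed-form one-loop running coupling, eq. (5.186)."""
    denominator = 1.0 - beta0 * lam_star * math.log(energy / mu)
    if denominator <= 0.0:
        raise ValueError("eq. (5.186) is not defined at or beyond this energy")
    return lam_star / denominator


def lambda_eff_rk4(energy: float, mu: float, lam_star: float,
                   beta0: float = BETA0_PHI4, steps: int = 1000) -> float:
    """Integrate d(lambda)/dt = beta0 * lambda^2, with t = log(E/mu), by RK4."""
    h = math.log(energy / mu) / steps
    lam = lam_star
    for _ in range(steps):
        k1 = beta0 * lam ** 2
        k2 = beta0 * (lam + 0.5 * h * k1) ** 2
        k3 = beta0 * (lam + 0.5 * h * k2) ** 2
        k4 = beta0 * (lam + h * k3) ** 2
        lam += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return lam


if __name__ == "__main__":
    print(lambda_eff(1e3, 1.0, 0.5), lambda_eff_rk4(1e3, 1.0, 0.5))
    # A negative beta0 (chosen by hand here) makes the coupling decrease with energy.
    print(lambda_eff(1e3, 1.0, 0.5, beta0=-0.05))
```

The two evaluations agree to high accuracy, and flipping the sign of `beta0` switches between a coupling that grows and one that shrinks with energy.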
Comparing with eq. (5.159) we see that the renormalization group (RG) analysis has provided a resummation of a whole class of logarithmic terms. Expanding eq. (5.186) in powers of $\lambda_*$ we get
$$\lambda_{\rm eff}(E) = \lambda_*\left[1 + \sum_{n=1}^{\infty}c_n(E)\lambda_*^n\right] \tag{5.187}$$
with
$$c_n(E) = \left(\beta_0\log\frac{E}{\mu}\right)^n. \tag{5.188}$$
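The statement that eq. (5.186) resums the series (5.187) is just the geometric series, and it can be verified directly. In this sketch the truncation order and the sample values are illustrative, and the comparison is meaningful only where $|\beta_0\lambda_*\log(E/\mu)| < 1$, so that the series converges.

```python
import math


def lambda_eff_closed(energy: float, mu: float, lam_star: float, beta0: float) -> float:
    """Resummed one-loop coupling, eq. (5.186)."""
    return lam_star / (1.0 - beta0 * lam_star * math.log(energy / mu))


def lambda_eff_series(energy: float, mu: float, lam_star: float,
                      beta0: float, order: int) -> float:
    """Expansion (5.187) truncated at n = order, with c_n from eq. (5.188)."""
    log_term = beta0 * math.log(energy / mu)
    total = 1.0
    for n in range(1, order + 1):
        total += (log_term ** n) * (lam_star ** n)  # c_n(E) * lam_star^n
    return lam_star * total


if __name__ == "__main__":
    beta0 = 3.0 / (16.0 * math.pi ** 2)
    for order in (1, 2, 5, 40):
        print(order, lambda_eff_series(1e3, 1.0, 0.5, beta0, order))
    print("closed form", lambda_eff_closed(1e3, 1.0, 0.5, beta0))
```

Truncating at `order = 1` reproduces the structure of eq. (5.159), with $\lambda_*$ and $\mu$ in place of $\lambda_R$ and $m_R$, while higher orders approach the closed form.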
We might ask what we have really gained, since we used only the one-loop beta function, so one might think that eq. (5.187) is not justified beyond the term $n = 1$ that we already knew. However, we will see in Exercise 5.4 that the effect of higher-loop corrections to the beta function is to produce additional terms proportional to $\log\log E$ in the denominator of eq. (5.186), see eq. (5.198). Including these corrections, the coefficients $c_n$ are modified from the value given in eq. (5.188) to a value of the form
$$c_n(E) = \left[\beta_0\log\frac{E}{\mu} + O(\log\log E)\right]^n. \tag{5.189}$$
Then the term in $c_n$ proportional to $(\log E)^n$ is not affected by higher-order corrections to the beta function, while, at each order $n$, in $c_n$ there are also terms of order $(\log E)^{n-1}$ and smaller, that are missed using only the one-loop beta function. Therefore the resummation (5.186) is useful when $\log(E/\mu) \gg 1$, since in this case at each order $n$ we have picked the term which dominates at high energies. In turn, this means that eq. (5.186) is really useful only when $\beta_0 < 0$, since in this case when $\log(E/\mu) \gg 1$ we have $\lambda_{\rm eff}(E) \ll 1$ and perturbation theory is consistent. Equation (5.186) is called the leading logarithms approximation.
However, in the case of $\lambda\phi^4$ we have $\beta_0 > 0$ and therefore the running coupling increases in the UV. Formally, eq. (5.186) would even predict that $\lambda_{\rm eff}(E)$ diverges at
$$E = \mu\exp\left\{\frac{1}{\beta_0\lambda_*}\right\}. \tag{5.190}$$
Of course this result should not be taken literally, because as soon as $\lambda_{\rm eff}(E)$ becomes of order one the whole perturbative expansion, even if improved by RG, blows up and cannot be trusted anymore. The correct conclusion, instead, is that, even if we started with $\lambda_* \ll 1$, there is a critical energy at which the theory enters a strong coupling regime.
Besides our toy model $\lambda\phi^4$, it turns out, more importantly, that $\beta_0 > 0$ in QED. This means that in QED the fine structure constant increases with energy, and formally there even exists an energy scale where it becomes strong. The energy analogous to eq. (5.190), where formally the one-loop running coupling diverges, is called the Landau pole. However, the running is very slow, since it is logarithmic, and long before the theory enters the strong coupling regime, we arrive at the electroweak scale, where QED in isolation is no longer the correct theory. From this point on, the evolution of $\alpha$ must be studied in the context of the Standard Model and possibly of its high-energy extensions. From its low-energy value $\alpha = 1/137.035\,999\,11(46)$, the fine structure constant at the $Z^0$ mass $M_Z = 91.1876 \pm 0.0021$ GeV grows only to the value$^{17}$
$$\alpha(M_Z) = \frac{1}{127.918 \pm 0.018}. \tag{5.191}$$
If instead $\beta_0 < 0$, we see from eq. (5.186) that the running coupling becomes smaller and smaller at high energies (and therefore the perturbative result is more and more accurate). As we mentioned at the beginning of this section, this property is known as asymptotic freedom and is one of the most important features of quantum chromodynamics, the theory of strong interactions. The running of the strong coupling, which is denoted by $\alpha_s$, is well verified experimentally, see Fig. 5.19, taken from S. Eidelman et al., Phys. Lett. B592, 1 (2004).
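The two behaviours discussed above can be illustrated numerically. The following is a minimal sketch, assuming that eq. (5.186) has the standard one-loop form $\lambda_{\rm eff}(E) = \lambda_*/[1 - \beta_0\lambda_*\log(E/\mu)]$, which is the form consistent with eq. (5.190). The function names and the numerical values of the coupling and of $\beta_0$ are hypothetical and chosen only for illustration.

```python
import math


def running_coupling(lam_ref, beta0, energy, mu):
    """One-loop (leading-log) coupling at `energy`, given lam_ref at the scale `mu`.

    Assumes the form lam_ref / (1 - beta0 * lam_ref * log(energy / mu)).
    """
    denom = 1.0 - beta0 * lam_ref * math.log(energy / mu)
    if denom <= 0.0:
        # At or beyond the formal pole the one-loop formula is meaningless.
        raise ValueError("one-loop formula breaks down at this energy")
    return lam_ref / denom


def landau_pole(lam_ref, beta0, mu):
    """Energy at which the one-loop coupling formally diverges, cf. eq. (5.190)."""
    if beta0 <= 0.0:
        raise ValueError("a formal pole in the UV exists only for beta0 > 0")
    return mu * math.exp(1.0 / (beta0 * lam_ref))


if __name__ == "__main__":
    lam_ref, mu = 0.1, 1.0            # illustrative values, arbitrary units
    for beta0 in (0.05, -0.05):       # hypothetical coefficients of either sign
        for energy in (1.0, 1e3, 1e6):
            print(beta0, energy, running_coupling(lam_ref, beta0, energy, mu))
    print("formal pole at E/mu =", landau_pole(lam_ref, 0.05, mu))
```

For positive `beta0` the printed coupling grows slowly with energy, and the pole sits at an enormous scale when the reference coupling is small; for negative `beta0` the coupling decreases, which is the asymptotically free case.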
17 To be precise, this is the value of the fine structure constant renormalized in the so-called $\overline{\rm MS}$ scheme. We do not enter into the details of how this is defined. See e.g. Peskin and Schroeder (1995), page 377.
Fig. 5.19 Summary of the values of $\alpha_s(\mu)$ at the values of µ where they are measured. The lines show the central values and the ±1σ limits of the average. The figure clearly shows the decrease in $\alpha_s(\mu)$ with increasing µ. The data are, in increasing order of µ: τ width, Υ decays, deep inelastic scattering, e+e− event rate at 22 GeV from the JADE data, shapes at TRISTAN at 58 GeV, Z width, and e+e− event shapes at 135 GeV and 189 GeV. From S. Eidelman et al., Phys. Lett. B592, 1 (2004). [Plot: $\alpha_s(\mu)$, from 0 to 0.3, versus µ in GeV on a logarithmic axis from 1 to 10^2.]
Summary of chapter

• The aim of this chapter was to set up the formalism for computing the transition amplitudes between initial and final states, with an arbitrary number of incoming and outgoing particles, i.e. to compute the S-matrix elements (5.16).

• The first step is the LSZ formula (5.46), which expresses the S-matrix elements in terms of the vacuum expectation value of a T-product of fields. In eq. (5.46), φ is the Heisenberg operator that evolves with the full Hamiltonian, including the interaction term, so it is not a free field. The T-product is defined in eq. (5.32).

• The next step is given by eq. (5.67), where we write the vacuum expectation value of the T-product of the fields φ (which evolve with the full Hamiltonian) in terms of a free field $\phi_I$ (the interaction picture field) and of the interaction picture Hamiltonian, $H_I$, which is a function of the free fields $\phi_I$. The perturbative expansion is the expansion of the exponential in eq. (5.67) in powers of the interaction Hamiltonian. Observe that from now on the interaction picture field $\phi_I$ will be denoted simply by φ, while the fully interacting field will not appear again.

• Expanding the exponential of the Hamiltonian we are left with the task of computing vacuum expectation values of free fields. The actual computation is enormously simplified by the use of Wick's theorem and of Feynman rules (Section 5.5). A basic role is played by the Feynman propagator (Section 5.4), since Wick's theorem reduces the vacuum expectation value of the product of N fields, with N arbitrary, to a sum of products of Feynman propagators. All contributions can be represented graphically by Feynman diagrams.

• We can now perform explicit calculations, and we discover that Feynman graphs containing closed loops are in general divergent. To cure these divergences one first regularizes the theory, introducing a cutoff. The couplings, fields and masses which appear in the Lagrangian are then given a dependence on the cutoff (and are now called bare couplings, bare fields and bare masses) chosen so that the divergences coming from the loops are canceled, and physical observables (like the renormalized masses and couplings) are finite.

• Theories where the divergences can be reabsorbed in a finite number of parameters are called renormalizable, and are mathematically consistent and predictive, in principle at any energy scale. Non-renormalizable theories, however, can still be perfectly acceptable low-energy theories, and have an intrinsic mass-scale above which we know that they must be replaced by a more fundamental theory.

• As a consequence of the renormalization procedure, the physical (i.e. renormalized) coupling constants are not at all constant. Rather, their value depends on the energy scale at which they are measured. If the coupling goes to zero in the high-energy limit the theory is called asymptotically free. The theory of strong interactions, QCD, turns out to be an asymptotically free theory.
Further reading

• The subject of perturbative expansion, Feynman diagrams, renormalization, etc. is treated in detail in all QFT textbooks. Excellent and very detailed discussions are given in Itzykson and Zuber (1980), Peskin and Schroeder (1995) and Weinberg (1995). At a simpler level, a clear book is Mandl and Shaw (1984).

• The measurements of the cosmological constant discussed in Section 5.7 are very important and are very likely to become even more important in the future. A book discussing the modern developments in cosmology is S. Dodelson, Modern Cosmology, Elsevier, San Diego 2003. See, e.g. Section 2.4.5 for a discussion of vacuum energy and type Ia supernovae. Beautiful recent experimental results on type Ia supernovae are reported in A. Riess et al., astro-ph/0402512. Fluctuations in the cosmic microwave background (CMB) have been measured with great accuracy by the WMAP experiment, see e.g. C. L. Bennett et al., First Year Wilkinson Microwave Anisotropy Probe (WMAP) Observations: Preliminary Maps and Basic Results, Astrophys. J. Suppl. 148 (2003) 1 [arXiv:astro-ph/0302207].
Exercises

(5.1) Using the definition (5.69) and the expression of the T-product in terms of the theta function, show that the Feynman propagator is a Green's function of the KG operator, i.e. it satisfies

$$(\Box_x + m^2)\, D(x-y) = -i\,\delta^{(4)}(x-y) . \tag{5.192}$$

Show that this is true in general for the T-product of N fields (5.43), justifying the fact that we have called them the N-point Green's functions. Solve eq. (5.192) by going to momentum space, and study which boundary conditions correspond to each prescription for going around the poles.

(5.2) Consider a scalar theory in d space-time dimensions whose action has the standard kinetic term $\int d^dx\, (\partial\phi)^2$ and an interaction term $\int d^dx\, \lambda\phi^n$. According to the counting argument presented in the text, for what values of n and d is the theory renormalizable?

(5.3) In the text we have written the interaction as $\lambda\phi^4$. We could instead study the theory with interaction term $:\lambda\phi^4:$. Compare the perturbative expansion in the two cases. In particular, show that the mass renormalization is different, and with the interaction term $:\lambda\phi^4:$ it vanishes at O(λ).

(5.4) In QCD the perturbative expansion is an expansion in powers of $g^2$ and, at the two-loop level,

$$\frac{dg}{d\log E} = -\beta_0 g^3 - \beta_1 g^5 + O(g^7) , \tag{5.193}$$

with $\beta_0 > 0$ and $\beta_1 > 0$. In terms of $\alpha_s = g^2/(4\pi)$,

$$\frac{d\alpha_s}{d\log E} = -b_0\alpha_s^2 - b_1\alpha_s^3 + O(\alpha_s^4) , \tag{5.194}$$

with $b_0 > 0$ and $b_1 > 0$.

(i) Neglect the term $O(\alpha_s^3)$ and verify that the solution is

$$\alpha(E) = \frac{\alpha(\mu)}{1 + b_0\,\alpha(\mu)\log(E/\mu)} , \tag{5.195}$$

where µ is a mass-scale of reference, used to fix the initial condition. Define a new mass-scale $\Lambda_{\rm QCD}$ as

$$\Lambda_{\rm QCD} = \mu \exp\left\{-\frac{1}{b_0\,\alpha(\mu)}\right\} . \tag{5.196}$$

Verify that eq. (5.195) can be rewritten as

$$\alpha(E) = \frac{1}{b_0\log(E/\Lambda_{\rm QCD})} . \tag{5.197}$$

Therefore the coupling is small (and the approximation of neglecting the term $O(\alpha_s^3)$ in eq. (5.194) is justified) when $E \gg \Lambda_{\rm QCD}$. Experimentally, $\Lambda_{\rm QCD} \simeq 200$ MeV, and typically the perturbative calculations are valid at E > 1 GeV.

(ii) Using eq. (5.197) as a lowest-order solution, show that the solution of eq. (5.194) at two loops, i.e. including the term $O(\alpha_s^3)$, can be written as

$$\alpha(E) = \frac{1}{b_0\log(E/\Lambda_{\rm QCD}) + \dfrac{b_1}{b_0}\log\log(E/\Lambda_{\rm QCD})} , \tag{5.198}$$

after a suitable redefinition of $\Lambda_{\rm QCD}$.
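As a numerical cross-check of part (i) of Exercise 5.4 (not a substitute for the analytic derivation), the sketch below evaluates eqs. (5.195) and (5.197) side by side, with $\Lambda_{\rm QCD}$ defined by eq. (5.196). The function names are hypothetical, and the input values for $\alpha(\mu)$, µ and $b_0$ are illustrative numbers rather than values taken from the text.

```python
import math


def alpha_from_reference(alpha_mu, b0, energy, mu):
    """One-loop solution in terms of the reference scale mu, eq. (5.195)."""
    return alpha_mu / (1.0 + b0 * alpha_mu * math.log(energy / mu))


def lambda_qcd(alpha_mu, b0, mu):
    """Mass scale defined in eq. (5.196)."""
    return mu * math.exp(-1.0 / (b0 * alpha_mu))


def alpha_from_lambda(b0, energy, lam):
    """One-loop solution in terms of Lambda_QCD, eq. (5.197)."""
    return 1.0 / (b0 * math.log(energy / lam))


if __name__ == "__main__":
    alpha_mu, mu, b0 = 0.12, 91.19, 1.22   # illustrative inputs (mu in GeV)
    lam = lambda_qcd(alpha_mu, b0, mu)
    print("Lambda (GeV):", lam)
    for energy in (2.0, 10.0, 91.19, 1000.0):
        a1 = alpha_from_reference(alpha_mu, b0, energy, mu)
        a2 = alpha_from_lambda(b0, energy, lam)
        print(energy, a1, a2)
```

The two columns agree to rounding error at every energy, which reflects the fact that µ and $\alpha(\mu)$ enter only through the single combination $\Lambda_{\rm QCD}$.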
6 Cross-sections and decay rates
In the previous chapter we have understood how to compute matrix elements between initial and final states and how to make them finite. We will see in this chapter how to use these matrix elements to compute scattering cross-sections and decay rates.
6.1 Relativistic and non-relativistic normalizations
It is useful first of all to clarify a difference between the relativistic and non-relativistic normalization of one-particle states. To make the argument cleaner, we first consider a system in a cubic box with spatial volume $V = L^3$. At the end of the computation V will be sent to infinity. Momenta are therefore discrete; for instance, if we use periodic boundary conditions, $\mathbf{p} = 2\pi\mathbf{n}/L$ with $\mathbf{n} = (n_x, n_y, n_z)$ a vector with integer components. In non-relativistic quantum mechanics a one-particle state with momentum $\mathbf{p}$ in the coordinate representation is given by a plane wave

$$\psi_{\mathbf p}(\mathbf x) = C\, e^{i\mathbf p\cdot\mathbf x} \tag{6.1}$$

and the normalization constant is fixed by the condition that there is one particle in the volume V,

$$\int_V d^3x\, |\psi_{\mathbf p}(\mathbf x)|^2 = 1 . \tag{6.2}$$

This fixes $C = 1/\sqrt{V}$. Wave functions with different momenta are orthogonal, and therefore

$$\int_V d^3x\, \psi^*_{\mathbf p_1}(\mathbf x)\,\psi_{\mathbf p_2}(\mathbf x) = \delta_{\mathbf p_1,\mathbf p_2} . \tag{6.3}$$

Writing $\psi_{\mathbf p}(\mathbf x) = \langle \mathbf x|\mathbf p\rangle$ and using the completeness relation $\int d^3x\, |\mathbf x\rangle\langle \mathbf x| = 1$, we can rewrite this as

$$\langle \mathbf p_1|\mathbf p_2\rangle^{(NR)} = \delta_{\mathbf p_1,\mathbf p_2} . \tag{6.4}$$

The superscript (NR) reminds us that the states have been normalized according to the conventions of non-relativistic quantum mechanics.
Chapter contents:
6.1 Normalizations
6.2 Decay rates
6.3 Cross-sections
6.4 Two-body final states
6.5 Resonances
6.6 Born approximation
6.7 Solved problems
In relativistic QFT this normalization is not the most convenient, because the spatial volume V is not relativistically invariant, and therefore the condition "one particle per volume V" is not invariant. We have already introduced in eq. (4.11) a more convenient Lorentz-invariant normalization; in a finite box, using eq. (4.6), it reads

$$\langle \mathbf p_1|\mathbf p_2\rangle^{(R)} = 2E_{\mathbf p_1} V\, \delta_{\mathbf p_1,\mathbf p_2} . \tag{6.5}$$

Therefore the difference between the relativistic and non-relativistic normalization of the one-particle states is

$$|\mathbf p\rangle^{(R)} = (2E_{\mathbf p}V)^{1/2}\, |\mathbf p\rangle^{(NR)} , \tag{6.6}$$

and of course for a multiparticle state

$$|\mathbf p_1,\ldots,\mathbf p_n\rangle^{(R)} = \left[\prod_{i=1}^{n} (2E_{\mathbf p_i}V)^{1/2}\right] |\mathbf p_1,\ldots,\mathbf p_n\rangle^{(NR)} . \tag{6.7}$$

We denote by $M_{fi}$ the scattering amplitude between the initial state with momenta $\mathbf q_1,\ldots,\mathbf q_m$ and the final state with momenta $\mathbf p_1,\ldots,\mathbf p_n$, with non-relativistic normalization, and by $\mathcal{M}_{fi}$ the same matrix element with relativistic normalization of the states. Then

$$M_{fi} = \prod_{i=1}^{n} (2E_{\mathbf p_i}V)^{-1/2} \prod_{j=1}^{m} (2E_{\mathbf q_j}V)^{-1/2}\, \mathcal{M}_{fi} . \tag{6.8}$$
We saw in Chapter 5 that the S-matrix can be written as S = 1 + iT and that it is convenient to extract a factor $(2\pi)^4\delta^{(4)}(P_i - P_f)$ from T, where $P_i$ and $P_f$ are the total initial and final four-momenta. Then

$$S = 1 + (2\pi)^4\delta^{(4)}(P_i - P_f)\, iM . \tag{6.9}$$

If we take the matrix element of S between the initial state $|i\rangle$ and the final state $|f\rangle$, taken with the non-relativistic normalization, then the matrix element of the operator M is just $M_{fi}$ while the matrix element of the identity operator is just a Kronecker delta, because of eq. (6.4),

$$S_{fi} = \delta_{fi} + (2\pi)^4\delta^{(4)}(P_i - P_f)\, iM_{fi} . \tag{6.10}$$

6.2 Decay rates
Consider first the case in which the initial state is a single particle with four-momentum p and mass M, and the final state is given by n particles with four-momenta $p_i$ and masses $m_i$, i = 1, ..., n. We are therefore considering a decay process. Assume for the moment that all particles are distinguishable. The rules of quantum mechanics tell us that the probability for this process is obtained by taking the squared modulus of the amplitude and summing over all possible final states. In eq. (6.10) the term $\delta_{fi}$ gives of course zero because the initial and final states are different. When we take the square of the other term we are confronted with the square of the delta function. To compute it, we recall that we are working in a finite spatial volume and, from eq. (4.6),

$$(2\pi)^3\delta^{(3)}(0) = V . \tag{6.11}$$

Similarly, we regularize also the time interval, saying that time runs from −T/2 to T/2 (at the end of the computation T → ∞), so that

$$(2\pi)^4\delta^{(4)}(0) = VT \tag{6.12}$$

and

$$\left|(2\pi)^4\delta^{(4)}(P_i - P_f)\, iM_{fi}\right|^2 = (2\pi)^4\delta^{(4)}(P_i - P_f)\, VT\, |M_{fi}|^2 . \tag{6.13}$$
We must now sum this expression over all final states. Since we are working in a finite volume V, this is the sum over the possible discrete values of the momenta of the final particles. In the large-volume limit for each particle we can write, using eq. (4.5),

$$\sum_{\mathbf p_i} \to \frac{V}{(2\pi)^3}\int d^3p_i . \tag{6.14}$$

It is interesting to understand this result physically, observing that in statistical mechanics the integration measure over the phase space is

$$\frac{d^3x_i\, d^3p_i}{(2\pi)^3} \tag{6.15}$$

for each particle. The factor $(2\pi)^3$ is simply the volume of the cells of the phase space, $h^3 = (2\pi\hbar)^3$, in units $\hbar = 1$. In our case the particles are momentum eigenstates and are completely delocalized in space, so the scattering amplitude depends on the momenta but not on the positions of the particles, and we can integrate over $d^3x_i$, obtaining the volume factor. In conclusion, the probability dw for a decay in which in the final state the i-th particle has momentum between $\mathbf p_i$ and $\mathbf p_i + d\mathbf p_i$ is

$$dw = (2\pi)^4\delta^{(4)}(P_i - P_f)\, VT\, |M_{fi}|^2 \prod_{i=1}^{n} \frac{V\, d^3p_i}{(2\pi)^3} . \tag{6.16}$$
This is the probability that the decay takes place at any time between −T/2 and T/2. We are more interested in the decay rate dΓ, which is the decay probability per unit time, and therefore is obtained dividing by T,

$$d\Gamma = (2\pi)^4\delta^{(4)}(P_i - P_f)\, V\, |M_{fi}|^2 \prod_{i=1}^{n} \frac{V\, d^3p_i}{(2\pi)^3} , \tag{6.17}$$

and by construction has a finite limit for T → ∞. To get rid of the divergent V factors we now express this in terms of the matrix element with relativistic normalization. Using eq. (6.8) we see that the volume factors cancel, and we are left with

$$d\Gamma = (2\pi)^4\delta^{(4)}\Big(p - \sum_i p_i\Big)\, \frac{1}{2E_p}\, |\mathcal{M}_{fi}|^2 \prod_{i=1}^{n} \frac{d^3p_i}{(2\pi)^3\, 2E_i} , \tag{6.18}$$

where $E_p$ is the energy of the initial particle and $E_i$ are the energies of the final particles. Various observations are in order:

• Energy–momentum conservation is guaranteed by the Dirac delta. In particular, if $M < \sum_i m_i$ the process is forbidden.

• The factors $d^3p_i/E_i$ are relativistically invariant.

• $\mathcal{M}_{fi}$ is computed with the relativistic normalization of the states and therefore is just the matrix element that we learned to compute in Chapter 5, and it is relativistically invariant.

• The factor $1/(2E_p)$ reduces to $1/(2M)$ in the rest frame of the decaying particle. In a generic frame in which the particle has speed v, we have $E_p = \gamma M$ with $\gamma = (1 - v^2)^{-1/2}$ and therefore the rate Γ is smaller by a factor γ. The lifetime of a particle is the inverse of its total decay rate (i.e. of the rate dΓ integrated over momenta and summed over all possible decay modes). Therefore the factor γ is nothing but the relativistic dilatation of time.

It is useful to define the (differential) n-body phase space $d\Phi^{(n)}$,

$$d\Phi^{(n)} \equiv (2\pi)^4\delta^{(4)}(P_i - P_f) \prod_{i=1}^{n} \frac{d^3p_i}{(2\pi)^3\, 2E_i} . \tag{6.19}$$

Equation (6.18) can therefore be written as

$$d\Gamma = \frac{1}{2E_p}\, |\mathcal{M}_{fi}|^2\, d\Phi^{(n)} . \tag{6.20}$$
Finally, observe that if n of the ﬁnal particles are identical, conﬁgurations that diﬀer by a permutation are not distinct and therefore the phase space is reduced by a factor n!.
6.3 Cross-sections
Consider a beam of particles with mass $m_1$, number density (that is, number of particles per unit volume) $n_1^0$ and velocity $\mathbf v_1$ impinging on a target made of particles with mass $m_2$ and number density $n_2^0$, at rest. The superscript 0 on $n_i^0$ is meant to stress that these are the number densities in a specific frame, that with particle 2 at rest. Assume for simplicity that both types of particles have a uniform distribution (it is not difficult to generalize to a non-uniform distribution, as one can have in a typical beam). The number of scattering events, N, that take place per unit volume and per unit time will be proportional to the incoming flux $n_1^0 v_1$ and to the density of targets $n_2^0$. The proportionality constant is, by definition, the cross-section:

$$dN = \sigma\, v_1\, n_1^0\, n_2^0\, dV dt . \tag{6.21}$$

Dimensional analysis shows immediately that σ has the dimensions of an area. Equation (6.21) holds in the rest frame of the particles of type 2. We want to write a similar expression in a generic frame. First of all, in a generic frame, we define the cross-section as the Lorentz-invariant quantity that, in the rest frame of particle 2, is given by eq. (6.21). The number dN is Lorentz invariant (its integral is the number of clicks of the detector, so it is clearly independent of the reference frame), and $dV dt = d^4x$ is also invariant. Therefore we must find a Lorentz-invariant expression that, in the rest frame of particle 2, reduces to $v_1 n_1^0 n_2^0$. This is given by (see, e.g. Landau and Lifshitz, vol. II (1979), Section 12)

$$n_1 n_2 \sqrt{(\mathbf v_1 - \mathbf v_2)^2 - (\mathbf v_1 \times \mathbf v_2)^2} , \tag{6.22}$$

where $n_1$, $n_2$ are the number densities of the two types of particles in the frame where their respective velocities are $\mathbf v_1$ and $\mathbf v_2$ (note that the number density is not invariant, but transforms as the inverse of a spatial volume). If the particles are collinear, we simply have

$$dN = \sigma\, |\mathbf v_1 - \mathbf v_2|\, n_1 n_2\, dV dt . \tag{6.23}$$

It is convenient to define the quantity

$$I \equiv \sqrt{(p_1 p_2)^2 - m_1^2 m_2^2} = E_1 E_2 \sqrt{(\mathbf v_1 - \mathbf v_2)^2 - (\mathbf v_1 \times \mathbf v_2)^2} , \tag{6.24}$$

so that

$$dN = \sigma\, \frac{I}{V E_1 E_2}\, (n_1 V)(n_2\, dV)\, dt . \tag{6.25}$$

Integrating over dV, $n_2\, dV$ gives the total number $N_2$ of particles of type 2, while $n_1 V = N_1$. Then the total number of events per particle of type 1 and per particle of type 2, in a total time T, is given by $\sigma I T/(V E_1 E_2)$. However, this is nothing but the probability of the event, i.e. the square of the matrix element, summed over all final states. Therefore

$$\sigma = \frac{V E_1 E_2}{I\, T} \int (2\pi)^4\delta^{(4)}(P_i - P_f)\, VT\, |M_{fi}|^2 \prod_{i=1}^{n} \frac{V\, d^3p_i}{(2\pi)^3} , \tag{6.26}$$

or, in differential form,

$$d\sigma = \frac{V^2 E_1 E_2}{I}\, (2\pi)^4\delta^{(4)}(P_i - P_f)\, |M_{fi}|^2 \prod_{i=1}^{n} \frac{V\, d^3p_i}{(2\pi)^3} . \tag{6.27}$$
We now pass from $|M_{fi}|^2$ (non-relativistic normalization) to $|\mathcal{M}_{fi}|^2$ (relativistic normalization). The two initial particles bring a factor $1/[(2E_1V)(2E_2V)]$ so the overall factor $V^2 E_1 E_2$ cancels, and the n final particles bring each one a factor $1/(2E_iV)$ so that also the volume factors in $V\, d^3p_i$ cancel. The final result is then

$$d\sigma = (2\pi)^4\delta^{(4)}(P_i - P_f)\, \frac{1}{4I}\, |\mathcal{M}_{fi}|^2 \prod_{i=1}^{n} \frac{d^3p_i}{(2\pi)^3\, 2E_i} . \tag{6.28}$$

Observe that in the above expression the factors I, $\mathcal{M}_{fi}$ and $d^3p_i/E_i$ are separately Lorentz invariant, so the Lorentz invariance of the cross-section is evident. The term 4I is called the flux factor. In terms of the phase space (6.19), eq. (6.28) reads

$$d\sigma = \frac{1}{4I}\, |\mathcal{M}_{fi}|^2\, d\Phi^{(n)} . \tag{6.29}$$

6.4 Two-body final states
Consider first the decay of a particle of mass M into two particles of masses $m_1$, $m_2$. Since the phase space is Lorentz invariant, we can compute it in the frame that we prefer, and of course the simplest choice is the rest frame of the initial particle. Then

$$d\Phi^{(2)} = (2\pi)^4\, \delta(M - E_1 - E_2)\, \delta^{(3)}(\mathbf p_1 + \mathbf p_2)\, \frac{d^3p_1}{(2\pi)^3\, 2E_1}\, \frac{d^3p_2}{(2\pi)^3\, 2E_2} . \tag{6.30}$$

We have six integration variables and four Dirac deltas, so we can reduce this to only two integrations. We can perform explicitly the integration over $d^3p_2$ using the Dirac delta $\delta^{(3)}(\mathbf p_1 + \mathbf p_2)$, and we are left with a phase space which is still differential with respect to $d^3p_1$,

$$d\Phi^{(2)} = \frac{1}{(2\pi)^2}\, \frac{1}{4E_1E_2}\, \delta(M - E_1 - E_2)\, d^3p_1 . \tag{6.31}$$

Of course here $E_2^2 = \mathbf p_2^2 + m_2^2$ where now $\mathbf p_2$ has become a notation for $-\mathbf p_1$ instead of being an independent integration variable. We now write $d^3p_1 = p_1^2\, dp_1\, d\Omega$, where dΩ is the infinitesimal solid angle and $p_1 = |\mathbf p_1|$, and we integrate over $p_1$ using the conservation of energy, i.e.

$$d\Phi^{(2)} = \frac{1}{(2\pi)^2}\, d\Omega \int_0^\infty p_1^2\, dp_1\, \frac{1}{4E_1E_2}\, \delta\!\left(M - \sqrt{p_1^2 + m_1^2} - \sqrt{p_1^2 + m_2^2}\right) . \tag{6.32}$$

The integral is easily performed using the identity

$$\delta(f(x)) = \frac{1}{|f'(x_0)|}\, \delta(x - x_0) , \tag{6.33}$$

where $x_0$ is the zero of f(x) (if there is more than one zero we must sum over all of them, but in this case there is only one zero in the integration domain $p_1 \geq 0$) and we find

$$d\Phi^{(2)} = \frac{1}{32\pi^2 M^2} \left[M^4 + (m_1^2 - m_2^2)^2 - 2M^2(m_1^2 + m_2^2)\right]^{1/2} d\Omega . \tag{6.34}$$

In the limit $m_1 = m_2 \equiv m$ this simplifies further to

$$d\Phi^{(2)} = \frac{1}{32\pi^2} \left(1 - \frac{4m^2}{M^2}\right)^{1/2} d\Omega , \tag{6.35}$$

where we have assumed that the two particles are distinguishable. If instead they are identical, the phase space is reduced by a factor 1/2!. Another common situation is $m_1 = m$, $m_2 = 0$, in which case

$$d\Phi^{(2)} = \frac{1}{32\pi^2} \left(1 - \frac{m^2}{M^2}\right) d\Omega . \tag{6.36}$$

Observe that the phase space goes to zero when the decay products have the maximum mass compatible with the conservation of energy, i.e. at m = M/2 in eq. (6.35) and at m = M in eq. (6.36). Using eq. (6.20) we can write the differential decay rate for a two-body decay, dΓ/dΩ, where $d\Omega = d\cos\theta\, d\phi$. In the rest frame of the decaying particle, it is

$$d\Gamma = \frac{1}{64\pi^2 M^3} \left[M^4 + (m_1^2 - m_2^2)^2 - 2M^2(m_1^2 + m_2^2)\right]^{1/2} |\mathcal{M}_{fi}|^2\, d\Omega . \tag{6.37}$$

In principle, $\mathcal{M}_{fi}$ depends on the angles θ, φ. If the decaying particle has spin, it is convenient to choose the direction of the spin as the polar axis. In the absence of external fields we have cylindrical symmetry around this axis (the symmetry could be broken, for instance, by an external magnetic field pointing in a direction different from the spin of the particle), and in this case $\mathcal{M}_{fi}$ does not depend on φ and the integration over dφ simply gives a factor 2π. If furthermore the particle has spin zero, there is no preferred direction and the decay is isotropic, i.e. $\mathcal{M}_{fi}$ is independent also of θ, and the integration over dΩ gives simply a factor 4π.
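The two-body formulas above lend themselves to a quick numerical check. The sketch below is a minimal illustration, assuming eqs. (6.34) and (6.37) as written above: it computes the momentum of the decay products in the rest frame, the phase space per unit solid angle, and the differential rate for a given squared amplitude. The function names are hypothetical; the square-bracket combination in eq. (6.34) is implemented through the standard Källén function, a name that does not appear in the text.

```python
import math


def kallen(a, b, c):
    """Kallen function: a^2 + b^2 + c^2 - 2ab - 2ac - 2bc."""
    return a * a + b * b + c * c - 2.0 * (a * b + a * c + b * c)


def decay_momentum(M, m1, m2):
    """Modulus of the three-momentum of either decay product in the rest frame."""
    if M < m1 + m2:
        raise ValueError("decay is kinematically forbidden: M < m1 + m2")
    # kallen(M^2, m1^2, m2^2) = M^4 + (m1^2 - m2^2)^2 - 2 M^2 (m1^2 + m2^2)
    lam = max(kallen(M * M, m1 * m1, m2 * m2), 0.0)
    return math.sqrt(lam) / (2.0 * M)


def dphi2_domega(M, m1, m2):
    """Two-body phase space per unit solid angle, eq. (6.34)."""
    return decay_momentum(M, m1, m2) / (16.0 * math.pi ** 2 * M)


def dgamma_domega(M, m1, m2, amp_sq):
    """Differential decay rate in the rest frame, eq. (6.37); amp_sq is the squared amplitude."""
    return amp_sq * dphi2_domega(M, m1, m2) / (2.0 * M)


if __name__ == "__main__":
    M, m1, m2 = 1.0, 0.3, 0.2   # illustrative masses, arbitrary units
    p = decay_momentum(M, m1, m2)
    # Energy conservation: the two energies must add up to M.
    print(math.sqrt(p * p + m1 * m1) + math.sqrt(p * p + m2 * m2))
    print(dphi2_domega(M, m1, m2), dgamma_domega(M, m1, m2, amp_sq=1.0))
```

Writing the phase space in terms of the decay momentum makes the threshold behaviour explicit: the momentum, and with it the phase space, vanishes when $M = m_1 + m_2$.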
Consider now a scattering process 2 → 2. We consider an initial state with two particles with masses $m_1$, $m_2$ and four-momenta $p_1$, $p_2$, and a final state with two particles with masses $m_3$, $m_4$ and four-momenta $p_3$, $p_4$. It is useful to introduce the Mandelstam variables s, t and u,

$$s = (p_1 + p_2)^2 , \qquad t = (p_1 - p_3)^2 , \qquad u = (p_1 - p_4)^2 . \tag{6.38}$$

Fig. 6.1 An s-channel amplitude. [Diagram: incoming momenta p1, p2; outgoing momenta p3, p4; the internal line carries momentum p1 + p2.]

These variables are clearly Lorentz invariant, and satisfy the relation

$$s + t + u = m_1^2 + m_2^2 + m_3^2 + m_4^2 , \tag{6.39}$$
as one verifies immediately from the definitions, using energy–momentum conservation, $p_1 + p_2 = p_3 + p_4$. It is convenient to work in the center of mass (CM), where the incoming particles have four-momenta $p_1 = (E_1, \mathbf p)$ and $p_2 = (E_2, -\mathbf p)$, with $E_{1,2}^2 = \mathbf p^2 + m_{1,2}^2$. Computing s in the CM we find $s = (E_1 + E_2)^2$, so the center-of-mass energy is $\sqrt{s}$. Observe that in a Feynman graph like Fig. 6.1 the momentum of the intermediate particle is $p_1 + p_2$, so its propagator is a function of s; for instance, if the intermediate particle is a scalar with mass m, the propagator in Fig. 6.1 can be written as $i/(s - m^2)$. Instead in Fig. 6.2 the propagator of the intermediate particle is $i/(t - m^2)$. For this reason, the amplitudes in Figs. 6.1 and 6.2 are referred to as the s-channel and t-channel amplitude, respectively. The u-channel amplitude is obtained exchanging $p_3$ and $p_4$ in Fig. 6.2.

Fig. 6.2 A t-channel amplitude. [Diagram: incoming momenta p1, p2; outgoing momenta p3, p4; the internal line carries momentum p1 − p3.]

In the CM, we can also write $p_3 = (E_3, \mathbf p')$, $p_4 = (E_4, -\mathbf p')$. Energy conservation gives immediately

$$|\mathbf p'| = \frac{1}{2\sqrt{s}} \left[s^2 + (m_3^2 - m_4^2)^2 - 2s(m_3^2 + m_4^2)\right]^{1/2} , \tag{6.40}$$

and the same calculation performed in the case of two-body decay gives

$$d\Phi^{(2)} = \frac{1}{(2\pi)^2}\, \frac{|\mathbf p'|}{4\sqrt{s}}\, d\Omega . \tag{6.41}$$

With a simple computation (see the solutions to Exercise 7.1) one can show that the flux factor I, evaluated in the CM, becomes

$$I = |\mathbf p|\sqrt{s} \tag{6.42}$$

and therefore the 2 → 2 differential scattering cross-section is

$$d\sigma = \frac{1}{64\pi^2 s}\, \frac{|\mathbf p'|}{|\mathbf p|}\, |\mathcal{M}_{fi}|^2\, d\Omega . \tag{6.43}$$

For elastic scattering ($m_1 = m_3$, $m_2 = m_4$) we have $|\mathbf p'| = |\mathbf p|$ and

$$d\sigma_{\rm elas} = \frac{1}{64\pi^2 s}\, |\mathcal{M}_{fi}|^2\, d\Omega . \tag{6.44}$$
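The kinematic relations of this section can be verified numerically. The sketch below is a minimal illustration, assuming the metric signature (+, −, −, −) and the CM parametrization used above: it builds explicit four-momenta for a 2 → 2 process, and checks the sum rule (6.39) and the CM expression (6.42) for the flux factor defined in eq. (6.24). The helper names and the numerical inputs are hypothetical.

```python
import math


def dot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def cm_momentum(sqrt_s, ma, mb):
    """Modulus of the CM three-momentum of a pair with masses ma, mb, cf. eq. (6.40)."""
    s = sqrt_s ** 2
    lam = s * s + (ma * ma - mb * mb) ** 2 - 2.0 * s * (ma * ma + mb * mb)
    return math.sqrt(lam) / (2.0 * sqrt_s)


def energy(m, k):
    """On-shell energy of a particle of mass m and momentum modulus k."""
    return math.sqrt(m * m + k * k)


def two_to_two(sqrt_s, m1, m2, m3, m4, theta):
    """CM four-momenta (E, px, py, pz): beam along z, scattering angle theta."""
    p = cm_momentum(sqrt_s, m1, m2)
    q = cm_momentum(sqrt_s, m3, m4)
    p1 = (energy(m1, p), 0.0, 0.0, p)
    p2 = (energy(m2, p), 0.0, 0.0, -p)
    p3 = (energy(m3, q), q * math.sin(theta), 0.0, q * math.cos(theta))
    p4 = (energy(m4, q), -q * math.sin(theta), 0.0, -q * math.cos(theta))
    return p1, p2, p3, p4


def mandelstam(p1, p2, p3, p4):
    """Mandelstam variables s, t, u of eq. (6.38); p4 is kept for a uniform signature."""
    total = tuple(x + y for x, y in zip(p1, p2))
    d13 = tuple(x - y for x, y in zip(p1, p3))
    d14 = tuple(x - y for x, y in zip(p1, p4))
    return dot(total, total), dot(d13, d13), dot(d14, d14)


def flux_factor(p1, p2):
    """I = sqrt((p1.p2)^2 - m1^2 m2^2), eq. (6.24)."""
    return math.sqrt(dot(p1, p2) ** 2 - dot(p1, p1) * dot(p2, p2))


if __name__ == "__main__":
    masses = (1.0, 2.0, 1.5, 0.5)   # illustrative masses, arbitrary units
    p1, p2, p3, p4 = two_to_two(10.0, *masses, theta=0.7)
    s, t, u = mandelstam(p1, p2, p3, p4)
    print(s + t + u, sum(m * m for m in masses))
    print(flux_factor(p1, p2), cm_momentum(10.0, 1.0, 2.0) * 10.0)
```

Each printed pair should agree to rounding error; the first compares the two sides of eq. (6.39) and the second compares the invariant definition of I with its CM value.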
The above formulas are valid also for particles with spin, if the initial and final spin states are known; in this case the initial state has the form $|i\rangle = |\mathbf p_1, s_1; \ldots; \mathbf p_n, s_n\rangle$, and similarly for the final state, so the only modification in the above equations is that the labels i, f in $\mathcal{M}_{fi}$ include also the spin degrees of freedom. However, experimentally it is more common that we do not know the initial spin configuration and we accept in the detector all final spin configurations; in this case, to compare with experiment, we must sum the right-hand side of eq. (6.29) over the final spin configurations and average it over the initial spin configurations. Defining

$$\overline{|\mathcal{M}_{fi}|^2} \equiv \sum_{\rm initial\ spins}\ \sum_{\rm final\ spins} |\mathcal{M}_{fi}|^2 , \tag{6.45}$$

all formulas for the cross-sections are modified with the replacement

$$|\mathcal{M}_{fi}|^2 \to \frac{1}{(2s_a + 1)(2s_b + 1)}\, \overline{|\mathcal{M}_{fi}|^2} , \tag{6.46}$$

where $s_a$, $s_b$ are the spins of the two initial particles. For a decay rate Γ the average over the spin s of the initial particle brings instead a factor 1/(2s + 1).
6.5 Resonances and the Breit–Wigner distribution
Consider the scattering 2 → 2 in a theory with a cubic interaction vertex, as for instance in a scalar theory with $L_{\rm int} = \lambda\phi^3$. At tree level the amplitude is given by the sum of three Feynman diagrams, the s-channel amplitude of Fig. 6.1, the t-channel amplitude of Fig. 6.2, and the u-channel amplitude obtained from Fig. 6.2 exchanging $p_3$ with $p_4$. We focus on the s-channel amplitude. We denote by m the (renormalized) mass of the φ field and by $p = p_1 + p_2$ the total initial four-momentum. Then $p^2 = (E_1 + E_2)^2 - (\mathbf p_1 + \mathbf p_2)^2 \equiv s$ is the square of the CM energy, and therefore the physical region is defined by

$$p^2 > (2m)^2 \tag{6.47}$$

and in the physical region the propagator $i/(p^2 - m^2)$ of the internal line in Fig. 6.1 is always finite. In other words, the internal line is off-shell, which is also expressed saying that it represents a "virtual" particle, rather than a real particle. However, in other situations it is possible to have one or more real particles in the intermediate state. Consider for instance the theory described by two light real scalar fields $\phi_1$ and $\phi_2$ (see footnote 1) and one heavy scalar field Φ, with Lagrangian

$$L = \frac{1}{2}\left[(\partial\phi_1)^2 - m^2\phi_1^2 + (\partial\phi_2)^2 - m^2\phi_2^2 + (\partial\Phi)^2 - M^2\Phi^2\right] + g\,\phi_1\phi_2\Phi . \tag{6.48}$$

The Feynman rules for this theory are very simple. The propagator of the fields $\phi_i$ is $i/(p^2 - m^2)$, the propagator of Φ is $i/(p^2 - M^2)$, and there is an interaction vertex shown in Fig. 6.3, equal to −ig. After dressing the propagators with their loop corrections, the masses m and M appearing in the propagators become the physical, renormalized masses, as we saw in Chapter 5 (apart from a crucial subtlety to be explained soon). The $\phi_1\phi_2 \to \phi_1\phi_2$ scattering amplitude in the s-channel is described at tree level by the diagram in Fig. 6.4, and the Feynman rules give

$$i\mathcal{M}_{2\to 2} = (-ig)^2\, \frac{i}{p^2 - M^2} , \tag{6.49}$$

with $p^2 = s = (p_1 + p_2)^2$. The physical region corresponds to $p^2 \geq (2m)^2$ since each of the two incoming particles has an energy at least equal to its mass. Therefore, if $M^2 < (2m)^2$, in the physical region we always have $p^2 > M^2$, so $p^2 - M^2$ is always non-zero and the amplitude is finite. However, if $M^2 > 4m^2$, we apparently have a divergent amplitude (and therefore a divergent cross-section) at a physical value of the energy. The divergence appears when the momenta of the incoming particles are such that $p^2 = M^2$, i.e. when the internal line represents an on-shell particle. This divergence means that in the case M > 2m the amplitude (6.49) cannot be correct, and we must have missed something.

1 We take two different light fields $\phi_1$, $\phi_2$ to avoid the small complication of identical particles; we might as well consider a single light real scalar field φ with coupling $\phi^2\Phi$, but we should be careful to insert a factor 1/2! in the phase space, because of identical particles, and the appropriate combinatorial factors in the amplitudes.

Fig. 6.3 The vertex for a $\Phi\phi_1\phi_2$ interaction. The heavy line is the Φ field and the thin lines represent one field $\phi_1$ and one field $\phi_2$.

Fig. 6.4 The diagram for 2 → 2 scattering in the s-channel.
The origin of this unphysical divergence is that we have neglected that, if M > 2m, the particle described by the field Φ is unstable, because it can decay into two particles of mass m through the graph in Fig. 6.3. This graph gives an amplitude $i\mathcal{M}_{\Phi\to\phi_1\phi_2} = -ig$, independently of the masses of the particles. However, if M < 2m, the Dirac delta in the phase space is never satisfied and the decay rate is zero. If instead M > 2m the phase space opens up and we have a non-zero decay rate.

To understand the physics, let us first consider what happens when we have an unstable particle in non-relativistic quantum mechanics (see Landau and Lifshitz, vol. III (1977), Section 134). When we study the Schrödinger equation in three spatial dimensions, we obtain real eigenvalues for the energy operator under the assumption that the wave function vanishes at infinity. For a decay process, instead, we have an outgoing spherical wave at infinity, and since this boundary condition is complex, the eigenvalues of the Hamiltonian are also complex. If we write them in the form

$$E = E_0 - i\,\frac{\Gamma}{2} \tag{6.50}$$

the time-dependence of the wave function is

$$\psi \sim e^{-iEt} = e^{-iE_0 t - \frac{\Gamma}{2}t} . \tag{6.51}$$

Therefore the probability $|\psi|^2$ decays as exp(−Γt) and we see that 1/Γ is the lifetime, and therefore Γ is just the decay rate discussed in Section 6.2. The fact that the eigenvalues of the Hamiltonian become complex when we have an unstable particle must happen also in relativistic quantum theory, and we must be able to read it from the Feynman graphs. Indeed, when we compute the loop corrections to the squared mass of the Φ field, we find that the graph in Fig. 6.5, considered as a function of M, develops an imaginary part when M > 2m, i.e. above the threshold for production of two physical particles of mass m in the intermediate state (see Exercise 6.6). Therefore eq. (6.49) is formally correct, but it is not true that M is real. Rather, separating the real and imaginary parts, we have

$$M = M_R - i\,\frac{\Gamma}{2} , \tag{6.52}$$

where $M_R$ is the renormalized mass.

Fig. 6.5 The one-loop correction to the mass of Φ.

For the simple Lagrangian that we have considered, it is straightforward to verify that the imaginary part of M is indeed equal to −Γ/2, with Γ the decay rate of the process $\Phi \to \phi_1\phi_2$. We can just compute explicitly the imaginary part of the graph in Fig. 6.5 and compare it with the decay rate computed from the graph in Fig. 6.3, using the general formulas for the decay rate given in Sections 6.2 and 6.4. The optical theorem states that this is a general result, and the imaginary part of M is always equal to −Γ/2, where Γ is the total decay rate (if there are many possible decay channels they all contribute to the imaginary part).

The case $\Gamma \ll M_R$ is particularly interesting; this means that the intermediate particle has a lifetime 1/Γ much bigger than the time that light takes to travel a distance equal to its Compton wavelength $1/M_R$. In this case it makes sense to consider it as a real intermediate state, which is produced in the collision, lives for some time and then decays. We then call this intermediate particle a resonance. In this limit we can approximate $M^2 = (M_R - i\Gamma/2)^2 \simeq M_R^2 - iM_R\Gamma$ and the amplitude (6.49) becomes

$$i\mathcal{M}_{2\to 2} \simeq \frac{-ig^2}{E^2 - M_R^2 + iM_R\Gamma} , \tag{6.53}$$
with E the total CM energy. We see that, thanks to the imaginary contribution, at $E = M_R$ the amplitude is no longer divergent. However, it is much larger than far from the resonance. In fact, when we are far from the resonance, i.e. when $E = cM_R$ with c a numerical constant not too close to one, the modulus of the amplitude is of order $g^2/M_R^2$. Instead, at the resonance, it becomes of order $g^2/(M_R\Gamma)$. Since $\Gamma \ll M_R$, this is a much bigger value than far from the resonance. At $E \simeq M_R$ we can further approximate $E^2 - M_R^2 = (E - M_R)(E + M_R) \simeq 2M_R(E - M_R)$ and the amplitude becomes

$$i\mathcal{M}_{2\to 2} \simeq \frac{-ig^2}{2M_R}\, \frac{1}{E - M_R + i(\Gamma/2)} . \tag{6.54}$$

We can now compute the elastic cross-section near the resonance, using eq. (6.44), and observing that the t- and u-channel amplitudes can be neglected since they are not resonant. Then

$$\frac{d\sigma}{d\Omega} \simeq \frac{g^4}{(16\pi)^2 M_R^4}\, \frac{1}{(E - M_R)^2 + (\Gamma^2/4)} . \tag{6.55}$$

The width Γ can also be easily computed explicitly using eq. (6.20) with $i\mathcal{M}_{\Phi\to\phi_1\phi_2} = -ig$ and the phase space (6.35). This gives

$$\Gamma = \frac{1}{2M_R}\, g^2\, \frac{1}{32\pi^2} \left(1 - \frac{4m^2}{M^2}\right)^{1/2} 4\pi = \frac{g^2}{16\pi M_R} \left(1 - \frac{4m^2}{M^2}\right)^{1/2} . \tag{6.56}$$

It is convenient to use this relation to eliminate g in favor of Γ from eq. (6.55), since Γ is the quantity directly observed, while g can be an effective coupling of no fundamental significance if the resonance is a bound state of more fundamental components. Integrating the cross-section over dΩ we find, at $E \simeq M_R$,

$$\sigma(E) \simeq \frac{4\pi}{M_R^2 - 4m^2}\, \frac{\Gamma^2}{(E - M_R)^2 + (\Gamma^2/4)} . \tag{6.57}$$

At $E = M_R$, conservation of energy gives $M_R = 2\sqrt{m^2 + \mathbf k^2}$, where $\mathbf k$ is the momentum of the final particles in the CM. Therefore $M_R^2 - 4m^2 = 4\mathbf k^2$, and we can rewrite eq. (6.57) as

$$\sigma(E) \simeq \frac{\pi}{|\mathbf k|^2}\, \frac{\Gamma^2}{(E - M_R)^2 + (\Gamma^2/4)} . \tag{6.58}$$
This is the Breit–Wigner distribution, when the initial and final states of the process are the same. We can generalize it by observing that, if the initial and final states are different, the factor $\Gamma^2$ in the numerator is replaced by $\Gamma_{R\to i}\Gamma_{R\to f}$, where $\Gamma_{R\to i}$ and $\Gamma_{R\to f}$ are the decay rates of the resonance $R$ into the initial and final states, respectively. This happens simply because the factor of $g^2$ in the numerator becomes $g_{Ri}\,g_{Rf}$, where $g_{Ri}$ is the effective coupling of the initial state to the resonance $R$, and $g_{Rf}$ is the effective coupling to the final state. The factor $\Gamma^2$ in the denominator, instead, still contains the total decay rate.

So far we have limited ourselves to the case where the initial and final particles, described by $\phi_1$, $\phi_2$, are scalars, and we have also assumed that the resonance $\Phi$ is a scalar. If instead the resonance has spin $J$ we must sum over the $2J+1$ possible spin states of the resonance, and we therefore have an overall factor $2J+1$. Furthermore, if the two initial particles have spin $s_a$ and $s_b$, and we know their spin state, we simply use the partial width $\Gamma_{R\to i}$ for these spin states. If, as is more common, we do not know the initial polarizations, we average over the initial spins inserting a factor
$$\frac{1}{(2s_a+1)(2s_b+1)}\sum_{s_a,s_b}. \tag{6.59}$$
We reabsorb $\sum_{s_a,s_b}$ in $\Gamma_{R\to i}$, which therefore becomes the width for the process $R \to i$ summed over all possible spin configurations of the state $i$. Similarly, we sum over the final polarizations, reabsorbing the sum in the redefinition of $\Gamma_{R\to f}$. In conclusion, the general form of the Breit–Wigner distribution is
$$\sigma(E) \simeq \frac{2J+1}{(2s_a+1)(2s_b+1)}\,\frac{\pi}{k^2}\,\frac{\Gamma_{R\to i}\,\Gamma_{R\to f}}{(E - M_R)^2 + (\Gamma^2/4)}\,. \tag{6.60}$$
As an example, consider the scattering
$$e^+e^- \to Z^0 \to f\bar{f} \tag{6.61}$$
where $e^\pm$ are the electron and positron, $Z^0$ is one of the vector bosons of the Standard Model, and $f, \bar{f}$ a fermion–antifermion pair. In this case $s_a = s_b = 1/2$ while $J = 1$. Since $M_Z \gg m_e$, $M_Z^2 \simeq 4k^2$ and eq. (6.60) gives, for the cross-section at the $Z^0$ peak,
$$\sigma_{\rm peak} = \frac{3}{2\cdot 2}\,\frac{4\pi}{M_Z^2}\,\frac{\Gamma(Z^0\to e^+e^-)\,\Gamma(Z^0\to f\bar{f})}{\Gamma_Z^2/4} = \frac{12\pi}{M_Z^2}\,\frac{\Gamma(Z^0\to e^+e^-)\,\Gamma(Z^0\to f\bar{f})}{\Gamma_Z^2}\,. \tag{6.62}$$
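As a quick numerical cross-check of eqs. (6.60) and (6.62), the following minimal Python sketch evaluates the general Breit–Wigner formula at $E = M_Z$ with $k = M_Z/2$ and compares it with the peak formula. The function names and the input values (approximate $Z^0$ mass and widths in GeV, and the conversion factor $(\hbar c)^2 \simeq 0.3894\ {\rm mb\,GeV^2}$) are assumptions introduced here for illustration; they are not taken from the text, and the result is only the tree-level resonant estimate.

```python
import math

def breit_wigner(E, M_R, Gamma_tot, Gamma_i, Gamma_f, J, s_a, s_b, k):
    """Eq. (6.60): resonant cross-section in natural units (energy^-2)."""
    spin_factor = (2 * J + 1) / ((2 * s_a + 1) * (2 * s_b + 1))
    lorentzian = Gamma_i * Gamma_f / ((E - M_R) ** 2 + Gamma_tot ** 2 / 4)
    return spin_factor * math.pi / k ** 2 * lorentzian

def z_peak(M_Z, Gamma_Z, Gamma_ee, Gamma_ff):
    """Eq. (6.62): peak cross-section for e+ e- -> Z0 -> f fbar."""
    return 12 * math.pi * Gamma_ee * Gamma_ff / (M_Z ** 2 * Gamma_Z ** 2)

# Illustrative inputs in GeV (approximate values, assumed for this example only).
M_Z, GAMMA_Z, GAMMA_EE, GAMMA_HAD = 91.19, 2.495, 0.0839, 1.744
GEV2_TO_NB = 0.3894e6  # (hbar c)^2 ~ 0.3894 mb GeV^2, expressed in nb GeV^2

# General formula at the peak, with k = M_Z / 2 (electron mass neglected).
sigma_general = breit_wigner(M_Z, M_Z, GAMMA_Z, GAMMA_EE, GAMMA_HAD,
                             1, 0.5, 0.5, M_Z / 2)
sigma_peak = z_peak(M_Z, GAMMA_Z, GAMMA_EE, GAMMA_HAD)
print(f"peak cross-section ~ {sigma_peak * GEV2_TO_NB:.1f} nb")
```

The two expressions agree by construction; evaluating `breit_wigner` at $E = M_Z \pm \Gamma_Z/2$ (with $k$ held fixed) gives half of the peak value, which is why $\Gamma$ is the full width at half maximum of the resonance curve.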
6.6 Born approximation and nonrelativistic scattering
In the nonrelativistic limit, the computations performed with the Feynman diagrams must reproduce the results of nonrelativistic quantum mechanics, where the interaction between particles is described by a potential $V(\mathbf{x})$. The question that we want to answer in this section is the following: given the field theory Lagrangian, what is the potential $V(\mathbf{x})$ experienced by the particles in the nonrelativistic limit? We begin by recalling the basic formulas of scattering theory in nonrelativistic quantum mechanics (see e.g. Landau and Lifshitz, vol. III (1977), Section 126): the elastic scattering cross-section for a particle of mass $m$ in the potential $V(\mathbf{x})$ has the general form
$$\frac{d\sigma}{d\Omega} = |f(\theta)|^2, \tag{6.63}$$
where $\theta$ is the scattering angle. The scattering amplitude $f(\theta)$ can be computed considering $V$ as a small perturbation of the free Hamiltonian. The result to first order in $V$ is called the Born approximation, and is given by
$$f(\theta) = -\frac{m}{2\pi}\int d^3x\, e^{-i\mathbf{q}\cdot\mathbf{x}}\, V(\mathbf{x})\,. \tag{6.64}$$
We denote the initial momentum by $\mathbf{k}$, the final momentum by $\mathbf{k}'$ (with $|\mathbf{k}| = |\mathbf{k}'|$ since we consider an elastic process) and the transferred momentum by $\mathbf{q} = \mathbf{k}' - \mathbf{k}$. The scattering angle $\theta$ is related to $q = |\mathbf{q}|$ and to $k = |\mathbf{k}|$ by $q = 2k\sin(\theta/2)$. In a central potential $V(r)$ the angular integral in eq. (6.64) is easily performed explicitly, and
$$f(\theta) = -\frac{2m}{q}\int_0^\infty dr\, r\,V(r)\sin qr\,. \tag{6.65}$$
Let us compare these results with the relativistic formalism that we have developed. For definiteness, we consider the scattering of a nonrelativistic particle of momentum $\mathbf{k}$ and mass $m$, with $|\mathbf{k}| \ll m$, off a heavy target $A$, with mass $M_A \gg m$. We can think for instance of an electron scattering off an atom. Since $|\mathbf{k}| \ll m \ll M_A$, we can neglect the recoil of the atom. We limit ourselves to elastic scattering (see footnote 2). We assume that the incoming particle and the particle $A$ interact through the exchange of a massless or massive boson; it would be a photon in the electron–atom case, but we can treat similarly the exchange of a massive vector particle, or of a scalar particle. At tree level, the interaction is described by the Feynman diagram in Fig. 6.6a. The fact that we neglect the recoil of the scattering center is represented by writing the Feynman diagram as in Fig. 6.6b (see footnote 3).
Footnote 2. The case where the atom is left in an excited state, and therefore the collision is inelastic, is discussed in Problem 6.2.
Footnote 3. We are considering a theory in which there is a (light particle)–(light particle)–boson vertex and a (heavy particle)–(heavy particle)–boson vertex, but no (light particle)–(heavy particle)–boson vertex, so there are no s-channel and u-channel amplitudes, but only the t-channel amplitude of Fig. 6.6a.
Fig. 6.6 (a): the scattering of a light particle off a heavy target. In the limit in which we neglect the recoil of the target, the graph is drawn as in (b), which represents more generally the scattering of a particle in an external potential.
We start from our basic formula for elastic scattering, eq. (6.44). By assumption $M_A$ is much larger than the electron energy, so $s \simeq M_A^2$ and eq. (6.44) becomes
$$d\sigma_{\rm elas} \simeq \frac{1}{64\pi^2 M_A^2}\,|\mathcal{M}_{fi}|^2\, d\Omega\,. \tag{6.66}$$
Since we want to compare with the nonrelativistic equations, it is convenient to use the nonrelativistic normalization of the matrix element. We denote by $|\mathbf{k}, A\rangle$ a state with an electron with momentum $\mathbf{k}$ and the atom in a state labeled by $A$. According to eq. (6.7) we have
$$|\mathbf{k}, A\rangle^{(R)} = (2E_k)^{1/2}(2E_A)^{1/2}\,|\mathbf{k}, A\rangle^{(NR)} \simeq (2m)^{1/2}(2M_A)^{1/2}\,|\mathbf{k}, A\rangle^{(NR)}. \tag{6.67}$$
We work in the rest frame of the atom, so $E_A = M_A$, and we have used the fact that the incoming particle is nonrelativistic, so $E_k \simeq m$. For notational convenience we have also set the spatial volume $V$ equal to one, since we have already checked explicitly in Section 6.1 that the volume factors cancel in the final expression for the cross-section. As in Section 6.1, we denote by $M_{fi}$ the matrix element with nonrelativistic normalization, to distinguish it from the relativistically normalized $\mathcal{M}_{fi}$. Then, from eq. (6.67),
$$\mathcal{M}_{fi} = \left(\langle \mathbf{k}', A|T_{fi}|\mathbf{k}, A\rangle\right)^{(R)} \simeq (2m)(2M_A)\, M_{fi}\,, \tag{6.68}$$
and eq. (6.66) becomes
$$\frac{d\sigma_{\rm elas}}{d\Omega} \simeq \left(\frac{m}{2\pi}\right)^2 |M_{fi}|^2\,. \tag{6.69}$$
Comparing with eqs. (6.63) and (6.64) we see that we can identify the nonrelativistic scattering amplitude $f(\theta)$ with $(m/2\pi)M_{fi}$ (see footnote 4). We therefore arrive at the following conclusion. The interaction potential is an intrinsically nonrelativistic concept, since it describes an instantaneous (rather than a retarded) interaction. In QFT, in the fully relativistic regime, it makes no sense. However, in the nonrelativistic limit, it is recovered with the following procedure:

1) Compute the scattering amplitude $\mathcal{M}_{fi}$ using the Feynman diagrams.

2) Transform it into the amplitude with nonrelativistic normalization of the states, $M_{fi}$. This is related to the nonrelativistic scattering amplitude $f(\theta)$ by $M_{fi} = (2\pi/m)f(\theta)$. Using eq. (6.64),
$$M_{fi}(\mathbf{q}) = -\int d^3x\, e^{-i\mathbf{q}\cdot\mathbf{x}}\, V(\mathbf{x})\,, \tag{6.70}$$
or, inverting the Fourier transform,
$$V(\mathbf{x}) = -\int \frac{d^3q}{(2\pi)^3}\, M_{fi}(\mathbf{q})\, e^{i\mathbf{q}\cdot\mathbf{x}}\,. \tag{6.71}$$

Footnote 4. Comparing the cross-sections, the relative phase between $f(\theta)$ and $M_{fi}$ remains undetermined. The correct phase is a plus sign, as we have written, and can be found either comparing directly the amplitudes, or checking that in this way one obtains the Coulomb potential with the correct sign, as we will do below.
We apply this formula to the electromagnetic scattering of an electron with charge $e$ off a positively charged ion with charge $-Ze$ and spin 1/2, and we consider the case where the initial and final spin states are equal, so we will not write the spin labels explicitly. The vertex factor associated to the electron is $-ie\gamma^\nu$ while the vertex associated to the ion is $+iZe\gamma^\mu$. Therefore the Feynman diagram of Fig. 6.6a gives
$$i\mathcal{M}_{fi} = [\bar{u}_A(+iZe\gamma^\mu)u_A]\,\frac{-i\eta_{\mu\nu}}{q^2}\,[\bar{u}_e(-ie\gamma^\nu)u_e]\,, \tag{6.72}$$
where $u_A$ is the wave function of the ion and $u_e$ of the electron. The momentum transferred by the photon is $q^\mu = (q^0, \mathbf{q})$ and, since we are considering elastic scattering, we have $q^0 = 0$ and $q^2 = -\mathbf{q}^2$. Then
$$\mathcal{M}_{fi} = \frac{Ze^2}{\mathbf{q}^2}\,(\bar{u}_A\gamma^\mu u_A)\,(\bar{u}_e\gamma_\mu u_e)\,. \tag{6.73}$$
Since the particle $A$ is at rest, it is convenient to use the standard representation for the $\gamma$ matrices, as discussed in Sections 3.4.2 and 3.6. For a particle with mass $M_A$ at rest we found $u_L = u_R = \sqrt{M_A}\,\xi$, with $\xi^\dagger\xi = 1$, see eq. (3.103). As we found in eq. (3.95), the spinor in the standard representation is given in terms of $u_L$, $u_R$ by
$$u = \frac{1}{\sqrt{2}}\begin{pmatrix} u_R + u_L \\ u_R - u_L \end{pmatrix}, \tag{6.74}$$
so, for the particle $A$ at rest,
$$u_A = \sqrt{2M_A}\begin{pmatrix} \xi \\ 0 \end{pmatrix}. \tag{6.75}$$
Then $\bar{u}_A\gamma^0 u_A = u_A^\dagger u_A = 2M_A\,\xi^\dagger\xi = 2M_A$ and, from the form of the $\gamma$ matrices in the standard representation, eq. (3.96), $\bar{u}_A\gamma^i u_A = 0$. Therefore
$$(\bar{u}_A\gamma^\mu u_A)\,(\bar{u}_e\gamma_\mu u_e) = \bar{u}_A\gamma^0 u_A\,(\bar{u}_e\gamma_0 u_e) \simeq (2M_A)(2m_e)\,, \tag{6.76}$$
where we have set $\bar{u}_e\gamma_0 u_e \simeq 2m_e$ since the electron is nonrelativistic. We see that the contribution of the wave functions on the external lines is just what is needed to convert $\mathcal{M}_{fi}$ into $M_{fi}$, and we find
$$M_{fi}(\mathbf{q}) = \frac{Ze^2}{\mathbf{q}^2}\,. \tag{6.77}$$
Using eq. (6.71) and performing the angular integration similarly to eq. (6.65), the potential is therefore given by
$$V(\mathbf{x}) = -\int \frac{d^3q}{(2\pi)^3}\, M_{fi}(\mathbf{q})\, e^{i\mathbf{q}\cdot\mathbf{x}} = -\frac{4\pi}{(2\pi)^3 r}\int_0^\infty dq\, q\, M_{fi}(q)\sin qr\,, \tag{6.78}$$
where here we have used the notation $q = |\mathbf{q}|$. The integral is performed using the identity $\int_0^\infty dx\,(\sin x)/x = \pi/2$, and we finally find
$$V(r) = -\frac{Z\alpha}{r}\,, \tag{6.79}$$
which is the standard Coulomb potential. The same calculation can be performed if we consider the exchange of a massive boson of mass $\mu$. The result is now proportional to the Fourier transform of the massive propagator (see footnote 5). For elastic scattering we have $q^0 = 0$ and therefore $q^2 = (q^0)^2 - \mathbf{q}^2 = -\mathbf{q}^2$, so
$$\frac{1}{q^2 - \mu^2} = -\frac{1}{\mathbf{q}^2 + \mu^2}\,. \tag{6.80}$$
Therefore the $r$-dependence of the potential is now given by
$$\int \frac{d^3q}{(2\pi)^3}\,\frac{1}{\mathbf{q}^2 + \mu^2}\, e^{i\mathbf{q}\cdot\mathbf{x}} \sim \frac{1}{r}\, e^{-\mu r}\,. \tag{6.81}$$

Footnote 5. Actually, in the case of a massive gauge boson the propagator is not proportional to $\eta_{\mu\nu}$ but to $\eta_{\mu\nu} - (q_\mu q_\nu/m^2)$, see eq. (8.26). The reader can verify that the additional term $q_\mu q_\nu/m^2$ in our case gives zero since $q = p' - p$ and then $q_\mu\,\bar{u}(p')\gamma^\mu u(p) = 0$, using the fact that the spinors $u(p)$, $\bar{u}(p')$ are solutions of the Dirac equation.
A potential of this form is called a Yukawa potential. This is a short-range potential, with a characteristic interaction range equal to the Compton wavelength $1/\mu$ of the particle exchanged. We conclude that:

exchange of a massless boson ↔ long-range Coulomb potential
exchange of a massive boson ↔ short-range Yukawa potential.

As for the sign of the potential, we have seen that in the case of the exchange of a vector boson only the component $\mu = \nu = 0$ of the propagator contributes, because in the nonrelativistic limit only the $\mu = 0$ component survives in $\bar{u}\gamma^\mu u$. Therefore the factor coming from the propagator is
$$D_{00} = \frac{-i\eta_{00}}{q^2 - \mu^2} = +\frac{i}{|\mathbf{q}|^2 + \mu^2}\,. \tag{6.82}$$
For the exchange of a scalar particle instead the factor coming from the propagator is
$$D = \frac{i}{q^2 - \mu^2} = -\frac{i}{|\mathbf{q}|^2 + \mu^2}\,. \tag{6.83}$$
The interaction mediated by a vector particle is repulsive for particles with the same charge and attractive for particles with opposite charge, as we have checked in the computation above. Comparing eqs. (6.82) and (6.83), we see that the interaction mediated by a scalar particle is instead attractive for particles with the same charge. In fact, the strong interaction between nucleons, at distances larger than the fermi, can be thought of as mediated by the pion, which is scalar and massive, and therefore the strong interaction between nucleons is attractive, and short-ranged.

More complicated potentials can be obtained with the simultaneous exchange of more than one boson. For instance, in the language of Feynman diagrams, van der Waals forces at large distances arise from the exchange of two photons between atoms which are electrically neutral but have an electric dipole moment, see Landau and Lifshitz, vol. IV (1982), Section 85.
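The correspondence between a Yukawa potential and the massive propagator can be checked numerically by running the argument backwards: starting from $V(r) = -(g^2/4\pi)\,e^{-\mu r}/r$, the Born formula (6.65) together with $M_{fi} = (2\pi/m)f(\theta)$ should return $g^2/(\mathbf{q}^2 + \mu^2)$, which reduces to eq. (6.77) for $\mu \to 0$ and $g^2 = Ze^2$. The sketch below is a minimal illustration in plain Python; the function names, the coupling label `g2` and the Simpson-rule parameters are assumptions made here, not part of the text.

```python
import math

def born_radial_integral(mu, q, n=20000):
    """Simpson estimate of int_0^inf dr r * (exp(-mu r)/r) * sin(q r)."""
    r_max = 40.0 / mu          # beyond this the integrand is suppressed by exp(-40)
    h = r_max / n              # n must be even for Simpson's rule
    f = lambda r: math.exp(-mu * r) * math.sin(q * r)
    total = f(0.0) + f(r_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

def amplitude_from_potential(g2, mu, q):
    """M_fi(q) = (2 pi / m) f(theta), with f from eq. (6.65) and
    V(r) = -(g2 / 4 pi) exp(-mu r) / r; the mass m cancels."""
    return (4 * math.pi / q) * (g2 / (4 * math.pi)) * born_radial_integral(mu, q)

def amplitude_from_propagator(g2, mu, q):
    """Expected result g2 / (q^2 + mu^2), cf. eqs. (6.77) and (6.80)."""
    return g2 / (q ** 2 + mu ** 2)

if __name__ == "__main__":
    # Arbitrary illustrative values in natural units.
    for g2, mu, q in [(1.0, 1.0, 2.0), (0.3, 0.5, 3.0)]:
        print(amplitude_from_potential(g2, mu, q),
              amplitude_from_propagator(g2, mu, q))
```

Working in this direction avoids the slowly convergent oscillatory integral over $q$ in eq. (6.78); the radial integral converges exponentially thanks to the factor $e^{-\mu r}$.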
6.7 Solved problems
Problem 6.1. Three-body kinematics and phase space

In this problem we investigate various aspects of the kinematics of three-body final states. Consider the decay of a particle with four-momentum $p$ in its rest frame, with $p = (M, \mathbf{0})$, into three particles with momenta $p_i$ and masses $m_i$, $i = 1, 2, 3$. Let $E_1, E_2, E_3$ be the energies of the decay products in the rest frame of the decaying particle. The differential three-body phase space is
$$d\Phi^{(3)} = (2\pi)^4\,\delta^{(4)}(p_1 + p_2 + p_3 - p)\,\frac{d^3p_1}{(2\pi)^3 2E_1}\,\frac{d^3p_2}{(2\pi)^3 2E_2}\,\frac{d^3p_3}{(2\pi)^3 2E_3}\,, \tag{6.84}$$
where $E_i^2 = \mathbf{p}_i^2 + m_i^2$. The Dirac delta can be used to integrate over the spatial momentum $\mathbf{p}_3$, so that
$$d\Phi^{(3)} = \frac{1}{8(2\pi)^5}\,\frac{d^3p_1\, d^3p_2}{E_1E_2E_3}\,\delta(E_1 + E_2 + E_3 - M)\,, \tag{6.85}$$
where $E_3 = (\mathbf{p}_3^2 + m_3^2)^{1/2}$ and now $\mathbf{p}_3$ is a notation for $-(\mathbf{p}_1 + \mathbf{p}_2)$. The matrix elements $|\mathcal{M}_{fi}|^2$ must be integrated with this measure. To proceed further, we must know the dependence of $|\mathcal{M}_{fi}|^2$ on $\mathbf{p}_1$, $\mathbf{p}_2$. The simplest case is the decay of a spin-0 particle into spin-0 particles, in the absence of external fields. In this case there is no preferred direction in space, and the matrix element can only depend on the angle $\theta$ between $\mathbf{p}_1$ and $\mathbf{p}_2$. Then, writing $p_i = |\mathbf{p}_i|$, eq. (6.85) becomes
$$d\Phi^{(3)} = \frac{1}{8(2\pi)^5}\,\frac{4\pi p_1^2\,dp_1\; 2\pi p_2^2\,dp_2\; d\cos\theta}{E_1E_2E_3}\,\delta(E_1 + E_2 + E_3 - M) = \frac{1}{32\pi^3}\,\frac{(p_1dp_1)\,(p_2dp_2)\,(p_1p_2\,d\cos\theta)}{E_1E_2E_3}\,\delta(E_1 + E_2 + E_3 - M)\,. \tag{6.86}$$
Now we use the identity $E_1dE_1 = p_1dp_1$, which follows from $E_1^2 = p_1^2 + m_1^2$, and similarly $E_2dE_2 = p_2dp_2$. Furthermore,
$$E_3^2 = (\mathbf{p}_1 + \mathbf{p}_2)^2 + m_3^2 = p_1^2 + p_2^2 + 2p_1p_2\cos\theta + m_3^2\,. \tag{6.87}$$
Therefore, at $p_1$, $p_2$ fixed, we have $E_3dE_3 = p_1p_2\,d\cos\theta$. In eq. (6.86) it is then convenient to perform the integration in $d\cos\theta$ as the innermost, so we can rewrite eq. (6.86) as
$$d\Phi^{(3)} = \frac{1}{32\pi^3}\,dE_1\,dE_2\,dE_3\,\delta(E_1 + E_2 + E_3 - M)\,, \tag{6.88}$$
and use the Dirac delta to eliminate $E_3$. In conclusion, for spin-0 particles, and in the absence of external fields,
$$d\Phi^{(3)} = \frac{1}{32\pi^3}\,dE_1\,dE_2\,. \tag{6.89}$$
Of course this expression is valid only in the region of the $(E_1, E_2)$ plane where energy–momentum conservation is satisfied, otherwise the Dirac delta gives zero. To determine this region, we first introduce the Mandelstam variables $s, t, u$ for the decay of a particle with four-momentum $p$ into three particles with four-momenta $p_1, p_2, p_3$,
$$s = (p - p_1)^2 = (p_2 + p_3)^2\,, \tag{6.90}$$
$$t = (p - p_2)^2 = (p_1 + p_3)^2\,, \tag{6.91}$$
$$u = (p - p_3)^2 = (p_1 + p_2)^2\,. \tag{6.92}$$
These variables are Lorentz invariant by definition, and can be written in terms of the energies $E_1, E_2, E_3$ in the rest frame of the decaying particle as
$$s = M^2 + m_1^2 - 2ME_1\,, \tag{6.93}$$
$$t = M^2 + m_2^2 - 2ME_2\,, \tag{6.94}$$
$$u = M^2 + m_3^2 - 2ME_3\,. \tag{6.95}$$
$s$ is also called the invariant mass of the (2, 3) pair, and is denoted also as $m_{23}^2$, and similarly $t = m_{13}^2$ and $u = m_{12}^2$. From $E_1 + E_2 + E_3 = M$ it follows that
$$s + t + u = M^2 + m_1^2 + m_2^2 + m_3^2\,. \tag{6.96}$$
Fig. 6.7 The allowed region of phase space (shaded area) when two final particles are massless. Axes: $s$ (horizontal), $t$ (vertical); the boundary curves are $s + t = M^2 + m^2$ and $st = M^2m^2$.

Fig. 6.8 The Dalitz plot when all three final particles are massless becomes a triangle. Axes: $s$ (horizontal), $t$ (vertical); the boundary line is $s + t = M^2$.

Therefore the three Mandelstam variables are not independent. We choose $s$ and $t$ as the independent variables. Since $ds = -2M\,dE_1$ and $dt = -2M\,dE_2$, the phase space can be rewritten as
$$d\Phi^{(3)} = \frac{ds\,dt}{16M^2(2\pi)^3}\,. \tag{6.97}$$
We now find the kinematical limits on $s$, $t$. First of all $s$ attains its maximum value when $E_1$ has its minimum value, see eq. (6.93), i.e. when $E_1 = m_1$, so $s_{\max} = (M - m_1)^2$. This corresponds to the configuration in which, in the rest frame of the decaying particle, the initial particle decays into particle 1 at rest, while particles 2 and 3 have opposite momenta, with the modulus of momentum fixed by energy conservation. The minimum value of $s$ is found instead writing $s = (p_2 + p_3)^2 = m_2^2 + m_3^2 + 2(E_2E_3 - \mathbf{p}_2\cdot\mathbf{p}_3)$. Since $s$ is invariant, we can compute it in the frame that we prefer. In the CM of the pair (2, 3) we have $\mathbf{p}_3 = -\mathbf{p}_2$, and in this frame $s = m_2^2 + m_3^2 + 2(E_2E_3 + |\mathbf{p}_2|\cdot|\mathbf{p}_3|)$, which shows that the minimum value is obtained, in this frame, for $\mathbf{p}_2 = \mathbf{p}_3 = 0$, so that $E_2 = m_2$, $E_3 = m_3$, and $s = (m_2 + m_3)^2$. Since $s$ is Lorentz invariant, this is the minimum value in any frame. In conclusion, the limits on $s$ are
$$(m_2 + m_3)^2 \leq s \leq (M - m_1)^2\,. \tag{6.98}$$
Now, fixing $s$ within these limits, we look for the limits on $t$. We therefore look for a relation which expresses the conservation of energy and momentum and which is written only in terms of $s$ and $t$. We start from $E_3^2 = \mathbf{p}_3^2 + m_3^2$ and we use the conservation of energy, $E_3 = M - E_1 - E_2$, and of momentum, $\mathbf{p}_3 = -\mathbf{p}_1 - \mathbf{p}_2$, to write
$$(M - E_1 - E_2)^2 = m_3^2 + \mathbf{p}_1^2 + \mathbf{p}_2^2 + 2\,\mathbf{p}_1\cdot\mathbf{p}_2\,. \tag{6.99}$$
The limiting cases correspond to
$$\mathbf{p}_1\cdot\mathbf{p}_2 = \pm|\mathbf{p}_1|\cdot|\mathbf{p}_2| = \pm\sqrt{(E_1^2 - m_1^2)(E_2^2 - m_2^2)}\,. \tag{6.100}$$
Inserting this into eq. (6.99), we find that the limiting curve in the $(s, t)$ plane is given by
$$M^2 + 2E_1E_2 + m_1^2 + m_2^2 - m_3^2 - 2M(E_1 + E_2) = \pm 2\sqrt{(E_1^2 - m_1^2)(E_2^2 - m_2^2)}\,. \tag{6.101}$$
Using eqs. (6.93) and (6.94) we can eliminate $E_1$, $E_2$ in favor of $s$, $t$,
$$E_1 = \frac{M^2 + m_1^2 - s}{2M}\,, \qquad E_2 = \frac{M^2 + m_2^2 - t}{2M}\,. \tag{6.102}$$

Fig. 6.9 The generic form of the Dalitz plot when all three final particles are massive. Axes: $s$ (horizontal), $t$ (vertical).
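The boundary condition (6.101), together with eq. (6.102), translates directly into a test of whether a point $(s, t)$ lies inside the kinematically allowed region. The following minimal Python sketch implements this test by requiring $|\cos\theta| \leq 1$ in eq. (6.99); the function names and the sample masses are hypothetical and chosen only for illustration. The extra requirement $E_3 \geq m_3$ is added here because squaring eq. (6.101) would otherwise also admit the unphysical branch with $M - E_1 - E_2 < 0$.

```python
def energies_from_invariants(s, t, M, m1, m2):
    """Eq. (6.102): rest-frame energies of particles 1 and 2."""
    E1 = (M**2 + m1**2 - s) / (2 * M)
    E2 = (M**2 + m2**2 - t) / (2 * M)
    return E1, E2

def inside_dalitz_region(s, t, M, m1, m2, m3):
    """True if (s, t) is allowed by energy-momentum conservation."""
    E1, E2 = energies_from_invariants(s, t, M, m1, m2)
    E3 = M - E1 - E2
    # Each energy must be at least the corresponding mass.
    if E1 < m1 or E2 < m2 or E3 < m3:
        return False
    # Squared form of eq. (6.101): |p1.p2| <= |p1||p2|.
    lhs = M**2 + 2 * E1 * E2 + m1**2 + m2**2 - m3**2 - 2 * M * (E1 + E2)
    rhs_sq = 4 * (E1**2 - m1**2) * (E2**2 - m2**2)
    return lhs**2 <= rhs_sq

if __name__ == "__main__":
    # Hypothetical example: M = 1, two massless particles and m3 = 0.5.
    M, m = 1.0, 0.5
    for s, t in [(0.6, 0.6), (0.3, 0.3), (0.9, 0.9)]:
        print((s, t), inside_dalitz_region(s, t, M, 0.0, 0.0, m))
```

For $m_1 = m_2 = 0$ the test reduces to the two conditions $s + t \leq M^2 + m^2$ and $st \geq m^2M^2$, so scanning a grid of $(s, t)$ values with this function is a simple way to draw the allowed region.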
We examine first this curve in the limiting case $m_1 = m_2 = 0$ (we then denote $m_3$ simply by $m$). The curve with the plus sign becomes simply $t + s = M^2 + m^2$ while that with the minus sign becomes $st = m^2M^2$. If $m \neq 0$, the resulting region of the $(s, t)$ plane is shown in Fig. 6.7, while if even $m = 0$ the area degenerates to a triangle, see Fig. 6.8. The plot of the phase space region allowed by energy–momentum conservation is known as the Dalitz plot. If all three masses are different from zero the Dalitz plot has the generic form shown in Fig. 6.9. Observe that the number of cusps in the limiting curve is equal to the number of massless final particles.

The usefulness of this representation is that the phase space is uniform in the Dalitz plot, see eq. (6.97), and therefore any nonuniformity in the distribution of events is due to the matrix element. This allows us to identify immediately possible resonances. Suppose for instance that the decay of the initial particle proceeds through an intermediate resonance that subsequently decays into particles 2 and 3 as in Fig. 6.10. As we saw in Section 6.5, the process will be greatly enhanced when the kinematic invariant $\sqrt{s}$ (i.e. the invariant mass $m_{23}$ of the (2, 3) pair) is equal to the mass $m_R$ of the resonance. Therefore the distribution of the experimental events will be mostly localized in a band corresponding to this value of $m_{23}$, rather than being distributed more or less uniformly over the whole Dalitz plot, and it might look as in Fig. 6.11. This is the way in which many resonances are discovered. For example, the $D^0$ meson is a particle with mass $m_D = 1864.6 \pm 0.5$ MeV, spin zero and a lifetime $\tau = (410.3 \pm 1.5)\times 10^{-15}$ s. Among its decay modes, one finds a three-body decay $D^0 \to K^-\pi^+\pi^0$. Displaying the various events collected by the detector on a Dalitz plot, one finds a band of the type shown in Fig. 6.11 when on the horizontal axis we plot the invariant mass of the $K^-\pi^+$ system, and the band is localized at $m^2_{K^-\pi^+} \simeq (892\ {\rm MeV})^2$. This shows that the process goes through a resonance, known as $K^*(892)^0$, i.e. $D^0 \to K^*(892)^0\pi^0$ and subsequently $K^*(892)^0 \to K^-\pi^+$.

The same considerations can be applied to a scattering process of two particles into three particles. In this case the initial state, in the CM, has four-momentum $p = (E_{\rm CM}, \mathbf{0})$, where $E_{\rm CM}$ is the total energy in the CM, and all considerations above go through with the replacement $M \to E_{\rm CM}$. So, for instance, in the scattering process of kaons on protons, $K^-p \to \Lambda^0\pi^+\pi^-$, with the kaon momentum of the order of a GeV and the protons at rest, one finds that the events are concentrated in two bands, as in Fig. 6.12, corresponding to the two processes $K^-p \to \Sigma^+\pi^-$, followed by $\Sigma^+ \to \Lambda^0\pi^+$, and $K^-p \to \Sigma^-\pi^+$, followed by $\Sigma^- \to \Lambda^0\pi^-$.

Fig. 6.10 A three-particle decay going through a resonant intermediate state.

Fig. 6.11 If the three-body decays proceed through a resonance of mass $m_R$, the experimental events are concentrated on a band around $s = m_R^2$, of width equal to the resonance width, rather than being distributed more or less uniformly in the Dalitz plot. Axes: $s$ (horizontal), $t$ (vertical).

Problem 6.2. Inelastic scattering of nonrelativistic electrons on atoms
In this problem we study the process in which a nonrelativistic electron scatters on an atom $A$, leaving the atom in an excited state $A^*$, i.e. $e^-A \to e^-A^*$. We start from eq. (6.43) for the inelastic cross-section. We work in the rest frame of the atom $A$, and we denote its mass by $M_A$. As in Section 6.6, we use $s \simeq M_A^2$, since the mass of the atom $A$ is much bigger than the electron energy. We use the matrix element with nonrelativistic normalization, see eq. (6.8), and therefore eq. (6.43) becomes
$$\frac{d\sigma}{d\Omega} = \left(\frac{m}{2\pi}\right)^2\frac{p'}{p}\,|M_{fi}|^2\,. \tag{6.103}$$
We use the notation $p = |\mathbf{p}|$, $p' = |\mathbf{p}'|$. We saw in Section 6.6 that, for elastic scattering,
$$M_{fi}(\mathbf{q}) = -\int d^3x\, e^{-i\mathbf{q}\cdot\mathbf{x}}\, V(\mathbf{x}) = -\langle\mathbf{p}'|\hat{V}|\mathbf{p}\rangle\,, \tag{6.104}$$

Fig. 6.12 The distribution of events in $K^-p \to \Lambda^0\pi^+\pi^-$, showing that the process goes through the resonances $\Sigma^\pm$. Axes: $m^2_{\Lambda\pi^+}$ and $m^2_{\Lambda\pi^-}$; the bands lie at $m^2_\Sigma$.
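where the hat denotes the quantum operator, and $|\mathbf{p}\rangle$, $|\mathbf{p}'\rangle$ are the incoming and outgoing electron states with nonrelativistic normalization, so $\langle\mathbf{x}|\mathbf{p}\rangle = e^{i\mathbf{p}\cdot\mathbf{x}}$, and $\mathbf{q} = \mathbf{p}' - \mathbf{p}$. We set all volume factors equal to one for simplicity, since they cancel anyhow at the end. The last equality in eq. (6.104) is easily proved inserting two complete sets of states,
$$\langle\mathbf{p}'|\hat{V}|\mathbf{p}\rangle = \int d^3x\, d^3x'\, \langle\mathbf{p}'|\mathbf{x}'\rangle\langle\mathbf{x}'|\hat{V}|\mathbf{x}\rangle\langle\mathbf{x}|\mathbf{p}\rangle\,, \tag{6.105}$$
and using $\langle\mathbf{x}'|\hat{V}|\mathbf{x}\rangle = V(\mathbf{x})\langle\mathbf{x}'|\mathbf{x}\rangle = V(\mathbf{x})\,\delta^{(3)}(\mathbf{x} - \mathbf{x}')$. In eq. (6.104) the atom is treated as an external scattering center, without internal dynamics. If we take into account the fact that the internal state of the atom changes, we must insert in the initial and final state also the atomic state, so for inelastic scattering eq. (6.104) must be replaced by
$$M_{fi} = -\langle\mathbf{p}'A^*|\hat{V}|\mathbf{p}A\rangle\,. \tag{6.106}$$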
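For simplicity, we limit ourselves to the Coulomb interaction of the incoming electrons with the atomic electrons and with the nucleus, neglecting all spin interactions, and the form factor of the nucleus. Then
$$\hat{V}(\mathbf{x}, \mathbf{x}_a) = -\alpha\left(\frac{Z}{|\mathbf{x}|} - \sum_{a=1}^{Z}\frac{1}{|\mathbf{x} - \mathbf{x}_a|}\right), \tag{6.107}$$
where $\mathbf{x}$ is the position operator of the incoming electron, $\mathbf{x}_a$ of the atomic electrons, and we have considered a neutral atom with $Z$ electrons. The wave function of the incoming electron is $e^{i\mathbf{p}\cdot\mathbf{x}}$ and of the outgoing electron is $e^{i\mathbf{p}'\cdot\mathbf{x}}$. Therefore, with $\mathbf{q} = \mathbf{p}' - \mathbf{p}$, the matrix element (6.106) becomes
$$M_{fi} = -\int d^3x\, e^{-i\mathbf{q}\cdot\mathbf{x}}\,\langle A^*|\hat{V}(\mathbf{x}, \mathbf{x}_a)|A\rangle = +\alpha\int d^3x\, e^{-i\mathbf{q}\cdot\mathbf{x}}\left[\frac{Z}{|\mathbf{x}|}\langle A^*|A\rangle - \langle A^*|\sum_{a=1}^{Z}\frac{1}{|\mathbf{x} - \mathbf{x}_a|}|A\rangle\right]. \tag{6.108}$$
We now observe that
$$\int d^3x\, e^{-i\mathbf{q}\cdot\mathbf{x}}\,\frac{1}{|\mathbf{x} - \mathbf{x}_a|} = e^{-i\mathbf{q}\cdot\mathbf{x}_a}\int d^3x\, e^{-i\mathbf{q}\cdot\mathbf{x}}\,\frac{1}{|\mathbf{x}|} \tag{6.109}$$
and we use
$$\int d^3x\, e^{-i\mathbf{q}\cdot\mathbf{x}}\,\frac{1}{|\mathbf{x}|} = \frac{4\pi}{q^2} \tag{6.110}$$
(this equality can be proved more easily adding a factor $e^{-\epsilon|\mathbf{x}|}$ in the integrand, to assure the convergence; in polar coordinates the integral is then elementary, and at the end we take the limit $\epsilon \to 0^+$). Therefore
$$M_{fi} = \frac{4\pi\alpha}{q^2}\left[Z\langle A^*|A\rangle - \langle A^*|\sum_{a=1}^{Z}e^{-i\mathbf{q}\cdot\mathbf{x}_a}|A\rangle\right], \tag{6.111}$$
where $q = |\mathbf{q}|$. We now perform a multipole expansion, expanding $e^{-i\mathbf{q}\cdot\mathbf{x}_a}$, and we retain terms up to quadratic order. The expansion is valid when $qa \ll 1$, where $a$ is the atomic size. Then
$$M_{fi} = \frac{4\pi\alpha}{q^2}\left[iq^i\langle A^*|\sum_{a=1}^{Z}x_a^i|A\rangle + \frac{1}{2}q^iq^j\langle A^*|\sum_{a=1}^{Z}x_a^ix_a^j|A\rangle + O(q^3)\right], \tag{6.112}$$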
where in $x_a^i$ the index $a$ labels the atomic electrons, while $i = 1, 2, 3$ is the spatial index. We introduce the dipole and the quadrupole operators
$$D^i = \sum_{a=1}^{Z}x_a^i\,, \tag{6.113}$$
$$Q^{ij} = \sum_{a=1}^{Z}\left(x_a^ix_a^j - \frac{1}{3}\delta^{ij}r_a^2\right), \tag{6.114}$$
with $r_a = |\mathbf{x}_a|$. Then
$$M_{fi} = \frac{4\pi\alpha}{q^2}\left[\frac{1}{6}q^2\langle A^*|\sum_a r_a^2|A\rangle + iq^i\langle A^*|D^i|A\rangle + \frac{1}{2}q^iq^j\langle A^*|Q^{ij}|A\rangle + O(q^3)\right]. \tag{6.115}$$
When we take the modulus squared of this amplitude, there is no interference between the dipole and the other two terms (the scalar term $\sim r^2$ and the quadrupole), since the dipole contributes only to transitions which change the parity, while both the scalar and the quadrupole are nonvanishing only between states with the same parity (see footnote 6). Let us denote more explicitly $|A\rangle = |nLM\rangle$, where $L$ is the orbital angular momentum, $M = L_z$, and $n$ denotes collectively all the other quantum numbers, e.g. $n$ is the principal quantum number in the hydrogen atom. Observe that, since we are neglecting the spin–orbit coupling, $L$ is separately conserved. Similarly, we write $|A^*\rangle = |n'L'M'\rangle$. Putting together eqs. (6.103) and (6.115), and taking into account that the interference term involving the dipole vanishes, the cross-section is the sum of an even-parity term and the dipole term,
$$\left(\frac{d\sigma}{d\Omega}\right)_{\rm even} = \frac{p'}{p}\,m^2\alpha^2\left|\frac{1}{3}\langle n'L'M'|\sum_a r_a^2|nLM\rangle + \frac{q^iq^j}{q^2}\langle n'L'M'|Q^{ij}|nLM\rangle\right|^2, \tag{6.116}$$
$$\left(\frac{d\sigma}{d\Omega}\right)_{\rm dipole} = \frac{p'}{p}\,\frac{4m^2\alpha^2}{q^4}\,q^iq^j\,\langle n'L'M'|D^i|nLM\rangle\langle nLM|D^j|n'L'M'\rangle\,. \tag{6.117}$$
These expressions can be simplified observing that typically we do not know the value of $M$ before the transition and we are not interested in a specific value of $M'$ after. Therefore in the cross-section we average over the initial value of $M$ and we sum over the final values. Summing over $M$, the interference term between the scalar and quadrupole in eq. (6.116) disappears. In fact, the scalar operator $\sum_a r_a^2$ has nonvanishing matrix elements only if $M = M'$, $L = L'$, and its matrix element is independent of $M$. Therefore the $M$ dependence in the interference term is completely contained in $\langle n'LM|Q^{ij}|nLM\rangle$. Summing over $M$, we find
$$\sum_{M=-L}^{L}\langle n'LM|Q^{ij}|nLM\rangle = 0\,. \tag{6.118}$$
This could be shown by explicit computation, but it is much easier to observe that $\langle n'LM|Q^{ij}|nLM\rangle$ is a spatial tensor with two indices; before performing the sum over $M$, we have at our disposal the tensor $\delta^{ij}$, and the direction $n^i$ of the quantization axis, so the result will be a combination of $\delta^{ij}$ and $n^in^j$. After we sum over $M$, any dependence on the direction of the quantization axis disappears, and the result can depend only on $\delta^{ij}$. However, $Q^{ij}$ is a traceless tensor, and therefore the left-hand side of eq. (6.118) cannot be proportional to $\delta^{ij}$. This means that it must vanish.

Footnote 6. The angular momentum selection rules instead are as follows: the dipole is a vector, and as such it could mediate transitions with $\Delta L = 0, \pm 1$. However parity eliminates $\Delta L = 0$, since $x^i$ is a true vector, and we are left with $\Delta L = \pm 1$. Similarly the quadrupole is a spin-2 operator and can mediate transitions with $\Delta L = 0, \pm 2$, whereas $\Delta L = \pm 1$ are eliminated by parity. The scalar $r^2$ of course mediates only transitions with $\Delta L = 0$, $\Delta M = 0$. Therefore in a transition with $\Delta L = 0$, $\Delta M = 0$ the scalar and quadrupole can interfere.
Therefore the cross-section splits into a sum of scalar, dipole and quadrupole terms,
$$\left(\frac{d\sigma}{d\Omega}\right)_{\rm scalar} = \frac{p'}{p}\,\frac{m^2\alpha^2}{9}\,\Big|\langle n'LM|\sum_a r_a^2|nLM\rangle\Big|^2\,\delta_{LL'}\delta_{MM'}\,, \tag{6.119}$$
$$\left(\frac{d\sigma}{d\Omega}\right)_{\rm dipole} = \frac{p'}{p}\,\frac{4m^2\alpha^2}{q^4}\,\frac{q^iq^j}{2L+1}\sum_{M,M'}\langle n'L'M'|D^i|nLM\rangle\langle nLM|D^j|n'L'M'\rangle\,, \tag{6.120}$$
$$\left(\frac{d\sigma}{d\Omega}\right)_{\rm quad} = \frac{p'}{p}\,m^2\alpha^2\,\frac{q^iq^jq^kq^l}{q^4}\,\frac{1}{2L+1}\sum_{M,M'}\langle n'L'M'|Q^{ij}|nLM\rangle\langle nLM|Q^{kl}|n'L'M'\rangle\,. \tag{6.121}$$
Performing the sum over $M, M'$ allows us to simplify further the dipole and quadrupole cross-sections. Again we use the fact that the choice of the quantization axis becomes irrelevant and there is no preferred direction. Therefore, in the dipole cross-section, the quantity
$$\sum_{M,M'}\langle n'L'M'|D^i|nLM\rangle\langle nLM|D^j|n'L'M'\rangle$$
must be proportional to $\delta^{ij}$, since it is a tensor and there is no other quantity that can appear in the final result. We denote the proportionality constant by $D^2/3$,
$$\sum_{M,M'}\langle n'L'M'|D^i|nLM\rangle\langle nLM|D^j|n'L'M'\rangle = \frac{1}{3}\delta^{ij}D^2\,, \tag{6.122}$$
so that by definition $D^2 = \sum_{M,M'}|\langle n'L'M'|D^i|nLM\rangle|^2$ (summed over $i$). In order to simplify the quadrupole term we introduce the notation
$$T^{ijkl} = \sum_{M,M'}\langle n'L'M'|Q^{ij}|nLM\rangle\langle nLM|Q^{kl}|n'L'M'\rangle\,. \tag{6.123}$$
The advantage of summing over $M, M'$ is that even the apparently complicated tensor structure of $T^{ijkl}$ is fully determined by symmetry considerations. Again, we use the fact that $T^{ijkl}$ is a tensor and $\delta^{ij}$ is the only tensor at our disposal, since we have no preferred direction (it is easy to see that $\epsilon^{ijk}$ cannot enter, both because of parity and because it is impossible to use it to construct a tensor with the symmetry properties of $T^{ijkl}$). Therefore we must have $T^{ijkl} = c_1\delta^{ij}\delta^{kl} + c_2\delta^{ik}\delta^{jl} + c_3\delta^{il}\delta^{jk}$. Since $Q^{ij} = Q^{ji}$, $T^{ijkl}$ must satisfy $T^{ijkl} = T^{jikl}$, and similarly for the second pair of indices. This implies that, apart from an overall constant, $T^{ijkl} \sim \delta^{ik}\delta^{jl} + \delta^{il}\delta^{jk} - c\,\delta^{ij}\delta^{kl}$. The constant $c$ is fixed observing that $\sum_i Q^{ii} = 0$ and therefore $\sum_i T^{iikl} = 0$. This gives $c = 2/3$. Defining
$$Q^2 = \sum_{i,j}T^{ijij}\,, \tag{6.124}$$
we therefore have
$$T^{ijkl} = \frac{Q^2}{10}\left(\delta^{ik}\delta^{jl} + \delta^{il}\delta^{jk} - \frac{2}{3}\delta^{ij}\delta^{kl}\right). \tag{6.125}$$
(The factor 1/10 comes from contracting $i$ with $k$ and $j$ with $l$ in the above equation.) In conclusion, all the information about the atomic structure has been condensed in just $D^2$ for dipole transitions and $Q^2$ for quadrupole transitions.
Inserting eqs. (6.122) and (6.125) into eqs. (6.120) and (6.121), respectively, we find
$$\left(\frac{d\sigma}{d\Omega}\right)_{\rm dipole} = \frac{p'}{p}\,\frac{4}{3}\,\frac{1}{2L+1}\,\frac{m^2\alpha^2D^2}{q^2}\,, \tag{6.126}$$
$$\left(\frac{d\sigma}{d\Omega}\right)_{\rm quad} = \frac{p'}{p}\,\frac{2}{15}\,\frac{1}{2L+1}\,m^2\alpha^2Q^2\,. \tag{6.127}$$
The total cross-section is obtained integrating over $d\Omega$. The angular dependence is hidden in the transferred momentum $q$ while $p'$ is fixed by the conservation of energy. Therefore the quadrupole cross-section, which is independent of $q$, simply gets a factor of $4\pi$ after integration over $d\Omega$. For the dipole cross-section we observe that, since $\mathbf{q} = \mathbf{p}' - \mathbf{p}$, we have
$$q^2 = p^2 + p'^2 - 2pp'\cos\theta\,, \tag{6.128}$$
and therefore
$$q\,dq = -pp'\,d\cos\theta\,, \tag{6.129}$$
and the limits on $q$ are $q_{\min} = p - p'$ (we are considering excitations of the atom due to the collision, so $p > p'$) and $q_{\max} = p + p'$. Then
$$\sigma_{\rm dipole} = \int d\Omega\left(\frac{d\sigma}{d\Omega}\right)_{\rm dipole} = 2\pi\int_{p-p'}^{p+p'}\frac{q\,dq}{pp'}\left(\frac{d\sigma}{d\Omega}\right)_{\rm dipole} = \frac{8\pi}{3}\,\frac{1}{2L+1}\,\frac{m^2\alpha^2D^2}{p^2}\,\log\frac{p+p'}{p-p'}\,. \tag{6.130}$$
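The angular integration leading to eq. (6.130) can be verified numerically. The sketch below, in plain Python, integrates the differential dipole cross-section (6.126) over $\cos\theta$ with a midpoint rule, using eq. (6.128) for $q$, and compares the result with the closed form. The function names and the numerical inputs (arbitrary values of $p$, $p'$, $m$, $\alpha$, $D^2$, $L$ in natural units) are assumptions made here for illustration only.

```python
import math

def dsigma_dipole(q, p, p_out, m, alpha, D2, L):
    """Eq. (6.126): differential dipole cross-section as a function of q = |q|."""
    return (p_out / p) * (4.0 / 3.0) * m**2 * alpha**2 * D2 / ((2 * L + 1) * q**2)

def sigma_dipole_closed(p, p_out, m, alpha, D2, L):
    """Eq. (6.130): total dipole cross-section in closed form."""
    prefactor = (8 * math.pi / 3) * m**2 * alpha**2 * D2 / ((2 * L + 1) * p**2)
    return prefactor * math.log((p + p_out) / (p - p_out))

def sigma_dipole_numeric(p, p_out, m, alpha, D2, L, n=20000):
    """Midpoint-rule integral over cos(theta), with q from eq. (6.128)."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        c = -1.0 + (i + 0.5) * h
        q = math.sqrt(p**2 + p_out**2 - 2 * p * p_out * c)
        total += dsigma_dipole(q, p, p_out, m, alpha, D2, L) * h
    return 2 * math.pi * total  # azimuthal integration gives 2 pi

if __name__ == "__main__":
    args = (2.0, 1.0, 1.0, 0.1, 1.0, 1)  # p, p', m, alpha, D2, L (illustrative)
    print(sigma_dipole_numeric(*args), sigma_dipole_closed(*args))
```

The logarithm grows as $p' \to p$, i.e. when the excitation energy is small compared with the electron energy, because the integrand $1/q^2$ is then dominated by small momentum transfer (forward scattering).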
Summary of chapter

• The calculation of scattering cross-sections and of decay rates is made of two parts: (1) the dynamical part, which is the computation of the matrix element $\mathcal{M}_{fi}$. When a perturbative approach is applicable, $\mathcal{M}_{fi}$ can be computed using the Feynman diagram technique discussed in Chapter 5; (2) the kinematical part, i.e. the summation over the final states, with the appropriate factors for the initial state.

• The kinematics of the final state is contained in the phase space, eq. (6.19). The basic equation for computing the decay width of a particle is eq. (6.20), while scattering cross-sections are computed using eqs. (6.29) and (6.24). Explicit formulas for two-body and three-body final states are given in Section 6.4 and in Problem 6.1.

• When the values of the initial momenta are such that an internal line becomes on-shell, the cross-section is enhanced. In this case the intermediate state can be seen as a real particle which is formed, lives for a certain time, and then decays. Such a particle is called a resonance. The resonant cross-section is described by the Breit–Wigner distribution, eq. (6.60).
Further reading

• Many useful results on the topics of this chapter can be found in the old but still beautiful series of books by Landau and Lifshitz. For the definition of cross-sections, decay rates, phase space, etc. see vol. II (Classical Field Theory), Section 12 and vol. IV (Relativistic Field Theory), Section 65. For resonances in nonrelativistic quantum mechanics see vol. III (Quantum Mechanics), Section 134. For emission of radiation by atoms or molecules, diffusion of light, interactions between electrons and between atoms with Feynman diagram techniques, Chapters 5, 6, 9 and 10 of Landau and Lifshitz, vol. IV (1982) give an unmatched source of explicit calculations.

• A rich source of solved problems in particle physics and field theory, including many examples of calculations of scattering processes, decays, etc., is Di Giacomo, Paffuti and Rossi (1994).
Exercises

(6.1) Consider the differential cross-section for the 2 → 2 scattering, given in eq. (6.43). Show that it can be rewritten, in terms of the Mandelstam variable $t$ and of the flux factor $I$, as
$$d\sigma = \frac{1}{64\pi I^2}\,|\mathcal{M}_{fi}|^2\, dt . \qquad (6.131)$$
Observe that all factors are explicitly Lorentz invariant. (Assume cylindrical symmetry to perform the integration over $d\phi$.)

(6.2) Consider a 2 → 2 elastic scattering process for two particles of masses $m_1$ and $m_2$. In the CM, let $\mathbf{p}$ be the final momentum of particle 1, $E$ its energy, $v_2$ the modulus of the initial velocity of particle 2, and $\theta$ the scattering angle. Perform the Lorentz transformation to the laboratory frame, where particle 2 is initially at rest, and check that the final energy of particle 1, in the lab frame, is
$$E_{\rm lab} = \gamma_2\,(E + v_2 |\mathbf{p}| \cos\theta) \qquad (6.132)$$
with $\gamma_2 = (1 - v_2^2)^{-1/2}$, and therefore
$$dE_{\rm lab} = \gamma_2 v_2 |\mathbf{p}|\, d\cos\theta . \qquad (6.133)$$
Use this to show that the two-body phase space can be rewritten as
$$d\Phi^{(2)} = \frac{1}{8\pi\gamma_2 v_2 \sqrt{s}}\, dE_{\rm lab} \qquad (6.134)$$
where $\sqrt{s}$ is the center of mass energy, and we assumed cylindrical symmetry around the beam axis. The two-body phase space is therefore uniform with respect to the lab energy $E_{\rm lab}$, between the kinematical limits $E_{\rm min} \le E_{\rm lab} \le E_{\rm max}$, with
$$E_{\rm min} = \gamma_2\,(E - v_2 |\mathbf{p}|)\,, \qquad E_{\rm max} = \gamma_2\,(E + v_2 |\mathbf{p}|) . \qquad (6.135)$$

(6.3) (i) Consider the decay of an excited atomic state $A^*$ into a lower state $A$, with emission of a photon, $A^* \to A\gamma$. Verify that the phase space can be written as
$$d\Phi^{(2)} = \frac{\omega}{16\pi^2 M_A}\, d\Omega , \qquad (6.136)$$
where $\omega$ is the energy of the photon and $M_A$ the mass of the atom.
(ii) Verify that the decay width can be written as
$$d\Gamma = |\mathcal{M}_{fi}|^2\, \frac{\omega}{8\pi^2}\, d\Omega , \qquad (6.137)$$
where $\mathcal{M}_{fi}$ is the matrix element with the normalization of one particle per unit volume for the atomic states.
(iii) Consider a scattering process $A\gamma \to A^*\gamma'$. Show that
$$d\sigma = \frac{1}{16\pi^2}\,\frac{\omega'}{\omega}\, |\mathcal{M}_{fi}|^2\, d\Omega' , \qquad (6.138)$$
where $\omega$, $\omega'$ are the energies of the initial and final photon, respectively.
(6.4) Consider a two-photon decay of an atomic state, $A^* \to A\gamma_1\gamma_2$. Show that the decay width can be written as
$$\frac{d\Gamma}{d\omega_1} = \frac{1}{8(2\pi)^5}\,\omega_1(\omega - \omega_1) \int d\Omega_1\, d\Omega_2\, |\mathcal{M}_{fi}|^2 , \qquad (6.139)$$
where $\omega_1$, $\omega_2$ are the energies of the two photons, $\omega = E_{A^*} - E_A = \omega_1 + \omega_2$, and $d\Omega_1$, $d\Omega_2$ are the solid angles of the two photons.

(6.5) Denote by $d\Phi^{(n)}(P; p_1, \ldots, p_n)$ the n-body phase space, with $p_1 + \ldots + p_n = P$. Show that
$$d\Phi^{(n)}(P; p_1, \ldots, p_n) = \int_0^\infty \frac{d\mu^2}{2\pi}\; d\Phi^{(j)}(q; p_1, \ldots, p_j)\; d\Phi^{(n-j+1)}(P; p_{j+1}, \ldots, p_n, q) , \qquad (6.140)$$
where $\mu^2 = q_0^2 - \mathbf{q}^2$. Discuss the physical meaning of this recursive representation of the phase space.

(6.6) (i) In the theory with Lagrangian given in eq. (6.48), show that the Feynman diagram in Fig. 6.5 develops an imaginary part when $M > 2m$, and compute it.
(ii) Denoting the result of the Feynman diagram of Fig. 6.5 by $i\mathcal{M}$, show that eq. (6.52) predicts that the decay rate $\Gamma$ for the process $\Phi \to \phi_1\phi_2$ is related to $\mathcal{M}$ by
$$\Gamma = \frac{1}{M_R}\, {\rm Im}\,\mathcal{M} . \qquad (6.141)$$
(iii) Verify the correctness of the above relation (which is a form of the optical theorem) by computing $\Gamma$ explicitly.
7 Quantum electrodynamics

7.1 The QED Lagrangian
7.2 One-loop divergences
7.3 Solved problems

7.1 The QED Lagrangian
Quantum electrodynamics (QED) describes the interaction between electrons (or any other charged spin 1/2 particle, like muons) and photons. It is convenient to quantize the photons using the covariant quantization of Section 4.3.2. Actually, it is also useful to generalize slightly the Lagrangian used in Section 4.3.2: instead of eq. (4.102), we describe the free electromagnetic field by
$$\mathcal{L}_{\rm em} = -\frac{1}{4} F_{\mu\nu}F^{\mu\nu} - \frac{1}{2\xi}(\partial_\mu A^\mu)^2 , \qquad (7.1)$$
with $\xi$ a generic parameter. In Section 4.3.2 we set $\xi = 1$, but it can be shown that for any $\xi$, after requiring that $\partial_\mu A^\mu$ vanishes between physical states, the spectrum of the theory is given by the two transverse polarization states of the photon. Basically, this happens because the only role of the term $(1/2\xi)(\partial A)^2$ is to break gauge invariance and to allow us to define the momentum conjugate to $A_0$. Then, between physical states, the operator $\partial_\mu A^\mu$ vanishes and the matrix elements between physical states obtained with eq. (7.1) are independent of $\xi$. Of course intermediate steps, like the equal time commutation relations between $A_\mu$ and the conjugate momenta, or the propagator, do depend on $\xi$. In the interacting theory, it will turn out that the dependence on $\xi$ vanishes if $A_\mu$ is coupled to matter respecting gauge invariance, so in particular $A_\mu$ must be coupled to a conserved current.

It is sometimes useful to work with $\xi$ generic, and to check the correctness of the computation by verifying that in the end $\xi$ cancels in the matrix elements between physical states. Also, in different problems, different choices of $\xi$ can simplify the calculation. The term $(1/2\xi)(\partial A)^2$ is called the gauge fixing term and $\xi$ is the gauge fixing parameter; the choice $\xi = 1$ is called the Feynman gauge, and is typically the simplest choice. Sometimes the choice $\xi = 0$ (Landau gauge) is also useful; the Lagrangian is singular in this limit, but we will see below that the photon propagator is well defined at $\xi = 0$.

The interaction between the photon and the electron is written in terms of the covariant derivative, as explained in Section 3.5.4. QED is then described by the Lagrangian
$$\mathcal{L}_{\rm QED} = \bar\Psi(i\not{\partial} - m)\Psi - \frac{1}{4} F_{\mu\nu}F^{\mu\nu} - \frac{1}{2\xi}(\partial_\mu A^\mu)^2 - e A_\mu \bar\Psi\gamma^\mu\Psi . \qquad (7.2)$$
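As a brief worked step, the interaction term in eq. (7.2) can be made explicit by expanding the covariant derivative. The sketch below assumes the form $D_\mu = \partial_\mu + ieA_\mu$ for the covariant derivative of Section 3.5.4, which is not restated in this section; it is the choice that reproduces the sign of the coupling $-eA_\mu\bar\Psi\gamma^\mu\Psi$ appearing in eq. (7.2).

```latex
\begin{align}
\bar\Psi\,(i\gamma^\mu D_\mu - m)\,\Psi
  &= \bar\Psi\,\bigl[\, i\gamma^\mu(\partial_\mu + ieA_\mu) - m \,\bigr]\Psi \\
  &= \bar\Psi\,(i\gamma^\mu\partial_\mu - m)\,\Psi \;-\; e\,A_\mu\,\bar\Psi\gamma^\mu\Psi .
\end{align}
```

With this assumed sign convention, a local phase rotation $\Psi \to e^{ie\theta(x)}\Psi$ is compensated by the shift $A_\mu \to A_\mu - \partial_\mu\theta$, so that $D_\mu\Psi$ transforms like $\Psi$ itself.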
The Feynman rules of QED have already been given in Section 5.5.4. We just add that, if we use a generic $\xi \neq 1$, the photon propagator becomes
$$\tilde D_{\mu\nu}(k) = \frac{-i}{k^2 + i\epsilon}\left[\eta_{\mu\nu} - (1 - \xi)\frac{k_\mu k_\nu}{k^2}\right] . \qquad (7.3)$$

We now discuss the symmetries of the QED action. From the point of view of space-time symmetries, QED has of course Poincaré invariance. We have also seen that the coupling is constructed in such a way that the theory is invariant under gauge transformations, i.e. local U(1) transformations
$$\Psi(x) \to e^{ie\theta(x)}\,\Psi(x) , \qquad (7.4)$$
$$A_\mu(x) \to A_\mu(x) - \partial_\mu\theta . \qquad (7.5)$$
The presence of this local symmetry implies also the existence of the corresponding global U(1) symmetry with $\theta$ a constant parameter,
$$\Psi(x) \to e^{ie\theta}\,\Psi(x) , \qquad (7.6)$$
$$A_\mu(x) \to A_\mu(x) . \qquad (7.7)$$
There is therefore an associated conserved Noether current, which is $\bar\Psi\gamma^\mu\Psi$, and a U(1) charge which is conserved by the electromagnetic interaction. To understand the meaning of this charge, observe that $Q = \int d^3x\, j^0$, with $j^0 = \bar\Psi\gamma^0\Psi$. In the Lagrangian density $j^0$ is coupled to $A^0$, and in the Hamiltonian density $j^0$ enters as $+eA^0 j^0$. Since in classical electrodynamics $A^0$ is the electrostatic potential, we see that $j^0$ is the electric charge density, measured in units of $e$, and therefore $Q$ is the electric charge, again in units of $e$. As we saw in Section 4.2, for electrons $Q = 1$ and for positrons $Q = -1$, see eq. (4.43).

Gauge invariance implies that the photon is massless: a mass term for the photon would correspond to a term $m_\gamma^2 A_\mu A^\mu$ in the Lagrangian, but this is forbidden since it is not invariant under gauge transformations. If gauge invariance were broken we should expect a photon mass of the order of the symmetry-breaking scale. However, the experimental bound on the photon mass is extraordinarily tight, $m_\gamma < 2 \times 10^{-16}$ eV.

QED also has important discrete symmetries. Consider first the parity operation $P$. We have defined the action of $P$ on a quantized spinor field in eq. (4.58): $\Psi(x) \to \eta_a\gamma^0\Psi(x')$, where $x' = (t, -\mathbf{x})$. From this it follows that the current $\bar\Psi\gamma^\mu\Psi$ is a true four-vector, i.e. under parity the spatial components change sign and the temporal component is invariant. Since also $\partial_\mu$ is a true four-vector, the kinetic term of the fermion is invariant under parity. Similarly, the gauge field $A_\mu$ is a true four-vector, so both the kinetic term of the gauge field and the interaction term are invariant under $P$. Therefore QED is invariant under parity.

Another important symmetry is charge conjugation, $C$. In Section 4.2 we defined the operation of charge conjugation on the quantized Dirac spinors. We saw in Exercise 4.3 that, using the fact that the quantized Dirac fields anticommute, the operator $\bar\Psi\gamma^\mu\Psi$ changes sign under charge conjugation,
$$C\,\bar\Psi\gamma^\mu\Psi\,C = -\bar\Psi\gamma^\mu\Psi . \qquad (7.8)$$
Observe that, even if it involves complex conjugation, on the quantized fields $C$ is defined as a linear (rather than antilinear) operator. Its action on the quantized Dirac field is determined by its action on the creation and annihilation operators $a_{\mathbf{p},s}$, $b_{\mathbf{p},s}$ given in eq. (4.59), regardless of the fact that the coefficients of $a_{\mathbf{p},s}$, $b_{\mathbf{p},s}$ in the expansion of $\Psi$ are the complex functions $u^s(p)e^{-ipx}$ and $v^s(p)e^{ipx}$. Therefore $C\,i\Psi\,C = i\,C\Psi C$, so also $i\bar\Psi\gamma^\mu\Psi$ changes sign under charge conjugation. Then the kinetic term transforms as
$$C\, i\bar\Psi\gamma^\mu\partial_\mu\Psi\, C = -i(\partial_\mu\bar\Psi)\gamma^\mu\Psi , \qquad (7.9)$$
(since the term $\partial_\mu\Psi$, after the action of $C$, becomes proportional to $\partial_\mu\Psi^*$ and is then anticommuted to the left where it combines with a $\gamma^0$ to give $\partial_\mu\bar\Psi$) and, after integrating $\partial_\mu$ by parts, the kinetic term in the action is invariant. Since the interaction term is proportional to $A_\mu\bar\Psi\gamma^\mu\Psi$ and $\bar\Psi\gamma^\mu\Psi$ changes sign, if we define the charge conjugation on $A_\mu$ as $C A_\mu(x) C = -A_\mu(x)$, QED is invariant under charge conjugation, as we already saw in Section 4.3.2. The photon is then an eigenstate of charge conjugation, with eigenvalue $-1$,
$$C|\gamma\rangle = -|\gamma\rangle . \qquad (7.10)$$

As an example of the use of these invariance principles, we examine the electromagnetic decay of the neutral pion. In general, a particle can be an eigenstate of charge conjugation only if it is electrically neutral. Consider the three pions $\pi^\pm$, $\pi^0$. Apart from an arbitrary phase, we have $C|\pi^+\rangle = |\pi^-\rangle$ and $C|\pi^-\rangle = |\pi^+\rangle$. Instead, the $\pi^0$ is neutral, and therefore $C|\pi^0\rangle = \eta|\pi^0\rangle$; $C$ has been defined on spinors so that $C^2 = 1$ (see Section 4.2.3) and of course $C^2 = 1$ also on the gauge field, so $C^2$ is the identity operator. So, even if $\pi^0$, at the fundamental level, is a possibly complicated bound state of fermions (in terms of quarks $\pi^0 = u\bar u + d\bar d$), we know that $C^2$ is the identity operator also when we apply it to the $\pi^0$, and therefore $\eta^2 = 1$ and $\eta$ can only take the values $\pm 1$. To see which one is the actual value, we observe that $\pi^0$ decays electromagnetically as
$$\pi^0 \to 2\gamma . \qquad (7.11)$$
The two-photon state is an eigenstate of $C$ with eigenvalue $(-1)^2 = +1$ and since the electromagnetic interaction conserves $C$, this must also be the value of $C$ for the $\pi^0$, i.e. $\eta = +1$. In turn, this means that the electromagnetic decay $\pi^0 \to 3\gamma$ is forbidden because it violates $C$. Experimentally the decay into three photons is not observed, and the limit is
$$\frac{\Gamma(\pi^0 \to 3\gamma)}{\Gamma(\pi^0 \to 2\gamma)} < 3.1 \times 10^{-8} . \qquad (7.12)$$
Finally, the QED action is invariant under time reversal $T$, and therefore also under $CPT$, in agreement with the $CPT$ theorem, see Section 4.2.3.
7.2 One-loop divergences
In Section 5.6 we defined the superficial degree of divergence $D$ for a scalar field theory, and we saw that the condition for renormalizability is that only a limited number of Green's functions have $D \ge 0$. In QED, or in general in the presence of fermions, the definition of $D$ must be modified, since the fermionic propagator decreases as $1/\not{p}$ rather than $1/p^2$. We denote by $N_f^{\rm ext}$, $N_\gamma^{\rm ext}$ the number of external fermionic and photonic lines respectively, by $N_f^{\rm int}$, $N_\gamma^{\rm int}$ the number of internal fermionic and photonic lines, by $V$ the number of vertices in the graph and by $L$ the number of loops. Then, repeating the arguments of Section 5.6, the superficial degree of divergence is defined in QED as
$$D = 4L - 2N_\gamma^{\rm int} - N_f^{\rm int} . \qquad (7.13)$$
The number of loops is related to the total number of internal lines $N_f^{\rm int} + N_\gamma^{\rm int}$ as in eq. (5.143),
$$L = N_f^{\rm int} + N_\gamma^{\rm int} - V + 1 \qquad (7.14)$$
and the fact that to each vertex are associated two fermionic lines and one photonic line means that
$$2V = 2N_f^{\rm int} + N_f^{\rm ext} , \qquad V = 2N_\gamma^{\rm int} + N_\gamma^{\rm ext} . \qquad (7.15)$$
Combining these expressions, we find
$$D = 4 - N_\gamma^{\rm ext} - \frac{3}{2} N_f^{\rm ext} . \qquad (7.16)$$
This means that only the Green's functions with $N_\gamma^{\rm ext} + \frac{3}{2}N_f^{\rm ext} \le 4$ are potentially dangerous. Furthermore, some of the potentially dangerous Green's functions are actually finite or even zero. Consider in fact the Green's functions with no external electron line and an arbitrary number $N_\gamma^{\rm ext} = n$ of external photon lines. They correspond to
$$\langle 0| A_{\mu_1}(x_1) \ldots A_{\mu_n}(x_n) \exp\left\{-i\int d^4x\, \mathcal{H}_{\rm QED}\right\} |0\rangle_c . \qquad (7.17)$$
We have seen that the QED Hamiltonian is invariant under charge conjugation, $C\mathcal{H}_{\rm QED}C = \mathcal{H}_{\rm QED}$. Inserting multiple factors $C^2 = 1$ in the above expression and using $C|0\rangle = |0\rangle$, we find
$$\langle 0| A_{\mu_1}(x_1) \ldots A_{\mu_n}(x_n) \exp\left\{-i\int d^4x\, \mathcal{H}_{\rm QED}\right\} |0\rangle$$
$$= \langle 0| (CA_{\mu_1}(x_1)C) \ldots (CA_{\mu_n}(x_n)C)\left(C \exp\left\{-i\int d^4x\, \mathcal{H}_{\rm QED}\right\} C\right) |0\rangle$$
$$= (-1)^n \langle 0| A_{\mu_1}(x_1) \ldots A_{\mu_n}(x_n) \exp\left\{-i\int d^4x\, \mathcal{H}_{\rm QED}\right\} |0\rangle . \qquad (7.18)$$
Therefore the Green's functions with no external fermion lines and with an odd number of external photon lines are identically zero, to all orders in perturbation theory (Furry's theorem).
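The power counting of eqs. (7.13)–(7.16) can be tabulated with a short script. The following is only an illustrative sketch: the function names are hypothetical, and the enumeration simply applies eq. (7.16), discards odd numbers of external fermion lines (eq. (7.15)) and drops the photon-only functions with an odd number of legs (Furry's theorem).

```python
from fractions import Fraction


def superficial_degree(n_gamma_ext, n_f_ext):
    """Eq. (7.16): D = 4 - N_gamma_ext - (3/2) N_f_ext."""
    return 4 - n_gamma_ext - Fraction(3, 2) * n_f_ext


def degree_from_internal_lines(n_gamma_int, n_f_int, vertices):
    """Eqs. (7.13)-(7.14): D computed from the internal structure of a graph."""
    loops = n_f_int + n_gamma_int - vertices + 1      # eq. (7.14)
    return 4 * loops - 2 * n_gamma_int - n_f_int      # eq. (7.13)


def dangerous_green_functions(max_legs=6):
    """List (N_gamma_ext, N_f_ext, D) for all Green's functions with D >= 0."""
    found = []
    for n_f in range(0, max_legs + 1, 2):             # N_f_ext is even, eq. (7.15)
        for n_gamma in range(0, max_legs + 1):
            if n_f == 0 and n_gamma % 2 == 1:
                continue                              # zero by Furry's theorem
            d = superficial_degree(n_gamma, n_f)
            if d >= 0:
                found.append((n_gamma, n_f, d))
    return found


if __name__ == "__main__":
    for n_gamma, n_f, d in dangerous_green_functions():
        print(f"N_gamma_ext={n_gamma}  N_f_ext={n_f}  D={d}")
```

As a consistency check, `degree_from_internal_lines(1, 1, 2)` (one internal photon, one internal fermion, two vertices, i.e. a one-loop fermion self-energy topology) agrees with `superficial_degree(0, 2)`.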
Thus, we are finally left with the following potentially dangerous Green's functions. (As in Section 5.6, this does not mean that the other Green's functions have no divergences, but that their divergences are automatically cured by the renormalization of Green's functions with a smaller number of external legs.) If $N_f^{\rm ext} = 0$ we can have (i) $N_\gamma^{\rm ext} = 0$. This is a vacuum diagram, and can be cured simply by normal ordering the Hamiltonian. (ii) $N_\gamma^{\rm ext} = 2$. This is a divergence in the photon propagator. (iii) $N_\gamma^{\rm ext} = 4$. This is a light–light scattering amplitude. If instead $N_f^{\rm ext} = 2$ (recall that $N_f^{\rm ext}$ must be even, as we see from the first equation in (7.15), and as dictated by charge conservation) we can have only (i) $N_\gamma^{\rm ext} = 0$ (the fermion propagator), and (ii) $N_\gamma^{\rm ext} = 1$ (the interaction vertex). If $N_f^{\rm ext} \ge 4$ all Green's functions have instead $D < 0$.

Let us discuss these UV divergences at the one-loop level. The corresponding one-loop diagrams are shown in Figs. 7.1–7.4.

Fig. 7.1 The one-loop electron self-energy.
Fig. 7.2 The one-loop photon self-energy.
Fig. 7.3 The one-loop vertex correction.
Fig. 7.4 The one-loop light–light scattering amplitude.

Using for simplicity the Feynman gauge $\xi = 1$, the graph in Fig. 7.1 is given by
$$i\Sigma(p) \equiv \int \frac{d^4k}{(2\pi)^4}\, (-ie\gamma_\mu)\, \frac{i(\not{p} - \not{k} + m)}{(p-k)^2 - m^2 + i\epsilon}\, (-ie\gamma_\nu) \left(-\frac{i\eta^{\mu\nu}}{k^2}\right) . \qquad (7.19)$$
The correction to the photon propagator is given by the graph in Fig. 7.2,
$$i\Pi_{\mu\nu}(q) \equiv (-1) \int \frac{d^4k}{(2\pi)^4}\, {\rm Tr}\left[(-ie\gamma_\mu)\, \frac{i(\not{k} + m)}{k^2 - m^2 + i\epsilon}\, (-ie\gamma_\nu)\, \frac{i(\not{k} - \not{q} + m)}{(k-q)^2 - m^2 + i\epsilon}\right] . \qquad (7.20)$$
The minus sign comes from the fermionic loop, and writing explicitly the Dirac indices one can check that they combine to give a trace. The correction to the interaction vertex in Fig. 7.3 is
$$-i\Gamma_\mu(p, q) \equiv \int \frac{d^4k}{(2\pi)^4} \left(-\frac{i\eta^{\nu\rho}}{k^2}\right) (-ie\gamma_\nu)\, \frac{i(\not{p}' + \not{k} + m)}{(p'+k)^2 - m^2 + i\epsilon}\, (-ie\gamma_\mu)\, \frac{i(\not{p} + \not{k} + m)}{(p+k)^2 - m^2 + i\epsilon}\, (-ie\gamma_\rho) , \qquad (7.21)$$
with $q + p = p'$. The one-loop light–light scattering amplitude of Fig. 7.4 is instead given by (the $+i\epsilon$ are understood)
$$A^{\mu\nu\rho\sigma}(k_1, k_2, k_3, k_4) = (-1)(-ie)^4\, i^4 \int \frac{d^4p}{(2\pi)^4}\, {\rm Tr}\left[\gamma^\mu\, \frac{\not{p} + m}{p^2 - m^2}\, \gamma^\nu\, \frac{\not{p} - \not{k}_3 + m}{(p-k_3)^2 - m^2}\, \gamma^\rho\, \frac{\not{p} - \not{k}_1 - \not{k}_2 + m}{(p-k_1-k_2)^2 - m^2}\, \gamma^\sigma\, \frac{\not{p} - \not{k}_1 + m}{(p-k_1)^2 - m^2}\right] , \qquad (7.22)$$
with $k_1 + k_2 = k_3 + k_4$.

The integrals in eqs. (7.19), (7.20) and (7.21) are UV divergent, so to make sense of them we must specify a regularization procedure. Putting a cutoff $\Lambda$ in Euclidean momentum space, as we did in Chapter 5 for $\lambda\phi^n$ theories, is not at all convenient in a gauge theory.
The problem is that putting such a cutoff means that we are setting to zero all momentum modes of the fields with $|k| > \Lambda$. However, even if we set to zero all Fourier modes of the gauge field $\tilde A_\mu(k)$ with $|k| > \Lambda$, these modes are regenerated by a gauge transformation $A_\mu(x) \to A_\mu(x) - \partial_\mu\theta$, with $\theta$ generic. In other words, a cutoff in momentum space is not compatible with gauge invariance. In general, it is very dangerous to break a symmetry of the theory by the regularization. One would naively expect that the symmetry is recovered when we remove the cutoff, but this turns out to be not at all automatic (if the symmetry is not recovered, one says that there is an anomaly in the theory). If the gauge symmetry became anomalous it would be a disaster. We saw in Section 4.3 that gauge invariance is crucial in order to eliminate the spurious degrees of freedom from the gauge field $A_\mu$ and remain with a massless particle with two helicity states $h = \pm 1$, as is the photon.

It is therefore much more convenient to regularize the theory maintaining gauge invariance explicitly. The two most useful gauge invariant regularizations are dimensional regularization and Pauli–Villars regularization. The former is based on the following idea. Consider for example
$$I_d \equiv \int \frac{d^dk}{(2\pi)^d}\, \frac{1}{(k^2 + \Delta)^2} , \qquad (7.23)$$
where we have already performed the Wick rotation, so $k$ is now a Euclidean momentum, while $\Delta$ is some combination of external momenta and masses. We are interested in $d = 4$, but we keep for the moment $d$ generic. For $d = 4$, this is one of the typical divergent parts of the diagrams written above. Now one observes that, if $d < 4$, the integral is convergent and the result can be written in terms of the Euler $\Gamma$ function,
$$I_d = \frac{1}{(4\pi)^{d/2}}\, \frac{\Gamma(2 - \frac{d}{2})}{\Gamma(2)} \left(\frac{1}{\Delta}\right)^{2 - \frac{d}{2}} . \qquad (7.24)$$
The function $\Gamma(z)$ has isolated poles at $z = 0, -1, -2, \ldots$ and therefore the integral diverges in $d = 4, 6, 8, \ldots$, and is otherwise well defined, even for $d$ non-integer. We can therefore take the right-hand side of eq. (7.24) as the definition of $I_d$ for generic $d$, real or even complex. We can now study it in $d = 4 - \epsilon$ dimensions, and in the limit $\epsilon \to 0$ we recover our divergence; from the known behavior of the $\Gamma$ function near the poles one finds that, as $\epsilon \to 0$,
$$I_{4-\epsilon} \to \frac{1}{(4\pi)^2}\left[\frac{2}{\epsilon} - \log\frac{\Delta}{4\pi} - \gamma + O(\epsilon)\right] , \qquad (7.25)$$
where $\gamma \simeq 0.5772\ldots$ is called the Euler–Mascheroni constant. We have therefore succeeded in writing the integral as a divergent part plus a finite term; $\epsilon$ plays the role of the cutoff.

The Pauli–Villars regularization is instead based on the idea of modifying the form of the propagator in the UV, so that it goes to zero more rapidly and helps the convergence of the loop integrals. For instance, the photon propagator is modified by the replacement
$$\frac{1}{k^2 - i\epsilon} \to \frac{1}{k^2 - i\epsilon} - \frac{1}{k^2 + \Lambda^2 - i\epsilon} . \qquad (7.26)$$
In the limit $\Lambda \to \infty$ we recover the original propagator, but for finite $\Lambda$ at large $k^2$ the propagator decreases as $1/k^4$ rather than $1/k^2$. It can be shown that both dimensional and Pauli–Villars regularizations preserve gauge invariance. There is a well-developed technology for computing integrals and renormalizing the theory in these schemes, see, e.g. Peskin and Schroeder (1995) or Weinberg (1995).

Once we have regularized the integrals, respecting gauge invariance, we can adapt to QED the same reasoning explained in the case of $\lambda\phi^4$ theory; then, the graph in Fig. 7.1 can be treated exactly as we did in Section 5.5.2 for the scalar propagator, and the divergence is reabsorbed in a renormalization of the mass and of the wave function of the fermion. The graph in Fig. 7.2 instead gives a result of the form
$$\Pi_{\mu\nu}(q) = \left(\eta_{\mu\nu}q^2 - q_\mu q_\nu\right)\Pi(q^2) , \qquad (7.27)$$
with $\Pi(q^2)$ divergent. This divergence is reabsorbed in a renormalization of the wave function of the photon. It is important that in $\Pi_{\mu\nu}(q)$ there is no term proportional to $\eta_{\mu\nu}$ times a constant (rather than $\eta_{\mu\nu}q^2$), since this would have provided a renormalization of the mass of the photon. However, a photon mass term $m^2 A_\mu A^\mu$ in the Lagrangian is forbidden by gauge invariance, and therefore such a term is not produced using a gauge-invariant regularization. Finally, the graph in Fig. 7.3 renormalizes the electric charge. The graph in Fig. 7.4, in a naive power counting, seems logarithmically divergent. However the explicit computation shows that this graph is finite because the would-be divergent term inside the integral actually vanishes. After reabsorbing the divergences into the renormalized fields, renormalized mass and renormalized charge, all other Green's functions are one-loop finite. This turns out to hold at all loops, and QED is renormalizable.
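The expansion of the dimensionally regularized integral, eq. (7.25), can be compared numerically with the closed form of eq. (7.24). The sketch below is only an illustration: the function names and the sample value of $\Delta$ are arbitrary choices, and the closed form is written directly in terms of $\epsilon = 4 - d$ to avoid a numerical cancellation in $2 - d/2$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def i_exact(eps, delta):
    """Eq. (7.24) evaluated at d = 4 - eps (note that Gamma(2) = 1)."""
    prefactor = 1.0 / (4.0 * math.pi) ** (2.0 - eps / 2.0)
    return prefactor * math.gamma(eps / 2.0) * delta ** (-eps / 2.0)


def i_expanded(eps, delta):
    """Eq. (7.25), keeping the pole and the finite part and dropping O(eps)."""
    bracket = 2.0 / eps - math.log(delta / (4.0 * math.pi)) - EULER_GAMMA
    return bracket / (4.0 * math.pi) ** 2


if __name__ == "__main__":
    delta = 2.0  # arbitrary sample value, in units where it is dimensionless
    for eps in (1e-1, 1e-2, 1e-3):
        exact = i_exact(eps, delta)
        approx = i_expanded(eps, delta)
        print(f"eps={eps:g}  exact={exact:.8f}  expanded={approx:.8f}  "
              f"difference={exact - approx:.2e}")
```

The difference between the two expressions should shrink roughly linearly with $\epsilon$, as expected from the neglected $O(\epsilon)$ term, while both grow like $2/\epsilon$.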
7.3 Solved problems

Fig. 7.5 The amplitude for $e^+e^- \to \mu^+\mu^-$ at order $e^2$; $p$, $p'$ are the incoming momenta of the electron and positron, respectively, and $k$, $k'$ the outgoing momenta of $\mu^-$ and $\mu^+$.
Problem 7.1. $e^+e^- \to \gamma \to \mu^+\mu^-$

As a prototype of many similar computations, we evaluate the cross-section for the process $e^+e^- \to \mu^+\mu^-$ in QED. Remember, however, that when we approach the electroweak scale we cannot limit ourselves to QED and there is also a contribution to the amplitude from the $Z^0$, $e^+e^- \to Z^0 \to \mu^+\mu^-$, which becomes resonant at $E = m_Z \simeq 90$ GeV. As long as the CM energy $E$ is much smaller than $m_Z$ we can neglect it. In QED at lowest order there is only one Feynman graph, shown in Fig. 7.5. The Feynman rules give
$$i\mathcal{M}_{fi} = \bar v^{s'}(p')(-ie\gamma^\mu)u^s(p)\; \frac{-i}{q^2}\left[\eta_{\mu\nu} - (1-\xi)\frac{q_\mu q_\nu}{q^2}\right] \bar u^r(k)(-ie\gamma^\nu)v^{r'}(k') , \qquad (7.28)$$
where $q = p + p'$ and the assignment of the momenta to the various particles is as in Fig. 7.5. First of all, we observe that the term $\sim q_\mu q_\nu$ in the photon propagator gives zero. In fact, using $q = p + p'$,
$$q_\mu\, \bar v^{s'}(p')\gamma^\mu u^s(p) = \left(\bar v^{s'}(p')\not{p}'\right)u^s(p) + \bar v^{s'}(p')\left(\not{p}\,u^s(p)\right) . \qquad (7.29)$$
Using the Dirac equations for $u$ and for $\bar v$, given in eqs. (3.100) and (3.115), we see that the right-hand side of eq. (7.29) is equal to
$$-m\, \bar v^{s'}(p')u^s(p) + m\, \bar v^{s'}(p')u^s(p) = 0 . \qquad (7.30)$$
Therefore the matrix element is independent of the gauge fixing parameter $\xi$, as we expected from gauge invariance. The origin of this result is the fact that in the interaction Lagrangian $A_\mu$ is coupled to the current $\bar\Psi\gamma^\mu\Psi$, which is conserved on the equations of motion since it is the Noether current of the U(1) symmetry. We therefore recover, at the quantum level, a condition that we already found classically in Section 3.5.4: to preserve gauge invariance, a gauge field must be coupled to a conserved current.

Let us now perform the computation of the scattering cross-section for $e^+e^- \to \mu^+\mu^-$. Using the notation $s = q^2$ for the square of the CM energy we have
$$|\mathcal{M}_{fi}|^2 = \frac{e^4}{s^2}\left(\bar u(\mu^-)\gamma^\mu v(\mu^+)\right)\left(\bar v(\mu^+)\gamma^\nu u(\mu^-)\right) \times \left(\bar v(e^+)\gamma_\mu u(e^-)\right)\left(\bar u(e^-)\gamma_\nu v(e^+)\right) . \qquad (7.31)$$
We have used the notation $u(e^-) = u^s(p)$, etc. in which instead of writing explicitly the momentum and spin we have written the particles to which they refer, and we used the identity $(\bar u\gamma^\mu v)^* = \bar v\gamma^\mu u$, which is easily derived using $(\gamma^\mu)^\dagger = \gamma^0\gamma^\mu\gamma^0$.

If we are interested in a process with a specific spin structure, i.e. if we know the spin of the initial particles and we are interested in the amplitude with a given value of the spin of the final particles, we can use the explicit expressions for $u^s(p)$, $v^s(p)$ given in Section 4.2. However, it is more common that we have an unpolarized beam, so we do not know the spin of the initial particles, and we accept in the detector all final particles, without measuring their spin state. In this case we must average the cross-section over the initial spin states and sum it over the final spin states. To understand how to perform the sum over spins it can be convenient, even if a bit tedious, to write out explicitly the Dirac indices and rewrite the above expression as
$$|\mathcal{M}_{fi}|^2 = \frac{e^4}{s^2}\, \bar u_a(\mu^-)\gamma^\mu_{ab}v_b(\mu^+)\, \bar v_c(\mu^+)\gamma^\nu_{cd}u_d(\mu^-) \times \bar v_{a'}(e^+)(\gamma_\mu)_{a'b'}u_{b'}(e^-)\, \bar u_{c'}(e^-)(\gamma_\nu)_{c'd'}v_{d'}(e^+)$$
$$= \frac{e^4}{s^2}\left[u(\mu^-)\bar u(\mu^-)\right]_{da}\gamma^\mu_{ab}\left[v(\mu^+)\bar v(\mu^+)\right]_{bc}\gamma^\nu_{cd} \times \left[v(e^+)\bar v(e^+)\right]_{d'a'}(\gamma_\mu)_{a'b'}\left[u(e^-)\bar u(e^-)\right]_{b'c'}(\gamma_\nu)_{c'd'} . \qquad (7.32)$$
Recalling now eqs. (3.112) and (3.113) we see that, summing over the spin states, $[v(\mu^+)\bar v(\mu^+)]_{bc}$ can be replaced by $(\not{k}' - m_\mu)_{bc}$, and $[u(\mu^-)\bar u(\mu^-)]_{da}$ by $(\not{k} + m_\mu)_{da}$. Similarly for the initial electrons; here however we have to average, rather than to sum, over the two spin states, so $u(e^-)\bar u(e^-)$ is replaced by $(1/2)(\not{p} + m_e)$ and $v(e^+)\bar v(e^+)$ by $(1/2)(\not{p}' - m_e)$. Looking at the structure
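of the Dirac indices, we see that they are cyclic and can be rewritten in matrix form as traces. Therefore
$$\frac{1}{4}\sum_{\rm spin}|\mathcal{M}_{fi}|^2 = \frac{e^4}{4s^2}\, {\rm Tr}\left[(\not{k} + m_\mu)\gamma^\mu(\not{k}' - m_\mu)\gamma^\nu\right] {\rm Tr}\left[(\not{p} + m_e)\gamma_\nu(\not{p}' - m_e)\gamma_\mu\right] . \qquad (7.33)$$
The traces can be performed using the identities
$${\rm Tr}(\gamma^\mu\gamma^\nu) = 4\eta^{\mu\nu} , \qquad (7.34)$$
$${\rm Tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) = 4\left(\eta^{\mu\nu}\eta^{\rho\sigma} - \eta^{\mu\rho}\eta^{\nu\sigma} + \eta^{\mu\sigma}\eta^{\nu\rho}\right) , \qquad (7.35)$$
while the trace of an odd number of $\gamma$ matrices vanishes (for other useful identities, see, e.g. Peskin and Schroeder (1995), page 133). The factors $\eta^{\mu\nu}$ can then be used to contract the various momenta between them. The resulting scalar products are most easily computed in the CM frame. In this frame $p = (E, \mathbf{p})$, $p' = (E, -\mathbf{p})$ with $(2E)^2 = s$ and $\mathbf{p}^2 = E^2 - m_e^2$, while $k = (E, \mathbf{k})$, $k' = (E, -\mathbf{k})$ with $\mathbf{k}^2 = E^2 - m_\mu^2$. Denoting by $\theta$ the angle between $\mathbf{p}$ and $\mathbf{k}$, the result is
$$\frac{1}{4}\sum_{\rm spin}|\mathcal{M}_{fi}|^2 = e^4\left[1 + 4\,\frac{m_e^2 + m_\mu^2}{s} + \left(1 - \frac{4m_e^2}{s}\right)\left(1 - \frac{4m_\mu^2}{s}\right)\cos^2\theta\right] . \qquad (7.36)$$
To compute the cross-section we use eq. (6.43), with
$$|\mathcal{M}_{fi}|^2 \to \frac{1}{4}\sum_{\rm spin}|\mathcal{M}_{fi}|^2 \qquad (7.37)$$
and with $|\mathbf{p}| = \sqrt{E^2 - m_e^2}$, $|\mathbf{p}'| = \sqrt{E^2 - m_\mu^2}$. Introducing $\alpha = e^2/(4\pi)$ we get
$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4s}\left(\frac{1 - (4m_\mu^2/s)}{1 - (4m_e^2/s)}\right)^{1/2}\left[1 + 4\,\frac{m_e^2 + m_\mu^2}{s} + \left(1 - \frac{4m_e^2}{s}\right)\left(1 - \frac{4m_\mu^2}{s}\right)\cos^2\theta\right] . \qquad (7.38)$$
In the large energy limit, $m_e, m_\mu \ll \sqrt{s}$ (but still $\sqrt{s} \ll m_Z$, otherwise QED is not the correct theory to use, and we must resort to the Standard Model) we have
$$\frac{d\sigma}{d\Omega} \simeq \frac{\alpha^2}{4s}\,(1 + \cos^2\theta) \qquad (7.39)$$
and, performing the angular integration, the total cross-section in this limit is
$$\sigma \simeq \frac{4\pi\alpha^2}{3s} . \qquad (7.40)$$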
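The step from eq. (7.38) to eq. (7.40) can be checked numerically. The following sketch evaluates eq. (7.38) and integrates it over the solid angle with a simple midpoint rule; the function names are hypothetical, the numerical values of $\alpha$ and of the masses are approximate inputs supplied here for illustration, and natural units are assumed (so the cross-section comes out in GeV$^{-2}$ when $s$ is in GeV$^2$).

```python
import math

ALPHA = 1.0 / 137.036   # fine-structure constant (approximate)
M_E = 0.000511          # electron mass in GeV (approximate)
M_MU = 0.10566          # muon mass in GeV (approximate)


def dsigma_domega(s, cos_theta, m_e=M_E, m_mu=M_MU, alpha=ALPHA):
    """Unpolarized differential cross-section of eq. (7.38)."""
    beta_e_sq = 1.0 - 4.0 * m_e ** 2 / s
    beta_mu_sq = 1.0 - 4.0 * m_mu ** 2 / s
    if beta_mu_sq <= 0.0:
        raise ValueError("s is below the muon pair threshold 4 m_mu^2")
    bracket = (1.0 + 4.0 * (m_e ** 2 + m_mu ** 2) / s
               + beta_e_sq * beta_mu_sq * cos_theta ** 2)
    return alpha ** 2 / (4.0 * s) * math.sqrt(beta_mu_sq / beta_e_sq) * bracket


def sigma_total(s, n_steps=2000, **kwargs):
    """Integrate eq. (7.38) over dOmega = 2 pi dcos(theta), midpoint rule."""
    h = 2.0 / n_steps
    total = sum(dsigma_domega(s, -1.0 + (i + 0.5) * h, **kwargs)
                for i in range(n_steps))
    return 2.0 * math.pi * h * total


def sigma_high_energy(s, alpha=ALPHA):
    """High-energy limit, eq. (7.40)."""
    return 4.0 * math.pi * alpha ** 2 / (3.0 * s)


if __name__ == "__main__":
    for sqrt_s in (0.25, 1.0, 10.0):   # CM energies in GeV, well below m_Z
        s = sqrt_s ** 2
        ratio = sigma_total(s) / sigma_high_energy(s)
        print(f"sqrt(s)={sqrt_s:5.2f} GeV  sigma/sigma_limit={ratio:.6f}")
```

Close to the muon threshold the ratio deviates visibly from one, while for $\sqrt{s}$ of a few GeV the massless formula (7.40) is already an excellent approximation.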
Problem 7.2. Electromagnetic form factors

In this problem we study the most general form of the radiative corrections to the electron–photon vertex, and we will show that the effect of loop corrections, to all orders in $\alpha$, is contained in two form factors, describing the electric charge density and the magnetic dipole density.

Fig. 7.6 The tree-level vertex. The photon is off-shell.

Consider first of all the graph in Fig. 7.6, where the initial and final electrons are on-shell, i.e. $p_1^2 = p_2^2 = m_e^2$, while the photon line with momentum $q^\mu$ can be an internal line of a more general graph, and therefore $q^2$ is generic. Consider the electromagnetic current operator
$$J^\mu_{\rm em}(x) = \bar\Psi(x)\gamma^\mu\Psi(x) , \qquad (7.41)$$
where $\Psi$ is the full quantum field, rather than the free field in the interaction picture. The information on the electron–photon vertex, to all orders in $\alpha$, is contained in the matrix element of this current between the initial and final electron states,
$$\langle p_2|J^\mu_{\rm em}(x)|p_1\rangle . \qquad (7.42)$$
(For notational simplicity, we suppress the spin labels in the initial and final state.) At tree level, we just substitute $\Psi$ with the free field (4.32) and we compute the matrix element explicitly,
$$\langle p_2|J^\mu_{\rm em}(x)|p_1\rangle_{\rm tree} = \bar u(p_2)\gamma^\mu u(p_1)\, e^{-i(p_1 - p_2)x} . \qquad (7.43)$$
This is the contribution to the matrix element of the $iT$ operator. The factors $e^{-i(p_1 - p_2)x}$, together with similar factors from the other external lines of the complete Feynman diagram, contribute to the overall Dirac delta expressing energy–momentum conservation, which is extracted from the definition of $\mathcal{M}_{fi}$, see eq. (5.115). These exponential factors can be extracted to all orders in perturbation theory, simply observing that, if $\hat P$ is the momentum operator, we have
$$J^\mu_{\rm em}(x) = e^{i\hat P x}J^\mu_{\rm em}(0)e^{-i\hat P x} \qquad (7.44)$$
and therefore
$$\langle p_2|J^\mu_{\rm em}(x)|p_1\rangle = \langle p_2|e^{i\hat P x}J^\mu_{\rm em}(0)e^{-i\hat P x}|p_1\rangle = e^{-i(p_1 - p_2)x}\langle p_2|J^\mu_{\rm em}(0)|p_1\rangle . \qquad (7.45)$$
So, the object which enters in $\mathcal{M}_{fi}$ is $\langle p_2|J^\mu_{\rm em}(0)|p_1\rangle$ and at tree level
$$\langle p_2|J^\mu_{\rm em}(0)|p_1\rangle_{\rm tree} = \bar u(p_2)\gamma^\mu u(p_1) . \qquad (7.46)$$

Fig. 7.7 Loop corrections to the vertex. The graph (a) is the tree-level contribution. The one-loop contributions are given by the graphs (b), (c), (d) and by a graph like (d), but with the photon line on the other electron line. The graph (e) is an example of a two-loop graph. The sum over all possible graphs is indicated by the blob.

The one-loop corrections and an example of a two-loop graph are shown in Fig. 7.7, and we have generically denoted by a blob the sum of all possible radiative corrections, to all orders in $\alpha$. Of course, it is not possible to compute this sum explicitly. However, the result is constrained by Lorentz invariance,
parity and gauge invariance. Lorentz invariance implies that the result must be a four-vector, so we should ask what four-vectors can be constructed with the spinors $\bar u(p_2)$ and $u(p_1)$. The most general fermion bilinears have been classified in eq. (3.119). Using the variables
$$p^\mu = p_1^\mu + p_2^\mu , \qquad q^\mu = p_2^\mu - p_1^\mu , \qquad (7.47)$$
the most general four-vector that we can construct with the spinors $\bar u_2 = \bar u(p_2)$ and $u_1 = u(p_1)$ is a linear combination of
$$\bar u_2\, p^\mu u_1 , \quad \bar u_2\, q^\mu u_1 , \quad \bar u_2\gamma^\mu u_1 , \quad \bar u_2\sigma^{\mu\nu}p_\nu u_1 , \quad \bar u_2\sigma^{\mu\nu}q_\nu u_1 , \qquad (7.48)$$
which are true four-vectors, plus the corresponding pseudovectors obtained by inserting $\gamma^5$. Therefore $\langle p_2|J^\mu_{\rm em}(0)|p_1\rangle$ must be a combination of these quantities.

Under parity $J^\mu_{\rm em}(x)$ is a true four-vector, so it transforms as $J^0_{\rm em}(t, \mathbf{x}) \to J^0_{\rm em}(t, -\mathbf{x})$ and $J^i_{\rm em}(t, \mathbf{x}) \to -J^i_{\rm em}(t, -\mathbf{x})$ or, more compactly, as $J^\mu_{\rm em}(t, \mathbf{x}) \to \eta^{\mu\mu}J^\mu_{\rm em}(t, -\mathbf{x})$, with no sum over the $\mu$ index. At the same time, under parity the state $|p_1\rangle \to \eta_e|p_1'\rangle$, where $\eta_e$ is the intrinsic parity of the electron and $p_1'$ is the parity-reversed momentum, $p_1' = (p_1^0, -p_1^i)$; similarly $\langle p_2| \to \langle p_2'|\eta_e^*$, with the same $\eta_e$ since we have an electron both in the initial and in the final state. The factor $\eta_e^*\eta_e = 1$ therefore cancels and the matrix element $\langle p_2|J^\mu_{\rm em}(0)|p_1\rangle$, under parity, picks up a factor $\eta^{\mu\mu}$ while the momenta $p_1$, $p_2$ become parity-reversed. This is the same transformation property as that of the five terms displayed in eq. (7.48), while the corresponding quantities constructed with $\gamma^5$ pick up an overall $-\eta^{\mu\mu}$ factor. Since parity is a symmetry of QED, only the terms in eq. (7.48) can enter in the parametrization of the matrix element that we are considering, while the terms with $\gamma^5$ are absent.

Another simplification comes from the fact that, when $u_1$, $\bar u_2$ are solutions of the Dirac equation with masses $m_1$, $m_2$ respectively (in our case $m_1 = m_2 = m_e$), there is an algebraic identity, known as the Gordon identity: from the definition of $\sigma^{\mu\nu}$,
$$\bar u_2\sigma_{\mu\nu}q^\nu u_1 = \frac{i}{2}\bar u_2[\gamma_\mu, \gamma_\nu](p_2^\nu - p_1^\nu)u_1 = \frac{i}{2}\bar u_2\left[\gamma_\mu\not{p}_2 - \gamma_\mu\not{p}_1 - \not{p}_2\gamma_\mu + \not{p}_1\gamma_\mu\right]u_1 . \qquad (7.49)$$
In the first and in the fourth term we anticommute $\not{p}$ with $\gamma_\mu$, so that $\not{p}_1$ is next to $u_1$ and $\not{p}_2$ is next to $\bar u_2$, using
$$\gamma_\mu\not{p}_2 = \left(\{\gamma_\mu, \gamma_\nu\} - \gamma_\nu\gamma_\mu\right)p_2^\nu = 2p_{2,\mu} - \not{p}_2\gamma_\mu , \qquad (7.50)$$
and similarly for $\not{p}_1\gamma_\mu$. Then we use $\not{p}_1u_1 = m_1u_1$ and $\bar u_2\not{p}_2 = m_2\bar u_2$, see eqs. (3.100) and (3.114), and we find
$$\bar u_2\sigma_{\mu\nu}q^\nu u_1 = i\bar u_2\left[p_\mu - (m_1 + m_2)\gamma_\mu\right]u_1 . \qquad (7.51)$$
Similarly, considering $\bar u_2\sigma_{\mu\nu}p^\nu u_1$, we find
$$\bar u_2\sigma_{\mu\nu}p^\nu u_1 = i\bar u_2\left[q_\mu + (m_1 - m_2)\gamma_\mu\right]u_1 . \qquad (7.52)$$
We can use these identities to eliminate $\bar u_2\sigma^{\mu\nu}p_\nu u_1$ and $\bar u_2\, p^\mu u_1$ from the list of independent bilinears. Therefore we find that $\langle p_2|J^\mu_{\rm em}(0)|p_1\rangle$ is at most a linear combination of $\bar u(p_2)\gamma^\mu u(p_1)$, $\bar u(p_2)\sigma^{\mu\nu}q_\nu u(p_1)$ and $\bar u(p_2)q^\mu u(p_1)$. The coefficients must be Lorentz invariant functions. With $p^\mu$ and $q^\mu$ we can construct the invariants $q^2$, $p^2$ and $qp$. However, $qp = p_2^2 - p_1^2 = 0$, while $q^2$ and $p^2$ are not independent since $q^2 = p_2^2 + p_1^2 - 2p_2p_1 = 2m_e^2 - 2p_2p_1$ and $p^2 = p_2^2 + p_1^2 + 2p_2p_1 = 2m_e^2 + 2p_2p_1$, so that $q^2 + p^2 = 4m_e^2$. We choose $q^2$ as the independent variable. Then
$$\langle p_2|J^\mu_{\rm em}(0)|p_1\rangle = f_1(q^2)\,\bar u_2\gamma^\mu u_1 + f_2(q^2)\,\frac{i}{2m_e}\,\bar u_2\sigma^{\mu\nu}q_\nu u_1 + f_3(q^2)\,\bar u_2\, q^\mu u_1 . \qquad (7.53)$$
The factor $i$ in $i\sigma^{\mu\nu}q_\nu$ is chosen so that $f_2(q^2)$ is real, and the factor $1/2m_e$ is a convenient normalization, so that $f_2(q^2)$ is dimensionless, as $f_1(q^2)$.

We now make use of the fact that, as a consequence of gauge invariance, the electromagnetic current is conserved, $\partial_\mu J^\mu_{\rm em}(x) = 0$. Using eq. (7.45) we see that this means that
$$q_\mu\langle p_2|J^\mu_{\rm em}(0)|p_1\rangle = 0 . \qquad (7.54)$$
The term $\bar u_2\gamma^\mu u_1$ satisfies this condition, since $q_\mu\bar u_2\gamma^\mu u_1 = \bar u_2(\not{p}_2 - \not{p}_1)u_1$ vanishes because $u_1$ and $u_2$ are solutions of the equations of motion. Also the term $\bar u_2\sigma^{\mu\nu}q_\nu u_1$ satisfies the condition: $\sigma^{\mu\nu}q_\nu q_\mu$ vanishes identically because $\sigma^{\mu\nu}$ is antisymmetric. Instead $q_\mu\bar u_2\, q^\mu u_1 = q^2\,\bar u_2u_1$ does not vanish, since the photon is in general off-shell, and $q^2 \neq 0$. Current conservation is exact to all orders in $\alpha$, since it is a consequence of gauge invariance, which means that the function $f_3(q^2)$ must be identically zero. In conclusion, the most general parametrization of the matrix element of the electromagnetic current, compatible with Lorentz invariance, parity and current conservation (i.e. with gauge invariance) is
$$\langle p_2|J^\mu_{\rm em}(0)|p_1\rangle = f_1(q^2)\,\bar u_2\gamma^\mu u_1 + f_2(q^2)\,\frac{i\,q_\nu}{2m_e}\,\bar u_2\sigma^{\mu\nu}u_1 . \qquad (7.55)$$
The functions $f_1(q^2)$ and $f_2(q^2)$ are called form factors. Comparison with eq. (7.46) shows that at tree level $f_1(q^2) = 1$ and $f_2(q^2) = 0$. We now want to understand their physical meaning, which can be obtained by considering their effect on the scattering amplitude.

The meaning of $f_1$ can be understood by considering its effect on the scattering of the electron on a static source, such as a heavy atom. We computed it at lowest order in Section 6.6. Including $f_1$, eq. (6.77) becomes
$$M_{fi}(\mathbf{q}) = \frac{Ze^2}{\mathbf{q}^2}\,f_1(\mathbf{q}) , \tag{7.56}$$
where we have taken into account that, for elastic scattering, $q^\mu = (0,\mathbf{q})$, and we denoted by $f_1(\mathbf{q})$ the function $f_1(q^2)$ evaluated at $q^\mu = (0,\mathbf{q})$. The interaction potential (6.78) then becomes
$$V(\mathbf{x}) = -Ze^2\int\frac{d^3q}{(2\pi)^3}\,\frac{f_1(\mathbf{q})}{\mathbf{q}^2}\,e^{i\mathbf{q}\cdot\mathbf{x}} . \tag{7.57}$$
We denote by $\rho(\mathbf{x})$ the inverse Fourier transform of $f_1(\mathbf{q})$,
$$f_1(\mathbf{q}) = \int d^3x\,\rho(\mathbf{x})\,e^{-i\mathbf{q}\cdot\mathbf{x}} . \tag{7.58}$$
Then
$$V(\mathbf{x}) = -Ze^2\int d^3x'\,\rho(\mathbf{x}')\int\frac{d^3q}{(2\pi)^3}\,\frac{1}{\mathbf{q}^2}\,e^{i\mathbf{q}\cdot(\mathbf{x}-\mathbf{x}')} = -\frac{Ze^2}{4\pi}\int d^3x'\,\frac{\rho(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} . \tag{7.59}$$
This is the Coulomb potential generated by a charge distribution $\rho(\mathbf{x})$, and therefore the form factor $f_1(\mathbf{q})$ is the Fourier transform of the charge distribution. We see that the effect of loop corrections is to delocalize the charge distribution of the electron, which is a Dirac delta at tree level and becomes $\rho(\mathbf{x})$ after the electron is “dressed” by the radiative corrections. We also see that, to all orders in $\alpha$, $f_1(0) = 1$, since it is just the total electron charge, in units of $e$.

The meaning of the second form factor is more easily understood by looking instead at the scattering in an external magnetic field. Instead of specifying the structure of the source, we can more simply write the interaction with an external field $A^{\rm ext}_\mu$ in the form
$$\mathcal{L}_{\rm ext} = -e\,A^{\rm ext}_\mu J^\mu_{\rm em} . \tag{7.60}$$
We consider again a static external field, so $q^0 = 0$, but now we take $A^{\rm ext}_0 = 0$ and $\nabla\times\mathbf{A} = \mathbf{B}$. The amplitude is
$$\mathcal{M}_{fi} = -e f_1(\mathbf{q})\,\tilde{A}^{\rm ext}_\mu(\mathbf{q})\,\bar{u}(p')\gamma^\mu u(p) - e f_2(\mathbf{q})\,\frac{i}{2m_e}\,\tilde{A}^{\rm ext}_\mu(\mathbf{q})\,q_\nu\,\bar{u}(p')\sigma^{\mu\nu}u(p)$$
$$\phantom{\mathcal{M}_{fi}} = e f_1(\mathbf{q})\,\tilde{A}^{i}_{\rm ext}(\mathbf{q})\,\bar{u}(p')\gamma^i u(p) - i\,\frac{e}{2m_e}\,f_2(\mathbf{q})\,\tilde{A}^{i}_{\rm ext}(\mathbf{q})\,q^j\,\bar{u}(p')\sigma^{ij}u(p) . \tag{7.61}$$
We consider a slowly varying field, so we take $\mathbf{q}\to 0$ and we keep only the first nonvanishing contribution. The computation is performed in detail in Peskin and Schroeder (1995), Section 6.2, so we simply quote the result. The expansion in powers of $\mathbf{q}$ starts from a constant spin-independent term proportional to $\mathbf{p}+\mathbf{p}'$. This is the contribution of the operator $\mathbf{p}\cdot\mathbf{A} + \mathbf{A}\cdot\mathbf{p}$ from the nonrelativistic Hamiltonian $(\mathbf{p}-e\mathbf{A})^2/2m_e$. We are more interested in the next term, which is linear in $\mathbf{q}$ and depends on the spin, and gets a contribution both from $f_1$ and from $f_2$. The result, limiting ourselves to processes with the same spin state for the initial and final electron, is
$$M_{fi} \simeq \frac{e}{2m_e}\,[f_1(0)+f_2(0)]\,\tilde{B}^i_{\mathbf{q}}\,\xi^\dagger\sigma^i\xi , \tag{7.62}$$
where $M_{fi} = \mathcal{M}_{fi}/(2m_e)$ is the matrix element with a nonrelativistic normalization for the electron, and is related to the scattering potential as in eq. (6.71); $\tilde{B}^i_{\mathbf{q}}$ is the Fourier component of the magnetic field. Using the results of Section 6.6, we can see that this is the scattering amplitude that would be generated, in the nonrelativistic theory, by a potential
$$V(\mathbf{x}) = -\boldsymbol{\mu}\cdot\mathbf{B}(\mathbf{x}) \tag{7.63}$$
with
$$\boldsymbol{\mu} = \frac{e}{m_e}\,[f_1(0)+f_2(0)]\;\xi^\dagger\frac{\boldsymbol{\sigma}}{2}\xi . \tag{7.64}$$
From (7.63) we see that $\boldsymbol{\mu}$ is a magnetic dipole moment. Since $f_1(0) = 1$ to all orders in perturbation theory, the magnetic moment $\boldsymbol{\mu}$ is related to the expectation value of the spin operator,
$$\mathbf{S} = \xi^\dagger\frac{\boldsymbol{\sigma}}{2}\xi , \tag{7.65}$$
by
$$\boldsymbol{\mu} = g\,\frac{e}{2m_e}\,\mathbf{S} , \tag{7.66}$$
where $g = 2 + 2f_2(0)$, or
$$\frac{g-2}{2} = f_2(0) . \tag{7.67}$$
The form factor $f_2(q^2)$ therefore gives a correction to the magnetic dipole moment. At tree level, $f_2(0) = 0$ and we recover the result that we found from the Dirac equation, see eq. (3.186). Deviations from $g = 2$ therefore come from the loop corrections to $f_2(0)$. The one-loop result, first derived by Schwinger in 1948, turns out to be $f_2(0) = \alpha/(2\pi)$.
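To get a feeling for the size of this correction, the short sketch below evaluates the one-loop value $f_2(0) = \alpha/(2\pi)$ and the corresponding $g = 2 + 2f_2(0)$. The numerical value of $\alpha$ and the function names are assumptions introduced here for illustration; they are not taken from the text.

```python
import math

# Approximate value of the fine-structure constant (an input assumed here).
ALPHA = 1 / 137.035999

def schwinger_f2(alpha: float = ALPHA) -> float:
    """One-loop form factor at zero momentum transfer, f2(0) = alpha / (2 pi)."""
    return alpha / (2 * math.pi)

def g_factor(f2_at_zero: float) -> float:
    """Gyromagnetic ratio from eq. (7.67), g = 2 + 2 f2(0)."""
    return 2 + 2 * f2_at_zero

f2_one_loop = schwinger_f2()
print(f"f2(0) at one loop = {f2_one_loop:.7f}")
print(f"g at one loop     = {g_factor(f2_one_loop):.7f}")
```

The correction to $g = 2$ is at the per-mille level, which is why precise measurements of $g-2$ are sensitive tests of the loop structure of QED.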
Summary of chapter

• QED describes the interactions of spin 1/2 charged particles with photons. The Feynman rules are summarized in Fig. 7.8. The wave functions associated to the external legs were given on page 135.
• QED is renormalizable. The one-loop divergences are studied in Section 7.2 and are reabsorbed into the renormalization of the fermion mass, fermion wave function, photon wave function and electric charge. The photon mass is protected against loop corrections by gauge invariance.
• Invariance principles constrain the form of loop corrections to all orders. In particular, the vertex is parametrized by two form factors, representing the charge density and the magnetic dipole density.
Fig. 7.8 The Feynman rules for QED: fermion propagator $\dfrac{i}{\slashed{p} - m + i\epsilon}$; photon propagator $\dfrac{-i}{k^2 + i\epsilon}\left[\eta^{\mu\nu} - (1-\xi)\dfrac{k^\mu k^\nu}{k^2}\right]$; vertex $-ie\gamma^\mu$.
Further reading

• QED is discussed in great detail in many books. See, in particular, Itzykson and Zuber (1980), Peskin and Schroeder (1995), Weinberg (1995) and Landau and Lifshitz, vol. IV (1982).
• Many useful theoretical and experimental results are collected in Quantum Electrodynamics, T. Kinoshita ed., World Scientific 1990. This includes reviews on high-precision tests of QED, a description of measurements and of calculations of the magnetic moment of electrons and muons, hydrogenic bound states, Lamb shift experiments, hyperfine structure experiments, precision measurements in positronium, etc.
• For an explicit calculation of the $g-2$ of the electron at $O(\alpha)$ and of the Lamb shift see, e.g. Mandl and Shaw (1984), Section 9.6.
Exercises

(7.1) (i) Write the Feynman diagrams for the annihilation $e^+e^-\to 2\gamma$, to lowest perturbative order.
(ii) Write the modulus squared of the amplitude as a trace, averaging over the initial spins and summing over the final helicities. Evaluate the trace in the limit $\mathbf{p}\to 0$, where $\mathbf{p}$ is the electron momentum in the CM.
(iii) Compute the cross-section $\bar\sigma$ of the process (averaged and summed over the initial and final spins) in the limit $\mathbf{p}\to 0$. Show that the flux factor $I$ can be written as
$$I = E_1E_2\,v , \tag{7.68}$$
where $E_1$, $E_2$ are the energies of the two incoming particles in the CM and $v$ their relative velocity. Using this, and the value of $|\mathcal{M}|^2$ computed above, verify that in the limit $\mathbf{p}\to 0$,
$$\bar\sigma_{e^+e^-\to 2\gamma} \to \frac{\pi r_B^2}{v} , \tag{7.69}$$
with $r_B = \alpha/m_e$.

(7.2) In this exercise we compute the decay rate of positronium (the hydrogenoid bound state of $e^+$ and $e^-$) into two photons, restricting to positronium states with orbital angular momentum $L = 0$.
(i) Using the results of Exercise 4.1, show that, if $L = 0$, the annihilation can take place only when the $e^+e^-$ pair has total angular momentum $J = 0$, and that the cross-section $\bar\sigma$, averaged over initial spin and summed over the final helicities, is given by
$$\bar\sigma = \frac{1}{4}\,\sigma^{(J=0)} , \tag{7.70}$$
where $\sigma^{(J=0)}$ is the cross-section with the $e^+e^-$ pair in the state with $J = 0$.
(ii) Consider first the cross-section for the annihilation $e^+e^-\to 2\gamma$,
$$\sigma = \frac{1}{4I}\int|\mathcal{M}_{e^+e^-\to 2\gamma}|^2\,d\Phi^{(2)} \tag{7.71}$$
(we include the $1/2!$ for the identical photons in $d\Phi^{(2)}$). Using eq. (7.68) show that
$$\sigma v = \int|M_{e^+e^-\to 2\gamma}|^2\,d\Phi^{(2)} , \tag{7.72}$$
where $M_{fi}$ is the matrix element with the normalization of one particle per unit volume. Verify also that
$$\Gamma_{{\rm Pos}\to 2\gamma} = \int|M_{{\rm Pos}\to 2\gamma}|^2\,d\Phi^{(2)} , \tag{7.73}$$
where in $M_{{\rm Pos}\to 2\gamma}$ the positronium state (labeled “Pos”) is normalized as one particle per unit volume.
(iii) Write $M_{e^+e^-\to 2\gamma} = \langle 2\gamma|\mathbf{p},-\mathbf{p}\rangle$, where $\mathbf{p}$ is the electron momentum in the CM, and $M_{{\rm Pos}\to 2\gamma} = \langle 2\gamma|{\rm Pos}\rangle$, where the positronium is at rest. Show that
$$M_{{\rm Pos}\to 2\gamma} = \int\frac{d^3p}{(2\pi)^3}\,\langle 2\gamma|\mathbf{p},-\mathbf{p}\rangle\,\tilde\psi(\mathbf{p}) , \tag{7.74}$$
where $\tilde\psi(\mathbf{p})$ is the positronium wave function in momentum space.
(iv) Justify the fact that $\tilde\psi(\mathbf{p})$ is peaked at small values of the momentum, $|\mathbf{p}|\sim(1/2)m_e\alpha$, and from this derive that, at lowest order in $\alpha$,
$$M_{{\rm Pos}\to 2\gamma} \simeq \psi(0)\lim_{\mathbf{p}\to 0}\langle 2\gamma|\mathbf{p},-\mathbf{p}\rangle , \tag{7.75}$$
where $\psi(\mathbf{x})$ is the wave function in position space. Using eqs. (7.72) and (7.73) derive the relation
$$\Gamma_{{\rm Pos}\to 2\gamma} = |\psi(0)|^2\lim_{\mathbf{p}\to 0}\sigma^{(J=0)}v . \tag{7.76}$$
(v) The wave functions of positronium are the same as those of the hydrogen atom, with the replacement of the reduced mass, which is approximately $m_e$ in the hydrogen atom, with $m_e/2$ for the $e^+e^-$ system. Then, in the state with quantum numbers $n = 0$, $L = 0$,
$$\psi(r) = \frac{1}{\sqrt{\pi a^3}}\,e^{-r/a} , \tag{7.77}$$
with $a = 2/(m_e\alpha)$. Using the cross-section (7.69) and eqs. (7.76) and (7.70) show that, for the state $n = L = 0$,
$$\Gamma_{{\rm Pos}\to 2\gamma} = \frac{1}{2}\,m_e\alpha^5 . \tag{7.78}$$
Compare this result with the experimental value of the decay rate (the inverse lifetime), $\Gamma = 7.994(11)\ {\rm ns}^{-1}$.
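The comparison requested at the end of Exercise 7.2 amounts to a unit conversion, which can be sketched as follows. The numerical constants (fine-structure constant, electron mass, $\hbar$) and the function names are assumptions introduced here; the physics input is only eqs. (7.69), (7.70), (7.76) and (7.77), in natural units.

```python
import math

# Input constants (approximate values assumed for this sketch).
ALPHA = 1 / 137.035999        # fine-structure constant
M_E_EV = 510998.95            # electron mass in eV
HBAR_EV_S = 6.582119569e-16   # hbar in eV s

def decay_rate_ev() -> float:
    """Gamma = |psi(0)|^2 * lim sigma^(J=0) v, in eV (natural units)."""
    # |psi(0)|^2 = 1 / (pi a^3) with a = 2 / (m_e alpha), eq. (7.77)
    psi0_squared = (M_E_EV * ALPHA / 2) ** 3 / math.pi
    # sigma^(J=0) v = 4 * sigma_bar v = 4 pi r_B^2 with r_B = alpha / m_e
    sigma_v = 4 * math.pi * (ALPHA / M_E_EV) ** 2
    return psi0_squared * sigma_v

def ev_to_inverse_ns(rate_ev: float) -> float:
    """Convert a rate from eV to ns^-1 by dividing by hbar."""
    return rate_ev / HBAR_EV_S * 1e-9

gamma = decay_rate_ev()
print(f"Gamma = {gamma:.4e} eV = {ev_to_inverse_ns(gamma):.3f} ns^-1")
```

With these inputs the lowest-order rate comes out close to 8.03 ns$^{-1}$, about half a percent above the experimental value quoted in the exercise; a difference of this size is what one expects from corrections of relative order $\alpha$.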
8 The low-energy limit of the electroweak theory

In this chapter we begin our study of weak interactions. The electroweak theory is described by the Standard Model (SM). A systematic explanation of the SM is beyond the scope of this course, but in the next chapters we will introduce two of the most important theoretical tools for its construction, i.e. nonabelian gauge fields and the Higgs mechanism. However, the full structure of the SM is only revealed at energies comparable to the masses of the bosons $W^\pm$ and $Z^0$ that, together with the photon, mediate the electroweak interaction. Since $m_W = 80.425(38)$ GeV and $m_Z = 91.1876(21)$ GeV, the weak decays of particles with masses between a few hundred MeV and a few GeV, as for instance the muons, the pions, the kaons, the neutron, charmed mesons like the $D^0$, etc., can be studied in a low-energy approximation to the SM. For instance in the β-decay of the free neutron, $n\to pe^-\bar\nu_e$, we have a mass difference $m_n - m_p \simeq 1.29$ MeV. Therefore, even if at the fundamental level the decay is mediated by the $W$ boson, the fact that the maximum momentum transfer is much smaller than $m_W$ allows us to use a low-energy effective theory. The same approximation holds for nuclear β-decays. For the same reason, we can use the low-energy theory when we study a scattering process mediated by weak interactions, as for instance $e^-\nu_e\to e^-\nu_e$, at center-of-mass energies well below $m_W$.

In this chapter we introduce this low-energy approximation, which is given by a four-fermion model, and we will understand how it can be obtained by “integrating out” some heavy gauge bosons. We will then illustrate in detail in the Solved Problems section how to use it to compute explicitly many weak decays.
8.1 A four-fermion model

As a preliminary exercise, we consider a theory with a single Dirac fermion $\Psi$, and a Lagrangian
$$\mathcal{L} = \bar\Psi\,(i\slashed{\partial} - m)\,\Psi + G\,(\bar\Psi\Psi)^2 . \tag{8.1}$$
The interaction is given by a four-fermion term, and in the Feynman diagrams we have vertices where four fermionic lines meet at a point,
Fig. 8.1 The exchange of a massive intermediate boson reduces to a four-fermion interaction in the low-energy limit.
$J > 0$, so the interaction tends to align the spins, i.e. it is a ferromagnetic coupling. The statistical mechanics of the Ising model is obtained from the partition function $Z = e^{-\beta F} = {\rm tr}\,e^{-\beta H}$, with $\beta = 1/k_BT$, and therefore is governed by the dimensionless parameter
$$K \equiv \frac{J}{k_BT} . \tag{9.65}$$
As with any statistical system, at ﬁxed temperature T and ﬁxed number of particles the equilibrium state is given by the minimum of the free energy F = E − T S, and will be the result of the competition between the tendency to minimize the energy E, so in our case to align the spins, and the tendency to maximize the entropy S, so to disorder the system. At small T the tendency to minimize energy is more important while increasing T the tendency toward maximization of the entropy becomes progressively more relevant. In the Ising model this competition leads to
a phase transition: below a critical temperature $T_c$ (or, more precisely, below a critical value of $k_BT/J$) the system develops a spontaneous magnetization, that is $M \equiv \langle s_i\rangle \neq 0$, while above $T_c$ the magnetization $M$ vanishes (see Fig. 9.2). As $T\to T_c$ from below, the magnetization goes to zero as
$$M \sim (T_c - T)^{\beta} , \tag{9.66}$$
and $\beta$ is known as a critical index. In the two-dimensional Ising model $\beta = 1/8$, but this kind of behavior is quite general for critical phenomena. In the magnetized phase there is long-range order. Since the spins tend to be aligned, if the spin $s_0$ at the site 0 has the value $s_0 = +1$ then, even if the site $n$ is very far from the site 0, the probability of finding $s_n = +1$ is higher than the probability of having $s_n = -1$. Therefore there is no exponential decay of the correlation function (there will be, rather, a power-law fall off) or, in other words, the correlation length is infinite. In the disordered phase $T > T_c$, instead, $\xi$ is finite. As $T\to T_c$ from above, $\xi$ diverges as
$$\xi \sim \frac{1}{(T - T_c)^{\nu}} , \tag{9.67}$$
with $\nu$ another critical index ($\nu = 1$ in the two-dimensional Ising model). This behavior is illustrated in Fig. 9.3.

Fig. 9.2 The magnetization $M$ as a function of temperature $T$ in the Ising model.

Fig. 9.3 The correlation length $\xi$ as a function of temperature $T$ in the Ising model.

We are now ready to discuss the renormalization group (RG) equations in the language of critical phenomena. First of all, observe that in QFT the lattice spacing (as any other UV cutoff) is not a physical quantity, and we are only interested in the limit $a\to 0$. For this reason, it is difficult to have an intuitive picture of how the bare theory should depend on the cutoff. In statistical mechanics, instead, the lattice spacing is a physical quantity, for instance of order $10^{-8}$ cm in a typical condensed matter system. The correspondence between QFT and statistical mechanics provided by the path integral is, more precisely, a correspondence between the bare Green’s functions of QFT (which are objects about which it is difficult to form an intuitive physical picture) and the physical correlation functions of the statistical system. It is therefore not surprising that in statistical systems the RG equations have a more intuitive interpretation.

Consider for example the Ising model; the Hamiltonian (9.64) gives a microscopic description of the system in terms of interactions between nearest-neighbor spins separated by a distance $a$. However, we are usually not interested in the details of the interaction on a microscopic scale. We are more interested in understanding what happens at the macroscopic scale. To go from a microscopic to a macroscopic description is a nontrivial problem, since cooperative effects between the spins can take place. This is especially important when the correlation length $\xi$ is large because, even if the fundamental interaction involves only nearest-neighbor pairs, the influence of each spin on the others is propagated, via this nearest-neighbor interaction, across a distance of order $\xi$.
To construct an effective Hamiltonian which describes the physics at the macroscopic scale we can proceed in steps, “integrating out” the details of the model at short scales. A possible way to do it, in the Ising model, is via a “block spin” transformation, first introduced by Kadanoff. Regroup the lattice sites into squares made of four adjacent spins, as in the upper part of Fig. 9.4. If $\xi \gg 1$, nearest-neighbor spins must be very strongly correlated. This means that, defining the variables $S_n = (1/4)\sum_{i\in n}s_i$, where the sum is over the four adjacent spins in the $n$th square, each $S_n$ will most of the time take the values $\pm 1$, like the original spin variables $s_i$, and therefore we can consider them as effective spin variables, living on the centers of their respective squares, indicated by a cross in the lower part of Fig. 9.4.⁵ Since the original spins $s_i$ had only nearest-neighbor interaction, in a first approximation these “block spin” variables will also interact among themselves only with a nearest-neighbor interaction, but their effective coupling constant in general will not be the same as the coupling $K$ which appears in eqs. (9.64) and (9.65). Rather, the coupling will have a value $K_2 \equiv f(K)$, for some function $f(K)$, determined by the condition that the partition function is preserved,
$${\rm Tr}_{\{S_n\}}\,e^{-\beta H'[S_n]} = {\rm Tr}_{\{s_i\}}\,e^{-\beta H[s_i]} , \tag{9.68}$$
so that the theory obtained with a block spin transformation describes the same macroscopic physics. In general, $H'[S_n]$, as determined by the above equation, will also contain interaction terms that are not present in $H$, so the evolution of the couplings should really be followed in a multiparameter space of coupling constants, rather than being restricted to the coupling $K$. For the moment, we neglect this aspect to simplify the presentation of the main ideas. If $K$ describes the interaction on a microscopic scale $a$, $K_2$ describes the effective interaction on the scale $2a$. We can then iterate the procedure, performing this block spin transformation over the $S_n$ variables, again regrouping four of them together; since the Hamiltonian still has the same form as in eq. (9.64), with just the replacement $K\to f(K)$, the effective coupling at the scale $4a$ will be
$$K_3 \equiv f(K_2) = f(f(K)) . \tag{9.69}$$

Fig. 9.4 The block spin transformation: the lattice with spacing $a$ is replaced by a lattice with spacing $2a$.

⁵ Actually, there are better formulations of the RG transformation where the calculations can be done more explicitly and systematically. For instance, for a scalar field $\phi_i = \phi(x_i)$ one can integrate over the original variables in the path integral, with a Dirac delta of the form $\delta(\Phi_n - \sum_{i\in n}\phi_i)$, which forces the four variables $\phi_i$ in the $n$th square to have a given value $\Phi_n$, so we obtain an effective theory in terms of the “block fields” $\Phi_n$. Alternatively, we can work in momentum space, using the momentum modes $\phi_k$, regularizing the theory with a cutoff $\Lambda$ which restricts the Euclidean momenta to $|k| < \Lambda$. Then, in the partition function $Z$, we integrate explicitly over the modes $\phi_k$ in a shell $\Lambda - \delta\Lambda < |k| < \Lambda$, remaining with an effective action for the low-momentum modes. We do not enter into these technical aspects, and we limit ourselves to illustrating the main ideas using Kadanoff’s block spin transformation, which is rather intuitive.

We can iterate the procedure $n$ times, until $2^n a$ becomes almost of order $a\xi$. What we gain by this procedure is that at each stage the number of effective spin variables which are strongly correlated among themselves has been thinned. In fact, in terms of the variable $S_n$, the new (dimensionless) correlation length is $\xi_1 = \xi_0/2$ (where $\xi_0$ is the correlation length of the original variables $s_i$), since the spacing between the spins $S_n$ is $2a$. After $n$ steps the correlation length is $\xi_n = \xi_0/2^n$. When this number becomes of order one, collective effects do not play any role, and the physics of the system can simply be read off the effective Hamiltonian. Therefore the coupling $K_n$ obtained iterating $n$ times the block spin transformation, i.e. the coupling defined by
$$K_n = f(K_{n-1}) , \tag{9.70}$$
with the initial condition $K_1 = K$, is the effective coupling which describes the physics at the length-scale $l = 2^n a$.

The relation with the renormalization group equations discussed in Section 5.9 now emerges. What we find, either using the QFT language in Section 5.9 or the language of statistical mechanics in this section, is that the behavior of the theory is described by effective coupling constants that depend on a length-scale $l$ (or equivalently on an energy-scale $1/l$). In statistical mechanics $l$ represents the scale over which microscopic fluctuations have been averaged out while, in QFT, $1/l$ is the cutoff in momentum space. All the necessary information is encoded in the functions that describe how the couplings change as we change the scale: the beta function in Section 5.9 or the function $f(K)$ in this section.

We see that the problem of taking into account the collective action of many degrees of freedom when the correlation length is large has been translated into the problem of computing the function $f(K)$. This can be a very difficult task, so it might seem that we have simply translated a difficult problem into another difficult problem. However, the power of the method emerges in connection with the notion of fixed point. A fixed point $K_c$ is defined as a solution of the equation
$$K_c = f(K_c) . \tag{9.71}$$
Observe that $\xi_n$ is a function of $K_n$, i.e. $\xi_n = \xi(K_n)$. Since $\xi_{n+1} = \xi_n/2$, the function $\xi(K)$ must satisfy
$$\xi[f(K)] = \frac{1}{2}\,\xi(K) . \tag{9.72}$$
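Since the function $f(K)$ is left unspecified in the text, a concrete illustration of eqs. (9.70) and (9.72) needs an explicit choice. The sketch below swaps in a different, simpler case: the one-dimensional Ising chain, where summing over every second spin (decimation, not the two-dimensional block spin average described above) gives exactly $f(K) = \frac{1}{2}\ln\cosh 2K$, and where the correlation length in units of the lattice spacing is $\xi(K) = -1/\ln\tanh K$. The function names are illustrative.

```python
import math

def f(K: float) -> float:
    """Exact scale-factor-2 decimation map of the 1D Ising chain."""
    return 0.5 * math.log(math.cosh(2 * K))

def xi(K: float) -> float:
    """Correlation length of the 1D Ising chain, in lattice units."""
    return -1.0 / math.log(math.tanh(K))

def flow(K: float, n_steps: int) -> list:
    """Return [K_1, K_2, ...] with K_1 = K and K_n = f(K_{n-1}), eq. (9.70)."""
    couplings = [K]
    for _ in range(n_steps):
        couplings.append(f(couplings[-1]))
    return couplings

for step, K in enumerate(flow(1.5, 6), start=1):
    print(f"n = {step}   K_n = {K:.6f}   xi(K_n) = {xi(K):.4f}")
```

In this example each step halves the correlation length, as required by eq. (9.72), and the coupling flows to $K = 0$. The only fixed points of this map are $K = 0$, where $\xi = 0$, and $K = \infty$, where $\xi = \infty$, which reflects the absence of a finite-temperature transition in one dimension.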
At the fixed point $K_c = f(K_c)$, so $\xi(K_c) = (1/2)\xi(K_c)$, which has two solutions, $\xi(K_c) = \infty$ and $\xi(K_c) = 0$. We see from eq. (9.63) that the latter case is of no interest for constructing an interacting continuum QFT, and is called a “trivial” fixed point. The fixed points corresponding to $\xi(K_c) = \infty$ are instead called “critical”.

In a general system, the space of coupling constants will be multidimensional, and one must also take into account that, even if a term is not originally present in the microscopic Hamiltonian, it will be generated by the RG transformation, unless it is protected by some symmetry. Therefore we will have a RG flow in this multidimensional (actually, infinite-dimensional) space. However, most of the couplings will be simply driven to zero by the RG transformation. The operators with which they are associated in the Lagrangian are termed irrelevant. In the space of the remaining couplings, the RG flow will be basically determined by the fixed-point structure. Barring more exotic possibilities, like chaotic behavior, the RG trajectory will flow either to infinity or toward the fixed points.

A hypothetical example of a flow in a two-dimensional parameter space is shown in Fig. 9.5. In this figure we have drawn the critical surface, that is the surface in the parameter space where $\xi = \infty$, and other surfaces at constant $\xi$. We have considered a situation in which on the critical surface there are three fixed points, labeled A, B and C. Furthermore, there is a trivial fixed point $P_0$ with $\xi = 0$. Each point on this graph corresponds to a set of values for the couplings, i.e. a physical Hamiltonian in the language of statistical mechanics, or a bare QFT in field-theoretical language. Since under a RG step $\xi\to\xi/2$, if we apply the transformation to a point on the critical surface we remain on the critical surface, while if we start at $\xi$ finite we will be eventually driven to the fixed point at $\xi = 0$.

Fig. 9.5 An example of RG flow with three fixed points A, B, C on the critical surface ($\xi = \infty$) and a fixed point $P_0$ at $\xi = 0$; the surfaces $\xi = 10$ and $\xi = 1$ are also shown.

A very important property of a fixed point is its stability. In Fig. 9.5 $P_0$ is a stable fixed point: the flow is such that, if we start infinitesimally close to $P_0$ in any direction, the RG transformation will bring us toward $P_0$. All fixed points on the critical surface have instead at least one unstable direction, which is the direction orthogonal to the critical surface itself: if we start close to the critical surface, but not exactly on it, the RG transformation decreases $\xi$ and drives us further away from the critical surface. The fixed points A and C have one stable direction along the critical surface, while B has instead a second unstable direction.

Points that are on the same RG trajectory describe the same macroscopic physics, by construction of the RG transformation, see eq. (9.68). In principle, each point on the critical surface is suitable for removing the cutoff and defining a renormalized QFT, since there $\xi = \infty$ and $a = 0$. However, these theories are not all physically different. Since a RG trajectory connects equivalent theories, all the points on the critical surface that lie in the attraction basin of the same fixed point correspond to equivalent theories. In other words, the space of possible renormalized QFTs is split into equivalence classes, known as universality classes, in one-to-one correspondence with the fixed points on the critical surface.
Universality is a powerful concept to explain why statistical systems with very different microscopic Hamiltonians turn out to have the same critical behavior, and in particular the same critical indices. For instance, for a generic system with one coupling constant $K$, we have seen in eq. (9.67) that near the critical coupling $K_c$
$$\xi(K) \simeq \frac{c}{(K - K_c)^{\nu}} , \tag{9.73}$$
with $c$ a constant. Combining this with eq. (9.72), we have
$$\lim_{K\to K_c}\left[\frac{f(K) - K_c}{K - K_c}\right]^{\nu} = 2 . \tag{9.74}$$
Close to the critical point, $f(K) \simeq K_c + f'(K_c)(K - K_c)$ and, substituting into eq. (9.74), we get $[f'(K_c)]^{\nu} = 2$, or
$$\nu = \frac{\log 2}{\log f'(K_c)} . \tag{9.75}$$
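As a numerical illustration of eq. (9.75), the sketch below uses an approximate recursion that is not derived in the text: the Migdal–Kadanoff bond-moving approximation for the two-dimensional Ising model with scale factor 2, for which $f(K) = \frac{1}{2}\ln\cosh 4K$. The choice of this map, the bisection bracket and the function names are all assumptions made for the example.

```python
import math

def f(K: float) -> float:
    """Migdal-Kadanoff recursion for the 2D Ising model (approximate, b = 2)."""
    return 0.5 * math.log(math.cosh(4 * K))

def f_prime(K: float) -> float:
    """Analytic derivative of f."""
    return 2 * math.tanh(4 * K)

def critical_coupling(lo: float = 0.1, hi: float = 1.0, tol: float = 1e-12) -> float:
    """Nontrivial fixed point K_c = f(K_c), found by bisection on f(K) - K."""
    assert f(lo) - lo < 0 < f(hi) - hi   # the bracket must contain a sign change
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) - mid < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

K_c = critical_coupling()
nu = math.log(2) / math.log(f_prime(K_c))   # eq. (9.75)
print(f"K_c = {K_c:.5f}, f'(K_c) = {f_prime(K_c):.5f}, nu = {nu:.4f}")
```

Within this approximation one finds $K_c \approx 0.305$ and $\nu \approx 1.34$, to be compared with the exact two-dimensional values $K_c = \frac{1}{2}\ln(1+\sqrt{2}) \approx 0.441$ and $\nu = 1$. The point of the exercise is only that $\nu$ is fixed by the slope of $f$ at the fixed point, while the quality of the number depends on how well $f$ is known.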
This shows that the critical index $\nu$ depends only on the form of the function $f(K)$ near the critical point, and not on the microscopic details of the Hamiltonian. In particular, all Hamiltonians that lie on the same RG trajectory describe systems that, even if apparently very different on the microscopic scale, have the same value of $\nu$. The above results also nicely illustrate how the non-analytic behavior (9.73) emerges as a result of collective behavior, from a regular Hamiltonian and an analytic function $f(K)$.

In the context of QFT, universality means first of all that (at least within a universality class) the renormalized theory does not depend on the details of the regularization. Universality, however, is also important for a different reason. In Section 5.9 we saw that we can write the RG equations in two different forms:

(1) We can write them as equations that govern the dependence of the bare coupling constants on the cutoff, as in eq. (5.164). With a lattice cutoff $\Lambda = 1/a$, eq. (5.164) reads
$$a\,\frac{dg_0}{da} = -\beta(g_0(a)) , \tag{9.76}$$
where, in the general case of many coupling constants, $g_0$ and $\beta$ are vectors in the coupling constant space.

(2) We can write the RG equations in the form of equations that govern the dependence of the renormalized coupling constants on the energy. From eq. (5.178), setting $E = u\mu$ (where $\mu$ is the reference scale),
$$E\,\frac{dg_{\rm eff}}{dE} = \beta(g_{\rm eff}(E)) . \tag{9.77}$$

Both eqs. (9.76) and (9.77) originated from eq. (5.161), in the former case taking the derivative with respect to the cutoff and in the latter with respect to the renormalization point $\mu$. Equation (9.76), as we have seen, can be understood in the language of critical phenomena, because the cutoff $a$ can be interpreted as the physical lattice spacing of a condensed matter system, and a change in cutoff as the result of taking into account collective effects. However, eq. (9.77) has the same form as eq. (9.76), so we can again apply the notions of fixed points, universality, etc. While in eq. (9.76) the fixed points are at $\xi = 0, \infty$ (i.e. when the momentum space cutoff $1/a$ is 0 or $\infty$), in eq. (9.77) they will be at $E = 0$ or at $E = \infty$. In eq. (9.76) only the fixed points at $\xi = \infty$ were interesting in the QFT context. In eq. (9.77), instead, both types of fixed points are very interesting; a fixed point at $E = 0$ (called an infrared, or simply IR, fixed point) can govern the behavior of a theory at low energies while a fixed point at $E = \infty$ (called a UV fixed point) will be relevant at high energies.

This can be understood from Figs. 9.6 and 9.7, where for simplicity we take the space of coupling constants to be one-dimensional. We consider a theory with a beta function which vanishes at $g_{\rm eff} = 0$ (the perturbative fixed point, which always exists) and which furthermore has a zero at a value $g_{\rm eff} = g_c$. In Fig. 9.6 the beta function is positive for $0 < g_{\rm eff} < g_c$ and negative for $g_{\rm eff} > g_c$. From this, it follows that a solution of eq. (9.77), as $E\to\infty$, will always be attracted toward $g_{\rm eff} = g_c$, independently of the initial value $g(\mu)$.
In fact, if $0 < g(\mu) < g_c$, $\beta(g_{\rm eff}) > 0$, therefore $dg_{\rm eff}/dE > 0$ and $g_{\rm eff}(E)$ increases asymptotically toward $g_c$. Conversely, if the initial value $g(\mu) > g_c$, the beta function is negative and $g_{\rm eff}(E)$ decreases toward $g_c$. Therefore in this example $g_c$ is a UV fixed point. Universality manifests itself in the fact that the large-energy properties of the theory are governed by this value of the coupling, irrespective of the initial value of $g(\mu)$.

To study the limit $E\to 0$, instead, we can run the RG flow backward and, again using Fig. 9.6, we see that, if $0 < g(\mu) < g_c$, $g_{\rm eff}(E)$ runs toward zero as $E\to 0$ while, if $g(\mu) > g_c$, $g_{\rm eff}(E)$ runs toward infinity. Therefore $g = 0$ is an IR fixed point (with attraction domain $0 < g(\mu) < g_c$), and all theories in this universality class become free at low energies. In Fig. 9.7 we have reversed the sign of the beta function, and the same analysis as above shows that now $g = 0$ is a UV fixed point with attraction domain $0 < g(\mu) < g_c$, while in the infrared the theory flows to $g_c$.

Fig. 9.6 An example of a beta function $\beta(g)$ with two zeros, such that $g = 0$ is an IR fixed point and $g = g_c$ is a UV fixed point.

Fig. 9.7 Reversing the sign of the beta function, $g = 0$ becomes a UV fixed point and $g = g_c$ an IR fixed point.

In QCD, the beta function has only the perturbative zero at $g = 0$, and is negative for $g > 0$. There is no other fixed point. Therefore the theory in the UV flows to $g = 0$, which is the property of asymptotic freedom that we already mentioned in Section 5.9. In the IR limit the coupling grows large and enters the strong-coupling domain, where perturbative methods cannot be applied.
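The qualitative analysis based on Fig. 9.6 can be reproduced by integrating eq. (9.77) numerically. The sketch below assumes a toy beta function, $\beta(g) = g\,(g_c - g)$ with $g_c = 1$, chosen only because it has the sign pattern of Fig. 9.6; it is not the beta function of any specific theory, and the function names are illustrative. The flow is integrated in the variable $t = \ln(E/\mu)$ with a fourth-order Runge–Kutta step.

```python
G_C = 1.0  # position of the nontrivial zero of the toy beta function (assumed)

def beta(g: float) -> float:
    """Toy beta function: positive for 0 < g < G_C, negative for g > G_C."""
    return g * (G_C - g)

def run_coupling(g_mu: float, t_final: float, n_steps: int = 20000) -> float:
    """Integrate dg/dt = beta(g), t = ln(E/mu), from t = 0 to t = t_final.

    A negative t_final runs the flow toward the infrared.
    """
    h = t_final / n_steps
    g = g_mu
    for _ in range(n_steps):
        k1 = beta(g)
        k2 = beta(g + 0.5 * h * k1)
        k3 = beta(g + 0.5 * h * k2)
        k4 = beta(g + h * k3)
        g += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return g

for g_start in (0.3, 1.7):
    print(f"g(mu) = {g_start}: UV value g(t = 20) = {run_coupling(g_start, 20.0):.6f}")
print(f"g(mu) = 0.3: IR value g(t = -20) = {run_coupling(0.3, -20.0):.2e}")
```

Both initial values are driven to $g_c$ in the UV, while starting from $0 < g(\mu) < g_c$ the coupling runs to zero in the IR. For $g(\mu) > g_c$ the backward flow of this toy model diverges at a finite value of $t$, so that case is not integrated here.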
9.6 QFT at finite temperature
Another interesting application of the path integral technique is to QFT at finite temperature. To understand this relation we consider first quantum mechanics and we start from eq. (9.15), written using the Schrödinger representation,
$$\langle q_f|\,e^{-i\hat{H}(T_f - T_i)}\,|q_i\rangle = \int_{q(T_i)=q_i}^{q(T_f)=q_f}[dq]\;e^{iS} . \tag{9.78}$$
We now rotate from Minkowski to Euclidean space, $t\to -it$, so that $\exp\{-iHt\}$ becomes $\exp\{-Ht\}$, while $e^{iS/\hbar}$ becomes $e^{-S/\hbar}$, as discussed in the previous section. In this section, the notation $t$ will hereafter denote Euclidean time. Then eq. (9.78) becomes
$$\langle q_f|\,e^{-\beta\hat{H}}\,|q_i\rangle = \int_{q(T_i)=q_i}^{q(T_f)=q_f}[dq]\;e^{-S} , \tag{9.79}$$
where $\beta \equiv T_f - T_i$ and $T_i$, $T_f$ are the minimum and maximum values of Euclidean time; $S$ is the Euclidean action so, for a particle in a potential $V$,
$$S = \int_{T_i}^{T_f}dt\left[\frac{1}{2}m\dot{q}^2 + V(q)\right] . \tag{9.80}$$
We now take $q_i = q_f \equiv q$ and we sum over all possible values of $q$,
$$\sum_q\langle q|\,e^{-\beta\hat{H}}\,|q\rangle = \int_{q(t)=q(t+\beta)}[dq]\;e^{-S} , \tag{9.81}$$
where the sum is over all possible configurations with $q(T_i) = q(T_f)$ or, in other words, over all periodic configurations with period $\beta = T_f - T_i$. Since the states $|q\rangle$ form a complete set, the left-hand side is the trace of $e^{-\beta\hat{H}}$ over the Hilbert space, so we find the relation
$${\rm tr}\,e^{-\beta\hat{H}} = \int_{q(t)=q(t+\beta)}[dq]\;e^{-S} . \tag{9.82}$$
For a scalar field theory we can perform the same steps with just a change in notation, and we get
$${\rm tr}\,e^{-\beta\hat{H}} = \int_{\phi(\mathbf{x},t)=\phi(\mathbf{x},t+\beta)}\mathcal{D}\phi\;e^{-S} , \tag{9.83}$$
with $S$ the Euclidean action given in eq. (9.54) and $H$ the field theory Hamiltonian. The left-hand side of eq. (9.82) or eq. (9.83) is the thermal partition function of a system with Hamiltonian $H$, at a temperature given by $k_BT = 1/\beta$. This means that the thermal averages of the system can be computed using the path integral, restricting to paths periodic in Euclidean time. From a practical point of view, in quantum mechanics it is simpler to compute directly the trace on the left-hand side of eq. (9.82) using the operator formalism. In QFT, instead, it is usually much simpler to evaluate the path integral in eq. (9.83).
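Equation (9.82) can be checked numerically in a case where the trace is known in closed form. The sketch below assumes a harmonic oscillator, $V(q) = \frac{1}{2}m\omega^2q^2$ with $\hbar = 1$, for which ${\rm tr}\,e^{-\beta\hat{H}} = 1/[2\sinh(\beta\omega/2)]$. The periodic path integral is discretized into `n_steps` Euclidean time slices and the position integrals are replaced by sums on a grid, so that the sum over periodic paths becomes the trace of the `n_steps`-th power of a transfer matrix. The grid size, the time step and all names are choices made for this illustration.

```python
import math

def matmul(A, B):
    """Plain-Python product of two square matrices stored as lists of rows."""
    columns = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in A]

def matrix_power(T, n):
    """T**n for a positive integer n, by repeated squaring."""
    result, base = None, T
    while n > 0:
        if n & 1:
            result = base if result is None else matmul(result, base)
        n >>= 1
        if n:
            base = matmul(base, base)
    return result

def lattice_partition_function(beta, n_steps=20, omega=1.0, mass=1.0,
                               q_max=5.0, n_grid=51):
    """Sum over periodic discretized paths q(t) = q(t + beta), as in eq. (9.82)."""
    a = beta / n_steps                      # Euclidean time step
    dq = 2 * q_max / (n_grid - 1)           # spacing of the position grid
    grid = [-q_max + i * dq for i in range(n_grid)]

    def V(q):
        return 0.5 * mass * omega ** 2 * q * q

    # One time slice: exp(-a * [kinetic + potential]) with the free-particle measure.
    norm = math.sqrt(mass / (2 * math.pi * a)) * dq
    T = [[norm * math.exp(-mass * (q - qp) ** 2 / (2 * a) - 0.5 * a * (V(q) + V(qp)))
          for qp in grid] for q in grid]
    Tn = matrix_power(T, n_steps)
    return sum(Tn[i][i] for i in range(n_grid))   # periodicity = trace

def exact_partition_function(beta, omega=1.0):
    """Operator result: sum over n of exp(-beta * omega * (n + 1/2))."""
    return 1.0 / (2.0 * math.sinh(0.5 * beta * omega))

beta = 2.0
print("lattice path integral:", lattice_partition_function(beta))
print("operator formalism:   ", exact_partition_function(beta))
```

For these settings the two numbers should agree to well within a percent; the remaining difference is the discretization error of the time step, which decreases as `n_steps` is increased at fixed `beta`.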
9.7 Solved problems

Problem 9.1. Instantons and tunneling

It happens in many theories that, besides the trivial solution $\phi = 0$, there are other, nontrivial, solutions of the Euclidean equations of motion, which vanish at infinity. A typical situation is given by solutions which describe tunneling phenomena between different vacua. We consider the action of a scalar field theory in $D$ Euclidean dimensions,
$$S_E = \int d^Dx\left[\frac{1}{2}\,\partial_\mu\phi\,\partial_\mu\phi + V(\phi)\right] , \tag{9.84}$$
and we choose
$$V(\phi) = \frac{m^2}{2}\,\phi^2\left(1 - \frac{\phi}{\eta}\right)^2 , \tag{9.85}$$
so we have a mass term, a cubic and a quartic coupling. As a first step, we restrict to $D = 1$, so the action is
$$S_E = \int dt\left[\frac{1}{2}(\partial_t\phi)^2 + V(\phi)\right] . \tag{9.86}$$
This is Euclidean quantum mechanics, with $\phi(t)$ playing the role of the position $q(t)$. However, we will still use the notation $\phi$ and the typical field theory language, to emphasize that many general ideas carry through (with some qualifications to be discussed below) in field theory in $D$ Euclidean dimensions.
Fig. 9.8 The function $-V(\phi)$ against $\phi$.
This potential has two degenerate minima: one is at $\phi = 0$ and, in the field theory language, it is the perturbative vacuum; the other is located at $\phi = \eta$. Of course, with a redefinition of the field of the form $\phi \to \phi - \eta$ we could shift the rightmost minimum to $\phi = 0$ and call this the perturbative vacuum; in any case, the point is that there are two minima, and perturbation theory is defined as an expansion around one of them. So we choose $\phi = 0$ as the perturbative vacuum, and we study the path integral with the boundary conditions that the field goes to zero as $t \to \pm\infty$.

We now ask whether there are classical solutions of the equations of motion with these boundary conditions. One is obviously the perturbative vacuum, $\phi(t) = 0$ for all $t$. To search for other solutions, observe that formally eq. (9.86) is the same as the action of a particle with coordinate $\phi(t)$ and unit mass, moving in Minkowski space in a potential $-V(\phi)$, shown in Fig. 9.8. From this formal analogy we immediately understand that there is a solution starting at $t = -\infty$ at $\phi = 0$ with a "speed" $\dot{\phi} \to 0^+$, which approaches the point $\phi = \eta$ at $t = +\infty$. In the original Minkowskian field theory, $\phi = 0$ and $\phi = \eta$ are the two degenerate vacua, separated by a potential barrier, and therefore this Euclidean solution, from the point of view of Minkowski field theory, represents a tunneling process between the two vacua. We will call this Euclidean solution an "instanton".

Actually, this solution does not satisfy our boundary conditions, since $\phi \neq 0$ at $t \to +\infty$. However, there is also a solution, that we call an anti-instanton, which goes from $\phi = \eta$ to $\phi = 0$, and we can combine the two solutions to obtain a path that starts and ends at $\phi = 0$; for instance we can consider a path that starts at $\phi = \epsilon > 0$. This will reach an inversion point close to $\phi = \eta$ and then will go back to $\phi = \epsilon$. In the limit $\epsilon \to 0^+$ we get the desired solution, which is called an instanton–anti-instanton ($I\bar{I}$) pair.

The path integral can therefore be approximated as a sum of two terms: the contribution of the small fluctuations around the perturbative vacuum $\phi = 0$, which reproduces the perturbative expansion, and the contribution of the fluctuations around the $I\bar{I}$ pair. The latter gives an example of a nonperturbative contribution. Let $\phi_0(t)$ be the $I\bar{I}$ classical solution. We can write a small fluctuation over $\phi_0$ in the form
\[ \phi = \phi_0 + \varphi , \tag{9.87} \]
with $\varphi$ small. The action can be expanded in powers of $\varphi$ as
\[ S[\phi] = S[\phi_0] + S_2[\varphi] + S_3[\varphi] + \ldots , \tag{9.88} \]
where $S_2$ is quadratic in $\varphi$, $S_3$ is cubic, etc. It is important that there is no term linear in $\varphi$, since $\phi_0$ is a classical solution and therefore is an extremum of the action. Then the contribution to the path integral coming from the fluctuations over $\phi_0$ is
\[ e^{-S[\phi_0]} \int \mathcal{D}\varphi\; e^{-(S_2[\varphi] + S_3[\varphi] + \ldots)} . \tag{9.89} \]
We see that this contribution is proportional to $e^{-S[\phi_0]}$, where $S[\phi_0] \equiv S_{I\bar{I}}$ is the action of the $I\bar{I}$ configuration, or more generally of the classical configuration over which we are expanding. In this example, we have $S_{I\bar{I}} = S_I + S_{\bar{I}} = 2 S_I$, so to compute $S_{I\bar{I}}$ we must find explicitly the instanton solution and its action $S_I$. In this model this is easily done: the equation of motion is
\[ \ddot{\phi} = m^2 \phi \left( 1 - \frac{\phi}{\eta} \right) \left( 1 - 2\, \frac{\phi}{\eta} \right) , \tag{9.90} \]
and the instanton solution, which interpolates between $\phi = 0$ at $t = -\infty$ and $\phi = \eta$ at $t = +\infty$, is
\[ \phi(t) = \frac{\eta}{1 + \exp\{-mt\}} . \tag{9.91} \]
Inserting this into eq. (9.86) and setting $\tau = mt/2$ we find
\[ S_I = \frac{m \eta^2}{8} \int_{-\infty}^{\infty} d\tau\, \frac{1}{\cosh^4 \tau} = \frac{m \eta^2}{6} . \tag{9.92} \]
Observe that the coefficient of the $\phi^4$ term in the action (9.84, 9.85) is $m^2/(2\eta^2)$. We then define the quartic self-coupling $\lambda$ from $\lambda/4! = m^2/(2\eta^2)$, and we see that the contribution to the path integral of the small fluctuations over this classical configuration is proportional to
\[ e^{-2 S_I} = \exp\left\{ -\frac{c}{\lambda} \right\} , \tag{9.93} \]
with $c = 4m^3$. The fact that $c$ is not dimensionless is due to the fact that in $D = 1$ the coupling $\lambda$ is not dimensionless. The important point is that this contribution is nonanalytic in the coupling $\lambda$ and can never be seen in a perturbative expansion around $\lambda = 0$; the Taylor expansion of the function $f(x) = \exp\{-1/x\}$ around $x = 0$ is in fact identically zero. Instantons therefore provide an example of nonperturbative contributions that cannot be computed in a perturbative expansion in terms of Feynman graphs. Parametrically, as $\lambda \to 0$, $\exp\{-c/\lambda\}$ is smaller than any power of $\lambda$; therefore, in a theory in which the renormalized coupling is sufficiently small, as in QED, effects of this type are negligible. However, in a theory like QCD where the coupling is strong, nonperturbative effects are important.

The generalization to $D > 1$ Euclidean dimensions is not completely straightforward. For instance, if we start with eq. (9.84), we might look for a classical solution $\phi(t, \mathbf{x})$ which is independent of $\mathbf{x}$. Then we would just find again the solution $\phi(t)$ discussed above. However, now its action also contains a divergent volume factor coming from the integration over the spatial coordinates, and therefore the contribution to the path integral of these configurations is $\sim e^{-\infty} = 0$. Thus we must look for solutions whose action is localized not only in time, as in eq. (9.92), but also in space. Solutions of this type turn out to exist in a number of interesting theories, including QCD, and are generically called instantons.
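The statements of eqs. (9.90) to (9.93) can be checked numerically. The sketch below is only an illustration: the values of `m` and `eta`, the time window, and the finite-difference derivatives are assumptions of the example, not part of the problem. It verifies that the profile (9.91) solves the Euclidean equation of motion, that its action is $m\eta^2/6$, and that $2 S_I = c/\lambda$ with $c = 4m^3$.

```python
import numpy as np

m, eta = 1.3, 0.7   # illustrative parameter values (assumed)

def potential(phi):
    """V(phi) of eq. (9.85)."""
    return 0.5 * m**2 * phi**2 * (1.0 - phi / eta)**2

def potential_prime(phi):
    """dV/dphi, i.e. the right-hand side of eq. (9.90)."""
    return m**2 * phi * (1.0 - phi / eta) * (1.0 - 2.0 * phi / eta)

# Time grid wide enough that the instanton has reached both vacua.
t = np.linspace(-40.0 / m, 40.0 / m, 200001)
dt = t[1] - t[0]
phi = eta / (1.0 + np.exp(-m * t))                  # instanton profile, eq. (9.91)

# Finite-difference derivatives, independent of the analytic solution.
phi_dot = np.gradient(phi, dt)
phi_ddot = np.gradient(phi_dot, dt)
eom_residual = float(np.max(np.abs(phi_ddot - potential_prime(phi))))

# Euclidean action (9.86) by the trapezoidal rule, against eq. (9.92).
integrand = 0.5 * phi_dot**2 + potential(phi)
action_numeric = float(np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dt)
action_exact = m * eta**2 / 6.0

# Quartic coupling from lambda/4! = m^2/(2 eta^2), and c = 4 m^3 of eq. (9.93).
lam = 12.0 * m**2 / eta**2
c = 4.0 * m**3

print(eom_residual, action_numeric, action_exact)
print(np.exp(-2.0 * action_exact), np.exp(-c / lam))
```

Changing `eta` at fixed `m` changes $\lambda$, so the same script can be used to see how quickly $\exp\{-c/\lambda\}$ falls as the coupling decreases.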
Summary of chapter
• In the path integral formulation of quantum mechanics the position and momentum (and any other variable) remain c-numbers rather than being promoted to operators. The amplitudes are computed summing over all possible trajectories with the given boundary conditions, weighting each trajectory with $e^{iS}$ (or $e^{-S}$ in Euclidean space). The connection between the operator formalism and the path integral quantization is provided by eqs. (9.15) and (9.25).
• The vacuum expectation value of a T-product of fields can be computed performing a path integral over all field configurations that go to zero at infinity, eq. (9.29). The perturbative expansion is recovered writing the action as $S = S_2 + S_{\rm int}$, where $S_2$ is the part quadratic in the fields and $S_{\rm int}$ the interaction term. The inverse of the operator which appears in $S_2$ gives the propagator, and the perturbative expansion is reproduced expanding $e^{iS}$ (or $e^{-S}$ in Euclidean space) in powers of $S_{\rm int}$.
• An important aspect of the path integral formulation is that it allows us to define the theory nonperturbatively, and it makes it possible to compute (at least in principle, or with suitable numerical or semiclassical techniques) terms nonanalytic in the coupling constant.
• The Euclidean formulation of the path integral reveals deep connections between QFT and critical phenomena. The bare Green's functions of a QFT in $d$ spatial dimensions are equivalent to the physical correlation functions of a statistical system living in $d + 1$ spatial dimensions. The dependence of the bare coupling constants on the cutoff in QFT can then be understood, in the language of statistical mechanics, as a result of the collective interaction between many degrees of freedom when the correlation length is large, and can be computed "integrating out" the small-scale fluctuations. This generates the RG transformation. The existence of fixed points of the RG transformation leads to the notion of universality.
Further reading
• A clear and concise review of the path integral technique can be found in Appendix A of Polchinski (1998). For more details on the path integral see Peskin and Schroeder (1995), Chapter 9. For the path integral quantization of gauge theories see also Ramond (1990).
• For the connection between QFT and critical phenomena, the original review papers are still excellent readings; a classical reference is K. G. Wilson and J. B. Kogut, Phys. Rept. 12 (1974) 75. See also J. B. Kogut, Rev. Mod. Phys. 51 (1979) 659. For a historical perspective of the developments leading to the renormalization group and the relation between QFT and critical phenomena, see the Nobel lecture of K. Wilson, Rev. Mod. Phys. 55 (1983) 583.
• Two advanced books on the relations between field theory and critical phenomena are Parisi (1988) and Zinn-Justin (2002).
• For instantons, a classical reference is Coleman (1985). See also Chapter 39 of Zinn-Justin (2002).
10
Nonabelian gauge theories
In this chapter we introduce nonabelian gauge theories, or Yang–Mills theories. Their importance stems from the fact that strong interactions are described by a nonabelian gauge theory with gauge group $SU(3)$, known as quantum chromodynamics or QCD, while the electromagnetic and weak interactions are unified in a gauge theory with gauge group $SU(2) \times U(1)$, the electroweak theory. Together, QCD and the electroweak theory form the Standard Model, which to date reproduces all known experimental results of particle physics, up to energies of the order of a few hundred GeV. To have an idea of the degree of accuracy of these measurements, we mention that the mass of the $Z^0$ boson is measured with a precision of almost two parts in $10^5$, $m_Z = 91.1876(21)$ GeV, its width is $\Gamma_Z = 2.4952(23)$ GeV, and many other observables are known with a precision of the order of a few parts in $10^3$.

A full presentation of the Standard Model is beyond the scope of this course. In this and the next chapter we will however introduce two of its main ingredients, namely Yang–Mills theories and the Higgs mechanism. Nonabelian gauge theories, besides having an extraordinary experimental success, also have a very rich theoretical structure, at the classical and especially at the quantum level. Within the scope of this course, we limit ourselves to a few elementary aspects; in particular, we will discuss how to generalize gauge transformations to nonabelian groups and how to write the corresponding invariant Lagrangians.
10.1
Nonabelian gauge transformations
As a first step, it is useful to rewrite the abelian gauge transformation of electrodynamics in a form more suitable for generalization. We saw in Chapter 3 that electrodynamics has a local $U(1)$ gauge invariance. We write a generic $x$-dependent $U(1)$ element as
\[ U(x) = e^{i\theta(x)} , \tag{10.1} \]
with $0 \le \theta(x) \le 2\pi$. A field $\Psi$ with charge $q$ transforms as
\[ \Psi(x) \to U_q(x) \Psi(x) , \tag{10.2} \]
where $U_q(x) = e^{iq\theta(x)}$ is a representation of the $U(1)$ transformation, labeled by the parameter $q$. The transformation of the gauge field instead is
\[ A_\mu(x) \to A_\mu(x) - \partial_\mu \theta . \tag{10.3} \]
In terms of the gauge group element $U(x)$, this can be rewritten as
\[ A_\mu(x) \to A_\mu + i (\partial_\mu U) U^\dagger . \tag{10.4} \]
We make the (obvious) observation that the transformation law of $A_\mu$ is an intrinsic property of the gauge field, and does not know anything about $q$, which instead is a parameter that labels the representation to which the matter field $\Psi$ belongs. The coupling between $A_\mu$ and $\Psi$ is obtained using the covariant derivative, eq. (3.167), which depends on $q$, i.e. on the representation to which $\Psi$ belongs,
\[ D_\mu \Psi = (\partial_\mu + i q A_\mu) \Psi . \tag{10.5} \]
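A quick numerical sanity check of how the covariant derivative (10.5) behaves under the transformations (10.2) and (10.3) can be sketched as follows. Everything specific here is an assumption of the example: a single coordinate `x`, the sample functions chosen for the matter field, the gauge field and $\theta(x)$, the value of `q`, and the use of finite differences.

```python
import numpy as np

q = -1.0                                    # charge label of the representation (assumed value)
x = np.linspace(0.0, 1.0, 2001)
dx = x[1] - x[0]

psi = np.exp(-x**2) * np.exp(2j * x)        # arbitrary smooth matter field
A = np.sin(3.0 * x)                         # one component of the gauge field
theta = 0.4 * np.cos(5.0 * x)               # x-dependent gauge parameter

def d(f):
    """Finite-difference derivative along x."""
    return np.gradient(f, dx)

def covariant_derivative(field, gauge_field):
    """D psi = (d + i q A) psi, eq. (10.5), along the single coordinate x."""
    return d(field) + 1j * q * gauge_field * field

U_q = np.exp(1j * q * theta)
psi_new = U_q * psi                         # eq. (10.2)
A_new = A - d(theta)                        # eq. (10.3)

interior = slice(2, -2)                     # skip one-sided differences at the edges
covariant_gap = float(np.max(np.abs(
    covariant_derivative(psi_new, A_new) - U_q * covariant_derivative(psi, A))[interior]))
ordinary_gap = float(np.max(np.abs(d(psi_new) - U_q * d(psi))[interior]))

print(covariant_gap, ordinary_gap)
```

The first number is at the level of the discretization error, while the second is of order one: the ordinary derivative picks up the extra term proportional to $\partial_\mu \theta$, which the shift of $A_\mu$ cancels in the covariant derivative.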
The important property of the covariant derivative is that, even under $x$-dependent transformations, it transforms in the same way as $\Psi$,
\[ D_\mu \Psi \to U_q(x) D_\mu \Psi , \tag{10.6} \]
as we already saw in eq. (3.168). As discussed in Section 3.5.4, using covariant derivatives there is a very simple way to construct a theory with local $U(1)$ invariance: we start from a theory with global $U(1)$ invariance and we just replace all the ordinary derivatives with covariant derivatives. This method of coupling matter to the electromagnetic field is known as the minimal coupling. We have also seen, again in Section 3.5.4, that non-minimal couplings are also possible, but they are characterized by coupling constants with dimensions of inverse powers of mass. From the discussion in Section 5.8 and the explicit example of the Fermi theory presented in Section 8.1, we understand that couplings with inverse mass dimensions are less fundamental than dimensionless couplings, and emerge as the low-energy limit of some more fundamental dimensionless coupling. Therefore, it is the minimal coupling that we want to generalize.

We find it convenient to redefine $\theta(x) \to e\theta(x)$, where $e < 0$ is the electron charge. Therefore we write
\[ U(x) = e^{ie\theta(x)} , \tag{10.7} \]
where now $0 \le \theta(x) \le 2\pi/|e|$, and the gauge transformation becomes
\[ \Psi(x) \to e^{iqe\theta(x)} \Psi(x) , \tag{10.8} \]
\[ A_\mu(x) \to A_\mu + \frac{i}{e} (\partial_\mu U) U^\dagger . \tag{10.9} \]
We want to generalize the above transformations to the case where $U(x)$ belongs to a nonabelian group $G$, rather than just to $U(1)$, and we want to construct a Lagrangian invariant under such local transformations. We will limit ourselves to the case $G = SU(N)$, although the construction is very general; $G$ is called the gauge group.¹

¹ We recall a few basic facts about the group $SU(N)$. It has $N^2 - 1$ generators $T^a$, which are hermitian and satisfy ${\rm Tr}\, T^a = 0$. By definition they obey the Lie algebra $[T^a, T^b] = i f^{abc} T^c$, where $f^{abc}$ are the structure constants of $SU(N)$, which are completely antisymmetric and real. For example, for $SU(2)$, $f^{abc} = \epsilon^{abc}$. If $T^a_R$ is a representation of the algebra and $V$ a unitary matrix of the same dimension as $T^a_R$, then $V T^a_R V^\dagger$ is still a solution of the Lie algebra and therefore provides an equivalent representation. We can fix $V$ requiring that it diagonalizes the matrix $D^{ab}(R) \equiv {\rm Tr}\,(T^a_R T^b_R)$, so that ${\rm Tr}\,(T^a_R T^b_R) = C(R)\, \delta^{ab}$. The normalization factor $C(R)$ is fixed (since the Lie algebra is not invariant under a rescaling of $T^a$) and it depends on the representation $R$. For $SU(N)$, it turns out that $C(R) = 1/2$ for the fundamental representation and $C(R) = N$ for the adjoint representation; the reader can check it for $SU(2)$, using $T^i = \sigma^i/2$ for the fundamental and $(T^i)^{jk} = -i \epsilon^{ijk}$ for the adjoint, as we discussed in eq. (2.37). By definition for $SU(N)$ we raise or lower the index $a$ with $\delta^{ab}$, so we will conventionally always write it as an upper index, and repeated upper indices are summed over.

We start by generalizing eq. (10.8). We consider a set of fields $\Psi_\alpha(x)$ transforming in a given representation $R$ of the gauge group. The fields are then labeled by an index $\alpha = 1, \ldots, \dim(R)$. For definiteness we take $\Psi_\alpha$ to be Dirac fermions, but all the subsequent considerations are very general and apply to any matter fields, e.g. to bosonic fields or to Weyl fermions. The fact that $\Psi$ transforms in the representation $R$ means that, under a gauge transformation,
\[ \Psi \to U_R \Psi , \tag{10.10} \]
or, in components, $\Psi_\alpha(x) \to (U_R)_{\alpha\beta}(x) \Psi_\beta(x)$. In eq. (10.10),
\[ U_R(x) = \exp\{ i g \theta^a(x) T^a_R \} , \tag{10.11} \]
where $T^a_R$ are the generators of the gauge group in the representation $R$ and $\theta^a(x)$ are the parameters of the transformation. We have redefined the parameters $\theta^a(x) \to g\theta^a(x)$, where $g$ is a constant. We will see below that $g$ will be the coupling constant of the theory. The free Dirac Lagrangian,
\[ \mathcal{L}_{\rm free} = i \bar{\Psi}_\alpha \gamma^\mu \partial_\mu \Psi_\alpha , \tag{10.12} \]
(with the sum over the index $\alpha$ understood) is invariant under global $SU(N)$ transformations, since if $\Psi \to U_R \Psi$ then $\bar{\Psi} \to \bar{\Psi} U_R^\dagger$ and, if $U_R$ is independent of $x$, it goes through $\partial_\mu$ and cancels against $U_R^\dagger$. However, if $U_R$ depends on $x$, performing the transformation we also get a term proportional to $\partial_\mu U$ and this Lagrangian is no longer invariant.

To construct an invariant Lagrangian, we introduce a set of gauge fields $A^a_\mu$ labeled by an index $a$, with one gauge field for each generator of the gauge group; the $A^a_\mu$ are called nonabelian gauge fields. In particular, $SU(N)$ has $N^2 - 1$ generators, so we have three gauge fields for $SU(2)$ and eight gauge fields for $SU(3)$. We introduce the matrix field
\[ A_\mu(x) = A^a_\mu(x) T^a . \tag{10.13} \]
Of course $A^a_\mu$ does not depend on the representation (just as in electromagnetism the gauge field, and therefore its transformation properties, does not know anything about the parameter $q$ that labels the matter representation), while the generators $T^a$, and therefore the matrix $A_\mu$, have an explicit form which depends on the representation $R$. We define the gauge transformation of $A_\mu$ as
\[ A_\mu \to U A_\mu U^\dagger - \frac{i}{g} (\partial_\mu U) U^\dagger , \tag{10.14} \]
where $A_\mu = A^a_\mu(x) T^a_R$ and $U(x) = \exp\{ i g \theta^a(x) T^a_R \}$ are in the same representation $R$. This definition is consistent because the transformation that it induces on $A^a_\mu$ is independent of $R$, as it should be. This can be shown considering first an infinitesimal transformation,
\[ U(x) = 1 + i g \theta^a(x) T^a_R + O(\theta^2) . \tag{10.15} \]
Then eq. (10.14) becomes
\[ A^a_\mu T^a_R \to (1 + i g \theta^a T^a_R)\, A^b_\mu T^b_R\, (1 - i g \theta^c T^c_R) - \frac{i}{g} ( i g\, T^a_R\, \partial_\mu \theta^a ) + O(\theta^2) = A^a_\mu T^a_R + i g \theta^a A^b_\mu [T^a_R, T^b_R] + T^a_R\, \partial_\mu \theta^a + O(\theta^2) . \tag{10.16} \]
Therefore
\[ A^a_\mu \to A^a_\mu + \partial_\mu \theta^a - g f^{abc} \theta^b A^c_\mu + O(\theta^2) , \tag{10.17} \]
and no dependence on the representation $R$ appears. For Lie groups the infinitesimal transformation also fixes uniquely the finite transformation, and therefore even the finite transformation of $A^a_\mu$ is independent of $R$. Equation (10.14) generalizes eq. (10.9) to nonabelian groups. The constant $g$ will play the role of the gauge coupling, as we will see below.² In particular, under a global gauge transformation, $A_\mu \to U A_\mu U^\dagger$. It is interesting to ask what this transformation property means in terms of the $N^2 - 1$ fields $A^a_\mu$, and we will see in Section 10.4 that it means that, under global gauge transformations, $A^a_\mu \to (U_{\rm adj})^{ab} A^b_\mu$, where $U_{\rm adj}$ is the adjoint representation of the gauge group.

² We take $g > 0$, while the electron charge is $e < 0$. This is the origin of some apparent sign differences in the definitions for the $U(1)$ and the nonabelian case.

We now define the covariant derivative on the field $\Psi$ as
\[ D_\mu \Psi = \left( \partial_\mu - i g A^a_\mu T^a_R \right) \Psi , \tag{10.18} \]
where $T^a_R$ are the generators in the same representation $R$ as the field $\Psi$. Using eqs. (10.10) and (10.14) we see that
\[ D_\mu \Psi \to \partial_\mu (U_R \Psi) - i g \left[ U_R A_\mu U_R^\dagger - \frac{i}{g} (\partial_\mu U_R) U_R^\dagger \right] U_R \Psi = U_R D_\mu \Psi , \tag{10.19} \]
where we used the fact that $A^a_\mu T^a_R$ transforms with the same matrix $U_R$ which appears in the transformation of $\Psi$. Therefore $D_\mu \Psi$ transforms in the same way as $\Psi$, even under local transformations.
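The relation between the finite transformation (10.14) and its infinitesimal form (10.17) can be illustrated numerically for $SU(2)$. The sketch below rests on several assumptions made only for this example: a single coordinate `x`, the hypothetical sample functions `theta(x, scale)`, the value of `g`, the fundamental representation $T^a = \sigma^a/2$, and a finite-difference derivative of $U$. The mismatch between the two formulas should shrink quadratically with the size of $\theta$.

```python
import numpy as np

g = 0.8  # gauge coupling (illustrative value)
sigma = np.array([[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]], dtype=complex)
T = sigma / 2.0                                # SU(2) generators, fundamental representation
eps = np.zeros((3, 3, 3))
for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[a, b, c], eps[a, c, b] = 1.0, -1.0     # structure constants f^{abc} of SU(2)

def theta(x, scale):
    """Hypothetical smooth gauge parameters theta^a(x)."""
    return scale * np.array([np.sin(x), np.cos(2.0 * x), x**2])

def dtheta(x, scale):
    """Analytic derivative of theta(x, scale)."""
    return scale * np.array([np.cos(x), -2.0 * np.sin(2.0 * x), 2.0 * x])

def U(x, scale):
    """U = exp(i g theta^a T^a), via exp(i v.sigma) = cos|v| + i sin|v| v.sigma/|v|."""
    v = 0.5 * g * theta(x, scale)
    n = np.linalg.norm(v)
    v_sigma = np.einsum("a,aij->ij", v, sigma)
    return np.cos(n) * np.eye(2) + 1j * np.sinc(n / np.pi) * v_sigma

def transformed_components(A_comp, x, scale, h=1e-5):
    """Apply the finite law (10.14) and read off A'^a = 2 Tr(T^a A')."""
    A = np.einsum("a,aij->ij", A_comp, T)
    u = U(x, scale)
    du = (U(x + h, scale) - U(x - h, scale)) / (2.0 * h)
    A_new = u @ A @ u.conj().T - (1j / g) * du @ u.conj().T
    return np.array([2.0 * np.trace(T[a] @ A_new).real for a in range(3)])

def infinitesimal_components(A_comp, x, scale):
    """Right-hand side of eq. (10.17), dropping the O(theta^2) terms."""
    th = theta(x, scale)
    return A_comp + dtheta(x, scale) - g * np.einsum("abc,b,c->a", eps, th, A_comp)

A_comp = np.array([0.3, -1.1, 0.6])            # sample values of A^a_mu at the point x0
x0 = 0.7
gaps = {}
for scale in (1e-2, 1e-3):
    finite = transformed_components(A_comp, x0, scale)
    linear = infinitesimal_components(A_comp, x0, scale)
    gaps[scale] = float(np.max(np.abs(finite - linear)))
print(gaps)
```

Reducing `scale` by a factor of ten reduces the mismatch by roughly a factor of one hundred, as expected for a remainder of order $\theta^2$.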
10.2
Yang–Mills theory
Using the covariant derivative, it is now easy to write a Lagrangian with local nonabelian gauge invariance. We just replace $\partial_\mu \to D_\mu$ in the free theory, that is, we write
\[ \mathcal{L} = \sum_\alpha \bar{\Psi}_\alpha \left[ i \gamma^\mu (D_\mu \Psi)_\alpha - m \Psi_\alpha \right] . \tag{10.20} \]
This Lagrangian contains the kinetic term of the fermionic field and its interaction with the gauge fields. The interaction term, which is hidden in the covariant derivative, is
\[ \mathcal{L}_{\rm int} = g A^a_\mu \bar{\Psi}_\alpha \gamma^\mu (T^a_R)_{\alpha\beta} \Psi_\beta , \tag{10.21} \]
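and we see that $g$ is a coupling constant. We also need a kinetic term for the gauge fields. One might try to define the field strength tensor of each of the gauge fields $A^a_\mu$ as $F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu$, but it is immediate to verify that this quantity does not have any simple transformation property under (10.14). Instead, a straightforward computation (using the identity $0 = \partial_\mu (U U^\dagger) = (\partial_\mu U) U^\dagger + U (\partial_\mu U^\dagger)$ and therefore $\partial_\mu U^\dagger = -U^\dagger (\partial_\mu U) U^\dagger$) shows that the quantity
\[ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu - i g [A_\mu, A_\nu] \tag{10.22} \]
transforms as
\[ F_{\mu\nu}(x) \to U(x) F_{\mu\nu}(x) U^\dagger(x) . \tag{10.23} \]
$F_{\mu\nu}$ is called the nonabelian field strength. From eqs. (10.22) and (10.13) we see that we can rewrite $F_{\mu\nu}$ as
\[ F_{\mu\nu} = F^a_{\mu\nu} T^a \tag{10.24} \]
with
\[ F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g f^{abc} A^b_\mu A^c_\nu . \tag{10.25} \]
Now it is easy to construct a gauge-invariant kinetic term for the gauge field; it is given by
\[ \mathcal{L}_{\rm gauge} = -\frac{1}{2}\, {\rm Tr}\, F_{\mu\nu} F^{\mu\nu} = -\frac{1}{4} F^a_{\mu\nu} F^{a\,\mu\nu} , \tag{10.26} \]
where $F_{\mu\nu}$ has been taken in the fundamental representation, and we used the fact that ${\rm Tr}(T^a_F T^b_F) = (1/2)\delta^{ab}$. Under gauge transformations ${\rm Tr}\, F_{\mu\nu} F^{\mu\nu} \to {\rm Tr}\,(U F_{\mu\nu} F^{\mu\nu} U^\dagger) = {\rm Tr}\, F_{\mu\nu} F^{\mu\nu}$ due to the cyclic property of the trace. The complete Lagrangian of the $SU(N)$ Yang–Mills theory with Dirac fermions in the representation $R$ is therefore
\[ \mathcal{L}_{\rm YM} = i \bar{\Psi}_\alpha \gamma^\mu \partial_\mu \Psi_\alpha - m \bar{\Psi}_\alpha \Psi_\alpha + g A^a_\mu \bar{\Psi}_\alpha \gamma^\mu (T^a_R)_{\alpha\beta} \Psi_\beta - \frac{1}{4} F^a_{\mu\nu} F^{a\,\mu\nu} , \tag{10.27} \]
or, in more compact form,
\[ \mathcal{L}_{\rm YM} = \bar{\Psi} \left( i \gamma^\mu D_\mu - m \right) \Psi - \frac{1}{2}\, {\rm Tr}\, F_{\mu\nu} F^{\mu\nu} . \tag{10.28} \]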
Observe, from eq. (10.25), that the term $F^2$ contains not only the standard kinetic term of the gauge fields, but also an interaction vertex with three gauge bosons, proportional to $g$, and a vertex with four gauge bosons, proportional to $g^2$, as shown in Fig. 10.1. Observe also that gauge invariance has fixed the three-boson, four-boson, and boson–fermion–fermion vertices in terms of a single parameter, the gauge coupling $g$.

Fig. 10.1 The vertices with three and with four nonabelian gauge bosons.
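The matrix and component forms of the field strength, eqs. (10.22) and (10.25), and the two forms of the kinetic term in eq. (10.26), can be compared directly. The sketch below is restricted by assumption to $SU(2)$ and to gauge fields that are constant in $x$, so only the commutator term (the source of the pure gauge-boson vertices) is exercised; the random field values and the coupling are illustrative.

```python
import numpy as np

sigma = np.array([[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]], dtype=complex)
T = sigma / 2.0                                # fundamental generators, Tr(T^a T^b) = delta^{ab}/2
eps = np.zeros((3, 3, 3))
for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[a, b, c], eps[a, c, b] = 1.0, -1.0     # f^{abc} for SU(2)
metric = np.diag([1.0, -1.0, -1.0, -1.0])      # Minkowski metric used to raise indices

rng = np.random.default_rng(0)
A_comp = rng.normal(size=(4, 3))               # A^a_mu, taken constant in x (assumption)

def field_strength_matrix(A_comp, g):
    """F_{mu nu} = -i g [A_mu, A_nu]: eq. (10.22) with the derivative terms set to zero."""
    A = np.einsum("ma,aij->mij", A_comp, T)
    AA = np.einsum("mij,njk->mnik", A, A)      # AA[m, n] = A_m A_n
    return -1j * g * (AA - AA.transpose(1, 0, 2, 3))

def field_strength_components(A_comp, g):
    """F^a_{mu nu} = g f^{abc} A^b_mu A^c_nu: eq. (10.25) for constant fields."""
    return g * np.einsum("abc,mb,nc->mna", eps, A_comp, A_comp)

def raise_indices(F):
    """Raise both Lorentz indices with the metric."""
    return np.einsum("mr,ns,rs...->mn...", metric, metric, F)

def lagrangian_trace(A_comp, g):
    F = field_strength_matrix(A_comp, g)
    return float(-0.5 * np.einsum("mnij,mnji->", F, raise_indices(F)).real)

def lagrangian_components(A_comp, g):
    Fc = field_strength_components(A_comp, g)
    return float(-0.25 * np.einsum("mna,mna->", Fc, raise_indices(Fc)))

g = 0.8
F_rebuilt = np.einsum("mna,aij->mnij", field_strength_components(A_comp, g), T)
print(np.allclose(field_strength_matrix(A_comp, g), F_rebuilt))
print(lagrangian_trace(A_comp, g), lagrangian_components(A_comp, g))
```

For constant fields the Lagrangian reduces to the quartic term, so doubling `g` multiplies it by four, in line with the four-boson vertex being proportional to $g^2$.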
10.3
QCD
Quantum chromodynamics (QCD) is a Yang–Mills theory with gauge group $SU(3)$. The matter fields are the quarks. They are in the fundamental representation of the gauge group and have spin 1/2. As we already discussed in Chapter 8, there are six types of quarks, denoted as $u$ (up), $d$ (down), $c$ (charm), $s$ (strange), $t$ (top) and $b$ (bottom). The type of quark is called the flavor, while the index of the gauge group is called the color index. Therefore a generic quark field has two indices, $\Psi_{\alpha,A}$, with $\alpha = 1, 2, 3$ the color index and $A = u, d, c, s, t, b$ the flavor index. Each quark flavor is described by a Lagrangian of the type (10.27), with a different mass for each flavor. The $3^2 - 1 = 8$ gauge bosons are called gluons. Therefore the QCD Lagrangian is
\[ \mathcal{L}_{\rm QCD} = i \bar{\Psi}_{\alpha,A} \gamma^\mu \partial_\mu \Psi_{\alpha,A} - m_A \bar{\Psi}_{\alpha,A} \Psi_{\alpha,A} - \frac{1}{4} F^a_{\mu\nu} F^{a\,\mu\nu} + g A^a_\mu \bar{\Psi}_{\alpha,A} \gamma^\mu T^a_{\alpha\beta} \Psi_{\beta,A} , \tag{10.29} \]
where we sum over both the color indices $\alpha, \beta$ and the flavor index $A$, and $T^a$ are the generators of $SU(3)$ in the fundamental representation. QCD is the fundamental theory of strong interactions. A crucial property of QCD, that we already discussed in Sections 5.9 and 9.5, is asymptotic freedom, which means that the running coupling constant $g_{\rm eff}(E)$ (defined in Section 5.9) is small at high energies and large at low energies. At small distances QCD is well described in terms of weakly interacting quarks and gluons, while at large distances, of the order of 1 fm, the theory becomes nonperturbative and quarks are confined. This means that quarks cannot be observed as free particles, but we can only observe color-singlet bound states of quark–antiquarks (mesons) or of three quarks or three antiquarks (baryons). Mesons and baryons are collectively denoted as hadrons and, being composed of quarks, are subject to strong interactions. The strong interactions generate dynamically a characteristic energy scale $\Lambda_{\rm QCD} \sim (1\ {\rm fm})^{-1} \simeq 200$ MeV. The lightest hadron is the pion, whose mass is in fact of this order of magnitude, $m_\pi \simeq 140$ MeV.

Besides the exact local $SU(3)$ color symmetry, QCD also has important approximate global symmetries, due to the possibility of performing a coordinate-independent rotation in flavor space. We saw in Section 3.4.3 that the free Lagrangian of a single massless Dirac fermion has a $U(1) \times U(1)$ symmetry, in which we rotate independently the left-handed and right-handed Weyl spinors,
\[ \psi_L \to e^{i\theta_L} \psi_L , \qquad \psi_R \to e^{i\theta_R} \psi_R . \tag{10.30} \]
In terms of the Dirac spinor $\Psi$ the two independent transformations with $\theta_R = \theta_L = \alpha$ and $\theta_R = -\theta_L = \beta$ have been written in eqs. (3.125) and (3.126), and we recall them here,
\[ \Psi \to e^{i\alpha} \Psi , \qquad \Psi \to e^{i\beta\gamma^5} \Psi . \tag{10.31} \]
The transformation parametrized by $\alpha$ is called the vector $U(1)$, while the one parametrized by $\beta$ is called the axial $U(1)$, or $U_A(1)$. Consider now the QCD Lagrangian with $N_f$ quark flavors $\Psi_1, \ldots, \Psi_{N_f}$ (i.e. $\Psi_1$ is the Dirac spinor describing the $u$-quark, $\Psi_2$ the $d$-quark, etc.). Denote by $q_L$ a column vector with $N_f$ components whose entries are the Weyl spinors $\psi_L^1, \ldots, \psi_L^{N_f}$ which describe the left-handed quarks,
\[ q_L = \begin{pmatrix} \psi_L^1 \\ \vdots \\ \psi_L^{N_f} \end{pmatrix} , \tag{10.32} \]
and similarly for $q_R$ (the color index is not written explicitly). Recalling the relation between the Dirac Lagrangian written in terms of Dirac spinors and in terms of Weyl spinors, Sections 3.4.1 and 3.4.2, we can rewrite the quark part of the QCD Lagrangian (10.29) in the form
\[ \mathcal{L}_{\rm quarks} = i q_L^\dagger \bar{\sigma}^\mu D_\mu q_L + i q_R^\dagger \sigma^\mu D_\mu q_R - ( q_L^\dagger M q_R + q_R^\dagger M q_L ) , \tag{10.33} \]
where $M$ is a mass matrix, diagonal in flavor space,
\[ M_{AB} = m_A \delta_{AB} . \tag{10.34} \]
If we set the mass term to zero, in the above Lagrangian there is no coupling between left-handed and right-handed quarks, and we can perform an $SU(N_f)$ transformation independently on the left-handed and right-handed quarks,
\[ q_L \to U_L q_L , \qquad q_R \to U_R q_R , \tag{10.35} \]
with $U_L$, $U_R$ two independent $SU(N_f)$ matrices acting in flavor space. The operator $D_\mu$ acts on the coordinates, through $\partial_\mu$, and in color space, because of $A^a_\mu T^a$; however, it knows nothing about flavor. Therefore, if the matrices $U_{L,R}$ do not depend on the coordinates $x^\mu$, they commute with $D_\mu$. Then under eq. (10.35)
\[ q_L^\dagger \bar{\sigma}^\mu D_\mu q_L \to q_L^\dagger U_L^\dagger \bar{\sigma}^\mu D_\mu U_L q_L = q_L^\dagger U_L^\dagger U_L \bar{\sigma}^\mu D_\mu q_L , \tag{10.36} \]
so it is invariant, since $U_L^\dagger U_L = 1$, and similarly for $q_R$. This means that, in the limit in which we can neglect the masses of $N_f$ quark flavors, QCD has an approximate global $SU_L(N_f) \times SU_R(N_f)$ invariance.³

³ Actually, we could more generally consider a $U_L(N_f) \times U_R(N_f)$ transformation, so we also have a vector $U(1)$, which corresponds to baryon number, and an axial $U(1)$. The axial $U(1)$ symmetry is however spoiled by subtle quantum effects that we will not discuss.

We introduce the Dirac spinor $Q$, in the chiral representation of the $\gamma$-matrices,
\[ Q = \begin{pmatrix} q_L \\ q_R \end{pmatrix} . \tag{10.37} \]
Then the symmetry $SU_L(N_f) \times SU_R(N_f)$ can be written as a product of a vector $SU(N_f)$ and an axial $SU(N_f)$, similarly to eq. (10.31),
\[ Q \to e^{i\alpha} Q , \qquad Q \to e^{i\beta\gamma^5} Q , \tag{10.38} \]
with $\alpha = \alpha^a T^a$ and $\beta = \beta^a T^a$, where $T^a$ are the generators of the flavor symmetry in the fundamental representation. If the mass term is non-zero, but the $N_f$ masses are equal, so that $M = mI$ is a multiple of the identity matrix, we no longer have an $SU(N_f) \times SU(N_f)$ global symmetry, but we still have an $SU(N_f)$ global symmetry in which the left- and right-handed quarks are rotated in the same way, since in this case the mass term is
\[ m ( q_L^\dagger q_R + q_R^\dagger q_L ) \tag{10.39} \]
and is invariant under eq. (10.35) with $U_L = U_R$. The approximation of neglecting the quark masses or of neglecting their differences is useful only for the lightest quarks, $u$ and $d$ (and, to a lesser accuracy, $s$). In particular, if we take $m_u \simeq m_d$, we have an approximate $SU(2)$ global symmetry called isospin, while if we further assume $m_u \simeq m_d \simeq m_s$ we have an approximate $SU(3)$ flavor symmetry. We discussed these symmetries in the Complement on page 209, where we also explained how to use them to extract information on hadronic matrix elements. We will further examine the axial $SU(N_f)$ symmetry in Section 11.2, when we discuss spontaneous symmetry breaking and Goldstone bosons.
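The flavor structure of this argument can be illustrated with a toy calculation. In the sketch below, which is only a simplified stand-in, $q_L$ and $q_R$ are treated as ordinary complex vectors in flavor space (spinor and color structure are suppressed, and the kinetic term is replaced by the flavor contraction $q^\dagger q$, which is all that a coordinate-independent flavor rotation acts on). The helper `random_su`, the number of flavors, and the mass values are assumptions of the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n_f = 3                                         # number of light flavors (assumed)

def random_su(n):
    """Random SU(n) matrix: unitary factor of a QR decomposition, rescaled to det = 1."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    unitary, _ = np.linalg.qr(z)
    return unitary / np.linalg.det(unitary) ** (1.0 / n)

def kinetic_flavor_structure(q_l, q_r):
    """Stand-in for the kinetic terms of eq. (10.33): D_mu commutes with flavor rotations."""
    return float((np.vdot(q_l, q_l) + np.vdot(q_r, q_r)).real)

def mass_term(q_l, q_r, mass_matrix):
    """q_L^dagger M q_R + q_R^dagger M q_L, with M real and diagonal."""
    return float((q_l.conj() @ mass_matrix @ q_r + q_r.conj() @ mass_matrix @ q_l).real)

q_l = rng.normal(size=n_f) + 1j * rng.normal(size=n_f)
q_r = rng.normal(size=n_f) + 1j * rng.normal(size=n_f)
u_l, u_r = random_su(n_f), random_su(n_f)

equal_masses = 0.3 * np.eye(n_f)                # M = m I
unequal_masses = np.diag([0.1, 0.3, 1.5])       # illustrative unequal masses

# Independent left and right rotations, eq. (10.35).
chiral_kinetic = kinetic_flavor_structure(u_l @ q_l, u_r @ q_r)
chiral_mass = mass_term(u_l @ q_l, u_r @ q_r, equal_masses)
# Vector rotation, U_L = U_R.
vector_mass_equal = mass_term(u_l @ q_l, u_l @ q_r, equal_masses)
vector_mass_unequal = mass_term(u_l @ q_l, u_l @ q_r, unequal_masses)

print(chiral_kinetic, kinetic_flavor_structure(q_l, q_r))
print(chiral_mass, vector_mass_equal, mass_term(q_l, q_r, equal_masses))
print(vector_mass_unequal, mass_term(q_l, q_r, unequal_masses))
```

The kinetic structure is unchanged by independent rotations, the equal-mass term survives only the vector rotation, and unequal masses break even that.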
10.4
Fields in the adjoint representation
We have seen that the form of the covariant derivative depends on the transformation property of the object on which it acts, since in eq. (10.18) the generators are in the same representation $R$ as the field $\Psi$. Apart from fields transforming in the fundamental representation of $SU(N)$, another typical case that one encounters is that of fields in the adjoint representation. Let us consider for definiteness a real scalar field. As we saw in Section 2.4, the adjoint representation exists for any group and has the same dimension as the number of generators, i.e. $N^2 - 1$ for $SU(N)$. A scalar field in the adjoint can be written as $\phi^a(x)$, $a = 1, \ldots, N^2 - 1$ (while for a field in the fundamental, as in the previous section, we use the notation $\phi_\alpha$ with $\alpha = 1, \ldots, N$), and the indices $a, b$ are of the same type as the indices labeling the generators. Under a gauge transformation, a field in the adjoint of $SU(N)$ transforms by definition as
\[ \phi \to e^{i g \theta^a(x) T^a_{\rm adj}} \phi , \tag{10.40} \]
where $\phi$ is the column vector with components $\phi^a$. Using the fact that we have as many fields as generators, we can form the matrix field
\[ \Phi(x) = \phi^a(x) T^a \tag{10.41} \]
(with the generators $T^a$ in any representation that we wish to use, not necessarily in the adjoint). We now show that, in terms of $\Phi$, eq. (10.40) becomes
\[ \Phi(x) \to U(x) \Phi(x) U^\dagger(x) . \tag{10.42} \]
Here $U(x) = \exp\{ i g \theta^a(x) T^a \}$, where the generators $T^a$ are in the same representation that we used in the definition of $\Phi$, eq. (10.41).⁴ To prove this assertion, it is sufficient to consider an infinitesimal transformation. The explicit form of the generators in the adjoint of $SU(N)$ is $(T^a_{\rm adj})^{bc} = -i f^{abc}$. Writing all indices as upper indices, raised with $\delta^{ab}$, under an infinitesimal transformation we have
\[ \delta\phi^c = i g \theta^a (T^a_{\rm adj})^{cb} \phi^b = -g f^{abc} \theta^a \phi^b , \tag{10.43} \]
which implies that
\[ \delta\Phi = \delta\phi^c\, T^c = -g f^{abc} \theta^a \phi^b T^c . \tag{10.44} \]
On the other hand, the infinitesimal form of eq. (10.42) is
\[ \delta\Phi = i g [\theta^a T^a, \Phi] = i g \theta^a \phi^b [T^a, T^b] = -g f^{abc} \theta^a \phi^b T^c , \tag{10.45} \]
which agrees with eq. (10.44).⁵ Of course, nothing here depends on the Lorentz indices of the field, so we see that $F_{\mu\nu}$ is an example of a field transforming in the adjoint representation under local gauge transformations; compare with eq. (10.23). Instead $A_\mu$ transforms in the adjoint only under global transformations, while for local transformations it also acquires the inhomogeneous term $\sim \partial_\mu U$. Observe also that if we choose $\phi^a$ real then $\Phi$ is hermitian, and the gauge transformation is compatible with the hermiticity condition.

⁴ In other words, eq. (10.42) holds at the abstract group level, without any reference to the representation.

The covariant derivative of a field in the adjoint is
\[ (D_\mu \phi)^a = \partial_\mu \phi^a - i g A^c_\mu (T^c_{\rm adj})^{ab} \phi^b . \tag{10.46} \]
Using $(T^a_{\rm adj})^{bc} = -i f^{abc}$ we have
\[ (D_\mu \phi)^a = \partial_\mu \phi^a - g f^{abc} \phi^b A^c_\mu . \tag{10.47} \]
By definition $(D_\mu \phi)^a$ transforms as $\phi^a$ under local gauge transformations. We can also write the covariant derivative in terms of $\Phi$; defining $D_\mu \Phi = (D_\mu \phi)^a T^a$, eq. (10.47) gives
\[ D_\mu \Phi = \partial_\mu \Phi - i g [A_\mu, \Phi] . \tag{10.48} \]
Using eqs. (10.14) and (10.42), we easily check that under gauge transformations
\[ D_\mu \Phi \to U(x) (D_\mu \Phi) U^\dagger(x) , \tag{10.49} \]
confirming that $D_\mu \Phi$ transforms as $\Phi$. An invariant Lagrangian is
\[ \mathcal{L} = {\rm Tr}\, D_\mu \Phi D^\mu \Phi = \frac{1}{2} \left( \partial_\mu \phi^a - g f^{abc} \phi^b A^c_\mu \right)^2 . \tag{10.50} \]
Here the generators which appear in eq. (10.41) have been chosen in the fundamental representation, so the trace gives a factor 1/2 and we recover the standard normalization of the kinetic term. The gauge invariance of eq. (10.50) follows from the cyclicity of the trace. Again, we see that the requirement of gauge invariance fixes the interaction terms, and in eq. (10.50) we have a cubic interaction $-g f^{abc} (\partial_\mu \phi^a) \phi^b A^{c\,\mu}$ and a quartic interaction $O(g^2 \phi^2 A_\mu^2)$.

⁵ Actually, to prove eq. (10.42) it was not really necessary to perform an explicit computation. It suffices to realize that eq. (10.42) is the same transformation law obeyed by the tensor $T_{\alpha\beta} = \psi_\alpha \psi^\dagger_\beta$, where $\psi$ is in the fundamental representation $N$ and $\psi^\dagger$ in the antifundamental $\bar{N}$. The product $N \otimes \bar{N}$ decomposes into $(N^2 - 1) \oplus 1$, i.e. into the adjoint plus the singlet. However, the singlet is absent in $\Phi$ because ${\rm Tr}\, T^a = 0$, and therefore $\Phi$ is purely in the adjoint.
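The equivalence between eqs. (10.40) and (10.42) holds for finite transformations as well, and this can be checked numerically for $SU(2)$. The following sketch assumes $SU(2)$ with $f^{abc} = \epsilon^{abc}$, the fundamental generators $\sigma^a/2$ in the definition of $\Phi$, and arbitrary sample values for $g$, $\theta^a$ and $\phi^a$; the helper `exp_i_hermitian` is introduced only for this example. It also confirms the normalizations $C(R) = 1/2$ and $C(R) = N$ quoted earlier.

```python
import numpy as np

g = 0.8                                         # illustrative coupling
sigma = np.array([[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]], dtype=complex)
T_fund = sigma / 2.0                            # fundamental representation of SU(2)
eps = np.zeros((3, 3, 3))
for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[a, b, c], eps[a, c, b] = 1.0, -1.0
T_adj = -1j * eps                               # (T^a_adj)^{bc} = -i f^{abc}

def exp_i_hermitian(H):
    """exp(i H) for a hermitian matrix H, via its eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(1j * w)) @ V.conj().T

theta = np.array([0.9, -0.4, 1.7])              # finite gauge parameters (sample values)
phi = np.array([0.5, 2.0, -1.2])                # real adjoint field components (sample values)

U_fund = exp_i_hermitian(g * np.einsum("a,aij->ij", theta, T_fund))
U_adj = exp_i_hermitian(g * np.einsum("a,abc->bc", theta, T_adj))

# Eq. (10.40): rotate the component vector with the adjoint matrix.
phi_rotated = U_adj @ phi

# Eq. (10.42): conjugate the matrix field and read off components with 2 Tr(T^a Phi).
Phi = np.einsum("a,aij->ij", phi, T_fund)
Phi_new = U_fund @ Phi @ U_fund.conj().T
phi_from_matrix = np.array([2.0 * np.trace(T_fund[a] @ Phi_new).real for a in range(3)])

# Normalization factors C(R) in Tr(T^a T^b) = C(R) delta^{ab}.
c_fund = np.einsum("aij,bji->ab", T_fund, T_fund).real
c_adj = np.einsum("aij,bji->ab", T_adj, T_adj).real

print(phi_rotated.real, phi_from_matrix)
```

Both routes give the same rotated components, and the rotated vector stays real, consistent with $\Phi$ remaining hermitian.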
Summary of chapter
• Nonabelian gauge transformations generalize the local invariance of electrodynamics, with gauge group $U(1)$, to nonabelian gauge groups like $SU(N)$. Instead of a single gauge field, we now have a set of gauge fields $A^a_\mu$, with one gauge field for each generator of the gauge group. Matter fields are in a representation $R$ of the gauge group and therefore carry an internal index $\alpha = 1, \ldots, \dim(R)$. The transformation laws are given by eqs. (10.10) and (10.14).
• The Yang–Mills Lagrangian is given by eq. (10.28). Besides an interaction term between matter and gauge fields, dictated by the covariant derivative, there are also interaction vertices involving only three and four gauge bosons, fixed by the form of the nonabelian field strength. Therefore all these interaction terms are fixed by the requirement of gauge invariance.
• QCD is a Yang–Mills theory with gauge group $SU(3)$; the matter fields are the quarks and the gauge fields are the gluons. The Lagrangian is given in eq. (10.29).
Further reading • Nonabelian gauge theories are the building blocks of modern particle physics. Given their extraordinary experimental success and their rich theoretical structure, the literature on them is vast. A detailed introduction is provided in Peskin and Schroeder
(1995) and in Weinberg vol. II, (1996). • A detailed survey of QCD is given by the three volumes of At the frontier of Particle Physics– Handbook of QCD, M. Shifman ed., World Scientiﬁc 2001.
11 Spontaneous symmetry breaking

In this chapter we present the phenomenon of spontaneous symmetry breaking (SSB). This is a mechanism of great importance both in particle physics and in condensed matter physics. Its generality and importance stem from the fact that it deals with how a symmetry of the action in QFT (or of the Hamiltonian in a statistical system) is reflected on the ground state of the system. As we will see in Section 11.1, SSB strictly speaking can only take place in a system with an infinite number of degrees of freedom. It is therefore a genuinely field-theoretical phenomenon, which does not appear in quantum mechanical systems with a finite number of variables. We will examine the effect of SSB on different types of symmetries. In Section 11.2 we will discuss the SSB of global symmetries, and the emergence of Goldstone bosons. In Section 11.3 we will examine the SSB of local abelian symmetries, and we will see that it is a crucial element in the BCS theory of superconductivity, when the latter is formulated in field-theoretical language. We will finally examine the SSB of nonabelian gauge symmetries, and we will see that in this case it gives rise to the masses of nonabelian gauge bosons, like the $W^\pm$ and $Z^0$ in the Standard Model.
11.1 Degenerate vacua in QM and QFT
Spontaneous symmetry breaking is a very general phenomenon characterized by the fact that the action has a symmetry (global or local) but the quantum theory, instead of having a unique vacuum state which respects this symmetry, has a family of degenerate vacua that transform into each other under the action of the symmetry group. A simple example is given by a ferromagnet. The action governing its microscopic dynamics is invariant under spatial rotations. For instance, we can describe a ferromagnet by a generalization of the Ising Hamiltonian given in eq. (9.64), introducing a vector variable $\mathbf{s}_i$ associated to each site $i$,

$$H = -J\sum_{\langle i,j\rangle}\mathbf{s}_i\cdot\mathbf{s}_j , \tag{11.1}$$
where $J > 0$ and the sum is restricted to nearest-neighbor pairs. As we discussed in Section 9.5, above a critical temperature a ferromagnet
has a unique ground state, with zero magnetization. Of course this state respects the rotational invariance, since on it the expectation value of the magnetization $\mathbf{M} = \langle\mathbf{s}_i\rangle$ vanishes, and therefore no preferred direction is selected. Below the critical temperature, instead, it becomes thermodynamically favorable to develop a nonzero magnetization, and in this new vacuum $\mathbf{M} \neq 0$ and the full SO(3) rotational symmetry is broken to the subgroup SO(2) of rotations around the magnetization axis. The original invariance of the Lagrangian is now reflected in the fact that, instead of a single vacuum state, there is a whole family of vacua related to each other by rotations, since the magnetization can in principle develop in any direction. However, the system will choose one of these states as its vacuum state. The symmetry is then said to be spontaneously broken by the choice of a vacuum.

SSB is a phenomenon that cannot take place in a quantum mechanical system with a finite number of degrees of freedom, since in this case, if we have a family of "vacua", the true vacuum state is a superposition of them which respects the original symmetry. To illustrate this point, we consider for instance the quantum mechanics of a particle, described by a coordinate $q(t)$, in a potential
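$$V(q) = \frac{1}{2}\lambda^2\left(q^2(t)-\eta^2\right)^2 , \tag{11.2}$$

with $\lambda$, $\eta$ parameters. This potential is shown in Fig. 11.1, and is called a double-well potential.

Fig. 11.1 A double-well potential ($V$ plotted against $q$; the ticks $-\eta$ and $\eta$ mark the two minima).

The Lagrangian is

$$L = \frac{1}{2}m\dot q^2 - V(q) , \tag{11.3}$$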
and is symmetric under the parity transformation $q(t)\to -q(t)$ (this is also called a $Z_2$ symmetry, where $Z_2$ is the finite group formed by 1 and −1 under multiplication). The potential has two minima, at $q = \pm\eta$. We can solve the Schrödinger equation expanding the potential around the minimum at $q = +\eta$, retaining only the quadratic term in the Taylor expansion of the potential around $\eta$ (so that we have a harmonic oscillator), and treating in perturbation theory all higher powers of the expansion of the potential. We call $|+\rangle$ the ground state obtained in this way; more precisely, this is a perturbative vacuum. We can do the same expanding around $-\eta$, and we call $|-\rangle$ the corresponding perturbative vacuum. However, the true ground state of the theory is neither $|+\rangle$ nor $|-\rangle$. At the nonperturbative level there is a nonvanishing amplitude for the transition between these two states, due to the possibility of tunneling under the barrier which separates the two minima, and which can be computed in a WKB approximation (or using the instanton technique developed in Solved Problem 9.1). Because of the tunneling process, the Hamiltonian is not diagonal in the $|\pm\rangle$ basis. Rather, we will have

$$\langle +|H|+\rangle = \langle -|H|-\rangle \equiv a , \qquad \langle +|H|-\rangle = \langle -|H|+\rangle \equiv b , \tag{11.4}$$
with $|b| \ll a$, since the tunneling amplitude is exponentially suppressed. Diagonalizing this Hamiltonian we immediately find that the eigenstates are the symmetric and antisymmetric combinations

$$|S\rangle = |+\rangle + |-\rangle , \qquad |A\rangle = |+\rangle - |-\rangle , \tag{11.5}$$
with energies $a \pm b$, respectively. Therefore the degeneracy between these states is lifted by the fact that $b \neq 0$, and the true ground state is the combination with energy $a - |b|$. Under a parity transformation $q\to -q$, $|S\rangle$ is invariant while $|A\rangle$ picks up a minus sign. Recalling that physical states are defined up to an overall phase, we see that the true ground state of the Hamiltonian goes into itself under parity, and there is no SSB of the $Z_2$ symmetry.

Consider now a real scalar field with Lagrangian

$$\mathcal{L} = \frac{1}{2}\partial^\mu\phi\,\partial_\mu\phi - \lambda^2\left(\phi^2-\eta^2\right)^2 . \tag{11.6}$$
Here again we have a $Z_2$ symmetry $\phi\to-\phi$. The crucial difference is that the tunneling amplitude in this case is proportional to $\exp\{-cV\}$, with $c$ a constant and $V$ the spatial volume. In fact, this tunneling amplitude can be evaluated as in the instanton computation that we have discussed in Solved Problem 9.1, with a classical configuration which is not localized in space, so its action is proportional to the volume, $S_{\rm cl} = cV$, and the tunneling amplitude is proportional to $\exp\{-S_{\rm cl}\} = \exp\{-cV\}$. This result can be understood physically by discretizing space, so that our field theory corresponds to a quantum mechanical system in which for each spatial point $\mathbf{x}$ we have a variable $q_{\mathbf{x}}(t)\equiv\phi(\mathbf{x},t)$, and in order to tunnel into the other vacuum each of the $q_{\mathbf{x}}$ must tunnel. Let the tunneling amplitude for a single variable $q_{\mathbf{x}}$ be proportional to $e^{-c'}$, for some constant $c'$. The total amplitude is the product of the separate amplitudes so, if $N$ is the number of lattice sites,

$$\text{tunneling amplitude} \sim \prod_{\mathbf{x}} e^{-c'} = e^{-c'N} = e^{-cV} . \tag{11.7}$$
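The interplay between eqs. (11.4), (11.5) and (11.7) can be illustrated numerically. The following is only a minimal sketch: the values of `a` and `c_single` (standing for the constant $c'$) are hypothetical, chosen for illustration, and the off-diagonal element $b$ is simply identified with the product of single-site factors of eq. (11.7).

```python
import math


def two_level_energies(a, b):
    """Eigenvalues of H = [[a, b], [b, a]], the matrix of eq. (11.4).

    Returned in the order (symmetric state, antisymmetric state) of eq. (11.5).
    """
    return a + b, a - b


def tunneling_amplitude(c_single, n_sites):
    """Product of n_sites identical single-site factors exp(-c_single), eq. (11.7)."""
    return math.exp(-c_single * n_sites)


if __name__ == "__main__":
    a = 1.0         # hypothetical diagonal matrix element <+|H|+>
    c_single = 0.5  # hypothetical single-site exponent c'
    for n_sites in (1, 10, 100, 1000):
        b = tunneling_amplitude(c_single, n_sites)
        e_sym, e_anti = two_level_energies(a, b)
        print(f"N = {n_sites:5d}  b = {b:.3e}  splitting = {abs(e_sym - e_anti):.3e}")
```

As the number of sites grows, the splitting $2|b|$ between the two combinations becomes smaller than any fixed resolution, which is the finite-size counterpart of the statement that the mixing disappears in infinite volume.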
In an infinite volume this amplitude vanishes and there is no mixing between the two vacua. In other words, the effective height of the barrier is infinite and therefore we truly have two distinct sectors of the theory, i.e. two different Hilbert spaces $\mathcal{H}_+$, $\mathcal{H}_-$ constructed above the two vacua $|\pm\rangle$ with the usual rules of second quantization. There is no possibility to restore the symmetry via tunneling, and all local operators have vanishing matrix elements between a state in $\mathcal{H}_+$ and a state in $\mathcal{H}_-$.

A characteristic of SSB is the existence of an order parameter which takes a nonzero expectation value on the chosen vacuum. In the example of the ferromagnet the order parameter is the magnetization, i.e. a spatial vector, while in the previous example it was an element of $Z_2$, $\langle\phi\rangle/\eta = \pm1$. In the following we will be interested in situations where the order parameter is a scalar field $\phi$, real or complex. In any case, the order parameter is a quantity which is not invariant under the symmetry
1 More precisely, since vectors that differ by a phase describe the same physical state, we do not have SSB if $U|0\rangle = e^{i\alpha}|0\rangle$, for some constant phase $\alpha$. Conversely, in order to have SSB, besides eq. (11.8) we must also require that $T^a|0\rangle$ is not proportional to $|0\rangle$ itself.
in question, so that a nonvanishing expectation value means that the symmetry is broken. For a Lie group, we can restate the condition of SSB in terms of the action of the generators on the vacuum state. We denote by $U = \exp\{i\theta^aT^a\}$ a generic element of the symmetry group in question, and by $T^a$ the generators. If the vacuum state is invariant, then for any value of the parameters $\theta^a$ we have $U|0\rangle = |0\rangle$ and therefore all generators must annihilate the vacuum,$^1$ so $T^a|0\rangle = 0$ for each $a$. Instead, if the vacuum state is not invariant, there must be one or more generators $T^a$ that do not give zero when acting on the vacuum state,

$$T^a|0\rangle \neq 0 . \tag{11.8}$$
For example, for a ferromagnet in the ordered phase the SO(3) rotation group is broken. The SO(3) generators are the angular momentum operators $J_x$, $J_y$, $J_z$ and if the magnetization is, say, along the $z$-axis, we have $J_z|0\rangle = 0$ (since rotations around the $z$-axis still leave the vacuum state invariant) but $J_x|0\rangle \neq 0$, $J_y|0\rangle \neq 0$. The full SO(3) group is therefore broken to the SO(2) subgroup generated by $J_z$.
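A classical analogue of this counting can be sketched by letting the SO(3) generators act, in the vector representation, on the magnetization vector itself rather than on the quantum state $|0\rangle$. This is an illustration under that simplifying assumption, not the operator statement of eq. (11.8); the helper names are hypothetical.

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def rotation_generator(axis):
    """Real 3x3 generator L_a with (L_a)_{bc} = -eps_{abc}, so that L_a v = e_a x v."""
    return [[-levi_civita(axis, b, c) for c in range(3)] for b in range(3)]


def apply(matrix, vector):
    """Matrix-vector product for 3x3 matrices stored as nested lists."""
    return [sum(row[c] * vector[c] for c in range(3)) for row in matrix]


def broken_axes(magnetization, tol=1e-12):
    """Axes (0=x, 1=y, 2=z) whose generator does not annihilate the order parameter."""
    return [axis for axis in range(3)
            if any(abs(x) > tol for x in apply(rotation_generator(axis), magnetization))]


if __name__ == "__main__":
    # Magnetization along z: rotations about x and y move it, rotations about z do not.
    print(broken_axes([0.0, 0.0, 1.0]))
    # Vanishing magnetization (disordered phase): no generator is broken.
    print(broken_axes([0.0, 0.0, 0.0]))
```

With the magnetization along $z$, two generators act nontrivially and one survives, matching the breaking pattern SO(3) → SO(2) described above.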
11.2 SSB of global symmetries and Goldstone bosons
Consider the Lagrangian for a complex scalar field

$$\mathcal{L} = \partial_\mu\phi^*\partial^\mu\phi - V(\phi) , \tag{11.9}$$

with

$$V(\phi) = \frac{1}{2}\lambda^2\left(|\phi|^2-\eta^2\right)^2 . \tag{11.10}$$

This is a double-well potential for $|\phi|$ and therefore it has a continuous set of minima; writing $\phi = |\phi|e^{i\alpha}$, the vacua are characterized by $|\phi| = \eta$, and $\alpha$ arbitrary. The Lagrangian has a global U(1) invariance

$$\phi \to e^{i\theta}\phi , \tag{11.11}$$
with $\theta$ an arbitrary constant. The scalar field will choose one of these vacua, so that $\alpha = \alpha_0$, and the U(1) symmetry is spontaneously broken. Without loss of generality we can redefine $\alpha$ so that $\alpha_0 = 0$, and therefore on the vacuum

$$\langle\phi\rangle = \eta . \tag{11.12}$$

We want to understand the spectrum of the theory after SSB. This can be done by studying the small oscillations around the vacuum. We therefore write

$$\phi(x) = \eta + \frac{1}{\sqrt{2}}\big(\chi(x) + i\psi(x)\big) , \tag{11.13}$$

where $\chi$ and $\psi$ are real fields (the normalization $1/\sqrt{2}$ is chosen for later convenience). Observe that the set of vacua is a circle of radius $\eta$ in
the complex field plane, and since we are expanding around the point (Re $\phi = \eta$, Im $\phi = 0$), $\chi$ is a fluctuation in the direction orthogonal to the manifold of vacua, while $\psi$ is a fluctuation in the tangential direction, as shown in Fig. 11.2. In other words, $\eta + i\psi$, for $\psi$ constant and infinitesimal, is another vacuum. A small displacement in the direction of $\psi$ does not cost energy since we are moving along a flat direction of the potential (at least to lowest order, i.e. retaining terms quadratic in $\psi$ in the Lagrangian and neglecting cubic and higher-order terms). Instead, with a small displacement in the direction of $\chi$ we feel an approximately quadratic rise of the potential, so this fluctuation costs energy. It is therefore clear that, after quantization, $\psi$ is associated to a massless mode, while $\chi$ is a massive mode. To check this formally, we insert eq. (11.13) into the Lagrangian (11.9), and we find

$$\mathcal{L} = \frac{1}{2}\partial_\mu\chi\partial^\mu\chi + \frac{1}{2}\partial_\mu\psi\partial^\mu\psi - \frac{\lambda^2}{8}\left[(2\sqrt{2}\,\eta)\,\chi + \chi^2 + \psi^2\right]^2 . \tag{11.14}$$
Fig. 11.2 The directions in field space parametrized by $\chi$ and $\psi$ (the potential $V$ over the plane Re $\phi$, Im $\phi$).
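The substitution that leads from eqs. (11.9), (11.10) and (11.13) to eq. (11.14) can be spelled out in three short steps. The block below is a worked sketch of that algebra and introduces no symbols beyond those already defined in the text.

```latex
\begin{align}
|\phi|^2 &= \Big(\eta + \tfrac{1}{\sqrt{2}}\chi\Big)^2 + \tfrac{1}{2}\psi^2
          = \eta^2 + \sqrt{2}\,\eta\,\chi + \tfrac{1}{2}\big(\chi^2 + \psi^2\big) , \\
V &= \tfrac{1}{2}\lambda^2\big(|\phi|^2 - \eta^2\big)^2
   = \tfrac{1}{2}\lambda^2\cdot\tfrac{1}{4}\big(2\sqrt{2}\,\eta\,\chi + \chi^2 + \psi^2\big)^2
   = \tfrac{\lambda^2}{8}\big(2\sqrt{2}\,\eta\,\chi + \chi^2 + \psi^2\big)^2 , \\
\partial_\mu\phi^*\,\partial^\mu\phi
  &= \tfrac{1}{2}\big(\partial_\mu\chi - i\,\partial_\mu\psi\big)\big(\partial^\mu\chi + i\,\partial^\mu\psi\big)
   = \tfrac{1}{2}\partial_\mu\chi\,\partial^\mu\chi + \tfrac{1}{2}\partial_\mu\psi\,\partial^\mu\psi .
\end{align}
```

The only term quadratic in the fields inside $V$ comes from squaring $2\sqrt{2}\,\eta\,\chi$, which is why $\chi$ acquires a mass while $\psi$ does not.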
We see indeed that in this Lagrangian there is a mass term for $\chi$,

$$\frac{1}{2}m_\chi^2 = \frac{\lambda^2}{8}(2\sqrt{2}\,\eta)^2 = \lambda^2\eta^2 , \tag{11.15}$$

but there is no term of the form $(1/2)m_\psi^2\psi^2$, so $\psi$ is massless. In conclusion, in this model the U(1) symmetry (11.11) is spontaneously broken by the choice of vacuum, and at the same time a massless spin-0 boson appears in the spectrum.$^2$ This is an example of a general theorem, the Goldstone theorem, which states that, given a field theory which is Lorentz invariant, local, and has a Hilbert space with a positive definite scalar product, if a continuous global symmetry is spontaneously broken, then in the expansion around the symmetry-breaking vacuum there appears a massless particle for each generator that breaks the symmetry. This particle is called a Goldstone (or Nambu–Goldstone) particle. As in the above example, also in the general case the emergence of massless particles corresponds to the possibility of moving, in field space, in the direction of the manifold of vacua. The dimensionality of the manifold of vacua is equal to the number of generators which break the symmetry. In fact, setting the vacuum energy to zero, by definition we have $H|0\rangle = 0$. Since $T^a$ is the generator of a symmetry transformation, it satisfies $[T^a, H] = 0$ and therefore

$$H\,(T^a|0\rangle) = T^aH|0\rangle = 0 . \tag{11.16}$$
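The counting of massive and massless modes in eqs. (11.14) and (11.15) can also be checked numerically, by taking second derivatives of the classical potential (11.10) at the vacuum along the $\chi$ and $\psi$ directions. The following is a rough sketch using central finite differences; the step `h` and the sample values of `lam` and `eta` are arbitrary illustrative choices.

```python
import math


def potential(chi, psi, lam, eta):
    """V of eq. (11.10) evaluated on phi = eta + (chi + i psi)/sqrt(2), eq. (11.13)."""
    re = eta + chi / math.sqrt(2.0)
    im = psi / math.sqrt(2.0)
    return 0.5 * lam**2 * (re * re + im * im - eta**2) ** 2


def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def squared_masses(lam, eta):
    """Curvatures of V at the vacuum along chi (radial) and psi (tangential)."""
    m2_chi = second_derivative(lambda c: potential(c, 0.0, lam, eta), 0.0)
    m2_psi = second_derivative(lambda p: potential(0.0, p, lam, eta), 0.0)
    return m2_chi, m2_psi


if __name__ == "__main__":
    lam, eta = 1.5, 2.0  # illustrative values only
    m2_chi, m2_psi = squared_masses(lam, eta)
    print("m_chi^2 (numerical):", m2_chi, " expected 2 lam^2 eta^2 =", 2 * lam**2 * eta**2)
    print("m_psi^2 (numerical):", m2_psi, " expected 0")
```

The radial curvature reproduces $m_\chi^2 = 2\lambda^2\eta^2$, while the tangential curvature vanishes up to discretization error, in line with the flat direction along the manifold of vacua.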
So, if $T^a|0\rangle \neq 0$ (and if it is not proportional to $|0\rangle$ itself, see note 1) we have found a new state with the minimum energy, i.e. another vacuum state. This is the origin of the fact that we have a Goldstone particle for each generator which breaks the symmetry. The Goldstone theorem further states that the quantum numbers of the Goldstone particles are the same as the corresponding generator. In most cases, the global symmetry transformations are internal transformations in the field space which do not act on the Lorentz indices of
2 Our discussion is oversimplified, because we assumed that the relevant quantity, for determining whether the vacuum is degenerate, is the classical potential $V(\phi)$. Quantum corrections in general can modify the form of the potential, and generate an effective potential (known as the Coleman–Weinberg effective potential), which is the quantity that really determines whether there is SSB or not. However, replacing $V(\phi)$ with this effective potential, our considerations are correct.
3 However, in supersymmetry the generators exchange fermions with bosons and carry half-integer spin. As a consequence, the Goldstone particles associated to global supersymmetry breaking are fermions.
the fields. For instance, in the above example the symmetry which is broken is U(1), or, equivalently, an O(2) rotation symmetry in the space (Re $\phi$, Im $\phi$). These rotations do not touch Lorentz indices, and therefore the generators are Lorentz scalars. Correspondingly, the associated massless particle is a spin-0 boson.$^3$

In particle physics, an important example of Goldstone bosons is provided by the pions. From the discussion in Section 10.3 we know that, in the limit in which the masses of the up and down quarks can be neglected, QCD has an approximate global SU(2) × SU(2) symmetry. We can now ask how it is realized on the vacuum. If the vacuum is invariant, then the situation is completely analogous to ordinary quantum mechanics, and the spectrum of the system is organized in multiplets (degenerate in mass) of the symmetry group. On the contrary, if a generator fails to annihilate the vacuum, we have seen that there is a corresponding massless particle in the spectrum. Therefore, if the SU(2) × SU(2) approximate global symmetry of QCD were unbroken, all strongly interacting particles should be approximately arranged in representations of SU(2) × SU(2). Since the two SU(2) factors are obtained one from the other with a parity transformation, this means in particular that for each strongly interacting particle there should be a second one, approximately degenerate in mass, and with the opposite parity. Experimentally this is not the case. For instance, the three pions are pseudoscalars, and there exists no triplet of real scalars close in mass to the pions. Rather, the experimental values of masses and quantum numbers of the strongly interacting particles point toward a different alternative: the vector SU(2) is unbroken, and is in fact the isospin symmetry. Correspondingly, particles are organized in isospin multiplets almost degenerate in mass; the three pions form a triplet, the proton and neutron a doublet, etc.
This explains why their mass differences, which are O(1) MeV, are tiny compared to the strong interaction scale, which is rather O(100) MeV. On the contrary the axial SU(2) is spontaneously broken, and as a consequence we have three (because of the three generators of SU(2)) Goldstone bosons, which are pseudoscalar because of the $\gamma^5$ in the generators of the axial SU(2) transformation, see eq. (10.38). More precisely, one uses the term quasi-Goldstone bosons to stress that these are particles which would be massless in the limit of exact symmetry; since instead this SU(2) × SU(2) symmetry is only approximate, these particles are light compared to the other hadrons. Indeed, the three pions fulfill these conditions. They are pseudoscalars, and they are the lightest strongly interacting particles, with masses O(140) MeV rather than the values O(1) GeV typical of the neutron and the proton. Identifying the pions as the pseudo-Goldstone bosons of chiral symmetry allows us to write down effective Lagrangians which govern their dynamics, burying all our ignorance of QCD at large distances into a few phenomenological parameters. This is a more advanced subject, and we refer the reader to the Further Reading section.
11.3 Abelian gauge theories: SSB and superconductivity
To illustrate the effect of SSB on a theory with a local symmetry we start again from the Lagrangian (11.9), but now we gauge the U(1) symmetry. Therefore we introduce a U(1) gauge field $A_\mu$ and we take as Lagrangian

$$\mathcal{L} = (D_\mu\phi)^*D^\mu\phi - V(\phi) - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} , \tag{11.17}$$

with

$$D_\mu\phi = (\partial_\mu + iqA_\mu)\,\phi . \tag{11.18}$$

As before,

$$V(\phi) = \frac{1}{2}\lambda^2\left(|\phi|^2-\eta^2\right)^2 . \tag{11.19}$$

To understand the physical content of the theory, it is convenient to write the complex field $\phi$ in terms of its modulus and a phase, and to expand the modulus around $\eta$,$^4$

$$\phi(x) = |\phi(x)|\,e^{i\alpha(x)} = \left(\eta + \frac{1}{\sqrt{2}}\varphi(x)\right)e^{i\alpha(x)} . \tag{11.20}$$
Now observe that, since under the U(1) local transformation $\phi$ transforms as

$$\phi(x) \to e^{iq\theta(x)}\phi(x) , \tag{11.21}$$

with $\theta(x)$ the parameter of the gauge transformation, we can fix the gauge freedom by setting $\alpha(x) = 0$ in eq. (11.20). In other words, we have used the gauge freedom to remove one degree of freedom from the complex field $\phi$, so that we are left with just a single real scalar field $\varphi$. The phase $\alpha(x)$ parametrizes the manifold of vacua, so it is the field that, in the case of global symmetries, describes the Goldstone boson.$^5$ We see that when we break a local symmetry the Goldstone boson is eliminated from the physical spectrum by gauge invariance. After setting $\alpha(x) = 0$, using eqs. (11.18) and (11.20) we get

$$D_\mu\phi = \frac{1}{\sqrt{2}}(\partial_\mu\varphi) + iq\left(\eta + \frac{1}{\sqrt{2}}\varphi\right)A_\mu \tag{11.22}$$

and, substituting into eq. (11.17),

$$\mathcal{L} = \frac{1}{2}\partial^\mu\varphi\,\partial_\mu\varphi - \lambda^2\left(\eta^2\varphi^2 + \frac{\sqrt{2}\,\eta}{2}\varphi^3 + \frac{1}{8}\varphi^4\right) + q^2\left(\eta + \frac{1}{\sqrt{2}}\varphi\right)^2A_\mu A^\mu - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} . \tag{11.23}$$
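For readers who want to see where each term of eq. (11.23) comes from, the intermediate algebra can be sketched as follows, using only eqs. (11.19), (11.20) with $\alpha(x) = 0$, and (11.22).

```latex
\begin{align}
|\phi|^2 - \eta^2 &= \Big(\eta + \tfrac{1}{\sqrt{2}}\varphi\Big)^2 - \eta^2
                   = \sqrt{2}\,\eta\,\varphi + \tfrac{1}{2}\varphi^2 , \\
V &= \tfrac{1}{2}\lambda^2\Big(\sqrt{2}\,\eta\,\varphi + \tfrac{1}{2}\varphi^2\Big)^2
   = \lambda^2\Big(\eta^2\varphi^2 + \tfrac{\sqrt{2}\,\eta}{2}\,\varphi^3 + \tfrac{1}{8}\,\varphi^4\Big) , \\
(D_\mu\phi)^* D^\mu\phi
  &= \tfrac{1}{2}\,\partial_\mu\varphi\,\partial^\mu\varphi
   + q^2\Big(\eta + \tfrac{1}{\sqrt{2}}\varphi\Big)^2 A_\mu A^\mu .
\end{align}
```

In the last line the cross terms cancel because $\partial_\mu\varphi/\sqrt{2}$ is real while the gauge term is purely imaginary. Reading off the quadratic pieces, $\lambda^2\eta^2\varphi^2$ is of the form $\tfrac{1}{2}m_\varphi^2\varphi^2$ and $q^2\eta^2A_\mu A^\mu$ is of the form $\tfrac{1}{2}m_A^2A_\mu A^\mu$.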
We recognize a standard kinetic term for a real massive scalar field $\varphi$, with mass $m_\varphi^2 = 2\lambda^2\eta^2$. For the gauge field, the quadratic term is now

$$\mathcal{L}_A = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}m_A^2A_\mu A^\mu , \tag{11.24}$$
4 Polar coordinates become singular at the origin, therefore this parametrization is only useful when $|\varphi| \ll \eta$. However, to understand the particle content of the theory it is sufficient to limit ourselves to $\varphi$ infinitesimal and therefore, as long as $\eta \neq 0$, we can use this parametrization without problems.
5 In fact, $\alpha$ parametrizes exactly the direction in field space corresponding to the vacuum manifold, while the field $\psi$ in the previous section parametrizes this direction only for infinitesimal displacements.
with

$$m_A^2 = 2q^2\eta^2 . \tag{11.25}$$

Taking the variation of eq. (11.24) we find the equation of motion

$$\partial_\mu F^{\mu\nu} + m_A^2A^\nu = 0 . \tag{11.26}$$

Equation (11.26) is known as the Proca equation (we already met it in Exercise 4.4). Contracting it with $\partial_\nu$ we have $\partial_\mu\partial_\nu F^{\mu\nu} + m_A^2\,\partial_\nu A^\nu = 0$; since $\partial_\mu\partial_\nu F^{\mu\nu} = 0$ automatically, and $m_A \neq 0$, we find

$$\partial_\nu A^\nu = 0 . \tag{11.27}$$
Using this condition, $\partial_\mu F^{\mu\nu} = \partial_\mu(\partial^\mu A^\nu - \partial^\nu A^\mu)$ becomes equal to $\Box A^\nu$ and eq. (11.26) gives

$$(\Box + m_A^2)A^\nu = 0 . \tag{11.28}$$

Therefore the Proca equation describes a massive gauge boson. We saw in Section 2.4.1 that a vector field $A^\mu$, from the point of view of spatial rotations, decomposes into $0\oplus1$. Expanding $A_\mu(x)$ in plane waves, the condition $\partial_\mu A^\mu = 0$ becomes $k_\mu\epsilon^\mu(k) = 0$ and eliminates, in a covariant way, the component with polarization vector $\epsilon^\mu(k)\sim k^\mu$, since for this polarization we have $k_\mu\epsilon^\mu(k)\sim k^2 = m_A^2 \neq 0$. In the rest frame of the particle (which exists, since $m_A \neq 0$), $k^\mu = m_A(1,0,0,0)$ and the polarization vector which has been eliminated is $\epsilon^\mu(k) = (1,0,0,0)$ which, from the point of view of spatial rotations, is a scalar. Therefore eq. (11.27) eliminates the spin-0 part and we are left with a pure spin 1. In conclusion, eq. (11.26) describes a massive spin-1 particle.

From this example we learn that the spontaneous breaking of a local symmetry does not produce Goldstone bosons, but instead the gauge field has acquired a mass proportional to the vacuum expectation value of the scalar field. In this context the scalar field $\phi$ is called a Higgs field, and the mechanism that produces a mass for the gauge boson is called the Higgs mechanism.

It is interesting to compare the number of degrees of freedom with and without SSB. If in the potential we set $\eta = 0$, then there is no SSB; the scalar field has two real components. We cannot use the gauge invariance to eliminate the phase $\alpha$ as before, because when $\eta = 0$ the decomposition (11.20) of $\phi$ in terms of two real fields $\varphi$, $\alpha$ is not well defined: in fact in this case $\varphi = \sqrt{2}\,|\phi|$ and therefore $\varphi \geq 0$, so it is no longer a scalar field which can freely perform at least infinitesimal fluctuations around $\varphi = 0$.
Rather, gauge invariance can be used to eliminate the longitudinal components of $A_\mu$, as we studied when we quantized the free electromagnetic field in Section 4.3.2, and the remaining gauge field has two physical degrees of freedom, the two transverse polarizations. In total we have two physical degrees of freedom from the Higgs field and two from the gauge field. After SSB, the scalar field has just one real component, but the gauge field is massive, and a massive spin-1 particle has three degrees of freedom. In total, we have 1 + 3 = 4 degrees of freedom. So, the Higgs mechanism implies a reshuffling of the degrees of freedom. The field that, in the case
of a global symmetry, was a Goldstone boson, is turned into the third polarization state of a massive spin-1 particle. One might ask what we really gain by giving a mass to the gauge boson with the Higgs mechanism, rather than adding by hand a mass term $(1/2)m_A^2A_\mu A^\mu$ to the Lagrangian (a mass term for the gauge field generated by SSB is called a soft mass term, in contrast to a term added by hand, which is called a hard mass term). The point is that, in the Higgs mechanism, the Lagrangian is gauge invariant, which is not the case if we instead add a mass term by hand. The breaking of the symmetry takes place at the level of the vacuum. It can be shown that such a spontaneous breaking preserves a number of good properties of the unbroken theory, and in particular the theory is still renormalizable. Intuitively, this comes from the fact that at very high energies $E \gg \eta$ we can neglect $\eta$ and the UV properties of the theory are the same as in the case $\eta = 0$. If we instead break the gauge symmetry by hand the theory is not renormalizable.

Spontaneous breaking of the U(1) gauge symmetry is realized in Nature in the phenomenon of superconductivity. Let us recall that the relation between the electric current $\mathbf{j}$ and an applied external electric field $\mathbf{E}$ is $\mathbf{j} = \sigma\mathbf{E}$, where the proportionality constant $\sigma$ is called the conductivity. A superconductor is an object where $\sigma = \infty$. In a piece of material with finite volume we have a finite number of electrons, so we cannot have an infinite current, and therefore the electric field $\mathbf{E}$ is forced to be zero inside the superconductor, and the Maxwell equation $\dot{\mathbf{B}} = -\nabla\times\mathbf{E}$ states that the magnetic field $\mathbf{B}$ is constant in time. Therefore, if $\mathbf{B}$ was zero at some initial time, it will remain zero inside the superconductor even if we switch on an external magnetic field outside the superconductor. This means the field lines of the applied external magnetic field cannot penetrate inside the superconductor (Meissner effect).
At the microscopic level, what happens is that the electrons in the superconductor form currents on the surface, which screen the external field.$^6$ There is therefore a characteristic screening length $l$, and inside the superconductor the external magnetic field drops exponentially,

$$B(x) = B(0)\,e^{-x/l} , \tag{11.29}$$
where $x = 0$ represents the interface between the superconductor (at $x > 0$) and the external space. The physical mechanism behind superconductivity is that, due basically to an interaction mediated by phonons, pairs of electrons bind together in a singlet state, forming the so-called Cooper pairs. This composite object is therefore described, at the level of effective theory, by a charged scalar field, with charge equal to twice the electron charge. The effective Lagrangian describing the interaction of this scalar field with the electromagnetic field is given by eq. (11.17) (with $q = 2e$). The result (11.29) is then understood in terms of the Higgs mechanism: the scalar field describing the Cooper pair plays the role of the Higgs field and develops a vacuum expectation value; as a consequence, the photon acquires a mass $\mu$, and its wave equation becomes eq. (11.28). Then also the electric and magnetic fields
6 Since a given piece of superconducting material, with a ﬁnite volume, has only a ﬁnite number of electrons, there is a maximum magnetic ﬁeld Bc that can be screened. If the applied external ﬁeld is higher than Bc , it turns out that the ﬁeld penetrates in a nonhomogeneous manner. For type II superconductors, the magnetic ﬁeld penetrates in the form of narrow ﬂux tubes.
satisfy a massive Klein–Gordon (KG) equation,

$$(\Box + \mu^2)\,\mathbf{E} = 0 , \qquad (\Box + \mu^2)\,\mathbf{B} = 0 . \tag{11.30}$$
When we switch on an external magnetic field, after a transient time we will have a static field configuration. Therefore the equation for $\mathbf{B}$ becomes $\nabla^2\mathbf{B} = 0$ at $x < 0$ and $(\nabla^2 - \mu^2)\mathbf{B} = 0$ at $x > 0$. The solution of this equation, at $x > 0$, is given by (11.29) with $l$ identified with $\mu^{-1}$. The penetration length is therefore the inverse of the mass that the photon has inside a superconductor.
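To get a feeling for the scales involved, the identification $l = \mu^{-1}$ can be converted to ordinary units through $\mu c^2 = \hbar c/l$. The sketch below assumes a penetration length of 50 nm purely for illustration (it is not a value taken from the text), and uses $\hbar c \simeq 197.327$ eV nm.

```python
import math

HBAR_C_EV_NM = 197.327  # hbar * c in eV nm (numerically the same as MeV fm)


def photon_mass_ev(penetration_length_nm):
    """Effective photon mass (as an energy mu c^2, in eV) from l = 1/mu."""
    return HBAR_C_EV_NM / penetration_length_nm


def field_inside(b_surface, depth_nm, penetration_length_nm):
    """Exponential decay of eq. (11.29): B(x) = B(0) exp(-x / l)."""
    return b_surface * math.exp(-depth_nm / penetration_length_nm)


if __name__ == "__main__":
    l_nm = 50.0  # hypothetical penetration length, for illustration only
    print("effective photon mass [eV]:", photon_mass_ev(l_nm))
    for depth in (0.0, 50.0, 250.0):
        print(f"B({depth:5.1f} nm) / B(0) =", field_inside(1.0, depth, l_nm))
```

For a penetration length of this order the effective photon mass is a few eV, and the field is suppressed by a factor $e^{-5}$ already at a depth of five penetration lengths.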
11.4 Nonabelian gauge theories: the masses of $W^\pm$ and $Z^0$
We consider now an SU(2) gauge theory with a doublet of complex scalar fields $\phi^\alpha$, with $\alpha = 1, 2$, transforming in the fundamental representation. We call $\phi$ the Higgs field. The covariant derivative is

$$(D_\mu\phi)^\alpha = \partial_\mu\phi^\alpha - igA^a_\mu(T^a)^\alpha{}_\beta\,\phi^\beta , \tag{11.31}$$
and the generators $T^a$ in the fundamental representation, for SU(2), are $T^a = \sigma^a/2$, where $\sigma^a$ are the Pauli matrices. Since $\phi^\dagger\phi$ is invariant under $\phi\to U\phi$ with $U$ unitary, any function of $\phi^\dagger\phi$ is gauge invariant, and we can also write a gauge-invariant potential term $V(\phi^\dagger\phi)$. Therefore the Lagrangian is

$$\mathcal{L}_{SU(2)\text{-Higgs}} = (D_\mu\phi)^\dagger(D^\mu\phi) - V(\phi^\dagger\phi) - \frac{1}{4}F^a_{\mu\nu}F^{a\,\mu\nu} , \tag{11.32}$$

and we choose

$$V(\phi^\dagger\phi) = \frac{1}{2}\lambda^2\left(\phi^\dagger\phi - \eta^2\right)^2 . \tag{11.33}$$

We have a degenerate family of vacua, at $\phi^\dagger\phi = \eta^2$. Following the same strategy used for the SSB of the U(1) gauge invariance, we use the gauge freedom to eliminate some components of $\phi$ (similarly to the elimination of $\alpha(x)$ in the previous section). Here we have a field $\phi$ with two complex components, i.e. four real components, and an SU(2) transformation, which has three parameters. We can then use the gauge freedom to eliminate three of the four components of $\phi$, writing it as

$$\phi = \begin{pmatrix} 0 \\ \eta + \frac{1}{\sqrt{2}}\chi \end{pmatrix} , \tag{11.34}$$

where $\chi$ is a real scalar field. It is convenient to introduce the matrices

$$\sigma^+ = \frac{1}{\sqrt{2}}\left(\sigma^1 + i\sigma^2\right) = \sqrt{2}\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} , \tag{11.35}$$

$$\sigma^- = \frac{1}{\sqrt{2}}\left(\sigma^1 - i\sigma^2\right) = \sqrt{2}\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} , \tag{11.36}$$
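and the fields

$$A^\pm_\mu = \frac{1}{\sqrt{2}}\left(A^1_\mu \pm iA^2_\mu\right) , \tag{11.37}$$

so that

$$\sigma^aA^a_\mu = \sigma^+A^-_\mu + \sigma^-A^+_\mu + \sigma^3A^3_\mu . \tag{11.38}$$

Then, the covariant derivative becomes

$$D_\mu\phi = \begin{pmatrix} 0 \\ \frac{1}{\sqrt{2}}\partial_\mu\chi \end{pmatrix} - i\,\frac{g}{2}\left(\eta + \frac{\chi}{\sqrt{2}}\right)\begin{pmatrix} \sqrt{2}\,A^-_\mu \\ -A^3_\mu \end{pmatrix} . \tag{11.39}$$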
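The decomposition (11.38) and the column vector appearing in eq. (11.39) are easy to verify numerically with explicit Pauli matrices. The sketch below uses plain nested lists and arbitrary sample values for the field components at a single spacetime point; the function names are illustrative and not taken from any library.

```python
import math

SIGMA = [
    [[0, 1], [1, 0]],      # sigma^1
    [[0, -1j], [1j, 0]],   # sigma^2
    [[1, 0], [0, -1]],     # sigma^3
]


def combine(terms):
    """Return sum_k coeff_k * matrix_k for a list of (coeff, 2x2 matrix) pairs."""
    return [[sum(c * m[r][s] for c, m in terms) for s in range(2)] for r in range(2)]


def sigma_dot_a(a1, a2, a3):
    """Left-hand side of eq. (11.38): sigma^a A^a in the Cartesian basis."""
    return combine([(a1, SIGMA[0]), (a2, SIGMA[1]), (a3, SIGMA[2])])


def sigma_dot_a_charged(a1, a2, a3):
    """Right-hand side of eq. (11.38), built from eqs. (11.35)-(11.37)."""
    s = 1.0 / math.sqrt(2.0)
    sigma_plus = combine([(s, SIGMA[0]), (1j * s, SIGMA[1])])
    sigma_minus = combine([(s, SIGMA[0]), (-1j * s, SIGMA[1])])
    a_plus = s * (a1 + 1j * a2)
    a_minus = s * (a1 - 1j * a2)
    return combine([(a_minus, sigma_plus), (a_plus, sigma_minus), (a3, SIGMA[2])])


def act_on_vacuum_direction(matrix, v):
    """Apply a 2x2 matrix to the doublet (0, v) of eq. (11.34)."""
    return [matrix[0][1] * v, matrix[1][1] * v]


if __name__ == "__main__":
    a1, a2, a3, v = 0.3, -1.2, 0.7, 2.0  # arbitrary sample values
    print(sigma_dot_a(a1, a2, a3))
    print(sigma_dot_a_charged(a1, a2, a3))
    print(act_on_vacuum_direction(sigma_dot_a(a1, a2, a3), v))
```

Acting on the doublet $(0, v)$, the matrix $\sigma^aA^a_\mu$ returns $(\sqrt{2}A^-_\mu v, -A^3_\mu v)$, which is the column multiplying $-ig/2$ in eq. (11.39).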
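Recalling from the definition (11.37) that $(A^-_\mu)^* = A^+_\mu$, we find

$$(D^\mu\phi)^\dagger(D_\mu\phi) = \frac{1}{2}\partial^\mu\chi\,\partial_\mu\chi + \frac{g^2}{4}\left(\eta + \frac{\chi}{\sqrt{2}}\right)^2A^{3\,\mu}A^3_\mu + \frac{g^2}{2}\left(\eta + \frac{\chi}{\sqrt{2}}\right)^2A^+_\mu A^{-\mu} . \tag{11.40}$$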
Apart from the standard kinetic term of the $\chi$ field and from cubic and quartic couplings between $\chi$ and the gauge fields, we recognize a mass term for $A^3_\mu$,

$$\frac{1}{2}m_A^2 = \frac{g^2}{4}\eta^2 \tag{11.41}$$

and, using

$$A^+_\mu A^{-\mu} = \frac{1}{2}\left(A^1_\mu A^{1\,\mu} + A^2_\mu A^{2\,\mu}\right) , \tag{11.42}$$

we see that the term $(g^2/2)\,\eta^2A^+_\mu A^{-\mu}$ gives the same mass $m_A$ to both $A^1_\mu$ and $A^2_\mu$ or, equivalently, to their linear combinations $A^\pm_\mu$. Therefore all three gauge bosons become massive, with a mass

$$m_A = \frac{g\eta}{\sqrt{2}} . \tag{11.43}$$
In the Standard Model the situation is similar, but the gauge group now is SU(2) × U(1). We have three gauge bosons $A^a_\mu$ associated with SU(2) and one gauge boson $B_\mu$ associated with U(1), and two different gauge couplings, $g$ for SU(2) and $g'$ for U(1). This means that on a field in a generic representation the covariant derivative is

$$D_\mu = \partial_\mu - igT^aA^a_\mu - ig'SB_\mu , \tag{11.44}$$

where $T^a$ are the SU(2) generators in the representation of interest, and $S$ is the charge of the particle in question relative to the U(1) group, i.e. the parameter that labels the U(1) representation. The Higgs field $\phi$ is an SU(2) doublet (so that on it $T^a = \sigma^a/2$) and is given the assignment $S = 1/2$.$^7$ Therefore

$$D_\mu\phi = \left(\partial_\mu - ig\,\frac{\sigma^a}{2}A^a_\mu - i\,\frac{g'}{2}B_\mu\right)\phi . \tag{11.45}$$
7 Note that both components of the doublet have the same assignment of S. In general, on any SU (2) multiplet, S is a constant times the unit matrix, which means that S commutes with the SU (2) generators, as it should be, since the gauge group is the direct product of SU (2) and U (1).
The potential for the Higgs field is the same as in eq. (11.33) so that again we can choose a gauge such that

$$\phi = \begin{pmatrix} 0 \\ \eta + \frac{1}{\sqrt{2}}\chi \end{pmatrix} . \tag{11.46}$$

Computing $(D^\mu\phi)^\dagger D_\mu\phi$ using eq. (11.45) we find terms quadratic in the gauge fields, of the form

$$\frac{1}{4}\eta^2\left(gA^3_\mu - g'B_\mu\right)\left(gA^{3\,\mu} - g'B^\mu\right) + \frac{1}{2}g^2\eta^2A^+_\mu A^{-\mu} . \tag{11.47}$$

It is convenient to introduce the notation

$$\bar g = \sqrt{g^2 + g'^2} , \qquad g/\bar g = \cos\theta_W , \qquad g'/\bar g = \sin\theta_W , \tag{11.48}$$

where $\theta_W$ is the Weinberg angle. We also change notation, $W^\pm_\mu = A^\pm_\mu$, and we define

$$Z^0_\mu \equiv A^3_\mu\cos\theta_W - B_\mu\sin\theta_W . \tag{11.49}$$

Then we see from eq. (11.47) that the $Z$ boson gets a mass

$$m_Z = \frac{1}{\sqrt{2}}\,\bar g\,\eta \tag{11.50}$$

while the $W$ bosons get a mass

$$m_W = \frac{1}{\sqrt{2}}\,g\,\eta . \tag{11.51}$$

The ratio of the $W$ to $Z$ mass is therefore given in terms of the Weinberg angle,

$$\frac{m_W}{m_Z} = \cos\theta_W . \tag{11.52}$$

Instead, the other orthogonal combination of $A^3_\mu$ and $B_\mu$,

$$A_\mu \equiv A^3_\mu\sin\theta_W + B_\mu\cos\theta_W \tag{11.53}$$
remains massless and is therefore identiﬁed with the photon.
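The mass pattern of eqs. (11.47)–(11.53) can be checked by writing the neutral part of eq. (11.47) as $\tfrac{1}{2}(A^3, B)\,M^2\,(A^3, B)^T$ and diagonalizing the 2×2 matrix $M^2$. The sketch below does this with the standard library only; the input values of `g`, `g_prime` and `eta` are rough illustrative numbers chosen here, not fitted parameters.

```python
import math


def gauge_boson_masses(g, g_prime, eta):
    """m_W, m_Z and cos(theta_W) from eqs. (11.48), (11.50) and (11.51)."""
    g_bar = math.hypot(g, g_prime)
    m_w = g * eta / math.sqrt(2.0)
    m_z = g_bar * eta / math.sqrt(2.0)
    return m_w, m_z, g / g_bar


def neutral_mass_matrix(g, g_prime, eta):
    """Matrix M2 such that (1/2)(A3, B) M2 (A3, B)^T is the neutral part of eq. (11.47)."""
    pref = 0.5 * eta**2
    return [[pref * g * g, -pref * g * g_prime],
            [-pref * g * g_prime, pref * g_prime * g_prime]]


def symmetric_2x2_eigenvalues(m):
    """Eigenvalues (smaller, larger) of a real symmetric 2x2 matrix."""
    half_trace = 0.5 * (m[0][0] + m[1][1])
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(max(half_trace**2 - det, 0.0))
    return half_trace - root, half_trace + root


if __name__ == "__main__":
    g, g_prime, eta = 0.65, 0.35, 174.0  # illustrative inputs (eta in GeV)
    m_w, m_z, cos_w = gauge_boson_masses(g, g_prime, eta)
    low, high = symmetric_2x2_eigenvalues(neutral_mass_matrix(g, g_prime, eta))
    print("m_W =", m_w, " m_Z =", m_z, " m_W/m_Z =", m_w / m_z, " cos(theta_W) =", cos_w)
    print("neutral squared masses:", low, high, " m_Z^2 =", m_z**2)
```

One eigenvalue vanishes (the photon direction, proportional to $(g', g)$ in the $(A^3, B)$ basis) and the other equals $m_Z^2 = \bar g^2\eta^2/2$, so the ratio $m_W/m_Z$ reproduces $\cos\theta_W$ for any choice of the inputs.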
Summary of chapter • SSB takes place when, rather than a single vacuum invariant under the symmetry in question, we have a family of vacua which transform among themselves under the action of the symmetry group. The system will eventually settle into one of these vacua, and the symmetry is spontaneously broken by this choice. • If we have a quantum system with a ﬁnite number of degrees of freedom there is in general an exponentially small, but nevertheless ﬁnite, amplitude for tunneling between the diﬀerent perturbative vacua. The true vacuum will be a superposition of the perturbative vacua which respects the symmetry, and therefore there is no SSB. However, in a system with an inﬁnite number of degrees of freedom, as in QFT, the tunneling amplitude is zero because each degree of freedom should tunnel, and therefore SSB is possible.
• When a global symmetry is spontaneously broken, in the spectrum of the theory there is a massless particle for each broken symmetry generator. In particular, the pions are the Goldstone bosons associated to the SSB of the axial SU(2) symmetry of QCD. They would be exactly massless if the symmetry were exact. Since it is only approximate, they are just lighter than the other hadrons. • When a local symmetry is spontaneously broken, the gauge field becomes massive and the would-be Goldstone boson is turned into the third physical degree of freedom of the massive spin-1 gauge field. This mechanism gives an effective mass to the photon in a superconductor (which is at the origin of the Meissner effect) and gives a mass to the gauge bosons $W^\pm$ and $Z^0$ of the electroweak theory.
Further reading
• For spontaneous symmetry breaking in gauge theories, a clear discussion is given for instance in Okun (1982), Chapter 20 and in Coleman (1985), Chapter 5.
• An advanced discussion of SSB can be found in Weinberg, vol. II, Chapters 19 and 21.
• For a discussion of pion dynamics and chiral Lagrangians see Georgi (1984), Chapter 5, Coleman (1985), Chapter 2 and Weinberg, vol. II, Chapter 19.
12
Solutions to exercises
12.1 Chapter 1
12.2 Chapter 2
12.3 Chapter 3
12.4 Chapter 4
12.5 Chapter 5
12.6 Chapter 6
12.7 Chapter 7
12.8 Chapter 8
12.1
Chapter 1
(1.1) Since photons are massless, the only energy scale is provided by $k_B T$. Dimensionally, in units $\hbar = c = 1$, an energy density is (mass)$^4$, therefore the photon energy density must be $\rho_\gamma \sim (k_B T)^4$. This gives $\rho_\gamma$ in units of (eV)$^4$. Transforming to GeV/cm$^3$ using $200\ \mathrm{MeV\,fm} \simeq 1$ gives $\rho_\gamma/\rho_c \sim 5\times 10^{-5}$. At the present epoch of the Universe the energy density in photons, or more generally in relativistic particles, is much smaller than in nonrelativistic matter.
(1.2) A temperature $T \simeq 4.5\times 10^6$ K corresponds to an energy $k_B T \simeq 388$ eV (using $k_B T \simeq 1/38.68$ eV at $T = 300$ K). For a relativistic particle at the equilibrium temperature $T$, the average energy is $E \simeq 3k_B T$ and therefore the average photon energy is $E_\gamma = O(1)$ keV. Since $E_\gamma \ll m_e \ll m_p$, we can use the Thomson formula (1.16) for the scattering on electrons and the same formula, with $m_e$ replaced by $m_p$, for the scattering on protons. Therefore $\sigma(\gamma p \to \gamma p) \simeq 8\pi\alpha^2/(3m_p^2)$. This is smaller than the $\gamma e \to \gamma e$ cross-section by a factor $m_e^2/m_p^2$ and therefore the contribution of the protons to $l$ is negligible. Because of electric charge neutrality, in our simplified model of the Sun the electron number density is equal to the proton number density and is $n = \rho/(m_e + m_p) \simeq \rho/m_p \simeq 0.8\times 10^{24}$ cm$^{-3}$. Inserting the numerical value for the Thomson cross-section, $\sigma(\gamma e \to \gamma e) \simeq 6.65\times 10^{-25}$ cm$^2$, we find $l \simeq 1.8$ cm. More accurate modeling of the Sun gives $l \simeq 0.5$ cm. The photons therefore perform a random walk of step $l$ inside the Sun. For a random walk in one dimension, after $N$ steps we have $\langle x^2\rangle = N l^2$. In three dimensions a radial distance $R_\odot$ is covered in $N$ steps with $R_\odot^2 = (1/3)N l^2$ because, if we denote by $x$ the axis along which the photon finally escaped, not all steps have been performed along the $x$ direction. Rather, in each step $\langle x^2 + y^2 + z^2\rangle$ increases by $l^2$, so $x$ effectively performs a random walk with squared step $l^2/3$. Therefore we get an escape time $t = N l/c = 3R_\odot^2/(lc) \simeq 3\times 10^4$ yr.
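The numbers quoted in solution (1.2) can be reproduced with a few lines of Python. This is only a rough sketch: the cross-section, the electron density and the improved mean free path are the values quoted in the solution, while the solar radius, the speed of light and the length of a year are standard values assumed here.

```python
# Rough check of solution (1.2): photon mean free path and escape time in the Sun.
# SIGMA_THOMSON and N_ELECTRON are the values quoted in the solution;
# R_SUN, C and YEAR are standard values assumed for this sketch.

SIGMA_THOMSON = 6.65e-25   # cm^2
N_ELECTRON = 0.8e24        # cm^-3
R_SUN = 6.96e10            # cm (assumed)
C = 2.998e10               # cm/s (assumed)
YEAR = 3.156e7             # s (assumed)


def mean_free_path(n, sigma):
    """Mean free path l = 1 / (n sigma), in cm."""
    return 1.0 / (n * sigma)


def escape_time_years(radius, step):
    """Random-walk escape time t = 3 R^2 / (l c), converted to years."""
    return 3.0 * radius**2 / (step * C) / YEAR


l_simple = mean_free_path(N_ELECTRON, SIGMA_THOMSON)
t_escape = escape_time_years(R_SUN, 0.5)  # step l = 0.5 cm from the more accurate model

print(f"l = {l_simple:.2f} cm, escape time = {t_escape:.1e} yr")
```

The escape time scales as $1/l$, so using the cruder estimate of $l$ instead of 0.5 cm shortens it by the corresponding factor.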
(1.3) For slow particles the largest length-scale is the de Broglie wavelength $\lambda = 1/(mv)$. For the neutron $m \simeq 939.56$ MeV, so $E = (1/2)mv^2 \sim 1$ MeV gives $v \sim 0.046$, $\lambda \sim 4.5$ fm and $\sigma \sim \pi\lambda^2 \sim 0.7$ barn.
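As a quick arithmetic check of solution (1.3), the following sketch evaluates $v$, $\lambda$ and $\pi\lambda^2$. The conversion constant $\hbar c \simeq 197.327$ MeV fm is an assumed standard value, and 1 barn = 100 fm$^2$.

```python
import math

HBAR_C = 197.327     # MeV fm (assumed standard value)
M_NEUTRON = 939.56   # MeV, as quoted in the solution
E_KIN = 1.0          # MeV, kinetic energy used in the solution

# Nonrelativistic velocity in units of c, from E = (1/2) m v^2
v = math.sqrt(2.0 * E_KIN / M_NEUTRON)

# De Broglie wavelength lambda = 1/(m v), converted to fm with hbar*c
lam = HBAR_C / (M_NEUTRON * v)

# Geometric estimate of the cross-section, converted from fm^2 to barn
sigma_barn = math.pi * lam**2 / 100.0

print(f"v = {v:.3f}, lambda = {lam:.2f} fm, sigma = {sigma_barn:.2f} barn")
```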
12.2
Chapter 2
(2.1) In the rest frame of the particle, $E = m$ and $p = 0$. Performing a boost along a direction, say the $x$-axis, $E$ and $p \equiv p_x$ transform as $t$ and $x$ in eq. (2.18), so after the boost $E = m\cosh\eta$ and $p = m\sinh\eta$, and therefore $(E+p)/(E-p) = e^{2\eta}$. Performing a further boost with rapidity $\eta'$ in the same direction, $E \to E\cosh\eta' + p\sinh\eta'$ and $p \to E\sinh\eta' + p\cosh\eta'$, so
$$ e^{2\eta} \to \frac{(E\cosh\eta' + p\sinh\eta') + (E\sinh\eta' + p\cosh\eta')}{(E\cosh\eta' + p\sinh\eta') - (E\sinh\eta' + p\cosh\eta')} = e^{2\eta'}\,\frac{E+p}{E-p} = e^{2\eta+2\eta'} . \tag{12.1} $$
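The additivity of rapidities derived in solution (2.1) is easy to confirm numerically. The sketch below applies two successive boosts to a particle at rest and reads off the rapidity from $(E+p)/(E-p)$; the mass and the two rapidities are arbitrary illustrative values.

```python
import math


def boost(E, p, eta):
    """Boost (E, p) along the x-axis with rapidity eta, as in solution (2.1)."""
    return (E * math.cosh(eta) + p * math.sinh(eta),
            E * math.sinh(eta) + p * math.cosh(eta))


def rapidity(E, p):
    """Rapidity from (E + p)/(E - p) = exp(2 eta)."""
    return 0.5 * math.log((E + p) / (E - p))


m, eta1, eta2 = 1.0, 0.7, 1.3   # illustrative values only

E1, p1 = boost(m, 0.0, eta1)    # particle at rest boosted by eta1
E2, p2 = boost(E1, p1, eta2)    # further boost by eta2

total = rapidity(E2, p2)        # should equal eta1 + eta2
mass_shell = E2**2 - p2**2      # should remain equal to m^2

print(total, eta1 + eta2, mass_shell)
```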
(2.2) A generic tensor $T^{i_1\ldots i_N}$ without any symmetry properties, from the point of view of angular momenta, is the direct product of $N$ times the vector representation, $T^{i_1\ldots i_N} = \mathbf{1}\otimes\mathbf{1}\otimes\ldots\otimes\mathbf{1}$, so it contains spin up to $j = N$. Decomposing $T^{i_1\ldots i_N}$ in irreducible representations, we must remove the traces and each pair of indices must be symmetrized or antisymmetrized. When we remove a trace two indices are contracted and we are left with a tensor with two fewer indices, which can have only up to spin $N-2$. When we antisymmetrize over two indices $(i,j)$ we can then contract with $\epsilon^{ijk}$, so we obtain a tensor with one index fewer, and maximum spin $N-1$. Therefore the spin $N$ in $T^{i_1\ldots i_N}$ can be neither in the traces nor in the tensors in which some indices have been antisymmetrized, and must be in the totally symmetric and traceless tensor. Typical examples are the quadrupole moment of a mass distribution $\rho(\mathbf{x})$ (or of a charge distribution),
$$ Q^{ij} = \int d^3x\, \rho(\mathbf{x})\left(x^i x^j - \frac{1}{3}\delta^{ij}|\mathbf{x}|^2\right) , \tag{12.2} $$
which is a spin-2 operator. A spin-3 operator is the octupole moment,
$$ O^{ijk} = M^{ijk} - \frac{1}{5}\left(\delta^{ij}M^{llk} + \delta^{ik}M^{ljl} + \delta^{jk}M^{ill}\right) , \tag{12.3} $$
where the index $l$ is summed over and
$$ M^{ijk} = \int d^3x\, \rho(\mathbf{x})\, x^i x^j x^k . \tag{12.4} $$
(2.3) Let $v^0 = \xi_R^\dagger\psi_R$ and $v^i = \xi_R^\dagger\sigma^i\psi_R$. We verify that under boosts $v^0$ and $v^i$ transform as appropriate for a contravariant four-vector. We can always take the $x$-axis as the boost direction, and it is also sufficient to consider an infinitesimal boost. Using eq. (2.60),
$$ \xi_R^\dagger\psi_R \to \xi_R^\dagger e^{\eta\sigma^1}\psi_R \simeq \xi_R^\dagger\psi_R + \eta\,\xi_R^\dagger\sigma^1\psi_R , \qquad \xi_R^\dagger\sigma^1\psi_R \to \xi_R^\dagger e^{\eta\sigma^1}\sigma^1\psi_R \simeq \eta\,\xi_R^\dagger\psi_R + \xi_R^\dagger\sigma^1\psi_R . \tag{12.5} $$
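Therefore $v^0 \to v^0 + \eta v^1$ and $v^1 \to \eta v^0 + v^1$, which is the infinitesimal form of eq. (2.18). Observe that instead $\bar{v}^\mu \equiv \xi_R^\dagger\bar\sigma^\mu\psi_R$ is not a contravariant four-vector since, under the transformation (2.18), $\bar{v}^0 \to \bar{v}^0 - \eta\bar{v}^1$ and $\bar{v}^1 \to -\eta\bar{v}^0 + \bar{v}^1$, i.e. they mix with $-\eta$ rather than $+\eta$. For the left-handed spinors the transformation matrix is $\exp\{-\eta\sigma^1/2\}$ instead of $\exp\{+\eta\sigma^1/2\}$, and the situation is reversed: $\xi_L^\dagger\bar\sigma^\mu\psi_L$ is a contravariant four-vector while $\xi_L^\dagger\sigma^\mu\psi_L$ is not. We can also verify directly the transformation properties under finite Lorentz transformations, using the identity
$$ e^{\eta\sigma^1} = \cosh\eta + \sigma^1\sinh\eta , \tag{12.6} $$
which can be proved performing the Taylor expansion of the exponential and using the fact that $(\sigma^1)^2 = 1$.
(2.4) $F^{\mu\nu} \to \Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma F^{\rho\sigma}$, where $\Lambda^\mu{}_\rho = \exp\{-(i/2)\omega_{\alpha\beta}(J^{\alpha\beta})^\mu{}_\rho\}$ and $(J^{\alpha\beta})^\mu{}_\rho$ is given by eq. (2.23), since the tensor representations are obtained iterating on each index the transformation matrix of the four-vector representation. Expanding to first order in $\omega_{\alpha\beta}$ and performing the contractions,
$$ \delta F^{\mu\nu} = \omega^\mu{}_\rho F^{\rho\nu} - \omega^\nu{}_\rho F^{\rho\mu} . \tag{12.7} $$
In terms of $\mathbf{E}$ and $\mathbf{B}$,
$$ \delta\mathbf{E} = -\boldsymbol{\eta}\times\mathbf{B} + \boldsymbol{\theta}\times\mathbf{E} , \qquad \delta\mathbf{B} = +\boldsymbol{\eta}\times\mathbf{E} + \boldsymbol{\theta}\times\mathbf{B} . \tag{12.8} $$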
(2.5) (i) Writing explicitly the six conditions $A^{\mu\nu} = (1/2)\epsilon^{\mu\nu\rho\sigma}A_{\rho\sigma}$ we find $A^{01} = A_{23}$, $A^{02} = -A_{13}$, $A^{03} = A_{12}$, $A^{12} = A_{03}$, $A^{13} = -A_{02}$ and $A^{23} = A_{01}$. With the Minkowski metric, the first condition $A^{01} = A_{23}$ becomes $A^{01} = A^{23}$ while the last condition $A^{23} = A_{01}$ becomes $A^{23} = -A^{01}$, and together they give $A^{01} = A^{23} = 0$. Similarly for the other conditions, so in the Minkowski case we are left with $A^{\mu\nu} = 0$.
(ii) If instead we raise the indices with $\delta^{\mu\nu}$ the conditions $A^{01} = A_{23}$ and $A^{23} = A_{01}$ are identical, so in total we have only three independent conditions $A^{01} = A^{23}$, $A^{02} = -A^{13}$ and $A^{03} = A^{12}$. Similarly an anti-self-dual tensor satisfies $A^{01} = -A^{23}$, $A^{02} = A^{13}$ and $A^{03} = -A^{12}$. For SO(4), $\epsilon^{\mu\nu\rho\sigma}$ is an invariant tensor (as for SO(3,1), it follows from the condition $\det\Lambda = 1$). Therefore, if the condition $A^{\mu\nu} = (1/2)\epsilon^{\mu\nu\rho\sigma}A_{\rho\sigma}$ holds in a frame, it holds in all transformed frames, so a self-dual tensor remains self-dual, and an anti-self-dual tensor remains anti-self-dual. This means that self-dual and anti-self-dual tensors are irreducible representations of SO(4), and that in Euclidean space a six-dimensional real antisymmetric tensor $A^{\mu\nu}$ decomposes into its self-dual and anti-self-dual parts.
(iii) With the Minkowski metric the conditions $A^{01} = iA_{23}$ and $A^{23} = iA_{01}$ become $A^{01} = iA^{23}$ and $A^{23} = -iA^{01}$ and therefore are identical, and similarly for the other conditions, so we are left with three independent conditions, $A^{01} = iA^{23}$, $A^{02} = -iA^{13}$, $A^{03} = iA^{12}$. The duality conditions are Lorentz-invariant, so self-dual and anti-self-dual tensors are irreducible representations of SO(3,1). However, the Minkowskian duality conditions make sense only if $A^{\mu\nu}$ is complex, so they can be used only to decompose a tensor $A^{\mu\nu}$ with six independent complex components into its self-dual and anti-self-dual parts, each with three complex components, i.e. each with six real degrees of freedom. Since under parity $\epsilon^{\mu\nu\rho\sigma}$ is a pseudotensor, a parity transformation exchanges the self-dual and anti-self-dual parts. Comparison with the classification of Lorentz representations in terms of the $(j_-, j_+)$ quantum numbers shows that they are the (0, 1) and (1, 0) representations. Observe that these representations have complex dimension three.
(iv) In terms of the electric and magnetic fields $E^i$, $B^i$ and of the variables $a^i_+ = (-1/2)(E^i + iB^i)$, $a^i_- = (-1/2)(E^i - iB^i)$ we can write $F^{\mu\nu} = F_+^{\mu\nu} + F_-^{\mu\nu}$ with
$$ F^{\mu\nu} = \begin{pmatrix} 0 & -E^1 & -E^2 & -E^3 \\ E^1 & 0 & -B^3 & B^2 \\ E^2 & B^3 & 0 & -B^1 \\ E^3 & -B^2 & B^1 & 0 \end{pmatrix} , \qquad F_\pm^{\mu\nu} = \begin{pmatrix} 0 & a^1_\pm & a^2_\pm & a^3_\pm \\ -a^1_\pm & 0 & \mp i a^3_\pm & \pm i a^2_\pm \\ -a^2_\pm & \pm i a^3_\pm & 0 & \mp i a^1_\pm \\ -a^3_\pm & \mp i a^2_\pm & \pm i a^1_\pm & 0 \end{pmatrix} . \tag{12.9} $$
The six independent real components of $F^{\mu\nu}$ have been written in terms of the three complex components $a^i_+$ of the self-dual tensor $F_+^{\mu\nu}$, and of their complex conjugates $a^i_-$, which are the components of the anti-self-dual tensor $F_-^{\mu\nu}$. This is not a decomposition into representations of smaller dimensions. We have just rewritten a six-dimensional real representation in terms of a three-dimensional complex representation. Under a general Lorentz transformation the three components of $E^i$ and the three components of $B^i$ mix among themselves, so a real antisymmetric tensor is an irreducible representation of real dimension six.
(2.6) (i) In the $(x, y)$ plane, $\mathbf{e}_1 = (1, 0) \to (\cos\theta, \sin\theta)$, $\mathbf{e}_2 = (0, 1) \to (-\sin\theta, \cos\theta)$, $\mathbf{e}_\pm \to e^{\mp i\theta}\mathbf{e}_\pm$, so from eq. (2.131) $\mathbf{e}_+$ has helicity $h = +1$ and $\mathbf{e}_-$ has $h = -1$. According to the discussion in Section 2.7, this means that electromagnetic waves are made of massless spin-1 particles, the photons. (ii) The transformation of the tensor $h_{ij}$ under rotations in the
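$(x, y)$ plane is $h_{ij} \to R_{ik}R_{jl}h_{kl}$, i.e. $h \to RhR^T$, with
$$ h = \begin{pmatrix} h_+ & h_\times \\ h_\times & -h_+ \end{pmatrix} , \qquad R = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} . \tag{12.10} $$
Performing the matrix multiplication we find
$$ h_\times \to h_\times\cos 2\theta + h_+\sin 2\theta , \qquad h_+ \to -h_\times\sin 2\theta + h_+\cos 2\theta \tag{12.11} $$
and therefore $(h_\times \pm ih_+) \to e^{\mp 2i\theta}(h_\times \pm ih_+)$ which, according to eq. (2.131), means that they have helicities $\pm 2$.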
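The transformation law of solution (2.6)(ii) can be checked numerically by carrying out the product $RhR^T$ explicitly. The sketch below uses only the standard library; the rotation angle and the two amplitudes are arbitrary illustrative values.

```python
import cmath
import math


def rotate_tensor(h_plus, h_cross, theta):
    """Return (h'_+, h'_x) for h' = R h R^T, with h and R as in eq. (12.10)."""
    c, s = math.cos(theta), math.sin(theta)
    R = [[c, -s], [s, c]]
    h = [[h_plus, h_cross], [h_cross, -h_plus]]
    # (R h)_{ij} = sum_k R_{ik} h_{kj}
    Rh = [[sum(R[i][k] * h[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    # (R h R^T)_{ij} = sum_k (R h)_{ik} R_{jk}
    hp = [[sum(Rh[i][k] * R[j][k] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return hp[0][0], hp[0][1]


theta = 0.3                  # illustrative rotation angle
h_plus, h_cross = 0.8, -0.5  # illustrative amplitudes

hp_plus, hp_cross = rotate_tensor(h_plus, h_cross, theta)

# Compare h'_x + i h'_+ with exp(-2 i theta) (h_x + i h_+)
lhs = complex(hp_cross, hp_plus)
rhs = cmath.exp(-2j * theta) * complex(h_cross, h_plus)
print(abs(lhs - rhs))
```

The phase $e^{-2i\theta}$, with twice the rotation angle, is what identifies the combination $h_\times + ih_+$ as a helicity $+2$ state.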
12.3
Chapter 3
(3.1) The dimensions are read from the kinetic terms. For a scalar, $(\partial_\mu\phi)^2$ must have dimensions (mass)$^4$ to compensate the factor $d^4x$. Since $\partial_\mu \sim$ mass, it follows that $\phi$ has dimensions of mass. Similarly $A_\mu \sim$ (mass) and $\psi \sim$ (mass)$^{3/2}$. In $d$ space-time dimensions $\phi \sim A_\mu \sim$ (mass)$^{(d/2)-1}$ while $\psi \sim$ (mass)$^{(d-1)/2}$.
(3.2) Consider first $u_L$. Under a boost of rapidity $\eta$ along the $z$ axis we have (see eq. (2.59)) $u_L \to \exp\{-\eta\sigma^3/2\}u_L$. Use the identity
$$ \exp\left\{\boldsymbol{\eta}\cdot\frac{\boldsymbol{\sigma}}{2}\right\} = \cosh\frac{\eta}{2} + \hat{\boldsymbol{\eta}}\cdot\boldsymbol{\sigma}\,\sinh\frac{\eta}{2} . \tag{12.12} $$
Inverting $\tanh\eta = v$ we get $e^{2\eta} = (1+v)/(1-v)$. From this verify that $\cosh(\eta/2) = \left(\frac{E+m}{2m}\right)^{1/2}$ and therefore $\sinh(\eta/2) = \left(\frac{E-m}{2m}\right)^{1/2}$. Pay attention to the fact that, in order to transform a particle at rest into a particle moving with velocity $+v$, we must perform a boost with velocity $-v$. Then verify that in the boosted frame
$$ u_L = \frac{1}{\sqrt{2}}\left[\sqrt{E+m} - \sigma^3\sqrt{E-m}\right]\xi . \tag{12.13} $$
Finally verify that, for a particle moving along the $z$ axis,
$$ \sqrt{E\pm m} = \frac{1}{\sqrt{2}}\sqrt{E+p^3} \pm \frac{1}{\sqrt{2}}\sqrt{E-p^3} , \tag{12.14} $$
and therefore eq. (3.103) is recovered. For $u_R$, under a boost $u_R \to \exp\{+\eta\sigma^3/2\}u_R$ and therefore the result is recovered with the replacement $p^3 \to -p^3$.
(3.3) (i) $d_\phi$ must be equal to 1, i.e. to the mass dimension of $\phi$. Then $\partial\phi/\partial x^\mu \to \partial\phi'/\partial x'^\mu = e^{-2\alpha}\,\partial\phi/\partial x^\mu$ and $(\partial\phi)^2$ cancels the factor $e^{4\alpha}$ coming from $d^4x$. The current is
$$ j_D^\mu = (\phi + x^\nu\partial_\nu\phi)\,\partial^\mu\phi - \frac{1}{2}x^\mu\,\partial_\nu\phi\,\partial^\nu\phi . \tag{12.15} $$
(ii) $\phi^2 \to e^{-2\alpha}\phi^2$ so $\int d^4x\,\phi^2$ is not invariant, while $\int d^4x\,\phi^4$ is invariant. Dilatations are a classical symmetry when there is no intrinsic mass-scale, so they are broken by a mass term but not by a term $\lambda\phi^4$, since $\lambda$ is dimensionless.
(3.4) (i) $d_A = 1$, $d_\psi = 3/2$. (ii) From the Noether theorem (and eliminating terms that vanish upon use of the equations of motion)
$$ j_D^\mu = x^\nu\left(\frac{1}{4}\delta^\mu_\nu F^2 - F^{\mu\rho}\partial_\nu A_\rho\right) + x^\nu\,\bar\psi i\gamma^\mu\partial_\nu\psi + \frac{3}{2}\bar\psi i\gamma^\mu\psi - F^{\mu\rho}A_\rho . \tag{12.16} $$
After some algebra, this can be rewritten as
$$ j_D^\mu = \frac{3}{2}j^\mu + x^\nu T^\mu{}_\nu - \partial_\rho(F^{\mu\rho}x^\nu A_\nu) \tag{12.17} $$
where
$$ T^\mu{}_\nu = \frac{1}{4}\delta^\mu_\nu F^2 - F^{\mu\rho}F_{\nu\rho} + \bar\psi\gamma^\mu(i\partial_\nu - eA_\nu)\psi , \tag{12.18} $$
and $j^\mu = i\bar\psi\gamma^\mu\psi$ is the U(1) current. Since $j^\mu$ is conserved by itself, we can redefine the dilatation current subtracting it. Furthermore, the term $\partial_\rho(F^{\mu\rho}x^\nu A_\nu)$ does not contribute to the charge since its $\mu = 0$ component is a total spatial derivative, and also it is separately conserved, so we subtract it, too, from the definition of $j_D^\mu$. Then $j_D^\mu = x^\nu T^\mu{}_\nu$ and $\partial_\mu j_D^\mu = x^\nu\partial_\mu T^\mu{}_\nu + T^\mu{}_\mu$. The term $\partial_\mu T^\mu{}_\nu$ vanishes because the energy–momentum tensor is conserved, while, from the above equation, $T^\mu{}_\mu = \bar\psi\gamma^\mu(i\partial_\mu - eA_\mu)\psi = 0$ using the massless Dirac equation.
(iii) Upon use of the equations of motion of the massive theory, $j_D^\mu$ happens to have the same form as in the massless case. However, again using the equations of motion of the massive theory, now
$$ \partial_\mu j_D^\mu = T^\mu{}_\mu = m\bar\psi\psi . \tag{12.19} $$
The invariance under dilatations is broken if the trace of the energy–momentum tensor is nonvanishing.
(3.5) The two Lagrangians differ by a total derivative,
$$ \mathcal{L}' = \mathcal{L} - (i/2)\,\partial_\mu(\bar\psi\gamma^\mu\psi) . \tag{12.20} $$
With $\mathcal{L}$, we find $T^{\mu\nu} = i\bar\psi\gamma^\mu\partial^\nu\psi$. With $\mathcal{L}'$, we find $T'^{\mu\nu} = T^{\mu\nu} - (i/2)\partial^\nu j^\mu$ with $j^\mu = \bar\psi\gamma^\mu\psi$. The extra term $(-i/2)\partial^\nu j^\mu$ does not spoil $\partial_\mu T^{\mu\nu} = 0$ because $\partial_\mu j^\mu = 0$. The conserved charges $P^\nu$ differ by a term proportional to $\int d^3x\,\partial^\nu j^0$. However this is zero because, if $\nu$ is a spatial index, it is a spatial derivative and then the spatial integral vanishes, assuming as always a sufficiently fast decrease of the fields at infinity. If instead $\nu = 0$ we use $\partial_0 j^0 = -\partial_i j^i$, so we get again a spatial divergence. Therefore the four-momentum computed with $T^{\mu\nu}$ and with $T'^{\mu\nu}$ is the same.
(3.6) We denote $(t, \mathbf{x})$ by $x$. Then the five-dimensional field is $\phi(x, y)$. We impose the boundary condition $\phi(x, \pm R/2) = 0$, corresponding to the fact that the field vanishes at the boundary of space-time. The mode expansion compatible with these boundary conditions is
$$ \phi(x, y) = \sum_{n=1}^{\infty}\phi_n(x)\,{\rm cs}\!\left(\frac{n\pi y}{R}\right) , \tag{12.21} $$
where ${\rm cs}(n\pi y/R)$ is $\cos(n\pi y/R)$ if $n$ is odd and $\sin(n\pi y/R)$ if $n$ is even. We therefore have an infinite set of four-dimensional fields $\phi_n(x)$. The fact that $\phi(x, y)$ satisfies $(\Box_5 + m^2)\phi = 0$ implies that the fields $\phi_n(x)$ satisfy
$$ \left[\Box + m^2 + \left(\frac{n\pi}{R}\right)^2\right]\phi_n(x) = 0 \tag{12.22} $$
and therefore each $\phi_n(x)$, with $n = 1, \ldots, \infty$, describes a four-dimensional particle with mass $m_n$ given by $m_n^2 = m^2 + (n\pi/R)^2$. This set of particles is called a Kaluza–Klein (KK) tower. In particular, if the five-dimensional mass $m = 0$, then $m_n = n\pi/R$ and the KK modes are equally spaced. Therefore the existence of an extra dimension of size $R$ should manifest itself with the presence of new particles at an energy scale $O(\pi/R)$. Since no such particle is observed up to present accelerator energies $E$ of the order of a few hundred GeV, we conclude from this that $R < \pi/(500\ {\rm GeV}) \sim 10^{-16}$ cm. There is however a subtle way out of this limit. It is in principle possible (and indeed it is suggested by some theoretical considerations based on string theory) that the extra dimensions are not accessible to particles with the usual weak, electromagnetic or strong interactions, and that only gravity can propagate in the extra dimensions. In this case we can have a large $R$. The resulting KK modes would be light, but they would not be observed at accelerators because they interact too weakly. A limit on $R$ would come from modifications of Newton's law of gravitation. Newton's law is well verified experimentally only down to the millimeter scale (below it is difficult to measure the gravitational force between two objects, because it is overwhelmed by the van der Waals forces). Therefore, the bound on extra dimensions in which only gravity can propagate is of order $R < 1$ mm (see N. Arkani-Hamed, S. Dimopoulos and G. R. Dvali, Phys. Rev. D59 (1999) 086004).
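The Kaluza–Klein spectrum and the accelerator bound of solution (3.6) can be illustrated with a short calculation. This is a sketch only: the conversion constant $\hbar c \simeq 1.97327\times 10^{-14}$ GeV cm is an assumed standard value, and the function name `kk_mass` is introduced here for illustration.

```python
import math

HBAR_C_GEV_CM = 1.97327e-14   # GeV cm (assumed standard value)


def kk_mass(n, radius, m5=0.0):
    """Mass of the n-th KK mode, m_n = sqrt(m^2 + (n pi / R)^2).

    radius is R in natural units (GeV^-1); m5 is the five-dimensional mass in GeV.
    """
    return math.sqrt(m5**2 + (n * math.pi / radius)**2)


E_MAX = 500.0                        # GeV, energy scale used in the solution

# Largest R for which the first KK mode is still as heavy as E_MAX (for m = 0)
R_bound = math.pi / E_MAX            # GeV^-1
R_bound_cm = R_bound * HBAR_C_GEV_CM # converted to cm

# First three modes for m = 0 and R at the bound: equally spaced
masses = [kk_mass(n, R_bound) for n in (1, 2, 3)]

print(f"R < {R_bound_cm:.1e} cm, first modes (GeV): {masses}")
```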
12.4
Chapter 4
(4.1) (i) The exchange of coordinates gives a factor $(-1)^L$, while the relative intrinsic parity of a fermion and an antifermion is $-1$, so in total we have $(-1)^{L+1}$. (ii) Consider $e^\pm$ as two charge states of the same particle, exchanged by $C$. Because of Fermi–Dirac statistics, the exchange of two identical fermions gives a minus sign. On the other hand, this exchange is performed applying the charge conjugation operator (which gives a factor $C$), exchanging the coordinates (which gives $(-1)^L$) and exchanging the spin. The spin exchange gives $(-1)^{S+1}$, i.e. the singlet state $S = 0$ has an antisymmetric spin wave function, while $S = 1$ has a symmetric spin wave function. Therefore $C(-1)^L(-1)^{S+1} = -1$, and it follows that $C = (-1)^{L+S}$. (iii) The ground state of parapositronium has
$L = 0$, $S = 0$ and therefore $C = +1$. Since the photon has $C = -1$, and QED conserves $C$, it can only decay into an even number of photons.
(4.2) Perform a boost along the $z$ axis. Since the transverse components $\mathbf{p}_\perp$ of the momentum are not affected, $\delta^{(2)}(\mathbf{p}_\perp - \mathbf{k}_\perp)$ is invariant and we must consider only $E_{\mathbf p}\,\delta(p_z - k_z)$. Use the form of the Lorentz transformation of $E_{\mathbf p}$, $p_z$ together with the property of the Dirac delta $\delta(f(x)) = \delta(x - x_0)/|f'(x_0)|$ (valid when $x_0$ is the only solution of $f(x) = 0$).
(4.3) Use the fact that $\Psi$ and $\Psi^*$ anticommute at equal time, and the fact that the transpose of $\gamma^\mu$ can be written as $(\gamma^\mu)^T = \gamma^0\gamma^\mu\gamma^0$, as one verifies from the explicit expression of the $\gamma$ matrices.
(4.4) (i) The mass term breaks gauge invariance. The Euler–Lagrange equation is $\partial_\mu F^{\mu\nu} + m^2A^\nu = 0$. Acting with $\partial_\nu$, using $\partial_\nu\partial_\mu F^{\mu\nu} = 0$ and $m \neq 0$, gives $\partial_\nu A^\nu = 0$. Using this condition, $\partial_\mu F^{\mu\nu} = \partial_\mu\partial^\mu A^\nu - \partial_\mu\partial^\nu A^\mu$ reduces to $\Box A^\nu$ and therefore $\partial_\mu F^{\mu\nu} + m^2A^\nu = 0$ becomes $(\Box + m^2)A^\mu = 0$. (ii) The expansion of $A_\mu$ in plane waves is as in eq. (4.104). However now the condition $(\Box + m^2)A^\mu = 0$ imposes $p^2 = m^2$, while $\partial_\nu A^\nu = 0$ gives $\epsilon_\mu p^\mu = 0$. Therefore there are three independent solutions for the polarization vectors $\epsilon_\mu$. Since all our equations are explicitly Lorentz covariant, we can study the particle content of the theory in the frame that we prefer and, since $m \neq 0$, we can choose the rest frame of the particle. In this frame $p = (m, 0, 0, 0)$ and the three independent orthogonal polarization vectors are $\epsilon_1 = (0, 1, 0, 0)$, $\epsilon_2 = (0, 0, 1, 0)$ and $\epsilon_3 = (0, 0, 0, 1)$; they describe the three spin degrees of freedom of a massive vector field.
(4.5) (i) Acting on a generic multiparticle state $|\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle$ we have
$$ (2E_{\mathbf p})^{1/2}\,e^{-\beta H}a^\dagger_{\mathbf p}|\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle = e^{-\beta H}|\mathbf{p}, \mathbf{p}_1, \ldots, \mathbf{p}_n\rangle = \exp\{-\beta(E_{\mathbf p} + E_{\mathbf{p}_1} + \ldots + E_{\mathbf{p}_n})\}\,|\mathbf{p}, \mathbf{p}_1, \ldots, \mathbf{p}_n\rangle . \tag{12.23} $$
On the other hand,
$$ (2E_{\mathbf p})^{1/2}\,a^\dagger_{\mathbf p}e^{-\beta(H+E_{\mathbf p})}|\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle = (2E_{\mathbf p})^{1/2}\,a^\dagger_{\mathbf p}e^{-\beta(E_{\mathbf{p}_1} + \ldots + E_{\mathbf{p}_n} + E_{\mathbf p})}|\mathbf{p}_1, \ldots, \mathbf{p}_n\rangle = \exp\{-\beta(E_{\mathbf p} + E_{\mathbf{p}_1} + \ldots + E_{\mathbf{p}_n})\}\,|\mathbf{p}, \mathbf{p}_1, \ldots, \mathbf{p}_n\rangle , \tag{12.24} $$
so the two expressions coincide on the most general state of the Fock space. An alternative derivation is obtained defining
$$ f(\beta) = e^{-\beta H}a^\dagger_{\mathbf p} - a^\dagger_{\mathbf p}e^{-\beta(H+E_{\mathbf p})} . \tag{12.25} $$
Clearly, $f(0) = 0$. Show that $[H, a^\dagger_{\mathbf p}] = E_{\mathbf p}a^\dagger_{\mathbf p}$, and using this check that $f'(\beta) = -Hf(\beta)$. The solution of this equation, with the boundary condition $f(0) = 0$, is $f(\beta) = 0$.
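(ii) Using the above result and the cyclic property of the trace,
$$ {\rm Tr}\left[e^{-\beta H}a^\dagger_{\mathbf p}a_{\mathbf q}\right] = {\rm Tr}\left[a^\dagger_{\mathbf p}e^{-\beta(H+E_{\mathbf p})}a_{\mathbf q}\right] = {\rm Tr}\left[e^{-\beta(H+E_{\mathbf p})}a_{\mathbf q}a^\dagger_{\mathbf p}\right] = {\rm Tr}\left[e^{-\beta(H+E_{\mathbf p})}\left(a^\dagger_{\mathbf p}a_{\mathbf q} + [a_{\mathbf q}, a^\dagger_{\mathbf p}]\right)\right] . \tag{12.26} $$
Dividing by ${\rm Tr}\,e^{-\beta H}$,
$$ \langle a^\dagger_{\mathbf p}a_{\mathbf q}\rangle_\beta = e^{-\beta E_{\mathbf p}}\langle a^\dagger_{\mathbf p}a_{\mathbf q}\rangle_\beta + e^{-\beta E_{\mathbf p}}(2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{q}) . \tag{12.27} $$
Solving for $\langle a^\dagger_{\mathbf p}a_{\mathbf q}\rangle_\beta$ we get the desired result. When $\mathbf{p} = \mathbf{q}$, in a finite volume, use eq. (4.7).
(iii) If $a^\dagger_{\mathbf p}$ and $a_{\mathbf q}$ obey anticommutation relations, in the last passage in eq. (12.26) $a_{\mathbf q}a^\dagger_{\mathbf p}$ is replaced by $-a^\dagger_{\mathbf p}a_{\mathbf q} + \{a_{\mathbf q}, a^\dagger_{\mathbf p}\} = -a^\dagger_{\mathbf p}a_{\mathbf q} + (2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{q})$ and therefore
$$ \langle a^\dagger_{\mathbf p}a_{\mathbf q}\rangle_\beta = -e^{-\beta E_{\mathbf p}}\langle a^\dagger_{\mathbf p}a_{\mathbf q}\rangle_\beta + e^{-\beta E_{\mathbf p}}(2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{q}) , \tag{12.28} $$
so, setting $\mathbf{q} = \mathbf{p}$ in a finite volume $V$,
$$ \langle a^\dagger_{\mathbf p}a_{\mathbf p}\rangle_\beta = \frac{V}{e^{\beta E_{\mathbf p}} + 1} . \tag{12.29} $$
(4.6) (i) The volume of the phase space is $V(4/3)\pi p_F^3$. Each cell has a volume $h^3 = (2\pi)^3$ (in our units $\hbar = 1$) and in each cell, by the exclusion principle, we can accommodate two electrons, with spin up and spin down.
(ii) When $|\mathbf{p}| < p_F$, $a_{\mathbf{p},s}$ destroys a particle which is present in $|0\rangle_F$, so in this case $a_{\mathbf{p},s}|0\rangle_F \neq 0$. The fact that $A_{\mathbf{p},s}$ and $A^\dagger_{\mathbf{p},s}$ satisfy the canonical anticommutation relations follows easily from the identities $\theta(x)\theta(x) = \theta(x)$, $\theta(x)\theta(-x) = 0$ and $\theta(x) + \theta(-x) = 1$ satisfied by the step function. The operator $A^\dagger_{\mathbf{p},s}$, acting on $|0\rangle_F$, creates an electron above the Fermi surface or destroys an electron in the "filled Fermi sea". The latter process can be described as the creation of a "hole" in the Fermi sea, and the excitation of an electron from a level below $p_F$ to a level above $p_F$ can be described as the creation of an electron–hole pair.
(iii) For instance,
$$ \{A_{\mathbf{p},s}, A^\dagger_{\mathbf{q},r}\} = \alpha_{\mathbf p}\alpha^*_{\mathbf q}\{a_{\mathbf{p},s}, a^\dagger_{\mathbf{q},r}\} + \beta_{\mathbf p}\beta^*_{\mathbf q}\{a^\dagger_{-\mathbf{p},-s}, a_{-\mathbf{q},-r}\} = \left(|\alpha_{\mathbf p}|^2 + |\beta_{\mathbf p}|^2\right)(2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta_{rs} . \tag{12.30} $$
All other relations are proved similarly.
(iv)
$$ A^\dagger_{\mathbf p}A_{\mathbf p} = |\alpha_{\mathbf p}|^2a^\dagger_{\mathbf p}a_{\mathbf p} + |\beta_{\mathbf p}|^2a_{\mathbf p}a^\dagger_{\mathbf p} - \alpha_{\mathbf p}\beta^*_{\mathbf p}a_{\mathbf p}a_{\mathbf p} - \alpha^*_{\mathbf p}\beta_{\mathbf p}a^\dagger_{\mathbf p}a^\dagger_{\mathbf p} . \tag{12.31} $$
The terms $a_{\mathbf p}a_{\mathbf p}$ and $a^\dagger_{\mathbf p}a^\dagger_{\mathbf p}$ have a vanishing diagonal matrix element. Use $|\alpha_{\mathbf p}|^2 = 1 + |\beta_{\mathbf p}|^2$ (since we are now considering bosons) and, in a unit volume, $a_{\mathbf p}a^\dagger_{\mathbf p} = a^\dagger_{\mathbf p}a_{\mathbf p} + 1$.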
12.5
Chapter 5
(5.1) Using $(\Box_x + m^2)\phi(x) = 0$ and the relations $\partial_t\theta(t) = \delta(t)$, $\partial_t\theta(-t) = -\delta(t)$ (and therefore $\partial_t^2\theta(t) = \delta'(t)$, $\partial_t^2\theta(-t) = -\delta'(t)$) we find
$$ (\Box_x + m^2)\left[\theta(t)\langle 0|\phi(x)\phi(0)|0\rangle + \theta(-t)\langle 0|\phi(0)\phi(x)|0\rangle\right] = \delta'(t)\langle 0|[\phi(x), \phi(0)]|0\rangle + 2\delta(t)\langle 0|[\partial_t\phi(x), \phi(0)]|0\rangle . \tag{12.32} $$
By definition of distributions, $\delta'(t)$ is defined integrating by parts, so $\delta'(t)\phi(x) = -\delta(t)\partial_t\phi(x)$. Since $\delta(t)$ has support only at $t = 0$, the commutator $[\partial_t\phi(x), \phi(0)]$ above must be computed at equal time, and then $[\partial_t\phi(x), \phi(0)] = -i\delta^{(3)}(\mathbf{x})$, so we get the desired result. The derivation with $\phi(y)$ (here we set $y = 0$) replaced by $\phi(y_1)\ldots\phi(y_n)$ is obtained similarly, writing explicitly all theta functions. In momentum space we find $(-p^2 + m^2)\tilde{D}(p) = -i$. Formally this gives $\tilde{D}(p) = i/(p^2 - m^2)$, and therefore
$$ D(x) = \int\frac{d^4p}{(2\pi)^4}\,\frac{i}{p^2 - m^2}\,e^{-ipx} . \tag{12.33} $$
However, the integrand has two poles at $p^0 = \pm\sqrt{\mathbf{p}^2 + m^2}$, and therefore we must also specify how to go around these poles in the complex $p^0$ plane. For each pole we can go above or below it. After specifying a prescription, we can then compute the integral over $p^0$,
$$ \int dp^0\,\frac{i}{(p^0)^2 - \mathbf{p}^2 - m^2}\,e^{-ip^0t} . \tag{12.34} $$
If $t > 0$ we can close the contour in the lower half plane since, when $p^0 = -iu$ with $u > 0$, then $e^{-ip^0t} = e^{-ut}$, which for $t > 0$ provides a convergence factor in the integral. Conversely, when $t < 0$ we can close the contour in the upper half plane. If we go around both poles from below, then when $t > 0$ (i.e. when we close the contour in the lower half plane) we encircle no pole (see Fig. 12.1), so the integral vanishes. Therefore, with this prescription, $D(t, \mathbf{x}) = 0$ for $t > 0$, and $D(x)$ is called an advanced Green's function. Conversely, if we go around both poles from above we find that $D(t, \mathbf{x}) = 0$ for $t < 0$, and we have a retarded Green's function. The Feynman propagator corresponds to a mixed case, see Fig. 5.1.
(5.2) $n(d-2) \leq 2d$.
Observe that in $d = 2$ the field $\phi$ is dimensionless and a term $\lambda\phi^n$ is renormalizable by power counting for every $n$, so we can take an arbitrary function $V(\phi)$ as the potential.
Fig. 12.1 The case when the poles in the complex $p^0$ plane are both encircled from below, corresponding to an advanced Green's function.
(5.3) The main point is to understand that, in the Wick theorem, we must omit the contractions between fields inside a normal ordered term. For instance, the $O(\lambda)$ contribution to the mass renormalization, in the theory with interaction $(\lambda/4!):\phi^4:$, is proportional to $\langle 0|T\{\phi(x_1)\phi(x_2):\phi^4(x):\}|0\rangle$. From the point of view of the combinatorics of the Wick theorem, $\varphi \equiv\, :\phi^4(x):$ can be treated just as a single field, so for instance
$$ T\{\phi_1\phi_2\varphi(x)\} = \,:\phi_1\phi_2\varphi(x): + D_{12}\,\varphi(x) + \langle 0|T\{\phi_1\varphi(x)\}|0\rangle\,\phi_2 + \langle 0|T\{\phi_2\varphi(x)\}|0\rangle\,\phi_1 . \tag{12.35} $$
Using $\langle 0|T\{\phi_i\varphi(x)\}|0\rangle = 0$ (since it is odd under $\phi \to -\phi$) one finds
$$ T\{\phi_1\phi_2:\phi^4:\} = \,:\phi_1\phi_2\phi^4: + D_{12}:\phi^4: , \tag{12.36} $$
and therefore $\langle 0|T\{\phi(x_1)\phi(x_2):\phi^4(x):\}|0\rangle = 0$, so there is no mass renormalization at $O(\lambda)$. Alternatively, one can write $:\phi^4(x): \,=\, :\phi(x_3)\phi(x_4)\phi(x_5)\phi(x_6):$ (letting $x_3 = x_4 = x_5 = x_6 \equiv x$ at the end of the calculation) and use eq. (5.85) to express $:\phi_3\phi_4\phi_5\phi_6:$ as $T\{\phi_3\phi_4\phi_5\phi_6\}$ minus the contraction terms, so in turn
$$ T\{\phi_1\phi_2:\phi_3\phi_4\phi_5\phi_6:\} = T\{\phi_1\phi_2\phi_3\phi_4\phi_5\phi_6\} - (\text{contractions}) . \tag{12.37} $$
One can now check explicitly that the "$-$(contractions)" term above cancels the terms in $T\{\phi_1\ldots\phi_6\}$ where we have contractions between $\phi_i\phi_j$ with $i, j = 3, 4, 5, 6$. In general, one can understand from this example that the introduction of the normal ordering in the interaction term eliminates the tadpole graphs.
(5.4) Setting $u = 1/\alpha_s$, eq. (5.194) becomes
$$ \frac{du}{d\log E} = b_0 + \frac{b_1}{u} . \tag{12.38} $$
(i) Neglecting the term $\sim b_1$, the solution is $u(E) = u(\mu) + b_0\log(E/\mu)$. Substituting $\mu = \Lambda_{\rm QCD}\exp\{1/[b_0\alpha(\mu)]\}$, we find $u = b_0\log(E/\Lambda_{\rm QCD})$. (ii) We can solve perturbatively inserting the lowest-order solution into the term $\sim b_1$. The equation then becomes
$$ \frac{du}{d\log E} = b_0 + \frac{b_1}{b_0\log(E/\Lambda_{\rm QCD})} . \tag{12.39} $$
The solution is $u(E) = b_0\log(E/\Lambda_{\rm QCD}) + (b_1/b_0)\log\log(E/\Lambda_{\rm QCD})$, where we have redefined $\Lambda_{\rm QCD}$ at two loops so that the integration constant vanishes.
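The perturbative solution of (5.4)(ii) can be tested with a finite-difference derivative. The sketch below checks that $u = b_0L + (b_1/b_0)\log L$, with $L = \log(E/\Lambda_{\rm QCD})$, solves the iterated equation (12.39), and shows how far it is from solving the full equation (12.38). The numerical values of $b_0$, $b_1$ and $L$ are purely illustrative assumptions, not values taken from the text.

```python
import math

B0, B1 = 0.61, 0.25   # illustrative coefficients (hypothetical values)


def u_two_loop(L):
    """Perturbative solution u(L) = b0 L + (b1/b0) log L, with L = log(E/Lambda)."""
    return B0 * L + (B1 / B0) * math.log(L)


def derivative(f, x, h=1e-5):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


L = 5.0                                   # illustrative value of log(E/Lambda)

lhs = derivative(u_two_loop, L)           # du / d log E
rhs_iterated = B0 + B1 / (B0 * L)         # right-hand side of eq. (12.39)
rhs_full = B0 + B1 / u_two_loop(L)        # right-hand side of eq. (12.38)

print(lhs - rhs_iterated, lhs - rhs_full)
```

The first difference vanishes up to discretization error, while the second is small but nonzero: it measures the higher-order terms dropped when the lowest-order solution is inserted into the $b_1$ term.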
12.6
Chapter 6
(6.1) Use $t = (p_1 - p_3)^2 = m_1^2 + m_3^2 - 2E_1E_3 + 2|\mathbf{p}_1||\mathbf{p}_3|\cos\theta$. Since $|\mathbf{p}_3|$ and $E_3$ are fixed by energy–momentum conservation, we have $dt = 2|\mathbf{p}_1||\mathbf{p}_3|\,d\cos\theta$. Inserting this into eq. (6.43) (with $|\mathbf{p}_1| \equiv |\mathbf{p}|$, $|\mathbf{p}_3| \equiv |\mathbf{p}'|$), integrating over $d\phi$ and using eq. (6.42) we get the desired result.
(6.2) Equation (6.132) is obtained performing a Lorentz boost with velocity $-v_2$. Since $E'$, $|\mathbf{p}'|$ are fixed by energy–momentum conservation, only $\theta$ is a variable and eq. (6.133) follows. Then $d\Omega = d\cos\theta\,d\phi = 2\pi\,dE_{\rm lab}/(\gamma_2v_2|\mathbf{p}'|)$. Inserting this into eq. (6.41) gives the result. The kinematical limits are obtained setting $\cos\theta = \pm 1$ in eq. (6.132).
(6.3) (i) In eq. (6.41) set $\sqrt{s} \simeq M_A$ (since $M_A \gg \omega$) and, for the photon, $|\mathbf{p}'| = \omega$. (ii) Denoting by $M_{fi}$ the matrix element with normalization for the atom equal to one particle in a volume $V = 1$, $\mathcal{M}_{fi} = \sqrt{2M_A}\sqrt{2M_{A^*}}\,M_{fi} \simeq 2M_AM_{fi}$ (since $M_{A^*} - M_A = \omega \ll M_A$). Using the phase space found above, eq. (6.20) gives
$$ d\Gamma = \frac{1}{2M_A}\,(2M_A)^2|M_{fi}|^2\,\frac{\omega}{16\pi^2M_A}\,d\Omega . \tag{12.40} $$
$M_A$ cancels and we get the desired result. (iii) Use eq. (6.43) with $s = M_A^2$ and $\mathcal{M}_{fi} = 2M_AM_{fi}$.
(6.4) Denoting by $k_1$, $k_2$, $p'$ the four-momenta in the CM of the photons and of the final atom, respectively, we have
$$ d\Phi^{(3)} = \frac{1}{2!}\,\frac{1}{(2\pi)^5}\,\frac{d^3k_1}{2\omega_1}\frac{d^3k_2}{2\omega_2}\frac{d^3p'}{2M_A}\,\delta^{(3)}(\mathbf{p}' + \mathbf{k}_1 + \mathbf{k}_2)\,\delta(\omega - \omega_1 - \omega_2) . \tag{12.41} $$
The factor $1/2!$ takes into account the fact that the two photons are identical particles. Integrating over $d^3p'$ and using the notation $E_{A^*} - E_A = \omega$,
$$ d\Phi^{(3)} = \frac{1}{(2\pi)^5}\,\frac{\omega_1d\omega_1d\Omega_1\,\omega_2d\omega_2d\Omega_2}{16M_A}\,\delta(\omega - \omega_1 - \omega_2) = \frac{1}{(2\pi)^5}\,\frac{\omega_1(\omega - \omega_1)\,d\omega_1d\Omega_1d\Omega_2}{16M_A} . \tag{12.42} $$
Finally, to compute $d\Gamma$ use $\mathcal{M}_{fi} = 2M_AM_{fi}$, as in the previous exercise.
(6.5) Writing explicitly $d\Phi^{(j)}$ and $d\Phi^{(n-j+1)}$, the right-hand side of eq. (6.140) becomes
$$ \int_0^\infty\frac{d\mu^2}{2\pi}\left(\prod_{i=1}^{j}\frac{d^3p_i}{(2\pi)^32E_i}\right)(2\pi)^4\delta^{(4)}(p_1 + \ldots + p_j - q)\times\left(\prod_{i=j+1}^{n}\frac{d^3p_i}{(2\pi)^32E_i}\right)\frac{d^3q}{(2\pi)^32q^0}\,(2\pi)^4\delta^{(4)}(p_{j+1} + \ldots + p_n + q - P) . \tag{12.43} $$
The first Dirac delta forces $q = p_1 + \ldots + p_j$. Inserting this into the second Dirac delta, we can rewrite the above expression as
$$ \int_0^\infty\frac{d\mu^2}{2\pi}\left[\prod_{i=1}^{n}\frac{d^3p_i}{(2\pi)^32E_i}\right](2\pi)^4\delta^{(4)}(p_1 + \ldots + p_n - P)\times\frac{d^3q}{(2\pi)^32q^0}\,(2\pi)^4\delta^{(4)}(p_1 + \ldots + p_j - q) . \tag{12.44} $$
Now use the identity
$$ \frac{d^3q}{2q^0} = d^4q\,\delta(q^2 - \mu^2)\,\theta(q^0) , \tag{12.45} $$
which follows from the fact that, by definition, $\mu^2 = q_0^2 - \mathbf{q}^2$, and the $\theta$ function selects $q^0 = +\sqrt{\mathbf{q}^2 + \mu^2}$ as the solution of $q^2 - \mu^2 = 0$. Then the above expression becomes
$$ \left[\prod_{i=1}^{n}\frac{d^3p_i}{(2\pi)^32E_i}\right](2\pi)^4\delta^{(4)}(p_1 + \ldots + p_n - P)\times\int_0^\infty d\mu^2\int d^4q\,\delta(q^2 - \mu^2)\,\theta(q^0)\,\delta^{(4)}(p_1 + \ldots + p_j - q) . \tag{12.46} $$
The last two integrals give
$$ \int_0^\infty d\mu^2\,\delta(q^2 - \mu^2)\int d^4q\,\theta(q^0)\,\delta^{(4)}(p_1 + \ldots + p_j - q) = \int d^4q\,\theta(q^0)\,\delta^{(4)}(p_1 + \ldots + p_j - q) = 1 , \tag{12.47} $$
and the desired result follows. Diagrammatically, we can represent eq. (6.140) as in Fig. 12.2, so this representation of the phase space is useful to discuss a process in which the $n$-body decay of the initial particle goes through a resonance of mass $\mu$ which later decays into $j$ particles.
Fig. 12.2 A graphical representation of the decomposition of the phase space given in eq. (6.140).
¹ Observe that, if instead of an interaction term $g\phi_1\phi_2\Phi$ with two different fields $\phi_1$, $\phi_2$, we were to take a single $\phi$ field, with interaction Lagrangian $g\phi^2\Phi$, there would be an additional factor of two in the amplitude, because when we compute $\langle 0|T\{\phi^2(x)\Phi(x)\phi^2(y)\Phi(y)\}|0\rangle$ there are two possible contractions: we can contract the first $\phi(x)$ with the first $\phi(y)$ (and therefore the second $\phi(x)$ with the second $\phi(y)$) or the first $\phi(x)$ with the second $\phi(y)$. If instead we have $\langle 0|T\{\phi_1(x)\phi_2(x)\Phi(x)\phi_1(y)\phi_2(y)\Phi(y)\}|0\rangle$ and a Lagrangian whose kinetic term does not mix $\phi_1$ and $\phi_2$, we can only contract $\phi_1(x)$ with $\phi_1(y)$ and $\phi_2(x)$ with $\phi_2(y)$.
(6.6) (i) Denoting by $p$ the external momentum and by $q$ and $p - q$ the momenta in the loop, the graph gives¹
$$ i\mathcal{M} = (-ig)^2\int\frac{d^4q}{(2\pi)^4}\,\frac{i}{q^2 - m^2 + i\epsilon}\,\frac{i}{(q - p)^2 - m^2 + i\epsilon} . \tag{12.48} $$
In the rest frame of the initial particle, $p = (M_R, \mathbf{0})$, where $M_R$ is the (renormalized) mass of $\Phi$. Then the poles in the integrand are at $q^0 = E_q - i\epsilon$, $q^0 = -E_q + i\epsilon$, $q^0 = M_R + E_q - i\epsilon$ and $q^0 = M_R - E_q + i\epsilon$, where $E_q = +\sqrt{\mathbf{q}^2 + m^2}$. In the complex $q^0$ plane we can close the integration contour either in the lower or in the upper half-plane. Choosing for instance the lower half-plane, we pick the residues of the poles at $q^0 = E_q - i\epsilon$ and at $q^0 = M_R + E_q - i\epsilon$, and we get
$$ \mathcal{M} = -g^2\int\frac{d^3q}{(2\pi)^3}\,\frac{1}{2M_RE_q}\left[\frac{1}{M_R - 2E_q + i\epsilon} + \frac{1}{M_R + 2E_q - i\epsilon}\right] . \tag{12.49} $$
In the second fraction we can set $\epsilon = 0$ since the denominator never vanishes. In the first we use the identity
$$ \frac{1}{x \pm i\epsilon} = P\,\frac{1}{x} \mp i\pi\delta(x) , \tag{12.50} $$
where $P$ denotes the principal part. Then we get an imaginary contribution to $\mathcal{M}$,
$$ {\rm Im}\,\mathcal{M} = \pi g^2\int\frac{d^3q}{(2\pi)^3}\,\frac{1}{2M_RE_q}\,\delta(M_R - 2E_q) . \tag{12.51} $$
Since $E_q = \sqrt{\mathbf{q}^2 + m^2} \geq m$, when $M_R < 2m$ the Dirac delta is never satisfied, and the imaginary part vanishes. Instead, when $M_R \geq 2m$, performing the integral with the help of the delta function gives
$$ {\rm Im}\,\mathcal{M} = \frac{g^2}{16\pi}\sqrt{1 - \frac{4m^2}{M_R^2}} . \tag{12.52} $$
(ii) In Section 5.5.2 we have seen that the one-loop correction to the propagator produces a shift of the mass squared (see eqs. (5.108) and (5.112), and observe that the loop correction to the propagator, denoted here by $i\mathcal{M}$, is the quantity denoted as $-iB$ in eq. (5.108), so $B = -\mathcal{M}$), so
$$ M^2 \to M^2 - \mathcal{M} = M^2 - {\rm Re}\,\mathcal{M} - i\,{\rm Im}\,\mathcal{M} . \tag{12.53} $$
The renormalized mass is given by $M_R^2 = M^2 - {\rm Re}\,\mathcal{M}$, and the quantity that appears in the denominator of the propagator, after inclusion of loop corrections, is therefore $M_R^2 - i\,{\rm Im}\,\mathcal{M}$. On the other hand, from eq. (6.52), this quantity is $M_R^2 - iM_R\Gamma$. Therefore we expect that $M_R\Gamma = {\rm Im}\,\mathcal{M}$.
(iii) To verify this, we compute $\Gamma$ explicitly. The amplitude for the process $\Phi \to \phi_1\phi_2$ is $(-g)$ and therefore, using the phase space (6.35),
$$ \Gamma = \frac{1}{2M_R}\,(-g)^2\,\frac{1}{32\pi^2}\sqrt{1 - \frac{4m^2}{M_R^2}}\;4\pi = \frac{g^2}{16\pi M_R}\sqrt{1 - \frac{4m^2}{M_R^2}} . \tag{12.54} $$
Comparing with eq. (12.52) we see that indeed $M_R\Gamma = {\rm Im}\,\mathcal{M}$. Observe that in this theory $g$ has dimensions of mass, so $\Gamma$ has the correct dimensions.
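The relation $M_R\Gamma = {\rm Im}\,\mathcal{M}$ found in solution (6.6) can be checked by coding eqs. (12.52) and (12.54) separately, keeping the factors of (12.54) in the order in which they arise. The function names and the numerical values of $g$, $m$ and $M_R$ below are illustrative assumptions.

```python
import math


def im_amplitude(g, m, M_R):
    """Imaginary part of the one-loop amplitude, eq. (12.52); zero below threshold."""
    if M_R < 2.0 * m:
        return 0.0
    return g**2 / (16.0 * math.pi) * math.sqrt(1.0 - 4.0 * m**2 / M_R**2)


def decay_width(g, m, M_R):
    """Tree-level width of Phi -> phi1 phi2, assembled as in eq. (12.54)."""
    if M_R < 2.0 * m:
        return 0.0
    # Two-body phase space integrated over the full solid angle 4 pi
    phase_space = math.sqrt(1.0 - 4.0 * m**2 / M_R**2) / (32.0 * math.pi**2) * 4.0 * math.pi
    return (-g)**2 * phase_space / (2.0 * M_R)


g, m, M_R = 2.0, 1.0, 3.0   # illustrative values (g has dimensions of mass)

above = (M_R * decay_width(g, m, M_R), im_amplitude(g, m, M_R))
below = (decay_width(g, m, 1.5), im_amplitude(g, m, 1.5))  # M_R < 2m: decay forbidden

print(above, below)
```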
12.7
Chapter 7
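(7.1) (i) See Fig. 12.3. (ii) If $s_1$, $s_2$ are the initial spins and $\lambda_1$, $\lambda_2$ the final helicities,
$$ \frac{1}{4}\sum_{s_1,s_2,\lambda_1,\lambda_2}|\mathcal{M}_{e^+e^-\to 2\gamma}|^2 = \frac{e^4}{4}\,{\rm Tr}\left[(\not{p}_1 + m_e)\,L^{\mu\nu}\,(\not{p}_2 - m_e)\,L^\dagger_{\mu\nu}\right] \tag{12.55} $$
where $p_1$ is the momentum of the electron, $p_2$ of the positron, $k_1$, $k_2$ of the two photons, and
$$ L^{\mu\nu} = \gamma^\mu\,\frac{1}{\not{p}_1 - \not{k}_1 - m_e}\,\gamma^\nu + \gamma^\nu\,\frac{1}{\not{p}_1 - \not{k}_2 - m_e}\,\gamma^\mu . \tag{12.56} $$
In the limit $|\mathbf{p}| \to 0$, after long but straightforward algebra, the computation of the trace gives
$$ \frac{1}{4}\sum_{s_1,s_2,\lambda_1,\lambda_2}|\mathcal{M}_{e^+e^-\to 2\gamma}|^2 = 32\pi^2\alpha^2 . \tag{12.57} $$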
Fig. 12.3 The two Feynman diagrams for e+ e− → γγ to lowest order.
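To simplify the algebra, work directly in the CM, in the limit $|\mathbf{p}| \to 0$ (so that the photon energies are $\omega_1 = \omega_2 \simeq m_e$) and contract the $\gamma$ matrices with repeated indices. For instance, using $\gamma_\mu\gamma^\mu = (\gamma^0)^2 - \sum_i (\gamma^i)^2 = 4$, one finds
$$\gamma_\mu\gamma^\nu\gamma^\mu = \gamma_\mu\left(\{\gamma^\nu,\gamma^\mu\} - \gamma^\mu\gamma^\nu\right) = 2\gamma_\mu\eta^{\mu\nu} - 4\gamma^\nu = -2\gamma^\nu\,. \tag{12.58}$$
In this way one can prove the useful identities
$$\gamma^\mu\not{A}\gamma_\mu = -2\not{A}\,,\qquad \gamma^\mu\not{A}\not{B}\gamma_\mu = 4(AB)\,,\qquad \gamma^\mu\not{A}\not{B}\not{C}\gamma_\mu = -2\not{C}\not{B}\not{A}\,. \tag{12.59}$$
Use also the cyclic property of the trace to bring closer $\gamma$ matrices with repeated indices. (iii) In the CM (considering for generality two particles with different masses $m_1, m_2$), $p_1 = (E_1, \mathbf{p})$, $p_2 = (E_2, -\mathbf{p})$, and
$$I^2 = (p_1p_2)^2 - m_1^2m_2^2 = (E_1E_2 + \mathbf{p}^2)^2 - (E_1^2 - \mathbf{p}^2)(E_2^2 - \mathbf{p}^2) = \mathbf{p}^2(E_1 + E_2)^2\,. \tag{12.60}$$
Therefore
$$I = |\mathbf{p}|(E_1 + E_2) = E_1E_2|\mathbf{p}|\left(\frac{1}{E_1} + \frac{1}{E_2}\right) = E_1E_2\left(|\mathbf{v}_1| + |\mathbf{v}_2|\right), \tag{12.61}$$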
where $\mathbf{v}_1, \mathbf{v}_2$ are the respective velocities in the CM. The relative velocity has modulus $v = |\mathbf{v}_1| + |\mathbf{v}_2|$, so $I = E_1E_2v$. The result for $\sigma$ then follows from the general formula (6.29), using eq. (12.57) and two-body phase space (6.35) with $m = 0$.
(7.2) (i) From Exercise 4.1 we know that an $e^+e^-$ pair can annihilate into two photons only if it has $S = 0$; since we are considering a bound state with $L = 0$, then also $J = 0$. Alternatively, the result follows from the fact that a two-photon state cannot have $J = 1$, see Landau and Lifshitz, vol. IV (1982), Section 9 for the proof. Equation (7.70) then follows from
$$\bar\sigma = \frac{1}{4}\left[\sigma^{(J=0)} + \sum_{J_z=-1,0,1}\sigma^{(J=1)}\right], \tag{12.62}$$
and $\sigma^{(J=1)} = 0$. (ii) Equations (7.72) and (7.73) follow immediately from eq. (6.8), with $V = 1$. (iii):
$$\langle 2\gamma|\mathrm{Pos}\rangle = \int\frac{d^3p_1}{(2\pi)^3}\,\frac{d^3p_2}{(2\pi)^3}\,\langle 2\gamma|\mathbf{p}_1,\mathbf{p}_2\rangle\langle\mathbf{p}_1,\mathbf{p}_2|\mathrm{Pos}\rangle\,. \tag{12.63}$$
In the CM,
$$\langle\mathbf{p}_1,\mathbf{p}_2|\mathrm{Pos}\rangle = (2\pi)^3\delta^{(3)}(\mathbf{p}_1 + \mathbf{p}_2)\,\tilde\psi(\mathbf{p}_1) \tag{12.64}$$
where $\tilde\psi(\mathbf{p})$ is the wave function of positronium in momentum space, so
$$\langle 2\gamma|\mathrm{Pos}\rangle = \int\frac{d^3p}{(2\pi)^3}\,\langle 2\gamma|\mathbf{p},-\mathbf{p}\rangle\,\tilde\psi(\mathbf{p})\,. \tag{12.65}$$
(iv) From the order-of-magnitude estimates in the Introduction we know that in the hydrogen atom $v \sim \alpha$ and $|\mathbf{p}| \sim m_e\alpha$ (for positronium $m_e$ becomes the reduced mass $m_e/2$). Then, to lowest order in $\alpha$, in eq. (7.74) we can approximate $\langle 2\gamma|\mathbf{p},-\mathbf{p}\rangle$ with its value at $|\mathbf{p}| \ll m_e$ and extract it from the integral. The remaining integral is $\psi(\mathbf{x})$ at $\mathbf{x} = 0$. Equation (7.76) then follows from eqs. (7.72) and (7.73), recalling that only $J = 0$ contributes. (v) The agreement is at the level of 0.5%. Including the first radiative correction, the theoretical prediction turns out to be
$$\Gamma = \frac{1}{2}\,m_e\alpha^5\left[1 - \frac{\alpha}{\pi}\left(5 - \frac{\pi^2}{4}\right)\right] \tag{12.66}$$
and agrees with experiment, within the error (see D.W. Gidley et al., Phys. Rev. Lett. 49 (1982) 525).
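To get a feeling for the numbers in part (v) of Exercise 7.2, the sketch below evaluates the lowest-order width $m_e\alpha^5/2$ and the radiatively corrected expression (12.66), converted to a decay rate in inverse seconds. The numerical constants are assumed inputs taken from standard tables, and the function names are illustrative only.

```python
import math

# Assumed numerical inputs (standard table values, not given in the text).
ALPHA = 1 / 137.035999        # fine-structure constant
M_E_EV = 0.51099895e6         # electron mass in eV
HBAR_EV_S = 6.582119569e-16   # hbar in eV * s

def width_lowest_order():
    """Lowest-order width for the J = 0 state, Gamma = m_e alpha^5 / 2 (in eV)."""
    return 0.5 * M_E_EV * ALPHA**5

def width_with_correction():
    """Width including the first radiative correction, eq. (12.66) (in eV)."""
    correction = 1 - (ALPHA / math.pi) * (5 - math.pi**2 / 4)
    return width_lowest_order() * correction

# Convert widths (eV) to rates (1/s) by dividing by hbar.
rate_lo = width_lowest_order() / HBAR_EV_S
rate_nlo = width_with_correction() / HBAR_EV_S
relative_shift = 1 - rate_nlo / rate_lo
print(f"lowest order: {rate_lo:.4e} 1/s, corrected: {rate_nlo:.4e} 1/s")
print(f"relative size of the correction: {relative_shift:.4f}")
```

The correction lowers the rate by a fraction of a percent, the same order as the 0.5% mismatch quoted for the lowest-order result.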
12.8 Chapter 8
(8.1) (i) Compare with Solved Problem 7.2 on page 188. (ii) The maximum value of $q^2$ is $q^2_{\max} = (m_n - m_p)^2 \simeq (1.3\ \mathrm{MeV})^2$ (see e.g. Solved Problem 6.1). The typical scale of variation of the form factors is instead of the order of the QCD scale, so $q_{\rm typical} \sim$ a few hundred MeV. (iii):
$$\mathcal{M}_{fi} = -\frac{G_F\cos\theta_C}{\sqrt{2}}\;\bar u_e\gamma^\mu(1 - \gamma^5)u_{\bar\nu}\;\bar u_p\gamma_\mu(1 - g_A\gamma^5)u_n\,. \tag{12.67}$$
Averaging over the initial spin and summing over the final spins,
$$|\mathcal{M}_{fi}|^2 = \frac{G_F^2\cos^2\theta_C}{2}\,\mathrm{Tr}\left[(\not{p}_e + m_e)\gamma^\mu(1 - \gamma^5)\not{p}_{\bar\nu}(1 + \gamma^5)\gamma^\nu\right] \times \mathrm{Tr}\left[(\not{p}_p + m_p)\gamma_\mu(1 - g_A\gamma^5)\,\frac{\not{p}_n + m_n}{2}\,(1 + g_A\gamma^5)\gamma_\nu\right]. \tag{12.68}$$
Performing the traces,
$$|\mathcal{M}_{fi}|^2 = 16\,G_F^2\cos^2\theta_C\left[(1 + g_A)^2(p_ep_p)(p_{\bar\nu}p_n) + (1 - g_A)^2(p_ep_n)(p_{\bar\nu}p_p) - (1 - g_A^2)\,m_pm_n(p_ep_{\bar\nu})\right]. \tag{12.69}$$
We next compute the scalar products in the neutron rest frame, $p_n = (m_n, \mathbf{0})$. Observe that the maximum proton energy is
$$E_p^{\max} = \frac{m_n^2 + m_p^2 - m_e^2}{2m_n} = m_p + \frac{\Delta^2 - m_e^2}{2m_n}\,, \tag{12.70}$$
with $\Delta = m_n - m_p$. Since $(\Delta^2 - m_e^2)/(2m_n) \sim 10^{-4}$ MeV, we can neglect it with respect to $m_n \sim 10^3$ MeV and, in the scalar products, we can set the proton energy $E_p$ to the fixed value $E_p \simeq m_p$. For the same reason, we can write $(p_ep_p) = E_eE_p - \mathbf{p}_e\cdot\mathbf{p}_p \simeq E_em_p$, since $|\mathbf{p}_p| \ll m_p$. With this and similar approximations in the other scalar products,
$$|\mathcal{M}_{fi}|^2 = 16\,G_F^2\cos^2\theta_C\,m_pm_n\left[(1 + 3g_A^2)E_eE_{\bar\nu} + (1 - g_A^2)\sqrt{E_e^2 - m_e^2}\;E_{\bar\nu}\cos\theta\right], \tag{12.71}$$
where $\theta$ is the angle between the electron and the antineutrino. The width is given by
$$d\Gamma = \frac{1}{2m_n}\,|\mathcal{M}_{fi}|^2\,\frac{d^3p_p}{(2\pi)^3\,2m_p}\,\frac{d^3p_e}{(2\pi)^3\,2E_e}\,\frac{d^3p_{\bar\nu}}{(2\pi)^3\,2E_{\bar\nu}} \times (2\pi)^4\,\delta^{(3)}(\mathbf{p}_p + \mathbf{p}_e + \mathbf{p}_{\bar\nu})\,\delta(m_p + E_e + E_{\bar\nu} - m_n)\,. \tag{12.72}$$
Integrate first over $d^3p_p$ with the help of the $\delta^{(3)}$. Write $d^3p_e = 4\pi p_e^2\,dp_e$ and $d^3p_{\bar\nu} = 2\pi E_{\bar\nu}^2\,dE_{\bar\nu}\,d\cos\theta$. Integrate over $dE_{\bar\nu}$ with the help of the remaining Dirac delta and finally perform the integration over $d\cos\theta$, between $\cos\theta = \pm 1$. The term linear in $\cos\theta$ in $|\mathcal{M}_{fi}|^2$ integrates to zero and the constant part gives the desired result. (iv) The kinematical limits on $E_e$ are $E_e^{\min} = m_e$ and
$$E_e^{\max} = \frac{m_n^2 - m_p^2 + m_e^2}{2m_n} = \Delta - \frac{\Delta^2 - m_e^2}{2m_n} \simeq \Delta\,. \tag{12.73}$$
Fig. 12.4 The Fermi spectrum of $\beta$ decay ($d\Gamma/dE_e$ as a function of $E_e$, between $m_e$ and $\Delta$).
The spectrum is shown in Fig. 12.4. When $m_\nu = 0$, at $E_e \simeq \Delta$ we have $d\Gamma/dE_e \sim (\Delta - E_e)^2$ and therefore the slope of the spectrum, $d^2\Gamma/dE_e^2$, goes to zero as $E_e \to \Delta$. For a small nonzero value of $m_\nu$ we can use the same expression for the matrix element and take into account $m_\nu \neq 0$ just in the phase space. The result is that now the slope diverges ($d^2\Gamma/dE_e^2 \to -\infty$) at the endpoint of the spectrum. (v) Integrating over $E_e$,
$$\Gamma = \frac{G_F^2\Delta^5\cos^2\theta_C}{2\pi^3}\,(1 + 3g_A^2)\int_{m_e/\Delta}^{1}dx\;x(1 - x)^2\sqrt{x^2 - (m_e/\Delta)^2}\,. \tag{12.74}$$
If $m_e = 0$ the integral can be computed analytically, and is equal to 1/30. For the physical values of $m_e$, $\Delta$, numerical integration gives 0.472565/30, and
$$\Gamma = 0.472565\,\frac{G_F^2\Delta^5\cos^2\theta_C}{60\pi^3}\,(1 + 3g_A^2)\,. \tag{12.75}$$
Inserting the numerical values, we get a result for $\tau = 1/\Gamma$ larger by about 8% than the experimental value. A more accurate computation requires us to keep all six form factors. Furthermore, an important correction comes from the exchange of photons between the electron and the proton; these Coulomb corrections between final states are large when, as in the present case, the relative speed of the final charged particles is small.
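The numerical integration quoted in part (v) can be reproduced with a few lines of code. The following is a minimal sketch using Simpson's rule; the change of variable $u = \sqrt{x^2 - a^2}$ (with $a = m_e/\Delta$) is our own choice, made only to remove the square-root endpoint behaviour, and all numerical constants (masses, $G_F$, $\cos\theta_C$, $g_A$, $\hbar$) are assumed inputs rather than values given in the text.

```python
import math

# Assumed numerical inputs (standard table values, not given in the text).
M_E = 0.510999        # electron mass, MeV
DELTA = 1.293332      # neutron-proton mass difference, MeV
G_F = 1.16638e-11     # Fermi constant, MeV^-2
COS_THETA_C = 0.974   # cosine of the Cabibbo angle
G_A = 1.27            # axial coupling
HBAR = 6.582120e-22   # MeV * s

def phase_space_integral(a, n=2000):
    """Integral of x (1-x)^2 sqrt(x^2 - a^2) for x from a to 1 (Simpson's rule).

    With u = sqrt(x^2 - a^2), so that x dx = u du, the integrand becomes
    u^2 (1 - sqrt(u^2 + a^2))^2 on the interval [0, sqrt(1 - a^2)].
    """
    upper = math.sqrt(1 - a * a)
    h = upper / n  # n must be even for Simpson's rule

    def f(u):
        return u * u * (1 - math.sqrt(u * u + a * a)) ** 2

    total = f(0.0) + f(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

# Ratio of the physical integral to the massless value 1/30.
massless = 30 * phase_space_integral(0.0)
ratio = 30 * phase_space_integral(M_E / DELTA)

# Width from eq. (12.75) and the corresponding lifetime in seconds.
width = (ratio * G_F**2 * DELTA**5 * COS_THETA_C**2
         * (1 + 3 * G_A**2) / (60 * math.pi**3))
tau = HBAR / width
print(f"massless check: {massless:.6f}, physical ratio: {ratio:.4f}")
print(f"lifetime estimate: {tau:.0f} s")
```

The resulting lifetime depends on the assumed values of $g_A$ and $\cos\theta_C$, so it should be read as an estimate at the level of accuracy discussed in the text.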
(8.2) (i) Using the same considerations as in Solved Problem 7.2, the most general parametrization of a vector current is
$$\langle e^-|j^\mu(0)|\mu^-\rangle = f_1(q^2)\,\bar u_e\gamma^\mu u_\mu + f_2(q^2)\,\bar u_e\sigma^{\mu\nu}q_\nu u_\mu + f_3(q^2)\,q^\mu\,\bar u_eu_\mu\,. \tag{12.76}$$
Imposing current conservation gives
$$q_\mu\left(f_1(q^2)\,\bar u_e\gamma^\mu u_\mu + f_2(q^2)\,\bar u_e\sigma^{\mu\nu}q_\nu u_\mu + f_3(q^2)\,q^\mu\,\bar u_eu_\mu\right) = 0\,. \tag{12.77}$$
In the first term we use the equations of motion together with $q = k - p$, where $p$ and $k$ are the electron and muon four-momenta, respectively. This gives $\bar u_e q_\mu\gamma^\mu u_\mu = (m_\mu - m_e)\,\bar u_eu_\mu$. In the second term $\sigma^{\mu\nu}q_\nu q_\mu = 0$ by symmetry, so we get
$$f_1(q^2)(m_\mu - m_e) + q^2f_3(q^2) = 0\,. \tag{12.78}$$
Setting $q^2 = 0$ (which is the value of $q^2$ in which we are interested, since the photon is on-shell) we find that $f_1(0) = 0$. (ii) The amplitude is obtained multiplying by $\epsilon^*_\mu$. Since $\epsilon^*_\mu(q)q^\mu = 0$, only the term $\sim f_2(0)$ survives. (iii)
$$|\mathcal{M}_{fi}|^2 = \frac{e^2|F_2|^2}{4m_\mu^2}\,\epsilon^*_\mu\epsilon_\rho\,q_\nu q_\sigma\,(\bar u_e\sigma^{\mu\nu}u_\mu)(\bar u_\mu\sigma^{\rho\sigma}u_e)\,. \tag{12.79}$$
Perform the sum over the photon polarizations using $\epsilon^*_\mu\epsilon_\rho \to -\eta_{\mu\rho}$. To perform the sum over the spin of $e^-$ and the average over the spin of $\mu^-$ one could replace here $u_e\bar u_e \to \not{p} + m_e$ and $u_\mu\bar u_\mu \to (\not{k} + m_\mu)/2$. The resulting trace apparently has up to six $\gamma$ matrices but can be simplified using the $\gamma$ matrix identities given in eq. (12.59). However, the calculation is much simpler if instead we eliminate immediately $\sigma^{\mu\nu}$ from $|\mathcal{M}_{fi}|^2$ using the Gordon identity, eq. (7.51); then we get
$$|\mathcal{M}_{fi}|^2 = -\frac{e^2|F_2|^2}{8m_\mu^2}\,\mathrm{Tr}\left[(\not{p} + m_e)(Q^\mu - m\gamma^\mu)(\not{k} + m_\mu)(Q_\mu - m\gamma_\mu)\right], \tag{12.80}$$
with $Q = p + k$ and $m = m_e + m_\mu$. Computing the trace and the resulting scalar products (we need only $2(pk) = m_e^2 + m_\mu^2$),
$$|\mathcal{M}_{fi}|^2 = \frac{e^2|F_2|^2}{2m_\mu^2}\,(m_\mu^2 - m_e^2)^2\,, \tag{12.81}$$
and the result for $\Gamma$ follows immediately. The resulting bound on $|F_2|$ is $|F_2| < 1 \times 10^{-13}$. Supersymmetric extensions of the SM predict a nonzero decay rate for $\mu^- \to e^-\gamma$, at a level not far from the experimental bound. Actually, in this case the effective current $j_\mu$ that mediates the transition has a structure $V - A$ rather than a pure vector current as we have taken in this exercise. The modification to the calculation amounts simply to the insertion of a projector $(1 - \gamma^5)/2$ between $\bar u_e$ and $u_\mu$.
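The $\gamma$-matrix contraction identities of eq. (12.59), mentioned above as one route to simplifying the trace, are easy to check numerically. The sketch below does so with explicit $4\times 4$ matrices in the standard (Dirac) representation and metric $\eta = \mathrm{diag}(+,-,-,-)$; the helper functions and the test four-vectors are illustrative assumptions, and a numerical check of this kind is of course not a proof.

```python
# Numerical check of gamma-matrix contraction identities (pure Python).
# Dirac (standard) representation, metric eta = diag(+1, -1, -1, -1).
ETA = [1.0, -1.0, -1.0, -1.0]

def identity():
    return [[1 + 0j if i == j else 0j for j in range(4)] for i in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(4)] for i in range(4)]

def scale(c, a):
    return [[c * a[i][j] for j in range(4)] for i in range(4)]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[i] + tr[i] for i in range(2)] + [bl[i] + br[i] for i in range(2)]

def neg2(s):
    return [[-x for x in row] for row in s]

ONE2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# gamma^0 = diag(1, 1, -1, -1); gamma^i = [[0, sigma_i], [-sigma_i, 0]].
GAMMA_UP = [block(ONE2, ZERO2, ZERO2, neg2(ONE2))] + [
    block(ZERO2, s, neg2(s), ZERO2) for s in PAULI]
GAMMA_DOWN = [scale(ETA[mu], GAMMA_UP[mu]) for mu in range(4)]

def slash(v):
    """Return v-slash = v^mu gamma_mu for contravariant components v^mu."""
    out = scale(0, identity())
    for mu in range(4):
        out = add(out, scale(v[mu], GAMMA_DOWN[mu]))
    return out

def dot(a, b):
    """Minkowski scalar product (ab) = eta_{mu nu} a^mu b^nu."""
    return sum(ETA[mu] * a[mu] * b[mu] for mu in range(4))

def contract(x):
    """Return gamma^mu x gamma_mu, summed over mu."""
    out = scale(0, identity())
    for mu in range(4):
        out = add(out, matmul(matmul(GAMMA_UP[mu], x), GAMMA_DOWN[mu]))
    return out

# Arbitrary test four-vectors (illustrative values only).
A = [1.3, 0.2, -0.7, 0.5]
B = [0.4, -1.1, 0.9, 0.3]
C = [-0.6, 0.8, 0.1, 1.7]
sA, sB, sC = slash(A), slash(B), slash(C)

# Eq. (12.58): gamma_mu gamma^nu gamma^mu = -2 gamma^nu, for each nu.
err_single = max(max_diff(contract(g), scale(-2, g)) for g in GAMMA_UP)
# Eq. (12.59): the three contraction identities with slashed vectors.
err_one = max_diff(contract(sA), scale(-2, sA))
err_two = max_diff(contract(matmul(sA, sB)), scale(4 * dot(A, B), identity()))
err_three = max_diff(contract(matmul(matmul(sA, sB), sC)),
                     scale(-2, matmul(matmul(sC, sB), sA)))
print(err_single, err_one, err_two, err_three)
```

All four residuals should vanish up to floating-point rounding.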
(8.3) (i) From the form of the Lagrangian, we see that the amplitude is proportional to $G_F$ and therefore $\sigma \sim G_F^2$. Since $G_F$ is the inverse of a mass squared, and the only other energy scale is the CM energy $\sqrt{s}$, for dimensional reasons we must have $\sigma \sim G_F^2s$. (ii) The amplitude is $\mathcal{M} = \mathcal{M}_W + \mathcal{M}_Z$ with
$$\mathcal{M}_W = -\frac{G_F}{\sqrt{2}}\,[\bar\nu_e\gamma_\mu(1 - \gamma^5)e]\,[\bar e\gamma^\mu(1 - \gamma^5)\nu_e]\,,$$
$$\mathcal{M}_Z = -\frac{G_F}{\sqrt{2}}\,[\bar\nu_e\gamma_\mu(1 - \gamma^5)\nu_e]\,[a_2\,\bar e\gamma^\mu(1 - \gamma^5)e + a_3\,\bar e\gamma^\mu(1 + \gamma^5)e]\,, \tag{12.82}$$
with $a_2, a_3$ given in eq. (8.19); we already set $a_1 = 1/2$, and we took into account that the term $j^0_\mu j^{0,\mu}$ produces two equal contributions to the process, one in which the neutrino current is provided by the first $j^0_\mu$ (and therefore the electron current by the second) and one in which the neutrino current is provided by the second factor $j^{0,\mu}$. At a fundamental level, $\mathcal{M}_W$ and $\mathcal{M}_Z$ correspond to the graphs in Figs. 12.5 and 12.6.
Fig. 12.5 The $e\nu_e$ scattering amplitude mediated by the W boson.
Fig. 12.6 The $e\nu_e$ scattering amplitude mediated by the Z boson.
Performing the Fierz rearrangement in $\mathcal{M}_W$, we get
$$\mathcal{M} = -\frac{G_F}{\sqrt{2}}\,[\bar\nu_e\gamma_\mu(1 - \gamma^5)\nu_e] \times \left[\left(\frac{1}{2} + \sin^2\theta_W\right)\bar e\gamma^\mu(1 - \gamma^5)e + \sin^2\theta_W\,\bar e\gamma^\mu(1 + \gamma^5)e\right]. \tag{12.83}$$
The computation of $|\mathcal{M}|^2$, with the usual average and sum over spins, and the subsequent computation of the scalar product in the CM frame is now rather straightforward, and the result is
$$\sigma(\nu_ee \to \nu_ee) = \frac{G_F^2s}{\pi}\left[\left(\frac{1}{2} + \sin^2\theta_W\right)^2 + \frac{1}{3}\sin^4\theta_W\right] \simeq 0.176\,G_F^2s\,. \tag{12.84}$$
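The numerical coefficient in eq. (12.84) can be checked directly. This short sketch evaluates the bracket for an assumed value $\sin^2\theta_W \approx 0.2312$, which is not quoted in this solution; the function name is illustrative.

```python
import math

# Assumed input: sin^2(theta_W), not quoted in this solution.
SIN2_THETA_W = 0.2312

def sigma_coefficient(sin2_theta_w):
    """Coefficient of G_F^2 s in eq. (12.84)."""
    bracket = (0.5 + sin2_theta_w) ** 2 + sin2_theta_w**2 / 3
    return bracket / math.pi

coeff = sigma_coefficient(SIN2_THETA_W)
print(f"sigma / (G_F^2 s) = {coeff:.4f}")
```

The coefficient depends only mildly on the input; values of $\sin^2\theta_W$ within a percent of the one assumed here shift it only in the third decimal place.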
(8.4) (i) Use eq. (6.21) with $n_2^0\,dV = 1$ (since we are considering a single target particle) and $\Gamma = dN/dt$. (ii) $n \sim T^3$ follows from dimensional considerations if $m \ll T$, since then $T$ is the only mass scale and dimensionally $n = 1/\mathrm{volume} = (\mathrm{mass})^3$. Of course, it can also be obtained explicitly from the Boltzmann, Bose–Einstein or Fermi–Dirac distributions. From the previous exercise, $\sigma \sim G_F^2s$. At a temperature $T$ much larger than all the masses in question, $s \sim T^2$. Furthermore we have seen that $n \sim T^3$, while for relativistic particles $v = 1$, so $\Gamma = n\sigma v \sim G_F^2T^5$ and $\Gamma/H \sim (G_F^2T^5)/(T^2/M_{\rm Pl}) \sim (T/1\ \mathrm{MeV})^3$. Therefore for $T \gg$ MeV neutrino–electron scattering maintained the neutrinos in equilibrium, while when the temperature of the Universe dropped around O(1) MeV the neutrinos decoupled. Observe that when $T \sim$ MeV the electron mass is not negligible compared to $T$, but $T \sim m_e$ so we still have only one mass scale and the estimate $s \sim T^2$ is still correct.
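The order-of-magnitude estimate $\Gamma/H \sim G_F^2M_{\rm Pl}T^3$ can be turned into a number with a few lines of code. The sketch below drops all factors of order one, exactly as the dimensional argument does, and uses assumed values of $G_F$ and $M_{\rm Pl}$; it is meant only to show that the scale at which $\Gamma/H \sim 1$ is indeed of order 1 MeV.

```python
# Assumed numerical inputs (standard values, not given in the text).
G_F = 1.16638e-11   # Fermi constant, MeV^-2
M_PL = 1.2209e22    # Planck mass, MeV

def rate_over_hubble(temperature_mev):
    """Dimensional estimate Gamma/H ~ G_F^2 M_Pl T^3, O(1) factors dropped."""
    return G_F**2 * M_PL * temperature_mev**3

# Temperature at which the estimate gives Gamma/H = 1.
T_dec = (G_F**2 * M_PL) ** (-1.0 / 3.0)
print(f"decoupling temperature estimate: {T_dec:.2f} MeV")
print(f"Gamma/H at 10 MeV: {rate_over_hubble(10.0):.1e}")
print(f"Gamma/H at 0.1 MeV: {rate_over_hubble(0.1):.1e}")
```

Because of the steep $T^3$ dependence, the ratio changes by a factor of a thousand per decade in temperature, which is why the decoupling is fairly sharp.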
Bibliography
The literature on quantum field theory and related topics is vast. Here we collect only a few general references that we find especially useful. More references on specific topics are given in the Further Reading sections, at the end of most chapters.
Coleman, S. (1985). Aspects of Symmetry: Selected Erice Lectures. Cambridge University Press, Cambridge.
Di Giacomo, A., Paffuti, G. and Rossi, P. (1994). Selected Problems in Theoretical Physics (With Solutions). World Scientific, Singapore.
Georgi, H. (1984). Weak Interactions and Modern Particle Theory. Benjamin/Cummings, Menlo Park, CA.
Georgi, H. (1999). Lie Algebras in Particle Physics, 2nd edition. Perseus Books, Reading, MA.
Itzykson, C. and Zuber, J.-B. (1980). Quantum Field Theory. McGraw-Hill, Singapore.
Jackson, J. D. (1975). Classical Electrodynamics. Wiley, Chichester.
Kolb, E. W. and Turner, M. S. (1990). The Early Universe. Addison-Wesley, Reading, MA.
Landau, L. D. and Lifshitz, E. M. (1979). Course of Theoretical Physics, vol. II: The Classical Theory of Fields. Pergamon Press, Oxford.
Landau, L. D. and Lifshitz, E. M. (1977). Course of Theoretical Physics, vol. III: Quantum Mechanics: Non-Relativistic Theory. Pergamon Press, Oxford.
Landau, L. D. and Lifshitz, E. M. (1982). Course of Theoretical Physics, vol. IV (by Berestetskij, V. B., Lifshitz, E. M. and Pitaevskij, L. P.): Quantum Electrodynamics. Pergamon Press, Oxford.
Mandl, F. and Shaw, G. (1984). Quantum Field Theory. Wiley, Chichester.
Nakahara, M. (1990). Geometry, Topology and Physics. IOP Publishing, Bristol.
Okun, L. B. (1982). Leptons and Quarks. North-Holland, Amsterdam.
Parisi, G. (1988). Statistical Field Theory. Addison-Wesley, Redwood.
Perkins, D. H. (2000). Introduction to High Energy Physics, 4th edition. Cambridge University Press, Cambridge.
Peskin, M. E. and Schroeder, D. V. (1995). An Introduction to Quantum Field Theory. Perseus Books, Reading, MA.
Polchinski, J. (1998). String Theory, vol. I. Cambridge University Press, Cambridge.
Ramond, P. (1990). Field Theory: A Modern Primer, 2nd edition. Addison-Wesley, Redwood.
Weinberg, S. (1995). The Quantum Theory of Fields, vol. 1: Foundations. Cambridge University Press, Cambridge.
Weinberg, S. (1996). The Quantum Theory of Fields, vol. 2: Modern Applications. Cambridge University Press, Cambridge.
Zinn-Justin, J. (2002). Quantum Field Theory and Critical Phenomena. Oxford University Press, Oxford.
EBook Information

Series: Oxford Master Series in Statistical, Computational, and Theoretical Physics

Year: 2005

Pages: 308

Pages In File: 308

Language: English

Topic: 269

Identifier: 9783540719854,3540719857

Org File Size: 7,313,276

Extension: pdf