1. Instantons
1.1 Instantons as a classical solution
An instanton is pretty much exactly what you say: an (anti-)self-dual configuration of the curvature of a principal bundle. The curvature $F_A$ of a principal bundle with connection $A$ is the field strength tensor of the physical gauge theory, while the connection is called the gauge potential. In electromagnetism, for example, the non-zero components of $F$ are exactly the electric and magnetic fields, so it has direct physical relevance. In general, the action¹ of a Yang-Mills gauge theory is given by the integral
$$S_{YM}[A] := \int \mathrm{Tr}_\mathfrak{g}(F_A \wedge \ast F_A),$$
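whose Euler-Lagrange equations are the classical equations of motion, i.e. the classical solutions are stationary points of this functional. Now, we can decompose the field strength into a self-dual part $F^+$ and an anti-self-dual part $F^-$, with $\ast F^\pm = \pm F^\pm$, which are orthogonal to each other w.r.t. the inner product
$$(F, G) := \int \mathrm{Tr}_\mathfrak{g}(F \wedge \ast G).$$
Plugging this in gives
$$S_{YM}[A] = \int \mathrm{Tr}_\mathfrak{g}(F^+ \wedge \ast F^+) + \int \mathrm{Tr}_\mathfrak{g}(F^- \wedge \ast F^-),$$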
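and comparing this to the second Chern class $C_2(A) := \int \mathrm{Tr}_\mathfrak{g}(F \wedge F)$, which in the same decomposition is the difference of the two terms rather than their sum (since $F \wedge F = F \wedge \ast(\ast F)$ and $\ast F = F^+ - F^-$), we can see that $S_{YM}[A] \geq |C_2(A)|$. Since $C_2(A)$ does not change under continuous deformations of $A$, the action is minimized within a given topological sector when equality holds. But equality holds exactly when either $F^+ = 0$ or $F^- = 0$, i.e. when the full field strength $F$ is itself either self-dual or anti-self-dual (or vanishing). Thus, every configuration with $\ast F = \pm F$ solves the classical equations of motion, i.e. instantons are just particular classical solutions, namely the ones that minimize the action in their sector.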
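The bound above can be checked pointwise in a toy setting. The following is only a sketch under simplifying assumptions that are not part of the argument itself: flat Euclidean $\mathbb{R}^4$, an abelian field strength (so no trace), and the integrands at a single point instead of the integrals. The helper names (`hodge`, `inner`, `two_form`, `combine`) are made up for this illustration.

```python
def levi_civita(i, j, k, l):
    """Sign of the permutation (i, j, k, l) of (0, 1, 2, 3); 0 if an index repeats."""
    idx = (i, j, k, l)
    if len(set(idx)) < 4:
        return 0
    sign = 1
    for a in range(4):
        for b in range(a + 1, 4):
            if idx[a] > idx[b]:
                sign = -sign  # each inversion flips the sign
    return sign

def hodge(F):
    """Hodge dual on flat Euclidean R^4: (*F)_{ij} = 1/2 eps_{ijkl} F_{kl}."""
    return [[0.5 * sum(levi_civita(i, j, k, l) * F[k][l]
                       for k in range(4) for l in range(4))
             for j in range(4)] for i in range(4)]

def inner(F, G):
    """Pointwise inner product <F, G> = 1/2 F_{ij} G_{ij}."""
    return 0.5 * sum(F[i][j] * G[i][j] for i in range(4) for j in range(4))

def combine(F, G, a, b):
    """Linear combination a*F + b*G of two 2-forms."""
    return [[a * F[i][j] + b * G[i][j] for j in range(4)] for i in range(4)]

def two_form(e, b):
    """Antisymmetric 4x4 array from 'electric' (0i) and 'magnetic' (jk) components."""
    F = [[0.0] * 4 for _ in range(4)]
    entries = {(0, 1): e[0], (0, 2): e[1], (0, 3): e[2],
               (2, 3): b[0], (3, 1): b[1], (1, 2): b[2]}
    for (i, j), value in entries.items():
        F[i][j] = value
        F[j][i] = -value
    return F

# A generic field strength at one point (arbitrary illustrative numbers).
F = two_form((1.0, -2.0, 0.5), (3.0, 0.25, -1.5))
star_F = hodge(F)
F_plus = combine(F, star_F, 0.5, 0.5)    # self-dual part
F_minus = combine(F, star_F, 0.5, -0.5)  # anti-self-dual part

action_density = inner(F, F)       # integrand of the Yang-Mills action
chern_density = inner(F, star_F)   # integrand of F wedge F

# A self-dual example: 'electric' equals 'magnetic', so the bound is saturated.
F_sd = two_form((1.0, 2.0, 3.0), (1.0, 2.0, 3.0))

print(action_density, chern_density)
print(inner(F_sd, F_sd), inner(F_sd, hodge(F_sd)))
```

In this toy version the action density is the sum of the norms of the two parts and the Chern density is their difference, which is exactly why the inequality is saturated only when one part vanishes.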
Already classically, the size of the moduli space of instantons is somewhat interesting, since it shows whether or not there are possible different solutions to the classical equation of motion. However, instantons may be related by large gauge transformations which one may quotient out classically, so counting the instantons as such is not enough information to be interesting.
1.2 Instantons as "vacua" of the quantum theory
Standard perturbation theory in quantum field theory proceeds by expanding the exponential of the action functional around a classical solution. The existence of multiple classical solutions means we should expand around every single one of them and sum up the contributions with some weight $f$.² If we denote by $\mathcal{D}A_k$ a Feynman path integral measure (with gauge equivalence classes quotiented out) that is over perturbations of the instanton labeled by $k = \frac{1}{8\pi} C_2(A)$,³ we get for the Euclidean path integral
$$Z = \sum_k f(k) \int e^{-S_{YM}[A_k]}\, \mathcal{D}A_k.$$
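By a standard decomposition argument, namely that this integral should factor into local parts when we consider things like $A_k = A_{k_1} + A_{k_2}$ where $A_{k_1}, A_{k_2}$ live on different regions, one heuristically finds that $f(k_1 + k_2) = f(k_1) f(k_2)$ should hold, and since everything in a physicist's world is smooth except when it isn't, this means that $f$ is itself an exponential, $f(k) = e^{i\theta k}$ for some $\theta \in \mathbb{R}$. Remembering how $k$ was defined and absorbing this phase into the Euclidean weight $e^{-S}$, we get a modified action
$$S_\theta[A] := S_{YM}[A] - \frac{i\theta}{8\pi} \int \mathrm{Tr}_\mathfrak{g}(F \wedge F),$$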
with which we can drop the sum and write
$$Z = \int e^{-S_\theta[A]}\, \mathcal{D}A,$$
where $S_\theta$ is the modified action and $\mathcal{D}A$ now ranges over the whole space of connections.
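As a quick sanity check of the weight argument, here is a minimal sketch with made-up numbers. The values of `theta`, `k1`, `k2` and the "action" `S` are arbitrary and purely illustrative. It checks that $e^{i\theta k}$ is multiplicative in $k$, that shifting $\theta$ by $2\pi$ changes nothing for integer $k$, and that the weight can be absorbed into the exponent, assuming the usual Euclidean weight $e^{-S}$.

```python
import cmath
import math

def sector_weight(k, theta):
    """Weight f(k) = exp(i*theta*k) attached to the sector with instanton number k."""
    return cmath.exp(1j * theta * k)

theta = 0.7       # arbitrary illustrative value of the theta angle
k1, k2 = 2, -5    # arbitrary integer instanton numbers

# Multiplicativity: f(k1 + k2) = f(k1) * f(k2)
lhs = sector_weight(k1 + k2, theta)
rhs = sector_weight(k1, theta) * sector_weight(k2, theta)

# For integer k the weight only depends on theta modulo 2*pi
shifted = sector_weight(k1, theta + 2 * math.pi)

# Absorbing the weight into the exponent: f(k) e^{-S} = e^{-(S - i*theta*k)}
S = 3.2  # hypothetical value of the Euclidean action in sector k1
combined = sector_weight(k1, theta) * math.exp(-S)
modified = cmath.exp(-(S - 1j * theta * k1))

print(abs(lhs - rhs), abs(shifted - sector_weight(k1, theta)), abs(combined - modified))
```

The periodicity in $\theta$ relies on $k$ being an integer, which is why the normalization of $k$ matters for calling $\theta$ an angle.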
From here, further theories can be developed: e.g. one may try to promote $\theta$ itself to a dynamical field, which is then called the axion in Peccei-Quinn theory.
However, there is an issue when one tries to do this sum in theories coupled to fermions, which is the appearance of a quantum anomaly.
1.3 Instantons and quantum anomalies
We consider a theory of quantum fermion fields coupled to a classical gauge field background (i.e. an instanton)
where $D$ is the Dirac operator $D = D_\mu \gamma^\mu$ with $D_\mu$ the covariant derivative belonging to $A$ and $\gamma^\mu$ the usual gamma matrices acting on the spinors $\psi$. We would like to write
but there is an issue with the usual definition of $\mathcal{D}\psi\,\mathcal{D}\bar\psi$ in terms of a limit of integrations over the modes of the field. See this answer of mine for the definition and regularization of the measure and this answer of mine for how, in the end, the anomaly (i.e. the problem in defining the measure in an invariant way under large gauge transformations) is related to the index of $D$, which by the Atiyah-Singer index theorem is closely related to the instanton number $k$ as the Chern class.
¹ Actually, we consider the so-called Euclidean action, to which the actual action is related by Wick rotation.
² Perturbatively, the states belonging to such different sectors are unrelated, but with the theory of instantons, one can see that there are non-perturbative amplitudes between them; see this answer of mine.
³ There may be multiple instantons with the same number; the sum is then over each one of them (since in general it is not guaranteed that those are related by a gauge transformation).
2. Donaldson invariants
This section attempts to be a brief retelling of Witten's "Topological Quantum Field Theory"
2.1 Twisted N=2 supersymmetric Yang-Mills theory
The physical setting in which the Donaldson invariants will appear is a Yang-Mills theory living on the four-dimensional manifold M coupled to certain fields such that the total action enjoys a supersymmetry. The action of this theory looks admittedly horrible:
Here $D_\mu = \nabla_\mu + [A_\mu, \cdot\,]$ is the gauge covariant derivative (with $\nabla$ the ordinary Riemannian covariant derivative) and all fields are $\mathfrak{g}$-valued. There is a $\mathbb{Z}_2$-grading on the space of fields, where we call the different classes "bosonic" (or even) and "fermionic" (or odd). Besides the gauge field $A$, the bosonic fields are $\phi, \lambda$, the fermionic ones are $\eta, \psi, \chi$, and $\chi$ is additionally constrained to be self-dual. This action is invariant under the infinitesimal symmetry
with $\epsilon$ a fermionic infinitesimal parameter. As with all transformations, we think of this one as having a generator, its supercharge $Q$, which gives all transformations as $\delta\alpha = -i\epsilon Q(\alpha)$ where $\alpha$ is any field. In a Hamiltonian formulation, $Q(\alpha)$ would be the Poisson bracket $\{Q, \alpha\}$, but on a general manifold, we don't have that option. By explicit computation, one finds that
$$\delta_\epsilon \delta_\zeta X - \delta_\zeta \delta_\epsilon X = -2i\epsilon\zeta\,\phi$$
for every field $X$ except $A$, where it is $-D\phi$. This holds only on-shell for $\chi$, but off-shell for all others. Therefore, the commutator of two such transformations is a gauge transformation with parameter proportional to $\phi$, and hence has no physical impact. In the Hamiltonian formulation, we would write this as $\{Q, Q\} = 0$ (modulo gauge transformations), that is, $Q$ is akin to a BRST charge. The conserved current associated to this symmetry is
where conservation means that $\ast J$ is closed, so for any homology 3-cycle $\gamma$, the integral
$$Q = \int_\gamma \ast J$$
depends only on the homology class of $\gamma$. Furthermore, one may show that the energy-momentum tensor $T_{\mu\nu} = \frac{2}{\sqrt{g}} \frac{\delta S_{SYM}}{\delta g^{\mu\nu}}$ of this theory is an infinitesimal transform $T_{\mu\nu} = \{Q, \lambda_{\mu\nu}\}$ for another ugly expression $\lambda_{\mu\nu}$ (see Witten's eq. (2.34)).
2.2 Donaldson invariants as path integrals
In the following, the path integral measure $\mathcal{D}X$ includes all fields, and is also understood to have gauge equivalence classes quotiented out. The generic object we consider is the (unnormalized) expectation value $Z(\mathcal{O})$ of any observable $\mathcal{O}$, where $\mathcal{O}$ is any nice functional in the fields:
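If the supersymmetry transformation is non-anomalous, we have $Z(\{Q, \mathcal{O}\}) = 0$ for every observable. We now claim that $Z = Z(1)$ is a smooth invariant, and in particular will turn out to be a Donaldson invariant. For $Z$ to be a smooth invariant, it must be invariant under changes in the metric. The change of the action under a change of metric is by definition $\delta S = \frac{1}{2} \int_M \sqrt{g}\, T_{\mu\nu}\, \delta g^{\mu\nu}$, and since $T_{\mu\nu} = \{Q, \lambda_{\mu\nu}\}$ this leads to
$$\delta Z(1) \propto Z(\delta S) = Z\Big(\Big\{Q, \tfrac{1}{2} \int_M \sqrt{g}\, \lambda_{\mu\nu}\, \delta g^{\mu\nu}\Big\}\Big) = 0,$$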
so $Z(1)$ is invariant under changes of the metric. Similarly, it is invariant under changes of the gauge coupling constant $e$, as long as it stays non-zero. But in the limit of small coupling, the path integral is strongly dominated by the classical minima of the free theories, and the classical minima of the free gauge theory are the anti-self-dual instantons! The self-dual ones are not minima because we added the topological $F \wedge F$ to the Lagrangian. So we may evaluate $Z$ by looking at the instanton contributions. However, as in 1.3 above, the path integral measure is not invariant if the fermionic zero modes are mismatched. The equation for the $\psi$ zero modes turns out to be exactly the same as the one an infinitesimal perturbation $\delta A$ of an instanton configuration $A$ must satisfy for $A + \delta A$ to be an instanton again, namely the linearization of $F + \ast F = 0$, which with $d_A$ the covariant exterior derivative reads
$$d_A Y + \ast\, d_A Y = 0$$
for $Y$ either $\delta A$ or $\psi$. But the number of possible independent perturbations of an instanton that are again an instanton is exactly what one would call the dimension $\dim(\mathcal{M})$ of the moduli space $\mathcal{M}$ at the point $A$ if it were a proper smooth non-singular space, so the number of $\psi$ zero modes is the same as the dimension of the moduli space of instantons. In general, there seems to be an index theorem that says the total number of zero modes is equal to the formal dimension of the moduli space, but Witten does not give its name or application here.
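To get a feeling for when the moduli space can have dimension zero, one can plug numbers into the standard formal dimension formula from the mathematical literature. This formula is not taken from Witten's paper or from the argument above, and the sketch assumes gauge group SU(2) on a compact oriented four-manifold with first Betti number $b_1$ and $b_2^+$ self-dual harmonic 2-forms: $\dim = 8k - 3(1 - b_1 + b_2^+)$. The function name is made up for this illustration.

```python
def formal_dimension_su2(k, b1, b2_plus):
    """Formal dimension 8k - 3(1 - b1 + b2_plus) of the SU(2) instanton moduli space.

    Assumes gauge group SU(2) on a compact oriented 4-manifold;
    k is the instanton number, b1 and b2_plus are topological data of the manifold.
    """
    return 8 * k - 3 * (1 - b1 + b2_plus)

# Four-sphere (b1 = 0, b2_plus = 0) with a single instanton:
dim_s4_k1 = formal_dimension_su2(1, 0, 0)

# A hypothetical manifold with b1 = 0 and b2_plus = 7, at instanton number 3:
dim_zero_case = formal_dimension_su2(3, 0, 7)

print(dim_s4_k1, dim_zero_case)
```

The first case reproduces the familiar five parameters of a single instanton on the four-sphere (four for the position, one for the size), and the second shows that a vanishing formal dimension requires a specific match between $k$ and the topology of the manifold.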
Specializing to the situation in the question, where the instanton moduli space is discrete and has dimension zero, we thus have that $Z$ is a smooth invariant. For a fixed instanton background and in the limit of weak coupling, it is enough to look at the lowest order terms in the fields. Those are quadratic terms of the form $\Phi \Delta \Phi$ and $\Psi D \Psi$ where $\Delta$ is a second order elliptic operator on the bosonic fields $\Phi = (A, \phi, \lambda)^T$ and $D$ is a first order real skew-symmetric operator on the fermionic fields $\Psi = (\eta, \psi, \chi)^T$. This means the path integral over $\Phi$ and $\Psi$ degenerates into a Gaussian integral.
Furthermore, the eigenvalues of $\Delta$ and $D$ are related by supersymmetry: Looking at the supersymmetry transformation (1), we see that the classical solutions $F = -\ast F$ and $\phi, \lambda, \eta, \psi, \chi = 0$ are invariant under supersymmetry, so the quantum excitations when expanding about them are related by supersymmetry, too. For each non-zero $\lambda$ that is an eigenvalue of $D$ (these come in pairs since it is skew-symmetric), there is an eigenvalue $\lambda^2$ of $\Delta$. This is shown in D'Adda, Di Vecchia, "Supersymmetry and instantons".
Now, the Gaussian path integral over $\int_M (\Phi \Delta \Phi + i \Psi D \Psi) \sqrt{g}$ yields $\mathrm{Pf}(D)/\sqrt{\det(\Delta)}$, and the Pfaffian and the square root of the determinant differ only by a sign, so this becomes $\pm \prod_i \mathrm{sgn}(\lambda_i)$, where the product runs over all non-zero eigenvalues. The $\pm$ is due to us having to pick an orientation that determines the sign of the Pfaffian. So we pick any one instanton $A_0$ and declare this product to be $+1$ for it. Now, one picks any other instanton $A_i$ and determines how often $D$ has a zero eigenvalue along the homotopy $A_t = t A_0 + (1-t) A_i$. Every time it obtains a zero eigenvalue, the sign of $\mathrm{Pf}(D)$ is defined to change. (I think this is basically transporting $D$ along the curve $A_t$ in the space $\mathcal{A}/\mathcal{G}$, where $\mathcal{A}$ is the space of connections and $\mathcal{G}$ the group of gauge transformations.) This gives a well-defined way to define the sign of the Pfaffian for $A_i$ if it gives the same sign regardless of which homotopy $A_t$ is chosen, and since concatenating two of them gives a loop from $A_0$ back to itself, this is equivalent to the requirement that the sign must not change along any loop in $\mathcal{A}/\mathcal{G}$ based at $A_0$.
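The statement that the ratio of Pfaffian and root of the determinant is only a sign can be illustrated in a finite-dimensional toy model. This is a sketch under strong assumptions that are not part of the field theory: $D$ is taken to be a block-diagonal real skew-symmetric matrix with one $2 \times 2$ block per parameter $\lambda_i$, and $\Delta$ is taken to be diagonal with entries $\lambda_i^2$. The numbers and helper names are invented for the example.

```python
import math

def pfaffian(a):
    """Pfaffian of a skew-symmetric matrix (list of lists), expanded along the first row."""
    n = len(a)
    if n == 0:
        return 1.0
    if n % 2 == 1:
        return 0.0
    total = 0.0
    for j in range(1, n):
        if a[0][j] == 0:
            continue
        keep = [r for r in range(n) if r not in (0, j)]
        minor = [[a[r][c] for c in keep] for r in keep]
        total += (-1) ** (j + 1) * a[0][j] * pfaffian(minor)
    return total

def block_skew(lams):
    """Block-diagonal skew-symmetric matrix with blocks [[0, lam], [-lam, 0]]."""
    n = 2 * len(lams)
    d = [[0.0] * n for _ in range(n)]
    for i, lam in enumerate(lams):
        d[2 * i][2 * i + 1] = lam
        d[2 * i + 1][2 * i] = -lam
    return d

# Toy spectrum: one parameter lam per fermionic pair, lam**2 for the bosonic partner.
lams = [1.5, -0.7, 2.0, -3.0]

pf_D = pfaffian(block_skew(lams))
sqrt_det_Delta = math.sqrt(math.prod(lam * lam for lam in lams))

ratio = pf_D / sqrt_det_Delta
sign_product = math.prod(math.copysign(1.0, lam) for lam in lams)

# A generic 4x4 check of the expansion: Pf = a01*a23 - a02*a13 + a03*a12
generic = [[0, 1, 2, 3], [-1, 0, 4, 5], [-2, -4, 0, 6], [-3, -5, -6, 0]]
pf_generic = pfaffian(generic)

print(ratio, sign_product, pf_generic)
```

In this toy model all magnitudes cancel between numerator and denominator, so only the product of signs survives, which is the finite-dimensional shadow of the cancellation between bosonic and fermionic fluctuations.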
If this is given, then we obtain that $Z = \sum_i (-1)^{n_i}$, where the sum is over all instantons and the $n_i$ count the sign changes determined in the above fashion. This is now, finally, exactly the same sketchily defined invariant as in the question.
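The counting prescription can be mimicked in the same kind of finite-dimensional toy model. Again, this is only a sketch with invented data: each "instanton" is represented by a list of parameters $\lambda_j$ of a block-diagonal skew-symmetric $D$, so that the Pfaffian along the straight-line path is just the product of the interpolated parameters. The real problem involves operators on infinite-dimensional spaces, which this does not capture.

```python
import math

def pf_along_path(lams_ref, lams_i, t):
    """Pfaffian of t*D_0 + (1 - t)*D_i for block-diagonal toy operators."""
    return math.prod(t * a + (1.0 - t) * b for a, b in zip(lams_ref, lams_i))

def count_sign_flips(lams_ref, lams_i, steps=1000):
    """Count sign changes of the Pfaffian when moving from t = 1 (reference) to t = 0."""
    flips = 0
    prev = pf_along_path(lams_ref, lams_i, 1.0)
    for s in range(1, steps + 1):
        t = 1.0 - s / steps
        cur = pf_along_path(lams_ref, lams_i, t)
        if cur == 0.0:
            continue  # sitting exactly on a zero mode; wait for the next sample
        if cur * prev < 0:
            flips += 1
        prev = cur
    return flips

# Reference configuration A_0 (declared to have sign +1) and a toy list of "instantons".
reference = [1.0, 2.0]
instantons = [[1.0, 2.0], [-1.0, 2.0], [-1.0, -3.0]]

flips = [count_sign_flips(reference, lams) for lams in instantons]
Z = sum((-1) ** n for n in flips)

print(flips, Z)
```

Sampling on a grid can miss two crossings that fall between neighbouring samples, but that only changes the count by an even number, so the parity (which is all that enters $Z$) is unaffected.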
