  Non-quantum explanation for why classical physics should obey an action principle?

+ 6 like - 0 dislike

Recently, I had the following conversation with Norm Margolus.  I mentioned to him that the stationary action principle of classical physics had always seemed completely mysterious to me, until I learned the modern explanation for it involving quantum interference.  I.e., given all the classical differential equations that one could've written down a-priori, why should nature have picked equations that happen to give stationary paths of a particular global integral---unless, let's say, there was some physical process that was somehow exploring all the paths (not just the stationary ones), just as quantum mechanics says there is?

Margolus replied that this was conceptually completely backwards.  He claimed that no matter what the equations of classical physics had been, we could always view them as obeying some sort of action principle; and he pointed out that it was the action principle that preceded quantum mechanics rather than vice versa.  For my part, I agreed that this was the historical order, but I thought the historical order reversed the logical order.  I.e., on the one hand, quantum mechanics can be "motivated," more-or-less, as simply a non-commutative generalization of probability theory, without saying a word about the action principle.  But on the other hand, the action principle has no satisfying explanation (that I know of) that doesn't mention quantum mechanics.

So, I'm extremely curious to know the experts' thoughts on this.  Specifically: supposing we didn't know anything about QM, is there a way that the action principle could have been explained and motivated within classical physics, as satisfyingly as QM explains it?

asked Jul 24, 2014 in Theoretical Physics by ScottAaronson (795 points) [ no revision ]
Most voted comments

OK, thanks, although I didn't claim otherwise!  How does this relate to the question?

"He claimed that no matter what the equations of classical physics had been, we could always view them as obeying some sort of action principle; " Could you explain why you are not satisfied with this answer? 

(This is not an answer.) It is truly amazing that nature can be described, quite often, by differential equations and many of them can be derived from an action.  I think Laplace's 'Mécanique Céleste' would be a good place to look for the answer, not just for historical reasons but for the simple fact that it was written before quantum mechanics.

@JiaYiyang: This does not work. For example, if your equation is $\dot{x}-x = 0$, minimizing $\int (\dot{x} -x)^2 dt$ gives a second order equation for a mass on a spring, not the first order equation. The reason is that the action principle is defined as extremizing the motion with fixed endpoints, not minimizing absolutely.
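As a quick check of the counterexample in this comment, the Euler-Lagrange computation for the functional $\int (\dot{x}-x)^2 dt$ can be written out explicitly. This is only a sketch of the standard fixed-endpoint calculation:

```latex
L(x,\dot x) = (\dot x - x)^2, \qquad
\frac{\partial L}{\partial \dot x} = 2(\dot x - x), \qquad
\frac{\partial L}{\partial x} = -2(\dot x - x)

\frac{d}{dt}\frac{\partial L}{\partial \dot x} - \frac{\partial L}{\partial x}
  = 2(\ddot x - \dot x) + 2(\dot x - x) = 2(\ddot x - x) = 0
  \quad\Longrightarrow\quad \ddot x = x
```

Every solution of $\dot{x} = x$ also solves $\ddot{x} = x$, but so does $e^{-t}$, so the fixed-endpoint problem admits more solutions than the original first-order equation. Note the sign: $\ddot{x} = x$ is the equation of an inverted (unstable) spring rather than a restoring one, which does not affect the point about the order of the equation.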

It depends, but there are some properties that an extremal formulation obeys: if the action is not explicitly time-dependent, the dynamics is intrinsically reversible, and obeys a symplectic Liouville principle of conservation of information (Liouville's theorem). These properties are obeyed by nondissipative systems only, as explained in Moretti's answer, and this is the central restriction of Lagrangian formulations: they are nondissipative, they conserve information. There are exceptions; you can sometimes write a dissipative system as an extremal point of a Lagrangian using an explicitly time-dependent action, but this is only for very special dissipative systems, like the mass on a spring with linear friction, and it doesn't change the principle. Action formulations are symplectic and time-reversible, and are fundamentally justified by some sort of fundamental account of information. This is ultimately provided by quantum mechanics, but you can make some excuses for it classically too, although they are not as convincing.

Most recent comments

Jie: I'm not satisfied with it because I don't understand in what sense it's actually true.  (Even if it's always true in some formal sense -- which I already don't understand -- is it always true in a sense where any interesting conclusions can be drawn from it?)

If I'm allowed to cheat a bit, given any differential equation $F(x_i, \frac{\partial}{\partial x_i},\frac{\partial^2}{\partial x_i^2}\ldots)=0$, can't I just define the "action" to be $\int F^2$, and require it to be minimized? This of course may lack the nice features of the standard definition of the action (an integral over a Lagrangian function), but I think it does imply that the question is vacuous in the broadest context.

5 Answers

+ 5 like - 0 dislike

First of all the variational principle just determines the equations of motion and not the motion itself.

This is quite a subtle point, because the variational principle, to be formulated, requires fixing the initial and final configurations of the system. However, the resulting equations, in general, either admit no solution or admit too many solutions satisfying that pair of initial and final configurations!

The variational problem is generally ill-posed if you look for solutions. Instead, it makes sense if you only look for the equations of motion, without looking for solutions with given initial and final configurations.

However, once one has the equations of motion, existence and uniqueness theorems apply when initial conditions are given that specify both the initial configuration and the initial velocity.

Coming to the core of your issue, I would like to stress that it is generally false that the evolution of a classical system is described by a variational principle. Indeed, as soon as you include friction forces you cannot derive the laws of classical dynamics from a variational principle. What is true is the following theorem.

Consider a set of $N$ matter points interacting, within the Newtonian formulation, by means of forces obtained from a potential or a generalized potential (I mean, in particular, the cases of the Lorentz force and inertial forces) and possibly subjected to constraints depending on time and positions (not velocities). Suppose that either there are no constraints at all, or the (unknown) reactive forces due to the constraints satisfy the so-called postulate of ideal constraints, also known as D'Alembert's principle. This holds in particular for the rigidity constraint and when the points are supposed to belong to friction-free lines or friction-free surfaces, possibly moving or deforming in time; it is nothing but a postulate on the type of reactions due to the constraints, which may or may not be verified in practice. Then deterministic equations of motion of the system can be derived from a variational principle.

It is worth stressing that the reactions do not appear in the mentioned variational equations, though these equations  completely determine the motion of the system if initial conditions (positions and velocities of every point of the system at initial time) are given. Reactions can be determined a posteriori, once the motion of the system has been computed, just exploiting the equations of Newton.

The result can be extended to the case of systems of both points and solid bodies.

answered Jul 25, 2014 by Valter Moretti (2,085 points) [ revision history ]
edited Jul 26, 2014 by Valter Moretti
+ 2 like - 0 dislike

I will try to answer your question, provided that I understand it correctly. I believe the Principle of Least Action was formulated before QM and can be justified without it. Why would we need QM to justify using the Action principle? 

The Action is a functional that takes the history or path of the system as an argument and returns a real value. It just happens that nature insists that this real value be as small as possible, which is why we minimise the Action.

In Classical Mechanics the input is the evolution of the generalised coordinates \(q(t)\) as described by the Lagrangian. From this we compute the Action as the integral of the Lagrangian from time \(t_1\) to \(t_2\):

\(\mathcal{S}[\mathbf{q}(t)] = \int_{t_1}^{t_2} L[\mathbf{q}(t),\dot{\mathbf{q}}(t),t]\, dt\)

The true evolution of the system is the one for which the Action is an extremum.

As you can see there is no need for QM to define an Action principle. In my eyes, it is more intuitive and perhaps a bit more justified to use this principle in CM rather than QM because the notion of a path is perhaps better defined (No uncertainty principle, no path integral formulation and such). 
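To make the statement about extremising the Action concrete, here is a minimal numerical sketch in Python. It assumes a particle in uniform gravity with both endpoints fixed at zero height; the function names, the sine-shaped perturbation, and all parameter values are illustrative choices, not part of the answer above. It compares the discretized Action of the true trajectory with that of nearby paths sharing the same endpoints.

```python
import math

def action(path, dt, m=1.0, g=9.81):
    """Discretized action: sum of (kinetic - potential) * dt in uniform gravity."""
    total = 0.0
    for i in range(len(path) - 1):
        v = (path[i + 1] - path[i]) / dt          # finite-difference velocity
        x_mid = 0.5 * (path[i] + path[i + 1])     # midpoint height
        total += (0.5 * m * v * v - m * g * x_mid) * dt
    return total

def classical_path(n, T, g=9.81):
    """Solution of x'' = -g with x(0) = x(T) = 0, sampled on n intervals."""
    dt = T / n
    return [0.5 * g * (i * dt) * (T - i * dt) for i in range(n + 1)]

def perturbed(path, eps):
    """Add a bump eps*sin(pi*t/T) that vanishes at both endpoints."""
    n = len(path) - 1
    return [x + eps * math.sin(math.pi * i / n) for i, x in enumerate(path)]

T, n = 1.0, 1000
dt = T / n
x_cl = classical_path(n, T)
s_cl = action(x_cl, dt)

# The change in the action has no term linear in eps: it scales like eps**2.
for eps in (-0.1, -0.01, 0.01, 0.1):
    print(eps, action(perturbed(x_cl, eps), dt) - s_cl)
```

For this particular Lagrangian the change in the Action is positive for either sign of the perturbation and grows quadratically with its size, which is what "stationary" means here. In this example the stationary path happens to be a minimum, but that is a property of the example, not of the principle.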

answered Jul 25, 2014 by PhotonicBoom (40 points) [ no revision ]
Most voted comments

By definition \(q(t_1)\) and \(q(t_2)\) are fixed. The reason is that we assume the initial and final points are the same for all trajectories, and only the path varies.

You are free to choose the reference frame (recall we are using generalised coordinates) hence \(q(t_1) = q(t_2) = 0 \) is a perfectly valid choice.

@Dilaton I think he meant what V. Moretti explained. :)

Does the downvoter care to explain why he downvoted?

Dude, you shamed me into removing the downvote. But I think that you misinterpreted the question inexplicably, the body makes it clear what is being asked.

Most recent comments

I downvoted, because it doesn't answer the question, this answer consists of generic statements with no real content. The main mistake OP makes is that he thinks that any classical equation can be formulated as an action principle. This is simply false. A dissipative equation in general cannot be formulated as an action principle, since action principles are always manifestly symmetric with respect to the direction of time (it's obvious from the formulation).

Euler-Lagrange equations are for this reason second order, or fourth order (if the action depends on second derivatives) or sixth order (if the action depends on third derivatives). Terms which are linear in first derivatives integrate out, and don't influence the equations. If the action only depends on x and v, you produce a symplectic Hamiltonian formulation naturally.

Second, it is simply false that the action is minimized absolutely in the true motion. For example, if you look at a Harmonic oscillator at say x(0)=0 and x(T)=0, where T is specially chosen to be the period of the oscillator, there are infinitely many classical solutions, and after this focal point, the classical solution is no longer a minimum of the action (it is simply an extremal point). This property of minima is clarified by Morse theory, but it's really simple to understand, once solutions focus, the minimal property is lost (it's this focusing property which is what Penrose uses for his singularity theorem).

So what does this answer say, really? It says "you can write down an action principle in classical mechanics". OP already knew that. The point to make here is that this places restrictions on the form of the equations, and ultimately the only justification for this teleological thing is quantum mechanics.
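The harmonic-oscillator remark in this comment can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the rest solution x(t) = 0 with x(0) = x(T) = 0, for which the first focal (conjugate) point sits at half a period, T = π/ω, and perturbs it by a small sine bump. The names `ho_action` and `action_change` and all parameter values are hypothetical.

```python
import math

def ho_action(path, dt, m=1.0, omega=1.0):
    """Discretized harmonic-oscillator action: sum of (m v^2/2 - m omega^2 x^2/2) * dt."""
    total = 0.0
    for i in range(len(path) - 1):
        v = (path[i + 1] - path[i]) / dt
        x_mid = 0.5 * (path[i] + path[i + 1])
        total += (0.5 * m * v * v - 0.5 * m * omega ** 2 * x_mid ** 2) * dt
    return total

def action_change(T, eps=0.01, n=2000):
    """S[eps*sin(pi t/T)] - S[0] for the rest solution x(t) = 0 on [0, T]."""
    dt = T / n
    bump = [eps * math.sin(math.pi * i / n) for i in range(n + 1)]
    return ho_action(bump, dt)  # the rest solution itself has zero action

half_period = math.pi  # pi/omega with omega = 1

# Before the focal point the perturbation raises the action (local minimum).
print(action_change(0.5 * half_period))
# Past the focal point the same perturbation lowers it (saddle, not a minimum).
print(action_change(1.5 * half_period))
```

In the continuum limit the change is ε² m T/4 · ((π/T)² − ω²), so its sign flips exactly at T = π/ω. The path remains a solution of the equations of motion either way; only its status as a minimum is lost.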

@RonMaimon, thanks for the reply. I still believe the downvote was a bit harsh, but your points are fair. As I understood the question, the OP was asking what the justification of the Action principle in CM is, so I tried to answer as best I could where it comes from. Anyway, I am happy to be wrong, that's why we do science. Thanks for the feedback, especially that coming from experienced and established users of this site.

+ 2 like - 0 dislike

The problem with the question: "is there a way that the action principle could have been explained and motivated within classical physics, as satisfyingly as QM explains it?" is that "classical physics" is not defined. In some expositions of classical mechanics, the action principle is taken as the starting point. Of course, this will probably not satisfy the OP. But as it is impossible to begin from nothing, we have to choose another starting point. The other usual starting point of classical mechanics is the Hamiltonian formalism. If you accept the Hamiltonian formalism then you can derive the action principle by a standard Hamiltonian/Lagrangian equivalence (here there can be many painful details but I don't think they are important for "big picture considerations"). But now the OP could ask "is there a way that the Hamiltonian formalism could have been explained and motivated"? In other words, have we gained anything by choosing the Hamiltonian rather than the Lagrangian formulation as the starting point? I think so, because there are "explanations" and "motivations" for the Hamiltonian formulation: again, it is not possible to start from nothing, but it is possible to start from very "primitive hypotheses" on what a physical theory is. As it is not directly the question, I do not expand on that (I can if someone asks) but I refer to Kapustin's paper http://arxiv.org/abs/1303.6917 (whose goal is to show that under these "primitive hypotheses", the only possible physical formalisms are classical and quantum physics, the former being a natural limit of the latter).

The reason that there exist "primitive hypotheses" from which one can derive the Hamiltonian formalism, while there does not seem to exist an obvious "analogue" for the Lagrangian formalism, is essentially psychological. As already mentioned in other answers, the action principle can look "unnatural" because it seems to violate causality, whereas causality is generally included in the "primitive hypotheses". But the action principle happens to be so useful, in particular in giving the possibility to make relativistic invariance manifest, that it should certainly be included in the "natural" things (what should be called "natural" or "motivation" cannot be determined only from some "pre-physical", "human" intuition).
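For readers who want the "standard Hamiltonian/Lagrangian equivalence" spelled out, here is a sketch of the usual phase-space argument for a single degree of freedom. It is the textbook computation, not something specific to this answer, and it ignores the "painful details" mentioned above (for instance, invertibility of the Legendre transform):

```latex
S[q,p] = \int_{t_1}^{t_2} \bigl( p\,\dot q - H(q,p,t) \bigr)\, dt

\delta S = \int_{t_1}^{t_2} \Bigl[ \Bigl(\dot q - \frac{\partial H}{\partial p}\Bigr)\delta p
         - \Bigl(\dot p + \frac{\partial H}{\partial q}\Bigr)\delta q \Bigr] dt
         + \bigl[\, p\,\delta q \,\bigr]_{t_1}^{t_2}

\delta q(t_1) = \delta q(t_2) = 0,\ \ \delta S = 0
  \quad\Longleftrightarrow\quad
  \dot q = \frac{\partial H}{\partial p}, \qquad \dot p = -\frac{\partial H}{\partial q}
```

Eliminating $p$ through $\dot q = \partial H/\partial p$ turns the integrand $p\,\dot q - H$ into the Lagrangian $L(q,\dot q,t)$, which recovers the usual form of the action principle.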


Further remarks:

1) The action principle of classical mechanics is very special in the sense that the Lagrangian $L(q, \dot{q})$ is not an explicit function of $\ddot{q}$, ... This is the reason for the Euler-Lagrange equations to be of second order: a more complicated functional dependence of the Lagrangian would produce higher-order equations. I think this shows that it is difficult to find a direct reason to motivate the action principle: a general argument leading to a variational principle would hardly select this very special kind of variational problem. In contrast, the equivalence with the Hamiltonian formalism gives a natural "explanation" for this particular structure.

2) There exist in physics variational problems similar to the action principle, in particular in thermodynamics. Thermodynamics is mainly about extremality of thermodynamic functions (energy, free energy, enthalpy...). Apart from experimental evidence (which was of course historically the most important), or maybe "philosophical considerations", I see no "macroscopic" derivation of this fact. The only explanation I know is statistical physics: thermodynamics is the limit of statistical physics in the thermodynamic limit. It is well known that statistical physics is analogous to quantum physics and that the thermodynamic limit is analogous to the classical limit. On the statistical physics side, there is no obvious Hamiltonian formalism because there is no obvious time, and so there is no alternative way to "explain" the variational principles in thermodynamics. EDIT: however, see the comment by Arnold Neumaier.

Maybe if physicists at the end of the 19th century had asked the OP's question, "where does the action principle come from in classical mechanics?", then, since they knew the answer to the question "where do the variational principles come from in thermodynamics?", they could have discovered quantum physics by pure thought. The fact that classical mechanics is the limit of something, as thermodynamics is the limit of statistical physics, should have been suggested by the presence of variational principles in the two cases, but also by the presence of Legendre transforms in the two theories: the Legendre transform is a natural ("tropical") limit of the Fourier-Laplace transform.
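The remark about the Legendre transform being a "tropical" limit of the Laplace transform can be made concrete with Laplace's method. The formula below is a sketch that assumes $f$ is continuous and grows fast enough for the integral to converge; the small parameter $\epsilon$ is introduced here for illustration and plays the role of $k_B T$ or $\hbar$:

```latex
f^*(p) \;=\; \sup_x \bigl( p\,x - f(x) \bigr)
       \;=\; \lim_{\epsilon \to 0^+} \, \epsilon \,
             \log \int e^{\,(p\,x - f(x))/\epsilon}\, dx
```

As $\epsilon \to 0^+$ the integral is dominated by the point where the exponent is largest, so the sum over all $x$ collapses to a supremum. This is the same mechanism by which a sum over configurations or paths reduces to an extremal principle.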

answered Jul 26, 2014 by 40227 (5,140 points) [ revision history ]
edited Jul 26, 2014 by 40227

One can derive the extremal principles of thermodynamics from macroscopic axioms; see my paper Phenomenological thermodynamics in a nutshell. It is not an action principle, though, as the underlying geometry is not symplectic geometry but contact geometry.

+ 2 like - 1 dislike

First of all, I think that interpreting the action principle in quantum mechanics as a physical process that explores all the paths goes too far. (As a mathematical process, fine, but physical, rather not.)

The reason is that the main difficulty of the path integral formalism is the construction of the path integral measure. Doing this with a suitable level of rigor usually incorporates information from the action (kinetic term \(\frac{m}2 \dot x^2\) for the Wiener measure, eigenvalues of the Dirac operator in QFT, etc.). When trying to incorporate other degrees of freedom like spin, one is quickly led to see the path integral as a mnemonic device for the Trotter product formula. I would say that you have to pick some classical differential equation / operator a priori that you can use as an input to define a path integral, not the other way round.

Now the classical action principle. First, you can interpret it as a "physical" process if you really want to: The classical particle tests out all possible paths and then chooses the one that minimizes the action.

You may not agree with this formulation, though, and your reason would probably be that this looks like a non-local process: the particle has to "figure out the whole path" before going somewhere, and cannot decide "at every moment in time" where to go. However, the Euler-Lagrange equations usually are local, so just because the process looks non-local, it doesn't need to be inherently non-local. I'm not entirely sure what's going on myself, but the action principle is similar to Bellman's Dynamic Programming: to go from a point \(x_0\) to a point \(x_1\), you pick an intermediate point \(x'\), solve the subproblems of going to and from there and then minimize over the intermediate point

\(S[x_0,x_1] = \min_{x'} \left(S[x_0,x']+S[x',x_1]\right)\)
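The composition rule above can be checked for the simplest case, a free particle. The Python sketch below assumes the known free-particle action between two events, m(x₁ − x₀)²/(2(t₁ − t₀)). The explicit times, the grid search over the intermediate point, and the function names are illustrative assumptions that are not in the answer itself.

```python
def free_action(x0, t0, x1, t1, m=1.0):
    """Action of the straight-line path of a free particle between two events."""
    return 0.5 * m * (x1 - x0) ** 2 / (t1 - t0)

def composed_action(x0, t0, x1, t1, t_mid, grid):
    """Minimize S[x0 -> x'] + S[x' -> x1] over candidate intermediate points x'.

    Returns a (minimal value, minimizing x') pair.
    """
    return min(
        (free_action(x0, t0, xm, t_mid) + free_action(xm, t_mid, x1, t1), xm)
        for xm in grid
    )

x0, t0, x1, t1, t_mid = 0.0, 0.0, 2.0, 1.0, 0.25
grid = [i / 1000.0 for i in range(-1000, 3001)]   # candidates from -1 to 3

best_value, best_x = composed_action(x0, t0, x1, t1, t_mid, grid)
print(best_value, free_action(x0, t0, x1, t1))    # the two actions agree
print(best_x)                                     # lies on the straight line
```

The minimizing intermediate point is the one on the straight line between the endpoints, and the minimized sum reproduces the action of the direct path. This is the principle of optimality in its simplest form.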

It may be possible to obtain the classical action principle from the quantum "interference of all paths" interpretation via some sort of "tropical probability theory", but I'm not sure if that's correct and I don't know whether anyone has already tried that.

Second, I don't agree with your friend's claim that the classical action principle is "obviously" correct. I would say that it is a mysterious happenstance: Just like a law of nature, it is neither obvious nor does it follow from some higher principle. The action principle can be derived for systems with holonomic constraints, i.e. where the particles are confined to move, say, on a circle. Situations like these are usually modeled by appealing to d'Alembert's Principle but if you start with the trivial action integral for an unconstrained system, you can incorporate the constraints by simply restricting the optimization problem to trajectories that fulfill the constraints. The equivalence of these formulations requires proof.

To summarize, my points are:

  1. The QM explanation is not actually satisfying. (Construction of path integral measure.)
  2. It might be possible to derive the classical action principle from "tropical probability theory" (Conjecture!).
  3. The classical action principle can be given the status of a theorem (as opposed to a higher principle) for systems with constraints.
answered Jul 26, 2014 by Greg Graviton (775 points) [ revision history ]
edited Jul 27, 2014 by Greg Graviton

''The action principle can be derived for systems with constraints,'' -- only for holonomic constraints. For nonholonomic constraints no action principle seems to exist.

The issue of constructing the measure is entirely solved by an infinitesimal Wick rotation, which solves the problem in principle in all cases of interest, although not yet with mathematical rigor. To call this niggling mathematical technicality a barrier to understanding the physical content of the path integral is grossly misleading. It is completely fine to consider the quantum path integral as physically summing over all paths using the natural measure, the limit of the Wick rotated measure as you go to the extreme limit. The "Trotter formula" is simply exactly the same thing in operator language, and it is just a slightly obfuscated way to state things.

@ArnoldNeumaier: I was thinking of those. Corrected, thanks.

@RonMaimon: A Wick rotation turns \(\exp(i\int dt\,\frac{m}{2}\dot x^2)\) into \(\exp(-\int d\tau\,\frac{m}{2}(\partial_\tau x)^2)\), but I have no issue with that. What I intended to say is that the path integral measure for a single particle in an electric field, the (Wick rotated) Wiener measure, requires a specific form of the kinetic term in the Lagrangian, namely \(\frac{m}2 \dot x^2\). This is necessary because there is no general measure \(Dx\) for all paths. In a way, the kinetic term is absorbed into the path integral measure, and this is the point where I would say that the interpretation of a particle moving along all possible paths becomes somewhat "unphysical", because in a way, the measure already preselects paths that are compatible with the kinetic term. In this sense, I would say that the path integral does not provide a satisfactory explanation of what Scott Aaronson asks for.

That this is not a "niggling mathematical technicality" can be seen when you want to construct a path integral for non-Cartesian coordinates. (Kashiwa, Ohnuki, Suzuki. "Path Integral Methods", chapter 3.1) An interpretation that is still valid in this and other cases is: "The particle hops from point \(x_t\) to \(x_{t+\Delta t}\) with an amplitude given by the free propagator \(G_0(x_t,t;x_{t+\Delta t},t+\Delta t)\) and picks up an additional phase from the potential energy. Then, you sum over all these hops." But of course, this is just Trotter's product formula.
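For readers unfamiliar with the Trotter product formula invoked here, the following Python sketch illustrates it on the smallest possible example. As an assumption made purely for illustration, the non-commuting "kinetic" and "potential" parts are replaced by multiples of the Pauli matrices σ_x and σ_z, whose exponentials have a closed form. All names and coefficients are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_pauli(cx, cz, t):
    """exp(-i t (cx*sigma_x + cz*sigma_z)), using exp(-i a n.sigma) = cos(a) - i sin(a) n.sigma."""
    h = math.hypot(cx, cz)
    if h == 0.0:
        return [[1, 0], [0, 1]]
    c, s = math.cos(h * t), math.sin(h * t)
    nx, nz = cx / h, cz / h
    return [[c - 1j * s * nz, -1j * s * nx],
            [-1j * s * nx, c + 1j * s * nz]]

def trotter(cx, cz, t, n):
    """(exp(-i t cx sigma_x / n) exp(-i t cz sigma_z / n))^n, i.e. n short alternating steps."""
    step = matmul(exp_pauli(cx, 0.0, t / n), exp_pauli(0.0, cz, t / n))
    result = [[1, 0], [0, 1]]
    for _ in range(n):
        result = matmul(result, step)
    return result

def max_abs_diff(a, b):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

exact = exp_pauli(1.0, 0.7, 1.0)
for n in (1, 10, 100, 1000):
    # The error of the split evolution shrinks as the number of steps grows.
    print(n, max_abs_diff(trotter(1.0, 0.7, 1.0, n), exact))
```

Each short step alternates between the two pieces, just as the "hop, then pick up a phase" description in the comment alternates between free propagation and the potential. The product converges to the exact evolution as the number of steps grows.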

The change of variables is not inconsistent, you just get some determinants which alter the naive Lagrangian. I agree with the point of view above: to define the path integral you don't say "sum over all paths" in an unstructured way, you always need a more specific way to define exactly which class of paths you are summing over, and for this, you use the (slightly, or completely) Euclideanized kinetic term to define which paths are included in the measure, and the measure gets defined properly only as you take a continuum limit. It couldn't be any other way: you can't think of a path integral as a God-given measure over paths, with the action just an arbitrary function that you integrate over the paths, because there are lots of functions over paths for which the path integral doesn't make sense.

Instead of thinking of the path integral as a uniform measure on paths, with functions getting integrated using this measure, you instead always interpret the path integral with the action already given as a procedure for sampling paths, and averaging functions over these samples, as in Monte-Carlo integration. You need a well defined Euclidean statistical sampling procedure to define the path integral, but any such sampling procedure will do, so that you can take the continuum limit of any statistical second-order phase transition, not just to those near a free field theory. Measure can always be taken to mean a sampling procedure. It is true that the procedure depends on the action.

+ 1 like - 2 dislike

The relativistic classical action principle follows from (1) homogeneity of space-time, (2) A world-line can be defined for the system.

In the proper frame of this world-line, it follows from (1) that this world-line must remain parallel to the time axis, otherwise preferential treatment would be given to one direction over the others. Therefore the world-line isn't curved => classical relativistic Lagrangian => classical Newtonian Lagrangian.

I realized this after coming across this question on Physics Stack Exchange that questions Landau's derivation of the form of the classical Lagrangian at the beginning of his mechanics book using homogeneity of space and time. He treats space and time separately, and therefore has to postulate an $L$ dependent on $v$, rather than treating them as one entity, which simplifies the necessary assumptions.

answered Jul 25, 2014 by physicsnewbie (-20 points) [ no revision ]

Landau's derivation assumes the action principle and then determines the action. The OP, however, wants to know why one should assume that an action exists.

@suresh1 right, and I'm saying that the classical action principle is equivalent to the statement that the world-line of some point that characterizes a system is a straight line. And this follows from the homogeneity/isotropy of space-time. For a free particle this point is its position, for N particles it's the center of mass/energy of the system.

Let's generalize Landau's derivation to space-time. We only have to assume there is an action principle where $L$ is just a function of the coordinates, since the velocity is now a rotation in space-time. And from this, the action principle is just the statement that the world line is a geodesic. Is it possible to come to this conclusion using a different argument? Yes, using (1) and (2) in my answer, which doesn't assume any action principle.

user contributions licensed under cc by-sa 3.0 with attribution required