To understand what is going on, one has to distinguish between a full (quantum, non-perturbative) quantum field theory and a Lagrangian and/or semiclassical/perturbative description of a theory. In a full QFT, one has an algebra $\mathcal{A}$ of (physical) fields of operators. A (global) symmetry group $G$ is a group of automorphisms of this algebra. A choice of vacuum is a choice of realization of $\mathcal{A}$ on a Hilbert space $\mathcal{H}$, the space of (physical) states (the states in $\mathcal{H}$ are obtained from a particular vector in $\mathcal{H}$, the vacuum, by the action of elements of $\mathcal{A}$). Different choices of vacuum correspond to different (inequivalent) representations of $\mathcal{A}$ on a Hilbert space. For a given choice of vacuum, not all the symmetries of $\mathcal{A}$ are necessarily realizable by unitary transformations of the Hilbert space: the realizable symmetries form a subgroup $H$ of $G$, and if $H$ is strictly smaller than $G$, one has spontaneous symmetry breaking from $G$ to $H$. The spectrum of the theory in the given vacuum contains one massless scalar (Goldstone boson) for each continuous direction in $G/H$.
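As a quick illustration of the counting in the last sentence (a sketch, assuming a relativistic theory with a compact internal global symmetry; the $SU(2) \to U(1)$ example is chosen here only for illustration and refers to a global symmetry, not a gauge one):

$$
\#\{\text{Goldstone bosons}\} = \dim(G/H) = \dim G - \dim H,
\qquad
\text{e.g. } G = SU(2),\ H = U(1):\quad 3 - 1 = 2 .
$$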
The notion of gauge symmetry depends on a specific Lagrangian description of the theory. In such a description, one starts with a classical field theory with a gauge symmetry and defines a QFT by quantization, say by the path integral approach. In this picture, only gauge-invariant classical fields define corresponding fields of operators in the quantum theory. Indeed, to define correlation functions in the quantum theory, one has to take the path integral over gauge equivalence classes of fields, and so only gauge-invariant quantities can be included in the integrand. One could try to define correlation functions of gauge-variant fields by fixing a gauge. This is indeed possible perturbatively, but the results depend on the gauge choice, and fixing a gauge is in any case impossible in general at the non-perturbative level (Gribov ambiguity). So, very concretely, in the Seiberg-Witten example, $\phi$, which is a gauge-variant field in the classical starting point of the Lagrangian description, does not define a well-defined field of operators in the full QFT, and in particular it does not make sense to talk about an expectation value $\langle \phi \rangle$.
In the classical theory, it makes sense to say that the field $\phi$ has a non-zero value at infinity; the usual description of the Higgs mechanism applies, and this story extends to the perturbative level. To understand the relation with the full non-perturbative theory, it is useful to think in terms of path integrals. A Lagrangian for a gauge theory defines a full QFT by a path integral over gauge equivalence classes of classical fields. In particular, one has a choice of boundary conditions at infinity for the classical fields being integrated over, and this choice is mapped to the choice of vacuum of the full QFT. But this mapping can be quite non-trivial. In the Seiberg-Witten story, the boundary condition on the field $\phi$ is specified by a complex number $a$, well defined up to a sign. Classically, the moduli space of vacua is parametrized by $a$. For $a \neq 0$, the gauge symmetry is spontaneously broken from $SU(2)$ to $U(1)$, and for $a=0$ the $SU(2)$ gauge symmetry is unbroken. For large $a$, the classical theory is weakly coupled at the symmetry breaking scale, and so one expects that for every such $a$ the path integral with boundary conditions prescribed by $a$ defines a vacuum of the full quantum theory, with an infrared behaviour looking like the classical one: a $U(1)$ gauge theory with massive W bosons. But for small $a$, the theory is strongly coupled, and it is unlikely that the quantum theory looks like the classical one. In fact, the path integral has infrared divergences, which makes the correspondence between $a$ and quantum vacua doubtful. The conclusion is that $a$, which would be the candidate for $\langle \phi \rangle$, is not a good, well-defined coordinate on the moduli space of vacua. This is not very surprising, precisely because $\phi$ is not an allowed observable in the full theory.
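The boundary condition and its sign ambiguity can be written out explicitly. This is a sketch under an assumed normalization (the factor $\tfrac{1}{2}$ and the choice of the $\sigma^3$ direction are conventions not fixed by the text above; $\sigma^i$ are the Pauli matrices):

$$
\phi \;\longrightarrow\; \tfrac{1}{2}\, a\, \sigma^3 \quad \text{at infinity},
\qquad
g = i\sigma^1 \in SU(2):\quad
g \left( \tfrac{1}{2}\, a\, \sigma^3 \right) g^{-1} = \sigma^1 \left( \tfrac{1}{2}\, a\, \sigma^3 \right) \sigma^1 = -\tfrac{1}{2}\, a\, \sigma^3 .
$$

So $a$ and $-a$ are related by a gauge transformation and label the same classical vacuum, which is why $a$ is only defined up to a sign.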
Breaking of a gauge symmetry is not breaking of a corresponding global symmetry, simply because in general there is no global symmetry associated with a gauge symmetry. More precisely, the conserved current associated with a global gauge transformation is in general gauge variant and so cannot define a well-defined charge on the Hilbert space of (physical) states. (A well-known exception to this statement is QED, where the current associated with the global $U(1)$ is gauge invariant and there is a well-defined electric charge. But to have spontaneous symmetry breaking one needs a charged scalar, and then the current associated with the global $U(1)$ is not gauge invariant because of the term $A_\mu A^\mu \phi \phi^\dagger$ in the Lagrangian.) If there really were a breaking of a global symmetry, then one should see a Goldstone boson.
The conclusion is that the notion of spontaneous breaking of a gauge symmetry only makes sense given a Lagrangian/classical/perturbative description of the theory. This is not surprising, as gauge symmetry is simply a redundancy in a given description of the theory (digression: physical consequences of a gauge symmetry description exist at the level of asymptotic symmetries, but these are much subtler objects than a global symmetry acting on the Hilbert space). So the question "is a gauge symmetry spontaneously broken in a given vacuum of a full non-perturbative QFT?" does not really make sense. A question which does make sense is: are there massless spin 1 particles? If yes, then there is a natural gauge theory description. If no, then there is none.
So the meaningful questions that Seiberg and Witten are trying to answer are: what is the space of vacua of the theory, and what is the infrared physics in each of these vacua? They start from the classical story, with a moduli space parametrized by $a$, an unbroken $U(1)$ gauge symmetry at $a \neq 0$ and an unbroken $SU(2)$ gauge symmetry at $a = 0$. They argue that this picture is qualitatively correct at the quantum level for large $a$. To study the general case, one needs a good coordinate on the space of vacua. Natural functions on the space of vacua are vacuum expectation values of fields of operators. $\mathrm{tr}\, \phi^2$ is a well-defined field of operators of the theory because it comes from a gauge-invariant function in the path integral definition of the QFT, and so it makes sense to consider $\langle \mathrm{tr}\, \phi^2 \rangle$. That it is a good choice is not obvious a priori: it could be a constant function on the space of vacua, for example. But it is a good choice, because it is clearly one in the region where the classical approximation is good: for large $a$, $\langle \mathrm{tr}\, \phi^2 \rangle \sim a^2$. In other words, $\langle \mathrm{tr}\, \phi^2 \rangle$ is the simplest way to extend to the full quantum theory the variable $a$ that is natural from the classical point of view. All the work is then to determine the quantum corrections to the classical picture, and in particular to compute $\langle \mathrm{tr}\, \phi^2 \rangle$ exactly as a function of $a$ in the region of the space of vacua where $a$ is still a good coordinate.
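The classical relation $\langle \mathrm{tr}\, \phi^2 \rangle \sim a^2$ can be checked in one line. This is a sketch assuming the classical vacuum is normalized as $\phi = \tfrac{1}{2} a \sigma^3$ (a convention not fixed by the text above; a different normalization only changes the numerical prefactor):

$$
\mathrm{tr}\, \phi^2 = \tfrac{1}{4}\, a^2\, \mathrm{tr}\, (\sigma^3)^2 = \tfrac{1}{4}\, a^2 \cdot 2 = \tfrac{1}{2}\, a^2 .
$$

The result depends only on $a^2$, so it is insensitive to the sign ambiguity $a \to -a$, as a gauge-invariant quantity must be.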