A "world sheet instanton" $f : \Sigma_g \rightarrow X$ is by definition a holomorphic map: $\bar{\partial}f=0$. This equation is the analogue of the anti-self-dual (ASD) equation for 4d gauge fields.
In fact, there is a very precise analogy between the A-model and 4d instantons. It has been known since the 1980s: Gromov's idea to use (pseudo)holomorphic curves to study symplectic manifolds was directly inspired by Donaldson's work using 4d instantons to study 4-manifolds. (In a non-supersymmetric context, an analogy between 2d $\sigma$-models and 4d gauge theories was already known in physics in the 1970s.) From a physics point of view, this comes from the fact that the A-model is a topological twist of a 2d theory with $\mathcal{N}=(2,2)$ SUSY, and that the natural setting for 4d instantons is Donaldson-Witten theory, a topological twist of a 4d theory with $\mathcal{N}=2$ SUSY. The key point is that $\mathcal{N}=(2,2)$ 2d SUSY is the dimensional reduction of $\mathcal{N}=1$ 4d SUSY, and $\mathcal{N}=2$ 4d SUSY is the dimensional reduction of $\mathcal{N}=1$ 6d SUSY. Both SUSY algebras come by dimensional reduction from a SUSY algebra two dimensions higher, and so have a charge of topological nature, a BPS-like bound, and a general notion of instanton as a finite-action configuration saturating this bound. In both cases, instanton configurations are determined by a first-order PDE (the ASD equation in 4d, the holomorphic map equation in 2d), whereas the general equations of motion of the (untwisted) theory are second-order PDEs (the Yang-Mills equations in 4d, the harmonic map equation in 2d).
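As a rough illustration of the BPS-like bound, here is a sketch in one common set of conventions. The normalizations, the choice of gauge group $SU(2)$, and the symbols $\omega$ (Kähler form of $X$), $k$ (instanton number) and $F^\pm$ (self-dual and anti-self-dual parts of the curvature) are assumptions made for this illustration. In 2d, for a map $f : \Sigma \rightarrow X$,

$$\frac{1}{2}\int_\Sigma |df|^2 \;=\; \int_\Sigma |\bar{\partial}f|^2 + \int_\Sigma f^*\omega \;\geq\; \int_\Sigma f^*\omega ,$$

with equality exactly when $\bar{\partial}f=0$. In 4d, for a connection with curvature $F$ on a compact oriented Riemannian 4-manifold $M$,

$$\int_M |F|^2 \;=\; 2\int_M |F^+|^2 + 8\pi^2 k \;\geq\; 8\pi^2 k , \qquad 8\pi^2 k = \int_M \left(|F^-|^2 - |F^+|^2\right),$$

with equality exactly when $F^+=0$, i.e. for ASD connections. In both cases the lower bound is topological: $\int_\Sigma f^*\omega$ depends only on the homology class of $f$, and $k$ only on the topology of the bundle. A configuration saturating the bound automatically solves the second-order equations of motion, since it minimizes the action within its topological class.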
Another way to understand the analogy is to go not up but down in dimension. Both the 4d ASD equation and the 2d holomorphic map equation describe gradient flow lines for an (infinite-dimensional version of) Morse theory, in the same way that the usual tunneling instantons of quantum mechanics are related to the usual finite-dimensional Morse theory (see Witten's paper "Supersymmetry and Morse theory"). More precisely, the 4d ASD equation is the equation of gradient flow lines for the 3d Chern-Simons functional, whereas the 2d holomorphic map equation is the equation of gradient flow lines for the 1d action functional $\int_\gamma p\,dq$. It is possible to push the analogy to a finer level of detail, for example by finding the analogue of the topological subtleties appearing in the definition of the Chern-Simons functional.
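To make the gradient-flow statement slightly more concrete, here is a sketch in which all signs and normalizations depend on conventions (orientation, choice of metric). The symbols $Y$, $A(t)$, $u(s,t)$ and $J$ are introduced only for this illustration. On $\mathbb{R} \times Y$, with $Y$ a 3-manifold and $t$ the coordinate on $\mathbb{R}$, write the connection in temporal gauge as a path $A(t)$ of connections on $Y$. The ASD equation then becomes

$$\frac{dA}{dt} = - *_3 F_{A(t)} ,$$

and $*_3 F_A$ is the $L^2$-gradient of the Chern-Simons functional, because $\delta\, CS(A) \propto \int_Y \mathrm{tr}\left(\delta A \wedge F_A\right)$. Similarly, on a cylinder $\mathbb{R} \times S^1$ with coordinates $(s,t)$, a holomorphic map $u(s,t)$ to a target with complex structure $J$ satisfies

$$\partial_s u + J\,\partial_t u = 0 ,$$

which can be read as a flow $\partial_s u = -J\,\partial_t u$ in the space of loops $\gamma = u(s,\cdot)$. Up to a convention-dependent sign, $J\dot{\gamma}$ is the $L^2$-gradient of $\gamma \mapsto \int_\gamma p\,dq$, whose variation is $\pm\int_{S^1} \omega(\delta\gamma,\dot{\gamma})\,dt$.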
In summary, both stories come from a miracle. The miracle in 4d is the splitting of 2-forms into self-dual and anti-self-dual parts. The miracle in 2d is the splitting of harmonic functions into holomorphic and anti-holomorphic parts. In other words, the 4d miracle is the existence of the quaternions and the 2d miracle is the existence of the complex numbers (from there, you could ask: what is the 8d miracle coming from the existence of the octonions?).
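In formulas, as a sketch assuming standard Euclidean conventions: in 4d the Hodge star squares to $+1$ on 2-forms, so any 2-form $F$ decomposes as

$$F = F^+ + F^- , \qquad F^\pm = \tfrac{1}{2}\left(F \pm *F\right) , \qquad *F^\pm = \pm F^\pm .$$

In 2d, with $z = x+iy$, the Laplacian factorizes as $\Delta = 4\,\partial_z \partial_{\bar{z}}$, so locally a harmonic function is a sum

$$h(z,\bar{z}) = \phi(z) + \psi(\bar{z})$$

of a holomorphic piece and an anti-holomorphic piece.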
The A-model, and in fact the topological string (i.e. the coupling of the A-model to topological 2d gravity), makes sense for $X$ a Kähler manifold of any dimension (unlike the physical string, where the coupling to physical 2d gravity imposes a critical dimension). For any $\beta \in H_2(X,\mathbb{Z})$, there is a moduli space $M_g(X,\beta)$ parametrizing holomorphic maps of class $\beta$ from genus $g$ Riemann surfaces to $X$. It is the analogue of a moduli space of 4d ASD instantons of given instanton number. If the dimension of the moduli space of 4d ASD instantons is zero, the moduli space consists of finitely many points and one obtains a number by counting them. If the dimension is not zero, one can obtain numbers by integrating differential forms over the moduli space: one obtains the Donaldson invariants, which determine the correlation functions of the Donaldson-Witten theory. In fact, to make precise sense of that, one has to ensure that the moduli space is compact, and for that one has to add "punctual instantons" (singular limits of instantons shrinking to a point). There is a similar story for holomorphic maps: one has to add singular configurations that bubble off as limits of smooth configurations (the precise technical construction was found by Kontsevich and is the "stable" part of "stable map"). If the compactified moduli space has dimension zero, one can simply count the number of points; if not, one can integrate differential forms over the moduli space and obtain the Gromov-Witten invariants of $X$, which determine the correlation functions of the A-model/topological string.
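For concreteness, here is one standard way to write these invariants. The marked points, the evaluation maps $\mathrm{ev}_i$ and the classes $\gamma_i$ are notation introduced for this sketch. Let $\overline{M}_{g,n}(X,\beta)$ be the compactified moduli space of stable maps with $n$ marked points on the source curve, and let $\mathrm{ev}_i : \overline{M}_{g,n}(X,\beta) \rightarrow X$ send a map $f$ to its value at the $i$-th marked point. Given classes $\gamma_i \in H^*(X)$, one sets

$$\langle \gamma_1, \dots, \gamma_n \rangle_{g,\beta} \;=\; \int_{\overline{M}_{g,n}(X,\beta)} \mathrm{ev}_1^*\gamma_1 \wedge \cdots \wedge \mathrm{ev}_n^*\gamma_n ,$$

where the integral has to be taken against a suitably defined fundamental class of the compactified moduli space. Heuristically, if the $\gamma_i$ are Poincaré dual to cycles $Z_i \subset X$ in general position, this counts holomorphic maps of genus $g$ and class $\beta$ whose image meets every $Z_i$.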
What is special about $X$ a Calabi-Yau 3-fold is that all the moduli spaces $M_g(X,\beta)$ are of dimension zero, and one obtains numbers $N_{g,\beta}$ by counting the number of points in these moduli spaces. Strictly speaking, this is not true: $M_g(X,\beta)$ can be of higher dimension, but it is always "virtually of dimension zero" and it is always possible to extract numbers $N_{g,\beta}$ (which are rational in general). Making the preceding sentence precise is the main technical difficulty of the theory (this difficulty has been solved but requires some work). From a physics point of view, it has to do with the correct treatment of the fermionic zero modes.
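The "virtual dimension zero" statement can be seen from the standard expected-dimension formula, quoted here as a sketch (the notation $\mathrm{vdim}$ and $[\,\cdot\,]^{\mathrm{vir}}$ is introduced for this illustration):

$$\mathrm{vdim}_{\mathbb{C}}\, M_g(X,\beta) \;=\; \left(\dim_{\mathbb{C}} X - 3\right)(1-g) + \int_\beta c_1(TX) .$$

For a Calabi-Yau 3-fold, $\dim_{\mathbb{C}} X = 3$ and $c_1(TX) = 0$, so both terms vanish for every $g$ and every $\beta$. The numbers are then defined as the degree of a virtual fundamental class on the compactified moduli space,

$$N_{g,\beta} \;=\; \int_{[\overline{M}_g(X,\beta)]^{\mathrm{vir}}} 1 \;\in\; \mathbb{Q} .$$

A well-known example of non-integrality: if $C \subset X$ is a rigid rational curve with normal bundle $\mathcal{O}(-1) \oplus \mathcal{O}(-1)$, the degree $d$ covers of $C$ contribute $1/d^3$ to $N_{0,d[C]}$. The moduli space of such covers has positive dimension, and the maps have automorphisms, which is where the denominators come from.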