Just to highlight that this is an example of what in mathematics is called Chern-Weil theory combined with the clutching construction:
Notice first that the usual talk about field strengths on Minkowski space vanishing at infinity is mathematically equivalent to working on the one-point compactification of Minkowski space, which is the 4-sphere.
Now generally, by the clutching construction, a G-principal bundle on an (n+1)-sphere may be given as a Čech cocycle with respect to the cover of the (n+1)-sphere by two (n+1)-hemispheres which overlap on an n-sphere times a small open interval. The cocycle is then a single transition function on that overlap, and it follows that the isomorphism class of the bundle is given by the homotopy class of the resulting map from the n-sphere to G. For n+1 = 2 and G = U(1) this is the usual winding number, being the magnetic charge of the Dirac monopole. For n+1 = 4 and G = SU(2) this is the instanton number.
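To make the simplest case concrete, here is a minimal numerical sketch of the n+1 = 2, G = U(1) case: it estimates the winding number of a transition function on the equatorial circle by summing phase increments. The names `winding_number` and `transition` are hypothetical, chosen only for this illustration, and the sample count is an arbitrary assumption.

```python
import cmath
import math


def winding_number(g, samples=1000):
    """Estimate the degree of a map g: S^1 -> U(1).

    `g` takes an angle theta in [0, 2*pi) and returns a unit complex number.
    Assumes the phase changes by less than pi between neighbouring samples.
    """
    total = 0.0
    for j in range(samples):
        a = g(2 * math.pi * j / samples)
        b = g(2 * math.pi * (j + 1) / samples)
        # Phase increment between neighbouring samples, in (-pi, pi].
        total += cmath.phase(b / a)
    return round(total / (2 * math.pi))


def transition(k):
    """Hypothetical transition function theta -> exp(i*k*theta) on the equator."""
    return lambda theta: cmath.exp(1j * k * theta)


if __name__ == "__main__":
    for k in (-3, 0, 2):
        print(k, winding_number(transition(k)))
```

The SU(2) case on the 4-sphere works the same way in principle, with the degree of a map from the 3-sphere to SU(2) in place of the winding number, but it needs a 3-dimensional integral rather than a sum of phases.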
Next, any principal bundle given by transition functions this way may be canonically equipped with a connection (a gauge field) once a partition of unity subordinate to the cover is chosen, by a standard formula recalled here. You'll notice that this formula is such that in the present case towards the poles it yields the formulas found in the typical physics textbook discussion of these matters. I think much clarity is gained by understanding the proper mathematical description of what's going on, namely that after one-point compactification of spacetime representing fields vanishing at infinity, one builds a nontrivial principal bundle on the resulting 4-sphere by the clutching construction and then equips it in the quasi-canonical way with a connection.
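For the two-chart cover used here, one common way to write that construction is the following sketch. The notation is my assumption, not fixed by the text above: $U_N$, $U_S$ are the two hemispherical charts, $g = g_{SN}$ is the transition function on their overlap, $\rho_N + \rho_S = 1$ is the partition of unity, and the gluing convention is $A_i = g_{ji}^{-1} A_j\, g_{ji} + g_{ji}^{-1}\, d g_{ji}$.

```latex
% General formula for a cocycle g_{ki} and a partition of unity \rho_k:
A_i \;=\; \sum_k \rho_k \, g_{ki}^{-1}\, d g_{ki}

% Specialised to the two hemispheres, with g = g_{SN} and g_{NS} = g^{-1}:
A_N \;=\; \rho_S \; g^{-1}\, d g
\qquad\qquad
A_S \;=\; \rho_N \; g \, d\!\left(g^{-1}\right)

% Consistency check on the overlap, using d(g^{-1})\, g = - g^{-1} dg:
g^{-1} A_S\, g + g^{-1} dg
  \;=\; -\rho_N\, g^{-1} dg + g^{-1} dg
  \;=\; (1 - \rho_N)\, g^{-1} dg
  \;=\; A_N
```

In this form $A_N$ vanishes where $\rho_S = 0$ and becomes the pure gauge $g^{-1} dg$ where $\rho_S = 1$, which is the shape $f(r)\, g^{-1} dg$ with a radial profile $f$ interpolating between 0 and 1 that textbook instanton solutions take.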
Then it is Chern-Weil theory which implies that putting any connection on that bundle and evaluating its curvature 2-form in the suitable invariant polynomial (such as the trace of the product in the standard matrix representation of SU(2)) produces a differential form which is a de Rham representative of that class. This is ultimately what makes the two expressions that the questions ask about be related or even equal (in the absence of torsion classes, as in the simple case of SU(2)-bundles on the 4-sphere, they are equal).
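Spelled out for SU(2) on the 4-sphere, the statement reads as follows. This is a sketch under assumed conventions (anti-Hermitian $A$, curvature $F = dA + A \wedge A$, transition function $g$ on the equatorial 3-sphere); overall signs depend on orientation and on which way the transition function is read, so they are left as $\pm$.

```latex
% Chern-Weil representative of the second Chern class, integrated over S^4:
k \;=\; \frac{1}{8\pi^2} \int_{S^4} \mathrm{tr}\,(F \wedge F)
\;\in\; \mathbb{Z}

% Locally the integrand is exact, with the Chern-Simons form as potential:
\mathrm{tr}\,(F \wedge F) \;=\; d\,\mathrm{CS}(A),
\qquad
\mathrm{CS}(A) \;=\; \mathrm{tr}\!\left( A \wedge dA
   + \tfrac{2}{3}\, A \wedge A \wedge A \right)

% Where A is the pure gauge g^{-1} dg one has dA = -A \wedge A, hence
\mathrm{CS}(g^{-1} dg) \;=\; -\tfrac{1}{3}\,
   \mathrm{tr}\!\left( (g^{-1} dg)^{3} \right)

% Stokes' theorem on the two hemispheres then gives the degree of g:
k \;=\; \pm \frac{1}{24\pi^2} \int_{S^3}
   \mathrm{tr}\!\left( (g^{-1} dg)^{3} \right)
```

The first line is the curvature expression and the last line is the homotopy-class expression, so the equality of the two is exactly the Chern-Weil statement for this bundle.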
(Notice that there is a fun relation to baryogenesis via the chiral anomaly: if we think of the southern hemisphere of the 4-sphere as a Hartle-Hawking-like universe, then as the 3-dimensional "equator" moves away from the "big bang" south pole the integral of the Chern-Simons term increases, reflecting the baryogenesis, until it reaches the large integral value given by the instanton number as one approaches the "north pole" asymptotic region, a modern picture very different from but curiously reminiscent of Kelvin's "vortex atoms".)
There is more hidden in the story of instantons which would deserve to be amplified more widely in physics textbooks. For some more exposition see around slide 13 of a talk I gave a few weeks back, titled Higher field bundles for gauge fields.