Let's see. There are two observations one needs to make in order to "arrive" at F-theory. First, go back to type IIB string theory and take the low-energy sugra 7-brane solutions. These 7-branes have a harmonic function that depends logarithmically on the transverse distance from the brane, a feature specific to 7-branes (which are codimension two) and absent for $Dp$-branes with lower $p$. If you examine this system you will realize that there exists an $SL(2,\mathbb{Z})$ symmetry and that many of these 7-branes put together backreact to give a $\mathbb{P}^1$ background. The other observation is that this $SL(2,\mathbb{Z})$ symmetry has a nice geometric interpretation as the modular group of the torus $T^2 \cong S^1 \times S^1$ on which we compactify M-theory to get a 9-dimensional theory, namely type IIA on a circle (if I remember well), and whose zero-size limit gives back 10-dimensional type IIB. These are the two observations that lead one to consider F-theory in the first place.
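To make the first observation a bit more concrete, here is a sketch in standard conventions (the symbols $z$, $z_0$ and the normalization $\tau = C_0 + i e^{-\phi}$ are my choice of notation, not something fixed above). With $z$ a complex coordinate on the plane transverse to a single D7-brane sitting at $z_0$, the axio-dilaton behaves near the brane as

$$
\tau(z) \simeq \frac{1}{2\pi i}\,\ln(z - z_0) + \text{regular}, \qquad \tau \to \tau + 1 \ \text{ as } \ (z - z_0) \to e^{2\pi i}(z - z_0).
$$

The monodromy $\tau \to \tau + 1$ is one element of the group acting as

$$
\tau \to \frac{a\tau + b}{c\tau + d}, \qquad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{Z}).
$$

As far as I recall, each 7-brane produces a deficit angle of $\pi/6$ far away from it, so that $24$ of them give $24 \cdot \pi/6 = 4\pi$ and close the transverse plane up into a sphere, which is the $\mathbb{P}^1$.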
F-theory is nothing more than a "new" way to compactify type IIB string theory in which the complex scalar field $\tau$ (the axio-dilaton) is no longer constant. The novelty is that we can regard $\tau$ as the complex structure modulus of an auxiliary torus whose modular group is the usual $SL(2,\mathbb{Z})$ (and this "interpretation", if I am not mistaken, is the same as in, say, Seiberg-Witten theory). With the above in mind we do formally get a 12-dimensional theory, but the torus on which we compactify it is a non-physical torus: it does not have a purely geometric interpretation. Note that the dimensional reduction is not a usual KK reduction of the kind we do in, say, type IIB on $\mathbb{M}^4 \times T^6$. Note also that the low-energy limit is not given by a 12-dimensional sugra theory, since sugra can be realized only up to 11 dimensions.
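One common way to write the auxiliary torus explicitly, assuming the standard Weierstrass description (the functions $f$, $g$ and the base coordinate $z$ are notation introduced here for illustration), is

$$
y^2 = x^3 + f(z)\,x + g(z), \qquad \Delta(z) = 4 f(z)^3 + 27 g(z)^2,
$$

$$
j\big(\tau(z)\big) = 1728\,\frac{4 f(z)^3}{\Delta(z)} .
$$

Here $j$ is the modular invariant function, normalized so that $j(i) = 1728$ (other normalizations appear in the literature). Since $j$ is invariant under $SL(2,\mathbb{Z})$, this fixes $\tau(z)$ only up to the duality group, which is exactly the freedom one wants. The torus degenerates where $\Delta(z) = 0$, and these points are where the 7-branes sit.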
The above is meant to communicate, morally, that the 12-dimensional interpretation is a useful means to geometrize the $SL(2,\mathbb{Z})$ duality symmetry. Now, upon compactifying the resulting IIB theory to lower dimensions (so, after the $T^2$ F-theory compactification) we already get some remarkable results. The compactification of type IIB on the previously mentioned $\mathbb{P}^1$, which arises from the backreaction of the 7-branes, preserves half the susy. What is remarkable is that M-theory compactified on a K3 surface preserves the same amount of supersymmetry. Now things can get quite technical, but we already see some connection. If one goes further, she will see that M-theory and F-theory are related to each other after one has dualized M-theory and type IIB strings on the $\mathbb{P}^1$, and by the (conjectured) fact that F-theory on an elliptically fibered K3 is also dual to type IIB strings on $\mathbb{P}^1$. To sum up, the most useful road map I have found is the picture where F-theory on $T^2$ is dual to type IIB in 10 dimensions, which (on a circle) is T-dual to type IIA in 9 dimensions, which in turn is the M-theory compactification on $T^2$.
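The road map can be sketched as a chain (a schematic summary in my own notation, with $A$ the area of the M-theory torus and $R_A$, $R_B$ the radii of the type IIA and type IIB circles):

$$
\text{M-theory on } T^2 \;\cong\; \text{IIA on } S^1_{R_A} \;\xrightarrow{\ \text{T-duality},\ R_B = \alpha'/R_A\ }\; \text{IIB on } S^1_{R_B} \;\xrightarrow{\ A \to 0\ }\; \text{IIB in 10 dimensions}.
$$

In the limit $A \to 0$ at fixed complex structure, the type IIB circle decompactifies and the complex structure of the torus becomes the type IIB $\tau$. Applying this fiber by fiber to an elliptically fibered K3 relates M-theory on K3 (a 7-dimensional theory) to F-theory on K3 (an 8-dimensional theory), both preserving 16 of the 32 supercharges.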
I took the above notes from a graduate course I attended, taught by Inaki García-Etxebarria, who works on F-theory. A nice additional resource is of course the nLab article, as are Herman Verlinde's lectures at PiTP. Maybe Weigand's notes are also useful.