The answers to both of your questions come from wanting consistent definitions that make the notation as slick as possible while keeping it logically sound.
To see the answer to question one, look at how the derivatives act on \(\theta^2 = \theta^\alpha\theta_\alpha\). Note that the primer uses the Northwest-Southeast (NW-SE) convention for contracting undotted indices. From the definitions \(\frac{\partial}{\partial \theta_\alpha} \theta_\beta = \delta^\alpha_\beta\) and \(\frac{\partial}{\partial \theta^\alpha} \theta^\beta = \delta_\alpha^\beta\), writing \(\theta^\gamma = \epsilon^{\gamma\beta}\theta_\beta\) and \(\theta_\gamma = \epsilon_{\gamma\beta}\theta^\beta\) where needed and remembering that a derivative picks up a minus sign when it moves past a \(\theta\), we find the following:
\[\frac{\partial}{\partial \theta_\alpha} \theta^2 = \frac{\partial}{\partial \theta_\alpha} (\theta^\gamma \theta_\gamma) = \epsilon^{\gamma\beta} \delta^\alpha_\beta \theta_\gamma - \delta^\alpha_\gamma \theta^\gamma =-2 \theta^\alpha\]
\[\frac{\partial}{\partial \theta^\alpha} \theta^2 = \frac{\partial}{\partial \theta^\alpha} (\theta^\gamma \theta_\gamma) = \theta_\alpha - \theta^\gamma \epsilon_{\gamma\beta}\delta^\beta_\alpha= 2 \theta_\alpha\]
This is why we need the partial derivative to pick up a minus sign when its index is raised or lowered: only then do the right-hand sides of the two results above remain consistent with the usual raising and lowering of the index on \(\theta\).
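As a quick consistency check (a sketch, assuming the relation being asked about has the form \(\frac{\partial}{\partial \theta_\alpha} = -\epsilon^{\alpha\beta}\frac{\partial}{\partial \theta^\beta}\) and that the index on \(\theta\) is raised as \(\theta^\alpha = \epsilon^{\alpha\beta}\theta_\beta\)), apply it to the second result:
\[-\epsilon^{\alpha\beta}\frac{\partial}{\partial \theta^\beta} \theta^2 = -\epsilon^{\alpha\beta}\,(2\theta_\beta) = -2\theta^\alpha = \frac{\partial}{\partial \theta_\alpha} \theta^2\]
Without the minus sign the same steps would give \(+2\theta^\alpha\), contradicting the direct computation of \(\frac{\partial}{\partial \theta_\alpha} \theta^2\).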
As for the second question: in calculations we often don't want to write out every single index, so we adopt the NW-SE and SW-NE conventions for undotted and dotted indices respectively. By keeping these conventions throughout, we can infer where the summed-over indices go without writing them out explicitly. To do that, however, we need to fix where the indices sit on the sigma matrices. The convention used is
\[\bar\sigma^\mu\equiv {\bar{\sigma}^\mu}^{\dot\alpha \alpha}\] and \[\sigma^\mu\equiv {{\sigma}^\mu}_{\alpha \dot\alpha},\] where we can switch from one to the other by raising (or lowering) both indices: \[{\bar{\sigma}^\mu}^{\dot\alpha \alpha} = \epsilon^{\alpha\beta}\epsilon^{\dot\alpha\dot\beta}{{\sigma}^\mu}_{\beta \dot\beta} \]
Now we can hide the summed-over indices. Instead of writing \(((\sigma^{\mu})^{\alpha\dot{\gamma}}\theta^{\dagger}_{\,\,\,\dot{\gamma}})\partial_{\mu}\) with every index explicit, we write \(((\sigma^{\mu})^{\alpha\dot{\gamma}}\theta^{\dagger}_{\,\,\,\dot{\gamma}})\partial_{\mu} = (\theta^{\dagger}_{\dot\gamma}{\bar{\sigma}^{\mu}}^{\dot\gamma\alpha}) \partial_{\mu} = (\theta^{\dagger}\bar{\sigma}^{\mu})^{\alpha}\partial_{\mu}\). In the first step we used the definition above to raise both indices of \(\sigma^\mu\), which turns it into \(\bar\sigma^\mu\); in the second step we hid the summed-over index by using the SW-NE convention for dotted indices.
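If you want to convince yourself of the index bookkeeping, it can be checked numerically. The following is only a rough sketch in plain Python, under assumptions that are mine rather than stated above: the explicit matrices \(\sigma^\mu = (1, \sigma_i)\) and \(\bar\sigma^\mu = (1, -\sigma_i)\), the sign choice \(\epsilon^{12} = +1\) (it appears twice, so its overall sign drops out), and ordinary complex numbers standing in for the components of \(\theta^\dagger\). The last stand-in is harmless here because the expression is linear in \(\theta^\dagger\) and the sigma matrix entries are plain numbers. The helper names are hypothetical.

```python
# Pauli matrices as nested lists, index order [alpha][alphadot] (both lower).
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]


def neg(m):
    """Entrywise negative of a 2x2 matrix."""
    return [[-x for x in row] for row in m]


# Assumed explicit forms: sigma^mu = (1, sigma_i), sigma-bar^mu = (1, -sigma_i).
SIGMA = [I2, SX, SY, SZ]                      # sigma^mu_{alpha alphadot}
SIGMA_BAR = [I2, neg(SX), neg(SY), neg(SZ)]   # sigma-bar^mu, order [alphadot][alpha]

# Assumed sign convention: epsilon^{12} = +1, used for both index types.
EPS_UP = [[0, 1], [-1, 0]]


def raise_both(sig):
    """(sigma^mu)^{alpha gammadot} = eps^{alpha beta} eps^{gammadot betadot} sigma_{beta betadot}."""
    return [[sum(EPS_UP[a][b] * EPS_UP[c][d] * sig[b][d]
                 for b in range(2) for d in range(2))
             for c in range(2)]
            for a in range(2)]


def identity_holds(mu, tol=1e-12):
    """Check sigma-bar^{alphadot alpha} against sigma with both indices raised."""
    raised = raise_both(SIGMA[mu])
    return all(abs(raised[a][c] - SIGMA_BAR[mu][c][a]) < tol
               for a in range(2) for c in range(2))


def contraction_matches(mu, theta_dag, tol=1e-12):
    """Compare (sigma^mu)^{alpha gammadot} theta-dagger_{gammadot} with (theta-dagger sigma-bar^mu)^alpha."""
    raised = raise_both(SIGMA[mu])
    lhs = [sum(raised[a][c] * theta_dag[c] for c in range(2)) for a in range(2)]
    rhs = [sum(theta_dag[c] * SIGMA_BAR[mu][c][a] for c in range(2)) for a in range(2)]
    return all(abs(l - r) < tol for l, r in zip(lhs, rhs))


if __name__ == "__main__":
    theta_dag = [2 + 1j, -3 + 5j]  # arbitrary stand-in components
    for mu in range(4):
        print(mu, identity_holds(mu), contraction_matches(mu, theta_dag))
```

Note the transposed index order when comparing: the raised \(\sigma^\mu\) carries \((\alpha, \dot\gamma)\) while \(\bar\sigma^\mu\) carries \((\dot\gamma, \alpha)\), which is exactly why \(\theta^\dagger\) ends up on the left in the SW-NE contraction.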