We introduce a quantum algorithm for solving structured prediction problems whose runtime scales with the square root of the size of the label space, but as \(\tilde{O}(\epsilon^{-3.5})\) in the precision \(\epsilon\) of the solution. In doing so, we analyze a stochastic gradient algorithm for convex optimization in the presence of additive errors in the computed gradients, and show that its convergence rate does not deteriorate if the additive errors are of order \(O(\sqrt{\epsilon})\). Our algorithm uses quantum Gibbs sampling at temperature \(\Omega(\epsilon)\) as a subroutine. Based on these theoretical observations, we propose a method for using quantum Gibbs samplers to combine feedforward neural networks with probabilistic graphical models for quantum machine learning. Numerical results from Monte Carlo simulations on an image tagging task demonstrate the benefit of the approach.