Mini-batch gradient descent

Mini-batch gradient descent combines properties of batch gradient descent and stochastic gradient descent to balance the efficiency and accuracy of the optimization. In each iteration, the gradient is computed over a subset of the data set (a mini-batch), which may be as small as 2 or larger than 200 examples. Because each update uses only a fraction of the data, it is significantly faster than batch gradient descent; and because the examples in a mini-batch can be vectorised and processed in parallel, it can also run faster than stochastic gradient descent, which updates on a single example at a time.
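
As an illustration, the following is a minimal sketch of mini-batch gradient descent applied to linear least squares. The model, loss, hyperparameter values, and function name are illustrative assumptions, not part of the original text; the vectorised gradient on each mini-batch is what allows the speed-up described above.

```python
import numpy as np

def minibatch_gradient_descent(X, y, lr=0.01, batch_size=32, epochs=100, seed=0):
    """Minimise mean squared error of a linear model with mini-batch updates.

    Illustrative sketch: loss, learning rate, and batch size are assumptions.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        # Shuffle once per epoch so each mini-batch is a random subset.
        perm = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            # Gradient of the mean squared error, computed on the
            # mini-batch only; the matrix products are fully vectorised.
            grad = 2.0 * Xb.T @ (Xb @ w - yb) / len(idx)
            w -= lr * grad
    return w

# Example usage on synthetic data: recover w_true = [2.0, -3.0].
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 2))
y = X @ np.array([2.0, -3.0]) + 0.01 * rng.normal(size=1000)
print(minibatch_gradient_descent(X, y))
```

Setting batch_size to the full data set recovers batch gradient descent, while batch_size of 1 recovers stochastic gradient descent; the mini-batch sizes in between trade per-update cost against gradient accuracy.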