- Batch Size: Strictly speaking there is no such thing as a "batch size" hyperparameter, only a mini-batch size: the number of samples you train your model on before updating the weights and biases. In practice programmers often drop the "mini" and just say batch size, so the two terms are used interchangeably. If we want to be absolutely correct, though, the batch size is the total number of samples in the data set, which is a property of the data and not a number you specify in your program.
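As a minimal sketch of what the mini-batch size controls, assuming NumPy arrays (the helper name `minibatches` is mine, not a standard API):

```python
import numpy as np

def minibatches(X, y, minibatch_size, rng=None):
    """Yield (X_batch, y_batch) chunks of `minibatch_size` samples.
    The model's weights and biases would be updated once per
    yielded chunk; the last chunk may be smaller."""
    n = len(X)
    indices = np.arange(n)
    if rng is not None:
        rng.shuffle(indices)  # shuffle sample order each epoch
    for start in range(0, n, minibatch_size):
        batch = indices[start:start + minibatch_size]
        yield X[batch], y[batch]

# 10 samples with a mini-batch size of 4 gives 3 updates per epoch
X = np.arange(10).reshape(10, 1)
y = np.arange(10)
batches = list(minibatches(X, y, minibatch_size=4))
```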
- MSE: Mean Squared Error works best if you want to calculate the error for a regression problem, like predicting a stock price or a product's lifetime-failure function. The squared error is computed between the neural network output and the target label and then summed over all samples; note that some implementations skip the mean entirely and just report the sum of squared errors.
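A minimal sketch of both variants in NumPy (the function names `mse` and `sse` are mine):

```python
import numpy as np

def mse(predictions, targets):
    """Mean squared error: average of squared differences
    between network outputs and regression labels."""
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return np.mean((predictions - targets) ** 2)

def sse(predictions, targets):
    """Sum-of-squared-errors variant: sums over samples
    instead of averaging."""
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return np.sum((predictions - targets) ** 2)

# errors are (0, 0, 2); squared: (0, 0, 4)
loss_mean = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])  # 4/3
loss_sum = sse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])   # 4
```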
- Cross Entropy: Cross-entropy works best if you want to do classification, whether binary or multi-class. It helps by penalizing wrong class predictions. The calculation of the cross-entropy loss is fairly easy: for a single sample it is
-log(p_target_class)
. So it depends only on the probability the network assigns to the class the input actually belongs to; the errors on the other classes have no direct impact on the network loss.
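A minimal sketch of the per-sample loss above, assuming the network outputs a probability vector (e.g. after a softmax); the function name is mine:

```python
import numpy as np

def cross_entropy(probabilities, target_class):
    """Cross-entropy loss for one sample: -log of the probability
    assigned to the true class. The other classes' probabilities do
    not appear in the loss directly (only indirectly, through the
    softmax normalization that produced the vector)."""
    return -np.log(probabilities[target_class])

# a confident, correct prediction gives a small loss;
# assigning only 0.5 to the true class gives -log(0.5) = log(2)
loss = cross_entropy(np.array([0.5, 0.25, 0.25]), target_class=0)
```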
- ArgMax: ArgMax is a function used at the final output layer of the neural network. It selects the class with the maximum output value; expressed as a one-hot vector, that class is set to one and all the others to zero.
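A minimal sketch of the one-hot conversion described above (the helper name `argmax_one_hot` is mine; `np.argmax` itself only returns the winning index):

```python
import numpy as np

def argmax_one_hot(outputs):
    """Turn a raw output vector into a one-hot prediction:
    1 at the index of the maximum value, 0 everywhere else."""
    outputs = np.asarray(outputs)
    one_hot = np.zeros_like(outputs, dtype=int)
    one_hot[np.argmax(outputs)] = 1
    return one_hot

# index 1 holds the maximum, so it becomes the predicted class
prediction = argmax_one_hot([0.1, 2.5, 0.3])
```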