Is there any other way of initializing weights and biases? I am initializing them randomly.
Yes, it is common to initialize the weights randomly. However, there are different techniques for choosing the variance, e.g. Xavier initialization, He initialization, etc. (see this discussion).
It is a bit different for the biases. Unlike the weights, it is perfectly fine to initialize them with zeros. In ReLU-based networks, it is also common to use a slightly positive value so that most activations are positive, at least initially, which keeps the units active and lets the error backpropagate through them. But random initialization usually works as well.
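For illustration, here is a minimal NumPy sketch of both schemes; the function name `init_layer` and the small positive bias value are just assumptions for this example, not part of any particular library:

```python
import numpy as np

def init_layer(fan_in, fan_out, scheme="he", bias_value=0.0,
               rng=np.random.default_rng(0)):
    """Initialize one linear layer's weights and biases.

    scheme: "xavier" uses std = sqrt(2 / (fan_in + fan_out)) (Glorot normal),
            "he"     uses std = sqrt(2 / fan_in), the usual choice for ReLU.
    bias_value: 0.0 is the common default; a small positive value
                (e.g. 0.01) can keep ReLU units active early in training.
    """
    if scheme == "xavier":
        std = np.sqrt(2.0 / (fan_in + fan_out))
    elif scheme == "he":
        std = np.sqrt(2.0 / fan_in)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    W = rng.normal(0.0, std, size=(fan_in, fan_out))
    b = np.full(fan_out, bias_value)
    return W, b

# Example: a ReLU hidden layer with 784 inputs and 128 units.
W1, b1 = init_layer(784, 128, scheme="he", bias_value=0.01)
```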
Do I need to perform backprop after every forward pass, or should I take the average of the errors and update at the end of the epoch?
In the classical algorithm, yes: the idea is to assess and update the network iteratively. You can also perform both operations on a mini-batch instead of on individual instances, if that is what you describe, and this is more efficient. But it is not common to run many forward passes before a single backward pass at the end of an epoch; that would only slow down the training.
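A minimal sketch of the mini-batch variant, assuming a `params` dictionary of NumPy arrays and a hypothetical `grad_fn` that runs the forward pass and backprop on a batch and returns averaged gradients:

```python
import numpy as np

def train_epoch(X, y, params, grad_fn, lr=0.01, batch_size=32,
                rng=np.random.default_rng(0)):
    """One epoch of mini-batch gradient descent."""
    idx = rng.permutation(len(X))                      # shuffle once per epoch
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        grads = grad_fn(params, X[batch], y[batch])    # forward + backward on the mini-batch
        for k in params:                               # update right after each batch,
            params[k] -= lr * grads[k]                 # not once at the end of the epoch
    return params
```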
Do I need to use biases in the input layer?
The biases appear in the linear layers, along with the weights. The input data itself is passed to the first layer without any bias attached to it.
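A short sketch of what this means in a forward pass, assuming a simple stack of linear + ReLU layers stored as `(W, b)` pairs:

```python
import numpy as np

def forward(x, layers):
    """Forward pass through a stack of linear + ReLU layers.

    x is the raw input; it carries no bias of its own. Each layer's
    bias b is added as part of that layer's affine transform W @ a + b.
    """
    a = x
    for W, b in layers:
        a = np.maximum(0.0, a @ W + b)   # bias lives in the layer, not in the input
    return a
```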