I get two errors when training with Caffe: "misaligned address" and "out of memory". Both seem to be memory-related, because reducing the batch size makes both of them go away.
The misaligned address error seems to show up at a smaller batch size than the out of memory error. With a batch size of 32 (or more), I get the out of memory error. With a batch size of 16, I get the misaligned address error. With a batch size of 8 (or less), there is no error.
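The trial-and-error above amounts to halving the batch size until a run finishes cleanly. Here is a minimal sketch of that search. Note that `try_training` is a hypothetical stand-in for launching a Caffe run, not a Caffe API: it only replays the outcomes I observed at the batch sizes I actually tested, and it does not measure real GPU memory.

```python
def try_training(batch_size):
    """Hypothetical stand-in for one Caffe training run.

    It replays the outcomes described above for the batch sizes that
    were actually tested. Sizes between 8 and 16, or between 16 and 32,
    were never tried, so they are rejected rather than guessed.
    """
    if batch_size >= 32:
        return "out of memory"
    if batch_size == 16:
        return "misaligned address"
    if batch_size <= 8:
        return "ok"
    raise ValueError("batch size %d was not tested" % batch_size)


def largest_working_batch_size(start, run=try_training):
    """Halve the batch size until a run finishes without an error.

    Returns the first working batch size (or None) and a log of
    (batch_size, outcome) pairs for every attempt.
    """
    batch_size = start
    log = []
    while batch_size >= 1:
        outcome = run(batch_size)
        log.append((batch_size, outcome))
        if outcome == "ok":
            return batch_size, log
        batch_size //= 2
    return None, log


if __name__ == "__main__":
    best, attempts = largest_working_batch_size(64)
    for size, outcome in attempts:
        print("batch size %d: %s" % (size, outcome))
    print("largest working batch size:", best)
```

In a real setup, `run` would be replaced by something that starts training with the given batch size and reports which CUDA error, if any, came back.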
What do these two errors actually mean, and what is the difference between them?
I am using CUDA 8.0 and cuDNN 5.1.