I trained a faster_rcnn_nas model on my custom dataset (images resized to 1280x1080). My GPU is an Nvidia Quadro P5000, and I can test the model on that machine. When I test it on a GTX 1060, it crashes with an out-of-memory error. However, when I test the pre-trained faster_rcnn_nas model, it works fine.
What is the difference between the pre-trained model and my custom model? Is there any way to run my model on the 1060? Or is there a batch_size or similar parameter I can change for testing?
What I have done: I limited my GPU's available memory and found that I need at least 8 GB of GPU memory to test my model.
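For reference, here is a minimal sketch of how such a memory cap can be expressed, assuming TensorFlow 1.x and its `per_process_gpu_memory_fraction` option. The helper name `memory_fraction` is hypothetical, and the 16 GiB (Quadro P5000) and 6 GiB (GTX 1060) figures are assumed card sizes, not values measured in this setup.

```python
def memory_fraction(cap_gib, total_gib):
    """Return the fraction of a card's memory that corresponds to cap_gib.

    Hypothetical helper: TensorFlow 1.x expresses a per-process memory cap
    as a fraction of the total, not as an absolute size.
    """
    if not 0 < cap_gib <= total_gib:
        raise ValueError("cap_gib must be in (0, total_gib]")
    return cap_gib / total_gib


# Assumed figures: emulate a 6 GiB GTX 1060 on a 16 GiB Quadro P5000.
fraction = memory_fraction(6, 16)
print(fraction)  # 0.375

# With TensorFlow 1.x installed, the cap would be applied roughly like this:
#   import tensorflow as tf
#   gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=fraction)
#   sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
```

Raising the cap step by step until inference stops failing gives a rough lower bound on the memory the model needs.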
Full error:
ResourceExhaustedError: 2 root error(s) found.
  (0) Resource exhausted: OOM when allocating tensor with shape[500,4032,17,17] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
    [[{{node MaxPool2D/MaxPool-0-TransposeNHWCToNCHW-LayoutOptimizer}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
    [[SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/Sum/_275]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
  (1) Resource exhausted: OOM when allocating tensor with shape[500,4032,17,17] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
    [[{{node MaxPool2D/MaxPool-0-TransposeNHWCToNCHW-LayoutOptimizer}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
0 successful operations.
0 derived errors ignored.
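To put the failing allocation in perspective, the size of the tensor named in the error can be worked out directly from its shape. This is a rough sketch: it assumes "type float" means 32-bit floats (4 bytes per element), and the helper name `tensor_bytes` is made up for illustration. The guess that the leading dimension of 500 is a per-image proposal count is also an assumption, not something the error message states.

```python
from functools import reduce
from operator import mul


def tensor_bytes(shape, bytes_per_element=4):
    """Bytes needed for a dense tensor of the given shape (float32 by default)."""
    return reduce(mul, shape, 1) * bytes_per_element


# Shape reported in the OOM message above.
shape = (500, 4032, 17, 17)
size = tensor_bytes(shape)
print(size)                    # 2330496000
print(round(size / 2**30, 2))  # 2.17 (GiB)

# How the same allocation would shrink if the leading dimension were smaller
# (assumption: that dimension can be reduced, e.g. if it is a proposal count).
for leading in (300, 100, 50):
    smaller = tensor_bytes((leading,) + shape[1:])
    print(leading, round(smaller / 2**30, 2))
```

So this single tensor takes about 2.17 GiB, before counting the weights and every other activation that must live on the GPU at the same time.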