I think your question matches the following issue mentioned on the YOLO wiki page:
1. I get low accuracy:
....
....
4. Your training dataset isn't suitable for your Test dataset:
- Training dataset contains: cars (rear view) from distance 100m
- Test dataset contains: cars (side view) from distance 5m
If you look at the "How to improve object detection" section, you will find a general rule for preparing your training dataset
based on your detection target:
General rule - your training dataset should include such a set of
relative sizes of objects that you want to detect:
train_network_width * train_obj_width / train_image_width ~= detection_network_width * detection_obj_width / detection_image_width
train_network_height * train_obj_height / train_image_height ~= detection_network_height * detection_obj_height / detection_image_height
I.e. for each object from Test dataset there
must be at least 1 object in the Training dataset with the same
class_id and about the same relative size:
object width in percent from Training dataset ~= object width in percent from Test dataset
That is, if only objects that occupied 80-90% of the image were
present in the training set, then the trained network will not be able
to detect objects that occupy 1-10% of the image.
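To make the rule concrete, here is a minimal sketch that computes the quantity on each side of the `~=` above, i.e. the object's size in pixels once the image has been resized to the network input. The function name `size_on_network` and all the numbers are hypothetical and chosen only for illustration; they are not part of YOLO or Darknet.

```python
def size_on_network(network_dim, obj_dim, image_dim):
    """Object size in pixels after the image is resized to the network input.

    This is one side of the rule: network_dim * obj_dim / image_dim.
    """
    return network_dim * obj_dim / image_dim


# Hypothetical training sample: a 40 px wide car in a 1920 px wide image.
train_w = size_on_network(network_dim=416, obj_dim=40, image_dim=1920)

# Hypothetical detection sample: a 300 px wide car in a 1280 px wide image.
detect_w = size_on_network(network_dim=416, obj_dim=300, image_dim=1280)

print(f"train: {train_w:.1f} px, detection: {detect_w:.1f} px")
```

With these made-up numbers the two sides differ by roughly a factor of ten, which is the kind of mismatch the rule warns about. The same calculation applies to heights.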
Your query:
Does it mean I have to prepare training & validation dataset which contains same car images with multiple resolution?
Answer: No, it doesn't necessarily mean that you need the same car images at several resolutions.
Rather, the cars in your training images should appear at roughly the same relative sizes (as a fraction of the image) as the cars you want to detect with your trained model.
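One way to check this on your own data is sketched below, assuming you have already extracted a `(class_id, relative_width)` pair for each labelled object, where `relative_width` is the object width divided by the image width. The helper `uncovered` and the `tolerance` value are my own assumptions; the wiki only says "about the same relative size" and gives no threshold.

```python
def uncovered(train_objs, test_objs, tolerance=0.5):
    """Return test objects that have no training object of the same class
    whose relative width is within `tolerance` (as a fraction) of theirs.

    Each object is a (class_id, relative_width) tuple, relative_width in [0, 1].
    """
    missing = []
    for class_id, rel_w in test_objs:
        has_match = any(
            c == class_id and abs(w - rel_w) <= tolerance * rel_w
            for c, w in train_objs
        )
        if not has_match:
            missing.append((class_id, rel_w))
    return missing


# Hypothetical data: training cars fill 80-90% of the image width,
# while the test set also contains a car that fills only 5%.
train_objs = [(0, 0.80), (0, 0.90)]
test_objs = [(0, 0.05), (0, 0.85)]

missing = uncovered(train_objs, test_objs)
print(missing)
```

Any object this reports is a size the training set does not cover, so you would add training images where cars appear at about that size rather than rescaled copies of the images you already have.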