Both are possible and will give different results.
When you re-train on the full dataset, you make sure that every image carries the same importance during training.
When you fine-tune on the new images only, the model might get better on these new images (or might not), but it will probably get worse on the previous images.
Let's say your first 1000 images were images of dogs and the 2000 new images are images of cats. If you only want to detect cats, it can be good to start from your pre-trained dog model, which already recognises this type of object (animals), and fine-tune it on the cat images so that it deliberately specialises (overfits) on cats.
If you want to detect dogs and cats, I would suggest training it all together with balanced classes.
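One way to get "balanced classes" when the counts differ (1000 dogs vs 2000 cats in the example above) is to sample each image with a weight inversely proportional to its class size. Here is a minimal sketch in plain Python; the `labels` list is a stand-in for your real dataset and is an assumption on my part, not something from your setup.

```python
import random
from collections import Counter

# Stand-in labels for the example: 1000 dog images and 2000 cat images.
labels = ["dog"] * 1000 + ["cat"] * 2000

# Weight each image by 1 / (size of its class), so that both classes
# end up with the same total sampling weight.
counts = Counter(labels)
weights = [1.0 / counts[label] for label in labels]

# Total weight per class (both come out equal, up to float rounding).
class_weight = {
    c: sum(w for w, l in zip(weights, labels) if l == c) for c in counts
}

# Draw image indices with replacement using these weights.
rng = random.Random(0)
drawn = rng.choices(range(len(labels)), weights=weights, k=6000)
drawn_counts = Counter(labels[i] for i in drawn)
print(class_weight, drawn_counts)
```

If you train with PyTorch, the same `weights` list could be handed to `torch.utils.data.WeightedRandomSampler`; other frameworks usually offer a class-weight option in the loss instead.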
Basically, fine-tuning will speed up learning on the 2000 new images, but the model might "forget" information it had learned from the first 1000 images.
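To make the trade-off concrete, here is a toy sketch, not a real image model: a one-parameter linear model `y = w * x` trained with plain SGD. The two made-up datasets deliberately disagree (slope 2 for the "old" data, slope 3 for the "new" data) so that the forgetting effect is visible; all names and numbers are illustrative assumptions.

```python
def sgd_fit(w, data, lr=0.1, epochs=50):
    """Run plain SGD on squared error for the model y = w * x."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # d/dw of (w*x - y)^2
            w -= lr * grad
    return w


def mse(w, data):
    """Mean squared error of y = w * x on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


# Toy "old" and "new" datasets that want different weights.
old_data = [(k / 10, 2.0 * k / 10) for k in range(1, 11)]
new_data = [(k / 10, 3.0 * k / 10) for k in range(1, 11)]

w_old = sgd_fit(0.0, old_data)               # initial training
w_finetuned = sgd_fit(w_old, new_data)       # fine-tune on new data only
w_joint = sgd_fit(0.0, old_data + new_data)  # re-train on everything

for name, w in [("old", w_old), ("fine-tuned", w_finetuned), ("joint", w_joint)]:
    print(name, "old-data error:", mse(w, old_data),
          "new-data error:", mse(w, new_data))
```

In this toy setup the fine-tuned weight fits the new data best but loses accuracy on the old data, while joint training lands in between on both. A real network has far more capacity, so the effect is usually less absolute, but the direction is the same.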