I am working on a scene recognition problem using a bag-of-visual-words model. Below is the code I adapted from the Internet. The training dataset has 5 classes with 100 images each, and the random test dataset has 5000 images. I understand that I should build the vocabulary from the training set. But should I also build a vocabulary from the test dataset?
FEATURE = 'bag of sift';
CLASSIFIER = 'support vector machine';
categories = {'shopping', 'office', 'eating', 'chatting', 'biking'};
num_train_per_cat = 100;
vocab_size = 200;
% YOUR CODE FOR build_vocabulary.m
vocab = build_vocabulary(train_image_paths, vocab_size);
% YOUR CODE FOR get_bags_of_sifts.m
fprintf('Computing training features\n');
train_image_feats = get_bags_of_sifts(train_image_paths,vocab);
save('train_bag.mat', 'train_image_feats');
fprintf('Computing test features\n');
test_image_feats = get_bags_of_sifts(test_image_paths,vocab);
% YOUR CODE FOR svm_classify.m
test_image_feats_mat = cell2mat(test_image_feats);
test_image_feats = vl_svmdataset(test_image_feats_mat);
predicted_categories = svm_classify(train_image_feats,train_labels, test_image_feats)
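
For reference, here is a minimal sketch of the quantization step that a function like get_bags_of_sifts typically performs. It is not the actual get_bags_of_sifts.m: the helper name bag_of_words_histogram and its inputs (descriptors as an N-by-D matrix, vocab as a K-by-D matrix of cluster centres) are assumptions made for illustration. The vocabulary is only an input here, so the same vocab can be passed in for training and test images alike.

```matlab
% bag_of_words_histogram.m
% Hypothetical helper for illustration; not the real get_bags_of_sifts.m.
%   descriptors : N-by-D matrix, one local descriptor (e.g. SIFT) per row
%   vocab       : K-by-D matrix, one visual word (cluster centre) per row
%   h           : 1-by-K normalized histogram of visual-word counts
function h = bag_of_words_histogram(descriptors, vocab)
    K = size(vocab, 1);

    % Squared Euclidean distance between every descriptor and every word,
    % using ||a - b||^2 = ||a||^2 + ||b||^2 - 2*a*b'. Result is N-by-K.
    d2 = bsxfun(@plus, sum(descriptors.^2, 2), sum(vocab.^2, 2)') ...
         - 2 * (descriptors * vocab');

    % Index of the nearest visual word for each descriptor (N-by-1).
    [~, nearest] = min(d2, [], 2);

    % Count how many descriptors fall on each of the K words.
    h = accumarray(nearest, 1, [K 1])';

    % Normalize so images with different descriptor counts are comparable.
    h = h / max(sum(h), 1);
end
```

With a toy vocabulary of two words, bag_of_words_histogram([1 0; 0 1; 9 10], [0 0; 10 10]) assigns the first two descriptors to the first word and the third to the second word, giving [2/3 1/3].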