Questions tagged [vlad-vector]

The VLAD vector (Vector of Locally Aggregated Descriptors) is a descriptor vector which is often used in image retrieval, e.g. to aggregate SIFT descriptors. Use this tag for programming questions related to the VLAD vector.

The VLAD vector was introduced in 2010 (see the link below) and is widely used in image retrieval applications to aggregate local descriptors, e.g. SIFT descriptors. This tag should be used for all questions concerning the implementation and usage of VLAD vectors.

A typical pipeline with the VLAD vector consists of the following steps.

  1. Extract local descriptors from one or more images, usually SIFT descriptors.
  2. Create a codebook of visual words, typically with k-means.
  3. For each cluster in the codebook, compute and accumulate the residuals between the descriptors assigned to that cluster and the cluster center.
  4. Stack the accumulated residuals to form the VLAD vector.

Then, the following two post-processing steps are typically performed (a minimal Python sketch of the full pipeline follows this list):

  1. Normalize the VLAD vector, e.g. using the L2 norm.
  2. Reduce the dimensionality, e.g. using PCA.
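
A minimal end-to-end sketch of this pipeline in Python (NumPy and scikit-learn), assuming the local descriptors have already been extracted; the random arrays below only stand in for real SIFT descriptors:

    # Minimal VLAD sketch: codebook, residual accumulation, stacking,
    # L2 normalization and PCA reduction.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.decomposition import PCA

    def build_codebook(all_descriptors, k=64):
        """Step 2: learn a codebook of k visual words with k-means."""
        return KMeans(n_clusters=k, n_init=10, random_state=0).fit(all_descriptors)

    def vlad(descriptors, kmeans):
        """Steps 3-4: accumulate residuals per cluster and stack them."""
        k, d = kmeans.cluster_centers_.shape
        assignments = kmeans.predict(descriptors)      # nearest center per descriptor
        v = np.zeros((k, d), dtype=np.float32)
        for i in range(k):
            members = descriptors[assignments == i]
            if len(members):
                v[i] = (members - kmeans.cluster_centers_[i]).sum(axis=0)
        v = v.ravel()                                  # k*d-dimensional VLAD vector
        return v / (np.linalg.norm(v) + 1e-12)         # post-processing step 1: L2 norm

    # Usage with random data standing in for 128-D SIFT descriptors:
    descs_per_image = [np.random.rand(200, 128).astype(np.float32) for _ in range(10)]
    codebook = build_codebook(np.vstack(descs_per_image), k=8)
    vlads = np.stack([vlad(d, codebook) for d in descs_per_image])
    vlads_reduced = PCA(n_components=5).fit_transform(vlads)   # post-processing step 2: PCA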

Links

  • Implementation of the VLAD vector in the open-source library VLFeat.

  • Original Paper: Jégou, H., Douze, M., Schmid, C., & Pérez, P. (2010). Aggregating local descriptors into a compact image representation. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 3304–3311.

12 questions

4 votes, 0 answers

Defining a threshold for feature matching in geometrical re-ranking

I'm implementing a cache for virtual reality applications: given an input image query, return the result associated with the most visually similar cached image (i.e. a previously processed query) if the distance between the query representation and the…

3 votes, 2 answers

Vectorise VLAD computation in numpy

I was wondering whether it was possible to vectorise this implementation of VLAD computation. For context: feats = numpy array of shape (T, N, F); kmeans = KMeans object from scikit-learn initialised with K clusters. Current method: k =…
ashnair1
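
The excerpt above is truncated; as a hedged sketch of one way a batched VLAD computation can be vectorised with a fitted scikit-learn KMeans object (only the feats shape (T, N, F) and the use of KMeans come from the question, the rest is an assumption):

    # Hedged sketch: vectorised VLAD for T images with N descriptors of dimension F
    # each, given a fitted scikit-learn KMeans with K clusters.
    import numpy as np
    from sklearn.cluster import KMeans

    def vlad_batch(feats, kmeans):
        T, N, F = feats.shape
        K = kmeans.n_clusters
        labels = kmeans.predict(feats.reshape(-1, F)).reshape(T, N)
        one_hot = np.eye(K, dtype=feats.dtype)[labels]              # (T, N, K)
        sums = np.einsum('tnk,tnf->tkf', one_hot, feats)            # per-cluster descriptor sums
        counts = one_hot.sum(axis=1)                                # (T, K) descriptors per cluster
        vlads = sums - counts[..., None] * kmeans.cluster_centers_  # residual accumulation
        vlads = vlads.reshape(T, K * F)
        return vlads / (np.linalg.norm(vlads, axis=1, keepdims=True) + 1e-12)

    # Example with random data:
    feats = np.random.rand(4, 100, 16).astype(np.float32)           # T=4, N=100, F=16
    km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(feats.reshape(-1, 16))
    print(vlad_batch(feats, km).shape)                               # (4, 128)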

2 votes, 1 answer

Extracting VLAD from SIFT Descriptors in VLFeat with Matlab

I have a folder of images. I want to compute VLAD features from each image. I loop over each image, load it, and obtain the SIFT descriptors as follows: repo = '/media/data/images/'; filelist = dir([repo '*.jpg']); sift_descr = {} for i =…
Chris Parry

1 vote, 1 answer

Why should we use bag of visual words (or VLAD) instead of storing descriptors?

I have read a lot about image encoding techniques, e.g. Bag of Visual Words, VLAD or Fisher Vectors. However, I have a very basic question: we know that we can perform descriptor matching (brute force or by exploiting ANN techniques). My question…

1 vote, 1 answer

Memory Limitation when Extracting VLAD from SIFT Descriptors in VLFeat with Matlab

I recently asked how to extract VLAD from SIFT descriptors in VLFeat with Matlab here. However, I am running up against memory limitations. I have 64GB RAM and 64GB Swap. all_descr = single([sift_descr{:}]); ... produces a memory…
Chris Parry

1 vote, 3 answers

Can CNNs be faster than classic descriptors?

Disclaimer: I know almost nothing about CNNs and I have no idea where I could ask this. My research is focused on high-performance computer vision applications. We generate codes representing an image in less than 20 ms on images with the…
justHelloWorld

0 votes, 1 answer

CUDA out of memory error when trying to implement Triplet Loss

I'm trying to implement triplet loss for a NetVLAD layer, by using 3 different images from my dataloader as follows: a - batch with minimal augmentation; p - same batch with more augmentations; n - different batch from the same dataloader. However, when…
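
The excerpt is cut off before the actual error; as a hedged, simplified sketch of a triplet-loss step over anchor/positive/negative batches in PyTorch (the dummy model and tensor shapes below are assumptions, not the NetVLAD setup from the question):

    # Hedged sketch: triplet loss over three embedded batches (anchor a, positive p,
    # negative n) using PyTorch's built-in TripletMarginLoss.
    import torch
    import torch.nn as nn

    criterion = nn.TripletMarginLoss(margin=0.1, p=2)

    def triplet_step(model, a, p, n, optimizer):
        optimizer.zero_grad()
        loss = criterion(model(a), model(p), model(n))   # embed all three batches
        loss.backward()
        optimizer.step()
        return loss.item()

    # Example with a dummy embedding model and random data:
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))
    opt = torch.optim.SGD(model.parameters(), lr=1e-3)
    a, p, n = (torch.randn(8, 3, 32, 32) for _ in range(3))
    print(triplet_step(model, a, p, n, opt))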

0 votes, 1 answer

How can I use a model of format .mat in PyTorch?

I've downloaded a trained NetVLAD model from https://www.di.ens.fr/willow/research/netvlad/. However, the model was trained in MATLAB and is saved as a .mat file. How can I load this in PyTorch? I was able to access the weights using the following code,…
Shania F.
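
As a hedged sketch of how a MATLAB .mat file can be inspected from Python before porting the weights (the filename below is a placeholder, and whether the file loads with scipy or needs h5py depends on the .mat version it was saved with):

    # Hedged sketch: reading a .mat file from Python. Older .mat versions load with
    # scipy.io.loadmat; files saved with -v7.3 are HDF5 containers and need h5py.
    # 'netvlad_model.mat' is a placeholder, not the actual download name.
    import scipy.io

    try:
        data = scipy.io.loadmat('netvlad_model.mat')
        weights = {k: v for k, v in data.items() if not k.startswith('__')}
    except NotImplementedError:   # loadmat raises this for -v7.3 (HDF5) files
        import h5py
        with h5py.File('netvlad_model.mat', 'r') as f:
            # MATLAB structs appear as HDF5 groups; this only pulls top-level datasets.
            weights = {k: f[k][()] for k in f.keys() if isinstance(f[k], h5py.Dataset)}

    # The numeric arrays can then be converted with torch.from_numpy(...) and copied
    # into a matching PyTorch module's state_dict by hand.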

0 votes, 1 answer

Power normalization step for VLAD vector representation

I am doing a Power normalization step for VLAD vector representation v. The un-normalized VLAD vector for an image in my experiment is of 8192x1 dimension [Considering 128-D SIFT descriptors, and K (centroids) = 64]. Power-law normalization…
Alastair_V
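
Power-law normalization is usually applied element-wise before the final L2 normalization; a minimal sketch, assuming the common signed-power form with exponent alpha (alpha = 0.5 gives the signed square root):

    # Hedged sketch: power-law (signed power) normalization of a VLAD vector v,
    # v_i := sign(v_i) * |v_i|**alpha, followed by L2 re-normalization.
    import numpy as np

    def power_normalize(v, alpha=0.5):
        v = np.sign(v) * np.abs(v) ** alpha
        return v / (np.linalg.norm(v) + 1e-12)

    v = np.random.randn(8192).astype(np.float32)   # e.g. K=64 clusters x 128-D SIFT
    v_pn = power_normalize(v)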

0 votes, 1 answer

An 8192-dimensional VLAD vector takes 32KB of memory per image. How?

I have a simple question concerning the VLAD vector representation. How is it that an 8192-dimensional (k=64, 128-D SIFT) VLAD vector takes '32KB of memory' per image? I could not relate these two numbers.
Alastair_V
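
The figure follows from the storage format rather than from VLAD itself: assuming each dimension is stored as a 32-bit float (4 bytes), 8192 dimensions come to 32 KB:

    # Worked arithmetic, assuming 32-bit floats (4 bytes per dimension):
    dims, bytes_per_float = 8192, 4
    print(dims * bytes_per_float)   # 32768 bytes = 32 KB per image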

0 votes, 1 answer

TensorFlow error "unable to get element from the feed as bytes" when using ActionVLAD

I installed TensorFlow r0.12 using Anaconda and executed the run.sh file from the action detection algorithm ActionVLAD. Then I got this error traceback: tensorflow.python.framework.errors_impl.InternalError: Unable to get element from the feed as…

0 votes, 1 answer

VLFeat: ValueError for certain number of clusters in vl_kmeans

I have an array of size 301 x 4096, for which I want to calculate the VLAD vector. I tried to do the quantization using center, assignments = vlfeat.vl_kmeans(data,8) but this returns ValueError: too many values to unpack. If I change the number of…
ytrewq