Questions tagged [feature-extraction]

In pattern recognition and image processing, feature extraction is a special form of dimensionality reduction: it transforms the input data into a set of features. If the features are chosen carefully, the feature set is expected to capture the information relevant to the desired task, so that the task can be performed on this reduced representation instead of the full-size input.

Feature extraction involves reducing the amount of resources required to describe a large set of data accurately. When analyzing complex data, one of the major problems stems from the number of variables involved. Analysis with a large number of variables generally requires a large amount of memory and computation power, and it may cause a classification algorithm to overfit the training sample and generalize poorly to new samples. Feature extraction is a general term for methods of constructing combinations of the variables to get around these problems while still describing the data with sufficient accuracy.
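
As an illustration of "constructing combinations of the variables", here is a minimal sketch that reduces three correlated variables to a single feature by projecting onto the first principal component. The technique (PCA via power iteration), the function names, and the toy data are assumptions chosen for the example; the tag description does not prescribe any particular method.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (mean, direction): the column means and the unit vector
    along which the centered data varies the most."""
    n, d = len(rows), len(rows[0])
    mean = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in rows]

    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    # Power iteration: repeatedly multiply by the covariance matrix and
    # renormalize. Assumes the start vector is not orthogonal to the
    # leading eigenvector.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # every variable is constant, nothing to extract
            break
        v = [x / norm for x in w]
    return mean, v


def extract_feature(rows, mean, direction):
    """Project each centered row onto the direction: one number per sample."""
    return [sum((row[j] - mean[j]) * direction[j] for j in range(len(mean)))
            for row in rows]


# Toy data (made up): three variables that all follow one hidden factor t.
rows = [[t, 2.0 * t, -t] for t in (0.0, 1.0, 2.0, 3.0, 4.0)]
mean, direction = first_principal_component(rows)
features = extract_feature(rows, mean, direction)
print(direction)
print(features)
```

In this toy case the single extracted feature loses no information, because the data is exactly one-dimensional. With real data a projection like this keeps only part of the variance, and in practice a library implementation would normally be used instead of hand-written power iteration.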

The best results are achieved when an expert constructs a set of application-dependent features. Nevertheless, if no such expert knowledge is available, general dimensionality reduction techniques may help.

Source: Wikipedia

1664 questions
8
votes
4 answers

How to use SIFT/SURF as features for a machine learning algorithm?

I'm working on an automatic image annotation problem in which I'm trying to associate tags with images. For that I'm trying to use SIFT features for learning. But the problem is that all the SIFT features are a set of keypoints, each of which has a 2-D array,…
8
votes
2 answers

Feature Extraction with Javascript

I am wondering whether there is any open source or free library for Image feature extraction with Javascript? I am developing an app where I need to use an algorithm like SIFT. It is tough to implement in JS, and I couldn't find a good SIFT…
Keshan
7
votes
2 answers

What's a simple and efficient method for extracting line segments from a simple 2D image?

Specifically, I'm trying to extract all of the relevant line segments from screenshots of the game 'asteroids'. I've looked through the various methods for edge detection, but none seem to fit my problem for two reasons: They detect smooth…
7
votes
0 answers

Meaning of algorithm, trees, and checks during Flann based feature matching (OpenCV, python)

I am currently testing Flann feature matching with OpenCV in python, and do not fully understand what some of the parameters actually do. Here is a snippet of code copied from the OpenCV docs. The full code can be found here. # FLANN…
7
votes
1 answer

Why do Mel-filterbank energies outperform MFCCs for speech commands recognition using CNN?

Last month, a user called @jojek told me in a comment the following advice: I can bet that given enough data, CNN on Mel energies will outperform MFCCs. You should try it. It makes more sense to do convolution on Mel spectrogram rather than on…
7
votes
1 answer

How to extract features from FFT?

I am gathering data from X, Y and Z accelerometer sensors sampled at 200 Hz. The 3 axes are combined into a single signal called 'XYZ_Acc'. I followed tutorials on how to transform a time domain signal into the frequency domain using scipy fftpack…
jsammut
7
votes
1 answer

need normalization before SelectKBest in python

I need to select some features from dataset for a regression task. But the numerical values are from different ranges. from sklearn.datasets import load_boston from sklearn.feature_selection import SelectKBest, f_regression X, y =…
user3104352
7
votes
1 answer

comparing HOG feature vectors without SVM

I am relatively new to computer vision and am currently doing a learning project on shape detection, where I have a fixed region of interest (ROI) in all the images where the object is most likely present, and I have to compare their shapes to…
7
votes
1 answer

ValueError: Shape must be rank 1 but is rank 0 for 'ROIAlign/Crop' (op: 'CropAndResize') with input shapes: [2,360,475,3], [1,4], [], [2]

I tried to give all the inputs to this function, but it produces the error below. I am not sure what the empty [] is. There are 2 images in RGB and the original code is from…
Go Go Gadget 2
7
votes
3 answers

How to handle unseen categorical values in test data set using python?

Suppose I have a location feature. In the train data set its unique values are 'NewYork', 'Chicago'. But in the test set it has 'NewYork', 'Chicago', 'London'. So while creating the one-hot encoding, how do I ignore 'London'? In other words, How not to encode the…
7
votes
1 answer

ValueError: After pruning, no terms remain. Try a lower min_df or a higher max_df

from sklearn.feature_extraction.text import TfidfVectorizer tfidf_vectorizer = TfidfVectorizer(max_df=0.95, max_features=200000, min_df=.5, stop_words='english', …
Jeet Dadhich
7
votes
4 answers

Template Matching for Coins with OpenCV

I am undertaking a project that will automatically count values of coins from an input image. So far I have segmented the coins using some pre-processing with edge detection and using the Hough-Transform. My question is how do I proceed from here? I…
Leo
7
votes
1 answer

How to classify URLs? what are URLs features? How to select and Extract features from URL

I have just started to work on a classification problem. It's a two-class problem: my trained machine learning model will have to decide whether to allow a URL or block it. My question is very specific. How to classify URLs? Should I use…
7
votes
5 answers

How to approach Machine Learning problems with dynamically sized input collection?

I'm approaching a problem trying to classify a data sample as good or bad quality with machine learning. The data sample is stored in a relational database. A sample contains the attributes id, name, number of up-votes (for good/bad quality…
7
votes
1 answer

optimum hessian threshold for SURF feature extraction in opencv + Minimum descriptors matching

Currently I am working on a face recognition project where I am using Fisherfaces/LDA to filter out the images on a broader level and then using SURF to verify the output from LDA. What would be a good Hessian threshold which should be passed to…