Questions tagged [feature-extraction]

In pattern recognition and image processing, feature extraction is a special form of dimensionality reduction: the input data are transformed into a set of features. If the features are chosen carefully, the feature set is expected to capture the information relevant to the desired task, so that the task can be performed on this reduced representation instead of the full-size input.

Feature extraction reduces the amount of resources required to describe a large set of data accurately. When analysing complex data, one of the major problems stems from the number of variables involved. Analysis with a large number of variables generally requires a large amount of memory and computation power, and it may cause a classification algorithm to overfit the training sample and generalize poorly to new samples. Feature extraction is a general term for methods that construct combinations of the variables to get around these problems while still describing the data with sufficient accuracy.

The best results are achieved when an expert constructs a set of application-dependent features. If no such expert knowledge is available, general dimensionality reduction techniques may help.

Source: Wikipedia
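
As a rough illustration of the tag description above, the sketch below reduces each window of raw samples to a short, fixed-length feature vector (mean, standard deviation, minimum, maximum). The function names, the choice of summary statistics, and the sample data are assumptions made for the example; they do not come from any of the questions listed here.

```python
import statistics


def extract_features(samples):
    """Reduce a sequence of raw samples to a 4-element feature vector."""
    return [
        statistics.fmean(samples),   # average level
        statistics.pstdev(samples),  # spread (population standard deviation)
        min(samples),
        max(samples),
    ]


def sliding_window_features(signal, window, step):
    """Yield one feature vector per window of `window` samples, advancing by `step`."""
    for start in range(0, len(signal) - window + 1, step):
        yield extract_features(signal[start:start + window])


# Hypothetical raw signal: 8 samples become 3 windows of 4 features each.
raw = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
features = list(sliding_window_features(raw, window=4, step=2))
```

Hand-picked statistics like these stand in for the "application-dependent features" the description mentions; a general technique such as PCA could be substituted when no domain knowledge is available.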

1664 questions
0
votes
1 answer

classification of data where attribute values are strings

I have a labeled data set with 7 attributes and about 80,000 rows. However, 3 of these attributes contain more than 50% missing data. I filtered the data to ignore rows with any null values, which left me with about 30,000 rows of complete data. The…
0
votes
2 answers

scikit-learn multi dimensional features

I have a question concerning scikit-learn. Is it possible to merge a multi-dimensional feature list into one feature vector? For example: I have results from an application analysis and I would like to represent an application with one feature…
0
votes
1 answer

Fast way to calculate feature on sliding window of huge csv file

I am a newbie in Python. I have a huge CSV file of 18 GB and 48 million records. Each record is a 37-dimensional vector, recorded at ~1700 Hz. What I am trying to do is apply a sliding window over it using this approach. And for each window I am…
Muaz
0
votes
1 answer

Combinations of features using Python NumPy

For an assignment I have to use different combinations of features belonging to some data, to evaluate a classification system. By features I mean measurements, e.g. height, weight, age, income. So for instance I want to see how well a classifier…
quantum285
0
votes
1 answer

Multiple features into one using Pipeline and featureUnion from Python Scikit-learn

I would like to train a model to predict the gender of a person. I have two features, 'name' and 'randint', each coming from a different Pandas column. I am trying to 1) combine them into a pipeline/FeatureUnion and 2) add the predicted label onto…
0
votes
2 answers

How to get scale, rotation & translation after feature tracking?

I have implemented a Kanade–Lucas–Tomasi feature tracker. I have used it on two images that show the same scene, but the camera moved a bit between taking the pictures. As a result I get the coordinates of the features. For example: 1.…
user5345663
0
votes
2 answers

Extract BRIEF features with mexopencv

I am trying to extract binary features in Matlab with mexopencv. If I use ORB as a detector and extractor everything works fine. The problem is when I try to use BRIEF extractor. This is the code I am using: detector =…
user1805638
0
votes
1 answer

Sort Extracted Data Based On Image Region

I have analysed tree core images through the raster package in an attempt to perform image analysis. In the image: http://dx.doi.org/10.6084/m9.figshare.1555854 You can see the measured "vessels" (black and numbered) and also annual lines (red)…
Darren468
0
votes
1 answer

How to do data dimensionality reduction?

I have a set of 25 images with the label 'Infected' and 25 images with the label 'Normal'. I am trying to extract the dual-tree complex wavelet transform based coefficients as features for each of the images. My code to obtain coefficients using DT-CWT is as…
vishnu
0
votes
1 answer

Opencv 2.4.11 Java: Drawing lines from center of mass to edge of contour

Basically, I have a binary image that contains an (irregular) object. I applied the contours and moments functions to find the center of mass and detect the object in this image. What I want to do now is to generate lines (at different…
Shaman
0
votes
1 answer

Looking for a saliency map detection code for video processing

I am looking for code or an application that can extract the salient object from a video considering both context and motion, or an algorithm just for motion saliency map detection (motion contrast) so I can fuse it with a context_aware…
0
votes
1 answer

How to determine whether 2 code snippets are functionally same?

Given 2 code snippets, I want to check whether they are functionally similar or not. By functional similarity I mean that they should yield the same output when provided with the same input. I am extracting a feature set from a given code snippet using:…
0
votes
0 answers

How to extract number from a detected image

I want to use OpenCV to extract the number written on a speed sign. Since the detected sign will be just an image, how can I extract the number written on it using OpenCV? Note: I know how to do image matching, but I do…
0
votes
2 answers

Mathematical representation of a set of points in N dimensional space?

Given some x data points in an N-dimensional space, I am trying to find a fixed-length representation that could describe any subset s of those x points. For example, the mean of the subset s could describe that subset, but it is not unique for that…
0
votes
1 answer

Best Features for Term Level Clustering

At the moment, I am working on a project related to mining Twitter data. The aim of the project is to find the themes that can be used to represent the set of tweets. To help us find the themes, we came up with an idea to do term level…
bohr