Questions tagged [feature-extraction]

In pattern recognition and image processing, feature extraction is a special form of dimensionality reduction: it transforms the input data into a set of features. If the features are carefully chosen, the feature set is expected to capture the relevant information from the input data, so that the desired task can be performed using this reduced representation instead of the full-size input.
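
As a rough illustration of transforming input data into a reduced set of features, the sketch below collapses a 1-D signal into four summary values. The function name `extract_features`, the choice of features, and the sample signal are assumptions made for this example; they are not taken from the tag description.

```python
import math

def extract_features(signal):
    """Reduce a 1-D signal (list of floats) to a few summary features."""
    n = len(signal)
    mean = sum(signal) / n
    # Population variance around the mean
    variance = sum((x - mean) ** 2 for x in signal) / n
    # Root-mean-square amplitude
    rms = math.sqrt(sum(x * x for x in signal) / n)
    # Number of sign changes between consecutive samples
    zero_crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0)
    )
    return {
        "mean": mean,
        "variance": variance,
        "rms": rms,
        "zero_crossings": zero_crossings,
    }

# Hypothetical input: eight raw samples become four features
signal = [0.5, 1.0, -0.5, -1.0, 0.5, 1.0, -0.5, -1.0]
features = extract_features(signal)
print(features)
```

Which features are worth computing depends on the task; these four are only placeholders for whatever domain knowledge suggests.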

Feature extraction involves reducing the amount of resources required to describe a large set of data accurately. When analyzing complex data, one of the major problems stems from the number of variables involved. Analysis with a large number of variables generally requires a large amount of memory and computation power, or it may cause a classification algorithm to overfit the training sample and generalize poorly to new samples. Feature extraction is a general term for methods of constructing combinations of the variables to get around these problems while still describing the data with sufficient accuracy.

The best results are achieved when an expert constructs a set of application-dependent features. Nevertheless, if no such expert knowledge is available, general dimensionality reduction techniques may help.
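
One commonly used general dimensionality reduction technique is principal component analysis (PCA). The tag description does not name it, so it is used here only as an illustration. The minimal sketch below finds the first principal component by power iteration and projects each row onto it, assuming the data has nonzero variance. The helper names and the toy data are hypothetical.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (column means, unit direction) of the first principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d)
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeated multiplication by cov converges to its
    # dominant eigenvector (assumes the data has nonzero variance)
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v

def project(rows, means, direction):
    """Map each row to its coordinate along the given direction."""
    return [
        sum((r[j] - means[j]) * direction[j] for j in range(len(direction)))
        for r in rows
    ]

# Hypothetical data: two strongly correlated variables reduced to one feature
data = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0], [5.0, 10.1]]
means, direction = first_principal_component(data)
reduced = project(data, means, direction)
print(direction, reduced)
```

In practice a library implementation would normally be preferred; the pure-Python version is only meant to show the combination of variables being constructed.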

Source: Wikipedia

1664 questions
6 votes, 3 answers

computer vision: extracting info about a shape given a contour (e.g. pointy, round...)

Given the 2D contour of a shape in the form of lines and vertices, how can I extract information from it, such as: pointy, round, straight line, or shape similarities with a given shape? Code is not necessary, I am more interested in concepts and the…
AndreasT
6 votes, 3 answers

Android AudioRecord and MediaRecorder

I'm developing an audio processing application where I need to record audio and then process it to obtain features of that recording. However, I want the audio in a playable format so that I can play it afterwards with MediaPlayer. I've seen that to record audio…
5 votes, 3 answers

PixelLib not detecting objects properly

Libraries I'm using: import pixellib from pixellib.instance import instance_segmentation import cv2 import matplotlib.pyplot as plt The script: segment_image = instance_segmentation() segment_image.load_model('mask_rcnn_coco.h5') segmask, output =…
5 votes, 1 answer

Transformers for mixed data types

I'm having trouble applying different transformers at once to columns with different types (text vs numerical), and combining those transformers into a single one for later use. I tried to follow the steps in the documentation for Column…
kilgoretrout
5 votes, 3 answers

Detect dotted (broken) lines only in an image using OpenCV

I am trying to learn image feature detection techniques. I have managed to detect a horizontal (unbroken/continuous) line; however, I am having trouble detecting all the dotted/broken lines in an image. Here is my test image, as you can see there…
5 votes, 1 answer

Sound feature attributeError: 'rmse'

In using librosa.feature.rmse for sound feature extraction, I have the following: import librosa import numpy as np wav_file = "C://TEM//tem//CantinaBand3.wav" y, sr = librosa.load(wav_file) chroma_stft = librosa.feature.chroma_stft(y=y,…
Mark K
5 votes, 1 answer

How to do mean(target) encoding in pyspark

I need to apply mean (target) encoding to all categorical columns in my dataset. To simplify the problem, let's say there are 2 columns in my dataset: the first column is the label column and the second is a categorical column, e.g. label | cate1 …
Alain ux
5 votes, 2 answers

Bag of Words (BOW) vs N-gram (sklearn CountVectorizer) - text documents classification

As far as I know, in the Bag of Words method, features are a set of words and their frequency counts in a document. On the other hand, N-grams, for example unigrams, do exactly the same, but they do not take into consideration the frequency of occurrence…
5 votes, 1 answer

How to extract these 6 symbols (signatures) from paper (opencv)

I have an image: and I'm trying to extract the signs one by one. I tried findContours() but I got a lot of internal contours. Is there any way to do this?
leosouzabh
5 votes, 1 answer

How to count non-alphanumeric characters in a pandas DataFrame

Here's my data:
No | Body
1 | DaTa, Analytics 2
2 | StackOver. 67%
Here's my expected output:
No | Body | Non Alphanumeric
1 | DaTa, Analytics 2 | 1
2 | StackOver. 67% | 2
I only count non-alphanumeric characters like ! @ # & (…
Nabih Bawazir
5 votes, 1 answer

Feature matching with flann in opencv

I am working on an image search project for which I have defined/extracted the key point features using my own algorithm. Initially I extracted only a single feature and tried to match using cv2.FlannBasedMatcher(), and it worked fine, which I have…
flamelite
5 votes, 2 answers

Why does the local_binary_pattern function in scikit-image provide the same value for different patterns?

I am using the local_binary_pattern function in the scikit-image package. I would like to compute the rotation invariant uniform LBP of 8 neighbors within radius 1. Here is my Python code: import numpy as np from skimage.feature import…
Peter
5 votes, 2 answers

Merging CountVectorizer in Scikit-Learn feature extraction

I am new to scikit-learn and need some help with something that I have been working on. I am trying to classify two types of documents (say, type A and type B) using Multinomial Naive Bayes classification. In order to get the term counts for these…
Archit Shukla
5 votes, 2 answers

Extract folder name and filename from FilePath using Scala

I have streams of files being read from a directory and the filetree is of the…
Taiwotman
5 votes, 2 answers

Extracting Product Attribute/Features from text

I've been assigned a task to extract features/attributes from product descriptions, e.g. "Levi Strauss slim fit jeans" and "Big shopping bag in pink and gold". I need to be able to extract attributes such as "Jeans" and "slim fit" or "shopping bag" and…