Questions tagged [multiclass-classification]

776 questions
2
votes
1 answer

Interpreting AUC, accuracy and f1-score on the unbalanced dataset

I am trying to understand how AUC is a better metric than classification accuracy when the dataset is unbalanced. Suppose a dataset contains 1000 examples of 3 classes as follows: a = [[1.0, 0, 0]]*950 + [[0, 1.0, 0]]*30 + [[0, 0,…
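A minimal sketch of why plain accuracy can mislead on data like this. It assumes the truncated third class holds the remaining 20 of the 1000 examples, and it uses a hypothetical classifier that always predicts the majority class; neither detail comes from the question. It compares accuracy with macro-averaged recall rather than AUC, only to show how a class-averaged metric exposes the problem.

```python
# Class counts: 950 and 30 are from the question; 20 is an assumption
# (the remainder of the 1000 examples, since the excerpt is truncated).
counts = {0: 950, 1: 30, 2: 20}

y_true = [label for label, n in counts.items() for _ in range(n)]
# Hypothetical baseline: always predict the majority class 0.
y_pred = [0] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(label):
    """Fraction of examples of `label` that were predicted as `label`."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    return hits / counts[label]

# Macro recall weights every class equally, regardless of its size.
macro_recall = sum(recall(c) for c in counts) / len(counts)

print(f"accuracy     = {accuracy:.3f}")
print(f"macro recall = {macro_recall:.3f}")
```

The baseline scores high on accuracy while never detecting either minority class, which the class-averaged number makes visible.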
2
votes
1 answer

Multi Dimension Y_train on Keras

I have two corpora for x_train and y_train, and after some preprocessing like this: input_sequences = [] labels = [] indexCA = 0 for line in corpusMSA: lineCA = corpusCA[indexCA].split() # Save CA Line token_list =…
2
votes
2 answers

Output top 2 classes from a multiclass classification algorithm

I am working on a multiclass classification problem for text, where I have a lot of different classes (15+). I have trained a LinearSVC SVM model (the method is just an example), but it outputs only the single class with the highest probability. Is there a…
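One possible approach, sketched with NumPy only: take a per-class score matrix and keep the two highest-scoring classes per row. The score values and class names below are hypothetical. In scikit-learn, a fitted LinearSVC returns an array of this shape from `decision_function(X)`; note that those values are margins, not probabilities.

```python
import numpy as np

# Hypothetical per-class scores for 3 samples and 5 classes, e.g. the kind of
# array LinearSVC.decision_function(X) returns in scikit-learn.
scores = np.array([
    [0.1, 2.3, -0.5, 1.7, 0.0],
    [1.2, -0.3, 0.8, 0.4, 2.0],
    [-1.0, -0.2, 0.9, 0.3, 0.1],
])
classes = np.array(["a", "b", "c", "d", "e"])  # hypothetical class labels

# Sort each row by descending score and keep the first two column indices.
top2_idx = np.argsort(-scores, axis=1)[:, :2]
top2_labels = classes[top2_idx]
print(top2_labels)
```

With a real model, `classes` would be the estimator's `classes_` attribute so that column indices map back to the original labels.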
2
votes
1 answer

Spacy's BERT model doesn't learn

I've been trying to use spaCy's pretrained BERT model de_trf_bertbasecased_lg to increase accuracy in my classification project. I used to build a model from scratch using de_core_news_sm and everything worked fine: I had an accuracy around 70%. But…
2
votes
0 answers

How to do multiclass classification without one-hot encoding, using 'sparse_categorical_crossentropy' in Keras?

My data is pretty simple: input arrays of m=10 real numbers, and the output is another array of m=10 numbers. I'm trying to sort arrays using neural networks: I send in the random numbers and expect the NN to output argsort(x), thus the array of…
9879ypxkj
2
votes
0 answers

Scikit-learn Logistic Regression one-vs-the-rest problem

How does sklearn's LogisticRegression handle the class imbalance resulting from the OVR (one-vs-rest) multiclass scheme? The scikit-learn library provides a LogisticRegression API. Reference:…
2
votes
1 answer

Training YOLO (object detection) on multiple objects on imbalanced dataset?

I'm training YOLO on a custom dataset (using AlexeyAB's implementation of Darknet). It has 3 classes of images, where class 1 has 45k images and the remaining two have around 1k images each. After training it for 6k iterations, the loss is between 1.5 and 2.…
2
votes
3 answers

How to stratify the training and testing data in Scikit-Learn?

I am trying to implement a classification algorithm for the Iris dataset (downloaded from Kaggle). In the Species column, the classes (Iris-setosa, Iris-versicolor, Iris-virginica) are in sorted order. How can I stratify the train and test data using…
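A minimal sketch of a stratified split using the `stratify` argument of scikit-learn's `train_test_split`. The arrays are a synthetic stand-in for the Iris data (150 rows, 50 per species, labels in sorted order), not the Kaggle file itself; `test_size` and `random_state` are arbitrary choices.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in for the Iris data: 150 rows whose labels are in sorted order.
X = np.arange(150).reshape(-1, 1)
y = np.repeat(["Iris-setosa", "Iris-versicolor", "Iris-virginica"], 50)

# stratify=y keeps the class proportions the same in both splits;
# the split also shuffles, so the sorted order does not matter.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

labels, test_counts = np.unique(y_test, return_counts=True)
print(dict(zip(labels, test_counts)))
```

Without `stratify`, a random split can still leave the classes unevenly represented in a small test set.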
2
votes
3 answers

Best way to handle imbalanced dataset for multi-class classification in Auto-Sklearn

I'm using Auto-Sklearn and have a dataset with 42 classes that are heavily imbalanced. What is the best way to handle this imbalance? As far as I know, two approaches to handle imbalanced data within machine learning exist. Either using a resampling…
2
votes
1 answer

How to estimate the accuracy on a large dataset?

I have a deep learning model (handed over from a former colleague). For some reason, the train/dev set is missing. I want to classify my dataset into 100 categories. The dataset is extremely imbalanced. The dataset size is…
2
votes
1 answer

Deep Learning: Multiclass Classification with same amount of labels between the training dataset and test dataset

I'm writing code for multiclass classification. I have custom datasets with 7 columns (6 features and 1 label); the training dataset has 2 label types (1 and 2), and the testing dataset has 3 label types (1, 2, and 3). The aim of…
2
votes
1 answer

How to convert 2-dimensional data into 3-D (Non-linear data)?

I have created synthetic data like this: X, Y = make_classification(n_features=2, n_samples=100, n_redundant=0, n_informative=1, n_clusters_per_class=1, class_sep=0.001, weights=[0.8, 0.2], n_classes=2…
2
votes
1 answer

What sort of loss function should I use this multi-class multi-label(?) problem?

In my experiment I am trying to train a neural network to detect whether patients exhibit symptoms A, B, C, D. My data consists of photos of each patient from different angles, along with whether or not they have symptom A, B, C, D. Right now, in PyTorch, I am…
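If the symptoms are not mutually exclusive (an assumption; the excerpt does not say), a common choice is one independent binary cross-entropy term per label. The NumPy sketch below shows that computation from raw logits; the batch values are made up, and `bce_with_logits` is a hypothetical helper name. In PyTorch, `torch.nn.BCEWithLogitsLoss` provides a built-in loss of this kind.

```python
import numpy as np

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over all samples and labels, from raw logits."""
    # Numerically stable form: max(x, 0) - x*y + log(1 + exp(-|x|))
    per_element = (
        np.maximum(logits, 0)
        - logits * targets
        + np.log1p(np.exp(-np.abs(logits)))
    )
    return per_element.mean()

# Hypothetical batch: 2 patients, 4 independent symptom labels (A, B, C, D).
logits = np.array([[2.0, -1.0, 0.5, -3.0],
                   [0.0, 0.0, 0.0, 0.0]])
targets = np.array([[1.0, 0.0, 1.0, 0.0],
                    [1.0, 0.0, 0.0, 1.0]])

loss = bce_with_logits(logits, targets)
print(loss)
```

Each output unit is scored on its own, so a patient can have any combination of the four symptoms.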
2
votes
0 answers

What is the difference between multiclass hinge loss and triplet loss?

The triplet loss (https://omoindrot.github.io/triplet-loss) seems very similar to the multiclass hinge loss (http://cs231n.github.io/linear-classify/#interpret). It is not clear what the difference between the two is.
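For side-by-side comparison, here is a hedged NumPy sketch of the two losses in their usual textbook forms. The function names, the default margin of 1.0, and the use of squared Euclidean distance in the triplet loss are assumptions made for illustration. The hinge loss compares class scores for one sample, while the triplet loss compares distances between three embeddings.

```python
import numpy as np

def multiclass_hinge(scores, true_class, margin=1.0):
    """Sum over wrong classes j of max(0, s_j - s_true + margin)."""
    margins = np.maximum(0.0, scores - scores[true_class] + margin)
    margins[true_class] = 0.0  # the true class does not contribute
    return margins.sum()

def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(0, d(a, p) - d(a, n) + margin) with squared Euclidean distance."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

# One sample with scores for 3 classes; class 0 is the correct one.
scores = np.array([3.2, 5.1, -1.7])
hinge = multiclass_hinge(scores, true_class=0)

# Three hypothetical 2-D embeddings.
anchor = np.array([0.0, 0.0])
positive = np.array([1.0, 0.0])
negative = np.array([0.5, 0.0])
triplet = triplet_loss(anchor, positive, negative)

print(hinge, triplet)
```

Both have the same max(0, margin + bad - good) shape; what differs is what is being compared, class scores in one case and embedding distances in the other.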
2
votes
1 answer

Training SVC from scikit-learn shows that using -h 0 may be faster?

I am training an SVC model on a large dataset, and since I have set verbose=True, it shows a warning: using -h 0 may be faster. I have two questions: what does this warning mean, and how can we set a libsvm option as mentioned in…
S.EB
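A small sketch related to the warning in the question above: in libsvm, `-h` controls the shrinking heuristic, and scikit-learn's `SVC` exposes it as the `shrinking` parameter, so `shrinking=False` corresponds to `-h 0`. The dataset here is tiny and made up, only to show where the option is set; whether it is actually faster depends on the data.

```python
import numpy as np
from sklearn.svm import SVC

# Tiny hypothetical dataset, only to show where the option lives.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# shrinking=False corresponds to libsvm's "-h 0" (shrinking heuristic off).
clf = SVC(kernel="rbf", shrinking=False)
clf.fit(X, y)

print(clf.get_params()["shrinking"])
```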