
I am trying to build an ML model that predicts the category of a given sentence using Multinomial Naive Bayes (MNB), but the training data contains classes I do not want. How can I remove that data?

This is the dataset I am using (it does not belong to me):

Misra, Rishabh and Prahal Arora. "Sarcasm Detection using Hybrid Neural Network." arXiv preprint arXiv:1908.07414 (2019).

Misra, Rishabh and Jigyasa Grover. "Sculpting Data for ML: The first act of Machine Learning." ISBN 9798585463570 (2021).

I want to remove certain categories such as 'US.NEWS', 'POLITICS', etc. How can I do that?

I tried to read the data by loading it with Python's json module, but that is not working either.

1 Answer

A popular Python library for data manipulation is pandas.

You can load this JSON file with:

import pandas as pd

df = pd.read_json("path/to/the/file")

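Then you can remove the rows whose category you do not want. Note that df.drop([...], axis=1) would drop columns with those names, not rows with those values, so filter on the column instead (assuming the label is stored in a column named "category"):

df = df[~df["category"].isin(["US.NEWS", "POLITICS"])]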
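Here is a minimal, self-contained sketch of the load-and-filter step. It rests on a few assumptions that are not confirmed in the question: that the file stores one JSON object per line (JSON Lines), which is why `lines=True` is passed, and that the label lives in a column called `category`. The in-memory sample and its `headline` field are made up for illustration; with the real file you would pass its path instead of `sample`.

```python
import io

import pandas as pd

# Hypothetical sample in JSON Lines form (one JSON object per line).
# Replace `sample` with the path to the real file.
sample = io.StringIO(
    '{"category": "POLITICS", "headline": "a"}\n'
    '{"category": "COMEDY", "headline": "b"}\n'
    '{"category": "US.NEWS", "headline": "c"}\n'
    '{"category": "SPORTS", "headline": "d"}\n'
)

# lines=True makes pandas parse each line as a separate JSON object.
df = pd.read_json(sample, lines=True)

# Keep only the rows whose category is NOT in the unwanted list.
unwanted = ["US.NEWS", "POLITICS"]
filtered = df[~df["category"].isin(unwanted)].reset_index(drop=True)

print(filtered["category"].tolist())
```

Before filtering, it may help to run `df["category"].unique()` to check the exact spelling of the labels, since the filter only matches exact strings.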

Of course you can use other libraries like polars and pyspark!
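If you would rather stay with the json module from the question, one possible reason it fails is the file layout: when a file holds one JSON object per line, it is not a single JSON document, and `json.load` on the whole file raises a `JSONDecodeError`. Assuming that is the case here (I have not verified it against the actual file), a rough sketch is to parse line by line. The helper names `load_json_lines` and `drop_categories`, the `category` key, and the sample records are all hypothetical.

```python
import json


def load_json_lines(lines):
    """Parse an iterable of text lines, one JSON object per line."""
    records = []
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            records.append(json.loads(line))
    return records


def drop_categories(records, unwanted, key="category"):
    """Return the records whose `key` value is not in `unwanted`."""
    unwanted = set(unwanted)
    return [r for r in records if r.get(key) not in unwanted]


# Made-up sample standing in for the lines of the real file.
sample_lines = [
    '{"category": "POLITICS", "headline": "a"}',
    '{"category": "COMEDY", "headline": "b"}',
    "",
    '{"category": "US.NEWS", "headline": "c"}',
]

records = load_json_lines(sample_lines)
kept = drop_categories(records, ["US.NEWS", "POLITICS"])
print(kept)
```

With a real file, an open file object can be passed directly, since it iterates over lines: `with open(path, encoding="utf-8") as f: records = load_json_lines(f)`.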

slapekm