Questions tagged [data-science]

Implementation questions about data science. Data science concerns extracting knowledge or insights from data, in whatever shape or form. It can involve predictive analytics and usually requires a lot of data wrangling. General questions about data science should be posted to their respective communities.

Data science is an interdisciplinary field that uses scientific methods, processes, and systems to extract knowledge and insights from data in various forms, both structured and unstructured, similar to data mining.

Wikipedia

NOTE: If you want to use this tag for a question not directly concerning implementation, then consider posting on Cross Validated, Data Science, or Artificial Intelligence instead. Otherwise you're probably off-topic.

9099 questions
1
vote
1 answer

Joining 2 data frames with all the same columns

In pandas/jupyter notebook with python: I have a dataframe (df1) with information about the amount of crime per year where each row is a summation of the total amount of crime in that country-year unit. However, df1 does not have rows which contain…
taraamcl
1
vote
1 answer

Find the coordinates of, and highlight, the required text in an image in Python

I have a photo in which I want to get the coordinates of some text and highlight it. Text to highlight and locate = 'was the age of wisdom'. I tried to get the coordinates by providing the first and last word but didn't get the required…
1
vote
1 answer

Convert a model with tensorflowjs from a pb file to a json file

I am trying to convert my model (a YOLO model) from model.pb to model.json. I am doing this because I want to load my model directly into my web application (with JavaScript), so I am trying to do this on the Google Colab platform with my Drive…
1
vote
1 answer

How to change the median line color for each violin plot in one ax, using matplotlib?

This is my code: fig, ax = plt.subplots() parts = ax.violinplot([df1_1['Height'], df1_2['Height'], df1_3['Height']], showmedians=True, vert=False, points=1000, widths=1, showextrema=False) ax.set_title('Height Distributions (2000 - 2016…
1
vote
1 answer

Pandas - Group by week while maintaining perfect chronological records

We have a dataframe that has break events that have happened on a production line. # Example dataframe df = pd.DataFrame({ 'RSNCODE': ['300.306', '100.102', '300.306'], 'BEGTIME': ['2022-06-08 22:21:47', '2022-06-22 14:00:00', '2022-07-25…
1
vote
0 answers

Can we calculate model accuracy from MAPE or MAE?

In a linear regression model, if MAPE or MAE is calculated, can we conclude that the regression model's accuracy is (1 - MAE)*100? The MAPE value mostly falls under 100, and it is a commonly used error metric in regression.
Akash K.
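A minimal sketch of the two metrics the question above mentions, on made-up numbers (the arrays and the helper names `mae` and `mape` are illustrative assumptions, not from the question). It suggests why `(1 - MAE)*100` is not a percentage: MAE is expressed in the units of the target, while MAPE is already a percentage. Treating `100 - MAPE` as an "accuracy" is an informal convention rather than a standard metric, and it goes negative whenever MAPE exceeds 100.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the target."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Undefined when y_true contains zeros, so reject that case explicitly.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if np.any(y_true == 0):
        raise ValueError("MAPE is undefined when a true value is zero")
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)


# Hypothetical data, chosen only to make the arithmetic easy to follow.
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]

mae_value = mae(y_true, y_pred)      # average of 10, 20, 0
mape_value = mape(y_true, y_pred)    # average of 10%, 10%, 0%

print(f"MAE  = {mae_value:.2f} (target units)")
print(f"MAPE = {mape_value:.2f} %")
print(f"(1 - MAE) * 100 = {(1 - mae_value) * 100:.2f}  # not a meaningful percentage")
```

With a target on a different scale (say, values in the millions), the MAE would change by the same factor while the MAPE would stay the same, which is why only the latter can be read as a percentage.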
1
vote
2 answers

Data Cleaning Error in Classification KNN Algorithm Problem

I believe the error is telling me I have null values in my data and I've tried fixing it but the error keeps appearing. I don't want to delete the null data because I consider it relevant to my analysis. The columns of my data are in this order:…
Renata
1
vote
0 answers

Code works until I put it into a function

I am writing a function to pull data from an excel file output = pd.Dataframe() def get_help(sheet,head,file): tmp_df = pd.read_excel(file,sheet_name='Cover') tmp_df = tmp_df.dropna(how='all') tmp_df = tmp_df.dropna(axis=1,how='all') …
asahi
1
vote
2 answers

Compare two columns in pandas but with a specific value by row

I need to compare two columns with specific values to get a sum at the end of all the records that match. For example: in the 'Survived' column the value for each record must be 1, and for the 'Pclass' column the value of the record can be 1 or 2. I…
yezzussss
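One way the condition described above could be expressed in pandas is with a boolean mask. This is only a sketch: the column names 'Survived' and 'Pclass' come from the question, but the sample rows are invented for illustration.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame(
    {
        "Survived": [1, 0, 1, 1, 0, 1],
        "Pclass": [1, 1, 3, 2, 2, 2],
    }
)

# True where Survived is 1 AND Pclass is either 1 or 2.
mask = (df["Survived"] == 1) & (df["Pclass"].isin([1, 2]))

# Summing a boolean Series counts the True values.
matching = int(mask.sum())
print(matching)
```

The parentheses around each comparison matter, because `&` binds more tightly than `==` in Python.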
1
vote
2 answers

How to create non-alphabetically ordered Categorical column in Polars Dataframe?

In Pandas, you can create an "ordered" Categorical column from existing string column as follows: column_values_with_custom_order = ["B", "A", "C"] df["Column"] = pd.Categorical(df.Column, categories=column_values_with_custom_order,…
Eero H
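For reference, a small sketch of the pandas side of the question above: a categorical column whose order follows a custom list rather than the alphabet. The sample values are invented, and `ordered=True` is an assumption about what the truncated snippet goes on to pass. The Polars equivalent is what the question asks about and is not shown here.

```python
import pandas as pd

# Custom, non-alphabetical order (same name as in the question's snippet).
column_values_with_custom_order = ["B", "A", "C"]

# Hypothetical data for illustration.
df = pd.DataFrame({"Column": ["A", "C", "B", "A"]})

# ordered=True is an assumption; it makes comparisons and sorting
# follow the order of `categories` instead of alphabetical order.
df["Column"] = pd.Categorical(
    df["Column"],
    categories=column_values_with_custom_order,
    ordered=True,
)

sorted_values = df.sort_values("Column")["Column"].tolist()
print(sorted_values)
```

Sorting then places "B" before "A", following the custom category order.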
1
vote
0 answers

Getting SettingWithCopyWarning in case of column drop or modify for pandas dataframe

From the sample dataset 'iris', I have created a dataframe df as follows: import seaborn as sns df = sns.load_dataset('iris') From this I created another dataframe d2, and from the '_is_view' flag I can see that d2 is created as a copy (rather…
mezda
1
vote
3 answers

How to calculate average level on certain days and times in a pandas data frame

I have a data frame like so Date_Time Level 2017-08-08 23:55:01 239.0 2017-08-08 23:50:01 242.0 2017-08-08 23:45:01 246.0 2017-08-08 23:40:01 250.0 2017-08-08 23:35:01 254.0 ... ... 2017-07-26 00:23:57 72.0 2017-07-26…
Shawn Mian
1
vote
1 answer

How would I create an SQL query to get the following result, Data Science Question?

I am working on a project with sample data for vehicle report relays. The data is as follows: (timetransmittedtz is a…
1
vote
1 answer

Why am I facing a problem when I try to change columns to 0 and 1?

data.head() experience stay No relevent experience 0 Has relevent experience 1 No relevent experience 0 Has relevent experience 1 No relevent experience 0 data['experience'] = data['experience'].map({'Has relevent…
1
vote
1 answer

How do I solve an import issue with numpy in scikit.decomposition.PCA?

I was trying to use the scikit.decomposition.PCA package and I couldn't even import it. import numpy as np from sklearn.decomposition import PCA I've upgraded both of np and scikit, but the error seems to be w/in scikit, what should I do? *Note:…