Questions tagged [data-science]

Implementation questions about data science. Data science concerns extracting knowledge or insights from data, in whatever shape or form. It can include predictive analytics and usually involves a lot of data wrangling. General questions about data science should be posted to their respective communities.

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured, similar to data mining.

Wikipedia

NOTE: If you want to use this tag for a question not directly concerning implementation, then consider posting on Cross Validated, Data Science, or Artificial Intelligence instead. Otherwise you're probably off-topic.

9099 questions


1 answer

Why does the algorithm sometimes not behave as intended?

We are currently working on a college project. We have been tasked with optimizing the maintenance schedule for repairs on bikes from a bike-sharing service. The bikes can only be rented from and returned to bike docking stations. We need to calculate…

python pandas dataframe algorithm data-science

asked Jun 03 '23 at 18:26

Yusuf Shehadeh


0 answers

ModuleNotFoundError: No module named 'sklearn.ensemble._bagging'

ModuleNotFoundError: No module named 'sklearn.ensemble._bagging'. Which version of scikit-learn is suitable for the above error? I am facing this issue while using Python 3.7, and I can't update the version.

github data-science python-3.7 data-science-experience github-issues

asked Jun 03 '23 at 06:30

Aditya rai


0 answers

Create tables using OptBinning with custom bins

I want to use the optbinning library to create tables with all the metrics, but under the assumption that I already have all the bins. I don't want to optimize the binning process; I just want the tables with my current bins. Despite the fact that…

python data-science analytics optbinning

asked Jun 02 '23 at 17:06

TomasLeon


0 answers

Troubleshooting 'Notebook validation failed: data.cells' error in Jupyter Notebook

How can I resolve the error "Notebook validation failed: data.cells[{data__cells_x}] must be valid exactly by one definition (0 matches found)" in Jupyter Notebook? …

python error-handling jupyter-notebook data-science data-analysis

asked May 19 '23 at 15:19

Basant


1 answer

What does ' ::Page{} ' do in R/RStudio?

What does ::Page{} do in R/RStudio? I'm studying data science through the IBM certification course on Coursera, and the notes contain this line of code in all the code blocks, with no explanation of what the "Page" function is doing. #load ggplot…

r data-science data-analysis

asked May 19 '23 at 09:28

Simon Nadar


1 answer

Python can recognize header position and extract header info

I have a CSV file that contains 3 or more headers. I want Python or pandas to recognize each header's position and extract the header info from the file. Here is an example of a CSV file that I have. "Level and…

python pandas data-science visualization

asked May 19 '23 at 07:01

Muhammad Fauzan


1 answer

AttributeError: 'FloatProgress' object has no attribute 'style'

import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset
import stanza
stanza.download('en')
nlp = stanza.Pipeline(lang='en')
The above code is used for creating a pipeline. Stanza provides a plethora of pre-trained NLP…

jupyter-notebook data-science tqdm stanza

asked May 18 '23 at 09:23

igneous spark


2 answers

How do I drop and change dtype in a Pipeline with sklearn?

I have some scraped data that needs some cleaning. After the cleaning, I want to create numerical and categorical pipelines inside a ColumnTransformer, such as: categorical_cols = df.select_dtypes(include='object').columns numerical_cols =…

python machine-learning scikit-learn data-science

asked May 14 '23 at 09:52

Odiseon


2 answers

How to read a column and apply a function to each cell as a tuple?

I'm trying to analyze a database with coordinates (X, Y). I need to read each value in that column and classify it as either North or South if it's "Y", or East or West if it's "X". So basically what I want to do is read each value in that column and…

python pandas database dataframe data-science

asked May 10 '23 at 15:00

rosvend
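For the coordinate question above, one possible approach is sketched below. It assumes the hemisphere is decided by the sign of each coordinate (non-negative Y is North, non-negative X is East), which the excerpt does not confirm. The column names `X` and `Y`, the sample values, and the output columns `ns` and `ew` are hypothetical. The sketch uses vectorized `np.where` rather than applying a function cell by cell.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real column names and values are not given in the question.
df = pd.DataFrame({
    "X": [-74.0, 2.35, 139.7],
    "Y": [40.7, 48.85, -33.9],
})

# Assumption: the sign of the coordinate decides the label.
# Y >= 0 is labelled North, otherwise South.
df["ns"] = np.where(df["Y"] >= 0, "North", "South")
# X >= 0 is labelled East, otherwise West.
df["ew"] = np.where(df["X"] >= 0, "East", "West")
```

If the real rule depends on something other than the sign, only the two conditions passed to `np.where` would need to change.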


3 answers

Index a different range of indices from each row of a NumPy array

I have two arrays of indices with shape m. I need to take the mean of the values in between the indices from an array with shape m x n. Can this be done without iterating through each row? What is the fastest way to do this? idx0 = np.array([1, 3,…

python numpy data-science

asked May 08 '23 at 20:50

dotto
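For the per-row range question above, a loop-free version can be sketched with a boolean mask built by broadcasting. This is only one option, and it assumes half-open ranges (start inclusive, stop exclusive), which the excerpt does not specify. The function name `row_range_means` and the sample arrays are hypothetical.

```python
import numpy as np

def row_range_means(values, start, stop):
    """Mean of values[i, start[i]:stop[i]] for every row i, without a Python loop."""
    values = np.asarray(values, dtype=float)
    start = np.asarray(start)
    stop = np.asarray(stop)
    cols = np.arange(values.shape[1])
    # mask[i, j] is True where start[i] <= j < stop[i] (half-open range, an assumption).
    mask = (cols >= start[:, None]) & (cols < stop[:, None])
    counts = mask.sum(axis=1)
    sums = np.where(mask, values, 0.0).sum(axis=1)
    # Rows whose range is empty get NaN instead of a division warning.
    return np.divide(sums, counts, out=np.full(len(counts), np.nan), where=counts > 0)

# Hypothetical data: a 4 x 5 array and one (start, stop) pair per row.
data = np.arange(20).reshape(4, 5)
idx0 = np.array([1, 0, 2, 3])
idx1 = np.array([3, 5, 4, 3])
means = row_range_means(data, idx0, idx1)
```

The mask costs O(m x n) memory, so for very wide arrays with short ranges a cumulative-sum approach may be cheaper.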


2 answers

How can I double unpivot data like in this example in SQL?

I have seen that you can sometimes use a cross apply, but I feel it won't work in this case, as I have 10 columns for "Days" (for 10 years) and 10 columns for "Discharges" (for 10 years), so I need this pivoted into 10 different rows per zip and age…

sql sql-server database data-science ssms

asked May 06 '23 at 19:55

manavjn


1 answer

Efficiently iterating over a list to extract count for multiple variables

I have a dataset of medical insurance variables, and am interested in understanding how the proportion of smokers ('yes', 'no') differs between regions ('northwest', 'northeast', 'southwest', 'southeast'). I have used a for loop to iterate over each…

python for-loop data-science analysis

asked May 05 '23 at 18:20

whorrodwi
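For the smoker-proportion question above, a hedged alternative to a manual for loop is a row-normalized cross-tabulation. The column names `region` and `smoker` and the sample rows are assumptions; only the category values come from the question.

```python
import pandas as pd

# Hypothetical sample; the real dataset and its column names are not shown in the question.
df = pd.DataFrame({
    "region": ["northwest", "northwest", "northeast",
               "southwest", "southeast", "southeast"],
    "smoker": ["yes", "no", "no", "yes", "yes", "no"],
})

# normalize="index" divides each row by its total, so every region's
# 'yes' and 'no' shares add up to 1.
proportions = pd.crosstab(df["region"], df["smoker"], normalize="index")
```

Each row of `proportions` is one region and each column is one smoker category, so regions can be compared directly.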


1 answer

Collapse rows by common variable of list

I want to collapse the rows of a dataframe to create the ortholog group of each ortholog and its corresponding genes. For example:
Column A   Column B
Ortho1     gene1
Ortho2     gene2, gene3
Ortho3     gene4, gene5, gene6
Ortho4     gene5,…

r dataframe tidyverse data-science bioinformatics

asked May 04 '23 at 14:39

Jin_soo


0 answers

I ran a command that was supposed to show me the data about my object detection AI, but I get an error that I can't solve

Basically, I have this command: python Tensorflow\models\research\object_detection\model_main_tf2.py --model_dir=Tensorflow\workspace\models\my_ssd_mobnet --pipeline_config_path=Tensorflow\workspace\models\my_ssd_mobnet\pipeline.config…

python data-science object-detection python-unicode

asked May 04 '23 at 00:09

Poupiloup Polpy


0 answers

pandas group memory usage reduction

Hello, I have some code that uses a large amount of memory. regressor_df is a DataFrame that has over 14 million elements. When I remove the location from the group by, the amount of RAM needed for processing goes down by about 26 GB. How can I run this…

python pandas data-science

asked May 02 '23 at 21:31

wezzie
