Questions tagged [data-science-experience]

IBM Data Science Experience is an interactive, collaborative, cloud-based environment where data scientists can use multiple tools to activate their insights.

Source: http://datascience.ibm.com/blog/welcome-to-the-data-science-experience/

261 questions
0
votes
1 answer

Adding empty columns to Python DataFrame

I am trying to add a couple of empty columns to a Python DataFrame. The columns to be added are in the form of a list. How could I do…
0
votes
2 answers

Issues in creating jobs of size XL in Watson Machine Learning (WML)

I have an issue when trying to create jobs for Decision Optimization when using size XL in Watson Machine Learning (WML). I have no issues whatsoever creating the first job of the day, but the second job fails. If changing to smaller…
0
votes
1 answer

Apriori algorithm expert is needed

I have a dataset with 3.3M rows and 8k unique products. I wanted to apply the Apriori algorithm to find association rules and connections between products. I did this before on a much smaller database with 50k rows and maybe 200 unique…
0
votes
1 answer

How can I impute missing values with the help of a tree-based model like random forest?

In my dataset, I have one variable with 30% missing values. I am trying to use a tree-based model but don't have a clear picture of how to implement it. data['X'].value_counts() OUTPUT----- ? 39454 MC 32223 HM 6197 SP 4892 BC …
user2986845
0
votes
1 answer

Read simple csv with PySpark

Probably a silly issue, but I don't get it. I'm working in a Jupyter Notebook with Python 3.6 and Spark 2.4, hosted by IBM Watson Studio. I have a simple csv file: num,label 0,0 1,0 2,0 3,0 And to read it I use the following commands: labels =…
Vincenzo Lavorini
0
votes
1 answer

Using TQDM in Watson Studio Notebooks

I'm using Pandas, and I would like to use the TQDM progress bar in the notebook. After loading TQDM: from tqdm.auto import tqdm tqdm.pandas() and applying a function to the Pandas Dataframe: new_df = df.progress_apply(...) I get as output, instead…
Vincenzo Lavorini
0
votes
1 answer

Getting 'StructField' object has no attribute '_get_object_id' on BinaryClassificationMetrics

I was trying to get the binary classification report in PySpark and I ran into this error: 'StructField' object has no attribute '_get_object_id'. Here is my code: %%spark from pyspark.mllib.evaluation import BinaryClassificationMetrics #from…
0
votes
1 answer

Total the sum of a corresponding Column with python

| Store | Date | Weekly_Sales | Holiday_Flag | Temperature | Fuel_Price | CPI | Unemployment…
0
votes
1 answer

Experiment design for machine learning algorithm in production

I have a machine learning algorithm ready. I would like to put it into production in a country of 70 cities. But before rolling it out to all 70 cities, I would like to run an experiment in 1 city to evaluate its performance in production. However,…
0
votes
1 answer

Watson Studio using ibmdbr in RStudio to connect to DB2 Z/OS

I'm trying to use RStudio to connect to the DB2 for Z/OS backend using ibmdbr odbcDriverConnect and am getting this error: Warning messages: 1: In odbcDriverConnect(con.text) : [RODBC] ERROR: state 01000, code 0, message [unixODBC][Driver Manager]Can't…
debrajo
0
votes
1 answer

Data Refinery Job failed with SCAPIException CDICO2060E

I'm building my first project in Watson Studio and a Data Refinery Job fails with the following error: ERROR: Failed to execute the flow. Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times,…
debrajo
0
votes
1 answer

Integrating Model with Data Preprocessing steps

It might be a silly thought, but please bear with me and guide me if I am taking the wrong approach. I am working on a machine learning project whose model will give a final output. This output from the ML model is to be consumed by another project…
0
votes
1 answer

How to find anomalies in wind-sensor TimeSeries data?

I have a time series data set which contains a TimeStamp (hourly) and a wind sensor value. I need to find anomalies in this data set. What are the techniques for finding anomalies? How can I find anomalies with only these two features ( TimeStamp,…
0
votes
1 answer

How to consider features while forecasting?

I have to forecast the future utilisation of my employees from their past data, broken down by zone and slot. Zone and slot are the 2 features I want to include while forecasting. Any suggestions on how to proceed? The data looks like: dt zone slot …
0
votes
3 answers

How do you deal with missing data when it's missing like 60%?

My data has a lot of missing values and I have to predict those values. One way is to fill them with the average of the observed values, but I want to hear another perspective. How do experienced data scientists solve this kind of issue?
hammadshahir