
I have been using the solution posted here about adding a few rows to the sklearn diabetes dataset to test the impact of extreme data values. Is there a "batch" way to append thousands of rows to the sklearn diabetes dataset so I can test using my synthetic dataset?

EDIT/UPDATE The sklearn diabetes dataset comes with 442 rows. I generated a synthetic dataset that extends it to 3500 rows. The example is below; the entries match the sklearn data in format and type.

Data format and types.

  • Are you looking for something like [`pd.concat()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html)? – rickhg12hs Oct 12 '22 at 21:31
  • @rickhg12hs, thank you, but unfortunately it seems limited to appending only a few variables and rows at a time. My synthetic dataset has over 3K lines (I also have it as a CSV). – G. D'Seas Oct 12 '22 at 21:57
  • `pd.concat()` can concatenate entire dataframes - large dataframes. Perhaps if you showed your data formats, specific possibilities could be presented. – rickhg12hs Oct 12 '22 at 22:00
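As the comments suggest, `pd.concat()` appends entire DataFrames in a single call, with no per-row loop. A minimal sketch of the approach, assuming the synthetic rows share the diabetes dataset's columns (here the synthetic data is random placeholder values, since the actual dataset was not shown):

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_diabetes

# Load the diabetes data as DataFrames (as_frame requires sklearn >= 0.23)
X, y = load_diabetes(return_X_y=True, as_frame=True)
df = X.copy()
df["target"] = y          # 442 rows x 11 columns

# Hypothetical synthetic rows: 3058 extra rows with matching columns,
# filled with random values purely for illustration
rng = np.random.default_rng(0)
synthetic = pd.DataFrame(
    rng.normal(size=(3058, df.shape[1])),
    columns=df.columns,
)

# One call appends the whole synthetic DataFrame at once
combined = pd.concat([df, synthetic], ignore_index=True)
print(combined.shape)     # (3500, 11)
```

If the synthetic data is already a CSV, `synthetic = pd.read_csv("synthetic.csv")` (a hypothetical filename) can replace the random DataFrame above, as long as the column names and dtypes line up; `ignore_index=True` renumbers the combined index so the appended rows don't duplicate the original row labels.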

0 Answers