
I have been using the solution posted here about adding a few rows to the sklearn diabetes dataset to test the impact of extreme data values. Is there a "batch" way to append thousands of rows to the sklearn diabetes dataset so I can test using my synthetic dataset?

EDIT/UPDATE The sklearn diabetes dataset comes with 442 rows. I generated a synthetic dataset that extends it to 3500 rows. The example is below; the entries match the sklearn data in format and type.

Data format and types.

  • Are you looking for something like [`pd.concat()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html)? – rickhg12hs Oct 12 '22 at 21:31
  • @rickhg12hs, thank you, but unfortunately it seems limited to appending only a few variables and rows at a time. My synthetic dataset has over 3K lines (I also have it as a CSV). – G. D'Seas Oct 12 '22 at 21:57
  • `pd.concat()` can concatenate entire dataframes - large dataframes. Perhaps if you showed your data formats, specific possibilities could be presented. – rickhg12hs Oct 12 '22 at 22:00
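As the comments suggest, `pd.concat()` appends entire DataFrames in a single call, with no per-row loop. A minimal sketch of the approach, assuming the synthetic rows share the diabetes dataset's columns (here the synthetic data is random placeholder values, since the actual dataset was not shown):

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_diabetes

# Load the diabetes data as DataFrames (as_frame requires sklearn >= 0.23)
X, y = load_diabetes(return_X_y=True, as_frame=True)
df = X.copy()
df["target"] = y          # 442 rows x 11 columns

# Hypothetical synthetic rows: 3058 extra rows with matching columns,
# filled with random values purely for illustration
rng = np.random.default_rng(0)
synthetic = pd.DataFrame(
    rng.normal(size=(3058, df.shape[1])),
    columns=df.columns,
)

# One call appends the whole synthetic DataFrame at once
combined = pd.concat([df, synthetic], ignore_index=True)
print(combined.shape)     # (3500, 11)
```

If the synthetic data is already a CSV, `synthetic = pd.read_csv("synthetic.csv")` (a hypothetical filename) can replace the random DataFrame above, as long as the column names and dtypes line up; `ignore_index=True` renumbers the combined index so the appended rows don't duplicate the original row labels.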

0 Answers