Creating synthetic dataset similar to a specific dataset

Asked Jun 11 '22 at 17:16

Active Jun 11 '22 at 17:27


I have a dataset and want to work with an external ML team. However, because of an NDA, I cannot share the original dataset or an anonymized version of it with them.

Currently I'm using `make_classification` to create synthetic data. However, this is a little time-consuming: I first have to understand the statistics of the original dataset and then create a synthetic dataset that resembles it. As an example dataset, you may consider Iris or any other public dataset:

from sklearn import datasets
iris = datasets.load_iris()
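
For concreteness, the manual workflow described above might look roughly like the sketch below. It uses Iris as a stand-in for the private data. The rescaling step and all variable names (`X_syn`, `feature_means`, and so on) are assumptions made for illustration, not the exact code in use.

```python
import numpy as np
from sklearn import datasets
from sklearn.datasets import make_classification

# Stand-in for the private dataset (Iris, as suggested in the question).
iris = datasets.load_iris()
X, y = iris.data, iris.target

# Step 1: collect basic statistics of the original data.
n_samples, n_features = X.shape
n_classes = len(np.unique(y))
feature_means = X.mean(axis=0)
feature_stds = X.std(axis=0)

# Step 2: generate a synthetic dataset with the same shape and class count.
X_syn, y_syn = make_classification(
    n_samples=n_samples,
    n_features=n_features,
    n_informative=n_features,  # assumption: treat every feature as informative
    n_redundant=0,
    n_repeated=0,
    n_classes=n_classes,
    n_clusters_per_class=1,
    random_state=0,
)

# Step 3 (assumption): standardize the synthetic features, then rescale them
# so each column has the same mean and standard deviation as the original.
X_syn = (X_syn - X_syn.mean(axis=0)) / X_syn.std(axis=0)
X_syn = X_syn * feature_stds + feature_means
```

This only matches the per-feature mean and standard deviation. Correlations between features and the per-class structure of the original data are not reproduced, which is part of why the manual approach is tedious.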

Is there a better way to imitate the original dataset?

edited Jun 11 '22 at 17:27

asked Jun 11 '22 at 17:16

Phoenix

difficult to help you if you cannot share the dataset to give an example, looks like we're stuck in a paradox :p – mozway Jun 11 '22 at 17:18
say Iris dataset (or any other public dataset) `iris = datasets.load_iris()` – Phoenix Jun 11 '22 at 17:20


0 Answers