I have a file with 5 rows and multiple columns. When the program reads it, it should generate, for example, 100 records, which can then be loaded into a database. The format can be Excel or CSV.
Have a look at the SMOTE algorithm, which creates new data from your existing data. (5 rows might be a bit low, to be honest.) – Adept Sep 24 '20 at 13:16
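To make the interpolation idea behind the comment above concrete, here is a minimal sketch that builds new numeric rows as random points on the line between two existing rows. It is a simplification, not SMOTE itself: actual SMOTE interpolates between a sample and one of its nearest neighbours within a class, and library implementations such as imbalanced-learn's expect class labels. The function name `interpolate_rows` and the example table are hypothetical, and the sketch assumes every column is numeric.

```python
import random


def interpolate_rows(rows, n_new, seed=None):
    """Create n_new synthetic rows by interpolating between random pairs of rows.

    rows: list of equal-length lists of numbers (numeric columns only).
    """
    if len(rows) < 2:
        raise ValueError("need at least two rows to interpolate between")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(rows, 2)  # two different source rows
        t = rng.random()            # position along the segment from a to b
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return synthetic


# Hypothetical 5-row numeric table (age, income); not data from the question.
rows = [
    [25, 40000.0],
    [31, 52000.0],
    [38, 61000.0],
    [44, 58000.0],
    [52, 75000.0],
]
new_rows = interpolate_rows(rows, n_new=100, seed=0)
```

Every generated value stays between the smallest and largest value of its column, so with only 5 source rows the output cannot be more varied than those rows allow. Text or categorical columns would need a different treatment.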
1 Answer
Let's say you have a file file.csv. Read it into a dataframe and sample from it, with replacement, as many times as you need. Write the result to a new dataframe or CSV.
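import pandas as pd

# Read the original rows
df = pd.read_csv('file.csv')
# Sample with replacement; n can be as large as you want
new_df = df.sample(n=100, replace=True)
# Export the new dataframe; index=False omits the sampled row labels
new_df.to_csv('new_df.csv', index=False)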

mullinscr
But I don't want to duplicate the same data. Could I use some package to generate new data similar to the data in each column? – auto daily Sep 24 '20 at 13:51
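One simple way to get rows that are not plain copies, sketched here under assumptions rather than offered as a definitive method, is to sample each column independently so that values get recombined across rows. The helper name `sample_columns_independently` and the example columns are hypothetical. Note the trade-offs: every value still comes from the original file, any relationship between columns is lost, and some generated rows can still coincide with each other or with an original row.

```python
import numpy as np
import pandas as pd


def sample_columns_independently(df, n, seed=None):
    """Build n rows by drawing each column's values independently, with replacement."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        col: rng.choice(df[col].to_numpy(), size=n, replace=True)
        for col in df.columns
    })


# Hypothetical 5-row input; in practice this would come from pd.read_csv('file.csv').
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy", "Dee", "Eve"],
    "city": ["Oslo", "Rome", "Lima", "Pune", "Kobe"],
    "age": [25, 31, 38, 44, 52],
})
new_df = sample_columns_independently(df, n=100, seed=0)
```

If the values themselves must be new, for example invented names or addresses, a dedicated fake-data generator would be needed instead; this sketch only recombines what is already in the file.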