Splitting a dataframe by passing a list of indices

Question

I have a dataframe df containing only one column, 'Info', which I want to split into multiple dataframes based on a list of indices, ls = [23,76,90,460,790]. If I want to use np.array_split(), how do I pass the list so that the data is split at these indices, with each index being the first row of one of the resulting dataframes?

Would using `ls = [23,76,90,460,790]` result in 5 DF's - could you elaborate a bit please? — Jon Clements, Dec 09 '21 at 18:09
Yes, first dataframe should start from row 23 to 75, then second one from 76 to 89 and so on. — ABC, Dec 09 '21 at 18:15
@Scott ahh... I didn't find that in my search for a possible duplicate (and thoughtful of you to not immediately close as a duplicate as it leads to your own answer :) - feel free to close as a duplicate (although I prefer my use of zip_longest - but I can add to there or you're more than welcome to add it to your answer there). — Jon Clements, Dec 09 '21 at 18:48

Jon Clements · Answer 1 · 2021-12-09T18:37:46.267

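I don't think you can use np.array_split() here: you can access the underlying .values of the primary DF, but you'd get back NumPy arrays, not DFs. What you can do is use .iloc and slice from your DF, pairing each start index with the next one (the last start is paired with None, so the final slice runs to the end of the DF), e.g.:

from itertools import zip_longest

# pairs: (23, 76), (76, 90), (90, 460), (460, 790), (790, None)
dfs = [df.iloc[s:e] for s, e in zip_longest(ls, ls[1:])]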
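
To check the slicing against the ranges described in the comments (23 to 75, 76 to 89, and so on), here is a minimal sketch. It assumes the entries of ls are positional row numbers, that each one starts a new piece, and that the last piece runs to the end of the frame. The 800-row sample df is made up for illustration and is not the asker's data. Each start is paired with the following start via zip_longest(ls, ls[1:]).

```python
from itertools import zip_longest

import pandas as pd

# Hypothetical stand-in for the asker's df: 800 rows, one 'Info' column.
df = pd.DataFrame({"Info": [f"row {i}" for i in range(800)]})
ls = [23, 76, 90, 460, 790]

# Pair each start with the next start; the last start is paired with None,
# so the final slice runs to the end of the frame.
bounds = list(zip_longest(ls, ls[1:]))
dfs = [df.iloc[s:e] for s, e in bounds]

# Print first row label, last row label, and row count of each piece.
for part in dfs:
    print(part.index[0], part.index[-1], len(part))
```

Note that rows before the first index (0 to 22 here) do not end up in any piece. If they are needed, one option would be to prepend 0 to ls before slicing.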

edited Dec 09 '21 at 18:37

answered Dec 09 '21 at 18:23

Jon Clements


Thank you very much. I think it should work. – ABC Dec 10 '21 at 16:39
