I am trying to get pandas to select the range of rows under the "ClosePrice" from the below structured csv and store it in dataframes. The file has many identifiers but I only want to go through the file with the identifiers in the list below. Also the number of the rows is not always the same.
list = ['ABC0123', 'DEF0123']
> Column 1 Column 2 Column 3 Column 4 Column 5 Column 6 Column 7
> "Date" 20170101 "Identifier" ABC0123
> "OpenPrice" 500 "Currency" USD
> "ClosePrice" 550 "foo" bar
> foo foo foo foo foo foo foo
> foo foo foo foo foo foo foo
> foo foo foo foo foo foo foo
> "Date" 20170101 "Identifier" SOMEOTHER
> ...
> ...
> ...
> "Date" 20170101 "Identifier" DEF0123
> "OpenPrice" 600 "Currency" USD
> "ClosePrice" 650 "foo" bar
> foo foo foo foo foo foo foo
> foo foo foo foo foo foo foo
> foo foo foo foo foo foo foo
> foo foo foo foo foo foo foo
> foo foo foo foo foo foo foo
> foo foo foo foo foo foo foo
> foo foo foo foo foo foo foo
> foo foo foo foo foo foo foo
> foo foo foo foo foo foo foo
I am getting the first row of each table I am interested in with a for-i-loop and:
df.iloc[df[df['Column 4'].isin(list)].index + 3,:]
which goes to the top left cell with a "foo" value and selects the whole row, but I am trying to figure out how to select the rows below that starting point and stop before the next
"Date" 20170101 "Identifier" SOMEOTHER
One approach I was thinking about was to check for the len of the value of the cell under the last row in Column 5 which would be = 0 , but I am not able to reproduce this logic with scripting. Other approaches more than welcome.