I have extracted tabular data from a PDF into a pandas DataFrame using Camelot. Because of indentation issues in the PDF table, a string that belongs to a single row sometimes gets split across two rows (especially strings inside bullet points). I want to merge these split rows back into a single row.
I have highlighted how a single row is split into two rows (for the "c)" bullet point and the "V" bullet point):
I have also added the expected output.
I am not able to come up with generalized logic for this. Can anyone suggest concise code to handle these cases?
Link to sample dataset: https://docs.google.com/spreadsheets/d/1xdhb1d5qWPhcF3mdS1F76FfMqgFLmZdonHmo9DKBUw0/edit#gid=0
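
To make the problem concrete, here is a minimal sketch of the kind of merge I am after. The column names (`item`, `description`), the sample values, and the helper `merge_split_rows` are all hypothetical and not taken from the linked dataset. It also rests on one assumption that may not hold for every table: a continuation row is one whose bullet/key column is empty.

```python
import pandas as pd


def merge_split_rows(df: pd.DataFrame, key_col: str) -> pd.DataFrame:
    """Fold continuation rows into the row above them.

    Assumption (hypothetical rule): a row is a continuation of the
    previous row whenever `key_col` is empty or NaN.
    """
    df = df.fillna("")
    # True wherever a new logical row starts (key column is non-empty).
    is_start = df[key_col].astype(str).str.strip() != ""
    # Running count of row starts gives every physical row a group id.
    group_id = is_start.cumsum().rename("group_id")
    # Join the non-empty pieces of each column within a group.
    merged = df.groupby(group_id).agg(
        lambda s: " ".join(v for v in s.astype(str).str.strip() if v)
    )
    return merged.reset_index(drop=True)


# Made-up data imitating the split: the "c)" text wraps onto a second row.
raw = pd.DataFrame(
    {
        "item": ["a)", "b)", "c)", "", "d)"],
        "description": [
            "first point",
            "second point",
            "third point that",
            "wraps onto a new line",
            "fourth point",
        ],
    }
)

clean = merge_split_rows(raw, key_col="item")
print(clean)
```

This only covers the simple case where the key column is blank on the continuation row. Rows where the split happens without a blank key column would need a different rule, which is the part I cannot generalize.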