I have several large data files to parse. In each file, some tags repeat many times while others appear only once. For example, each block of data has a single parent name and date, but many children such as patent citations, non-patent citations, and classifications.
For every parent block in each file, I find all occurrences of each of these three children and store them in separate lists. The problem is that these lists always have different lengths, and I want to write them all on one row of a CSV file.
For example, for the first iteration in a file, my input lists look like this:
Name = [Jon]
Date = [1985]
Patcit = [1, 2, 3]
Npatcit = [4, 5, 6, 7, 8]
Class = [9, 10]
For the second iteration, the incoming lists are:
Name = [Nikhil]
Date = [1988]
Patcit = [1, 2, 3]
Npatcit = [4, 5, 6, 7]
Class = [9, 10, 11, 12, 13]
For the third iteration, the incoming lists are:
Name = [Neetha]
Date = [1986]
Patcit = [1, 2]
Npatcit = [4, 5]
Class = [9, 10, 11, 12]
I want the output written to the CSV file to look like this:
Name Date Patcit Npatcit Class
Jon 1985 1,2,3 4,5,6,7,8 9,10
Nikhil 1988 1,2,3 4,5,6,7 9,10,11,12,13
Neetha 1986 1,2 4,5 9,10,11,12
(and so on, with each subsequent name and date iteration on its own row)
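One possible way to get this layout, as a minimal sketch rather than a definitive solution: join each variable-length list into a single comma-separated string so that it occupies exactly one cell, then write the row with Python's standard `csv` module. The helper names `join_cell` and `write_records` are hypothetical, and the sketch assumes the default comma delimiter, which means cells that contain commas are quoted automatically.

```python
import csv
import io

HEADER = ["Name", "Date", "Patcit", "Npatcit", "Class"]


def join_cell(values):
    """Collapse a list of any length into one comma-separated string."""
    return ",".join(str(v) for v in values)


def write_records(records, fileobj):
    """Write one CSV row per record; each list becomes a single cell.

    Hypothetical helper: `records` is assumed to be an iterable of
    (name, date, patcit, npatcit, cls) tuples of lists.
    """
    writer = csv.writer(fileobj)
    writer.writerow(HEADER)
    for record in records:
        writer.writerow([join_cell(column) for column in record])


# The three iterations from the question.
records = [
    (["Jon"], [1985], [1, 2, 3], [4, 5, 6, 7, 8], [9, 10]),
    (["Nikhil"], [1988], [1, 2, 3], [4, 5, 6, 7], [9, 10, 11, 12, 13]),
    (["Neetha"], [1986], [1, 2], [4, 5], [9, 10, 11, 12]),
]

# An in-memory buffer stands in for a real file here.
buffer = io.StringIO()
write_records(records, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, pass a file opened with `newline=""` (for example `open("output.csv", "w", newline="")`, where the filename is only a placeholder). If quoted cells are not wanted, passing `delimiter="\t"` to `csv.writer` would separate columns with tabs, so the commas inside each cell no longer need quoting.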