I have to work with a dataset that contains multiple lines per observation. The number of rows per observation can vary. The file is structured so that information is not repeated.
The file contains a segment ID, which relates the output to a specific piece of information. Here is a sample of the file layout.
Segment Id definitions
SegementID Table Number of Occurrences
1 Customer Information 1
3 Items bought 1-10
Table layout - Customer Information
ID Name
Table layout - Items Bought
Item Cost Date
Here is a sample of how the output file would look.
SegementID
1 100 matt
3 ball 3.25 1/16/2016
3 cat 5.55 1/17/2016
1 200 lucy
3 doll 500.35 2/1/2016
3 ball 3.25 2/2/2016
3 dog 5.55 2/3/2016
Notice that segment ID = 1, relates to customer information. Segment ID 3 then shows all the transactions that customer has made.
I would like to make the structure that has customer ID available on each transaction line. What is the best way to do this?
ID Item Cost Date
100 ball 3.25 1/16/2016
100 cat 5.55 1/17/2016
200 doll 500.35 2/1/2016
200 cat 3.25 2/2/2016
200 dog 5.55 2/3/2016