Im trying to get the rows of an excel document. What i have achieved.
1-. Retrieve .xls, .xlsx files
2-. Convert those files to TIFF images
3-. Enhance image for better text recognition
4-. Identify Pages
5-. Create the Documents
6-. Recognize Page and Fields
7-. Populate Fields (this is were is my problem)
For example, in a table like
Name | Age | Size
Juan | 26 | 1.90m
Max | 25 | 1.85m
Victor | 26 | 1.65m
My project can find the keyword Name, Age & Size, and in the settings i can tell him, ok the value is down a line and group the leading and trailing words, but it will only fill the fields name, age and size with the first values below and will ignore the others, and datacap does not seems to have a field array type.
In the image, you can see that there is only one way add fields, and they are scalar (just one value), Add multiple only adds multiple fields at once, not a field of multiple values haha.
This is how my fields get retrieved
Another problem i face is that my excel sheet gets splitted in order to fill a document format, and i was expecting the whole sheet to be converted in 1 document not 4
In the image, those 4 pages are from the same sheet (in the excel)
IBM docs still lacks information, there are some pages that only has its title and zero information lol.