I am using tabula to extract text by area, but some areas shift between documents, so I get mismatched results. See the images for a clearer explanation.
What are the alternatives for handling this kind of behavior in PDF files?
If there are only two different sizes, maintain two sets of location data, and look for a single piece of text whose position tells you which size you have, like:
Código do Serviço / Atividade
(I picked that text because, when looking at them side by side, it's the first text I could identify that had different locations.)
If the "lower" location matches, then it's the bigger of the two, and you will use the "large" location set. Otherwise, use the "small" set.
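
Here is a rough sketch of that idea in Python. Everything in it is an assumption for illustration: the coordinates are placeholders you would measure from your own PDFs, the field name `service_code` is made up, and `extract_text` is a hypothetical callable that returns the text found inside a given area. Areas are written as (top, left, bottom, right), the order tabula uses.

```python
from typing import Callable, Dict, Sequence

# An area is (top, left, bottom, right), the order tabula uses.
Area = Sequence[float]

# The text whose position differs between the two sizes.
MARKER = "Código do Serviço / Atividade"

# Placeholder coordinates (assumptions): where the marker sits in each size.
MARKER_AREAS: Dict[str, Area] = {
    "small": (300.0, 20.0, 315.0, 200.0),
    "large": (340.0, 20.0, 355.0, 200.0),  # the "lower" location
}

# Placeholder coordinates (assumptions): one location set per size.
FIELD_AREAS: Dict[str, Dict[str, Area]] = {
    "small": {"service_code": (315, 20, 330, 200)},
    "large": {"service_code": (355, 20, 370, 200)},
}


def detect_layout(extract_text: Callable[[Area], str]) -> str:
    """Return the name of the size whose marker area contains MARKER."""
    for name, area in MARKER_AREAS.items():
        if MARKER in extract_text(area):
            return name
    raise ValueError("marker text not found in any known location")


def extract_fields(extract_text: Callable[[Area], str]) -> Dict[str, str]:
    """Detect the size first, then read every field with the matching set."""
    layout = detect_layout(extract_text)
    return {
        field: extract_text(area).strip()
        for field, area in FIELD_AREAS[layout].items()
    }
```

With tabula-py, `extract_text` could wrap a call to `tabula.read_pdf` with its `area` argument and join the returned cells into a string; that wiring is left out here because it depends on your documents. Raising an error when the marker is missing makes a third, unknown size show up immediately instead of silently producing mismatched results.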