I am trying to use vaex as an alternative to pandas to merge two very large data frames (100k rows and 176M rows) on a string column.
The .join seems to work without any error, and I can even check the .shape of the resulting data frame, but when I call .head on the result I get a long error stack (attached below). One of the lines near the end mentions pyarrow.lib.ArrowInvalid: offset overflow while concatenating arrays.
My first guess would be that I do not have enough RAM, but the merge itself went surprisingly well. How can I fix this?