On its first run, the crawler creates the table schema. Sometimes it does not detect the double quote ("), ampersand (&), and comma (,) characters correctly, which breaks the data rows.
I fixed that by updating the table's SerDe serialization library.
But now I have another problem: the extra columns created by the first crawl still remain, even after I re-run the crawler. The table has thousands of columns, which is very annoying.
Is it possible to remove the unnecessary columns (col30, col31, col32, ... col3034) on the second crawl?
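For context, one manual workaround is to edit the table definition in the Glue Data Catalog directly instead of relying on the crawler. Below is a minimal sketch using boto3's `get_table` and `update_table`. It assumes the unwanted columns are exactly the auto-named ones from col30 upward. The helper names `drop_extra_columns` and `prune_table`, the `first_extra` parameter, and the short list of copied keys are illustrative assumptions, not an official recipe. Whether a later crawl adds the columns back depends on the crawler's schema change settings, which this sketch does not touch.

```python
import re

# Keys copied into the TableInput. get_table also returns read-only keys
# (DatabaseName, CreateTime, ...) that update_table does not accept.
# This is a deliberately small subset, chosen for the sketch.
TABLE_INPUT_KEYS = {
    "Name", "Description", "Owner", "Retention", "StorageDescriptor",
    "PartitionKeys", "TableType", "Parameters",
}


def drop_extra_columns(table, first_extra=30):
    """Build a TableInput dict without columns named colN where N >= first_extra."""
    table_input = {k: v for k, v in table.items() if k in TABLE_INPUT_KEYS}
    storage = dict(table_input["StorageDescriptor"])  # shallow copy, input stays intact

    def keep(column):
        match = re.fullmatch(r"col(\d+)", column["Name"])
        return match is None or int(match.group(1)) < first_extra

    storage["Columns"] = [c for c in storage["Columns"] if keep(c)]
    table_input["StorageDescriptor"] = storage
    return table_input


def prune_table(database, name, first_extra=30):
    """Fetch the table, drop the extra columns, and write the definition back."""
    import boto3  # imported here so drop_extra_columns works without AWS access

    glue = boto3.client("glue")
    table = glue.get_table(DatabaseName=database, Name=name)["Table"]
    glue.update_table(
        DatabaseName=database,
        TableInput=drop_extra_columns(table, first_extra),
    )
```

Calling `prune_table("my_database", "my_table")` (both names are placeholders) would rewrite the table with only the columns below col30 plus any columns that do not follow the colN pattern.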