I have a data set stored in Parquet files crawled from S3 and registered in Glue Data Catalog. Some of the columns are of ambiguous type.
For example column col
is typed as struct<long:bigint,string:string>
.
If I select from that table tbl
, then values of col
are displayed for example like this:
{long=16, string=null}
{long=null, string=15.2}
What I would like to do now is query specifically those rows where col
was classified as a string
.
How would I do that?
(What would a query have to look like for filtering rows from tbl
whose value in the col
column is classified as long
and > 10
?)