We tried to load a CSV file with a `cpf` column into AWS Athena. The `cpf` field contains values like `372.088.989-03`.

create external table (
    cpf bigint,
    name string,
    cell bigint
)

Athena doesn't read this field. How should we declare it?

We tried declaring it as a string, and that works, but it doesn't seem correct.

  • That probably isn't a valid value. Are the periods representing European-style commas? Can you possibly remove them, so that the file contains `372088989-03`? Alternatively, you could load it as a `varchar` and then cast it to the desired format in your queries. – John Rotenstein Oct 28 '22 at 21:09
  • Well, `372.088.989-03` isn't a valid value for a `bigint` (or any numeric) data type. If the formatting of this value needs to be preserved, then it should be stored in a `text` column. Otherwise you will need to explain what the real value should be: `372088989 - 3`, which yields `372088986`? Or `37208898903`, or `372.08898903`, or `372088.98903`, or `372088989.03`? –  Oct 29 '22 at 08:18
  • @JohnRotenstein This field is a unique identity number in Brazil; it's the equivalent of an SSN. – William Soares Oct 31 '22 at 13:56

1 Answer

Ah! It's the CPF number (see Wikipedia).

It does not match rules for numbers, and you won't be doing any mathematics on it, so I would recommend treating the CPF as a string.
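As a rough sketch of what that could look like, assuming a comma-delimited file: the table name `people` and the S3 location below are placeholders that are not in the question, so substitute your own. The `cpf` column is declared as `string`, and `regexp_replace` strips the punctuation at query time if you need only the digits.

```sql
-- Hypothetical table name and S3 path; adjust both to your setup.
CREATE EXTERNAL TABLE people (
    cpf  string,   -- keep the CPF as text, e.g. '372.088.989-03'
    name string,
    cell bigint
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION 's3://your-bucket/path/to/csv/';

-- Digits-only form of the CPF, still as a string
-- ('372.088.989-03' becomes '37208898903').
SELECT
    cpf,
    regexp_replace(cpf, '[^0-9]', '') AS cpf_digits
FROM people;
```

Keeping the digits-only value as a string rather than casting it to `bigint` also preserves leading zeros, which a CPF can have.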

John Rotenstein