2

I have a DataFrame df which I retrieved from a Postgres database as follows

using DataFrames, LibPQ

con = LibPQ.Connection(con_string)
result = execute(con, "SELECT * FROM [table]")
df = DataFrame(result)
close(con)

Sorry, I cannot make this reproducible.

Now, either DataFrames or LibPQ is turning NUMERIC Postgres columns into type Decimals.Decimal. This might be cool for being as accurate as possible, but it provides problems when I try to plot anything with these columns.

eltype.(eachcol(df))

5-element Vector{Union}:
 Union{Missing, String}
 Union{Missing, TimeZones.ZonedDateTime}
 Union{Missing, Int32}
 Union{Missing, Date}
 Union{Missing, Decimals.Decimal}

As very nicely explained here by Bogumił Kamiński I can change the columns of a specific type to some other type. The caveat is that I cannot even test whether a column is of type Union{Missing, Decimals.Decimal}, because the Decimals package is not loaded. OK, I thought, let's load the Decimals package then - but it doesn't work, because the package must be installed first...

Is there some other way to turn these columns into Float64s? Without having to install the entire package? I know that I could change the column types by using the column names, like

df.my_column = Float64.(df.my_column)

but I will not know the relevant column names in advance.

Georgery
  • 7,643
  • 1
  • 19
  • 52

2 Answers2

2

You can use Union{Missing, AbstractFloat} as type selector as Decimal <: AbstractFloat.

Since Union{Missing, AbstractFloat} is not a concrete type you need to write eltype(col) <: Union{Missing, AbstractFloat} to check a subtyping condition.


By the way if you have LibPQ.jl installed then you also have access to Decimals.jl:

julia> LibPQ.Decimals.Decimal
Decimals.Decimal
Bogumił Kamiński
  • 66,844
  • 3
  • 80
  • 107
0

You can use identity to properly type every column in DataFrame.

julia> df=DataFrame(A=Number[1,2],B=Union{Missing,AbstractFloat}[3,4])
2×2 DataFrame
 Row │ A       B          
     │ Number  Abstract…? 
─────┼────────────────────
   1 │      1         3.0
   2 │      2         4.0

julia> identity.(df)
2×2 DataFrame
 Row │ A      B       
     │ Int64  Float64 
─────┼────────────────
   1 │     1      3.0
   2 │     2      4.0
  • I don't think this will work in this case, as the goal is not just to get the narrowest type of values, but to convert them to a different type. – Milan Bouchet-Valat Apr 06 '22 at 20:50
  • The question is to convert to `Float64`, but since it says "it provides problems when I try to plot anything with these columns.", I answered how to convert to a type that can be plotted. It is useful because it can be written short if the conversion to the narrowest type is sufficient instead of the conversion to a specific different type. – 91 dokudo Apr 07 '22 at 22:00