
I want to do elementwise operations on Array{Union{A,B,C}, M}, and I want an output of Array{Union{A,B,C}, N} without dynamic dispatch caused by failure to infer the Union{A,B,C} element type.

I pasted the examples I'm playing with below. I am aware I could use type annotations to control return types, e.g. local y::eltype(x), but that stops neither the Any inferences in the elementwise operations (as shown by @code_warntype) nor the dynamic dispatch (the allocation counts I report later are unaffected).

It seems like Union splitting works up to 3 concrete types in the example, but while looping splits the element type, broadcasting splits the Vector type instead of the element type parameter. I'm asking for generalizable approaches to writing loops or elementwise methods that get around these inference limitations, even if it gets as tedious as conditional statements for each type.

begin
  function loop(x)
    local y
    for i in x
      y = i
    end
    return y
  end
        
  function each1(x)
    return x+1
  end
  
  # inputs to test
  const x3 = Union{Int, Float64, Bool}[1, 1.1, true]
  const x4 = Union{Int, Float64, Bool, ComplexF64}[1, 1.1, true, 1.1im]
end

I won't paste the whole @code_warntype printouts, just the inferred output types. Also I'm not exactly sure how to measure dynamic dispatch, but it causes extra allocations so I'm reporting the allocations in the 2nd runs of @time.

typeof(loop(x3)) # Bool
@code_warntype loop(x3) # Union{Bool, Float64, Int64}
@time loop(x3) # 0 allocations

typeof(each1.(x3)) # Vector{Real}
@code_warntype each1.(x3) # Union{Vector{Float64}, Vector{Int64}, Vector{Real}}
@time each1.(x3) # 3 allocations

typeof(loop(x4)) # ComplexF64
@code_warntype loop(x4) # Any
@time loop(x4) # 8 allocations

typeof(each1.(x4)) # Vector{Number}
@code_warntype each1.(x4) # AbstractVector{<:Number}
@time each1.(x4) # 13 allocations
BatWannaBe

1 Answer


The reason for what you see is twofold:

  1. As you noticed, if the Union has more than 3 types, the Julia compiler gives up on exact type inference and falls back to cruder bounds. This is a hard-coded limit that avoids excessive code generation.
  2. Broadcasting narrows the eltype of the resulting collection if possible.

Let me present the second issue in a minimal example:

julia> x = Any[1]
1-element Vector{Any}:
 1

julia> identity.(x)
1-element Vector{Int64}:
 1

Also when you write:

broadcasting splits the Vector type instead of the element type parameter

more precisely, broadcasting tells you that the result will be either Vector{Float64}, Vector{Int64}, or Vector{Real}, depending on the actual contents of the array (because eltype narrowing is performed).
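A small sketch of how the result type follows the contents, assuming the `each1` definition from the question. The alias `U` and the vectors `ints`, `floats`, and `mixed` are hypothetical inputs that all share the same declared eltype:

```julia
each1(x) = x + 1

# Same declared element type for all three inputs (hypothetical alias).
U = Union{Int, Float64, Bool}

ints   = U[1, 2, 3]        # contents are all Int
floats = U[1.5, 2.5]       # contents are all Float64
mixed  = U[1, 1.1, true]   # contents are mixed, like x3 in the question

# Broadcasting picks the narrowest eltype that fits the actual results.
t_ints   = typeof(each1.(ints))    # Vector{Int64} on a 64-bit machine
t_floats = typeof(each1.(floats))  # Vector{Float64}
t_mixed  = typeof(each1.(mixed))   # Vector{Real}
```

The three result types correspond to the three members of the Union reported by @code_warntype.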

The way to avoid this is to use broadcasting assignment instead (or just a loop). However, you must make sure that the eltype of the destination container can store every result of the operation. In your case it could be, e.g.:

julia> r3 = similar(x3, Union{Int, Float64})
3-element Vector{Union{Float64, Int64}}:
 0.0
 0.0
 0.0

julia> r3 .= each1.(x3)
3-element Vector{Union{Float64, Int64}}:
 2
 2.1
 2

julia> @time r3 .= each1.(x3)
  0.000034 seconds (1 allocation: 16 bytes)
3-element Vector{Union{Float64, Int64}}:
 2
 2.1
 2

(Note that I chose the eltype of r3 manually so that it is correct, i.e. I know which result types to expect given the definition of each1 and the type of x3.)
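The "just a loop" alternative could look roughly like the sketch below. The helper name `each1_loop!` is hypothetical; it writes into a preallocated destination, so the same requirement on the destination eltype applies:

```julia
each1(x) = x + 1

# Hypothetical helper: store each1(x[i]) in a preallocated destination r.
# eachindex(r, x) also checks that both arrays have matching indices.
function each1_loop!(r, x)
    for i in eachindex(r, x)
        r[i] = each1(x[i])
    end
    return r
end

x3 = Union{Int, Float64, Bool}[1, 1.1, true]
r3 = similar(x3, Union{Int, Float64})  # eltype chosen by hand, as above
each1_loop!(r3, x3)
```

Since the destination is created up front, the eltype of the result is fixed by `similar` rather than derived from the contents.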

Bogumił Kamiński
  • `similar` and in-place broadcasting definitely cuts down on the allocations, and no type ambiguities show up in `@code_warntype`. Is there a way to do the same for the `loop` function in my example? It's not broadcasting and even if I annotate `i` and `y` with `::eltype(x)`, there's still `Any`s showing up in the `@code_warntype loop(x4)` printout and the allocations are the same. – BatWannaBe Dec 24 '21 at 10:40
  • As noted in point 1: with a union of 4 types, AFAICT Julia gives up on specialization. In the line `for i in x` you invoke the iteration protocol, and Julia gives up trying to resolve the element type when creating the variable in lowered code. See https://docs.julialang.org/en/v1/manual/types/#Type-Unions and https://github.com/JuliaLang/julia/blob/073900d6f973790961c5bd52f754cf7e45ea2874/base/compiler/types.jl#L71. – Bogumił Kamiński Dec 24 '21 at 11:49
  • Ah so this is the same thing as comprehensions failing to infer types e.g. `(i for i in x4)`, I've noticed that but never made the connection. I wonder if iterating over a type-stable `eachindex(x)` would work; that would put the burden of type on `x[i]` instead of the iteration protocol. I'll try it out when I can get back to my laptop, or if you update your post showing if it works, I'll just mark this answer as the resolution. – BatWannaBe Dec 24 '21 at 12:21
  • Update: Iterating `eachindex(x)` and indexing `x[i]` preserves the array's element type union, even with 9 types. – BatWannaBe Dec 24 '21 at 15:23
  • Update 2 for anyone who comes across this post later: the element type union is not preserved if the Union-splitting limit is exceeded AND if you operate on the element value, like `y = x[i]+1`. It makes sense, it's ridiculous to generally compile 9 separate branches for 9 different types checked per iteration. It was probably preserved for `y = x[i]` because `eltype(x)` is known. – BatWannaBe Dec 25 '21 at 06:03
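
A sketch of the two ideas from the comments, under stated assumptions. `loop_index` is a hypothetical index-based rewrite of `loop` from the question. `add1_branch!` is a hypothetical example of the "conditional statements for each type" approach the question mentions: inside each `isa` branch the value has a known concrete type, which may help once the automatic Union-splitting limit is exceeded. Whether either removes the dynamic dispatch in a given Julia version is worth checking with @code_warntype.

```julia
# Index-based variant of `loop`: iterate eachindex(x) and read x[i].
function loop_index(x)
    local y::eltype(x)
    for i in eachindex(x)
        y = x[i]
    end
    return y
end

# Manual branching: one `isa` check per member of the element type Union.
function add1_branch!(r, x)
    for i in eachindex(r, x)
        v = x[i]
        if v isa Int
            r[i] = v + 1
        elseif v isa Float64
            r[i] = v + 1
        elseif v isa Bool
            r[i] = v + 1
        else
            # Assumption: the only remaining member of the Union is ComplexF64.
            r[i] = (v::ComplexF64) + 1
        end
    end
    return r
end

x4 = Union{Int, Float64, Bool, ComplexF64}[1, 1.1, true, 1.1im]
r4 = similar(x4, Union{Int, Float64, ComplexF64})  # eltype chosen by hand
last_value = loop_index(x4)
add1_branch!(r4, x4)
```

The branch bodies are identical here only because `each1` is trivial; the point is that each branch is compiled for a single concrete type.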