
I have some complex code in which I use when to create a new column under certain conditions. Consider the following code:

df.select(
    '*',
    F.when((A) | (B) | (C), top_val['val']).alias('match'))

Let A, B and C be my conditions. I want to impose an order on these conditions, like this:

If A is satisfied, don't check B and C. If B is satisfied, don't check C.

Is there any way to enforce this order?

user15649753

1 Answer


As stated in this blog and quoted in this answer, I don't think you can guarantee the order in which the operands of an OR expression are evaluated:

Spark SQL (including SQL and the DataFrame and Dataset API) does not guarantee the order of evaluation of subexpressions. In particular, the inputs of an operator or function are not necessarily evaluated left-to-right or in any other fixed order. For example, logical AND and OR expressions do not have left-to-right “short-circuiting” semantics.

However, you can nest the when() inside .otherwise() to form a series like this and achieve what you want:

df.select(
    '*',
    F.when((A),top_val['val'])
    .otherwise(F.when((B),top_val['val'])
                .otherwise(F.when((C), top_val['val']))).alias('match'))
viggnah