I have GPS data:

t   lat long
0   27  28
5   27  28
10  27  28
15  29  49
20  29  49
25  27  28
30  27  28    

I want to calculate the haversine distance between two consecutive lat-long points only when their values differ. So far, I have written a function to calculate the distance:

def distanceTo(lat:Double,long:Double,lag_lat:Double,lag_long:Double): Double = {

  val lat1 = math.Pi / 180.0 * lat
  val lon1 = math.Pi / 180.0 * long
  val lat2 = math.Pi / 180.0 * lag_lat
  val lon2 = math.Pi / 180.0 * lag_long

  // Uses the haversine formula:
  val dlon = lon2 - lon1
  val dlat = lat2 - lat1
  val a = math.pow(math.sin(dlat / 2), 2) + math.cos(lat1) * math.cos(lat2) * math.pow(math.sin(dlon / 2), 2)
  val c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
  val meters = 6372.8 * c * 1000
  meters
}

I then registered the function as a UDF and used it to add a column of haversine distances:

import org.apache.spark.sql.functions.{col, udf}

val udf_odo = udf[Double,Double,Double,Double,Double](distanceTo)

val stoppage_df=lag_df
  .withColumn("re_odo", udf_odo(col("lat"), col("long"),col("lag_latitude"), col("lag_longitude")))

However, I want the function to be called only when the lat-long differs from the previous lat-long; otherwise the column should be 0.
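For illustration, the "only when different" rule can be sketched in plain Scala, outside Spark. The names `haversineMeters` and `distanceIfMoved` are hypothetical and not part of the original code; the first repeats the formula from `distanceTo` so the snippet runs on its own.

```scala
// Same haversine computation as distanceTo in the question,
// repeated here so the snippet is self-contained.
def haversineMeters(lat: Double, lon: Double, prevLat: Double, prevLon: Double): Double = {
  val lat1 = math.toRadians(lat)
  val lon1 = math.toRadians(lon)
  val lat2 = math.toRadians(prevLat)
  val lon2 = math.toRadians(prevLon)
  val dlat = lat2 - lat1
  val dlon = lon2 - lon1
  val a = math.pow(math.sin(dlat / 2), 2) +
    math.cos(lat1) * math.cos(lat2) * math.pow(math.sin(dlon / 2), 2)
  val c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
  6372.8 * c * 1000 // Earth radius in km, converted to meters
}

// Hypothetical wrapper: skip the trigonometry when the point has not moved.
def distanceIfMoved(lat: Double, lon: Double, prevLat: Double, prevLon: Double): Double =
  if (lat == prevLat && lon == prevLon) 0.0
  else haversineMeters(lat, lon, prevLat, prevLon)
```

Note that the formula itself already evaluates to 0 for identical points (both sine terms are 0), so the guard mainly avoids unnecessary computation rather than changing the result.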

Shaido
experiment

1 Answer

Your condition is not clear in the question, but you can use a "when" clause and specify your condition in it. See below:

val stoppage_df = lag_df.withColumn("re_odo",
  when(<condition>, udf_odo(col("lat"), col("long"), col("lag_latitude"), col("lag_longitude")))
    .otherwise(0.0))

I recommend you refer to the link for more details on performing column operations based on conditions.
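As a concrete illustration, here is a minimal end-to-end sketch using the sample data from the question. It assumes the condition means "either coordinate differs from the previous row" and that `lag_df` was built with a window `lag` ordered by `t`; the question does not show how `lag_df` was created, so that part, the `moved` name, and the `StoppageExample` object are assumptions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, lag, udf, when}

object StoppageExample {
  // Haversine distance in meters, as defined in the question.
  def distanceTo(lat: Double, long: Double, lag_lat: Double, lag_long: Double): Double = {
    val lat1 = math.Pi / 180.0 * lat
    val lon1 = math.Pi / 180.0 * long
    val lat2 = math.Pi / 180.0 * lag_lat
    val lon2 = math.Pi / 180.0 * lag_long
    val dlon = lon2 - lon1
    val dlat = lat2 - lat1
    val a = math.pow(math.sin(dlat / 2), 2) +
      math.cos(lat1) * math.cos(lat2) * math.pow(math.sin(dlon / 2), 2)
    val c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    6372.8 * c * 1000
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("re_odo").getOrCreate()
    import spark.implicits._

    val gps = Seq(
      (0, 27.0, 28.0), (5, 27.0, 28.0), (10, 27.0, 28.0), (15, 29.0, 49.0),
      (20, 29.0, 49.0), (25, 27.0, 28.0), (30, 27.0, 28.0)
    ).toDF("t", "lat", "long")

    // Assumed construction of lag_df: previous row's coordinates, ordered by t.
    // A window without partitionBy moves all rows to one partition; fine for a demo.
    val w = Window.orderBy("t")
    val lag_df = gps
      .withColumn("lag_latitude", lag(col("lat"), 1).over(w))
      .withColumn("lag_longitude", lag(col("long"), 1).over(w))

    val udf_odo = udf(distanceTo _)

    // Assumed condition: the point moved if either coordinate changed.
    // On the first row the lag columns are null, so the condition is null
    // and the row falls through to otherwise(0.0).
    val moved = col("lat") =!= col("lag_latitude") || col("long") =!= col("lag_longitude")

    val stoppage_df = lag_df.withColumn("re_odo",
      when(moved, udf_odo(col("lat"), col("long"), col("lag_latitude"), col("lag_longitude")))
        .otherwise(0.0))

    stoppage_df.show()
    spark.stop()
  }
}
```

With the sample data, only the rows at t = 15 and t = 25 should get a non-zero `re_odo`, since those are the only rows whose coordinates differ from the previous row.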

Ana