I want to store the result of the line below in a column of the same df DataFrame.
df.filter(F.abs(df.Px) < 0.005).count()
How can I do that?
You can do that using union. However, appending the count as an extra row under a particular column is not good practice: your DataFrame may have multiple columns, and you would still only get a single extra row containing the new count value. An example snippet is below.
import pandas as pd
from pyspark.sql import Row
from pyspark.sql import functions as func

df = spark.createDataFrame(pd.DataFrame([0.01, 0.003, 0.004, 0.005, 0.02],
                                        columns=['Px']))
n_px = df.filter(func.abs(df['Px']) < 0.005).count()  # number of rows with |Px| < 0.005
df_count = spark.sparkContext.parallelize([Row(**{'Px': n_px})]).toDF()  # one-row DataFrame holding the count
df_union = df.union(df_count)
df_union.show()
+-----+
| Px|
+-----+
| 0.01|
|0.003|
|0.004|
|0.005|
| 0.02|
| 2.0|
+-----+
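If you really do want the count in a column of the same DataFrame, as the question asks, a minimal sketch would attach it as a literal value to every row instead of appending a row. The column name n_small here is just an illustrative choice, not anything from the original question:

from pyspark.sql import functions as func

# Compute the count once, then attach it to every row as a constant column.
n_px = df.filter(func.abs(df['Px']) < 0.005).count()
df_with_count = df.withColumn('n_small', func.lit(n_px))  # same rows, plus a column holding the count
df_with_count.show()

Every row then carries the same count value, which is usually easier to work with downstream than an extra row whose meaning differs from the rest of the data.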