In Spark this may be more difficult, as there is no index to resample on. To expand months into days, you could follow these steps:
- use the window function lead to find the next date
- create an array with the range of days up to (but not including) that date
- explode the array
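To make the logic of these steps easier to follow, here is a minimal plain-Python sketch that mimics them without Spark. The function name expand_months_to_days and the tuple layout are hypothetical, chosen only for illustration; this is not how Spark executes the query, just a trace of the same idea on the example data.

```python
from datetime import date, timedelta

def expand_months_to_days(rows):
    """rows: list of (month_start, type, amount); returns one tuple per day.

    Hypothetical helper that imitates lead -> sequence -> explode.
    """
    rows = sorted(rows, key=lambda r: r[0])
    out = []
    for i, (start, kind, amount) in enumerate(rows):
        # Step 1 ("lead"): look at the next row's date, if there is one.
        next_start = rows[i + 1][0] if i + 1 < len(rows) else None
        # The range ends the day before the next date;
        # with no next row it collapses to the start day itself.
        end = next_start - timedelta(days=1) if next_start else start
        # Step 2: build the array of days from start to end, inclusive.
        days = [start + timedelta(days=n) for n in range((end - start).days + 1)]
        # Step 3 ("explode"): emit one output row per day.
        out.extend((d, kind, amount) for d in days)
    return out

rows = [
    (date(2022, 2, 1), "schet_21", 131329.55),
    (date(2022, 3, 1), "schet_22", 7716.1),
]
daily = expand_months_to_days(rows)
print(len(daily))  # 29: 28 days of February plus the single March row
```

Note that the last month yields only one row, because there is no following date to bound its range.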
Example input:
from pyspark.sql import functions as F, Window as W
df = spark.createDataFrame(
    [('2022-02-01 00:00:00', 'schet_21', 131329.55),
     ('2022-03-01 00:00:00', 'schet_22', 7716.1)],
    ['months', 'type', 'summaoborotdt']
).withColumn('months', F.to_timestamp('months'))
Script:
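# Day before the next row's month start; null for the last row
# (no partitionBy here, so Spark will move all rows to a single partition)
last_day = F.date_sub(F.lead('months').over(W.orderBy('months')), 1)
df = df.select(
    # Daily timestamps from 'months' to last_day; the last row falls back to a single day
    F.sequence('months', F.coalesce(last_day, 'months')).alias('days'),
    *[c for c in df.columns if c != 'months']
).withColumn('days', F.explode('days'))  # one output row per day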
Result:
df.show(99)
# +-------------------+--------+-------------+
# |               days|    type|summaoborotdt|
# +-------------------+--------+-------------+
# |2022-02-01 00:00:00|schet_21|    131329.55|
# |2022-02-02 00:00:00|schet_21|    131329.55|
# |2022-02-03 00:00:00|schet_21|    131329.55|
# |2022-02-04 00:00:00|schet_21|    131329.55|
# |2022-02-05 00:00:00|schet_21|    131329.55|
# |2022-02-06 00:00:00|schet_21|    131329.55|
# |2022-02-07 00:00:00|schet_21|    131329.55|
# |2022-02-08 00:00:00|schet_21|    131329.55|
# |2022-02-09 00:00:00|schet_21|    131329.55|
# |2022-02-10 00:00:00|schet_21|    131329.55|
# |2022-02-11 00:00:00|schet_21|    131329.55|
# |2022-02-12 00:00:00|schet_21|    131329.55|
# |2022-02-13 00:00:00|schet_21|    131329.55|
# |2022-02-14 00:00:00|schet_21|    131329.55|
# |2022-02-15 00:00:00|schet_21|    131329.55|
# |2022-02-16 00:00:00|schet_21|    131329.55|
# |2022-02-17 00:00:00|schet_21|    131329.55|
# |2022-02-18 00:00:00|schet_21|    131329.55|
# |2022-02-19 00:00:00|schet_21|    131329.55|
# |2022-02-20 00:00:00|schet_21|    131329.55|
# |2022-02-21 00:00:00|schet_21|    131329.55|
# |2022-02-22 00:00:00|schet_21|    131329.55|
# |2022-02-23 00:00:00|schet_21|    131329.55|
# |2022-02-24 00:00:00|schet_21|    131329.55|
# |2022-02-25 00:00:00|schet_21|    131329.55|
# |2022-02-26 00:00:00|schet_21|    131329.55|
# |2022-02-27 00:00:00|schet_21|    131329.55|
# |2022-02-28 00:00:00|schet_21|    131329.55|
# |2022-03-01 00:00:00|schet_22|       7716.1|
# +-------------------+--------+-------------+