I'm a longtime T-SQL user and new to Python. I inherited a project in which one of the processes hard-codes a number instead of calculating it dynamically. The value is simply the number of months between two dates. Once it is assigned to a variable, the integer is used in a later calculation. My problem is that the only solutions I have found use months_between() inside a DataFrame. The value is calculated correctly, but the downstream process needs a plain integer as input and cannot read it from a DataFrame.
In SQL, I would have written:
DECLARE @varMonths INT
SET @varMonths = (SELECT DATEDIFF(mm, date1, date2))
In Python I tried:
elig_endnum2 = spark.sql("select round(months_between(current_date(), date('2007-01-01')),0)")
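That call hands back a DataFrame, whereas what I need in elig_endnum2 is an ordinary Python int. To make the target concrete, here is a minimal sketch of the value I'm after, using only the standard library and no Spark. It assumes the T-SQL DATEDIFF(mm, ...) behaviour of counting month boundaries crossed, which is not necessarily what round(months_between(...), 0) returns near the ends of a month. The helper name months_diff is hypothetical.

```python
from datetime import date


def months_diff(start: date, end: date) -> int:
    """Count month boundaries crossed going from start to end.

    Assumption: this mirrors T-SQL DATEDIFF(mm, start, end), which ignores
    the day of the month, rather than Spark's fractional months_between().
    """
    return (end.year - start.year) * 12 + (end.month - start.month)


# Same dates as the Spark attempt: months from 2007-01-01 up to today.
elig_endnum2 = months_diff(date(2007, 1, 1), date.today())

# elig_endnum2 is a plain int, so it can be passed straight to later code.
print(type(elig_endnum2).__name__, elig_endnum2)
```

If the value has to come from the Spark query instead, my understanding is that the single cell can be pulled to the driver with something like .first()[0] on the DataFrame and then wrapped in int(), but I have not confirmed that this is the appropriate way to do it.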
If someone could provide me a little direction and a link to a resource on the appropriate way to solve this, I'd be grateful.