Thanks for the prompt responses. Based on the responses, I have modified the question and also provided numeric code example.
I am from Market Research industry. We analyse survey databases. One of the requirements of the survey tables is that blank rows & columns should not get suppressed. Blank rows and / or columns may result when we are generating table on filtered database.
To avoid this zero suppression, we create a blank table with all rows / columns, then create actual table using Pandas and update the blank table with the actual table numbers using Pandas pd.update
function. This way, we retain rows / columns with zero estimates. My sincere apologies for not pasting code as this is my first question on Stack Overflow.
Here's the example dataframe:
dict = { 'state':
['state 1', 'state 2', 'state 3', 'state 4', 'state 5', 'state 6', 'state 7', 'state 8', 'state 9', 'state 10'],
'development': ['Low', 'Medium', 'Low', 'Medium', 'High', 'Low', 'Medium', 'Medium', 'Low', 'Medium'],
'investment': ['50-500MN', '<50MN', '<50MN', '<50MN', '500MN+', '50-500MN', '<50MN', '50-500MN', '<50MN', '<50MN'],
'population': [22, 19, 25, 24, 19, 21, 33, 36, 22, 36],
'gdp': [18, 19, 29, 23, 22, 19, 35, 18, 26, 27]
}
I convert it into a dataframe:
df = pl.DataFrame(dict)
I filter it using a criteria:
df2 = df.filter(pl.col('development') != 'High')
And then generate a pivot table
df2.pivot(index='development', columns='investment', values='gdp')
The resulting table has one row suppressed ('High' development) and one column suppressed ('>500MN' investment).
The solution I am looking for is to update the blank table with all rows and columns with the pivot table generated. Wherever there are no values, they would be replaced with a zero.