I have a dataframe of shape 2701x128.
It has a lot of missing values. The thing is that some rows are 95% filled and others only 5%. Let me try to visualize it:
X-axis: row number (after sorting); y-axis: number of non-zero values in that row (sorted, histogram-like).
X-axis: column number (after sorting); y-axis: how many non-zero values that column has over all rows (sorted, histogram-like).
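For reference, the two sorted curves described above can be computed roughly like this. It is only a minimal sketch: the `filled_counts` helper and the synthetic `df` are made up for illustration, and it assumes missing cells are stored as NaN (pass `missing_is_zero=True` if zeros mark missing cells instead).

```python
import numpy as np
import pandas as pd

def filled_counts(df, missing_is_zero=False):
    """Return (per-row, per-column) counts of filled cells, sorted descending."""
    # Assumption: a cell is missing if it is NaN; optionally zeros count as missing too.
    filled = df.notna()
    if missing_is_zero:
        filled = filled & df.ne(0)
    per_row = filled.sum(axis=1).sort_values(ascending=False).reset_index(drop=True)
    per_col = filled.sum(axis=0).sort_values(ascending=False).reset_index(drop=True)
    return per_row, per_col

# Synthetic stand-in for the real 2701x128 dataframe (hypothetical data).
rng = np.random.default_rng(0)
values = rng.normal(size=(2701, 128))
# Each row gets its own fill rate between 5% and 95%.
fill_rate = rng.uniform(0.05, 0.95, size=(2701, 1))
mask = rng.uniform(size=values.shape) < fill_rate
df = pd.DataFrame(np.where(mask, values, np.nan))

per_row, per_col = filled_counts(df)
```

Calling `per_row.plot()` and `per_col.plot()` (with matplotlib installed) should then give curves of the same kind as the two plots.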
What I need: to impute the data as accurately as I can, because this is the problem I need to solve. The problem: I can't fill everything in with means, medians, and other summary statistics, because that is very rough. I also can't build a usual learning model, because there's NO structure in the missing data.
Can you please suggest something as accurate as a learning model, which can model the distribution but is also able to deal with completely random missing values? So, apparently, the main problem is how to build a dataset from these unstructured missing values. I can't find a solution at the moment.