
Assume I have a 1D array and want to fill each block of n consecutive NaNs with copies of the n/2 non-NaN values preceding it and the n/2 non-NaN values following it.

Example 1:

input = [1, 2, NaN, NaN, NaN, NaN, 3, 2]
output= [1, 2,   1,   2,   3,   2, 3, 2]

Example 2: if n is odd, fill with n//2+1 previous values and n//2 subsequent values

input = [1, 2, 3, NaN, NaN, NaN, 4, 2]
output= [1, 2, 3,   2,   3,   4, 4, 2]

Example 3: if not enough non-NaN neighbours are available, repeat the available ones (in this example the value 3)

input  = [3, NaN, NaN, NaN, NaN, 4, 2]
output = [3,   3,   3,   4,   2, 4, 2]

My preliminary solution looks like this:

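import numpy as np

def fillna_with_neighbours(data):
    # get start / stop of nan blocks: pad with zeros so that blocks touching
    # either end of the array still yield both a start and a stop index
    nan_blocks = np.where(np.isnan(data), 1, 0)
    nan_blocks = np.concatenate([[0], nan_blocks, [0]])
    nan_blocks = np.where(np.diff(nan_blocks) != 0)[0]
    nan_blocks = nan_blocks.reshape(-1, 2)

    for nan_start, nan_end in nan_blocks:
        n = nan_end - nan_start
        # odd n: one more value is taken from the left than from the right
        n_pre = n // 2 if n % 2 == 0 else n // 2 + 1
        nan_pre = nan_start - n_pre if nan_start >= n_pre else 0
        n_post = n // 2
        nan_post = nan_end + n_post

        pre = data[nan_pre:nan_start]
        post = data[nan_end:nan_post]

        # too few neighbours: repeat the available ones cyclically
        # (np.resize of an empty slice yields zeros, e.g. for a leading NaN block)
        if pre.size < n_pre:
            pre = np.resize(pre, n_pre)
        if post.size < n_post:
            post = np.resize(post, n_post)

        # note: this writes into data in place
        data[nan_start:nan_end] = np.hstack([pre, post])
    return data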

ex1 = np.asarray([1, 2, np.nan, np.nan, np.nan, np.nan, 3, 2])
ex2 = np.asarray([1, 2, 3, np.nan, np.nan, np.nan, 4, 2])
ex3 = np.asarray([3, np.nan, np.nan, np.nan, np.nan, 4, 2])

Is there a ready-made (SciPy?) function for this problem? I am sure there are much better ways to do this.

dohe
  • It looks like an assignment/homework. SO is not a homework-solving workforce. What have you tried so far? Provide proof of attempts with actual code. I would advise people not to answer until you have shown some minimal effort to solve your own assignment. – LoneWanderer Nov 23 '22 at 13:53
  • it is no assignment. just thought that one of you guys will come up with a much nicer way to solve this. – dohe Nov 23 '22 at 15:39
  • that is a very different question then ... – LoneWanderer Nov 23 '22 at 16:14

1 Answer


Considering this comment from OP:

it is no assignment. just thought that one of you guys will come up with a much nicer way to solve this.

The question is too broad in this form, but here are some leads.

If you are using pandas, you can have a look at the following functions, which are designed to fill missing data using various methods (quotes follow):

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.fillna.html

Fill NA/NaN values using the specified method.

method : {‘backfill’, ‘bfill’, ‘pad’, ‘ffill’, None}, default None
Method to use for filling holes in reindexed Series. pad / ffill: propagate last valid observation forward to next valid. backfill / bfill: use next valid observation to fill gap.

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.interpolate.html

Fill NaN values using an interpolation method.
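
To make the difference between these methods concrete, here is a minimal sketch that applies them to the array from Example 1 of the question. It uses Series.ffill, Series.bfill and Series.interpolate; the variable names are illustrative only. None of the three reproduces the neighbour-copy pattern asked for, which is why they are leads rather than a solution.

```python
import numpy as np
import pandas as pd

# Example 1 from the question
s = pd.Series([1, 2, np.nan, np.nan, np.nan, np.nan, 3, 2])

forward = s.ffill()        # repeats the last valid value: 1, 2, 2, 2, 2, 2, 3, 2
backward = s.bfill()       # repeats the next valid value: 1, 2, 3, 3, 3, 3, 3, 2
linear = s.interpolate()   # linear by default: 1, 2, 2.2, 2.4, 2.6, 2.8, 3, 2
```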

If you are not using pandas, you will most likely have to implement equivalent logic yourself using NumPy or plain Python.
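
As one possible starting point for the NumPy route, here is a minimal sketch, not a definitive implementation. The name fill_nan_blocks is hypothetical. It also makes three assumptions that the question does not state: neighbours are drawn from the non-NaN values only (so another NaN block nearby is skipped over), a block with no values on one side is filled entirely from the other side, and an all-NaN input is returned unchanged.

```python
import numpy as np


def fill_nan_blocks(data):
    """Return a copy of `data` with each NaN run filled from its neighbours."""
    out = np.array(data, dtype=float)      # copy, so the input is not modified
    isnan = np.isnan(out)
    valid = out[~isnan]                    # non-NaN values, in original order
    valid_idx = np.flatnonzero(~isnan)     # their positions in `out`
    if valid.size == 0:
        return out                         # assumption: nothing to copy from

    # Pad with False so that runs touching either end still produce two edges.
    padded = np.concatenate(([False], isnan, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1]).reshape(-1, 2)

    for start, stop in edges:              # each NaN run is out[start:stop]
        n = stop - start
        n_pre, n_post = (n + 1) // 2, n // 2     # odd n: extra value from the left
        k = np.searchsorted(valid_idx, start)    # number of valid values before the run
        if k == 0:                         # assumption: no left side, use right only
            n_pre, n_post = 0, n
        elif k == valid.size:              # assumption: no right side, use left only
            n_pre, n_post = n, 0
        pre = valid[max(k - n_pre, 0):k]
        post = valid[k:k + n_post]
        # np.resize repeats a short slice cyclically up to the requested length.
        out[start:stop] = np.concatenate(
            (np.resize(pre, n_pre), np.resize(post, n_post))
        )
    return out
```

For the three examples in the question this returns the stated outputs, and unlike the in-place version in the question it leaves the input array untouched.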

LoneWanderer