
I have two columns of data. The second column is time in seconds; the first column is the error at that time. I need to build a vector containing the error value at 2.5 s intervals; there should be 172 values. Here is my data (col 0 = error, col 1 = time in seconds):

 array([[0.00, 0.01],
   [1.91, 9.60],
   [0.00, 19.08],
   [2.05, 28.64],
   [1.04, 38.19],
   [1.89, 47.73],
   [1.69, 57.27],
   [2.24, 66.79],
   [1.89, 76.33],
   [1.86, 85.88],
   [2.37, 95.39],
   [2.29, 104.93],
   [2.03, 114.45],
   [2.16, 123.99],
   [1.34, 133.52],
   [2.40, 143.03],
   [2.17, 152.54],
   [0.00, 162.03],
   [1.61, 171.59],
   [2.31, 181.13],
   [2.15, 190.67],
   [2.22, 200.19],
   [2.16, 209.72],
   [0.00, 219.20],
   [2.65, 228.76],
   [1.74, 238.33],
   [0.00, 247.85],
   [2.33, 257.42],
   [1.85, 266.94],
   [0.00, 276.50],
   [2.27, 286.06],
   [1.67, 295.62],
   [2.41, 305.15],
   [0.00, 314.65],
   [1.32, 324.21],
   [2.39, 333.74],
   [2.19, 343.27],
   [2.51, 352.81],
   [2.41, 362.33],
   [1.79, 371.86],
   [0.00, 381.36],
   [3.07, 390.93],
   [2.30, 400.47],
   [0.00, 409.98],
   [2.41, 419.54],
   [2.22, 0.05],
   [1.75, 9.59],
   [2.18, 19.14],
   [1.99, 28.64],
   [1.80, 38.16],
   [1.45, 47.68],
   [1.57, 57.21],
   [2.24, 66.74],
   [0.00, 76.24],
   [2.31, 85.80],
   [0.00, 95.29],
   [2.39, 104.85],
   [0.00, 114.34],
   [0.95, 123.89],
   [2.35, 133.42],
   [2.43, 142.98],
   [1.66, 152.48],
   [1.08, 162.01],
   [0.00, 171.53],
   [1.20, 181.08],
   [2.43, 190.64],
   [2.42, 200.16],
   [2.59, 209.69],
   [1.98, 219.22],
   [1.75, 228.76],
   [2.28, 238.28],
   [1.98, 247.80],
   [1.08, 257.33],
   [2.08, 266.84],
   [2.30, 276.37],
   [0.00, 285.84],
   [1.38, 295.40],
   [2.19, 304.95],
   [0.00, 314.44],
   [1.54, 324.01],
   [2.19, 333.52],
   [0.00, 343.02],
   [2.13, 352.59],
   [2.31, 362.13],
   [0.00, 371.61],
   [2.36, 381.18],
   [2.02, 390.71],
   [2.68, 400.24],
   [0.00, 409.71],
   [2.19, 419.28]])

I tried a linear interpolator with the following code, but got the error `ValueError: A value in x_new is below the interpolation range.`

import numpy as np
#import scipy
#import matplotlib.pyplot as plt 
from scipy import interpolate 
float_formatter = lambda x: "%.2f" % x
#np.set_printoptions(formatter={'float_kind':float_formatter})

# Read the text file with the errors - error,time format
orig=np.genfromtxt('Error_Onsets.csv',delimiter=',')
print(repr(orig))
# Build a linear interpolator, giving it the known time (X) and error (Y)
interpf = interpolate.interp1d(orig[:,1],orig[:,0],kind='linear')

# What's the TR?
TR=2.5

# Setup the new vector of times, spaced by TRs
new_times=np.arange(0,172*TR,TR)

# Interpolate using the func defined above to get the error at any TR
new_err = interpf(new_times)

I read that this may be because x values need to be steadily increasing for linear interpolation to be appropriate. I'd appreciate any advice.
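Both concerns (whether the times increase steadily, and whether a requested time falls below the sampled range) can be checked directly. The following is a minimal sketch on a handful of time values copied from the data above; the variable names are illustrative and not part of the original script.

```python
import numpy as np

# Six time values from the data above: three from the start,
# three from the point where the time column restarts near zero.
t = np.array([0.01, 9.60, 19.08, 0.05, 9.59, 19.14])

steps = np.diff(t)
is_increasing = bool(np.all(steps > 0))

# Index of the first row of each new run (where time jumps backwards).
restarts = np.flatnonzero(steps < 0) + 1
runs = np.split(t, restarts)

print(is_increasing)                # whether the time column is strictly ascending
print(restarts)                     # where it restarts
print([r.tolist() for r in runs])   # the separate ascending runs
print(t.min())                      # smallest sampled time; t=0 lies below it
```

If the smallest sampled time is greater than 0, asking the interpolator for t=0 is a request outside the input range, independent of the ordering question.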

Maria
  • Take a look at the `bounds_error` and/or `fill_value` arguments to [`interp1d`](http://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.interp1d.html#scipy.interpolate.interp1d). Specifically, the behavior when you don't specify `bounds_error` and you ask for a value outside of the input range (like you are doing, at t=0). – jedwards Apr 23 '16 at 01:05
  • And w/r/t your question about having to use steadily increasing values, you do not. Unless you specify `assume_sorted=True`, which you're not, the vectors will be sorted by the function. – jedwards Apr 23 '16 at 01:25
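A minimal sketch of the `bounds_error` / `fill_value` suggestion, using the first four rows of the data above. It assumes a SciPy version in which `fill_value` accepts a `(below, above)` tuple; filling with the end values is only one possible choice, picked here for illustration.

```python
import numpy as np
from scipy import interpolate

# First four (error, time) rows from the question's data.
sample = np.array([[0.00, 0.01],
                   [1.91, 9.60],
                   [0.00, 19.08],
                   [2.05, 28.64]])
times, errors = sample[:, 1], sample[:, 0]

# bounds_error=False stops interp1d from raising on out-of-range requests;
# fill_value=(below, above) supplies the values returned there instead.
interpf = interpolate.interp1d(times, errors, kind='linear',
                               bounds_error=False,
                               fill_value=(errors[0], errors[-1]))

TR = 2.5
new_times = np.arange(0, 30, TR)   # starts at t=0, just below the first sample at 0.01
new_err = interpf(new_times)
print(np.column_stack((new_err, new_times)))
```

With these settings the request at t=0 returns the first error value rather than raising `ValueError`.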

1 Answer


I'd usually do this without interpolation, just using the most recent value (so no sampling from future data):

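# Assumes orig[:,1] (time) is sorted ascending, as np.searchsorted requires
times = np.arange(orig[0,1], orig[-1,1], 2.5)
# Index of the last sample at or before each new time
indexes = np.searchsorted(orig[:,1], times, side='right') - 1
np.column_stack((orig[indexes,0], times))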

This gives you two columns: the most recent error value as of each new time, and the new times themselves, spaced 2.5 s apart.
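`np.searchsorted` only gives meaningful indexes when the array it searches is ascending, and the time column in the question restarts near zero partway through. The sketch below shows one possible way to handle that, by sorting the rows by time before the lookup. Whether the two stretches should really be merged, rather than resampled separately, is an assumption; the six rows are copied from the question's data and the names `order`, `sorted_rows`, and `held` are illustrative.

```python
import numpy as np

# Six (error, time) rows from the question: the time column restarts after row 3.
orig = np.array([[0.00, 0.01],
                 [1.91, 9.60],
                 [0.00, 19.08],
                 [2.22, 0.05],
                 [1.75, 9.59],
                 [2.18, 19.14]])

# Order the rows by time so the searched column is ascending.
order = np.argsort(orig[:, 1], kind='stable')
sorted_rows = orig[order]

times = np.arange(sorted_rows[0, 1], sorted_rows[-1, 1], 2.5)
# side='right' then -1 selects the last sample at or before each new time.
indexes = np.searchsorted(sorted_rows[:, 1], times, side='right') - 1
held = np.column_stack((sorted_rows[indexes, 0], times))
print(held)
```

Each output row holds the error from the latest sample at or before that time, so no value is taken from future data.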

John Zwinck
  • I don't understand what this is doing. I get the following array: array([[0.01, 0.01], [2.51, 0.05], [5.01, 0.05], [7.51, 0.05], [10.01, 9.59], [12.51, 9.59], [15.01, 9.59], [17.51, 9.59], ... the first value should be error, but error never goes above 5, so I don't understand why I'm getting huge values for my error – Maria Apr 24 '16 at 20:04
  • @Maria: I apologize, in the last line I had swapped the columns by mistake. I have updated the answer now, please try it. The first column should be the errors, second column the times. – John Zwinck Apr 25 '16 at 01:59
  • Thank you! But it still seems to be interpolating (or something...). I get the following: `array([[0.00, 0.01], [2.22, 2.51], [2.22, 5.01], [2.22, 7.51], [1.75, 10.01], [1.75, 12.51],`... The error is 0 until after 9 s, so I'm not sure why it's >2 at 2.51, 5.01, etc. – Maria Apr 25 '16 at 17:15
  • I think you're right that this is a better way to do what I'm looking for. However, since I wasn't thinking about doing it this way, I haven't given you all the relevant information. I think I'm going to make a new post that has more info and is easier to follow, to come up with a solution. Thanks for your help! – Maria Apr 25 '16 at 17:26