I am using Python to compare two datasets (xDAT and yDAT), each composed of 240 distance measurements taken over a period of time. However, xDAT is offset by a non-linear amount. This amount is equal to the width of a time-dependent, dynamic medium, which I call level-A. More specifically, xDAT measures from the origin to the top of level-A, whereas yDAT measures from the origin to the bottom of level-A. See the following diagram:

enter image description here

In order to compare the two curves, I must first apply a correction to xDAT to make up for its offset (the width of level-A).

So far, I have experimented with different polynomial degrees in numpy's polyfit. For example:

import numpy as np

# Fit a degree-5 polynomial that maps xDAT values to yDAT values
coefs = np.polynomial.polynomial.polyfit(xDAT, yDAT, 5)
# Evaluate that polynomial at every xDAT value (replaces the term-by-term loop)
polyEST = np.polynomial.polynomial.polyval(xDAT, coefs)

The problem with this method is that when I plot polyEST (the corrected version of xDAT), the plot still does not match the trend of yDAT and remains offset. Please see the figure below, where xDAT = blue, corrected xDAT = red, and yDAT = green:

enter image description here

Ideally, the corrected xDAT should still remain noisier than the yDAT, but the general oscillation and trend of the curves should match.

I would greatly appreciate help on implementing a different curve-fitting and parameter estimation technique in order to correct for the non-linear offset caused by level-A.

Thank you.

LexStJ
  • I don't think it is clear what you are trying to do. You call `polyfit` as if `xDAT` and `yDAT` are the `x`- and `y`- coordinates of a single set of sample points, yet, your problem description and plot suggest that they are not related as such. – Stelios Jan 08 '17 at 20:32
  • By your description the offset is a constant. Using a polynomial with degree > 0 seems like the wrong approach. Also, from your plot xDat and yDat do not seem "lined up": xDat is clearly not yDat +/- some constant, and it appears to be "flipped" or phase shifted as well. – Oliver Dain Jan 08 '17 at 20:39
  • Thank you Stelios. In the introduction, I mentioned that both xDAT and yDAT are datasets of 240 points. Each of those points corresponds to a measurement at a specific time; both datasets were measured at equal time increments. I used polyfit to derive a polynomial that describes the difference between the two datasets, and I then used that polynomial to correct for the offset seen in xDAT. I am convinced that there is a more appropriate method than np.polyfit to correct for the level-A offset observed in xDAT. – LexStJ Jan 08 '17 at 20:44
  • Yes, there appears to be a slight phase shift and flip, however I am not sure how to correct for that. As seen in the simple diagram, there is a slight difference as to the location of the samples. However, the samples in each set were taken at identical time points. In which case, the phase shift and flip is caused by the depth of the medium and slight difference in the horizontal locations of the test points. Thank you. – LexStJ Jan 08 '17 at 20:50
  • Note: The graph of corrected xDAT (in red) shows a plateau between observation numbers 75-80, but I do not see this feature in the original xDAT (in blue). – James Phillips Jan 09 '17 at 09:33

1 Answer

The answer depends on what level-A is. If its width is independent of the measured distances (that is, it varies only with time), your first line should be something like

coefs = np.polynomial.polynomial.polyfit(np.arange(xDAT.size), yDAT - xDAT, 5)

This fits a polynomial in the sample index to the difference yDAT - xDAT, which models an independent level-A as drawn. The corrected x should then be

xDAT + np.polynomial.polynomial.polyval(np.arange(xDAT.size), coefs)
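To make the two lines above concrete, here is a minimal runnable sketch. The real xDAT and yDAT are not available, so the arrays below are synthetic stand-ins, and the shape of the level-A width, the noise level, and the names `t`, `width`, `xCORR`, and `residual` are all assumptions made for illustration only.

```python
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(0)
t = np.arange(240)  # sample index, standing in for equally spaced time points

# Synthetic stand-ins (assumptions, not the real measurements):
width = 2.0 + 0.01 * t + 1e-5 * t**2          # slowly varying level-A width
yDAT = 10.0 + np.sin(2 * np.pi * t / 60.0)    # origin to bottom of level-A
xDAT = yDAT - width + rng.normal(0.0, 0.05, t.size)  # offset by the width, plus noise

# Fit a degree-5 polynomial in the sample index to the difference yDAT - xDAT
coefs = P.polyfit(t, yDAT - xDAT, 5)

# Add the fitted offset back onto xDAT
xCORR = xDAT + P.polyval(t, coefs)

# What remains after correction should be roughly the measurement noise
residual = xCORR - yDAT
print("mean |offset| before:", np.abs(xDAT - yDAT).mean())
print("mean |offset| after: ", np.abs(residual).mean())
```

Because the polynomial is fitted against the sample index rather than against xDAT itself, it can only remove a smooth, time-dependent offset. It leaves any phase difference between the two curves untouched.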

If A depends on the variables (as it appears to), you don't want to use polyfit, because it only regresses the real part of the oscillation (the "spring" part of a spring-damper system). That is why your corrected xDAT is in phase with xDAT instead of yDAT. To regress something like that you'll need to use Fourier transforms (which is not my specialty).
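As one possible starting point for the Fourier route, the sketch below estimates a single constant lag between two series from the phase of their dominant frequency component. It is not necessarily what a full solution would look like. It assumes the phase shift is constant over the record, that one frequency dominates, and that a circular shift is acceptable. The data are again synthetic, and `true_shift`, `lag`, and `xALIGNED` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(240)
period = 60.0     # assumed oscillation period, in samples
true_shift = 7    # assumed lag of xDAT behind yDAT, in samples

# Synthetic stand-ins (assumptions): same oscillation, xDAT delayed and noisier
yDAT = np.sin(2 * np.pi * t / period)
xDAT = np.sin(2 * np.pi * (t - true_shift) / period) + rng.normal(0.0, 0.05, t.size)

# Real-input FFT of the mean-removed series
X = np.fft.rfft(xDAT - xDAT.mean())
Y = np.fft.rfft(yDAT - yDAT.mean())

# Dominant non-DC frequency bin of yDAT
k = np.argmax(np.abs(Y[1:])) + 1

# Phase of yDAT relative to xDAT at that bin, converted to a lag in samples
phase_diff = np.angle(Y[k] * np.conj(X[k]))
lag = phase_diff * t.size / (2 * np.pi * k)

# Undo the lag with a circular shift (wraps around at the ends)
xALIGNED = np.roll(xDAT, -int(round(lag)))
print("estimated lag in samples:", lag)
```

On real data the lag may drift over time, in which case a windowed version of the same estimate, or a cross-correlation over short segments, may be more appropriate. The polynomial offset correction can still be applied after the alignment.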

Daniel F