Sorry for the long question; maybe there is a quick solution, but I haven't been able to figure this out. I have 3 different datasets with x and y data points like the following:
line1_x = [277.75, 287.68, 308.77, 322.23, 342.98, 363.99, 387.59, 396.83, 401.52, 405.78, 408.30,
408.06, 406.94, 406.11, 403.46, 401.14, 397.28, 394.04, 390.62, 386.99, 384.31, 380.41,
375.85, 372.25, 368.35, 364.86, 363.01, 361.84, 361.85, 363.79, 366.25, 368.94, 371.58,
376.11, 380.97, 387.33, 391.61, 392.36, 373.35, 350.68]
line1_y = [806.00, 779.43, 713.71, 731.96, 701.13, 621.68, 525.04, 490.21, 467.48, 455.57, 445.59,
440.54, 436.74, 432.71, 434.32, 438.06, 449.00, 462.25, 476.56, 493.43, 514.02, 538.52,
570.87, 614.20, 664.00, 708.06, 750.61, 786.38, 814.16, 834.67, 853.45, 864.71, 872.50,
882.23, 893.28, 907.94, 922.18, 931.05, 950.03, 1085.45]
line2_x = [443.98, 442.71, 441.24, 437.57, 427.09, 417.82, 418.20, 418.14, 417.76, 414.84, 411.93,
408.14, 404.29, 402.48, 400.63, 398.01, 394.43, 392.05, 391.69, 391.06, 388.61, 384.59,
378.93, 374.67, 372.82, 371.07, 370.50, 370.96, 372.11, 374.10, 377.05, 381.65, 385.54,
388.72, 389.27, 389.00, 389.71, 391.49, 392.60, 385.89, 384.39, 361.87]
line2_y = [299.48, 317.04, 338.92, 360.55, 405.99, 451.64, 493.67, 516.66, 530.73, 540.62, 548.90,
553.39, 555.65, 559.21, 564.41, 571.17, 577.56, 585.31, 592.54, 606.96, 626.18, 651.76,
679.82, 710.62, 744.39, 774.02, 802.14, 829.54, 849.98, 865.51, 879.94, 894.02, 903.30,
910.80, 918.08, 926.39, 935.80, 947.09, 955.08, 965.04, 1076.83, 1110.45]
line3_x = [302.52, 313.15, 340.68, 352.96, 364.85, 378.51, 407.05, 437.33, 453.73, 462.99, 467.79,
470.68, 472.10, 473.56, 473.49, 472.72, 471.14, 468.91, 467.76, 466.58, 463.90, 460.31,
457.23, 453.48, 448.68, 443.65, 438.08, 433.63, 429.14, 424.09, 418.62, 415.17, 414.60,
417.55, 422.72, 429.17, 436.35, 443.75, 450.29, 455.66, 459.26, 461.39, 462.30, 463.23,
469.86, 435.97]
line3_y = [794.42, 809.18, 782.48, 761.93, 771.51, 776.51, 689.07, 560.78, 531.03, 524.44, 518.59,
516.73, 514.03, 511.97, 511.63, 518.31, 532.60, 549.91, 563.77, 582.81, 601.98, 617.95,
628.03, 640.11, 658.30, 679.82, 703.00, 730.40, 754.16, 776.00, 800.85, 824.39, 844.47,
856.76, 866.70, 875.72, 884.18, 892.66, 903.61, 912.93, 922.55, 928.60, 932.00, 952.27,
1029.70, 1065.12]
Each dataset is a sequence of points, so time and location are important; that is why it is not possible to simply add points to the smaller datasets and calculate the average. I've been trying to interpolate the data so that all the interpolated lines have the same spacing, and then use that to calculate the average at each step of the interpolated lines. The data looks like this:
I calculated the interpolation with the interp1d function, spanning from the minimum y to the maximum y across the 3 datasets:
import numpy as np
from scipy.interpolate import interp1d

interpolated_y = np.linspace(minimum_y, maximum_y, num=100, endpoint=True)
interpolated_x = interp1d(line1_y, line1_x, fill_value="extrapolate")(interpolated_y)
And they look like this:
This is good because the spacing between interpolation steps is the same for all the lines and each line gets exactly the same number of interpolated points. However, if I use this to calculate the average at each interpolation step, I lose the information about the starting points, i.e. I am not able to calculate the average between the starting points.
My question is: how can I interpolate the 3 datasets so that each one has the same number of interpolated points with the same distance between points, with the interpolation starting at the original starting point and continuing in the correct direction? For the cases where a dataset is shorter, an extrapolation should happen.
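For reference, here is a rough sketch of the kind of distance-based resampling I have in mind (resample_by_arclength is just a helper name I made up for illustration; I'm reusing interp1d with fill_value="extrapolate" so the shorter lines would get extrapolated past their last point). The idea is to parameterize each line by its cumulative arc length, starting at 0 at the first point, and sample all three lines at the same distances:

import numpy as np
from scipy.interpolate import interp1d

def resample_by_arclength(x, y, distances):
    """Resample a polyline at the given arc-length distances from its start.

    Distances beyond the end of the line are linearly extrapolated.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    # Cumulative distance along the line, 0 at the first point
    seg = np.hypot(np.diff(x), np.diff(y))
    s = np.concatenate([[0.0], np.cumsum(seg)])
    fx = interp1d(s, x, fill_value="extrapolate")
    fy = interp1d(s, y, fill_value="extrapolate")
    return fx(distances), fy(distances)

# Common sampling: 100 points, spacing taken from the longest line,
# so the shorter lines get extrapolated past their last point
lines = [(line1_x, line1_y), (line2_x, line2_y), (line3_x, line3_y)]
max_length = max(np.hypot(np.diff(x), np.diff(y)).sum() for x, y in lines)
distances = np.linspace(0.0, max_length, num=100)

resampled = [resample_by_arclength(x, y, distances) for x, y in lines]

I'm not sure this is the right approach, though, especially the extrapolation part for the shorter lines.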
Or is there another way to calculate an average of these points, taking into account that they are time-sequenced points?
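If the lines were resampled as in the sketch above, I imagine the average path would then just be the point-wise mean, something like:

# Point-wise average of the three resampled lines
avg_x = np.mean([rx for rx, ry in resampled], axis=0)
avg_y = np.mean([ry for rx, ry in resampled], axis=0)

but I'd welcome any other approach that respects the ordering and the starting points.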