Somehow my sensor calibration was off, and I ended up with some seriously shifted data. The measured data is, you guessed it, what I measured, and the baseline is the level where the data should be.

data <- data.frame(hour = c(1:244),
                   measured = c(1151.43, 1151.19, 1150.39, 1149.38, 
                                1149.01, 1148.3, 1147.61, 1146.68, 1145.13, 1144.23, 1151.17, 
                                1145.58, 1140.4, 1139.47, 1138.38, 1137.11, 1136.24, 1135.55, 
                                1134.84, 1134.18, 1133.82, 1135.74, 1159.47, 1180.34, 1169.46, 
                                1136.52, 1131.85, 1132.28, 1132.84, 1134.29, 1135.86, 1136.97, 
                                1142.12, 1188.96, 1231.69, 1254.89, 1246.7, 1202.24, 1156.71, 
                                1146.82, 1148.99, 1150.41, 1151.31, 1151.59, 1151.87, 1157.17, 
                                1190.79, 1210.93, 1209.53, 1179.72, 1153.43, 1153.28, 1153.23, 
                                1153.2, 1152.95, 1152.55, 1152.33, 1152.67, 1152.58, 1154.27, 
                                1163.28, 1153.28, 1150.61, 1149.78, 1148.39, 1147.02, 1146.01, 
                                1144.79, 1143.43, 1142.81, 1142.02, 1141.34, 1140.36, 1139.73, 
                                1139.22, 1138.59, 1137.79, 1137.44, 1136.92, 1136.15, 1135.93, 
                                1136, 1137, 1138.15, 1138.51, 1149.14, 1155.07, 1138.72, 1138.04, 
                                1138.58, 1138.65, 1138.91, 1139.59, 1139.73, 1139.11, 1138.66, 
                                1138.57, 1148.24, 1166.46, 1157.07, 1140.83, 1140.43, 1140.7, 
                                1140.79, 1140.64, 1141.63, 1142.87, 1143.84, 1144.57, 1144.89, 
                                1147.34, 1156.13, 1147.45, 1146.44, 1146.93, 1147.68, 1148.14, 
                                1148.27, 1147.62, 1146.77, 1146.25, 1146.47, 1147.69, 1164.92, 
                                1164.16, 1148.28, 1147.49, 1147.27, 1147.66, 1147.94, 1148.97, 
                                1150.35, 1151.25, 1152.39, 1153.05, 1154.46, 1166.86, 1160.59, 
                                1154.12, 1154.55, 1155.08, 1155.64, 1156.19, 1156.38, 1156.46, 
                                1156.38, 1155.96, 1163.76, 1189.55, 1191.38, 1162.85, 1157.35, 
                                1157.28, 1158, 1158.6, 1159.6, 1160.03, 1160.16, 1160.78, 1161.24, 
                                1161.72, 1164.73, 1161.89, 1162.13, 1162.35, 1162.61, 1162.25, 
                                1161.42, 1160.78, 1160.35, 1159.98, 1159.83, 1165.63, 1186.16, 
                                1182.38, 1159.98, 1158.49, 1158.33, 1159.3, 1160.39, 1160.97, 
                                1161.17, 1161.25, 1161.36, 1161.31, 1162.32, 1169.11, 1160.85, 
                                1160.19, 1160.06, 1159.86, 1158.93, 1158.65, 1158.49, 1158.52, 
                                1157.93, 1157.94, 1179.24, 1195.79, 1179.21, 1156.38, 1156.31, 
                                1157.05, 1158.47, 1159.08, 1159.28, 1159.73, 1160, 1160.1, 1160.04, 
                                1160.12, 1159.18, 1159.05, 1159.07, 1158, 1157.06, 1156.52, 1156.22, 
                                1156.91, 1157.18, 1156.54, 1160.11, 1183.55, 1188.34, 1162.84, 
                                1154.78, 1154.72, 1154.6, 1154.61, 1154.63, 1154.66, 1154.76, 
                                1155.2, 1160.27, 1188.68, 1205.58, 1192.46, 1158.55, 1157.47, 
                                1157.73, 1158.1, 1158.37, 1158.3, 1158.4),
                   baseline = c(1010.1, 1009.2, 1008.8, 1007.8, 1007.1, 1005.5,
                                1004.2, 1002.9, 1001.9, 1000.8, 999.8, 998.7, 997.8,
                                996.8, 996, 995.5, 995.1, 994.4, 993.5, 992.8, 992.4,
                                992.2, 992.2, 992.2, 992.8, 993.6, 995.3, 997.2, 998.4, 
                                999.7, 1001, 1002.1, 1003.1, 1004.1, 1004.7, 1005.2,
                                1006.7, 1008.6, 1009.7, 1010.5, 1010.9, 1011, 1011.2, 
                                1011.8, 1012.1, 1012.3, 1012.9, 1013.2, 1013.4, 1013.3, 
                                1013.4, 1013.1, 1013, 1012.5, 1012.3, 1011.8, 1010.9, 
                                1010.6, 1010, 1009.8, 1008.9,  1008, 1006.4, 1005.3, 1004.8, 1003.1, 1002.2, 1001.3, 1000.7, 
                                1000, 999.8, 999.5, 999.3, 999, 998.9, 998.4, 998.2, 998, 998.2, 
                                998.3, 998.2, 998.6, 998.5, 998.4, 998.3, 998.3, 998.7, 998.4, 
                                998.5, 998.6, 998.7, 998.8, 999, 999.4, 999.6, 1000.1, 1000.6, 
                                1001.2, 1001.1, 1001.2, 1001.5, 1001.8, 1002.1, 1002.6, 1003.2, 
                                1003.8, 1004.3, 1004.6, 1004.8, 1005, 1005.4, 1005.5, 1005.7, 
                                1006.2, 1006.3, 1006.5, 1006.9, 1007.2, 1007.6, 1008.1, 1008.2, 
                                1008.8, 1009, 1009.3, 1009.2, 1009.4, 1009.6, 1009.6, 1010.1, 
                                1010.8, 1011.4, 1011.9, 1012.1, 1012.6, 1012.7, 1013, 1013.4, 
                                1013.5, 1013.9, 1014.3, 1014.9, 1015.3, 1015.9, 1016.1, 1016.6, 
                                1017.2, 1017.3, 1017.5, 1017.6, 1017.6, 1017.4, 1017.5, 1017.8, 
                                1018.2, 1018.4, 1018.5, 1018.5, 1018.5, 1018.7, 1018.6, 1018.9, 
                                1018.7, 1018.7, 1019, 1019.5, 1019.6, 1019.5, 1019.3, 1019.1, 
                                1018.9, 1018.6, 1018.6, 1018.4, 1018.3, 1018, 1017.6, 1017.8, 
                                1018.1, 1018.3, 1018.3, 1018.3, 1018.3, 1018.4, 1018.1, 1017.8, 
                                1017.6, 1017.4, 1017.5, 1017.5, 1017.5, 1017.6, 1017.7, 1017.8, 
                                1017.5, 1017.4, 1017.1, 1016.9, 1016.8, 1016.8, 1016.9, 1017.1, 
                                1017.3, 1017.5, 1017.4, 1017.3, 1017, 1016.8, 1016.7, 1016.4, 
                                1016, 1015.5, 1015.4, 1015.4, 1015.3, 1015.1, 1015.3, 1015.6, 
                                1015.5, 1014.9, 1014.6, 1013.4, 1012.9, 1012.3, 1012.1, 1012, 
                                1012.1, 1012.4, 1012.7, 1012.7, 1013.3, 1013.6, 1014, 1014.1, 
                                1014.5, 1014.8, 1015.2, 1015.7, 1016.3, 1016.9, 1017.4, 1017.5, 
                                1017.3, 1017.3, 1016.9))

A quick plot shows this

plot(data$hour, data$measured, type = "l", ylim = c(950, 1250))
lines(data$hour, data$baseline, col = "red")

example of baseline shift

The black line is the measured data, and the red line is where the baseline should actually be.

As the distance between the measured data and the baseline looks roughly constant, I thought I could just take the difference between their mean values and subtract it.

correction <- mean(data$measured) - mean(data$baseline)

plot(data$hour, data$measured, type = "l", ylim = c(950, 1250))
lines(data$hour, data$baseline, col = "red")
lines(data$hour, data$measured-correction, col = "green")

This almost worked, but as you can see, the green line ends up being a bit too low.

I also thought about fitting a smooth curve through the data, something like this:

fit <- lm(measured ~ poly(hour, degree = 7), data = data)$fitted.values

Does anybody have an idea how to shift the measured values down to the baseline?

Thank you so much for your help.

tim
  • Ideally, you need to understand why your measured data is shifted and by how much. If this is purely a data-processing problem and we don't care why it is shifted, then I think the shape of the measured line should be kept, and the only parameter that can be optimized is the constant shift along the y-axis. – Peace Wang Mar 16 '21 at 07:21
  • Maybe try: `correction <- median(data$measured) - median(data$baseline)` – GKi Mar 16 '21 at 07:41
  • I completely agree. Just shifting would be enough for now. @tdy 's answer pointed me in the right direction. – tim Mar 16 '21 at 07:42
  • or: `correction <- runmed(data$measured, 13) - runmed(data$baseline, 13)` – GKi Mar 16 '21 at 07:50
  • Hi @GKi, both of your solutions definitely improve on my solutions. Thanks. But the accepted answer of just using part of the data did the trick. – tim Mar 16 '21 at 07:57
  • Using `tail` will fail if the peaks look in the other direction. – GKi Mar 16 '21 at 08:00
  • Thanks for pointing that out. `tail` will also fail if there is a peak right at the end of the data, but now that I am aware of that, I know to double-check my data. Another option is to use `head` or any other part of the data that has the same shape as the baseline. – tim Mar 16 '21 at 08:09
  • Maybe also `correction <- min(data$measured) - min(data$baseline)` will work. – GKi Mar 16 '21 at 08:18
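
The suggestions in the comments differ mainly in how sensitive they are to the peaks. Below is a minimal sketch on synthetic data (a made-up baseline, offset, and peak positions, not the measurements from the question) of why a mean-based correction ends up too large when all peaks point upward. Note that it takes the median and minimum of the pointwise difference `measured - baseline`, which is a slight variation on the comments above.

```r
# Synthetic stand-in for the question's data (assumption: invented values).
hour        <- 1:200
baseline    <- 1000 + 10 * sin(hour / 30)   # slowly varying baseline
true_offset <- 140                          # constant shift we want to recover
measured    <- baseline + true_offset

# Add a few upward peaks, similar in spirit to the spikes in the measured signal.
peak_at <- c(20, 60, 100, 140, 180)
measured[peak_at]     <- measured[peak_at] + 50
measured[peak_at + 1] <- measured[peak_at + 1] + 30

# Working on the pointwise difference keeps the baseline trend out of the estimate.
d <- measured - baseline

offset_mean   <- mean(d)     # pulled upward by the peaks
offset_median <- median(d)   # robust as long as peaks are a minority of points
offset_min    <- min(d)      # only sensible if every peak points upward

print(c(mean = offset_mean, median = offset_median, min = offset_min))
```

Subtracting `offset_mean` would push the corrected signal below the baseline, which matches the green line in the question. The median and minimum recover the constant shift here, but the minimum breaks down as soon as there is noise or a downward peak.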

1 Answer

I tend to agree with Peace about only applying an offset and not altering the signal shape.

You could improve the correction by using a more locally stable segment without peaks. For this signal, the tail should work:

correction <- mean(tail(data$measured)) - mean(tail(data$baseline))

unshifted
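
Note that `tail()` returns the last 6 values by default, so the correction above rests on a very short segment. If the end of the signal is not guaranteed to be peak-free, one option is to search for the most stable window instead. The following is only a sketch: `estimate_offset` and its `width` argument are hypothetical names introduced here, and the test signal is synthetic. It assumes the shift is constant and that at least one window of the chosen width contains no peak.

```r
# Hypothetical helper: estimate a constant offset from the window in which
# measured - baseline varies the least.
estimate_offset <- function(measured, baseline, width = 6) {
  d <- measured - baseline
  starts <- seq_len(length(d) - width + 1)
  # Standard deviation of the difference inside each candidate window
  spread <- vapply(starts, function(i) sd(d[i:(i + width - 1)]), numeric(1))
  best <- starts[which.min(spread)]
  mean(d[best:(best + width - 1)])
}

# Synthetic check (assumption: invented values, with a peak near the end).
hour     <- 1:100
baseline <- 1000 + 0.1 * hour
measured <- baseline + 140
measured[c(30, 31, 95, 96)] <- measured[c(30, 31, 95, 96)] + 40

offset_window <- estimate_offset(measured, baseline)
offset_tail   <- mean(tail(measured)) - mean(tail(baseline))  # hit by the late peak

print(c(window = offset_window, tail = offset_tail))
```

With the question's data frame this would be called as `estimate_offset(data$measured, data$baseline)`. A wider `width` makes the estimate less noisy but requires a longer peak-free stretch.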

tdy
  • Thanks for the good idea. Using the mean of a more stable part without peaks did the trick. My over-complicated mind was looking for a way to cut out the peaks, fit a line through the remaining curve and then subtract the baseline. But your solution gets me close enough. – tim Mar 16 '21 at 07:52
  • No problem. Good point about choosing a locally stable segment. I added that rationale for future reference, just so it's clear that `tail` is not a general solution. – tdy Mar 16 '21 at 10:20