I got the same problem when testing bfast
on my consumer portfolio data and failed to find any real solution. I went on to dig into the bfast literature from the earth sensing community, which is where bfast
was first developed and extensively used. My read is that there was really little you can do to get bfast to always fit a useful seasonal component.
Days ago, I encountered this Quora discussion on the best software for time series analysis and found there is a new R package Rbeast
for breakpoint detection and time series decomposition. There is also a nice tweet that shows a quick comparison between bfast and Rbeast.
After some experimenting, I found Rbeast
was able to pinpoint seasonal breakpoints in my data and yours. To be frank, I still have no idea how Rbeast
works. The BEAST algorithm in Rbeast
seems rather complicated, with tons of outputs; it is not well documented and not as easy to use as bfast. Let me show what I got, first using your data and then using a second artificial time series.
Your data
# The original code to generate your data
n_years <- 50
freq <- 52
y_pattern <- sin(seq(0, 2*pi, length = freq))
y <- rep(y_pattern, n_years) + rnorm(freq*n_years, sd = 0.1)
mydata <- data.frame(Year = rep(1:n_years, each = freq), Week = rep(1:freq, n_years), y)
move_data <- function(data, year, weeks_to_move){
x <- data[data$Year == year, "y"]
c(x[seq(52 - weeks_to_move + 1,52)], x[seq(1, 52 - weeks_to_move)])
}
mydata$y_shifted <- mydata$y
for (year in 26:50){
mydata$y_shifted[mydata$Year == year] <- move_data(mydata, year, weeks_to_move = 8)
}
# You data analyzed by the BEAST algorithm in Rbeast
library(Rbeast)
fit <- beast(mydata$y_shifted, freq=52)
print(fit)
plot(fit)
#####################################################################
# Seasonal Changepoints #
#####################################################################
.-------------------------------------------------------------------.
| Ascii plot of probability distribution for number of chgpts (ncp) |
.-------------------------------------------------------------------.
|Pr(ncp = 0 )=0.000|* |
|Pr(ncp = 1 )=0.999|*********************************************** |
|Pr(ncp = 2 )=0.001|* |
|Pr(ncp = 3 )=0.000|* |
|Pr(ncp = 4 )=0.000|* |
|Pr(ncp = 5 )=0.000|* |
|Pr(ncp = 6 )=0.000|* |
|Pr(ncp = 7 )=0.000|* |
|Pr(ncp = 8 )=0.000|* |
|Pr(ncp = 9 )=0.000|* |
|Pr(ncp = 10)=0.000|* |
.-------------------------------------------------------------------.
| Summary for number of Seasonal ChangePoints (scp) |
.-------------------------------------------------------------------.
|ncp_max = 10 | MaxSeasonKnotNum: A parameter you set |
|ncp_mode = 1 | Pr(ncp= 1)=1.00: There is a 99.9% probability |
| | that the seasonal component has 1 chgnpt(s). |
|ncp_mean = 1.00 | Sum{ncp*Pr(ncp)} for ncp = 0,...,10 |
|ncp_pct10 = 1.00 | 10% percentile for number of changepoints |
|ncp_median = 1.00 | 50% percentile: Median number of changepoints |
|ncp_pct90 = 1.00 | 90% percentile for number of changepoints |
.-------------------------------------------------------------------.
| List of probable seasonal changepoints ranked by probability of |
| occurrence: Please combine the ncp reported above to determine |
| which changepoints below are practically meaningful |
'-------------------------------------------------------------------'
|scp# |time (cp) |prob(cpPr) |
|------------------|---------------------------|--------------------|
|1 |1301.000000 |1.00000 |
.-------------------------------------------------------------------.

The abrupt seasonal shift was detected precisely. Rbeast also gives the probability of detecting breakpoints in seasonality and trend (the red ad green curves in the Pr(scp) and Pr(tcp) panels of the above figure). The probability for the detected seasonal shift is very high, close to 1.0. The trend of your data is a flat line. It is essentially a constant of zero and the probability of finding breakpoints (i.e., changepoints as used in Rbeast) in trend is also close to zero all over.
A second time series
A cool feature of Rbeast
is the estimation of sin and cos orders for the harmonic seasonal model. Below, I generated a time series that has three seasonal segments (i.e., two breaks) plus a sloped trend with no breaks. The three seasonal segments have different sin orders, taking 1, 2, and 3, respectively.
# Generate a sample time series with three seasonal segments
# the sin/cos orders for the three segs are different.
seg1 <- 1:1000
seg2 <- 1001:2000
seg3 <- 2001:3000
new_data <- c( sin(seg1*2*pi/52), 0.6*sin( seg2*2*pi/52*2), 0.3*sin( seg3*2*pi/52*3)) + (1:3000)*0.0002+ rnorm(3000, sd = 0.1)
# Test bfast using new_data
y_ts <- ts(new_data, start = c(1,1), frequency = 52)
fit <- bfast(y_ts, h=.15, season="harmonic", max.iter=20, breaks=3)
plot(fit)

Surprisingly enough, bfast
didn't detect any breaks in seasonality, although the three segments are easily eye-balled in the plotted data Yt
.
# Analyze the new_data time series using `Rbeast`
fit <- beast(new_data, freq=52)
print(fit)
plot(fit)
#####################################################################
# Seasonal Changepoints #
#####################################################################
.-------------------------------------------------------------------.
| Ascii plot of probability distribution for number of chgpts (ncp) |
.-------------------------------------------------------------------.
|Pr(ncp = 0 )=0.000|* |
|Pr(ncp = 1 )=0.000|* |
|Pr(ncp = 2 )=0.969|*********************************************** |
|Pr(ncp = 3 )=0.031|** |
|Pr(ncp = 4 )=0.000|* |
|Pr(ncp = 5 )=0.000|* |
|Pr(ncp = 6 )=0.000|* |
|Pr(ncp = 7 )=0.000|* |
|Pr(ncp = 8 )=0.000|* |
|Pr(ncp = 9 )=0.000|* |
|Pr(ncp = 10)=0.000|* |
.-------------------------------------------------------------------.
| Summary for number of Seasonal ChangePoints (scp) |
.-------------------------------------------------------------------.
|ncp_max = 10 | MaxSeasonKnotNum: A parameter you set |
|ncp_mode = 2 | Pr(ncp= 2)=0.97: There is a 96.9% probability |
| | that the seasonal component has 2 chgnpt(s). |
|ncp_mean = 2.03 | Sum{ncp*Pr(ncp)} for ncp = 0,...,10 |
|ncp_pct10 = 2.00 | 10% percentile for number of changepoints |
|ncp_median = 2.00 | 50% percentile: Median number of changepoints |
|ncp_pct90 = 2.00 | 90% percentile for number of changepoints |
.-------------------------------------------------------------------.
| List of probable seasonal changepoints ranked by probability of |
| occurrence: Please combine the ncp reported above to determine |
| which changepoints below are practically meaningful |
'-------------------------------------------------------------------'
|scp# |time (cp) |prob(cpPr) |
|------------------|---------------------------|--------------------|
|1 |2001.000000 |1.00000 |
|2 |1001.000000 |1.00000 |
|3 |1027.000000 |0.02942 |
.-------------------------------------------------------------------.

The above is the Rbeast result. The two breaks and the three seasonal segments are recovered. There are no breaks in the trend Rbeast
estimated seasonal harmonic orders. In the Order_s panel above, the three sin and cos orders are recovered correctly. The Order_s curve also shows the locations of the two seasonal breaks.