
I have vectors of different lengths. For example:

a1 = c(1,2,3,4,5,6,7,8,9,10)
a2 = c(1,3,4,5)
a3 = c(1,2,5,6,9)

I want to stretch a2 and a3 to the length of a1 so that I can run algorithms that require all vectors to be the same length. I could truncate a1 to match a2 and a3, but then I would lose valuable data.

For example, perhaps a2 could look something like 1 1 1 3 3 3 4 4 5 5?

Any suggestions would be great! Thanks.

EDIT: I need this to work for vectors with duplicate values, such as c(1,1,2,2,2,2,3,3), and the stretched-out values should reflect how often each value occurs in the original vector. For example, if I stretched the example vector to a length of 100, I would expect more twos than ones.

2 Answers


It sounds like you're looking for something like:

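lengthen <- function(vec, length) {
  # Recycle the positions of vec up to the target length, then sort them so
  # that the repeats of each element stay adjacent and in the original order
  vec[sort(rep(seq_along(vec), length.out = length))]
}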

lengthen(a2, length(a1))
# [1] 1 1 1 3 3 3 4 4 5 5
lengthen(a3, length(a1))
# [1] 1 1 2 2 5 5 6 6 9 9
lengthen(a4, length(a1))
# [1] 5 5 5 1 1 1 3 3 4 4
lengthen(a5, length(a1))
# [1] 1 1 1 1 1 1 4 4 5 5

Where:

a1 = c(1,2,3,4,5,6,7,8,9,10)
a2 = c(1,3,4,5)
a3 = c(1,2,5,6,9)
a4 = c(5,1,3,4)
a5 = c(1,1,4,5)
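
As a quick check against the duplicate-value case from the question's edit, the sketch below applies the same index-recycling idea to c(1,1,2,2,2,2,3,3) with a target length of 100. The function is repeated so the block runs on its own; the vector name a6 and the parameter name out_len are illustrative and not part of the original answer.

```r
# Same approach as lengthen() above, repeated so the block is self-contained
lengthen <- function(vec, out_len) {
  vec[sort(rep(seq_along(vec), length.out = out_len))]
}

# a6 is a hypothetical name for the vector from the question's edit
a6 <- c(1, 1, 2, 2, 2, 2, 3, 3)
stretched <- lengthen(a6, 100)

length(stretched)
# [1] 100
table(stretched)
# stretched
#  1  2  3
# 26 50 24
```

Because the function repeats positions rather than values, duplicates are stretched like any other element, so the proportions stay roughly the same: 2 makes up half of the original vector and half of the stretched one.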
A5C1D2H2I1M1N2O1R2T1
Thanks! That was very helpful. Just wondering, what if I had data like c(1,1,2,2,3,4,4,5,5)? It doesn't work when there are duplicates in the vector. Do you know of any workarounds for that? Cheers. – mexicanseafood Jun 22 '20 at 11:08

One way could be to create, for each vector, an evenly spaced sequence between its minimum and maximum with a defined length.

# Put the data in a list
list_data <- list(a1 = a1, a2 = a2, a3 = a3)
# Get the maximum length
max_len <- max(lengths(list_data))
# Replace each vector with an evenly spaced sequence from its min to its max
list_data <- lapply(list_data, function(x) {
  seq(min(x), max(x), length.out = max_len)
})

#$a1
# [1]  1  2  3  4  5  6  7  8  9 10

#$a2
# [1] 1.000 1.444 1.889 2.333 2.778 3.222 3.667 4.111 4.556 5.000

#$a3
# [1] 1.000 1.889 2.778 3.667 4.556 5.444 6.333 7.222 8.111 9.000

Extract them into separate vectors if needed (this overwrites any existing a1, a2 and a3 in the global environment):

list2env(list_data, .GlobalEnv)

However, this does not guarantee that your original data points remain in the data. For example, a2 contained 3 and 4, but neither is present in the modified vector.
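
If interpolated values are acceptable but the result should follow the original data more closely, one possible variation is to interpolate over the positions of the elements rather than between the minimum and maximum. The sketch below uses approx() from the stats package for this; the helper name stretch_interp is hypothetical, and this is a swapped-in technique rather than the method described above.

```r
# Hypothetical helper: linear interpolation over element positions
stretch_interp <- function(x, out_len) {
  approx(x = seq_along(x), y = x, n = out_len)$y
}

a2 <- c(1, 3, 4, 5)
round(stretch_interp(a2, 10), 3)
# [1] 1.000 1.667 2.333 3.000 3.333 3.667 4.000 4.333 4.667 5.000
```

In this case 3 and 4 survive because the 9 output intervals divide evenly across the 3 gaps in a2. That is not guaranteed for other length combinations, so this still does not preserve the original points in general.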

Ronak Shah