I think I need some kind of if/else logic to accomplish this, but I'm not sure where to start. I want to search a column of my data frame for values that are a certain length and contain a certain character. For example, within the column LAYER, if the value is two characters long and contains an "L" (this could be LF, FL, LH, or HL), I want to multiply the values in other columns by 0.5.

LAYER  VALUE  UPPER  LOWER  THICKNESS_MIN  THICKNESS_MAX  A1   A2  A3
LF     5      0      4      3              10             3.4  67  24
LFH    9      0      6      2              9              3.7  65  76
FH     4      0      2      1              8              3.3  35  34
FL     11     0      1      5              6              3.8  56  86
LH     50     0      4      3              4              4.6  43  45

Written out as a sentence, what I have is: "if the value in LAYER is 2 characters long and one of them is L, then multiply the columns VALUE, UPPER, LOWER, THICKNESS_MIN and THICKNESS_MAX by 1/2 and change the LAYER value to FF_FH for this row".

I also need to do the same but for rows where the LAYER value is 3 characters long, and the other variables are multiplied by 2/3.

I want the final outcome to be something like

LAYER  VALUE  UPPER  LOWER  THICKNESS_MIN  THICKNESS_MAX  A1   A2  A3
LF     2.5    0      2      1.5            5              3.4  67  24
LFH    3      0      2      1.3            3              3.7  65  76
FH     4      0      2      1              8              3.3  35  34
FL     5.5    0      0.5    2.5            3              3.8  56  86
LH     25     0      2      1.5            2              4.6  43  45
Amanda S

1 Answer

First of all, let's put your dataset in a form that can be copied and pasted to an R session.

mydf <-
structure(list(LAYER = c("LF", "LFH", "FH", "FL", "LH"), VALUE = c(5L, 
9L, 4L, 11L, 50L), UPPER = c(0L, 0L, 0L, 0L, 0L), LOWER = c(4L, 
6L, 2L, 1L, 4L), THICKNESS_MIN = c(3L, 2L, 1L, 5L, 3L), THICKNESS_MAX = c(10L, 
9L, 8L, 6L, 4L), A1 = c(3.4, 3.7, 3.3, 3.8, 4.6), A2 = c(67L, 
65L, 35L, 56L, 43L), A3 = c(24L, 76L, 34L, 86L, 45L)), .Names = c("LAYER", 
"VALUE", "UPPER", "LOWER", "THICKNESS_MIN", "THICKNESS_MAX", 
"A1", "A2", "A3"), class = "data.frame", row.names = c(NA, -5L
))
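The `structure(...)` call above is the kind of text that `dput()` prints. As a rough sketch of how such a form could be produced, the table from the question can be typed in with `data.frame()` and then passed to `dput()`. The name `mydf2` is introduced here only for illustration, and `stringsAsFactors = FALSE` is set on the assumption that LAYER should stay a character column.

```r
# Rebuild the question's table by hand; values copied from the post.
# `mydf2` is an illustrative name, not part of the original answer.
mydf2 <- data.frame(
  LAYER = c("LF", "LFH", "FH", "FL", "LH"),
  VALUE = c(5L, 9L, 4L, 11L, 50L),
  UPPER = c(0L, 0L, 0L, 0L, 0L),
  LOWER = c(4L, 6L, 2L, 1L, 4L),
  THICKNESS_MIN = c(3L, 2L, 1L, 5L, 3L),
  THICKNESS_MAX = c(10L, 9L, 8L, 6L, 4L),
  A1 = c(3.4, 3.7, 3.3, 3.8, 4.6),
  A2 = c(67L, 65L, 35L, 56L, 43L),
  A3 = c(24L, 76L, 34L, 86L, 45L),
  stringsAsFactors = FALSE  # keep LAYER as character so nchar() can be applied to it
)

# dput() prints a structure(...) call that can be pasted into another R session
dput(mydf2)
```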

Now it's easy. Just remember that grepl returns a logical vector the same length as its second argument, so we can AND it (&) with the result of the nchar comparison.
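To make that point concrete, here is a small sketch that builds the two logical vectors separately on a standalone vector before combining them. The names `layer`, `has_L` and `is_two` are introduced only for this illustration, and `fixed = TRUE` is an optional choice here to match the literal character "L" rather than treat it as a regular expression.

```r
# Standalone vector mirroring the LAYER column from the question
layer <- c("LF", "LFH", "FH", "FL", "LH")

has_L  <- grepl("L", layer, fixed = TRUE)  # TRUE where the value contains "L"
is_two <- nchar(layer) == 2                # TRUE where the value has 2 characters
inx    <- has_L & is_two                   # element-wise AND of the two masks

# Show the three vectors side by side
print(data.frame(layer, has_L, is_two, inx))
```

Only LF, FL and LH satisfy both conditions: LFH contains an "L" but is three characters long, and FH is two characters long but has no "L". The same kind of `inx` vector is what selects the rows of `mydf`.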

# columns to rescale
cols <- c("VALUE", "UPPER", "LOWER", "THICKNESS_MIN", "THICKNESS_MAX")

# rows where LAYER is two characters long and contains an "L"
inx <- grepl("L", mydf$LAYER) & nchar(mydf$LAYER) == 2
mydf[inx, cols] <- mydf[inx, cols] * 1/2
mydf[inx, "LAYER"] <- "FF_FH"

If the value in the column of interest is three characters long, just adapt the code accordingly.

# rows where LAYER is three characters long
inx <- nchar(mydf$LAYER) == 3
mydf[inx, cols] <- mydf[inx, cols] * 2/3

mydf
  LAYER VALUE UPPER LOWER THICKNESS_MIN THICKNESS_MAX  A1 A2 A3
1 FF_FH   2.5     0   2.0      1.500000             5 3.4 67 24
2   LFH   6.0     0   4.0      1.333333             6 3.7 65 76
3    FH   4.0     0   2.0      1.000000             8 3.3 35 34
4 FF_FH   5.5     0   0.5      2.500000             3 3.8 56 86
5 FF_FH  25.0     0   2.0      1.500000             2 4.6 43 45
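
If the same pattern has to be applied for several rules, it might be worth wrapping the subset-and-multiply step in a small function. The following is only a sketch under stated assumptions: `scale_rows` is a hypothetical helper name, and the three-row data frame `df` is a cut-down stand-in for the question's data, not the real table. Both masks are computed before LAYER is overwritten, so the relabelling cannot affect the length test.

```r
# Hypothetical helper: multiply `cols` by `factor` in the rows flagged by `rows`
scale_rows <- function(df, rows, cols, factor) {
  df[rows, cols] <- df[rows, cols] * factor
  df
}

# Cut-down stand-in for the question's data (illustrative values only)
df <- data.frame(
  LAYER = c("LF", "LFH", "FH"),
  VALUE = c(5, 9, 4),
  LOWER = c(4, 6, 2),
  stringsAsFactors = FALSE
)
cols <- c("VALUE", "LOWER")

# Compute both masks up front, before any LAYER value is changed
two_L <- grepl("L", df$LAYER, fixed = TRUE) & nchar(df$LAYER) == 2
three <- nchar(df$LAYER) == 3

df <- scale_rows(df, two_L, cols, 1/2)  # two-character layers containing "L"
df <- scale_rows(df, three, cols, 2/3)  # three-character layers
df$LAYER[two_L] <- "FF_FH"              # relabel only the two-character matches
print(df)
```

Selecting columns by name rather than by position means the rule keeps working if the column order changes.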
Rui Barradas