-1

I want the value in myrate column to be

  1. for the first value of myrate should be rupee (minus) amt
  2. for second row of myrate column should be the first value of myrate (value as generated in point 1) minus the second value of Rupee column if 'Name' is the same in both the 1st and second row
  3. the logic of step 2 should continue till a new value in Name colume is reached (in this case: "sss") 4.again in the second row of "sss" name, the logic of step 2 should continue

I tried using dplyr which gives the correct answer for some iterations, but it fails for others.

table example

stefan
  • 90,330
  • 6
  • 25
  • 51
  • Can you show the data with `dput` and expected output – akrun Apr 05 '20 at 18:35
  • Name Amt Rupee Myrate kkk 20 11 -9 kkk 21 333 -342 kkk 22 65 -407 sss 23 90 67 sss 24 36 31 sss 25 71 -40 ttt 26 69 43 ttt 27 32 11 (myrate column is the expected output) – shailesh sankaran Apr 06 '20 at 17:31

1 Answers1

0

Try this. I added a helper function to compute myrate. Instead of checking names I simply split the df by Name, do the computations for each Name and bind them back together in the end.

# Example data

df <- data.frame(
  Name = c(rep("kkk", 3), rep("sss", 3), rep("ttt", 2)),
  Amt = c(20:27),
  Rupee = c(11, 333, 65, 90, 36, 71, 69, 32)
)
df
#>   Name Amt Rupee
#> 1  kkk  20    11
#> 2  kkk  21   333
#> 3  kkk  22    65
#> 4  sss  23    90
#> 5  sss  24    36
#> 6  sss  25    71
#> 7  ttt  26    69
#> 8  ttt  27    32

# Helper function
myrate <- function(df) {
  # Init myrate = 0
  df$myrate <- 0
  # Add id = row number
  df$id <- 1
  df$id <- cumsum(df$id)
  # Loop over id or rows
  for (i in df$id) {
    df$myrate[i] <- 
      if (i == 1) {
        df$Rupee[i] - df$Amt[i]
      } else {
        df$myrate[i] <- df$myrate[i - 1] - df$Rupee[i]
      }
  }
  df["id"] <- NULL
  df
}

# Computation
library(dplyr)

df %>% 
  # Split by Name
  split(.$Name) %>%
  # Compute myrate for each Name
  lapply(myrate) %>%
  # Bind back in one df
  bind_rows()
#>   Name Amt Rupee myrate
#> 1  kkk  20    11     -9
#> 2  kkk  21   333   -342
#> 3  kkk  22    65   -407
#> 4  sss  23    90     67
#> 5  sss  24    36     31
#> 6  sss  25    71    -40
#> 7  ttt  26    69     43
#> 8  ttt  27    32     11

Created on 2020-04-05 by the reprex package (v0.3.0)

stefan
  • 90,330
  • 6
  • 25
  • 51
  • Dear Stefan, thans for the revert, just one doubt, what if the line items in the tables are really huge, like few 100s..please also provide help in such case – shailesh sankaran Apr 06 '20 at 05:45
  • Hi Shailesh. As you tagged your question with `bigtable` I was aware that using a for loop is probably not the best solution. In general it should work for a large table, however it is probably quite slow. In this sense it's only a first step on how to implement the algorithm. Unfortunately, at the moment I have no quick solution to get rid of the loop. Therefore I would recommend that you build up on my first step and ask a second and refined question on how to implement the algorithm without the need of a for loop. – stefan Apr 06 '20 at 06:56
  • Dear Stefan, Appreciate your time and effort spent on helping me out. Thanks a lot – shailesh sankaran Apr 06 '20 at 10:53