This is my data:

 Assay Sample Dilution  meanresp number
    1    S     0.25       68.55      1
    1    S     0.50       54.35      2
    1    S     1.00       44.75      3

My end goal is to fit a linear regression to every two consecutive rows, using Dilution and meanresp, and return the slope of each regression.

The length of the table can vary, and I'd prefer not to use for loops as I'm trying to get out of the habit.

I think ddply would be good, but I'm not sure how to select the subset of every two consecutive rows. I thought perhaps there could be a way of saying 'do this for every subset of Dilution of length 2'?

Any insight would be helpful.

LyzandeR
Kabau
  • I'm not sure what you expect as a result. Something like `diff(meanresp) / diff(Dilution)` (grouped by `Assay` and `Sample`)? – Roland Jan 15 '15 at 10:19
  • Could [this](http://stackoverflow.com/questions/26755653/r-how-to-write-a-for-loop-that-reads-every-two-lines-in-a-matrix/26756580#26756580) help you? (For selecting every two consecutive rows.) – Cath Jan 15 '15 at 10:21
  • I'm aiming for something that does this: `ddply(.data=data, .variables='subsets each consecutive 2 rows', .fun='linear model function')`. The subsetting is where I'm having trouble. Does this make sense? – Kabau Jan 15 '15 at 10:27
  • `ddply` splits on the values of columns, not on row positions, so you can't use it here. See below for a way to do it using `lapply` twice. You could merge the two calls into one function, but I would avoid that as it makes the code hard to read. – LyzandeR Jan 15 '15 at 10:38

1 Answer

I don't know how this will help with a linear regression, but you could do something like this:

df <- read.table(header=TRUE, text="Assay Sample Dilution  meanresp number
    1    S     0.25       68.55      1
    1    S     0.50       54.35      2
    1    S     1.00       44.75      3")

Using lapply:

> lapply(2:nrow(df), function(x) df[(x-1):x,] )
[[1]]
  Assay Sample Dilution meanresp number
1     1      S     0.25    68.55      1
2     1      S     0.50    54.35      2

[[2]]
  Assay Sample Dilution meanresp number
2     1      S      0.5    54.35      2
3     1      S      1.0    44.75      3

If you also want only specific columns for the consecutive rows, you can select them like this:

> lapply(2:nrow(df), function(x) df[(x-1):x, c('Dilution','meanresp')] )
[[1]]
  Dilution meanresp
1     0.25    68.55
2     0.50    54.35

[[2]]
  Dilution meanresp
2      0.5    54.35
3      1.0    44.75
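
Because the table length can vary, note that `2:nrow(df)` counts backwards when `df` has fewer than two rows (`2:1` is `c(2, 1)`). A minimal sketch of a guard against that, assuming a hypothetical helper name `consecutive_pairs` that is not part of the original answer:

```r
# Hypothetical helper (not from the original answer): split a data frame
# into overlapping pairs of consecutive rows.
consecutive_pairs <- function(data, cols = names(data)) {
  # seq_len() yields an empty sequence when there are fewer than two rows,
  # so the result is an empty list rather than a bogus pair.
  lapply(seq_len(max(nrow(data) - 1L, 0L)), function(i) {
    data[i:(i + 1L), cols, drop = FALSE]
  })
}

df <- read.table(header = TRUE, text = "Assay Sample Dilution meanresp number
    1    S     0.25       68.55      1
    1    S     0.50       54.35      2
    1    S     1.00       44.75      3")

pairs <- consecutive_pairs(df, c("Dilution", "meanresp"))
length(pairs)                        # one element per pair of adjacent rows
length(consecutive_pairs(df[1, ]))   # a single-row table yields no pairs
```

For tables with two or more rows this should return the same pairs as the `lapply(2:nrow(df), ...)` call above.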

EDIT

If you want to perform a linear regression, a second lapply is enough:

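# Pairs of consecutive rows, keeping only the two columns used in the model
a <- lapply(2:nrow(df), function(x) df[(x-1):x, c('Dilution','meanresp')] )

# Fit one linear model per pair
b <- lapply(a, function(x) lm(Dilution ~ meanresp, data = x))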

> b
[[1]]

Call:
lm(formula = Dilution ~ meanresp, data = x)

Coefficients:
(Intercept)     meanresp  
    1.45687     -0.01761  


[[2]]

Call:
lm(formula = Dilution ~ meanresp, data = x)

Coefficients:
(Intercept)     meanresp  
    3.33073     -0.05208  
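
With only two points per fit, the line passes through both points exactly, so each slope is just the ratio of differences that Roland's comment suggests. A short sketch of that check follows. Because the formula here is `Dilution ~ meanresp`, the matching ratio is `diff(Dilution) / diff(meanresp)`; the reciprocal would correspond to regressing `meanresp` on `Dilution` instead.

```r
df <- read.table(header = TRUE, text = "Assay Sample Dilution meanresp number
    1    S     0.25       68.55      1
    1    S     0.50       54.35      2
    1    S     1.00       44.75      3")

# Slopes from one lm() fit per pair of consecutive rows
pairs <- lapply(2:nrow(df), function(x) df[(x - 1):x, c("Dilution", "meanresp")])
lm_slopes <- vapply(pairs, function(p) {
  coef(lm(Dilution ~ meanresp, data = p))[["meanresp"]]
}, numeric(1))

# The same slopes as difference quotients, with no model fitting
diff_slopes <- diff(df$Dilution) / diff(df$meanresp)

all.equal(lm_slopes, diff_slopes)
```

This version assumes a single Assay/Sample group; with several groups the differences would have to be taken within each group.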

Or if you just want the slope:

b <- lapply(a, function(x) {
                    d <- lm(Dilution ~ meanresp, data = x)
                    # the slope is the coefficient on meanresp
                    coef(d)[["meanresp"]]
})

> b
[[1]]
[1] -0.01760563

[[2]]
[1] -0.05208333
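
To keep track of which dilutions each slope came from, the results can be collected into a data frame rather than a list. A minimal sketch, assuming at least two rows; the column names `from_dilution`, `to_dilution`, and `slope` are illustrative choices, not part of the original answer:

```r
df <- read.table(header = TRUE, text = "Assay Sample Dilution meanresp number
    1    S     0.25       68.55      1
    1    S     0.50       54.35      2
    1    S     1.00       44.75      3")

# Index of the first row in each consecutive pair (assumes nrow(df) >= 2)
idx <- seq_len(nrow(df) - 1L)

# vapply() returns a plain numeric vector instead of a list
slopes <- vapply(idx, function(i) {
  fit <- lm(Dilution ~ meanresp, data = df[i:(i + 1L), ])
  coef(fit)[["meanresp"]]
}, numeric(1))

# One row per pair, labelled with the two dilutions that were used
result <- data.frame(
  from_dilution = df$Dilution[idx],
  to_dilution   = df$Dilution[idx + 1L],
  slope         = slopes
)
result
```

If the table holds several Assay/Sample groups, the same steps could be applied to each piece returned by `split()`.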
LyzandeR
  • This is great, thank you. It does pretty much exactly what I need (I'll just need to turn the list into a data frame that also shows which dilutions were used). – Kabau Jan 15 '15 at 12:09