I have two tables. I want to extract a single column from the second table and paste it into the first table. The problem is that not all rows of that column should be copied, only those whose first-column value matches the first column of the first table.

read.table("table1")->c
read.table("table2")->d
d[,1] %in% c[,1] ->f 

This only gives a vector of TRUE and FALSE, but I need the row numbers. With a vector of the row numbers of the matching elements, I would then extract exactly those rows from the fourth column of table d:

d[,4]->g
g[vector with numbers,]->g1

Is there an easy way to do this?

Tim Heinert
  • The vector of TRUE/FALSE values for each position can be used for subsetting in exactly the same way as a vector of row numbers. – James Mar 13 '13 at 11:25
  • @Tim I am just curious: why are you using `->` and not `<-` to assign variables? – agstudy Mar 13 '13 at 11:30
  • @agstudy so was I. I haven't seen that before. – Simon O'Hanlon Mar 13 '13 at 11:38
  • @SimonO101 I don't get your point here? you haven't seen what? – agstudy Mar 13 '13 at 11:39
  • I have *NEVER* seen someone use -> after the expression for the assignment, that's all. I'm not saying it's better or worse, just unusual – Simon O'Hanlon Mar 13 '13 at 11:40
  • Tim, did either solution work for you? I notice you have asked nine questions and have not accepted a single answer yet. If the solutions that people are kind enough to provide work for you, please press the green tick arrow next to your preferred answer, that way these questions can be removed from the unanswered stack. If they do not answer your question please ask for further clarification. Thanks. – Simon O'Hanlon Mar 14 '13 at 07:35
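The point made in the first comment, that a logical vector subsets just like a vector of row numbers, can be sketched as follows. The data frames `t1` and `t2` and their columns are hypothetical stand-ins for the two tables in the question, not data from the original post.

```r
# Hypothetical toy tables; column 1 is the key, column 4 of t2 is the one to copy
t1 <- data.frame(id = c("a", "b", "c"), x = 1:3, stringsAsFactors = FALSE)
t2 <- data.frame(id = c("b", "d", "a", "e"), p = 1:4, q = 5:8,
                 val = c(10, 20, 30, 40), stringsAsFactors = FALSE)

f    <- t2[, 1] %in% t1[, 1]  # logical: TRUE where t2's key also occurs in t1
g1   <- t2[f, 4]              # subset the fourth column directly with the logical vector
rows <- which(f)              # the matching row numbers, if they are needed explicitly
g2   <- t2[rows, 4]           # same values as g1
```

Note that the values come back in the row order of `t2`, not of `t1`, so this alone does not align them with the rows of the first table.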

2 Answers

Or with match

f <- d[ match(c[,1] , d[,1]) , ]
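As a minimal sketch of how this could be used to paste the column into the first table, assuming every key in the first table occurs in the second: `match` returns, for each row of the first table, the row number of the same key in the second table, so the extracted values are already in the first table's row order. The names `t1`, `t2`, `idx` and `val` are hypothetical.

```r
# Hypothetical toy tables; keys in t2 are in a different order than in t1
t1 <- data.frame(id = c("a", "b", "c"), x = 1:3, stringsAsFactors = FALSE)
t2 <- data.frame(id = c("c", "a", "b", "z"), p = 1:4, q = 5:8,
                 val = c(30, 10, 20, 99), stringsAsFactors = FALSE)

idx    <- match(t1[, 1], t2[, 1])  # for each row of t1, the row of t2 with the same key
t1$val <- t2[idx, 4]               # copy t2's fourth column into t1, aligned row by row
```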
Simon O'Hanlon

This is a classic merge:

merge(c,d[,c(1,4)],by=1)

If you have names in your data tables, the matching may be performed without specifying the by parameter. As a side note, since c is a very common base function (which I've used here), it is not a great choice for a variable name.
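A small sketch of the merge approach with named columns, using hypothetical tables `t1` and `t2`. By default `merge` keeps only keys present in both tables; the `all.x = TRUE` argument is shown as one option if unmatched rows of the first table should be kept with `NA`.

```r
# Hypothetical toy tables sharing a key column named "id"
t1 <- data.frame(id = c("a", "b", "c"), x = 1:3, stringsAsFactors = FALSE)
t2 <- data.frame(id = c("c", "a", "z"), p = 1:3, q = 4:6,
                 val = c(30, 10, 99), stringsAsFactors = FALSE)

# Only keys found in both tables; just the key and the wanted column are taken from t2
inner <- merge(t1, t2[, c("id", "val")], by = "id")

# Keep every row of t1; val is NA where t2 has no matching key
left <- merge(t1, t2[, c("id", "val")], by = "id", all.x = TRUE)
```

Because the key column has the same name in both tables, `by = "id"` could also be left out here, as the answer notes.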

James
  • @Arun indeed it will be faster. – agstudy Mar 13 '13 at 11:26
  • @Arun True, but the extra time that `merge` spends is on safety and convenience. Unless speed is proving an issue, I would err on the side of caution. – James Mar 13 '13 at 11:27
  • Can you explain the safety issue? Is this because of NA returns in the case of nomatch? I suppose you could just do `f <- d[ na.omit( match(c[,1] , d[,1]) ) , ]` in that case – Simon O'Hanlon Mar 13 '13 at 11:31
  • @SimonO101 Yes, and if you name your variables appropriately, it helps ensure that you are matching on the correct variables. – James Mar 13 '13 at 11:45
  • Fairly sure. If I understand correctly, he wants elements from `d` that have matches in `c`. So `c <- 1:10; d <- seq(3,15,3); f<- d[ match( c , d ) ]` gives `NA NA 3 NA NA 6 NA NA 9 NA`. `NA` means that the value from the first vector was not found in the second. I *think* that's what he wants, no? – Simon O'Hanlon Mar 13 '13 at 11:53
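The `NA` behaviour discussed in these comments can be illustrated with a short sketch. The data frames `first` and `second` are hypothetical; the `na.omit` variant is the one suggested in the comment above.

```r
# Hypothetical tables: only the keys 3 and 5 of `first` occur in `second`
first  <- data.frame(id = 1:5)
second <- data.frame(id = c(3, 5, 8), val = c("x", "y", "z"),
                     stringsAsFactors = FALSE)

idx     <- match(first[, 1], second[, 1])  # NA NA 1 NA 2
with_na <- second[idx, ]                   # one row per row of `first`, all-NA where unmatched
no_na   <- second[na.omit(idx), ]          # only the rows of `second` that have a match
```

Keeping the `NA` rows preserves alignment with the first table, while dropping them gives just the matched rows; which one is wanted depends on whether the result is pasted back into the first table.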