
Since intersect doesn't work with dataframes, I'm trying to use subset to create a subset of dfA with only data for which dfA's row names match dfB's row names. I should end up with 3000 rows because dfA has 5000 rows and dfB has 3000, and all of dfB’s row names exist in dfA’s row names.

The following just returns dfA's column names without any data.

mysubset = subset(dfA, dfA[,0] %in% dfB[,0]) 
M--
user8121557

2 Answers


Subset dfA by comparing the row names of the two data frames:

dfA[which(rownames(dfA) %in% rownames(dfB)),]

Here `%in%` tests each row name of dfA against the row names of dfB, `which` converts the matches to row indices, and dfA[...] uses those indices to extract the rows from dfA.

If you want to stick to your solution (which costs a bit more, computationally):

subset(dfA, rownames(dfA) %in% rownames(dfB)) 
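For context on why the original attempt came back empty: `dfA[, 0]` selects zero columns, so it does not hand back the row names as data, and there is nothing for `%in%` to compare. A minimal sketch, using hypothetical toy data frames (not the asker's data), that also runs the `which()` and `subset()` forms side by side:

```r
# Hypothetical toy data; the row names stand in for the asker's identifiers
dfA <- data.frame(x = 1:5, y = 6:10, row.names = letters[1:5])
dfB <- data.frame(x = 1:3, row.names = c("b", "d", "e"))

# Column index 0 selects no columns, so no row-name data comes back
ncol(dfA[, 0])

# Row names are read with rownames(); this gives one TRUE/FALSE per row of dfA
keep <- rownames(dfA) %in% rownames(dfB)

# The two forms from this answer, applied to the same condition
by_which  <- dfA[which(keep), ]
by_subset <- subset(dfA, keep)

rownames(by_which)
rownames(by_subset)
```

Both results should contain the same rows; the number of rows kept equals the number of dfB row names found in dfA.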
M--

The `rownames` function gives you the row names as a character vector. Comparing the two vectors with `%in%` then produces the logical condition you were trying to build.

Example, using small data frames with some shared row names:

dfA <- data.frame(x = 1:5,
                  y = 6:10,
                  row.names = letters[1:5])
# Show dfA
dfA
  x  y
a 1  6
b 2  7
c 3  8
d 4  9
e 5 10


dfB <- data.frame(x = 1:5,
                  y = 6:10,
                  row.names = letters[3:7])

# Show dfB
dfB
  x  y
c 1  6
d 2  7
e 3  8
f 4  9
g 5 10

Solution

# Subset rows with matching rownames 

dfA[ rownames(dfA) %in% rownames(dfB), ]
  x  y
c 3  8
d 4  9
e 5 10
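Since the question started from `intersect`: it does work on the row-name vectors themselves, and its result can be used to index rows by name. A minimal sketch on the same toy data as the example above (redefined here so the block runs on its own); the final count is only an illustrative way to confirm whether every row name of dfB is present in dfA, not something from the original question:

```r
# Same toy data as the example above
dfA <- data.frame(x = 1:5, y = 6:10, row.names = letters[1:5])
dfB <- data.frame(x = 1:5, y = 6:10, row.names = letters[3:7])

# intersect() on the row-name vectors keeps the order of its first argument
shared <- intersect(rownames(dfA), rownames(dfB))

# Index dfA by name; the trailing comma keeps all columns
by_name <- dfA[shared, ]
by_name

# Number of dfB row names that are absent from dfA
# (0 would mean dfB's row names are a full subset of dfA's)
n_missing <- sum(!rownames(dfB) %in% rownames(dfA))
n_missing
```

With unique row names this selects the same rows as the `%in%` version. In the toy data two of dfB's row names (`f` and `g`) are not in dfA, so the count is nonzero; in the situation described in the question it would be expected to be 0.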
Damian
  • I chose this answer since it's economical; although, Masoud's answer works as well. I don't know why I forgot all about `rownames`; I must've been so focused on using `mydf[,0]`. Thanks! – user8121557 Jul 20 '17 at 12:14