
I have a large data frame, My_Data, which contains a few thousand names. I am trying to subset it using a vector of names, Names.rm, but I keep getting a data frame with 0 rows back, even though the names are present in My_Data.

This is what I have tried:

My_Data[My_Data$Author_name %in% Names.rm, ]

subset(My_Data, Author_name %in% Names.rm)

EDIT:

Sorry, I'm not sure of the proper way to format data, but I'll try to give a sample:

My_Data:

                           Author Time.period Gender
8            AERTS R Rien ECOLOGY   2001-2005      M
10      AGRAWAL AA Anurag ECOLOGY   2001-2005      M
12 AINSLIE G George NEUROSCIENCES   2001-2005      M
73        BLOB RW Richard ZOOLOGY   2001-2005      M

Names.rm:

1 AERTS R Rien ECOLOGY
2 BLOB RW Richard ZOOLOGY

Code used: My_Data[My_Data$Author %in% Names.rm, ]

Expected output:

                    Author Time.period Gender
8     AERTS R Rien ECOLOGY   2001-2005      M
73 BLOB RW Richard ZOOLOGY   2001-2005      M

Actual output (when tried with whole dataframe):

[1] Author Time.period Gender
<0 rows> (or 0-length row.names)

EDIT 2: OK, so it worked with that subset of the data, but it isn't working when I try it on my whole dataset. Is there a limit to the size of the dataset this works for?

I have read "Selecting columns in R data frame based on those *not* in a vector" and "Select rows from a data frame based on values in a vector".
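A minimal diagnostic sketch for the 0-row result, not a confirmed diagnosis. It assumes that `Names.rm` is actually a one-column data frame rather than a character vector (the printed row numbers 1 and 2 hint at this, but the question does not confirm it) and that some names may carry stray whitespace. The miniature data, the column name `Author` inside `Names.rm`, the trailing space, and the names `n_direct`, `names_vec`, `result`, and `missing_names` are all hypothetical.

```r
# Hypothetical miniature of My_Data, built from the sample rows in the question
My_Data <- data.frame(
  Author = c("AERTS R Rien ECOLOGY", "AGRAWAL AA Anurag ECOLOGY",
             "AINSLIE G George NEUROSCIENCES", "BLOB RW Richard ZOOLOGY"),
  Time.period = "2001-2005",
  Gender = "M",
  stringsAsFactors = FALSE
)

# ASSUMPTION: Names.rm is a one-column data frame, and one entry has a
# trailing space (both are illustrative, not taken from the real data)
Names.rm <- data.frame(
  Author = c("AERTS R Rien ECOLOGY", "BLOB RW Richard ZOOLOGY "),
  stringsAsFactors = FALSE
)

# Step 1: inspect what Names.rm really is (data frame, factor, character?)
str(Names.rm)

# Step 2: the pattern from the question, with the data frame used directly
# as the lookup table
n_direct <- nrow(My_Data[My_Data$Author %in% Names.rm, ])

# Step 3: extract the column as a plain character vector and trim whitespace
# on both sides before comparing
names_vec <- trimws(as.character(Names.rm$Author))
result <- My_Data[trimws(My_Data$Author) %in% names_vec, ]

# Step 4: list any names that still have no counterpart in My_Data
missing_names <- setdiff(names_vec, trimws(My_Data$Author))

print(n_direct)
print(result)
print(missing_names)
```

If `str(Names.rm)` reports a data frame or a factor on the real data, comparing against the extracted character column as in step 3 is worth trying. Anything left in `missing_names` points at names that genuinely differ (spelling, case, or encoding) rather than at a size limit.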

DanielWard
  • If you get 0 rows, the names are not present in `My_Data`. Also, if you don't post your actual data, it's difficult to give any kind of useful help. – nicola Oct 31 '18 at 14:32
  • `mtcars[mtcars$cyl %in% 6,]` works for me, so your patterns do work. Like nicola said, it's probably a mismatch in your `Names.rm`, so post a small example (not the whole thing) of your data and names to drop. – Nate Oct 31 '18 at 14:33
  • It might be helpful if you could reproduce the error and supply some data / code to demonstrate where your issue is occurring. – zack Oct 31 '18 at 14:38
