
I'm importing a .sav file into RStudio. Now I want to select only the rows for a specific nation (column header: nation) and a specific year (column header: year), using the following code:

myfile_nation_year <- subset(myfile, (nation == "Great Britain") & (year == "2012"))

I only get this error message:

 Error in subset.default(sigma_org, (nation == "Great Britain") & (year ==  : 
  object 'nation' not found

When I look at my file in the Viewer, the header row shows nation, year and the other column names.

I also tried:

myfile_nation_year <- subset(myfile, (myfile$nation == "Great Britain") & (myfile$year == "2012"))

I get no error message but an empty list. I bet it's a piece of cake for someone experienced, but I'm new to R and don't know what I did wrong.

str(myfile) 
List of 3184 
 $ nation : Factor w/ 20 levels "France","Germany",..: 1 1 1 1 1 1 1 1 1 1 ... 
 $ region : Factor w/ 9 levels "Europe","USA",..: 1 1 1 1 1 1 1 1 1 1 ... 
 $ city_chn : Factor w/ 23 levels "Beijing","Shanghai",..: NA NA NA NA NA NA NA NA NA NA ... 
 $ citych_tiers : Factor w/ 5 levels "Else","Tier 1",..: NA NA NA NA NA NA NA NA NA NA ... 
 $ year : Factor w/ 8 levels "2007","2008",..: 8 8 8 8 8 8 8 8 8 8 ...
  • Do you have any spaces in the name "nation"? Such as " nation" or "nation ". –  Aug 20 '15 at 09:35
  • No, not as far as I can see. But if there were, what would I have to do? – marco_stuggi Aug 20 '15 at 09:38
  • Does putting backticks ( ` ) around nation work? – Jaap Aug 20 '15 at 09:46
  • No, unfortunately not. – marco_stuggi Aug 20 '15 at 09:49
  • How did you read the data? With the foreign package? – Jaap Aug 20 '15 at 09:52
  • Yes, exactly, with `read.spss(file.choose())`. I can't read the file directly; I get `permission denied`, I think because I don't have admin rights on my computer at work. – marco_stuggi Aug 20 '15 at 09:56
  • Just to be sure, can you try `str(myfile)` and show us the output? – maj Aug 20 '15 at 10:00
  • Sure. Hope the format is okay: `str(myfile) List of 3184 $ nation : Factor w/ 20 levels "France","Germany",..: 1 1 1 1 1 1 1 1 1 1 ... $ region : Factor w/ 9 levels "Europe","USA",..: 1 1 1 1 1 1 1 1 1 1 ... $ city_chn : Factor w/ 23 levels "Beijing","Shanghai",..: NA NA NA NA NA NA NA NA NA NA ... $ citych_tiers : Factor w/ 5 levels "Else","Tier 1",..: NA NA NA NA NA NA NA NA NA NA ... $ year : Factor w/ 8 levels "2007","2008",..: 8 8 8 8 8 8 8 8 8 8 ...` – marco_stuggi Aug 20 '15 at 11:05
  • Looks like you've got two problems: `myfile` is a list, not a data frame, and `myfile$nation` and `myfile$year` are factors, not strings. Try `myfile <- as.data.frame(myfile)`, then `myfile$nation <- as.character(myfile$nation)` and `myfile$year <- as.character(myfile$year)`, then retry your `subset()` code and see if that works. – ulfelder Aug 20 '15 at 11:32
  • @ulfelder thank you very much! I only needed the first half of your answer (converting the list into a data frame). It's working now! – marco_stuggi Aug 20 '15 at 13:09
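
To illustrate the fix from the comments, here is a minimal sketch. It assumes a small hand-built list as a stand-in for what `read.spss()` returns by default; the values are made up for illustration and are not taken from the original file.

```r
# Hypothetical stand-in for the imported object: a plain list of columns,
# which is what read.spss() returns when to.data.frame is left at its default.
myfile <- list(
  nation = factor(c("France", "Great Britain", "Great Britain", "Germany")),
  year   = factor(c("2012", "2012", "2011", "2012"))
)

# Convert the list to a data frame so subset() can look up the column names.
myfile <- as.data.frame(myfile)

# Factors are compared against strings by their labels, so the
# as.character() conversion is not required for this filter.
myfile_nation_year <- subset(myfile, nation == "Great Britain" & year == "2012")
print(myfile_nation_year)
```

With the foreign package, `read.spss(file.choose(), to.data.frame = TRUE)` should return a data frame directly, which would make the separate conversion step unnecessary.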

1 Answer


I assume you first imported your .sav file and stored it in the object myfile.

Try:

head(myfile)

This shows how your data looks in R, so you can check directly whether the columns are named correctly.

If they aren't, it means subset() was used incorrectly; try removing the parentheses around the nation and year conditions.
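
Besides head(), checking the type of the object can help narrow things down. The following is only a sketch on a hand-built list (hypothetical values, not the original data), showing base R checks that reveal whether an object is a plain list or a data frame:

```r
# Hypothetical object shaped like an import that produced a list of columns.
imported <- list(
  nation = factor(c("France", "Great Britain")),
  year   = factor(c("2012", "2012"))
)

# Inspect the object before converting it.
class_before <- class(imported)          # class of the raw object
is_df_before <- is.data.frame(imported)  # FALSE for a plain list
column_names <- names(imported)          # names are present even on a list

# After conversion the same check reports a data frame.
converted   <- as.data.frame(imported)
is_df_after <- is.data.frame(converted)

print(class_before)
print(column_names)
```

Column names show up in both cases, which is why looking at the headers alone does not reveal the problem.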

Jacques Peeters
  • Thanks for your answer. This might help me with other problems in the future, but the solution came from @ulfelder. – marco_stuggi Aug 20 '15 at 13:11