Here is a sample of my data, which can be found at this link: http://www.uwyo.edu/crawford/datasets/drugreactions.txt

I fit this model to the data:

    fit2 <- lm(Allergens~Gender*Race*Druglevel, data=dr)

which gave me this output:

[image of the model output]

I know how to reorder the factor levels to get a Black male baseline with:

    dr$Race<-factor(dr$Race,levels=c("Black","Latino","Indian","Asian","NativeAmerican","White"))
    dr$Gender<-factor(dr$Gender,levels=c("Male","Female"))
    newfit <- lm(Allergens~Gender*Race, data=dr)

However, what I want is to be able to take certain coefficients out. For example, say I want only white males and black females in the model, instead of all the other categories. I tried

    whitefit <- lm(Allergens~(Gender="Male"), data=dr)

but got an error because the number of rows in Allergens did not match the number of rows where Gender equals Male.

Ideally, I would like a way to take out any category, so that I could completely customize the model and drop things for simplicity's sake. For example, taking male Indians out of the model above.

Maxwell Chandler

1 Answer

You need to subset the data, either through indexing or by using the `subset` function:

    dr <- read.table("http://www.uwyo.edu/crawford/datasets/drugreactions.txt",
                     header=TRUE, stringsAsFactors = TRUE)

    # Example excluding Indians (level names are case-sensitive):
    newfit <- lm(Allergens ~ Gender * Race, data = subset(dr, subset = Race != "Indian"))

    # Example using only White Males and Black Females
    wmbf.fit <- lm(Allergens ~ Gender * Race, 
                   data = subset(dr, subset = (Race == "White" & Gender == "Male") |
                                              (Race == "Black" & Gender == "Female")))
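
The indexing route works the same way. Here is a minimal sketch on made-up data: `toy` is a hypothetical data frame that only borrows the column names and a few level names from the question, not the real file.

```r
# Hypothetical toy data; only the column names follow the real file
set.seed(1)
toy <- expand.grid(Gender = c("Male", "Female"),
                   Race = c("Black", "White", "Indian"),
                   obs = 1:4, stringsAsFactors = TRUE)
toy$Allergens <- rnorm(nrow(toy), mean = 10)

# Logical index: TRUE for every row that is not Indian
keep <- toy$Race != "Indian"

# Fit the same model via row indexing and via subset()
fit_index  <- lm(Allergens ~ Gender * Race, data = toy[keep, ])
fit_subset <- lm(Allergens ~ Gender * Race, data = subset(toy, Race != "Indian"))

# Compare the two sets of coefficients
all.equal(coef(fit_index), coef(fit_subset))
```

Both calls pass the same rows to `lm`, so choosing between them is mostly a matter of readability.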

However, if you want to exclude a gender altogether, you'll need to change your formula to exclude Gender: all remaining observations will have the same value of Gender, so this variable can't possibly contribute to the model.
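
To illustrate the single-gender case, here is a hedged sketch on made-up data. `toy`, `males` and `malefit` are hypothetical names, and the values are random rather than taken from the real file.

```r
# Hypothetical toy data; only the column names follow the real file
set.seed(1)
toy <- expand.grid(Gender = c("Male", "Female"),
                   Race = c("Black", "White", "Indian"),
                   obs = 1:4, stringsAsFactors = TRUE)
toy$Allergens <- rnorm(nrow(toy), mean = 10)

# Keep males only
males <- subset(toy, Gender == "Male")

# Gender now has a single observed level, so keeping it in the formula fails
bad <- try(lm(Allergens ~ Gender * Race, data = males), silent = TRUE)
inherits(bad, "try-error")

# Dropping Gender from the formula gives a model that can be estimated
malefit <- lm(Allergens ~ Race, data = males)
coef(malefit)
```

The failing call is wrapped in `try` so the script keeps running; in an interactive session `lm` would simply stop with an error about contrasts.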

Dominic Comtois