
I built an lm model in R on 65,000 rows (mydata), and I want to see the predictions for only the first 5 rows to check how well my model predicts. Below is the code I wrote for this, but it keeps predicting values for all 65,000 rows. Can someone help me?

lm_model2002 <- lm(`AC: Volume` ~ `Market Area (L1)`,data=mydata)
summary(lm_model2002) 
df = head(data.frame(`Market Area (L1)`=mydata$`Market Area (L1)`),5)
predict(lm_model2002,newdata=df)

But now the real problem: I took the first row of mydata and copied it 5 times, then made a vector that ranges from 1 to 2 and replaced one of the variables (price per unit) with that vector. I want to predict for the same rows with only the price changed, so that I can plot how the prediction evolves as the price rises:

lm_model3204<- lm(`AC: Volume` ~ log(price_per_unit)*(Cluster_country_hierarchical+`Loyalty-cumulative-volume-10`+`Loyalty-cumulative-orders-10`+`Loyalty-number-of-order-10`+price_discount+Incoterms)+Cluster_spg*(price_discount+Cluster_country_hierarchical)+price_discount*(Month+`GDP per capita`+`Loyalty-cumulative-orders-10`+`Loyalty-cumulative-volume-10`)+`Payer CustGrp`+`CRU Index`,data = mydata)
summary(lm_model3204)
test_data <- mydata[1:1,] 
df <- data.frame(test_data,ntimes=c(5)) 
df <- as.data.frame(lapply(df, rep, df$ntimes)) 
priceperunit<-seq(1,2,by=0.25) 
df$price_per_unit<-priceperunit 
pred <- predict(lm_model3204,newdata=df) 

1 Answer


Please use a minimal reproducible example next time you post a question.

You just have to pass the first five rows as newdata. Here is an example with the built-in iris dataset:

data("iris")

lm_model2002 <- lm(Sepal.Length ~ Sepal.Width,data=iris)
summary(lm_model2002)

predict(lm_model2002,newdata=iris[1:5,])

output:

> predict(lm_model2002,newdata=iris[1:5,])
       1        2        3        4        5 
5.744459 5.856139 5.811467 5.833803 5.722123 

Or:

df <- head(iris,5)
predict(lm_model2002,newdata=df)
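
To judge how well the model predicts, it can help to put the five predictions next to the observed values. The following is a minimal sketch on iris; the names `fit`, `first5` and `comparison` are illustrative and not taken from the question.

```r
data(iris)

# Same kind of model as above, fitted on the full dataset
fit <- lm(Sepal.Length ~ Sepal.Width, data = iris)

# Predict only for the first five rows
first5 <- head(iris, 5)

# Observed and predicted values side by side, plus their difference
comparison <- data.frame(
  observed  = first5$Sepal.Length,
  predicted = predict(fit, newdata = first5)
)
comparison$residual <- comparison$observed - comparison$predicted

print(comparison)
```

Because these rows were also used to fit the model, the residuals here are in-sample errors and will tend to look better than errors on unseen data.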

EDIT

Following your last comment: to see how the prediction changes when you vary one of the independent variables, repeat a single row and overwrite that variable:

data(iris)

df <- iris[rep(1,5),]
Petal_Length<-seq(1,2,by=0.25)
df$Petal.Length<-Petal_Length

lm_model3204 <- lm(Sepal.Length ~ Petal.Length+Sepal.Width,data=iris)
pred <- predict(lm_model3204,newdata=df) 
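
Since the goal is to plot how the prediction evolves, the predictions can be drawn against the varied variable. This is a small sketch using base graphics on iris; `scenario` and `fit` are illustrative names, and Petal.Length stands in for the price variable of the question.

```r
data(iris)

fit <- lm(Sepal.Length ~ Petal.Length + Sepal.Width, data = iris)

# Five copies of the first row, differing only in Petal.Length
scenario <- iris[rep(1, 5), ]
scenario$Petal.Length <- seq(1, 2, by = 0.25)

# One prediction per scenario row
scenario$pred <- predict(fit, newdata = scenario)

# Predicted response as a function of the varied predictor
plot(scenario$Petal.Length, scenario$pred, type = "b",
     xlab = "Petal.Length", ylab = "Predicted Sepal.Length")
```

With a model that is linear in the varied variable the points fall on a straight line; with a term such as log(price_per_unit) and interactions, as in the question's model, the curve would bend.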
Elia
  • Thanks. I want to share my real problem. I took the first row of mydata and copied it 5 times, then made a vector that ranges from 1 to 2 and replaced one of the variables (price per unit) with that vector. I want to predict for the same rows with only the price changed, so that I can plot how the prediction evolves as the price rises: `test_data <- mydata[1:1,]`, `df <- data.frame(test_data,ntimes=c(5))`, `df <- as.data.frame(lapply(df, rep, df$ntimes))`, `priceperunit<-seq(1,2,by=0.25)`, `df$price_per_unit<-priceperunit` and then of course `pred <- predict(lm_model3204,newdata=df)` – Lucasjansens Apr 12 '22 at 10:28
  • Post your data with `dput(your_df)`, or if it is too big, `dput(head(your_df,20))` – Elia Apr 12 '22 at 10:29
  • It keeps predicting on the whole dataset but I only want predictions for the df set with 5 rows – Lucasjansens Apr 12 '22 at 10:32
  • Please post a reproducible example or your data by copying/pasting the output of `dput(your_data)`. In the meantime, I will try to figure out your problem – Elia Apr 12 '22 at 10:37
  • this is our model: lm_model3204<- lm(`AC: Volume` ~ log(price_per_unit)*(Cluster_country_hierarchical+`Loyalty-cumulative-volume-10`+`Loyalty-cumulative-orders-10`+`Loyalty-number-of-order-10`+price_discount+Incoterms)+Cluster_spg*(price_discount+Cluster_country_hierarchical)+price_discount*(Month+`GDP per capita`+`Loyalty-cumulative-orders-10`+`Loyalty-cumulative-volume-10`)+`Payer CustGrp`+`CRU Index`,data = mydata) – Lucasjansens Apr 12 '22 at 10:41
  • I updated the question and gave a preview of the data which I call mydata – Lucasjansens Apr 12 '22 at 10:58
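
One possible explanation for the behaviour described in the comments, offered as an assumption and not a confirmed diagnosis: `data.frame()` and `as.data.frame()` rewrite non-syntactic column names such as `Market Area (L1)` by default (`check.names = TRUE`), so the columns in newdata no longer match the names in the model formula. `predict()` then cannot find the predictors in newdata and may either fail or fall back to variables of the full length found elsewhere (for example after `attach()`). The sketch below uses a made-up `toy` data frame as a stand-in for mydata.

```r
# Toy data with non-syntactic column names (hypothetical stand-in for mydata)
set.seed(1)
toy <- data.frame(
  `AC: Volume`       = rnorm(20),
  `Market Area (L1)` = runif(20),
  check.names = FALSE
)
fit <- lm(`AC: Volume` ~ `Market Area (L1)`, data = toy)

# Default check.names = TRUE: the column is renamed and no longer matches the formula
mangled <- data.frame(`Market Area (L1)` = toy$`Market Area (L1)`[1:5])
print(names(mangled))

# Row subsetting keeps the original names, so predict() uses exactly these rows
kept <- toy[rep(1, 5), ]
kept$`Market Area (L1)` <- seq(0.1, 0.5, by = 0.1)
pred5 <- predict(fit, newdata = kept)
print(length(pred5))
```

If this is the cause, building the five rows with `mydata[rep(1, 5), ]`, or passing `check.names = FALSE` when calling `data.frame()`, keeps the names intact.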