I have a data set with two categories. I want to draw a highcharter scatter plot in R in which each category has its own color and its own regression line. I used a for loop to give each category a different color, and I calculated two points for each regression line. The problem is that I can only draw one regression line on the scatter plot. I tried adding the lines inside the loop and with %>%, but neither worked. I also cannot add a title to the chart. Here is my code.

data<-data.frame(gene=rnorm(20),metab=rnorm(20),type=sample(0:1,1000,rep=TRUE))[1:20,]]
b<-glm(data$metab~data$gene+data$type+data$gene:data$type)
coefficients<-t(b$coefficients)
i<-coefficients[,1]
g<-coefficients[,2]
c2<-coefficients[,3]
g.c2<-coefficients[,5]

u<-matrix(c(1,0))
type1<-u[1,1]
type2<-u[2,1]

largest1<- max(data$gene[data$type==type1])
largest2<-max(data$gene[data$type==type2])
min1<-min(data$gene[data$type==type1])
min2<-min(data$gene[data$type==type2])

line1<-data.frame(x=c(largest1,min1),y=c(g*largest1+i,min1*g+i))
line2<-data.frame(x=c(largest2,min2),y=c(g*largest2+g.c2*largest2+i+c2,min2*g+i+c2))

hc<-highchart(width = 800, height = 700) 
#hc_title(text = "scatterplot",
             #style = list(color = '#2E1717',fontSize = '20px',
                          #fontWeight = 'bold')) %>%
for(type in u){
       hc<-hc%>% 
        hc_add_series_scatter(data$gene[data$type==type],data$metab[data$type==type],name=sprintf("type %s", type),
                              showInLegend = TRUE) 
        #if(type==type1){                
        #   hc_add_series(hc,data=line1,type='line',name='regression line1') 
        #}else if (type==type2){                
        #   hc_add_series(hc,data=line2,type='line',name='regression line2')
        #}
} 

hc 

hc_add_series(hc,data=line1,type='line',name='regression line1',enableMouseTracking=FALSE,marker=FALSE) %>%
hc_add_series(hc,data=line2,type='line',name='regression line2',enableMouseTracking=FALSE,marker=FALSE)

line1 and line2 are data frames, each holding the two end points of one regression line.

Kacper Madej
刘小明

1 Answer

I get errors when I try to run the code you provided. The first line ends with an extra ], which prevents the code from running (or something else is missing there).

After removing it, coefficients[,5] fails with an index out of bounds error, because the model has only four coefficients. I fixed that by changing the index to [,4].
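One way to avoid this kind of off-by-one index is to look coefficients up by name instead of by position. The following is only a sketch: the data frame d and the seed are made up for illustration, and the model is written with the data= argument so that the coefficient names come out short (with data$gene in the formula, as in the question, the names would carry the data$ prefix instead).

```r
# Illustrative data, not the asker's: 20 rows, type alternating 0/1.
set.seed(1)
d <- data.frame(gene = rnorm(20), metab = rnorm(20), type = rep(0:1, 10))

# gene * type expands to gene + type + gene:type, the same terms as in the question.
fit <- glm(metab ~ gene * type, data = d)

cf <- coef(fit)
names(cf)  # "(Intercept)" "gene" "type" "gene:type"

# Named access: no need to remember which column is which.
i    <- cf[["(Intercept)"]]
g    <- cf[["gene"]]
c2   <- cf[["type"]]
g.c2 <- cf[["gene:type"]]
```

Printing names(cf) first also shows at a glance that there are four coefficients, not five.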

In the last two lines of your code you call hc_add_series with hc as the first argument. This applies a single change and opens the chart, but the result is not saved back to hc.
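The underlying behaviour is ordinary R semantics: a function returns a modified copy, and the original object stays as it was unless the result is assigned. A minimal sketch of that, using a plain list and a made-up helper add_item() as a stand-in for hc_add_series (neither is part of highcharter):

```r
# add_item() is a hypothetical stand-in for hc_add_series(): it returns a
# modified copy and never changes the object it was given.
add_item <- function(chart, item) {
  chart$series <- c(chart$series, list(item))
  chart
}

chart <- list(series = list())

# The result is printed but not assigned, so the addition is lost.
add_item(chart, "regression line1")
n_before <- length(chart$series)   # 0

# Assigning the result back keeps both additions.
chart <- add_item(chart, "regression line1")
chart <- add_item(chart, "regression line2")
n_after <- length(chart$series)    # 2
```

The same pattern applies to a chart object: either assign the result of each call, or chain the calls with %>% and assign the end of the chain once.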

In the end I used the code below:

data<-data.frame(gene=rnorm(20),metab=rnorm(20),type=sample(0:1,1000,rep=TRUE))[1:20,]
b<-glm(data$metab~data$gene+data$type+data$gene:data$type)
coefficients<-t(b$coefficients)
i<-coefficients[,1]
g<-coefficients[,2]
c2<-coefficients[,3]
g.c2<-coefficients[,4]

u<-matrix(c(1,0))
type1<-u[1,1]
type2<-u[2,1]

largest1<- max(data$gene[data$type==type1])
largest2<-max(data$gene[data$type==type2])
min1<-min(data$gene[data$type==type1])
min2<-min(data$gene[data$type==type2])

# type1 is 1, so its line carries the type and interaction terms; type2 is 0, the baseline
line1<-data.frame(x=c(largest1,min1),y=c((g+g.c2)*largest1+i+c2,(g+g.c2)*min1+i+c2))
line2<-data.frame(x=c(largest2,min2),y=c(g*largest2+i,g*min2+i))

hc <- highchart(width = 800, height = 700) %>%
  hc_title(text = "scatterplot", style = list(color = '#2E1717',fontSize = '20px', fontWeight = 'bold'))

for(type in u){
       hc <- hc %>% 
        hc_add_series_scatter(data$gene[data$type==type],data$metab[data$type==type],name=sprintf("type %s", type),showInLegend = TRUE) 
        #if(type==type1){                
        #   hc_add_series(hc,data=line1,type='line',name='regression line1') 
        #}else if (type==type2){                
        #   hc_add_series(hc,data=line2,type='line',name='regression line2')
        #}
} 

hc <- hc %>%
hc_add_series(data=line1,type='line',name='regression line1',enableMouseTracking=FALSE,marker=FALSE) %>%
hc_add_series(data=line2,type='line',name='regression line2',enableMouseTracking=FALSE,marker=FALSE)

hc

Two regression series are now added, but their data is wrong because it is in the wrong format. Your series data is an object in which x and y are arrays of numbers. Series data should instead be an array of arrays of the form [X, Y], or an array of objects of the form {x: X, y: Y}, where X and Y are numbers and x and y are the property names. The problem is visible at the JS level and should be fixed in R. I am not an R expert, but I think this is a good starting point.
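On the R side, one possible way to get the [X, Y] shape is to turn each row of the data frame into an unnamed two-element vector. This is a sketch in base R only; line_df and its values are made up and stand in for line1 or line2, and it assumes the usual JSON conversion where a list of numeric vectors becomes an array of arrays.

```r
# Hypothetical two-point line, standing in for line1 from the code above.
line_df <- data.frame(x = c(2.0, -1.5), y = c(0.8, -0.3))

# Put the points in ascending x order, which line series generally expect.
line_df <- line_df[order(line_df$x), ]

# One unnamed c(x, y) pair per row; as JSON this is [[x1, y1], [x2, y2]].
line_points <- lapply(seq_len(nrow(line_df)), function(k) {
  c(line_df$x[k], line_df$y[k])
})
```

The resulting list can then be passed as the data argument, for example hc_add_series(data = line_points, type = 'line', name = 'regression line1'). highcharter also exports a helper called list_parse2 that I believe performs a similar conversion, but check its documentation for your installed version.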

Kacper Madej