I have two columns of data to plot as a scatter plot, but the points don't appear in the correct places: it looks as if the axes are shifted and the data points have been reordered. The data comes from a larger set, and the plan was to use lines() to add further series to the plot. I haven't been using R for long and am following the Line Charts examples on the Quick-R website. The data has already been binned and should produce a distribution curve. The plan is to use the R code within KNIME, if/when it works.

This is my data and code:

1                  |   0.2  |                 ?
2                  |   0.3  |0.9040357526438414
3                  |   0.4  |5.411174917770767
4                  |   0.5  |6.680564306410919
5                  |   0.6  |6.209330779324634
6                  |   0.7  |6.064339545114229
7                  |   0.8  |6.689841006370736
8                  |   0.9  |8.755978393214562
9                  |   1.0  |13.11661304772278
10                 |   1.1  |17.093411510054928
11                 |   1.2  |15.479560872141883
12                 |   1.3  |8.420296969382726
13                 |   1.4  |3.5346457261075566
14                 |   1.5  |1.6256413200367157
15                 |   1.6  |2.2848049576096026
16                 |   1.7  |1.2394427974974978
17                 |   1.8  |0.28169014084507044
18                 |   1.9  |                 ?
19                 |   2.0  |                 ?
20                 |   2.1  |                 ?
21                 |   2.2  |                 ?
22                 |   2.3  |                 ?
23                 |   2.4  |                 ?
24                 |   2.5  |                 ?
25                 |   2.6  |                 ?
26                 |   2.7  |                 ?
27                 |   2.8  |                 ?
28                 |   2.9  |                 ?

nwells <- c(2, 6, 10, 14, 18)  # columns I want from the larger dataset
# set up an empty plot with fixed axes
plot(3, 20, type="n", xlim=c(0, 3), ylim=c(0, 26), xlab="Intensity",
     ylab="Proportion", xaxs="i", yaxs="i")
colors <- rainbow(length(nwells))
linetype <- c(1:length(nwells))
plotchar <- seq(nwells)

# add lines
n <- 1  # index of the current series (assumed; n was not defined in the posted snippet)
well <- data1[2]
bin <- data1[1]
data <- data.frame(bin, well)
grid()  # called on its own rather than passed as an argument to lines()
lines(data, type="p", lwd=1.5,
      lty=1, col=colors[n], pch=plotchar[n])

This is my plot: [image: distribution scatter plot, 1 sample, from R]

Note the point at (1.5, 5): where does that come from in this data set?

Does anybody have any idea what I am doing wrong?

tech133

1 Answer

I think there are some issues caused by an incorrect conversion from factor to numeric in column 2 of your data. The following should fix it (note that the point at (1.5, 5) is no longer there):

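well <- data1[2]
bin <- data1[1]
data <- data.frame(bin, well)
# convert via character so the actual values are used, not the factor level codes
# (done after data is built, so the conversion is not overwritten)
data[,2] <- as.numeric(as.character(data[,2]))
grid()  # called on its own rather than passed as an argument to lines()
lines(data, type="p", lwd=2.5,
      lty=1, col=colors, pch=plotchar)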
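
To see why the conversion matters, here is a minimal sketch. It assumes the second column was read as a factor, which would happen if the "?" entries forced it to text and the data was read with stringsAsFactors = TRUE (the default before R 4.0). The vector `raw` is a made-up excerpt of the question's values, not the real data1.

```r
# Hypothetical stand-in for column 2: "?" marks a missing value,
# so the column cannot be read as numeric and ends up as a factor.
raw <- c("?", "0.9040357526438414", "5.411174917770767", "13.11661304772278")
well <- factor(raw)

# as.numeric() on a factor returns the internal level codes (1, 2, 3, ...),
# which is how points can end up at y-values that are not in the data.
codes <- as.numeric(well)

# Going through character recovers the real numbers; "?" becomes NA.
# The coercion warning is suppressed here because the NA is expected.
values <- suppressWarnings(as.numeric(as.character(well)))

print(codes)
print(values)
```

The level codes depend on how the strings happen to sort, so they bear no relation to the measured proportions.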

Sandipan Dey
  • Thanks, I thought I'd cracked the numeric/factor issue. Your solution gives a warning: `> data[,1]<-as.numeric(as.character(data[,1])) Warning message: NAs introduced by coercion`. After a bit more reading I realised this is because of the missing values represented by "?". Reading the file with `a<- read.csv("Sorted table.txt", header=T, sep=" ", dec=".", na.strings=("?"))` solved the issue. Thanks – tech133 Oct 04 '16 at 20:35
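
The `na.strings` approach from the comment above can be tried without the original file. This is only a sketch: the inline text stands in for "Sorted table.txt", and the column names `bin` and `well` are assumptions, since the real header is not shown.

```r
# Hypothetical excerpt of the exported table: space-separated, "?" for missing values.
txt <- "bin well
0.2 ?
0.3 0.9040357526438414
0.4 5.411174917770767
1.9 ?"

# Declaring "?" as a missing-value marker lets read.csv parse the column as numeric,
# so no factor-to-numeric conversion is needed afterwards.
a <- read.csv(text = txt, header = TRUE, sep = " ", dec = ".", na.strings = "?")

str(a)
```

With the "?" entries read as NA, both columns should come back numeric, and plot() or lines() will skip the missing points.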