2

mydata

mydata_x <- rep(seq(200,600,by=200),times=1,each=4)
mydata_y <- rep(seq(600,1200,by=200),times=4,each=1)
mydata_z <- c(0,0,0,0,0,0,0,529,0,0,0,0)
mydata <- data.frame(x=mydata_x,y=mydata_y,z=mydata_z)

I've taken samples from a virtual field and now want to predict neighboring values. It's been done by the two methods described below... both predicting a constant value for each point. I don't understand why.

Keep in mind that I'm a new R user and love it. I've been trying to follow an example I found online to perform my task. Using the example dataset, I have no problem achieving their outcome. However, I'm unable to apply it to my own data. See the example here: http://www.stat.ucla.edu/~nchristo/statistics_c173_c273/c173c273_lec11_w11.pdf

require(geoR)

#create x,y coordinates for locations to predict

kx <- rep(seq(250,700,by=100),times=1,each=length(seq(650,1250,by=100)))
ky <- rep(seq(650,1250,by=100),times=length(seq(250,700,by=100)),each=1)  
k_df <- data.frame(x=kx,y=ky) #coordinates as data.frame
k_matrix <- as.matrix(cbind(kx,ky)) #coordinates as.matrix

#convert sample data (top of post) as.geodata

b <- as.geodata(mydata)

#predict values

prediction <- ksline(b,cov.model = "gaussian",cov.pars = c(10,3.33),locations=k_df)
prediction$predict

All 35 outcomes return 44.0833 #I just don't get it!!!

I don't understand how I am returning a single value across the field when there are so many zeros. I would expect to see a single region of decreasing values.

Thoughts?

Community
  • 1
  • 1
braxtonlewis
  • 41
  • 1
  • 8
  • I only listed a single method here, contrary to my initial statement. The second method used was the function krige.conv. The predicted values using this method also come out constant. – braxtonlewis Sep 10 '15 at 15:33
  • I've found that asking the code to predict a much higher number (~500) of values within the grid produces better results. – braxtonlewis Sep 10 '15 at 17:16
  • Welcome to SO. +1 for including sample data and code, which is more than most people do... Unfortunately, your code does not run as is: the second line should have `times=3`. This may not seem like a big deal, but if you expect people to help you, it's really not reasonable to force them to struggle through simple errors like that. – jlhoward Sep 10 '15 at 20:38
  • The 'times=' code was meant to run for the number consistent with the 'length' of that sequence. For the example, I should have removed that variability to make it easier to read. I'll keep that in mind for next time. You helped me greatly. Thank you. – braxtonlewis Sep 11 '15 at 21:50

1 Answers1

1

Try this:

prediction <- ksline(b,cov.pars = c(10,200),locations=k_df)
cbind(k_matrix,prediction$predict)[12:20,]
#        kx   ky            
#  [1,] 350 1050  93.6977130
#  [2,] 350 1150 292.2674563
#  [3,] 350 1250 329.6293038
#  [4,] 450  650   0.5934677
#  [5,] 450  750  -0.1541056
#  [6,] 450  850  -2.9329553
#  [7,] 450  950   1.8124209
#  [8,] 450 1050  93.6977130
#  [9,] 450 1150 292.2674563

The kriging algorithm calculates an estimate for a prediction point based on values at "nearby" data points (discounted for distance from the prediction point). The "range" element (second element) of cov.pars=... controls how far to look for "nearby points". You had it set to 3.33, whereas the nearest point was ~ 50 units away. So basically the algorithm wasn't using any nearby points, and the estimate for all points is then simply the mean of all the data values.

jlhoward
  • 58,004
  • 7
  • 97
  • 140
  • Deep down I knew that I had misunderstood one of these variables. Now both the ksline and krige.conv codes are working beautifully, and providing similar results. Thank you greatly for your help. – braxtonlewis Sep 11 '15 at 12:18