I have some rather messy coordinates in degrees and decimal minutes (the source is out of my control) in the format shown below. Ultimately, I am trying to work out the distance between the points.

minlat <- "51  12.93257'"
maxlat <- "66  13.20549'"
minlong <- "- 5   1.23944'"
maxlong <- "- 5   1.36293'"

As they stand, they are in a rather unfriendly format for conv_unit (from the measurements package):

measurements::conv_unit(minlat, from = 'deg_dec_min', to = 'dec_deg')

and ultimately

distm(c(minlong, minlat), c(maxlong, maxlat), fun = distHaversine)

I think I need to use gsub() to get them into a friendly format, so that they become

minlat <- "51 12.93257" # removing the double space
minlong <- "-5 1.23944" # removing the double space and the space after the -

I've been messing around with gsub() all morning and it has beaten me. Any help would be great!

Jim

1 Answer

It sounds like you just need to strip all excess whitespace. We can try using gsub with lookarounds here.

minlong <- " - 5   1.23944 "   # -5 1.23944
minlong
gsub("(?<=^|\\D) | (?=$|\\D)", "", gsub("\\s+", " ", minlong), perl=TRUE)

[1] " - 5   1.23944 "
[1] "-5 1.23944"

The inner call to gsub replaces any run of whitespace with a single space. The outer call then removes a remaining single space unless it is sandwiched between two digits.
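
As a rough sketch of how the two gsub calls above might be wrapped up for all four coordinates, something like the following could work. The helper names clean_ddm and ddm_to_dec are hypothetical (they are not from any package), the removal of the trailing ' minute mark is an extra step I am assuming you want, and the conversion to decimal degrees is done by hand in base R rather than with measurements::conv_unit.

```r
# Hypothetical helper: tidy one or more degrees / decimal-minutes strings
# such as "- 5   1.23944'" into the form "-5 1.23944".
clean_ddm <- function(x) {
  x <- gsub("'", "", x, fixed = TRUE)   # assumption: drop the trailing minute mark
  x <- gsub("\\s+", " ", x)             # collapse each run of whitespace to one space
  # remove any space that is not sandwiched between two digits
  gsub("(?<=^|\\D) | (?=$|\\D)", "", x, perl = TRUE)
}

# Hypothetical helper: convert the cleaned "D M.m" strings to decimal degrees.
ddm_to_dec <- function(x) {
  parts <- strsplit(clean_ddm(x), " ", fixed = TRUE)
  vapply(parts, function(p) {
    deg  <- as.numeric(p[1])
    mins <- as.numeric(p[2])
    # take the sign from the text so that "-0 30.5" stays negative
    sgn <- if (startsWith(p[1], "-")) -1 else 1
    sgn * (abs(deg) + mins / 60)
  }, numeric(1))
}

minlat  <- "51  12.93257'"
minlong <- "- 5   1.23944'"

clean_ddm(minlat)    # "51 12.93257"
clean_ddm(minlong)   # "-5 1.23944"

lat <- ddm_to_dec(minlat)
lon <- ddm_to_dec(minlong)
```

The numeric results can then be passed on in longitude, latitude order, for example geosphere::distm(c(lon, lat), c(lon2, lat2), fun = geosphere::distHaversine), assuming geosphere is installed and lon2 and lat2 are the second point converted in the same way.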

Tim Biegeleisen
  • Great that worked a treat for the longs, but the double space remains in the lats, how would one go about keeping one space while removing the additional? Many thanks, I find regex very confusing – Jim Dec 07 '18 at 11:12
  • Ah no worries, thanks for your help! Much appreciated – Jim Dec 07 '18 at 11:34
  • @Jim I fixed it, but I had to make two calls to `gsub`. I failed to do it in one regex. – Tim Biegeleisen Dec 07 '18 at 11:37
  • Brilliant @Tim! That's great, any solution is great with gsub(. Thanks for your continued support! – Jim Dec 07 '18 at 11:50