I am using the RGeo Ruby library to parse GeoJSON polygons. Calling decode on a polygon with duplicate points returns nil, as in the following example:

geom = {:geom=>{"type"=>"Polygon", "coordinates"=>[[[-82.5721, 28.0245], [-82.5721, 28.0245] ... }
geo_factory = RGeo::Cartesian.factory(:srid => 4326)
rgeo_geom = RGeo::GeoJSON.decode(geom, json_parser: :json, geo_factory: geo_factory)

Due to the repeated point at the beginning, rgeo_geom will be nil after this code is executed.

What is the most efficient way to clean this polygon? Is there a built-in RGeo feature, or should I roll my own?

To be clear, I would like to remove only consecutive duplicate points, as this is what causes the library to return nil for the above code. I am also not looking for in-database solutions such as PostGIS's ST_RemoveRepeatedPoints; I am essentially looking for that behavior executed in Ruby.

sakurashinken

1 Answer

I'm not familiar with rgeo, but from a pure Ruby standpoint I would think you could do the following.

h = { :geom=>{
        "type"=>"Polygon",
        "coordinates"=>[
          [-80.1234, 28.1234], [-82.5721, 28.0245], [-82.5721, 28.0245],
          [-83.1234, 29.1234], [-82.5721, 28.0245], [-83.1234, 29.1234],
          [-83.1234, 29.1234], [-83.1234, 29.1234]
        ]
      } 
    }

The question shows "coordinates"=>[[[-82.5721, 28.0245],... with no right bracket matching the middle left bracket. I've assumed there should only be two left brackets. If that is not the case my answer would have to be modified.

The following does not mutate h. To show that's true, first compute the hash of h.

hhash = h.hash
  #=> -4413716877847662410

h.merge({ :geom=>(h[:geom].merge("coordinates"=>
  h[:geom]["coordinates"].chunk_while(&:==).map(&:first))) })
  #=> { :geom=>{
  #       "type"=>"Polygon",
  #       "coordinates"=>[
  #         [-80.1234, 28.1234], [-82.5721, 28.0245], [-83.1234, 29.1234], 
  #         [-82.5721, 28.0245], [-83.1234, 29.1234]
  #       ]
  #     }
  #   }

h.hash == hhash
  #=> true

See Hash#merge, Enumerable#chunk_while, Enumerable#map and Array#first.
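
If the coordinates do have three levels of nesting, as GeoJSON polygons normally do (an array of rings, each ring an array of points), the same idea can be applied to each ring in turn. The following is only a sketch under that assumption: `dedupe_consecutive` and `clean_polygon` are hypothetical helper names, not part of RGeo, and I have not checked what RGeo does with the result.

```ruby
# Hypothetical helper: collapse each run of identical consecutive points
# in a single ring down to one point.
def dedupe_consecutive(points)
  points.chunk_while { |a, b| a == b }.map(&:first)
end

# Hypothetical helper: return a new geometry hash with every ring cleaned.
# Assumes "coordinates" is an array of rings, each an array of [x, y] pairs.
# The input hash is not mutated.
def clean_polygon(geometry)
  geometry.merge(
    "coordinates" => geometry["coordinates"].map { |ring| dedupe_consecutive(ring) }
  )
end

polygon = {
  "type" => "Polygon",
  "coordinates" => [
    [[-82.5721, 28.0245], [-82.5721, 28.0245], [-83.1234, 29.1234],
     [-80.1234, 28.1234], [-82.5721, 28.0245]]
  ]
}

cleaned = clean_polygon(polygon)
```

Only adjacent repeats are dropped, so the closing point of the ring, which repeats the first point but is not adjacent to it, is kept.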

Cary Swoveland
  • I think you can use `map(&:first)` instead of `flat_map(&:uniq)` – Stefan Jan 12 '18 at 08:46
  • And `chunk_while`, being a good citizen, allows shortcuts: `chunk_while(&:==)` – Stefan Jan 12 '18 at 10:56
  • @Stefan, I just realized I don't understand your second suggestion. `enum.chunk_while { |pair| pair.== }` raises an exception, since `Array#==` requires an argument (another array). Why does `enum.chunk_while(&:==)` work? I must be overlooking something pretty basic. – Cary Swoveland Jan 17 '18 at 03:49
  • `Symbol#to_proc` returns `proc { |obj, *args| obj.public_send(self, *args) }` (a bit simplified). When that proc is called, the first argument becomes the receiver, the symbol itself is the method name, and the remaining arguments (if any) are passed as method arguments. So `:==.to_proc` is invoked via `:==.to_proc.call(1, 2)`, which is equivalent to `1.public_send(:==, 2)`. By "good citizen" I mean that `chunk_while` yields multiple values (`yield *pair`), so instead of `:==.to_proc.call(pair)` it invokes `:==.to_proc.call(*pair)`. – Stefan Jan 17 '18 at 07:14
  • Thanks, Stefan. Great explanation! – Cary Swoveland Jan 17 '18 at 07:37
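
To make the `Symbol#to_proc` explanation from the comments concrete, here is a small sketch. The sample `points` array and the variable names are made up for illustration; the calls themselves are the ones discussed above.

```ruby
eq = :==.to_proc

# The first argument becomes the receiver; the second is passed to ==.
same      = eq.call([1, 2], [1, 2])  # like [1, 2] == [1, 2]
different = eq.call([1, 2], [3, 4])  # like [1, 2] == [3, 4]

points = [[1, 2], [1, 2], [3, 4], [3, 4], [1, 2]]

# chunk_while passes each adjacent pair as two separate arguments,
# so the symbol shortcut and the explicit two-argument block agree.
with_symbol = points.chunk_while(&:==).to_a
with_block  = points.chunk_while { |a, b| a == b }.to_a
```

Both forms group the points into the same runs of equal neighbours, which is why `map(&:first)` afterwards keeps one point per run.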