
I have a third-party JSON feed which is huge (lots of data), e.g.

{
   "data": [{
     "name": "ABC",
     "price": "2.50"
   },
   ...
   ]
}

I am required to strip the quotation marks from the price, as the consumer of the JSON feed requires the price as an unquoted number.

To do this I am using a regex to find the prices, then iterating over them and doing a string replace with gsub. This is how I am doing it:

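# scan returns one single-element array per match, e.g. [["2.50"], ["3.25"]]
price_strings = json.scan(/(?:"price":")(.*?)(?:")/).uniq
price_strings.each do |price|
  # each iteration rescans the whole document for one price value
  json.gsub!("\"#{price.first}\"", price.first)
end
json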

The main bottleneck appears to be the each block. Is there a better way of doing this?

amburnside
  • This is a very awkward way to manipulate JSON. Have you tried calling `JSON.parse(json)`, and then treating it like the object it actually represents? You should be able to do `json['data']['price'] = json['data']['price'].to_f`, rather than "removing the quotation marks". – Tom Lord Oct 10 '19 at 13:40
  • Even if you wanted to manipulate it as a string, there is no reason why you would need to make an array and iterate through it. Just use gsub with captures to replace every occurrence of the pattern. – max Oct 10 '19 at 13:43
  • @TomLord - this was my initial approach unfortunately if I have a float with a trailing zero then the zero is trimmed. I need the number in the format 2.50 – amburnside Oct 10 '19 at 13:53
  • @amburnside This doesn't make any sense. If the consumer is expecting a formatted string, then the JSON needs to be a **string** - i.e. `"2.50"`. And if the consumer is fussy and wants a float, then `2.5 == 2.50`, so it should make absolutely no difference. – Tom Lord Oct 10 '19 at 13:55
  • @TomLord Totally agree with you, but this is where I am – amburnside Oct 10 '19 at 13:56
  • @amburnside I'll be totally honest, and say: I don't believe you. If you can show me whatever godforsaken code is written on this client that treats `2.50` and `2.5` as being different values, then I'll eat my hat. – Tom Lord Oct 10 '19 at 18:35
  • The only "reasonable" (by which I mean, possible but outrageously bad) explanation I can think of is if the client is parsing the JSON via regex, instead of... you know... a JSON parser. Which would be completely insane. – Tom Lord Oct 10 '19 at 18:36
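
The trailing-zero problem raised in the comments can be reproduced with a minimal Ruby sketch. The feed string below is a hypothetical one-item sample shaped like the question's data, not the real feed. It parses the JSON, converts the price with `to_f`, and serialises it again:

```ruby
require 'json'

# Hypothetical one-item sample mirroring the structure in the question.
feed = '{"data":[{"name":"ABC","price":"2.50"}]}'

parsed = JSON.parse(feed)
parsed['data'].each do |item|
  item['price'] = item['price'].to_f # "2.50" becomes the Float 2.5
end

round_tripped = JSON.generate(parsed)
puts round_tripped
# => {"data":[{"name":"ABC","price":2.5}]}
```

A Float carries no formatting, so the serialiser has no way to know the original text ended in a zero.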

2 Answers


If this JSON string is going to be parsed into a Hash at some point, either in your application or in a third-party dependency of your code (i.e. to be consumed by your colleagues or their modules), I suggest negotiating with them to convert the price value from String to Numeric on demand, once the JSON is already a Hash. This is more efficient, and it lets them handle edge cases such as "price": "". My code below does not work for that case: it would remove the "" and leave a JSON syntax error.

However, if you do not have control over this, or are doing a one-off mutation of the whole JSON data, you could try the following:

json =
<<-eos
{
  "data": [{
    "name": "ABC",
    "price": "2.50",
    "somethingsomething": {
      "data": [{
        "name": "DEF",
        "price": "3.25", "someprop1": "hello",
        "someprop2": "world"
      }]
    },
    "somethinggggg": {
      "price": "123.45" },
    "something2222": {
      "price": 9.876, "heeeello": "world"
    }
  }]
}
eos

new_json = json.gsub(/("price":.*?)"(.*?)"(.*?,|})/, '\1\2\3')

puts new_json
# =>
# {
#   "data": [{
#     "name": "ABC",
#     "price": 2.50,
#     "somethingsomething": {
#       "data": [{
#         "name": "DEF",
#         "price": 3.25, "someprop1": "hello",
#         "someprop2": "world"
#       }]
#     },
#     "somethinggggg": {
#       "price": 123.45 },
#     "something2222": {
#       "price": 9.876, "heeeello": "world"
#     }
#   }]
# }

DISCLAIMER: I am not a Regexp expert.
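
As a possible refinement for the empty-price edge case mentioned above, here is a sketch with a stricter pattern that only unquotes values that look like plain decimal numbers. The method name `unquote_prices` and the sample string are hypothetical, and the pattern assumes prices never use exponents or thousands separators:

```ruby
require 'json'

# Hypothetical helper: strips the quotes only when the quoted value is a
# plain decimal number, so "" and other non-numeric strings stay untouched.
def unquote_prices(json)
  pattern = /("price"\s*:\s*)"(-?\d+(?:\.\d+)?)"/
  json.gsub(pattern, '\1\2')
end

sample = '{"a": {"price": "2.50"}, "b": {"price": ""}, "c": {"price": 9.876}}'
result = unquote_prices(sample)
puts result
# => {"a": {"price": 2.50}, "b": {"price": ""}, "c": {"price": 9.876}}

JSON.parse(result) # still valid JSON, so this does not raise
```

This is still a single gsub pass over the string, so it avoids the per-price rescans in the question.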

Jay-Ar Polidario

This is truly a fool's errand.

JSON.parse('{ "price": 2.50 }')
> {price: 2.5}

As you can see from this JavaScript example, the parser on the consuming side drops the trailing zero: once parsed, the number carries no formatting.

Use a string if you want to provide a formatted number or leave formatting up to the client.

In fact, using floats to represent money is widely known to be a bad idea, since floats and doubles cannot exactly represent most of the base-10 fractions that we use for money. JSON only has a single number type that represents both floats and integers.

If the client is going to do any kind of calculation with the value, you should use an integer in the lowest monetary denomination (cents for euros and dollars) or a string that is interpreted as a BigDecimal-equivalent type by the consumer.
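
A minimal Ruby sketch of both options, for illustration only. The variable names are hypothetical, and the formatting step assumes a non-negative amount in a currency with two decimal places:

```ruby
require 'bigdecimal'

# Binary floats cannot represent most decimal fractions exactly.
float_sum_ok = (0.1 + 0.2 == 0.3) # false

# A price string interpreted as a BigDecimal keeps exact decimal arithmetic.
price = BigDecimal('2.50')
decimal_sum_ok = (BigDecimal('0.1') + BigDecimal('0.2') == BigDecimal('0.3')) # true

# Integer in the lowest denomination (cents).
cents = (price * 100).to_i # 250

# Formatting is a display concern, applied when the value is shown.
formatted = format('%d.%02d', *cents.divmod(100)) # "2.50"
puts formatted
```

Either way the value on the wire is unambiguous, and the trailing zero only reappears when the consumer formats the amount for display.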

max