I am trying to modify the keys in my JSON file (around 60 MB) by replacing any spaces in them with underscores. I use Sublime Text to load and edit large JSON files.

Currently, I am using the following expression to find quoted strings with spaces:

"([a-zA-Z])+([\s])+([a-zA-Z]*)":

Finds: "First Name":

Then I use the following replacement expression to put an underscore in place of the space in the matched string:

"$1_$3":

Result: "t_Name":

Expected: "First_Name":

I cannot figure out why $1 does not capture the whole first word. Any help would be appreciated. Thanks!

Note: There are around 15000 different keys with spaces in the JSON.

ak2492
  • `([a-zA-Z])+` and `([\s])+` are repeated capturing groups. Each one stores only the last character it matched once the pattern finishes matching. Just move the quantifiers inside: `([a-zA-Z]+)` and `\s+` (the whitespace needs no group). – Wiktor Stribiżew Jan 28 '21 at 23:32
  • Yes! Thanks for the help @WiktorStribiżew – ak2492 Jan 28 '21 at 23:38
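
The behaviour described in the comment above can be reproduced outside Sublime Text. The following is a minimal sketch, assuming Python's re module, which treats repeated capturing groups the same way for this pattern. The sample string comes from the question. Note that Python writes backreferences as \1 rather than $1.

```python
import re

sample = '"First Name":'

# Original pattern: each quantifier sits outside its group, so the group is
# re-captured on every repetition and only the last iteration is kept.
broken = re.compile(r'"([a-zA-Z])+([\s])+([a-zA-Z]*)":')

m = broken.search(sample)
print(m.groups())  # ('t', ' ', 'Name')

# Python equivalent of the "$1_$3": replacement from the question.
print(broken.sub(r'"\1_\3":', sample))  # "t_Name":
```

Group 1 matched F, i, r, s and t in turn, and only the final t survives, which is why the result starts with "t_".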

1 Answer

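Use the following pattern, with "$1_$2": as the replacement, since the second word is now captured by group 2: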

"([a-zA-Z]+)\s+([a-zA-Z]*)":

Explanation

--------------------------------------------------------------------------------
  "                        '"'
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [a-zA-Z]+                any character of: 'a' to 'z', 'A' to 'Z'
                             (1 or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (                        group and capture to \2:
--------------------------------------------------------------------------------
    [a-zA-Z]*                any character of: 'a' to 'z', 'A' to 'Z'
                             (0 or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
  )                        end of \2
--------------------------------------------------------------------------------
  ":                       '":'
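
As a quick check of the corrected pattern, here is a minimal sketch, assuming Python's re module rather than Sublime Text's regex engine. With only two groups left, the second word is group 2, so the replacement is \1_\2 in Python ($1_$2 in Sublime Text). The helper name fix_keys and the sample JSON text are hypothetical, chosen only for illustration.

```python
import re

# Corrected pattern: the quantifier is inside the first group, so the
# whole first word is captured.
KEY_WITH_SPACE = re.compile(r'"([a-zA-Z]+)\s+([a-zA-Z]*)":')


def fix_keys(text: str) -> str:
    """Replace the whitespace in two-word keys with a single underscore."""
    return KEY_WITH_SPACE.sub(r'"\1_\2":', text)


sample = '{"First Name": "Ada", "Age": 36, "Last  Name": "Lovelace"}'
print(fix_keys(sample))
# {"First_Name": "Ada", "Age": 36, "Last_Name": "Lovelace"}
```

Keys made of three or more words, or keys containing digits or punctuation, are not matched by this pattern and are left unchanged.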
Ryszard Czech