I'm trying to write a regular expression that changes all common incorrect comma formatting in a text file to the correct form. I don't want it to match commas that are already formatted correctly.

Finds a comma that has at least one space before it and any number of spaces after it (edited: changed * to +, typo):

/ +, */

Finds a comma that is not followed by a space (or by a newline):

/,(?! )(?!\n)/

Finds a comma that has more than one space after it:

/,  +/

Combination:

/ +, *|,(?! )(?!\n)|,  +/
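
A minimal sketch of how the combined pattern behaves, assuming Python's `re` module (the question does not name a language or tool). The function name `fix_commas` and the sample strings are hypothetical and chosen only for illustration.

```python
import re

# The combined pattern from above, without the / / delimiters.
INCORRECT_COMMA = re.compile(r" +, *|,(?! )(?!\n)|,  +")


def fix_commas(text: str) -> str:
    """Replace every incorrectly formatted comma with a comma and one space."""
    return INCORRECT_COMMA.sub(", ", text)


# Hypothetical samples: four incorrect shapes and one correct comma.
samples = ["a,b", "a ,b", "a,  b", "a  , b", "a, b"]
for s in samples:
    print(repr(s), "->", repr(fix_commas(s)))
```

Under these assumptions the correct form `a, b` is left unmatched. One edge case to be aware of: a comma at the very end of the input, with no newline after it, is matched by the second alternative, so it gains a trailing space.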

In addition, I don't want it to match inside text strings at all, that is, code that uses a string with " or ' before and after it:

"," "hjsdh,hjj,jhj"
',' 'asjj,'

How can I combine these?

Each match should be replaced by a correct comma (a comma followed by one space).
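
One common way to get this kind of combination is to let the regex match whole quoted strings first and only act on matches that fall outside them. The sketch below assumes Python's `re` module and single-line strings without escaped quotes; the names `TOKEN`, `fix_commas_outside_strings`, and `count_incorrect_commas` are hypothetical.

```python
import re

# The first two alternatives consume whole quoted strings, so commas inside
# them never reach the comma alternative. Group 1 is the combined pattern.
TOKEN = re.compile(
    r'''"[^"\n]*"|'[^'\n]*'|( +, *|,(?! )(?!\n)|,  +)'''
)


def fix_commas_outside_strings(text: str) -> str:
    def repl(m: re.Match) -> str:
        # Group 1 is set only when an incorrect comma matched outside quotes;
        # otherwise the match is a quoted string and is returned unchanged.
        return ", " if m.group(1) is not None else m.group(0)

    return TOKEN.sub(repl, text)


def count_incorrect_commas(text: str) -> int:
    # Counts only the matches that came from the comma alternative.
    return sum(1 for m in TOKEN.finditer(text) if m.group(1) is not None)


line = "f(a,b, \",\", 'x,y' ,c)"
print(fix_commas_outside_strings(line))
print(count_incorrect_commas(line))
```

This is not a parser: escaped quotes, multi-line strings, comments, and apostrophes in ordinary prose (two on the same line would be read as a string) are not handled.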

Examples of incorrect comma formatting:

#,#
# ,#
#,  #
#  , #
JasonMArcher
LeoD3
  • Why bother with correct commas at all? ` *, *` will match them all, and you just replace with `, ` – Kilazur May 27 '14 at 09:56
  • What do you mean by not finding strings with just a comma? Your entire file is a string. Do you mean a line with just a comma? – nl-x May 27 '14 at 09:58
  • Which IDE are you using? – nl-x May 27 '14 at 09:59
  • Why bother with anything at all? I want text users read to be correct. I want to make pretty-print code snippets. – LeoD3 May 27 '14 at 09:59
  • @LeoD3 I edited to be clearer. – Kilazur May 27 '14 at 09:59
  • I don't want to change strings used in the code that just contain a comma. Edited my text to be more clear. – LeoD3 May 27 '14 at 10:02
  • `/ *, */` will match a comma preceded by *any* number of spaces, *including* zero. You probably want something like `\s+`, and you should probably read some [regex documentation](http://perldoc.perl.org/perlre.html). – Biffen May 27 '14 at 10:02
  • Ah, I knew someone would say: read the documentation. Someone always does. I have read it. But yes, you are right, if I spent many more hours reading it I would probably find a better answer. But can't you do that for most questions on this site? – LeoD3 May 27 '14 at 10:04
  • I didn't want to trigger on correct commas, so I could see how many were incorrect. But I guess I have to forget that. – LeoD3 May 27 '14 at 10:10
  • Wait... so your problem isn't correcting commas? it's counting incorrect ones? – OGHaza May 27 '14 at 10:14
  • @LeoD3 The fact that `*` means *zero* or more could have been learned by spending a *couple of minutes* reading the documentation. – Biffen May 27 '14 at 10:14
  • I knew that, I do use + in that code too. Just didn't want to find correct commas. – LeoD3 May 27 '14 at 10:21
  • @LeoD3 You did not seem to know that, since you say "Finds a comma that has *at least one space before*" and then use `*`. If a "correct" comma is one *not* preceded by spaces, then `+` is what you should use. – Biffen May 27 '14 at 10:31
  • Right you are, I used it on the other line but missed that one. Thanks. – LeoD3 May 27 '14 at 10:33
  • @LeoD3 You still haven't said (I think) what language it is that you want to clean up, but have you looked for an existing tool? Writing something like this without parsing the language by its rules is near impossible with just regex. Just consider comments, strings, heredocs, etc. – Biffen May 27 '14 at 10:41
  • Well, I use a language few have heard of, so unfortunately I don't think it will help. I only want to exclude "," and ',' and preferably correct commas (so if you step forward with find/replace in Eclipse or Notepad++, you won't see those). – LeoD3 May 27 '14 at 10:47
  • Changed the question somewhat, it didn't work before because of the * instead of the +. Still I want the match to ignore all text strings (starting and ending with " or '). How to do that? A similar question: http://stackoverflow.com/questions/6671196/regex-ignore-text-inside-quoted-strings-in-net – LeoD3 May 30 '14 at 09:17

2 Answers

0

Find \s*,\s* and replace by #, #

Does this help?
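
As a rough illustration, assuming Python's `re` module (the answer names no tool) and `, ` as the replacement: this pattern rewrites every comma, including ones that are already correct, and because `\s` also matches newlines it joins a line ending in a comma with the next line. The second substitution uses `[ \t]*,[ \t]*`, a variant that is not part of the answer and is shown only as one possible alternative.

```python
import re

text = "a ,b,  c,\nd"

# The answer's pattern: \s also matches the newline after "c,".
normalized_all = re.sub(r"\s*,\s*", ", ", text)
print(repr(normalized_all))  # 'a, b, c, d'

# Hypothetical variant limited to spaces and tabs, so line breaks survive
# (a comma at the end of a line still gains a trailing space).
normalized_same_line = re.sub(r"[ \t]*,[ \t]*", ", ", text)
print(repr(normalized_same_line))  # 'a, b, c, \nd'
```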

0

Since it seems you want to be able to match just the incorrect commas, how about:

/( +, *|, {2,}|\b,\b)/
  • Matches a comma with one or more spaces in front of it
  • Matches a comma with more than 1 space after it
  • Matches a comma with a word character directly on either side of it

There are many special cases that you haven't said how you'd want to handle, though, e.g. ",,," "1,200,000" ",hi" "hi,"

RegExr - showing the cases this does and does not handle.
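
A small sketch for checking this pattern against the special cases listed above, assuming Python's `re` module; the name `INCORRECT` and the list `cases` are hypothetical.

```python
import re

# The answer's pattern, without the / / delimiters.
INCORRECT = re.compile(r"( +, *|, {2,}|\b,\b)")

cases = ["a ,b", "a,  b", "a,b", "a, b", ",,,", "1,200,000", ",hi", "hi,"]
for case in cases:
    hits = INCORRECT.findall(case)
    fixed = INCORRECT.sub(", ", case)
    print(f"{case!r}: {len(hits)} match(es) -> {fixed!r}")
```

Run this way, `1,200,000` is matched twice and would become `1, 200, 000`, while `,,,`, `,hi`, and `hi,` are not matched at all. Since `\b,\b` only fires between two word characters, a comma next to a symbol, as in `x,"y"`, is also missed.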

OGHaza
  • Thanks. But I only wanted to exclude lone commas surrounded by " or ' (code for string). And yea the correct comma (comma and one space). – LeoD3 May 27 '14 at 10:24
  • @Leo, in that case, if you get rid of the `(?!^,$)` in the 2nd regex, it'll probably match what you want, but there are so many special cases that I'd have to guess at what you want to happen, e.g. 'hello,?', '1,400', or anything involving symbols or otherwise malformed English. E.g. [see here](http://regexr.com/38tci) – OGHaza May 27 '14 at 10:31
  • Nah just wanted to ignore code like (lone commas in a string): "," or ',' And not like this "not, like this" or ' , ' So something like this? /(?!",")(\s+,\s*|,\s{2,}|\b,\b)/ – LeoD3 May 27 '14 at 10:37
  • `\b,\b` finds the correct place, but you can't replace the text with a correct comma (or anything). – LeoD3 May 30 '14 at 09:10