What Regex would capture everything from ' mark to the end of a line?

Question

I have a text file that denotes remarks with a single '.

Some lines have two quotes, but I need to get everything from the first instance of a ' to the line feed.

I AL01                  ' A-LINE                            '091398 GDK 33394178    
         402922 0831850 '                                   '091398 GDK 33394179    
I AL02                  ' A-LINE                            '091398 GDK 33394180    
         400722 0833118 '                                   '091398 GDK 33394181    
I A10A                  ' A-LINE 102                       '  53198 DJ  33394182    
         395335 0832203 '                                  '  53198 DJ  33394183    
I A10B                  ' A-LINE 102                       '  53198 DJ  3339418

score 232 · Accepted Answer · answered May 06 '09 at 17:59

'.*

I believe you need the option, Multiline.

Joshua Belden


This will capture first instance of character ' and end of last line – killdaclick Jun 10 '19 at 20:00
With this you go to the end of the file or the text, not to the end of the line. – Valerio Ficcadenti Jul 17 '23 at 04:45
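The answer and the comments above disagree about where '.* stops. A minimal sketch of one likely cause, assuming Python's re module as the engine (other engines and editors name the option differently): whether the match ends at the line or runs on depends on whether the dot is allowed to match newlines, not on the multiline option. The sample text is shortened from the question's data.

```python
import re

# Two lines adapted from the question's sample data (whitespace shortened).
text = (
    "I AL01  ' A-LINE  '091398 GDK 33394178\n"
    "  402922 0831850 '  '091398 GDK 33394179\n"
)

# Default flags: "." does not match "\n", so each match ends on its own line.
per_line = re.findall(r"'.*", text)

# With DOTALL, "." also matches "\n", so a single match runs to the end of the text.
run_on = re.findall(r"'.*", text, flags=re.DOTALL)

print(per_line)     # one entry per line, each starting at that line's first '
print(len(run_on))  # a single match spanning both lines
```

If a tool matches through to the end of the file, a "dot matches newline" setting being switched on is a plausible explanation.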

BenAlabaster · Answer 2 · 2009-05-06T18:08:24.620

126

The appropriate regex would be the ' char followed by any number of any chars [including zero chars] ending with an end of string/line token:

'.*$

And if you wanted to capture everything after the ' char but not include it in the output, you would use:

(?<=').*$

This basically says give me all characters that follow the ' char until the end of the line.
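As a rough illustration of the difference between the two patterns, here is a small sketch assuming Python's re module; the sample line is shortened from the question's data.

```python
import re

line = "I AL01  ' A-LINE  '091398 GDK 33394178"

# '.*$ includes the leading quote in the match.
with_quote = re.search(r"'.*$", line).group(0)

# (?<=').*$ uses a lookbehind, so the match starts just after the first quote.
after_quote = re.search(r"(?<=').*$", line).group(0)

print(repr(with_quote))
print(repr(after_quote))
```

The second result is the first one minus its leading ' character; the later quote on the line is kept in both.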

Edit: It has been noted that the $ is not strictly required, because .* already stops at the end of the line. The pattern:

'.*

is therefore technically correct. However, it is clearer to be explicit and avoid confusion during later code maintenance, hence my use of the $. I believe it is always better to declare explicit behaviour than to rely on implicit behaviour where clarity could be questioned.
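One caveat worth checking in your own engine: whether the explicit $ is harmless can depend on the multiline option. The sketch below assumes Python's re module, where $ only matches at line ends when re.MULTILINE is set; the two-line text is a made-up stand-in for the question's file.

```python
import re

text = "a 'x\nb 'y\n"

# Without $, every line's remark is found.
plain = re.findall(r"'.*", text)

# With $ and MULTILINE, $ matches at the end of each line: same result.
anchored_multiline = re.findall(r"'.*$", text, flags=re.MULTILINE)

# With $ but without MULTILINE, $ only matches at the very end of the text,
# so only the last line's remark is found.
anchored_default = re.findall(r"'.*$", text)

print(plain, anchored_multiline, anchored_default)
```

So in this engine the explicit $ documents intent well, but it needs the multiline flag when the input holds more than one line.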

edited May 06 '09 at 18:08

answered May 06 '09 at 17:58


The $ is unnecessary. The dot will stop at the end of the line under normal circumstances. – Tomalak May 06 '09 at 18:00
unnecessary - but proper for what he wants to do. It serves as a reminder later that it is expecting everything from ' to the end of the line – gnarf May 06 '09 at 18:03
@balabaster: I did not say that it was wrong. ;-) It was just a footnote. – Tomalak May 06 '09 at 18:09
@Tomalak: Wasn't trying to imply you were wrong by any means, was just clarifying my reasoning for my choice of using $ rather than not. Thank you for pointing it out. – BenAlabaster May 06 '09 at 18:10
+1 for including how to include everything after the character in question, instead of always including it. – grizzasd Oct 07 '19 at 18:30
(?<=').*$ is actually correct answer to what op is asking, capture after, not with. This should be accepted answer – Aistis Taraskevicius Feb 26 '20 at 11:09

score 32 · Answer 3 · answered May 06 '09 at 17:58

'.*$

Starting with a single quote ('), match any character (.) zero or more times (*) until the end of the line ($).

OtherDevOpsGene


This answer is a great example of how to break down the logic behind a command, nice and clear! – Timmah Aug 26 '19 at 06:40

score 19 · Answer 4 · answered Sep 21 '15 at 15:45

When I tried '.* in Windows (Notepad++), it matched everything from the first ' to the end of the last line.

To capture everything until end of that line I typed the following:

'.*?\n

This would only capture everything from ' until end of that line.
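A small sketch of how this pattern behaves, assuming Python's re module rather than Notepad++ (whose search options differ): the match includes the trailing newline, and a final line without a newline is not matched at all.

```python
import re

# The last line deliberately has no trailing newline.
text = "a 'x\nb 'y"

# The lazy .*? stops at the first "\n", which becomes part of the match.
lazy_with_newline = re.findall(r"'.*?\n", text)

# For comparison, '.* with default flags finds both lines and excludes "\n".
plain = re.findall(r"'.*", text)

print(lazy_with_newline)
print(plain)
```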

Danish


score 12 · Answer 5 · answered Jun 01 '16 at 11:19

In your example I'd go for the following pattern:

'([^\n]+)$

Use the multiline and global options to match all occurrences.

To include the linefeed in the match you could use:

'[^\n]+\n

But this might miss the last line if it has no linefeed.

For a single line, if you don't need to match the linefeed I'd prefer to use:

'[^$]+$
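The three patterns above can be compared side by side. This is only a sketch, assuming Python's re module (findall plays the role of the global option) and made-up sample text. One thing it shows is that inside a character class $ is a literal dollar sign, so [^$] means "anything except $", newlines included.

```python
import re

# The last line has no trailing newline; the first remark contains a "$".
text = "a 'x $1\nb 'y"

# '([^\n]+)$ with MULTILINE: one captured remark per line.
captured = re.findall(r"'([^\n]+)$", text, flags=re.MULTILINE)

# '[^\n]+\n includes the linefeed and misses the last line.
with_linefeed = re.findall(r"'[^\n]+\n", text)

# '[^$]+$ : [^$] stops at a literal "$", so this line does not match at all.
dollar_in_line = re.search(r"'[^$]+$", "a 'x $1")

# On text without a "$", [^$] also matches "\n" and the match spans lines.
spans_lines = re.search(r"'[^$]+$", "a 'x\nb 'y").group(0)

print(captured, with_linefeed, dollar_in_line, repr(spans_lines))
```

For multi-line input, [^\n] is therefore the safer character class of the two.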

Gess


Had trouble with this suggestion with golang's regex. `'[^\n]+` was needed instead of `'[^\n]+$`. See https://play.golang.org/p/EemihqdIMSl – jws Sep 16 '21 at 14:13

gnarf · Answer 6 · 2009-05-06T18:07:38.007

This will capture everything up to the ' in backreference 1 - and everything after the ' in backreference 2. You may need to escape the apostrophes though depending on language (\')

/^([^']*)'?(.*)$/

Quick modification: if the line doesn't have an ' - backreference 1 should still catch the whole line.

^ - start of string
([^']*) - capture any number of not ' characters
'? - match the ' 0 or 1 time
(.*) - capture any number of characters
$ - end of string
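The breakdown above can be tried out with a short sketch, assuming Python's re module (where the surrounding / delimiters are dropped and the apostrophe needs no escaping in a double-quoted raw string). The sample lines are illustrative.

```python
import re

pattern = re.compile(r"^([^']*)'?(.*)$")

# A line with a remark: group 1 is the text before the first ', group 2 the rest.
with_remark = pattern.match("I AL01  ' A-LINE  '091398 GDK")

# A line without any ': group 1 takes the whole line and group 2 is empty.
no_remark = pattern.match("no remark here")

print(with_remark.groups())
print(no_remark.groups())
```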

score 0 · Answer 7 · answered Oct 24 '19 at 18:58

https://regex101.com/r/Jjc2xR/1

/(\w*\(Hex\): w*)(.*?)(?= |$)/gm

I'm sure this one works; it will capture the hex serial number in the badly structured multiline text below:

     Space Reservation: disabled
         Serial Number: wCVt1]IlvQWv
   Serial Number (Hex): 77435674315d496c76515776
               Comment: new comment

I'm an eternal newbie in regex, but I'll try to explain this one:

(\w*\(Hex\): w*) : finds the part of the line that contains "(Hex): "

(.*?) : the second captured group, which lazily matches everything after that

(?= |$) : a lookahead that ends the match at the next space or at the end of the line

So with the second group, you will have the value
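For reference, a minimal sketch of this pattern applied to the sample text, assuming Python's re module with re.MULTILINE standing in for the m flag (the g flag is not needed for a single search):

```python
import re

text = (
    "     Space Reservation: disabled\n"
    "         Serial Number: wCVt1]IlvQWv\n"
    "   Serial Number (Hex): 77435674315d496c76515776\n"
    "               Comment: new comment\n"
)

match = re.search(r"(\w*\(Hex\): w*)(.*?)(?= |$)", text, flags=re.MULTILINE)

label = match.group(1)   # the "(Hex): " marker
serial = match.group(2)  # the value up to the next space or the end of the line

print(repr(label), serial)
```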

That's not the question, is it? – Daniel E. Dec 31 '19 at 10:04
