2

I have a problem with Regular Expressions.

Suppose we have a string:

 S= "[sometext1],[sometext],[sometext]....,[sometext]"

The number of "sometext" items is unknown: it comes from user input and can range from one to, say, 1000.

Each [sometext] is a sequence of characters, none of which is a comma, so each character matches [^,].

I want to capture each text with a regular expression and then iterate over the captures in a loop:

QRegExp p("???");
p.exactMatch(S);
for (int i = 1; i <= p.captureCount(); i++)
{
  SomeFunction(p.cap(i));
}

For example, if the number of sometexts is 3, we can use something like this:

([^,]*),([^,]*),([^,]*).

So I don't know what to write instead of "???" for an arbitrary n. I'm using Qt 4.7, and I didn't find how to do this on the class reference page.
I know we can do it with loops and no regexes, or generate the regex itself in a loop, but these solutions don't suit me because the actual problem is a bit more complex than this.

Evgenii.Balai

3 Answers

3

A possible regular expression to match what you want is:

([^,]+?)(,|$)

This matches a run of non-comma characters followed by either a comma "," or the end of the line. I was not sure whether the last element would be followed by a comma.

An example using this regex in C#:

String textFromFile = "[sometext1],[sometext2],[sometext3],[sometext4]";

foreach (Match match in Regex.Matches(textFromFile, "([^,]+?)(,|$)"))
{
    String placeHolder = match.Groups[1].Value;

    System.Console.WriteLine(placeHolder);
}

This code prints the following to screen:

[sometext1]
[sometext2]
[sometext3]
[sometext4]

Using a QRegExp example I found online (http://doc.qt.nokia.com/qq/qq01-seriously-weird-qregexp.html), here is an attempt at a solution closer to what you are looking for:

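QRegExp rx("([^,]+)(,|$)");
rx.setMinimal(true); // QRegExp has no per-quantifier +? syntax; this makes all quantifiers non-greedy.

int pos = 0;
while ((pos = rx.indexIn(text, pos)) != -1) // indexIn() is the Qt 4 name for search()
{
     someFunction(rx.cap(1));
     pos += rx.matchedLength(); // move past this match, otherwise the loop never ends
}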
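For comparison, here is a minimal sketch of the same search-and-advance loop using the C++ standard library's std::regex rather than QRegExp. That swap is my assumption for illustration, and the function name splitFields is hypothetical. The point it shows is that the start position has to move past each match before searching again.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect every comma-separated field by searching
// repeatedly and moving the start position past each match.
std::vector<std::string> splitFields(const std::string& text)
{
    static const std::regex fieldPattern("([^,]+)(,|$)");
    std::vector<std::string> fields;
    std::smatch match;
    auto searchStart = text.cbegin();
    while (std::regex_search(searchStart, text.cend(), match, fieldPattern)) {
        fields.push_back(match[1].str());
        // Continue after the whole match (field plus separator);
        // without this step the same field would be found forever.
        searchStart = match[0].second;
    }
    return fields;
}
```

Calling splitFields("[sometext1],[sometext2],[sometext3]") should return the three bracketed texts in order. Note that empty fields (two commas in a row) are skipped, because the pattern requires at least one non-comma character.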

I hope this helps.

Rob
  • Yes, I'll probably use this, thanks. It will make my code longer, because the expression also has a prefix and a suffix of the same type but with different separators. I hoped it would be possible to express all of this with a single regex :) It's strange that no current regex flavor supports this. It might be a nice direction for their developers :) – Evgenii.Balai Jul 28 '11 at 22:46
0

We can do that: use a non-capturing group to absorb the comma, then ask for any number of repetitions of that group:

Try:

QRegExp p("([^,]*)(?:,([^,]*))*[.]");

Non-capturing is explained in the docs: http://doc.qt.nokia.com/latest/qregexp.html

Note that I also bracketed the . since it has meaning in RegExp and you seemed to want it to be a literal period.

Mark
  • This doesn't work. It captures the first "sometext", then a second one, but no more :) – Evgenii.Balai Jul 28 '11 at 22:06
  • Yeah, you have to run cap() in a loop. I forgot that quirk. ;-( The manual explains it better: http://doc.qt.nokia.com/latest/qregexp.html#capturing-text – Mark Jul 29 '11 at 14:48
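The limitation discussed in these comments can be checked with a small sketch. It uses the C++ standard library's std::regex (ECMAScript grammar) as a stand-in for QRegExp, which is an assumption on my part, and the function name repeatedGroupCaptures is hypothetical. The number of capture groups is fixed by the parentheses in the pattern, so a repeated group does not produce one capture per repetition.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: return the text of each capture group after a full
// match against the answer's pattern (without the trailing literal period).
std::vector<std::string> repeatedGroupCaptures(const std::string& text)
{
    static const std::regex pattern("([^,]*)(?:,([^,]*))*");
    std::vector<std::string> captures;
    std::smatch match;
    if (std::regex_match(text, match, pattern)) {
        // match[0] is the whole match; capture groups start at index 1.
        for (std::size_t i = 1; i < match.size(); ++i)
            captures.push_back(match[i].str());
    }
    return captures;
}
```

With std::regex, repeatedGroupCaptures("a,b,c,d") yields only two captures: "a" from the first group and "d" from the last repetition of the second group. The middle fields are matched but not kept, which is why a loop over successive matches is needed for an arbitrary number of fields.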
0

I only know of .NET that lets you specify a variable number of captures with a single
expression, for example (capture.*me)+
It creates a capture object that can be iterated over. Even then, it only simulates
what every other regex engine provides.

Most engines provide incremental matching from within a loop until no matches are
left. The global flag tells the engine to keep matching from where the last
successful match left off.

Example (in Perl):

while ( $string =~ /([^,]+)/g ) { print $1,"\n" }
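A rough C++ counterpart of the Perl /g loop, sketched with the standard library's std::sregex_iterator. Using std::regex here instead of QRegExp is my assumption, and the function name allFields is hypothetical. The iterator resumes after the previous match on each increment, so no position bookkeeping is needed.

```cpp
#include <iterator>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: gather every run of non-comma characters,
// one match per iteration, like Perl's while (/([^,]+)/g) loop.
std::vector<std::string> allFields(const std::string& text)
{
    static const std::regex fieldPattern("([^,]+)");
    std::vector<std::string> fields;
    const std::sregex_iterator end; // default-constructed = end of matches
    for (std::sregex_iterator it(text.begin(), text.end(), fieldPattern);
         it != end; ++it) {
        fields.push_back((*it)[1].str());
    }
    return fields;
}
```

For example, allFields("a,b,c") should return "a", "b" and "c", regardless of how many fields the input contains.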