2

I need to process an arbitrary string whose pieces are separated by the characters "..", like the one shown below:

a..b/c..de/f-g..xyz..abc..def..123..

How can I get the data between these ".." separators using regexp? (The string can be of any length, and I need to get each intermediate piece for further processing.) Please guide me with this.

Thanks!

Donal Fellows
Nathan Pk

3 Answers

6

If, for example, the string contains no newline, you could get a list of your strings with:

set in a..b/c..de/f-g..xyz..abc..def..123..
# Map each ".." to a newline, then split on newlines
set out [split [string map {.. \n} $in] \n]
Andrew Cheong
rene
  • A good character to map-to/split-on is `\ufffc` (the “object replacement character”) since you're really unlikely to encounter one of those in any real text. – Donal Fellows Oct 25 '12 at 08:31
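
As a rough sketch of the suggestion in the comment above, the same map-then-split idea can use `\ufffc` as the stand-in separator. The variable name `sep` is only illustrative, and this assumes the input never contains U+FFFC itself.

```tcl
# Sketch: replace each ".." with U+FFFC, then split on that single character.
# Assumes the input text never contains U+FFFC.
set in "a..b/c..de/f-g..xyz..abc..def..123.."
set sep \ufffc
set out [split [string map [list .. $sep] $in] $sep]

# The trailing ".." yields one empty element at the end of the list.
puts [llength $out]
```

Because the input ends with a separator, the resulting list holds the seven pieces followed by one empty element.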
1

One tool to consider for this sort of thing, especially if the splitting term is more complex than the one in the question, is the textutil::split package in Tcllib. That lets you split strings by regular expression, like this:

package require textutil::split

set sample "a..b/c..de/f-g..xyz..abc..def..123.."
set RE {\.\.};  # “.” is an RE metacharacter, and so needs to be escaped
set pieces [textutil::split::splitx $sample $RE]

The above will also produce an empty element at the end of the pieces list, because there's a terminating separator.
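
If the empty trailing element is unwanted, one possible way to drop it is to filter the list afterwards. This is only a sketch: it assumes Tcllib is installed, and the variable name `nonEmpty` is made up for illustration.

```tcl
package require textutil::split

set sample "a..b/c..de/f-g..xyz..abc..def..123.."
set pieces [textutil::split::splitx $sample {\.\.}]

# Keep every element that is not exactly the empty string.
set nonEmpty [lsearch -all -inline -not -exact $pieces {}]

foreach piece $nonEmpty {
    puts "piece: $piece"
}
```

Note that this also removes empty elements from the middle of the list (for instance, those produced by two adjacent separators), which may or may not be what you want.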

Donal Fellows
  • The regular expression could also be `\.{2}`, but that's hardly simpler, or it could be `***=..` (a leading `***=` is a special Tcl RE engine feature for “just match the rest as a literal”). – Donal Fellows Oct 25 '12 at 08:42
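
To check that the three spellings mentioned above behave the same on the sample string, a small sketch like the following counts the matches for each pattern. The variable name `counts` is just for illustration.

```tcl
set sample "a..b/c..de/f-g..xyz..abc..def..123.."

# Count how often each pattern matches; all three should find the same
# literal ".." separators.
set counts {}
foreach re {{\.\.} {\.{2}} {***=..}} {
    lappend counts [regexp -all $re $sample]
}
puts $counts
```

Each pattern should report the same number of separators, since they all describe two literal dots.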
0

You could use this regex:

[^..]

That would match all characters that are not `..`.

Nick
  • I think this command would probably look for ".." at the beginning of the string and leave the rest. I tried this and it's not producing the required result. – Nathan Pk Oct 24 '12 at 23:45
  • @Nathan: Actually, that would be a command called `^..`. Or — more relevantly — a regular expression that matches anything except a `.` (because brackets enclose a character _set_). – Donal Fellows Oct 25 '12 at 08:39
  • @NathanPk When I tested this on `http://regexhero.net/tester/` it grabbed everything that wasn't `..` – Nick Oct 25 '12 at 15:02
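
To illustrate the point about bracket expressions made in the comments above, here is a small sketch in Tcl. The variable names `chars` and `runs` are illustrative. The second pattern, `[^.]+`, is not from the answer; it is a variation that happens to work on this sample, on the assumption that no piece contains a single "." of its own.

```tcl
set sample "a..b/c..de/f-g..xyz..abc..def..123.."

# {[^..]} is a bracket expression: it matches ONE character that is not ".",
# so applying it repeatedly yields individual characters, not whole pieces.
set chars [regexp -all -inline {[^..]} "a..bc"]

# Matching runs of non-dot characters gives the pieces for this sample.
# Assumption: no piece contains a lone "." (such a piece would be cut in two).
set runs [regexp -all -inline {[^.]+} $sample]

puts $chars
puts $runs
```

The first list holds single characters, while the second holds the seven pieces from the sample string.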