2

I have a file with wrapped lines. It happens to be Tcl code that wraps across multiple lines (but it could be anything that has a rule for line wrapping).

For example:

set long [ some cmd { some long stuff \
  more stuff \
  even more stuff \
  end of cmd} but going on \
  end of set ]

I want to parse this into a single line so that I can do some pattern matching on it.

I looked at the docs for the 'read' command but that doesn't seem to do it.

Your help is much appreciated.

Thanks, Gert

Bhargav Rao
Gert Gottschalk

4 Answers

2

I'm not a very experienced Tcl programmer, so my suggestion is very straightforward.

From your question I guess that you read the file line by line (probably using "gets") and then do something with each line (pattern matching). So the most straightforward implementation would look like this (by the way, one question you have to answer is what you want to do with the trailing whitespace of the "previous" line and the leading whitespace of the "next" line):

# Note: The code below was not tested and may not run cleanly,
# but I hope it shows the idea.

# Like "gets", but joins a line that ends with a "\" character to
# the line that follows it. Returns the joined line.
proc concatenatingGets {chan} {
    set wholeLine ""
    set finishedReadingCurrentLine no

    while {!$finishedReadingCurrentLine} {

        # At end of file, gets returns an empty string, which ends the loop.
        set currentLine [gets $chan]

        # A more complicated rule can be used here to decide
        # whether lines are concatenated.

        if {[string index $currentLine end] eq "\\"} {

            # Decide here what to do with leading and trailing spaces.
            # We leave them as they are and only replace the trailing
            # backslash with a space.
            # Note that the Tcl interpreter behaves differently.

            append wholeLine [string range $currentLine 0 end-1] " "

        } else {

            # Last piece of the logical line: keep it and stop.
            append wholeLine $currentLine
            set finishedReadingCurrentLine yes

        } ;# if-else: is the line continued?

    } ;# while !finishedReadingCurrentLine

    return $wholeLine

} ;# concatenatingGets

# Now use our tweaked gets:
set f [open "myFileToParse.txt" r]
while {![eof $f]} {
    set currentLine [concatenatingGets $f]

    # ... Do pattern matching on the current line, and whatever else is needed.

}
close $f
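
One possible answer to the whitespace question above is to copy Tcl's own rule: a backslash-newline, plus any blanks that start the next line, becomes a single space. The following is only a sketch under that assumption. The name joinedGets is hypothetical (it is not a Tcl built-in); it mirrors the two-argument form of gets so that end of file can be told apart from an empty line.

```tcl
# Hypothetical helper: like "gets $chan varName", but joins continuation
# lines. Returns the length of the joined line, or -1 at end of file.
# Simplification: any line ending in "\" counts as continued, even if
# that backslash is itself escaped.
proc joinedGets {chan varName} {
    upvar 1 $varName whole
    set whole ""
    set gotAny 0
    while {[gets $chan line] >= 0} {
        if {$gotAny} {
            # This line continues the previous one: drop its leading blanks.
            set line [string trimleft $line " \t"]
        }
        set gotAny 1
        if {[string index $line end] ne "\\"} {
            # No trailing backslash: the logical line is finished.
            append whole $line
            return [string length $whole]
        }
        # Replace the trailing backslash with a single space.
        append whole [string range $line 0 end-1] " "
    }
    # End of file: hand back whatever was collected, if anything.
    return [expr {$gotAny ? [string length $whole] : -1}]
}
```

It would be used as: while {[joinedGets $f line] >= 0} { ... }, in place of the eof-based loop.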
Dmitrii Semikin
1

Since you are reading Tcl code, you can use the facilities that Tcl provides to help. In particular, info complete will say whether a string contains a “complete” command, which is great for detecting continuation lines and multi-line literals (such as a procedure body). The only trick about it is that everything only works right when you put newline characters in as well. Thus:

set buffer {}
set fd [open $thefilename]
# Next line is idiomatic "read by lines" pattern
while {[gets $fd line] >= 0} {
    append buffer \n $line
    # IMPORTANT: need extra newline at end for this to work with
    # backslash-newline sequences.
    if {![info complete $buffer\n]} {
        # Get next line
        continue
    }
    processACompleteCommand $buffer
    set buffer {}
}
close $fd
# Deal with the last command if necessary (EOF is always a command terminator)
if {$buffer ne ""} {
    processACompleteCommand $buffer
}
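
The loop above leaves processACompleteCommand undefined. As a rough sketch of what it might do for the original question, the version below flattens each command to one line and then pattern-matches on it. The helper name flattenCommand and the "set" pattern are assumptions made for illustration only.

```tcl
# Hypothetical helper: turn one complete command into a single line.
proc flattenCommand {cmd} {
    # Backslash-newline plus the blanks that follow it become one space.
    set flat [regsub -all -- {\\\n[[:blank:]]*} $cmd " "]
    # The buffer starts with a newline; trim that and other outer whitespace.
    return [string trim $flat]
}

# One possible body for the callback used by the loop above.
proc processACompleteCommand {cmd} {
    set flat [flattenCommand $cmd]
    if {$flat eq ""} {
        return  ;# only blank lines, nothing to match
    }
    # Example pattern (an assumption): report every "set" command.
    if {[regexp {^set\s+(\S+)} $flat -> varName]} {
        puts "set of variable '$varName': $flat"
    }
}
```

Note that a multi-line literal without backslashes, such as a procedure body, is also one complete command and still contains real newlines after flattening; whether to replace those as well depends on the patterns being matched.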
Donal Fellows
  • Cool. I would give +1 for showing idiomatic expressions and +1 for showing how to work with Tcl code using Tcl (Unfortunately I can give only one +1). – Dmitrii Semikin Mar 25 '13 at 05:40
0

You can see how Tcl handles the arguments very simply:

proc some {args} {
    foreach arg $args {
        puts $arg
    }
}
set long [ some cmd { some long stuff \
  more stuff \
  even more stuff \
  end of cmd} but going on \
  end of set ]

results in

cmd
 some long stuff  more stuff  even more stuff  end of cmd
but
going
on
end
of
set

If you want all of this as a single string, then "some" becomes pretty simple:

proc some args {join $args}
set long [some cmd ...]
puts $long

outputs

cmd  some long stuff  more stuff  even more stuff  end of cmd but going on end of set
glenn jackman
0

If you have enough memory for the entire file:

foreach line [split [regsub -- "\n\$" [regsub -all -- "\\\\\n\[\[:blank:\]\]*" [read stdin] " "] ""] "\n"] {
    # ...
}

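This performs the same backslash-newline substitution that Tcl does: each backslash-newline, together with any blanks that follow it, is replaced by a single space.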
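
For readability, the same one-liner can be unpacked into steps. This is a sketch only: the proc name logicalLines is hypothetical, and it takes the text as an argument instead of reading stdin.

```tcl
# Hypothetical helper: return the logical lines of $text as a list,
# with continuation lines already joined.
proc logicalLines {text} {
    # 1. Backslash-newline plus following blanks -> one space.
    set joined [regsub -all -- {\\\n[[:blank:]]*} $text " "]
    # 2. Drop one trailing newline so split yields no empty last element.
    set joined [regsub -- {\n$} $joined ""]
    # 3. One list element per logical line.
    return [split $joined "\n"]
}
```

A loop such as foreach line [logicalLines [read stdin]] { ... } would then behave like the original expression.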

potrzebie