1

For example, I have a sentence: whAT is yOur hoUSe nUmBer ? Is iT 26. I have to convert the first letter of each word to uppercase and the rest to lowercase. I am supposed to use lsearch, lindex, lreplace and so on to write the code. Can someone tell me how to do this?

user2533429

3 Answers

4

The string totitle command is close: it uppercases the first character of the string and lowercases everything after it.

set s {whAT is yOur hoUSe nUmBer ? Is iT 26.}
string totitle $s
What is your house number ? is it 26.

Capitalizing each word is a little more involved:

proc CapitalizeEachWord {sentence} {
    subst -nobackslashes -novariables [regsub -all {\S+} $sentence {[string totitle &]}]
}
set s {whAT is yOur hoUSe nUmBer ? Is iT 26.}
CapitalizeEachWord $s
What Is Your House Number ? Is It 26.

The regsub command takes each space-separated word and replaces it with the literal string "[string totitle word]":

"[string totitle whAT] [string totitle is] [string totitle yOur] [string totitle hoUSe] [string totitle nUmBer] [string totitle ?] [string totitle Is] [string totitle iT] [string totitle 26.]"

Then we use the subst command to evaluate all the individual "string totitle" commands.
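
Since the question mentions list commands, here is a minimal sketch of the same idea using split, lindex, lreplace and join instead of regsub and subst. The proc name CapitalizeWithLists is made up for illustration, and the sketch assumes that only the space character separates words (tabs and newlines are left alone). Because it never evaluates the input, brackets and dollar signs in the sentence pass through untouched.

```tcl
# Hypothetical helper, not from the original answer: capitalize each
# space-separated word using only list commands and string totitle.
proc CapitalizeWithLists {sentence} {
    # Splitting on a single space keeps runs of spaces as empty elements,
    # so join can restore the original spacing afterwards.
    set words [split $sentence " "]
    for {set i 0} {$i < [llength $words]} {incr i} {
        set word [lindex $words $i]
        # Replace element i with its title-cased form.
        set words [lreplace $words $i $i [string totitle $word]]
    }
    return [join $words " "]
}

set s {whAT is yOur hoUSe nUmBer ? Is iT 26.}
puts [CapitalizeWithLists $s]
# What Is Your House Number ? Is It 26.
```

lsearch is not needed here, since every element is visited by index anyway.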


When Tcl 8.7 comes out, we'll be able to do:

proc CapitalizeEachWord {sentence} {
   regsub -all -command {\S+} $sentence {string totitle}
}
glenn jackman
  • A little more work would be needed in general (in order to defang any stray Tcl metacharacters in the input string) but this is a reasonable approach. – Donal Fellows Jul 30 '13 at 08:59
  • @DonalFellows :: Can you give an example of metachars not defanged by the `-nobackslashes -novariables` opts, or a string that would be problematic? – jimbobmcgee Oct 01 '20 at 17:11
  • Here's an example: `set sentence {hello[pwd] world}` then `CapitalizeEachWord $sentence` produces `Hello/users/glennjackman World` – glenn jackman Oct 01 '20 at 17:51
  • So, because we're using `subst` to do command substitutions, we need to escape any open brackets in the original sentence: `regsub -all {\[} $sentence {\\&}` – glenn jackman Oct 01 '20 at 17:59
  • @jimbobmcgee It's very hard to work out the safety properties of these sorts of things in your head. That's why I so look forward to the 8.7 solution becoming the official recommendation; showing that's safe is trivial. – Donal Fellows Oct 02 '20 at 06:40
  • Thanks for the clarification. Looks like I have a couple of things to revise. Here's hoping the firmware of the devices I am using will uplift past tcl8.4, one day... – jimbobmcgee Oct 05 '20 at 22:06
1

The usual model (in 8.6 and before) for applying a command to a bunch of regular-expression-chosen substrings of a string is this:

subst [regsub -all $REtoFindTheSubstrings [MakeSafe $input] {[TheCommandToApply &]}]

MakeSafe is needed because subst doesn't just do the bits that you want. Even with some substitution classes disabled (e.g., with the -novariables option), you still need the trickiest one of all, command substitution, and that means that strings like hello[pwd]goodbye can catch you out. To deal with this, you make the string “safe” by replacing every Tcl metacharacter (or at least the ones that matter in a subst) with its backslashed version. Here's a classic version of MakeSafe (that you'll often see inlined):

proc MakeSafe {inputString} {
    regsub -all {[][$\\{}"" ]} $inputString {\\&}
}

Demonstrating it interactively:

% MakeSafe {hello[pwd]goodbye}
hello\[pwd\]goodbye

With that version, no substitution classes need to be turned off in subst (though you could turn off variables), and no surprises are possible when the command is applied, because anything in the substituted argument string that could cause trouble has been escaped. But there's a big disadvantage: you may need to change the regular expression in your transformation to take account of the extra backslashes now present. That isn't required for the RE used below (as it just selects sequences of word characters), and indeed the escaping could safely be cut down to this reduced version:

subst [regsub -all {\w+} [regsub -all {[][\\$]} $input {\\&}] {[string totitle &]}]
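
As a rough check of that reduced version, the one-liner can be wrapped in a proc and fed input containing brackets and a dollar sign. The proc name SafeCapitalize and the variable name escaped are made up for this sketch; the two regsub calls are the ones shown above, just split over two lines.

```tcl
# Hypothetical wrapper around the reduced one-liner above.
proc SafeCapitalize {input} {
    # Backslash-escape the characters that subst would act on: ] [ \ $
    set escaped [regsub -all {[][\\$]} $input {\\&}]
    # Wrap each run of word characters in a command substitution,
    # then let subst evaluate those (and only those) commands.
    subst [regsub -all {\w+} $escaped {[string totitle &]}]
}

puts [SafeCapitalize {whAT is yOur hoUSe nUmBer ? Is iT 26.}]
# What Is Your House Number ? Is It 26.
puts [SafeCapitalize {hello[pwd] $x world}]
# Hello[Pwd] $X World
```

Note that \w+ treats pwd and x as words in their own right, which is why they come out capitalized while the brackets and dollar sign stay literal.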

In 8.7 onwards, there's a -command option to regsub that avoids all this mess. It's also likely to be quite a bit faster: subst works by compiling its transformations into bytecode (not a good win for a one-off substitution!), whereas regsub -command invokes the command directly.

regsub -all -command {\w+} $input {string totitle}

The internal approach used by regsub -all -command can be emulated in 8.6 (or earlier, with extra shims), but it is non-trivial:

proc regsubAllCommand {RE input command} {
    # I'll assume there's no sub-expressions in the RE in order to keep this code shorter
    set indices [regexp -all -inline -indices -- $RE $input]
    # Gather the replacements first to make state behave right
    set replacements [lmap indexPair $indices {
        # Safe version of:  uplevel $command [string range $input {*}$indexPair]
        uplevel 1 [list {*}$command [string range $input {*}$indexPair]]
    }]
    # Apply the replacements in reverse order
    set output $input
    foreach indexPair [lreverse $indices] replacement [lreverse $replacements] {
        set output [string replace $output {*}$indexPair $replacement]
    }
    return $output
}

The C implementation of regsub uses working buffers and so on internally, but that's not quite so convenient at the Tcl level.
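
For comparison, here is a sketch of a variant that builds the output front to back in a single accumulator, which is a little closer in spirit to a working buffer and avoids lmap, so it should also run on 8.5. The name regsubAllCommandForward is made up for illustration, and like the version above it assumes the RE has no sub-expressions and never matches the empty string.

```tcl
# Hypothetical forward-building variant; assumes no sub-expressions in RE.
proc regsubAllCommandForward {RE input command} {
    set output ""
    set pos 0   ;# index of the first character not yet copied to output
    foreach indexPair [regexp -all -inline -indices -- $RE $input] {
        lassign $indexPair start end
        # Copy the unmatched text before this match verbatim.
        append output [string range $input $pos [expr {$start - 1}]]
        # Call the command in the caller's scope with the match appended.
        append output [uplevel 1 [list {*}$command [string range $input $start $end]]]
        set pos [expr {$end + 1}]
    }
    # Copy whatever follows the last match.
    append output [string range $input $pos end]
    return $output
}

puts [regsubAllCommandForward {\w+} {whAT is yOur hoUSe nUmBer ? Is iT 26.} {string totitle}]
# What Is Your House Number ? Is It 26.
```

One difference from the version above: replacements are computed and appended as each match is visited, rather than all being gathered first, which could matter if the command has side effects on shared state.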

Donal Fellows
-1

You can use the Initcap function to make the first letter uppercase and the rest lowercase.

Neelimesh