53

Straight to the point, I'm wondering how to use grep/find/sed/awk to match a certain string (that ends with a number) and increment that number by 1. The closest I've come is to concatenate a 1 to the end (which works well enough) because the main point is to simply change the value. Here's what I'm currently doing:

find . -type f | xargs sed -i 's/\(\?cache_version\=[0-9]\+\)/\11/g'

Since I couldn't figure out how to increment the number, I captured the whole thing and just appended a "1". Before, I had something like this:

find . -type f | xargs sed -i 's/\?cache_version\=\([0-9]\+\)/?cache_version=\11/g'

So at least I understand how to capture what I need.

Instead of explaining what this is for, I'll just explain what I want it to do. It should find text in any file, recursively, based on the current directory (isn't important, it could be any directory, so I'd configure that later), that matches "?cache_version=" with a number. It will then increment that number and replace it in the file.

Currently the stuff I have above works, it's just that I can't increment that found number at the end. It would be nicer to be able to increment instead of appending a "1" so that the future values wouldn't be "11", "111", "1111", "11111", and so on.

I've gone through dozens of articles/explanations, and often enough, the suggestion is to use awk, but I cannot for the life of me mix them. The closest I came to using awk, which doesn't actually replace anything, is:

grep -Pro '(?<=\?cache_version=)[0-9]+' . | awk -F: '{ print "match is", $2+1 }'

I'm wondering if there's some way to pipe a sed at the end and pass the original file name so that sed can have the file name and incremented number (from the awk), or whatever it needs that xargs has.

Technically, this number has no importance; this replacement is mainly to make sure there is a new number there, 100% for sure different than the last. So as I was writing this question, I realized I might as well use the system time - seconds since epoch (the technique often used by AJAX to eliminate caching for subsequent "identical" requests). I ended up with this, and it seems perfect:

CXREPLACETIME=`date +%s`; find . -type f | xargs sed -i "s/\(\?cache_version\=\)[0-9]\+/\1$CXREPLACETIME/g"

(I store the value first so all files get the same value, in case it spans multiple seconds for whatever reason)

But I would still love to know the original question, on incrementing a matched number. I'm guessing an easy solution would be to make it a bash script, but still, I thought there would be an easier way than looping through every file recursively and checking its contents for a match then replacing, since it's simply incrementing a matched number...not much else logic. I just don't want to write to any other files or something like that - it should do it in place, like sed does with the "i" option.

Ian

5 Answers

72

I think finding the files isn't the difficult part for you, so I'll go straight to the point: the +1 calculation. If you have GNU sed, it can be done this way:

sed -r 's/(.*)(\?cache_version=)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge' file

Let's take an example:

kent$  cat test 
ello
barbaz?cache_version=3fooooo
bye

kent$  sed -r 's/(.*)(\?cache_version=)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge' test
ello
barbaz?cache_version=4fooooo
bye

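You can add the -i option if you like.

Edit

The /e flag runs the pattern space that results from the substitution as a shell command and replaces it with that command's output. That is why the whole line is captured and rebuilt inside the echo. It is GNU sed only.

See this example, which uses the external tools echo and bc: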

kent$  echo "result:3*3"|sed -r 's/(result:)(.*)/echo \1$(echo "\2"\|bc)/ge'       

gives output:

result:9

You could use other powerful external commands, such as cut, sed (again), awk...
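One caveat follows from how /e works: the whole rebuilt line is handed to the shell, so any shell-special characters elsewhere on the line are interpreted too. The sketch below assumes GNU sed and a made-up sample line (not taken from the question) that contains double quotes; it runs the same command as above and shows that the quotes do not survive.

```shell
# Made-up sample line for illustration; the double quotes are the point.
line='<a title="t" href=x?cache_version=3>'

# Same substitution as above. After the s command the pattern space is
#   echo "<a title="t" href=x?cache_version=$((3+1))>"
# and /e runs that text through the shell, which consumes the inner quotes.
result=$(printf '%s\n' "$line" |
  sed -r 's/(.*)(\?cache_version=)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge')

printf '%s\n' "$result"
# prints: <a title=t href=x?cache_version=4>
```

The number is incremented, but the quotes around t are gone. Doing the arithmetic in a tool that evaluates only the captured number, as the Perl answers on this page do, avoids the round trip through the shell.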

Jasen
Kent
  • You're right that the finding of files isn't the hard part, so thanks for focusing on the replace part. I just know that some solutions (maybe using `awk` instead) aren't as able to use `find`, possibly making it "hard" to find the files. But this solution looks very promising. So the main difference I see is the `-r` option, and then the `$((\3+1))` in the replace. Does the `-r` enable this functionality? Anyways, I'll try this out with `find` and the `-i` option for `sed` and let you know. Thank you! – Ian Jan 16 '13 at 17:36
  • Sorry, also realized the `echo` and `/e` flag. Can you explain more about those as well? – Ian Jan 16 '13 at 17:45
  • Thanks for the update! Okay, so bear with me. I'm pretty sure this will work. But I am encountering a problem. I can do the same as you, but since my files contain special characters, bash complains about the `<` character that there's a syntax error near the unexpected token. When I take them out, then it works like your example. But also, other special characters like `"` (maybe that's it) aren't in the result. When I add in `-i`, it works, but strips out the `"` (and doesn't work with `<` and possibly other characters). Is there something more needed with the `echo` part? – Ian Jan 16 '13 at 19:15
  • By the way, here's what I'm using: `find . -type f | xargs sed -i -r 's/(.*)(\?cache_version=)([0-9]+)(.*)/echo \1\2$((\3+1))\4/ge'`. And an example of a line in the files is `` – Ian Jan 16 '13 at 19:16
  • If text/line in the text file is something like ``, then I get an error about `syntax error near the unexpected token "<"`. And then also, it strips out `"` characters (maybe more). Two comments up contains my current command that does this. Does that make sense? – Ian Jan 21 '13 at 20:27
  • you need to wrap the echo part with double quote, like this: `sed -r 's/(.*)(\?cache_version=)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge'` – Kent Jan 21 '13 at 20:41
  • Ahh I see. I swear I did that. Okay, that fixes the "<" character problem. Now the only problem is that the `"` are being stripped when writing back to the file. Do you know what would make that happen? Maybe it's just a `-i` problem, but it's weird. There are `"` before, but gone after the script has run – Ian Jan 21 '13 at 20:47
  • oh sorry, didn't notice the quote. then dirty&quick fix: `...|sed 's/"/\\"/g'|sed -r 's/(.*)(\?cache_version=)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge'` – Kent Jan 21 '13 at 20:49
  • Alrighty, it's so close! So I ended up using this: `find . -type f | xargs sed -i -r 's/"/\\"/g;s/(.*)(cacheholder)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge'` since piping it didn't seem to work. And I tested on the text line I showed you above. That worked. Then I added in every possible special character on the keyboard. That worked. Then I realized I forgot `"` in the middle of text. When I added that, it replaced the `"` with `\"`. So I'd have my ` – Ian Jan 21 '13 at 21:18
  • So it works fine when the main matcher matches and contains a `"`, but the rest of the text doesn't seem to want to. I guess it's because I'm replacing every single occurrence of `"` with `\"` no matter what, before I do the real search/replace. Hmmm – Ian Jan 21 '13 at 21:22
  • The first example doesn't work for me. I'm using `sed (GNU sed) 4.4` on Cygwin, but I don't see why that would be a problem. With `-i` it changes the number to 24 instead of 4. Without `-i` I'm not sure what it does because the output is strange, only showing the last line. Perhaps that can be blamed on my terminal. But on a linux server (`GNU sed version 4.2.1`), it works as expected. The second example works on both systems. – piojo Dec 20 '18 at 04:05
  • For me it wasn't working with `-r`, but it worked without `-r`. Weird. I'm using Git Bash for windows. – LoMaPh Mar 11 '19 at 21:06
  • sorry @LoMaPh I don't have windows machine to test. all examples in above answer were tested on my Linux box. I cannot tell the difference between git bash and bash either.. – Kent Mar 12 '19 at 12:34
10

Pure sed version:

This version has no dependencies on other commands or environment variables. It uses explicit carrying. For the carry I use the @ symbol, but any other character works as long as it does not occur in your input file. First it finds SEARCHSTRING<number> and appends a @ to it. It then repeatedly increments every digit that has a pending carry (that is, a digit followed by the carry symbol: [0-9]@). If a 9 is incremented, that increment itself yields a carry, and the process repeats until there are no more pending carries. Finally, carries that were produced but not yet added to a digit are replaced by 1.

sed "s/SEARCHSTRING[0-9]*[0-9]/&@/g;:a {s/0@/1/g;s/1@/2/g;s/2@/3/g;s/3@/4/g;s/4@/5/g;s/5@/6/g;s/6@/7/g;s/7@/8/g;s/8@/9/g;s/9@/@0/g;t a};s/@/1/g" numbers.txt
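To watch the carry propagate, here is a small sketch of the same script with cache_version= standing in for SEARCHSTRING. The function name increment_with_carry is hypothetical, and splitting the script into separate -e expressions is my own choice: it puts the label a on its own line, which some sed implementations appear to require. For 199 the intermediate states are 199@, 19@0, 1@00, 200.

```shell
# Hypothetical helper: increments every number that follows cache_version=
# on stdin, using only sed substitutions and @ as the carry marker.
increment_with_carry() {
  sed -e 's/cache_version=[0-9][0-9]*/&@/g' \
      -e ':a' \
      -e 's/0@/1/g;s/1@/2/g;s/2@/3/g;s/3@/4/g;s/4@/5/g' \
      -e 's/5@/6/g;s/6@/7/g;s/7@/8/g;s/8@/9/g;s/9@/@0/g' \
      -e 't a' \
      -e 's/@/1/g'
}

r1=$(printf '%s\n' 'cache_version=41'  | increment_with_carry)  # no carry
r2=$(printf '%s\n' 'cache_version=199' | increment_with_carry)  # carry twice
r3=$(printf '%s\n' 'cache_version=999' | increment_with_carry)  # carry past the first digit

printf '%s\n' "$r1" "$r2" "$r3"
# prints:
# cache_version=42
# cache_version=200
# cache_version=1000
```

In the 999 case the loop ends with @000, and the final s/@/1/g turns the leftover carry into the leading 1.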
Ormoz
Martijn
  • Good solution. To not replace @ symbols it seems to work replacing @ with another, less common symbol such as £: sed "s/cache_version=[0-9]*[0-9]/&£/g;:a {s/0£/1/g;s/1£/2/g;s/2£/3/g;s/3£/4/g;s/4£/5/g;s/5£/6/g;s/6£/7/g;s/7£/8/g;s/8£/9/g;s/9£/£0/g;t a};s/£/1/g" $1 – Robin Manoli Nov 22 '16 at 20:49
  • Note, on MacOS the name of the label ends at a newline, so you have to split the command across two lines for it to work. – Matt Nov 08 '18 at 03:17
9

This perl command processes all files in the current directory (without traversing subdirectories; you will need the File::Find module or similar for that more complex task) and increments the number on every line that matches cache_version=. It uses the /e flag of the substitution, which evaluates the replacement part as code.

perl -i.bak -lpe 'BEGIN { sub inc { my ($num) = @_; ++$num } } s/(cache_version=)(\d+)/$1 . (inc($2))/eg' *

I tested it with a file in the current directory containing the following data:

hello
cache_version=3
bye

It backs up the original file (ls -1):

file
file.bak

And the file now contains:

hello
cache_version=4
bye

I hope it can be useful for what you are looking for.


UPDATE to use File::Find for traversing directories. It accepts * as an argument but discards those names and replaces them with the files found by File::Find. The search starts in the directory the script is run from; this is hardcoded in the line find( \&wanted, "." ).

perl -MFile::Find -i.bak -lpe '

    BEGIN { 
        sub inc { 
            my ($num) = @_; 
            ++$num 
        }

        sub wanted {
            if ( -f && ! -l ) {  
                push @ARGV, $File::Find::name;
            }
        }

        @ARGV = ();
        find( \&wanted, "." );
    }

    s/(cache_version=)(\d+)/$1 . (inc($2))/eg

' *
Birei
  • `File::Find` is a core module just fyi. – squiguy Jan 15 '13 at 23:29
  • This is definitely useful! I wasn't looking for Perl, but hey, if it works, can't complain. Is there any way you can include how to use `Find::File` so that it recursively searches a directory? And if I don't want a backup, should I just remove `-i.bak`? Thanks for the help! – Ian Jan 16 '13 at 17:40
  • @Ian: I've updated the answer with a script that traverse directories searching for the string in all regular files and modifying them creating a backup file. Use `-i` without extension (like `perl -MFile::Find -i -lpe ...`) to modify them in place, but could be a little risky. – Birei Jan 17 '13 at 12:41
  • I prefer perl over Gnu sed because it is pre-installed on Mac. – Frank Harper Jun 19 '17 at 15:21
  • What is the `inc` function needed for? Why not simply `(1+$2)` ? – qbolec Oct 24 '21 at 21:36
4

This is ugly (I'm a little rusty), but here's a start using sed:

orig="something1" ;
text=`echo $orig | sed "s/\([^0-9]*\)\([0-9]*\)/\1/"` ;
num=`echo $orig | sed "s/\([^0-9]*\)\([0-9]*\)/\2/"` ;
echo $text$(($num + 1))

With an original filename ($orig) of "something1", sed splits off the text and numeric portions into $text and $num, then these are combined in the final section with an incremented number, resulting in something2.

Just a start since it doesn't consider cases with numbers within the file name or names with no number at the end, but hopefully helps with your original goal of using sed.

This can actually be simplified within sed by using buffers, I believe (sed can operate recursively), but I'm really rusty with that aspect of it.
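The same split-and-increment steps can also be sketched with shell parameter expansion instead of two sed calls. This is an alternative to the code above, not the sed-only solution it hints at, and it shares the same limitation: it assumes the string is some non-digit text followed only by digits, with no leading zeros.

```shell
orig="something1"

text=${orig%%[0-9]*}   # strip from the first digit onward: "something"
num=${orig#"$text"}    # what is left after removing that prefix: "1"

# Shell arithmetic does the increment; digits with a leading zero would be
# read as octal, so this sketch assumes there are none.
result="$text$((num + 1))"

printf '%s\n' "$result"
# prints: something2
```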

David Ravetti
  • Hmmm ... re-reading your question I don't think I've answered it, but hopefully there's at least something useful in it. I do believe I've done something essentially like what you're doing, completely in sed, but unfortunately don't have that solution handy right now. – David Ravetti Jan 15 '13 at 23:43
  • You're right that it isn't exactly what you're looking for, but it's the right idea and is definitely on the right track. I hope it's okay, but I split up the lines so I could read it better. I have a few questions/comments, so this might span multiple comments. Real quick - why did you escape the `\``? I'm really not looking at doing the search/replace on a filename, but on a file's contents. So that's why in the OP, it has the `find` command and piping the results to `sed`. So like I said in the OP, if there's a way to make a bash script that uses the logic that you have, but – Ian Jan 16 '13 at 17:29
  • modifies files' content in place, then this can really help. If you have any ideas on how to combine them to do both. Your final line of code is what I need, just inside of the replace part of `sed`. But separating these things into separate statements like you have kind of defeats what I need. I'm thinking of using `find`, then looping through the results, then using your logic and a final `sed -i` for truly replacing the value...but I'm not sure if that works that way. Do you know what I mean? Thanks for the help though! – Ian Jan 16 '13 at 17:32
  • The escaping of the backtick was only for display within the post, since that character is also used for defining code blocks here - in reality it is not escaped (and I may have inadvertently double-escaped or something). I believe what you propose is possible with sed -i, using some of sed's more advanced features. I'm not sure if the bash math functions works completely within sed, however, so you might need a more elaborate "replace [0-8] with [1-9], replace [9] with [10]" or something. I see a number of hits on "math in sed", but they seem to mostly say, "try awk or perl instead". – David Ravetti Jan 16 '13 at 18:05
  • Ahhh okay, I can't believe I didn't put that together and remember `\`` is for code and might be the problem. I just wasn't sure if there was a special reason with bash or something. Well you wouldn't happen to know anything about `awk`, would you? The other answer with `sed` here is pretty close, and the answer with `perl` here is as well. Seems only fitting that someone attempt this with `awk` :) – Ian Jan 16 '13 at 19:24
  • I have very little awk experience, mostly for grabbing fields from log files which I would then pass through sed to manipulate. I've used it infrequently enough that I always have to search or grab a book to remember the right formatting. Sorry. – David Ravetti Jan 16 '13 at 20:04
1
perl -pi -e 's/(\?cache_version=)(\d+)/$1.($2+1)/ge' FILE [FILE...]

or for a complete solution:

find . -type f | xargs perl -pi -e 's/(\?cache_version=)(\d+)/$1.($2+1)/ge'

perl substitution operator

  • /e modifier evaluates the replacement as if it were a Perl statement, using its return value as the replacement text.
  • . operator concatenates strings in Perl. The parentheses ensure that the arithmetic operation $2+1 takes precedence over concatenation.
  • /g modifier applies the substitution to all matches within a line

perl options

  • -p ensures that perl will execute the command on every line of each file
  • -i ensures that each file will be edited in place
  • -e specifies the perl command(s) that are executed (in this case, the substitution operation)
anon