Dynamic continuous numbering in bash

Question

I have a text file that acts as a database for my script. The file has a column for an "ID", as in the example below.

The database has a format of UID:Item Name:Quantity:Price:Date Added

cat FirstDB.txt

output:

0001:Fried Tarantula:45:100:2017-08-03
0002:Wasp Crackers:18:25:2017-08-04
0003:Century Egg:19:50:2017-08-05
0004:Haggis Flesh:20:90:2017-08-06
0005:Balut (Egg):85:15:2017-08-07
0006:Bear Claw:31:550:2017-08-08
0007:Durian Fruit:70:120:2017-08-09
0008:Live Cobra heart:20:375:2017-08-10
0009:Monkey Brains:30:200:2017-08-11
0010:Casu Marzu:25:1030:2017-08-12

Now, the feature I'm creating allows a user to add new entries to the text file in the same format (I have already created this). The real trick is that the user is also given the option to delete an item. For example, if the user deletes Century Egg from the text file, the result would be this:

0001:Fried Tarantula:45:100:2017-08-03
0002:Wasp Crackers:18:25:2017-08-04
0004:Haggis Flesh:20:90:2017-08-06
0005:Balut (Egg):85:15:2017-08-07
0006:Bear Claw:31:550:2017-08-08
0007:Durian Fruit:70:120:2017-08-09
0008:Live Cobra heart:20:375:2017-08-10
0009:Monkey Brains:30:200:2017-08-11
0010:Casu Marzu:25:1030:2017-08-12

Then, if the user wishes to add an item to the database, I would like the new item to take UID 0003, since it is now free. How do I go about achieving this? I'm stuck on it at the moment. I believe awk could be useful here, but I'm keeping my options open. I'm pretty new to scripting and not very good with awk yet, so if your solution uses awk, please guide me through it as well. Thank you very much!

Show us your program and where you are stuck. Stack Overflow is for helping make programs work (or make them work better), not for doing your work for you. – Mort, Sep 12 '17 at 15:44
Does your script load the file into memory and work primarily on the in-memory version of the data (more efficient), or does it constantly (re)read and (re)write the file (less efficient)? If the former, how are you storing the data in your script? – markp-fuso, Sep 12 '17 at 16:04
I'm stuck in that I have virtually no idea where to begin with the function for the dynamic numbering. All I know is that I can reduce the text file with `cut -f1 -d:`, which gives me a list of numbers to work with. From there I don't have any idea how to do the comparisons and increments. I'm very sorry if it sounds like I'm asking Stack Overflow to do my work for me; I'm just new to all of this. – Gifter Villanueva, Sep 12 '17 at 16:08
I recently posted an answer to exactly this question. I don't have time to search for it, but if you look through all of my answers in the past couple of months, it's in there somewhere... – Ed Morton, Sep 12 '17 at 16:28

score 1 · Accepted Answer · answered Sep 12 '17 at 17:23

awk to the rescue!

Assuming that after edits the sequence is no longer ordered,

awk -F: '{a[$1+0]} END{for(i=1;i<=NR;i++) if(!(i in a)) print i}'

prints every number in 1..NR that is missing from the first column (assumes a numeric field). With a single gap, that is the first missing number.

Test

Create a shuffled list of formatted sequence numbers with "0003" missing:

awk 'BEGIN{for(i=1;i<=10;i++) printf "%04d\n",i}' | shuf | awk '$1!=3' 

0009
0001
0006
0004
0002
0005
0008
0010
0007

Pipe it to the script:

... | awk -F: '{a[$1+0]} END{for(i=1;i<=NR;i++) if(!(i in a)) print i}'

This returns 3, as expected.

However, this prints nothing if the list has no gaps. To handle that case, print the largest number plus 1. With this modification, which also stops at the first gap, the test case and script become:

$ awk 'BEGIN{for(i=1;i<=10;i++) printf "%04d\n",i}' | 
  shuf | 
  awk -F: '{a[$1+0]} $1>max{max=$1} 
       END {for(i=1;i<=NR;i++) if(!(i in a)) {print i; exit} 
            print max+1}'

11
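Since the database uses zero-padded four-digit UIDs, the script above can be wrapped in a small shell function. This is only a sketch: the function name `next_uid` and the variable names are illustrative, and it assumes every record starts with a positive integer UID followed by a colon.

```shell
# next_uid FILE: print the lowest unused UID in FILE, zero-padded to 4 digits.
# Assumes colon-separated records whose first field is a positive integer UID.
next_uid() {
  awk -F: '
    { seen[$1 + 0]                       # $1+0 turns "0003" into the number 3
      if ($1 + 0 > max) max = $1 + 0 }   # track the largest UID
    END {
      for (i = 1; i <= NR; i++)          # with distinct UIDs, a gap lies within 1..NR
        if (!(i in seen)) { printf "%04d\n", i; exit }
      printf "%04d\n", max + 1           # no gap: continue after the largest UID
    }' "$1"
}
```

Run as `next_uid FirstDB.txt` on the file from the question with Century Egg deleted, this prints 0003; on a file with no gaps, it prints the largest UID plus one.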

Note that if you sort the file after each record insertion, you can avoid much of this complexity.
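One possible way to keep the file sorted is to re-sort it on the UID field right after appending. The helper below is a sketch under assumptions: `add_item` and its argument order (file, UID, name, quantity, price, date) are made up for illustration, and the caller is expected to pass a UID that is actually free.

```shell
# add_item FILE UID NAME QTY PRICE DATE: append one record, then re-sort FILE by UID.
# Hypothetical helper; it does not check whether the UID is already taken.
add_item() {
  file=$1; shift
  ( IFS=:; printf '%s\n' "$*" ) >> "$file"   # join the remaining arguments with colons
  sort -t: -k1,1n -o "$file" "$file"         # numeric sort on field 1, written back in place
}
```

For example, `add_item FirstDB.txt 0003 'Century Egg' 19 50 2017-08-05` would put the record back between 0002 and 0004. Once the file is always sorted, the first free UID is simply the first line whose UID differs from its line number.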

I'll try to use the last script you gave since that seems to be the one that I'm exactly looking for! Thanks! Upvoted this and tagged it as an answer. — Gifter Villanueva, Sep 14 '17 at 03:59

score 0 · Answer 2 · answered Sep 12 '17 at 16:36

If I understand the question correctly, you're looking for the first "free" number starting from the top. Something like:

$ awk -F: '{s=sprintf("%04d",NR)} s!=$1{print s; exit}' FirstDB.txt

could do what you want. I'm assuming here that no two clients can add or delete at the same time.

This can even be shortened to:

$ awk -F: '(s=sprintf("%04d",NR))!=$1{print s; exit}' FirstDB.txt
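If a number should also be printed when the file has no gap, a variant along these lines might work. It is a sketch, not part of the answer above: the name `first_free_sorted` is illustrative, and it assumes the file is sorted by UID with no duplicates.

```shell
# first_free_sorted FILE: FILE must be sorted by UID and contain no duplicate UIDs.
first_free_sorted() {
  awk -F: '
    $1 + 0 != NR { printf "%04d\n", NR; found = 1; exit }   # first line out of step with its line number
    END { if (!found) printf "%04d\n", NR + 1 }             # no gap: one past the last line
  ' "$1"
}
```

The `found` flag is needed because `exit` in a main rule still runs the `END` action; without it, two numbers would be printed when a gap exists.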

Marc Lambrichs

What was wrong with `$1 != NR` as a much simpler way of writing that condition? Also, you need to be clear that this solution is based on the file being sorted (even after the new element is added). (Also you need an `END` action.) – rici Sep 12 '17 at 17:22
OP is not clear on requirements. If you think input should be unordered, be my guest. OP suggests otherwise, I'm afraid. – Marc Lambrichs Sep 12 '17 at 17:33
I'm not saying your answer is wrong; just that it would be worthwhile stating the assumption. Documenting prerequisites is always useful. (Although the answer *is* wrong if you don't provide a `END` action since the condition might not be true for any line in the file.) – rici Sep 12 '17 at 17:46
It prints nothing. As intended. Again, OP is not clear, so apparently everybody makes his own assumptions. – Marc Lambrichs Sep 12 '17 at 17:56
