How do I get the file randomly from the directory without doing it redundantly in bash programming

Question

I want to ask about bash again. I want to pick files at random from a directory. For example, there are 13.525 files in one directory.

Say I pick a file at random and get gr123.adl. On the next random pick, I want gr123.adl not to be selected again.

How should I implement this in bash?

Thanks for your help.

Regards Gustina M.S

I would `mv file dir_already_processed` as you finish processing a file, then you can't get it again. Else you'll have to maintain a list (in a file or a shell array). Good luck. – shellter, Apr 18 '16 at 15:00
Is this a real world problem? I can't think of why you'd ever need this. Normally random is not unique in any language. – SaintHax, Apr 18 '16 at 18:48
@SaintHax I suspect the "testing" tag is a key part of the use case here; I can imagine wanting to write a test that would process the files in a directory in a random order. – Eric Renouf, Apr 19 '16 at 11:56
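
shellter's suggestion of moving each file away once it has been processed could be sketched roughly as follows. This is only an illustration under assumptions: in_dir, done_dir, process_one and the file names are made up, and the "processing" step is just a printf.

```shell
#!/usr/bin/env bash
# Sketch only: in_dir, done_dir, process_one and the file names are made up.
in_dir=$(mktemp -d)
done_dir=$(mktemp -d)
touch "$in_dir"/gr1.adl "$in_dir"/gr2.adl "$in_dir"/gr3.adl

process_one() {
    local pick
    local -a pending=( "$in_dir"/* )
    # An unmatched glob stays literal, so test that the first entry exists.
    [[ -e ${pending[0]} ]] || return 1
    # RANDOM % n is slightly biased and only covers 32768 values; fine for a sketch.
    pick=${pending[RANDOM % ${#pending[@]}]}
    printf 'Processing %s\n' "${pick##*/}"
    mv -- "$pick" "$done_dir"/     # a moved file cannot be picked again
}

processed=0
while process_one; do
    processed=$((processed + 1))
done

finished=( "$done_dir"/* )         # remember what was moved before cleaning up
printf 'Processed %d files\n' "$processed"
rm -rf "$in_dir" "$done_dir"
```

The directory itself acts as the list of files not yet chosen, so no separate bookkeeping is needed.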

score 2 · Answer 1 · edited May 23 '17 at 11:52

I would probably look to do this in another language that handles situations like this better, if possible. For example, in Python you could do it like this:

import os
import random

files = os.listdir('.')
random.shuffle(files)
for path in files:
    # do your code test stuff on path
    print(path)

Having a function that returns the next file name is tougher to do in bash, but if you just want to operate on the files in a random order, we can follow @shellter's recommendation and use arrays, combined with a randomizing function found in this answer. Here we shuffle all the filenames in an array, then iterate over them:

shuffle() {
   local i tmp size max rand

   # $RANDOM % (i+1) is biased because of the limited range of $RANDOM
   # Compensate by using a range which is a multiple of the array size.
   size=${#array[*]}
   max=$(( 32768 / size * size ))

   for ((i=size-1; i>0; i--)); do
      while (( (rand=$RANDOM) >= max )); do :; done
      rand=$(( rand % (i+1) ))
      tmp=${array[i]} array[i]=${array[rand]} array[rand]=$tmp
   done
}

array=( * )
shuffle

for((i=0; i<${#array[*]}; i++ )); do
    printf "Operating on %s\n" "${array[i]}"
    # do whatever test makes sense on "${array[i]}"
done
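
The shuffle above draws from $RANDOM, which only yields values from 0 to 32767, so it presumes the array has at most 32768 entries. For larger directories, one possible variation combines two $RANDOM draws into a 30-bit number and recomputes the rejection limit at each step. This is a sketch, not part of the linked answer; the name shuffle_big and the sample array contents are made up.

```shell
#!/usr/bin/env bash
# Sketch only: shuffle_big is a hypothetical name, not from the answer above.
# Fisher-Yates shuffle of the global array "array" using 30-bit random numbers.
shuffle_big() {
   local i j tmp range limit r
   for ((i=${#array[@]}-1; i>0; i--)); do
      range=$((i+1))
      # Largest multiple of range not exceeding 2^30; draws at or above it
      # are rejected so that r % range is uniform.
      limit=$(( 1073741824 / range * range ))
      while :; do
         r=$(( (RANDOM << 15) | RANDOM ))
         (( r < limit )) && break
      done
      j=$(( r % range ))
      tmp=${array[i]} array[i]=${array[j]} array[j]=$tmp
   done
}

array=( alpha beta gamma delta epsilon )
shuffle_big
printf 'Operating on %s\n' "${array[@]}"
```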

If you really want a function that will return the "next" file, we could do it a bit differently from above, setting a variable that holds our current filename. So we will replace the for loop at the bottom with another function definition and loop, like so:

next_file() {
    if [[ "$array_ind" -ge "${#array[*]}" ]]; then
        cur=""
    else
        cur="${array[array_ind++]}"
    fi
}

array_ind=0

# now we use next_file whenever we want `cur` to get the next file:
next_file
while [[ ! -z "$cur" ]]; do
    printf -- "--%s--\n" "$cur"
    next_file
done

oliv · Answer 2 · 2016-04-19T08:22:08.957

1

You may try the following:

 ls | sort -R | while IFS= read -r f; do echo "$f"; done

sort -R shuffles the file names, and the while loop hands them to you one at a time.

EDIT:

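If some of your file names contain spaces or control characters (like \n), you can try this:

 OLDIFS=$IFS; IFS=$'\n'; for f in $(ls -b | sort -R); do echo "$f"; done; IFS=$OLDIFS

This sets the input field separator (IFS) to a newline only, so the output of ls is split into one word per line rather than at every space or tab.

ls -b prints nongraphic characters as C-style escapes (a newline inside a name shows up as the two characters \n), so each name occupies exactly one line. Note that $f then holds the escaped form of the name, and that names containing glob characters such as * are still subject to pathname expansion.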

The for loop is there to take files one by one.

Finally, IFS is restored to its original value.
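
For names that may contain spaces or even newlines, one way to avoid parsing ls altogether is to pass NUL-terminated names through the shuffle. The following is a sketch under assumptions: it relies on the -z option of GNU shuf, and demo_dir and its file names are invented for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU coreutils (shuf -z). demo_dir and the file
# names are made up for illustration.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.adl" "$demo_dir/with space.adl" "$demo_dir/"$'new\nline.adl'

files=()
# printf emits every name followed by a NUL byte, shuf -z shuffles the
# NUL-terminated records, and read -d '' reads them back one at a time.
while IFS= read -r -d '' f; do
    files+=("$f")
done < <(cd "$demo_dir" && printf '%s\0' * | shuf -z)

for f in "${files[@]}"; do
    printf 'Operating on %q\n' "$f"    # %q shows odd characters unambiguously
done
rm -rf "$demo_dir"
```

Because NUL cannot occur in a file name, it is the one separator that is safe for every name.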

edited Apr 19 '16 at 08:22

answered Apr 18 '16 at 14:21


Note that this will not work right if any files include a newline in their name – Eric Renouf Apr 18 '16 at 14:55

Though I edited the post to answer your comment, I think the handling of control characters in filenames deserves its own question. – oliv Apr 19 '16 at 08:25
It is frequently the subject of discussion here and comes up in lots of questions. Often they come down to a link to http://mywiki.wooledge.org/ParsingLs and the advice "Don't parse `ls`" – Eric Renouf Apr 19 '16 at 11:58
Interesting literature, thanks. Can you please propose a correct answer? – oliv Apr 19 '16 at 12:43

SaintHax · Answer 3 · 2016-04-19T13:15:30.373

1

If you really want this, then you need a function that will take arguments and keep track of files.

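rand_file() {
   # one tracking file per directory name, kept in the home directory
   local track="$HOME/${PWD##*/}.rand_file" f
   touch "$track"

   while IFS= read -r f; do
      # -x -F: compare the whole line literally, so "a" does not match "ab"
      if ! grep -qxF -- "$f" "$track"; then
         printf '%s\n' "$f" | tee -a "$track"
         break
      fi
   done < <(ls | sort -R)
}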

We are using a while loop over the shuffled listing, so that once every file in the directory has been returned, the loop simply runs out and the function exits cleanly with nothing returned. We are tracking in files named after the directory, so that if a same-named file exists elsewhere, we don't treat it as previously returned. Note that this means you have to run the function from the directory in question (the PWD); you can code something better, but I'm not going to knock that part out here right now. You can delete the tracking file in your home directory to reset the process.
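
The same bookkeeping can be done without parsing ls. The sketch below is a variation under assumptions, not the function above: pick_unseen and its two arguments (a directory and a tracking file) are hypothetical, and it assumes file names contain no newlines, because the tracking file stores one name per line.

```shell
#!/usr/bin/env bash
# Sketch only: pick_unseen, demo_dir, track and the file names are made up.
pick_unseen() {
    local dir=$1 track=$2 f
    local -a candidates=()
    touch "$track"
    for f in "$dir"/*; do
        [[ -f $f ]] || continue
        f=${f##*/}
        # -x -F: whole-line literal match, so gr1.adl does not match xgr1.adl
        grep -qxF -- "$f" "$track" || candidates+=("$f")
    done
    (( ${#candidates[@]} )) || return 1    # every file was already returned
    # RANDOM % n is slightly biased and only covers 32768 values; fine for a sketch.
    f=${candidates[RANDOM % ${#candidates[@]}]}
    printf '%s\n' "$f" | tee -a "$track"
}

demo_dir=$(mktemp -d)
track=$(mktemp)
touch "$demo_dir"/gr1.adl "$demo_dir"/gr2.adl "$demo_dir"/xgr1.adl

got=()
while f=$(pick_unseen "$demo_dir" "$track"); do
    got+=("$f")
done
printf 'Got %s\n' "${got[@]}"
rm -rf "$demo_dir" "$track"
```

The non-zero return status once everything has been handed out lets the caller drive the loop directly with while.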

edited Apr 19 '16 at 13:15

answered Apr 18 '16 at 19:05


This isn't as safe as the `while` solution above. This will break on any files with any character in `IFS` – Eric Renouf Apr 19 '16 at 02:42
@EricRenouf the while solution, from oliv, doesn't meet requirements. It doesn't return a file (it returns them all). You can always adjust the IFS, but in a testing scenario for bash, I'm assuming you are using normal Linux filenames-- no spaces. These are not files sent from a client, but internal. – SaintHax Apr 19 '16 at 12:54
@EricRenouf I'll change it to a while loop for you. I expect that in a closed test environment you can avoid files with Windows names, but... Note quoting all the $track, the scripter can do if needed. – SaintHax Apr 19 '16 at 13:12
You can certainly create files with spaces and other special characters in *nix environments. Particularly if people are using media files I see it happen a lot (spaces in song titles become spaces in mp3 filenames for example), so it may or may not be safe to assume that filenames are "traditional", but as I noted above the while loop still doesn't actually solve all the problems either since it couldn't handle files with newlines in their names (not so common, but not illegal either) – Eric Renouf Apr 19 '16 at 13:15
@EricRenouf I read the part about newlines, and dismissed it when I read it. From a professional perspective, in 20+ years I've never come across a file with a newline in its name. I'm sure I could now, from a co-worker... if they thought they were funny, but not in the real world. I don't think that StackOverflow is supposed to make ironclad code either; it's certainly not worth my time to do so. Also, I of course know you can make files with spaces in *nix, but I've never met someone that did so. Every file with a space that I've gotten has come from a Windows machine at some point. – SaintHax Apr 19 '16 at 13:18
