I have a well-written Perl script (not authored by me) that I want to use for some analysis. I need to run it 1000 times, each time with a different set of input parameters. Here is the current command:

perl cmh-test.pl --input C1_E1_C2_E2_C3_E3_C4_E4_java.sync --output C1_E1_C2_E2_C3_E3_C4_E4_java.latest.cmh --min-count 20 --min-coverage 30 --max-coverage 400 --population 1-2,3-4,5-6,7-8 --remove-temp

I now want to run the same command 1000 times, changing the --population parameter each time. The first time it is 1-2,3-4,5-6,7-8; each subsequent time it should be a random permutation of 1 to 8, such as 1-3,2-4,5-7,6-8, and so on. How do I do this?

Any help will be greatly appreciated.

Timur Shtatland
cryptodice
  • Are you calling the Perl script from Perl? Or from a shell? If so, which one? – choroba Nov 20 '20 at 11:40
  • @choroba, I am calling it from shell. Ubuntu 20.04. – cryptodice Nov 20 '20 at 11:46
  • This was also [asked on /r/perl](https://www.reddit.com/r/perl/comments/jxneae/need_help_for_a_loop_with_random_permuted_numbers/) and has an answer there. In both places, you should note that you have the same post somewhere else so you don't take up people's time if there's already an answer. – brian d foy Nov 20 '20 at 13:15
  • @briandfoy, right. I am sorry, I should have mentioned here that I already got a solution from /r/perl. I didn't know the responders would be the same. – cryptodice Nov 20 '20 at 13:21
  • The solution - https://www.reddit.com/r/perl/comments/jxneae/need_help_for_a_loop_with_random_permuted_numbers/gcxm1ev?utm_source=share&utm_medium=web2x&context=3 – cryptodice Nov 20 '20 at 13:23
  • Further, a small question asked here already: https://www.reddit.com/r/perl/comments/jxneae/need_help_for_a_loop_with_random_permuted_numbers/gcxmz35?utm_source=share&utm_medium=web2x&context=3 – cryptodice Nov 20 '20 at 14:41
  • @briandfoy, cryptodice: Would it make sense to post here the solution from [https://www.reddit.com/r/perl/comments/jxneae/need_help_for_a_loop_with_random_permuted_numbers/](https://www.reddit.com/r/perl/comments/jxneae/need_help_for_a_loop_with_random_permuted_numbers/) ? With attribution, link, etc. This way, it can be accepted, voted on, and/or improved. – Timur Shtatland Nov 20 '20 at 14:50
  • That reddit user ( @daxim ) is already on stackoverflow, maybe he should do it, if anyone. – TLP Nov 20 '20 at 14:59

1 Answer

As brian d foy suggests in the comments, see the answer posted by daxim on Reddit: https://www.reddit.com/r/perl/comments/jxneae/need_help_for_a_loop_with_random_permuted_numbers/

Below is an alternative solution that does not use Algorithm::Combinatorics. Note that shuffle can produce the same permutation more than once. Also note that the constant random seed shown below makes the script generate the same sequence of permutations every time it runs. Remove the srand statement if you do not need the results to be reproducible across runs. The script prints the 1000 commands; it does not run them itself.

#!/usr/bin/env perl

use strict;
use warnings;
use feature qw( say );
use List::Util qw( shuffle );

# Use a fixed seed to get the same sequence of permutations on every run
# of this script. Omit this line to let Perl pick a different seed each run:
srand 42;

my @orig_pop = 1..8;

# Using strings to name output files: *_0001.cmh .. *_1000.cmh.
# Use numbers (1 .. 1000) if you want output files named: *_1.cmh .. *_1000.cmh.
for my $iter ( '0001' .. '1000' ) {
    my @pop = shuffle @orig_pop;
    # Using a hash to make array @pop into pairs:
    my %pop = @pop;
    my $pop = join ',', map { "$_-$pop{$_}" } grep { exists $pop{$_} } @pop;
    my $cmd = qq{perl cmh-test.pl } .
        qq{--input C1_E1_C2_E2_C3_E3_C4_E4_java.sync } .
        qq{--output C1_E1_C2_E2_C3_E3_C4_E4_java.latest_${iter}.cmh } .
        qq{--min-count 20 --min-coverage 30 --max-coverage 400 } .
        qq{--population $pop --remove-temp};
    say $cmd;
}
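
Since shuffle can repeat a permutation, one possible variation is to keep track of the --population strings already generated and draw again on a repeat. The following is only a minimal sketch under that assumption, not part of the original script: pairs_string and unique_populations are hypothetical helper names, and two strings count as duplicates only if they are exactly identical (whether 1-2,3-4 and 3-4,1-2 should be treated as the same pairing depends on the analysis). It also uses sprintf for the zero-padded file suffix instead of the string range.

```perl
#!/usr/bin/env perl

use strict;
use warnings;
use List::Util qw( shuffle );

# Hypothetical helper: join consecutive elements into pairs,
# e.g. (3, 1, 5, 2) becomes "3-1,5-2". Expects an even number of items.
sub pairs_string {
    my @items = @_;
    my @pairs;
    while ( my ( $first, $second ) = splice @items, 0, 2 ) {
        push @pairs, "$first-$second";
    }
    return join ',', @pairs;
}

# Hypothetical helper: return $count distinct --population strings.
# Assumes $count does not exceed the number of possible orderings
# (40320 for 8 items); otherwise this loop would never finish.
sub unique_populations {
    my ( $count, @orig_pop ) = @_;
    my ( %seen, @result );
    while ( @result < $count ) {
        my $pop = pairs_string( shuffle @orig_pop );
        next if $seen{$pop}++;    # skip an ordering that was already drawn
        push @result, $pop;
    }
    return @result;
}

srand 42;    # optional fixed seed, as in the script above

my $iter = 0;
for my $pop ( unique_populations( 1000, 1 .. 8 ) ) {
    # Zero-padded suffix: *_0001.cmh .. *_1000.cmh
    my $output = sprintf 'C1_E1_C2_E2_C3_E3_C4_E4_java.latest_%04d.cmh', ++$iter;
    print join( ' ',
        'perl cmh-test.pl',
        '--input C1_E1_C2_E2_C3_E3_C4_E4_java.sync',
        "--output $output",
        '--min-count 20 --min-coverage 30 --max-coverage 400',
        "--population $pop --remove-temp",
    ), "\n";
}
```

Like the script above, this only prints the commands. Assuming the sketch is saved under a name such as permute_cmh.pl (a made-up file name), piping its output to a shell, for example `perl permute_cmh.pl | sh`, should run the commands one after another.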

Timur Shtatland
  • Thank you so much @Timur , this works perfectly. However, there is no space between --output C1_E1_C2_E2_C3_E3_C4_E4_java.latest_0001.cmh and --min-count 20. In the output, it appears like --output C1_E1_C2_E2_C3_E3_C4_E4_java_1000only_0001.cmh--min-count 20 – cryptodice Nov 20 '20 at 19:06
  • @cryptodice Thank you, fixed. – Timur Shtatland Nov 20 '20 at 19:19
  • anyway, I could pipe the output through this awk oneliner: awk '{sub(/--min-count/," --min-count")}1' – cryptodice Nov 20 '20 at 19:57