14

I want to produce such file (cartesian product of [1-3]X[1-5]):

1 1
1 2
1 3
1 4
1 5
2 1
2 2
2 3
2 4
2 5
3 1
3 2
3 3
3 4
3 5

I can do this using nested loop like:

for i in $(seq 3) 
do
  for j in $(seq 5)
  do
      echo $i $j
  done
done

is there any solution without loops?

Farvardin
  • 5,336
  • 5
  • 33
  • 54

3 Answers3

19

Combine two brace expansions!

$ printf "%s\n" {1..3}" "{1..5}
1 1
1 2
1 3
1 4
1 5
2 1
2 2
2 3
2 4
2 5
3 1
3 2
3 3
3 4
3 5

This works by using a single brace expansion:

$ echo {1..5}
1 2 3 4 5

and then combining with another one:

$ echo {1..5}+{a,b,c}
1+a 1+b 1+c 2+a 2+b 2+c 3+a 3+b 3+c 4+a 4+b 4+c 5+a 5+b 5+c
fedorqui
  • 275,237
  • 103
  • 548
  • 598
13

A shorter (but hacky) version of Rubens's answer:

join -j 999999 -o 1.1,2.1 file1 file2

Since the field 999999 most likely does not exist it is considered equal for both sets and therefore join have to do the Cartesian product. It uses O(N+M) memory and produces output at 100..200 Mb/sec on my machine.

I don't like the "shell brace expansion" method like echo {1..100}x{1..100} for large datasets because it uses O(N*M) memory and can when used careless bring your machine to knees. It is hard to stop because ctrl+c does not interrupts brace expansion which is done by the shell itself.

legolegs
  • 265
  • 2
  • 4
9

The best alternative for cartesian product in bash is surely -- as pointed by @fedorqui -- to use parameter expansion. However, in case your input that is not easily producible (i.e., if {1..3} and {1..5} does not suffice), you could simply use join.

For example, if you want to peform the cartesian product of two regular files, say "a.txt" and "b.txt", you could do the following. First, the two files:

$ echo -en {a..c}"\tx\n" | sed 's/^/1\t/' > a.txt
$ cat a.txt
1    a    x
1    b    x
1    c    x

$ echo -en "foo\nbar\n" | sed 's/^/1\t/' > b.txt
$ cat b.txt
1    foo
1    bar

Notice the sed command is used to prepend each line with an identifier. The identifier must be the same for all lines, and for all files, so the join will give you the cartesian product -- instead of putting aside some of the resultant lines. So, the join goes as follows:

$ join -j 1 -t $'\t' a.txt b.txt | cut -d $'\t' -f 2-
a    x    foo
a    x    bar
b    x    foo
b    x    bar
c    x    foo
c    x    bar

After both files are joined, cut is used as an alternative to remove the column of "1"s formerly prepended.

Rubens
  • 14,478
  • 11
  • 63
  • 92
  • what u write as join is really a join (http://en.wikipedia.org/wiki/Join_%28relational_algebra%29#Joins_and_join-like_operators) i do not need join. what i want is cartesian product (http://en.wikipedia.org/wiki/Cartesian_product). – Farvardin Apr 29 '14 at 12:35
  • 1
    @طاهر Well, when you join each and every row from one table with the rows from another table, i.e., when you do a [*cross join*](http://en.wikipedia.org/wiki/Join_%28SQL%29#Cross_join), your output is a cartesian product. – Rubens Apr 29 '14 at 12:40
  • 2
    The benefit of this solution is that bash doesn't inherently allow variables to be used in brace expansions. You can use variables in brace expansion with eval, but then you're using eval. – Erik Jul 22 '15 at 05:26