
I'm trying to join standard input with an unsorted file, let's say:

awk '{print $1}' somefile | join /dev/stdin unsortedfile

Is it possible to sort the file on the fly, instead of sorting it, saving the result, and then using that in join? I was thinking of something like

export SORT = `sort unsortedfile`; awk '{print $1}' somefile | join /dev/stdin $SORT

but it doesn't work; it says "SORT : command not found". I'm new to variables, so I'm not sure they are what I'm looking for.

In case it is useful, I'm using Cygwin.

gniourf_gniourf
LinuxBlanket
  • Note that you can't have spaces around assignments in shell scripts. Capturing the output from `sort` in a variable is unlikely to be a good way of proceeding. Note that you can often use `-` as a command line argument (all on its own) to indicate 'standard input' (or sometimes standard output, or in the case of GNU Tar, `tar -cf - -T -` uses both standard output — the first one — and standard input — the second one). – Jonathan Leffler Oct 07 '15 at 14:54
  • Thank you @JonathanLeffler, I didn't know that. Shorthands are always very welcome! – LinuxBlanket Oct 07 '15 at 15:03
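
To illustrate the two points in the comment above, here is a minimal sketch. The sample files and the `otherfile` name are made up for the demonstration; only the `SORT` variable name comes from the question. It shows an assignment written without spaces, and `-` used to make `join` read standard input.

```shell
#!/usr/bin/env bash
# Assumption: force the C locale so sort and join agree on ordering.
export LC_ALL=C

# Hypothetical sample data standing in for unsortedfile and a second input.
tmpdir=$(mktemp -d)
printf '%s\n' 'b 2' 'a 1' > "$tmpdir/unsortedfile"
printf '%s\n' 'a x' 'b y' > "$tmpdir/otherfile"

# No spaces around '='. With spaces, the shell no longer parses the line
# as an assignment and treats the pieces as separate words.
SORT=$(sort "$tmpdir/unsortedfile")

# SORT holds the sorted text, not a file name, so it cannot be passed to
# join as a file argument. It can be piped in, with '-' meaning stdin.
joined=$(printf '%s\n' "$SORT" | join - "$tmpdir/otherfile")
printf '%s\n' "$joined"

rm -rf "$tmpdir"
```

This works, but it holds the whole sorted file in a variable, which is probably why the comment calls it unlikely to be a good way of proceeding.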

1 Answer

A cool trick for this would be using process substitution like so:

awk '{print $1}' somefile | join /dev/stdin <(sort unsortedfile)

The `<(…)` syntax creates a pipe for the duration of the command and hands its name to that command, so `join` can read the output of `sort` as if it were a file.
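
A self-contained sketch of the command above, so it can be tried without real data. The file contents are invented for the demonstration, and the extra `sort` on the `awk` output is my addition, since `join` expects both inputs sorted on the join field.

```shell
#!/usr/bin/env bash
# Assumption: force the C locale so sort and join agree on ordering.
export LC_ALL=C

# Hypothetical sample data standing in for somefile and unsortedfile.
tmpdir=$(mktemp -d)
printf '%s\n' 'apple x' 'banana y' 'cherry z' > "$tmpdir/somefile"
printf '%s\n' 'cherry 3' 'apple 1' 'banana 2' > "$tmpdir/unsortedfile"

# join reads its first input from the pipe and its second from the
# process substitution, so no sorted copy is written to disk.
result=$(awk '{print $1}' "$tmpdir/somefile" | sort |
    join /dev/stdin <(sort "$tmpdir/unsortedfile"))
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

Process substitution is a bash feature (also available in some other shells), so this needs to run under bash rather than a plain POSIX `sh`.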

chepner
AlVaz
  • The `<(…)` notation is called [process substitution](https://www.gnu.org/software/bash/manual/bash.html#Process-Substitution). The whole point of the notation is to give a file name to the command (`join` in the example) that is actually the piped output from another command (`sort` in the example). As such, it is not formally anonymous; it has a name such as `/dev/fd/3`. I left this diatribe out of your answer, but was sorely tempted. – Jonathan Leffler Oct 07 '15 at 14:49
  • Note that this can also be used for the `awk` process as well: `join <(awk ...) <(sort unsortedfile)`. – chepner Oct 07 '15 at 14:53
  • @AlVaz: Ha! That really makes the thing! Thank you for your quick answer, it was straight and simple! – LinuxBlanket Oct 07 '15 at 14:54
  • I removed the reference to 'unnamed pipe', since process substitution is essentially syntactic sugar for a named pipe that hides the details of what the name is. – chepner Oct 07 '15 at 15:15
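
The comments above can be checked with a short sketch. The sample data is invented, and the path printed for the substitution depends on the system, so no particular name is assumed. It prints the file name that bash substitutes for `<(…)`, then runs the two-substitution form of `join` suggested in the comments.

```shell
#!/usr/bin/env bash
# Assumption: force the C locale so sort and join agree on ordering.
export LC_ALL=C

# Hypothetical sample data standing in for somefile and unsortedfile.
tmpdir=$(mktemp -d)
printf '%s\n' 'banana y' 'apple x' > "$tmpdir/somefile"
printf '%s\n' 'banana 2' 'apple 1' > "$tmpdir/unsortedfile"

# bash replaces <(…) with a file name before running the command;
# echo simply prints that name. The exact path varies by system.
fd_name=$(echo <(true))
printf 'substituted name: %s\n' "$fd_name"

# Both inputs supplied by process substitution, each sorted on the fly.
both=$(join <(awk '{print $1}' "$tmpdir/somefile" | sort) \
            <(sort "$tmpdir/unsortedfile"))
printf '%s\n' "$both"

rm -rf "$tmpdir"
```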