20

I have a CLI script and want it to read data either from a file or from standard input. It should accept the data in two ways:

  • cat data.txt | ./my_script.py
  • ./my_script.py data.txt

This is a bit like grep, for example.

What I know:

  • sys.argv and optparse let me read any args and options easily.
  • sys.stdin lets me read data piped in
  • fileinput makes the full process automatic

Unfortunately:

  • fileinput treats stdin and every argument as input, so I can't use options that are not filenames: it tries to open them as files.
  • sys.stdin.readlines() works fine, but if I don't pipe any data, it hangs until I enter Ctrl + D
  • I don't know how to implement "if nothing in stdin, read from a file in args" because stdin is always True in a boolean context.

I'd like a portable way to do this if possible.

jez
Bite code
  • note: you can determine if stdin exists (ie your script is running in a pipe), [as detailed here](https://stackoverflow.com/a/27081033/26510) – Brad Parks Feb 02 '22 at 19:50

6 Answers

20

Argparse allows this to be done in a fairly easy manner, and you really should be using it instead of optparse unless you have compatibility issues.

The code would go something like this:

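import argparse
parser = argparse.ArgumentParser()
# '-' is FileType's shorthand for standard input
parser.add_argument('--input', type=argparse.FileType('r'), default='-')
args = parser.parse_args()  # args.input is now an open file object

Now you have a parser that will parse your command-line arguments, open the file given with --input if there is one, or fall back to standard input if there isn't.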
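
Since the question asks for `./my_script.py data.txt` without a flag, here is a minimal sketch of the same idea with a positional argument. The helper name `build_parser`, the positional name `input`, and the `--verbose` option are assumptions made up for illustration; they are not part of the original answer.

```python
import argparse


def build_parser():
    """Hypothetical helper: one optional positional file, stdin by default."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "input",
        nargs="?",                    # zero or one filename
        type=argparse.FileType("r"),  # argparse opens the file for us
        default="-",                  # "-" is converted to sys.stdin
    )
    # Illustrative non-filename option, to show that options and files coexist.
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser
```

With this, something like `args = build_parser().parse_args()` followed by `for line in args.input:` should handle both `cat data.txt | ./my_script.py` and `./my_script.py data.txt`. When neither a file nor a pipe is given, it still waits for input typed on the terminal.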

sykora
  • Thanks to you as well. I'm learning a lot today! Anyway, don't you think there is a way to do that with the standard lib? If not, I'm ok with argparse. But optparse does exist... – Bite code Feb 15 '10 at 10:41
  • argparse is pretty tiny, is pure Python code and is also nicer to use than optparse. While I wouldn't normally want to add a new dependency to a project just to read in the command line options, the three factors above have made argparse more than worthwhile in my experience. – mavnn Feb 15 '10 at 10:56
  • @S.Lott I didn't know that, thanks for mentioning. In any case, I recommended argparse for its overall benefits anyway. – sykora Feb 15 '10 at 14:49
  • 1
    argparse is now part of the standard lib; in fact, it is the new recommended argument parsing library. It's also awesome, and worth learning (which doesn't take very long). – Josh Bleecher Snyder Nov 04 '10 at 18:07
  • 2
    Or you can just do: parser.add_argument('--input', dest=input, type = file) since FileType('R') will leave you with an open file afterwards. You can check if args.input and read from stdin if not. – Clara Sep 27 '14 at 09:35
12

Process your non-filename arguments however you'd like, so you wind up with an array of non-option arguments, then pass that array as the parameter to fileinput.input():

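import fileinput

# remaining_args: the non-option arguments left after your own option parsing
for line in fileinput.input(remaining_args):
    process(line)  # process() stands for whatever you do with each line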
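
One way this could look end to end, as a rough sketch: optparse (which the question mentions) returns the leftover positional arguments, and those are handed to `fileinput.input()`. The function name `read_lines` and the `--upper` option are hypothetical and exist only to show a non-filename argument being consumed before fileinput sees the list.

```python
import fileinput
import optparse


def read_lines(argv):
    """Parse options with optparse, then read the leftover args as files."""
    parser = optparse.OptionParser()
    # "--upper" is a made-up option used purely for illustration.
    parser.add_option("-u", "--upper", action="store_true", default=False)
    options, remaining_args = parser.parse_args(argv)

    lines = []
    # An empty list makes fileinput read stdin; a "-" entry also means stdin.
    with fileinput.input(files=remaining_args) as stream:
        for line in stream:
            lines.append(line.upper() if options.upper else line)
    return lines
```

A script would typically call `read_lines(sys.argv[1:])`. With no filenames and nothing piped in, it blocks waiting for terminal input, as described in the comments below.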
Boris Verkhovskiy
Andrew Aylett
  • I like that, it seems highly effective. Is there anything bad I'm missing? – Bite code Feb 15 '10 at 10:40
  • I believe that this will provide behaviour similar to other Unix commands. In another comment you said you're slightly concerned about the hang-until-input effect when no arguments are given; unless you take steps to notice when no arguments are passed, this will still happen. There's no reason you shouldn't catch this case, though, as passing '-' as a parameter will still read from stdin. – Andrew Aylett Feb 15 '10 at 11:00
  • I will use a combination of Ignacio's advice and Andrew's solution. – Bite code Feb 15 '10 at 12:33
9

On Unix/Linux you can detect whether data is being piped in by looking at os.isatty(0):

$ date | python -c "import os;print os.isatty(0)"
False
$ python -c "import os;print os.isatty(0)"
True

I'm not sure there is an equivalent for Windows.

Edit: OK, I tried it with Python 2.6 on Windows XP:

C:\Python26>echo "hello" | python.exe -c "import os;print os.isatty(0)"  
False

C:\Python26> python.exe -c "import os;print os.isatty(0)"  
True

So maybe it is not all hopeless for Windows.

John La Rooy
  • 1
    Thanks to both of you. Since I lack the necessary knowledge to choose between your answer and Ignacio's, what are the drawbacks of each? They both seem valid. – Bite code Feb 15 '10 at 10:04
  • @e-satis: What happens if no filename is passed as an argument? When you can answer this question you'll know what you need to do. – Ignacio Vazquez-Abrams Feb 15 '10 at 10:06
  • For now, this seems to work so I accept it. Thanks again. I'm still interested in knowing the issues with what I'm doing. What is a tty, and why does this work? When will it not? – Bite code Feb 15 '10 at 10:10
  • Note that this is defined as Unix-only in the docs, so you hurt portability to Windows and possibly other operating systems. It also assumes you'll never want to just paste data in, which is a silly assumption. – Nicholas Knight Feb 15 '10 at 10:12
  • @e-satis: A tty technically stands for teletype terminal. These days it really just means it's an interactive terminal. The particular concept used by isatty() is non-portable outside Unix, however. – Nicholas Knight Feb 15 '10 at 10:15
  • @Ignacio Vazquez-Abrams: for now my workflow is: isatty? no: fetch stdin / yes: try opening a file. try ok? carry on / except: display error message and exit – Bite code Feb 15 '10 at 10:16
  • @Nicholas Knight: ah, this is not really my best choice then. Any other solutions? – Bite code Feb 15 '10 at 10:20
  • I removed "accepted", not because I don't find the solution useful, but because I hope somebody can come up with something portable. – Bite code Feb 15 '10 at 10:23
  • Oh, I assumed it was unix/linux when I read `cat data.txt` :) – John La Rooy Feb 15 '10 at 10:54
  • Yeah, I'm working under Linux, but I'm not a fanatic. I like to play with various OSes. – Bite code Feb 15 '10 at 10:59
  • The behavior of your program should not depend significantly on whether `stdin` is connected to a tty. If no filenames are given, simply read from `stdin`, whether it is from a tty or a pipe. – musiphil Jun 14 '15 at 15:07
  • @musiphil sorry but why is it so wrong, like an absolute rule, to not allow input to be typed if nothing is piped? To me this is a way to produce a useful error instead of leaving the program waiting on the interpreter. – PlasmaBinturong Sep 27 '16 at 22:27
  • @PlasmaBinturong: I didn't say it is wrong to take input from tty if nothing is piped; on the contrary, I said "If no filenames are given, simply read from `stdin`, whether it is from a tty or a pipe." People expect the same thing with (a) `prog` (followed by typing the contents of `file` on tty), (b) `prog file`, and (c) `prog – musiphil Sep 29 '16 at 22:28
4

I'm a noob, so this might not be a good answer, but I'm trying to do the same thing (allow one or more files on the command line, default to STDIN otherwise).

The final combo I put together:

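import argparse
import fileinput

parser = argparse.ArgumentParser()
# Zero or more filenames; an empty list makes fileinput read stdin
parser.add_argument("infiles", nargs="*")
args = parser.parse_args()

for line in fileinput.input(args.infiles):
    process(line)  # process() stands for your own per-line handling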

This seems like the only way to get all the desired behavior in one elegant package, without requiring named args. It matches the way Unix commands are normally used:

cat file1 file2
wc -l < file1

Not:

cat --file file1 --file file2

Would appreciate feedback/confirmation from veteran idiomatic Pythonistas to make sure I've got the best answer. Haven't seen this complete solution mentioned anywhere else, just fragments.

odigity
  • 1
    One more thing -- if you don't want to depend on the next guy already knowing that fileinput.input() magically defaults to stdin when it gets an empty list, you can add ', default="-"' to the call to add_argument(). It doesn't change anything, but it makes the logic perfectly explicit. – odigity Feb 01 '13 at 20:09
3

You can use this function to detect if the input is from a pipeline or not.

sys.stdin.isatty()

It returns False if stdin comes from a pipe or a redirected file, and True if it is attached to an interactive terminal.
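
As a rough sketch of how that check might be used to print a usage message instead of blocking: the helper names `stdin_is_piped` and `choose_source` and the three return labels are invented for this example. Keep in mind the objections raised in the comments on this page: many users expect to be able to type or paste input when no file is given.

```python
import sys


def stdin_is_piped(stream=None):
    """Return True when the stream is not attached to an interactive terminal."""
    stream = sys.stdin if stream is None else stream
    return not stream.isatty()


def choose_source(filenames, stream=None):
    """Decide where input should come from (labels are illustrative only)."""
    if filenames:
        return "files"   # explicit filenames always win
    if stdin_is_piped(stream):
        return "stdin"   # data is piped or redirected in
    return "usage"       # nothing given: show help rather than block
```

A caller could branch on the returned label, for example printing the parser's help text when it is `"usage"`.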

pranavk
  • The behavior of your program should not depend significantly on whether `stdin` is connected to a tty. If no filenames are given, simply read from `stdin`, whether it is from a tty or a pipe. – musiphil Jun 14 '15 at 15:06
  • 1
    This really helped me out. I needed to know whether my data was piped or tty for a particular project and this was the only good solution I could use. – Blairg23 Jan 12 '16 at 04:24
3

There is no reliable way to detect whether sys.stdin is connected to anything, nor is it appropriate to do so (e.g., the user may want to paste the data in). Detect the presence of a filename as an argument, and use stdin if none is found.
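
A minimal sketch of that rule, assuming a single optional filename: the context manager `open_input` is a hypothetical name, and treating `"-"` as stdin follows the convention mentioned in the comments below.

```python
import sys
from contextlib import contextmanager


@contextmanager
def open_input(filename=None):
    """Yield the named file if one was given, otherwise standard input."""
    if filename is None or filename == "-":
        # stdin is not ours to close, so just hand it over.
        yield sys.stdin
    else:
        with open(filename, "r") as handle:
            yield handle
```

A script might call it as `with open_input(sys.argv[1] if len(sys.argv) > 1 else None) as f:` and then iterate over `f`. Run without a filename and without a pipe, it waits for typed input until EOF, like `cat`.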

Ignacio Vazquez-Abrams
  • Thanks to both of you. Since I lack the necessary knowledge to choose between your answer and gnibbler's, what are the drawbacks of each? They both seem valid. – Bite code Feb 15 '10 at 10:05
  • Oh, just tried this approach but there is a problem: if there are no files and no stdin, I end up reading stdin and it hangs. How can I write an error message to the user to tell them to provide data? – Bite code Feb 15 '10 at 10:07
  • Note well that the hang-until-EOF is inevitable here if someone runs it without a filename -- this is standard behavior in the Unix world. See also grep, cat, etc. If this isn't acceptable, the only portable way you can avoid it is to use another typical convention wherein providing a filename of '-' means 'read from stdin' (or write to stdout). – Nicholas Knight Feb 15 '10 at 10:08
  • Actually, I just fired grep and sed without any input and they don't hang. – Bite code Feb 15 '10 at 10:25
  • OK, you are right. cat does wait for input. But how do grep and sed do it? I give them no args, so I guess they check, see no args, then look for stdin. How do they manage to display the message? – Bite code Feb 15 '10 at 10:28
  • `grep` and `sed` need to be passed arguments in order to even begin to be useful. There's no point in waiting for stdin if there's nothing to do with it. – Ignacio Vazquez-Abrams Feb 15 '10 at 10:30
  • So I should check my args; if no args are given, give a message back; then if there are args but no file among them, check for stdin; then if there is no stdin, hang for manual input. It seems logical... – Bite code Feb 15 '10 at 10:45
  • ...isn't this what `sys.stdin.isatty()` is for? – jorelli Jun 10 '11 at 22:07
  • @jorelli: Sure, right up to the point that someone uses `unbuffer`. – Ignacio Vazquez-Abrams Jun 11 '11 at 00:12