
If I pass paths with wildcards as command-line parameters, like:
testprogramm a* b*
and the directory contains the following entries:
aa ab ba bb
my argv array will contain:
{"testprogramm","aa","ab","ba","bb"}

However, I want to differentiate between files that originated from the first argument (a*) and those from the second (b*). How do I do that? I am looking for a method that can tell me, for the example above, that "aa" and "ab" came from the first argument and "ba" and "bb" from the second.

I know that the commands in cp.c can do this, but I couldn't figure out how, as the source code is quite nested.

Edit: the standard cp (from GNU coreutils) cannot differentiate; only embedded toolboxes (where the shell and the utilities are the same binary) can.

Bogomips
  • That wildcard handling is done by the shell that loads your program. Your program cannot see what was initially entered on the terminal. It sees only what is provided by the shell. Therefore there is no relation between `"aa"` and `"a*"`, because in your program there is no `"a*"` at all. – Gerhardh Aug 08 '22 at 14:08
  • Tell the caller to call it with `testprogram "a*" "b*"`, then you get `a*` and `b*` as parameters and can find the files yourself. Other than that, is it always 2 parameters? Usually, the arguments are sorted when using `*`. So, the first argument that is not sorted is the first argument to the second parameter. But again, if it is completely sorted, you cannot tell where the second parameter starts. – mch Aug 08 '22 at 14:08
  • @Gerhardh But how does cp know it then? If aa, ab and ba are files and bb is a directory and I call "mv a* b*", it gives me the error "mv: target 'ba': Not a directory". *Ah, that's toybox; Bash itself cannot differentiate between them. Then I guess I will need to check for the first argument that is not sorted – Bogomips Aug 08 '22 at 14:12
  • @Bogomips: When `cp` sees there it has been given n paths, it treats the first n−1 as sources and the last one as a destination. It has no information about which of those paths came from which user-typed items. – Eric Postpischil Aug 08 '22 at 14:15
  • You are mixing `cp` and `mv`. The [cp manpage](https://man7.org/linux/man-pages/man1/cp.1.html) tells us: "Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY." That means, no matter how many names are provided, the last one must be the destination folder. Similar for [mv manpage](https://man7.org/linux/man-pages/man1/mv.1p.html) – Gerhardh Aug 08 '22 at 14:17
  • @EricPostpischil That is not the case, which is why I got confused. In normal bash, it checks which elements are sorted, now I know. – Bogomips Aug 08 '22 at 14:17
  • @Gerhardh They share the same path and the same encoding of arguments (see the source code) – Bogomips Aug 08 '22 at 14:18
  • You cannot really rely on results being sorted only within the names for the first/second argument. If you do not apply restrictive constraints to your file names, they could all be sorted while referring only to the first argument. – Gerhardh Aug 08 '22 at 14:19
  • @Gerhardh I know, but it is just the way it is done there, so I guess it is good enough. – Bogomips Aug 08 '22 at 14:21
  • `cp.c commands can do this,` They can't do this. `how does cp know it then?` cp copies everything to the last argument. So `cp a* b*` -> `cp aa ab ba bb` would copy `aa ab ba` into `bb`. – KamilCuk Aug 08 '22 at 14:22
  • The way it is done there? According to the manpage there are no multiple sorted sequences for separate arguments. These programs do not do what you claim. – Gerhardh Aug 08 '22 at 14:23
  • @KamilCuk Now I know. I just hadn't used bash for testing, so the cp I had seen wasn't GNU coreutils – Bogomips Aug 08 '22 at 14:24
  • If you use `cp *.png *.txt dest` you might get `cp 1.png 2.png 3.txt 4.txt dest`. All names are sorted without any hint from what string they came – Gerhardh Aug 08 '22 at 14:25
  • @Bogomips: What is not the case, that `cp` treats its first n-1 paths as sources and its last path as a destination? That is what its man page says and how I have observed it behave. Show us a case where `cp` behaves otherwise. – Eric Postpischil Aug 08 '22 at 14:33
  • @EricPostpischil I assumed that normal cp would do it like that, as I used toybox, where it works. The idea of sorting is one that I have seen suggested above. So normal cp does not check anything and just takes the last argument – Bogomips Aug 08 '22 at 14:37
  • Also [Toybox cp.c](https://github.com/landley/toybox/blob/master/toys/posix/cp.c) tells us the same: "Copy files from SOURCE to DEST. If more than one SOURCE, DEST must be a directory.". There is no indication that this `cp` implementation would behave any different. It does not take a group of source names and a group of destination names but only one single destination. `destname = tt ? : toys.optargs[--toys.optc];` It simply takes last argument – Gerhardh Aug 08 '22 at 15:05

2 Answers


There is no real way to solve this problem, at least with bash, as the only information the program receives is argv and argc. However, since the expansion is sorted on a per-argument basis, you can look for the first argument that is out of sorted order; that is where the results of the second pattern begin. This only detects the boundary when the patterns are given the other way around (b* and a*), not in the example above. There is no real solution short of switching from bash to an embedded utility box like toybox or busybox.

Bogomips
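Pass a delimiter between the two patterns, quoted so that the shell does not interpret it as a pipe:

./program a* "|" b*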

Then search for the delimiter in your program:

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    int group = 0;

    for (int i = 1; i < argc; i++)
    {
        if (strcmp(argv[i], "|") == 0)
        {
            group++;
        }
        else
        {
            printf("group: %d file: %s\n", group, argv[i]);
        }
    }
    return 0;
}

The output is:

group: 0 file: aa
group: 0 file: ab
group: 1 file: ba
group: 1 file: bb
David Ranieri
  • Just `./program a* "|" b*` – KamilCuk Aug 08 '22 at 14:39
  • @KamilCuk Yes, simply launching with `./program a* "|" b*` also works :) – David Ranieri Aug 08 '22 at 14:40
  • That would certainly work. Do you know any character(s) that can be used as a file name but not as a delimiter? It was originally a solution I wanted to avoid, but as there is apparently no better one, I will stick with it for the time being. I will just use -name like in the find command – Bogomips Aug 08 '22 at 14:41
  • Do you mean the opposite, one that can be used as a delimiter but not as a file name? In that case `.` or `|` seems a good delimiter; keep in mind that you search for a complete file named `.` or `|`. As far as I know you cannot name a file `.` nor `|` – David Ranieri Aug 08 '22 at 14:44
  • @DavidRanieri, yeah, I meant that. "|" can be used but "." cannot. Thx – Bogomips Aug 08 '22 at 14:49
  • @DavidRanieri: In Unix file systems, you certainly can name a file `|`, and `.` is a special name for the current working directory. – Eric Postpischil Aug 08 '22 at 14:49
  • Overall, this is called a "sentinel value". A special value is used to split the lists, in this case. For example parallel typically uses `:::` – KamilCuk Aug 08 '22 at 14:50