I have thousands of files in one GCS bucket, and I want to copy a list of n of them using the gsutil -m cp command. According to the documentation, I can do something like this from my Python script:

cat filelist | gsutil -m cp -I gs://my_bucket

If I run it as shown below, I get "Argument list too long" for any list of more than a few hundred files:

alist = [f'{_}.txt' for _ in range(1000000)]
alist_in_str = '\n'.join(alist)
subprocess.call(['printf', alist_in_str, '| gsutil -m cp -I gs://my_bucket'])

What is an efficient way to copy a list of files using gsutil from a Python script?

ramkrishs

1 Answer

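gsutil cp -I expects a newline-separated list of file names to copy on its standard input. Your attempt fails because subprocess.call with a list does not start a shell: the whole file list becomes a single argument to printf, which runs into the operating system's argument-size limit, and '| gsutil ...' is passed as just another argument rather than acting as a pipe. Feeding the list to gsutil is very easy to do in Python directly; don't try to use the shell to do it.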

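import subprocess

# Build each name as bytes, since the process's stdin is fed raw bytes.
alist = [f'{_}.txt'.encode() for _ in range(1000000)]
subprocess.run(['gsutil', '-m', 'cp', '-I', 'gs://my_bucket'],
               input=b'\n'.join(alist), check=True)  # list goes to stdin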
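
If you need this in more than one place, the same stdin approach can be wrapped in a small helper. The following is only a sketch: the names build_stdin_payload and copy_listed and the command parameter are made up for illustration and are not part of gsutil or the standard library. The command parameter exists so the function can be exercised with a stand-in program when gsutil is not installed. It also assumes each entry is already in the form gsutil expects, which for objects that live in a bucket would presumably be a full gs:// URL.

```python
import subprocess
import sys


def build_stdin_payload(names):
    """Return the names as newline-terminated bytes, one name per line."""
    return b"".join(name.encode() + b"\n" for name in names)


def copy_listed(names, destination, command=("gsutil", "-m", "cp", "-I")):
    """Run `command destination`, passing the names on standard input.

    `command` is a hypothetical hook added for this sketch; by default it
    is the gsutil invocation from the answer above.
    """
    payload = build_stdin_payload(names)
    # No shell is involved, so the size of the list is not limited by
    # the maximum length of a command line.
    return subprocess.run([*command, destination], input=payload, check=True)
```

Calling copy_listed(alist, 'gs://my_bucket') with a list of str names would then run the same gsutil command as above. With check=True, a non-zero exit status raises subprocess.CalledProcessError instead of being silently ignored.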
Gilles 'SO- stop being evil'