
I am trying to make a list of all files in a directory whose filenames end in .root.

After reading some posts on the forum I tried two basic strategies, glob and os.listdir, but I ran into trouble with both of them.

First, when I use

import glob
filelist = glob.glob('/home/usr/dir/*.root')

It does make a list of strings with all filenames that end in .root, but I still face a problem.

I would like the strings in the list to look like '/dir/.root', but each string has the full path '/home/usr/dir/.root'.

Second, if I use os.listdir, I run into this problem:

  path = '/home/usr/'
  filelist = os.listdir(path + 'dir/*.root')
  syntax error

which tells me that I cannot restrict the listing to only the .root files.

In summary, I would like to make a list of filenames that end in .root and are in my /home/usr/dir, while cutting off the '/home/usr' part. If I use glob, the results still contain /home/usr/. If I use os.listdir, I can't specify the ".root" ending.

leo
difficult life
  • If you want filepaths to be relative, you must first decide relative to what. To the current working directory? To a specific directory? – leo Jul 09 '17 at 08:53
  • I left a simple and concise function for relative globbing over here: https://stackoverflow.com/a/57514353/790075 – turtlemonvh Aug 15 '19 at 18:22

2 Answers


glob will return paths in a format matching your query, so that

glob.glob("/home/usr/dir/*.root")
# ['/home/usr/dir/foo.root', '/home/usr/dir/bar.root', ...]

glob.glob("*.root")
# ['foo.root', 'bar.root', ...]

glob.glob("./*.root")
# ['./foo.root', './bar.root', ...]

...and so forth.

To get only the filename, you can use os.path.basename, something like this:

from glob import glob
from os import path

pattern = "/home/usr/dir/*.root"
files = [path.basename(x) for x in glob(pattern)]
# ['foo.root', 'bar.root', ...]

...or, if you want to prepend the dir part:

pattern = "/home/usr/dir/*.root"
files = [path.join('dir', path.basename(x)) for x in glob(pattern)]
# ['dir/foo.root', 'dir/bar.root', ...]

...or, if you really want the path separator at the start:

from glob import glob
import os

pattern = "/home/usr/dir/*.root"
files = [os.sep + os.path.join('dir', os.path.basename(x)) for x in glob(pattern)]
# ['/dir/foo.root', '/dir/bar.root', ...]

Using os.path.join and os.sep will make sure that the correct path syntax is used for your OS (i.e. / or \ as the separator).

Depending on what you are really trying to do here, you might want to look at os.path.relpath, for the relative path. The title of your question indicates that relative paths might be what you are actually after:

pattern = "/home/usr/dir/*.root"
files = [os.path.relpath(x) for x in glob(pattern)]
# files will now contain the relative path to each file, from the current working directory
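
For completeness, the os.listdir attempt from the question can also be made to work. os.listdir takes a directory, not a pattern, so the '.root' filter has to be applied to the returned names yourself. The following is only a minimal sketch: the helper name list_root_files and the temporary demo directory are assumptions made for illustration, not part of the original question.

```python
import os
import tempfile


def list_root_files(directory, suffix=".root"):
    """Return 'dirname/filename' for each entry of directory ending in suffix.

    Hypothetical helper, named here only for illustration.
    """
    # Name of the directory itself, e.g. 'dir' for '/home/usr/dir'
    parent = os.path.basename(os.path.normpath(directory))
    # os.listdir returns bare names, so filter them by suffix manually
    names = sorted(n for n in os.listdir(directory) if n.endswith(suffix))
    return [os.path.join(parent, n) for n in names]


# Demo in a throwaway directory so the sketch runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "dir")
    os.mkdir(target)
    for name in ("foo.root", "bar.root", "notes.txt"):
        open(os.path.join(target, name), "w").close()
    listed = list_root_files(target)
    print(listed)
```

Sorting is only there to make the result deterministic; os.listdir itself returns entries in arbitrary order.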
leo

Just use glob to get the list you want,
and then call os.path.relpath on each file, passing the directory the paths should be relative to:

import glob
import os

files_names = []
for file in glob.glob('/home/usr/dir/*.root'):
    # second argument is the start directory the result is relative to
    files_names.append(os.path.relpath(file, "/home/usr"))
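
The same idea as a self-contained sketch that can be run without a /home/usr/dir on your machine. The function name glob_relative and the temporary directory layout are assumptions for the demo; only glob.glob and os.path.relpath come from the answer itself.

```python
import glob
import os
import tempfile


def glob_relative(pattern, start):
    """Glob with a full pattern, then express each match relative to start.

    Hypothetical helper, named here only for illustration.
    """
    return sorted(os.path.relpath(p, start) for p in glob.glob(pattern))


# Build a stand-in for /home/usr containing dir/*.root
with tempfile.TemporaryDirectory() as base:
    os.mkdir(os.path.join(base, "dir"))
    for name in ("foo.root", "bar.root", "skip.txt"):
        open(os.path.join(base, "dir", name), "w").close()
    relative = glob_relative(os.path.join(base, "dir", "*.root"), base)
    print(relative)
```

Passing the start directory explicitly keeps the result independent of the current working directory, which is what os.path.relpath falls back to when start is omitted.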

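You can also use a regex to strip the prefix (this only works for POSIX-style paths):

import re
# inside the loop above, instead of os.path.relpath:
files_names.append(re.sub(r'^/home/usr/', '', file))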
Idan Haim Shalom
  • Using the regex as suggested here should be avoided, because it's not portable. (Consider the "Windows-style" paths.) Favor the path manipulation features of Python. – bessbd Dec 08 '20 at 15:40
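
One way to illustrate the portability point is with pathlib's pure path classes, which can parse POSIX-style and Windows-style paths on any OS. This is just a sketch: pathlib is swapped in here and is not used in the answers above, the helper name strip_prefix is hypothetical, and the Windows path is an invented example.

```python
from pathlib import PurePosixPath, PureWindowsPath


def strip_prefix(path, prefix):
    """Return path relative to prefix.

    Hypothetical helper; relative_to raises ValueError when path
    does not lie under prefix.
    """
    return path.relative_to(prefix)


posix_result = strip_prefix(
    PurePosixPath("/home/usr/dir/foo.root"),
    PurePosixPath("/home/usr"),
)
# Invented Windows-style equivalent, for illustration only
windows_result = strip_prefix(
    PureWindowsPath(r"C:\home\usr\dir\foo.root"),
    PureWindowsPath(r"C:\home\usr"),
)

print(posix_result)    # dir/foo.root
print(windows_result)  # dir\foo.root
```

The prefix is stripped by path component rather than by character matching, so the separator style does not need to be hard-coded.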