Identify Linux passwd file

Question

I need help writing a function (preferably in Python) that identifies whether a file is an /etc/passwd or /etc/shadow file. So far I have tried print(pw.getpwall()), but that reads the password database of the running system. I need a library that takes a file as input and can tell whether it is a passwd/shadow file or not.

/etc/shadow cannot be read by ordinary users, so perhaps you could check the permissions. Also, in /etc/passwd the third column (columns are separated by `:`) is the user ID, which is `0` for the `root` user. You can use the `readline` and `split` functions to extract the fields. — Cibin Joseph, Apr 22 '20 at 12:33
Thanks Cibin. I have been able to extract the fields, aside from the salt value. However, my question here is "Is there a library that can identify whether a given file is a passwd/shadow file?" If there isn't, is there a workaround to achieve this? For example, `def is_passw(path)` should return `true` if the file at the given path is a passwd/shadow file. — secjedi, Apr 22 '20 at 13:16
If you're just trying to determine if the file is `passwd` or `shadow`, why not use pattern matching on the filename - like regex? Or, by searching for an expected pattern in the file content? Or, even more simply: `result = file_path == '/etc/passwd'` — S3DEV, Apr 22 '20 at 14:27
@secjedi No, there isn't a library specifically for this afaik. You'd have to extract and check the conditions elaborated in one of the answers. — Cibin Joseph, Apr 22 '20 at 20:44
Yes, I understand. I have been able to implement checks using regex, but I don't want that. I want to modify the pwd.py Unix Python library so that it accepts my file instead of reading the /etc/passwd file from the OS environment. https://github.com/enthought/Python-2.7.3/blob/master/Lib/plat-os2emx/pwd.py Do you have any idea how I can get this done? — secjedi, Apr 23 '20 at 15:32

GuBo · Accepted Answer · 2020-04-24T10:44:29.383

The passwd and shadow file formats differ.

You can write a short function or class to tell them apart. A first iteration would be:

Find the root user; it is almost always the first entry.
Check the 2nd, 6th and 7th columns (the separator is the : sign).
If the 2nd is x, the 6th is /root and the 7th is /bin/*sh, then it is almost certainly a passwd file.
If the 2nd is a password hash (format: $id$salt$hash), the 6th is a number and the 7th is empty, then it is almost certainly a shadow file.
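
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a library API: the function name `classify_by_root_entry` and the two regular expressions are assumptions made for this example, and the field counts (7 for passwd, 9 for shadow) come from the manual pages.

```python
import re

# Assumed shell pattern: /bin/<something>sh or /usr/bin/<something>sh.
SHELL_RE = re.compile(r"^/(?:usr/)?bin/\w*sh$")
# Assumed hash pattern: at least two "$"-introduced parts, e.g. $id$salt$hash.
HASH_RE = re.compile(r"^\$[^$:]+(?:\$[^$:]+)+$")


def classify_by_root_entry(text):
    """Return 'passwd', 'shadow' or None, judging by the root record only."""
    for line in text.splitlines():
        fields = line.strip().split(":")
        if fields[0] != "root":
            continue
        # passwd: 7 fields, 2nd is "x", 6th is /root, 7th is a shell
        if (len(fields) == 7 and fields[1] == "x"
                and fields[5] == "/root" and SHELL_RE.match(fields[6])):
            return "passwd"
        # shadow: 9 fields, 2nd is a hash, 6th is a number, 7th is empty
        if (len(fields) == 9 and HASH_RE.match(fields[1])
                and fields[5].isdigit() and fields[6] == ""):
            return "shadow"
        return None  # a root record exists but fits neither pattern
    return None  # no root record found
```

To classify a file on disk, read it first, for example `classify_by_root_entry(pathlib.Path(path).read_text())`. Note that this sketch returns None for a shadow file whose root account is locked (2nd column `!` or `*`), so it is only a first approximation.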

Naturally, there are cases where this breaks down:

Linux is configured not to use a shadow file. In this case the 2nd column of the passwd file contains the password hash.
Linux is configured not to use a salt (I am not sure whether this is possible).

Please check the manual pages: man 5 passwd and man 5 shadow
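
Because the root-based check can fail in cases like these, a complementary approach is to validate the structure of every record against the formats described in the manual pages: 7 colon-separated fields for passwd (with numeric UID and GID) and 9 for shadow (with fields 3 to 8 numeric or empty). The sketch below assumes that blank lines and lines starting with `#` may be skipped; the function names are hypothetical.

```python
def _records(text):
    """Yield the colon-separated fields of each non-blank, non-comment line."""
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            yield line.split(":")


def is_passwd_text(text):
    """True if every record has 7 fields, a name, and numeric UID and GID."""
    records = list(_records(text))
    return bool(records) and all(
        len(r) == 7 and r[0] and r[2].isdigit() and r[3].isdigit()
        for r in records
    )


def is_shadow_text(text):
    """True if every record has 9 fields, a name, and numeric-or-empty ageing fields."""
    records = list(_records(text))
    return bool(records) and all(
        len(r) == 9 and r[0] and all(f == "" or f.isdigit() for f in r[2:8])
        for r in records
    )
```

This check does not depend on the root entry or on the content of the password column, so it still works when the passwd file carries the hashes itself or when accounts are locked. It is a format check only and says nothing about whether the file is the one the system actually uses.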

EDIT, 2020-04-24: Here is my corrected pwd.py:

#!/usr/bin/env python3

import os
import sys

# path of the passwd-format file to parse (used instead of /etc/passwd)
passwd_file = './passwd'

# path conversion handlers
def __nullpathconv(path):
    return path

def __unixpathconv(path):
    return path

# decide what field separator we can try to use - Unix standard, with
# the platform's path separator as an option.  No special field conversion
# handler is required when using the platform's path separator as field
# separator, but are required for the home directory and shell fields when
# using the standard Unix (":") field separator.
__field_sep = {':': __unixpathconv}
if os.pathsep:
    if os.pathsep != ':':
        __field_sep[os.pathsep] = __nullpathconv

# helper routine to identify which separator character is in use
def __get_field_sep(record):
    fs = None
    for c in list(__field_sep.keys()):
        # there should be 6 delimiter characters (for 7 fields)
        if record.count(c) == 6:
            fs = c
            break
    if fs:
        return fs
    else:
        raise KeyError

# class to match the new record field name accessors.
# the resulting object is intended to behave like a read-only tuple,
# with each member also accessible by a field name.
class Passwd:
    def __init__(self, name, passwd, uid, gid, gecos, dir, shell):
        self.__dict__['pw_name'] = name
        self.__dict__['pw_passwd'] = passwd
        self.__dict__['pw_uid'] = uid
        self.__dict__['pw_gid'] = gid
        self.__dict__['pw_gecos'] = gecos
        self.__dict__['pw_dir'] = dir
        self.__dict__['pw_shell'] = shell
        self.__dict__['_record'] = (self.pw_name, self.pw_passwd,
                                    self.pw_uid, self.pw_gid,
                                    self.pw_gecos, self.pw_dir,
                                    self.pw_shell)

    def __len__(self):
        return 7

    def __getitem__(self, key):
        return self._record[key]

    def __setattr__(self, name, value):
        raise AttributeError('attribute read-only: %s' % name)

    def __repr__(self):
        return str(self._record)

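    # Python 3 ignores __cmp__, so equality is provided through __eq__;
    # as before, a record compares equal to the string form of its tuple.
    def __eq__(self, other):
        return str(self._record) == str(other)

    # defining __eq__ disables the default hash, so restore one
    def __hash__(self):
        return hash(self._record)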

# read the whole file, parsing each entry into tuple form
# with dictionaries to speed recall by UID or passwd name
def __read_passwd_file():
    if not passwd_file:
        raise KeyError
    uidx = {}
    namx = {}
    sep = None
    # the with-block closes the file even if parsing raises
    with open(passwd_file, 'r') as passwd:
        for line in passwd:
            entry = line.strip()
            # skip blank lines, comments and records too short for 7 fields
            if len(entry) <= 6 or entry.startswith('#'):
                continue
            if sep is None:
                sep = __get_field_sep(entry)
            fields = entry.split(sep)
            if len(fields) != 7:
                continue                 # skip malformed records
            # UID and GID are stored as integers
            for i in (2, 3):
                fields[i] = int(fields[i])
            # home directory and shell go through the path converter
            for i in (5, 6):
                fields[i] = __field_sep[sep](fields[i])
            record = Passwd(*fields)
            # the first record wins when a UID or name is duplicated
            if fields[2] not in uidx:
                uidx[fields[2]] = record
            if fields[0] not in namx:
                namx[fields[0]] = record
    if len(uidx) == 0:
        raise KeyError
    return (uidx, namx)

# return the passwd database entry by UID
def getpwuid(uid):
    u, n = __read_passwd_file()
    return u[uid]

# return the passwd database entry by passwd name
def getpwnam(name):
    u, n = __read_passwd_file()
    return n[name]

# return all the passwd database entries
def getpwall():
    u, n = __read_passwd_file()
    return list(n.values())

# test harness
if __name__ == '__main__':
    print(getpwall())
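
The module above reads whichever file the module-level `passwd_file` variable names. If the goal is a function that takes the file as input, a much smaller reader may be enough. The following is a sketch under that assumption, not a drop-in replacement for the standard `pwd` module; `PasswdEntry`, `parse_passwd` and `read_passwd` are names invented for this example, while the field names mirror those used by the `Passwd` class above.

```python
from collections import namedtuple

# Field names follow the Passwd class above.
PasswdEntry = namedtuple(
    "PasswdEntry",
    "pw_name pw_passwd pw_uid pw_gid pw_gecos pw_dir pw_shell",
)


def parse_passwd(text):
    """Parse passwd-format text into a list of PasswdEntry tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.strip().split(":")
        if len(fields) != 7:
            continue  # skip blank or malformed records
        try:
            fields[2], fields[3] = int(fields[2]), int(fields[3])
        except ValueError:
            continue  # skip records whose UID or GID is not numeric
        entries.append(PasswdEntry(*fields))
    return entries


def read_passwd(path):
    """Read and parse the passwd-format file at the given path."""
    with open(path, "r") as handle:
        return parse_passwd(handle.read())
```

With this, `read_passwd('./passwd')` returns a list comparable to `getpwall()`, and an empty list is a hint that the file is not in passwd format.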

A more consistent way to identify the root user is that their UID and GID are both zero (0). — SpaceKatt, Apr 22 '20 at 16:17
Yes, I understand. I have been able to implement checks using regex, but I don't want that. I want to modify the pwd.py Unix Python library so that it accepts my file instead of reading the /etc/passwd file from the OS environment, as the library does here: https://github.com/enthought/Python-2.7.3/blob/master/Lib/plat-os2emx/pwd.py Do you have any idea how I can get this done? — secjedi, Apr 23 '20 at 15:34
I guess you want to use it in a Python 3 environment. First of all, you must convert it to Python 3 code; see [https://docs.python.org/2/library/2to3.html](https://docs.python.org/2/library/2to3.html). After that I would delete the **try to find passwd file** block (lines 62-80) and pass the passwd filename as a command-line argument. In the code, the **passwd_file** variable holds the final passwd file. I prefer [argparse](https://docs.python.org/3/library/argparse.html) for processing command-line arguments. — GuBo, Apr 23 '20 at 17:28
I have converted it to Python 3, removed the other lines and set `passwd_file = './passwd.txt'`, i.e. my file, but I get a 'TypeError: replace() argument 1 must be str, not None' error at line 93. — secjedi, Apr 23 '20 at 18:43
This lib is intended for OS/2 systems. As far as I know, on Unix/Linux/Mac there is no alternative separator, so **os.altsep** is **None**, and the lib doesn't handle this scenario. That's why you get a TypeError. If you only use this script in a Unix/Linux environment, then the quickest solution is to put **return path** at the beginning of the **__nullpathconv** and **__unixpathconv** functions. Yes, I know this is a dirty hack, but you can test the lib, and if it meets your requirements you can finalize it. — GuBo, Apr 24 '20 at 07:24
Now I get no error, but nothing is returned when I specify my `path='./passwd.txt'` for `__unixpathconv(path)` and `__nullpathconv(path)`. I think the library might need a total overhaul to serve this purpose. However, if anyone does get this working, it would be good to make it available so no one else has to go through this. — secjedi, Apr 24 '20 at 09:12
