What is the optimal approach for validating object types of method arguments in Python?

Question

This isn't a specific issue with code and more of an open, conceptual question, so I hope it's in the right place.

I have a pandas dataframe, and I often subset data on bounding times and other optional variables, here frequency. The frequency has discrete values, so I can select data from a single or multiple channels. The function I have looks something like this:

def subset_data(data, times, freq=None):

    sub_data = data.loc[data['time'].between(*times), :]

    if freq is not None:
   
        if isinstance(freq, int):

            sub_data = sub_data.loc[sub_data['frequency'] == freq, :]

        elif isinstance(freq, tuple):

            sub_data = sub_data.loc[sub_data['frequency'].between(*freq), :]
                
    return sub_data

I wanted to modify the second condition to be a more general check for any numeric type, and I found this question - What is the most pythonic way to check if an object is a number?. The accepted answer made me question my approach here and its validity in general. The last point, in particular:

If you are more concerned about how an object acts rather than what it is, perform your operations as if you have a number and use exceptions to tell you otherwise.

I interpret this as implying I should do something like this:

def subset_data(data, times, freq=None):

    sub_data = data.loc[data['time'].between(*times), :]

    if freq is not None:

        try:
   
            if isinstance(freq, tuple):

                sub_data = sub_data.loc[sub_data['frequency'].between(*freq), :]
       
            elif isinstance(freq, int):

                sub_data = sub_data.loc[sub_data['frequency'] == freq, :]

        except TypeError:

            print('sub_data filtered on time only. freq must be numeric.')

    return sub_data

or

if isinstance(freq, tuple):

    sub_data = sub_data.loc[sub_data['frequency'].between(*freq), :]
       
elif isinstance(freq, int):

    sub_data = sub_data.loc[sub_data['frequency'] == freq, :]

else:

    raise TypeError('freq must be tuple or numeric')

but would be interested to know if that's anything close to the consensus.

The original is also missing some validation for completeness - I'm too lazy to write this in my own code and feel like it adds unnecessary clutter if I assume that I'll be the only one using it and have a priori knowledge of the types. If this was not the case, I would include:

if isinstance(freq, int):

    sub_data = sub_data.loc[sub_data['frequency'] == freq, :]

elif isinstance(freq, tuple) and len(freq) == 2:

    if isinstance(freq[0], int) and isinstance(freq[1], int):

        sub_data = sub_data.loc[sub_data['frequency'].between(*freq), :]

Is the practice of checking explicitly for the object type and attributes, and this approach to validation in general, appropriate in Python, or is my knowledge lacking somewhere? Maybe everything could be written more concisely, and I'm missing something. There are technically two questions in this post, but I hope the general, overarching concept is clear enough to be useful for others and allow for some answers.

score 1 · Accepted Answer · answered Jul 10 '20 at 00:10


If I'm not mistaken, with the second approach (the try/except), an incorrect incoming type just produces a bare TypeError rather than a detailed pointer to what exactly caused the problem in the code. With the first approach, you harden the check by testing explicitly for the two accepted types, int and tuple, which is good. I don't have a strong preference, and both approaches are fine to me, although in the except clause you could make the message more detailed to get a specific error log (if any).
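As a rough sketch of what a more detailed error could look like, the checks can be pulled into a small helper that reports the offending type by name. The helper name `normalize_freq` is hypothetical and not from the question, and using `numbers.Real` to mean "any numeric type" is an assumption about what you want to accept (bools are rejected here on purpose, since `bool` is a subclass of `int`).

```python
from numbers import Real


def _is_number(value):
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, Real) and not isinstance(value, bool)


def normalize_freq(freq):
    """Return (low, high) bounds for freq, or None if no filter is requested.

    Hypothetical helper for illustration; not part of the original code.
    """
    if freq is None:
        return None
    # A single number selects one channel: low and high are the same.
    if _is_number(freq):
        return (freq, freq)
    # A 2-tuple of numbers selects an inclusive range of channels.
    if isinstance(freq, tuple):
        if len(freq) != 2:
            raise ValueError(
                f"freq tuple must have 2 elements, got {len(freq)}"
            )
        for bound in freq:
            if not _is_number(bound):
                raise TypeError(
                    f"freq bounds must be numeric, got {type(bound).__name__}"
                )
        return freq
    raise TypeError(
        "freq must be a number or a 2-tuple of numbers, "
        f"got {type(freq).__name__}"
    )
```

Inside `subset_data`, the returned bounds could then be passed to `between(*bounds)` in both cases, since `Series.between` is inclusive on both ends by default. That keeps the validation in one place and the filtering itself to a couple of lines.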

A good way to get familiar with exceptions, if you want to, is to look at examples of the KeyError raised when you try to access a key that doesn't exist in a dictionary, and then print(e), where e is the KeyError exception that was raised. Hope this helps somewhat. Cheers.
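A minimal version of that KeyError example might look like the following; the dictionary contents are made up for illustration.

```python
# Hypothetical mapping of channel names to frequencies.
channels = {"a": 10, "b": 20}

try:
    value = channels["c"]  # "c" is not a key, so this raises KeyError
except KeyError as e:
    # str(e) for a KeyError is the repr of the missing key.
    message = f"missing key: {e}"
    print(message)
```

Catching the exception as `e` gives you the specific key that failed, which is the kind of detail worth including in your own error messages.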

answered Jul 10 '20 at 00:10

de_classified



Thanks for your answer and the pointer on where to look further. I think one of my issues is being reluctant to sacrifice the space for the added conditions. There are lots of other validation steps that I could (and probably should) include, like checking that the passed frequency (in the single case) is actually one of the discrete channels for example, so it would add a lot of lines! – Dagorodir Jul 10 '20 at 00:16

To aid you a little more, I would check for the critical components first; that way, even if there is a bug related to type checking or some syntax issue, you can resolve it within a few lines. This is a good SO [link](https://stackoverflow.com/questions/2739582/condition-checking-vs-exception-handling); the 3rd answer does a great job of going through try/excepts and conditional handling step by step. – de_classified Jul 10 '20 at 00:23

Thanks again. It's been a couple of days, but would you mind commenting on the third example that I included in an edit? Not sure if you saw that initially, but that's what I would tend towards. – Dagorodir Jul 12 '20 at 17:03
