
I'm trying to determine whether a function argument is a string or some other iterable. Specifically, this is used in building URL parameters, in an attempt to emulate PHP's &param[]=val syntax for arrays. Duck typing doesn't really help here: I can iterate through a string and produce things like &param[]=v&param[]=a&param[]=l, but this is clearly not what we want.

If the parameter value is a string (or a bytes object? I still don't know what the point of bytes actually is), it should produce &param=val, but if the parameter value is (for example) a list, each element should receive its own &param[]=val.

I've seen a lot of explanations about how to do this in 2.* involving isinstance(foo, basestring), but basestring doesn't exist in 3.*, and I've also read that isinstance(foo, str) will miss more complex strings (I think unicode?). So, what is the best way to do this without some string types being misclassified and treated as iterables?

Monchoman45

1 Answer


The advice you've been seeing conflicts because some of it is for Python 2 and some for Python 3. In Python 3, isinstance(foo, str) is almost certainly what you want. bytes is for raw binary data, which you probably can't include in an argument string like that.
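As a rough illustration, here is a minimal sketch of a parameter builder that uses that check. The name build_query and the choice to take a dict are assumptions made for this example, not something from the question, and it assumes every non-string value is iterable (a bare int would raise TypeError).

```python
from urllib.parse import quote


def build_query(params):
    """Build a PHP-style query string from a dict (hypothetical helper)."""
    parts = []
    for name, value in params.items():
        if isinstance(value, str):
            # A single string: emit one name=value pair.
            parts.append(f"{quote(name)}={quote(value)}")
        else:
            # Assumed to be some other iterable: one name[]=item pair per element.
            for item in value:
                parts.append(f"{quote(name)}[]={quote(str(item))}")
    return "&".join(parts)


print(build_query({"param": "val"}))         # param=val
print(build_query({"param": ["v1", "v2"]}))  # param[]=v1&param[]=v2
```

Without the isinstance check, the string "val" would be iterated character by character, which is exactly the &param[]=v&param[]=a&param[]=l problem from the question.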

The Python 2 str type stored raw binary data, usually a string in some specific encoding like UTF-8 or Latin-1; the unicode type stored a more "abstract" representation of the characters that could then be encoded into whatever specific encoding you needed. basestring was a common ancestor of both, so you could easily say "any kind of string".

In Python 3, str is the more "abstract" type, and bytes is for raw binary data (like a string in a specific encoding, or whatever raw binary data you want to handle). You shouldn't use bytes for anything that would otherwise be a string, so there's no real reason to check whether it's either str or bytes. If you absolutely need to, though, you can do something like isinstance(foo, (str, bytes)).
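To make the distinction concrete, here is a small Python 3 sketch. The sample text and the helper name is_stringlike are made up for illustration.

```python
text = "café"                    # str: abstract sequence of characters
utf8 = text.encode("utf-8")      # bytes: the same text in one specific encoding
latin1 = text.encode("latin-1")  # bytes: same text, different encoding

print(utf8)    # b'caf\xc3\xa9'
print(latin1)  # b'caf\xe9'

# Decoding with the matching encoding recovers the same str.
assert utf8.decode("utf-8") == latin1.decode("latin-1") == text


def is_stringlike(value):
    """Hypothetical helper: treat both str and bytes as a single value."""
    return isinstance(value, (str, bytes))


print(is_stringlike("val"), is_stringlike(b"val"), is_stringlike(["val"]))
# True True False
```

Note that iterating over a bytes object yields integers rather than one-character strings, so treating bytes as a generic iterable would go wrong in a different way than it does for str.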

Danica
  • I'd argue the other way around. To pass data around on the web, you need to encode it (create `bytes`) at some point. So it can make perfect sense to construct a URL from `bytes` pieces. –  Nov 04 '12 at 03:54
  • Yeah, that's true - but query strings are supposed to be ascii-only and %-encoded, right? So most APIs you'd probably want to pass either a URL-encoded string (presumably a `str`) or a general string that will later be URL-encoded (also a `str`). – Danica Nov 04 '12 at 03:59
  • The thing is, if it's already urlencoded (and thus plain ASCII) it might as well be `bytes`. In fact, that may be the saner choice, because it would cause errors if someone later tried to add another string without urlencoding it first. –  Nov 04 '12 at 04:04
  • In Python 3 I'd consider it very strange to have a URL-encoded ASCII string being passed around in your library as `bytes`. – Danica Nov 04 '12 at 07:56