I'm using Python's base64 module, and I receive a string that may or may not be base64-encoded. I would like to do something like:
if isEncoded(s):
    output = base64.decodestring(s)
else:
    output = s
Any ideas?
In general, it's impossible; if you receive the string 'MjMj', for example, how could you possibly know whether it's already decoded and should be used as is, or needs to be decoded into '23#'?
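To make the ambiguity concrete, here is a small sketch using the standard base64 module (the variable names are just for illustration). It decodes 'MjMj' and then re-encodes the result to show that both readings are consistent:

```python
import base64

s = "MjMj"  # could be literal text, or the base64 encoding of "23#"

decoded = base64.b64decode(s)
print(decoded)  # b'23#'

# Re-encoding the decoded bytes gives back the original string, so nothing
# in the data itself says which interpretation was intended.
reencoded = base64.b64encode(decoded).decode("ascii")
print(reencoded == s)
```

Both interpretations are valid, so the decision has to come from somewhere outside the string (a flag, a prefix, a protocol field, and so on).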
You could just try it, and see what happens:
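import base64

def decode_if_necessary(s):
    try:
        # b64decode replaces decodestring, which was removed in Python 3.9
        return base64.b64decode(s)
    except ValueError:
        # binascii.Error (invalid base64) is a subclass of ValueError
        return s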
But you have to ask yourself: what if the original message was in fact a syntactically valid base64 string, but not meant to be one? Then "decoding" it will succeed, but the result is not the required output. So I have to ask: is this really what you want?
Edit: Note that decodestring is deprecated (and was removed in Python 3.9); use base64.decodebytes or base64.b64decode instead.
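As a rough illustration of the "syntactically valid but not meant to be base64" problem, the sketch below feeds a few ordinary words to b64decode. The sample words are made up for the example; they were never base64-encoded, yet they decode without error even in strict mode:

```python
import base64

# Hypothetical plain-text inputs that were never base64-encoded.
plain_words = ["test", "data", "user1234"]

results = {}
for word in plain_words:
    # No exception is raised: each word happens to be valid base64.
    results[word] = base64.b64decode(word, validate=True)
    print(word, "->", results[word])
```

Each of these is a multiple of four characters long and uses only base64 alphabet characters, so a try/except approach would silently turn them into meaningless bytes.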
You could check whether a string may be base64-encoded. In general, the function can predict with 75%+ accuracy whether the data is encoded.
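import re
def isBase64(s):
    # Length must be a multiple of 4: alphabet characters, then up to two '=' of padding
    return len(s) % 4 == 0 and re.fullmatch(r'[A-Za-z0-9+/]+={0,2}', s) is not None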
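Here is one way such a check could be wired into the flow from the question. This is only a sketch: looks_like_base64 and decode_if_base64 are hypothetical names, and the sample inputs are made up. Note that the last input is a false positive.

```python
import base64
import re

def looks_like_base64(s):
    # Heuristic only: right length and right characters, nothing more.
    return len(s) % 4 == 0 and re.fullmatch(r"[A-Za-z0-9+/]+={0,2}", s) is not None

def decode_if_base64(s):
    # Decode when the heuristic matches, otherwise return the input unchanged.
    if looks_like_base64(s):
        return base64.b64decode(s)
    return s

encoded_output = decode_if_base64("aGVsbG8gd29ybGQ=")  # real base64
plain_output = decode_if_base64("hello world")         # left as is
false_positive = decode_if_base64("test")              # plain word, decoded anyway
```

The heuristic passes "hello world" through untouched because of its length and the space, but it cannot tell that "test" was meant as plain text.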
You can use the argument validate=True, something like:
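import base64
import binascii

def decode_base64(input_string):  # illustrative name; the original snippet had no def line
    try:
        # Convert the input string to bytes
        input_bytes = input_string.encode('utf-8')
        # Decode the Base64 encoded bytes, rejecting non-alphabet characters
        decoded_bytes = base64.b64decode(input_bytes, validate=True)
        return decoded_bytes
    except binascii.Error:
        print("Error: Invalid Base64 string")
        return None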
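The validate=True argument makes base64.b64decode() reject any character outside the Base64 alphabet instead of silently discarding it, which is the default behavior. Incorrect padding is rejected with or without validate (an encoded string is always properly padded). If the input string is not valid, a binascii.Error exception is raised, which we catch and handle accordingly.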
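To see what validate actually changes, the following sketch compares both modes on the same inputs. The helper try_decode is a hypothetical name introduced just for this comparison:

```python
import base64
import binascii

def try_decode(s, validate):
    """Return the decoded bytes, or None if b64decode raises binascii.Error."""
    try:
        return base64.b64decode(s, validate=validate)
    except binascii.Error:
        return None

# '!' is outside the Base64 alphabet.
lenient = try_decode("Mj!Mj", validate=False)  # '!' is discarded, the rest decodes
strict = try_decode("Mj!Mj", validate=True)    # rejected

# Three characters can never be correctly padded Base64.
bad_padding_lenient = try_decode("MjM", validate=False)
bad_padding_strict = try_decode("MjM", validate=True)

print(lenient, strict, bad_padding_lenient, bad_padding_strict)
```

Without validate, stray characters are dropped before decoding, so 'Mj!Mj' is treated like 'MjMj'. Bad padding fails in both modes.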