3

I have a PostgreSQL query that won't execute properly when I look for empty strings. I've tried escaping single quotes with the backslash, I've tried double quotes, nothing seems to work. Ideas? I need to look for empty strings because my script doesn't do what I need it to do when it finds empty strings in my dataset.

this_query = """
copy(SELECT phone_number from mytable WHERE phone_number is not null and phone_number != '''' ORDER BY random() LIMIT 25)
to stdout csv header;
"""

Without the empty string phrase (phone_number != ''''), the query will work when I execute:

THIS_COMMAND = "psql -h hostname -d dbname -c '{query}' > {file_name}"

command = THIS_COMMAND.format(query=this_query, file_name=a_file_name)

os.system(command)

ATTN: After reviewing answers below, I experimented and found this to work:

phone_number != '\"'\"''\"'\"' 

lno23
  • 67
  • 6

2 Answers2

1

There are ways to make this work with the quotes.

EDIT: Cross-referencing this UNIX-shell quoting answer: The quotes need to both escaped for Python and sh/bash (presumably). So for sh this needs to be '"'"''"'"', but then the " characters need to be escaped for Python, giving '\"'\"''\"'\"'. Phew!

this_query = """
copy(SELECT phone_number from mytable WHERE phone_number is not null and phone_number != '\"'\"''\"'\"''\"'\"''\"'\"' ORDER BY random() LIMIT 25) to stdout csv header;
"""

But to give another approach, have you considered checking for character_length(phone_number) > 0 ?

Ref: https://www.postgresql.org/docs/9.1/functions-string.html

e.g.:

SELECT phone_number FROM mytable WHERE phone_number IS NOT NULL AND character_length(phone_number) > 0

But maybe it's better to clean the data?

UPDATE mytable SET phone_number = NULL WHERE character_length(trim(phone_number)) = 0;
Kingsley
  • 14,398
  • 5
  • 31
  • 53
  • Thanks so much! how WOULD you make it work with quotes? I'm having another issue where someone has inserted a word into the phone_number field and I get the error: ERROR: invalid input syntax for integer: "INVALIDNUMBER" so if I could put in a where clause 'phone_number != 'INVALIDNUMBER'' that would help – lno23 Dec 18 '18 at 22:50
  • I don't have write access to the database unfortunately – lno23 Dec 18 '18 at 22:52
  • @Ino23 - I guess there's no point suggesting hints on data integrity constraints for the table either then ;) – Kingsley Dec 18 '18 at 23:04
  • If your shell is `bash` (version > 2.02), you can also use the syntax `$'\''` for a single quote. – Kingsley Dec 18 '18 at 23:12
  • I ended up using single quotes to wrap around a date in my query like so: `where created_on::date <= '\"'\"'2018-12-11'\"'\"'` and THAT worked. Your quote suggestions did not work. In any case, I might want to try the suggestion made by @Parfait, to use the `\copy` command instead... this quote business is a little crazy – lno23 Dec 20 '18 at 17:23
  • actually your quote suggestion did work for the empty quotes, thank you! – lno23 Dec 20 '18 at 17:30
0

Consider Python's built-in subprocess to pass multiple arguments into a command line call and avoid any need for quote enclosures or escaping. This is a more enhanced command line caller than os.system(). Additionally, as shown below you can set environment variables like PGPASSWORD.

Specifically, use psql's meta-command \copy command (distinct from Postgres' COPY) to pass file directly in query name to be saved client-side. Notice too the use of r (raw string literal) so backslash in \copy can be handled as is and not as a special character.

from subprocess import Popen, PIPE

file_name = '/path/to/file.csv'

this_query = r"""\copy (SELECT phone_number 
                        FROM mytable 
                        WHERE phone_number is not null 
                          AND phone_number != ''
                        ORDER BY random() LIMIT 25)
             TO '{myfile}' with (FORMAT CSV, HEADER);
             """.format(file_name)

host_name = 'XX.XX.XXX'
db_name = 'mydatabase'
user_name = 'myuser'
pwd = 'mypwd'

# LIST OF ARGS
cmd = ["psql", "-h", host_name,  "-d", db_name, 
       "-U", user_name, "-p", "5432", "-c", this_query]

# COMMAND LINE CALL
p = Popen(cmd, env={'PGPASSWORD': pwd},
          stdin=PIPE, stdout=PIPE, stderr=PIPE)

Plus, you can receive return errors or console output via PIPE.

output, error = p.communicate()

if p.returncode == 0:            
    print('OUTPUT:\n {0}'.format(output))            
    print("\nSuccessfully completed!")
else:                
    print('ERROR:\n {0}'.format(error)) 
Parfait
  • 104,375
  • 17
  • 94
  • 125