One idea is to use the SQL VALUES
clause as part of a SELECT
statement.
If you are familiar with SQL, you may have seen the VALUES
clause before, it is commonly used as part of an INSERT
statement like so:
INSERT INTO statusupdate (person_id,status)
VALUES (1, 'my status'), (1, 'another status'), (2, 'his status');
This tells the database to insert three rows - AKA tuples - into the table statusupdate
.
Another way of inserting things though is to do something like:
INSERT INTO statusupdate (person_id,status)
SELECT ..., ... FROM <elsewhere or subquery>;
This is equivalent to the insert_from()
functionality that peewee provides.
But there is another less common thing you can do: you can use the VALUES
clause in any select to provide literal values. Example:
SELECT *
FROM (VALUES (1,2,3), (4,5,6)) as my_literal_values;
This will return a result-set of two rows/tuples, each with 3 values.
So, if you can convert the "bulk" insert into a SELECT/FROM/VALUES
statement, you can then do whatever transformations you need to do (namely, convert Person.name values to corresponding Person.id values) and then combine it with the peewee 'insert_from()` functionality.
So let us see how this would look.
First let us begin constructing the VALUES
clause itself. We want properly escaped values, so we will use question marks instead of the values for now, and put the actual values in later.
#this is gonna look like '(?,?), (?,?), (?,?)'
# or '(%s,%s), (%s,%s), (%s,%s)' depending on the database type
values_question_marks = ','.join(['(%s, %s)' % (db.interpolation,db.interpolation)]*len(new_status_updates))
The next step is to construct the values clause. Here is our first attempt:
--the %s here will be replaced by the question marks of the clause
--in postgres, you must have a name for every item in `FROM`
SELECT * FROM (VALUES %s) someanonymousname
OK, so now we have a result-set that looks like:
name | status
-----|-------
... | ...
Except! There are no column names. This will cause us a bit of heartache in a minute, so we have to figure out a way to give the result-set proper column names.
The postgres way would be to just alter the AS
clause:
SELECT * FROM (VALUES %s) someanonymousname(name,status)
sqlite3 does not support that (grr).
So we are reduced to a kludge. Luckily stackoverflow provides: Is it possible to select sql server data using column ordinal position, and we can construct something like this:
SELECT NULL as name, NULL as status WHERE 1=0
UNION ALL
SELECT * FROM (VALUES %s) someanonymousname
This works by first creating an empty result-set with the proper column-names, and then concatenating the result-set from the VALUES
clause to it. This will produce a result-set that has the proper column-names, will work in sqlite3, and in postgres.
Now to bring this back to peewee:
values_query = """
(
--a trick to make an empty query result with two named columns, to more portably name the resulting
--VALUES clause columns (grr sqlite)
SELECT NULL as name, NULL as status WHERE 1=0
UNION ALL
SELECT * FROM (VALUES %s) someanonymousname
)
"""
values_query %= (values_question_marks,)
#unroll the parameters into one large list
#this is gonna look like ['Frank', 'wat', 'Frank', 'nooo', 'Joe', 'noooo' ...]
values_query_params = [value for values in new_status_updates for value in values]
#turn it into peewee SQL
values_query = SQL(values_query,*values_query_params)
data_query = (Person
.select(Person.id, SQL('values_list.status').alias('status'))
.from_(Person,values_query.alias('values_list'))
.where(SQL('values_list.name') == Person.name))
insert_query = StatusUpdate.insert_from([StatusUpdate.person, StatusUpdate.status], data_query)
print (insert_query)
insert_query.execute()
print ('StatusUpdate.select().count():',StatusUpdate.select().count())
Output:
StatusUpdate.select().count(): 4