django.db.utils.OperationalError: parser stack overflow

Question

I have a re-usable django application which should support python2.7, python 3.x and pypy. I developed it in python 2.7 at the beginning and all of my tests are worked very well. I also made them worked in python3.3 too. But I have a problem with python3.4, pypy, pypy3;

django.db.utils.OperationalError: parser stack overflow

My tests run on sqlite3. I check the trace, I just could guess that it is about query size. I couldn't find any solution to solve this problem.

I overrided builtin sqlite3 code in my python3.4 environment to pring which sql, does it raise the error for. It is really big sql. It really doesn't matter, you don't need to check the sql, I just post it in here to show how it is big. This can also be bigger than that. Because queryset is building on runtime in for loop.

By the way, as I told it before, There is no problem with python2.7 and python3.3. Problem raises for others.

Are there any configuration to handle this?

Here is the sql:

SELECT "river_approvement".
"id", "river_approvement".
"content_type_id", "river_approvement".
"object_id", "river_approvement".
"field", "river_approvement".
"meta_id", "river_approvement".
"transactioner_id", "river_approvement".
"transaction_date", "river_approvement".
"status", "river_approvement".
"skip", "river_approvement".
"order", "river_approvement".
"enabled"
FROM "river_approvement"
INNER JOIN "river_approvementmeta"
ON("river_approvement".
    "meta_id" = "river_approvementmeta".
    "id") INNER JOIN "river_transition"
ON("river_approvementmeta".
    "transition_id" = "river_transition".
    "id") WHERE("river_approvement".
    "field" = ? AND "river_transition".
    "source_state_id"
    IN(SELECT AB0.
        "id"
        FROM "river_state"
        AB0 WHERE AB0.
        "id"
        IN(SELECT AA2.
            "destination_state_id"
            FROM "river_approvement"
            AA0 INNER JOIN "river_approvementmeta"
            AA1 ON(AA0.
                "meta_id" = AA1.
                "id") INNER JOIN "river_transition"
            AA2 ON(AA1.
                "transition_id" = AA2.
                "id") WHERE(AA0.
                "field" = ? AND AA2.
                "source_state_id"
                IN(SELECT Z0.
                    "id"
                    FROM "river_state"
                    Z0 WHERE Z0.
                    "id"
                    IN(SELECT Y2.
                        "destination_state_id"
                        FROM "river_approvement"
                        Y0 INNER JOIN "river_approvementmeta"
                        Y1 ON(Y0.
                            "meta_id" = Y1.
                            "id") INNER JOIN "river_transition"
                        Y2 ON(Y1.
                            "transition_id" = Y2.
                            "id") WHERE(Y0.
                            "field" = ? AND Y2.
                            "source_state_id"
                            IN(SELECT X0.
                                "id"
                                FROM "river_state"
                                X0 WHERE X0.
                                "id"
                                IN(SELECT W2.
                                    "destination_state_id"
                                    FROM "river_approvement"
                                    W0 INNER JOIN "river_approvementmeta"
                                    W1 ON(W0.
                                        "meta_id" = W1.
                                        "id") INNER JOIN "river_transition"
                                    W2 ON(W1.
                                        "transition_id" = W2.
                                        "id") WHERE(W0.
                                        "field" = ? AND W2.
                                        "source_state_id"
                                        IN(SELECT V0.
                                            "id"
                                            FROM "river_state"
                                            V0 WHERE V0.
                                            "id"
                                            IN(SELECT U2.
                                                "destination_state_id"
                                                FROM "river_approvement"
                                                U0 INNER JOIN "river_approvementmeta"
                                                U1 ON(U0.
                                                    "meta_id" = U1.
                                                    "id") INNER JOIN "river_transition"
                                                U2 ON(U1.
                                                    "transition_id" = U2.
                                                    "id") WHERE(U0.
                                                    "field" = ? AND U2.
                                                    "source_state_id"
                                                    IN( ? ) AND U0.
                                                    "object_id" = ? AND U0.
                                                    "content_type_id" = ? ))) AND W0.
                                        "object_id" = ? AND W0.
                                        "content_type_id" = ? ))) AND Y0.
                            "object_id" = ? AND Y0.
                            "content_type_id" = ? ))) AND AA0.
                "object_id" = ? AND AA0.
                "content_type_id" = ? ))) AND "river_approvement".
    "object_id" = ? AND "river_approvement".
    "content_type_id" = ? )

Do you see the diffenence on the same system where you compiled more Python versions from source or it is on different systems with the different '/usr/lib/libsqlite3.so*`? — hynekcer, Aug 31 '15 at 09:12

score 0 · Answer 1 · answered Aug 31 '15 at 11:24

The default sqlite3 parser stack size is 100 lexical items. They think "it is likely to be well beyond the ability of any human to comprehend". I see many nested levels in your example: 15 parenheses, 9 "SELECT", 9 "WHERE", 9 "IN", 5 "AND". That is the minimum of necessary terms that I can imagine on the stack with any possible parser gramatics. Maybe also joins or something invisible is counted into, but the size 100 makes sense. Sqlite3 can be recompiled with the option YYSTACKDEPTH=-1 or a big positive number (-1 is a dynamic depth stack starting with 100 and doubled everytimes), but it is not a solution for a reusable public application.

It can be improved a little by moving complicated terms (sub-select) before AND as the first term of the boolean expression. It can be done by:

MyModel.objects. \
    filter(meta__transition__destination_state_id__in=
         MyModel.objects.filter(...)
    ).filter(field=...)

It can improve the stack size really a little by 5 "AND" in 100 items.

You probaby use a filter expression with more conditions in one filter: filter(field_1=value_1, field_2=value_2) which is the same as filter(**{'field_1': value_1, 'field_2': value_2}). The order of items in the dictionary depends on the implementation of hash function in the respective CPython version or even on the implemetation details of dictionary itself on pypy. This is why the exception may be raised only with some Python versions. If you add a similarly complicated additional subselect, every Python must raise the exception.

The result SQL used by Django queryset can be easily examined without running it:

print(my_complicated_queryset.query.get_compiler('default').as_sql())

'default' is the connection alias name in settings.DATABASES.

django.db.utils.OperationalError: parser stack overflow

1 Answers1

Linked