Why is building a list from user input and printing its contents much slower in PyPy than CPython?

Question

I was coding for a problem in CodeForces, and I submitted this code to run in PyPy:

import math
a=[]
b=[]
t=int(input())
for i in range(t):
    n=float(input())
    a.append(math.floor(n))
    b.append(math.ceil(n))
l=0-sum(a)
i=0
while i<len(a):
    if l>0 and a[i]!=b[i]:
        print(b[i])
        l-=1
    else:
        print(a[i])
    i+=1

However, I was given a "time limit exceeded" verdict, with the execution taking over 1 second.

The same code ran in under 600 ms when run by the CPython interpreter.

From what I understand, PyPy is usually faster than Python. Why would CPython be faster for this code?

@juanpa.arrivillaga Yes, that's what I meant. I am relatively new to Python. Thanks though. I edited my Post. — Susanta Mukherjee, Jun 28 '19 at 18:04
Many SO readers will have no idea what your abbreviations mean. I'm assuming they generally come from the Code Forces community. — jpmc26, Jun 29 '19 at 07:54
Given the answer you received, I think reviewing this specific case is instructive and useful. To bring the question within the on-topic guidelines, I've revised it to be specific about what's going on with this piece of code rather than asking for general advice, and I've included the code directly. Please review what I've written for mistakes, and note that including your code here means it's licensed under Creative Commons. If this is acceptable to you, notify me and I will nominate the question for reopening. If not, my apologies for presuming about the licensing. — jpmc26, Jun 29 '19 at 22:42
@jpmc26 Thanks for editing my post! Including my Code here is absolutely fine with me. And I will remember these points when I post my next question. Thanks — Susanta Mukherjee, Jun 30 '19 at 13:55

score 4 · Accepted Answer · answered Jun 29 '19 at 07:39

Welcome to Stack Overflow! In short, PyPy loses to CPython in this case because the Python code we are running is not really computing much; instead, the time is spent doing input/output (first a loop of input(), then a loop of print()). This is likely where most of the time goes. PyPy's input/output routines are not as well optimized as CPython's, which is why it is somewhat slower. As a rule of thumb, you can expect PyPy to win over CPython, sometimes massively, when the code you wrote spends its time doing computations in Python.
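If the per-call cost of input() and print() is the bottleneck, one common workaround is to read all input in a single call and write all output in a single call. The answer itself does not prescribe this. Below is a minimal sketch, assuming the same input format as the question's code (a count followed by that many numbers). The function name solve and its stream parameters are illustrative, and whether this is fast enough for a given judge is not something the sketch guarantees.

```python
import io
import math


def solve(stdin, stdout):
    """Same logic as the question's code, with batched input and output."""
    # One read instead of one input() call per number.
    tokens = stdin.read().split()
    t = int(tokens[0])
    values = [float(tok) for tok in tokens[1:1 + t]]

    floors = [math.floor(v) for v in values]
    ceils = [math.ceil(v) for v in values]

    # Number of values that must be rounded up so the results sum to zero.
    remaining = -sum(floors)

    out = []
    for lo, hi in zip(floors, ceils):
        if remaining > 0 and lo != hi:
            out.append(hi)
            remaining -= 1
        else:
            out.append(lo)

    # One write instead of one print() call per number.
    stdout.write("".join(f"{x}\n" for x in out))


# Demonstration with in-memory streams; the sample values are made up.
demo_out = io.StringIO()
solve(io.StringIO("4\n4.58413\n1.22491\n-2.10517\n-3.70387\n"), demo_out)
print(demo_out.getvalue(), end="")
```

In a real submission you would call solve(sys.stdin, sys.stdout) after import sys, instead of using the in-memory streams.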

The opposite of "doing computations in Python" is sometimes called "running library code"---this includes things like input/output, or more generally anything where a single Python function call invokes quite a lot of C code. Note that, counter-intuitively, this also includes doing arithmetic on very, very large integers, because that requires a lot of C code for every single operation. The opposite extreme example would be doing arithmetic on "small" integers, up to sys.maxsize, because the PyPy JIT can map every operation directly to one CPU instruction.
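To see the small-integer versus large-integer difference for yourself, you can time the same pure-Python loop on both kinds of values under each interpreter. This is a rough sketch, not a rigorous benchmark: the function name sum_of_squares, the loop size, and the offset are arbitrary choices, and the "small" case assumes a typical 64-bit build. No timings are shown because they depend on the machine and interpreter version.

```python
import sys
import timeit


def sum_of_squares(n, offset=0):
    """Pure-Python arithmetic loop; offset controls how large the integers are."""
    total = 0
    for i in range(n):
        total += (i + offset) * (i + offset)
    return total


# Small integers: every intermediate value stays below sys.maxsize on 64-bit.
small = timeit.timeit(lambda: sum_of_squares(10_000), number=5)

# Very large integers: every operation goes through arbitrary-precision code.
huge_offset = sys.maxsize ** 4
big = timeit.timeit(lambda: sum_of_squares(10_000, huge_offset), number=5)

print(f"small ints: {small:.4f} s, big ints: {big:.4f} s")
```

Running the same file with both python3 and pypy3 lets you compare how each interpreter handles the two cases.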

In summary, PyPy is good where there is some time spent in pure Python---not necessarily all the time. For example, non-trivial pure-Python web servers tend to benefit a lot from PyPy: the raw socket input/output is indeed a bit slower, but all the logic to process the queries and build responses is much faster, and that's easily the major part of the execution time.
