Why is iterating a single-item list faster than iterating a long string? #Python #Cherrypy

Question

When using CherryPy, I ran into this comment: "strings get wrapped in a list because iterating over a single item list is much faster than iterating over every character in a long string." It is located at https://github.com/cherrypy/cherrypy/blob/master/cherrypy/lib/encoding.py#L223. I have done some research online, but I still don't fully understand the reason for wrapping response.body as [response.body]. Can anyone explain the details behind this design?

You mean in decoding? I still don't see how process(['sameblarblar']) would be faster than process('sameblarblar'). The answer, assuming the comment is right, must lie in the way 'process' handles its input. But specifically, what is it? – JoeyZhao Aug 02 '16 at 19:43
Another way to ask this question: in what case would the labor of 'going through the string A' be avoided by putting the string in a list? – JoeyZhao Aug 02 '16 at 19:52
To use the same example you gave: if you need to do `for thing in 'lotsofstuff': do this thing`, how would you benefit from doing `for stuff in ['lotsofstuff']:` instead? You still need to take care of `stuff`; there is no escape, only one extra step. – JoeyZhao Aug 02 '16 at 20:02

score 2 · Accepted Answer · answered Aug 01 '16 at 22:11

I think that code only makes sense if you recognize that prior to the code with that comment, self.body could be either a single string, or an iterable sequence that contains many strings. Other code will use it as the latter (iterating on it and doing string stuff with the items).

While it would technically work to let that later code loop over the characters of the single string, processing the data character by character is likely to be inefficient. So the code below the comment wraps the single string in a list, letting it be processed all at once as one chunk.
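As a rough illustration (a minimal sketch, not CherryPy's actual code; `normalize_body` and `count_chunks` are hypothetical names made up for this example), compare how many loop iterations each form produces:

```python
def normalize_body(body):
    """Hypothetical helper: always return an iterable of string chunks."""
    # A bare string is itself iterable, but it yields one character at a
    # time, so wrap it in a list to make it a single chunk.
    if isinstance(body, (str, bytes)):
        return [body]
    # Anything else is assumed to already be an iterable of chunks.
    return body


def count_chunks(body):
    """Count how many items a `for chunk in body:` loop would visit."""
    return sum(1 for _ in body)


text = "x" * 10_000

print(count_chunks(text))                  # one iteration per character
print(count_chunks(normalize_body(text)))  # one iteration for the whole string
```

Any per-chunk work inside the loop then runs once on the whole string instead of once per character.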

Blckknght

In the code, the body has to be a 'basestring' (cherrypy 3) or 'text_or_bytes' (cherrypy 5). So it's not comparing ['a','b','c'] or ['abc'] to 'a', it's just comparing ['a'] and 'a' – JoeyZhao Aug 02 '16 at 19:46
The other places that use `self.body` in the file all do `for chunk in self.body:`. That is, it loops over the body and expects to get `chunk`s that are strings (it usually tests that the line is text on the first line of the loop). The operations it does on each chunk are likely to be more efficient if there's just one large chunk than if there are many one-character chunks. It may not be a huge performance difference though, so I don't know that I'd invest too much effort in learning about this implementation detail of that code. – Blckknght Aug 02 '16 at 20:00
Thanks a lot. I think this is probably the only explanation. – JoeyZhao Aug 03 '16 at 19:26
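
To get a feel for the size of the difference discussed in the comments, here is a small timing sketch. It assumes the per-chunk work is a simple `encode` call, which is only a stand-in for whatever the real loop does; `encode_chunks` is a hypothetical helper, and the numbers will vary by machine.

```python
import timeit


def encode_chunks(body, encoding="utf-8"):
    """Encode each chunk, mimicking a `for chunk in body:` loop."""
    return [chunk.encode(encoding) for chunk in body]


text = "x" * 100_000

# Bare string: the loop sees 100,000 one-character chunks.
per_char = timeit.timeit(lambda: encode_chunks(text), number=10)

# Wrapped string: the loop sees a single 100,000-character chunk.
one_chunk = timeit.timeit(lambda: encode_chunks([text]), number=10)

print(f"per character: {per_char:.4f}s")
print(f"single chunk:  {one_chunk:.4f}s")
```

Both calls produce the same bytes once joined; only the number of calls into `encode` differs.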
