Cpython String Addition Optimisation Failure Case
Solution 1:
In the list based approach, string from index 0 of the list is taken and modified before being put back to the list at index 0.
For this short moment interpreter still has the old version of string in the list and can't perform in place modification.
If you take a look at Python's source then you'll see that there is no support for modifying the element of the list in place. So the object (string in this case) has to be retrieved from the list, modified and then put back.
In other words list
type is completely agnostic of the str
type support for +=
operator.
And consider the following code:
l = ['abc', 'def']
def nasty():
global l
l[0] = 'ghi'
l[1] = 'jkl'return'mno'
l[0] += nasty()
The value of l
is ['abcmno', 'jkl']
which proves that 'abc'
was taken from the list, then nasty()
got executed modifying the contents of the list, strings 'abc'
and 'mno'
got concatenated and result was assigned to l[0]
. If nasty()
was evaluated before accessing l[0]
to modify it in place, then the result would be 'ghimno'
.
Solution 2:
So why is the list version creating two references?
In l[0] += ' '
, one reference is in l[0]
. One reference is created temporarily to perform the +=
on.
Here are two simpler functions to show the effect:
>>>deff():... l = ['']... l[0] += ' '...>>>defg():... s = ''... s += ' '...
Disassembling them gives
>>> from dis import dis
>>> dis(f)
20 LOAD_CONST 1 ('')
3 BUILD_LIST 16 STORE_FAST 0 (l)
39 LOAD_FAST 0 (l)
12 LOAD_CONST 2 (0)
15 DUP_TOPX 218 BINARY_SUBSCR
19 LOAD_CONST 3 (' ')
22 INPLACE_ADD
23 ROT_THREE
24 STORE_SUBSCR
25 LOAD_CONST 0 (None)
28 RETURN_VALUE
>>> dis(g)
20 LOAD_CONST 1 ('')
3 STORE_FAST 0 (s)
36 LOAD_FAST 0 (s)
9 LOAD_CONST 2 (' ')
12 INPLACE_ADD
13 STORE_FAST 0 (s)
16 LOAD_CONST 0 (None)
19 RETURN_VALUE
In f
, the BINARY_SUBSCR
(slicing) instruction puts l[0]
at the top of the VM stack. DUP_TOPX
duplicates the top n items on the stack. Both functions (see ceval.c
) increase the reference count; DUP_TOPX
(DUP_TOP_TWO
in Py3) does it directly, while BINARY_SUBSCR
uses PyObject_GetItem
. So, the reference count of the string is now at least three.
g
doesn't have this problem. It creates one additional reference when the item is pushed with LOAD_FAST
, giving a refcount of two, the minimal number for an item on the VM stack, so it can do the optimization.
Post a Comment for "Cpython String Addition Optimisation Failure Case"