I went through this post about Python string slicing (Reverse string: string[::-1] works, but string[0::-1] and others don't), but I still have some questions. Which takes precedence: the indices for the start/end positions, or the step? Which one is evaluated first?

I have tried a few things in Python, and the results do not look consistent to me. See the examples below. I have put my own analysis in brackets, which may be wrong, so please help me correct that as well.

txt="abcdef"

  1. txt[::-1] result = 'fedcba' (here the step is given preference, and as it is negative, it starts from the last index, which is 5 or -1, and goes on till the first character 'a')

  2. txt[0:0:-1] result = '' (seems the indices are evaluated first. So start at 0 but end before 0, which is not possible, hence no results. Step is not even evaluated.)

  3. txt[:0:-1] result = 'fedcb' (preference is given to step, and then the index is considered. So the first index becomes the last position 'f' and goes till one position before 'a')

  4. txt[0::-1] result = 'a' (I am surprised at this. It seems that the start index is given preference here. First index 0 is evaluated as 'a', then the step of '-1' is evaluated. But as nothing more can be accessed because of step '-1', no more characters are printed)

  5a. txt[0:1] result = 'a' (why? The default value of step is '1'. Seems the start index is evaluated first, 'a' is accessed, printed, and then it stops.)

  5b. txt[0:1:1] result = 'a' (same as 5a)

  5c. txt[0:1:-1] result = '' (why this? Compare this to 5a. Seems that in 5c, step is evaluated first and then the start index. If the start index had been evaluated first, at least 'a' should have been printed.)

  6. txt[0:-1:-1] result = '' (I was expecting 'a' based on preference being given to the start index)

  7. txt[0::-1] result = 'a' (now compare this with example 6 above. Why a different result this time? A blank end index is equivalent to reaching right till the end of the string, isn't it?)

  8. txt[:len(txt):-1] result = '' (compare this to 6 and 7)

  9. txt[:0:-1] result = 'fedcb' (seems step is given preference and then the expression is evaluated. And based on step '-1', the start index is evaluated as the last position 'f')

All of this has confused me to say the least.

quietboy
  • You're drastically over-complicating this. There is no precedence. There are simply two cases: (1) with a positive step and (2) with a negative step. In both cases, the index starts with the initial value, then it is compared with the final value. The comparison is determined by the step: `<` for positive and `>` for negative. If the test succeeds, the loop body is executed, the step is added to the value, and the test is repeated. There is no more to it than that. – Tom Karzes Jun 16 '19 at 18:42
  • The only other thing to keep in mind is that, in a slice, a negative initial or final value refers to a position that is offset from the right. So `-1` is equivalent to `len(v) - 1`, `-2` is equivalent to `len(v) - 2`, etc. – Tom Karzes Jun 16 '19 at 18:44
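
The point about negative values explains the difference between cases 6 and 7 in the question. A small check, using the question's own txt (the names n, case6 and case7 are only for illustration):

```python
txt = "abcdef"
n = len(txt)

# A negative start or end counts from the right: -k means len(txt) - k.
assert txt[-1] == txt[n - 1] == "f"
assert txt[-2:] == txt[n - 2:] == "ef"

# Case 6: an end of -1 is position 5, not "one step before index 0".
# Starting at 0 and moving backwards, 0 > 5 is already false, so nothing is taken.
case6 = txt[0:-1:-1]
assert case6 == txt[0:n - 1:-1] == ""

# Case 7: a blank end with a negative step means "run past index 0".
case7 = txt[0::-1]
assert case7 == "a"
```

So a blank end and an end of -1 are not interchangeable when the step is negative.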

2 Answers

The default numerical values used for the three slots (whether they are left blank or given as None) are

  1. start: 0, or len(x)-1 if the step is negative
  2. end: len(x), or -len(x)-1 if the step is negative
  3. step: 1

The strange default for the end slot when moving backwards is to defeat the addition of len(x) for negative indices.

If the step is unspecified, it “takes precedence” only in that its default influences that of the other two.

Having supplied all values, the slice is defined as however many elements (possibly 0) you get by starting at the start value and stopping when at or past the end value (“past” also defined by the sign of the step).
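
As a quick check of those defaults, here is a minimal sketch that writes them out explicitly for the question's txt and compares the result with the blank form (forward and backward are just illustrative names):

```python
txt = "abcdef"
n = len(txt)

# Positive step: the defaults are start=0 and end=len(txt).
forward = txt[0:n:1]
assert forward == txt[::] == "abcdef"

# Negative step: the defaults are start=len(txt)-1 and end=-len(txt)-1.
backward = txt[n - 1:-n - 1:-1]
assert backward == txt[::-1] == "fedcba"

# Writing -1 as the end does not work, because len(txt) is added to it first,
# which turns it into 5 and leaves an empty range.
assert txt[n - 1:-1:-1] == ""

# None in a slot behaves exactly like a blank slot.
assert txt[None:None:-1] == txt[::-1]
```

The third assertion is the reason the backwards default for the end slot has to be -len(x)-1 rather than -1.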

Davis Herring
  • Change my implementation below with your statements. It'll fail for `txt[0:-1:-1]` and cases not mentioned by OP such as `txt[:-1:-2]` and `txt[-2::-2]`. – Thomas Weller Jun 16 '19 at 20:03
  • @ThomasWeller: I’m sorry, but I don’t understand—are you asking me to change something, and if so in which answer? – Davis Herring Jun 16 '19 at 20:10
from dis import dis
dis("txt[::-1]")

gives us

  1           0 LOAD_NAME                0 (txt)
              2 LOAD_CONST               0 (None)
              4 LOAD_CONST               0 (None)
              6 LOAD_CONST               2 (-1)
              8 BUILD_SLICE              3
             10 BINARY_SUBSCR
             12 RETURN_VALUE

so we can see that the default values for start and end are actually None. This means that they are computed internally, and the rules might be complex. However, they aren't...
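
The same None defaults can also be observed without reading bytecode. A minimal sketch, assuming a throwaway class (ShowKey and probe are hypothetical names, not part of any library) whose `__getitem__` simply returns the key it receives:

```python
class ShowKey:
    """Hypothetical helper: hands back whatever object the subscript passes in."""

    def __getitem__(self, key):
        return key


probe = ShowKey()

# Blank slots arrive as None inside a slice object.
assert probe[::-1] == slice(None, None, -1)
assert probe[0::-1] == slice(0, None, -1)
assert probe[0:1] == slice(0, 1, None)

# Passing the slice object explicitly gives the same result as the literal syntax.
txt = "abcdef"
assert txt[::-1] == txt[slice(None, None, -1)] == "fedcba"
```

It is then up to the object being sliced (here, str) to turn those None values into concrete positions.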

Consider the following (quite simple) implementation:

from typing import Optional

def impl(txt: str, start: Optional[int] = None, end: Optional[int] = None, step: int = 1) -> str:
    print("txt[{}:{}:{}] = ".format("" if start is None else start, "" if end is None else end, step), end="")
    if step == 0:
        # Built-in slicing rejects a zero step; without this check the loop below would never end.
        raise ValueError("slice step cannot be zero")
    # Handle the default values and negative indices:
    if start is None:
        start = 0 if step > 0 else len(txt) - 1
    elif start < 0:
        start = len(txt) + start
    if end is None:
        end = len(txt) if step > 0 else -1
    elif end < 0:
        end = len(txt) + end

    def compare(index: int, end: int, step: int) -> bool:
        # Moving forwards, continue while before end; moving backwards, while after it.
        return index < end if step > 0 else index > end

    # Compute the result
    result = ""
    index = start
    while compare(index, end, step):
        result += txt[index]
        index += step
    print(result)
    return result

which passes for all of the cases from the question, plus a few others:

txt = "abcdef"
# 1.
assert (txt[::-1] == "fedcba" == impl(txt, None, None, -1))
# 2.
assert (txt[0:0:-1] == "" == impl(txt, 0, 0, -1))
# 3.
assert (txt[:0:-1] == "fedcb" == impl(txt, None, 0, -1))
# 4.
assert (txt[0::-1] == "a" == impl(txt, 0, None, -1))
# 5a
assert (txt[0:1] == "a" == impl(txt, 0, 1))
# 5b
assert (txt[0:1:1] == "a" == impl(txt, 0, 1, 1))
# 5c
assert (txt[0:1:-1] == "" == impl(txt, 0, 1, -1))
# 6.
assert (txt[0:-1:-1] == "" == impl(txt, 0, -1, -1))
# 7.
assert (txt[0::-1] == "a" == impl(txt, 0, None, -1))
# 8.
assert (txt[:len(txt):-1] == "" == impl(txt, None, len(txt), -1))
# 9.
assert (txt[:0:-1] == "fedcb" == impl(txt, None, 0, -1))
# Some others
assert (txt[::2] == "ace" == impl(txt, None, None, 2))    
assert (txt[::-2] == "fdb" == impl(txt, None, None, -2))
assert (txt[:-1:-2] == "" == impl(txt, None, -1, -2))
assert (txt[-2::-2] == "eca" == impl(txt, -2, None, -2))
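
Note that the implementation above does not clamp out-of-range bounds such as txt[0:100]. Python already exposes its own normalisation through the built-in `slice.indices(length)`, so as a cross-check here is a sketch (slice_by_indices is a hypothetical helper name) that rebuilds each slice from those normalised values and compares it with real slicing over a range of in-range and out-of-range bounds:

```python
def slice_by_indices(txt, start=None, end=None, step=None):
    """Hypothetical helper: rebuild txt[start:end:step] from explicit indices."""
    # slice.indices() resolves None, negative and out-of-range values for this length.
    first, stop, stride = slice(start, end, step).indices(len(txt))
    return "".join(txt[i] for i in range(first, stop, stride))


txt = "abcdef"
bounds = [None] + list(range(-8, 9))  # includes values beyond both ends of txt
checked = 0
for start in bounds:
    for end in bounds:
        for step in (None, -3, -2, -1, 1, 2, 3):
            assert slice_by_indices(txt, start, end, step) == txt[start:end:step]
            checked += 1

# The normalised values for two of the cases discussed here:
assert slice(0, None, -1).indices(len(txt)) == (0, -1, -1)   # blank end, moving backwards
assert slice(0, 100).indices(len(txt)) == (0, 6, 1)          # end clamped to len(txt)
```

The loop is the same start/compare/step idea as above; the only difference is that the bounds come from Python itself.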
Thomas Weller