0

Is there a library or class that extends or conforms to the list API and allows slicing in constant time?

With a list in Python, lst[1:] copies the sublist, which takes O(n) time. I need an alternative on which all list operations such as max, min, len, set and get behave as expected, but where slicing is O(1): the slice should reuse the original underlying list and only adjust the indexing and the length.

Is there an off-the-shelf class for this, or do I have to write my own?

mkrieger1
Ofek Ron
  • Does this answer your question? [Can I create a "view" on a Python list?](https://stackoverflow.com/questions/3485475/can-i-create-a-view-on-a-python-list) – mkrieger1 Mar 12 '21 at 12:56
  • What exactly do you mean by "with manipulation on indexing and length"? – mkrieger1 Mar 12 '21 at 12:58
  • @mkrieger1 see my answer to understand what I mean by manipulating indices – Ofek Ron Mar 13 '21 at 10:27
  • "With a list in Python, lst[1:] will copy the sublist which takes O(n) time." - if what you're worried about is the time cost of the copy, then stop worrying about that. Most of the operations you want to perform on the slice already take O(n) time themselves, and using a custom view class will most likely be slower than just using a normal slice, due to all the additional interpreter overhead. (Lists and list slicing are implemented in C, while your custom view will not be.) – user2357112 Mar 13 '21 at 10:34
  • @user2357112supportsMonica you are wrong. If you work with a large dataset and slice it a lot, then you pay a lot of overhead for the copies, both memory-wise and time-wise. Sometimes it makes sense to copy and sometimes it doesn't; it depends on the use case – Ofek Ron Mar 13 '21 at 10:38
  • @OfekRon: The class you wrote in your answer is over 100 times slower than regular slicing when I test it. You're not getting a speedup here. With the way interpreter overhead works in Python, making copies is rarely a bottleneck, and paying extra interpreter overhead to avoid a copy is almost never worth it. – user2357112 Mar 13 '21 at 10:47
  • @user2357112supportsMonica wrong again... For a 1-million-integer list with the slice `slice(None, None, -14)`: for slicing, SliceList is faster than list by a factor of 201.29690; for max, list is faster than SliceList by a factor of 34.96295. If you don't mind the slowdown on iteration (for example, if you know you don't iterate much but do a lot of slicing), then it would make sense to use SliceList. Proper examples of use cases for this are queues, stacks and windowing, although these are covered by other libraries and I am not sure that using SliceList for them is better than using those libraries – Ofek Ron Mar 13 '21 at 10:50
  • @OfekRon: The slicing timing difference is meaningless - what matters is how long it takes to perform the overall task. With a list, if you do `min(l[1:])`, the slice takes a small fraction of the time and the `min` takes most of it. If you speed up the fast part by a factor of 200, but slow down the slow part by a factor of 35, the overall time taken is **much slower**. You could make the slicing instantaneous and it wouldn't be enough to overcome how much slower you made the rest of the job. – user2357112 Mar 13 '21 at 11:01
  • (My "over 100 times slower" claim comes from [this test](https://ideone.com/iBRVmV) I ran, with a few zeros deleted from your original test sizes to make things finish faster. I dunno where your numbers come from.) – user2357112 Mar 13 '21 at 11:04
  • Also see [this test](https://ideone.com/8SoPs5), where slicing takes about 1/6 the time in a `min(l[1:])` computation, and using your `SliceList` slows down the overall computation by a factor of about 50. Interpreter overhead is **massive**. It is way more significant than the cost of a C-level copy. – user2357112 Mar 13 '21 at 11:07
  • @user2357112supportsMonica you are right, retrieval and iteration are much less efficient and can reduce the overall performance significantly, but if you do much more slicing than iterating it may pay off. I guess iteration and retrieval could get better if we took it down to the C level, but still, if you do more iterating than slicing then you had better use lists – Ofek Ron Mar 13 '21 at 11:40
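The measurement discussed in these comments can be approximated with a short sketch like the one below. It is not the code behind the linked tests; the list size and repeat count are arbitrary assumptions, and the numbers will differ from machine to machine. It times the slice alone and then the slice together with `min`, so the share of the copy in the overall task becomes visible.

```python
import timeit

# Arbitrary test size (an assumption, not taken from the linked tests)
data = list(range(100_000))

# Time only the copy made by the slice
slice_time = timeit.timeit(lambda: data[1:], number=100)

# Time the slice plus the min() scan over the copied list
total_time = timeit.timeit(lambda: min(data[1:]), number=100)

print(f"slice only:  {slice_time:.4f}s")
print(f"slice + min: {total_time:.4f}s")
print(f"slice share: {slice_time / total_time:.0%}")
```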

2 Answers

0

Since slicing a range takes constant time, you could write a generator function that yields the list elements at a sliced range of indexes:

def getSlice(L,start,stop=None,step=None):
    yield from (L[i] for i in range(len(L))[start:stop:step])

L = [*range(100,200)]
print(*getSlice(L,-10,-15,-2)) # 190 188 186
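The reason this works is that slicing a `range` returns another `range` computed from start, stop and step, without materializing any indexes. A minimal sketch to illustrate this (the variable names `big` and `view` are only for this example):

```python
big = range(10**12)        # a trillion indexes, none of them stored
view = big[-10:-15:-2]     # slicing a range returns another range

print(type(view).__name__)  # range
print(list(view))           # [999999999990, 999999999988, 999999999986]

# Slices of slices stay ranges, so indexing remains arithmetic only
print(big[5::3][1000])      # 3005
```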

You could also write a wrapper class of your own that turns list subscripts into views on the original list rather than copies of sublists:

class ListSlice:
    def __init__(self,L,indexes=None): 
        self.content  = L
        self._indexes = indexes

    @property
    def indexes(self):
        if isinstance(self._indexes,slice):
            return range(len(self.content))[self._indexes]
        if self._indexes is None:
            return range(len(self.content))  
        return self._indexes 
    
    @indexes.setter
    def indexes(self,value): self._indexes = value
        
    def __len__(self):  return len(self.indexes)
    def __iter__(self): yield from (self.content[i] for i in self.indexes)
                           
    def __getitem__(self,index):
        if isinstance(index,slice):
            return ListSlice(self.content,self.indexes[index])
        return self.content[self.indexes[index]]

    def __setitem__(self,index,value):
        i = self.indexes[index] 
        if isinstance(index,slice):
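            if isinstance(i,range) and i.step > 0:
                # only a forward range maps safely onto a native slice: a reversed
                # range can stop at -1, which a slice would read as "last item"
                self.content[slice(i.start,i.stop,i.step)] = value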
            else:
                for j,v in zip(i,value): self.content[j] = v 
        else:
            self.content[i] = value

usage:

L = [*range(100,200)]

# subscript slices will be views that can be iterated (without copying)
L = ListSlice(L)      
print(*L[30:40:2]) # 130 132 134 136 138 
print(L[32])       # 132

# view on odd indexes (no copy)
Lodds = ListSlice(L,slice(1,None,2))  # or Lodds = ListSlice(L)[1::2]   
print(*Lodds[-3:])  # 195 197 199

# re-slice on sliced list (without stacking overhead)
L39 = Lodds[3:9]     
print(*L39) # 107 109 111 113 115 117 # L39 is a new ListSlice on L

# arbitrary subset of indexes
Lfibo = ListSlice(L,[1,2,3,5,8,13,21,34,55,89])
print(*Lfibo) # 101 102 103 105 108 113 121 134 155 189

# dynamically changing slicing/indexes
A  = [32,56,4,98,29,15]
sA = ListSlice(A)
sA.indexes = sorted(sA.indexes,key=lambda i:A[i])
print(*sA)  # 4 15 29 32 56 98
print(A)    # [32, 56, 4, 98, 29, 15]

# slice assignments affect the underlying list (L)
Lodds[:3] = [-1]*3     
print(*Lodds[:5])  # -1 -1 -1 107 109
print(*L[:10])     # 100 -1 102 -1 104 -1 106 107 108 109   
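If the view should also answer list-style calls such as `in`, `index`, `count` and `reversed`, one option is to derive it from `collections.abc.Sequence`, which supplies those methods on top of `__getitem__` and `__len__`. The following is a minimal read-only sketch under that assumption; `ListView` is a hypothetical name for this example, not a library class, and the view takes the length of the list at creation time.

```python
from collections.abc import Sequence

class ListView(Sequence):
    """Read-only view on a list (hypothetical name, for illustration only)."""

    def __init__(self, data, indexes=None):
        self._data = data
        # the index range is fixed here; later growth of data is not seen
        self._indexes = range(len(data)) if indexes is None else indexes

    def __len__(self):
        return len(self._indexes)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # slicing a range is O(1) and yields another range
            return ListView(self._data, self._indexes[index])
        return self._data[self._indexes[index]]

data = [*range(100, 200)]
odds = ListView(data)[1::2]

print(len(odds), odds[0], odds[-1])   # 50 101 199
print(max(odds), min(odds))           # 199 101
print(105 in odds, odds.index(105))   # True 2
print(list(reversed(odds[:3])))       # [105, 103, 101]
```

The mixin methods are written in Python and go through `__getitem__` one item at a time, so the speed caveat about interpreter overhead applies here as well.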

Note that the indirection through indexes, in Python code, will be considerably slower than native list slicing (despite the memory copy that list slicing performs internally). The only benefit would be in extreme cases where total memory usage matters and the slices are enormous. Even then, the new list returned by slicing only consumes 8 bytes per item (one pointer on a 64-bit CPython build), no matter what the list items are. That is because a list stores pointers to its items, whether they are mutable or immutable: slicing copies the pointers, not the objects they reference, so the slice and the original list share the same item objects.
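The point about pointers can be checked with a small sketch. It assumes CPython, and the exact value reported by `sys.getsizeof` depends on the build, so no specific number is claimed here.

```python
import sys

original = [[i] for i in range(1000)]   # 1000 separate inner list objects
sliced = original[1:]                    # new outer list holding 999 pointers

print(sliced is original)        # False: the outer list is a copy
print(sliced[0] is original[1])  # True: the items themselves are shared

# Mutating a shared item is visible through both lists
sliced[0].append("x")
print(original[1])               # [1, 'x']

# Size of the outer list only (header plus pointers, not the items)
print(sys.getsizeof(sliced))
```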

Alain T.
  • 'itertools.islice' object is not subscriptable when trying to get with index – Ofek Ron Mar 12 '21 at 13:51
  • I need to work with the sliced list as if it were a list object; islice doesn't let me get items by index, so it's no good – Ofek Ron Mar 12 '21 at 13:53
  • @OfekRon Constant-time indexing is made possible by the presence of the items in memory. You have to pay the linear cost sometime: either upfront when you make the indexible data structure, or on demand when you need access to the particular element. – chepner Mar 12 '21 at 14:23
  • `itertools.islice` is **not** a solution. `itertools` is an iterator library, and `itertools.islice` works purely in terms of iteration. That means that if you have a million-element list `l` and you do `l[-2:]`, that only has to retrieve 2 elements, but if you use `itertools.islice(l, 999998, 1000000)`, the islice iterator has to iterate over **999998** list elements you don't care about before getting to the elements you actually wanted. – user2357112 Mar 13 '21 at 10:24
  • For a million-element list, retrieving the last two elements with `islice` is [about 25000 times slower](https://ideone.com/6VgmJg) than using an actual slice. – user2357112 Mar 13 '21 at 10:31
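The behaviour described in these comments can be made visible by counting how many items `islice` pulls from its source. This is a small sketch with a hypothetical helper class `CountingIter` and a reduced list size of 1000; it is not the code from the linked test.

```python
from itertools import islice

class CountingIter:
    """Iterator wrapper that counts the items pulled from it (illustrative helper)."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self.pulled = 0

    def __iter__(self):
        return self

    def __next__(self):
        item = next(self._it)   # StopIteration propagates to the caller
        self.pulled += 1
        return item

data = list(range(1000))

counter = CountingIter(data)
last_two = list(islice(counter, 998, 1000))

print(last_two)         # [998, 999]
print(counter.pulled)   # 1000: every earlier item was stepped over first
print(data[-2:])        # [998, 999], read directly by position
```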
-1

Well, I didn't find any acceptable answer, so I implemented what I needed myself. I hope this can be useful to the community.

Here is the implementation of SliceList

It's important to note that a SliceList can be sliced any number of times and will still work on the original list rather than making copies. Note that iterating a SliceList gets less and less efficient as you add more levels of slicing (even the first level is much less efficient than using a list directly), but if your use case involves few iterations and retrievals and much more slicing, then slicing a SliceList instead of a list may pay off, especially for long lists.

This solution is better than a list when you don't mind spending much more time on iterating and retrieving values in order to save time on slicing.

Ofek Ron