65

I'm referring to this: http://docs.python.org/tutorial/datastructures.html

What would be the running time of the list.index(x) method in terms of Big O notation?

John Smith
user734027
  • 3
    You can find the source code for the index operation [here](https://github.com/python/cpython/blob/ee171a26c1169abfae534b08acc0d95c6e45a22a/Modules/arraymodule.c#L1123). It is `O(n)`. – Mike Lane Oct 15 '18 at 21:08
  • It's way faster than I was expecting. `%%timeit` said 2.2ns whereas fetching an attribute via an ORM (warm queryset) was 80ns. – Kermit Mar 31 '20 at 03:34

6 Answers

78

It's O(n). Also check out: http://wiki.python.org/moin/TimeComplexity

This page documents the time-complexity (aka "Big O" or "Big Oh") of various operations in current CPython. Other Python implementations (or older or still-under development versions of CPython) may have slightly different performance characteristics. However, it is generally safe to assume that they are not slower by more than a factor of O(log n)...

gnat
Zach Kelling
  • 1
    Just to add: since the index algorithm can be applied to `list` or other data structures, it is implemented as a linear search, hence `O(n)`. – Krishna Oza May 08 '15 at 10:11
  • 2
    Do you happen to know if there is a specific reason it has not been implemented as a binary search instead? It does not sound overly complex, yet would be way more efficient. – bmiselis Aug 17 '20 at 06:54
  • 1
    There's no guarantee that the list is sorted, so a binary search would not work. – Vib Oct 11 '20 at 09:42
  • In the doc that you shared, Get Item for a list is O(1). – user7356972 Feb 21 '21 at 12:32
  • It'd be great if `binary=True` or `sorted=True` were an argument one could provide – duhaime Sep 10 '21 at 21:00
13

According to said documentation:

list.index(x)

Return the index in the list of the first item whose value is x. It is an error if there is no such item.

Which implies searching. You're effectively doing x in s but rather than returning True or False you're returning the index of x. As such, I'd go with the listed time complexity of O(n).
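
The search described above can be sketched in pure Python. This is only an illustration of the idea, not CPython's actual code (which is written in C), and the helper name `linear_index` is made up for this example. In the worst case the loop visits every element, which is where the O(n) comes from.

```python
def linear_index(items, x):
    """Return the index of the first element equal to x (illustrative only)."""
    for i, item in enumerate(items):
        if item == x:
            return i
    # Mirror list.index, which raises ValueError when x is absent.
    raise ValueError(f"{x!r} is not in list")


s = [11, 22, 33, 22]
# Both calls report the first matching position.
print(linear_index(s, 22), s.index(22))
```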

4

Any list implementation is going to have an O(n) complexity for a linear search (e.g., list.index). Although maybe there are some wacky implementations out there that do worse...

You can improve lookup complexity by using different data structures, such as sorted lists or sets. However, these data structures put constraints on the elements they contain. Sorted containers are often implemented with balanced binary trees: the elements need to be orderable, but the lookup cost goes down to O(log n). Python's built-in set is a hash table instead: the elements need to be hashable, and a membership test is O(1) on average.
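
As a rough sketch of those alternatives in Python: the standard `bisect` module gives an O(log n) lookup on a list that is assumed to be already sorted, and a `set` gives average O(1) membership tests (without positions). The helper name `sorted_index` is hypothetical.

```python
import bisect


def sorted_index(sorted_items, x):
    """Binary-search lookup; assumes sorted_items is sorted in ascending order."""
    i = bisect.bisect_left(sorted_items, x)
    if i < len(sorted_items) and sorted_items[i] == x:
        return i
    raise ValueError(f"{x!r} is not in list")


data = [11, 22, 33, 44, 55, 66, 77]  # already sorted
print(sorted_index(data, 44))

# A set answers "is it there?" but does not store positions.
lookup = set(data)
print(44 in lookup)
```

If the list is not sorted, sorting it first costs O(n log n), so this only pays off when many lookups are made against the same data.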

As mentioned previously, look here for run time costs of standard Python data structures: http://wiki.python.org/moin/TimeComplexity

Alex Smith
-3

Try this code to measure the time taken by the lis.index method.

import timeit

lis = [11, 22, 33, 44, 55, 66, 77]
for i in lis:
    # Time a lookup of each element in turn; globals=globals() lets timeit see lis and i
    t = timeit.Timer("lis.index(i)", globals=globals())
    time_taken = t.timeit(number=100000)
    print(i, time_taken)
Box Box Box Box
-3

The documentation provided above did not cover list.index()

From my understanding, list.index is an O(1) operation. Here is a link if you want to know more: https://www.ics.uci.edu/~pattis/ICS-33/lectures/complexitypython.txt

A story-teller
  • 2
    You are mistaken. The "Index" that your link is talking about is the same as Get Item in the python.org wiki. You can see in the [cpython source code](https://github.com/python/cpython/blob/ee171a26c1169abfae534b08acc0d95c6e45a22a/Modules/arraymodule.c#L1123) that the index method is doing an O(n) search of the list. – Mike Lane Oct 15 '18 at 21:07
-3

Use the following code to check the timing. Its complexity is O(n).

import time


class TimeChecker:

    def __init__(self, name):
        self.name = name

    def __enter__(self):
        self.start = self.get_time_in_sec()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        now = self.get_time_in_sec()
        time_taken = now - self.start  # in seconds
        print("Time Taken by " + self.name + ": " + str(time_taken))

    def get_time_in_sec(self):
        # perf_counter() returns fractional seconds from a monotonic clock
        return time.perf_counter()


def test_list_index_func(range_num):
    lis = [1,2,3,4,5]
    with TimeChecker('Process 1') as tim:
        for i in range(range_num):
            lis.index(4)

test_list_index_func(1000)
test_list_index_func(10000)
test_list_index_func(100000)
test_list_index_func(1000000)

print("Time: O(n)")
  • 5
    This code fails to prove that `list.index` operates in linear time. It does not compare how long `list.index` takes to run on varying input sizes; it simply runs `list.index` multiple times. Even if you were computing 1+1, computing it a thousand times would take 1000x longer than computing it once. To check this, I tested your code with binary search, which should be O(log n), and with accessing an element of the list, which should be O(1). Both of them, naturally, took 10x longer with each call of `test_list_index_func`, which looks like linear growth even though neither operation is linear. – Rohan Apr 19 '20 at 10:40
  • This code makes no sense. Time complexity is usually expressed as a function of the input size. The inputs you are giving to `list.index` never change, it's always the same list and the same search value. When you call `test_list_index_func` with increasing values, it isn't changing the input size to `list.index`, it's just changing the number of times that the exact same operation is performed. Doing that will always show a linear pattern, for any deterministic function called with the same inputs (which, again, is what you're doing here). – Z4-tier Apr 02 '23 at 07:08
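
Following the point made in these comments, a benchmark needs to vary the size of the list itself rather than the number of repetitions. The sketch below is one way to do that; the function name `time_worst_case_index` and the chosen sizes are assumptions for illustration. It searches for the last element so that every call scans the whole list.

```python
import timeit


def time_worst_case_index(n, repeats=200):
    """Time `repeats` worst-case lookups in a list of length n."""
    lis = list(range(n))
    target = n - 1  # the last element forces a full scan
    return timeit.timeit(lambda: lis.index(target), number=repeats)


# The repeat count stays fixed; only the input size changes.
for n in (1_000, 10_000, 100_000):
    print(n, time_worst_case_index(n))
```

On a typical machine the reported times grow roughly tenfold with each step, which is what O(n) predicts. Exact numbers depend on hardware and Python version.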