1

In Java, let's say I have an integer array: int[] a = {1, 2, 3, 4, 5};. If I want to change an element in the array, I can do so by changing the data at a certain address in memory: a[2] = 9; => {1, 2, 9, 4, 5}.

In Clojure, I can have a vector: (def a [1 2 3 4 5]). How do I change an element at a certain position in the vector with a guaranteed worst case time complexity of O(1)? I have read that the assoc keyword has O(1) average time complexity, but this is not what I am looking for. Also, I looked at transient vectors, but I have yet to find a good and simple example of updating a vector in O(1).

Amit
  • 45,440
  • 9
  • 78
  • 110
ljeabmreosn
  • 970
  • 1
  • 9
  • 26
  • You already know that arrays have O(1) update, and vectors have average O(1) update - both data types can be used from Clojure, is there even a question to answer here? – noisesmith Jun 11 '16 at 19:42
  • @noisesmith Yes, there is a question. Is there a primitive Clojure data structure that has the performance as mentioned above? Also, as mentioned above, as average time complexity is useful, in this case it is not. http://stackoverflow.com/questions/17048076/big-o-of-clojure-library-functions. This is table shows that assoc is really O(logn) (base 32). It also shows that no Clojure data structure has true O(1) worst case access time or update time. – ljeabmreosn Jun 11 '16 at 19:51
  • No primitive. Clojure is a java library, if you want the java data structures you can just use them in your Clojure code. – noisesmith Jun 11 '16 at 20:20

2 Answers2

5

Clojure vectors, in their current implementation, have a complexity of O(log_32(n))(logarithm in base 32). Mathematically, O(log_32(n)) is the same thing as O(log(n)). The differences are not theoretical but practical:

  • AFAICT, because a Clojure persistent vector cannot contain more than, say, 4 billion elements, log_32(n) will never be bigger than 7. So, in a way, it is constant time :). This accounts for the fact that the branching factor in the Persistent Hash Tries is 32 (unlike many tree data-structures in which the branching factor is 2).
  • constant factors matter! An update on a Java array will be much faster than on a persistent vector. You should not be thinking about asymptotic complexity, but wondering whether this constant factor is too big for you. If it is, have a look at transients.
Valentin Waeselynck
  • 5,950
  • 26
  • 43
4

In Clojure a vector is a specific data type - it is immutable and has O(log_32(n)) update, returning a new vector that will usually share structurally with the original.

If you need O(1) update, you can use an Array, all of Java's data types are usable from Clojure. I use into here just to have an easy to read representation in the repl - each usage creates a new vector, so you probably want to avoid this in your performance sensitive code.

+user=> (def a (into-array Object [:a 1 :b "m" (java.util.Date.)]))
#'user/a
+user=> a
#object["[Ljava.lang.Object;" 0x456d6c1e "[Ljava.lang.Object;@456d6c1e"]
+user=> (into [] a)
[:a 1 :b "m" #inst "2016-06-11T20:28:05.230-00:00"]
+user=> (aset a 2 'new-contents)
new-contents
+user=> (into [] a)
[:a 1 new-contents "m" #inst "2016-06-11T20:28:05.230-00:00"]

There are a variety of functions defined for working with arrays in Clojure.

Valentin Waeselynck
  • 5,950
  • 26
  • 43
noisesmith
  • 20,076
  • 2
  • 41
  • 49