Browsing through some JDK source code, I found the following implementations of Collection.toArray() and toArray(T[]) in AbstractCollection:
    public Object[] toArray() {
        // Estimate size of array; be prepared to see more or fewer elements
        Object[] r = new Object[size()];
        Iterator<E> it = iterator();
        for (int i = 0; i < r.length; i++) {
            if (! it.hasNext()) // fewer elements than expected
                return Arrays.copyOf(r, i);
            r[i] = it.next();
        }
        return it.hasNext() ? finishToArray(r, it) : r;
    }
    public <T> T[] toArray(T[] a) {
        // Estimate size of array; be prepared to see more or fewer elements
        int size = size();
        T[] r = a.length >= size ? a :
                  (T[])java.lang.reflect.Array
                  .newInstance(a.getClass().getComponentType(), size);
        Iterator<E> it = iterator();
        for (int i = 0; i < r.length; i++) {
            if (! it.hasNext()) { // fewer elements than expected
                if (a == r) {
                    r[i] = null; // null-terminate
                } else if (a.length < i) {
                    return Arrays.copyOf(r, i);
                } else {
                    System.arraycopy(r, 0, a, 0, i);
                    if (a.length > i) {
                        a[i] = null;
                    }
                }
                return a;
            }
            r[i] = (T)it.next();
        }
        // more elements than expected
        return it.hasNext() ? finishToArray(r, it) : r;
    }
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
    @SuppressWarnings("unchecked")
    private static <T> T[] finishToArray(T[] r, Iterator<?> it) {
        int i = r.length;
        while (it.hasNext()) {
            int cap = r.length;
            if (i == cap) {
                int newCap = cap + (cap >> 1) + 1;
                // overflow-conscious code
                if (newCap - MAX_ARRAY_SIZE > 0)
                    newCap = hugeCapacity(cap + 1);
                r = Arrays.copyOf(r, newCap);
            }
            r[i++] = (T)it.next();
        }
        // trim if overallocated
        return (i == r.length) ? r : Arrays.copyOf(r, i);
    }
    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError
                ("Required array size too large");
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }
However, since I am now 'default-implementing' a Collection interface, I wonder whether these methods could be made more readable as follows, without adding iterations or array creations and without generally worse performance:
    @Override
    default Object[] toArray() {
        Object[] result = new Object[size()];
        Iterator<E> it = iterator();
        int i = 0;
        while (it.hasNext()) {
            if (i == result.length) // more objects than expected, resize
                result = Arrays.copyOf(result, newSize(i));
            result[i++] = it.next();
        }
        if (i != result.length) // trim array
            result = Arrays.copyOf(result, i);
        return result;
    }
    @Override
    default <T> T[] toArray(T[] a) {
        int size = size();
        T[] result = size > a.length ? (T[]) Array.newInstance(a.getClass().getComponentType(), size) : a;
        Iterator<E> it = iterator();
        int i = 0;
        while (it.hasNext()) {
            if (i == result.length) // more objects than expected, resize
                result = Arrays.copyOf(result, newSize(i));
            result[i++] = (T) it.next();
        }
        if (i != result.length)
            if (result == a) // set next element to null
                a[i] = null;
            else // trim array
                result = Arrays.copyOf(result, i);
        return result;
    }
    private static int newSize(int currentSize) {
        if (currentSize == Integer.MAX_VALUE)
            throw new OutOfMemoryError("Required array size too large");
        int newSize = currentSize + (currentSize >> 1) + 1;
        if (newSize < 0)
            // overflow - exceed MAX_ARRAY_SIZE from AbstractCollection only if inevitable
            return Math.max(Integer.MAX_VALUE - 8, currentSize + 1);
        // clip at MAX_ARRAY_SIZE from AbstractCollection
        return Math.min(newSize, Integer.MAX_VALUE - 8);
    }
I know that 'simplifying' is always subjective to a certain degree, but I am referring especially to the many levels of indentation and to the extra mental step required when finishToArray(T[], Iterator<?>) is called in the JDK implementation; maybe you get the point.
Since this code was written by Neal Gafter and Joshua Bloch, it would be quite presumptuous and premature to simply call my solution better, so I'd like to have your feedback.
Also, my code is not tested, so if you find special cases that result in bugs, I'd like to know about them as well.
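To make "special cases" concrete, here is a minimal sketch of a fixture that could be used to provoke them. It assumes a hypothetical class, MisreportingCollection (not part of the JDK or of my code), whose size() deliberately disagrees with what its iterator yields, and it runs the JDK's AbstractCollection implementation as a baseline. The outputs in the comments are what I expect from the JDK code quoted above.

```java
import java.util.AbstractCollection;
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical test fixture: size() reports one number, the iterator yields another.
final class MisreportingCollection extends AbstractCollection<Integer> {
    private final int reported; // what size() claims
    private final int actual;   // how many elements the iterator really yields

    MisreportingCollection(int reported, int actual) {
        this.reported = reported;
        this.actual = actual;
    }

    @Override
    public int size() {
        return reported;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < actual;
            }

            @Override
            public Integer next() {
                if (!hasNext()) throw new NoSuchElementException();
                return next++;
            }
        };
    }

    public static void main(String[] args) {
        // size() too large: the result is trimmed to the elements actually seen.
        System.out.println(Arrays.toString(new MisreportingCollection(5, 3).toArray()));
        // prints [0, 1, 2]

        // size() too small: the result grows past the initial estimate.
        System.out.println(Arrays.toString(new MisreportingCollection(2, 5).toArray()));
        // prints [0, 1, 2, 3, 4]

        // size() larger than the caller's array, but the real elements fit into it.
        Integer[] a = {-1, -1, -1, -1};
        Integer[] out = new MisreportingCollection(5, 2).toArray(a);
        System.out.println(out == a);             // prints true
        System.out.println(Arrays.toString(out)); // prints [0, 1, null, -1]
    }
}
```

Running the same three calls against an alternative implementation and comparing the results, including whether the returned array is the caller's own instance, would be one way to spot behavioural differences.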
Edit, since a detailed description of what has changed was requested:
The main simplification was replacing the for loop from 0 to r.length (which is the initial size()) with a while loop driven by the collection's Iterator, so that the loop ends only when there really are no elements left, which makes a separate finishToArray(..) method unnecessary.
Also, the rather complex body of the for loop in toArray(T[] a) has been reduced from several if branches containing return statements to a single if branch that resizes the array when necessary.
These are the reasons why I wonder whether the JDK implementation offers a certain advantage, or whether mine is wrong in some way I haven't thought about.
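As a side check on the resizing arithmetic, the following minimal sketch copies newSize from my code above into a standalone class and prints the capacities it produces. GrowthSketch and growthSequence are names made up for this illustration only. The values in the comments follow from working through the arithmetic by hand.

```java
import java.util.ArrayList;
import java.util.List;

final class GrowthSketch {
    // Copied unchanged from the newSize helper in the question.
    static int newSize(int currentSize) {
        if (currentSize == Integer.MAX_VALUE)
            throw new OutOfMemoryError("Required array size too large");
        int newSize = currentSize + (currentSize >> 1) + 1;
        if (newSize < 0)
            // overflow: exceed Integer.MAX_VALUE - 8 only if inevitable
            return Math.max(Integer.MAX_VALUE - 8, currentSize + 1);
        // clip at Integer.MAX_VALUE - 8
        return Math.min(newSize, Integer.MAX_VALUE - 8);
    }

    // Hypothetical helper: the capacities reached by resizing `steps` times from `start`.
    static List<Integer> growthSequence(int start, int steps) {
        List<Integer> capacities = new ArrayList<>();
        int cap = start;
        for (int k = 0; k < steps; k++) {
            cap = newSize(cap);
            capacities.add(cap);
        }
        return capacities;
    }

    public static void main(String[] args) {
        // Roughly 1.5x growth for small arrays.
        System.out.println(growthSequence(0, 6));           // prints [1, 2, 4, 7, 11, 17]
        // int overflow in the 1.5x step: clipped to Integer.MAX_VALUE - 8.
        System.out.println(newSize(1_500_000_000));         // prints 2147483639
        // Already at the soft limit: grow by exactly one element.
        System.out.println(newSize(Integer.MAX_VALUE - 8)); // prints 2147483640
    }
}
```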