12

Why does substring take an index as its first parameter, while the second parameter looks like a length counted from the beginning?

In other words

1   2   3 | 4   5 <=== Length from beginning

A   B   C   D   E

0 | 1   2   3   4 <=== Index

If I want substring() to return BC, I have to call "ABCDE".substring(1,3);

Why is this the case?

EDIT: What are the benefits of making the end index exclusive?

shinjw
  • I believe it's a start index and an end index? – Evan Knowles Oct 29 '14 at 13:03
  • Because its intuitive. – kai Oct 29 '14 at 13:12
  • 1
    I'm asking why... when asked this question, I don't want to just tell whoever asks that "it's just the way it's done." – shinjw Oct 29 '14 at 13:15
  • I know, and my comment clearly answers your question. Your question: "Why did the developers design this method like this?" My answer: "Because they thought it's intuitive for other developers who use this method." – kai Oct 29 '14 at 13:20
  • 2
    "*What are the benefits of making the end index exclusive?*" this lets us use substring in a way `"ABCDE".substring(start,start+length);`. Take a look at your example: to get `BC` you could try using variables like `int start = 1; int length = 2` which by using mentioned formula would be the same as `"ABCDE".substring(1,3);`. – Pshemo Oct 29 '14 at 13:41
  • 1
    I think selecting between two indices is dumb anyway. Much more intuitive to have a *start* index (inclusive) and a **length** (not an end index). – Lakey Oct 29 '14 at 18:22
  • Subjective or not, I think this is a valid question. Consider changing the accepted answer from @StephenC's non-answer to one of the other answers that actually do try to give a logical explanation. – Suragch May 10 '15 at 02:53
  • @kai I think it is the least intuitive option they had when they made this method. It would be much more intuitive to make it "inclusive" like most methods, or to use "startIndex" and "length" as the previous comment suggests. – MarBVI Feb 22 '16 at 19:51
  • It's because it starts at the begin index and stops when it reaches the end index, without reading it. It is just a loop that tests for less-than instead of less-than-or-equal. – IgniteCoders Sep 11 '20 at 01:13
  • 1
    to inflict pain on mathematicians – Stephen Oct 04 '20 at 05:44

4 Answers

12

The question about the "Why" may be considered philosophical or academic, and may provoke answers along the lines of "That's just the way it is".

However, from a more general, abstract point of view, it is a valid question, when considering the alternatives: One could imagine two forms of this method:

String substringByIndices(int startIndex, int endIndex);

and

String substringByLength(int startIndex, int length);

In both cases, there is another dimension in the design space, namely whether the indices are inclusive or exclusive.

First of all, note that all versions are basically equivalent. At the call site, it's usually trivial to change the call according to the actual semantics of the method:

int startIndex = ...;
int endIndex = ...;
String s = string.substringByLength(startIndex, endIndex-startIndex);

or

int startIndex = ...;
int length = ...;
String s = string.substringByIndices(startIndex, startIndex+length);

The choice of whether the indices are inclusive or exclusive will add some potential for having to fiddle around with a +1 or -1 here and there, but this is not important here.

The second example already shows why the choice to use an inclusive start index and an exclusive end index might be a good idea: It's easy to slice out a substring of a certain length, without having to think about any +1 or -1:

int startIndex = 12;
int length = 34;
String s = string.substringByIndices(startIndex, startIndex+length);

// One would expect this to yield "true". If the end index
// was inclusive, this would not be the case...
System.out.println(s.length() == length); 
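The length invariant can be checked against the real `String.substring`. The following is only a sketch for illustration: `substringInclusive` is a hypothetical helper (it is not a JDK method) that mimics what an inclusive end index would behave like, and the sample string and indices are arbitrary.

```java
final class SliceLengthDemo {
    // Hypothetical alternative, NOT part of the JDK: the end index is inclusive.
    static String substringInclusive(String s, int startIndex, int endIndexInclusive) {
        return s.substring(startIndex, endIndexInclusive + 1);
    }

    public static void main(String[] args) {
        String string = "ABCDEFGHIJ";
        int startIndex = 2;
        int length = 3;

        // Exclusive end (actual JDK behavior): the result has exactly `length` chars.
        String exclusive = string.substring(startIndex, startIndex + length);
        System.out.println(exclusive + " " + (exclusive.length() == length)); // CDE true

        // Inclusive end (hypothetical): the same arguments yield one char too many,
        // so the caller would have to pass startIndex + length - 1 instead.
        String inclusive = substringInclusive(string, startIndex, startIndex + length);
        System.out.println(inclusive + " " + (inclusive.length() == length)); // CDEF false
    }
}
```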

This can also be seen as being in line with idioms like for-loops, where you usually have

for (int i=startIndex; i<endIndex; i++) { ... }

The start is inclusive, and the end is exclusive. This choice thus fits nicely with the usual, idiomatic language patterns.
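To make the correspondence concrete, here is a small sketch in which the half-open loop and `substring` receive the same two numbers. The helper name `collect` is made up for this illustration.

```java
final class LoopBoundsDemo {
    // Collects characters with the idiomatic half-open loop:
    // startIndex is visited, endIndex is not.
    static String collect(String s, int startIndex, int endIndex) {
        StringBuilder sb = new StringBuilder();
        for (int i = startIndex; i < endIndex; i++) {
            sb.append(s.charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "ABCDE";
        // The same bounds work unchanged for the loop and for substring.
        System.out.println(collect(s, 1, 3));                        // BC
        System.out.println(collect(s, 1, 3).equals(s.substring(1, 3))); // true
    }
}
```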


However, regardless of which choice is made, and regardless of how it is justified: It is important to be consistent throughout the whole API.

For example, the List interface contains a method subList(int, int):

List<E> subList(int fromIndex, int toIndex)

Returns a view of the portion of this list between the specified fromIndex, inclusive, and toIndex, exclusive.

which is in line with this convention. If you had to mix APIs where the end index is sometimes inclusive and sometimes exclusive, this would be error-prone.
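As a rough illustration of that consistency, the sketch below passes the same `(from, to)` pair to `String.substring`, `List.subList` and `Arrays.copyOfRange`, all of which treat `from` as inclusive and `to` as exclusive. The class and method names are invented for the example.

```java
import java.util.Arrays;
import java.util.List;

final class HalfOpenConsistencyDemo {
    // Applies the same half-open range to a String, a List and an array.
    static String slices(int from, int to) {
        String text = "ABCDE";
        List<Character> list = Arrays.asList('A', 'B', 'C', 'D', 'E');
        char[] array = text.toCharArray();

        String fromString = text.substring(from, to);
        List<Character> fromList = list.subList(from, to);
        char[] fromArray = Arrays.copyOfRange(array, from, to);

        return fromString + " " + fromList + " " + new String(fromArray);
    }

    public static void main(String[] args) {
        System.out.println(slices(1, 3)); // BC [B, C] BC
    }
}
```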

Community
Marco13
5

It's a start and an end index.

To me this seems very logical, however if you prefer you can think of it in terms of start and length using a very simple calculation:

"ABCDEFGH".substring(start, start + length);

It allows you this flexibility.

StuPointerException
3

It's not so much the "length from the start" but "end index exclusive".

The reason is obvious if you look at how those two numbers work with code to create a substring by copying characters from one array to another.

Given:

int start; // inclusive
int end; // exclusive
char[] string;

Now look how easy it is to use those numbers when copying elements of an array:

char[] substring = new char[end - start];
for (int i = start; i < end; i++)
    substring[i - start] = string[i];

Notice how there is no adjusting by adding/subtracting 1 - the numbers are just what you need for the loop. The loop could actually be coded without the subtraction too:

for (int i = start, j = 0; i < end; i++)
    substring[j++] = string[i];

Choosing those numbers is "machine friendly", which was the prevailing approach when the C language was designed, and Java's syntax and conventions are largely derived from C.
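For anyone who wants to run the loop above, here is a self-contained sketch. The method names are invented for the example; the second variant uses `System.arraycopy`, whose length argument is again simply `end - start`.

```java
final class ManualSubstringDemo {
    // The copy loop from above: start inclusive, end exclusive, no +1/-1 anywhere.
    static String copyRange(char[] string, int start, int end) {
        char[] substring = new char[end - start];
        for (int i = start, j = 0; i < end; i++) {
            substring[j++] = string[i];
        }
        return new String(substring);
    }

    // Same copy via System.arraycopy(src, srcPos, dest, destPos, length).
    static String copyRangeWithArraycopy(char[] string, int start, int end) {
        char[] substring = new char[end - start];
        System.arraycopy(string, start, substring, 0, end - start);
        return new String(substring);
    }

    public static void main(String[] args) {
        char[] chars = "ABCDE".toCharArray();
        System.out.println(copyRange(chars, 1, 3));              // BC
        System.out.println(copyRangeWithArraycopy(chars, 1, 3)); // BC
    }
}
```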

Bohemian
1

A rule of thumb when writing code is to take as many inputs from the consumer as possible. That makes it easier to produce the required output.

The source code is the answer: both parameters are indexes, a start and an end.

public String substring(int beginIndex, int endIndex) {
    if (beginIndex < 0) {
        throw new StringIndexOutOfBoundsException(beginIndex);
    }
    if (endIndex > count) {
        throw new StringIndexOutOfBoundsException(endIndex);
    }
    if (beginIndex > endIndex) {
        throw new StringIndexOutOfBoundsException(endIndex - beginIndex);
    }
    return ((beginIndex == 0) && (endIndex == count)) ? this :
        new String(offset + beginIndex, endIndex - beginIndex, value);
}

In simple terms, the two arguments just state from where to where you want the substring taken.
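The three bounds checks in that source can be observed from the outside. The sketch below is only an illustration with an invented helper name; it catches `IndexOutOfBoundsException`, the superclass of `StringIndexOutOfBoundsException`.

```java
final class SubstringBoundsDemo {
    // Returns true if substring rejects the given pair of indexes.
    static boolean throwsOutOfBounds(String s, int beginIndex, int endIndex) {
        try {
            s.substring(beginIndex, endIndex);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String s = "ABCDE"; // length 5
        System.out.println(throwsOutOfBounds(s, -1, 3)); // true: beginIndex < 0
        System.out.println(throwsOutOfBounds(s, 0, 6));  // true: endIndex > length
        System.out.println(throwsOutOfBounds(s, 3, 1));  // true: beginIndex > endIndex
        System.out.println(throwsOutOfBounds(s, 0, 5));  // false: endIndex == length is allowed
        System.out.println(throwsOutOfBounds(s, 5, 5));  // false: empty substring at the end
    }
}
```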

Suresh Atta
  • Is there a rationale behind it? If done by index, why is the front index inclusive and the end index exclusive? – shinjw Oct 29 '14 at 13:07
  • I think that's just a normal CS convention. just as if it were a for loop `i = 0; i – LeatherFace Oct 29 '14 at 13:08
  • 1
    @shinjw Though the choice is largely arbitrary, it makes splitting a string into multiple parts more straightforward: `int startIdx = 0, midIdx = 5, endIdx=10; String first = foo.substring( startIdx, midIdx ); String second = foo.substring( midIdx, endIdx );` – biziclop Oct 29 '14 at 13:11
  • 2
    @shinjw If the start index is inclusive and the end index is exclusive, you can see at a glance what the length of the new string is: end index minus start index. – kai Oct 29 '14 at 13:14
  • @shinjw A rule of thumb when writing code is to take as many inputs from the consumer as possible. That makes it easier to produce the required output. – Suresh Atta Oct 29 '14 at 13:15
  • @shinjw: it also means value.substring(0, value.length()) is a valid method call. If the end index were inclusive, the above would fail, which would be unintuitive. – StuPointerException Oct 29 '14 at 13:17
  • 1
    @StuPointerException For every choice we can imagine scenarios where they seem more intuitive than the others. We feel this is more intuitive because we learnt to think like this. – biziclop Oct 29 '14 at 13:39
  • @biziclop - Fair point! – StuPointerException Oct 29 '14 at 13:41
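Two points from the comments above, splitting at a shared index and `substring(0, length())` being valid, can be tried out with a short sketch. The `splitAt` helper and the sample string are assumptions made for this example only.

```java
final class SplitDemo {
    // Splits s into two adjacent parts; midIdx ends the first part
    // and starts the second, with no +1/-1 adjustment.
    static String[] splitAt(String s, int midIdx) {
        return new String[] { s.substring(0, midIdx), s.substring(midIdx, s.length()) };
    }

    public static void main(String[] args) {
        String foo = "HelloWorld";
        String[] parts = splitAt(foo, 5);
        System.out.println(parts[0] + " | " + parts[1]);            // Hello | World
        System.out.println((parts[0] + parts[1]).equals(foo));       // true: nothing lost or duplicated
        System.out.println(foo.substring(0, foo.length()).equals(foo)); // true: whole string
        System.out.println(foo.substring(3, 3).isEmpty());           // true: equal indexes give ""
    }
}
```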