str.characters returns a String.CharacterView, which presents a view onto the string's characters, allowing you to access them without having to copy the contents into a new buffer (whereas doing Array(str.characters) or str.characters.map{...} would do just that).
String.CharacterView itself is a Collection which is indexed by a String.CharacterView.Index (an opaque index type) and has elements (unsurprisingly) of type Character (which represents an extended grapheme cluster, generally what a reader would consider a ‘single character’ to be).
let str = "Hello"
// indexed by a String.Index (aka String.CharacterView.Index)
let indexOfO = str.characters.index(of: "o")!
// element of type Character
let o = str.characters[indexOfO]
// String.CharacterView.IndexDistance (the type used to offset an index) is of type Int
let thirdLetterIndex = str.characters.index(str.startIndex, offsetBy: 2)
// Note that although String itself isn't a Collection, it implements some convenience
// methods, such as index(after:) that simply forward to the CharacterView
let secondLetter = str[str.index(after: str.startIndex)]
The reason that it is indexed by a special String.CharacterView.Index rather than, for example, an Int is that characters can be encoded with different byte lengths. Subscripting is therefore potentially an O(n) operation (in the case of non-ASCII stored strings), because it requires iterating through the encoded string. However, subscripting with an Int naturally feels like it should be an O(1) operation (cheap, doesn’t require iteration).
str.characters[str.characters.index(str.characters.startIndex, offsetBy: n)] // feels O(n)
str.characters[n] // illegal, feels O(1)
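To make that cost visible, here is a minimal sketch of an offset-based lookup. The helper name character(atOffset:) is hypothetical (it is not part of the standard library), and it goes through String's own forwarding index methods rather than the character view directly. It walks from startIndex on every call, so it is O(n) by construction.

```swift
extension String {
    /// Hypothetical helper (not a standard library API): returns the character
    /// `n` positions from the start, or nil when `n` is out of range.
    /// Advancing the index walks the string, so this is O(n), not O(1).
    func character(atOffset n: Int) -> Character? {
        // A negative offset from startIndex would trap, so reject it up front.
        guard n >= 0,
              let i = index(startIndex, offsetBy: n, limitedBy: endIndex),
              i != endIndex
        else { return nil }
        return self[i]
    }
}

let greeting = "Hello"
let fifth = greeting.character(atOffset: 4)   // Optional("o")
let missing = greeting.character(atOffset: 5) // nil
```

Keeping the walk behind an explicitly named method, rather than an Int subscript, is one way to avoid suggesting that the lookup is cheap.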
How is it that I can enumerate into it so easily, or convert it to an array or map it but then printing it itself or even when indexed into it prints so gibberish
You are able to enumerate, convert to Array and map(_:) a String.CharacterView simply because it’s a Collection, and therefore conforms to Sequence, which allows for ... in looping as well as the use of map(_:) and the Array(_:) initializer, among other things.
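As a rough illustration of how much comes along with Collection conformance, the sketch below defines a small hypothetical type, Letters, that implements only the minimal requirements. It is Int-indexed purely for brevity (unlike the character view) and is not how the standard library implements anything; the point is only that looping, map(_:) and Array(_:) need no extra code.

```swift
/// Hypothetical type for illustration: a fixed run of letters exposed as a Collection.
struct Letters: Collection {
    private let storage: [Character]
    init(_ storage: [Character]) { self.storage = storage }

    // The minimal Collection requirements.
    var startIndex: Int { return storage.startIndex }
    var endIndex: Int { return storage.endIndex }
    func index(after i: Int) -> Int { return i + 1 }
    subscript(position: Int) -> Character { return storage[position] }
}

let letters = Letters(["H", "e", "l", "l", "o"])

// for ... in looping comes from the inherited Sequence conformance.
var joined = ""
for letter in letters {
    joined.append(letter)
}

// map(_:) and Array(_:) are available without any further work.
let asStrings = letters.map { String($0) }
let asArray = Array(letters)
```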
The reason printing out str.characters results in ‘gibberish’ is that it simply doesn’t provide its own custom textual representation via conformance to either CustomStringConvertible or CustomDebugStringConvertible.
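The effect of that conformance can be sketched with two hypothetical wrapper types (PlainLetters and DescribedLetters are made up for this example and only stand in for the character view). The first has no custom textual representation, so printing it falls back to a default description; the second supplies one through CustomStringConvertible.

```swift
/// Hypothetical wrapper with no custom textual representation.
struct PlainLetters {
    let storage: [Character]
}

/// The same wrapper, now conforming to CustomStringConvertible.
struct DescribedLetters: CustomStringConvertible {
    let storage: [Character]
    var description: String { return String(storage) }
}

let plain = PlainLetters(storage: ["H", "i"])
let described = DescribedLetters(storage: ["H", "i"])

print(plain)      // default output that exposes the type's internals
print(described)  // uses `description`, so this prints "Hi"
```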