Strings, bytes, runes and characters

26-11-2018
Amura is written in Go, so the concepts explained here are the same as, or very similar to, Go's. This is a very brief adaptation of the Go blog post [Strings, bytes, runes and characters in Go](https://blog.golang.org/strings).

In Go, a string is in effect a read-only slice of bytes. It's important to state right up front that a string holds arbitrary bytes. It is not required to hold Unicode text, UTF-8 text, or any other predefined format. As far as the content of a string is concerned, it is exactly equivalent to a slice of bytes.

An important consequence of this slice-like design is that creating a substring is very efficient. All that needs to happen is the creation of a two-word string header. Since the string is read-only, the original string and the string resulting from the slice operation can safely share the same array.

If you run:

```typescript
let a = "会意字";
fmt.println(a.length)
fmt.println(a.runeCount)
```

Prints:

```bash
9
3
```

Iterating by index
---

```typescript
for (let i = 0, l = a.length; i < l; i++) {
    fmt.printf("%x ", a[i])
}
```

Prints:

```
e4 bc 9a e6 84 8f e5 ad 97
```

Iterating by runes
---

```typescript
for (let i = 0, l = a.runeCount; i < l; i++) {
    fmt.printf("%s ", a.runeAt(i))
}
```

Prints:

```
会 意 字
```

As in Go, indexing a string accesses individual bytes, not characters (most of the string functions are thin wrappers around the corresponding Go functions):

```typescript
fmt.println(a.indexOf("意"))
```

Prints:

```
3
```

So if you try to print the second character, knowing that its index is 3:

```typescript
fmt.println(a.substring(3, 4))
```

You get a single byte that is not valid UTF-8 on its own, so it prints as the replacement character:

```
�
```

To print the character (a 'rune' constant in Go):

```typescript
fmt.println(a.runeAt(3))
```

You get the rune:

```
意
```
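
For comparison, the byte-versus-rune behaviour described above can be reproduced in plain Go. This is only a sketch using the Go standard library, not Amura's implementation; the helper `runeAtByteOffset` is a hypothetical name introduced for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// runeAtByteOffset is a hypothetical helper (not part of Go or Amura):
// it decodes the rune that starts at byte offset off in s.
func runeAtByteOffset(s string, off int) rune {
	r, _ := utf8.DecodeRuneInString(s[off:])
	return r
}

func main() {
	a := "会意字"

	fmt.Println(len(a))                    // length in bytes: 9
	fmt.Println(utf8.RuneCountInString(a)) // length in runes: 3

	// Indexing yields raw bytes, one at a time.
	for i := 0; i < len(a); i++ {
		fmt.Printf("%x ", a[i])
	}
	fmt.Println()

	// range decodes UTF-8: i is the byte offset where each rune starts.
	for i, r := range a {
		fmt.Printf("%d:%c ", i, r)
	}
	fmt.Println()

	// strings.Index returns a byte offset, not a character position.
	i := strings.Index(a, "意")
	fmt.Println(i) // 3

	// A one-byte slice of a three-byte character is not valid UTF-8.
	fmt.Println(utf8.ValidString(a[i : i+1])) // false

	// Decoding from the byte offset recovers the whole character.
	fmt.Println(string(runeAtByteOffset(a, i))) // 意
}
```

Each of the three characters occupies three bytes in UTF-8, which is why the byte length is 9 and the second character starts at byte offset 3.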