You've successfully implemented a state-based word counting algorithm that handles spaces correctly! However, there's still a problem lurking in our code.
Our algorithm currently only checks for space characters (' '). But what about other types of whitespace?
"Hello\tWorld" // Tab character - should be 2 words
"Hello\nWorld" // Newline character - should be 2 words
"Hello\rWorld" // Carriage return - should be 2 words
Currently, our algorithm treats these characters as part of a word, so each of the inputs above is counted as a single word.
Now, we could add checks for each of these:
isSpace := x == ' ' || x == '\n' || x == '\t' || x == '\r' || ...
But this quickly becomes unwieldy. Unicode defines many more whitespace characters than these, such as the vertical tab, the form feed, and the non-breaking space.
This is where Go's standard library shines: the Go team has already solved this problem for us in a few different packages.
The bytes Package

The bytes package provides functions for manipulating byte slices, one of which is the Fields function:
func Fields(s []byte) [][]byte
According to the documentation, Fields interprets the byte slice as UTF-8-encoded text and splits it around each run of one or more consecutive whitespace characters, as defined by the unicode.IsSpace function. It returns a slice of subslices of the input, or an empty slice if the input contains only whitespace.
This is exactly what we need! Let's look at what it does:
input := []byte("Hello World")
words := bytes.Fields(input)
// words = [][]byte{[]byte("Hello"), []byte("World")}
// len(words) = 2
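The same call can be tried on the tab, newline, and carriage return inputs from earlier. The following is a minimal runnable sketch; the helper name fieldCount is an assumption made for this example only.

```go
package main

import (
	"bytes"
	"fmt"
)

// fieldCount is a hypothetical helper: it returns how many
// whitespace-separated fields bytes.Fields finds in s.
func fieldCount(s string) int {
	return len(bytes.Fields([]byte(s)))
}

func main() {
	inputs := []string{
		"Hello\tWorld",        // tab
		"Hello\nWorld",        // newline
		"Hello\rWorld",        // carriage return
		"  Hello \t\n World  ", // mixed runs, plus leading and trailing whitespace
		"   ",                 // whitespace only
	}
	for _, in := range inputs {
		fmt.Printf("%q -> %d\n", in, fieldCount(in))
	}
}
```

The first four inputs each yield 2 fields, and the whitespace-only input yields 0, because runs of whitespace are collapsed and leading or trailing whitespace produces no empty fields.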
Why bytes.Fields Over strings.Fields?

You might notice there's also a strings.Fields function. While it does the same thing, it works with strings:
func Fields(s string) []string
Since our countWords function receives []byte, using strings.Fields would require us to convert:
words := strings.Fields(string(data)) // Conversion overhead!
Converting between []byte and string isn't free: in general, it allocates a new value and copies the data into it. Since bytes.Fields works directly on the byte slice and returns subslices of it, it's more efficient for our use case.
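One way to see the difference is to write through a field returned by bytes.Fields and observe that the original data changes, while a string made from the data does not follow later changes. This is only an illustrative sketch; the helper name capitalizeFirstField is an assumption, and mutating fields like this is not something the word counter needs to do.

```go
package main

import (
	"bytes"
	"fmt"
)

// capitalizeFirstField is a hypothetical helper: it overwrites the first
// byte of the first field with 'J'. Because bytes.Fields returns subslices
// of its input, the write is visible through data as well.
func capitalizeFirstField(data []byte) {
	words := bytes.Fields(data)
	if len(words) > 0 {
		words[0][0] = 'J'
	}
}

func main() {
	data := []byte("Hello World")

	// Converting to a string copies the bytes at this moment.
	snapshot := string(data)

	capitalizeFirstField(data)

	fmt.Println(string(data)) // the write went through to the original slice
	fmt.Println(snapshot)     // the string copy still holds the old contents
}
```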
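Using bytes.Fields gives us correct handling of every whitespace character recognized by unicode.IsSpace, without converting our []byte input to a string first.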
Reimplement the word counting algorithm, this time using the bytes.Fields function:

- Import the bytes package
- Call bytes.Fields on the input data

The beauty of the standard library is that complex algorithms become single function calls. This is one of Go's greatest strengths for building practical applications!
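For comparison once you have tried it yourself, one possible version is sketched below. It assumes countWords takes a []byte and returns an int; the return type is an assumption, since only the parameter type is stated above.

```go
package main

import (
	"bytes"
	"fmt"
)

// countWords returns the number of whitespace-separated words in data.
// The int return type is assumed for this sketch.
func countWords(data []byte) int {
	// bytes.Fields splits on runs of whitespace, so the number of
	// fields it returns is the number of words.
	return len(bytes.Fields(data))
}

func main() {
	fmt.Println(countWords([]byte("Hello\tWorld\nfrom  Go")))
	fmt.Println(countWords([]byte("   ")))
}
```

The first call counts 4 words and the second counts 0, with no explicit state tracking left in our own code.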