Loading external data into a Go application is the first step toward processing it, and that starts with reading data from a file. After creating a words.txt
file containing five simple words, the lesson uses the os
package from Go’s standard library to read the file’s content. The os.ReadFile
function simplifies this by returning the file’s contents as a byte slice along with an error value. Proper error handling is emphasized, although the error is deliberately ignored here to keep the initial experiment simple. Converting the byte slice to a string produces human-readable output. Reading a file and converting its contents are the foundation for the next step: counting the number of words in the file.
Now that we have our initial project set up, the next thing we want to do is to actually load the data from a file into our code. To do so, we're going to need to have a file in order to load the data from. So let's go ahead and create one called words.txt
. I'm going to go ahead and use the echo
command that we saw in the last lesson. And for the contents of this file, I want it to contain five words that we'll eventually count. To make it easier, I'm going to go ahead and set these words to be one, two, three, four, five, which makes it easy to know how many words exist inside of this file. Then I'm going to go ahead and redirect the output of this command into a file called words.txt
as follows.
Now I want to go ahead and do ls
on my directory. You can see I have a words.txt
file available. And if I print the contents of it out, you can see it contains the one two three four five
string that we just echoed into this file. So far, so good.
Next up, we need to implement a way to load the contents of this file into our application, so that we can actually process the data we receive. There are a couple of ways to do so when it comes to Go. In order to read the contents of a file in Go, we need to make use of another package provided by the Go standard library. In fact, we're going to make use of this package quite a lot throughout the course. This is the os
package, which provides a platform-independent interface to operating system functionality. This means that the package works across different platforms with the same functions. So reading a file on Linux works pretty much the same way as it does on other Unix-like systems, such as macOS or BSD, and on Windows as well. There are a couple of platform-specific caveats, though, and we'll talk about those as we come across them throughout this course.
In any case, if we look at the documentation, you can see that the design is Unix-like, although the error handling is Go-like. Failing calls return values of type error
rather than error numbers, which is a good thing when it comes to programming languages. In any case, the os
package provides a lot of different functions for various operations when it comes to working with our host system. However, the function that we want at the moment is the ReadFile
function. If we read the documentation for it, we can see that ReadFile
reads the named file and returns the contents. This function takes a name, so the name of the file, as a string parameter, so we can pass in our file name. And it will return two values. The first is a slice of bytes, written as []byte, which will contain the contents of the file that we read in, in byte form. The second return value is an error, which will let us know if something went wrong with the file reading process, such as if the named file doesn't exist, or if we don't have the correct permissions to read from it. We'll be taking a look at both error handling and file permissions later on in this course.
For the meantime, however, we're actually going to go ahead and ignore this error, although this isn't good advice when it comes to working in Go. Typically, you pretty much want to handle every error as it comes in, but again, we'll talk about that more later on. For the meantime, however, let's go about adding the ReadFile
function to our code so that we can actually read the contents of our words.txt
file. Additionally, you can also see here there's an example on actually how we can use this. So let's go ahead and do so.
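To make the two return values a little more concrete, here is a minimal sketch that calls os.ReadFile once on a file that exists and once on one that doesn't. The describeRead helper, the scratch directory, the file contents and the missing.txt name are all assumptions made up for this illustration. The sketch also checks the error, which goes one step further than the lesson does at this point.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// describeRead calls os.ReadFile and reports which of the two return
// values carried the useful information.
func describeRead(name string) string {
	data, err := os.ReadFile(name)
	if err != nil {
		return "error: " + err.Error()
	}
	return fmt.Sprintf("read %d bytes", len(data))
}

func main() {
	// Hypothetical scratch directory, so the sketch does not depend on
	// any file already being present on disk.
	dir, err := os.MkdirTemp("", "readfile-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	present := filepath.Join(dir, "words.txt")
	if err := os.WriteFile(present, []byte("one two three four five\n"), 0o644); err != nil {
		panic(err)
	}

	// First call: the byte slice is filled in and the error is nil.
	fmt.Println(describeRead(present))
	// Second call: the file does not exist, so the error is non-nil.
	fmt.Println(describeRead(filepath.Join(dir, "missing.txt")))
}
```

The exact wording of the error message depends on the operating system, so it is not shown here.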
To begin, we first need to go ahead and import the os
package. When it comes to importing multiple packages in Go, whilst you can write a separate import statement on each line, this isn't the idiomatic way of doing so. Instead, you wrap your imports inside a pair of parentheses, with each import on its own line. This is the idiomatic approach and it's a little less typing than repeating the import keyword on every line. In any case, the os
package is now imported into our file, so we can go ahead and use the os
package and call the ReadFile
function which, as you can see, shows the same documentation that was provided on the Go doc website.
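As a small sketch of the grouped import form described here, the file below imports both fmt and os inside one pair of parentheses. The argCount helper and its use of os.Args are assumptions added only so that the os import is actually used, since Go will not compile a file with an unused import.

```go
package main

// Idiomatic form: one import declaration, one package per line.
// The non-idiomatic alternative would be two separate statements:
//   import "fmt"
//   import "os"
import (
	"fmt"
	"os"
)

// argCount exists only so that the os import is used somewhere.
func argCount() int {
	return len(os.Args)
}

func main() {
	fmt.Println("arguments:", argCount())
}
```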
In this case, we need to pass in the name of the file which is going to be our words.txt
and you can either specify words.txt
which will look in the current directory or you can be a little bit more explicit about this and do ./words.txt
which again will tell this to look in the current directory as denoted by the dot and then the slash to mark the end of the directory. As well as relative paths, you can also specify absolute paths if you want to such as /home/elliott/projects/counter/words.txt
. Although in this case, it's going to be a little bit redundant and it won't work for your own system. So let's just go ahead and do ./words.txt
which is where the file can be found.
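The three ways of spelling the path can be compared with a rough sketch like the one below. It creates a scratch directory, writes an assumed words.txt into it, moves into that directory and then reads the same file through a bare relative path, a ./ relative path and an absolute path. The sameContents helper, the scratch directory and the file contents are hypothetical, and the error checks go beyond what the lesson does at this stage.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
)

// sameContents reads one file through three different path spellings
// and reports whether every read returned identical bytes.
func sameContents() (bool, error) {
	// Hypothetical scratch directory standing in for the project folder.
	dir, err := os.MkdirTemp("", "paths-demo")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	content := []byte("one two three four five\n")
	if err := os.WriteFile(filepath.Join(dir, "words.txt"), content, 0o644); err != nil {
		return false, err
	}

	// Relative paths are resolved against the working directory, so move
	// into the scratch directory and restore the old one afterwards.
	previous, err := os.Getwd()
	if err != nil {
		return false, err
	}
	if err := os.Chdir(dir); err != nil {
		return false, err
	}
	defer os.Chdir(previous)

	bare, err := os.ReadFile("words.txt")
	if err != nil {
		return false, err
	}
	dotted, err := os.ReadFile("./words.txt")
	if err != nil {
		return false, err
	}
	absolute, err := os.ReadFile(filepath.Join(dir, "words.txt"))
	if err != nil {
		return false, err
	}
	return bytes.Equal(bare, dotted) && bytes.Equal(dotted, absolute), nil
}

func main() {
	same, err := sameContents()
	fmt.Println(same, err)
}
```

The point of the sketch is that a relative path only finds the file when the program runs from the directory that contains it, whereas an absolute path works from anywhere but is tied to one machine.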
Then we need to go ahead and capture the return values of this function. As I mentioned, we're actually going to go ahead and ignore the error and just capture the actual bytes. So let's go ahead and do so capturing them in a value called data
and we use the shorthand initializer syntax which is the :=
in order to both create the variable and assign the value to it. However, you'll notice that my text editor is actually producing an error, letting us know that we have an assignment mismatch: I'm assigning to one variable, but the os.ReadFile
function returns two values. This is because in Go, if a function returns multiple values, you can't just ignore some of them by leaving them out of the assignment. We could try to solve this by capturing the error in a second variable. However, now you'll see I'm getting yet another error: declared and not used. This is because Go is very strict about unused variables, and unused imports as well. If we define a variable but don't use it, the compiler will complain. This is actually a good thing. It may feel a little bit annoying, but it keeps unused imports and variables out of our code.
But then how do we actually ignore the error value that's coming back from os.ReadFile
? Well, to do so we can go ahead and make use of the blank identifier, which is just an underscore character. This tells the compiler that we're explicitly ignoring a value by assigning it to an underscore name. As you can see, we're no longer getting a complaint about the error not being used, although we are getting a complaint about the data. We can squash this for the moment by just assigning the data value to an underscore and you can see the errors are now gone.
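The progression described above can be sketched as follows, with the two forms that fail to compile kept as comments and the blank identifier version left active. The loadIgnoringError helper name is an assumption for this illustration. The path ./words.txt is the one used in the lesson.

```go
package main

import (
	"fmt"
	"os"
)

// loadIgnoringError mirrors the lesson's approach: keep the bytes and
// explicitly discard the error with the blank identifier.
func loadIgnoringError(name string) []byte {
	// data := os.ReadFile(name)      // does not compile: assignment mismatch
	// data, err := os.ReadFile(name) // does not compile while err is unused
	data, _ := os.ReadFile(name)
	return data
}

func main() {
	data := loadIgnoringError("./words.txt")

	// Assigning to the blank identifier also silences "declared and not
	// used" for a variable that has no other use yet.
	_ = data

	fmt.Println("bytes read:", len(data))
}
```

Because the error is discarded, a missing file does not stop the program. The returned slice is simply empty, which is one reason ignoring errors is only acceptable for quick experiments.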
With that, we're now capturing the contents of the file from the os.ReadFile
function in a variable called data
. In order to see that this is working, we actually want to go ahead and write the contents of this data to our terminal output. To do so we can go ahead and actually use the fmt.Println
function yet again. Before we do, let's go ahead and actually get rid of the hello world counter on lines 9 and 10. Then we can go ahead and actually pass in our data
value in order to print it out. Before we do so, let's go ahead and actually add in a prefix string. So we'll go ahead and add in data:
, which will add the data:
prefix just before the output of our data value. And we can also go ahead and get rid of this line 10 as well, since we're actually using the variable now.
If we take a look at the fmt.Println
function documentation, you can see that spaces are always added between operands and a new line is appended, meaning that this will add a space in between our string of data and the actual data value that we're printing out. So it would look similar to this. The fmt.Println
function is just one of many ways to actually write data to the terminal, and we'll take a look at a number of different ways to do so throughout this course. In any case, with our data now being printed using the fmt.Println
function, let's go ahead and actually run our code and see what happens.
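The spacing and newline behaviour from the documentation can be checked with a short sketch. It uses fmt.Sprintln, which formats its operands the same way as fmt.Println but returns the result as a string, so the added space and trailing newline become visible. The joined helper and the placeholder operand "value" are assumptions for this illustration.

```go
package main

import "fmt"

// joined returns what fmt.Println would write for two operands,
// captured as a string so that it can be inspected.
func joined(prefix, value string) string {
	return fmt.Sprintln(prefix, value)
}

func main() {
	line := joined("data:", "value")

	// %q shows the string with quotes and escapes, which makes the
	// inserted space and the appended newline easy to spot.
	fmt.Printf("%q\n", line) // prints "data: value\n"
}
```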
When I do so, you can see we get the data line being printed out. However, afterwards you'll notice that I then get a list of numbers rather than the actual contents of the words.txt
. This list of numbers represents the individual bytes that make up the contents of this file. To prove that this is the case, let's head on over to the ascii-code.com
website and take a look at this ASCII table. Let's scroll down to the ASCII printable characters. Let's make this a little bigger so we can see it. If we look for the decimal number of 111, which is all the way down here, you can see that the value of this is the lowercase o
, as defined here and as defined here. Therefore, we can see that this 111 represents the lowercase o
or the first character inside of our words.txt
file. This is because 111, or 0x6F
in hexadecimal, is the numeric value of the byte, and the lowercase o
character is what that value represents in ASCII.
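The lookup done by hand in the ASCII table can also be sketched in code. The example below prints a byte slice the way fmt.Println shows it by default, then prints each byte as a decimal number, a hexadecimal number and a character. The describeByte helper is an assumption for this illustration, and only the first word of the file is used to keep the output short.

```go
package main

import "fmt"

// describeByte renders one byte as decimal, hexadecimal and a character.
func describeByte(b byte) string {
	return fmt.Sprintf("%d 0x%X %c", b, b, b)
}

func main() {
	// Assumed sample: just the first word of the file.
	data := []byte("one")

	fmt.Println(data) // prints [111 110 101]

	for _, b := range data {
		fmt.Println(describeByte(b)) // first line printed: 111 0x6F o
	}
}
```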
Therefore, as you can see, we're printing out the numeric byte values rather than the characters they represent, which is simply how a byte slice is printed by default in Go. Whilst byte values are very good for machines, they're not great for humans. So how do we go about actually printing out the ASCII or UTF-8 text that our data contains? Well, fortunately, Go makes this quite easy with a type conversion, given that converting between a byte slice and a string is a very common operation when it comes to working with data in Go. To do so, we can just go ahead and wrap our data
value in a string conversion, writing string followed by the value in parentheses.
Now, if I go ahead and run this code again, this time you should see the ASCII text, or UTF-8 text in this case, printed to the console, which it is. As you can see we have data: one two three four five
. Much more readable than the byte values themselves. One thing to note is that converting a slice of bytes to a string, and also the inverse, converting a string to a slice of bytes, is not free, because the bytes are generally copied, so there is a slight performance impact associated with it. That being said, don't let that performance cost get in the way. It just means being careful not to convert from a byte slice to a string and back to a byte slice over and over. If you need to do it, do it once and don't repeatedly do it on the same piece of data.
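A minimal sketch of the conversion in both directions is shown below. It also illustrates why the conversion has a cost: each direction produces a separate copy, so changing the original slice afterwards leaves the converted values untouched. The convertOnce helper and the sample contents are assumptions for this illustration.

```go
package main

import "fmt"

// convertOnce converts a byte slice to a string and back again.
func convertOnce(data []byte) (string, []byte) {
	text := string(data) // new string holding a copy of the bytes
	back := []byte(text) // new slice holding another copy
	return text, back
}

func main() {
	// Assumed sample contents, without the trailing newline echo adds.
	data := []byte("one two three four five")
	text, back := convertOnce(data)

	// Changing the original slice afterwards does not affect the copies.
	data[0] = 'O'

	fmt.Println("data:", text)
	fmt.Println("data:", string(back))
	fmt.Println("data:", string(data))
}
```

In practice this usually means converting once, storing the result in a variable, and reusing that variable rather than converting the same data repeatedly.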
In any case, with that we've managed to read the contents of a words.txt
file into a byte slice and print it back to the console as UTF-8 text so that we can read it. Additionally, by doing so we got to understand a little bit about the slice of bytes type, which was used to represent the raw byte values of the file contents.
Before we move on, now's a good time to go ahead and commit our code. In this case you can see we have a number of different files. We have the words.txt
and the main.go
as well. You may only want to track the changes made to your main.go
and not worry too much about the words.txt
given the fact that it's a bit of a test file. However, in this case I'm actually going to go ahead and add both of them. One, just to make the git output a little cleaner through the rest of this course and two, because we may as well add them to the repository anyway as we're going to be developing with them for a long time. And given the fact that it's a very small amount of data, it's not going to bloat our repository or anything like that. Additionally, because these files aren't generated, it makes sense to track them.
You may have noticed that I achieved this using the git add .
command. The dot just means the current directory and so I'm adding all of the changes found in this directory and any sub-directories as well, which causes both the words.txt
and the main.go
file to be added to the staged changes.
Okay, with the changes added we can now go ahead and commit this code by adding in say the following message: added in the ability to read a words.txt file
. With that we can then move on to the next lesson where we're going to begin writing an algorithm to count the number of words inside of our words.txt
file.