Wednesday, August 31, 2016

Let's Go: Command-Line Programs With Golang

Overview

Go is an exciting new language that is gaining a lot of popularity, and for good reason. In this tutorial you'll learn how to write command-line programs with Go. The sample program is called multi-git, and it allows you to execute git commands on multiple repositories at the same time.

Quick Introduction to Go

Go is an open-source C-like language created at Google by some of the original C and Unix hackers, who were motivated by their dislike of C++. It shows in Go's design, which made several unorthodox choices such as eschewing implementation inheritance, templates, and exceptions. Go is simple, reliable, and efficient. Its most distinctive feature is its explicit support for concurrent programming via so-called goroutines and channels.

Before starting to dissect the sample program, follow the official guide to get ready for Go development.

The Multi-Git Program

The multi-git program is a simple but useful Go program. If you work on a team where the codebase is split across multiple git repositories, you often need to make changes across several of them. This is a problem because git has no concept of multiple repositories. Everything revolves around a single repository.

This becomes especially troublesome if you use branches. If you work on a feature that touches three repositories then you will have to create a feature branch in each of these repositories and then remember to check out, pull, push, and merge all of them at the same time. This is not trivial. Multi-git manages a set of repositories and lets you operate on the whole set at once. Note that the current version of multi-git requires that you create the branches individually, but I may add this feature at a later date.

By exploring the way multi-git is implemented, you will learn a lot about writing command-line programs in Go.

Packages and Imports

Go programs are organized in packages. The multi-git program consists of a single file called main.go. At the top of the file, the package name 'main' is specified, followed by a list of imports. The imports are other packages that are used by multi-git.
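As a minimal sketch of that layout, here is a tiny single-file program with a package clause followed by an import list. It is not the actual multi-git source: the greeting() helper and the choice of the strings package are assumptions made for illustration. Only fmt is named in this article.

```go
// Every Go source file starts with a package clause. An executable
// program must use package main and define a main() function.
package main

// The import list names the packages this file uses. The compiler
// rejects imports that are never used.
import (
	"fmt"
	"strings"
)

// greeting is a hypothetical helper that exists only to exercise
// both imported packages.
func greeting(name string) string {
	return fmt.Sprintf("Hello, %s!", strings.TrimSpace(name))
}

func main() {
	fmt.Println(greeting(" multi-git "))
}
```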

For example, the fmt package is used for formatted I/O similar to C's printf and scanf. Go supports installing packages via the go get command. When you install packages, they end up in a namespace under the directory named by the $GOPATH environment variable. You can install packages from a variety of sources such as GitHub, Bitbucket, Google Code, Launchpad, and even IBM DevOps services via several common version control systems such as git, subversion, mercurial and bazaar.

Command-Line Arguments

Command-line arguments are one of the most common forms of providing input to programs. They are easy to use, allow you to run and configure the program in one line, and have great parsing support in many languages. Go calls them command-line "flags" and has the flag package for specifying and parsing command-line arguments (or flags). 

Typically, you parse command-line arguments at the beginning of your program, and multi-git follows this convention. The entry point is the main() function. The first two lines define two flags called "command" and "ignoreErrors". Each flag has a name, a data type, a default value, and a help string. The flag.Parse() call will parse the actual command-line passed to the program and will populate the defined flags.
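The flag definitions described above can be sketched roughly as follows. The flag names "command" and "ignoreErrors" come from the text, but the default values, the help strings, the options struct, and the parseOptions() helper are assumptions. The sketch also uses flag.NewFlagSet() rather than the package-level flag.String() and flag.Parse() calls so that it can be exercised with any argument list. Calling flag.Parse() is equivalent to parsing os.Args[1:] with the default flag set.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options is a hypothetical struct that holds the two parsed flags.
type options struct {
	command      string
	ignoreErrors bool
}

// parseOptions defines the two flags and parses args into them.
// Each definition takes a name, a default value, and a help string;
// the data type is chosen by the function (String, Bool, ...).
func parseOptions(args []string) (options, error) {
	fs := flag.NewFlagSet("multi-git", flag.ContinueOnError)
	command := fs.String("command", "", "the git command to run (assumed help text)")
	ignoreErrors := fs.Bool("ignoreErrors", false, "keep going after a failure (assumed help text)")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	// The definitions return pointers, so dereference them after parsing.
	return options{command: *command, ignoreErrors: *ignoreErrors}, nil
}

func main() {
	opts, err := parseOptions(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("command=%q ignoreErrors=%v\n", opts.command, opts.ignoreErrors)
}
```

With this sketch, an invocation could look like: go run main.go -command "status --short" -ignoreErrors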

It is also possible to access arguments that were not defined as flags via the flag.Args() function. So, flags stand for pre-defined arguments and "args" are the unprocessed arguments that remain after flag parsing. The unprocessed arguments are indexed from 0.
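To make the difference between flags and unprocessed arguments concrete, here is a small sketch. It is not part of multi-git: the "verbose" flag and the splitArgs() helper are hypothetical, and the sketch calls Args() on its own FlagSet, which behaves like flag.Args() does on the default flag set. Note that the flag package stops parsing at the first argument that is not a flag, so everything after it counts as unprocessed.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// splitArgs parses one hypothetical boolean flag and returns its value
// together with the remaining unprocessed arguments.
func splitArgs(args []string) (bool, []string, error) {
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	verbose := fs.Bool("verbose", false, "hypothetical flag for illustration")
	if err := fs.Parse(args); err != nil {
		return false, nil, err
	}
	// Args() returns whatever is left once flag parsing has stopped.
	return *verbose, fs.Args(), nil
}

func main() {
	verbose, rest, err := splitArgs(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Println("verbose:", verbose)
	for i, arg := range rest {
		fmt.Printf("arg %d: %s\n", i, arg) // the index starts at 0
	}
}
```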

Environment Variables

Another common form of program configuration is environment variables. When you use environment variables, you may run the same program multiple times in the same environment, and all runs will use the same environment variables. 

Multi-git uses two environment variables: "MG_ROOT" and "MG_REPOS". Multi-git is designed to manage a group of git repositories that have a common parent directory. That's "MG_ROOT". The repository names are specified in "MG_REPOS" as a comma-separated string. To read the value of an environment variable you can use the os.Getenv() function.
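Reading the two variables might look like the sketch below. The variable names MG_ROOT and MG_REPOS come from the text. The loadConfig() helper, the injected lookup function, and the trimming of whitespace and empty entries are assumptions added for illustration and are not necessarily what multi-git does.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// loadConfig reads the root directory and the repository names.
// The lookup function is passed in (hypothetical design choice) so the
// logic can be tried without touching the real environment; in main()
// it is simply os.Getenv.
func loadConfig(getenv func(string) string) (string, []string) {
	root := getenv("MG_ROOT")
	var names []string
	// MG_REPOS is a comma-separated string of repository names.
	for _, name := range strings.Split(getenv("MG_REPOS"), ",") {
		// Assumption: tolerate stray spaces and skip empty entries.
		if name = strings.TrimSpace(name); name != "" {
			names = append(names, name)
		}
	}
	return root, names
}

func main() {
	root, names := loadConfig(os.Getenv)
	fmt.Println("root:", root)
	fmt.Println("repos:", names)
}
```

Keep in mind that os.Getenv() returns an empty string when a variable is not set, so a real program may want to check for that case explicitly.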

Verifying the Repository List

Now that it has found the root directory and the names of all the repositories, multi-git verifies that each repository exists under root and that it is really a git repository. The check is as simple as looking for a .git sub-directory in each repository directory.

First, a slice of strings named "repos" is defined. Then the program iterates over all the repo names and constructs a repository path by concatenating the root directory and the repo name. If the os.Stat() call fails for the .git subdirectory, it logs the error and exits. Otherwise, the repository path is appended to the repos slice.

Go has a distinctive error-handling convention where functions often return both a result and an error value. Check out how os.Stat() returns two values. In this case the "_" placeholder takes the place of the actual result because you only care about the error. The Go compiler is strict and rejects local variables that are declared but never used. If you don't plan to use a value, you should assign it to "_" to avoid a compilation error.
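The verification loop can be sketched along these lines. This is an approximation rather than the original code: the verifyRepos() helper is hypothetical, it returns an error to its caller instead of logging and exiting inside the loop, and it builds paths with filepath.Join() instead of plain string concatenation. Taking the repository names from the command line in main() is also just a convenience for trying it out.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
)

// verifyRepos checks that every named repository under root contains a
// .git entry and returns the full paths of the verified repositories.
func verifyRepos(root string, names []string) ([]string, error) {
	var repos []string
	for _, name := range names {
		path := filepath.Join(root, name)
		// os.Stat returns (FileInfo, error). Only the error matters
		// here, so the first value is discarded with "_".
		if _, err := os.Stat(filepath.Join(path, ".git")); err != nil {
			return nil, fmt.Errorf("cannot find .git under %s: %w", path, err)
		}
		repos = append(repos, path)
	}
	return repos, nil
}

func main() {
	// Hypothetical usage: root from MG_ROOT, names from the command line.
	repos, err := verifyRepos(os.Getenv("MG_ROOT"), os.Args[1:])
	if err != nil {
		log.Fatal(err) // logs the error and exits with status 1
	}
	fmt.Println(repos)
}
```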

Executing Shell Commands

At this point, you have the list of repository paths where you want to execute the git command. As you recall, the git command line was received as a single command-line argument (flag) called "command". This needs to be split into a slice of components (git command, sub-command, and options). The whole command is also stored as a string for display purposes.
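One way to do the split is sketched below. The splitCommand() helper is hypothetical, and using strings.Fields() is an assumption. It splits on any run of whitespace and does not understand shell quoting, so an option value that contains spaces would be broken into several components.

```go
package main

import (
	"fmt"
	"strings"
)

// splitCommand turns the value of the "command" flag into the argument
// list for git, plus a display string for printing.
func splitCommand(command string) ([]string, string) {
	// Fields splits on whitespace and drops empty components.
	components := strings.Fields(command)
	display := "git " + strings.Join(components, " ")
	return components, display
}

func main() {
	components, display := splitCommand("checkout -b feature/x")
	fmt.Println(display)
	fmt.Println(len(components), "components:", components)
}
```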

Now, you're all set to iterate over each repository and execute the git command in each one. The "for ... range" loop construct is used again. First, multi-git changes its working directory to the current target repo "r" and prints the git command. Then it executes the command using the exec.Command() function and prints the combined output (both standard output and standard error). 

Finally, it checks if there was an error during execution. If there was an error and the ignoreErrors flag is false then multi-git bails out. The reason for optionally ignoring errors is that sometimes it's OK if commands fail on some repos. For example, if you want to check out a branch called "cool feature" on all the repositories that have this branch, you don't care if the checkout fails on repositories that don't have this branch.
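Here is a rough sketch of that loop. It differs from the description in two labeled ways: instead of changing the process working directory, it sets the Dir field of the command, and instead of exiting on failure it returns the error to the caller. The runInRepos() helper and the arguments used in main() are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// runInRepos runs "git <gitArgs...>" in every repository path. It stops
// at the first failure unless ignoreErrors is true.
func runInRepos(repos, gitArgs []string, ignoreErrors bool) error {
	for _, r := range repos {
		fmt.Printf("[%s] git %s\n", r, strings.Join(gitArgs, " "))
		cmd := exec.Command("git", gitArgs...)
		cmd.Dir = r // run the command inside the target repository
		// CombinedOutput captures standard output and standard error.
		out, err := cmd.CombinedOutput()
		fmt.Print(string(out))
		if err != nil && !ignoreErrors {
			return fmt.Errorf("%s: %w", r, err)
		}
	}
	return nil
}

func main() {
	// Hypothetical usage: run "git status --short" in the current directory.
	if err := runInRepos([]string{"."}, []string{"status", "--short"}, false); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Setting cmd.Dir keeps the working directory of the multi-git process itself unchanged, which avoids surprises if the repository paths are relative.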

Conclusion

Go is a simple yet powerful language. It's designed for large-scale system programming but works just fine for small command-line programs too. Go's minimal design is in stark contrast to other modern languages like Scala and Rust, which are very powerful and well designed too but have a very steep learning curve. I encourage you to try Go and experiment. It's a lot of fun.

