Casual Programming at Work for Fun and Profit

The thought of countless office workers slaving over spreadsheets and performing mindless work makes me sad. These machines that I love were supposed to free us from menial labor. Instead, there are people out there copying something from one spreadsheet and pasting it into another, over and over, ad nauseum. It's because the tools are incomplete, a consequence of trying to tie every conceivable operation to a button on the screen.

You don't need to be a computer scientist to process data. With casual programming, we can accomplish a lot without the operational complexity of “real” software development. If we take some basic precautions, we can save time, have fun, and look like a total wizard without appreciable risk.

Here are some examples of things I've done on a locked-down Windows 10 laptop:

The Tools

To do all of this, I use Git for Windows. I don't actually use git itself, just the tools that come with Git for Windows. Even on locked-down systems, I haven't had trouble installing it. Note: Run the installer from the file explorer, not through your web browser's download manager. My guess is that it's an executable zip file, and extracting an archive doesn't require admin access. Git for Windows is based on MSYS2. which includes MinGW, a collection of GNU utilities that have been ported to Windows. Git for Windows is updated about every 12 weeks, so be sure to keep your installation up to date.

These tools will let you do a lot of cool tricks. The key feature of this environment is the composability of trivial programs. It's not as easy as clicking a button, but you won't need to break out a calculator, either. This is a great toolset for working with small datasets. I believe most Excel spreadsheets fall into that category.

Key tools include:

Other features that make this environment easy to use:

To work with Excel data in the Git for Windows environment, you will need to save the file as “Text (Tab Separated)”. You can always paste the resulting data back into your original file. I recommend tab-separated because it is very unlikely that you will encounter a tab character embedded in an Excel document. However, it is quite easy to embed a newline character in a cell using Ctrl-Enter and this will be hard to compensate for in a script. If your Excel document has cells with unstructured data, consider excluding those columns from the tab-separated version of the file.

I also recommend installing Pandoc. I hate messing around with styles in Word. With Pandoc, I can compose using a plain text editor. I use markdown to indicate headers, create tables, quote text, and bold and italicize text. Pandoc will then convert my markdown into a nice-looking Word document. I have also converted markdown into HTML, HTML into plain text, and Word into markdown. The key advantage for me is that I get to edit documents in Vim, my favorite editor, and then convert them to a document that is more appropriate for my audience (usually Word). It also brings this content into an environment with powerful automation capabilities.

Use Cases

I'll focus on Excel because I use it a lot at work and awk was made to process structured data like you find in most Excel documents. The key things missing from Excel are regular expressions and scriptability. For really trivial stuff, like summing up numbers or creating pivot tables, staying in Excel is probably your best bet. When things get complicated, using a tool like awk or sed can save you a ton of time and soul-crushing, repetitive work. This is especially true of anything you will have to do more than once or on a regular schedule.

Examples include:

You can also perform find-and-replace operations that are too complicated or fuzzy for Excel's built-in find-and-replace function.

Basically, if you find yourself eyeballing every row in a spreadsheet and applying some kind of rule, you should strongly consider creating a script to do it for you. Not only will this be faster in the long run, but less prone to human error. It's also just more fun and it elevates your thinking.