Burying the Lede

Most of us don’t write very readable shell scripts. There are plenty of things we could do better, but today I want to talk about one in particular — burying the lede.

The term “burying the lede” comes from the field of journalism. Here’s the Wiktionary definition:

To begin a story with details of secondary importance to the reader while postponing more essential points or facts.

Like a good news article, code should tell a story. And the story should start with what’s most important. In the case of code, the most important information is the high-level functionality — a succinct summary of what the program does. In other words, write (and organize) the code top-down, as opposed to bottom-up.

Unfortunately, shell script doesn’t make this easy. Due to the way shell scripts are interpreted, you can’t call a function until after you’ve defined it. This leads to most of us structuring our code like this:

function do_something { ... }
function do_something_else { ... }

do_something
do_something_else

The problem with this is that the function definitions will likely take quite a few lines, and we won’t see what the top-level functionality is until we reach the end of the script.

I’d like to propose a standard way to structure shell scripts to mitigate this issue. (I’m really only talking about shell scripts that have function definitions within them.) I’m sure I’ve seen a few scripts do this, but it’s not very common at all.

My proposal is simple:

function main {
  do_something
  do_something_else
}
function do_something { ... }
function do_something_else { ... }

main

This structure lets us start with the lede. We describe the top-level functionality right away. Only then do we get to the secondary details. The name main makes it pretty clear that it contains the top-level functionality.

I’ve recently started writing my shell code like this, and I’m happy with the results. I’ve also started to use some other programming techniques in my shell scripts to improve readability: better naming, extracting more methods, and moving helper methods into separate files. It feels good to treat shell scripts like real code instead of just some stuff I’ve hacked together.

PS. The WordPress theme I’m currently using (Twenty Eleven) also buries the lede — I can barely even see the title of the blog post on my screen without scrolling. I’m going to have to change that soon.

Yak Shaving #1: Cursor Keys

I recently decided to start using Emacs again. I used it extensively from the early 1990s until the early 2000s. I pretty much stopped using it when I had a sysadmin job with no Emacs on the servers, and no ability to install it. With the rising popularity of tmux and tmate for remote pairing, and my dislike for vim’s modes, I decided to try going back to Emacs in the terminal.

One thing I really want in a text editor is good cursor key support. Shifted cursor keys
should select text, and Control+left and right should move by words. (Apple HIG says to use Option+left and right to move by words; most apps on Mac OS X seem to support both.) Things have worked this way on almost every text editor on every OS I’ve used — Amiga, DOS, Windows, NeXT, Mac, Motif, Gnome, KDE. It’s a part of the CUA standard that’s been in common usage on everything since the mid-1980s.

Enabling cursor keys in Emacs was pretty easy. I’ve decided to use Prelude to make getting started with Emacs easy. Emacs comes with the cursor keys enabled, but Prelude disables them. Undoing Prelude’s change is pretty easy:

(setq prelude-guru nil)

Trying to make shifted cursor keys work is where the trouble began. They work in the GUI version of Emacs, but not from the terminal. It turns out that the Mac Terminal doesn’t distinguish between cursor keys and shifted cursor keys in its default configuration. So I had to figure out how to configure key bindings in Terminal.

That’s easy enough — they’re in the preferences. But what should I set them to? This took a lot of research. Terminal emulation and ANSI code sequences are obscure, complex, and inconsistent. I eventually found the info I needed. For starters, Shift+Right, Shift+Left, Shift+Home, and Shift+End are defined in the terminfo. The rest I was able to piece together from various sources around the Internet.

I’m also trying to script my Mac configuration. So instead of manually adding all the keybindings in the Terminal preferences pane, I decided to write a script. Mac OS X does a decent job of allowing you to change preferences from the command line. For example, to always show the tab bar:

defaults write -app Terminal ShowTabBar 1

Easy enough, except for a couple problems. First, I had to figure out the obscure syntax used in the preferences for key codes. I was able to piece these together with some more Internet research. But the really big problem is that the keyboard bindings are 2 levels deep within a “dictionary” (hash map). And the defaults command doesn’t handle that. There are some obscure utilities that handle nested preferences, but they don’t work well with the preferences caching in Mac OS X 10.9 — a problem I ran into while testing.

So now I’m writing a utility in Python that does what the defaults command does, but that will handle nested dictionaries.

There’s a term for this process of having to solve one issue before you can solve another, and the issues get several layers deep. It’s called yak shaving.

Here’s another good example:

Father from Malcolm in the middle having to fix one thing before fixing another, ad infinitum.
Yak shaving – home repairs

I’m sure this will be the first of many posts about me yak shaving.

What I Want in a Blog Engine

I’m considering moving away from WordPress, for a couple reasons:

  • Security. There have been several vulnerabilities over the past few years, and I’ve never really had a high level of confidence that it’s secure. In addition, I now find the whole PHP model — every web app running as a single user, instead of leveraging UNIX permissions — to be broken.
  • Speed. I started to realize a couple years ago that a blog engine should generate static pages. The static pages should be updated whenever new content is added. There’s really no reason to re-generate the page every time someone wants to read it. At the very least, Server-Side Includes (SSI) or Edge-Side Includes (ESI) should be used.

Now that I’m starting to actually blog consistently, it makes sense to change. But before I move to something different, I want to be sure that I find all the features that I need. This post is my thought process about what I want.

Requirements

  • Self hosted. I like to run my own servers. Partly so I can keep full control, and partly because I enjoy doing some system admin stuff (as long as it doesn’t take too much time).
  • Generation of static pages. Almost everything should be a static page by the time a user requests a page. Obviously, submitting comments would hit a dynamic page, and comment approval would likely be done through a dynamic part of the site. But publishing a blog post or comments should generate static files to be served by Apache or nginx.
  • Categories (or maybe tags). I want readers to be able to find other things that I’ve written that might interest them. A list of categories in a sidebar would help them find what interests them.
  • Site map. Especially an XML sitemap, so Google can index better. I also like the idea of readers being able to see everything in one place.
  • Archives. This is another way of letting readers see everything I’ve written, organized by publication date.
  • RSS and/or Atom. I want readers to be able to subscribe to my blog.
  • Media. I don’t think I necessarily need a media manager, just a simple way to add pictures to blog posts.
  • Comments. I want to allow comments, but I don’t want to use anything like Disqus. I want the comments to be in my own database, so they won’t disappear. I also dislike requiring JavaScript just to see comments. And I’ve worked at places where Disqus wouldn’t work, due to web proxy servers. Comments should be moderated by the site admin, editor, or author. Readers should not have to log in to post a comment, but it would be nice to allow users to log into Twitter or some other social network to provide their info.
  • Pingbacks and Tweetbacks. I’d like to be able to see who’s referring to my articles.

Nice to Have

  • Free. I almost consider this to be a requirement, but I’d be willing to entertain a 1-time payment for something that meets my needs really well. I need to try to counteract my DIY-at-all-non-monetary-costs mentality.
  • Open Source. I think Open Source is likely to provide a more useful product, especially in this space. An Open Source option is more likely to improve over time.
  • Not PHP. I’ve got several problems with PHP. First, the whole PHP model of running in the same context as Apache (or nginx) breaks UNIX conventions. A security issue in a PHP program exposes more than just the program itself — it exposes Apache and every other PHP program. Under the UNIX model, each program runs in a different user context, isolating bugs to that program only. PHP is also a “hacky” language. PHP programmers tend to not be as rigorous (in many ways), leading to more vulnerabilities. Finally, the PHP language itself has had several misfeatures that have lead to security issues — much more than Ruby or Java.
  • Not MySQL. I’d like to be able to remove MySQL from my servers. PostgreSQL would be much preferred. Storing everything in git would be fine too — maybe even preferred. I’d even be fine with MongoDB or some other simple NoSQL database.
  • Admin console. I can’t imagine approving comments without this, but I’m not closed to that possibility.
  • Edit history. I want to be able to update posts, but see what the previous versions looked like. (Not sure if I care whether the public can or cannot see the various revisions.)
  • WYSIWYG. I’d be willing to settle for Markdown. In fact, I’d prefer (extended) Markdown as the canonical form behind the WYSIWYG.
  • Code highlighting. A lot of my blog posts (will) include code in various languages. I’d like the blogging software to be able to handle that well. Syntax highlighting would be great. Any other related features would also be nice.
  • Through-the-web editing. I’d like to be able to edit a page from any device from anywhere that has Internet access.
  • Reader-submitted edits. This is probably going to be the hardest thing to find, but I’d like a really easy way for readers to let me know of typos and such by simply submitting a fix for them. The author or editor of the post would have to approve the edits, obviously.
  • Import from WordPress. I don’t have a ton of content in WordPress, or else this would probably be a more firm requirement.
  • Community. Community helps software thrive, adding improvements and ensuring security.
  • Plugins. It would be nice for the blog engine to be extensible.
  • Unpublished drafts and previews. I’d really like to be able to see by blog posts before I release them to the world.
  • CMS abilities. I’d really like the ability to edit non-blog web pages the same way as blog entries.
  • Themes. It would be nice to have decently looking themes, so I could have other people help me make the site look nice.
  • Posting my tweets in a sidebar area. I’d be OK with this being done in client-side JavaScript. I’d also be OK with periodic polling of my Twitter feed, and updating the static content when appropriate.

Candidates

I’ve been playing with Jekyll lately, using it to generate a simple static site that can be updated easily. It’s pretty nice, and pretty easy to use. Something based on Jekyll would work well. The trickiest part will be figuring out how to manage comments. Or maybe I’ll just have to learn to live more minimally.

Octopress is the obvious candidate. It fits most of my needs and wants. Most likely I’ll write my own commenting engine and add it to my Octopress template.

The only other candidate that I’ve found is Ghost. It’s written in JavaScript and runs on Node. I don’t find Node to be that compelling, but the templates look great, and it’s designed for coding blogs.

I’ll probably take a quick look at Droplets, too. It’s written in PHP, but seems to fit my other requirements.

Unfortunately, that seems to be all there is in this space. Maybe I need to investigate more. Or maybe Octopress just has the static blog generation marked cornered. Either way, I’ll probably change my blogging software in the next month or two. Wish me luck!

 

Bulk Rename in Bash

Here’s a relatively simple way to rename a bunch of files from the command line. It uses sed within a command substitution to compute the new names from the old names.

In this example, we’re renaming files that start with “ABC” to start with “XYZ” instead:

  for i in ABC*; do mv $i $(echo $i | sed -e s/^ABC/XYZ/); done

You’ll have to use shell globbing (wildcards) in the first part, to determine which files will be the source of the renaming, and regular expressions in the second part to translate the old names into the new names.

It’s a good idea to use echo in place of mv before running this, to see what the results will be, so you don’t make a mistake.

Depending on the situation, you may need to adapt this for your particular needs. For example, if you have too many files to rename, you would need to use xargs. Or sed might not be up for the task of transforming the names that you want. Or you might need to use find to find files within a hierarchy. As with any UNIX idiom, there are hundreds of variations on the theme.