Yak Shaving #1: Cursor Keys

I recently decided to start using Emacs again. I used it extensively from the early 1990s until the early 2000s. I pretty much stopped using it when I had a sysadmin job with no Emacs on the servers, and no ability to install it. With the rising popularity of tmux and tmate for remote pairing, and my dislike for vim’s modes, I decided to try going back to Emacs in the terminal.

One thing I really want in a text editor is good cursor key support. Shifted cursor keys
should select text, and Control+left and right should move by words. (Apple HIG says to use Option+left and right to move by words; most apps on Mac OS X seem to support both.) Things have worked this way on almost every text editor on every OS I’ve used — Amiga, DOS, Windows, NeXT, Mac, Motif, Gnome, KDE. It’s a part of the CUA standard that’s been in common usage on everything since the mid-1980s.

Enabling cursor keys in Emacs was pretty easy. I’ve decided to use Prelude to make getting started with Emacs easy. Emacs comes with the cursor keys enabled, but Prelude disables them. Undoing Prelude’s change is pretty easy:

(setq prelude-guru nil)

Trying to make shifted cursor keys work is where the trouble began. They work in the GUI version of Emacs, but not from the terminal. It turns out that the Mac Terminal doesn’t distinguish between cursor keys and shifted cursor keys in its default configuration. So I had to figure out how to configure key bindings in Terminal.

That’s easy enough — they’re in the preferences. But what should I set them to? This took a lot of research. Terminal emulation and ANSI code sequences are obscure, complex, and inconsistent. I eventually found the info I needed. For starters, Shift+Right, Shift+Left, Shift+Home, and Shift+End are defined in the terminfo. The rest I was able to piece together from various sources around the Internet.

I’m also trying to script my Mac configuration. So instead of manually adding all the keybindings in the Terminal preferences pane, I decided to write a script. Mac OS X does a decent job of allowing you to change preferences from the command line. For example, to always show the tab bar:

defaults write -app Terminal ShowTabBar 1

Easy enough, except for a couple problems. First, I had to figure out the obscure syntax used in the preferences for key codes. I was able to piece these together with some more Internet research. But the really big problem is that the keyboard bindings are 2 levels deep within a “dictionary” (hash map). And the defaults command doesn’t handle that. There are some obscure utilities that handle nested preferences, but they don’t work well with the preferences caching in Mac OS X 10.9 — a problem I ran into while testing.

So now I’m writing a utility in Python that does what the defaults command does, but that will handle nested dictionaries.

There’s a term for this process of having to solve one issue before you can solve another, and the issues get several layers deep. It’s called yak shaving.

Here’s another good example:

Father from Malcolm in the middle having to fix one thing before fixing another, ad infinitum.
Yak shaving – home repairs

I’m sure this will be the first of many posts about me yak shaving.

Includable ActiveRecord

I created a Ruby gem recently, called includable-activerecord. It’s pretty small, but I thought I might explain why I created it, and discuss its implementation.

Classical Inheritance

When you use ActiveRecord, you normally include it in your model like this:

class User < ActiveRecord::Base
  # ...
end

Your User class is inheriting from the ActiveRecord::Base class. This is class-based inheritance, also called “classical” inheritance. (That’s “classical” as in “class”, not as a synonym for “traditional”.) Class-based inheritance represents an “is-a” relationship. So we’re saying that a user is an ActiveRecord base. Another way to say this is that User is a subclass of ActiveRecord::Base.

There are a few problems with this. First, what is a “base”? The name was chosen because it’s a base class. But just like we don’t give factory classes names like UserFactory (at least not in Ruby), we shouldn’t name base classes Base.

I suppose that we’re trying to say that this is an ActiveRecord model. That sounds fine at first glance — this is our model for users. But what if we also want to say that a user is a person? Ruby doesn’t allow inheriting from multiple classes. Now we have to choose whether to inherit from ActiveRecord::Base or Person. Person makes more sense, because it fills the “is-a” rule better. Class inheritance is intended for a hierarchical “is-a” relationship, such as “a user is a person”, or “a circle is a shape”. But since ActiveRecord::Base is a base class, we have to use it as our base class.

We could work around this problem by subclassing Person from ActiveRecord::Base and then subclassing User from Person. That’s fine if Person is also a model that we store in the database. But if that’s not the case, then we have a problem.

Mixins

Ruby provides another way of implementing inheritance — mixins. We often don’t think of this as an inheritance model, but it really is. When we include a module, that module gets added to the class’s ancestor chain. We can mix in as many modules as we want.

Mixins indicate more of an “acts like” relationship than an “is-a” relationship. It’s for shared behavior between classes that don’t have a hierarchical relationship. For example, when we mix in the Enumerable module, we’re saying that we want our class to act like other classes that include Enumerable. That sounds more like what we want ActiveRecord to be. We want our user model to behave like other ActiveRecord models, in the way that they can persist to a database.

But ActiveRecord doesn’t support that. Almost all the other Ruby ORMs do; as we’ve shown above, this is for good reasons.

Implementation

So I decided to see if I could implement the equivalent of the ActiveRecord::Base class as a module that could be mixed into model classes. I decided to call my mixin module ActiveRecord::Model, because classes that mix it in will behave as ActiveRecord models.

It turns out that ActiveRecord::Base is a pretty complex class. It includes and extends a lot of other modules. Luckily, as of ActiveRecord 4.0, that’s all the code it includes.

The module only defines a single class method, included. This is one of Ruby’s many hook methods. It gets called when the module in question gets included in another module, and receives that other model as its argument. All we need to have this method do is to include everything that ActiveRecord::Base includes, and extend everything that ActiveRecord::Base extends. Ruby provides a method that’s defined on all classes, called included_modules, which we can use to get the list of everything that’s included in ActiveRecord::Base. Unfortunately, there’s no equivalent list of extended_modules. But a quick search on Stack Overflow found an implementation of extended_modules that we could use.

So with a bit of magic (i.e. hooks and meta-programming), we can get the lists of constituent modules from the ActiveRecord::Base class, and include them in our ActiveRecord::Model module.

So with all that, we can now include the includable-activerecord gem and mix it in, with all the advantages that provides:

class User
  include ActiveRecord::Model
  # ...
end

It was exciting to be able to make this work. Since I wrote it as a proof of concept, I haven’t written any tests yet. But it seems to be working just fine. The main thing I really need to look into is making sure that plugins that extend ActiveRecord::Base from their own code will still work. I’m pretty sure this will work out of the box, because the ActiveRecord::Model.included doesn’t run until the model class is loaded, and that happens after those plugins have initialized themselves.

Testing Rails Validators

It’s challenging to test Rails custom validators.

I recently had to write a validator to require that an entered date is before or after a specified date.

It didn’t seem like writing the validator would be too difficult – I’ve written custom validators before, and date comparisons aren’t all that tricky. But when it came time to write the tests, I ran into several issues. And since I always try to follow TDD / test-first, I was blocked before I even began.

The biggest issue was the ActiveModel::EachValidator#validates_each API. It’s definitely not a well-designed API. You write your validator as a subclass, overriding validates_each. The method takes a model object, the name of the attribute of the model being tested, and the value of that attribute. You can also get the options passed to the custom validator via the options method. To perform a validation, you have to update the model’s errors hash.

The big flaw in the API is that instead of returning a result, you have to update the model. This needlessly couples the model and the validator. And it violates the Single Responsibility Principle — it has to determine validity of the field, and it has to update the errors hash of the model. This is not conducive to testing. Testing this method requires testing that the side-effect has taken place in the collaborator (model), which means it’s not really a unit test any more.

So to make it easier to unit test the validator, I broke the coupling by breaking it into 2 pieces, one for each responsibility. I moved the responsibility for determining validity to a separate method, which I called errors_for. It returns a hash of the errors found. This simplified the validates_each method to simply take the result of errors_for and update the errors hash of the model:

def validate_each(record, attribute_name, attribute_value)
  record.errors[attribute_name].concat(errors_for(attribute_value, options))
end

This made it much easier to unit test the errors_for method. This method doesn’t even need to know about the model — only about the value of the attribute we’re trying to validate. We simply pass in the attribute’s value and the options.

So we could write the tests without even pulling in ActiveRecord or any models:

describe DateValidator do
  let(:validator) { DateValidator.new(attributes: :attribute_name) }
  let(:errors) { validator.errors_for(attribute_value, validation_options) }

  describe 'when attribute value is NOT a valid date' do
    let(:attribute_value) { 'not a valid date' }
    it { errors.must_include 'is not a valid date' }
  end

  describe 'when attribute value IS a valid date' do
    let(:attribute_value) { Date.parse('2013-12-11') }
    it { errors.must_be :empty? }
  end
end

And the errors_for method looked something like this:

def errors_for(attribute_value, options)
  unless attribute_value.is_a?(Date)
    return [options.fetch(:message, "is not a valid date")]
  end
  []
end

Integration testing can also be a bit of a challenge. I recommend following the example from this Stack Overflow answer. Basically, create a minimal model object that contains the field and the validation. Then test that the model behaves like you expect with different values and validations.

My Thoughts on Python vs. Ruby

I’ve been using Python at work for the past few months.  I learned Python back in the early 2000s, but never used it for any large projects.  I learned Ruby in late 2005, and it quickly became my language of choice for most cases.

So while I still prefer Ruby, and will likely use Ruby more in the future than Python, I wanted to assess the strengths and weaknesses of Python in relation to Ruby.  Perhaps some of the lessons could be applied when writing Ruby, and it could help to decide when to use each.  Also, I’m interested in programming language design, and wanted to document pros and cons in that light.

Where Python Sucks (As Compared to Ruby)

  • Have to explicitly include self in EVERY method declaration.
    • Including @classmethod declarations (although people usually use the name cls instead of self there).
    • Except @staticmethod declarations.
  • Have to use self everywhere to reference an object’s own attributes.
  • Inconsistency of things like the len() function, when everything else is a method.
  • Inconsistency of having some built-in classes with lower-case names.
    • And they’re not whole words, so you have to memorize them: list, dict, str, int.
    • Even more bizarre is dict vs. OrderedDict.
  • I’m not a fan of True and False being uppercase.
    • I’m a bit less concerned about None, for some reason.
    • If they’re going to be uppercase, they might as well be all caps, in my opinion.
  • Inconsistency of camelCase versus snake_case in some modules.
  • Claims to prefer explicit over implicit, yet does not use new to create an object.
    • It is nicely concise, but makes it look too much like a regular function call.
  • The implementation of lambdas is too limited.
    • Makes using map function not very useful.
  • Class methods are less than ideal to implement.
    • Probably better to use functions within a module instead, in many cases.
  • Lack of good functional programming tools makes it harder to manipulate lists.
    • Have to often resort to creating an initial list and adding to it in a for loop.
  • I don’t understand why r’regex_string’ doesn’t just create an actual regular expression object.
  • I miss Ruby’s method-call-or-property-getter syntax.
    • Nice that I can get it by adding @property to method definitions, but that’s a bit messy.
  • I don’t understand why lists don’t have a join() method; it seems backwards to call join on the string used to connect the list elements.
  • I miss unless; seems like with all the keywords, Python would have added that.
  • I really miss ||= to memoize.
    • The distinction between unset variables and variables set to None makes it hard.
    • I also miss +=.
  • I really miss statement modifiers (if or unless at the end of the statement/expression instead of the beginning).
  • I miss being able to assign to a compound statement (x = if True: 1; else: 2).
    • While Python does allow this simple case, it does not allow more complex cases.
  • The way modules, files, and classes work, I either have a lot of classes in one file, or have to come up with more module names in order to split classes into different files.
  • The preference for bare functions within modules over class methods often leads to functions outside of classes, when they’d make more sense more closely associated with the class.
  • I don’t quite understand the necessity for pass.
    • Seems like allowing 0 lines of indented code would suffice instead.
  • I miss implicit return.
    • Explicit return looks a bit nicer, but is less concise, and I’ve gotten out of the habit.
  • Mixins turn out to be more useful than multiple inheritance.
  • I miss string interpolation.
  • I really miss Array#first and Array#last.
  • I really don’t like that 0 is falsey.
    • I could live with empty lists and dicts being false, but not 0.0 and 0.
  • Comparing a string to a regular expression seems harder than it should be.
  • Converting to a Boolean is not as easy a Ruby’s !! syntax.
    • OK, bool(expression) isn’t so bad, I guess.

Where Python Rocks

  • Indentation removes the need for end everywhere.
  • Import statements are nice.
    • Can import everything, or just a few things.
  • Annotations are nice for aspect-oriented modification of methods.
    • Does it make it hard to debug though, like alias_method_chain in Ruby?
    • It’s kind of tricky to write them though, due to doubly-nested function definitions.
  • List comprehensions are more powerful than map.
    • But map can be more concise for the most common cases.
  • Can add arbitrary attributes to any object, class, or module.
  • The object creation syntax is nicely concise.
  • Sometimes the ability to add a bare function within a module is nice.
  • Keyword arguments are nicely done.
    • Ruby’s emulation via hashes without braces is OK, but the corner cases are problematic.
  • Docstrings as part of the language is nice.
  • The new with keyword looks like a halfway-decent replacement for blocks.
    • Seems like a lot more work that using blocks and yield though.
  • Python 3 drops support for octal literals starting with a 0.
    • Still allows it via a 0o prefix.
  • No support for all the crazy Perl-inspired globals.
  • The names list and dictionary are better than array and hash.
    • The Ruby names are more about the implementation, especially hash.
    • I’d much prefer the name map over hash, or even dictionary.
    • The name list is only slightly better than array.
  • List and string slicing is quite nice.
    • I do wish that the syntax was “..” instead of “:” though.
    • Slice assignment is even cooler.
  • I prefer the Python dict syntax over the Ruby hash syntax (“:” vs. “=>”).
    • The Ruby 1.9 symbol hash syntax is an improvement, but not quite as good.
  • Checking for string containment is nice: if substring in string.

Where Ruby Rocks

  • Consistency.
  • Blocks.
  • Excellent functional tools to deal with Enumerable.
  • Meta-programming.
  • Optional parentheses.
  • Modules and classes are also objects.

 

Debugging Pattern – Grenade

I’ve been more interested in programming patterns lately, partly due to the Ruby community’s recent interest — especially the Ruby Rogue podcast’s love affair with “Smalltalk Best Practice Patterns”.

I consider most of the patterns in “Smalltalk Best Practice Patterns” to be patterns “in the small” — things that are typically employed on a single line or method. The Gang Of Four patterns are more medium sized, dealing with methods and classes. The PEAA book covers architectural-scale patterns. I suppose “The Pragmatic Programmer” and similar books could (should!) be considered to be very general patterns, mostly about the practice of programming.

One type of pattern that I have not seen discussed much is debugging patterns. I’m not exactly sure why that is; probably just our general concentration on designing the program code itself. There are definitely testing patterns. But I can’t think of anything that I’ve seen that covers debugging patterns. A quick search on the Internet doesn’t turn up too much.

Anyway, a co-worker (Helena) and I were debugging recently, and were having trouble figuring out where a certain method on a certain object was being called. We came up with a really cool solution. We immediately called it a grenade, because its entire purpose is to blow up (raise an exception) and display a backtrace to show us the caller of the method that we’re looking for.

Here’s our implementation in Ruby:


module Grenade
  def method_under_investigation(*)
    raise "method_under_investigation called"
  end
end

class Xyz
  ...
  def xyz
    object_under_investigation.extend(Grenade)
  end
  ...
end

I’m sure we could wrap that in some more semantic sugar, even going as far as making it look something like this:


class Xyz
  ...
  def xyz
    object_under_investigation.blow_up_when_method_called(:method_under_investigation)
  end
  ...
end

I’m not sure that would really be worth it though, unless we were to add it to some sort of library.

So that’s our contribution to the (small) list of debugging patterns.

Write Comments For Yourself

Amos and I got in a heated discussion recently on whether we should write a single line comment to better explain some code. (The code in question was Amos’s very elegant solution to testing whether a job got sent to Resque.)

Amos doesn’t believe in writing comments much at all. He thinks that if you’re writing a comment, it means that you’re doing something wrong, and that you probably need to write the code more clearly.

I agree with that, to a point. First off, it’s not necessary to write perfect code. If you can change a class or method name to better describe what you’re doing (and more importantly, why you’re doing it) then you should definitely do so. But it’s not always worth refactoring until you get every “why” answered. More importantly, I don’t think it’s even possible to capture everything in the code that is worth capturing. For example, why did you choose this implementation, as opposed to another that might be more obvious or common?

After our argument, I came up with a good rule of thumb (or “pattern”):

Write comments for your (future) self.1

In other words, if your comment will help you to understand the code more quickly when you look at it in the future, then it’s a valid comment. It also means that you can assume that the reader has about as much general programming knowledge as you currently do. (Your future self will have more general knowledge, but possibly less specific knowledge of the lines of code in question. And because of this, your current solution might not make as much sense in the future. You might know of a better solution in the future, but you’ll have to know all the constraints that you had when you originally wrote the code.)

This is not to say that you should not write comments in clear English, that others can understand. The comment is written for a future maintainer. That may be you (which is why the rule works well), or it may be someone else. The rule is more about when to write a comment, and what level of competence you should assume of the reader.

1 Perhaps it should be “Write comments TO your (future) self”.