Constant naming and lookup in Ruby

I thought I had a pretty good understanding of how constant lookup worked in Ruby, but I encountered a surprising piece of behavior recently and I wanted to share it.

We have a god model at work that contains thousands of lines of code, much of which is in methods that aren’t truly core to the model. In a long-term attempt to clean this up, we started by moving some of the methods in this model into dependency mixins, like so:

class User
  include User::LoginDependency
  # lots of methods
end

module User::LoginDependency
  # login-specific methods go in this file
end

My pair and I started moving methods over, and it all started out easy enough.

Then we encountered something surprising: a method in one of the mixins referenced a constant from the god model, and we got the error uninitialized constant User::LoginDependency::CONSTANT.

This, to me, was surprising to see. Ruby, after all, is a dynamic language with late binding, and I expect things like this to be figured out at runtime. My first hand-wavy assumption was that constant lookup works in a manner similar to sending messages to objects, which made me expect that to work.

Turns out, how you name something makes a difference. If I want to have a module called User::LoginDependency, there are two ways I can define it:

module User
  module LoginDependency
  end
end

module User::LoginDependency
end

I can get to both by typing User::LoginDependency, but the key difference is that in the first case, Ruby understands it as “a module LoginDependency nested inside of the module User,” and in the second case, it’s “the module User::LoginDependency”.

For the nested version of User::LoginDependency, if Ruby doesn’t find the constant inside of LoginDependency, Ruby will work its way up the namespace chain and look for the constant in User. In the latter case, there isn’t a next level up to look in: the lexical scope contains only User::LoginDependency, so User is never searched.
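Here’s a minimal sketch of that difference. The names Account, Nested, Compact and LIMIT are made up for illustration; Module.nesting is the standard way to ask Ruby which lexical scopes it will search.

```ruby
# Hypothetical names throughout; Account stands in for User.
module Account
  LIMIT = 3

  # Nested definition: the lexical scope is [Account::Nested, Account].
  module Nested
    def self.nesting
      Module.nesting
    end

    def self.limit
      LIMIT # found by walking up the lexical scope to Account
    end
  end
end

# Compact definition: the lexical scope is only [Account::Compact].
module Account::Compact
  def self.nesting
    Module.nesting
  end

  def self.limit
    LIMIT # raises NameError: Account is not in the lexical scope
  end
end

Account::Nested.nesting  # => [Account::Nested, Account]
Account::Compact.nesting # => [Account::Compact]
Account::Nested.limit    # => 3
```

Calling Account::Compact.limit raises a NameError, the same kind of error we hit in our mixin.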

For this refactoring, my pair and I made the decision that if the new module’s methods were the only ones to reference a given constant, we’d move the constant over too. Otherwise, we’d leave the constant in the original model and update the new dependency mixin to reference the constant by its fully qualified name, e.g. User::CONSTANT instead of just CONSTANT.
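As a rough sketch of that convention (Member, MAX_ATTEMPTS, LOCKOUT_MINUTES and the method names are all hypothetical, not taken from our codebase):

```ruby
# Hypothetical model; in a real app the class and the mixin would live in
# separate files. Here the class is defined first so the sketch runs as one file.
class Member
  MAX_ATTEMPTS = 5 # shared constant, stays in the model
end

module Member::LoginDependency
  LOCKOUT_MINUTES = 15 # referenced only by this mixin, so it moved here

  def locked_out?(failed_attempts)
    # Shared constant: use the fully qualified name.
    failed_attempts >= Member::MAX_ATTEMPTS
  end

  def lockout_minutes
    # The mixin's own constant resolves without qualification.
    LOCKOUT_MINUTES
  end
end

class Member
  include Member::LoginDependency
end

Member.new.locked_out?(5) # => true
Member.new.lockout_minutes # => 15
```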

Alternatively, we could have switched these mixins over to being defined with a nested namespace, but in our case User is a class and we didn’t want to be re-opening that class just to add these modules that would later be included. Besides, our approach seemed more appropriate, because it keeps the mixin-specific constants private to that mixin, and the shared stuff is all together in the model that those mixins are all included in.


Default values for hashes in Ruby

I was recently working on some code that involved hashes of arrays. As I was reading through some behaviors of Hash in the Ruby docs, I was delighted to see that you could pass an object to Hash.new and it would be the default value returned when you tried accessing a key that didn’t exist in the hash.

So, let’s try this out a little bit!

2.1.1 :001 > arrays = Hash.new([])
 => {} 
2.1.1 :002 > arrays[:colors] << :blue
 => [:blue] 
2.1.1 :003 > arrays[:colors] << :red
 => [:blue, :red] 
2.1.1 :004 > arrays[:colors]
 => [:blue, :red] 

Looks like it’s working! Let’s use another key.

2.1.1 :005 > arrays[:shapes] << :quadrilateral
 => [:blue, :red, :quadrilateral] 

Wait, whaaaa? Let’s see how my :colors array is doing:

2.1.1 :006 > arrays[:colors]
 => [:blue, :red, :quadrilateral] 

Oh, no! What is going on with this hash?

2.1.1 :007 > arrays
 => {} 

Okay, let’s read those Ruby docs more closely:

new → new_hash
new(obj) → new_hash
new {|hash, key| block } → new_hash

Returns a new, empty hash. If this hash is subsequently accessed by a key that doesn't correspond to a hash entry, the value returned depends on the style of new used to create the hash. In the first form, the access returns nil. If obj is specified, this single object will be used for all default values. If a block is specified, it will be called with the hash object and the key, and should return the default value. It is the block's responsibility to store the value in the hash if required.

There are two subtle things at play here. First off, giving hashes a default value doesn’t mean that anything is stored in the hash when you try to access a nonexistent key. That explains why my arrays hash is still empty even after I’m shoveling things onto arrays. This is sensible default behavior; a hash could grow without bound if by default a new value got added to a hash whenever it was accessed by a nonexistent key.
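A quick way to see this for yourself is a sketch like the following, where counts and the keys are just illustrative names:

```ruby
counts = Hash.new(0)

# Reading a missing key returns the default...
value = counts[:missing] # => 0

# ...but nothing is stored in the hash.
counts.key?(:missing) # => false
counts.size           # => 0

# Only an explicit assignment adds an entry.
counts[:seen] += 1
counts # => {:seen=>1}
```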

The second subtlety here is a reminder that in Ruby, objects are mutable. We are providing the hash a single object instance (in this case, a new empty array) that is returned as the default value when you try to access a key in the hash that doesn’t exist. If I change that array by appending things to it, I’ll still get back that same array object in the future when I access the hash by a nonexistent key.
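To make the sharing concrete, here is a small sketch (variable names are mine) that checks object identity directly with equal?:

```ruby
shared = Hash.new([])

first  = shared[:colors]
second = shared[:shapes]

# Both lookups hand back the very same array object.
same_object = first.equal?(second) # => true

# Mutating it changes what every missing key returns from now on.
first << :blue
shared[:anything_else] # => [:blue]
shared.default         # => [:blue]

# The hash itself still has no entries.
shared.empty? # => true
```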

I want the hash to work so that when I access a nonexistent key, I get back a new empty array, and that array is added to the hash. We can do this by passing a block to Hash.new:

2.1.1 :008 > groups = Hash.new {|hash, key| hash[key] = []}
 => {} 
2.1.1 :009 > groups[:colors] << :red
 => [:red] 
2.1.1 :010 > groups[:colors] << :blue
 => [:red, :blue] 
2.1.1 :011 > groups[:shapes] << :octagon
 => [:octagon] 
2.1.1 :012 > groups
 => {:colors=>[:red, :blue], :shapes=>[:octagon]} 

I’ve been using Ruby for years and I lost at least an hour recently because I wasn’t accounting for this subtle behavior.


Don’t make perfect modularity the enemy of a good refactor

I often find myself reviewing pull requests and I will find classes that contain a lot of domain-specific logic that isn’t relevant to the class itself. I’ll point it out and the response is often “I plan to extract this out into a gem, but I just haven’t had a chance/I’ve been busy.”

Extracting the functionality out into a gem would be great, but I’ll be the first to admit that’s the kind of task I’d procrastinate on to no end. You have to extract the functionality, get the right directory structure, get a gemspec in place, and then host the gem on RubyGems or, if your gem is private, Gemfury. Even then, when you make changes to the gem you have to go and do a Bundler update on the apps that use the gem.

Don’t start with that solution, though. Start by extracting the out-of-place functionality into a new class that just lives inside the /lib directory in the application. In many Rails apps, /lib is already on the autoload path (if it isn’t in yours, add it to config.autoload_paths), so you can really just extract the code into the new class file and you’re ready to go.
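As a rough illustration of how small that first step can be, here’s a sketch with entirely hypothetical names (Order, ShippingRateCalculator and the rates are made up), written as one file so it runs on its own:

```ruby
# Would live in lib/shipping_rate_calculator.rb (hypothetical path and class).
class ShippingRateCalculator
  BASE_RATE_CENTS = 500
  PER_KG_CENTS    = 120

  def initialize(weight_kg)
    @weight_kg = weight_kg
  end

  # Flat base rate plus a per-kilogram charge, in cents.
  def cents
    BASE_RATE_CENTS + (@weight_kg * PER_KG_CENTS).round
  end
end

# The model keeps a one-line delegation instead of the pricing logic itself.
class Order
  def initialize(weight_kg)
    @weight_kg = weight_kg
  end

  def shipping_cents
    ShippingRateCalculator.new(@weight_kg).cents
  end
end

Order.new(2).shipping_cents # => 740
```

The new class has no dependency on the model, which is what makes it easy to unit test now and easy to lift into a gem later.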

You probably feel like you could do better than that, and that’s a great attitude to have. But by extracting the functionality out of the unrelated class and into a new one, you’ve addressed the biggest concern. If you later find that you really do need a gem (for instance, maybe you’ve got other apps that need the functionality in that class), you’ve already got the class and a unit test file (right? right?) separated out and ready to slip into a gem.