Building a HaaS (Hooks-as-a-Service)

2015 update — code samples updated with latest available Ruby (2.2.3).

One little kink new Ruby developers eventually notice is the #new/#initialize inconsistency. To create a new object, one calls the #new class method, but its implementation seems to be provided by the #initialize instance method. How did such a flaw make its way into the language spec? Well, actually, there is no flaw… just a hook.

What is a hook?

In Ruby, hooks provide a way to extend the behavior of your programs at runtime.

A hook is a method automatically triggered when a specific event occurs during execution. It comes with a default, no-op implementation you may override with custom behavior. Another definition could be that of a regular method call purposely placed within the codebase so as to allow for opt-in extensibility. The important part is that by default, the hook does nothing.

For instance, Ruby's #included method is a well-known hook triggered when a module is included into a class (mixin). One does not really notice its existence at first, because it does nothing by default, but it may be overridden with custom logic and put to good use to create reactive programs:

module Foo
  # Let's use the "included" hook to perform
  # something specific to Foo's logic.
  def self.included(base)
    puts "I (#{self}) have been included into #{base}!"
  end
end
 
class Bar
  include Foo
end
 
# => I (Foo) have been included into Bar!

Notice we never had to explicitly call Foo.included anywhere within our code, and yet the string was successfully displayed. We simply provided our own implementation of #included; no matter what, it will always be called when the Foo module is included somewhere, resulting in the message being displayed consistently.

So the whole point with hooks really is that they do nothing by default, but are lurking around at key spots in the codebase for you to instruct them to actually do something useful (hopefully).
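A common way of putting #included to good use is to have it extend the including class with class-level methods. The following is a minimal sketch of that idiom; the Trackable module, its ClassMethods helper and the Account class are hypothetical names made up for illustration.

```ruby
# Hypothetical module, for illustration only.
module Trackable
  # The hook: called by Ruby with the including class as argument.
  def self.included(base)
    base.extend(ClassMethods) # add class-level methods to the includer
  end

  module ClassMethods
    def tracked_fields
      @tracked_fields ||= []
    end

    def track(*names)
      tracked_fields.concat(names)
    end
  end

  # Regular instance method brought in by the mixin.
  def tracked_values
    self.class.tracked_fields.map { |field| [field, public_send(field)] }.to_h
  end
end

class Account
  include Trackable # triggers Trackable.included(Account)

  attr_accessor :email, :plan
  track :email, :plan # available thanks to the hook
end

p Account.tracked_fields # => [:email, :plan]
```

Without the hook, Account would have to both include Trackable and extend Trackable::ClassMethods by hand; the hook folds the two steps into one.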

The inner dialogue of a hook

Let's have a look at Ruby's default implementation (MRI 2.2.3) for the #included mechanism illustrated above.

To quickly inspect the implementation for a method, you may install the pry and pry-doc gems, and use the "show-method" command.

// [1] pry(main)> show-method Module#include
 
// From: eval.c (C Method):
// Owner: Module
// Visibility: public
// Number of lines: 17
 
static VALUE
rb_mod_include(int argc, VALUE *argv, VALUE module)
{
    int i;
    ID id_append_features, id_included;
 
    CONST_ID(id_append_features, "append_features");
    CONST_ID(id_included, "included");
 
    for (i = 0; i < argc; i++)
        Check_Type(argv[i], T_MODULE);
    while (argc--) {
        rb_funcall(argv[argc], id_append_features, 1, module);
        rb_funcall(argv[argc], id_included, 1, module);
    }
    return module;
}

Notice how Ruby will call an id_included method (the #included one, really) for you upon including a module within a class. Here is that method:

// [2] pry(main)> show-method Module#included
 
// From: object.c (C Method):
// Owner: Module
// Visibility: private
// Number of lines: 5
 
static VALUE
rb_obj_dummy(void)
{
    return Qnil;
}

It's simply a no-op method returning nil.

The #included method being a hook, some may say the specific implementation we provide when writing def self.included … end is a callback associated with this "include" hook. From that perspective, the hook method is much like a "placeholder" for you to fill in, gently waiting for an event that will trigger any callback that has been registered with it. That callback mechanism is precisely what happens in JavaScript, a language which features higher-order functions (functions or closures one may pass as arguments). With Ruby's built-in hooks though, no callback is ever registered: what happens is that the default #included method provided by Module, as shown above, is shadowed by Foo's implementation.
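One way to check that nothing gets registered is to ask Ruby where the method lives. A minimal sketch, using two hypothetical modules (Plain and Loud) made up for illustration:

```ruby
# Hypothetical modules, for illustration only.
module Plain; end

module Loud
  # Singleton method defined on Loud itself: it is found before
  # Module#included during method lookup.
  def self.included(base)
    (@includers ||= []) << base
  end

  def self.includers
    @includers || []
  end
end

class Host
  include Loud
end

# The default hook is a private instance method of Module...
p Module.private_instance_methods(false).include?(:included) # => true
# ...which Plain inherits untouched, and which returns nil when called by hand.
p Plain.method(:included).owner                              # => Module
p Plain.send(:included, Object)                              # => nil
# Loud's version lives on Loud's singleton class instead.
p Loud.method(:included).owner == Loud.singleton_class       # => true
p Loud.includers                                             # => [Host]
```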

Some other hooks provided by Ruby are #extended, #inherited, #method_added, #method_removed, the (in)famous #method_missing, and many others. Some are defined at the class level, some as instance methods. Getting to know them and using them properly will allow you to leverage a little more of Ruby’s dynamic capabilities: think metaprogramming.

One member of the herd remains unknown to the masses, though; and yet it may be the most widely used: our beloved #initialize!
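To give a feel for the difference between class-level and instance-level hooks, here is a small sketch using #inherited and #method_missing. The Plugin, Markdown, Textile and Ghost classes are hypothetical and exist only for this example.

```ruby
# Hypothetical classes, for illustration only.
class Plugin
  # Class-level hook: called each time Plugin gets subclassed.
  def self.inherited(subclass)
    super
    Plugin.registry << subclass
  end

  def self.registry
    @registry ||= []
  end
end

class Markdown < Plugin; end
class Textile < Plugin; end

class Ghost
  # Instance-level hook: called when no matching method is found.
  def method_missing(name, *args)
    return "boo from #{name}" if name.to_s.start_with?("haunt_")

    super # anything else still raises NoMethodError
  end

  # Keeps respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("haunt_") || super
  end
end

p Plugin.registry       # => [Markdown, Textile]
p Ghost.new.haunt_attic # => "boo from haunt_attic"
```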

What is going on when calling `Bar.new`?

Upon creating a new instance:

some memory has to be allocated to store the instance (an object); Ruby takes care of that using Bar's class method named allocate, which returns a reference to the new, blank object;
the initialize message is then sent to that object, which is self for the duration of the call.

Therefore, when a developer calls the class method new, the instance method initialize is triggered as a side effect of that call, on the brand-new, blank object which has just been created. What is neat is that if no #initialize method is found in your class implementation, everything still works smoothly, for Ruby falls back to its default, no-op implementation.
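The two steps can be mimicked in plain Ruby. This is only an illustrative sketch: my_new is a made-up name, and MRI's actual implementation of new is written in C.

```ruby
class Bar
  attr_reader :name

  # Illustrative stand-in for Class#new (hypothetical, not MRI's code).
  def self.my_new(*args, &block)
    instance = allocate                       # 1. blank object, no state yet
    instance.send(:initialize, *args, &block) # 2. initialize is private, hence send
    instance                                  # 3. return the object, not initialize's result
  end

  def initialize(name)
    @name = name
  end
end

# No initialize here: the default no-op implementation is used.
class Baz; end

p Bar.my_new("hooked").name                  # => "hooked"
p Bar.allocate.name                          # => nil
p Baz.instance_method(:initialize).owner     # => BasicObject
```

Note how Bar.allocate alone yields an object whose @name was never set: the customization only happens once the initialize hook has run.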

Check my follow-up post A review of object creation in Ruby for more details!

Hooks can make you a better lover

In the object creation process described above, #new provides the core implementation (memory allocation, safety checks…) and #initialize allows for customizing the new object. It is an opt-in mechanism for you to leverage if you need it, and it also prevents you from ever messing with Ruby's internal workflow.

This concept of "safe customization" applies to third-party libraries as well. By implementing the hooks provided by a library, one may tweak programs to suit implementation-specific requirements, while avoiding ugly, unsafe, unsustainable monkey-patching. The hook pattern is so neat and simple that I would advise you not to remain a mere hook consumer, but to become a hook provider yourself. Gently adding well-thought-out hooks at key locations within your public APIs will empower fellow developers by:

preventing them from monkey-patching when it is not actually needed (in my experience: most of the time)
making it easier to use Inversion of Control patterns
clarifying a program's workflow (if hooks are properly documented!)

And that is not all there is to it! Providing a bunch of well-thought-out hooks is beneficial to your own code as well, because it will force you into designing APIs as services consumed by users/developers. By thinking about which hooks would be useful (and which would not), and where to put them, you may well shed light on subtle implementation flaws or weaknesses, refine your architecture or modularize the code some more. On top of that, writing documentation with that mindset (from a user's perspective, that is) is a great way to ensure it targets its true audience, not just you, the maintainer.

A hook's duties

I have been rambling about hooks, but have not provided insights about what a good hook would be.

The first important thing to pay attention to is separating the core, "inviolable" behaviours of your codebase from the more "foggy" parts. Let's take an example: when communicating with a database, atomic operations like read/write/delete are not to be messed with, but you may want to allow side effects around those core operations. Depending on the needs and the extent of what is to be allowed in terms of side effects, a "public" domain will be marked out and hooks added at its borders: think "before/after/around" callbacks, for instance.
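Here is one way the database example could look, as a sketch only. Store and AuditedStore are hypothetical classes, the "database" is a plain Hash, and the before_write/after_write hook names are assumptions chosen for this example.

```ruby
# Hypothetical in-memory store, for illustration only.
class Store
  def initialize
    @rows = {}
  end

  # Core operation: the sequence of steps is fixed by the library.
  def write(key, value)
    before_write(key, value)
    @rows[key] = value # core step, not meant to be customized
    after_write(key, value)
    value # the hooks' return values are ignored
  end

  def read(key)
    @rows[key]
  end

  private

  # Hooks: no-ops by default, meant to be overridden in subclasses.
  def before_write(key, value); end

  def after_write(key, value); end
end

class AuditedStore < Store
  attr_reader :log

  def initialize
    super
    @log = []
  end

  private

  # Opt-in side effect, added without touching Store#write.
  def after_write(key, value)
    @log << "wrote #{key}"
  end
end

store = AuditedStore.new
store.write(:answer, 42)
p store.read(:answer) # => 42
p store.log           # => ["wrote answer"]
```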

Hooks should not be provided as zombie hordes though. You should have enough of them to cover your business domain, but not more, or your codebase may become brittle. Place hooks at hot-spots, the logical joints in your code’s flow: around core operations, within or at the end of initializers, gate keepers and sweeper methods, and in between steps where it would make sense to check for specific requirements enforced by the user or to allow for side effects.

As a library author, do not trust hooks’ callbacks. Consider all callbacks to be dumb, unicellular, self-contained code units. They must never introduce coupling to your internal operations, and you shall never expect a hook to be implemented by the user (or, worse, expect a hook to abide by some kind of contract: returning values, having specific side effects, etc.). One way to avoid any problem is to consider hooks as being closures (blocks/lambdas in Ruby, function closures in JavaScript, etc.). Another safeguard is to… provide good documentation, empowering users with sound examples.
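A defensive take on this advice might look like the sketch below. The Importer class and its after_import method are hypothetical, and rescuing callback errors is just one possible design choice: depending on your library, letting them propagate may be preferable.

```ruby
# Hypothetical class, for illustration only.
class Importer
  def initialize
    @after_import = []
  end

  # Registers a callback given as a block (a closure). Registering is optional.
  def after_import(&block)
    @after_import << block if block
    self
  end

  def import(rows)
    imported = rows.map(&:to_s) # core work, independent from any callback

    @after_import.each do |callback|
      begin
        # Hand over a (shallow) copy and ignore whatever the callback returns.
        callback.call(imported.dup)
      rescue StandardError => e
        # A faulty callback must not break the core operation.
        warn "after_import callback failed: #{e.message}"
      end
    end

    imported
  end
end

seen = []
importer = Importer.new
importer.after_import { |rows| seen.concat(rows) }
importer.after_import { |_rows| raise "boom" } # misbehaving callback
importer.after_import { |rows| rows.clear }    # only clears its own copy

p importer.import([1, 2]) # => ["1", "2"]
p seen                    # => ["1", "2"]
```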

Adding hooks is easy!

All of what has been presented so far is by no means new or genius stuff. And it is not specific to Ruby either. It certainly is not a "design pattern" per se. But… it is still awesome! Unfortunately, there are too few libraries providing hooks out there, especially within the Ruby ecosystem.

To help you get started, my friend Nick wrote a little gem, the aptly named "hooks": you may want to give it a try, as it leverages a true callback style. There are other libs available as well, but I feel like they introduce more overhead, relying on meta-programming to hook "around" method calls at run-time, which is simply too complicated for little actual value. Sticking to a more explicitly declarative style is usually just as powerful.