Category Archives: Ruby

The difference between procs and lambdas in Ruby

Note: before starting this post, I recommend reading my other posts about procs and closures for background.

Overview

What’s the difference between a proc and a lambda?

Lambdas actually are procs. Lambdas are just a special kind of proc and they behave a little bit differently from regular procs. In this post we’ll discuss the two main ways in which lambdas differ from regular procs:

  1. The return keyword behaves differently
  2. Arguments are handled differently

Let’s take a look at each one of these differences in more detail.

The behavior of “return”

In lambdas, return means “exit from this lambda”. In regular procs, return means “exit from embracing method”.

Below is an example, pulled straight from the official Ruby docs, which illustrates this difference.

def test_return
  # This is a lambda. The "return" just exits
  # from the lambda, nothing more.
  -> { return 3 }.call

  # This is a regular proc. The "return" returns
  # from the method, meaning control never reaches
  # the final "return 5" line.
  proc { return 4 }.call

  return 5
end

test_return # => 4

Argument handling

Argument matching

A proc will happily execute a call with the wrong number of arguments. A lambda requires all arguments to be present.

> p = proc { |x, y| "x is #{x} and y is #{y}" }
> p.call(1)
 => "x is 1 and y is "
> p.call(1, 2, 3)
 => "x is 1 and y is 2"
> l = lambda { |x, y| "x is #{x} and y is #{y}" }
> l.call(1)
(irb):5:in `block in <main>': wrong number of arguments (given 1, expected 2) (ArgumentError)
> l.call(1, 2, 3)
(irb):14:in `block in <main>': wrong number of arguments (given 3, expected 2) (ArgumentError)

Array deconstruction

If you call a proc with an array instead of separate arguments, the array will get deconstructed, as if the array is preceded with a splat operator.

If you call a lambda with an array instead of separate arguments, the array will be interpreted as the first argument, and an ArgumentError will be raised because the second argument is missing.

> proc { |x, y| "x is #{x} and y is #{y}" }.call([1, 2])
 => "x is 1 and y is 2"
> lambda { |x, y| "x is #{x} and y is #{y}" }.call([1, 2])
(irb):9:in `block in <main>': wrong number of arguments (given 1, expected 2) (ArgumentError)

In other words, lambdas behave exactly like Ruby methods. Regular procs don’t.

Takeaways

  • In lambdas, return means “exit from this lambda”. In regular procs, return means “exit from embracing method”.
  • A regular proc will happily execute a call with the wrong number of arguments. A lambda requires all arguments to be present.
  • Regular procs deconstruct arrays in arguments. Lambdas don’t.
  • Lambdas behave exactly like methods. Regular procs behave differently.

Why DSLs are a necessary part of learning Rails testing

If you want to be a competent Rails tester, there are a lot of different things you have to learn. The things you have to learn might be divided into three categories.

The first of these three categories is tools. For example, you have to choose a testing framework and learn how to use it. Then there are principles, such as the principle of testing behavior vs. implementation. Lastly, there are practices, like the practice of programming in feedback loops.

This post will focus on the first category, tools.

For better or worse, the testing tools most commercial Rails projects use are RSpec, Factory Bot and Capybara. When developers who are new to testing (and possibly Ruby) first see RSpec syntax, for example, they’re often confused.

Below is an example of a test written using RSpec, Factory Bot and Capybara. To a beginner the syntax may look very mysterious.

describe "Signing in", type: :system do
  it "signs the user in" do
    user = create(:user)
    visit new_user_session_path
    fill_in "Username", with: user.username
    fill_in "Password", with: user.password
    click_on "Submit"
    expect(page).to have_content("Sign out")
  end
end

The way to take the above snippet from something mysterious to something perfectly clear is to learn all the details of how RSpec, Factory Bot and Capybara work. And doing that will require us to become familiar with domain-specific languages (DSLs).

For each of RSpec, Factory Bot and Capybara, there’s a lot to learn. And independently of those tools, there’s a lot to be learned about DSLs as well. Therefore I recommend learning a bit about DSLs separately from learning about the details of each of those tools.

Here are some posts that can help you learn about DSLs. If you’re learning testing, I suggest going through these posts and seeing if you can connect them to the code you see in your Rails projects’ codebases. As you gain familiarity with DSL concepts and the ins and outs of your particular tools, your test syntax should look increasingly clear to you.

Understanding Ruby Proc objects
Understanding Ruby closures
Understanding Ruby blocks
What the ampersand in front of &block means
The two common ways to call a Ruby block
How map(&:some_method) works
How Ruby’s instance_exec works
How Ruby’s method_missing works

Learning how Ruby DSLs work can be difficult and time-consuming but it’s well worth it. And if you’re using testing tools that make use of DSLs, learning about DSLs is a necessary step toward becoming a fully competent Rails tester.

Ruby memoization

What is memoization?

Memoization is a performance optimization technique.

The idea with memoization is: “When a method invokes an expensive operation, don’t perform that operation each time the method is called. Instead, just invoke the expensive operation once, remember the answer, and use that answer from now on each time the method is called.”

Below is an example that shows the benefit of memoization. The example is a class with two methods which both return the same result, but one is memoized and one is not.

The expensive operation in the example takes one second to run. As you can see from the benchmark I performed, the memoized method is dramatically more performant than the un-memoized one.

Running the un-memoized version 10 times takes 10 seconds (one second per run). Running the memoized version 10 times takes only just over one second. That’s because the first call takes one second but the calls after that take a negligibly small amount of time.

class Product
  # This method is NOT memoized. This method will invoke the
  # expensive operation every single time it's called.
  def price
    expensive_calculation
  end

  # This method IS memoized. It will invoke the expensive
  # operation the first time it's called but never again
  # after that.
  def memoized_price
    @memoized_price ||= expensive_calculation
  end
  
  def expensive_calculation
    sleep(1)
    500
  end
end

require "benchmark"

product = Product.new
puts Benchmark.measure { 10.times { product.price } }
puts Benchmark.measure { 10.times { product.memoized_price } }
$ ruby memoized.rb
  0.000318   0.000362   0.000680 ( 10.038078)
  0.000040   0.000049   0.000089 (  1.003962)

Why is memoization called memoization?

I’ve always thought memoization was an awkward term due to its similarity to “memorization”. The obscurity of the name bugged me a little so I decided to look up its etymology.

According to Wikipedia, “memoization” is derived from the Latin word “memorandum”, which means “to be remembered”. “Memo” is short for memorandum, hence “memoization”.

When to use memoization

The art of performance optimization is a bag of many tricks: query optimization, background processing, caching, lazy UI loading, and other techniques.

Memoization is one trick in this bag of tricks. You can recognize its use case when an expensive method is called repeatedly without a change in return value.

This is not to say that every time a case is encountered where an expensive method is called repeatedly without a change in return value that it’s automatically a good use case for memoization. Memoization (just like all performance techniques) is not without a cost, as we’ll see shortly. Memoization should only be used when the benefit exceeds the cost.

As with all performance techniques, memoization should only be used a) when you’re sure it’s needed and b) when you have a plan to measure the before/after performance effect. Otherwise what you’re doing is not performance optimization, you’re just randomly adding code (i.e. incurring costs) without knowing whether the costs you’re incurring are actually providing a benefit.

The costs of memoization

The main cost of memoization is that you risk introducing subtle bugs. Here are a couple examples of the kinds of bugs to which memoization is susceptible.

Instance confusion

Memoization works if and only if the return value will always be the same. Let’s say, for example, that you have a loop that makes use of an object which has a memoized method. Maybe this loop uses the same object instance in every single iteration, but you’re under the mistaken belief that a fresh instance is used for each iteration.

In this case the value from the object in the first iteration will be correct, but all the subsequent iterations risk being incorrect because they’ll use the value from the first iteration rather than getting their own fresh values.

If this type of bug sounds contrived, it’s not. It comes from a real example of a bug I once caused myself!

Nil return values

In the example above, if expensive_calculation had been nil, then the value wouldn’t get memoized because @memoized_price would be nil and nil is falsy.

The risk of such a bug is probably low, and the consequences of the bug are probably small in most cases, but it’s a good category of bug to be aware of. An alternative solution is to use defined? rather than lazy initialization, which is not susceptible to the nil-is-falsy bug.

Understandability

Last but certainly not least, code that involves memoization is harder to follow than code that doesn’t. This is probably the biggest cost of memoization. Code that’s hard to understand is hard to change. Code that’s hard to understand provides a place for bugs to hide.

Prudence pays off

Because memoization isn’t free, it’s not a good idea to habitually add memoization to methods as a default policy. Instead, add memoization on a case-by-case basis when it’s clearly justified.

Takeaways

  • Memoization is a performance optimization technique that prevents wasteful repeated calls to an expensive operation when the return value is the same each time.
  • Memoization should only be added when you’re sure it’s needed and you have a plan to verify the performance difference.
  • A good use case for memoization is when an expensive method is called repeatedly without a change in return value.
  • Memoization isn’t free. It carries with it the risk of subtle bugs. Therefore, don’t apply memoization indiscriminately. Only use it in cases where there’s a clear benefit.

How and why to use Ruby’s method_missing

Why method_missing exists

Normally, an object only responds to messages that match the names of the object’s methods and public accessors. For example, if I send the message first_name to an instance of User, then the User object will only respond to my message of first_name if User has a method or accessor called first_name.

But sometimes it’s useful to allow objects to respond to messages that don’t correspond to methods or accessors.

For example, let’s say we want to connect our User object to a database table. It would be very convenient if we could send messages to instances of User that correspond to the database table’s column names without having to either explicitly define new methods or do something inelegant like, for example, user.value(:first_name). It would be better if we could get the database value by calling user.first_name.

What’s more, if we added a new column called last_name, it would be good if we could just do user.last_name without having to change any code.

method_missing allows us to do things like this. In this post we’ll see how by going through an example that’s similar to (but simpler than) the database example above.

Arbitrary attribute setter example

In the below example, we’ll use method_missing to define some behavior that allows us to arbitrarily set values on an object. We’ll have an object called user on which we can call set_first_name, set_last_name, set_height_in_millimeters or whatever other arbitrary values we want.

A plain User object

In the following snippet, we define a class called User which is completely empty. We attempt to call set_first_name on a User instance which, of course, fails because User has no method called set_first_name.

# user.rb

class User
end

user = User.new
user.set_first_name("Jason")

When we run the above, we get undefined method `set_first_name' for #<User:0x00000001520e01c0> (NoMethodError).

$ ruby user.rb
Traceback (most recent call last):
user.rb:9:in `<main>': undefined method `set_first_name' for #<User:0x00000001520e01c0> (NoMethodError)

Adding method_missing

Now we add a method to the User class with a special name: method_missing.

In order for our method_missing implementation to work it has to follow a certain function signature. The first parameter, method_name, corresponds to the name of the message that was passed (e.g. first_name). The second parameter, *args corresponds to any arguments that were passed, and comes through as an array, thanks to the splat operator.

In this snippet all we’ll do is output the values of method_name and *args so we can begin to get a feel for how method_missing works.

class User
  def method_missing(method_name, *args)
    puts method_name
    puts args
  end
end

user = User.new
user.set_first_name("Jason")

When we run this we see set_first_name as the value for method_name and Jason as the value for args.

$ ruby user.rb
set_first_name
Jason

Parsing the attribute name

Now let’s parse the attribute name. When we pass set_first_name, for example, we want to parse the attribute name of first_name. This can be done by grabbing a substring that excludes the first four characters of the method name.

class User
  def method_missing(method_name, value)
    attr_name = method_name.to_s[4..]
    puts attr_name
  end
end

user = User.new
user.set_first_name("Jason")

This indeed gives us just first_name.

$ ruby user.rb 
first_name

Setting the attribute

Now let’s set the actual attribute. Remember that args will come through as an array. (The reason that args is an array is because whatever method is called might be passed multiple arguments, not just one argument like we’re doing in this example.) We’re interested only in the first element of args because user.set_first_name("Jason") only passes one argument.

class User
  def method_missing(method_name, *args)
    attr_name = method_name.to_s[4..]
    instance_variable_set("@#{attr_name}", args[0])
  end
end

user = User.new
user.set_first_name("Jason")
puts user.instance_variable_get("@first_name")

When we run this it gives us the value we passed it, Jason.

$ ruby user.rb
Jason

We can also set and get any other attributes we want.

class User
  def method_missing(method_name, *args)
    attr_name = method_name.to_s[4..]
    instance_variable_set("@#{attr_name}", args[0])
  end
end

user = User.new
user.set_first_name("Jason")
user.set_last_name("Swett")
puts user.instance_variable_get("@first_name")
puts user.instance_variable_get("@last_name")

When we run this we can see that both values have been set.

$ ruby user.rb
Jason
Swett

A note about blocks

In other examples you may see the method signature of method_missing shown like this:

def method_missing(method_name, *args, &block)

method_missing can take a block as an argument, but actually, so can any Ruby method. I chose not to cover blocks in this post because the way method_missing‘s blocks work is the same as the way blocks work in any other method, and a block example might confuse beginners. If you’d like to understand blocks more in-depth, I’d recommend my other post about blocks.

Takeaways

  • method_missing can be useful for constructing DSLs.
  • method_missing can be added to any object to endow that object with special behavior when the object gets sent a message for which it doesn’t have a method defined.
  • method_missing takes the name of the method that was called, an arbitrary number of arguments, and (optionally) a block.

How Ruby’s instance_exec works

In this post we’ll take a look at Ruby’s instance_exec, a method which can can change the execution context of a block and help make DSL syntax less noisy.

Passing arguments to blocks

When calling a Ruby block using block.call, you can pass an argument (e.g. block.call("hello") and the argument will be fed to the block.

Here’s an example of passing an argument when calling a block.

def word_fiddler(&block)
  block.call("hello")
end

word_fiddler do |word|
  puts word.upcase
end

In this case, the string "hello" gets passed for word, and word.upcase outputs HELLO.

We can also do something different and perhaps rather strange-seeming. We can use a method called instance_exec to execute our block in the context of whatever argument we send it.

Executing a block in a different context

Note how in the following example word.upcase has changed to just upcase.

def word_fiddler(&block)
  "hello".instance_exec(&block)
end

word_fiddler do
  puts upcase
end

The behavior is the exact same. The output is identical. The only difference is how the behavior is expressed in the code.

How this works

Every command in Ruby operates in a context. Every context is an object. The default context is an object called main, which you can demonstrate by opening a Ruby console and typing self.

We can also demonstrate this for our earlier word_fiddler snippet.

def word_fiddler(&block)
  block.call("hello")
end

word_fiddler do |word|
  puts self # shows the current context
  puts word.upcase
end

If you run the above snippet, you’ll see the following output:

main
HELLO

The instance_exec method works because in changes the context of the block it invokes. Here’s our instance_exec snippet with a puts self line added.

def word_fiddler(&block)
  "hello".instance_exec(&block)
end

word_fiddler do
  puts self
  puts upcase
end

Instead of main, we now get hello.

hello
HELLO

Why instance_exec is useful

instance_exec can help make Ruby DSLs less verbose.

Consider the following Factory Bot snippet:

FactoryBot.define do
  factory :user do
    first_name { 'John' }
    last_name { 'Smith' }
  end
end

The code above consists of two blocks, one nested inside the other. There are three methods called in the snippet, or more precisely, there are three messages being sent: factory, first_name and last_name.

Who is the recipient of these messages? In other words, in what contexts are these two blocks being called?

It’s not the default context, main. The outer block is operating in the context of an instance of a class called FactoryBot::Syntax::Default::DSL, which is defined by the Factory Bot gem. This means that the factory message is getting sent to an instance of FactoryBot::Syntax::Default::DSL.

The inner block is operating in the context of a different object, an instance of FactoryBot::Declaration::Implicit. The first_name and last_name messages are getting sent to this class.

You can perhaps imagine what the Factory Bot syntax would have to look like if it were not possible to change blocks’ contexts using instance_exec. The syntax would be pretty verbose and noisy.

Takeaways

  • instance_exec is a method that executes a block in the context of a certain object.
  • instance_exec can help make DSL syntax less noisy and verbose.
  • Methods like Factory Bot’s factory and RSpec’s it and describe are possible because of instance_exec.

How map(&:some_method) works

The map method’s shorthand syntax

One of the most common uses of map in Ruby is to take an object and call some method on the object, like this.

[1, 2, 3].map { |number| number.to_s }

To save us from redundancy, Ruby has a shorthand version which is functionally equivalent to the above.

[1, 2, 3].map(&:to_s)

The shorthand version is nice but its syntax is a little mysterious. In this post I’ll explain why the syntax is what it is.

Passing symbols as blocks

Let’s leave the world of map for a moment and deal with a “regular” method.

I’m going to show you a method which takes a block and then demonstrate four different ways of passing a block to that method.

Side note: if you’re not too familiar with Proc objects yet, I would suggest reading my other posts on how Proc objects work and what the & in front of &block means before continuing.

First way: using a normal block

You’ve of course seen this way before. We call my_method and pass a regular block to it.

def my_method(&block)
  block.call("hello")
end

puts my_method { |value| value.upcase } # outputs "HELLO"

Second way: using Proc.new

If we wanted to, we could instead pass our block using Proc.new. Since my_method takes a block and not a Proc object, we would have to prefix Proc.new with an ampersand to convert the Proc object into a block.

(If you didn’t know, prefixing an expression with & will convert a proc to a block and a block to a proc. See this post for more details on how that works.)

def my_method(&block)
  block.call("hello")
end

puts my_method(&Proc.new { |value| value.upcase })

There would never really be a practical reason to express the syntax this way, but I wanted to show that it’s possible. This “second way” example will also connect the first way and the third way.

Third way: using to_proc

All symbols respond to a method called to_proc which returns a Proc object. If we do :upcase.to_proc, it gives us a Proc object that’s equivalent to what we would have gotten by doing Proc.new { |value| value.upcase }.

def my_method(&block)
  block.call("hello")
end

puts my_method(&:upcase.to_proc)

Fourth way: passing a symbol

I’ll show one final way. When Ruby sees an argument that’s prefixed with an ampersand, it attempts to call to_proc on the argument. So our to_proc on &:upcase.to_proc is actually superfluous. We can just pass &:upcase all by itself.

def my_method(&block)
  block.call("hello")
end

puts my_method(&:upcase)

What ultimately gets passed is the Proc object that results from calling :upcase.to_proc. Actually, more precisely, what gets passed is the block that results from calling &:upcase.to_proc, since the & converts the Proc object to a block.

Passing symbols to map

With the understanding of the above, you now know that this:

[1, 2, 3].map(&:to_s)

Is equivalent to this:

[1, 2, 3].map(&:to_s.to_proc)

Which is equivalent to this:

[1, 2, 3].map(&Proc.new { |number| number.to_s })

Which, finally, is equivalent to this:

[1, 2, 3].map { |number| number.to_s }

So, contrary to the way it may seem, there aren’t two different “versions” of the map method. The shorthand syntax is owing to the way that Ruby passes Proc objects.

Takeaways

  • When an argument is prefixed with &, Ruby attempts to call to_proc on it.
  • All symbols respond to the to_proc method.
  • There aren’t two different versions of the map method. The shorthand syntax is possible due to the two points above.

Understanding Ruby closures

Why you’d want to know about Ruby closures

Ruby blocks are one of the areas of the language that’s simultaneously one of the most fundamental parts of the language but perhaps one of the hardest to understand.

Ruby functions like map and each operate using blocks. You’ll also find heavy use of blocks in popular Ruby libraries including Ruby on Rails itself.

If you start to dig into Ruby blocks, you’ll discover that, in order to understand blocks, you have to understand something else called Proc objects.

And as if that weren’t enough, you’ll then discover that if you want to deeply understand Proc objects, you’ll have to understand closures.

The concept of a closure is one that suffers from 1) an arguably misleading name (more about this soon) and 2) unhelpful, jargony explanations online.

My goal with this post is to provide an explanation of closures in plain language that can be understood by someone without a Computer Science background. And in fact, a Computer Science background is not needed, it’s only the poor explanations of closures that make it seem so.

Let’s dig deeper into what a closure actually is.

What a closure is

A closure is a record which stores a function plus (potentially) some variables.

I’m going to break this definition into parts to reduce the chances that any part of it is misunderstood.

  • A closure is a record
  • which stores a function
  • plus (potentially) some variables

I’m going to discuss each part of this definition individually.

First, a reminder: the whole reason we’re interested in Ruby closures is because of the Ruby concept called a Proc object, which is heavily involved in blocks. A Proc object is a closure. Therefore, all the examples of closures in this post will take the form of Proc objects.

If you’re not yet familiar with Proc objects, I would suggest taking a look at my other post, Understanding Ruby Proc objects, before continuing. It will help you understand the ideas in this post better.

First point: “A closure is a record”

A closure is a value that can be assigned to a variable or some other kind of “record”. The term “record” doesn’t have a special technical meaning here, we just use the word “record” because it’s broader than “variable”. Shortly we’ll see an example of a closure being assigned to something other than a variable.

Here’s a Proc object that’s assigned to a variable.

my_proc = Proc.new { puts "I'm in a closure!" }
my_proc.call

Remember that every Proc object is a closure. When we do Proc.new we’re creating a Proc object and thus a closure.

Here’s another closure example. Here, instead of assigning the closure to a variable, we’re assigning the closure to a key in a hash. The point here is that the thing a closure gets assigned to isn’t always a variable. That’s why we say “record” and not “variable”.

my_stuff = { my_proc: Proc.new { puts "I'm in a closure too!" } }
my_stuff[:my_proc].call

Second point: “which stores a function”

As you may have deduced, or as you may have already known, the code between the braces (puts "I'm in a closure!") is the function we’re talking about when we say “a closure is a record which stores a function”.

my_proc = Proc.new { puts "I'm in a closure!" }
my_proc.call

A closure can be thought of as a function “packed up” into a variable. (Or, more precisely, a variable or some other kind of record.)

Third point: “plus (potentially) some variables”

Here’s a Proc object (which, remember, is a closure) that involves an outside variable.

The variable number_of_exclamation_points gets included in the “environment” of the closure. Each time we call the closure that we’ve named amplifier, the number_of_exclamation_points variable gets incremented and one additional exclamation point gets added to the string that gets outputted.

number_of_exclamation_points = 0

amplifier = Proc.new do
  number_of_exclamation_points += 1
  "louder" + ("!" * number_of_exclamation_points)
end

puts amplifier.call # louder!
puts amplifier.call # louder!!
puts amplifier.call # louder!!!
puts amplifier.call # louder!!!!
puts number_of_exclamation_points # 4 - the original variable was mutated

As a side note, I find the name “closure” to be misleading. The fact that the above closure can mutate number_of_exclamation_points, a variable outside the function’s scope, seems to me like a decidedly un-closed idea. In fact, it seems like there’s a tunnel, an opening, between the closure and the outside scope, through which changes can leak.

I personally started having an easy time understanding closures once I stopped trying to connect the idea of “a closed thing” with the mechanics of how closures actually work.

Takeaways

  • Ruby blocks heavily involve Proc objects.
  • Every Proc object is a closure.
  • A closure is a record which stores a function plus (potentially) some variables.

The two common ways to call a Ruby block

Ruby blocks can be difficult to understand. One of the details which presents an obstacle to fully understanding blocks is the fact that there is more than one way to call a block.

In this post we’ll go over the two most common ways of calling a Ruby block: block.call and yield.

There are also other ways to call a block, e.g. instance_exec. But that’s an “advanced” topic which I’ll leave out of the scope of this post.

Here are the two common ways of calling a Ruby block and why they exist.

The first way: block.call

Below is a method that accepts a block, then calls that block.

def hello(&block)
  block.call
end

hello { puts "hey!" }

If you run this code, you’ll see the output hey!.

You may wonder what the & in front of &block is all about. As I explained in a different post, the & converts the block into a Proc object. The block can’t be called directly using .call. The block has to be converted into a Proc object first and then .call is called on the Proc object.

I encourage you to read my two other posts about Proc objects and the & at the beginning of &block if you’d like to understand these parts more deeply.

The second way: yield

The example below is very similar to the first example, except instead of using block.call we’re using yield.

def hello(&block)
  yield
end

hello { puts "hey!" }

You may wonder: if we already have block.call, why does Ruby provide a second, slightly different way of calling a block?

One reason is that yield gives us a capability that block.call doesn’t have. In the below example, we define a method and then pass a block to it, but we never have to explicitly specify that the method takes a block.

def hello
  yield
end

hello { puts "hey!" }

As you can see, yield gives us the ability to call a block even if our method doesn’t explicitly take a block. (Side note: any Ruby method can be passed a block, even if the method doesn’t explicitly take one.)

The fact that yield exists raises the question: why not just use yield all the time?

The answer is that when you use block.call, you have the ability to pass the block to another method if you so choose, which is something you can’t do with yield.

When we put &block in a method’s signature, we can do more with the block than just call it using block.call. We could also, for example, choose not to call the block but rather pass the block to a different method which then calls the block.

Takeaways

  • There are two common ways to call a Ruby block: block.call and yield.
  • Unlike block.call, yield gives us the ability to call a block even if our method doesn’t explicitly take a block.
  • Unlike using an implicit block and yield, using and explicit block allows us to pass a block to another method.

What the ampersand in front of &block means

Here’s a code sample that I’ve grabbed more or less at random from the Rails codebase.

def form_for(record, options = {}, &block)

The first two arguments, record and options = {}, are straightforward to someone who’s familiar with Ruby. But the third argument, &block, is a little more mysterious. Why the leading ampersand?

This post will be the answer to that question. In order to begin to understand what the leading ampersand is all about, let’s talk about how blocks relate to Proc objects.

Blocks and Proc objects

Let’s talk about blocks and Proc objects a little bit, starting with Proc objects.

Here’s a method which takes an argument. The method doesn’t care of what type the argument is. All the method does is output the argument’s class.

After we define the method, we call the method and pass it a Proc object. (If you’re not too familiar with Proc objects, you may want to check out my other post, Understanding Ruby Proc objects.)

def proc_me(my_proc)
  puts my_proc.class
end

proc_me(Proc.new { puts "hi" })

If you run this code, the output will be:

Proc

Not too surprising. We’re passing a Proc object as an argument to the proc_me method. Naturally, it thinks that my_proc is a Proc object.

Now let’s add another method, block_me, which accepts a block.

def proc_me(my_proc)
  puts my_proc.class
end

def block_me(&my_block)
  puts my_block.class
end

proc_me(Proc.new { puts "hi" })
block_me { puts "hi" }

If we run this code the output will be:

Proc
Proc

Even though we’re passing a Proc object the first time and a block the second time, we see Proc for both lines.

The reason that the result of my_block.class is Proc is because a leading ampersand converts a block to a Proc object.

Before moving on I encourage you to try out the above code in a console. Poke around at the code and change some things to see if it enhances your understanding of what’s happening.

Converting the Proc object to a block before passing the Proc object

Here’s a slightly altered version of the above example. Notice how my_proc has changed to &my_proc. The other change is that Proc.new has changed to &Proc.new.

def proc_me(&my_proc) # an & was added here
  puts my_proc.class
end

def block_me(&my_block)
  puts my_block.class
end

proc_me(&Proc.new { puts "hi" }) # an & was added here
block_me { puts "hi" }

If we run this code the output is the exact same.

Proc
Proc

This is because not only does a leading ampersand convert a block to a Proc object, but a leading ampersand also converts a Proc object to a block.

When we do &Proc.new, the leading ampersand converts the Proc object to a block. Then the leading ampersand in def proc_me(&my_proc) converts the block back to a Proc object.

I again encourage you to run this code example for yourself in order to more clearly understand what’s happening.

The differences between blocks and Proc objects

Ruby has a class called Proc but no class called Block. Because there’s no class called Block, nothing can be an instance of a Block. The material that Ruby blocks are made out of is Proc objects.

What happens when we try this?

my_block = { puts "hi" }

If we try to run this, we get:

$ ruby block.rb
block.rb:1: syntax error, unexpected string literal, expecting `do' or '{' or '('
my_block = { puts "hi" }
block.rb:1: syntax error, unexpected '}', expecting end-of-input
my_block = { puts "hi" }

That’s because the syntax { puts "hi" } doesn’t make any syntactical sense on its own. If we want to say { puts "hi" }, there are only two ways we can do it.

First way: put it inside a Proc object

That would look like this:

Proc.new { puts "hi" }

In this way the { puts "hi" } behavior is “packaged up” into an entity that we can then do whatever we want with. (Again, see my other post on Ruby proc objects for more details.)

Second way: use it to call a method that takes a block

That would look like this:

some_method { puts "hi" }

Why converting a block to a Proc object is necessary

Let’s take another look at our code sample from the beginning of the post.

def form_for(record, options = {}, &block)

In methods that take a block, the syntax is pretty much always &block, never just block. And as we’ve discussed, the leading ampersand converts the block into a Proc object. But why does the block get converted to a Proc object?

Since everything in Ruby is an instance of some object, and since there’s no such thing as a Ruby Block class, there can never be an object that’s a block. In order to be able to have an instance of something that represents the behavior of a block, that thing has to take the form of a Proc object, i.e. an instance of the class Proc, the stuff that Ruby blocks are made out of. That’s why methods that explicitly deal with blocks convert those blocks to Proc objects first.

Takeaways

  • A leading ampersand converts a block to a Proc object and a Proc object to a block.
  • There’s no such thing as a Ruby Block class. Therefore no object can be an instance of a block. The material that Ruby blocks are made out of is Proc objects.
  • The previous points taken together are why Ruby block arguments always appear as e.g. &block. The block can’t be captured in a variable unless it’s first converted to a Proc object.

Understanding Ruby Proc objects

What we’re going to do and why

If you’re a Ruby programmer, you almost certainly use Proc objects all the time, although you might not always be consciously aware of it. Blocks, which are ubiquitous in Ruby, and lambdas, which are used for things like Rails scopes, both involve Proc objects.

In this post we’re going to take a close look at Proc objects. First we’ll do a Proc object “hello world” to see what we’re dealing with. Then we’ll unpack the definition of Proc objects that the official Ruby docs give us. Lastly we’ll see how Proc objects relate to other concepts like blocks and lambdas.

A Proc object “hello world”

Before we talk about what Proc objects are and how they’re used, let’s take a look at a Proc object and mess around with it a little bit, just to see what one looks like.

The official Ruby docs provide a pretty good Proc object “hello world” example:

square = Proc.new { |x| x**2 }

We can see how this Proc object works by opening up an irb console and defining the Proc object there.

> square = Proc.new { |x| x**2 }
 => #<Proc:0x00000001333a8660 (irb):1> 
> square.call(3)
 => 9 
> square.call(4)
 => 16 
> square.call(5)
 => 25

We can kind of intuitively understand how this works. A Proc object behaves somewhat like a method: you define some behavior and then you can use that behavior repeatedly wherever you want.

Now that we have a loose intuitive understanding, let’s get a firmer grasp on what Proc objects are all about.

Understanding Proc objects more deeply

The official Ruby docs’ definition of Proc objects

According to the official Ruby docs on Procs objects, “a Proc object is an encapsulation of a block of code, which can be stored in a local variable, passed to a method or another Proc, and can be called.”

This definition is a bit of a mouthful. When I encounter wordy definitions like this, I like to separate them into chunks to make them easier to understand.

The Ruby Proc object definition, broken into chunks

A Proc object is:

  • an encapsulation of a block of code
  • which can be stored in a local variable
  • or passed to a method or another Proc
  • and can be called

Let’s take these things one-by-one.

A Proc object is an encapsulation of a block of code

What could it mean for something to be an encapsulation of a block of code? In general, when you “encapsulate” something, you metaphorically put it in a capsule. Things that are in capsules are isolated from whatever’s on the outside of the capsule. Encapsulating something also implies that it’s “packaged up”.

So when the docs say that a Proc object is “an encapsulation of a block of code”, they must mean that the code in a Proc object is packaged up and isolated from the code outside it.

A Proc object can be stored in a local variable

For this one let’s look at an example, straight from the docs:

square = Proc.new { |x| x**2 }

As we can see, this piece of code creates a Proc object and stores it in a local variable called square. So this part of the definition, that a Proc object can be stored in a local variable, seems easy enough to understand.

A Proc object can be passed to a method or another Proc

This one’s a two-parter so let’s take each part individually. First let’s focus on “A Proc object can be passed to another method”.

Here’s a method which can accept a Proc object. The method is followed by the definition of two Proc objects: square, which squares whatever number you give it, and double, which doubles whatever number you give it.

def perform_operation_on(number, operation)
  operation.call(number)
end

square = Proc.new { |x| x**2 }
double = Proc.new { |x| x * 2 }

puts perform_operation_on(5, square)
puts perform_operation_on(5, double)

If you were to run this code you would get the following output:

25
10

So that’s what it means to pass a Proc object into a method. Instead of passing data as a method argument like normal, you can pass behavior. Or, to put it another way, you can pass an encapsulation of a block of code. It’s then up to that method to execute that encapsulated block of code whenever and however it sees fit.

If we want to pass a Proc object into another Proc object, the code looks pretty similar to our other example above.

perform_operation_on = Proc.new do |number, operation|
  operation.call(number)
end

square = Proc.new { |x| x**2 }
double = Proc.new { |x| x * 2 }

puts perform_operation_on.call(5, square)
puts perform_operation_on.call(5, double)

The only difference between this example and the one above it is that, in this example, perform_operation_on is defined as a Proc object rather than a method. The ultimate behavior is exactly the same though.

A Proc object can be called

This last part of the definition of a Proc object, “a Proc object can be called”, is perhaps obvious at this point but let’s address it anyway for completeness’ sake.

A Proc object can be called using the #call method. Here’s an example.

square = Proc.new { |x| x**2 }
puts square.call(3)

There are other ways to call a Proc object but they’re not important for understanding Proc objects conceptually.

Closures

In order to fully understand Proc objects, we need to understand something called closures. The concept of a closure is a broader concept that’s not unique to Ruby.

Closures are too nuanced a concept to be included in the scope of this article, unfortunately. If you’d like to understand closures, I’d suggest checking out my other post, Understanding Ruby closures.

But the TL;DR version is that a closure is a record which stores a function plus (potentially) some variables.

Proc objects and blocks

Every block in Ruby is a Proc object, loosely speaking. Here’s a custom method that accepts a block as an argument.

def my_method(&block)
  puts block.class
end

my_method { "hello" }

If you were to run the above code, the output would be:

Proc

That’s because the block we passed when calling my_method is a Proc object.

Below is an example that’s functionally equivalent to the above. The & in front of my_proc converts the Proc object into a block.

def my_method(&block)
  puts block.class
end

my_proc = Proc.new { "hello" }
my_method &my_proc

By the way, if you’re curious about the & at the beginning of &block and &my_proc, I have a whole post about that here.

Proc objects and lambdas

Lambdas are also Proc objects. This can be proven by running the following in an irb console:

> my_lambda = lambda { |x| x**2 }
 => #<Proc:0x00000001241e82a8 (irb):1 (lambda)> 
> my_lambda.class
 => Proc

The difference between lambdas and Proc objects is that the two have certain subtle differences in behavior. For example, in lambdas, return means “exit from this lambda”. In regular Proc objects, return means “exit from embracing method”. I won’t go into detail on the differences between lambdas and Proc objects because it’s outside the scope of what I’m trying to convey in this post. A have a different post that describes the differences between procs and lambdas.

Takeaways

  • A Proc object is an encapsulation of a block of code, which can be stored in a local variable, passed to a method or another Proc, and can be called.
  • A closure is a record which stores a function plus some variables. Proc objects are closures.
  • Blocks are Proc objects.
  • Lambdas are Proc objects too, although a special kind of Proc object with subtly different behavior.