Author Archives: Jason Swett

Understanding Ruby Proc objects

What we’re going to do and why

If you’re a Ruby programmer, you almost certainly use Proc objects all the time, although you might not always be consciously aware of it. Blocks, which are ubiquitous in Ruby, and lambdas, which are used for things like Rails scopes, both involve Proc objects.

In this post we’re going to take a close look at Proc objects. First we’ll do a Proc object “hello world” to see what we’re dealing with. Then we’ll unpack the definition of Proc objects that the official Ruby docs give us. Lastly we’ll see how Proc objects relate to other concepts like blocks and lambdas.

A Proc object “hello world”

Before we talk about what Proc objects are and how they’re used, let’s take a look at a Proc object and mess around with it a little bit, just to see what one looks like.

The official Ruby docs provide a pretty good Proc object “hello world” example:

square = Proc.new { |x| x**2 }

We can see how this Proc object works by opening up an irb console and defining the Proc object there.

> square = Proc.new { |x| x**2 }
 => #<Proc:0x00000001333a8660 (irb):1> 
> square.call(3)
 => 9 
> square.call(4)
 => 16 
> square.call(5)
 => 25

We can kind of intuitively understand how this works. A Proc object behaves somewhat like a method: you define some behavior and then you can use that behavior repeatedly wherever you want.

Now that we have a loose intuitive understanding, let’s get a firmer grasp on what Proc objects are all about.

Understanding Proc objects more deeply

The official Ruby docs’ definition of Proc objects

According to the official Ruby docs on Proc objects, “a Proc object is an encapsulation of a block of code, which can be stored in a local variable, passed to a method or another Proc, and can be called.”

This definition is a bit of a mouthful. When I encounter wordy definitions like this, I like to separate them into chunks to make them easier to understand.

The Ruby Proc object definition, broken into chunks

A Proc object is:

  • an encapsulation of a block of code
  • which can be stored in a local variable
  • or passed to a method or another Proc
  • and can be called

Let’s take these things one-by-one.

A Proc object is an encapsulation of a block of code

What could it mean for something to be an encapsulation of a block of code? In general, when you “encapsulate” something, you metaphorically put it in a capsule. Things that are in capsules are isolated from whatever’s on the outside of the capsule. Encapsulating something also implies that it’s “packaged up”.

So when the docs say that a Proc object is “an encapsulation of a block of code”, they must mean that the code in a Proc object is packaged up and isolated from the code outside it.
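
One small consequence of this “packaged up” quality is that the code inside a Proc object doesn’t run when the Proc object is defined. Nothing happens until the Proc object gets called. Here’s a tiny illustration:

greeting = Proc.new { puts "hello" }

# Nothing has been printed yet. The code is just packaged up, waiting to be called.

greeting.call # "hello" is printed now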

A Proc object can be stored in a local variable

For this one let’s look at an example, straight from the docs:

square = Proc.new { |x| x**2 }

As we can see, this piece of code creates a Proc object and stores it in a local variable called square. So this part of the definition, that a Proc object can be stored in a local variable, seems easy enough to understand.

A Proc object can be passed to a method or another Proc

This one’s a two-parter so let’s take each part individually. First let’s focus on “A Proc object can be passed to a method”.

Here’s a method which can accept a Proc object. The method is followed by the definition of two Proc objects: square, which squares whatever number you give it, and double, which doubles whatever number you give it.

def perform_operation_on(number, operation)
  operation.call(number)
end

square = Proc.new { |x| x**2 }
double = Proc.new { |x| x * 2 }

puts perform_operation_on(5, square)
puts perform_operation_on(5, double)

If you were to run this code you would get the following output:

25
10

So that’s what it means to pass a Proc object into a method. Instead of passing data as a method argument like normal, you can pass behavior. Or, to put it another way, you can pass an encapsulation of a block of code. It’s then up to that method to execute that encapsulated block of code whenever and however it sees fit.

If we want to pass a Proc object into another Proc object, the code looks pretty similar to our other example above.

perform_operation_on = Proc.new do |number, operation|
  operation.call(number)
end

square = Proc.new { |x| x**2 }
double = Proc.new { |x| x * 2 }

puts perform_operation_on.call(5, square)
puts perform_operation_on.call(5, double)

The only difference between this example and the one above it is that, in this example, perform_operation_on is defined as a Proc object rather than a method. The ultimate behavior is exactly the same though.

A Proc object can be called

This last part of the definition of a Proc object, “a Proc object can be called”, is perhaps obvious at this point but let’s address it anyway for completeness’ sake.

A Proc object can be called using the #call method. Here’s an example.

square = Proc.new { |x| x**2 }
puts square.call(3)

There are other ways to call a Proc object but they’re not important for understanding Proc objects conceptually.
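
If you’re curious anyway, here’s a quick look at a few of those alternatives. Each of these lines does the same thing.

square = Proc.new { |x| x**2 }

square.call(3)  # => 9
square.(3)      # => 9 (shorthand for #call)
square[3]       # => 9 (another shorthand for #call)
square.yield(3) # => 9 (#yield behaves the same as #call on Proc objects)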

Closures

In order to fully understand Proc objects, we need to understand something called closures. Closures are a broader concept that’s not unique to Ruby.

Closures are too nuanced a concept to be included in the scope of this article, unfortunately. If you’d like to understand closures, I’d suggest checking out my other post, Understanding Ruby closures.

But the TL;DR version is that a closure is a record which stores a function plus (potentially) some variables.
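
To make that TL;DR a little more concrete, here’s a small example of a Proc object acting as a closure. The Proc object below “closes over” the count variable that was defined outside of it, and it can still see (and modify) that variable each time it’s called.

count = 0
increment = Proc.new { count += 1 }

increment.call
increment.call

puts count # => 2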

Proc objects and blocks

Every block in Ruby is a Proc object, loosely speaking. Here’s a custom method that accepts a block as an argument.

def my_method(&block)
  puts block.class
end

my_method { "hello" }

If you were to run the above code, the output would be:

Proc

That’s because the block we passed when calling my_method is a Proc object.

Below is an example that’s functionally equivalent to the above. The & in front of my_proc converts the Proc object into a block.

def my_method(&block)
  puts block.class
end

my_proc = Proc.new { "hello" }
my_method &my_proc

By the way, if you’re curious about the & at the beginning of &block and &my_proc, I have a whole post about that here.

Proc objects and lambdas

Lambdas are also Proc objects. This can be proven by running the following in an irb console:

> my_lambda = lambda { |x| x**2 }
 => #<Proc:0x00000001241e82a8 (irb):1 (lambda)> 
> my_lambda.class
 => Proc

Lambdas differ from regular Proc objects in certain subtle ways. For example, in lambdas, return means “exit from this lambda”. In regular Proc objects, return means “exit from the enclosing method”. I won’t go into detail on the differences between lambdas and Proc objects because it’s outside the scope of what I’m trying to convey in this post. I have a different post that describes the differences between procs and lambdas.
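
Just to give a small taste of that difference, here’s a sketch showing how return behaves in each case.

def proc_return_example
  my_proc = Proc.new { return 10 }
  my_proc.call
  20 # never reached; the proc's return exits the enclosing method
end

def lambda_return_example
  my_lambda = lambda { return 10 }
  my_lambda.call
  20 # reached; the lambda's return only exits the lambda itself
end

puts proc_return_example   # => 10
puts lambda_return_example # => 20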

Takeaways

  • A Proc object is an encapsulation of a block of code, which can be stored in a local variable, passed to a method or another Proc, and can be called.
  • A closure is a record which stores a function plus some variables. Proc objects are closures.
  • Blocks are Proc objects.
  • Lambdas are Proc objects too, although a special kind of Proc object with subtly different behavior.

Hangman Challenge refactor (November 2021)

https://youtu.be/xDL5GrYnN_g

This is the first in what I intend to be a series of posts in which I refactor readers’ coding submissions.

To make your own Hangman Challenge submission, you can go here and follow the instructions.

“Before” version

What was good about the code

I found the class name Word to be good. It was very obvious to me what idea Word represented.

Most of the code was reasonably easy to understand. Aside from one particular line, there were no places where I was at a total loss as to what was going on.

What could be improved

Before I list what could be improved I want to thank the person who bravely submitted his code for public critique.

One of the first things that drew my attention was the Hangman#guess method. This method was pretty long and contained deep nesting.

The name of the Hangman class also lacks meaning. Word is a good abstraction because it represents a crisp idea: the word that the player is trying to guess. But there’s no such thing as a Hangman, at least not in this context, and so an opportunity for a crisp abstraction is lost.

There are some naming issues. The variables secret, letters, guesses and life aren’t as clear as they could be.

There are a couple places where there’s a lack of obvious meaning. For example, what exactly does it mean when life.zero? is true? It can be inferred that this means the player loses, but it would be better if we were to make this blatantly obvious.

There are a couple YAGNI (you ain’t gonna need it) violations. Both the Word constructor and the Hangman constructor have a superfluous feature which allows you to pass in an arbitrary masked character value and starting life value, respectively. These features aren’t needed in order to meet the requirements of the program. These features were added, presumably, “just in case”.

There are a few cases of needless attr_readers. It’s my view that attr_reader should only be used if an instance variable needs to be surfaced to a class’s public API. Otherwise the public API is being made bigger than necessary, limiting how much of the class can be refactored without risk.

This program contains a type of Ruby syntax I’ve never seen before:

secret.each_char.with_index { |char, i| @letters[i], guessed = char, true if char == letter }

Perhaps there are some cases where this type of syntax works well, but in this particular case I found it confounding.

Lastly, there was a bit of duplication, one “magic number”, and what you might call a “magic character”.

Here’s the original code with comments added by me.

#!/usr/bin/ruby
#
# https://github.com/jasonswett/hangman_challenge
#

# frozen_string_literal: true

class Word # good abstraction
  attr_reader :secret # needless attr_reader

  def initialize(secret, char = "_") # YAGNI violation
    @secret = secret # unclear variable name
    @letters = Array.new(secret.length) { char } # unclear variable name
  end

  def guess(letter)
    guessed = false

    # confusing syntax
    secret.each_char.with_index { |char, i| @letters[i], guessed = char, true if char == letter }

    guessed
  end

  def completed? # unclear method name
    secret == mask
  end

  def mask
    @letters.join
  end
end

class Hangman # "Hangman" isn't really an abstraction
  attr_reader :word, :guesses, :life # needless attr_readers

  def initialize(secret, life = 6) # YAGNI violation
    @word = Word.new(secret)
    @guesses = "" # unclear variable name
    @life = life # unclear variable name

    puts "#{word.mask} life left: #{life}"
  end

  def guess(letter) # long method with deep nesting
    if word.guess(letter)
      if word.completed?
        puts "#{word.secret} YOU WIN!"
      else
        if guesses.empty?
          puts "#{word.mask} life left: #{life}" # duplication
        else
          puts "#{word.mask} life left: #{life} incorrect guesses: #{guesses}"
        end
      end
    else
      @life -= 1

      if life.zero? # lack of obvious meaning
        puts "#{word.mask} YOU LOSE!"
      else
        @guesses += letter
        puts "#{word.mask} life left: #{life} incorrect guesses: #{guesses}"
      end
    end
  end
end

# hangman = Hangman.new("apple")
# %w[a b q z e p l].each do |letter|
#   hangman.guess(letter)
# end

# hangman = Hangman.new("quixotic")
# %w[a e i o u l g p r].each do |letter|
#   hangman.guess(letter)
# end

File.readlines(ARGV[0], chomp: true).each do |line|
  next if line.empty?

  if line.length == 1
    @hangman.guess(line)
  else
    @hangman = Hangman.new(line)
  end
end

“After” version

Here’s my version. I’m not saying it’s perfect, just an improvement. I invite you to see if you can identify the specific ways in which the code was improved.

#!/usr/bin/ruby
#
# https://github.com/jasonswett/hangman_challenge
#

# frozen_string_literal: true

class Game
  STARTING_LIFE_AMOUNT = 6

  def initialize(word)
    @word = word
    @remaining_life_amount = STARTING_LIFE_AMOUNT
    puts word_status
  end

  def submit_guess(guessed_letter)
    if @word.guess_correct?(guessed_letter)
      @word.correct_guesses << guessed_letter
    else
      @remaining_life_amount -= 1
      @word.incorrect_guesses << guessed_letter
    end

    if player_has_won?
      puts "#{@word.value} YOU WIN!"
      return
    end

    if player_has_lost?
      puts "#{@word} YOU LOSE!"
      return
    end

    puts complete_status
  end

  def player_has_won?
    @word.value == @word.to_s
  end

  def player_has_lost?
    @remaining_life_amount.zero?
  end

  def word_status
    "#{@word} life left: #{@remaining_life_amount}"
  end

  def complete_status
    [word_status, @word.incorrect_guess_message]
      .compact
      .join(" ")
  end
end

class Word
  attr_accessor :value, :correct_guesses, :incorrect_guesses

  def initialize(value)
    @value = value
    @correct_guesses = []
    @incorrect_guesses = []
  end

  def guess_correct?(guessed_letter)
    @value.include?(guessed_letter)
  end

  def to_s
    @value.split("").map do |letter|
      if @correct_guesses.include?(letter)
        letter
      else
        "_"
      end
    end.join
  end

  def incorrect_guess_message
    if incorrect_guesses.any?
      "incorrect guesses: #{incorrect_guesses.join("")}"
    end
  end
end

File.readlines(ARGV[0], chomp: true).each do |line|
  next if line.empty?

  if line.length > 1
    @game = Game.new(Word.new(line))
  else
    @game.submit_guess(line)
  end
end

Conclusion

Here are the issues with the original code that were fixed (or at least improved) in the refactored version:

  • Long method
  • Deep nesting
  • Unclear variable names
  • Lack of obvious meaning
  • YAGNI violations
  • Magic numbers
  • Needless attr_readers
  • Esoteric syntax
  • Duplication

Again, if you’d like to make your own Hangman Challenge submission, start here.

What should I NOT write tests for?

One of the most common questions about testing, along with what to write tests for, is what NOT to write tests for.

When people ask me what to write tests for, my honest but maybe not very helpful answer is “basically everything”. But I don’t test literally absolutely everything. There are some cases when I choose to skip tests.

My criteria for skipping a test

My habit of writing tests for everything leads me to write tests for my code by default. For most of the code I write, I actually find it harder to write the code without tests than with tests.

But sometimes I get lazy or my instincts tell me that a test would be a waste of time. When I’m feeling like this, I ask myself four questions:

  1. If this code were to misbehave, would it fail silently?
  2. If this code were to misbehave, how bad would the consequences be?
  3. If this code were to misbehave, how frequently would it fail?
  4. How costly would it be to write a test?

If the code I’m working on would fail in a very obvious way, and the consequences are minor, and the failure would only happen once a year, and the test would be very costly to write, then that’s a case where I would probably skip the test.

On the other hand, if the code would fail silently, OR if the consequences would be bad, OR if the failure would be frequent, OR if it’s super easy to write the test, then I would just write the test.

You could think of these questions as a boolean expression where all the items get OR’d together:

“Should I write a test?” formula (boolean OR)

  1. Might fail silently?
  2. Consequences might be bad?
  3. Might fail frequently?
  4. Test is easy to write?

If any one of the items is true, then the whole boolean expression is true and I go ahead and write the test. Otherwise I skip the test without guilt or worry.
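
If it helps, here’s the same idea expressed as a small Ruby sketch. (The method name and argument names are just mine, for illustration.)

def should_write_test?(might_fail_silently:, consequences_might_be_bad:, might_fail_frequently:, test_is_easy_to_write:)
  might_fail_silently ||
    consequences_might_be_bad ||
    might_fail_frequently ||
    test_is_easy_to_write
end

# The "obvious failure, minor consequences, rare, expensive test" case from above:
should_write_test?(
  might_fail_silently: false,
  consequences_might_be_bad: false,
  might_fail_frequently: false,
  test_is_easy_to_write: false
) # => false, so I'd skip the test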

Behavior vs. implementation

There are also entire types of tests I avoid.

I don’t test for things like the presence of associations, the presence of methods, etc. Such tests are pointless. Rather, I test the behavior that these things enable. Testing the behavior is the only way you can really be sure anything works.
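
To illustrate with a hypothetical Rails example (the Post and Comment models here are made up), rather than asserting that an association merely exists, I’d rather exercise the behavior the association enables:

# Not this (a shoulda-matchers-style assertion that the association is present):
#   it { is_expected.to have_many(:comments) }

# But something more like this:
RSpec.describe Post do
  it "can have comments added to it" do
    post = Post.create!(title: "Hello")
    post.comments.create!(body: "First!")

    expect(post.comments.count).to eq(1)
  end
end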

Takeaways

  • I’ll skip a test if and only if the feature won’t fail silently, the consequences won’t be bad, the failure won’t occur frequently, and the test is expensive to write.
  • I don’t test implementation details, but rather I test the behaviors that the implementations enable.

How to get familiar with a new codebase

The question

Someone submitted the following question to me on my ask me a question page:

One of the things I learned from my RoR bootcamp (Le Wagon) was test driven dev. Using that knowledge, when I joined my first job, I wasn’t given much guidance to onboard (start up and I was first internal dev), so I started by reading and learning the system from the rspec files.

What would your advice be for new to industry, junior devs as to the most effective way to learn a new monolith / code base?

Here’s my advice for how to get familiar with a new codebase.

Having the right objective

The first step in getting familiar with a new codebase is to realize that it’s an impossible goal!

Unless the codebase is really tiny, no one, no matter how smart or experienced, can understand the whole thing. And even if you could understand the whole thing, I think it would be a waste of effort.

Rather than trying to understand an entire codebase, I think it’s more useful to try to understand the area of the code where you need to make a change. After all, the only reason for needing to understand an area of code is in order to safely make a change to it.

Now let’s talk about how to understand a piece of code. I find it helpful to list the obstacles to understanding a piece of code.

The obstacles to understanding code

The following things can stand in the way of understanding a piece of code:

  • Unfamiliar technologies
  • Unfamiliar domain concepts
  • Sheer complexity
  • Poor-quality code

Let’s discuss each, including how to address it.

Unfamiliar technologies

The answer to this one is straightforward although not easy: get familiar with those technologies.

When I have to learn a new technology, I personally like to spin up a scratch project to teach myself about it in isolation. Learning a new technology is easier when it’s not mixed with other stuff like unfamiliar domain concepts and somebody else’s code.

Sometimes I like to bounce between a scratch project and a production project. Too much scratch coding and the learning can get too detached from what’s relevant to the production project. Too much production coding and it can be hard to separate the difficulties presented by the unfamiliar technology from the difficulties presented by everything else.

Unfamiliar domain concepts

Domain knowledge can be hard to acquire. Software technologies often have documentation you can read, but domain knowledge often has to be acquired through experience or just by having someone tell you.

The unfortunate truth about domain knowledge is that, quite often, you just have to ask your co-workers to tell you. If you’re lucky you may be able to supplement your learning with things like Wikipedia and books.

Sheer complexity

Complex things are obviously harder to understand than simple things. Sometimes it’s helpful to acknowledge that in addition to unfamiliar technologies, unfamiliar domain concepts and hard-to-understand code, some things are just complex. To me, articulating precisely why something is hard to understand is half the battle toward understanding it.

When I want to try to understand something complex, I try to break it down into parts. For example, I started to understand cars a lot better when I understood that a car consists of several somewhat separate systems including the engine, the braking system, the steering system, the heating and cooling system, etc. Cars became a little easier for me to understand once I realized that these separate systems were present and that I could understand each system more or less in isolation.

Poor-quality code

My definition of bad code is code that’s hard to understand and change. Sadly, quite a lot of code in the world, perhaps the vast majority of it, is pretty bad, and therefore hard to work with.

One of the highest-yield techniques I’ve encountered for understanding bad code (which I learned from Working Effectively with Legacy Code) is to do a “scratch refactoring”. With this technique, I take a piece of code and freely rename variables, move things around, etc., with no intention of ever committing my changes. Sometimes this act can lead to a useful burst of insight.

Honestly, the most helpful thing I can say about dealing with legacy code is that you should buy Working Effectively with Legacy Code and read it. There’s too much to say about legacy code for me to repeat it all here, and most of what I could say would be redundant to the book anyway.

Takeaways

  • Getting familiar with an entire codebase is impossible. Instead, focus on getting familiar with the parts you need to change.
  • There can be several different reasons why an area of code can be hard to understand. When trying to understand a piece of code, try to identify the reason(s) the code is hard to understand and then address each reason individually.

How to defend good code

Why good code needs defending

Good code quite frequently comes under fire. Managers explicitly or implicitly pressure developers to cut corners in order to “move fast”.

And sadly, even programmers sometimes argue against writing good code. They say things like “it doesn’t always need to be perfect, because after all we need to ship”.

These arguments sound reasonable on the surface but, as we’ll see, they contain subtle lies.

The biggest lie in many arguments against good code is that programmers spend too much time gold-polishing their code. To me, cautioning programmers against gold-polishing is kind of like, for example, cautioning Americans not to starve themselves and become too skinny. Sure, it’s a theoretical danger, but in reality our problem is overwhelmingly the opposite one. Similarly, the problem in the software industry is not that we spend too much time writing good code, but that we spend too much time wrestling with bad code.

If you ever find yourself pressured to write sloppy code, my goal with this post is to arm you with some arguments you can use to push back.

Here’s what I’ll go over:

  • What “good code” means
  • Weak arguments for writing good code
  • Weak arguments for writing bad code
  • My argument for writing good code

Let’s start with what “good code” means to me.

What “good code” means

Good code is code that’s fast to work with. Good code is easy to understand and change. To me it’s nothing more than that.

If the code can be changed quickly and easily, then it’s good, by definition. If the code is slow and difficult to change then it’s bad. Any specific coding practices like short methods, clear names or anything else are just incidental. The only thing that matters is whether the code can be changed quickly and easily.

One reason I like this definition of good code is that it also ought to be appealing to everyone. Developers like code that’s quick and easy to change. Non-technical stakeholders also ought to like the idea of code that’s quick and easy to change.

People might not understand exactly what it means for code to be “high quality” or “good”, but they can certainly understand what it means to be able to work quickly.

Anytime you have to make a defense for writing good code, it seems smart to remind your “opponent” (who will hopefully become your ally) that your goal is to move fast.

Before we address some of the bad arguments for writing bad code in order to refute them, let’s first talk about some bad arguments for writing good code. It’s good to be aware of the bad arguments for your case so you can avoid trying to use them.

Weak arguments for writing good code

I think if we’re going to write good code, we should have a clear understanding of why we’re doing it. We should also be able to articulate to others exactly why we’re doing it.

The bad arguments I’ve heard for writing good code include things like “craftsmanship”, “professionalism” and “integrity”.

Saying something like “I write good code because it’s more professional to write good code” is a little bit of a copout. It doesn’t explain why it’s more professional to write good code.

Same with craftsmanship. You can say “I write good code because I believe in craftsmanship”. But that doesn’t explain what the benefits of craftsmanship supposedly are.

Such appeals are also selfish. They speak to what makes the programmer feel good, not to what benefits the business. These types of arguments are unlikely to be persuasive except perhaps to other programmers.

So when people pressure me to cut corners to get a job done quickly, I don’t ever push back with talk about craftsmanship or professionalism.

Weak arguments for writing bad code

Now, here are some bad arguments for doing sloppy work and why I think each one is flawed.

“Perfect is the enemy of the good”

This is a good and useful saying for the cases to which it actually applies. For example, when you’re talking about project scope, “perfect is the enemy of the good” is a good saying to keep in mind. A bent toward perfectionism can eat up all your time and keep you from shipping something that’s good.

But with respect to code quality, “perfect is the enemy of the good” is almost always a false premise. Comically so, in fact.

The typical spectrum of possibilities for a coding change usually doesn’t range from “perfect” to merely “good”. Usually it ranges from “acceptable” to “nightmarish”. A more honest version of this saying would be “acceptable is the enemy of the nightmarish”.

Refutation: If someone tries to pull “perfect is the enemy of the good” on you, you can say, “Oh, don’t worry, I’m not trying to make it perfect, I’m just trying to make the code understandable enough so I can work with it.” This statement is hard to refute because it appears as though you’re agreeing with the other person. Plus no reasonable person would argue against making the code understandable enough to work with. What you’re saying is also true: you’re not trying to make the code perfect. You’re just trying to make it not nightmarish.

“Users don’t care about code”

This idea reflects a shallow, elementary level of thinking. Yes, obviously users don’t directly care about code. But bad code has negative consequences that eventually become obvious to users.

Bad code (again, by definition) is slower to work with than good code. When bad code is piled on top of other bad code, the slowdowns become exponential. Changes that should take a day take a week. Changes that should take a week take a month. Users definitely notice and care about this.

Bad code is also harder to keep bugs out of than good code. Code that’s hard to understand gives bugs safe places to hide. Users are obviously going to notice and care about bugs.

Refutation: If someone uses “users don’t care about code” on you, you can point out that users don’t care directly about bad code, but users do care about the effects of bad code, like slow delivery and buggy software.

“Your company might go out of business”

Multiple times I’ve heard something along the lines of this: “If your company goes out of business, it doesn’t matter if the code was perfect.” This might sound on the surface like a slam-dunk argument against geekishly polishing code rather than maturely considering the larger business realities. But it’s not.

All that’s needed to destroy this argument is a reminder that good code is called good because it’s faster to work with. That’s why we call it “good”.

Refutation: Good code is called good because it’s faster to work with. Cutting corners only saves time in the very very short term.

“There’s no time” or “my manager made me do it” or “they did the best they could with the time they had”

These aren’t arguments for writing bad code but rather excuses for writing bad code.

No one is holding a gun to your head and making you write shitty code. You’re the steward of your codebase. It’s your responsibility, and no one else’s, to protect the quality of the codebase so that the codebase can continue to be fast to work with.

If you consciously choose to take on technical debt, you’ll almost certainly never be granted time to pay back that technical debt. Instead you’ll have to pay interest on that technical debt for the rest of your time with that codebase.

It’s easy for your boss to tell you to cut corners. Your boss doesn’t have to (directly) live with the consequences of poor coding choices. But eventually when the poor coding choices accumulate and bring development to a crawl, your boss will blame you, not himself.

Obviously it’s not always easy to fight back against pressure to cut corners. But I think developers could stand to fight back a little more than they do (even if it means being quietly insubordinate and writing good code anyway), and I think developers would benefit greatly from doing so. And so would their bosses and the organizations they work for.

My argument for writing good code

My argument for writing good code is very simple: code that’s easy to understand and change is faster to work with. Obviously that’s better.

I’ll also point out something that might not be obvious. Coding choices are multiplicative. The coding choices you make today have an influence over how easy the code will be to work with tomorrow, the next day, and every day after that. Same with the coding choices you make tomorrow. Each day’s choices multiply against every previous day’s choices.

The result is exponential. Poor coding choices every day lead to an exponential slowdown in productivity. Good coding choices unfortunately don’t lead to an exponential speedup, but they do at least avoid the exponential slowdown.

You can think of each day’s code additions as having a score. If you add code that has an “easy-to-change score” of 90%, and you do that three days in a row, then your cumulative score is 0.9^3 = 72.9%. If you add code that has an “easy-to-change score” of 40% three days in a row, then your cumulative score is 0.4^3 = 6.4% (!). This is why programmer productivity doesn’t vary by a factor of just 10X but more like infinityX. Bad code can eventually drive productivity down to something close to 0%.

Takeaways

  • Our industry has a much bigger sloppy-code problem than gold-plating problem.
  • Good code is code that’s fast to work with.
  • The popular arguments for writing poor-quality code, although they sound mature and reasonable on the surface, are a result of sloppy and confused thinking.
  • Whether you choose to write good or bad code is your responsibility, and you’re the one who will have to live with the consequences of your decisions.

How I approach software estimation

Software estimation is really hard. I’ve never encountered a programmer who’s good at it. I don’t think such a programmer exists. I myself am not good at estimation nor do I expect ever to be.

Here are two tactics that I use to deal with the challenge of estimation.

  1. Try to make estimation as irrelevant as possible
  2. Try to use historical data to make future estimates

Try to make estimation as irrelevant as possible

I figure if I’ll never be able to estimate accurately, at least I can try to make estimates a less important part of the picture. Probably my most successful tactic in this area is to try to only take on projects that last a matter of days. If a project will take more than a matter of days, then I try to break that project up and identify a sub-project that will only take a matter of days. That way, if I estimate that the project will take 2 days and it really takes 4, no one has “lost” very much and nobody’s too upset (especially if I let my stakeholder know that I have very low confidence in my estimate).

Try to use historical data to make future estimates

Obviously all estimates are based to an extent on historical data because you’re basing your estimates on past programming experiences, but there’s a little more to it than that. When I work on a project, I break the work into individual features. I have a sense for how long a feature will take on average. Even though some features take an hour and some features take three days, a feature takes some certain amount of time on average. If I think the average amount of time I take to build a feature is half a day and I can see that my small project has 5 features, then I can estimate that my project will take 5 * 0.5 = 2.5 days. Obviously there’s a lot of room for inaccuracy in this methodology, hence tactic #1.

If you really want to go deep on estimation, I recommend Software Estimation: Demystifying the Black Art by Steve McConnell.

Don’t mix refactorings with behavior changes

Why it’s bad to mix refactorings with behavior changes

It adds risk

Probably the biggest reason not to mix refactorings with behavior changes is that it makes it too easy to make a mistake.

When you look at the diff between the before and after versions of a piece of code, it’s not always obvious what the implications of that change are going to be. The less obvious the implications are, the more opportunity there is for a bug to slip through.

When you mix behavior changes with refactorings, the behavior change and the refactoring obscure each other, often making the change substantially harder to understand and allowing for a much greater opportunity for bugs to slip through.

Mixing refactorings with behavior changes also requires you to make your deployment deltas (i.e. the amount of change being deployed) bigger. The bigger the delta, the greater the risk.

It makes bug attribution harder

If I deploy a behavior change that was mixed with a refactoring, and then I discover that the deployment introduced a bug, I won’t know whether it was my refactoring or my behavior change that was responsible because the two were mixed together.

And then potentially I’m forced to do something painful in order to remove the bug, which is to roll back both my behavior change and my refactoring, even though only one of those two things was the culprit and the other one was innocent. If I had committed and deployed these changes separately, there’s a higher chance that I would be able to attribute the bug to either the refactoring or the behavior change and not have to roll back both.

It makes code review harder

When you mix refactoring with behavior changes, it’s hard or impossible for a reviewer to tell which is which. It makes a discussion about a code change harder because now the conversation is about two things, not just one thing. This makes for a potentially slow and painful PR review process.

How to approach refactorings instead

When I’m working on a behavior change and I discover that my work would also benefit from some refactoring, here’s what I do:

  1. Set aside my current feature branch
  2. Create a new branch off of master on which to perform my refactoring
  3. Merge my refactoring branch to master (and preferably deploy master to production as well)
  4. Merge or rebase master into my feature branch
  5. Resume work on my feature branch

This allows me to work in a way that reduces risk, allows for easier bug attribution, makes code review easier, and generally saves a lot of time and headache.
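
In git terms, a rough sketch of that workflow might look something like the following (the branch names are just for illustration):

$ git checkout master
$ git checkout -b extract-billing-service   # a refactoring-only branch
# ...perform the refactoring and commit it...
$ git checkout master
$ git merge extract-billing-service         # and ideally deploy master now
$ git checkout my-feature-branch
$ git rebase master                         # or: git merge master
# ...resume work on the feature...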

10X programmers

You can find discussions online regarding the idea of a “10X programmer”. Much of what you’ll find is ridicule of the idea that 10X programmers exist.

I’ve always thought it’s fairly obvious that 10X programmers exist. In fact, I think programmers vary by a factor of way more than 10X.

Months and years vs. days and hours

When I think of a 10X programmer, I don’t think of someone who can finish in one hour a job that would have taken an “average” programmer ten hours. Rather, I think of someone who can accomplish ten times as much in a year as an average programmer. I think it’s a very reasonable proposition that one programmer could accomplish 10X as much in a year as another programmer.

When programmers vary beyond 10X

To me it seems clear that programmers can vary not just by 10X or 100X but by infinityX. This is because some programmers are so good that they can solve problems that weaker programmers would never be able to solve. In this case the better programmer hasn’t produced 10X as much value as the worse programmer, but infinitely more value. (I know that you can’t divide by zero, but humor me.)

My infinite variance claim doesn’t require the weaker programmer to be very bad or even below average. Some programming projects are really hard and require a really good programmer for the project not to fail.

Cumulative work vs. non-cumulative work

It’s not possible to be a 10X worker in every type of work. For example, the best dishwasher in the world is probably not 10X more productive than the average dishwasher. The world’s fastest ditch digger probably can’t dig 10X as much as the average ditch digger. That’s because a dishwasher or ditch digger starts each day with a clean slate. They can move their body parts a little faster, but that’s about it; beyond that they hit a ceiling.

With programming, each change you make to a codebase influences how easily you’ll be able to make future changes to the codebase. The work you do today is helped or burdened by the choices you made yesterday, the day before that, and so on, all the way back to the first day of the project. Each coding decision in the codebase multiplies with some number of other decisions in the codebase, producing an exponential effect. Good coding choices can’t make your work get much faster, but bad coding choices can make your work slow to a crawl or possibly even a halt. When hundreds or thousands of changes interact with each other multiplicatively, it’s not hard for codebases to vary by a factor of much more than 10X.

How to become a 10X programmer

I think being a 10X programmer is mainly a result of four skills: communication, critical thinking, process, and writing good code.

Being skilled at communication helps reduce the chances of building the wrong thing, which wastes time. It also reduces the chances of experiencing interpersonal problems with colleagues which can slow down work.

Being skilled at critical thinking helps you arrive at right answers and helps you, for example, to avoid spending time barking up the wrong tree when debugging.

Following efficient development processes (small tasks, frequent deployment, automated tests, version control, etc.) helps you avoid wasting time on the overhead of programming.

Finally, writing good code (that is, code that’s easy to understand and change) can help make future changes to the codebase faster.

All four of those areas are very broad and many books have been written on each. It’s obviously not realistic for me to go very deep into those areas here. But if you want to become a 10X programmer, I think those are the areas in which to build your skills.

My take on the Single Responsibility Principle

The Single Responsibility Principle (SRP) is an object-oriented programming principle that says (more or less) that each object should only have one responsibility.

The main difficulty that I’ve seen others have with the SRP, which I’ve also had myself, is: what exactly constitutes a single responsibility?

The answer I’ve arrived at is that what constitutes a single responsibility is a subjective judgment. If I write a class that I claim has just one responsibility, someone else could conceivably look at my class and credibly argue that it has eight responsibilities. There’s no way to look at a class and objectively count the number of responsibilities it has.

Here’s how I would characterize the gist of the Single Responsibility Principle: things that are small and that are focused on one idea are easier to understand than things that are big and contain a large number of things. Understanding this principle is much more helpful than understanding exactly what a “single” responsibility is.

A Docker “hello world” app

What we’re going to do

In this tutorial we’re going to illustrate what I consider to be Docker’s central and most magical ability: to let you run software on your computer that you don’t actually have installed on your computer.

The specific software we’re going to run is the Lisp REPL (read-evaluate-print loop). The reason I chose Lisp is because you’re unlikely to happen to have Lisp already installed. If I had chosen to use a language that you might already have installed on your computer, like Ruby or Python, the illustration would lose much of its sizzle.

Here are the steps we’re going to carry out. Don’t worry if you don’t understand each step right now because we’re going to be looking at each step in detail as we go.

1. Add a Dockerfile
2. Build an image from our Dockerfile
3. Run our image, which will create a container
4. Shell into that container
5. Start a Lisp REPL inside the container
6. Run a Lisp “hello world” inside the REPL
7. Exit the container
8. Delete the image

Prerequisites

Before you follow this tutorial you’ll of course have to have Docker installed. You might also like to get familiar with basic Docker concepts and terminology.

Adding a Dockerfile

A Dockerfile is a specification for building a Docker image. We’re going to write a Dockerfile, use the Dockerfile to build a custom image, create a container using the image, and finally shell into the container and start the Lisp REPL.

First I’m going to show you the Dockerfile and then I’ll explain each individual part of it.

# Dockerfile

FROM ubuntu:20.04

RUN apt update && apt install -y sbcl

WORKDIR /usr/src

FROM

FROM ubuntu:20.04

The FROM directive tells Docker what image we want to use as our base image. A base image, as the name implies, is the image that we want to use as a starting point for our custom image. In this case our base image is just providing us with an operating system to work with.

The FROM directive takes the form of <image>:<tag>. In this case our image is ubuntu and our tag is 20.04. When Docker sees ubuntu:20.04, it will look on Docker Hub for an image called ubuntu that’s tagged with 20.04.

RUN

RUN apt update && apt install -y sbcl

The RUN command in a Dockerfile simply takes whatever it’s given and runs it on the command line.

In this case the command we’re running is apt update && apt install -y sbcl. The && in between the two commands means “execute the first command, then execute the second command if and only if the first command was successful”. Let’s deal with each of these commands individually.

The apt update command is a command that downloads package information. If we were to skip the apt update command, we would get an error that says Unable to locate package sbcl when we try to install sbcl. So in other words, running apt update makes our package manager aware of what packages are available to be installed.

The apt install -y sbcl command installs a package called sbcl. SBCL stands for Steel Bank Common Lisp which is a Common Lisp compiler. (Common Lisp itself is a popular dialect of the Lisp language.)

The -y part of apt install -y sbcl means “don’t give me a yes/no prompt”. If we were to leave off the -y we’d get an “are you sure?” prompt which would be no good because the Dockerfile isn’t executed in an interactive way that would actually allow us to respond to the prompt.

WORKDIR

WORKDIR /usr/src

The WORKDIR /usr/src directive specifies which directory to use as the working directory inside the container. Imagine being logged into a Linux machine and running cd /usr/src. After running that command, you’re “in” /usr/src and /usr/src is your working directory. Similar idea here.

Listing our existing images

Before we use our Dockerfile to build our image, let’s list our existing Docker images. If this is your first time doing anything with Docker then the list of existing images will of course be empty.

In any case, let’s run the docker image ls command:

$ docker image ls

Listing our existing containers

In addition to the docker image ls command which lets us list any images we have, there’s an analogous command that lets us list our containers.

$ docker container ls

Building our image

We can use the docker build command to build our image. The --tag lisp part says “give the resulting image a tag of lisp”. The . part says “when you look for the Dockerfile to build the image with, look in the current directory”.

$ docker build --tag lisp .

After you run this command you’ll see some entertaining output fly across the screen while Docker is building your image for you.

Confirming that our image was created

Assuming that the build was successful, we can now use the docker image ls command once again to list all of our existing images, which should now include the image we just built.

$ docker image ls

You should see something like the following:

REPOSITORY   TAG      IMAGE ID       CREATED          SIZE  
lisp         latest   91f4fa2a754a   11 minutes ago   140MB

Creating and shelling into a container

Run the following command, which will place you on the command line inside a container based on your image. You should of course replace <image id> with the image id that you see when you run docker image ls.

$ docker run --interactive --tty <image id> /bin/bash

The docker run command is what creates a container from the image. The /bin/bash argument says “the thing you should run on this container is the /bin/bash program”.

Viewing our new container

Now that we’ve invoked the docker run command, we have a new container. Open a separate terminal window/tab and run docker container ls.

$ docker container ls

You should see a new container there with an image ID that matches the ID of the image you saw when you ran docker image ls.

CONTAINER ID   IMAGE          COMMAND       CREATED         STATUS
5cee4af0cfa9   91f4fa2a754a   "/bin/bash"   4 seconds ago   Up 3 seconds

Poking around in our container

Just for fun, run a few commands in the container and see what’s what.

$ pwd    # show the current directory
$ ls -la # show the contents of the current directory
$ whoami # show the current user

Running the Lisp REPL

Finally, let’s run the Lisp REPL by running the sbcl command inside our container.

$ sbcl

Once you’re inside the REPL, run this piece of “hello, world!” Lisp code.

(format t "hello, world!")
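
If everything worked, you should see output roughly like the following. (The format call prints the string, and then the REPL prints the call’s return value, NIL.)

hello, world!
NIL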

Exiting the container

Press CTRL+D to exit the Lisp REPL and then CTRL+D again to exit the container. The container will stop once you exit it. (Stopped containers stick around; you can remove them with docker container rm, or pass the --rm flag to docker run to have containers removed automatically when they exit.)

Deleting the image

Run the following command to delete the image. The rmi part stands for “remove image” and the -f flag stands for “force”.

$ docker rmi -f <image id>

Recap

You’ve completed this exercise. You’re now capable of the following things:

  • Writing a Dockerfile that specifies an operating system to use and some software to install
  • Listing Docker images and containers
  • Building Docker images
  • Shelling into a container
  • Using the software that was installed on the container

If you possess these capabilities and have at least a little bit of a grasp of the underlying concepts, then you’re well on your way to being able to use Docker to accomplish real work.