Taming legacy Ruby code using the “Sprout Method” technique (example 2)

by Jason Swett
Note: this post is titled “example 2” because some time ago I wrote another Sprout Method example, but after I wrote that post I decided I could come up with a better example.

The Sprout Method technique

If a project was developed without testing in mind, the code will often involve a lot of tight coupling (objects that depend closely on other objects) which makes test setup difficult.

The solution to this problem is simple in concept – just change the tightly coupled objects to loosely coupled ones – but the execution of this solution is often not simple or easy.
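To make the idea concrete, here is a minimal sketch of the difference. The class names `TightGreeter` and `LooseGreeter` are hypothetical and invented purely for illustration. The only point is that the second version receives its output destination as an argument, so a test can hand it a `StringIO` instead of the real screen.

```ruby
require 'stringio'

# Tightly coupled: always writes to the real stdout,
# so a test has no easy way to inspect what was written.
class TightGreeter
  def greet(name)
    $stdout.puts "Hello, #{name}!"
  end
end

# Loosely coupled: the output destination is passed in.
# It defaults to $stdout, so existing callers need not change.
class LooseGreeter
  def initialize(out = $stdout)
    @out = out
  end

  def greet(name)
    @out.puts "Hello, #{name}!"
  end
end

# In a test, swap in an in-memory IO object and inspect it afterward.
fake_out = StringIO.new
LooseGreeter.new(fake_out).greet('Alice')
captured = fake_out.string
```

In a greenfield project this kind of change is cheap. In legacy code, the hard part is getting to this shape safely.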

The challenge with legacy projects is that you often don’t want to touch any of this mysterious code before it has some test coverage, but it’s impossible to add tests without touching the code first, so it’s a chicken-egg problem. (Credit goes to Michael Feathers’ Working Effectively with Legacy Code for pointing out this chicken-egg problem.)

So how can this chicken-egg problem be overcome? One technique that can be applied is the Sprout Method technique, described in both Working Effectively with Legacy Code (WEWLC) and Martin Fowler’s Refactoring.

Here’s an example of some obscure code that uses tight coupling:

require 'open-uri'
file = URI.open('http://www.gutenberg.org/files/11/11-0.txt')
text = file.read
text.gsub!(/-/, ' ')
words = text.split
cwords = []
words.each do |w|
  w.gsub!(/[,\?\.‘’“”\:;!\(\)]/, '')
  cwords << w.downcase
end
words = cwords
words.sort!
whash = {}
words.each do |w|
  if whash[w].nil?
    whash[w] = 0
  end
  whash[w] += 1
end
whash = whash.sort_by { |k, v| v }.to_h
swords = words.sort_by do |el|
  el.length
end
lword = swords.last
whash.each do |k, v|
  puts "#{k.ljust(lword.length + 3 - v.to_s.length, '.')}#{v}"
end

If you looked at this code, I would forgive you for not understanding it at a glance.

One thing that makes this code problematic to test is that it’s tightly coupled to two dependencies: 1) the content of the file at http://www.gutenberg.org/files/11/11-0.txt and 2) stdout (the puts on the second-to-last line).

If I want to separate part of this code from its dependencies, maybe I could grab this chunk of the code (which is the majority of the code, but doesn’t include the network dependency):

text.gsub!(/-/, ' ')
words = text.split
cwords = []
words.each do |w|
  w.gsub!(/[,\?\.‘’“”\:;!\(\)]/, '')
  cwords << w.downcase
end
words = cwords
words.sort!
whash = {}
words.each do |w|
  if whash[w].nil?
    whash[w] = 0
  end
  whash[w] += 1
end
whash = whash.sort_by { |k, v| v }.to_h
swords = words.sort_by do |el|
  el.length
end
lword = swords.last

I can now take these lines and put them in their own new method:

require 'open-uri'

def count_words(text)
  text.gsub!(/-/, ' ')
  words = text.split
  cwords = []
  words.each do |w|
    w.gsub!(/[,\?\.‘’“”\:;!\(\)]/, '')
    cwords << w.downcase
  end
  words = cwords
  words.sort!
  whash = {}
  words.each do |w|
    if whash[w].nil?
      whash[w] = 0
    end
    whash[w] += 1
  end
  whash = whash.sort_by { |k, v| v }.to_h
  swords = words.sort_by do |el|
    el.length
  end
  lword = swords.last
  [whash, lword]
end

file = URI.open('http://www.gutenberg.org/files/11/11-0.txt')
whash, lword = count_words(file.read)

whash.each do |k, v|
  puts "#{k.ljust(lword.length + 3 - v.to_s.length, '.')}#{v}"
end

I might still not understand the code, but at least now I can start to put tests on it. Whereas before the program would only operate on the contents of http://www.gutenberg.org/files/11/11-0.txt and print the results to the screen, now I can feed in whatever content I like and have the results available in the return value of the method. (If you’d like to see another example of the Sprout Method technique, I wrote another example here.)
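As a first step toward those tests, a characterization test might look like the following sketch. It assumes Minitest (which ships with standard Ruby installs) and repeats `count_words` from above unchanged so the file runs on its own. The sample sentence and test names are my own invention. Note that `count_words` mutates its argument with `gsub!`, so the tests pass in a `dup` of the literal.

```ruby
require 'minitest/autorun'

# count_words as extracted above, unchanged. No network, no stdout.
def count_words(text)
  text.gsub!(/-/, ' ')
  words = text.split
  cwords = []
  words.each do |w|
    w.gsub!(/[,\?\.‘’“”\:;!\(\)]/, '')
    cwords << w.downcase
  end
  words = cwords
  words.sort!
  whash = {}
  words.each do |w|
    if whash[w].nil?
      whash[w] = 0
    end
    whash[w] += 1
  end
  whash = whash.sort_by { |k, v| v }.to_h
  swords = words.sort_by do |el|
    el.length
  end
  lword = swords.last
  [whash, lword]
end

class CountWordsTest < Minitest::Test
  # Hypothetical sample input; dup because count_words mutates its argument.
  def sample
    'The cat saw the dog-house. The END!'.dup
  end

  def test_counts_words_ignoring_case_and_punctuation
    whash, _lword = count_words(sample)
    expected = {
      'cat' => 1, 'dog' => 1, 'end' => 1,
      'house' => 1, 'saw' => 1, 'the' => 3
    }
    # Hash equality ignores key order, which matters here because
    # sort_by does not guarantee the order of tied counts.
    assert_equal expected, whash
  end

  def test_most_frequent_word_is_last
    whash, _lword = count_words(sample)
    assert_equal 'the', whash.keys.last
  end

  def test_returns_longest_word
    _whash, lword = count_words(sample)
    assert_equal 'house', lword
  end
end
```

These tests only pin down what the code currently does, which is exactly what is needed before any refactoring begins.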

2 thoughts on “Taming legacy Ruby code using the “Sprout Method” technique (example 2)”

  1. Drew

    I looked at this and started breaking it into stanzas before I read more. There were a couple of “create an aggregator then do something that could probably be simpler and in a map with said aggregator” chunks. And before all that was “tokenize string.” What I’m saying is that I could start pulling pieces out of the middle and turning them into methods. And I’d want to make the file business be a parameter so that I could do dependency injection for tests’ sake.

    Similar to your approach, but I’d want to break out methods like in uncle Bob’s _Clean Code_. Even if they ended up being bad choices, it would make the code easier to reason with at first, then I could refactor into something better afterward, cleaning up my interim changes. Admittedly, I refactor more than most people seem to. But I think frequent refactoring forces me to have better tests and tests tied to behavior and not implementation.

    Please take my comment as a thank you for writing your blogpost.

    1. Jason Swett Post author

      Hey Drew, thanks for the comment! Your approach sounds like a good one, and is probably exactly what I would do if I were to take this example further.
