rand9

Binary Search in Ruby, or, “Picking the Right Number, as Quickly as Possible”

So you’re writing a parser in C that parses the lines of a file. The line you’re parsing is made up of a 40 character key and any number of ip addresses after, space-separated. You need to know a max line length to read (because C is mean like that), but you’re not sure how many ip’s you can fit on a line for a given key.

Such was my case yesterday, so I decided to write a mini script in ruby to figure it out. My first stab was to iterate from 1 to 100 and check the line lengths by literally building a line with x number of ip elements on it. While the code was correct and produced the necessary information for the given inputs, it was horribly inefficient, so I decided to rewrite it to be smarter. Enter the binary search algorithm.

Using the binary search algorithm, we take a lower and an upper bound of possible elements and try to quickly guess which number is the highest possible without exceeding the line limit. So here’s the concrete data we know. The line format (as described above) will look something like this:

86f7e437faa5a7fce15d1ddcb9eaeaea377667b8 174.18.0.1 174.18.0.2 174.18.0.3 174.18.0.4 174.18.0.5 174.18.0.6

… with theoretically unlimited ips per line. The first value is a key we’ll use to store the ips against in a lookup table, but don’t worry about that right now. The key is generated using sha1 digesting, so we know it will always be 40 characters. The max length for any given ip address is 15, assuming all 4 blocks are at least valued at 100 (e.g. 100.100.100.100). Space-separate the key and x number of ips and your line length calculation is f(x) = kl + (el*x) + x, where x is the number of ip elements, kl is key length, and el is element length (ip address length); the trailing x counts the one space that precedes each ip. In other words, if we’re testing 50 elements on the line, the line length would be 40 + (15*50) + 50, which equals 840.
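The formula is small enough to check by hand in irb. This is only a sketch: the helper name line_length and its default arguments are made up for illustration and are not part of the script further down.

```ruby
# Hypothetical helper mirroring f(x) = kl + (el * x) + x
def line_length(ip_count, key_len = 40, ip_len = 15)
  key_len + (ip_len * ip_count) + ip_count
end

line_length(50) # => 840
line_length(60) # => 1000
line_length(61) # => 1016
```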

Now that we can arbitrarily calculate the length of a line based on the number of ip elements we want to test, we can start “guessing”. This isn’t guessing at all, we just split our possible range in half and use the middle ground to test the possible length. In other words, if my initial range is 1..100 (read as “anywhere from 1 ip element to 100 ip elements”), then our first test value for x above would be 50, which if you remember produces a line length of 840. I assumed that I’d be okay with a max line length of 1000 characters, and so we assert that if len is less than the max, then we can use the upper half of the range boundary, or 50..100. If len was more than our max of 1000, we’d take the bottom half, or 1..50.

Using this technique recursively we can whittle down to the exact number of ip elements that can be inserted on a line before we go over the limit of 1000 characters on the line, which happens to be 60. You know you’re done checking when your range is only one element apart, in this case 60..61. With my first solution to iterate up from 1 to 100, this meant we had to check 61 times before we knew we were over the limit. With this new range, we actually only needed 8 iterations! Very cool how “guessing” can solve the problem quite nicely.

require 'digest/sha1'
@k_len = Digest::SHA1.hexdigest('a').size # 40
@ip_len = "255.255.255.255".size # 15
@range = 1..100 # starting range
@max_line_len = 1000 # length to check against
@count = 0 # iteration counter
# Given an upper and lower boundary, determine if its
# middle value is over or under the given line length.
# If over, use the lower boundary (lower..mid) for a recursive check,
# otherwise use upper boundary (mid..upper)
def check_boundary(lower, upper)
  # determine middle value
  mid = lower + ((upper-lower)/2)
  # Exit recursion if we've found the value
  throw(:found_value, mid) if (upper-lower) == 1
  # only increment iter count if we're checking the length
  @count += 1
  # Get the line length for the variable number of elements
  len = @k_len + (@ip_len*mid) + mid
  # Perform the test
  if len > @max_line_len
    puts_stats lower, mid, upper, len, :over
    # use the lower boundary
    check_boundary(lower, mid)
  else
    puts_stats lower, mid, upper, len, :under
    # use the upper boundary
    check_boundary(mid, upper)
  end
end
# Log method for values in a given test
def puts_stats lower, mid, upper, len, over_under
  puts '%10d | %10d | %10d | %10d | %10s' % [lower, mid, upper, len, over_under]
end
# Specify some information for readability
puts 'Determining how many ip elements can sit on a line with a max length of %d' % @max_line_len
puts
legend = '%10s | %10s | %10s | %10s | %10s' % %w(lower mid/test upper len over/under)
puts legend
puts '-'*legend.size
# Run the recursive boundary checking
golden_ticket = catch(:found_value) do
  check_boundary(@range.first, @range.last)
end
# Output results
puts
puts 'Golden Ticket (under) = %s' % golden_ticket.to_s
possible_iterations = @range.last-@range.first
efficiency = @count.to_f / possible_iterations.to_f
puts '%d iterations for %d possible iterations (%f efficiency)' % [@count, possible_iterations, efficiency]

Running the above script will produce the following output:

Determining how many ip elements can sit on a line with a max length of 1000
     lower |   mid/test |      upper |        len | over/under
--------------------------------------------------------------
         1 |         50 |        100 |        840 |      under
        50 |         75 |        100 |       1240 |       over
        50 |         62 |         75 |       1032 |       over
        50 |         56 |         62 |        936 |      under
        56 |         59 |         62 |        984 |      under
        59 |         60 |         62 |       1000 |      under
        60 |         61 |         62 |       1016 |       over
Golden Ticket (under) = 60
8 iterations for 99 possible iterations (0.080808 efficiency)

I’m not really sure if the efficiency figure makes sense, but you get a sense that it’s a LOT faster, not only because we’re calculating the line length arithmetically for each test instead of building the line, but also because we’re recursing through a fraction of the checks that the brute force method performs. It’s also fun to inflate/deflate the max line len or the starting range values to see how it affects the number of recursions needed to find the number. For instance, set the max line len to 100000 and see how many extra calls have to be made. Also, what happens if your range isn’t big enough? What if the range is off (e.g. 75..100)?
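For poking at those last two questions, here is a rough iterative sketch of the same search. It is not the script above: the names length_for and max_ips are invented for this example, and the two guard clauses are one possible answer to "what if the range is off", not the only one.

```ruby
# Hypothetical helper: line length for a given number of ips
def length_for(ip_count, key_len = 40, ip_len = 15)
  key_len + (ip_len * ip_count) + ip_count
end

# Returns the largest ip count in range whose line still fits,
# or nil when even the smallest count in the range is too long.
def max_ips(range, max_line_len)
  lower, upper = range.first, range.last
  return nil if length_for(lower) > max_line_len    # range starts too high
  return upper if length_for(upper) <= max_line_len # range too small, upper still fits
  # From here on: lower always fits, upper never does
  while upper - lower > 1
    mid = lower + (upper - lower) / 2
    if length_for(mid) > max_line_len
      upper = mid
    else
      lower = mid
    end
  end
  lower
end

max_ips(1..100, 1000)  # => 60
max_ips(75..100, 1000) # => nil (75 ips is already over the limit)
max_ips(1..10, 1000)   # => 10 (the real answer lies beyond the range)
```

The loop keeps the same "split the range in half" idea, it just swaps recursion and throw/catch for two variables.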

Algorithms are nifty.

Git Pre-receive Hook for Rejecting a Bad Gemfile

Bundler has a cool facility with Gemfiles that allows you to specify some fine-grained options for a given gem beyond specifying a version. Things like :path, :branch, :git, and :tag. All of those things are neat for development, but horrible for production. I wanted a way to reject pushes to a repo if the Gemfile was changed to include any one of those options, and a git pre-receive hook was just the tonic.

#!/usr/bin/env ruby
BRANCHES = %w( master stable )
REJECT_OPTIONS = %w( git tag branch path )
# git hands the hook "<old sha> <new sha> <ref name>" on STDIN
old_sha, new_sha, ref = STDIN.read.split(' ')
exit 0 unless BRANCHES.include?(ref.split('/').last)
diff = %x{ git diff-index --cached --name-only #{old_sha} 2> /dev/null }
if diff.is_a?(String)
  diff = diff.split("\n")
end
if diff.detect{|file| file =~ /^Gemfile$/}
  tree = %x{ git ls-tree --full-name #{new_sha} Gemfile 2> /dev/null }.split(" ")
  contents = %x{ git cat-file blob #{tree[2]} 2> /dev/null }
  invalid_lines = contents.each_line.select do |line|
    line =~ /\b(#{REJECT_OPTIONS.join('|')})\b/
  end
  unless invalid_lines.empty?
    puts
    puts '> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!'
    puts '> ---- PUSH REJECTED by origin ----'
    puts '>'
    puts "> You've specified an invalid option for #{invalid_lines.size} gem definitions in the Gemfile"
    puts "> Invalid options are: #{REJECT_OPTIONS.join(', ')}"
    puts '>'
    puts "> The offending gems:"
    puts ">\t" + invalid_lines.join(">\t")
    puts '>'
    puts '> To fix:'
    puts ">\t* Remove the offending options"
    puts ">\t* bundle install"
    puts ">\t* Run tests"
    puts ">\t* Amend previous commit (git add . && git commit --amend)"
    puts ">\t* git push origin #{ref.split('/').last}"
    puts '>'
    puts '> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!'
    puts
    # a non-zero exit status tells git to reject the push
    exit 1
  end
end

The script above monitors pushes to the “master” and “stable” branches (our development and production lines, respectively). It checks to see if the Gemfile was listed in the new commit file list, then parses the blob of the Gemfile for any of the offending options. Each offending line is then output back to the pushing developer with instructions on how to fix his/her Gemfile and how to amend the commit. Here’s what the output looks like:

$ git push origin master
Counting objects: 5, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 362 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
remote:
remote: > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
remote: > ---- PUSH REJECTED by origin ----
remote: >
remote: > You've specified an invalid option for 2 gem definitions in the Gemfile
remote: > Invalid options are: git, tag, branch, path
remote: >
remote: > The offending gems:
remote: > gem 'utilio', :git => 'git@github.com:localshred/utilio.git'
remote: > gem 'rails', :git => 'git@github.com:rails/rails.git'
remote: >
remote: > To fix:
remote: > * Remove the offending options
remote: > * bundle install
remote: > * Run tests
remote: > * Amend previous commit (git add . && git commit --amend)
remote: > * git push origin master
remote: >
remote: > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
remote:
To git@git.mycompany.com:repo1.git
! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'git@git.mycompany.com:repo1.git'

It’s also worth noting that since this is a pre-receive hook, returning any exit status other than 0 makes git reject the push, so none of the pushed refs get updated. This is good because we don’t want “bad code” in our repo. You could also use this to do other checking measures, such as running a CI build or syntax checks.
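If you want to try the line filter without pushing anything, it can be pulled out and run against a sample string. This is a sketch only: the method name invalid_gemfile_lines and the sample Gemfile contents are made up for illustration.

```ruby
REJECT_OPTIONS = %w( git tag branch path )
REJECT_PATTERN = /\b(#{REJECT_OPTIONS.join('|')})\b/

# Hypothetical helper: returns the Gemfile lines that mention a rejected option
def invalid_gemfile_lines(contents)
  contents.each_line.select { |line| line =~ REJECT_PATTERN }
end

# Made-up Gemfile contents for the demonstration
sample = <<-GEMFILE
source 'http://rubygems.org'
gem 'sinatra', '1.1.0'
gem 'utilio', :git => 'git@github.com:localshred/utilio.git'
gem 'thor', :path => '../thor'
GEMFILE

invalid_gemfile_lines(sample).size # => 2

# The pattern matches whole words anywhere on the line, so a gem
# that is simply named "git" would be flagged too.
invalid_gemfile_lines("gem 'git', '1.2.5'\n").size # => 1
```

That last case is a known rough edge of matching on bare words; tightening the pattern to look for the option as a hash key would be one way around it.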

To use the above hook, simply copy the script into the ./hooks/pre-receive file in your origin repo. Be sure to chmod +x ./hooks/pre-receive, otherwise git won’t be able to invoke the script when a new push occurs. I manage ~15 repos at work that I want to use the hook on, so I just kept the file in the git user’s home directory and symlinked it into each repo’s hooks directory. Same results, just easier to manage if I need to make a quick change to the hook.

Happy coding.

Thor Script for Managing a Unicorn-driven App

Today I deployed a mini sinatra app on one of our test servers to manage some internal QA. I’ve put out quite a few apps backed by Unicorn in QA recently and finally wrote a little script to handle stopping, starting, and reloading of the unicorn processes. Nothing super special here, just thought I’d share a useful script. Drop the following code into your application’s tasks directory, or place it on the app root and call it Thorfile.

tasks/unicorn.thor (or Thorfile)

# put me in /path/to/app/tasks/unicorn.thor
require 'thor'
class UnicornRunner < Thor
  namespace :unicorn
  UNICORN_CONFIG = "/path/to/app/config/unicorn.rb"
  RACKUP_FILE = "/path/to/app/config.ru"
  PID_FILE = "/path/to/app/tmp/application.pid"
  desc 'start', 'Start the application'
  def start
    say 'Starting the application...', :yellow
    `bundle exec unicorn -c #{UNICORN_CONFIG} -E production -D #{RACKUP_FILE}`
    say 'Done', :green
  end
  desc 'stop', 'Stop the application'
  def stop
    say 'Stopping the application...', :yellow
    `kill -QUIT $(cat #{PID_FILE})`
    say 'Done', :green
  end
  desc 'reload', 'Reload the application'
  def reload
    say 'Reloading the application...', :yellow
    `kill -USR2 $(cat #{PID_FILE})`
    say 'Done', :green
  end
end

Usage

From your application root directory, run any of the three commands. Keep in mind you’ll need a unicorn config file that actually dictates how Unicorn should behave (like number of workers, where your logs go, etc). You’ll also need a Rackup file (config.ru) which tells unicorn how to run your app.

$ thor -T
# thor unicorn:start Start the application
# thor unicorn:stop Stop the application
# thor unicorn:reload Reload the application
$ thor unicorn:start # starts the unicorn master
$ thor unicorn:stop # sends the QUIT signal to master (graceful shutdown)
$ thor unicorn:reload # sends the USR2 signal to master (graceful reload of child workers)

Plop this puppy behind nginx and you’re golden. Thor has a lot more things you could do with this (like overriding which config file to use) by providing method-level options, but this is a great starting point for most people. Leave a comment if you have any improvements or other ways you handle this.
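One possible improvement, sketched here with plain Ruby rather than anything Thor-specific: send the signal with Process.kill instead of shelling out to kill and cat. The helper name signal_master is invented for this example, and how you report a missing pid file is up to you.

```ruby
# Hypothetical helper: read the master pid from pid_file and send it a signal.
# Returns nil (after a warning) when the pid file or the process is missing.
def signal_master(pid_file, signal)
  pid = Integer(File.read(pid_file).strip)
  Process.kill(signal, pid)
rescue Errno::ENOENT
  warn "No pid file at #{pid_file}; is the app running?"
  nil
rescue Errno::ESRCH
  warn "No process with the pid recorded in #{pid_file}"
  nil
end
```

Inside the stop task that would read signal_master(PID_FILE, 'QUIT'), and the reload task would pass 'USR2'. The upside is that a stale or missing pid file produces a readable message instead of a shell error.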

Mapping Object Values With Ruby’s Ampersand-symbol Technique

Discovered another little Ruby nugget the other day. The nugget gives a shorter syntax when you want to map the return value of a message sent to a list of objects, say, the name of the class of the object. In the past I would use Array#map to produce the list with something like:

objects = [1, :number_1, "1"]
classes = objects.map {|o| o.class }
classes.inspect
# => [Fixnum, Symbol, String]

Turns out that Ruby has a shortcut that shortens your keystrokes a bit:

objects = [1, :number_1, "1"]
classes = objects.map(&:class)
classes.inspect
# => [Fixnum, Symbol, String]

The two snippets are functionally identical. By passing a symbol to map preceded by an ampersand, Ruby will call Symbol#to_proc on the passed symbol (e.g. :class.to_proc), which returns a proc object like {|o| o.class }. Where would you use this you ask? The day I learned this little ditty I was writing some tests that were verifying some active record associations. Whenever I needed to update values on a has_many collection for a particular model, I actually needed to assert that the associated collection of objects were rebuilt with the new values, deleting the old rows and recreating new ones. The ampersand-symbol technique above was nice for this.

describe Father do
  it 'should create new children when I attempt to update the children' do
    father = Factory(:father)
    orig_children = father.children.map(&:id)
    # perform the update method
    father.reload
    father.children.map(&:id).should_not == orig_children
  end
end
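If you want to see what Symbol#to_proc hands back, you can call it directly. A small sketch using only core Ruby; the variable names are arbitrary.

```ruby
# Symbol#to_proc builds a proc that sends the symbol to its argument
to_class = :class.to_proc
to_class.call("1") # => String

words = %w(milk bread butter)
# The shorthand and the explicit block produce the same result
words.map(&:upcase) == words.map { |word| word.upcase } # => true
words.map(&:size) # => [4, 5, 6]
```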

So I thought I’d pass the word on. Cool stuff in Ruby. Who knew?

Areas of Focus

In 2009 I created a website to keep track of the simple goals in my life. It was a new way to set goals for me: set one simple goal each day. One Simple Goal was born out of a few days of work, because the concept is simple, and as you all know, I like simplicity. It helps me focus on what matters most, while assisting me in ignoring what doesn’t.

In the lead-up to 2011, I’ve been quietly thinking about what types of resolutions I will make, if any. Typically in the past I’ve shied away from making grandiose resolutions, usually just picking certain directions I’d like my life to flow, and then storing them as high-level ideals about my life. This year will be different.

Not different in that I’ll be buying that gym pass or whatever clichés abound when talking about resolutions. No, I am fairly certain I won’t be doing that kind of generalized stuff any time soon. Different because I’ll be more concrete about two things: 1) Focus, and 2) Timeframe. Humans generally love to set goals, and almost always find a way to not achieve them. I’m no different. But when I realized that I still wanted the results of those goals, I stumbled on the idea of making a simple goal every day. No long to-do lists, no lofty “Mount Everest” goals. Simple things, like:

  • Finally paint wall patches I did last August. Go me.
  • Push out some code to OSG. Anything.
  • Write the first of the ruby daily series

Fast-forward to tonight when I realized that OSG is exactly how I should be determining (and implementing) my goals for 2011, with a few tweaks. The first and most noticeable tweak is the time frame. Instead of limiting my goal work to one day, it’ll be 2 to 4 weeks, with a preference to lean on a full month. The focus aspect changes slightly also, as it’s easy to set a specific focus for a given day, but more challenging to do so for a given month. I have a few ideas about how using a system like Scrum can help me achieve real focus with a longer time frame. Keep in mind that as with OSG, the idea is that you have something concrete that you can say: “I did (insert goal here)”. So things like “Become more good looking” don’t count because they’re arbitrary in what their completion may look like. We’re looking for concrete goals, things that (as Seth Godin says) you actually ship. The concept of shipping is paramount.

I threw a list together on SpringPad with a few ideas that I plan on thinking through a little further, but the first month is more or less solidified in my mind as to what I’ll be working on (and yes, it actually is scary to me, in a good way). The overarching focus for 2011 is to develop myself further. 2010 was a banner year for me professionally, and 2011 is poised to be the same or more. But something I felt was lacking was a commitment to be better, not for my career or job, just for me. So 2011’s list looks like that.

One over-arching goal that falls out of this framework is that I want to get back into writing. I was into blogging in ‘07 and ‘08 and thoroughly enjoyed it. Not the page-views and all that garbage, just the act of writing about something. Anything. I want to enjoy it again. With all that in mind, I’ve decided that I’m going to work hard at documenting my ride through 2011 and the way I’m setting this whole thing up. Hopefully it’ll help somebody out, but mostly it’s for my own records and enjoyment.

So here are a few rules for setting goals this year. And yes, I just made them up.

  1. Only concrete goals that can be definitively completed within a month are allowed. Can you answer “Yes” or “No” to the question: Did you ship it?
  2. Do things that are worthwhile and that stretch me. No maintaining of status quo, go beyond.
  3. Planning for the month’s goals happens at the beginning of the month, preferably the first day of the month. Plan only for the current month (iteration).
  4. Retrospectives about what was or wasn’t completed happen at the end of the month. Honesty is key here.
  5. Blog often.
  6. Have Fun.
  7. Write more lists like Ryan ;).

I’ll post more soon. In the meantime, why don’t you go set some simple goals?

Intersecting Arrays in Ruby

Just found a slightly satisfying approach to checking the contents of an array in ruby.

I like using Array#include? to figure out whether or not my given array has a certain entry. Unfortunately, if you want to check if an array has a set of possible values, such as, does it contain :a or :b, you can’t just pass an array of those values. Let me show you what I mean:

food = [:milk, :bread, :butter]
food1 = [[:milk, :bread], :butter]
expected = [:milk, :bread]
food.include?(:milk) # => true
food.include?(expected) # => false
food1.include?(expected) # => true

In other words, include? is very specific about the way it does the matching. But what if I want food.include?(expected) to tell me if food has any of expected’s values? Enter Array#&. It doesn’t make include? do anything different, but does give us a simple way to get this newer behavior:

food = [:milk, :bread, :butter]
expected = [:milk, :bread]
(food & expected).size > 0 # => true

Array#& gets the intersection of two arrays (the values that are present in both) and returns a new array containing only those values. You could add this to any Array instance by simply defining your own include_any? method:

# myapp/lib/ext/array.rb
class Array
  # true if at least one of values is present
  def include_any? values
    (self & values).size > 0
  end
  # true if every one of values is present
  def include_all? values
    (self & values).size == values.size
  end
end
[:milk, :bread, :butter].include_any?([:milk, :butter]) # => true
[:milk, :bread, :butter].include_all?([:milk, :eggs]) # => false
[:milk, :bread, :butter].include_all?([:milk, :butter, :bread]) # => true

I cheated and gave you an include_all? method also, which just ensures that all of the expected values are present.

I could’ve used Enumerable#any? but then we’d have to use a block and still use Array#include?. This way, we’re golden.
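For comparison, here is roughly what the block-based versions look like, plus one wrinkle worth knowing about the size comparison. A sketch only; the sample arrays are made up.

```ruby
food = [:milk, :bread, :butter]
expected = [:milk, :eggs]

# Block-based equivalents of include_any? and include_all?
expected.any? { |item| food.include?(item) } # => true
expected.all? { |item| food.include?(item) } # => false

# Array#& drops duplicates, so comparing sizes can mislead
# when the list of wanted values repeats an entry.
wanted = [:milk, :milk]
(food & wanted).size == wanted.size # => false, although :milk is present
(wanted - food).empty?              # => true
```

If duplicates in the wanted list are a possibility, the subtraction form is one way to sidestep the issue.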

What cool things have you done with ruby today?

Hooking Instance Methods in Ruby

Everyone and their dog is familiar with ActiveRecord-style callbacks, you know, the kind where you specify you want a particular method or proc to be run before or after a given event on your model. It helps you enforce the principles of code ownership while making it trivial to do the hardwiring, ensuring that code owned by the model is also managed by the model.

I love this kind of programming and recently found that I needed some similar functionality in a particular class, one that wasn’t tied to Active[Insert your railtie here]. My case was different in that I knew that any class inheriting from a particular base, which we’ll call HookBase, needed a hardwired hook for every method defined, functionality that needed to run for virtually every instance method call. The following example illustrates my need:

class HookBase
  def hardwired_hook
    # functionality every method needs
  end
end
class MyClass < HookBase
  def find_widget
    # needs setup/teardown help from HookBase
  end
end

So, operating from the idea that every instance method an extending class implements should have default wrap-around behavior, I got to work. First off, you need to know that ruby has built-in lifecycle hooks on your classes, objects, and modules. Things like included and extended and method_added help you hook in to your code to ensure that the appropriate things are happening on your classes, objects, and modules. So in my case, I needed to know when a method was added to HookBase (or one of its children) so that I could appropriately tap into that code.

method_added is where the meat of the solution lies. When a method is added, ruby fires the method_added call on the object (if any exists), passing it the name of the new method. Keep in mind that this happens after the method has already been created, which is crucial to this solution. We’ll next create a new name from the old name, prepended with some identifier (in this case we chose “hide_”).

We’ll need to check the private_instance_methods array for already defined method names to ensure we’re not duplicating our effort (or clobbering someone else’s), as well as checking our own array constant for methods we don’t want to hook. Remember that method_added will be called on every method that is found for HookBase as well as its children. I found that there were HookBase methods I had implemented that were supporting this behavior and didn’t need to be hardwired, so I added them to my list of methods to ignore.

If we’ve made it this far, go ahead and alias the old method to the new one, then privatize the new one. Now we can safely redefine the old method without destroying the code it contained. We also now know that no one (except self) can invoke the private method directly, they’ll have to implicitly go through the HookBase first.

Redefining the old method is as simple as using define_method and calling our hardwired_hook method within, passing our new_method (which is privatized), and the old method (for convenience), and any associated arguments and blocks.

The final implementation looks something like this:

class HookBase
  class << self
    NON_HOOK_METHODS = %w( hardwired_hook some_other_method )
    def method_added old
      new_method = :"hide_#{old}"
      return if (private_instance_methods & [old, new_method]).size > 0 or old =~ /^hide_/ or NON_HOOK_METHODS.include?(old.to_s)
      alias_method new_method, old
      private new_method
      define_method old do |*args, &block|
        hardwired_hook new_method.to_sym, old.to_sym, *args, &block
      end
    end
  end
  private
  # Hardwired handler for all method calls
  def hardwired_hook new_method, old_method, *args, &block
    # perform any before actions
    puts 'doing stuff before method call...'
    # Invoke the privatized method, keeping its return value
    result = __send__(new_method, *args, &block)
    # perform any after actions
    puts 'doing stuff after the method call...'
    # hand back whatever the original method returned
    result
  end
end
class MyClass < HookBase
  def find_widget
    puts 'finding widget...'
  end
end
MyClass.new.find_widget
# doing stuff before method call...
# finding widget...
# doing stuff after the method call...

The great thing about this approach is you may not even care about hardwiring anything, but just want to provide hooking functionality. If that’s the case, simply define a class method in HookBase to register a hook (such as before or after), optionally accepting an :only or :except list of methods. Internally store the blocks passed and invoke them in the hardwired_hook method either before or after the method call.
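A bare-bones sketch of that registration idea might look like the following. It is a standalone variant rather than a drop-in change to the class above: the before, after, and hooks class methods are names picked for this example, and :only/:except filtering is left out to keep it short.

```ruby
class HookBase
  class << self
    # Hooks registered on this class, stored per class
    def hooks
      @hooks ||= { :before => [], :after => [] }
    end

    def before(&block)
      hooks[:before] << block
    end

    def after(&block)
      hooks[:after] << block
    end

    def method_added(name)
      hidden = :"hide_#{name}"
      # Skip our own aliases, private methods, and methods already wrapped
      return if name.to_s.start_with?('hide_') ||
                private_instance_methods.include?(name) ||
                private_instance_methods.include?(hidden)
      alias_method hidden, name
      private hidden
      define_method(name) do |*args, &block|
        self.class.hooks[:before].each { |hook| hook.call(name) }
        result = __send__(hidden, *args, &block)
        self.class.hooks[:after].each { |hook| hook.call(name) }
        result
      end
    end
  end
end

class MyClass < HookBase
  before { |name| puts "before #{name}" }
  after  { |name| puts "after #{name}" }

  def find_widget
    puts 'finding widget...'
    :widget
  end
end

MyClass.new.find_widget
# before find_widget
# finding widget...
# after find_widget
```

Because the hooks live in a class-level instance variable, each subclass keeps its own list; sharing hooks down an inheritance chain would need a lookup through superclass as well.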

Let me know if you have any comments or different approaches. Happy hacking!

UPDATE: Forgot that method_added needs to be defined in class << self to work properly. Also updated to use the Array#& intersection method I described in Intersecting arrays in ruby instead of using Array#include?.

Variable-length Method Arguments in Ruby

In yesterday’s post regarding the four things you should know about ruby methods, I covered some basics about ruby method definitions. For this post, I just wanted to go into a little more detail here with some of the things I left off the table that came to me later.

Variable-length arguments

In number three on our list of things you should know, we talked about variable-length arguments, also known as the array-collected parameter. At least, that’s what I call it… sometimes. This cool feature allows you to pass any number of arguments to a ruby method and have all the unmapped parameters get collected into a single parameter which becomes an array of the values that were passed. Whew! That was a mouthful.

def log_all(*lines)
  lines.each{|line| @logger.info(line)}
end
log_all 'foo', 'bar', 'baz', 'qux'
# File: logs/your_log_file.log
# foo
# bar
# baz
# qux

This is neat, no doubt, but what if you don’t want to specify each value individually to the array-collected parameter in the method call? Say for instance, in the previous example, I already had the values 'foo', 'bar', 'baz', and 'qux' in an array. Ruby allows you to pass the array through as a pre-collected array of values. The only thing to do is pass the array as a single parameter, prefixed by an asterisk (*).

def log_all(*lines)
  lines.each{|line| @logger.info(line)}
end
# values are predefined
values = ['foo', 'bar', 'baz', 'qux']
# Don't forget the asterisk!
log_all *values

If you were to indeed forget the asterisk before the values array being passed, log_all’s lines parameter would still be an array, but would only contain one element: the array you passed. So in order to get to it you’d either have to flatten lines or call the 0th index on it, which sort of defeats the purpose.
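Here is the difference side by side. A sketch using a stand-in method, collect_all, that simply returns what it was given so the result is easy to inspect.

```ruby
# Hypothetical stand-in for log_all that returns the collected array
def collect_all(*lines)
  lines
end

values = ['foo', 'bar']

collect_all(*values)        # => ["foo", "bar"]
collect_all(values)         # => [["foo", "bar"]]
collect_all(values).flatten # => ["foo", "bar"]
collect_all(values)[0]      # => ["foo", "bar"]
```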

Don’t forget blocks, lambdas, and procs!

This array-collection technique for method parameter definition is not confined to normal methods; it also works with ruby’s trio of anonymous function definitions: blocks, lambdas, and procs. Take the same example from above, with log_all rewritten as a lambda.

log_all = lambda do |*lines|
  lines.each{|line| @logger.info(line)}
end
# values are predefined
values = ['foo', 'bar', 'baz', 'qux']
# Don't forget the asterisk!
log_all.call *values

You can even do this with ActiveRecord named_scopes, which is wicked cool.

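class User < ActiveRecord::Base
  # Braces rather than do...end, so the block binds to lambda and not to named_scope
  named_scope :with_names_like, lambda { |*names|
    # one "lower(name) LIKE ?" clause per name, joined with OR
    clause = names.collect{|name| "lower(name) LIKE ?"}.join(' OR ')
    # the wildcards go into the bound values, not around the placeholder
    { :conditions => [clause, *names.collect{|name| "%#{name.downcase}%"}] }
  }
end
User.with_names_like 'jeff', 'jose', 'jill'
# Creates a condition sql string like so:
# WHERE (lower(name) LIKE '%jeff%' OR lower(name) LIKE '%jose%' OR lower(name) LIKE '%jill%')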

I’m interested to know what other ways you’ve come up with to use this technique. Please leave a comment below if you have any questions or examples of your own work.

Four Things You Should Know About Ruby Methods

Just wanted to jot down some of the really cool things I’ve learned about the way you can call methods in ruby. I may end up expanding this post into four separate posts with more info if need be, but for now I’ll try to keep this short.

1. Default values

When defining a method with parameters, inevitably you’ll find that it can prove useful to have some of the params revert to a default value if no value is passed. Other languages like python and php give you similar conventions when providing the parameter list to a method.

To use default values, simply use an equals sign after the parameter name, followed by the default value you wish to use. Note however, that while not all params need have a default value assigned, Ruby 1.8 requires all params that do have defaults to go at the end of the parameter list. Ruby 1.9 and later relax this rule, but keeping the defaults at the end is still the easiest convention to read.

# A syntax error in Ruby 1.8 (Ruby 1.9 and later accept it): either from_date must be at the end of the parameter list, or to_date must have a default value
def back_to_the_future(from_date=1985, to_date)
  puts "From Date: #{from_date}"
  puts "To Date: #{to_date}"
end
# Valid method definition in every version
def back_to_the_future(from_date=1985, to_date=1955)
  puts "From Date: #{from_date}"
  puts "To Date: #{to_date}"
end
back_to_the_future
# => From Date: 1985
# => To Date: 1955
back_to_the_future 2010
# => From Date: 2010
# => To Date: 1955
back_to_the_future 2010, 2000
# => From Date: 2010
# => To Date: 2000

Pretty simple, but hey, maybe you didn’t know.

2. “Named Parameters” using the special hash parameter

Python, Objective-C, and various other languages have an interesting syntax for method arguments where you can name an argument beyond the scope in which the method is defined. These named parameters give you the ability to assign values to method arguments in an arbitrary order, since you are assigning a value to a specific parameter by that parameter’s name.

While ruby doesn’t have Named Parameter syntax built in, there is one way to gain something very similar, and it has to do with Ruby Hashes. Ruby’s hash syntax has a very simple, minimalist style that I really like, especially in Ruby 1.9. The interesting thing about using the hash parameter as a “named parameter” or “variable-length” argument, is that there is no syntactic sugar needed when defining the method. All the interesting work goes on while calling the method.

def back_to_the_future(options)
  puts "From #{options[:from]} to #{options[:to]}"
end
back_to_the_future :from => 1985, :to => 1955
# => From 1985 to 1955
back_to_the_future :to => 1985, :from => 1955
# => From 1955 to 1985
back_to_the_future :with_doc => false
# => From to

Notice here that we didn’t have to include the open and close curly braces usually present in a hash definition. You can put them in if you’d like, but sometimes it’s more confusing to see it that way (what with block syntax using curly braces or the do...end syntax).

You can still have regularly defined parameters with or without defaults in the parameter list, just make sure they come before the expected hash collection.

def back_to_the_future(from_date, to_date, options) ... end
def back_to_the_future(from_date, to_date=1955, options) ... end
def back_to_the_future(from_date, to_date=1955, options={}) ... end

You’ve probably noticed by now that rails does this all over the place.

3. Variable-length arguments using the special array parameter

While named parameters are nice for passing a list of configuration options to a method, sometimes you just want a method to accept any number of arguments, such as a logger method that can take any number of log messages and log them independently of each other. Ruby has another parameter-condensing technique where all arguments passed that do not map to pre-defined parameters get collected into their own special array parameter, but only if you ask for it. To define a method this way, you simply place a parameter at the end of the parameter list with an asterisk (*) preceding the parameter name.

def back_to_the_future(from_date, to_date, *people)
  puts "Who's coming?"
  people.each {|person| puts person }
end
back_to_the_future 1985, 1955, 'Marty', 'Doc', 'Biff'
# => Who's coming?
# => Marty
# => Doc
# => Biff

Also worth noting is that the collected arguments do not all need to be of one type, as Java’s varargs require. One could be a string, the next a number, the next a boolean. Whether or not that is a good design for your method is another story.
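Here is a rough sketch of the logger idea mentioned earlier, mixing argument types in one call. The method name log_lines and the message format are assumptions made for this example.

```ruby
# Hypothetical logger: returns one formatted line per message.
def log_lines(level, *messages)
  # messages is always an Array, even when no extra arguments are passed.
  messages.map { |message| "[#{level}] #{message}" }
end

# A string, a number, and a boolean all land in the same array.
puts log_lines(:info, 'Flux capacitor charged', 88, true)
# [info] Flux capacitor charged
# [info] 88
# [info] true

puts log_lines(:warn).empty?  # true, since nothing was collected
```

Note that calling the method with no extra arguments is not an error; the array parameter is simply empty.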

This style of parameter definition can be mixed with all the styles we’ve discussed so far; just remember the order things go in: regular params, regular params with defaults, a hash-collected param (if any), and finally the array-collected params (where the param is preceded by an asterisk).

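def back_to_the_future(from_date, to_date=1955, delorean_options={}, *people)
  # ...
end
# The hash is not the last argument here, so it needs its braces, and to_date
# must be passed explicitly so that the hash lands in delorean_options.
back_to_the_future 1985, 1955, {:use_flux_capacitor => true, :bring_back_george => false}, 'Marty', 'Doc'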
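To see where each argument ends up with a signature like this, the sketch below returns the bindings as a hash. The method name trip_bindings is made up for the illustration. As far as I know, Ruby only accepts the brace-less hash form as the final argument, so the hash is written with braces here because more arguments follow it.

```ruby
# Hypothetical method that reports how its arguments were bound.
def trip_bindings(from_date, to_date = 1955, delorean_options = {}, *people)
  { :from => from_date, :to => to_date, :options => delorean_options, :people => people }
end

# Only the required argument: every other parameter takes its default.
p trip_bindings(1985)

# Positional arguments fill parameters left to right; whatever is
# left over is collected into people.
p trip_bindings(1985, 2015, { :use_flux_capacitor => true }, 'Marty', 'Doc')
```

Arguments are assigned strictly by position, so skipping to_date would push the hash into the to_date slot.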

4. Saving the best for last: Blocks!

Arguably the most powerful feature that Ruby boasts is the ability to send an anonymous function to a method, to be executed by that method in whatever way it was designed. Ruby calls these anonymous chunks of code just that: blocks. In other contexts you might hear them called lambdas, procs, or simply anonymous functions. You’ve probably already used blocks a ton in your Ruby code, but what exactly are they for, and how can you use them in your own code?

Virtually every class in Ruby’s core makes use of blocks to extend the language’s abilities without having to add more syntactic sugar. Take, for instance, iterating over an array: first the conventional way with a for loop, then Ruby’s more idiomatic way, with the each method and its associated block.

people = ['Marty', 'Doc', 'Biff']
for person in people
  puts person
end
people.each do |person|
  puts person
end

The first example uses Ruby’s syntax sugar to run the loop, printing out each entry in the people array. The second calls the each method on the people array, passing it a block. Array#each can, and likely will, run its own code before or after invoking the block. As a developer outside looking in, it doesn’t really matter to me what each does, so long as it calls my block for each element in the array. If we were to write a simplification of what Ruby is doing in the background, it’d probably look something like this:

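# Conceptual only: a real for loop calls each under the hood,
# so this definition would recurse if you actually ran it.
def each
  for e in self
    yield e
  end
end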

But wait a minute, isn’t that what we wrote in our first example without the block? Indeed, it’s very similar. Where our block example differs is that we have the ability to pass an anonymous block of code to the each method. When each is ready to call our block, it invokes yield, passing the argument applicable, in this case the e variable. In other words, each is handling the iteration for us, allowing us to focus on what matters more, the code being run for each iteration.
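For a version of this idea that actually runs, here is a minimal sketch. The method name each_person is made up, and it walks the array by index with a while loop so that it does not rely on each itself.

```ruby
# Hypothetical stand-in for each: handles the iteration and hands
# every element to the caller's block via yield.
def each_person(people)
  index = 0
  while index < people.length
    yield people[index]   # run the caller's block with this element
    index += 1
  end
  people                  # return the collection, as Array#each does
end

each_person(['Marty', 'Doc', 'Biff']) do |person|
  puts "#{person} is on board"
end
# Marty is on board
# Doc is on board
# Biff is on board
```

The method owns the loop; the block owns what happens on each pass.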

Syntactically, the big things to jot down with defining your methods to accept blocks are as follows:

  • All methods implicitly may receive a block as an argument.
  • If you want to name this argument, for whatever reason, it must be last in the argument list, and preceded by an ampersand (&).
  • Just as all methods implicitly may receive a block, you can always check in a given method if a block was given, by calling block_given?.
  • To invoke a block, use the yield keyword, passing any parameters your block may be expecting.
  • Alternatively, if you have named the block, say &func, treat it as a lambda or proc that is passed to you (because that’s what was passed), using the built-in call method available to procs: func.call(some_param).
def back_to_the_future(*people, &cool_block_name)
  puts "We're going back to the future with..."
  people.each do |person|
    cool_block_name.call(person)
  end
end
back_to_the_future 'Marty', 'Doc', 'Biff' do |person|
  puts person
end
# => We're going back to the future with...
# => Marty
# => Doc
# => Biff
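The other points from the list, block_given? and yield, can be sketched in the same spirit. Both method names below are made up for illustration, and falling back to the untouched names when no block is passed is just one possible design.

```ruby
# Hypothetical: uses the implicit block through yield.
def travelers(*people)
  # Without a block, just hand the names back untouched.
  return people unless block_given?
  people.map { |person| yield person }
end

# Hypothetical: the same behavior with a named block parameter.
def travelers_with_named_block(*people, &formatter)
  # formatter is a Proc, or nil when no block was passed.
  return people if formatter.nil?
  people.map { |person| formatter.call(person) }
end

puts travelers('Marty', 'Doc') { |person| person.upcase }
# MARTY
# DOC
puts travelers('Biff')
# Biff
```

Checking for the block first matters: calling yield, or call on a nil block parameter, raises an error when the caller did not pass one.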

All of these examples are obviously contrived, but I hope they shed some light on some really cool things you can do in Ruby with simple method definitions. I’ll likely be doing more posts on blocks, procs, and lambdas, since they are definitely the most powerful tools in the shed (as far as methods go), so look for those sometime in the near future.

Please let me know if you find any omissions or errors in the above examples and explanations. Happy Coding!