Author Archives: joshua

My journey through open source and fun JSON tricks

I recently attended a Ruby on Rails speed dating event here in San Francisco. Participants were trying to meet companies that would hire them full-time, and they had only five minutes to wow you. All of this happened over the din of 39 other participants attempting to do the same thing, with the organizer barking out minute-by-minute reminders. You’re gonna have to be impressive to stand out, and you will have to leave behind something you can be remembered by.

I’ve been doing open source software for a few years now. I came to Miso through open source. I really like open source, and within the Ruby community, most of that open source development seems to occur through Github. When I did the speed dating rounds, I asked people for their Github account names (I would have accepted self-hosted alternatives). Some people were nervous to share; some people didn’t have a Github account. I understand that, because…

Open source is scary

When I first started doing open source, I was struggling with what to work on. I didn’t feel like I had any projects I could easily contribute to. I had recently read Stefan Goessner’s article on JSONPath, and that made me think, maybe I could code this in Ruby. I looked around for an existing implementation in Ruby and when I couldn’t find one, I set out to write my own.

I hacked around for half a day, got something working and submitted my request to Rubyforge to have my project hosted. It took two days to get approval, but once I had that, I thought, I was finally on my way to becoming an open source hero. I didn’t have any particular need for the Gem I had written, it was more of a fun challenge, but there was something cathartic about getting my code out there for everyone to see.

My first Ruby gem, Jsonpath

I released my first Ruby gem and I was pretty proud of myself. I could do things like:

require 'jsonpath'

object = [
  {"title" => "Sayings of the Century", "price" => 18.75},
  {"title" => "Moby Dick", "price" => 20.95}
]

JsonPath.wrap(object).path("$..price").to_a
# => [18.75, 20.95]

I had implemented the full Jsonpath syntax; I even supported the odd array slice syntax and filter syntax. I thought I was doing well.
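To give a feel for what those three pieces of syntax select, here is a minimal sketch in plain Ruby. It does not use the gem, and `deep_values` and `book_list` are names made up for this illustration; it only mimics recursive descent (`$..price`), a slice such as `[0:1]`, and a filter such as `?(@.price < 20)`.

```ruby
# Collect every value stored under `key`, at any depth, in nested
# Hashes and Arrays. This mimics what the "$..price" path selects.
def deep_values(node, key, found = [])
  case node
  when Hash
    node.each do |k, v|
      found << v if k == key
      deep_values(v, key, found)
    end
  when Array
    node.each { |item| deep_values(item, key, found) }
  end
  found
end

# Same sample data as above, wrapped in a (hypothetical) helper method.
def book_list
  [
    {"title" => "Sayings of the Century", "price" => 18.75},
    {"title" => "Moby Dick", "price" => 20.95}
  ]
end

# Recursive descent: every "price" anywhere in the structure.
prices = deep_values(book_list, "price")

# A slice like "[0:1]" is roughly an end-exclusive Ruby range.
first_only = book_list[0...1].map { |book| book["price"] }

# A filter like "?(@.price < 20)" is roughly a select.
cheap = book_list.select { |book| book["price"] < 20 }.map { |book| book["title"] }
```

Here `prices` holds both prices, `first_only` holds just the first one, and `cheap` holds the one title priced under 20.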

Later I opened a Github account and got to put my code up there. While the experience on Rubyforge was more akin to putting your code in a gallery for the world to see, Github was much more of a living, breathing thing. Suddenly, my README was on the front page. Could I let my old README sit there unchanged? Though I had thought I was pretty clever at the time, pretty soon I felt shame. How could I not have documented my code better, for instance?

Getting over shame

Shame shapes the software we’re willing to release and not release. @mattmight recognized this shame within academia, and released the CRAPL license to address it. This license claims to “absolve authors of shame, embarrassment and ridicule for ugly code”. Looking back on Jsonpath, I feel bad. My Ruby knowledge was weak, I didn’t even know how to package up a gem correctly, and I didn’t have the sense to add any sort of documentation to my project. But getting over my lack of knowledge was useful. My gem has proven to be useful to others, and in turn, those other people have submitted pull requests. My interactions through open source have made me a better coder, and I’m grateful for it.

Speed dating and Ruby

If you’re coming to a Ruby employment event, the best place to start to wow people is via open source. Start with something you care about, even if that program is only useful to you. It doesn’t have to be beautiful; it can just be your “learning how to code” project. If you’re looking for work as a programmer, undoubtedly you’re working on some projects in your spare time to sharpen your skills. Ultimately, code is not meant to sit in a gallery; your code is art waiting to be seen by the wide world. I remember once Matz was asked at the end of one of his talks what he thought the most beautiful code was. His response was simply: “Your code.” Even if it’s not completely beautiful yet, it’s yours, and that sense of ownership and passion will come through in how you express yourself and see your own code.

What I’m looking for isn’t impressive code, though I’m happy to find that as well. What I want is to know you’ve gotten over your shame. It’s not just in the academic community that shame and fear rule; in many programming communities, the resistance to releasing half-finished or “messy” code is strong, and code stays locked up.

The consolation is that by getting over this shame and being willing to share your code, you will get feedback, you will meet people who will help you, and you’ll learn to participate in other people’s projects.

I’m still ashamed, but that’s okay

I’ve released version 0.5.0 of this gem, but it’s not really finished. (It never really is.) When I look back on my implementation which has remained largely unchanged, I know I could do a lot better. I intend to re-write this gem at some point, but, my “good enough” implementation has been helpful to people I don’t even know. I’m not sure the shame ever really goes away. And that’s okay. Soon your shame will be joined by other emotions. Collaborating with friends on a new project. The thrill of your first pull-request. Getting to submit your own first pull request to another project. Maybe even a project you use. Or other people use.

If you’re still hesitating, that’s okay. Looking back, though, I’m glad I got over my own fears and released something. Even if it’s not entirely pretty or finished, at least I got started.

Writing simple Ruby client/servers using Protobufs

We’ve recently been using protobufs as a serialization format for RPC here at Miso. While there are already some existing protobuf RPC solutions, we wanted something that could stream down any number of protobuf objects, and we also wanted the ability to have void return types.

We built Protoplasm to solve these problems. Protoplasm’s server is built on top of Eventmachine. Object serialization is done using Beefcake.

A common pattern for RPC using a serialization format is to have the request class have an enum for the request type and a series of optional fields to fill out the actual request details. Using Beefcake, our request object might look something like this:

class AddCommand
  include Beefcake::Message
  required :left, :int32, 1
  required :right, :int32, 2
end

class SubCommand
  include Beefcake::Message
  required :left, :int32, 1
  required :right, :int32, 2
end

class Command
  include Beefcake::Message
  module Type
    ADD = 1
    SUB = 2
  end
  required :type, Type, 1
  optional :add_command, AddCommand, 2
  optional :sub_command, SubCommand, 3
end

In this example, we support two kinds of requests, ADD and SUB. So, let’s get to implementing this.

First of all, we need to tie our Type enum to the command fields inside the command object. To do this in Protoplasm, we create a Types module, include Protoplasm::Types in it, and define the relationship between the type, the field, and the response type. Here is a complete example of how to do that:

require 'protoplasm'

module Types
  include Protoplasm::Types

  class AddCommand
    include Beefcake::Message
    required :left, :int32, 1
    required :right, :int32, 2
  end

  class SubCommand
    include Beefcake::Message
    required :left, :int32, 1
    required :right, :int32, 2
  end

  class MathAnswer
    include Beefcake::Message
    required :answer, :int32, 1
  end

  class Command
    include Beefcake::Message
    module Type
      ADD = 1
      SUB = 2
    end
    required :type, Type, 1
    optional :add_command, AddCommand, 2
    optional :sub_command, SubCommand, 3
  end

  request_class Command
  request_type_field :type
  rpc_map Command::Type::ADD, :add_command, MathAnswer
  rpc_map Command::Type::SUB, :sub_command, MathAnswer
end

Now we have all our definitions. request_class defines the class to use for our request. request_type_field tells us where to look for the command type in any given request object. The rpc_map method ties together the enum value, the field in the request object and response type.
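To make the role of rpc_map more concrete, here is a rough sketch of the kind of lookup table such a mapping implies. This is not Protoplasm’s actual internals: `RpcEntry`, `RpcRegistry` and `route` are names invented for this illustration, and requests are plain hashes rather than Beefcake messages.

```ruby
# Hypothetical registry, not Protoplasm's real implementation: each entry
# ties an enum value to a request field and a response class.
RpcEntry = Struct.new(:type, :field, :response_class)

class RpcRegistry
  def initialize
    @by_type = {}
  end

  # Record which field and response class belong to an enum value.
  def rpc_map(type, field, response_class)
    @by_type[type] = RpcEntry.new(type, field, response_class)
  end

  # Look up the entry for a request's type, pull out the matching
  # sub-command, and name the handler a server would call for it.
  def route(request)
    entry = @by_type.fetch(request[:type])
    [:"process_#{entry.field}", request[entry.field], entry.response_class]
  end
end

registry = RpcRegistry.new
registry.rpc_map(1, :add_command, :MathAnswer)
registry.rpc_map(2, :sub_command, :MathAnswer)

handler, command, response = registry.route(
  :type => 1, :add_command => {:left => 2, :right => 3}
)
```

The handler name built here follows the `process_<field>` convention used by the server methods in this post; everything else is an assumption made for the sake of the sketch.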

With all this under our feet, let’s get to building a client and server. To write a simple server, we could do the following:

require 'protoplasm'
require './types'

class Server < Protoplasm::EMServer.for_types(Types)
  def process_add_command(cmd)
    send_response(:answer => cmd.left + cmd.right)
  end

  def process_sub_command(cmd)
    send_response(:answer => cmd.left - cmd.right)
  end
end

Then, we can start our server by adding:

Server.start(40000)

To create a corresponding client, most of the work is done for you. Here is a sample client that would work with this server:

require 'protoplasm'
require './types'

class Client < Protoplasm::BlockingClient.for_types(Types)
  def add(l, r)
    send_request(:add_command, :left => l, :right => r).answer
  end

  def subtract(l, r)
    send_request(:sub_command, :left => l, :right => r).answer
  end

  def host_port
    ['localhost', 40000]
  end
end

This client will always try to connect to localhost on port 40000, but other than that, this is a completely working example. We can then issue requests to a running server by doing the following:

client = Client.new
client.add(2, 3)        # => 5
client.subtract(10, 7)  # => 3
client.subtract(-10, 7) # => -17

The entire source code for this example is at https://github.com/bazaarlabs/protoplasm-example.

When a request is made, here is what actually goes over the wire. A nine-byte header is sent, followed by the entire serialized protobuf object. The first byte of the header is a reserved byte; the next eight are a 64-bit unsigned, native-endian number giving the size in bytes of the protobuf object.
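The framing described above can be sketched with Ruby’s pack and unpack. The helper names `frame` and `unframe` are made up for this example, and the value 0 for the reserved byte is only a placeholder, since its real values are not covered here.

```ruby
# Frame layout: 1 reserved byte, then the payload size as a 64-bit
# unsigned integer in native byte order, then the payload itself.
HEADER_FORMAT = "CQ" # C = 8-bit unsigned, Q = 64-bit unsigned, native endian
HEADER_SIZE = 9      # 1 + 8 bytes; pack adds no padding

# Prefix an already-serialized payload with the nine-byte header.
# `control` stands in for the reserved byte (placeholder value).
def frame(payload, control = 0)
  [control, payload.bytesize].pack(HEADER_FORMAT) + payload.b
end

# Split a framed message back into its reserved byte and payload.
def unframe(data)
  control, size = data.byteslice(0, HEADER_SIZE).unpack(HEADER_FORMAT)
  [control, data.byteslice(HEADER_SIZE, size)]
end

message = frame("serialized protobuf bytes")
control, payload = unframe(message)
```

Because the size is native endian, a client and server on machines with different byte orders would disagree about the length, which is worth keeping in mind if you reuse this layout.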

The server responds first with the reserved byte. If the response is void, it stops sending data. If it’s streaming, it continues to send the full header plus each serialized object. The reserved byte in this case serves the purpose of indicating when streaming should stop. The client has no way to abort streaming aside from dropping the connection.
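A client-side read loop for a streamed response might look roughly like the following. The two control values are assumptions made up for this sketch (the actual values of the reserved byte are not given here), and `read_stream` is a hypothetical helper, not part of Protoplasm.

```ruby
require 'stringio'

# Assumed values for the reserved byte; the real ones may differ.
CONTROL_MORE = 0 # assumed: another object follows
CONTROL_DONE = 1 # assumed: nothing more will be sent

# Read framed objects from `io` until the stop marker (or end of input)
# and return the raw serialized payloads in the order received.
def read_stream(io)
  payloads = []
  loop do
    byte = io.read(1)
    break if byte.nil? || byte.unpack1("C") == CONTROL_DONE
    size = io.read(8).unpack1("Q") # 64-bit unsigned, native endian
    payloads << io.read(size)
  end
  payloads
end

# Two streamed objects followed by the stop marker.
wire = [CONTROL_MORE, 3].pack("CQ") + "abc" +
       [CONTROL_MORE, 2].pack("CQ") + "de" +
       [CONTROL_DONE].pack("C")
objects = read_stream(StringIO.new(wire))
```

Under these assumptions a void response is simply a lone stop marker, so the loop returns an empty list.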

The full implementation of Protoplasm is available at https://github.com/bazaarlabs/protoplasm.

Bonus: Fun with Gemspecs!

Another common problem with this sort of RPC client/server arrangement in Ruby is where to put the type information. Though you could use a third gem to hold just the types, a simpler arrangement is to use multiple gemspecs within the same repo. This is the technique employed by protoplasm itself, so if you’re interested, take a look at the source. Each gemspec has the same library files in common, but each gemspec has different dependencies. We avoid loading all dependencies by using autoload, but the same thing could be achieved by requiring the individual server and client Ruby files.
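As an illustration of the multiple-gemspec idea, here is a sketch with two specifications that share one file list but declare different dependencies. The gem names, file paths and version are hypothetical, and the assumption that the client needs only Beefcake while the server also needs Eventmachine is mine, not taken from protoplasm’s own gemspecs.

```ruby
require 'rubygems'

# Hypothetical shared file list, for illustration only.
SHARED_FILES = ['lib/mathrpc.rb', 'lib/mathrpc/types.rb',
                'lib/mathrpc/server.rb', 'lib/mathrpc/client.rb']

# Would live in mathrpc-server.gemspec
def server_gemspec
  Gem::Specification.new do |s|
    s.name = 'mathrpc-server'
    s.version = '0.1.0'
    s.summary = 'Server half of a hypothetical RPC pair'
    s.authors = ['example']
    s.files = SHARED_FILES.dup
    s.add_dependency 'beefcake'
    s.add_dependency 'eventmachine'
  end
end

# Would live in mathrpc-client.gemspec
def client_gemspec
  Gem::Specification.new do |s|
    s.name = 'mathrpc-client'
    s.version = '0.1.0'
    s.summary = 'Client half of a hypothetical RPC pair'
    s.authors = ['example']
    s.files = SHARED_FILES.dup
    s.add_dependency 'beefcake' # assumed: no Eventmachine needed here
  end
end

# lib/mathrpc.rb could then defer loading until a constant is first used:
#   module MathRpc
#     autoload :Server, 'mathrpc/server'
#     autoload :Client, 'mathrpc/client'
#   end
```

With autoload, installing only the client gem never triggers a require of the server file, so the missing server dependency is never noticed.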