Monthly Archives: January 2011

Easy Monitoring of Varnish with Munin

If you’re looking for a reverse proxy server, Varnish is an excellent choice. It’s fast, and it’s used by Facebook and Twitter, as well as plenty of others. For most sites, it can be used effectively pretty much out of the box with minimal tuning.

Like many decently sized Rails apps, we leverage a lot of open source code: dozens of gems and plugins, a variety of cloud services, Varnish and Nginx for caching and load balancing, and various persistence solutions. As our app usage has grown over the last year, we’ve had our share of stressful, on-the-fly debugging while our app was down. That’s not the best time to learn about all the fun nuances and interactions of your technology stack.

It’s a good idea to know what your services are doing and the key metrics to watch, so you’re better prepared when you hit those inevitable scaling pain points. New Relic has been tremendously useful for monitoring and debugging our database and Rails app. The rest of this post goes over some key metrics for Varnish and setting up Munin to monitor them.

Optimizing and Inspecting Varnish

Unless your application has an extremely high volume of traffic, you likely won’t have to optimize Varnish itself (cache sizes, thread pool settings, etc.). Most of the work will be in verifying that your resources have appropriate HTTP caching headers (Expires or Cache-Control: max-age for freshness, and ETag or Last-Modified for validation). You’re most of the way there if you do the following:

  • Run Varnish on a 64-bit machine. It’ll run on a 32-bit machine, but it likes the virtual address space of a 64-bit machine. Also, Varnish’s test suites are only run on 64-bit distributions.
  • Normalize the hostname (e.g., www.website.com => website.com) to avoid caching the same resource multiple times. Details here.
  • Unset cookies for any resource that should be cacheable. Details here.

Varnish includes a variety of command line tools to inspect what Varnish is doing. SSH into the server running Varnish, and let’s take a look.

Inspecting an individual resource

First, let’s look at how Varnish handles an individual resource. On a client machine, point a web browser to a resource cached by Varnish. On the server, type:

$ varnishlog -c -o ReqStart <IP address of client machine>

The output of this command will be communication between the client machine and Varnish. In another SSH terminal, type:

$ varnishlog -b -o TxHeader <IP address of client machine>

The output of this command will be communication between Varnish and a backend server (i.e., an origin server, the actual application). Try reloading the resource in the browser. If it is cached correctly, you shouldn’t see any communication between Varnish and any backend servers. If you do see something printed there, inspect the HTTP caching headers and verify they are correct.
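One way to inspect the HTTP caching headers is to filter a response down to the headers that matter for caching. The sketch below is only an illustration: the function name cache_headers is made up for this example, and the list of headers it keeps is an assumption about what is usually worth checking. It reads raw response headers on standard input, so it can be fed from any HTTP client.

```shell
# Read HTTP response headers on stdin and print only the ones that
# affect caching. Header names are matched case-insensitively.
# (cache_headers is a hypothetical helper name, not a Varnish tool.)
cache_headers() {
  tr -d '\r' | awk -F': *' '
    { name = tolower($1) }
    name ~ /^(cache-control|expires|etag|last-modified|age|vary|set-cookie)$/ { print }
    name == "set-cookie" { cookie = 1 }
    END {
      # With its default configuration, Varnish does not cache a
      # response that sets a cookie.
      if (cookie) print "WARNING: Set-Cookie present; Varnish does not cache such responses by default"
    }'
}

# Example usage (the URL is a placeholder):
#   curl -sI http://website.com/some/resource | cache_headers
```

If the output shows no Expires or Cache-Control header, or shows the Set-Cookie warning, that is a likely reason the resource keeps reaching the backend.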

Varnish statistics

Now that we’ve seen that Varnish is working for an individual resource, let’s see how it’s doing overall. In your SSH session, type:

$ varnishstat

The most important metrics to note here are the hit rate and the uptime. Varnish has a parent process whose only function is to monitor and restart a child process, so an unexpectedly low uptime means the child has been restarted. If Varnish is restarting itself frequently, investigate by looking at its output in /var/log/syslog.

Other than that, check out Varnishstat for Dummies for a good overview.
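For scripting, the hit rate can also be computed from the raw counters that varnishstat -1 prints once before exiting. This is a minimal sketch: the function name varnish_hitrate is made up for this example, it assumes the counters are named cache_hit and cache_miss (optionally prefixed with MAIN., as in newer Varnish releases), and it ignores every other counter.

```shell
# Print the cache hit rate as a percentage, reading `varnishstat -1`
# style output (counter name, then value) on stdin.
# (varnish_hitrate is a hypothetical helper name.)
varnish_hitrate() {
  awk '
    { sub(/^MAIN\./, "", $1) }          # accept MAIN.cache_hit as well
    $1 == "cache_hit"  { hit  = $2 }
    $1 == "cache_miss" { miss = $2 }
    END {
      total = hit + miss
      if (total == 0) { print "n/a"; exit }
      printf "%.1f\n", 100 * hit / total
    }'
}

# Example usage on the Varnish server:
#   varnishstat -1 | varnish_hitrate
```

The counters are cumulative since the child process started, so the result is an average over the whole uptime rather than a recent window.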

It’s great that we can check on Varnish fairly easily, but the key is to automate this process; otherwise, it can be very difficult to detect warning patterns early. Also, it’s not realistic to have a huge, manual, pre-flight checklist to check on the health of all your services. Enter Munin…

Get Started with Munin in 15 minutes

Munin is a monitoring tool with a plug-in framework. Munin nodes periodically report back to a Munin server. The Munin server collects the data and generates an HTML page with graphs. The default install of Munin contains a plug-in for reporting Varnish statistics, which produces a variety of graphs.

Installing Munin

If you’re installing Munin on an Ubuntu machine (or any distribution that uses apt), use the commands below. For other platforms, see the installation instructions here.

For every server you want to monitor, type:

$ sudo apt-get install munin-node

Designate a server to collect the data. The server can also be a Munin node. On the server, type:

$ sudo apt-get install munin

Configuring Munin

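For each node, open the configuration file at /etc/munin/munin-node.conf. Add an allow line containing the IP address of the Munin server, so that the server is permitted to connect. The value is a regular expression, so escape the dots.

allow ^xxx\.xxx\.xxx\.xxx$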

After you modify the configuration file, restart the Munin node by typing:

$ sudo service munin-node restart
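Because the allow directive in munin-node.conf is a regular expression, an unescaped dot in the address matches any character. If you manage several nodes, a small helper can generate a correctly escaped line from a plain IP address. This is only a sketch: munin_allow_line is a made-up name, and 192.0.2.10 is a documentation address standing in for your Munin server.

```shell
# Print a munin-node.conf "allow" line for the given IPv4 address,
# escaping each dot so it matches a literal "." only.
# (munin_allow_line is a hypothetical helper name.)
munin_allow_line() {
  printf 'allow ^%s$\n' "$(printf '%s' "$1" | sed 's/\./\\./g')"
}

# Example usage (placeholder address):
#   munin_allow_line 192.0.2.10
```

The printed line can then be pasted into /etc/munin/munin-node.conf on each node before restarting it.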

For the server, open the configuration file at /etc/munin/munin.conf. Add each node that you want to monitor.

[Domain;serverA]
  address xxx.xxx.xxx.xxx
  use_node_name yes

Choose any value you like for Domain and serverA above; the names are purely for organization. When the Munin server was installed, it also installed a cron job that runs every 5 minutes and collects data from each node. After editing the configuration file, wait 5 minutes for the charts to be generated. If you’re impatient, type:

$ sudo -u munin /usr/bin/munin-cron

View Munin Graphs

If you have lighttpd or Apache installed, configure it to serve the directory /var/cache/munin/www. If the charts have been generated properly, there should be an index.html file in that directory.

Troubleshooting Munin

If the Munin charts aren’t being generated, make sure that the directories listed in /etc/munin/munin.conf exist and have appropriate permissions for the munin user.

Try manually executing munin-cron and see if there is any error output.

Look at /var/log/syslog for any Munin-related errors.
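The directory check can be partly automated. The sketch below makes several assumptions: the helper names are made up, only uncommented dbdir, htmldir, logdir, and rundir lines in the configuration file are considered, and ownership is used as a rough stand-in for "appropriate permissions".

```shell
# Print the directories named by uncommented dbdir/htmldir/logdir/rundir
# lines in the given Munin configuration file.
# (Both function names here are hypothetical helpers.)
munin_conf_dirs() {
  awk '$1 ~ /^(dbdir|htmldir|logdir|rundir)$/ { print $2 }' "$1"
}

# Read directory paths on stdin and report any that do not exist or
# are not owned by the user named in the first argument.
check_dirs() {
  owner=$1
  while IFS= read -r dir; do
    if [ ! -d "$dir" ]; then
      echo "missing: $dir"
    elif [ "$(ls -ld "$dir" | awk '{ print $3 }')" != "$owner" ]; then
      echo "not owned by $owner: $dir"
    fi
  done
}

# Example usage on the Munin server:
#   munin_conf_dirs /etc/munin/munin.conf | check_dirs munin
```

No output means every listed directory exists and belongs to the munin user. If the configuration file leaves these settings commented out, Munin is using its built-in defaults and the check prints nothing.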

Conclusion

That’s it! Varnish is optimized and working correctly, and Munin is reporting the important stats so you can sleep easy at night. Enjoy!

Additional Resources

Web caching references

Caching Tutorial – Excellent overview of web caching by Mark Nottingham.
Things Caches Do – Overview of reverse proxy caches like Varnish and Rack-Cache.
HTTP/1.1 Caching Specification – The official specification of caching in HTTP/1.1.

Varnish references

A Varnish Crash Course For Aspiring Sysadmins
Varnishstat for Dummies
Varnish Best Practices
Achieving a High Hitrate

Munin references

Munin Tutorial

Engineering at Miso

Recently, the Miso team came to the conclusion that alongside our existing Miso blog, there was an opportunity to discuss an entirely different aspect of what we do. While the official blog covers exciting new features and news, the posts here will have a much more technical and entrepreneurial focus, highlighting the engineering and creative efforts at Miso.

For my first post, I wanted to take a stab at a breadth-first overview of the state of things from an engineering perspective at Miso as we begin the new year.

Our Development Team

We have been very fortunate to put together a great team here, and introducing each of us is a good place to start.

Timothy Lee

Tim is the CTO & cofounder of Miso. Tim develops on the iOS platform and the Rails app. Prior to Miso, Tim worked in software development at Bay Area startups such as Real-Time Innovations and Lightspeed Genomics. Tim has a Master’s in Electrical Engineering from Stanford, specializing in robotics and computer vision, and a Bachelor’s in Computer Engineering from the University of Texas.

Henry Yao

Henry is the Lead Mobile Engineer at Miso. He develops on the iOS and Android platforms. Prior to Miso, Henry developed the iOS, Android, and BlackBerry apps at Howcast Media. Henry graduated from Columbia with a Bachelor’s in Computer Science and Math.

Nathan Esquenazi

Nathan is the Lead Platform Engineer at Miso. He focuses on the system architecture, database design, and web services that power the mobile clients and APIs. Prior to joining Miso in July ’10, Nathan worked at technology startups in the Los Angeles area as a software engineer, created the Padrino Ruby web framework, and graduated from the University of California, Irvine with an Informatics degree.

Stuart Norrie

Stuart is the UI Designer at Miso. He works on the user interface and visual design for Miso’s web and mobile products. Prior to Miso, Stuart was a graphic designer at Redfin. He has a background in print and digital design and is currently a graduate student at the Academy of Art in San Francisco; prior to this he received a bachelor’s degree in digital imaging from Brooks Institute of Photography.

Technologies

Miso, like many companies, uses a wide array of technologies and libraries to power our operations. These tools break down into a few major areas:

Workflow

Our development workflow is primarily centered around “topics” or issues. We use Pivotal Tracker to manage our various development efforts and priorities. Each story there typically maps to a topic branch in our Git repository hosted on GitHub. Topic branches are created, tested, reviewed and deployed into the mainline branches as needed. We work hard to keep the flow free from unnecessary or weighty processes but frequent pair programming and code reviews along with automated testing are all integral to our day-to-day development.

Server-side

Our system architecture will be the topic of subsequent posts, but we are currently running a number of Linux-based virtual private servers that power our application, database, and caching layers. Our applications are primarily developed using a fairly standard Ruby on Rails stack for the back-end and MySQL for most of our data persistence. In addition, Memcached and Redis are both used for fragment data and caching. We are all fans of using the right tool for the job, and polyglot persistence in particular seems like a promising trend. Deployment is set up with Capistrano and cap-recipes, allowing for quick, automated, iterative releases.

Mobile

Miso is available natively on Android and, as a Universal app, on the iPhone and iPad. We use a number of helpful libraries made by enormego, as well as authentication libraries such as OAuthConsumer and the Facebook iOS SDK. We implement a caching and persistence strategy that is based on NSUserDefaults and Core Data. In general, our philosophy with our mobile clients is that they should be about visualization, while business logic belongs on the server. Our mobile clients use a private version of our public API.

Wrapping Up

We hope this has been a helpful whirlwind overview of the state of engineering at Miso. In future posts, we will dive into many of the issues, libraries, and technologies that are important to our continued development. Hopefully much of it will be useful to others, and we look forward to any and all feedback.