AWS CloudWatch to Graphite – cw2graphite

I recently built a small utility to dump AWS CloudWatch metrics (and thus AWS statistics) into Graphite (directly, not through StatsD, though that could easily be added). I figured this could be useful to others, so I put it up on GitHub as Cloudwatch2Graphite. Enjoy.
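For those wondering why no StatsD hop is needed: Graphite's carbon daemon accepts a plaintext protocol, one `metric value unix_timestamp` line per datapoint, by default on port 2003. A minimal sketch in Python (the metric name and Graphite host below are hypothetical examples, not cw2graphite's actual naming scheme):

```python
import socket
import time

def format_metric(name, value, timestamp=None):
    """Render one datapoint in Graphite's plaintext protocol:
    'metric.path value unix_timestamp\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (name, value, timestamp)

def send_metrics(lines, host="graphite.example.com", port=2003):
    """Ship already-formatted lines straight to carbon's plaintext listener."""
    sock = socket.create_connection((host, port))
    try:
        sock.sendall("".join(lines).encode("ascii"))
    finally:
        sock.close()

# e.g. a CloudWatch CPUUtilization average, renamed for Graphite:
line = format_metric("aws.ec2.i-12345.cpuutilization.average", 42.5, 1300000000)
```

Carbon also speaks a pickle protocol (more efficient for large batches), but the plaintext one is hard to beat for simplicity.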


Based on code from Oren Solomianik’s ec2-delete-old-snapshots, ec2-manage-snapshots does things a little differently.

Oren’s script would delete any snapshots older than n days for the given volume(s). I improved the code a little so it handles a --region parameter (--region eu-west-1, for example) and a --noop flag that reports what the script would do without actually doing it. That’s the ec2-delete-old-snapshots you’ll find in the included archive.

Because I wanted to manage snapshots differently, I heavily modified the original script to create ec2-manage-snapshots so that, for the given volumes, it keeps snapshots made:

  • in the last 7 days
  • on the past 4 Sundays
  • on the first day of every month

and erase the rest.

Its behavior can easily be modified, but I wanted to make sure I kept the monthly snapshots, the last 4 weekly ones, and the last 7 days’ worth. The script assumes you’re taking one daily snapshot of each volume.
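The retention rule above is simple enough to express directly. Here is a sketch of the policy in Python (dates only; the real script of course works against snapshot metadata fetched from the EC2 API, and the function name is mine):

```python
from datetime import date

def keep_snapshot(snap_date, today):
    """Retention policy from the post, assuming one daily snapshot per
    volume: keep the last 7 days, the past 4 Sundays, and every
    first-of-the-month snapshot; everything else is fair game."""
    age = (today - snap_date).days
    if age <= 7:                                 # last week's dailies
        return True
    if snap_date.weekday() == 6 and age <= 28:   # the past 4 Sundays
        return True
    if snap_date.day == 1:                       # monthly snapshots, kept forever
        return True
    return False
```

Changing the policy (say, 8 weekly instead of 4) is a one-line tweak to the relevant condition.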

Download here: ec2-manage-snapshots

Like its ancestor, the program checks that each vol-id entered on the command line has at least one snapshot newer than the deletion date, to prevent deleting all snapshots of a volume. The deletion process will commence for a volume only if such a snapshot is found. More importantly, you can use the --noop command line option (or the NOOP constant in the code) and the script will tell you what it would have deleted without actually deleting anything.

Disclaimer: this program is in development. Although it has been tested and has worked in production environments, it can’t be guaranteed to perform without unexpected results. Use at your own risk.

New web based AWS admin console from Ylastic

Been trying out the AWS admin console from Ylastic (it also works with Eucalyptus and Sun Cloud Services) as well as their iPhone app. I have to say I am pretty impressed so far. It goes well beyond AWS’s own console in terms of features, with RDS, S3 and EBS support built in. It’s still lacking a few niceties, which I am talking to them about, but it’s really pretty strong.

At $25 a month it’s a little expensive for an individual but cheap for a company. They offer a one-week trial period.

Yeah but how much does it cost?

I have been quite busy setting up a cluster on AWS for a SEM business intelligence web application (a LAMP app), but I wanted to highlight something I discovered today.

I have been a user of the Unix ‘screen’ command since the Mono days and was showing it to developers here. As I ran it on our Ubuntu 9.04 based EC2 cluster (thank you Alestic for the Jaunty AMI), I discovered that interesting profiles had been added to the distribution (I had never seen them on my Debian-based Linode machine).

Long story short, screen-profiles are pretty great (as described here and here), but, as it relates to EC2, they can now display the EC2 cost of the particular instance they’re being run on, as explained and shown here.

I have always struggled to track the actual cost of EC2 (beyond running simulations on the calculator) for non-transient web applications, and this might help keep an eye on it.

Amazon Web Services (AWS) – management tools – Part 1

Amazon Web Services has an extremely rich offering, one they add to constantly. They have an elastic compute product (EC2: on demand, pay per hour virtual servers of all sizes), persistent block storage (EBS), distributed object storage (S3), load balancers, a relational database service (RDS), map reduce and many others. That’s a lot of concepts and products to deal with, and the barrier to entry feels a little steep at times.

So you’d think they’d offer a nice clean, simple, web based administrative console. Well, they do. Kinda. But not really. Huh.

Amazon started off by releasing extensive APIs (SOAP based) to expose the functionality of each of their products (EC2 has one, RDS has one, …). There are many of these APIs (at least one per product); they’re very well documented and let you do everything you need. But of course, you can’t use an API directly: you have to program against it and build your own tools.

With a strong API in place, Amazon built a set of command line tools that map to all of those APIs. EC2 has one (two, really, if you include the Amazon Machine Image or AMI tools), RDS has one, you get the idea… From what I can tell, they’re pretty much a one-to-one mapping from the API, since they’re built on top of it. Command line tools are great if you’re into a CLI (which I occasionally find myself to be); they allow fairly complex scripting to automate cloud creation, snapshot (backup) creation, and launching and terminating instances. It’s quite practical, but there are a lot of commands to know and their syntax is often a little complex.
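As a taste of that scripting: the API tools print tab-separated records, so they chain nicely into small parsers. A hedged sketch of pulling snapshot ids out of ec2-describe-snapshots output (the record layout and the sample line are assumptions based on the tools as I have used them, not a spec):

```python
def snapshot_ids(describe_output):
    """Pull snapshot ids out of ec2-describe-snapshots output,
    assuming tab-separated SNAPSHOT records with the id in field 2."""
    ids = []
    for record in describe_output.splitlines():
        fields = record.split("\t")
        if len(fields) > 1 and fields[0] == "SNAPSHOT":
            ids.append(fields[1])
    return ids

# Hypothetical sample record, mimicking the tools' tab-separated output:
sample = "SNAPSHOT\tsnap-12345678\tvol-87654321\tcompleted\t2010-03-15T04:00:00+0000\t100%"
```

A cron job can feed the output of the real command through something like this and hand the ids to the deletion logic.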

Setting up the environment for each product’s command line toolbox is fairly straightforward:

  1. Set up your JAVA_HOME (which by the way is export JAVA_HOME=$(/usr/libexec/java_home) on Snow Leopard)
  2. Set up the ENV variable that points to your command line toolbox directory (different for each toolbox/product)
  3. Set up ENV variables for private/public keys or Amazon ID/Secret depending on the toolbox
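Steps 2 and 3 are the ones people trip on, since every toolbox wants its own variables; the EC2 API tools, for instance, look for EC2_HOME, EC2_PRIVATE_KEY and EC2_CERT. A small sanity-check sketch (the variable list is the EC2 tools’ set; other toolboxes differ):

```python
import os

# Environment variables the EC2 API tools look for; other toolboxes
# (RDS, ELB, ...) define their own equivalents.
EC2_REQUIRED = ["JAVA_HOME", "EC2_HOME", "EC2_PRIVATE_KEY", "EC2_CERT"]

def missing_vars(required, env=None):
    """Return the subset of `required` that is absent from the environment."""
    if env is None:
        env = os.environ
    return [name for name in required if name not in env]
```

Running `missing_vars(EC2_REQUIRED)` before a scripted deploy gives a clearer failure than the tools’ own Java stack traces.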

While the command line tools give you complete control over your Amazon Web Services, they’re a little annoying to use: you have to set up each toolbox individually (RDS, EC2, EC2/AMI, …), they don’t auto-update (Amazon doesn’t really have a mechanism to let you know a new version of the tools has landed) and the documentation is not always up to date. Still, pretty powerful.

First there was an API, then there were tools. And so there are.

First off, the AWS Management Console. It’s fairly simple, well made and straightforward, but it lacks some functionality. It lets you manage most of Map Reduce, CloudFront and EC2 (though it lacks the tools to create an AMI image from a Linux EC2 server, upload it to S3 and register it as a custom AMI owned by you; something you still have to do with the command line tools), but it has no visibility into S3, RDS and a few other products. It’s not very powerful, but it will generally suffice if you’ve already set up all your AMIs (or are using ready-made ones exclusively) and have no use for the products it doesn’t cover. It’s not for the power user, and at this point, we’re all power users, aren’t we?

Next I’ll cover Elasticfox (and S3 Organizer) and RightScale. TTFN.

Amazon Cloud: so you don’t have to optimize your code

When I joined ‘V’, one of the products I am managing had huge performance issues. As the tool had become popular, the database-heavy code had caused the application and its servers to collapse under their own weight. Since the product is a classic LAMP application, a rushed decision had been made to move everything to AWS and use a huge RDS instance for the database.

Let me pause for a second. What’s RDS? Relational Database Service is a product from the good folks at Amazon: an implementation of MySQL as a service (as opposed to renting a virtual machine instance and installing MySQL 5.x on it yourself). I like that approach, since it frees us from having to

  • install OS security fixes,
  • apply database engine bug fixes,
  • back up the database (mostly),
  • and handle other administrative chores.

The downside is minimal:

  • Price: an RDS instance is slightly more expensive than an EC2 instance with MySQL on it
  • RDS does not _yet_ support some features that we might want to use down the line (replication, myisampack, …)

Now, of course, moving to an extra large RDS instance, the modern version of ‘throw hardware at the problem’, did not solve our performance issue. The dev team has been hard at work optimizing the application, getting pretty incredible performance improvements in the process. While they do that, I am working on putting the right AWS-based cluster in place to host the application, now and in the future.

More on that later.


In late 2009, I lost my job at ‘B’ and was interviewing for a position as an engineering manager at Ubuntu/Canonical for their cloud team.

I had been using virtual servers at Linode for 5 years or so, and what I call the Amazon discrete services (SimpleDB and S3, the Simple Storage Service) for a couple of years; I had experience in open source software and had managed teams of 20+ engineers.
But Mark Shuttleworth, Canonical’s CEO at the time, judged that I didn’t have enough cloud experience, so it didn’t happen.

Regardless of whether he was right or not, I was looking forward to growing my experience in cloud computing. At ‘V’, my current position, I was presented with this opportunity.

This is that story.