Perspective at //build

Recently I had the opportunity to share the stage with some brilliant internal and external colleagues advancing open source in the cloud at //build, Microsoft’s developer conference in San Francisco. Beyond the chance to talk to about 400 attendees about how we’re approaching open source in the cloud, how customers are building open source applications in Azure and much more, speaking at //build had a very special meaning for me.

Before joining Microsoft, I didn’t have a lot of exposure to the Microsoft developer ecosystem, the Microsoft subsidiaries themselves or the employees working there. I was focused on a number of open source projects such as Canaima (Venezuela’s national distro), on a number of communities, and on expanding a small open source system integrator in the region.

Most of my interaction with Microsoft was limited to public debates, at industry events or in Congress, or to the ISO/IEC 29500 discussion back in the day (both of which I’ve covered in this blog, in Spanish). However, around 2009 or so, the company I was CTO for and Microsoft decided to create an Open Source Interoperability Lab in Venezuela. The idea was to document common hybrid technology use cases (such as Samba-based DCs in Windows environments, or PHP and ASP.NET communicating via ESB) and transfer that knowledge to customers.

As a result of that effort, I ended up being invited to and participating in PDC09 in Los Angeles. PDC was the precursor of //build, a yearly conference aimed at Microsoft-centric developers. There are three things I remember clearly from PDC09: the first was the “convertible tablet PC” they offered attendees (running Windows 7 bits that rapidly became Debian bits); the second was the PHP SDK for Azure and the preview access to that new “cloud” thingy; and the third was an open source roundtable led by Miguel de Icaza that mainly talked about governance and CodePlex.

While I didn’t know it back then, a lot of the things discussed in that roundtable influenced my decision, about a year later, to join Microsoft and work in open source strategy; a journey that brought me to Azure in less than 5 years. But I digress, and that whole story deserves another post.

Maybe some of the attendees then foresaw that Microsoft would end up acquiring Xamarin, or that attention would shift to non-CodePlex initiatives, like GitHub. What I really didn’t expect was that all of that new reality would converge into a PDC-like event less than 10 years later. This year at //build it did, and then some.

For me, speaking at //build was a humbling opportunity to reconcile the many worlds increasingly pulled together by the force of open source. From the announcements to the content and all other metasignals at the conference, it was incredibly exciting to see this transformation manifesting itself within Microsoft’s developer community.

It highlights the importance of leaving no one behind when we explore new paradigms and technologies in the cloud, and how every individual in the open source community can effect change in this industry.


Rebasing CoreOS for ephemeral cloud storage

The convenience and economy of cloud storage are indisputable, but cloud storage also presents an I/O performance challenge. For example, applications that rely too heavily on filesystem semantics and/or shared storage generally need to be rearchitected, or at least have their performance reassessed, when deployed on public cloud platforms.

Some of the most resilient cloud-based architectures out there minimize disk persistence across most of the solution components and try to either consume tightly engineered managed services (for databases, for example) or persist in a very specific part of the application. This reality is more evident in container-based architectures, despite the many methods available to cooperate with the host operating system to provide cross-host volume functionality (i.e., volumes).

Like other public cloud vendors, Azure presents an ephemeral disk to all virtual machines. This device is generally /dev/sdb1 in Linux systems, and is mounted by either the Azure Linux agent or cloud-init at /mnt or /mnt/resource. This is an SSD device local to the rack where the VM is running, so it is very convenient for any application that needs temporary storage with higher IOPS. Users of MySQL, PostgreSQL and other servers regularly use this method for, say, batch jobs.

Today, you can roll out Docker containers in Azure via Ubuntu VMs (the azure-cli and walinuxagent components will set it up for you) or via CoreOS. But a seasoned Ubuntu sysadmin will find that simply moving or symlinking /var/lib/docker to /mnt/resource in a CoreOS instance and restarting Docker isn’t enough to run the containers on the higher-IOPS disk. This article is designed to help you do that by explaining a few key concepts that are different in CoreOS.

First of all, in CoreOS stable Docker runs containers on btrfs. /dev/sdb1 is normally formatted with ext4, so you’ll need to unmount it (sudo umount /mnt/resource) and reformat it with btrfs (sudo mkfs.btrfs /dev/sdb1). You could also change Docker’s behaviour so it uses ext4, but it requires more systemd intervention.
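
The two commands above can be wrapped in a short shell sketch. It assumes, as in the text, that the ephemeral disk is /dev/sdb1 and is mounted at /mnt/resource. The DEVICE, MOUNTPOINT and DRY_RUN variables and the run helper are names made up for this example. By default it only prints the commands, because reformatting destroys whatever is on the disk.

```shell
#!/bin/sh
# Sketch: reformat the Azure ephemeral disk with btrfs.
# Assumed defaults (from the text): /dev/sdb1 mounted at /mnt/resource.
DEVICE="${DEVICE:-/dev/sdb1}"
MOUNTPOINT="${MOUNTPOINT:-/mnt/resource}"
# Hypothetical safety switch: 1 prints the commands, 0 actually runs them.
DRY_RUN="${DRY_RUN:-1}"

# Either echo a command or execute it, depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

reformat_ephemeral() {
    # Unmount the ext4 filesystem first, then create btrfs on the same device.
    # Some mkfs.btrfs versions may refuse to overwrite an existing filesystem
    # unless told to force it; check mkfs.btrfs --help on your image.
    run sudo umount "$MOUNTPOINT" &&
    run sudo mkfs.btrfs "$DEVICE"
}

reformat_ephemeral
```

Running it with DRY_RUN=0 performs the actual unmount and reformat, so double-check the device name first.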

Once this disk is formatted with btrfs, you need to tell CoreOS it should use it as /var/lib/docker. You accomplish this by creating a mount unit that runs before docker.service. This unit can be passed as custom data to the azure-cli agent or, if you have SSH access to your CoreOS instance, created as /etc/systemd/system/var-lib-docker.mount (the file name needs to match the mountpoint) with the following contents:

[Unit]
Description=Mount ephemeral to /var/lib/docker
Before=docker.service
[Mount]
What=/dev/sdb1
Where=/var/lib/docker
Type=btrfs
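
If you prefer to generate that file from the shell, a minimal sketch could look like the one below. The helper names mount_unit_name and write_mount_unit are made up for this example, and the name derivation only covers plain paths like /var/lib/docker (systemd escapes dashes and other special characters differently, which this sketch does not handle).

```shell
#!/bin/sh
# Sketch: generate the mount unit from the shell.
# mount_unit_name and write_mount_unit are hypothetical helper names.

# Derive the unit file name from a mount point: drop the leading slash and
# turn the remaining slashes into dashes. Only valid for plain paths.
mount_unit_name() {
    printf '%s.mount\n' "$(printf '%s' "${1#/}" | tr '/' '-')"
}

# Write the unit into the directory given as $1 and print the resulting path.
# On a real host that directory would be /etc/systemd/system (needs root).
write_mount_unit() {
    unit_file="$1/$(mount_unit_name /var/lib/docker)"
    cat > "$unit_file" <<'EOF'
[Unit]
Description=Mount ephemeral to /var/lib/docker
Before=docker.service
[Mount]
What=/dev/sdb1
Where=/var/lib/docker
Type=btrfs
EOF
    printf '%s\n' "$unit_file"
}
```

For example, calling write_mount_unit /etc/systemd/system as root should leave the unit at /etc/systemd/system/var-lib-docker.mount.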

After systemd reloads the unit (for example, after you run sudo systemctl daemon-reload), the next time you start Docker this unit should be activated and /dev/sdb1 should be mounted on /var/lib/docker. Try it with sudo systemctl start docker. You can also start var-lib-docker.mount independently. Remember, there’s no service command in CoreOS and /etc is largely irrelevant thanks to systemd. If you wanted to use ext4, you’d also have to replace the Docker service unit with your own.
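
The reload-and-start sequence, plus a quick check that the mount landed where expected, can be sketched as follows. The helper names activate_docker_mount and mounted_at are made up for this example; the check simply reads /proc/mounts and is not something CoreOS provides.

```shell
#!/bin/sh
# Sketch: activate the mount unit and check the result.
# activate_docker_mount and mounted_at are hypothetical helper names.

# Reload unit files, mount the ephemeral disk, then start Docker on top of it.
activate_docker_mount() {
    sudo systemctl daemon-reload &&
    sudo systemctl start var-lib-docker.mount &&
    sudo systemctl start docker
}

# Print "device fstype" for whatever is mounted at the mount point in $1.
# Reads a /proc/mounts-style file; $2 optionally points at a different file.
mounted_at() {
    awk -v mp="$1" '$2 == mp { print $1, $3 }' "${2:-/proc/mounts}"
}
```

After activate_docker_mount succeeds, mounted_at /var/lib/docker should report /dev/sdb1 with btrfs if the unit did its job. An empty result means nothing is mounted there.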

This is a simple way to rebase your entire CoreOS Docker service to an ephemeral mount without using volumes or changing how prebaked containers write to disk (CoreOS describes something similar for EBS). You can extrapolate this to, say, a striped LVM, RAID 0 or RAID 10 setup for higher IOPS and persistence across reboots. And, while not meant as a benchmark, here’s the difference between the out-of-the-box /var/lib/docker and the ephemeral-based one:

# In OS disk

--- . ( ) ioping statistics ---
20 requests completed in 19.4 s, 88 iops, 353.0 KiB/s
min/avg/max/mdev = 550 us / 11.3 ms / 36.4 ms / 8.8 ms

# In ephemeral disk

--- . ( ) ioping statistics ---
15 requests completed in 14.5 s, 1.6 k iops, 6.4 MiB/s
min/avg/max/mdev = 532 us / 614 us / 682 us / 38 us
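
The exact ioping invocation isn’t shown above, so the following is only a sketch of how a comparable sample could be taken. It assumes ioping is available on the host (or in a container with the directory visible) and that you have write access to the target directory. The helper name measure_iops and its defaults are made up for this example.

```shell
#!/bin/sh
# Sketch: take a quick latency/IOPS sample of a directory with ioping.
# measure_iops is a hypothetical helper name; ioping must be installed.
measure_iops() {
    target="${1:-/var/lib/docker}"   # directory to probe
    count="${2:-20}"                 # number of requests before stopping
    ioping -c "$count" "$target"
}
```

Running measure_iops once before and once after the remount gives two summaries to compare, like the ones above.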