Platform Product Management in 2021 (Product School Talk)

On June 26 I gave a webinar for Product School about Platform Product Management. Definitely check out the replay, which includes the live Q&A and commentary. I’ve turned the talking points into the post below, and the slides are posted as well.

Watch the replay of the live event

Overview

I’ll give you a brief background about myself and my work on platforms. 

Then we’ll ground ourselves in what a platform is, as it can mean a lot of different things to a lot of different people.

From there we’ll tackle two of the many things that make platform product management different from other products: the customers and the interfaces.

Who am I?

Let me give you a bit of background about myself. I’m originally from St. Louis. Like many product managers, I’ve got a twisty/windy career path. It started with degrees in journalism and media management from the University of Missouri.

From there I moved to Atlanta and worked at Cox Media Group in software engineering on content management platforms. As a senior engineering leader I got exposed to a lot of our internal customer struggles – be it journalists, sales people, or developers at the local properties trying to build their own applications atop the CMS.

After that I moved to Kabbage, a fintech business lending product. I led an engineering team building an internal customer service tool that served multiple teams, as well as several international banking partners using Kabbage’s lending platform. They wanted to integrate the tool’s loan data and operations into their own local customer service and collections tools so their agents could serve customers without having to bounce back and forth between multiple systems.

From there I worked at Pindrop on a cloud-based voice security product aimed at financial and insurance call centers. The product took in voice and touch-tone data, performed real-time fraud analysis, and provided programmatic updates to fraud scores. All of this had to be deployable, configurable, and manageable via APIs.


This history of engineering leadership, close interaction with customers, and a lot of platform and API work put me on a path to my current role at Square, where I work with several product teams across our Payment Platform focused on payment acceptance.

The platform is used by developers inside and outside of Square to build applications for sellers. Last year, we helped sellers move more than 100 billion dollars in payments volume: card-not-present payments, ACH payments, and card-present payments on our magstripe, EMV, or beautiful all-in-one terminal hardware, all done via REST APIs and browser and mobile SDKs.

In the nearly four years I’ve been at Square we’ve grown rapidly, and continue to grow. We’re looking for more product managers to help us make our platforms and products more remarkable – head over to careers.squareup.com if you’d like to learn more.

What is a platform?

The fundamental concept of a platform is sharing.

Shared data, shared experiences, shared operations that in turn create shared value for the platform and the applications running on top of it.

For example: At Square you can take a payment on one device, in one application, then see it on another device, in yet another application. Then months later, refund it on yet another device, in yet another… and so on. All the while, sharing logic, security, compliance, and cost management that’s seamlessly kept up to date and managed on behalf of the seller and application developer.

Platforms can come in a variety of flavors.

  • External: It could be an external developer platform like the one I work on at Square, which is also used by our internal developers.
  • Internal: It could be a purely internal platform like the content management system I worked on at Cox.
  • Hybrid: It could be a hybrid platform with a mix of purely internal product surfaces and some external product surfaces. The tool I worked on at Kabbage was a good example of this: it had an underlying set of APIs and SDKs which our external platform customers consumed to integrate into their CRM systems, but we also used it to build our own internal customer service tool.

Sharing in practice

To grasp the power of the shared data, operations, and experiences that a platform enables, let’s look at a few examples running on the Square developer platform:

  • Square Virtual Terminal is a payment terminal that runs in your browser and lets sellers take over-the-phone and in-person credit card payments, either quick one-off payments or itemized ones.
  • Bentobox helps quick- and full-service restaurants get beautiful web sites up and running quickly. From there they’ve got menus and online ordering for pickup, dine-in, and even delivery.
  • Simpletix helps event organizers promote and sell tickets to in-person and online events, and manage ticket sales and will-call pickups at the door.

That’s three different products, all built on a common platform. Let’s dig in and see how a platform could help them and their sellers:

  • Shared data. We focus on small, atomic units of data at Square, trying to break down objects into their smallest useful bits. Questions like “Should a Fee be its own API object, or an attribute of the Payment and Refund objects?” are decidedly non-trivial for a platform product manager at Square.

These three applications could re-use building blocks like Payments, Refunds, Orders, Items, and Locations to build their own unique experiences tailored to the business problems they’re solving, while gaining interoperability with any other application or experience running on the Square platform.

  • Experiences. That interoperability means that when you create a payment or a catalog item, it shows up in a host of seller experiences throughout Square that you don’t have to build. Whether you’re a developer inside or outside of Square, the less you have to build to bring your product to market, the higher your team velocity and the faster your time to market.
  • Value. A platform has to offer a value proposition to the developers that build on it. In addition to the time-to-market value prop, I spend my day focused on the managed payments value proposition: making everyone a payments expert by embedding Square’s expertise into the platform itself. Improving authorization rates, improving conversion rates, reducing fraud. Giving you a payments-team-as-a-service-in-an-API, so to speak.

Platform Customers

Let’s talk about why you and your team are valuable. Your customers and the problems you solve for them.

You are not a service team! You don’t take tickets in, and spit code out. You are a platform product manager, you fight for your customer. But wait, who is your customer?

At first blush this should be easy, right? Ask your analyst or engineering team to tell you who’s using your platform, and voila you have found your customer? Right? Right?!

It’s worth looking at who the economic buyer is. “Follow the money,” so to speak. You and the engineers working on your product get paid, I hope. Your business generates revenue somehow.

Does your product get direct revenue from the applications that use it, say from usage fees or subscription revenue? Or are the customers of the applications built atop the platform the ones who are actually paying for your platform?

Let’s look at two examples…

First let’s look at Amazon Web Services. In their case, application developers are the economic customer of the platform. They pay for their usage of the platform – for all their EC2 instances, S3 buckets, containers, etc.

Contrast that with the Square Payments Platform.

In our case, the seller is our customer. Sellers pay for Square through payment processing fees, and they have to be happy enough with the value we provide to pick Square: to pick apps that chose Square, or to pick Square when apps offer a choice of payment processing platforms.

In that light, platform PMs must view all the application developers out there as potential distribution partners who can help us get our platform product into the hands of sellers. We have to provide value to the developer, we have to market to them, all so they’ll consider choosing us to include in their application.

Double the research, you say?!

This is one of the hardest parts of being a platform PM. To be successful, you need to understand the struggles of your application developers, and their customers to ensure you’re prioritizing and building out what both groups need.

But you are not alone trying to do double the research. Chances are those application development teams have a PM or eng lead with customer stories and quotes, or anonymized usage metrics that they’d be happy to share if it helps them help you make the case for that feature they want prioritized this quarter. Enlist them as your allies, and your front line eyes and ears.

Avoid the trap that a lot of platform teams fall into of becoming obsessed only with architecture, API designs, scaling, and performance. As a platform PM, bring the voice of the developer and the customer into the conversation. Be the team that’s obsessed with the customer struggle… and architecture, API designs, scaling, and performance.

Platform Interfaces

APIs and SDKs

Most developer platforms offer two interfaces to their product:

Application Programming Interfaces, or APIs, are the core way of expressing those shared data objects and shared operations that we talked about earlier.

Software Development Kits, or SDKs, can come in two flavors. The first is a native “wrapper” that provides a developer access to your API in their language of choice: Python, PHP, Java, etc. More interesting to me are SDKs that provide those shared experiences across applications, likely with some customization. For example, Apple has SDKs that let developers summon a common “Share this” experience so they don’t have to reinvent that particular wheel.

Side note: throughout this post, unless explicitly stated, I’ve been using APIs as a catchall term for APIs and SDKs.

Applications use APIs to interact with each other the way humans use GUIs, or voice interfaces (“press or say 1”), or text interfaces (ever use a command line?).

The bad news is that applications consuming an API can’t adapt to some changes in the API the way humans can adapt to changes in a GUI. 

An A/B test of the fields or endpoints of an API running in production would be a disaster, for example. We’ll talk about what this means for how to validate your design in a moment.

Nouns and Verbs

The Apollo Guidance computer also ran on nouns and verbs.

I could do a whole course on API design. But at a super high level, imagine APIs allowing software-to-software interactions defined as a Verb acting on a Noun.

The verbs are Create, Read, Update, Delete. Those are standard across the Internet. 

The nouns are what’s unique to your platform: Payment, Database, Contact, etc. For example, you don’t call a ProcessCreditCard function; you Create (verb) a Payment (noun). You don’t call a SetItemColor method; you Update (verb) an Item (noun) and include the color blue in the payload you send.
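To make that concrete, here’s a hypothetical sketch. The host, endpoints, and fields below are made up for illustration and aren’t any particular product’s API:

# Create (verb) a Payment (noun) instead of calling a ProcessCreditCard function
curl -X POST https://api.example.com/v1/payments \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"amount": 1000, "currency": "USD", "source_id": "card_123"}'

# Update (verb) an Item (noun) instead of calling a SetItemColor method
curl -X PUT https://api.example.com/v1/items/item_456 \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"color": "blue"}'

The verbs never change; the design work is all in picking the right nouns.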

API Design and Validation

When it comes time to design an API, I highly recommend reading Marty Cagan’s The Four Big Risks. I want to focus on two in particular:

  • Value risk – whether customers will buy it or users will choose to use it
  • Usability risk – whether users can figure out how to use it

These are both areas where, in traditional products, visual designers add a ton of value. But the role of API designer tends to be an ad hoc one played by someone different on each team: maybe the product manager, maybe the visual designer, maybe an engineer, tech lead, or architect.

Whoever it is, I encourage teams undertaking API design to take conscious steps during their design work to pay down these two risks.

The most powerful and cheapest tool in your arsenal is documentation-driven development.

Once you have a sense of the customer use cases, sit down and write out the long-form documentation: the “how to” guide that mixes narrative and API samples, showing and telling how to use the API to complete the tasks your potential customers value most, which you learned from your customer and developer research.
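As a sketch of what I mean (the product, endpoint, and copy here are invented for illustration), an excerpt of that kind of guide might look like:

Taking your first payment

  Once your customer has picked their items, create a Payment for the order
  total. Pass the source_id you collected from the card form and an
  idempotency key so retries are safe:

    curl -X POST https://api.example.com/v1/payments \
      -H "Authorization: Bearer $TOKEN" \
      -d '{"amount": 2500, "currency": "USD", "source_id": "card_123",
           "idempotency_key": "order-789-attempt-1"}'

  The response includes a payment id you can store with the order and use
  later to issue a Refund.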

Share that with some prospective application developers: in an interview you can validate how easy it is to use and whether they’d choose your API over the competition’s. Bonus points: you’ve validated that your API design meets your intended use cases from day 1, and the engineering team has a nice narrative example of what you’re all trying to accomplish that they can rally around.

Dennis, my general manager, uses Legos as a great analogy for good APIs and endpoints. They’re small and easy to understand. Your application can combine them into interesting and useful shapes, while another application can combine them into entirely different interesting and useful shapes.

Even if you don’t have a public API, chances are you have an internal one. Treat your internal APIs like your external ones. They deserve as much care and thought if you expect to get good return on investment out of your internal platform.

Put time and care into your API design. Take the next 5 or 10 things you intend to add to your API after your first release and prove to yourself you can do so in a way that is backwards compatible. That is, you can add the feature to version 3 while developers on versions 1 and 2 keep operating without any issue. They might not be able to take advantage of the new feature, but they won’t break.
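As a made-up illustration of what “backwards compatible” means in practice, imagine a hypothetical Payments API where version 3 adds a fee field to the payment object. The field is additive and optional, so applications written against versions 1 and 2 simply ignore it and keep working:

Version 1 response:

  {"id": "pmt_123", "amount": 1000, "currency": "USD", "status": "COMPLETED"}

Version 3 response (adds an optional field; nothing existing changes shape):

  {"id": "pmt_123", "amount": 1000, "currency": "USD", "status": "COMPLETED",
   "fee": {"amount": 29, "currency": "USD"}}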

Don’t be afraid of versioning – it’s OK to ask developers to upgrade to take advantage of new features. That said, know that a version upgrade can add to the already long timeline of application developers adopting your feature so consider those tradeoffs.

Be afraid of building the YAGNI API, endpoint, or field. You can’t walk it back very easily. (Image credit: Martin Fowler)

You should be afraid, well, let’s say thoughtful, about releasing new APIs and new endpoints. If you haven’t proven to yourself that the case is there to invest not just in building it today, but in operating it long term, don’t build it.

If the team hasn’t satisfied itself that the interface won’t be obsolete in a few months and require breaking changes, keep working the design. 

Once an API is in the wild, even and especially internal APIs, applications start using and depending on it and you’re left running it for a long time. The wrong design, the wrong abstraction, can and will absolutely hinder your ability to innovate and bring new platform features to market. That in turn makes it difficult for products to innovate on top of your platform. Worse, you’ll find yourself having to support the old poorly thought out solution, while also doing the big rewrite in parallel over a far longer period of time than if you had done a bit more design, research, and validation up front. Measure twice, cut once, because you will be operating it for what feels like a lifetime.

TLDR; Conclusion

Platforms build value through shared data and experiences; that’s also what makes them tough to build well.

Two of the many things that set platform product management apart are how you approach Customers and Interfaces

If you don’t want to be a service team, identify your customers – even if they’re internal. Figure out who’s paying for your product and who you depend on to distribute your product.

Then research their struggles, and the struggles of the people using applications built atop your platform

Spend time on designing your APIs – they have long shelf lives, and you can’t walk them back like you can a visual interface.

Write the docs for your APIs first, against the use cases you’ve identified from your research, as a way to validate the design; run them past potential customers to validate the value.


Trader Chris Mai Tai

Mai Tais are better with The Good Ice™

The latest excitement in our house has been the arrival of our countertop ice maker. Why are we excited about an ice maker? 🧊  Because it makes The Good Ice™ – those beautiful nuggets of ice that keep your drink cold, are great for chewing on, and make you feel like you’re out at a restaurant or tiki bar. I make a pretty decent mai tai, but this ice has taken it from good to great!

Saving my Mai Tai recipe here for posterity and sharing. If you’re not already using the Highball app, check it out; it’s a great way to store and share drink recipes.


Things.app Eisenhower-matrix made easy

As you may know, I love Things.app, the Eisenhower matrix, and booking time on my calendar for key tasks.

With the 3.4 release, Things.app gained the ability to craft links directly to tags and items

This made my daily and weekly planning process so much faster. I have a note in Bear that I call up each day and run through the links step-by-step (more on how those links are built below).

My daily planning routine

  1. Review work in progress (things:///search?query=%E2%8F%B3)
  2. Review fires (urgent + important) (things:///search?query=%F0%9F%94%A5)
  3. Review goal items for the week (things:///search?query=%F0%9F%A5%85)
  4. Look at what you’ve selected for tomorrow (things:///show?id=upcoming&filter=tomorrow)
  5. Put time for them on your calendar

My weekly planning routine

  1. Clear all goal items for the week (things:///search?query=%F0%9F%A5%85)
  2. Review work in progress (things:///search?query=%E2%8F%B3)
  3. Review each project, tag items you want to consider for the week with 🥅 (Ctrl-G)
  4. Review goal items for the next week (things:///search?query=%F0%9F%A5%85)
  5. Tag next week’s items:
    1. Important/Urgent: 🔥 (Ctrl-U)
    2. Important: ⚡️ (Ctrl-I)
    3. Not Important / Urgent: ⚾️ (Ctrl-R)
  6. Schedule time on calendar for the big ones
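A side note on those links: the gibberish in the query strings is just each tag’s emoji percent-encoded as UTF-8 (%F0%9F%94%A5 is 🔥, %F0%9F%A5%85 is 🥅). If you want to craft links for your own tags and you have Python handy, you can generate the encoding like this:

python3 -c "import urllib.parse; print('things:///search?query=' + urllib.parse.quote('🥅'))"
# prints: things:///search?query=%F0%9F%A5%85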



Avoid productivity cookies, schedule time for todos

For me, a successful day doesn’t end once I’ve gotten my priorities straightened out and emoji-fied in Things.app. The next step is to schedule time on my calendar for any todo that needs more than 5 minutes.

I make an appointment with myself to do the work. Otherwise those rare times without meetings will quickly get eaten up by the not-important-yet-quick-things to do.

I’m talking about the productivity cookies – they’re sweet, tasty, and so easy to eat a lot of. But most of them are empty productivity calories. Instead of filing away e-mail or editing some writing of mine, I should be doing deeper work.

My daily/weekly ritual

At the end of each day, or first thing in the morning, I go through the list of things I want to make progress on and put an item on my calendar for each one.

Each Friday, I go through the items on my todo list looking at their Eisenhower-matrix tags, and add a 🥅 tag to the ones I want to tackle next week. For some key next-week goals I’ll schedule time throughout the week.

When I’m forced to allocate time for these larger todos it achieves a couple of things:

It makes it easier to set realistic expectations

It forces me, ironically, to leave space in the day.

If every 30 minute block on my calendar is filled up, I know I won’t have time for e-mail, catching up with a teammate in the hallway, or one of the many other things that are important but not deep work I need to do on a daily basis.

It makes the trade-offs clearer

If I want an hour to focus on this requirements document tomorrow, I need to skip or delegate this other meeting. Or I need to push the PRD out a day.

When I get new meeting requests, I’ve got to decide if the meeting is more or less important than the task I already scheduled myself for at that time.

Keep it visual and automated

(Screenshot of my calendar: this day is not great for deep work, ¯\_(ツ)_/¯)

Because I’m a visual person, I want those items to stand out on my schedule with a different color.

For a while I was manually re-coloring the item to green (blue is my work schedule, purple is home).

That is until I whipped up a Zapier integration! It’ll take any new calendar item that starts with the word “Task:” and re-color it green on my work calendar.

Here’s how to set it up:

  1. Set your trigger to be “New Event Matching Search” with the search term “Task:”
  2. Set your action to be Update Event
    1. For the Event field, select Use Custom Value for Event ID and insert the ID parameter.
    2. That’ll mean you’re editing the event that was found in step 1
  3. Set your color to #51b749 or any color you like, it’s your calendar world!

 


Time is like a predator: finite. Manage it well

(Image: Soran from Star Trek Generations)

I blocked off 2 hours for this movie and I want them back!

Your day only has so many hours in it, and a calendar is a fantastic way to visualize it.

All but the most trivial of items on our todo lists take non-trivial amounts of time to complete.

By putting time in my calendar for my todos I’ve found my days to have more focus, and to feel more sustainable and enjoyable than they did previously.


Eisenhower and emoji: How I use Things.app to get focus


TFW there’s too much on my todo list

You know that anxiety you get when there’s something you need to do and you’re trying to keep it in your head? You write it down on your todo list and problem solved! Lather, rinse, repeat over the course of a day and now you’ve got anxiety about all the things you’ve got to do on your sprawling todo list!

Let me tell you about how I use a combination of Things.app, the Eisenhower matrix and emoji to capture and prioritize my tasks, and to stay focused.

For many years now I’ve been using Things.app. I’ve had dalliances with Trello, Asana, Omnifocus, Remember The Milk and Wunderlist, but I keep coming back to Things.app. I’ve found that for me, it’s the most frictionless experience to capture and organize my tasks. Especially with the release of version 3 adding features like task checklists, recurring todos within a project, and one of my favorites for balancing work and home: This Evening. Check out all the new features in Things 3.

That said, all the concepts I talk about can be done in just about any tool, including paper!

Capture everything to the Inbox

Almost every task starts its life in the Things Inbox. It’s a great place to jot down a task – whether that’s right after a meeting, during a hallway conversation, or while waiting in line at the coffee shop.


Hey, did ya get that thing I sent ya?

Items in the Inbox shouldn’t be there more than a few hours. If I find an item lingering there for days or more, it’s usually a sign I should delete it.

If an item hasn’t been prioritized, or if I think its priority has changed, then it’s time to update it using Eisenhower and emoji.

Eisenhower and emoji

Every item gets a Things tag to indicate where it falls in the Eisenhower matrix and a couple of other tags that I find useful. They started out life as words, but I switched to using emoji for each tag to make the UI easier and faster to scan.




The Heisel Test: Five questions for professional happiness

I was recently asked to list out the values I look for in a person (when hiring), or a team or company (when looking for a job). Since The Joel Test and its several updates are a thing, I am tongue-firmly-in-cheek calling this The Heisel Test.

The Heisel Test: Five questions for professional happiness

  1. Do you put customer value and experience first?
  2. Do you move responsibly fast?
  3. Are you genuinely curious and open to new information?
  4. Do you empower your teams and teammates?
  5. Do you respect and have empathy for people?

Do you put customer value and experience first?

Why is this important? Well no matter what you do, you have (or hope to soon have) customers. You won’t be in business long without them.

There’s a good chance you have a really interesting mix of external customers who pay you in dollars, and internal customers who pay you in good will and cooperation.

You’re relentlessly focused on solving a real customer problem to add customer value. But you don’t stop there! You want your customer’s experience using your product and engaging your services to be as positive, fast, and frictionless as possible.



Take notes in your 1:1 and share them

The most important meetings I have every week are my one-on-ones with my engineering managers and the engineers on their teams.

The agenda is the same every week – at least 15 minutes to talk about whatever they want to talk about, and up to 15 minutes for me to talk about whatever I want with them. The best ones are usually 20-30 minutes without me saying much at all.

They’re about relationship building, they’re about gemba, they’re about family, friends, beer, bands, pets.


Image credit: derya

I’ve almost always taken notes during them, almost always with pen and paper so I can keep my eyes focused on the other person.

I’ve been spotty about what I do with the notes. The todo items, if any, would always end up in Things. But the subjects we talked about, the feedback I got, and the feedback I gave would end up lost, either to my illegible handwriting or to a deep, dark Evernote archive of scans. Some would get typed up for posterity and review season, but a lot wouldn’t, because time and attention are finite resources.

Until recently, that is! A couple of folks in the Rands Leadership Slack mentioned that they type up their notes AND share them back with the other person.

Since then I’ve started making it a habit to always type up my notes into a shared Google Doc per person – direct report, skip-level, peer, even my 1:1s with my boss – with a heading for the date, followed by the subjects we talked about, any questions I asked and the answers I heard, and any feedback given or received.

It’s a beautiful thing because now I’ve got two great things I didn’t have before:

  • A feedback loop with the other person — they see exactly what I took away from our discussion and have a chance to correct anything I mistook
  • Instant accountability for myself — now the folks I’m meeting with know whether I actually typed up my notes, so they tend to get typed up same or next day.

So try this one weird trick after your next 1:1: type up the notes and share a link back to the other person. It’s easy with Google Docs or Evernote, but even something as universal as an e-mail would do the trick.

 


Docker standards at Kabbage

I also posted this over at our Kabbage Tech Blog

In the five months my team’s been using Docker, we’ve stolen, er, adopted some standards to make our lives easier.

1. Build your own critical base images

Our application images have their own FROM inheritance chain. The application image depends on a Python Web application base image.

That web app image depends on an official Python image, which in turn depends on a Debian official image.

Those images are subject to change at the whim of their GitHub committers. Having dependencies change versions on you without notice is not cool.

Not cool bro

So we cloned the official Dockerfiles into our own git repo. We build the images and store them in our own Docker registry.

Every time we build our base or application images we know that nothing critical has changed out from underneath us.
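Here’s a sketch of what that inheritance chain looks like. The registry host and image names are made up, but the idea is that every FROM points at an image we built and pushed ourselves:

# python-web/Dockerfile -- our Python Web application base image,
# built FROM our clone of the official Python image in our own registry
FROM registry.example.com/python:2.7
# ... install nginx, uwsgi, syslog, shared config ...

# app/Dockerfile -- an application image built FROM that base
FROM registry.example.com/python-web:latest
ADD . /app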

2. Container expectations

Stop. Go read Shopify’s post on their container standards. The next section will seem eerily similar because we stole, er, adopted a bunch of their recommendations.

container/files/

We copy everything in ./container/files over the root filesystem. This lets you add or override just about any system config file that your application needs.

container/test

We expect this script to test your application, duh. Ours are shell scripts that run the unit, integration or complexity tests based on arguments.

Testing your app becomes a simple command:

docker-compose run web container/test [unit|pep8|ui|complexity]
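For what it’s worth, a minimal container/test can be little more than a case statement. This sketch assumes a Django app living in /app with a virtualenv in /venv (like ours); pep8 and radon are stand-ins for whatever style and complexity checkers you actually use:

#!/bin/bash
# container/test -- run a test suite based on the first argument
set -e

case "$1" in
  unit)
    /venv/bin/python /app/manage.py test ;;
  pep8)
    /venv/bin/pep8 /app ;;
  ui)
    /venv/bin/python /app/manage.py test ui ;;   # assumed UI/Selenium suite label
  complexity)
    /venv/bin/radon cc /app ;;                   # assuming radon for complexity
  *)
    echo "usage: container/test [unit|pep8|ui|complexity]"
    exit 1 ;;
esac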

container/compile

We run this script as the last step before the CMD gets run.

This is what ours looks like:

echo "$(git rev-parse --abbrev-ref HEAD)--$(git rev-parse --short HEAD)" > /app/REVISION
echo "Bower install"
node node_modules/bower/bin/bower install

echo "Big Gulp build - minification"
node node_modules/gulp/bin/gulp.js build

/venv/bin/python /app/manage.py collectstatic --noinput

3. Docker optimization

ADD, install, ADD

We run docker build a lot. Every developer’s push to a branch kicks off a docker build / test cycle on our CI server. So making docker build as fast as possible is critical to a short feedback loop.

Pulling in libraries via pip and npm can be slow. So we use the ADD, install, ADD method:

# Add and install reqs
ADD ./requirements.txt /app/
RUN /venv/bin/pip install -r /app/requirements.txt
# Add ALL THE CODEZ
ADD . /app

By adding and then installing requirements.txt, Docker can cache that step. You’ll only have to endure a re-install when you change something in your requirements.txt.

If you go the simpler route like below, you’d suffer a pip install every time you change YOUR code:

# Don't do this
ADD . /app
RUN /venv/bin/pip install -r /app/requirements.txt

Install & cleanup in a layer

We also deploy a lot. After every merge to master, an image gets built and deployed to our staging environment. Then our UI tests run and yell at us if we broke something.

Sometimes you need to install packages to compile your application’s dependencies. The naive approach to this looks like this:

RUN apt-get update -y
RUN apt-get install libglib2.0-dev
RUN pip install -r requirements.txt # has something that depends on libglib
RUN apt-get remove libglib2.0-dev
RUN apt-get autoremove

The problem with that approach is that each command creates a new layer in your docker image. So the layer that adds libglib will always be a contributor to your image’s size, even when you remove the lib a few commands later.

Each instruction in your Dockerfile will only ever increase the size of your image.

Instead, move add-then-install-then-delete steps into a script you call from your Dockerfile. Ours looks something like this:

#Dockerfile
ADD ./container/files/usr/local/bin/install_and_cleanup.sh /usr/local/bin/
RUN /usr/local/bin/install_and_cleanup.sh

#install_and_cleanup.sh
set -e # fail if any of these steps fail
apt-get -y update
apt-get -y install build-essential ... ... ...
#... do some stuff ...
apt-get remove -y build-essential ...
apt-get autoremove -y
rm -rf /var/lib/apt/lists/*

For more Docker image optimization tips check out CenturyLink Labs’ great article.

4. Volumes locally, baked in for deployment

While working on our top-of-the-line laptops, we use docker-compose to mount our code into a running container.
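For example, a local-only docker-compose file along these lines (the image name and paths are illustrative) mounts your working copy over the code baked into the image, so edits show up without a rebuild:

# docker-compose.yml (local development only)
web:
  image: registry.example.com/our-app:latest
  volumes:
    - .:/app          # mount local source over the baked-in /app
  ports:
    - "8000:8000"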

But deployment is a different story.

Our CI server bundles our source code, system dependencies, libraries and config files into one authoritative image.

(Image: boxed, packaged software. It’s like this, except not.)

That image is what’s running on our QA, staging and production servers. If we have an issue, we can pull an exact copy of what’s live from the registry to diagnose on our laptops.

5. One purpose (not process) per container

Some folks are strict, die-hard purists who insist you only run one process in a container: one container for nginx, one container for uwsgi, one container for syslog, etc.

We take a more pragmatic approach of one purpose per container. Our web application containers run nginx and uwsgi and syslog. Their purpose is to serve our Web application.

One container runs our Redis cache; its purpose is to serve our Redis cache. Another container serves our Redis sentinel instance. Another serves our OpenLDAP instances. And so on…

I’d rather take a moderate increase in image size (from adding processes related to the purpose) than have to orchestrate a bunch more containers to serve a single purpose.

6. No Single Points of Failure

You're gonna have a bad time

Docker makes it super-easy to deploy everything to a single host and hook them up via Docker links.

But then you’re a power-cycle away from disaster.

Docker is an amazing tool that makes a lot of things way easier. But you still need to put thought and effort into what containers you deploy onto what hosts. You’ll need to plan a load balancing strategy for your apps, and failover or cluster strategy for your master databases, etc.

Future standards

Docker is ready for prime time production usage, but that doesn’t mean it or its ecosystem is stagnant. There are a couple of things to consider going forward.

Docker 1.6 logging/syslog

Docker 1.6 introduces the concept of a per-host (not per-container) logging driver. In theory this would let us remove syslog from our base images. Instead we’d send logs from the containers, via the Docker daemon, to syslog installed on the host itself.

Docker Swarm

Docker swarm is a clustering system. As of this writing it’s at version 0.2.0 so it’s still early access.

Its promise is to take a bunch of Docker hosts and to treat them as if they’re one giant host. You tell Docker swarm “Here’s a container, get it running. I don’t need to know where!”

There are features planned but not yet implemented that would let you use it without creating the aforementioned single point of failure.


Docker orchestration with maestro-ng at Kabbage

I also posted this over at our Kabbage Tech Blog

At Kabbage, my team loves using Docker! We get a ton of parity between our development, testing and production environments.

We package up our code, configuration and system dependencies into a Docker image. That image becomes our immutable deployment unit.

I’ll cover how we build and package repeatable Docker images in another post. For now, let’s talk about how we deploy and run these images.

Too many cooks, er, options

You have many options for managing the deployment and operation of your docker images. Early into our first Docker project, I assumed we’d use Shipyard for orchestration.

It had a nice GUI and an API. I’d planned to script Shipyard’s API to get the images and containers onto the hosts.

I found out the hard way that Shipyard can’t pull images onto remote Docker hosts! I thought for a hot minute about scripting something to handle that part. But that seemed more complicated than it was worth.

So I started running down the list with not much time left to get a working solution…

Panamax.io — Had a GUI and an API but seemed way more complex than what we needed.

Fig/docker-compose — We were already using fig for our local development environments. Managing remote docker hosts isn’t its strong suit. It’s possible but slow because you deploy to each host in sequence.

Centurion — Looked promising. It was fig, but for remote systems. New Relic wrote it, so it’s got some real-world usage. But the first thing I ran into when using it was a Ruby traceback. I could’ve spent my time diagnosing it, but I had one more tool to try out.

maestro-ng — Looked a lot like Centurion and fig. It could pull images onto remote docker hosts, check! And it’s written in Python, so if I ran into a problem I had a better chance of fixing it quickly.

Maestro-ng’s the winner

Maestro is a lot like fig. You configure your container — which image, environment variables, volumes, links, etc. — in a YAML file. You also configure the remote docker hosts, or “ships.”


Plus, under the hood the yaml files are treated as Jinja2 templates. You can keep your configuration DRY with a base template for an application. In per-environment yaml files, you change only what’s needed!
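To give you a feel for the format, here’s a rough sketch of the shape of a maestro YAML file. The names and values are invented, and the exact schema is in the maestro-ng docs:

name: ourapp

ships:
  web1: {ip: 10.0.0.11}
  web2: {ip: 10.0.0.12}

services:
  web:
    image: registry.example.com/ourapp:latest
    instances:
      web-1:
        ship: web1
        ports: {http: 8000}
        env: {DJANGO_SETTINGS_MODULE: settings.staging}
      web-2:
        ship: web2
        ports: {http: 8000}
        env: {DJANGO_SETTINGS_MODULE: settings.staging}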


Deployment is a breeze. We use a Blue/Green deployment strategy so we can safely stop the running containers on our hosts. Here’s what our deploy script looks like:

# pull new image onto box
maestro -f $maestro_file pull $service

# stop the running service
maestro -f $maestro_file stop $service

# clean out old containers
maestro -f $maestro_file clean $service

# start the new containers with the new image
maestro -f $maestro_file start $service

Get Docker running on AWS OpsWorks

I’ve spent the past couple of weeks at my new job doing a couple of things: hiring kick-ass Python and UI engineers, and getting some build-and-deploy infrastructure set up so the team can hit the ground running.

Long story short: I wanted a way to deploy pre-built Docker images from any repository to hosts running in OpsWorks.

I chose Docker because it would let me get a repeatable, consistent environment locally and on various non-production and production environments. And I’d get there a lot quicker than writing Puppet or Chef recipes and using Vagrant.

When it came time to get a non-local environment spun up, I turned to AWS due to some networking and security issues around my team’s first project.

Time was of the essence, so I first turned to Beanstalk but found its Docker support problematic. Amazon announced but hasn’t yet released their Elastic Container Service. I ended up picking OpsWorks.

I couldn’t find a lot of advice on the 21st century version of man pages, so I’m writing this up in the hope it helps others, and that wiser folks tell me what I can do better!

Brief OpsWorks primer

OpsWorks is an engine for running Chef recipes based on lifecycle events in the course of a machine’s life.

You start by defining a layer, which is a group of machines that do similar tasks like serve your Web app, run memcache, or host Celery workers.

Then for that layer you define which recipes fire whenever a machine is set up, an app is deployed to it, it’s shut down, and so on.

AWS OpsWork and Docker deployment strategy

The best strategy I could find was on an AWS blog post.

Chris Barclay sets up a layer with recipes that install Docker. Application deployments require the OpsWorks instance to pull your code, including its Dockerfile, from a git repo and build it locally before running it.

I didn’t like building the Docker images locally from git sources. It ruled out using pre-built community images and opened the door to random build issues on a subset of the boxen.

What I wanted was a way to deploy pre-built Docker images from any repository to hosts running in OpsWorks.

Improved OpsWorks and Docker deployment

I took the code from Chris Barclay and adapted it. You set some key environment variables in your OpsWorks application definition, and those tell the Chef recipe which registry, image, and tag to pull and, optionally, the registry username and password to authenticate with.
Here are the instructions and source to get up and running:

  1. Set up a new stack in OpsWorks. Under Advanced set the following:
    • Chef version: 11.10
    • Use custom Chef cookbooks: https git url to a repo containing the recipes
    • Manage Berkshelf: Yes
    • Berkshelf version: 3.1.3
  2. Add a layer
    • Type: Other
    • Recipes
      • Setup: owdocker::install
      • Deploy: owdocker::docker-image-deploy
  3. Add an App
    • Type: Other
    • Repository type: Other
    • Environment variables:
      • registry_image: The path portion of a docker pull command ala: docker pull $registry_image
      • registry_tag: The tag of the image that should be pulled from the registry ala quay.io/yourusername/yourimage:$registry_tag
      • layer: The shortname of the layer the image should be deployed to
      • service_port: The port on the HOST that will be connected to the container
      • container_port: The port on the CONTAINER that will be connected to the service port
      • registry_username: OPTIONAL username to login to the registry
      • registry_password: OPTIONAL password to login to the registry
      • registry_url: OPTIONAL url to a non hub.docker.com registry ala quay.io


source "https://supermarket.getchef.com&quot;
metadata
cookbook "apt", '~>2.6.0'
cookbook 'docker', '~> 0.36.0'
cookbook 'windows', '~> 1.34.0'


# owdocker::docker-image-deploy recipe
include_recipe 'deploy'
include_recipe 'docker'

Chef::Log.info("Entering docker-image-deploy")

node[:deploy].each do |application, deploy|
  # Skip applications that aren't deployed to this layer
  if node[:opsworks][:instance][:layers].first != deploy[:environment_variables][:layer]
    Chef::Log.warn("Skipping deploy::docker application #{application} as it is not deployed to this layer")
    next
  end

  opsworks_deploy_dir do
    user deploy[:user]
    group deploy[:group]
    path deploy[:deploy_to]
  end

  opsworks_deploy do
    deploy_data deploy
    app application
  end

  # Stop and remove any existing container and image for this application
  Chef::Log.info('Docker cleanup')
  bash "docker-cleanup" do
    user "root"
    returns [0, 1]
    code <<-EOH
      if docker ps | grep #{deploy[:application]};
      then
        docker stop #{deploy[:application]}
        sleep 3
        docker rm -f #{deploy[:application]}
      fi
      if docker ps -a | grep #{deploy[:application]};
      then
        docker rm -f #{deploy[:application]}
      fi
      if docker images | grep #{deploy[:environment_variables][:registry_image]};
      then
        docker rmi -f $(docker images | grep -m 1 #{deploy[:environment_variables][:registry_image]} | awk {'print $3'})
      fi
    EOH
  end

  # Log in to a private registry if credentials were provided
  if deploy[:environment_variables][:registry_username]
    Chef::Log.info("REGISTRY: Login as #{deploy[:environment_variables][:registry_username]} to #{deploy[:environment_variables][:registry_url]}")
    docker_registry "#{deploy[:environment_variables][:registry_url]}" do
      username deploy[:environment_variables][:registry_username]
      password deploy[:environment_variables][:registry_password]
      email deploy[:environment_variables][:registry_username]
    end
  end

  # Pull tagged image
  Chef::Log.info("IMAGE: Pulling #{deploy[:environment_variables][:registry_image]}:#{deploy[:environment_variables][:registry_tag]}")
  docker_image "#{deploy[:environment_variables][:registry_image]}" do
    tag deploy[:environment_variables][:registry_tag]
  end

  # Build the -e KEY=value flags, leaving out the registry password
  dockerenvs = " "
  deploy[:environment_variables].each do |key, value|
    dockerenvs = dockerenvs + " -e " + key + "=" + value unless key == "registry_password"
  end
  Chef::Log.info("ENVs: #{dockerenvs}")

  # Run the new container from the freshly pulled image
  Chef::Log.info('docker-run start')
  bash "docker-run" do
    user "root"
    code <<-EOH
      docker run #{dockerenvs} -p #{node[:opsworks][:instance][:private_ip]}:#{deploy[:environment_variables][:service_port]}:#{deploy[:environment_variables][:container_port]} --name #{deploy[:application]} -d #{deploy[:environment_variables][:registry_image]}:#{deploy[:environment_variables][:registry_tag]}
    EOH
  end
  Chef::Log.info('docker-run stop')
end

Chef::Log.info("Exiting docker-image-deploy")

# owdocker::install recipe: install Docker from its apt repository and start it
include_recipe 'apt'

package 'apt-transport-https'

apt_repository "docker" do
  uri "https://get.docker.com/ubuntu"
  distribution "docker"
  components ["main"]
  keyserver "hkp://keyserver.ubuntu.com:80"
  key "36A1D7869245C8950F966E92D8576A8BA88D21E9"
end

execute "apt-get update" do
  user "root"
end

# Install Docker latest version
package "docker" do
  package_name "lxc-docker"
  action :install
end

service "docker" do
  action :start
end

