Hackweek 2015 - Janitor Monkey

Published on Wednesday, November 25, 2015

By Michael Olteanu

Improving Netflix's Janitor Monkey

What is Janitor Monkey?

Janitor Monkey is software that watches various AWS resources, and applies rules to decide when they are no longer needed. This is useful because AWS resources cost money, and when developers are empowered to create their own resources as needed, they may not remember to clean up the things they create after they no longer need them.

How does it work?

When Janitor Monkey decides resources are not needed any more, it marks them for deletion, and notifies an owner (by email) that this has occurred. After a certain amount of time has passed, the resource is deleted unless the owner has taken action to prevent the deletion.
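As a rough illustration of that mark, notify, and delete lifecycle, here is a minimal Python sketch. It is not Janitor Monkey's actual Java code. The `Resource` class, the `RETENTION` period of three days, and the function names are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed grace period between marking and deletion (hypothetical value).
RETENTION = timedelta(days=3)


@dataclass
class Resource:
    resource_id: str
    owner_email: str
    marked_at: Optional[datetime] = None  # set when the resource is flagged
    opted_out: bool = False               # set when the owner blocks deletion


def mark(resource: Resource, now: datetime) -> str:
    """Flag the resource and return the notification text for its owner."""
    resource.marked_at = now
    return (
        f"To {resource.owner_email}: {resource.resource_id} is scheduled "
        f"for deletion after {now + RETENTION:%Y-%m-%d}"
    )


def should_delete(resource: Resource, now: datetime) -> bool:
    """Delete only marked resources whose grace period has passed without an opt-out."""
    if resource.marked_at is None or resource.opted_out:
        return False
    return now - resource.marked_at >= RETENTION
```

In this sketch the owner's only way to save a resource is to set `opted_out` before the grace period ends, which mirrors the default behavior described above.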

We can make it better

First, we wanted this to be simple for a small group to set up. To accomplish this, we put the Janitor Monkey service in a Docker container. To make the service configurable, we also modified the Janitor Monkey Java code so that it checks for environment variables and, where appropriate, uses their values to override the settings in the configuration files. For a Docker container, environment variables are easier to change than configuration files.
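The idea of letting environment variables take precedence over file-based settings can be sketched as follows. The real change was made in Janitor Monkey's Java code; this Python version only illustrates the pattern. The `JANITOR_` prefix, the key-to-variable naming rule, and the property names are hypothetical.

```python
import os
from typing import Mapping, Optional

# Hypothetical prefix for environment variables that override file settings.
ENV_PREFIX = "JANITOR_"


def env_name(key: str) -> str:
    """Map a property key such as 'example.retention.days' to an env var name."""
    return ENV_PREFIX + key.upper().replace(".", "_")


def load_config(
    file_values: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> dict:
    """Start from the file's values, then let matching env vars win."""
    environ = os.environ if environ is None else environ
    config = dict(file_values)
    for key in config:
        override = environ.get(env_name(key))
        if override is not None:
            config[key] = override
    return config
```

With a scheme like this, a setting could be changed at container start with something like `docker run -e JANITOR_EXAMPLE_RETENTION_DAYS=7 ...` instead of rebuilding the image with an edited file.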

Second, we wanted to eliminate false positives. The worst-case scenario would be for this service to do more harm than good, destroying work and costing developers time. Out of the box, Janitor Monkey deletes a resource unless its owner opts out of the deletion within a certain amount of time. This seems likely to result in accidental deletions of potentially critical resources. So we modified the Janitor Monkey source code to require consent from an owner before deleting a resource.
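The difference between opt-out and opt-in deletion comes down to what happens when the owner says nothing. A minimal Python sketch of the consent-required rule, assuming a three-state consent flag (the `Consent` enum and function names are hypothetical, and the real change lives in the Java source):

```python
from enum import Enum
from typing import Dict, List, Tuple


class Consent(Enum):
    PENDING = "pending"    # owner has not responded yet
    APPROVED = "approved"  # owner agreed to the deletion
    DENIED = "denied"      # owner wants to keep the resource


def may_delete(marked: bool, consent: Consent) -> bool:
    """A marked resource is deleted only with explicit owner approval."""
    return marked and consent is Consent.APPROVED


def deletable(resources: Dict[str, Tuple[bool, Consent]]) -> List[str]:
    """Return the IDs of resources that are both marked and approved."""
    return [
        resource_id
        for resource_id, (marked, consent) in resources.items()
        if may_delete(marked, consent)
    ]
```

Under this rule a resource whose owner never answers stays in `PENDING` and is never deleted, however long it has been marked.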

Lastly, we wanted it to be easy to manage. So we created a GUI front end for (a) managing the resources that Janitor Monkey rules apply to, and (b) managing consent, that is, whether a resource that has been marked as violating the rules is okay to delete. We did this by building a Flask web app, using jQuery for some of the table rendering, and putting this web app in a Docker container.
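To give a feel for the consent side of such a web app, here is a minimal Flask sketch. It is an assumption-laden illustration, not the app we built: the routes, the in-memory `FLAGGED` store, and the sample resource ID are all hypothetical, and a real app would read from Janitor Monkey's own data.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical in-memory store of resources marked as violating the rules.
FLAGGED = {
    "i-0abc": {"owner": "dev@example.com", "consent": False},
}


@app.route("/resources", methods=["GET"])
def list_resources():
    """Return every flagged resource with its current consent status."""
    return jsonify(FLAGGED)


@app.route("/resources/<resource_id>/consent", methods=["POST"])
def set_consent(resource_id):
    """Record whether the owner agrees to deletion of one resource."""
    if resource_id not in FLAGGED:
        return jsonify(error="unknown resource"), 404
    body = request.get_json(silent=True) or {}
    FLAGGED[resource_id]["consent"] = bool(body.get("consent", False))
    return jsonify(FLAGGED[resource_id])
```

A front-end table could call the `GET` route to render rows and the `POST` route when an owner clicks an approve or deny control.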

Janitor Monkey Resource Management UI


Janitor Monkey Event UI


Next steps are to roll this out with some smaller internal groups. We'll help them customize the configuration to meet their needs, and help hand it off to them. Thanks to the work done in simplifying the setup, it should be much easier for our users to take ownership of this new service.