What is a microservice development ecosystem?
When I talk about a "development ecosystem", I am typically referring to what it takes to get code developed locally into production, and ultimately adding value to users of the system. This is more than just a build pipeline (as popularised by Jez Humble's and Dave Farley's great Continuous Delivery book), and for me a development ecosystem comprises local and integration build, test, deploy, operate and observe. Let's break this down further…
Build
Building is a vital part of any software development project, and I believe this starts with getting a good local development environment configured and ends with establishing a rugged integration build pipeline. Creating a standardised, reliable and repeatable local development environment is especially important when working with microservices, as there are typically more moving parts (services) in comparison with working on a monolith, and there may be multiple languages, platforms and datastores in play. You seriously don't want to be hand-rolling development machine configuration.
Having stressed the importance of creating a solid local development environment, I won't cover the topic in much depth within this article, as I've previously written at length about this in another OpenCredo blog post, "Working Locally with Microservices". The topic of creating a build pipeline is also well covered in the aforementioned "Continuous Delivery" book, and so I recommend reading this if you haven't already. However, the challenge of creating a multi-service, multi-pipeline build is vital to address when testing microservice code on its way to production. Let's look at the importance of testing.
Test
I started this section of my JavaOne talk by reminding the audience of the ever-popular (and very valuable) testing pyramid. The principles behind the pyramid obviously still apply when testing a microservice-based system, but often the systems/applications/services-under-test look a little different (and accordingly, so do the tests). Toby Clemson's excellent article on microservice testing strategies that was posted on Martin Fowler's blog is my go-to introductory reference to this topic.
With regard to the multi-service/multi-pipeline issue I mentioned above, I strongly caution against the use of service versioning to indicate which services will work with others. For anyone familiar with Java development, the use of artefact versioning (for example, within the ubiquitous Maven build and dependency management tool) is second nature. I believe this is best practice when building a modular monolith, as it ensures that anyone can pick up the code and the resulting dependency descriptor (e.g. the pom.xml) and they are good to go. However, for loosely-coupled microservice systems, this notion of "rubber stamping" compatible service versions doesn't scale, and it also increases explicit coupling: it's very easy to create a "distributed monolith" (trust me, I've done this).
More effective techniques for testing compatibility between services include critical path testing (otherwise known as synthetic transactions or semantic monitoring) within QA, staging, and ultimately production. Consumer-driven contracts are a great approach for asserting both the interface and behaviour of a service, and we have found Pact-JVM (combined with Pact Broker) a very useful set of tools. If you are using Spring Boot, then Matt Stine has put together a very nice demonstration of the use of Pact JVM that is available on his GitHub account.
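To make the consumer-driven contract idea concrete, below is a minimal sketch of a consumer-side test written against Pact-JVM's JUnit DSL. Treat it as a shape rather than copy-and-paste code: the service names ("shop-ui", "order-service") and endpoint are purely illustrative, and the exact package names, rule classes and annotations vary between Pact-JVM versions.

```java
import au.com.dius.pact.consumer.dsl.PactDslWithProvider;
import au.com.dius.pact.consumer.junit.PactProviderRule;
import au.com.dius.pact.consumer.junit.PactVerification;
import au.com.dius.pact.core.model.RequestResponsePact;
import au.com.dius.pact.core.model.annotations.Pact;
import org.junit.Rule;
import org.junit.Test;

import java.net.HttpURLConnection;
import java.net.URL;

import static org.junit.Assert.assertEquals;

public class OrderServiceConsumerPactTest {

    // Starts a mock provider that serves the interaction defined in the @Pact method
    // (class and accessor names differ slightly between Pact-JVM versions)
    @Rule
    public PactProviderRule provider = new PactProviderRule("order-service", this);

    @Pact(provider = "order-service", consumer = "shop-ui")
    public RequestResponsePact orderExists(PactDslWithProvider builder) {
        return builder
                .given("an order with id 42 exists")
                .uponReceiving("a request for order 42")
                    .path("/orders/42")
                    .method("GET")
                .willRespondWith()
                    .status(200)
                    .body("{\"id\": 42, \"status\": \"DISPATCHED\"}")
                .toPact();
    }

    @Test
    @PactVerification("order-service")
    public void consumerCanReadAnOrder() throws Exception {
        // In a real test this would exercise the consumer's own client code;
        // here the mock provider is called directly to keep the sketch small
        HttpURLConnection connection =
                (HttpURLConnection) new URL(provider.getUrl() + "/orders/42").openConnection();
        connection.setRequestMethod("GET");

        assertEquals(200, connection.getResponseCode());
    }
}
```

The pact file generated by a test like this can then be published (for example via Pact Broker) and replayed against the real provider in its own pipeline, which is what turns a breaking change into a conversation rather than a production incident.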
My personal recommendations for testing a microservice-based application are as follows:
- Apply Behaviour-Driven Development (BDD) to each service's API. This shouldn't be a controversial suggestion, as each service should be following the "single responsibility principle" and clearly offer related behaviour to consumers/users. My go-to tool for this is Serenity BDD (and John Smart's equally excellent book "BDD in Action").
- Use (consumer-based) contract tests to ensure conversations between (and consumer requirements of) services are captured and tested for regression. A broken contract integration test is not necessarily a show-stopper for the changes, but it is a trigger for a conversation between the related service owners. If the breaking changes are required, then contract tests must be updated.
- Component (integration-level) tests and unit tests are still vital at the individual service level.
- Apply BDD to the critical journeys throughout the application. These tests will act as your "smoke tests" and will often exercise functionality across multiple services. You can run these tests in QA, staging and production (a minimal sketch of such a journey test follows this list).
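Below is a deliberately simple, tool-agnostic sketch of what a critical journey (smoke) test might look like, using plain JUnit and the JDK's HttpURLConnection rather than any particular BDD framework. The base URL is supplied per environment (QA, staging, production), and the paths, payload and expected status codes are placeholders for your own journey.

```java
import org.junit.Test;

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

import static org.junit.Assert.assertEquals;

public class CheckoutJourneySmokeTest {

    // Each environment supplies its own base URL, e.g. -Dtarget.base.url=https://staging.example.com
    private final String baseUrl = System.getProperty("target.base.url", "https://qa.example.com");

    @Test
    public void customerCanBrowseAndCheckout() throws Exception {
        // Step 1: the catalogue responds on the critical path
        assertEquals(200, get("/products?category=books"));

        // Step 2: a basket can be created (this typically crosses several
        // services: basket, pricing, inventory)
        assertEquals(201, post("/baskets", "{\"productId\": \"42\"}"));
    }

    private int get(String path) throws Exception {
        HttpURLConnection connection = (HttpURLConnection) new URL(baseUrl + path).openConnection();
        connection.setRequestMethod("GET");
        return connection.getResponseCode();
    }

    private int post(String path, String json) throws Exception {
        HttpURLConnection connection = (HttpURLConnection) new URL(baseUrl + path).openConnection();
        connection.setRequestMethod("POST");
        connection.setRequestProperty("Content-Type", "application/json");
        connection.setDoOutput(true);
        try (OutputStream body = connection.getOutputStream()) {
            body.write(json.getBytes(StandardCharsets.UTF_8));
        }
        return connection.getResponseCode();
    }
}
```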
My final suggestions for testing include the "ilities", such as security, reliability, performance and scalability. The ZAP security tooling from the awesome OWASP team comes highly recommended (combined with the "bdd-security" framework), and I also suggest the use of Apache JMeter and the Jenkins Performance Plugin for load testing everything from individual services (typically the happy paths, in order to watch for performance regressions) to the system as a whole.
Deploy
On the topic of deployment I recommend the use of continuous deployment to production, with new or not-yet-ready functionality being hidden behind feature flags. This is not particularly new advice (and Etsy have been pushing the benefits of this for years), but I would suggest that feature flagging be implemented at the ingress level for microservices, i.e. the system boundary for user interaction, or the internal module responsible for initiating action (e.g. a cron-like system). It is all too tempting to spread related feature flags throughout multiple services, and sometimes this is essential, but in the general case, switching a feature on/off in one location during the usage journey is much easier.
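As a sketch of what flagging at the ingress can look like, here is a minimal Spring MVC controller at the system boundary that selects between two checkout implementations. The FeatureFlags and CheckoutService interfaces and the "new-checkout-flow" flag name are hypothetical stand-ins; in practice the flag lookup might be backed by a configuration server, Consul, or a dedicated feature-flag service.

```java
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical abstractions standing in for your real flag store and checkout logic
interface FeatureFlags {
    boolean isEnabled(String flagName);
}

interface CheckoutService {
    String startCheckout();
}

@RestController
public class CheckoutController {

    private final FeatureFlags flags;
    private final CheckoutService newCheckoutFlow;
    private final CheckoutService currentCheckoutFlow;

    // Bean qualifiers for the two implementations are omitted for brevity
    public CheckoutController(FeatureFlags flags,
                              CheckoutService newCheckoutFlow,
                              CheckoutService currentCheckoutFlow) {
        this.flags = flags;
        this.newCheckoutFlow = newCheckoutFlow;
        this.currentCheckoutFlow = currentCheckoutFlow;
    }

    @RequestMapping(value = "/checkout", method = RequestMethod.GET)
    public String checkout() {
        // The flag is evaluated once, at the boundary where the user journey
        // begins, rather than being scattered across downstream services
        if (flags.isEnabled("new-checkout-flow")) {
            return newCheckoutFlow.startCheckout();
        }
        return currentCheckoutFlow.startCheckout();
    }
}
```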
I also advocate for incremental rollout of new service versions (with close observation of related metrics), and the use of canarying and blue/green deployment techniques can be valuable. The final piece of advice is to avoid using datastore migration tooling (such as Liquibase and Flyway) to make non-backward compatible schema changes if at all possible, as this can lead to breakages during deployment.
The general advice with microservice data stores is to have one store per service, but in reality (for performance or reporting reasons) it is often the case that only one class/type of service writes to a store while many others perform read-only operations. Non-backwards-compatible changes will not only break the existing older instances of a service during deployment of the new service version (and hence prevent incremental upgrades), but they may also break the services performing read-only operations.
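To illustrate the kind of change that keeps a rolling deployment safe, below is a sketch of an additive (expand-only) schema change expressed as a Flyway Java-based migration. It assumes the Flyway 5.2+ Java migration API, and the table and column names are examples only; the equivalent Liquibase changeset or plain SQL migration follows the same principle of adding rather than renaming or dropping.

```java
import org.flywaydb.core.api.migration.BaseJavaMigration;
import org.flywaydb.core.api.migration.Context;

import java.sql.Statement;

public class V42__AddDispatchDateToOrders extends BaseJavaMigration {

    @Override
    public void migrate(Context context) throws Exception {
        try (Statement statement = context.getConnection().createStatement()) {
            // Expand step only: the new column is nullable and ignored by old code,
            // so older service instances (and read-only consumers) keep working
            // while the rolling deployment is in flight. Any contraction (dropping
            // or renaming old columns) belongs in a later migration, once no old
            // service instances remain.
            statement.execute("ALTER TABLE orders ADD COLUMN dispatch_date TIMESTAMP NULL");
        }
    }
}
```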
Operate
My recommendations for operating microservices from a development perspective include standardising on an OS (e.g. Ubuntu, RHEL, CoreOS, RancherOS etc) across all environments in order to minimise deviation between development and production environments; utilising programmable infrastructure tooling that allows cross-vendor initialisation and destruction of entire environment stacks, such as HashiCorp's Terraform; and also the use of configuration management tooling to manage lower-level instance and service configuration, for example, Chef, Ansible, Puppet, or SaltStack ("CAPS" tooling).
In my JavaOne talk I also discussed the choice between external versus client-side service discovery and load-balancing, and how this relates to centralised configuration (and tooling such as Consul, etcd and ZooKeeper). There is some interesting work going on in this space, including srv-router and Baker Street.
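As a small illustration of the client-side option, here is a sketch using Spring Cloud's DiscoveryClient abstraction, which can be backed by registries such as Consul or ZooKeeper. The "recommendation-service" name is an example, and a production client would layer load-balancing, caching and retries on top of this lookup.

```java
import org.springframework.cloud.client.ServiceInstance;
import org.springframework.cloud.client.discovery.DiscoveryClient;
import org.springframework.stereotype.Component;

import java.util.List;

@Component
public class RecommendationClient {

    private final DiscoveryClient discoveryClient;

    public RecommendationClient(DiscoveryClient discoveryClient) {
        this.discoveryClient = discoveryClient;
    }

    public String resolveBaseUrl() {
        // Ask the service registry for currently registered instances and pick one;
        // the registry (Consul, ZooKeeper, etc.) is configured elsewhere in the app
        List<ServiceInstance> instances = discoveryClient.getInstances("recommendation-service");
        if (instances.isEmpty()) {
            throw new IllegalStateException("No instances of recommendation-service registered");
        }
        return instances.get(0).getUri().toString();
    }
}
```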
Observe
I echo the sentiments of many industry luminaries, in that something should not be considered successfully deployed to production unless it is fully monitored. Once basic monitoring is in place, for example using the health check and metric endpoint functionality provided by Coda Hale's (Dropwizard) Metrics library or the Spring Boot Actuator, I then advocate exposing business-specific metrics for service functionality that will indicate service and system health. Examples of such metrics include assertions on minimum incoming message queue lengths, the latency of a dependent third-party downstream service, or the average number of ecommerce shop checkouts per minute.
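As an example of surfacing business-specific health, below is a sketch of a custom health check written against Coda Hale's (Dropwizard) Metrics library. The MessageQueueStats interface, queue name and backlog threshold are hypothetical; the equivalent in Spring Boot would be a custom HealthIndicator exposed via the Actuator.

```java
import com.codahale.metrics.health.HealthCheck;
import com.codahale.metrics.health.HealthCheckRegistry;

// Hypothetical interface standing in for whatever exposes queue depth
interface MessageQueueStats {
    long pendingMessages(String queueName);
}

public class OrderQueueHealthCheck extends HealthCheck {

    private static final long MAX_BACKLOG = 1000; // illustrative threshold

    private final MessageQueueStats queueStats;

    public OrderQueueHealthCheck(MessageQueueStats queueStats) {
        this.queueStats = queueStats;
    }

    @Override
    protected Result check() {
        // Report health in business terms: is the order backlog within bounds?
        long backlog = queueStats.pendingMessages("orders-inbound");
        if (backlog > MAX_BACKLOG) {
            return Result.unhealthy("orders-inbound backlog is " + backlog + " messages");
        }
        return Result.healthy();
    }

    // Registration, typically done once at application start-up
    public static void register(HealthCheckRegistry registry, MessageQueueStats stats) {
        registry.register("order-queue-backlog", new OrderQueueHealthCheck(stats));
    }
}
```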
I also repeated my "log like an operator" mantra that I often recommend, and pointed the audience towards some good resources to help developers write log statements that are useful for everyone from fellow developers to QA specialists and operators. I also recommended tooling in this space, including the ubiquitous Elasticsearch-Logstash-Kibana (ELK) stack, Zipkin for distributed (correlated request) tracing, and InfluxDB, Telegraf and Grafana for metric capture and display.
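To show what "logging like an operator" can mean in practice, here is a sketch using SLF4J with a correlation ID held in the MDC, so that log entries for one request can be tied together across services in the ELK stack (or handed to a tracer such as Zipkin). The class, method and the "X-Correlation-ID"-style convention are illustrative choices, not a standard.

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

import java.util.UUID;

public class PaymentHandler {

    private static final Logger log = LoggerFactory.getLogger(PaymentHandler.class);

    public void handlePayment(String incomingCorrelationId, String orderId) {
        // Reuse the caller's correlation ID if one was propagated, otherwise start one
        String correlationId = incomingCorrelationId != null
                ? incomingCorrelationId
                : UUID.randomUUID().toString();
        MDC.put("correlationId", correlationId);
        try {
            log.info("Payment authorisation requested for orderId={}", orderId);
            // ... call the payment provider ...
            log.info("Payment authorised for orderId={}", orderId);
        } catch (RuntimeException e) {
            // Log what an operator needs at 3am: what failed, for which order,
            // and which correlation ID to search for across the other services
            log.error("Payment authorisation failed for orderId={}", orderId, e);
            throw e;
        } finally {
            MDC.remove("correlationId");
        }
    }
}
```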
The final recommendations for operating microservices included implementing a good approach to alerting (à la Rob Ewaschuk's "Philosophy on Alerting"), and developing strategies and tactics for fixing problems, such as those provided by Brendan Gregg's USE method, and Kyle Rankin's very useful "DevOps Troubleshooting" book.
JavaOne Video
Below is a link to the YouTube recording of the talk. The video includes all of the talks that took place in the room that day, and so be careful when scrubbing the timeline (unless you would like to watch the other excellent talks)!
Talk slides
I have uploaded the version of the talk presented at JavaOne to my SlideShare account:
J1 2015 “Building a Microservice Ecosystem: Some Assembly Still Required” from Daniel Bryant
Feedback and questions are welcomed!
As usual, we're always keen to receive feedback and comments at OpenCredo, and also to discuss any issues you may be having at your organisation. Please feel free to drop me a line on Twitter @danielbryantuk or email daniel.bryant@opencredo.com
This blog is written exclusively by the OpenCredo team. We do not accept external contributions.