What went well?
Deploying services independently is important for agility. Because our services were decoupled, deploying one did not require code changes in several others. We are a small team, but as we grow we will need to be able to separate ownership.
Removing or refactoring services was relatively painless thanks to modularity. There were still costs, but we understood them well enough to deliver the changes cleanly.
From the start, we were conscious of being a small team, so we didn't get carried away adopting enterprise tooling that companies like Facebook or Google use to solve scaling problems we will never have.
What went wrong?
Maybe half of the development team's effort was spent on infrastructure and deployment. A micro-services approach adds a lot more work in this area. Still, having a well-oiled pipeline and deployment strategy pays dividends: infrastructure and deployment is a crucial business process, and it gives us the ability to deploy with confidence and agility.
In the early days of our journey, parts of our platform had too many services and lots of asynchronous communication via a message bus. It was brittle, complex and difficult to debug. We improved this over time by simplifying some workflows and dependencies. We also used more off-the-shelf software to solve some of our problems. We will keep reviewing this as Machine Learning pipeline software matures.
What can we do better?
You can't escape trade-offs, so it's not fair to say those points went wrong, but they did present challenges.
We learned that a simpler architecture is better for a smaller team. Predicting the future is hard, so solve the problems you have right now and be wary of designing for ones you might have later. Team size should feed into the architecture.
Weigh up the pros and cons when thinking about adding a new service
Think carefully before introducing a new service. It can be tempting to create lots of small services, and doing so can seem like a good idea in a lot of situations. Thinking through the impact on workflows, dependencies, interrelationships, and technologies can help. It can also be better to break components apart over time.
Business requirements can be unclear at the start and reveal themselves over time. It is wise not to over-optimise and create lots of specialised services with complex communications unless there is a clearly justifiable reason to do so.
Bounded contexts can reveal themselves organically over time to fit the business process you're in.
Operational costs are not to be underestimated. There is lots of tooling and software out there, with lots to learn. It's a time-consuming area for a small team. You might have built a dockerised service, but don't forget the effort involved in deploying it.
Observability is an important aspect of a system like ours. In the near term, monitoring our infrastructure and deployments is an area we want to invest more effort in.
Different testing strategies emerge with micro-services. Contract testing is something we use more now than in the past. We use Pact.io, but a lot of our services rely on messaging, which it doesn't support as of writing, although support looks to be in development.
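To make the idea of a contract test for a message concrete, here is a minimal hand-rolled sketch in Python. It is not Pact and not how our services are tested; the message name, its fields, and both functions are hypothetical and exist only for illustration. The consumer states which fields and types it expects, and a test on the producer side checks that the message it builds satisfies that expectation.

```python
from typing import Dict, List

# Hypothetical contract: the fields (and types) a consumer expects
# in a "job completed" message. These names are illustrative only.
JOB_COMPLETED_CONTRACT: Dict[str, type] = {
    "job_id": str,
    "duration_seconds": int,
    "status": str,
}


def contract_violations(message: dict, contract: Dict[str, type]) -> List[str]:
    """Return human-readable violations; an empty list means the message conforms."""
    violations = []
    for field, expected_type in contract.items():
        if field not in message:
            violations.append(f"missing field: {field}")
        elif not isinstance(message[field], expected_type):
            violations.append(
                f"wrong type for {field}: expected {expected_type.__name__}, "
                f"got {type(message[field]).__name__}"
            )
    return violations


def build_job_completed(job_id: str, duration_seconds: int) -> dict:
    """Hypothetical producer-side function that builds the message payload."""
    return {"job_id": job_id, "duration_seconds": duration_seconds, "status": "ok"}


if __name__ == "__main__":
    # Producer-side check: the message we publish must satisfy the consumer's contract.
    message = build_job_completed("job-1", 42)
    assert contract_violations(message, JOB_COMPLETED_CONTRACT) == []
```

A check like this only covers the shape of the payload. It says nothing about delivery, ordering, or retries on the message bus, which is part of why dedicated tooling is attractive.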
Two quotes come to mind: "There is no silver bullet" and "Embrace change or get left behind".
One of the reasons I named operational costs as the biggest factor is that we felt them more than anything else, and they slowed our output and progress. Having a development team entrenched in maintaining deployments and infrastructure means time taken away from developing new features on the roadmap.
Don't be afraid to kill off or restructure services if they aren't working out. It'll be worth it.
Our architecture has changed over the last 12 months and could look different in the next 6-12 months as business and processes evolve.