Hacker News
How we build microservices (yourkarma.com)
63 points by klaaspieter on July 2, 2014 | 40 comments



"By just going ahead and building the app, we became familiar with the problem we were trying to solve, and the more familiar we were with the problem, the more obvious it was where we needed boundaries between aspects of the app. Every time we encountered something that clearly looked like it should be a separate piece, we turned it into a service."

THIS. This should be printed in 36-point font on every microservice article.


"The biggest challenge with microservices is testing. With a regular web application, an end-to-end test is easy: just click somewhere on the website, and see what changes in the database."

I find this surprising. I would've thought having a more modular system would lend itself to easier testing. Have others had this same experience?


It's easier to test individual items, for sure. Testing small code bases is the best thing ever. However, some actions require stuff to happen in multiple services.

There are "contracts" on how to talk to other services. You can test if you follow that contract, but it's hard to verify automatically that these contracts are in sync between apps.


I think what the article is getting at is that testing microservices in an event-based messaging architecture is difficult. If the architecture didn't comprise queues that are serviced asynchronously, it might have been easier to test the microservices in an end-to-end scenario.


Every little piece is easy/easier to test, but testing end-to-end over multiple micro-apps is hard in my experience.


Why is this? Is it simply a deployment issue - that it's hard to get all the pieces running together in a test environment?


In my experience even getting all the pieces installed in a testing environment is hard. Especially when you need to make sure that they are all wired correctly, with the correct versions, with all their dependencies (maybe different kinds of databases, queues, caching layers). And if the services are written in different languages, that adds another level of complexity. Eventually, when the system and the team of people that needs a stable test environment grow big enough, it just becomes too expensive.


this is where docker and the ecosystem around it starts filling in the pieces.

also, it makes failures hurt less, and rollbacks much easier.


Docker answers half of the issues: deployment.

The other half: distributed logging (how do you trace a 'transaction' that executes across multiple machines/paths?).

What about rolling releases? What about versioning?

What about service discovery? It's still an immature field; look at how many products/tools out there are trying to be THE service discovery choice.


We actually use syslog-ng + Loggly for distributed logging. We partially solve the transaction problem by slapping a request ID on every action that's initiated by a web request. We then carry this request ID from service to service, which gives us the ability to trace exceptions cross-service. It's obviously not the perfect solution, but it has helped us many times when debugging customer-facing 500 exceptions.
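As a rough sketch, the request-ID scheme described above might look something like this in a Rack middleware. The header name, the `RequestId` class, and the `outgoing_headers` helper are illustrative assumptions, not Karma's actual code:

```ruby
require "securerandom"

# Hypothetical middleware: assign a request ID at the edge if one isn't
# present, and expose it so loggers and outgoing HTTP calls can reuse it.
class RequestId
  HEADER = "HTTP_X_REQUEST_ID".freeze

  def initialize(app)
    @app = app
  end

  def call(env)
    env[HEADER] ||= SecureRandom.uuid
    # Stash the ID so anything in this request cycle can tag itself with it.
    Thread.current[:request_id] = env[HEADER]
    @app.call(env)
  end
end

# Headers to attach to every internal service-to-service request,
# so the same ID travels from service to service.
def outgoing_headers
  { "X-Request-Id" => Thread.current[:request_id] }
end
```

Greppping logs for one ID then reconstructs the whole cross-service trace.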

We solved the service discovery problem by simply using DNS. Every service runs behind an (internal) Elastic Load Balancer, and we let the load balancer configuration figure out which instance is up and which one is down. Again, not the perfect solution, but it works great for now and is very easy to setup/maintain.


Sorry, I meant automated testing. Getting all the pieces running in a test stage isn't hard.


> We have an in-house tool we use called Fare which reads this configuration and sets up the appropriate SQS and SNS queues.

Would love to see this released publicly. As stated in the post, there aren't a ton of (public) examples of microservice architectures, so everyone seems to be solving the same problems independently. It would be great if we could start pooling more of our resources together.


We will! There are, however, a couple of really nasty issues that we don't feel comfortable releasing to the public yet.


I've been thinking a lot about SOA. Articles like this and Living Social's SOA blog posts series are a big help. But, I feel something is still missing. For example, no one talks about security of the various services. Are there any good books someone can recommend on SOA?


We have all services running inside a VPN (see one of our older posts: https://blog.yourkarma.com/building-private-clouds-with-amaz...) and we also use HTTP Basic token auth, with tokens configured upon deployment. Every app gets its own token, so we can trace which app does what.
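A minimal sketch of what per-app token auth like this could look like; the token values, the `TOKENS` table, and the `authenticate` helper are made up for illustration:

```ruby
# Hypothetical deploy-time config: each consuming app gets its own token,
# so every request can be attributed to a specific app.
TOKENS = {
  "s3cret-orders-token"   => "orders",
  "s3cret-shipment-token" => "shipments",
}.freeze

# Returns the calling app's name, or nil if the token is unknown.
# In a Sinatra service this would run in a before filter, halting
# with a 401 when it returns nil.
def authenticate(authorization_header)
  return nil unless authorization_header&.start_with?("Token ")
  TOKENS[authorization_header.delete_prefix("Token ")]
end
```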


I'd figured the services were on a private network (or at least the app is listening on an interface on a private network), but it is good someone confirmed it. Thanks for your insight on security.


    The shipment app listens to the messaging system, sees an 
    order take place, looks at the details, and says, "Okay, 
    I need to send two boxes to this person." Any other 
    services interested in an order happening can do 
    whatever they need to with the event in their own 
    queues, and the store API doesn't need to worry about 
    it.
I'm curious how you guarantee that all the systems that need to see an event, will actually see it, before it is removed from the event queue. I assume the shipment app, in your example, is responsible for removing the event from the queue. So, what if it removes it before the mailer app or the "make cash register sound" app, sees the event?


I believe each service has its own queue, so producing an event would "fan-out" and add the event to each queue. That was my interpretation of why they were using both SNS and SQS, but I could be mistaken. If a particular service is not interested in a given event, it can just pluck it out of the queue and move on to processing the next message.


Each app gets its own queue. So there is a queue for the shipment app and for the mailer app. When an event is published to SNS, it gets added to both queues.


I see. So you never add a listener, without also modifying the emitter to send to an additional queue.


We don't change the emitter, the listener changes the SNS config. The emitter publishes to SNS under a certain topic. Listeners that are interested in that topic subscribe to it by changing the SNS config, basically saying "give me a copy of those events too and put it onto my own queue". SNS copies the messages to every queue.
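The message flow described here can be modeled in a few lines. This is a plain in-memory sketch of the fan-out, not the aws-sdk calls; the `Topic` class and the queue names are invented for illustration:

```ruby
# In-memory model of SNS -> SQS fan-out: the publisher only knows the
# topic; each listener subscribes its own queue to that topic.
class Topic
  def initialize
    @queues = []
  end

  # A new service calls this once: "give me a copy of those events too".
  # The emitter never changes.
  def subscribe(queue)
    @queues << queue
  end

  # Publishing copies the message to every subscribed queue.
  def publish(message)
    @queues.each { |q| q << message }
  end
end

order_placed   = Topic.new
shipment_queue = []
mailer_queue   = []
order_placed.subscribe(shipment_queue)
order_placed.subscribe(mailer_queue)

order_placed.publish({ event: "order.placed", order_id: 42 })
# Both queues now hold their own copy of the event.
```

Adding a third listener is just another `subscribe` call; the publisher's code is untouched.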


SQS is very simple and the solution here requires that the publisher knows all the listeners.

If you use RabbitMQ (which we are using in our own microservice architecture), you can use fanout exchanges based on topics to accomplish the same thing more elegantly.

In this topology, every message sent to that exchange gets copied to any queue bound to that exchange.

This system also supports routing via "topics", which are paths that support wildcards; the publisher can publish to "foo.bar.baz", and queues can bind to the exchange using the routing key "foo.*.baz", for example.

We use this to listen to specific events, e.g. a specific app is associated with content under the path "someapp.someclient". A data store publishes modification events with "create", "update" or "delete" followed by the path; so the app, to get the stream of updates, simply listens to "create.someapp.someclient.#", "update.someapp.someclient.#" and "delete.someapp.someclient.#".
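For reference, AMQP topic matching works word-by-word on the dot-separated routing key: "*" matches exactly one word, "#" matches zero or more. A small Ruby sketch of that matching logic (the broker does this for you; this just illustrates the semantics):

```ruby
# Does an AMQP-style binding pattern match a routing key?
# "*" matches exactly one dot-separated word, "#" matches zero or more.
def topic_match?(pattern, routing_key)
  match_words(pattern.split("."), routing_key.split("."))
end

def match_words(pat, words)
  return words.empty? if pat.empty?
  head = pat.first
  rest = pat[1..]
  case head
  when "#"
    # "#" may consume zero, one, or many words.
    (0..words.size).any? { |i| match_words(rest, words[i..]) }
  when "*"
    # "*" consumes exactly one word.
    !words.empty? && match_words(rest, words[1..])
  else
    # A literal word must match exactly.
    !words.empty? && words.first == head && match_words(rest, words[1..])
  end
end
```

Note that "create.someapp.someclient.#" matches "create.someapp.someclient" itself as well as anything nested under it, since "#" can match zero words.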


You can use SNS to act as a fanout exchange. Then you just need to subscribe all your SQS queues to the SNS topic. Events get pushed to the SNS topic, and SNS will distribute them to all the queues for you, without the publisher needing to know anything about the subscribers.


The text makes it seem like they're using SQS for many things, but the diagram only shows it between Orders and Shipments. I guess everything else communicates through the database and HTTP requests?

Using SQS means you're either polling or long-polling with hangups every 20 seconds. This seems pretty shitty to me. Also how do you structure one of these services around polling or long-polling to get its info? It sounds like they're using Rails. Does Rails have something that makes this easy to do?

At first, reading this reminds me of the Blackboard pattern from The Pragmatic Programmer. This pattern seems like a neat way to separate an application into different agents.


We use SQS for basically everything that happens asynchronously to the main flow. So when a user signs up there are a bunch of things that need to happen straight away (create an account, give 100MB to every new user); these go via HTTP. The others (email, give the owner of a Karma an additional 100MB, a push notification) go through SNS and SQS.

All these async things don't impact the main flow of the user, so it's no problem that it isn't instantaneous. We are running background processes that do this.
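A background process of this kind usually boils down to a long-poll loop: receive, handle, then delete, so that a message is only removed from the queue after it has been processed successfully. A sketch with the queue client injected (the `run_worker` helper and the `receive`/`delete` interface are assumptions for illustration; with the aws-sdk gem this would be `receive_message` with `wait_time_seconds` up to 20):

```ruby
# Generic long-polling worker loop. `queue` responds to receive/delete;
# `max_iterations` exists only so the loop can be exercised in tests.
def run_worker(queue, handler, max_iterations: Float::INFINITY)
  done = 0
  while done < max_iterations
    # Long poll: blocks up to 20 seconds waiting for messages.
    messages = queue.receive(wait_time_seconds: 20)
    messages.each do |msg|
      handler.call(msg[:body])
      # Delete only after successful handling, so the queue
      # redelivers the message if the handler raised.
      queue.delete(msg[:receipt_handle])
    end
    done += 1
  end
end
```

The delete-after-success ordering is what makes the loop safe to crash: an unacknowledged message reappears after its visibility timeout.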

We only use Rails for our frontend applications, they don't contain any business logic. Our backend processes are usually Sinatra for HTTP requests, or custom daemons.


What's the advantage for your situation of publishing to SNS which is in turn publishing to SQS vs. just publishing directly to SQS? Are you forking to multiple SQS queues?


Yup we are. There might be multiple applications interested in the event and they each get their own queue.


Have you looked into a "real" messaging queue for this, like RabbitMQ or another AMQP based system for this? Or are you bound to SNS/SQS for a particular reason?


"Real messaging" aside, this is an interesting question. It'd be great to hear from iain_hecker if they considered using RabbitMQ (or similar) and if yes, why SQS/SNS were chosen.

Also, I'd be interested to read more about how you handle authentication across web/mobile. Thank you for the blog post and for taking the time to answer questions here. :-)


We have considered stuff like RabbitMQ. They would be excellent solutions too and we are working on a part of our architecture with MQTT/protobuf (more on that later, I'm sure). We went with SNS/SQS because we don't have to build a cluster with that. Building clusters is hard and Amazon takes care of that. Since we were already using a bunch of AWS products, this was a nice fit and we're quite happy with it.

Authentication is a good idea for an article too. Thanks :)


Why isn't SNS/SQS a "real" messaging queue?


Sorry, my question was a bit vague. Generally SQS works with polling, and once a message has been pulled from a queue it is not available to other viewers any longer. This makes it hard to create a system where you have an unknown number of clients that may or may not be interested in certain queues, or even in messages of certain types.

RabbitMQ (and other AMQP based queueing systems) have built-in solutions for this that SQS doesn't offer. Karma solved it by sending the same message to a number of different queues, but this makes scaling to more clients harder, and reduces flexibility in subscribing clients to certain message types.

Hope this makes it clearer :-).


That's where SNS comes in. We publish to SNS which will fan out to multiple SQS queues. When a new service is built, it says to SNS "subscribe my queue to these topics". The publisher of the event doesn't know about the subscribers.


Ah, nice, I didn't know the integration of SNS and SQS worked that way. Cool!


Is there a difference between microservices and SOA? I've used architectures just like this in the past but never heard of "microservices" until recently.


AFAICT, "microservices" is exactly SOA in its original sense; my guess is the rename is because SOA has become so attached to particular XML-based implementations and standards (e.g., the WS-* series of standards).


This. Also SOA means STD in Dutch, so we don't want to use it that often ;)


No difference, just a new name. Not so bad, I guess; however, I fear it will come a cropper when vendors leap on the bandwagon again and start selling inappropriate tools and frameworks badged as "microservices", and we can start the process of renaming again.


That's great news for vendors and consultants who jump on the bandwagon though :).


Every time I kludge new functionality into an existing repo, I wish microservices were more generally acceptable.




