I think this is a very clear disadvantage of the microservice architecture they chose in this case, and the post does allude to that. To recreate this data they needed to query several different microservices that would not have been able to sustain a higher load.
If I calculated this right, the time they mention comes down to 30 items per second. That is maybe not unreasonable for something that queries a whole bunch of services via HTTP, but it's kinda ridiculous if you compare it to directly querying a single RDBMS.
You could probably fix this by scaling everything horizontally, if that is possible. But the real solution would be, as you say, to have bulk processing capabilities.
Yes, adding a "return X items" mode to the same microservices is often a way to get a significant performance boost with only minor changes. Even if your main use case needs only one item, it enables mass processing without incurring the immense overhead of a separate request per item.
This goes beyond microservices. I've done a fair share of optimization in our own code to introduce batch-oriented calls rather than singular ones. Even when your service is on the same LAN as the DB server, fetching thousands of rows one select at a time is very slow compared to a single select, or a few.
In one case, the application was running on a laptop over WiFi, which increased the network latency by 10x. Suddenly a 30-second job turned into a 5-minute job.
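To make the one-select-at-a-time vs. batched difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `items` table, the function names, and the chunk size of 500 are all made up for illustration. An in-memory SQLite database has no network round trip, so this only shows the shape of the two approaches, not the latency gap you would see against a remote DB server.

```python
import sqlite3


def make_demo_db(n=2000):
    """Build a throwaway in-memory table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?)",
        [(i, f"item-{i}") for i in range(n)],
    )
    return conn


def fetch_one_by_one(conn, ids):
    """One SELECT per id: len(ids) round trips to the database."""
    rows = []
    for item_id in ids:
        row = conn.execute(
            "SELECT id, name FROM items WHERE id = ?", (item_id,)
        ).fetchone()
        if row is not None:
            rows.append(row)
    return rows


def fetch_batched(conn, ids, chunk_size=500):
    """A few SELECTs with IN (...) lists: about len(ids) / chunk_size round trips."""
    rows = []
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        # Only "?" placeholders are interpolated; the values stay parameterized.
        placeholders = ",".join("?" * len(chunk))
        query = f"SELECT id, name FROM items WHERE id IN ({placeholders})"
        rows.extend(conn.execute(query, chunk).fetchall())
    return rows
```

Both functions return the same rows (possibly in a different order). The chunking is there because most databases limit how many bound parameters a single statement may carry.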
Since one can easily implement a singular version using a batch size of 1, it's a drop-in replacement in most cases.
Also, since one can easily implement a batch-style API using a singular version, you can write the API batch-oriented but implement it using the singular version if that's easier. This allows you to easily swap out the implementation if needed at a later date.
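A rough sketch of both directions in Python, assuming a made-up in-memory store: `_STORE`, `get_item`, `get_items`, and `get_single` are hypothetical names, not anyone's real API. The batch-shaped function is implemented with the singular lookup for now, and the singular convenience call is just a batch of size 1.

```python
from typing import Dict, Iterable, Optional

# Hypothetical backend standing in for a real service or database.
_STORE: Dict[int, str] = {1: "alpha", 2: "beta", 3: "gamma"}


def get_item(item_id: int) -> Optional[str]:
    """Existing singular lookup (the easy implementation)."""
    return _STORE.get(item_id)


def get_items(item_ids: Iterable[int]) -> Dict[int, str]:
    """Batch-shaped API, currently implemented by looping over get_item.

    Callers only depend on this signature, so the body can later be
    replaced by a real bulk query without touching them.
    """
    result: Dict[int, str] = {}
    for item_id in item_ids:
        value = get_item(item_id)
        if value is not None:
            result[item_id] = value
    return result


def get_single(item_id: int) -> Optional[str]:
    """Singular convenience wrapper: a batch of size 1."""
    return get_items([item_id]).get(item_id)
```

Returning a dict keyed by ID is one design choice among several; it lets the caller see which IDs were missing without relying on result order.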
Totally agree. If a frontend wants exactly one of something, then implement an API that calls the same batch code but exposes it as a singular call.
The thing I've found for object retrieval (as opposed to search) is that you might want to break GET semantics and have people POST in a list of IDs. Otherwise you might hit the URL length limit that servers and proxies enforce on the query string. Random tip.
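A small sketch of why the POST variant helps, using only the Python standard library. The URL `https://api.example.com/items` and the `/batch-get` path are placeholders, and the JSON body shape `{"ids": [...]}` is just one plausible convention. Nothing is sent over the network here; the code only builds the two requests so you can compare them.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://api.example.com/items"  # hypothetical endpoint


def build_get_request(ids):
    """IDs in the query string: the URL grows with every ID."""
    query = urlencode({"ids": ",".join(map(str, ids))})
    return urllib.request.Request(f"{BASE_URL}?{query}", method="GET")


def build_post_request(ids):
    """IDs in a JSON body: the URL stays short no matter how many IDs."""
    body = json.dumps({"ids": list(ids)}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/batch-get",  # hypothetical path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With a few thousand IDs the GET URL runs to tens of kilobytes, which is more than many servers and proxies accept by default (limits are often in the range of a few kilobytes), while the POST URL does not change length at all.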