~13 years ago (!) I worked with the Amazon Seller API and left this comment in my own source code:
// the XML returned from this request is *mind-bogglingly* bad. Terrifyingly bad.
// a completed batch looks like this:
// <Batch>batchid=363777811 status=Done dateandtime=09/18/2007 09:53:10 PDT activateditems=335 numberofwarnings=0 itemsnotacivated=17 </Batch>
// and an incomplete batch like:
// <Batch>batchid=363778361 status=In Progress </Batch>
// so we'll just parse each item as a regex. Thanks Amazon.
The documentation at the time was just a forum post that was later removed, so it no longer exists at all. And this was just one of many horrors.
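For anyone curious what "parse each item as a regex" ends up looking like, here is a minimal Python sketch. It is not the original code: the name parse_batch and both patterns are my own assumptions, and the only inputs it is known to handle are the two sample strings quoted above. The wrinkle is that values can contain spaces ("In Progress", the timestamp), so a value has to run until the next key= or the end of the element.

```python
import re

# Grabs the text between <Batch> and </Batch>.
BATCH_RE = re.compile(r"<Batch>(.*?)</Batch>", re.DOTALL)

# A key, "=", then a value that lazily extends until the next " key=" or the end.
PAIR_RE = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|\s*$)", re.DOTALL)


def parse_batch(xml_text):
    """Return the key=value pairs inside a <Batch> element as a dict of strings."""
    match = BATCH_RE.search(xml_text)
    if match is None:
        raise ValueError("no <Batch> element found")
    return dict(PAIR_RE.findall(match.group(1)))


if __name__ == "__main__":
    done = ("<Batch>batchid=363777811 status=Done dateandtime=09/18/2007 09:53:10 PDT "
            "activateditems=335 numberofwarnings=0 itemsnotacivated=17 </Batch>")
    print(parse_batch(done))
```

Everything comes back as a string, including the misspelled itemsnotacivated key, which is kept exactly as the service sent it.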
I kid you not, I know exactly what you're talking about there. I also wrote my own regex parsers with fast-forward and rewind tokenizers, because it was so dreadful. I wouldn't normally do that.
There's also 4 (4!!!) different ways of returning errors in the MWS Feed APIs when used from Java. Given the number of entities, and the fact that they're combinatorial I had to do some of the craziest Scala code I've ever written because duplication was just too bad to handle for how critical that code was. To do it in a bulletproof way, that error handling for MWS feeds inherently has 2^4 control flows * N entities = 16N code paths at like 100-200 lines a piece if you didn't use some higher order abstractions.
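To make the 2^4 point concrete, here is a rough Python sketch (not the Scala code mentioned above, and not the real MWS client). The four channel names are invented placeholders, since the comment does not say what the four error mechanisms are. The idea is only that if each channel is wrapped as a small function, one generic loop covers all 16 combinations for every entity, instead of one hand-written branch per combination.

```python
from typing import Callable, Dict, List, Optional

Response = Dict[str, Optional[str]]
Check = Callable[[Response], Optional[str]]

# Hypothetical names for the four places an error might show up.
CHANNEL_KEYS = ("thrown_exception", "http_status_error", "envelope_error", "report_error")


def make_check(key: str) -> Check:
    """Build a checker that reports an error message if `key` is set on the response."""
    def check(response: Response) -> Optional[str]:
        value = response.get(key)
        return f"{key}: {value}" if value is not None else None
    return check


CHECKS: List[Check] = [make_check(key) for key in CHANNEL_KEYS]


def collect_errors(response: Response, checks: List[Check] = CHECKS) -> List[str]:
    """Run every checker once; any subset of channels may fire at the same time."""
    errors = []
    for check in checks:
        message = check(response)
        if message is not None:
            errors.append(message)
    return errors
```

The same collect_errors works for any entity type, so the cost is 4 checkers plus one loop rather than 16N code paths.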
That looks like putting "<Batch>" and "</Batch>" around a legacy text format and calling it a day, just to report that your service/API "supports XML" lol. Actually, SGML has mechanisms (shortrefs and "data tags") to parse strings like that as markup and could even infer "<Batch>" tags (but it still won't work satisfactorily with your example data).
I kind of remember a massive CDATA blob... but my memory must be playing tricks on me: Wikipedia shows some sample markup of the pre-2007 Microsoft Office (https://en.wikipedia.org/wiki/Microsoft_Office_XML_formats) and it doesn't look bad at all.
And it seems after 2007 they switched to an ECMA standard.
The text-based format in that implies lovely horrors. Now all it needs is sometimes-quoted items after "key=" (using and requiring different quotes, depending on which system you're speaking to) and at least 3 sources of "universal event ID" counter... just one "batchid" is too simple.
Just an FYI. This is about Amazon's MWS API, which is Merchant Web Services, used to interact with the Amazon part of Amazon not the AWS side of Amazon.
Also, having worked on an OSS Ruby library for it, and having had to work with it quite a bit, I can confirm. The MWS APIs leave a lot to be desired, but they are an absolute gem compared to the eBay APIs.
My take on the problem is that these APIs were developed in the early to mid 2000s, when e-commerce was taking off, so they are architected using the tech and philosophies from back then (XML, SOAP, SOA, etc.). However, we are now in 2020 and people expect up-to-date, modern interfaces, APIs, and communication protocols; but these APIs can't be shut down or risk major breaking changes because so much other stuff is built on top of them.
Really, at the end of the day the problem is that so much crap on the Web is considered ephemeral, and so many businesses come and go so quickly, that the question of what to do about legacy tech on the web isn't much of an issue, because things don't last long enough to become legacy. The exceptions are some of the e-commerce APIs of big tech companies such as eBay and Amazon that managed to thrive, but are now faced with a challenge most people never face: how to migrate legacy web services.
EDIT: Updated. Apparently I was not reading closely enough: this is not for the MWS API but for the new REST version of their Seller APIs. That being said, I'd still be willing to wager a pretty penny they slapped together a REST-to-XML translation layer on top of the MWS APIs and called it a day.
I think you've explained the situation quite well.
I got my career started helping a lot of big companies transition from SOAP/WSDL/XML tech to JSON APIs in the late 2000s (writing API servers and client libraries in PHP, Ruby, Python, and then Node.js) and so thought about these technologies day in and day out for many, many years.
I think we are still missing a dead-simple way to do it all: a very basic Tree/Grid Notation where API requests and responses are equivalent to sending a spreadsheet back and forth (no syntax characters, meaning no quotes, ticks, brackets, parens, etc., just spacing for the encoding). Higher level types would be added onto that, but at the low level it's all about just keeping your data as simple and clean as possible.
All RPCs would take a tree notation doc as input and return a tree notation doc as output. Tree notation supports no data structures and tree notation supports all data structures. APIs would use only the data structures they need. Binary protocols would be easy to optimize.
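As a rough illustration of "just spacing for the encoding", here is a small Python sketch of an indentation-only parser. This is my own simplification, not the actual Tree Notation spec: I am assuming one leading space per nesting level and single spaces between words on a line, and the function name parse_tree and the node layout are made up for the example.

```python
def parse_tree(text):
    """Parse space-indented lines into nested nodes: {"words": [...], "children": [...]}."""
    root = {"words": [], "children": []}
    # Stack of (depth, node); the root sits at depth -1 so every line nests under it.
    stack = [(-1, root)]
    for raw in text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        depth = len(raw) - len(raw.lstrip(" "))
        node = {"words": raw.strip().split(" "), "children": []}
        # Pop back up until the top of the stack is this line's parent.
        while stack[-1][0] >= depth:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((depth, node))
    return root["children"]


if __name__ == "__main__":
    request = "order\n id 42\n item\n  sku ABC\n  qty 2\n"
    print(parse_tree(request))
```

There are no quotes or brackets to escape; the trade-off is that a value containing a leading space or a newline needs some convention layered on top.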
Agree with what you have said about the MWS API, but this repo is actually the updated REST based API - which appears to be mostly the same functions as the older API but with less documentation.
Don't worry, the docs and APIs/libraries for AWS are just as bad
Boto3 is eye-watering bad. Boto2 was written by someone with some knowledge of Python and usability. Boto3 is just "automatically generated" from their API
It is not good. It is bad. The structure makes no sense. The API can be described as "the minimum that works".
I'm not here to defend AWS but I'm a bit puzzled by your comment about the Boto3 doco. Here's some sample usage for instance which clearly is something more than "automatically generated" https://boto3.amazonaws.com/v1/documentation/api/latest/guid... .
I believe they meant that the API bindings themselves are autogenerated - which I think is pretty likely given how baffling and un-pythonic some of the decisions in boto3 are.
You're right that much of the documentation clearly isn't auto-generated. Of the many Amazon APIs and tools I've worked with, boto3 is among the better of them in terms of documentation.
Not the docs, the library code is (for the most part) automatically generated
This is actually a good example of how the thing works: you have to use MessageBody as an argument name to send the message, or use the Entries argument.
Those names and that usage style are far from Pythonic
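For readers who have not used it, this is roughly what that calling style looks like with the low-level SQS client, as far as I know its API. The wrapper names send_one and send_many are mine, and the client is passed in as a parameter so the sketch does not need AWS credentials to read or test. The keyword names QueueUrl, MessageBody, Entries and Id are the service's own CamelCase names showing through.

```python
def send_one(sqs_client, queue_url, body):
    """Send a single message; note the CamelCase keyword arguments."""
    return sqs_client.send_message(QueueUrl=queue_url, MessageBody=body)


def send_many(sqs_client, queue_url, bodies):
    """Send several messages in one call via the Entries argument."""
    entries = [
        {"Id": str(index), "MessageBody": body}  # Id only has to be unique within the batch
        for index, body in enumerate(bodies)
    ]
    return sqs_client.send_message_batch(QueueUrl=queue_url, Entries=entries)
```

With real credentials you would pass boto3.client("sqs") as sqs_client. A hand-written Python API would more likely be send(queue, "hello"), which is the complaint.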
For some reason the data you send to eBay's REST APIs doesn't always show up in their UI...specifically, if you use their RESTful Inventory API, you can't see _any_ of it from the Selling page! Not sure why...e-commerce APIs seem to be (on average) remarkably bad.
Documents will only be good if there is a strong review process by someone other than the devs who wrote the code. MS used to do that to a great degree until fairly recently. Amazon isn't big on TPMs unfortunately.
I'm currently working with this API, and I think that if I didn't have experience with Amazon's old seller API (MWS), I wouldn't even know where to start with this one. As it is, it's taken me many, many hours to figure out how to make this one (sorta) work.
(Shameless plug: I wrote detailed walkthroughs of how to get access to[0] and build a basic application with[1] the Selling Partner API).
I won't touch any of amazon's products with a 10ft pole because of the way they treat their warehouse workers. Making their warehouse employees pee in bottles to make their metrics or risk being fired, that's just on a whole new level of extracting every joule of work out of a human being.
I agree with you but that's not unique to Amazon and nor is it a modern problem. More than 20 years ago, I was pissing in empty bottles next to trucks to make my quotas when throwing boxes for FedEx when I was in university. Ditto when I threw boxes at an Abercrombie & Fitch / Hollister distribution warehouse in university. Same story when I worked construction jobs. Same story when I was a teenager, picking corn and beans by hand in the fields near where I grew up. In fact, of the over a dozen manual labor jobs I've had in my life, the only one where I had convenient and frequent access to a restroom facility was a Thomson Consumer Electronics glass factory.
It's been surprising to see how much better people tend to view Amazon vs. Walmart when Amazon is just the next iteration of megacorp retail consolidation in the same vein with an increasingly horrible record of employee exploitation and abuse.
That's because we've had a good 15 years of tweets and articles from tech luminaries and op-ed writers that used their shopping experience at Amazon as a stick to beat Walmart with. For the most part, those thought pieces have always been written by white-collar people who had no idea of the human toll that two-day shipping takes.
Their entire experience w/ Amazon starts at the website and ends with the brown box with the smile on it.
When even one cent difference in cost equals millions in profit, it's much easier to pretend you are dealing with numbers and not people.
I am not defending anyone, only pointing out that maybe we shouldn't have entities so big that dissociating mid- to high-level management decisions from reality is so easy, and where guilt and responsibility are pushed down the ladder on a daily basis.
And the conditions have only been improving, and with more robots, the number of warehouse workers will only keep shrinking. Drivers, on the other hand, are a different story.
If you have had the misfortune of interacting with Amazon Seller Central or Vendor Central as a merchant, as I sadly have, this won’t come as a surprise. It boggles my mind how bad, buggy and user-hostile that software is. The fact that Amazon keeps moving forward with an IT estate as bad as it is amazes me.
I ran a small manufacturing business that listed on Amazon in both Seller Central and Vendor Central. I was just interacting with it as a "common user" for 5 years. Never tried to develop against the API - we did about £300,000 per year through the platforms.
This is my experience, too. I was a new paying user of Amazon Seller Central, confused by the UI and missing documentation, and they simply ignored my ticket.
I once worked with a financial product that used CORBA for distributed processing. One chapter of the dev guide had examples of use that were... Let's say not possible with CORBA... I opened a ticket with the vendor, because of course we were paying a large purchase percentage maint fee. I was not part of vendor selection. The official response from the vendor was "please remove that chapter from the dev guide.". Beautiful!
Before you hate on CORBA, I submit that my master's thesis was based on CORBA for distributed processing and that, "horror of horrors", it had solutions for many of today's issues, albeit implemented with tooling that was lacking. Anyone for a game of "which vendor's RPC implementation is this, based on packet dumps"?
This actually exists in some places! Check out moduledocs and "doctests" for languages like Elixir. It was something really cool that I liked while learning Elixir.
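Python ships the same idea in its standard-library doctest module, so here is a small sketch in Python rather than Elixir. The function slugify is just a made-up example; the point is that the example in the docstring is executed and compared against the real output, so it cannot silently rot.

```python
import doctest


def slugify(title):
    """Turn a title into a lowercase, hyphen-separated slug.

    >>> slugify("Hello World")
    'hello-world'
    >>> slugify("  Amazon   Seller API ")
    'amazon-seller-api'
    """
    return "-".join(title.lower().split())


if __name__ == "__main__":
    # Runs every example found in this module's docstrings and reports mismatches.
    doctest.testmod()
```

If someone changes slugify without updating the docstring, the doctest run fails, which is the property the documentation in this thread is missing.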
If you write a unittest in D underneath a function, it will automatically be included in the generated docs as an example. This is how all the examples in the standard library documentation are created, so you know for a fact that those work.
I seem to recall a talk given at one point that demonstrated a scheme that would reverse engineer functions out of whole cloth based only on docstrings and doctests, but of course I can't find it now.
Use contracts and property-based testing. See the Hypothesis library in Python (and in other languages). I wrote a library for contracts in Python (http://github.com/Parquery/icontract, see its readme for further references to other libraries).
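To show the two ideas without pulling in any dependency, here is a hand-rolled stand-in using only the standard library. To be clear, this is neither the Hypothesis API nor the icontract API: the require decorator, largest, and check_largest_property are names invented for the sketch, and the real libraries do far more (shrinking failing inputs, postconditions, invariants).

```python
import functools
import random


def require(predicate, message):
    """Toy precondition decorator: reject calls whose arguments fail `predicate`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not predicate(*args, **kwargs):
                raise ValueError(f"precondition failed: {message}")
            return func(*args, **kwargs)
        return wrapper
    return decorator


@require(lambda items: len(items) > 0, "items must not be empty")
def largest(items):
    """Return the largest element of a non-empty list."""
    return max(items)


def check_largest_property(trials=200, seed=0):
    """Toy property test: for random inputs, the result is in the list and is an upper bound."""
    rng = random.Random(seed)
    for _ in range(trials):
        items = [rng.randint(-1000, 1000) for _ in range(rng.randint(1, 20))]
        result = largest(items)
        assert result in items
        assert all(result >= item for item in items)
    return trials
```

The contract documents what callers must guarantee, and the property check states what the function guarantees back, both in a form that is executed rather than merely written down.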
Internally to $WORK I've got a documentation system consisting of interactive F# notebooks, put together automatically from marked-up F# code (so you get type-checking while you write them) with output assertions checked in the CI pipeline for the corresponding library (and before a pipeline pushes out the latest version of the documentation). Hoping to open-source the system if I ever get time; I think Mathematica's documentation system is absolutely first-class, and I want that in any language that supports notebook-style interfaces!
It's far less extensive than what you presented, but on the current project we are working on at work, we are using https://github.com/swaggo/swag
It's somewhat specific to golang but so far it has been relatively good for us.
The spec sits directly next to the code as code comments, so it has a far better chance of being updated when changes are made than external documentation does.
Also, the payloads and responses are directly derived from the Golang structs, so, at least in that respect, changes in the code are automatically reflected in the spec (apart from descriptions and examples).
We also put the interactive documentation directly in our API under /doc. It is a useful tool for developers when implementing a new url/handler or modifying an existing one. It also creates incentives to keep the documentation matching.
We have had very few mismatches between the spec and the actual implementation, despite not having deeply tested it (be it manually or automatically). Apart from one or two mismatches that were fixed quickly, I was able to take the spec, generate client libraries from it (swagger-codegen), and use them for quite extensive demos without issues.
We are still early in the project (not in production yet), and there are definitely some aspects we need to improve (integration/automated tests to be sure doc and code are 100% matching) or to completely figure out (ex: how to handle several versions of the same API). But overall, using swaggo/swag has been a pleasant experience.
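The point above about payloads being derived from the structs is the part that keeps docs and code from drifting, so here is a toy version of that idea in Python. This is only an analogy, not how swaggo/swag works internally: the CreateOrderRequest class, the schema_for helper and the type mapping are all made up for the sketch.

```python
from dataclasses import MISSING, dataclass, fields

# Maps Python type names to JSON-schema-style type names (only what the example needs).
TYPE_NAMES = {"int": "integer", "str": "string", "float": "number", "bool": "boolean"}


@dataclass
class CreateOrderRequest:
    sku: str
    quantity: int
    gift: bool = False


def schema_for(cls):
    """Derive a small schema dict from a dataclass, so the doc follows the code."""
    properties = {}
    required = []
    for field in fields(cls):
        # field.type is the class itself, or its name if annotations are stored as strings.
        type_name = getattr(field.type, "__name__", field.type)
        properties[field.name] = {"type": TYPE_NAMES[type_name]}
        if field.default is MISSING and field.default_factory is MISSING:
            required.append(field.name)
    return {"type": "object", "properties": properties, "required": required}


if __name__ == "__main__":
    print(schema_for(CreateOrderRequest))
```

Add a field to the class and the generated schema changes with it; only the prose descriptions and examples still have to be maintained by hand.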
I remember having some Microsoft documentation in the mid 90s for using DDE (an early Windows way of passing messages to different running applications).
I forget the specifics of how the example code worked but it acted as both a sender and receiver of DDE messages and did so by executing itself. I remember spending a good 30 minutes trying to work out how the example worked before giving up and doing the classic “let’s just run it and see what happens” approach.
The example code from Microsoft turned out to be a fork bomb and it quickly crashed my machine.
It took me a long time to trust Microsoft documentation again after that incident.
Having coded with MWS, I can confirm. I don't remember the details but there was a typo in one of the PHP examples leading all API calls to fail. Easily fixed but should I really have to edit the provided example code to get something working?
Typo in the example? Mr. One-Up here to bring up HP Fortify’s (static analysis tool) API that had typos in the named parameters. But no typo in the named return value. So you would pass in a value for “fuebar”, and then look for “foobar” on the return. As one of many examples.
My first job out of college was working on Fulfillment by Amazon, which was part of sellers central. The place was a huge mess technically, with like three different authoritative databases, and services constantly trying to second guess each other. Everything was in Java and there was no clear central direction. I think this is probably the main drawback to Bezos' cloud of services mandate.
All you've done is deploy the existing documentation on your own service and change the links on the GitHub repository to point to your service rather than the GitHub documentation (with no indication of how one would keep this in sync with updates).
Sure, it makes navigating this version of the docs slightly easier, but this doesn't seem to be a good-faith attempt at actually improving the documentation so much as promoting your service.
See, the issue is that most probably this person is not Amazon's customer. Their employer is.
In which case either Amazon is or isn't important to them. If it is, they can just tell the person to get over it and do their job. If it isn't then they aren't an important customer for Amazon either.
In neither case is there any incentive for Amazon to improve.
Such is life with a lot of enterprise software in my experience.
I hope the software industry realizes that good documentation is more valuable than good code. Code is transient, and even good code today will be called techdebt tomorrow when the next language or framework or library shows up. But bad documentation impedes your ability to write good code, your ability to work with other parts of the organization, and your ability to grow your business by having outsiders integrate with you (as in this case). It also has the effect of turning code into techdebt, by causing people to write new implementations of code they otherwise could have used, because they have no documentation for it.
One step we could take is in hiring. When interviewing developers, let's ask them to write documentation, not just code.
I thought I was the only one! When I look around at the code examples using OpenCV it is clear that nobody knows how to use it. God help you trying to get your GPU working on Python.
matplotlib is esoteric (if you didn’t already have psychic scars from MATLAB experience and thus already speak their esolang), but at least it documents kwargs and links out to source and doesn’t just give you out-of-date snippets in broken English.
OpenCV can’t be assed to link out to source, and also won’t even bother telling you what the function signatures are, what the names of all the methods are, or anything else useful.
I've written code against Amazon's MWS (btw they have a brand new RESTful API available now) and just for the record the eBay and USPS APIs are similarly "not stellar".
Is it funny that when I first read the title I thought the author was talking about Apple's Swift documentation? I suppose funny is not the correct word - more like sad.
I would agree if this linked to a small project or someone's personal open-source project.
However, this is the repository to the public documentation of Amazon's API.
This is not the source code to a website somewhere. This is the actual documentation. You ask Amazon for the documentation and they link you to that repository.
This is a company with a revenue of $96.1 billion that relies on sellers to fill its marketplace. You would expect some level of quality.
We had an issue at work with their Seller API. We ended up having to email and call their support daily in order for them to switch an invisible flag on our account. After a month and a half of phone calls, they eventually fixed it.
This is the kind of support you receive from Amazon when you are a seller.
I wonder at the potential competitive points to Amazon. If you could largely copy logistical, reliability and marketplace features what would allow someone to start competing using a similar model?
I agree that this issue was likely not helpful except to shame Amazon. Sometimes publicity of things like this gets change that calm requests will not.
That said, I wrote documentation professionally for a few years in my career, and recently re-worked much of CRACO's documentation, and there is very little love for docs.
Docs are hard to do well and completely and if the API is as bad as described, you can only shine up a turd so much in explaining how something works.
Sometimes writing documentation is the only way someone realizes that the software is broken or not actually useful as implemented.
In most situations, a product manager is looking out for this in advance, but when the product is purely an API you have fewer product-type people who can provide useful management.
Obviously it is not a revenue problem at Amazon. Perhaps this is anger at Amazon's astonishing success redirected to areas of Amazon that have failed to realize the resources to make quality products.