“This is the worst documentation I have ever seen in my life” (github.com/amzn)
307 points by paradisevoicez on Dec 14, 2020 | 105 comments



~13 years ago (!) I worked with the Amazon Seller API and left this comment in my own source code:

    // the XML returned from this request is *mind-bogglingly* bad. Terrifyingly bad.
    // a completed batch looks like this:
    // <Batch>batchid=363777811 status=Done dateandtime=09/18/2007 09:53:10 PDT activateditems=335 numberofwarnings=0 itemsnotacivated=17 </Batch>
    // and an incomplete batch like:
    // <Batch>batchid=363778361 status=In Progress </Batch>
    // so we'll just parse each item as a regex. Thanks Amazon.
The documentation at the time was just a forum post that was later removed, so it no longer exists at all. And this was just one of many horrors.
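
For anyone facing a payload like this, the regex approach described above might look roughly like the following in Python. This is a minimal sketch, not the original code: the name `parse_batch` is made up, and it assumes keys never contain spaces, so a value runs until the next `key=` token or the end of the text.

```python
import re
import xml.etree.ElementTree as ET

# A key is a run of word characters followed by "="; the value lazily runs
# until the next " key=" token or the end of the text, so values that contain
# spaces ("In Progress", "09/18/2007 09:53:10 PDT") stay in one piece.
PAIR_RE = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")


def parse_batch(xml_text):
    """Return the key=value pairs inside a <Batch> element as a dict of strings."""
    body = (ET.fromstring(xml_text).text or "").strip()
    return {key: value for key, value in PAIR_RE.findall(body)}


done = parse_batch(
    "<Batch>batchid=363777811 status=Done dateandtime=09/18/2007 09:53:10 PDT "
    "activateditems=335 numberofwarnings=0 itemsnotacivated=17 </Batch>"
)
pending = parse_batch("<Batch>batchid=363778361 status=In Progress </Batch>")
```

All values come back as strings; converting counts or dates is left to the caller, since the format gives no type information.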


I kid you not, I know exactly what you're talking about there. I also wrote my own regex parsers with fast-forward and rewind tokenizers, because it was so dreadful. I wouldn't normally do that.

There are also 4 (4!!!) different ways of returning errors in the MWS Feed APIs when used from Java. Given the number of entities, and the fact that the error paths are combinatorial, I had to write some of the craziest Scala code I've ever written, because the duplication was just too bad to handle for how critical that code was. To do it in a bulletproof way, error handling for MWS feeds inherently has 2^4 control flows * N entities = 16N code paths, at about 100-200 lines apiece, if you don't use some higher-order abstractions.
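
The comment doesn't say what the four error channels are, so the Python sketch below invents four placeholder checks purely for illustration (the commenter's code was Scala, and none of these names come from MWS). It only shows the general shape of a higher-order abstraction for this kind of problem: write each check once, then compose them into a single function that every entity can reuse, instead of hand-writing each of the 16 combinations per entity.

```python
# Four hypothetical error channels. Each check inspects a response dict
# (a stand-in for whatever the client library returns) and yields a
# message or None.
def transport_error(resp):
    status = resp.get("http_status", 200)
    return f"transport: HTTP {status}" if status >= 400 else None


def envelope_error(resp):
    code = resp.get("error_code")
    return f"envelope: {code}" if code else None


def report_error(resp):
    failed = resp.get("failed_items", 0)
    return f"report: {failed} item(s) failed" if failed else None


def exception_error(resp):
    exc = resp.get("exception")
    return f"exception: {exc}" if exc else None


def collect_errors(checks):
    """Compose independent checks into one function that returns every error found."""
    def run(resp):
        return [msg for msg in (check(resp) for check in checks) if msg is not None]
    return run


# Four independent yes/no channels give 2**4 == 16 combinations, but the
# composed function handles all of them through a single code path.
feed_errors = collect_errors(
    [transport_error, envelope_error, report_error, exception_error]
)
```

Calling `feed_errors(response)` for any entity type returns a list that is empty on success, so the per-entity code only has to decide what to do with that list.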


That looks like putting "<Batch>" and "</Batch>" around a legacy text format and calling it a day, to report that your service/API "supports XML" lol. Actually, SGML has mechanisms (shortrefs and "data tags") to parse strings like that as markup and could even infer "<Batch>" tags (but it still won't work satisfactorily with your example data).


I mean, Bezos did say people who didn't expose services as APIs would be fired...


I was kind of hoping for:

<Batch batchstate="batchid=363777811 status=Done dateandtime=09/18/2007 09:53:10 PDT activateditems=335 numberofwarnings=0 itemsnotacivated=17" />

:)


Reminds me of a description of MSFT's first XML-based Office file formats (not sure if it's true or was a joke), but it went something like:

<xml> <office-proprietary-binary-blob> wky4b5tlwybkjbb2... </office-proprietary-binary-blob> </xml>


I kind of remember a massive CDATA blob... but my memory must be playing tricks on me: Wikipedia shows some sample markup of the pre-2007 Microsoft Office (https://en.wikipedia.org/wiki/Microsoft_Office_XML_formats) and it doesn't look bad at all.

And it seems after 2007 they switched to an ECMA standard.


In all fairness, during the first XML migration, I'm sure internal MS folks were just as baffled at the previous format.


XML is an open standard, man. Just pretend you're Neo from the Matrix reading the screen saver.


I note with horror that "activated" is misspelt in "itemsnotacivated" but spelt correctly in "activateditems".


The text-based format in that implies lovely horrors. Now all it needs is sometimes-quoted items after "key=" (using and requiring different quotes, depending on which system you're speaking to) and at least 3 sources of "universal event ID" counter... just one "batchid" is too simple.


<Batch>batchid=363777811 status=Done dateandtime=09/18/2007 09:53:10 PDT activateditems=335 numberofwarnings=0 itemsnotacivated=17 </Batch>

Philosophically speaking, is this any different than 'syntactic salt' for a 'Batch' JSON object? ;)


Yes. It isn't JSON. Those are non-delimited key-value pairs.


Were they not using a standard XML serialization library?


Jesus Christ, just kick me in the balls rather than making me deal with that XD


Just an FYI: this is about Amazon's MWS API (Marketplace Web Service), which is used to interact with the Amazon part of Amazon, not the AWS side of Amazon.

Also, having worked on an OSS Ruby library for it, and having had to work with it quite a bit, I can confirm. The MWS APIs leave a lot to be desired, but they are an absolute gem compared to the eBay APIs.

My take on the problem is that these APIs were developed in the early to mid 2000s, when e-commerce was taking off, so they are architected using the tech and philosophies from back then (XML, SOAP, SOA, etc.). However, we are now in 2020 and people expect up-to-date, modern interfaces, APIs, and communication protocols; but these APIs can't be shut down or risk major breaking changes, because so much other stuff is built on top of them.

Really, at the end of the day, the problem is that so much crap on the Web is considered ephemeral, and so many businesses come and go so quickly, that the question of what to do about legacy tech on the web isn't much of an issue, because things don't last long enough to become legacy. The exceptions are some of the e-commerce APIs of big tech companies such as eBay and Amazon, which managed to thrive but are now faced with a challenge most people never face: how to migrate legacy web services.

EDIT: Updated. Apparently I was not reading closely enough: this is not the MWS API but the new REST version of their Seller APIs. That being said, I'd still be willing to wager a pretty penny that they slapped together a REST-to-XML translation layer on top of the MWS APIs and called it a day.


This is not the MWS API. The MWS API was indeed written more than 10 years ago, but it is much better than this new "REST" API.

This is the Selling Partner API, which was released a few months back and is intended to replace MWS.


It seems like more or less a wrapper around the MWS API. Many calls are identical.


I think you've explained the situation quite well.

I got my career started helping a lot of big companies transition from SOAP/WSDL/XML tech to JSON APIs in the late 2000's (writing API servers and client libraries in PHP, Ruby, Python, and then Node.js) and so thought about these technologies day in and day out for many, many years.

I think we are still missing a dead-simple way to do it all: a very basic Tree/Grid Notation where API requests and responses are equivalent to sending a spreadsheet back and forth (no syntax characters, meaning no quotes, ticks, brackets, parens, etc., just spacing for the encoding). Higher level types would be added onto that, but at the low level it's all about just keeping your data as simple and clean as possible.

Here's a demo I made in 2014 of the idea, which I think is as relevant as ever https://www.youtube.com/watch?v=DV4Nv23bBwQ&t=4s

And another look at it: https://treenotation.org/#apis


> Higher level types would be added onto that, but at the low level it's all about just keeping your data as simple and clean as possible.

And at the end we would call it RPC and give the binary protocol a better name, like protobuf or Cap'n Proto?


All RPCs would take a tree notation doc as input and return a tree notation doc as output. Tree notation supports no data structures and tree notation supports all data structures. APIs would use only the data structures they need. Binary protocols would be easy to optimize.


Agree with what you have said about the MWS API, but this repo is actually the updated REST based API - which appears to be mostly the same functions as the older API but with less documentation.


Don't worry, the docs and APIs/libraries for AWS are just as bad

Boto3 is eye-wateringly bad. Boto2 was written by someone with some knowledge of Python and usability. Boto3 is just "automatically generated" from their API.

It is not good. It is bad. The structure makes no sense. The API can be described as "the minimum that works".


I'm not here to defend AWS but I'm a bit puzzled by your comment about the Boto3 doco. Here's some sample usage for instance which clearly is something more than "automatically generated" https://boto3.amazonaws.com/v1/documentation/api/latest/guid... .


I believe they meant that the API bindings themselves are autogenerated, which I think is pretty likely given how baffling and un-Pythonic some of the decisions in boto3 are.

You're right that much of the documentation clearly isn't auto-generated. Of the many Amazon APIs and tools I've worked with, boto3 is among the better of them in terms of documentation.


Not the docs, the library code is (for the most part) automatically generated

This is actually a good example of how the thing works, you have to use MessageBody as an argument name to send the message. Or use the Entries argument

Those names and that usage style are far from Pythonic


They map to the underlying JSON calls that are made. It's like they took the top-level keys of the JSON and made them kwargs in the Python.
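
To make that mapping concrete, here is a small stand-in in Python. It is not boto3 and does not talk to AWS. `build_request` and the `Operation`/`Params` envelope are made up for illustration; only the argument names `QueueUrl` and `MessageBody` match what boto3's SQS `send_message` expects. The point is just that keyword arguments passed through untouched must be spelled exactly like the service's own keys, which is why they don't read like snake_case Python.

```python
import json


def build_request(operation, **kwargs):
    """Serialize an operation name plus its keyword arguments as a JSON body.

    Hypothetical stand-in for a generated binding: the kwargs are passed
    through unchanged, so their names have to match the wire format.
    """
    return json.dumps({"Operation": operation, "Params": kwargs}, sort_keys=True)


body = build_request(
    "SendMessage",
    QueueUrl="https://example.invalid/queue",  # placeholder, not a real queue
    MessageBody="hello",
)
```

A hand-written wrapper could translate `message_body` to `MessageBody` on the way out; a generated binding has no reason to, which is roughly the complaint above.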


eBay added a REST API. They still have the old XML one. They sometimes interact in fun ways!!!


For some reason the data you send to eBay's REST APIs doesn't always show up in their UI...specifically, if you use their RESTful Inventory API, you can't see _any_ of it from the Selling page! Not sure why...e-commerce APIs seem to be (on average) remarkably bad.


Even 40 years ago encoding things as free text such as this would have been very bad practice.

And don't get me started on encoding multiple items in a single csv column.


Those directed graph traversal interview questions really contributed to the quality of these documents.


Documents will only be good if there is a strong review process by someone other than the devs who wrote the code. MS used to do that to a great degree until fairly recently. Amazon isn't big on TPMs unfortunately.


So these convoluted hiring hazing quizzes really don't help? "I'm shocked, shocked."


I'm currently working with this API, and I think that if I didn't have experience with Amazon's old seller API (MWS), I wouldn't even know where to start with this one. As it is, it's taken me many, many hours to figure out how to make this one (sorta) work.

(Shameless plug: I wrote detailed walkthroughs of how to get access to[0] and build a basic application with[1] the Selling Partner API).

[0] https://jesseevers.com/selling-partner-api-access/ [1] https://jesseevers.com/spapi-first-application/


Amazon has locked the discussion as 'too heated' and then closed it. I am not even remotely impacted by these issues but that is absolutely hilarious.


I won't touch any of amazon's products with a 10ft pole because of the way they treat their warehouse workers. Making their warehouse employees pee in bottles to make their metrics or risk being fired, that's just on a whole new level of extracting every joule of work out of a human being.


I agree with you but that's not unique to Amazon and nor is it a modern problem. More than 20 years ago, I was pissing in empty bottles next to trucks to make my quotas when throwing boxes for FedEx when I was in university. Ditto when I threw boxes at an Abercrombie & Fitch / Hollister distribution warehouse in university. Same story when I worked construction jobs. Same story when I was a teenager, picking corn and beans by hand in the fields near where I grew up. In fact, of the over a dozen manual labor jobs I've had in my life, the only one where I had convenient and frequent access to a restroom facility was a Thomson Consumer Electronics glass factory.


It's been surprising to see how much better people tend to view Amazon vs. Walmart when Amazon is just the next iteration of megacorp retail consolidation in the same vein with an increasingly horrible record of employee exploitation and abuse.


That's because we've had a good 15 years of tweets and articles from tech luminaries and op-ed writers that used their shopping experience at Amazon as a stick to beat Walmart with. For the most part, those thought pieces have always been written by white-collar people who had no idea of the human toll that two-day shipping takes.

Their entire experience w/ Amazon starts at the website and ends with the brown box with the smile on it.


When even a one-cent difference in cost equals millions in profit, it's much easier to pretend you are dealing with numbers and not people.

I am not defending anyone, only pointing out that maybe we shouldn't have entities so big that dissociating mid- to high-level management decisions from reality is so easy, where guilt and responsibility are pushed down the ladder on a daily basis.


fair but wildly off-topic.


Off-topic I'll give you, but wildly?

That strikes me as wildly hyperbolic.


no no the topic of this post is "amazon bad"


amazon technical != amazon retail

And the conditions have only been improving, and with more robots, the number of warehouse workers will only keep shrinking. Drivers, on the other hand, are a different story.


If you have had the misfortune of interacting with Amazon Seller Central or Vendor Central as a merchant, as I sadly have, this won’t come as a surprise. It boggles my mind how bad, buggy and user-hostile that software is. The fact that Amazon keeps moving forward with its IT estate as bad as it is amazes me.


They get away with it because there is no competition. Well, there is eBay, but that has an even worse API.

What are you going to do? Deal with it or decide to not sell your products?


I ran a small manufacturing business that listed on Amazon in both Seller Central and Vendor Central. I was just interacting with it as a “common user” for 5 years. I never tried to develop against the API; we did about £300,000 per year through the platforms.


This is my experience, too. I was a new paying user of Amazon Seller Central, confused by the UI and missing documentation, and they simply ignored my ticket.


I once worked with a financial product that used CORBA for distributed processing. One chapter of the dev guide had examples of use that were... let's say not possible with CORBA. I opened a ticket with the vendor, because of course we were paying a large purchase-percentage maintenance fee. (I was not part of vendor selection.) The official response from the vendor was "please remove that chapter from the dev guide." Beautiful!

Before you hate on CORBA, I submit that my master's thesis was based on CORBA for distributed processing and that, "horror of horrors", it had solutions for many of today's issues, albeit implemented with tooling that was lacking. Anyone for a game of "which vendor's RPC implementation is this, based on packet dumps"?


We need a way to unit test documentation to see if the code snippets still compile and the things that are referenced still exist.


This actually exists in some places! Check out moduledocs and "doctests" for languages like Elixir. It was something really cool that I liked while learning Elixir.


If you write a unittest in D underneath a function, it will automatically be included in the generated docs as an example. This is how all the examples in the standard library documentation are created, so you know for a fact that those work.



This is actually something from Python's stdlib: https://docs.python.org/3/library/doctest.html

Really underrated.
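
A minimal sketch of what that module does, for those who haven't seen it: the examples in a docstring are executed and their output is compared with what the docstring claims. The function `parse_pairs` and the helper `run_doc_examples` are invented for this illustration; the `doctest` classes they use are from the standard library.

```python
import doctest


def parse_pairs(text):
    """Split space-separated key=value pairs into a dict.

    >>> parse_pairs("batchid=42 status=Done")
    {'batchid': '42', 'status': 'Done'}
    >>> parse_pairs("")
    {}
    """
    return dict(item.split("=", 1) for item in text.split())


def run_doc_examples(func):
    """Run the examples in func's docstring and return doctest's TestResults."""
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    return doctest.DocTestRunner(verbose=False).run(test)


results = run_doc_examples(parse_pairs)
```

In an ordinary module you would more typically call `doctest.testmod()` or run `python -m doctest yourfile.py` in CI, so that a docstring example that drifts from the code fails the build.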


I seem to recall a talk given at one point that demonstrated a scheme that would reverse engineer functions out of whole cloth based only on docstrings and doctests, but of course I can't find it now.


My job is mostly Python. Did not know this existed.


Use contracts and property-based testing. See the Hypothesis library in Python (and other languages). I wrote a library for contracts in Python (http://github.com/Parquery/icontract, see its readme for further references to other libraries).


This is why I prefer to link to actual code in the repo, in an examples directory, which gets compiled during every build.


Unfortunately that doesn't help if the documentation is written by a different team and in a different format.

Getting out of sync happens very quickly if docs aren't generated from source and aren't an integral part of the release process.


This is not only simple and comprehensive but also doesn't require any special support from the tooling or language. Love it!


That's definitely something that's still missing in technical documentation. The tooling still isn't there yet, and that's sad.

Every change in API or behaviour should be automatically blocked by CI tooling in release builds if its documentation is missing or outdated.

But alas, even in 2020 technical documentation is still just an afterthought for most companies - even big ones.


Internally to $WORK I've got a documentation system consisting of interactive F# notebooks, put together automatically from marked-up F# code (so you get type-checking while you write them) with output assertions checked in the CI pipeline for the corresponding library (and before a pipeline pushes out the latest version of the documentation). Hoping to open-source the system if I ever get time; I think Mathematica's documentation system is absolutely first-class, and I want that in any language that supports notebook-style interfaces!


My company is currently trialing this with the Open API Spec. The workflow at a high level:

1) Make code changes to a specific microservice. (note: The Open API Spec also lives in code.)

2) CI/CD pipelines get triggered.

3) New microservice is built.

4) Call every REST endpoint defined in the microservice's Open API Spec and validate it using the example requests/responses.

5) If successful, regenerate the API documentation from the Open API Spec.

This ensures that our documentation and services will stay in line with each other.
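
Step 4 of the workflow above can be sketched in a few lines of Python. Everything here is a simplified assumption rather than the commenter's actual pipeline: `SPEC` is a heavily trimmed, hypothetical OpenAPI-style document, `fake_service` stands in for real HTTP calls to the freshly built service, and real specs would use templated paths, request bodies, and schema validation rather than exact equality.

```python
# One example response per operation, stored where OpenAPI 3 keeps
# media-type examples (responses -> status -> content -> media type -> example).
SPEC = {
    "paths": {
        "/health": {
            "get": {
                "responses": {
                    "200": {"content": {"application/json": {"example": {"status": "ok"}}}}
                }
            }
        },
        "/batches/363778361": {
            "get": {
                "responses": {
                    "200": {
                        "content": {
                            "application/json": {
                                "example": {"batchid": 363778361, "status": "In Progress"}
                            }
                        }
                    }
                }
            }
        },
    }
}


def fake_service(method, path):
    """Stand-in for calling the built service; returns (status code, JSON body)."""
    routes = {
        ("get", "/health"): (200, {"status": "ok"}),
        ("get", "/batches/363778361"): (200, {"batchid": 363778361, "status": "In Progress"}),
    }
    return routes.get((method, path), (404, {}))


def find_mismatches(spec, call):
    """Call every documented operation; list those whose response differs from its example."""
    mismatches = []
    for path, operations in spec["paths"].items():
        for method, operation in operations.items():
            for status, response in operation["responses"].items():
                expected = response["content"]["application/json"]["example"]
                actual_status, actual_body = call(method, path)
                if str(actual_status) != status or actual_body != expected:
                    mismatches.append((method, path, status))
    return mismatches


# Step 5 (regenerating the docs) would only run when this is True.
docs_are_current = not find_mismatches(SPEC, fake_service)
```

The useful property is that the documented examples become the test fixtures, so an example that no longer matches the service blocks the documentation build instead of being published.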


It's far less extensive than what you presented, but on the current project we are working on at work, we are using https://github.com/swaggo/swag

It's somewhat specific to golang but so far it has been relatively good for us.

The spec is directly next to the code as code comments, so it has far more chances to get updated when changes are made compared to an external documentation.

Also, the payloads and responses are directly derived from the Golang structs, so, at least on that aspect, changes in the code are automatically reflected in the spec (apart from descriptions and examples).

We also put the interactive documentation directly in our API under /doc. It is a useful tool for developers when implementing a new URL/handler or modifying an existing one. It also creates incentives to keep the documentation matching.

Overall, we had very few mismatches between the spec and the actual implementation, despite not having deeply tested it (be it manually or automatically). Apart from one or two mismatches that were fixed quickly, I was able to take the spec, generate client libraries from it (swagger-codegen), and use them for quite extensive demos without issues.

We are still early in the project (not in production yet), and there are definitely some aspects we need to improve (integration/automated tests to be sure doc and code are 100% matching) or to completely figure out (ex: how to handle several versions of the same API). But overall, using swaggo/swag has been a pleasant experience.


obligatory "Rust docs can do this"[0] (quite a nice feature tbh).

[0] https://doc.rust-lang.org/rustdoc/documentation-tests.html


Python can too, and I think most other languages are inspired by it.

https://docs.python.org/3/library/doctest.html


oh cool, didn't know that was where it came from, thanks for the pointer!


Obligatory reference to https://en.wikipedia.org/wiki/Web_(programming_system).

But yeah, Python made it usable :)


We do this for our project (https://hail.is) and it's a game changer. Saves so much time and so many bug reports.


Can you specify how exactly you are doing it? Very interested.


Comes standard with Rust


I remember having some Microsoft documentation in the mid 90s for using DDE (an early Windows way of passing messages to different running applications).

I forget the specifics of how the example code worked but it acted as both a sender and receiver of DDE messages and did so by executing itself. I remember spending a good 30 minutes trying to work out how the example worked before giving up and doing the classic “let’s just run it and see what happens” approach.

The example code from Microsoft turned out to be a fork bomb and it quickly crashed my machine.

It took me a long time to trust Microsoft documentation again after that incident.


Having coded with MWS, I can confirm. I don't remember the details but there was a typo in one of the PHP examples leading all API calls to fail. Easily fixed but should I really have to edit the provided example code to get something working?


Typo in the example? Mr. One-Up here to bring up HP Fortify’s (static analysis tool) API that had typos in the named parameters. But no typo in the named return value. So you would pass in a value for “fuebar”, and then look for “foobar” on the return. As one of many examples.


My first job out of college was working on Fulfillment by Amazon, which was part of Seller Central. The place was a huge mess technically, with like three different authoritative databases, and services constantly trying to second-guess each other. Everything was in Java and there was no clear central direction. I think this is probably the main drawback to Bezos' cloud of services mandate.


I know it doesn't solve all the aspects of how broken this documentation is: but here's a pull request that somewhat tries to address it.

The Pull Request:

https://github.com/amzn/selling-partner-api-docs/pull/209

The Deployed Docs:

https://docs.contour.so/amzn/selling-partner-api-docs

Lots of bugs still to fix but hopefully it's helpful to anyone. Super open to any feedback!

---- Edit: To make it clear, the docs are built on a web app I built while on break from school. Sorry if that wasn't clear before!


All you've done is deploy the existing documentation on your own service and change the links on the GitHub repository to point to your service rather than the GitHub documentation (with no indication of how one would keep this in sync with updates).

Sure, it makes navigating this version of the docs slightly easier, but this doesn't seem to be a good-faith attempt at actually improving the documentation so much as promoting your service.


Holding my two hands up--I saw this post on HN and wanted to help, and using a web app that I've built seemed to be the logical step.

But yeah, I could definitely see how it would come across this way, and I'll revert the links back to the PR! Hope that helps.


You should be transparent about the fact that contour.so is your product.

Also, I do not think contour.so addresses any of the real concerns about the documentation.


that website breaks my browser back button


that's one way to improve your bounce rate


See, the issue is that most probably this person is not Amazon's customer. Their employer is.

In which case either Amazon is or isn't important to them. If it is, they can just tell the person to get over it and do their job. If it isn't then they aren't an important customer for Amazon either.

In neither case is there any incentive for Amazon to improve.

Such is life with a lot of enterprise software in my experience.


I hope the software industry realizes that good documentation is more valuable than good code. Code is transient, and even good code today will be called techdebt tomorrow when the next language or framework or library shows up. But bad documentation impedes your ability to write good code, your ability to work with other parts of the organization, and your ability to grow your business by having outsiders integrate with you (as in this case): it also has the effect of turning code into techdebt, by causing people to write new implementations of code they otherwise could have used, because they have no documentation for it.

One step we could take is in hiring. When interviewing developers, let's ask them to write documentation, not just code.


It’s worth fuck all because sellers will bang their head on a wall for 2 months until they work it out because they have no other option.


And now Amazon have locked the issue so others can't agree with the original poster. Don't fix the problem, just fix the person who mentions it!


can i introduce you to my friend openCV? :/


I thought I was the only one! When I look around at the code examples using OpenCV, it is clear that nobody knows how to use it. God help you trying to get your GPU working on Python.


Matplotlib...


Matplotlib is esoteric (if you didn’t already have psychic scars from MATLAB experience and thus already speak their esolang), but at least it documents kwargs and links out to source and doesn’t just give you out-of-date snippets in broken English.

OpenCV can’t be assed to link out to source, and also won’t even bother telling you what the function signatures are, what the names of all the methods are, or anything else useful.


I've written code against Amazon's MWS (btw they have a brand new RESTful API available now) and just for the record, the eBay and USPS APIs are similarly "not stellar".


This post is talking about said new RESTful API.


Heh. And this is actually an improvement.

The previous/legacy MWS API was actually way worse. Though the documentation may have been a bit more coherent.


At least they are still seeing documentations...


I always thought AWS has horrible documentation because the incentive structure is to sell you on premium support.


Seems that API documentation (in this case) is not terribly important to the bottom line :)

(So why should they bother fixing it?)


This was the case on the dev side of iTunes Connect and the App Store for longer than anyone at Apple would like to admit.

It was clear for a long time that developers were not considered customers, or if they were, they were far less important ones.


Some fleeting sense of pride in one's work, I would suppose. But yeah that doesn't move those dashboards.


That only works if the pride would be induced in the one allocating developer time.


<cough>Google's SDK C# Documentation</cough>


is it funny that when i first read the title i thought the author was talking about Apple's Swift documentation? I suppose funny is not the correct word - more like sad.


I feel like linking directly to issue pages incites brigading. This HN post seems like a direct attack on the maintainers (not the actual issue).

Look at the new comments posted since this link hit frontpage. Not helpful, and in fact HARMFUL to open source at large. This is irresponsible.


I would agree if this linked to a small project or someone's personal open-source project.

However, this is the repository to the public documentation of Amazon's API.

This is not the source code to a website somewhere. This is the actual documentation. You ask Amazon for the documentation and they link you to that repository.

This is a company with a revenue of $96.1 billion that relies on sellers to fill its marketplace. You would expect some level of quality.

We had an issue at work with their Seller API. We ended up having to email and call their support daily in order for them to switch an invisible flag on our account. After a month and a half of phone calls, they eventually fixed it.

This is the kind of support you receive from Amazon when you are a seller.


I wonder at the potential competitive points to Amazon. If you could largely copy logistical, reliability and marketplace features what would allow someone to start competing using a similar model?

- Seller relations and developer experience

- Marketplace and human resource ethics

- ?


I agree that this issue was likely not helpful except to shame Amazon. Sometimes publicity of things like this gets change that calm requests will not.

That said, I wrote documentation professionally for a few years in my career, and recently re-worked much of CRACO's documentation, and there is very little love for docs.

Docs are hard to do well and completely and if the API is as bad as described, you can only shine up a turd so much in explaining how something works.

Sometimes writing documentation is the only way someone realizes that the software is broken or not actually useful as implemented.

In most situations, a product manager is looking out for this in advance, but when the product is purely an API you have fewer product-type people who can provide useful management.

Obviously it is not a revenue problem at Amazon. Perhaps this is anger at Amazon's astonishing success redirected to areas of Amazon that have failed to realize the resources to make quality products.


Well, at least it's not EDI...



