Hacker News
Autodoc: Toolkit for auto-generating codebase documentation using LLMs (github.com/context-labs)
166 points by funfunfunction on March 25, 2023 | 86 comments



I have a conceptual problem with this. Documentation is meant to describe stuff that's not in the code. Sure, there's the odd occasion where you've done some super weird optimization where you want to say "Over here I've translated all my numbers into base 5, bit-reversed them and then added them; mathematically this is the same as just adding them in base 9, but it fits our custom logic cell better". But that's the exception. The general purpose of documentation is to describe why it's doing what it's doing, not how. Tell me that this module does X this way because that helps this other module do Y. Tell me why you've divided the problem this way. You're giving information about why certain design decisions were made, not just describing what they are.

It doesn't matter how good your LLM is; the information it needs to document simply isn't there. You're never going to get a comment out of this that says "This interface is meant to be backwards compatible with the interface Bob once wrote on a napkin in the pub on a particularly quiet Friday afternoon when he decided to reinvent Kafka".


Just give your LLM access to all your Slack chats and screenshots of all items found in the bins, and it would tell you what Bob had for breakfast too.


> and it would tell you what Bob had for breakfast too

Sure it would, if you asked. But then, it could be 100% wrong while giving you a very confident answer.


'accuracy' and 'truth' are legacy 0.1X concepts, move fast and break things


Like many an eyewitness in court


I agree with you that documentation should expose developer intent that isn't encoded in the codebase, but if the documentation at least lowers the bar to understanding the code, I believe it could have its merits.

Autodoc won't substitute for the need for engineers to document their work, but I believe that, especially in legacy codebases, it could help with the maintenance of otherwise hopeless code.


I've been using ChatGPT to write docs. Here's how I do it: I start by feeding it specs and examples from my project. Then I tell it the outline of the docs we are going to write. Then for each section, I tell it we're going to write that section and provide it with the subsections. Finally, in the prompt, I fill in all the key details it needs to hit, which is where I would tell it "Make a note that the interface was meant to be backwards compatible...". I had already defined ahead of time that a "Note" is an element in the docs with a little emoji in front and special styling, so it even formats it nicely.
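The section-by-section loop described above can be sketched as a small prompt builder. Everything here is illustrative (the function name, the "Note" convention wording, and the message shapes are assumptions, not any real tool's API):

```typescript
// Hypothetical sketch of the workflow above: one call per doc section,
// with the spec in the system message and key details in the user message.
type Message = { role: "system" | "user"; content: string };

function buildSectionPrompt(
  projectSpec: string,
  outline: string[],
  section: string,
  keyDetails: string[]
): Message[] {
  return [
    {
      role: "system",
      content:
        `You are writing documentation. A "Note" is a callout element ` +
        `with a little emoji in front and special styling.\n\n` +
        `Project spec:\n${projectSpec}`,
    },
    {
      role: "user",
      content:
        `Doc outline: ${outline.join(", ")}.\n` +
        `Write the "${section}" section. Key details to hit:\n` +
        keyDetails.map((d) => `- ${d}`).join("\n"),
    },
  ];
}
```

Each section then gets its own chat-completion call with these messages, so the model never has to hold the whole document at once.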


I have a similar docs workflow now without the emojis. :)

I give non-specific explanations about what I'm writing, the template, and the target audience. It's sped up my documentation time significantly: I can create high-quality documents in hours instead of days.


Can you show examples of the documentation you've created with this workflow? Or better yet, can you make a screencast of the process?


I think it depends a lot on what exactly is meant by ‘documentation’. If I’m looking at the man page for strlen, what it does is almost always the first thing I need to know. I’d go so far as to say that knowing what something does is almost always a prerequisite for understanding anything to do with ‘why’.


Yep. Comments are for why, not how. Docs, though, can be very dumb and still quite useful: "This is the class to access the Animals database. Use getAnimals to query for animals."


Just tested it on a side-project codebase.

Main impression: it hallucinates like crazy. I asked "How does authorization of an HTTP request work?" and it started spitting out an explanation of how the user's bcrypt hash is stored in an SQLite database and the token is stored in a Redis cache. There is no sign of SQLite or Redis whatsoever in this project.

In another query it started confidently explaining how the `getTeam` and `createTeam` functions work. There is no such entity, nor the word "team", anywhere in the codebase. To add insult to injury, it said that this whole team-handling logic is stored in `/assets/sbadmin2/scss/_mixins.scss`.

Another time it offered an extremely detailed explanation of some business-logic-related question, linking to a lot of existing files from the project, but it was completely off.

Sometimes it offered meaningful explanations but ignored the question. For example, I asked it to explain the relation between two entities and it started showing how to display one of them in an HTML template.

But I guess it's only a matter of time before tools like this become daily assistants. Seems invaluable for newcomers to a codebase.


Wrong documentation is even worse than no documentation. Without it you're at least forced to look at the code and validate your assumptions, and get a feel for the codebase. Wrong docs pointing you to tech that's not even used are going to be a mess.


Furthermore, software is essentially versioned by its documentation first. If it says it does something, people will depend on that, and it not doing that is a bug.


I wasted two hours yesterday tracking down a bug that was in the documentation, not the code. It's so frustrating. Maintaining good documentation is incredibly hard, especially when you're trying to document how your program interacts with other software, because now you have to change your documentation when a third party changes their code.


Alternatively, it's hyped up like crazy, the tech is inherently bad at information retrieval, and most of the people hyping it are trying to get in on a gold rush.

To be clear, I don't know the answer.


Hey if your project is public I would love to take a look if you don’t mind sharing a link


No, sorry, that was private project.


Most interesting part to me, the prompts:

https://github.com/context-labs/autodoc/blob/83f03a3cee62d6e...

> You are acting as a code documentation expert for a project called ${projectName}. Below is the code from a file located at \`${filePath}\`. Write a detailed technical explanation of what this code does. Focus on the high-level purpose of the code and how it may be used in the larger project. Include code examples where appropriate. Keep you response between 100 and 300 words. DO NOT RETURN MORE THAN 300 WORDS. Output should be in markdown format. Do not say "this file is a part of the ${projectName} project". Do not just list the methods and classes in this file. Code: ${fileContents} Response:


This basically matches my experience in trying to get it to do the right thing. BEING VERY EXPLICIT AND ANGRY WORKS TO REINFORCE A POINT. Specifically telling it to not do a thing it will otherwise do is often necessary.

The only part that surprises me is `Output should be in markdown format`. Usually being that vague results in weird variation in output; I'd have expected a formatted example in the prompt for GPT to copy.


It understands most things without being given examples in my experience. Being explicit is helpful. Being angry is likely inconsequential, but I can't say for sure since I never felt the need TBH. What I can say is that the bot has become spookily like a person, "someone" that I conceive as helpful, courteous and friendly, albeit sometimes wildly wrong in spite of the assertive tone in the response.

I'll probably get used to it over time, as I get a deeper sense of how it works, and how it differs from real persons. ATM the distinction is blurred.


There's a subreddit where people post angry prescriptive memos or notices from their terrible bosses, and this would fit right in.


Quick, someone make a project which gets ChatGPT to answer those memos


This has far surpassed my dystopian predictions of how people would misuse LLMs.

Self-spamming your own codebase with comments that are either obvious, misleading, or wrong was previously unfathomable to me.

Most people think I’m unrealistically pessimistic.

Well done.


Code is the ultimate reference for understanding a project, while documentation is often neglected, outdated, or incorrect. It can also be difficult to keep up to date.

An LLM may not fully comprehend the code like the original author, but it can offer a different perspective that may be valuable. The only argument I've seen against LLMs is that they may encourage laziness, but this is a flawed argument, similar to those made against the printing press, which was said to make people illiterate.

As a reader of the docs, does it require discipline to refer back to the code when needed? Yes, but this is no different from the discipline required to write documentation in the first place. There is a key difference, though: the discipline shifts from author to reader.


This just begs the question: why not generate the docs when someone reads the code instead? Why bother generating a bunch of half-assed docs today when you can wait and interrogate the code in a much more natural and fluid way using the same LLM when it matters?


SEO. When you ask an LLM “What’s a good library/tool I can use to accomplish X?”, the ones with some documentation already attached will rise to the surface and be easier for the consuming LLM to generate code samples for.


1. It's cheaper

2. Everyone can see the same thing

3. It can be utilized for search

4. You can do both

Perhaps most importantly

5. It can be reviewed


Because not everyone will have access to LLM technology, and even if they do, your project might not always be used in a setting where access is technically feasible.


People have been doing that for a long time. It's usually a form of malicious compliance to "document everything" policies.

https://www.reddit.com/r/ProgrammerHumor/comments/4ktp12/my_...


Just because people have been doing that for a long time doesn't mean we should make it more efficient.


Just because we shouldn’t doesn’t mean people won’t pay for the service.

My “Well done” was not meant sarcastically!


I doubt an LLM is more efficient than previous documentation generators.


Wait. This is already an industry?!?


But you see, developers have been self spamming their code with obvious, misleading, and/or wrong comments for decades. Especially with those pesky “everything must have a doc-comment” linters. Think of all the time ChatGPT will save doing it for them!


I have actually been using ChatGPT to write JSDoc for me.

It mostly needs only the function and the type definitions (TypeScript), it is usually very good at writing them, and it does save me time.
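For a sense of what this looks like: the input is typically just a typed function like the one below, and the JSDoc block is the kind of output the model produces. This example is hand-written for illustration, not actual model output:

```typescript
/**
 * Returns the animals whose species matches the given filter.
 *
 * @param animals - The full list to search.
 * @param species - Species name to match, case-insensitively.
 * @returns The matching animals, in their original order.
 */
function filterBySpecies(
  animals: { name: string; species: string }[],
  species: string
): { name: string; species: string }[] {
  return animals.filter(
    (a) => a.species.toLowerCase() === species.toLowerCase()
  );
}
```

Since the types already constrain what the function can do, the model has relatively little room to hallucinate here, which may be why this narrow use case works well.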


It can help to give an initial overview of a codebase, its structure, and the functionality it contains (with the usual risk of inaccuracies and confabulations), but it can’t supply the rationale and contextual information that good documentation provides, which can’t be derived from just the code. It also can’t distinguish between what is an implementation detail and what is part of an implicit interface contract.


your pessimism is hardly unwarranted. this will lead to rather lousy code documentation, and might be misused by folks. but on the other hand, i look at this as something that will be effective as a "summarizer" of sorts, describing already-written code that has poor or no documentation.



He now can use gpt to generate code, comments, comments in code reviews, and work on 3 jobs simultaneously.


Overemployment is ruining America.


It would be chef’s kiss perfect if another LLM did the code review.


maybe a "code formatter" that just asks an LLM if it can rewrite this code, but pretty?


I'm far, far more interested in having an LLM tell me where particular functionality is in a codebase, and how it works from a high level.

Autogenerating function documentation seems like such a low bar by comparison. It's like taking limited creativity and applying it with high powered tools.

Literally like asking for a faster horse.

Tell me how WebKit generates tiles for rasterizing a document tree. Show me specifically where it takes virtualized rendering commands and translates them into port specific graphics calls.

Show me the specific binary format and where it is written for Unreal Engine 5 .umaps so that I can understand the embedded information for working between different types of software or porting to different engines.

Some codebases are so large that it literally doesn't matter if individual functions are documented when you have to build a mental model of several layers of abstraction to understand how something works.


Completely agree. Explaining how systems work in plain English is much more valuable than just giving the inputs and outputs of individual functions. We want to understand how a system and its subsystems work, independently and interdependently.

We're not there yet with Autodoc; there is still tons of work to do.

If you haven't tried the demo, give it a shot. You might be surprised.


Have you considered finding a way to instead write a text editor plugin that allows me to talk to a GPT and ask questions about a codebase? This would be a serious technology that moves beyond the in-code documentation paradigm.


This is exactly what autodoc does in your terminal. It wouldn't be hard to package it as a VSCode plugin.


Documentation can help the LLM search, though. Layering of tasks is important.
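One common way to layer tasks like this is to embed the generated docs once and retrieve the nearest ones at query time. A toy sketch of that retrieval step (the vectors are placeholders; a real system would call an embedding model, and this is not a claim about autodoc's internals):

```typescript
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the path of the doc whose embedding is closest to the query.
function topDoc(
  docs: { path: string; embedding: number[] }[],
  query: number[]
): string {
  return docs.reduce((best, d) =>
    cosine(d.embedding, query) > cosine(best.embedding, query) ? d : best
  ).path;
}
```

The retrieved doc (rather than raw source) is then what gets stuffed into the question-answering prompt, which is the "layering" the comment refers to.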


Today, LLMs learn from a codebase that has mostly insightful comments from well-meaning humans.

In the future, the training sets will contain more and more automatically generated material that I believe will not be curated well, leading to a spiral of ever-declining quality.


I am not overly concerned by a future where AI spoils its own training sets.

Companies like OpenAI will keep injecting human-generated feedback into the training set.

A lot of the value is already in the RLHF today; see the OpenAI technical report.


Same reasoning suggests making a copy of Wikipedia before it’s too late :)


https://dumps.wikimedia.org/

It would be interesting to analyze it over the coming years, and maybe even past dumps, with these AI-detection tools.


Remember that dude who wrote the vast majority of the Scots Wikipedia while not being able to speak Scots? Yeah, just waiting for the first LLM to make headlines by hallucinating its own version of human history and publishing it to Wikipedia.


One thing I find worse than no docs is wrong docs.

It would be really cool if we could take code + docs, feed it into an LLM and get a determination of whether the code matches what's in the docs. It could also be a good way to evaluate the correctness of the generated docs from the linked tool (assuming it works).
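A minimal version of that check could be a single prompt that asks the model to list unsupported claims. A sketch (the wording and function name are invented for illustration, not from any existing tool):

```typescript
// Build a prompt asking an LLM to flag documentation claims
// that the accompanying code does not support.
function buildConsistencyPrompt(code: string, docs: string): string {
  return [
    "Below are a source file and its documentation.",
    "List every claim in the documentation that the code does not",
    "support. If there are none, reply with the single word CONSISTENT.",
    "",
    "Code:",
    code,
    "",
    "Documentation:",
    docs,
  ].join("\n");
}
```

Asking for a fixed sentinel word like CONSISTENT makes the happy path easy to detect programmatically, though the model's judgment on the mismatches would of course still need human review.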


If docs can be generated, they're not worth reading.


All docs are generated. Now some are generated by silicon.


I agree, and I think people are missing the point. The value of comments is to tell the reader things they can't get from reading the code, details of why the author did what they did. ChatGPT commenting its own code as it writes it makes sense, ChatGPT commenting others' code (or its own without context) is inherently guesstimating.


you're wrong, it can be extremely useful


Come talk to me when it can reverse-engineer as well as Ken Shirriff, develop a complete understanding of the whole codebase, and generate authoritatively accurate and useful output. Oh, and uncover bugs while it's at it.


This still wouldn’t be enough, because writing good documentation usually requires knowing contextual information not contained in the code base.


Oh, so it's only interesting when it has completely surpassed even the smartest human in intelligence? When that happens, why would anybody bring it to you? When that happens, why would humans be needed for anything?


Perhaps I was a little dramatic, but my impression is today's AI is not well suited for this task. What I laid out is slightly above my expectations for a human who does the job for me today (yes, my bar is fairly high) and the point is I won't get excited until the shiny new tech you want to replace that human with does a better job than they do.

In the meantime I'd prefer not to be subjected to served-from-a-can, machine-generated content in the code I'm trying to grok.


> Come talk to me when it can…

I am afraid no one will come talk to you when that happens.


New tech always needs early adopters. You do not have to be one of them, but do not dismiss them either.


Are you really grandstanding against an AI model rn


How do you verify the meaning of the docs? How do you deal with model hallucinations?

It would be hell to lose trust in API docs due to those risks.


The way I’m thinking of it is as a junior engineer who can go do busy work for me. I’m not going to accept their “PRs” without a review. Even if it gets me 75% of the way there, that’s still a big time savings for me.


Hallucination is definitely a problem but can be somewhat mitigated by good prompting. GPT-4 seems less prone to hallucination. This will be better over time.

You can view the prompts used for generating docs here[1] and the prompts used for answering questions here[2]

[1] https://github.com/context-labs/autodoc/blob/master/src/cli/...
[2] https://github.com/context-labs/autodoc/blob/master/src/cli/...


Very good point, but easily solved - just tag the docs as being generated by GPT-4, and make sure whoever reads them knows it.


That doesn't solve the problem.

Documentation of unknown quality is useless noise.

People don't understand the unfathomable amount of garbage that's going to be generated by all these models. It doesn't matter how accurate they are; the lack of awareness of that remaining percentage of inaccuracy is going to create false confidence and cause errors to compound like mad.





Generally these documents wouldn't be used directly. You can use the `doc q` command to query these documents with natural language questions.


Please run this on the BuildKit codebase, which has almost no comments but a huge usage footprint. It's also (obviously) a non-trivial test case.


It was funny how for years the only documentation of BuildKit was some .md file in the middle of that repo with all the magic incantations for running apt install with cached layers... to be fair, I actually found it clearer than the real Docker docs.


Hi there, creator of autodoc here.

You can do it yourself and make a pull request back to the BuildKit codebase :)

If it gets merged, everyone who uses BuildKit will have access.


Yea, not going to do it myself; it would be better if you did. As I said, it's a good test to find where your tool provides value or breaks down.


I'm more interested in LLMs getting to the point where they can look at the several hundred codebases in your company and tell you who sets what value in your local data model, and why they set it. There's always Slack instead of poorly generated documentation.


This is extremely interesting. I have a few monstrous repositories I’d like to try it on.

The thing I’m wondering about is the cost. How much would it cost to run this on the entire WordPress source, for example?


How many pages? You can get an estimate for how much it would cost using the `estimate` command in Autodoc.
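As a rough sanity check on where such a number comes from: the dominant cost is essentially (tokens sent to the model) × (price per token). A back-of-the-envelope sketch, assuming roughly 4 characters per token and a hypothetical flat price per 1K tokens (these assumptions are mine, not autodoc's actual formula):

```typescript
// Back-of-the-envelope documentation cost estimate.
// Assumes ~4 characters per token and a flat per-1K-token price;
// both numbers are illustrative assumptions.
function estimateCostUSD(
  totalChars: number,
  pricePer1kTokens = 0.03 // hypothetical GPT-4-era prompt price
): number {
  const tokens = totalChars / 4;
  return (tokens / 1000) * pricePer1kTokens;
}
```

So a repository with a few million characters of source would land in the tens of dollars, which is consistent with the ballpark quoted below for a large codebase.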


I’m not prepared to install this at the moment, but if you were to give an example of costs for what I suggested above, it might convince me to try it out


Running the `doc estimate` command on the WordPress repository says it will cost $58.47. The estimates are usually within +/-10% of the actual cost.


That seems very reasonable! Thank you for testing it, I’ll give it a try on some of my own repos soon


It does not seem reasonable for such an uncertain result.


Very cool stuff.

I think people who dismiss this kind of tool because it can hallucinate stuff are off topic.

The AI will get better and better, but more importantly, we will evolve and learn to work with this kind of tool.


Granting that some hallucination is likely, still seems like a step in the right direction.



