
Agreed, and its larger context window is fantastic. My workflow:

- Convert the whole codebase into a string

- Paste it into Gemini

- Ask a question

People seem to be very taken with "agentic" approaches, where the model selects a few files to look at, but I've found it very effective and convenient just to give the model the whole codebase and then have a conversation with it, get it to output code, modify a file, etc.
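For anyone who wants to try the "convert the whole codebase into a string" step by hand, here is a minimal sketch in Python. The `pack_codebase` name, the skip list, the suffix list, and the `=====` header format are all my own assumptions for illustration, not anything Gemini requires.

```python
import os
from pathlib import Path

# Hypothetical defaults: adjust both sets to match your own project.
SKIP_DIRS = {".git", "node_modules", "__pycache__", ".venv"}
TEXT_SUFFIXES = {".py", ".js", ".ts", ".md", ".toml", ".json"}


def pack_codebase(root: str) -> str:
    """Concatenate source files under `root` into one string with path headers."""
    root_path = Path(root)
    chunks = []
    for dirpath, dirnames, filenames in os.walk(root_path):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            path = Path(dirpath) / name
            if path.suffix not in TEXT_SUFFIXES:
                continue  # skip binaries, images, and other non-source files
            rel = path.relative_to(root_path).as_posix()
            text = path.read_text(encoding="utf-8", errors="replace")
            # A header line per file lets the model refer to files by name.
            chunks.append(f"===== {rel} =====\n{text}")
    return "\n\n".join(chunks)
```

Calling `pack_codebase(".")` from the project root returns a single string you can paste into the chat. The per-file headers are a design choice so that follow-up questions can mention filenames.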



I usually do that in a two-step process. Instead of giving the full source code to the model in every conversation, I first ask it to write a comprehensive, detailed description of the architecture, intent, and details (including filenames) of the codebase to a Markdown file.

Then for each subsequent conversation I would ask the model to use this file as reference.

The overall idea is the same, but going through an intermediate file allows for manual amendments in case the model consistently forgets some things. It also makes it a bit easier for the model to find information and reason about the codebase, since everything is in a pre-summarized format.

It's sort of like giving a very rich metadata and index of the codebase to the model instead of dumping the raw data to it.


My special hack on top of what you suggested: ask it to draw the whole codebase in Graphviz-compatible graph markup (the DOT language). There are various tools out there to render this as an SVG or whatever, to get an actual map of the system. Very helpful when diving into a big new area.
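To give a feel for what that Graphviz output looks like, here is a small sketch that builds DOT text from a dependency map. The `to_dot` helper and the module names in the usage note are made up for illustration; in practice the model would write the DOT for you.

```python
def to_dot(deps: dict[str, list[str]]) -> str:
    """Render a {module: [modules it depends on]} map as a Graphviz digraph."""
    lines = ["digraph codebase {", "  rankdir=LR;"]
    for module, targets in deps.items():
        # Declare the node explicitly so modules with no edges still appear.
        lines.append(f'  "{module}";')
        for target in targets:
            lines.append(f'  "{module}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)
```

For example, `to_dot({"api": ["db", "auth"], "auth": ["db"], "db": []})` yields text you can save as `codebase.dot` and render with the Graphviz `dot` command, e.g. `dot -Tsvg codebase.dot -o codebase.svg`.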


You can use Mermaid format instead of Graphviz, then paste it into a Markdown file and GitHub will render it inline.


For anyone wondering how to quickly get your codebase into a good "Gemini" format, check out repomix. Very cool tool and unbelievably easy to get started with. Just type `npx repomix` and it'll go.

Also, use Google AI Studio, not the regular Gemini plan for the best results. You'll have more control over results.


> Convert the whole codebase into a string

When using the Gemini web app on a desktop system (this could differ depending on how you consume Gemini), select the + button in the bottom-left of the chat prompt area, select Import code, and then choose the "Upload folder" link at the bottom of the dialog that pops up. That opens a file dialog where you choose a directory, and it will upload all the files in that directory and all subdirectories (recursively). You can then prompt it on that code from there.

The upload process for average sized projects is, in my experience, close to instantaneous (obviously your mileage can vary if you have any sort of large asset/resource type files commingled with the code).

If your workflow already works then keep with it, but for projects with a pretty clean directory structure, uploading the code via the Import system is very straightforward and fast.

(Obvious disclaimer: Depending upon your employer, the code base in question, etc, uploading a full directory of code like this to Google or anyone else may not be kosher, be sure any copyright holders of the code are ok with you giving a "cloud" LLM access to the code, etc, etc)


Well, I am not sure Gemini's upload or any other LLM's respects `.gitignore`, and pulling in ignored files can immediately push the input over the maximum context window.

Tools like repomix[0] do this better, plus you can add your own extra exclusions on top. It also estimates token usage as part of its output, but I found the estimate too optimistic: it regularly says "40_000 tokens", but when I upload the resulting single XML file to Gemini it's actually more like 55k - 65k tokens.

[0] https://github.com/yamadashy/repomix/
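If you want a rough sanity check without any extra tooling, something like the sketch below might do. It lists files through `git ls-files` so that `.gitignore` is respected, and pads a crude character-based token estimate. The 4-characters-per-token figure is only a common rule of thumb, not Gemini's actual tokenizer, and the 1.5x margin is my assumption based on the 40k vs. 55k - 65k gap described above. All function names are hypothetical.

```python
import math
import subprocess


def list_unignored_files(root: str) -> list[str]:
    """List tracked and untracked-but-not-ignored files, as git sees them."""
    result = subprocess.run(
        ["git", "ls-files", "--cached", "--others", "--exclude-standard"],
        cwd=root,
        capture_output=True,
        text=True,
        check=True,  # raise if `root` is not inside a git repository
    )
    return result.stdout.splitlines()


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Crude estimate: assumes roughly `chars_per_token` characters per token."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(text: str, limit: int, margin: float = 1.5) -> bool:
    """Check the padded estimate against a context limit you supply."""
    return estimate_tokens(text) * margin <= limit
```

The margin is there because an optimistic estimate is worse than a pessimistic one here: finding out you are over the limit only after uploading wastes a round trip.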


I agree. I use repomix with AI Studio extensively and have never found anything (including the CLI agents) that comes close.

I sometimes upload codebases that are around 600k tokens and even those work.

Repomix also lets you create a config file so you can give it ignore/include patterns in addition to .gitignore.

It also tells you about the outlier files with exceptionally long content.


Try Codex and Claude Code. Their ability to use CLI tools, edit and reorganize multiple files, and even interact with git is game-changing.


Gemini CLI is a thing that exists. Are you saying those specifically are better? Or that CLIs are better?


OpenAI Codex currently seems quite a lot better than Gemini 2.5 and marginally better than Claude.

I'm using all three back-to-back via the VS Code plugins (which I believe are equivalent to the CLI tools).

I can live with either OpenAI Codex or Claude. Gemini 2.5 is useful but it is consistently not quite as good as the other two.

I agree that for non-agentic coding tasks Gemini 2.5 is really good, though.


Since I have only used Gemini Pro 2.5 (free) and Claude on the web (free) and I am thinking of subbing to one service or two, are you saying that:

- Gemini Pro 2.5 is better when you feed it more code and ask it to do a task (or more than one)?

- ...but that GPT Codex and Claude Code are better at iterating on a project?

- ...or something else?

I am looking to gauge my options. Will be grateful for your shared experience.


Codex and Claude are better than Gemini in all coding tasks I've tried.

At the "smart autocomplete" level the distinction isn't large but it gets bigger the more agentic you ask for.


Gemini CLI does all this too


I started using Gemini like that as well, but with Gemini CLI. Point it at the directory and then converse with it about the codebase. It's wonderful.


Idk though, I've seen many issues occur because of a longer context. I mean, it makes sense: given there are only so many attention heads, the longer the context, the less chance attention will pick out the relevant tokens.


The CLI tools really are way faster. You can use them the same way if you want; you just don't have to copy-paste stuff around all the time.



