Your developer tool already maps the entire codebase in useful ways: it knows every symbol available in the current context and the structure of each class. This information can be distilled for presentation to the LLM. For instance, if you want to generate a method implementation inside a C++ class, you can give the LLM a condensed version of the header files that the compiler would see when compiling that class. Removing whitespace and comments and boiling macros down saves a lot of tokens.
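As a rough illustration of the condensing step, here is a minimal Python sketch that strips comments and redundant whitespace from header text. The function name `condense_header` is hypothetical, and the regex approach is deliberately naive: it does not understand string literals or expand macros, so a real tool would more likely reuse the parser it already has.

```python
import re

# Hypothetical helper: a naive, regex-based condenser for C++ header text.
# It does not understand string literals, so "//" inside a string would be
# treated as a comment; a real tool would reuse the IDE's parser instead.
_BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
_LINE_COMMENT = re.compile(r"//[^\n]*")
_SPACES = re.compile(r"[ \t]+")


def condense_header(source: str) -> str:
    """Strip comments and redundant whitespace from C++ header text."""
    source = _BLOCK_COMMENT.sub(" ", source)
    source = _LINE_COMMENT.sub("", source)
    lines = []
    for line in source.splitlines():
        line = _SPACES.sub(" ", line).strip()
        if line:  # drop lines that are now empty
            lines.append(line)
    return "\n".join(lines)


header = """
// Widget.h
/* A simple
   widget. */
class Widget {
public:
    int   area() const;   // width * height
};
"""
print(condense_header(header))
```

Line breaks are kept on purpose so that preprocessor directives, which are line-sensitive, stay valid in the condensed text.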
You can probably also skip standard library headers, since the LLM will already know them well from its training.
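One way to skip standard library headers is to filter the include list before gathering header text. The sketch below is only illustrative: `project_includes` is a hypothetical name, and `STANDARD_HEADERS` holds a small sample rather than the full list for any particular C++ standard.

```python
import re

# Assumption: a small, illustrative subset of standard headers; a real tool
# would carry the full list for the language standard in use.
STANDARD_HEADERS = {"algorithm", "map", "memory", "string", "vector"}

# Matches both #include <name> and #include "name".
_INCLUDE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]')


def project_includes(source: str) -> list[str]:
    """Return included header names, skipping known standard library ones."""
    kept = []
    for line in source.splitlines():
        match = _INCLUDE.match(line)
        if match and match.group(1) not in STANDARD_HEADERS:
            kept.append(match.group(1))
    return kept


source = '#include <vector>\n#include <string>\n#include "widget.h"\n'
print(project_includes(source))
```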
Either way, consider that a typical preprocessed C++ file would push against a 100K-token limit even with some optimizations. You will definitely want some middleware to refine that file further before presenting it to the LLM.
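The middleware's budgeting step might look something like the following sketch. Both constants are assumptions: roughly four characters per token is a coarse heuristic rather than a real tokenizer, and the budget simply mirrors the 100K figure above. The function names are hypothetical.

```python
# Assumptions: roughly 4 characters per token (a coarse heuristic, not a real
# tokenizer) and the 100K-token budget mentioned in the text.
CHARS_PER_TOKEN = 4
TOKEN_BUDGET = 100_000


def estimate_tokens(text: str) -> int:
    """Rough token estimate; use the model's own tokenizer for real counts."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def select_within_budget(chunks: list[str], budget: int = TOKEN_BUDGET) -> list[str]:
    """Keep chunks, in priority order, until the next one would exceed the budget."""
    selected, used = [], 0
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget:
            break
        selected.append(chunk)
        used += cost
    return selected
```

Here the caller is expected to pass chunks already sorted by relevance, for example the class's own header first and transitive includes last, so that the least important material is what gets dropped.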