I think you’re thinking about this very locally. Of course ChatGPT can help with some coding. I use it for regex quite often because I never really learned it well.
The problem is that code at the average medium-sized company looks like this: you have 1 million lines of code written over a decade by a few hundred people. A big portion of the code is redundant, some of it is incomplete, much of it is undocumented. Different companies have different coding styles, different testing approaches, different development dynamics. ChatGPT does not have this context.
Excel has some similar problems. First of all, Excel is two-dimensional and LLMs really don’t reason well in two dimensions. So you need to flatten the Excel file for the LLM. A common approach is to load the file with pandas and then use the column and row names to index into the sheet.
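To make the flattening idea concrete, here is a minimal sketch, assuming a clean sheet with one header row and one label column. The function name `flatten_sheet`, the "row | column = value" line format, and the sample figures are all made up for illustration; in practice the DataFrame would come from something like `pd.read_excel`.

```python
import pandas as pd


def flatten_sheet(df: pd.DataFrame) -> list[str]:
    """Turn a 2-D table into one 'row | column = value' line per non-empty cell."""
    lines = []
    for row_label, row in df.iterrows():
        for col_label, value in row.items():
            # Skip empty cells so the LLM only sees populated ones.
            if pd.notna(value):
                lines.append(f"{row_label} | {col_label} = {value}")
    return lines


# Hypothetical data standing in for a sheet loaded with pd.read_excel(..., index_col=0).
sheet = pd.DataFrame(
    {"FY2022": [100.0, 60.0], "FY2023": [120.0, None]},
    index=["Revenue", "COGS"],
)

lines = flatten_sheet(sheet)
print("\n".join(lines))
```

Each line carries its own row and column name, so the LLM never has to infer position from layout. This only works when the sheet really does have one clean header row and one clean label column.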
Unfortunately, Excel files at companies cannot be easily read with pandas. They are illogically structured, have tons of hardcoding, sheets reference each other in weird, circular ways, and so on. I spent some time in finance, and sell-side equity research models are written by highly trained financial analysts and are substantially better organized than the average Excel model at a company. Even this subset of real-world models is far from suitable for direct pandas interpretation. Sell-side models need delicate and complex parsing before they can be fed into an LLM.