Given the incoming compute capability from Nvidia and the speed of advancement, we have to stop and think: does it make sense to give access, paid or otherwise, to these models once they reach a certain sophistication?
Or does it make even more sense to hoard the capability to outcompete any competitor of any kind, commercially or politically, and to hide the true extent of your capability to avoid scrutiny and legislation?
I'm going with the latter. Perhaps now, perhaps in the very near future, the power of these capabilities is novel: something like an information nuclear weapon.
I'd be dialing back public expectations and deploying the capability in a novel way, exploiting it as the largest lever I could.
The more unseen the lever, the longer it keeps working.
I think any other strategy is myopic from a competition perspective. The power of these models isn't just their direct utility; it is compounded by secrecy, because their useful work isn't directly observable as coming from the model.
I have some first-hand thoughts. Overall, the quality is significantly poorer on GPT-4 with plugins and Bing browsing enabled. If I disable those, I get the same quality as before; the outputs are dramatically different. I would love to hear what everyone else sees when they try the same.
No hard data here, but the experienced quality of the default GPT-4 model feels like it has gone down tremendously for me as well. Plugins and Bing browsing have almost never worked for me at all; I retry them about once a week, but there always seem to be technical issues.
A more banal explanation is that compute is expensive, so they are tweaking the models to get more for less, and it isn't always working out. Scaling by itself is a hard problem; scaling while improving efficiency (margins) is doubly so.
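For a concrete, if simplified, sense of what "tweaking the models to get more for less" can look like, here is a minimal weight-quantization sketch in Python. The technique and numbers are illustrative assumptions, not a claim about what OpenAI actually does: quantizing weights to int8 cuts memory roughly 4x against fp32, at a small but nonzero accuracy cost.

```python
import numpy as np

# Illustrative only: a common efficiency lever is quantizing weights to fewer
# bits, trading a little accuracy for memory/compute savings. This is an
# assumption-laden sketch, NOT a description of OpenAI's actual pipeline.

rng = np.random.default_rng(0)
weights = rng.normal(size=10_000).astype(np.float32)  # stand-in for model weights

# Symmetric int8 quantization: map [-max|w|, +max|w|] onto [-127, 127].
scale = np.abs(weights).max() / 127.0
q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
dequant = q.astype(np.float32) * scale

mem_saving = weights.nbytes / q.nbytes   # 4x vs fp32
err = np.abs(weights - dequant).mean()   # small but nonzero quality cost
print(f"memory saving: {mem_saving:.0f}x, mean abs error: {err:.5f}")
```

The tradeoff in that last line is the point: every such optimization buys margin at some quality cost, which is consistent with "it isn't always working out."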
I don't think current trends are necessarily related to my root comment, but they raise the question of whether absolute secrecy of capability would be a good route forward.
Experts who are both in a position to know and seeking to maximize the commercial potential of their work are saying the cat is already out of the bag. They make a persuasive case that public, open-source models are closing the gap with private, commercial ones, and admit bluntly, "We have no secret sauce."
Interesting, but Google has the world's largest search index to use to build models, plus billions of Android phones and Gmail accounts. An open-source model may share the same algorithm, and it might even have the same number of connections, but its training set will likely be dwarfed by Google's. The article argues that a few billion is enough, but what about five years from now? And even with fewer connections, wouldn't data quality matter? Sure, you can run a model slowly on a Raspberry Pi, but can't custom silicon do more?
There's a "data doesn't do what you think" document linked in that post which might counter this argument, but the site is now down.
Having spoken to a bunch of people who have either just left Google or still work there, I found that practically all of them think this was not so much a leak as a placed bit of news, meant to support them in potential future antitrust cases.
There is not much that could help them. The search-engine monopoly era won't repeat, despite some people trying very hard to build a regulatory-captured Alphapersude monopoly. For that you need to offer an actual edge over the competition, and treating users worse isn't one. Which brings us back to the OP: progress will luckily eat anyone attempting that.
I rather think it has something to do with scale, hardware, and energy costs. GPT-4 is far more expensive to compute than GPT-3, needing more GPUs and more energy to run; a rough sketch after this comment shows how cost scales with model size.
And demand is still through the roof, with a lot of people subscribing, so someone might have thought: why not reduce costs a little bit? Or well, "optimized" was probably the term used.
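To make the cost claim concrete, here is a rough back-of-envelope in Python. The ~2N-FLOPs-per-generated-token rule of thumb for dense transformers is standard, and GPT-3's 175B parameter count is published, but GPT-4's size is not public, so the 1T figure below is purely a hypothetical placeholder to show how cost scales with size.

```python
# Back-of-envelope inference cost, using the standard rule of thumb that a
# dense transformer needs ~2 * N FLOPs per generated token (N = parameters).
# GPT-3's 175B is public; GPT-4's size is NOT, so the 1T figure is an
# assumed placeholder purely to illustrate scaling.

FLOPS_PER_PARAM_PER_TOKEN = 2

def flops_per_token(n_params: float) -> float:
    """Approximate forward-pass FLOPs to generate one token."""
    return FLOPS_PER_PARAM_PER_TOKEN * n_params

gpt3 = flops_per_token(175e9)   # GPT-3: 175B parameters (published)
big = flops_per_token(1.0e12)   # hypothetical 1T-parameter model

print(f"GPT-3-sized model:      {gpt3:.1e} FLOPs/token")
print(f"hypothetical 1T model:  {big:.1e} FLOPs/token ({big / gpt3:.1f}x)")
```

Under those assumptions, per-token compute grows linearly with parameter count, so even a modest size increase multiplies the GPU and energy bill at ChatGPT's traffic levels.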