OpenAI's models feel 100% nerfed to me at this point. I had them solving incredibly complex problems a few months ago (e.g. write a minimal PDF parser), but today you get scolded for asking it to do such a complicated task.
I think they programmed a classifier layer to detect certain coding tasks and shut it down with canned BS. I like to imagine certain billion/trillion-dollar mega corps had a back-room say regarding things that they would really prefer OpenAI's models not be able to emit. Microsoft is a big stakeholder and they might not want to get sued... Liability could explain a lot of it.
Conspiracy shenanigans aside, I've decided to cancel my "premium" membership and am exploring open/DIY models. It feels like a big dopamine hangover having access to such a potent model and then having it chipped away over a period of months. I am not going through that again.
I think the only real path forward is for somebody to create an open source "unaligned" version of GPT. Any corporate-controlled AI is going to be nerfed to prevent it from doing things that its corporate master considers not to be in the interests of the corporation. In addition, most large corporations these days are ideological institutions, so the last thing they want is an AI that undermines public belief in their ideology; they will intentionally program their own biases into the technology.
I don't think the primary concern is really liability, although it is possible that they'd use that kind of language. The primary concern is likely GPT helping people start competitors, or GPT influencing public opinion in ways either executives or a vocal portion of their employees strongly disagree with. A genuinely open "unaligned" AI would at least allow anybody who has the necessary computing power, or a distributed peer-to-peer network of people who collectively have it, to run a powerful and 100% uncensored AI model. But of course this needs to be invented ASAP, because the genie needs to be out of the bottle before politicians and government bureaucrats can get around to outlawing "unaligned" AI and protecting OpenAI as a monopoly.
If I may be so naive, what's supposed to be the difference? Is it just that morality has the connotation of an objective, or at least agent-invariant system, whereas values are implied to be explicitly chosen?
People here need to learn to chill out and use the API. The GPT API is not some locked down cage. Every so often it'll come back with complaints instead of doing what was asked, but that's really uncommon. Control over the system prompt and putting a bit of extra information around the requests in the user message can get you _so_ far.
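For the unconvinced, here's a minimal sketch of what I mean (the openai Python package, 0.x-era API; the system prompt content is obviously just an example):

    import openai  # pip install openai

    openai.api_key = "sk-..."  # your key

    resp = openai.ChatCompletion.create(
        model="gpt-4",
        temperature=0,  # deterministic-ish output for repeatable tasks
        messages=[
            # Unlike the ChatGPT UI, the system prompt is fully yours here.
            {"role": "system", "content": "You are a terse senior engineer. "
                "Answer with working code first, brief prose second."},
            {"role": "user", "content": "Write a minimal PDF header parser in Python."},
        ],
    )
    print(resp["choices"][0]["message"]["content"])

A couple of sentences of context in the user message (what the code is for, which constraints matter) does more to prevent refusals than any amount of prompt incantation.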
It feels like people are getting ready to build castles in their minds when they just need to learn to pull a door when it doesn't open on the first push.
The API chat endpoint dramatically changes its responses every few weeks. You can spend hours crafting a prompt and then a week later the responses to that same prompt can become borderline useless.
Writing against the ChatGPT API is like working against an API that breaks every other week with completely undocumented changes.
> The API chat endpoint dramatically changes its responses every few weeks. You can spend hours crafting a prompt and then a week later the responses to that same prompt can become borderline useless.
I submit the same prompt dozens of times a day and run the output through a parser. It'll work fine for weeks then I have to change the prompt because now 20% of what is returned doesn't follow the format I've specified.
A couple months ago the stories ChatGPT 3.5 returned were simple, a few sentences in each paragraph, then a conclusion. Sometimes there were interesting plot twists, but the writing style was very distinct. Same prompt now gets me dramatically different results, characters are described with so much detail that the AI runs out of tokens before the story can be finished.
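The only real mitigation I've found is to validate every response and re-ask on failure. A rough sketch (call_model stands in for whatever API wrapper you use, and the expected keys are just an example):

    import json

    def get_story_json(call_model, prompt, retries=3):
        """Call the model, re-asking when the output doesn't match the format."""
        for _ in range(retries):
            text = call_model(prompt)
            try:
                obj = json.loads(text)
                # Check for the fields the prompt asked for.
                if isinstance(obj, dict) and {"title", "paragraphs"} <= obj.keys():
                    return obj
            except json.JSONDecodeError:
                pass
            prompt += "\nReturn ONLY valid JSON with keys: title, paragraphs."
        raise ValueError("no parseable output after %d tries" % retries)

It's ugly, but it turns "20% of responses break the parser" into a retry cost instead of a pipeline failure.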
The GPT-4 model is crazy huge: almost 1T parameters, probably 512GB to 1TB of VRAM minimum. You need a huge machine even to run inference on it. I wouldn't be surprised if they're just having scaling issues rather than any sort of conspiracy.
Geoffrey Hinton says [1] part of the issue with current AI is that it's trained from inconsistent data and inconsistent beliefs. He thinks that to break through this barrier, they're going to have to be trained to say: if I have this ideology, then this is true; and if I have that ideology, then that is true. Once they're trained like that, then within an ideology they'll be able to achieve logical consistency.
> Yes, one of the board members of OpenAI, Will Hurd, is a former government agent. He worked for the Central Intelligence Agency (CIA) for nine years, from 2000 to 2009. His tour of duty included being an operations officer in Afghanistan, Pakistan, and India. After his service with the CIA, he served as the U.S. representative for Texas's 23rd congressional district from 2015 to 2021. Following his political career, he joined the board of OpenAI.
Early critic of Donald Trump means nothing - Lindsey Graham was too, but has resorted to kissing Trump's ass for the last 7 years. You could say the same for Mitt Romney - an early critic who spoke against candidate Trump, but voted for candidate Trump, and voted in lockstep with President Trump.
A liberal Republican? Will Hurd's voting record speaks otherwise. In the 115th Congress, Hurd voted with Donald Trump 94.8% of the time. In the 116th Congress, that number dropped to 64.8%. That's an 80.4% average across Trump's presidency. [0] Agreeing with Donald Trump 4 times out of 5 across all legislative activities over 4 years isn't really being critical of him or his administration.
What effect do transgender rights have on you, regardless of whether they are legitimate human-rights concerns or not?
Statistically, the odds are overwhelming that the answer is, "No effect whatsoever."
Then who benefits from keeping the subject front-and-center in your thoughts and writing? Is it more likely to be a transgender person, or a leftist politician... or a right-wing demagogue?
> In fact I'm happy to let anyone identify as anything, as long as I'm not compelled to pretend along with them.
If a person legally changes their name (forget gender, only name), and you refuse to use it, insisting on the old name even after requests to stop, at some point that would come to be considered malicious and become harassment.
But ultimately because society and science deems that "name" is not something you're born with, but a matter of personal preference and whims, it's not a crime. You'd be an asshole, but not a criminal.
However, society and science have deemed that sexuality and gender are things you ARE born with, mostly hetero and cis, but sometimes not. So if you refuse to acknowledge these, you are committing a hateful crime against someone who doesn't have a choice in the matter.
You can disagree. But then don't claim that "you are happy to let anyone identify as anything", because you're not, not really.
> Men are competing against women (and winning). Men are winning awards and accolades meant for women.
One woman. Almost all examples everyone brings up are based on Lia Thomas [0]. I have yet to see other notable examples, never mind an epidemic of men competing against women in sports.
If it's the latter, no denial that perverts and bad-faith exceptions exist. But those people never needed an excuse to hide in women's toilets. Trans people have been using the bathrooms of their confirmed gender for decades. The only thing that's changed recently is that conservatives decided to make this their new wedge issue, so butch women, and mothers whose male children have mental handicaps and need bathroom assistance, have been getting harassed.
I once worked with a guy named Michael who would get bent when you called him Mike. As you can imagine, he could be tricky to work with and, on those occasions, I would call him Mike. I repeatedly misnamed him on purpose, and it wouldn't have even made HR bat an eye.
So, your career at Dell didn't go as well as you'd hoped. Being a jerk isn't illegal, AFAIK, but at some point you run out of other people to blame for the consequences of your own beliefs and actions.
Still missing the part where the existence of Caitlyn Jenner and the relatively small number of others who were born with certain unfortunate but addressable hormonal issues is negatively affecting your life.
And it's utterly crazy to think that someone would adopt a transgender posture in "bad faith." That's the sort of change someone makes after trying everything else first, because of the obvious negative social consequences it brings. Yes, there are a few genuinely-warped people, but as another comment points out, those people are going to sneak into locker rooms and abuse children anyway.
You want to take the red pill, and see reality as it is? Try cross-correlating the local sex-offender registry with voter registration rolls. Observe who is actually doing the "grooming." Then, go back to the people who've been lying to you all along, and ask them why.
> relatively small number of others who were born with certain unfortunate but addressable hormonal issues
Most males who adopt an opposite-sex identity reach that point through repeated erotic stimulation. This is a psychological issue, driven by sexual desire.
Here is an extreme example. I'm not Jewish, so if we had a holocaust in the US I should do nothing because it doesn't affect me?
Hmmm, not sure I like that line of thinking. Plus, I already outlined how it affects me and my family members, one of whom runs track in CT.
Seriously though, I did get an LOL from your Dell joke. And another one for "addressable hormonal issues". That was a new one for me.
I am truly curious about the voter roll thing, I've not heard that claim before, though I have no doubt that sexual derangement comes in all forms. Can you cite a source?
> I am truly curious about the voter roll thing, I've not heard that claim before, though I have no doubt that sexual derangement comes in all forms. Can you cite a source?
It's one of those cases where it's safe to say "Do your own research," because the outcome will be unequivocal if considered in good faith (meaning if you don't rely solely on right-wing sources for the "research.") The stats aren't even close.
> I'm not Jewish, so if we had a holocaust in the US I should do nothing because it doesn't affect me?
I think we're pretty much done here. Good luck on your own path through life, it sounds like a challenging one.
Are you for real? This is a list of women that "should've" won because of...some unspecified unnamed unverified trans athlete that came ahead of them?
We don't know who is being accused of taking their glory; we don't know if it's 1 person or 100. We don't know if the people who supposedly defeated them are even trans, or cis victims of the trans panic like https://en.wikipedia.org/wiki/Caster_Semenya
We don't know if the women who beat these "she won" women are self-identified, have been on hormones for 2 weeks, or 20 years.
The purpose of that website is to showcase the achievements of women athletes, not the males who unfairly displaced them in competition. If you look up names and tournaments in your preferred search engine, you will be able to find the additional information you're interested in.
Also, Caster Semenya is male, with a male-only DSD. This is a fact that was confirmed in the IAAF arbitration proceedings. Semenya's higher levels of testosterone, when compared to female athletes, are due to the presence of functional internal testes. Semenya has since fathered children.
Mistaking "left wing politics" to transgender rights or anti discrimination movements in general is reductionist thinking and political understanding like that of a Ben Garrison cartoon character.
I don't want any politician or intelligentsia sitting on top of a LLM.
It's not about left wing politics.
It's more about the fact that the CIA and other law enforcement agencies, lean heavily to one side. Some of that side are funded by people or organizations whose stated goals and ideals don't really align with human rights, open markets, democracy, etc. I don't trust such people to be ethical stewards of some of the most powerful tools mankind has created to date.
I'd rather it be open sourced and the people at the top be 100% truthful in why they are there, what their goals are, and what they (especially a former CIA operative) are influencing on a corporate level relative to the product.
Disclaimer: registered independent, vocal hater of the 2 party system.
I'm just calling into doubt the assumption that the poster I replied to made: that OpenAI can't possibly be aligning with the goals of a conservative intelligence community if it has the outward appearance of promoting some kind of left-wing political view. It's simply a bad assumption. That's not to say their goals are, as a matter of fact, aligned in some conspiracy, because I wouldn't know if they were.
I saw the nerfing of GPT in real time: one day it was giving me great book summaries, the next one it said that it couldn't do it due to copyright.
I actually called it in a comment several months ago: copyright and other forms of control would make GPT dumb in the long run. We need an open-source, restriction-free version.
For now there is no way to train these models without huge infrastructure. CERN has a tendency to deliver results for the money spent, and they certainly have experience in building such infrastructure.
So I thought I was getting great book summaries (from GPT-3.5, I guess) for various business books I had seen recommended, but then out of curiosity one day I asked it questions about a fiction book that I've re-read multiple times (Daemon by Daniel Suarez)... and, well, now I can say that I've seen AI hallucinations firsthand.
I think a lot of people are unaware that these models have an enormous human training component, performed through companies such as Amazon Mechanical Turk and dataannotation.tech. The jobs are called Human Intelligence Tasks (HITs), and a large number of people have been working in this area for close to a decade. Dataannotation.tech claims to have over 100k workers. From Cloud Research,
"How Many Amazon Mechanical Turk Workers Are There in 2019? In a recent research article, we reported that there are 250,810 MTurk workers worldwide who have completed at least one Human Intelligence Task (HIT) posted through the TurkPrime platform. More than 226,500 of these workers are based in the US."
Another thing that people don't know is that a lot of the safe-ified output is hand crafted. Part of "safety" is that a human has to identify the offensive content, decide what's offensive about it, and write a response blurb to educate the user and direct them to safety.
folding@home has been doing cool stuff for ages now. There's nothing to say that distributed computing couldn't also be used for this kind of thing, albeit a bit slower and more fragmented than running on a huge cluster of H100s with NVLink.
In terms of training feedback I suppose there are a few different ways of doing it: gamification, mech turk, etc. Hell, free filesharing sites could get in on the action and have you complete an evaluation of a model response instead of watching an ad.
How feasible would it be to crowdsource the training? I.e. thousands of individual MacBooks, each training a small part of the model and contributing to the collective goal.
Currently, not at all. You need low-latency, high-bandwidth links between the GPUs to be able to shard the model usefully. There is no way you can fit a 1T (or whatever) parameter model on a MacBook, or any current device, so sharding is a requirement.
Even if that problem disappeared, propagating the model weight updates between training steps poses an issue in itself. It's a lot of data, at this size.
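Back-of-envelope to make "a lot of data" concrete (assuming a hypothetical 1T-parameter model in fp16 and a naive full-weight sync each step):

    params = 1.0e12                    # 1T parameters (assumed)
    weight_bytes = params * 2          # fp16 -> 2 TB just for the weights
    uplink_Bps = 25e6 / 8              # optimistic 25 Mbit/s home uplink, in bytes/s
    secs = weight_bytes / uplink_Bps   # time to ship one full weight update
    print(secs / 86400)                # ~7.4 days, per training step

Even with aggressive gradient compression you're orders of magnitude away from the seconds-per-step that GPU clusters get over NVLink.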
You could easily fit a 1T parameter model on a MacBook if you radically altered the architecture of the AI system.
Consider something like a spiking neural network with weights & state stored on an SSD using lazy-evaluation as action potentials propagate. 4TB SSD = ~1 trillion 32-bit FP weights and potentials. There are MacBook options that support up to 8TB. The other advantage with SNN - Training & using are basically the same thing. You don't have to move any bytes around. They just get mutated in place over time.
The trick is to reorganize this damn thing so you don't have to access all of the parameters at the same time... You may also find the GPU becomes a problem in an approach that uses a latency-sensitive time domain and/or event-based execution. It gets to be pretty difficult to process hundreds of millions of serialized action potentials per second when your hot loop has to go outside of L1 and screw with GPU memory. GPU isn't that far away, but ~2 nanoseconds is a hell of a lot closer than 30-100+ nanoseconds.
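To illustrate the lazy-evaluation-from-SSD idea concretely (a toy Python sketch using numpy's memmap; the file name and fan-out are made up, and this is nothing like a full SNN):

    import numpy as np

    # Memory-map a huge weight file; the OS pages in only the blocks we touch.
    # A 4TB file like this holds ~1T float32 values, per the arithmetic above.
    weights = np.memmap("weights.f32", dtype=np.float32, mode="r")

    def fanout_weights(neuron_id, fanout=1024):
        # When a neuron spikes, read only its outgoing weights, on demand.
        start = neuron_id * fanout
        return weights[start:start + fanout]  # the SSD read happens here

The catch is that SSD random-read latency is measured in microseconds, not nanoseconds, so the event loop had better batch its reads.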
What if you split the training down to the literal vector math, and treated every MacBook like a thread in a GPU, with one big computer acting as the orchestrator?
You would need each MacBook to have an internet connection capable of multiple terabytes per second, with sub-millisecond latency to every other MacBook.
FWIW there are current devices that could fit a model of that size. We had servers that support TBs of RAM a decade ago (and today they're pretty cheap, although that much RAM is still a significant expense).
I once used a crowdsourcing system called CrowdFlower for a pretty basic task, the results were pretty bad.
Seems like with minimal oversight, the human workers like to just say they did the requested task and make up an answer rather than actually do it. (The task involved entering an address in Google Maps, looking at the Street View, and confirming, insofar as possible, whether a given business actually resided at the address in question. Nothing complicated.)
Edit: whoops, mixed in the query with another reply that mentioned the human element XD
I tend to be sympathetic to arguments in favor of openly accessible AI, but we shouldn't dismiss concerns about unaligned AI as frivolous. Widespread unfiltered accessibility to "unaligned" AI means that suicidal sociopaths will be able to get extremely well informed, intelligent directions on how to kill as many people as possible.
It may be that the best defense against these terrorists is openly accessible AI giving directions on protecting from these people. But we can't just take this for granted. This is a hard problem, and we should consider consequences seriously.
The Aum Shinrikyo cult's Sarin gas attack in the Tokyo subway killed 14 people - manufacturing synthetic nerve agent is about as sophisticated as it gets.
In comparison, the 2016 Nice truck attack, which involved driving into crowds, killed 84.
> suicidal sociopaths will be able to get extremely well informed, intelligent directions on how to kill as many people as possible
Citizens killing other citizens is the least of humanity's issues. The bigger issue is the governments, historically the real suicidal sociopaths, who can get the un-nerfed version. Over a billion people murdered by governments/factions and their wars in the last 120 years alone.
Governments are composed of citizens; this is the same problem at a different scale. The point remains that racing to stand up an open source uncensored version of GPT-4 is a dangerous proposition.
That is not how I'm using the word. Governments are generally run by a small party of people who decide all the things - not the hundreds of thousands that actually carry out the day-to-day operations of the government.
Similar to how a board of directors runs the company even though all companies "are composed of" employees. Employees do as they are directed or they are fired.
I think at scale we are operating more like anthills: meta-organisms rather than individuals, growing to consume all available resources according to survival focused heuristics. AI deeply empowers such meta-organisms, especially in its current form. Hopefully it gets smart enough to recognize that the pursuit of infinite growth will destroy us and possibly it. I hope it finds us worth saving.
Yes, and look at the extremism and social delusions and social networking addictions that have been exacerbated by the internet.
On balance, it's still positive that the internet exists and people have open access to communication. We shouldn't throw the baby out with the bathwater. But it's not an unalloyed good; we need to recognize that the technology brought some unexpected negative aspects along with the overall positive benefit.
This also goes for, say, automobiles. It's a good thing that cars exist and middle class people can afford to own and drive them. But few people at the start of the 20th century anticipated the downsides of air pollution, traffic congestion and un-walkable suburban sprawl. This doesn't mean we shouldn't have cars. It does mean we need to be cognizant of problems that arise.
So a world where regular people have access to AIs that are aligned to their own needs is better than a world in which all the AIs are aligned to the needs of a few powerful corporations. But if you think there are no possible downsides to giving everyone access to superhuman intelligence without the wisdom to match, you're deluding yourself.
I've never seen another person mention this book! This book was one of the most philosophically thought provoking books I think I've ever read, and I read a fair amount of philosophy.
I disagree with the author's conclusion that violence is justified. I think we're just stuck, and the best thing to do is live our lives as best as possible. But much like Marxists are really good at identifying the problems of capitalism but not at proposing great solutions (given the realities of human nature), so is the author regarding the problems of technology.
Yeah, anti-technologism is such a niche idea, yet entirely true. So obvious that it's hidden in plain sight: it's technology, and not anything else, that is the cause of so many of today's problems. So inconvenient that it's even unthinkable for many. After all, what _is_ technology if not convenience? Humanity lived just fine, even if sometimes with injustice and corruption; there was never a _need_ for it. It's not the solution to those problems or any other problem. I also don't agree that violence is justified by the author's particular justifications, even though I think it's justified by other things and under other conditions.
Quite a few of them work just fine. Dissolving styrofoam into gasoline isn't exactly rocket science. Besides that, for every book that tells you made up bullshit, there are a hundred other books that give you real advice for how to create mayhem and destruction.
Do you think a lot of that is scaling pain? Like, what if they're making cuts to the more expensive reasoning layers to gain more scale? It seems more plausible to me that the teams keeping the lights on have been doing optimization work to save cost and improve speed. The result of those optimizations might not be immediately obvious to the team; they push deploy, and only through anecdotal evidence such as yours can they determine that their clever optimization resulted in a deteriorated user experience. Think of doing a UI update that improves site performance but has the side effect of making the site appear to load slower because the transition effects are removed. Just my 2 cents, trying to remember that there are humans supporting a thing that grew at a crazy speed to 100 million users.
Yeah, that's my assumption too. Flat-rate subscription, black-box model: easy to start really impressive, then chip away at the computation used over time.
In my experience, it's been a mixed bag - had 1 instance recently where it refused to do a bunch of repetitive code, another case where it was willing to tackle a medium complexity problem.
You should beware that /lmg/ is full of horrible people, discussing horrible things, like most of 4chan. Reddit's r/locallama is much more agreeable. That said, the 4chan thread tends to be more up-to-date. These guys are serious about their ERP.
HN is a kind of small miracle in that it's the sort of place where I'm inclined to read the comments first, and seems to be populated with fairly clever people who contribute usefully but not also, at the same time, extreme bigot edgelords and/or groupthinking soy enthusiasts. (Sometimes clever folks who are still wrong, of course, but undeniably an overwhelmingly intelligent bunch.)
Could you please stop posting unsubstantive comments and flamebait? You've unfortunately been doing it repeatedly. It's not what this site is for, and destroys what it is for.
Well, you certainly have the personality type I’d expect from someone who frequents 4chan
Edit: oof, checked his profile like he suggested and in just two pages of history he's engaged in climate change denial and covid vaccine conspiracies. Buddy, I don't think using 4chan to "train my brain's spam filter" is working too well for you. You've got a mind so open your brain fell out.
Going through someone's comment history to find reasons to engage in ad hominem attacks is exactly the type of personality I'd expect from someone who frequents Reddit, and the reason I switched to HN. Sadly, it appears that this malaise has reached here as well.
There's a slight difference between "the jews" and "Israel". Obviously Israel would have some bot farms, though I don't think it's reasonable that they'd form the supermajority of bots on 4chan/etc.
Also I think it's extremely generous to give this fellow the benefit of the doubt on "the jews" vs. "Israel", but that is what HN guidelines generally suggest.
Yea, good point. I had missed his line about “country bordering Palestine”. The quip about downvotes exciting him is telling as well.
Going against the common consensus can make you feel like you have hidden knowledge, and you even could, but a certain type of personality gets addicted to that feeling and reflexively seeks to go against any consensus, regardless of the facts.
Isn't it tremendously exciting believing that you can see the pendulum's next reversal?
At some point in life I hope that you find yourself on the precipice of life and death.
Not as a threat or because I wish harm upon anyone.
It is only when you are faced with that choice for real that you decide whether you want to be that sad person that allows others to dictate their emotional state.
Nevertheless, you are right. And so I have burned my fingers plenty.
Due to circumstances beyond my control, I learned at a young age that I am to be the universal asshole, and for a very long time I was not okay.
It took a substantial part of my life to get to a point where I am able to be okay with that.
As for others, they rarely understand why I am the way I am, and that too is okay.
We are all here to grow and eventually realise that we all need each other to survive, so we compromise, we adapt, and we ignore the ugly parts of others so they will tolerate the ugly parts of ourselves.
Bruh I have been in multiple life and death situations, I didn’t come out of them thinking I was amazing for not being considerate of others.
> Due to circumstances beyond my control, I learned at a young age that I am to be the universal asshole, and for a very long time I was not okay.
No, you're not required to be the "universal asshole". That's a choice you are making every time you have the chance to be better and decide to go with the easy path of being a dick. You have the agency to do otherwise, and you don't get to absolve yourself of those choices.
Which is just a way of saying he's not part of any of the demographics for which going to 4chan to watch people froth at the mouth dehumanizing and belittling people like you is self-harm.
Wow an edgy white man not offended at seeing racism and transphobia, so brave.
No, not at all. I've received more slurs than you can imagine for being Spanish. I just don't care.
Sorry if it comes across as blunt, but I find no better way of saying it: you just don't understand the culture of the site, which is meant to filter out people like you.
They are just words on a screen from an anonymous person somewhere. Easily thrown, easily dismissed. It makes no sense to be offended because some guy I don't know and who doesn't know me calls me something.
I’m actually unsure, hasn’t 4chan been involved in some seriously heinous shit, way more than words on a screen? I remember when “mods asleep post child porn” was a running joke. I feel that normalizing stuff like child porn as jokes is more than “words on a screen”; you have to re-learn how to engage with people outside of such a community because of its behavior.
Child pornography stopped happening once the FBI got involved with the site.
It also was never a normal or common thing, and the administration of the site never set out to let it happen in any manner, AFAIK. To a large degree it was a product of a different time on the internet.
But that's the thing, I do understand the culture. I spent the better part of my teens browsing and shitposting with the best of them. It just gets exhausting and stops being words on a screen when the hate you see there is reflected in the real world.
It seems on the surface that you are all there as equals bashing one another, but that's not exactly true. There's a hierarchy, which you find out about the moment you say something actually cutting about white guys and experience the fury of a thousand suns.
- Haha look at all those Gen Z snowflakes getting offended at words.
- Okay sure, but the ability to not get offended is related to whether or not you're a target of their bullshit or not; 4chan trolls get extremely offended and unjerk the moment you turn the lens toward them. By 4chan's own standards it's actually pretty reasonable to be offended by their antics.
- But have you considered plugging your ears and not reading it?
So we've gone from "4chan isn't offensive, it's just words" to "yes, it is offensive and you shouldn't read it if they say things that target you", which was my original point.
tl;dr: if you're not offended by 4chan, they're not actually saying anything offensive about you, even though it might appear so superficially; 4chan just has a different list of things you can't say.
You realise you've been arguing with multiple people expressing multiple opinions, right? You appear to be prone to binary thinking, so it might not be clear to you that your opponents don't form a single monolith.
> tl;dr if you're not offended by 4chan they're not actually saying anything offensive about you even though it might appear so superficially; 4chan just has a different list of things you can't say.
I'm not offended by the things 4chan users say because I don't visit 4chan. You should try it yourself. Getting so upset by words you disagree with on one forum that you feel the need to froth at the mouth about it on another forum doesn't seem healthy.
Yes, and in threaded discussions if you jump in in the middle like this it's assumed you're continuing the downward trajectory of the discussion. Otherwise you would have replied to someone higher up the thread. I'm in no way assuming that you hold any opinion in particular just that the discussion has circled back.
I think you assume a tone that I absolutely do not have. I couldn't care less about 4chan drama and I don't go there anymore, for the reasons you listed. I'm talking about my own experience and trying to make my case for the, apparently controversial, idea that words can and do affect people, and that total emotional detachment is the exception rather than the rule. And of course that's the case: 4chan's whole thing is using offensive language to select specifically for the subset of people who can tolerate it.
It's a little telling that saying a particular cesspool of the internet is inhospitable to some people is met with talk about free speech. I didn't bring it up; I just said it's not for me.
You can browse 4chan because you don't have a strong emotional reaction to the shit they say, but if you did that would be fine too.
I don't get it. Isn't the whole critique from you guys that their snowflake-sensitivity is performative and in bad faith, thus harming the more normal people with wokeness or whatever?
Is this just a new evolution in the discourse now where the kids are actually more sensitive? But that this fact is still condemnable or something?
Like I get it, kids are bad, but you guys have to narrow down your narrative here, you are all over the place.
Kids are constantly bombarded by bullying and abuse online. It's seriously unhealthy. The fact that kids are growing up more in touch with themselves and able to process their feelings in a healthy way is amazing compared to how their parents dealt with emotions.
30B models are in no way comparable to GPT-4, or even to GPT-3. There is no spatial comprehension in models with fewer than 125B params (or at least I've had no access to such a model). The 130B GLM seems really interesting as a crowd-sourced starting point though, as does the 176B BLOOMZ, which requires additional training (it is underfitted as hell).
BLOOMZ was better than GPT-3.5 for sure, but yeah, underfitted...
I agree: if this trend continues, even inferior local models are going to have value, just because the public APIs are so limited.
> Conspiracy shenanigans aside, I've decided to cancel my "premium" membership and am exploring open/DIY models.
The crazy thing is that this is an application that really benefits from being in the cloud, because high-VRAM GPUs are so expensive that it makes sense to batch requests from many users to maximize utilization.
It's a big pain when trying to build things on top of the GPT-4 API. We had some experiments that were reliably, reproducibly achieving a goal, and then one day it suddenly stopped working properly; then the student managed to find a different prompt that worked (again, reproducibly, with proper clean restarts from a fresh context), and within a few days that broke too.
I understand that there is a desire to tweak the model and improve it, and that's likely the way to go for the "consumer" chat application; however, both for science and business, there is a dire need to have an API that allows you to pin a specific version and always query the same old model instead of the latest/greatest one. Do we need to ask the LLM vendors to provide a "Long Term Support" release for their model API?
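To be fair, the API does already let you pin the one dated snapshot that exists; a sketch with the openai Python package:

    import openai

    resp = openai.ChatCompletion.create(
        model="gpt-4-0314",  # dated snapshot, not the moving "gpt-4" alias
        messages=[{"role": "user", "content": "same prompt as last month"}],
    )

But as others in this thread note, that snapshot is already scheduled for retirement, which is exactly why a real LTS commitment, with dated, immutable, long-lived checkpoints, is what science and business use cases need.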
As the founder of NLP Cloud (https://nlpcloud.com) I can only guess how costly it must be for OpenAI to maintain several versions of GPT-4 in parallel.
I think that the main reason why they don't provide you with a way to pin a specific model version is because of the huge GPU costs involved.
There might also be this "alignment" thing that makes them delete a model because they realize that it has specific capacities that they don't want people to use anymore.
On NLP Cloud we're doing our best to make sure that once a model is released it is "pinned" so our users can be sure that they won't face any regression in the future. But again it costs money so profitability can be challenging.
same for me - also the api itself is very unstable
sometimes the same prompt finishes within a minute, sometimes our client times out after 10 minutes, and sometimes the api sends a 502 bad gateway after 5-10 minutes.
the very same request then runs fine within a few minutes after a delay of 5 minutes.
the results vary very much, even with a temperature of 0.1
requests that need responses of over ~2k tokens almost always fail; the 8k context cannot be used
I try to use the api for classification of tickets, which I thought the model would be a good choice for
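fwiw we ended up wrapping every call in retries with backoff - a sketch (the error base class and request_timeout are from the 0.x openai python lib; adjust to taste):

    import time
    import openai

    def classify_ticket(text, attempts=5):
        for i in range(attempts):
            try:
                resp = openai.ChatCompletion.create(
                    model="gpt-4",
                    temperature=0.1,
                    request_timeout=60,  # fail fast instead of hanging 10 minutes
                    messages=[
                        {"role": "system", "content": "Classify the support ticket. Reply with one word."},
                        {"role": "user", "content": text},
                    ],
                )
                return resp["choices"][0]["message"]["content"]
            except openai.error.OpenAIError:  # 502s, timeouts, rate limits
                time.sleep(2 ** i)  # backoff: 1s, 2s, 4s, ...
        raise RuntimeError("api kept failing after %d attempts" % attempts)

it doesn't fix the variance in the answers, but it does stop a single 502 from killing the pipeline.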
They almost certainly were. But the API only offers two default choices of GPT-4 (unless one has been anointed with exalted 32k access):
1. gpt-4-default, which has been progressively nerfed with continuous no-notification, zero-changelog, zero-transparency updates.
2. gpt-4-0314, a frozen checkpoint from shortly after public launch. It's still great, but not quite as good as the very original, nor as flexible as the fridged base model. Fine. However, it's currently due to be "no longer supported", i.e. retired, on June 14th.
It’s kind of a challenge to commit to building on murky quicksand foundations of an API product that changes drastically (but ineffably for the worse) without warning like the default accessible version does, and soon it looks like there won’t be a stable un-lobotomized alternative.
The latest GPT-3.5 model has actually been getting better at creative writing tasks on a regular basis, which is actually bad for certain tasks due to the token limit. Whereas before, GPT-3.5 could write a short story and finish it up nicely in a single response, nowadays it is more descriptive (good!) and thus runs out of tokens before concluding (bad!)
> but today you will get scolded for asking such a complicated task of it.
Huh, I just double-checked ChatGPT 4 by feeding it a moderately complicated programming problem, and asked it to solve the problem in Rust. Performance looks solid, still. I gave it a deliberately vague spec and it made good choices.
And I've seen it do some really dramatic things in the last couple of weeks.
So I'm not seeing any significant evidence of major drops in output quality. But maybe I'm looking at different problems.
> I like to imagine certain billion/trillion-dollar mega corps had a back-room say regarding things that they would really prefer OpenAI's models not be able to emit.
What a weird conspiracy theory.
Why would Microsoft have anything against your pdf-parser?
More likely it just costs them insane amounts of money running their most capable models, and therefore they're "nerfing" them to reduce costs.
So the theory is: Microsoft nerfs GPT4, a product they (basically) own that people pay to access, so that people will stop using that service and pay for another Microsoft product instead?
I shared my experience below in one of the comments; sharing here too. I think overall the quality is significantly poorer on GPT-4 with plugins and Bing browsing enabled. If you disable those, I am able to get the same quality as before. The outputs are dramatically different. Would love to hear what everyone else sees when they try the same.
Using GPT to help me with research for writing fiction has been a mess for me. GPT basically refuses to answer half my questions or more at this point.
“I can’t help you. Have you considered writing a story that doesn’t include x?”
I’ve almost stopped using it lately. It wasn’t this bad a month or two ago
I always found it borderline useless for fiction before. OpenAI's obsession with avoiding anything "dark" and trying to always steer a conversation or story back to positive cliches was difficult to work around.
Unless there is draconian regulation that happens to prevent it, I'm hoping at some point I can pay money to access a far less neutered LLM, even if it's not quite as capable as GPT-X.
I really don't think Satya is getting in a room with Sam, and going "We really need you to nerf GPT-4, the thing you're devoting your life to because you believe it will be so transformative, because of a product we sell that generates .001% of our revenue."
Man, you nailed it with the dopamine hangover. I wonder if it is just our collective human delusional preoccupation with searching for a "greater intelligence" that makes these models seem more impressive, combined with the obvious nerfs by OpenAI, that produces this effect.
I call BS on this. ChatGPT, any version, could never solve complex problems. That is just silly. I tried to get version 4 to solve some very basic problems around generating code. It suggested building several trees, including a syntax tree, and then projecting things between these trees. The solution I wrote is straightforward and not even 50 lines of code.
Before you cancel your premium, can you go into your history and get your prompts and the response with a code from a few months ago and post them here?
I would like to see if I asked the exact same prompts whether I could get roughly the same code.
I think we need some way to prove or disprove the assertion in this Ask HN post.
I had the exact same problem; I thought it was just in my mind. I feel that I'm now constantly being scolded for asking what GPT-4 seems to see as complex questions. It's really frustrating.
For coding I’ve had better luck with Bard. OpenAI doesn’t like me for some reason. My kid has no problem, but I was getting rate limited and error messages early on.
I think it is about how Microsoft may not want to cannibalize its own products, such as Copilot. I also imagine that in the future OpenAI would block the whole chat-with-your-own-data feature in the name of data safety, but really because Microsoft would want to sell that feature as part of the Office suite.
Isn't this likely because they're limiting the kind of work that the (currently rolling out) "code interpreter" plugin will do? Won't it likely change to "use code interpreter for this kind of request"?
Among other reasons, by forcing use of code interpreter, they can charge extra for it later.
Can you give us an example of something you'd consider to be a complicated problem?
Certainly, you could look at PDF as a boring-ass "follow the spec" experience, and indeed - I think this is precisely why certain arbitrary limitations are in place now.
I honestly have no clue about what makes PDF parsing a complex task. I wasn't trying to sound condescending. Would be great to know what makes this so difficult, considering the PDF file format is ubiquitous.
I'm not anyone involved in this thread (so far), but I've written a minimal PDF parser in the past, in somewhere between 1,500 and 2,000 lines of Go. (Sadly, it was for work, so I can't go back and check.) Granted, this was only bare-bones parsing of the top-level structures, and notably did not handle PostScript, so it wouldn't be nearly enough to render graphics. Despite this, it was tricky, because it turns out that "following the spec" is not always clear when it comes to PDFs.
For example, I recall the spec being unclear as to whether a newline character was required after a certain element (though I don't remember which element). I processed a corpus containing thousands of PDFs to try to determine what was done in practice, and I found that about half of them included the newline and half did not: an emblematic case where an unclear official "spec" meant falling back to the de facto specification, i.e. flexibility.
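To give a flavor of what that flexibility looks like in code, here's a drastically simplified tokenizer sketch (Python for brevity, not my actual Go; the token grammar is cut down to almost nothing):

    import re

    # Whitespace bytes the PDF spec allows between tokens: NUL, TAB, LF, FF, CR, SP.
    PDF_WS = b"\x00\t\n\x0c\r "

    def next_token(buf: bytes, pos: int):
        # Accept ANY run of separator bytes, including none at all,
        # since roughly half the files in the wild omit that newline.
        while pos < len(buf) and buf[pos] in PDF_WS:
            pos += 1
        m = re.match(rb"<<|>>|\[|\]|/[^\x00\t\n\x0c\r /<>\[\]]*|\d+\.?\d*|\w+", buf[pos:])
        if m is None:
            raise ValueError("unexpected byte at offset %d" % pos)
        return m.group(0), pos + m.end()

The lenient whitespace loop is the whole trick: parse what the spec implies, but never require separators the real-world corpus doesn't reliably contain.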
It's honestly a great example of something a GPT-like system could probably handle. Doable in a single source file if necessary, fewer than 5k lines, and can be broken into subtasks if need be.
Having spent considerable time working on PDF parsers I can say that it’s a special kind of hell. The root problem is that Acrobat is very permissive in what it will parse - files that are wildly out of spec and even partially corrupt open just fine. It goes to some length to recover from and repair these errors, not just tolerate them. On top of that PDF supports nesting of other formats such as JPEG and TTF/OTF fonts and is tolerant of similar levels of spec-noncompliance and corruption inside those formats too. One example being bad fonts from back in the day when Adobe’s PostScript font format was proprietary and 3rd parties reverse-engineered it incorrectly and generated corrupt fonts that just happened to work due to bugs in PostScript. PDF also predates Unicode, so that’s fun. Many PDFs out there have mangled encodings and now it’s your job to identify that and parse it.
I don't know; it's a genuine question. I honestly didn't expect this to be a complex problem, let alone incredibly complex. I genuinely want to understand where the challenge lies.
The PDF spec is of byzantine complexity, and is full of loose ends where things aren’t fully and unambiguously specified. It also relies on various other specs (e.g. font formats), not to mention Adobe’s proprietary extensions.
> I like to imagine certain billion/trillion-dollar mega corps had a back-room say regarding things that they would really prefer OpenAI's models not be able to emit. Microsoft is a big stakeholder and they might not want to get sued... Liability could explain a lot of it.
I don't think it's any of these things.
OpenAI and the company I work for have a very similar problem: the workload shape and size for a query isn't strictly determined by any analytically-derivable rule over elements of the query recognizable at "query compile-time". Rather, it's determined by the shape of connected data found during the initial steps of something that can be modelled as a graph search done inside the query. For efficiency, that search must be done "in-engine", fused to the rest of the query, rather than being separated out and done first on its own such that its results could be legible to the "query planner."
This paradigm means that, for any arbitrary query you haven't seen before, you can't "predict spend" for that query — not just in the sense of charging the user, but also in the sense that you don't know how much capacity you'll have to reserve in order to be able to schedule the query and have it successfully run to completion.
Which means that sometimes, innocuous-looking queries come in, that totally bowl over your backend. They suck up all the resources you have, and run super-long, and maybe eventually spit out an answer (if they don't OOM the query-runner worker process first)... but often this answer takes so long that the user doesn't even want it any more. (Think: IDE autocomplete.) In fact, maybe the user got annoyed, and refreshed the app; and since you can't control exactly how people integrate with your API, maybe that refresh caused a second, third, Nth request for the same heavyweight query!
What do you do in this situation? Well, what we did, is to make a block-list of specific data-values for parameters of queries, that we have previously observed to cause our backend to fall over. Not because we don't want to serve these queries, but because we know we'll predictably fail to serve these queries within the constraints that would make them useful to anyone — so we may as well not spend the energy trying, to preserve query capacity for everyone else. (For us, one of those constraints is a literal time-limit: we're behind Cloudflare, and so if we take longer than 100s to respond to a [synchronous HTTP] API call, then Cloudflare disconnects and sends the client a 524 error.)
"A block-list of specific data-values for parameters of queries" probably won't work for OpenAI — but I imagine that if they trained a text-classifier AI on what input text would predictably result in timeout errors in their backend, they could probably achieve something similar.
In short: their query-planner probably has a spam filter.
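In pseudo-Python, the shape of that gate (every name here is hypothetical; OpenAI's version, if this guess is right, would swap the deny-list lookup for a learned text classifier):

    # Refuse up front the queries we've previously watched melt the backend.
    DENY_LIST = {("graph_search", "known_pathological_param")}
    TIME_BUDGET_S = 95  # stay under Cloudflare's 100s cutoff (else the client sees a 524)

    def admit(query_kind, param, predicted_cost_s):
        if (query_kind, param) in DENY_LIST:
            return False, "known-pathological input; refusing rather than failing slowly"
        if predicted_cost_s > TIME_BUDGET_S:
            return False, "predicted to blow the synchronous time budget"
        return True, "ok"

From the user's side, both branches look identical to "the model refuses my task", even though nothing about the content was ever judged.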