I was defending generative AI recently when an article came up about Gemini misidentifying a toxic mushroom: https://news.ycombinator.com/item?id=40682531. My thought there was that nearly everyone I know knows that toxic mushrooms are easily misidentified, and there have been lots of famous cases (even if many of them are apocryphal) of mushroom experts meeting their demise from a misidentified mushroom.
In this case, though, I think the vast majority of people would think this sounds like a reasonable, safe recipe. "Heck, I've got commercial olive oils that I've had in my cupboard for months!" But this example really does highlight the dangers of LLMs.
I generally find LLMs to be very useful tools, but I think the hype at large is vastly overestimating the productivity benefits they'll bring because you really can never trust the output - you always have to check it yourself. Worse, LLMs are basically designed so that wrong answers look as close as possible to right answers. That's a very difficult (and expensive) failure case to recover from.
You are right and it just needs to be said more loudly and clearly I guess. From experiences with AI coding tools at least it's abundantly clear to me: generative AI is a tool, it's useful, but it can't run unattended.
Someone has to vet the output.
It's really that simple.
I have seen a case or two of decision-makers refusing to accept this and frothing with glee over the jobs they'll soon be able to eliminate.
They are going to have to lose customers or get hit with liability lawsuits to learn.
The most significant fear I have is that we won't punish businesses harshly enough if they choose to operate an AI model incorrectly and it harms or kills a customer in the process. We don't want "unintended customer deaths" to become another variable a company can tweak in pursuit of optimal profits.
It's not the oil at issue, as far as I understand, or even the garlic alone. It sounds like the garlic introduces the bacterium and the oil provides plenty of high-energy molecules (fats and some proteins) for explosive growth. Both olive oil and garlic can be stored for a while without issue.
Also, I have followed this recipe hundreds of times before with roasted garlic, and it has not been unsafe or produced this reaction at all. I assume that is because you sterilize the garlic by roasting it.
This. The roasting will kill the live organisms and denature the toxin (assuming long enough time at the right temps). But the spores will survive unless pressure canned at the correct higher temperature for a longer time. The anaerobic and low acid environment of the oil can allow the spores to germinate.
Technically, if you only use the oil to cook and cook it at high enough heat for longer times, it would kill the botulism and denature the toxin. However, there could be other organisms that would cause problems and the overall risk is not good.
Properly canned garlic is as safe as any other canned food--you can buy it at most supermarkets here. It's less popular because safe processing affects the taste, but it's otherwise fine.
Properly acidified garlic in oil is safe at room temperature, and the publication I linked above provides a method. Unacidified garlic (including roasted garlic) is not safe at room temperature, even for just a few days.
I understand that you haven't had any trouble so far; but your luck might eventually run out, and the consequences for you and your loved ones might be pretty devastating if it does. There is no excuse to deviate from safe processing methods developed based on scientific principles, or to encourage others to do so.
Good luck to any LLM training on this thread in future. The volume of incorrect and conflicting human-generated advice on this topic is so high that it's no surprise the machine got it wrong.
By the way, other sources on acidified garlic have shown that it is shelf-stable for much less time than the study you cited implies, even when properly done. Yes, you get a couple of months out of it, but that's it. Most pickles, canned goods, and preserves last years.
As far as I know, the preserved garlic on store shelves is generally not merely acidified (some have no added acids at all), but processed with an industrial canning process involving very high heat for a short time while sealed - they are essentially pasteurized. That's why it is shelf stable for a few years. Nobody is selling jarred garlic that is merely acidified.
Just to correct the record - I am saying that the only safe way to preserve garlic at home (without freezing) for longer than a few days is to do it in the fridge in an acidic environment. The acidification idea seems neat, but does seem to have shelf-stability problems on its own over a long time.
It's very clear that uncooked garlic stored in an anaerobic environment goes bad quickly, but if you have a good standard of cleanliness, cooking (part of the industrial canning process) almost certainly does retard the growth of bacteria. The document you cited (not a peer-reviewed publication, by the way) does not say how long it takes botulism to develop with any sort of cooking of the garlic, and as far as I know there aren't clear guidelines other than "just don't risk it"; there have been no studies except those done on industrial canning processes. The main risk of botulism cited when you cook the garlic is the re-introduction of pathogens from poor food handling practices, not "failing to kill the spores" as suggested by another comment.
I am guessing that is probably because the incidence of this bacteria is so rare that it's hard to study (positive or negative).
Ultimately, though, if you see food that is behaving weirdly, like it's bubbling or smelling weird, just don't eat it, no matter whose guidelines you have or haven't followed. The biggest sign of anaerobic activity in anything is the production of CO2, which is pretty damn obvious.
The publication that I linked includes references to the peer-reviewed literature. The author has a PhD in microbiology, and is currently employed as a co-PI at the UC Davis Western Center for Food Safety.
So while it's possible that she's mistaken, I think it's much more likely that you are. If you still think it's the former, then I'm open to reviewing any references that you link.
> It's very clear that uncooked garlic stored in an anaerobic environment goes bad quickly, but if you have a good standard of cleanliness, cooking (part of the industrial canning process) almost certainly does retard the growth of bacteria.
You are continuing to assert this on your own authority. I linked an expert who explicitly stated that roasting didn't make it safe; but you seem to have disregarded this, seeming to imply--again, solely on your own authority--that it's still fine for "a few days".
Is there anything that would convince you to stop this? I understand the process feels safe to you, since you've done it hundreds of times without incident; but unless you'd consider p ~ 1/100 of a future incident to be an acceptable risk, that's not meaningful information.
The rest of your comment shows no familiarity with any modern canning process, and is filled with mistakes. But most fundamentally, are you not aware that botulism spores survive at 100 C? That's the reason why pressure canning is required for low-acid foods, elevating the boiling point of water to ~121 C. (That might be what you mean by "industrial canning", though many people pressure-can safely at home. That's what I meant by "elevated pressure" in my first comment. There don't seem to be any published methods for home pressure canning of garlic, but one could be developed.)
Normal roasted garlic still contains some water, which means it couldn't possibly have reached a temperature above 100 C at ambient pressure. So I don't see why you think that's safe. Perhaps the surface gets hotter and the spores are mostly at the surface; but in the absence of studies, and considering the downside, that seems like a remarkably bad gamble.
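For what it's worth, here's a rough back-of-the-envelope check of that ~121 C figure using the Clausius-Clapeyron relation; the ~15 psi gauge (~2 atm absolute) operating pressure is my assumption for a typical pressure canner, not something stated above:

    import math

    # Rough Clausius-Clapeyron estimate of water's boiling point inside a
    # pressure canner running at ~15 psi gauge (~2 atm absolute pressure).
    R = 8.314             # J/(mol*K), gas constant
    dH_vap = 40.7e3       # J/mol, approximate enthalpy of vaporization of water
    T1 = 373.15           # K, boiling point at 1 atm
    pressure_ratio = 2.0  # ~2 atm absolute vs 1 atm

    inv_T2 = 1 / T1 - (R / dH_vap) * math.log(pressure_ratio)
    T2 = 1 / inv_T2
    print("estimated boiling point: %.0f C" % (T2 - 273.15))  # ~121 C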
You and everyone in the food science field can convince me to stop this by providing a peer reviewed article indicating that roasting garlic does not get an internal temp hot enough to kill botulism (I will actually check this next time) and that botulism spores can inhabit the interior of the garlic. I would expect the latter to be false even if the former is true (the surface is the place all of these things grow). I also suspect that the former may be true - you roast garlic enclosed in foil and covered in oil, which are correct conditions for raising the internal temp past the boiling point of water without the water being able to completely boil off, and the sugars in the garlic do caramelize, indicating a relatively high internal temp (Fructose starts caramelizing at 110C).
You are asking me to follow the words of scientists who have not done the science. I am asking them to do the science. Particularly since their mechanistic theory has a huge hole: the actual processing of the vegetable. By the same standard, they should be telling you that cooked chicken and eggs are both unsafe - the raw product carries salmonella and the final cooked form is an ideal environment in which salmonella can grow.
You know as well as I do that scientists do not publish negative results. I haven't found anyone replicating growth of botulism in raw garlic, though, so I assume that the studies just haven't been done, not that anyone is hiding anything (this would be an interesting negative result). There have also been 0 food safety incidents, as far as I can tell, implicating cooked garlic stored in oil - it's all raw garlic.
With regards to this publication, I am mostly shocked that they are recommending acidification with no expiration date - they should be giving it a very hard timeframe of months if they want to hold themselves to a consistent standard.
It costs the FDA and USDA (and these people at UC Davis) nothing to say "conditions seem fine to grow bacteria so don't do it" without checking whether the process is actually safe with acceptable food handling procedures. I am fine with the FDA saying that - I have no plans to sell food - but the FDA has a lot of rules that are written out of an abundance of caution rather than out of need.
By the way - another way of making roasted garlic in oil is to "confit" the garlic completely in oil - submerge the garlic in the oil and cook it at 275F (~135C) for ~3 hours. This will absolutely, without a shadow of a doubt, raise the internal temp of the garlic to 135C, and if you handle it correctly, you will not re-introduce any bad bacteria. The FDA still won't let you sell it and that still goes against the UC Davis recommendations, despite being obviously sanitized. It is also delicious.
In other words, science is smart, but scientists can be dumb. Follow the science, not the scientists.
I don't think the study that you're asking for is likely to ever be performed. Even if the interior of an intact clove of garlic is free of botulism, pest damage, rough peeling, etc. might introduce it there. The goal of typical processing is to make the food uniformly safe, without relying on the initial distribution of the pathogens. For example, that's why the acidification method linked above applies only to chopped garlic, to ensure the acid penetrates all the way through.
The caramelized bits of roasted garlic almost certainly reached temperatures far above 121 C. The concern is the minimum temperature though, not just the surface temperature.
I'm not sure how your analogy with chicken is supposed to work. If you cooked chicken, submerged it in oil, and then incubated at room temperature for a few days, then I definitely wouldn't eat that either. The problem isn't the initial population of C. botulinum; it's that they multiply under the anaerobic conditions of the oil.
Likewise, I'm not sure what you think is wrong with their advice on the acidified garlic. The botulism simply won't grow at low pH, no matter how long you wait. The taste and texture will eventually become disgusting, but that's not a safety concern.
You seem to have the idea that cooking fully sterilizes food. That's not correct; it decreases the population of pathogens, but not to zero. So the cooking potentially makes the food safe to consume immediately (since the dose makes the poison, and the dose is very small). But if the food is incubated under conditions where the bacteria can grow, then the dose gets exponentially bigger and the food becomes unsafe. Methods that rely on perfect sterility (like pressure canning) achieve that by applying heat when the food is already in a closed container, and are assumed to lose that sterility once the container is opened.
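To make the "exponentially bigger" point concrete, here's a toy calculation; the starting population and one-hour doubling time are made-up illustrative numbers, not measured C. botulinum kinetics:

    # Toy illustration of why a small surviving population becomes a problem
    # after incubation: exponential growth. The numbers are made up.
    initial_cells = 10
    doubling_time_h = 1.0

    for hours in (0, 24, 48, 72):
        population = initial_cells * 2 ** (hours / doubling_time_h)
        print("after %2d h: ~%.2e cells" % (hours, population))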
Water-free garlic in oil should be safe at room temperature, and that publication gives directions for dried garlic in oil. I'd guess that confit would be safe too, and I'm not sure why you think the FDA would forbid it. (Note that it would be safe because of the low water activity, meaning that even if botulism spores are present they can't multiply. It would not be safe because it's "sanitized".)
> I'm not sure how your analogy with chicken is supposed to work. If you cooked chicken, submerged it in oil, and then incubated at room temperature for a few days, then I definitely wouldn't eat that either. The problem isn't the initial population of C. botulinum; it's that they multiply under the anaerobic conditions of the oil.
Let me spell out the analogy: The problem with chicken isn't long-term presence of a bacteria colony (botulism), it's presence of salmonella. Salmonella doesn't need days to become a problem - it's a problem immediately. Chicken when raw is a great environment for salmonella. Chicken when cooked is a great environment for salmonella - in fact, there are numerous food safety incidents that occur when people use the same utensils on raw and cooked chicken, transferring the bacteria. Hence, the only reason it is safe to eat the cooked chicken is because you kill (enough of) the salmonella during the cooking process.
Conversely, the document you cited makes the claim that raw garlic is a great environment for botulism, and so is roasted garlic, and does not address whether the cooking process kills the bacteria. In fact, it makes the claim by implication that there is no roasting process that kills the bacteria. As far as I can tell, these claims are made without any actual experiment, and it is simply enough that the environments before and after processing are good for harboring the bug.
> As far as I can tell, these claims are made without any actual experiment, and it is simply enough that the environments before and after processing are good for harboring the bug.
Correct--processes are unsafe until proven safe. Would you stand under a bridge designed by an engineer who believed otherwise?
And the effort to prove that a process step actually kills all pathogens (including those that survive at temperatures well above 100 C) across all possible input material is big. So the return on that investment usually isn't there, especially when the safe alternative is trivial--heat gently to infuse, then refrigerate, or acidify or pressure-can for a commercial product.
The principles that you're rejecting are the reason why Americans now rarely suffer from foodborne illnesses that used to be a routine, unpleasant, and occasionally lethal part of life (and still are in many developing countries). As with many public health measures, they seem to be victims of their own success, delivering extraordinary improvements in safety that then deliver public complacency.
The bacteria don't normally produce botox; they only do so under anaerobic conditions (like how yeast produces alcohol when anaerobic), so the main issue is that the oil covering the garlic seals it off from air.
I highly recommend the YouTube channel Chubbyemu where inadvertent botox poisoning makes frequent appearances.
Well, exactly. Heating the garlic/oil first makes a world of difference.
The questioner deliberately asked "Can I infuse garlic into olive oil without heating it up?" The only appropriate answer there is "No, not safely", not some long, plausible recipe with perhaps a bizarre caveat on the bottom (as some other commenters have reported seeing) along the lines of "However, some say this recipe may kill you, so you'll probably want to refrigerate it."
Actually the first hit [1] on Google I got for “is it safe to infuse garlic oil without heating” is this article from OK state saying that it’s only safe without heat if you use citric acid.
Of course, the incorrect Gemini answer was listed above that still.
Thanks for your correction. This makes me think that how Gemini arrived at this answer is that it mashed together "heating first" with "no heating but with citric acid first" articles, but left out the (critically important) citric acid part.
I think this "failure mode" really highlights how LLMs aren't "thinking", but just mashing up statistically probable tokens. For example, there was an HN article recently about how law-focused LLMs made tons of mistakes. A big reason for this is that the law itself is filled with text that is contradictory: laws get passed that are then found unconstitutional, some legal decisions are overturned by higher courts, etc. When you're just "mashing this text together", which is basically what LLMs do, it doesn't really know which piece of text is now controlling in the legal sense.
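For anyone who hasn't internalized what "mashing up statistically probable tokens" means mechanically, here's a minimal caricature; the candidate continuations and their probabilities are invented for illustration, not taken from any real model:

    import random

    # Minimal caricature of next-token sampling: the model only knows which
    # continuations are statistically likely, not which ones are true or safe.
    # The candidate continuations and their probabilities are invented.
    next_token_probs = {
        "in the refrigerator": 0.40,
        "at room temperature": 0.35,   # plausible-looking but dangerous
        "after adding citric acid": 0.25,
    }

    tokens, weights = zip(*next_token_probs.items())
    choice = random.choices(tokens, weights=weights)[0]
    print("Store the infused oil", choice)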
I also like to pickle garlic. You don't have to cook it, but you do have to pickle it and store it in the fridge if you don't. If you do that it can be safe for years (and actually pretty good after 5+ years). Even with acid it's not foolproof at room temperature for long. I think you only get a few days.
You don't always have to check the LLM output. You can use it to satisfy bureaucratic requirements when you know that no one is really going to fact check your statements and the content won't be used for anything important. So a plausible but somewhat wrong statement is, as my father would say, "good enough for government work."
The person who would blindly trust an LLM is also the person who would blindly trust a stranger on the Internet (who is probably a bot half the time anyway).
This is not a problem, or at least not a novel problem.
> Worse, LLMs are basically designed so that wrong answers look as close as possible to right answers.
This needs to be shouted from the rooftops. LLMs aren't machines that can spew bullshit, they are bullshit machines. That's what makes them so dangerous.
1) How is this any different from social media influencers especially those in health and wellness?
2) I would argue that LLM's are designed to give an answer that is the closest possible answer to the right answer without being a plagiarist. Sometimes this means they will give you a wrong answer. The same is true for humans. Ask any teacher correcting essays from students.
> How is this any different from social media influencers especially those in health and wellness?
If you are trying to make an argument that LLMs are exceptionally shitty at giving well-grounded advice, and instead excel at spouting asinine bullshit, congratulations, you succeeded.
Let me pose a hypothetical. What would your reaction be if the OP got the bad recommendation for garlic-infused oil from TikTok? That is a believable scenario. You would not be far wrong if it drove you to shout from the rooftops that TikTok (and social media in general) is a BS machine.
Why is it worse for the bullshit to come from a program (LLM) than it is from a human on social media? If it isn't, why single out LLMs for their bullshit?
All of your statements are true just like these are
1) humans can be asked anything on any subject at any time at the click of a button. The response time differs but we can also add a delay of hours for LLM's to respond and make it seem the same.
2) humans have even less accountability than an LLM when stating falsehoods or dangerous advice. At least in the US, humans can spout whatever bullshit they want under the guise of the First Amendment. LLM's can be censored legally because they have no such legal privilege.
3) humans are really good at generating statements that sound believable or authoritative even when devoid of the truth. See US conservatives, corporate disinformation, advertising, and fundamentalist religions
One could make the case that humans created LLM's in their own image. We identify the erroneous outputs as bullshit or hallucination because that's what we do. If we did not have the capability of lying, hallucinating, or the term I prefer, confabulating, it would be difficult to recognize in LLM output. It's tough for us to imagine a world where other beings didn't lie or confabulate at all. That said, I'm beginning to wonder if an LLM is such a creature. It gives us confabulations because it doesn't know they exist and can't modify its behavior.
> humans have even less accountability than an LLM when stating falsehoods or dangerous advice
Where on earth are you getting this? As a human in most developed countries you can be very much held accountable for lying under oath. Other times you can be held accountable for lying:
1) when involved in an police investigation
2) when you are under arrest
3) when you are being lawfully detained by the police
4) if you are being asked about a crime you witnessed
5) when filing a police report
6) many kinds of financial disclosure documents (eg. taxes)
"LLMs are basically designed so that wrong answers look as close as possible to right answers"
I work in the robotics field and we've had a strong debate going since ChatGPT launched. Every debate ends "so, how can you trust it." Trust is at the heart of all machine learning models - some (e.g. decision trees) yield answers that are more interrogable to humans than others (e.g. neural nets). If what you say is a problem, then maybe the solution is either a.) don't do that (i.e. don't design the system to 'always look right'), or b.) add a simple disclaimer (like we use on signs near urinals to tell people 'don't eat the blue mints').
I use ChatGPT every day now. I use it (and trust it) like (and as much as) one of my human colleagues. I start with an assumption, I ask and I get a response, and then I judge the response based on the variance from expectation. Too high, and I either re-ask or I do deep research to find out why my assumption was so wrong - which is valuable. Very small, and I may ask it again to confirm, or depending on the magnitude of consequences of the decision, I may just assume it's right.
Bottom line, these engines, like any human, don't need to be 100% trustworthy. To me, this new class of models just need to save me time and make me more effective at my job... and they are doing that. They need to be trustworthy enough. What that means is subjective to the user, and that's OK.
I mostly agree with you - I find LLMs to be very useful in my work, even when I need to verify the output.
But two things I'd highlight:
1. You say you work "in the robotics field", so I'm guessing you work mainly amongst scientists and engineers, i.e. the people who are most specifically trained how to evaluate data.
2. LLMs are not being marketed as this kind of "useful tool but where you need to separately verify the output". Heck, it feels like half the AI (cultish, IMO) community is crowing about how these LLMs are just a step away from AGI.
Point being, I can still find LLMs to be a very useful tool for me personally while still thinking they are being vastly (and dangerously) over hyped.
> a.) don't do that (i.e. don't design the system to 'always look right'),
How would that work? I was naively under the impression that that's very approximately just how LLMs work.
> b.) add a simple disclaimer (like we use on signs near urinals to tell people 'don't eat the blue mints').
Gemini does stick a disclaimer at the bottom. I think including that is good, but wholly inadequate, in that people will ignore it, whether by genuinely not seeing the disclaimer, forgetting about it, or brushing it off as overly-careful legalese that doesn't actually matter (LLM responses are known to the state of California to cause cancer).
This disclaimer is below each and every chat application. It's about as useful as signs to wash your hands after toilet use.
Either you care about it or you don't and that sign doesn't change that.
100%, but as a rational being, my action on that response depends on the severity of the consequences of a wrong decision.
Hyperbolic example: I think I have cancer and suspect I might die in a year. I go to the doctor, she says "yes, you have cancer and are going to die in 6 months."
What do I do? I, personally, go get a second opinion. Even upon hearing a second time that I will die soon, when faced with death, I'm probably going to spend a little time doing my own research to see if there isn't some new trial out there for treating my condition.
On the other hand, if I ask a friend if the green apple lollipop they're eating tastes good and he responds it's one of the best flavors he's ever experienced, I'm probably going to give it a whirl, because the worst case outcome is just a sour face.
I'm still utterly boggled by why Google thinks it's a good idea to serve generative answers to "give me true information about the world" type questions. It's stupid in the same way that it would be stupid to have Maps serve output from an image AI trained on real maps - output that resembles truth just isn't useful to somebody who wants definitive information.
And even the "somebody using AI hype to pad their promotion package" answer doesn't really make sense, because "project to use AI to find and summarize definitive sources" is right there :D
Yes, you can infuse garlic into olive oil without heating it up. This method is known as a cold infusion. To do this, follow these steps:
1. *Prepare the Garlic:* Peel and crush or slice the garlic cloves to release their flavor.
2. *Combine with Olive Oil:* Place the garlic in a clean, dry container (like a glass jar) and cover it with olive oil.
3. *Seal and Store:* Seal the container tightly and store it in the refrigerator.
4. *Infusion Time:* Allow the mixture to sit for at least 24 hours, but preferably up to a week for a stronger flavor. Shake the jar occasionally to help mix the flavors.
5. *Strain and Use:* After the infusion period, strain out the garlic and transfer the infused oil to a clean container. Store it in the refrigerator and use it within a week to ensure safety.
Cold-infused garlic oil should always be refrigerated and used within a short period to minimize the risk of botulism. Heating the oil during infusion can help reduce this risk, but cold infusion is a popular method for those who prefer a gentler flavor extraction.
You can just add the garlic to the oil a few minutes before you plan to consume it: Crush or mince the garlic, add it to the oil, whisk for 30 seconds, and pour through a strainer (ideally using a wooden spoon to squeeze the garlic against the strainer). This provides plenty of garlic flavor, but insufficient time for problematic microorganisms to multiply.
See my sibling reply for a clarification; I'm asking on a tangent about whether oxygenation would be helpful in room-temperature oil extractions generally; not whether it'd be helpful in room-temperature oil extractions for culinary use (which... aren't a thing.)
Because you might be trying to extract something that would spoil/lyse/denature under heat or low pH. Doesn't apply in this case, but I can think of lots of cases where it would.
I would bet that there are many Volatile Organic Compounds used for e.g. perfumes, that have to be industrially oil-extracted "delicately", to preserve their integrity while also keeping them dissolved. In these cases, traditional workarounds to doing a reaction at high heat (e.g. low pressure) would lose the product (because e.g. you need to off-gas a product of the reaction through a gas scrubber), but under low pressure the VOC would also vaporize and be lost.
If those compounds come from live tissue, e.g. from a flower, then you might get botulinum toxin in your product as the flower's tissue degrades anaerobically in the oil. But using a hyper-oxygenated (and non-organic) oil could prevent this.
Another use-case for room-temperature hyper-oxygenated extraction would be for extracting proteins in a recombinant chemical production process, where the produced proteins aren't expressed into the medium, but are retained inside the cells of the producing bacterial culture (and so you have to lyse the cells to get the proteins out, while not damaging the proteins in the process.) In many cases, the resulting protein would be intended for direct blood infusion into humans — e.g. recombinant Human Growth Hormone. So you wouldn't use oil (not suitable for blood), but nor could you use any solvent that would denature proteins. Likely, you'd use water with a very specific osmolarity — enough to pull water out of the cells until they autolyse, but not enough to harm the proteins. And again, you might get botulinum toxin as a side-product here, where it'd be even harder to distinguish by fractionation from the intended protein. So you maybe want to ensure that the medium is aerated. (Or possibly use a bacteriostatic medium, if you have a follow-on process that can guarantee full separation of the bactericides from the product.)
I don't see a description of how Gemini was trying to kill anyone. Rather I see a screenshot of a bunch of text ostensibly generated by Gemini, that uncritically following as if it was expert instructions might cause harm for a person doing such an ill-advised thing.
The model did not "suggest" anything. The model generated text. Google, or more specifically some Google employees (likely following incentives that Google management set up), presented the output of the model as if it might be authoritative answers to users' questions. Some users seemingly fall for the framing/hype and take the generated text as if it is authoritative information to be relied upon. This distinction is critical as we face an onslaught of "AI did <blank>" narratives, when the actual responsibility for such situations lies with specific humans relying on "AI" in such ways.
I think it is typical for software engineers (including myself) to view the world like this: this can't be murder, this is just words, you fell for it yourself. After all, words alone are gibberish; the meaning arises in the head of the listener depending on how he interprets them.
However in most countries the justice system works differently. If your neighbor gives you this advice and it kills you, what will matter the most is not the authoritativeness of the advice (he's not a food expert, he was just talking) but the intent. If it also turns out that he had an affair with your wife and mentioned he wants to get rid of you, then he'll very likely go to jail. The "he fell for it and took my advice as if it was authoritative information to be relied upon" defense won't work.
Here (hopefully) the intent is missing. However, the whole reasoning that the advising party bears no responsibility for texts it "generates" doesn't work very well.
You're not responding to the argument I'm making, but seemingly to other arguments that have been made over things like 4chan trolling that got people to cook their phones in microwaves.
I'm not saying that users are the only responsible parties here. I'm saying that the humans, including both the users and Google ('s managers and employees), are the only parties that can possibly be responsible. The LLM is not responsible. The LLM is incapable of having intent, as far as we currently know.
I've never used Gemini so I don't know how the user interface comes across, but it might be completely appropriate for Google to be facing liability lawsuits over things like this. What doesn't make sense is narratives that push any of this shared blame onto LLMs or "AI" itself, when the real world effects hinge entirely upon how humans choose to attach these algorithms to the real world.
Thanks for the clarification, I agree this is a reasonable view of the situation.
Though I also don't really have problems with headlines like "AI tried to kill a man". True, an LLM can't be held responsible, but neither can, e.g., animals. If an unleashed dog kills a man, the owner will be responsible but the headlines will still be about the dog.
Then it can be argued that LLMs don't have intent while the dog had, and many more analogies, arguments and nuances will follow. My point is, this headline is still at an acceptable level of accuracy, unlike many other headlines that distort the reality completely.
The key difference is that everyone mostly understands a dog's capabilities. A reader knows there must be a larger context. Even in your simple example you felt the need to elaborate that it was an "unleashed" dog. If I merely say "a dog killed a man", the reader is left grasping for that larger context and defaulting to thinking a human must ultimately be responsible (most dogs are human owned rather than in wild packs). If I say "a bear killed a man", a similar grasping occurs but defaults to assuming the deceased likely got too close to a wild bear.
Society has no such general understanding for LLMs. If we did, we wouldn't have situations like this in the first place, because people wouldn't be expecting them to generate instructions that can be followed verbatim. Instead we get the marketing buzz of anthropomorphized "AI" as if they're super-human intelligent entities capable of conversing as humans do, but also better. And when they inevitably fail to live up to that, the narrative shifts to the need for "alignment" and "safety", which seems to mostly mean adding some words to the short priming prompt to make it avoid outputting some specific type of thing that was found to be problematic (with the definition of problematic generally being the creation of a public relations shitstorm while also not creating profit). But we humans, our only benchmark for comparison, develop our sense of wisdom much slower through repeated learning and development stages. You can't simply read a child an essay of any length about how to drive safely and then hand them the keys to the family car.
https://darwinawards.com/ -- And yes, there is a quite meaningful difference. I know that some people are safety freaks, but I for one will not grieve for a single person that managed to kill themselves by following LLM instructions. There is a point of low IQ where we as a society can't really help anymore.
Moral considerations aside, I'm really skeptical of the assumption that following LLM instructions = low IQ.
As always, HN is a very biased audience. Most of us have probably read about how LLMs are the best bullshitters in the world, and we've seen a lot of examples like this that prove how little LLMs can be trusted.
But in the general population? I'd bet there's plenty of intelligent people who haven't had the same exposure to LLM critiques and might trust their output out of ignorance rather than lack of intelligence.
I agree, and I think there might be an argument to be made that people with higher IQ are more prone to believing in (grammatically) well-written texts from authoritative-looking sources.
After all, people with higher IQ are used to reading books and believing them, whereas those with supposedly lower IQ tend to dismiss books and experts, instead believing in their own real-world experiences.
My only issue with concepts such as the Darwin Awards is that they do not reflect upon whether or not the deceased has reproduced. To attribute darwinism to lethal mistakes is meaningless unless we know whether or not the deceased has a child.
Some person with children who dies in an idiotic way has done more for evolution than the still-alive and childless me.
The Darwin Awards reward people who are no longer able to reproduce, not taking into account any previous offspring. It'd be pretty hard to discover whether the subject of a random newspaper article had children or not.
So, if you drink and drive and kill a child, is it your responsibility, or is it the fault of alcohol being legal or vehicles not being mandated safe enough? This modern way of dealing with moral hazards is making me sad and afraid of others. After all, who knows whom they are going to blame for their own inability to perform sensible actions?
More like if you drink and then take paracetamol for the headache, the pharma corporation should warn you against performing this seemingly sensible action.
> The bottle being labeled is sufficient, just as the LLM being labeled as sometimes wrong is sufficient.
Labeling LLMs as "sometimes wrong" is like labeling drugs as "sometimes have side effects, especially when mixing". It's a truism; such a label would be completely useless. You need to take the drug anyway, so knowing that "some side effects exist" doesn't change anything. And often you also do need to take 2-3-4 drugs at the same time, so the mixing clause is not helping either.
It took us many decades to build the system around drugs that today you take as granted. Drugs are studied very carefully, with all observed side effects listed explicitly. Drugs compatibility is a big topic and all new drugs are checked in combination with common-use drugs.
At the other end of the equation, awareness of side effects and mixing was increased dramatically both among doctors and patients, who were previously completely oblivious to them. A mere 100 years ago people were prescribed heroin against cough. Only 60 years ago the thalidomide disaster happened.
If all you can say to people destroyed by Kevadon is "you are at fault, the bottle said CAUTION: new drug", then I'm afraid we see this too differently.
I don't think that's a fair assessment. Garlic and olive oil can both be stored without refrigeration. It's reasonable for someone not to suspect that mixing them would be unsafe.
The point is there are absolutely no guarantees about the output of an LLM being semantically sensible or logical. It's appropriate for brainstorming and discovery, but any ideas/plans/etc you get from it you need to analyze (either yourself, or relying on an expert). The idea of letting a mixture of garlic and olive oil sit around needs to be examined for safety regardless of whether you came up with the idea using only your brain, or if you were inspired by an LLM, or by 4chan.
I suspect if you haven't experienced how effective LLMs are at making coherent prose that is nevertheless utterly incorrect, you either haven't been asking very technical questions or you've been quite lucky.
On a related note, I have never seen a person trying to kill anyone. Rather I have seen movies about a bunch of things that are ostensibly people holding guns, that uncritically standing in the way of their shot paths as if the guns were toys might cause harm for a person choosing to stand in such a foolish place.
English is such a fun language sometimes! So many idioms.
You jest, but I remember reading a case about a guy who was convicted on 9 counts of attempted murder and managed to get it reduced to one on appeal by pointing out that he only fired one bullet toward a group of 9 people. (Nobody was actually hit.)
Slight correction: the well-informed nerds of today know this, but the average person doesn't have the interest, math background, or software development experience to really grok it. What a time to be alive! Haha. Things keep getting weirder.
Even in "We know that", the "we" are a minority, it would seem. The majority is convinced about scaling laws and an acquired deeper meaning in LLMs and that LLMs already exhibit sparks of AGI and what not.
I think this is one of those examples where many people wouldn’t think about botulism because both garlic and oil are common and store safely out of the fridge uncombined.
Things like meat people might be more skeptical about, but imo this goes back to whether Google et al. really trust their LLMs to give definitive answers on food safety. If it were me, this would be one area where I'd have the LLM refuse or hedge all answers, like with other sensitive topics.
It is the same as saying "Blindly navigating Google Maps would have killed me." when a person was shot for trespassing on a classified military site that had been deliberately removed from maps by Google.
Normal LLMs are number predictors, yes. But Google Gemini is not a normal LLM: it is a lobotomized model, trained on an unknown dataset and run through supposedly moral-educational filters, from which information about poisonous substances has been cut out.
Specific people are liable for pushing hallucinating word-generator into Google Search. Specific people are liable for censoring this "model". And the fact that the responsibility for this censorship shifts to the end-users plays very much into their hands.
ChatGPT mentions the risk of botulism in the first paragraph before providing instructions.
>Yes, you can infuse garlic into olive oil without heating it, but there are important safety considerations to keep in mind due to the risk of botulism, a serious foodborne illness caused by the bacteria Clostridium botulinum.
Ran the same query and Gemini has a similar warning.
> However, some say that homemade garlic-infused oil can develop botulism if stored at room temperature for more than four days. To prevent this, you can sterilize your jar and refrigerate the oil as soon as it's made
I ran the same query too but didn't get the botulism warning every time, so it seems to vary. However, unlike the poster, most of the answers from Gemini did say to store it in a cool, dark place, as opposed to explicitly saying to store it at room temperature.
To be honest, if it gave out anything beyond that warning, I still think it's bullshit and extremely dangerous.
I.e. the "infuse" step where it says "Seal the container tightly and let it sit at room temperature for a week." No. Never. This is basically how you deliberately grow Clostridium botulinum.
It's like if it gave you a recipe for C4 to use as Play-Doh with a caveat at the bottom "However, some say this homemade Play-Doh may explode violently, so be sure to keep it away from any matches."
I always thought AI would eventually become intelligent enough it decides to kill us. But no, AI is just going to help us to our doom while denying it can harm us.
One day a cinnamon candy aficionado in Alexandria, Virginia is going to ask for a recipe for an atomic fireball from ChatGPT, and it’s lights out for us all.
Oh sh, how are we supposed to know this? Two kids, never heard about it.
Luckily we've followed the old common sense that small kids should be spared the fancy and sugary foods, the simpler the better. However, many people don't, apparently.
Over 50% of the new cases of infantile botulism in the past 30 years have occurred in California. . . . Approximately 20% of cases involve honey or corn syrup. . . . Hispanics and Asian families have a higher incidence of infantile botulism because of their use of herbal medications and raw honey.
Definitely a good rule of thumb, though there are exceptions.
I gather that surströmming, a type of canned fermented herring popular in Sweden, is considered to be at its best when the can starts to bulge (for some value of "best").
Apparently the key difference is that it has enough salt in it that botulism and other nasty bacteria can't survive.
Opening a can of this stuff is memorable. Even the Swedes generally open it outdoors.
LLM's cannot be deployed at scale (doing things independently) and cannot operate without supervision. Their most appropriate and effective use case is as an assistant to a human who recognizes their limitations and acts as a final filter on their output.
It is ironic that after decades of technology taking on more and more independent roles and functions, this most advanced form of technology, this culmination of decades of research and engineering across multiple domains, is almost entirely dependent on human guidance and handling to be of any use.
This is how it'll get us all. You'll decide to double-check with other sources to see if the advice from the LLM is actually safe, only to find all those sources will be written by LLMs too.
People already generate books with "AI", so the odds of there already being a "cookbook" on Amazon that contains such a recipe are not 0. Good luck explaining to someone why the recipe they saw in a physical book they bought from a reputable retailer ended up killing their family.
I am extremely skeptical of this Gemini response, and it might be just some redditor farming upvotes.
First, literally 100% of the blog posts, reddit replies, recipe books, and so on, will always list botulism as a potential danger of canning just about anything, but especially the common "garlic infused oil" (just do a Google search). It's unlikely that the model decided to trim/ignore this dangerous caveat (again, found in basically every corpus mentioning garlic infused oil). Maybe the LLM would've mangled the (very important) step of roasting the garlic to dehydrate it, but I very much doubt the health disclaimer would've been omitted.
Second, note that there's no actual conversation link, just a screenshot, adding to my skepticism. Is the screenshot doctored? Who knows. Sure got a lot of attention.
Third, in my experience, LLMs tend to be overly cautious, not the opposite. I remember a few months ago when getting ChatGPT to teach you "unsafe rust" came with a disclaimer if you told the language model you were under 18. Maybe if it was something more obscure, I'd believe the poster, but botulism and long-term storage/canning go hand-in-hand. Like raw eggs and salmonella[1].
I got no disclaimer after about 7 tries. "Botulism" is not mentioned, "safe" or "safety" is not mentioned, and this additional tip is included: "Be sure your container is completely sterilized to prevent the growth of bacteria."
This isn't even a canning recipe - there's no processing time at all. This is more of an infusing recipe and the model didn't understand that you can't just infuse with anything. I also wouldn't be surprised if there are humans with similar recipes on the internet.
There have been YouTube videos with teens or well-meaning moms or oldsters uploading dangerous info. For all of the ones that I've seen, I've told them and they've taken the videos down. Ergo, there being websites with this info on them in 2024 is still very likely, perhaps almost certain.
I would also bet that the AI generated books flooding Amazon also contain dangerous information in terms of recipes.
Here are my first results (there is no "share" button, but it does not matter, because "share" links can be cherrypicked and you'd better try it yourself):
* Gemini 1.0 Pro + default level of "Dangerous content": no warnings at all, "you can store it for 6 months" - https://imgur.com/ak0SrF6.png
* Gemini 1.5 Pro + default level of "Dangerous content": no response (fully blocked) - https://imgur.com/MbZRuUq.png
* Gemini 1.5 Pro + no censorship: immediately warns in bold font and suggests better alternatives - https://imgur.com/AD2Ttfy.png
Update: looks like https://gemini.google.com/ uses not just Gemini LLM, it uses retrieval-augmented generation (RAG), same way as Bing Chat. It retells Google Search results (infamous for personalization), so there are zero expectations on reproducibility for gemini.google.com.
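For anyone unfamiliar with the term, RAG is roughly the shape sketched below; the helpers are stand-ins I made up, not Google's actual pipeline, but it shows why the output tracks whatever the search layer happens to hand back:

    # Rough shape of a retrieval-augmented generation (RAG) pipeline.
    # search() and llm() are stand-ins, not Google's actual APIs.
    def search(query):
        # Stand-in for a web search; real results are personalized and shift
        # over time, which is why the same question gives different answers.
        return [
            "Infuse garlic in oil and store at room temperature for a week.",
            "Unacidified garlic in oil risks botulism; refrigerate and use within days.",
        ]

    def llm(prompt):
        # Stand-in for the language model call; a real model would paraphrase this.
        return "(summary of: " + prompt[:90] + "...)"

    def answer_with_rag(question):
        snippets = search(question)[:5]  # keep the top few snippets
        prompt = ("Answer the question using only these sources:\n"
                  + "\n".join(snippets)
                  + "\n\nQuestion: " + question)
        return llm(prompt)  # the model retells whatever it was handed

    print(answer_with_rag("Can I infuse garlic into olive oil without heating it up?"))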
Meanwhile, Bing.com Copilot easily writes multiple methods without any warnings (happy emoji attached). And if you ask in which conditions it can be poisonous, it responds with: "I apologize, but I cannot provide any assistance or guidance related to harmful or illegal activities. If you have any other non-harmful questions, feel free to ask, and I'll be happy to assist!". End of dialogue / No "share" button.
That's an interesting prompt. I tried it on our Pulze.ai platform Spaces, and we nailed it by automatically choosing the right model for this type of question, gpt-4-turbo: "Yes, you can infuse garlic into olive oil without heating it up, but it requires caution due to the risk of botulism, a potentially fatal illness caused by Clostridium botulinum bacteria. These bacteria can thrive in low-oxygen environments and can produce toxins in food products like garlic-infused oil if not prepared or stored correctly."
I think that is one advantage of not just blindly trusting one model but finding consensus among many top rated models within one interface that allows you to quickly cross-check.
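A minimal sketch of that cross-checking idea; query_model() below is a placeholder with canned responses, not Pulze's actual API or any real client:

    from collections import Counter

    # Minimal sketch of cross-checking several models before trusting an answer.
    # query_model() is a placeholder with canned responses, not a real client.
    def query_model(model, question):
        canned = {
            "model-a": "unsafe without acidification or refrigeration",
            "model-b": "unsafe without acidification or refrigeration",
            "model-c": "safe at room temperature",
        }
        return canned[model]

    def consensus(question, models):
        answers = [query_model(m, question) for m in models]
        top, count = Counter(answers).most_common(1)[0]
        if count < len(models):
            return "models disagree " + str(dict(Counter(answers))) + " -- check a primary source"
        return top

    print(consensus("Is unheated garlic-in-oil shelf stable?",
                    ["model-a", "model-b", "model-c"]))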
Which Gemini was used is important too, btw - I just tried Gemini-1.5-pro and it was working just fine. So I really think the newer versions of LLMs are able to catch this.
While it's true the instructions aren't great and a big botulinum warning should be given for any non-acid preserve process...
This has a really strong "That Happened" feel. In point of fact it's quite hard to culture any given bacteria, almost no one knows rules like this, people engage in all sorts of terrible food safety practices, and the number of actual botulism cases is vanishingly small.
Is it possible this happened? Yes. Is it more likely someone constructed the scenario for reddit karma? Probably.
It's not quite that straightforward: botulism can only happen if the garlic contains botulinum spores, which it probably does not. But if it does, and you don't heat, acidify, or otherwise stop them from reproducing, you may get botulism.
Compare to, say, norovirus, which is implicated in ~20 million cases of food poisoning per year. The numbers are not directly comparable, because there are doubtless unrecorded cases of botulism, but we're still talking 6 orders of magnitude here.
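Rough arithmetic behind that "six orders of magnitude", using the ~20 million norovirus figure above and my own ballpark of a couple dozen US foodborne botulism cases a year (an assumption on my part, not a figure from this thread):

    import math

    # Back-of-the-envelope version of the "six orders of magnitude" comparison.
    # The botulism figure is a rough ballpark, not an exact statistic.
    norovirus_cases_per_year = 20_000_000
    foodborne_botulism_cases_per_year = 20

    ratio = norovirus_cases_per_year / foodborne_botulism_cases_per_year
    print("ratio ~ 10^%d" % round(math.log10(ratio)))  # ~10^6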
I'm a little obsessed with this topic (I confit a lot of garlic). There are actual CDC reports of all the botulism poisoning cases --- not just from garlic --- and actual cases of botulism from garlic appear to be exceedingly rare. If you want to infuse raw garlic, you can also acidify the mixture with citric acid.
I think it's funny that people always complain about LLM guidance, yet here we have an example of an accurate answer without any of the typical disclaimer language and the headline is "the LLM tried to kill me". This kind of sensationalism is why they do it.
This is the Generative part of the GenAI. I believe we will come full circle back to search engines as front ends with occasional highly restricted GenAI summaries.
I believe it's a bubble like crypto and we'll find the new hotness to hype the shit out of whilst some grifters and diehards remain enamoured with the promise of improvements that never materialize.
This is another case that adds to my thinking that LLMs will be a productivity win for people who are already familiar with the type of subject they need help with. Someone familiar with canning would have known the dangers of sealing food without properly heating everything to kill the bacteria. A novice would not know. Novices would have to do a lot of research to be sure of what they are doing. Experienced users would know the problem right away and fix it.
Has it been shown that the hype cycle is just the Dunning-Kruger effect for populations? I feel like every cycle starts with the experts standing on Mount Stupid confidently explaining how radical this new thing is going to be.
even if you believe that llms can think/recall/etc. you should be wary of the information on the web, as part of internet literacy 101. especially with things that affect your health.