Are we in an AI Overhang?

polytely · on July 27, 2020

I've been wondering about something: (I only know the basics of AI, this might be kinda incoherent)

Right now if you look at GTP-3's output it seems like it's approaching a convincing approximation of a bluffing college student writing a bad paper, correct sentences and stuff but very 'cocky'. It cannot tell right from wrong, and it will just make up convincing rubbish, 'hoping' to fool the reader. (I know I'm anthropomorphizing but bear with me).

Current models are being trained on a huge amount of internet text. As smarmy denizens of hackernews we know that people are very often wrong (or 'not even wrong') on the internet. It seems to me that anything trained on internet data is kinda doomed to poison itself on the high ratio of garbage floating around here?

We've seen with a lot of machine-learning stuff that biased data will create biased models, so you have to be really careful what you train it on. The dataset on which GTP-n has to be trained has to be pretty huge(?); and moderation is hard(?) and doesn't scale; it's easier to generate falsehood than truth; and the further we go along the more of internet data will be (weaponized?) output of GTP-(n-1); So won't the arrival of AGI just be sabotaged by the arrival of AGI?

Has anyone written something about the process of building AGI that deals with this?

simonh · on July 27, 2020

I don't think the problem is really with the data set, it's that GTP-3 doesn't actually have any understanding of the data. It's building a model of the text input to generate text output. It's not building a model of what the data means, or what it represents.

When GTP-3 writes a scientific paper it's not trying to test a premise or critically evaluate some data, it's trying to generate text that looks like that sort of thing.

It doesn't matter how many excellent quality papers you fed it, or how stringently you excluded low quality input data, it would still only be trying to produce facsimiles. It wouldn't be actually trying to do the things a real conscientious scientist is trying to do when they write a paper. Arguably at best it might be trying to do what a deceitful scientist trying to get credit for a paper with spurious results with no actual scientific merit might be doing when writing a paper that looks plausible, but thats actually a completely different activity.

The code generation examples are really interesting. Here it's being used to generate real working code that has actual value, and it appears to work pretty well. It's code often has bugs, but who's doesn't? It only works for fairly short precisely definable coding tasks though. I don't think scaling it up to more complex coding problems is going to work. Again it doesn't understand the meaning of anything. It's trying to produce code that looks like working code, not actually solve the programming task you're giving it. It doesn't even know what a program is or what a programming task is. It doesn't know what input and output are. For example you can't ask it to modify existing code to change it's behaviour. It doesn't know code has behaviour. It has no idea what that even means and has no way to find out or any route to gaining that capability, because that's not a text transformation task and all it does is transform text.

stupidcar · on July 27, 2020

I've seen this objection raised a lot, but I think it betrays a misunderstanding of what GPT-3 is capable of doing.

The best, in fact the only way to generate truly convincing text output on most subjects is to understand, on some level, what you're writing about. In other words, to create a higher level abstraction than simply "statistically speaking, this word seems to follow that one". Once you start to encode that words map to concepts, you can use the resulting conceptual model to create output which is conceptually consistent, then map it backwards to words. There is what humans do with sensory data, and there is good evidence that GPT-3 is doing this too, to some degree.

Take simple arithmetic, such as adding two and three digit numbers. GPT-2 could not do this very successfully. It did indeed look like it was treating it as a "find the textual pattern" problem.

But GPT-3 is much more successful, including at giving correct answers to arithmetic problems that weren't in its training set.

So what changed? We aren't sure, but the speculation is that in the process of training, GPT-3 found that the best strategy to correctly predicting the continuation of arithmetic expressions was to figure out the rules of basic arithmetic and encode them in some portion of its neural network, then apply them whenever the prompt suggested to do so.

If this is the case, and it remains speculation at this point, would you still argue that GPT-3 doesn't "understand" arithmetic, on some level? I would argue that this abstraction, this mapping of words onto higher-level concepts, which can then be manipulated to solve more complex problems, is exactly what intelligence is, once you strip away biologically-biased assumptions.

Certainly, at this point GPT-3's conceptual understanding remains somewhat primitive and unstable, but the fact that it exhibits it at all, and sometimes in spookily impressive ways, is what has people excited and worried. We have produced AIs that can perhaps think conceptually about relatively narrow topics like playing Go, but we have never before created one that can do so one such a wide range of topics. And there is no suggestion that GPT-3's level of ability represents a maximum. GPT-4 and beyond will be more powerful, meaning that it can mine more and more powerful conceptual understanding from their training data.

bo1024 · on July 27, 2020

> We aren't sure, but the speculation is that in the process of training, GPT-3 found that the best strategy to correctly predicting the continuation of arithmetic expressions was to figure out the rules of basic arithmetic and encode them in some portion of its neural network

I don't mean to attack you personally, but this is a perfect example of what I feel is wrong with so much neural network research. (And I understand that you are just commenting in a discussion, not conducting research.)

In a word, it's baloney. And it's a really common pattern in neural networks' recent history: "How did they perform reasonably well on this task? We aren't sure, but the speculation is that they magically solved artificial general intelligence under the hood." Usually this is followed up by "I don't know how it works, but let's see if a bigger network can make even prettier text." Meanwhile, "it's funny how our image classifiers grossly misperform if you rotate the images a little or add some noise."

A rigorous scientific approach would be aimed at actually figuring out what these models can do, why, and how they work. Rather than just assuming the most optimistic possible explanation for what's happening -- that's antithetical to science.

glenstein · on July 27, 2020

>In a word, it's baloney.

This is where you lost me. They included important caveats to indicate not being sure, which is important to me as an indication of healthy skepticism. And you substituted a specific example: making inferences about arithmetic, for an more expansive, uncharitable, easy-to-caricature claim of "gee we must have solved general AI!" which is much easier to attack. And, unlike your counterpart, who hedged, you just went ahead and categorically declared it to be baloney, making you the only person to take a definitive side on an unsettled question before the data is in. This is a perfect example of the anti-scientific attitude exhibited in Overconfident Pessimism [0].

I don't think it's known how GPT-3 got so much better at answering math questions it wasn't trained on, I do think the explanation that it made inferences about arithmetic is reasonable, I think the commenter added all the qualifiers you could reasonably ask them to make before suggesting the idea, and frankly I would disagree that there's some sort of obvious history of parallels that GPT-3 can be compared to.

There is an interesting conversation to be had here, and there probably is much more to learn about why GPT-3 probably isn't quite as advanced as it may immediately appear to be to those who want to believe in it. But I think a huge wrench is thrown in that whole conversation with the total lack of humility required to confidently declare it 'baloney', which is the thing that sticks out to me as antithetical to science.

0: https://www.lesswrong.com/posts/gvdYK8sEFqHqHLRqN/overconfid...

bo1024 · on July 28, 2020

Thanks for your reply. A couple responses to advance the conversation.

As a side note, it's worth mentioning that apparently, from other responses, it seems we have little idea how much arithmetic GPT-3 has learned, and it may not be much.

Anyway, I think the important distinction between my perspective and Overconfident Pessimism, which you attribute to me, is that I'm not talking about (im)possibility of achievement, I'm talking about scientific methodology or lack thereof.

In other words, I'm not saying (here) that some NLP achievements are impossible. I'm saying that we are not rigorously testing, measuring, and verifying what we are even achieving. Instead we throw out superficially impressive examples of results and invite, or provoke, speculation about how much achievement probably must have maybe happened somewhere in order to produce them.

We have seen several years of this pattern, so this is not a GPT-3 specific criticism; it's just that particular quote so neatly captured patterns of lack of scientific rigour that we have seen repeatedly at this point.

Probably the first example was image recognition. Everyone was amazed by how well neural nets could classify images. There was a ton of analogous speculation -- along the lines of 'we're not sure, but the speculation is the networks figured out what it really means to be a panda or a stop sign and encoded it in their weights.' The terms "near-human performance" and then "human-level performance" were thrown around a lot.

Then we found adversarial examples and realized that e.g. if you rotate the turtle image slightly, the model becomes extremely confident that it's a rifle. So, obviously it has no understand of what a turtle or a rifle is. And obviously, we as researchers don't understand what those neural nets were doing under the hood, and that speculation was extremely over-optimistic.

Engineering cool things can absolutely be a part of a scientific process. But we have seen countless repetitions of this pattern (especially since GANs): press releases and impressive-looking examples without rigorous evaluation of what the models are doing or how; invitations to speculate on the best-possible interpretation; and announcing that the next step is to make it bigger. I think this approach is both anti-science and misleading to readers.

JoshuaDavid · on July 27, 2020

> And it's a really common pattern in neural networks' recent history: "How did they perform reasonably well on this task? We aren't sure, but the speculation is that they magically solved artificial general intelligence under the hood." Usually this is followed up by "I don't know how it works, but let's see if a bigger network can make even prettier text."

Layperson here, but my impression is that "let's see if a bigger network can make even prettier text" has _worked_ far beyond the point most people expected it would stop working.

Also my layperson impression: most "researchers" that are on the cutting edge of cool things are more interested in seeing what cool things they can do than on doing rigorous science (which makes sense -- if you optimize for rigorous science, your stuff probably isn't as flashy as the stuff produced by people optimizing for flash).

lowdose · on July 28, 2020

> what cool things they can do than on doing rigorous science (which makes sense -- if you optimize for rigorous science, your stuff probably isn't as flashy as the stuff produced by people optimizing for flash.

Is this a new iteration on that zigzag quote?

> Zak phases of the bulk bands and the winding number associated with the bulk Hamiltonian, and verified it through four typical ribbon boundaries, i.e. zigzag, bearded zigzag, armchair, and bearded armchair.

From "The existence of topological edge states in honeycomb plasmonic lattices"

https://iopscience.iop.org/article/10.1088/1367-2630/18/10/1...

atomicity · on July 27, 2020

I don't know if this means something is truly wrong. AI is a mix of engineering and scientific research, just like most CS subfields. Recently, the emphasis has shifted towards engineering, as the applications of neural nets have skyrocketed after a few breakthroughs in performance.

It's similar to computer systems research. For example, a research paper on filesystems might tell us a simple trick which leads to better performance on NVMM. The paper may go into why the trick works, but it doesn't (and shouldn't need to) generalize and try to improve our general understanding of how to design filesystems on different hardware. We've been designing filesystems to this day and well, we are always still guessing about which approaches to use and hoping for the best. In the same vein, we don't even have a widely-accepted theory of how to use data structures yet.

So, I don't think that neural nets aren't scientific enough means that it's all BS. We have gaps in understanding, but the power of the models warrants a lot of continued work on finding useful applications.

Doesn't mean I don't think AI is over-hyped/overfunded though...

bo1024 · on July 28, 2020

I agree with a lot of this, but I think there is a consistent pattern of AI announcements playing on humans' intuitions to create the impression that much more has been achieved than can actually be proven -- in fact, not even trying to prove anything. Part of this is that the researchers are humans too and may be misled themselves. But a rigorous research process would at least try to prevent that.

For example, people once thought playing chess was hard. So they thought that if a computer could beat the world champion, then computers would probably also be able to replace every job and so on. If you sent Deep Blue back in time to the 1960s, they wouldn't understand how it works so they'd probably assume that it since it could beat Petrosian in chess, it could probably drive cars and treat disease.

But then we built Deep Blue and realized that you don't need AGI to play chess; a very specialized algorithm will do it.

So we're like people in the 70s who've been handed Deep Blue. It's irresponsible, in my opinion, to over-hype it when we have no idea how it works.

dragongod2718 · on July 28, 2020

Wait, you think AI is overfunded?

cma · on July 27, 2020

> Meanwhile, "it's funny how our image classifiers grossly misperform if you rotate the images a little or add some noise."

Same thing arguably happens with humans with rotation. Our eyes even rotate in the roll axis to keep gravity aligned things upright. Most people can draw faces more accurately when copying from an upside down face than a right side up one.

confuseshrink · on July 27, 2020

> So what changed? We aren't sure, but the speculation is that in the process of training, GPT-3 found that the best strategy to correctly predicting the continuation of arithmetic expressions was to figure out the rules of basic arithmetic and encode them in some portion of its neural network, then apply them whenever the prompt suggested to do so.

I saw a lot of basic arithmetic in the thousands range where it failed. If we have to keep scaling it quadratically for it to learn log n scale arithmetic then we're doing it wrong.

I'm surprised you think it learned some basic rules around arithmetic. A lot of simple rules extrapolate very well, into all number ranges. To me it seems like it's just making things up as it goes along. I'll grant you this though, it can make for a convincing illusion at times.

coryfklein · on July 28, 2020

> To me it seems like it's just making things up as it goes along.

Oh, aren’t we all?

simonh · on July 27, 2020

The example of simple arithmetic is interesting. I think you might be right, that is evidence that GPT-3 might be generating what we might consider models of it's input data. Very simple, primitive and fragile models, but yes that's a start. Thank you.

klipt · on July 27, 2020

A disembodied AI with a really good model might be able to do good theoretical science, but it would still need a way of acting in the physical world to do experimental science.

Veedrac · on July 27, 2020

With a sufficiently effective language model it would be fairly easy to bridge this model, by letting the text direct humans on the other side.

    Hypothesis: <AI writes this>
    Results: <human observations>
    <repeat>

endgame · on July 27, 2020

That seems like a _very bad_ habit to get humans into.

falcor84 · on July 29, 2020

Well, we already have tons of examples of computers telling humans what to do, e.g. autogenerated emails alerting a human to handle an issue.

The novel Manna explores where this can lead quite nicely - http://www.marshallbrain.com/manna1.htm

_greim_ · on July 27, 2020

This strikes me as very similar to the debate around the Chinese Room.

https://plato.stanford.edu/entries/chinese-room/

joefourier · on July 27, 2020

I would love to talk to someone who actually believes in the Chinese Room argument. To me it seems to be ignoring the existence of emergent behavior, and the same argument could prove that a human Chinese speaker doesn't understand Chinese either: his neurons are just reacting to produce answers depending on the input and their current state (e.g. neurotransmitters and action potentials).

dragonwriter · on July 28, 2020

The Chinese Room argument is fairly transparently circular; if you assume understanding involves something more than applying a sufficiently complex set of deterministic rules, then a pure system of deterministic rules cannot ever achieve understanding.

Of course if you accept the required premise of the argument, you must accept that either, one, we don't live in a universe that is a pure system of deterministic rules, or, two, nothing in the universe can have true understanding.

The Chinese Room argument, scientific materialism, or the existence of true understanding—you can have at most two of those in a consistent view of the universe.

smallnamespace · on July 27, 2020

John Searle came up with that argument to conclude that despite a hypothetical Chinese room being able to have a conversation with someone, it doesn't truly have understanding, so N seems to be at least 1.

To your point though, the more interesting case is people who would disavow the Chinese Room argument, but then end up using reflecting its views while argue against the intelligence of this or that system.

_greim_ · on July 28, 2020

Practically everyone in my online bubble feels similarly, it seems, though I do think steelmanning it is a great way to explore the topic. Same with the Mary's Room argument.

https://plato.stanford.edu/entries/qualia-knowledge/

slowmovintarget · on July 27, 2020

Peter Watts explores this in the novel Blindsight. I don't want to give the plot away, but the main idea is really interesting, and relevant to this discussion.

Baeocystin · on July 27, 2020

I'm posting to second the recommendation for the novel. It is the most interesting exploration of the Mind's I (not a typo) that I've come across in modern sci-fi.

It can be read in its entirety at the author's site: https://rifters.com/real/Blindsight.htm

tinmandespot · on July 27, 2020

Threads like these are the reason why I keep coming back to HN !

_greim_ · on July 28, 2020

Recently finished my second read of Blindsight. Enjoyed it more the second time than the first.

vivekkalyan · on July 28, 2020

> We aren't sure, but the speculation is that in the process of training, GPT-3 found that the best strategy to correctly predicting the continuation of arithmetic expressions was to figure out the rules of basic arithmetic and encode them in some portion of its neural network, then apply them whenever the prompt suggested to do so.

I strongly disagree. GPT-3 has 100% accuracy on 2-digit addition, 80% on 3-digit addition, 25% on 4-digit addition and 9% on 5-digit addition. If it could indeed "understand arithmetic" the increase in number of digits should not affect its accuracy.

My perspective as an ML practitioner is that the cool part of GPT-3 is storing information effectively and it is able to decode queries easier than before to get the information that is required. Yet with things like arithmetic, the most efficient way would be to understand the rules of addition but the internal structure is too rigid to encode those rules atm.

Jack000 · on July 27, 2020

I don't think training on language by itself is enough. Consider for example, if we found extraterrestrial transmissions from an alien civilization. We don't know what they look like, what they're made of or if they even have corporeal form. All we have is a large quantity of sequential tokens from their communications.

It's possible to train GPT3 to produce a facsimile of these transmissions, but doing so does not let us learn anything at all about these aliens, beyond statistical correlations like ⊑⏃⟒⍀ often occurring in close proximity to ⋏⟒⍙⌇ (what do they represent - who knows?). Just having the text is not enough, because we have no understanding of the underlying processes that produced the text.

That said, this is only a limitation of language models as they currently exist. I imagine it would be possible to train a ML model that encodes more of the human experience via video/audio/proprioception data.

hackinthebochs · on July 27, 2020

I wouldn't be so sure we couldn't decode the meaning of an alien language given enough sample text. There have been some advances[1] towards learning a translation between two human languages in an unsupervised manner, meaning without any (language1, language2) sentence pairs to serve as the ground truth for building a translation. Essentially it independently learns abstract representations of the two languages from written text in each language, all the while nudging these abstract representations towards identical feature spaces. The result is a strong translation model trained without utilizing any upfront translations as training data.

The intuition behind this idea is that the structure inherent in a language is dependent upon features of the world being described by that language to some degree. If we can abstract out the details of the language and get at the underlying structure the language is describing, then this latent structure should be language-independent. But then translation turns out to simply be a matter of decoding and encoding a language to this latent structure. One limitation of this idea is that it depends on there being some shared structure that underlies the languages we're attempting to model and translate. It's easy to imagine this constraint holds in the real world as human contexts are very similar regardless of language spoken. The basic units and concepts that feature in our lives are more-or-less universally shared and so this shared structure provides a meaningful pathway to translation. We might even expect the world of intelligent aliens to share enough latent structure from which to build a translation given enough source text. The laws of physics and mathematics are universal after all.

[1] https://openreview.net/pdf?id=rkYTTf-AZ

tsimionescu · on July 27, 2020

This doesn't really make any sense to me. Shared structure is not enough to assume shared meaning. Even given the idea of Universal Grammar (which seems extremely likely, given the interchangeablility of human languages for babies), that tells us nothing about the actual words and their association with the human world.

Take the sentence 'I fooed a bar with a Baz' - can you infer what I did from this?

hackinthebochs · on July 27, 2020

>Shared structure is not enough to assume shared meaning.

How do you define meaning? If we can find a mapping between the sequence of words in a language and the underlying structure of the world, then we by definition know what those words mean. The question then reduces to whether there will be multiple such plausible mappings once we have completely captured the regularity of a "very large" sequence of natural language text. I strongly suspect the answer is no, there will only be one or a roughly equivalent class of mappings such that we can be confident in the discovered associations between words and concepts.

The number of relationships (think graph edge) between things-in-the-world is "very large". The set of possible relationships between entities is exponential in the size of the number of entities. But the structure in a natural language isn't arbitrary, it maps to these real world relationships in natural ways. So once we capture all the statistical regularities, there should be some "innocent" mapping between these regularities and things-in-the-world. "Innocent" here meaning a relatively insignificant amount of computation went into finding the mapping (relative to the sample space of the input/output).

>Take the sentence 'I fooed a bar with a Baz' - can you infer what I did from this?

Write me a billion pages of text while using foo bar baz and all other words consistently throughout, and I could probably tell you.

TomSwirly · on July 28, 2020

You're relying on having texts covering all aspects of a language.

Here's a good example. Suppose I had a huge set of recipe books from a human culture - just recipes, no other record of a culture.

I might be able to get as far as XYZZY meaning "a food that can be sliced, mashed, fried and diced" but how would I really tell if XYZZY means carrot, potato, or tomato, or tuna?

mannykannot · on July 28, 2020

> If we can find a mapping between the sequence of words in a language and the underlying structure of the world, then we by definition know what those words mean.

This seems a bit tautological to me: if being able to make certain mappings is understanding, then does this not amount to "once we understand something, understanding it is a solved problem?"

On the other hand, the apparently simplistic mappings used by these language models have achieved way more than I would have thought, so I am somewhat primed to accept that understanding turns out to be no more mysterious than qualia.

I doubt that just any mapping will do. One aspect of human understanding that still seems to be difficult for these models is reasoning about causality and motives.

I think it is a fairly common intuition that one cannot understand something just by rote-learning a bunch of facts.

hackinthebochs · on July 28, 2020

>This seems a bit tautological to me

I meant it in the sense of: given any reasonable definition of "understanding", finding a mapping between the sequences of words and the structure of the world must satisfy the definition.

>I think it is a fairly common intuition that one cannot understand something just by rote-learning a bunch of facts.

I agree, but its important to understand why this is. The issue is that learning an assignment between some words and some objects misses the underlying structure that is critical to understanding. For example, one can point out the names of birds but know nothing about them. It is once you can also point out details about their anatomy, how they interact with their environment, find food, etc and then do some basic reasoning using these bird facts that we might say you understand a lot about birds.

The assumption underlying the power of these language models is that a large enough text corpus will contain all these bird facts, perhaps indirectly through being deployed in conversation. If it can learn all these details, deploy them correctly within context, and even do rudimentary reasoning using such facts (there are examples of GPT-3 doing this), then it is reasonable to say that the language model captures understanding to some degree.

tsimionescu · on July 27, 2020

Ok, I think I'm getting some of your idea more clearly. Essentially, the observation is that we probably can't consistently replace the words in a novel with other words without preserving the meaning of the novel (consistently meaning each word is always replaced with the same other word).

I think the biggest problem with this argument is the assumption that '[the structure in a natural language] maps to these real world relationships in natural ways'. One thing we know for sure is that human language maps to internal concepts of the human mind, and that it doesn't map directly to the real world at all. This is not necessarily a barrier to translation between human languages, but I think it makes the applicability of this idea to translations between human and alien languages almost certainly null.

Perhaps the most obvious aspect of this is any word related directly to the internal world - emotions, perceptions (colors, tastes, textures etc) - there is no hope of translating these between organisms with different biologies.

However, essentially any human word, at least outside the sciences, falls in this category. At the most basic level, what you perceive as an object is a somewhat arbitrary modeling of the world specific to our biology and our size and time scale. To a being that perceived time much slower than us, many things that we see as static and solid may appear as more liquid and blurry. A significantly smaller or larger creature may see or miss many details of the human world and thus be unable to comprehend some of our concepts.

Another obstacle is that many objects are defined exclusively in terms of their uses in human culture and customs - there is no way to tell the difference between a sword, a scalpel, a knife, a machete etc unless you have an understanding of many particulars of some specific human society. Even the concept of 'cutting object' is dependent on some human-specific perceptions - for example, we perceive a knife as cutting bread, but we don't perceive a spoon as cutting the water when we take a spoonful from a soup, though it is also an object with a thin metal edge separating a mass of one substance into two separate masses (coincidentally, also doing so for consumption).

And finally, even the way we conceive mathematics may be strongly related to our biology (given that virtually all human beings are capable of learning at least arithmetic, and not a single animal is able to learn even counting), possibly also related to the structure of our language. Perhaps an alien mind has come up with a completely different approach to mathematics that we can't even fathom (though there would certainly be an isomorphism between their formulation of maths and ours, neither of our species may be capable of finding it).

And finally, there are simply so many words and concepts that are related to specific organisms in our natural environment, that you simply can't translate without some amount of firsthand experience. I could talk about the texture of silk for a long while, and you may be able to understand roughly what I'm describing, but you certainly won't be able to understand exactly what a silkworm is unless you've perceived one directly in some way that is specific to your species, even though you probably could understand I'm talking about some kind of other life form, it's rough size and some other details.

hackinthebochs · on July 27, 2020

>human language maps to internal concepts of the human mind, and that it doesn't map directly to the real world at all.

I disagree. Mental concepts have a high degree of correlation with the real world, otherwise we could not explain how we are so capable of navigating and manipulating the world to the degree that we do. So something that correlates with mental concepts necessarily correlates with things-in-the-world. Even things like emotions have real world function. Fear, for example, correlates with states in the world such that some alien species would be expected to have a corresponding concept.

>There is no way to tell the difference between a sword, a scalpel, a knife, a machete

There is some ambiguity here, but not as much as you claim. Machetes, for example, are mostly used in the context of "hacking", either vegetation or people, rather than precision cuts of a knife or a scalpel. These subtle differences in contextual usage would be picked up by a strong language model and a sufficient text corpus.

tsimionescu · on July 28, 2020

> Mental concepts have a high degree of correlation with the real world, otherwise we could not explain how we are so capable of navigating and manipulating the world to the degree that we do.

This is obviously a strong association from human mental concepts to real world objects. The question is if the opposite mapping exists as well - there could well be infinitely many non-human concepts that could map onto the physical world. They could have some level of similarity, but nertheless remain significantly different.

For a trivial example, in all likelihood an alien race that has some kind of eye would perceive different colors than we do. With enough text and shared context, we may be able to understand that worble is some kind of shade of red or green, but never understand exactly how they perceive it (just as they may understand that red is some shade or worble or murble, but never exactly). Even worse, they could have some colors like purple, which only exists in the human mind/eye (it is the perception we get when we see both high-wavlength and low- wavelength light at the same time, but with different phases).

Similarly, alien beings may have a significantly different model of the real world, perhaps one not divided into objects, but, say, currents, where they perceive moving things not as an object that changes location, but as a sort of four-dimensional flow from chair-here to chair-there, just like we perceive a river as a single object, not as water particles moving on a particular path. Thus, it may be extremely difficult if not impossible to map between our concepts and the real world back to their concepts.

> Fear, for example, correlates with states in the world such that some alien species would be expected to have a corresponding concept.

Unlikely, given that most organisms on earth have no semblance of fear. Even for more universal mental states, there is no reason to imagine completely different organisms would have similar coping mechanisms as we have evolved.

> Machetes, for example, are mostly used in the context of "hacking", either vegetation or people, rather than precision cuts of a knife or a scalpel.

Well, I would say hacking VS cutting are not significantly different concepts, they are a matter of human-specific and even culturally-specific degrees, which would be unlikely to me to be uniquely identifiable, though some vague level of understanding could probably be reached.

Veedrac · on July 27, 2020

This is as true of any information channel, including your eyes and ears.

Jack000 · on July 27, 2020

That kind of gets into "what is it like to be a bat" territory.

The more imminent question is more of engineering than philosophy - what does it take for GPT-3 to not make the mistakes it does? This would require it to have some internal model for why humans generate text (persuasion, entertainment, etc.) as well as the social context in which that human generated the text. On a lower level it also needs to know about cognitive shortcuts that humans take for granted (object permanence, gravity)

Basically, some degree of human subjective experience must be encoded and fed to the model. That's a difficult problem, but not an intractable one.

mannykannot · on July 28, 2020

We don't even have to look to hypothetical aliens for an example. All the bronze-age Aegean scripts, except for Linear B, remain undeciphered.

ben_w · on July 27, 2020

I certainly agree that GPT-3 appears to be learning how to do mathematics. I suspect that if you gave it enough it might perhaps even learn the maths of physics.

I suspect that if it did that, it would be able to write a very convincing fake paper about how it designed and tested an Alcubierre drive, and that the main clue about the paper being fake being a sentence such as “we dismantled Jupiter for use as a radiation shield against the issue raised by McMonigal et al, 2012”.

Or, to put it another way, the hardest of hard SciFi, but still SciFi, not science.

TomSwirly · on July 28, 2020

Nothing you say convinces me that GPT-3 is exhibiting any conceptual understanding.

Imitating existing texts better is not conceptual understanding.

"Understanding" means you can explain why you made a decision. It means there exists a model with conceptual entities that you can access and make available to others.

What GPT-3 does is this: "I am given many answers to similar questions, and I build up a huge model that reflects these answers. If I'm given a new question, I come up with a response that's probably right, based on the previous answers, but there's no explanation possible."

Don't get me wrong - it's amazing! But it's not understanding anything yet.

Even humans have skills that we know but do not understand - like "walking" for most of us!

But on abstract question, we almost always have access to a complete set of reasons. "Why did you go back to the store?" "I left my bag there." "Why did you talk to that man?" "I know he's the manager, I'm a regular." "Why were you happy?" "I had my bag."

(Indeed, this is so common that people often "backdate" reasons for actions that didn't really have any reason at the time. But I digress.)

Nasrudith · on July 27, 2020

I wonder how well it would perform in accuracy if given a large number of simple but lengthy sums like 13453 + 53521. Increased set size would move it beyond simple input/output memorization. Although if it recurses properly and carries the digit it could be text parsing and have an accurate but probably very inefficently written math parser.

dwohnitmok · on July 27, 2020

> Although if it recurses properly and carries the digit it could be text parsing and have an accurate but probably very inefficently written math parser.

I suspect this is how many humans do arithmetic (especially considering how many people conflate numbers with their representation as digits). So if GPT-3 is doing that, that's pretty impressive.

ja3k · on July 27, 2020

You don't have to wonder. In their paper: https://arxiv.org/abs/2005.14165 they state it has 0.7% accuracy on zero shot 5 digit addition problems and 9.3% accuracy on few shot 5 digit addition problems.

ralfd · on July 27, 2020

By the way: Arithmetic accuracy is better if dollar sign and commas are added (financial data in the training set):

http://gptprompts.wikidot.com/logic:math

gwern · on July 27, 2020

You do have to wonder, because as that section states, the BPEs may impede arithmetic, and as we've found using the API, if you use commas, the accuracy (zero and few-shot) goes way up.

yters · on July 27, 2020

Solmonoff induction would imply the algorithm that learns the rules of arithmetic will have the most concise model for the data. But, it is unclear these gpt-3 type algorithms are solomonoff learners.

YeGoblynQueenne · on July 27, 2020

>> But GPT-3 is much more successful, including at giving correct answers to arithmetic problems that weren't in its training set.

That's not exactly what the GPT-3 paper [1] claims. The paper claims that a search of the training dataset for instances of, very specifically, three-digit addition, returned no matches. That doesn't mean there weren't any instances, it only means the search didn't find any. It also doesn't say anything about the existence of instances of other arithmetic operations in GPT-3's training set (and the absence of "spot checks" for such instances of other operations suggests they were, actually, found- but not reported, in time-honoured fashion of not reporting negative results). So at best we can conclude that GPT-3 gave correct answers to three-digit addition problems that weren't in its training set and then again, only the 2000 or so problems that were specifically searched for.

In general, the paper tested GPT-3's arithmetic abilities with addition and subtraction between one to five digit numbers and multiplication between two-digit numbers. They also tested a composite task of one-digit expressions, e.g. "6+(4*8)" etc. No division was attempted at all (or no results were reported).

Of the attempted tasks, all than addition and subtraction between one to three digit numbers had accuracy below 20%.

In other words, the only tasks that were at all successful were exactly those tasks that were the most likely to be found in a corpus of text, rather than a corpus of arithmetic expressions. The results indicate that GPT-3 cannot "perform arithmetic" despite the paper's claims to the contrary. They are precisely the results one should expect to see if GPT-3 was simply memorising examples of arithmetic in its training corpus.

>> So what changed? We aren't sure, but the speculation is that in the process of training, GPT-3 found that the best strategy to correctly predicting the continuation of arithmetic expressions was to figure out the rules of basic arithmetic and encode them in some portion of its neural network, then apply them whenever the prompt suggested to do so.

There is no reason why a language model should be able to "figure out the rules of basic arithmetic" so this "speculation" is tantamount to invoking magick.

Additionally, language models and neural networks in general are not capable of representing the rules of arithmetic because they are incapable of representing recursion and universally quantified variables, both of which are necessary to express the rules of arithmetic.

In any case, if GPT-3 had "figure(d) out the rules of basic arithmetic", why stop at addition, subtraction and multiplication between one to five digit numbers? Why was it not able to use those learned rules to perform the same operations with more digits? Why was it not capable of performing division (i.e. the opposite of multiplication)? A very simple asnwer is: GPT-3 did not learn the rules of arithmetic.

_________

[1] https://arxiv.org/abs/2005.14165

darepublic · on July 28, 2020

I dunno to me it seems clear that there is nothing of what we call intelligence in these neural networks. And I think we could have a general AI that can problem solve in the world but have zero of what we know of as understanding and sel awareness

johnc1 · on July 27, 2020

Another way to think about it is comparing to how children learn. First, children spend inordinate amount of time just trying to make sense of words they hear. Once they develop their language models, adults can explain new concepts to them using the language. What'd be really exciting is being able to explain a new concept to GPT-n in words, and have it draw conclusions from it. Few-shots learning is a tiny step in that direction.

tsimionescu · on July 27, 2020

Children don't spend inordinate amounts of time learning words. In fact, past the first months, children often learn words from hearing them a single time.

coryfklein · on July 28, 2020

I have a 4, 6, and 8 year old, and each of them are still learning words. Yeah they don’t spend 80% of each day learning words, but building up their vocabulary legit takes a looong time.

tsimionescu · on July 29, 2020

Oh, absolutely. I'm 31 and I'm still learning words!

But I don't think I've ever spent time to learn a particular word - it's almost always enough to hear it in context once, and maybe get a chance to actually use it yourself once or twice, and you'll probably remember it for life.

If it's a word for a more complex concept (e.g. some mathematical construct), you may well need more time to actually understand the meaning, and you may also pretty easily forget the meaning in time, but you'll likely not forget the word itself.

LinchZhang · on Aug 8, 2020

"But I don't think I've ever spent time to learn a particular word - it's almost always enough to hear it in context once, and maybe get a chance to actually use it yourself once or twice, and you'll probably remember it for life."

I'd strongly bet against this. If it were true, SAT and similar vocabulary tests would be trivial to anybody who has taken high school English, and I think it is not the case that most people perceive the SAT to be trivial.

johnc1 · on July 30, 2020

That's of course correct. Perhaps GPT-3 can do that too? I don't have access to it, but I wonder if it can be taught new words using few-shot learning.

In fact, even GPT-2 gets close to that. Here's what I just got on Huggingface's Write With Transformer: Prompt: "Word dfjgasdjf means happiness. What is dfjgasdjf?" GPT-2: "dfjgasdjf is a very special word that you can use to express happiness, love or joy."

What takes time is all the learning a child needs to go through before they can be taught new words on the spot.

SpicyLemonZest · on July 27, 2020

How can we tell whether or not GPT-3 has understanding of the data?

grugagag · on July 27, 2020

I think the best way to tell whether GPT-3 has understanding of its data is by by asking questions related to the data but not explicit in the training dataset.

polytely · on July 27, 2020

Maybe by asking it to clean up it's own training data?

mike_ivanov · on July 27, 2020

By asking counterfactual questions.

bo1024 · on July 27, 2020

I think this is a very good take.

lordnacho · on July 27, 2020

I've been wondering about something similar to you, but I read one of Pearl's causality books recently and thought that might be the missing piece.

It's certainly impressive what GPT-3 can do, but it boggles the mind how much data went into it. By contrast a well-educated renaissance man might have read a book every month or so from age 15 to 30? That doesn't seem to be anywhere near what GPT could swallow in a few seconds.

When you look at how GPT answers things, it kinda feels like someone who has heard the keywords and can spout some things that at least obscure whether it has ever studied a given subject, and this is impressive. What I wonder is whether it can do reasoning of the causality kind: what if X hadn't happened, what evidence do we need to collect to know if theory Z is falsified, which data W is confounding?

To me it seems that sort of thing is what smart people are able to work out, with a lot of reading, but not quite the mountain that GPT reads.

skulk · on July 27, 2020

> By contrast a well-educated renaissance man might have read a book every month or so from age 15 to 30? That doesn't seem to be anywhere near what GPT could swallow in a few seconds.

You're ignoring the insane amount of sensory information a human gets in 30 years. I think that absolutely dwarfs the amount of information that GPT-3 eats in a training run.

PaulDavisThe1st · on July 27, 2020

But that sensory information includes very few written words. GPT(n) isn't being trained on "worldly audio data", or "worldly tactile data", or in fact any sensory data at all.

So the two training sets are completely orthogonal, and the well educated renaissance man is somehow able to take a very small exposure to written words and do at least as well as GPT(n) in processing them and responding.

MauranKilom · on July 27, 2020

And the renaissance man has tons of structure encoded in his brain on birth already. Just like GPT-3 does before you give it a prompt. I'm not saying this is fully equivalent (clearly a baby can't spout correct Latex just by seeing three samples), but you simply cannot just handwave away thousands of years of human evolution and millions of years of general evolution before that.

The renaissance man is very obviously not working solely based on a few years of reading books (or learning to speak/write).

PaulDavisThe1st · on July 28, 2020

A person who is never taught to read will never be able to respond to written text. So the renaissance-era man is working "solely" based on their lived experience with text, which compared to GPT(n) is tiny.

Ah! you cry. Don't humans have some sort of hard-wiring for speech and language? Perhaps. But it is clearly completely incapable of enabling an untrained human to deal with written text. Does it give the human a head start in learning to deal with written text? Perhaps (maybe even probably). It demonstrably takes much less training than GPT(n) does.

But that is sort of the point of the comment at the top of this chain.

klipt · on July 27, 2020

Did this renaissance man teach themselves to read from scratch or are we assuming they were assisted in their schooling?

PaulDavisThe1st · on July 27, 2020

Doesn't make too much difference to the overall point.

lordnacho · on July 27, 2020

That's an interesting point. I'm not sure how to measure that though. Also my guess is we have the sensors on for part of the day only, plus that's filtered heavily by your attention process, eg you can't read two books at once.

hnick · on July 28, 2020

Yeah. A dog never read a book and only has rudimentary understanding of language (if any - no idea, maybe they just pattern match cause and effect) but a dog-level AI would be incredibly valuable. And you can train a dog to be fairly competent in a task in less than a year.

tsimionescu · on July 27, 2020

Do people generally learn what to say from sensory data? How would the sensory data impact our ability to produce meaningful information?

slickQ · on July 28, 2020

Written language is used, in large part, to express sensory data (ex: colors, shapes, events, sounds, temperatures, etc). Abstract models are, through inductive reasoning, extrapolated from that sensory information. So in effect more sensory data should mean more accurate abstract models.

For example, it might take several paragraphs to wholly capture all the meaningful information in one image in such a way that it can be reproduced accurately. Humans, and many animals, process large amounts of data before they are even capable of speech.

The data GPT-3 was provided with pales in comparison. It is unclear whether these GPT models are capable of induction because it may be that they need more or better sanitised data to develop abstract models. Therefore they should be scaled up further until they only negligepbly improve. If even then they, still, are incapable of general induction or have inaccurate models. Then the transformer model is not enough or perhaps we need a more diverse set of data (images, audio, thermosensors, etc).

layoutIfNeeded · on July 27, 2020

https://en.m.wikipedia.org/wiki/Helen_Keller

aeternum · on July 27, 2020

The well-educated renaissance man has a lot more feedback on what is useful and what isn't. I think that GPT-3 could be vastly improved simply by assigning weights to knowledge, IE valuing academic papers more.

Humans get this through experience and time (recognizing patterns about which sources to trust) but there is nothing magical about it. Should be very easy to add this.

rorykoehler · on July 27, 2020

I had this exact conversation with a friend over the weekend. If GPT-n weights all input equally then we are truly in for a bad ride. It's basically the same problem we are experiencing with social media.

aeternum · on July 27, 2020

It is a very interesting problem. Throughout history humans have been able to rely on direct experience via our senses to evaluate input and ideas.

Many of those ideas are now complex enough that direct experience doesn't work. IE global warming, economics, various policies. Futhermore even direct (or near-direct experience such as video) is becoming less trustworthy due to technology like deepfakes and eventually VR and neuralink.

It seems to me that this problem of validating what is real and true might soon be an issue for both humans and AI. Are we both destined to put our future in the hands of weights provided by 'experts'?

bobcostas55 · on July 27, 2020

A well-educated renaissance man is not a blank slate, he is the result of millions of years of selection.

hnrq · on July 27, 2020

The human equivalent to GPT-3 training is not as much the learning one has in a lifetime, but the millions of years evolution process. "Normal" human learning is more akin to finetunning I think, although these analogies are flawed anyway.

roca · on July 27, 2020

Obviously the genome encodes what you need to grow a brain, but, there aren't enough bits in the human genome to encode many neural network weights.

hnrq · on July 27, 2020

Good point. By this logic, evolution is more about creating the best architecture, which I think is also a good analogy. But I also think that a lot of the brain's "weights" are pretty much set before any experience (maybe less so in humans, but several animals are born pretty much ready, even ones with complex brains, like whales), the genome information is, in a sense, very compressed, so even if isn't setting individual weights, it does determine the weights somehow, I think.

Does anyone know if anyone has investigated these questions more seriously elsewhere?

amelius · on July 27, 2020

> What I wonder is whether it can do reasoning of the causality kind: what if X hadn't happened, what evidence do we need to collect to know if theory Z is falsified, which data W is confounding?

I think logic is the easy part, we can already do that with our current technology.

The difficult part is disambiguating an input text, and transforming it into something a logic subsystem can deal with. (And then optionally transform the results back to text).

burntoutfire · on July 27, 2020

> When you look at how GPT answers things, it kinda feels like someone who has heard the keywords and can spout some things that at least obscure whether it has ever studied a given subject, and this is impressive.

Not unlike a non-technical manager discussing tech!

Smaug123 · on July 27, 2020

> it will just make up convincing rubbish

It's not certain that this is always the case. In at least one case I've seen, if you give it question-and-answer prompts where you don't demonstrate that you will accept the answer "your question is nonsense", it will indeed make things up; but if you include "your question is nonsense" as an acceptable answer in the sample prompts, then it will use it correctly. See https://twitter.com/nicklovescode/status/1284050958977130497 .

It seems that we have a lot to learn about how to use GPT-3 effectively!

Veedrac · on July 27, 2020

> It cannot tell right from wrong

It doesn't know which facts about the world are true and which are fabricated except through the text it's trained on, but to a large extent neither do you. It suffices to merely have the ability to reason about it. The primary difference is that whereas you reason fairly competently from one perspective with one pseudo-coherent set of goals, GPT-3 reasons to a weak degree from all perspectives, privileging no view point in particular.

How important this ends up being depends on where the model plateaus. On one extreme, if it plateaus close to where GPT-3 already is, no harm done, it's a fun toy. On the other, if it scales until perplexity gets to far-superhuman levels, it doesn't matter at all, since you can just prompt it with Terence Tao talking about his latest discovery.

Naturally, it will land somewhere in the middle of these points. The question is ultimately then whether the landing point captures enough general reasoning that you can use it to bootstrap some more advanced reasoning agent. A sufficiently powerful GPT-N should, for example, be able to deliberate over its own generated ideas and sort the coherent reasoning from the incoherent.

jrowen · on July 27, 2020

It seems to me that anything trained on internet data is kinda doomed to poison itself on the high ratio of garbage floating around here?

That same sentiment could be equally applied to humans, and not just in the internet era, but throughout all of history. There will always be misinformation and "wrong" opinions out there. "It cannot tell right from wrong" is an accusation leveled against human beings every day. We can't even all agree on what is right and wrong, truth or untruth.

A true AI is going to have to wade through all that and make its own decisions to be viewed and judged from many different perspectives, just like the rest of us.

rytill · on July 27, 2020

I've also been wondering this. It is like Kessler Syndrome. We have to be careful not to pollute our ecosystem of data.

https://en.wikipedia.org/wiki/Kessler_syndrome

Speaking with GPT-3 also makes one realize how influenced its predictions of AI scenarios are by dystopian memes. In any conversation in which you are "speaking to the AI", the AI can go rogue.

There just aren't enough positive role models authors have written for AGI.

giancarlostoro · on July 27, 2020

> It seems to me that anything trained on internet data is kinda doomed to poison itself on the high ratio of garbage floating around here?

Ah one only needs to think back to Tay to know how these sort of things will end.

https://en.wikipedia.org/wiki/Tay_(bot)

(Imagine 4chan got a wind of this bot and retrained it, which is probably what happened...)

tda · on July 27, 2020

> It would be a bit like how carbon dating or production of low background steel changed after 1945 due to nuclear testing

https://news.ycombinator.com/item?id=23896293

I guess books published will be more useful than reddit rants (depending on the application)

confeit · on July 27, 2020

> anything trained on internet data is kinda doomed to poison itself on the high ratio of garbage floating around here?

Low-quality noise cancels out and leaves the high-quality signal. In the limit, the internet offers the true sequence probabilities for compression of natural text.

You can also put more weight on authoritative data sources, such as Wikipedia and StackOverflow, but even uniformly weighted: It is possible to sequence-complete prime numbers, despite the many many pages online with random numbers.

GPT-3 is trained on a filtered version of Common Crawl, enhanced with authoritative datasets, such as Books1, WebText, and Wikipedia-en. Moderation is done automatically, with a toxicity classifier/toggle. If GPT-n becomes good enough to be accepted in authoritative datasets, then it is perfectly fine training data, a form of semi-supervised learning.

Bias is going to be a double-edged sword: I believe it will be impossible to prescribe common sense, nor to sanitize common sense to remove, say, gender bias, and still be able to understand a sexist joke about female programmers, or male nurses. We want an AI to be human, but we don't want it to associate CEOs with white males, dark hair, wearing suits. That will conflict.

zozin · on July 27, 2020

'authoritative data sources, such as Wikipedia"

Lol

confeit · on July 27, 2020

> Canberra is the capital city of Australia.

gillesjacobs · on July 27, 2020

I largely agree with the arguments made, but the following assertion is plain bogus

> GPT-3 is the first NLP system that has obvious, immediate, substantial economic value.

Text mining (relation extraction, named entity recognition, terminology mining) and sentiment analysis are billion dollar industries and are being directly applied right now in marketing, finance, law, search, automotive, basically every industry. Machine translation is another huge industry of its own. Chat bots were all the hype a few years ago. Let's not reduce the whole field of NLP to language generation.

andyljones · on July 27, 2020

> are billion dollar industries

When speaking of billion dollar investments, a billion dollar industry is not substantial. Google and Facebook's industries are advertising, at $600bn/year. Amazon's industry is retail, at $25tn/year.

What's opened up by the GPT-3 and its prompt-programming abilities is services, without qualification. That's $50tn/year, and capturing some tiny percentage of it is what's needed to make a billion-dollar investment worthwhile.

That said, I admit this isn't the mindset most people take when they read 'substantial'.

e: I changed the wording from 'substantial' to 'transformative', thanks!

skybrian · on July 27, 2020

GPT-3 lets you create rigged demos to do lots of tasks but so far it's not reliable enough to do anything in production. It seems unlikely to get there using output based on random word selection. Nobody is even talking about error rates yet.

The best applications are probably when error rates don't matter because a human is just going to use it for inspiration.

gillesjacobs · on July 27, 2020

I might have missed the business plan behind monetising GPT3 here. Can you elaborate on why you think prompt programming will successfully take a cut from services?

Prompt-programming is a standard features of all LMs. What differentiates GPT3 is not this application but the quality of the output. NLP companies such as chatbot providers and specialised search (patents, legal assistants, tenders) have been using domain-specific LMs for years.

dragongod2718 · on July 27, 2020

You both agree I think. He's not saying that GPT-3 invented the revolutionary ability of prompt programming, but that prompt programming allows GPT-3 to be applied to arbitrary contexts (from programming to providing legal advice to generating fiction). That amazing generality and high quality allow it to be applicable to most services.

So it's taking some slice of the $50tn pie.

justaguy1212 · on July 27, 2020

Yeah this is a bizarre statement considering that GPT-3 is definitely not any of those things really. GPT-3 is far, far too computationally expensive to have value in industry. A linear CRF is more useful than most NN approaches in industry right now, just simply because in many circumstances you want to have something that you can apply to a few billion documents and get the result within a few hours, then tweak a few things and repeat if you like. These simple models also have the ability to be predictable as well. Some transformer or lstm methods can be useful in industry, but it really depends on the application. I certainly would not be using GPT-like systems for much in industry, other than gimmicks for marketing. GPT-3 is useful for academia - not industry.

sillysaurusx · on July 27, 2020

Hm? GPT-3 is relatively cheap to inference from, at least compared to the cost of training. You can load all the params onto a single TPU, actually. (A TPU can allocate up to 300GB on its CPU without OOM'ing.)

AI dungeon is also powered by GPT-3, and it's quite snappy. I'm not sure why GPT-3 is seen as computationally expensive, but it seems workable.

Tepix · on July 27, 2020

Only the premium-exclusive version of the AI model, named Dragon released last month.

eru · on July 27, 2020

Premium here means ten dollar a month, I think.

aveni · on July 27, 2020

GPT-3 is not that expensive. Estimating from the paper, to train the model, the GPU hardware costs were a few million dollars, and the electricity costs were probably under 100k. This is totally feasible for many companies today, especially if the hardware is a fixed cost and can be reused for training multiple models.

And as mentioned elsewhere, inference for a trained model is much, much cheaper.

ben_w · on July 27, 2020

Is sentiment analysis really that good already?

Every time I’ve looked at the start of the art in sentiment analysis, it seems to be suffering from the same issue that bag-of-words has with modifiers like “not”. Or is that more a theoretical problem than a practical one?

I appreciate this is a rapidly moving field, so my knowledge could easily be out of date.

smallnamespace · on July 27, 2020

Are modifiers an actual issue for many applications?

"This isn't a terrible horrible restaurant that nobody should ever go to" seems like 1) it doesn't mean it's actually a good restaurant either 2) the writer might be joking and sarcastic and 3) this will be very rare in actual reviews.

Put another way, certain modifiers contextually go with certain words and sentiments, so why shouldn't state of the art systems lean on that fact, notwithstanding the strict application of grammar?

ben_w · on July 27, 2020

There was a story that a language lecturer had just explained how double-negatives were a sometimes a positive and sometimes an emphasised negative, and that likewise some languages used a double-positive to mean an negative. He claimed that English was not such a language, using double-positives only for emphasis, to which one of the students said “yeah yeah”.

thrownagain90 · on July 27, 2020

An MIT linguistics professor was lecturing his class the other day. "In English," he said, "a double negative forms a positive. However, in some languages, such as Russian, a double negative remains a negative. But there isn't a single language, not one, in which a double positive can express a negative."

A voice from the back of the room piped up, "Yeah, right."

(https://www.ling.upenn.edu/~beatrice/humor/double-positive.h...)

gillesjacobs · on July 27, 2020

To be fair, "yeah, right" is a sarcastic statement and linguistically the two words do not scope eachother so the positive statement is produced at the pragmatic level, double negatives are syntactico-semantic.

gillesjacobs · on July 27, 2020

The issue you describe is typically called "Valence shifting" in this specific case "negation processing". It is of course a difficult problem to capture word-level sentiment and emotions but recent techniques in academic work obtain decent results.

However, industry typically relies on sentence- or document-level sentiment in, for instance, customer reviews with systems obtaining 80-90 F1-score which is very good. Often in e-commerce, aspect-based sentiment analysis is used in which a qualifying sentiment is attached to a target aspect, e.g. from a phone review systems extract: battery: large > positive; screen: dim > positive. You might have seen these types of reports in aggregate on review our e-commerce sites yourself.

It is however an ongoing field of research to process the scope of negation and uncertainty, but the field is making strides. State-of-the-art attention-based models obtain good scores on benchmark fine-grained sentiment analysis datasets such as the GoodFor/BadFor and MPQA2.0 of around 70% F1score [1]. This performance is nearly enough for commercial systems, depending on how you employ them.

1. https://link.springer.com/article/10.1186/s13673-019-0196-3

tambourine_man · on July 27, 2020

I think he means “ obvious, immediate, substantial economic value” to non technical people. It take little effort to imagine how to monetize it even for regular folks.

gillesjacobs · on July 27, 2020

For many GPT1-3 is the first exposure to language modelling technology, which is great, but language modelling and pretraining is already widely used in nearly every NLP task even before GPT1. Any NLP engineer has used this, so it is kind of weird to claim large-scale pretrained LMs are revolutionised with GPT3.

Don't get me wrong the hype is largely deserved because of the performance and engineering/research/funding effort required. Plus cool demos and media marketing from OpenAI helps a lot in spreading awareness.

OpenAI has definitely revolutionised the marketing for language models, no doubt. Let's wait and see if they manage to do the same for the economic valorisation.

riffraff · on July 27, 2020

really? I am very impressed by GPT-3 but I still don't see any way to make money "obviously" out of it.

Maybe it can be an adjuvant to human in some tasks but then so could existing technologies too, I guess?

tambourine_man · on July 27, 2020

Do you want a computer that can reliably understand you and give you the best possible answer 99% of the time?

I think most people would go: yes! How much does it cost?

p1esk · on July 27, 2020

Oh, I thought we were talking about GPT-3

riffraff · on July 27, 2020

If you're running a search engine maybe.

But everyday people are used to getting 80% answers by search engines, I don't think many would pay for something that is "like google, but a bit better".

This seems the be current issue for many things that we used to pay for (dictionaries, encyclopaedias, newspapers, etc), and I'm not sure this would be different.

tambourine_man · on July 27, 2020

> "like google, but a bit better"

For simple queries like “who was the president of country X in YYYY” it’s probably just a bit better (if cached, of course, Google search is wicked fast).

But for more complex queries, Google is still remarkably dumb. Or downright insolent, ignoring my verbatim selection or quoted terms.

I’d pay good money for scarily smart search and a “grep for the web” service, that included JSON, CSS, JavaScript, comments, whatever. A toggle button for dumb/smart search

rytill · on July 27, 2020

I would easily pay money for something that is like Google, but a bit better.

cambalache · on July 27, 2020

I wont

sktguha · on July 27, 2020

google's majority revenue is from search. So bit better than google, particularly if it is integrated into bing (as microsoft has invested in openAI) , and allows bing to capture market share from google, would be really lucrative

notahacker · on July 27, 2020

On the other hand, do you want a computer that gives you less accurate answers than more specialised tools but can spin them into something that looks like an essay? is less obviously monetisable than, say, already commercially available services like Alexa which incorporate NLP but don't rely exclusively on it.

dwaltrip · on July 28, 2020

GPT-3 is not capable of what you are describing.

vikramkr · on July 27, 2020

I think regular folks are already pretty clued into how other things like text mining and sentiment determination have big markets considering how strong the backlash against tech politically is right now. The everyday public and by extension politicians seem reasonably capable of imagining how data can be monetized for ads, and they dont seem too happy about it.

tambourine_man · on July 27, 2020

Really? I have the opposite, not as positive impression. I hope you are right though.

Nasrudith · on July 27, 2020

I think it depends on how you define "clued in". They are aware of its existence but aren't just ignorant of how it works but outright apathetic and hostile to anything which goes against their personal narrative.

Just look at the "Google selling your data" being uncritically accepted when five minutes of thought would conclude it is the last thing Google would want (even a better search algorithim would find it hard to bootstrap on user base and comparable training time) or the casual John Yoo worthy torture of the definition of monopoly to include Goddamned Netflix when whining about FAANG monopolies. That level of generalization and stereotyping is like blaming the Amish for flying planes into the World Trade center because both are radical Sbraham religions.

PaulHoule · on July 27, 2020

From the "applied physics" department of John Hopkins university in Baltimore (last stronghold of the JASONs) south to Virginia and the Research Triangle Park area of North Carolina you will find people who know things about practical NLP systems that aren't in the open literature. They could tell you about it but they'd have to kill you.

Around Mumbai I know there is a crew that can really use UIMA, and there are other Indians I know who do intelligence and defense work.

blueyes · on July 27, 2020

Both DeepMind and OpenAI were founded on the premise that we are in an AI overhang. OpenAI, in particular, believes in scale. Scale will get us there based on the algorithms we have, such as the Transformer. With each new release, they add evidence that they were correct.

The call for legislation neglects that there exists a global arms race to make this technology succeed. Legislation in one nation will simply handicap that nation. Against that backdrop, legislation is probably unlikely among the nations already leading in AI.

hyperbovine · on July 27, 2020

> With each new release, they add evidence that they were correct.

Is it though? If the goal is human-level AI, or hell, even rat-level AI, the evidence is pretty convincing that you should be able to train and deploy it without requiring enough energy to sail a loaded container ship across the Pacific Ocean. Our brains draw about 20 watts, remember. This suggests to me that no, in fact, scale will not get us "there".

https://www.forbes.com/sites/robtoews/2020/06/17/deep-learni...

biesnecker · on July 27, 2020

I don’t actually know if this is true, but the intuition I have is that this huge expenditure of energy is just the result of speeding through evolution. Our neural structures have evolved for hundreds of millions of years. The aggregate energy cost of that evolution has been enormous, but the result is a compact, hyper-efficient brain. Who’s to say that on the other end of this we’re not going to end up with the same in silicon?

ssivark · on July 27, 2020

Even if that were the case, it seems wasteful/pointless to have to go through all of biological evolution every time one wants to label images or generate some text.

UK-Al05 · on July 27, 2020

You provide pre-trained models to then train further.

This happens now.

vikramkr · on July 27, 2020

That 20 watts is to run the network. Our brain has had a billion years to work out details of the architecture and encode a lot of basic stuff as instinct (and it still sucks at a lot of things). You should be counting that energy cost as well - we didnt get from nerve nets to frontal lobes overnight.

logicslave · on July 27, 2020

This exactly, I am so tired of reading these posts online that ignore the billions of years the human brain took to evolve

slowmovintarget · on July 27, 2020

*millions, not billions.

Earth is about 4.5B years old, life is about 3.7B years old, multicellular life (including life with neural nets) is about 600 million years old. I don't think the span from microbe to multicellular organism counts in brain evolution.

blueyes · on July 27, 2020

Do you want artificial intelligence or do you want energy efficiency? Personally, I think this work is about proving that we can create the former. Making it small and efficient comes later. That has been true of many advances in technology, and I see no reason why it should not apply here. I find it hard to believe that present energy consumption is evidence that we cannot create human or rat-level AI.

andyljones · on July 27, 2020

Training an AI in 2020 is best thought of as a capital investment. Like digging a mine or building a wind farm, the initial investment is very large but the operating costs are much lower, and in the long run you expect to get a lot more money out - a lot more value out - than you put in.

Training GPT-3 cost $5m; running it costs .04c per page of output.

ac29 · on July 27, 2020

If it is scaled up by 1000x as the article proposes, does that mean it will cost $40/page of output? Or does the additional cost just go into training the model?

andyljones · on July 27, 2020

If it's 100x from increased investment and 10x from short-term efficiency gains, yeah you'd expect $4/page. Model compression or some other tech might make it more efficient in the long-run.

stainforth · on July 27, 2020

Once the investment is recouped and a small margin rewarded, the value should be spread equitably among society.

dragongod2718 · on July 28, 2020

Why only a small margin?

BoiledCabbage · on July 27, 2020

> , the evidence is pretty convincing that you should be able to train and deploy it without requiring enough energy to sail a loaded container ship across the Pacific Ocean.

Yes and airplanes use much more energy to fly than a bird. What that got to do with the airline industry?

mattymatty · on July 28, 2020

yh but the plane uses much more energy so it's not really flying

saddlerustle · on July 27, 2020

Who cares how much power it needs? Plug it into a hydroelectric dam. A superhuman AI would surely provide higher ROI than the terrawatts used for smelting Aluminium.

hyperbovine · on July 27, 2020

You're missing my point. If it's possible to achieve general AI with incredibly minimal computational requirements, then this implies that current methods which rely on some sort of teraflop arms race to achieve better results are based on a fundamentally flawed model.

postnihilism · on July 28, 2020

General intelligence in its biological form was achieved with hundreds of millions of years evolution, which required the "evaluation" of trillions and trillions of instantiations of nervous systems. The total energy consumption of all those individual organisms was many many orders of magnitude more than all of the energy that has been produced by the entirety of humanity.

dragongod2718 · on July 28, 2020

The compute intensive methods are likely to deliver results much faster.

http://incompleteideas.net/IncIdeas/BitterLesson.html

ip26 · on July 27, 2020

Our brains are mostly an already-trained network though. Running a model that has been trained is the easy part.

hyperbovine · on July 27, 2020

I'm well aware of the distinction between training a model and running it. Look, GPT-3 has 185 billion parameters. Modern low-power CPUs will get you about 2GFLOPs/watt [1]. So even if all GPT-3 did was add its parameters together it would take multiple seconds on an equivalently powered CPU to do something that our brains do easily in real time. It's not an issue of processing power; an 8086 from 40 years ago easily runs circles around us in terms of raw computational power. Rather, it's that our brains are wired in a fundamentally different way than all existing neural networks, and because of that, this line of research will never lead to GAI, not even if you threw unlimited computing power at it.

[1] http://web.eece.maine.edu/~vweaver/group/green_machines.html

armitron · on July 27, 2020

Birds are wired in a fundamentally different way than all our existing computers thus we will never have fly-by-wire, not even if we throw unlimited computing power at it.

hyperbovine · on July 28, 2020

Actually that's a great example. For centuries men labored (and died) trying to build ornithopters--machines that flap their wings like birds--under the mistaken impression that this was the secret to flight. Finally, after hundres of years of progressively larger and more powerful, but ultimately failing designs, the Wright brothers came along and showed us that flight is the result of wing shape and pressure differentials, and has nothing whatsover to do with flapping.

GPT-3 and whatever succeeds it are like late-stage ornithopters: very impressive feats of engineering, but not ultimately destined to lead us to where their creators hoped. We need the Wright brothers of AI to come and show us the way.

curious_fella1 · on July 27, 2020

Perhaps our artificial representation of neurons are simply much less energy efficient than a biological neuron?

aoeusnth1 · on July 28, 2020

If someone created AGI that ran on 1KW you would deem it a rank failure by that metric (by a factor of 50x!).

pyb · on July 27, 2020

The global arms race for AI has definitely started. Unfortunately, most states don't appear to be aware of this.

simonw · on July 27, 2020

I've been pandemic-rewatching Person of Interest. It's really quite shocking how much more relevant it feels today compared to when it aired just a few years ago.

It's fun looking at things like GPT-3 and imagining how they could be used to build the surveillance AI at the heart of Person of Interest.

(If you haven't watched Person of Interest yet, here's my pitch for it: it's a CBS procedural where the hook is that an engineer built a secret, surveillance feed tracking AI for the government after 9/11 - but he cared about civil liberties, so he built it as an impenetrable black box. All it does is kick out the SSN of someone who is about to be either the victim or the perpetrator of a terrorist attack - which means government agents still have to investigate what's going on rather than taking the AI's word for it. "The Machine" also sees victims/perpetrators of violent crimes - but the government don't care about those. Finch, the machine's inventor, does - so he fakes his own death, hooks into a backdoor into the machine that gives him those SSNs and sets up a private vigilante squad to help stop the violent crimes from happening. So that gives you the "case of the week". Only it's actually an extremely deep piece of philosophical science fiction disguised as a case-of-the-week procedural, and as time goes on the plots become much more about AI, the machine, attempts to build rival machines, AI ethics and so on. It's the best fictional version of AI I've ever seen. The creative team later worked on Westworld.)

Barrin92 · on July 27, 2020

I honestly don't think the show has much to do with AI at all and is more like a retelling of the Greek classics in a sci-fi wrapping, which is actually something that comes up in the show at several points excplicitly.

The AIs in the show very quickly turn into godlike characters with antropomorphic personalities and the real world issues of AI such as surveillance, economics and so on are all dealt with in very shallow fashion. I had the same issues with Westworld too. It turns from an AI premise into a classical Christian morality tale very fast. ("we need to suffer to become conscious").

simonw · on July 27, 2020

One of the things I loved about the show is that different characters have different philosophies concerning AI, and they argue about them. Nathan v.s. Fitch. Fitch vs. Root. Control, Greer - for the most part the show tried to give some depth and background to their thinking around the implications of what they were responsible for.

Way smarter than you would expect from a CBS procedural!

ShamelessC · on July 27, 2020

That's Jonah Nolan's show right?

simonw · on July 27, 2020

Yup, he did it before Westworld.

GrantS · on July 27, 2020

I thought the overhang was going to be along the lines of the following, whether realistic or not:

-GPT-3, as is, should be the inner loop of a continuously running process which generates 1000s+ of ideas for "how to respond next" to any query, with a separate network on top of it as the filter which cherry-picks the best responses (as humans are already doing with the examples they are posting)

-Since GPT-3, as is, can already predict both sides of a conversation, it can steer a conversation toward a goal state just like AlphaGo does by evaluating 1000s+ of potential moves, lots of potential responses and counter-responses until it finds the best thing to say in order to get you to say what it "wants" you to say.

It seems ready to go as the initial attempt at the inner loop of both of these tasks (and more) without modification or retraining of the core network itself, no?

lambdatronics · on July 27, 2020

I'd love to see what could be done with GPT-3 as part of a GAN. Text compression/summary, maybe?

jobigoud · on July 27, 2020

I was also thinking something like this. GPT-3 should be the internal monologue, the subconscious soup of words constantly exploring random thought alleys, and there should be another layer on top of it to bridge it with the outside.

lacker · on July 27, 2020

GPT-3 is the first AI system that has obvious, immediate, transformative economic value.

I think the jury is still out on this one. It certainly seems powerful, it's doing interesting things, and it's better in many ways than any system that has come before. But there's a different between exciting demos and transformative economic value.

It's too soon to be sure, but to me, the most interesting question is whether any valuable startups will be built on top of GPT-3. Some leading indicators before that are whether useful products are built on GPT-3, and whether early-stage startups built on GPT-3 get seed investment. I'm not aware of any of these yet but maybe latitude.io counts as one.

f00zz · on July 27, 2020

In Kernighan and Pike's "The Practice of Programming" there's a chapter that covers the implementation in different languages of a random text generator using markov chains. It's a nice exercise and a lot of fun to play with.

I'm guessing that not many people have read that book, because I'm seeing here and elsewhere even technical people talking about GPT-3 as if it's heralding the imminent advent of SkyNet. I get that transformers have a somewhat longer attention span than markov chains, but it's still a statistical language model. It can't even do the kind of planning or reasoning that early AI demos like SHRDLU could.

Veedrac · on July 27, 2020

It's ridiculous to equate this to a traditional Markov chain language model. Here's something a Markov chain certainly cannot do:

Human: I want to test your creativity. Please invent a new word and give its meaning.

GPT-3: Ok. Um... Tana means to hit someone with the intention to wound them.

Human: Please use the word tana in a sentence.

GPT-3: You are about to tana the man attacking you.

Human: Speak like a dwarf.

GPT-3: I ain't talkin' like a dwarf.

https://www.reddit.com/r/MachineLearning/comments/hvssqn/d_g...

f00zz · on July 28, 2020

Right, I admit that I don't know the first thing about ML, so I tried an experiment.

Consider a language with the tokens "{[()]}" and the following grammar:

S := S S | '{' S '}' | '[' S ']' | '(' S ')' | <empty>

That is, "[()]" and "[]()" are valid sequences, but "[(])" or "))))" aren't. A child would quickly figure out the grammar if presented some valid sequences.

I generated all 73206 valid sequences with 10 tokens and used it as input to the RNN text generator code at http://karpathy.github.io/2015/05/21/rnn-effectiveness/. After 500,000 iterations I'm still getting invalid sequences.

Am I doing something stupid, or is a RNN text generator weaker than a child (or a pushdown automaton)? Is GPT fundamentally more powerful than this?

Veedrac · on July 28, 2020

GPT-3 can generate well-formed programs, so yes, it does things well beyond this complexity.

> After 500,000 iterations I'm still getting invalid sequences.

How frequently? If it's only the occasional issue it might be down to the temperature-based sampling that code uses, which means it will, with some small probability, return arbitrarily unlikely outputs.

mrfusion · on July 28, 2020

How can it do that? Did it read “tana” and the meaning somewhere?

perl4ever · on July 28, 2020

I suspect people overestimate the intelligence because they just can't grasp how much data it's ingested or don't have a visceral sense of what an ocean of data can contain. There's a saying that "quantity has a quality all its own".

dragongod2718 · on July 28, 2020

I don't think this is an overestimation of intelligence. That ability is itself intelligence.

Jach · on July 27, 2020

It's a good book, and one most programmers should read early on (it grows less useful over time), but to think of this as only an incremental improvement over markov chains is underselling the advance. The technology is different, and can scale to much higher levels of capability. AGI levels? Almost certainly not. But it's passing usefulness thresholds so things can progress elsewhere.

scottlocklin · on July 27, 2020

>to think of this as only an incremental improvement over markov chains is underselling the advance.

Erm, citations needed. It's a giant, inefficient and shitty KNN model, which is capable of mimicking markov chains. Wonderful marketing achievement and not much else.

Jach · on July 27, 2020

https://www.gwern.net/GPT-3 (Edit: in case it's not clear, I suspect if you give an honest perusal of that page, and the pages it links, and the pages they link, you'll come away with a different opinion.)

shpongled · on July 28, 2020

I'm totally with you. I find GPT-3 incredibly impressive, but people are acting as if GPT-3 is a harbinger of AGI.

This is especially obvious on stuff like lesswrong, where AI is a big part of what they talk about. I tend to agree with the LW/SSC crowd about the negative effects of AGI, but they are being so hyperbolic about GPT-3.

legulere · on July 27, 2020

Nah we’re in for the next AI winter. GPT-3 shows how much energy is needed to perform a nice trick with current technology. We mostly have reached the limits of the technology. Investing more compute power for a few percentage points more Precision is not going to bring the technology forward.

Veedrac · on July 27, 2020

2017 SOTA on Penn Treebank was 47.69 perplexity. GPT-3 is at 20.5. AI has already been productized on consumer devices through Siri, Google Assistant, speech detection, speech generation, textual photo library search, similar data augmentations for web search, Google Translate, recommendation algorithms, phone cameras, server cooling optimization, phone touch screens' touch detection, video game upscaling, noise reduction in web calls, file prefetching, Google Maps, OCR, and more. DLSS alone justified continued investment by NVIDIA. NVIDIA Ampere will be ~6x as fast at running consumer-targeted models as Turing, given raw throughput increases compounded with sparsity and int8 hardware. A huge number of research threads around AI have direct applicability to large tech companies.

legulere · on July 28, 2020

I'm not arguing that current machine learning technologies are not useful. I'm just arguing that progress is based on increasing some metric, usually depending on a trade-off of computation. This can even make ML-techniques applicable to some new fields, but it's not what is holding back autonomous driving, the often touted parade example which also brings in a lot of employment for machine learning.

This article clearly sits on the peak of inflated expectations in the hype cycle.

https://en.wikipedia.org/wiki/Hype_cycle

Veedrac · on July 28, 2020

It's not just that you're not arguing it isn't useful; as far as I can tell, neither of your comments contain an argument against ML at all. I have nothing to meaningfully argue against.

ML is undergoing a Cambrian explosion of use-cases (see my prior comment), almost all of this over an incredibly small time period, progress is accelerating, and many of these use-cases are incredibly high value. Scale is not proving a major stopper; Google's MoE experiments show that huge models are productizable, and small models work plenty fine too in restricted places, to the point where they're literally used to parse touch screen sense data in phones.

If you want to claim we're in for another AI winter, you need a vastly stronger argument than ‘something something hype cycle’.

Hypx · on July 27, 2020

Agreed. Nearly every AI startup or idea has disappointed or failed. We’re spending the equivalent of billions of dollars on something dumber than a rat in most cases.

pretendscholar · on July 27, 2020

If you look at the paper doesn't it scale well across many different metrics?

emteycz · on July 27, 2020

It's a one time investment though, and not that large relatively speaking. Is is that much more costly and less rewarding than other investment opportunities?

datameta · on July 27, 2020

I wonder if at some point the amount of extra data necessary to achieve an n-fold improvement will outstrip what we can provide.

I think the time for AI legislation is now - before FAAMG deploys something like the next-gen of GPT-3. Of course with the legislative lag that exists even for decade-old tech I don't have the highest confidence in this being achieved by a federal government in the state it is in now.

rewq4321 · on July 27, 2020

On that data point: I wonder if anyone can comment on how much useful training data we could get out of generating text based on knowledge graphs/databases that we have. You can construct an awful lot of sentences out of just a few facts (e.g. weights of various classes to generate sentences like: "x's are heavier than y's, but not as heavy as z's"). All the variations would contain the same information (of subsets of it), but the same could be said of lots of text online. Obviously this is an inefficient way to incorporate the databases into a GPT-like model, but it might make sense economically given the race that is now playing out - just shoehorn it in or you'll be left behind (at least in the short term) by those who do. "We can work out how to make it efficient after we're rolling around in cash."

The knowledge databases could be used to generate what would essentially be "word problems" (in math classes), starting with simple things like "If I put three marbles in a cup, and then I take one out, and each marble weighs 20g, then the remaining marbles weigh 40g in total" and moving on to progressively more complex ones.

If that were to happen, then you'd see companies employing people to create templates which essentially convert databases into sentences/paragraphs, which can then be consumed by the GPT-like model.

It seems like this data would need to be used in a sort of pre-training step though, because you want the model to encode all the relationships, but you don't want it to learn to generate these types of concrete sentences, specifically.

inetsee · on July 27, 2020

As blueeyes has already pointed out "Legislation in one nation will simply handicap that nation." I don't have a lot of faith in our legislators ability to legislate safety without relegating us to an AI backwater.

datameta · on July 27, 2020

You're right. I think ideally the legislation should be international. Maybe something like the Washington Naval Treaty that set an upper limit on the tonnage and armament of new battleships. Or perhaps more aptly something akin to SALT I & II where older models are taken offline to avoid derelict AI systems from falling into malicious hands and to keep the number from growing out of control. Although this parallel is somewhat weak considering the capabilities of one advanced model are more valuable than 10x models of the last generation.

Theoretical wishful thinking, I suppose, but I strongly believe that corp/govt scale ML research should be treated like advanced weaponry because it isn't a matter of if but when AI will be weaponized (whether the flavor of warfare is physical or informational).

Although of course as with weapons treaties - the major powers would likely tend to be selective in what they commit to limiting themselves in.

Jach · on July 27, 2020

The world couldn't even come together on controlling 3D printed weaponry, there's no hope for an arms treaty for AI right now. The "it's not feasible to regulate even if you tried" stance applies too -- you can restrict central actors without much difficulty, and that would work for AI just as well as it works for battleships, but there's a lot of distributed compute whereas there's not a lot of distributed shipyards. Like, you just have to follow what's been done with anime image nets to see that something like GPT-3 is possible for a distributed worldwide group to achieve and is not limited to firms or governments.

Maybe when we have a disaster directly attributable to AI, nations can get on-board with something like the BWC and CWC. Until then, be even more pessimistic. (If you want a fun if rather dry book to read on material technology developments that were in the pipeline a couple decades ago, some of which have come to fruition, as well as some policy recommendations for the technologies that aren't generally good, check out Jürgen Altmann's Military Nanotechnology.)

sbierwagen · on July 28, 2020

>The world couldn't even come together on controlling 3D printed weaponry

Beg pardon? Plastic guns have been banned in the US since 1988 https://en.wikipedia.org/wiki/Undetectable_Firearms_Act

I assume other countries have similar bans.

Jach · on July 28, 2020

As a small amount of metal can be added at the end to make the weapon 'legal', that act does little to address the numerous¹ problems beyond being able to sneak a gun past airport security. Hardly an important milestone in controlling anything. It didn't even affect any gun in existence at its time.

But more generally, as we all know, a ban without provisions for enforcement is useless. Compare to the CWC (Chemical Weapons Convention) which I point to as one of the best pieces of international "coming together" via treaty. It includes requirements that member countries submit to inspections from its enforcement body (OPCW) and furthermore that countries can request the OPCW inspects another member country if they suspect non-compliance. It also includes restrictions on transfer of various chemicals in order to incentivize non-member countries to become members so they can purchase chemicals for industrial purposes from other members.

¹ and bigger, if you're modeling this from assumptions where it's a problem at all -- not everyone thinks it is, "an armed society is a polite society" etc.

sbierwagen · on July 28, 2020

Legislation? International treaties?

An AI-risk maximalist would believe AI is a near-term existential threat, with the prospect of total human extinction. In that scenario, the final backstop measure to a rogue country engaging in AI research is using nuclear weapons.

This... obviously... would be very bad. If it escalated to a full nuclear war, it would kill billions of people. But it would leave survivors, who wouldn't be interested in, or be able to, pursuing AI for decades or centuries. Better than the alternative.

01100011 · on July 27, 2020

One thing I don't hear a lot of people talking about are ML/AI systems in the hands of government agencies. We know that the military and NSA are often ahead in many technologies but when it comes to AI the assumption seems to be that the industry is moving faster than the government. Is that really a safe assumption?

The goverment is openly using autonomous systems to pilot drones, but what else are they leveraging AI for? Threat analysis? Logistics? Weapons optimization? PsyOps?

The DoE is openly a very large consumer of GPUs. What about the military?

confeit · on July 27, 2020

You can get a glimpse by scrolling websites like: https://www.darpa.mil/opencatalog?ppl=view200&sort=title&ocF... [.mil] and looking at DARPA and Office of Naval Research sponsored ML/AI research. The military has been deeply involved with ML/AI research since its inception, and it is near impossible to avoid first - or second degree involvement, if active in ML/AI.

The military wants: automated chat agents/web users that can be sent to dark web markets and hacker IRC channels and report back intelligence. Common sense inference from security and drone footage: predict who the killer is when watching a movie. Author deanonimization and cross-device tracking. Global-scale 99.9%+ accurate face detection.

The Dutch Intelligence Agency organizes a yearly competition with difficult codes to crack. [1] It is rare for someone to answer all questions correctly. The answers require logic, creativity, common sense, linguistics, causal inference, spatial reasoning, expertise, analysis, and systematic thinking. I bet the military would be mighty interested in an automated problem solver for that. And mighty scared some other country gets there first.

[1] https://www.aivd.nl/onderwerpen/aivd-kerstpuzzel

heavyset_go · on July 27, 2020

> Is that really a safe assumption?

Not at all. The government can throw billions of dollars at a problem that, if solved, will never turn a profit or immediately benefit a business.

futureshock · on July 27, 2020

Let me flip this argument on its head. Consider this: About 5 years ago several key SV people including Sam Altman, Peter Thiel and Elon Musk became suddenly very concerned about AI ethics and started OpenAI. What if they, with this insider status, had already seen a GPT-3 like system at Google, Facebook, Baidu or wherever and its capabilities for political and social manipulation so concerned them that they started OpenAI in an effort to bring this tech out of the shadows and into the sunlight so we could debate it and regulate it. GPT-3 might not be a state of the art breakthrough. It could be just catching up with where the big tech companies were 5 years ago so that we can finally see what they are capable of. Corporate secrets are a normal part of doing business and maybe the tech companies didn't like the PR they would have gotten from publicizing something like this. Remember the blowback from Google's project that called business for their store hours? They already struggle with regulators across the world as it is. Do we really believe that little OpenAI is so much farther ahead of Google like the posted article posits?

notadonut · on July 27, 2020

You don't need secrets for your theory of insiders to work. What Thiel and Musk saw was DeepMind. Then they invested in it. Altman acted later.

And the vision of what AI was becoming was voiced much earlier by Yudkowsky, whose Singularity Institute received funding from Thiel.

If anything, they heard what a few prophets were shouting. They saw some early demos in a startup pitch. They responded and DeepMind's work soon became as public as AI research is. That is to say, most people ignored it until the Google acquisition.

exolymph · on July 27, 2020

Yudkowsky's outfit is the Machine Intelligence Research Institute, not the Singularity Institute. The latter might be Bostrom? Can't recall.

pjscott · on July 27, 2020

MIRI was, until 2013, called the Singularity Institute for Artificial Intelligence. Apparently the name change was part of a deal with Singularity University to avoid brand confusion. Announcement here:

https://intelligence.org/2013/01/30/we-are-now-the-machine-i...

notadonut · on July 27, 2020

Thank you.

Thiel was an early backer of SI and attendee of the Singularity Summits that ran from 2006-2012.

https://en.wikipedia.org/wiki/Singularity_Summit

Soon after that, awareness of AI and superintelligence went mainstream, and we got FHI, FLI, etc.

I don't know if Thiel backs MIRI as he did SI. Arguably, he doesn't need to. He made his money on DeepMind and helped trigger a larger movement, and other institutions with a lot more resources, like Alphabet and MSFT, carry forward the torch.

Nuzzerino · on July 27, 2020

Thiel wasn't an attendee, he was one of the people running the conference

notadonut · on July 27, 2020

He was attendee and sponsor and running the conference.

RoddaWallPro · on July 27, 2020

from Gwern, @ https://www.gwern.net/newsletter/2020/05: "This year, GPT-3 is scary because it’s a magnificently obsolete architecture from early 2018, which is small & shallow compared to what’s possible3, with a simple uniform architecture4 trained in the dumbest way possible (unidirectional prediction of next text token) on a single impoverished modality (random Internet HTML text dumps5) on tiny data (fits on a laptop), sampled in a dumb way6, and yet, the first version already manifests crazy runtime meta-learning—and the scaling curves still are not bending!"

It's probably not a state-of-the-art breakthrough at this point. Who knows what OpenAI has done in the intervening two years?

stjo · on July 27, 2020

Sounds like a nice conspiracy, but realistically, how could they hide something like that? Presumably there are hundreds or more employees working on this. If Elon Musk et al. heard about it 5 years ago, this must be one of the best kept secrets in recent history.

pyb · on July 27, 2020

It's best to assume that Google et al. Are not hiding anything "huge". What you see publicly is what there is. Deepmind used to be years ahead, but now they appear to have been, in some very important respects at least, leapfrogged by OpenAI, initially an imitator. It would be interesting to know if Deepmind have indeed squandered their lead away, and if so, why it happened.