I'm a huge fan of Mumford, but I think he's stretching a bit when he says that natural images have grammar like language has grammar. The production processes for the two phenomena are enormously different. A natural image is formed when a collection of objects is illuminated by incoming light, and the resulting image is projected onto the retina. The human brain is not involved at all in this process (leaving aside nitpicking about how humans may have shaped the environment). In contrast a natural language sentence is produced when an idea occurs inside the brain, and then various linguistic production processes transform the idea into a serial form, as text, speech, sign language, etc. The latter process involves constraints, capabilities, and eccentricities of the human brain at every stage.
Maybe you could argue that human brains perceive images using grammar-like structures.
>> I'm a huge fan of Mumford, but I think he's stretching a bit when he says that natural images have grammar like language has grammar.
That seems to be what he's saying: that the process of vision in living things is actually a grammar.
He doesn't have to be entirely correct for his intuition to be useful, though. We don't have to assume a pre-existing grammar in order to fit a bunch of data to a grammar; that's what we do in grammar induction, where no such assumption is made when, say, someone models DNA sequences as a grammar.
After all, a grammar is just a representation. The question is how good that representation is, in theory as well as in practice. In theory, it's a good representation if it helps us answer questions about the process we're trying to model. In practice, it's good if it allows us to reproduce the process (especially automatically, with computers), predict the behaviour of agents that employ it, and so on.
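To make the "no assumed generator" point concrete, here's a toy sketch (not from the thread): a hand-written recognizer for the classic context-free language a^n b^n. The rule set is invented for illustration; the point is that we can ask whether data *fits* this representation without claiming the data was produced by a grammar.

```python
# A grammar as "just a representation": a recursive recognizer for the
# language a^n b^n (informally, S -> 'a' S 'b' | 'ab'). Nothing here
# assumes the strings were generated by a grammar; we only check fit.

def fits(s):
    """Return True if s is in a^n b^n for some n >= 1."""
    if s == "ab":
        return True
    return len(s) >= 4 and s[0] == "a" and s[-1] == "b" and fits(s[1:-1])

print(fits("aaabbb"))  # True: the representation captures this string
print(fits("aabbb"))   # False: it does not
```

Whether this is a *good* representation of the data is then an empirical question, exactly as the comment says.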
Human brains also perceive speech or writing or what have you using grammar-like structures. In the sense brought up by the article, a grammar is just a structured logical representation of some physical phenomenon, like sound or imagery.
In other words, grammar is how we perceive things, and it makes sense that it can be generalized instead of only being applicable to a specific sensory input.
The specific point about parsing visual input is much more obvious when looking at things like graphical charts, user interfaces, etc.; there's clearly a grammar of some form involved in, say, determining which button to press on my phone's on-screen keyboard to create the letter 'b', and further involved when determining what a "button" is or what a "keyboard" is or what a "letter" is. Hell, that seems like the same process that lets me figure out what a "screen" is and what a "phone" is. Eventually we go from "type the letter 'b'" to "move your right thumbtip to this position" (and even that can be broken down further).
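The decomposition described above ("type the letter 'b'" down to "move your right thumbtip") behaves like rewrite rules in a grammar. A minimal sketch, with rule names and primitives entirely invented for illustration:

```python
# Hypothetical "grammar of actions": non-terminal goals rewrite into
# sub-goals until only primitive motor actions (terminals) remain.

RULES = {
    "type 'b'": ["locate keyboard", "press key 'b'"],
    "locate keyboard": ["locate screen", "locate keyboard region"],
    "press key 'b'": ["locate key 'b'", "move right thumbtip", "tap"],
}

def expand(goal):
    """Recursively rewrite a goal into a flat sequence of primitives."""
    if goal not in RULES:  # terminal symbol: a motor primitive
        return [goal]
    steps = []
    for sub in RULES[goal]:
        steps += expand(sub)
    return steps

print(expand("type 'b'"))
# ['locate screen', 'locate keyboard region', "locate key 'b'", 'move right thumbtip', 'tap']
```

The nesting of "screen" inside "phone", "keyboard" inside "screen", and so on is just more levels in the same rule table.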
We receive speech coming from a source and parse it using a grammar. One could imagine a similar process for perceiving images captured by the retina.
For output, when a human paints an image they are painting from an image visualized on their mental canvas, just as we realize thoughts produced within our minds as speech.
> when a human paints an image they are painting from an image visualized on their mental canvas
Yes, but images produced by humans are a tiny fraction of images processed by the eye. But every written or spoken sentence was ultimately created by a human brain.
That's why it seems like a big stretch to claim there is a 'universal grammar' involved in visual processing, if you believe that grammar is primarily a way for brains to encode information for communication purposes...
> Yes, but images produced by humans are a tiny fraction of images processed by the eye.
Processed by the eye, yes, but that rises to 100% for images processed by the brain. The brain appropriates images by imposing its own processing on the lower-level visual cortex. Perception is an active process.
This is true. I did not mean to imply that just because I think it is wrong, it actually is. However, the claim seemed to be that the images experienced by the brain are fully synthesized by the brain, which seemed off.
Again, just because it seems off to me does not mean it is wrong. Not my field, and whatnot. I can even see something to be said for visual processing going in stages, such that the stage you are cognizant of effectively operates on images constructed by you. That seems to be a different claim, though.
The claim was "Processed by the eye, yes, but that rises to 100% for images processed by the brain." That is, that the images processed by the brain were 100% constructed by the brain.
The implication I got was that the images you perceive are entirely of your own devising. This seems off to me. Certainly anyone who is blind but still able to visualize a room is using constructed visualizations. But that is a different thing from someone who is able to see.
This is different from written words, which are 100% devised by another being. They may be assembled by a machine, but the words and their meanings are learned and come from taught meanings, not from raw processed experience.
I'm not entirely following what you mean, but that's OK. My hunch is our differences lie in this concept of "taught meaning". I don't think meanings are taught, in any traditional sense. I think they are absorbed, acquired, and synthesized by the incredible pattern matching of the brain, operating off of direct, perceptual experience. Of course, these experiences include things like reflection, reading a textbook, having a conversation, watching a movie, daydreaming, etc.
When one reads a piece of text, it's being interpreted through the complex mental models of the world and layers of meaning that have been built up in the individual's brain over the years.
>A natural image is formed when a collection of objects is illuminated by incoming light, and the resulting image is projected onto the retina. The human brain is not involved at all in this process
No, the retina is a complex processor, and so is the optic nerve. Brain scientists nowadays say the retina is an extension of the brain.
> A natural image is formed when a collection of objects is illuminated by incoming light, and the resulting image is projected onto the retina.
But that's sort of the point, no? The incoming information is not an unstructured white noise of photons striking our retina. There is a sort of structure to the information that can be modelled. One such model of this structured information is as a "grammar tree" (really just a tree, we're coders here.) The example in the article is that the arm occludes the teepee, which occludes the background trees. Any visual system needs to break this hierarchy down.
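Since it really is just a tree, the article's example can be written down directly. A minimal sketch (the node labels come from the example; the traversal is just the painter's algorithm, rendering occluded layers before their occluders):

```python
# The occlusion hierarchy from the article as a plain tree:
# the arm occludes the teepee, which occludes the background trees.
# Each node is (label, [occluded children]).

scene = ("arm", [("teepee", [("trees", [])])])

def back_to_front(node):
    """Yield occluded layers first, occluders last (painter's order)."""
    label, occluded = node
    for child in occluded:
        yield from back_to_front(child)
    yield label

print(list(back_to_front(scene)))  # ['trees', 'teepee', 'arm']
```

Parsing an image would be the inverse problem: recovering a tree like `scene` from the flat pixel array.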
Recursion was recently shown to enable generalization in neural programming architectures [1], but from a critical-inquiry point of view we note that recursion requires a mechanism for maintaining context, so we should look for the existence (or absence) of such a mechanism in animal brains.
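To spell out what "a mechanism for maintaining context" means here: recursion implicitly keeps its context on a call stack, and the same computation can be done with that stack made explicit. A small sketch (function names are illustrative, computing tree depth both ways):

```python
# Recursion needs somewhere to keep context. Implicitly, that's the
# call stack; made explicit, it's a stack of pending (node, depth)
# frames. Any system claiming recursion needs some such mechanism.

def depth_recursive(node):
    _, children = node
    return 1 + max((depth_recursive(c) for c in children), default=0)

def depth_explicit_stack(node):
    best, stack = 0, [(node, 1)]  # the explicit context mechanism
    while stack:
        (_, children), d = stack.pop()
        best = max(best, d)
        stack.extend((c, d + 1) for c in children)
    return best

tree = ("arm", [("teepee", [("trees", [])])])
print(depth_recursive(tree), depth_explicit_stack(tree))  # 3 3
```

So the empirical question becomes whether animal brains have anything playing the role of that stack.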