"I don't think any commonly used language has a truly context-free grammar"
It's a matter of semantics. (doubly so)
These errors that you describe during compilation are defined by what? Not by the syntax (at least not in R5RS Scheme [1]) but by the semantics. Conflating the two would, of course, give the impression that language syntaxes are not context-free: if we require the syntax to also carry the semantics of the language, then the syntax must be able to evaluate the program, i.e. the syntax must describe a Turing machine.
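I think the confusion comes from compilation. Compilation is a series of evaluation steps taken before complete evaluation, so an error reported at compile time can still be a semantic error rather than a syntactic one.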
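To make the syntax/semantics split concrete, here is a minimal sketch of a reader for a parenthesized, Scheme-like notation. It is not the R5RS grammar, and the names tokenize, parse, and read are my own for illustration. The point is only that the syntactic check accepts an expression whose variable is never bound, because binding is not something the grammar talks about.

```python
def tokenize(text):
    """Split source text into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one expression from the front of the token list (consumes tokens)."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    tok = tokens.pop(0)
    if tok == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # drop the closing ")"
        return items
    if tok == ")":
        raise SyntaxError("unexpected ')'")
    return tok  # an atom: identifier or literal, left uninterpreted


def read(text):
    """Purely syntactic check: build a tree or raise SyntaxError."""
    tokens = tokenize(text)
    tree = parse(tokens)
    if tokens:
        raise SyntaxError("trailing tokens")
    return tree


# x is never bound anywhere, yet the expression is syntactically well formed.
tree = read("(+ x 1)")
```

Only an unbalanced input such as "(+ x 1" is rejected at this stage. Whether x has a value is a question for the evaluator.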
All the errors (as they exist in Scheme) that you describe are DEFINED by the semantics of R5RS Scheme. For example, the semantics of variable reference are:
E[[I]] = λρκ . hold (lookup ρ I) (single(λε . ε = undefined → wrong “undefined variable”, send ε κ)) [R5RS, Section 7.2.3]
Which basically says: if the variable has been bound, use its value to continue the evaluation. If it is unbound, stop evaluating and signal the error "undefined variable".
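A rough Python analogue of that rule, as a sketch only: I collapse the spec's environment and store into a single dict, and UNDEFINED, Wrong, and eval_variable are names I made up for illustration, not anything from R5RS. The shape is the same, though: take an environment and a continuation, then either pass the value on or stop with "undefined variable".

```python
UNDEFINED = object()  # stand-in for the spec's "undefined" marker (assumption)


class Wrong(Exception):
    """Stand-in for the spec's `wrong`: abandon evaluation with a message."""


def eval_variable(identifier):
    """Loose analogue of E[[I]]: returns a function of (environment, continuation)."""
    def run(env, k):
        # Simplification: one dict plays the role of both environment and store.
        value = env.get(identifier, UNDEFINED)
        if value is UNDEFINED:
            raise Wrong("undefined variable")
        return k(value)  # "send" the value to the continuation
    return run


# Bound variable: the continuation receives its value.
result = eval_variable("x")({"x": 42}, lambda v: v + 1)
```

Calling eval_variable("x") with an empty environment raises Wrong instead, and that error comes entirely from the evaluation rule, not from any grammar.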
====
"That's not exactly true - one can rewrite (normalize) a context-free grammar into this form, but CFGs are often written in an "unnormalized" form which has multiple nonterminal symbols, because it tends to be clearer to read in many cases."
I think you misunderstand what the OP is stating. OP refers to the left-hand sides of the productions:
E -> E + E
| E * E
Certainly the left-hand side of a production in a CFG is, by definition, a single non-terminal. I suppose you could write a grammar whose left-hand sides contain multiple symbols and then, if it turns out the language it generates is in fact context-free, rewrite it in the above form, but does anyone actually do this?
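The "single non-terminal on the left" property is easy to check mechanically. Here is a small sketch, assuming a made-up representation where each production is a pair of tuples (left-hand side, right-hand side); has_context_free_form and the second grammar are hypothetical and only there for contrast.

```python
def has_context_free_form(productions, nonterminals):
    """True if every left-hand side is exactly one non-terminal symbol."""
    return all(
        len(lhs) == 1 and lhs[0] in nonterminals
        for lhs, _rhs in productions
    )


# The two alternatives quoted above: E -> E + E | E * E
expr_grammar = [
    (("E",), ("E", "+", "E")),
    (("E",), ("E", "*", "E")),
]

# Hypothetical rule with two symbols on the left, for contrast: A B -> B A
swap_grammar = [
    (("A", "B"), ("B", "A")),
]

expr_ok = has_context_free_form(expr_grammar, {"E"})
swap_ok = has_context_free_form(swap_grammar, {"A", "B"})
```

This only checks the form of the rules. A grammar that fails the check may still generate a context-free language, which is the rewriting case discussed above.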
[1] http://www.schemers.org/Documents/Standards/R5RS/HTML/r5rs-Z...