To me, expressiveness is about expressing an idea with as little incidental complexity as possible, which is definitely different from expressing it in as few characters as possible.
I don't think it is different, actually. 'Kolmogorov complexity' (which is essentially the global minimum of the 'essential complexity' of a particular algorithm) is specified in terms of length.
Applying Kolmogorov complexity in this context is a bit more tricky than you make it appear. Kolmogorov complexity measures the complexity of a string as the length of the shortest possible program that generates that string.
So to measure the complexity of a program P we have to write another program G that generates P and then take G's length. It's not a given that a generator for a verbose language is necessarily longer than a generator for a terse language.
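To make that concrete, here is a toy sketch in Python (my own example; both snippets and the lengths are invented purely for illustration): what gets measured is not the length of P but the length of a program that writes P out.

    # A long but very repetitive ("verbose") program: 9000 characters.
    verbose_program = "print(1)\n" * 1000

    # A short but patternless ("terse") program: roughly 50 characters.
    terse_program = "f=lambda a,b:(a^b)*7+3;print(f(0x9c4e21b7,0x5d13))"

    # A generator for the verbose program can exploit its redundancy...
    gen_verbose = 'import sys;sys.stdout.write("print(1)\\n"*1000)'

    # ...while a generator for the terse program has little choice but to quote it verbatim.
    gen_terse = "import sys;sys.stdout.write(" + repr(terse_program) + ")"

    print(len(verbose_program), len(gen_verbose))  # 9000 vs ~46
    print(len(terse_program), len(gen_terse))      # ~50 vs ~80

Here the generator for the much longer "verbose" program is far shorter than the generator for the "terse" one, which is exactly the point.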
But more importantly, what we want to measure is the mental effort required to write and understand code in different languages. To measure that effort in terms of Kolmogorov complexity, we'd have to write a program that generates our states of mind while programming.
Actually, given a Turing machine M which outputs x when given x, the description of program T0 (for terse language T) is the concatenation of the [specification of the] machine M with T0, i.e. M T0. Likewise, the description of program J1 (for verbose language J) is M J1. Given a sufficiently small M and a large T0, the contribution of M is negligible, and we see that Kolmogorov complexity is roughly analogous to the respective sizes of T0 and J1 themselves (for this specific Turing machine M).
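In symbols (my own rendering, where U is whatever fixed universal machine runs a machine description followed by its input):

    K_U(T0) \;\le\; |M| + |T0|, \qquad K_U(J1) \;\le\; |M| + |J1|

When |M| is tiny next to |T0| and |J1|, the right-hand sides are dominated by |T0| and |J1| themselves.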
You're forgetting that Kolmogorov complexity is the shortest possible description, not just any description of the desired output. The concatenation you're suggesting is highly unlikely to be the shortest possible one.
You have to think of it in terms of compression, because that's what Kolmogorov complexity is really about. It is certainly possible that after compression J1 is shorter than T0.
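As a crude stand-in for that idea (zlib is nowhere near an ideal compressor, and both snippets below are made up; this only illustrates the direction of the comparison, not a measurement of any real language):

    import os, zlib

    # "Verbose" but highly regular source text: long, yet full of repeated structure.
    verbose_src = ("public static int Add(int a, int b) { return a + b; }\n" * 200).encode()

    # "Terse" but patternless text: short, yet nothing for a compressor to exploit.
    terse_src = os.urandom(2000)

    print(len(verbose_src), len(zlib.compress(verbose_src, 9)))  # ~11000 -> well under 200
    print(len(terse_src), len(zlib.compress(terse_src, 9)))      # 2000 -> ~2000 (or slightly more)

After compression the long, regular text ends up much shorter than the short, irregular one.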
You're forgetting that I get to pick the Turing machine to use. Kolmogorov complexity is the shortest possible description relative to a particular Turing machine. Re-read my previous post with that in mind.
No, you don't get to pick the particular Turing machine, if by "particular Turing machine" you mean a Turing machine including a particular transition function, which is the conventional definition of a Turing machine (http://en.wikipedia.org/wiki/Turing_machine#Formal_definitio...)
In terms of this definition we need two Turing machines: one whose transition function can be proven to be the shortest that describes J1, and another whose transition function can be proven to be the shortest that describes T0. These two Turing machines (or transition functions) are compressed representations of J1 and T0 respectively.
What I'm saying is that the one describing J1 can be shorter than the one describing T0 even if J1 is longer than T0.
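In other words (my notation), the claim is that length and complexity can order differently:

    |J1| > |T0| \quad \text{while} \quad K(J1) < K(T0)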
I think we are approaching this from different angles. I am thinking of a single contrived M for which a simple copy-input-to-output is the primary supported operation, and all other operations/modes, although still capable of performing arbitrary computation [hence able to execute 'compressed' programs if you should wish to provide them, and, by definition, still Turing-complete], are exceedingly clumsy/verbose.
I have not only the transition function but six other entities to define to accomplish this (really five, barring the 'blank symbol'), so I am certain that it is not only possible but that M would likely be much smaller than one might think at first. With such a small M (one that penalizes attempts to compress T0 and J1), a large T0 [such as the code base of Microsoft Windows ported to F#], and an even larger J1 [such as the same code base written in C], we still get my desired result.
I understand what you're saying, but I fail to see how your M is related to Kolmogorov complexity. You're not free to define some contrived M with particular properties that make your assumption true. Kolmogorov complexity explicitly asks for the shortest possible M for each individual string (i.e. J1 and T0). Here's the definition from Wikipedia (http://en.wikipedia.org/wiki/Kolmogorov_complexity):
"More formally, the complexity of a string is the length of the shortest possible description of the string in some fixed universal description language".
M is not that universal language. M is a program written in the universal language of Turing machines. I guess that is our misunderstanding. If you define M itself as that universal language, but M only ever does the one contrived thing you said (i.e. there is only one program written in that language), then it is not a universal language any longer, and any comparison of lengths becomes completely pointless.
In my case, the 'universal description language' is one that supports one primary operation, let's say CC for 'carbon copy', plus some number of other operations (contrived to have a quite clumsy/verbose interface). The CC operation takes one input--the 'text' to output (which in my case would be T0 or J1). The other operations would be only the minimal set required to remain Turing-complete (and, once again, would be devised such that no sane person would want to use them for any sufficiently complicated task). So, M is the implementation of the above UDL.
Then, the question becomes, how much logic do we have to put into M itself to achieve the above? I posit that it would not be very much in fact (but that should be demonstrable if someone were sufficiently motivated).
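Here is a rough sketch (in Python; all names and the particular 'clumsy' instruction set are invented by me) of the kind of UDL I have in mind. Whether the clumsy mode really penalizes compression enough on very large inputs is, of course, the open question:

    # Sketch of the contrived UDL: a program is either "CC" followed by a
    # text to copy verbatim, or a deliberately clumsy Brainfuck-style
    # program in which every primitive must be spelled out as a long
    # keyword. The second mode keeps the language Turing-complete.

    KEYWORDS = {  # clumsy spellings of the usual eight primitives
        "MOVE-THE-HEAD-ONE-CELL-TO-THE-RIGHT": ">",
        "MOVE-THE-HEAD-ONE-CELL-TO-THE-LEFT": "<",
        "INCREMENT-THE-CELL-UNDER-THE-HEAD": "+",
        "DECREMENT-THE-CELL-UNDER-THE-HEAD": "-",
        "WRITE-THE-CELL-UNDER-THE-HEAD-TO-OUTPUT": ".",
        "READ-ONE-BYTE-OF-INPUT-INTO-THE-CELL": ",",
        "BEGIN-LOOP-WHILE-THE-CELL-IS-NONZERO": "[",
        "END-LOOP-WHILE-THE-CELL-IS-NONZERO": "]",
    }

    def run(program: str, stdin: bytes = b"") -> bytes:
        if program.startswith("CC"):
            # Carbon-copy mode: the description of a text is just "CC" + text.
            return program[2:].encode()
        # Clumsy mode: translate keywords to the eight primitives, then interpret.
        code = "".join(KEYWORDS.get(tok, "") for tok in program.split())
        tape, head, pc = [0] * 30000, 0, 0
        out, inp = bytearray(), list(stdin)
        jump, stack = {}, []
        for i, c in enumerate(code):  # pre-compute matching brackets
            if c == "[":
                stack.append(i)
            elif c == "]":
                j = stack.pop()
                jump[i], jump[j] = j, i
        while pc < len(code):
            c = code[pc]
            if c == ">": head += 1
            elif c == "<": head -= 1
            elif c == "+": tape[head] = (tape[head] + 1) % 256
            elif c == "-": tape[head] = (tape[head] - 1) % 256
            elif c == ".": out.append(tape[head])
            elif c == ",": tape[head] = inp.pop(0) if inp else 0
            elif c == "[" and tape[head] == 0: pc = jump[pc]
            elif c == "]" and tape[head] != 0: pc = jump[pc]
            pc += 1
        return bytes(out)

    # Copying a program costs two characters of overhead; "compressing" it
    # would force you into the clumsy mode, where every primitive costs
    # roughly 35 characters.
    print(run("CC" + "let add a b = a + b"))

Copying a text costs two characters of overhead, while any attempt at 'compression' has to be written with ~35-character keywords.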
I think KC requires that M be constant across the texts you wish to compare (so that texts across various languages are normalized to an underlying architecture).
If one views M in this light (as an attempt to level the playing field across competing hardwares/algorithms/languages), then we could see it as the concatenation of the following:
1 - the specification of the raw hardware.
2 - the microcode for the hardware.
3 - BIOS, drivers, HAL, OS, etc.
4 - language implementation/runtime/shared libraries etc.
Now, to compare any two real-world implementations, we need to port them both to the language implemented in #4, for running on the hardware specified by #1-3, and then sum each implementation's length with M's length.
And if we want to compare two algorithms from different languages (but running on the same 'machine'), we can shift the code in #4 into the 'texts' of the programs themselves (so as to keep M constant between them). We can do this all the way down to the minimum M required to be Turing-complete (and nearly all the way 'up' to the programs under test); i.e., the division between M and a particular text, L0, is somewhat variable.
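Put in symbols (my own notation, with |.| for the length of a description), the total is the same however the line is drawn:

    |M| + |P| = (|\mathrm{hw}| + |\mathrm{ucode}| + |\mathrm{OS}| + |\mathrm{runtime}|) + |P|
              = (|\mathrm{hw}| + |\mathrm{ucode}| + |\mathrm{OS}|) + (|\mathrm{runtime}| + |P|)

i.e. moving #4 out of M and into the text changes neither the sum nor the comparison, only where the boundary sits.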
So you are constructing a particular universal language that is designed to make any compression algorithm written in it so long that the concatenation of the algorithm and its own output would always be longer than its input.
First of all, I doubt that this is even possible, because the input, i.e. J1 and T0, is arbitrarily long in the general case, while the algorithm's length must be fixed before it sees any input, i.e. it is constant. That's related to the invariance theorem: http://en.wikipedia.org/wiki/Kolmogorov_complexity#Invarianc...
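For reference, the statement at that link boils down to this (my transcription): for any two universal description languages U and V there is a constant c_{U,V}, independent of the string x, such that

    |K_U(x) - K_V(x)| \;\le\; c_{U,V}

The penalty your language can impose is bounded by such a constant, while J1 and T0 can grow without bound.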
But even if it were possible or if we agree to arbitrarily limit the size of the input for practical purposes, you wouldn't have achieved your original goal, which was to show that a verbose language can never be as expressive as a terse language.
Your universal language is neither a Turing machine nor any other generally accepted universal language, so no one would agree to accept this particular language as a test of Kolmogorov complexity.
If you can disprove the invariance theorem you may have come up with a valid criticism of Kolmogorov complexity itself, but you were the one who initially based your argument about the expressiveness of programming languages on Kolmogorov complexity. If you invalidate Kolmogorov complexity, you invalidate your own argument as well.
In any event, this has been a fun thought experiment.
What I think I am suggesting is to construct a particular universal language designed so that any compression of some reasonable T0 and J1 written in that UL (e.g., the source code of Microsoft Windows written in F# and C respectively) comes out longer than T0 and J1 themselves, hence leaving them as their own Kolmogorov limit (if you will). This seems entirely doable to me within the confines of the 5-tuples afforded me by the formal definition of a Turing machine (although I can see how others, including you, might be less convinced). :-) Maybe I, or someone else sufficiently motivated, will attempt it someday.