And this is where I decide the language is too far developed on the wrong foundation.

I cannot put up with type systems that don't have complete or near-complete type inference. I don't know why one would start a new project in a language that didn't support Hindley-Milner.




Just to elaborate on what danking00 is saying: the extra syntax in this case is not adding any extra information (to the compiler). The type of the left-hand side can be completely inferred at compile time in this case. What that required syntax adds is pain, but no gain (except for imperceptibly faster compile times).

In Haskell this would look like:

  tagDataMap = [
          ("title",  (3, 30, stripnulls)),
          ("artist", (33, 30, stripnulls)),
          ...
Haskell will correctly infer that tagDataMap :: [(String, (Integer, Integer, String -> String))].

Ok, technically this is an associative list, not a Python dictionary, but it is a map and can be accessed like one. Hell, most people use dictionaries with fewer than 10 items, which are much slower than arrays most of the time.
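
To make "accessed like one" concrete, here is a minimal sketch using the Prelude's lookup, assuming the tagDataMap above (and stripnulls :: String -> String):

  -- association-list access with the Prelude's lookup
  titleSpec :: Maybe (Integer, Integer, String -> String)
  titleSpec = lookup "title" tagDataMap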


"the extra syntax in this case is not adding any extra information (to the compiler)." This actually isn't true in C++, because there could be many classes in scope that could take a list initializer like this. In Haskell, no literal with [] can "turn into" something besides a list, but that can happen in C++:

    vector<int> items = {1,2,3,4};
    int items[]       = {1,2,3,4};
These have different types, so how would C++ know which one you meant if you instead wrote:

    auto items = {1,2,3,4};
Again, in Haskell this isn't a problem because literals have essentially a single type. (Edge cases around integers and strings notwithstanding).

Edit: Just to clarify, in a Hindley-Milner system you could maybe get away with something like that, but in HM everything you name must be used (its type comes from its uses), and that isn't the case in C++:

    void foo() {
      auto items = {1,2,3,4};
      return;
    }
I can then make two classes with list constructors:

    struct FooClass {
      FooClass(std::initializer_list<int> list) {
        cout << "Made a Foo!" << endl;
      }
    };

    struct BarClass {
      BarClass(std::initializer_list<int> list) {
        format_your_hard_disk();
      }
    };
Obviously there are consequences to choosing the right type, but the type of that value never leaks out of the function. Nevertheless, because side-effects can happen anywhere, even in a constructor, C++ cannot optimize that out.

This might be a convoluted example, and it may be flawed, but conjuring up others is not hard, and they demonstrate that C++ simply cannot ever have true HM type inference. Since the "real deal" is not possible, the language is complex, and the standard is large, I would not expect to be able to live without annotations in C++-land. (Again, the FQA makes the horror of multiple non-orthogonal solutions to the same problems quite clear).


Your example,

    void foo() {
      auto items = {1,2,3,4};
      return;
    }
is a case of poorly designed syntax; in a true Hindley-Milner system there is no expression that lacks a type. Now, we could add some syntax that screws that up, say:

    let v = I-HAVE-AN-AMBIGUOUS-TYPE in 0
But that's rather silly, isn't it? If a value isn't used then I would, personally, like my language to optimize it away. So why not ditch such silly syntax?

This is part of why I said that the foundations of C++ are too far-gone.

And this syntax isn't necessary to save on typing. In a language that supported syntactic macros (such as Scheme or Racket) we could write something like:

    auto items = my-vector-syntax(1,2,3,4);
which expands to something like:

    std::vector<int> items;
    items.push_back(1);
    items.push_back(2);
    items.push_back(3);
    items.push_back(4);
If you're interested in true syntactic macros for non-sexp languages (though I do suggest getting over the parentheses; my color settings make them nearly indistinguishable from the background), look at Rust [1].

Actually, Rust also has type inference [2].

Hell, stop programming in C++ and start using Rust! [3]

[1] http://dl.rust-lang.org/doc/tutorial-macros.html

[2] http://dl.rust-lang.org/doc/0.4/rust.html#type-system

[3] http://www.rust-lang.org/


Having programmed Haskell for the last seven years (and C++ for zero), I have a fairly decent handle on what HM is all about. If all you had said was that the foundations of C++ are too far gone, there wouldn't be anything to discuss. But the foundations of C++ being what they are, there's no point being offended when unfixable things go unfixed. I love HM, but you can't just throw it into any old language simply because it's cool. The language's semantics need to allow for it, and they just don't in C++.


Correct, I do not suggest adding HM to C++.

I suggest using an HM language when you start a new project.


It's possible to do exactly that (a function that initializes a vector with its arguments) in C++ with variadic templates...

You might end up in trouble if you did something like

    auto v = vec(
                  vec(1.2, 3.4),
                  vec(0)
                );
I'm not sure how well other languages deal with that.
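
For what it's worth, the analogous nested literal is unproblematic in Haskell: numeric literals are polymorphic, so the 0 simply unifies with Double. A minimal sketch, assuming plain nested lists rather than a vec function:

  -- 0 has type Num a => a, so it unifies with the surrounding Doubles;
  -- the signature would be inferred and is written out only to show it
  v :: [[Double]]
  v = [ [1.2, 3.4]
      , [0]
      ]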


"Hell, stop programming in C++ and start using Rust!" I plan to try to do that, when Rust has become more stable.


No need to downvote this; I'm an enthusiastic Rust follower and I too recommend that you stay far away from Rust until it settles down a bit. Wait for 0.5 if you're adventurous, 0.6 if you expect some degree of feature-completeness, 0.7 if you're fine with half-finished standard libraries, and even later if you expect a high degree of stability, ergonomics, documentation, or performance.


This is definitely true and demonstrates how competing design decisions can add huge complexity to a language. C++ committed early to design goals like heavily context-dependent overloading, and that constrains which features it can add later.


Integers and Strings aren't edge cases. Their literals are polymorphic and typed.

C++ could also type its list initializers with some polymorphic type (similar to Haskell's Num) but didn't do so.

This is not inherent.
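
Concretely, a Haskell numeric literal by itself has the polymorphic type Num a => a, and each use site pins it down. A minimal sketch (the signatures are written out only to show where the literal lands):

  n :: Double
  n = 5        -- the literal 5 used at type Double

  m :: Int
  m = 5        -- the same literal used at type Int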


They're not worth discussing because they're a counterintuitive mess rather than a case study of the glory of HM. Maybe edge case isn't the right word, but they're definitely not something I would hail as a perfect resounding success.

There are no polymorphic literals in ML, just polymorphic math operators, which is enough of a blight on the standard that OCaml discarded it and forces you to use different operators for real and integer arithmetic. And there's only one kind of string in both MLs.

Haskell's Num hierarchy is troublesome. They bought usability for + at the price of complexity for /. It's extremely unlikely that you could write a program in Haskell that does much arithmetic and have it build correctly on the first try without any manifest typing. This is one reason students of Haskell find things so confusing: type declarations are necessary at the top level simply because the extensions and complexity of modern Haskell break HM if you try to use it everywhere. Also, the class system in there is not especially mathematically correct, which leads to the numerous replacement Preludes that try to do a better job but haven't caught on.
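
(A concrete instance of the / friction, my own example rather than anything from the parent: sum and length push an Int into a Fractional context, and the conversion has to be spelled out.)

  mean :: [Double] -> Double
  -- mean xs = sum xs / length xs            -- rejected: length xs :: Int
  mean xs = sum xs / fromIntegral (length xs)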

Strings are edge cases because they are not polymorphic unless you enable OverloadedStrings. Once you do, you will either replace the built-in string with something else (ByteString or Text) or find yourself in the same kind of trouble you'd be in with Num.

Let me be clear: I'm not saying that these problems are showstoppers. They're really minor annoyances once you're experienced, though they contribute to confusion for beginners. The point I'm trying to make is that you can't just drop HM into any old language and expect it to work. A greater point would perhaps be that all languages have warts simply because they're large, complex beasts (Haskell and C++ especially) and it's unproductive to point to a missing feature in one and demand some sort of perfected version of the other's.


> They're not worth discussing because they're a counterintuitive mess

Do you really believe Num overloading is a counterintuitive mess? I disagree completely.

> OCaml discarded it and forces you to use different operators for real and integer arithmetic

Which is pretty terrible.

> Haskell's Num hierarchy is troublesome.

Yes, but that's an orthogonal issue.

> This is one reason students of Haskell find things so confusing: type declarations are necessary at the top level simply because the extensions and complexity of modern Haskell break HM if you try using it everywhere

That sounds like FUD to me, a heavy Haskell user. Type declarations at the top level are generally necessary to avoid the dreaded MR (monomorphism restriction) and for documentation purposes. Modern Haskell doesn't heavily use extensions that require type annotations at the top level.
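
For readers who haven't met the MR, a minimal sketch (my own, not the parent's) of how it bites a point-free top-level binding:

  double = (* 2)
  -- under the MR this binding is made monomorphic, and defaulting then
  -- picks Integer, so `double 1.5` elsewhere is a type error; a signature
  -- or an explicit parameter (double x = x * 2) restores Num a => a -> a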

> Strings are edge cases because they are not polymorphic unless you enable OverloadedStrings. Once you do, you will either replace the built-in string with something else (ByteString or Text) or find yourself in the same kind of trouble you'd be in with Num.

It lets me use literals for Lazy Text, Strict Text, and String with the same syntax, which is nice.
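
A minimal sketch of that, assuming the text package:

  {-# LANGUAGE OverloadedStrings #-}
  import qualified Data.Text      as T
  import qualified Data.Text.Lazy as TL

  a :: String
  a = "one literal"

  b :: T.Text
  b = "one literal"

  c :: TL.Text
  c = "one literal"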

My point is merely that giving a type to a polymorphic initializer is possible, and C++ chose not to.


The distinction between your point and mine is becoming minuscule, but I must defend my position on Haskell, as a heavy user myself. If Num doesn't seem to be a problem to you, it's because you supply top-level annotations that disambiguate it. Try removing all the annotations from whatever you did last week and see if it still compiles cleanly. I'd wager it doesn't. This isn't an issue in practice because we supply annotations in most cases, but don't be fooled: top-level annotations, despite the rhetoric, really are essential for disambiguation in modern Haskell programs. As for strings, the problem doesn't appear in practice because people mostly don't intermingle string types in the same module. That's what makes Num tricky: it's easy to find yourself with Ints and Integers together, wanting to divide them and get a float, and figuring out how to resolve these minor issues is a real hurdle for learners. Just ask my friends!


This is one of the files in a project I worked on last week.

https://github.com/Peaker/bottle/blob/master/codeedit/Editor...

I removed all the top-level type declarations, and only one definition broke, because of the MR.

  showP :: Show a => a -> String
  showP = parenify . show
Once I removed the type declaration, I made it work again by adding a parameter to avoid the MR:

  showP x = parenify (show x)
and everything compiles smoothly.

Feel free to browse the bottle repo -- and try to build it without top-level declarations. Apart from a few functions in the entire project that use Rank-2 types, you won't need any declarations.


Thanks for sticking with this conversation.

This is very good code.

One difference I see between our styles, which may explain the different behavior we see, is that you're quite meticulous about importing only the parts of modules you need, and you make heavy use of qualified imports. My style has been to import everything in case I need it later and to use qualified imports only when absolutely necessary, and that must be creating the unnecessary ambiguity I have to deal with. I will try to adopt your style and see if it cleans up my error messages, and I'll encourage my friends to do the same.


Even in Hindley-Milner type systems it is considered good practice to add types as documentation to top-level constructs (see Haskell). In Python it is likewise considered good practice to add argument and return type info to the docstring. In a dynamic language you would also have to add a unit test or two for some of the things that the compiler can catch for you.

Looking at the complete picture, a language with local type inference (like C++11) ends up more or less as verbose as one with complete type inference.


> it is considered good practice to add types as documentation to top-level constructs

But with type inference your tools can do that for you (e.g. C-u C-c C-t in haskell-mode).


A good IDE can fill in the types in C++ too.


Is there a C++ IDE that can figure out the function signature after you have written something like

_ f(_ a, _ b, _ c) { YOUR; CODE; HERE; }

?
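
(In an HM language that is exactly what you get. A minimal sketch, with a made-up function body:)

  -- define f with no type annotations at all...
  f a b c = a ++ replicate b c

  -- ...then ask the compiler for the whole signature:
  --   ghci> :t f
  --   f :: [a] -> Int -> a -> [a]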


Wrong foundation? No. Foundation you don't want? Sure.

Try to remember that Hindley-Milner is very hard to do outside of functional languages like ML and Haskell.


Yes, type inference is harder with C++'s language design (OO comes to mind as a particular problem); hence my claim that it's the wrong foundation to build on. Of course, people like challenging problems and are working to bring more type inference to OO languages [1].

Furthermore, I assert that this truly is the wrong foundation. For new projects that must have OO, Scala provides local type inference and object orientation. If you're really hurting for some manual memory management, look at Rust [3] or Habit [2].

If we can recover the features we love on a new foundation that provides new features like type inference or memory-safety, then we've found a better foundation, IMHO.

[1] http://www.cs.ucla.edu/~palsberg/typeflow.html

[2] http://hasp.cs.pdx.edu/habit-report-Nov2010.pdf

[3] http://www.rust-lang.org/



