C-style vs. Python-style syntax: which one would you prefer for your next programming language?
13 points by mojuba on Dec 14, 2008 | 68 comments
I'm designing a new programming language with a few interesting features. I call it a "stream-oriented" language and it is supposed to be highly efficient by design, almost at the level of C/C++.

The spec is ready, I even started writing a compiler. Almost everything is decided, except one major thing: C-style syntax vs. Python-style.

I like both, to be honest. The first is familiar - well, painfully familiar. The second is incredibly clean and aesthetic, but can create purely syntactic obstacles to introducing important features, such as multi-line lambdas. In principle, the compiler can support both styles on a per-file basis, if that makes any sense, but most likely it doesn't.

I'd be happy to hear your opinions. If you are a C/C++ programmer, would you consider a Python-style language as your next tool, provided it is as efficient as C? If you are a Python programmer, are you happy with the style? Is there anything that can be fixed or improved in it? Finally, if you are a Lisp, JavaScript or Ruby programmer, would you consider a Python-like language if it gave you some new possibilities?

(Lisp style is not considered, as this new language is not a dialect of Lisp and it doesn't unify code and data.)




As a multi language user I prefer the python style for the following reasons:

- It is easier on the eye, especially when someone else has to read through & understand it

- I find it quicker to code

That said there are some aspects where C-style is superior - so a combination of the 2 might work well! Invent your own "mash up" style (as long as it is not too confusing).


How about taking a page from Haskell-style syntax, which allows inferring the block structure from indentation, but also allows it to be explicitly denoted as well?


I've found that I really don't like this, aesthetically (mixing nesting styles in code looks bad)

Also, whitespace-based indentation works in Python because Python statements tend to be fairly short. (The Python indentation rules are easily understood.) Haskell layout has been somewhat more difficult to understand (at least for me). I have been writing Haskell code and usually it works, but there have been problems due to failure to indent when I should have, which were incredibly confusing to debug.

In python, it's hard to fuck it up: after a colon, you always newline and indent.


> In python, it's hard to fuck it up: after a colon, you always newline and indent.

But what about:

    if True: print "Yep"
;)


Get with the program:

    if True: print("Yep")


Don't do that: http://www.python.org/dev/peps/pep-0008/

Some features of Python are only really meant to be used at interactive prompts, like import STAR ( I can't write star?)


Completely agreed; I was just pointing it out.


Completely agree. The main reason for my preference is that one of the things I hate most in C-style languages is that one-line functions take up three or four lines (depending on whether or not you use K&R style). This greatly increases the amount of scrolling you have to do.


Put it all on one line then! :) I just grepped some of my C/C++ code and my one-liners consistently look like this:

    virtual const std::string& getCommand() const throw() { return m_command; }


Yeah, that works for really short methods, but for anything even a little more complex you're just replacing vertical scrolling with horizontal scrolling.


That's true for one-liners in any language.


Not necessarily. In Haskell the oneliners are frequently short enough to fit on one screen.

For example, I recently needed to do some Run-Length Encoding:

    import Control.Arrow
    import Data.List
    
    encodeRLE :: (Eq a) => [a] -> [(Int, a)]
    encodeRLE = map (length &&& head) . group
    
    decodeRLE :: [(Int, a)] -> [a]
    decodeRLE = concatMap (uncurry replicate)
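For anyone who doesn't read Haskell, here is a rough Python equivalent of the two functions above, sketched with itertools.groupby. The names encode_rle and decode_rle are mine, not from the snippet, and this is only meant to show the same idea in a different syntax.

```python
from itertools import groupby


def encode_rle(items):
    """Collapse runs of equal items into (count, item) pairs."""
    return [(len(list(run)), value) for value, run in groupby(items)]


def decode_rle(pairs):
    """Expand (count, item) pairs back into the original sequence."""
    return [value for count, value in pairs for _ in range(count)]


encoded = encode_rle("aaabccdd")
print(encoded)                       # [(3, 'a'), (1, 'b'), (2, 'c'), (2, 'd')]
print("".join(decode_rle(encoded)))  # aaabccdd
```

Still short, though not quite as terse as the point-free Haskell version.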


Something like

  if tyrant:
    divide(); conquer()
is possible. Mixing curly braces with Python-style though wouldn't be very nice, I think.


Oh yeah, I didn't mean that at all.

Indentation-as-a-divider is definitely easier on the eye than curly braces. But C-style has some handy bits and pieces to pick up (especially on the OOP side of things - and also in the type-setting; though that has improved in Python)


What do you mean by the OOP side of things?

I find superlong indented blocks in Python a bit awkward, although I can't explain why. When I look at a long class definition, for example, it just begs to be closed with end or "}". Small blocks, though, definitely look better without the block braces.

I'm thinking about introducing an optional end and leaving it to the taste of a programmer. Not sure yet.


Python already has an optional end:

    def something():
        alpha()
        omega()
    #end
I think introducing it as a special case, with no semantic value, into your parser would be an extremely poor idea.


True, except #end doesn't verify you are at the correct level of indentation. This, instead, would be more appropriate:

  def something:
    ...
  end def
but then it makes indentations insignificant altogether.
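To make the "#end doesn't verify the indentation level" point concrete, here is a minimal sketch of an external checker that does verify it. Everything in it is an assumption for illustration: the function name check_end_markers is made up, a "block opener" is naively taken to be any line ending in a colon, and indentation is assumed to use spaces only.

```python
def check_end_markers(source):
    """Return line numbers of '#end' markers that do not line up with an open block.

    Simplification: a block opener is any line ending in ':'; strings,
    ordinary comments and line continuations are not handled.
    """
    open_indents = []  # indentation of each block header that is still open
    problems = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue
        indent = len(line) - len(line.lstrip(" "))
        if stripped == "#end":
            # Blocks opened deeper than this marker were closed by the dedent.
            while open_indents and open_indents[-1] > indent:
                open_indents.pop()
            if open_indents and open_indents[-1] == indent:
                open_indents.pop()
            else:
                problems.append(lineno)
            continue
        # Any other line at or left of an opener's column ends that block.
        while open_indents and open_indents[-1] >= indent:
            open_indents.pop()
        if stripped.endswith(":"):
            open_indents.append(indent)
    return problems


good = "def something():\n    alpha()\n    omega()\n#end\n"
bad = "def something():\n    alpha()\n    omega()\n    #end\n"
print(check_end_markers(good))  # []
print(check_end_markers(bad))   # [4]
```

Because the marker stays a comment, the interpreter never sees it; the check would have to live in a lint step rather than in the parser.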


That would indeed make it semantically valuable. I assume you'd have "end while", "end if", etc?

I can't say I'm a fan, but at least you'd be making it mean something if you did that.

(Also: why make it optional at that point? Seems like it should be required for simplicity's sake)


Shell/Ruby style (if ... endif) is not even considered because I'm no fan of it either. I think it already lost the battle to both C and Python styles.

It would be nice, however, to have some optional verification mechanism that allowed you to stay minimalist for as long as you wish, and to use closing tags for verification and/or aesthetic reasons - for example, when closing a superlong block.


Woah... Be wary of that attitude for there lies the path to C++ hell. Like people say, the path to hell is paved with good intentions.


That's quite a good idea.

My point with braces is that you don't have to indent the code then - which is dangerous :)

I have inherited a couple of PHP and C projects that have taken weeks just to indent to be readable - before I can even begin to work on them! (admittedly that is more of a PHP problem). Forced indentation ensures that everything is obvious: adding curly braces is purely aesthetic IMO :D


    (defun indent-buffer ()
      "Indent entire buffer"
      (interactive)
      (indent-region (point-min) (point-max) nil))


The Unix 'indent' command has come a long way in the last few years, too.


Amen.


I don't think it's the job of the programming language to enforce style. Proper code style should always be used, but should not be forced upon the coder, and that's my major objection to Python syntax.


I don't think it's the job of human beings to enforce style. Proper code style should always be used automatically, but should not force the coder to waste a second thinking about it. The coder should think purely about the logic of the program, and that's my major objection to C.


So proper code style should be forced on the maintainer instead?


That's valid python.


The not too confusing part is especially important. It is very difficult to cut features down and very easy to build them up. Simplicity is king.


I really like Python's style.

Multi-line lambdas aren't such a must-have. You can always use named functions. So, Python's syntax doesn't mean you can't do something, just that you have to give it a name if it's going to be multi-line rather than being anonymous. Plus, if you really care that much, just make multi-line lambdas an exception. Heck, look through the proposals people have made for Python to have multi-line lambdas. You can make multi-line lambdas that really fit with the Python syntax (in my opinion).

So, don't make lambda the deciding factor between Python and C syntax.


I've used Python in the past and never once had problems with the 'whitespace thing'. However, I think it is limiting for things like templates where you want to sort of abuse the language and mix it up with text. Also, it seems to put lots of people off, and even if they're being a bit silly, it may be a consideration in terms of your goals of having the language adopted.


I developed a relatively large web app with Python. Python's meaningful whitespace is the primary reason I no longer use python.

When one is working on a server, the editing tools are sometimes limited and can thus make seeing the whitespace hard. In another, smaller, Python project, we had whitespace errors that appeared and disappeared like ghosts. Even if you have tools that make seeing everything easy now, that doesn't mean you will have them later, on another system - especially since, sooner or later, spaces and tabs will get confused and be invisible.

I haven't programmed in C for years but the meaning of C programs is still obvious at a glance to me. After being away from Python for a couple of years, the meaning of a Python program is opaque to me due to the meaningful whitespace syntax. I know I could pick Python up again if I tried, but this indicates to me that Python's syntax is not as natural as what Steve McConnell calls block structure. The end of a block is just as important to find as the start and so should be just as visible.

Writing one-line functions is, if anything, easier in Ruby than in Python.

Oh, and a syntax that requires comments to show normal program constructs is broken virtually by definition.

I can understand the appeal of the syntax. The syntax indeed appealed to me at one time and I enjoyed the rest of the language. But I think ultimately meaningful whitespace just does not work and will confine Python to being a niche language.


When are your editing tools so limited you must use tabs mixed with spaces? That doesn't really make sense.

Just don't use tabs, ever. If you have whitespace problems you can simply grep for tab characters and remove them.
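If grep isn't handy, the same check is a few lines of Python. This is only a sketch: the function names are made up, and the default tab width of 8 is an assumption chosen because that is how Python's own tokenizer counts a tab when working out indentation.

```python
def lines_with_tab_indent(source):
    """Return 1-based numbers of lines whose leading whitespace contains a tab."""
    flagged = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        leading = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in leading:
            flagged.append(lineno)
    return flagged


def expand_indent_tabs(source, tabsize=8):
    """Replace tabs in leading whitespace with spaces; leave the rest of each line alone."""
    fixed = []
    for line in source.splitlines():
        body = line.lstrip(" \t")
        leading = line[: len(line) - len(body)]
        fixed.append(leading.expandtabs(tabsize) + body)
    return "\n".join(fixed)


sample = "if x:\n\ty = 1\n    z = 2\n  \tw = 3\n"
print(lines_with_tab_indent(sample))                      # [2, 4]
print(lines_with_tab_indent(expand_indent_tabs(sample)))  # []
```

Only the indentation is touched, so tabs inside string literals survive.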


Keep in mind that part of 'scaling down' (making stuff accessible to new users) is that new users are the ones most likely to use lame or inappropriate editors. Stuff like Java and Lisp need more in terms of editors because of verbosity, and sorting through the parens, respectively.


I think you are making 2 good points here.

First, we went through the transition from begin...end to {...} and it probably wasn't as painful for conservative folks as it is now when going from {...} to indentation. So one question for a language designer is whether to target a conservative public or not.

Second, mixing two styles when, for example, generating markup makes Python a burden. They don't go well together. You have to focus on two things: what your code is doing and proper indentation at the same time, while with C-style languages you worry only about the code. Oh, and the "isolate code from markup" concept hasn't been proven to be effective in all situations. So another decision to be made is whether the language is going to be used on the Web.


Having programmed in both Python and Ruby, I've come to appreciate Ruby's combination of clean syntax (no semicolons or curly braces) with a required end keyword (making it suitable as an embedded template language, among other things). Python only has the first, and it's (slightly) the poorer for it.


One point I want to raise is that of how fast the code is to type. If you can come up with an elegant syntax that doesn't require characters that need key-combinations ("Shift" + "[" for an open-brace, "Shift" + "9" for open-bracket, etc) then that would be awesome.

I agree that typing code isn't as fast as typing English (because of all the non-standard characters we have to use []{}()/\~ and so on) but you could help to minimise this gap.


You might get some ideas from the Lua syntax (http://lua-users.org/wiki/SampleCode). It's a nice compromise between the Python and C syntaxes; despite not having syntactically-significant whitespace, it tends to look about as clear as Python, IMHO.

You can definitely improve upon it, though -- the syntax for anonymous functions is a bit cumbersome, in particular. The Lua syntax was deliberately kept tiny to reduce the memory footprint on embedded systems, so some things like switch/case statements are missing, but your language likely won't have the same constraint.

Also, you might want to get an interpreter working 100% first before writing the compiler, as it will be easier to make changes to smooth out emergent quirks in the language. Just a suggestion.


Thanks for your suggestions. Yes, I checked Lua of course, it's neat, but as I said elsewhere, the if ... end syntax is not considered because I find it slightly less appealing than C and Python.

As for interpreter vs. compiler, I thought about this. One approach is to build a simple VM-based system with the entire language infrastructure in place (fundamental types, minimal standard library), which is not much harder than writing an interpreter with the same infrastructure. Then start fine-tuning the VM itself and possibly thinking about the JIT. The original compiler code, which is not the biggest part of your system, is left almost intact in this case.

This is as opposed to the "interpreter-first" approach, where you will be throwing away some significant portions of your code once you start writing the real compiler. I thought it would be a waste of time. Interpreters and compilers are too different in almost every respect.


Lua actually has a very odd syntax in that statements don't have to be separated by any particular whitespace.

    x = 1
    y = 2
and

    x = 1; y = 2
and

    x = 1 y = 2
are all equivalent.

I actually don't mind this. I've never encountered a situation where this style makes the meaning of the code undecidable.


The main Lua book points out that such code is "ugly, but valid", and leaves it at that. It seems to me that "x, y = 1, 2" would be cleaner, if someone were determined to make it a one-liner.


> can create purely syntactic obstacles to introducing important features, such like multi-line lambdas

The reason that python has problems with multi-line lambdas is that python lambdas are not like other python functions. The "body" of a python lambda is an expression while the body of other python functions is an indented statement block, with an implicit "return None" for fall through, requiring an explicit "return" for some other value.

Your lambdas could be like other functions. Or, horrors, you could define two forms, "anonymous" and "expression", where "expression" is like Python lambdas and "anonymous" defines an anonymous function whose body is just like other function bodies (i.e. multi-line, requiring explicit "return"s).
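A small sketch of the expression-versus-statement distinction described above, in present-day Python. The function names are made up for illustration; the only library call is the built-in compile(), used here to ask the parser whether a lambda may contain a statement.

```python
# A lambda body is a single expression, so this compiles fine.
square = lambda x: x * x

# "return" is a statement, not an expression, so the parser rejects it here.
try:
    compile("f = lambda x: return x", "<example>", "exec")
    lambda_accepts_statement = True
except SyntaxError:
    lambda_accepts_statement = False


def falls_through(x):
    """A def body is a statement block; without 'return' the result is None."""
    x * x


def returns_value(x):
    """The named equivalent of the lambda needs an explicit 'return'."""
    return x * x


print(square(3), falls_through(3), returns_value(3), lambda_accepts_statement)
# 9 None 9 False
```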

do-while/until may pose bigger problems.


Why not allow both? To resolve ambiguity, there could be keywords CLIKE and PYTHONLIKE which determine how the code is to be parsed from that point, e.g:

   CLIKE

   def factorial(x) {
      return ( x<=1 ?
         1 :
         x*factorial(x-1) );
   }

   PYTHONLIKE

   printf(factorial(10))  # note ';' not needed
Incidentally, I'm writing a language that'll have C-like syntax which will then be compiled into Lisp-like intermediate code. The Lisp-like code will be available to the programmer, who will thus be able to inspect the syntax tree, write Lisp-like macros, etc. Furthermore, the programmer can embed the Lisp-like code inside normal code (like inserting assembler inside a C function).


I haven't been coding Python that long, but so far multi-line lambdas are the only downside I've seen to the significant whitespace syntax.

So I'd recommend going all Python or mixing Python plus some kind of multi-line lambda support.


Note that Python effectively has an open-brace (the colon) but no close-brace except for unindent. Personally, I hate this because if I'm in emacs and I hit the tab (autoindent) key the result may or may not be what I want. I personally use a programming style where I use a PASS statement as a close-brace so my code can always be safely autoindented without changing its semantics. But this is rather ugly. It would be better to have a less obnoxious but still backwards-compatible close-brace, like ## or maybe #>. Some day I'll hack emacs-mode.elisp to support this.


If you haven't already, take a look at Haskell syntax. It allows mixing indentation with braces. Parsing it is a bit harder of course, but I feel it really does provide the best of both worlds.


Honestly consider the smalltalk approach - particularly keyword parameters. The one thing I would do differently is use { and } for lambdas so you can use [ ] for arrays.


May I make a suggestion? Instead of choosing one or the other, why not play with the syntax a bit? Pick one and try it out on a small but non-trivial task. Then make small changes and iterate. Maybe let some friends play with different flavors of it. This may take more work but the end result will probably be more genuinely useful and your language will probably find more users.


The best answer isn't necessarily the right answer. You should consider which style is closest to your likely users and try to appeal to them.

If your users are independent web developers then Python-style would make sense. If you are aiming at more corporate-comfortable programmers, braces are de rigueur.


Interestingly though, those who used Python extensively in web apps aren't very happy with it, at least because of the problem of mixing indentations and markup - see discussions above. On the other hand, Python is successfully making it into the corporate world, which is obvious from job boards. Which is probably because Python's modularity and clean OO-ness plays well with the hierarchical culture of commercial vendors.

No, this is a tough question. I still don't know.


That's funny; I've used Python extensively in webapps and I'm very happy with it. I use Mako for templating. http://www.makotemplates.org/


To me, it doesn't really matter. As long as whatever editor I'm using has a good syntax highlighter, adapting to the language is easy.

I would say that as long as it's either easy to create syntax highlighting or the community gets some excellent syntax highlighters, it doesn't matter which way you go.


Care to share your language (spec, basic compiler, etc)?


I'm hoping to finalize the spec after this discussion, once a decision is made.

Three important features of the language that probably make it distinct are:

1. You can do something like:

  http_request('google.com') >> parse_headers() >>
    decode_chunks() >> decompress() >> tempfile('tmp')
where elements of the pipe chain are special functions that run as if they were separate threads or subprocesses. In fact they aren't. This is implemented without involving threads or even separate stacks. Similar to the way you play with processes in the UNIX shell, only you do it within your program.

2. Built-in multithreading:

  proc1() && proc2() && finish_sem
which allows the compiler to analyze which parts of the program need thread safety.

3. You can do

  each c in customers:
    c.name + ' from ' + c.country >> std
and in case "customers" were marked as persistent in the declaration, this statement is translated to an SQL query for you. I'm not planning to implement this in the first iteration though. Hoping to polish the language itself before I can start building the "SQL killer" for it.
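For a feel of feature 1 in an existing language, here is a rough analogy using Python generators, which also interleave the stages item by item without any threads. It is not the implementation described above, just a sketch: the Stage wrapper and the three stage functions are hypothetical stand-ins for things like parse_headers() and decompress().

```python
class Stage:
    """Wrap a generator function so stages can be chained with >>."""

    def __init__(self, func):
        self.func = func  # func takes an iterable and yields items

    def __rshift__(self, other):
        # Compose lazily: the left stage's output feeds the right stage.
        return Stage(lambda items: other.func(self.func(items)))

    def __call__(self, items):
        return self.func(items)


@Stage
def strip_blank(lines):
    for line in lines:
        if line.strip():
            yield line.strip()


@Stage
def upper(lines):
    for line in lines:
        yield line.upper()


@Stage
def numbered(lines):
    for number, line in enumerate(lines, start=1):
        yield f"{number}: {line}"


pipeline = strip_blank >> upper >> numbered
result = list(pipeline(["alpha\n", "\n", "  beta\n"]))
print(result)  # ['1: ALPHA', '2: BETA']
```

Nothing runs until the final list() pulls items through, so each line passes through all three stages before the next one is read.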


Cool! Keep us updated, please!


Just make sure you have for loops that use Python style "for x in alist" syntax rather than C++ style "for i = 0; i < alist.size(); ++i" syntax.


It has

  each i in list: ...
  each i in generator: ...
  each i in 0..9: ... # where 0..9 is a set, but this 
                      # construct is optimized down
                      # to a plain incremental loop
or in case I choose C style:

  each (i = list) ...
  each (i = generator) ...
  each (i = 0..9) ...


Here's my advice about the most important thing Python does:

    >>> if x = y:
      File "<stdin>", line 1
        if x = y:
             ^
    SyntaxError: invalid syntax
Why is this important? Because it's such a damnably subtle bug. But what if people want to do something like if r = get_result(): r.do_something()? Make them write if r with r as get_result(): r.do_something() instead.
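For what it's worth, Python itself later took a similar route: from version 3.8 an assignment inside a condition is allowed, but only with a separate := operator, so a mistyped == still cannot turn into an assignment. A minimal sketch, assuming Python 3.8 or newer; get_result() here is a hypothetical stand-in that returns a regex match.

```python
import re


def get_result():
    """Hypothetical stand-in for the get_result() mentioned above."""
    return re.match(r"(\d+)", "42 apples")


# Plain '=' inside a condition is still rejected by the parser.
try:
    compile("if x = y: pass", "<example>", "exec")
    plain_assignment_allowed = True
except SyntaxError:
    plain_assignment_allowed = False

# ':=' makes the assignment explicit and hard to type by accident.
if (r := get_result()) is not None:
    matched = r.group(1)
else:
    matched = None

print(plain_assignment_allowed, matched)  # False 42
```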


I'm a Python programmer. Definitely loved the indentation-based syntax -- so much cleaner and more readable than C-style syntax.


C style but where an error is thrown/compilation aborts/etc whenever inconsistent indentation is found.

I like how easy python tends to be to read, but I dislike the fact that editors' autoindentation can not possibly work without adding otherwise gratuitous pass statements.
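The first suggestion (C-style blocks, plus an error whenever the indentation is inconsistent) could be checked roughly like this. It is a sketch under loud assumptions: the function name is made up, every block is assumed to be braced, indentation is assumed to be a fixed number of spaces per level, and braces inside strings or comments are not recognised.

```python
def check_brace_indentation(source, width=4):
    """Return line numbers whose indentation disagrees with the brace depth."""
    depth = 0
    problems = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue
        # A line that starts with '}' belongs to the enclosing level.
        expected = depth - 1 if stripped.startswith("}") else depth
        indent = len(line) - len(line.lstrip(" "))
        if indent != max(expected, 0) * width:
            problems.append(lineno)
        # Update the depth for the lines that follow.
        depth = max(depth + stripped.count("{") - stripped.count("}"), 0)
    return problems


good = "int f() {\n    if (x) {\n        g();\n    }\n    return 0;\n}\n"
bad = "int f() {\n  g();\n}\n"
print(check_brace_indentation(good))  # []
print(check_brace_indentation(bad))   # [2]
```

A real compiler would do this on the token stream rather than on raw lines, but the idea is the same: the braces carry the meaning and the indentation is only verified against them.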


I don't have a syntax preference, but there are some things that I don't enjoy typing or reading.

->

::

>>

<<

__ugly__


Why would you disallow macros in the Lisp style?


Because people can't get over Lisp's lack of syntax. On the other hand, TCL is also a homoiconic language, which can support Lisp style macros and it too lacks popularity with the general programming community... I don't get it.


You said: Because people can't get over Lisp's lack of syntax.

"People" are getting over it all the time. Lisp is back, baby! Don't miss the elegance train.


Whatever ruby uses...


Even if you're not allowing macros and code-as-data, you may want to consider Lisp-style S-expressions. It doesn't take long to get used to the parentheses, especially given that good Lisp code uses indentation to such an extent that most of the parentheses "disappear" for a human reader.


Borland SilkTest (4-Test) uses indentation instead of brackets. Otherwise it is similar to C/C++. I don't know though. I still kind of like the curly brace, aesthetically speaking!


python except make the meaningful whitespace thing optional. make a way to disable it. it's a problem in some contexts.



