Obfuscating “Hello World” in Python (benkurtovic.com)
285 points by carljoseph on Dec 19, 2014 | 55 comments



When I first saw the final code, I thought it unfathomable that I could resolve it down to "Hello World!" at any point in my lifetime. The following 20min provided an education into logic, programming, and Python that rivals 200 page reference books with clear explanations and the ideal amount of build up for each step. Thanks Ben for the post!


I thought this was a great post, some amazing code, and humbling since this is WAY beyond my current - and future - ability.

Even more impressive - it looks like the author is a freshman in college! https://www.linkedin.com/in/benkurtovic

Nice work, Ben!


Damn. If I was a freshman in his CS class I would hate his guts.


Can you explain why? I don't really understand this sentiment at all.


It's a human emotion called jealousy.


This is perhaps the clearest example of the value of implementation-oriented code comments. When you need them, they are superheroic. But... don't need them, please.


I barely know any Python and upon first glance I could easily get an idea of how it worked (without looking at the explanation): by computing the required strings out of funny-named variables with shifts and arithmetic. co_nlocals is obviously the only actual integer constant from which all the other values come.

Only then did I read the explanation, and realised I wasn't far off (the only thing I got wrong was the string computation, which I thought would be a string concatenation). I think it says something about the language when even obfuscated code in it is rather readable! (Then again, I do RE where most of the code I'm reading is disassembled machine instructions, so maybe my perspective on what constitutes obfuscation is a bit skewed...)
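
A minimal sketch of the general idea described above, and not the article's actual code: a whole string can be packed into one integer, that integer can be unpacked again one byte at a time with shifts, and small numbers can themselves be built from shifts of the integer 1. The names encode, decode and h are made up for this illustration.

```python
def encode(text):
    """Pack an ASCII string into one integer, first character in the lowest byte."""
    return int.from_bytes(text.encode("ascii"), "little")


def decode(number):
    """Peel characters off the integer, lowest byte first."""
    chars = []
    while number:
        chars.append(chr(number & 0xFF))  # lowest 8 bits are the next character
        number >>= 8                      # drop those 8 bits
    return "".join(chars)


one = 1
# 72 is the code point of "H", built only from shifts of 1: 64 + 8
h = (one << 6) + (one << 3)

message = decode(encode("Hello World!"))
```

Obfuscation in this style then comes down to writing the big integer as an unreadable expression of shifts and sums instead of as a literal.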


"There should be one-- and preferably only one --obvious way to do it."


I don't think this counts as "obvious".

BTW my intuition says Turing completeness requires that there are many ways to do it, for at least some values of "it", but I can't prove it.


Since Turing completeness implies that you can write an interpreter for any other programming language, you can do it in Python in at least as many ways as there are programming languages where you can do it.
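
To make the interpreter argument concrete, here is a rough sketch of a tiny Brainfuck interpreter in Python. It is only an illustration: the function name run_bf, the 30000-cell tape and the 8-bit wrapping cells are assumptions of this sketch, not something specified in the thread.

```python
def run_bf(program, data=""):
    """Run a Brainfuck program and return its output as a string."""
    tape = [0] * 30000
    ptr = 0            # data pointer
    pc = 0             # program counter
    out = []
    inp = iter(data)

    # Precompute matching bracket positions so loops can jump directly.
    stack, jump = [], {}
    for i, c in enumerate(program):
        if c == "[":
            stack.append(i)
        elif c == "]":
            j = stack.pop()
            jump[i], jump[j] = j, i

    while pc < len(program):
        c = program[pc]
        if c == ">":
            ptr += 1
        elif c == "<":
            ptr -= 1
        elif c == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif c == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif c == ".":
            out.append(chr(tape[ptr]))
        elif c == ",":
            tape[ptr] = ord(next(inp, "\0"))  # 0 on end of input (assumption)
        elif c == "[" and tape[ptr] == 0:
            pc = jump[pc]                     # skip the loop body
        elif c == "]" and tape[ptr] != 0:
            pc = jump[pc]                     # go back to the loop start
        pc += 1
    return "".join(out)


# 8 * 8 + 1 == 65, the code point of "A"
result = run_bf("++++++++[>++++++++<-]>+.")
```

Any "Hello World" written in Brainfuck then becomes one more way to print it from Python.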


I saw this yesterday and shared it with a friend who had introduced me to code golf. Simply fantastic, particularly the comments.


I didn't realize there needed to be an IOPCC but now I do.


This is like brainf*ck to me.


Brainfuck relies on pointer arithmetic, while this is using bit shifting.

Brainfuck commands: http://en.wikipedia.org/wiki/Brainfuck#Commands

Bit shifting: http://stackoverflow.com/a/141873/58740


The level of obfuscation is impressive. I wonder if similar obfuscation for JavaScript exists? The one from Google doesn't seem to quite work right and ends up breaking my script. The paid version from jscrambler seems to work, but it seems like a large waste of money to obfuscate one file.


[flagged]


- Fast CPU performance. The PyPy implementation in my tests (JSON processing) was faster than Go and CPython (module used was a C library)

- Compiles to standalone binaries, C++ using Nuitka and other methods such as PyInstaller

- Fast speed of development

- Syntax: matter of personal taste. But overall Python is very terse

- Large community, ~20 years of vast resources for use and learning

- IronPython for .Net libraries and Jython for Java libraries

- Versatile and already in large-scale use for systems, web, administration scripting, game scripting, 3D applications, data modeling, game creation, cross-platform GUI applications, glue code

- iOS/Android/desktop support with Kivy

- Portable, embeddable and extendable

- Focus on readability/maintainability

- One of few languages that covers nearly every possible need well, though not top tier in all

- Bridges nicely to either exploring lower level aspects, strongly tied to C (and by extension assembler). Also offers ways to start learning functional programming within Python

- Already installed on Mac and most Linux distros

- The culture. Join us in a Python Meetup, and find out.

I did my homework before choosing to learn Python, and there is a huge con with Python3 (which many of my items do not apply to right now). But overall I'm not sure what your definition of better is, but I haven't found anything better or even close.


Citing "fast CPU performance" alongside "large community" and "already installed" sounds a bit disingenuous. The version that is fast has a smaller community and is unlikely to be already installed while the one that is already installed and has a large community is not fast.

Like it or not, "Python" means CPython to most people, including most library authors, and you run into problems using PyPy if you assume it's "just Python", especially if you have come to rely on Python's easy interfacing with C.


It wasn't stated alongside. "Most people" aren't ones I want to be around or work with. I come here to associate with those who do their homework. CPython and PyPy are the same Python community as PyPy runs Python code unmodified. PyPy also interfaces just as well with C, inform yourself.

Python is a language. Languages don't have speed. Implementations do.


>"Most people" aren't ones I want to be around or work with. I come here to associate with those who do their homework.

I am not sure what you mean here in the context of what I said. My point was simply that most users of Python don't use PyPy, so most library authors don't test against it.

>CPython and PyPy are the same Python community as PyPy runs Python code unmodified.

Not all of it, as https://bitbucket.org/pypy/compatibility/wiki/Home will confirm.

>PyPy also interfaces just as well with C, inform yourself.

Sorry, this is where I did not make my point very clear. I meant to point out that the way in which PyPy interfaces with C is different from the one used by CPython, so libraries that rely on it don't work in PyPy without an extra porting effort.


CTypes and CFFI still work with PyPy. Also, 'not all of it'- well yes, this is a WIP with a ~2 man project/implementation. Not bad. What are you using that's not on the list? The intention isn't so that authors have to test against PyPy.. PyPy's intention is to be as compatible with CPython as possible.

Either way, while you attempt to discredit PyPy the fact remains that Python code is and can be very fast for CPU heavy work. There's no way around that other than trying to poke holes.


I will never understand the whitespace complaint. Any code you read in python will follow standardized indenting, by necessity. Why on earth is that a problem?


Yeah, I also don't get that. My Python code even has the same indentation as my Java code in university had. I just don't have to spell out the ; and the {}. But in the beginning it was really tough, especially because my eyes were trained to structuring code by {}. I think it's mostly a question about if you want to put in the effort to retrain your instincts or not.


I prefer having to add the {}'s and ;'s, even if the spacing is the same, because they provide a clear and definite bound on scoping, which (unfortunately) often gets messed up in Python.


Yeah that's pretty much what I also experienced. You won't get into that trouble that often, though, after you have trained yourself to the Python look. There are some quite tough problems where I really need to think if I put a line into an if block or not, but that's the kind of trouble you have for logical reasons, not because of having {} or not. I work full time on a Python project and I can't remember a single instance where I "misclicked" a line into or out of a block.


Personally I don't care, although copy-pasting Python can be annoying. If your biggest concern with any general-purpose programming language is its syntax, you have probably not put enough thought into semantics issues - the ones that persist no matter how familiar the syntax becomes.

Anyway, Walter Bright, designer of D language, wrote an article on language design [1] touching on the issue:

> Yes, the grammar should be redundant. You've all heard people say that statement terminating ; are not necessary because the compiler can figure it out. That's true — but such non-redundancy makes for incomprehensible error messages. Consider a syntax with no redundancy: Any random sequence of characters would then be a valid program. No error messages are even possible. A good syntax needs redundancy in order to diagnose errors and give good error messages.

I thought that was interesting. It is a pretty much undebatable argument against taking terseness too far.

On the other hand, I have never had a problem with Python's error messages.

[1] http://www.drdobbs.com/architecture-and-design/so-you-want-t...


The 'ideal' language, in my eyes, would allow terse representation of _algorithms_, not of the language syntax. This often means that you need several different ways to do the same kind of thing in different contexts.

"The syntax is terse" doesn't help if the libraries aren't. We want libraries and common concepts to be terse. That's why I feel that any language designed without heavy consideration for how its standard libraries are to be used is a regressive exercise. Write your libraries how you want them to be used, then figure out the syntax from that.


Easy: because there are people who mistake being used to something for it being good. Also because people know too few different styles of syntaxes and are not required to learn more of them (let's do everything in X! from microcontrollers to OSes!), which makes them inflexible.

Syntax matters, but not in the way most people think it does. Syntax should be judged based on how well it supports semantics of a given language and not based on how "pretty" it is. EDIT: I should probably add that Python syntax matches its semantics and intended usage rather well; for an example of a conflict between syntax and semantics look at JavaScript.


Python syntax helps the semantics pretty well I think.

Why do you need braces when the program's tree structure is obvious from indentation?


> Python syntax helps the semantics pretty well I think.

Yes, I think so too.

> Why do you need braces

You don't. You need some kind of delimiters and newline+indent works just as well as any other kind of delimiter, like braces, parens, begin+end and so on and on.


- Cutting and pasting code may change its functionality. Especially from a non-source medium like web pages.

- I find myself fighting my editor's indentation a lot more as a result. Hitting return at the start of a line in the middle of an existing chunk of code indents to the previous line's indentation not the current line's indentation.

- It's asymmetrical: code drifts off to the right as you go down. Together with the editor issue I've taken to inserting a single '#' at points of down-indentation to mark this.


Your first complaint can be easily solved with a good text editor. I use Sublime and it will handle whitespace differences when pasting. I haven't had any trouble pulling from webpages. Secondly, it isn't any harder to indent after hitting return than to add a bracket. Same number of keystrokes and again, a decent editor can fix this problem too. Lastly, this is just a personal preference. Clojure code also drifts right, but it's just a matter of training your brain to interpret it.


How can it handle whitespace differences when pasting correctly?

I have a conditional block. I want to paste some code at the end of it. Do I want to paste it inside the condition, so indented the same as the preceding line, or outside the condition, so indented one level less than the preceding line?

With a brace delimited language, if I put the cursor inside or outside the brace and paste, an editor has enough information to correctly indent it. With whitespace delimiting, the editor doesn't have enough information.
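
A small sketch of the ambiguity described above. The two functions differ only in the indentation of one "pasted" line, and both are valid Python, so an editor cannot tell from the text alone which one was meant. The function names and the log list are made up for this illustration.

```python
def pasted_inside(flag):
    log = []
    if flag:
        log.append("checked")
        log.append("pasted")   # pasted line kept at the block's indentation
    return log


def pasted_outside(flag):
    log = []
    if flag:
        log.append("checked")
    log.append("pasted")       # same line, indented one level less
    return log


# The two versions only disagree when the condition is false.
same_when_true = pasted_inside(True) == pasted_outside(True)
differ_when_false = pasted_inside(False) != pasted_outside(False)
```

In a brace-delimited language the cursor position relative to the closing brace carries the information that the indentation level has to carry here.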


Those are valid complaints, but they are kind of on the same level as saying "I have to type 'int' to declare an int, what a drag!"


I have to click 'indent'? Ugh! I might as well write this in assembly.


Not supporting Copy & Paste is a feature.


Especially given that Python isn't the only language that does it.


I use it (full-time for 8 years) because of the syntax and the forced use of white space. The language encourages people to write better and more maintainable code.

Out of interest, what language are you using that you prefer the syntax of?

Python is notoriously similar to pseudocode, and I can't think of a mainstream language (other than Ruby) that even seems to care about syntax.


>The language encourages people to write better and more maintainable code.

Python lowers the barrier to entry for programming. Nowadays lots of business analyst types and such are getting into programming because of Python. As a result I have seen some horrendous looking Python code out there which performs quite poorly.


I've written an answer to this on this question on Quora: http://www.quora.com/Why-should-I-use-Python-over-Perl

The short version is:

Different people think differently and different languages require different thought models and as such appeal to different people. Different languages prioritize different computation models and as such are differently suited for different purposes. And lastly, different languages provide different sets of third-party libraries.

Python's main advantages over other dynamic languages are as such:

- it traded expressivity for simplicity of syntax and as such is easy to master

- out of all the dynamic languages (other than R) it is best suited for numerical calculation


And people prefer different coding styles. One person's "icky syntax and forced use of whitespace" is another one's preference. At the same time, you trade your freedom of whitespace expression for simplicity when you need to understand or work with code written by others.


> you trade your freedom of whitespace expression with simplicity when you need to understand or work with code written by others.

To be fair, that's a red herring in an age where most languages have tools available to clean up and reformat a weirdly formatted file directly in your editor. (I have Perl::Tidy bound to Ctrl+E and that combination gets hit more often than the space key, especially on code i am writing.)


You are not just looking at code in your editor and the majority of tools are not able to just reformat code.

Reformatting also breaks line numbers which makes it difficult to talk about the code with others.


> You are not just looking at code in your editor and the majority of tools are not able to just reformat code.

Ctrl+C and Ctrl+V are a thing. The last time i looked at code on paper was years ago. In practice over the past 9 years i have not looked at any piece of code that i wasn't able to tidy so i could work with it.

> Reformatting also breaks line numbers which makes it difficult to talk about the code with others.

Ctrl+Z is also a thing, as is a vast number of pastebins. Again, in practice i have never run into the problem you describe, and i work with what is claimed by some to be the most unreadable mainstream language in existence.


The general consensus seems to be that it was more readable than Perl and forced you to write more maintainable code at the right moment in history.

Today it has an abundance of libraries and enough competent developers to be hard to ignore.

To me the inconsistent naming in the standard library is a bigger flaw than the syntax, which is a matter of habit. Date and Unicode handling (the latter in both 2 and 3) follow closely after that.


Most of the inconsistent naming and unicode handling was fixed in Python 3.


Fair point about the naming but on Unicode handling I'll have to disagree with you. The problems of the kind described in http://lucumr.pocoo.org/2014/1/5/unicode-in-2-and-3/ are not difficult to run into.


I don't think python's syntax is icky at all and I hope you're not judging it by the looks of an obfuscation challenge.


I had to +1 this although I disagree. I really am disappointed that people downvote someone for expressing their opinion.

Let me answer your question, though. Why is it good? It's not good. I hate it. But that's the case with everything you have either used too little or too much. And I try to use Python for everything, which is always a bad idea. But I will continue to use it, because using Python I can express myself more clearly than with Java, C, PHP, Bash, or AWK. The reason might be that I simply know more about it than about the other languages. But I get fast results, I can change them fast, I can deploy packages in it that you can use 1 second after I uploaded them, and if I get a strange exception I just look into the source code of that project (source code packages yeah!), change it just on my system to what I was expecting, then think about patching my code or submitting a patch to that other project. I really love how I can put a 200 line Java program into a single line of Python and still have readable code, actually more readable than 200 lines of any language. And it's the only language I can still read if a Java programmer or a C programmer without much Python domain knowledge has written some code. It's also the only language where I can read the interpreter source code without much trouble, because they used macros and functions to make it look a little like Python code, although it's C. Also I hear from people who code Ruby and Python that Python is the more stable of the two brothers, which I like. I'm not that much into experimenting.

So why do I use it? It makes me more productive because it matches my style of developing. Is it better than most other languages? Probably not, it has its pros and cons. If your target project is in the middle between stable and experimental, use Python. If your balance for optimization leans towards maintainability instead of execution speed, use Python. If you love a few of the projects like Django and the community, use Python. Otherwise you might be happier with something else.


GP is not just expressing an opinion, they're expressing it in a fashion so destructive that it seems likely to be a deliberate troll. "icky syntax" is an opinion that the author should have known was unpopular, expressed aggressively as though it were a widely recognized fact. "I have looked into the alternatives and they all seem to be better" provides zero information and just fans the flames; someone trying to be constructive would mention specific alternatives and specific things they thought were better about them.


Questions don't have to be polite, well-formed, or sensible. That is required of answers.


Says who? There's no rule against downvoting bad questions, nor should there be.


It's required of any question that expects an answer. Nobody's going to answer your question if you're an ass who doesn't show a modicum of consideration of those being asked the question.


It has a very feature rich library. You can build relatively big applications with lots of features in a very short time.


I'm not so sure if that still applies. Since the troubles in the packaging department in 2009-2013 I feel we are lagging behind other languages. Nowadays with fast internet, experienced packaging tool developers, etc, new languages like Go are fast to provide an infrastructure as good as ours or in some regards maybe even better. Some libraries are also itchy because they are written in C or Java style and not in Python style, yet they are in the stdlib. Not even the BDFL likes the stdlib that much any more. Python has a lot of cool features, that's why I will continue to use it. But the library is not the selling point any more, imho.


This code is so ugly that I thought it was Lisp :p



