Performance These Days (inessential.com)
79 points by mpweiher on April 21, 2016 | 45 comments



I'm afraid I have to disagree with the basic premise of this article and say that app performance matters more than developer performance.

Improving the sign-up funnel speed on our app by one second resulted in an 11% increase in signups. People are impatient and lazy, and everything you do to improve app performance will improve signups, retention, app store reviews and the perceived quality of your app.

The point isn't to get apps or new features out as quickly as possible, it's to make a quality product and the speed of your app is critical to that.


The actual answer, as always, is: it depends. If you are working for a startup desperately looking for product-market fit, by all means optimize for developer performance! That 11% gain is noise if you are in the right market and growing accordingly.

But. Give that startup some time and it's not a startup any more. That 11% sure looks tasty now!

I hope we can all agree on that.

Now, here's the actually interesting bit: almost all the posts people write and/or pay attention to on the internet revolve around startups. Startups are hip and cool! And then all the poor enterprise developers like me read that stuff and get the wrong ideas. We have entirely different problems, mostly related to moving data into other people's heads, not so much about the machine.

And yet, here I am, reading HN...


A little sidetrack here: As a user, I absolutely hate apps that require me to sign-up, especially if it requires email verification. It's an instant uninstall from me, no matter how compelling the app may otherwise be.

Developers, what is it that you absolutely cannot offer the user without forcing them to register? If it's personalization, then store it on the user's device or iCloud or something similar, or say, generate a random ID number on first-run and inform the user to save it if they want to transfer/restore their preferences in the future.

ANYthing other than this deliberate speedbump which only decreases the total amount of users you would have.


Absolutely this. I've become completely unforgiving on this, especially with mobile apps. I am far too frequently left wondering if the devs—or, more likely, their managers—missed the part of the documentation that allows them to easily store data on my device for their app to use.


> I'm afraid I have to disagree with the basic premise of this article and say that app performance matters more than developer performance.

Developer performance directly translates to app performance when the developer is tasked with improving performance. Fetishizing app performance for every little microbenchmark is counterproductive if it results in you sacrificing developer performance sufficiently.

The basic premise of the article might be stated as "developer performance matters more than app performance, when app performance doesn't matter" - which is bordering on a truism. Do you disagree with this premise?

> The point isn't to get apps or new features out as quickly as possible, it's to make a quality product and the speed of your app is critical to that.

So the point is to make a quality product as efficiently as possible? This requires a balance of features, stability, performance, maintainability, etc. - all things directly impacted by developer performance. The more effective I am, the faster I can implement features, find and fix bugs, profile and rewrite hot loops, refactor and review, etc.


The point of the article was that you can easily get that performance boost in any language - especially when network latency is involved; it will be there no matter the language used.

You can make fast things in C++, but your competitors using Clojure will crush you. Developer performance isn't about sacrificing app performance, but rather getting there faster and making sure the competition doesn't catch up.


I don't think you're catching the basic premise here. Brent is suggesting that when you're writing code targeting iOS or the Mac, the majority of time spent is within the system libraries (in most situations). So the language of the driver code (i.e., your stuff) doesn't have much impact on app performance.


You're correct, and that's the same excuse Ruby and Python advocates use when developers complain about performance. Frankly, it's bullshit.

Take Rails as an example. In 2006 we were told that Ruby's slowness didn't matter since web apps spend most of the time waiting for IO, processors were getting faster, yada yada yada.

Guess what? 10 years later and Rails is still slow. Django is also slow. Template rendering is slow. I've written enough Erlang, Clojure and OCaml apps to know that no, I don't need "russian doll caching" in a fast language. Hell, I rarely need caching at all. My database layer is fast enough. /rant


I'm a Django developer and I agree with you. We've ditched Django templates entirely. We use it as our API platform only - most of the computation time is spent in json and in Postgres.


Django is slow sometimes, true (the worst is when it's slow where it doesn't make sense, i.e. when you don't expect it)

There are tricks and things that help (if you want to shoot your DB performance in the foot, use multi-table Model inheritance)

Try Jinja2 for templates


> the majority of time spent is within the system libraries (in most situations). So the language of the driver code (i.e., your stuff) doesn't have much impact on app performance.

Is this really true, or is it a perception based on whatever filter bubble we're in? E.g. if one is an Apple developer for a corporate backoffice, one does CRUD stuff on iOS. But if one is not in that situation, one thinks performance matters.

If we take a look at top paid apps at: http://www.apple.com/itunes/charts/paid-apps/

... one sees a lot of games and some image manipulation apps (facetune, faceswap, video editor, etc). These types of apps require tight loops outside of system libs and performance tuning. I see a few CRUD apps on that list such as HotSchedules and maybe the fitness apps which probably don't need ultratight performance loops.


> Improving the sign-up funnel speed on our app by one second resulted in a 11% increase in signups.

I have a hard time believing this is the sole reason for the increase.


Depends on how slow it was before they improved it.


Or they went from 9 signups to 10.


The article's title is somewhat coy about the author's central thesis which he states at the bottom[1]: Basically, he prefers a language that is dynamic instead of static.

Instead of giving a reader an immediate heads up about the dynamic-vs-static argument, he gave it the subtitle of "Another kind of performance" ... which he means to be "programmer performance." The majority of the article is about this topic.

The (cpu) "performance" is therefore a subtopic of the dynamic-vs-static choice. He argues that "performance" was one of the drivers of static type checking but that factor does not matter anymore. (E.g. his example of objc_msgSend taking nanoseconds on both desktop & mobile.) I think the Android interest in AOT (static) instead of JIT is a counterexample to that. Performance still matters on battery powered, low GHz cpu devices like smartphones.

[1]author says: "Yes, I realize that would have meant pushing some errors from compile time to run time, in exactly the same way Objective-C does right now. But I don’t care — I’d take that trade-off every single day of the week."


For me, when working in dynamic languages (that lack static checking of types, null, etc) I'm a bit too nervous to "really fly."

When the computer helps me find errors, misspellings, and bugs—given that I understand the type system well enough to not get very confused—that's when I "fly."


What sort of dynamic language? And what sort of environment?

I know people working in very dynamic/interactive environments who feel exactly the same way when they can no longer "touch the objects".


Well, I'll also say that I really fly in Common Lisp, and I think that language has a nice balance: the compilers do check types and nullability, to some extent at least... and the interactivity, "touching the objects"-ness, and general pleasure of coding weighs up for the lack of a really rigorous Haskell-like type system.

And I'm not really a strong advocate of any particular language. I'm still holding out for a really excellent environment with some kind of best of all worlds approach. Maybe in 30 years...


Same here (2nd para). IMHO, it all sucks pretty badly, though there are tiny patches of light here and there.

What I get somewhat tired of is advocates who have seen one of these tiny patches and then proclaim having seen THE LIGHT, declaring all the rest obvious darkness (and you're obviously not one of them).

I want to be able to adapt my language to my domain without having to write a DSL or solve boring multi-language integration problems, I want to reuse components without being overwhelmed by and stuck in glue code, I want to be able to define any type of architectural connector that's appropriate for my problem and have it be on par with any built-in stuff. I want static checking of any properties I care to check, and I want that to not get in my way by obscuring the problem I am trying to solve. And I want it all live and with amazing performance that is under my control :)


Yeah, as soon as I hit "Another kind of performance" it became pretty obvious to me (IMO at least) that this was a thinly veiled opinion piece on static vs dynamic.

And at the end of the day this is an argument with no clear winner: some people are more productive in Ruby and some are more productive in Swift. IMO most modern statically typed languages aren't statically typed because they're faster; they're statically typed because their designers preferred that kind of environment.


objc_msgSend is amazingly fast considering everything it is trying to accomplish. But it is amazingly slow considering what it actually does 99% of the time in practice: repeatedly call a single function, without variation, whose target could have been determined at compile time but wasn't.
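When the target genuinely never varies, that redundant dispatch can be hoisted out of the hot path by caching the IMP once. A minimal C sketch against the public Objective-C runtime API (the drawAll/DrawFunc names are just placeholders, and it assumes every object is of the same known class); it's the same idea as calling -methodForSelector: once outside a loop:

    #include <stddef.h>
    #include <objc/runtime.h>

    // Look the implementation up once, then call through the function
    // pointer directly, bypassing objc_msgSend on every iteration.
    typedef void (*DrawFunc)(id self, SEL _cmd);

    static void drawAll(id *objects, size_t count, Class cls, SEL drawSel) {
        DrawFunc draw = (DrawFunc)class_getMethodImplementation(cls, drawSel);
        for (size_t i = 0; i < count; i++) {
            draw(objects[i], drawSel);
        }
    }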

I sometimes come across companies making iOS products that don't heavily leverage Apple's SDKs (games) but still use 100% ObjC instead of C or C++ for some reason. Best I can guess is because Apple said that's what you are supposed to do. ObjC is very nice. But, some of these companies have measured upwards of 50% of their CPU time being spent inside objc_msgSend and I question how much benefit they are getting in return.

Meanwhile, I have measured a direct, immediate correlation between performance issues coming and going in my own apps and drops and recoveries in revenue per day in those apps.

My point is: The widespread theme of valuing productivity over performance is very wise. But, I frequently see it being applied in knee-jerk, unwise situations. Getting good performance is difficult. Working in high-productivity languages/styles is fun. I know you'd rather work on stuff that's fun. But, having fun making a product that performs badly and drives users away isn't as much fun in the end as it seems in the beginning.


That "2 nanoseconds for objc_msgSend" number is misleading for several reasons. For one, that's the timing you'll get for a fully correctly predicted, BTB hit, method cache hit, which does not always describe the real world: Obj-C method cache misses are several orders of magnitude slower. But the bigger problem is that the dynamic nature of Objective-C makes optimization harder. Inlining can result in multi-factor performance increases by converting intraprocedural optimizations to interprocedural ones, not to mention all the actual IPO that a modern compiler for a more static language can easily do. Apple clearly cares about this a lot, as shown by all the work they've done on SIL-level optimizations.

Another way to look at it is: I would probably get similar (or better) numbers by comparing the speed of a JavaScript method call with a warm monomorphic inline cache, with the type test correctly predicted by the CPU branch predictor. Does that mean JS is as fast as C++?


Though if your code size is smaller, your icache more likely has what you're looking for; and if not the icache then one of the common caches.


The article seems to conflate performance with speed. An app's impact on battery life is also an increasingly important performance metric.


Usually there is a direct correlation; most of an application's power consumption comes from time on the CPU. Even SSDs use only 2-4W, and memory is even less expensive power-wise, but the CPU can be up to 85W for a fully loaded laptop CPU (well, mine).

Make sure your application doesn't idle rough and that it executes quickly so it spends less time on the CPU; power consumption will follow.
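To make "doesn't idle rough" concrete, here's a minimal pthreads sketch (the single flag standing in for a work queue is just an illustration): a worker that polls on a timer wakes the CPU constantly even when idle, while one that blocks on a condition variable stays off the CPU until there is actually something to do.

    #include <pthread.h>
    #include <stdbool.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  wake = PTHREAD_COND_INITIALIZER;
    static bool have_work = false;

    // Rough idle: while (!have_work) usleep(10 * 1000);  // wakes 100x/sec
    // Clean idle: sleep in the kernel until another thread signals us.
    static void wait_for_work(void) {
        pthread_mutex_lock(&lock);
        while (!have_work)
            pthread_cond_wait(&wake, &lock);
        have_work = false;
        pthread_mutex_unlock(&lock);
    }

    static void submit_work(void) {
        pthread_mutex_lock(&lock);
        have_work = true;
        pthread_cond_signal(&wake);
        pthread_mutex_unlock(&lock);
    }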


It's a little more complicated than that though. For example, SSE/AVX instructions cause Intel CPUs to (briefly) raise power consumption considerably[1]. In some cases it may be better to avoid vector instructions when optimizing for battery life at the cost of speed.

Also being smart about rendering (particularly for web, poorly written animations can run smoothly but kill battery life) can dramatically affect battery life even without affecting execution speed particularly much. They are separate, though closely related, concepts.

1. See e.g. http://www.intel.com/content/dam/www/public/us/en/documents/... p.1 ¶2


> For example, SSE/AVX instructions cause Intel CPUs to (briefly) raise power consumption considerably[1].

I've never seen a case in which getting the work done faster wasn't a win. That doesn't mean it can't happen, of course… But usually vector instructions are a win.


When optimizing for wattage, SIMD is great for getting stuff done quickly so you can get back to spinning on low-power Pause instructions.
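For the curious, the "low-power Pause" spin looks roughly like this on x86 (a sketch assuming C11 atomics; _mm_pause hints the core that it's in a spin-wait so it can back off instead of speculating at full power):

    #include <immintrin.h>   // _mm_pause (x86)
    #include <stdatomic.h>

    // Finish the vectorized work as fast as possible, then wait cheaply
    // for the other threads; the pause hint keeps the spin low-power.
    static void wait_until_done(const atomic_bool *done) {
        while (!atomic_load_explicit(done, memory_order_acquire))
            _mm_pause();
    }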

But, if you SIMD-optimize, cheer at how much faster things are, then tell yourself "That means I can squeeze in 4-8x more work in the same time!" Well... You not only eliminated all the Pause time you gained, but you replaced it with high-wattage work.

It takes discipline to optimize for power. Somewhat karmically, pushing to max out everything can backfire when you overheat the CPU and end up just pushing it into a lower clock rate.


If you raise power consumption by a factor of 4 in the ALU and ROB, you'll still use less energy overall if you finish 4x faster, assuming other parts of the system remain at constant consumption.

Also take into account memory clock speeds and power-on time. Every nanosecond your CPU spends with the memory bus powered up and the memory at top clock is an important nanosecond.


It's not so much raw performance of a given language, but the stuff that "managed" languages and their runtimes do under the hood that's the problem. I've spent a lot of time debugging stuff like ARC (Automatic Reference Counting) creating hidden reference counting calls that were eating up a terrible amount of total CPU time. These hidden performance traps are much worse than having to think about (e.g.) proper memory management and data layout. I'll take manual memory management over any 'magic solution' anytime (they just increase complexity and if disaster strikes it is much more trouble to find and fix the problem).

Details: http://floooh.github.io/2016/01/14/metal-arc.html


Kinda depends on the cost of an error vs. the cost of a slowdown, though. And there's a big difference between, say, Objective-C or the once-relevant COM/ATL etc. from MS, where you have all the C bugs and manual memory management plus somewhat automated memory management in some places (with all the troubles of that), and, say, Java, where at least you have a fully working garbage collector that deals with cycles and never leaks unreachable memory, and you also get guaranteed memory safety.

I'd rather trust my credit card number to a Java program than a C/C++ program - or an Objective-C program. See also http://www.flownet.com/ron/xooglers.pdf for the early Google billing server disaster: not possible with a garbage collector doing the "thinking" about proper memory management, and rather hard to prevent even with very good developers who believe they have thought proper memory management through. (And git had a bug where you could execute code remotely, because nobody can really manage memory in C, etc. etc. etc.)


I don't buy it. Part of what makes a great app is that it doesn't have bugs. So don't try to tell me that the special tool you need to really make great apps is one that lets you write more bugs.


lols. as if replacing objective-c with clean c or c++ wasn't a valid performance strategy (p.s. it is. i've made real world applications hit framerate that way, because a ton of objc_msgSend is more expensive than nothing every single time, and it's only very exceptionally necessary (are you crossing a very hard boundary, like a network?)).

i agree with his basic point though. almost all ways of building an app require needless complexity that is easily solved at the tools level.


I wish people would stop talking about "fast" and instead talk about "efficient".


I think it's actually the opposite. The important thing is a short response time to the user - whether that's achieved by reducing the number of things you do, or by doing them more efficiently, is kind of moot.


Languages don't really dictate response times to the user.


> whether that's achieved by reducing the number of things you do, or by doing them more efficiently

Is there any way to do stuff more efficiently other than doing fewer things (i.e. doing it in fewer steps)? Optimizing is always about trimming out the fat.


Depends on what you're defining as "things". In terms of base instructions? No, all you can do is do less things. (Edit: Although even here, you can do different things to make the overall process cheaper - multiply by the inverse instead of dividing, replace a sin() calculation with a lookup, etc.) In terms of intermediate goals? If you have to process 500 objects, you can either find a way to skip some of the objects ('reduce the number of things you do') or you can make processing an object as fast as possible ('doing them more efficiently').
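A minimal C sketch of those two substitutions (assuming float data and that lookup-table accuracy is acceptable): the work per "thing" gets cheaper even though the number of things stays the same.

    #include <math.h>     // sinf, floorf; M_PI is from math.h on POSIX systems
    #include <stddef.h>

    // Divide every element by the same divisor: one divide, n multiplies.
    static void scale_down(float *v, size_t n, float divisor) {
        float inv = 1.0f / divisor;
        for (size_t i = 0; i < n; i++)
            v[i] *= inv;
    }

    // Replace sin() with a coarse lookup table, trading accuracy for speed.
    #define SIN_STEPS 1024
    static float sin_table[SIN_STEPS];

    static void init_sin_table(void) {
        for (int i = 0; i < SIN_STEPS; i++)
            sin_table[i] = sinf(i * 2.0f * (float)M_PI / SIN_STEPS);
    }

    static float fast_sin(float radians) {
        float turns = radians / (2.0f * (float)M_PI);
        turns -= floorf(turns);                  // wrap into [0, 1)
        return sin_table[(int)(turns * SIN_STEPS) % SIN_STEPS];
    }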


Battery life.


Definitely a big priority but still a secondary one. Your app won't affect battery life if they've uninstalled because it's laggy.


Most of the other comments point out the idiosyncrasies of the article, but I think most of the commentators are missing the point behind the author's basic frustration. While Chris Lattner is a brilliant systems programmer, the work he and his team did with Swift shows their focus on systems-style programming. As the author points out, Objective-C was actually a _brilliant_ language for GUIs and application design. It was hampered by C syntax cruft (easily remedied), but the proof is that Objective-C and the Cocoa framework have had one of the best (IMHO) success stories of any GUI toolkit and desktop framework.

Consider that Obj-C survived the transition from desktop to mobile apps, and many features, such as modern email viewers with embedded image viewing, were accomplished on NeXT long before Windows had comparable features. I think Steve Jobs realized this, which was his particular genius -- choosing the right tech and focusing it with a "marketing" angle [8]. I think the newer shift toward Swift at Apple shows a focus on "systems" thinking, but unfortunately I've also noticed a certain decline in application quality -- e.g. the Safari search bar crashing on OS X and iOS [2].

If I had to speculate, I'd say the success of many of Apple's UIs was due to the Smalltalk aspects of Objective-C, which focused on message passing instead of eager function binding and C++-style inheritance. Having written my own low-level object system in C and LLVM for an XML processing language, my take is that, while the intent (dynamic dispatch) is similar, the shift in emphasis between message passing and virtual tables forces a slightly different style of library and framework composition. Essentially, message passing is more expressive but sacrifices potential speed; as Wikipedia describes it:

> Because a type can have a chain of base types, this look-up can be expensive. A naive implementation of Smalltalk's mechanism would seem to have a significantly higher overhead than that of C++ and this overhead would be incurred for each and every message that an object receives. Real Smalltalk implementations often use a technique known as inline caching[3] that makes method dispatch very fast. Inline caching basically stores the previous destination method address and object class of the call site (or multiple pairs for multi-way caching). The cached method is initialized with the most common target method (or just the cache miss handler), based on the method selector. -- https://en.wikipedia.org/wiki/Dynamic_dispatch
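As a rough illustration (not how production runtimes implement it; they patch the cache into generated code), a monomorphic inline cache at a single call site is just a remembered (class, implementation) pair, sketched here in C against the Objective-C runtime:

    #include <objc/runtime.h>

    static Class cached_cls;   // class seen last time at this call site
    static IMP   cached_imp;   // implementation found for that class

    typedef id (*Msg0)(id, SEL);

    static id cached_send(id receiver, SEL sel) {
        Class cls = object_getClass(receiver);
        if (cls != cached_cls) {                    // cache miss: full lookup
            cached_imp = class_getMethodImplementation(cls, sel);
            cached_cls = cls;
        }
        return ((Msg0)cached_imp)(receiver, sel);   // fast path: cache hit
    }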

However, practically speaking, the dynamic lookup that inline caching makes affordable allows the language implementation to be more expressive and forgiving -- a failed method lookup can be forwarded to an "unknown handler" method as in Ruby's `method_missing`, unlike in C++ [3], or the situation in Swift (?) [4].

Basically, while Haskell et al. have made tremendous strides in type theory and in the practical implementation of functional-style programming (monads and the like), we're still not sophisticated enough to provide easily expressive, statically typed code that gives the full expressiveness of dynamic languages. Humans are _really_ good at reasoning about systems of communicating components (message passing, overlapping the actor models of computation) but really not as good at category theory, which, while mathematically equivalent, doesn't necessarily make it as good for programming with the large amounts of untyped information present in most modern GUIs.

Things like React, however, are a huge stride forward in some respects. Though I'd rather program that in JavaScript (another prototype-based language borrowing from the message-passing tradition) than in Swift/C++-style languages, which essentially force type checking at a lower level while the more complex, higher abstractions remain un-type-checked. So you do all the work of the type system without getting its benefit for the higher abstractions (e.g. "user clicks button, which sets the system into state X").

I'm branching out quite a bit, but it's actually quite a complex problem and is still very much an "open problem", although one we're getting closer to solving well. For clearer examples of what I'm talking about, see some of the papers I found with a quick search: [5], [6]. My take is that the author is talking about a language more oriented toward system architecting; an example of that approach is Ceylon [7].

[1]: http://arstechnica.com/apple/2005/12/2221/
[2]: http://www.bbc.com/news/technology-35420573
[3]: https://rosettacode.org/wiki/Respond_to_an_unknown_method_ca...
[4]: https://github.com/rodionovd/SWRoute/wiki/Function-hooking-i...
[5]: https://pdfs.semanticscholar.org/8a42/9baf82c17a936801de4e37...
[6]: http://lambda-the-ultimate.org/node/5329
[7]: http://ceylon-lang.org
[8]: http://www.infoq.com/news/2011/10/steve-jobs-contribution


Objective-C is not a brilliant language and it's finally being discarded by Apple and replaced with something better. This is hard to accept for the (surprisingly) many fans that it has, but Objective-C is almost legacy - jobs, docs, APIs will be Swift-first. Talking about performance and dynamic typing and message passing won't change that.

Apple looked at the competition and saw that Java 6(!!) was kicking Objective-C's ass - it was easier to program in, less error-prone, easier to understand and fast enough.


> If I had to speculate, I'd say the success of many of Apple's UI were due to the SmallTalk aspects of Objective-C

Well, Apple got ObjC and the first version of OSX from NeXTStep and NeXT technology, and that was definitely always NeXT's claim, that they created the right language and the right OS and the right hardware for good UI, from the ground up.


I think that in most programs performance depends on the architecture. Well-written Python + Twisted or Java + Netty network code can easily outperform a naive C implementation. The same goes for the algorithms used. Number crunching is the exception, not the rule.


Forget scripting languages -- Where's our Photoshop for (web)apps?



