Statistics for Software (paypal-engineering.com)
199 points by mhashemi on April 12, 2016 | 14 comments



A student of mine just finished building a benchmarking tool for applications [0]. Among other things, it warns if your sample size is too small. Here is an example in which he compares GHC performance over the past few years [1].

[0] https://github.com/parttimenerd/temci [1] https://uqudy.serpens.uberspace.de/blog/2016/02/08/ghc-perfo...
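
For the curious, here is a minimal sketch of one way such a sample-size warning can work: flag a benchmark when the ~95% confidence interval on the mean run time is wide relative to the mean. This is only an illustration of the idea; the function name and threshold are made up, and temci's actual check may well differ.

    # Illustrative only -- not temci's actual implementation.
    # Warn when the ~95% CI on the mean run time is wider than
    # rel_width * mean, i.e. the sample is too noisy or too small.
    import math
    import statistics

    def sample_size_warning(times, rel_width=0.05, z=1.96):
        n = len(times)
        if n < 2:
            return "need at least 2 measurements"
        mean = statistics.mean(times)
        sem = statistics.stdev(times) / math.sqrt(n)  # std. error of the mean
        half_width = z * sem                          # ~95% CI half-width
        if half_width > rel_width * mean:
            return ("only %d runs: CI is +/-%.4f around mean %.4f, "
                    "collect more runs" % (n, half_width, mean))
        return None

    print(sample_size_warning([1.02, 0.98, 1.10, 0.95]))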


"When the software industry gets to a point where it leverages this analysis as much as the hardware industry, the technology world will undoubtedly have become a cleaner place."

Hear, hear! Quality control is usually one of the _primary_ drivers in new hardware development. With software, I find it's often tacked on at the end.


I am not sure that is coming anytime soon. Generally, you can update software products far more cheaply than hardware products. Also, people get quite a bit of utility out of buggy products; they will jump on a beta if they think it is useful.

Given that, there is an incentive to release half-finished, buggy software with less quality control. This will of course vary by use case; no one is getting on a plane running software with low quality control.

I would guess that we can look at the attention paid to quality control as a function of updatability and risk.


Exactly this. Reminds me of this quote:

"If you are not embarrassed by the first version of your product, you’ve launched too late."

- Reid Hoffman


Yes, the ability to quickly update software covers a multitude of sins.


Slightly off-topic, but it's surprising to me that this blog is kept up to date with their latest brand, yet large portions of Paypal.com still haven't been updated to match.


Well, this blog post will be seen by a relatively small number of people, and it's just that, a blog. Their main site is used by a vastly larger number of people, including our grandmas and possibly people with different accessibility requirements, making it trickier to update. This isn't an excuse (I consider a half-updated site worse than a slightly bad full update), but it does make it less surprising.


OP writes really well. Found myself reading the README of his Python web framework (which I'll never use) just because of the clarity, style and pedagogical approach.

Hope there's more on the way.


Whoa! With praise like this, how couldn't there be more? Thank you!


Oh God. I need to read this. Great post.


I started by ditching spreadsheets and forcing myself to use R. I also signed up for datacamp.com.

I never liked spreadsheets or people that like spreadsheets.


I don't really like the idea of throwing data away, because it ultimately gives an incomplete view of the system. But it's an easy solution to a hard problem!


Okay. Not really sure why I got downvoted so much. Why am I wrong?


Sampling is fundamental to so much of practical statistics. It's more or less proven and accepted. In real studies, we "throw data away" by just not collecting it in the first place. As long as you do it right, you still get a reliable answer.

But if you've already got it all and it all fits in memory, by all means, hold on to it!
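
To make that concrete: reservoir sampling is a textbook way to keep a fixed-size uniform random sample of a stream without ever storing the whole thing. A minimal sketch (this is standard Algorithm R, not anything specific from the article):

    # Reservoir sampling (Algorithm R): a uniform random sample of
    # k items from a stream of unknown length, using O(k) memory.
    import random

    def reservoir_sample(stream, k):
        reservoir = []
        for i, item in enumerate(stream):
            if i < k:
                reservoir.append(item)
            else:
                # Keep item i with probability k / (i + 1); this leaves
                # every item seen so far equally likely to be sampled.
                j = random.randint(0, i)
                if j < k:
                    reservoir[j] = item
        return reservoir

    print(reservoir_sample(range(1000000), 5))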



