Hacker News
Intern Impact: Brotli compression for Play Store app downloads (googleblog.com)
313 points by abhikandoi2000 on Feb 6, 2017 | 172 comments




So wait, if I understand this article correctly, she applied the compression because someone told her to? Or did she research it herself and apply the whole thing? I agree this is a bit like a "we're hiring interns" post.


I would be excited if I had taken an internship and someone had told me to do this :)


yeah sure nothing wrong with that - it's just that the article is a tad clickbait


The amount of negativity in the comments section here is astounding. How could you not be excited and happy for this promising young woman's achievement? No, her work will not put her on the shortlist for a Turing Award. But it is something any engineer should be proud of, and has real impact for millions of users.

You have a right to be unimpressed, but if you're taking the time to say "So what?" or "This is just a recruiting ad" then you should probably rethink. I never thought I'd say this, but the negativity here really indicates the kind of latent discrimination that so many URMs & women in tech complain about. I have literally no other explanation for it -- a senior engineer at Google could have implemented this compression and it would still be HN worthy, and nobody would be calling the blog article a fluffy PR piece.


> a senior engineer at Google could have implemented this compression and it would still be HN worthy,

No, I doubt it would be. How many of the hundreds of little features in the google play store have been posted on HN with an article about the person who implemented them?

Also, I find it more than a little presumptuous of you to assume that any scoffing is due to sexism. I see the exact same cynicism and lack of awe in the posts below that I have come to expect from HN - regardless of gender or color of the person involved.


1.5 petabytes of savings in data usage per day is not HN worthy? I've seen far less significant improvements get voted to the front page with hundreds of comments. This is more than a small tweak to the Play store.

Agreed on the high level of cynicism here -- we've also come to expect that. Moreover I never come to HN (my account is five years old) to point out sexism/racism -- it's just too sensitive and difficult of a subject and to be honest I'd rather just read/talk about topical things without getting political. But again, what I pointed out is stark and I have no other explanation for it (believe me, I want one).

Let me point out the title of the blog: "Google Student Blog: Google news and updates especially for students". Of course there is PR going on, and of course the achievements of an intern will often be on a smaller scale. But this particular achievement is high-impact and the intern deserves credit on that blog for her work. If you're not impressed, then just move on.


>> But this particular achievement is high-impact and the intern deserves credit on that blog for her work

I think the issue is that the project here was a library swap, which isn't hard to do. It just so happens to be at Google, where small optimizations reap huge rewards. Google wants to market it as "hey, look at the huge impact our interns have" and make it sound like a big deal, when in reality it's not that much of a crazy technical achievement.

>> 1.5 petabytes of savings in data usage per day is not HN worthy? I've seen far less significant improvements get voted to the front page with hundreds of comments. This is more than a small tweak to the Play store.

I know a lot of interns at smaller companies who did very impressive work, but because the companies are smaller the numbers aren't as crazy (both women and men, I should add). The effort by them might very much be HN worthy, but you won't see it here because they don't have a huge PR machine behind them.

Kudos to Anamaria for the great work though! Will probably save Google some cash.


> I have no other explanation for it (believe me, I want one)

Um, the fact that her work saved that much bandwidth is a happy accident of the fact that she was at Google and assigned to a project that had a high user volume. Other than passing the intern interview loop, that took absolutely no merit or effort on her part whatsoever.

Moreover, the blog post describes the work as "to add support for Brotli for both new app installs and app update." I mean, hundreds of thousands of developers use third-party libraries to add functionality every single day of the year. Some "achievement".

Please take off the X-ism colored glasses, dajohnson89. I promise that it makes the world look like a better place.


It's of course true that at Google you have a chance to make a bigger impact. (Not guaranteed - it depends on what project the intern is given, and that's kind of random.) And interns do get lots of support.

But, to say that there is "no merit or effort on her part" is an insult to all the good work interns do while they're here. They're not coasting.

Seems like you're so eager to tear this down that you'll say anything.


Given the fact that your first paragraph says the same thing my first paragraph does, it's unclear to me what point you're trying to make.

The intern in question had no part in making Google the size it is and was most likely not offered much choice in the way of team or project assigned, so no merit or effort was involved in either of the two. Which part would you like to dispute?


I dispute the part where you say she shouldn't get any credit for her work because she did the work at Google.

If you're going to make that argument, nobody at Google deserves any credit for anything we do. That's not how we normally measure impact. We all stand on the shoulders of giants, but putting those resources to work effectively still counts.


The throughput isn't really that important. 10 GB of savings per day where the intern actually implemented the algorithm would be more newsworthy.


I agree with you.

To be fair, 10% is the better number attributed to this change.

That ten percent amounts to 1.5 petabytes is a credit to the Google Play Store / Android platform.


> 1.5 petabytes of savings in data usage per day is not HN worthy?

Well it is, but the merit should come first to the engineers who implemented the Brotli algorithm. What Anamaria did was just to include the code for that algorithm in the Play Store. Sure, it's some good work, especially for an intern, but I think it's a big exaggeration to say that it is that work that saved 1.5 petabytes/day.


Deploying this at global scale was more than just changing some config and walking away. I'm sure things went wrong, changes had to be rolled back, lessons were learned. "just include the code", seriously, do you even deploy apps at scale?


You are perfectly right, and no, I haven't deployed anything at the scale of Google. On the other hand, deployments at that scale are routinely done at Google, Facebook and Amazon amongst other tech giants, so an intern doing it shouldn't warrant an article.


I think you're asking the wrong hypothetical: if this were another Google intern who was male, would the article even be discussed here? I think if people weren't impressed they'd ignore it (as they do so many other blog posts), not criticize it. The decision to debunk versus ignore is not applied evenly.


I agree, even if all she did was comment out one line of config and enable another line, doing it at scale, in production, is not to be done lightly. I suspect as a student intern she learned a lot about doing that and contributed a fair amount to making sure it was done carefully and correctly. I'm jealous actually and I change config lines like that every day.


In fact, this was not posted to the main Google blog, but rather a special "student" blog that's obviously designed for recruiting interns. Had I seen this post when I was going to school, I certainly would have seriously considered a Google internship.

So bravo, Anamaria, for completing important work for Google, and bravo, Google, for highlighting her work in an appropriate place. Not every good work needs to be earth shattering.


Even if this is a recruiting post... let's stick to the facts: her work has nevertheless saved 1.5 PB per day. To all of the people who are saying that anybody could implement this: I bet your excellent hacker skills cannot save 1.5 PB of traffic per day. And it's nice to see that an intern's work can have such great results. Kudos to Google for having great programs for young people. I'm also forever thankful to them that I had the opportunity to work as a Google Summer of Code student and mentor, which was a true life changer for me!


Most people don't work at companies that have 1.5 PB of traffic per day. The question to ask is if another intern in the same position would be able to accomplish what she did of integrating a library into an application.


Well, HN is where rather technical discussions take place.

And from a technological viewpoint, some people here may find that swapping gzip/etc for brotli is not that astonishing. After all, she didn't invent or implement Brotli, but merely applied it.

Note, this is not to denigrate her work in any way - the results, scaled at Google's level, are very impressive.

It is sad to see this turn into a gender-focused discussion, as I'm sure we'd see all the same "meh" comments here regardless of intern's gender or any other physical traits.

Also, I've noticed this trend where the top comment is attacking over-exaggerated "astounding negativity". There's skepticism here under nearly every article, and the odd couple of dead, downvoted comments; not "astounding negativity" by any means. While it's challenging to resist the desire to virtue signal and collect karma points, please don't exaggerate.


> the negativity here really indicates the kind of latent discrimination that so many URMs & women in tech complain about

was with you until this. hn's negativity is across the board, it's very unlikely there's discrimination going on here imo


I agree with you.

It is things like this that make a great engineer: not only spending decades coding an earth-shattering algorithm that will take the industry by storm, but also knowing how to save 1.5PB per day with a rather simple (I guess it's not as simple) decision.

HN is supposedly full of business-minded technologists. How we are missing the great impact this had on the business is mind-boggling.

I can't find an explanation for this reaction either.


"has real impact for millions of users"

Come on. That is so not hip these days. She saved a bunch of PB ? who cares. If only she built a nice shiny js app, now that is something to talk about

/s


If my impression was anything to go by, I clicked on the original title of "Intern saves Google 15M GB every day" (something like that) expecting to hear what on earth could have saved that much bandwidth. Instead, I got an advertisement for how great this woman is. (But, now that you want to hire her, too bad, she's already taken.) There was just one sentence that talked about what I went there for: namely, she changed the compression algorithm. If the title were "Look how awesome this Google intern is" I wouldn't have had a problem with it. On the other hand, I wouldn't have clicked it, either.


This is surely a great resume item for her, and it's a direct benefit to the place she interned, which is pretty awesome. But I think people are quite well attuned to Google's marketing/recruiting efforts these days, and how they compare to the average example. (It's like when such-and-such for-profit school brags about how its graduates work for companies A, B, and C, but more than likely, that's a very small percentage, and the results for the average graduate are lower.)

So yeah, it's awesome she did this, but most interns should not expect this out of their Google internship.


I wouldn't call this discrimination - I don't think HN has ever treated early-career projects with a light touch. There's a curmudgeonly atmosphere in nearly all of these threads!


On the contrary, I have seen projects that change desktop wallpapers get LOTS of praise


URMs?


Underrepresented minorities


This is typical HN; a bunch of people who think highly of themselves for not being sheep working for the big companies, and for whom everything that these companies do is easy peasy stuff that they could do in their sleep.

If I had done something to save anybody 1.5 Petabytes of bandwidth per day, I would be very content for at least a few months. Congratulations to the intern for having such a lasting impact.


> This is typical HN; a bunch of people who think highly of themselves for not being sheep working for the big companies, and for whom everything that these companies do is easy peasy stuff that they could do in their sleep.

Please don't mar your fine comment with an uncharitable dismissal of your own, and please don't make up generalizations about HN to score rhetorical points. The problem of dismissiveness and snark on the internet is a systemic one that is far from limited to HN. Many more people here are charitable and good-natured than not.


[flagged]


> Meritocracy being part of the dev culture,

That may be true on surface, and it's a good cliche, but ask any woman developer and you'll hear stories you'd never hear from males.


Honest question: is it because they don't happen, or they don't get told? What are some examples?


Your response was still in my mind, so today's FP story should shed some light: https://news.ycombinator.com/item?id=13682022

Also, do check the comments in that thread.


I don't disagree, I'm just giving an explanation on what can motivate people to react this way.


"The fact is there are women doing great work and talks. Make noise about them."

And here is a woman who did great work, and Google is making noise about her. Why is this particular case not valid, when it's doing exactly what you recommend?


Work that would be shown regardless of gender. This would not have been shown on HN if the dev were a white CIS male because it is basically regular work. Good work, but nothing to show off about. We all do that every day. It's not a complaint, I'm just explaining: you don't want to hear about the regular stuff, for that you have the coffee machine.


There is almost certainly a degree of 'diversity PR' about this post - specifically because this is an intern simply 'doing her job' by implementing an alg she was probably asked to implement. And I'm certain interns are doing all sorts of cool stuff at G that we don't talk about.

But ...

That said, it's probably ok. I do believe that tech is more or less 'meritocratic' ... but surveys on 'why there are not more women in tech' answered by 'men' indicate 'not enough women getting tech degrees' - while answers by 'women' indicate 'not enough role models'. Which is understandable. If women are 'turned off' or less assertively, merely 'not excited' or 'don't think it's for them' because they don't see enough faces doing something, and it affects their choices ... well then it's fair to be a little bit lopsided on what Google chooses to promote, especially as it relates to entry-level stuff.

So as long as it isn't flogged too hard, I think that highlighting someone's work is reasonable, and if it helps some 'hey, you can do this too!' communication, that's fine.


I think there is some jealousy too.

When being a geek in the 90s basically meant being the least popular person around, nobody wanted to be part of the crew. Slowly but surely, with unrecognized hard work, geeks built something, and nobody helped them until a shitload of money and cool stuff were made.

They were a minority nobody helped and that people basically disrespected through media, stereotypes, etc. And you never heard any girl saying she wanted to be part of it.

Now being a geek is cool and well paid. And it becomes a trend to help minorities to be part of it.

I followed some of those efforts, such as pyladies, django-girls... I spent some time in Mali myself helping some people there become Python devs through NGO programs. It works. The community is better because of it.

But the feeling of it being unfair can be felt on a lot of forums and comments. This is not sexism.

It's more along the line of "so now you want it ?" and "oh but you want it easy too ?".


+1 I am proud and jealous of her. She did something amazing during her internship that I haven't done in my 10 year professional career. Hell she very well might win a Turing. I would happily have her as my mentor.


+1 This article is a great example of women in tech doing real-world, impactful, "proper" engineering work that directly benefits millions of users.

We need more stories like this.

Please - if you criticised this article about how an engineer implemented a newly-published compression algorithm that saved 1.5 petabytes a day, please go take 2 minutes to think genuinely about why you criticised this article. Your 120 seconds of introspection will benefit our entire industry regardless of your motivations and conclusions.


You mean because it's vacuous and condescending? By pretending this is a noteworthy accomplishment, this article de-legitimizes the actual accomplishments of all women in tech. To put this forward as something worth writing about is to essentially say "we don't have any real positive stories of women in technical positions at our company, so here's a library swap that we can use Google's massive traffic numbers to make seem cool".

I'm sure this woman is a perfectly competent engineer and will go on to do things that are actually super cool in her life. And she likely already has, if she's got an internship at Google. This just isn't one of them, we all know it, and so do the authors of this article.


> her work resulted in saving users an expected 1.5 petabytes (that's 1.5 million gigabytes) of data each day.

I'm guessing this is not a measure of data at rest, but data transferred over the network. The couple samples listed on the page ranged from 2.5% improvement to 20.3% (vs. zLib) so I guess they're extrapolating that out to all app downloads and updates across the world. Nicely done.

More generally, we've seen some great advances in compression lately. I've been using Facebook's zStandard [1] for compression in a product I'm currently working on, and I've been extremely pleased with both its speed and compression ratio. The days of "just use zLib" are coming to a close.

[1]: https://github.com/facebook/zstd


Are you worried at all about their patent stance? Currently I think it says that if you litigate with Facebook you lose the license. Otherwise I agree zstd is looking like a very nice improvement in an area where most people think nothing happens. I especially like the dictionary compression bit.


That's a good point, the downside of using new algorithms is playing by the rules of their patent licenses. In this situation I'm not concerned about litigation with Facebook, but I could foresee other products where that might be an issue.

Aside: anywhere you store compressed data at rest, you need to store (in a header somewhere) the algorithm used to compress that data. If you need to change algorithms down the road--e.g. you do get into a lawsuit with Facebook--you'll need that header to know how to decompress old data vs. new.
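A minimal sketch of that header idea, assuming a one-byte tag in front of each blob. The tag values, the `CODECS` table and the `pack`/`unpack` names are made up for illustration, and zlib and lzma from the Python standard library stand in for whichever algorithms a real product would use:

```python
import lzma
import zlib

# Hypothetical one-byte codec tags; the numbering is invented for this sketch.
CODECS = {
    1: (zlib.compress, zlib.decompress),
    2: (lzma.compress, lzma.decompress),
}

def pack(data: bytes, codec_id: int) -> bytes:
    """Compress data and prepend a one-byte tag naming the codec used."""
    compress, _ = CODECS[codec_id]
    return bytes([codec_id]) + compress(data)

def unpack(blob: bytes) -> bytes:
    """Read the tag, then decompress with whichever codec wrote the blob."""
    _, decompress = CODECS[blob[0]]
    return decompress(blob[1:])

payload = b"hello " * 100
old_blob = pack(payload, 1)  # written before a hypothetical codec switch
new_blob = pack(payload, 2)  # written after it
```

Both `unpack(old_blob)` and `unpack(new_blob)` give back the same payload, so old and new data can sit side by side while a migration is in progress.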


Looks like they're using the BSD license in this project, https://github.com/facebook/zstd/blob/dev/LICENSE

So no need to worry about the patent clause in this case right?


AIUI BSD only covers copyright, unlike Apache and GPL3 for example.


Check out their patents file


Pretty cool that an intern was given this level of confidence. Less data for updating/installing applications is good no matter how you slice it


I've worked with a fair number of people who graduated from Mathematics and Informatics at Babeș-Bolyai University. I'm generally very impressed by them, and this is just another data point on areas of the world that get overlooked.


Can we get her to work for DropBox? Every time my iPad GoodReader syncs my 1,000+ papers, it has to check every file. It boggles the mind that they don't support some version of change records.


I don't know what the config of your computer is, but Dropbox works like a charm for me. I have more than 600 gigs of data synced between Dropbox and my computer and it works pretty nicely. I never had to manually check whether a file has been transferred or not.


Maybe the solution should start with GoodReader? I don't have syncing issues with the core app.


I bet switching to LZMA would have saved even more. LZMA beats Brotli nearly every time. zStandard would likely have worked better as well. Brotli is very slow to compress.


That doesn't appear to be true:

https://cran.r-project.org/web/packages/brotli/vignettes/bro...

I'm sure you can use those results to argue that LZMA is superior in some way (e.g. compression speed) but it definitely isn't clear cut superior in other important ways (compressed size and decompression speed are inferior).

I can see why, given those results, they would use Brotli over LZMA.


The independent tests I did here: https://github.com/google/brotli/issues/165

And also those here: https://www.percona.com/blog/2016/03/09/evaluating-database-...

Suggest that LZMA compresses better than Brotli except in the case of text documents.


LZMA is much slower at decompression compared to Brotli.

https://www.opencpu.org/posts/brotli-benchmarks/


Apps are usually on the order of 40-50 MB. From a back-of-the-envelope calculation, a 50 MB file when compressed with Brotli is 14.5 MB and with LZMA is 17.5 MB. The difference of 3 MB translates to an additional 8 seconds of data transfer on a 3.1 Mbps 3G connection (an average one).

This [1] states that LZMA has a decompression speed of 70 MB/s, so decompressing takes about 0.7 seconds. Brotli's 334 MB/s does the same in about 0.15 seconds. So the additional overhead of LZMA (compared to Brotli) in decompression is just 0.55 seconds.

Given these orders of magnitude, I think optimizing for compression ratio is a much better option than decompression speed.

[1] https://cran.r-project.org/web/packages/brotli/vignettes/bro...
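For anyone who wants to redo that arithmetic with their own numbers, here is a small sketch. The sizes, link speed and decompression speeds are the figures quoted in the comment above; the function and variable names are made up, and it assumes the decompression speeds are measured against the 50 MB of uncompressed output:

```python
def transfer_seconds(size_mb: float, link_mbps: float) -> float:
    """Seconds to move size_mb megabytes over a link of link_mbps megabits/s."""
    return size_mb * 8 / link_mbps

def decompress_seconds(output_mb: float, speed_mb_per_s: float) -> float:
    """Seconds to produce output_mb megabytes at the given decompression speed."""
    return output_mb / speed_mb_per_s

APP_MB = 50.0                            # uncompressed app size
LINK_MBPS = 3.1                          # 3G link speed in megabits per second
brotli_mb, lzma_mb = 14.5, 17.5          # compressed sizes quoted above
brotli_speed, lzma_speed = 334.0, 70.0   # decompression speeds in MB/s

# Extra time spent downloading the larger file: about 7.7 s.
extra_transfer = transfer_seconds(lzma_mb - brotli_mb, LINK_MBPS)

# Extra time spent decompressing with the slower codec: about 0.56 s.
extra_decompress = (decompress_seconds(APP_MB, lzma_speed)
                    - decompress_seconds(APP_MB, brotli_speed))
```

On these figures the transfer-time gap is more than ten times the decompression-time gap, which is the order-of-magnitude point being made.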


In my tests and others LZMA seems to beat Brotli by quite a bit on non-text cases (DB, 3D meshes): https://news.ycombinator.com/item?id=13589048

I am concerned that the Brotli v LZMA v GZip paper you cite is not fully representative as it is written by the Brotli team.


I just installed brotli on my ubuntu machine and compared it compressing an executable.

Brotli: 4.77MB -> 1.21MB (8.5s)

xz: 4.77MB -> 1.129MB (1.4s)

LZMA does better than Brotli in the case of binaries by a fair bit every time.
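A rough sketch of how a size-and-time comparison like this could be scripted, for anyone who wants to repeat it on their own binaries. It only uses zlib and lzma from the Python standard library, since Brotli is not in the standard library; a third-party Brotli binding could be added as another row in the same way. The `measure` and `compare` names are made up, and this is not the command-line test run above:

```python
import lzma
import time
import zlib

def measure(name, compress, data):
    """Return (name, compressed size in bytes, seconds taken)."""
    start = time.perf_counter()
    out = compress(data)
    return name, len(out), time.perf_counter() - start

def compare(data: bytes):
    """Compress the same bytes with each codec at its highest setting."""
    return [
        measure("zlib -9", lambda d: zlib.compress(d, 9), data),
        measure("xz/lzma -9", lambda d: lzma.compress(d, preset=9), data),
    ]

# Usage (the path is a placeholder, not a real file):
# with open("/path/to/some/binary", "rb") as f:
#     for name, size, secs in compare(f.read()):
#         print(f"{name}: {size} bytes in {secs:.2f}s")
```

Results will vary a lot with the input, which is roughly what the disagreement in this subthread is about.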


More on the (speed, compression) trade space here: http://www.gstatic.com/b/brotlidocs/brotli-2015-09-22.pdf


This is not correct for the case of binaries.


I agree that LZMA beats Brotli in compression in the majority of cases (not by much), zStandard does not however.

The thing that makes Brotli attractive though, is that it has high compression (again, very close to LZMA, sometimes even better) while decompressing MUCH faster than LZMA.

The big downside is that it is very slow in compressing, which makes it mainly suitable for 'compress once, decompress MANY times' type data.


There isn't much information but this reads more like an advertisement for google internships than anything else. Not to denigrate her work, she could very well be brilliant and have gone above and beyond, but from how it reads they could be blowing it up to make it seem like every intern has a huge impact and you could too! Either way good for her, but not sure why this is so high up on HN.


It's almost like Google might be recruiting.


I suspect this is exactly correct. First, it plays to a lot of colleges who are having career fairs this month as their seniors consider what to do after graduation. Second it plays to what has been the biggest job selector for new graduates over the last 5 years, "Will I be able to make an impact?". The number one way to discourage people from going to BigCo is to point out that they are going to spend 3 - 5 years just getting enough of a political network developed in order to be given a chance at an assignment that might move the needle slightly.

Last year I was walking around a recruiting event at CU and just listening to the pitches being made by companies like Facebook, Microsoft, Google, and LinkedIn (and, in full disclosure, I was making the same sort of pitches for IBM with "Hey, it's the Watson group, it's the most important company project and we're building one of the most important projects in it!"), all to counter the fact that there were hundreds of thousands of employees, most of whom were not having much impact at all.

You need stories where you can pitch them "if you're brilliant enough and work hard enough you can be the person who changes the world"; otherwise the siren song of startups of "You will probably eat lunch with the CEO at least once a month, and everything you do will be really impactful" will carry the day.


Actually it is odd timing considering intern recruiting is slowing way down for most companies around now.


there's always next year


Yup. Voting on HN is very suspect in general and easy to manipulate because the votes are hidden.


If you've got concerns about voting rings or brigades or other irregularities, your best option is to contact the mods via the Contact link in the footer.


and of course, you're getting downvoted. almost as if you're right.


At the scale at which Google operates even small optimizations can save tons of data. Not saying this was a small contribution, but I agree, mostly internship advertising.

A lot of my friends who have interned at Google felt insignificant, Google needs this counter-marketing.


I think that is how so many school projects and internships end up. You just feel like fuel for some grant machine, your contributions are overblown and you might not have been able to contribute as much as you wanted or maybe you didn't get nearly as much mentorship as you were promised and expecting, but everyone plays along and so it continues. I would like to think that if you go high enough eventually you end up in places where the experience matches the expectation better but maybe not.


> A lot of my friends who have interned at Google felt insignificant

The world has changed a lot.

When I was in college, I sought an internship specifically to have real-world experience on my resume to get a leg up when it came time to find a job.


Why do you say the world has changed? That's still the primary purpose of internships.


Per the parent's comment, Google needs to market their internships because there is little value to them, because as interns his friends felt insignificant.

I would say in my day, a Google internship would have been amazing experience, even if I wasn't special and unique. However, Google wasn't really the massive behemoth in 2001 (I first started using Google at my internship when the head of IT told me about it).


It is strange. They were trying to decrease update sizes by using File-by-File patching, for example, but never decided to use the best applicable compression library until an intern came along?


According to the article Brotli compression was only launched in 2015 and she did the integration work in the summer of 2016. Bearing in mind we don't know how much of an improvement to Brotli was made during the intervening period and what the evaluation and adoption criteria were that actually seems pretty prompt.


This, and I'd see it as probably something that was on the Play team's roadmap that they decided was a project that would be a good fit for a summer intern, and they may have deferred the work by a few months on that basis.

Speaking from my own experience trying to put together good intern projects, it's actually pretty tough to find that magic combination of something that will occupy an entry-level programmer for a summer, and give them a sense of accomplishment, and actually be useful to the team in the long term. In the past I've deferred useful work so that it could be given to an intern.


As someone who has managed interns, this is exactly my take. It's easy to have wishlist items that you never get to. Things like this are ideal in that they are not time-sensitive, don't require a lot of additional knowledge to do proof-of-concept work on, are accessible to undergrads, and you can be pretty confident in the implementation if it passes unit tests.


She gets much of the recognition, and the blog states she made the evaluation (followed by a simple comparison table) and then implemented changes to the servers while a small, subtle shout out goes out to the authors of the compression algorithm. It's as if the credit was passed on to her for the purpose of google internship advertising.

I think outside of this article, it's very different who gets recognition here.


It probably is - much like their Pokemon Go + Google Cloud article.

When will HN readers realize that they are furthering corporate agendas when they upvote fluff like this?


If it's an interesting article with relevant content (I'd not heard of Brotli compression before), so what? Unless you're advocating active hostility to companies like Google, to particularly discourage relevant content with such connections. In which case would you like to justify that?


This compression technique seems to be based on the fact they have previous installation of an app that can be diffed and patched, so it wouldn't receive any benefit from first installations, only updates. But still might be worth it for many applications. I remember I investigated a way to send and apply diffs of javascript code (using a js version of patch) and store in the browser using localstorage. However, at the time the performance wasn't good enough when compared in an end to end benchmark.

However, this has got me wondering as a general corollary for application delivery...would it just make more sense to use something like a well-pruned and compact git repo, and make the connections over HTTP with gzip compression? I'm not sure how space efficient the git repo is but may seem like an interesting project. I'm wary of using any Google technology, open source or not if it can be done yourself in an afternoon.

Does such a thing even exist?


What you want is http://www.daemonology.net/bsdiff/

All objects in git packfiles are already compressed, so you aren't gaining much by adding another layer of compression.
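A quick way to see the "another layer doesn't gain much" point for yourself. This is only an illustration: zlib from the Python standard library stands in for both the compression already applied to the stored objects and the extra layer on the wire, and the word list is made-up filler text:

```python
import random
import zlib

# Seeded filler text so the result is repeatable.
rng = random.Random(0)
WORDS = ["commit", "tree", "blob", "patch", "delta", "index", "merge", "branch"]
raw = " ".join(rng.choice(WORDS) for _ in range(20000)).encode()

once = zlib.compress(raw)    # stands in for an already-compressed object
twice = zlib.compress(once)  # an extra compression layer on top of that

sizes = {"raw": len(raw), "once": len(once), "twice": len(twice)}
```

Comparing `sizes["once"]` with `sizes["twice"]` shows the second pass has almost nothing left to squeeze, because the first pass already removed most of the redundancy.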


The phrasing makes it clear that this is not intended to wow a tech audience. It's a Google ad to parents, or something.


Mathematicians are the true programmers. I wish I was one.


Even mathematicians rely on abstractions every day to do their jobs. Don't sell yourself short.



"Google Student Blog" // "Google news and updates especially for students"

Important context for this blog post and the comments in this thread.


On the other end of the spectrum, how much more energy has been used by the millions of Android phones uncompressing the app, applying the patch, and re-compressing the data?


Take a look at LZ5 algorithm -> https://github.com/inikep/lz5


This page is consistently crashing my Firefox content process. I'm running 51.0.1 64-bit on Win10. Anyone else having this problem?


Look in about:crashes, you can click the links and see if there is already a bug filed for it. Given that you're on release and this is a Google blog rather than an obscure site, it's unlikely to be a problem in Firefox itself. Check gfx drivers, plugins, etc.


51.0.1 on Fedora 25, no issues here.


Firefox 51.0.1 (64 bit) Ubuntu works ok.


I've seen replies about how this is a "simple library swap" and so doesn't deserve the attention and recognition it has received. As someone who works at Google but not anywhere remotely near this project, but with experience in similar projects, I'd like to shed some light on why this isn't a simple library swap, and seems from far away to have been both a tremendous accomplishment and a wonderful learning experience.

First off, there is no such thing as a library swap at Google. Our codebase is quite large. Like shockingly overwhelmingly large. Executing a change like this is almost certainly not a case of "swapping out one configuration line for another." It requires writing new code, testing it appropriately, updating any integration tests, updating documentation, etc. But the real fun starts when you're done coding...

There's the issue of frontend and backend. Serving Brotli-compressed data is great, but what if your app doesn't support it? If you're lucky, this will be handled by the underlying network layer but then you have to deal with...

Rollout. I don't know how many servers are dedicated to app updates, but I imagine it's a lot. I also imagine they're distributed geographically, across regions and probably even continents. Getting all those servers to support new features is a delicate, time-consuming process where any misstep will result in users noticing. It's not coding, but that's why it's called "software engineering" and not "coding engineering." But then once your servers are all up and running you have to deal with...

Versioning. Updating backend servers is bad enough, but at least you control them. What about that zoo of Android versions out in the wild? How do you ensure they all support these changes? Short answer: you don't. You design a strategy that will allow the rollout to happen gradually over a period of time, and closely monitor it to make sure nothing unintended is happening.
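As a rough illustration of the frontend/backend and gradual-rollout points, here is a minimal sketch of how a server might decide which encoding to send. Everything in it is hypothetical: the names `rollout_bucket` and `choose_encoding`, the 10% figure, and the idea of bucketing by device id are assumptions for the example, not a description of how Google does it.

```python
import hashlib

ROLLOUT_PERCENT = 10  # hypothetical: share of capable clients that get Brotli


def rollout_bucket(device_id: str) -> int:
    """Map a device id to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def choose_encoding(device_id: str, supported: set, percent: int = ROLLOUT_PERCENT) -> str:
    """Serve Brotli only to clients that advertise it and fall inside the rollout."""
    if "br" in supported and rollout_bucket(device_id) < percent:
        return "br"
    return "gzip"  # old clients and everyone outside the rollout keep the old path


print(choose_encoding("device-123", {"gzip", "br"}))
print(choose_encoding("device-123", {"gzip"}))
```

Because the bucket is derived from a hash, a given device always gets the same answer, and raising `percent` over time widens the rollout without reshuffling who is already in it.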

Then how do you turn down the old feature? When do you turn it down? You need to build and properly use instrumentation to determine the safest time to kill off the old feature. Or you could never kill it and commit to paying the cost in perpetuity. That's a design decision, and not a trivial one.

But, odds are you're not the only feature being rolled out. You have to anticipate/deal with potential interactions with other features, rollbacks of other people's work, etc.

I could go on, but I think I've already demonstrated why this is by no means a trivial accomplishment, even for a full time engineer. Add to this the fact that every intern has to race against the clock to get ramped up on their project, making something of this complexity and with this large an impact happen deserves applause.

I should add, I'm speaking as myself here and not representing Google in any way.


Makes you wonder how much they'd save by using Courgette, like the Chrome team does.


That's like 50 million dollars a year (in egress cost).


she didn't create a compression algorithm.

More akin to enabling GZIP in IIS...


If you had an intern that was responsible for turning GZIP on in IIS for a website that had 1B users it starts to become much more of an accomplishment.

Even small changes at that scale require careful analysis and coordination.


true.

However my response was more to the click baity title giving the automatic impression an intern came up with an innovative approach that netted tremendous result.


1.5M GB = 1.5 PB?


Please fix the title, it's 1.5 PB, not GB


It says 1.5M GB == 1,500,000 GB == 1.5 PB


That's confusing.

To use an analogy, imagine if someone wrote: "$1.5M B" instead of 1.5 quadrillion or 1,500,000,000,000,000. You'd be confused, and rightly so. A lot of people would mistakenly read it either as $1.5M or $1.5B, neither of which is right.

In this case a lot of people are misreading it as 1.5 GB/Day instead of 1.5 PB/Day.

PS - The way Google uses it in the Blog post is pretty clear, they're describing what a petabyte is. My issue is with the HN title only.


I'm sure I'm strange, but I find "$1.5M B" clearer than talk of "quadrillion"s, because I'm not confident I know what power of 10 a quadrillion is, but when I see M B I just add 6+9. Even then, the British sometimes say things like "thousand million", because billion over there used to mean 10^12, but now means 10^9 (which they used to call a milliard); as with other wordy number-words they've succumbed to American usage¹.

¹https://en.oxforddictionaries.com/explore/how-many-is-a-bill...


It's still a stupid way to write it.


1.5 megagigabytes.


That really is the worst possible way to put it.

Why not express it in nanobytes?

PB is a unit everyone is more familiar with than MGB.


*M GB

M is well-known in headlines as million. A GB is something people appreciate the size of, TB not as much, and PB is definitely not in most people's vocab.


> M is well-known in headlines as million.

In which case it doesn't get combined with an SI prefix because the financial "M" is not the same as the SI Mega. Jamming two SI prefixes together is plain ignorant.


I naturally read the title's GB as GiB (making only one SI prefix), but I don't hold anything against other people being more precise readers.


Oh, come on. This is a bullshit argument. Hacker News does not have the same audience as The Daily Mail.

Nobody measures things in megagigabytes.

To be truly pedantic, "M" can also mean "mille" or thousand if you're talking about things like CPM. Mixing units like this introduces needless confusion when there's already a SI unit for the job: P.


I didn't even see that. Good trick :)


1.5 kilokilokilokilokilobytes, more commonly known as 1.5 kilomegagigamillibytes.


So she did this for free? :D please tell me this is a paid internship!


....

I hope the rock you live under is heated, they're paid $6.6K a month + 9K-12K housing in the US.


2700 GBP a month, plus 1000 GBP housing (for a summer). Because Europe.


She started in Krakow, Poland. 2700 GBP/month is what the prime minister of Poland makes, it's an absurdly high salary for the local market there. I've friends who work as professional C++ programmers there and they make about 1000-1500 GBP/month working for huge corporations.


She worked on something completely different in Krakow, and earned a lot less (about 1000 euro I think). The compression work was in London, where 2700 GBP doesn't go a long way at all. Rent is about 1000 GBP for a room, and 2700 GBP is about 1800 net, so that leaves 800 GBP per month to spend on food, transportation and other expenditures.

I happen to know the person in the article. I was also an intern at Google this summer in Europe.


For an intern, that is insanely good.

edit: In fact, for a lot of people, that is insanely good! Apologies if I came across as an asshole when I said "for an intern".


No it isn't, especially considering it's 30% of what Google interns earn in the US, even in cheaper places than London (Pittsburgh, Seattle).

Also, it's a lot less than what competitors pay interns in London (Bloomberg, Facebook and Palantir pay interns a LOT better, among others).


I don't know, I think you have to look at what is good for a certain area.

In Poland, ~1000EUR/4000PLN/month is more than what an accountant makes. Teachers never reach this salary, even with 30+ years of experience.

You can easily live very comfortably on that salary, if there's two of you making that money you can buy a house and pay it off within 10 years or build one.

Being paid that much as an intern is just unheard of, most people I know would trip over and faint if they heard this.


No, Google interns are paid more than most devs in Europe.


Google's interns are paid well.


Unpaid internships are illegal, if the intern is doing real work for the company. https://www.dol.gov/whd/regs/compliance/whdfs71.htm


Newsflash: There are more countries than the US of A and your laws do not apply here.


Yeah but Google is mostly a US company so it was an easy slip to make.


[flagged]


Please comment civilly and substantively on HN or not at all.

https://news.ycombinator.com/newsguidelines.html


Not in the United Kingdom.

Although I'd imagine Google pay their interns


I did miss that it was in London, but I'm still surprised that there is some form of worker protection law that the US has that the UK doesn't.


There's pretty strict rules about what does or doesn't count as an internship though.

https://www.gov.uk/employment-rights-for-interns


just want to point out that your link explicitly refutes your statement. They are not illegal and you linked to a test about legality of unpaid internships:

> There are some circumstances under which individuals who participate in “for-profit” private sector internships or training programs may do so without compensation.


Yes, there is a list of criteria in that section. This job fails to meet several of them, especially number 4: The employer that provides the training derives no immediate advantage from the activities of the intern; and on occasion its operations may actually be impeded;


> Unpaid internships are illegal

Sorry venerable Sir, what planet are you talking about?


"But whatever your moral leanings, a judge on Tuesday confirmed what intern advocates have been alleging for years: a lot of these programs are illegal."

"it's hard to predict what appeals court judges will rule on any of these cases."

https://www.washingtonpost.com/news/wonk/wp/2013/06/13/are-u...


[flagged]


> Your own shitty[0] remark

You can't do this kind of name-calling on HN, so please don't. You've also been posting quite a few ranty comments lately. Please don't do that; we're trying for a higher quality of discussion than that here. When hot under the collar, please cool down before posting.

We detached this subthread from https://news.ycombinator.com/item?id=13581770 and marked it off-topic.


[flagged]


> Although you don't deserve an reply for your rudeness

> something you are lacking: respect

Please don't break the HN guidelines by being uncivil and making this site worse, even if someone else has behaved badly. Reacting like this creates a downward spiral.


Click bait.


No it's not. That's literally what I would summarise the entire article as: Google Intern's work saves over 1.5M GB per day.


I don't know much about gigabytes, but that seems like a lot

edit: (I'm guessing the downvotes are because I phrased it like a meme, but to clarify, this was a genuine compliment in response to a 'look what this person did' type post- it's inspiring stuff)


So she used a compression algorithm developed by other googlers? So what?

Don't get me wrong, I'm sure she did a lot of work for it, but looks like a lot of people would have been able to do that, there is nothing innovative in what she did, right?


I think it is mostly a recruitment ad for young IT workers, with a 'Women in technology' angle as well.

I personally wouldn't be too happy if my modest contributions as a novice were described as a 'massive improvement for millions of our users'.


I'd be happy to put, "Implemented compression algorithm that saves the company 1.5PB in data transfer a day." on my CV.

That's $15000 a day saved on data traffic. You are worth $5M+ per year now (at the same scale).
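Spelling out that arithmetic as a small sketch: the $0.01 per GB rate is not stated anywhere, it is simply the rate implied by dividing $15,000 by 1.5M GB, so treat it as an assumption.

```python
GB_SAVED_PER_DAY = 1_500_000   # 1.5M GB, i.e. 1.5 PB in SI units
CENTS_PER_GB = 1               # assumption: the $0.01/GB implied by $15,000 per day

daily_dollars = GB_SAVED_PER_DAY * CENTS_PER_GB // 100
yearly_dollars = daily_dollars * 365

print(daily_dollars)    # 15000
print(yearly_dollars)   # 5475000
```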


But who is exactly? Is she, or are the compression algorithm authors?


Neither. All parties involved seem to have done their job perfectly well. That's all I can judge from here. Here on HN there seem to be too many assumptions around this blog post.


You wouldn't be happy if somebody gave you a virtual high five for your work when you were early career?


I'd be happy, but happier if it came without the hyperbole (personally).


I guess I remain critical of my efforts. For example, I would not call my sneaky effort to use old veggies and other food items in refrigerator as massive cooking innovation to keep the planet green.


How about a modest improvement for millions of users?


A "modest contribution" is not one that saves 1.5PB of data transferred.


It's not innovative, and yet it had a large impact on Google and their customers. Innovation isn't everything.


It's easy for people (especially people who like tech) to forget this. "Innovation" in the sense of better algorithms or improved computations are great, but they don't solve problems - applications do.

That's why many 'hot' areas (like machine learning, for example) have a large gap between industry and academia. Researchers focus on "innovation" in terms of new approaches, aiming for marginally higher performance, and requiring increasingly complex models/infrastructure to unlock the remaining performance. On the other hand, combining an existing technology with industry knowledge of how to apply it can have a huge impact.


sometimes just applying the same thing in a different place can make a huge difference


> So she used a compression algorithm developed by other googlers?

Yes.

> So what?

So she saved 1.5PB per day - which no other Googler did. Nor you or me. She did.

> Don't get me wrong, I'm sure she did a lot of work for it, but looks like a lot of people would have been able to do that, there is nothing innovative in what she did, right?

If it was that easy, why wasn't it done already? Why did she have to come along to do it?

You are confusing easy to implement with easy to come up with the idea. In hindsight it is a no-brainer, but somehow the almighty Google didn't do it in years of Play Store.

Go figure, right?

P.S.: Please support your downvotes with a reply.


From the article:

"Anamaria’s project was to add support for Brotli for both new app installs and app updates."

So someone else told her to do it, if she wasn't there some other intern would have made the same savings.


I'm not sure how that says "someone else told her to do it", but even if it was true, the point still stands:

She did it, no one else did. You didn't, I didn't.

If another intern had done it, then we would be having the same conversation about that other intern.


They had to wait for someone to design and implement the compression algorithm before they could have an intern swap it out.


And they had to wait for people to invent computers so you could do your job; still you get praised for doing good stuff that no one else in your company does :)


What makes you think she came up with the idea? Read the article -- it's clear that it was her "project". She's an intern at Google, they're generally on a short leash.


> What makes you think she came up with the idea?

It might not have been her sole idea or even her idea, I'll give you that. However there's no evidence it was not her idea or partly her idea.

> Read the article

Please, this is the kind of comment not welcomed here. I did read the article before commenting. You are trying very hard to read between the lines, when Google is claiming again and again that her work saved 1.5PB/day - I'm not sure why you would try so hard to read a hidden message in Google's words and dismiss what's written in plain English.

> it's clear that it was her "project".

So what? See my previous paragraph.

> She's an intern at Google, they're generally on a short leash.

Generally or always? In the US or in Europe? It was the second time she was interning at Google (it's in the article).

Again, you are trying very hard to read between the lines and guess.

This engineer did something that saved users 1.5PB/day. She did, no one else did. Other people did other stuff. To each their own - I'm not sure why we have to downplay her achievement though.

P.S.: This is the kind of shit women have to put up with constantly. It might look like it's not important, but when you have to go through this every day it takes a toll on you. The worst part is that we don't even want to acknowledge it.


> Please, this is the kind of comment not welcomed here

Oh, come on. I'm not being facetious. The wording in the article is "work", "project" not "idea" or "initiative" -- which frankly would have been the article I was expecting to read. The article isn't even about her, by word count it's mostly about Brotli. There's nothing in the article to suggest that this was anything outside the realm of day-to-day software engineering.

> This is the kind of shit women have to put up with constantly.

Why are you bringing gender into this? I don't care what her biological sex is. I care that the original HN title made it sound like an interesting article about storage or service architecture and instead it was just "Google intern successfully uses existing Google library".


> they're generally on a short leash

How do you know that?


Many friends who had gone there. It was one of the reasons I turned them down. It's just part of being a large tech company -- they get hordes of interns and generally want to get something out of them in a short period of time. Most are explicitly attached to one or two projects which can be completed in ~2 months.


You're missing the point of the blog post, which is "Hey, students, Google deals with massive amounts of data, so even as an intern you can have a huge impact."


After any idea is implemented, there are people who say the same. Coz most ideas are out there, they need dedicated implementation.


Honestly genuine question: What did you accomplish that was technically as significant by her age/experience?


Why doesn't she wear a Pied Piper t-shirt? However, I'm more interested in whether Erlich's position is still vacant in the venture. (Jian-Yang's will work as well for the first 2 years, I suppose.)



