Hacker News | throwaway427's comments

This is a great presentation.

Where I work we don't use any process-level parallelism at the Ruby level; we hoist that up to the Kubernetes level and use signals (CPU load, job queue sizes, etc.) to increase/decrease capacity. Workloads (replica sets) are segmented across multiple dimensions (different types of API traffic, worker queues) and are tuned for memory, CPU and thread counts according to their needs. Some heavy IO workloads can exceed a single CPU ever so slightly because the DB adapter isn't bound by the GVL, but practically speaking a pod/Ruby process can only utilize 1 CPU, regardless of thread count.

One downside of this approach, though, is that it takes a long time for our app to boot. This, along with the time to provision new nodes, can cause pod autoscalers to flap/overprovision if we don't periodically tune our workloads.
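For anyone who hasn't looked at how that scaling decision is made, here is a rough sketch. It assumes the formula documented for the Kubernetes Horizontal Pod Autoscaler (desired = ceil(current * currentMetric / targetMetric)) and its default 10% tolerance; the function name and the queue-depth numbers are made up for illustration and are not anyone's real config.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.1) -> int:
    """Replica count suggested by an HPA-style proportional rule.

    Assumption: mirrors the documented HPA formula, ignoring
    stabilization windows, min/max bounds and pod readiness.
    """
    ratio = current_metric / target_metric
    # Within the tolerance band the controller leaves the count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Hypothetical signal: average queue depth per pod, target of 60.
print(desired_replicas(4, 90, 60))   # scale up
print(desired_replicas(4, 63, 60))   # inside tolerance, no change
print(desired_replicas(10, 30, 60))  # scale down
```

If new pods take a long time to boot, the metric can stay above target for several evaluation cycles, so a rule like this may keep requesting more replicas than the steady state needs. That is roughly the overprovisioning described above.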

In a perfect world we would be able to spawn processes/pods that are already warmed up/preloaded (similar to forking, but at the k8s level and the processes are detached from the root process) in a way that's not constrained by the CPU capacity of some underlying k8s node it is running on and instead is basically an infinite pool of CPUs that we only pay for what we use. Obviously serverless sort of offers this kind of solution if you squint but it is not a good fit for our architecture.


> One downside of this approach though is it takes a long time for our app to boot

Another is that you're leaving a lot of memory savings on the table by not benefiting from Copy on Write.


In my past experience with a large rails monolith, memory usage was always the limiting factor. Just booting the app had significant memory overhead. Using in-process concurrency would have led to massive infrastructure savings, since a lot of that overhead could be shared across threads. Probably 2-3x the density compared to single threaded.

In the end, we never got quite there, due to thread safety issues. We did use a post-boot forking solution to achieve some memory savings thanks to copy-on-write memory, which also led to significant savings, but was a bit more complex.
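The post-boot forking idea can be sketched roughly like this. It is written in Python rather than Ruby, is POSIX-only, and is not the setup described above: `preload_then_fork`, `load_app` and `handle` are hypothetical names. The point is only that the expensive load happens once in the parent, and each forked child starts out sharing those memory pages until it writes to them.

```python
import os


def preload_then_fork(load_app, handle, worker_count):
    """Load the app once, then fork workers that share it copy-on-write.

    Returns the exit code of each worker, in fork order.
    """
    app = load_app()  # expensive boot happens once, in the parent
    pids = []
    for worker_id in range(worker_count):
        pid = os.fork()
        if pid == 0:
            # Child: inherits the parent's memory pages without copying them.
            exit_code = 0
            try:
                handle(app, worker_id)
            except Exception:
                exit_code = 1
            finally:
                os._exit(exit_code)  # never fall back into the parent's code
        pids.append(pid)
    # Parent: wait for every worker and collect its exit code.
    return [os.WEXITSTATUS(os.waitpid(pid, 0)[1]) for pid in pids]


if __name__ == "__main__":
    codes = preload_then_fork(lambda: {"routes": list(range(1000))},
                              lambda app, worker_id: len(app["routes"]),
                              worker_count=2)
    print(codes)
```

How much memory actually stays shared depends on the runtime: anything that writes to preloaded objects (garbage collector bookkeeping, reference counts) forces pages to be copied, which is part of why this tends to be more complex in practice than it looks.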

All that to say, the naive "just let kubernetes scale it for you" is probably quite expensive.


Since you seem to have some insight into this, I interviewed a former Heroku eng a while back and he said that they deploy Heroku infra on Heroku, essentially dogfooding their own product. I also see a lot of claims that Heroku has essentially stagnated on features. I'm curious how internal teams reconcile this (do they just work around the stagnation?) and whether there is a layer that is a choke point for product improvement that also affects internal teams (if so, that would seem demoralizing).


Most of their attrition is due to the slow rate of product improvements. So yeah, it's demoralizing for sure.

I've heard this as a reason for leaving for almost a decade though so it hasn't been a new problem but it seems to get worse every year.


There are a bunch of internal only feature flags that Herokai use internally. Making internally visible feature flags is less of a political fight than making externally visible features.


Apollo has the ability to intercept clicks on reddit links and open them in the app. It works pretty well.


This is basic potted meat. You can even do it with fish: https://youtu.be/tXh_VT5ygOY


Yup, feeding AMP to consumers is a lot different than, say, https://outline.com/ which is straight up copyright infringement.


Does anyone actually know who is running outline.com? Are they definitely infringing on copyrights or is that just a popular assumption?

I've been wondering for awhile, given how successful it is at navigating paywalls, if it might actually be run by the newspapers themselves as a way of attracting potential customers. Giving them content to keep them interested while guilting them into paying.


> Does anyone actually know who is running outline.com?

The only hint I can get is their Terms of Service are governed by California law and the courts of Santa Clara County [1]. (It also references “business transfers” in its privacy policy [2], implying it’s a for-profit entity.)

Otherwise—and this is unusual for any legitimate activity—I can find no reference to any legal person anywhere on their site.

[1] https://outline.com/terms.html

[2] https://outline.com/privacy.html


Unusual, but not as far as I know illegal or anything.

It would be unusual for an illegal operation to:

- Agree to the laws of a US court

- Have business transfers and be a for-profit entity in that sense, or, hell, even have a privacy policy.

(of course, they could also have just written that for fun)


You can pretty easily do a historical record look-up on the domain and its DNS. It does not appear to have any ownership tie to major media.


This is less useful than it was in the 90s:

    Registrant Name: PERFECT PRIVACY, LLC

    Name Server: NS-497.AWSDNS-62.COM
    Name Server: NS-1669.AWSDNS-16.CO.UK
    Name Server: NS-861.AWSDNS-43.NET
    Name Server: NS-1406.AWSDNS-47.ORG


Exactly, this tells us nothing.

If media were running this, the whole point would be to not have it be publicly known, and to make people feel guilty for using the service.

If criminals are running this, obviously they don't want to be publicly known.


What if outline.com simply did a GET request on the source from the browser, stripped out paywalls and ads and served the content from the source server? Is that still copyright infringement?


That's basically what Brave does - web browser with a built-in adblocker. Or, for that matter, Chrome with any adblocking extension. It's not illegal, but it's not exactly popular with webmasters, and they're within their rights to block access to these browsers. In practice it tends to evolve into a cat-and-mouse game where adblockers block the ads, websites try to detect the adblockers and show you a pop-up encouraging you to turn off the adblocker, adblockers try to block the pop-ups, and so on.


Brave does more than that: https://news.ycombinator.com/item?id=18734999

Adblock Plus should fork its own web browser with built-in Acceptable Ads whitelisting. It'd be more honest than Brave.


Acceptable Ads is anything but honest. They take money to whitelist ads, including ads from Taboola, one of the worst ad networks. https://www.businessinsider.fr/us/google-microsoft-amazon-ta...


Now, I'm not a user of Acceptable Ads, but I don't think either their specific policy of taking money to whitelist AA-compliant ads from large companies, or having that policy apply to entities that have otherwise scummy ads is necessarily dishonest.


- Taboola ads are scummy

- Acceptable Ads is not supposed to allow scummy ads

- Taboola paid to get their ads accepted

- Acceptable Ads is dishonest

If you google a bit, you'll find that the ads that get whitelisted under Acceptable Ads are no different from the normal Taboola bullshit. In fact, the whitelist is quite simple: they allow the whole Taboola network to operate.


Webmasters hate it when you do this but there’s nothing they can do to stop you


Yep, it's one secret technique they don't want you to know about.


That sounds like Reader View. I don’t think copyright law requires rendering the entire webpage exactly as the server requests.


I see no difference between visiting a webpage with a browser or through a program which modifies the content prior to delivery. That's exactly what any browser plugin does.


The big difference is that if the program is run on the server of somebody else, it needs rights for redistribution (copyright) for the redistributor server, which is a different entity than you (who is running the local program).


Exactly, people want to make this a simple case (outline is really just a browser by a different name) but copyright isn't a bright-line domain: intent matters, and outline is just re-hosting other people's content for broad consumption.


This would be true for every switch on the route.


Those don't modify or cache the content, only serve it to the user which the publisher approved serving it to, and increasingly with HTTPS they can't even see it. Moreover, the publisher is implicitly accepting that by being on the Internet.


Of course a lot of content is cached along the route, for frequent access urls.


Not by network hardware, which is the context of this discussion.

It's also not the type of activity being discussed, which is unauthorized caching: a content provider who uses a CDN is doing so intentionally and while shared local proxies are increasingly uncommon they also respect the Cache-Control headers set by the source — see e.g. https://redbot.org/?uri=https%3A%2F%2Fwww.wsj.com — so again there's the distinguishing factor of authorization.
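To make the authorization point concrete, here is a minimal sketch of the decision a well-behaved shared cache makes before storing a response. It assumes the standard HTTP caching semantics in which `private` and `no-store` forbid storage by shared caches; the function names are hypothetical, and the parser is deliberately naive (it splits on commas, so quoted values containing commas are not handled, and it treats a qualified `private="..."` the same as a bare `private`).

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Parse a Cache-Control header into {directive: value-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directive names are case-insensitive; values may be quoted.
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def shared_cache_may_store(header: str) -> bool:
    """True if a shared proxy is allowed to keep a copy of the response."""
    directives = parse_cache_control(header)
    return "private" not in directives and "no-store" not in directives


print(shared_cache_may_store("public, max-age=300"))
print(shared_cache_may_store("private, max-age=0"))
print(shared_cache_may_store("No-Store"))
```

The publisher sets the header, so a cache that honors it only ever holds copies the publisher opted into. A service that stores and re-serves pages regardless of these directives is doing something different.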


I'm absolutely foggy about the specifics, but I wasn't talking about CDN, rather about the ISP-side.


Those have an implied license by the copyright holder.


Because this is disallowed by CORS/single origin policy


I think the OP stated a "what if", ie "What if CORS didn't exist?" I also think you could argue "What if outline loaded articles in an iframe?" (and at the same time "what if same-origin policy wasn't a thing?") If it was technologically possible, would it be infringing?


> If it was technologically possible

It's actually pretty easy: you can start Chrome with the --disable-web-security flag [1]

> I also think you could argue "What if outline loaded articles in an iframe?"

I'm sure this would be legal as it's equivalent to loading the site in a tab. The parent site wouldn't be able to manipulate any of the content/ads/paywalls/functionality, and the content site gets the full hit.

[1] https://stackoverflow.com/questions/3102819/disable-same-ori...


> The parent site wouldn't be able to manipulate any of the content/ads/paywalls/functionality

What? What do we disable CORS for if not to allow JavaScript from one domain to manipulate content in an iframe of another domain? Am I missing something?


Disabling CORS would allow you to make straight requests to foreign content from your site and manipulate the responses exactly as though they came from your own servers - no iframe needed. CORS does not disable iframe sandboxing.


CORS is just a security feature, it does not imply anything about copyright or terms of use.


The DMCA ties the two by prohibiting users of copyrighted works from circumventing technological protection measures. It could be argued that bypassing CORS applies.


CORS isn't a technological protection measure like DRM and isn't designed as such. It's purely a security measure; by default you don't even specify it. Browsers are free to ignore it as they wish (but with increased security risks, of course).


I agree. CORS is something my user agent does to protect me. It has nothing to do with the upstream site; I could easily browse it with a user agent that doesn't support CORS and nothing would break. CORS is just some annotations that lets my user agent determine "hey these scripts might be up to something shady". It is not a copy-protection mechanism by any means.
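A simplified sketch of the check the user agent runs may help here. It models only the basic case: a same-origin read is always allowed, and a cross-origin read is allowed only if the response carries a matching `Access-Control-Allow-Origin` header. Preflight requests and credentialed requests are ignored, the function names are hypothetical, and the header lookup assumes the caller passes the header under exactly that key.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str):
    """Return the (scheme, host, port) triple that defines an origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def script_may_read(page_url: str, resource_url: str, response_headers: dict) -> bool:
    """Would a browser let a script on page_url read this response?"""
    page_origin = origin_of(page_url)
    if origin_of(resource_url) == page_origin:
        return True  # same origin: no CORS header needed
    allowed = response_headers.get("Access-Control-Allow-Origin")
    if allowed is None:
        return False  # server did not opt in, so the browser withholds the body
    if allowed == "*":
        return True
    return origin_of(allowed) == page_origin


# Hypothetical hosts, for illustration only.
print(script_may_read("https://reader.example/a", "https://news.example/story", {}))
print(script_may_read("https://reader.example/a", "https://news.example/story",
                      {"Access-Control-Allow-Origin": "https://reader.example"}))
```

The whole decision lives in the client. The server sends the same bytes either way, so a client that never runs this check (wget, curl, a server-side fetcher) is unaffected by it.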


Yeah, but my point is what was being suggested is physically impossible with CORS in place, so it does imply something about what is in the realm of possibility.


CORS is really just a security measure for embedded pages and elements. It's not intended to enforce usage restrictions/rights, and it can't, since it requires the client (browser) to honor the setting. If I wget a page and strip the text from it, I'm not embedding the page in any manner, so CORS is irrelevant. The current 'agreement' for respecting copyright (whether it would hold up in court, even with a TOS, is beyond my knowledge) is robots.txt, which, I'll admit, is pretty dated, a very poor solution for dynamic pages, and still requires client cooperation.

The best solution for copyright/paywall enforcement is to roll your own. If the request doesn't have the required cookie to access the full article, don't respond with the full page. This works very well for dealing with sites such as outline.com.
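The cookie check described above could look something like this on the server side. This is a bare-bones sketch, not any particular site's implementation: the `Article` type, the `subscriber_session` cookie name and the in-memory set of valid sessions are all hypothetical stand-ins for whatever session store a real site uses.

```python
from dataclasses import dataclass
from typing import Mapping, Set


@dataclass
class Article:
    teaser: str     # what anonymous clients get
    full_text: str  # what paying subscribers get


def render_article(article: Article,
                   cookies: Mapping[str, str],
                   valid_sessions: Set[str],
                   cookie_name: str = "subscriber_session") -> str:
    """Return the full text only when the request carries a valid session cookie."""
    token = cookies.get(cookie_name)
    if token is not None and token in valid_sessions:
        return article.full_text
    # No cookie or an unknown one: the full text never leaves the server.
    return article.teaser


story = Article(teaser="First paragraph...", full_text="First paragraph... and the rest.")
sessions = {"abc123"}
print(render_article(story, {}, sessions))
print(render_article(story, {"subscriber_session": "abc123"}, sessions))
```

Because the decision is made before the response is built, an anonymous fetcher only ever receives the teaser, whatever it does with the page afterwards.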

Sites like outline.com would be really interesting/useful if they allowed you to upload your login cookies so that they could get paywalled articles and still strip the ads.


> If the request doesn't have the required cookie to access the full article, don't respond with the full page

The way outline.com works is by loading the article unsuspiciously once from their server, then serving it any number of times from their infrastructure. How would this stop that from happening?


Yes, it is. It's a form of republication, and as such it is covered by copyright law. The difference between a third-party server-side solution and client-side solutions is easy to see: client-side solutions don't redistribute.


If that is copyright infringement then so is Google Chrome.

Outline is a browser in a browser.

If it were accessing unauthorised content in a shady way I would say they have a leg to stand on, but serving up content to Google different from what it serves to the user is already borderline illegal for a variety of reasons. (It's a misrepresentation) I don't think anyone is wanting to go down that road.

Remember with the law intent is usually what matters most.

The technical stance that outline is serving up content that the user requests from another site is by design not redistribution.

The funny part is that the argument that allows this is the same stupid argument that permits copyright to exist in a world where every time you open or read a file you are technically copying it.


> serving up content to Google different from what it serves to the user is already borderline illegal for a variety of reasons

This is called spoofing. Google doesn’t like it because it makes for a bad user experience, but it is certainly not illegal (or even borderline).

I think the legal argument you’re trying to make died with the Aereo Supreme Court decision. The “outline is a browser in a browser” statement is cute but it doesn’t pass the duck test.


> I think the legal argument you’re trying to make died with the Aereo Supreme Court decision. The “outline is a browser in a browser” statement is cute but it doesn’t pass the duck test.

About Aereo, could you elaborate more on that? I never heard of it, checking the Wikipedia article, I cannot find the word browser inside.

I'm seriously interested in that argument because years ago I was considering an idea to do exactly that. I mean, look at Rubinius (Ruby in Ruby) or PyPy (Python in Python). Those are actually serious projects that are more than just research. As far as I know, some things can even be done faster that way, and they gave inspiration to the reference implementation.

Speaking of JS, React is basically a re-implementation of the DOM in JS with an XML-like language.

Nobody minds that Chrome and Firefox include translation features that transform websites, or that people use screen readers, etc. I think there are reasonable limits to what a content provider can restrict.


Outline is doing a transformation and then serving a copy of it, not just doing it in place with the copy that the user lawfully obtained.

Aereo tried to do the same for TV broadcasting (they were claiming they didn't copy, just digitized and transmitted on behalf of the user), and the courts struck that down.


Right. In Aereo's case they even went as far as to nominally "rent" the colocated hardware to the user such that it was the user who was receiving the broadcast and transmitting it to themself for personal use. Which is clever, but the court ultimately decided that it doesn't work that way.


How is this different from what the web archive does?


Hahha TRULY.

I loved reading the CDC growing up. That one of them is a serious presidential candidate now is truly making my day. I'm almost in tears laughing at the idea.


It's also possible that that software system isn't triggered the same way on AA and SWA 737s as it is on the other airlines...


You keep posting this Reddit comment in every single one of these threads. Maybe lay off for a bit?


I agree and I don't think you deserve the downvotes. The FAA is an institution with a deep sense of safety first and independence from (and even dominion over) private industry.

They are also a body of experts, with more collective knowledge than any other similar body of domain experts in the world.

If we believe in institutions of experts who study and make recommendations regarding complex systems, we owe it to the FAA to trust they are charting a prudent course based on the evidence they are gathering and analysis they are doing.


It's interesting you put it this way. I just heard on NPR this morning that the FAA and NTSB are actually in partnership with private industry (like Boeing and the airlines) for cooperative self-regulation because the FAA and NTSB just don't have the resources to provide adequate inspections without their help. I'll see if I can find that interview.

https://www.npr.org/2019/03/13/702908761/why-did-a-boeing-73...


I listened to this interview and it's interesting the way you put it:

Steve Inskeep rephrases a statement that Goelz makes to make it seem like the FAA is short staffed and that airline compliance is on the honor system.

Goelz then says what he means is that there cannot be an independent FAA inspector in every single corner of a manufacturer or maintenance shop, but by then it's too late because the seed has been planted in listeners' minds.

So here you are saying the "FAA and NTSB just doesn't have the resources to provide adequate inspections" and that the overall compliance model here is "cooperative self regulation" which... I guess barring the FAA having a 1:1 staff for every employee in the airline industry we are indeed going to have "cooperative self regulation".


Regulatory capture happens. Top-level leadership starts overruling career professionals and then that dominion gets reversed. Not saying that's what has happened here, but boy did Boeing pull one over on the FAA with this plane's release. And FAA leadership has an incentive to hold the line. The FAA fucked this up long ago.


An attitude of “damn the regulations, we need to help business” flows from the very top of this administration, too. That, and blatant corruption. How much has Boeing spent at Trump properties in the last few years?

The FAA should be independent but it doesn’t always work that way, and this sort of thing can tip the balance.


The FAA is great, and I'd agree it's probably better than any other individual agency in other countries. But better than all of them together? Maybe, maybe not.

To turn your concluding argument around, what is imprudent about grounding these planes until we have a clearer answer? It's going to cause some economic loss, but this seems like a much lower priority.


Concur, lived in Texas and NYC both, Austin is hotter for sure, but air conditioning is on blast everywhere and the driving helps. Just walking a couple of blocks on a 90 degree day in NYC is enough to get you soaked.

On top of that, you can wear shorts in most workplaces in Austin and not have a problem. Wearing shorts in NYC is frowned upon in the cosmopolitan set and professional industries.


This is Hacker News. The conversations generally have a tech-oriented context. I've never worked at a tech company in NYC where lots of people don't wear shorts in the summer. Shorts, flip flops, tank tops. It's all fair game. No one cares.


thanks for the mansplain. you got new york figured out, bro.

