- The user sends an HTTP request to somesite.com
- Their DNS query for somesite.com gets resolved to some datacenter near them
- The HTTP request arrives at the datacenter where the PHP in WebAssembly is executed at half the speed of native PHP
- The PHP in WebAssembly sends database queries to a central DB server over the internet
- The PHP in WebAssembly templates the data and sends it back to the user
How is that faster than resolving somesite.com to the central server and sending the HTTP request there, where PHP runs at full speed and talks to the DB on the same machine (or over the LAN)? Even if PHP ran at full speed at the "Edge", won't the two requests over the internet
user --1--> edge PHP server --2--> central Db server
take longer than the single HTTP request when the user connects to the central server directly?
user --1--> central PHP+DB server
In reality, the PHP script on the "Edge" server probably makes not one but multiple queries to the DB server. Won't that make it super slow compared to having it all happening on one machine or one LAN?
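To put rough numbers on my concern (all of these RTTs are made up, purely illustrative):

    <?php
    // Completely made-up round-trip times, just to illustrate the concern:
    $userToEdge    = 0.010; // 10 ms: user to a nearby edge POP
    $edgeToCentral = 0.080; // 80 ms: edge POP to the central DB over the internet
    $userToCentral = 0.085; // 85 ms: user straight to the central server
    $dbQueries     = 5;     // a page that issues several DB queries

    // Edge PHP: one short hop from the user, then N long hops to the DB.
    $edge = $userToEdge + $dbQueries * $edgeToCentral;    // 0.410 s

    // Central PHP: one long hop from the user, then N ~1 ms LAN queries.
    $central = $userToCentral + $dbQueries * 0.001;       // 0.090 s

    printf("edge: %.0f ms vs central: %.0f ms\n", $edge * 1000, $central * 1000);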
Due to connection setup round trips, TCP slow start mechanics, and the quality of network connections, it is usually better to terminate the client's TCP connection close to them. But I agree that moving the frontend rendering near the client doesn't really make sense in almost all cases. So the best general-purpose setup would probably look like user → CDN → central PHP+DB.
Of course there are always exceptions. Often stale data is ok and you can ship data snapshots to the edge so that they can be served without going back to the central DB. But in many cases some basic cache policies with a CDN can be nearly as effective.
What I usually do is use two hostnames: host1 for HTTP requests which can be cached; that host is behind a CDN. host2 for HTTP requests which can't be cached; that host points directly to my server.
Are you saying it would result in a better user experience when I only use host1 which is behind the CDN and add no-cache headers to the request that can't be cached?
It's complicated but typically yes. The simplest reason is that TCP+TLS handshakes require multiple round trips for a fresh connection. The CDN can maintain a persistent connection to the backend that is shared across users. It is also likely that the CDN to backend connection goes over a better connection than the user to backend connection would.
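A minimal sketch of what that single-hostname setup could look like on the origin; the helper functions are hypothetical, and the header values are just typical choices:

    <?php
    // Hypothetical routing: one hostname behind the CDN, caching decided per response.
    if (is_cacheable_route($_SERVER['REQUEST_URI'])) {   // hypothetical helper
        // The CDN may cache this, and even serve a slightly stale copy
        // while it refreshes in the background.
        header('Cache-Control: public, max-age=300, stale-while-revalidate=60');
        echo render_page();                              // hypothetical
    } else {
        // Never cached -- but the user still gets the CDN's nearby TCP/TLS
        // termination and its warm, shared connection back to the origin.
        header('Cache-Control: private, no-store');
        echo render_personalized_page();                 // hypothetical
    }

With no-store the CDN never caches the body, but you still keep the handshake and routing benefits described above.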
> The CDN can maintain a persistent connection to the backend that is shared across users
We considered using Cloudflare Workers as a reverse proxy, and I did extensive testing of this (very reasonable) assumption. Turns out that when calling back to the origin from the edge, CF Workers established a new connection almost every time, and so had to pay the penalty of the TCP and TLS handshake on every request. That killed any performance gains, and was a deal breaker for us. It’s rather difficult to predict or monitor network/routing behavior when running on the edge.
This didn't sound right to me so I did some investigation and I think I found a bug.
Keep in mind that Cloudflare is a complex stack of proxies. When a worker performs a fetch(), that request has to pass through a few machines on Cloudflare's network before it can actually go to origin. E.g. to implement caching we need to go to the appropriate cache machine, and then to try to reuse connections we need to go to the appropriate egress machine. Point is, the connection to origin isn't literally coming from the machine that called fetch().
So if you call fetch() twice in a row, to the same hostname, does it reuse a connection? If everything were on a single machine, you'd expect so, yes! But in this complex proxy stack, stuff has to happen correctly for those two requests to end up back on the same machine at the other end in order to use the same connection.
Well, it looks like the heuristics involved here aren't currently handling Workers requests the way they should. They are designed more around regular CDN requests (Workers shares the same egress path that regular non-Workers CDN requests use). In the standard CDN use case where you get a request from a user, possibly rewrite it in a Worker, then forward it to origin, you should be seeing connection reuse.
But, it looks like if you have a Worker that performs multiple fetch() requests to origin (e.g. not forwarding the user's requests, but making some API requests or something)... we're not hashing things correctly so that those fetches land on the same egress machine. So... you won't get connection reuse, unless of course you have enough traffic to light up all the egress machines.
I'm face-palming a bit here, and wondering why there hasn't been more noise about this. We'll fix it. Talk about low-hanging fruit...
(I'm the tech lead for Cloudflare Workers.)
(On a side note, enabling Argo Smart Routing will greatly increase the rate of connection reuse in general, even for traffic distributed around the world, as it causes requests to be routed within Cloudflare's network to the location closest to your origin. Also, even if the origin connections aren't reused, the RTT from Cloudflare to origin becomes much shorter, so connection setup becomes much less expensive. However, this is a paid feature.)
> So if you call fetch() twice in a row, to the same hostname, does it reuse a connection?
In my testing, the second fetch() call from a worker to the same origin ran over the same TCP connection 50% of the time and was much faster.
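For anyone who wants to reproduce this kind of measurement from a plain PHP client, libcurl exposes the relevant timers. A rough sketch (the URL is illustrative, and exact timer behavior on reused connections varies by libcurl version):

    <?php
    // Two requests on one curl handle; libcurl keeps the connection alive,
    // so the second request should skip the TCP and TLS handshakes.
    $ch = curl_init('https://origin.example.com/api/ping'); // illustrative URL
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    foreach ([1, 2] as $i) {
        curl_exec($ch);
        printf(
            "request %d: connect %.1f ms, tls %.1f ms, total %.1f ms\n",
            $i,
            curl_getinfo($ch, CURLINFO_CONNECT_TIME) * 1000,
            curl_getinfo($ch, CURLINFO_APPCONNECT_TIME) * 1000,
            curl_getinfo($ch, CURLINFO_TOTAL_TIME) * 1000
        );
        // On a reused connection, connect/tls should drop to (near) zero.
    }
    curl_close($ch);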
We want to use Workers as a reverse proxy - to pick up all HTTP requests globally and then route them to our backend. So our use-case is mostly one fetch() call (to the origin) per incoming call. The issue is that incoming requests arrive at a ~random worker in the user's POP, and it looks like each Worker isolate has to re-establish its own TCP/TLS connection to our backend, which takes a long time (~90% of the time).
What I want is Hyperdrive for HTTPS connections. I tried connecting to the backend via CF Tunnel, but that didn't make any difference. Our backend is accessible via AWS Global Accelerator, so Argo won't help much. The only thing that made a difference was pinning the Worker close to our backend - connections to the backend became fast(er) because the TLS roundtrip was faster, but that's not a great solution.
> The issue is that incoming requests arrive at a ~random worker in the user's POP, and it looks like each Worker isolate has to re-establish its own TCP/TLS connection to our backend, which takes a long time (~90% of the time).
Again, origin connections are not owned by isolates -- there are proxies involved before we get to the origin connection. Requests from unrelated isolates can share a connection, if they are routed to the same egress point. Problem is that they apparently aren't being routed to the same point in your case. That could be for a number of reasons.
It sounds like the bug I found may not be the issue in your case (in fact it sounds like you explicitly aren't experiencing the bug, which is surprising, maybe I am misreading the code and there actually is no bug!).
But there are other challenges the heuristics are trying to solve for, so it's not quite as simple as "all requests to the same origin hostname should go through the same egress node"... like, many of our customers get way too much traffic for just one egress node (even per-colo), so we have to be smarter than that.
I pinged someone on the relevant team and it sounds like this is something they are actively improving.
> The only thing that made a difference was pinning the Worker close to our backend - connections to the backend became fast(er) because the TLS roundtrip was faster, but that's not a great solution.
Argo Smart Routing should have the same effect... it causes Cloudflare to make connections from a colo close to your backend, which means the TLS roundtrip is faster.
Thank you for looking into it in such detail based on an unrelated thread!
Cloudflare seems to consistently make all types of network improvements behind the scenes, so I'll continue to monitor for this "connection reuse" feature. It might just show up unannounced.
Yes, tried tunnels too. There is significant variability among individual requests, but when benchmarking at scale I found no meaningful difference in p50 and p90 between “Worker -> CF Tunnel -> EC2 -> backend app” and “Worker -> AWS Global Accelerator -> EC2 -> backend app”
> Are you saying it would result in a better user experience when I only use host1 which is behind the CDN and add no-cache headers to the request that can't be cached?
Yes, because that way you can leverage the CDN to defend against DDoS issues, and you can firewall the origin server itself so only the CDN is allowed to communicate with it, but no one else.
That's a valid concern, and with a classical centralized DB the Edge solution will not be faster in many cases, especially considering how many DB queries CMSs like Drupal or WordPress need for each page load...
Some other providers have tried to solve the problem with replicated databases that are used for local reads.
I can't go into details yet, but Wasmer will offer a solution for this problem quite soon as well, which hopefully will be much more seamless than existing strategies.
You're assuming that network latencies follow the triangle inequality, i.e., that A->C is smaller than A->B + B->C. However, that breaks down because of money. It's possible that the user's ISP is a cheapskate without good peering agreements, such that while A->B is fast, they haven't set up the necessary agreements to make A->C fast.
I've personally seen this with a game recently and my own ISP. I can ping their West Coast servers in 20-40ms while pinging their Chicago servers takes 200ms+.
This is actually one huge flaw with the current way IP is done. Because your ISP is closely tied with the last-mile of routing, you have very little control over optimizing those backend layers of routing.
This doesn't explain why you wouldn't just run a thin proxy at the edge and instead want to run a full PHP app. For that, my guess is this makes sense when you don't want to run a full server all the time or if you want it to be rapidly scalable.
I'm working on something in a similar space. It's not as mysterious as people make it sound:
The main thing is caching, so you don't always have to go back to the DB. The other interesting capability is light scripting at the edge. Think "this page is cached, but I'm going to add in a custom username".
There's some more exotic stuff around sharded database locality, but I'll skip over that because distributed mutable state is hard and it's not my specialty.
Connection warming and shorter TLS negotiation round-trip times are also a thing, but probably less important compared to the database issue you're mentioning.
But mostly this architecture makes sense if you can do some caching or if you're doing more data/log collection stuff.
For prior work see ESI (edge side includes). Assembling stuff at the edge does have some benefits but it has been hard to create a good interface for that.
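A toy sketch of that "cached page plus a custom username" assembly in plain PHP; the cache client, cache key, and placeholder marker are all hypothetical:

    <?php
    // The expensive page body is rendered and cached once for everyone;
    // only the tiny personal fragment is computed per request.
    $html = $cache->get('page:/pricing');            // hypothetical cache client
    if ($html === null) {
        $html = render_full_page();                  // hypothetical: templates, DB, etc.
        $cache->set('page:/pricing', $html, 300);    // cache for 5 minutes
    }

    // Splice the per-user fragment into the shared, cached markup.
    $username = htmlspecialchars(current_user_name());   // hypothetical auth helper
    echo str_replace('<!-- esi:username -->', $username, $html);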
I guess if you have a use case with highly cacheable data at the edge, it might work great.
For example, a form builder like Typeform could cache form definitions at the edge and render them close to users. Submissions would require DB communication, but the entire experience could be better.
Otherwise it is just not worth it.
BTW, I don't even think PHP running slower in wasm would be that important. These things generally depend on IO performance rather than runtime speed if you are not doing a lot of calculations at the edge. WASM is also pretty fast these days, so...
Ironically, everything is pretty fast, and everyone wastes that speed adding things that allow for cheaper labour. Wasm included. It went from "have some expertly optimized code in the browser" to "just compile this huge C++ thing and ship it as a React component" to "it allows our servers to run any crap without knowledgeable staff in each language".
We added so many layers to cheapen labour that now all the labour cost goes into managing the layers.
OK, hear this: I was toying with Zig last week and built a QuickJS wrapper that can run React, and compiled that to Wasm. So I have a JS runtime running Wasm, which runs a QuickJS runtime via Zig, which runs React. I'm just trying to learn Zig, but it's not much different from running PHP in WASM, haha. Layers and layers of shit pressed together.
I'm gonna counter the other comments that try to justify this somehow, and say:
I agree with you, and it's not faster. At the risk of sounding old (get off my lawn): I assure you that the megabytes of unneeded JavaScript they're downloading to the client cost ten times more than the milliseconds they may be saving on the wire.
Yes, you are correct. The top comments disagreeing are either full of wrong basic concepts, or are very dense in detail while hand-waving the important parts, and come from CDN-for-hire employees.
The post mentions a cache. I think the key here would be not going to the central DB and instead going to a distributed cache. Not an unreasonable concept, assuming the vast majority of operations will be reads.
Hi, on kraft.cloud we use Firecracker, along with a custom controller and very specialized VMs (unikernels), to have extremely efficient deployments (e.g., millisecond cold starts). For a PHP web server, for instance, we can cold start things in about 30ms (https://docs.kraft.cloud/guides/php/). It's also possible to run wasm workloads/blobs (e.g., https://docs.kraft.cloud/guides/wazero/).
The builds are based on Dockerfiles, but for deployment we transparently convert that to unikernels.
Given that we run PHP on the edge, what is the point of running the PHP interpreter on top of a WebAssembly interpreter (Wasmer) instead of just running the PHP interpreter directly?
Why are we running this stuff at the edge anyway? Wasn't the point of the web to permit light/thin clients? What problem is this really solving? I feel a lot of these new methods exist simply out of "holy crap look what we can do" factors. I have a hard time thinking that running a PHP interpreter in WASM on a smartphone is a smart idea.
"What if our code ran in tens of thousands of different environments that we have no control over, rather than a single server environment we control!"
One thing WASM runtimes usually do really well is sandboxing.
Various interpreters might or might not have a good capability/permissioning model (Java's is capable but complex and not supported by many applications, for example); even if they do, there might be exploitable bugs in the interpreter itself.
When Cloudflare Workers launched, they said V8 isolates had some great properties for serverless-style compute:
- 5 ms cold starts vs 500+ ms for containers
- 3 MB memory vs 35 MB for a similar container
- No context switch between different tenants' code
- No virtualization overhead
I'm sure these numbers would be different today, for instance with Firecracker, but there's probably still a memory and/or cold start advantage to V8 isolates.
But in this case, it's running PHP, which doesn't have a long-running model: it always cold starts, and it does so really fast natively. I can't see how it could be faster in WASM.
In almost all cloud deployments, whether transparently or not, you'll have a hypervisor/VM underneath for hardware-level/strong isolation reasons. Using wasm on top of that stack only for isolation purposes might not be the best use of it. Having said that, if wasm is useful for other reasons (e.g., you need to run wasm blobs on behalf of your users/customers), then my (admittedly biased) view is that you should run these in an extremely specialized VM that has the ability to run the blob and little else.
If you do this, it is entirely possible to have a VM that can run wasm and still only consume a few MBs and cold start/scale to 0 in milliseconds. On kraft.cloud we do this (e.g., https://docs.kraft.cloud/guides/wazero/ , wazero, 20ms cold start).
On kraft.cloud we can (we've done internal stress tests for this) run thousands of specialized VMs (aka unikernels) scaled to zero, meaning that when a request for one of them arrives we can wake it up and respond within the timescale of an RTT. You can take it out for a spin; just use the -0 flag when deploying to enable scale to zero (https://docs.kraft.cloud/guides/features/scaletozero/).
Interesting – are we talking actual Linux VMs here, with binary-compatible syscalls etc., or something that applications need to be specifically built or packaged for in some way?
> Server-side WASM takes off with the re-implementation of PHP, Ruby/Rails, Python, and others, and a WASM based virtual server (shell, filesystem, web server, etc..) Cost more but has better security for both the host and user.
Guess I was wrong about it costing more?
> … we can run PHP safely without the overhead of OS or hardware virtualization.
But it only runs at half the speed of PHP, so you need more resources.
Honest question: what kind of requests are you thinking of? In my projects I'm always fetching or changing data in a database on each request, and if I'm not, then I'm probably moving that logic to the frontend.
A basic example is some compute service, say image transformation. You just run computations where all the input is in the request, and all the output goes to the response.
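For instance, a resize endpoint sketched with PHP's GD extension; the size clamp and JPEG quality below are arbitrary choices:

    <?php
    // Stateless "edge compute": the input image arrives in the request body,
    // the width in the query string, and the result goes straight back out.
    $width = max(1, min(2000, (int)($_GET['w'] ?? 400))); // clamp; limits are arbitrary

    $src = imagecreatefromstring(file_get_contents('php://input'));
    if ($src === false) {
        http_response_code(400);
        exit('not an image');
    }

    $dst = imagescale($src, $width); // GD computes the height to keep aspect ratio

    header('Content-Type: image/jpeg');
    imagejpeg($dst, null, 85);       // quality 85, write to the response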
I feel like you still need to hit the DB for that, for auth at a minimum (which might not need to hit your main DB, I guess), if not for logging (spend credit/record usage/just analytics). I guess all of that could be skipped with something like a JWT and a log ETL process.
I'm intrigued by edge computing but the DB always seems like the bottleneck so thank you for the example.
I'm still trying to understand what this does and what's the use case. Is the "edge" a server? The browser? Why should I compile WordPress or Laravel to wasm?
In this context, the edge is a fancy way of saying "serverless". It just means that your PHP interpreter will be started on-demand on a node closer to your customer's request.
So if your website receives no requests, it costs you nothing. And requests have less latency for the user.
That's the theory anyway, in my experience reality is a lot more nuanced because the serverless node still has to reach a database and so on.
This seems like a misunderstanding; WebAssembly has nothing to do with PHP's internal performance mechanisms. WASM is a compilation target: you can take code which needs to be compiled (like PHP's core binary) and compile it to run in a browser.
PHP in WASM means developers can run actual, real, native PHP code in the user’s browser, without the user needing to have PHP installed locally, or nginx, etc…
You could do that, but the post here is in fact about running PHP on a server on top of a Wasm runtime.
“At the edge” basically means “close to the user”, with the details left as an exercise to whoever is selling you their “edge.” In this case, it’s a Wasm runtime company.
Yes, though I'd like to point out that "scale to zero" is a loose term for anything that can be transparently scaled to 0 whenever an app/service is idle, and then woken up when traffic to the service arrives once again.
The problem in practice with Cloud Run (and similar products from other providers) is that it can take seconds or minutes for the platform to detect idleness, during which you're still paying, and then seconds to wake up -- during which users/clients have to wait for a response or possibly leave the service/site.
For my taste, real scale to 0 would be: detection and scale to 0 within < 1 second of idleness, and wakeup within an RTT, such that the mechanism is transparent to end users.
As a shameless plug, this is what we do at kraft.cloud (based on years of research, LF OSS work, unikernels, a custom controller, and overall non-negligible engineering effort).
The "edge" in this case is the browser (from a pure WASM standpoint, though I see these guys offer a hosted serverless version too).
For most general-purpose applications, there’s no point to WASM. But some apps may run specific functions which take a long time (e.g. bulk/batch processing), and being able to execute those tasks securely on the client side provides immediate feedback and better UX.
That’s just one use case. Another is that WASM makes PHP portable, in the sense it can run in any web browser without the need for a back-end server. Lots of potential opportunities for distributing software which runs completely locally.
Thank you for explaining. As a web dev who's been using PHP since version 4, I'm still very confused why someone would consider running a CMS like WordPress on the client side, or at the "edge". I guess the good thing here is that someone is spending (a lot of) energy giving PHP new ways to be used by developers.
The edge within this context means running a server close, in terms of Internet latency, to users. For example, if a user is sending a request from Germany, then the response should come from a server running in, say, Frankfurt, not the US. There are now many providers that allow developers to deploy services at many different locations at once, and to ensure that client requests are routed to the closest available location. An understandable source of confusion is that wasm comes from the browser world, but it's also possible to run it as standalone (no browser) server code.
Also not to be confused with the term edge within the context of IoT/embedded, where the edge is devices running at the very edge of the Internet, e.g., factory floors, trucks, etc.
The browser is not the "edge". The browser is the browser. Running WordPress in the browser makes exactly zero sense. The only exception is if you are running a test instance.
Can someone ELI5 what "edge" computing means?
The way I understand it is that it's moving some operations closer to the client to avoid bandwidth costs and improve performance.
I thought of the Tesla car computer as edge computing, as it does a lot of processing within the car that would otherwise add latency and reliance on an internet connection.
But for web browsers? Going to some websites?
What sort of apps need this functionality?
Seems like over-engineering, so I'm looking for someone to explain it to me.
Trying to understand the solution better: why isn't it possible to restrict the application via process isolation (nsjail, cgroups, Docker...) instead of needing wasm?
Actually it means both, in an unfortunate case of term overload. Though I can understand the embedded/IoT world being frustrated by this, as the term existed first within that context.
Author here. Faster than it has ever been at the Edge via WebAssembly :)
But you are completely right to point out that there's still some room to improve, especially when compared to running PHP natively.
Right now there's some price to pay for the isolation and sandboxing, but we are working hard to reduce the gap to zero. Stay tuned for more updates on this front!
> Given that we run PHP on the edge, what is the point of running the PHP interpreter on top of a WebAssembly interpreter (Wasmer) instead of just running the PHP interpreter directly?
From what I can tell it's because some of these "edge" service providers will expect you to give them a WASM binary instead of a PHP script.
The other caveats about "edge" throughout this discussion aside, if I needed to do this, I'd try to write something in Zig or (gag) JS or something else that compiles to WASM directly rather than writing a script for an interpreter that runs under WASM.
NB: PHP is already really fast. Some major things built on it are slow, mostly due to things like poor data access patterns or architecture, but its culture since the beginning has basically been "get out of PHP and into C as fast as possible", and it shows (this is basically the trick to any scripting language "being fast"; PHP just embraced it hard from the very beginning).
If you’re on PHP and need more speed in the language itself, basically every other scripting language (yes, including Node) is off the table immediately. Lateral move at best.
(All that to say, yeah, I entirely expected that the headline would lead to an article about making PHP slower)
> PHP just embraced it hard from the very beginning
Which is also the reason why a lot of the PHP standard library functions are so inconsistent. They're straight wrappers around the C libraries.
The upside is, unless you have to do shit with pointer magic or play around the edges of signed/unsigned numbers, it's fairly easy to port C example code for any of the myriad things PHP has bindings to over to PHP.
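As an illustration of that parity, PHP's curl functions map almost one-to-one onto libcurl's C API (the URL is illustrative):

    <?php
    // libcurl in C:                     PHP equivalent:
    //   h = curl_easy_init();           $h = curl_init();
    //   curl_easy_setopt(h, OPT, v);    curl_setopt($h, OPT, $v);
    //   curl_easy_perform(h);           curl_exec($h);
    //   curl_easy_cleanup(h);           curl_close($h);
    $h = curl_init();
    curl_setopt($h, CURLOPT_URL, 'https://example.com/'); // illustrative URL
    curl_setopt($h, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($h, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($h);
    curl_close($h);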
This is trying to solve a solved problem with lots of difficult technology that doesn't apply here. Most PHP websites are WordPress, and the solution for a speedy WordPress site is to compile it to static HTML; calls to the server should happen with JavaScript. The server will always remain relevant, as WordPress uses a database, and thus the "Edge" makes no sense here.
Getting WordPress running in WASM was a huge milestone. It was one of the first big PHP/WASM achievements, but was never the end goal, just a proof-of-concept. The target market for this tech is not WordPress bloggers.