Modern Web Application Architecture (leftnode.com)
88 points by leftnode on Nov 5, 2011 | 45 comments



Your client side shouldn't have any PHP in it; the JavaScript should be making the requests to the REST API directly. Is there a reason why you're not doing this?

What you have done here is just add another tier to your backend when you don't really need it. This is something you do when you have hundreds of thousands (or millions) of visitors, not something I would worry about before you even launch an app.

I was surprised when I came to the point of the post where you describe the client - I thought the point of your API (and the post) would be to describe a single page app with only an API on the server side.

Edit: what aedan said, as well :) You should really take the time to just do the 'client' in JavaScript + HTML + CSS - it is totally awesome. Your client is then static and you can CDN the hell out of it. Twitter is the best popular example of this architecture, where the same API is used for their own apps, including the website, and for third-party developers.


Aside from the API having no access restrictions (and therefore having to stay internal), it's providing pure data, no HTML.

The template rendering would have to be done in JS, which means that the templates have to be delivered to the browser. Either you load every template when you need it or you have to transfer all templates on page load. The first case has no advantage over a simple application layer on the server and the second increases page load time.

On further thought, some older mobile phones might have a hard time rendering complex templates in JS, so that might be another factor.

He should just do what best fits his requirements, and that might require a simple application layer in front of the api.


While I agree on the 100% JS approach, to optimize the experience and app load time it's best to provide the data inside the document on load, reducing HTTP calls. Backbone supports the idea here:

http://documentcloud.github.com/backbone/#Collection-reset
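
For illustration, a minimal sketch of that idea - seeding a Backbone collection from data the server embedded in the initial page, so no extra HTTP call is needed on load (the variable names, endpoint, and bootstrapped global are assumptions, not from the post):

    // Assumes Backbone (and its dependencies) are already loaded, and that the server
    // rendered something like `window.bootstrappedAccounts = [...]` into the page.
    var Accounts = Backbone.Collection.extend({
        url: '/api/accounts'   // hypothetical endpoint used for later syncs
    });

    var accounts = new Accounts();
    // Instead of an extra request on page load, seed the collection from the inline data:
    accounts.reset(window.bootstrappedAccounts || []);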


I put templates in my page loads, but I fetch data with a background process that takes advantage of the local cache. So the first GET will take 500 ms or so to prime the user cache, but from then on it is all diffs back to the server. The agent is something I have been meaning to open source.

It is about weighing pros and cons. I would prefer to take advantage of having all the assets be static, cacheable, CDN-able, etc. Weighing those options in other apps might not make as much sense, but this particular app for me had small amounts of very user-specific data.

Btw, there is still a lot of this 'stack' to be built; I find myself having written, and still having to write, a lot of JavaScript to support the model I am using.


I am guessing there are some access restrictions on the API and they haven't added anything like OAuth yet. That's why they need to proxy it (a hunch based on the use of the word api-internal in the post).

Another reason might simply be they just don't have the right people to develop the JS front-end.


I think you should take the frontend one step further and have it live completely on the client side. It is fully within the realm of possibility to build a JavaScript/HTML application for the front end that only interacts with your API.

Is there a reason you didn't do this?


Exactly my thought: why have one server proxy requests to another web server instead of having the client use the API directly?

Your main Web server could just serve the initial page / application / whatever, and then the application lives on the client, fetching data from the API and formatting it as needed. You end up with proper MVC in the browser.

That's what I did for a small side-project of mine last week and it worked very well. I detailed how I built it here: https://plus.google.com/118077068834135870002/posts/62agb7qp...


Thanks for the link, I'll check it out.


He talks about wanting to charge for the API. If the JavaScript on your public website directly calls your (otherwise paid) API, how can you avoid exposing the credentials it uses, which are unmetered?


I recall some functionality in a PHP framework I was using that allowed you to make API calls on the server side through use of a class or function. It was something like

    $user = $api->GET('/accounts/the_user');
which would process the API call without actually making a separate HTTP request. Would this accomplish it?
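
For what it's worth, a rough sketch of that idea expressed in JavaScript rather than PHP - the "API call" is dispatched to an in-process handler, so no separate HTTP request is made and no credentials are exposed to the browser (the handler map and paths are made up):

    // Hypothetical in-process dispatch: the same routes the HTTP API exposes,
    // but callable directly from server-side code.
    var handlers = {
        'GET /accounts/the_user': function () {
            return { id: 'the_user', name: 'Example User' };  // stand-in for a real DB lookup
        }
    };

    var api = {
        GET: function (path) {
            var handler = handlers['GET ' + path];
            if (!handler) throw new Error('No handler for GET ' + path);
            return handler();  // no HTTP round trip; API credentials never leave the server
        }
    };

    var user = api.GET('/accounts/the_user');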


You could make sure that the requesting page is on your domain.


It's trivial to use a proxy to modify the referrer headers.


Make the calls to the API over HTTPS.


aeden, I agree with you that the disadvantage of this approach is that you can't just repackage the client as a mobile app.

However, there are two reasons why you would access the API with server side code in a web client (as opposed to a mobile one).

[1] You can provide a Google-friendly, static (no JavaScript) version of the site and all content - as opposed to a site that is completely invisible from an SEO perspective.

[2] You can store local versions of all data that's returned from API calls.

This is useful in a couple of ways.

[a] It means that you need only check the API for a 'last modified' date and, if it's no newer than your cached copy, use the locally cached API call results first (if they exist). This makes the client extremely fast, since you're reading flat files locally.

[b] If the API is inaccessible for some reason, the client can default to using the locally cached copies of previous API call results.
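
A loose sketch of [a] and [b] (the parent's backend was presumably PHP; this is Node-flavoured, and the cache path and fetch helpers are assumptions):

    var fs = require('fs');

    // fetchLastModified(path, cb) and fetchFull(path, cb) stand in for whatever HTTP
    // client talks to the API; lastModified is assumed to arrive as a Date.
    function cachedApiCall(path, fetchLastModified, fetchFull, callback) {
        var cacheFile = '/tmp/api-cache-' + encodeURIComponent(path) + '.json';

        function readCache() {
            return JSON.parse(fs.readFileSync(cacheFile, 'utf8'));
        }

        fetchLastModified(path, function (err, lastModified) {
            if (err) {
                // [b] API unreachable: fall back to the previously cached result, if any.
                return fs.existsSync(cacheFile) ? callback(null, readCache()) : callback(err);
            }
            if (fs.existsSync(cacheFile) && fs.statSync(cacheFile).mtime >= lastModified) {
                // [a] nothing has changed upstream: a cheap local flat-file read.
                return callback(null, readCache());
            }
            fetchFull(path, function (err, fresh) {
                if (err) return callback(err);
                fs.writeFileSync(cacheFile, JSON.stringify(fresh));
                callback(null, fresh);
            });
        });
    }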

I built this site (backend, not the HTML/CSS/JS) the same way a while back and have been pleasantly surprised with the results - http://jacksonteece.com/


This pattern is very interesting, although I'm finding I develop quite a bit faster on server-side frameworks like Flask/Django... One thing I don't like about the pattern is that you have to send the templates to all the users (statically), even users that don't have access to those pages (for example, an admin page). Have you run into this? Of course you could also send the templates via JSON...


Well, in my case I wrote a little packer/minifier for compiled Jade (http://jade-lang.com/) templates and I serve them as a single JavaScript file.

When you have different templates based on user rights, you should probably make different packed template files and serve them only to users with the proper cookie / session rights.

I'd serve an admin page / app at a specific URL and then it would reference the proper template JS file (or whatever you're using).

Either way I wouldn't worry about the extra few bytes served for rarely used templates, just make sure the template file is properly cached. Or serve it as a different file and load it asynchronously when needed.


If anyone's interested, I put up the code I wrote for packing Jade templates into a single file and serving them to the client with Express: http://pastebin.com/kRtCJHYH (feel free to do whatever you want with it)

I found there were very few good alternatives for using Jade template client-side. The only one that seems promising is JadeVu (https://github.com/LearnBoost/jadevu/) but the approach is very different.
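
In case the pastebin link rots, here's a rough sketch of the packing idea (not the linked code; it assumes a Jade version that provides compileClient, and the compiled functions still need the Jade runtime served to the browser):

    var fs = require('fs');
    var path = require('path');
    var jade = require('jade');

    // Compile every .jade file in a directory into one JS payload that fills window.templates.
    function packTemplates(dir) {
        var out = ['window.templates = {};'];
        fs.readdirSync(dir).forEach(function (file) {
            if (path.extname(file) !== '.jade') return;
            var src = fs.readFileSync(path.join(dir, file), 'utf8');
            var fnSource = jade.compileClient(src, { filename: file, compileDebug: false });
            out.push('window.templates[' + JSON.stringify(path.basename(file, '.jade')) +
                     '] = ' + fnSource + ';');
        });
        return out.join('\n');
    }

    // Serve the packed file with long-lived caching, e.g. in an Express handler:
    //   res.setHeader('Content-Type', 'application/javascript');
    //   res.setHeader('Cache-Control', 'public, max-age=86400');
    //   res.send(packTemplates('./views/client'));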


Oh yes, it's not the page size I worry about, but whether the template already has some info that you don't want to share (OK, probably most interesting info is dynamic, but anyway).


Or a hybrid: serve templates/views/html/etc as usual but Ajax straight to the API/db.



Whenever I heard about Amazon using SOA heavily, I thought this was how they implemented it (making HTTP requests to their own API). Is it true? If there is a performance penalty to this approach, what are the alternatives for calling the internal services/APIs? People talk about services all the time, but no one talks about the actual implementation. Thanks to the blog author for documenting this. It would be great if someone could point out similar articles.


My understanding of Amazon's internal SOA usage is that it is geared toward timeout-enforced service response guarantees to mitigate concerns about the performance penalty. Specifically, a page render event triggers asynchronous calls to N service endpoints (I have read as many as 300 per page render event). The page renderer will wait a specified time to receive responses from each service, but time out on those services that do not meet their latency SLA. Page components that depend on an error result or a timed-out service response will simply not be rendered, or will be replaced with dummy content.

Employees of Amazon, feel free to correct any holes or errors in my understanding of the web application architecture there.
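
As a toy illustration of that pattern (not Amazon's implementation - endpoints, timeouts, and fallbacks are all made up, and it leans on modern fetch/Promise APIs for brevity):

    // Race each service call against a timeout; whatever misses its SLA gets dummy content.
    function withTimeout(promise, ms, fallback) {
        var timer = new Promise(function (resolve) {
            setTimeout(function () { resolve(fallback); }, ms);
        });
        return Promise.race([promise, timer]);
    }

    function fetchJson(url) {
        return fetch(url).then(function (res) { return res.json(); });
    }

    Promise.all([
        withTimeout(fetchJson('/svc/recommendations'), 200, { items: [] }),
        withTimeout(fetchJson('/svc/reviews'), 200, { reviews: [] }),
        withTimeout(fetchJson('/svc/pricing'), 200, null)   // null => render placeholder
    ]).then(function (parts) {
        // Render the page from whichever services answered within their latency budget.
        console.log(parts);
    });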


I wrote about this some time ago. It's a great way to build web applications after you sort out the little kinks and intricacies. Some tips can be found in this post: http://www.w2lessons.com/2011/05/why-and-how-you-should-writ...


For the last couple of years I've been developing projects as an API layer coupled with a dumb 100% JavaScript front end. The API layer always returns JSON data, and the front end requests and consumes it. Three projects down the line and this has so far been a great success. The back-end server is now free to do what it really needs to be doing - get data to the front end. You can write different front ends (mobile, voice, etc.). They all request and consume the same data. And you earn an API you can open to the public.


Harel, I'd be very interested in seeing some of the projects you've done using this technique. Are there any that are publicly accessible?


> It is stateless and uses no sessions or cookies.

Could someone please enlighten me how this should be done best?

There are a lot of cases where the API must authenticate and authorize the user before allowing them to access a resource. To do this, one has to use either a self-invented scheme or HTTP authentication. Here are my thoughts:

- HTTPS X.509 certificate auth is nice, but unfortunately requires excessive user awareness, does not work in many browsers (at least, Chromium@GNU/Linux and Android), and, I believe, cannot be easily controlled from JavaScript.

- HTTP Basic auth is silly, as it requires the browser to hold the password in memory for a prolonged period, and there are no sane methods to make the browser forget the credentials. As there's no notion of a session, remotely revoking previously opened sessions is impossible, so the situation "Oops, I forgot to log out at that Internet café" has only one possible solution - changing your password.

- HTTP Digest auth requires plaintext password knowledge on the server side, so it has limited use cases. HTTP Basic problems apply here, too.

- HTTP OAuth1 seems to be the best available solution out there, but has the downside that it requires signature generation and verification on each request (a rough sketch of per-request signing follows below), and isn't natively supported by any browser I know of.

- HTTP OAuth2 is HTTP cookies reinvented. Except that, once again, no browser supports it natively.
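
To make the "self-invented scheme" option concrete, here's a minimal sketch of per-request HMAC signing (illustrative only - header names and the payload format are made up, and it is not actual OAuth 1.0):

    var crypto = require('crypto');

    // The client signs each request with a shared secret; the server recomputes the
    // HMAC for the stored secret of apiKey, compares signatures, and rejects stale
    // timestamps for replay protection.
    function signRequest(method, path, body, apiKey, apiSecret) {
        var timestamp = Date.now().toString();
        var payload = [method, path, timestamp, body || ''].join('\n');
        var signature = crypto.createHmac('sha256', apiSecret)
                              .update(payload)
                              .digest('hex');
        return {
            'X-Api-Key': apiKey,
            'X-Timestamp': timestamp,
            'X-Signature': signature
        };
    }

    var headers = signRequest('GET', '/accounts/the_user', '', 'key123', 'secret456');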


Perhaps the front-end can always be open sourced since the business-critical parts all sit on the back-end.

Advantages of doing this:

* Clients can run their own copy of the software

* Other developers (or even clients) can submit pull requests directly instead of opening an issue ticket

* The monetization potential can be maximized


One thing I've been thinking about for an architecture like this is to use an event machine on the frontend and make the API calls asynchronous to negate the latency issues.


Yep, that's a big reason. Always load the possible next actions upfront, and when the user clicks, it's immediate.
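
A tiny sketch of that preloading idea (endpoints and cache shape are made up; modern fetch/Promise used for brevity):

    var prefetched = {};

    // Start fetching likely next actions while the user is still reading the current page.
    function prefetch(url) {
        if (!prefetched[url]) {
            prefetched[url] = fetch(url).then(function (res) { return res.json(); });
        }
        return prefetched[url];
    }

    ['/api/orders', '/api/settings'].forEach(prefetch);

    // On click, the data is usually already here (or at least already in flight).
    function onNavigate(url, render) {
        prefetch(url).then(render);
    }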


On "it's slow" ...

Making direct calls to the DB probably won't make it ten times faster. Reporting data should come from a pre-aggregated table (think data warehouse) so that you're not doing anything other than an indexed SELECT.

So anything displayed to the user should become instant. You're never displaying a million records at once, max 20.

For reporting, build those on the server side using a different data interface to the DB - RESTful is not the right answer here.


I posted a question on stackoverflow trying to gather pros and cons of the methods outlined in this HN post.

http://stackoverflow.com/questions/8022715/standalone-rails-...


Another advantage to this is that you can have other types of clients in parallel. For example a native iOS app could call the API directly.

This native app could be made with client side web tech and packaged with PhoneGap.

It would also be interesting to try to implement the client in such a way that it can run both on the server (in Node.js, outputting static pages to browsers) and as a pure client-side web app, packaged with PhoneGap.


I see a lot of great things coming from this, such as creating an infrastructure where the web team understands and has incentive to develop an API that will be just as useful for the mobile team.

I wonder about security though. How would you lock this down so that the data can be encrypted?


Use HTTPS? Or do you mean security against other kinds of attacks besides sniffing and MITMing?


"Generally a single URL returns a specific resource"

I've seen this misconception. There is absolutely nothing wrong with GETting multiple records with a querystring.

There are frameworks like CodeIgniter that inexplicably do not support this in a first class way.


A resource is not a record.


Forcing every feature you write to be service-backed is actually quite annoying. I'd avoid using web services for the main app until it was absolutely necessary.


I am using this architecture on a new project; I would love to hear about any possible drawbacks.


We can't tell you if this is a good idea without knowing what your project is.

In the absence of that, here are some "possible" criticisms, some of which come directly from the blog post itself:

You're incurring extra latency by doing two HTTP connections where one might do.

If you choose to serve your REST API from a different layer of servers than your front end, that's a whole new layer of servers that can go down. So you have a whole new layer of redundancy and failover to engineer.

Your back end is issuing SQL queries, encoding the results as JSON or (god help us) XML, passing it to your front end which is very likely parsing it, filling out templates with the resulting data structure, and emitting HTML. That extra encode-and-parse step costs time and resources. (Of course, if your front end is generally Javascript running on the customer's machine, this isn't a big problem.)

Finding bugs can be a pain. You now need multiple layers of logging, and tracing a buggy request through the system requires you to piece together a chain of internal requests.

You've designed twice the API surface area. You must now document and test an internal API as well as your front end, and the two may have significantly different semantics. Now, it's true: just because you don't formally define your internal API doesn't mean it isn't there, because every app has some sort of internal API (often built around an ORM, these days). But once you decide to formalize it, that API is harder to tinker with.

The overarching criticism, as always, is: YAGNI. You are almost certainly not building Facebook. For a very large percentage of the sites on the web, this design is total overkill. If you're delivering text embedded in HTML, like most blogs or magazines or brochureware sites, you should spend your energy figuring out the Varnish and CDN layers instead of fiddling around with a custom REST API that nobody is ever going to use. Just install an off-the-shelf RSS module and call it a day.

Even if your site might potentially benefit from an internal API, should it really be part of your initial deliverables? Does the customer want to pay for it? Does the minimum viable product require it? If there is anything more painful than designing one interface for the wrong product, it's designing two interfaces for the wrong product.

Having said all of that: This architecture is indeed really useful when you have the right problem, we use something like it at my own company, and (as many other commenters have pointed out) in the world of rich Javascript clients and native mobile apps this strategy is gaining in popularity for good reason.


Excellent.

We run an architecture somewhat similar to that described in the article, and a lot of the issues you raised we either dealt with up-front, or realized we needed to deal with them very soon after rolling it out.

Regarding extra latency and SQL SELECT result marshaling, we cache extensively and invalidate through message queues. The few exceptions include data that are involved in transactional contexts, like actual order placement and fulfillment.

We have effectively solved debugging/logging by generating and chaining request identifiers, which turns out to not be as computationally expensive as one would think.
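
For anyone curious, a rough sketch of request-ID chaining with Express-style middleware (the header name and ID generator are assumptions, not the parent's implementation):

    var crypto = require('crypto');

    // Reuse an incoming ID from an upstream layer, or mint a new one at the edge.
    function requestId(req, res, next) {
        req.requestId = req.headers['x-request-id'] || crypto.randomBytes(8).toString('hex');
        res.setHeader('X-Request-Id', req.requestId);
        next();
    }

    // When this layer calls the internal API, it forwards the same ID, so a single
    // user action can be traced through every log file in the chain, e.g.:
    //   { url: apiBase + '/orders/42', headers: { 'X-Request-Id': req.requestId } }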

YAGNI is, of course, the elephant in the room. Some aspects of the architecture have turned out to be quite beneficial, but the traffic scalability afforded by shared-nothing, API-driven architectures is not something we have had the luxury of really exercising as much as we'd like :)


Yes, the caching at the API layer has proven critical for us, too. Although, of course, this introduces a cache coherency problem, and we all know what fun those can be.

You bring up a very good point that I left out:

The few exceptions include data that are involved in transactional contexts...

It's sad how few people even understand how to write transactional code when you hand them a direct interface to SQL. (I like to think I do, but I may be fooling myself.) And trying to implement, document, test, and maintain a stateless RESTful HTTP protocol that properly supports transactions on the underlying data store is even harder.


I can imagine performance being an issue (as mentioned in the article, but without any numbers) - anyone know of any benchmark results?


I have a similar architecture for http://www.mockuptiger.com

It is PHP + jQuery + MySQL.

I use the ADOdb library, but I think PDO is now quite mature, so I will consider it for other enhancements.


What's that additional layer of PHP code good for? Call your API from Javascript.


Why not let the front end sit entirely on the client?

EDIT: What aeden said.





