XHP: A New Way to Write PHP (from Facebook) (facebook.com)
134 points by ashu on Feb 10, 2010 | 67 comments



XHP rocks so fucking hard, it isn't even funny. It is just so much better than alternatives.

IMHO, it is the only PHP tool I use at Facebook that is better than the alternatives in other languages. I'm looking at you, Django templates!

The notation perfectly represents the objects, with no cruft associated with object oriented programming. That is really rare. You could argue that the markup syntax is cruft, but it really helps code readability to have two types of syntax, for code and for markup.

  $b = <span>quotes and variables</span>;
  $a = <div>omg, I can't "believe" how easy these {$b} are</div>;


Couple questions:

1. How is this different from Django filters? Is it that the default is HTML escaping instead of having to specify the escaping with each template variable?

2. How does it handle different escaping contexts? For example, text in HTML attributes needs to be escaped differently from text in the body of the document. Text in URLs or JavaScript has to be escaped differently still, and oftentimes you have to combine these escapings (e.g. a JavaScript onClick attribute). Is XHP smart enough to recognize these different contexts and do the right thing, or do you need to fall back to some manual mechanism?


Filters are tags in Django templates that live in .html files with a bit of logic in a bespoke mini-language. XHP for Python would be something like this:

  def view_foo( request ):
    baz = "roger, roger"
    return render_foo( baz )
  
  def render_foo( name ):
    return <html><head></head><body>hi, {name}</body></html>
The most glaring difference is that instead of template logic and keywords, you can use Python. You definitely want to sequester rendering from the rest of your view, but I see little benefit to Django templates. Missing from this code sample is some Django middleware which renders a proper HttpResponse() from the XHP return.

I don't know it well enough to answer about escaping. Check out the framework, and try it for yourself :)


I'm also very interested in how it is able to escape properly... anybody?


I've skimmed through the docs. Each tag that you can use is a PHP class. It knows exactly which attributes it can take and it can do some validations on the attribute values. I'd assume that it knows when there's supposed to be JS in an attribute value and when there's supposed to be text.
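
As a rough illustration of the "each tag is a class that knows its attributes" idea, here is a minimal Python sketch. The Element and A classes and the allowed_attrs field are hypothetical names invented for this example; this is not XHP's actual class layout or validation logic.

```python
import html


class Element:
    """Hypothetical base class: each tag declares which attributes it accepts."""

    tag = ""
    allowed_attrs = frozenset()

    def __init__(self, *children, **attrs):
        unknown = set(attrs) - self.allowed_attrs
        if unknown:
            raise ValueError(f"<{self.tag}> does not accept {sorted(unknown)}")
        self.children = children
        self.attrs = attrs

    def render(self):
        # Attribute values are always escaped for an attribute context.
        attr_text = "".join(
            f' {name}="{html.escape(str(value), quote=True)}"'
            for name, value in self.attrs.items()
        )
        # Child elements render themselves; plain strings are escaped as text.
        inner = "".join(
            child.render() if isinstance(child, Element)
            else html.escape(str(child), quote=False)
            for child in self.children
        )
        return f"<{self.tag}{attr_text}>{inner}</{self.tag}>"


class A(Element):
    tag = "a"
    allowed_attrs = frozenset({"href", "title"})


link = A("Tom & Jerry", href="http://example.com/?a=1&b=2")
print(link.render())
```

Because the class knows whether a value is an attribute or a child, it can pick the escaping itself, and an undeclared attribute such as onclick is rejected at construction time.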


It's turd polish, that doesn't make anything shiny.

If your code looks anything like that, or the examples on the Facebook page, you're doing it wrong.


I fail to see how this is "so much better" than, say, Smarty.


You clearly don't have any idea how the Django template engine works, lol. How someone can even compare this with Django beats me. But hey, whatever works for you.


I hope you don't mind, but I quoted/linked to this in my post on XHP: http://www.readwriteweb.com/archives/xhp_more_php_enhancemen...

I really liked what you had to say! Please let me know if you want me to remove the quotation or change the links around it. I'm jolie@readwriteweb.com.


For me, XHP is far more interesting than HipHop. And I say that as someone who administers a pile of single-application CPU-bound PHP servers. This completely and forever changes the templates-vs-just-PHP debate, and I'm glad -- it's the kind of evolution PHP needs to continue to be taken seriously.


Strangely enough, I'm reading your PHP code just now :)


Heh, which code is that? If it's anything to do with a certain salesforce-like system at a certain log-searching company, I apologize. :)


No, thank god that (salesforce-like system) is dead :) Something called 'spacecake'...


I'm happy to have written code for a system I know nothing about. :)


Hi James! I hope this is ok with you - I quoted & linked to this comment in my post on XHP tonight: http://www.readwriteweb.com/archives/xhp_more_php_enhancemen...

If you want me to remove the quotation or change the links, just let me know. I'm jolie@readwriteweb.com.


In a previous job, we used Apache::ASP, which is basically PHP-style <? ?> tags, only with Perl in the insides. We got an awful lot of mileage out of simply re-writing the default <?= ?> equivalent to automatically HTML-escape the contents, and adding a new <?! ?> type thing to pass through the inside unescaped.

It's less "cool" than this, certainly, but making the default reasonably safe and forcing you to ask for the dangerous level of output rather than defaulting to dangerous and having to ask for safe is a lot easier to implement.
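
A minimal Python sketch of that "safe by default, ask for dangerous" rule, assuming a hypothetical emit() helper and Raw marker class (neither comes from Apache::ASP):

```python
import html


class Raw(str):
    """Marks a string as already-safe HTML; the explicit 'dangerous' opt-in."""


def emit(value):
    # Default path: everything is escaped unless the caller wrapped it in Raw.
    if isinstance(value, Raw):
        return str(value)
    return html.escape(str(value))


name = "<script>alert(1)</script>"
print(emit(name))
print(emit(Raw("<b>bold</b>")))
```

Forgetting the wrapper gives you ugly but harmless output, whereas forgetting an escape call in the opposite design gives you an XSS hole.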


The sentiment here seems to be overwhelmingly positive. Call me a cynic, but this reminds me of "magic_quotes" all over again: a "feature" that tries to help, but masks the fundamental problem.


What is the fundamental problem it masks?


Presumably he means allowing users to thoughtlessly work with unsafe values. One way that XHP could mask this problem becomes clear from this example:

    echo <span class="username">{$_POST['name']}</span>;
Now someone comes along and decides the span is unnecessary, turning the code into the (erroneous):

    echo $_POST['name'];
However, I don't think this is a problem with XHP's approach, except in the sense that XHP doesn't go far enough in fixing PHP's faults.

If PHP gave all unsafe values the type "unsafe string" and disallowed all implicit conversion to safe types, then XHP's approach would be a welcome way of doing the right thing by default. Programmer-introduced errors like the above example would result in fatal type errors instead of exploits (presuming "echo" won't take unsafe strings and so requires a conversion function, like htmlspecialchars or the hypothetical unsafe_cast).
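
Here is a small Python sketch of what such an "unsafe string" type could look like. UnsafeString, escaped() and echo() are hypothetical names for illustration only; PHP has no such type, and this echo() returns the text instead of writing it so that the behaviour is easy to check.

```python
import html


class UnsafeString:
    """Hypothetical wrapper for user input; deliberately not a str subclass."""

    def __init__(self, value):
        self._value = value

    def __str__(self):
        # Block implicit conversion (print, f-strings, str()).
        raise TypeError("unsafe string must be converted explicitly")

    def escaped(self):
        # The explicit conversion, analogous to htmlspecialchars.
        return html.escape(self._value)


def echo(text):
    # Only plain (already safe) strings are accepted for output.
    if not isinstance(text, str):
        raise TypeError("echo() refuses " + type(text).__name__)
    return text


post_name = UnsafeString("<img src=x onerror=alert(1)>")
print(echo(post_name.escaped()))
```

With this in place, dropping the wrapping span and calling echo(post_name) directly fails with a TypeError instead of emitting the raw input.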


One could simply write their own "echo" function which only takes an XML object parameter.

    // This works
    write(<span class="username">{$_POST['name']}</span>); 
  
    // Whereas this wouldn't work
    write($_POST['name']);
I think perhaps the use of echo in these examples is just a simplification. More than likely, at Facebook, these XML classes are output by another function.


The separation of presentation and logic.

Anyone who's spent the 1990s coding Perl CGI will tell you just how big of a step backward this is.


That seems to be unrelated to the concern here (as evidenced by the grandparent citing magic_quotes). XHP (like plain PHP, for that matter) can be used in a way that, for all intents and purposes, separates the concerns of presentation and logic. It's more a matter of convention than constraint.


> separates the concerns of presentation and logic

WHAT? The explicit purpose of this is to combine the presentation (HTML) and logic (PHP) layers.


The purpose of a "layer" is not determined by the language it's implemented in, but by what the code actually does. Code that handles nothing but presentation is by definition separate from the business logic, even if the presentation code and business logic are implemented in the same language. It is trivial to implement this architecture with PHP and even easier with XHP.


Lazy/bad developers will always find ways to put business logic and presentation in places they don't belong. I like systems that -- rather than enforcing arbitrary and inevitably leaky constraints -- aim for maximum productivity and power in the hands of capable developers.

Let's take a real case: Smarty. To my eye, XHP is going to take a huge slice of the PHP template system market away from Smarty. The reason is that XHP gives you the native constructs of PHP (sans mind-bending hacks -- try referencing a class constant in Smarty) while staying at the right level of abstraction for fast development: chunks of valid XHTML.


  That is, it is impossible to generate malformed webpages while using XHP.
While the purist in me thinks this is great if everyone else uses it, the immense amount of productivity lost when I first started using kid templating (e.g. http://turbogears.org/about/kid.html ) really burned me on this whole concept.

Sometimes I really do want to make a quick test page without crossing all my i's and dotting all the t's. Importing non-perfect markup from a designer is a big pain, too, in this kind of templating system.

And, though it's unfair to say so, some companies do well enough without 100% valid XML markup: http://blog.errorhelp.com/2009/06/27/the-highest-traffic-sit...


Saving 14 bytes per request is something only a very few sites need to think about...

For the rest of us, clarity and semantically correct code are much more useful.


Wasn't there just a thread complaining about how the end tags make HTML so much more verbose and difficult to read, and that's why people are writing preprocessors like HAML and XHP? I've found that the code is much clearer when you omit your end tags.

As for semantic correctness - it's in the HTML spec, and every major modern browser handles it correctly. Sometimes I wonder if Google are the only folks who actually read the W3C specs; there's been so much cargo-cult advice passed down between web developers.

The "always close your tags" advice came from the early 2000s, when people were pushing XHTML as a way to make your HTML pages XML compliant (the big buzzword back then). It gives essentially no benefit to users, no benefit to developers, costs you bandwidth, makes your pages slower, and clutters up your markup.


Actually, I once visited a website that didn't close its tags where that practice hindered my use of it. That page used backticks instead of curly quotes, so I wanted to use a bookmarklet to fix that. The bookmarklet cycles through every DOM text node and replaces the text. When I tried to run it on that page, nothing happened – probably because there were no text-only nodes, because the page was treated as one big tag. So browsers don't handle missing closing tags completely correctly, and therefore one should include closing tags.


If you don't close your tags,

<div>foo<a>bar<span>baz box

how can you tell the difference between,

<div>foo<a>bar</a><span>baz</span> box</div>

and

<div>foo<a>bar<span>baz</span> box</a></div>


By "don't close your tags", I mean "don't close your tags when the HTML spec does not require you to." You still need to close your <a>, <div>, and <span> tags. You don't need to close your <li>, <th>, <tr>, <td>, <dl>, <dt>, and several other tags. You don't even need to open your <html> and <head> tags, if you're not doing funky stuff like putting comments in them.


Rendered HTML that will be sent as an HTTP response should have ending tags. That doesn't mean the programmer/designer has to write them:

  # a dumb Ruby example:
  def p(content)
    "<p>" + content + "</p>"
  end

  p "hello" #=> "<p>hello</p>"


The HTML tag closing rules specify they only get closed when they have to be. So neither of your solutions is correct. It is:

<div>foo<a>bar<span>baz box</span></a></div>


My solution was merely meant to demonstrate the difference between what a programmer writes and what gets sent as an HTTP response.


SGML is fun. It looks like XML, but is way crazier.


See also: Rasmus Lerdorf's discussion of XHP: http://toys.lerdorf.com/archives/54-A-quick-look-at-XHP.html

"...when you combine XHP with HipHop PHP you can start to imagine that the performance penalty would be a lot less than 75% and it becomes a viable approach. Of course, this also means that if you are unable to run HipHop you probably want to think a bit and run some tests before adopting this."


Scala does basically the same thing with XML:

http://www.scala-lang.org/node/131

...although I doubt it does escaping by default. Should be simple enough to add while you're converting the scala.xml.NodeSeq (iirc) to text.

For an API that required both XML and JSON output, Scala's built-in XML support had us wishing the JSON version of the API was as easy as the XML version.


I'm going to give it a try, simply because of this: the Facebook Lite site was written entirely with XHP.


BTW, for those of you interested in installing on Linux, you'll need php5-dev (so on Debian/Ubuntu machines a quick apt-get install php5-dev solves it). Run phpize from the root of the source tree, then the normal ./configure, make, make install, etc.


I just installed this on a fresh Linode with a basic Ubuntu 9.10 LAMP stack. It built ok, and the module shows up in phpinfo(), but when I try to run any of the example code, I get an error:

  Fatal error: Class 'xhp_a' not found in /home/me/public_html/test.php  on line 3
Here's the code I tested with:

  <?php
  $href = 'http://www.facebook.com';
  echo <a href={$href}>Facebook</a>;
I tried running the code through the xhpize tool, and got the following output:

  <?php
  $href='http://www.facebook.com';
  echo new xhp_a(array('href' => $href,), array('Facebook',), __FILE__, 3);
Which produces the same "class not found" error.


I'm having exactly the same problem, same system (Ubuntu Karmic Koala). Prerequisites installed, XHP module installed, it's listed in my phpinfo. For any XHTML tag I include (a, span, ...) the error message says class not included:

  Fatal error: Class 'xhp_span' not found in ...
Any ideas on how to fix would be appreciated. Google and Bing have nothing so far and the module configuration info on the GitHub page is limited: http://wiki.github.com/facebook/xhp/


So, it seems you need to include init.php. Not sure why that doesn't happen automatically.

http://github.com/facebook/xhp/issues#issue/2


Has anyone gotten this to work on OS X?

I get strange errors after I build and add the extension to php.ini.

LOG: http://pastie.org/817528


I have so many damn unnecessary problems with OS X that I've resorted to running a lightweight Debian VM with VMware. It uses ~250 MB out of my 4 GB of RAM and I can just suspend it whenever I'm not coding.

Benefits: I can use apt-get and have all the other conveniences of a true *nix environment. I can update and try out new software easily (I had XHP running in about 5 minutes). Also, I can create snapshots of my OS so that when my environment is just the way I like it, I can always revert right back to it.

I interact with certain paths on my virtual Linux filesystem just as though they were local (like my ~/Sites dir) and have various dev domains in my /etc/hosts file pointing to the VM (which has its own internal IP).

Cons: It's a standalone VM so it consumes a consistent 250 MB of memory. I never really looked at what MySQL/Apache were doing on my Mac previously (I would assume far less), but I haven't really run into a case where the VM is a big issue yet. The convenience far outweighs the chunk of memory it eats up :)

I haven't had a single configuration problem yet ;) For developers the popular phrase is reversed, Debian "just works" and OS X is a pain in my ass.


Working perfectly for me on OS X Snow Leopard. It compiled cleanly.


Got it compiled properly now, had to get one of the versions without the lexer/parser files included.

However I'm still getting 'xhp_a' class not found errors with the basic tests from the wiki, the same as the other comments in the sibling thread.

It seems to happen on several platforms, so it's probably a configuration error somewhere.

Any help would be appreciated.

EDIT: Figured it out. You need to include 'init.php' from 'php-lib' in your php scripts. I copied the directory from the xhp source to my site directory and included them from there.


Maybe I've misunderstood, but this seems to advocate mixing inline HTML and PHP logic - isn't that a huge step backwards in terms of web software architecture? I thought we were all using the MVC model by now...

My head just hurts thinking how utterly unmaintainable all that spaghetti code must be.


Not really. You can still separate MVC style. Templates have always had some display logic, which is ok. What you don't want is intermixed application logic.


In my opinion this "fuzzy" separation is exactly what frameworks like Django (and many existing PHP MVC frameworks) try to avoid - they specifically disallow things like arithmetic in their templating language for this reason. As soon as you get more than one person working on a site, you're going to have an overly ambiguous demarcation point between the presentation and the logic layers and it's going to wreak havoc on the development process. It will take an enormous amount of discipline to have a parallel design/code workflow.

Does this offer any benefits to XML comprehension beyond syntactic sugar that lets echo avoid the use of quotes and removes the god-awful <?php ?> syntax (which puts it on par, at the very best, in my opinion)? Does it handle XML syntax errors gracefully? Can you do native transformations on bound variables, for example, or do any more sophisticated XML DOM-ish tag functions? The post doesn't mention anything about any of these issues, which is where the real advantage would lie. If any of that were possible, you could put the browser DOM (and validation) one step closer to the application logic. As far as I can tell this just attempts to blur the V and C in MVC, where Django tries to replace V with T (template).

As far as I'm concerned this only increases the squinty-eyed "WTF" factor between PHP and other languages.


I definitely agree. I'm moving my Rails app to the mustache template language so that I'm forced to keep every single piece of logic (and I do mean that. Mustache allows only conditional statements, for each statements, and the outputting of variables) out of my HTML templates.


The article claims that it's stored as a DOM and can be manipulated as such. To me the big win, though, is that using this means things should get escaped properly by default.


I might be crazy but I swear I didn't see that they addressed every one of my concerns... wtf?


This doesn't say that at all. The rendering code you write needs to be modular as well, and has quite a bunch of logic built into it (even when all the model and controller logic is separate.) So when you write your renderers as classes or functions, that is when you realize the benefit of XHP.


Ugh, I dislike this a lot. I'm one of the guys who actually likes PHP, so this may be a bit skewed, but what's wrong with PHP's existing alternate syntax? Most people don't know about it, but it's clean and easy to follow:

  <?php if ( true === $some_value ): ?>
    <div>display this div</div>
  <?php else: ?>
    <div>display this div instead</div>
  <?php endif; ?>
This way, you can keep basic logic in your templates (it's inevitable and convenient), it's still PHP (there's endforeach, endfor, endwhile, etc.), and this method of templating is very clean.

You can now have a class that sets variables through __set(), loads up a .phtml file, starts output buffering, renders the file with those variables, and then returns it.

You can extend it further to automatically sanitize output variables for XSS and whatnot, cache output, etc. This way, you don't need some overly verbose system like Smarty to do what PHP does already. XHP just looks like another Smarty: solving a problem that I really don't think exists.

Edit: Ok, I probably shouldn't say I dislike this a lot, I do love seeing Facebook sticking with PHP and ultimately helping it out.


That sort of code can get really messy when you have to put PHP code inside the DIVs.

I think I might like the more concise, readable syntax that XHP offers for interpolating {$variables} (and any PHP code for that matter).


I'm using PHPTAL right now to achieve more or less the same effect, but it's not the easiest solution - it becomes a bit cumbersome for long fragments.

XHP looks very promising because it solves one of my problems with PHPTAL - generating complex content in a loop (in PHPTAL, having multiple conditions in a loop is possible, but not exactly elegant).

For example:

  <?php
    $list = <ul />;
    foreach ($items as $item) {
      if ($item->bold) {
        $list->appendChild(<li><b>{$item}</b></li>);
      } else if ($item->foobar) {
        $list->appendChild(<li><i>{$item}</i></li>);
      } else {
        $list->appendChild(<li>{$item}</li>);
      }
    }
  ?>


Another awesome example of the lengths people will go to compensate for a lack of macros.

I would so much rather see

  (define table (html-table (map [tr (td _)] rows)))

than

  $table = <table />;
  foreach ($rows as $row) {
    $table->appendChild(<tr><td>{$row}</td></tr>);
  }

(assuming this is even possible?)


I was just about to make this comment.

I think one of lisp's adoption problems is that it makes things that are just simply amazing become commonplace.


Very, very cool. I can't wait to dig into this more... it looks like even the basic examples are cool, but there's lots of stuff under the hood waiting to be discovered.


It is indeed one of the coolest things I have used. The fact that you have all the HTML stored as objects and available for manipulation makes all kinds of crazy post-processing (just before rendering) possible.
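
For anyone wondering what that kind of post-processing looks like in practice, here is a small Python sketch using the standard library's ElementTree as a stand-in for XHP objects. The "add rel=nofollow to every link" rule is just an example chosen for illustration, not something taken from the XHP docs.

```python
import xml.etree.ElementTree as ET

# Build the markup as objects instead of concatenating strings.
page = ET.Element("div")
intro = ET.SubElement(page, "p")
intro.text = "See "
link = ET.SubElement(intro, "a", {"href": "http://example.com/"})
link.text = "this <site>"

# Post-processing pass just before rendering: tag every link as nofollow.
for a in page.iter("a"):
    a.set("rel", "nofollow")

# Serialization escapes text content, so "<site>" cannot become a real tag.
out = ET.tostring(page, encoding="unicode")
print(out)
```

Doing the same thing on a rendered string would mean regex surgery on HTML; on a tree it is a three-line loop.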


Seems to me that if they went to the trouble to make it understand XML syntax and error out on invalid code, they should have just made it auto-close tags. On pages where there are heavily nested divs and other things, auto-closing the tags would make the code smaller while still generating valid XHTML and not bothering the developer with such trivial things.

     echo <div><strong><em>blah;
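
A rough Python sketch of the auto-closing idea, using the standard library's HTMLParser to track which tags are still open. AutoCloser and auto_close are made-up names, this is not something XHP offers, and the sketch ignores void elements such as a bare <br>, which it would wrongly try to close.

```python
from html.parser import HTMLParser


class AutoCloser(HTMLParser):
    """Tracks open tags so that any left unclosed can be closed at the end."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, if there is one.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass


def auto_close(fragment):
    parser = AutoCloser()
    parser.feed(fragment)
    parser.close()
    # Close whatever is still open, innermost tag first.
    closers = "".join(f"</{tag}>" for tag in reversed(parser.open_tags))
    return fragment + closers


print(auto_close("<div><strong><em>blah"))
```

This can only close everything at the very end of the fragment, which is exactly the ambiguity raised elsewhere in the thread: it cannot know where the author wanted an inner tag to stop.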


One last feature of XHP, which has been invaluable to us at Facebook, is that you can define your own elements which can condense a complex component into a simple XHP tag. XHP has a rich collection of declarations which let you define new elements, configure their expected attributes, as well as describe their content model.


While I understand where you're coming from, I feel like that would result in incredibly confusing markup and only lead to increased headaches down the road. I'm glad they left that out.


This sure would make working with WordPress themes easier.


The useful abstraction might mean that they wouldn't need to be GPLed, especially if combined with a sane view/template separation (like Django's: the template just gets a dict as input and nothing else).


Sure, let's take two clusterfucks, php and xml, and smash them together! Great idea!

(Actually, this looks rather handy.. but there is a certain amount of initial WTF.)


Looks like E4X, where security is a problem:

http://sla.ckers.org/forum/read.php?2,20408

Any code that mixes operations and data is dangerous. That's how overflow exploits, injection, and XSS all work.



