I'm maintaining a huge web application mostly written in PHP. We (well, back then it was just me) began doing this back in 2004.
Back in 2004, PHP was a very sensible solution: It was the language I knew best (we had little time for the project, so going with a language I knew felt sensible), it was easy to deploy and back then, there weren't that many alternatives anyways:
Ruby was in its infancy, for Python you had thread safety issues with mod_python or you went CGI, Java was and continues to be just ugly. JavaScript back then was still just a toy language. No Node.js or anything.
That would have left me with mod_perl, but looking at where we are today, that would have been an even worse decision it seems.
Fast forward 6 years.
The application consists of over 100'000 lines of PHP code alone now. It's in production use for many customers which serve tens of thousands of end users. It's not only a traditional web application, it also serves as an API for custom-made tablet pc applications (before we had the iPad), for GPRS-connected barcode scanners and last, but not least, for native Windows Clients (all developed by our company, using various languages).
While I really hate some aspects of PHP by now and I would love to have a Ruby or Python codebase to work with instead, rewriting all of this is out of the question.
Customers depend on this to work exactly the way it works now (they panic even if a link is two pixels off - welcome to the enterprise).
While I might be able to exchange some components with something else, I don't see the benefit it would provide - it would do nothing but make maintenance harder because I'd add another dependency to keep track of.
The only thing I could do is rewrite the thing. But by now, there's more than 30 man-years of work that went into this.
Sure. Redoing it wouldn't take the same amount of time, but considering it would have to look exactly the same (probably I couldn't even get customers to accept different URLs), where's the point in that?
OTOH, despite being done in PHP and tailored to sometimes crazy customer requirements, the code base is sufficiently clean to work with and it's constantly improving. Bad parts get factored out, good parts arrive, so it's not all-bad.
We are embracing new technologies as they become available and fit our product. Our CSS is now written in SASS, we moved from pure DOM scripting to Prototype to jQuery, we make use of the latest features of PHP (now Closures and anonymous functions from 5.3) and of our database (constantly running latest Postgres).
Even though it's PHP, it can still be fun.
Considering recruitment: Granted. It might be harder to convince a good programmer to work on this "ugly" PHP project. But a) we are not just doing PHP (just mainly), b) the code base is, as I said, quite clean and c) even though the code base might be in a language you don't like, the basic concepts of our profession still apply.
You can still discuss and solve interesting problems and you can still create great solutions to these problems.
If you don't want to take part in this adventure just because you don't like the language this is done in, then, frankly, you are not the person I want to hire.
Even though programming is the coolest thing you can do in this world, it's still a job and not everything can always be unicorns and rainbows. If you can't see this, then I don't need you.
I like everything about your post except the bit at the end. It seems like those words are taking things personally. Look, I worked with ReallyBigCo, a $44B business. They used Oracle. Java. SQLServer. .NET. And MUMPS. So I have PHP beat hollow for uncool technology. I'll say it again. MUMPS.
If their CTO and I traded jobs, I wouldn't rewrite everything in other languages. I'd continue to solve interesting problems and recruit the best people I could. But I wouldn't take it personally if some of the people in the marketplace chose to pass on solving interesting problems in our environment with our tool chain.
It's not personal when I pass on someone talented who isn't a fit for us, and it wouldn't be personal if someone else passed on working for us because our tool chain wasn't a fit for them. Unicorns and rainbows don't enter into it.
It's like.... Oh I don't know, perhaps it's like locating your office in Toronto instead of the Valley. Some applicants want to live and raise a family in Toronto, some want to be where the action is in the Valley.
We solve interesting problems all the time in Toronto. But no, I'm not opening an office in the Valley for those who want to live there, just as you aren't doing work in Python just because there are talented people who want to use it.
I can respect that while simultaneously respecting those programmers who give your job a pass because they don't want to work with your tool chain. Just one of those things.
I agree that the end bit might have come out a bit too personal. I wrote this after reading the excellent linked article and then seeing the first comment on HN which was
"Why not transition to a different language over time? One that more great programmers want to use?" which just isn't something you'd realistically do.
And then I thought about the fun we have here in the office, and about the nearly two hours of discussion I had with a coworker about race- and lock-free storage and merging of shopping baskets, and it hurt me to think that people would throw all that away just because they don't like the language, even though it was the only viable option when all of this started.
I might have been carried away.
So: Sorry. I didn't mean to insult anybody. I just think it's really shortsighted to judge a project, a team and a company based on the choice of language that might have been used at one time.
I'm leaving a PHP job soon to start a Ruby job. The language was not the decisive factor, but it was a factor - I wanted to move to Ruby at some point.
The company I'm leaving solves complex, interesting problems in solid PHP code. There are developers here who are a lot smarter than I am. So my attitude is far from "you guys suck" or "I'm better than you."
But when I look out over the programming landscape, I see a lot more energy and activity right now in the Ruby community. A lot more people building new things and inventing new best practices. This is a simplification, but it seems like great ideas come from Rails and move back to PHP eventually.
PHP jobs run the gamut from high-tech and awesome to grunt work. Rails jobs tend to be more cutting-edge, because the technology is newer and there's just more stuff happening there.
So for me, the question is this: if the Ruby / Rails community is the leading edge for new ideas, and the AVERAGE (not to say all) developer in that community is better, and the AVERAGE job in that language / framework is cooler, doesn't it make sense for me to move that way? All other things being equal, isn't that a good career move?
It isn't snobbery, it seems like the most pragmatic thing to do. This is on top of the fact that I genuinely like the Ruby language a lot more than PHP.
So I don't think you have to feel slighted personally. And I'd expect this cycle to repeat someday: there will be a lot of Ruby / Rails apps that need maintaining, and a lot of developers will prefer some hotter, newer thing. Just the way it goes. But it's nothing personal.
The answer may well mean sticking with a flawed technology that nonetheless is serving the business well. Remember there is still lots of COBOL chugging away.
2) What is best for me?
If it's my business, but I'm sick of PHP and would like to switch to Ruby, Lisp, Haskell, whatever, it might be better to sell the business and start a new one based on the new language, rather than risk a rewrite. Isn't that part of the freedom being a founder was supposed to buy?
I hate to say this and I know I will be downvoted by "hackers", but I really don't understand why people (who know little or nothing about Java and blindly follow the norm) always pick words like "ugly" and "stupid" for Java. What makes PHP superior to Java, when with Java I can program anything from a stand-alone app to a webapp, from client side to server side, from desktop to mobile app? If you are a coder, you must pick a language and a standard lib to master, and I would say that picking a language that only sticks to a very specific application platform (web-only, desktop-only...) is a very bad choice. I started from C/C++ and moved to Java, and I feel very comfortable using/studying Python, C#, Scala and Ruby... but PHP, get off, never and never ever!
One last thing (to convince more PHP people to downvote me) is, if you are spending most of your time on PHP or a web-only-language, you will never see the beauty of asynchronous I/O, socket programming, threading, hooking...
Update: When I mentioned "web-only-language", I was thinking of people who use, and only use, Ruby with RoR for web apps. I don't know how many developers can't distinguish between Ruby and RoR, but I guess it's not a small number. And of course, what I said is directed at those who aren't willing to learn new things. They always think about web and only web.
Anyways. I have some reasons for my strong dislike of Java: a) checked exceptions, b) no method pointers or something similar and c) lots of the code produced by the community out there (and in the standard library itself) is full of FactoryFactoryFactories and other typing-intensive, mind-bending and ultimately useless abstractions (most of them not DRY at all either).
Back in 2004 I did strongly consider Java though, but ultimately, I didn't have time to implement this web application AND learn a new library (learning the language is easy. learning the library is what makes you slow in the beginning).
As a side note: Said web application also accesses locally connected barcode scanners over the local serial port. Unfortunately the only way to do this (aside of a locally installed client) is still using a Java Applet which I've also written back in 2004. So I do have the Java experience to know that I don't quite like it :-)
Java throwables come in two flavors: Exceptions like IOException which must be checked, and Errors like AssertionError which do not need to be checked. If you really, really don't like checked exceptions you can easily build libraries and write code that rely exclusively upon unchecked exceptions. I wouldn't personally recommend this design methodology.
Errors and unchecked Exceptions in Java should not be confused. An Error is typically reserved for the runtime environment for a "this ship is sinking, abandon all hands" kind of unrecoverable error. More properly Exceptions in execution that aren't checked in Java extend RuntimeException and not Error.
The stigma that Java has too many checked exceptions is no longer true with modern Java code. Everyone is using unchecked children of RuntimeException almost exclusively. Of course, there is still plenty of legacy code out there using outdated checked exception paradigms.
lots of the code produced by the community out there is full of FactoryFactoryFactories and other typing intensive, mind-bending and ultimately useless abstractions (most of them not DRY at all either)
So what? Why does the code produced by others in "the community" affect your perception of the language and/or tools, if you aren't using their code?
Whether you fall in line or not, the culture around Java is around writing code to a certain style. Agree or disagree, there's an argument that by writing in a style that is familiar to Java programmers, your code is easier to read and maintain.
If you're one guy off in a corner, write however you want. But if you're not, you have to take the culture into consideration. And if you are one guy in a corner, why are you using Java?
I just started working for a large enterprise corporation (one of the biggest financial companies in the world). We have a lot of old code written in Java that needs to be maintained. I've been trying to advocate new ways of doing things, but almost every time the other developers will respond with, "that's just not how we do things here."
Java has a very strong culture around it. I worked in academia and startups previously and I didn't realize that there is a huge number of Java programmers who do nothing but program in Java. They aren't interested in learning new languages and will only grudgingly learn a new framework. Their biggest concern is making sure the lowest common denominator can still maintain the (unmaintainable) code.
An example is unit testing. Our current "unit" tests start up a JBoss instance, connect to the development databases, and take ~5 minutes to run just one test. But I've been told not to waste my time working on anything more modular and that if I'm going to put in any time working on unit tests, I should contribute to the framework everyone else is already using.
I'm already doing it. I set up cucumber + webrat to run through some quick UI tests. I wrote a mock-object framework into the last feature I designed so I could test it. There's still a huge resistance to change. I'm just hoping I can show how useful it is in the long run. In the short run it just looks like I'm wasting a lot of time tinkering on silly side projects.
I don't think that your company's conservative attitudes are because of their usage of Java.
I think that "one of the biggest financial companies in the world" would be just as conservative and anxious about the tiniest technology changes with any language: Ruby, Perl, or Fortran.
The attitude you describe sounds like it has more to do with being a large financial institution and the type of place where software development is a "cost center", not a profit maker.
It is MUCH easier to show someone how something like a single 5 minute unit test holds them back from being bigger better badder. Unit testing should take less than a minute for ALL of them in a project to run.
Ask for forgiveness after you do something...
The other way to go about it is to 'gingerly' find the single ally, and build on that.
Because when we call Java "ugly" or "stupid," it's the community we're talking about. My room mate is an extremely talented Java programmer, and he can fly around Eclipse[1] like a giant rainbow steamroller[2]. He uses Java like it should be used, and it's great. But I don't think he represents the community.
He tells me horror stories of code he refactors at work written by people in his own office, or worse, outsourced companies, and it's bad. It seems to me that that kind of code is more representative of the community.
When you're going to join a project, you have a better chance of encountering code not sucking if the overall culture of that language is better.
Ha, if you are thinking that way, I would suggest VB.NET or C# instead. Why? I remember the first time I created an MFC project using the Project Wizard: I felt like a fool looking at a bunch of auto-generated text and having no clue what it was about. And I was in the same situation again when I used VS.NET to generate code for my first ASP.NET web app. But I still completed that webapp without even knowing what the code was for. Does that make the VB.NET/C# community more stupid than Java's? Does that mean C#/VB.NET should be the ugly and stupid thing rather than Java? So you had better find another way to explain that.
personally, I prefer the dynamically typed languages.
but I hate. No. HATE PHP's type conversions with its == operator.
0 == 'foobar'
but
true == 'foobar' && 0 == false
so
true == false ?
eek.
Yeah. I know === exists. But if you have to compare strings and numbers, why is the default conversion method you do the lossy one? If you compare a number to a string, why can't you convert the number into a string and compare the two strings? Why convert the string into a number which will be lossy in most cases?
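To make the difference concrete, here is a minimal sketch that sticks to comparisons which, as far as I know, behave the same way across PHP versions. The variable names are made up for illustration, and the last comparison shows the "convert the number into a string" approach done by hand, since `==` will not do it for you:

```php
<?php
// Loose comparison (==) converts operands before comparing;
// strict comparison (===) also requires the types to match.
$looseNumericStrings  = ('1e3' == '1000');   // both are numeric strings, so they are compared as numbers
$strictNumericStrings = ('1e3' === '1000');  // compared as plain strings

$looseZeroFalse  = (0 == false);   // both sides are compared as booleans
$strictZeroFalse = (0 === false);  // int versus bool, so not identical

// Hypothetical manual "number to string" comparison: the string is never converted.
$number = 0;
$string = 'foobar';
$asStrings = ((string) $number === $string);  // '0' versus 'foobar'

var_dump($looseNumericStrings, $strictNumericStrings, $looseZeroFalse, $strictZeroFalse, $asStrings);
```

This should dump true, false, true, false, false. The `0 == 'foobar'` case is left out on purpose, because its result depends on the PHP version you run.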
Sometimes it is "easier" to just type:
if (isThisReturningEmpty()) {}
instead of typing:
if (isThisReturningEmpty() === "") {}
And if you know something could return 0 (which is not empty), you should know to do a type-sensitive check. You are trading off HAVING to set types against HAVING to check types when needed (I think the latter is better).
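A small sketch of that trade-off, assuming a hypothetical `lookup()` helper (not from any real code base) that may hand back 0, '0' or an empty string:

```php
<?php
// Hypothetical helper: returns whatever value is stored under $key.
function lookup(array $data, $key) {
    return $data[$key];
}

$data = array('count' => 0, 'code' => '0', 'name' => '');

$looseEmpty  = array();  // keys that fail a plain truthiness check
$strictEmpty = array();  // keys whose value is exactly the empty string

foreach (array_keys($data) as $key) {
    $value = lookup($data, $key);
    if (!$value) {
        // 0, '0' and '' are all treated as false here
        $looseEmpty[] = $key;
    }
    if ($value === '') {
        // the type-sensitive check only matches a real empty string
        $strictEmpty[] = $key;
    }
}

echo implode(',', $looseEmpty), "\n";
echo implode(',', $strictEmpty), "\n";
```

The short form is fine as long as 0 and '0' really should count as empty; the strict form is the one to reach for when they should not.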
Note that this isn't anything to do with dynamic typing: Python is very dynamic, but it doesn't have this weird coercion (or is it "automatic type conversion"?).
I hate this as well, but, I love dynamic typing. They aren't the same thing.
there are none, but to implement a callback you also wouldn't have to declare an inner class implementing some interface that in turn declares tens of methods just to react to that one callback.
In PHP 5.3 you'd just pass the function (which of course will pass a pointer to that function) and in earlier versions you'd hack something with eval or variable-variables which, while bad, is still better than either writing half a screen full of empty methods or inheriting an inner class from some meaninglessly named class that only exists for you not to have a screenful of empty methods.
In all versions of PHP, callbacks are extremely easy (no eval or variable-variables required). You just have to pass the name of the function or method around and use the call_user_func() function to call it.
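For illustration, a minimal sketch of both styles. `shout`, `Greeter` and `$double` are made-up names, and the anonymous function syntax needs PHP 5.3 or later:

```php
<?php
// A plain function, referenced later by its name.
function shout($text) {
    return strtoupper($text) . '!';
}

class Greeter {
    public function greet($name) {
        return 'Hello, ' . $name;
    }
}

// Works in old PHP versions too: pass the function name as a string...
$shouted = call_user_func('shout', 'hi');

// ...or an array of (object, method name) to call a method.
$greeting = call_user_func(array(new Greeter(), 'greet'), 'World');

// PHP 5.3+: an anonymous function can be stored and passed around directly.
$double = function ($x) {
    return $x * 2;
};
$doubled = array_map($double, array(1, 2, 3));

echo $shouted, "\n", $greeting, "\n", implode(',', $doubled), "\n";
```

Either way, no interface or inner class has to be declared just to hand a single callback to someone else.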
I disagree that you have to pick one language and one language only to be really good at. At least after you've become a decent programmer. As a polyglot (I can start a project comfortably in PHP, Python, Ruby, JS, Erlang, and wouldn't feel too out of place working in Clojure or Objective C), I think a "master" programmer is someone who's gotten to that point to realize that a programming problem is a programming problem regardless of language. Language is only the syntax you use to formulate your answer.
I do agree with what you say, but there is a gamut with 'programming problems' on one end and 'work' on the other. Especially on the web, most of programming is work, not problem solving. When doing work in a language/framework, knowledge of other languages/frameworks can distract you. For example, a Java-only specialist would know whether substring takes a start offset and a length or start and end offsets as arguments. I would have to look it up (or wait for the IDE to help me), but I do know that C# and Java disagree about it.
Until recently, and ignoring a brief foray into Limbo, I've been a PHP-only (or at least PHP-focused) programmer. I've been working with Python more in the past year as I've been doing more work with data processing and statistical analysis, for which PHP really doesn't have the tools.
Having said that I wonder if missing the 'beauty' of async I/O etc has more to do with the projects people take than the language. Would a python or ruby developer who focuses on creating web apps have any more familiarity with those concepts than a PHP developer?
because PHP provides practically NO means for async I/O whereas the other languages do.
One of the reasons for this is that PHP was designed to quickly handle single HTTP requests. The scaling is meant to be done on the app-server side, and the single request that your script is serving at a given time will take as long as it takes anyways.
So you don't really need the async I/O (in theory).
Python and Javascript (and to some degree ruby) rely on their own web servers implemented in their own language, in many cases with no or bad (GIL) concurrency at which point it gets more interesting to move into an event based model where it becomes imperative that operations don't block.
There async I/O becomes important.
So: PHP: concurrency by firing off another apache/fastcgi process or thread. Don't worry about blocking on I/O.
node.js and some python/ruby frameworks: concurrency by using an event based system. Because a single blocking operation stalls the whole server, every operation needs to be quick. async I/O becomes important.
Of course the evented model has huge advantages too: You worry much less about races, you get huge performance with a simple architecture and you can potentially handle much more concurrency (because each thread/process consumes resources that your one evented process does only once).
Both paradigms are interesting, but having first-class functions certainly makes an evented model more convenient to work with.
Async I/O is important any time the connection needs to be kept open while processing the request under heavily concurrent conditions; it's less about not blocking on I/O than it is about avoiding the overhead of context switching in the kernel and the extra resources of keeping a thread / process alive for the duration of the request. I don't see it as less or more important in a PHP context than in an event-based model. However, without an automatic CPS transformation (continuation passing style) of the source of your request handling logic - in particular, continuations at the boundary points of all I/Os - you do need to write to a pattern which is in effect event-driven.
PHP is pretty much considered web-only. Sure you can use it on the command line (never seen it myself) and yes you can do Gtk stuff with it but that doesn't change the perception of it even if you can argue differently.
EDIT: I don't hold that view because of the things I mentioned but I am pretty sure a lot of programmers do.
I have also been writing PHP command-line scripts, or rather maintaining them.
They are a bad idea, in general. PHP is designed for a very specific purpose: Very quickly building web pages in a CGI or mod_php environment and spitting them out. Everything else is an afterthought, and it shows.
For example, if you try to manipulate the filesystem extensively using PHP you will eventually trip over its "stat cache". PHP caches the result of stat() calls, presumably assuming that, hey, it is more important to avoid redundantly calling stat() during the time-sensitive rendering of your web page than it is to actually return correct information about the state of the filesystem. I mean, how often do symlinks change or files get moved during the rendering of a typical web page? And how much web-page-rendering code really depends on being able to read a link, then read the link again after the link has changed on disk? You can afford to ignore that stuff at the language level, if you're PHP.
The result is that you have to learn about the stat cache and remember to call clearstatcache() all the time when manipulating the filesystem in PHP.
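A minimal sketch of that habit, assuming a scratch file in the system temp directory (the variable names are made up; the functions are standard PHP):

```php
<?php
// Create a scratch file to play with.
$path = tempnam(sys_get_temp_dir(), 'stat');

file_put_contents($path, 'abc');
$sizeBefore = filesize($path);   // this result may now sit in the stat cache

file_put_contents($path, 'abcdef');

// Drop cached stat information so the next call asks the filesystem again.
clearstatcache();
$sizeAfter = filesize($path);

unlink($path);

echo $sizeBefore, ' -> ', $sizeAfter, "\n";
```

Without the clearstatcache() call, the second filesize() is allowed to return the stale cached size, which is exactly the kind of surprise you do not want in a long-running filesystem script.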
That's just one example of why it's better to use Ruby or Perl to write command-line scripts. These languages were designed with command-line scripting in mind. Indeed, this is the flip side of the reason why PHP eventually drove out Perl as a web development language: Perl was originally designed for command-line scripting, and PHP was originally designed for the web. Use tools for their proper purpose.
Java was originally designed for programming embedded devices. Python was originally designed as a teaching language. One anecdote suggests that Lisp's original designers always intended to add syntax to the language.
Perhaps a decade after a language's invention it's possible to discuss its current suitability and design for specific tasks more than its original design intent.
(I consider PHP's relative ease of deployment over everything else far more explanatory of its ubiquity than any original design intent.)
Things like "stat cache" and "memory leaks" are what I watch out for when I write command line scripts in PHP. Also, I do not run my scripts as a daemon and I make sure that they do not run for more than a set maximum execution time.
So yeah, it may not be the perfect choice, but it beats having to learn another language when I know what to look out for when I write a command line script in PHP.
Now with node.js, it isn't as clear cut as you make it out to be. It's not there yet, but in the not so distant future, you can use node.js for your scripting needs.
Running on v8, it has an edge in terms of execution speed over other scripting languages. And the javascript syntax and semantics is something most of the developers are familiar with.
If you don't like writing nested closures, you have coffeescript which makes the code pretty, and also brings in some new features viz. list comprehensions, splats, statement modifiers etc.
I have written a certain amount of Javascript this year to run on Windows machines. This was because a) I needed the scripts to run on Windows, b) I did not wish to have to install another language (say Python) on the machines, c) I really really did not want to do them in VBScript.
I think you unintentionally (or maybe intentionally) make a good point here. Regardless of language, re-writing your entire code base is almost always a bad idea. If you have 100,000 lines of PHP, it's probably going to stay that way. If you were motivated properly, you could start to migrate things over to language 'X' slowly, but even then the benefit may not outweigh the cost.
Nitpick: "Ruby was in its infancy" - Ruby was created in 1995, and was certainly quite usable in 2004. I think maybe you mean Rails, which was released in July of that year according to wikipedia.
Well, you had Cerise and IOWA for Ruby Web frameworks then, also Wee, though Wee was not so robust. But IOWA was quite capable of fast, scalable Web sites. For whatever reasons it never got the attention it deserved.
I am interested to learn from experienced PHP developers about certain aspects of PHP programs. We currently maintain a social game which handles a few million transactions every day (through JSON-RPC, not HTML page rendering), and it appears to be CPU-bound (we use Redis as the db backend). Based on my understanding, Facebook is CPU-bound too, and that's the reason behind their custom HipHop. What percentage of PHP programs out there in the wild are CPU-bound? I ask because before I got my hands dirty with PHP a few years ago, I heard many times that web applications were usually bound by network/disk bandwidth rather than CPU.
As I said: I'm all for adopting new technologies (newer projects than this web application were done in Python 3 and lately node.js).
I'm also reluctant though to introduce more dependencies for the sake of porting perfectly working code to another language I might like more.
It's not just that maintenance will be harder; it's also that in some cases our customers (very traditional enterprise. IE6-on-NT4-traditional) provide the server for this to run on and I can't present them with a huge list of services to install and monitor.
Well. Maybe I could, but my conscience can't deal with presenting them with that list just so that I potentially have more fun while programming.
One of the things I'm currently looking into though is Websockets for a specific component of the app and in that case, I SO clearly see a huge advantage in Node.js that I will certainly go node if we decide that the shortcomings in the way we are currently solving that problem warrant the change in the architecture.
Which brings me to the same conclusion as my initial post:
Rewrite for additional functionality or fixed issues: Yeah. Rewrite for the sake of having it rewritten: I'd rather not.
If I was talking to you about a job, and I knew this ahead of time, I'd be a lot more likely to consider it even though I'm a python guy.
The fact that you made sound engineering decisions and continue to improve on what you can (while being practical) means more to me than your language choice, even if I'd rather be dancing with Django :)