
Ah, a fellow mod_perl hacker. :)

I'm using Apache::DBI, caching everything via startup.pl on load, keep_alive is on but with a very small timeout, host lookups are off.

Also I'm using the worker MPM in apache2, just FYI. I've found it to be really memory efficient.

I've been using MySQL with Apache::DBI for years and it's usually brilliant - I ran WorkZoo.com, a high traffic job search engine with a combination of MySQL and a full-text API.

With Feedjit I'm basically storing weblogs. I either have to dump them into a single table and query that - which is what I was doing, and the high query rate with mixed reads and writes was a problem - or have lots of individual tables, which isn't feasible after about 500 with MySQL. So small files work best for me.




We're talking one file per unique domain or per unique url?

Also, I take it from your comment that the bottleneck was in MySQL doing the writes. I assume the read side is indexed appropriately so MySQL finds the right part on the disk almost instantly. Do you think it is a locking issue then, e.g. table lock vs row lock? (Forgive me, I haven't used MySQL in a while.)


One file per URL. The problem with MySQL is pretty much what you've described. I have (had) an index on a table that gets read a lot by the application. It's amazingly fast - MyISAM tables really rock for fast reads on indexes. But therein lies the problem, because it also gets written to a lot. Every time it gets written to, MySQL needs to lock the table and rebuild the index.
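
To make the one-file-per-URL idea concrete, here is a rough Python sketch. It is not the actual Feedjit code: the hashing scheme, the two-level directory sharding, and the names `path_for_url`, `append_hit` and `read_hits` are all assumptions made up for illustration.

```python
import hashlib
import os


def path_for_url(root, url):
    """Map a URL to its own log file under root (hypothetical layout).

    The SHA-1 hex digest gives a filesystem-safe name, and the first four
    hex characters are used as two directory levels so that no single
    directory ends up holding every file.
    """
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return os.path.join(root, digest[:2], digest[2:4], digest + ".log")


def append_hit(root, url, line):
    """Append one log line to the file that belongs to this URL."""
    path = path_for_url(root, url)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line.rstrip("\n") + "\n")
    return path


def read_hits(root, url, limit=10):
    """Return up to `limit` of the most recent lines logged for this URL."""
    path = path_for_url(root, url)
    try:
        with open(path, encoding="utf-8") as fh:
            lines = [raw.rstrip("\n") for raw in fh]
    except FileNotFoundError:
        return []  # nothing logged for this URL yet
    return lines[-limit:]
```

A write only touches the one small file for that URL and a read only opens that same file, so there is no shared index for the two to fight over.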

You can improve things a bit by using INSERT DELAYED. When you use that, MySQL doesn't guarantee that it'll insert the row immediately, but the query returns immediately when you do the insert (it doesn't block), and MySQL queues up the inserts and writes them in bulk when it feels like it. The non-blocking, bulk inserts that INSERT DELAYED gives you speed things up, but only to a point, because you're still constantly rebuilding an index on a table that's getting a lot of reads.
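
The queue-then-bulk-insert behaviour can be imitated on the application side. The Python sketch below is only an illustration of that idea, not INSERT DELAYED itself: it uses the standard library `sqlite3` module as a stand-in for MySQL, and the `hits` table, the `BufferedInserter` class and the `make_demo_db` helper are hypothetical names.

```python
import sqlite3


class BufferedInserter:
    """Queue rows in memory and write them in one batch (illustrative only)."""

    def __init__(self, conn, batch_size=100):
        self.conn = conn
        self.batch_size = batch_size
        self.pending = []

    def add(self, url, visitor):
        """Queue a row; the database is only touched when the batch is full."""
        self.pending.append((url, visitor))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Write all queued rows in a single transaction; return the row count."""
        if not self.pending:
            return 0
        with self.conn:  # commits on success, rolls back on error
            self.conn.executemany(
                "INSERT INTO hits (url, visitor) VALUES (?, ?)", self.pending
            )
        count = len(self.pending)
        self.pending.clear()
        return count


def make_demo_db():
    """Create an in-memory database with an indexed table to insert into."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hits (url TEXT, visitor TEXT)")
    conn.execute("CREATE INDEX hits_url ON hits (url)")
    return conn
```

As with INSERT DELAYED, a queued row isn't visible to readers until the batch is written, and the index still has to absorb every one of those rows in the end.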

Mark.



