Sunday, February 7, 2010

PHP and twURLa

A few weeks ago I started work on a project called twURLa, a site that tracks domains on Twitter and ranks them. Over those few weeks I learned a lot about PHP performance, which has been very beneficial, yet stressful. Here are a few things I learned:

  • Sockets are awesome, streams suck
  • Non-blocking is annoying
  • Debugging is very hard with very unpredictable data
  • JSON is better than serialize
  • Disks are extremely slow
  • A simple VPS can power twURLa
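
The list does not say in what sense JSON is better than serialize. One plausible reason, and it is only an assumption here, is output size. The sketch below encodes a made-up record (the field names are hypothetical, not twURLa's real data) both ways and compares the lengths. Speed depends on the PHP version and the data, so it is worth measuring on your own setup.

```php
<?php
// Hypothetical record, loosely modeled on what a domain tracker might store.
$record = [
    'domain' => 'example.com',
    'count'  => 42,
    'urls'   => ['http://example.com/a', 'http://example.com/b'],
];

// Encode the same data in both formats.
$asJson = json_encode($record);
$asPhp  = serialize($record);

// Both formats round-trip a plain array like this one without loss.
$fromJson = json_decode($asJson, true); // true => decode to associative arrays
$fromPhp  = unserialize($asPhp);

// Print the encoded sizes so the two formats can be compared.
printf("json: %d bytes\nserialize: %d bytes\n", strlen($asJson), strlen($asPhp));
```

For plain arrays of strings and numbers the JSON string comes out shorter, and it can be read by non-PHP tools. serialize still has its place when PHP objects need to keep their class, which JSON does not preserve.
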

We started out using streams to connect to all the sites we process, which turned out to be nowhere near fast enough. After switching to sockets, I had a lot more control and was able to get a single PHP script to process hundreds of URLs per second. Throughout the process, debugging was difficult because our test data was a live stream of Tweets from Twitter. What we did was save portions of the feed, which I would then process manually and compare against the script's output. The thing is: it took me an hour to process what the script did in 2 seconds.
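
To give an idea of what non-blocking sockets look like in PHP, here is a minimal sketch of the reading side only, using the sockets extension. It is not twURLa's actual code: the function name `readAllNonBlocking` and its parameters are made up for illustration, and it assumes every socket passed in is already connected and has had its request written.

```php
<?php
// Hypothetical helper: read from many connected sockets at once until each
// peer closes its end. $sockets maps any key (e.g. a URL) to a socket.
function readAllNonBlocking(array $sockets, int $timeoutSec = 5): array
{
    $buffers = [];
    foreach ($sockets as $key => $sock) {
        socket_set_nonblock($sock); // reads will no longer block the script
        $buffers[$key] = '';
    }

    while ($sockets) {
        $read   = array_values($sockets);
        $write  = null;
        $except = null;

        // Wait until at least one socket has data (or has been closed).
        $ready = socket_select($read, $write, $except, $timeoutSec);
        if ($ready === false || $ready === 0) {
            break; // error or timeout: return whatever has arrived so far
        }

        foreach ($read as $sock) {
            $key   = array_search($sock, $sockets, true);
            $chunk = socket_read($sock, 8192);
            if ($chunk === false || $chunk === '') {
                // Empty string means the peer closed; false means an error.
                socket_close($sock);
                unset($sockets[$key]);
            } else {
                $buffers[$key] .= $chunk;
            }
        }
    }

    return $buffers;
}
```

In a crawler, each socket would first be created with `socket_create()`, connected with `socket_connect()`, and sent an HTTP request before this loop starts. The point of the loop is that one slow site no longer holds up the others, since the script only reads from sockets that `socket_select()` reports as ready.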

Our VPS is powered by Fivebean. Fivebean has been extremely helpful, and without them twURLa would not be where it is now. We had a very low budget, and Fivebean allowed us to work around this and get our site up and running without trouble. Their support is very knowledgeable and fast, with an average response time of 10-15 minutes.

James Hartig

