Those of you who have followed my blog may remember a proof-of-concept PEAR.phar, a single-file version of PEAR, introduced back in April. That PEAR.phar was a bit clunky, had trouble on Linux, and was difficult to install on most systems.
Since then, PHP_Archive-based .phar files have been used in PHP 5.1 itself to install PEAR via install-pear-nozlib.phar and go-pear.phar. Both are bundled with the PHP 5.1 release: install-pear-nozlib.phar is in the Unix distribution, and go-pear.phar is in the Windows distribution. In the process of making this work, I unearthed two engine-level bugs in PHP (both fixed in PHP 5.1), and several bugs in PEAR itself that prevented it from working from within the .phar in certain cases (all fixed in PEAR 1.4.3 and newer).
The current release of PHP_Archive is based upon the .tar file format. This has the advantage of easy extraction with a tar tool, should you want to modify the file and re-tar it. However, it has the tremendous disadvantage of making every single file access O(n) in algorithmic complexity, where n is the number of files in the archive: finding a file means walking the headers in order until the right one turns up. In the average case about half of the entries (n/2) are scanned, which is noticeably slow for a package like PEAR with lots of files. This, together with the amount of wasted space in the file, has been concerning me. In addition, compression must be implemented for each individual file, which makes untarring and re-tarring rather impractical.
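To make the cost of the linear scan concrete, here is a small Python sketch. It is only an illustration using Python's standard tarfile module, not PHP_Archive's code, and the file names and contents are made up. It builds an in-memory tar of 100 stub files, then counts how many headers have to be read before a given file is found.

```python
import io
import tarfile


def build_tar(files):
    """Pack a {name: bytes} mapping into an uncompressed in-memory tar."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def find_by_scanning(archive, wanted):
    """Walk the headers in order until `wanted` turns up.

    Returns (file contents, number of headers read).
    """
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as tar:
        for headers_read, member in enumerate(tar, start=1):
            if member.name == wanted:
                return tar.extractfile(member).read(), headers_read
    raise KeyError(wanted)


# Hypothetical file names and contents, purely for illustration.
files = {f"PEAR/File{i}.php": b"<?php // stub %d" % i for i in range(100)}
archive = build_tar(files)

first_data, first_cost = find_by_scanning(archive, "PEAR/File0.php")
last_data, last_cost = find_by_scanning(archive, "PEAR/File99.php")
payload = sum(len(data) for data in files.values())

print(first_cost, last_cost)           # 1 100
print(len(archive) > payload)          # True: headers and block padding add size
```

The first file is found after one header, the last only after all 100, so the average over all files is about n/2. The archive is also larger than the raw payload, because tar stores a header per file and pads everything to 512-byte blocks.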
These considerations led me to rethink the format of .phar files. Because of the Thanksgiving holiday here in the US, I haven't yet been able to contact Davey to get his feedback, so these are strictly my own observations and ideas. I have been able to rework the format of PHP_Archive-based .phar files, and to my great delight, accessing internal files is now an O(1) operation, with only a slight increase in memory usage per phar (basically, the length of each file name plus 8 bytes, summed over all files). To my great surprise, PEAR.phar is now actually faster on Windows XP than the pear command, because including files from NTFS is slower than including a file internally from the phar. On Linux, the perceived runtime is identical.
This means that it is possible to distribute PHP applications as a .phar without any noticeable speed penalty. I have not benchmarked memory usage, which also needs to be considered, so any benchmarks in this respect that you can run would be helpful.
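The idea behind constant-time access can be sketched with an index that maps each file name to its position in the archive body. The Python below is a hedged illustration, not the actual PHP_Archive layout: the function names are hypothetical, and treating the 8 bytes per file as a 4-byte offset plus a 4-byte length is an assumption, since only the total of 8 bytes is stated above.

```python
def build_archive(files):
    """Concatenate file contents and record (offset, length) for each name."""
    index = {}
    body = bytearray()
    for name, data in files.items():
        index[name] = (len(body), len(data))
        body += data
    return index, bytes(body)


def read_file(index, body, name):
    """Fetch one file with a single hash lookup, however many files exist."""
    offset, length = index[name]
    return body[offset:offset + length]


def index_overhead(index):
    """Estimate index size as (file name length + 8 bytes) per file.

    Assumes the 8 bytes are a 4-byte offset and a 4-byte length.
    """
    return sum(len(name.encode("utf-8")) + 8 for name in index)


# Hypothetical file names and contents, purely for illustration.
files = {f"PEAR/File{i}.php": b"<?php // stub %d" % i for i in range(100)}
index, body = build_archive(files)

print(read_file(index, body, "PEAR/File99.php") == files["PEAR/File99.php"])  # True
print(index_overhead(index))
```

Looking up the last file costs the same as looking up the first, because the lookup never touches the other entries. The price is keeping the index in memory, which is where the per-file overhead comes from.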
In addition, I finally figured out a way to allow arbitrary naming of the .phar, so you can rename PEAR.phar to ThisIsWhyOpenSourceIsGood.php and it will still work.
I have placed the new phar at http://pear.chiaraquartet.net/PEAR.phar. It is about 100 KB smaller than the April version, and yet it contains more files (PEAR has evolved since then). I tested just about every command, and found no difference between my installed PEAR and the .phar.
One caveat: if you have not installed PEAR already, you will need to set up a configuration file. Fortunately, this is easy:
- php PEAR.phar config-create /path/to/PEAR .pearrc (on Unix), or php PEAR.phar config-create --windows C:\path\to\PEAR C:\windows\pear.ini (on Windows)
- php PEAR.phar config-show
- adjust any values you don't like with "php PEAR.phar config-set <var> <value>"
In addition, all .phars require PHP 5.1, and this one requires the zlib extension be enabled in order to decompress the internal files.
I am very excited about these developments. Once Davey has a chance to review the changes and tweak them, we will be able to release PHP_Archive 0.7.0.