**Update: REST is very fast now that I've figured out that I needed a "Connection: close" header in the fsockopen() request. pear.chiaraquartet.net is running a fully functioning REST server; you can see the source in any old browser at http://pear.chiaraquartet.net/Chiara_PEAR_Server_REST/. Update to the CVS version of PEAR, and you can try all the remote functions and marvel at how fast it is, especially with HTTP caching.**
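The slowdown mentioned in the update is typical of a client that reads a socket until end-of-file: HTTP/1.1 connections stay open by default, so the read blocks until the server's keep-alive timeout unless the request asks for the connection to be closed. Here is a minimal sketch of the idea in Python rather than the installer's PHP. The names `build_request` and `fetch` are made up for illustration and are not part of PEAR.

```python
import socket


def build_request(host: str, path: str) -> bytes:
    """Build a raw HTTP/1.1 GET request that asks the server to close the
    connection after responding, so a read-until-EOF loop ends promptly."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # without this, HTTP/1.1 defaults to keep-alive
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str, port: int = 80, timeout: float = 10.0) -> bytes:
    """Send the request over a plain socket and read until the server closes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # an empty read means the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

The returned bytes still include the status line and response headers; a real client would parse those before using the body.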
In my work on the PEAR installer implementing channels, I've spent a fair amount of time investigating ways of exposing channel capabilities, and of distributing information from these channels back to the PEAR installer. PEAR 1.3.x and earlier use XML-RPC to distribute data dynamically back to the PEAR installer to help it make decisions about files to download and install.
Recently, PEAR has become a victim of its own success, experiencing tremendous load problems at pear.php.net. In part, this was due to including too many files in each page, and this has been fixed. However, the load introduced by dynamic processing of XML-RPC requests is still substantial.
How can this be solved? Certainly not through the introduction of the current fad, SOAP. This protocol is much more complex, requires more bandwidth, and provides no additional worthwhile features (people from Java/C# are not going to use the SOAP interface to install PHP packages).
The solution appears to be in implementing REST, Representational State Transfer, so that all dynamic processing is done on the client side. I say "appears to be" because all of this is experimental and not fully tested, but initial tests are very exciting.
I've completed the work of designing a server-side implementation of REST for the PEAR installer, and am now working on the client-side implementation.
The traditional REST web service, like amazon.com's non-SOAP interface, is an incredibly complex system that offers all of the features of the SOAP interface but anchors them to URI resources rather than SOAP functions. In general, PEAR does not require this kind of complexity. In fact, the entire server side can be implemented using static XML and text files! The scalability of a channel server is then limited only by the scalability of the web server, and as we all know, Apache is just about limitless in its scalability.
In addition, this provides the most excellent side effect that mirroring a channel becomes a piece of cake: simply copy the REST files' directory structure to the mirror once a day, and the channel is mirrored.
The way that my preliminary experiments with REST in Chiara_PEAR_Server work is that every time a new package, new maintainer, or new category is created, a few XML files containing information on it are saved on the server, along with an occasional simple XLink hyperlink to other REST files. In addition, when a release is added, XML files representing the release information, the package.xml, and a text file containing a serialized representation of the package's dependencies are all created.
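To make the "static files written at release time" idea concrete, here is a rough sketch in Python rather than PHP. Everything specific in it is an assumption for illustration: the directory layout, the element names, the `/rest/p/...` link target, and the use of JSON as a stand-in for the serialized dependency data. None of it is the actual Chiara_PEAR_Server format.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path

XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK_NS)


def write_release_files(root: Path, package: str, version: str, deps: list) -> Path:
    """Write the static files for one release under a hypothetical layout:
    <root>/r/<package>/<version>.xml and <root>/r/<package>/deps.<version>.txt"""
    release_dir = root / "r" / package.lower()
    release_dir.mkdir(parents=True, exist_ok=True)

    release = ET.Element("release")
    # XLink hyperlink pointing at the (hypothetical) package resource
    pkg = ET.SubElement(
        release, "package", {f"{{{XLINK_NS}}}href": f"/rest/p/{package.lower()}"}
    )
    pkg.text = package
    ET.SubElement(release, "version").text = version

    xml_path = release_dir / f"{version}.xml"
    ET.ElementTree(release).write(str(xml_path), encoding="utf-8", xml_declaration=True)

    # Stand-in for the serialized dependency file (JSON here, by assumption)
    deps_path = release_dir / f"deps.{version}.txt"
    deps_path.write_text(json.dumps(deps), encoding="utf-8")
    return xml_path
```

Because the output is just files on disk, serving them needs nothing beyond a static web server, and checking them needs nothing beyond a browser.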
This has proven to be extremely simple to implement, and it is very gratifying to know that because the output is so limited, debugging is very simple as well.
On the client side, I've been grabbing code that resided in Chiara/PEAR/Server.php and the code it was copied from in pearweb's include/pear-database.php and simply implementing it in the installer.
There are three significant side effects to this change:
- debugging problems in the server-side web services is incredibly simple: you can open up the REST files in any web browser and see everything there is to see.
- debugging problems in the client side of web services is also easier, as all the dynamic code is in PEAR, so there is no real confusion about where to fix the problem. Fixing problems in the XML-RPC interface has always been a nightmare.
- caching can take advantage of the HTTP caching standard that has been around since HTTP 1.1 and is fully implemented in even really old web servers.
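The caching point can be sketched with a conditional GET: the client stores the validators (`ETag`, `Last-Modified`) from an earlier response and sends them back, and the server answers `304 Not Modified` with no body if the file is unchanged. This is an illustrative Python sketch, not the installer's code. The function names and the in-memory dict cache are assumptions.

```python
import urllib.error
import urllib.request


def conditional_headers(entry):
    """Build validator headers from a cached entry (a dict, or None)."""
    headers = {}
    if entry:
        if entry.get("etag"):
            headers["If-None-Match"] = entry["etag"]
        if entry.get("last_modified"):
            headers["If-Modified-Since"] = entry["last_modified"]
    return headers


def fetch_cached(url, cache):
    """Return the body for url, reusing the cached copy on 304 Not Modified."""
    entry = cache.get(url)
    request = urllib.request.Request(url, headers=conditional_headers(entry))
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            body = response.read()
            cache[url] = {
                "etag": response.headers.get("ETag"),
                "last_modified": response.headers.get("Last-Modified"),
                "body": body,
            }
            return body
    except urllib.error.HTTPError as err:
        if err.code == 304 and entry:
            return entry["body"]  # unchanged on the server; nothing re-downloaded
        raise
```

With static files, the web server produces the validators on its own, so the channel server needs no extra code to support this.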
In addition, since the client has full access to all client-side information (passing this over the internet as parameters to an XML-RPC function would be both really slow and cumbersome), more intelligent download requests can be made.
For instance, if there are no releases that match the constraints specified by preferred_state, and the --force option was not specified, then nothing should be installed, and there is no reason to download the package.xml contents. REST-based code can do this, whereas an XML-RPC implementation can't possibly know this unless you make it far too complicated for anyone's good.
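The client-side decision described here can be sketched as a small filter over a release list fetched from the server. This Python sketch assumes each release is a dict with `version` and `state` keys, that the states are ordered from least to most stable as listed, and that versions are plain dotted numbers. The function names are hypothetical.

```python
STATES = ["snapshot", "devel", "alpha", "beta", "stable"]  # least to most stable


def version_key(version):
    """Crude numeric sort key; real versions may carry suffixes such as 'RC1'."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def pick_release(releases, preferred_state="stable", force=False):
    """Return the newest release at least as stable as preferred_state,
    or None when nothing qualifies (so no package.xml needs downloading)."""
    threshold = STATES.index(preferred_state)
    candidates = [r for r in releases if STATES.index(r["state"]) >= threshold]
    if not candidates and force:
        candidates = list(releases)  # --force: ignore the stability constraint
    if not candidates:
        return None
    return max(candidates, key=lambda r: version_key(r["version"]))
```

Only when `pick_release` returns a release does the client go on to request that release's files.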
In addition, passing the whole package.xml back through XML-RPC requires encoding all special characters with entities, which makes the thing about 1/6-1/3 larger than the original file. REST can simply download the XML file as XML with no conflicts (the MIME type is used to differentiate content types). These things combined also mean that there is no reason to ever download the complete package.xml, as we only need the package name, version, and its dependencies. All in all, it is much more efficient in both bandwidth and memory usage on the client side.
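The size penalty of entity encoding is easy to measure. The sketch below, in Python, escapes a made-up XML fragment the way it would have to be escaped to travel inside another XML document as a string, and reports the relative growth. The sample fragment and the helper name `escape_overhead` are illustrative only, and the figure for a real package.xml depends on how markup-heavy it is.

```python
from xml.sax.saxutils import escape


def escape_overhead(xml_text):
    """Return the fractional size increase from entity-encoding xml_text."""
    raw = len(xml_text.encode("utf-8"))
    escaped = len(escape(xml_text).encode("utf-8"))
    return (escaped - raw) / raw


sample = "<package><name>Foo</name><version>1.0.0</version></package>"
print(f"{escape_overhead(sample):.0%} larger once embedded as a string")
```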
In addition, the PHP-space implementation of REST downloading and caching code plus the REST1.0 standard will be no larger than the XML_RPC class.
I suspect the only reason nobody thought to do this for the first incarnation of the PEAR installer is that REST just isn't as sexy in appearance. I mean, how cool is it that you can call a function on a remote server as if it were in the same script? REST requires clunkier thinking in terms of resources. It is more like accessing information in an XML file via XPath versus grabbing it through an abstraction that does some of the magic processing for you.
The basic problem with API-based web services is that they limit you to the original author's assumptions about what you need to know. This can be great for high-security issues (for instance, the REST interface does NOT expose maintainer passwords or email addresses), but even this is not a good reason to use an API-based web service. HTTP authentication is perfectly capable of locking people out of sensitive resources, and in addition, the process of hiding information behind an API can still be implemented with the REST model simply by the use of a mod_rewrite-style dynamic script. Users could still bookmark and return to a particular resource via the URI, something that is impossible with an API-based server.
The assertion that it saves developer time when using a web service like SOAP also appears to be suspect to me, as one often needs to code around limitations in the API. A well-designed API in a web service probably provides similar benefit to a well-designed REST-based web service in terms of developer time. The only real problem with REST is that there is no plug-it-in-and-go solution in terms of processing REST.
The REST I'm using for PEAR is primarily XML-based, so that XML Schema can be used to define its contents. Those who argue that WSDL is our savior might benefit from the obvious fact that just knowing the types of arguments and what a function returns is not the same thing as being able to use it to do something useful. If you can get data from the remote host to your host in a format you understand, then you can use it. Either way, there must be a human involved to take that data and write the code to do something with it. Thank God.
In any case, this has been a very educational experiment, and I am excited to see where it ends up. Hopefully the initial benefits continue and persist as the code matures and people start using it. Time will tell!
Greg Beaver has decided to try moving from XML-RPC to REST for the pear installer: >I suspect the only reason nobody thought to do this for the first incarnation of the PEAR installer is that REST just isn't as sexy in appearance. I mean, how cool ...
Tracked: Apr 24, 19:07