PHP - Sockets
From LXF Wiki
Practical PHP Programming
(Original version written by Paul Hudson for Linux Format magazine issue 47.)
Sockets are good for more than just plugging the iron in. Want to revolutionise your PHP scripts? We plug you in...
Files are files, directories are files, and devices are files. Everything and, conversely, nothing (think /dev/null) is a file. So it should come as no surprise that sockets are files too, and the logical extension of that fact is that you can manipulate sockets in much the same way as files.
But first, let me explain what a socket actually is. Perhaps the best way to think of a socket is as a connector, which might sit between a program and a port, or between two programs - data goes in one end and comes out the other. There's a lot more to sockets than that, but it's not important on the whole - at least not for this instalment.
File-like sockets
We covered files only very briefly, and even that was back in LXF31, so if functions like fopen() and fread() look alien to you, you'll need to dig out LXF31 from the bottom of your cupboard. As seen back then in the second instalment of Practical PHP, fopen(), fread(), fwrite(), and fclose() are used in combination to make file handling straightforward. Thanks to the fact that sockets are also files, these same functions can often be used on sockets. For example, take a look at the following PHP script, socket1.php:
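<?php
    // open a TCP connection to the web server on port 80
    $connection = fsockopen("www.linuxformat.co.uk", 80);
    if ($connection) {
        // send the request line, the Host header, then a blank line
        fwrite($connection, "GET / HTTP/1.1\r\nHOST: www.linuxformat.co.uk\r\n\r\n");
        // keep reading in 256-byte chunks until the end is reached
        while (!feof($connection)) {
            echo fread($connection, 256);
        }
        fclose($connection);
    } else {
        print "Unable to connect!\n";
    }
?>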
That script opens a connection to www.linuxformat.co.uk on port 80, sends an HTTP request, then receives the web page back. Note that fsockopen() is used rather than fopen(), because fsockopen() allows you to specify a port as the second parameter. The fopen() function, on the other hand, opens a port based upon its best guess of what service you want. The advantage of this is that you can just fopen() http://www.linuxformat.co.uk, and PHP will spot that it's an HTTP connection and so open port 80. The disadvantage is that you lose that extra control - fsockopen() lets you open port 40391, port 23934, or port 39211. That is, you can connect to any port you want, which makes it a great deal more flexible.
As fopen() automatically detects the type of connection you want, it also handles the protocol associated with it - the fwrite() line that sends an HTTP request is not needed when using fopen(), because fopen() sends it for you when you give it an http:// address. Again, using fsockopen() requires more code, but you get much more flexibility as you can send whatever request you want.
The return value of the fsockopen() call is checked, and it will return false if the connection failed or a resource if everything went smoothly. The return value, on success, is on the face of it exactly the same as the return value sent back by a successful call to fopen(), and it can be used in the same way. Behind the scenes, of course, things are a little more complicated because the connection exists over the network, but this is all entirely transparent to us as PHP programmers.
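The return-value check can be taken a step further. What follows is a minimal sketch, assuming you also want to know why a connection failed: fsockopen() accepts optional error number, error message and timeout arguments after the port. The try_connect() helper and the shape of its return value are hypothetical names invented for this illustration.

```php
<?php
// Hypothetical helper: returns [connection, null] on success,
// or [null, error message] on failure.
function try_connect(string $host, int $port, float $timeout = 5.0): array
{
    $errno = 0;
    $errstr = '';
    // @ hides the PHP warning so that we can report the failure ourselves
    $connection = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($connection === false) {
        return [null, "Unable to connect to $host:$port - $errstr ($errno)"];
    }
    return [$connection, null];
}
```

Calling try_connect("www.linuxformat.co.uk", 80) would hand back either a connection that works with fwrite() and fread(), or a message to print.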
To request a page from the server, we need to send an HTTP request. This is done using an HTTP GET request, as shown in the fwrite() line. Note that there are line breaks in there, so the string inside fwrite() is in fact two lines being sent. The HTTP standard is quite easy to read but, like most protocols, is very strict. The first line requests /, the root file on the server. This is followed by "HTTP/1.1" to denote the version of HTTP being used, then a carriage return \r and new line \n. The second line in the string specifies the hostname we're trying to access - this is crucial in HTTP/1.1 because of the need to support name-based virtual hosting. The last part of a valid HTTP request is two sets of carriage return/new lines: one ends the final header line, and the other forms the blank line that tells the server the request is complete.
Once fwrite() is called, we should be able to start reading back the content, which is where the feof() loop comes in. The feof() function returns true once the end of the file pointer passed to it has been reached - in this situation, once the web server has sent everything and closed the connection. So, our loop says "while we haven't reached the end, read in up to 256 bytes and print them out".
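The same read loop can be wrapped in a function that collects the data rather than printing it. This is a minimal sketch; read_all() is a hypothetical name, and it works on anything fread() accepts, whether that is a file, a socket opened with fsockopen(), or an in-memory stream.

```php
<?php
// Hypothetical helper: read a stream to the end in fixed-size chunks.
function read_all($stream, int $chunkSize = 256): string
{
    $data = '';
    // feof() turns true once the end has been reached, so loop until then
    while (!feof($stream)) {
        $chunk = fread($stream, $chunkSize);
        if ($chunk === false) {
            break; // read error: give up rather than loop forever
        }
        $data .= $chunk;
    }
    return $data;
}
```

The explicit check for false matters on network connections, where a failed read would otherwise leave the loop spinning.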
Finally, the connection is fclose()d, which is standard clean-up procedure. HTTP junkies out there will know that an HTTP/1.1 server keeps the connection open by default unless the request includes a "Connection: close" header, so with the above code the read loop may sit waiting until the server gives up on the idle connection and closes it. Either way, there's no harm in closing our end explicitly!
Go ahead and try the script yourself - you should see a print out of the source code from the LXF website. However, now that we've got full control over the HTTP connection, we can alter our request to specify that we only want to receive the headers of the page using a HEAD request rather than a GET request. Save this next script as socket2.php and try it out:
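If you want the server to close the connection as soon as it has answered, you can ask for that in the request. The sketch below assembles a request string with a "Connection: close" header, which is part of HTTP/1.1; build_request() is a hypothetical helper name, not something PHP provides.

```php
<?php
// Hypothetical helper: build a minimal HTTP/1.1 request with no body.
function build_request(string $method, string $path, string $host): string
{
    return "$method $path HTTP/1.1\r\n"  // request line
         . "Host: $host\r\n"             // needed for name-based virtual hosts
         . "Connection: close\r\n"       // ask the server to hang up when done
         . "\r\n";                       // blank line ends the request
}
```

Passing the result of build_request("GET", "/", "www.linuxformat.co.uk") to fwrite() should let the feof() loop finish promptly instead of waiting for a timeout.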
<?php
    $connection = fsockopen("www.linuxformat.co.uk", 80);
    if ($connection) {
        // HEAD asks for the response headers only, not the page itself
        fwrite($connection, "HEAD / HTTP/1.1\r\nHOST: www.linuxformat.co.uk\r\n\r\n");
        while (!feof($connection)) {
            echo fread($connection, 256);
        }
        fclose($connection);
    } else {
        print "Unable to connect!\n";
    }
?>
Note that the sole change is in the fwrite() line - GET has become HEAD, which instructs the server to send only the HTTP header data and not the full page. If you're checking to see whether a particular page exists or not, this would be the best place to start - simply check the first line of the response for a 404 status code, and you're set.
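One way to do that check is to pull the status code out of the first line of the response. This is a minimal sketch; status_code() is a hypothetical helper, and it assumes the response starts with a line such as "HTTP/1.1 404 Not Found".

```php
<?php
// Hypothetical helper: extract the numeric status code from a raw
// HTTP response, or return null if the text does not look like one.
function status_code(string $response): ?int
{
    // without the "m" modifier, ^ only matches at the very start of the string
    if (preg_match('#^HTTP/\d\.\d (\d{3})#', $response, $matches)) {
        return (int) $matches[1];
    }
    return null;
}
```

A page can then be treated as missing when status_code() returns 404.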
A PHP server
While it's clear that fsockopen() allows more flexibility than just fopen(), it is still quite simplistic when compared to what /can/ be done with sockets. We've just covered the basic use of sockets in PHP, and the next step is to move onto the advanced, more flexible sockets that allow you to really create cool stuff with PHP. If you've not fully understood the code so far, I strongly advise against continuing because things get a lot harder from here on in.
The five key functions you'll need to work with are socket_create_listen(), socket_accept(), socket_write(), socket_read(), and socket_close(). Combined, these functions do much the same as the basic functions, except they also automatically handle /blocking/ - pausing execution of the script until something important has happened. The advantage to this is that we can launch a PHP script and wait for things to happen, as opposed to making things happen. Sound like anything you recognise? That's right - a server!
Our new functions collectively handle everything we need to make a server in PHP. Precisely /what/ we'll serve is beside the point, at least right now. The new functions create and instruct a socket to listen on a given port, receive a client connecting to the socket, write text out to the client, read text sent by the client, and close the socket. To get started with the advanced socket usage, you need to call socket_create_listen(), which takes the port number to listen on as its first parameter. This function creates a socket, binds it to the port specified in the parameter, and returns the socket it created (a resource in older versions of PHP, a Socket object from PHP 8 onwards) or false if it failed. The socket returned from socket_create_listen() is used by other socket functions, so you'll almost certainly want to stash it away for the time being.
Moving on, once the socket is open, our server needs to accept a connection. This is done using the socket_accept() function, which takes the return value of socket_create_listen() as its only parameter, and returns a client connection - someone who connected to our port number, presumably looking for our server. The socket_accept() function operates by examining the queue of people waiting to be served, and taking the first client from there - if there are no clients waiting to be served, socket_accept() will wait until a client does become available, at which point it will return that. This action of waiting, usually called "blocking", is crucial to our script running as a server - we're waiting to be asked for data, as opposed to actively requesting data.
Once we have a connection, we can crack on and communicate with it. The socket_write() function takes two parameters, which in order are the client to write to and the value you want to write - this data is then sent to the client in the order provided, as you'd expect. Its partner, socket_read(), also takes two parameters, which are the connection to read from, and the number of bytes to read. By using socket_write() and socket_read() together, you can fully interact with clients connecting to your socket.
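To see those functions working together without a second terminal, here is a minimal sketch that plays both server and client inside one script. It assumes the sockets extension is loaded. The loopback_demo() name is hypothetical, and the use of port 0 (which asks the operating system for any free port) along with socket_getsockname(), socket_create() and socket_connect() are additions for the sake of the demonstration.

```php
<?php
// Hypothetical demo: a server socket and a client socket in one script.
function loopback_demo(): string
{
    // port 0 means "any free port"; socket_getsockname() reveals which
    $server = socket_create_listen(0);
    if ($server === false) {
        throw new RuntimeException('Could not create listening socket');
    }
    socket_getsockname($server, $address, $port);

    // stand-in for a real client such as telnet
    $remote = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);
    socket_connect($remote, '127.0.0.1', $port);

    // returns at once here, because a client is already queued
    $client = socket_accept($server);
    socket_write($client, "hello\n");

    // read what the server wrote, from the client's end
    $received = socket_read($remote, 256);

    socket_close($client);
    socket_close($remote);
    socket_close($server);
    return $received;
}
```

In a real server the socket_accept() call is where the script would sit blocked, waiting for somebody to connect.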
General to specific
Enough blathering, it's time to see the code - after all, you're not reading this article because of my hilarious jokes, right?
Servers do one thing: serve. What they serve is down to us, so we're going to pick a suitably easy thing to serve for our first server - we're going to make an uppercaseriser! (Ed: I'm not convinced that's a word, Paul)
<?php
    // the port number is arbitrary - any unused port above 1023 will do
    $socket = @socket_create_listen(55555);
    if (!$socket) {
        print "Failed to create socket!\n";
        exit;
    }

    while (true) {
        // block until a client connects
        $client = socket_accept($socket);
        if (!$client) {
            continue;
        }
        $welcome = "\nWelcome to the Magnificent Uppercasing Engine.\nType '!exit' to close this connection, or type '!die' to halt the server.\n";
        socket_write($client, $welcome);

        while (true) {
            $data = socket_read($client, 256);
            // false or an empty string means the client has gone away
            if ($data === false || $data === '') {
                break;
            }
            $input = trim($data);
            if ($input == '!exit') {
                break;
            }
            if ($input == '!die') {
                socket_close($client);
                break 2;
            }
            $output = strtoupper($input) . "\n";
            socket_write($client, $output);
            echo "$input\n";
        }
        socket_close($client);
    }
    socket_close($socket);
?>
Remember that using sockets in this manner is generally best done with the CLI SAPI, because it has no script timeout set. Run the script with the CLI SAPI then, in a new terminal, fire up telnet and connect to port 55555 on localhost, like this:
telnet localhost 55555
That should launch your telnet program, which is useful for forming simple connections to servers. All being well, you should receive the welcome message from the PHP server we just started - try it out with a few sentences, then type !die to halt the server. If you've followed correctly so far, you should see something like the picture below.
Send it some text and it uppercases it - it's not rocket science, but it's a good first step
At this point you should have a basic server working that automagically uppercases text you send to it. Yes, it's simple, but hopefully you should be confident using sockets properly now, which means we can move on.
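The rules the uppercasing server follows can also be written as a function that has nothing to do with sockets, which makes them easy to try out on their own. This is a minimal sketch; handle_line() and its 'reply', 'exit' and 'die' action names are hypothetical.

```php
<?php
// Hypothetical helper: decide what to do with one line of client input.
// Returns [action, reply] where action is 'reply', 'exit' or 'die'.
function handle_line(string $input): array
{
    // telnet sends a line ending with each line, so strip it first
    $input = trim($input);
    if ($input === '!exit') {
        return ['exit', null];  // close this client only
    }
    if ($input === '!die') {
        return ['die', null];   // halt the whole server
    }
    return ['reply', strtoupper($input) . "\n"];
}
```

A server loop would then call handle_line() on whatever socket_read() returns and act on the first element of the result.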
Making a web server
Perhaps the simplest useful server we can create using PHP is a web server, which, at its core, is basically an application that listens for requests on a given port (usually 80), and serves up the files that are asked for. Most web servers do a lot more than that, with common tasks being receiving POST data, handling file uploads, permissions, filters, and lots more. However, as creating such a script would require thousands of lines of code, we're just going to go for the lowest common denominator: serving files.
As we saw earlier, when looking at fsockopen(), an average HTTP request looks like this:
GET /somefile.php HTTP/1.1
HOST: www.linuxformat.co.uk
Now, don't get me wrong on this point: adhering to standards is a very important thing, and I would never ordinarily recommend ignoring or contravening standards in programming scripts. However, writing a server that properly parses and checks the entire HTTP request is out of the remit of this tutorial - you'll have to do that yourself. To serve up files, which is our goal, we just need to know which file was requested, and that's all in the first line. Therefore, to grab which file we need to send, we just check for the text after GET and before HTTP/1.1, and bin the rest. The simplest way to do this is to explode() the request string into an array, using a space as the separator, and read in element #1 (arrays are zero-based, remember).
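Here is a minimal sketch of that explode() trick on its own, so you can see what comes out of it. The requested_path() name is hypothetical, and returning null for a request with no path is a choice made for this illustration.

```php
<?php
// Hypothetical helper: pull the requested path out of a raw HTTP request.
function requested_path(string $request): ?string
{
    // "GET /somefile.php HTTP/1.1 ..." splits into "GET", "/somefile.php", ...
    $parts = explode(' ', trim($request));
    if (count($parts) < 2 || $parts[1] === '') {
        return null; // empty or malformed request
    }
    return $parts[1];
}
```

Feeding it the request shown above gives back "/somefile.php", with the HTTP version and the Host line binned along the way.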
Each HTTP response needs to take a very specific format because HTTP splits its response into "header" and "content". If you don't take care to adhere to the standard when responding to a request, it's quite likely that clients will turn their nose up at what you send them. A properly formed response should contain several lines of header information followed by a blank line, then the actual HTML to be displayed. The header should start with the HTTP response type, and typically goes on to give the type of the content being served, its size, and the name of the server. The content part is the file itself, and is separated from the header by two sets of carriage return/line feeds.
An HTTP response type is one line of text that contains the HTTP version being used (usually 1.0 or 1.1), followed by the HTTP status code of the response being sent and a small amount of text explaining the status, for example "HTTP/1.0 200 OK". The content type is particularly important because it tells browsers what kind of file is being sent, and browsers usually decide how to handle the file based on it. Content types come from a standard known as Multipurpose Internet Mail Extensions (MIME), which defines names for exact types of data - for example, the MIME type for HTML is "text/html", and for GIF images it's "image/gif".
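Putting those pieces together, a response can be assembled by a small function like the one below. This is a minimal sketch; build_response() is a hypothetical helper, and the Content-Length header is an optional extra that tells the client how many bytes of content to expect.

```php
<?php
// Hypothetical helper: assemble a complete HTTP/1.0 response.
function build_response(int $code, string $reason, string $mime, string $body): string
{
    return "HTTP/1.0 $code $reason\r\n"                      // response type
         . "Server: BabyHTTP\r\n"
         . "Connection: close\r\n"
         . "Content-Type: $mime\r\n"                         // MIME type of the body
         . "Content-Length: " . strlen($body) . "\r\n"       // size in bytes
         . "\r\n"                                            // blank line ends the header
         . $body;
}
```

For example, build_response(200, "OK", "text/html", $contents) produces something a browser should be happy to display.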
Now you've got a working understanding of how HTTP works, it's time to get onto the code. Note that I'm opening the socket on port 8000, as port 80 is often closed to non-root users.
<?php
    $httpsock = @socket_create_listen(8000);
    if (!$httpsock) {
        print "Socket creation failed!\n";
        exit;
    }

    while (1) {
        $client = socket_accept($httpsock);
        if (!$client) {
            continue;
        }

        // the first line of the request looks like "GET /somefile.html HTTP/1.1"
        $input = trim((string) socket_read($client, 4096));
        $input = explode(" ", $input);
        // element #1 is the requested path; fall back to "/" if it is missing
        $input = isset($input[1]) ? $input[1] : "/";

        $fileinfo = pathinfo($input);
        // a path such as "/" has no extension, so guard against a missing key
        $extension = isset($fileinfo['extension']) ? $fileinfo['extension'] : '';
        switch ($extension) {
            default:
                $mime = "text/html";
        }

        if ($input == "/") {
            $input = "/index.html";
        }
        $input = ".$input";

        // refuse paths containing ".." so clients can't climb out of this directory
        if (strpos($input, "..") === false && is_file($input) && is_readable($input)) {
            echo "Serving $input\n";
            $contents = file_get_contents($input);
            $output = "HTTP/1.0 200 OK\r\nServer: BabyHTTP\r\nConnection: close\r\nContent-Type: $mime\r\n\r\n$contents";
        } else {
            $contents = "The file you requested doesn't exist. Sorry!";
            $output = "HTTP/1.0 404 OBJECT NOT FOUND\r\nServer: BabyHTTP\r\nConnection: close\r\nContent-Type: text/html\r\n\r\n$contents";
        }

        socket_write($client, $output);
        socket_close($client);
    }
    socket_close($httpsock);
?>
Save that script as webserver.php, then start it up - you should be able to get to it through your web browser by visiting http://localhost:8000. You'll need to put a little content in there to make it work at its best - create a simple HTML file to get started, but note that anything other than HTML won't work currently.
The functions that need explaining are socket_read(), pathinfo(), and file_get_contents(). The socket_read() function is basically the reverse of socket_write(): it reads data sent by the client and returns it. The pathinfo() function takes the name of a file and breaks it up into its component parts - directory, filename, and extension. Its return value is an associative array of these values, which is why we use $fileinfo['extension']. Finally, file_get_contents() opens the named file for reading, returns the complete contents of that file, then closes it again.
Advanced HTTP
There's a lot more you can do to our BabyHTTP to make it a little more powerful. For example, note that there's a switch statement to handle various MIME types, and only HTML is supported right now - if you add an entry in there for PNG pictures (MIME type "image/png"), you could modify your sample HTML page and have it embed pictures in there. Sound tricky? It isn't. Our server doesn't support any sort of keep-alive, which means every request arrives as a whole new connection.
In this situation, a web browser will connect once, download the HTML, close the connection, read the HTML looking for subsequent files that are required (such as images, CSS files, Flash movies, etc), then make a separate connection for each one. To handle this we just need a new MIME type for each kind of file we want to send; the actual contents of the files are irrelevant because file_get_contents() is binary safe.
Once you've added support for a few picture types and perhaps also CSS files, try creating a new HTML file that includes images - it should all work fine. In fact, once your server accepts requests for CSS files, pictures, and Flash movies, you should be able to serve up quite complicated pages - not bad for such a simple server!
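Rather than growing the switch statement one case at a time, you could keep the MIME types in an array. This is a minimal sketch; mime_for() is a hypothetical helper and the list of types is only a starting point, with anything unrecognised falling back to "text/html" just as the switch statement's default does.

```php
<?php
// Hypothetical helper: pick a MIME type from a file's extension.
function mime_for(string $path): string
{
    $types = [
        'html' => 'text/html',
        'htm'  => 'text/html',
        'css'  => 'text/css',
        'png'  => 'image/png',
        'gif'  => 'image/gif',
        'jpg'  => 'image/jpeg',
        'jpeg' => 'image/jpeg',
    ];
    // PATHINFO_EXTENSION makes pathinfo() return just the extension,
    // or an empty string if the path has none
    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return $types[$extension] ?? 'text/html';
}
```

The ?? operator needs PHP 7 or later; on older versions an isset() check does the same job.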
Although adding support for a variety of files is likely to add the most immediate benefits, there are two other possibilities you might want to look into that will improve the server whilst also helping you learn a lot more about PHP. As you will have noticed if you tried creating a complicated HTML page, our server only handles one connection at a time, even if there are more waiting. This is because our server exists in only one execution thread - PHP executes the code linearly, one connection at a time.
While this is perfectly fine for fairly simple servers, it would be better if the server created a new process to handle each request as soon as it came in, and this is accomplished in the PHP world by using the pcntl_fork() function - something a little too complex to cover in this article, but definitely something you should read up on if you want to improve your server.
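To give a flavour of what that looks like, here is a minimal sketch of an accept loop that forks a child process for each client. It assumes the CLI SAPI with both the sockets and pcntl extensions loaded. The serve_forking() name, its $handler callback and its $maxClients limit are hypothetical, there so the loop can be tried out and stopped.

```php
<?php
// Hypothetical helper: accept $maxClients connections, handing each
// one to $handler in its own child process.
function serve_forking($listener, callable $handler, int $maxClients): void
{
    for ($i = 0; $i < $maxClients; $i++) {
        $client = socket_accept($listener);
        if ($client === false) {
            continue;
        }
        $pid = pcntl_fork();
        if ($pid === -1) {
            socket_close($client);   // fork failed: drop this client
            continue;
        }
        if ($pid === 0) {
            // child: serve this one client, then exit
            socket_close($listener);
            $handler($client);
            socket_close($client);
            exit(0);
        }
        // parent: the child owns the client now, so close our copy
        socket_close($client);
        // tidy up any children that have already finished, without waiting
        while (pcntl_waitpid(-1, $status, WNOHANG) > 0) {
        }
    }
    // wait for the remaining children before returning
    while (pcntl_waitpid(-1, $status) > 0) {
    }
}
```

A real server would loop forever rather than stopping after a fixed number of clients, and would want to limit how many children run at once.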
The second potential improvement is having the ability to serve up compressed content, which is a little easier to do. The key is to look for the "Accept-Encoding" line in the HTTP request coming in from your client, as it will tell you what kind of compression the client can accept. Konqueror, for example, sends x-gzip, x-deflate, gzip, deflate, identity, of which we're interested in "gzip", as it shows the browser can receive gzipped content - give it a try!
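As a starting point, here is a minimal sketch of both halves of the job: spotting "gzip" in the request, and compressing the content. The accepts_gzip() and gzip_body() names are hypothetical. The sketch assumes the zlib extension is available for gzencode(), and it ignores the optional quality values (such as "gzip;q=0") that a client may attach to each encoding.

```php
<?php
// Hypothetical helper: does the raw request list "gzip" in Accept-Encoding?
function accepts_gzip(string $request): bool
{
    // "m" lets ^ and $ match at each line, "i" ignores the header's case
    if (!preg_match('/^Accept-Encoding:(.*)$/mi', $request, $matches)) {
        return false;
    }
    foreach (explode(',', strtolower($matches[1])) as $item) {
        // drop any ";q=..." suffix and surrounding whitespace
        $name = trim(explode(';', $item)[0]);
        if ($name === 'gzip') {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: returns [extra header line, compressed body].
function gzip_body(string $body): array
{
    return ["Content-Encoding: gzip\r\n", gzencode($body)];
}
```

The extra header line goes in with the other response headers, so that the browser knows to decompress what follows.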
Conclusion
Sockets are a fun and interesting part of PHP, more so because they are one of the least-explored parts. As we've seen, it's actually quite easy to set up a simple server to handle our own tasks, although perhaps uppercasing user input is a little too easy - other possibilities include an encryption server, or a database querying server, etc. We've also looked at how it's actually quite easy to move from a simple server to a basic web server - granted our server isn't a patch (sorry!) on Apache, but, once you program in the ability to handle other media types, it can handle fairly complex stuff.
If you're thinking that sockets are only useful with the CLI SAPI, you'd be quite wrong - using sockets you can connect to all sorts of services on the Internet, such as the O&A currency system (www.oanda.com), or Unix time servers. There is quite literally a massive range of options, of which HTTP is just one - experiment!
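As one example, here is a minimal sketch of decoding the reply from a time server. It assumes the server speaks the Time Protocol (RFC 868) on port 37, which answers with four bytes holding the number of seconds since 1 January 1900. The time_reply_to_unix() name is hypothetical, and the sketch assumes a 64-bit build of PHP.

```php
<?php
// Seconds between 1 January 1900 and the Unix epoch of 1 January 1970.
const SECONDS_1900_TO_1970 = 2208988800;

// Hypothetical helper: convert a 4-byte Time Protocol reply
// into a Unix timestamp, or null if the reply is the wrong size.
function time_reply_to_unix(string $reply): ?int
{
    if (strlen($reply) !== 4) {
        return null;
    }
    // "N" unpacks an unsigned 32-bit big-endian integer
    $secondsSince1900 = unpack('N', $reply)[1];
    return $secondsSince1900 - SECONDS_1900_TO_1970;
}
```

The four bytes themselves would come from an fsockopen() connection to port 37 on a time server of your choosing, followed by fread($connection, 4).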

