Friday 6 October 2017

What's wrong with CGI?

The accepted answer is that CGI is an outdated concept that no longer has a place in modern web applications.  I'm not so sure.  Let's look at how CGI works.

1.  The user requests a page.
2.  The web server starts up a new process, which generates the content and then dies.
3.  If the page is doing anything interesting, that invariably means the CGI process must open a database connection and destroy it again.
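
To make those steps concrete, here is a minimal sketch in C of what the process in step 2 has to produce: a header line, a blank line and then the page. The function names build_page and html_escape are my own inventions for illustration and are not taken from cgihtml or any other library. The only things assumed from the CGI convention itself are that the web server passes the request to the process in environment variables such as REQUEST_METHOD and QUERY_STRING, and that whatever the process writes to standard output is returned to the browser.

```c
#include <stdio.h>
#include <string.h>

/* Copy `src` into `dst`, replacing the HTML-special characters & < > "
 * with entities. The output is truncated to fit and always NUL-terminated. */
static void html_escape(char *dst, size_t size, const char *src)
{
    size_t used = 0;
    if (size == 0) return;
    for (; *src != '\0'; src++) {
        const char *rep;
        char one[2] = { *src, '\0' };
        switch (*src) {
        case '&': rep = "&amp;";  break;
        case '<': rep = "&lt;";   break;
        case '>': rep = "&gt;";   break;
        case '"': rep = "&quot;"; break;
        default:  rep = one;      break;
        }
        size_t len = strlen(rep);
        if (used + len >= size) break;   /* no room left: stop cleanly */
        memcpy(dst + used, rep, len);
        used += len;
    }
    dst[used] = '\0';
}

/* Build a complete CGI response in `buf`: header, blank line, then the body.
 * `method` and `query` are the values of REQUEST_METHOD and QUERY_STRING;
 * either may be NULL. Returns what snprintf returns. */
int build_page(char *buf, size_t size, const char *method, const char *query)
{
    char safe_method[64];
    char safe_query[1024];
    html_escape(safe_method, sizeof safe_method, method ? method : "(none)");
    html_escape(safe_query, sizeof safe_query, query ? query : "");
    return snprintf(buf, size,
        "Content-Type: text/html\n"
        "\n"
        "<html><body>\n"
        "<p>Method: %s</p>\n"
        "<p>Query: %s</p>\n"
        "</body></html>\n",
        safe_method, safe_query);
}
```

In a real CGI executable, main would call build_page(buf, sizeof buf, getenv("REQUEST_METHOD"), getenv("QUERY_STRING")), write buf to stdout with fputs and exit. The escaping is there because the query string comes straight from the user and should never be echoed into a page as-is.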

In the past, operating systems weren't smart and starting a new process meant reading the binary image into memory from disk. That's not true now, particularly for Windows (I'm a Windows developer, so I had better stick with what I know); it caches recently read files in memory. So, if your CGI file is a compiled executable then the overhead of CGI has gone. If your CGI uses a scripting language such as Python, Perl or Tcl, then its virtual machine must parse your script every time, which will slow things down, but not by as much as you'd think, since both the virtual machine and the script are cached in memory and, in Python's case at least, imported modules are compiled to byte code on first import and load much quicker thereafter.

Opening and destroying database connections is expensive, mainly because the database server must authenticate your connection. But if you insert some middleware between the database and your CGI file and pool your database connections, the authentication overhead is removed. Ensuring that your database recordsets are disconnected recordsets also makes things run much faster.
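
The pooling idea can be sketched in a few lines of C. This is not the API of isectMP or of any real database library: Conn, Pool, pool_acquire and pool_release are hypothetical names, and the "login" is just a counter standing in for the expensive authentication step. The sketch also assumes the pool lives in the long-running middleware process, since a CGI process exits after every request and could not keep connections alive itself.

```c
#include <stddef.h>

#define POOL_SIZE 4

/* Hypothetical stand-in for a real database handle. */
typedef struct {
    int authenticated;   /* 1 once the (expensive) login has been done */
    int in_use;          /* 1 while lent out to a request */
} Conn;

typedef struct {
    Conn conns[POOL_SIZE];
    int logins;          /* counts authentications, purely for illustration */
} Pool;

/* Start with every connection idle and not yet logged in. */
void pool_init(Pool *pool)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        pool->conns[i].authenticated = 0;
        pool->conns[i].in_use = 0;
    }
    pool->logins = 0;
}

/* Borrow an idle connection, authenticating only the first time it is used.
 * Returns NULL when every connection is busy. */
Conn *pool_acquire(Pool *pool)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        Conn *c = &pool->conns[i];
        if (!c->in_use) {
            if (!c->authenticated) {
                /* A real pool would log in to the database here. */
                c->authenticated = 1;
                pool->logins++;
            }
            c->in_use = 1;
            return c;
        }
    }
    return NULL;
}

/* Hand the connection back; it stays logged in for the next request. */
void pool_release(Conn *c)
{
    c->in_use = 0;
}
```

If requests arrive one at a time, every call to pool_acquire after the first reuses the same logged-in connection, so the authentication cost is paid once rather than once per page.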

So to recap, if you have CGI executables using disconnected recordsets from pooled database connections, what's wrong with CGI?  Let's count the advantages:

1.  The CGI executables are tiny.  I have a single EXE for each page, all of them sharing a 2 MB DLL.  The EXEs never exceed 200 KB. They compile in a flash.
2.  Because each CGI is stand-alone, I can keep the web site running, while just working on one aspect of it. And if a CGI should crash in production, the whole site isn't brought down, just that page.
3.  The site runs exceptionally fast, because the CGI files only do the bare minimum to create a page.  No lumbering template libraries here. In fact, no template libraries at all!  HTTP, when you remove all the towering library abstractions built on top of it, is actually very basic. Just GET and POST calls passing around variables, HTML forms that collect the data into those variables and cookies to maintain state. It also helps to have a little bit of JavaScript sprinkled about so that your forms don't need to return to the server for every action the user takes. A typical CGI file begins by collecting any variables sent to it, reading cookies and opening a number of disconnected database recordsets.  The variables and cookies are transformed and saved as appropriate and the file finishes by writing out a new page. In my files, less than a third of the code is devoted to writing the new page. Most of it revolves around updating the database.
4.  Secure. With all the code compiled into an EXE, there are no script files on the server for an attacker to read or alter. (User input still has to be validated, compiled or not.)
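
The "collecting any variables sent to it" step from point 3 is less work than it sounds. Here is a rough sketch in C of pulling one variable out of a GET query string; cgihtml does this job in my code, and the function names below (get_param, url_decode, hex_value) are made up for illustration rather than taken from it. The sketch assumes the usual form encoding, where pairs look like name=value, are separated by '&', and use '+' for a space and %XX for any other awkward byte. It matches names exactly as they appear in the query string, without decoding them.

```c
#include <ctype.h>
#include <string.h>

/* Value of a single hex digit, or -1 if `c` is not one. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode `len` bytes of URL-encoded text ('+' is a space, %XX is a byte)
 * into `out`, truncating to fit and always NUL-terminating. */
static void url_decode(char *out, size_t size, const char *in, size_t len)
{
    size_t used = 0;
    if (size == 0) return;
    for (size_t i = 0; i < len && used + 1 < size; i++) {
        int hi = -1, lo = -1;
        if (in[i] == '+') {
            out[used++] = ' ';
        } else if (in[i] == '%' && i + 2 < len &&
                   (hi = hex_value((unsigned char)in[i + 1])) >= 0 &&
                   (lo = hex_value((unsigned char)in[i + 2])) >= 0) {
            out[used++] = (char)(hi * 16 + lo);
            i += 2;                       /* skip the two hex digits */
        } else {
            out[used++] = in[i];          /* ordinary byte, or a stray '%' */
        }
    }
    out[used] = '\0';
}

/* Look up `name` in a query string such as "id=7&title=Hello+World".
 * Returns 1 and stores the decoded value in `out` if found, else 0.
 * `query` may be NULL, as it is when the variable is not set. */
int get_param(const char *query, const char *name, char *out, size_t size)
{
    size_t name_len = strlen(name);
    const char *p = query;
    while (p != NULL && *p != '\0') {
        const char *end = strchr(p, '&');
        size_t pair_len = end ? (size_t)(end - p) : strlen(p);
        if (pair_len > name_len &&
            strncmp(p, name, name_len) == 0 && p[name_len] == '=') {
            url_decode(out, size, p + name_len + 1, pair_len - name_len - 1);
            return 1;
        }
        p = end ? end + 1 : NULL;
    }
    return 0;
}
```

A page would call something like get_param(getenv("QUERY_STRING"), "id", value, sizeof value) near the top of main. Cookies arrive the same way, in the HTTP_COOKIE environment variable, as name=value pairs separated by "; " instead of '&'.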

The disadvantages?

1.  My code uses a combination of C and C++ supplied in three libraries. First, cgihtml-1.69, a minute CGI library written in C, that I scrounged off the web and which was last updated in 1998. Second, fox-1.3.15, written in C++ and released in 2002 (Fox-Toolkit is still current. The latest release is 1.7.61, but why update when the existing functionality surpasses anything I'll ever need?) and third, isectMP-1.0.3, a database pooling library written in C which I maintain. Fox-Toolkit comes with an excellent string manipulation class which I use constantly, but sometimes I yearn for the simple syntax of Python. That yearning soon disappears when I have to interface with a third party library or interact with Windows directly! And what about Python deployment? Let's not go there!

2.  Writing CGIs in C and C++ just isn't fashionable.  I constantly have to remind myself that it's results that matter, not whether I'm using the language du jour.

And so, I find myself asking that question again: what is the problem with CGI?  Answer:  Nothing!