Monday 11 April 2022

Storing Your Code in a Database 1

I've always been concerned about how to handle different versions of the same code for different customers. I've seen several different approaches over the years. 

The one that seemed to work best involved code stored in a Pick database. At my first job, we used Advanced Revelation, a DOS version of Pick. The Pick system mixed code files written in Pick Basic and multi-value data files together in a hashed database. We used to take a global source code file and append an asterisk and the customers name to the end of it when a customisation was required e.g. accounts*customerA, accounts*customerB etc. Code management modules were written that compiled accounts*customerA into the global accounts executable and then bundled all the executable records together into an archive that the customer extracted onto their system. Another department at that same work place were developing a C++ app, storing the code in standard text files. They got into all sorts of trouble handling variations, mainly because, I think, when developing using text files, to avoid clutter you tend to let the text files grow too large and then it becomes very difficult to refactor them. 

I've seen a FoxPro app where most of the app is standard, but you were able to request minor modifications e.g. customised reports. How this was handled at the vendor end, I'm not sure, but I did notice fields in the database that my customer didn't use.

Another approach is to have a single code base, but include an ini file that let's different customers switch on and off the functionality they need. Ini files used like this can grow to be very large e.g. Apache Web Server. I think the ini file approach is going to become unworkable as the code base becomes increasingly convoluted trying to account for every possibility.

I have no experience doing this, but you could build a conventional app using text files and then store these inside a versioning database e.g. Fossil

Today we have hosted web apps. I don't think much has changed except that you no longer have the upgrade pain that came with distributed apps. I'm guessing the standard approach is to build an app in a CMS and to associate code files and database records with registered users. So you end up with one huge CMS app and one large relational database. The database is a concern, because over time it is going to end up with lots of redundant fields from customisations. You must be passing some kind of unique token between the server and the client to ensure that the client can't trespass onto someone else's data and if that token gets hacked, the hacker might be able to gain access to your entire customer base! To avoid that database engineers resort to using GUIDs to identify records, so it is virtually impossible to pull up a database record using a random key.

I'm working on a different approach.

  • Each customer has their own database for data. Consequently you can keep it simple with integer id's and integer foreign keys instead of a mess of GUIDs which as well as being confusing massively impact on performance as a index key. The database server manages multiple small databases.
  • Their code, though, resides in a SINGLE database for code. The code database is organised like a library with shelves, books and chapters (pages exist too but these are called indirectly). A shelf is an entire application which comprises many books, but the customer directly accesses just four books, a private book (administration), a public book (customer web pages that can be used by anyone), a protected book (customer web pages that can only be accessed by their registered users) and a global book. The customer books contain only the chapters which are customised for them. Every time a client requests a web page, the customer books are browsed first. If the book's chapter is missing then the chapter is sourced from the global book instead.
  • It's impossible for a customer to request another customer's book because to read their book you have to be standing right in front of it. This is pushing the analogy a bit far! I have just ONE very simple cgi file (all it does is collect cgi parameters and cookies, sends them to the code database and returns the reply to the web browser) that calls the code database, but there are MANY COPIES of that file in a shallow directory tree. Each copy has it's own ini file which specifies the name of the book it can access and which database. None of that information passes over the Internet. Access to the cgi files is password protected where necessary (using SSL encrypted basic authentication).

No comments:

Post a Comment