Ted Blue

Subscribe to Ted Blue: eMailAlertsEmail Alerts
Get Ted Blue: homepageHomepage mobileMobile rssRSS facebookFacebook twitterTwitter linkedinLinkedIn


Related Topics: ColdFusion on Ulitzer

CFDJ: Article

Comparing Local and Client/Server Databases

Comparing Local and Client/Server Databases

When creating a ColdFusion application, one of the first and most fundamental choices that must be made is which database platform to use. Each platform offers its own unique capabilities. Basically, there are two major categories of databases to choose from: local, such as Microsoft Access, dBASE, and FoxPro, and client/server, such as Oracle, SQL Server, and Sybase.

Don't kid yourself, this is not a trivial decision. Changing database platforms after development has begun can be difficult, and it becomes even more so the further along you get. There's a point of no return for most complex development projects where it's no longer cost-effective to change the database. At this point a major rewrite would be in order. In addition, some features your Web site might require may or may not be supported by the underlying database platform.

This article explores the fundamental differences between local and client/ server databases, and the issues that often arise when choosing a database platform for a ColdFusion Web application. For discussion purposes we'll as-sume the ColdFusion application is for e-commerce and the site will be heavily trafficked.

Web Applications Are Client/Server
All Web applications operate in a client/server environment (see Figure 1). A Web browser (a client) sends a request to a Web server, which is relayed to the ColdFusion Application Server. The server generates the appropriate HTML code based on the user's request, which is usually based on data that's accessible to the Web application. The HTML is then relayed by the Web server back to the client (the browser).

In Figure 1 there are two separate client/server relationships: between the Web browser and the Web server where the browser is the client, and between the Web server and the ColdFusion Application Server where the Web server is the client. A third client/server relationship may also exist between the ColdFusion Application Server and the database server if the data is stored on a database management system (DBMS). We'll discuss this in more detail in a later section.

Local Databases
Let's start the discussion with local databases. Technically, a local database is any data stored in an unmanaged database format - that is, a database that doesn't have or require dedicated database management software to access its data. Examples of local database formats include Microsoft Access, dBASE, FoxPro, and Paradox; other less-used formats are ISAM, B-Tree, and Delimited Text.

Connecting to an unmanaged, local data format requires that the appropriate ODBC driver is set up and configured correctly on the Web server and any essential passwords are specified. The database files can be located elsewhere. Assuming the ODBC driver is configured correctly, no additional preparation is needed - you can access the data using appropriate ColdFusion techniques, such as the CFQUERY tag.

One of the most attractive features of using a local database format is cost. The only real cost associated with a local database format is the tool you would use to create and manage the data files. For example, you might purchase a copy of Microsoft Access for creating and managing your .MDB files. In some cases the data format has no inherent cost, as the file format is commonly used by a number of other tools. The dBASE file format, for instance, can be easily created from an Excel spreadsheet or from a variety of other software applications. In almost all cases there are no licensing fees or royalties for using these database formats. Limitations of Local Databases

Although the configuration and setup of a local database is simple and the cost is low, certain limitations must be taken into consideration before dedicating any development to a local database format. First there's the issue of scalability. Many local database formats can't handle large numbers of records or simultaneous users accessing the data, both of which are critical to the development of a typical transaction-based e-commerce application.

For example, consider a Web site based on the traditional dBASE (.DBF) file format. Although it's possible to store a fair number of records in a dBASE table (100-500KB records isn't uncommon), the dBASE file format wasn't designed to hold huge amounts of data. There are known limits to the various versions of the dBASE file format, such as a 16MB limit on data stored in a memo field, due to the size of the data pointer. However, the problem is not so much with the amount of data stored as with the attempt to access that data using SQL through an ODBC driver. As the size of the tables increases, the performance decreases, often dramatically.

This limitation isn't specific to the dBASE format - most local database formats have performance problems when they exceed certain basic size limitations. In all cases, however, the larger the local database, the poorer the performance. The actual size limit where the performance problem occurs depends on a number of factors. For some local databases the limitation is measured in the number of records. For others, it's the total size of the database, the width of the record, or some other combination.

Performance Under Load
There are also limitations on the number of concurrent users for local databases. As the number of users increases, the performance drops dramatically. Under load, most local database formats drop performance to the point of being unusable (see Figure 2). Since a Web site depends on the time required to retrieve a Web page, which is directly affected by the time required to conduct the database operation, this is a rather critical limitation. If you combine this limitation with the fact that most Web databases gradually increase in size over time, you have a situation that will eventually bring your Web site to a crawl.

Where's the load limit for local databases? This depends on a number of factors, not the least of which is the code you write to access and manage the data. However, most experienced Web developers would probably agree that a local database starts having performance problems at somewhere between 10 and 20 simultaneous users. Not a lot if you're trying to build a Web site to handle tens of thousands of hits per day or more. Also, this problem can't be easily alleviated by throwing more hardware at the Web server (or database file server, as the case may be). The limitation is a symptom of a deeper, more fundamental problem, rooted in the way local database formats are structured.

Data Integrity and Local Databases
Perhaps the most insidious problem with local databases is that the integrity of the data can be fairly easily compromised. This can happen in a number of different ways, and can be very difficult to manage, prevent, or even detect.

Consider that most local databases use indexes to implement sort orders. These indexes are used to locate the correct record in a particular collating sequence. However, if the database is being updated (rather than a read-only database), the corresponding index key values must be updated at the same time. Since the data format is unmanaged, any problem with I/O on the Web server (or with message queuing in Windows) could result in an index that's out of sequence with the data, which means the index may refer to the wrong records. This in turn can result in the wrong records being updated, damaging the integrity of the data. Left unchecked, this can lead to data corruption. This particular problem is much more common than you might think, and is compounded as the size of the database and/or the number of simultaneous users increases.

Another common problem with local database formats is that when data is written, there's nothing to prevent multiple users from accessing the same data at the same time. Some local database formats provide record-locking mechanisms to prevent simultaneous access to the same data, but this assumes the application can lock a record for the entire period of time it's being edited (e.g., pessimistic record locking). Web applications are essentially stateless, and users that are editing data don't (should not, cannot) retain locks on records. This is because there's no way for the server to detect if a user abandons the editing session, thereby leaving the record in a locked state. This allows multiple users to potentially change the same data at the same time, which in turn can corrupt the data.

Even if the absence of pessimistic locking doesn't corrupt the data, another potential problem exists with local database formats since there's no database management system to prevent concurrency problems. Consider a record with a field that represents an account balance for a particular client that's updated in the following sequence:

  1. User A reads the account balance of $100.
  2. User B reads the account balance of $100.
  3. User A increases the account balance by $10, and updates the database with this value. The database now contains the value $110.
  4. User B increases the account balance by $10, and updates the database with this value. The database now contains the value $110, but it should be $120, as it has been increased by $10 two times.

This problem exists, again, because a local database has no database management system to prevent such problems.

Client/Server Database Systems
At the opposite end of the database spectrum are managed database systems. Technically, any database with a dedicated software database management system is a managed database system. However, Web databases are specifically client/server (regardless of the database format used), due to the client/server nature of Web development. This narrows the category to the more commonly used client/server RDBMSs, including Oracle, SQL Server, Informix, Sybase, and InterBase.

All these client/server database systems have a dedicated software-based manager to handle the potential problems inherent in the local database formats, and also to provide additional programming and data management capabilities that aren't available in a local database format.

Performance Under Load
Since these are managed systems, many of the problems associated with local database formats are simply non-issues with a client/server database system. For example, most client/server systems can handle huge numbers of records (billions of records is not uncommon) without a dramatic performance penalty (see Figure 3). In many systems performance can actually increase under additional user load, as the DBMS can reuse previously cached queries and service multiple client requests concurrently.

Any performance decreases due to load in a client/server database aren't nearly as dramatic as those encountered in local database formats. In addition, throwing more hardware at the database server usually improves its performance. This is far more efficient than relying on the underlying operating system to queue requests for access to a shared local data source. In most client/server systems significant performance drops don't occur until there are hundreds, or in some cases thousands, of concurrent users. When performance is affected, the performance decline is much more gradual than with local databases.

Data Integrity on Client/Server Systems
One of the strongest arguments for implementing a Web application on a client/server system is that the DBMS is specifically designed to protect the integrity of the data. Indexes are used primarily to increase performance, but not in the same way as in a local database system. Because the data is managed dynamically, the chance of index corruption is much less likely. If corruption does occur, it can be automatically detected and corrected before data corruption occurs. Other forms of data integrity problems are also automatically detected and resolved before they can adversely affect the data.

Transaction Control
Since a client/server database system allows many users to modify data offline, it's perfectly suited to Web application use in which users read and modify the data remotely. A client/server DBMS does this with transaction control, logging all potential changes by all users and layering on the updates in the correct order. This alone is no trivial task, as it requires a sophisticated logging and analysis system to simultaneously keep track of all reads and writes for all users.

Transaction control is implemented by logging client reads and writes to a transaction log, which the database system then uses to ensure the correct changes are applied, in the correct order, for each user.

Another advantage of using a transaction processing system is that multiple changes can be treated as if they were a single unit of work. For example, consider a simple transfer from your checking account to your savings account. Two updates must occur - reduce the checking account by a specified amount, and increase the savings account by the same amount.

If either transaction fails to complete (for whatever reason), someone will lose some cash. If the checking account was updated but the savings account update failed, you lose. If the savings account was updated but the checking account update failed, you win. Both updates must occur for the accounts to balance correctly. Transaction processing allows the database server to track changes for both updates, then commit them both once they're successfully done. If either update fails, both records are rolled back to their state just prior to the updates.

Another useful side effect of using transactions to track all changes for all clients is that if a disaster occurs at any point, such as a power failure, the database system can detect these open transactions at restart and roll them back automatically. Other DBMS Features

There are many other reasons for choosing a client/server database system over a local database format. Here are some of the more common selling points:

  • Security: Grants and revokes access and permissions to a specified object in the database
  • Views: Saves predefined queries that can be used as if they were actual tables
  • Stored procedures: Stores SQL programming in the database to perform common database management tasks
  • Triggers: Automatically executes stored procedures when data in a specified table is modified

Cost and Implementation
The drawbacks to using a client/server database management system are really associated with the cost and complexity of the implementation. Most professional client/server database systems cost far more than their local counterparts. Oracle, for example, can cost several hundreds of dollars per user, or tens of thousands per server, depending on configuration (www.oracle.com). InterBase, on the other hand, is free for the download (www.interbase.com). Most client/server databases are somewhere in between.

Another cost associated with client/server databases is the cost of implementation and maintenance. While some systems, like Microsoft SQL Server, are relatively simple to install and maintain, others, like Oracle, can require a highly paid professional database administrator in order to operate. Again, most fall somewhere in between.

Making a Choice
Are you scared of local databases yet? Don't be. They simply weren't intended for high-volume, heavily accessed Web sites. The choice of a local database as a permanent database for your Web site might be ap-propriate if the following conditions are true:

  • The site will provide read-only data that won't be modified remotely by clients.
  • The site won't experience high levels of traffic.
  • The site won't manage large amounts of data.
  • The site won't implement complex transactions.

If your site doesn't fit into all of these categories, you should seriously consider using a client/server database system.

Choose wisely....

More Stories By Ted Blue

Ted Blue is president of Blue Star Training (www.bluestarcorp.com), where he conducts training
for Internet, software development, and database-related topics. He's the Webmaster for a number of sites, including the San Diego ColdFusion User Group site (www.sdcfug.com). He's also the founder of
CERTIFIABLE Internet Services (www.certifiable.net),
a Web hosting company.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.