Let's start with the network application, and let's keep it simple. I have a web site and I want my competitors to see one version of it, but my customers to see another version, my web developers to see another version and then everyone else to see the official current version. This is a reasonable deployment example because more and more applications are being deployed as Java or similar applications. That is, the application is deployed in one place and the clients pick up the application automatically as part of their web browsing. For example, your Best Buy or Future Shop on-line shopping is accessed by thousands of clients all accessing a single deployment of the on-line shopping application at the web site. So, how do we go about this?
First of all, we could have four different copies of the web site, but then if we want to make changes to common portions, we have to make them in four places instead of one. Does this sound more like a CM issue? So how do we proceed?
Classical CM Support for Multiple Versions of a Web Site
The classical view would be to maintain the Web site using a CM tool which is capable of sharing common code across Web development streams (i.e., branch only when necessary). Then I only have to make common changes once, and then re-deploy. This works fairly well. The problem comes when the number of views starts to grow. Perhaps the legal department needs to see all official versions of the web site used in the past 2 years.
The Dynamic Web Site
The next logical improvement would be to have a dynamic web site which generates or locates pages based on a view specification. When you access the web site, the view is specified (for example, based on how you accessed it or where your client IP address is located). Based on this view, the Web Server would dish out the appropriate pages. Such a strategy requires a bit of technology, but can provide additional benefits as well.
To implement this strategy, you need a repository of Web pages which can be served up based on the view. Selecting from a number of pre-deployed pages is a solution which is not really any different from our initial classical view. This is where we really want to start integrating with the live CM tool. Based on the view setting, we should be able to access a page from the CM tool which reflects the view. To do this, we need to have the Web Server integrated with the CM tool, perhaps through CGI scripts, so that the requested page can be retrieved from the CM tool and served up to the web client. This obviously has to be a quick operation. If 100 deltas need to be applied in order to generate the requested page, response times might not be acceptable. This is especially true if there is heavy traffic on the server, with hundreds of clients accessing pages simultaneously. Traditional RCS- and SCCS-like storage techniques likely won't cut it. As well, you have to consider how many client programs for the CM tool need to be active to provide adequate performance. If the CM tool can rapidly change views, the same CM client might be able to serve all of the Web clients. If context setting is not a rapid operation, it may be necessary to have a CM client per view - which could possibly bog down the server.
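The view-based lookup described above can be sketched roughly as follows. This is only an illustration, not the interface of any particular CM tool: the network ranges, the view names, the REPOSITORY dictionary standing in for the CM tool, and the functions select_view, fetch_page and serve are all assumptions made for the example. The cache is there because of the response-time concern: a page that is expensive to reconstruct is built once per view and then reused.

```python
import ipaddress
from functools import lru_cache

# Hypothetical rules mapping client networks to views; the first match wins.
VIEW_RULES = [
    (ipaddress.ip_network("10.1.0.0/16"), "developer"),
    (ipaddress.ip_network("192.0.2.0/24"), "customer"),
    (ipaddress.ip_network("198.51.100.0/24"), "competitor"),
]
DEFAULT_VIEW = "official"

# Stand-in for the CM repository: (view, page) -> page content.
# A real integration would query the CM tool here instead.
REPOSITORY = {
    ("official", "index.html"): "<h1>Welcome</h1>",
    ("developer", "index.html"): "<h1>Welcome (dev build)</h1>",
}


def select_view(client_ip):
    """Pick the view for a client based on its IP address."""
    address = ipaddress.ip_address(client_ip)
    for network, view in VIEW_RULES:
        if address in network:
            return view
    return DEFAULT_VIEW


@lru_cache(maxsize=1024)
def fetch_page(view, page):
    """Return the page for a view, caching the result.

    Pages that a view does not override are shared with the official view.
    """
    if (view, page) in REPOSITORY:
        return REPOSITORY[(view, page)]
    return REPOSITORY[(DEFAULT_VIEW, page)]


def serve(client_ip, page):
    """Resolve the client's view, then return the matching page."""
    return fetch_page(select_view(client_ip), page)
```

A real server would also have to invalidate cached pages when a new revision is checked in, which this sketch leaves out.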
Web Sites on a CM-Supported Virtual File System
Now let's try to go one better. With a CM tool which supports a Virtual File System, it's possible to access a view of the web site by setting the right view from the configuration management perspective. The view is set at the Operating System level and the result is the illusion of a file-system based Web site, just as you would normally deploy for a basic informational web site. Anyone familiar with ClearCase would be familiar with this scenario. So the integration with the Web Server becomes trivial. However, there are still concerns here. For example, how many active views can you have simultaneously? Can each process have its own view of the file system, or is there one view active per file system mount point? In the latter case, there's still some integration that needs to be done so that the Web Server will select the right view (i.e., mount point). If there are severe restrictions on the number of concurrent views, you may be pushed into a situation where you need a separate Web Server for each view. This sort of defeats the purpose of eliminating the overhead associated with multiple Web site views.
Still, this is a worthwhile goal to strive for if these concerns can be adequately addressed, the cost of the technology is reasonable, and the reliability is such that no additional administration is needed to support the Virtual File System.
The Traditional Model
Now we will take a 180 degree turn towards a more traditional view of application deployment. There are really two separate scenarios which need to be discussed: (1) direct deployment of the application, where the files are deployed directly from the CM tool and (2) indirect deployment of the files, where the files are packaged up for deployment elsewhere.
I would generally argue that the latter scenario, packaging up files for deployment, is really just the first scenario with a bit of post-packaging work. And application packaging, though possibly part of the tool suite surrounding CM, is not something I would consider part of the CM function. There are plenty of tools out there that deal with the ins and outs of installation. However, the CM tool should do all it can to facilitate the use of such tools.
There are 4 basic deployment operation types which should be considered, in my opinion:
(1) Deployment to a fixed location: Some files have to go in a specific place - perhaps drivers need to go in the "system32/drivers" folder on Windows. It is important that the CM deployment capability will allow you to specify a specific location for a target deliverable.
(2) Deployment to a target deployment directory: Some files need to be deployed to a directory relative to the target deployment directory, regardless of their location in the source tree. For example, help files might need to go in the /help directory, even though they are stored in the source tree along with the specification documents.
(3) Deployment based on the source tree architecture within a target deployment directory: Often the source tree can be laid out to mimic the application deployment structure. When this is done, deploying the source tree, just as you would your development workspace, will put the files in the correct location for the product, relative to your deployment directory root.
(4) Deployment based on a source sub-tree within a target deployment directory: Perhaps your source tree is not laid out according to the deployment structure, but certain directory sub-trees still map directly onto sub-trees in the deployment architecture.
If the CM tool adequately supports these types of deployment, it should not be difficult to automate the deployment process. I like to use a bit of shorthand for these methods:
Fixed Location: ""
Target Directory: "/="
Tree Under Target Directory: "/*"
Sub-tree Under Target Sub-Directory: "/+"
So, for example, I might have the following deployment directives for an application:
(1) Deploy all system files to their fixed locations. The fixed location would be part of the configuration data for each file, and would itself be under revision control so that the location could change over time if required by either the platform or the application:
deploy System-File-List
(2) Deploy all *.help, *.pdf and *.chm files to the "/help" directory. Here is a customer-defined location for deployment of the application. Similarly, all executable files (*.so, *.dll, *.exe, *.sh, *.bat) are deployed to the "/bin" directory:
deploy *.help, *.pdf, *.chm -to "myapp/help/="
deploy *.so, *.dll, *.exe, *.sh, *.bat -to "myapp/bin/="
(3) Deploy all "myapp_*" directories, sub-directories and files to the "" directory under the respective directory paths:
deploy myapp_* -to "myapp/*"
(4) Deploy a few relative application directories (and their sub-directories) where they need to go:
deploy client_data_common -to "myapp/data/+"
deploy client_data_option1 -to "myapp/data/+"
deploy images -to "myapp/runtime/images/+"
If the CM tool provides this level of capability, so that you can specify a set of objects, how they are to be deployed and, implicitly or explicitly, where they are to be deployed, not only will deployment be easier, but a more natural organization of your files will typically result, which in turn facilitates deployment automation.
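To make the shorthand concrete, here is a minimal sketch of how a directive might be turned into a destination path. The function resolve_target and its parameters are hypothetical, and the reading of each suffix is my interpretation of the examples above: "=" keeps only the file name, "*" keeps the full source path under the target, and "+" keeps the path relative to the named source sub-tree. An empty directive means the fixed location stored with the file is used.

```python
from pathlib import PurePosixPath


def resolve_target(source, directive, subtree_root=None, fixed_location=None):
    """Return the destination path for one source file.

    source         -- path of the file in the source tree
    directive      -- shorthand target such as "myapp/help/=" ("" = fixed)
    subtree_root   -- source directory named in a "+" directive
    fixed_location -- location stored in the file's configuration data
    """
    src = PurePosixPath(source)
    if not directive:
        # Fixed location: taken from the file's own configuration data.
        if fixed_location is None:
            raise ValueError(f"no fixed location recorded for {source}")
        return fixed_location

    base, _, mode = directive.rpartition("/")
    target_dir = PurePosixPath(base)
    if mode == "=":
        # Target directory: drop the source directories, keep the file name.
        return str(target_dir / src.name)
    if mode == "*":
        # Tree under target directory: keep the whole source path.
        return str(target_dir / src)
    if mode == "+":
        # Sub-tree: keep only the path below the named source directory.
        if subtree_root is None:
            raise ValueError("a '+' directive needs the source sub-tree root")
        return str(target_dir / src.relative_to(subtree_root))
    raise ValueError(f"unknown directive: {directive}")
```

For example, under these assumptions resolve_target("docs/spec/intro.pdf", "myapp/help/=") gives "myapp/help/intro.pdf", while resolve_target("images/icons/ok.png", "myapp/runtime/images/+", subtree_root="images") gives "myapp/runtime/images/icons/ok.png".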
Other Deployment Issues
There are a few other issues that need to be addressed when we discuss deployment.
File Format - Unix or Windows (or Mac)
When it comes to deploying files for different targets and/or on different targets, a CM tool can be invaluable. But for some reason, many tools don't deal with this issue well. Here's all that is needed:
1. When ASCII source is submitted to the repository, the CM tool should allow any of the native formats to be used and then should store the source file in a canonical form, possibly with no line end indicators at all (e.g., store the size of the line instead). If this is not done, then the ASCII file is really being treated as a "binary" file without binary characters. This is fine in some applications (e.g., Windows-only applications when the CM tool is also on Windows).
2. When binary source is submitted to the repository, the CM tool should leave it alone (possibly compressing it, or even delta-compressing it, but always so that it can be retrieved identically).
3. When ASCII source is retrieved from the repository, the CM tool must provide the following options:
a) If the retrieval command/action indicates a preferred format for retrieval, use that format
b) Otherwise, if the file class of the object being retrieved indicates a preferred format, retrieve it in that format
c) Otherwise, retrieve it into the format of the native machine
If the CM tool provides these options, and the associated mechanisms for providing them (e.g. file class definitions), the end-user will be happy.
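The storage and retrieval rules above can be sketched in a few lines. This is an illustration only: the names to_canonical, choose_format and retrieve are hypothetical, the canonical form is simply a list of lines with no terminators, and the sketch always terminates the last line on retrieval.

```python
import os

# Line terminators for the formats discussed above.
LINE_ENDINGS = {"unix": "\n", "windows": "\r\n", "mac": "\r"}


def to_canonical(text):
    """Store text as a list of lines with no line end indicators."""
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    lines = normalized.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # a final terminator does not start a new line
    return lines


def native_format():
    """Guess the format of the machine doing the retrieval."""
    return "windows" if os.linesep == "\r\n" else "unix"


def choose_format(requested=None, file_class_format=None):
    """Apply the precedence: command option, then file class, then native."""
    return requested or file_class_format or native_format()


def retrieve(lines, requested=None, file_class_format=None):
    """Rebuild the text in the chosen format."""
    eol = LINE_ENDINGS[choose_format(requested, file_class_format)]
    return "".join(line + eol for line in lines)
```

With this in place, a file checked in from Windows and retrieved with retrieve(to_canonical(text), requested="unix") comes back with Unix line ends, whatever the file class or the native machine would otherwise have chosen.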
Shared Files and Their Deployment
Files are often shared in the source tree in multiple locations. Perhaps there are Enterprise, Professional and Home configurations of the product which share files for common features. Or maybe two different deliverables are built using the same database engine. The handling of shared files by the CM tool, especially with respect to deployment, is a key issue.
When there are no shared files, the object itself may be used to indicate the location of the deployment. When a file is shared, both the object and its shared instance identification must be used to determine the ultimate deployment. The CM tool must facilitate deployment of shared objects by providing the means to naturally indicate where each shared object is to be deployed.
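One simple way to picture this is a deployment table keyed by both the object and its shared instance. The table contents, the file names and the functions below are invented for illustration; a CM tool would hold this information in its own repository.

```python
# Hypothetical deployment table: (object, shared instance) -> destination.
# An unshared object has a single entry with no instance (None).
DEPLOY_MAP = {
    ("dbengine.dll", "enterprise"): "enterprise/bin/dbengine.dll",
    ("dbengine.dll", "home"): "home/bin/dbengine.dll",
    ("readme.txt", None): "docs/readme.txt",
}


def deployment_target(obj, instance=None):
    """Return where one instance of an object is deployed."""
    try:
        return DEPLOY_MAP[(obj, instance)]
    except KeyError:
        raise LookupError(f"no deployment recorded for {obj!r} ({instance!r})")


def deployment_targets(obj):
    """Return every destination of an object, across all shared instances."""
    return sorted(
        target for (name, _), target in DEPLOY_MAP.items() if name == obj
    )
```

The unshared file is found from the object alone, while the shared one needs the instance as well: deployment_target("dbengine.dll") fails, but deployment_target("dbengine.dll", "home") succeeds.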
Linked Files
A cousin of a shared file is a linked file. Some operating systems allow you to create a reference link to a file so that accessing the reference link is the same as accessing the linked file. Unix has very good linking. Windows does a part of the job. How important linked files are to deployment depends on the application. Perhaps it is fine to replicate the file on deployment. Again, a CM tool which understands and provides linking capabilities for the underlying deployment architecture has an advantage.
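A deployment step that prefers a link but falls back to replication might look like the sketch below. The function deploy_link is hypothetical; os.symlink and shutil.copy2 are standard Python library calls. On Windows, creating a symbolic link can fail without the right privileges, which is the case the fallback is meant to cover.

```python
import os
import shutil


def deploy_link(target, link_path, allow_copy=True):
    """Deploy link_path as a link to target, or as a copy if linking fails.

    Returns "linked" or "copied" so the caller can record what was done.
    """
    try:
        # Use an absolute target so the link does not depend on where
        # link_path lives.
        os.symlink(os.path.abspath(target), link_path)
        return "linked"
    except FileExistsError:
        # Never overwrite something that is already deployed.
        raise
    except (OSError, NotImplementedError):
        if not allow_copy:
            raise
        # Replicate the file instead, preserving timestamps where possible.
        shutil.copy2(target, link_path)
        return "copied"
```

Whether the copy is an acceptable substitute depends on the application, as noted above: a copy will not follow later changes to the original file.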
What Are Your Requirements?
Application deployment requirements are many and varied. We've looked at a few of these in this article. If you have additional requirements, we want to hear from you! Post a new thread, or reply to this article.