Designing and Implementing a Database-Driven Image Gallery
William G. Paine and Sarah E. Fazenbaker, Florida Museum of Natural History, USA.
Most museums and cultural institutions are interested in getting their collections accessible on-line. However, many may feel that purchasing a software package or content management system (CMS) is their only avenue to get large amounts of content up on the Web, despite the fact that they may have to modify their practices to fit the CMS system and will likely spend large amounts of money to do so. The idea of producing something like this in-house may seem beyond some institutions but it doesn't have to be. After receiving multiple proposals for collection image galleries at the Florida Museum of Natural History, we designed a flexible database-driven image gallery system to facilitate the rapid development of dynamic galleries for a range of disciplines. Our goal for this system was to make as many things on the back end as standardized as possible while still allowing modifications. Included are the basic database structure, general stored procedures and views. The display pages are built on templates and are dynamically loaded each time they are called, resulting in a consistent look and feel. In addition, the code that processes the queries was written using the same structure. The content is added through an application specifically designed for non-technical employees and students. This application goes beyond text data entry and incorporates real-time image processing. Database security was also considered and implemented across the board in similar fashion. We have developed this gallery system using standard server software - Microsoft SQL Server, IIS, Access and Active Server Pages. However, the methods used could be adapted to any language or platform.
Keywords: Database, Image Gallery, Content Management System, Dynamic Content, Reusable Code, Active Server Pages, SQL, Web Application
When most museums start to review their options to put image galleries of their collections on-line, a content management system (CMS) may seem to be the only choice available. Enter the term into any search engine and you will likely find thousands of offerings ranging in cost from free to millions of dollars with a seemingly similar number of features. Choosing one of these software packages is a huge investment for any operation - in dollars, time and buy-in from employees.
There are advantages to purchasing from a vendor: a complete solution with a relatively quick implementation timeline and, if you bought the support contract, an outside professional to intercede during times of crisis. However, you must weigh this against the monetary costs — including initial investment, licensing fees and future customization — along with allowing your organization to get "locked-in" to the program's preferred operating system, programming language, hardware requirements, data management and storage methods. How much time are you willing to spend porting your information to fit their data structure? How much of a mental shift will your employees have to make to adapt to the new system?
With solid planning, there are some institutions that can really put an intricate content management system to good use. However, many museums may look at the myriad of features common to CMS programs — workflow management, staging and rollbacks, templates, versioning, multiple output formats, metadata support, and Through-The-Web authoring to name a few — and ask themselves, 'Do we really need all this?'
All museums, regardless of size, should not shy away from building a custom product in-house without careful consideration of the total cost of ownership of a commercial solution. You will have to pay one way or another! It may be a better investment to acquire quality staff that know the ins and outs of your organization and can develop applications that suit your needs perfectly. Why spend thousands on a service contract that just gives you some phone number to call when you could have people on-hand, ready to support the application throughout its life and adapt it as new opportunities arise? Skilled programmers are an asset that can serve your staff far beyond initial product development. At the Florida Museum of Natural History, this investment in technology professionals has yielded a number of projects benefiting its constituents — managing collections, images, children's classes, tours, events and much more - that cost far less in person-hours than commercial software equivalents over the long term.
Current Data Management
Collection databases are already well established at the Florida Museum of Natural History. Most disciplines have a custom database entry system tailored to capture their specific specimen data. Several also have a Web version of their collection database on-line and visitors can see all types of textual specimen data in varying report formats. However, pictures are not a part of the main collection data sets — a relatively small percentage of the museum's entire collection is visually documented — therefore images are not incorporated into these collection result pages.
The collective cry, 'Let's get our visual collections accessible on-line!' is echoing through many institutions. Indeed, increasing numbers of researchers at the Florida Museum of Natural History were interested in getting large numbers of images up on the Web and several grant proposals began to revolve around this concept. The images were not always necessarily those of actual specimens, so a separation from official collection data was desired.
A content management system appeared to be a logical method to publish large numbers of these images and data on the Web in a manageable way. However, it was the Office of Museum Technology's goal to make sure that if a CMS package were purchased, it would encompass the needs of the entire museum and be adaptable for future projects. This proved to be a tall order.
After a preliminary evaluation of some popular CMS systems, several things were clear:
In light of all of these factors, the Office of Museum Technology decided that building an adaptable image gallery in-house was the best option. The costs in person-hours were worth having a flexible application that we could tailor for all of our constituent's needs, now and in the future, on the server platform of our choosing.
Accessing Project Requirements
We began with a simple question: What did we really need to get our collection image galleries on-line?
We chose to build this system using the scripting language and database type that were already in heavy use at the museum, namely Active Server Pages with a Microsoft SQL Server database back end and a Microsoft Access front end. However, the code logic used with this system could easily be adaptable to a number of languages and platforms. Indeed, a long-term goal of the museum is to port all major systems over to open-source solutions due to licensing fees, ongoing security concerns and other vendor-specific issues. This was another advantage of building the code instead of locking into a particular platform required by a packaged content management system. At some point in the future the code could be translated to use another scripting language or database type, most likely at less expense than investing in an entirely new CMS. Retaining the systems the researchers were acclimated to would save additional costs in valuable person-hours in the long run as well.
Envisioning a Standard Approach to Building Interchangeable Code
One thing to aim for is maintaining standardization in common elements. Some of these elements are database structure, images, photographer data, search pages, processing pages, processing code and the pages that output the data. This means that the programmer will be able to create subsequent database image galleries much more quickly. Developers who practice code reuse become more efficient and produce less buggy code as well. Furthermore, the more often a particular piece of code is reused, the more certain you can be that the reused code is bug-free (Mitchell, 2000).
Since we are a natural history museum, certain elements are present in most of the databases housing data relating to natural history specimens. These database columns that define taxonomy and locality data are going to be fairly standard in most of the databases we develop. The other major division present in our museum is anthropology and the image gallery databases for these collections do capture different data, such as ceramics characteristics. The extent of changes and modifications should be limited to the following things:
Modification may be made to the XHTML and ASP files to change logos, background colors, etc. Further database work is usually needed to set up dictionary tables that may be included. These will almost always be present with common fields like taxonomic classifications. Others may be setup when necessary for a particular collection's needs.
Advantages of standardization:
Identifying Common Elements
Not all museum collections are the same, or even similar to one another at all. Things that are captured and important to one collection are either not present or not considered by another. But some common elements are present in all collections, and these are the ones we begin with when addressing a new project.
Collections need some way to identify each specific item they contain, whether they be fossils, skeletal material, alcohol-preserved specimens, skins, photos, paintings, or artifacts. At the Florida Museum of Natural History, natural history collections and anthropology collections both use a unique identifying number for each specimen, specimen lot, artifact or artifact lot. This number becomes a unique row identifier in that collection's database.
Common elements include columns to capture the thumbnail image, large image(s), photographer information and the system-generated record ID. These are present no matter what kind of collection is represented. Any other columns in the database are added to customize it to its specific purpose.
From the outset, the main objective in creating our own system has been to take advantage of the shared components and use them to speed up implementation and make creation easier. We feel that this approach is bearing fruit as we have either completed or are actively building five very different galleries for the museum's Web site without having to completely reinvent the code in the Web scripts and the upload database front end. In addition, the 'common elements' approach has greatly facilitated the creation of new databases on the SQL Server, as many of the stored procedures will run with either no changes or only minor ones.
A graphical explanation of the main database structure appears in Figure 1. Note that the lines connecting the tables all show 'crows feet' where they join the Photo table. Also note that there are single lines where they join the other tables. These lines show the relationships between the tables in the database. In addition, each database contains other tables that are mainly dictionaries for setting up the drop-down menus on the Web pages and in the upload database. Examples would be attributes of different ceramic types or taxonomic classifications.
Foreign keys in one table are primary keys in another related table. Foreign key values may appear many times in a table. For example, detailed information on a photographer is entered into the database only once, in the Photographer table. Any time a record entered in the Photo table needs to reference this photographer, only the PhotographerID (the primary key in the photographer table) for the desired photographer need be entered, linking the record in the Photo table with the complete photographer data in the Photographer table. Relational databases eliminate redundant data entry and all of the errors that accompany it.
Stored procedures constitute another common element to these projects. These database structures are essentially compiled queries that run very quickly and with less overhead than other methods of data retrieval. In addition, the application is more secure when you specify that your database tables can be modified only by stored procedures. This means that potentially dangerous SQL operations generated on the fly may not be performed. (Anderson, et al. 1999) Like the main tables, the stored procedures are all very similar. Some of them can remain completely unchanged from one database to another, and if changes do need to be made, they are usually to include search parameters unique to a particular collection. Other stored procedures run the drop-down menus for the search pages and these are specific to a particular dictionary.
Permissions must be set to allow for Web site users to run queries on the database. Other permissions must be set to allow for updating of the database by appropriate persons. These will be set when the databases are created. It is extremely important that permissions are set correctly to allow adequate access to the database for data searches but not to allow unauthorized access to administrative sections.
One other element common to all of the collection databases on our Web site is a tie-in to another database that tracks the queries made against them. Typical Web statistics packages do not have the capability to track individual queries made against a database; they can only track whether the initial search page has been visited. The custom approach we take allows us to not only compile the number of queries run against a particular database, but also list them in full for further analysis by collection staff. These statistics are valuable tools when applying for further grants, etc. and can be seen on our Web site at http://www.flmnh.ufl.edu/scripts/dbs/track_dbs.asp.
Another very important thing involved here is the need for regular backups of databases and all related files. It cannot be stated enough that reliable backups can mean the difference between a minor inconvenience and major disaster in the event of data loss, system crash, hard drive failure or some other catastrophic event.
Data and Image Management Application
Having an application that allows non-technical people to easily enter data and process images directly to the Web site and Web database server makes it possible for the project to remain viable over time. Ease of use is the fundamental key to making this approach work well.
We built a custom application, using Microsoft Access as a front end to a Microsoft SQL Server back end. The hallmark of this application is the ease of uploading images to the Web server once they have been properly optimized for Web display. Currently, we allow for one thumbnail image and one large image per entry in the database, although we recently created a database for a type collection of Spanish Colonial ceramics that allows for up to four large images per entry. Since we are creating custom applications, each one can be modified as needed. The images are not stored within the database; their filenames are entered as text instead. On the Web pages and in the upload application, these filenames are referenced to display the images. This saves enormous amounts of hard drive space, as storing images within databases creates a huge amount of overhead. It also allows for plenty of flexibility in setting locations for the images on the Web server.
The main page of this menu-driven, form-based application is shown in Figure 2.
To edit or add records or images to the project, as seen in Figure 3, the "Records LOCKED" button must be clicked.
The view of the large image is cropped here, but not on the Web, as it is intended here only for verification purposes. Once this has been done, records may be altered or deleted, and new ones added. The drop-down menus on the form display data from the dictionaries in the database. When a new entry is made that is not found within the dictionary, it is added to the dictionary as well as to the current data record. This automatic updating helps to prevent data input errors. One thing to remember is that when data are added here, they are "live" and will immediately appear on the Web site. Search fields are placed at the top of the form to allow for easy searches on the data. These search fields may be easily modified when necessary.
To add an image, users click on the buttons directly to the right of the image text boxes on the right-hand side of the form. A dialog box will open to allow selection of a particular file. Once selected, the filename will appear in the text box (Figure 4).
What actually happens is the following:
The form here allows image file operations for a single database record at a time. When needed, the application contains programs that batch process large numbers of image files relating to many records in the database.
If the Delete Current Record button is clicked, the user will be prompted to make sure that deletion is wanted. If the user answers in the affirmative, the current database record will be deleted as well as its associated image files on the Web server.
Building the Web Pages
All three of the main pages — the search screen, results page and individual record display — along with any accompanying pages are coded using standards-compliant extensible hypertext markup language (XHTML) and cascading style sheets (CSS). Loosely speaking, XHTML is XML that acts like HTML in old and new browsers and also works as expected in most Internet devices, from Palm Pilots to Web phones to screen readers, making it portable, practical and cost efficient. (Zeldman, 2003) Using standards-complaint code ensures that future user agents will be accommodated with little effort. All pages are validated against the World Wide Web Consortium (W3C) recommendations using their on-line validator to insure the pages will perform similarly on all browsers and platforms. (http://validator.w3.org/) CSS provides the ability to quickly change the appearance of most design attributes, from link and heading colors to display box size and format. CSS styles are given meaningful names to ensure easy modification between different galleries.
Identifying Target Audiences
All Web pages in this system are infinitely customizable. For each project, the target audience is assessed before the Web pages and search fields are created. Most of the galleries have a "general" target audience — a variety of people looking for images of animals and objects for a number of reasons, but not necessarily specific high-level research data. In these cases, ease of searching is the key. Users are not asked to use Boolean searches (employing "and" and "or" operators) because all experience shows that users cannot use it correctly. (Nielsen, 2000) Common names are offered if possible and scientific names are provided in drop-down menus to aid visitors who may not be familiar with taxonomy. The number of search terms is limited to better aid the average person to retrieve a variety of images to browse after their search.
However, these image galleries can also be tailored to very specific research goals. The Historical Archaeology department was interested in putting a ceramic type collection on-line to reduce wear and tear on the specimens and lighten the constant flow of ceramic identifying requests for departmental researchers. They wanted the Digital Type Collection and On-line Database of Spanish Colonial Ceramics to be a research tool — a person with an unknown ceramic artifact should be able to use the gallery to identify their item using over 20 searchable terms. While it is important for archaeologists to be able to use vocabulary specific to their discipline - such as paste category, paste temper, rim motif, surface finish and decorative technique to name a few - these terms may confuse visitors who just want to casually browse the collection. Extensive glossaries were added and made accessible right next to each relevant drop-down menu on the search page. In addition, many other accompanying pages were included discussing the identification of ceramics and the type definitions and attributes necessary to search effectively and understand the results. By only using drop-down menus to construct the queries, the inexperienced would at least see the range of each category and could produce a set of images to view without necessarily knowing the specifics of each term.
The Search Page
Queries are constructed with either drop-down menus or text fields. If drop-down menus are selected for the searches, they are populated directly from dictionary tables in the database. As soon as someone enters a new value for that dictionary field in the upload application, it will instantly appear as a choice on the search page.
Drop-down menus are the preferred search method for two reasons: ease of use and security (Figure 5).
By only offering the visitor choices directly from the database, they are more likely to get a result. If visitors are allowed to type in their queries, they run the risk of using misspellings, regional variations, abbreviations and any number of terms that are not in the database at all. Drop-down menus are also much safer in terms of server security than text fields. All content entered by users should not be trusted by default — it must be filtered and stripped of special characters before being sent to the database to avoid allowing unauthorized persons to penetrate your system. Using drop-down menus avoids this issue altogether, although they can get unwieldy after a large number of entries.
For certain galleries, the images correlate to an object in the museum's collections and therefore an artifact or specimen number can also be a search field. In those instances, code is in place to filter out all other criteria that is searched on and only returns the image that matches the number entered.
Displaying the Results
The result page dynamically displays the results of the user's search. It is generally formatted using the same formula, regardless of the gallery subject. Thumbnails of the images are displayed in several rows, with links to view the larger picture (Figure 6).
These images are kept small to reduce loading times; 120 x 80 pixels at a medium quality resolution is the standard and about 12 — 20 are displayed on a page at once. A minimum amount of text can be displayed under each thumbnail; genus and species, photographer or common names are all popular choices.
In any instance where the number of results is greater than the number of thumbnail images displayed, a row of links to subsequent pages is built at the bottom of the results page. In this way, visitors can skip to any page in the sequence they desire without being held to view the additional pages in a linear way. The number of records returned is also displayed at the top of the screen and the additional page links help the user visualize how large of a record set their query produced.
After a user clicks on a thumbnail image, they are taken to a dynamically generated page featuring a large image of the item and various amounts of accompanying text (Figure 7).
Most items have far more textual information than is offered in the searchable drop-down menus and this page is used to display any information the researchers wish to include about that particular item in the database. In most cases, just one large image is used on the display page. However, the page can be customized to display virtually anything the researcher desires. In the Digital Type Collection and On-line Database of Spanish Colonial Ceramics application, one thumbnail is shown but the visitor gets to view a large image of the top, side and bottom of each artifact once they move to the detail page.
Before any images are uploaded to the server, copyright issues must be addressed. Copyright laws will vary depending on organization type and country of origin. The Florida Museum of Natural History is a non-profit institution based in the United States, so end-use is an important consideration before posting the images. A portion of the museum's copyright policy states:
All media are for nonprofit, educational, and personal use of students, scholars and the public. Any commercial use or publication by printed or electronic media is strictly prohibited without the express written permission of the museum. (http://www.flmnh.ufl.edu/databases/copyright.htm.)
Contributing photographers are made aware that due to this policy, people are allowed to use their images for the purposes mentioned in the statement. The gallery application has no built-in copyright protection and relies on the fact that the images are at a resolution low enough to make printed reproduction a non-viable option without access to the source files. Images optimized for Web use generally have a resolution around 72 dots per inch (dpi). Magazine quality reproduction requires at least 300 dpi, if not higher. Anyone desiring an image for commercial print applications would be forced to write to the museum to receive permission and a copy of the original high-resolution file. Unless an image was taken by museum staff photographers, all requests are passed to the original photographer to decide whether to authorize use and what charge to levy, if so desired.
There are important server security issues when dealing with these image galleries — beyond standard patching, maintenance and physical security — that should be addressed within the application code itself. There are many more exploits out there beyond what we will touch on in this document, but there are some areas that must be tackled with defensive coding and thorough testing, no matter what technology is used.
The two main concerns dealt with programmatically in this gallery application are buffer overflows and SQL Injection attacks. These two situations can occur when users are allowed to pass malicious information to the database by using text input boxes. However, even when searches can only be created from drop-down menus, it is a good idea to use the following guidelines to insure the greatest possible application stability.
A buffer overflow occurs when too much data are passed for the application to process in an allocated memory space and system failure may result. At worst, the user may gain control of your server. This is prevented by automatically truncating the input before the buffer limit is met. For any input with a known range, such as dates or specimen numbers, controls are put in place to insure that the input data do not fall outside the expected range.
Thorough testing of the application is a must. Our programming team emphasizes "elegant error handling" that anticipates and tests for errors in advance. This prevents the user from ever experiencing a crash. Besides being an unpleasant event for users, unhandled crashes are also a security risk — standard SQL error messages can include table names and other database details skilled hackers can use against you to gain access. You should not trivialize exception handling, expecting your code to be error free. (Mitchell, 2000) User-friendly error messages are programmed to appear in the place of more sterile, technical complaints from the server. The message "You must enter at least one search criteria" is obviously much easier to comprehend than, "Invalid procedure call or argument." This combination of elegant error handling and extensive testing ensures a robust application that is also easy to use.
Some institutions may be wary of investing in personnel, but often they are truly one of the museum's greatest assets. Hiring the right people allows a museum to create a flexible, custom gallery system tailored to their specific needs at an affordable cost over the long term.
Anderson, R., Blexrud, C., Chiarelli, A., Denault, D., Homer, A., Esposito, D., et al. (1999). Professional Active Server Pages 3.0. USA: Wrox Press Ltd.
Curphey, M., Endler, D., Hau, W., McKenna, G., McLaughlin, K., Parke, R., et al. (2002) Guide to Building Secure Web Applications: The Open Web Application Security Project. The Open Web Application Security Project. http://unc.dl.sourceforge.net/sourceforge/owasp/OWASPGuideV1.1.pdf
Mitchell, S. (2000). Designing Active Server Pages. Sebastopol, CA: O'Reilly & Associates, Inc.
Nielsen, J. (2000). Designing Web Usability: The Practice of Simplicity. Indianapolis, Indiana: New Riders Publishing.
Zeldman, J. (2003). Designing with Web Standards. Indianapolis, Indiana: New Riders Publishing.
Image Gallery Web SitesDigital Type Collection and On-line Database of Spanish Colonial Ceramics
Butterfly & Moth Image Gallery