Invertebrata    2002 



Plant pest database
Lionel Hill (Lionel.Hill@dpiwe.tas.gov.au)

[Lionel Hill sent draft no. 4 of the scoping document for this database to Invertebrata for the interest of our readers.]

1. Business need for the system

The prime purpose of holding insect collections is to meet an international trade obligation to document pest records. Because pest identification has many uncertainties it is necessary to maintain physical specimens as the ultimate documentation. The Department of Primary Industries, Water and Environment (DPIWE) holds such a collection at the New Town Laboratories. Electronic databases hasten retrieval of these pest records, which, using manual retrieval methods, have taken months to collate for particular export opportunities.

A database also facilitates management of the physical collection, enabling timely specimen retrieval, generating lists of specimens, generating labels for specimens, tracking loans of specimens, identifying gaps in the specimen coverage, etc. The use of a database, in conjunction with a specimen collection, also aids State pest management. Although there are good summaries of distribution, biology, seasonality and host preferences for a few recurring major insect pests in Tasmania, such condensed information is frequently not available for lesser, sporadic or new pests. However, data from the labels of physical insect specimens can fill this gap if it can be quickly and easily collated, such as in an electronic database.

The current DPIWE entomological database is a Microsoft Access application, written in-house, and is used 'statewide'. A number of entomologists and technicians work out of the New Town laboratories and require access to the database system for day-to-day work. The Department also has a laboratory at Devonport, from which an entomologist and other technicians work. Currently the entomology database is housed at the Devonport laboratory and most data entry is carried out there. One of the major problems with the current system is that it has not been set up for wide area, concurrent, multi-user access. If data entry is required at the New Town laboratories, the entire database is copied and sent there from Devonport, and until the updated database is sent back no further data entry can occur at the Devonport laboratory. This is not an acceptable work practice, and it carries a definite risk of data loss or corruption, duplication and data inconsistency.

The proposed Tasmanian Plant Pest Database (TPPD hereafter) addresses these issues. It will be a true multi-user, client-server application, employing an Oracle database and an architecture written in-house that makes use of a subset of Java's Enterprise JavaBeans (EJB) specification. The client application can be deployed at as many sites, and on as many machines, as required. Initially, the database and the server container will be based on a server in Hobart, with client applications connecting via the Department's wide area network. One of the benefits of having the database housed in Hobart is that use can be made of the current data back-up regime.

The location of the database server depends on the speed of the network. If it is found that long access times to a Hobart-based server cause problems for data entry and general querying at the Devonport labs then alternative arrangements will be made to house the database server in Devonport. This will be in the form of Oracle running on a Linux machine with backups of the database carried out by database replication to Hobart. In this scenario, one of the considerations needs to be the access required by the Australian Plant Pest Database.

The Australian Plant Pest Database project (APPD) is a national initiative to catalogue all plant pests found in Australia. It is to be accessed by a web-based application, connecting to a centralised request broker. The broker talks to the various State-based databases to retrieve the requested data via a gateway servlet application. Although some other States are using Access databases, the current DPIWE IT policy is to disallow the exposure of Microsoft systems to such environments. The proposed TPPD meets the DPIWE IT policy requirements.

A system such as the proposed TPPD can also be used to answer 'helpdesk' style queries relating to plant pests, something the current system cannot do without the creation of specific reports. The proposed system will be extensible, allowing access to be broadened in the future, perhaps through a web-based query tool.

Other requests for access to the data stored in the current system have been made and, while the current system provides all the functionality required by the requests, it is not user-friendly. New users require some training in order to make the best use of the system. Making the system more user-friendly would mean that access could be granted without requiring the current expert users to spend as much time with first-time users in training sessions.

2. Size

Currently the database consists of around 50,000 accession records (made up of specimen mounts, staff diagnostic records and literature records), 12,000 taxonomic hierarchy records and around 2,000 bibliographic records. There are approximately 150,000 specimen mounts in the entomology collection. Therefore, when the remaining specimen mount records are entered, the accession data will grow to roughly three times its current size. It is also envisaged that new specimen mount records may grow the database by up to 5% this year, although this growth figure may not be met in subsequent years. The taxonomic hierarchy records will remain virtually static and the bibliographic records may increase slightly. All lookup tables should remain static. This means that there are currently at least 100,000 insert transactions to be processed, although not all will be processed this year.
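The arithmetic behind these figures can be checked with a short sketch. The variable names are illustrative only, and applying the 5% growth figure to the current accession count is an assumption, since the text does not say which total the percentage refers to.

```python
# Approximate figures quoted in this section.
current_accessions = 50_000   # mounts + staff diagnostic + literature records
total_mounts = 150_000        # specimen mounts in the physical collection

# Even if every existing accession record were a specimen mount, this many
# mounts would still be unentered, so it is a lower bound on pending inserts.
min_pending_inserts = total_mounts - current_accessions

# Size of the accession data after those inserts, relative to its current size.
growth_factor = (current_accessions + min_pending_inserts) / current_accessions

# Assumption: the "up to 5%" growth applies to the current accession count.
max_new_mounts_this_year = current_accessions * 5 // 100

print(min_pending_inserts, growth_factor, max_new_mounts_this_year)
```

Because some of the 50,000 existing records are diagnostic or literature records rather than mounts, the true number of outstanding inserts is somewhat higher than this lower bound.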

Updating of the system is currently carried out by two or three entomologists and by contract data entry personnel; this year a total of around five people will update the system. Access to the system is granted on an as-needs basis, and the number of users will probably be no more than ten. It is envisaged that this will grow in the future.

3. Owners and users

The data will be provided by the Department of Primary Industries' current Plant Pest database. Initial data load of the new system will be the responsibility of Stephen Cooke with the assistance of Lionel Hill. The entomologists will be responsible for the provision of data and day-to-day maintenance of the data, with the initial assistance of Stephen Cooke.

4. Initial data model

Figure 2.1 - Entity diagram (image not reproduced)



5. Business Processes

Records of insect specimens can be created by one of three general methods of data capture: from literature, from physical collection methods and from enquiries (which may include official enquiries from Quarantine and Council officers). In the latter two cases the recording process is the same.

The samples are processed and stored on mounts, the basic (atomic) unit of the dataset, either by pinning (with a handwritten tag that may carry description, diagnosis result and/or collection details) or by being preserved in vials of alcohol (or similar), again with a tag. A third method of storage is on a microscope slide. In this case, it is common to divide the specimen and mount only the relevant part. The rest of the specimen may be further divided and stored by one or more of the storage methods mentioned. At some point after the mounting process the mount details are entered into the database. Details for final printed labels, which are added to the mounts, are extracted from the database and formatted for printing. The mounts can then enter the entomological collection.

As can be seen from Figure 2.1, entry of specimen mount records into the database requires that the host, trinomial hierarchy, habitat and look-up tables be up to date. Entry of any accession record into the database will require data to be stored in the following tables: Accession, AccessionDetails. Depending on the collection method, the following tables may hold additional data. For a collected specimen: CollectionDetails, CollectionLocation, RearingDetails, SerialMounts, MountStorage, Host, SourceNote. For bibliographic-only records (specimens not required): Accession, Bibliography. Note that entry of an accession may require the entry of further host, habitat and/or trinomial data before the mount record can be entered.
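The table dependencies described above can be summarised in a small lookup. This is only an illustrative sketch: the table names come from the text, but the record kind labels ("collected", "bibliographic") and the function name are hypothetical and are not part of the TPPD design.

```python
# Tables that every accession record writes to, as listed in the text.
BASE_TABLES = ["Accession", "AccessionDetails"]

# Tables that may hold additional data, keyed by a hypothetical record kind.
EXTRA_TABLES = {
    "collected": [
        "CollectionDetails", "CollectionLocation", "RearingDetails",
        "SerialMounts", "MountStorage", "Host", "SourceNote",
    ],
    "bibliographic": ["Bibliography"],
}


def candidate_tables(record_kind):
    """Return the tables that an accession of this kind may touch."""
    if record_kind not in EXTRA_TABLES:
        raise ValueError(f"unknown record kind: {record_kind!r}")
    return BASE_TABLES + EXTRA_TABLES[record_kind]


print(candidate_tables("bibliographic"))
```

A data entry form could use such a lookup to decide which tabbed panes are relevant for the record being entered.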

Related forms for data entry should appear as tabbed panes on one user interface in an order that is consistent with the current data entry practices. The tabbed panes may be navigated by mouse and by keyboard strokes. The same interface will be used for modification of data and for searching, with appropriate mode changes controlling usage. Extra functionality, such as providing for image data, will be added on a 'nice-to-have' basis. However, the data model has been designed to cater for such add-ons.

Database records are also created where there is reputable literature that adequately describes the species or subspecies of an insect found in Tasmania. A specimen of the insect described in the literature may or may not be stored in the collection. While the only verifiable proof of the presence of a pest in the State is a specimen, the record provided by reputable literature is considered secondary evidence of presence and is an important aid in the identification of insect specimens. The current system allows such recording and interrogation, and this functionality is to be retained in the proposed TPPD. Another aid to identification of insects is the versatility of the current database interrogation. Such functionality will be retained, where required, in the proposed TPPD, through the provision of a data query tool (such as Oracle data browser).

All the support tables require insert, delete and modification support. This will be in the form of a multi-mode data entry UI for each table (around 30 tables). Bibliographic details also need to be stored, and a separate UI will be provided for that. Provision has been made for recording multiple literature records against a trinomial, and a UI is to be provided for this. When an insect is found in a 'host' other than those provided by the taxonomic hierarchy (e.g. a computer), the details are recorded and entered into the current database as a simple text string. The proposed database has a similar feature and a UI will be provided for the entry of such data.
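One way to picture a host that is either a taxon from the hierarchy or a free-text entry is sketched below. The class and field names are hypothetical, and the rule that exactly one of the two must be supplied is an assumption made for the example; the scoping document does not specify it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HostRecord:
    """Hypothetical host entry: a taxon reference or a plain text string."""
    taxon_id: Optional[int] = None    # key into the taxonomic hierarchy
    free_text: Optional[str] = None   # non-taxonomic host, e.g. "computer"

    def __post_init__(self):
        # Assumption: exactly one of the two fields is supplied.
        if (self.taxon_id is None) == (self.free_text is None):
            raise ValueError("give exactly one of taxon_id or free_text")

    def describe(self):
        if self.taxon_id is not None:
            return f"taxon #{self.taxon_id}"
        return f"other host: {self.free_text}"


print(HostRecord(free_text="computer").describe())
```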

The entomological collection consists of insects found in Tasmania and insects originating elsewhere. The latter consist of specimens captured in Tasmania by Quarantine officers and sent to DPIWE for identification, and other interesting specimens captured elsewhere by entomologists. These last are 'exotics'. It is a requirement of TPPD to be able to identify which of these categories a specimen from the collection belongs to. This data is captured at data entry time.

6. Outputs

It is a requirement for the system to also provide data interrogation and reporting functionality. The current system is very flexible in its querying capabilities. Oracle data browser will be provided to meet all the current ad-hoc querying requirements.

There are various printed reports required that provide essential support to the management of the entomology collection. These include the production of mount labels, bibliographic details against trinomials, annotated with a list of specimens, and a list of the entomology collection. Given adequate electronic search, export and reporting facilities, the need for such paper-based reports may be reduced.

In the future, a web-based querying tool could give various stakeholders access to the collection's data, particularly people with an interest in the distribution, location and/or control of plant pests (e.g. farmers).

7. Integration requirements

The TPPD system is to allow the proposed national Australian Plant Pest Database (APPD) access to some TPPD data fields. The proposed interface is through a 'gateway' servlet, provided by the CMIS division of CSIRO and placed inside the DPIWE de-militarised zone, that connects to the TPPD and feeds data to a request broker outside DPIWE systems. The request broker will send read-only requests from the APPD to the State-based plant pest databases. The data flow will be ongoing and one-way, with no update transactions allowed by the APPD on the TPPD.
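The idea of exposing only some fields through a read-only gateway can be illustrated with a minimal sketch. It is not the CSIRO servlet: it uses Python with an in-memory SQLite database as a stand-in for Oracle, and the table, column names and field whitelist are all hypothetical.

```python
import sqlite3

# Hypothetical whitelist of the fields made available to the national broker.
EXPOSED_FIELDS = {"accession_id", "species", "locality"}


def open_demo_db():
    """Build a tiny in-memory stand-in for the State database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accession "
        "(accession_id INTEGER, species TEXT, locality TEXT, internal_notes TEXT)"
    )
    conn.execute(
        "INSERT INTO accession VALUES (1, 'Example species', 'Hobart', 'internal only')"
    )
    return conn


def gateway_query(conn, requested_fields):
    """Serve a read-only request; there is deliberately no update path."""
    if not requested_fields:
        raise ValueError("no fields requested")
    blocked = [f for f in requested_fields if f not in EXPOSED_FIELDS]
    if blocked:
        raise PermissionError(f"fields not exposed: {blocked}")
    # Safe to interpolate: every name was checked against the whitelist.
    columns = ", ".join(requested_fields)
    return conn.execute(f"SELECT {columns} FROM accession").fetchall()


conn = open_demo_db()
rows = gateway_query(conn, ["accession_id", "species"])
print(rows)
```

Keeping the gateway to a single query function, with no insert or update function at all, mirrors the one-way data flow described above.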
