Invertebrata 2002
Plant pest database
Lionel Hill (Lionel.Hill@dpiwe.tas.gov.au)
[Lionel Hill sent draft no. 4 of the scoping document for this database to
Invertebrata for the interest of our readers.]
1. Business need for the system
The prime purpose of holding insect collections is to meet an international
trade obligation to document pest records. Because pest identification has many
uncertainties it is necessary to maintain physical specimens as the ultimate
documentation. The Department of Primary Industries, Water and Environment (DPIWE)
holds such a collection at the New Town Laboratories. Electronic databases hasten
retrieval of these pest records, which, using manual retrieval methods, have
taken months to collate for particular export opportunities.
A database also facilitates management of the physical collection, enabling
timely specimen retrieval, generating lists of specimens, generating labels
for specimens, tracking loans of specimens, identifying gaps in the specimen
coverage, etc. The use of a database, in conjunction with a specimen collection,
also aids State pest management. Although there are good summaries of distribution,
biology, seasonality and host preferences for a few recurring major insect pests
in Tasmania, such condensed information is frequently not available for lesser,
sporadic or new pests. However, data from the labels of physical insect specimens
can fill this gap if it can be quickly and easily collated, such as in an electronic
database.
The current DPIWE entomological database is a Microsoft Access application,
written in-house, and is used 'statewide'. A number of entomologists and technicians
work out of the New Town laboratories and require access to the database system
for day-to-day work. The Department also has a laboratory at Devonport, from
which an entomologist and other technicians work. Currently the entomology database
is housed at the Devonport lab and most data entry is carried out there. One
of the major problems with the current system is the lack of wide area capacity.
The current system has not been set up for wide area, concurrent, multi-user
access; if data entry is required to be carried out at the New Town labs, the
entire database is copied and sent to the New Town labs from Devonport. This
means that until the updated database is sent back, no further data entry can
occur at the Devonport Laboratory. Not only is there a definite risk of data
loss, corruption, duplication and inconsistency with this arrangement, but it
is also not an acceptable work practice.
The proposed Tasmanian Plant Pest Database (TPPD hereafter) addresses these
issues. It will be a true multi-user, client-server application, employing an
Oracle database and an in-house written architecture that makes use of a subset
of Java's Enterprise JavaBeans (EJB) specification. The client application can
be deployed at as many sites, and on as many machines, as may be required.
Initially, the database and the server container will be based on a server in
Hobart with client applications connecting via the Department's wide area network.
One of the benefits of having the database housed in Hobart is that use can
be made of the current data back-up regime.
The location of the database server depends on the speed of the network. If
it is found that long access times to a Hobart-based server cause problems for
data entry and general querying at the Devonport labs then alternative arrangements
will be made to house the database server in Devonport. This will be in the
form of Oracle running on a Linux machine with backups of the database carried
out by database replication to Hobart. In this scenario, one of the considerations
needs to be the access required by the Australian Plant Pest Database.
The Australian Plant Pest Database project (APPD) is a national initiative to
catalogue all plant pests found in Australia. It is to be accessed by a web-based
application, connecting to a centralised request broker. The broker talks to
the various State-based databases to retrieve the requested data via a gateway
servlet application. Although some other States are using Access databases,
the current DPIWE IT policy is to disallow the exposure of Microsoft systems
to such environments. The proposed TPPD meets the DPIWE IT policy requirements.
A system such as the proposed TPPD can also be used to answer 'helpdesk' style
queries relating to plant pests, something that the current system
cannot do without the creation of specific reports. The proposed
system will be extensible, allowing accessibility to be broadened in the future,
perhaps making use of a web-based query tool.
Other requests for access to the data stored in the current system have been
made and, while the current system provides all the functionality required by
the requests, it is not user-friendly. New users require some training in order
to make the best use of the system. Making the system more user-friendly would
mean that access could be granted without requiring the current expert users
to spend as much time with first-time users in training sessions.
2. Size
Currently the database consists of around 50,000 accession records (made up
of specimen mounts, staff diagnostic records and literature records), 12,000
taxonomic hierarchy records and around 2,000 bibliographic records. There are
approximately 150,000 specimen mounts in the entomology collection. Therefore,
when the extra specimen mount records are entered, the accession data will grow
to three times its current size. It is also envisaged that new specimen mount
records may grow the database by up to 5% this year, although this growth figure
may not be met in subsequent years. The taxonomic hierarchy records will remain
virtually static and the bibliographic records may increase slightly. All lookup
tables should remain static. This means that there are currently at least 100,000
insert transactions to be processed, although not all will be processed this
year.
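
The arithmetic behind these figures can be checked with a short calculation. This is only a rough sketch using the approximate numbers quoted above. It assumes that each specimen mount ends up as one accession record and that the "up to 5%" figure applies to the current record count. The variable names are invented for illustration.

```python
# Rough sizing check using the approximate figures quoted in this section.
current_accessions = 50_000   # accession records already in the database
specimen_mounts = 150_000     # approximate mounts in the physical collection

# Assumption: one accession record per mount once the backlog is entered.
growth_factor = specimen_mounts / current_accessions

# Records still to be inserted to reach that size.
backlog_inserts = specimen_mounts - current_accessions

# Assumption: "up to 5% this year" is applied to the current record count.
max_new_this_year = current_accessions * 5 // 100

print(growth_factor, backlog_inserts, max_new_this_year)
```

Under these assumptions the backlog matches the "at least 100,000 insert transactions" mentioned in the text, and new mounts this year would add at most a few thousand more.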
Updating of the system is currently carried out by two or three entomologists
and by contract data entry personnel; this year a total of around five people
will update the system. Access to the system is granted on an as-needs basis,
and the number of users with access will probably be no more than ten. It is
envisaged that this number will grow in the future.
3. Owners and users
The data will be provided by the Department of Primary Industries' current Plant
Pest database. Initial data load of the new system will be the responsibility
of Stephen Cooke with the assistance of Lionel Hill. The entomologists will
be responsible for the provision of data and day-to-day maintenance of the data,
with the initial assistance of Stephen Cooke.
4. Initial data model
Figure 2.1 - Entity diagram
5. Business Processes
Records of insect specimens can be created by one of three general methods of data capture: from literature, from physical collection methods and from enquiries (which may include official enquiries from Quarantine and Council officers). In the latter two cases the recording process is the same.
The samples are processed and stored on mounts, the basic (atomic) unit of the dataset. A specimen may be pinned, with a handwritten tag that may carry a description, diagnosis result and/or collection details, or preserved in a vial of alcohol (or similar), again with a tag. A third method of storage is on a microscope slide. In this case, it is common to divide the specimen and mount only the relevant part. The rest of the specimen may be further divided and stored by one or more of the storage methods mentioned. At some point after the mounting process the mount details are entered into the database. Details for the final printed labels, which are added to the mounts, are extracted from the database and formatted for printing. The mounts can then enter the entomological collection.
As can be seen from Figure 2.1, entry of specimen mount records into the database requires that the hosts, trinomial hierarchy, habitat and look-up tables be up-to-date. Entry of any accession record into the database will require data to be stored in the following tables: Accession, AccessionDetails. Depending on the collection method, the following tables may hold additional data. For a collected specimen: CollectionDetails, CollectionLocation, RearingDetails, SerialMounts, MountStorage, Host, SourceNote. For bibliographic-only records (specimens not required): Accession, Bibliography. Note that entry of an accession may require the entry of further host, habitat and/or trinomial data before the mount record can be entered.
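
The table dependencies just described can be summarised as a small lookup. This is only an illustrative sketch, not part of the TPPD design: the table names are those listed above, while the keys `collected` and `bibliographic` and the function `tables_for` are hypothetical. It also assumes that the tables needed by every accession come first.

```python
# Tables written for every accession record, per the text above.
BASE_TABLES = ["Accession", "AccessionDetails"]

# Additional tables by capture method. The dictionary keys are hypothetical
# labels; the table names are the ones listed in the text.
EXTRA_TABLES = {
    "collected": [
        "CollectionDetails", "CollectionLocation", "RearingDetails",
        "SerialMounts", "MountStorage", "Host", "SourceNote",
    ],
    "bibliographic": ["Bibliography"],
}


def tables_for(capture_method):
    """Return the tables an accession of this kind may touch, in order."""
    if capture_method not in EXTRA_TABLES:
        raise ValueError(f"unknown capture method: {capture_method!r}")
    return BASE_TABLES + EXTRA_TABLES[capture_method]


print(tables_for("bibliographic"))
```

A data entry form could use such a lookup to decide which tabbed panes to enable for a given record type.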
Related forms for data entry should appear as tabbed panes on one user interface in an order that is consistent with the current data entry practices. The tabbed panes may be navigated by mouse and by keyboard strokes. The same interface will be used for modification of data and for searching, with appropriate mode changes controlling usage. Extra functionality, such as providing for image data, will be added on a 'nice-to-have' basis. However, the data model has been designed to cater for such add-ons.
Database records are also created in cases where there is reputable literature that adequately describes the species/subspecies of an insect found in Tasmania. A specimen of the insect described in the literature may or may not be stored in the collection. While the only verifiable proof of the presence of a pest in the State is a specimen, the record provided by reputable literature is considered secondary evidence of presence and is an important aid in the identification of insect specimens. The current system allows such recording and interrogating, and this functionality is to be retained in the proposed TPPD. Another aid to identification of insects is the versatility of the current database interrogation. Such functionality will be retained, where required, in the proposed TPPD, through the provision of a data query tool (such as Oracle data browser).
All the support tables require insert, delete and modification support. This will take the form of a multi-mode data entry UI for each table, of which there are around 30. Bibliographic details are also required to be stored, and a separate UI will be provided for that. Provision has been made to enable the recording of multiple literature records against a trinomial, and a UI is to be provided for this. When an insect is found in a 'host' other than those provided by the taxonomic hierarchy (e.g. a computer), the details are recorded and entered into the current database as a simple text string. The proposed database has a similar feature and a UI will be provided for the entry of such data.
The entomological collection consists of insects found in Tasmania and insects originating elsewhere. The latter consist of specimens captured in Tasmania by Quarantine officers and sent to DPIWE for identification, and other interesting specimens captured elsewhere by entomologists. These last are 'exotics'. It is a requirement of TPPD to be able to identify which of these categories a specimen from the collection belongs to. This data is captured at data entry time.
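
One way to picture the three categories is as an enumeration recorded with each specimen. The sketch below is an assumption made for illustration only: the names `Origin` and `classify` and the two input flags are hypothetical, and the actual TPPD may simply have the operator choose the category directly at data entry time.

```python
from enum import Enum


class Origin(Enum):
    """The three specimen categories described in the text."""
    TASMANIAN = "found in Tasmania"
    QUARANTINE_INTERCEPT = "captured in Tasmania by Quarantine, originating elsewhere"
    EXOTIC = "captured elsewhere by entomologists"


def classify(captured_in_tasmania, quarantine_interception):
    """Map two hypothetical data entry flags onto a category."""
    if not captured_in_tasmania:
        return Origin.EXOTIC
    if quarantine_interception:
        return Origin.QUARANTINE_INTERCEPT
    return Origin.TASMANIAN


print(classify(captured_in_tasmania=True, quarantine_interception=True).name)
```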
6. Outputs
It is a requirement for the system to also provide data interrogation and reporting functionality. The current system is very flexible in its querying capabilities. Oracle data browser will be provided to meet all the current ad-hoc querying requirements.
There are various printed reports required that provide essential support to the management of the entomology collection. These include the production of mount labels, bibliographic details against trinomials, annotated with a list of specimens, and a list of the entomology collection. Given adequate electronic search, export and reporting facilities, the need for such paper-based reports may be reduced.
In the future, the provision of a web-based querying tool could give various stakeholders access to the collection's data, particularly people who have an interest in the distribution, location and/or control of plant pests (e.g. farmers).
7. Integration requirements
The TPPD system is to allow the proposed national Australian Plant Pest Database (APPD) access to some TPPD data fields. The proposed interface is through a 'gateway' servlet, provided by the CMIS division of CSIRO and placed inside the DPIWE de-militarised zone, that connects to the TPPD and feeds data to a request broker outside DPIWE systems. The request broker will send read-only requests from the APPD to the State-based plant pest databases. The data flow will be ongoing and one-way, with no update transactions allowed by the APPD on the TPPD.
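
The read-only, partial-field access described above can be sketched as a simple filter. This is not the CSIRO gateway servlet, whose interface is not described here. The field names in `EXPOSED_FIELDS`, the operation labels and the function `handle_request` are all hypothetical, chosen only to show the idea of refusing updates and returning a subset of fields.

```python
# Hypothetical allow-list: the text says only "some TPPD data fields" are exposed.
EXPOSED_FIELDS = {"species", "host", "locality"}


def handle_request(operation, record):
    """Serve a broker request against one record (a plain dict here)."""
    # The data flow is one-way: anything other than a read is refused.
    if operation != "read":
        raise PermissionError("the gateway accepts read-only requests")
    # Return only the fields on the allow-list.
    return {name: value for name, value in record.items() if name in EXPOSED_FIELDS}


sample = {"species": "Example species", "host": "apple", "collector": "A. Person"}
print(handle_request("read", sample))
```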