DESCRIPTION
Update of 30 March 2005:
This Engineering Advisory (EA) is being updated to include an
additional correction in the cam_changer code. The Early Release
Patches (ERPs) identified in the EA now contain two fixes:
- A potential dead lock condition that manifests as an
application hang
- A potential kernel memory fault (KMF)
The fixes are described below.
A. Potential Dead Lock
A dead lock condition can occur in the cam_changer code while a
changer application is accessing the media changer. This condition is
likely to occur after a previous changer access has failed with the
error "SCSI device busy," because the access lock might not be
properly released in that case. The dead lock condition manifests as
a changer application hang.
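The lock behavior described above can be illustrated with a minimal
user-space sketch. This is not the cam_changer source, and the
advisory does not show the actual fix. All names here (struct changer,
issue_command, changer_access_leaky, changer_access_fixed,
changer_lock_is_free) are hypothetical, and a POSIX mutex stands in
for the kernel access lock. The sketch only shows how an early return
on a busy error can leave a lock held, and how routing every path
through the unlock avoids it.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the changer state; not the real structure. */
struct changer {
    pthread_mutex_t access_lock;  /* serializes changer accesses */
    bool device_busy;             /* simulates a "SCSI device busy" reply */
};

/* Hypothetical command issue: fails with EBUSY while the device is busy. */
static int issue_command(const struct changer *ch)
{
    return ch->device_busy ? EBUSY : 0;
}

/* Leaky pattern: the error path returns without releasing the lock,
 * so the next access blocks forever (seen as an application hang). */
int changer_access_leaky(struct changer *ch)
{
    pthread_mutex_lock(&ch->access_lock);
    int err = issue_command(ch);
    if (err != 0)
        return err;               /* access_lock is still held here */
    pthread_mutex_unlock(&ch->access_lock);
    return 0;
}

/* Safer pattern: success and failure both pass through the unlock. */
int changer_access_fixed(struct changer *ch)
{
    pthread_mutex_lock(&ch->access_lock);
    int err = issue_command(ch);
    pthread_mutex_unlock(&ch->access_lock);
    return err;
}

/* Reports whether the access lock is currently free, without blocking. */
bool changer_lock_is_free(struct changer *ch)
{
    if (pthread_mutex_trylock(&ch->access_lock) != 0)
        return false;
    int rc = pthread_mutex_unlock(&ch->access_lock);
    assert(rc == 0);
    (void)rc;
    return true;
}
```

In this sketch, a busy failure in changer_access_fixed leaves the lock
free for the next caller, whereas the same failure in
changer_access_leaky leaves it held, so any later blocking access
would hang.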
B. Potential KMF
A KMF can occur in the cam_changer code if more than one thread
attempts to access the changer at the same time. For example, if a
robot is moving tapes and another thread opens the same device, the
panic is likely to occur. The panic is more likely to occur if
the customer is running tape backup software that uses multiple
changer threads.
When the problem occurs, the customer loses any data written to the
tape library before final tape marks have been written. To recover,
the backup process must be restarted, which can make it harder to
complete backups within allocated production time frames.
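The concurrent-access condition described above for the KMF can be
illustrated with a generic user-space sketch. The advisory does not
state the root cause or show the patched code, so this is only an
assumption-laden example of one common way to keep two threads from
operating on the same device at once. All names (struct changer_dev,
changer_begin, changer_end, count_winners, MAX_WORKERS) are
hypothetical, and a POSIX mutex stands in for any kernel lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_WORKERS 16   /* hypothetical cap for this demonstration */

/* Hypothetical per-device state; not the real cam_changer structure. */
struct changer_dev {
    pthread_mutex_t lock;  /* protects in_use */
    bool in_use;           /* true while one thread owns the changer */
};

/* Claims the changer for the caller, or returns EBUSY if it is owned. */
int changer_begin(struct changer_dev *dev)
{
    int err = 0;

    pthread_mutex_lock(&dev->lock);
    if (dev->in_use)
        err = EBUSY;
    else
        dev->in_use = true;
    pthread_mutex_unlock(&dev->lock);
    return err;
}

/* Releases the changer so that another thread can claim it. */
void changer_end(struct changer_dev *dev)
{
    pthread_mutex_lock(&dev->lock);
    dev->in_use = false;
    pthread_mutex_unlock(&dev->lock);
}

struct worker {
    struct changer_dev *dev;
    bool won;              /* set if this thread claimed the changer */
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;

    w->won = (changer_begin(w->dev) == 0);
    return NULL;
}

/* Starts nthreads threads that all try to claim the changer and
 * returns how many succeeded. No thread releases its claim. */
int count_winners(struct changer_dev *dev, int nthreads)
{
    pthread_t tids[MAX_WORKERS];
    struct worker workers[MAX_WORKERS];
    int i, winners = 0;

    if (nthreads > MAX_WORKERS)
        nthreads = MAX_WORKERS;
    for (i = 0; i < nthreads; i++) {
        workers[i].dev = dev;
        workers[i].won = false;
        int rc = pthread_create(&tids[i], NULL, worker_main, &workers[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (i = 0; i < nthreads; i++) {
        pthread_join(tids[i], NULL);
        winners += workers[i].won ? 1 : 0;
    }
    return winners;
}
```

With a free device, count_winners reports a single successful claim
no matter how many threads compete, and the losing threads receive
EBUSY instead of touching state that another thread is using.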
The following is an example of a typical changer KMF stack trace:
crash> tf
 0 stop_secondary_cpu       src/kernel/arch/alpha/cpu.c : 1398
 1 panic                    src/kernel/bsd/subr_prf.c : 1325
 2 event_timeout            src/kernel/arch/alpha/cpu.c : 2348
 3 printf                   src/kernel/bsd/subr_prf.c : 1008
 4 panic                    src/kernel/bsd/subr_prf.c : 1382
 5 trap                     src/kernel/arch/alpha/trap.c : 2285
 6 _XentMM                  src/kernel/arch/alpha/locore.s : 2237
 7 ccmn_record_eei_status3  src/kernel/io/cam/pdrv3_common.c : 4763
 8 changer_complete         src/kernel/io/cam/cam_changer.c : 7681
 9 xpt_callback_thread      src/kernel/io/cam/xpt.c : 3357