View Issue Details

ID: 0001466
Project: channel: elrepo/el9
Category: kmod-mpt3sas
View Status: public
Last Update: 2024-07-25 09:59
Reporter: cwoelkers
Assigned To: pperry
Priority: high
Severity: major
Reproducibility: always
Status: assigned
Resolution: open
Platform: x86_64
OS: Rocky Linux
OS Version: 9.4
Summary: 0001466: mpt3sas driver may be causing mtx errors
Description: HBA in use is an LSI SAS2116-based controller.
When our backup software (Bacula) goes to change tapes in our autochanger, it fails with the following error messages. The first three lines are from Bacula, the remainder from mtx.
bacula-sd JobId 467772: 3307 Issuing autochanger "unload Volume 000122L6, Slot 61, Drive 3" command.
bacula-sd JobId 467772: 3995 Bad autochanger "unload Volume 000122L6, Slot 61, Drive 3": ERR=Child exited with code 1
Results=Unloading drive 3 into Storage Element 61...Unloading drive 3 into Storage Element 61...Unloading drive 3 into Storage Element 61...Unloading drive 3 into Storage Element 61...Unloading drive 3 into Storage Element 61...mtx: Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=no
mtx: Request Sense: Error Code=70 (Current)
mtx: Request Sense: Sense Key=Illegal Request
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Additional Sense Code = 3B
mtx: Request Sense: Additional Sense Qualifier = 90
mtx: Request Sense: Field in Error = 00
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=yes
mtx: Request Sense: SKSV=yes
mtx: Request Sense: Field Pointer = 00 04
MOVE MEDIUM from Element Address 1005 to 3061 Failed
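
For anyone comparing these reports, the mtx sense lines can be pulled into key/value pairs with a short script. This is only a minimal sketch and not part of mtx or Bacula: the function name `parse_mtx_sense` is made up for illustration, and the pattern assumes the `mtx: Request Sense: Name=Value` line format seen in the output above.

```python
import re

# Matches lines such as "mtx: Request Sense: Sense Key=Illegal Request".
# The pattern is an assumption based on the output pasted in this report.
SENSE_LINE = re.compile(
    r"mtx: Request Sense: (?P<name>[^=]+?)\s*=\s*(?P<value>.*)"
)


def parse_mtx_sense(text):
    """Return a dict mapping sense field names to values found in mtx output."""
    fields = {}
    for line in text.splitlines():
        # search() rather than match(): the first sense line can be glued
        # onto the end of the "Unloading drive ..." progress message.
        match = SENSE_LINE.search(line)
        if match:
            fields[match.group("name").strip()] = match.group("value").strip()
    return fields


# Excerpt of the output from this report.
sample = """\
Unloading drive 3 into Storage Element 61...mtx: Request Sense: Long Report=yes
mtx: Request Sense: Sense Key=Illegal Request
mtx: Request Sense: Additional Sense Code = 3B
mtx: Request Sense: Additional Sense Qualifier = 90
mtx: Request Sense: Error in CDB=yes
mtx: Request Sense: Field Pointer = 00 04
MOVE MEDIUM from Element Address 1005 to 3061 Failed
"""

sense = parse_mtx_sense(sample)
print(sense["Sense Key"],
      sense["Additional Sense Code"],
      sense["Additional Sense Qualifier"])
```

Lines that do not carry a sense field, such as the final `MOVE MEDIUM ... Failed` line, are simply skipped.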

I have already contacted Bacula support, which we pay for, and they have stated it is not an issue with Bacula which I concur with.
I have tried unloading and moving tapes with mtx directly and it fails with the same error messages, apart from the final line, which changes with the drive and slot. Checking the tape library after running the command shows that the tape is not unloaded from the drive, placed in the carrier, or moved to its slot. I have also tested moving tapes via the tape library's interface and that works with no errors. This error occurs with all four tape drives installed in the library, so I'm reluctant to blame the hardware, be it the HBA, cables, drives, or the library as a whole.
This seems to be an issue with either mtx or mpt3sas. As mtx is no longer in active development, or I cannot find the active development site, I'm posting here first. If this isn't an mpt3sas issue I'll drop the issue into the RHEL bugzilla and see what they say.
Steps To Reproduce: Have an autochanger connected to an LSI SAS2116-based HBA with a tape loaded in a drive.
Use mtx to move the tape from the drive to its slot, 'mtx -f /dev/sg5 unload 70 0'.
Additional Information: The tape library is a Qualstar RLS-84000 holding 50 tapes, with 4 IO slots and two drives, plus a library extension holding an additional 114 tapes, also with 4 IO slots, and the two remaining tape drives. All drives are IBM LTO 6.
Tags: No tags attached.

Activities

toracat

2024-07-01 12:30

administrator   ~0009917

Could you test-install kernel-ml [1]? With this you will be testing to see if it is fixed in the latest mainline kernel from kernel.org. If it shows the same issue, then you'd need to report it to bugzilla.kernel.org.

[1] https://elrepo.org/wiki/doku.php?id=kernel-ml

cwoelkers

2024-07-01 13:39

reporter   ~0009918

I cannot install the updated kernel as it is a production system. I'll see if I can set up a test system with the same adapter, I think I have an extra, and test out the kernel there.

cwoelkers

2024-07-01 16:48

reporter   ~0009921

Testing on a secondary system with the same adapter shows the same error under the Rocky kernel. When I installed the latest ELRepo kernel, 6.9.7-1.el9.elrepo.x86_64, the errors continued to show up. Here is the latest error.
[root@tape-test ~]# mtx -f /dev/sg4 unload 70 0
Unloading drive 0 into Storage Element 70...mtx: Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=no
mtx: Request Sense: Error Code=70 (Current)
mtx: Request Sense: Sense Key=Illegal Request
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Additional Sense Code = 3B
mtx: Request Sense: Additional Sense Qualifier = 90
mtx: Request Sense: Field in Error = 00
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=yes
mtx: Request Sense: SKSV=yes
mtx: Request Sense: Field Pointer = 00 04
MOVE MEDIUM from Element Address 1001 to 3070 Failed

toracat

2024-07-02 13:20

administrator   ~0009932

Last edited: 2024-07-02 13:22

I expect kernel-ml-6.10 to be released in a couple of weeks. I see some source code updates for mpt3sas, so you may want to give it a try.

cwoelkers

2024-07-25 09:59

reporter   ~0009986

The 6.10 kernel did not help matters. Here is the output.

[root@tape-test ~]# uname -r
6.10.1-1.el9.elrepo.x86_64
[root@tape-test ~]# mtx -f /dev/sg4 unload 70 2
Unloading drive 2 into Storage Element 70...mtx: Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=no
mtx: Request Sense: Error Code=70 (Current)
mtx: Request Sense: Sense Key=Illegal Request
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Additional Sense Code = 3B
mtx: Request Sense: Additional Sense Qualifier = 90
mtx: Request Sense: Field in Error = 00
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=yes
mtx: Request Sense: SKSV=yes
mtx: Request Sense: Field Pointer = 00 04
MOVE MEDIUM from Element Address 1004 to 3070 Failed
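
One way to read the repeated `Field Pointer = 00 04` together with `Error in CDB=yes`: for an Illegal Request, the field pointer is a byte offset into the command that was rejected. The sketch below is a rough illustration under several assumptions: that mtx sends the standard 12-byte SCSI MOVE MEDIUM command (opcode A5h), that the element addresses printed by mtx are decimal, and that the transport element address is 0 (it is not shown in the output). The helper names are hypothetical. It does not explain the 3B/90 code itself, since qualifiers of 80h and above are vendor-specific.

```python
import struct

MOVE_MEDIUM = 0xA5  # SCSI MOVE MEDIUM operation code

# Byte ranges of the 12-byte MOVE MEDIUM CDB (standard layout, assumed here).
CDB_FIELDS = [
    (0, 0, "operation code"),
    (1, 1, "reserved"),
    (2, 3, "medium transport address"),
    (4, 5, "source address"),
    (6, 7, "destination address"),
    (8, 9, "reserved"),
    (10, 10, "invert"),
    (11, 11, "control"),
]


def build_move_medium(transport, source, destination):
    """Pack a MOVE MEDIUM CDB with big-endian 16-bit element addresses."""
    return struct.pack(">BBHHHHBB", MOVE_MEDIUM, 0,
                       transport, source, destination, 0, 0, 0)


def parse_field_pointer(text):
    """Turn mtx's two hex bytes (e.g. '00 04') into an integer byte offset."""
    high, low = (int(part, 16) for part in text.split())
    return (high << 8) | low


def field_at(offset):
    """Name the CDB field that contains the given byte offset."""
    for first, last, name in CDB_FIELDS:
        if first <= offset <= last:
            return name
    raise ValueError(f"offset {offset} is outside a 12-byte CDB")


# Addresses taken from the failing command above; transport=0 is a guess.
cdb = build_move_medium(transport=0, source=1004, destination=3070)
offset = parse_field_pointer("00 04")

print(" ".join(f"{byte:02x}" for byte in cdb))
print(f"field pointer {offset} falls in: {field_at(offset)}")
```

If those assumptions hold, the library is objecting to the source element (the drive) rather than to the destination slot, which would be consistent with the same sense data appearing for every drive tried so far.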

Issue History

Date Modified Username Field Change
2024-07-01 10:53 cwoelkers New Issue
2024-07-01 10:53 cwoelkers Status new => assigned
2024-07-01 10:53 cwoelkers Assigned To => pperry
2024-07-01 12:30 toracat Note Added: 0009917
2024-07-01 13:39 cwoelkers Note Added: 0009918
2024-07-01 16:48 cwoelkers Note Added: 0009921
2024-07-02 13:20 toracat Note Added: 0009932
2024-07-02 13:22 toracat Note Edited: 0009932
2024-07-25 09:59 cwoelkers Note Added: 0009986