[SCSI] add global timeout to the scsi mid-layer
There are certain rogue devices (and the aic7xxx driver) that return
BUSY or QUEUE_FULL forever. This code will apply a global timeout (of
the total number of retries times the per command timer) to a given
command. If it is exceeded, the command is completed regardless of its
state.
The patch also removes the unused field in the command: timeout and
timeout_total.
This solves the problem of detecting an endless loop in the mid-layer
because of BUSY/QUEUE_FULL bouncing, but will not recover the device.
In the aic7xxx case, the driver can be recovered by sending a bus reset,
so possibly this should be tied into the error handler?
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index d1aa95d..4befbc2 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -268,6 +268,7 @@
} else
put_device(&dev->sdev_gendev);
+ cmd->jiffies_at_alloc = jiffies;
return cmd;
}
EXPORT_SYMBOL(scsi_get_command);
@@ -798,9 +799,23 @@
while (!list_empty(&local_q)) {
struct scsi_cmnd *cmd = list_entry(local_q.next,
struct scsi_cmnd, eh_entry);
+ /* The longest time any command should be outstanding is the
+ * per command timeout multiplied by the number of retries.
+ *
+ * For a typical command, this is 2.5 minutes */
+ unsigned long wait_for
+ = cmd->allowed * cmd->timeout_per_command;
list_del_init(&cmd->eh_entry);
disposition = scsi_decide_disposition(cmd);
+ if (disposition != SUCCESS &&
+ time_before(cmd->jiffies_at_alloc + wait_for, jiffies)) {
+ dev_printk(KERN_ERR, &cmd->device->sdev_gendev,
+ "timing out command, waited %ds\n",
+ wait_for/HZ);
+ disposition = SUCCESS;
+ }
+
scsi_log_completion(cmd, disposition);
switch (disposition) {
case SUCCESS: