[BNX2]: Fix rtnl deadlock in bnx2_close

This fixes an rtnl deadlock problem when flush_scheduled_work() is
called from bnx2_close(). In rare cases, linkwatch_event() may be on
the workqueue from a previous close of a different device and it will
try to get the rtnl lock which is already held by dev_close().

The fix is to set a flag while the reset task, which runs from the
workqueue, is executing. bnx2_close() then loops until the flag is
cleared. As suggested by Jeff Garzik, the loop calls msleep(1) instead
of the yield() used in the original patch.
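
The flag-and-poll pattern described above can be sketched in user space
roughly as follows. This is only an illustration, not the driver code:
struct fake_dev, fake_reset_task(), fake_close(), sleep_ms() and
run_demo() are hypothetical names, a pthread stands in for the
workqueue, and C11 atomics stand in for the plain int field used in the
driver. The "started" field exists only to make the demo deterministic.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical user-space stand-in for the driver's private state. */
struct fake_dev {
	atomic_int running;        /* stand-in for netif_running() */
	atomic_int in_reset_task;  /* nonzero while the deferred reset runs */
	atomic_int started;        /* demo only: worker has entered the reset */
	atomic_int resets_done;    /* demo only: number of completed resets */
};

/* User-space stand-in for msleep(). */
static void sleep_ms(long ms)
{
	struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };

	nanosleep(&ts, NULL);
}

/* Plays the role of the reset task that runs from the workqueue. */
static void *fake_reset_task(void *data)
{
	struct fake_dev *d = data;

	if (!atomic_load(&d->running))
		return NULL;

	atomic_store(&d->in_reset_task, 1);
	atomic_store(&d->started, 1);
	sleep_ms(20);                        /* pretend to reinitialize the NIC */
	atomic_fetch_add(&d->resets_done, 1);
	atomic_store(&d->in_reset_task, 0);
	return NULL;
}

/* Plays the role of close: poll the flag instead of flushing the queue. */
static void fake_close(struct fake_dev *d)
{
	while (atomic_load(&d->in_reset_task))
		sleep_ms(1);
	atomic_store(&d->running, 0);
}

/* Returns the number of resets that had completed when close returned. */
static int run_demo(void)
{
	struct fake_dev d;
	pthread_t worker;
	int done;

	atomic_init(&d.running, 1);
	atomic_init(&d.in_reset_task, 0);
	atomic_init(&d.started, 0);
	atomic_init(&d.resets_done, 0);

	if (pthread_create(&worker, NULL, fake_reset_task, &d) != 0)
		return -1;

	while (!atomic_load(&d.started))     /* wait until the reset is under way */
		sleep_ms(1);

	fake_close(&d);                      /* must not return mid-reset */
	done = atomic_load(&d.resets_done);

	pthread_join(worker, NULL);
	return done;
}
```

Calling run_demo() from a small main() should show that fake_close()
only returns once the in-flight reset has finished. The sketch waits
only on its own flag and never flushes a shared queue, so it cannot
block behind unrelated work that needs a lock the caller holds, which
is the property the patch relies on.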

flush_scheduled_work() is also moved to bnx2_remove_one() before the
netdev is freed.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan 2005-08-25 15:34:29 -07:00 committed by David S. Miller
parent 2373ce1ca0
commit afdc08b9f9
2 changed files with 15 additions and 1 deletions

@@ -3975,12 +3975,17 @@ bnx2_reset_task(void *data)
 {
 	struct bnx2 *bp = data;
 	if (!netif_running(bp->dev))
 		return;
+	bp->in_reset_task = 1;
 	bnx2_netif_stop(bp);
 	bnx2_init_nic(bp);
 	atomic_set(&bp->intr_sem, 1);
 	bnx2_netif_start(bp);
+	bp->in_reset_task = 0;
 }
 static void
@@ -4172,7 +4177,13 @@ bnx2_close(struct net_device *dev)
 	struct bnx2 *bp = dev->priv;
 	u32 reset_code;
-	flush_scheduled_work();
+	/* Calling flush_scheduled_work() may deadlock because
+	 * linkwatch_event() may be on the workqueue and it will try to get
+	 * the rtnl_lock which we are holding.
+	 */
+	while (bp->in_reset_task)
+		msleep(1);
 	bnx2_netif_stop(bp);
 	del_timer_sync(&bp->timer);
 	if (bp->wol)
@@ -5453,6 +5464,8 @@ bnx2_remove_one(struct pci_dev *pdev)
 	struct net_device *dev = pci_get_drvdata(pdev);
 	struct bnx2 *bp = dev->priv;
+	flush_scheduled_work();
 	unregister_netdev(dev);
 	if (bp->regview)

@@ -3874,6 +3874,7 @@ struct bnx2 {
 	int timer_interval;
 	struct timer_list timer;
 	struct work_struct reset_task;
+	int in_reset_task;
 	/* Used to synchronize phy accesses. */
 	spinlock_t phy_lock;