phylib: Fix race between returning phydev and calling adjust_link
It is possible that phylib will call adjust_link before returning
from {,of_}phy_connect(), which may cause the following [very rare,
though] oops upon reopening the device:
Unable to handle kernel paging request for data at address 0x0000024c
Oops: Kernel access of bad area, sig: 11 [#1]
PREEMPT SMP NR_CPUS=2 LTT NESTING LEVEL : 0
P1021 RDB
Modules linked in:
NIP: c0345dac LR: c0345dac CTR: c0345d84
TASK = dffab6b0[30] 'events/0' THREAD: c0d24000 CPU: 0
[...]
NIP [c0345dac] adjust_link+0x28/0x19c
LR [c0345dac] adjust_link+0x28/0x19c
Call Trace:
[c0d25f00] [000045e1] 0x45e1 (unreliable)
[c0d25f30] [c036c158] phy_state_machine+0x3ac/0x554
[...]
Here is why. Drivers store phydev in their private structures, e.g.
gianfar driver:
static int init_phy(struct net_device *dev)
{
...
priv->phydev = of_phy_connect(...);
...
}
So that adjust_link could retrieve it back:
static void adjust_link(struct net_device *dev)
{
...
struct phy_device *phydev = priv->phydev;
...
}
If the device has been opened before, then phydev->state is set to
PHY_HALTED (or undefined if the driver didn't call phy_stop()).
Now, phy_connect starts the PHY state machine before returning phydev to
the driver:
phy_start_machine(phydev, NULL);
if (phydev->irq > 0)
phy_start_interrupts(phydev);
return phydev;
The time between 'phy_start_machine()' and 'return phydev' is undefined.
The start machine routine delays execution for 1 second, which is enough
for most cases. But under heavy load, or if you're unlucky, it is quite
possible that PHY state machine will execute before phy_connect()
returns, and so adjust_link callback will try to dereference phydev,
which is not yet ready.
To fix the issue, simply initialize the PHY's state to PHY_READY during
phy_attach(). This will ensure that phylib won't call adjust_link before
phy_start().
Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1 file changed