clk_disable() previously used an ARM barrier, wmb(), to try to ensure
that the hardware write completed before continuing. There are some
problems with this approach.
The first problem is that wmb() only ensures that the write leaves the
ARM -- not that it actually reaches the endpoint device. In this
case, the endpoint device - either the PRM, CM, or SCM - is three
interconnects away from the ARM, and the final interconnect is
low-speed. And the OCP interconnects will post the write, who knows
how long that will take to complete. So the wmb() is not really what
we want.
Worse, the wmb() is indiscriminate; it will cause the ARM to flush any
other unrelated buffered writes and wait for the local interconnect to
acknowledge them - potentially very expensive.
This first problem could be fixed by doing a readback of the same PRM/CM/SCM
register. Since these devices use a single OCP thread, this will cause the
MPU to wait for the write to complete.
But the primary problem is a conceptual one: clk_disable() should not
need any kind of barrier. clk_enable() needs one since device driver
code must not access a device until its clocks are known to be
enabled. But clk_disable() has no such restriction.
Since blocking the MPU on a PRM/CM/SCM write can be a very
high-latency operation - several hundred MPU cycles - it's worth
avoiding this barrier if possible.
linux-omap source commit is f4aacad2c0ed1055622d5c1e910befece24ef0e2.
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>