mbox series

[RFT,0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq

Message ID 2811202.iOFZ6YHztY@kreacher (mailing list archive)
Headers show
Series cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq | expand

Message

Rafael J. Wysocki Oct. 16, 2019, 10:37 a.m. UTC
Hi All,

The motivation for this series is to address the problem discussed here:

https://lore.kernel.org/linux-pm/5ad2624194baa2f53acc1f1e627eb7684c577a19.1562210705.git.viresh.kumar@linaro.org/T/#md2d89e95906b8c91c15f582146173dce2e86e99f

and also reported here:

https://lore.kernel.org/linux-pm/20191015155735.GA29105@bogus/

Plus, generally speaking, using the policy CPU as a proxy for the policy
with respect to PM QoS does not feel particularly straightforward to me
and adds extra complexity.

Anyway, the first patch adds frequency QoS that is based on "raw" PM QoS (kind
of in analogy with device PM QoS) and is just about min and max frequency
requests (no direct relationship to devices).

The second patch switches over cpufreq and its users to the new frequency QoS.
[The Fixes: tag has been tentatively added to it.]

The third one removes frequency request types from device PM QoS.

Unfortunately, the patches are rather big, but also they are quite
straightforward.

I didn't have the time to test this series, so giving it a go would be much
appreciated.

Thanks,
Rafael

Comments

Sudeep Holla Oct. 16, 2019, 2:23 p.m. UTC | #1
On Wed, Oct 16, 2019 at 12:37:58PM +0200, Rafael J. Wysocki wrote:
> Hi All,
>
> The motivation for this series is to address the problem discussed here:
>
> https://lore.kernel.org/linux-pm/5ad2624194baa2f53acc1f1e627eb7684c577a19.1562210705.git.viresh.kumar@linaro.org/T/#md2d89e95906b8c91c15f582146173dce2e86e99f
>
> and also reported here:
>
> https://lore.kernel.org/linux-pm/20191015155735.GA29105@bogus/
>
> Plus, generally speaking, using the policy CPU as a proxy for the policy
> with respect to PM QoS does not feel particularly straightforward to me
> and adds extra complexity.
>
> Anyway, the first patch adds frequency QoS that is based on "raw" PM QoS (kind
> of in analogy with device PM QoS) and is just about min and max frequency
> requests (no direct relationship to devices).
>
> The second patch switches over cpufreq and its users to the new frequency QoS.
> [The Fixes: tag has been tentatively added to it.]
>
> The third one removes frequency request types from device PM QoS.
>
> Unfortunately, the patches are rather big, but also they are quite
> straightforward.
>
> I didn't have the time to test this series, so giving it a go would be much
> appreciated.

Thanks for the spinning these patches so quickly.

I did give it a spin, but unfortunately it doesn't fix the bug I reported.
So I looked at my bug report in detail and looks like the cpufreq_driver
variable is set to NULL at that point and it fails to dereference it
while trying to execute:
	ret = cpufreq_driver->verify(new_policy);
(Hint verify is at offset 0x1c/28)

So I suspect some race as this platform with bL switcher tries to
unregister and re-register the cpufreq driver during the boot.

I need to spend more time on this as reverting the initial PM QoS patch
to cpufreq.c makes the issue disappear.

--
Regards,
Sudeep
Viresh Kumar Oct. 17, 2019, 9:46 a.m. UTC | #2
On 16-10-19, 12:37, Rafael J. Wysocki wrote:
> Hi All,
> 
> The motivation for this series is to address the problem discussed here:
> 
> https://lore.kernel.org/linux-pm/5ad2624194baa2f53acc1f1e627eb7684c577a19.1562210705.git.viresh.kumar@linaro.org/T/#md2d89e95906b8c91c15f582146173dce2e86e99f
> 
> and also reported here:
> 
> https://lore.kernel.org/linux-pm/20191015155735.GA29105@bogus/
> 
> Plus, generally speaking, using the policy CPU as a proxy for the policy
> with respect to PM QoS does not feel particularly straightforward to me
> and adds extra complexity.
> 
> Anyway, the first patch adds frequency QoS that is based on "raw" PM QoS (kind
> of in analogy with device PM QoS) and is just about min and max frequency
> requests (no direct relationship to devices).
> 
> The second patch switches over cpufreq and its users to the new frequency QoS.
> [The Fixes: tag has been tentatively added to it.]
> 
> The third one removes frequency request types from device PM QoS.
> 
> Unfortunately, the patches are rather big, but also they are quite
> straightforward.
> 
> I didn't have the time to test this series, so giving it a go would be much
> appreciated.

Apart from the minor comment on one of the patches, these look okay to me.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Viresh Kumar Oct. 17, 2019, 9:57 a.m. UTC | #3
On 16-10-19, 15:23, Sudeep Holla wrote:
> Thanks for the spinning these patches so quickly.
> 
> I did give it a spin, but unfortunately it doesn't fix the bug I reported.
> So I looked at my bug report in detail and looks like the cpufreq_driver
> variable is set to NULL at that point and it fails to dereference it
> while trying to execute:
> 	ret = cpufreq_driver->verify(new_policy);
> (Hint verify is at offset 0x1c/28)
> 
> So I suspect some race as this platform with bL switcher tries to
> unregister and re-register the cpufreq driver during the boot.
> 
> I need to spend more time on this as reverting the initial PM QoS patch
> to cpufreq.c makes the issue disappear.

Is this easily reproducible ? cpufreq_driver == NULL shouldn't be the case, it
get updated only once while registering/unregistering cpufreq drivers. That is
the last thing which can go wrong from my point of view :)
Sudeep Holla Oct. 17, 2019, 9:59 a.m. UTC | #4
On Thu, Oct 17, 2019 at 03:27:25PM +0530, Viresh Kumar wrote:
> On 16-10-19, 15:23, Sudeep Holla wrote:
> > Thanks for the spinning these patches so quickly.
> >
> > I did give it a spin, but unfortunately it doesn't fix the bug I reported.
> > So I looked at my bug report in detail and looks like the cpufreq_driver
> > variable is set to NULL at that point and it fails to dereference it
> > while trying to execute:
> > 	ret = cpufreq_driver->verify(new_policy);
> > (Hint verify is at offset 0x1c/28)
> >
> > So I suspect some race as this platform with bL switcher tries to
> > unregister and re-register the cpufreq driver during the boot.
> >
> > I need to spend more time on this as reverting the initial PM QoS patch
> > to cpufreq.c makes the issue disappear.
>
> Is this easily reproducible ? cpufreq_driver == NULL shouldn't be the case, it
> get updated only once while registering/unregistering cpufreq drivers. That is
> the last thing which can go wrong from my point of view :)
>

Yes, if I boot my TC2 with bL switcher enabled, it always crashes on boot.

--
Regards,
Sudeep
Rafael J. Wysocki Oct. 17, 2019, 4:34 p.m. UTC | #5
On Thu, Oct 17, 2019 at 12:00 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> On Thu, Oct 17, 2019 at 03:27:25PM +0530, Viresh Kumar wrote:
> > On 16-10-19, 15:23, Sudeep Holla wrote:
> > > Thanks for the spinning these patches so quickly.
> > >
> > > I did give it a spin, but unfortunately it doesn't fix the bug I reported.
> > > So I looked at my bug report in detail and looks like the cpufreq_driver
> > > variable is set to NULL at that point and it fails to dereference it
> > > while trying to execute:
> > >     ret = cpufreq_driver->verify(new_policy);
> > > (Hint verify is at offset 0x1c/28)
> > >
> > > So I suspect some race as this platform with bL switcher tries to
> > > unregister and re-register the cpufreq driver during the boot.
> > >
> > > I need to spend more time on this as reverting the initial PM QoS patch
> > > to cpufreq.c makes the issue disappear.

I guess you mean commit 67d874c3b2c6 ("cpufreq: Register notifiers
with the PM QoS framework")?

That would make sense, because it added the cpufreq_notifier_min() and
cpufreq_notifier_max() that trigger handle_update() via
schedule_work().

[BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
that the new min is less than the new max, because the QoS doesn't do
that.]

> > Is this easily reproducible ? cpufreq_driver == NULL shouldn't be the case, it
> > get updated only once while registering/unregistering cpufreq drivers. That is
> > the last thing which can go wrong from my point of view :)
> >
>
> Yes, if I boot my TC2 with bL switcher enabled, it always crashes on boot.

It does look like handle_update() races with
cpufreq_unregister_driver() and cpufreq_remove_dev (called from there
indirectly) does look racy.
Sudeep Holla Oct. 17, 2019, 4:42 p.m. UTC | #6
On Thu, Oct 17, 2019 at 06:34:28PM +0200, Rafael J. Wysocki wrote:
> On Thu, Oct 17, 2019 at 12:00 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
> >
> > On Thu, Oct 17, 2019 at 03:27:25PM +0530, Viresh Kumar wrote:
> > > On 16-10-19, 15:23, Sudeep Holla wrote:
> > > > Thanks for the spinning these patches so quickly.
> > > >
> > > > I did give it a spin, but unfortunately it doesn't fix the bug I reported.
> > > > So I looked at my bug report in detail and looks like the cpufreq_driver
> > > > variable is set to NULL at that point and it fails to dereference it
> > > > while trying to execute:
> > > >     ret = cpufreq_driver->verify(new_policy);
> > > > (Hint verify is at offset 0x1c/28)
> > > >
> > > > So I suspect some race as this platform with bL switcher tries to
> > > > unregister and re-register the cpufreq driver during the boot.
> > > >
> > > > I need to spend more time on this as reverting the initial PM QoS patch
> > > > to cpufreq.c makes the issue disappear.
>
> I guess you mean commit 67d874c3b2c6 ("cpufreq: Register notifiers
> with the PM QoS framework")?
>

Correct.

> That would make sense, because it added the cpufreq_notifier_min() and
> cpufreq_notifier_max() that trigger handle_update() via
> schedule_work().
>

Yes, it was not clear as I didn't trace to handle_update earlier. After
looking at depth today afternoon, I arrived at the same conclusion.

> [BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
> that the new min is less than the new max, because the QoS doesn't do
> that.]
>
> > > Is this easily reproducible ? cpufreq_driver == NULL shouldn't be the case, it
> > > get updated only once while registering/unregistering cpufreq drivers. That is
> > > the last thing which can go wrong from my point of view :)
> > >
> >
> > Yes, if I boot my TC2 with bL switcher enabled, it always crashes on boot.
>
> It does look like handle_update() races with
> cpufreq_unregister_driver() and cpufreq_remove_dev (called from there
> indirectly) does look racy.

Indeed, we just crossed the mails. I just found that and posted a patch.

--
Regards,
Sudeep
Sudeep Holla Oct. 17, 2019, 5:14 p.m. UTC | #7
On Wed, Oct 16, 2019 at 03:23:43PM +0100, Sudeep Holla wrote:
> On Wed, Oct 16, 2019 at 12:37:58PM +0200, Rafael J. Wysocki wrote:
> > Hi All,
> >
> > The motivation for this series is to address the problem discussed here:
> >
> > https://lore.kernel.org/linux-pm/5ad2624194baa2f53acc1f1e627eb7684c577a19.1562210705.git.viresh.kumar@linaro.org/T/#md2d89e95906b8c91c15f582146173dce2e86e99f
> >
> > and also reported here:
> >
> > https://lore.kernel.org/linux-pm/20191015155735.GA29105@bogus/
> >
> > Plus, generally speaking, using the policy CPU as a proxy for the policy
> > with respect to PM QoS does not feel particularly straightforward to me
> > and adds extra complexity.
> >
> > Anyway, the first patch adds frequency QoS that is based on "raw" PM QoS (kind
> > of in analogy with device PM QoS) and is just about min and max frequency
> > requests (no direct relationship to devices).
> >
> > The second patch switches over cpufreq and its users to the new frequency QoS.
> > [The Fixes: tag has been tentatively added to it.]
> >
> > The third one removes frequency request types from device PM QoS.
> >
> > Unfortunately, the patches are rather big, but also they are quite
> > straightforward.
> >
> > I didn't have the time to test this series, so giving it a go would be much
> > appreciated.
>
> Thanks for the spinning these patches so quickly.
>
For the record, I thought of providing the crash that this series fixes.
After applying [1] which fixes the boot issue I was seeing on TC2, I started
seeing the below crash, which this series fixes.

FWIW,
Tested-by: Sudeep Holla <sudeep.holla@arm.com>

--

Unable to handle kernel paging request at virtual address 31b2c303
pgd = 772b96e1
[31b2c303] *pgd=a4050003, *pmd=00000000
Internal error: Oops: 206 [#1] SMP THUMB2
Modules linked in:
CPU: 1 PID: 518 Comm: bash Not tainted 5.4.0-rc3-00062-g6e3a7fd7a87e-dirty #123
Hardware name: ARM-Versatile Express
PC is at blocking_notifier_chain_unregister+0x2a/0x78
LR is at blocking_notifier_chain_unregister+0x1b/0x78
Flags: NzCv  IRQs on  FIQs off  Mode SVC_32  ISA Thumb  Segment user
Control: 70c5387d  Table: a57b08c0  DAC: 55555555
Process bash (pid: 518, stack limit = 0x018ebe57)
(blocking_notifier_chain_unregister) from (dev_pm_qos_remove_notifier+0x5d/0xb4)
(dev_pm_qos_remove_notifier) from (cpufreq_policy_free+0x77/0xc8)
(cpufreq_policy_free) from (subsys_interface_unregister+0x4f/0x80)
(subsys_interface_unregister) from (cpufreq_unregister_driver+0x29/0x6c)
(cpufreq_unregister_driver) from (bL_cpufreq_switcher_notifier+0x41/0x4c)
(bL_cpufreq_switcher_notifier) from (notifier_call_chain+0x3d/0x58)
(notifier_call_chain) from (blocking_notifier_call_chain+0x29/0x38)
(blocking_notifier_call_chain) from (bL_activation_notify+0x13/0x40)
(bL_activation_notify) from (bL_switcher_active_store+0x59/0x190)
(bL_switcher_active_store) from (kernfs_fop_write+0x85/0x12c)
(kernfs_fop_write) from (__vfs_write+0x21/0x130)
(__vfs_write) from (vfs_write+0x6b/0xfc)
(vfs_write) from (ksys_write+0x6d/0x90)
(ksys_write) from (ret_fast_syscall+0x1/0x5a)
Viresh Kumar Oct. 18, 2019, 5:44 a.m. UTC | #8
On 17-10-19, 18:34, Rafael J. Wysocki wrote:
> [BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
> that the new min is less than the new max, because the QoS doesn't do
> that.]

The ->verify() callback does that for us I believe.
Rafael J. Wysocki Oct. 18, 2019, 8:24 a.m. UTC | #9
On Fri, Oct 18, 2019 at 7:44 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 17-10-19, 18:34, Rafael J. Wysocki wrote:
> > [BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
> > that the new min is less than the new max, because the QoS doesn't do
> > that.]
>
> The ->verify() callback does that for us I believe.

It does in practice AFAICS, but in theory it may assume the right
ordering between the min and the max and just test the boundaries, may
it not?
Viresh Kumar Oct. 18, 2019, 8:27 a.m. UTC | #10
On 18-10-19, 10:24, Rafael J. Wysocki wrote:
> On Fri, Oct 18, 2019 at 7:44 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > On 17-10-19, 18:34, Rafael J. Wysocki wrote:
> > > [BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
> > > that the new min is less than the new max, because the QoS doesn't do
> > > that.]
> >
> > The ->verify() callback does that for us I believe.
> 
> It does in practice AFAICS, but in theory it may assume the right
> ordering between the min and the max and just test the boundaries, may
> it not?

I think cpufreq_verify_within_limits() gets called for sure from
within ->verify() for all platforms and this explicitly checks

        if (policy->min > policy->max)
                policy->min = policy->max;
Rafael J. Wysocki Oct. 18, 2019, 8:30 a.m. UTC | #11
On Fri, Oct 18, 2019 at 10:27 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 18-10-19, 10:24, Rafael J. Wysocki wrote:
> > On Fri, Oct 18, 2019 at 7:44 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > >
> > > On 17-10-19, 18:34, Rafael J. Wysocki wrote:
> > > > [BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
> > > > that the new min is less than the new max, because the QoS doesn't do
> > > > that.]
> > >
> > > The ->verify() callback does that for us I believe.
> >
> > It does in practice AFAICS, but in theory it may assume the right
> > ordering between the min and the max and just test the boundaries, may
> > it not?
>
> I think cpufreq_verify_within_limits() gets called for sure from
> within ->verify() for all platforms

That's why I mean by "in practice". :-)
Viresh Kumar Oct. 18, 2019, 9:24 a.m. UTC | #12
On 18-10-19, 10:30, Rafael J. Wysocki wrote:
> On Fri, Oct 18, 2019 at 10:27 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > On 18-10-19, 10:24, Rafael J. Wysocki wrote:
> > > On Fri, Oct 18, 2019 at 7:44 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > >
> > > > On 17-10-19, 18:34, Rafael J. Wysocki wrote:
> > > > > [BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
> > > > > that the new min is less than the new max, because the QoS doesn't do
> > > > > that.]
> > > >
> > > > The ->verify() callback does that for us I believe.
> > >
> > > It does in practice AFAICS, but in theory it may assume the right
> > > ordering between the min and the max and just test the boundaries, may
> > > it not?
> >
> > I think cpufreq_verify_within_limits() gets called for sure from
> > within ->verify() for all platforms
> 
> That's why I mean by "in practice". :-)

Hmm, I am not sure if we should really add another min <= max check in
cpufreq_set_policy() as in practice it will never hit :)
Rafael J. Wysocki Oct. 18, 2019, 9:26 a.m. UTC | #13
On Fri, Oct 18, 2019 at 11:24 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 18-10-19, 10:30, Rafael J. Wysocki wrote:
> > On Fri, Oct 18, 2019 at 10:27 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > >
> > > On 18-10-19, 10:24, Rafael J. Wysocki wrote:
> > > > On Fri, Oct 18, 2019 at 7:44 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > > >
> > > > > On 17-10-19, 18:34, Rafael J. Wysocki wrote:
> > > > > > [BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
> > > > > > that the new min is less than the new max, because the QoS doesn't do
> > > > > > that.]
> > > > >
> > > > > The ->verify() callback does that for us I believe.
> > > >
> > > > It does in practice AFAICS, but in theory it may assume the right
> > > > ordering between the min and the max and just test the boundaries, may
> > > > it not?
> > >
> > > I think cpufreq_verify_within_limits() gets called for sure from
> > > within ->verify() for all platforms
> >
> > That's why I mean by "in practice". :-)
>
> Hmm, I am not sure if we should really add another min <= max check in
> cpufreq_set_policy() as in practice it will never hit :)

Fair enough, but adding a comment regarding that in there would be prudent IMO.


>
> --
> viresh
Viresh Kumar Oct. 18, 2019, 9:28 a.m. UTC | #14
On 18-10-19, 11:26, Rafael J. Wysocki wrote:
> On Fri, Oct 18, 2019 at 11:24 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > On 18-10-19, 10:30, Rafael J. Wysocki wrote:
> > > On Fri, Oct 18, 2019 at 10:27 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > >
> > > > On 18-10-19, 10:24, Rafael J. Wysocki wrote:
> > > > > On Fri, Oct 18, 2019 at 7:44 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > > > >
> > > > > > On 17-10-19, 18:34, Rafael J. Wysocki wrote:
> > > > > > > [BTW, Viresh, it looks like cpufreq_set_policy() should still ensure
> > > > > > > that the new min is less than the new max, because the QoS doesn't do
> > > > > > > that.]
> > > > > >
> > > > > > The ->verify() callback does that for us I believe.
> > > > >
> > > > > It does in practice AFAICS, but in theory it may assume the right
> > > > > ordering between the min and the max and just test the boundaries, may
> > > > > it not?
> > > >
> > > > I think cpufreq_verify_within_limits() gets called for sure from
> > > > within ->verify() for all platforms
> > >
> > > That's why I mean by "in practice". :-)
> >
> > Hmm, I am not sure if we should really add another min <= max check in
> > cpufreq_set_policy() as in practice it will never hit :)
> 
> Fair enough, but adding a comment regarding that in there would be prudent IMO.

will do.