- 25 8月, 2020 40 次提交
-
-
由 Zhang Changzhong 提交于
mainline inclusion from mainline-v5.9-rc2 commit 0ae18a82 category: bugfix bugzilla: 39990 CVE: NA --------------------------- According to SAE J1939/21 (Chapter 5.12.3 and APPENDIX C), for transmit side the required time interval between packets of a multipacket broadcast message is 50 to 200 ms, the responder shall use a timeout of 250ms (provides margin allowing for the maximumm spacing of 200ms). For receive side a timeout will occur when a time of greater than 750 ms elapsed between two message packets when more packets were expected. So this patch fix and add rxtimer for multipacket broadcast session. Fixes: 9d71dd0c ("can: add support of SAE J1939 protocol") Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1596599425-5534-5-git-send-email-zhangchangzhong@huawei.comAcked-by: NOleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Zhang Changzhong 提交于
mainline inclusion from mainline-v5.9-rc2 commit 2b8b2e31 category: bugfix bugzilla: 39990 CVE: NA --------------------------- If timeout occurs, j1939_tp_rxtimer() first calls hrtimer_start() to restart rxtimer, and then calls __j1939_session_cancel() to set session->state = J1939_SESSION_WAITING_ABORT. At next timeout expiration, because of the J1939_SESSION_WAITING_ABORT session state j1939_tp_rxtimer() will call j1939_session_deactivate_activate_next() to deactivate current session, and rxtimer won't be set. But for multipacket broadcast session, __j1939_session_cancel() don't set session->state = J1939_SESSION_WAITING_ABORT, thus current session won't be deactivate and hrtimer_start() is called to start new rxtimer again and again. So fix it by moving session->state = J1939_SESSION_WAITING_ABORT out of if (!j1939_cb_is_broadcast(&session->skcb)) statement. Fixes: 9d71dd0c ("can: add support of SAE J1939 protocol") Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1596599425-5534-4-git-send-email-zhangchangzhong@huawei.comAcked-by: NOleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Zhang Changzhong 提交于
mainline inclusion from mainline-v5.9-rc2 commit e8b17653 category: bugfix bugzilla: 39990 CVE: NA --------------------------- If j1939_xtp_rx_dat_one() receive last frame of multipacket broadcast message, j1939_session_timers_cancel() should be called to cancel rxtimer. Fixes: 9d71dd0c ("can: add support of SAE J1939 protocol") Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1596599425-5534-3-git-send-email-zhangchangzhong@huawei.comAcked-by: NOleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Zhang Changzhong 提交于
mainline inclusion from mainline-v5.9-rc2 commit f4fd77fd category: bugfix bugzilla: 39990 CVE: NA --------------------------- Currently j1939_tp_im_involved_anydir() in j1939_tp_recv() check the previously set flags J1939_ECU_LOCAL_DST and J1939_ECU_LOCAL_SRC of incoming skb, thus multipacket broadcast message was aborted by receive side because it may come from remote ECUs and have no exact dst address. Similarly, j1939_tp_cmd_recv() and j1939_xtp_rx_dat() didn't process broadcast message. So fix it by checking and process broadcast message in j1939_tp_recv(), j1939_tp_cmd_recv() and j1939_xtp_rx_dat(). Fixes: 9d71dd0c ("can: add support of SAE J1939 protocol") Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1596599425-5534-2-git-send-email-zhangchangzhong@huawei.comAcked-by: NOleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: NZhang Changzhong <zhangchangzhong@huawei.com> Reviewed-by: NYue Haibing <yuehaibing@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Yu Kuai 提交于
hulk inclusion category: feature bugzilla: 38688 CVE: NA --------------------------- ioc_rqos_throttle() will be called with spin_lock_irq(q->queue_lock) called in sq. However, ioc_rqos_throttle() will call spin_lock_irq(&ioc->lock) and spin_unlock_irq(&ioc->lock) before 'q->queue_lock' is released. Thus, local irqs will be enabled after 'ioc->lock' is released and 'q->queue_lock' might never be released. spin_lock_irq(q->queue_lock) --> local irq will be disabled rq_qos_throttle spin_lock_irq(&ioc->lock) spin_unlock_irq(&ioc->lock) --> local irq will be enabled before 'q->queue_lock' is released spin_unlock_irq(&q->queue_lock) io_schedule() spin_lock_irq(q->queue_lock) Fix the problem by using spin_lock_irqsave()/spin_unlock_irqrestore() for other locks. Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Jiufei Xue 提交于
hulk inclusion category: feature bugzilla: 38688 CVE: NA --------------------------- ioc_rqos_throttle() may called inside queue_lock, the lock should be unlocked before sleep. Signed-off-by: NJiufei Xue <jiufei.xue@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com> Conflict: block/blk-iocost.c Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Jiufei Xue 提交于
hulk inclusion category: feature bugzilla: 38688 CVE: NA --------------------------- Bios are not associated with blkg before entering iocost controller. do it in ioc_rqos_throttle() as well as ioc_rqos_merge(). Considering that there are so many chances to create blkg before ioc_rqos_merge(), we just lookup the blkg here and if blkg are not exist, just return rather than create it. Signed-off-by: NJiufei Xue <jiufei.xue@linux.alibaba.com> Reviewed-by: NJoseph Qi <joseph.qi@linux.alibaba.com> Conflict: block/blk-iocost.c Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Yu Kuai 提交于
hulk inclusion category: feature bugzilla: 38688 CVE: NA --------------------------- Add definition of 'legacy_cftypes', so that iocost can be used in cgroup V1. Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc4 commit 9d179b86 category: feature bugzilla: 38688 CVE: NA --------------------------- blkcg_activate_policy() has the following bugs. * cf09a8ee ("blkcg: pass @q and @blkcg into blkcg_pol_alloc_pd_fn()") added @blkcg to ->pd_alloc_fn(); however, blkcg_activate_policy() ends up using pd's allocated for the root blkcg for all preallocations, so ->pd_init_fn() for non-root blkcgs can be passed in pd's which are allocated for the root blkcg. For blk-iocost, this means that ->pd_init_fn() can write beyond the end of the allocated object as it determines the length of the flex array at the end based on the blkcg's nesting level. * Each pd is initialized as they get allocated. If alloc fails, the policy will get freed with pd's initialized on it. * After the above partial failure, the partial pds are not freed. This patch fixes all the above issues by * Restructuring blkcg_activate_policy() so that alloc and init passes are separate. Init takes place only after all allocs succeeded and on failure all allocated pds are freed. * Unifying and fixing the cleanup of the remaining pd_prealloc. Signed-off-by: NTejun Heo <tj@kernel.org> Fixes: cf09a8ee ("blkcg: pass @q and @blkcg into blkcg_pol_alloc_pd_fn()") Signed-off-by: NJens Axboe <axboe@kernel.dk> Conflict: block/blk-cgroup.c Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.3-rc1 commit 71c81407 category: feature bugzilla: 38688 CVE: NA --------------------------- When blkcg_activate_policy() is creating blkg_policy_data for existing blkgs, it did in the wrong order - descendants first. Fix it. None of the existing controllers seem affected by this. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Conflict: block/blk-cgroup.c Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Stephen Rothwell 提交于
mainline inclusion from mainline-5.4-rc1 commit 8d1c1560 category: feature bugzilla: 38688 CVE: NA --------------------------- Fixes: 7caa4715 ("blkcg: implement blk-iocost") Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.6-rc6 commit dcd6589b category: feature bugzilla: 38688 CVE: NA --------------------------- vtimes may wrap and time_before/after64() should be used to determine whether a given vtime is before or after another. iocg_is_idle() was incorrectly using plain "<" comparison do determine whether done_vtime is before vtime. Here, the only thing we're interested in is whether done_vtime matches vtime which indicates that there's nothing in flight. Let's test for inequality instead. Signed-off-by: NTejun Heo <tj@kernel.org> Fixes: 7caa4715 ("blkcg: implement blk-iocost") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Waiman Long 提交于
mainline inclusion from mainline-5.7-rc3 commit d6c8e949 category: feature bugzilla: 38688 CVE: NA --------------------------- Systemtap 4.2 is unable to correctly interpret the "u32 (*missed_ppm)[2]" argument of the iocost_ioc_vrate_adj trace entry defined in include/trace/events/iocost.h leading to the following error: /tmp/stapAcz0G0/stap_c89c58b83cea1724e26395efa9ed4939_6321_aux_6.c:78:8: error: expected ‘;’, ‘,’ or ‘)’ before ‘*’ token , u32[]* __tracepoint_arg_missed_ppm That argument type is indeed rather complex and hard to read. Looking at block/blk-iocost.c. It is just a 2-entry u32 array. By simplifying the argument to a simple "u32 *missed_ppm" and adjusting the trace entry accordingly, the compilation error was gone. Fixes: 7caa4715 ("blkcg: implement blk-iocost") Acked-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NWaiman Long <longman@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.6-rc1 commit 9ea37e24 category: feature bugzilla: 38688 CVE: NA --------------------------- iocost_monitor.py broke with recent versions of drgn due to helper being stricter about types. Fix it so that it uses the correct type. Signed-off-by: NTejun Heo <tj@kernel.org> Suggested-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.5-rc3 commit d7bd15a1 category: feature bugzilla: 38688 CVE: NA --------------------------- When over-budget IOs are force-issued through root cgroup, iocg_kick_delay() adjusts the async delay accordingly but doesn't actually schedule async throttle for the issuing task. This bug is pretty well masked because sooner or later the offending threads are gonna get directly throttled on regular IOs or have async delay scheduled by mem_cgroup_throttle_swaprate(). However, it can affect control quality on filesystem metadata heavy operations. Let's fix it by invoking blkcg_schedule_throttle() when iocg_kick_delay() says async delay is needed. Signed-off-by: NTejun Heo <tj@kernel.org> Fixes: 7caa4715 ("blkcg: implement blk-iocost") Cc: stable@vger.kernel.org Reported-by: NJosef Bacik <josef@toxicpanda.com> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Jiufei Xue 提交于
mainline inclusion from mainline-5.4-rc8 commit 8b37bc27 category: feature bugzilla: 38688 CVE: NA --------------------------- There is a bug that checking the same active_list over and over again in iocg_activate(). The intention of the code was checking whether all the ancestors and self have already been activated. So fix it. Fixes: 7caa4715 ("blkcg: implement blk-iocost") Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJiufei Xue <jiufei.xue@linux.alibaba.com> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Dan Carpenter 提交于
mainline inclusion from mainline-5.4-rc6 commit 41591a51 category: feature bugzilla: 38688 CVE: NA --------------------------- This code causes a static analysis warning: block/blk-iocost.c:2113 ioc_weight_write() error: double lock 'irq' We disable IRQs in blkg_conf_prep() and re-enable them in blkg_conf_finish(). IRQ disable/enable should not be nested because that means the IRQs will be enabled at the first unlock instead of the second one. Fixes: 7caa4715 ("blkcg: implement blk-iocost") Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit 7afcccaf category: feature bugzilla: 38688 CVE: NA --------------------------- The default hard disk param sets latency targets at 50ms. As the default target percentiles are zero, these don't directly regulate vrate; however, they're still used to calculate the period length - 100ms in this case. This is excessively low. A SATA drive with QD32 saturated with random IOs can easily reach avg completion latency of several hundred msecs. A period duration which is substantially lower than avg completion latency can lead to wildly fluctuating vrate. Let's bump up the default latency targets to 250ms so that the period duration is sufficiently long. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit 7cd806a9 category: feature bugzilla: 38688 CVE: NA --------------------------- Some IOs may span multiple periods. As latencies are collected on completion, the inbetween periods won't register them and may incorrectly decide to increase vrate. nr_lagging tracks these IOs to avoid those situations. Currently, whenever there are IOs which are spanning from the previous period, busy_level is reset to 0 if negative thus suppressing vrate increase. This has the following two problems. * When latency target percentiles aren't set, vrate adjustment should only be governed by queue depth depletion; however, the current code keeps nr_lagging active which pulls in latency results and can keep down vrate unexpectedly. * When lagging condition is detected, it resets the entire negative busy_level. This turned out to be way too aggressive on some devices which sometimes experience extended latencies on a small subset of commands. In addition, a lagging IO will be accounted as latency target miss on completion anyway and resetting busy_level amplifies its impact unnecessarily. This patch fixes the above two problems by disabling nr_lagging counting when latency target percentiles aren't set and blocking vrate increases when there are lagging IOs while leaving busy_level as-is. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit 25d41e4a category: feature bugzilla: 38688 CVE: NA --------------------------- vrate_adj tracepoint traces vrate changes; however, it does so only when busy_level is non-zero. busy_level turning to zero can sometimes be as interesting an event. This patch also enables vrate_adj tracepoint on other vrate related events - busy_level changes and non-zero nr_lagging. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit 7c1ee704 category: feature bugzilla: 38688 CVE: NA --------------------------- Report debt and rename del_ms row to delay for consistency. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit b06f2d35 category: feature bugzilla: 38688 CVE: NA --------------------------- When outputting json: * Don't truncate numbers. * Report address of iocg to ease drilling down further. When outputting table: * Use math.ceil() for delay_ms so that small delays don't read as 0. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit e742bd5c category: feature bugzilla: 38688 CVE: NA --------------------------- Json has limited accuracy for numbers and can silently truncate 64bit values, which can be extremely confusing. Let's consistently use string encapsulated values for json output. While at it, convert an unnecesary f-string to str(). Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit e1518f63 category: feature bugzilla: 38688 CVE: NA --------------------------- Merges have the same problem that forced-bios had which is fixed by the previous patch. The cost of a merge is calculated at the time of issue and force-advances vtime into the future. Until global vtime catches up, how the cgroup's hweight changes in the meantime doesn't matter and it often leads to situations where the cost is calculated at one hweight and paid at a very different one. See the previous patch for more details. Fix it by never advancing vtime into the future for merges. If budget is available, vtime is advanced. Otherwise, the cost is charged as debt. This brings merge cost handling in line with issue cost handling in ioc_rqos_throttle(). Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit 36a52481 category: feature bugzilla: 38688 CVE: NA --------------------------- Currently, when a bio needs to be force-charged and there isn't enough budget, vtime is simply pushed into the future. This means that the cost of the whole bio is scaled using the current hweight and then charged immediately. Until the global vtime advances beyond this future vtime, the cgroup won't be allowed to issue normal IOs. This is incorrect and can lead to, for example, exploding vrate or extended stalls if vrate range is constrained. Consider the following scenario. 1. A cgroup with a very low hweight runs out of budget. 2. A storm of swap-out happens on it. All of them are scaled according to the current low hweight and charged to vtime pushing it to a far future. 3. All other cgroups go idle and now the above cgroup has access to the whole device. However, because vtime is already wound using the past low hweight, what its current hweight is doesn't matter until global vtime catches up to the local vtime. 4. As a result, either vrate gets ramped up extremely or the IOs stall while the underlying device is idle. This is because the hweight the overage is calculated at is different from the hweight that it's being paid at. Fix it by remembering the overage in absoulte vtime and continuously paying with the actual budget according to the current hweight at each period. Note that non-forced bios which wait already remembers the cost in absolute vtime. This brings forced-bio accounting in line. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit e036c4ca category: feature bugzilla: 38688 CVE: NA --------------------------- ioc_pd_free() first cancels the hrtimers and then deactivates the iocg. However, the iocg timer can run inbetween and reschedule the hrtimers which will end up running after the iocg is freed leading to crashes like the following. general protection fault: 0000 [#1] SMP ... RIP: 0010:iocg_kick_delay+0xbe/0x1b0 RSP: 0018:ffffc90003598ea0 EFLAGS: 00010046 RAX: 1cee00fd69512b54 RBX: ffff8881bba48400 RCX: 00000000000003e8 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8881bba48400 RBP: 0000000000004e20 R08: 0000000000000002 R09: 00000000000003e8 R10: 0000000000000000 R11: 0000000000000000 R12: ffffc90003598ef0 R13: 00979f3810ad461f R14: ffff8881bba4b400 R15: 25439f950d26e1d1 FS: 0000000000000000(0000) GS:ffff88885f800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f64328c7e40 CR3: 0000000002409005 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> iocg_delay_timer_fn+0x3d/0x60 __hrtimer_run_queues+0xfe/0x270 hrtimer_interrupt+0xf4/0x210 smp_apic_timer_interrupt+0x5e/0x120 apic_timer_interrupt+0xf/0x20 </IRQ> Fix it by canceling hrtimers after deactivating the iocg. Fixes: 7caa4715 ("blkcg: implement blk-iocost") Reported-by: NDave Jones <davej@codemonkey.org.uk> Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit e916ad29 category: feature bugzilla: 38688 CVE: NA --------------------------- ioc_cpd_alloc() forgot to check NULL return from kzalloc(). Add it. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: Nkbuild test robot <lkp@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit 3532e722 category: feature bugzilla: 38688 CVE: NA --------------------------- blk_iocost_init() forgot to free its percpu stat on the error path. Fix it. Fixes: 7caa4715 ("blkcg: implement blk-iocost") Reported-by: NHillf Danton <hdanton@sina.com> Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit 8504dea7 category: feature bugzilla: 38688 CVE: NA --------------------------- Add a script which can be used to generate device-specific iocost linear model coefficients. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit 6954ff18 category: feature bugzilla: 38688 CVE: NA --------------------------- Instead of mucking with debugfs and ->pd_stat(), add drgn based monitoring script. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Omar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit 7caa4715 category: feature bugzilla: 38688 CVE: NA --------------------------- This patchset implements IO cost model based work-conserving proportional controller. While io.latency provides the capability to comprehensively prioritize and protect IOs depending on the cgroups, its protection is binary - the lowest latency target cgroup which is suffering is protected at the cost of all others. In many use cases including stacking multiple workload containers in a single system, it's necessary to distribute IO capacity with better granularity. One challenge of controlling IO resources is the lack of trivially observable cost metric. The most common metrics - bandwidth and iops - can be off by orders of magnitude depending on the device type and IO pattern. However, the cost isn't a complete mystery. Given several key attributes, we can make fairly reliable predictions on how expensive a given stream of IOs would be, at least compared to other IO patterns. The function which determines the cost of a given IO is the IO cost model for the device. This controller distributes IO capacity based on the costs estimated by such model. The more accurate the cost model the better but the controller adapts based on IO completion latency and as long as the relative costs across differents IO patterns are consistent and sensible, it'll adapt to the actual performance of the device. Currently, the only implemented cost model is a simple linear one with a few sets of default parameters for different classes of device. This covers most common devices reasonably well. All the infrastructure to tune and add different cost models is already in place and a later patch will also allow using bpf progs for cost models. Please see the top comment in blk-iocost.c and documentation for more details. v2: Rebased on top of RQ_ALLOC_TIME changes and folded in Rik's fix for a divide-by-zero bug in current_hweight() triggered by zero inuse_sum. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Andy Newell <newella@fb.com> Cc: Josef Bacik <jbacik@fb.com> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk> Conflict: block/Kconfig block/Makefile include/linux/blk_types.h block/blk-iocost.c Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit 86a5bba5 category: feature bugzilla: 38688 CVE: NA --------------------------- For policies which can do enough initialization from ->cpd_alloc_fn(), make ->cpd_init_fn() optional. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit cf09a8ee category: feature bugzilla: 38688 CVE: NA --------------------------- Instead of @node, pass in @q and @blkcg so that the alloc function has more context. This doesn't cause any behavior change and will be used by io.weight implementation. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Conflits: block/blk-iolatency.c block/blk-cgroup.c block/cfq-iosched.c Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.3-rc1 commit 38cf3a68 category: feature bugzilla: 38688 CVE: NA --------------------------- a5e112e6 ("cgroup: add cgroup_parse_float()") accidentally added cgroup_parse_float() inside CONFIG_SYSFS block. Move it outside so that it doesn't cause failures on !CONFIG_SYSFS builds. Signed-off-by: NTejun Heo <tj@kernel.org> Fixes: a5e112e6 ("cgroup: add cgroup_parse_float()") Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.3-rc1 commit a5e112e6 category: feature bugzilla: 38688 CVE: NA --------------------------- cgroup already uses floating point for percent[ile] numbers and there are several controllers which want to take them as input. Add a generic parse helper to handle inputs. Update the interface convention documentation about the use of percentage numbers. While at it, also clarify the default time unit. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit 015d254c category: feature bugzilla: 38688 CVE: NA --------------------------- Separate out blkcg_conf_get_disk() so that it can be used by blkcg policy interface file input parsers before the policy is actually enabled. This doesn't introduce any functional changes. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit 9677a3e0 category: feature bugzilla: 38688 CVE: NA --------------------------- wbt already gets queue depth changed notification through wbt_set_queue_depth(). Generalize it into rq_qos_ops->queue_depth_changed() so that other rq_qos policies can easily hook into the events too. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Conflict: block/blk-rq-qos.c block/blk-rq-qos.h block/blk-wbt.c Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit d3e65fff category: feature bugzilla: 38688 CVE: NA --------------------------- Add a merge hook for rq_qos. This will be used by io.weight. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Conflict: block/blk-rq-qos.c block/blk-core.c block/blk-rq-qos.h Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Tejun Heo 提交于
mainline inclusion from mainline-5.4-rc1 commit 6f816b4b category: feature bugzilla: 38688 CVE: NA --------------------------- There are currently two start time timestamps - start_time_ns and io_start_time_ns. The former marks the request allocation and and the second issue-to-device time. The planned io.weight controller needs to measure the total time bios take to execute after it leaves rq_qos including the time spent waiting for request to become available, which can easily dominate on saturated devices. This patch adds request->alloc_time_ns which records when the request allocation attempt started. As it isn't used for the usual stats, make it optional behind CONFIG_BLK_RQ_ALLOC_TIME and QUEUE_FLAG_RQ_ALLOC_TIME so that it can be compiled out when there are no users and it's active only on queues which need it even when compiled in. v2: s/pre_start_time/alloc_time/ and add CONFIG_BLK_RQ_ALLOC_TIME gating as suggested by Jens. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk> Conflict: include/linux/blkdev.h block/Kconfig block/blk-mq.c Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NHou Tao <houtao1@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Xiangyou Xie 提交于
hulk inclusion category: config bugzilla: NA CVE: NA We enable haltpoll by default for the improvement of performance. X86 has been supported. Now, we will provide it on ARM. Signed-off-by: NXiangyou Xie <xiexiangyou@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: Nzhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-