In the Linux kernel, the following vulnerability has been resolved:
ibmveth: Disable GSO for packets with small MSS
Some physical adapters on Power systems do not support segmentation
offload when the MSS is less than 224 bytes. Attempting to send such
packets causes the adapter to freeze, stopping all traffic until
manually reset.
Implement ndo_features_check to disable GSO for packets with small MSS
values. The network stack will perform software segmentation instead.
The 224-byte minimum matches ibmvnic
commit <f10b09ef687f> ("ibmvnic: Enforce stronger sanity checks
on GSO packets")
which uses the same physical adapters in SEA configurations.
The issue occurs specifically when the hardware attempts to perform
segmentation (gso_segs > 1) with a small MSS. Single-segment GSO packets
(gso_segs == 1) do not trigger the problematic LSO code path and are
transmitted normally without segmentation.
Add an ndo_features_check callback to disable GSO when MSS < 224 bytes.
Also call vlan_features_check() to ensure proper handling of VLAN packets,
particularly QinQ (802.1ad) configurations where the hardware parser may
not support certain offload features.
Validated using iptables to force small MSS values. Without the fix,
the adapter freezes. With the fix, packets are segmented in software
and transmission succeeds. Comprehensive regression testing completedd
(MSS tests, performance, stability).
In the Linux kernel, the following vulnerability has been resolved:
coresight: tmc-etr: Fix race condition between sysfs and perf mode
When trying to run perf and sysfs mode simultaneously, the WARN_ON()
in tmc_etr_enable_hw() is triggered sometimes:
WARNING: CPU: 42 PID: 3911571 at drivers/hwtracing/coresight/coresight-tmc-etr.c:1060 tmc_etr_enable_hw+0xc0/0xd8 [coresight_tmc]
[..snip..]
Call trace:
tmc_etr_enable_hw+0xc0/0xd8 [coresight_tmc] (P)
tmc_enable_etr_sink+0x11c/0x250 [coresight_tmc] (L)
tmc_enable_etr_sink+0x11c/0x250 [coresight_tmc]
coresight_enable_path+0x1c8/0x218 [coresight]
coresight_enable_sysfs+0xa4/0x228 [coresight]
enable_source_store+0x58/0xa8 [coresight]
dev_attr_store+0x20/0x40
sysfs_kf_write+0x4c/0x68
kernfs_fop_write_iter+0x120/0x1b8
vfs_write+0x2c8/0x388
ksys_write+0x74/0x108
__arm64_sys_write+0x24/0x38
el0_svc_common.constprop.0+0x64/0x148
do_el0_svc+0x24/0x38
el0_svc+0x3c/0x130
el0t_64_sync_handler+0xc8/0xd0
el0t_64_sync+0x1ac/0x1b0
---[ end trace 0000000000000000 ]---
Since the enablement of sysfs mode is separeted into two critical regions,
one for sysfs buffer allocation and another for hardware enablement, it's
possible to race with the perf mode. Fix this by double check whether
the perf mode's been used before enabling the hardware in sysfs mode.
mode:
[sysfs mode] [perf mode]
tmc_etr_get_sysfs_buffer()
spin_lock(&drvdata->spinlock)
[sysfs buffer allocation]
spin_unlock(&drvdata->spinlock)
spin_lock(&drvdata->spinlock)
tmc_etr_enable_hw()
drvdata->etr_buf = etr_perf->etr_buf
spin_unlock(&drvdata->spinlock)
spin_lock(&drvdata->spinlock)
tmc_etr_enable_hw()
WARN_ON(drvdata->etr_buf) // WARN sicne etr_buf initialized at
the perf side
spin_unlock(&drvdata->spinlock)
With this fix, we retain the check for CS_MODE_PERF in get_etr_sysfs_buf.
This ensures we verify whether the perf mode's already running before we
actually allocate the buffer. Then we can save the time of
allocating/freeing the sysfs buffer if race with the perf mode.
In the Linux kernel, the following vulnerability has been resolved:
wifi: ath12k: do WoW offloads only on primary link
In case of multi-link connection, WCN7850 firmware crashes due to WoW
offloads enabled on both primary and secondary links.
Change to do it only on primary link to fix it.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00284-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1
In the Linux kernel, the following vulnerability has been resolved:
power: supply: rt9455: Fix use-after-free in power_supply_changed()
Using the `devm_` variant for requesting IRQ _before_ the `devm_`
variant for allocating/registering the `power_supply` handle, means that
the `power_supply` handle will be deallocated/unregistered _before_ the
interrupt handler (since `devm_` naturally deallocates in reverse
allocation order). This means that during removal, there is a race
condition where an interrupt can fire just _after_ the `power_supply`
handle has been freed, *but* just _before_ the corresponding
unregistration of the IRQ handler has run.
This will lead to the IRQ handler calling `power_supply_changed()` with
a freed `power_supply` handle. Which usually crashes the system or
otherwise silently corrupts the memory...
Note that there is a similar situation which can also happen during
`probe()`; the possibility of an interrupt firing _before_ registering
the `power_supply` handle. This would then lead to the nasty situation
of using the `power_supply` handle *uninitialized* in
`power_supply_changed()`.
Fix this racy use-after-free by making sure the IRQ is requested _after_
the registration of the `power_supply` handle.
In the Linux kernel, the following vulnerability has been resolved:
pinctrl: canaan: k230: Fix NULL pointer dereference when parsing devicetree
When probing the k230 pinctrl driver, the kernel triggers a NULL pointer
dereference. The crash trace showed:
[ 0.732084] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000068
[ 0.740737] ...
[ 0.776296] epc : k230_pinctrl_probe+0x1be/0x4fc
In k230_pinctrl_parse_functions(), we attempt to retrieve the device
pointer via info->pctl_dev->dev, but info->pctl_dev is only initialized
after k230_pinctrl_parse_dt() completes.
At the time of DT parsing, info->pctl_dev is still NULL, leading to
the invalid dereference of info->pctl_dev->dev.
Use the already available device pointer from platform_device
instead of accessing through uninitialized pctl_dev.
In the Linux kernel, the following vulnerability has been resolved:
PCI/P2PDMA: Fix p2pmem_alloc_mmap() warning condition
Commit b7e282378773 has already changed the initial page refcount of
p2pdma page from one to zero, however, in p2pmem_alloc_mmap() it uses
"VM_WARN_ON_ONCE_PAGE(!page_ref_count(page))" to assert the initial page
refcount should not be zero and the following will be reported when
CONFIG_DEBUG_VM is enabled:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x380400000
flags: 0x20000000002000(reserved|node=0|zone=4)
raw: 0020000000002000 ff1100015e3ab440 0000000000000000 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: VM_WARN_ON_ONCE_PAGE(!page_ref_count(page))
------------[ cut here ]------------
WARNING: CPU: 5 PID: 449 at drivers/pci/p2pdma.c:240 p2pmem_alloc_mmap+0x83a/0xa60
Fix by using "page_ref_count(page)" as the assertion condition.
In the Linux kernel, the following vulnerability has been resolved:
nfc: hci: shdlc: Stop timers and work before freeing context
llc_shdlc_deinit() purges SHDLC skb queues and frees the llc_shdlc
structure while its timers and state machine work may still be active.
Timer callbacks can schedule sm_work, and sm_work accesses SHDLC state
and the skb queues. If teardown happens in parallel with a queued/running
work item, it can lead to UAF and other shutdown races.
Stop all SHDLC timers and cancel sm_work synchronously before purging the
queues and freeing the context.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
In the Linux kernel, the following vulnerability has been resolved:
inet: RAW sockets using IPPROTO_RAW MUST drop incoming ICMP
Yizhou Zhao reported that simply having one RAW socket on protocol
IPPROTO_RAW (255) was dangerous.
socket(AF_INET, SOCK_RAW, 255);
A malicious incoming ICMP packet can set the protocol field to 255
and match this socket, leading to FNHE cache changes.
inner = IP(src="192.168.2.1", dst="8.8.8.8", proto=255)/Raw("TEST")
pkt = IP(src="192.168.1.1", dst="192.168.2.1")/ICMP(type=3, code=4, nexthopmtu=576)/inner
"man 7 raw" states:
A protocol of IPPROTO_RAW implies enabled IP_HDRINCL and is able
to send any IP protocol that is specified in the passed header.
Receiving of all IP protocols via IPPROTO_RAW is not possible
using raw sockets.
Make sure we drop these malicious packets.
In the Linux kernel, the following vulnerability has been resolved:
RDMA/hns: Fix WQ_MEM_RECLAIM warning
When sunrpc is used, if a reset triggered, our wq may lead the
following trace:
workqueue: WQ_MEM_RECLAIM xprtiod:xprt_rdma_connect_worker [rpcrdma]
is flushing !WQ_MEM_RECLAIM hns_roce_irq_workq:flush_work_handle
[hns_roce_hw_v2]
WARNING: CPU: 0 PID: 8250 at kernel/workqueue.c:2644 check_flush_dependency+0xe0/0x144
Call trace:
check_flush_dependency+0xe0/0x144
start_flush_work.constprop.0+0x1d0/0x2f0
__flush_work.isra.0+0x40/0xb0
flush_work+0x14/0x30
hns_roce_v2_destroy_qp+0xac/0x1e0 [hns_roce_hw_v2]
ib_destroy_qp_user+0x9c/0x2b4
rdma_destroy_qp+0x34/0xb0
rpcrdma_ep_destroy+0x28/0xcc [rpcrdma]
rpcrdma_ep_put+0x74/0xb4 [rpcrdma]
rpcrdma_xprt_disconnect+0x1d8/0x260 [rpcrdma]
xprt_rdma_connect_worker+0xc0/0x120 [rpcrdma]
process_one_work+0x1cc/0x4d0
worker_thread+0x154/0x414
kthread+0x104/0x144
ret_from_fork+0x10/0x18
Since QP destruction frees memory, this wq should have the WQ_MEM_RECLAIM.
In the Linux kernel, the following vulnerability has been resolved:
drm/xe/pf: Fix sysfs initialization
In case of devm_add_action_or_reset() failure the provided cleanup
action will be run immediately on the not yet initialized kobject.
This may lead to errors like:
[ ] kobject: '(null)' (ff110001393608e0): is not initialized, yet kobject_put() is being called.
[ ] WARNING: lib/kobject.c:734 at kobject_put+0xd9/0x250, CPU#0: kworker/0:0/9
[ ] RIP: 0010:kobject_put+0xdf/0x250
[ ] Call Trace:
[ ] xe_sriov_pf_sysfs_init+0x21/0x100 [xe]
[ ] xe_sriov_pf_init_late+0x87/0x2b0 [xe]
[ ] xe_sriov_init_late+0x5f/0x2c0 [xe]
[ ] xe_device_probe+0x5f2/0xc20 [xe]
[ ] xe_pci_probe+0x396/0x610 [xe]
[ ] local_pci_probe+0x47/0xb0
[ ] refcount_t: underflow; use-after-free.
[ ] WARNING: lib/refcount.c:28 at refcount_warn_saturate+0x68/0xb0, CPU#0: kworker/0:0/9
[ ] RIP: 0010:refcount_warn_saturate+0x68/0xb0
[ ] Call Trace:
[ ] kobject_put+0x174/0x250
[ ] xe_sriov_pf_sysfs_init+0x21/0x100 [xe]
[ ] xe_sriov_pf_init_late+0x87/0x2b0 [xe]
[ ] xe_sriov_init_late+0x5f/0x2c0 [xe]
[ ] xe_device_probe+0x5f2/0xc20 [xe]
[ ] xe_pci_probe+0x396/0x610 [xe]
[ ] local_pci_probe+0x47/0xb0
Fix that by calling kobject_init() and kobject_add() separately
and register cleanup action after the kobject is initialized.
Also make this cleanup registration a part of the create helper to
fix another mistake, as in the loop we were wrongly passing parent
kobject while registering cleanup action, and this resulted in some
undetected leaks.
(cherry picked from commit 98b16727f07e26a5d4de84d88805ce7ffcfdd324)
In the Linux kernel, the following vulnerability has been resolved:
drm/amd/display: Fix out-of-bounds stream encoder index v3
eng_id can be negative and that stream_enc_regs[]
can be indexed out of bounds.
eng_id is used directly as an index into stream_enc_regs[], which has
only 5 entries. When eng_id is 5 (ENGINE_ID_DIGF) or negative, this can
access memory past the end of the array.
Add a bounds check using ARRAY_SIZE() before using eng_id as an index.
The unsigned cast also rejects negative values.
This avoids out-of-bounds access.
Fixes the below smatch error:
dcn*_resource.c: stream_encoder_create() may index
stream_enc_regs[eng_id] out of bounds (size 5).
drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn351/dcn351_resource.c
1246 static struct stream_encoder *dcn35_stream_encoder_create(
1247 enum engine_id eng_id,
1248 struct dc_context *ctx)
1249 {
...
1255
1256 /* Mapping of VPG, AFMT, DME register blocks to DIO block instance */
1257 if (eng_id <= ENGINE_ID_DIGF) {
ENGINE_ID_DIGF is 5. should <= be <?
Unrelated but, ugh, why is Smatch saying that "eng_id" can be negative?
end_id is type signed long, but there are checks in the caller which prevent it from being negative.
1258 vpg_inst = eng_id;
1259 afmt_inst = eng_id;
1260 } else
1261 return NULL;
1262
...
1281
1282 dcn35_dio_stream_encoder_construct(enc1, ctx, ctx->dc_bios,
1283 eng_id, vpg, afmt,
--> 1284 &stream_enc_regs[eng_id],
^^^^^^^^^^^^^^^^^^^^^^^ This stream_enc_regs[] array has 5 elements so we are one element beyond the end of the array.
...
1287 return &enc1->base;
1288 }
v2: use explicit bounds check as suggested by Roman/Dan; avoid unsigned int cast
v3: The compiler already knows how to compare the two values, so the
cast (int) is not needed. (Roman)
In the Linux kernel, the following vulnerability has been resolved:
ASoC: fsl_xcvr: Revert fix missing lock in fsl_xcvr_mode_put()
This reverts commit f51424872760 ("ASoC: fsl_xcvr: fix missing lock in fsl_xcvr_mode_put()").
The original patch attempted to acquire the card->controls_rwsem lock in
fsl_xcvr_mode_put(). However, this function is called from the upper ALSA
core function snd_ctl_elem_write(), which already holds the write lock on
controls_rwsem for the whole put operation. So there is no need to simply
hold the lock for fsl_xcvr_activate_ctl() again.
Acquiring the read lock while holding the write lock in the same thread
results in a deadlock and a hung task, as reported by Alexander Stein.
In the Linux kernel, the following vulnerability has been resolved:
spi: wpcm-fiu: Fix potential NULL pointer dereference in wpcm_fiu_probe()
platform_get_resource_byname() can return NULL, which would cause a crash
when passed the pointer to resource_size().
Move the fiu->memory_size assignment after the error check for
devm_ioremap_resource() to prevent the potential NULL pointer dereference.
In the Linux kernel, the following vulnerability has been resolved:
procfs: fix missing RCU protection when reading real_parent in do_task_stat()
When reading /proc/[pid]/stat, do_task_stat() accesses task->real_parent
without proper RCU protection, which leads to:
cpu 0 cpu 1
----- -----
do_task_stat
var = task->real_parent
release_task
call_rcu(delayed_put_task_struct)
task_tgid_nr_ns(var)
rcu_read_lock <--- Too late to protect task->real_parent!
task_pid_ptr <--- UAF!
rcu_read_unlock
This patch uses task_ppid_nr_ns() instead of task_tgid_nr_ns() to add
proper RCU protection for accessing task->real_parent.
In the Linux kernel, the following vulnerability has been resolved:
gpio: cdev: Avoid NULL dereference in linehandle_create()
In linehandle_create(), there is a statement like this:
retain_and_null_ptr(lh);
Soon after, there is a debug printout that dereferences "lh", which
will crash things.
Avoid the crash by using handlereq.lines, which is the same value.
In the Linux kernel, the following vulnerability has been resolved:
clocksource/drivers/timer-sp804: Fix an Oops when read_current_timer is called on ARM32 platforms where the SP804 is not registered as the sched_clock.
On SP804, the delay timer shares the same clkevt instance with
sched_clock. On some platforms, when
sp804_clocksource_and_sched_clock_init is called with use_sched_clock
not set to 1, sched_clkevt is not properly initialized. However,
sp804_register_delay_timer is invoked unconditionally, and
read_current_timer() subsequently calls sp804_read on an uninitialized
sched_clkevt, leading to a kernel Oops when accessing
sched_clkevt->value.
Declare a dedicated clkevt instance exclusively for delay timer,
instead of sharing the same clkevt with sched_clock. This ensures
that read_current_timer continues to work correctly regardless of
whether SP804 is selected as the sched_clock.
In the Linux kernel, the following vulnerability has been resolved:
NFS/localio: prevent direct reclaim recursion into NFS via nfs_writepages
LOCALIO is an NFS loopback mount optimization that avoids using the
network for READ, WRITE and COMMIT if the NFS client and server are
determined to be on the same system. But because LOCALIO is still
fundamentally "just NFS loopback mount" it is susceptible to recursion
deadlock via direct reclaim, e.g.: NFS LOCALIO down to XFS and then
back into NFS via nfs_writepages.
Fix LOCALIO's potential for direct reclaim deadlock by ensuring that
all its page cache allocations are done from GFP_NOFS context.
Thanks to Ben Coddington for pointing out commit ad22c7a043c2 ("xfs:
prevent stack overflows from page cache allocation").
In the Linux kernel, the following vulnerability has been resolved:
dmaengine: fsl-edma: don't explicitly disable clocks in .remove()
The clocks in fsl_edma_engine::muxclk are allocated and enabled with
devm_clk_get_enabled(), which automatically cleans these resources up,
but these clocks are also manually disabled in fsl_edma_remove(). This
causes warnings on driver removal for each clock:
edma_module already disabled
WARNING: CPU: 0 PID: 418 at drivers/clk/clk.c:1200 clk_core_disable+0x198/0x1c8
[...]
Call trace:
clk_core_disable+0x198/0x1c8 (P)
clk_disable+0x34/0x58
fsl_edma_remove+0x74/0xe8 [fsl_edma]
[...]
---[ end trace 0000000000000000 ]---
edma_module already unprepared
WARNING: CPU: 0 PID: 418 at drivers/clk/clk.c:1059 clk_core_unprepare+0x1f8/0x220
[...]
Call trace:
clk_core_unprepare+0x1f8/0x220 (P)
clk_unprepare+0x34/0x58
fsl_edma_remove+0x7c/0xe8 [fsl_edma]
[...]
---[ end trace 0000000000000000 ]---
Fix these warnings by removing the unnecessary fsl_disable_clocks() call
in fsl_edma_remove().