Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix section 3.8 number #76

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Fix section 3.8 number #76

wants to merge 1 commit into from

Conversation

ruckuus
Copy link

@ruckuus ruckuus commented Mar 5, 2014

No description provided.

antonblanchard pushed a commit to antonblanchard/linux that referenced this pull request Mar 7, 2014
After commit bcdde7e (sysfs: make __sysfs_remove_dir() recursive)
I'm seeing traces analogous to the one below in Thunderbolt testing:

WARNING: CPU: 3 PID: 76 at /scratch/rafael/work/linux-pm/fs/sysfs/group.c:214 sysfs_remove_group+0x59/0xe0()
 sysfs group ffffffff81c6c500 not found for kobject '0000:08'
 Modules linked in: ...
 CPU: 3 PID: 76 Comm: kworker/u16:7 Not tainted 3.13.0-rc1+ torvalds#76
 Hardware name: Acer Aspire S5-391/Venus    , BIOS V1.02 05/29/2012
 Workqueue: kacpi_hotplug acpi_hotplug_work_fn
  0000000000000009 ffff8801644b9ac8 ffffffff816b23bf 0000000000000007
  ffff8801644b9b18 ffff8801644b9b08 ffffffff81046607 ffff88016925b800
  0000000000000000 ffffffff81c6c500 ffff88016924f928 ffff88016924f800
 Call Trace:
  [<ffffffff816b23bf>] dump_stack+0x4e/0x71
  [<ffffffff81046607>] warn_slowpath_common+0x87/0xb0
  [<ffffffff810466d1>] warn_slowpath_fmt+0x41/0x50
  [<ffffffff811e42ef>] ? sysfs_get_dirent_ns+0x6f/0x80
  [<ffffffff811e5389>] sysfs_remove_group+0x59/0xe0
  [<ffffffff8149f00b>] dpm_sysfs_remove+0x3b/0x50
  [<ffffffff81495818>] device_del+0x58/0x1c0
  [<ffffffff814959c8>] device_unregister+0x48/0x60
  [<ffffffff813254fe>] pci_remove_bus+0x6e/0x80
  [<ffffffff81325548>] pci_remove_bus_device+0x38/0x110
  [<ffffffff8132555d>] pci_remove_bus_device+0x4d/0x110
  [<ffffffff81325639>] pci_stop_and_remove_bus_device+0x19/0x20
  [<ffffffff813418d0>] disable_slot+0x20/0xe0
  [<ffffffff81341a38>] acpiphp_check_bridge+0xa8/0xd0
  [<ffffffff813427ad>] hotplug_event+0x17d/0x220
  [<ffffffff81342880>] hotplug_event_work+0x30/0x70
  [<ffffffff8136d665>] acpi_hotplug_work_fn+0x18/0x24
  [<ffffffff81061331>] process_one_work+0x261/0x450
  [<ffffffff81061a7e>] worker_thread+0x21e/0x370
  [<ffffffff81061860>] ? rescuer_thread+0x300/0x300
  [<ffffffff81068342>] kthread+0xd2/0xe0
  [<ffffffff81068270>] ? flush_kthread_worker+0x70/0x70
  [<ffffffff816c19bc>] ret_from_fork+0x7c/0xb0
  [<ffffffff81068270>] ? flush_kthread_worker+0x70/0x70

(Mika Westerberg sees them too in his tests).

Some investigation documented in kernel bug #65281 led me to the
conclusion that the source of the problem is the device_del() in
pci_stop_dev() as it now causes the sysfs directory of the device to be
removed recursively along with all of its subdirectories.  That includes
the sysfs directory of the device's subordinate bus (dev->subordinate) and
its "power" group.

Consequently, when pci_remove_bus() is called for dev->subordinate in
pci_remove_bus_device(), it calls device_unregister(&bus->dev), but at this
point the sysfs directory of bus->dev doesn't exist any more and its
"power" group doesn't exist either.  Thus, when dpm_sysfs_remove() called
from device_del() tries to remove that group, it triggers the above
warning.

That indicates a logical mistake in the design of
pci_stop_and_remove_bus_device(), which causes bus device objects to be
left behind their parents (bridge device objects) and can be fixed by
moving the device_del() from pci_stop_dev() into pci_destroy_dev(), so
pci_remove_bus() can be called for the device's subordinate bus before the
device itself is unregistered from the hierarchy.  Still, the driver, if
any, should be detached from the device in pci_stop_dev(), so use
device_release_driver() directly from there.

References: https://bugzilla.kernel.org/show_bug.cgi?id=65281#c6
Reported-by: Mika Westerberg <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
ndyer pushed a commit to ndyer/linux that referenced this pull request Mar 10, 2014
Turn it into (for example):

[    0.073380] x86: Booting SMP configuration:
[    0.074005] .... node   #0, CPUs:          #1   dtor#2   #3   #4   #5   torvalds#6   torvalds#7
[    0.603005] .... node   #1, CPUs:     torvalds#8   torvalds#9  torvalds#10  torvalds#11  torvalds#12  torvalds#13  torvalds#14  torvalds#15
[    1.200005] .... node   dtor#2, CPUs:    torvalds#16  torvalds#17  torvalds#18  torvalds#19  torvalds#20  torvalds#21  torvalds#22  torvalds#23
[    1.796005] .... node   #3, CPUs:    torvalds#24  torvalds#25  torvalds#26  torvalds#27  torvalds#28  torvalds#29  torvalds#30  torvalds#31
[    2.393005] .... node   #4, CPUs:    torvalds#32  torvalds#33  torvalds#34  torvalds#35  torvalds#36  torvalds#37  torvalds#38  torvalds#39
[    2.996005] .... node   #5, CPUs:    torvalds#40  torvalds#41  torvalds#42  torvalds#43  torvalds#44  torvalds#45  torvalds#46  torvalds#47
[    3.600005] .... node   torvalds#6, CPUs:    torvalds#48  torvalds#49  torvalds#50  torvalds#51  #52  #53  torvalds#54  torvalds#55
[    4.202005] .... node   torvalds#7, CPUs:    torvalds#56  torvalds#57  #58  torvalds#59  torvalds#60  torvalds#61  torvalds#62  torvalds#63
[    4.811005] .... node   torvalds#8, CPUs:    torvalds#64  torvalds#65  torvalds#66  torvalds#67  torvalds#68  torvalds#69  #70  torvalds#71
[    5.421006] .... node   torvalds#9, CPUs:    torvalds#72  torvalds#73  torvalds#74  torvalds#75  torvalds#76  torvalds#77  torvalds#78  torvalds#79
[    6.032005] .... node  torvalds#10, CPUs:    torvalds#80  torvalds#81  torvalds#82  torvalds#83  torvalds#84  torvalds#85  torvalds#86  torvalds#87
[    6.648006] .... node  torvalds#11, CPUs:    torvalds#88  torvalds#89  torvalds#90  torvalds#91  torvalds#92  torvalds#93  torvalds#94  torvalds#95
[    7.262005] .... node  torvalds#12, CPUs:    torvalds#96  torvalds#97  torvalds#98  torvalds#99 torvalds#100 torvalds#101 torvalds#102 torvalds#103
[    7.865005] .... node  torvalds#13, CPUs:   torvalds#104 torvalds#105 torvalds#106 torvalds#107 torvalds#108 torvalds#109 torvalds#110 torvalds#111
[    8.466005] .... node  torvalds#14, CPUs:   torvalds#112 torvalds#113 torvalds#114 torvalds#115 torvalds#116 torvalds#117 torvalds#118 torvalds#119
[    9.073006] .... node  torvalds#15, CPUs:   torvalds#120 torvalds#121 torvalds#122 torvalds#123 torvalds#124 torvalds#125 torvalds#126 torvalds#127
[    9.679901] x86: Booted up 16 nodes, 128 CPUs

and drop useless elements.

Change num_digits() to hpa's division-avoiding, cell-phone-typed
version which he went at great lengths and pains to submit on a
Saturday evening.

Signed-off-by: Borislav Petkov <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Linus Torvalds <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
ndyer pushed a commit to ndyer/linux that referenced this pull request Mar 10, 2014
After commit bcdde7e (sysfs: make __sysfs_remove_dir() recursive)
I'm seeing traces analogous to the one below in Thunderbolt testing:

WARNING: CPU: 3 PID: 76 at /scratch/rafael/work/linux-pm/fs/sysfs/group.c:214 sysfs_remove_group+0x59/0xe0()
 sysfs group ffffffff81c6c500 not found for kobject '0000:08'
 Modules linked in: ...
 CPU: 3 PID: 76 Comm: kworker/u16:7 Not tainted 3.13.0-rc1+ torvalds#76
 Hardware name: Acer Aspire S5-391/Venus    , BIOS V1.02 05/29/2012
 Workqueue: kacpi_hotplug acpi_hotplug_work_fn
  0000000000000009 ffff8801644b9ac8 ffffffff816b23bf 0000000000000007
  ffff8801644b9b18 ffff8801644b9b08 ffffffff81046607 ffff88016925b800
  0000000000000000 ffffffff81c6c500 ffff88016924f928 ffff88016924f800
 Call Trace:
  [<ffffffff816b23bf>] dump_stack+0x4e/0x71
  [<ffffffff81046607>] warn_slowpath_common+0x87/0xb0
  [<ffffffff810466d1>] warn_slowpath_fmt+0x41/0x50
  [<ffffffff811e42ef>] ? sysfs_get_dirent_ns+0x6f/0x80
  [<ffffffff811e5389>] sysfs_remove_group+0x59/0xe0
  [<ffffffff8149f00b>] dpm_sysfs_remove+0x3b/0x50
  [<ffffffff81495818>] device_del+0x58/0x1c0
  [<ffffffff814959c8>] device_unregister+0x48/0x60
  [<ffffffff813254fe>] pci_remove_bus+0x6e/0x80
  [<ffffffff81325548>] pci_remove_bus_device+0x38/0x110
  [<ffffffff8132555d>] pci_remove_bus_device+0x4d/0x110
  [<ffffffff81325639>] pci_stop_and_remove_bus_device+0x19/0x20
  [<ffffffff813418d0>] disable_slot+0x20/0xe0
  [<ffffffff81341a38>] acpiphp_check_bridge+0xa8/0xd0
  [<ffffffff813427ad>] hotplug_event+0x17d/0x220
  [<ffffffff81342880>] hotplug_event_work+0x30/0x70
  [<ffffffff8136d665>] acpi_hotplug_work_fn+0x18/0x24
  [<ffffffff81061331>] process_one_work+0x261/0x450
  [<ffffffff81061a7e>] worker_thread+0x21e/0x370
  [<ffffffff81061860>] ? rescuer_thread+0x300/0x300
  [<ffffffff81068342>] kthread+0xd2/0xe0
  [<ffffffff81068270>] ? flush_kthread_worker+0x70/0x70
  [<ffffffff816c19bc>] ret_from_fork+0x7c/0xb0
  [<ffffffff81068270>] ? flush_kthread_worker+0x70/0x70

(Mika Westerberg sees them too in his tests).

Some investigation documented in kernel bug #65281 led me to the
conclusion that the source of the problem is the device_del() in
pci_stop_dev() as it now causes the sysfs directory of the device to be
removed recursively along with all of its subdirectories.  That includes
the sysfs directory of the device's subordinate bus (dev->subordinate) and
its "power" group.

Consequently, when pci_remove_bus() is called for dev->subordinate in
pci_remove_bus_device(), it calls device_unregister(&bus->dev), but at this
point the sysfs directory of bus->dev doesn't exist any more and its
"power" group doesn't exist either.  Thus, when dpm_sysfs_remove() called
from device_del() tries to remove that group, it triggers the above
warning.

That indicates a logical mistake in the design of
pci_stop_and_remove_bus_device(), which causes bus device objects to be
left behind their parents (bridge device objects) and can be fixed by
moving the device_del() from pci_stop_dev() into pci_destroy_dev(), so
pci_remove_bus() can be called for the device's subordinate bus before the
device itself is unregistered from the hierarchy.  Still, the driver, if
any, should be detached from the device in pci_stop_dev(), so use
device_release_driver() directly from there.

References: https://bugzilla.kernel.org/show_bug.cgi?id=65281#c6
Reported-by: Mika Westerberg <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
swarren pushed a commit to swarren/linux-tegra that referenced this pull request Jun 23, 2014
WARNING: quoted string split across lines
torvalds#60: FILE: fs/isofs/compress.c:163:
 					       " page idx = %d, bh idx = %d,"
+					       " avail_in = %ld,"

WARNING: quoted string split across lines
torvalds#61: FILE: fs/isofs/compress.c:164:
+					       " avail_in = %ld,"
+					       " avail_out = %ld\n",

WARNING: missing space after return type
torvalds#76: FILE: include/linux/decompress/bunzip2.h:5:
+	    long(*fill)(void*, unsigned long),

WARNING: missing space after return type
torvalds#77: FILE: include/linux/decompress/bunzip2.h:6:
+	    long(*flush)(void*, unsigned long),

WARNING: missing space after return type
torvalds#94: FILE: include/linux/decompress/generic.h:5:
+			      long(*fill)(void*, unsigned long),

WARNING: missing space after return type
torvalds#95: FILE: include/linux/decompress/generic.h:6:
+			      long(*flush)(void*, unsigned long),

WARNING: missing space after return type
torvalds#122: FILE: include/linux/decompress/inflate.h:5:
+	   long(*fill)(void*, unsigned long),

WARNING: missing space after return type
torvalds#123: FILE: include/linux/decompress/inflate.h:6:
+	   long(*flush)(void*, unsigned long),

WARNING: missing space after return type
torvalds#140: FILE: include/linux/decompress/unlz4.h:5:
+	long(*fill)(void*, unsigned long),

WARNING: missing space after return type
torvalds#141: FILE: include/linux/decompress/unlz4.h:6:
+	long(*flush)(void*, unsigned long),

WARNING: missing space after return type
torvalds#158: FILE: include/linux/decompress/unlzma.h:5:
+	   long(*fill)(void*, unsigned long),

WARNING: missing space after return type
torvalds#159: FILE: include/linux/decompress/unlzma.h:6:
+	   long(*flush)(void*, unsigned long),

WARNING: missing space after return type
torvalds#177: FILE: include/linux/decompress/unlzo.h:5:
+	long(*fill)(void*, unsigned long),

WARNING: missing space after return type
torvalds#178: FILE: include/linux/decompress/unlzo.h:6:
+	long(*flush)(void*, unsigned long),

WARNING: please, no spaces at the start of a line
torvalds#210: FILE: include/linux/zlib.h:86:
+    uLong     avail_in;  /* number of bytes available at next_in */$

WARNING: please, no spaces at the start of a line
torvalds#215: FILE: include/linux/zlib.h:90:
+    uLong     avail_out; /* remaining free space at next_out */$

WARNING: __initdata should be placed after count
torvalds#259: FILE: init/initramfs.c:177:
+static __initdata unsigned long count;

WARNING: __initdata should be placed after remains
torvalds#268: FILE: init/initramfs.c:189:
+static __initdata long remains;

WARNING: missing space after return type
torvalds#385: FILE: lib/decompress_bunzip2.c:679:
+			long(*fill)(void*, unsigned long),

WARNING: missing space after return type
torvalds#386: FILE: lib/decompress_bunzip2.c:680:
+			long(*flush)(void*, unsigned long),

WARNING: missing space after return type
torvalds#401: FILE: lib/decompress_bunzip2.c:747:
+			long(*fill)(void*, unsigned long),

WARNING: missing space after return type
torvalds#402: FILE: lib/decompress_bunzip2.c:748:
+			long(*flush)(void*, unsigned long),

WARNING: missing space after return type
torvalds#427: FILE: lib/decompress_inflate.c:37:
+		       long(*fill)(void*, unsigned long),

WARNING: missing space after return type
torvalds#428: FILE: lib/decompress_inflate.c:38:
+		       long(*flush)(void*, unsigned long),

WARNING: Unnecessary space before function pointer arguments
torvalds#456: FILE: lib/decompress_unlz4.c:35:
+				long (*fill) (void *, unsigned long),

WARNING: Unnecessary space before function pointer arguments
torvalds#457: FILE: lib/decompress_unlz4.c:36:
+				long (*flush) (void *, unsigned long),

WARNING: missing space after return type
torvalds#479: FILE: lib/decompress_unlz4.c:179:
+			      long(*fill)(void*, unsigned long),

WARNING: missing space after return type
torvalds#480: FILE: lib/decompress_unlz4.c:180:
+			      long(*flush)(void*, unsigned long),

WARNING: missing space after return type
torvalds#529: FILE: lib/decompress_unlzma.c:283:
+	long(*flush)(void*, unsigned long);

WARNING: missing space after return type
torvalds#541: FILE: lib/decompress_unlzma.c:538:
+			      long(*fill)(void*, unsigned long),

WARNING: missing space after return type
torvalds#542: FILE: lib/decompress_unlzma.c:539:
+			      long(*flush)(void*, unsigned long),

WARNING: missing space after return type
torvalds#557: FILE: lib/decompress_unlzma.c:671:
+			      long(*fill)(void*, unsigned long),

WARNING: missing space after return type
torvalds#558: FILE: lib/decompress_unlzma.c:672:
+			      long(*flush)(void*, unsigned long),

WARNING: Unnecessary space before function pointer arguments
torvalds#586: FILE: lib/decompress_unlzo.c:112:
+				long (*fill) (void *, unsigned long),

WARNING: Unnecessary space before function pointer arguments
torvalds#587: FILE: lib/decompress_unlzo.c:113:
+				long (*flush) (void *, unsigned long),

total: 0 errors, 35 warnings, 479 lines checked

./patches/initramfs-support-initramfs-that-is-more-than-2g.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Yinghai Lu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Gnurou pushed a commit to Gnurou/linux that referenced this pull request Jun 27, 2014
WARNING: Missing a blank line after declarations
torvalds#76: FILE: mm/zpool.c:79:
+			bool got = try_module_get(driver->owner);
+			spin_unlock(&drivers_lock);

total: 0 errors, 1 warnings, 94 lines checked

./patches/mm-zpool-prevent-zbud-zsmalloc-from-unloading-when-used.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Dan Streetman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
JoonsooKim pushed a commit to JoonsooKim/linux that referenced this pull request Jul 4, 2014
WARNING: Missing a blank line after declarations
torvalds#76: FILE: mm/zpool.c:79:
+			bool got = try_module_get(driver->owner);
+			spin_unlock(&drivers_lock);

total: 0 errors, 1 warnings, 94 lines checked

./patches/mm-zpool-prevent-zbud-zsmalloc-from-unloading-when-used.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Dan Streetman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
@Elizafox
Copy link

Elizafox commented Jan 8, 2015

Useless. And Linus doesn't accept PR's, to add insult to injury.

@swarren
Copy link
Contributor

swarren commented Jan 9, 2015

Eliza, I would not say that fixing documentation is useless. That's a little demeaning.

FWIW, this bug is fixed upstream already by 49d063c "proc: show mnt_id in /proc/pid/fdinfo".

krzk pushed a commit to krzk/linux that referenced this pull request May 2, 2015
ERROR: code indent should use tabs where possible
torvalds#76: FILE: mm/cma_debug.c:88:
+        int pages = val;$

WARNING: please, no spaces at the start of a line
torvalds#76: FILE: mm/cma_debug.c:88:
+        int pages = val;$

ERROR: code indent should use tabs where possible
torvalds#79: FILE: mm/cma_debug.c:91:
+        return cma_free_mem(cma, pages);$

WARNING: please, no spaces at the start of a line
torvalds#79: FILE: mm/cma_debug.c:91:
+        return cma_free_mem(cma, pages);$

total: 2 errors, 2 warnings, 69 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/mm-cma-release-trigger.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Sasha Levin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
tobetter pushed a commit to tobetter/linux that referenced this pull request May 12, 2015
hzhuang1 pushed a commit to hzhuang1/linux that referenced this pull request Jun 2, 2015
ARM64: hi6220: enable ldo21 by default
johnstultz-work pushed a commit to johnstultz-work/linux-dev that referenced this pull request Nov 19, 2015
While disabling ConfigFS Android gadget, android_disconnect() calls
kill_all_hid_devices(), if CONFIG_USB_CONFIGFS_F_ACC is enabled, to free
the registered HIDs without checking whether the USB accessory device
really exist or not. If USB accessory device doesn't exist then we run into
following kernel panic:
----8<----
[  136.724761] Unable to handle kernel NULL pointer dereference at virtual address 00000064
[  136.724809] pgd = c0204000
[  136.731924] [00000064] *pgd=00000000
[  136.737830] Internal error: Oops: 5 [#1] SMP ARM
[  136.738108] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc4-00400-gf75300e-dirty torvalds#76
[  136.742788] task: c0fb19d8 ti: c0fa4000 task.ti: c0fa4000
[  136.750890] PC is at _raw_spin_lock_irqsave+0x24/0x60
[  136.756246] LR is at kill_all_hid_devices+0x24/0x114
---->8----

This patch adds a test to check if USB Accessory device exists before freeing HIDs.

Change-Id: Ie229feaf0de3f4f7a151fcaa9a994e34e15ff73b
Signed-off-by: Amit Pundir <[email protected]>
johnstultz-work pushed a commit to johnstultz-work/linux-dev that referenced this pull request Nov 19, 2015
While disabling ConfigFS Android gadget, android_disconnect() calls
kill_all_hid_devices(), if CONFIG_USB_CONFIGFS_F_ACC is enabled, to free
the registered HIDs without checking whether the USB accessory device
really exist or not. If USB accessory device doesn't exist then we run into
following kernel panic:
----8<----
[  136.724761] Unable to handle kernel NULL pointer dereference at virtual address 00000064
[  136.724809] pgd = c0204000
[  136.731924] [00000064] *pgd=00000000
[  136.737830] Internal error: Oops: 5 [#1] SMP ARM
[  136.738108] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc4-00400-gf75300e-dirty torvalds#76
[  136.742788] task: c0fb19d8 ti: c0fa4000 task.ti: c0fa4000
[  136.750890] PC is at _raw_spin_lock_irqsave+0x24/0x60
[  136.756246] LR is at kill_all_hid_devices+0x24/0x114
---->8----

This patch adds a test to check if USB Accessory device exists before freeing HIDs.

Change-Id: Ie229feaf0de3f4f7a151fcaa9a994e34e15ff73b
Signed-off-by: Amit Pundir <[email protected]>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Dec 20, 2015
I've observed various spew (KASAN, warnings, oopses, etc.) that seem to
stem from incorrect cloning of dccp_sock in dccp_create_openreq_child().

The problem is that struct dccp_sock's
  ->dccps_hc_rx_ackvec,
  ->dccps_hc_rx_ccid, and
  ->dccps_hc_tx_ccid
members are pointers to memory which is not reference counted and not
protected by any locks, so sharing them between original sock and the
clone seems like a bad idea.

The usual symptom would be a use-after-free which happens when an
operation on the original sock causes any of these pointers to be freed
followed by an operation on the cloned sock:

==================================================================
BUG: KASAN: use-after-free in dccp_sync_mss+0x45/0x160 at addr ffff880012c65780
Read of size 8 by task a.out/987
=============================================================================
BUG ccid2_hc_tx_sock (Tainted: G        W      ): kasan: bad access detected
-----------------------------------------------------------------------------

Disabling lock debugging due to kernel taint
INFO: Allocated in ccid_new+0x1b4/0x270 age=64589 cpu=0 pid=986
        ___slab_alloc+0x724/0x810
        __slab_alloc.isra.49+0x86/0xc0
        kmem_cache_alloc+0x25a/0x2d0
        ccid_new+0x1b4/0x270
        dccp_hdlr_ccid+0x26/0xe0
        __dccp_feat_activate+0xc3/0x180
        dccp_feat_activate_values+0x2fa/0x4c0
        dccp_rcv_state_process+0x814/0xa80
        dccp_v4_do_rcv+0x6a/0x100
        release_sock+0x168/0x330
        inet_stream_connect+0x6d/0x90
        SYSC_connect+0x1d0/0x200
        SyS_connect+0x11/0x20
        entry_SYSCALL_64_fastpath+0x12/0x71
INFO: Freed in ccid_hc_tx_delete+0x7d/0x90 age=11330 cpu=1 pid=989
        __slab_free+0x1f0/0x360
        kmem_cache_free+0x2b6/0x300
        ccid_hc_tx_delete+0x7d/0x90
        dccp_hdlr_ccid+0x65/0xe0
        __dccp_feat_activate+0xc3/0x180
        dccp_feat_activate_values+0x2fa/0x4c0
        dccp_create_openreq_child+0x1fc/0x290
        dccp_v4_request_recv_sock+0x67/0x430
        dccp_check_req+0x248/0x330
        dccp_v4_rcv+0x2a8/0xd50
        ip_local_deliver_finish+0x160/0x4c0
        ip_local_deliver+0x175/0x230
        ip_rcv_finish+0x119/0x750
        ip_rcv+0x678/0x960
        __netif_receive_skb_core+0xe64/0x1810
        __netif_receive_skb+0x41/0xf0
INFO: Slab 0xffffea00004b1800 objects=20 used=9 fp=0xffff880012c644c0 flags=0x100000000004080
INFO: Object 0xffff880012c65780 @offset=22400 fp=0xffff880012c60c80
[...]
CPU: 0 PID: 987 Comm: a.out Tainted: G    B   W       4.4.0-rc5+ torvalds#76
 ffffea00004b1800 ffff88001304fa40 ffffffff8169ed5b ffff88001422e800
 ffff88001304fa70 ffffffff812e36ec ffff88001422e800 ffffea00004b1800
 ffff880012c65780 000000000000ffff ffff88001304fa98 ffffffff812e946f
Call Trace:
 [<ffffffff8169ed5b>] dump_stack+0x8d/0xe2
 [<ffffffff812e36ec>] print_trailer+0x13c/0x1b0
 [<ffffffff812e946f>] object_err+0x3f/0x50
 [<ffffffff812f02c3>] kasan_report_error+0x2e3/0x6e0
 [<ffffffff8108d321>] ? kvm_clock_get_cycles+0x11/0x20
 [<ffffffff81e8fb33>] ? secure_dccp_sequence_number+0x133/0x1d0
 [<ffffffff812f0704>] kasan_report+0x44/0x50
 [<ffffffff82207155>] ? dccp_sync_mss+0x45/0x160
 [<ffffffff812ef403>] __asan_load8+0x93/0xe0
 [<ffffffff82207155>] dccp_sync_mss+0x45/0x160
 [<ffffffff822080df>] dccp_connect+0x7f/0x2a0
 [<ffffffff82217632>] dccp_v4_connect+0x612/0x960
 [<ffffffff81ff20a7>] __inet_stream_connect+0x1d7/0x6a0
 [<ffffffff8110438b>] ? preempt_count_sub+0x1b/0x170
 [<ffffffff824cd007>] ? _raw_spin_unlock_irqrestore+0x47/0x90
 [<ffffffff81ff1ed0>] ? inet_sendpage+0x200/0x200
 [<ffffffff811302c1>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
 [<ffffffff8110438b>] ? preempt_count_sub+0x1b/0x170
 [<ffffffff810baa41>] ? __local_bh_enable_ip+0x61/0x110
 [<ffffffff81ff25c1>] inet_stream_connect+0x51/0x90
 [<ffffffff81e671a0>] SYSC_connect+0x1d0/0x200
 [<ffffffff81ff2570>] ? __inet_stream_connect+0x6a0/0x6a0
 [<ffffffff81e66fd0>] ? ___sys_recvmsg+0x3d0/0x3d0
 [<ffffffff81384490>] ? SyS_epoll_create+0x1a0/0x1a0
 [<ffffffff813372e5>] ? __fget+0x115/0x180
 [<ffffffff8133739d>] ? __fget_light+0x4d/0xf0
 [<ffffffff813822e0>] ? ep_poll_wakeup_proc+0x30/0x30
 [<ffffffff81e69e71>] SyS_connect+0x11/0x20
 [<ffffffff824cdd6e>] entry_SYSCALL_64_fastpath+0x12/0x71

I'm not really sure if setting them to NULL is really the correct
solution -- maybe we should try to duplicate the pointed-to memory
instead?

Anyway, this is a tentative patch that explains the issue and fixes
this particular problem -- dccp fuzzing now runs for minutes rather
than seconds before encountering a crash. I haven't tested any
real world workloads on this patch.

Signed-off-by: Vegard Nossum <[email protected]>
diaevd referenced this pull request in diaevd/zen-kernel Dec 30, 2015
Fixed compilation for linux kernel < 3.3.0
xin3liang pushed a commit to xin3liang/linux that referenced this pull request Jan 13, 2016
While disabling ConfigFS Android gadget, android_disconnect() calls
kill_all_hid_devices(), if CONFIG_USB_CONFIGFS_F_ACC is enabled, to free
the registered HIDs without checking whether the USB accessory device
really exist or not. If USB accessory device doesn't exist then we run into
following kernel panic:
----8<----
[  136.724761] Unable to handle kernel NULL pointer dereference at virtual address 00000064
[  136.724809] pgd = c0204000
[  136.731924] [00000064] *pgd=00000000
[  136.737830] Internal error: Oops: 5 [#1] SMP ARM
[  136.738108] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc4-00400-gf75300e-dirty torvalds#76
[  136.742788] task: c0fb19d8 ti: c0fa4000 task.ti: c0fa4000
[  136.750890] PC is at _raw_spin_lock_irqsave+0x24/0x60
[  136.756246] LR is at kill_all_hid_devices+0x24/0x114
---->8----

This patch adds a test to check if USB Accessory device exists before freeing HIDs.

Change-Id: Ie229feaf0de3f4f7a151fcaa9a994e34e15ff73b
Signed-off-by: Amit Pundir <[email protected]>
xin3liang pushed a commit to xin3liang/linux that referenced this pull request Feb 6, 2016
While disabling ConfigFS Android gadget, android_disconnect() calls
kill_all_hid_devices(), if CONFIG_USB_CONFIGFS_F_ACC is enabled, to free
the registered HIDs without checking whether the USB accessory device
really exist or not. If USB accessory device doesn't exist then we run into
following kernel panic:
----8<----
[  136.724761] Unable to handle kernel NULL pointer dereference at virtual address 00000064
[  136.724809] pgd = c0204000
[  136.731924] [00000064] *pgd=00000000
[  136.737830] Internal error: Oops: 5 [#1] SMP ARM
[  136.738108] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc4-00400-gf75300e-dirty torvalds#76
[  136.742788] task: c0fb19d8 ti: c0fa4000 task.ti: c0fa4000
[  136.750890] PC is at _raw_spin_lock_irqsave+0x24/0x60
[  136.756246] LR is at kill_all_hid_devices+0x24/0x114
---->8----

This patch adds a test to check if USB Accessory device exists before freeing HIDs.

Change-Id: Ie229feaf0de3f4f7a151fcaa9a994e34e15ff73b
Signed-off-by: Amit Pundir <[email protected]>
xin3liang pushed a commit to xin3liang/linux that referenced this pull request Mar 18, 2016
While disabling ConfigFS Android gadget, android_disconnect() calls
kill_all_hid_devices(), if CONFIG_USB_CONFIGFS_F_ACC is enabled, to free
the registered HIDs without checking whether the USB accessory device
really exist or not. If USB accessory device doesn't exist then we run into
following kernel panic:
----8<----
[  136.724761] Unable to handle kernel NULL pointer dereference at virtual address 00000064
[  136.724809] pgd = c0204000
[  136.731924] [00000064] *pgd=00000000
[  136.737830] Internal error: Oops: 5 [#1] SMP ARM
[  136.738108] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc4-00400-gf75300e-dirty torvalds#76
[  136.742788] task: c0fb19d8 ti: c0fa4000 task.ti: c0fa4000
[  136.750890] PC is at _raw_spin_lock_irqsave+0x24/0x60
[  136.756246] LR is at kill_all_hid_devices+0x24/0x114
---->8----

This patch adds a test to check if USB Accessory device exists before freeing HIDs.

Change-Id: Ie229feaf0de3f4f7a151fcaa9a994e34e15ff73b
Signed-off-by: Amit Pundir <[email protected]>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request May 11, 2016
This string is used by dump_stack and as we now support
more SoC's than just STiH415/6 it is misleading to have
the current string in the stack trace.

This patch updates it to be more generic for the STi
family of SoCs.

So instead of looking like this

[  271.672555] Hardware name: STiH415/416 SoC with Flattened Device Tree
[  271.678998] [<c0310490>] (unwind_backtrace) from [<c030bb54>] (show_stack+0x10/0x14)
[  271.686746] [<c030bb54>] (show_stack) from [<c058bc4c>] (dump_stack+0x98/0xac)
[snip]

it now looks like this:

[    2.669879] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc3-00026-g38a1ce6-dirty torvalds#76
[    2.677973] Hardware name: STi SoC with Flattened Device Tree
[    2.683723] [<c0310490>] (unwind_backtrace) from [<c030bb54>] (show_stack+0x10/0x14)
[    2.691472] [<c030bb54>] (show_stack) from [<c058bc0c>] (dump_stack+0x98/0xac)
[snip]

Signed-off-by: Peter Griffin <[email protected]>
showliu referenced this pull request in showliu/linux Jun 16, 2016
While disabling ConfigFS Android gadget, android_disconnect() calls
kill_all_hid_devices(), if CONFIG_USB_CONFIGFS_F_ACC is enabled, to free
the registered HIDs without checking whether the USB accessory device
really exist or not. If USB accessory device doesn't exist then we run into
following kernel panic:
----8<----
[  136.724761] Unable to handle kernel NULL pointer dereference at virtual address 00000064
[  136.724809] pgd = c0204000
[  136.731924] [00000064] *pgd=00000000
[  136.737830] Internal error: Oops: 5 [linaro-swg#1] SMP ARM
[  136.738108] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc4-00400-gf75300e-dirty linaro-swg#76
[  136.742788] task: c0fb19d8 ti: c0fa4000 task.ti: c0fa4000
[  136.750890] PC is at _raw_spin_lock_irqsave+0x24/0x60
[  136.756246] LR is at kill_all_hid_devices+0x24/0x114
---->8----

This patch adds a test to check if USB Accessory device exists before freeing HIDs.

Change-Id: Ie229feaf0de3f4f7a151fcaa9a994e34e15ff73b
Signed-off-by: Amit Pundir <[email protected]>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Jul 17, 2016
This string is used by dump_stack and as we now support
more SoC's than just STiH415/6 it is misleading to have
the current string in the stack trace.

This patch updates it to be more generic for the STi
family of SoCs.

So instead of looking like this

[  271.672555] Hardware name: STiH415/416 SoC with Flattened Device Tree
[  271.678998] [<c0310490>] (unwind_backtrace) from [<c030bb54>] (show_stack+0x10/0x14)
[  271.686746] [<c030bb54>] (show_stack) from [<c058bc4c>] (dump_stack+0x98/0xac)
[snip]

it now looks like this:

[    2.669879] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc3-00026-g38a1ce6-dirty torvalds#76
[    2.677973] Hardware name: STi SoC with Flattened Device Tree
[    2.683723] [<c0310490>] (unwind_backtrace) from [<c030bb54>] (show_stack+0x10/0x14)
[    2.691472] [<c030bb54>] (show_stack) from [<c058bc0c>] (dump_stack+0x98/0xac)
[snip]

Signed-off-by: Peter Griffin <[email protected]>
Acked-by: Patrice Chotard <[email protected]>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Oct 6, 2016
This commit fixes a stack corruption in the pseries specific code dealing
with the huge pages.

In __pSeries_lpar_hugepage_invalidate() the buffer used to pass arguments
to the hypervisor is not large enough. This leads to a stack corruption
where a previously saved register could be corrupted leading to unexpected
result in the caller, like the following panic:

Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: virtio_balloon ip_tables x_tables autofs4
virtio_blk 8139too virtio_pci virtio_ring 8139cp virtio
CPU: 11 PID: 1916 Comm: mmstress Not tainted 4.8.0 torvalds#76
task: c000000005394880 task.stack: c000000005570000
NIP: c00000000027bf6c LR: c00000000027bf64 CTR: 0000000000000000
REGS: c000000005573820 TRAP: 0300   Not tainted  (4.8.0)
MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 84822884  XER:
20000000
CFAR: c00000000010a924 DAR: 420000000014e5e0 DSISR: 40000000 SOFTE: 1
GPR00: c00000000027bf64 c000000005573aa0 c000000000e02800 c000000004447964
GPR04: c00000000404de18 c000000004d38810 00000000042100f5 00000000f5002104
GPR08: e0000000f5002104 0000000000000001 042100f5000000e0 00000000042100f5
GPR12: 0000000000002200 c00000000fe02c00 c00000000404de18 0000000000000000
GPR16: c1ffffffffffe7ff 00003fff62000000 420000000014e5e0 00003fff63000000
GPR20: 0008000000000000 c0000000f7014800 0405e600000000e0 0000000000010000
GPR24: c000000004d38810 c000000004447c10 c00000000404de18 c000000004447964
GPR28: c000000005573b10 c000000004d38810 00003fff62000000 420000000014e5e0
NIP [c00000000027bf6c] zap_huge_pmd+0x4c/0x470
LR [c00000000027bf64] zap_huge_pmd+0x44/0x470
Call Trace:
[c000000005573aa0] [c00000000027bf64] zap_huge_pmd+0x44/0x470 (unreliable)
[c000000005573af0] [c00000000022bbd8] unmap_page_range+0xcf8/0xed0
[c000000005573c30] [c00000000022c2d4] unmap_vmas+0x84/0x120
[c000000005573c80] [c000000000235448] unmap_region+0xd8/0x1b0
[c000000005573d80] [c0000000002378f0] do_munmap+0x2d0/0x4c0
[c000000005573df0] [c000000000237be4] SyS_munmap+0x64/0xb0
[c000000005573e30] [c000000000009560] system_call+0x38/0x108
Instruction dump:
fbe1fff8 fb81ffe0 7c7f1b78 7ca32b78 7cbd2b78 f8010010 7c9a2378 f821ffb1
7cde3378 4bfffea9 7c7b1b79 41820298 <e87f0000> 48000130 7fa5eb78 7fc4f378

Most of the time, the bug is surfacing in a caller up in the stack from
__pSeries_lpar_hugepage_invalidate() which is quite confusing.

This bug is pending since v3.11 but was hidden if a caller of the
caller of __pSeries_lpar_hugepage_invalidate() has pushed the corruped
register (r18 in this case) in the stack and is not using it until
restoring it. GCC 6.2.0 seems to raise it more frequently.

This commit also change the definition of the parameter buffer in
pSeries_lpar_flush_hash_range() to rely on the global define
PLPAR_HCALL9_BUFSIZE (no functional change here).

Fixes: 1a52728 ("powerpc: Optimize hugepage invalidate")
Cc: <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Laurent Dufour <[email protected]>
mpe pushed a commit to linuxppc/linux that referenced this pull request Oct 11, 2016
This commit fixes a stack corruption in the pseries specific code dealing
with the huge pages.

In __pSeries_lpar_hugepage_invalidate() the buffer used to pass arguments
to the hypervisor is not large enough. This leads to a stack corruption
where a previously saved register could be corrupted leading to unexpected
result in the caller, like the following panic:

  Oops: Kernel access of bad area, sig: 11 [#1]
  SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: virtio_balloon ip_tables x_tables autofs4
  virtio_blk 8139too virtio_pci virtio_ring 8139cp virtio
  CPU: 11 PID: 1916 Comm: mmstress Not tainted 4.8.0 torvalds#76
  task: c000000005394880 task.stack: c000000005570000
  NIP: c00000000027bf6c LR: c00000000027bf64 CTR: 0000000000000000
  REGS: c000000005573820 TRAP: 0300   Not tainted  (4.8.0)
  MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 84822884  XER: 20000000
  CFAR: c00000000010a924 DAR: 420000000014e5e0 DSISR: 40000000 SOFTE: 1
  GPR00: c00000000027bf64 c000000005573aa0 c000000000e02800 c000000004447964
  GPR04: c00000000404de18 c000000004d38810 00000000042100f5 00000000f5002104
  GPR08: e0000000f5002104 0000000000000001 042100f5000000e0 00000000042100f5
  GPR12: 0000000000002200 c00000000fe02c00 c00000000404de18 0000000000000000
  GPR16: c1ffffffffffe7ff 00003fff62000000 420000000014e5e0 00003fff63000000
  GPR20: 0008000000000000 c0000000f7014800 0405e600000000e0 0000000000010000
  GPR24: c000000004d38810 c000000004447c10 c00000000404de18 c000000004447964
  GPR28: c000000005573b10 c000000004d38810 00003fff62000000 420000000014e5e0
  NIP [c00000000027bf6c] zap_huge_pmd+0x4c/0x470
  LR [c00000000027bf64] zap_huge_pmd+0x44/0x470
  Call Trace:
  [c000000005573aa0] [c00000000027bf64] zap_huge_pmd+0x44/0x470 (unreliable)
  [c000000005573af0] [c00000000022bbd8] unmap_page_range+0xcf8/0xed0
  [c000000005573c30] [c00000000022c2d4] unmap_vmas+0x84/0x120
  [c000000005573c80] [c000000000235448] unmap_region+0xd8/0x1b0
  [c000000005573d80] [c0000000002378f0] do_munmap+0x2d0/0x4c0
  [c000000005573df0] [c000000000237be4] SyS_munmap+0x64/0xb0
  [c000000005573e30] [c000000000009560] system_call+0x38/0x108
  Instruction dump:
  fbe1fff8 fb81ffe0 7c7f1b78 7ca32b78 7cbd2b78 f8010010 7c9a2378 f821ffb1
  7cde3378 4bfffea9 7c7b1b79 41820298 <e87f0000> 48000130 7fa5eb78 7fc4f378

Most of the time, the bug is surfacing in a caller up in the stack from
__pSeries_lpar_hugepage_invalidate() which is quite confusing.

This bug is pending since v3.11 but was hidden if a caller of the
caller of __pSeries_lpar_hugepage_invalidate() has pushed the corruped
register (r18 in this case) in the stack and is not using it until
restoring it. GCC 6.2.0 seems to raise it more frequently.

This commit also change the definition of the parameter buffer in
pSeries_lpar_flush_hash_range() to rely on the global define
PLPAR_HCALL9_BUFSIZE (no functional change here).

Fixes: 1a52728 ("powerpc: Optimize hugepage invalidate")
Cc: [email protected] # v3.11+
Signed-off-by: Laurent Dufour <[email protected]>
Reviewed-by: Aneesh Kumar K.V <[email protected]>
Acked-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Noltari pushed a commit to Noltari/linux that referenced this pull request Oct 28, 2016
commit 05af40e upstream.

This commit fixes a stack corruption in the pseries specific code dealing
with the huge pages.

In __pSeries_lpar_hugepage_invalidate() the buffer used to pass arguments
to the hypervisor is not large enough. This leads to a stack corruption
where a previously saved register could be corrupted leading to unexpected
result in the caller, like the following panic:

  Oops: Kernel access of bad area, sig: 11 [#1]
  SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: virtio_balloon ip_tables x_tables autofs4
  virtio_blk 8139too virtio_pci virtio_ring 8139cp virtio
  CPU: 11 PID: 1916 Comm: mmstress Not tainted 4.8.0 torvalds#76
  task: c000000005394880 task.stack: c000000005570000
  NIP: c00000000027bf6c LR: c00000000027bf64 CTR: 0000000000000000
  REGS: c000000005573820 TRAP: 0300   Not tainted  (4.8.0)
  MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 84822884  XER: 20000000
  CFAR: c00000000010a924 DAR: 420000000014e5e0 DSISR: 40000000 SOFTE: 1
  GPR00: c00000000027bf64 c000000005573aa0 c000000000e02800 c000000004447964
  GPR04: c00000000404de18 c000000004d38810 00000000042100f5 00000000f5002104
  GPR08: e0000000f5002104 0000000000000001 042100f5000000e0 00000000042100f5
  GPR12: 0000000000002200 c00000000fe02c00 c00000000404de18 0000000000000000
  GPR16: c1ffffffffffe7ff 00003fff62000000 420000000014e5e0 00003fff63000000
  GPR20: 0008000000000000 c0000000f7014800 0405e600000000e0 0000000000010000
  GPR24: c000000004d38810 c000000004447c10 c00000000404de18 c000000004447964
  GPR28: c000000005573b10 c000000004d38810 00003fff62000000 420000000014e5e0
  NIP [c00000000027bf6c] zap_huge_pmd+0x4c/0x470
  LR [c00000000027bf64] zap_huge_pmd+0x44/0x470
  Call Trace:
  [c000000005573aa0] [c00000000027bf64] zap_huge_pmd+0x44/0x470 (unreliable)
  [c000000005573af0] [c00000000022bbd8] unmap_page_range+0xcf8/0xed0
  [c000000005573c30] [c00000000022c2d4] unmap_vmas+0x84/0x120
  [c000000005573c80] [c000000000235448] unmap_region+0xd8/0x1b0
  [c000000005573d80] [c0000000002378f0] do_munmap+0x2d0/0x4c0
  [c000000005573df0] [c000000000237be4] SyS_munmap+0x64/0xb0
  [c000000005573e30] [c000000000009560] system_call+0x38/0x108
  Instruction dump:
  fbe1fff8 fb81ffe0 7c7f1b78 7ca32b78 7cbd2b78 f8010010 7c9a2378 f821ffb1
  7cde3378 4bfffea9 7c7b1b79 41820298 <e87f0000> 48000130 7fa5eb78 7fc4f378

Most of the time, the bug is surfacing in a caller up in the stack from
__pSeries_lpar_hugepage_invalidate() which is quite confusing.

This bug is pending since v3.11 but was hidden if a caller of the
caller of __pSeries_lpar_hugepage_invalidate() has pushed the corruped
register (r18 in this case) in the stack and is not using it until
restoring it. GCC 6.2.0 seems to raise it more frequently.

This commit also change the definition of the parameter buffer in
pSeries_lpar_flush_hash_range() to rely on the global define
PLPAR_HCALL9_BUFSIZE (no functional change here).

Fixes: 1a52728 ("powerpc: Optimize hugepage invalidate")
Signed-off-by: Laurent Dufour <[email protected]>
Reviewed-by: Aneesh Kumar K.V <[email protected]>
Acked-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
metux pushed a commit to metux/linux that referenced this pull request Nov 5, 2016
While disabling ConfigFS Android gadget, android_disconnect() calls
kill_all_hid_devices(), if CONFIG_USB_CONFIGFS_F_ACC is enabled, to free
the registered HIDs without checking whether the USB accessory device
really exist or not. If USB accessory device doesn't exist then we run into
following kernel panic:
----8<----
[  136.724761] Unable to handle kernel NULL pointer dereference at virtual address 00000064
[  136.724809] pgd = c0204000
[  136.731924] [00000064] *pgd=00000000
[  136.737830] Internal error: Oops: 5 [#1] SMP ARM
[  136.738108] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc4-00400-gf75300e-dirty torvalds#76
[  136.742788] task: c0fb19d8 ti: c0fa4000 task.ti: c0fa4000
[  136.750890] PC is at _raw_spin_lock_irqsave+0x24/0x60
[  136.756246] LR is at kill_all_hid_devices+0x24/0x114
---->8----

This patch adds a test to check if USB Accessory device exists before freeing HIDs.

Change-Id: Ie229feaf0de3f4f7a151fcaa9a994e34e15ff73b
Signed-off-by: Amit Pundir <[email protected]>
Kaz205 pushed a commit to Kaz205/linux that referenced this pull request Apr 25, 2024
commit 50449ca upstream.

On arm64 machines, swsusp_save() faults if it attempts to access
MEMBLOCK_NOMAP memory ranges. This can be reproduced in QEMU using UEFI
when booting with rodata=off debug_pagealloc=off and CONFIG_KFENCE=n:

  Unable to handle kernel paging request at virtual address ffffff8000000000
  Mem abort info:
    ESR = 0x0000000096000007
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    FSC = 0x07: level 3 translation fault
  Data abort info:
    ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
    CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
  swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000eeb0b000
  [ffffff8000000000] pgd=180000217fff9803, p4d=180000217fff9803, pud=180000217fff9803, pmd=180000217fff8803, pte=0000000000000000
  Internal error: Oops: 0000000096000007 [#1] SMP
  Internal error: Oops: 0000000096000007 [#1] SMP
  Modules linked in: xt_multiport ipt_REJECT nf_reject_ipv4 xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter bpfilter rfkill at803x snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg dwmac_generic stmmac_platform snd_hda_codec stmmac joydev pcs_xpcs snd_hda_core phylink ppdev lp parport ramoops reed_solomon ip_tables x_tables nls_iso8859_1 vfat multipath linear amdgpu amdxcp drm_exec gpu_sched drm_buddy hid_generic usbhid hid radeon video drm_suballoc_helper drm_ttm_helper ttm i2c_algo_bit drm_display_helper cec drm_kms_helper drm
  CPU: 0 PID: 3663 Comm: systemd-sleep Not tainted 6.6.2+ torvalds#76
  Source Version: 4e22ed63a0a48e7a7cff9b98b7806d8d4add7dc0
  Hardware name: Greatwall GW-XXXXXX-XXX/GW-XXXXXX-XXX, BIOS KunLun BIOS V4.0 01/19/2021
  pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : swsusp_save+0x280/0x538
  lr : swsusp_save+0x280/0x538
  sp : ffffffa034a3fa40
  x29: ffffffa034a3fa40 x28: ffffff8000001000 x27: 0000000000000000
  x26: ffffff8001400000 x25: ffffffc08113e248 x24: 0000000000000000
  x23: 0000000000080000 x22: ffffffc08113e280 x21: 00000000000c69f2
  x20: ffffff8000000000 x19: ffffffc081ae2500 x18: 0000000000000000
  x17: 6666662074736420 x16: 3030303030303030 x15: 3038666666666666
  x14: 0000000000000b69 x13: ffffff9f89088530 x12: 00000000ffffffea
  x11: 00000000ffff7fff x10: 00000000ffff7fff x9 : ffffffc08193f0d0
  x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 0000000000000001
  x5 : ffffffa0fff09dc8 x4 : 0000000000000000 x3 : 0000000000000027
  x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000004e
  Call trace:
   swsusp_save+0x280/0x538
   swsusp_arch_suspend+0x148/0x190
   hibernation_snapshot+0x240/0x39c
   hibernate+0xc4/0x378
   state_store+0xf0/0x10c
   kobj_attr_store+0x14/0x24

The reason is swsusp_save() -> copy_data_pages() -> page_is_saveable()
-> kernel_page_present() assuming that a page is always present when
can_set_direct_map() is false (all of rodata_full,
debug_pagealloc_enabled() and arm64_kfence_can_set_direct_map() false),
irrespective of the MEMBLOCK_NOMAP ranges. Such MEMBLOCK_NOMAP regions
should not be saved during hibernation.

This problem was introduced by changes to the pfn_valid() logic in
commit a7d9f30 ("arm64: drop pfn_valid_within() and simplify
pfn_valid()").

Similar to other architectures, drop the !can_set_direct_map() check in
kernel_page_present() so that page_is_savable() skips such pages.

Fixes: a7d9f30 ("arm64: drop pfn_valid_within() and simplify pfn_valid()")
Cc: <[email protected]> # 5.14.x
Suggested-by: Mike Rapoport <[email protected]>
Suggested-by: Catalin Marinas <[email protected]>
Co-developed-by: xiongxin <[email protected]>
Signed-off-by: xiongxin <[email protected]>
Signed-off-by: Yaxiong Tian <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[[email protected]: rework commit message]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Apr 25, 2024
commit 50449ca upstream.

On arm64 machines, swsusp_save() faults if it attempts to access
MEMBLOCK_NOMAP memory ranges. This can be reproduced in QEMU using UEFI
when booting with rodata=off debug_pagealloc=off and CONFIG_KFENCE=n:

  Unable to handle kernel paging request at virtual address ffffff8000000000
  Mem abort info:
    ESR = 0x0000000096000007
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    FSC = 0x07: level 3 translation fault
  Data abort info:
    ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
    CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
  swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000eeb0b000
  [ffffff8000000000] pgd=180000217fff9803, p4d=180000217fff9803, pud=180000217fff9803, pmd=180000217fff8803, pte=0000000000000000
  Internal error: Oops: 0000000096000007 [#1] SMP
  Internal error: Oops: 0000000096000007 [#1] SMP
  Modules linked in: xt_multiport ipt_REJECT nf_reject_ipv4 xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter bpfilter rfkill at803x snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg dwmac_generic stmmac_platform snd_hda_codec stmmac joydev pcs_xpcs snd_hda_core phylink ppdev lp parport ramoops reed_solomon ip_tables x_tables nls_iso8859_1 vfat multipath linear amdgpu amdxcp drm_exec gpu_sched drm_buddy hid_generic usbhid hid radeon video drm_suballoc_helper drm_ttm_helper ttm i2c_algo_bit drm_display_helper cec drm_kms_helper drm
  CPU: 0 PID: 3663 Comm: systemd-sleep Not tainted 6.6.2+ torvalds#76
  Source Version: 4e22ed63a0a48e7a7cff9b98b7806d8d4add7dc0
  Hardware name: Greatwall GW-XXXXXX-XXX/GW-XXXXXX-XXX, BIOS KunLun BIOS V4.0 01/19/2021
  pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : swsusp_save+0x280/0x538
  lr : swsusp_save+0x280/0x538
  sp : ffffffa034a3fa40
  x29: ffffffa034a3fa40 x28: ffffff8000001000 x27: 0000000000000000
  x26: ffffff8001400000 x25: ffffffc08113e248 x24: 0000000000000000
  x23: 0000000000080000 x22: ffffffc08113e280 x21: 00000000000c69f2
  x20: ffffff8000000000 x19: ffffffc081ae2500 x18: 0000000000000000
  x17: 6666662074736420 x16: 3030303030303030 x15: 3038666666666666
  x14: 0000000000000b69 x13: ffffff9f89088530 x12: 00000000ffffffea
  x11: 00000000ffff7fff x10: 00000000ffff7fff x9 : ffffffc08193f0d0
  x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 0000000000000001
  x5 : ffffffa0fff09dc8 x4 : 0000000000000000 x3 : 0000000000000027
  x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000004e
  Call trace:
   swsusp_save+0x280/0x538
   swsusp_arch_suspend+0x148/0x190
   hibernation_snapshot+0x240/0x39c
   hibernate+0xc4/0x378
   state_store+0xf0/0x10c
   kobj_attr_store+0x14/0x24

The reason is swsusp_save() -> copy_data_pages() -> page_is_saveable()
-> kernel_page_present() assuming that a page is always present when
can_set_direct_map() is false (all of rodata_full,
debug_pagealloc_enabled() and arm64_kfence_can_set_direct_map() false),
irrespective of the MEMBLOCK_NOMAP ranges. Such MEMBLOCK_NOMAP regions
should not be saved during hibernation.

This problem was introduced by changes to the pfn_valid() logic in
commit a7d9f30 ("arm64: drop pfn_valid_within() and simplify
pfn_valid()").

Similar to other architectures, drop the !can_set_direct_map() check in
kernel_page_present() so that page_is_savable() skips such pages.

Fixes: a7d9f30 ("arm64: drop pfn_valid_within() and simplify pfn_valid()")
Cc: <[email protected]> # 5.14.x
Suggested-by: Mike Rapoport <[email protected]>
Suggested-by: Catalin Marinas <[email protected]>
Co-developed-by: xiongxin <[email protected]>
Signed-off-by: xiongxin <[email protected]>
Signed-off-by: Yaxiong Tian <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[[email protected]: rework commit message]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Apr 26, 2024
commit 50449ca upstream.

On arm64 machines, swsusp_save() faults if it attempts to access
MEMBLOCK_NOMAP memory ranges. This can be reproduced in QEMU using UEFI
when booting with rodata=off debug_pagealloc=off and CONFIG_KFENCE=n:

  Unable to handle kernel paging request at virtual address ffffff8000000000
  Mem abort info:
    ESR = 0x0000000096000007
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    FSC = 0x07: level 3 translation fault
  Data abort info:
    ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
    CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
  swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000eeb0b000
  [ffffff8000000000] pgd=180000217fff9803, p4d=180000217fff9803, pud=180000217fff9803, pmd=180000217fff8803, pte=0000000000000000
  Internal error: Oops: 0000000096000007 [#1] SMP
  Internal error: Oops: 0000000096000007 [#1] SMP
  Modules linked in: xt_multiport ipt_REJECT nf_reject_ipv4 xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter bpfilter rfkill at803x snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg dwmac_generic stmmac_platform snd_hda_codec stmmac joydev pcs_xpcs snd_hda_core phylink ppdev lp parport ramoops reed_solomon ip_tables x_tables nls_iso8859_1 vfat multipath linear amdgpu amdxcp drm_exec gpu_sched drm_buddy hid_generic usbhid hid radeon video drm_suballoc_helper drm_ttm_helper ttm i2c_algo_bit drm_display_helper cec drm_kms_helper drm
  CPU: 0 PID: 3663 Comm: systemd-sleep Not tainted 6.6.2+ torvalds#76
  Source Version: 4e22ed63a0a48e7a7cff9b98b7806d8d4add7dc0
  Hardware name: Greatwall GW-XXXXXX-XXX/GW-XXXXXX-XXX, BIOS KunLun BIOS V4.0 01/19/2021
  pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : swsusp_save+0x280/0x538
  lr : swsusp_save+0x280/0x538
  sp : ffffffa034a3fa40
  x29: ffffffa034a3fa40 x28: ffffff8000001000 x27: 0000000000000000
  x26: ffffff8001400000 x25: ffffffc08113e248 x24: 0000000000000000
  x23: 0000000000080000 x22: ffffffc08113e280 x21: 00000000000c69f2
  x20: ffffff8000000000 x19: ffffffc081ae2500 x18: 0000000000000000
  x17: 6666662074736420 x16: 3030303030303030 x15: 3038666666666666
  x14: 0000000000000b69 x13: ffffff9f89088530 x12: 00000000ffffffea
  x11: 00000000ffff7fff x10: 00000000ffff7fff x9 : ffffffc08193f0d0
  x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 0000000000000001
  x5 : ffffffa0fff09dc8 x4 : 0000000000000000 x3 : 0000000000000027
  x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000004e
  Call trace:
   swsusp_save+0x280/0x538
   swsusp_arch_suspend+0x148/0x190
   hibernation_snapshot+0x240/0x39c
   hibernate+0xc4/0x378
   state_store+0xf0/0x10c
   kobj_attr_store+0x14/0x24

The reason is swsusp_save() -> copy_data_pages() -> page_is_saveable()
-> kernel_page_present() assuming that a page is always present when
can_set_direct_map() is false (all of rodata_full,
debug_pagealloc_enabled() and arm64_kfence_can_set_direct_map() false),
irrespective of the MEMBLOCK_NOMAP ranges. Such MEMBLOCK_NOMAP regions
should not be saved during hibernation.

This problem was introduced by changes to the pfn_valid() logic in
commit a7d9f30 ("arm64: drop pfn_valid_within() and simplify
pfn_valid()").

Similar to other architectures, drop the !can_set_direct_map() check in
kernel_page_present() so that page_is_savable() skips such pages.

Fixes: a7d9f30 ("arm64: drop pfn_valid_within() and simplify pfn_valid()")
Cc: <[email protected]> # 5.14.x
Suggested-by: Mike Rapoport <[email protected]>
Suggested-by: Catalin Marinas <[email protected]>
Co-developed-by: xiongxin <[email protected]>
Signed-off-by: xiongxin <[email protected]>
Signed-off-by: Yaxiong Tian <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[[email protected]: rework commit message]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Apr 27, 2024
commit 50449ca upstream.

On arm64 machines, swsusp_save() faults if it attempts to access
MEMBLOCK_NOMAP memory ranges. This can be reproduced in QEMU using UEFI
when booting with rodata=off debug_pagealloc=off and CONFIG_KFENCE=n:

  Unable to handle kernel paging request at virtual address ffffff8000000000
  Mem abort info:
    ESR = 0x0000000096000007
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    FSC = 0x07: level 3 translation fault
  Data abort info:
    ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
    CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
  swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000eeb0b000
  [ffffff8000000000] pgd=180000217fff9803, p4d=180000217fff9803, pud=180000217fff9803, pmd=180000217fff8803, pte=0000000000000000
  Internal error: Oops: 0000000096000007 [#1] SMP
  Internal error: Oops: 0000000096000007 [#1] SMP
  Modules linked in: xt_multiport ipt_REJECT nf_reject_ipv4 xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter bpfilter rfkill at803x snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg dwmac_generic stmmac_platform snd_hda_codec stmmac joydev pcs_xpcs snd_hda_core phylink ppdev lp parport ramoops reed_solomon ip_tables x_tables nls_iso8859_1 vfat multipath linear amdgpu amdxcp drm_exec gpu_sched drm_buddy hid_generic usbhid hid radeon video drm_suballoc_helper drm_ttm_helper ttm i2c_algo_bit drm_display_helper cec drm_kms_helper drm
  CPU: 0 PID: 3663 Comm: systemd-sleep Not tainted 6.6.2+ torvalds#76
  Source Version: 4e22ed63a0a48e7a7cff9b98b7806d8d4add7dc0
  Hardware name: Greatwall GW-XXXXXX-XXX/GW-XXXXXX-XXX, BIOS KunLun BIOS V4.0 01/19/2021
  pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : swsusp_save+0x280/0x538
  lr : swsusp_save+0x280/0x538
  sp : ffffffa034a3fa40
  x29: ffffffa034a3fa40 x28: ffffff8000001000 x27: 0000000000000000
  x26: ffffff8001400000 x25: ffffffc08113e248 x24: 0000000000000000
  x23: 0000000000080000 x22: ffffffc08113e280 x21: 00000000000c69f2
  x20: ffffff8000000000 x19: ffffffc081ae2500 x18: 0000000000000000
  x17: 6666662074736420 x16: 3030303030303030 x15: 3038666666666666
  x14: 0000000000000b69 x13: ffffff9f89088530 x12: 00000000ffffffea
  x11: 00000000ffff7fff x10: 00000000ffff7fff x9 : ffffffc08193f0d0
  x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 0000000000000001
  x5 : ffffffa0fff09dc8 x4 : 0000000000000000 x3 : 0000000000000027
  x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000004e
  Call trace:
   swsusp_save+0x280/0x538
   swsusp_arch_suspend+0x148/0x190
   hibernation_snapshot+0x240/0x39c
   hibernate+0xc4/0x378
   state_store+0xf0/0x10c
   kobj_attr_store+0x14/0x24

The reason is swsusp_save() -> copy_data_pages() -> page_is_saveable()
-> kernel_page_present() assuming that a page is always present when
can_set_direct_map() is false (all of rodata_full,
debug_pagealloc_enabled() and arm64_kfence_can_set_direct_map() false),
irrespective of the MEMBLOCK_NOMAP ranges. Such MEMBLOCK_NOMAP regions
should not be saved during hibernation.

This problem was introduced by changes to the pfn_valid() logic in
commit a7d9f30 ("arm64: drop pfn_valid_within() and simplify
pfn_valid()").

Similar to other architectures, drop the !can_set_direct_map() check in
kernel_page_present() so that page_is_savable() skips such pages.

Fixes: a7d9f30 ("arm64: drop pfn_valid_within() and simplify pfn_valid()")
Cc: <[email protected]> # 5.14.x
Suggested-by: Mike Rapoport <[email protected]>
Suggested-by: Catalin Marinas <[email protected]>
Co-developed-by: xiongxin <[email protected]>
Signed-off-by: xiongxin <[email protected]>
Signed-off-by: Yaxiong Tian <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[[email protected]: rework commit message]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Apr 27, 2024
commit 50449ca upstream.

On arm64 machines, swsusp_save() faults if it attempts to access
MEMBLOCK_NOMAP memory ranges. This can be reproduced in QEMU using UEFI
when booting with rodata=off debug_pagealloc=off and CONFIG_KFENCE=n:

  Unable to handle kernel paging request at virtual address ffffff8000000000
  Mem abort info:
    ESR = 0x0000000096000007
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    FSC = 0x07: level 3 translation fault
  Data abort info:
    ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
    CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
  swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000eeb0b000
  [ffffff8000000000] pgd=180000217fff9803, p4d=180000217fff9803, pud=180000217fff9803, pmd=180000217fff8803, pte=0000000000000000
  Internal error: Oops: 0000000096000007 [#1] SMP
  Internal error: Oops: 0000000096000007 [#1] SMP
  Modules linked in: xt_multiport ipt_REJECT nf_reject_ipv4 xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter bpfilter rfkill at803x snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg dwmac_generic stmmac_platform snd_hda_codec stmmac joydev pcs_xpcs snd_hda_core phylink ppdev lp parport ramoops reed_solomon ip_tables x_tables nls_iso8859_1 vfat multipath linear amdgpu amdxcp drm_exec gpu_sched drm_buddy hid_generic usbhid hid radeon video drm_suballoc_helper drm_ttm_helper ttm i2c_algo_bit drm_display_helper cec drm_kms_helper drm
  CPU: 0 PID: 3663 Comm: systemd-sleep Not tainted 6.6.2+ torvalds#76
  Source Version: 4e22ed63a0a48e7a7cff9b98b7806d8d4add7dc0
  Hardware name: Greatwall GW-XXXXXX-XXX/GW-XXXXXX-XXX, BIOS KunLun BIOS V4.0 01/19/2021
  pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : swsusp_save+0x280/0x538
  lr : swsusp_save+0x280/0x538
  sp : ffffffa034a3fa40
  x29: ffffffa034a3fa40 x28: ffffff8000001000 x27: 0000000000000000
  x26: ffffff8001400000 x25: ffffffc08113e248 x24: 0000000000000000
  x23: 0000000000080000 x22: ffffffc08113e280 x21: 00000000000c69f2
  x20: ffffff8000000000 x19: ffffffc081ae2500 x18: 0000000000000000
  x17: 6666662074736420 x16: 3030303030303030 x15: 3038666666666666
  x14: 0000000000000b69 x13: ffffff9f89088530 x12: 00000000ffffffea
  x11: 00000000ffff7fff x10: 00000000ffff7fff x9 : ffffffc08193f0d0
  x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 0000000000000001
  x5 : ffffffa0fff09dc8 x4 : 0000000000000000 x3 : 0000000000000027
  x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000004e
  Call trace:
   swsusp_save+0x280/0x538
   swsusp_arch_suspend+0x148/0x190
   hibernation_snapshot+0x240/0x39c
   hibernate+0xc4/0x378
   state_store+0xf0/0x10c
   kobj_attr_store+0x14/0x24

The reason is swsusp_save() -> copy_data_pages() -> page_is_saveable()
-> kernel_page_present() assuming that a page is always present when
can_set_direct_map() is false (all of rodata_full,
debug_pagealloc_enabled() and arm64_kfence_can_set_direct_map() false),
irrespective of the MEMBLOCK_NOMAP ranges. Such MEMBLOCK_NOMAP regions
should not be saved during hibernation.

This problem was introduced by changes to the pfn_valid() logic in
commit a7d9f30 ("arm64: drop pfn_valid_within() and simplify
pfn_valid()").

Similar to other architectures, drop the !can_set_direct_map() check in
kernel_page_present() so that page_is_savable() skips such pages.

Fixes: a7d9f30 ("arm64: drop pfn_valid_within() and simplify pfn_valid()")
Cc: <[email protected]> # 5.14.x
Suggested-by: Mike Rapoport <[email protected]>
Suggested-by: Catalin Marinas <[email protected]>
Co-developed-by: xiongxin <[email protected]>
Signed-off-by: xiongxin <[email protected]>
Signed-off-by: Yaxiong Tian <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[[email protected]: rework commit message]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
1054009064 pushed a commit to 1054009064/linux that referenced this pull request Apr 27, 2024
commit 50449ca upstream.

On arm64 machines, swsusp_save() faults if it attempts to access
MEMBLOCK_NOMAP memory ranges. This can be reproduced in QEMU using UEFI
when booting with rodata=off debug_pagealloc=off and CONFIG_KFENCE=n:

  Unable to handle kernel paging request at virtual address ffffff8000000000
  Mem abort info:
    ESR = 0x0000000096000007
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    FSC = 0x07: level 3 translation fault
  Data abort info:
    ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
    CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
  swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000eeb0b000
  [ffffff8000000000] pgd=180000217fff9803, p4d=180000217fff9803, pud=180000217fff9803, pmd=180000217fff8803, pte=0000000000000000
  Internal error: Oops: 0000000096000007 [#1] SMP
  Internal error: Oops: 0000000096000007 [#1] SMP
  Modules linked in: xt_multiport ipt_REJECT nf_reject_ipv4 xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter bpfilter rfkill at803x snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg dwmac_generic stmmac_platform snd_hda_codec stmmac joydev pcs_xpcs snd_hda_core phylink ppdev lp parport ramoops reed_solomon ip_tables x_tables nls_iso8859_1 vfat multipath linear amdgpu amdxcp drm_exec gpu_sched drm_buddy hid_generic usbhid hid radeon video drm_suballoc_helper drm_ttm_helper ttm i2c_algo_bit drm_display_helper cec drm_kms_helper drm
  CPU: 0 PID: 3663 Comm: systemd-sleep Not tainted 6.6.2+ torvalds#76
  Source Version: 4e22ed63a0a48e7a7cff9b98b7806d8d4add7dc0
  Hardware name: Greatwall GW-XXXXXX-XXX/GW-XXXXXX-XXX, BIOS KunLun BIOS V4.0 01/19/2021
  pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : swsusp_save+0x280/0x538
  lr : swsusp_save+0x280/0x538
  sp : ffffffa034a3fa40
  x29: ffffffa034a3fa40 x28: ffffff8000001000 x27: 0000000000000000
  x26: ffffff8001400000 x25: ffffffc08113e248 x24: 0000000000000000
  x23: 0000000000080000 x22: ffffffc08113e280 x21: 00000000000c69f2
  x20: ffffff8000000000 x19: ffffffc081ae2500 x18: 0000000000000000
  x17: 6666662074736420 x16: 3030303030303030 x15: 3038666666666666
  x14: 0000000000000b69 x13: ffffff9f89088530 x12: 00000000ffffffea
  x11: 00000000ffff7fff x10: 00000000ffff7fff x9 : ffffffc08193f0d0
  x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 0000000000000001
  x5 : ffffffa0fff09dc8 x4 : 0000000000000000 x3 : 0000000000000027
  x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000004e
  Call trace:
   swsusp_save+0x280/0x538
   swsusp_arch_suspend+0x148/0x190
   hibernation_snapshot+0x240/0x39c
   hibernate+0xc4/0x378
   state_store+0xf0/0x10c
   kobj_attr_store+0x14/0x24

The reason is swsusp_save() -> copy_data_pages() -> page_is_saveable()
-> kernel_page_present() assuming that a page is always present when
can_set_direct_map() is false (all of rodata_full,
debug_pagealloc_enabled() and arm64_kfence_can_set_direct_map() false),
irrespective of the MEMBLOCK_NOMAP ranges. Such MEMBLOCK_NOMAP regions
should not be saved during hibernation.

This problem was introduced by changes to the pfn_valid() logic in
commit a7d9f30 ("arm64: drop pfn_valid_within() and simplify
pfn_valid()").

Similar to other architectures, drop the !can_set_direct_map() check in
kernel_page_present() so that page_is_savable() skips such pages.

Fixes: a7d9f30 ("arm64: drop pfn_valid_within() and simplify pfn_valid()")
Cc: <[email protected]> # 5.14.x
Suggested-by: Mike Rapoport <[email protected]>
Suggested-by: Catalin Marinas <[email protected]>
Co-developed-by: xiongxin <[email protected]>
Signed-off-by: xiongxin <[email protected]>
Signed-off-by: Yaxiong Tian <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[[email protected]: rework commit message]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
ptr1337 pushed a commit to CachyOS/linux that referenced this pull request Apr 27, 2024
commit 50449ca upstream.

On arm64 machines, swsusp_save() faults if it attempts to access
MEMBLOCK_NOMAP memory ranges. This can be reproduced in QEMU using UEFI
when booting with rodata=off debug_pagealloc=off and CONFIG_KFENCE=n:

  Unable to handle kernel paging request at virtual address ffffff8000000000
  Mem abort info:
    ESR = 0x0000000096000007
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    FSC = 0x07: level 3 translation fault
  Data abort info:
    ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
    CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
  swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000eeb0b000
  [ffffff8000000000] pgd=180000217fff9803, p4d=180000217fff9803, pud=180000217fff9803, pmd=180000217fff8803, pte=0000000000000000
  Internal error: Oops: 0000000096000007 [#1] SMP
  Internal error: Oops: 0000000096000007 [#1] SMP
  Modules linked in: xt_multiport ipt_REJECT nf_reject_ipv4 xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter bpfilter rfkill at803x snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg dwmac_generic stmmac_platform snd_hda_codec stmmac joydev pcs_xpcs snd_hda_core phylink ppdev lp parport ramoops reed_solomon ip_tables x_tables nls_iso8859_1 vfat multipath linear amdgpu amdxcp drm_exec gpu_sched drm_buddy hid_generic usbhid hid radeon video drm_suballoc_helper drm_ttm_helper ttm i2c_algo_bit drm_display_helper cec drm_kms_helper drm
  CPU: 0 PID: 3663 Comm: systemd-sleep Not tainted 6.6.2+ torvalds#76
  Source Version: 4e22ed63a0a48e7a7cff9b98b7806d8d4add7dc0
  Hardware name: Greatwall GW-XXXXXX-XXX/GW-XXXXXX-XXX, BIOS KunLun BIOS V4.0 01/19/2021
  pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : swsusp_save+0x280/0x538
  lr : swsusp_save+0x280/0x538
  sp : ffffffa034a3fa40
  x29: ffffffa034a3fa40 x28: ffffff8000001000 x27: 0000000000000000
  x26: ffffff8001400000 x25: ffffffc08113e248 x24: 0000000000000000
  x23: 0000000000080000 x22: ffffffc08113e280 x21: 00000000000c69f2
  x20: ffffff8000000000 x19: ffffffc081ae2500 x18: 0000000000000000
  x17: 6666662074736420 x16: 3030303030303030 x15: 3038666666666666
  x14: 0000000000000b69 x13: ffffff9f89088530 x12: 00000000ffffffea
  x11: 00000000ffff7fff x10: 00000000ffff7fff x9 : ffffffc08193f0d0
  x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 0000000000000001
  x5 : ffffffa0fff09dc8 x4 : 0000000000000000 x3 : 0000000000000027
  x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000004e
  Call trace:
   swsusp_save+0x280/0x538
   swsusp_arch_suspend+0x148/0x190
   hibernation_snapshot+0x240/0x39c
   hibernate+0xc4/0x378
   state_store+0xf0/0x10c
   kobj_attr_store+0x14/0x24

The reason is swsusp_save() -> copy_data_pages() -> page_is_saveable()
-> kernel_page_present() assuming that a page is always present when
can_set_direct_map() is false (all of rodata_full,
debug_pagealloc_enabled() and arm64_kfence_can_set_direct_map() false),
irrespective of the MEMBLOCK_NOMAP ranges. Such MEMBLOCK_NOMAP regions
should not be saved during hibernation.

This problem was introduced by changes to the pfn_valid() logic in
commit a7d9f30 ("arm64: drop pfn_valid_within() and simplify
pfn_valid()").

Similar to other architectures, drop the !can_set_direct_map() check in
kernel_page_present() so that page_is_savable() skips such pages.

Fixes: a7d9f30 ("arm64: drop pfn_valid_within() and simplify pfn_valid()")
Cc: <[email protected]> # 5.14.x
Suggested-by: Mike Rapoport <[email protected]>
Suggested-by: Catalin Marinas <[email protected]>
Co-developed-by: xiongxin <[email protected]>
Signed-off-by: xiongxin <[email protected]>
Signed-off-by: Yaxiong Tian <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[[email protected]: rework commit message]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
john-cabaj pushed a commit to UbuntuAsahi/linux that referenced this pull request Jun 17, 2024
BugLink: https://bugs.launchpad.net/bugs/2068087

commit 50449ca upstream.

On arm64 machines, swsusp_save() faults if it attempts to access
MEMBLOCK_NOMAP memory ranges. This can be reproduced in QEMU using UEFI
when booting with rodata=off debug_pagealloc=off and CONFIG_KFENCE=n:

  Unable to handle kernel paging request at virtual address ffffff8000000000
  Mem abort info:
    ESR = 0x0000000096000007
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    FSC = 0x07: level 3 translation fault
  Data abort info:
    ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
    CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
  swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000eeb0b000
  [ffffff8000000000] pgd=180000217fff9803, p4d=180000217fff9803, pud=180000217fff9803, pmd=180000217fff8803, pte=0000000000000000
  Internal error: Oops: 0000000096000007 [#1] SMP
  Internal error: Oops: 0000000096000007 [#1] SMP
  Modules linked in: xt_multiport ipt_REJECT nf_reject_ipv4 xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter bpfilter rfkill at803x snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg dwmac_generic stmmac_platform snd_hda_codec stmmac joydev pcs_xpcs snd_hda_core phylink ppdev lp parport ramoops reed_solomon ip_tables x_tables nls_iso8859_1 vfat multipath linear amdgpu amdxcp drm_exec gpu_sched drm_buddy hid_generic usbhid hid radeon video drm_suballoc_helper drm_ttm_helper ttm i2c_algo_bit drm_display_helper cec drm_kms_helper drm
  CPU: 0 PID: 3663 Comm: systemd-sleep Not tainted 6.6.2+ torvalds#76
  Source Version: 4e22ed63a0a48e7a7cff9b98b7806d8d4add7dc0
  Hardware name: Greatwall GW-XXXXXX-XXX/GW-XXXXXX-XXX, BIOS KunLun BIOS V4.0 01/19/2021
  pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : swsusp_save+0x280/0x538
  lr : swsusp_save+0x280/0x538
  sp : ffffffa034a3fa40
  x29: ffffffa034a3fa40 x28: ffffff8000001000 x27: 0000000000000000
  x26: ffffff8001400000 x25: ffffffc08113e248 x24: 0000000000000000
  x23: 0000000000080000 x22: ffffffc08113e280 x21: 00000000000c69f2
  x20: ffffff8000000000 x19: ffffffc081ae2500 x18: 0000000000000000
  x17: 6666662074736420 x16: 3030303030303030 x15: 3038666666666666
  x14: 0000000000000b69 x13: ffffff9f89088530 x12: 00000000ffffffea
  x11: 00000000ffff7fff x10: 00000000ffff7fff x9 : ffffffc08193f0d0
  x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 0000000000000001
  x5 : ffffffa0fff09dc8 x4 : 0000000000000000 x3 : 0000000000000027
  x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000004e
  Call trace:
   swsusp_save+0x280/0x538
   swsusp_arch_suspend+0x148/0x190
   hibernation_snapshot+0x240/0x39c
   hibernate+0xc4/0x378
   state_store+0xf0/0x10c
   kobj_attr_store+0x14/0x24

The reason is swsusp_save() -> copy_data_pages() -> page_is_saveable()
-> kernel_page_present() assuming that a page is always present when
can_set_direct_map() is false (all of rodata_full,
debug_pagealloc_enabled() and arm64_kfence_can_set_direct_map() false),
irrespective of the MEMBLOCK_NOMAP ranges. Such MEMBLOCK_NOMAP regions
should not be saved during hibernation.

This problem was introduced by changes to the pfn_valid() logic in
commit a7d9f30 ("arm64: drop pfn_valid_within() and simplify
pfn_valid()").

Similar to other architectures, drop the !can_set_direct_map() check in
kernel_page_present() so that page_is_savable() skips such pages.

Fixes: a7d9f30 ("arm64: drop pfn_valid_within() and simplify pfn_valid()")
Cc: <[email protected]> # 5.14.x
Suggested-by: Mike Rapoport <[email protected]>
Suggested-by: Catalin Marinas <[email protected]>
Co-developed-by: xiongxin <[email protected]>
Signed-off-by: xiongxin <[email protected]>
Signed-off-by: Yaxiong Tian <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[[email protected]: rework commit message]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Portia Stephens <[email protected]>
Signed-off-by: Stefan Bader <[email protected]>
adam900710 added a commit to adam900710/linux that referenced this pull request Aug 17, 2024
btrfs_submit_chunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

 ==================================================================
 BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
 Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
 CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x61/0x80
  print_address_description.constprop.0+0x5e/0x2f0
  print_report+0x118/0x216
  kasan_report+0x11d/0x1f0
  btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  process_one_work+0xce0/0x12a0
  worker_thread+0x717/0x1250
  kthread+0x2e3/0x3c0
  ret_from_fork+0x2d/0x70
  ret_from_fork_asm+0x11/0x20
  </TASK>

 Allocated by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  __kasan_slab_alloc+0x7d/0x80
  kmem_cache_alloc_noprof+0x16e/0x3e0
  mempool_alloc_noprof+0x12e/0x310
  bio_alloc_bioset+0x3f0/0x7a0
  btrfs_bio_alloc+0x2e/0x50 [btrfs]
  submit_extent_page+0x4d1/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 Freed by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x37/0x50
  __kasan_slab_free+0x4b/0x60
  kmem_cache_free+0x214/0x5d0
  bio_free+0xed/0x180
  end_bbio_data_read+0x1cc/0x580 [btrfs]
  btrfs_submit_chunk+0x98d/0x1880 [btrfs]
  btrfs_submit_bio+0x33/0x70 [btrfs]
  submit_one_bio+0xd4/0x130 [btrfs]
  submit_extent_page+0x3ea/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I can not reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.
And the original end io function freed the whole bio.

This means a double freeing thus causing use-after-free, e.g:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Reported-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
Signed-off-by: Qu Wenruo <[email protected]>
adam900710 added a commit to adam900710/linux that referenced this pull request Aug 17, 2024
…mit_chunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

 ==================================================================
 BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
 Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
 CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x61/0x80
  print_address_description.constprop.0+0x5e/0x2f0
  print_report+0x118/0x216
  kasan_report+0x11d/0x1f0
  btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  process_one_work+0xce0/0x12a0
  worker_thread+0x717/0x1250
  kthread+0x2e3/0x3c0
  ret_from_fork+0x2d/0x70
  ret_from_fork_asm+0x11/0x20
  </TASK>

 Allocated by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  __kasan_slab_alloc+0x7d/0x80
  kmem_cache_alloc_noprof+0x16e/0x3e0
  mempool_alloc_noprof+0x12e/0x310
  bio_alloc_bioset+0x3f0/0x7a0
  btrfs_bio_alloc+0x2e/0x50 [btrfs]
  submit_extent_page+0x4d1/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 Freed by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x37/0x50
  __kasan_slab_free+0x4b/0x60
  kmem_cache_free+0x214/0x5d0
  bio_free+0xed/0x180
  end_bbio_data_read+0x1cc/0x580 [btrfs]
  btrfs_submit_chunk+0x98d/0x1880 [btrfs]
  btrfs_submit_bio+0x33/0x70 [btrfs]
  submit_one_bio+0xd4/0x130 [btrfs]
  submit_extent_page+0x3ea/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I can not reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.
And the original end io function freed the whole bio.

This means a double freeing thus causing use-after-free, e.g:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Reported-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
Signed-off-by: Qu Wenruo <[email protected]>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Aug 17, 2024
…it_chunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

 ==================================================================
 BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
 Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
 CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x61/0x80
  print_address_description.constprop.0+0x5e/0x2f0
  print_report+0x118/0x216
  kasan_report+0x11d/0x1f0
  btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  process_one_work+0xce0/0x12a0
  worker_thread+0x717/0x1250
  kthread+0x2e3/0x3c0
  ret_from_fork+0x2d/0x70
  ret_from_fork_asm+0x11/0x20
  </TASK>

 Allocated by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  __kasan_slab_alloc+0x7d/0x80
  kmem_cache_alloc_noprof+0x16e/0x3e0
  mempool_alloc_noprof+0x12e/0x310
  bio_alloc_bioset+0x3f0/0x7a0
  btrfs_bio_alloc+0x2e/0x50 [btrfs]
  submit_extent_page+0x4d1/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 Freed by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x37/0x50
  __kasan_slab_free+0x4b/0x60
  kmem_cache_free+0x214/0x5d0
  bio_free+0xed/0x180
  end_bbio_data_read+0x1cc/0x580 [btrfs]
  btrfs_submit_chunk+0x98d/0x1880 [btrfs]
  btrfs_submit_bio+0x33/0x70 [btrfs]
  submit_one_bio+0xd4/0x130 [btrfs]
  submit_extent_page+0x3ea/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I can not reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.
And the original endio function called bio_put() to free the whole bio.

This means a double freeing thus causing use-after-free, e.g:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Reported-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
Signed-off-by: Qu Wenruo <[email protected]>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Aug 18, 2024
…it_chunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

 ==================================================================
 BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
 Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
 CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x61/0x80
  print_address_description.constprop.0+0x5e/0x2f0
  print_report+0x118/0x216
  kasan_report+0x11d/0x1f0
  btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  process_one_work+0xce0/0x12a0
  worker_thread+0x717/0x1250
  kthread+0x2e3/0x3c0
  ret_from_fork+0x2d/0x70
  ret_from_fork_asm+0x11/0x20
  </TASK>

 Allocated by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  __kasan_slab_alloc+0x7d/0x80
  kmem_cache_alloc_noprof+0x16e/0x3e0
  mempool_alloc_noprof+0x12e/0x310
  bio_alloc_bioset+0x3f0/0x7a0
  btrfs_bio_alloc+0x2e/0x50 [btrfs]
  submit_extent_page+0x4d1/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 Freed by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x37/0x50
  __kasan_slab_free+0x4b/0x60
  kmem_cache_free+0x214/0x5d0
  bio_free+0xed/0x180
  end_bbio_data_read+0x1cc/0x580 [btrfs]
  btrfs_submit_chunk+0x98d/0x1880 [btrfs]
  btrfs_submit_bio+0x33/0x70 [btrfs]
  submit_one_bio+0xd4/0x130 [btrfs]
  submit_extent_page+0x3ea/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I can not reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.
And the original endio function called bio_put() to free the whole bio.

This means a double freeing thus causing use-after-free, e.g:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Reported-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
Signed-off-by: Qu Wenruo <[email protected]>
adam900710 added a commit to adam900710/linux that referenced this pull request Aug 19, 2024
…it_chunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

 ==================================================================
 BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
 Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
 CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x61/0x80
  print_address_description.constprop.0+0x5e/0x2f0
  print_report+0x118/0x216
  kasan_report+0x11d/0x1f0
  btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  process_one_work+0xce0/0x12a0
  worker_thread+0x717/0x1250
  kthread+0x2e3/0x3c0
  ret_from_fork+0x2d/0x70
  ret_from_fork_asm+0x11/0x20
  </TASK>

 Allocated by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  __kasan_slab_alloc+0x7d/0x80
  kmem_cache_alloc_noprof+0x16e/0x3e0
  mempool_alloc_noprof+0x12e/0x310
  bio_alloc_bioset+0x3f0/0x7a0
  btrfs_bio_alloc+0x2e/0x50 [btrfs]
  submit_extent_page+0x4d1/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 Freed by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x37/0x50
  __kasan_slab_free+0x4b/0x60
  kmem_cache_free+0x214/0x5d0
  bio_free+0xed/0x180
  end_bbio_data_read+0x1cc/0x580 [btrfs]
  btrfs_submit_chunk+0x98d/0x1880 [btrfs]
  btrfs_submit_bio+0x33/0x70 [btrfs]
  submit_one_bio+0xd4/0x130 [btrfs]
  submit_extent_page+0x3ea/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I can not reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.
And the original endio function called bio_put() to free the whole bio.

This means a double freeing thus causing use-after-free, e.g:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Reported-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
Signed-off-by: Qu Wenruo <[email protected]>
---
Changelog:
v2:
- Remove the unused variable  @orig_bbio
adam900710 added a commit to adam900710/linux that referenced this pull request Aug 20, 2024
…it_chunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

 ==================================================================
 BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
 Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
 CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x61/0x80
  print_address_description.constprop.0+0x5e/0x2f0
  print_report+0x118/0x216
  kasan_report+0x11d/0x1f0
  btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  process_one_work+0xce0/0x12a0
  worker_thread+0x717/0x1250
  kthread+0x2e3/0x3c0
  ret_from_fork+0x2d/0x70
  ret_from_fork_asm+0x11/0x20
  </TASK>

 Allocated by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  __kasan_slab_alloc+0x7d/0x80
  kmem_cache_alloc_noprof+0x16e/0x3e0
  mempool_alloc_noprof+0x12e/0x310
  bio_alloc_bioset+0x3f0/0x7a0
  btrfs_bio_alloc+0x2e/0x50 [btrfs]
  submit_extent_page+0x4d1/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 Freed by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x37/0x50
  __kasan_slab_free+0x4b/0x60
  kmem_cache_free+0x214/0x5d0
  bio_free+0xed/0x180
  end_bbio_data_read+0x1cc/0x580 [btrfs]
  btrfs_submit_chunk+0x98d/0x1880 [btrfs]
  btrfs_submit_bio+0x33/0x70 [btrfs]
  submit_one_bio+0xd4/0x130 [btrfs]
  submit_extent_page+0x3ea/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I can not reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.
And the original endio function called bio_put() to free the whole bio.

This means a double freeing thus causing use-after-free, e.g:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Reported-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
Signed-off-by: Qu Wenruo <[email protected]>
---
Changelog:
v2:
- Remove the unused variable  @orig_bbio
adam900710 added a commit to adam900710/linux that referenced this pull request Aug 20, 2024
…it_chunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

 ==================================================================
 BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
 Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
 CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x61/0x80
  print_address_description.constprop.0+0x5e/0x2f0
  print_report+0x118/0x216
  kasan_report+0x11d/0x1f0
  btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  process_one_work+0xce0/0x12a0
  worker_thread+0x717/0x1250
  kthread+0x2e3/0x3c0
  ret_from_fork+0x2d/0x70
  ret_from_fork_asm+0x11/0x20
  </TASK>

 Allocated by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  __kasan_slab_alloc+0x7d/0x80
  kmem_cache_alloc_noprof+0x16e/0x3e0
  mempool_alloc_noprof+0x12e/0x310
  bio_alloc_bioset+0x3f0/0x7a0
  btrfs_bio_alloc+0x2e/0x50 [btrfs]
  submit_extent_page+0x4d1/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 Freed by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x37/0x50
  __kasan_slab_free+0x4b/0x60
  kmem_cache_free+0x214/0x5d0
  bio_free+0xed/0x180
  end_bbio_data_read+0x1cc/0x580 [btrfs]
  btrfs_submit_chunk+0x98d/0x1880 [btrfs]
  btrfs_submit_bio+0x33/0x70 [btrfs]
  submit_one_bio+0xd4/0x130 [btrfs]
  submit_extent_page+0x3ea/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I can not reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.
And the original endio function called bio_put() to free the whole bio.

This means a double freeing thus causing use-after-free, e.g:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
tag completely.

Reported-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
Signed-off-by: Qu Wenruo <[email protected]>
---
Changelog:
v2:
- Remove the unused variable  @orig_bbio

v3:
- Fix a double free if we go into fail_put_bio() tag
  By removing the tag and the btrfs_cleanup_bio() call completely.
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Aug 20, 2024
…it_chunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

 ==================================================================
 BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
 Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
 CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x61/0x80
  print_address_description.constprop.0+0x5e/0x2f0
  print_report+0x118/0x216
  kasan_report+0x11d/0x1f0
  btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  process_one_work+0xce0/0x12a0
  worker_thread+0x717/0x1250
  kthread+0x2e3/0x3c0
  ret_from_fork+0x2d/0x70
  ret_from_fork_asm+0x11/0x20
  </TASK>

 Allocated by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  __kasan_slab_alloc+0x7d/0x80
  kmem_cache_alloc_noprof+0x16e/0x3e0
  mempool_alloc_noprof+0x12e/0x310
  bio_alloc_bioset+0x3f0/0x7a0
  btrfs_bio_alloc+0x2e/0x50 [btrfs]
  submit_extent_page+0x4d1/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 Freed by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x37/0x50
  __kasan_slab_free+0x4b/0x60
  kmem_cache_free+0x214/0x5d0
  bio_free+0xed/0x180
  end_bbio_data_read+0x1cc/0x580 [btrfs]
  btrfs_submit_chunk+0x98d/0x1880 [btrfs]
  btrfs_submit_bio+0x33/0x70 [btrfs]
  submit_one_bio+0xd4/0x130 [btrfs]
  submit_extent_page+0x3ea/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I can not reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.
And the original endio function called bio_put() to free the whole bio.

This means a double freeing thus causing use-after-free, e.g:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
tag completely.

Reported-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
Signed-off-by: Qu Wenruo <[email protected]>
adam900710 added a commit to adam900710/linux that referenced this pull request Aug 24, 2024
…it_chunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

 ==================================================================
 BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
 Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
 CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x61/0x80
  print_address_description.constprop.0+0x5e/0x2f0
  print_report+0x118/0x216
  kasan_report+0x11d/0x1f0
  btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  process_one_work+0xce0/0x12a0
  worker_thread+0x717/0x1250
  kthread+0x2e3/0x3c0
  ret_from_fork+0x2d/0x70
  ret_from_fork_asm+0x11/0x20
  </TASK>

 Allocated by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  __kasan_slab_alloc+0x7d/0x80
  kmem_cache_alloc_noprof+0x16e/0x3e0
  mempool_alloc_noprof+0x12e/0x310
  bio_alloc_bioset+0x3f0/0x7a0
  btrfs_bio_alloc+0x2e/0x50 [btrfs]
  submit_extent_page+0x4d1/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 Freed by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x37/0x50
  __kasan_slab_free+0x4b/0x60
  kmem_cache_free+0x214/0x5d0
  bio_free+0xed/0x180
  end_bbio_data_read+0x1cc/0x580 [btrfs]
  btrfs_submit_chunk+0x98d/0x1880 [btrfs]
  btrfs_submit_bio+0x33/0x70 [btrfs]
  submit_one_bio+0xd4/0x130 [btrfs]
  submit_extent_page+0x3ea/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I can not reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.
And the original endio function called bio_put() to free the whole bio.

This means a double freeing thus causing use-after-free, e.g:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
tag completely.

Reported-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
Signed-off-by: Qu Wenruo <[email protected]>
---
Changelog:
v4:
- Fix a case where the endio function is never called
  If we have split the bbio and hit a critical error, we have to end
  both the current and the remaining bbio.
  As the remaining one will never be submitted, thus pending_ios will
  never reach 0.

v3:
- Fix a double free if we go into fail_put_bio() tag
  By removing the tag and the btrfs_cleanup_bio() call completely.

v2:
- Remove the unused variable  @orig_bbio
adam900710 added a commit to adam900710/linux that referenced this pull request Aug 24, 2024
…it_chunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

 ==================================================================
 BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
 Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
 CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x61/0x80
  print_address_description.constprop.0+0x5e/0x2f0
  print_report+0x118/0x216
  kasan_report+0x11d/0x1f0
  btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  process_one_work+0xce0/0x12a0
  worker_thread+0x717/0x1250
  kthread+0x2e3/0x3c0
  ret_from_fork+0x2d/0x70
  ret_from_fork_asm+0x11/0x20
  </TASK>

 Allocated by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  __kasan_slab_alloc+0x7d/0x80
  kmem_cache_alloc_noprof+0x16e/0x3e0
  mempool_alloc_noprof+0x12e/0x310
  bio_alloc_bioset+0x3f0/0x7a0
  btrfs_bio_alloc+0x2e/0x50 [btrfs]
  submit_extent_page+0x4d1/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 Freed by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x37/0x50
  __kasan_slab_free+0x4b/0x60
  kmem_cache_free+0x214/0x5d0
  bio_free+0xed/0x180
  end_bbio_data_read+0x1cc/0x580 [btrfs]
  btrfs_submit_chunk+0x98d/0x1880 [btrfs]
  btrfs_submit_bio+0x33/0x70 [btrfs]
  submit_one_bio+0xd4/0x130 [btrfs]
  submit_extent_page+0x3ea/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I can not reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.
And the original endio function called bio_put() to free the whole bio.

This means a double freeing thus causing use-after-free, e.g:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
tag completely.

Reported-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
Signed-off-by: Qu Wenruo <[email protected]>
---
Changelog:
v4:
- Fix a case where the endio function is never called
  If we have split the bbio and hit a critical error, we have to end
  both the current and the remaining bbio.
  As the remaining one will never be submitted, thus pending_ios will
  never reach 0.

v3:
- Fix a double free if we go into fail_put_bio() tag
  By removing the tag and the btrfs_cleanup_bio() call completely.

v2:
- Remove the unused variable  @orig_bbio
adam900710 added a commit to adam900710/linux that referenced this pull request Aug 25, 2024
…it_chunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

 ==================================================================
 BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
 Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
 CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x61/0x80
  print_address_description.constprop.0+0x5e/0x2f0
  print_report+0x118/0x216
  kasan_report+0x11d/0x1f0
  btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  process_one_work+0xce0/0x12a0
  worker_thread+0x717/0x1250
  kthread+0x2e3/0x3c0
  ret_from_fork+0x2d/0x70
  ret_from_fork_asm+0x11/0x20
  </TASK>

 Allocated by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  __kasan_slab_alloc+0x7d/0x80
  kmem_cache_alloc_noprof+0x16e/0x3e0
  mempool_alloc_noprof+0x12e/0x310
  bio_alloc_bioset+0x3f0/0x7a0
  btrfs_bio_alloc+0x2e/0x50 [btrfs]
  submit_extent_page+0x4d1/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 Freed by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x37/0x50
  __kasan_slab_free+0x4b/0x60
  kmem_cache_free+0x214/0x5d0
  bio_free+0xed/0x180
  end_bbio_data_read+0x1cc/0x580 [btrfs]
  btrfs_submit_chunk+0x98d/0x1880 [btrfs]
  btrfs_submit_bio+0x33/0x70 [btrfs]
  submit_one_bio+0xd4/0x130 [btrfs]
  submit_extent_page+0x3ea/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I can not reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.
And the original endio function called bio_put() to free the whole bio.

This means a double freeing thus causing use-after-free, e.g:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
tag completely.

Reported-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
Signed-off-by: Qu Wenruo <[email protected]>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Aug 26, 2024
…it_chunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

 ==================================================================
 BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
 Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
 CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x61/0x80
  print_address_description.constprop.0+0x5e/0x2f0
  print_report+0x118/0x216
  kasan_report+0x11d/0x1f0
  btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  process_one_work+0xce0/0x12a0
  worker_thread+0x717/0x1250
  kthread+0x2e3/0x3c0
  ret_from_fork+0x2d/0x70
  ret_from_fork_asm+0x11/0x20
  </TASK>

 Allocated by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  __kasan_slab_alloc+0x7d/0x80
  kmem_cache_alloc_noprof+0x16e/0x3e0
  mempool_alloc_noprof+0x12e/0x310
  bio_alloc_bioset+0x3f0/0x7a0
  btrfs_bio_alloc+0x2e/0x50 [btrfs]
  submit_extent_page+0x4d1/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 Freed by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x37/0x50
  __kasan_slab_free+0x4b/0x60
  kmem_cache_free+0x214/0x5d0
  bio_free+0xed/0x180
  end_bbio_data_read+0x1cc/0x580 [btrfs]
  btrfs_submit_chunk+0x98d/0x1880 [btrfs]
  btrfs_submit_bio+0x33/0x70 [btrfs]
  submit_one_bio+0xd4/0x130 [btrfs]
  submit_extent_page+0x3ea/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I can not reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.
And the original endio function called bio_put() to free the whole bio.

This means a double freeing thus causing use-after-free, e.g:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
tag completely.

Reported-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
Signed-off-by: Qu Wenruo <[email protected]>
kdave pushed a commit to kdave/btrfs-devel that referenced this pull request Aug 26, 2024
…hunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

  BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
  CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
  Call Trace:
   dump_stack_lvl+0x61/0x80
   print_address_description.constprop.0+0x5e/0x2f0
   print_report+0x118/0x216
   kasan_report+0x11d/0x1f0
   btrfs_check_read_bio+0xa68/0xb70 [btrfs]
   process_one_work+0xce0/0x12a0
   worker_thread+0x717/0x1250
   kthread+0x2e3/0x3c0
   ret_from_fork+0x2d/0x70
   ret_from_fork_asm+0x11/0x20

  Allocated by task 20917:
   kasan_save_stack+0x37/0x60
   kasan_save_track+0x10/0x30
   __kasan_slab_alloc+0x7d/0x80
   kmem_cache_alloc_noprof+0x16e/0x3e0
   mempool_alloc_noprof+0x12e/0x310
   bio_alloc_bioset+0x3f0/0x7a0
   btrfs_bio_alloc+0x2e/0x50 [btrfs]
   submit_extent_page+0x4d1/0xdb0 [btrfs]
   btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
   btrfs_readahead+0x29a/0x430 [btrfs]
   read_pages+0x1a7/0xc60
   page_cache_ra_unbounded+0x2ad/0x560
   filemap_get_pages+0x629/0xa20
   filemap_read+0x335/0xbf0
   vfs_read+0x790/0xcb0
   ksys_read+0xfd/0x1d0
   do_syscall_64+0x6d/0x140
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

  Freed by task 20917:
   kasan_save_stack+0x37/0x60
   kasan_save_track+0x10/0x30
   kasan_save_free_info+0x37/0x50
   __kasan_slab_free+0x4b/0x60
   kmem_cache_free+0x214/0x5d0
   bio_free+0xed/0x180
   end_bbio_data_read+0x1cc/0x580 [btrfs]
   btrfs_submit_chunk+0x98d/0x1880 [btrfs]
   btrfs_submit_bio+0x33/0x70 [btrfs]
   submit_one_bio+0xd4/0x130 [btrfs]
   submit_extent_page+0x3ea/0xdb0 [btrfs]
   btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
   btrfs_readahead+0x29a/0x430 [btrfs]
   read_pages+0x1a7/0xc60
   page_cache_ra_unbounded+0x2ad/0x560
   filemap_get_pages+0x629/0xa20
   filemap_read+0x335/0xbf0
   vfs_read+0x790/0xcb0
   ksys_read+0xfd/0x1d0
   do_syscall_64+0x6d/0x140
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I cannot reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.  And the original endio function called bio_put() to free the whole
bio.

This means a double freeing thus causing use-after-free, e.g.:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
label completely.

Reported-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
CC: [email protected] # 6.6+
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
kdave pushed a commit to kdave/btrfs-devel that referenced this pull request Aug 26, 2024
…it_chunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

 ==================================================================
 BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
 Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
 CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x61/0x80
  print_address_description.constprop.0+0x5e/0x2f0
  print_report+0x118/0x216
  kasan_report+0x11d/0x1f0
  btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  process_one_work+0xce0/0x12a0
  worker_thread+0x717/0x1250
  kthread+0x2e3/0x3c0
  ret_from_fork+0x2d/0x70
  ret_from_fork_asm+0x11/0x20
  </TASK>

 Allocated by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  __kasan_slab_alloc+0x7d/0x80
  kmem_cache_alloc_noprof+0x16e/0x3e0
  mempool_alloc_noprof+0x12e/0x310
  bio_alloc_bioset+0x3f0/0x7a0
  btrfs_bio_alloc+0x2e/0x50 [btrfs]
  submit_extent_page+0x4d1/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 Freed by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x37/0x50
  __kasan_slab_free+0x4b/0x60
  kmem_cache_free+0x214/0x5d0
  bio_free+0xed/0x180
  end_bbio_data_read+0x1cc/0x580 [btrfs]
  btrfs_submit_chunk+0x98d/0x1880 [btrfs]
  btrfs_submit_bio+0x33/0x70 [btrfs]
  submit_one_bio+0xd4/0x130 [btrfs]
  submit_extent_page+0x3ea/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I can not reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.
And the original endio function called bio_put() to free the whole bio.

This means a double freeing thus causing use-after-free, e.g:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
tag completely.

Reported-by: David Sterba <[email protected]>
Reviewed-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
Signed-off-by: Qu Wenruo <[email protected]>
adam900710 added a commit to adam900710/linux that referenced this pull request Aug 27, 2024
…it_chunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

 ==================================================================
 BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
 Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
 CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x61/0x80
  print_address_description.constprop.0+0x5e/0x2f0
  print_report+0x118/0x216
  kasan_report+0x11d/0x1f0
  btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  process_one_work+0xce0/0x12a0
  worker_thread+0x717/0x1250
  kthread+0x2e3/0x3c0
  ret_from_fork+0x2d/0x70
  ret_from_fork_asm+0x11/0x20
  </TASK>

 Allocated by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  __kasan_slab_alloc+0x7d/0x80
  kmem_cache_alloc_noprof+0x16e/0x3e0
  mempool_alloc_noprof+0x12e/0x310
  bio_alloc_bioset+0x3f0/0x7a0
  btrfs_bio_alloc+0x2e/0x50 [btrfs]
  submit_extent_page+0x4d1/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 Freed by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x37/0x50
  __kasan_slab_free+0x4b/0x60
  kmem_cache_free+0x214/0x5d0
  bio_free+0xed/0x180
  end_bbio_data_read+0x1cc/0x580 [btrfs]
  btrfs_submit_chunk+0x98d/0x1880 [btrfs]
  btrfs_submit_bio+0x33/0x70 [btrfs]
  submit_one_bio+0xd4/0x130 [btrfs]
  submit_extent_page+0x3ea/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I can not reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.
And the original endio function called bio_put() to free the whole bio.

This means a double freeing thus causing use-after-free, e.g:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
tag completely.

Reported-by: David Sterba <[email protected]>
Reviewed-by: Josef Bacik <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
Signed-off-by: Qu Wenruo <[email protected]>
kdave pushed a commit to kdave/btrfs-devel that referenced this pull request Aug 27, 2024
…it_chunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

 ==================================================================
 BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
 Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
 CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x61/0x80
  print_address_description.constprop.0+0x5e/0x2f0
  print_report+0x118/0x216
  kasan_report+0x11d/0x1f0
  btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  process_one_work+0xce0/0x12a0
  worker_thread+0x717/0x1250
  kthread+0x2e3/0x3c0
  ret_from_fork+0x2d/0x70
  ret_from_fork_asm+0x11/0x20
  </TASK>

 Allocated by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  __kasan_slab_alloc+0x7d/0x80
  kmem_cache_alloc_noprof+0x16e/0x3e0
  mempool_alloc_noprof+0x12e/0x310
  bio_alloc_bioset+0x3f0/0x7a0
  btrfs_bio_alloc+0x2e/0x50 [btrfs]
  submit_extent_page+0x4d1/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 Freed by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x37/0x50
  __kasan_slab_free+0x4b/0x60
  kmem_cache_free+0x214/0x5d0
  bio_free+0xed/0x180
  end_bbio_data_read+0x1cc/0x580 [btrfs]
  btrfs_submit_chunk+0x98d/0x1880 [btrfs]
  btrfs_submit_bio+0x33/0x70 [btrfs]
  submit_one_bio+0xd4/0x130 [btrfs]
  submit_extent_page+0x3ea/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I can not reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.
And the original endio function called bio_put() to free the whole bio.

This means a double freeing thus causing use-after-free, e.g:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
tag completely.

Reported-by: David Sterba <[email protected]>
Reviewed-by: Josef Bacik <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
kdave pushed a commit to kdave/btrfs-devel that referenced this pull request Aug 29, 2024
…it_chunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

 ==================================================================
 BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
 Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
 CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x61/0x80
  print_address_description.constprop.0+0x5e/0x2f0
  print_report+0x118/0x216
  kasan_report+0x11d/0x1f0
  btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  process_one_work+0xce0/0x12a0
  worker_thread+0x717/0x1250
  kthread+0x2e3/0x3c0
  ret_from_fork+0x2d/0x70
  ret_from_fork_asm+0x11/0x20
  </TASK>

 Allocated by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  __kasan_slab_alloc+0x7d/0x80
  kmem_cache_alloc_noprof+0x16e/0x3e0
  mempool_alloc_noprof+0x12e/0x310
  bio_alloc_bioset+0x3f0/0x7a0
  btrfs_bio_alloc+0x2e/0x50 [btrfs]
  submit_extent_page+0x4d1/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 Freed by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x37/0x50
  __kasan_slab_free+0x4b/0x60
  kmem_cache_free+0x214/0x5d0
  bio_free+0xed/0x180
  end_bbio_data_read+0x1cc/0x580 [btrfs]
  btrfs_submit_chunk+0x98d/0x1880 [btrfs]
  btrfs_submit_bio+0x33/0x70 [btrfs]
  submit_one_bio+0xd4/0x130 [btrfs]
  submit_extent_page+0x3ea/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I can not reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.
And the original endio function called bio_put() to free the whole bio.

This means a double freeing thus causing use-after-free, e.g:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
tag completely.

Reported-by: David Sterba <[email protected]>
Reviewed-by: Josef Bacik <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
mj22226 pushed a commit to mj22226/linux that referenced this pull request Sep 1, 2024
…hunk()

commit 10d9d8c upstream.

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

  BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
  CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
  Call Trace:
   dump_stack_lvl+0x61/0x80
   print_address_description.constprop.0+0x5e/0x2f0
   print_report+0x118/0x216
   kasan_report+0x11d/0x1f0
   btrfs_check_read_bio+0xa68/0xb70 [btrfs]
   process_one_work+0xce0/0x12a0
   worker_thread+0x717/0x1250
   kthread+0x2e3/0x3c0
   ret_from_fork+0x2d/0x70
   ret_from_fork_asm+0x11/0x20

  Allocated by task 20917:
   kasan_save_stack+0x37/0x60
   kasan_save_track+0x10/0x30
   __kasan_slab_alloc+0x7d/0x80
   kmem_cache_alloc_noprof+0x16e/0x3e0
   mempool_alloc_noprof+0x12e/0x310
   bio_alloc_bioset+0x3f0/0x7a0
   btrfs_bio_alloc+0x2e/0x50 [btrfs]
   submit_extent_page+0x4d1/0xdb0 [btrfs]
   btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
   btrfs_readahead+0x29a/0x430 [btrfs]
   read_pages+0x1a7/0xc60
   page_cache_ra_unbounded+0x2ad/0x560
   filemap_get_pages+0x629/0xa20
   filemap_read+0x335/0xbf0
   vfs_read+0x790/0xcb0
   ksys_read+0xfd/0x1d0
   do_syscall_64+0x6d/0x140
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

  Freed by task 20917:
   kasan_save_stack+0x37/0x60
   kasan_save_track+0x10/0x30
   kasan_save_free_info+0x37/0x50
   __kasan_slab_free+0x4b/0x60
   kmem_cache_free+0x214/0x5d0
   bio_free+0xed/0x180
   end_bbio_data_read+0x1cc/0x580 [btrfs]
   btrfs_submit_chunk+0x98d/0x1880 [btrfs]
   btrfs_submit_bio+0x33/0x70 [btrfs]
   submit_one_bio+0xd4/0x130 [btrfs]
   submit_extent_page+0x3ea/0xdb0 [btrfs]
   btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
   btrfs_readahead+0x29a/0x430 [btrfs]
   read_pages+0x1a7/0xc60
   page_cache_ra_unbounded+0x2ad/0x560
   filemap_get_pages+0x629/0xa20
   filemap_read+0x335/0xbf0
   vfs_read+0x790/0xcb0
   ksys_read+0xfd/0x1d0
   do_syscall_64+0x6d/0x140
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I cannot reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.  And the original endio function called bio_put() to free the whole
bio.

This means a double freeing thus causing use-after-free, e.g.:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
label completely.

Reported-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
CC: [email protected] # 6.6+
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
adam900710 added a commit to adam900710/linux that referenced this pull request Sep 1, 2024
…it_chunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

 ==================================================================
 BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
 Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
 CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
 Call Trace:
  <TASK>
  dump_stack_lvl+0x61/0x80
  print_address_description.constprop.0+0x5e/0x2f0
  print_report+0x118/0x216
  kasan_report+0x11d/0x1f0
  btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  process_one_work+0xce0/0x12a0
  worker_thread+0x717/0x1250
  kthread+0x2e3/0x3c0
  ret_from_fork+0x2d/0x70
  ret_from_fork_asm+0x11/0x20
  </TASK>

 Allocated by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  __kasan_slab_alloc+0x7d/0x80
  kmem_cache_alloc_noprof+0x16e/0x3e0
  mempool_alloc_noprof+0x12e/0x310
  bio_alloc_bioset+0x3f0/0x7a0
  btrfs_bio_alloc+0x2e/0x50 [btrfs]
  submit_extent_page+0x4d1/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 Freed by task 20917:
  kasan_save_stack+0x37/0x60
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x37/0x50
  __kasan_slab_free+0x4b/0x60
  kmem_cache_free+0x214/0x5d0
  bio_free+0xed/0x180
  end_bbio_data_read+0x1cc/0x580 [btrfs]
  btrfs_submit_chunk+0x98d/0x1880 [btrfs]
  btrfs_submit_bio+0x33/0x70 [btrfs]
  submit_one_bio+0xd4/0x130 [btrfs]
  submit_extent_page+0x3ea/0xdb0 [btrfs]
  btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
  btrfs_readahead+0x29a/0x430 [btrfs]
  read_pages+0x1a7/0xc60
  page_cache_ra_unbounded+0x2ad/0x560
  filemap_get_pages+0x629/0xa20
  filemap_read+0x335/0xbf0
  vfs_read+0x790/0xcb0
  ksys_read+0xfd/0x1d0
  do_syscall_64+0x6d/0x140
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I can not reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.
And the original endio function called bio_put() to free the whole bio.

This means a double freeing thus causing use-after-free, e.g:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
tag completely.

Reported-by: David Sterba <[email protected]>
Reviewed-by: Josef Bacik <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
ptr1337 pushed a commit to CachyOS/linux that referenced this pull request Sep 4, 2024
…hunk()

commit 10d9d8c upstream.

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

  BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
  CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
  Call Trace:
   dump_stack_lvl+0x61/0x80
   print_address_description.constprop.0+0x5e/0x2f0
   print_report+0x118/0x216
   kasan_report+0x11d/0x1f0
   btrfs_check_read_bio+0xa68/0xb70 [btrfs]
   process_one_work+0xce0/0x12a0
   worker_thread+0x717/0x1250
   kthread+0x2e3/0x3c0
   ret_from_fork+0x2d/0x70
   ret_from_fork_asm+0x11/0x20

  Allocated by task 20917:
   kasan_save_stack+0x37/0x60
   kasan_save_track+0x10/0x30
   __kasan_slab_alloc+0x7d/0x80
   kmem_cache_alloc_noprof+0x16e/0x3e0
   mempool_alloc_noprof+0x12e/0x310
   bio_alloc_bioset+0x3f0/0x7a0
   btrfs_bio_alloc+0x2e/0x50 [btrfs]
   submit_extent_page+0x4d1/0xdb0 [btrfs]
   btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
   btrfs_readahead+0x29a/0x430 [btrfs]
   read_pages+0x1a7/0xc60
   page_cache_ra_unbounded+0x2ad/0x560
   filemap_get_pages+0x629/0xa20
   filemap_read+0x335/0xbf0
   vfs_read+0x790/0xcb0
   ksys_read+0xfd/0x1d0
   do_syscall_64+0x6d/0x140
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

  Freed by task 20917:
   kasan_save_stack+0x37/0x60
   kasan_save_track+0x10/0x30
   kasan_save_free_info+0x37/0x50
   __kasan_slab_free+0x4b/0x60
   kmem_cache_free+0x214/0x5d0
   bio_free+0xed/0x180
   end_bbio_data_read+0x1cc/0x580 [btrfs]
   btrfs_submit_chunk+0x98d/0x1880 [btrfs]
   btrfs_submit_bio+0x33/0x70 [btrfs]
   submit_one_bio+0xd4/0x130 [btrfs]
   submit_extent_page+0x3ea/0xdb0 [btrfs]
   btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
   btrfs_readahead+0x29a/0x430 [btrfs]
   read_pages+0x1a7/0xc60
   page_cache_ra_unbounded+0x2ad/0x560
   filemap_get_pages+0x629/0xa20
   filemap_read+0x335/0xbf0
   vfs_read+0x790/0xcb0
   ksys_read+0xfd/0x1d0
   do_syscall_64+0x6d/0x140
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I cannot reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.  And the original endio function called bio_put() to free the whole
bio.

This means a double freeing thus causing use-after-free, e.g.:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
label completely.

Reported-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
CC: [email protected] # 6.6+
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
thesamesam pushed a commit to thesamesam/linux that referenced this pull request Sep 4, 2024
…hunk()

commit 10d9d8c upstream.

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

  BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
  CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
  Call Trace:
   dump_stack_lvl+0x61/0x80
   print_address_description.constprop.0+0x5e/0x2f0
   print_report+0x118/0x216
   kasan_report+0x11d/0x1f0
   btrfs_check_read_bio+0xa68/0xb70 [btrfs]
   process_one_work+0xce0/0x12a0
   worker_thread+0x717/0x1250
   kthread+0x2e3/0x3c0
   ret_from_fork+0x2d/0x70
   ret_from_fork_asm+0x11/0x20

  Allocated by task 20917:
   kasan_save_stack+0x37/0x60
   kasan_save_track+0x10/0x30
   __kasan_slab_alloc+0x7d/0x80
   kmem_cache_alloc_noprof+0x16e/0x3e0
   mempool_alloc_noprof+0x12e/0x310
   bio_alloc_bioset+0x3f0/0x7a0
   btrfs_bio_alloc+0x2e/0x50 [btrfs]
   submit_extent_page+0x4d1/0xdb0 [btrfs]
   btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
   btrfs_readahead+0x29a/0x430 [btrfs]
   read_pages+0x1a7/0xc60
   page_cache_ra_unbounded+0x2ad/0x560
   filemap_get_pages+0x629/0xa20
   filemap_read+0x335/0xbf0
   vfs_read+0x790/0xcb0
   ksys_read+0xfd/0x1d0
   do_syscall_64+0x6d/0x140
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

  Freed by task 20917:
   kasan_save_stack+0x37/0x60
   kasan_save_track+0x10/0x30
   kasan_save_free_info+0x37/0x50
   __kasan_slab_free+0x4b/0x60
   kmem_cache_free+0x214/0x5d0
   bio_free+0xed/0x180
   end_bbio_data_read+0x1cc/0x580 [btrfs]
   btrfs_submit_chunk+0x98d/0x1880 [btrfs]
   btrfs_submit_bio+0x33/0x70 [btrfs]
   submit_one_bio+0xd4/0x130 [btrfs]
   submit_extent_page+0x3ea/0xdb0 [btrfs]
   btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
   btrfs_readahead+0x29a/0x430 [btrfs]
   read_pages+0x1a7/0xc60
   page_cache_ra_unbounded+0x2ad/0x560
   filemap_get_pages+0x629/0xa20
   filemap_read+0x335/0xbf0
   vfs_read+0x790/0xcb0
   ksys_read+0xfd/0x1d0
   do_syscall_64+0x6d/0x140
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I cannot reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.  And the original endio function called bio_put() to free the whole
bio.

This means a double freeing thus causing use-after-free, e.g.:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
label completely.

Reported-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
CC: [email protected] # 6.6+
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Mr-Bossman pushed a commit to Mr-Bossman/linux that referenced this pull request Sep 16, 2024
…hunk()

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

  BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
  CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ torvalds#76
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
  Call Trace:
   dump_stack_lvl+0x61/0x80
   print_address_description.constprop.0+0x5e/0x2f0
   print_report+0x118/0x216
   kasan_report+0x11d/0x1f0
   btrfs_check_read_bio+0xa68/0xb70 [btrfs]
   process_one_work+0xce0/0x12a0
   worker_thread+0x717/0x1250
   kthread+0x2e3/0x3c0
   ret_from_fork+0x2d/0x70
   ret_from_fork_asm+0x11/0x20

  Allocated by task 20917:
   kasan_save_stack+0x37/0x60
   kasan_save_track+0x10/0x30
   __kasan_slab_alloc+0x7d/0x80
   kmem_cache_alloc_noprof+0x16e/0x3e0
   mempool_alloc_noprof+0x12e/0x310
   bio_alloc_bioset+0x3f0/0x7a0
   btrfs_bio_alloc+0x2e/0x50 [btrfs]
   submit_extent_page+0x4d1/0xdb0 [btrfs]
   btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
   btrfs_readahead+0x29a/0x430 [btrfs]
   read_pages+0x1a7/0xc60
   page_cache_ra_unbounded+0x2ad/0x560
   filemap_get_pages+0x629/0xa20
   filemap_read+0x335/0xbf0
   vfs_read+0x790/0xcb0
   ksys_read+0xfd/0x1d0
   do_syscall_64+0x6d/0x140
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

  Freed by task 20917:
   kasan_save_stack+0x37/0x60
   kasan_save_track+0x10/0x30
   kasan_save_free_info+0x37/0x50
   __kasan_slab_free+0x4b/0x60
   kmem_cache_free+0x214/0x5d0
   bio_free+0xed/0x180
   end_bbio_data_read+0x1cc/0x580 [btrfs]
   btrfs_submit_chunk+0x98d/0x1880 [btrfs]
   btrfs_submit_bio+0x33/0x70 [btrfs]
   submit_one_bio+0xd4/0x130 [btrfs]
   submit_extent_page+0x3ea/0xdb0 [btrfs]
   btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
   btrfs_readahead+0x29a/0x430 [btrfs]
   read_pages+0x1a7/0xc60
   page_cache_ra_unbounded+0x2ad/0x560
   filemap_get_pages+0x629/0xa20
   filemap_read+0x335/0xbf0
   vfs_read+0x790/0xcb0
   ksys_read+0xfd/0x1d0
   do_syscall_64+0x6d/0x140
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I cannot reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.  And the original endio function called bio_put() to free the whole
bio.

This means a double freeing thus causing use-after-free, e.g.:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
label completely.

Reported-by: David Sterba <[email protected]>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
CC: [email protected] # 6.6+
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants