Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NULL pointer dereference in aspeed_i2c_master_irq #219

Open
zevweiss opened this issue Feb 8, 2023 · 4 comments
Open

NULL pointer dereference in aspeed_i2c_master_irq #219

zevweiss opened this issue Feb 8, 2023 · 4 comments

Comments

@zevweiss
Copy link

zevweiss commented Feb 8, 2023

Running a dev-6.0 kernel on an ast2500 platform I'm working on a port to I'm sometimes hitting a panic on busses with an IPMB device on them:

[27559.144928] 8<--- cut here ---
[27559.148067] Unable to handle kernel NULL pointer dereference at virtual address 00000122
[27559.156186] [00000122] *pgd=00000000
[27559.159813] Internal error: Oops: 5 [#1] ARM
[27559.164120] CPU: 0 PID: 0 Comm: swapper Not tainted 6.0.19-67c9407-dirty-766959e #1
[27559.171811] Hardware name: Generic DT based system
[27559.176617] PC is at aspeed_i2c_master_irq+0x10c/0x4d0
[27559.181829] LR is at aspeed_i2c_bus_irq+0x58/0x140
[27559.186657] pc : [<804eea20>]    lr : [<804eee3c>]    psr: 80000193
[27559.192934] sp : 80b01e70  ip : 00000001  fp : 8095bdb8
[27559.198170] r10: 80b08588  r9 : 80b08a60  r8 : 80dcb3c0
[27559.203403] r7 : 00000001  r6 : 00000001  r5 : 00000001  r4 : 84a84420
[27559.209937] r3 : 00000000  r2 : 00000000  r1 : 9f893440  r0 : 00000122
[27559.216475] Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
[27559.223715] Control: 00c5387d  Table: 85490008  DAC: 00000051
[27559.229474] Register r0 information: non-paged memory
[27559.234563] Register r1 information: 0-page vmalloc region starting at 0x9f893000 allocated at __devm_ioremap_resource+0x150/0x1d8
[27559.246372] Register r2 information: NULL pointer
[27559.251118] Register r3 information: NULL pointer
[27559.255856] Register r4 information: slab kmalloc-1k start 84a84400 pointer offset 32 size 1024
[27559.264638] Register r5 information: non-paged memory
[27559.269725] Register r6 information: non-paged memory
[27559.274809] Register r7 information: non-paged memory
[27559.279890] Register r8 information: slab kmalloc-192 start 80dcb3c0 pointer offset 0 size 192
[27559.288585] Register r9 information: non-slab/vmalloc memory
[27559.294279] Register r10 information: non-slab/vmalloc memory
[27559.300056] Register r11 information: non-slab/vmalloc memory
[27559.305836] Register r12 information: non-paged memory
[27559.311002] Process swapper (pid: 0, stack limit = 0x88a4d53e)
[27559.316872] Stack: (0x80b01e70 to 0x80b02000)
[27559.321265] 1e60:                                     84a84420 80c02520 00000001 00000001
[27559.329464] 1e80: 80dcb3c0 804eee3c 80e5f440 80c02520 00000000 00000024 80dcb3c0 80b08a60
[27559.337663] 1ea0: 80b08588 801522bc 00000012 80c97240 80dcb3c0 80c02520 80c97b4c 80b67050
[27559.345861] 1ec0: 9dfff000 8015244c 80dcb3c0 80c02520 80c97b4c 80156320 0000000c 80151bbc
[27559.354057] 1ee0: 0000000c 803bf368 00001000 a8b1ee02 80b08588 80c024c0 80b8cf54 00000000
[27559.362249] 1f00: 80b01f5c 80151bbc 80c024c0 801011bc 80b01f28 8073c97c 80102a00 60000013
[27559.370447] 1f20: ffffffff 80100b04 00000000 00000000 8443cb49 00000000 00000000 80b08a60
[27559.378648] 1f40: 80b03064 80b08a60 9dfff000 00000000 80b08588 8095bdb8 84b9acc8 80b01f78
[27559.386847] 1f60: 801029fc 80102a00 60000013 ffffffff 00000051 80743398 00000000 80743398
[27559.395044] 1f80: 00000000 80144f40 80b0d748 80ba5040 9dfff018 80145254 80b0d748 8073ca20
[27559.403242] 1fa0: 00000001 80a00c58 00000001 80a0126c ffffffff ffffffff 00000000 80a00658
[27559.411444] 1fc0: 00000000 80a29a48 a8b7ee11 00000000 00000000 80a0033c 00000051 00c0387d
[27559.419642] 1fe0: 00000000 8feea000 410fb767 00c5387d 00000000 00000000 00000000 00000000
[27559.427843]  aspeed_i2c_master_irq from aspeed_i2c_bus_irq+0x58/0x140
[27559.434364]  aspeed_i2c_bus_irq from __handle_irq_event_percpu+0x50/0x168
[27559.441221]  __handle_irq_event_percpu from handle_irq_event+0x30/0x70
[27559.447792]  handle_irq_event from handle_simple_irq+0x7c/0xac
[27559.453679]  handle_simple_irq from generic_handle_domain_irq+0x2c/0x3c
[27559.460335]  generic_handle_domain_irq from aspeed_i2c_ic_irq_handler+0x64/0xf4
[27559.467685]  aspeed_i2c_ic_irq_handler from generic_handle_domain_irq+0x2c/0x3c
[27559.475035]  generic_handle_domain_irq from avic_handle_irq+0x44/0x60
[27559.481516]  avic_handle_irq from generic_handle_arch_irq+0x28/0x3c
[27559.487853]  generic_handle_arch_irq from __irq_svc+0x84/0xac
[27559.493645] Exception stack(0x80b01f28 to 0x80b01f70)
[27559.498722] 1f20:                   00000000 00000000 8443cb49 00000000 00000000 80b08a60
[27559.506924] 1f40: 80b03064 80b08a60 9dfff000 00000000 80b08588 8095bdb8 84b9acc8 80b01f78
[27559.515114] 1f60: 801029fc 80102a00 60000013 ffffffff
[27559.520187]  __irq_svc from arch_cpu_idle+0x30/0x38
[27559.525118]  arch_cpu_idle from default_idle_call+0x2c/0x74
[27559.530743]  default_idle_call from do_idle+0x70/0xd8
[27559.535848]  do_idle from cpu_startup_entry+0x14/0x18
[27559.540943]  cpu_startup_entry from rest_init+0x80/0x9c
[27559.546230]  rest_init from arch_post_acpi_subsys_init+0x0/0x18
[27559.552226] Code: e59421cc e59411a4 e282c001 e584c1cc (e7d02002)
[27559.558356] ---[ end trace 0000000000000000 ]---
[27559.563002] Kernel panic - not syncing: Fatal exception in interrupt

aspeed_i2c_master_irq+0x10c points to this line, with msg->buf being NULL.

I've seen it happen a few times, but it's not 100% reproducible, so I'm guessing it maybe a race condition of some sort.

@zevweiss
Copy link
Author

zevweiss commented Feb 8, 2023

Caught a maybe-related KASAN dump in a similar scenario (though I now think the IPMB device may in fact be unrelated):

[  614.857168] ==================================================================
[  614.864585] BUG: KASAN: slab-out-of-bounds in aspeed_i2c_master_irq+0x244/0x708
[  614.872150] Read of size 2 at addr 8443b6e4 by task swapper/0
[  614.878047] 
[  614.879645] CPU: 0 PID: 0 Comm: swapper Not tainted 6.0.19-67c9407-dirty-90ab623 #1
[  614.887447] Hardware name: Generic DT based system
[  614.892376]  unwind_backtrace from show_stack+0x18/0x1c
[  614.897895]  show_stack from dump_stack_lvl+0x24/0x2c
[  614.903245]  dump_stack_lvl from print_report+0x2bc/0x5f8
[  614.908919]  print_report from kasan_report+0xc4/0x108
[  614.914286]  kasan_report from aspeed_i2c_master_irq+0x244/0x708
[  614.920561]  aspeed_i2c_master_irq from aspeed_i2c_bus_irq+0x70/0x188
[  614.927251]  aspeed_i2c_bus_irq from __handle_irq_event_percpu+0x78/0x194
[  614.934313]  __handle_irq_event_percpu from handle_irq_event+0x50/0xb0
[  614.941076]  handle_irq_event from handle_simple_irq+0xc8/0x104
[  614.947248]  handle_simple_irq from generic_handle_domain_irq+0x40/0x50
[  614.954110]  generic_handle_domain_irq from aspeed_i2c_ic_irq_handler+0xe4/0x1bc
[  614.961809]  aspeed_i2c_ic_irq_handler from generic_handle_domain_irq+0x40/0x50
[  614.969395]  generic_handle_domain_irq from avic_handle_irq+0x50/0x74
[  614.976083]  avic_handle_irq from generic_handle_arch_irq+0x28/0x3c
[  614.982629]  generic_handle_arch_irq from call_with_stack+0x18/0x20
[  614.989153]  call_with_stack from __irq_svc+0x78/0x94
[  614.994423] Exception stack(0x81003f30 to 0x81003f78)
[  614.999627] 3f20:                                     8100ce00 00000000 00000000 00000000
[  615.007960] 3f40: 8100ce00 00000000 81005064 ffffffff 00000000 410fb767 00c5387d 00000000
[  615.016291] 3f60: 8100ce00 81003f80 801036b4 801036b8 60000013 ffffffff
[  615.023037]  __irq_svc from arch_cpu_idle+0x30/0x38
[  615.028151]  arch_cpu_idle from default_idle_call+0x34/0x7c
[  615.033977]  default_idle_call from do_idle+0x88/0xe4
[  615.039272]  do_idle from cpu_startup_entry+0x14/0x18
[  615.044548]  cpu_startup_entry from rest_init+0xa8/0xc4
[  615.050037]  rest_init from arch_post_acpi_subsys_init+0x0/0x18
[  615.056208] 
[  615.057807] Allocated by task 262:
[  615.061326]  seq_read_iter+0x424/0x83c
[  615.065275]  io_read+0x1b4/0x81c
[  615.068706]  io_issue_sqe+0x90/0x3ac
[  615.072495]  io_submit_sqes+0x3b8/0x9a8
[  615.076518]  sys_io_uring_enter+0x3fc/0xc6c
[  615.080887]  ret_fast_syscall+0x0/0x54
[  615.084808] 
[  615.086397] Freed by task 262:
[  615.089572]  kasan_set_free_info+0x20/0x34
[  615.093867]  __kasan_slab_free+0xe4/0x134
[  615.098049]  slab_free_freelist_hook+0x84/0x150
[  615.102794]  kfree+0x90/0x28c
[  615.105928]  seq_release+0x2c/0x48
[  615.109518]  kernfs_fop_release+0x64/0x110
[  615.113804]  __fput+0xd0/0x378
[  615.117037]  task_work_run+0x98/0xc8
[  615.120810]  do_work_pending+0x574/0x698
[  615.124927]  slow_work_pending+0xc/0x20
[  615.128926] 
[  615.130516] The buggy address belongs to the object at 8443a000
[  615.130516]  which belongs to the cache kmalloc-4k of size 4096
[  615.142475] The buggy address is located 1764 bytes to the right of
[  615.142475]  4096-byte region [8443a000, 8443b000)
[  615.153662] 
[  615.155255] The buggy address belongs to the physical page:
[  615.160936] page:343b2b83 refcount:1 mapcount:0 mapping:00000000 index:0x0 pfn:0x84438
[  615.169022] head:343b2b83 order:3 compound_mapcount:0 compound_pincount:0
[  615.175953] flags: 0x10200(slab|head|zone=0)
[  615.180432] raw: 00010200 00000100 00000122 814019a0 00000000 00040004 ffffffff 00000001
[  615.188642] page dumped because: kasan: bad access detected
[  615.194326] 
[  615.195915] Memory state around the buggy address:
[  615.200831]  8443b580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  615.207511]  8443b600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  615.214193] >8443b680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  615.220840]                                                ^
[  615.226633]  8443b700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  615.233309]  8443b780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  615.239957] ==================================================================

aspeed_i2c_master_irq+0x244 points here.

Task 262, FWIW, is adcsensor, doing i2c-unrelated things.

@zevweiss
Copy link
Author

Some further debug-experiment results: after disabling CONFIG_I2C_SLAVE I don't seem to be able to reproduce it (whereas I had been hitting it pretty reliably with that enabled).

@ksrikanth
Copy link

Hello @zevweiss any further updates on this.
We are also seeing similar issue, and have "I2C_SLAVE_ENABLED".

@zevweiss
Copy link
Author

Hi @ksrikanth -- since I realized that the platform I'd been encountering this on didn't actually require CONFIG_I2C_SLAVE this became a much lower-priority issue for us and I haven't looked into further I'm afraid.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants