Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Panic in I2C driver when reading from I2C EEPROM chip (IDFGH-12697) #13687

Open
3 tasks done
sfzhi opened this issue Apr 24, 2024 · 5 comments
Open
3 tasks done

Panic in I2C driver when reading from I2C EEPROM chip (IDFGH-12697) #13687

sfzhi opened this issue Apr 24, 2024 · 5 comments
Assignees
Labels
Status: Opened Issue is new Type: Bug bugs in IDF

Comments

@sfzhi
Copy link

sfzhi commented Apr 24, 2024

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

IDF version.

5.2.1

Espressif SoC revision.

ESP32-C3 revision 0.4

Operating System used.

Linux

How did you build your project?

Command line with idf.py

If you are using Windows, please specify command line type.

None

Development Kit.

Seeed XIAO ESP32C3

Power Supply used.

USB

What is the expected behavior?

It should be possible to read a longer sequence of bytes from an external EEPROM over I2C.

What is the actual behavior?

Guru Meditation Error: Core  0 panic'ed (Store access fault). Exception was unhandled.

Core  0 register dump:
MEPC    : 0x40382350  RA      : 0x40382332  SP      : 0x3fc97970  GP      : 0x3fc94a00
TP      : 0x3fc7b4a4  T0      : 0x0000001f  T1      : 0x600c2140  T2      : 0x00000002
S0/FP   : 0x3fcae2c4  S1      : 0x3fcae2e8  A0      : 0x3fcae2c4  A1      : 0x00000020
A2      : 0x60013000  A3      : 0x00000020  A4      : 0x00000053  A5      : 0x00000001
A6      : 0xa0000000  A7      : 0x0000000a  S2      : 0x3fcae1ac  S3      : 0x3fcae1ac
S4      : 0xffffffff  S5      : 0x00000001  S6      : 0x3fc9e858  S7      : 0x3fca0ce4
S8      : 0x00000075  S9      : 0x00000001  S10     : 0x0000005a  S11     : 0x00000010
T3      : 0x00000038  T4      : 0x3fca11cc  T5      : 0x00001800  T6      : 0x00000000
MSTATUS : 0x00001881  MTVEC   : 0x40380001  MCAUSE  : 0x00000007  MTVAL   : 0x00000000
MHARTID : 0x00000000

Stack memory:
3fc97970: 0x00000000 0x3fcae1c4 0x00000053 0x40383fbc 0x000000ff 0x3fcae1c4 0x3fc9aaf8 0x40383f7a
3fc97990: 0x00000000 0x80000007 0x60023000 0x00000001 0x00001881 0x8000000b 0x3fcae590 0x40380f5a
3fc979b0: 0x00001881 0x80000007 0x00000009 0x403801e8 0x60023000 0x4038665c 0x40386668 0x00000000
3fc979d0: 0x3fc979cc 0x00000000 0x00000000 0x00000000 0x3fc979e4 0xffffffff 0x3fc979e4 0x3fc979e4
3fc979f0: 0x00000000 0x3fc979f8 0xffffffff 0x3fc979f8 0x3fc979f8 0x00000001 0x00000001 0x00000000
3fc97a10: 0x0001ffff 0x00000000 0x00000000 0x00000000 0x00000000 0x3fc97a20 0x00000000 0x00000000
3fc97a30: 0x00000000 0x3fc97a38 0xffffffff 0x3fc97a38 0x3fc97a38 0x00000000 0x3fc97a4c 0xffffffff
3fc97a50: 0x3fc97a4c 0x3fc97a4c 0x00000001 0x00000001 0x00000000 0x0001ffff 0x00000000 0x00000000
3fc97a70: 0x00000000 0x3fc9e60c 0x3fc9e674 0x3fca1ed8 0x3fca32f8 0x00000000 0x00000000 0x00000000
3fc97a90: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000

Steps to reproduce.

  1. 24LC04 EEPROM chip is connected to the I2C bus.
  2. i2c_new_master_bus(...) - as usually
  3. i2c_master_bus_add_device(...) - as usually
  4. Try to read a sequence of bytes starting from the given address:
uint8_t addr = 0;
uint8_t data[256];
size_t size = 73; // any number between 32 and 256
i2c_master_transmit_receive(device, &addr, sizeof(addr), data, size, 10);

From my observations, for the problem to occur, these two conditions must be true at the same time:

  • The length of the data to be read is >= 32 (the problem does not occur if the length to be read is < 32).
  • The device must have been reset using the reset button (which is an equivalent of power-on reset, if I'm not mistaken). The problem does not occur after other kinds of reset, such as via USB or by calling esp_restart().

If both conditions are true, the problem is 100% reproducible.

Debug Logs.

$ riscv32-esp-elf-addr2line -pfiaC -e build/whatever.elf 0x40382350
0x40382350: i2c_ll_read_rxfifo at /COMPONENT_HAL_DIR/esp32c3/include/hal/i2c_ll.h:613
 (inlined by) i2c_isr_receive_handler at /COMPONENT_DRIVER_DIR/i2c/i2c_master.c:533
 (inlined by) i2c_master_isr_handler_default at /COMPONENT_DRIVER_DIR/i2c/i2c_master.c:586

More Information.

The relevant part of the code disassembled by GDB:

/COMPONENT_DRIVER_DIR/i2c/i2c_master.c:                                                                                                                                                         
530         if (i2c_master->status == I2C_STATUS_READ) {                                                                                                                                        
   0x40382318 <+396>:   fence                                                                                                                                                                   
   0x4038231c <+400>:   lw      a5,12(s0)                                                                                                                                                       
   0x4038231e <+402>:   fence                                                                                                                                                                   
   0x40382322 <+406>:   bnez    a5,0x403823a6 <i2c_master_isr_handler_default+538>                                                                                                              
                                                                                                                                                                                                
531             i2c_operation_t *i2c_operation = &i2c_master->i2c_trans.ops[i2c_master->trans_idx];                                                                                             
   0x40382324 <+408>:   lhu     a5,132(s0)                                                                                                                                                      
   0x40382328 <+412>:   lw      s1,28(s0)                                                                                                                                                       
   0x4038232a <+414>:   slli    a5,a5,0x4                                                                                                                                                       
   0x4038232c <+416>:   add     s1,s1,a5                                                                                                                                                        
                                                                                                                                                                                                
532             portENTER_CRITICAL_ISR(&i2c_master->base->spinlock);                                                                                                                            
   0x4038232e <+418>:   jal     ra,0x40388b6c <vPortEnterCritical>                                                                                                                              
                                                                                                                                                                                                
533             i2c_ll_read_rxfifo(hal->dev, i2c_operation->data + i2c_operation->bytes_used, i2c_master->rx_cnt);                                                                              
   0x40382332 <+422>:   lw      a1,20(s0)                                                                                                                                                       
   0x40382334 <+424>:   lw      a2,12(s3)                                                                                                                                                       
   0x40382338 <+428>:   lw      a5,4(s1)                                                                                                                                                        
   0x4038233a <+430>:   zext.b  a3,a1                                                                                                                                                           
   0x4038233e <+434>:   lhu     a4,8(s1)                                                                                                                                                        
                                                                                                                                                                                                
/COMPONENT_HAL_DIR/esp32c3/include/hal/i2c_ll.h:                                                                                                                                                
612         for (int i = 0; i < len; i++) {                                                                                                                                                     
   0x40382342 <+438>:   beqz    a3,0x4038235e <i2c_master_isr_handler_default+466>                                                                                                              
   0x40382344 <+440>:   add     a5,a5,a4                                                                                                                                                        
   0x40382346 <+442>:   add     a3,a3,a5                                                                                                                                                        
                                                                                                                                                                                                
613             ptr[i] = HAL_FORCE_READ_U32_REG_FIELD(hw->fifo_data, data);                                                                                                                     
   0x40382348 <+444>:   lw      a4,28(a2)                                                                                                                                                       
   0x4038234a <+446>:   addi    a5,a5,1                                                                                                                                                         
   0x4038234c <+448>:   sw      a4,8(sp)                                                                                                                                                        
   0x4038234e <+450>:   lw      a4,8(sp)                                                                                                                                                        
   0x40382350 <+452>:   sb      a4,-1(a5)                                                                                                                                                       
                                                                                                                                                                                                
612         for (int i = 0; i < len; i++) {                                                                                                                                                     
   0x40382354 <+456>:   bne     a3,a5,0x40382348 <i2c_master_isr_handler_default+444>                                                                                                           
                                                                                                                                                                                                
/COMPONENT_DRIVER_DIR/i2c/i2c_master.c:                                                                                                                                                         
535             i2c_master->w_r_size = i2c_master->rx_cnt;                                                                                                                                      
   0x40382358 <+460>:   lw      a1,20(s0)                                                                                                                                                       
                                                                                                                                                                                                
536             i2c_operation->bytes_used += i2c_master->rx_cnt;                                                                                                                                
   0x4038235a <+462>:   lhu     a4,8(s1)                                                                                                                                                        
                                                                                                                                                                                                
535             i2c_master->w_r_size = i2c_master->rx_cnt;                                                                                                                                      
   0x4038235e <+466>:   add     a4,a4,a1                                                                                                                                                        
   0x40382360 <+468>:   slli    a4,a4,0x10                                                                                                                                                      

As can be seen from the register dump above, the exact spot where the problem occurs is
0x40382350 <+452>: sb a4,-1(a5), which is the assignment to ptr[i] inside i2c_ll_read_rxfifo(...).
Apparently the destination address (stored in a5 and already incremented for the next loop iteration) is NULL.

@sfzhi sfzhi added the Type: Bug bugs in IDF label Apr 24, 2024
@espressif-bot espressif-bot added the Status: Opened Issue is new label Apr 24, 2024
@github-actions github-actions bot changed the title Panic in I2C driver when reading from I2C EEPROM chip Panic in I2C driver when reading from I2C EEPROM chip (IDFGH-12697) Apr 24, 2024
@mythbuster5
Copy link
Collaborator

@sfzhi I tried to reproduce with our i2c_eeprom_main example but failed. Could you please provide your sdkconfig?

@mythbuster5
Copy link
Collaborator

IIRC,power-on reset will reset all condition, so after power-on reset, everything should begin as start. Some panic happen after power-on reset looks very strange to me even if GDB shows something. So 1. I want to see your configuration 2. See your reset reason

ESP-ROM:esp32c3-api1-20210207
Build:Feb  7 2021
rst:0x1 (POWERON),boot:0xc (SPI_FAST_FLASH_BOOT)
SPIWP:0xee
mode:DIO, clock div:1

@sfzhi
Copy link
Author

sfzhi commented Apr 25, 2024

Here you go:

ESP-ROM:esp32c3-api1-20210207                                                    
Build:Feb  7 2021                                                                
rst:0x1 (POWERON),boot:0xe (SPI_FAST_FLASH_BOOT)                                 
SPIWP:0xee                                                                       
mode:DIO, clock div:1                                                            
load:0x3fcd5820,len:0x1880                                                       
load:0x403cc710,len:0xd50                                                        
load:0x403ce710,len:0x2f50                                                       
entry 0x403cc71a                                                                 

The sdkconfig file: sdkconfig.txt.

@sfzhi
Copy link
Author

sfzhi commented Apr 25, 2024

I have conducted a few more experiments and the results are even more puzzling than before. When the panic occurs, if I don't wait until the device is restarted automatically (after 3 seconds, as per the configuration), but press the reset button again immediately, the device boots fine and the problem does not occur. I can't think of any reasonable explanation of what's happening.

@sfzhi
Copy link
Author

sfzhi commented May 3, 2024

Even more experiments resulted in the following observations and conclusions:

  • The problem is somehow tied to a specific layout of the binary image. It is 100% reproducible with some images and not reproducible at all with others. The difference between a "good" and a "bad" image can be as small as one function being two bytes longer/shorter - even for a function that is completely unrelated to I2C and is not even called before the problem occurs.
  • The problem is somehow tied to the reset type, but not always in the same way. With one image it will occur only after using the reset button, with another it will occur only after a reset via USB, with yet another it will occurs only after calling esp_restart(...). Combinations of these are also possible, but very rare.
  • The problem does not always manifest in exactly the same way. Most of the time, the panic occurs in the exact way as described above. However, in rare cases (meaning with some specific images) it manifests as an I2C transaction failure (without panic).
  • In one case (again, 100% reproducible with the given image), the panic occurred in the same spot in the code as described above, but on the second 32-bit chunk of data rather that the first one, with the destination pointer being NULL. That is a telling piece of evidence. It means the destination pointer magically turns into NULL between the first and the second 32-bit chunks. That suggests it may be some other (possibly unrelated) piece of code running at the same time (in another FreeRTOS task) that accidentally corrupts the internal state of the I2C driver.
  • By looking at what else might be running at the same time and moving those things one-by-one to a later time (after the I2C transaction in question) I managed to narrow it down to one piece that looks the most likely culprit:
wifi_init_config_t init_config = WIFI_INIT_CONFIG_DEFAULT();
ESP_ERROR_CHECK(esp_wifi_init(&init_config));
  • I have also noticed that the time the ESP32 takes to reach a certain point in the startup sequence depends on the reset type. The variation is not large (typically less than 0.1 seconds), but it's clearly measurable. That would explain the mysterious dependency on the reset type. Assuming the problem is indeed caused by WiFi initialization, a slight delay in when exactly the memory corruption occurs could make big difference w.r.t. how it affects the I2C transaction.
  • Unfortunately, much of the functionality of esp_wifi_init(...) is hidden inside esp_wifi_init_internal(..), source code for which doesn't seem to be provided as a part of ESP-IDF, so I don't know how I can investigate this issue further.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Opened Issue is new Type: Bug bugs in IDF
Projects
None yet
Development

No branches or pull requests

3 participants