Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

possible crash when recursively entering the Notepad++ backup_mutex #14906

Closed
xomx opened this issue Mar 26, 2024 · 6 comments · May be fixed by #15031
Closed

possible crash when recursively entering the Notepad++ backup_mutex #14906

xomx opened this issue Mar 26, 2024 · 6 comments · May be fixed by #15031
Assignees
Labels
bug crash issue causing N++ to crash

Comments

@xomx
Copy link
Contributor

xomx commented Mar 26, 2024

Description of the Issue

While simulating the stalled/blocked/hanged file saving / filecache flushing for the NUL-corruptions issues debugging purposes, I found a bad N++ design - under some circumstances like that blocked file-IO, there is a possibility of multiple entry to the N++ backup_mutex std::mutex from the same thread, resulting in a N++ crash.

Ref: https://en.cppreference.com/w/cpp/thread/mutex/lock
"If lock is called by a thread that already owns the std::mutex, the behavior is undefined: for example, the program may deadlock.
An implementation that can detect the invalid usage is encouraged to throw a std::system_error with error condition
resource_deadlock_would_occur instead of deadlocking."
(Note: In such a case MS C++ throws the _RESOURCE_DEADLOCK_WOULD_OCCUR)

This can lead to loss or corruption of the opened/saved files in N++, as well as failure to save some important configuration files, which are saved when the N++ is being closed (e.g. the session.xml, config.xml, etc.).

STR

Current N++ design allows calling of its FileManager::backupCurrentBuffer() func multiple times from the same thread even before finishing the backup saving there but the file-IO could be slow/blocked, e.g.:

  • HW not ready (disk spinning up, waking up from a deep powerstate sleep, a network slowdown/error...)
  • busy disk (saving/flushing some other-app big data, etc.)
  • user set a too short backup time interval (e.g. 1 sec in the Preferences > Backup > Backup in every ...)
    and the file to be saved is bigger and the disk used is not fast enough
  • possible disk driver errors blocking the IO
  1. Simulate blocked file-IO by putting superfluous modal MessageBox WINAPI call to the N++ Win32_IO_File::write or Win32_IO_File::close()
  2. Run that special N++ build.
  3. Ensure that the N++ Preferences > Backup is ON:
    npp-pref-backup
  4. Either open a new N++ tab, type something in it to make it dirty, wait for the N++ backup-thread crash (default is the 7 secs as is visible in the pic above) or close the N++ (files will be written at the N++ exit ...)

Expected Behavior

N++ handles that (temporarily) "blocked-IO" situation without the crash.

Actual Behavior

N++ will start to pointlessly stack up its NPPM_INTERNAL_SAVEBACKUP messages via the periodically called PostMessage in the Notepad_plus::backupDocument(void * /*param*/) no matter if the backup file-saving can be / is completed or not, the FileManager::backupCurrentBuffer() will be reentered from the same thread causing crash at the std::lock_guard<std::mutex> lock(backup_mutex) .

Proposed solutions

  1. The simplest (I highly do not recommend that) is the replacing std::mutex by std::recursive_mutex. This will only delay the possible crash of the app. The std::recursive_mutex has not its own locking counter method (and its native_handle() method is unfortunately not implemented on the Windows platform) needed for a proper solution.
  2. Use a custom std::mutex solution enhanced by a safe lockcount() method, as the correct solution is the immediate returning from that FileManager::backupCurrentBuffer() func when a recursive lock-situation (not finished previous backup) is detected.
  3. A complete change of the N++ backup-design.

@donho
If you agree with the above no.2 proposal, assign me please to this issue, I will try a PR.

@chcg chcg added bug crash issue causing N++ to crash labels Mar 26, 2024
@donho
Copy link
Member

donho commented Mar 27, 2024

@chcg
Thank you for assigning this issue to me.

@xomx
I can reproduce the crash with your provided STR.
Let's try your 2nd suggestion then.

@donho
Copy link
Member

donho commented Mar 29, 2024

#14917 (comment)

@donho donho reopened this Mar 29, 2024
@donho
Copy link
Member

donho commented Apr 22, 2024

@xomx
Could you provide the instructions to reproduce the crash from the master?

@xomx
Copy link
Contributor Author

xomx commented Apr 23, 2024

@donho

Could you provide the instructions to reproduce the crash from the master?

Currently I cannot.
I tested this master on a PC that is currently unavailable to me and here now I cannot replicate my previous setup exhibiting the error, I forgot how, sorry. But I have already solved it there, just did not have time to publish the finished PR. So I did now. Is it ok?

@donho
Copy link
Member

donho commented Apr 23, 2024

@xomx
Thank you for the PR.

I tested this master on a PC that is currently unavailable to me

It's OK, let's wait when it is available. For me this issue has been fixed by bbeaafa. In order to consider #15031, I need to reproduce the crash.

@donho
Copy link
Member

donho commented May 6, 2024

@xomx
If there's no crash can be reproduced, I have to consider the crash has been fixed and close this issue.
Please provide the instructions to reproduce the crash, then I will reopen this issue.

@donho donho closed this as completed May 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment