| Summary: | deadlock between background parser and code completion | | |
|---|---|---|---|
| Product: | [Applications] kdevelop | Reporter: | Milian Wolff <mail> |
| Component: | Language Support: CPP (Clang-based) | Assignee: | kdevelop-bugs-null |
| Status: | REOPENED --- | | |
| Severity: | grave | CC: | kfunk, mantri, mswan |
| Priority: | VHI | Keywords: | release_blocker, triaged |
| Version: | git master | | |
| Target Milestone: | 5.0.0 | | |
| Platform: | Other | | |
| OS: | Linux | | |
| Latest Commit: | | Version Fixed In: | |

Description
Milian Wolff
2016-01-30 17:51:20 UTC
I reproduced it once more and I think the real issue above is that the clang thread (#2) is not joining into the code completion thread (#18). If that happened, the parse thread in #13 should be able to continue. Am I missing something? Maybe this was not a real deadlock, just some odd behavior in clang that resulted in many seconds of parse time? Without more debug symbols, this is hard to debug... I'll close this for now and reopen if I ever hit it again with more debug symbols in clang available.

Dear Bug Submitter, This bug has been in NEEDSINFO status with no change for at least 15 days. Please provide the requested information as soon as possible and set the bug status as REPORTED. Due to regular bug tracker maintenance, if the bug is still in NEEDSINFO status with no change in 30 days, the bug will be closed as RESOLVED > WORKSFORME due to lack of needed information. For more information about our bug triaging procedures please read the wiki located here: https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging If you have already provided the requested information, please set the bug status as REPORTED so that the KDE team knows that the bug is ready to be confirmed. Thank you for helping us make KDE software even better for everyone!

Dear Bug Submitter, This bug has been in NEEDSINFO status with no change for at least 30 days. The bug is now closed as RESOLVED > WORKSFORME due to lack of needed information. For more information about our bug triaging procedures please read the wiki located here: https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging Thank you for helping us make KDE software even better for everyone!

This has occurred on my machine, and after some investigation the culprit appears rather clear. The following locking events should make it apparent:

ClangParseJob thread:
1. In ClangParseJob::run, the ParseSession constructor is invoked, which locks the corresponding parse session.
2. ClangParseJob::run then invokes ClangHelpers::buildDUChain, which invokes the DUChainWriteLocker constructor.

ClangCodeCompletion thread:
1. In ClangCodeCompletionWorker::run, the DUChainReadLocker constructor is invoked.
2. ClangCodeCompletionWorker::run then invokes createCompletionContext, which invokes the ClangCodeCompletionContext constructor, which in turn invokes the ParseSession constructor on the same parse session as in step 1 of the ClangParseJob thread.

Consequently, the ClangCodeCompletion thread could hold a read lock on the DUChain while the ClangParseJob thread holds a lock on the parse session, and each needs the resource the other holds.

See comment above, definitely still an issue.

Hmm, on second thought, it seems that the DUChain read lock is unlocked before createCompletionContext is invoked. Perhaps my above analysis does not hold.

Alright, now I believe I have actually figured this out for the deadlock I was seeing, although I am not sure it is the same one reported in this ticket. Having given Milian's call stack some scrutiny, I think his suspicion that clang takes an exorbitant amount of time is the source of his problem. In my case, however, I see a call stack in which AbstractNavigationWidgetPrivate::anchorClicked is invoked, which holds a DUChain read lock, and some 53 calls up the call stack later we reach DUChainWriteLocker::lock. Even at the next relevant function up the call stack, AdaptSignatureAction::execute, the current thread should already not hold a read lock on the DUChain, but along this code path it necessarily does. This regression was introduced when resolving ticket #386901. That issue should probably have been addressed by moving the DUChainReadLocker further up the stack, leaving intact the contract assumed by acceptLink: no locks are held on the DUChain on entry, and the callee takes such locks as needed.

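The read-then-write situation described above can be modeled without any real locking. The following is only a minimal sketch under stated assumptions: `ReadGuard`, `canTakeWriteLock`, `calleeNeedingWriteLock`, and `callerHoldingReadLock` are hypothetical stand-ins, not KDevelop classes or functions, and a per-thread counter replaces the actual DUChain lock. It only illustrates why a callee that wants a write lock cannot proceed when its own thread still holds a read lock.

```cpp
#include <cassert>

// Hypothetical model; none of these names exist in KDevelop.
namespace lock_sketch {

// Number of read locks the current thread holds on the shared structure.
thread_local int readLockDepth = 0;

// RAII guard that records a read lock for the current thread.
class ReadGuard {
public:
    ReadGuard() { ++readLockDepth; }
    ~ReadGuard() {
        assert(readLockDepth > 0);
        --readLockDepth;
    }
    ReadGuard(const ReadGuard&) = delete;
    ReadGuard& operator=(const ReadGuard&) = delete;
};

// A writer has to wait until all readers are gone. If the current thread
// is itself one of those readers, that wait can never finish.
bool canTakeWriteLock() { return readLockDepth == 0; }

// Stand-in for a callee that needs write access to do its work.
bool calleeNeedingWriteLock() { return canTakeWriteLock(); }

// Stand-in for a caller that optionally keeps a read lock across the call.
bool callerHoldingReadLock(bool holdReadLock) {
    if (holdReadLock) {
        ReadGuard guard;  // still held while the callee runs
        return calleeNeedingWriteLock();
    }
    return calleeNeedingWriteLock();
}

} // namespace lock_sketch
```

In this model, `callerHoldingReadLock(true)` returns `false` (the callee could not safely take a write lock), while `callerHoldingReadLock(false)` returns `true`. A check of this kind at the top of the callee turns a silent hang into an immediate, debuggable failure.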
A possibly relevant merge request was started @ https://invent.kde.org/kdevelop/kdevelop/-/merge_requests/277

Git commit a947074f0872ad3245b8c73679143998a88e3753 by Igor Kushnir, on behalf of Jonathan L. Verner.
Committed on 01/05/2022 at 12:56.
Pushed by igorkushnir into branch 'release/22.04'.

Fix a crash in the "update signature action".

The problem seems to be that the DUChain is readlocked in `AbstractNavigationWidgetPrivate::anchorClicked` (see also [2]), which then proceeds through the following (backtrace-like) call chain

    #9 AdaptSignatureAction::execute() (at plugins/clang/codegen/adaptsignatureaction.cpp:83)
    #10 ProblemNavigationContext::executeAction(int) (at kdevplatform/language/duchain/navigation/problemnavigationcontext.cpp:258)
    #11 ProblemNavigationContext::executeKeyAction(QString const&) (at kdevplatform/language/duchain/navigation/problemnavigationcontext.cpp:243)
    const&) (at kdevplatform/language/duchain/navigation/abstractnavigationcontext.cpp:183)
    #13 AbstractNavigationContext::acceptLink(QString const&) (at kdevplatform/language/duchain/navigation/abstractnavigationcontext.cpp:487)
    #14 AbstractNavigationWidgetPrivate::anchorClicked

which ends at plugins/clang/codegen/adaptsignatureaction.cpp:83 with an `ENSURE_CHAIN_NOT_LOCKED` macro, which asserts. However, the lock in `anchorClicked` was added there in commit ff72bc32 to fix bug 386901 ([1]) so it cannot just be removed.

The call chain triggering the 386901 bug looks as follows:

    #0 FunctionDefinition::declaration (at kdevplatform/language/duchain/functiondefinition.cpp:52)
    #1 FunctionDefinition::declaration (at kdevplatform/language/duchain/functiondefinition.cpp:52)
    #2 AbstractDeclarationNavigationContext::AbstractDeclarationNavigationContext (at kdevplatform/language/duchain/navigation/abstractdeclarationnavigationcontext.cpp:67)
    #3 DeclarationNavigationContext::AbstractDeclarationNavigationContext (at plugins/clang/duchain/navigationwidget.cpp:38)
    #4 ClangNavigationWidget::ClangNavigationWidget (at plugins/clang/duchain/navigationwidget.cpp:98)
    #5 ClangDUContext<KDevelop::TopDUContext, 140>::createNavigationWidget (at plugins/clang/duchain/clangducontext.cpp:46)
    #6 AbstractNavigationContext::registerChild (at kdevplatform/language/duchain/navigation/abstractnavigationcontext.cpp:281)
    #7 AbstractNavigationContext::execute (at kdevplatform/language/duchain/navigation/abstractnavigationcontext.cpp:201)
    #8 AbstractNavigationContext::acceptLink (at kdevplatform/language/duchain/navigation/abstractnavigationcontext.cpp:487)
    #9 AbstractNavigationWidgetPrivate::anchorClicked (at kdevplatform/language/duchain/navigation/abstractnavigationwidget.cpp:285)

which hits an assert at kdevplatform/language/duchain/functiondefinition.cpp:52 in the `ENSURE_CAN_READ` macro.

This commit moves the lock from `anchorClicked` into `AbstractNavigationContext::registerChild`, which is the last opportunity for a lock before a language-plugin specific method is called (so that the bug does not reappear in other language plugins).

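The effect of narrowing the lock scope can be illustrated with a small model. This is a hedged sketch, not the code from the commit: `ScopedRead`, `pluginCreateWidget`, `runAction`, `acceptLinkWide`, and `acceptLinkNarrow` are hypothetical names invented for the illustration, and a per-thread counter stands in for the DUChain read lock. The model only assumes what the commit message states: one path needs a read lock before plugin code runs, and the other path requires that no lock is held.

```cpp
#include <cassert>

// Hypothetical model of the two call paths; not KDevelop code.
namespace scope_sketch {

// Read locks held by the current thread.
thread_local int readLocks = 0;

// RAII guard standing in for a read locker.
struct ScopedRead {
    ScopedRead() { ++readLocks; }
    ~ScopedRead() {
        assert(readLocks > 0);
        --readLocks;
    }
    ScopedRead(const ScopedRead&) = delete;
    ScopedRead& operator=(const ScopedRead&) = delete;
};

enum class Link { OpenChild, RunAction };

// Plugin code that reads shared data: succeeds only under a read lock.
bool pluginCreateWidget() { return readLocks > 0; }

// Action that will later take a write lock: succeeds only with no lock held.
bool runAction() { return readLocks == 0; }

// Wide scope: the lock is held across the whole dispatch.
bool acceptLinkWide(Link link) {
    ScopedRead lock;
    return link == Link::OpenChild ? pluginCreateWidget() : runAction();
}

// Narrow scope: the lock is taken only around the plugin call.
bool registerChild() {
    ScopedRead lock;
    return pluginCreateWidget();
}

bool acceptLinkNarrow(Link link) {
    return link == Link::OpenChild ? registerChild() : runAction();
}

} // namespace scope_sketch
```

With the wide scope, `acceptLinkWide(Link::RunAction)` returns `false` in this model, because the action runs while the read lock is still held. With the narrow scope, both `acceptLinkNarrow(Link::OpenChild)` and `acceptLinkNarrow(Link::RunAction)` return `true`.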
References
[1] https://bugs.kde.org/show_bug.cgi?id=386901
[2] https://phabricator.kde.org/D22182

Related: bug 416714
FIXED-IN: 5.8.220401

M +11 -0 kdevplatform/language/duchain/navigation/abstractnavigationcontext.cpp
M +0 -2 kdevplatform/language/duchain/navigation/abstractnavigationwidget.cpp

https://invent.kde.org/kdevelop/kdevelop/commit/a947074f0872ad3245b8c73679143998a88e3753