Bug 401512 - kig crashes with simple python script with a failing assertion
Summary: kig crashes with simple python script with a failing assertion
Status: RESOLVED FIXED
Alias: None
Product: kig
Classification: Applications
Component: general (show other bugs)
Version: unspecified
Platform: Fedora RPMs Linux
: NOR crash
Target Milestone: ---
Assignee: David E. Narvaez
URL:
Keywords:
: 409497 (view as bug list)
Depends on:
Blocks:
 
Reported: 2018-11-28 18:05 UTC by Maurizio Paolini
Modified: 2020-01-21 17:25 UTC (History)
4 users (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
kig file that makes kig crash (1.50 KB, application/x-kig)
2018-11-28 18:05 UTC, Maurizio Paolini
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Maurizio Paolini 2018-11-28 18:05:21 UTC
Created attachment 116551 [details]
kig file that makes kig crash

SUMMARY

kig crashes with:

kig: /usr/include/boost/python/object_core.hpp:422: boost::python::api::object_base::~object_base(): Assertion `Py_REFCNT(m_ptr) > 0' failed.

when trying to define a simple python script with a numeric label as single
argument and returning a DoubleObject or loading a kig file with such a construction.

STEPS TO REPRODUCE
1. Download the attachment "refcnt_bug.kig"
2. start kig with

  kig refcnt_bug.kig

OBSERVED RESULT

crash with the message:
kig: /usr/include/boost/python/object_core.hpp:422: boost::python::api::object_base::~object_base(): Assertion `Py_REFCNT(m_ptr) > 0' failed.


SOFTWARE/OS VERSIONS
I am using a Fedora 29 installation fully updated.
KDE Framework is kf5-5.50
QT5 5.11
boost-devel-1.66.0-14.fc29.x86_64
kig was downloaded from latest Git repository and compiled from source.
Python scripting support was correctly detected.

ADDITIONAL INFORMATION
The failing assert is contained in /usr/include/boost/python/object_core.hpp
a file contained in the Fedora package boost-devel-1.66.0-14.fc29.x86_64:

inline api::object_base::~object_base()
{
    assert( Py_REFCNT(m_ptr) > 0 );
    Py_DECREF(m_ptr);
}

It I comment the assert and recompile everything works apparently fine,
however the value of Py_REFCNT becomse -1 and then gets quite strange values (seen by adding a printf in place of the assert) indicating memory allocation problems.
Comment 1 Maurizio Paolini 2018-11-29 10:28:15 UTC
It seems that the first appearance of strange values for Py_REFCNT
appear in the retdict.get function called within

CompiledPythonScript PythonScripter::compile( const char* code )

if file scripting/python_scripter.cc
---------------- snippet ---------------
  ret->ref = 0;
  ret->calcfunc = retdict.get( "calc" );
----------------------------------------

I don't know whether the problem is really there and how to debug further
Comment 2 Maurizio Paolini 2019-12-16 15:11:18 UTC
The following diff:

-----------------------------------------------------
$ git diff
diff --git a/scripting/python_scripter.cc b/scripting/python_scripter.cc
index 62a7409b..57e09255 100644
--- a/scripting/python_scripter.cc
+++ b/scripting/python_scripter.cc
@@ -531,6 +531,10 @@ ObjectImp* PythonScripter::calc( CompiledPythonScript& script, const Args& args
     };
     tuple argstup( argstuph );
 
+// workaround?
+//
+    Py_INCREF (argstup.ptr());
+
     handle<> reth( PyEval_CallObject( calcfunc.ptr(), argstup.ptr() ) );
 //    object resulto = calcfunc( argstup );
 //    handle<> reth( PyEval_CallObject( calcfunc.ptr(), args ) );
-----------------------------------------------------

seems to "work around" the reported issue.  I do not know if this is a real fix because I have no experience with boost, but at least allows to use python scripting that otherwise seems to be completely broken (see the attached kig example)
Comment 3 David E. Narvaez 2019-12-22 20:28:47 UTC
*** Bug 409497 has been marked as a duplicate of this bug. ***
Comment 4 Kevin Kofler 2019-12-23 10:01:37 UTC
So, after looking at the code and the documentation, I suspect the issue is not really the reference count of the tuple itself, but of the individual arguments, which would explain why you do not hit this assertion if the function has no arguments.
Comment 5 Franco Pasquarelli 2019-12-23 10:19:31 UTC
Also the following diff solve the problem

diff --git a/scripting/python_scripter.cc b/scripting/python_scripter.cc
index 62a7409b..ada48861 100644
--- a/scripting/python_scripter.cc
+++ b/scripting/python_scripter.cc
@@ -527,13 +527,14 @@ ObjectImp* PythonScripter::calc( CompiledPythonScript& script, const Args& args
     handle<> argstuph( PyTuple_New( args.size() ) );
     for ( int i = 0; i < (int) objectvect.size(); ++i )
     {
+      Py_XINCREF((objectvect.begin() +i)->ptr());
       PyTuple_SetItem( argstuph.get(), i, (objectvect.begin() +i)->ptr() );
     };
+
+      
     tuple argstup( argstuph );
 
     handle<> reth( PyEval_CallObject( calcfunc.ptr(), argstup.ptr() ) );
-//    object resulto = calcfunc( argstup );
-//    handle<> reth( PyEval_CallObject( calcfunc.ptr(), args ) );
     object resulto( reth );
 
     extract<ObjectImp&> result( resulto );

where I increase the reference count of the arguments
Comment 6 David E. Narvaez 2019-12-23 18:31:23 UTC
(In reply to Kevin Kofler from comment #4)
> So, after looking at the code and the documentation, I suspect the issue is
> not really the reference count of the tuple itself, but of the individual
> arguments, which would explain why you do not hit this assertion if the
> function has no arguments.

I agree that the issue seems to be the refcount of the arguments, but how would you explain that the patch in comment #2 works around the issue?

On a separate note, the assertion was added 3 years ago here

https://github.com/boostorg/python/commit/bc2f77a3db0b3b428ef7bc205804ce5625e46001

so it seems this has been broken since Boost 1.63 or so?
Comment 7 Kevin Kofler 2019-12-23 18:41:10 UTC
The patch from comment #2 prevents the tuple from being de-refcounted all the way down to 0 (because it adds a bogus unowned reference), so the tuple is leaked, and its continued existence also keeps the arguments referenced. (It is the destruction of the tuple that decreases the reference count of each argument.)

On the other hand, the patch from comment #5 looks like the correct fix to me.
Comment 8 David E. Narvaez 2019-12-23 20:40:06 UTC
(In reply to Kevin Kofler from comment #7)
> The patch from comment #2 prevents the tuple from being de-refcounted all
> the way down to 0 (because it adds a bogus unowned reference), so the tuple
> is leaked, and its continued existence also keeps the arguments referenced.
> (It is the destruction of the tuple that decreases the reference count of
> each argument.)

Ah, makes sense.

> On the other hand, the patch from comment #5 looks like the correct fix to
> me.

I agree. Franco, can you post your patch to Phabricator? Against the release/19.12 branch please.
Comment 9 Maurizio Paolini 2019-12-23 22:17:12 UTC
(In reply to David E. Narvaez from comment #6)
> (In reply to Kevin Kofler from comment #4)
> > So, after looking at the code and the documentation, I suspect the issue is
> > not really the reference count of the tuple itself, but of the individual
> > arguments, which would explain why you do not hit this assertion if the
> > function has no arguments.
> 
> I agree that the issue seems to be the refcount of the arguments, but how
> would you explain that the patch in comment #2 works around the issue?
> 
> On a separate note, the assertion was added 3 years ago here
> 
> https://github.com/boostorg/python/commit/
> bc2f77a3db0b3b428ef7bc205804ce5625e46001
> 
> so it seems this has been broken since Boost 1.63 or so?

Yes, indeed!  I am no longer an active user, so I realized the presence of the problem after some time; then I posted the report only downstream for fedora.
In any case I opened this bug report more than one year ago.
Comment 10 Maurizio Paolini 2019-12-31 02:14:11 UTC
(In reply to David E. Narvaez from comment #8)
> (In reply to Kevin Kofler from comment #7)
> > The patch from comment #2 prevents the tuple from being de-refcounted all
> > the way down to 0 (because it adds a bogus unowned reference), so the tuple
> > is leaked, and its continued existence also keeps the arguments referenced.
> > (It is the destruction of the tuple that decreases the reference count of
> > each argument.)
> 
> Ah, makes sense.
> 
> > On the other hand, the patch from comment #5 looks like the correct fix to
> > me.
> 
> I agree. Franco, can you post your patch to Phabricator? Against the
> release/19.12 branch please.

On behalf of Franco I created a task on phabricator: https://phabricator.kde.org/T12453
however I am new to phabricator, so I don't know how to proceed.
Comment 11 Maurizio Paolini 2020-01-01 03:22:51 UTC
actually, the patch is in

https://phabricator.kde.org/differential/revision/edit/26333/

I am not sure that I am using phabricator correctly and couldn't find a place where to indicate that it should be used against release/19.12
Comment 12 Maurizio Paolini 2020-01-01 03:25:16 UTC
(In reply to Maurizio Paolini from comment #11)
> actually, the patch is in
> 
[WRONG] > https://phabricator.kde.org/differential/revision/edit/26333/
> 
> I am not sure that I am using phabricator correctly and couldn't find a
> place where to indicate that it should be used against release/19.12

oops, pardon me, the URL of the revision is:

https://phabricator.kde.org/D26333