Bug 230350 - Umbrello segfaults on opening xmi file
Summary: Umbrello segfaults on opening xmi file
Status: RESOLVED NOT A BUG
Alias: None
Product: umbrello
Classification: Applications
Component: general (show other bugs)
Version: unspecified
Platform: unspecified Unspecified
: NOR normal
Target Milestone: ---
Assignee: Umbrello Development Group
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-03-11 19:22 UTC by Eckhart Wörner
Modified: 2013-06-19 11:35 UTC (History)
4 users (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
xml.save.xmi fixed (10.51 KB, application/x-uml)
2013-06-18 18:20 UTC, Oliver Kellogg
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Eckhart Wörner 2010-03-11 19:22:36 UTC
Version:           2.4.1 (using 4.4.1 (KDE 4.4.1), Kubuntu packages)

Umbrello segfaults on opening the xmi file at http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=xml.save.xmi;att=2;bug=451079

(See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=451079 for more details about this file)
Comment 1 Ralf Habacker 2012-01-17 15:35:17 UTC
Umbrello crahsed on windows with recent svn code - the problem is a recursive call 

between umbrello.exe!Model_Utils::findObjectInList() and umbrello.exe!UMLPackage::findObjectById() 

until stack overflow. 

Stacktrace: 
msvcr100d.dll!_heap_alloc_base(unsigned int size)  Zeile 55	C
msvcr100d.dll!_heap_alloc_dbg_impl(unsigned int nSize, int nBlockUse, const char * szFileName, int nLine, int * errno_tmp)  Zeile 431 + 0x9 Bytes	C++
msvcr100d.dll!_nh_malloc_dbg_impl(unsigned int nSize, int nhFlag, int nBlockUse, const char * szFileName, int nLine, int * errno_tmp)  Zeile 239 + 0x19 Bytes	C++
msvcr100d.dll!_nh_malloc_dbg(unsigned int nSize, int nhFlag, int nBlockUse, const char * szFileName, int nLine)  Zeile 302 + 0x1d Bytes	C++
msvcr100d.dll!malloc(unsigned int nSize)  Zeile 56 + 0x15 Bytes	C++
msvcr100d.dll!operator new(unsigned int size)  Zeile 59 + 0x9 Bytes	C++
umbrello.exe!std::_Allocate<std::_Container_proxy>(unsigned int _Count, std::_Container_proxy * __formal)  Zeile 36 + 0x15 Bytes	C++
umbrello.exe!std::allocator<std::_Container_proxy>::allocate(unsigned int _Count)  Zeile 187 + 0xb Bytes	C++
umbrello.exe!std::_String_val<char,std::allocator<char> >::_String_val<char,std::allocator<char> >(std::allocator<char> _Al)  Zeile 469 + 0xa Bytes	C++
umbrello.exe!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::basic_string<char,std::char_traits<char>,std::allocator<char> >(const std::basic_string<char,std::char_traits<char>,std::allocator<char> > & _Right)  Zeile 543 + 0x50 Bytes	C++
umbrello.exe!UMLPackage::findObjectById(std::basic_string<char,std::char_traits<char>,std::allocator<char> > id)  Zeile 259 + 0x18 Bytes	C++
umbrello.exe!Model_Utils::findObjectInList(std::basic_string<char,std::char_traits<char>,std::allocator<char> > id, const UMLObjectList & inList)  Zeile 92 + 0x22 Bytes	C++
umbrello.exe!UMLPackage::findObjectById(std::basic_string<char,std::char_traits<char>,std::allocator<char> > id)  Zeile 259 + 0x20 Bytes	C++
umbrello.exe!Model_Utils::findObjectInList(std::basic_string<char,std::char_traits<char>,std::allocator<char> > id, const UMLObjectList & inList)  Zeile 92 + 0x22 Bytes	C++
umbrello.exe!UMLPackage::findObjectById(std::basic_string<char,std::char_traits<char>,std::allocator<char> > id)  Zeile 259 + 0x20 Bytes	C++
umbrello.exe!Model_Utils::findObjectInList(std::basic_string<char,std::char_traits<char>,std::allocator<char> > id, const UMLObjectList & inList)  Zeile 92 + 0x22 Bytes	C++
umbrello.exe!UMLPackage::findObjectById(std::basic_string<char,std::char_traits<char>,std::allocator<char> > id)  Zeile 259 + 0x20 Bytes	C++
Comment 2 Oliver Kellogg 2012-01-17 19:11:08 UTC
I'm not sure it's worthwhile to hunt this down - unfortunately 1.5.5 was a "bad" umbrello version. For example, even only glancing at the xmi file with the naked eye, I see there is a
   <UML:Stereotype xmi.id="2" name="folder" />
and a few lines later a
   <UML:DataType xmi.id="2" name="long" />
which are in conflict with their XMI IDs.
There may be more problems.

I suggest recreating the model with any umbrello >= version 1.5.8.
(Should not be too hard, given that the file only has a few objects.)
Comment 3 Ralf Habacker 2012-01-17 19:37:48 UTC
(In reply to comment #2)
> I'm not sure it's worthwhile to hunt this down - unfortunately 1.5.5 was a
> "bad" umbrello version. For example, even only glancing at the xmi file with
> the naked eye, I see there is a
>    <UML:Stereotype xmi.id="2" name="folder" />
> and a few lines later a
>    <UML:DataType xmi.id="2" name="long" />
> which are in conflict with their XMI ID
> There may be more problems.
> 
>I suggest recreating the model with any umbrello >= version 1.5.8.
>(Should not be too hard, given that the file only has a few objects.)

I agree - but i think that umbrello should not crash in any case - crashing applications gives user a bad expression: this software is unstable and not usable in production environments.
Comment 4 Ralf Habacker 2012-01-17 20:06:18 UTC
The file shows an internal design issue, which may be triggered also by other conditions. 

UMLObject * UMLPackage::findObjectById(Uml::IDType id)

calls 

    return Model_Utils::findObjectInList(id, m_objects);

which is implemented as 

UMLObject* findObjectInList(Uml::IDType id, const UMLObjectList& inList)
{
    for (UMLObjectListIt oit(inList); oit.hasNext(); ) {
        UMLObject *obj = oit.next();
        if (obj->id() == id)
            return obj;
        UMLObject *o;
        UMLObject::ObjectType t = obj->baseType();
        switch (t) {
        case UMLObject::ot_Folder:
        case UMLObject::ot_Package:
        case UMLObject::ot_Component:

here it calls the above mentioned UMLPackage::findObjectById 

           o = static_cast<UMLPackage*>(obj)->findObjectById(id);

Which means if the related id is not in the list for whatever reasons, there is an endless recursive loop. 

The question is, how to detect this condition ?
Comment 5 Oliver Kellogg 2012-01-18 19:40:44 UTC
(In reply to comment #4)
> The file shows an internal design issue, which may be triggered also by other
> conditions. 
> 
> UMLObject * UMLPackage::findObjectById(Uml::IDType id)
> 
> calls 
> 
>     return Model_Utils::findObjectInList(id, m_objects);
> 
> which is implemented as 
> 
> UMLObject* findObjectInList(Uml::IDType id, const UMLObjectList& inList)
> {
>     for (UMLObjectListIt oit(inList); oit.hasNext(); ) {
>         UMLObject *obj = oit.next();
>         if (obj->id() == id)
>             return obj;
>         UMLObject *o;
>         UMLObject::ObjectType t = obj->baseType();
>         switch (t) {
>         case UMLObject::ot_Folder:
>         case UMLObject::ot_Package:
>         case UMLObject::ot_Component:
> 
> here it calls the above mentioned UMLPackage::findObjectById 
> 
>            o = static_cast<UMLPackage*>(obj)->findObjectById(id);
> 
> Which means if the related id is not in the list for whatever reasons, there is
> an endless recursive loop. 
> 
> The question is, how to detect this condition ?

This could be solved by adding an optional arg, UMLPackage *self, at Model_Utils::findObjectInList:

  UMLObject* findObjectInList(Uml::IDType id,
                              const UMLObjectList& inList,
                              UMLPackage *self = 0); 

and replacing the line

  o = static_cast<UMLPackage*>(obj)->findObjectById(id);

by

  {
    UMLPackage tmpPkg = static_cast<UMLPackage*>(obj);
    if (tmpPkg == self)
      return obj;
    o = tmpPkg->findObjectById(id);
  }

Beware though that this is still a pathological case. It would mean that the UMLPackage contains itself - which most likely wreaks havoc elsewhere in the code.
Perhaps it would be better to issue an error and return NULL in that case.
Comment 6 Ralf Habacker 2012-01-19 15:30:01 UTC
SVN commit 1274699 by habacker:

Made xmi loading more robust by adding guards to prevent cycles in uml objects.

There are still problems with cycles in UMLListViewItem::deepcopy(), which will be
addressed in an additional patch.


 M  +5 -0      package.cpp  
 M  +12 -1     umlobject.cpp  
 M  +1 -1      umlobject.h  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1274699
Comment 7 Ralf Habacker 2013-06-13 15:48:49 UTC
This problem is caused by a broken file, which contains multiple occurence of the same xmi.id's  

line 16:    <UML:Model stereotype="2" isSpecification="false" isLeaf="false" visibility="public" namespace="m1" xmi.id="1" isRoot="false" isAbstract="false" name="Logical View" >
31        <UML:DataType stereotype="72" isSpecification="false" isLeaf="false" visibility="public" namespace="8" xmi.id="1" isRoot="false" isAbstract="false" name="int" />
Comment 8 Maximiliano Curia 2013-06-18 09:05:13 UTC
Shouldn't this format issue be sanitized instead of looping forever? (it doesn't segfaults for me, but it loops forever).

Also the original submitter (in the debian bug) claims that the file, the file you mentioned, was generated by an older version of umbrello, so maybe this bug needs to be retitled, but I don't see it as invalid.
Comment 9 Ralf Habacker 2013-06-18 10:48:15 UTC
(In reply to comment #8)
> Shouldn't this format issue be sanitized instead of looping forever? (it
> doesn't segfaults for me, but it loops forever).
> 
> Also the original submitter (in the debian bug) claims that the file, the
> file you mentioned, was generated by an older version of umbrello, 
> so maybe this bug needs to be retitled, but I don't see it as invalid.

umbrello has already gotten some work to be loop resistant, but in all those cases there are uniq element id's, which is not the case here. 

Here i would classify the file as corrupt, because umbrello will not be able to load this file  without loosing contained information. If there are two equal id's and they are referenced from other elements, how should umbrello be able to say which reference is for which element ?  Should it skip those elements and all references or assign all references to one element and re'id the other ?  And how will the user be able to see this corruption visual ? 

In fact when umbrello is directed to load such a file it should cancel file loading and report a "file cannot be load, because it is corrupt" or similar message to the user. 

>  was generated by an older version of umbrello, 
has this been reported as major bug in that version ? Creating corrupt files looks like a major bug to me.
Comment 10 Oliver Kellogg 2013-06-18 18:20:29 UTC
Created attachment 80618 [details]
xml.save.xmi fixed

(In reply to comment #9)
> [...]
> > was generated by an older version of umbrello,
> has this been reported as major bug in that version ? Creating corrupt files looks like a major bug to me.

Certainly. As said in comment #2, 1.5.5 was a known "bad" version, and we are not supporting that version any more.
Nevertheless I made an exact analysis of the reproducer file, and the attachment fixes the problems of that file.

Here are the problems:

1) Duplicated xmi.id values

UML:Model      xmi.id="1" name="Logical View"
UML:DataType   xmi.id="1" name="int"

UML:Stereotype xmi.id="2" name="folder"
UML:DataType   xmi.id="2" name="long"

UML:Model      xmi.id="3" name="Use Case View"
UML:DataType   xmi.id="3" name="String"

UML:Model      xmi.id="4" name="Component View"
UML:DataType   xmi.id="4" name="double"

UML:Model      xmi.id="5" name="Deployment View"
UML:DataType   xmi.id="5" name="date"

UML:Model      xmi.id="6" name="Entity Relationship Model"
UML:DataType   xmi.id="6" name="CREATE_DATE"

UML:Package    xmi.id="8" name="Datatypes"
UML:DataType   xmi.id="8" name="NUMBER2"

Solution: Add 100 to xmi.ids of <UML:DataType>s, for example:
  UML:DataType   xmi.id="108" name="NUMBER2"
and adjust all xmi.id references to those datatypes.

2)
<UML:Stereotype isSpecification="false" isLeaf="false" visibility="public"
             namespace="1" xmi.id="72" isRoot="false" isAbstract="false" name="datatype" />
Changed      namespace="1"
to           namespace="m1"
Comment 11 Maximiliano Curia 2013-06-19 11:35:22 UTC
¡Hola Ralf!

El 2013-06-18 a las 10:48 +0000, Ralf Habacker escribió:
> In fact when umbrello is directed to load such a file it should cancel file
> loading and report a "file cannot be load, because it is corrupt" or similar
> message to the user. 

This sounds like a good aproach. So, should this bug be retitled?

> >  was generated by an older version of umbrello, 
> has this been reported as major bug in that version ? Creating corrupt files
> looks like a major bug to me.

It's reported for version 3.5.5 in debian. The debian bug:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=451079#5

has the exact wording of the original submitter.