Bug 278593 - KDE 4.7 Doesn't index some specific files
Status: RESOLVED FIXED
Alias: None
Product: nepomuk
Classification: Miscellaneous
Component: general
Version: unspecified
Platform: Chakra Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Sebastian Trueg
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-07-27 03:15 UTC by Weng Xuetian
Modified: 2011-11-27 13:14 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
a jpg file that nepomuk ignores. (869.95 KB, image/jpeg)
2011-07-27 03:16 UTC, Weng Xuetian
Description Weng Xuetian 2011-07-27 03:15:32 UTC
Version:           unspecified (using Devel) 
OS:                Linux

I am already using KDE 4.7's kde-runtime try #2, so this is not a duplicate of bug 277536.

Nepomuk does start indexing, but it seems to ignore some files, such as C/C++ sources, png files, some jpg files, and some mp3 files (at least here).

NepomukServer complains that they contain some wrong values.

For C/C++ Source file:
[/usr/bin/nepomukservicestub] "/usr/bin/nepomukservicestub(3740)" Soprano: "Invalid argument (1)": "Unknown protocol '' encountered."

For png file:
"/usr/bin/nepomukservicestub(3740)" Soprano: "Invalid argument (1)": "Failed to convert 'None' to literal of type 'http://www.w3.org/2001/XMLSchema#boolean'."

For some mp3 files, it complains:
[/usr/bin/nepomukservicestub] "/usr/bin/nepomukservicestub(3740)" Soprano: "Invalid argument (1)": "<http://www.semanticdesktop.org/ontologies/2009/02/19/nmm#musicAlbum> has a rdfs:range of <http://www.semanticdesktop.org/ontologies/2009/02/19/nmm#MusicAlbum>. <_:c> only has the following types <http://www.w3.org/2000/01/rdf-schema#Resource>

For some jpg files with a "Create Date" field:
Strigi seems to give the date in the wrong format, "2009:10:27 21:31:22" (the date part should use "-" as the separator). Nepomuk Server also complains about it.
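
To illustrate the date problem: the value above uses ":" between year, month and day, while the XML Schema dateTime form is "YYYY-MM-DDThh:mm:ss". Below is a minimal sketch of a conversion, assuming the target is an xsd:dateTime literal and the input always has the 19-character layout shown above. The helper name exifToXsdDateTime is hypothetical and is not part of Strigi or Nepomuk.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Strigi or Nepomuk API): converts
// "YYYY:MM:DD HH:MM:SS" to "YYYY-MM-DDTHH:MM:SS".
// Returns an empty string if the input does not match that layout.
std::string exifToXsdDateTime(const std::string& exif) {
    // 'd' stands for a digit; every other character must match literally.
    static const std::string pattern = "dddd:dd:dd dd:dd:dd";
    if (exif.size() != pattern.size()) {
        return std::string();
    }
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        if (pattern[i] == 'd') {
            if (!std::isdigit(static_cast<unsigned char>(exif[i]))) {
                return std::string();
            }
        } else if (exif[i] != pattern[i]) {
            return std::string();
        }
    }
    std::string out = exif;
    out[4] = '-';   // year-month separator
    out[7] = '-';   // month-day separator
    out[10] = 'T';  // date-time separator
    return out;
}
```

For the value quoted above, exifToXsdDateTime("2009:10:27 21:31:22") yields "2009-10-27T21:31:22". The sketch only checks the layout; it does not validate ranges such as month 13, and it does not add a time zone.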

Reproducible: Always

Steps to Reproduce:
Let the Nepomuk server index a specific folder.
Put a real C++ file in it.

Actual Results:  
The C++ file is not indexed.

Expected Results:  
The C++ file is indexed correctly.

I will also attach some example files that are not indexed by Nepomuk.
Comment 1 Weng Xuetian 2011-07-27 03:16:25 UTC
Created attachment 62227
a jpg file that nepomuk ignores.
Comment 2 Vishesh Handa 2011-07-28 08:39:42 UTC
With 4.7 we do not accept data that does not conform to the ontologies. I'm not going to hack around in Nepomuk to handle incorrect data provided by Strigi. The Strigi analyzers should be fixed.

Btw, the music album issue was fixed by Trueg in Strigi, so please update to the latest version.
Comment 3 Weng Xuetian 2011-07-28 12:34:03 UTC
I think your statement makes sense, though it is not so good for users: they will find that some files cannot be found. Anyway, Strigi from git resolves most of the problems for me, including the music files, png, and jpeg.

It seems only the C/C++ source code problem is left for me. Is it also related to Strigi?
Comment 4 Vishesh Handa 2011-07-28 12:57:15 UTC
Yes, that is also related to Strigi. 

Though the error message given by Nepomuk isn't really helpful. I've improved the error message a little bit. I think I should backport it as well.

The problem is that Strigi produces nie:depends values for all C++ files.

E.g.:
<http://www.semanticdesktop.org/ontologies/2007/01/19/nie#depends> 
                "utils.h",
                "variant.h",
                "resourcemanager.h",
                "resource.h",
                "filequery.h",
                "comparisonterm.h",
                "andterm.h",
                "resourceterm.h",
                "resourcetypeterm.h",
                "optionalterm.h",
                "nie.h",
                "nfo.h",
                "nuao.h",
                "ndo.h",
                "kglobal.h",
                "klocale.h",
                "Soprano/Model",
                "Soprano/QueryResultIterator",
                "Soprano/NodeIterator";

nie:depends actually has a range of nie:DataObject, not a string, so the indexers need to produce correct URIs. I'm not entirely sure how they can do that, though.
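
One possible direction, shown only as a sketch: resolve a quoted include against the directory of the including file and emit a file URI instead of a bare string. Both function names below are hypothetical and are not Strigi or Nepomuk APIs. The sketch assumes an absolute POSIX path for the source file, does not check that the header exists, and does not cover headers found through the compiler's include path (such as "Soprano/Model" above), so it does not settle the open question.

```cpp
#include <string>

// Percent-encodes every byte except RFC 3986 unreserved characters and '/'.
std::string percentEncodePath(const std::string& path) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : path) {
        const bool keep = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                          (c >= '0' && c <= '9') || c == '-' || c == '.' ||
                          c == '_' || c == '~' || c == '/';
        if (keep) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Hypothetical helper: builds a candidate file URI for an include name by
// resolving it against the directory of the including source file.
// Returns an empty string if sourcePath is not absolute or include is empty.
std::string candidateIncludeUri(const std::string& sourcePath,
                                const std::string& include) {
    if (sourcePath.empty() || sourcePath[0] != '/' || include.empty()) {
        return std::string();
    }
    // sourcePath starts with '/', so rfind always finds a slash here.
    const std::string::size_type slash = sourcePath.rfind('/');
    const std::string dir = sourcePath.substr(0, slash + 1);
    return "file://" + percentEncodePath(dir + include);
}
```

With these assumptions, candidateIncludeUri("/home/user/src/main.cpp", "utils.h") gives "file:///home/user/src/utils.h". Whether such a URI refers to a resource Nepomuk already knows as a nie:DataObject is a separate matter that this sketch does not address.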
Comment 5 Weng Xuetian 2011-07-28 16:08:22 UTC
Thank you for the information. For now I would prefer to patch cpplineanalyzer.cpp to make it work.
Comment 6 Weng Xuetian 2011-11-26 10:34:41 UTC
Well, though this bug is marked as wontfix, some parts of it are resolved in recent code.

This part in cpplineanalyzer.cpp should also be commented out, like the part above it.

            if((pos4 != string::npos) && (pos5 != string::npos)){
                analysisResult->addValue(factory->includeField, include1.substr(1+pos4,((pos5-1)-pos4)));
                includes++;
            }
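
For readers without the full file at hand, here is a stand-alone sketch of what the substr expression in that snippet extracts. It assumes that pos4 and pos5 are the positions of the opening and closing double quote of the include line, which the snippet itself does not show. The function name quotedIncludeName is hypothetical.

```cpp
#include <string>

// Hypothetical stand-alone version of the extraction: returns the text
// between the first pair of double quotes, or an empty string if the line
// has no complete quoted part.
std::string quotedIncludeName(const std::string& line) {
    const std::string::size_type open = line.find('"');
    if (open == std::string::npos) {
        return std::string();
    }
    const std::string::size_type close = line.find('"', open + 1);
    if (close == std::string::npos) {
        return std::string();
    }
    // Same arithmetic as substr(1+pos4, (pos5-1)-pos4) in the snippet.
    return line.substr(1 + open, (close - 1) - open);
}
```

For the line #include "utils.h" this returns the bare string utils.h, which is the kind of value that was being added to the include field.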
Comment 7 Weng Xuetian 2011-11-27 13:14:57 UTC
Git commit cd1a7d2f3d92d834ef15ad7453820bcca49807c6 by Weng Xuetian.
Committed on 27/11/2011 at 14:09.
Pushed by xuetianweng into branch 'master'.

we need a useful new property or new DataObject type to describe a C
header which does not have an absolute path.
Comment it out for now.

BUG: 278593
REVIEW: 103258

M  +3    -1    plugins/lineplugins/cpplineanalyzer.cpp

http://commits.kde.org/libstreamanalyzer/cd1a7d2f3d92d834ef15ad7453820bcca49807c6