Bug 256418 - Docbook entities are probably misused
Summary: Docbook entities are probably misused
Status: RESOLVED DOWNSTREAM
Alias: None
Product: docs.kde.org
Classification: Websites
Component: general
Version: unspecified
Platform: unspecified Unspecified
Importance: NOR normal
Target Milestone: ---
Assignee: Documentation Editorial Team
Reported: 2010-11-09 07:05 UTC by David E. Narvaez
Modified: 2010-11-09 21:52 UTC


Attachments
A list of files in kde-l10n sources declaring entities in a suspicious way (161.58 KB, text/plain)
2010-11-09 07:05 UTC, David E. Narvaez
Description David E. Narvaez 2010-11-09 07:05:23 UTC
Created attachment 53266
A list of files in kde-l10n sources declaring entities in a suspicious way

Version:           unspecified (using Devel) 
OS:                unspecified

I stumbled upon this while analyzing bug reports like

https://bugs.gentoo.org/show_bug.cgi?id=343523

where docbook complains about entities not declared. According to

http://www.w3schools.com/dtd/dtd_entities.asp

you declare an entity using syntax like

<!ENTITY kappname "kate">

and then use the entity as &kappname; in the rest of the document. It looks like a widespread practice in the translated documents of kde-l10n to use a syntax similar to

<!ENTITY kappname "&kate;">

and then use &kate; in the rest of the documents. I checked out the whole kde-l10n project, ran a script against the docbook files in these sources, and found at least 1993 files using this syntax. Is this a bug, or is it intended to serve some other purpose?

I'm attaching the file generated by my script in case it helps.
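
The script itself is not part of this report. A minimal sketch of such a scan, assuming Python and a regular expression that only matches declarations whose entire value is a single entity reference, might look like the following. The function names find_nested_entities and scan_tree are hypothetical and not taken from the original script.

```python
import re
from pathlib import Path

# Matches declarations whose whole value is one entity reference,
# e.g. <!ENTITY kappname "&kate;">  (assumed pattern, not the original script's)
NESTED_ENTITY = re.compile(
    r"<!ENTITY\s+(?P<name>[\w.-]+)\s+"
    r"(?P<quote>[\"'])&(?P<target>[\w.-]+);(?P=quote)\s*>"
)


def find_nested_entities(text):
    """Return (name, target) pairs for entities defined as another entity."""
    return [(m.group("name"), m.group("target"))
            for m in NESTED_ENTITY.finditer(text)]


def scan_tree(root):
    """Map each .docbook file under root to its nested entity declarations."""
    hits = {}
    for path in sorted(Path(root).rglob("*.docbook")):
        text = path.read_text(encoding="utf-8", errors="replace")
        found = find_nested_entities(text)
        if found:
            hits[str(path)] = found
    return hits
```

Calling scan_tree on a kde-l10n checkout directory would list every docbook file that declares an entity in the second form, together with the entity it points to. Plain declarations such as <!ENTITY kappname "kate"> are deliberately not reported.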

Reproducible: Always

Steps to Reproduce:
1) Check out the kde-l10n sources
Comment 1 Albert Astals Cid 2010-11-09 09:42:16 UTC
I have no idea whether that is the correct syntax, but this is not a fault of the translations; it comes from the original docbooks, so I am moving this to the docs team.
Comment 2 Freek de Kruijf 2010-11-09 13:53:34 UTC
In my view this is certainly OK. It simply means that you can use &kappname; wherever you want to refer to the current application.
The advantage is that if the name of the application changes, which does happen, you do not need to change the messages. We had such a change from ktts to jovie, and there have been others.
Another advantage is that a message like "&kappname; handbook", which can be used in all docbooks, only needs to be translated once.
However, document writers do not use it very often.
So this is not a bug but a feature, and it may need some promotion.
Comment 3 Burkhard Lück 2010-11-09 18:17:10 UTC
(In reply to comment #0)
> Created an attachment (id=53266) [details]
> A list of files in kde-l10n sources declaring entities in a suspicious way
> 
> Version:           unspecified (using Devel) 
> OS:                unspecified
> 
> I stumbled upon this analyzing bug reports like
> 
> https://bugs.gentoo.org/show_bug.cgi?id=343523
> 
> where docbook complains about entities not declared. According to
> 
> http://www.w3schools.com/dtd/dtd_entities.asp
> 
> you declare an entity using syntax like
> 
> <!ENTITY kappname "kate">
> 
That is the exception: only 7 of 156 entity definitions in docbooks use this form.

> and then use the entity as &kappname; on the rest of the document. It looks
> like a widespread practice on the translation documents of kde-l10n to use a
> syntax similar to 
> 
> <!ENTITY kappname "&kate;">
> 
That is the major use case: 149 of 156 entity definitions in docbooks use this form.

> and then use &kate; on the rest of the documents. 

No, that is wrong: we then use &kappname; in the document.
The entity &kappname; is expanded in this way:
&kappname; -> &kate; -> <application>Kate</application>
Here an entity like &kate; is defined in general.entities in kdelibs.
A &kate; used directly in the document is likewise expanded via general.entities to <application>Kate</application>.
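
The expansion chain can be reproduced with any XML parser that resolves internal entities. The following is a minimal sketch using Python's standard xml.etree.ElementTree. As an assumption made only to keep the example self-contained, &kate; is declared inline in the DOCTYPE instead of being loaded from general.entities, and the helper name expand is hypothetical.

```python
import xml.etree.ElementTree as ET

# Stand-in document: in the real KDE setup &kate; comes from general.entities;
# here it is declared inline (an assumption) so that the example runs on its own.
CHAIN_DOC = """<!DOCTYPE article [
  <!ENTITY kate "<application>Kate</application>">
  <!ENTITY kappname "&kate;">
]>
<article><para>&kappname;</para></article>"""


def expand(document):
    """Parse the document and serialize it again with all entities expanded."""
    root = ET.fromstring(document)
    return ET.tostring(root, encoding="unicode")


print(expand(CHAIN_DOC))
# prints <article><para><application>Kate</application></para></article>
```

The parser resolves &kappname; to &kate; and then to the application element, so the nested declaration behaves exactly like a direct one as long as &kate; is declared somewhere.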

From http://docbook.org/tdg/en/html/ch01.html#s-entities:
<!ENTITY ora "O'Reilly &amp; Associates">
You see in this example that using an entity ("&amp;") in an entity definition is valid docbook syntax.
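
The same check can be made for the example from the DocBook guide. This is a small sketch with Python's standard xml.etree.ElementTree; the wrapping para document and the helper name entity_text are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# The declaration is the one quoted from the DocBook guide; the surrounding
# <para> document is a made-up wrapper so that the entity can be used.
ORA_DOC = """<!DOCTYPE para [
  <!ENTITY ora "O'Reilly &amp; Associates">
]>
<para>&ora;</para>"""


def entity_text(document):
    """Return the text of the root element after entity expansion."""
    return ET.fromstring(document).text


print(entity_text(ORA_DOC))
# prints O'Reilly & Associates
```

The reference &amp; inside the entity value is kept as a reference when the entity is declared and is only resolved when &ora; is used in the document.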

So the nested entity syntax is not the reason for Gentoo Bug 343523.

In #343523 I see two different issues:

1) Entity 'kdf' not defined
en_GB-4.4.5/docs/kdeutils/kinfocenter/blockdevices/index.docbook -> that is a broken language docbook.

2) Entity 'kpat' not defined
We had a problem in the KDE archive with man-kpat.6.docbook when switching from DTD 4.1 to DTD 4.2: the entity &kpat; had to be replaced with &kpatience; to make this docbook man page compile. I have no idea why this was necessary; man-kpat.6.docbook has been unchanged since before 4.0.

I no longer have a KDE 4.4 source environment available, so I can't dig into that issue further. My guess is that this is a Gentoo issue related to mixing DTD 4.1 and 4.2.

(In reply to comment #2)

> So this not a bug, but a feature and it may need some promotion.

Please read this discussion http://lists.kde.org/?l=kde-i18n-doc&m=123551535502590&w=2

Not all language teams share your PoV
Comment 4 David E. Narvaez 2010-11-09 18:57:03 UTC
(In reply to comment #3)
> > and then use &kate; on the rest of the documents. 
> 
> No, that is wrong, we then use &kappname; in the document. 
> This entitiy &kappname; is expanded in this way:
> &kappname; -> &kate; -> <application>Kate</application>
> Where an entity like &kate; is defined in general.entities in kdelibs
> And &kate; used in the document is expanded via general.entities to
> <application>Kate</application>
> 
> From http://docbook.org/tdg/en/html/ch01.html#s-entities:
> <!ENTITY ora "O'Reilly &amp; Associates">
> You see in this example that using an entity ("&amp;") in an entity definition
> is valid docbook syntax.

Thanks for the interest and for the explanation. I didn't know of the existence of the general.entities file, but as long as the &kate; entity (in this example) is defined somewhere, I agree it's fine to expand kappname to another entity. I'm not sure if "we then use" means "we should use" or "in fact, we use", but taking a look at, e.g.,

http://websvn.kde.org/trunk/l10n-kde4/ca/docs/kdelibs/sonnet/index.docbook?view=markup

you can easily check that &kappname; does not appear in the document (which wouldn't really break anything, since the sonnet entity was already declared). So it may all be just a matter of style or convenience, and we could close this report.

> That is not the reason for Gentoo Bug 343523.
> 
> In #343523 I see too different issues:
> 
> 1) Entity 'kdf' not defined
> en_GB-4.4.5/docs/kdeutils/kinfocenter/blockdevices/index.docbook -> that is a
> broken language docbook.
> 
> 2) Entity 'kpat' not defined
> We had a problem in the kde archiv with man-kpat.6.docbook switching from DTD
> 4.1 to DTD 4.2; the entity &kpat; had to be replaced with &kpatience; to make
> this docbook man page compilable. I have no idea why this was necessary,
> man-kpat.6.docbook is unchanged since < 4.0.
> 
> I have no 4.4 kde source environment available any more, so I can't dig into
> that issue further. My guess is that this is a Gentoo issue related to  mixing
> DTD 4.1/4.2

I was actually treating both things separately, but your insight on the kpat issue is great and I'll investigate along that line.

Thanks again.
Comment 5 Burkhard Lück 2010-11-09 21:52:36 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > > and then use &kate; on the rest of the documents. 
> > 
> > No, that is wrong, we then use &kappname; in the document. 
> > This entitiy &kappname; is expanded in this way:
> > &kappname; -> &kate; -> <application>Kate</application>
> > Where an entity like &kate; is defined in general.entities in kdelibs
> > And &kate; used in the document is expanded via general.entities to
> > <application>Kate</application>
> > 
> > From http://docbook.org/tdg/en/html/ch01.html#s-entities:
> > <!ENTITY ora "O'Reilly &amp; Associates">
> > You see in this example that using an entity ("&amp;") in an entity 
> > definition is valid docbook syntax.
> 
> Thanks for the interest and for the explanation. I didn't know of the existence
> of the general.entities file, but as long as the &kate; entity (in this
> example) is defined somewhere I agree it's fine to expand kappname to another
> entity. I'm not sure if "we then use"  means "we should use" or "in fact, we
> use" 

"we then use" does *not* mean "we should use" (for the reason, see the link in my reply to comment #2), but "in fact, we use": either directly as &kappname; in the docbook text, or via the docbook tool chain, e.g. in the entity &help.menu.documentation;. That's now the third or fourth level of entity redirection ;-)

> but taking a look at, e.g., 
> 
> http://websvn.kde.org/trunk/l10n-kde4/ca/docs/kdelibs/sonnet/index.docbook?view=markup
> 
The header of language docbooks is just a copy from the English docbook, so you have to look into kdelibs/doc/sonnet/index.docbook

> you can easily check &kappname; does not appear in the document (which wouldn't
> really break anything, since the sonnet entity was already declared) but then
> it may all be just a matter of style or convenience and we could close this
> report.
> 
Finally you got me ;-) Guilty!
With rev 1077440 I updated the sonnet docs and changed them to an article, in which the entity &kappname; is no longer used. I left the entity in the header, but defining an entity without using it does not matter and will never break docbook XML.

Thanks to Yuri's hint I know now that this is a Gentoo bug:

The entity &kpat; is defined in http://websvn.kde.org/tags/KDE/4.4.0/kdelibs/kdoctools/customization/obsolete/general.entities?view=log. This file with the entity was removed before 4.5.0 was released.

Obviously you are trying to build a language from KDE 4.4.5 with kdelibs >= 4.5.0.
Check whether you have a file $KDEDIR/share/apps/ksgmltools2/customization/obsolete/general.entities; I am sure you will not find it.
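
This check can also be scripted. The following is a minimal sketch in Python, assuming the path relative to $KDEDIR given above; the helper names declares_entity and check_obsolete_entity are hypothetical, and the declaration test is a simple text search, not a full DTD parse.

```python
import re
from pathlib import Path

# Path relative to $KDEDIR, as given in the comment above.
OBSOLETE_ENTITIES = "share/apps/ksgmltools2/customization/obsolete/general.entities"


def declares_entity(entities_text, name):
    """True if the text contains an <!ENTITY name ...> declaration."""
    pattern = r"<!ENTITY\s+" + re.escape(name) + r"\s"
    return re.search(pattern, entities_text) is not None


def check_obsolete_entity(kdedir, name="kpat"):
    """Report whether the obsolete entities file exists and declares `name`."""
    path = Path(kdedir) / OBSOLETE_ENTITIES
    if not path.is_file():
        return "missing"
    text = path.read_text(encoding="utf-8", errors="replace")
    return "declared" if declares_entity(text, name) else "not declared"
```

Calling check_obsolete_entity with the value of the KDEDIR environment variable returns "missing" on an installation where the obsolete file is no longer shipped, which is the situation described above for kdelibs >= 4.5.0.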

Closing as downstream.

Thanks for the report.