Bug 496972 - Issues with pcre2 (with Qt6)
Summary: Issues with pcre2 (with Qt6)
Status: RESOLVED WORKSFORME
Alias: None
Product: valgrind
Classification: Developer tools
Component: memcheck (other bugs)
Version First Reported In: 3.24 GIT
Platform: Compiled Sources All
Importance: NOR normal
Target Milestone: ---
Assignee: Paul Floyd
 
Reported: 2024-12-03 04:55 UTC by Paul Floyd
Modified: 2025-01-03 20:46 UTC (History)

Description Paul Floyd 2024-12-03 04:55:45 UTC
Recently I tried running a Qt app with memcheck. I got huge numbers of spurious-looking conditional jump errors. I couldn’t tell where they came from: the call stacks contained just 2 hex addresses. I suspected self-modifying code, but --smc-check=all didn’t help.

Using vgdb there was also no call stack at the error location. When I stepped out of the asm back into library code, it was pcre. I’m pretty sure that the asm is some kind of generated code for regular expression FSAs.
Comment 1 Paul Floyd 2025-01-02 17:39:08 UTC
The first error happens very early on.

Running under vgdb and using step-instruction to get back somewhere with a stack:

#0  0x0000000008523e0d in ?? () from /usr/local/lib/libpcre2-16.so.0
#1  0x0000000008523d46 in pcre2_jit_match_16 () from /usr/local/lib/libpcre2-16.so.0
#2  0x000000000855c19f in pcre2_match_16 () from /usr/local/lib/libpcre2-16.so.0
#3  0x0000000006507f52 in ?? () from /usr/local/lib/qt6/libQt6Core.so.6
#4  0x0000000006507d2f in ?? () from /usr/local/lib/qt6/libQt6Core.so.6
#5  0x00000000065089de in QRegularExpression::match(QString const&, long long, QRegularExpression::MatchType, QFlags<QRegularExpression::MatchOption>) const ()
   from /usr/local/lib/qt6/libQt6Core.so.6
#6  0x0000000006304ed6 in ?? () from /usr/local/lib/qt6/libQt6Core.so.6
#7  0x00000000063049de in ?? () from /usr/local/lib/qt6/libQt6Core.so.6
#8  0x00000000063054f5 in QDirIterator::nextFileInfo() () from /usr/local/lib/qt6/libQt6Core.so.6
#9  0x00000000062fa62d in QDir::entryInfoList(QList<QString> const&, QFlags<QDir::Filter>, QFlags<QDir::SortFlag>) const () from /usr/local/lib/qt6/libQt6Core.so.6
#10 0x0000000000210a8a in ?? ()
#11 0x00000000002135ac in ?? ()
#12 0x0000000006844a6a in __libc_start1 (argc=1, argv=0x1ffffff408, env=0x1ffffff418, cleanup=<optimized out>, mainX=0x211d10) at /usr/src/lib/libc/csu/libc_start1.c:157
#13 0x00000000002106e0 in ?? ()

Going up to that call to "match" and looking at the first argument

(gdb) x /500c $rdi
0xccbbfc0:      83 'S'  0 '\000'        111 'o' 0 '\000'        117 'u' 0 '\000'        114 'r' 0 '\000'
0xccbbfc8:      99 'c'  0 '\000'        101 'e' 0 '\000'        67 'C'  0 '\000'        111 'o' 0 '\000'
0xccbbfd0:      100 'd' 0 '\000'        101 'e' 0 '\000'        80 'P'  0 '\000'        114 'r' 0 '\000'
0xccbbfd8:      111 'o' 0 '\000'        46 '.'  0 '\000'        116 't' 0 '\000'        120 'x' 0 '\000'
0xccbbfe0:      116 't' 0 '\000'        0 '\000'        0 '\000'        0 '\000'        0 '\000'        0 '\000'        0 '\000'

Huh? I thought that "this" would be in $rdi. That looks like "SourceCodePro.txt" in wide characters. That's the name of a file in /usr/local/share/qtcreator/fonts.
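
The dump above is consistent with a little-endian UTF-16 buffer, which is how QString stores its data on amd64: each ASCII character shows up as its code followed by a zero byte. A minimal standalone sketch (plain C++, no Qt; the helper name `utf16le_bytes` is made up for illustration) that reproduces the byte pattern gdb printed:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: serialise UTF-16 code units as little-endian bytes,
// i.e. the order gdb shows when dumping such a buffer byte by byte on amd64.
std::vector<std::uint8_t> utf16le_bytes(const std::u16string &s)
{
    std::vector<std::uint8_t> out;
    out.reserve(s.size() * 2);
    for (char16_t unit : s) {
        out.push_back(static_cast<std::uint8_t>(unit & 0xFF));        // low byte first
        out.push_back(static_cast<std::uint8_t>((unit >> 8) & 0xFF)); // then high byte
    }
    return out;
}
```

For example, `utf16le_bytes(u"So")` yields 83, 0, 111, 0, matching the first four bytes shown at 0xccbbfc0.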

Maybe this code

void QFreeTypeFontDatabase::populateFontDatabase()
{
    QString fontpath = fontDir();
    QDir dir(fontpath);

    if (!dir.exists()) {
        qWarning("QFontDatabase: Cannot find font directory %s.\n"
                 "Note that Qt no longer ships fonts. Deploy some (from https://dejavu-fonts.github.io/ for example) or switch to fontconfig.",
                 qPrintable(fontpath));
        return;
    }

    static const QString nameFilters[] = {
        u"*.ttf"_s,
        u"*.pfa"_s,
        u"*.pfb"_s,
        u"*.otf"_s,
    };

    const auto fis = dir.entryInfoList(QStringList::fromReadOnlyData(nameFilters), QDir::Files);
    for (const QFileInfo &fi : fis) {
        const QByteArray file = QFile::encodeName(fi.absoluteFilePath());
        QFreeTypeFontDatabase::addTTFile(QByteArray(), file);
    }
}

Making that into a small exe as follows

#include <QDir>
#include <QString>
#include <QStringList>
#include <iostream>

using namespace Qt::Literals::StringLiterals;

int main()
{
    QString fontpath = "/usr/local/share/qtcreator/fonts";
    QDir dir(fontpath);

    static const QString nameFilters[] = {
        u"*.ttf"_s,
        u"*.pfa"_s,
        u"*.pfb"_s,
        u"*.otf"_s,
    };

    const auto fis = dir.entryInfoList(QStringList::fromReadOnlyData(nameFilters), QDir::Files);
    for (const QFileInfo &fi : fis) {
        const QByteArray file = QFile::encodeName(fi.absoluteFilePath());
        std::cout << "foo " << file.toStdString() << std::endl;
    }
}

produces

==26438== Memcheck, a memory error detector
==26438== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==26438== Using Valgrind-3.24.0.GIT and LibVEX; rerun with -h for copyright info
==26438== Command: ././build/Desktop-Debug/bug496972
==26438== 
==26438== Conditional jump or move depends on uninitialised value(s)
==26438==    at 0x6B15CA1: ???
==26438==    by 0x670C9AF: ???
==26438== 
==26438== Conditional jump or move depends on uninitialised value(s)
==26438==    at 0x6B14F41: ???
==26438==    by 0x670C9AF: ???
==26438== 
==26438== Conditional jump or move depends on uninitialised value(s)
==26438==    at 0x6B141E1: ???
==26438==    by 0x670C9AF: ???
==26438== 
==26438== Conditional jump or move depends on uninitialised value(s)
==26438==    at 0x6B13481: ???
==26438==    by 0x670C9AF: ???
==26438== 
foo /usr/local/share/qtcreator/fonts/SourceCodePro-Bold.ttf
foo /usr/local/share/qtcreator/fonts/SourceCodePro-BoldIt.ttf
foo /usr/local/share/qtcreator/fonts/SourceCodePro-It.ttf
foo /usr/local/share/qtcreator/fonts/SourceCodePro-Medium.ttf
foo /usr/local/share/qtcreator/fonts/SourceCodePro-MediumIt.ttf
foo /usr/local/share/qtcreator/fonts/SourceCodePro-Regular.ttf

With a debug libpcre2.so that's in

(gdb) list
58      local_stack.min_start = local_space;
59      local_stack.start = local_space;
60      local_stack.end = local_space + MACHINE_STACK_SIZE;
61      local_stack.top = local_space + MACHINE_STACK_SIZE;
62      arguments->stack = &local_stack;
63      return executable_func(arguments);
64      }

Still not sure if this is Valgrind not getting the SMC right or some probably harmless error in the jitted pcre2 expression.
Comment 2 Paul Floyd 2025-01-02 20:26:51 UTC
This isn't just with FreeBSD. I get the same errors with my small reproducer on Fedora 41 amd64. Need to test arm64.
Comment 3 Paul Floyd 2025-01-03 08:22:35 UTC
Started with arm (Raspberry Pi OS). No problem there. (Don't know if pcre2 uses jitting on that platform).

No problem on arm64 either (Ubuntu and FreeBSD).

So just amd64.
Comment 4 Paul Floyd 2025-01-03 09:56:42 UTC
On FreeBSD i386

==33163== Conditional jump or move depends on uninitialised value(s)
==33163==    at 0xEB46A34: get_cpu_features (sljitNativeX86_common.c:512)
==33163==    by 0xEB46978: init_compiler (sljitNativeX86_common.c:2872)
==33163==    by 0xEB37689: sljit_create_compiler (sljitLir.c:499)
==33163==    by 0xEB329AB: jit_compile (pcre2_jit_compile.c:14306)
==33163==    by 0xEB318CE: pcre2_jit_compile_16 (pcre2_jit_compile.c:14876)
==33163==    by 0xD56A2CA: ??? (in /usr/local/lib/qt6/libQt6Core.so.6.7.3)
==33163==    by 0xD56B591: QRegularExpression::match(QString const&, int, QRegularExpression::MatchType, QFlags<QRegularExpression::MatchOption>) const (in /usr/local/lib/qt6/libQt6Core.so.6.7.3)
==33163==    by 0xD36706C: ??? (in /usr/local/lib/qt6/libQt6Core.so.6.7.3)
==33163==    by 0xD366AE3: ??? (in /usr/local/lib/qt6/libQt6Core.so.6.7.3)
==33163==    by 0xD36636C: ??? (in /usr/local/lib/qt6/libQt6Core.so.6.7.3)
==33163==    by 0xD3675A5: QDirIterator::QDirIterator(QString const&, QList<QString> const&, QFlags<QDir::Filter>, QFlags<QDirIterator::IteratorFlag>) (in /usr/local/lib/qt6/libQt6Core.so.6.7.3)
==33163==    by 0xD35C5F0: QDir::entryInfoList(QList<QString> const&, QFlags<QDir::Filter>, QFlags<QDir::SortFlag>) const (in /usr/local/lib/qt6/libQt6Core.so.6.7.3)
==33163==  Uninitialised value was created by a stack allocation
==33163==    at 0xEB469F4: get_cpu_features (sljitNativeX86_common.c:503)
==33163== 
==33163== Conditional jump or move depends on uninitialised value(s)
==33163==    at 0xEB46A91: get_cpu_features (sljitNativeX86_common.c:523)
==33163==    by 0xEB46978: init_compiler (sljitNativeX86_common.c:2872)
==33163==    by 0xEB37689: sljit_create_compiler (sljitLir.c:499)
==33163==    by 0xEB329AB: jit_compile (pcre2_jit_compile.c:14306)
==33163==    by 0xEB318CE: pcre2_jit_compile_16 (pcre2_jit_compile.c:14876)
==33163==    by 0xD56A2CA: ??? (in /usr/local/lib/qt6/libQt6Core.so.6.7.3)
==33163==    by 0xD56B591: QRegularExpression::match(QString const&, int, QRegularExpression::MatchType, QFlags<QRegularExpression::MatchOption>) const (in /usr/local/lib/qt6/libQt6Core.so.6.7.3)
==33163==    by 0xD36706C: ??? (in /usr/local/lib/qt6/libQt6Core.so.6.7.3)
==33163==    by 0xD366AE3: ??? (in /usr/local/lib/qt6/libQt6Core.so.6.7.3)
==33163==    by 0xD36636C: ??? (in /usr/local/lib/qt6/libQt6Core.so.6.7.3)
==33163==    by 0xD3675A5: QDirIterator::QDirIterator(QString const&, QList<QString> const&, QFlags<QDir::Filter>, QFlags<QDirIterator::IteratorFlag>) (in /usr/local/lib/qt6/libQt6Core.so.6.7.3)
==33163==    by 0xD35C5F0: QDir::entryInfoList(QList<QString> const&, QFlags<QDir::Filter>, QFlags<QDir::SortFlag>) const (in /usr/local/lib/qt6/libQt6Core.so.6.7.3)
==33163==  Uninitialised value was created by a stack allocation
==33163==    at 0xEB469F4: get_cpu_features (sljitNativeX86_common.c:503)
Comment 5 Paul Floyd 2025-01-03 12:16:56 UTC
I don't know if the i386 issue is the same thing.

The code is doing a 'cpuid' and then checking what is in eax. I can't see anything wrong particularly. The pcre2 code looks similar to none/tests/x86/cpuid and I see the same values being returned. No error from the Valgrind 'none' test though.
Comment 6 Paul Floyd 2025-01-03 14:50:25 UTC
No, the i386 errors are unrelated. It's a bug in sljit (the JIT compiler used by pcre2). This patch fixes it:

https://github.com/zherczeg/sljit/commit/db3ca5014f0ae524785be05f071addc45a3442e0

If I correctly initialize 'info' then I get no errors on FreeBSD i386 with my small reproducer.
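
For readers unfamiliar with this class of report: memcheck complains when a value derived from never-written stack memory reaches a branch. The following is only a sketch of that general pattern, not the sljit code. `query_features` and `has_feature` are hypothetical stand-ins, and the assumption that the wrapper reads some slots of `info` as inputs before overwriting them is mine; the actual fix is in the commit linked above.

```cpp
#include <array>
#include <cstdint>

using Info = std::array<std::uint32_t, 4>;

// Hypothetical stand-in for a cpuid wrapper (NOT the sljit function):
// info[0] and info[2] are read as inputs (leaf and sub-leaf), then all
// four slots are overwritten with results derived from those inputs.
void query_features(Info &info)
{
    const std::uint32_t leaf = info[0];
    const std::uint32_t subleaf = info[2];
    info = {leaf + 1, 0u, subleaf, 0u}; // fake results, enough to show the data flow
}

bool has_feature()
{
    // Value-initialising zeroes every slot. With a plain `Info info;` and
    // only info[0] assigned, info[2] would be read uninitialised inside
    // query_features, and the comparison below would depend on it.
    Info info{};
    info[0] = 7; // leaf number chosen arbitrarily for the sketch
    query_features(info);
    return info[2] == 0;
}
```

With `Info info{}` every input slot is defined, so nothing undefined can propagate into the final comparison.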

Back to amd64.
Comment 7 Paul Floyd 2025-01-03 20:46:52 UTC
Qt has an env var that turns off jitting with pcre2. I'll add that to the FAQ and close this for the moment. I might come back to it later.