Bug 476708 - valgrind-monitor.py regular expressions should use raw strings
Summary: valgrind-monitor.py regular expressions should use raw strings
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general
Version: unspecified
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Mark Wielaard
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-11-08 09:54 UTC by Mark Wielaard
Modified: 2023-11-17 12:43 UTC

See Also:
Latest Commit:
Version Fixed In:



Description Mark Wielaard 2023-11-08 09:54:59 UTC
/usr/share/gdb/auto-load/valgrind-monitor-def.py:214: SyntaxWarning: invalid escape sequence '\['
  if re.fullmatch("^0x[0123456789ABCDEFabcdef]+\[[^\[\]]+\]$", arg_str):
Loaded /usr/share/gdb/auto-load/valgrind-monitor.py

According to https://docs.python.org/dev/whatsnew/3.12.html#other-language-changes

A backslash-character pair that is not a valid escape sequence now generates a SyntaxWarning, instead of DeprecationWarning. For example, re.compile("\d+\.\d+") now emits a SyntaxWarning ("\d" is an invalid escape sequence, use raw strings for regular expression: re.compile(r"\d+\.\d+")). In a future Python version, SyntaxError will eventually be raised, instead of SyntaxWarning. (Contributed by Victor Stinner in gh-98401.)

Using a raw string does indeed get rid of the SyntaxWarning:

if re.fullmatch(r"^0x[0123456789ABCDEFabcdef]+\[[^\[\]]+\]$", arg_str):

Probably all regexps should use raw strings if they contain escape sequences.
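
The difference between the two spellings can be checked without touching the installed script. The following is a minimal sketch, not part of the valgrind sources: the helper name compile_and_collect and the "<example>" filename are made up for illustration. It compiles a one-line program for each spelling and records the warnings the compiler emits. On Python 3.12 the non-raw spelling is expected to produce a SyntaxWarning; older Python 3 releases report a DeprecationWarning instead.

```python
import warnings

# One-line programs holding the pattern from line 214, once as a plain
# string literal and once as a raw string literal. They are stored in raw
# strings here so that this example itself compiles without warnings.
NON_RAW_SRC = r'pattern = "^0x[0123456789ABCDEFabcdef]+\[[^\[\]]+\]$"'
RAW_SRC = r'pattern = r"^0x[0123456789ABCDEFabcdef]+\[[^\[\]]+\]$"'


def compile_and_collect(src):
    """Compile and run src; return (pattern value, warning categories)."""
    namespace = {}
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        code = compile(src, "<example>", "exec")
        exec(code, namespace)
    return namespace["pattern"], [w.category for w in caught]


non_raw_pattern, non_raw_categories = compile_and_collect(NON_RAW_SRC)
raw_pattern, raw_categories = compile_and_collect(RAW_SRC)

print("non-raw:", [c.__name__ for c in non_raw_categories])
print("raw:    ", [c.__name__ for c in raw_categories])
# Unrecognized escapes are kept verbatim, so both spellings yield the same text.
print("same pattern text:", non_raw_pattern == raw_pattern)
```

Both spellings produce the same pattern text, because Python leaves an unrecognized escape such as \[ in the string unchanged. The raw string only removes the warning; it does not change what the regular expression matches.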
Comment 1 Paul Floyd 2023-11-09 07:43:46 UTC
I'm not sure what the \[ is doing in the complemented character set. I assume that this is looking for

'literal ['
'a bunch of non-square bracket characters'
'literal ]'
Comment 2 Mark Wielaard 2023-11-17 12:22:55 UTC
(In reply to Paul Floyd from comment #1)
> I'm not sure what the \[ is doing in the complemented character set. I
> assume that this is looking for
> 
> 'literal ['
> 'a bunch of non-square bracket characters'
> 'literal ]'

Yes, that is what it is for. \[ is a valid escape in a regular expression, but not in a regular Python string literal.
That is why we get that SyntaxWarning. Using raw strings makes sure Python doesn't try to interpret escape sequences.
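
To make the three parts discussed above concrete, here is a small sketch that runs the raw-string pattern against a few inputs. The function name matches_bracketed_hex and the sample inputs are invented for illustration; they are not taken from valgrind-monitor-def.py.

```python
import re

# Raw-string form of the pattern from the bug report:
#   0x[0-9A-Fa-f]+   a hexadecimal number with the 0x prefix
#   \[               a literal [
#   [^\[\]]+         one or more characters that are neither [ nor ]
#   \]               a literal ]
BRACKETED_HEX = re.compile(r"^0x[0123456789ABCDEFabcdef]+\[[^\[\]]+\]$")


def matches_bracketed_hex(arg_str):
    """Return True if arg_str looks like 0xHEX[something]."""
    return BRACKETED_HEX.fullmatch(arg_str) is not None


# Hypothetical inputs, chosen only to exercise each part of the pattern.
for sample in ["0x4a2f[16]", "0x4a2f", "0x4a2f[]", "0x4a2f[1[2]]", "4a2f[16]"]:
    print(f"{sample!r}: {matches_bracketed_hex(sample)}")
```

Only the first sample matches. The others fail because the bracketed part is missing, is empty, contains a nested bracket, or because the 0x prefix is absent.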
Comment 3 Mark Wielaard 2023-11-17 12:43:21 UTC
commit 0fbfbe05028ad18efda786a256a2738d2c231ed4
Author: Mark Wielaard <mark@klomp.org>
Date:   Fri Nov 17 13:31:52 2023 +0100

    valgrind-monitor.py regular expressions should use raw strings
    
    With python 3.12 gdb will produce the following SyntaxWarning when
    loading valgrind-monitor-def.py:
    
      /usr/share/gdb/auto-load/valgrind-monitor-def.py:214:
      SyntaxWarning: invalid escape sequence '\['
        if re.fullmatch("^0x[0123456789ABCDEFabcdef]+\[[^\[\]]+\]$", arg_str):
    
    In a future python version this will become a SyntaxError.
    
    Use raw strings for the regular expressions.
    
    https://bugs.kde.org/show_bug.cgi?id=476708