| Summary: | Application becomes unresponsive when searching for rare words in a large file (>2.4 GB) | | |
|---|---|---|---|
| Product: | [Applications] kate | Reporter: | ldargevicius20 |
| Component: | search | Assignee: | KWrite Developers <kwrite-bugs-null> |
| Status: | REPORTED | Resolution: | --- |
| Severity: | minor | CC: | christoph, ldargevicius20 |
| Priority: | NOR | | |
| Version First Reported In: | 25.12.0 | | |
| Target Milestone: | --- | | |
| Platform: | Arch Linux | | |
| OS: | Linux | | |
| Latest Commit: | | Version Fixed/Implemented In: | |
| Sentry Crash Report: | | | |
You could profile it with perf. The code for in-document search is in KTextEditor.
SUMMARY

When searching for a word that appears only once or twice in a large file (>2.4 GB, based on my testing), the application becomes unresponsive (the UI freezes) for some time depending on file size: 20-60 seconds on my machine with a 2.5 GB text file.

STEPS TO REPRODUCE
1. Open the large file in Kate
2. Search for a word that appears only rarely
3. Observe the UI freeze

OBSERVED RESULT

The UI becomes unresponsive.

EXPECTED RESULT

The UI should remain responsive and/or show progress.

SOFTWARE/OS VERSIONS
- Kate 25.12.0-1 from the official Arch Linux repository (Arch Linux)
- Kate built from source on 2026-01-02 (Arch Linux)
- Kate 25.12.0 from the Fedora repository (Fedora Linux 43)

ADDITIONAL INFORMATION

To reproduce the issue, I created synthetic test data using a simple Python script:

```python
import os
import random
import string

OUTPUT_FILE = "large_test_file_2-5gb.txt"
FILE_SIZE_IN_GB = 2.5
TARGET_SIZE_BYTES = int(FILE_SIZE_IN_GB * pow(1024, 3))
CHUNK_SIZE = pow(1024, 2)  # generate roughly 1 MiB at a time
MIN_WORD_LEN = 3
MAX_WORD_LEN = 12
WORDS_PER_LINE = 1000


def generate_chunk(target_bytes):
    """Build at least target_bytes of random lowercase 'words'."""
    lines = []
    size = 0
    while size < target_bytes:
        line_words = (
            ''.join(random.choices(string.ascii_lowercase,
                                   k=random.randint(MIN_WORD_LEN, MAX_WORD_LEN)))
            for _ in range(WORDS_PER_LINE)
        )
        line = " ".join(line_words) + "\n"
        lines.append(line)
        size += len(line)
    return "".join(lines)


def create_file():
    print("Program started!")
    written = 0
    with open(OUTPUT_FILE, "w", encoding="utf-8") as f:
        while written < TARGET_SIZE_BYTES:
            remaining = TARGET_SIZE_BYTES - written
            chunk_size = min(CHUNK_SIZE, remaining)
            chunk = generate_chunk(chunk_size)
            f.write(chunk)
            written += len(chunk.encode("utf-8"))
            # print progress roughly every 100 MiB
            if written % (100 * 1024 * 1024) < CHUNK_SIZE:
                print(f"Written: {written / (1024**3):.2f} GB")
    print("Program done!")
    print("Final size:", os.path.getsize(OUTPUT_FILE))


create_file()
```

I'll try to look into this problem to understand what's wrong. If you have any suggestions on how to solve it, I'd really appreciate hearing them.
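For anyone who wants a quick feel for the problem before generating a full 2.5 GB file, here is a scaled-down sketch (the 4 MiB target, the fixed seed, and the `zzzzqqqqzzzz` needle are my own arbitrary choices, not from the report) that builds a small in-memory sample with the same word scheme and times a naive substring scan for a word that is almost certainly absent:

```python
import random
import string
import time

MIN_WORD_LEN, MAX_WORD_LEN, WORDS_PER_LINE = 3, 12, 1000
TARGET_BYTES = 4 * 1024 * 1024  # 4 MiB, scaled down from 2.5 GB

random.seed(0)  # deterministic output so runs are repeatable
lines = []
size = 0
while size < TARGET_BYTES:
    words = (
        ''.join(random.choices(string.ascii_lowercase,
                               k=random.randint(MIN_WORD_LEN, MAX_WORD_LEN)))
        for _ in range(WORDS_PER_LINE)
    )
    line = " ".join(words) + "\n"
    lines.append(line)
    size += len(line)
text = "".join(lines)

# A 12-letter needle is effectively guaranteed not to occur in a few MiB
# of random lowercase text, mimicking the "rare word" worst case where
# the search must scan the whole buffer.
needle = "zzzzqqqqzzzz"
start = time.perf_counter()
count = text.count(needle)
elapsed = time.perf_counter() - start
print(f"scanned {len(text) / 2**20:.1f} MiB, {count} hits, {elapsed * 1000:.2f} ms")
```

Extrapolating the measured MiB/s to 2.5 GB gives a rough lower bound on how long a single blocking pass over the document would take; the freeze in Kate is presumably dominated by per-line search overhead on top of that raw scan cost.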