Summary: | Add possibility to copy formulas as MATHML/Latex Math/OO Math | ||
---|---|---|---|
Product: | [Applications] okular | Reporter: | Christoph Thielecke <crissi99> |
Component: | PDF backend | Assignee: | Okular developers <okular-devel> |
Status: | REPORTED --- | ||
Severity: | wishlist | CC: | cfeck, yurchor |
Priority: | NOR | ||
Version: | 0.20.2 | ||
Target Milestone: | --- | ||
Platform: | Other | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: |
Description
Christoph Thielecke
2015-01-05 12:20:14 UTC
Is there any other software able to extract formulas from PDF? To me it looks like a very hard problem, as soon as the formulas use multiple levels of text (fractions etc.)

(In reply to Christoph Feck from comment #1)
> Is there any other software able to extract formulas from PDF? To me it
> looks like a very hard problem, as soon as the formulas use multiple levels
> of text (fractions etc.)

MaxTract (development canceled) can do the extraction directly.
http://www.cs.bham.ac.uk/research/groupings/reasoning/sdag/maxtract.php

Infty Reader can do it using OCR.

Some thoughts on the problem can be found here (my tests confirm the conclusions of this paper, and nothing seems to have changed since 2011):
http://www.cs.bham.ac.uk/~aps/research/papers/pdf/BaSeSoSu-ICDAR11-ComparingApproachesToMathematicalDocumentAnalysisFromPDF.pdf

IMHO, it is hard to expect that free OCR engines like Ocropus/Tesseract can solve the problem in the near future. At least, I failed to train Tesseract to recognize even rather simple formulas.