OCR improvements - excess spaces, superscripts, and hyphenated words

Loving the OCR feature, but am hoping for improvements on a few fronts.

  • Most if not all journals use superscripts for footnotes. Currently, notes don’t allow superscript formatting, so the OCR parses them as normally formatted numbers.

  • The OCR often adds extra spaces around punctuation. For example, “(a)” may show up as “( a )”.

  • Hyphenated words in PDFs. This has been brought up before, and I’m not sure if it’s an insurmountable technical limitation, but I’m wondering whether the OCR could detect words hyphenated at line breaks and automatically combine them.
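
To make the request concrete, here is a rough sketch of the kind of cleanup I have in mind for the spacing and hyphenation points. It is only an illustration: `tidy_ocr_text` is a made-up name, not part of the app or any OCR library, and it assumes the OCR output is plain text with line breaks preserved. It does nothing for superscripts, since that information is already lost by the time the text is plain.

```python
import re

def tidy_ocr_text(text: str) -> str:
    """Hypothetical post-processing pass over plain-text OCR output."""
    # Rejoin a word split by a hyphen at a line break: "limita-\ntion" -> "limitation".
    # Only joins when the next line starts with a lowercase letter.
    text = re.sub(r"(\w)-[ \t]*\n[ \t]*([a-z])", r"\1\2", text)
    # Drop spaces just inside opening brackets: "( a" -> "(a".
    text = re.sub(r"([(\[])[ \t]+", r"\1", text)
    # Drop spaces just inside closing brackets: "a )" -> "a)".
    text = re.sub(r"[ \t]+([)\]])", r"\1", text)
    return text

print(tidy_ocr_text("see ( a ) for the limita-\ntion"))
```

One known weakness: a genuine compound that happens to break at the hyphen (for example "well-" at the end of a line and "known" on the next) would be joined incorrectly, so a real implementation would probably need a dictionary check.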
