ScanSnap, OCR and non-Latin scripts
Posted: Tue Feb 13, 2018 1:06 am
I'm using ScanSnap Manager with my Fujitsu ScanSnap to OCR a Korean-language book that I've unbound (Edit: it uses the ABBYY FineReader engine). The goal is to copy and paste Korean sentences into Anki or other electronic documents, since typing them out would take me an enormous amount of time.
I did the scan today. Under the "File Option" tab I set the language to Korean and the target pages to "All Pages", and of course checked "Convert to Searchable PDF". I also turned up the resolution and turned down the compression. In the end I got a beautiful PDF, but when I copied Korean text from it, even text that was easily readable and on a single line, it came out as complete gibberish when pasted into a text file.
Has anybody had success with OCR on non-Latin scripts?
I'm at a loss.
Best,
MIBG