RB wrote:
Quote:
Why do scanners/OCR programs not handle tables very well when saving to
editable text? They all seem to be afflicted with this malady.
Trying to get an editable table of columns of data from a scanned page has
turned out to be mission impossible.
Anyone know a simple way around this?
Hi RB,
I'm not an expert on recognition technology, but the primary challenge
is that OCR engines in general have difficulty handling matrices. The
vertical and horizontal ruling lines, combined with sometimes small
amounts of whitespace between words, can easily throw off recognition
engines: the lines can be misread as characters, and narrow gaps make it
hard to tell where one column ends and the next begins.
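To illustrate the whitespace problem, here is a rough Python sketch. It assumes your OCR program can give you each word with its left and right pixel positions; the `Word` tuple, the `split_row_into_cells` helper, and the sample numbers are all made up for illustration and are not taken from any particular product.

```python
from typing import List, Optional, Tuple

# Hypothetical OCR output for one text row: (text, x_left, x_right) in pixels.
Word = Tuple[str, int, int]

def split_row_into_cells(words: List[Word], min_gap: int) -> List[str]:
    """Group words into cells, starting a new cell when the gap is >= min_gap."""
    cells: List[List[str]] = []
    prev_right: Optional[int] = None
    for text, left, right in sorted(words, key=lambda w: w[1]):
        if prev_right is None or left - prev_right >= min_gap:
            cells.append([text])      # gap is wide enough: new column
        else:
            cells[-1].append(text)    # gap is narrow: same cell as previous word
        prev_right = right
    return [" ".join(cell) for cell in cells]

row = [("Widget", 10, 70), ("A", 76, 86), ("12", 110, 130), ("4.50", 160, 200)]
print(split_row_into_cells(row, min_gap=20))  # ['Widget A', '12', '4.50']
print(split_row_into_cells(row, min_gap=25))  # ['Widget A 12', '4.50']
```

A difference of only a few pixels in the gap threshold merges two columns, which is roughly what goes wrong when the columns on the page sit close together.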
Just like everything else, not all OCR engines are created equal. Some
will definitely perform better than others (general rule of thumb: "you
get what you pay for" usually applies).
If you have a heavy volume of these types of documents/forms, and they
can be categorised into standard forms, then OMR (optical mark
recognition) or forms-processing software could be a good solution. This
kind of software uses anchor points (black boxes on the corners of the
document) to line up each scan, then performs zonal OCR using a template
that you preset specific to that matrix/table.
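Here is a rough idea of how the template part could work, as a Python sketch. It assumes the software has already located two anchor points (top-left and bottom-right) on the scan; the `map_zones` helper, the zone names, and all coordinates are invented for illustration. It only corrects for offset and scale, not for rotation or skew.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (left, top, right, bottom)

def map_zones(template_zones: Dict[str, Box],
              template_anchors: Tuple[Point, Point],
              scan_anchors: Tuple[Point, Point]) -> Dict[str, Box]:
    """Map template zones onto a scan using top-left and bottom-right anchors."""
    (tx0, ty0), (tx1, ty1) = template_anchors
    (sx0, sy0), (sx1, sy1) = scan_anchors
    scale_x = (sx1 - sx0) / (tx1 - tx0)
    scale_y = (sy1 - sy0) / (ty1 - ty0)

    def to_scan(x: float, y: float) -> Point:
        # Shift relative to the template anchor, scale, then offset to the scan anchor.
        return (sx0 + (x - tx0) * scale_x, sy0 + (y - ty0) * scale_y)

    mapped: Dict[str, Box] = {}
    for name, (left, top, right, bottom) in template_zones.items():
        mapped[name] = to_scan(left, top) + to_scan(right, bottom)
    return mapped

# Hypothetical template: two table cells, defined once per form type.
zones = {"qty": (100, 200, 180, 220), "price": (200, 200, 300, 220)}
scan_zones = map_zones(zones, ((0, 0), (1000, 1400)), ((20, 30), (2020, 2830)))
print(scan_zones["qty"])    # (220.0, 430.0, 380.0, 470.0)
print(scan_zones["price"])  # (420.0, 430.0, 620.0, 470.0)
```

Each mapped box would then be cropped out and recognised on its own, so the engine never has to work out the table layout by itself.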
hope it helps~
Danny