SceneReader is implemented in ANSI C for speed and portability.
SceneReader currently runs on Linux/x86 and Windows systems; ports to other platforms may be possible by negotiation.
As the sample image gallery demonstrates, SceneReader is tolerant of wide variations in input image quality.
SceneReader can work with relatively poor-quality images such as those from basic phone cameras, as well as high-quality images such as scanned photographs and high-end digital camera output.
The time taken for image analysis depends mostly on the size of the input image.
SceneReader is optimised for handling arbitrary photographic images with complex and unpredictable content.
It is NOT optimised for the analysis of ordinary printed text documents.
The following considerations should be noted regarding the SceneReader 2.0 release.
1. Only the Latin alphabet is supported.
SceneReader 2.0 comes pre-configured with a general font knowledgebase
allowing it to recognise text in a variety of fonts. Currently,
this font knowledgebase includes only Latin alphanumeric characters
(so e.g. Cyrillic or Greek characters are not supported).
2. Only alphanumeric characters are supported.
SceneReader 2.0 searches for alphabetic and numeric characters, but
ignores punctuation and other special characters (e.g. the $ sign).
3. Only dictionary sequences and numbers are recognised.
SceneReader 2.0 can recognise pure digit sequences (containing only
numeric characters) as numbers without any reference dictionary.
For recognising words and alphanumeric sequences, SceneReader 2.0 uses
a reference dictionary. Words (e.g. "exit") and/or
alphanumeric sequences (e.g. "WX123") not included in the dictionary
will not be recognised.
However, the dictionary can be large without significant performance
implications (e.g. we typically use a 60,000-word dictionary for testing
purposes).
4. SceneReader 2.0 is optimised for recognising printed, non-cursive
text.
SceneReader 2.0 will not, in general, recognise joined-up (cursive)
writing or heavily interconnected text.
5. Result reporting is case-insensitive.
SceneReader 2.0 can recognise words in upper, lower and mixed case.
However, its output results are always reported in lowercase.
6. SceneReader 2.0 is intended for photographic images.
SceneReader 2.0 is optimised for handling arbitrary photographic images
with complex and unpredictable content. It is NOT optimised for the
OCR analysis of ordinary printed text documents, and is unlikely to
perform well at this.
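
To make considerations 2, 3 and 5 concrete, the following is a minimal
sketch in ANSI C of how a candidate character sequence could be filtered
under those rules. It is not SceneReader's actual code or API: the names
token_kind, normalise and classify_token are hypothetical, the dictionary
is assumed to be an array of lowercase strings, and rejecting a token that
contains a special character is an assumption of this sketch (the engine
itself may simply skip such characters).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes; not part of any SceneReader interface. */
enum token_kind { TOKEN_REJECTED = 0, TOKEN_NUMBER = 1, TOKEN_WORD = 2 };

/*
 * Copy src into dst in lowercase (consideration 5).
 * Returns 0 on success, or -1 if src is empty, does not fit in cap bytes,
 * or contains a non-alphanumeric character (consideration 2).
 * Assumes the default "C" locale, where isalnum() accepts only A-Z, a-z, 0-9.
 */
static int normalise(const char *src, char *dst, size_t cap)
{
    size_t i;

    if (cap == 0)
        return -1;
    for (i = 0; src[i] != '\0'; i++) {
        unsigned char c = (unsigned char)src[i];
        if (i + 1 >= cap || !isalnum(c))
            return -1;
        dst[i] = (char)tolower(c);
    }
    dst[i] = '\0';
    return (i > 0) ? 0 : -1;
}

/*
 * Classify one candidate token (consideration 3):
 *   - pure digit sequences are accepted as numbers without a dictionary;
 *   - anything else must match a dictionary entry exactly;
 *   - everything else is rejected.
 * On acceptance, out holds the lowercase form; on rejection, out is empty.
 * dict is assumed to contain lowercase entries only.
 */
static enum token_kind classify_token(const char *token,
                                      const char *const *dict, size_t dict_len,
                                      char *out, size_t out_cap)
{
    size_t i;
    int all_digits = 1;

    assert(token != NULL && out != NULL);

    if (normalise(token, out, out_cap) != 0) {
        if (out_cap > 0)
            out[0] = '\0';
        return TOKEN_REJECTED;
    }

    for (i = 0; out[i] != '\0'; i++) {
        if (!isdigit((unsigned char)out[i])) {
            all_digits = 0;
            break;
        }
    }
    if (all_digits)
        return TOKEN_NUMBER;

    /* Linear scan keeps the sketch short; see the note after the code. */
    for (i = 0; i < dict_len; i++) {
        if (strcmp(out, dict[i]) == 0)
            return TOKEN_WORD;
    }

    out[0] = '\0';
    return TOKEN_REJECTED;
}
```

With a dictionary containing "exit" and "wx123", this sketch accepts
"EXIT" (reported as "exit"), "Wx123" and "2024", and rejects "entrance"
and "$5". For a dictionary of tens of thousands of entries, the linear
scan would normally be replaced by a sorted array searched with bsearch()
or by a hash table.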