RootsChat.Com
General => Technical Help => Topic started by: wini on Sunday 26 April 20 07:27 BST (UK)
-
This is a very convoluted question but I know someone will know the answer.
What do you call the text, used I think by Amazon and Ancestry, to transfer content from one document to another, perhaps to an e-reader, or in the case of Ancestry from official records to their site?
I thought it might have been prescriptive text or descriptive text, but looking at the definitions of both, neither seems right.
I told you it was convoluted, and I'm not sure if anyone can make sense of it.
wini
-
Could you mean:
PLAIN text
Or
RICH text
Perhaps :)
JM
-
Most of the records on their site have indexes that are transcribed manually. If the text from a full record is transferred digitally, then it is done using OCR (Optical Character Recognition).
-
Handwritten documents always need to be transcribed by actual people. Printed documents are often handled by OCR.
OCR results vary depending on the quality of the original. Sometimes I'm amazed that it works out anything at all when I examine the original of, say, a 19th-century newspaper, where the ink has run and made all the letters in a word join together.
Once the text has been converted, it can be displayed in whatever format is required.
Some sites, such as Trove, allow you to correct the machine transcriptions so that future researchers can benefit.
The best OCR software can learn to recognise fresh letter shapes with some human intervention early in a document, and even save a word processor file, with proper paragraphs and indentation.
I believe that some can even cope with the Gothic letters used in earlier German printing, which is more than I can.
-
Thank you very much. OCR is what I was looking for.
wini