OCR & Text Extraction
Extracting text from images, scanned documents, and photos using optical character recognition.
What is OCR?
OCR (Optical Character Recognition) is the technology that extracts readable text from images. Your AI employee uses OCR to read text from photos, screenshots, scanned documents, business cards, receipts, whiteboards, and any other image containing text. The extracted text can then be edited, searched, stored in your knowledge base, or used as input for other tasks.
Using OCR
To extract text from an image, send the image to your AI employee and ask it to read the text. You can send photos taken with your phone, screenshots from your computer, or scanned document images. The AI processes the image and returns the extracted text in a clean, editable format. It handles multiple languages, a range of fonts, and both printed and handwritten text, though accuracy on handwriting varies.
Extracting text from a photo
Read text from an image.
Scanned Documents
For scanned PDFs (PDFs that are essentially images of pages rather than digital text), the AI applies OCR to convert them into searchable, editable text. This is particularly useful for digitizing old contracts, letters, or any paper documents. After OCR processing, the text can be ingested into your knowledge base, making the information from physical documents as searchable as digital content.
For the best OCR results, scan documents at 300 DPI or higher. Ensure good lighting and a flat surface when photographing documents with your phone.
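As a rough way to check a scan against the 300 DPI guideline, the effective resolution can be estimated from the image's pixel dimensions and the physical page size. The sketch below is illustrative only: the names `effective_dpi` and `meets_guideline` and the default US Letter page size (8.5 x 11 inches) are assumptions made for this example, not part of the product.

```python
MIN_DPI = 300  # guideline from the tip above


def effective_dpi(width_px: int, height_px: int,
                  page_width_in: float, page_height_in: float) -> float:
    """Return the lower of the horizontal and vertical pixels-per-inch."""
    if page_width_in <= 0 or page_height_in <= 0:
        raise ValueError("page dimensions must be positive")
    return min(width_px / page_width_in, height_px / page_height_in)


def meets_guideline(width_px: int, height_px: int,
                    page_width_in: float = 8.5,
                    page_height_in: float = 11.0) -> bool:
    """True if the scan reaches MIN_DPI on both axes (US Letter assumed by default)."""
    return effective_dpi(width_px, height_px, page_width_in, page_height_in) >= MIN_DPI


if __name__ == "__main__":
    print(meets_guideline(2550, 3300))  # US Letter page scanned at 300 DPI
    print(meets_guideline(1275, 1650))  # same page scanned at 150 DPI
```

A US Letter page scanned at 300 DPI comes out at 2550 x 3300 pixels, so an image of a full page that is much smaller than that is likely below the guideline.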
Accuracy & Tips
OCR accuracy depends on image quality, font clarity, and document layout. Clean, well-lit images of printed text typically achieve 98% or higher accuracy. Low resolution, poor lighting, unusual fonts, handwriting, and complex layouts with overlapping text and images can all reduce accuracy. For critical documents, always check the extracted text against the original. The AI will flag sections where it had low confidence in the extraction.
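To make an accuracy percentage concrete when reviewing an extraction, one common approach is character-level accuracy: count the single-character edits needed to turn the extracted text into a trusted reference, then divide by the reference length. The sketch below assumes that definition; this page does not state how its accuracy figure is measured, and the function names `edit_distance` and `char_accuracy` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def char_accuracy(reference: str, extracted: str) -> float:
    """Share of reference characters recovered correctly, clamped to [0, 1]."""
    if not reference:
        return 1.0 if not extracted else 0.0
    errors = edit_distance(reference, extracted)
    return max(0.0, (len(reference) - errors) / len(reference))


if __name__ == "__main__":
    # Typical OCR confusions: capital I read as lowercase l, digit 0 read as letter O.
    print(f"{char_accuracy('Invoice 1024', 'lnvoice 1O24'):.1%}")
```

In the example, two wrong characters out of twelve give roughly 83% accuracy. Under the same definition, 98% accuracy on a 2,000-character page still allows about 40 character errors, which is why reviewing critical documents matters.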