Extract text from scanned PDFs and images using OCR (Optical Character Recognition)
PDF OCR turns scanned, image-based PDFs into searchable, copyable text. It uses Tesseract.js, a JavaScript port of the open-source Tesseract OCR engine, and runs entirely in your browser. Drop in a scanned document, pick the language (40+ supported), and watch the engine recognise text page by page. The result can be exported as plain text or as a 'searchable PDF' that keeps the original page images and adds an invisible text layer. Use it to digitise old paperwork, make a scanned book full-text searchable, extract content from receipts for an expense tool, or prepare a non-editable PDF for translation.
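The page-by-page flow and the plain-text export described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's actual code: `ocrDocument`, `joinPages`, and the `RecognizePage` callback are hypothetical names, and the page separator format is made up for the example. In a browser, the callback would presumably wrap a Tesseract.js worker's `recognize` call and return the recognised text.

```typescript
// Hypothetical callback type: recognises one rendered page in the given
// language and resolves to its text. In practice this might wrap a
// Tesseract.js worker; here it is left abstract so the sketch stays
// self-contained.
type RecognizePage<P> = (page: P, lang: string) => Promise<string>;

// Build the plain-text export: one labelled section per page.
// The "--- Page N ---" separator is an assumption for illustration.
function joinPages(pageTexts: string[]): string {
  return pageTexts
    .map((text, i) => `--- Page ${i + 1} ---\n${text.trim()}`)
    .join("\n\n");
}

// Recognise pages one at a time, in order, reporting progress after each
// page so a UI can show which page has just been processed.
async function ocrDocument<P>(
  pages: P[],
  lang: string,
  recognizePage: RecognizePage<P>,
  onProgress?: (done: number, total: number) => void
): Promise<string> {
  const texts: string[] = [];
  for (const page of pages) {
    texts.push(await recognizePage(page, lang));
    onProgress?.(texts.length, pages.length);
  }
  return joinPages(texts);
}
```

Passing the recogniser in as a callback keeps the loop independent of any particular OCR engine, which also makes it straightforward to exercise with a stub that returns fixed strings.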