Sahaf
Local PDF & EPUB to Markdown converter with automatic digital/scanned detection, OCR support, and smart splitting. Runs entirely on your hardware.
Prototype
PDF & EPUB Support
Handles both formats with dedicated pipelines. Automatically detects digital, scanned, and mixed PDFs page by page.
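The page-by-page detection described above could be sketched as follows. This is a minimal illustration, not Sahaf's actual implementation: it assumes a page counts as "digital" when its text layer holds at least a threshold number of non-whitespace characters. The function names and the `MIN_TEXT_CHARS` value are hypothetical.

```python
from typing import List

MIN_TEXT_CHARS = 50  # hypothetical threshold, not taken from Sahaf


def classify_page(text: str, min_chars: int = MIN_TEXT_CHARS) -> str:
    """Label one page by how much extractable text its text layer holds."""
    visible = "".join(text.split())  # ignore whitespace-only text layers
    return "digital" if len(visible) >= min_chars else "scanned"


def classify_document(page_texts: List[str], min_chars: int = MIN_TEXT_CHARS) -> str:
    """Return 'digital', 'scanned', or 'mixed' for a whole document."""
    labels = {classify_page(t, min_chars) for t in page_texts}
    if not labels:
        raise ValueError("document has no pages")
    # A single distinct label means every page agrees; otherwise the PDF is mixed.
    return labels.pop() if len(labels) == 1 else "mixed"
```

With PyMuPDF, the per-page text could be collected with something like `[page.get_text() for page in fitz.open(path)]`, so that only pages labelled "scanned" are sent through OCR.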
High-Accuracy Conversion
The Marker library delivers a reported 95.67% accuracy and supports 90+ languages through the Surya OCR engine.
Smart Splitting
Select page or chapter ranges to convert. Output is split at heading and paragraph boundaries, never mid-sentence.
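One way to split at heading and paragraph boundaries is sketched below. It is an assumption about the approach, not Sahaf's code: the hypothetical `split_markdown` helper starts a new chunk at every Markdown heading, and also between paragraphs once a hypothetical `max_chars` budget would be exceeded. A paragraph is never cut, even if it alone exceeds the budget.

```python
import re
from typing import List


def split_markdown(text: str, max_chars: int = 2000) -> List[str]:
    """Split Markdown into chunks, breaking only before headings or between paragraphs."""
    # Blocks are paragraphs (or headings) separated by blank lines.
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    chunks: List[str] = []
    current: List[str] = []
    size = 0
    for block in blocks:
        is_heading = block.startswith("#")
        too_big = size + len(block) > max_chars
        # Close the running chunk at a heading or when the budget would overflow.
        if current and (is_heading or too_big):
            chunks.append("\n\n".join(current))
            current, size = [], 0
        current.append(block)
        size += len(block)
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Because blocks are the smallest unit, every chunk ends on a complete paragraph; the budget is a soft limit rather than a hard cap.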
Fully Local Processing
No cloud APIs, and no data leaves your machine. Bilingual web interface with drag & drop and a dark/light theme.
⚠ A GPU is strongly recommended for PDF conversion. Without a GPU, a 27-page scanned PDF can take over an hour. EPUB conversion is lightweight and runs instantly on any hardware.
Tech Stack
Python 3.10+, FastAPI + Uvicorn, Marker / Surya OCR, PyMuPDF, ebooklib