PDF to Markdown Converter
Drop in any PDF and get clean Markdown in seconds
Headings, paragraphs, and lists preserved. Nothing is uploaded to a server: the conversion runs entirely in your browser using Mozilla's open-source PDF.js library.
Upload
Drag your PDF onto the tool or click to browse. No account needed. Files stay on your device.
Convert
The tool reads your PDF in the browser using Mozilla's open-source PDF.js library and converts the text content to clean Markdown.
Copy or Download
Copy the Markdown to your clipboard or download it as a .md file — ready to paste into your note-taking app, CMS, or code editor.
What Is PDF to Markdown Conversion?
Markdown is a de facto standard for modern content: documentation sites, static site generators, many CMSs, knowledge bases, and AI systems all work with it directly. PDF is one of the most common formats for reports, research papers, and exported content, but its binary, layout-oriented structure makes it difficult to edit or repurpose without specialist software.
A PDF to Markdown converter bridges this gap by reading the text layer of a PDF — its font sizes, positions, and formatting metadata — and reconstructing the document structure as clean Markdown. The result is a plain-text file that is human-readable, version-controllable, and ready to paste into any Markdown editor, CMS, or AI tool.
This converter uses PDF.js (the same engine that powers Firefox's built-in PDF viewer) under the hood. It extracts each text item with its position, size, and font name, then infers headings from font-size ratios and paragraph breaks from vertical spacing. The entire process runs inside your browser with no server involved.
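The extraction step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the tool's actual source. `RawItem` mirrors the `str`, `transform`, and `fontName` fields that PDF.js reports for text items. The names `Line`, `groupIntoLines`, and `splitParagraphs`, the 2-unit baseline tolerance, and the use of font size as a stand-in for line height are all hypothetical.

```typescript
// Subset of the fields PDF.js reports for each text item.
interface RawItem {
  str: string;
  transform: number[]; // text matrix [a, b, c, d, x, y]
  fontName: string;
}

// Hypothetical intermediate shape: one visual line of text.
interface Line {
  text: string;
  y: number;        // baseline position (PDF y grows upward)
  fontSize: number; // largest font size found on the line
}

// Approximate the font size from the vertical scale of the text matrix.
function fontSizeOf(item: RawItem): number {
  return Math.hypot(item.transform[2], item.transform[3]);
}

// Bucket items whose baselines lie within `tolerance` of each other,
// then order each bucket left to right and join it into one line.
function groupIntoLines(items: RawItem[], tolerance = 2): Line[] {
  const byY = items
    .filter((item) => item.str.trim() !== "")
    .sort((a, b) => b.transform[5] - a.transform[5]); // top of page first

  const buckets: RawItem[][] = [];
  for (const item of byY) {
    const current = buckets[buckets.length - 1];
    if (current && Math.abs(current[0].transform[5] - item.transform[5]) <= tolerance) {
      current.push(item);
    } else {
      buckets.push([item]);
    }
  }

  return buckets.map((bucket) => {
    const ordered = [...bucket].sort((a, b) => a.transform[4] - b.transform[4]);
    return {
      text: ordered.map((item) => item.str.trim()).join(" "),
      y: bucket[0].transform[5],
      fontSize: Math.max(...bucket.map(fontSizeOf)),
    };
  });
}

// Start a new paragraph when the gap to the previous line exceeds
// gapFactor × line height (font size is used as the line height here).
function splitParagraphs(lines: Line[], gapFactor = 1.5): string[] {
  const paragraphs: string[] = [];
  let current: string[] = [];
  lines.forEach((line, i) => {
    const prev = lines[i - 1];
    if (prev && prev.y - line.y > gapFactor * prev.fontSize && current.length > 0) {
      paragraphs.push(current.join(" "));
      current = [];
    }
    current.push(line.text);
  });
  if (current.length > 0) paragraphs.push(current.join(" "));
  return paragraphs;
}
```

Bucketing by baseline before sorting by x keeps words in reading order even when items on the same line report slightly different y values.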
What Gets Preserved in the Conversion?
Headings H1–H3
Detected from font-size ratio relative to body text. Much larger = H1, moderately larger = H2/H3.
Paragraphs
A vertical gap between consecutive lines of more than 1.5× the line height becomes a paragraph break.
Bullet lists
Lines starting with •, –, or * are converted to Markdown dash lists.
Numbered lists
Lines matching N. or N) patterns become ordered Markdown lists.
Bold text
Text items whose font name contains 'Bold' are wrapped in **double asterisks**.
Italic text
Font names containing 'Italic' or 'Oblique' are wrapped in *single asterisks*.
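The rules listed above could be combined along these lines. Treat this as a hedged sketch rather than the converter's real code. The page only says "much larger" and "moderately larger", so the three ratio thresholds are assumptions, as are the names `StyledLine`, `headingPrefix`, `emphasize`, and `lineToMarkdown`, and the choice to let a heading match take priority over list and emphasis rules.

```typescript
// Hypothetical input shape: one line of text with its dominant font.
interface StyledLine {
  text: string;
  fontSize: number;
  fontName: string;
}

// Assumed thresholds; the source does not state exact ratios.
const H1_RATIO = 1.8;
const H2_RATIO = 1.4;
const H3_RATIO = 1.15;

// Map the font-size ratio relative to body text to a Markdown heading prefix.
function headingPrefix(fontSize: number, bodySize: number): string {
  const ratio = fontSize / bodySize;
  if (ratio >= H1_RATIO) return "# ";
  if (ratio >= H2_RATIO) return "## ";
  if (ratio >= H3_RATIO) return "### ";
  return "";
}

// Wrap text according to the font name. Applying both wrappers to a
// bold-italic font (giving ***text***) is an assumption of this sketch.
function emphasize(text: string, fontName: string): string {
  let out = text;
  if (/Italic|Oblique/.test(fontName)) out = `*${out}*`;
  if (fontName.includes("Bold")) out = `**${out}**`;
  return out;
}

// Convert one line to Markdown: heading first, then lists, then emphasis.
function lineToMarkdown(line: StyledLine, bodySize: number): string {
  const text = line.text.trim();

  // Headings win, so a large bold title is not also wrapped in asterisks.
  const heading = headingPrefix(line.fontSize, bodySize);
  if (heading) return heading + text;

  // Lines starting with •, –, or * become dash lists.
  const bullet = text.match(/^[•–*]\s+(.*)$/);
  if (bullet) return `- ${bullet[1]}`;

  // "N." and "N)" both normalise to "N." ordered-list items.
  const numbered = text.match(/^(\d+)[.)]\s+(.*)$/);
  if (numbered) return `${numbered[1]}. ${numbered[2]}`;

  return emphasize(text, line.fontName);
}
```

For example, with a 12 pt body size, a 24 pt line maps to an H1, while a 12 pt line starting with `•` maps to a dash list item. The body size itself would need to be estimated first, for instance as the most common font size in the document.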
What doesn't convert: Tables appear as plain text (PDF tables have no semantic structure). Images and charts are not extracted. Scanned or image-only PDFs produce no output (OCR not supported). Multi-column layouts may mix columns together — single-column documents convert cleanly.
Why Browser-Based Conversion Matters for Privacy
When you upload a document to a server-based converter, your file travels over the internet and lands on a third-party server. Even tools that promise immediate deletion may still keep traces of your file in logs, caches, or backups, and many PDFs (legal documents, financial reports, unpublished research) should never leave your device.
This converter runs entirely inside your browser. When you select a PDF, it is read into an ArrayBuffer in your browser's memory, processed by PDF.js locally, and the resulting Markdown is written directly to the screen. Nothing is uploaded. Nothing is retained. Close the tab and the data is gone.
This approach has zero server cost, which is why this tool is free with no usage limits, no account requirements, and no restrictions on file content.
Common Use Cases
Research papers and reports
Convert academic PDFs and business reports to Markdown for easier editing, summarisation, or feeding into AI tools like Claude and ChatGPT.
Documentation migration
Move legacy documentation stored in PDF format into Markdown-based documentation systems like Docusaurus, MkDocs, or GitBook.
Content repurposing
Extract text content from product PDFs, white papers, or eBooks to repurpose as blog posts, knowledge base articles, or social content.
AI and LLM context preparation
Markdown is a compact, structure-preserving format that LLMs handle well. Converting PDFs to Markdown before adding them to AI projects keeps headings and lists intact and avoids spending tokens on layout noise.
Note-taking and Obsidian vaults
Pull text from PDFs into Obsidian, Notion, or Logseq without retyping. The output is ready to paste as a new note.
CMS publishing
Convert PDF reports or presentations to Markdown and publish them directly to headless CMS platforms that accept Markdown input.
Also in the Markdown converter suite:
- Word to Markdown Converter: convert .docx files to Markdown with full table and heading support
- HTML to Markdown Converter: paste any HTML and get clean GFM instantly in your browser
Frequently Asked Questions
Does this tool upload my PDF to a server?
What formatting is preserved in the Markdown output?
The converter preserves the following elements where the PDF contains the necessary font metadata:
- Headings H1–H3 (detected from font size relative to body text)
- Paragraphs (detected from vertical gaps between text blocks)
- Bullet lists (lines starting with •, –, or *)
- Numbered lists (lines starting with 1., 2., etc.)
- Bold text (font name contains 'Bold')
- Italic text (font name contains 'Italic' or 'Oblique')
Tables cannot be reliably converted from PDF format and will appear as plain text. Images and charts are not extracted.
Why does my PDF output look jumbled or mixed up?
Can I convert a scanned PDF?
What is the maximum PDF file size?
Is the Markdown output compatible with GitHub, Notion, and Obsidian?
Headings (#, ##, ###), bullet points, numbered lists, bold, and italic are all standard GFM syntax. Page separators use a horizontal rule (---) between pages.
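The page-separator behaviour can be illustrated with a short sketch. The function name `joinPages` and the decision to skip empty pages are assumptions made for this example, not confirmed details of the tool.

```typescript
// Join per-page Markdown with a horizontal rule between pages.
// Empty pages are skipped so the output never contains two rules in a row
// (an assumption of this sketch).
function joinPages(pages: string[]): string {
  return pages
    .map((page) => page.trim())
    .filter((page) => page.length > 0)
    .join("\n\n---\n\n");
}

// Example: three pages, the middle one empty.
const combined = joinPages(["# Title\n\nIntro text", "", "Second page text"]);
console.log(combined);
```

The blank lines around `---` matter: in Markdown, a line of dashes placed directly under a line of text is read as a setext-style H2 heading rather than a horizontal rule.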