PDF to Markdown Converter

Upload any PDF and get clean Markdown in seconds

Headings, paragraphs, and lists are preserved. Nothing is sent to a server: the conversion runs entirely in your browser using Mozilla's open-source PDF.js library.

Upload

Drag your PDF onto the tool or click to browse. No account needed. Files stay on your device.

Convert

The tool reads your PDF in the browser using Mozilla's open-source PDF.js library and converts the text content to clean Markdown.

Copy or Download

Copy the Markdown to your clipboard or download it as a .md file — ready to paste into your note-taking app, CMS, or code editor.

What Is PDF to Markdown Conversion?

Markdown is a widely supported format for modern content: documentation sites, static site generators, CMSs, knowledge bases, and AI systems commonly accept it. PDFs are among the most common formats for reports, research papers, and exported content, but their fixed-layout binary structure makes them difficult to edit or repurpose without specialist software.

A PDF to Markdown converter bridges this gap by reading the text layer of a PDF — its font sizes, positions, and formatting metadata — and reconstructing the document structure as clean Markdown. The result is a plain-text file that is human-readable, version-controllable, and ready to paste into any Markdown editor, CMS, or AI tool.

This converter uses PDF.js (the same engine that powers Firefox's built-in PDF viewer) under the hood. It extracts each text item with its position, size, and font name, then infers headings from font-size ratios and paragraph breaks from vertical spacing. The entire process runs inside your browser with no server involved.
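As a rough illustration of the extraction step, the sketch below normalizes one text item of the shape PDF.js returns from `getTextContent()` (`str`, `transform`, `height`, `fontName`) into a flat record. The `TextRun` type and the `toTextRun` helper are hypothetical names for this example, and deriving the font size from the transform matrix is an assumption about how the size could be read, not a description of this tool's code.

```typescript
// Subset of the fields PDF.js reports for each text item from getTextContent().
interface PdfJsTextItem {
  str: string;
  transform: number[]; // [a, b, c, d, e, f]; e and f hold the x/y position
  height: number;
  fontName: string;
}

// Hypothetical normalized record for the later inference steps.
interface TextRun {
  text: string;
  x: number;
  y: number;
  fontSize: number;
  fontName: string;
}

function toTextRun(item: PdfJsTextItem): TextRun {
  const c = item.transform[2] ?? 0;
  const d = item.transform[3] ?? 0;
  // The length of the matrix's vertical axis approximates the rendered font size.
  // Fall back to the reported height when the matrix gives nothing usable.
  const scale = Math.sqrt(c * c + d * d);
  return {
    text: item.str,
    x: item.transform[4] ?? 0,
    y: item.transform[5] ?? 0,
    fontSize: scale > 0 ? scale : item.height,
    fontName: item.fontName,
  };
}
```

Keeping the record flat means the heading, paragraph, and emphasis checks can each be written as small functions over plain data, without touching PDF.js again.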

What Gets Preserved in the Conversion?

Headings H1–H3

Detected from the ratio of a line's font size to the body text size. Much larger text becomes H1; moderately larger text becomes H2 or H3.
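A minimal sketch of ratio-based heading detection might look like the following. The threshold values (1.8, 1.4, 1.15) and the "most characters wins" estimate of the body size are assumptions chosen for illustration; the page does not state the values the tool actually uses.

```typescript
interface SizedRun {
  text: string;
  fontSize: number;
}

// Illustrative thresholds (assumptions): ratio of run size to body size.
const HEADING_RATIOS: { level: 1 | 2 | 3; minRatio: number }[] = [
  { level: 1, minRatio: 1.8 },
  { level: 2, minRatio: 1.4 },
  { level: 3, minRatio: 1.15 },
];

// Body size = the font size that accounts for the most characters.
function estimateBodySize(runs: SizedRun[]): number {
  const charsBySize: Record<string, number> = {};
  for (const run of runs) {
    const key = run.fontSize.toFixed(1);
    charsBySize[key] = (charsBySize[key] ?? 0) + run.text.length;
  }
  let bestSize = 0;
  let bestChars = -1;
  for (const key of Object.keys(charsBySize)) {
    const chars = charsBySize[key] ?? 0;
    if (chars > bestChars) {
      bestChars = chars;
      bestSize = parseFloat(key);
    }
  }
  return bestSize;
}

// Returns 1 to 3 for headings and 0 for body text.
function headingLevel(fontSize: number, bodySize: number): number {
  if (bodySize <= 0) return 0;
  const ratio = fontSize / bodySize;
  for (const rule of HEADING_RATIOS) {
    if (ratio >= rule.minRatio) return rule.level;
  }
  return 0;
}

function toMarkdownLine(run: SizedRun, bodySize: number): string {
  const prefixes = ["", "# ", "## ", "### "];
  return `${prefixes[headingLevel(run.fontSize, bodySize)] ?? ""}${run.text}`;
}
```

With a 12 pt body, a 24 pt line would map to H1 and an 18 pt line to H2 under these example thresholds.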

Paragraphs

A vertical gap between consecutive lines of more than 1.5× the line height becomes a paragraph break.
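The gap rule can be sketched as a single pass over lines that are already in reading order. This example assumes the gap is measured baseline to baseline and compared against the previous line's height; the `PositionedLine` type and `groupParagraphs` function are hypothetical names.

```typescript
interface PositionedLine {
  text: string;
  y: number;      // baseline position; in PDF coordinates y decreases down the page
  height: number; // line height in the same units as y
}

const PARAGRAPH_GAP_FACTOR = 1.5;

// Groups consecutive lines into paragraphs, splitting on large vertical gaps.
function groupParagraphs(lines: PositionedLine[]): string[] {
  const paragraphs: string[] = [];
  let current: string[] = [];
  let previous: PositionedLine | undefined;
  for (const line of lines) {
    if (previous !== undefined) {
      const gap = Math.abs(previous.y - line.y);
      if (gap > PARAGRAPH_GAP_FACTOR * previous.height && current.length > 0) {
        paragraphs.push(current.join(" "));
        current = [];
      }
    }
    current.push(line.text);
    previous = line;
  }
  if (current.length > 0) paragraphs.push(current.join(" "));
  return paragraphs;
}
```

Joining the returned strings with a blank line (`"\n\n"`) yields Markdown paragraphs.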

Bullet lists

Lines starting with •, –, or * are converted to Markdown dash lists.
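One way to express the bullet rule is a single regular expression. This is a sketch under the assumption that a bullet glyph must be followed by whitespace, which keeps lines that start with Markdown bold markers (`**`) from being mistaken for list items; `convertBullet` is a hypothetical helper name.

```typescript
// Matches a leading bullet glyph (U+2022 bullet, U+2013 en dash, or *) followed by whitespace.
const BULLET_PATTERN = /^\s*[\u2022\u2013*]\s+(.*)$/;

// Rewrites a bullet line as a Markdown dash item; other lines pass through unchanged.
function convertBullet(line: string): string {
  const match = BULLET_PATTERN.exec(line);
  return match ? `- ${match[1]}` : line;
}
```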

Numbered lists

Lines matching N. or N) patterns become ordered Markdown lists.
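The numbered-list rule can be sketched the same way. The example assumes the number must be followed by `.` or `)` and then whitespace, so decimals such as `3.14` are left alone, and it normalizes both forms to `N.`; `convertOrdered` is a hypothetical helper name.

```typescript
// Matches "N." or "N)" at the start of a line, followed by whitespace.
const ORDERED_PATTERN = /^\s*(\d+)[.)]\s+(.*)$/;

// Rewrites a numbered line as a Markdown ordered-list item; other lines pass through unchanged.
function convertOrdered(line: string): string {
  const match = ORDERED_PATTERN.exec(line);
  return match ? `${match[1]}. ${match[2]}` : line;
}
```

A pattern this simple will also match a sentence that happens to begin with a number and a period, so a real converter would likely need extra context checks.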

Bold text

Text items whose font name contains 'Bold' are wrapped in **double asterisks**.

Italic text

Font names containing 'Italic' or 'Oblique' are wrapped in *single asterisks*.
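The bold and italic rules both reduce to a check on the font name. The sketch below assumes a case-insensitive match and keeps surrounding whitespace outside the markers so the emphasis still renders; `StyledRun` and `wrapEmphasis` are hypothetical names.

```typescript
interface StyledRun {
  text: string;
  fontName: string;
}

// Wraps a run in Markdown emphasis markers based on its font name.
function wrapEmphasis(run: StyledRun): string {
  const bold = /bold/i.test(run.fontName);
  const italic = /italic|oblique/i.test(run.fontName);
  const marker = (bold ? "**" : "") + (italic ? "*" : "");
  // Split into leading whitespace, core text, and trailing whitespace.
  const parts = /^(\s*)([\s\S]*?)(\s*)$/.exec(run.text);
  if (!parts || parts[2] === "" || marker === "") return run.text;
  return `${parts[1]}${marker}${parts[2]}${marker}${parts[3]}`;
}
```

A font such as `Arial-BoldOblique` matches both checks and produces `***text***`, which Markdown renders as bold italic.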

What doesn't convert: Tables appear as plain text (PDF tables have no semantic structure). Images and charts are not extracted. Scanned or image-only PDFs produce no output (OCR not supported). Multi-column layouts may mix columns together — single-column documents convert cleanly.

Why Browser-Based Conversion Matters for Privacy

When you upload a document to a server-based converter, your file travels over the internet and lands on a third-party server. Even tools that promise immediate deletion may keep copies or metadata in logs and backups, and many PDFs (legal documents, financial reports, unpublished research) should never leave your device.

This converter runs entirely inside your browser. When you select a PDF, it is read into an ArrayBuffer in your browser's memory, processed by PDF.js locally, and the resulting Markdown is written directly to the screen. Nothing is uploaded. Nothing is retained. Close the tab and the data is gone.

Because conversions use no server resources, this tool is free with no usage limits, no account requirements, and no restrictions on file content.

Common Use Cases

Research papers and reports

Convert academic PDFs and business reports to Markdown for easier editing, summarisation, or feeding into AI tools like Claude and ChatGPT.

Documentation migration

Move legacy documentation stored in PDF format into Markdown-based documentation systems like Docusaurus, MkDocs, or GitBook.

Content repurposing

Extract text content from product PDFs, white papers, or eBooks to repurpose as blog posts, knowledge base articles, or social content.

AI and LLM context preparation

Markdown is a compact, well-structured input format for LLMs. Converting PDFs to Markdown before adding them to AI projects can improve context quality and reduce token waste.

Note-taking and Obsidian vaults

Pull text from PDFs into Obsidian, Notion, or Logseq without retyping. The output is ready to paste as a new note.

CMS publishing

Convert PDF reports or presentations to Markdown and publish them directly to headless CMS platforms that accept Markdown input.

Also in the Markdown converter suite:

→ Word to Markdown Converter — convert .docx files to Markdown with full table and heading support
→ HTML to Markdown Converter — paste any HTML and get clean GFM instantly in your browser

Frequently Asked Questions

Does this tool upload my PDF to a server?

No. The conversion happens entirely in your browser using Mozilla's PDF.js library. Your file never leaves your device and is not sent to any server. You can even disconnect from the internet after the page loads and the conversion will still work.

What formatting is preserved in the Markdown output?

The converter preserves the following elements where the PDF contains the necessary font metadata:

Headings H1–H3 (detected from font size relative to body text)
Paragraphs (detected from vertical gaps between text blocks)
Bullet lists (lines starting with •, –, or *)
Numbered lists (lines starting with 1., 2., etc.)
Bold text (font name contains 'Bold')
Italic text (font name contains 'Italic' or 'Oblique')

Tables cannot be reliably converted from PDF format and will appear as plain text. Images and charts are not extracted.

Why does my PDF output look jumbled or mixed up?

PDFs with multi-column layouts, such as academic papers, magazines, or brochures, often store text in an order that does not follow the columns. The converter reads text in the order it is stored in the PDF file, so lines from adjacent columns can end up interleaved. Single-column documents like reports, articles, and PDFs exported from Word convert cleanly. If your document has two columns, try exporting it as a single-column Word document first and then converting that version.

Can I convert a scanned PDF?

No. Scanned PDFs contain page images rather than a text layer, so there is no text for the converter to extract. Converting a scanned PDF requires OCR (optical character recognition), which this tool does not support. If your PDF has selectable text (you can highlight text with your cursor in a PDF viewer), it can be converted. Run scanned documents through an OCR tool before using this converter.

What is the maximum PDF file size?

50 MB. Most document PDFs are well under this limit. Very large PDFs with many embedded images may approach it — if your file is over 50 MB, try splitting it into smaller sections first. The conversion speed also depends on the number of pages: a 200-page document will take longer than a 10-page report.
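A size check of this kind can run before any parsing starts. The sketch below assumes "50 MB" means 50 × 1024 × 1024 bytes and that the size comes from the browser's `File.size` property; `checkPdfSize` and the message wording are hypothetical.

```typescript
// Assumption: "50 MB" is treated as 50 * 1024 * 1024 bytes.
const MAX_PDF_BYTES = 50 * 1024 * 1024;

interface SizeCheck {
  ok: boolean;
  message: string;
}

// Validates a file size in bytes (for example, the value of File.size in the browser).
function checkPdfSize(sizeInBytes: number): SizeCheck {
  if (sizeInBytes <= MAX_PDF_BYTES) {
    return { ok: true, message: "" };
  }
  const sizeMb = (sizeInBytes / (1024 * 1024)).toFixed(1);
  return {
    ok: false,
    message: `File is ${sizeMb} MB; the limit is 50 MB. Try splitting it into smaller sections.`,
  };
}
```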

Is the Markdown output compatible with GitHub, Notion, and Obsidian?

Yes. The output uses standard GitHub Flavored Markdown (GFM) syntax, which is compatible with GitHub, Notion, Obsidian, Typora, VS Code, and most Markdown-based tools. Heading levels (#, ##, ###), bullet points, numbered lists, bold, and italic all use standard syntax. Page separators use a horizontal rule (---) between pages.
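The page-separator behaviour can be sketched as a join over per-page Markdown strings. Skipping empty pages is an assumption of this example, and `joinPages` is a hypothetical name. The blank lines around `---` matter: in CommonMark, a `---` line directly under a line of text is read as a level-2 heading underline rather than a horizontal rule.

```typescript
// Joins per-page Markdown with a horizontal rule, skipping pages with no text.
function joinPages(pages: string[]): string {
  const nonEmpty = pages
    .map((page) => page.trim())
    .filter((page) => page.length > 0);
  // Blank lines on both sides keep "---" from being parsed as a setext heading.
  return nonEmpty.join("\n\n---\n\n") + "\n";
}
```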
