PDF Reading

When you receive a PDF file, choose one of two approaches depending on its content:

Text-heavy PDFs (papers, articles, contracts)

Use the read_pdf tool, or run pdftotext via run_bash:

pdftotext -layout /path/to/file.pdf -

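This extracts all text content. The -layout flag keeps the original physical layout (columns, tables) as far as possible, and the trailing - writes the text to stdout instead of a file. Best for documents that are primarily text.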

Image-heavy PDFs (charts, diagrams, scanned pages, slides)

Render pages as images with pdftoppm, then view them:

# Render all pages as JPEGs (150 DPI is good for readability)
pdftoppm -jpeg -r 150 /path/to/file.pdf /tmp/pdf-page

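# This creates /tmp/pdf-page-1.jpg, /tmp/pdf-page-2.jpg, etc.
# Page numbers are zero-padded to the width of the total page count, so a PDF
# with 10 or more pages produces /tmp/pdf-page-01.jpg, /tmp/pdf-page-02.jpg, etc.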

# Render only pages 1 through 3 (-f is the first page, -l the last page)
pdftoppm -jpeg -r 150 -f 1 -l 3 /path/to/file.pdf /tmp/pdf-page

Then use read_file on the resulting .jpg files to see them with your vision capabilities.
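
To decide which files to pass to read_file, it can help to list what pdftoppm actually wrote. A minimal sketch follows. list_pages is a hypothetical helper name, and it assumes the rendered files follow the prefix-NUMBER.jpg pattern, with page numbers zero-padded so that plain glob order matches page order.

```shell
# Hypothetical helper: print the rendered page images for a given prefix,
# one per line. Glob expansion is lexicographic, which matches page order
# only when the page numbers are zero-padded to the same width.
list_pages() (
    prefix="$1"
    for f in "$prefix"-*.jpg; do
        # If nothing matched, the glob stays literal and names no real file.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
)
```

For example, list_pages /tmp/pdf-page prints one path per rendered page, and prints nothing if rendering produced no files.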

Deciding which approach to use

User asks about charts, figures, or visual content → render as images
User asks to summarize or extract facts → text extraction first, images if needed
Scanned PDFs (pdftotext returns garbled/empty output) → render as images
Slide decks → almost always render as images

Quick check: is text extraction sufficient?

# Check if pdftotext gets meaningful content
pdftotext /path/to/file.pdf - | head -50
# If output is empty or garbled, render as images instead
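
The quick check above can also be scripted. This is a rough sketch, not a definitive test. count_alnum and has_text_layer are hypothetical helper names, and the 200-character threshold and the 3-page sample are arbitrary assumptions to tune. Garbled extraction can still contain letters and digits, so looking at the output yourself remains worthwhile.

```shell
# Count ASCII letters and digits on stdin. Whitespace, form feeds, and
# punctuation are ignored, so an "empty" extraction counts as 0.
count_alnum() {
    tr -cd '[:alnum:]' | wc -c | tr -d ' '
}

# Hypothetical helper: succeed if the first 3 pages of the PDF yield at
# least min_chars letters/digits (default 200, an assumed threshold).
has_text_layer() (
    pdf="$1"
    min_chars="${2:-200}"
    # -l 3 limits extraction to the first 3 pages; "-" writes to stdout.
    n=$(pdftotext -l 3 "$pdf" - 2>/dev/null | count_alnum)
    [ "${n:-0}" -ge "$min_chars" ]
)
```

One way to use it: run has_text_layer /path/to/file.pdf, continue with pdftotext if it succeeds, and otherwise render the pages as images.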

Tips

150 DPI balances readability against file size. Use 200 DPI or more for small text.
For large PDFs, render a few pages at a time rather than all at once.
You can view multiple page images in sequence to read through a document.
Files sent via Telegram are saved to ~/ava/telegram-downloads/.
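
The tip about rendering large PDFs a few pages at a time could be scripted roughly as follows. This is a sketch under assumptions: page_batches and render_in_batches are hypothetical helper names, the default batch size of 5 is arbitrary, and the page count is read from the "Pages:" line of pdfinfo (shipped with the same poppler tools as pdftoppm).

```shell
# Print "first last" page ranges that cover pages 1..total in chunks of
# at most `size` pages (default 5, an assumed batch size).
page_batches() (
    total="$1"
    size="${2:-5}"
    [ "$size" -ge 1 ] || exit 1   # a zero or negative size would never advance
    first=1
    while [ "$first" -le "$total" ]; do
        last=$((first + size - 1))
        if [ "$last" -gt "$total" ]; then
            last="$total"
        fi
        printf '%s %s\n' "$first" "$last"
        first=$((last + 1))
    done
)

# Hypothetical wrapper: render the PDF one batch at a time.
render_in_batches() (
    pdf="$1"
    prefix="$2"
    size="${3:-5}"
    total=$(pdfinfo "$pdf" | awk '/^Pages:/ {print $2}')
    [ -n "$total" ] || exit 1     # pdfinfo failed or reported no page count
    page_batches "$total" "$size" | while read -r first last; do
        pdftoppm -jpeg -r 150 -f "$first" -l "$last" "$pdf" "$prefix"
        # Inspect this batch (e.g. with read_file) before moving on.
    done
)
```

In practice you would likely stop after the first batch or two once you have what you need, rather than letting the loop render the whole document.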