PDF Reading
When you receive a PDF file, choose one of two approaches depending on its content:
Text-heavy PDFs (papers, articles, contracts)
Use the read_pdf tool, or run pdftotext via run_bash:
pdftotext -layout /path/to/file.pdf -
This extracts the document's embedded text; -layout keeps the original physical layout, such as columns and table alignment. Best for documents that are primarily text.
Image-heavy PDFs (charts, diagrams, scanned pages, slides)
Render pages as images with pdftoppm, then view them:
# Render all pages as JPEGs (150 DPI is good for readability)
pdftoppm -jpeg -r 150 /path/to/file.pdf /tmp/pdf-page
# This creates /tmp/pdf-page-1.jpg, /tmp/pdf-page-2.jpg, etc.
# (page numbers are zero-padded when the PDF has 10+ pages, e.g. /tmp/pdf-page-01.jpg)
# Render just specific pages
pdftoppm -jpeg -r 150 -f 1 -l 3 /path/to/file.pdf /tmp/pdf-page
Then use read_file on the resulting .jpg files to see them with your vision capabilities.
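Because pdftoppm pads the page number to the width of the document's total page count, the file name to pass to read_file depends on how long the PDF is. The sketch below computes the expected name; page_image_path is a hypothetical helper, not part of any tool, and the padding rule reflects poppler's usual behavior, so list the output directory if a file is not where you expect it.

```shell
# Hypothetical helper: print the image path pdftoppm is expected to
# write for PAGE, given the PREFIX passed to pdftoppm and the TOTAL
# number of pages in the PDF.
# Assumption: the page number is zero-padded to the number of digits
# in TOTAL (e.g. 12 pages -> two digits).
page_image_path() {
    prefix="$1"
    page="$2"
    total="$3"
    width=${#total}    # digits in the total page count
    printf "%s-%0${width}d.jpg\n" "$prefix" "$page"
}
```

For example, page_image_path /tmp/pdf-page 3 12 yields the name for page 3 of a 12-page PDF.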
Deciding which approach to use
- User asks about charts, figures, or visual content → render as images
- User asks to summarize or extract facts → text extraction first, images if needed
- Scanned PDFs (pdftotext returns garbled/empty output) → render as images
- Slide decks → almost always render as images
Quick check: is text extraction sufficient?
# Check if pdftotext gets meaningful content
pdftotext /path/to/file.pdf - | head -50
# If output is empty or garbled, render as images instead
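The quick check above can be made mechanical by counting how many letters and digits the extraction returns. This is a minimal sketch: text_is_sufficient is a hypothetical helper, and the default threshold of 200 characters is an arbitrary assumption to tune for your documents. It catches empty output but not every kind of garbled output, so still skim the text yourself.

```shell
# Hypothetical helper: read extracted text on stdin and succeed only
# if it contains at least MIN_CHARS letters or digits.
# Assumption: 200 is an arbitrary default, not a recommended value.
text_is_sufficient() {
    min_chars="${1:-200}"
    # Keep only alphanumeric characters, count them, and strip the
    # padding some wc implementations add around the number.
    count=$(tr -cd '[:alnum:]' | wc -c | tr -d '[:space:]')
    [ "$count" -ge "$min_chars" ]
}

# Example use (requires pdftotext):
#   if pdftotext /path/to/file.pdf - | head -50 | text_is_sufficient; then
#       echo "text extraction looks usable"
#   else
#       echo "render as images instead"
#   fi
```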
Tips
- 150 DPI balances readability vs file size. Use 200+ for small text.
- For large PDFs, render a few pages at a time rather than all at once.
- You can view multiple page images in sequence to read through a document.
- Files sent via Telegram are saved to ~/ava/telegram-downloads/.
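The tip about rendering large PDFs a few pages at a time can be sketched as follows. page_batches and render_in_batches are hypothetical helpers, and the batch size of 5 is an assumption; the page count is read from pdfinfo, which ships with the same poppler utilities as pdftoppm.

```shell
# Hypothetical helper: print one "first last" page range per line for
# a document of TOTAL pages, SIZE pages per batch (default 5).
page_batches() {
    total="$1"
    size="${2:-5}"
    first=1
    while [ "$first" -le "$total" ]; do
        last=$((first + size - 1))
        # Clamp the final batch to the end of the document.
        if [ "$last" -gt "$total" ]; then
            last="$total"
        fi
        echo "$first $last"
        first=$((last + 1))
    done
}

# Hypothetical wrapper: render PDF one batch at a time.
render_in_batches() {
    pdf="$1"
    prefix="${2:-/tmp/pdf-page}"
    # pdfinfo prints a line such as "Pages:          12".
    pages=$(pdfinfo "$pdf" | awk '/^Pages:/ {print $2}')
    page_batches "$pages" 5 | while read -r first last; do
        pdftoppm -jpeg -r 150 -f "$first" -l "$last" "$pdf" "$prefix"
        # View this batch's images here before rendering the next one.
    done
}
```

Calling render_in_batches /path/to/file.pdf renders every batch in turn; in practice you would stop after the batch that answers the question rather than rendering the whole file.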