Parse Reddit

Extract structured data from Reddit posts and comments using Reddit’s JSON API.

Process

Get Reddit URL - Ensure URL points to a specific post
Add JSON suffix - Append .json to the URL path, before any query string
Fetch with User-Agent - Use curl with proper User-Agent header
Parse JSON structure - Navigate Reddit’s nested JSON format
Extract post data - Get title, author, score, body, timestamp
Extract comments - Parse comment tree for top replies

Examples

Given a Reddit URL, fetch and parse the JSON to extract:

Post author, title, body, score, date
Top comments with authors and content

How to Fetch

Append .json to the URL path (before any query string) and send a User-Agent header:

curl -s -H "User-Agent: agentbot/1.0" "https://www.reddit.com/r/SUBREDDIT/comments/ID/TITLE.json"
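
A minimal sketch of the suffix step, assuming the input is a full post URL that may carry a trailing slash, a query string, or a fragment. The function name to_json_url is hypothetical and not part of any Reddit tooling; it relies only on POSIX shell parameter expansion.

```shell
# Hypothetical helper: turn a Reddit post URL into its .json endpoint.
to_json_url() {
  url=$1
  url=${url%%#*}     # drop any fragment
  url=${url%%\?*}    # drop any query string
  url=${url%/}       # drop one trailing slash
  case $url in
    *.json) ;;                  # already a JSON endpoint, leave as-is
    *) url="$url.json" ;;
  esac
  printf '%s\n' "$url"
}

to_json_url "https://www.reddit.com/r/productivity/comments/1jf439v/email_newsletter_texttospeech_app/?utm_source=share"
```

The result can be passed straight to the curl command above in place of the hand-edited URL.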

JSON Structure

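The response is an array of two listings:

[0].data.children[0].data - The post
[1].data.children[] - Top-level comments, each wrapped as {kind, data}; read the fields below from .data. A final entry with kind "more" is a placeholder for comments that were not loaded.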

Post Fields

author - Username (without u/)
title - Post title
selftext - Post body (for text posts)
score - Net score (upvotes minus downvotes)
num_comments - Comment count
created_utc - Unix timestamp
subreddit - Subreddit name (without r/)
permalink - Path to the post, relative to https://www.reddit.com
url - For link posts, the linked URL

Comment Fields

author - Username
body - Comment text
score - Net score (upvotes minus downvotes)
created_utc - Unix timestamp
depth - Nesting level (0 = top-level)
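
Comments nest: in the public JSON, each comment's data also carries a replies field, which is either an empty string or another listing with its own data.children. The sketch below assumes jq is installed and flattens that tree depth-first. The names flatten_comments and comment_tree are illustrative, and the sample input is made up.

```shell
# Hypothetical helper: read a post's JSON on stdin and print one compact JSON
# object per comment, walking nested replies depth-first.
flatten_comments() {
  jq -c '
    def comment_tree:
      select(.kind == "t1")            # keep real comments, skip "more" placeholders
      | .data
      | {author, body, score, depth},
        (.replies | objects | .data.children[] | comment_tree);
    .[1].data.children[] | comment_tree
  '
}

# Made-up sample: one top-level comment with one nested reply, plus a "more" stub.
flatten_comments <<'EOF'
[
  {"data": {"children": [{"kind": "t3", "data": {"title": "Example"}}]}},
  {"data": {"children": [
    {"kind": "t1", "data": {"author": "bob", "body": "Top", "score": 7, "depth": 0,
      "replies": {"kind": "Listing", "data": {"children": [
        {"kind": "t1", "data": {"author": "carol", "body": "Nested", "score": 3, "depth": 1, "replies": ""}}
      ]}}}},
    {"kind": "more", "data": {"count": 12}}
  ]}}
]
EOF
```

The `objects` filter drops the empty-string replies value, so the recursion stops at leaf comments without a separate check.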

Example

Input URL: https://www.reddit.com/r/productivity/comments/1jf439v/email_newsletter_texttospeech_app/

curl -s -H "User-Agent: agentbot/1.0" \
  "https://www.reddit.com/r/productivity/comments/1jf439v/email_newsletter_texttospeech_app.json" \
  | jq '
    # Bind the post once instead of repeating the full path.
    # Comments: first five entries of kind "t1", which skips "more" placeholders.
    .[0].data.children[0].data as $post
    | {
      author: $post.author,
      title: $post.title,
      body: $post.selftext,
      score: $post.score,
      num_comments: $post.num_comments,
      created: $post.created_utc,
      subreddit: $post.subreddit,
      comments: [limit(5; .[1].data.children[] | select(.kind == "t1") | .data | {author, body, score})]
    }'

Output Format

Return structured data:

## Reddit Post

**u/username** in r/subreddit (DATE, SCORE points, N comments)

> Post body text here...

### Top Comments

1. **u/commenter1** (SCORE points)
   > Comment text...

2. **u/commenter2** (SCORE points)
   > Comment text...
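
One way to produce the layout above is a jq -r filter that emits the Markdown line by line. This is a sketch, not a required implementation: render_post is a hypothetical name, it assumes jq is available and that the post JSON arrives on stdin, and the sample values are made up.

```shell
# Hypothetical helper: render a post's JSON (stdin) in the output format above.
render_post() {
  jq -r '
    .[0].data.children[0].data as $p
    | [.[1].data.children[] | select(.kind == "t1") | .data] as $comments
    | "## Reddit Post",
      "",
      "**u/\($p.author)** in r/\($p.subreddit) (\($p.created_utc | strftime("%Y-%m-%d")), \($p.score) points, \($p.num_comments) comments)",
      "",
      "> \($p.selftext)",
      "",
      "### Top Comments",
      "",
      ($comments[:5] | to_entries[]
        | "\(.key + 1). **u/\(.value.author)** (\(.value.score) points)",
          "   > \(.value.body)",
          "")
  '
}

# Made-up sample input for illustration.
render_post <<'EOF'
[
  {"data": {"children": [{"kind": "t3", "data": {
    "author": "alice", "subreddit": "productivity", "selftext": "Post body text here",
    "score": 42, "num_comments": 2, "created_utc": 1742409172}}]}},
  {"data": {"children": [
    {"kind": "t1", "data": {"author": "bob", "body": "First reply", "score": 7}},
    {"kind": "t1", "data": {"author": "carol", "body": "Second reply", "score": 3}}
  ]}}
]
EOF
```

The sketch quotes each body as a single line; multi-line bodies would need every line prefixed with the quote marker.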

Date Conversion

Convert Unix timestamp to readable date:

date -u -d @1742409172 +%Y-%m-%d  # GNU date; prints 2025-03-19 (macOS/BSD: date -u -r 1742409172 +%Y-%m-%d)

Or in jq, applied to the post or comment object:

jq '.created_utc | strftime("%Y-%m-%d")'
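
For scripts that must run on both Linux and macOS, the date call can be wrapped in a small function. This is a sketch under two assumptions: the name epoch_to_date is hypothetical, and the system has either GNU date (-d @SECONDS) or BSD date (-r SECONDS).

```shell
# Hypothetical helper: format a Unix timestamp as a UTC date (YYYY-MM-DD).
epoch_to_date() {
  ts=${1%.*}   # created_utc may arrive as a float such as 1742409172.0
  if date -u -d "@$ts" +%Y-%m-%d 2>/dev/null; then
    return 0                      # GNU date (Linux) handled it
  fi
  date -u -r "$ts" +%Y-%m-%d      # fall back to BSD date (macOS)
}

epoch_to_date 1742409172.0   # prints 2025-03-19
```

Passing -u keeps the result in UTC, matching the created_utc field, so the date does not shift with the local time zone.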

Rate Limiting

Reddit may rate-limit requests, typically by answering with HTTP 429 (Too Many Requests). If requests start failing:

Add delays between requests
Use a descriptive User-Agent
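
Adding delays can be automated with a retry wrapper. The sketch below is one possible approach, not a Reddit-recommended policy: with_retry is a hypothetical helper, and the attempt count and doubling delay are arbitrary choices.

```shell
# Hypothetical helper: run a command, pausing longer after each failure.
# Usage: with_retry MAX_ATTEMPTS INITIAL_DELAY_SECONDS command [args...]
with_retry() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    "$@" && return 0                        # success: stop retrying
    [ "$attempt" -ge "$max" ] && return 1   # out of attempts: give up
    sleep "$delay"
    delay=$((delay * 2))                    # double the pause each round
    attempt=$((attempt + 1))
  done
}

# Example call (commented out so the snippet makes no network request).
# curl -f turns HTTP errors such as 429 into a non-zero exit status.
# with_retry 4 2 curl -sf -H "User-Agent: agentbot/1.0" \
#   "https://www.reddit.com/r/SUBREDDIT/comments/ID/TITLE.json"
```

With 4 attempts and an initial delay of 2 seconds, the wrapper waits 2, 4, then 8 seconds between tries before giving up.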