Archive Articles with Confidence. Share without Limits.

A command-line tool that solves content preservation challenges through ethical scraping.

Example command-line usage
$ ./capcat bundle tech --count 30

Fetching from 3 sources in bundle 'tech'
◯ STARTING HACKER NEWS (30 ITEMS)
◯ STARTING LOBSTERS (30 ITEMS)
◯ STARTING IEEE SPECTRUM (30 ITEMS)

✓ 90 articles saved successfully
Saved to: ../News/news_2025-11-24/

Defined with natural language processing. Built with large language models.
A Minimum Viable Product.


The Problem: Information Overload and Inefficient Recall

Browser Inefficiencies

Twenty tabs open while researching. Days later, you can't remember which article contained the specific information you need.

Lost Context

Bookmarks give you link lists but no context. Search history shows URLs but not content summaries.

Disappearing Content

A Harvard Law School study found that 25% of links become inaccessible over time. When content disappears, the recall problem compounds.

Two Complete Interfaces, One Powerful Backend

Capcat offers two modes optimized for different workflows. Both share a unified backend for consistent, reliable results.

Command-Line Mode

Fast, scriptable automation for power users. Perfect for daily routines, cron jobs, and integration with existing workflows.

Interactive Menu (TUI)

Visual, guided exploration for discovering sources and testing workflows. No memorization required: all options are visible at once.

Bulk RSS Fetching

Archive from multiple sources simultaneously. Use predefined bundles (tech, news, science, AI) or create custom selections.
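
To illustrate why fetching sources simultaneously pays off, here is a conceptual sketch of the fan-out pattern in plain shell. It is not Capcat's implementation: `fetch_source` and `fetch_all_parallel` are hypothetical names, and the one-second `sleep` stands in for a network fetch.

```shell
# Conceptual sketch only. fetch_source is a hypothetical stand-in that
# simulates one source taking about a second; it is not part of Capcat.
fetch_source() {
    sleep 1
    echo "fetched: $1"
}

fetch_all_parallel() {
    # Start one background job per source, then wait for all of them.
    for src in "$@"; do
        fetch_source "$src" &
    done
    wait
}

# Example (not executed here):
#   fetch_all_parallel "Hacker News" "Lobsters" "IEEE Spectrum"
```

With three simulated sources, the parallel version finishes in roughly the time of the slowest single fetch instead of the sum of all three. Output lines may arrive in any order.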

Local Markdown Storage

Permanent archives in Markdown format. Integrate seamlessly with Obsidian, Notion, or any note-taking system.

HTML Generation

Optional HTML output with customizable themes. Color-coded sources, visual hierarchy, shareable archives.

Offline Accessibility

Once fetched, content stays on your local disk and remains accessible. No dependency on live websites or internet connectivity.


How Capcat Works

  1. Choose Your Interface

    Start with CLI for speed or TUI for visual exploration. Both provide complete functionality.

  2. Select Sources

    Pick from 11 configured sources (Hacker News, BBC, Guardian, Nature, etc.) or use predefined bundles.

  3. Parallel Fetching

    Articles download simultaneously from all sources. Up to 3× faster than sequential processing.

  4. Organized Storage

    Automatic date-based folder structure. Markdown files with front matter, images preserved, optional HTML.

  5. Search & Recall

    Local searchability across your entire archive. Visual scanning with HTML themes.
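
The storage and recall steps above can be sketched with standard shell tools. This is a minimal sketch, assuming archives sit under a root such as `../News` in folders named `news_YYYY-MM-DD`, as in the sample output earlier. The function names `archive_dir_for` and `search_archive` are hypothetical helpers, not Capcat commands.

```shell
# Assumption: archives live under a root such as ../News, in folders named
# news_YYYY-MM-DD, matching the sample output. Function names are hypothetical.

archive_dir_for() {
    # $1 = archive root, $2 = date as YYYY-MM-DD (defaults to today)
    echo "$1/news_${2:-$(date +%Y-%m-%d)}"
}

search_archive() {
    # $1 = archive root, $2 = search term
    # Lists Markdown files that contain the term, ignoring case.
    find "$1" -type f -name '*.md' -exec grep -il -- "$2" {} +
}

# Example (not executed here):
#   archive_dir_for ../News 2025-11-24
#   search_archive ../News "memory safety"
```

Because the archive is plain Markdown on disk, any text search tool works the same way. `find` with `grep` is used here only because both are available on most systems.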

Why Two Modes?

TUI Mode: Discovery & Learning

  • Visual source browsing
  • Checkbox multi-selection
  • Guided RSS source wizard
  • Confirmation summaries
  • Multi-level TUI menu

Best for: Ease of use, one-off tasks, testing new sources.

CLI Mode: Automation & Speed

  • Fast commands from muscle memory
  • Shell aliases and scripts
  • Custom article counts
  • Verbose debugging flags
  • Exit codes for error handling

Best for: Daily automation, power users, developers integrating with existing workflows.
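
For scripted or scheduled runs, exit codes can drive simple error handling. The wrapper below is a sketch under two assumptions: Capcat follows the usual convention of exit code 0 for success and non-zero for failure, and `run_and_report`, `CAPCAT_BIN`, and the log format are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper for scripts or cron: run a command, report the outcome.
# Assumption: exit code 0 means success, anything else means failure.
CAPCAT_BIN="${CAPCAT_BIN:-./capcat}"

run_and_report() {
    # Run the given command, print a status line, and return its exit code.
    "$@"
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "OK: $*"
    else
        echo "FAILED (exit $status): $*" >&2
    fi
    return "$status"
}

# Example (not executed here):
#   run_and_report "$CAPCAT_BIN" bundle tech --count 30
```

Returning the original exit code lets a calling script, or a scheduler that alerts on failures, react without parsing output.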


Tutorials & Documentation

Tutorials for mastering both CLI and interactive menu modes.

Sources Ready to Archive

Choose from preconfigured sources across technology, news, science, and AI, or add your own RSS feeds.

Tech Pro

  • Hacker News
  • Lobsters

Tech

  • IEEE Spectrum
  • Mashable

News

  • BBC News
  • The Guardian

Science

  • Nature
  • Scientific American

Add Your Own Sources

Use the interactive wizard to add any RSS feed in minutes. Test connectivity, validate content, and start archiving immediately.

Capcat mascot catching a loading ball from progress bar

Illustration by Stayu Kasabov | Stayux.com

Ready to Start Archiving?

1. Clone Repository

git clone https://github.com/stayukasabov/capcat.git
cd capcat

2. Setup Dependencies

./scripts/fix_dependencies.sh

3. Start Archiving

./capcat catch