A command-line tool that preserves web content through ethical scraping.
$ ./capcat bundle tech --count 30
Fetching from 3 sources in bundle 'tech'
◯ STARTING HACKER NEWS (30 ITEMS)
◯ STARTING LOBSTERS (30 ITEMS)
◯ STARTING IEEE SPECTRUM (30 ITEMS)
✓ 90 articles saved successfully
Saved to: ../News/news_2025-11-24/
Twenty tabs open while researching. Days later, you can't remember which article contained the specific information you need.
Bookmarks give you link lists but no context. Search history shows URLs but not content summaries.
A Harvard Law School study shows that 25% of links become inaccessible. When content disappears, the recall problem compounds.
Capcat offers two modes optimized for different workflows. Both share a unified backend for consistent, reliable results.
Fast, scriptable automation for power users. Perfect for daily routines, cron jobs, and integration with existing workflows.
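For scheduled runs, a small wrapper along these lines may be enough. This is a minimal sketch, not part of Capcat itself: the function name `capcat_daily` and the `CAPCAT_DIR` variable are hypothetical, and the only Capcat command it uses is the `bundle ... --count` invocation shown in the sample session above.

```shell
#!/bin/sh
# Hypothetical wrapper for scheduled runs (not shipped with Capcat).
# CAPCAT_DIR is an assumed location of your checkout; adjust as needed.
CAPCAT_DIR="${CAPCAT_DIR:-$HOME/capcat}"

# Run one bundle fetch. Arguments: bundle name (default: tech), item count (default: 30).
# Mirrors the sample command: ./capcat bundle tech --count 30
capcat_daily() {
    bundle="${1:-tech}"
    count="${2:-30}"
    # ./capcat is invoked relative to the checkout, so change into it first.
    cd "$CAPCAT_DIR" || return 1
    ./capcat bundle "$bundle" --count "$count"
}

# Example crontab entry (standard cron syntax; the script path is a placeholder):
# 0 7 * * * /path/to/this-script.sh
```

To use it from cron, add a call such as `capcat_daily tech 30` as the last line of the script and point the crontab entry at the script file.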
Visual, guided exploration for discovering sources and testing workflows. No memorization required: all options are visible at once.
Archive from multiple sources simultaneously. Use predefined bundles (tech, news, science, AI) or create custom selections.
Permanent archives in Markdown format. Integrate seamlessly with Obsidian, Notion, or any note-taking system.
Optional HTML output with customizable themes. Color-coded sources, visual hierarchy, shareable archives.
Once fetched, content stays available on your own disk. No dependency on live websites or internet connectivity.
Start with CLI for speed or TUI for visual exploration. Both provide complete functionality.
Pick from 11 configured sources (Hacker News, BBC, Guardian, Nature, etc.) or use predefined bundles.
Articles download simultaneously from all sources. 3× faster than sequential processing.
Automatic date-based folder structure. Markdown files with front matter, images preserved, optional HTML.
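The sample session above saves to `../News/news_2025-11-24/`, which suggests a `news_YYYY-MM-DD` folder per day. Assuming that pattern holds (it is inferred from that one line of output, not documented behavior), a script could locate a given day's folder as follows. The names `archive_dir_for` and `ARCHIVE_ROOT` are hypothetical.

```shell
# Print the expected archive folder for a given day (default: today).
# Assumption: folders are named news_YYYY-MM-DD under the archive root,
# as inferred from the "Saved to:" line in the sample output.
archive_dir_for() {
    day="${1:-$(date +%Y-%m-%d)}"
    printf '%s/news_%s\n' "${ARCHIVE_ROOT:-../News}" "$day"
}
```

For example, `ls "$(archive_dir_for)"` would list today's folder if a fetch has already run and the naming assumption is correct.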
Local searchability across your entire archive. Visual scanning with HTML themes.
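Because the archive is plain Markdown on disk, ordinary shell tools can search it. The sketch below uses only standard `find` and `grep`, not a Capcat feature. The function name `search_archive` is hypothetical, and the default `ARCHIVE_DIR` is taken from the sample output path.

```shell
# Assumed archive location, taken from the sample "Saved to:" path; adjust as needed.
ARCHIVE_DIR="${ARCHIVE_DIR:-../News}"

# List Markdown files in the archive that mention a term (case-insensitive).
# Only *.md files are searched, so optional HTML copies are skipped.
search_archive() {
    term="$1"
    find "$ARCHIVE_DIR" -type f -name '*.md' -exec grep -li -- "$term" {} + | sort
}
```

A call such as `search_archive "link rot"` prints one matching file path per line and prints nothing when no article mentions the term.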
Best for: Ease of use, one-off tasks, testing new sources.
Best for: Daily automation, power users, developers integrating with existing workflows.
Tutorials for mastering both CLI and interactive menu modes.
Preconfigured sources across technology, news, science, and AI, or add your own RSS feeds.
Use the interactive wizard to add any RSS feed in minutes. Test connectivity, validate content, and start archiving immediately.
git clone https://github.com/stayukasabov/capcat.git
cd capcat
./scripts/fix_dependencies.sh
./capcat catch