StillHammer 40138c2d45 Initial implementation: Feed Generator V1
Complete Python implementation with strict type safety and best practices.

Features:
- RSS/Atom/HTML web scraping
- GPT-4 Vision image analysis
- Node.js API integration
- RSS/JSON feed publishing

Modules:
- src/config.py: Configuration with strict validation
- src/exceptions.py: Custom exception hierarchy
- src/scraper.py: Multi-format news scraping (RSS/Atom/HTML)
- src/image_analyzer.py: GPT-4 Vision integration with retry
- src/aggregator.py: Content aggregation and filtering
- src/article_client.py: Node.js API client with retry
- src/publisher.py: RSS/JSON feed generation
- scripts/run.py: Complete pipeline orchestrator
- scripts/validate.py: Code quality validation

Code Quality:
- 100% type hint coverage (mypy strict mode)
- Zero bare except clauses
- Logger throughout (no print statements)
- Comprehensive test suite (598 lines)
- Immutable dataclasses (frozen=True)
- Explicit error handling
- Structured logging

Stats:
- 1,431 lines of source code
- 598 lines of test code
- 15 Python files
- 8 core modules
- 4 test suites

All validation checks pass.

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-07 22:28:18 +08:00
scripts Initial implementation: Feed Generator V1 2025-10-07 22:28:18 +08:00
src Initial implementation: Feed Generator V1 2025-10-07 22:28:18 +08:00
tests Initial implementation: Feed Generator V1 2025-10-07 22:28:18 +08:00
.env.example Initial implementation: Feed Generator V1 2025-10-07 22:28:18 +08:00
.gitignore Initial implementation: Feed Generator V1 2025-10-07 22:28:18 +08:00
ARCHITECTURE.md Initial implementation: Feed Generator V1 2025-10-07 22:28:18 +08:00
CLAUDE.md Initial implementation: Feed Generator V1 2025-10-07 22:28:18 +08:00
mypy.ini Initial implementation: Feed Generator V1 2025-10-07 22:28:18 +08:00
pyproject.toml Initial implementation: Feed Generator V1 2025-10-07 22:28:18 +08:00
QUICKSTART.md Initial implementation: Feed Generator V1 2025-10-07 22:28:18 +08:00
README.md Initial implementation: Feed Generator V1 2025-10-07 22:28:18 +08:00
requirements.txt Initial implementation: Feed Generator V1 2025-10-07 22:28:18 +08:00
SETUP.md Initial implementation: Feed Generator V1 2025-10-07 22:28:18 +08:00
STATUS.md Initial implementation: Feed Generator V1 2025-10-07 22:28:18 +08:00

Feed Generator

AI-powered content aggregation system that scrapes news, analyzes images, and generates articles.

Project Status

  • Structure Complete - All modules implemented with strict type safety
  • Type Hints - 100% coverage on all functions
  • Tests - Comprehensive test suite for core modules
  • Documentation - Full docstrings and inline documentation

Architecture

Web Sources → Scraper → Image Analyzer → Aggregator → Node API Client → Publisher
     ↓           ↓            ↓              ↓              ↓              ↓
   HTML      NewsArticle  AnalyzedArticle  Prompt    GeneratedArticle  Feed/RSS
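
The data flow above could be modeled with frozen dataclasses, one per pipeline stage. The following is a minimal sketch, not the project's actual code: the type names NewsArticle and AnalyzedArticle come from the diagram, but every field name and the build_prompt helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class NewsArticle:
    """Scraper output. All field names here are assumed, not taken from src/."""

    title: str
    url: str
    published_at: datetime
    image_url: Optional[str] = None


@dataclass(frozen=True)
class AnalyzedArticle:
    """Image Analyzer output: the scraped article plus an optional image description."""

    article: NewsArticle
    image_description: Optional[str] = None


def build_prompt(analyzed: AnalyzedArticle) -> str:
    """Hypothetical Aggregator step: turn an analyzed article into a prompt string."""
    parts = [
        f"Title: {analyzed.article.title}",
        f"Source: {analyzed.article.url}",
    ]
    # Only mention the image when the analyzer produced a description.
    if analyzed.image_description:
        parts.append(f"Image: {analyzed.image_description}")
    return "\n".join(parts)
```

Because each stage wraps the previous stage's value instead of mutating it, a later stage cannot change what an earlier stage produced. That is the property frozen=True is meant to enforce.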

Modules

  • src/config.py - Configuration management with strict validation
  • src/exceptions.py - Custom exception hierarchy
  • src/scraper.py - Web scraping (RSS/Atom/HTML)
  • src/image_analyzer.py - GPT-4 Vision image analysis
  • src/aggregator.py - Content aggregation and prompt generation
  • src/article_client.py - Node.js API client
  • src/publisher.py - RSS/JSON publishing
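
The commit message notes that the image analyzer and the Node.js API client both retry failed calls. One common way to do that is exponential backoff, sketched below. The names call_with_retry and APIClientError, the choice of ConnectionError as the retryable failure, and the default attempt count and delay are all assumptions. The actual retry policy in src/ may differ.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger(__name__)


class APIClientError(Exception):
    """Hypothetical error raised once every attempt has failed."""


def call_with_retry(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on ConnectionError with exponential backoff.

    The wait doubles after each failure: base_delay, 2 * base_delay, and so on.
    The sleep function is injectable so tests do not have to wait.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError as exc:
            if attempt == max_attempts:
                raise APIClientError(f"gave up after {attempt} attempts") from exc
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning(
                "Attempt %d failed (%s); retrying in %.1fs", attempt, exc, delay
            )
            sleep(delay)
    # The loop always returns or raises; this line only satisfies the type checker.
    raise AssertionError("unreachable")
```

Only the named exception type is retried. Any other exception propagates immediately, which keeps the "no bare except" rule intact.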

Installation

# Create virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env with your API keys

Configuration

Required environment variables in .env:

OPENAI_API_KEY=sk-your-key-here
NODE_API_URL=http://localhost:3000
NEWS_SOURCES=https://techcrunch.com/feed,https://example.com/rss

See .env.example for all options.
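
As a rough illustration of what "strict validation" of these variables could look like, the sketch below reads the three required variables into a frozen dataclass and fails fast on missing or malformed values. The Config, ConfigurationError, and load_config names are assumptions and may not match src/config.py. The sketch also assumes the variables are already present in the process environment, and does not show how .env is loaded.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Tuple
from urllib.parse import urlparse


class ConfigurationError(Exception):
    """Hypothetical error for missing or invalid configuration."""


@dataclass(frozen=True)
class Config:
    openai_api_key: str
    node_api_url: str
    news_sources: Tuple[str, ...]


def _require(env: Mapping[str, str], name: str) -> str:
    """Return a non-empty variable or raise a descriptive error."""
    value = env.get(name, "").strip()
    if not value:
        raise ConfigurationError(f"Missing required environment variable: {name}")
    return value


def _validate_url(url: str, name: str) -> str:
    """Accept only absolute http(s) URLs."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ConfigurationError(f"{name} contains an invalid URL: {url!r}")
    return url


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Build a validated Config from a mapping of environment variables."""
    raw_sources = _require(env, "NEWS_SOURCES")
    # NEWS_SOURCES is a comma-separated list; ignore empty entries and whitespace.
    sources = tuple(
        _validate_url(item.strip(), "NEWS_SOURCES")
        for item in raw_sources.split(",")
        if item.strip()
    )
    return Config(
        openai_api_key=_require(env, "OPENAI_API_KEY"),
        node_api_url=_validate_url(_require(env, "NODE_API_URL"), "NODE_API_URL"),
        news_sources=sources,
    )
```

Passing the environment in as a mapping keeps the loader free of global state and makes it straightforward to test with a plain dict.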

Usage

# Run the pipeline
python scripts/run.py

Output files:

  • output/feed.rss - RSS 2.0 feed
  • output/articles.json - JSON export
  • feed_generator.log - Execution log
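
To give a sense of what the two output files contain, here is a minimal sketch that serializes articles to RSS 2.0 and to JSON using only the standard library. The GeneratedArticle fields and the build_rss and build_json helpers are assumptions for illustration. The real publisher in src/publisher.py may use a different library and emit more elements.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass
from typing import Sequence


@dataclass(frozen=True)
class GeneratedArticle:
    """Assumed shape of a generated article; field names are hypothetical."""

    title: str
    link: str
    content: str


def build_rss(
    articles: Sequence[GeneratedArticle],
    feed_title: str,
    feed_link: str,
    feed_description: str,
) -> str:
    """Return an RSS 2.0 document as a string (no XML declaration)."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link, and description on the channel.
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_description
    for article in articles:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = article.title
        ET.SubElement(item, "link").text = article.link
        ET.SubElement(item, "description").text = article.content
    # ElementTree escapes characters such as & and < when serializing.
    return ET.tostring(rss, encoding="unicode")


def build_json(articles: Sequence[GeneratedArticle]) -> str:
    """Return the articles as a JSON array of objects."""
    return json.dumps([asdict(a) for a in articles], indent=2, ensure_ascii=False)
```

The returned strings could then be written to output/feed.rss and output/articles.json respectively.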

Type Checking

# Run mypy to verify type safety
mypy src/

# Should pass with zero errors

Testing

# Run all tests
pytest tests/ -v

# With coverage
pytest tests/ --cov=src --cov-report=html

Code Quality Checks

All code follows strict Python best practices:

  • Type hints on ALL functions
  • No bare except: clauses
  • Logger instead of print()
  • Explicit error handling
  • Immutable dataclasses
  • No global state
  • No magic strings (use Enums)
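
Several of these rules can be seen together in one small function. The sketch below is illustrative only: FeedFormat, ScrapingError, and detect_format are hypothetical names, and the content-type mapping is deliberately incomplete (many real feeds are served as application/xml or text/xml).

```python
import logging
from enum import Enum

logger = logging.getLogger(__name__)


class FeedFormat(Enum):
    """Enum members replace magic strings such as "rss" scattered through the code."""

    RSS = "rss"
    ATOM = "atom"
    HTML = "html"


class ScrapingError(Exception):
    """Hypothetical member of a custom exception hierarchy."""


# Assumed mapping from media type to format; not exhaustive.
_MEDIA_TYPES = {
    "application/rss+xml": FeedFormat.RSS,
    "application/atom+xml": FeedFormat.ATOM,
    "text/html": FeedFormat.HTML,
}


def detect_format(content_type: str) -> FeedFormat:
    """Map a Content-Type header value to a FeedFormat.

    Raises ScrapingError for unsupported types instead of returning None.
    """
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    try:
        return _MEDIA_TYPES[media_type]
    except KeyError as exc:  # a specific exception, never a bare except
        logger.error("Unsupported content type: %s", content_type)
        raise ScrapingError(f"Unsupported content type: {content_type!r}") from exc
```

The function is fully annotated, logs through a module-level logger rather than print(), and converts the low-level KeyError into a domain exception with the original attached as its cause.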

Documentation

  • ARCHITECTURE.md - Technical design and data flow
  • CLAUDE.md - Development guidelines and rules
  • SETUP.md - Detailed installation guide

Development

This is a V1 prototype built for speed while maintaining quality:

  • Type Safety: Full mypy compliance
  • Testing: Unit tests for core modules
  • Error Handling: Explicit exceptions throughout
  • Logging: Structured logging at all stages
  • Configuration: Externalized, validated config

Next Steps

  1. Install dependencies: pip install -r requirements.txt
  2. Configure .env file with API keys
  3. Run type checking: mypy src/
  4. Run tests: pytest tests/
  5. Execute pipeline: python scripts/run.py

License

Proprietary - Internal use only