Complete Python implementation with strict type safety and best practices.
Features:
- RSS/Atom/HTML web scraping
- GPT-4 Vision image analysis
- Node.js API integration
- RSS/JSON feed publishing
Modules:
- src/config.py: Configuration with strict validation
- src/exceptions.py: Custom exception hierarchy
- src/scraper.py: Multi-format news scraping (RSS/Atom/HTML)
- src/image_analyzer.py: GPT-4 Vision integration with retry
- src/aggregator.py: Content aggregation and filtering
- src/article_client.py: Node.js API client with retry
- src/publisher.py: RSS/JSON feed generation
- scripts/run.py: Complete pipeline orchestrator
- scripts/validate.py: Code quality validation
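The "with retry" behaviour named for the image analyzer and API client above could look roughly like the sketch below. It is illustrative only, not the repository's code: `PipelineError`, `TransientError`, and `call_with_retry` are hypothetical names, and exponential backoff is an assumed policy.

```python
import logging
import time
from typing import Callable, TypeVar

logger = logging.getLogger(__name__)
T = TypeVar("T")


class PipelineError(Exception):
    """Hypothetical base class for all pipeline errors."""


class TransientError(PipelineError):
    """Hypothetical error for failures worth retrying (e.g. timeouts)."""


def call_with_retry(
    func: Callable[[], T], attempts: int = 3, base_delay: float = 0.5
) -> T:
    """Call func, retrying on TransientError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except TransientError as exc:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning(
                "Attempt %d/%d failed: %s; retrying in %.2fs",
                attempt, attempts, exc, delay,
            )
            time.sleep(delay)
    raise AssertionError("unreachable")  # satisfies strict return checking
```

Only the narrow `TransientError` is caught, so programming errors and non-retryable failures propagate immediately instead of being swallowed.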
Code Quality:
- 100% type hint coverage (mypy strict mode)
- Zero bare except clauses
- Logger throughout (no print statements)
- Comprehensive test suite (598 lines)
- Immutable dataclasses (frozen=True)
- Explicit error handling
- Structured logging
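As a minimal sketch of the immutable-dataclass and explicit-error-handling points above: the `Article` type and its fields are hypothetical and not taken from the repository.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Article:
    """Hypothetical record type; field names are illustrative only."""

    title: str
    url: str

    def __post_init__(self) -> None:
        # Validate at construction time so invalid instances never exist.
        if not self.url.startswith(("http://", "https://")):
            raise ValueError(f"invalid article URL: {self.url!r}")


article = Article(title="Example", url="https://example.com/a")
try:
    article.title = "Changed"  # type: ignore[misc]
except FrozenInstanceError:
    # frozen=True rejects attribute assignment after construction.
    pass
```

Because instances cannot be mutated, they can be passed between pipeline stages without defensive copying.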
Stats:
- 1,431 lines of source code
- 598 lines of test code
- 15 Python files
- 8 core modules
- 4 test suites
All validation checks pass.
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
# Core dependencies
requests==2.31.0
beautifulsoup4==4.12.2
lxml==5.1.0
openai==1.12.0

# Utilities
python-dotenv==1.0.0
feedgen==1.0.0
python-dateutil==2.8.2

# Testing
pytest==7.4.3
pytest-cov==4.1.0

# Type checking
mypy==1.8.0
types-requests==2.31.0.20240125