From a46db74ff62fb534304ae4fbb98819077a4d6ad8 Mon Sep 17 00:00:00 2001
From: Matteo Rosati
Date: Fri, 1 May 2026 00:02:25 +0200
Subject: [PATCH] update python skill

---
 python-dev/REFERENCE.md |  100 ++++
 python-dev/SKILL.md     | 1016 ++++-----------------------------------
 2 files changed, 190 insertions(+), 926 deletions(-)
 create mode 100644 python-dev/REFERENCE.md

diff --git a/python-dev/REFERENCE.md b/python-dev/REFERENCE.md
new file mode 100644
index 0000000..9cec49c
--- /dev/null
+++ b/python-dev/REFERENCE.md
@@ -0,0 +1,100 @@
+# Python Reference
+
+Use this file only when deeper guidance is needed. `SKILL.md` is the default operating contract.
+
+## When To Consult This File
+
+- You are choosing between multiple Python design options.
+- You need stronger guidance for typing, validation, async, or testing.
+- You are creating new modules or new project structure rather than making a local edit.
+- You are reviewing code and want a more complete checklist.
+
+## Stronger Typing Guidance
+
+- Prefer `collections.abc` and `typing` abstractions such as `Sequence`, `Mapping`, `Callable`, and `Iterator`.
+- Use `Protocol` for behavioral contracts when inheritance is not required.
+- Use `TypedDict` for structured dictionary payloads when a dataclass or Pydantic model is not appropriate.
+- Use `Final` for constants and `ClassVar` for class-level attributes when helpful.
+- Keep `Any` rare, local, and justified.
+- If the project supports it, use `@override` on overriding methods.
+
+## Architecture Heuristics
+
+- Domain layer: pure business rules, no framework or persistence code.
+- Repository or data layer: database and external service access.
+- Service or use-case layer: orchestration across domain rules and collaborators.
+- Interface layer: HTTP, CLI, events, serialization, request and response shaping.
+- Infrastructure layer: configuration, logging setup, DB sessions, clients.
+
+Use this split when it fits the repository. Do not impose it mechanically on smaller scripts or simpler codebases.
+
+## Data Modeling
+
+- Use Pydantic for request payloads, config, and data from untrusted sources.
+- Use dataclasses for internal value objects and simple structured state.
+- Use `frozen=True` for values that should not change after creation.
+- Use `slots=True` where the codebase already prefers it or where many instances are created.
+
+## Error Handling
+
+- Prefer project-specific exception types over generic `ValueError` or `RuntimeError` when a domain concept exists.
+- Wrap lower-level exceptions with domain context when crossing layers.
+- Keep exception handling near boundaries unless recovery is local and explicit.
+
+## Logging
+
+- Prefer structured logging when the project supports it.
+- Log important operational events at service or interface boundaries.
+- Do not log secrets, credentials, or sensitive user data.
+- Avoid `print()` for operational behavior unless the program is explicitly a simple script or CLI output path.
+
+## Async Guidance
+
+- Use async for I/O-bound workflows, not CPU-bound work.
+- Keep blocking calls out of coroutines; use thread or process offloading where appropriate.
+- Prefer `asyncio.TaskGroup` over ad hoc task orchestration when available.
+- Be explicit about sync-to-async boundaries.
+
+## Security Guidance
+
+- Never hardcode secrets or credentials.
+- Validate all external input before it enters core logic.
+- Avoid dangerous calls unless explicitly justified and safely constrained:
+  - `eval`, `exec`
+  - `pickle.loads`
+  - `yaml.load`
+  - `subprocess` with `shell=True`
+  - `os.system`
+- Do not interpolate raw user input into SQL, shell commands, or file-system-sensitive operations.
+
+## Testing Guidance
+
+- Update tests whenever behavior changes.
+- Prefer unit tests for pure logic and integration tests for boundaries.
+- Use parametrization to improve coverage without duplicating setup.
+- Test behavior and observable outcomes rather than implementation details.
+- Mock external boundaries selectively; avoid mocking the core logic you actually want to verify.
+
+## Review Heuristics
+
+When reviewing Python changes, prioritize:
+
+- correctness and regressions
+- incomplete or misleading typing
+- invalid assumptions at trust boundaries
+- misplaced logic between layers
+- unnecessary abstraction or indirection
+- missing tests for changed behavior
+- risky calls, secret handling, or unsafe subprocess use
+
+## Common Verification Flow
+
+If the repository uses `uv`, the usual flow is:
+
+1. `uv run ruff format `
+2. `uv run ruff check --fix `
+3. `uv run ruff check `
+4. `uv run mypy `
+5. `uv run pytest `
+
+If the repository uses different commands or tools, follow the repository.
diff --git a/python-dev/SKILL.md b/python-dev/SKILL.md
index 3540c54..e5b896c 100644
--- a/python-dev/SKILL.md
+++ b/python-dev/SKILL.md
@@ -1,932 +1,96 @@
 ---
 name: python-dev
 description: |
-  Apply this skill whenever writing, editing, generating, or reviewing Python code — including new .py files, modifications to existing Python modules, writing classes/functions/scripts, or any task that produces Python source. Enforces enterprise-grade Python: OOP, strict static typing, idiomatic/pythonic style, separation of concerns, DRY/KISS, security, testing, structured logging, and ruff+mypy-clean output.
-version: 2.0.0
+  Apply this skill whenever writing, editing, generating, or reviewing Python code. Favor small, typed, idiomatic changes that fit the existing codebase and finish with lint, type checks, and relevant tests passing. Use `REFERENCE.md` only when deeper guidance is needed.
+version: 3.1.0
 user-invocable: false
 ---
 
-# Python Development — Enterprise Standards
-
-Apply every rule below **every time** you write or touch Python code. These are non-negotiable defaults.
- ---- - -## 0 — Fetch Current Docs First - -Before writing code that uses any third-party library or stdlib module you are not 100% certain about, use the **context7 MCP** to pull the current documentation: - -``` -mcp__plugin_context7_context7__resolve-library-id → mcp__plugin_context7_context7__query-docs -``` - -Always do this for: -- Third-party packages (pydantic, sqlalchemy, fastapi, httpx, click, typer, attrs, structlog, etc.) -- Any stdlib module whose API has changed between Python versions (e.g. `asyncio`, `pathlib`, `importlib`, `tomllib`) -- Any time you are unsure of the exact call signature, context-manager protocol, or configuration schema - ---- - -## 1 — Static Typing (Strict) - -- Every function, method, and class attribute **must** have type annotations — no exceptions. -- Use `from __future__ import annotations` at the top of every module (enables PEP 563 deferred evaluation, avoids forward-reference strings). -- Use `typing` / `collections.abc` generics: `Sequence`, `Mapping`, `Callable`, `Iterator`, `Generator`, `Awaitable`, `Coroutine`, etc. — **not** bare `list`, `dict`, `tuple` as annotations. -- Use `TypeVar`, `ParamSpec`, `TypeVarTuple`, `Protocol`, `TypedDict`, `NamedTuple`, `Literal`, `Final`, `ClassVar`, `Annotated` where appropriate. -- Use `X | Y` union syntax (PEP 604) instead of `Union[X, Y]`. -- Use `X | None` instead of `Optional[X]`. -- Never use `Any` except as a last resort; always with a `# type: ignore[...]` comment explaining why. -- `@override` (from `typing`) on all overriding methods (Python ≥ 3.12) or `typing_extensions`. -- `@dataclass(slots=True, frozen=True)` for immutable value objects; `@dataclass(slots=True)` for mutable ones. -- Move type-only imports under `TYPE_CHECKING` to avoid runtime overhead. 
-
-```python
-from __future__ import annotations
-
-from collections.abc import Sequence
-from typing import TYPE_CHECKING, Final, TypeVar
-
-if TYPE_CHECKING:
-    from mypackage.domain.models import User
-
-T = TypeVar("T")
-MAX_RETRIES: Final = 3
-
-def first(items: Sequence[T]) -> T | None:
-    return items[0] if items else None
-```
-
-### Mypy (Strict Mode — Mandatory)
-
-Run mypy after every change. Configure in `pyproject.toml`:
-
-```toml
-[tool.mypy]
-python_version = "3.12"
-strict = true
-warn_return_any = true
-warn_unused_ignores = true
-no_implicit_reexport = true
-plugins = ["pydantic.mypy"]  # if Pydantic is used
-```
-
-Iterate until `mypy` exits with code 0. Never suppress errors with `# type: ignore` without a comment explaining the reason.
-
----
-
-## 2 — OOP Principles
-
-- Model domain concepts as classes. Use inheritance only for genuine IS-A relationships; prefer composition.
-- Apply SOLID:
-  - **S** — one responsibility per class/function
-  - **O** — extend via subclassing or strategy, not by modifying existing code
-  - **L** — subtypes must be substitutable
-  - **I** — small, focused protocols/ABCs; never fat interfaces
-  - **D** — depend on abstractions (`Protocol` / ABC), not concretes
-- Use `abc.ABC` + `@abc.abstractmethod` for explicit contracts when subclassing is intended.
-- Use `Protocol` (structural typing) for duck-typed contracts — preferred over ABC for third-party types.
-- Expose only what is needed; prefix internal names with `_`.
-- `__slots__` on all classes instantiated many times (reduces per-instance memory ~40%).
-
-```python
-from __future__ import annotations
-
-import abc
-from typing import Protocol
-
-class Serializable(Protocol):
-    def to_dict(self) -> dict[str, object]: ...
-
-class BaseRepository[T](abc.ABC):
-    @abc.abstractmethod
-    def get(self, id: int) -> T | None: ...
-
-    @abc.abstractmethod
-    def save(self, entity: T) -> None: ...
-``` - ---- - -## 3 — Data Model: Pydantic vs dataclass vs attrs - -Choose the right tool for the data's role: - -| Scenario | Tool | -|---|---| -| Data crosses a trust boundary (API input, config file, user data) | **Pydantic v2** | -| Internal data structures (domain models, value objects, DTOs between layers) | **`@dataclass(slots=True, frozen=True)`** | -| Complex internal models needing composable validators, converters, and advanced slot support | **attrs** | - -**Pydantic v2** at trust boundaries: -```python -from pydantic import BaseModel, field_validator - -class CreateUserRequest(BaseModel): - name: str - email: str - - @field_validator("email") - @classmethod - def validate_email(cls, v: str) -> str: - if "@" not in v: - raise ValueError("invalid email") - return v.lower() -``` - -**Dataclass** for internal models (no I/O): -```python -from dataclasses import dataclass - -@dataclass(slots=True, frozen=True) -class UserId: - value: int -``` - ---- - -## 4 — Separation of Concerns - -Organise code into layers; never mix them in a single module or class: - -| Layer | Responsibility | -|---|---| -| **Domain / Models** | Pure business logic, no I/O | -| **Repository / Data** | Persistence, queries | -| **Service / Use-case** | Orchestrates domain + repos | -| **Interface / API** | HTTP, CLI, events — thin adapters | -| **Infrastructure** | DB sessions, HTTP clients, config | - -- No raw SQL / ORM calls in service or domain layers. -- No business logic in route handlers or CLI commands. -- Config (env vars, secrets) loaded once at startup, injected as typed Pydantic `BaseSettings` objects. 
- ---- - -## 5 — Idiomatic & Pythonic Constructs - -Always prefer: - -```python -# Comprehensions over manual loops -squares = [x**2 for x in range(10) if x % 2 == 0] - -# Generator expressions for lazy evaluation -total = sum(x**2 for x in big_sequence) - -# Walrus operator for assign-and-test -if chunk := file.read(8192): - process(chunk) - -# Context managers for all resources -with open(path) as fh: - data = fh.read() - -# Unpacking -first, *rest = items -a, b = b, a - -# enumerate / zip instead of index loops -for i, item in enumerate(items): - ... - -# dataclasses / NamedTuple over raw dicts for structured data -from dataclasses import dataclass - -@dataclass(slots=True, frozen=True) -class Point: - x: float - y: float - -# pathlib over os.path -from pathlib import Path -config = Path("~/.config/app.toml").expanduser() - -# f-strings for all string formatting -msg = f"Processing {count} items in {duration:.2f}s" - -# match/case (Python ≥ 3.10) for structural dispatch -match command: - case "quit": - raise SystemExit - case "help": - show_help() - case _: - handle(command) - -# functools.cache / lru_cache for pure, expensive computations -from functools import cache - -@cache -def fibonacci(n: int) -> int: - if n < 2: - return n - return fibonacci(n - 1) + fibonacci(n - 2) -``` - -Never use: -- `%` or `.format()` for string formatting (use f-strings) -- `os.path` (use `pathlib`) -- Mutable default arguments (`def f(x=[])`) — use `None` sentinel -- Bare `except:` — always name the exception -- `type(x) == Foo` — use `isinstance(x, Foo)` -- `lambda` assigned to a name — use `def` - ---- - -## 6 — DRY / KISS - -- Extract any logic repeated ≥ 2 times into a named function or method. -- Functions do one thing; if you need "and" to describe it, split it. -- Prefer flat over nested; early return / guard clauses over deep indentation. -- No speculative abstractions — generalise only when you have ≥ 2 concrete use-cases. 
-- No feature flags, backwards-compat shims, or dead code — delete it.
-
----
-
-## 7 — Module & Project Layout
-
-```
-src/
-  /
-    __init__.py  # public API only, explicit __all__
-    py.typed  # PEP 561 marker
-    domain/
-      models.py
-      exceptions.py
-    repository/
-      base.py
-      .py
-    service/
-      .py
-    interface/
-      api.py  # or cli.py
-    infrastructure/
-      db.py
-      config.py
-tests/
-  unit/
-  integration/
-  conftest.py
-pyproject.toml
-.pre-commit-config.yaml
-uv.lock
-```
-
-- `__init__.py` re-exports only the public API with explicit `__all__`.
-- Every package has a `py.typed` marker (PEP 561).
-- `pyproject.toml` is the single source of truth for metadata, deps, and tool config.
-- `uv.lock` is committed for applications (reproducible builds); omitted for libraries.
-
-### Dependency Management (uv)
-
-Use `uv` as the **sole** package manager and tool runner. **`pip` and `pip3` are banned — never reference them directly.**
-
-```bash
-uv add  # add production dependency
-uv add --dev  # add dev dependency
-uv sync  # install from lockfile
-uv run pytest  # run in managed venv
-uv run mypy src/  # run any tool via uv
-uv run ruff check .  # always prefix tool invocations with uv run
-```
-
-Every script, CLI tool, and package binary **must** be invoked via `uv run `, never called directly. This ensures the correct venv is always used.
-
-Separate production and dev dependencies in `pyproject.toml`:
-
-```toml
-[project]
-dependencies = ["fastapi>=0.111", "pydantic>=2.7", "structlog>=24.1"]
-
-[tool.uv]
-dev-dependencies = [
-    "pytest>=8",
-    "hypothesis>=6",
-    "mypy>=1.10",
-    "ruff>=0.4",
-    "bandit>=1.7",
-    "pip-audit>=2.7",
-]
-```
-
----
-
-## 8 — Ruff: Lint & Format (Mandatory)
-
-After writing or modifying any Python file, run ruff and **iterate until the output is clean**:
-
-### Step 1 — format
-```bash
-uv run ruff format
-```
-
-### Step 2 — lint (with auto-fix)
-```bash
-uv run ruff check --fix
-```
-
-### Step 3 — lint (final check, no auto-fix)
-```bash
-uv run ruff check
-```
-
-If Step 3 still reports errors:
-- Read each diagnostic carefully.
-- Fix the root cause in the source (do not suppress with `# noqa` unless there is a genuine reason).
-- Repeat from Step 1 until `ruff check` exits with code 0 and no output.
-
-### Recommended `pyproject.toml` ruff config:
-
-```toml
-[tool.ruff]
-line-length = 100
-target-version = "py312"
-
-[tool.ruff.lint]
-select = [
-    "E",  # pycodestyle errors
-    "W",  # pycodestyle warnings
-    "F",  # pyflakes
-    "I",  # isort
-    "N",  # pep8-naming
-    "UP",  # pyupgrade
-    "ANN",  # flake8-annotations (enforce type hints)
-    "B",  # flake8-bugbear
-    "C4",  # flake8-comprehensions
-    "SIM",  # flake8-simplify
-    "RUF",  # ruff-specific rules
-    "TCH",  # flake8-type-checking (move type-only imports into TYPE_CHECKING)
-    "PT",  # flake8-pytest-style
-    "ERA",  # eradicate (commented-out code)
-    "S",  # flake8-bandit (security)
-    "D",  # pydocstyle (docstring enforcement)
-    "PERF",  # perflint (performance anti-patterns)
-    "TRY",  # tryceratops (exception handling)
-    "PIE",  # flake8-pie (misc. good practices)
-]
-ignore = [
-    "D100",  # missing docstring in public module (optional at module level)
-    "D104",  # missing docstring in public package
-    "D203",  # conflicts with D211
-    "D213",  # conflicts with D212
-    "TRY003",  # allow long messages in exceptions
-]
-
-[tool.ruff.lint.pydocstyle]
-convention = "google"
-
-[tool.ruff.lint.per-file-ignores]
-"tests/**" = ["ANN", "S101", "D"]  # relax annotations/assert/docstrings in tests
-
-[tool.ruff.format]
-quote-style = "double"
-indent-style = "space"
-```
-
----
-
-## 9 — Type Checking with Mypy (Mandatory)
-
-After every change, run:
-
-```bash
-uv run mypy src/
-```
-
-Iterate until exit code 0. This is a **hard gate** — do not consider code complete until mypy is clean.
-
----
-
-## 10 — Docstrings
-
-Every **public** module, class, and function **must** have a docstring. Use Google style.
-
-```python
-def calculate_discount(price: float, rate: float) -> float:
-    """Calculate the discounted price.
-
-    Args:
-        price: The original price in the base currency.
-        rate: The discount rate as a fraction (e.g. 0.1 for 10%).
-
-    Returns:
-        The price after applying the discount.
-
-    Raises:
-        ValueError: If rate is not in [0, 1].
-    """
-    if not 0 <= rate <= 1:
-        raise ValueError(f"rate must be in [0, 1], got {rate!r}")
-    return price * (1 - rate)
-```
-
-Rules:
-- Summary line: one sentence, imperative mood, ≤ 79 chars, ends with `.`.
-- Omit docstrings on private methods (`_foo`) unless the logic is non-obvious.
-- Never duplicate the signature in the docstring — document intent, not mechanics.
-
----
-
-## 11 — Error Handling
-
-- Define domain-specific exception hierarchies rooted at a project base class.
-- Never silence exceptions with bare `except Exception: pass`.
-- Use `contextlib.suppress` only for genuinely expected, ignorable errors.
-- Attach context with `raise NewError("...") from original_exc`.
-- Log at the boundary (service/interface layer), not deep inside domain logic.
-- Keep `try` blocks minimal — only the line that can raise, not surrounding logic.
-
-```python
-class AppError(Exception):
-    """Base for all application errors."""
-
-class NotFoundError(AppError):
-    def __init__(self, entity: str, id: int) -> None:
-        super().__init__(f"{entity} with id={id} not found")
-        self.entity = entity
-        self.id = id
-```
-
----
-
-## 12 — Structured Logging (structlog)
-
-Use `structlog` for all logging. Never use `print()` for operational output.
-
-```python
-import structlog
-
-log = structlog.get_logger(__name__)
-
-def process_order(order_id: int) -> None:
-    log.info("processing_order", order_id=order_id)
-    try:
-        ...
-        log.info("order_processed", order_id=order_id)
-    except AppError as exc:
-        log.error("order_failed", order_id=order_id, error=str(exc))
-        raise
-```
-
-Configure structlog in `infrastructure/logging.py` to:
-- Emit **JSON** in production (`JSONRenderer`).
-- Emit human-readable output in development (`ConsoleRenderer`).
-- Bind request-id / trace-id into context at the request boundary.
-- Never log secrets, PII, or credentials.
-
-```python
-import structlog
-
-def configure_logging(*, json_output: bool) -> None:
-    processors: list[structlog.types.Processor] = [
-        structlog.contextvars.merge_contextvars,
-        structlog.processors.add_log_level,
-        structlog.processors.TimeStamper(fmt="iso"),
-        structlog.processors.StackInfoRenderer(),
-        structlog.processors.ExceptionRenderer(),
-    ]
-    if json_output:
-        processors.append(structlog.processors.JSONRenderer())
-    else:
-        processors.append(structlog.dev.ConsoleRenderer())
-
-    structlog.configure(processors=processors, wrapper_class=structlog.BoundLogger)
-```
-
-Log levels:
-- `debug` — detailed diagnostic info (dev only)
-- `info` — normal operational events
-- `warning` — unexpected but recoverable state
-- `error` — failure requiring attention; always include exception context
-- `critical` — system-level failure; triggers alerting
-
----
-
-## 13 — Immutability & Safety
-
-- Prefer immutable data: `@dataclass(frozen=True)`, `tuple` over `list` where mutation is not needed.
-- Use `Final` for module-level constants.
-- Avoid global mutable state; pass dependencies explicitly.
-- Thread-safety: document concurrency assumptions; use `threading.Lock` / `asyncio.Lock` when sharing state.
-- Use `functools.cached_property` for lazy-computed instance properties (not compatible with `__slots__` — choose one).
-
----
-
-## 14 — Async
-
-When writing async code:
-- `async def` for all I/O-bound operations.
-- Never call blocking I/O inside a coroutine — use `asyncio.to_thread` or an executor.
-- Use `asyncio.TaskGroup` (Python ≥ 3.11) for concurrent tasks, not bare `asyncio.gather`.
-- Annotate return types explicitly: `async def fetch(...) -> bytes:`.
-- Never mix sync and async code in the same layer without an explicit boundary.
-
-```python
-async def fetch_all(urls: list[str]) -> list[bytes]:
-    async with asyncio.TaskGroup() as tg:
-        tasks = [tg.create_task(fetch(url)) for url in urls]
-    return [t.result() for t in tasks]
-```
-
----
-
-## 15 — Security
-
-These rules are mandatory for all enterprise code:
-
-### Secrets
-- **Never hardcode** secrets, API keys, tokens, or credentials in source files or tests.
-- Load secrets from environment variables or a secrets manager (AWS Secrets Manager, HashiCorp Vault, Azure Key Vault).
-- Use Pydantic `BaseSettings` to load and validate config/secrets at startup:
-
-```python
-from pydantic_settings import BaseSettings
-
-class Settings(BaseSettings):
-    database_url: str
-    api_key: str
-
-    class Config:
-        env_file = ".env"
-        env_file_encoding = "utf-8"
-```
-
-### Input Validation
-- Validate **all** external input at the trust boundary with Pydantic before it enters the domain.
-- Never pass raw user input to shell commands, SQL queries, or file paths.
-
-### Dependency Scanning
-Run in CI:
-```bash
-uv run pip-audit  # check for known CVEs in dependencies
-uv run bandit -r src/ -ll  # SAST: flag high/medium severity issues
-```
-
-### Static Security Analysis
-The `"S"` ruff rule set (flake8-bandit) catches the most common SAST issues inline. Treat all `S` violations as errors.
-
----
-
-## 16 — Testing (pytest)
-
-Every piece of logic **must** have tests. Minimum requirements:
-
-### Structure
-```
-tests/
-  conftest.py  # shared fixtures
-  unit/  # fast, no I/O, pure functions/classes
-  integration/  # real DB, real HTTP (use fixtures for setup/teardown)
-```
-
-### Pytest conventions
-```python
-import pytest
-
-# Parametrize to cover multiple cases without duplication
-@pytest.mark.parametrize("input,expected", [
-    (0, 0),
-    (1, 1),
-    (10, 100),
-])
-def test_square(input: int, expected: int) -> None:
-    assert square(input) == expected
-
-# Use fixtures for shared setup
-@pytest.fixture
-def user_repo(db_session: Session) -> UserRepository:
-    return UserRepository(db_session)
-```
-
-### Property-based testing (Hypothesis)
-Use Hypothesis for any function with a non-trivial input space:
-
-```python
-from hypothesis import given, strategies as st
-
-@given(st.floats(min_value=0.0, max_value=1.0))
-def test_discount_never_negative(rate: float) -> None:
-    assert calculate_discount(100.0, rate) >= 0
-```
-
-### Rules
-- Test names: `test___`.
-- One logical assertion per test (use `pytest.approx` for floats).
-- Do **not** mock the database in integration tests — use a real test DB.
-- Minimum 80% line coverage; 100% for domain logic.
-- Run with: `uv run pytest --tb=short -q`
-
-### Coverage config
-```toml
-[tool.pytest.ini_options]
-addopts = "--strict-markers --tb=short"
-testpaths = ["tests"]
-
-[tool.coverage.run]
-source = ["src"]
-branch = true
-
-[tool.coverage.report]
-fail_under = 80
-show_missing = true
-```
-
----
-
-## 17 — Pre-commit Hooks
-
-Every project must have `.pre-commit-config.yaml`:
-
-```yaml
-repos:
-  - repo: https://github.com/astral-sh/ruff-pre-commit
-    rev: v0.4.0
-    hooks:
-      - id: ruff
-        args: [--fix]
-      - id: ruff-format
-
-  - repo: https://github.com/pre-commit/mirrors-mypy
-    rev: v1.10.0
-    hooks:
-      - id: mypy
-        additional_dependencies: [pydantic, types-all]
-
-  - repo: https://github.com/PyCQA/bandit
-    rev: 1.7.8
-    hooks:
-      - id: bandit
-        args: ["-ll", "-r", "src/"]
-```
-
-Install once per checkout: `uv run pre-commit install`
-
----
-
-## 18 — Observability (OpenTelemetry)
-
-For services (APIs, workers), instrument with OpenTelemetry:
-
-```python
-from opentelemetry import trace
-
-tracer = trace.get_tracer(__name__)
-
-def process_payment(order_id: int) -> None:
-    with tracer.start_as_current_span("process_payment") as span:
-        span.set_attribute("order.id", order_id)
-        ...
-```
-
-- Use semantic conventions for attribute names (`http.method`, `db.system`, etc.).
-- Auto-instrument frameworks (FastAPI, SQLAlchemy, httpx) via `opentelemetry-instrument`.
-- Export to an OTel Collector; never export directly to a vendor from application code.
-
----
-
-## 19 — Configuration Management (pydantic-settings)
-
-Load all configuration and secrets from the environment. Never read env vars ad-hoc with `os.environ.get()` scattered throughout code.
-
-```python
-from pydantic import SecretStr
-from pydantic_settings import BaseSettings, SettingsConfigDict
-
-class DatabaseSettings(BaseSettings):
-    url: str
-    pool_size: int = 10
-
-class Settings(BaseSettings):
-    model_config = SettingsConfigDict(env_file=".env", env_nested_delimiter="__")
-
-    debug: bool = False
-    database: DatabaseSettings
-    api_key: SecretStr  # redacted in repr() — never logged
-
-_settings: Settings | None = None
-
-def get_settings() -> Settings:
-    global _settings
-    if _settings is None:
-        _settings = Settings()
-    return _settings
-```
-
-Rules:
-- `SecretStr` for any sensitive field — it redacts itself in `repr()` and logs.
-- Commit `.env.example` (placeholder values) to the repo; never commit `.env`.
-- Fail fast: `Settings()` raises `ValidationError` at startup if required vars are missing.
-- Inject `Settings` as a dependency — never call `get_settings()` inside domain logic.
-
----
-
-## 20 — Concurrency Model Selection
-
-Choose the right tool:
-
-| Workload | Tool |
-|---|---|
-| I/O-bound (network, disk) | `asyncio` with `async/await` |
-| I/O-bound + threading simplicity | `concurrent.futures.ThreadPoolExecutor` |
-| CPU-bound (compute, data processing) | `concurrent.futures.ProcessPoolExecutor` |
-| Fire-and-forget background tasks | `asyncio.TaskGroup` or `ThreadPoolExecutor` |
-
-```python
-from concurrent.futures import ProcessPoolExecutor
-
-def cpu_intensive(data: bytes) -> bytes:
-    ...  # pure computation
-
-def process_all(items: list[bytes]) -> list[bytes]:
-    with ProcessPoolExecutor() as pool:
-        return list(pool.map(cpu_intensive, items))
-```
-
-- Prefer `concurrent.futures` over raw `threading.Thread` / `multiprocessing.Process`.
-- Use `asyncio.timeout()` (Python ≥ 3.11) for coroutine deadlines — preferred over `asyncio.wait_for`.
-- Use `asyncio.Semaphore` to rate-limit fan-out inside `TaskGroup`.
-- Never rely on the GIL for thread-safety — Python 3.13 free-threaded mode removes it.
- ---- - -## 21 — ExceptionGroup and `except*` (Python ≥ 3.11) - -`asyncio.TaskGroup` raises an `ExceptionGroup` when multiple tasks fail. Handle it correctly: - -```python -import asyncio - -async def main() -> None: - try: - async with asyncio.TaskGroup() as tg: - tg.create_task(might_fail_a()) - tg.create_task(might_fail_b()) - except* ValueError as eg: - for exc in eg.exceptions: - log.error("validation_failed", error=str(exc)) - except* IOError as eg: - for exc in eg.exceptions: - log.error("io_failed", error=str(exc)) -``` - -- `except*` catches matching exceptions from the group; unmatched exceptions re-raise. -- For Python < 3.11, use the `exceptiongroup` backport. -- Raising `ExceptionGroup` from a public API is a breaking change — version it. - ---- - -## 22 — Naming Conventions - -Follow these rules consistently (aligned with Google Style Guide and PEP 8): - -| Thing | Convention | Example | -|---|---|---| -| Module | `snake_case`, short, no dashes | `user_repository.py` | -| Package | `snake_case`, short | `my_package/` | -| Class | `PascalCase` | `UserRepository` | -| Exception | `PascalCase` ending in `Error` | `NotFoundError` | -| Function / method | `snake_case` | `calculate_discount` | -| Constant | `UPPER_SNAKE_CASE` | `MAX_RETRIES` | -| Type alias | `PascalCase` | `UserId = NewType("UserId", int)` | -| `TypeVar` | Single capital or `T`-suffix | `T`, `EntityT`, `KT` | -| `Protocol` | Adjective or role noun | `Serializable`, `Closeable` | -| Private names | Single underscore prefix | `_internal_helper` | -| Internal module names | Single underscore prefix | `_utils.py` | - -Additional rules: -- Avoid abbreviations except universally understood ones (`db`, `url`, `id`, `cfg`, `req`, `resp`). -- Boolean variables and functions: use `is_`, `has_`, `can_` prefix (`is_active`, `has_permission`). -- Never shadow builtins (`list`, `id`, `type`, `input`, `filter`, `map`). 
- ---- - -## 23 — Dangerous Calls — Banned Without Justification - -These calls are **banned** by default; any use requires an explicit comment explaining why it is safe: - -| Call | Risk | Safe alternative | -|---|---|---| -| `eval()` / `exec()` | Code injection | Parse structured data instead | -| `pickle.loads()` | Arbitrary code execution | `json`, `msgpack`, `protobuf` | -| `subprocess.run(..., shell=True)` | Shell injection | Pass a list of args instead | -| `yaml.load()` | Arbitrary code execution | `yaml.safe_load()` | -| `__import__()` | Dynamic import abuse | `importlib.import_module()` with validation | -| `os.system()` | Shell injection | `subprocess.run([...])` | - -The `"S"` ruff rule set flags most of these automatically. - ---- - -## 24 — `pyproject.toml` Reference Template - -Use this as a starting point for new projects: - -```toml -[build-system] -requires = ["hatchling"] -build-backend = "hatchling.build" - -[project] -name = "my-package" -version = "0.1.0" -requires-python = ">=3.12" -dependencies = [ - "pydantic>=2.7", - "pydantic-settings>=2.3", - "structlog>=24.1", -] - -[tool.uv] -dev-dependencies = [ - "pytest>=8", - "pytest-cov>=5", - "hypothesis>=6", - "mypy>=1.10", - "ruff>=0.4", - "bandit>=1.7", - "pip-audit>=2.7", - "pre-commit>=3.7", -] - -[tool.ruff] -line-length = 100 -target-version = "py312" - -[tool.ruff.lint] -select = ["E","W","F","I","N","UP","ANN","B","C4","SIM","RUF","TCH","PT","ERA","S","D","PERF","TRY","PIE"] -ignore = ["D100","D104","D203","D213","TRY003"] - -[tool.ruff.lint.pydocstyle] -convention = "google" - -[tool.ruff.lint.per-file-ignores] -"tests/**" = ["ANN","S101","D"] - -[tool.ruff.format] -quote-style = "double" -indent-style = "space" - -[tool.mypy] -python_version = "3.12" -strict = true -warn_return_any = true -warn_unused_ignores = true -no_implicit_reexport = true - -[tool.pytest.ini_options] -addopts = "--strict-markers --tb=short" -testpaths = ["tests"] - -[tool.coverage.run] -source = ["src"] -branch = 
true - -[tool.coverage.report] -fail_under = 80 -show_missing = true -``` - ---- - -## Checklist Before Finalising Any Python File - -**Typing** -- [ ] `from __future__ import annotations` at top of every module -- [ ] All functions/methods/attributes fully typed — no bare `Any` -- [ ] Type-only imports under `TYPE_CHECKING` -- [ ] `mypy src/` passes with zero errors (strict mode) - -**Design** -- [ ] Classes model real concepts; single responsibility (SOLID) -- [ ] Pydantic used at trust boundaries; dataclass/attrs for internals -- [ ] No business logic leaking across layers -- [ ] Pythonic constructs throughout; no repeated logic (DRY) -- [ ] No dangerous calls (`eval`, `pickle.loads`, `shell=True`, etc.) without justification -- [ ] Naming follows conventions (§22) - -**Documentation** -- [ ] All public modules, classes, and functions have Google-style docstrings -- [ ] Generator functions use `Yields:`, not `Returns:` - -**Code quality** -- [ ] `ruff format` + `ruff check` both pass with zero diagnostics -- [ ] No `# noqa` / `# type: ignore` without an explanatory comment - -**Security** -- [ ] No hardcoded secrets or credentials -- [ ] All external input validated with Pydantic at the boundary -- [ ] `bandit -r src/ -ll` clean -- [ ] `pip-audit` clean - -**Testing** -- [ ] Tests written — unit for logic, integration for persistence/HTTP -- [ ] Coverage ≥ 80% (100% for domain logic) -- [ ] Hypothesis used for non-trivial input spaces - -**Observability** -- [ ] Structured logging via `structlog` (no `print()`) -- [ ] OTel spans on service-layer operations (for service code) - -**Dependencies & tooling** -- [ ] `uv.lock` committed and up to date -- [ ] context7 docs consulted for any library APIs used +# Python Development + +Use this skill for any task that creates or changes Python code. + +## Operating Mode + +- Match the repository's existing style and architecture before introducing new patterns. +- Prefer the smallest correct change over broad refactors. 
+- Keep code easy to read: flat control flow, clear names, limited indirection.
+- Do not force framework choices or house style upgrades into unrelated work.
+- Do not add compatibility layers, feature flags, or speculative abstractions unless required.
+- If an API or library detail is uncertain, check current docs before coding.
+
+## Defaults
+
+- Add `from __future__ import annotations` to new Python modules.
+- Fully annotate public functions and important variables.
+- Prefer precise types over broad ones.
+- Use `X | Y` and `X | None` syntax.
+- Avoid `Any`; if unavoidable, keep it narrow and explain why.
+- Use `TYPE_CHECKING` for type-only imports when it improves runtime imports or avoids cycles.
+- Prefer simple functions and dataclasses unless the codebase clearly wants a heavier OO design.
+- Use classes for durable domain concepts or stateful collaborators, not by default.
+- Prefer composition over inheritance.
+- Keep responsibilities separated: domain logic, data access, and interface code should stay distinct.
+- Do not place business logic directly in route handlers, CLI commands, or persistence code.
+- Validate untrusted input at the boundary.
+- Prefer Pydantic for external input, config, or data crossing trust boundaries.
+- Prefer `@dataclass(slots=True)` for internal structured data.
+- Use `frozen=True` when immutability is a natural fit.
+- Prefer `pathlib` over `os.path`.
+- Prefer f-strings.
+- Prefer standard idioms like comprehensions, `enumerate`, `zip`, and context managers where appropriate.
+- Use guard clauses to reduce nesting.
+- Avoid mutable default arguments, bare `except:`, and `type(x) == T` checks.
+
+## Error Handling
+
+- Raise specific exceptions with useful messages.
+- Keep `try` blocks narrow.
+- Preserve context with `raise ... from exc` when re-raising.
+- Do not swallow exceptions silently.
+- Log at system boundaries or orchestration layers, not deep inside pure domain logic.
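The error-handling rules above can be sketched as follows. This is only an illustration, not part of the skill contract: `ConfigError` and `load_port` are hypothetical names chosen for the example. It shows a narrow `try` block, a specific exception type with a useful message, and `raise ... from exc` to keep the original cause attached.

```python
from __future__ import annotations


class ConfigError(Exception):
    """Hypothetical domain-specific error for invalid configuration."""


def load_port(raw: str) -> int:
    """Parse a TCP port number from untrusted text."""
    # Keep the try block narrow: only the call that can fail sits inside it.
    try:
        port = int(raw)
    except ValueError as exc:
        # Raise a specific exception and chain the original cause.
        raise ConfigError(f"port must be an integer, got {raw!r}") from exc

    # Guard clause instead of nested branches.
    if not 0 < port < 65536:
        raise ConfigError(f"port out of range: {port}")
    return port
```

A caller at a system boundary could catch `ConfigError`, log it once, and still find the underlying `ValueError` on `__cause__`.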
+
+## Async
+
+- Use `async` only for real I/O-bound workflows.
+- Do not call blocking I/O directly from coroutines.
+- Keep sync and async boundaries explicit.
+
+## Security
+
+- Never hardcode secrets, tokens, or credentials.
+- Validate external input before using it.
+- Avoid dangerous calls unless explicitly justified: `eval`, `exec`, `pickle.loads`, `yaml.load`, `shell=True`, `os.system`.
+- Do not build shell commands from raw user input.
+- Avoid logging secrets or sensitive user data.
+
+## Testing
+
+- Add or update tests for behavior you change.
+- Prefer unit tests for pure logic and integration tests for persistence or external boundaries.
+- Use parametrization to cover multiple cases succinctly.
+- Mock at external boundaries, not inside the core logic under test, unless the repo uses a different testing style.
+
+## Tooling
+
+- Use `uv` for Python dependency and tool execution when the project uses it.
+- Run formatters, linters, and type checks after Python changes.
+- Default verification flow: format, lint, type-check, then run relevant tests.
+- If the repo uses different commands, follow the repo.
+
+## Completion Checklist
+
+Before finishing Python work, make sure:
+
+- the change matches existing project patterns
+- code is fully typed to the repo's standard
+- untrusted input is validated at the boundary
+- logic is kept in the right layer
+- `ruff` is clean
+- `mypy` is clean
+- relevant tests pass
+- no secrets or risky calls were introduced without justification
+
+## Reference
+
+See `REFERENCE.md` for optional deeper guidance on typing, architecture, async patterns, testing, security, and review heuristics.
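As a sketch of the Security guidance above on avoiding `shell=True` and on not building shell commands from raw user input: passing the arguments as a list means the untrusted text is handed to the child process as one argument and is never parsed by a shell. The helper name `echo_untrusted` is hypothetical, and the child process is the current Python interpreter only so that the example stays portable.

```python
from __future__ import annotations

import subprocess
import sys


def echo_untrusted(text: str) -> str:
    """Pass untrusted text to a child process as a single argument."""
    # The command is a list and shell=False (the default), so shell
    # metacharacters in `text` are treated as plain characters.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", text],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

With this shape, input such as `"hello; echo pwned"` should come back unchanged instead of running a second command, which is the property the list form is meant to give.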