# FEATURE.md — Orchestration Service

## Purpose
Define features, data models, behaviors, and operational rules for an Orchestration Service that assigns and coordinates agents to complete tasks inside isolated running containers.

---

## Service Overview
- Name — Orchestration Service for agent-based task execution.
- Goal — Accept task definitions, schedule and dispatch agents, manage execution environments, track progress and complexity, and escalate on failures.
- Primary actors — Task, Agent, Running Container, Tool, Scheduler, Operator.

---

## Task Model

### Core Fields
- id — Unique identifier.
- name — Human-readable title.
- prompt — Primary instruction text prepended to the task before execution.
- container — Reference to the required running container.
- resolve_time_estimate — Estimated completion time in minutes.
- task_tags — Array of TaskTag values describing domain(s).
- complexity_level — One of: Low, Middle, Hard, ExtremHard.
- escalation — Optional object with reason and prompt.
- metadata — Free-form key/value map.

### Task Flow Control
A task is always a first-class entity. There is no parent/subtask concept. Tasks chain to each other via `next_task`. Tasks are organised into groups using a `TaskGroup` entity that defines sequential or parallel execution.
- schedule — Schedule definition.
  - type: immediate, time-based, datetime-event.
  - datetime-event starts the task at a specific date/time and underpins the scheduled-task feature.
- next_task — Optional ID of the next task in the chain (within or across groups).
- previous_task — Optional ID of the previous task in the chain.

Specialised control-flow behaviours — iteration, conditional branching, and jump-with-prompt — are expressed as distinct task types (see **Specialised Task Types** below). Each specialised task type is a separate entity and a separate database row with a `task_type` discriminator.

---

## Task Groups

### Purpose
A `TaskGroup` is a first-class chain participant, just like a task. It owns a set of children, waits for all of them to complete, and then advances to its `next` item — which can itself be either a task or another group.

Children of a group can be any mix of:
- `AgentTask` (standard or any specialised type: `ForeachTask`, `GotoTask`, `ConditionTask`)
- Nested `TaskGroup` instances (groups can be nested arbitrarily deep)

### TaskGroup Fields
- id — Unique identifier.
- name — Human-readable label.
- type — `Sequential` or `Parallel`.
  - `Sequential` — Children run one at a time in the declared order. When one child completes, the next child is activated.
  - `Parallel` — All children are dispatched simultaneously. The group completes when every child has finished.
- children — Ordered list of `GroupChildRef` entries. Each entry identifies a child by its ID and kind (`Task` or `Group`).
- next — Optional `GroupChildRef` identifying the item to activate after this group completes. It can point to a task or another group.
- previous — Optional `GroupChildRef` identifying the item that precedes this group in the chain.

### GroupChildRef
A `GroupChildRef` is a value object that points to either a task or a group:
- child_id — GUID of the child.
- kind — `Task` or `Group`.

### Orchestrator Behavior
- A group begins execution when it is activated (either by the root scheduler or by the `next` pointer of a predecessor).
- **Sequential**: the engine activates the first child; when that child completes, it activates the second child, and so on. After the last child completes, the group itself is marked complete and activates its `next`.
- **Parallel**: the engine activates all children simultaneously. After all children complete, the group is marked complete and activates its `next`.
- If a child is itself a `TaskGroup`, the same rules apply recursively — the nested group must complete before it counts as done for its parent.
- A child belongs to at most one group.
- Cycles in the chain (via `next` or through children) must be rejected at validation time (`DependencyCycleException`).
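
The activation rules above can be sketched as a small completion-driven engine. This is a minimal illustration, not the service's implementation: `TaskNode`, `GroupNode`, `Engine`, and `validate_chain` are hypothetical names, groups are assumed to be non-empty, and the cycle check only follows `next` pointers (cycles through children would need a similar walk).

```python
from dataclasses import dataclass, field


@dataclass
class TaskNode:
    id: str


@dataclass
class GroupNode:
    id: str
    type: str  # "Sequential" or "Parallel"
    children: list = field(default_factory=list)


class DependencyCycleException(Exception):
    """Raised when a chain of `next` pointers loops back on itself."""


def validate_chain(next_of: dict) -> None:
    """Follow every `next` pointer chain and reject any loop."""
    for start in next_of:
        seen = {start}
        current = next_of.get(start)
        while current is not None:
            if current in seen:
                raise DependencyCycleException(f"cycle through {current}")
            seen.add(current)
            current = next_of.get(current)


class Engine:
    """Tracks which tasks are dispatched and which groups have completed."""

    def __init__(self, root):
        self.parent = {}          # node id -> parent GroupNode (None for root)
        self.remaining = {}       # group id -> unfinished child count
        self.dispatched = []      # task ids in dispatch order
        self.finished_groups = []
        self._index(root, None)
        self._activate(root)

    def _index(self, node, parent):
        self.parent[node.id] = parent
        if isinstance(node, GroupNode):
            self.remaining[node.id] = len(node.children)
            for child in node.children:
                self._index(child, node)

    def _activate(self, node):
        if isinstance(node, TaskNode):
            self.dispatched.append(node.id)
        elif node.type == "Sequential":
            self._activate(node.children[0])   # only the first child starts
        else:
            for child in node.children:        # Parallel: start every child
                self._activate(child)

    def complete(self, node_id):
        """Report a finished task (or group) and advance its parent."""
        parent = self.parent[node_id]
        if parent is None:
            return
        self.remaining[parent.id] -= 1
        left = self.remaining[parent.id]
        if left == 0:
            self.finished_groups.append(parent.id)
            self.complete(parent.id)           # nested group counts as done
        elif parent.type == "Sequential":
            self._activate(parent.children[len(parent.children) - left])


root = GroupNode("root", "Sequential", [
    TaskNode("a"),
    GroupNode("p", "Parallel", [TaskNode("b"), TaskNode("c")]),
    TaskNode("d"),
])
engine = Engine(root)
for finished in ["a", "b", "c", "d"]:
    engine.complete(finished)
```

In this run, `d` is dispatched only after both `b` and `c` have completed, because the nested parallel group `p` must finish before the sequential parent advances.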

---

## Specialised Task Types

Each specialised control-flow behaviour is a distinct domain entity. Each type is stored as a separate database row with a `task_type` discriminator column. All specialised types inherit the core task fields and `next_task` / `previous_task` chaining.

### ForeachTask
Iterates over a list of items and spawns one task instance per item.
- foreach_items — Ordered list of string values to iterate over.
- foreach_template_task_id — ID of the task template to clone per item.
- Generated task instances are linked via `next_task` / `previous_task` in creation order and can be placed into a group.
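
As a rough sketch of the expansion step, the function below clones a template once per item and links the clones in creation order. The spec does not say how an item is injected into a clone or how clone IDs are formed, so appending the item to the prompt and the `<template-id>-<index>` scheme are assumptions made for illustration; `Task` here is a reduced, hypothetical model.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class Task:
    id: str
    prompt: str
    next_task: Optional[str] = None
    previous_task: Optional[str] = None


def expand_foreach(template: Task, items: list) -> list:
    """Clone `template` once per item and chain the clones in order."""
    clones = [
        replace(
            template,
            id=f"{template.id}-{index}",            # assumed ID scheme
            prompt=f"{template.prompt}\n{item}",    # assumed item injection
            next_task=None,
            previous_task=None,
        )
        for index, item in enumerate(items)
    ]
    # Link neighbours via next_task / previous_task in creation order.
    for earlier, later in zip(clones, clones[1:]):
        earlier.next_task = later.id
        later.previous_task = earlier.id
    return clones


clones = expand_foreach(Task("tpl", "Summarise the file"), ["a.csv", "b.csv"])
```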

### GotoTask
On completion, merges additional context into a target task and activates it — regardless of where the target sits in the chain.
- goto_target_task_id — ID of the task to activate.
- goto_prompt — Additional prompt text merged (appended) into the target task's `prompt` before execution.

### ConditionTask (If-Task)
Evaluates a boolean expression against the task result and routes to one of two branches.
- condition — A boolean expression evaluated against the task result.
- true_next_task_id — ID of the task to activate when the condition evaluates to true.
- false_next_task_id — ID of the task to activate when the condition evaluates to false.
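
The routing decision can be illustrated with a short sketch. The expression language for `condition` is not defined in this document, so the example assumes the condition is available as a Python callable over a result dictionary; the `route` helper and the sample field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ConditionTask:
    id: str
    condition: Callable[[dict], bool]  # assumption: expression as a callable
    true_next_task_id: str
    false_next_task_id: str


def route(task: ConditionTask, result: dict) -> str:
    """Return the ID of the branch to activate for this task result."""
    if task.condition(result):
        return task.true_next_task_id
    return task.false_next_task_id


check = ConditionTask(
    id="cond-1",
    condition=lambda result: result.get("errors", 0) == 0,
    true_next_task_id="task-deploy",
    false_next_task_id="task-fix",
)
```

With this setup, `route(check, {"errors": 0})` selects `task-deploy`, while any result with a non-zero `errors` count selects `task-fix`.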

---

## Schedule

### Types
- time-based — Cron or ISO schedule expression.
- datetime-event — Start at a fixed date/time instant or range (e.g. `2026-04-01T09:00:00Z`). This enables scheduled tasks.
- immediate — Start as soon as created.

The system uses normal event publishing and handling for task coordination; it does not use dedicated signal-based schedules.

### Goto Tasks
- Use a `GotoTask` (see **Specialised Task Types**) to jump to an arbitrary target task.
- The target task receives the additional `goto_prompt` text merged (appended) into its `prompt` and activates immediately when the `GotoTask` completes.

### Retry Policy
- attempts — integer.
- backoff — fixed or exponential.
- max_retries — integer.
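
To make the two backoff modes concrete, the sketch below computes the wait before each retry. The spec names only `fixed` and `exponential`; the base delay (`base_seconds`) and the doubling factor are assumptions chosen for illustration.

```python
def retry_delays(backoff: str, max_retries: int, base_seconds: float = 1.0) -> list:
    """Return the delay (in seconds) to wait before each retry."""
    if backoff == "fixed":
        return [base_seconds] * max_retries
    if backoff == "exponential":
        # Assumed convention: double the delay on every further retry.
        return [base_seconds * 2 ** n for n in range(max_retries)]
    raise ValueError(f"unknown backoff mode: {backoff!r}")


fixed = retry_delays("fixed", 3)
exponential = retry_delays("exponential", 4)
```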

---

## Complexity Level

### Definition
Defines how complex a task is and when it must be split.

| Level       | Assignment rule                         | Characteristics |
|-------------|-------------------------------------------|----------------|
| Low         | resolve_time_estimate ≤ 30               | Simple logic; single topic. |
| Middle      | 30 < resolve_time_estimate ≤ 60          | Moderate complexity; single topic. |
| Hard        | resolve_time_estimate > 60               | Multi-topic; requires subtasks. |
| ExtremHard  | resolve_time_estimate unknown/unbounded  | Multi-topic; unknown duration. |

### Automatic Rules
- resolve_time_estimate > 60 → Hard.
- Missing or unbounded estimate → ExtremHard.
- Parent prompt must be prepended to all subtasks.
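
The assignment rules in the table can be expressed as a single function. This is a sketch: representing an unknown estimate as `None` and an unbounded one as infinity is an assumption, and the returned strings reuse the level names exactly as spelled in this document.

```python
import math
from typing import Optional


def classify(resolve_time_estimate: Optional[float]) -> str:
    """Map an estimate in minutes to a complexity level."""
    if resolve_time_estimate is None or math.isinf(resolve_time_estimate):
        return "ExtremHard"   # missing or unbounded estimate
    if resolve_time_estimate <= 30:
        return "Low"
    if resolve_time_estimate <= 60:
        return "Middle"
    return "Hard"
```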

---

## TaskTag

### Purpose
Enum identifying task working areas and agent capabilities.

### Example Values
- DATA_INGESTION
- NLP
- IMAGE_PROCESSING
- DEPLOYMENT
- SECURITY
- TESTING
- DOCUMENTATION

---

## Agent

### Core Fields
- id — Unique identifier.
- skills — List of assigned skills.
- task_tags — List of TaskTag values the agent can handle.
- choose_rules — Matching rules for accepting tasks.
- capabilities — wait_for_subagents, call_subagent, return_status, escalation.

### Matching Rules Examples
- Accept if the task has a single TaskTag equal to one of the agent's TaskTags.
- Accept if the agent supports a superset of the task's tags.
- Accept only when the agent has all TaskTags required by the task.
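
These rules reduce to set comparisons. The helpers below are a hypothetical sketch with tags modelled as plain strings; the format of `choose_rules` itself is not specified here.

```python
def accepts_single_tag(agent_tags: set, task_tags: set) -> bool:
    """Task has exactly one tag, and the agent handles it."""
    return len(task_tags) == 1 and task_tags <= agent_tags


def accepts_all_required(agent_tags: set, task_tags: set) -> bool:
    """Agent's tags are a superset of everything the task requires."""
    return task_tags <= agent_tags


agent = {"NLP", "TESTING", "SECURITY"}
```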

### Execution Behavior
- Agent executes inside its container and returns success or failure.
- On failure, escalation.reason and escalation.prompt are required.
- Agents may spawn sub-agents and wait for them.

---

## Running Container

### Definition
Isolated runtime environment required for a task.

### Attributes
- image — Container image reference.
- tools — List of required tools and binaries.
- resources — CPU, memory, disk limits.
- network_policy — Allowed network rules.
- volumes — Mounts and persistence rules.

### Constraints
- Agents cannot access resources outside their container.
- Containers must declare all tools at startup.

---

## Tool

### Definition
Runnable binary or library available inside a container.

### Attributes
- name — Tool name.
- version — Required version.
- purpose — Description of tool’s role.
- usage_guidelines — Allowed and forbidden operations.
- install_instructions — Installation steps.

---

## Failure and Escalation Flow

### Agent Failure
- Agent returns failure with escalation.reason and escalation.prompt.
- Orchestrator evaluates retry policy.
- If retries exhausted → escalate to parent agent or human operator.

### Logging
- All failures and escalations must be logged with timestamps, container id, agent id, and error context.

---

## Frontend CLI

### Capabilities
- Manage Agents, Tasks, Tools, Containers, Complexity Levels, TaskTags, Schedules.
- Create, update, delete, list resources.
- Start tasks manually and view logs.
- Trigger event-driven execution.

### Example Commands
- orchestrate agent create --name <name> --task-tags <tags> --skills <skills>
- orchestrate task create --name <name> --prompt-file <file> --container <image> --schedule <spec>
- orchestrate container define --image <image> --tools <list> --resources <spec>
- orchestrate task start --id <task-id>
- orchestrate task status --id <task-id> --follow

---

## Example Task Definition (YAML)

```yaml
id: task-200
name: "Transform raw data"
prompt: "Clean and normalize raw input data."
container: "data-worker:v1.0"
resolve_time_estimate: 20
task_tags:
  - DATA_INGESTION
complexity_level: Low
schedule:
  type: immediate

group_id: "group-preprocessing"

tools:
  - python:3.11
  - pip:latest

escalation:
  reason: "Validation failed"
  prompt: "Please inspect input data and retry."
```

---

# DREP — Documentation & Review Enhancement Platform

**Version:** 1.1.0 (Production/Stable)

AI-powered code review and documentation improvement tool for **Gitea**, **GitHub**, and **GitLab** repositories. DREP uses large language models to automatically analyze your Python codebase, review pull requests, generate documentation, and detect security vulnerabilities — saving developer time and catching issues before they reach production.

---

## CLI Commands

### `drep init` — Interactive Configuration Wizard

A guided setup experience that walks you through configuring DREP for your environment. The wizard asks you to choose your platform (GitHub, Gitea, or GitLab), select which repositories to monitor, pick an LLM provider, and configure documentation preferences. It supports enterprise and self-hosted instances. Secrets are never stored directly in the config file — instead, environment variable placeholders are used for all tokens and credentials.

### `drep scan` — Repository Code Analysis

Scans an entire repository for code quality issues and creates issues directly on your platform with the findings. DREP detects bugs, security vulnerabilities, best practice violations, and performance problems across all Python and Markdown files. It also generates missing docstrings for public functions. By default, only files that changed since the last scan are analyzed, making repeated scans fast and efficient. Duplicate findings are automatically prevented — if an issue was already reported, it won't be created again.

### `drep review` — Pull Request / Merge Request Review

Performs an AI-powered review of a pull request or merge request. DREP reads the diff, analyzes the changes using an LLM, and posts a summary comment along with inline comments on specific lines. Each comment includes a severity level so you can prioritize what to address first. At the end, it provides an approval or rejection recommendation based on the overall quality of the changes.

### `drep check` — Local-Only Analysis

Analyzes code on your local machine without needing any platform credentials. Ideal for pre-commit workflows — you can configure it to check only staged files so it runs before every commit. It can output results as plain text or JSON and integrates directly with pre-commit hooks. Use it to catch issues before pushing code, with the option to run in warning-only mode so it never blocks your commits.

### `drep validate` — Configuration Validation

Verifies that your configuration file is complete and valid before running any other commands. Catches misconfigurations early.

---

## Features

### 1. Proactive Code Analysis

DREP uses AI to deeply analyze your Python code and surface real issues across four key areas:

- **Bugs & Logic Errors** — Catches incorrect logic, unhandled edge cases, potential crashes, undefined variables, and type mismatches that could cause runtime failures.
- **Security Vulnerabilities** — Identifies dangerous patterns like SQL injection, command injection, path traversal, unsafe deserialization, hardcoded secrets, and weak cryptographic usage before they become exploitable.
- **Best Practice Violations** — Flags PEP 8 violations, missing docstrings, poor naming conventions, code smells, and anti-patterns that make code harder to maintain.
- **Performance Problems** — Spots inefficient algorithms, unnecessary loops, blocking I/O operations, and potential memory leaks that degrade application performance.

Every finding includes a severity level (critical, high, medium, low, or info), a clear explanation of the problem, and a specific actionable suggestion for how to fix it. Findings are posted as issues on your platform with proper categorization.

### 2. Docstring Intelligence

Automatically generates and improves Google-style docstrings for your Python functions using AI. DREP intelligently targets the functions that need documentation most — those that are public-facing, sufficiently complex, or use special decorators like `@property` and `@classmethod`. It also detects poor-quality existing docstrings that are too short, contain placeholder text like "TODO" or "helper function," or are missing parameter and return value descriptions. Each generated docstring includes a quality rating and an explanation of the function's behavior, ensuring your API documentation is always accurate and complete.
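
The targeting step can be illustrated with Python's standard `ast` module. This is not DREP's code: the function name, the length threshold, and the placeholder check are assumptions that only mirror the criteria described above (public functions with missing, very short, or placeholder docstrings).

```python
import ast


def functions_needing_docstrings(source: str, min_length: int = 10) -> list:
    """Return names of public functions with missing or weak docstrings."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if node.name.startswith("_"):
            continue  # treat underscore-prefixed functions as private
        doc = ast.get_docstring(node) or ""
        if len(doc) < min_length or "TODO" in doc:
            flagged.append(node.name)
    return flagged


SAMPLE = '''
def documented(x):
    """Return x doubled for the caller."""
    return 2 * x

def bare(x):
    return x

def todo(x):
    """TODO"""
    return x

def _private(x):
    return x
'''

flagged = sorted(functions_needing_docstrings(SAMPLE))
```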

### 3. Automated Pull Request / Merge Request Reviews

Provides thorough, AI-driven code reviews directly on your pull requests and merge requests. DREP posts inline comments on specific changed lines, making feedback easy to find and act on. Each comment is tagged with a severity level:

- **Info** — Minor notes or style observations
- **Suggestion** — Recommended improvements that aren't required
- **Warning** — Potential issues that should be addressed before merging
- **Critical** — Serious bugs, security problems, or blockers that must be fixed

The review also includes an overall summary with an approval or rejection recommendation and a list of any major concerns that could block the merge.

### 4. Documentation Analysis

Lints your Markdown documentation files to enforce consistent formatting and readability. Catches common problems such as trailing whitespace, tab characters, empty headings, missing spaces after heading markers, overly long lines, excessive blank lines, bare URLs that should be wrapped in link syntax, broken link formatting, and unclosed code fences. Supports custom dictionaries for spell checking so that project-specific terms are not flagged as errors.
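
A few of these checks are simple enough to sketch line by line. The example below is illustrative only: the rule names, the 120-character limit, and the `lint_markdown` function are assumptions, and it covers just a subset of the problems listed above.

```python
import re

FENCE = "`" * 3  # a code fence marker, built indirectly to keep this block readable


def lint_markdown(text: str, max_line_length: int = 120) -> list:
    """Return (line_number, message) pairs for a handful of basic checks."""
    findings = []
    fence_open_line = None
    for number, line in enumerate(text.split("\n"), start=1):
        if line.lstrip().startswith(FENCE):
            # Toggle fence state; remember where an open fence started.
            fence_open_line = None if fence_open_line else number
            continue
        if fence_open_line:
            continue  # do not lint code inside a fence
        if line != line.rstrip():
            findings.append((number, "trailing whitespace"))
        if "\t" in line:
            findings.append((number, "tab character"))
        if len(line) > max_line_length:
            findings.append((number, "line too long"))
        if re.fullmatch(r"#+\s*", line):
            findings.append((number, "empty heading"))
        elif re.match(r"#+[^#\s]", line):
            findings.append((number, "missing space after heading marker"))
    if fence_open_line:
        findings.append((fence_open_line, "unclosed code fence"))
    return findings
```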

### 5. Smart Caching

Remembers previous analysis results so that unchanged code is not re-analyzed, dramatically reducing LLM costs and scan times. The cache is automatically invalidated when code changes, ensuring results are always up to date. Old entries expire after a configurable period and are evicted when the cache reaches its size limit. On incremental scans, caching typically reduces LLM costs by over 80%.
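
One common way to get this behaviour is to key the cache on a hash of the file content, so any edit produces a new key. The sketch below assumes that approach, plus a time-to-live and least-recently-used eviction; `AnalysisCache` and its parameters are hypothetical and do not describe DREP's actual cache.

```python
import hashlib
import time
from collections import OrderedDict


class AnalysisCache:
    """Content-addressed cache with expiry and a size limit."""

    def __init__(self, max_entries=1000, ttl_seconds=86400.0, clock=time.monotonic):
        self._entries = OrderedDict()  # key -> (stored_at, result)
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock

    @staticmethod
    def key_for(source: str) -> str:
        # Changed code hashes to a different key, so stale results never match.
        return hashlib.sha256(source.encode("utf-8")).hexdigest()

    def get(self, source: str):
        key = self.key_for(source)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, result = entry
        if self._clock() - stored_at > self._ttl:
            del self._entries[key]      # expired entry
            return None
        self._entries.move_to_end(key)  # mark as recently used
        return result

    def put(self, source: str, result) -> None:
        key = self.key_for(source)
        self._entries[key] = (self._clock(), result)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```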

### 6. Rate Limiting

Prevents overloading your LLM provider and platform APIs with built-in multi-level rate limiting. Controls are applied at the global level, per-repository level, per-minute request count, and per-minute token count. This ensures DREP operates within your provider's limits even when scanning many repositories in parallel, avoiding throttling errors and unexpected costs.
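
The per-minute request and token limits can be pictured as a sliding window over recent calls. This is a sketch of the general technique under assumed names (`MinuteLimiter`, `try_acquire`); how DREP layers global and per-repository limits on top is not shown.

```python
import time
from collections import deque


class MinuteLimiter:
    """Caps both request count and token count over the last 60 seconds."""

    def __init__(self, max_requests: int, max_tokens: int, clock=time.monotonic):
        self._max_requests = max_requests
        self._max_tokens = max_tokens
        self._clock = clock
        self._events = deque()  # (timestamp, tokens) per accepted request

    def try_acquire(self, tokens: int) -> bool:
        now = self._clock()
        # Drop requests that have left the one-minute window.
        while self._events and now - self._events[0][0] >= 60.0:
            self._events.popleft()
        used_tokens = sum(count for _, count in self._events)
        if len(self._events) >= self._max_requests:
            return False
        if used_tokens + tokens > self._max_tokens:
            return False
        self._events.append((now, tokens))
        return True
```

A caller that receives `False` would wait and retry later rather than send the request.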

### 7. Circuit Breaker

Protects your workflow when external services (LLM providers, platform APIs) experience outages. If a service starts failing repeatedly, DREP automatically stops sending requests to it and fails fast instead of hanging on timeouts. After a recovery period, it cautiously tests whether the service is back before resuming normal operation. This prevents cascading failures and keeps overall scan throughput stable even when one service is degraded.
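
The behaviour described here matches the classic closed / open / half-open circuit breaker. The sketch below shows that pattern in its simplest form; the class name, thresholds, and the choice to allow any number of probe requests while half-open are assumptions, not DREP's implementation.

```python
import time


class CircuitBreaker:
    """Fail fast after repeated failures; probe again after a cool-down."""

    def __init__(self, failure_threshold=3, recovery_seconds=30.0, clock=time.monotonic):
        self.state = "closed"
        self._threshold = failure_threshold
        self._recovery = recovery_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = 0.0

    def allow_request(self) -> bool:
        if self.state == "open":
            if self._clock() - self._opened_at >= self._recovery:
                self.state = "half_open"  # cautiously test the service
                return True
            return False                  # fail fast while the circuit is open
        return True

    def record_success(self) -> None:
        self._failures = 0
        self.state = "closed"

    def record_failure(self) -> None:
        self._failures += 1
        # A failed probe, or too many failures in a row, opens the circuit.
        if self.state == "half_open" or self._failures >= self._threshold:
            self.state = "open"
            self._opened_at = self._clock()
```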

### 8. Incremental Scanning

Only analyzes files that have changed since the last scan by tracking the most recent commit. This makes repeated scans fast, as unchanged files are skipped entirely. When running against a repository for the first time, or when no previous scan record exists, DREP falls back to a full scan of all files.

### 9. Security Detection

Provides a fast, pattern-based security scan that runs without requiring an LLM. Detects common vulnerability patterns including SQL injection, command injection, path traversal, unsafe deserialization, hardcoded secrets, and weak cryptographic usage. This acts as a first line of defense that catches well-known security anti-patterns instantly, complementing the deeper AI-powered analysis.
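
A pattern-based scan of this kind can be approximated with a table of regular expressions applied line by line. The patterns below are illustrative assumptions, deliberately simple, and will produce both false positives and misses; they are not DREP's rule set.

```python
import re

# Hypothetical patterns for a few of the categories named above.
PATTERNS = {
    "hardcoded secret": re.compile(
        r"(?i)\b(password|secret|api_key|token)\s*=\s*['\"][^'\"]+['\"]"
    ),
    "unsafe deserialization": re.compile(r"\bpickle\.loads?\("),
    "command injection": re.compile(r"\bos\.system\(|shell\s*=\s*True"),
    "weak cryptography": re.compile(r"\bhashlib\.(md5|sha1)\("),
}


def scan(source: str) -> list:
    """Return (line_number, category) for every pattern hit."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        for category, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, category))
    return findings


SAMPLE = '''password = "hunter2"
data = pickle.loads(blob)
digest = hashlib.sha256(b"ok")
'''

findings = scan(SAMPLE)
```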

### 10. Pre-Commit Hook Integration

Integrates directly into your development workflow as a pre-commit hook. When configured, DREP checks only your staged files before each commit and reports any issues it finds. No platform token is required for local checks. Exit codes integrate with CI/CD pipelines — the hook can either block commits that have issues or run in warning-only mode where it reports findings without preventing the commit.

### 11. Webhook Server

Runs as a server that listens for events from your Git platform and automatically triggers analysis. When code is pushed to a repository, DREP runs a background scan. When a pull request is opened or updated, it automatically posts an inline review. This enables a fully hands-off workflow where every code change is reviewed without any manual intervention.

### 12. LLM Metrics & Observability

Tracks the cost, latency, and success rate of every LLM request. Provides a per-analyzer breakdown so you can see exactly how much each type of analysis (code quality, docstring generation, PR review) costs and how long it takes. This gives you full visibility into your AI usage and helps identify opportunities to optimize spending.

### 13. Robust LLM Response Handling

Gracefully handles malformed or incomplete responses from LLM providers. When an LLM returns imperfect output — such as responses wrapped in markdown code fences, broken JSON, or truncated text — DREP applies a series of increasingly aggressive recovery strategies to extract valid results. This ensures reliable operation even with less capable or occasionally unreliable LLM providers.
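
The idea of escalating recovery strategies can be sketched as a list of candidate strings tried in order: the raw text, then the contents of a code fence, then the outermost `{...}` span. The function name and the specific strategies are assumptions for illustration and may differ from what DREP does.

```python
import json
import re

FENCE = "`" * 3  # markdown code fence marker


def parse_llm_json(text: str):
    """Try progressively looser ways to extract a JSON value; None if all fail."""
    candidates = [text]  # strategy 1: the response is already valid JSON

    # Strategy 2: the JSON is wrapped in a markdown code fence.
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    if fenced:
        candidates.append(fenced.group(1))

    # Strategy 3: the JSON object is surrounded by extra prose.
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        candidates.append(text[start:end + 1])

    for candidate in candidates:
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    return None
```

Truncated output with no closing brace falls through every strategy and yields `None`, which a caller could treat as a signal to retry the request.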

---

## Supported Platforms

### Gitea

Full support for Gitea instances, including creating issues, reading and reviewing pull requests with inline comments, and fetching file content. Works with both cloud-hosted and self-hosted Gitea installations.

---

## LLM Providers

### OpenAI-Compatible (Local LLMs)

Works with any OpenAI-compatible endpoint, including LM Studio, Ollama, and other local LLM servers. This allows you to run DREP entirely on your own hardware with no data leaving your network.

### AWS Bedrock

Connects to AWS Bedrock for access to models like Claude through your existing AWS credentials. Ideal for teams that already use AWS and want to keep everything within their cloud environment.

### Anthropic

Direct Anthropic API support is planned for a future release.

---

## Database

DREP uses a database to track scan history and prevent duplicate issue creation. SQLite works out of the box with zero setup. For team or production deployments, PostgreSQL and MySQL are also supported.