AI Rule Learning System

Upload Conversation History

Upload a JSON or CSV file containing past conversations.

JSON format

[
  {
    "conversation_id": "optional",
    "turns": [
      {"turn_number": 1, "user_input": "Hello", "agent_response": "Hi!"},
      {"turn_number": 2, "user_input": "...", "agent_response": "..."}
    ]
  }
]
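As a rough illustration of the JSON layout above, the following sketch parses an upload and checks that every turn carries the three documented keys. The function name `load_conversations`, the rule that `turns` must be non-empty, and the sorting step are assumptions made for this example; they are not taken from the project's code.

```python
import json

# Keys shown in the documented JSON format for each turn.
REQUIRED_TURN_KEYS = {"turn_number", "user_input", "agent_response"}


def load_conversations(text: str) -> list[dict]:
    """Parse upload JSON and verify each turn has the documented keys (hypothetical helper)."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("top-level value must be a list of conversations")
    for i, conv in enumerate(data):
        turns = conv.get("turns") if isinstance(conv, dict) else None
        # Assumption: an upload with no turns is treated as invalid.
        if not isinstance(turns, list) or not turns:
            raise ValueError(f"conversation {i}: 'turns' must be a non-empty list")
        for turn in turns:
            missing = REQUIRED_TURN_KEYS - turn.keys()
            if missing:
                raise ValueError(f"conversation {i}: turn is missing {sorted(missing)}")
        # Sort so later analysis sees turns in order regardless of file order.
        conv["turns"] = sorted(turns, key=lambda t: t["turn_number"])
    return data


sample = (
    '[{"turns": ['
    '{"turn_number": 2, "user_input": "b", "agent_response": "B"},'
    '{"turn_number": 1, "user_input": "a", "agent_response": "A"}]}]'
)
conversations = load_conversations(sample)
print([t["turn_number"] for t in conversations[0]["turns"]])  # [1, 2]
```

Note that `conversation_id` is left unchecked because the format marks it as optional.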

CSV format

One row per turn.
Required columns: conversation_id, turn_number, user_input, agent_response
Optional columns: session_id, user_id, sentiment_before, sentiment_after
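To show how the one-row-per-turn CSV relates to the nested JSON layout, here is a minimal sketch that groups rows by `conversation_id`. The helper name `csv_to_conversations` is hypothetical, and the optional columns are simply ignored in this example.

```python
import csv
import io
from collections import defaultdict


def csv_to_conversations(text: str) -> list[dict]:
    """Group one-row-per-turn CSV text into the nested conversation layout."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        grouped[row["conversation_id"]].append(
            {
                "turn_number": int(row["turn_number"]),
                "user_input": row["user_input"],
                "agent_response": row["agent_response"],
            }
        )
    # Dicts keep insertion order, so conversations appear in first-seen order.
    return [
        {"conversation_id": cid, "turns": sorted(turns, key=lambda t: t["turn_number"])}
        for cid, turns in grouped.items()
    ]


csv_text = """conversation_id,turn_number,user_input,agent_response
c1,1,Hello,Hi!
c1,2,Thanks,You're welcome
c2,1,Help,Sure
"""
result = csv_to_conversations(csv_text)
print(len(result), len(result[0]["turns"]))  # 2 2
```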

Run Analysis

Scans all uploaded conversations for behavioural gaps, then uses Qwen/Qwen2.5-72B-Instruct via the HF Inference API to generate guardrail rules automatically.

Ralph Loop checkpointing: analysis is resumable if the Space times out mid-run
Detects: explicit corrections, repeated questions, code anti-patterns, sentiment drops
Requires ≥2 occurrences of a gap type before generating a rule
Rules are saved directly to the dataset and appear in the Rules tab
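The resumable behaviour in the first bullet can be pictured with a generic checkpoint loop. This is a sketch of the general technique only, not the project's Ralph Loop implementation: the function `run_with_checkpoint`, the JSON checkpoint file, and the stub `fake_analyse` are all assumptions for illustration.

```python
import json
import os
import tempfile


def run_with_checkpoint(conversation_ids, analyse, checkpoint_path):
    """Analyse each conversation once, persisting progress after every item."""
    done = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, encoding="utf-8") as fh:
            done = json.load(fh)
    for cid in conversation_ids:
        if cid in done:
            continue  # already analysed in an earlier, interrupted run
        done[cid] = analyse(cid)
        # Write to a temp file, then rename, so a crash never leaves a
        # half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as fh:
            json.dump(done, fh)
        os.replace(tmp_path, checkpoint_path)
    return done


calls = []


def fake_analyse(cid):
    # Stand-in for the real gap scan; records which IDs were processed.
    calls.append(cid)
    return {"gaps": 0}


path = os.path.join(tempfile.mkdtemp(), "analysis_checkpoint.json")
run_with_checkpoint(["c1", "c2"], fake_analyse, path)        # run that "timed out" after c2
run_with_checkpoint(["c1", "c2", "c3"], fake_analyse, path)  # resumed run
print(calls)  # ['c1', 'c2', 'c3']
```

The second call skips `c1` and `c2` because their results are already in the checkpoint, so only `c3` is analysed again.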

🔄 Validate & Evolve uses the Mengram feedback pattern: instead of just deactivating low-performing rules (< 30% effectiveness), it rewrites them with the AI model so they improve rather than disappear.
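A minimal sketch of the rewrite-instead-of-deactivate idea follows. The 30% threshold comes from the description above; the `Rule` dataclass, the `version` counter, and the `rewrite` callable (a stand-in for the model call) are hypothetical names introduced for this example.

```python
from dataclasses import dataclass, replace
from typing import Callable

EVOLVE_THRESHOLD = 0.30  # "< 30% effectiveness" in the description above


@dataclass(frozen=True)
class Rule:
    rule_id: str
    text: str
    effectiveness: float  # fraction in [0, 1]
    version: int = 1


def validate_and_evolve(rules: list[Rule], rewrite: Callable[[Rule], str]) -> list[Rule]:
    """Rewrite low-performing rules instead of dropping them."""
    evolved = []
    for rule in rules:
        if rule.effectiveness < EVOLVE_THRESHOLD:
            # Keep the rule's identity, swap in the rewritten text, bump the version.
            evolved.append(replace(rule, text=rewrite(rule), version=rule.version + 1))
        else:
            evolved.append(rule)
    return evolved


def stub_rewrite(rule: Rule) -> str:
    # Assumption: a real system would call the AI model here.
    return rule.text + " Give a concrete example."


rules = [
    Rule("r1", "Confirm the language before writing code.", 0.72),
    Rule("r2", "Be clearer.", 0.10),
]
updated = validate_and_evolve(rules, stub_rewrite)
print([(r.rule_id, r.version) for r in updated])  # [('r1', 1), ('r2', 2)]
```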

Project-level health sensor — tracks whether the deployed Space, dataset, rule system, and workflow are all moving in the right direction.

Per-conversation alignment sensor — task focus, rule compliance, and semantic drift across turns.

Type a user message below to see which gaps would be detected and which rules would be injected.

User message

System Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    CONVERSATION FLOW                            │
│                                                                 │
│  User Input                                                     │
│      │                                                          │
│      ▼                                                          │
│  ┌──────────────┐    ┌─────────────┐    ┌──────────────────┐  │
│  │   Rule       │    │  System     │    │   AI Adapter     │  │
│  │   Engine     │───▶│  Prompt     │───▶│  OpenAI/Claude   │  │
│  │  (pre-hook)  │    │  Injected   │    │                  │  │
│  └──────────────┘    └─────────────┘    └──────────────────┘  │
│         │                                        │              │
│         │                                        ▼              │
│  ┌──────────────┐                      ┌──────────────────┐   │
│  │  HF Dataset  │                      │   AI Response    │   │
│  │  (rules)     │                      │                  │   │
│  └──────────────┘                      └──────────────────┘   │
│                                                  │              │
│                                                  ▼              │
│                                        ┌──────────────────┐   │
│                                        │  Gap Detector    │   │
│                                        │  (post-hook)     │   │
│                                        └──────────────────┘   │
│                                                  │              │
│                              ┌───────────────────┤             │
│                              ▼                   ▼             │
│                    ┌──────────────┐    ┌──────────────────┐   │
│                    │  HF Dataset  │    │ Rule Generator   │   │
│                    │(conversations│    │ (when gaps → 2+) │   │
│                    └──────────────┘    └──────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
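The flow in the diagram can be sketched as one function per conversation turn: rules are injected into the system prompt (pre-hook), the model is called through an adapter, and the response is passed to a gap detector (post-hook). Every name here (`build_system_prompt`, `handle_turn`, the stub model and detector) is an assumption for illustration; the real adapter and detector interfaces may differ.

```python
from typing import Callable


def build_system_prompt(base_prompt: str, rules: list[str]) -> str:
    """Pre-hook: append the active rules to the base system prompt."""
    if not rules:
        return base_prompt
    bullet_list = "\n".join(f"- {rule}" for rule in rules)
    return f"{base_prompt}\n\nFollow these learned rules:\n{bullet_list}"


def handle_turn(
    user_input: str,
    rules: list[str],
    call_model: Callable[[str, str], str],
    detect_gaps: Callable[[str, str], list[str]],
    base_prompt: str = "You are a helpful assistant.",
) -> tuple[str, list[str]]:
    system_prompt = build_system_prompt(base_prompt, rules)  # Rule Engine (pre-hook)
    response = call_model(system_prompt, user_input)         # AI Adapter
    gaps = detect_gaps(user_input, response)                 # Gap Detector (post-hook)
    return response, gaps


captured = {}


def stub_model(system_prompt: str, user_input: str) -> str:
    # Stand-in for an OpenAI or Claude call; records the prompt it received.
    captured["system_prompt"] = system_prompt
    return "Here is a draft."


def stub_detector(user_input: str, response: str) -> list[str]:
    return ["explicit_correction"] if "wrong" in user_input.lower() else []


response, gaps = handle_turn(
    "That is wrong", ["Ask before assuming a framework."], stub_model, stub_detector
)
print(gaps)  # ['explicit_correction']
```

Passing the model and detector in as callables keeps the sketch independent of any particular provider SDK.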

Gap Detection Categories

| Gap Type | Trigger | Severity |
|---|---|---|
| `sentiment_drop` | User sentiment falls > 0.3 points | 4 |
| `explicit_correction` | User says "wrong", "actually", "fix" etc. | 5 |
| `repeated_question` | Same question asked 2+ times | 3 |
| `code_anti_pattern` | Bare except, eval, hardcoded secrets | 5 |
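The triggers in the table might be approximated with simple checks like the ones below. This is a simplified sketch, not the project's detector: the regular expressions, the exact-match test for repeated questions, and the choice to scan the agent response for anti-patterns are assumptions, and hardcoded-secret detection is left out.

```python
import re

# Severity values copied from the table above.
SEVERITY = {
    "sentiment_drop": 4,
    "explicit_correction": 5,
    "repeated_question": 3,
    "code_anti_pattern": 5,
}

CORRECTION_WORDS = re.compile(r"\b(wrong|actually|fix)\b", re.IGNORECASE)
# Matches a bare `except:` clause or a call to eval(...).
ANTI_PATTERNS = re.compile(r"except\s*:|\beval\s*\(")


def detect_gaps(user_input, agent_response, previous_questions=(),
                sentiment_before=None, sentiment_after=None):
    """Return (gap_type, severity) pairs for one turn."""
    gaps = []
    if CORRECTION_WORDS.search(user_input):
        gaps.append("explicit_correction")
    # Assumption: "same question" means identical text after trimming and lowercasing.
    if user_input.strip().lower() in {q.strip().lower() for q in previous_questions}:
        gaps.append("repeated_question")
    if ANTI_PATTERNS.search(agent_response):
        gaps.append("code_anti_pattern")
    if sentiment_before is not None and sentiment_after is not None:
        if sentiment_before - sentiment_after > 0.3:
            gaps.append("sentiment_drop")
    return [(gap, SEVERITY[gap]) for gap in gaps]


found = detect_gaps(
    "That is wrong, fix it",
    "try:\n    run()\nexcept:\n    pass",
    sentiment_before=0.8,
    sentiment_after=0.4,
)
print(found)
```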

Rule Lifecycle

Gap detected → Group similar gaps → ≥2 occurrences?
                                          │
                                     Yes  ▼
                              Generate Rule (via AI)
                                          │
                                          ▼
                              Deploy to HF Dataset
                                          │
                                          ▼
                              Inject in future prompts
                                          │
                                          ▼
                              Track effectiveness
                                          │
                              Score < 15%? → Deactivate
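The two decision points in the lifecycle (the occurrence gate and the deactivation check) can be sketched as below. The thresholds are the ones shown in the diagram; the definition of effectiveness as followed/injected and the choice to keep rules that have never been injected are assumptions, since the source does not say how the score is computed.

```python
from collections import Counter

MIN_OCCURRENCES = 2      # "≥2 occurrences?" in the diagram
DEACTIVATE_BELOW = 0.15  # "Score < 15%?" in the diagram


def gap_types_ready_for_rules(gap_types: list[str]) -> list[str]:
    """Group detected gaps by type and keep types seen at least twice."""
    counts = Counter(gap_types)
    return sorted(t for t, n in counts.items() if n >= MIN_OCCURRENCES)


def should_deactivate(times_followed: int, times_injected: int) -> bool:
    """Assumed scoring: share of injections after which the rule was followed."""
    if times_injected == 0:
        return False  # no evidence yet, so keep the rule active
    return times_followed / times_injected < DEACTIVATE_BELOW


detected = ["explicit_correction", "sentiment_drop", "explicit_correction"]
print(gap_types_ready_for_rules(detected))  # ['explicit_correction']
print(should_deactivate(1, 10))             # True
print(should_deactivate(3, 10))             # False
```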

Run Locally

git clone https://github.com/FAJU85/AI_Rule_Learning.git
cd AI_Rule_Learning
pip install -r requirements.txt
cp .env.example .env
# Add your API keys to .env
python -m src.cli.main chat

CLI Commands

| Command | Description |
|---|---|
| `python -m src.cli.main chat` | Interactive conversation with rule injection |
| `python -m src.cli.main analyze --days 7` | Analyse last 7 days, generate rules |
| `python -m src.cli.main validate` | Score and prune ineffective rules |
| `python -m src.cli.main list-rules` | Show all active rules |
| `python scripts/upload_historical.py --file data.json` | Bulk upload conversations via CLI |

Environment Variables

HF_TOKEN=your_hf_token
HF_DATASET_NAME=vooom/AI_Rule_Learning
OPENAI_API_KEY=your_openai_key        # or
ANTHROPIC_API_KEY=your_anthropic_key
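As a sketch of how these variables could be checked at startup, the helper below verifies the Hugging Face settings and picks a provider from whichever API key is present. The function name `pick_provider`, the treatment of both `HF_*` variables as mandatory, and the preference for OpenAI when both keys are set are assumptions, not documented behaviour.

```python
import os


def pick_provider(env=None) -> str:
    """Return "openai" or "anthropic" based on which API key is configured."""
    env = os.environ if env is None else env
    # Assumption: both Hugging Face variables are mandatory.
    missing = [key for key in ("HF_TOKEN", "HF_DATASET_NAME") if not env.get(key)]
    if missing:
        raise RuntimeError(f"missing required variables: {', '.join(missing)}")
    # Assumption: OpenAI wins when both provider keys are set.
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    raise RuntimeError("set OPENAI_API_KEY or ANTHROPIC_API_KEY")


example_env = {
    "HF_TOKEN": "your_hf_token",
    "HF_DATASET_NAME": "vooom/AI_Rule_Learning",
    "ANTHROPIC_API_KEY": "your_anthropic_key",
}
print(pick_provider(example_env))  # anthropic
```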

Source

github.com/FAJU85/AI_Rule_Learning
