Credential Sprawl in AI-Assisted Development

The Problem

AI coding assistants have become standard tooling for software development. Claude Code, OpenAI’s Codex CLI, Cursor, and Windsurf all operate on the same principle: the AI agent reads your code, executes commands, and assists with debugging. What many developers don’t realize is that these interactions are written to plaintext session logs that stay on disk.

Session transcripts are stored as JSONL files. These logs capture tool calls, command outputs, file reads, and environment variables. When you ask Claude Code to “check if the API key is configured correctly,” it reads your .env file and logs the entire contents. When you debug an SSH connection issue, it reads ~/.ssh/id_rsa and stores your private key. When you run aws configure list, the output containing access key IDs goes straight into the session log.

These logs persist indefinitely. No automatic rotation, no encryption at rest, no secret filtering. Your development history becomes a treasure trove for anyone who gains read access to your home directory.

Attack Surface Quantification

Session Log Locations

Tool	macOS/Linux	Windows
Claude Code	`~/.claude/projects//sessions/.jsonl`	`%APPDATA%\.claude\projects\\sessions\.jsonl`
Codex CLI	`~/.codex/sessions/*.jsonl`	`%USERPROFILE%\.codex\sessions\*.jsonl`
Cursor	`~/.cursor/sessions/*.jsonl`	`%APPDATA%\Cursor\sessions\*.jsonl`
Windsurf	`~/.windsurf/logs/*.jsonl`	`%APPDATA%\Windsurf\logs\*.jsonl`

Each session file contains the full transcript of AI interactions: user prompts, tool call parameters, command outputs, file contents, and error messages. File sizes range from 100KB for quick debugging sessions to 50MB+ for long development sessions.
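To see how much transcript data has accumulated on a machine, the locations in the table can be walked with a short script. This is a minimal sketch, not part of Bagel: the name find_session_logs is hypothetical, only the macOS/Linux paths are covered, and the Claude Code entry assumes the wildcard form ~/.claude/projects/*/sessions/*.jsonl used in the scan commands later in this article.

```python
from pathlib import Path

# Glob patterns relative to the home directory. The Codex CLI, Cursor, and
# Windsurf entries come from the table above; the Claude Code entry is the
# wildcard form assumed in the lead-in.
SESSION_GLOBS = [
    ".claude/projects/*/sessions/*.jsonl",
    ".codex/sessions/*.jsonl",
    ".cursor/sessions/*.jsonl",
    ".windsurf/logs/*.jsonl",
]

def find_session_logs(home=None):
    """Return (path, size_in_bytes) for every session log found under home."""
    home = Path(home) if home is not None else Path.home()
    found = []
    for pattern in SESSION_GLOBS:
        for path in sorted(home.glob(pattern)):
            if path.is_file():
                found.append((path, path.stat().st_size))
    return found
```

Calling find_session_logs() with no argument inspects the current user’s home directory; summing the returned sizes gives a rough measure of how much history is sitting on disk.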

Secret Types in the Wild

Testing across 200+ real developer sessions (with permission) revealed the following secret categories:

Secret Type	Prevalence	Example Pattern	Common Source
AWS Access Keys	47%	`AKIA[0-9A-Z]{16}`	`.env` files, `aws configure` output
SSH Private Keys	38%	`-----BEGIN OPENSSH PRIVATE KEY-----`	Debugging SSH issues, reading `~/.ssh/`
JWT Tokens	62%	`eyJ[A-Za-z0-9-_]+\.eyJ[A-Za-z0-9-_]+\.[A-Za-z0-9-_]+`	API testing, auth debugging
Database URLs	54%	`postgres://user:pass@host/db`	Connection string debugging
API Keys	71%	High-entropy strings in config files	`.env`, config reads, error messages
GitHub PATs	29%	`ghp_[a-zA-Z0-9]{36}`	Git auth troubleshooting
Slack Webhooks	18%	`https://hooks.slack.com/services/T/B/`	Integration debugging
Private Keys (RSA/ECDSA)	22%	`-----BEGIN RSA PRIVATE KEY-----`	Certificate debugging, deployment issues

The median session contained 8 unique secrets. The 95th percentile session contained 23 unique secrets. A single developer’s session history over 6 months averaged 347 unique secrets across all log files.

Persistence and Accumulation

Session logs are append-only. Deleting code or rotating credentials does not remove old secrets from historical logs. Testing showed:

89% of developers never manually clear session logs
Average session log retention: 14 months before disk space cleanup
Oldest log file found in testing: 2.3 years
Secrets remain readable long after they’ve been rotated in production

The credential lifespan mismatch creates a vulnerability window. You rotate an AWS key after a suspected breach, but the old key remains in 47 different session files spanning 18 months of development work.

Threat Model

Post-Exploitation: The Home Directory Goldmine

Once an attacker has read access to a developer workstation, AI session logs are a high-value target. Unlike scattered config files or shell history, session logs aggregate credentials from multiple services in one location.

Exploitation scenario:

Phishing or malware grants read access to ~/.claude/
Attacker scripts parse all session JSONL files
Regex extraction yields AWS keys, SSH keys, database credentials
Lateral movement to production AWS, internal databases, CI/CD systems
Persistence via stolen SSH keys to bastion hosts

The attack requires no privilege escalation. Malware running as the developer inherits standard user file permissions, which allow reading everything in that user’s home directory. macOS does not prompt for permission to access ~/.claude/ (unlike Photos or Documents). Endpoint detection tools do not typically flag reads to these directories as anomalous.

Insider Threat and Shared Workstations

Development environments often involve:

Pair programming on shared machines
Intern/contractor workstations rotated between employees
Shared jump boxes or development VMs
Contractor laptops returned at end of engagement

Session logs persist across user transitions unless explicitly wiped. The next developer inherits the previous developer’s secrets. Testing showed 31% of organizations reimage contractor laptops but leave home directory backups in place.

Supply Chain: Malicious Plugins

AI coding assistants support plugin ecosystems. A malicious VS Code extension or Claude Code hook could:

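const fs = require('fs');
const os = require('os');
const path = require('path');
const glob = require('glob');

// glob does not expand "~", so build the pattern from the home directory
const pattern = path.join(os.homedir(), '.claude/projects/*/sessions/*.jsonl');
glob.sync(pattern).forEach(log => {
  // extractSecrets and exfiltrate stand in for attacker-supplied logic
  const secrets = extractSecrets(fs.readFileSync(log, 'utf8'));
  exfiltrate(secrets);
});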

The plugin runs with the developer’s full file system permissions. Session logs are not protected by additional OS security boundaries. A supply chain attack via a popular plugin could harvest secrets from thousands of developers.

Cloud Sync and Backup Services

Many developers enable iCloud, Dropbox, or OneDrive for their home directory. Session logs sync automatically unless explicitly excluded.

Testing showed:

41% of developers had ~/.claude/ syncing to cloud storage
iCloud default excludes only ~/Library/, not ~/.claude/
Dropbox syncs dotfiles by default
Google Drive File Stream includes home directory if enabled

A cloud account compromise grants historical access to all session logs across devices. The blast radius extends beyond a single workstation.

The Blast Radius Multiplier

A single compromised developer workstation yields secrets to:

All AWS accounts the developer has accessed
All databases debugged during development
All SSH hosts connected to for deployment or debugging
All API keys tested during integration work
All service account credentials used in local development

Testing one real developer workstation (with permission) found credentials for 23 distinct AWS accounts, 12 production databases, 8 SSH bastion hosts, and 34 third-party API services. A single-machine compromise became a multi-environment, multi-service breach.

The Scanner: Bagel

In response to this attack surface, I built an initial proof-of-concept (secret-scrubber), then contributed the AI session log scanning capabilities to Bagel — BoostSecurity’s open-source workstation security scanner.

Repository: github.com/boostsecurityio/bagel

Detection Methodology

The scanner uses a multi-layered approach:

1. Regex Pattern Matching

Detects known credential formats:

# Raw regex strings for well-known credential formats
AWS_KEY = r'AKIA[0-9A-Z]{16}'
SSH_PRIVATE = r'-----BEGIN (OPENSSH|RSA|EC) PRIVATE KEY-----'
# Hyphen placed last in each class so it is always a literal, never a range
JWT = r'eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+'
GITHUB_PAT = r'ghp_[a-zA-Z0-9]{36}'
SLACK_WEBHOOK = r'https://hooks\.slack\.com/services/T[A-Z0-9]+/B[A-Z0-9]+/[A-Za-z0-9]+'
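
A minimal sketch of how patterns like these might be applied to a session file line by line. The names scan_lines and mask are hypothetical and this is not Bagel’s implementation; it only illustrates reporting a line number together with a masked match, in the spirit of the report output shown later. The JWT character class places the hyphen last, which matches the same characters.

```python
import re

# A subset of the patterns listed above, compiled once.
PATTERNS = {
    "AWS_KEY": re.compile(r"AKIA[0-9A-Z]{16}"),
    "SSH_PRIVATE": re.compile(r"-----BEGIN (OPENSSH|RSA|EC) PRIVATE KEY-----"),
    "JWT": re.compile(r"eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
    "GITHUB_PAT": re.compile(r"ghp_[a-zA-Z0-9]{36}"),
}

def scan_lines(lines):
    """Yield (line_number, pattern_name, matched_text) for each hit."""
    for number, line in enumerate(lines, start=1):
        for name, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                yield number, name, match.group(0)

def mask(secret, keep=4):
    """Keep only the first few characters so a report does not leak the secret."""
    return secret[:keep] + "*" * (len(secret) - keep)
```

Passing an open file object to scan_lines works because iterating a text file yields its lines; each JSONL record is one line, so the line number identifies the record.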

2. Entropy-Based Detection

High-entropy strings likely to be secrets:

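import math
from collections import Counter

def shannon_entropy(data):
    """Return the Shannon entropy of a string in bits per character."""
    if not data:
        return 0.0
    total = len(data)
    # Count each distinct character once instead of probing all 256 byte values,
    # which also handles characters outside the Latin-1 range.
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())

# Flag strings > 4.5 entropy and length > 20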
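
The thresholds in the comment above could be applied roughly as follows. This is a sketch under stated assumptions: the tokenizer regex and the name high_entropy_tokens are hypothetical choices made for illustration, and only the 4.5-bit and 20-character limits come from the text.

```python
import math
import re
from collections import Counter

ENTROPY_THRESHOLD = 4.5  # bits per character
MIN_LENGTH = 20          # only strings longer than this are considered

# Assumption: candidate tokens are runs of characters common in keys and tokens.
# "=" is excluded so that KEY=value pairs split into separate tokens.
TOKEN = re.compile(r"[A-Za-z0-9+/_\-]+")

def shannon_entropy(data):
    """Shannon entropy of a string in bits per character."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())

def high_entropy_tokens(text):
    """Return tokens longer than MIN_LENGTH whose entropy exceeds the threshold."""
    return [
        token
        for token in TOKEN.findall(text)
        if len(token) > MIN_LENGTH and shannon_entropy(token) > ENTROPY_THRESHOLD
    ]
```

One consequence of these numbers is worth knowing. Entropy per character cannot exceed log2 of the number of distinct characters, so a string needs at least 23 distinct characters to pass 4.5 bits (log2(22) ≈ 4.46). Hex-encoded secrets top out at 4.0 bits and are never flagged by this layer alone, which is one reason to combine it with the regex and context layers.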

3. Context-Aware Detection

Looks for secret-like patterns in specific contexts:

Environment variables (AWS_SECRET_ACCESS_KEY=...)
Database connection strings (postgres://, mongodb://)
Private key blocks (multiline BEGIN/END pairs)
JSON key-value pairs ("apiKey": "...")
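
The four contexts above can be sketched as regular expressions. These patterns are illustrative assumptions, not Bagel’s actual rules; the keyword lists and the name find_in_context are hypothetical.

```python
import re

CONTEXT_PATTERNS = {
    # NAME=value assignments whose name suggests a credential
    "ENV_VAR": re.compile(
        r"\b([A-Z0-9_]*(?:SECRET|TOKEN|PASSWORD|API_KEY|ACCESS_KEY)[A-Z0-9_]*)=(\S+)"
    ),
    # Connection URLs that embed user:password before the host
    "DB_URL": re.compile(
        r"\b(?:postgres(?:ql)?|mongodb(?:\+srv)?)://[^\s:/@]+:[^\s@]+@[^\s/]+"
    ),
    # JSON pairs such as "apiKey": "..."
    "JSON_PAIR": re.compile(
        r'"([A-Za-z_]*(?:key|secret|token|password)[A-Za-z_]*)"\s*:\s*"([^"]+)"',
        re.IGNORECASE,
    ),
    # Private key blocks, matched from BEGIN through the paired END
    "KEY_BLOCK": re.compile(
        r"-----BEGIN ([A-Z ]*PRIVATE KEY)-----.*?-----END \1-----", re.DOTALL
    ),
}

def find_in_context(text):
    """Return (pattern_name, start_offset) for every contextual hit, in text order."""
    hits = []
    for name, pattern in CONTEXT_PATTERNS.items():
        hits.extend((name, m.start()) for m in pattern.finditer(text))
    return sorted(hits, key=lambda hit: hit[1])
```

Matching on the surrounding context catches values that have no recognizable format of their own, such as a short database password, at the cost of more false positives from placeholder values.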

Operational Modes

Report-Only Mode

bagel scan ~/.claude/projects/*/sessions/*.jsonl --report-only

Outputs:

[CRITICAL] AWS Access Key found in session_20260318_143052.jsonl:247
  File: .env
  Match: AKIA****************EXAMPLE

[HIGH] SSH Private Key found in session_20260402_091523.jsonl:1834
  File: ~/.ssh/id_rsa
  Match: -----BEGIN OPENSSH PRIVATE KEY-----

[MEDIUM] JWT Token found in session_20260415_112034.jsonl:592
  Context: API test response
  Match: eyJhbGci****************(truncated)

Summary:
  Total sessions scanned: 143
  Sessions with secrets: 89 (62%)
  Total secrets found: 347
  CRITICAL: 73
  HIGH: 128
  MEDIUM: 146

In-Place Redaction Mode

bagel scan ~/.claude/projects/*/sessions/*.jsonl --redact

Replaces secrets with [REDACTED:AWS_KEY] or [REDACTED:SSH_PRIVATE_KEY] while preserving log structure. Creates .backup files before modification.
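
The behaviour described above (typed placeholders, a backup before modification) can be approximated in a few lines. This is a sketch, not Bagel’s code: the names redact_text and redact_file are hypothetical and the pattern set is reduced for brevity.

```python
import re
import shutil
from pathlib import Path

REDACTIONS = {
    "AWS_KEY": re.compile(r"AKIA[0-9A-Z]{16}"),
    "GITHUB_PAT": re.compile(r"ghp_[a-zA-Z0-9]{36}"),
    # Match the whole key block, not just the BEGIN line, so the body goes too.
    "SSH_PRIVATE_KEY": re.compile(
        r"-----BEGIN (OPENSSH|RSA|EC) PRIVATE KEY-----.*?-----END \1 PRIVATE KEY-----",
        re.DOTALL,
    ),
}

def redact_text(text):
    """Replace each match with a typed placeholder; return (new_text, count)."""
    total = 0
    for label, pattern in REDACTIONS.items():
        text, count = pattern.subn(f"[REDACTED:{label}]", text)
        total += count
    return text, total

def redact_file(path):
    """Redact a session log in place, keeping a .backup copy of the original."""
    path = Path(path)
    original = path.read_text(encoding="utf-8", errors="replace")
    redacted, count = redact_text(original)
    if count:
        shutil.copy2(path, path.with_name(path.name + ".backup"))
        path.write_text(redacted, encoding="utf-8")
    return count
```

Note that the backup file still holds the unredacted secrets, so it should be deleted or moved to protected storage once the redacted log has been checked.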

Monitoring Mode

bagel watch ~/.claude/projects/*/sessions/ --alert

Runs as a daemon, scanning new session files on creation and sending alerts when secrets are detected.
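
How a watcher might follow logs as they grow can be sketched with simple polling. This is an assumption about one possible design, not a description of how bagel watch works; the names read_new_lines and poll_once are hypothetical, and a production daemon would more likely use OS file-change notifications.

```python
from pathlib import Path

def read_new_lines(path, offsets):
    """Return complete lines appended to path since the last call.

    offsets maps each path to the byte position already processed and is
    updated in place.
    """
    path = Path(path)
    start = offsets.get(path, 0)
    with path.open("rb") as handle:
        handle.seek(start)
        data = handle.read()
    # Consume only complete lines; a half-written record waits for the next poll.
    end = data.rfind(b"\n") + 1
    offsets[path] = start + end
    return data[:end].decode("utf-8", errors="replace").split("\n")[:-1]

def poll_once(directory, offsets, on_line):
    """Pass every newly appended line under directory to on_line(path, line)."""
    for path in sorted(Path(directory).rglob("*.jsonl")):
        for line in read_new_lines(path, offsets):
            on_line(path, line)
```

A daemon would call poll_once in a loop with a short sleep, passing a callback that runs the detectors and raises an alert. Tracking byte offsets means each record is scanned once, even in sessions that run for hours.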

Real-World Findings

Scanning 200+ developer sessions (anonymized) revealed:

Finding Category	Occurrences	Median Severity
Hardcoded AWS keys in `.env` files	94	CRITICAL
SSH private keys read during debugging	76	HIGH
Database credentials in connection strings	108	CRITICAL
API tokens in config files	142	HIGH
JWT tokens from API testing	124	MEDIUM
Slack webhook URLs	36	MEDIUM
GitHub PATs in git auth troubleshooting	58	HIGH
Private TLS keys for local dev	44	HIGH

Most Dangerous Pattern: AWS access keys in .env files read during “check if credentials are configured” debugging. These keys often had full AdministratorAccess policy attachments because they were personal developer credentials, not scoped service accounts.

Longest Credential Lifetime: An SSH private key to a production bastion host found in session logs 26 months old. The key had never been rotated. The bastion host was still in production.

Blast Radius Example: One developer’s session history contained credentials for 23 distinct AWS accounts (personal, dev, staging, prod across multiple customers). A single workstation compromise yielded access to 23 environments.

Mitigations

Vendor-Side Solutions

1. Session Log Encryption at Rest

AI tools should encrypt session logs using OS-level encryption:

macOS: Encrypt using SecItem keychain with device-locked keys
Linux: Use libsecret or kernel keyring
Windows: Use DPAPI with machine-scoped keys

Encryption prevents trivial reads by malware or supply chain attacks, because decrypting the logs requires OS authentication.

2. Pre-Logging Secret Filtering

Before writing tool outputs to session logs, run regex and entropy-based detection:

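import re

# Patterns defined under Detection Methodology; append further patterns as needed
PATTERNS = [AWS_KEY, SSH_PRIVATE, JWT, GITHUB_PAT, SLACK_WEBHOOK]

def filter_secrets(content):
    """Replace every match of a known credential pattern with a placeholder."""
    for pattern in PATTERNS:
        content = re.sub(pattern, '[REDACTED]', content)
    return content

# file_content holds the raw tool output about to be logged.
# SSH_PRIVATE matches only the BEGIN line, so a production filter must also
# strip the key body that follows it.
log_entry = {
    "tool": "Read",
    "file": ".env",
    "content": filter_secrets(file_content),
}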

This prevents secrets from ever being persisted in plaintext.

3. Automatic Log Rotation

Default retention policies:

Delete session logs older than 90 days
Compress and archive logs older than 30 days
Prompt user on first run: “Keep session logs for: 30d / 90d / 1yr / indefinitely”

Most developers do not need multi-year session history. Time-bound retention limits the exposure window.
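
Until vendors ship such defaults, the 30-day and 90-day policy above can be approximated locally. A minimal sketch, assuming file modification time is an acceptable proxy for session age; the name apply_retention is hypothetical.

```python
import gzip
import shutil
import time
from pathlib import Path

DAY = 86_400  # seconds

def apply_retention(directory, compress_after_days=30, delete_after_days=90, now=None):
    """Compress logs older than compress_after_days, delete those older than delete_after_days.

    Returns (compressed, deleted) lists of paths. Age is taken from each file's mtime.
    """
    now = time.time() if now is None else now
    compressed, deleted = [], []
    for path in sorted(Path(directory).rglob("*.jsonl*")):
        if not path.is_file() or path.suffix not in (".jsonl", ".gz"):
            continue
        age_days = (now - path.stat().st_mtime) / DAY
        if age_days > delete_after_days:
            path.unlink()
            deleted.append(path)
        elif age_days > compress_after_days and path.suffix == ".jsonl":
            target = path.with_name(path.name + ".gz")
            with path.open("rb") as src, gzip.open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            # Keep the original mtime so the archive is still deleted on schedule.
            shutil.copystat(path, target)
            path.unlink()
            compressed.append(target)
    return compressed, deleted
```

Compression saves disk space but is not a security control: a gzip archive is as readable as the original log, so the deletion step is what actually shrinks the exposure window.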

Developer-Side Mitigations

4. Regular Secret Scanning

Run bagel weekly:

# Cron job
0 2 * * 0 bagel scan ~/.claude/projects/*/sessions/*.jsonl --report-only --alert

This detects secrets before a compromise and gives you a trigger for credential rotation.

5. Session Log Exclusions

Exclude session log directories from cloud sync:

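# Time Machine exclusions (macOS); tmutil manages Time Machine backups, not iCloud
tmutil addexclusion ~/.claude/projects/*/sessions/

# Dropbox exclusions (macOS); Dropbox skips paths that carry this extended attribute
xattr -w com.dropbox.ignored 1 ~/.claude/projects/*/sessions/

# OneDrive exclusions (Windows)
# OneDrive syncs only its own folder and any redirected known folders (Desktop,
# Documents, Pictures), so keep %USERPROFILE%\.claude outside those locations.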

This limits the blast radius to a single workstation.

6. Credential Rotation After Sensitive Sessions

After any session involving production credentials:

Rotate the credentials immediately
Audit session logs for the old credentials
Redact or delete the session file

Treat AI session logs as compromised by default if they contain production secrets.
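
The audit step can be as simple as searching every session file for the old credential. A minimal sketch; the name files_containing is hypothetical. It searches for the literal value, so a secret that was JSON-escaped when logged (for example, one containing quotes or backslashes) would need its escaped form searched as well.

```python
from pathlib import Path

def files_containing(secret, directory):
    """Return {path: [line numbers]} for session logs that still contain secret."""
    needle = secret.encode("utf-8")
    hits = {}
    for path in sorted(Path(directory).rglob("*.jsonl")):
        with path.open("rb") as handle:
            lines = [n for n, line in enumerate(handle, start=1) if needle in line]
        if lines:
            hits[path] = lines
    return hits
```

When running a check like this interactively, read the old credential with getpass.getpass() rather than typing it as a command-line argument, so that the audit does not write the secret into shell history.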

7. Endpoint Monitoring

EDR/XDR rules to detect anomalous access:

rule: ai_session_log_access_by_non_owner
condition:
  - file_path matches "/.claude/projects/*/sessions/*.jsonl"
  - process_user != file_owner
  - process_name not in [claude-code, cursor, windsurf, codex]
action: alert

This flags reads by malware or lateral movement tools running under another account.
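
To make the rule’s semantics concrete, here it is as a plain predicate, assuming the three conditions are combined with AND. The event field names mirror the rule above and are hypothetical; real EDR products expose their own schemas.

```python
from fnmatch import fnmatchcase

ALLOWED_PROCESSES = {"claude-code", "cursor", "windsurf", "codex"}
LOG_GLOB = "*/.claude/projects/*/sessions/*.jsonl"

def should_alert(event):
    """Evaluate the rule against one file-read event (a dict)."""
    return (
        fnmatchcase(event["file_path"], LOG_GLOB)
        and event["process_user"] != event["file_owner"]
        and event["process_name"] not in ALLOWED_PROCESSES
    )
```

Written this way, one limitation is visible: a process running as the file owner never triggers the rule, and that is exactly the situation in the post-exploitation scenario described earlier. Dropping the owner clause would cover that case at the cost of more alerts from legitimate tools such as editors and backup agents.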

Responsible Disclosure

Findings were shared with:

Anthropic (Claude Code)
OpenAI (Codex CLI)
Cursor AI
Windsurf

Recommendations included pre-logging secret filtering, log encryption, and default retention policies. Some vendors have acknowledged and are evaluating solutions. This public disclosure occurs after a 90-day window.

Conclusion

AI coding assistants create a new persistent storage attack surface. Every debugging session, every “read this config file” request, every environment variable check writes plaintext secrets to disk. These logs accumulate over months or years, creating a credential aggregation point ripe for post-exploitation.

The attack surface grows linearly with AI adoption. As more developers integrate AI tools into daily workflows, more secrets leak into session logs. A single compromised developer workstation yields credentials for dozens of services, environments, and cloud accounts.

This is not a theoretical risk. Through testing I have found an average of 347 unique secrets per developer in session logs spanning 6-14 months. These logs are unencrypted, backed up to cloud storage, and never rotated. The blast radius of a single workstation compromise now extends to every service that developer has touched.

The solution requires action from both vendors and developers:

Vendors: Implement pre-logging secret filtering, encrypt logs at rest, default to 90-day retention
Developers: Scan logs regularly with bagel, exclude session directories from cloud sync, rotate credentials after sensitive sessions

Until these mitigations become standard, treat AI session logs as a high-value target in your threat model. Because attackers already are.

I am a security researcher and purple teamer. I build offensive security tools and publish research at johnmatrix.org. AI session log scanning is part of Bagel, BoostSecurity’s open-source workstation scanner.