AI Coding - The Ever-Changing Stack

As of early 2026, coding with AI means interacting with LLMs much like a REPL. The difference is that the interaction uses natural language instead of direct code input. New models are launched frequently, and it is hard to keep up with the latest releases.

Interface

"As far as the customer is concerned, the interface is the product." - Jef Raskin

For coding, the chat interface is often not sufficient as it can be slow for iterative development. The interaction happens through tools. Currently, the popular options are:

- Relatively new IDEs with built-in AI features, like Cursor and Antigravity
- Popular IDEs like VS Code and IntelliJ with AI code-assist extensions like Cline, Copilot, etc.
- AI tools by AI labs:
  1. Terminal-based tools (CLIs) like Claude Code, Codex, Gemini, etc.
  2. Native apps

Agents and Security

The tools can be used like a REPL, where you interact with the AI step by step, or you can give them permission to run on their own. Tools running this way are called agents. They are autonomous to the extent you allow. Most tools are agent-first now.

The common phases of development are Plan and Act. The planning phase allows you to iterate over the execution plan.

However, the act phase requires giving agents permission to implement and verify the code. Autonomous software with the capability to run commands and execute code is a security nightmare. Reviewing and approving every step an agent performs slows down the development cycle, and it is increasingly difficult to review the large code changes agents make at each step. Moreover, they may change course if an approach does not work. In practice, it makes more sense to let the agent run autonomously and review the final result rather than approve each individual step.

What happens if the agent runs the wrong commands and deletes files or, worse, crashes the system? While attempting to debug an issue, an agent may open ports, disable a security setting, or misconfigure the system. We cannot rely on the AI agent implementation to safeguard against such risks. Agents should be prevented from performing dangerous operations. This is why they should be run inside a sandbox.

Code Sandboxing

A sandbox is a secure, isolated environment used to execute, test, or analyze code, applications, or programs without affecting the system or surrounding environment.

https://www.browserstack.com/guide/what-is-sandbox

From source code sandboxing:

Sandboxing, in this case, is when a developer limits available system resources to a program from within its source code.

How Popular CLI tools implement sandboxing

Claude Code

They have a great page worth reading: sandboxing

Codex

Codex is implemented in Rust (openai/codex). At a high level, by platform:

macOS
- Uses Apple Seatbelt via sandbox-exec (the CLI wraps commands using Seatbelt).
- Runs the command inside a mostly read-only jail, exposing a small set of writable roots (cwd, TMPDIR, ~/.codex, etc.).
- Outbound network access is fully blocked by default (even an attempted curl will fail).
- Codex detects sandbox denial via aggregated output (e.g., filesystem "Read-only file system" messages) and special signals.
Linux
- There is no built-in OS sandboxing enabled by default in Codex; the README recommends using Docker for deterministic sandboxing.
- Codex ships/uses a helper binary (codex-linux-sandbox) that combines Landlock and seccomp. The code serializes the SandboxPolicy to JSON and invokes the helper with flags (see landlock.rs and create_linux_sandbox_command_args). The helper enforces file-system and syscall restrictions.
- The exec path will call spawn_command_under_linux_sandbox when Linux sandboxing is chosen; exec detection also looks for SIGSYS exit codes to infer seccomp denial.
Windows
- Codex integrates with a Windows sandboxing component (codex_windows_sandbox). The core crate exposes sandbox_setup_is_complete and run_elevated_setup that call into that module.
- The code supports a WindowsRestrictedToken sandbox type and includes UI flows for enabling sandbox features, elevated setup, and fallbacks if elevation is declined.
- Codex also performs world-writable directory scans and shows prompts related to world-writable filesystem protections before enabling agent mode.
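
To make the Linux exit-code detection mentioned above more concrete: a shell reports a process killed by a signal as exit status 128 plus the signal number, and seccomp's kill actions terminate the offending process with SIGSYS. The sketch below only illustrates that idea in shell. It is not Codex's Rust code, and the function name killed_by_sigsys is hypothetical.

```shell
# Returns success (0) when an exit status means "killed by SIGSYS".
# Shells report death-by-signal as 128 + signal number, and
# `kill -l <status>` (POSIX) maps such a status back to the signal name.
killed_by_sigsys() {
  [ "$1" -gt 128 ] || return 1
  [ "$(kill -l "$1" 2>/dev/null)" = "SYS" ]
}

# Demo: a child shell sends SIGSYS to itself, standing in for a
# process that made a syscall the seccomp filter denies.
status=0
sh -c 'kill -s SYS $$' 2>/dev/null || status=$?
if killed_by_sigsys "$status"; then
  echo "sandbox denial suspected (exit status $status)"
fi
```

An exit status alone is only a hint, which is presumably why the output-based checks described for macOS exist as well.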

Docker Sandbox for Agents

Docker has a neat trick for running agents: docker/sandbox/, which works with the popular agents. You can get started with docker sandbox run <agent>, where <agent> can be claude, gemini, codex, and more: docker/sandbox-templates

So, why not use Containers everywhere?

Containers are the most widely used sandboxes available today. Running your code inside containers is a solid, straightforward solution. However, containers require a runtime to be installed. While they start quickly, they need images and dependencies downloaded first.

Securing Agents in the IDE

Now that we know that CLI tools implement sandboxing and can also be run inside containers, it feels safer to write code with agents.

However, this raises the question: how do agents running within a typical IDE stay safe? They typically depend on:

- Human approval before making changes or running commands
- An allow or deny list of commands that can be run
- The safety built into the tool or extension itself, which may or may not be open source
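
To show what an allow or deny list amounts to, here is a minimal sketch in shell. It is not how any particular IDE extension implements it; the list contents and the function name classify_command are assumptions made for illustration.

```shell
# Hypothetical lists; adjust them to your project.
ALLOWED_COMMANDS="ls cat git pytest"
DENIED_COMMANDS="sudo rm curl dd"

# Prints "allow", "deny", or "ask" for a proposed command line.
classify_command() {
  cmd="$1"
  # Chaining, piping, redirection, or substitution could smuggle in a
  # second command, so hand those to a human.
  case "$cmd" in
    *';'*|*'&'*|*'|'*|*'>'*|*'<'*|*'`'*|*'$('*) echo "ask"; return 0 ;;
  esac
  program="${cmd%% *}"   # first word of the command line
  for name in $DENIED_COMMANDS; do
    if [ "$program" = "$name" ]; then echo "deny"; return 0; fi
  done
  for name in $ALLOWED_COMMANDS; do
    if [ "$program" = "$name" ]; then echo "allow"; return 0; fi
  done
  echo "ask"   # unknown commands fall back to human approval
}

classify_command "git status"
classify_command "sudo rm -rf /"
classify_command "ls; rm -rf /"
```

Matching on the first word is coarse: an allowed program can still be destructive with the wrong arguments, which is one reason a list like this complements a sandbox rather than replacing it.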

So, you can roll your own solution depending on the isolation level you want:

- Containers (Docker/Podman) for process isolation
- Dedicated user with minimal privileges
- Filesystem restrictions (read-only root, limited workspace)
- Resource limits (memory, CPU, disk quotas)
- Network isolation (if agent doesn't need network) - Not covered here
- Seccomp/AppArmor for syscall filtering - Not covered here

Here are the steps to implement some of the above:

1. Container Based Isolation

Docker and Podman provide strong isolation with minimal overhead:

# Run the agent with restricted filesystem access:
#   --read-only      root filesystem is read-only
#   --tmpfs          temp space without execution
#   --network none   no network access
#   --cap-drop ALL   drop all capabilities
docker run --rm \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid \
  --volume "$(pwd)/agent_workspace:/workspace:rw" \
  --network none \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  agent-image

Benefits: process isolation, resource limits (CPU, memory), and network control all in one. Bonus: this works on both Linux and macOS.
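
The resource limits mentioned here are set with additional docker run flags. A small sketch follows; the limit values and the helper name resource_limit_flags are hypothetical and should be tuned to your workload.

```shell
# Hypothetical limits; tune them for your workload.
AGENT_MEMORY="2g"   # hard memory cap for the container
AGENT_CPUS="2"      # number of CPUs the container may use
AGENT_PIDS="256"    # cap on processes, guards against fork bombs

# Prints the resource-limit flags, one per line, so they can be
# inspected or spliced into a docker run command.
resource_limit_flags() {
  printf '%s\n' \
    "--memory=${AGENT_MEMORY}" \
    "--cpus=${AGENT_CPUS}" \
    "--pids-limit=${AGENT_PIDS}"
}

resource_limit_flags

# Usage (requires Docker, so it is left commented out here):
# docker run --rm $(resource_limit_flags) --network none agent-image
```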

2. Filesystem Level Restrictions

macOS: Sandbox profiles

For example, Codex uses Seatbelt via sandbox-exec on macOS. Here is an example profile:

(version 1)
(deny default)
(allow file-read* (subpath "/System/Library"))
(allow file* (subpath "/workspace"))
(allow process-exec (literal "/bin/bash"))

Apply with: sandbox-exec -f profile.sb your-agent-command

More: https://igorstechnoclub.com/sandbox-exec/

Linux

There are many Linux tools that can be used, but as you add more steps, it gets closer to a container. Landlock is relatively new and often seen as a superior alternative/complement to traditional syscall filtering mechanisms like seccomp for filesystem control.

chroot + namespaces

# Create isolated environment (needs root because /sandbox sits at the filesystem root)
sudo mkdir -p /sandbox/{bin,lib,lib64,workspace}
# Copy only necessary binaries
sudo cp /bin/bash /sandbox/bin/
# Copy required libraries (use ldd to find them)
# That can be a bit of work!
# Run agent in chroot jail
sudo chroot /sandbox /bin/bash
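
Copying the libraries by hand is the tedious part. One way to automate it is to parse the output of ldd, as in the sketch below. The helper name copy_with_libs and the scratch directory in the usage line are hypothetical, and ldd should only be run on binaries you trust.

```shell
# Copies a binary and the shared libraries it links against into a
# chroot directory, preserving their absolute paths.
copy_with_libs() {
  binary="$1"
  root="$2"
  mkdir -p "$root$(dirname "$binary")"
  cp "$binary" "$root$binary"
  # ldd prints lines such as "libc.so.6 => /lib/.../libc.so.6 (0x...)";
  # keep only the absolute paths.
  ldd "$binary" | grep -o '/[^ ]*' | while read -r lib; do
    mkdir -p "$root$(dirname "$lib")"
    cp -L "$lib" "$root$lib"   # -L copies the file a symlink points to
  done
}

# Usage (hypothetical scratch directory instead of /sandbox):
# copy_with_libs /bin/bash /tmp/sandbox-demo
```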

unshare for namespace isolation

# Create user, mount, PID, and network namespaces
unshare --mount --pid --net --fork --user \
  --map-root-user chroot /sandbox /bin/bash

Caveats

Restricting filesystem access may break tools. For example, uv caches packages on the local system. If you ask the agent to use uv but (unintentionally) deny access to the .cache directory, it will fail with a permission denied error.
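
One way around this particular failure is to move the cache into a directory the sandbox can already write to. uv reads its cache location from the UV_CACHE_DIR environment variable. In the sketch below, the workspace path is an assumption.

```shell
# Hypothetical workspace path; defaults to a scratch directory here.
WORKSPACE="${WORKSPACE:-${TMPDIR:-/tmp}/agent_workspace}"

# Keep uv's cache inside the writable workspace so the sandbox does
# not need access to ~/.cache.
export UV_CACHE_DIR="$WORKSPACE/.uv-cache"
mkdir -p "$UV_CACHE_DIR"
```

The same pattern applies to other tools with configurable cache or state directories: relocate them into the workspace instead of widening the sandbox.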

3. User Level Isolation

Create a dedicated unprivileged user

# Create restricted user
sudo useradd -m -s /bin/bash -G nogroup aiagent
sudo passwd -l aiagent  # Lock password

# Set up a private workspace
sudo mkdir /home/aiagent/workspace
sudo chown aiagent:aiagent /home/aiagent/workspace
sudo chmod 700 /home/aiagent/workspace

# Set disk quota (prevent disk exhaustion)
sudo setquota -u aiagent 1000000 1500000 0 0 /home  # 1GB soft, 1.5GB hard block limit

Run agent as this user:

sudo -u aiagent python agent.py

That’s it for now. Happy Coding!
