# AI coding agents

## Overview
Coding agents are AI-powered CLI tools that can read and modify files, run commands, navigate projects, and carry out multi-step development tasks, all from your terminal.
Unlike browser-based or editor-integrated assistants, CLI agents work directly in the environment where the code lives. On Sherlock, that means they can operate on files in your project directories, run tests and build commands on actual compute nodes, and integrate naturally into shell-based workflows, without needing a graphical interface or a local copy of the code.
A few practical advantages over GUI-based tools:
- they work over SSH with no display required, making them well-suited for remote HPC environments
- they can be embedded in scripts or batch jobs to automate repetitive tasks across large codebases
- they have direct access to the same filesystems and modules as your jobs, so they can read logs, inspect outputs, and act on results without extra setup
Several of them are available on Sherlock as modules and can be loaded and used directly in interactive sessions or batch jobs.
!!! warning "Most coding agents use external services"

    With the exception of tools explicitly configured to use a local model (e.g. via Ollama), most coding agents send your prompts and code to external cloud services. Consider this when working with sensitive or unpublished research data.
!!! tip "Checking current versions"

    Run `ml spider <package>` to see the versions currently available on Sherlock, or browse the Software list page.
## Available agents

### Claude Code
Claude Code is an AI coding tool from Anthropic. It understands your project, edits files, runs commands, and handles git workflows directly from the terminal.
Claude Code requires an Anthropic API key:
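The exact module name below is an assumption; check `ml spider claude` on Sherlock for the current name and versions.

```shell
# Load the module (name assumed; verify with `ml spider claude`)
$ ml claude-code

# Set your Anthropic API key, then start an interactive session
$ export ANTHROPIC_API_KEY="sk-ant-..."
$ claude
```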
For full documentation, see the Claude Code docs.
### Gemini CLI
Gemini CLI is Google's open-source terminal AI agent, powered by the Gemini model family. It offers a generous free tier.
Gemini CLI can authenticate via a Google account (browser-based login on first run) or with a GEMINI_API_KEY environment variable:
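The module name below is an assumption; check `ml spider gemini` for the exact name.

```shell
# Load the module (name assumed; verify with `ml spider gemini`)
$ ml gemini-cli

# Either log in with a Google account on first run, or set an API key
$ export GEMINI_API_KEY="..."
$ gemini
```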
### OpenAI Codex CLI
Codex is a CLI coding agent from OpenAI. It runs locally but sends code and prompts to the OpenAI API.
Codex requires an OpenAI API key:
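The module name below is an assumption; check `ml spider codex` for the exact name.

```shell
# Load the module (name assumed; verify with `ml spider codex`)
$ ml codex

# Set your OpenAI API key, then start an interactive session
$ export OPENAI_API_KEY="sk-..."
$ codex
```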
### Cursor CLI
Cursor CLI is the terminal counterpart to the Cursor editor. It requires an active Cursor account.
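A possible first-run flow; both the module name and the `cursor-agent` command name are assumptions, so verify with `ml spider cursor` and the Cursor CLI documentation.

```shell
# Module and command names assumed; verify with `ml spider cursor`
$ ml cursor-cli
$ cursor-agent login    # browser-based login to your Cursor account
$ cursor-agent
```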
### GitHub Copilot CLI
GitHub Copilot CLI brings Copilot's coding assistant to the terminal. It requires an active GitHub Copilot subscription.
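A possible first-run flow; the module name is an assumption, and authentication details may vary by version, so check `ml spider copilot` and the Copilot CLI documentation.

```shell
# Module name assumed; verify with `ml spider copilot`
$ ml copilot-cli
$ copilot    # authenticate with your GitHub account on first run
```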
### Mistral Vibe
Mistral Vibe is a minimal CLI coding agent by Mistral AI. It requires a Mistral API key.
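The module name below is an assumption; check `ml spider vibe` for the exact name.

```shell
# Load the module (name assumed; verify with `ml spider vibe`)
$ ml vibe

# Set your Mistral API key, then start an interactive session
$ export MISTRAL_API_KEY="..."
$ vibe
```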
### OpenCode
OpenCode is an open-source terminal AI coding agent with a TUI. It supports a wide range of model providers (OpenAI, Anthropic, Gemini, and many others), including local Ollama instances.
Providers and API keys are configured on first run via:
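The module name below is an assumption (verify with `ml spider opencode`); the `auth login` subcommand is OpenCode's interactive credential setup.

```shell
# Module name assumed; verify with `ml spider opencode`
$ ml opencode
$ opencode auth login
```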
### Crush
Crush is an open-source AI coding agent that supports multiple model providers, including local Ollama instances (a privacy-friendly option when combined with Sherlock's GPU nodes).
See the Crush documentation for configuration details, including how to point it to a local Ollama endpoint.
## Using a local Ollama instance
Running a coding agent against a local Ollama instance on a GPU compute node is a privacy-friendly alternative to cloud API services: your code and prompts never leave Sherlock. See the Ollama page for instructions on starting an Ollama server as a batch job. The job script writes the current endpoint to ~/.ollama_server. Read it once before launching any agent:
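A sketch, assuming the file holds a host:port pair (the example node name `sh03-12n05` is hypothetical; the actual format is set by the Ollama job script):

```shell
# Assumes ~/.ollama_server holds a host:port pair, e.g. sh03-12n05:11434
$ export OLLAMA_HOST=$(cat ~/.ollama_server)
$ export OLLAMA_BASE_URL="http://$OLLAMA_HOST/v1"   # OpenAI-compatible endpoint
```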
For tools that do not support environment variable interpolation in their configuration files (such as Codex), SSH local port forwarding gives a static localhost:11434 endpoint that never changes:
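A sketch, assuming the Ollama job runs on a compute node reachable by SSH from the login node (the node name `sh03-12n05` is hypothetical; substitute the node running your job):

```shell
# Run on the login node: forward localhost:11434 to the Ollama server's node
# (sh03-12n05 is a hypothetical node name)
$ ssh -N -f -L 11434:localhost:11434 sh03-12n05

# Agents can now use a fixed local endpoint
$ export OLLAMA_BASE_URL="http://localhost:11434/v1"
```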
### Configuring OpenCode
OpenCode can be configured with an opencode.json file in your project directory (or at ~/.config/opencode/config.json for a global default). The {env:VAR} syntax substitutes environment variables at startup:
```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama (local)",
      "options": {
        "baseURL": "{env:OLLAMA_BASE_URL}"
      },
      "models": {
        "qwen2.5-coder:7b": {
          "name": "Qwen 2.5 Coder 7B"
        }
      }
    }
  }
}
```
Replace the model ID and name with whichever model you have pulled in Ollama (see ollama list). Then select the provider on first run with:
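One way to do this from the command line (the `-m` flag and `provider/model` format are assumptions; check `opencode --help` on the installed version):

```shell
# Start OpenCode with the local Ollama provider and model selected
$ opencode -m ollama/qwen2.5-coder:7b
```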
### Configuring Crush
Crush reads its configuration from ~/.config/crush/crush.json. It supports standard shell $VAR substitution in configuration values, including base_url:
```json
{
  "$schema": "https://charm.land/crush.json",
  "providers": {
    "ollama": {
      "id": "ollama",
      "name": "Local Ollama",
      "base_url": "$OLLAMA_BASE_URL",
      "type": "openai",
      "api_key": "ollama",
      "models": [
        {
          "id": "qwen2.5-coder:7b",
          "name": "Qwen 2.5 Coder 7B"
        }
      ]
    }
  }
}
```
The api_key field is required by the configuration format but not checked by Ollama, so any non-empty string works.
### Other agents
Codex does not support environment variable interpolation in its configuration file, so the cleanest approach is to use SSH local port forwarding (see above) and keep a static localhost:11434 endpoint in ~/.codex/config.json:
```json
{
  "provider": "ollama",
  "providers": {
    "ollama": {
      "name": "Ollama",
      "baseURL": "http://localhost:11434/v1",
      "envKey": "OLLAMA_API_KEY"
    }
  }
}
```
The envKey field tells Codex which environment variable to read the API key from. Ollama does not check it, but the variable must be set and non-empty:
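For example:

```shell
# Ollama ignores the key, but Codex requires the variable to be set
$ export OLLAMA_API_KEY=ollama
$ codex
```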
Claude Code uses the Ollama Anthropic-compatible endpoint, which does not have the /v1 suffix. Derive it from OLLAMA_BASE_URL:
```shell
$ ANTHROPIC_BASE_URL=http://$OLLAMA_HOST \ # (1)!
  ANTHROPIC_AUTH_TOKEN=ollama \ # (2)!
  ANTHROPIC_API_KEY="" \ # (3)!
  claude --model <model-name>
```

1. Note: no `/v1` suffix on the Anthropic-compatible endpoint.
2. Ollama accepts any non-empty token value here.
3. Overrides any real `ANTHROPIC_API_KEY` already set in your environment, so Claude Code does not attempt to reach the Anthropic API.
!!! warning "Context window size"

    Claude Code requires a large context window (64k tokens or more). Ollama defaults to 2k, so you will likely need to increase num_ctx before use. See Increasing context window size on the Ollama page for instructions.
Gemini CLI and Mistral Vibe are tied to their respective cloud API services and do not support local model endpoints.
## Tips and tricks

### Use agents in scripts
Most agents support a non-interactive, single-prompt mode that works well in batch jobs or shell scripts. Pass your prompt directly on the command line instead of starting an interactive session:
```shell
# Claude Code
$ claude -p "review this Python script for numerical stability issues"

# Gemini CLI
$ gemini -p "suggest SLURM options to optimize this job for memory use"

# Codex
$ codex -q "refactor this MPI initialization code to handle edge cases"

# Mistral Vibe
$ vibe --prompt "explain what this CUDA kernel does and how to profile it"
```
This is particularly useful for automating repetitive tasks, such as post-processing job outputs or generating summaries of results. Most agents also accept input via stdin, so you can pipe data directly:
```shell
$ cat slurm-${SLURM_JOB_ID}.out | claude -p "the simulation failed, explain the error and suggest a fix"
$ cat results.csv | gemini -p "summarize the key trends in these simulation results"
```
### Resume sessions across SSH connections
Since HPC work often spans multiple login sessions, it helps to be able to pick up where you left off. Claude Code saves session history automatically and lets you resume from the command line:
- resume the most recent session
- open an interactive session picker
- resume a named session
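These map to the following commands (flag behavior may vary across versions; confirm with `claude --help`):

```shell
$ claude --continue                 # resume the most recent session
$ claude --resume                   # open an interactive session picker
$ claude --resume <session-name>    # resume a named session
```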
Name a session early with /rename so it's easy to find later. Sessions are stored per project directory, so they carry over across SSH connections as long as you work in the same directory.
### Read before you write
Before making any changes to an unfamiliar project, consider starting in a read-only or planning mode. Claude Code calls this Plan Mode, which can be started with:
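For instance (check `claude --help` for the flag on the installed version):

```shell
$ claude --permission-mode plan
```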
In Plan Mode, Claude only reads files and asks questions; it makes no edits until you approve a plan. You can also switch into it during a session with Shift+Tab. This is a good habit when working on production code or shared group repositories.
### Control output format for scripting
When integrating an agent into an existing pipeline, check whether it supports structured output. Several agents have flags for this:
```shell
# Claude Code: plain text response only
$ claude -p "check this script for issues" --output-format text < my_script.sh

# Claude Code: full JSON conversation log with cost and timing metadata
$ claude -p "analyze this file" --output-format json < results.py

# Gemini CLI
$ gemini -p "check this script for issues" --output-format json < my_script.sh

# Mistral Vibe
$ vibe --prompt "check this script for issues" --output json < my_script.sh

# Codex
$ codex -q --json "check this script for issues" < my_script.sh
```
This makes it straightforward to capture and process the agent's output in a shell script or downstream tool.
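For example, assuming Claude Code's JSON output includes a top-level result field (check the structure on the installed version), jq can extract just the answer:

```shell
# Extract only the model's answer from the JSON conversation log
# (the .result field is an assumption; inspect the output to confirm)
$ claude -p "analyze this file" --output-format json < results.py | jq -r '.result'
```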