Build Your First AI Agent with Ollama and a Tool-Calling Loop
A practical guide to building a working AI agent from scratch: a reasoning loop, tool definitions, and real tool execution — all running locally.
An AI agent isn’t magic. It’s a loop. The model reasons about a goal, picks a tool, runs it, sees the result, and repeats until it’s done. Once you see that structure clearly, you can build one yourself in a few hundred lines of Python.
This guide builds a minimal but real agent: it can read files, write files, and run shell commands — all coordinated by a local Qwen 3 model running through Ollama. (A web search tool is an easy extension; see the end of the guide.)
The agent loop
Before touching code, understand the structure:
Goal → Reason → Pick tool → Execute tool → Observe result → Reason → ...
The model doesn’t run code. It outputs a decision — “call this tool with these arguments.” Your Python code executes the tool, captures the output, and feeds it back to the model as context. The model then decides what to do next.
This loop continues until the model outputs a final answer instead of a tool call.
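Before wiring in a real model, the shape of the loop can be sketched with a stub. Everything here (`fake_model`, the `add` tool, `REGISTRY`) is a stand-in for illustration, not part of the Ollama API; the real agent later in this guide replaces `fake_model` with `ollama.chat`:

```python
# Toy version of the agent loop with a stubbed model.
# fake_model stands in for ollama.chat so the structure is visible.

def fake_model(messages):
    # Pretend the model requests one tool call, then gives a final answer.
    if not any(m["role"] == "tool" for m in messages):
        return {"role": "assistant", "tool_calls": [
            {"function": {"name": "add", "arguments": {"a": 2, "b": 3}}}]}
    return {"role": "assistant", "content": "The sum is 5."}

def add(a, b):
    return str(a + b)

REGISTRY = {"add": add}

def tiny_agent(goal):
    messages = [{"role": "user", "content": goal}]
    while True:
        msg = fake_model(messages)
        messages.append(msg)
        if not msg.get("tool_calls"):          # final answer -> stop
            return msg["content"]
        for call in msg["tool_calls"]:         # execute and feed back
            fn = REGISTRY[call["function"]["name"]]
            result = fn(**call["function"]["arguments"])
            messages.append({"role": "tool", "content": result})

print(tiny_agent("What is 2 + 3?"))  # → The sum is 5.
```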
Setting up
You need Python 3.11+ and Ollama running with a tool-calling model pulled. This guide uses qwen3:8b (the Qwen 3 lineup has no 7B tag; pick the size that fits your hardware).

```shell
ollama pull qwen3:8b
pip install ollama
```
That’s the only dependency for the basic agent. The ollama Python library wraps the HTTP API cleanly.
Defining tools
Tools are just Python functions. We describe them to the model using a schema — name, description, and parameter types. The model never sees the function itself, only this schema.
```python
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "read_file",
            "description": "Read the contents of a file at the given path.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {
                        "type": "string",
                        "description": "Absolute or relative path to the file."
                    }
                },
                "required": ["path"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "run_shell",
            "description": "Execute a shell command and return the output. Use for listing directories, checking file sizes, running scripts.",
            "parameters": {
                "type": "object",
                "properties": {
                    "command": {
                        "type": "string",
                        "description": "The shell command to run."
                    }
                },
                "required": ["command"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "write_file",
            "description": "Write content to a file, creating it if it doesn't exist.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string"},
                    "content": {"type": "string"}
                },
                "required": ["path", "content"]
            }
        }
    }
]
```
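The schema is also what makes a tool call checkable before you run it. As an illustration, a small helper (my own sketch, not something the ollama library provides) can compare a call's arguments against one of these definitions:

```python
def validate_args(tool_schema: dict, args: dict) -> list[str]:
    """Return a list of problems with a tool call's arguments:
    missing required keys and unexpected keys. Illustrative only."""
    params = tool_schema["function"]["parameters"]
    problems = []
    for key in params.get("required", []):
        if key not in args:
            problems.append(f"missing required argument: {key}")
    for key in args:
        if key not in params["properties"]:
            problems.append(f"unexpected argument: {key}")
    return problems

# The read_file definition from above, trimmed to what the check needs.
read_file_schema = {
    "type": "function",
    "function": {
        "name": "read_file",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}

print(validate_args(read_file_schema, {"path": "notes.txt"}))  # → []
print(validate_args(read_file_schema, {}))  # → ['missing required argument: path']
```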
Implementing the tools
The actual Python functions are straightforward. Keep them simple — the agent handles the logic, the tools just do I/O.
```python
import subprocess

def read_file(path: str) -> str:
    try:
        with open(path, "r") as f:
            return f.read()
    except Exception as e:
        return f"Error: {e}"

def run_shell(command: str) -> str:
    try:
        result = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=30
        )
    except subprocess.TimeoutExpired:
        return "Error: command timed out after 30 seconds"
    output = result.stdout or result.stderr or "(no output)"
    return output[:4000]  # cap at 4000 chars to avoid context overflow

def write_file(path: str, content: str) -> str:
    try:
        with open(path, "w") as f:
            f.write(content)
        return f"Written {len(content)} bytes to {path}"
    except Exception as e:
        return f"Error: {e}"

TOOL_REGISTRY = {
    "read_file": read_file,
    "run_shell": run_shell,
    "write_file": write_file,
}
```

Note that `subprocess.run` raises `TimeoutExpired` rather than returning, so the timeout needs its own handler — otherwise a hung command crashes the whole agent.
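The registry makes dispatch uniform: look up the function by name and call it with keyword arguments, which is exactly what the agent loop does next. A quick round-trip check (the tool definitions are repeated here so this snippet runs on its own, and a temp directory keeps it from touching your project):

```python
import os
import tempfile

def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

def write_file(path: str, content: str) -> str:
    with open(path, "w") as f:
        f.write(content)
    return f"Written {len(content)} bytes to {path}"

# Same shape as the registry above, repeated so this runs standalone.
TOOL_REGISTRY = {"read_file": read_file, "write_file": write_file}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "hello.txt")
    # Dispatch exactly as the agent loop will: by name, with kwargs.
    print(TOOL_REGISTRY["write_file"](path=path, content="hello agent"))
    print(TOOL_REGISTRY["read_file"](path=path))  # → hello agent
```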
Implementing the agent loop
Now the core loop. It maintains a message history, calls the model, checks if it wants to use a tool, and either executes the tool or returns the final answer.
```python
import json
import ollama

def run_agent(goal: str, model: str = "qwen3:8b", max_steps: int = 10) -> str:
    messages = [
        {
            "role": "system",
            "content": (
                "You are an autonomous agent. Use the provided tools to achieve the user's goal. "
                "Think step by step. When you have completed the goal, provide a final answer."
            ),
        },
        {"role": "user", "content": goal},
    ]

    for step in range(max_steps):
        print(f"\n[Step {step + 1}]")
        response = ollama.chat(model=model, messages=messages, tools=TOOLS)
        message = response["message"]
        messages.append(message)

        # No tool call — model produced a final answer
        if not message.get("tool_calls"):
            print(f"Final answer: {message['content']}")
            return message["content"]

        # Execute each tool call
        for call in message["tool_calls"]:
            fn_name = call["function"]["name"]
            fn_args = call["function"]["arguments"]
            print(f"  → {fn_name}({json.dumps(fn_args)})")

            if fn_name not in TOOL_REGISTRY:
                result = f"Error: unknown tool '{fn_name}'"
            else:
                try:
                    result = TOOL_REGISTRY[fn_name](**fn_args)
                except Exception as e:
                    result = f"Error: {e}"
            print(f"  ← {result[:200]}{'...' if len(result) > 200 else ''}")

            # Feed the result back to the model
            messages.append({"role": "tool", "content": result})

    return "Max steps reached without a final answer."
```

Wrapping the tool call in try/except matters: the model occasionally invents argument names, and a bad call should come back to it as an error message it can correct, not crash the loop.
Running the agent
```python
if __name__ == "__main__":
    run_agent(
        "List all Python files in the current directory, "
        "then count the total number of lines across them."
    )
```
Run it and watch the agent reason through the task, calling tools as needed:
```
[Step 1]
  → run_shell({"command": "find . -name '*.py' -type f"})
  ← ./agent.py\n./tools.py\n

[Step 2]
  → run_shell({"command": "wc -l ./agent.py ./tools.py"})
  ←   87 ./agent.py\n  43 ./tools.py\n 130 total\n

Final answer: There are 2 Python files totaling 130 lines.
```
Adding a memory tool
The basic agent is stateless — it forgets everything between runs. For workflows that need persistence, add a simple key-value store as a tool:
```python
import json
import pathlib

MEMORY_FILE = pathlib.Path("agent_memory.json")

def _load_memory() -> dict:
    return json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else {}

def remember(key: str, value: str) -> str:
    mem = _load_memory()
    mem[key] = value
    MEMORY_FILE.write_text(json.dumps(mem, indent=2))
    return f"Stored: {key}"

def recall(key: str) -> str:
    return _load_memory().get(key, "Not found")
```
Add these to TOOLS and TOOL_REGISTRY. The agent will now store and retrieve information across runs.
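For completeness, the matching schema entries could look like the sketch below (the descriptions are my own wording; adjust to taste):

```python
# Schema entries for the memory tools; append to TOOLS and TOOL_REGISTRY.
MEMORY_TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "remember",
            "description": "Store a value under a key in persistent memory.",
            "parameters": {
                "type": "object",
                "properties": {
                    "key": {"type": "string"},
                    "value": {"type": "string"},
                },
                "required": ["key", "value"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "recall",
            "description": "Retrieve a previously stored value by key.",
            "parameters": {
                "type": "object",
                "properties": {"key": {"type": "string"}},
                "required": ["key"],
            },
        },
    },
]

# TOOLS.extend(MEMORY_TOOLS)
# TOOL_REGISTRY.update({"remember": remember, "recall": recall})
```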
Where to go from here
This is a minimal but functional agent. Real-world extensions:
- Web search tool — wrap a search API or use `requests` to scrape a results page
- Email/Slack tool — connect to your notification system
- N8N trigger — call a webhook to kick off a longer workflow
- Streaming output — use `stream=True` in the Ollama call to show tokens as they generate
The architecture scales. Add more tools, and the agent gains new capabilities. Improve the system prompt, and the reasoning quality goes up. Everything runs locally.
Written by
Human editor behind Pipeline Monk. Building AI-powered workflows, reviewing pipeline output, and writing guides from hands-on experience.