How AI Agents Take Action — Not Just Talk
Lecture, Practice, and Discussion for Week 9
Function Calling — Giving AI the Power to Act
Example: 23847 × 9281 = ? (an LLM alone is unreliable at this, but it can ask your code)
get_weather(city) function
search_papers(query) function
calculate(expr) function
A simple but powerful API pattern
1. You define functions in JSON:
{ name: "search_papers", description: "...", parameters: {...} }
2. You send the user message + function definitions to the LLM
3. The LLM responds with EITHER:
(a) A normal text answer, OR
(b) "I want to call search_papers with query='deep learning'"
4. If (b), YOUR code runs the real function and gets the result
5. You send the result back to the LLM → it forms the final answer
This is what you send to the LLM
tool = {
    "type": "function",
    "function": {
        "name": "search_papers",
        "description": "Search the local paper collection by keyword.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Keyword or topic to search for"
                },
                "max_results": {
                    "type": "integer",
                    "description": "How many papers to return (default 5)"
                }
            },
            "required": ["query"]
        }
    }
}
This is the new prompt engineering
Bad:  "description": "search papers"
Good: "description": "Search the user's local paper collection (extracted via Week 6 metadata). Use this when the user asks about specific papers or trends in THEIR collection. Do NOT use for general knowledge questions."

One question may need several tools
search_papers(query="GNN", max_results=3) → gets 3 papers
get_paper_details(id=1) → gets full text of paper 1
get_paper_details(id=2) → gets full text of paper 2
get_paper_details(id=3) → gets full text of paper 3

This is the heart of every agent system
messages = [{"role": "user", "content": user_question}]

for step in range(MAX_STEPS):
    resp = client.chat.completions.create(
        model=model,
        messages=messages,
        tools=tool_definitions,
    )
    msg = resp.choices[0].message
    messages.append(msg)

    if not msg.tool_calls:
        # LLM gave a final answer — done
        return msg.content

    # LLM wants to call tools — run them
    for tc in msg.tool_calls:
        result = run_tool(tc.function.name, tc.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": tc.id,
            "content": str(result),
        })
MAX_STEPS prevents runaway loops: set MAX_STEPS = 10 and bail out gracefully

Build a Research Agent with 3 Tools
search_papers(query, max_results) — keyword search in metadata
get_paper_details(paper_id) — full info for one paper
count_papers_by_year(year) — simple stats query

Files to create:
tools.py — the 3 tool functions + their JSON schemas
agent.py — the agentic loop
app.py — add Tab 7 (Agent Chat)

(tools.py) Three real Python functions the agent can call
# tools.py
import json
from pdf_to_md import load_all_metadata

MD_DIR = "md_output"

def search_papers(query: str, max_results: int = 5) -> str:
    """Keyword search in titles and abstracts."""
    papers = load_all_metadata(MD_DIR)
    q = query.lower()
    hits = []
    for i, p in enumerate(papers):
        text = (p.get("title", "") + " " + p.get("abstract", "")).lower()
        if q in text:
            hits.append({"id": i, "title": p.get("title", "?")})
            if len(hits) >= max_results:
                break
    return json.dumps(hits)

def get_paper_details(paper_id: int) -> str:
    """Full metadata for a single paper by id."""
    papers = load_all_metadata(MD_DIR)
    if 0 <= paper_id < len(papers):
        return json.dumps(papers[paper_id])
    return json.dumps({"error": f"paper_id {paper_id} not found"})

def count_papers_by_year(year: int) -> str:
    """Count papers published in a given year."""
    papers = load_all_metadata(MD_DIR)
    count = sum(1 for p in papers if str(p.get("year", "")) == str(year))
    return json.dumps({"year": year, "count": count})
(tools.py) How the LLM sees each tool
TOOL_SCHEMAS = [
    {
        "type": "function",
        "function": {
            "name": "search_papers",
            "description": (
                "Search the user's local paper collection by keyword. "
                "Use when the user asks about a specific topic in THEIR papers."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search keyword"},
                    "max_results": {"type": "integer", "description": "Default 5"}
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_paper_details",
            "description": "Get full metadata for one paper by its id (from search).",
            "parameters": {
                "type": "object",
                "properties": {"paper_id": {"type": "integer"}},
                "required": ["paper_id"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "count_papers_by_year",
            "description": "Count how many papers in the collection were published in a given year.",
            "parameters": {
                "type": "object",
                "properties": {"year": {"type": "integer"}},
                "required": ["year"]
            }
        }
    }
]

TOOL_FUNCS = {
    "search_papers": search_papers,
    "get_paper_details": get_paper_details,
    "count_papers_by_year": count_papers_by_year,
}
(agent.py) The heart of any function-calling agent
# agent.py
import json
from tools import TOOL_SCHEMAS, TOOL_FUNCS

MAX_STEPS = 8

def run_agent(client, model, user_question, on_step=None):
    """Run the agentic loop. on_step(event) is called for each tool call."""
    messages = [
        {"role": "system", "content":
            "You are a research assistant with access to the user's paper collection. "
            "Use tools to answer questions about their papers. "
            "When unsure which paper, use search_papers first."},
        {"role": "user", "content": user_question},
    ]
    for step in range(MAX_STEPS):
        resp = client.chat.completions.create(
            model=model, messages=messages, tools=TOOL_SCHEMAS,
        )
        msg = resp.choices[0].message
        messages.append(msg)
        if not msg.tool_calls:
            return msg.content  # final answer
        for tc in msg.tool_calls:
            name = tc.function.name
            args = json.loads(tc.function.arguments)
            if on_step:
                on_step({"type": "call", "name": name, "args": args})
            try:
                result = TOOL_FUNCS[name](**args)
            except Exception as e:
                result = json.dumps({"error": str(e)})
            if on_step:
                on_step({"type": "result", "name": name, "result": result})
            messages.append({
                "role": "tool",
                "tool_call_id": tc.id,
                "content": result,
            })
    return "Stopped — too many steps."
(app.py Tab 7) Show every tool call in real time
# app.py — Tab 7 (Agent Chat)
# (st, client, model, and tab7 are defined earlier in app.py)
from agent import run_agent

with tab7:
    st.header("🔧 Research Agent (Function Calling)")
    question = st.text_input(
        "Ask about your papers",
        placeholder="e.g., How many GNN papers do I have from 2024?"
    )
    if st.button("▶️ Run Agent", disabled=not question):
        events = []

        def on_step(event):
            events.append(event)

        with st.spinner("Agent thinking..."):
            answer = run_agent(client, model, question, on_step=on_step)

        # Show every tool call (transparency)
        st.subheader("🔍 Tool Calls")
        for e in events:
            if e["type"] == "call":
                st.markdown(f"📞 **{e['name']}**(`{e['args']}`)")
            else:
                st.markdown(f"📦 result: `{e['result'][:200]}...`")

        st.subheader("💬 Final Answer")
        st.markdown(answer)
See the agent pick different tools
Question 1: "How many papers from 2024 do I have?"
→ Agent calls count_papers_by_year(2024) → answers
Question 2: "Find papers about graph neural networks"
→ Agent calls search_papers("graph neural networks") → lists them
Question 3: "Tell me about the first GNN paper in detail"
→ Agent calls search_papers("GNN") → then get_paper_details(id=X)
→ synthesizes a summary
Question 4: "Compare 2023 vs 2024 paper counts"
→ Agent calls count_papers_by_year(2023)
→ then count_papers_by_year(2024)
→ presents both numbers
- [ ] Create tools.py with the 3 functions and their schemas
- [ ] Create agent.py with the run_agent() loop
- [ ] Add Tab 7 to app.py showing tool calls in real time
- [ ] Test with the 4 example questions — does the agent pick the right tools?
- [ ] Try a question the agent should NOT use a tool for (e.g., "What is deep learning?") — does it answer directly?
- [ ] Bonus: edit one tool's description to be vague, see how behavior changes
- [ ] Bonus: add a 4th tool of your choice (e.g., list_authors())
Authorship & Accountability — Who Owns the AI's Output?
14 responses analyzed — a near-universal consensus, with one twist
Three management philosophies emerge
How students framed the AI's role
Why this isn't an abstract debate
Function calling makes accountability MECHANICAL, not just philosophical
Apply the lessons in pairs (10 min)
description field with explicit limits built in

1. Several classmates (Huy, Namcheol, Minh) reframed the AI accountability debate using metaphors — "velocity vs vector", "high-power instrument", "power drill". Which metaphor best captures how YOU work with AI in your research? Or propose a better one.
2. In today's practice, you saw every tool call the agent made (transparency). Does this kind of visibility change your sense of responsibility for the AI's output? Would you trust an AI more or less if you couldn't see its tool calls?
Function Calling Documentation
📚 OpenAI: Function Calling Guide
📚 Anthropic: Tool Use Overview
📚 Anthropic: Building Effective Agents
Agent Frameworks (built on function calling)
📚 LangGraph — Stateful Agent Workflows
📚 OpenAI Agents SDK
Anthropic Free Online Courses
Next week: What happens when tool calls fail — error handling and recovery in agent loops.