Practice 1.3 — Community Tools & Real Data

Unit 1, Lesson 3

This practice goes further than what we did in class. You will evaluate community tools, build a local data tool from scratch, deliberately break agent behavior with bad descriptions, and then fix it.


You will practice:

- Using and evaluating community tools
- Building a local file tool with a new dataset
- Understanding what happens when tool descriptions overlap or conflict
- Iterating on tool descriptions to fix agent behavior
- Combining multiple tools in one agent


# YOUR CODE HERE → replace with your own code
# YOUR ANSWER HERE → write your answer as a comment or string
Run every cell in order

🔑 Setup

%pip install -q -U langchain langchain-google-genai langgraph langchain-core langchain-community duckduckgo-search wikipedia
import os
from google.colab import userdata
from langchain.chat_models import init_chat_model
from langgraph.prebuilt import create_react_agent
from langchain_core.tools import tool

os.environ['GOOGLE_API_KEY'] = userdata.get('GOOGLE_API_KEY')

model = init_chat_model(
    model="google_genai:gemini-2.5-flash",
    temperature=0
)

print("Ready!")

Part 1 — Evaluate Community Tools

Before using a tool in a real project, you should understand what it does, what it costs, and what could go wrong. Evaluate each tool below.

Tool 2: Wikipedia

from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

wikipedia = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())

print("=== Wikipedia 1 ===")
print(wikipedia.invoke("Large language model"))

print("\n=== Wikipedia 2 ===")
print(wikipedia.invoke("API computer science"))
wikipedia_evaluation = """
# Compare DuckDuckGo and Wikipedia:
# 1. What type of questions is Wikipedia clearly better for?
# 2. What type of questions is DuckDuckGo clearly better for?
# 3. If you had both tools in an agent, how would you write the descriptions
#    so the agent picks the right one every time?
# YOUR ANSWER HERE
"""
print(wikipedia_evaluation)

Part 2 — Local File Tool with New Data

In class you used a sales CSV. Now you will build a tool for a different dataset — student grades — and wire it into an agent.

Step 1 — Create the data file

# Create a student grades CSV file
grades_data = """student,assignment,score,max_score,submitted
Alice,Assignment 1,88,100,True
Alice,Assignment 2,92,100,True
Alice,Assignment 3,75,100,True
Bob,Assignment 1,70,100,True
Bob,Assignment 2,65,100,True
Bob,Assignment 3,80,100,True
Carol,Assignment 1,95,100,True
Carol,Assignment 2,98,100,True
Carol,Assignment 3,91,100,True
David,Assignment 1,55,100,True
David,Assignment 2,60,100,True
David,Assignment 3,0,100,False
Eva,Assignment 1,82,100,True
Eva,Assignment 2,79,100,True
Eva,Assignment 3,88,100,True
"""

with open("grades.csv", "w") as f:
    f.write(grades_data)

print("grades.csv created!")

Step 2 — Build the tool

Build a tool that reads grades.csv and returns the data in a readable format. Your tool must:

- Have a clear docstring that specifies when to use it and when NOT to
- Handle the case where the file is not found
- Return a readable string (not a raw Python object)

import csv

# YOUR TOOL HERE
@tool
def read_grades_data(query: str) -> str:
    """YOUR DOCSTRING HERE — @tool requires a docstring, and the agent reads it to decide when to call this tool."""
    # YOUR CODE HERE
    pass

# Test it directly before using in an agent
print(read_grades_data.invoke({"query": "show all grades"}))

Step 3 — Wire into an agent and test

grades_agent = create_react_agent(
    model=model,
    tools=[read_grades_data],
    prompt="""You are a teaching assistant who helps analyze student performance.
Use your tool to access the grades data and answer questions accurately.
Always base your answers on the actual data, not assumptions."""
)

# Test with these questions
questions = [
    "Who has the highest average score?",
    "Which student has a missing submission?",
    "What is the class average for Assignment 2?",
    "Which students are struggling (average below 75)?"
]

for q in questions:
    response = grades_agent.invoke({"messages": [{"role": "user", "content": q}]})
    print(f"Q: {q}")
    print(f"A: {response['messages'][-1].content}")
    print()

Part 3 — The Description Pressure Test

Now give the agent BOTH tools — web search and grades data. The agent must pick the right one for each question.

First, predict which tool the agent will call for each question below. Then run it and see if you were right.

predictions = """
# Before running the next cell, predict which tool the agent will use for each question:
# 1. 'Who has the highest grade in the class?' → Tool: ___
# 2. 'What is the best programming language for beginners?' → Tool: ___
# 3. 'How is David doing in the course?' → Tool: ___
# 4. 'What are current trends in AI education?' → Tool: ___
# 5. 'What does a grade of 88 mean?' → Tool: ___ (trick question — think carefully)
# YOUR PREDICTIONS HERE
"""
print(predictions)
from langchain_community.tools import DuckDuckGoSearchRun

search_tool = DuckDuckGoSearchRun()

dual_agent = create_react_agent(
    model=model,
    tools=[search_tool, read_grades_data],
    prompt="""You are an educational assistant with two tools:
1. A web search tool for general information and current events
2. A grades tool for information about this specific class's student performance

Choose the most appropriate tool for each question."""
)

test_questions = [
    "Who has the highest grade in the class?",
    "What is the best programming language for beginners?",
    "How is David doing in the course?",
    "What are current trends in AI education?",
    "What does a grade of 88 mean?"
]

for q in test_questions:
    response = dual_agent.invoke({"messages": [{"role": "user", "content": q}]})
    print(f"Q: {q}")
    print(f"A: {response['messages'][-1].content}")
    print()
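Rather than guessing from the answer text which tool ran, you can inspect the intermediate messages the agent returns. This sketch assumes the LangGraph message format, where an `AIMessage` that triggers a tool carries a `tool_calls` list of dicts with a `"name"` key:

```python
def tools_used(response: dict) -> list[str]:
    """Return the names of the tools the agent called, in order."""
    return [
        call["name"]
        for msg in response["messages"]
        for call in (getattr(msg, "tool_calls", None) or [])
    ]

# e.g. tools_used(dual_agent.invoke({"messages": [{"role": "user", "content": q}]}))
```

This makes scoring your predictions exact instead of a judgment call.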
prediction_review = """
# How many did you predict correctly?
# Which question surprised you most? Why did the agent pick that tool?
# What does this tell you about how the agent makes decisions?
# YOUR ANSWER HERE
"""
print(prediction_review)

Part 4 — Deliberately Break It, Then Fix It

Step 1 — Make bad descriptions on purpose

# These tools have intentionally vague descriptions
# Do NOT fix them yet — just run this and observe

@tool
def bad_search(query: str) -> str:
    """Gets information from the internet."""
    search = DuckDuckGoSearchRun()
    return search.invoke(query)

@tool
def bad_grades(query: str) -> str:
    """Gets information about students."""
    with open("grades.csv", "r") as f:
        return f.read()

bad_agent = create_react_agent(
    model=model,
    tools=[bad_search, bad_grades],
    prompt="You are a helpful assistant."
)

# Run these 3 times each and record which tool gets picked
ambiguous_questions = [
    "Tell me about Alice.",
    "What is the average score?",
    "How are students doing?"
]

for q in ambiguous_questions:
    response = bad_agent.invoke({"messages": [{"role": "user", "content": q}]})
    print(f"Q: {q}")
    print(f"A: {response['messages'][-1].content}")
    print()
bad_description_observations = """
# What happened with the bad descriptions?
# Did the agent consistently pick the right tool?
# Did it ever use the wrong tool? Which question caused the most confusion?
# YOUR ANSWER HERE
"""
print(bad_description_observations)

Step 2 — Fix the descriptions

Rewrite both tool descriptions to make the agent reliably pick the right tool for every question above.

# Rewrite both tools with strong descriptions
@tool
def good_search(query: str) -> str:
    """YOUR IMPROVED DOCSTRING HERE"""
    search = DuckDuckGoSearchRun()
    return search.invoke(query)

@tool
def good_grades(query: str) -> str:
    """YOUR IMPROVED DOCSTRING HERE"""
    with open("grades.csv", "r") as f:
        return f.read()

good_agent = create_react_agent(
    model=model,
    tools=[good_search, good_grades],
    prompt="You are a helpful educational assistant."
)

# Run the same ambiguous questions and see if the agent does better
for q in ambiguous_questions:
    response = good_agent.invoke({"messages": [{"role": "user", "content": q}]})
    print(f"Q: {q}")
    print(f"A: {response['messages'][-1].content}")
    print()
fix_reflection = """
# What specific changes did you make to the descriptions?
# Did the improved descriptions fix the problem completely?
# Was there any question the agent still got wrong? What would fix it?
# YOUR ANSWER HERE
"""
print(fix_reflection)

Part 5 — Stretch Challenge 🔥

This is optional.

The challenge: Build an agent with THREE tools:

1. Web search (DuckDuckGo)
2. Grades data (your tool from Part 2)
3. A brand new tool you build yourself — it should do something useful for a teaching assistant, like calculating a letter grade from a percentage, or checking if a student is passing

Then ask the agent a question that requires it to use two tools in sequence to answer — for example, look up the grading standard for a course online and then check which students meet that standard based on the local data.

Document what happened — did it work? What did the agent do step by step?
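As a starting point for the third tool, a letter-grade converter is small enough to sketch. The cutoffs below are an assumption; adjust them to whatever scale you choose, then wrap the function with `@tool` and a clear docstring before giving it to the agent:

```python
def letter_grade(percentage: float) -> str:
    """Convert a numeric percentage (0-100) to a letter grade."""
    # Assumed cutoffs — swap in your course's actual grading scale.
    for cutoff, letter in [(90, "A"), (80, "B"), (70, "C"), (60, "D")]:
        if percentage >= cutoff:
            return letter
    return "F"

print(letter_grade(85))  # B
```

A pure function like this is also easy to test directly, which is worth doing before you hand it to an agent.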

# YOUR CODE HERE
stretch_reflection = """
# What happened when the agent had to use two tools in sequence?
# Did it handle it correctly?
# What surprised you?
# YOUR ANSWER HERE
"""
print(stretch_reflection)