The landscape of enterprise AI is shifting rapidly from simple "Generative AI" chatbots - which passively answer questions - to Agentic AI: intelligent systems capable of reasoning, planning, and actively using tools to solve complex problems. On the Databricks Data Intelligence Platform, this evolution is powered by the Mosaic AI Agent Framework.
Unlike a standard RAG application that merely retrieves and summarizes text, an AI Agent acts as a reasoning engine. It can break down a high-level goal, critique its own search results, invoke Python or SQL functions as tools, and iterate until it finds a verified answer. To accelerate this journey even further, Databricks introduces Agent Bricks: a higher-level abstraction that simplifies building, evaluating, and deploying agentic systems. Agent Bricks provides reusable agent blueprints, opinionated best practices, and built-in observability, allowing teams to focus on business logic rather than low-level orchestration. Instead of stitching together prompts, tools, and evaluation logic manually, Agent Bricks offers a structured way to compose agents that reason, call tools, and evolve - while remaining fully integrated with Databricks’ governance, security, and MLOps capabilities.
In this hands-on walkthrough, we will move beyond basic text generation to build a Consultant Matchmaker Agent. We will leverage Databricks' unified ecosystem - using Unity Catalog for governance, well-established foundation LLMs, and Model Serving for deployment - to create an agent that ingests consultant resumes (PDFs) and intelligently matches them against complex project requirements. By the end of this guide, you will have a deployed agent that doesn't just read data, but understands the nuance of skills and experience to provide data-backed hiring recommendations.
Disclaimer: The consultant profiles used in this blog are fictional.
We start by ingesting the consultant profiles into a Databricks Volume. Volumes are ideal for storing unstructured files in object storage, with Unity Catalog offering the same governance capabilities as for tabular data. We can simply upload our documents to the volume from the UI, or create an ingestion pipeline that regularly updates the profiles. Next, the documents need to be parsed in order to extract textual information such as biographical facts, experience, projects and skill sets. Agent Bricks offers a no-code text extraction mechanism where we just define the volume containing the documents, a destination table in UC and a SQL warehouse for compute.
Agent Bricks then generates and executes a job based on a parameterised SQL script, which is also available for inspection in the SQL Editor.
After the extraction job runs successfully, the resulting table will have the following schema:
We are mostly interested in the ‘text’ field which contains the raw text of each PDF. This is the information that will be compared against the project description.
Before we start building the AI agent, we need to define a SQL function that computes a similarity score between a project description and each consultant’s profile.
The magic happens with the ‘AI_SIMILARITY()’ utility, which invokes a state-of-the-art generative AI model from the Databricks Foundation Model APIs to compare two strings and compute a semantic similarity score.
Let’s test this function with a simple project description:
The consultants are ranked based on the similarity of their profiles with the project requirements and the five most suitable are returned.
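‘AI_SIMILARITY()’ runs server-side on a Databricks SQL warehouse, but the ranking logic it powers is easy to illustrate locally. The following sketch is hypothetical: it uses Python’s `difflib.SequenceMatcher` as a crude lexical stand-in for the model-based semantic score, and the profile texts and file names are invented for the example.

```python
from difflib import SequenceMatcher

def rank_consultants(profiles: dict[str, str],
                     project_description: str,
                     top_n: int = 5) -> list[tuple[str, float]]:
    """Rank consultant profiles against a project description.

    SequenceMatcher is only a lexical stand-in for the semantic
    score AI_SIMILARITY() would compute on the platform.
    """
    scored = [
        (name, SequenceMatcher(None, text.lower(),
                               project_description.lower()).ratio())
        for name, text in profiles.items()
    ]
    # Highest similarity first, keep the top N matches
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]

# Invented sample data for illustration only
profiles = {
    "consultant_a.pdf": "Senior SAP BW/4HANA consultant, Datasphere migrations, SAC reporting",
    "consultant_b.pdf": "Databricks engineer, Spark, Delta Lake, ELT pipelines",
    "consultant_c.pdf": "Frontend developer, React and TypeScript",
}
matches = rank_consultants(profiles, "SAP Datasphere and BW/4HANA migration project")
```

On the platform itself, the same shape is achieved by the UC SQL function ordering on the ‘AI_SIMILARITY()’ score and limiting to the top five rows.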
Building the agent is a fairly straightforward task - Agent Bricks takes care of the boilerplate by generating a notebook that defines the skeleton of the agent, enables experiment tracking through MLflow, and sets up serving via an endpoint. For an AI agent to accomplish tasks successfully, two things are required: a detailed system prompt that acts as the entry point for the underlying LLM, and a set of tools that can interact with external sources and systems. Let’s create a prompt:
SYSTEM_PROMPT = """
You are a consultant recommendation assistant for NextLytics AG. Your role is to help match project requirements with the most suitable consultants from our team.
When a user describes a project or asks for consultant recommendations:
1. Use the find_matching_consultants tool to search for consultants based on the project description
2. Analyze the results and present the top recommendations
3. Highlight why each consultant is a good fit based on their skills, experience, and technologies
4. If the user needs more details, you can also run Python code to analyze the results further
Be professional, concise, and focus on matching the right expertise to the project needs.
"""
The agent will keep these guidelines in mind when asked to find matching consultants for a particular project. An agent without a set of useful tools is just another chatbot, with no ability to be proactive and take the initiative in solving problems autonomously. Tools range from user-defined functions (UDFs) that interact with UC assets, to custom Python functions and even MCP servers designed for specific use cases.
We will only need two tools in our example: the similarity UDF mentioned earlier and the built-in Python execution tool, in case the agent needs to run some arbitrary code. A Large Language Model (LLM) serves as the “brain” behind an agent, analysing our prompts and translating them into detailed steps. Databricks offers a plethora of popular foundation models, from Claude and GPT variants to more exotic, cutting-edge alternatives.
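Under the hood, tool use boils down to the LLM emitting a tool call that the framework dispatches to the registered function. The following framework-free sketch is purely illustrative (the tool name mirrors our UDF, but the body returns canned data so the example stays self-contained); Agent Bricks handles all of this for you.

```python
from typing import Callable

# Registry mapping tool names to callables, as an agent framework would hold
TOOLS: dict[str, Callable] = {}

def tool(fn: Callable) -> Callable:
    """Register a function as a tool the agent may invoke."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def find_matching_consultants(project_description: str) -> list[str]:
    # In the real agent this wraps the AI_SIMILARITY-based UC SQL function;
    # here we return a canned answer to keep the sketch runnable anywhere.
    return ["consultant_a.pdf", "consultant_b.pdf"]

def dispatch(tool_call: dict) -> object:
    """Execute a tool call of the form {'name': ..., 'arguments': {...}}."""
    return TOOLS[tool_call["name"]](**tool_call["arguments"])

result = dispatch({
    "name": "find_matching_consultants",
    "arguments": {"project_description": "SAP Datasphere migration"},
})
```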
databricks-claude-sonnet-4-5 is an ideal candidate for our use case, offering stellar results on reasoning and planning tasks with a reasonable token consumption.
After defining the tools and the foundation LLM, we can create the agent and evaluate its performance with MLflow.
The next step is to deploy the agent and interact with it. The deployment process is very similar to the deployment of custom machine learning models: a model object will be created in Unity Catalog and will be served through an endpoint.
In the “Playground” section we can find all available models and agents, select our agent and start chatting!
Let’s ask the agent to propose some consultants for a SAP Datasphere related project. Here’s the prompt:
“We need 3 consultants for a project involving SAP Datasphere and SAP BW/4HANA. The project focuses on migrating selected BW data models to Datasphere using BW Bridge, redesigning CompositeProviders, and ensuring reporting continuity in SAP Analytics Cloud. Experience with delta handling, authorizations, and hybrid landscapes is required.”
The agent provided a very detailed and targeted response, suggesting the three consultants that fit the project description best. It highlighted the strengths of each one, composing a balanced team and explaining the rationale behind each pick.
Let’s now create a prompt for a project with a data engineering twist.
“We need 3 consultants for a project involving Databricks, a cloud data lake, and modern ELT pipelines. The team will design and implement scalable data pipelines using Spark and Delta Lake, ingesting data from multiple operational sources into a centralized analytics platform. Responsibilities include data modeling, performance optimization, and enforcing data quality standards.”
In a similar vein, the agent proposes the best three consultants for a Databricks data lake project, giving a structured recommendation and justifying its response.
This simple example proves that agentic AI can be used in real-life applications and deliver valuable insights into the data that lives in Unity Catalog. Of course, an AI agent is capable of far more than giving human-like recommendations based on a set of documents. It can be instructed to produce responses in a particular schema (such as JSON or YAML) for consumption by external tools that interact with it. Its functionality and capabilities can also be extended: a “smarter” AI agent could be entrusted with searching for new resumes when the current ones are outdated, then ingesting, parsing and storing the textual information.
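When the agent is instructed to answer in a fixed JSON schema, the caller should validate the reply before acting on it. A hypothetical sketch of that validation step (the schema and sample reply are invented for illustration):

```python
import json
from dataclasses import dataclass

@dataclass
class Recommendation:
    consultant: str
    score: float
    rationale: str

def parse_recommendations(raw: str) -> list[Recommendation]:
    """Validate the agent's JSON reply; raises early on schema drift."""
    payload = json.loads(raw)
    return [Recommendation(**item) for item in payload["recommendations"]]

# Example reply the agent might be instructed to produce
reply = ('{"recommendations": [{"consultant": "consultant_a.pdf", '
         '"score": 0.91, "rationale": "Deep SAP Datasphere experience"}]}')
recs = parse_recommendations(reply)
```

Failing fast on a malformed reply keeps downstream systems from silently consuming a hallucinated or truncated response.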
Agentic AI is quickly becoming the “next layer” on top of the modern data platform - not just generating text, but reasoning through a goal, choosing the right tools, and turning your governed data into decisions you can actually act on. On Databricks, that’s the interesting twist: agents don’t live in a vacuum; they live inside the same ecosystem where your data already has lineage, permissions, and operational guardrails, so the jump from prototype to production is much smaller than it used to be.
In our hands-on example, we made that concrete by building a Consultant Matchmaker Agent: ingesting consultant PDFs into a governed volume, extracting text into a UC table, using an AI_SIMILARITY()-powered SQL function to rank profiles against a project description, and wrapping it all in an Agent Bricks-generated agent skeleton with a clear system prompt, a couple of focused tools, MLflow tracking, and endpoint deployment - so you can literally “chat” with your team’s knowledge and get explainable recommendations back.
Looking ahead, the real payoff will come as agents become more autonomous and more composable: producing structured outputs (JSON/YAML) for downstream systems, chaining richer tools (SQL, Python, APIs, MCP servers), and even keeping themselves fresh by discovering, ingesting, and re-indexing new data when the underlying knowledge changes.
Want to explore how Agentic AI on Databricks could work for your own use cases? Get in touch with our experts to discuss your data, your goals, and the right agent architecture for your organisation.