Building an internal AI copilot with Gemini and your own data

A practical walkthrough of how we build internal Gemini-powered copilots for clients — what works, what doesn't, and what it actually costs.

August 5, 2025 · 2 min read

The hardest part of building an internal AI copilot isn't the model. It's everything around the model: getting the right context in, keeping the wrong context out, and convincing a non-technical team to actually use it.

Here's the shape of what we typically build, in order.

1. Pick one job, not a chatbot

The instinct is to build "an AI assistant for the company." Don't. Pick a single repetitive task — drafting an RFP response, summarizing a support ticket queue, finding the right SOP for a question. Each task gets its own purpose-built surface.

2. Retrieval over fine-tuning

For 95% of business cases, retrieval-augmented generation (RAG) with Gemini beats fine-tuning: it is cheaper, easier to update, and transparent about what informed the answer. Fine-tuning is for when you genuinely need the model to adopt a style or a structured-output format that prompting alone can't reliably produce.
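To make the transparency point concrete, here is a minimal sketch of how retrieved passages might be packed into a prompt with numbered sources, so an answer can point back at what informed it. The `Chunk` type, the `build_grounded_prompt` helper, and the prompt wording are illustrative assumptions, not part of any Gemini API; the resulting string would be passed to whichever model client you use.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # hypothetical label, e.g. a document title or link
    text: str    # the retrieved passage


def build_grounded_prompt(question: str, chunks: list[Chunk]) -> str:
    """Assemble a prompt that asks the model to answer only from numbered sources."""
    numbered = "\n\n".join(
        f"[{i}] ({chunk.source})\n{chunk.text}"
        for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the sources below. "
        "Cite sources by number, and say so if they do not contain the answer.\n\n"
        f"Sources:\n{numbered}\n\n"
        f"Question: {question}\n"
    )
```

Because the sources are numbered in the prompt, the surface that shows the answer can also list which documents were retrieved, which is much harder to offer with a fine-tuned model.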

3. The data plumbing is the product

A real pipeline looks like:

  • Ingest source docs from Drive, Confluence, Notion, support tickets.
  • Chunk and embed (Vertex AI Embeddings, or an open-source embedding model if you have reasons).
  • Store vectors in something cheap and managed (we usually use Vertex AI Vector Search or, for smaller scale, pgvector on Cloud SQL).
  • Re-ingest on a schedule — the model is only as fresh as the index.

This part is 70% of the work.
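The chunk, embed, store, and retrieve steps above can be sketched in a few dozen lines. This is a toy under stated assumptions, not our production pipeline: `toy_embed` is a hashed bag-of-words stand-in for a real embedding model such as Vertex AI Embeddings, the in-memory list stands in for a vector store such as Vertex AI Vector Search or pgvector, and the chunk size and overlap are arbitrary illustrative values.

```python
import hashlib
import math
from collections import Counter

DIM = 1024  # size of the toy embedding space (assumption)


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words that overlap by `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


def toy_embed(text: str) -> list[float]:
    """Stand-in for a real embedding model: hashed bag-of-words, L2-normalized."""
    counts = Counter(
        int(hashlib.sha256(token.encode("utf-8")).hexdigest(), 16) % DIM
        for token in text.lower().split()
    )
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    vec = [0.0] * DIM
    for bucket, c in counts.items():
        vec[bucket] = c / norm
    return vec


def build_index(docs: dict[str, str]) -> list[tuple[str, str, list[float]]]:
    """Chunk and embed every document; returns (doc_id, chunk, vector) rows."""
    return [
        (doc_id, chunk, toy_embed(chunk))
        for doc_id, text in docs.items()
        for chunk in chunk_text(text)
    ]


def search(index, query: str, k: int = 3) -> list[tuple[float, str, str]]:
    """Return the top-k (score, doc_id, chunk) rows by cosine similarity."""
    q = toy_embed(query)
    scored = [
        (sum(a * b for a, b in zip(q, vec)), doc_id, chunk)
        for doc_id, chunk, vec in index
    ]
    return sorted(scored, key=lambda row: row[0], reverse=True)[:k]
```

Re-ingesting on a schedule then amounts to calling `build_index` again on the current documents and swapping the result in; a real deployment would more likely update only the documents that changed.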

4. Surface it where work already happens

Slack bot, Chrome extension, sidebar in Google Docs — meet people in the tool they already have open. A standalone "AI portal" gets opened twice and never again.

5. Track useful-vs-not, not engagement

The metric isn't "queries per day." It's "did this save a person from doing 20 minutes of manual work." Build a thumbs-up/thumbs-down loop into every response and review the misses weekly for the first month.
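A feedback loop like this needs very little machinery. The sketch below assumes each rated response is stored as a `Feedback` record (a hypothetical shape, not a schema from any particular tool) and shows the two things worth computing: the share of responses marked helpful, and the recent thumbs-down cases to read through in the weekly review.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Feedback:
    query: str
    answer: str
    helpful: bool  # True = thumbs-up, False = thumbs-down
    at: datetime   # when the rating was given


def helpful_rate(events: list[Feedback]) -> float:
    """Share of rated responses marked helpful (0.0 when nothing was rated)."""
    if not events:
        return 0.0
    return sum(e.helpful for e in events) / len(events)


def misses_for_review(
    events: list[Feedback], now: datetime, days: int = 7
) -> list[Feedback]:
    """Thumbs-down events from the last `days` days, newest first."""
    cutoff = now - timedelta(days=days)
    recent = [e for e in events if not e.helpful and e.at >= cutoff]
    return sorted(recent, key=lambda e: e.at, reverse=True)
```

Storing the query and answer alongside the rating is the design choice that matters: a miss can only be diagnosed (bad retrieval, stale document, wrong task) if the reviewer can see exactly what was asked and returned.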

What it actually costs

For most mid-market deployments we ship, expect 4–8 weeks of build work, then a few hundred dollars per month in GCP costs for a team of 50 active users. The ongoing maintenance is mostly content curation, not engineering.

If you've been wanting to "do something with AI" but every proposal has come back with a six-figure scope, this is the conversation worth having.

Topics

  • engineering
  • gemini
  • ai
  • rag
  • google-cloud

