From document burden to intelligent generation. A flow designed for procurement technicians who value their time.
Connect LicitadIA with your existing repositories: SharePoint, Nextcloud, local folders. The platform automatically indexes all your historical tender documentation.
Complete a simple form with basic tender data: document type, subject, CPV, procedure, and amount. LicitadIA immediately understands the context.
The RAG system retrieves the most relevant documents from your history and, combined with the form information, generates a complete and coherent draft in minutes.
The generated document is an intelligent draft, not a final product. Edit it freely, request specific changes from the conversational assistant, and export when satisfied.
The architecture that lets a language model consult your own documentation in every answer. A step-by-step guide through the two phases of the process, indexing and query, and through the maths that holds them together.
The conceptual backbone of LicitadIA, explained without shortcuts.
An LLM on its own is a very smart student taking an exam from memory. Without access to reference material, it fails in exactly the cases that matter most.
A model trained in 2024 knows nothing of what happened after its cut-off date. Laws change, precedents evolve, specifications get updated — the LLM has no idea.
The LLM has never read your organisation's files, nor the historical specifications, nor the clauses your team has drafted over years. It completely lacks institutional memory.
When the model doesn't know something, it doesn't stay silent: it invents plausible-sounding but false answers. In an administrative context, where a wrong clause is a real risk, that is unacceptable.
RAG addresses all three problems at once: on every query it gives the model the retrieved context it needs to answer with precision, currency and traceability.
Before going into the technical detail, look at what changes when an LLM does — or does not — have access to your organisation's actual documentation.
Every RAG system breaks down into two clearly distinct phases. Indexing runs once per document, asynchronously. Querying runs every time a user asks a question.
It happens once for every document you add. It is slow, it runs in the background and produces something invisible but essential: a numerical representation of every idea, ready to be queried in milliseconds.
Historical specifications, framework contracts, justification memoranda, regional regulations. Everything your organisation has produced over the years enters the system.
Each document is broken into chunks of 200–1000 tokens, respecting natural boundaries (paragraph, sentence). A 10–20% overlap prevents losing information at the edges.
Each chunk goes through an embedding model — typically a specialised transformer — that returns a vector of hundreds of dimensions encoding its meaning.
Vectors, text and metadata are stored and indexed with HNSW or another ANN algorithm. Ready to answer, in milliseconds, any future query.
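The chunking step above can be sketched in a few lines. This is a minimal illustration, not LicitadIA's implementation: it assumes whitespace-separated words stand in for tokens, splits sentences with a simple regular expression, and uses hypothetical names (chunk_text, max_tokens, overlap_ratio). The tiny limits in the example exist only to make the overlap visible.

```python
import re

def chunk_text(text, max_tokens=200, overlap_ratio=0.15):
    """Pack whole sentences into chunks of roughly max_tokens words.

    Trailing sentences of each chunk are repeated at the start of the next
    one, up to overlap_ratio * max_tokens words. A single sentence longer
    than max_tokens is kept whole and will exceed the limit.
    """
    # Assumption: sentences end with ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    budget = int(max_tokens * overlap_ratio)
    chunks, current = [], []
    for sentence in sentences:
        size = sum(len(s.split()) for s in current)
        if current and size + len(sentence.split()) > max_tokens:
            chunks.append(" ".join(current))
            # Carry over as many trailing sentences as fit in the overlap budget.
            tail, used = [], 0
            for s in reversed(current):
                n = len(s.split())
                if used + n > budget:
                    break
                tail.insert(0, s)
                used += n
            current = tail
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks

# Ten five-word sentences, chunked at 20 words with a 25% overlap.
text = " ".join(f"Clause {i} sets delivery terms." for i in range(10))
chunks = chunk_text(text, max_tokens=20, overlap_ratio=0.25)
for c in chunks:
    print(c)
```

Each chunk starts with the last sentence of the previous one, so a clause that straddles a boundary still appears whole in at least one chunk. A production pipeline would count tokens with the embedding model's own tokenizer rather than by words.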
Each fragment becomes a vector of hundreds of dimensions. Here we project it into two dimensions to visualise it: texts about the same topic sit close together; texts about different topics sit far apart. That geometric proximity is semantic similarity.
Five steps. The question goes through the same embedding model, searches the vector database, retrieves the relevant fragments and ends in a traceable answer. Retrieval itself takes well under a second; the total response time depends mostly on how long the LLM takes to generate.
The user's question is converted into a vector using exactly the same embedding model the documents were indexed with.
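The vector database scores indexed vectors by the cosine of the angle they form with the question vector. An ANN index such as HNSW visits only a small fraction of them, which is how collections of millions of vectors are searched in milliseconds.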
The k most similar vectors (typically 3, 5 or 10) are returned, ordered from highest to lowest semantic proximity.
A prompt is assembled that includes a system instruction, the retrieved fragments as context and the user's original question.
The LLM generates the final answer based solely on that context, able to cite the exact source of every claim.
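The retrieval and prompt-assembly steps above can be sketched as follows. This is a toy illustration under stated assumptions: the vectors are hand-written two-dimensional stand-ins for real embeddings, the search is a brute-force scan rather than an ANN index, and every name (cosine, top_k, build_prompt, the fragment ids) is hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, index, k=3):
    """Return the k best (score, fragment_id, text) tuples, best first.

    index is a list of (fragment_id, vector, text). This scans every entry;
    a real vector database would use an ANN index instead.
    """
    scored = [(cosine(query_vec, vec), fid, text) for fid, vec, text in index]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]

def build_prompt(question, hits):
    """Assemble system instruction, retrieved fragments and the question."""
    context = "\n".join(f"[{fid}] {text}" for _, fid, text in hits)
    return (
        "Answer using only the context below and cite fragment ids in brackets.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Toy index: in practice the vectors come from the embedding model.
index = [
    ("doc1#0", [1.0, 0.0], "Payment within 30 days."),
    ("doc2#4", [0.0, 1.0], "Delivery in Madrid."),
    ("doc3#1", [0.9, 0.1], "Invoices are paid monthly."),
]
question = "What are the payment terms?"
hits = top_k([1.0, 0.0], index, k=2)  # the query vector is also a stand-in
prompt = build_prompt(question, hits)
print(prompt)
```

Keeping the fragment id next to each piece of context is what allows the model's answer to point back at its sources.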
Two vectors form an angle. The cosine of that angle is the metric most RAG systems use to decide what is "similar" and what isn't. It ignores magnitude and looks only at direction, which is why it works equally well for short and long texts.
Dot product of the vectors, divided by the product of their norms.
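Written out, the definition above reads as follows. The symbols a, b and n (the number of dimensions) are notation chosen here for illustration, and the two small examples are hand-picked rather than taken from real embeddings.

```latex
\cos\theta
  = \frac{\mathbf{a}\cdot\mathbf{b}}{\lVert\mathbf{a}\rVert\,\lVert\mathbf{b}\rVert}
  = \frac{\sum_{i=1}^{n} a_i b_i}
         {\sqrt{\sum_{i=1}^{n} a_i^{2}}\;\sqrt{\sum_{i=1}^{n} b_i^{2}}}

% Same direction, different magnitude: a = (1, 2, 2), b = (2, 4, 4)
\cos\theta = \frac{1\cdot 2 + 2\cdot 4 + 2\cdot 4}{\sqrt{9}\,\sqrt{36}}
           = \frac{18}{3 \cdot 6} = 1

% Vectors 45 degrees apart: a = (1, 0), b = (1, 1)
\cos\theta = \frac{1\cdot 1 + 0\cdot 1}{\sqrt{1}\,\sqrt{2}}
           = \frac{1}{\sqrt{2}} \approx 0.707
```

The first example shows why magnitude drops out: doubling every component of a vector leaves the cosine at 1. The second shows the score falling as the directions diverge.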
What we've described so far is a basic RAG, already useful. In real-world projects, and in LicitadIA, further techniques are layered on top to raise retrieval precision and sharply reduce hallucinations.
Combines vector search with classic BM25. Semantics captures meaning; lexical search finds codes, proper names and exact technical terms. Reciprocal Rank Fusion merges them.
Retrieve 20 candidates and reorder them with a more expensive but more precise cross-encoder. Keep the best 5. Drastic relevance gain, controlled latency.
An LLM rephrases the question into variants, or you apply HyDE: generate a hypothetical answer and vectorise that, not the question. This works because a hypothetical answer resembles the indexed documents more closely than a short question does.
Before searching, restrict the space: only documents from the past year, only from file X, only in Spanish. Reduces noise and multiplies relevance.
Index small chunks (better search precision) but when retrieving one, pass the LLM the larger fragment containing it. Best of both worlds.
An agent runs iterative searches, refining the query as it discovers more, until it has enough information. Slower; unbeatable on complex questions.
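Of the techniques above, the fusion step of hybrid search is the easiest to show in code. The sketch below implements Reciprocal Rank Fusion over two ranked lists of document ids. The ids and the function name are hypothetical, and the constant k=60 is the value commonly used in the literature, assumed here rather than taken from LicitadIA.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first.

    Each id scores 1 / (k + rank) in every list where it appears
    (rank starts at 1); scores are summed across lists.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical results from the two retrievers, best first.
vector_hits = ["spec_2021", "memo_17", "clause_A"]
bm25_hits = ["clause_A", "spec_2021", "cpv_list"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Because only ranks are used, the fusion needs no calibration between cosine scores and BM25 scores, which live on different scales. Documents found by both retrievers rise to the top; the fused list is then a natural input for the re-ranking step.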
LicitadIA applies everything you've just seen — recursive chunking, specialised embeddings, hybrid search, re-ranking and per-file filtering — over the specifications, memoranda and clauses of your own organisation. The AI stops inventing and starts citing.