Core Concepts
This page is the single best place to understand how Sentinel fits together. Everything else in the documentation builds on these ideas.
Matters: the container for work
A matter is the top‑level container for a piece of work and everything related to it — documents, sessions, findings, analysis, and team access. What a matter represents depends on your practice mode:
- Litigation → a case (with custodians, a court docket, discovery, and productions).
- Transactional → a deal (with counterparties, a data room, and diligence).
- Real Estate → a property transaction (with stakeholders and closing diligence).
You always work in the context of a selected matter. The matter selector in the sidebar sets your current matter; the URL reflects it so you can bookmark and share deep links.
There are also special matter types:
- Personal data lake — a personal workspace owned by a single user.
- Enterprise data lake — an organization‑wide collection that every user in the tenant can access.
Documents: data rooms and case files
Documents in a matter live in two complementary surfaces:
- Data Room — the matter's primary document repository: received evidence, discovery documents, and deal documents. In Sentinel's model, the matter is its data room, so "the data room" and "the matter's documents" are effectively the same set.
- Case Files — firm‑authored work product kept distinct from discovered evidence: your own drafts, research, pleadings, and notes.
Both are display/grouping layers over the same underlying indexed content (see below), so search and Emma see everything regardless of which surface a document came in through.
The content pipeline
The most important architectural idea in Sentinel: every file that enters the system goes through one unified ingestion pipeline and becomes searchable. There is no feature that stores content "off to the side" where search can't see it.
When a file is uploaded (or arrives via a monitored mailbox, a client‑upload link, or a docket import), it flows through a multi‑stage pipeline:
- Parse — the file is identified by type. Plain documents go straight to text extraction; email containers (PST, MBOX, MSG) and archives (ZIP) are sent to the split stage.
- Split — archives and email containers are decomposed into their individual files and attachments, each of which re‑enters the pipeline.
- OCR — for scanned PDFs and images with no extractable text, Sentinel runs real optical character recognition (Azure Document Intelligence) to recover the text.
- Multimodal extraction — for images, audio, video, and image‑bearing PDFs, a multimodal AI model extracts meaning that plain text extraction would miss (e.g. the content of a photo, a diagram, or a scanned signature).
- Embed — the extracted content is split into chunks and converted into vector embeddings that power meaning‑based (semantic) search.
- Complete — the job is finalized: counts are tallied, the search index is rebuilt, and the document is fully searchable.
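The stage routing described above can be sketched in a few lines. This is a minimal illustration, not Sentinel's implementation: the `Stage` enum, the `plan_stages` function, the extension sets, and the `has_text` / `has_images` flags are all hypothetical names chosen for the example, and it assumes that a container's children are planned separately once they re-enter the pipeline.

```python
from enum import Enum
from pathlib import Path
from typing import List


class Stage(Enum):
    PARSE = "parse"
    SPLIT = "split"
    OCR = "ocr"
    MULTIMODAL = "multimodal"
    EMBED = "embed"
    COMPLETE = "complete"


# Hypothetical extension sets, chosen only for illustration.
CONTAINERS = {".pst", ".mbox", ".msg", ".zip"}
IMAGES = {".png", ".jpg", ".jpeg", ".tiff"}
AUDIO_VIDEO = {".mp3", ".wav", ".mp4", ".mov"}


def plan_stages(filename: str, has_text: bool = True, has_images: bool = False) -> List[Stage]:
    """Return the stages a single file would pass through, in order."""
    ext = Path(filename).suffix.lower()
    stages = [Stage.PARSE]

    if ext in CONTAINERS:
        # Archives and email containers are decomposed; each child file
        # re-enters the pipeline and gets its own plan (assumption).
        stages.append(Stage.SPLIT)
        return stages

    is_pdf = ext == ".pdf"
    is_image = ext in IMAGES

    # OCR applies to scanned PDFs and images with no extractable text.
    if (is_pdf or is_image) and not has_text:
        stages.append(Stage.OCR)

    # Multimodal extraction applies to images, audio, video, and image-bearing PDFs.
    if is_image or ext in AUDIO_VIDEO or (is_pdf and has_images):
        stages.append(Stage.MULTIMODAL)

    stages += [Stage.EMBED, Stage.COMPLETE]
    return stages
```

For example, `plan_stages("scan.pdf", has_text=False, has_images=True)` yields parse, OCR, multimodal, embed, complete, while `plan_stages("mail.pst")` stops at split because the individual messages are processed as their own files.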
Two ideas underpin this:
- One canonical content record with deduplication. Each unique file's extracted text and metadata are stored once, keyed by a content hash (SHA‑256). If the same file appears again — say, the same memo received in two productions — Sentinel reuses the existing extraction instead of re‑processing it.
- Embeddings for semantic search. The chunked embeddings are stored in a vector index so that a query like "emails discussing regulatory pressure" finds documents by meaning, not just by exact words.
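The deduplication idea can be illustrated with a small in-memory sketch. It assumes only what is stated above (records keyed by a SHA-256 content hash, with extraction reused on a repeat); the `ContentStore` class, its methods, and the `extract` callback are hypothetical stand-ins, not Sentinel's storage layer.

```python
import hashlib
from typing import Callable, Dict


class ContentStore:
    """In-memory stand-in for a canonical content store (hypothetical)."""

    def __init__(self, extract: Callable[[bytes], str]) -> None:
        self._extract = extract
        self._records: Dict[str, str] = {}
        self.extractions = 0  # counts how often extraction actually ran

    def ingest(self, data: bytes) -> str:
        """Store the file's extracted text once and return its content hash."""
        key = hashlib.sha256(data).hexdigest()
        if key not in self._records:
            # First time these exact bytes are seen: run the expensive extraction.
            self._records[key] = self._extract(data)
            self.extractions += 1
        # Otherwise the existing record is reused as-is.
        return key

    def text(self, key: str) -> str:
        return self._records[key]
```

Ingesting the same memo from two productions returns the same key both times and runs extraction only once. Note that this matches byte-identical files only; a re-saved or re-scanned copy would hash differently.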
What you see while a document is processing
Ingestion is asynchronous, so a freshly uploaded document moves through visible states such as queued, processing / extracting (parse, OCR, or multimodal extraction), embedding, and complete — at which point it appears in search results. Files that can't be processed (unsupported or empty) are marked failed/skipped with a reason you can review. Large uploads and media files naturally take longer than small text documents.
Search
Sentinel offers several search modes over a matter's content (and transcripts):
- Keyword — exact terms, phrases, names, Bates numbers, and Boolean operators. Fast and precise.
- Semantic — meaning‑based search using embeddings; finds conceptually relevant documents even when they don't use your exact words.
- Hybrid — combines keyword precision with semantic recall; a good default for natural‑language questions.
- Visual — searches the visual content of images and scanned pages (signatures, handwriting, diagrams, photos) using multimodal embeddings.
See Search & Research for how to use them, and Emma for asking questions in natural language.
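To make the hybrid mode more concrete, here is one common way to merge a keyword ranking with a semantic ranking: reciprocal rank fusion (RRF). The documentation does not say how Sentinel combines the two, so treat this purely as an illustrative technique; the function name, the document IDs, and the constant `k = 60` (a conventional default for RRF) are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked well by multiple retrievers rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


keyword_hits = ["A", "B", "C"]   # hypothetical keyword results, best first
semantic_hits = ["B", "D", "A"]  # hypothetical semantic results, best first
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

Here `fused` is `["B", "A", "D", "C"]`: documents B and A appear in both lists and outrank D and C, which each appear in only one.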
Emma: the AI assistant
Emma is Sentinel's built‑in AI assistant. She can search your documents, answer questions about a matter, analyze communications, navigate the app, code and tag documents, and summarize sessions — by text or by voice.
Emma's defining trait is citation grounding: she may only reference a document using a citation token that a tool actually returned during the conversation, and her responses are mechanically checked against those tokens. If a response references a document that wasn't retrieved, it is blocked rather than shown. This keeps AI output grounded in your real evidence. See Emma — the AI Assistant.
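The mechanical check can be pictured as a set comparison between the citations in a response and the tokens that tools returned. The sketch below is an assumption-heavy illustration: the `[[doc:ID]]` token syntax, the `check_citations` function, and its return shape are all hypothetical, since the actual token format is not described here.

```python
import re
from typing import Set, Tuple

# Hypothetical citation syntax, e.g. "[[doc:A1]]"; the real format is not documented here.
CITATION = re.compile(r"\[\[doc:([A-Za-z0-9_-]+)\]\]")


def check_citations(response: str, returned_tokens: Set[str]) -> Tuple[bool, Set[str]]:
    """Return (allowed, unknown) for a draft response.

    `returned_tokens` holds every citation token a tool returned during
    the conversation. A response is allowed only if every token it cites
    is in that set; otherwise the unknown tokens are reported.
    """
    cited = set(CITATION.findall(response))
    unknown = cited - returned_tokens
    return (not unknown, unknown)
```

With `returned_tokens = {"A1"}`, a response citing `[[doc:A1]]` passes, while one that also cites `[[doc:B2]]` is rejected with `{"B2"}` reported as ungrounded.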
Sessions
A session is a recorded deposition, interview, or meeting. Sentinel transcribes the audio with speaker diarization (who said what), and the transcript becomes searchable like any other content. During live or ambient sessions, Sentinel can surface AI "leads" — related documents, possible contradictions, and follow‑ups — in real time. See Sessions & Transcription.
Findings
Findings are the notebook of a matter: notes, theories, legal issues, contradictions, and other entries, each optionally linked to the evidence that supports it. Findings are created conversationally — Emma proposes a finding mid‑conversation and you accept, edit, or dismiss it. See Findings.
Practice modes
A tenant runs in exactly one practice mode, which reshapes the navigation and available features:
| Mode | Objects | Distinctive surfaces |
|---|---|---|
| Litigation | Cases, custodians | Discovery & document review, productions, privilege log, motions, jury selection, depositions, communication & issue analytics |
| Transactional | Deals, counterparties | Deal data rooms, due‑diligence request lists, deal checklists & templates, business intelligence |
| Real Estate | Properties, stakeholders | Deal pipeline, diligence, closing checklists (transactions ribbon) |
Practice mode is a per‑tenant setting that an administrator controls (and that Sentinel sets during provisioning). Changing it changes what every user in the tenant sees. A few surfaces are practice‑mode‑specific and are noted in their guides.
Access control: who can see a matter
Sentinel follows an "administrators assign attorneys to matters" model:
- Admins (and Sentinel's platform‑admin staff role) can access any matter in the tenant.
- The owner of a personal data‑lake matter can always access it.
- Enterprise data‑lake matters are org‑wide — every authenticated user in the tenant can access them.
- Everyone else gets access to a matter only through an explicit assignment made by an admin. Each assignment carries an access level:
- Full — read and write.
- Read‑only — can view but not modify.
- Limited — a restricted subset.
Mutating actions (creating, editing, deleting within a matter) require full access. Assignments can be removed, which immediately revokes access. See Administration → Matter assignments & access levels.
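The rules above can be condensed into a small access check. This is a sketch under stated assumptions rather than Sentinel's authorization code: the `User`, `Matter`, and `Access` types and the `kind` strings are hypothetical, and three choices go beyond what the text says. Admins and personal data-lake owners are treated as having full access, and unassigned users of an enterprise data lake are treated as read-only.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class Access(Enum):
    FULL = "full"
    READ_ONLY = "read_only"
    LIMITED = "limited"


@dataclass
class User:
    id: str
    is_admin: bool = False


@dataclass
class Matter:
    kind: str = "standard"  # or "personal_data_lake", "enterprise_data_lake" (hypothetical values)
    owner_id: Optional[str] = None
    assignments: Dict[str, Access] = field(default_factory=dict)


def effective_access(user: User, matter: Matter) -> Optional[Access]:
    """Return the user's access level on the matter, or None for no access."""
    if user.is_admin:
        return Access.FULL  # assumption: "can access any matter" means full access
    if matter.kind == "personal_data_lake" and matter.owner_id == user.id:
        return Access.FULL  # assumption: the owner has full access
    if user.id in matter.assignments:
        return matter.assignments[user.id]  # explicit assignment made by an admin
    if matter.kind == "enterprise_data_lake":
        return Access.READ_ONLY  # assumption: org-wide access defaults to read-only
    return None


def can_view(user: User, matter: Matter) -> bool:
    return effective_access(user, matter) is not None


def can_mutate(user: User, matter: Matter) -> bool:
    # Creating, editing, and deleting within a matter require full access.
    return effective_access(user, matter) is Access.FULL
```

Removing a user's entry from `assignments` makes `effective_access` return `None` on the next check, which mirrors the immediate revocation described above.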
Multi‑tenancy and data isolation
Each customer is a tenant. A tenant's documents live in that tenant's own database and storage — customer data is not commingled across tenants. A central platform layer handles cross‑tenant concerns like accounts and tenant registration, but not customer document content. See Security & Compliance for the full posture.
With these concepts in hand, the User Guides cover each feature in depth. Start with Matters, Cases & Deals.