Grounding MCP

MCP server for grounding AI reasoning in verifiable sources.

Connect

Add https://grounding.btr.mt/mcp as a remote MCP server in Claude, ChatGPT, or any MCP-compatible client.
Transport: Streamable HTTP
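
Any HTTP client can talk to the endpoint directly. A minimal sketch of the JSON-RPC envelope a Streamable HTTP client POSTs to the server — the `tools/call` method and params shape follow the MCP specification; the headers and helper are illustrative only:

```python
import json

MCP_URL = "https://grounding.btr.mt/mcp"

def mcp_tool_call(name: str, arguments: dict, request_id: int = 1) -> str:
    """Build the JSON-RPC 2.0 body for an MCP tools/call request."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })

# POST this body to MCP_URL with headers:
#   Content-Type: application/json
#   Accept: application/json, text/event-stream
body = mcp_tool_call("check_backends", {})
```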

How this works

Grounding MCP searches academic databases and Wikipedia in parallel, then scores and deduplicates results so you get verified, citable sources rather than hallucinated references. It can verify specific claims (check_citation), explore a paper's reference network (citation_graph), or retrieve context for a research question (fetch_context).

The server queries up to six backends per request. Results include DOIs, abstracts, citation counts, and links to the original source. If a backend is unavailable or rate-limited, the others compensate automatically.
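
The merge step can be pictured with a small sketch. The result shape and `score` field here are hypothetical — the server's actual scoring is internal — but the idea is the same: duplicate DOIs from different backends collapse to the best-scored record.

```python
def dedupe_by_doi(results):
    """Keep the best-scored record per DOI; records without a DOI pass through."""
    best = {}
    no_doi = []
    for r in results:
        doi = (r.get("doi") or "").lower()
        if not doi:
            no_doi.append(r)
        elif doi not in best or r["score"] > best[doi]["score"]:
            best[doi] = r
    # Highest-scored records first, then unmatched ones.
    return sorted(best.values(), key=lambda r: r["score"], reverse=True) + no_doi

merged = dedupe_by_doi([
    {"doi": "10.1000/X", "score": 0.9, "backend": "crossref"},
    {"doi": "10.1000/x", "score": 0.7, "backend": "openalex"},  # duplicate DOI
    {"title": "No DOI yet", "score": 0.5},
])
```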

What it can't do: It cannot read full-text PDFs, generate literature reviews, or guarantee exhaustive coverage. It surfaces the best sources it can find—you still need to read and evaluate them.

Backends

Crossref: Authoritative DOI registry for citation verification
Semantic Scholar: AI-powered paper search with abstracts
OpenAlex: Open catalogue of scholarly works
Wikipedia: General knowledge articles and extracts
OpenLibrary: Book verification via Open Library (free, no auth)

Tools

check_backends
Check health status of all grounding backends at once. Returns status for each backend:
- **crossref** — DOI verification (free, no auth)
- **semantic-scholar** — Paper search with abstracts (free, optional API key)
- **wikipedia** — General knowledge (free)
- **openalex** — Broad paper discovery (free)
- **openlibrary** — Book verification (free, no auth)

**When to use:**
- Before a grounding-heavy session to see what's available
- When tools return errors, to diagnose which backend is down
force boolean
Force health check for rate-limited backends. By default, Semantic Scholar is skipped without an API key to preserve the 1 req/sec rate limit.
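
One way to act on the result, assuming a hypothetical response shape with a per-backend status map (the actual field names may differ):

```python
def fallback_backends(status: dict):
    """Decide a backends override for later fetch_context calls.

    `status` is a hypothetical map like {"semantic-scholar": "ok", ...}.
    """
    if status.get("semantic-scholar") != "ok":
        # Skip SS entirely, as the fetch_context docs advise during cooldown.
        return ["openalex", "crossref"]
    return None  # no override needed

override = fallback_backends({"semantic-scholar": "rate-limited", "openalex": "ok"})
```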
check_citation
Verify that a citation/reference exists before presenting it to the user. Use this tool BEFORE citing a paper to confirm it's real. Helps prevent hallucinated citations.

**When to use:**
- Before including a citation in your response
- To verify a paper the user mentioned actually exists
- To get the correct DOI/URL for a paper

**Input priority:**
- If you have a DOI, provide it (most reliable)
- Otherwise, provide title + authors + year for fuzzy matching

**Backends consulted:** Crossref (authoritative DOI registry), Semantic Scholar, OpenAlex, OpenLibrary (books)

**If the result says "not found":** This means the citation could not be verified. Do NOT assume the work exists anyway. Common causes:
- You have the wrong title (e.g. confusing a concept name with a book title)
- You have the wrong author or year
- The work genuinely does not exist (hallucination)
- The work exists but is not indexed (rare for published books/papers)

Use web search to confirm the work exists before citing it. Pay attention to the suggestion field in the response.

**Books:** Academic databases index journal articles, not books. When you verify a book, the tool may match a book *review* or catalogue entry instead of the book itself. OpenLibrary provides accurate book metadata but has no citation counts. If the result has unexpectedly low citations or wrong authors for a well-known book, check that the matched work is the book itself, not a review of it.

**Response fields:** Check "verified" (bool), "confidence" (0–1 match quality), "source" (which backend matched), and "suggestion" (hints when not found). A low confidence score means the match is uncertain — verify the title and authors before citing.

**Tip:** The DOI returned here can be passed directly to citation_graph for reliable graph traversal — more dependable than passing a title.
authors array
Author names (e.g., ["Smith, J.", "Jones, A."]). Improves matching accuracy.
doi string
DOI of the paper (e.g., "10.1037/apl0000353"). Preferred if available.
title string
Title of the paper. Required if no DOI provided.
year number
Publication year. Improves matching accuracy.
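
The input-priority rule can be encoded directly. A sketch of an argument builder — only the parameter names come from the schema above; the helper and the confidence threshold are illustrative:

```python
def citation_args(doi=None, title=None, authors=None, year=None) -> dict:
    """Build check_citation arguments: DOI wins; otherwise fuzzy-match fields."""
    if doi:
        return {"doi": doi}
    if not title:
        raise ValueError("title is required when no DOI is provided")
    args = {"title": title}
    if authors:
        args["authors"] = authors  # e.g. ["Smith, J."]
    if year:
        args["year"] = year
    return args

def safe_to_cite(result: dict, threshold: float = 0.8) -> bool:
    """Gate on the documented verified/confidence fields (threshold is a guess)."""
    return bool(result.get("verified")) and result.get("confidence", 0.0) >= threshold

a = citation_args(doi="10.1037/apl0000353", title="ignored when a DOI is present")
b = citation_args(title="Sexual Strategies Theory", authors=["Buss, D."], year=1993)
```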
citation_graph
Start from a known paper and explore the research landscape around it. Use this tool to discover what a paper cites, what cites it, and how influential those connections are. This is the best way to find canonical/foundational works that keyword search misses.

**When to use:**
- To find seminal works: pick any paper in the area, traverse its references (direction "references")
- To find recent follow-up work: traverse citations (direction "citations")
- When fetch_context returns recent papers but you need the foundational work they build on
- To explore the research context around a specific paper

**Input:** A paper identifier (DOI, arXiv ID, Semantic Scholar ID, or title)

**Returns:** Seed paper metadata plus references and/or citations with influence markers (is_influential) and intent labels (methodology, background, result comparison). Includes a temporal trend (year histogram). The trend is a sample when results hit the limit cap; for highly-cited papers (1000+), treat the distribution as indicative, not exact. Pay attention to is_influential edges — these mark the citations that shaped the seed paper's core argument.

**Sorting:** By default, results are ordered by recency (most recent first). For highly-cited seed papers, most results will be very recent and may not be the most important citing works. Use sort_by "citations" to rank by citation count instead.

**For literature characterisation,** use direction "citations" with sort_by "citations" and a moderate limit (50–100). This surfaces the canonical follow-up works, not just the latest.

**Resilience:** Uses Semantic Scholar for rich edge metadata (is_influential, intents). Falls back to OpenAlex when SS is rate-limited (HTTP 429). When sort_by is "citations", OpenAlex is used directly (it supports server-side citation sorting; SS does not). OpenAlex results omit is_influential and intents but still provide paper metadata, citation counts, and year data.

**Finding canonical works (recipe):**
1. Use fetch_context to find any well-cited paper in the area (min_citations: 50).
2. Use citation_graph on that paper with direction "citations", sort_by "citations", min_citations 200, limit 50. This returns the most-cited papers that cite the seed.
3. Works best with seed papers that have 500+ citations. For niche areas, lower min_citations accordingly.

This two-step workflow is more reliable than keyword search, which misses papers whose titles don't share your query terms (e.g. "Spandrels of San Marco" won't appear for "evolutionary psychology critique").

**Verify→graph workflow (recommended for reliable seed resolution):** Title-based seed resolution is unreliable when Semantic Scholar is rate-limited — it may resolve to the wrong paper via OpenAlex fuzzy matching. For reliable graph traversal, first verify the paper with check_citation to obtain its DOI, then pass the DOI to citation_graph. DOIs always resolve correctly. Example: check_citation → DOI "10.1177/0891243287001002002" → citation_graph with that DOI.
direction string (default: both)
Which direction to traverse. "references": papers cited by seed — use to find intellectual foundations. "citations": papers citing seed — use to find follow-up work. "both" (default) doubles API calls — use a single direction when you know what you need.
limit number (default: 20)
Maximum results per direction (default 20, max 1000). 20 is good for exploration. 50–100 for literature characterisation. Higher values are slower and results become less relevant.
min_citations integer (default: 0)
Minimum citation count to include an edge (0 = no filter). Post-retrieval filter; increase limit to compensate for filtered results.
paper_id string required
Paper identifier: DOI, arXiv ID, Semantic Scholar ID, or title for search fallback.
sort_by string (default: recency)
Sort order for results. Default "recency" returns mostly very recent low-impact papers for highly-cited seeds. Switch to "citations" for canonical follow-ups and the most influential citing works.
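
The verify→graph workflow reduces to two tool-call argument sets. The DOI below is the example DOI from the tool description; the limits mirror the canonical-works recipe:

```python
# Step 1: resolve a reliable identifier first.
verify = {"name": "check_citation",
          "arguments": {"doi": "10.1177/0891243287001002002"}}

# Step 2: traverse citing papers ranked by citation count.
graph = {"name": "citation_graph",
         "arguments": {
             "paper_id": "10.1177/0891243287001002002",  # DOI confirmed in step 1
             "direction": "citations",   # follow-up work, not references
             "sort_by": "citations",     # canonical works, not just recent ones
             "min_citations": 200,       # post-retrieval filter
             "limit": 50,                # moderate, per the recipe
         }}
```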
fetch_context
Retrieve real content to reason from before making claims. Use this tool to ground your reasoning in actual sources BEFORE making assertions.

**Source types:**
- "academic": Research papers from Semantic Scholar, OpenAlex, Crossref
- "general": Wikipedia articles for general knowledge
- "auto": Both academic and general sources (default)

**Returns:** Papers (with abstracts, DOIs, citation counts, source backend) and/or Wikipedia articles (with extracts). The response includes a diagnostics array showing per-backend hit counts and errors — check this to detect SS cooldown or backend failures. A summary.top_cited field highlights the highest-impact results.

**Query tips — these matter for result quality:**
- Use 3–5 distinctive keywords, not full sentences. UK/US spelling is handled transparently.
- Including an author surname dramatically improves precision (see query parameter).
- Don't include book titles — they dilute results. Search for the topic.
- Multiple short queries beat one long query.
- For canonical works, use min_citations: 50+. This is the most effective noise filter.

**IMPORTANT — rate limits and call sequencing:** Call fetch_context ONE AT A TIME, never in parallel. SS rate-limits after ~10 sequential calls (or 2–3 parallel). Preserve SS budget for citation_graph and check_citation where it matters most — for broad searches, use backends: ["openalex", "crossref"] proactively. If diagnostics show SS in cooldown, add backends: ["openalex", "crossref"] to ALL subsequent calls.

**Without SS (Crossref/OpenAlex only):** avoid common English words as query terms (second, shift, class, model, system). Use specific compound terms ("household-labor" not "domestic") and always combine with min_citations.

**Domain field:** OpenAlex results include a "domain" field (e.g. "Social Sciences") for discipline filtering. Only affects OpenAlex — combine with backends: ["openalex"] for cleanest results.

**Finding canonical literature:** fetch_context with min_citations: 50 → citation_graph on the best-cited result (direction "references") to find foundational works keyword search misses.

**Literature characterisation:** 2–3 fetch_context calls with different phrasings → citation_graph on anchor papers (direction "citations") to find follow-up work. Synthesise across abstracts; don't rely on any single paper.
backends array
Specific backends to use (overrides source_type entirely). Options: semantic_scholar, openalex, crossref, wikipedia. Primary use: skip Semantic Scholar when it's in cooldown by passing ["openalex", "crossref"]. Also useful for targeting a single backend. Use check_backends to see all available backends.
domain string
OpenAlex top-level domain filter. Restricts OA results to a discipline. Values: "Social Sciences", "Health Sciences", "Life Sciences", "Physical Sciences". **Important:** domain ONLY filters OpenAlex results. Crossref and Semantic Scholar results pass through unfiltered — using domain alone can make results worse by removing OA's good matches while Crossref STEM noise remains. For cleanest results, combine domain with backends: ["openalex"] — e.g. backends ["openalex"] + domain "Social Sciences" eliminates cross-discipline noise entirely.
max_results number (default: 5)
Maximum results to return. Default: 5. Use 10–15 when exploring a topic or looking for anchor papers to feed into citation_graph. Higher values increase noise but improve coverage for broad queries.
min_citations number
Minimum citation count. Filters out papers below this threshold. Effective for finding established/canonical work: 50 for established, 100+ for foundational. Caution: filters out recent work that hasn't accumulated citations yet. Don't use when looking for the latest research.
query string required
Keyword query. **Including an author surname is the most effective way to improve precision.** "Stoet Geary gender equality paradox" returns the right paper; "gender equality paradox STEM" returns stem cell papers. "Hochschild second shift" finds the book; "second shift women employment" returns genomics noise. Good: "Buss sexual strategies theory", "household labor division gender". Bad: "how is household labour divided between genders in modern dual-income families" (too long), "second shift emotional labour gender Hochschild" (too many terms). For Wikipedia (source_type "general"), 2–4 precise terms work best.
source_type string (default: auto)
Type of sources to search. Default: "auto". "academic" when you only need papers (faster, skips Wikipedia). "general" for background context and definitions (Wikipedia only). "auto" searches all backends — broadest coverage but slowest.
year_max number
Maximum publication year (inclusive). Rarely needed. Use to cap results to a specific era (e.g. pre-replication-crisis work before 2011).
year_min number
Minimum publication year (inclusive). Useful for "recent work only" queries. Caution: excludes foundational older works — omit when looking for canonical literature.
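
The query tips and cooldown advice combine into a small helper. Only the parameter names (query, min_citations, max_results, backends) come from the schema; the function and its defaults are an illustrative sketch:

```python
def canonical_search(keywords, surname=None, ss_in_cooldown=False):
    """Build fetch_context arguments for a canonical-literature query."""
    # 3-5 distinctive keywords; an author surname sharpens precision.
    terms = ([surname] if surname else []) + list(keywords)[:5]
    args = {
        "query": " ".join(terms),
        "min_citations": 50,   # the most effective noise filter
        "max_results": 10,     # enough anchor papers to feed citation_graph
    }
    if ss_in_cooldown:
        args["backends"] = ["openalex", "crossref"]  # skip SS during cooldown
    return args

q = canonical_search(["gender", "equality", "paradox"], surname="Stoet",
                     ss_in_cooldown=True)
```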