Litseer¶
Multi-source academic literature search with citation walking, quality filtering, and bibliometric analysis.
Litseer is a Python CLI tool for systematic literature review in aerospace and engineering. It queries nine source adapters in parallel, deduplicates results across sources, classifies each result into a quality tier, walks citation graphs, and exports to BibTeX, JSON, or Markdown.
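To make the cross-source deduplication step concrete, here is a minimal sketch of one common approach: key each record on its normalized DOI and fall back to a normalized title when no DOI is present. This is an illustration under stated assumptions, not Litseer's actual implementation; the function names (`normalize_doi`, `title_key`, `deduplicate`), the record fields, and the example DOIs are all hypothetical.

```python
import re


def normalize_doi(doi):
    """Lowercase a DOI and strip common URL or 'doi:' prefixes."""
    if not doi:
        return None
    doi = doi.strip().lower()
    doi = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:)", "", doi)
    return doi or None


def title_key(title):
    """Reduce a title to lowercase alphanumeric words for exact matching."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Merge records that share a DOI, or a normalized title when no DOI exists.

    Hypothetical record shape: {"title": str, "doi": str | None, "source": str}.
    The first record seen for a key supplies the title; later ones only add
    their source name.
    """
    merged = {}
    for rec in records:
        doi = normalize_doi(rec.get("doi"))
        key = ("doi", doi) if doi else ("title", title_key(rec["title"]))
        if key not in merged:
            merged[key] = {"title": rec["title"], "doi": doi, "sources": [rec["source"]]}
        elif rec["source"] not in merged[key]["sources"]:
            merged[key]["sources"].append(rec["source"])
    return list(merged.values())


# Made-up records: the first two describe the same paper via different sources.
records = [
    {"title": "A Study of X", "doi": "10.1234/example.1", "source": "crossref"},
    {"title": "A study of X.", "doi": "https://doi.org/10.1234/example.1", "source": "openalex"},
    {"title": "Another Paper", "doi": None, "source": "ntrs"},
]
unique = deduplicate(records)
```

A known limitation of this sketch is that a record with a DOI and a record for the same paper without one end up under different keys, so a real tool would need an additional reconciliation pass.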
Why Litseer?¶
Systematic literature review methodology has been codified by librarians for decades (PRISMA, Wohlin snowballing, Cochrane protocols), but no widely adopted open-source tool implements these methods as reproducible software. Existing tools such as Rayyan, ASReview, and Covidence handle screening, the step that comes after you already have search results. None of them automates the search itself.
Litseer fills this gap. See the literature positioning for the full academic context.
Key Features¶
- 9 source adapters: OpenAlex, Semantic Scholar, CrossRef, NASA NTRS, IEEE Xplore, AIAA, SAE, SKYbrary, local citation graph
- Citation snowballing: Forward and backward citation walking with configurable depth, graph-aware ingestion
- Technology portfolio search: Batch search across YAML configs with cross-technology shared paper detection
- Local citation graph: DuckDB-backed graph that accumulates over time, enabling offline citation walking
- Bibliometric network analysis: Bipartite matrices for coupling, co-citation, keyword co-occurrence (R bibliometrix pattern)
- Quality filtering: Tier-based venue classification (peer-reviewed, technical, preprint, grey literature)
- Input validation: Defense-in-depth sanitization of all untrusted API data
- Deterministic: Same YAML config + same date range = same results. No built-in LLM, no randomness.
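The citation snowballing feature listed above can be pictured as a breadth-first walk over a citation graph, following references backward and citing papers forward until a depth limit is reached. The sketch below uses a plain in-memory dictionary rather than the DuckDB-backed graph, and the `snowball` function and paper IDs are hypothetical, chosen only for illustration.

```python
from collections import deque


def snowball(seed, references, depth):
    """Breadth-first citation walk in both directions, up to `depth` hops.

    `references` maps each paper ID to the list of paper IDs it cites.
    Returns {paper_id: hop_distance}, with the seed at distance 0.
    """
    # Build the reverse index once: cited paper -> papers that cite it.
    cited_by = {}
    for paper, refs in references.items():
        for ref in refs:
            cited_by.setdefault(ref, []).append(paper)

    found = {seed: 0}
    queue = deque([seed])
    while queue:
        paper = queue.popleft()
        hop = found[paper]
        if hop == depth:
            continue  # depth limit reached; do not expand further
        # Backward: papers this one cites. Forward: papers citing this one.
        # Sorting keeps the visiting order stable from run to run.
        neighbours = sorted(set(references.get(paper, [])) | set(cited_by.get(paper, [])))
        for nxt in neighbours:
            if nxt not in found:
                found[nxt] = hop + 1
                queue.append(nxt)
    return found


# Toy graph with made-up IDs: "seed" cites a and b; c cites "seed"; and so on.
references = {
    "seed": ["a", "b"],
    "c": ["seed"],
    "a": ["d"],
    "e": ["c"],
    "f": ["e"],
}
walk = snowball("seed", references, depth=2)
```

With `depth=2`, paper `f` is three hops from the seed and is therefore left out of the result.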
Quick Start¶
```bash
# Install
uv pip install -e ".[dev]"

# Run a search
litseer search examples/aerospace-review.yaml -o output/

# Citation walk from a seed paper
litseer cite-walk "10.1115/1.4045389" --depth 2

# Portfolio batch search
litseer portfolio examples/portfolio-demo/ -o output/

# Graph statistics
litseer graph stats
```
See the Usage Guide for full documentation.
Design Principles¶
- Deterministic — same inputs produce same outputs
- Verifiable — every claim traces to a specific paper, DOI, and source
- LLM-ready — structured output for external LLM consumption, no LLM dependency
- Incremental — citation graph and cache accumulate value over time
- Portable — everything runs locally, no cloud dependencies
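As an illustration of the determinism principle, the following sketch shows one way an export step can be made independent of the order in which sources return results: sort records on a stable key and serialize with sorted dictionary keys, so identical inputs yield byte-identical output. It is an assumption about how such a guarantee could be achieved, not a description of Litseer's exporter; `export_json`, `fingerprint`, and the sample records are hypothetical.

```python
import hashlib
import json


def export_json(records):
    """Serialize records so that list order and dict key order do not matter."""
    ordered = sorted(records, key=lambda r: (r.get("doi") or "", r["title"]))
    return json.dumps(ordered, sort_keys=True, ensure_ascii=False, indent=2)


def fingerprint(text):
    """SHA-256 of the exported text, useful for comparing two runs."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Two runs returning the same made-up papers in different order.
run_a = [
    {"doi": "10.1234/example.2", "title": "Second paper"},
    {"doi": "10.1234/example.1", "title": "First paper"},
]
run_b = [
    {"title": "First paper", "doi": "10.1234/example.1"},
    {"title": "Second paper", "doi": "10.1234/example.2"},
]
same = fingerprint(export_json(run_a)) == fingerprint(export_json(run_b))
```

Comparing fingerprints between runs gives a cheap check that a rerun with the same config and date range reproduced the earlier result.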
Project Status¶
v0.1.0-alpha: 468 unit tests and 13 integration tests passing. Python 3.11 to 3.14, CI via GitHub Actions.
Licensed under AGPL-3.0-or-later with commercial licensing available.