A diff doesn't tell you whether your codebase is getting better-shaped or worse-shaped. With AI agents committing dozens of edits an hour, that gap is too big to bridge by reading review threads. Raysense scores the shape directly: six structural dimensions, graded A through F, aggregated into a single 0-to-100 number. The score is published to your CI as a pass/fail gate, to a live treemap in your browser, and to the agent itself over MCP, before its next edit.
Raysense is an open-source Rust CLI and MCP server that scores codebase architecture across six structural dimensions. Each dimension grades A through F against the dependency graph and commit history of the repo. The overall score, 0 to 100, is their weighted aggregate. The score moves with structure, not with cosmetics: adding tests or shuffling files around will not lift it.
How cleanly modules separate. High when cross-module edges are rare and unidirectional; low when modules share state or the source layout draws no clear boundaries.
How close the dependency graph is to a directed acyclic graph (DAG). Cycles (A imports B, which imports A) block parallel compilation, complicate refactoring, and signal architectural decay. Detected at file and module scope.
Whether layering is appropriate. Both flat-and-tangled and excessively deep architectures lose points. Sweet spot: 3 to 6 layers with dependencies flowing in one direction and no upward-violation edges.
How evenly responsibility is distributed. A handful of god files holding most of the imports drops the score; an even spread of file-level coupling raises it. Surfaces bus-factor concentration.
Duplicated logic across the codebase. Detects copy-paste blocks and functionally equivalent definitions that could share an abstraction. High redundancy multiplies the surface area of every change.
Consistency of structural and naming patterns. Same file types laid out the same way, same handler shapes, same conventions. Inconsistency forces readers to relearn each module from scratch.
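Of the six dimensions, acyclicity is the easiest to illustrate in isolation. The following is a minimal Python sketch of file-level cycle detection using a depth-first search; it is not Raysense's implementation, and the `imports` mapping and the `find_cycle` name are hypothetical, chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical input shape: file path -> list of files it imports.
ImportGraph = Dict[str, List[str]]


def find_cycle(imports: ImportGraph) -> List[str]:
    """Return one import cycle as a path (first node repeated at the end),
    or an empty list if the graph is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color: Dict[str, int] = {node: WHITE for node in imports}
    path: List[str] = []

    def visit(node: str) -> List[str]:
        color[node] = GREY
        path.append(node)
        for dep in imports.get(node, []):
            state = color.get(dep, WHITE)
            if state == GREY:
                # dep is already on the current path, so path[dep..] + dep is a cycle.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return []

    for node in list(imports):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return []
```

Calling `find_cycle({"a.rs": ["b.rs"], "b.rs": ["a.rs"]})` returns the path `a.rs`, `b.rs`, `a.rs`. The same search works at module scope if the keys are module names instead of file paths.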
$ curl -fsSL https://raw.githubusercontent.com/RayforceDB/raysense/main/install.sh | sh
$ cargo install raysense
$ raysense . # health report
$ raysense . --check # CI gate, exits non-zero on rule violations
$ raysense . --watch # rescan + reprint on a 2s loop
$ raysense . --ui # live dashboard at http://localhost:7000
$ raysense --mcp # stdio MCP server for agents
score 82 / 100
coverage 91 / 100
structure 68 / 100
facts files=32 functions=620 calls=7065 imports=246
graph resolved=89 cycles=0 max_fan_in=53 max_fan_out=21
test_gap production=15 test_files=0 missing_nearby=15
dimensions modularity=A acyclicity=A depth=A equality=F redundancy=C uniformity=C
overall_grade B
edit_risk_files
  risk=2100.0 commits=5 max_complexity=140 bus_factor=1 tests=no src/cli.rs
  risk=342.0 commits=3 max_complexity=38 bus_factor=1 tests=no src/health.rs
  risk=147.0 commits=1 max_complexity=49 bus_factor=1 tests=no src/memory.rs
trend samples=2 score_delta=0 dimension_drift=stable
- name: Architecture gate
  run: |
    cargo install raysense
    raysense . --check
Raysense ships as a Claude Code / Cowork plugin and as a stdio MCP server compatible with any client that speaks the Model Context Protocol. One raysense install registers raysense across every Claude host on the machine: MCP server in Claude Desktop's claude_desktop_config.json, plugin in Claude Code's ~/.claude/settings.json, and marketplace seed in Cowork's per-session registry. The plugin gives the agent six edit-loop workflows (bootstrap, verify, drift, impact, query, audit) exposed three ways: model-triggered skills, user-typed slash commands (/raysense:audit), and MCP prompts in the Desktop "+" menu. Project state lives in <repo>/.raysense/, never in a global registry, so two sessions on two repositories stay strictly independent.
$ cargo install raysense
$ raysense install # auto-detect Claude Desktop, Claude Code, and Cowork
$ raysense install --desktop # only Claude Desktop (claude_desktop_config.json)
$ raysense install --code # only Claude Code (~/.claude/settings.json plugin install)
$ raysense install --cowork # only Cowork (research preview; finish in-session)
/plugin marketplace add RayforceDB/raysense
/plugin install raysense
$ cargo install raysense --force # pull the newest binary from crates.io
$ raysense install # repoint every detected host at the new binary
# Claude Code: refresh the plugin (slash commands, skills) on top of that
/plugin reinstall raysense
/reload-plugins
At session start, raysense scans the working tree, saves a baseline under .raysense/baseline/, and materializes the splayed-table memory the agent will read for the rest of the session. The first scan is the slow one. Every follow-up question is an instant columnar read against the saved tables, not a re-walk of the source.
Before non-trivial edits, the agent calls raysense_blast_radius on the file in play. The reply lists every file that imports it transitively, the cycles it sits on, and the change-coupling neighbors that historically move with it. The agent sees the structural footprint before it touches the working tree, not after the regression lands in CI.
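The "every file that imports it transitively" part of that reply amounts to reverse reachability over the import graph. The sketch below shows the idea in Python under stated assumptions: the `imports` mapping and the `transitive_importers` function are hypothetical names, and the real `raysense_blast_radius` tool also reports cycles and change-coupling neighbors, which this sketch leaves out.

```python
from collections import deque
from typing import Dict, List, Set


def transitive_importers(imports: Dict[str, List[str]], target: str) -> Set[str]:
    """Return every file that reaches `target` through one or more import edges."""
    # Invert the graph: for each file, which files import it directly?
    importers: Dict[str, Set[str]] = {}
    for src, deps in imports.items():
        for dep in deps:
            importers.setdefault(dep, set()).add(src)

    # Breadth-first walk over the inverted edges, starting from the target.
    seen: Set[str] = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in importers.get(current, ()):
            if parent != target and parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen
```

With `cli.rs` importing `health.rs` and `health.rs` importing `memory.rs`, the importers of `memory.rs` are both `health.rs` and `cli.rs`, even though `cli.rs` never imports it directly.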
After a batch of edits, verify rescans and runs the same rule set as the CI gate against the new state. It diffs the score, the dimension grades, and the rule findings against the bootstrap baseline. If a previously-clean dimension regressed (Equality went B to D), it surfaces immediately, named by file, before the agent moves on.
Periodically (daily, weekly, or pre-PR), drift compares the latest scan to the trend history across a configurable window (7d, 30d, 90d). It returns three lists: dimensions whose grades worsened, files that newly entered the top hotspots, and rule codes that newly tripped. It answers "what got worse since N days ago", a question that neither verify (single session) nor audit (no time axis) covers alone.
On request, audit returns the deeper views: full architecture report, dependency-structure matrix (DSM), evolution signals (bus factor, change-coupling pairs, temporal hotspots), test-gap ranking, dead-code detection, and cycle-break recommendations. The same data the dashboard renders, served as JSON the agent can read and quote.
Any structural question the typed tools do not directly answer becomes a Rayfall expression over the saved baseline. Filter, project, aggregate, run .graph.* algorithms (PageRank, betweenness, shortest-path), or write a Datalog rule for declarative reachability. The agent gets the same query surface a human operator has, no new vocabulary to learn.
Every saved baseline is a queryable columnar database. Agents and humans run Rayfall expressions over the call graph, the import graph, ownership history, and change coupling - select queries for filters and aggregates, .graph.* algorithms for centrality and reachability (PageRank, Louvain, topsort, shortest-path, betweenness), Datalog rules with transitive closure for declarative reachability. Drop the same expression into a .rfl file under .raysense/policies/ and it becomes a CI gate. No vendored YAML schema, no plugin SDK to learn - rules ship as code-reviewable files alongside the codebase they govern.
;; Files over 2000 lines block the merge.
;; Result table columns: severity, code, path, message.
(select {severity: "error"
         code: "huge-file"
         path: path
         message: "file exceeds 2000 lines, split before merging"
         from: files
         where: (> lines 2000)})
Agents call raysense_baseline_query with a Rayfall expression. The named baseline table is bound as t and the result returns as JSON. Three modes: select for filter/project/aggregate, .graph.* for centrality and shortest-path, Datalog for transitive reachability that mirrors blast-radius in two lines.
raysense policy check walks .raysense/policies/*.rfl, evaluates each, returns findings in the same envelope as built-in rules. Exit code 0 for pass, 1 for any policy that failed to evaluate, 2 for any error-severity finding. Wire it into a pre-commit hook or a CI gate without touching raysense's release cadence.
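A wrapper script can branch on those three exit codes. Here is a small Python sketch; the `interpret_policy_exit` helper and its outcome labels are hypothetical, and only the 0 / 1 / 2 meanings come from the description above.

```python
# Exit-code meanings as described for `raysense policy check`.
# The label strings are illustrative, not part of the tool's output.
POLICY_OUTCOMES = {
    0: "pass",
    1: "policy-failed-to-evaluate",
    2: "error-severity-finding",
}


def interpret_policy_exit(code: int) -> str:
    """Map a policy-check exit code to a label; anything else is 'unknown'."""
    return POLICY_OUTCOMES.get(code, "unknown")


def should_block_merge(code: int) -> bool:
    """Treat every non-zero exit as blocking, including unrecognised codes."""
    return code != 0
```

In a pre-commit hook, the code would typically come from something like `subprocess.run(["raysense", "policy", "check"]).returncode`. Separating 1 from 2 lets a pipeline report "the rule itself is broken" differently from "the rule found a violation".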
An .rfl policy and an interactive query reference the same baseline tables - files, module_edges, change_coupling, file_ownership, call_edges - because there is only one substrate. Promote a one-off query into a committed rule by renaming the file.
The baseline already carries change-coupling, file-ages, ownership, and rule-violation tables alongside the structural ones. Cross-time queries like "files tightly coupled in the last 60 days that sit on cycles and changed without test edits" stay one Rayfall expression, not three tools.
Drop a CSV; it joins the baseline. First row is headers, column types are inferred, and the shared symbol table means cross-table predicates like (in path (at coverage 'path)) work without ETL. Vector primitives - cos-dist, l2-dist, knn, hnsw-build, ann - are built into Rayfall, so embeddings imported alongside the structural baseline serve semantic similarity from the same query expression that drives policy gates.
$ raysense baseline import-csv coverage ./coverage.csv
imported ./coverage.csv -> .raysense/baseline/tables/coverage
$ raysense baseline query temporal_hotspots \
    '(select {from: t
              where: (in path
                         (at (select {from: coverage where: (< covered_pct 50)})
                             (quote path)))
              desc: risk_score})'
Coverage, lint counts, error budgets, runtime traces, ownership-from-elsewhere - any CSV becomes addressable from baseline query, policy check, and the MCP tools. Subsequent re-imports overwrite cleanly; the schema-version stamp keeps stale baselines from silently mis-rendering.
Pair CSV import with embeddings: cos-dist for direct similarity, knn for brute-force scans on small sets, hnsw-build + ann for sub-linear queries on >10k vectors. An .rfl policy that flags near-duplicate functions is six lines.
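To make the brute-force end of that range concrete, here is a plain-Python sketch of cosine distance and a linear-scan nearest-neighbour search. It borrows the names `cos_dist` and `knn` for readability but is not the Rayfall primitives, whose signatures are not shown on this page. It assumes cosine distance means 1 minus cosine similarity, which is the usual definition.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cos_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance, assumed here to be 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def knn(query: Sequence[float],
        vectors: Dict[str, Sequence[float]],
        k: int) -> List[Tuple[str, float]]:
    """Brute-force k nearest neighbours: score every vector, keep the k closest."""
    scored = [(name, cos_dist(query, vec)) for name, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]
```

The scan is linear in the number of stored vectors, which is why an approximate index becomes attractive as the set grows.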
Imported tables share the baseline's interned-string space, so path in coverage and path in files point to the same sym ID. Joins are predicate equality, not ETL plumbing - and the schema-version stamp catches imports against an out-of-date baseline directory.
Long Rayfall queries print throttled progress lines to stderr - op_name, phase, rows_done / rows_total, elapsed seconds, memory used. JSON callers stay byte-clean; humans on a TTY get the live signal. Quick queries (under 200ms) stay silent by design.
A treemap of the working tree color-graded A through F, served at http://localhost:7000 while you work. Tiles are sized by file weight and tinted by structural grade; click a tile for the per-file metrics that drove the color. Refreshes within the watch interval of the next save.
Diff against a saved snapshot. Simulate an edit (delete a file, break a cycle) before touching the working tree.
Scan results materialized as columnar tables. An agent's follow-up questions are instant reads, not re-scans.
One number per file, ranking which files the next agent edit is most likely to break. Combines churn, max complexity, a single-owner penalty, and a missing-tests penalty into a composite score updated on every save.
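The exact formula is not given here, but the three edit_risk_files rows in the sample report are each consistent with commits x max_complexity x 3 for a file with bus_factor=1 and tests=no. The Python sketch below reproduces those rows under that reading. Splitting the factor of 3 into a 1.5 single-owner penalty and a 2.0 missing-tests penalty is purely an assumption; other weightings would fit the same three rows.

```python
from dataclasses import dataclass

# ASSUMED weights: only their product (3.0) is consistent with the sample report.
SINGLE_OWNER_PENALTY = 1.5
MISSING_TESTS_PENALTY = 2.0


@dataclass
class FileStats:
    path: str
    commits: int          # churn
    max_complexity: int
    bus_factor: int
    has_tests: bool


def edit_risk(stats: FileStats) -> float:
    """Composite risk: churn times complexity, scaled by the two penalties."""
    risk = float(stats.commits * stats.max_complexity)
    if stats.bus_factor <= 1:
        risk *= SINGLE_OWNER_PENALTY
    if not stats.has_tests:
        risk *= MISSING_TESTS_PENALTY
    return risk
```

For `src/cli.rs` in the sample report (5 commits, max complexity 140, one owner, no tests) this gives 5 x 140 x 1.5 x 2.0 = 2100.0, matching the printed row.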
Every baseline save appends a sample. The verify step diffs against the previous one and surfaces per-dimension drift (Equality went B to D after the last session) instead of just an aggregate delta.
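Per-dimension drift can be pictured as a comparison of two grade maps. The sketch below is a hypothetical Python illustration, not the tool's code; `dimension_drift` is an invented name. It assumes only that a later letter is a worse grade.

```python
from typing import Dict, Tuple


def dimension_drift(previous: Dict[str, str],
                    current: Dict[str, str]) -> Dict[str, Tuple[str, str]]:
    """Return the dimensions whose letter grade got worse, as (old, new) pairs."""
    regressions: Dict[str, Tuple[str, str]] = {}
    for dimension, old_grade in previous.items():
        new_grade = current.get(dimension)
        # Single uppercase letters compare alphabetically, so "D" > "B" means worse.
        if new_grade is not None and new_grade > old_grade:
            regressions[dimension] = (old_grade, new_grade)
    return regressions
```

A session that leaves the overall score flat can still show up here: if equality drops from B to D while another dimension improves, the aggregate hides what the per-dimension comparison reports.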
Files where most of the churn is fix commits surface immediately. Conventional Commits prefixes (fix, hotfix, revert) drive the classifier; absolute count and ratio against total commits both feed the ranking.
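A prefix-based classifier of this kind can be sketched in a few lines of Python. The regular expression follows the Conventional Commits subject shape (type, optional scope, optional `!`, then a colon and space); the function names are hypothetical, and how Raysense weighs count against ratio is not specified here.

```python
import re
from typing import Iterable, Tuple

# Commit types treated as bug fixes, per the description above.
FIX_TYPES = {"fix", "hotfix", "revert"}

# Conventional Commits subject line: type, optional (scope), optional "!", then ": ".
SUBJECT = re.compile(r"^(?P<type>[A-Za-z]+)(\([^)]*\))?!?: ")


def is_fix_commit(subject: str) -> bool:
    """True when the subject's Conventional Commits type is a fix-like type."""
    match = SUBJECT.match(subject)
    return bool(match) and match.group("type").lower() in FIX_TYPES


def fix_density(subjects: Iterable[str]) -> Tuple[int, float]:
    """Return (absolute fix count, fix ratio against total commits) for one file."""
    subjects = list(subjects)
    fixes = sum(1 for subject in subjects if is_fix_commit(subject))
    ratio = fixes / len(subjects) if subjects else 0.0
    return fixes, ratio
```

Reporting both numbers matters because a file with two fixes out of two commits and a file with twenty fixes out of two hundred are risky in different ways.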
Files without nearby tests, ranked by structural risk. Feeds directly into the edit-risk score so untested files in churn-heavy areas float to the top.
Bus factor per file, change-coupling pairs, temporal hotspots (churn x complexity), file age windows, and bug-fix concentration. Five lenses on how the codebase has actually moved over the last 500 commits.
Eleven tree-sitter built-ins with full AST analysis, Rayfall with native S-expression extraction, plus 57 configurable plugins covering everything from Solidity to COBOL.
Tree-sitter built-ins get the full pipeline: function bodies parsed, cyclomatic and cognitive complexity, and type inheritance (9 class-based languages: Python, TypeScript, C++, Java, C#, Kotlin, Scala, Swift, Ruby). Rayfall (the RayforceDB query language) joins them at tier 3 with native S-expression extraction tuned to its LISP-like syntax: functions, imports, and table/dict types. The catalog plugins extract functions and imports via configurable prefix patterns, no AST. Every other metric on this page (edit-risk, score drift, bug-density, evolution, blast radius) works at every tier.
Add your own language at any tier via .raysense/plugins/<name>/plugin.toml. Tree-sitter grammars can be loaded dynamically with the convention tree_sitter_<name>.
Six structural dimensions of a code repository: modularity (how cleanly module boundaries hold), acyclicity (how few directed cycles exist), depth (whether layering is appropriate), equality (how evenly responsibility is distributed), redundancy (how much logic is duplicated), and uniformity (how consistent naming and structural patterns are). Each is graded A through F against the dependency graph and 500-commit history. The aggregate is one weighted 0-to-100 number.
SonarQube measures token-level quality (complexity, duplication, lints) inside individual files; Raysense measures graph-level shape across the whole repository. CodeScene focuses on hotspots in commit history; Raysense uses commit history as one input among several but its primary output is a structural grade, not a list. NDepend is a closed-source .NET-first tool with a UI-driven workflow; Raysense is a single open-source Rust binary with a stdio MCP server, so the agent can read structural state before every edit. The honest comparison page is at /compare/.
The structural dimensions (modularity, acyclicity, depth, equality, redundancy, uniformity) work on the current working tree alone. Evolution signals (bus factor, change coupling, temporal hotspots, bug-density) require git history and walk the last 500 commits by default. A repository with no git history will return a complete grade for the structural dimensions and skip the evolution section with an explicit note.
Both surfaces dispatch to the same Rust functions in src/mcp.rs and src/cli.rs. The MCP tool registry and the CLI command registry are two views over one set of typed handlers; adding a capability registers it in both lists in the same commit. There is no separate API server, no JSON-RPC translation layer, and no schema drift to debug.
Rayfall is the query language exposed by Rayforce, the columnar runtime Raysense uses for its baseline tables. Saved scan results are queryable as Rayfall expressions (filter, project, aggregate, .graph.* algorithms, Datalog rules). The same expression that runs as an ad-hoc query becomes a CI gate when dropped in .raysense/policies/*.rfl. One vocabulary, two surfaces, no YAML schema to maintain.
Yes. The CLI (raysense ., raysense . --check, raysense . --watch, raysense . --ui) is fully usable on its own as a structural-health tool. The MCP surface is additive - the same scan results back the CLI, the live dashboard, and the agent skills. Most teams start with the CI gate and the dashboard, then enable the MCP server when they introduce an agent into the loop.
Eleven tree-sitter built-ins get full AST parsing: C at tier 2 (cyclomatic and cognitive complexity), and Rust, Python, TypeScript, C++, Java, C#, Kotlin, Scala, Swift, and Ruby at tier 3 (complexity plus type-inheritance graphs). Rust additionally tracks impl Trait for Type blocks, normalized visibility (pub / pub(crate) / pub(super) / pub(in path)), use ... as alias renames, inline #[cfg(test)] module test detection, and Cargo workspace member resolution so cross-crate imports classify as Local. Rayfall (the RayforceDB query language) joins tier 3 with native S-expression extraction. The remaining 57 languages are covered by configurable prefix-pattern plugins that extract functions and imports without an AST. Every other metric on the page (edit-risk, drift, evolution, blast radius) works at every tier.
No. Raysense runs entirely on the local machine. The CLI writes scan results to .raysense/baseline/ in the project directory; the dashboard binds to localhost:7000; the MCP server speaks stdio to the local agent client. No telemetry, no usage analytics, no remote registry, no opt-out flag because there is nothing to opt out of.
The redundancy detector finds copy-paste blocks (token-shingled comparison) and functionally equivalent definitions across files. It is conservative on purpose: false positives are worse than misses for this signal because they push agents toward incorrect "deduplication" refactors. Expect ~10-25% redundancy on healthy multi-language codebases and 40%+ on repos with heavy boilerplate (generated clients, scaffolding tools, snapshot tests). The raysense_remediations tool lists the top duplicate clusters so you can verify before acting.
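For readers unfamiliar with token shingling, the Python sketch below shows the general technique: split source into tokens, slide a fixed-size window over them, and compare the resulting sets with Jaccard similarity. The tokenizer, the window size of 5, and the function names are illustrative assumptions; Raysense's actual parameters and thresholds are not stated here.

```python
import re
from typing import Set, Tuple

Shingle = Tuple[str, ...]

# Crude tokenizer: identifiers, integer literals, or any single non-space character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def shingles(source: str, k: int = 5) -> Set[Shingle]:
    """Return the set of k-token windows in `source` (empty if fewer than k tokens)."""
    tokens = TOKEN.findall(source)
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a: Set[Shingle], b: Set[Shingle]) -> float:
    """Shared shingles divided by all distinct shingles; 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Because comparison happens on tokens rather than characters, two copies of a function that differ only in whitespace score 1.0, while unrelated definitions share few or no windows. A conservative detector would presumably act only on high scores.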
Yes. Raysense is open source under the MIT license. You can use it, modify it, embed it in commercial products, and ship it inside your own toolchain without restriction. The license header on every source file states the same thing.