Raysense

GRADING MODEL

Six graded dimensions, one aggregate.

Raysense is an open-source Rust CLI and MCP server that scores codebase architecture across six structural dimensions. Each dimension is graded A through F against the dependency graph and commit history of the repo. The overall score, 0 to 100, is their weighted aggregate. The score moves with structure, not with cosmetics: adding tests or shuffling files around will not lift it.
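
As a rough illustration of what a "weighted aggregate" of letter grades can look like, here is a minimal sketch. The grade-to-points mapping and the equal default weights are assumptions made for this example; the page does not state Raysense's actual weights, so the result will not reproduce the tool's score.

```python
# Hypothetical mapping and weights, chosen only for illustration.
# Raysense's real values are not given in this document.
GRADE_POINTS = {"A": 100, "B": 85, "C": 70, "D": 55, "F": 30}


def aggregate(grades, weights=None):
    """Return the weighted mean of per-dimension points, rounded to an int."""
    if weights is None:
        weights = {name: 1.0 for name in grades}  # assumption: equal weights
    total_weight = sum(weights[name] for name in grades)
    weighted = sum(GRADE_POINTS[g] * weights[name] for name, g in grades.items())
    return round(weighted / total_weight)


grades = {
    "modularity": "A", "acyclicity": "A", "depth": "A",
    "equality": "F", "redundancy": "C", "uniformity": "C",
}
print(aggregate(grades))
```

Raising the weight of one dimension pulls the aggregate toward that dimension's grade, which is the only property this sketch is meant to show.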

A

Modularity

How cleanly modules separate. High when cross-module edges are rare and unidirectional; low when modules share state or no clear boundaries exist in the source layout.

B

Acyclicity

How close the dependency graph is to a directed acyclic graph (DAG). Cycles (A imports B, which imports A) block parallel compilation, complicate refactoring, and signal architectural decay. Detected at file and module scope.
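
One standard way to find import cycles is Tarjan's strongly connected components algorithm. The sketch below is a generic Python version over a hypothetical file-to-imports mapping; the page does not say which algorithm Raysense uses internally.

```python
def find_cycles(imports):
    """Return groups of files that import each other, directly or transitively.

    `imports` maps a file to the files it imports (hypothetical input shape).
    A group is reported when it has more than one member or a self-import.
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    cycles = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = low[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for dep in imports.get(node, ()):
            if dep not in index:
                visit(dep)
                low[node] = min(low[node], low[dep])
            elif dep in on_stack:
                low[node] = min(low[node], index[dep])
        if low[node] == index[node]:
            # `node` is the root of a strongly connected component.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            if len(component) > 1 or node in imports.get(node, ()):
                cycles.append(sorted(component))

    for node in list(imports):
        if node not in index:
            visit(node)
    return cycles


graph = {"a.py": ["b.py"], "b.py": ["a.py", "c.py"], "c.py": []}
print(find_cycles(graph))
```

The recursive form keeps the sketch short; a very deep import chain would need an iterative variant to stay under Python's recursion limit.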

C

Depth

Whether layering is appropriate. Both flat-and-tangled and excessively deep architectures lose points. Sweet spot: 3 to 6 layers with clear downward dependencies and no upward-violation edges.

D

Equality

How evenly responsibility is distributed. A handful of god files holding most of the imports drops the score; an even spread of file-level coupling raises it. Surfaces bus-factor concentration.
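
A common way to put a number on "how evenly something is distributed" is the Gini coefficient. The sketch below applies it to per-file import counts. Using Gini here is an assumption for illustration; the page does not name the statistic behind the Equality grade.

```python
def gini(values):
    """Gini coefficient: 0.0 for a perfectly even spread, near 1.0 when concentrated."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard closed form over ascending-sorted values with ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return (2 * weighted) / (n * total) - (n + 1) / n


even_spread = [5, 5, 5, 5]   # every file is imported equally often
god_file = [0, 0, 0, 10]     # one file receives every import
print(gini(even_spread), gini(god_file))
```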

E

Redundancy

Duplicated logic across the codebase. Detects copy-paste blocks and functionally equivalent definitions that could share an abstraction. High redundancy multiplies the surface area of every change.

F

Uniformity

Consistency of structural and naming patterns. Same file types laid out the same way, same handler shapes, same conventions. Inconsistency forces readers to relearn each module from scratch.

INSTALL

One binary. A few flags.

install (one-liner)

$ curl -fsSL https://raw.githubusercontent.com/RayforceDB/raysense/main/install.sh | sh

install

$ cargo install raysense

use

$ raysense .              # health report
$ raysense . --check      # CI gate, exits non-zero on rule violations
$ raysense . --watch      # rescan + reprint on a 2s loop
$ raysense . --ui         # live dashboard at http://localhost:7000
$ raysense --mcp          # stdio MCP server for agents

sample output (raysense .)

score             82 / 100
coverage          91 / 100
structure         68 / 100
facts             files=32 functions=620 calls=7065 imports=246
graph             resolved=89 cycles=0 max_fan_in=53 max_fan_out=21
test_gap          production=15 test_files=0 missing_nearby=15
dimensions        modularity=A acyclicity=A depth=A equality=F redundancy=C uniformity=C
overall_grade     B

edit_risk_files
  risk=2100.0 commits=5 max_complexity=140 bus_factor=1 tests=no src/cli.rs
  risk=342.0  commits=3 max_complexity=38  bus_factor=1 tests=no src/health.rs
  risk=147.0  commits=1 max_complexity=49  bus_factor=1 tests=no src/memory.rs

trend             samples=2 score_delta=0 dimension_drift=stable

CI integration (.github/workflows/ci.yml)

- name: Architecture gate
  run: |
    cargo install raysense
    raysense . --check

AGENT INTEGRATION

MCP server for any AI coding agent.

Raysense ships as a Claude Code / Cowork plugin and as a stdio MCP server compatible with any client that speaks the Model Context Protocol. One raysense install registers raysense across every Claude host on the machine: MCP server in Claude Desktop's claude_desktop_config.json, plugin in Claude Code's ~/.claude/settings.json, and marketplace seed in Cowork's per-session registry. The plugin gives the agent six edit-loop workflows (bootstrap, verify, drift, impact, query, audit) exposed three ways: model-triggered skills, user-typed slash commands (/raysense:audit), and MCP prompts in the Desktop "+" menu. Project state lives in <repo>/.raysense/, never in a global registry, so two sessions on two repositories stay strictly independent.

one shot (every detected Claude host)

$ cargo install raysense
$ raysense install              # auto-detect Claude Desktop, Claude Code, and Cowork
$ raysense install --desktop    # only Claude Desktop  (claude_desktop_config.json)
$ raysense install --code       # only Claude Code     (~/.claude/settings.json plugin install)
$ raysense install --cowork     # only Cowork          (research preview; finish in-session)

claude code plugin (manual route, equivalent to `raysense install --code`)

/plugin marketplace add RayforceDB/raysense
/plugin install raysense

update

$ cargo install raysense --force   # pull the newest binary from crates.io
$ raysense install                 # repoint every detected host at the new binary

# Claude Code: refresh the plugin (slash commands, skills) on top of that
/plugin reinstall raysense
/reload-plugins

1

Bootstrap

At session start, raysense scans the working tree, saves a baseline under .raysense/baseline/, and materializes the splayed-table memory the agent will read for the rest of the session. The first scan is the slow one. Every follow-up question is an instant columnar read against the saved tables, not a re-walk of the source.

2

Impact

Before non-trivial edits, the agent calls raysense_blast_radius on the file in play. The reply lists every file that imports it transitively, the cycles it sits on, and the change-coupling neighbors that historically move with it. The agent sees the structural footprint before it touches the working tree, not after the regression lands in CI.
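
The "every file that imports it transitively" part of that reply amounts to a reverse reachability search over the import graph. A minimal sketch, assuming a hypothetical file-to-imports mapping as input (this is not the raysense_blast_radius implementation, and it leaves out cycles and change coupling):

```python
from collections import defaultdict, deque


def transitive_importers(imports, target):
    """Return every file that reaches `target` through one or more imports."""
    # Invert the graph: for each file, who imports it directly?
    importers = defaultdict(set)
    for src, deps in imports.items():
        for dep in deps:
            importers[dep].add(src)

    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in importers.get(current, ()):
            if parent not in seen and parent != target:
                seen.add(parent)
                queue.append(parent)
    return sorted(seen)


imports = {
    "cli": ["health"],
    "report": ["health"],
    "health": ["memory"],
    "memory": [],
    "util": [],
}
print(transitive_importers(imports, "memory"))
```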

3

Verify

After a batch of edits, verify rescans and runs the same rule set as the CI gate against the new state. It diffs the score, the dimension grades, and the rule findings against the bootstrap baseline. If a previously-clean dimension regressed (Equality went B to D), it surfaces immediately, named by file, before the agent moves on.
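
The per-dimension part of that diff can be pictured as a comparison of two grade maps. This is a sketch of the idea only; the dictionary shape and the function name are assumptions, not the verify output format.

```python
GRADE_ORDER = "ABCDEF"  # earlier letter = better grade


def regressions(baseline, current):
    """Return {dimension: (before, after)} for every grade that got worse."""
    worse = {}
    for name, before in baseline.items():
        after = current.get(name, before)
        if GRADE_ORDER.index(after) > GRADE_ORDER.index(before):
            worse[name] = (before, after)
    return worse


baseline = {"modularity": "A", "equality": "B", "redundancy": "C"}
current = {"modularity": "A", "equality": "D", "redundancy": "B"}
print(regressions(baseline, current))
```

Improvements (redundancy going C to B above) are deliberately left out, since the step described here is about surfacing regressions.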

4

Drift

Periodically (daily, weekly, or pre-PR), drift compares the latest scan to the trend history across a configurable window (7d, 30d, 90d). It returns three lists: dimensions whose grades worsened, files that newly entered the top hotspots, and rule codes that newly tripped. It answers "what got worse since N days ago", a question neither verify (single session) nor audit (no time axis) covers alone.

5

Audit

On request, audit returns the deeper views: full architecture report, dependency-structure matrix (DSM), evolution signals (bus factor, change-coupling pairs, temporal hotspots), test-gap ranking, dead-code detection, and cycle-break recommendations. The same data the dashboard renders, served as JSON the agent can read and quote.

6

Query

Any structural question the typed tools do not directly answer becomes a Rayfall expression over the saved baseline. Filter, project, aggregate, run .graph.* algorithms (PageRank, betweenness, shortest-path), or write a Datalog rule for declarative reachability. The agent gets the same query surface a human operator has, no new vocabulary to learn.

QUERY + POLICY

Architectural rules as code, not config.

Every saved baseline is a queryable columnar database. Agents and humans run Rayfall expressions over the call graph, the import graph, ownership history, and change coupling - select queries for filters and aggregates, .graph.* algorithms for centrality and reachability (PageRank, Louvain, topsort, shortest-path, betweenness), Datalog rules with transitive closure for declarative reachability. Drop the same expression into a .rfl file under .raysense/policies/ and it becomes a CI gate. No vendored YAML schema, no plugin SDK to learn - rules ship as code-reviewable files alongside the codebase they govern.

.raysense/policies/no-huge-files.rfl

;; Files over 2000 lines block the merge.
;; Result table columns: severity, code, path, message.
(select {severity: "error"
         code:     "huge-file"
         path:     path
         message:  "file exceeds 2000 lines, split before merging"
         from:     files
         where:    (> lines 2000)})

Ad-hoc Rayfall queries

Agents call raysense_baseline_query with a Rayfall expression. The named baseline table is bound as t and the result returns as JSON. Three modes: select for filter/project/aggregate, .graph.* for centrality and shortest-path, Datalog for transitive reachability that mirrors blast-radius in two lines.

Pinned policies

raysense policy check walks .raysense/policies/*.rfl, evaluates each, returns findings in the same envelope as built-in rules. Exit code 0 for pass, 1 for any policy that failed to evaluate, 2 for any error-severity finding. Wire it into a pre-commit hook or a CI gate without touching raysense's release cadence.
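
For a hook written in Python, the three exit codes described above can be mapped to messages with a small wrapper. The helper names are hypothetical, and running the check from the repository root via `cwd` is an assumption about how the command locates .raysense/policies/.

```python
import subprocess

# Exit-code meanings as described in the paragraph above.
MEANINGS = {
    0: "pass",
    1: "a policy failed to evaluate",
    2: "error-severity finding",
}


def describe(code):
    """Translate a `raysense policy check` exit code into a short message."""
    return MEANINGS.get(code, f"unexpected exit code {code}")


def run_policy_check(repo="."):
    """Run the check in `repo` (assumed to be the repository root)."""
    result = subprocess.run(["raysense", "policy", "check"], cwd=repo)
    return result.returncode, describe(result.returncode)


# Example use inside a hook script:
#     code, message = run_policy_check()
#     print(message)
#     raise SystemExit(code)
```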

One vocabulary, two surfaces

An .rfl policy and an interactive query reference the same baseline tables - files, module_edges, change_coupling, file_ownership, call_edges - because there is only one substrate. Promote a one-off query into a committed rule by renaming the file.

Composable across history

The baseline already carries change-coupling, file-ages, ownership, and rule-violation tables alongside the structural ones. Cross-time queries like "files tightly coupled in the last 60 days that sit on cycles and changed without test edits" stay one Rayfall expression, not three tools.

BRING YOUR OWN DATA

Coverage. Lint counts. Embeddings. Same query language.

Drop a CSV; it joins the baseline. First row is headers, column types are inferred, and the shared symbol table means cross-table predicates like (in path (at coverage 'path)) work without ETL. Vector primitives - cos-dist, l2-dist, knn, hnsw-build, ann - are built into Rayfall, so embeddings imported alongside the structural baseline serve semantic similarity from the same query expression that drives policy gates.

join coverage.csv with raysense's own temporal hotspots

$ raysense baseline import-csv coverage ./coverage.csv
  imported ./coverage.csv -> .raysense/baseline/tables/coverage

$ raysense baseline query temporal_hotspots \
    '(select {from: t
              where: (in path
                         (at (select {from: coverage where: (< covered_pct 50)})
                             (quote path)))
              desc: risk_score})'
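
For readers who do not yet read Rayfall, the query above appears to express the following: keep the hotspot rows whose path shows up in the coverage table with covered_pct below 50, then sort by risk_score, highest first. The Python sketch below restates that reading with plain lists of dicts; the sample rows are invented for illustration.

```python
def low_coverage_hotspots(hotspots, coverage, threshold=50):
    """Hotspot rows whose path has coverage below `threshold`, riskiest first."""
    low = {row["path"] for row in coverage if row["covered_pct"] < threshold}
    matches = [row for row in hotspots if row["path"] in low]
    return sorted(matches, key=lambda row: row["risk_score"], reverse=True)


# Invented sample data, not real scan output.
hotspots = [
    {"path": "src/health.rs", "risk_score": 342.0},
    {"path": "src/cli.rs", "risk_score": 2100.0},
    {"path": "src/memory.rs", "risk_score": 147.0},
]
coverage = [
    {"path": "src/cli.rs", "covered_pct": 12},
    {"path": "src/health.rs", "covered_pct": 48},
    {"path": "src/memory.rs", "covered_pct": 90},
]
for row in low_coverage_hotspots(hotspots, coverage):
    print(row["path"], row["risk_score"])
```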

External signals as tables

Coverage, lint counts, error budgets, runtime traces, ownership-from-elsewhere - any CSV becomes addressable from baseline query, policy check, and the MCP tools. Subsequent re-imports overwrite cleanly; the schema-version stamp keeps stale baselines from silently mis-rendering.

Vector search as policy

Pair CSV import with embeddings: cos-dist for direct similarity, knn for brute-force scans on small sets, hnsw-build + ann for sub-linear queries on >10k vectors. An .rfl policy that flags near-duplicate functions is six lines.
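
To make the two simplest primitives concrete, here is what cosine distance and a brute-force nearest-neighbor scan compute, written in plain Python. The function names mirror the Rayfall primitive names for readability only; this is not Rayfall's implementation, and the vectors are invented.

```python
import math


def cos_dist(a, b):
    """Cosine distance: 0.0 for the same direction, 1.0 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 1.0  # convention chosen here for zero vectors
    return 1.0 - dot / norm


def knn(query, vectors, k):
    """Names of the k vectors closest to `query`, found by scanning all of them."""
    ranked = sorted(vectors.items(), key=lambda item: cos_dist(query, item[1]))
    return [name for name, _ in ranked[:k]]


# Invented two-dimensional "embeddings" for three functions.
embeddings = {"parse_a": [1.0, 0.1], "render": [0.0, 1.0], "undo": [-1.0, 0.0]}
print(knn([1.0, 0.0], embeddings, 2))
```

The scan touches every vector on each query, which is why an index such as HNSW becomes attractive once the set grows large.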

One sym table, one schema

Imported tables share the baseline's interned-string space, so path in coverage and path in files point to the same sym ID. Joins are predicate equality, not ETL plumbing - and the schema-version stamp catches imports against an out-of-date baseline directory.

Live progress on stderr

Long Rayfall queries print throttled progress lines to stderr - op_name, phase, rows_done / rows_total, elapsed seconds, memory used. JSON callers stay byte-clean; humans on a TTY get the live signal. Quick queries (under 200ms) stay silent by design.

CAPABILITIES

Edit-risk, drift, bug density - one number per file.

Live treemap dashboard

A treemap of the working tree color-graded A through F, served at http://localhost:7000 while you work. Tiles are sized by file weight and tinted by structural grade; click a tile for the per-file metrics that drove the color. Refreshes within the watch interval of the next save.

Baselines and what-if

Diff against a saved snapshot. Simulate an edit (delete a file, break a cycle) before touching the working tree.

Splayed-table agent memory

Scan results materialized as columnar tables. An agent's follow-up questions are instant reads, not re-scans.

Edit-risk per file

One number per file, ranking which files the next agent edit is most likely to break. Combines churn, max complexity, single-owner penalty, and missing-tests penalty into a composite score updated on every save.
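
The three edit_risk_files rows in the sample output are consistent with risk = commits x max_complexity x 3 when bus_factor=1 and tests=no. The sketch below reproduces those rows under that reading. The split of the factor 3 into a 1.5 single-owner penalty and a 2.0 missing-tests penalty is a guess made for illustration, and the real formula may differ for files with tests or with several owners.

```python
# Hypothetical penalty values: only their product (3.0) is suggested by the
# sample rows; the individual numbers are assumptions.
SINGLE_OWNER_PENALTY = 1.5
MISSING_TESTS_PENALTY = 2.0


def edit_risk(commits, max_complexity, bus_factor, has_tests):
    """Composite risk: churn times complexity, scaled by two penalties."""
    risk = float(commits * max_complexity)
    if bus_factor <= 1:
        risk *= SINGLE_OWNER_PENALTY
    if not has_tests:
        risk *= MISSING_TESTS_PENALTY
    return risk


# Inputs taken from the sample output rows.
print(edit_risk(5, 140, 1, False))  # src/cli.rs
print(edit_risk(3, 38, 1, False))   # src/health.rs
print(edit_risk(1, 49, 1, False))   # src/memory.rs
```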

Score drift per session

Every baseline save appends a sample. The verify step diffs against the previous one and surfaces per-dimension drift (Equality went B to D after the last session) instead of just an aggregate delta.

Bug-density per file

Files where most of the churn is fix commits surface immediately. Conventional Commits prefixes (fix, hotfix, revert) drive the classifier; absolute count and ratio against total commits both feed the ranking.
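
A prefix-driven classifier of this kind can be sketched in a few lines. The regular expression below accepts the three prefixes named above with an optional scope and optional "!" marker; the exact matching rules Raysense applies are not stated on this page, so treat the pattern as an assumption.

```python
import re

# fix: ..., fix(scope): ..., hotfix!: ..., revert: ...
FIX_PREFIX = re.compile(r"^(fix|hotfix|revert)(\([^)]*\))?!?:", re.IGNORECASE)


def is_fix(subject):
    """True when a commit subject starts with a fix-like prefix."""
    return FIX_PREFIX.match(subject.strip()) is not None


def bug_density(subjects):
    """Return (fix_count, fix_ratio) for one file's commit subjects."""
    fixes = sum(1 for subject in subjects if is_fix(subject))
    ratio = fixes / len(subjects) if subjects else 0.0
    return fixes, ratio


history = ["fix: null deref", "feat: add flag", "revert: bad merge", "docs: typo"]
print(bug_density(history))
```

Returning both the count and the ratio mirrors the two inputs the paragraph above says feed the ranking.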

Test gap detection

Files without nearby tests, ranked by structural risk. Feeds directly into the edit-risk score so untested files in churn-heavy areas float to the top.

Evolution signal

Bus factor per file, change-coupling pairs, temporal hotspots (churn x complexity), file age windows, and bug-fix concentration. Five lenses on how the codebase has actually moved over the last 500 commits.
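
Change coupling is usually derived by counting how often two files appear in the same commit. A minimal sketch of that counting, assuming each commit is given as a list of touched paths (the min_shared threshold is a hypothetical parameter, not a Raysense setting):

```python
from collections import Counter
from itertools import combinations


def change_coupling(commits, min_shared=2):
    """Return [((file_a, file_b), shared_commits)] for pairs that co-change often."""
    pairs = Counter()
    for files in commits:
        # Sort so that (a, b) and (b, a) count as the same pair.
        for a, b in combinations(sorted(set(files)), 2):
            pairs[(a, b)] += 1
    return [(pair, n) for pair, n in pairs.most_common() if n >= min_shared]


commits = [
    ["src/cli.rs", "src/health.rs"],
    ["src/cli.rs", "src/health.rs", "src/memory.rs"],
    ["src/memory.rs"],
]
print(change_coupling(commits))
```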

69 languages out of the box

Eleven tree-sitter built-ins with full AST analysis, Rayfall with native S-expression extraction, plus 57 configurable plugins covering everything from Solidity to COBOL. See the coverage map →

LANGUAGE COVERAGE

69 languages, three tiers of analysis depth.

Tree-sitter built-ins get the full pipeline: function bodies parsed, cyclomatic and cognitive complexity, and type inheritance (9 class-based languages: Python, TypeScript, C++, Java, C#, Kotlin, Scala, Swift, Ruby). Rayfall (the RayforceDB query language) joins them at tier 3 with native S-expression extraction tuned to its LISP-like syntax: functions, imports, and table/dict types. The catalog plugins extract functions and imports via configurable prefix patterns, no AST. Every other metric on this page (edit-risk, score drift, bug-density, evolution, blast radius) works at every tier.

Legend: Full AST + type inheritance · Full AST + complexity · Configurable plugin

Assembly

C

C#

C++

Clojure

CMake

COBOL

CoffeeScript

Crystal

CSS

D

Dart

Dockerfile

Elixir

Elm

Erlang

F#

Fortran

GDScript

GLSL

Go

Gradle

GraphQL

Groovy

Haskell

HCL

HTML

Java

JSON

Jsonnet

Julia

Kotlin

Lisp

Lua

Make

Markdown

MATLAB

Nim

Objective-C

OCaml

Pascal

Perl

PHP

PowerShell

Proto

Python

R

Rayfall

ReScript

Ruby

Rust

Scala

Scheme

Shell

Solidity

SQL

Svelte

Swift

Terraform

Thrift

TOML

TypeScript

V

VB

Vue

Vyper

XML

YAML

Zig

Add your own language at any tier via .raysense/plugins/<name>/plugin.toml. Tree-sitter grammars can be loaded dynamically with the convention tree_sitter_<name>.

FAQ

Common questions about scoring, scope, and integration.

What does Raysense actually measure?

Six structural dimensions of a code repository: modularity (how cleanly module boundaries hold), acyclicity (how few directed cycles exist), depth (whether layering is appropriate), equality (how evenly responsibility is distributed), redundancy (how much logic is duplicated), and uniformity (how consistent naming and structural patterns are). Each is graded A through F against the dependency graph and 500-commit history. The aggregate is one weighted 0-to-100 number.

How is Raysense different from SonarQube, CodeScene, or NDepend?

SonarQube measures token-level quality (complexity, duplication, lints) inside individual files; Raysense measures graph-level shape across the whole repository. CodeScene focuses on hotspots in commit history; Raysense uses commit history as one input among several but its primary output is a structural grade, not a list. NDepend is a closed-source .NET-first tool with a UI-driven workflow; Raysense is a single open-source Rust binary with a stdio MCP server, so the agent can read structural state before every edit. The honest comparison page is at /compare/.

Does the score require git history?

The structural dimensions (modularity, acyclicity, depth, equality, redundancy, uniformity) work on the current working tree alone. Evolution signals (bus factor, change coupling, temporal hotspots, bug-density) require git history and walk the last 500 commits by default. A repository with no git history will return a complete grade for the structural dimensions and skip the evolution section with an explicit note.

How does the MCP server stay in sync with the CLI tools?

Both surfaces dispatch to the same Rust functions in src/mcp.rs and src/cli.rs. The MCP tool registry and the CLI command registry are two views over one set of typed handlers; adding a capability registers it in both lists in the same commit. There is no separate API server, no JSON-RPC translation layer, and no schema drift to debug.

What is Rayfall and why is it needed?

Rayfall is the query language exposed by Rayforce, the columnar runtime Raysense uses for its baseline tables. Saved scan results are queryable as Rayfall expressions (filter, project, aggregate, .graph.* algorithms, Datalog rules). The same expression that runs as an ad-hoc query becomes a CI gate when dropped in .raysense/policies/*.rfl. One vocabulary, two surfaces, no YAML schema to maintain.

Can I use Raysense without an AI agent?

Yes. The CLI (raysense ., raysense . --check, raysense . --watch, raysense . --ui) is fully usable on its own as a structural-health tool. The MCP surface is additive - the same scan results back the CLI, the live dashboard, and the agent skills. Most teams start with the CI gate and the dashboard, then enable the MCP server when they introduce an agent into the loop.

What languages get full AST analysis?

Eleven tree-sitter built-ins get full AST parsing: C at tier 2 (cyclomatic and cognitive complexity), and Rust, Python, TypeScript, C++, Java, C#, Kotlin, Scala, Swift, and Ruby at tier 3 (complexity plus type-inheritance graphs). Rust additionally tracks impl Trait for Type blocks, normalized visibility (pub / pub(crate) / pub(super) / pub(in path)), use ... as alias renames, inline #[cfg(test)] module test detection, and Cargo workspace member resolution so cross-crate imports classify as Local. Rayfall (the RayforceDB query language) joins tier 3 with native S-expression extraction. The remaining 57 languages are covered by configurable prefix-pattern plugins that extract functions and imports without an AST. Every other metric on the page (edit-risk, drift, evolution, blast radius) works at every tier.

Does Raysense send any data anywhere?

No. Raysense runs entirely on the local machine. The CLI writes scan results to .raysense/baseline/ in the project directory; the dashboard binds to localhost:7000; the MCP server speaks stdio to the local agent client. No telemetry, no usage analytics, no remote registry, no opt-out flag because there is nothing to opt out of.

How accurate is the redundancy dimension?

The redundancy detector finds copy-paste blocks (token-shingled comparison) and functionally equivalent definitions across files. It is conservative on purpose: false positives are worse than misses for this signal because they push agents toward incorrect "deduplication" refactors. Expect ~10-25% redundancy on healthy multi-language codebases and 40%+ on repos with heavy boilerplate (generated clients, scaffolding tools, snapshot tests). The raysense_remediations tool lists the top duplicate clusters so you can verify before acting.
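
Token-shingled comparison generally means splitting each snippet into overlapping runs of k tokens and measuring how many runs two snippets share. The sketch below uses a simple regex tokenizer and Jaccard similarity; the tokenizer, the value of k, and the similarity measure are assumptions, since the page does not give the detector's parameters.

```python
import re


def shingles(source, k=5):
    """Set of every run of k consecutive tokens in `source`."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", source)
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a, b, k=5):
    """Shared shingles divided by all shingles; 1.0 means token-identical."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


original = "def add(a, b): return a + b"
reformatted = "def add(a,b):   return a+b"   # same tokens, different spacing
unrelated = "for item in queue: process(item)"
print(jaccard(original, reformatted), jaccard(original, unrelated))
```

Working on tokens rather than raw text is what lets the comparison ignore whitespace-only differences.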

Is it free for commercial use?

Yes. Raysense is open source under the MIT license. You can use it, modify it, embed it in commercial products, and ship it inside your own toolchain without restriction. The license header on every source file states the same thing.

BUILT ON

The splayed-table agent memory, the baseline tables you can query back, and the columnar storage behind the live dashboard are all powered by Rayforce, an in-memory analytics runtime optimized for graph-shaped queries. It is open source and linked statically, so there is nothing extra to install.

Visit Rayforce →