TypeScript library and HTTP/CLI tools for AI-assisted spreadsheet work: load Excel/CSV, ask natural-language questions, generate formulas, run structured cleaning transforms, and explain existing formulas. Built on SheetJS and the Vercel AI SDK.
Contents: Features · Installation · Configuration · Usage · HTTP API · CLI · Docker · Testing · Troubleshooting
- Formula generation — Describe logic in English; get an Excel-style formula string.
- Natural-language queries — Ask questions over tabular data (sample rows are sent to the model).
- Transforms — LLM proposes a JSON plan (filter, dedupe, fill nulls, sort, rename columns, add column placeholders); the engine applies it deterministically.
- Formula explanation — Plain-English explanation of a formula.
- Formats — Load/save XLSX, CSV, ODS, XLS (via SheetJS).
- Local-first — Files stay on disk; only prompts and excerpts go to your chosen LLM provider.
- Node.js 22+
- An API key for at least one provider: OpenAI (
OPENAI_API_KEY) and/or Anthropic (ANTHROPIC_API_KEY), depending on the model id you pass.
cd agentic-spreadsheet
npm install
npm run buildCopy .env.example to .env and set keys:
OPENAI_API_KEY=sk-...
# Optional:
# ANTHROPIC_API_KEY=
# AGENT_MODEL=openai/gpt-4o-mini
PORT=3000
# API_KEY=your-secret-key
# MAX_BODY_BYTES=1048576Model strings use a provider prefix, for example:
openai/gpt-4o-mini— requiresOPENAI_API_KEYanthropic/claude-sonnet-4-20250514— requiresANTHROPIC_API_KEY
If you omit the prefix (e.g. gpt-4o-mini), OpenAI is assumed and OPENAI_API_KEY is required.
In code, new SpreadsheetAgent({ model: 'gpt-4o-mini' }) and model: 'openai/gpt-4o-mini' are equivalent for OpenAI (the prefix only selects the provider).
| Command | Description |
|---|---|
npm run build |
Compile TypeScript to dist/ |
npm run dev |
Run the HTTP server with hot reload (tsx watch) |
npm start |
Run compiled server (node dist/api/main.js) |
npm run cli |
Interactive CLI |
npm test |
Run all Vitest tests (no API key required) |
npm run typecheck |
TypeScript check without emit |
npm run clean |
Remove the dist/ build output |
npm run test:watch |
Vitest watch mode |
npm run test:integration |
Live LLM query test (needs OPENAI_API_KEY) |
npm run test:formula |
Schema unit tests only |
npm publish runs prepack, which executes npm run build so dist/ is always included in the package tarball.
import { SpreadsheetAgent } from './src/index.js'; // or from 'agentic-spreadsheet' when installed as a dependency
import * as path from 'node:path';
const agent = new SpreadsheetAgent({
model: 'openai/gpt-4o-mini',
maxRows: 10000, // optional: trim each sheet to this many rows on load
maxContextRows: 200, // optional: rows sent to the LLM per request
createModel: myModelFactory, // optional: inject model for tests/custom providers
});
const stats = agent.getWorkbookStats(workbook);
console.log(stats.rowCount, stats.columns);
const workbook = await agent.loadWorkbook(path.join('samples', 'sales-sample.csv'), 'csv');
const result = await agent.query(workbook, 'What is the total revenue?');
console.log(result.answer);
console.log(result.formula);
const formula = await agent.generateFormula(
'Commission is 10% of sales if sales > 1000, else 5%',
{ columns: ['Sales'] }
);
const cleaned = await agent.transform(
workbook,
'Remove rows where Status equals Cancelled'
);
await agent.saveWorkbook(cleaned, 'output/cleaned.xlsx', 'xlsx');const rows = agent.getRows(workbook); // first sheet; pass { sheetName: '…' } for others
const columns = agent.getColumns(workbook);
const filtered = agent.filterRows(workbook, (row) => Number(row['Revenue']) > 1000);| Method | Purpose |
|---|---|
loadWorkbook(path | Buffer, format?) |
Load a file or buffer. For a Buffer, pass format unless it is XLSX (the default). Paths infer format from the extension. |
saveWorkbook(workbook, path, format?) |
Save as XLSX or CSV. CSV writes the first sheet only (same limitation as typical Excel CSV export). |
query(workbook, question, { sheetName? }) |
NL question → { answer, formula?, rows? }. |
generateFormula(description, { columns? }) |
Single formula string. |
transform(workbook, instruction, { sheetName? }) |
NL → structured actions → updated workbook. |
explainFormula(formula, { columns? }) |
Text explanation. |
getRows / getColumns / filterRows |
Non-LLM helpers. |
getWorkbookStats |
Row/column/sheet metadata for a workbook. |
detectFormatFromPath(path) |
Infer SheetFormat from a file path extension. |
Type SheetFormat |
'xlsx' | 'csv' | 'ods' | 'xls' |
query, transform, generateFormula, and explainFormula reject empty or whitespace-only text inputs. For query and transform, sheetName is optional (defaults to the first sheet). Unknown sheet names throw an error listing available sheet names.
The HTTP API rejects whitespace-only command values and trims formula for explain mode.
npm run dev
# or after build: npm startGET /health— Liveness check (includes version; not rate-limited).POST /spreadsheet— JSON body (validated with Zod; 1 MiB body limit by default):
Optional production settings via environment:
API_KEY— When set,POST /spreadsheetrequiresX-API-Key: …orAuthorization: Bearer …(GET /healthstays public).MAX_BODY_BYTES— Override the JSON body size limit (default1048576).
| Field | Description |
|---|---|
mode |
query (default), transform, formula, or explain |
command |
Natural language command (not required for explain) |
file |
Path under the uploads directory (required for query/transform; optional for formula/explain to supply column context) |
formula |
For mode: explain only |
sheetName |
Optional sheet name |
Responses for query and transform include a stats object (rowCount, columns, etc.). Missing upload files return 404.
Examples:
# Place a file in ./uploads (created on startup), e.g. uploads/data.csv
curl -s http://localhost:3000/health
curl -s -X POST http://localhost:3000/spreadsheet \
-H "Content-Type: application/json" \
-d '{"mode":"formula","command":"Sum column B if column A equals Yes"}'
curl -s -X POST http://localhost:3000/spreadsheet \
-H "Content-Type: application/json" \
-d '{"mode":"query","command":"What is the average revenue?","file":"data.csv"}'
curl -s -X POST http://localhost:3000/spreadsheet \
-H "Content-Type: application/json" \
-d '{"mode":"explain","formula":"=IF(A1>0,SUM(B1:B10),0)"}'POST /spreadsheet only accepts file paths that resolve inside the configured uploads directory (traversal and the uploads root itself are rejected).
npm run cliPrompts for a mode (query, transform, formula, explain), then runs an interactive loop:
- query — Ask questions over a loaded file (default).
- transform — Apply NL cleaning; saves results to
./output/cli-out-<timestamp>.xlsx. - formula — Generate formulas (optional file for column context).
- explain — Explain a formula (optional file for column context).
npm run build
docker compose build
docker compose upMount ./uploads and ./output as in docker-compose.yml. Set OPENAI_API_KEY in the environment.
npm test # Unit + API tests (no API key)
npm run typecheck # tsc --noEmit
npm run test:integration # LLM query test (skipped without OPENAI_API_KEY)
npm run test:formula # Schema unit testsCI runs on push/PR to main via GitHub Actions (.github/workflows/ci.yml).
agentic-spreadsheet/
├── src/
│ ├── SpreadsheetAgent.ts
│ ├── index.ts
│ ├── llm/model.ts
│ ├── engine/
│ ├── prompts/
│ ├── schemas/
│ ├── api/
│ │ ├── server.ts # buildServer(), routes
│ │ └── main.ts # HTTP listen
│ └── cli/
├── samples/
├── tests/
├── .github/workflows/ci.yml
├── README.md
├── Dockerfile
├── docker-compose.yml
├── .env.example
├── .gitignore
├── .dockerignore
├── package.json
├── LICENSE
├── tsconfig.json
└── vitest.config.ts
dist/ is produced by npm run build. dist/ and .vitest-cache/ (Vitest) are gitignored.
OPENAI_API_KEY is required— Export the key or add it to.envbefore calling LLM features (query,transform,generateFormula,explainFormula).- Port already in use — Set
PORT=3010(or another free port) in.env. - Integration test skipped —
npm run test:integrationonly runs whenOPENAI_API_KEYis set in the environment. - Unauthorized (401) — Set
X-API-KeyorAuthorization: BearerwhenAPI_KEYis configured. - File not found (404) — Ensure the file exists under
./uploadsand the path does not traverse outside that directory. - Wrong sheet or empty workbook —
query/transformrequire a validsheetNamewhen you do not want the first sheet; empty workbooks are rejected.
import { buildServer } from 'agentic-spreadsheet/api';
const app = await buildServer({ uploadDir: './uploads', outputDir: './output' });
await app.listen({ port: 3000, host: '0.0.0.0' });The library does not call Google APIs. You can fetch values with googleapis, build a CSV buffer, and call loadWorkbook(buffer, 'csv') — see the Google Sheets API documentation.
Telegram: @tradingtermin