Commit 821050b

Authored by bhauman (Bruce Hauman) and a co-author
Add deps_grep and deps_read tools for searching dependency jars (#147)
* Add deps_grep and deps_read tools for searching dependency jars

  - deps_grep: Search patterns in dependency jars on the classpath
    - Uses `clojure -Spath` to resolve exact dependency jars (cached)
    - Searches inside jars with unzip + regex matching
    - Supports glob/type filters, output modes (content/files/count)
    - Returns jar:entry paths for use with deps_read
  - deps_read: Read files from inside dependency jars
    - Takes file_path in jar:entry format (from deps_grep results)
    - Supports offset/limit for large files (mirrors read_file API)
    - Returns content with line numbers

* Improve deps tools output formatting and add sources jar support

  - Format line numbers with arrow (→) to match Read tool style
  - Remove header from deps_read output (just content with line numbers)
  - deps_grep now also searches *-sources.jar files when available
  - Enables searching Java source code in dependencies

* Add lazy Java source jar downloading for deps_grep

  - New deps-sources namespace for Maven coordinate parsing and source jar downloading
  - Downloads sources from Maven Central to ~/.clojure-mcp/deps_cache/
  - Negative cache tracks jars without sources to avoid repeated download attempts
  - Only fetches Java sources when --type java or --glob "*.java" is specified
  - Memoizes jar lists by [project-dir java-sources?] for fast subsequent lookups
  - Parallel downloads using pmap for performance

* Improve deps_grep robustness and fix reviewer feedback

  - Use ripgrep for searching with context/multiline support, fallback to Clojure regex
  - Fix path separator for Windows compatibility (use File/pathSeparator)
  - Add shared binary-available? utility in clojure-mcp.utils.shell
  - Only cache 404s in negative cache, not transient network errors
  - Add binary availability checks with helpful error messages
  - Document external dependencies (clojure, unzip, rg, curl)
  - Consolidate binary checking across grep, deps_grep, deps_sources

* Improve binary availability probes with tool-specific args

  - binary-available? now accepts optional probe args (defaults to --help)
  - Use -Sdescribe for clojure, -v for unzip

* Add required library filter to deps_grep and new deps_list tool

  deps_grep now requires a library parameter (Maven group or group/artifact)
  to scope searches to specific dependencies, avoiding bulk source jar
  downloads. deps_list lets users discover available library coordinates.

* Improve deps tools robustness and fix reviewer feedback

  Replace unzip/bash/curl shell dependencies with pure Java implementations
  for cross-platform compatibility. Add shared jar-utils namespace using
  java.util.zip.ZipFile for jar reading. Add Java HTTP fallback for source
  jar downloads with proper connection cleanup. Fix Maven path regex to
  handle Windows backslashes.

* Fix reviewer feedback on deps tools

  - Validate offset/limit in deps_read (reject negative values)
  - Escape regex metacharacters in glob-to-regex conversion
  - Wrap invalid regex pattern in deps_list with helpful error
  - Add missing clojure.string require in deps_list tool

---------

Co-authored-by: Bruce Hauman <bhauman@gmail.com>
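The commit message describes deps_read as returning content with arrow-style line numbers and honoring offset/limit. The deps_read source is not part of this excerpt, so the following is only a minimal sketch of that idea. The function name `number-lines`, the zero-based `offset`, and the exact `N→text` layout are assumptions for illustration, not the tool's actual output format.

```clojure
(require '[clojure.string :as str])

(defn number-lines
  "Hypothetical helper: skip `offset` lines, keep at most `limit` lines,
   and prefix each kept line with its 1-based line number and an arrow."
  [content offset limit]
  (->> (str/split-lines content)
       (map-indexed (fn [idx line] [(inc idx) line])) ; pair each line with its number
       (drop offset)
       (take limit)
       (map (fn [[n line]] (str n "→" line)))
       (str/join "\n")))

(println (number-lines "a\nb\nc\nd" 1 2))
```

Numbering before dropping keeps the printed numbers aligned with the original file, which is what makes an offset window useful for follow-up reads.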
1 parent 0b8be09 commit 821050b

11 files changed

Lines changed: 979 additions & 18 deletions


src/clojure_mcp/tools.clj

Lines changed: 4 additions & 1 deletion
@@ -10,7 +10,10 @@
    'clojure-mcp.tools.grep.tool/grep-tool
    'clojure-mcp.tools.glob-files.tool/glob-files-tool
    'clojure-mcp.tools.project.tool/inspect-project-tool
-   'clojure-mcp.tools.nrepl-ports.tool/list-nrepl-ports-tool])
+   'clojure-mcp.tools.nrepl-ports.tool/list-nrepl-ports-tool
+   'clojure-mcp.tools.deps-grep.tool/deps-grep-tool
+   'clojure-mcp.tools.deps-read.tool/deps-read-tool
+   'clojure-mcp.tools.deps-list.tool/deps-list-tool])
 
 (def eval-tool-syms
   "Symbols for evaluation tool creation functions"
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+(ns clojure-mcp.tools.deps-common.jar-utils
+  "Pure Java utilities for reading jar/zip files.
+   Provides cross-platform alternatives to shelling out to `unzip`."
+  (:require
+   [clojure.java.io :as io]
+   [taoensso.timbre :as log])
+  (:import
+   (java.util.zip ZipFile ZipException)))
+
+(defn list-jar-entries
+  "List all entries in a jar file using java.util.zip.ZipFile.
+   Returns a vector of entry path strings, or nil on error."
+  [jar-path]
+  (try
+    (with-open [zf (ZipFile. (str jar-path))]
+      (let [entries (enumeration-seq (.entries zf))]
+        (mapv #(.getName %) entries)))
+    (catch ZipException e
+      (log/debug "Failed to read jar (ZipException):" jar-path (.getMessage e))
+      nil)
+    (catch java.io.FileNotFoundException e
+      (log/debug "Jar file not found:" jar-path (.getMessage e))
+      nil)
+    (catch Exception e
+      (log/debug "Failed to list jar entries:" jar-path (.getMessage e))
+      nil)))
+
+(defn read-jar-entry
+  "Read a single entry from a jar file as a string using java.util.zip.ZipFile.
+   Returns the content string, or nil if the entry is not found or on error."
+  [jar-path entry-path]
+  (try
+    (with-open [zf (ZipFile. (str jar-path))]
+      (when-let [entry (.getEntry zf entry-path)]
+        (with-open [is (.getInputStream zf entry)]
+          (slurp is))))
+    (catch ZipException e
+      (log/debug "Failed to read jar entry (ZipException):" entry-path "in" jar-path (.getMessage e))
+      nil)
+    (catch java.io.FileNotFoundException e
+      (log/debug "Jar file not found:" jar-path (.getMessage e))
+      nil)
+    (catch Exception e
+      (log/debug "Failed to read jar entry:" entry-path "in" jar-path (.getMessage e))
+      nil)))
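To try the `java.util.zip.ZipFile` approach used by jar-utils without a real dependency on disk, a throwaway jar can be built in a temp file and read back. This is a minimal sketch: `make-demo-jar`, `entry-names`, and `entry-text` are illustrative names (not functions from the commit), and logging and error handling are left out.

```clojure
(require '[clojure.java.io :as io])
(import '(java.util.zip ZipFile ZipOutputStream ZipEntry)
        '(java.io File))

(defn make-demo-jar
  "Write a temporary jar containing a single text entry; returns its path."
  []
  (let [f (File/createTempFile "demo" ".jar")]
    (with-open [zos (ZipOutputStream. (io/output-stream f))]
      (.putNextEntry zos (ZipEntry. "demo/core.clj"))
      (.write zos (.getBytes "(ns demo.core)\n" "UTF-8"))
      (.closeEntry zos))
    (.deleteOnExit f)
    (.getPath f)))

(defn entry-names
  "List entry names, the same way list-jar-entries walks the ZipFile."
  [jar-path]
  (with-open [zf (ZipFile. (str jar-path))]
    (mapv #(.getName ^ZipEntry %) (enumeration-seq (.entries zf)))))

(defn entry-text
  "Read one entry as a string; nil when the entry does not exist."
  [jar-path entry-path]
  (with-open [zf (ZipFile. (str jar-path))]
    (when-let [entry (.getEntry zf entry-path)]
      (with-open [is (.getInputStream zf entry)]
        (slurp is)))))

(let [jar (make-demo-jar)]
  (println (entry-names jar))
  (println (entry-text jar "demo/core.clj")))
```

Both readers realize their result (`mapv`, `slurp`) before `with-open` closes the `ZipFile`, which is why nothing lazy escapes the open handle.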
Lines changed: 310 additions & 0 deletions
@@ -0,0 +1,310 @@
+(ns clojure-mcp.tools.deps-grep.core
+  "Core implementation for searching dependency jars.
+   Uses clojure CLI for classpath resolution and ripgrep for searching."
+  (:require
+   [clojure.string :as str]
+   [clojure.java.shell :as shell]
+   [clojure.java.io :as io]
+   [clojure-mcp.tools.deps-common.jar-utils :as jar-utils]
+   [clojure-mcp.tools.deps-sources.core :as deps-sources]
+   [clojure-mcp.utils.shell :as shell-utils]
+   [taoensso.timbre :as log]))
+
+;; Cache for base classpath jars, keyed by project directory
+(def ^:private classpath-cache (atom {}))
+
+;; Cache for library-filtered jars with sources, keyed by [project-dir library java-sources?]
+(def ^:private library-jars-cache (atom {}))
+
+(defn rg-available?
+  "Check if ripgrep (rg) is available on the system."
+  []
+  (shell-utils/binary-available? "rg"))
+
+(defn check-required-binaries!
+  "Check that required binaries are available. Returns nil if all present,
+   or an error map with :error and :missing-binaries keys."
+  []
+  (let [required {"clojure" ["-Sdescribe"]}
+        missing (->> required
+                     (keep (fn [[bin args]]
+                             (when-not (apply shell-utils/binary-available? bin args)
+                               bin))))]
+    (when (seq missing)
+      {:error (str "Required binaries not found: " (str/join ", " missing)
+                   ". Please install them to use deps_grep.")
+       :missing-binaries (vec missing)})))
+
+(defn get-classpath-jars
+  "Run `clojure -Spath` in the given directory and return a vector of jar paths.
+   Returns nil if classpath resolution fails."
+  [project-dir]
+  (log/debug "Resolving classpath for:" project-dir)
+  (try
+    (let [result (shell/with-sh-dir project-dir
+                   (shell/sh "clojure" "-Spath"))]
+      (if (zero? (:exit result))
+        (let [classpath (:out result)
+              ;; Use platform-specific path separator (: on Unix, ; on Windows)
+              path-sep (re-pattern java.io.File/pathSeparator)
+              jars (->> (str/split classpath path-sep)
+                        (filter #(str/ends-with? % ".jar"))
+                        (filter #(.exists (io/file %)))
+                        vec)]
+          (log/debug "Found" (count jars) "jars on classpath")
+          jars)
+        (do
+          (log/warn "clojure -Spath failed:" (:err result))
+          nil)))
+    (catch Exception e
+      (log/error e "Failed to resolve classpath")
+      nil)))
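The classpath-splitting step can be exercised on its own, without invoking the `clojure` CLI. The sketch below is not the commit's code: `classpath->jars` is an illustrative name, the `.exists` check is omitted so it works on made-up paths, and the `str/trim` and `Pattern/quote` calls are defensive additions that assume the command output may end with a newline.

```clojure
(require '[clojure.string :as str])
(import '(java.io File) '(java.util.regex Pattern))

(defn classpath->jars
  "Split a classpath string on the platform separator and keep only jars."
  [classpath]
  (let [sep (re-pattern (Pattern/quote File/pathSeparator))]
    (->> (str/split (str/trim classpath) sep) ; trim so the last path has no trailing newline
         (filterv #(str/ends-with? % ".jar")))))

;; Build a sample classpath with whatever separator this platform uses.
(def sample-classpath
  (str (str/join File/pathSeparator ["src" "/m2/a.jar" "/m2/b.jar"]) "\n"))

(println (classpath->jars sample-classpath))
```

Without the trim, a trailing newline would stay attached to the final path and the `.jar` suffix test would reject it.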
+
+(defn find-sources-jar
+  "Given a jar path, find the corresponding -sources.jar if it exists."
+  [jar-path]
+  (when (str/ends-with? jar-path ".jar")
+    (let [sources-path (str/replace jar-path #"\.jar$" "-sources.jar")
+          sources-file (io/file sources-path)]
+      (when (.exists sources-file)
+        sources-path))))
+
+(defn needs-java-sources?
+  "Check if the search options indicate we're looking for Java files."
+  [{:keys [type glob]}]
+  (or (= "java" type)
+      (and glob (re-find #"\.java" glob))))
+
+(defn get-jars-with-sources
+  "Given a list of jars and search opts, return jars plus any available sources jars.
+   When searching for Java files, downloads missing sources from Maven Central."
+  [jars opts]
+  (let [;; First find sources jars already in Maven cache
+        existing-sources (->> jars
+                              (keep find-sources-jar)
+                              (remove (set jars)))]
+    (if (needs-java-sources? opts)
+      ;; For Java searches, also download missing sources
+      (let [jars-with-sources (set (map #(str/replace % #"-sources\.jar$" ".jar")
+                                        existing-sources))
+            jars-missing-sources (remove jars-with-sources jars)
+            _ (log/debug "Checking for Java sources for" (count jars-missing-sources) "jars")
+            downloaded-sources (deps-sources/ensure-sources-jars! jars-missing-sources)]
+        (log/debug "Downloaded" (count downloaded-sources) "sources jars")
+        (into (vec jars) (concat existing-sources downloaded-sources)))
+      ;; For non-Java searches, just use existing sources
+      (into (vec jars) existing-sources))))
+
+(defn parse-library-filter
+  "Parse a library filter string into group and optional artifact.
+   Returns {:group \"group.id\"} or {:group \"group.id\" :artifact \"name\"}."
+  [library]
+  (let [parts (str/split library #"/" 2)]
+    (if (= 2 (count parts))
+      {:group (first parts) :artifact (second parts)}
+      {:group (first parts)})))
+
+(defn filter-jars-by-library
+  "Filter jars to only those matching the given library filter.
+   Library can be a group ID (matches all artifacts) or group/artifact (exact match).
+   Uses deps-sources/parse-maven-coords to extract coordinates from jar paths."
+  [jars library]
+  (let [{:keys [group artifact]} (parse-library-filter library)]
+    (filterv (fn [jar-path]
+               (when-let [coords (deps-sources/parse-maven-coords jar-path)]
+                 (and (= group (:group coords))
+                      (or (nil? artifact)
+                          (= artifact (:artifact coords))))))
+             jars)))
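The library filter accepts either a Maven group or a `group/artifact` pair. The sketch below shows how the two forms behave. `deps-sources/parse-maven-coords` is not shown in this excerpt, so the hypothetical `coords-match?` helper works on ready-made `{:group ... :artifact ...}` maps, and `parse-library-filter` is rewritten in an equivalent shape for brevity.

```clojure
(require '[clojure.string :as str])

(defn parse-library-filter
  "Split \"group\" or \"group/artifact\" into a map."
  [library]
  (let [[group artifact] (str/split library #"/" 2)]
    (cond-> {:group group}
      artifact (assoc :artifact artifact))))

(defn coords-match?
  "Hypothetical helper: does a coordinate map satisfy the library filter?"
  [library coords]
  (let [{:keys [group artifact]} (parse-library-filter library)]
    (and (= group (:group coords))
         (or (nil? artifact)
             (= artifact (:artifact coords))))))

(def core-async {:group "org.clojure" :artifact "core.async"})

(println (parse-library-filter "org.clojure/core.async"))
(println (coords-match? "org.clojure" core-async))
```

A bare group matches every artifact under that group, while the two-part form requires an exact artifact match.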
+
+(defn cached-base-jars
+  "Get base classpath jars with caching. Returns cached result if available."
+  [project-dir]
+  (or (get @classpath-cache project-dir)
+      (when-let [jars (get-classpath-jars project-dir)]
+        (swap! classpath-cache assoc project-dir jars)
+        jars)))
+
+(defn clear-classpath-cache!
+  "Clear all caches. Useful after deps changes."
+  []
+  (reset! classpath-cache {})
+  (reset! library-jars-cache {}))
+
+(defn list-jar-entries
+  "List all entries in a jar file.
+   Returns a vector of entry paths or nil on error."
+  [jar-path]
+  (jar-utils/list-jar-entries jar-path))
+
+(defn glob-matches?
+  "Check if a path matches a glob pattern.
+   Supports simple patterns like *.clj, *.{clj,cljs}"
+  [pattern path]
+  (if-not pattern
+    true
+    (let [;; Convert glob to regex, escaping all regex metacharacters first
+          ;; then restoring glob wildcards
+          pattern-regex (-> pattern
+                            (str/replace #"[.+^$|()\\]" "\\\\$0")
+                            (str/replace "*" ".*")
+                            (str/replace #"\{([^}]+)\}"
+                                         (fn [[_ alts]]
+                                           (str "(" (str/replace alts "," "|") ")"))))]
+      (boolean (re-find (re-pattern (str pattern-regex "$")) path)))))
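To see what the glob-to-regex conversion produces, the three `str/replace` steps from `glob-matches?` can be pulled into a standalone function and inspected. In this sketch `glob->regex-str` is an illustrative name introduced here, and `glob-matches?` is a cut-down copy without the nil-pattern branch.

```clojure
(require '[clojure.string :as str])

(defn glob->regex-str
  "Escape regex metacharacters, then turn glob wildcards into regex syntax."
  [glob]
  (-> glob
      (str/replace #"[.+^$|()\\]" "\\\\$0")        ; "." becomes "\."
      (str/replace "*" ".*")                        ; "*" matches any run of characters
      (str/replace #"\{([^}]+)\}"                   ; "{a,b}" becomes "(a|b)"
                   (fn [[_ alts]]
                     (str "(" (str/replace alts "," "|") ")")))))

(defn glob-matches?
  "True when the path ends with something the glob accepts."
  [glob path]
  (boolean (re-find (re-pattern (str (glob->regex-str glob) "$")) path)))

(println (glob->regex-str "*.{clj,cljs}"))
(println (glob-matches? "*.clj" "clojure/core.clj"))
```

Because the regex is anchored only at the end, a pattern like `*.clj` matches entries at any directory depth inside the jar.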
+
+(defn type-to-glob
+  "Convert a file type (like 'clj') to a glob pattern."
+  [type-str]
+  (when type-str
+    (str "*." type-str)))
+
+(defn filter-entries
+  "Filter jar entries by glob and/or type patterns."
+  [entries {:keys [glob type]}]
+  (let [effective-glob (or glob (type-to-glob type))]
+    (if effective-glob
+      (filter #(glob-matches? effective-glob %) entries)
+      entries)))
+
+(defn search-jar-entry-rg
+  "Search using ripgrep. Reads jar entry via Java, pipes content to rg via stdin.
+   Supports context lines and multiline patterns."
+  [jar-path entry-path pattern {:keys [case-insensitive context-before context-after
+                                       context multiline]}]
+  (try
+    (when-let [content (jar-utils/read-jar-entry jar-path entry-path)]
+      (let [rg-args (cond-> ["rg" "-n"]
+                      case-insensitive (conj "-i")
+                      multiline (conj "-U")
+                      context-before (conj "-B" (str context-before))
+                      context-after (conj "-A" (str context-after))
+                      context (conj "-C" (str context)))
+            rg-args (conj rg-args pattern)
+            result (apply shell/sh (concat rg-args [:in content]))]
+        (when (zero? (:exit result))
+          (let [matches (->> (str/split-lines (:out result))
+                             (keep (fn [line]
+                                     (when-let [[_ line-num sep content]
+                                                (re-matches #"(\d+)([:|-])(.*)$" line)]
+                                       {:line-num (parse-long line-num)
+                                        :content content
+                                        :match? (= sep ":")}))))]
+            (when (seq matches)
+              {:jar jar-path
+               :entry entry-path
+               :matches (vec matches)})))))
+    (catch Exception e
+      (log/debug "Error searching" entry-path "in" jar-path ":" (.getMessage e))
+      nil)))
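The parsing of ripgrep's numbered output can be tried on literal strings, with no `rg` binary involved. With `-n`, matching lines look like `12:text` and context lines like `11-text`. This is a sketch rather than the commit's code: `parse-rg-line` is an illustrative name, the separator class is narrowed to `:` and `-`, and `Long/parseLong` stands in for `parse-long` so the snippet also runs on Clojure versions before 1.11.

```clojure
(require '[clojure.string :as str])

(defn parse-rg-line
  "Turn one line of `rg -n` output into a match map, or nil if it does not parse."
  [line]
  (when-let [[_ line-num sep content] (re-matches #"(\d+)([:-])(.*)" line)]
    {:line-num (Long/parseLong line-num)
     :content  content
     :match?   (= sep ":")}))   ; ":" marks a match, "-" marks a context line

(def sample-output
  "11-;; helper\n12:(defn foo [])\n--\n40:(foo)")

(doseq [m (keep parse-rg-line (str/split-lines sample-output))]
  (println m))
```

Lines that do not start with a number, such as the `--` separator between context groups, fail the `re-matches` and are dropped by `keep`.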
+
+(defn search-jar-entry-fallback
+  "Fallback search using Java jar reading and Clojure regex.
+   Does not support context lines or multiline."
+  [jar-path entry-path pattern {:keys [case-insensitive]}]
+  (try
+    (when-let [content (jar-utils/read-jar-entry jar-path entry-path)]
+      (let [lines (str/split-lines content)
+            pattern-re (re-pattern (if case-insensitive
+                                     (str "(?i)" pattern)
+                                     pattern))
+            matches (keep-indexed
+                     (fn [idx line]
+                       (when (re-find pattern-re line)
+                         {:line-num (inc idx)
+                          :content line
+                          :match? true}))
+                     lines)]
+        (when (seq matches)
+          {:jar jar-path
+           :entry entry-path
+           :matches (vec matches)})))
+    (catch Exception e
+      (log/debug "Error searching" entry-path "in" jar-path ":" (.getMessage e))
+      nil)))
+
+(defn search-jar-entry
+  "Search a single entry within a jar. Uses ripgrep if available, otherwise
+   falls back to Clojure regex (without context/multiline support).
+
+   Returns a map with :jar, :entry, and :matches. Each match has :line-num,
+   :content, and :match? (true for matches, false for context lines)."
+  [jar-path entry-path pattern opts]
+  (if (rg-available?)
+    (search-jar-entry-rg jar-path entry-path pattern opts)
+    (search-jar-entry-fallback jar-path entry-path pattern opts)))
+
+(defn deps-grep
+  "Search for a pattern in dependency jars.
+
+   Arguments:
+   - project-dir: Directory containing deps.edn
+   - pattern: Regex pattern to search for
+   - opts: Map of options
+     :library - Required. Maven group or group/artifact to search
+     :glob - Filter files by glob pattern (e.g., \"*.clj\")
+     :type - Filter files by type (e.g., \"clj\", \"java\")
+     :output-mode - :content, :files-with-matches, or :count
+     :case-insensitive - Case insensitive search
+     :line-numbers - Include line numbers (default true for content mode)
+     :context-before - Lines before match
+     :context-after - Lines after match
+     :context - Lines before and after
+     :head-limit - Limit number of results
+     :multiline - Enable multiline matching
+
+   Returns a map with :results and optionally :truncated.
+
+   Requires: clojure CLI. Optional: ripgrep (rg) for context/multiline."
+  [project-dir pattern opts]
+  (if-let [binary-error (check-required-binaries!)]
+    binary-error
+    (let [base-jars (cached-base-jars project-dir)]
+      (if-not base-jars
+        {:error "Failed to resolve classpath. Is this a deps.edn project?"}
+        (let [library (:library opts)
+              cache-key [project-dir library (needs-java-sources? opts)]
+              filtered-jars (filter-jars-by-library base-jars library)]
+          (if (empty? filtered-jars)
+            {:error (str "No libraries found matching: " (:library opts)
+                         ". Use deps_list to see available libraries.")}
+            (let [;; Get jars with sources (cached per library)
+                  jars (or (get @library-jars-cache cache-key)
+                           (let [result (get-jars-with-sources filtered-jars opts)]
+                             (swap! library-jars-cache assoc cache-key result)
+                             result))
+                  {:keys [output-mode head-limit]
+                   :or {output-mode :content}} opts
+                  all-results (atom [])
+                  result-count (atom 0)
+                  limit-reached (atom false)]
+              ;; Search each jar
+              (doseq [jar jars
+                      :while (not @limit-reached)]
+                (when-let [entries (list-jar-entries jar)]
+                  (let [filtered-entries (filter-entries entries opts)]
+                    (doseq [entry filtered-entries
+                            :while (not @limit-reached)]
+                      (when-let [match (search-jar-entry jar entry pattern opts)]
+                        (case output-mode
+                          :files-with-matches
+                          (do
+                            (swap! all-results conj {:jar (:jar match)
+                                                     :entry (:entry match)})
+                            (swap! result-count inc))
+
+                          :count
+                          (swap! result-count + (count (:matches match)))
+
+                          ;; :content (default)
+                          (do
+                            (swap! all-results conj match)
+                            (swap! result-count + (count (:matches match)))))
+
+                        (when (and head-limit (>= @result-count head-limit))
+                          (reset! limit-reached true)))))))
+
+              (cond-> {:results @all-results}
+                (= output-mode :count) (assoc :count @result-count)
+                @limit-reached (assoc :truncated true)))))))))
+
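The `:head-limit` handling in `deps-grep` accumulates per-entry results and stops once the running count of matched lines reaches the limit. The same early-stop behavior can be sketched with `reduce` and `reduced` instead of atoms. This is an alternative formulation for illustration only: `collect-until-limit` is a hypothetical name, and the input is a plain sequence of already-computed match maps rather than live jar searches.

```clojure
(defn collect-until-limit
  "Accumulate match maps until the total number of matched lines
   reaches head-limit (nil means no limit)."
  [matches head-limit]
  (reduce (fn [{:keys [results n]} m]
            (let [n'   (+ n (count (:matches m)))
                  acc' {:results (conj results m) :n n'}]
              (if (and head-limit (>= n' head-limit))
                (reduced (assoc acc' :truncated true)) ; stop scanning further entries
                acc')))
          {:results [] :n 0}
          matches))

(def sample-matches
  [{:entry "a.clj" :matches [1 2]}
   {:entry "b.clj" :matches [3]}
   {:entry "c.clj" :matches [4]}])

(println (collect-until-limit sample-matches 3))
```

As in the diff, the limit is checked after a whole entry has been added, so the result can contain slightly more matched lines than `head-limit`, and `:truncated` is set whenever the limit is reached.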
