Pringled/README.md

Hi there 👋

I'm Thomas van Dongen. I'm head of AI engineering at Springer Nature and co-founder of Minish, an open-source NLP lab working on efficient models and packages.

LinkedIn · Google Scholar · Website


Open Source Projects

| Project | Description |
| --- | --- |
| model2vec | Distill sentence transformers into static embeddings that are orders of magnitude faster |
| semhash | Multimodal semantic deduplication, outlier detection, and representative filtering |
| semble | A code-search MCP/CLI tool for AI agents that drastically reduces token consumption |
| pyversity | Diversify search & retrieval results to reduce redundancy and improve coverage |
| vicinity | Fast, lightweight nearest neighbor search with pluggable backends |
| model2vec-rs | A Rust port of Model2Vec |
| tokenlearn | Pre-train static embedding models |
| agentcheck | A Go CLI that audits what an AI agent can access before you run it |

Pinned

  1. MinishLab/model2vec (Public)

    Fast State-of-the-Art Static Embeddings

    Python · 2.1k stars · 121 forks

  2. MinishLab/semhash (Public)

    Fast Multimodal Semantic Deduplication & Filtering

    Python · 923 stars · 56 forks

  3. pyversity (Public)

    Fast Diversification for Search & Retrieval

    Python · 490 stars · 27 forks

  4. MinishLab/semble (Public)

    Fast and Accurate Code Search for Agents. Uses ~98% fewer tokens than grep+read

    Python · 796 stars · 61 forks