diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml index 18141129d9..e30d118ef4 100644 --- a/.github/ISSUE_TEMPLATE/config.yml +++ b/.github/ISSUE_TEMPLATE/config.yml @@ -1,4 +1,4 @@ contact_links: - - name: "Join MindsDB Community" - url: https://mindsdb.com/joincommunity - about: Join our community on Slack for other questions and general chat \ No newline at end of file + - name: "Join the MindsDB Discord" + url: https://mindshub.ai/discord + about: Join our Discord for questions and general chat diff --git a/.github/ISSUE_TEMPLATE/integrations_contest.yaml b/.github/ISSUE_TEMPLATE/integrations_contest.yaml deleted file mode 100644 index 8318873d29..0000000000 --- a/.github/ISSUE_TEMPLATE/integrations_contest.yaml +++ /dev/null @@ -1,47 +0,0 @@ -name: πŸ§‘β€πŸ”§ Propose a new integration -description: Share an idea for a new datasource or machine learning integration -title: "[Integration]: " -labels: [roadmap, integration] -assignees: -- -body: -- type: markdown - attributes: - value: | - Thanks for taking the time to share the new integration! Please fill out the form in English! -- type: checkboxes - attributes: - label: Is there an existing integration? - description: Please search to see if MindsDB already supports this integration.A list with supported integrations can be found [here](https://github.com/mindsdb/mindsdb#database-integrations). - options: - - label: I have searched the existing integrations. - required: true -- type: textarea - attributes: - label: Use Case - description: Which use-cases does this solve? - placeholder: | - Why this integration will be usefull to users? What is the value of having this integration? - validations: - required: true -- type: textarea - attributes: - label: Motivation - description: How will we know that this has succeeded? - placeholder: | - Explain the proposed integration as though it was already implemented and you were explaining it to a user. 
- validations: - required: true -- type: textarea - attributes: - label: Implementation - description: Describe how this integration will work, with code, pseudo-code, mock-ups, text, or add diagrams - validations: - required: false -- type: textarea - attributes: - label: Anything else? - description: | - Links? References? Anything that will give more context about this integration! - validations: - required: false diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index a22cc2c206..bc20563dfb 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -2,7 +2,7 @@ Being part of the core MindsDB team is accessible to anyone who is motivated and wants to be part of that journey! -Please see below how to contribute to the project, also refer to the contributing documentation. +Please see below how to contribute to the project. ## How can you help us? @@ -27,7 +27,7 @@ In general, we follow the "fork-and-pull" Git workflow. > NOTE: Be sure to merge the latest from "upstream" before making a pull request! Also, make the PR to the `main` branch. ## Feature and Bug reports -We use GitHub issues to track bugs and features. Report them by opening a [new issue](https://github.com/mindsdb/mindsdb/issues/new/choose) and fill out all of the required inputs. +We use GitHub issues to track bugs and features. Report them by opening a [new issue](https://github.com/mindsdb/engine/issues) and fill out all of the required inputs. ## Code review process @@ -35,12 +35,10 @@ The Pull Request reviews are done on a regular basis. Please, make sure you resp ## Community -If you have additional questions or you want to chat with the MindsDB core team, please join our [Slack community](https://mindsdb.com/joincommunity) or post at [Github Discussions](https://github.com/mindsdb/mindsdb/discussions). 
- -To get updates on MindsDB's latest announcements, releases, and events, sign up for our [Monthly Community Newsletter](https://mindsdb.com/newsletter/?utm_medium=community&utm_source=github&utm_campaign=mindsdb%20repo). +If you have additional questions or you want to chat with the MindsDB core team, please join our [Discord](https://mindshub.ai/discord) or open a [GitHub issue](https://github.com/mindsdb/engine/issues). -Join our mission of democratizing machine learning! +Join our mission of making semantic search accessible to everyone who knows SQL! ## Contributor Code of Conduct -Please note that this project is released with a [Contributor Code of Conduct](https://github.com/mindsdb/mindsdb/blob/main/CODE_OF_CONDUCT.md). By participating in this project, you agree to abide by its terms. +Please note that this project is released with a [Contributor Code of Conduct](CODE_OF_CONDUCT.md). By participating in this project, you agree to abide by its terms. diff --git a/README.md b/README.md index 056a11efdb..a4c17aaad1 100644 --- a/README.md +++ b/README.md @@ -1,246 +1,188 @@ - +# MindsDB Query Engine -

- - Query engine for AI analytics, powering agents to answer questions across all your live data - -

+**Semantic search over all your data, entirely in SQL.** -
+
- MindsDB Release + PyPI version - Python supported + Supported Python versions - Docker pulls + Docker pulls - -

- AI-coworker - · - Website - · - Docs - · - Contact us for a demo - · - Community Slack -

+[**Docs**](https://mindsdb.github.io/engine) · [**Website**](https://mindshub.ai) · [**Discord**](https://mindshub.ai/discord) · [**Contact**](https://mindshub.ai/contact) + --- -[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/mindsdb/mindsdb) +MindsDB Query Engine connects to 200+ data sources (databases, warehouses, applications, files) and lets you query them live in one SQL dialect, with no ETL. Index unstructured content into [knowledge bases](https://mindsdb.github.io/engine#kb-overview), then search it by meaning, by keyword, or both at once, with plain SQL filters on top. Everything is reachable from any MySQL- or PostgreSQL-compatible client. -MindsDB develops the following products: +> **Where this fits:** MindsDB now builds [MindsHub](https://mindshub.ai), a hub for open AI agents. The Query Engine remains a standalone open-source project, and it pairs well with MindsHub agents: connect it to give an agent live, SQL-queryable access to your data and semantic search. The full story: [MindsHub vs MindsDB](https://mindshub.ai/mindshub-vs-mindsdb). -* [ANTON](https://github.com/mindsdb/anton) - A personal AI agent that helps you get work done. Tell it what you need in plain language and it takes it from there - sending emails, calling APIs, connecting to data sources, building dashboards, and delivering results. No setup, no plugins, no fuss. +## How it works -* [Query Engine](https://docs.mindsdb.com) - The product you are looking at, is a popular query engine for semantic-search.
+``` + MySQL clients Β· PostgreSQL clients Β· BI tools Β· ORMs Β· HTTP API + β”‚ + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” + β”‚ MindsDB Query Engine β”‚ + β”‚ one SQL dialect over β”‚ + β”‚ a federated query planner β”‚ + β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” + β”‚ β”‚ β”‚ + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β” + β”‚ Databases β”‚ β”‚ Apps & files β”‚ β”‚ Knowledge bases β”‚ + β”‚ Postgres, MySQL, β”‚ β”‚ Slack, web crawlerβ”‚ β”‚ embeddings + β”‚ + β”‚ MongoDB, Snowflakeβ”‚ β”‚ docs, sheets, β”‚ β”‚ vector store + β”‚ + β”‚ BigQuery, S3, … β”‚ β”‚ email, calendars… β”‚ β”‚ BM25 index β”‚ + β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + queried live, in place β€” data is never copied +``` +- **One server, three interfaces.** The engine ships a built-in SQL editor on HTTP (`:47334`) and speaks the MySQL (`:47335`) and PostgreSQL (`:47336`) wire protocols β€” so `mysql`, `psql`, DBeaver, SQLAlchemy, or any BI tool [connects directly](https://mindsdb.github.io/engine#setup-clients). +- **Federated queries, no pipelines.** [`CREATE DATABASE`](https://mindsdb.github.io/engine#db-create) attaches a live data source through an integration handler. The planner translates each query, pushes work down to the source, and streams results back β€” your data stays where it is. Source-specific syntax is still available via [native queries](https://mindsdb.github.io/engine#native-queries). 
+- **Knowledge bases are the semantic layer.** A [knowledge base](https://mindsdb.github.io/engine#kb-overview) combines an embedding model, an optional reranking model, and a vector store (e.g. pgvector). `INSERT INTO` it to chunk, embed, and index content; `SELECT` from it to retrieve by meaning, filtered by metadata columns like any other table. +- **Hybrid retrieval.** [Hybrid search](https://mindsdb.github.io/engine#kb-hybrid) runs vector similarity and BM25 keyword matching in parallel and merges the results β€” for queries that mix natural language with exact identifiers, codes, or acronyms. +- **Organize and automate.** [Projects](https://mindsdb.github.io/engine#proj-create) namespace your work, [views](https://mindsdb.github.io/engine#view-create) save cross-source transformations, and [jobs](https://mindsdb.github.io/engine#job-create) schedule any SQL to run on an interval β€” e.g. to keep knowledge bases fresh. -## What you can build with MindsDB Products +## Quick start -| CONVERSATIONAL ANALYTICS AGENTS | SEMANTIC SEARCH AGENTS | -| --- | --- | -| Get precise, data-driven answers using natural language.

Unify and query data across sources (MySQL, Salesforce, Shopify, etc.), without ETL.

Try Anton | Ground LLM responses in your most relevant internal knowledge.

Search across unstructured sources like documents, support tickets, Google Drive, and more.

Watch video | - -## How MindsDB - Query Engine works -
- - MindsDB demo - answer questions in plain English from live enterprise data - -
+Run with [Docker](https://mindsdb.github.io/engine#setup-docker): -MindsDB - Query Engine follows a simple workflow: **Connect β†’ Unify β†’ Respond**. At the center is an SQL-compatible data language with additional constructs for searching unstructured data, managing workflows (jobs/triggers), and building agents. - - - - - - - - - - - - - - -
- Connect - - Universal data access: Give your agents federated access to 200+ live data sources (Postgres, MongoDB, Slack, files, and more). -
- Unify - - Dynamic context engine: Fuse structured tables with vectorized data (text, PDFs, HTML) inside a Knowledge Base. -
- Respond - - Autonomous reasoning: Deploy agents that blend and retrieve data points across your stack to produce grounded answers. -
- -## Setup - -Users can install MindsDB via Docker, Docker Extension, or PyPI. - -Here is how to pull and run MindsDB via Docker: ```bash docker run --name mindsdb_container \ --e MINDSDB_APIS=http,mysql \ --p 47334:47334 -p 47335:47335 \ -mindsdb/mindsdb:latest + -e MINDSDB_APIS=http,mysql \ + -p 47334:47334 -p 47335:47335 \ + mindsdb/mindsdb +``` + +Or install from [PyPI](https://mindsdb.github.io/engine#setup-pip): + +```bash +pip install mindsdb # add extras as needed, e.g. mindsdb[pgvector,openai,postgres] +python -m mindsdb ``` -## Usage +Then open the editor at `http://127.0.0.1:47334`, or connect any MySQL client to port `47335`. The [quickstart](https://mindsdb.github.io/engine#quickstart) walks through the rest. + +## From zero to semantic search -**Follow the quickstart guide to get started with MindsDB using our demo data.** +Six SQL statements, start to finish. Full syntax for every statement is in the [SQL reference](https://mindsdb.github.io/engine). + +**1. Attach your data sources** ([docs](https://mindsdb.github.io/engine#db-create)) β€” they are queried live, nothing is imported: -Retrieve and analyze data from over 200 data sources in one SQL dialect. For AI agents, this means faster response time, better accuracy, and lower token consumption. 
```sql ---use SQL to aggregate pipeline data from Salesforce -SELECT SUM(ExpectedRevenue) AS open_pipeline -FROM salesforce.opportunities -WHERE close_date >= CURDATE() - ---use the same dialect to retrieve even from a non-SQL database, like MondoDB -SELECT COUNT(*) AS negative_emails_last_30_days -FROM mongodb.support_tickets -WHERE sentiment = 'negative' - AND created_at >= CURRENT_DATE - INTERVAL '30 days'; +CREATE DATABASE my_pg +WITH ENGINE = 'postgres', +PARAMETERS = { + "host": "localhost", "port": 5432, + "user": "user", "password": "pass", + "database": "mydb" +}; + +CREATE DATABASE my_mongo +WITH ENGINE = 'mongodb', +PARAMETERS = { + "host": "mongodb+srv://user:pass@cluster.example.net", + "database": "support" +}; ``` -Create views and join data even from different types of data systems. +**2. Query across sources in one dialect** ([docs](https://mindsdb.github.io/engine#sql-join)) β€” even non-SQL stores like MongoDB, and save the result as a [view](https://mindsdb.github.io/engine#view-create): + ```sql ---join MongoDB and Salesforce data -CREATE VIEW risky_renewals AS ( -SELECT * -FROM mongodb.support_tickets AS reviews -JOIN salesforce.opportunities AS deals - ON reviews.customer_domain = deals.customer_domain -WHERE deals.type = "renewal" - AND reviews.sentiment = "negative" +CREATE VIEW open_tickets_by_product AS ( + SELECT p.name, COUNT(t.ticket_id) AS open_tickets + FROM my_mongo.support_tickets AS t + JOIN my_pg.products AS p + ON t.product_id = p.id + WHERE t.status = 'open' + GROUP BY p.name ); ``` -Join vectorized and structured data inside a knowledge base. Combine semantic search with precise metadata criteria in a single SQL query. +**3. 
Create a knowledge base** ([docs](https://mindsdb.github.io/engine#kb-create)) β€” an embedding model plus a vector store, addressable as a table: + ```sql ---create a knowledge base for customer issues -CREATE KNOWLEDGE_BASE customers_issues +CREATE KNOWLEDGE_BASE support_kb USING - storage = my_vector.db, - content_columns = ['ticket_description']; - metadata_columns = ['customer_name', 'segment', 'revenue', 'is_pending_renewal']; - ---find large customers who submitted ticket related to data security topics -SELECT * FROM customers_issues -WHERE content = 'data security' -AND - is_pending_renewal = 'true'. - revenue > 1000000; + embedding_model = { + "provider": "openai", + "model_name": "text-embedding-3-large", + "api_key": "sk-..." + }, + storage = my_pgvector.support_kb_store, -- a pgvector connection + content_columns = ['subject', 'body'], + metadata_columns = ['product_name', 'priority', 'created_at'], + id_column = 'ticket_id'; ``` -Use MindsDB pre-packaged data agents and connect them with your own. See how to use MindsDB via API or MCP. +**4. Index your content** ([docs](https://mindsdb.github.io/engine#kb-insert)) β€” rows are chunked, embedded, and upserted: + ```sql -CREATE AGENT my_agent -USING - model = { - "provider": "openai", - "model_name" : "gpt-xx", - "api_key": "sk-..." - }, - data = { - "knowledge_bases": ["mindsdb.customer_issues"], - "tables": ["salesforce.opportunities", "postgres.sales", "mongodb.support_tickets"] - }, - prompt_template = 'my prompt template and agent guidance'; +INSERT INTO support_kb + SELECT ticket_id, subject, body, product_name, priority, created_at + FROM my_mongo.support_tickets; ``` -See MindsDB’s recommended usage of agents here and how to automate workflows with jobs. 
- -## πŸ“ƒ Tutorials -- Enterprise Knowledge Search (example) -- Advanced Semantic Search (example) -- Customer Support Automation (example1, example2) -- Intelligent Content Discovery (example) -- Financial Analysis Agents (example) -- Real-time AI-powered analytics (example) -- Conversational Data Assistants (example) -- CRM Intelligence (example) -- Compliance & Customer Intelligence (example) -- Conversation Intelligence (example) -Subscribe to our (blog) for more - -## 🫴 Help and support - -Stuck on a query? Found a bug? We’re here to help. - - - - - - - - - - - - - -
- Ask a question - - Join our Slack Community. -
- Report a bug - - Open a GitHub Issue. Please include reproduction steps! -
- Get commercial support - - Contact the MindsDB Team for enterprise SLAs and custom solutions. -
- -**Security Note:** If you find a security vulnerability, please do not open a public issue. Refer to our security policy for reporting instructions. - -## 🀝 Contribute to MindsDB - -MindsDB is open source and contributions are welcome! You can submit code changes through pull requests or by opening issues to report bugs, suggest new features, or enhancements. - -**Ways you can help:** -- Develop a database integration -- Develop an app integration -- Identify and fix bugs - -**How to contribute** - -- Read the contribution guide to get set up. -- Browse open issues. -- Join the #contributors channel in Slack. -- Explore community rewards and programs. - -
- -Our top 100 contributors - - - - - -Made with [contrib.rocks](https://contrib.rocks) -
-## πŸ“š Resources -- Documentation -- Blog -- Events -- Community Slack -- Brand guidelines -- Contact form +**5. Search by meaning, filter by metadata** ([docs](https://mindsdb.github.io/engine#kb-query)): + +```sql +SELECT chunk_content, product_name, relevance +FROM support_kb +WHERE content = 'cannot connect after the latest update' + AND priority <= 2 + AND relevance >= 0.5 +LIMIT 10; + +-- hybrid search: blend vector similarity with BM25 keyword matching +SELECT * +FROM support_kb +WHERE content = 'error ERR-4421' + AND hybrid_search = true; +``` + +β–Ά [How to use semantic search with metadata filters](https://www.youtube.com/watch?v=HN4fHtS4mvo) β€” a good explainer of this feature. + +**6. Keep the index fresh with a job** ([docs](https://mindsdb.github.io/engine#job-create)): + +```sql +CREATE JOB refresh_support_kb ( + INSERT INTO support_kb + SELECT ticket_id, subject, body, product_name, priority, created_at + FROM my_mongo.support_tickets + WHERE created_at > LAST +) +EVERY hour; +``` + +## Help and support + +| You need | Go to | +| --- | --- | +| Ask a question | [Discord](https://mindshub.ai/discord) | +| Report a bug | [GitHub Issues](https://github.com/mindsdb/engine/issues) β€” please include reproduction steps | +| Commercial support | [Contact the team](https://mindshub.ai/contact) | + +**Security note:** if you find a vulnerability, please do not open a public issue β€” follow our [security policy](https://github.com/mindsdb/engine/security) instead. + +## Contributing + +Contributions are welcome β€” code, integrations, docs, and bug reports alike. We follow the fork-and-pull workflow: see the [contribution guide](CONTRIBUTING.md) to get set up, and browse the [open issues](https://github.com/mindsdb/engine/issues) for somewhere to start. Good first areas are new integration handlers, bug fixes, and documentation improvements. 
+ +## Resources + +- [Documentation](https://mindsdb.github.io/engine) +- [MindsHub: open AI agents, from the same team](https://mindshub.ai) +- [MindsHub vs MindsDB: how the product evolved](https://mindshub.ai/mindshub-vs-mindsdb) +- [Discord](https://mindshub.ai/discord) +- [Contact](https://mindshub.ai/contact) + +## License + +MindsDB Core is licensed under the [Elastic License 2.0](LICENSE); some directories carry their own license, so see the [LICENSE](LICENSE) file for the full structure. diff --git a/docs/index.html b/docs/index.html index 99b0129b2f..dd9ef622d8 100644 --- a/docs/index.html +++ b/docs/index.html @@ -4,7 +4,40 @@ -MindsDB SQL Reference +MindsDB Query Engine - SQL Reference + + + + + + + + + + + + + + + + + + @@ -510,7 +543,6 @@
@@ -745,11 +777,6 @@

# pip in

Editor: http://127.0.0.1:47334 · MySQL API: port 47335 · PostgreSQL API: port 47336

-
-

# MindsDB Cloud

-

Sign up at cloud.mindsdb.com (no installation required). The SQL editor is available immediately. All SQL statements in this reference work identically on Cloud.

-
-

# Connect Clients

MindsDB exposes a MySQL-compatible wire protocol. Any MySQL client can connect:
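As a rough illustration of what a client needs, the sketch below builds a connection URL for the MySQL API port listed in this reference and splits it back into its fields. Only the port numbers come from this document. The host `127.0.0.1`, the user `mindsdb`, the database name `mindsdb`, and the `mysql+pymysql` driver prefix are assumptions for a local default install, and the helper names `mysql_url` and `describe` are hypothetical.

```python
from urllib.parse import urlsplit

# Ports are the ones listed in this reference. The host, user, database
# name, and the "mysql+pymysql" driver prefix are assumptions for a local
# default install; adjust them to your setup.
MINDSDB_HOST = "127.0.0.1"
MINDSDB_PORTS = {"http": 47334, "mysql": 47335, "postgres": 47336}


def mysql_url(user="mindsdb", database="mindsdb", host=MINDSDB_HOST):
    """Build a SQLAlchemy-style URL that targets the MySQL API port."""
    return f"mysql+pymysql://{user}@{host}:{MINDSDB_PORTS['mysql']}/{database}"


def describe(url):
    """Split a connection URL back into the fields a client needs."""
    parts = urlsplit(url)
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    url = mysql_url()
    print(url)
    print(describe(url))
```

If SQLAlchemy and a MySQL driver such as PyMySQL happen to be installed, a URL of this shape could be passed to `sqlalchemy.create_engine`; whether your deployment requires a password or a different user depends on its configuration.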

diff --git a/docs/og-image.png b/docs/og-image.png new file mode 100644 index 0000000000..0fa78280b1 Binary files /dev/null and b/docs/og-image.png differ