Star History Monthly Jan 2024 | Open-source Text2SQL Tools

Mila 3 min read

Text2SQL, or Chat2SQL, tools convert natural-language questions into SQL queries. Imagine having ChatGPT write beautiful, correct, and useful SQL queries for you!

These tools started out as a way to bridge the gap between non-technical users and databases: letting people interact with a database in natural language lowers the barrier to accessing and analyzing data. With the advance of AI models, they now support more advanced features such as handling complex queries, joining multiple tables, and even holding natural-language conversations.

They can also help improve productivity by automating the process of generating SQL queries, thereby saving time and effort.

In this edition of Star History Monthly, we have compiled a collection of open-source Text2SQL tools.

Chat2DB

Chat2DB aims to be a general-purpose SQL client and reporting tool that incorporates AI capabilities from the start. It connects to a range of databases, including MySQL, Postgres, Oracle, SQL Server, SQLite, ClickHouse, and more.

There was a bit of drama involving Chat2DB a while ago. We won't get into the details here, but I'm curious to know what you think.

SQL Chat

SQL Chat is a chat-based SQL client: you talk to your database in natural language to run operations such as querying, modifying, adding, and deleting (!) data.

It currently supports MySQL, Postgres, SQL Server, and TiDB Serverless.

It's open-sourced by Bytebase, a database migration tool for teams.

Vanna

Vanna is a Python framework that lets you train a RAG (retrieval-augmented generation) model on queries, DDL, and documentation from your database.
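To make the RAG idea a bit more concrete, here is a minimal sketch of the retrieval step. It is not Vanna's API: `Snippet`, `retrieve`, and `build_prompt` are hypothetical names, and plain word overlap stands in for the embedding search a real system would likely use. The point is only that stored DDL, documentation, and example queries get ranked against the question and pasted into the prompt.

```python
import re
from dataclasses import dataclass


@dataclass
class Snippet:
    kind: str  # "ddl", "doc", or "sql" (labels assumed for this sketch)
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve(question: str, store: list[Snippet], k: int = 2) -> list[Snippet]:
    """Return up to k snippets sharing the most words with the question.

    Word overlap is a stand-in for vector similarity search.
    """
    q = tokenize(question)
    scored = [(len(q & tokenize(s.text)), i, s) for i, s in enumerate(store)]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: (-item[0], item[1]))  # best score first, stable
    return [s for _, _, s in scored[:k]]


def build_prompt(question: str, store: list[Snippet], k: int = 2) -> str:
    """Assemble the text that would be sent to the language model."""
    context = "\n".join(f"[{s.kind}] {s.text}" for s in retrieve(question, store, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nSQL:"


# Toy "training data": schema, documentation, and a known-good query.
store = [
    Snippet("ddl", "CREATE TABLE orders (id INTEGER, customer_id INTEGER, total REAL)"),
    Snippet("ddl", "CREATE TABLE customers (id INTEGER, name TEXT, country TEXT)"),
    Snippet("doc", "The total column in orders is stored in US dollars."),
    Snippet("sql", "SELECT country, COUNT(*) FROM customers GROUP BY country"),
]

question = "What is the total of all orders?"
print(build_prompt(question, store))
```

For this question, the documentation note and the `orders` DDL end up in the prompt, while the unrelated `customers` snippets are left out.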

You can use Vanna as is, or build your own custom UI with an existing tool (e.g. Streamlit, Slack).

It was open-sourced in July 2023 and got really popular this past January.

DuckDB-NSQL

DuckDB-NSQL is a Text2SQL LLM built for local DuckDB SQL analytics tasks by MotherDuck and Numbers Station. It can help users leverage the full power of DuckDB and its analytic potential without having to go back and forth between the DuckDB documentation and the SQL shell.

LangChain

With LangChain, you can build a Q&A chain and agent over an SQL database yourself.

LangChain also has an SQL Agent that you can add to the chain. It can not only answer questions based on the database's schema and content, but also recover from errors: it runs a generated query, catches the traceback, and regenerates the query correctly.
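The run, catch, and regenerate loop described above can be sketched in a few lines of plain Python. This is not LangChain's implementation or API: `run_with_retries` and the `generate_sql` callback signature are assumptions made for illustration, `fake_model` stands in for a real LLM call, and SQLite is used only because it ships with Python.

```python
import sqlite3
from typing import Callable, Optional

# Hypothetical callback: (question, previous_sql, error_message) -> new SQL string.
Generator = Callable[[str, Optional[str], Optional[str]], str]


def run_with_retries(conn: sqlite3.Connection, question: str,
                     generate_sql: Generator, max_attempts: int = 3):
    """Generate a query, run it, and feed any database error back to the generator."""
    sql, error = None, None
    for _ in range(max_attempts):
        sql = generate_sql(question, sql, error)
        try:
            return sql, conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            error = str(exc)  # e.g. "no such column: nme"
    raise RuntimeError(f"No working query after {max_attempts} attempts: {error}")


def fake_model(question: str, previous_sql: Optional[str], error: Optional[str]) -> str:
    """Stand-in for an LLM: misspells a column first, fixes it once it sees an error."""
    if error is None:
        return "SELECT nme FROM users"
    return "SELECT name FROM users"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('Ada')")

sql, rows = run_with_retries(conn, "List all user names", fake_model)
print(sql, rows)
```

Capping the number of attempts matters in practice, since a model that keeps producing the same broken query would otherwise loop forever.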

Awesome Text2SQL

Awesome Text2SQL is a curated collection of tutorials and resources for LLMs, Text2SQL, Text2DSL, Text2API, Text2Vis, and more. Most of the models are LLM+Text2SQL, and each one comes with links to papers, code, and datasets. If you want to dive deep into Text2SQL, take a look!

To Wrap Up

LLM or not, you should still be extra careful when executing model-generated SQL queries. Some ways to minimize the risks include describing your database schema and data to the model, constraining the size of the output, and validating and reviewing the generated SQL queries before executing them.
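As a rough illustration of the last two points, the sketch below refuses anything that is not a single `SELECT` statement and caps the number of rows returned. The helper names `is_single_select` and `run_guarded` are made up for this example, the check is deliberately conservative (it also rejects harmless queries that contain a semicolon inside a string literal), and it complements human review and read-only database credentials rather than replacing them.

```python
import re
import sqlite3


def is_single_select(sql: str) -> bool:
    """Accept exactly one statement, and only if it starts with SELECT."""
    body = sql.strip().rstrip(";").strip()
    if not body or ";" in body:  # empty input, or more than one statement
        return False
    return re.match(r"select\b", body, re.IGNORECASE) is not None


def run_guarded(conn: sqlite3.Connection, sql: str, max_rows: int = 100):
    """Validate the generated SQL, then fetch at most max_rows rows."""
    if not is_single_select(sql):
        raise ValueError("Only a single SELECT statement is allowed")
    cursor = conn.execute(sql)
    return cursor.fetchmany(max_rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])

print(run_guarded(conn, "SELECT n FROM t ORDER BY n", max_rows=3))
# prints [(0,), (1,), (2,)]
```

Passing `"DROP TABLE t"` or `"SELECT 1; DROP TABLE t"` to `run_guarded` raises `ValueError` before anything reaches the database.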

Lastly

If you want more AI content, check out earlier editions of the Star History Open-source Monthly: