Serving a Transformer model converting Text to SQL with Huggingface and MLflow

As machine learning continues to mature, here is an intro on how to use a T5 model to generate SQL queries from text questions and serve it…

Oct 24, 2021

As machine learning continues to mature, here is an intro on how to use a T5 model to generate SQL queries from text questions and serve it via a REST API.

Machine Learning for code completion got a lot of press with the release of OpenAI Codex which powers GitHub Copilot. Many companies are tackling this problem and making progress is now quicker thanks to the better tooling and techniques.

In the 10 years of evolution of the Hue SQL Editor, investing and switching to a parser based autocomplete was one of the top three best decisions. The parsers have even being reused by most of the competitiors. This was done five years ago and now new (complementary) approaches are worth investigating.

Starting the MLflow server and calling the model to generate a corresponding SQL query to the text question

Here are three SQL topics that could be simplified via ML:

Text to SQL →a text question get converted into an SQL query
SQL to Text →getting help on understanding what a SQL query is doing
Table Question Answering → literally ask questions on a grid dataset

Let’s have an intro with the generation of an SQL query from a text question.

For this we pick an existing model named dbernsohn/t5_wikisql_SQL2en.

Most of the difficult work has already been done by building the model and fine tuning it on the WikiSQL dataset.

Invocation of the prediction service REST API via curl

Let’s run the model with a simple question:

> python text2sql.py predict --query="How many people live in the USA?"

"SELECT COUNT Live FROM table WHERE Country = united states AND Name: text"

Bonus: this quick CLI based on a previous tutorial allows to interact easily with the model

Obviously the results are not pixel perfect and a lot more can be done but this is a good start. Now let’s see how serving the model as an API works:

Pulling a trained Text2SQL model M2 from Huggingface Hub and using MFlow to register it as experiments and serve them via a REST API

curl command asking the model to predict the SQL from a text question

For this we will use MLflow which provides a lot of the glue to automate the tedious engineering management of ML models.

We simply wrap around the Huggingface model into a MLflow model

The model is already trained and we actually just register it in MLflow

Calling the `predict()` method of the model will run the inference

The API is simply local here but MLflow can automate the pushes and deploys of the models in production environments. In our case we just want to register it:

python text2sql.py train

And after starting the mlflow ui we can see the experiment:

Seeing some of the model metadata as well as how to load it. Note that more options like Schemas and registering in the Model Registry are available.

Now we select the iteration we want to serve:

mlflow models serve -m /home/romain/projects/romain/text2sql/mlruns/0/efec45c930714e3581033699e011df51/artifacts/model -p 5001

And then can directly query it!

curl -X POST -H "Content-Type:application/json; format=pandas-split" --data '{"columns":["text"],"data":[["How many people live in the USA?"]]}' http://127.0.0.1:5001/invocations

"SELECT COUNT Live FROM table WHERE Country = united states AND Name: text"

And that’s it!

The project is in a Github repo. As a follow-up you can also find a detailed exampled how to to manage a Bayesian Model with MLflow.

In the next episodes we will see how to integrate the ML API into your own SQL Editor and improve the model!

Romain’s Newsletter

Discussion about this post