- MorphingDB is a PostgreSQL extension that supports deep learning model inference inside the database, along with vector storage.
- MorphingDB allows users to import LibTorch models and run inference on data stored in the database.
- MorphingDB stores vector data together with its dimensional information, so LibTorch can operate directly on vectors in the database; users can store data preprocessed into vectors to speed up inference.
# build docker image
sudo docker build -t morphingdb .
# run docker container
sudo docker run --name morphingdb_test -e POSTGRES_PASSWORD=123456 -p 5432:5432 -d morphingdb:latest
# enter docker
sudo docker exec -it [container id] /bin/bash
# run test
su postgres
psql -p 5432 -d postgres -c 'create extension pgdl;'
psql -p 5432 -d postgres -f /home/pgdl/test/sql/docker_test.sql
psql -p 5432 -d postgres -f /home/pgdl/test/sql/vector_test.sql
MorphingDB supports PostgreSQL 12+ on Linux.
# cpu
wget -P ./third_party https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-2.0.0%2Bcpu.zip
# gpu
wget -P ./third_party https://download.pytorch.org/libtorch/cu117/libtorch-shared-with-deps-2.0.0%2Bcu117.zip
unzip -d ./third_party/libtorch ./third_party/*.zip
rm ./third_party/*.zip
sudo yum install postgresql
sudo yum install opencv opencv-devel opencv-python
cd third_party
git clone https://github.com/google/sentencepiece
cd sentencepiece
mkdir build
cd build
cmake ..
make -j4
sudo make install
mkdir build && cd build
cmake -DCMAKE_PREFIX_PATH="$(pwd)/../third_party/libtorch" ..
make -j4
make install
Start server
initdb -D <data_directory>
postgres -D <data_directory> -p 5432
Connect to server
psql -p 5432 -d postgres
Enable the extension
CREATE EXTENSION pgdl;
SELECT create_model(model_name, model_path, model_description);
You need to write the corresponding input and output handlers for the created model in src/external_process, then rebuild the extension and run make install again.
SELECT register_process();
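As a sketch, importing and registering a model might look like the following; the model name, file path, and description are placeholders, not names shipped with MorphingDB:

```sql
-- hypothetical example: model name, path, and description are placeholders
SELECT create_model('resnet18', '/home/pgdl/models/resnet18.pt', 'image classification model');
-- register the input/output handlers compiled into the extension
SELECT register_process();
```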
SELECT predict_float([model_name], ['cpu'/'gpu'], [variable_input_column]) from [table];
SELECT predict_text([model_name], ['cpu'/'gpu'], [variable_input_column]) from [table];
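For example, assuming a hypothetical table `comments(id, comment)` and a previously registered text model named `sentiment`:

```sql
-- hypothetical table and model names
SELECT comment, predict_text('sentiment', 'cpu', comment) FROM comments;
```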
Window functions can be used to batch rows together and speed up prediction; the window size controls the batch size.
SELECT comment,predict_batch_text([model_name], ['cpu'/'gpu'], [variable_input_column]) over (rows between current row and [window_size] following)
AS result
FROM [table];
SELECT comment,predict_batch_float8([model_name], ['cpu'/'gpu'], [variable_input_column]) over (rows between current row and [window_size] following)
AS result
FROM [table];
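For instance, batching up to 101 rows (the current row plus 100 following) per inference call, again with the hypothetical `sentiment` model and `comments` table:

```sql
-- hypothetical model/table names; window size 100 batches up to 101 rows per call
SELECT comment,
       predict_batch_text('sentiment', 'cpu', comment)
       OVER (ROWS BETWEEN CURRENT ROW AND 100 FOLLOWING) AS result
FROM comments;
```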
Dozens of commonly used models will be supported, and basic pre-processing and post-processing pipelines will be added.
After a model is imported, users can view its information through the model_info table.
SELECT * from model_info;
MorphingDB supports vector storage, including the vector's shape (dimension) information. In MorphingDB the vector type is mvec; a literal is written as '[v1,v2,...]{d1,d2,...}', where the values in braces give the shape.
create table vec_test(id integer, vec mvec);
insert into vec_test values(1, '[1.0,2.2,3.123,4.2]{4}');
insert into vec_test values(1, '[1.0,2.2,3.123,4.2]{2,2}');
insert into vec_test values(1, ARRAY[1.0,2.0,3.0,1.2345]::float4[]::mvec);
select get_mvec_shape(vec) from vec_test;
select get_mvec_data(vec) from vec_test;
update vec_test set vec=vec+vec;
update vec_test set vec=vec-text_to_mvec('[1,2,3,4]');
select * from vec_test where vec=='[1,2.2,3.123,4.2]';
MorphingDB will support conversion between LibTorch tensors and mvec vectors.
- A Comparative Study of in-Database Inference Approaches
- Learning a Data-Driven Policy Network for Pre-Training Automated Feature Engineering
- Pre-Trained Model Recommendation for Downstream Fine-tuning
- SmartLite: A DBMS-based Serving System for DNN Inference in Resource-constrained Environments