bring your own embeddings #149

Open
ChuckHend opened this issue Oct 11, 2024 · 3 comments
Labels: 💎 Bounty · documentation (Improvements or additions to documentation) · hacktoberfest

Comments

@ChuckHend
Member

Provide a feature or tooling that lets a user take embeddings from an existing table and have pg_vectorize manage them. For example, assume a user already has a table with a content column and an embeddings column generated from the sentence-transformers/all-MiniLM-L6-v2 model. Rather than recomputing embeddings for the entire content column, we should be able to insert the existing ones into the new embeddings table or column. I think it would be safe and fairly straightforward to manually insert embeddings into vectorize.<project_name>_embeddings after the project is created. If the project uses schedule => 'realtime', creating a new project on a table immediately creates jobs to generate embeddings for all of the text, so we might want to delete those jobs if we don't want them to run. In summary, I think the steps could be:

  1. create the vectorize project by calling vectorize.table()
  2. insert the existing embeddings into the embedding column on vectorize.<project_name>_embeddings
  3. optionally delete the pending jobs from pgmq where message ->> 'name' = '<project_name>'
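
The three steps above could be sketched roughly as follows. This is only an illustration, not a confirmed procedure: the small Python helper below just assembles the SQL text for each step so it can be reviewed before running anything. Several names are assumptions that the issue does not confirm: the named arguments passed to vectorize.table() (job_name, relation, primary_key, columns, transformer), the target column name embeddings, and the queue table pgmq.q_vectorize_jobs. The helper name byo_embeddings_sql and the example table docs are hypothetical. Check each against the installed pg_vectorize and pgmq versions.

```python
import re

_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def _ident(name):
    """Accept only plain SQL identifiers, because they are pasted into SQL text."""
    if not _IDENT.match(name):
        raise ValueError(f"not a plain identifier: {name!r}")
    return name


def byo_embeddings_sql(project, source_table, primary_key, content_column,
                       source_embedding_column,
                       target_embedding_column="embeddings",
                       transformer="sentence-transformers/all-MiniLM-L6-v2"):
    """Return the SQL for the three steps from the issue, in order (hypothetical helper)."""
    project = _ident(project)
    source_table = _ident(source_table)
    primary_key = _ident(primary_key)
    content_column = _ident(content_column)
    src = _ident(source_embedding_column)
    dst = _ident(target_embedding_column)
    if "'" in transformer:
        raise ValueError("transformer must not contain a single quote")

    # Step 1: create the project. The argument names are ASSUMED and may
    # differ between pg_vectorize releases.
    create_project = (
        "SELECT vectorize.table(\n"
        f"    job_name    => '{project}',\n"
        f"    relation    => '{source_table}',\n"
        f"    primary_key => '{primary_key}',\n"
        f"    columns     => ARRAY['{content_column}'],\n"
        f"    transformer => '{transformer}',\n"
        "    schedule    => 'realtime'\n"
        ");"
    )

    # Step 2: copy the precomputed embeddings into the project's table.
    # The table name follows the issue; the column name is an ASSUMPTION.
    copy_embeddings = (
        f"INSERT INTO vectorize.{project}_embeddings ({primary_key}, {dst})\n"
        f"SELECT {primary_key}, {src} FROM {source_table};"
    )

    # Step 3 (optional): drop the queued jobs so nothing is recomputed.
    # The queue table name is an ASSUMPTION; the filter follows the issue.
    drop_jobs = (
        "DELETE FROM pgmq.q_vectorize_jobs\n"
        f"WHERE message ->> 'name' = '{project}';"
    )
    return [create_project, copy_embeddings, drop_jobs]


if __name__ == "__main__":
    for statement in byo_embeddings_sql("my_project", "docs", "id", "content", "embedding"):
        print(statement, end="\n\n")
```

If the project has already written rows for some keys before step 2 runs, the plain INSERT would likely need an ON CONFLICT clause or an UPDATE instead; which one applies depends on the actual definition of the embeddings table.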
@ChuckHend ChuckHend added the documentation Improvements or additions to documentation label Oct 11, 2024

algora-pbc bot commented Oct 17, 2024

💎 $150 bounty • Tembo

Steps to solve:

  1. Start working: Comment /attempt #149 with your implementation plan
  2. Submit work: Create a pull request including /claim #149 in the PR body to claim the bounty
  3. Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Thank you for contributing to tembo-io/pg_vectorize!


Attempt: 🟢 @onyedikachi-david · Started (GMT+0): Oct 17, 2024, 2:12:37 PM · Solution: WIP

@onyedikachi-david

onyedikachi-david commented Oct 17, 2024

/attempt #149

Algora profile: @onyedikachi-david · Completed bounties: 10 bounties from 5 projects · Tech: TypeScript, Python, JavaScript & more

@onyedikachi-david

Can I get assigned? @ChuckHend


3 participants