Feat: Add support for rendering records as LLM documents #48

Merged
merged 25 commits into from
Feb 27, 2024

Conversation

aaronsteers
Contributor

@aaronsteers aaronsteers commented Feb 17, 2024

Related to:

This PR adds a new Document class, modeled after the LangChain Document class, which should be compatible with similar LLM paradigms. The content (or page_content) is a stringification of the record as a document, and metadata collects all supporting fields that are not part of the primary content.
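
To illustrate the content/metadata split, here is a minimal sketch. It is not the code from this PR: the `Document` dataclass, the `record_to_document` helper, and the `content_fields` parameter are hypothetical names chosen for illustration, and the `name: value` rendering is just one possible stringification.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Document:
    """Hypothetical stand-in, loosely shaped like LangChain's Document."""

    content: str
    metadata: dict[str, Any] = field(default_factory=dict)


def record_to_document(record: dict[str, Any], content_fields: list[str]) -> Document:
    """Render the chosen fields as text; every other field becomes metadata."""
    # Assumption: one "name: value" line per content field, in the order given.
    content = "\n".join(
        f"{name}: {record[name]}" for name in content_fields if name in record
    )
    # Fields that are not part of the primary content are kept as metadata.
    metadata = {key: value for key, value in record.items() if key not in content_fields}
    return Document(content=content, metadata=metadata)


record = {"id": 7, "title": "Fix login bug", "body": "Users cannot sign in.", "state": "open"}
doc = record_to_document(record, content_fields=["title", "body"])
```

In this sketch, `title` and `body` end up in `doc.content`, while `id` and `state` are carried in `doc.metadata`.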

Later, we can add top-level id and last_updated fields, which will help with deduplication.

The new interfaces are:

  • Source.get_documents(stream_name, ...) - a sibling to the existing method Source.get_records(stream_name, ...). Instead of returning an iterable of record dictionaries, it returns an iterable of document objects.
  • Dataset.to_documents(...) - a sibling to the existing methods Dataset.to_sql_table() and Dataset.to_pandas().
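
The sibling-method pattern described in the list above can be sketched roughly as follows. This is a toy model, not the library's API: `ToyDataset`, `ToyDocument`, and the `content_field` parameter are hypothetical, and the real signatures are not shown in this PR description.

```python
from typing import Any, Iterable, Iterator, NamedTuple


class ToyDocument(NamedTuple):
    """Hypothetical document type with the content/metadata split."""

    content: str
    metadata: dict[str, Any]


class ToyDataset:
    """Hypothetical dataset exposing records and a sibling document view."""

    def __init__(self, records: Iterable[dict[str, Any]]) -> None:
        self._records = list(records)

    def __iter__(self) -> Iterator[dict[str, Any]]:
        # Existing-style access: iterate over plain record dictionaries.
        return iter(self._records)

    def to_documents(self, content_field: str) -> Iterator[ToyDocument]:
        # Sibling access: lazily yield one document object per record.
        for record in self._records:
            metadata = {k: v for k, v in record.items() if k != content_field}
            yield ToyDocument(content=str(record[content_field]), metadata=metadata)


dataset = ToyDataset([{"id": 1, "text": "hello"}, {"id": 2, "text": "world"}])
records = list(dataset)
documents = list(dataset.to_documents(content_field="text"))
```

Both views walk the same underlying records; only the shape of each yielded item differs.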

For comparison, please see the LangChain Document class.

Note:

@aaronsteers aaronsteers changed the title [Draft] Feat: Add support for rendering as documents Feat: Add support for rendering as documents Feb 27, 2024
@aaronsteers aaronsteers changed the title Feat: Add support for rendering as documents Feat: Add support for rendering records as LLM documents Feb 27, 2024
@aaronsteers aaronsteers marked this pull request as ready for review February 27, 2024 22:26
@aaronsteers aaronsteers merged commit 8ebc28f into main Feb 27, 2024
11 checks passed
@aaronsteers aaronsteers deleted the aj/feat/render-records-as-documents branch February 27, 2024 22:44
@aaronsteers aaronsteers self-assigned this Mar 11, 2024