Finetuning LLM models with PDF documents in h2o-llmstudio #719

Answered by psinger
sunilswain asked this question in Q&A

Hi,

you would first need to generate input/output pairs from your documents. I already answered a similar question here, which might be helpful:
#522

If you just want to do next-token training on the text of your PDFs, you would need to convert that text into a CSV file with raw text and follow this: https://docs.h2o.ai/h2o-llmstudio/faqs#what-if-my-data-is-not-in-question-and-answer-form-and-i-just-have-documents-how-can-i-fine-tune-the-llm-model
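As a rough illustration of the PDF-to-CSV step, here is a minimal sketch. It is not taken from the LLM Studio docs: the column name `text`, the 200-word chunk size, and the helper names are all assumptions chosen for the example, and `pypdf` is a third-party package that is assumed to be installed. How the resulting column should be selected in LLM Studio is covered in the linked FAQ.

```python
import csv


def chunk_text(text, max_words=200):
    """Split raw text into chunks of at most max_words whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def write_text_csv(texts, csv_path, column="text", max_words=200):
    """Write one CSV row per text chunk and return the number of data rows written.

    The column name "text" is an assumption for this sketch, not a required name.
    """
    rows = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow([column])  # header row
        for text in texts:
            for chunk in chunk_text(text, max_words):
                writer.writerow([chunk])
                rows += 1
    return rows


def pdf_to_text(pdf_path):
    """Extract the raw text of a PDF with pypdf (third-party, assumed installed)."""
    from pypdf import PdfReader  # imported lazily so the CSV helpers work without it

    reader = PdfReader(str(pdf_path))
    # extract_text() can return an empty result for image-only pages, hence the `or ""`
    return "\n".join((page.extract_text() or "") for page in reader.pages)
```

With a folder of PDFs, this could be used roughly as `write_text_csv([pdf_to_text(p) for p in Path("pdfs").glob("*.pdf")], "train.csv")` (with `from pathlib import Path`; the folder and file names are placeholders). Chunking by word count is only one simple option; splitting on paragraphs or sections may suit some documents better.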

Replies: 1 comment

Answer selected by sunilswain