Better File Chunk | More Powerful File Chunking #3550
arvinxx
started this conversation in
General | Discussion
Replies: 4 comments 11 replies
-
Excel is arguably a must-have for a knowledge base. It could be converted to HTML and then chunked by reusing the HTML chunking.
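A minimal sketch of the Excel-to-HTML idea above, not the project's implementation. It assumes the sheet has already been read into a list of rows, with the first row as the header (for example by a spreadsheet library; that step is left out). The function name `rows_to_html_chunks` and the `rows_per_chunk` parameter are hypothetical. Each chunk is a standalone HTML table that repeats the header, so a retrieved chunk still carries its column names.

```python
from html import escape


def rows_to_html_chunks(rows, rows_per_chunk=50):
    """Split tabular rows into standalone HTML tables, repeating the header row."""
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    head_html = "<tr>" + "".join(f"<th>{escape(str(c))}</th>" for c in header) + "</tr>"
    chunks = []
    # max(..., 1) makes a header-only sheet still produce one chunk.
    for start in range(0, max(len(body), 1), rows_per_chunk):
        part = body[start:start + rows_per_chunk]
        body_html = "".join(
            "<tr>" + "".join(f"<td>{escape(str(c))}</td>" for c in row) + "</tr>"
            for row in part
        )
        chunks.append(f"<table>{head_html}{body_html}</table>")
    return chunks


if __name__ == "__main__":
    sheet = [["name", "qty"], ["apple", 3], ["pear", 5], ["plum", 7]]
    for chunk in rows_to_html_chunks(sheet, rows_per_chunk=2):
        print(chunk)
```

Repeating the header costs a few tokens per chunk, but without it a chunk from the middle of a large sheet is just unlabeled cell values.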
-
Planning to add support for Lrc/Lrcx, so that lyrics files can be analyzed.
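One way lyric chunking could look, as a rough sketch rather than the planned implementation. It handles only plain line-level LRC timestamps of the form `[mm:ss.xx]`; Lrcx extensions are not covered. The function names `parse_lrc` and `chunk_by_window` and the 30-second window are assumptions made for illustration.

```python
import re

# A lyric line: one or more leading [mm:ss.xx] tags, then the text.
LINE = re.compile(r"^((?:\[\d+:\d+(?:\.\d+)?\])+)(.*)$")
STAMP = re.compile(r"\[(\d+):(\d+(?:\.\d+)?)\]")


def parse_lrc(text):
    """Return (seconds, lyric) pairs sorted by time."""
    entries = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # blank lines and metadata tags such as [ar:Artist]
        lyric = match.group(2).strip()
        # A line may carry several timestamps when the lyric repeats.
        for minutes, seconds in STAMP.findall(match.group(1)):
            entries.append((int(minutes) * 60 + float(seconds), lyric))
    entries.sort(key=lambda entry: entry[0])
    return entries


def chunk_by_window(entries, window_seconds=30.0):
    """Group lyric lines into fixed-length time windows, one chunk per window."""
    windows = {}
    for seconds, lyric in entries:
        if lyric:
            windows.setdefault(int(seconds // window_seconds), []).append(lyric)
    return ["\n".join(windows[key]) for key in sorted(windows)]


if __name__ == "__main__":
    sample = "[ar:Someone]\n[00:01.00]hello\n[00:12.50]world\n[00:31.00][01:05.00]again\n"
    for chunk in chunk_by_window(parse_lrc(sample)):
        print(repr(chunk))
```

Fixed time windows are the simplest choice. Splitting on long gaps between timestamps would follow verse boundaries more closely, at the cost of uneven chunk sizes.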
-
Could Typst be supported? It is a relatively new markup language and something of a competitor to LaTeX.
-
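Background
In RAG, retrieval and querying only work well once files have been chunked sensibly. However, there are a great many file types in use, and the first phase only implements chunking support for some of them.
Chunk types currently supported:
Plain text:
Code:
Rich text:
Tables:
Audio:
Video:
If you need chunking support for a particular file type, please leave a comment below and describe how you envision files of that type being chunked.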