Topic: History of European countries. 5000-word article covering all history in the European countries.
root : ERROR : Error occurs when searching query : 'hits'
root : ERROR : Error occurs when searching query : 'hits'
root : ERROR : Error occurs when searching query : 'hits'
root : ERROR : Error occurs when searching query : 'hits'
root : ERROR : Error occurs when searching query : 'hits'
root : ERROR : Error occurs when searching query : 'hits'
root : ERROR : Error occurs when searching query : 'hits'
root : ERROR : Error occurs when searching query : 'hits'
root : ERROR : Error occurs when searching query : 'hits'
root : ERROR : Error occurs when searching query : 'hits'
root : ERROR : Error occurs when searching query : 'hits'
root : ERROR : Error occurs when searching query : 'hits'
root : ERROR : Error occurs when searching query : 'hits'
root : ERROR : Error occurs when searching query : 'hits'
knowledge_storm.interface : INFO : run_knowledge_curation_module executed in 155.8709 seconds
knowledge_storm.interface : INFO : run_outline_generation_module executed in 7.7600 seconds
sentence_transformers.SentenceTransformer : INFO : Use pytorch device_name: cpu
sentence_transformers.SentenceTransformer : INFO : Load pretrained SentenceTransformer: paraphrase-MiniLM-L6-v2
knowledge_storm.interface : INFO : run_article_generation_module executed in 31.8236 seconds
knowledge_storm.interface : INFO : run_article_polishing_module executed in 7.7724 seconds
***** Execution time *****
run_knowledge_curation_module: 155.8709 seconds
run_outline_generation_module: 7.7600 seconds
run_article_generation_module: 31.8236 seconds
run_article_polishing_module: 7.7724 seconds
***** Token usage of language models: *****
run_knowledge_curation_module
claude-3-haiku-20240307: {'prompt_tokens': 108660, 'completion_tokens': 24109}
claude-3-5-sonnet-20240620: {'prompt_tokens': 0, 'completion_tokens': 0}
run_outline_generation_module
claude-3-haiku-20240307: {'prompt_tokens': 7249, 'completion_tokens': 824}
claude-3-5-sonnet-20240620: {'prompt_tokens': 0, 'completion_tokens': 0}
run_article_generation_module
claude-3-haiku-20240307: {'prompt_tokens': 0, 'completion_tokens': 0}
claude-3-5-sonnet-20240620: {'prompt_tokens': 2312, 'completion_tokens': 1086}
run_article_polishing_module
claude-3-haiku-20240307: {'prompt_tokens': 0, 'completion_tokens': 0}
claude-3-5-sonnet-20240620: {'prompt_tokens': 1297, 'completion_tokens': 299}
The generated outline was detailed and lengthy, but the article itself did not cover the whole outline and is only 650-750 words long.
Increasing the hyperparameters did not increase the article length.
# hyperparameters for the pre-writing stage
parser.add_argument('--max-conv-turn', type=int, default=6,
                    help='Maximum number of questions in conversational question asking.')
parser.add_argument('--max-perspective', type=int, default=6,
                    help='Maximum number of perspectives to consider in perspective-guided question asking.')
parser.add_argument('--search-top-k', type=int, default=6,
                    help='Top k search results to consider for each search query.')
# hyperparameters for the writing stage
parser.add_argument('--retrieve-top-k', type=int, default=7,
                    help='Top k collected references for each section title.')
parser.add_argument('--remove-duplicate', action='store_true',
                    help='If True, remove duplicate content from the article.')
To Reproduce
Report following things
Input topic name: History of European countries. 5000-word article covering all history in the European countries.
All output files generated for this topic as a zip file. (Output file attached as zip)
Screenshots
If applicable, add screenshots to help explain your problem.
Thanks for providing detailed information! There are several factors that affect the generated article length:
In the experiment setting from our paper, to avoid making each section or subsection too short (and to align with the Wikipedia writing style), we use only the first-level section names from the outline generation stage to guide article generation. See code snippet here. To mitigate this problem, you can customize StormArticleGenerationModule here.
We generate the article section by section. For each section, we first retrieve the top K pieces of information collected during the knowledge curation stage. You can increase the hyperparameter retrieve_top_k to include more information. Corresponding code snippet here. Additionally, because of the capacity of LMs at the time we conducted the experiment, we hard-coded the context length to 1500 (code snippet here). It should be a parameter, but it is hard-coded for now.
We plan to have an upgrade in the coming week, which may mitigate this issue. Exact date TBD.
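To make the two length-limiting factors above concrete, here is a minimal, self-contained sketch. It is not the actual knowledge_storm code: the function names `first_level_sections` and `build_section_context`, the markdown-style outline format, and the choice to count the 1500 limit in words are all assumptions made for illustration.

```python
from typing import List


def first_level_sections(outline: str) -> List[str]:
    """Keep only top-level ('# ') headings from a markdown-style outline.

    Hypothetical helper: mimics the idea that only first-level section
    names guide article generation, so nested headings are dropped.
    """
    return [
        line[2:].strip()
        for line in outline.splitlines()
        if line.startswith("# ")
    ]


def build_section_context(
    snippets: List[str], retrieve_top_k: int, max_words: int = 1500
) -> str:
    """Join the top-k snippets and truncate the result to max_words words.

    Hypothetical helper: the limit is exposed as a parameter instead of
    being hard-coded. Counting in words is an assumption; the unit of
    the real limit is not stated in this thread.
    """
    selected = snippets[:retrieve_top_k]
    words = "\n".join(selected).split()
    return " ".join(words[:max_words])


if __name__ == "__main__":
    outline = (
        "# Ancient Europe\n"
        "## Greece\n"
        "## Rome\n"
        "# Middle Ages\n"
        "## Feudalism\n"
    )
    # Only two of the five outline entries survive.
    print(first_level_sections(outline))

    snippets = ["alpha beta gamma", "delta epsilon", "zeta"]
    # Raising retrieve_top_k has no effect once max_words is the bottleneck.
    print(build_section_context(snippets, retrieve_top_k=2, max_words=4))
    print(build_section_context(snippets, retrieve_top_k=3, max_words=4))
```

In this sketch, a detailed outline collapses to its top-level headings, and a fixed word budget caps the context regardless of how many snippets are retrieved. That would be consistent with the report that raising the hyperparameters did not lengthen the article.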
Describe the bug
Tried to generate a 5000-word article with Claude Haiku and Claude Sonnet. Settings for the token:
Full log:
Environment:
History_of_European_countries._5000-word_article_covering_all_history_in_the_European_countries.zip