List techniques worth trying to improve summarization accuracy #25

Open

naogify opened this issue Jan 23, 2021 · 5 comments

Collaborator

naogify commented Jan 23, 2021

No description provided.

Collaborator Author

naogify commented Jan 23, 2021 (edited)

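2. Try implementing the following

Ignore words that appear XX times or fewer
Ignore very frequent words as well
Set an upper limit on the number of words used
MMR (Maximal Marginal Relevance)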
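
A rough sketch of the first three filters in plain Python, to make the idea concrete. The function name `prune_vocabulary` and the parameters `rare_max_count`, `max_sentence_ratio`, and `max_vocab` are hypothetical names chosen for illustration, and defining "frequent" as the share of sentences containing the word is an assumption, not something decided in this issue. If scikit-learn is used, `TfidfVectorizer` exposes comparable `min_df`, `max_df`, and `max_features` options.

```python
from collections import Counter


def prune_vocabulary(tokenized_sentences, rare_max_count=1,
                     max_sentence_ratio=0.8, max_vocab=1000):
    """Return the set of words that survive the three filters.

    rare_max_count: words occurring this many times or fewer are ignored.
    max_sentence_ratio: words found in a larger share of sentences are ignored.
    max_vocab: keep at most this many words, most frequent first.
    """
    n_sentences = len(tokenized_sentences)
    # Total occurrences of each word across the whole text.
    term_freq = Counter(w for s in tokenized_sentences for w in s)
    # Number of sentences in which each word appears at least once.
    sent_freq = Counter(w for s in tokenized_sentences for w in set(s))

    kept = [
        w for w, count in term_freq.items()
        if count > rare_max_count
        and sent_freq[w] / n_sentences <= max_sentence_ratio
    ]
    # Most frequent first; ties broken alphabetically for determinism.
    kept.sort(key=lambda w: (-term_freq[w], w))
    return set(kept[:max_vocab])


sentences = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "cat", "ran"],
    ["the", "bird", "flew"],
]
vocab = prune_vocabulary(sentences, rare_max_count=1, max_sentence_ratio=0.8)
```

Here "the" is dropped as too frequent and the words that occur only once are dropped as too rare. The thresholds would need tuning on real transcripts.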

https://qiita.com/naotaka1128/items/bdaff71379a8bf231b64#mmr%E3%81%82%E3%82%8A
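
For reference, a minimal sketch of greedy MMR sentence selection, assuming each sentence and the whole document are already represented as numeric vectors (for example TF-IDF). The names `mmr_select` and `lam` are hypothetical, and this is a generic formulation of MMR, not necessarily the exact variant used in the linked article.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(sentence_vecs, doc_vec, k, lam=0.7):
    """Greedily pick up to k sentence indices.

    Each step maximizes
        lam * sim(sentence, document) - (1 - lam) * max sim(sentence, already picked)
    so lam=1.0 ranks purely by relevance and lower values penalize redundancy more.
    """
    selected = []
    candidates = list(range(len(sentence_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(sentence_vecs[i], doc_vec)
            redundancy = max(
                (cosine(sentence_vecs[i], sentence_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy vectors: sentences 0 and 1 are near-duplicates, sentence 2 is different.
vecs = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
doc = [1.0, 0.5]
picked = mmr_select(vecs, doc, k=2, lam=0.5)
```

With `lam=0.5` the second pick skips the near-duplicate sentence in favor of the dissimilar one, which is the behavior we want for summaries.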

Collaborator Author

naogify commented Jan 23, 2021

Read the code of Recruit's summpy to see what it does
https://github.com/naogify/summpy

Collaborator Author

naogify commented Jan 23, 2021 (edited)

3. Try Word2Vec and compare it with TF-IDF

https://qiita.com/takumi_TKHS/items/4a56ac151c60da8bde4b
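
One way the comparison could be set up, as a sketch only: compute sentence similarity once with TF-IDF vectors and once with averaged word vectors, then look at where they disagree. The tiny `embeddings` table below is made-up toy data standing in for vectors from a trained Word2Vec model, and `tfidf_vectors` and `mean_embedding` are hypothetical helper names.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def tfidf_vectors(tokenized_sentences):
    """One TF-IDF vector per sentence, using a smoothed IDF."""
    vocab = sorted({w for s in tokenized_sentences for w in s})
    n = len(tokenized_sentences)
    sent_freq = Counter(w for s in tokenized_sentences for w in set(s))
    idf = {w: math.log((1 + n) / (1 + sent_freq[w])) + 1.0 for w in vocab}
    vectors = []
    for s in tokenized_sentences:
        tf = Counter(s)
        vectors.append([tf[w] * idf[w] for w in vocab])
    return vectors


def mean_embedding(tokens, embeddings, dim):
    """Average the vectors of known words; zero vector if none are known."""
    known = [embeddings[w] for w in tokens if w in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


# Toy 2-dimensional "word vectors" (made up, not from a real model).
embeddings = {
    "car": [1.0, 0.0],
    "automobile": [0.9, 0.1],
    "banana": [0.0, 1.0],
}
sentences = [["car"], ["automobile"], ["banana"]]

tfidf = tfidf_vectors(sentences)
dense = [mean_embedding(s, embeddings, dim=2) for s in sentences]

tfidf_sim = cosine(tfidf[0], tfidf[1])  # no shared words
dense_sim = cosine(dense[0], dense[1])  # synonyms are close in vector space
```

TF-IDF treats "car" and "automobile" as unrelated because they share no surface form, while the averaged vectors place them close together. Whether that helps on our transcripts is exactly what this item should test.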

Collaborator Author

naogify commented Jan 23, 2021

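Also, remove sentences whose character count is too short.
How should "short" be decided? One idea: do not use the N shortest sentences in the text.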
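
A small sketch of the "skip the N shortest sentences" idea, measuring length in characters. The function name `drop_shortest` and the parameter `n_drop` are hypothetical, and how to choose N is still the open question above.

```python
def drop_shortest(sentences, n_drop):
    """Return the sentences in original order, minus the n_drop shortest ones.

    Length is measured in characters; ties are broken by position so the
    result is deterministic.
    """
    if n_drop <= 0:
        return list(sentences)
    # Indices ordered from shortest to longest sentence.
    order = sorted(range(len(sentences)), key=lambda i: (len(sentences[i]), i))
    dropped = set(order[:n_drop])
    return [s for i, s in enumerate(sentences) if i not in dropped]


sentences = [
    "Yes.",
    "The model summarizes long meeting transcripts.",
    "OK, thanks.",
    "It ranks sentences by TF-IDF score.",
]
remaining = drop_shortest(sentences, n_drop=2)
```

A fixed character threshold would be an alternative, but a rank-based cutoff like this adapts to texts whose sentences are short overall.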

Collaborator Author

naogify commented Jan 23, 2021 (edited)

✅ 1. Try NEologd

Conclusion: in this project, new words are already dropped at the speech-to-text stage that runs before summarization, so we use IPAdic.
