List techniques worth trying to improve summarization accuracy #25

Open
naogify opened this issue Jan 23, 2021 · 5 comments

Comments

@naogify
Collaborator

naogify commented Jan 23, 2021

No description provided.

@naogify
Collaborator Author

naogify commented Jan 23, 2021

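2. Try implementing the following:

  • Ignore words that appear XX times or fewer
  • Ignore very frequent words as well
  • Cap the number of words used
  • MMR (Maximal Marginal Relevance)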

https://qiita.com/naotaka1128/items/bdaff71379a8bf231b64#mmr%E3%81%82%E3%82%8A
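
A rough sketch of how the four items above could fit together, assuming sentences are already tokenized into word lists. The function names and the default thresholds (`min_count`, `max_doc_ratio`, `max_vocab`, `lam`) are placeholders for illustration, not values decided in this issue, and relevance is measured against a simple bag-of-words of the whole text rather than whatever scoring the project currently uses.

```python
import math
from collections import Counter


def build_vocab(tokenized_sentences, min_count=2, max_doc_ratio=0.8, max_vocab=1000):
    """Choose the words to keep: drop rare and very frequent words, then cap the size."""
    term_freq = Counter(w for sent in tokenized_sentences for w in sent)
    doc_freq = Counter(w for sent in tokenized_sentences for w in set(sent))
    n = len(tokenized_sentences)
    kept = [
        w for w, c in term_freq.items()
        if c >= min_count and doc_freq[w] / n <= max_doc_ratio
    ]
    # Keep the most frequent survivors; ties are broken alphabetically.
    kept.sort(key=lambda w: (-term_freq[w], w))
    return set(kept[:max_vocab])


def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(tokenized_sentences, vocab, k=2, lam=0.7):
    """Greedy MMR: return k sentence indices balancing relevance against redundancy."""
    vecs = [Counter(w for w in sent if w in vocab) for sent in tokenized_sentences]
    doc = Counter()
    for v in vecs:
        doc.update(v)

    selected = []
    candidates = list(range(len(vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(vecs[i], doc)
            redundancy = max((cosine(vecs[i], vecs[j]) for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With `lam` close to 1 the selection behaves like plain relevance ranking; lowering it penalizes sentences that repeat what has already been picked.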

@naogify
Collaborator Author

naogify commented Jan 23, 2021

Read the code of Recruit's summpy to see what it does
https://github.com/naogify/summpy

@naogify
Collaborator Author

naogify commented Jan 23, 2021

3. Try Word2Vec and compare it with TF-IDF

https://qiita.com/takumi_TKHS/items/4a56ac151c60da8bde4b
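
One way the comparison might be set up: compute sentence similarity once with TF-IDF vectors and once with averaged word vectors, then look at where they disagree. This is only a sketch. The `embeddings` dict below is a hand-made stand-in for vectors from a trained Word2Vec model, and the smoothed IDF formula is one common variant, not necessarily the one this project uses.

```python
import math
from collections import Counter


def tfidf_vectors(tokenized_sentences):
    """One sparse TF-IDF dict per sentence, treating each sentence as a document."""
    n = len(tokenized_sentences)
    df = Counter(w for sent in tokenized_sentences for w in set(sent))
    vectors = []
    for sent in tokenized_sentences:
        tf = Counter(sent)
        # Smoothed IDF so that no weight becomes zero or undefined.
        vectors.append({w: c * (math.log((1 + n) / (1 + df[w])) + 1) for w, c in tf.items()})
    return vectors


def mean_embedding(tokens, embeddings):
    """Average the vectors of known words; returns None if no word is known."""
    known = [embeddings[w] for w in tokens if w in embeddings]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[d] for v in known) / len(known) for d in range(dim)]


def cosine_sparse(a, b):
    """Cosine similarity between two dict-based sparse vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cosine_dense(a, b):
    """Cosine similarity between two equal-length lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical toy vectors; real ones would come from a trained Word2Vec model.
embeddings = {
    "car": [1.0, 0.0], "automobile": [0.9, 0.1],
    "fast": [0.0, 1.0], "quick": [0.1, 0.9],
}
s1, s2 = ["car", "fast"], ["automobile", "quick"]

tfidf_a, tfidf_b = tfidf_vectors([s1, s2])
tfidf_sim = cosine_sparse(tfidf_a, tfidf_b)
w2v_sim = cosine_dense(mean_embedding(s1, embeddings), mean_embedding(s2, embeddings))
print(f"TF-IDF: {tfidf_sim:.2f}  averaged embeddings: {w2v_sim:.2f}")
```

The two sentences share no surface words, so TF-IDF treats them as unrelated, while the averaged vectors treat them as near-duplicates. Whether that helps or hurts summary quality is what the comparison would need to show.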

@naogify
Collaborator Author

naogify commented Jan 23, 2021

Also, drop sentences that are too short.
How should "short" be decided? One option: exclude the N shortest sentences in the text.
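
A minimal sketch of the "exclude the N shortest" idea, assuming length is measured in characters. `drop_shortest` and its `n_drop` parameter are hypothetical names, and N itself is still undecided here.

```python
def drop_shortest(sentences, n_drop=2):
    """Remove the n_drop shortest sentences while keeping the original order."""
    if n_drop <= 0:
        return list(sentences)
    # sorted() is stable, so among equally short sentences the earlier one is dropped first.
    by_length = sorted(range(len(sentences)), key=lambda i: len(sentences[i]))
    dropped = set(by_length[:n_drop])
    return [s for i, s in enumerate(sentences) if i not in dropped]


sentences = [
    "Yes.",
    "Today we discuss summarization accuracy.",
    "Um, well.",
    "Sentences are scored with TF-IDF.",
]
filtered = drop_shortest(sentences, n_drop=2)
```

A fixed N removes the same number of sentences from short and long texts alike, so a ratio of the sentence count or an absolute character threshold may be worth comparing.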

@naogify
Collaborator Author

naogify commented Jan 23, 2021

✅ 1. Try NEologd

Conclusion: in this project, new words are already dropped at the speech-to-text step that precedes summarization, so we use IPAdic.
