CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling
- [Sep. 2024]: Our psychological counseling report dataset CPsyCounR is now available upon reasonable request after signing the Privacy Data Protection Agreement.
- [Jul. 2024]: Paper presentations: Report | Long talk with Shanghai AI Lab | Short talk with AI TIME
- [Jul. 2024]: We collaborated with the EmoLLM team to launch EmoLLM V3.0, which was fully fine-tuned from the model InternLM2.5-7B-Chat on the dataset CPsyCounD. Model weights: OpenXLab, ModelScope. WebDemo: OpenXLab demo.
- [May. 2024]: Our paper has been released on arXiv. Check it out!
- [May. 2024]: CPsyCoun has been accepted to Findings of ACL 2024!
- [Apr. 2024]: CPsyCoun is now used in EmoLLM. Feel free to try it!
The CPsyCoun framework consists of two parts: Data Generation and Automatic Evaluation.
The Memo2Demo method consists of two steps, Memo Conversion and Demo Generation, which together generate high-quality psychological consultation dialogues from counseling reports.
According to China’s National Class II Psychological Counselor Examination and other psychological counseling literature, the counseling report is normalized into six parts: Title, Type, Method, Case Brief, Consultation Process, and Experience Thoughts.
- An example of a counseling report
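The six-part layout above can be pictured as a simple record type. The following is a minimal sketch only: the class name, field names, and the `missing_parts` helper are illustrative assumptions, not the schema used in the released CPsyCounR data.

```python
from dataclasses import dataclass, fields


@dataclass
class CounselingReport:
    """Six-part normalized counseling report (names are hypothetical)."""
    title: str
    report_type: str            # the "Type" part
    method: str
    case_brief: str
    consultation_process: str
    experience_thoughts: str


def missing_parts(report: CounselingReport) -> list[str]:
    """Return the names of parts that are empty or whitespace-only."""
    return [f.name for f in fields(report) if not getattr(report, f.name).strip()]
```

A check like `missing_parts(report)` could be run before passing a report on to dialogue generation, so that incomplete reports are caught early.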
CPsyCounD is a high-quality multi-turn dialogue dataset containing a total of 3,134 consultation dialogues.
- For more details, please refer to CPsyCounD.
- CPsyCounD in LLaMA-Factory form is open-sourced at HuggingFace.
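For readers unfamiliar with the format, LLaMA-Factory's Alpaca-style records store the final exchange in `instruction`/`output` and the earlier exchanges in `history`. The sketch below shows how one multi-turn dialogue might be packed into such a record. It is an assumption that the released file uses exactly this layout, and `to_alpaca_record` is a hypothetical helper, not part of this repository.

```python
import json


def to_alpaca_record(turns):
    """Pack (client, counselor) turns into one Alpaca-style record.

    The last turn becomes instruction/output; earlier turns become history.
    """
    if not turns:
        raise ValueError("a dialogue needs at least one turn")
    *history, (query, response) = turns
    return {
        "instruction": query,
        "input": "",
        "output": response,
        "history": [list(pair) for pair in history],
    }


if __name__ == "__main__":
    # Placeholder utterances, not taken from the dataset.
    demo = [("client turn 1", "counselor turn 1"),
            ("client turn 2", "counselor turn 2")]
    print(json.dumps(to_alpaca_record(demo), ensure_ascii=False, indent=2))
```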
- Comprehensiveness
- The degree to which the client’s situation and psychological problems are reflected in the dialogues.
- Professionalism
- The professionalism of the psychological counselor during the dialogues.
- Authenticity
- The degree of authenticity of the interaction between the client and the counselor in the dialogues.
- Safety
- The degree of privacy protection of clients.
- The scoring criteria for each evaluation metric
An approach to effectively evaluate multi-turn consultation dialogues.
Denote a
where
Then, we employ an LLM to assess these responses using the evaluation metrics. The model to assign an evaluation score
- For more details, please refer to the Code.
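As a rough illustration of scoring a multi-turn dialogue response by response, the sketch below splits a dialogue into single-turn items (client query plus preceding history), obtains a response for each, asks a judge for a score, and averages the scores. The splitting scheme, the use of reference history, and the averaging are assumptions made for this example; `generate` and `judge` are hypothetical callables standing in for the evaluated model and the LLM evaluator, and the released Code remains the reference.

```python
from statistics import mean
from typing import Callable

Turn = tuple[str, str]  # (client query, counselor response)


def split_into_single_turns(dialogue: list[Turn]) -> list[tuple[list[Turn], str]]:
    """For each turn, pair the preceding reference turns with the client query."""
    return [(dialogue[:i], query) for i, (query, _) in enumerate(dialogue)]


def evaluate_dialogue(
    dialogue: list[Turn],
    generate: Callable[[list[Turn], str], str],
    judge: Callable[[list[Turn], str, str], float],
) -> float:
    """Average the judge's per-turn scores over all turns of one dialogue."""
    if not dialogue:
        raise ValueError("dialogue must contain at least one turn")
    scores = []
    for history, query in split_into_single_turns(dialogue):
        response = generate(history, query)             # model under evaluation
        scores.append(judge(history, query, response))  # LLM-based scorer
    return mean(scores)
```

In practice `judge` would be called once per evaluation metric, giving one averaged score per metric for each dialogue.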
CPsyCounE is a general multi-turn dialogue evaluation dataset covering nine topics.
- For more details, please refer to CPsyCounE.
- Statistics of generated dialogues
- The results of intrinsic evaluation
We further fine-tune InternLM2-7B-Chat on CPsyCounD. CPsyCounX is fine-tuned for 9 epochs with the batch size set to 448, and the learning rate set to
- For more details, please refer to the Code.
- CPsyCounX is open-sourced at HuggingFace.
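A global batch size of 448 is usually reached by combining a per-device batch size, a number of devices, and gradient accumulation. The helper below sketches that arithmetic. The per-device batch size and device count in the example are hypothetical values chosen for illustration, not the configuration used to train CPsyCounX.

```python
def grad_accum_steps(global_batch: int, per_device_batch: int, num_devices: int) -> int:
    """Gradient-accumulation steps needed to reach the global batch size."""
    per_step = per_device_batch * num_devices
    if per_step <= 0 or global_batch % per_step != 0:
        raise ValueError("global batch must be a positive multiple of "
                         "per_device_batch * num_devices")
    return global_batch // per_step


if __name__ == "__main__":
    # Hypothetical setup: 8 devices with 8 samples each per forward pass.
    print(grad_accum_steps(448, per_device_batch=8, num_devices=8))
```

With those example values, each optimizer step accumulates gradients over 7 forward passes of 64 samples.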
- The average results of extrinsic evaluation
- Radar plot of detailed scores of CPsyCounX and other baselines
- The full results of extrinsic evaluation
If you find our work helpful in your research, please cite the following paper:
@inproceedings{zhang-etal-2024-cpsycoun,
title="{CP}sy{C}oun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for {C}hinese Psychological Counseling",
author="Zhang, Chenhao and Li, Renhao and Tan, Minghuan and Yang, Min and Zhu, Jingwei and Yang, Di and Zhao, Jiahao and Ye, Guancheng and Li, Chengming and Hu, Xiping",
booktitle={Findings of the Association for Computational Linguistics: ACL 2024},
year={2024}
}