Release v2.5.1
Changes:
integrate deepspeed to train LLM models
2.Distributed GPU resource management
3.enhance cluster high availability
Changes:
integrate deepspeed to train LLM models
2.Distributed GPU resource management
3.enhance cluster high availability