Questions regarding the net size of the embedding and fitting layers. #1609
Unanswered
jaejae9804 asked this question in Q&A
Replies: 1 comment · 1 reply
-
I believe we've never recommended any specific configuration anywhere. These parameters are usually system-specific, and I don't think any single configuration can suit every system. For this specific example, I noticed it was added in #190. Ping @felix5572 for the details.
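For reference, the troubleshooting page linked in the question lists the sizes the other way around. Below is a minimal sketch of a standalone DeePMD-kit model section using those sizes; it only swaps the two neuron lists and keeps every other value from the param.json posted in this thread unchanged, so treat it as an illustration rather than a recommendation for Sn:

{
    "model": {
        "type_map": ["Sn"],
        "descriptor": {
            "type": "se_a",
            "sel": [100],
            "rcut_smth": 2.0,
            "rcut": 8.0,
            "neuron": [25, 50, 100],
            "resnet_dt": true,
            "axis_neuron": 12,
            "seed": 1
        },
        "fitting_net": {
            "neuron": [240, 240, 240],
            "resnet_dt": false,
            "seed": 1
        }
    }
}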
-
Thanks in advance for checking out my discussion.
Currently, I'm working on a DP-GEN process to develop a Sn DP model. For the dpgen run, I used the param.json file provided in the dpgen-example folder (the Al case), with some parameters modified. That param.json sets the net sizes to [240, 240, 240] and [25, 50, 100] for the embedding and fitting layers, respectively (I've attached the corresponding param.json below), so all of the training data were generated with these settings. However, when it comes to training a single DP model with DeePMD-kit, most documents and papers use the opposite arrangement ([25, 50, 100] for the embedding net and [240, 240, 240] for the fitting net). On top of that, the docs (https://docs.deepmodeling.com/projects/deepmd/en/r2/troubleshooting/howtoset_netsize.html#) recommend the latter.
My question is: given that all the DP-GEN data were generated with the former settings, is it okay to train the final single DP model in DeePMD-kit using the latter settings?
Also, I see that DeePMD-kit already has recommendations on how to set the net size when training a model. Are there any recommendations or criteria for setting the net size of the DP-GEN trial models? Or, since the DP-GEN procedure uses DeePMD-kit to train the ensemble models, should the net settings for DP-GEN and for the final DeePMD-kit training simply be the same?
{
"type_map": ["Sn"],
"mass_map": [118.71],
"init_data_prefix": "/root/Sn/dpgen/init/",
"init_data_sys": [...],
"sys_configs_prefix": "/root/Sn/dpgen/init",
"sys_configs": [...],
"_comment1": " 00.train ",
"numb_models": 4,
"default_training_param": {
"model": {
"_comment2": " model parameters",
"type_map": ["Sn"],
"descriptor": {
"type": "se_a",
"sel": [100],
"rcut_smth": 2.0,
"rcut": 8.0,
"neuron": [240,240,240],
"resnet_dt": true,
"axis_neuron": 12,
"seed": 1
},
"fitting_net": {
"neuron": [25,50,100],
"resnet_dt": false,
"seed": 1
}
},
"learning_rate": {
"type": "exp",
"start_lr": 0.001,
"decay_steps": 2000,
"stop_lr": 0.0000000351
},
"loss":{
"start_pref_e": 0.02,
"limit_pref_e": 2,
"start_pref_f": 1000,
"limit_pref_f": 1,
"start_pref_v": 0.02,
"limit_pref_v": 2
},
"training": {
"_comment3": " traing controls",
"numb_steps": 400000,
"seed": 0,
"_comment4": " display and restart",
"_comment5": " frequencies counted in batch",
"disp_file": "lcurve.out",
"disp_freq": 2000,
"save_freq": 2000,
"save_ckpt": "model.ckpt",
"disp_training": true,
"time_training": true,
"profiling": false,
"profiling_file": "timeline.json",
"_comment6": "that's all",
"training_data": {
"systems": [],
"set_prefix": "set",
"batch_size": 1
}
}
},
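If the DP-GEN trial models were also to use the documentation-style sizes, only the two neuron lists inside default_training_param would need to change; everything else in the param.json above could stay as it is. A hypothetical fragment showing just the affected keys (not a complete file):

"default_training_param": {
    "model": {
        "descriptor": {
            "neuron": [25, 50, 100]
        },
        "fitting_net": {
            "neuron": [240, 240, 240]
        }
    }
}

Whether that is preferable for Sn is exactly the open question here; per the reply above, these sizes are system-specific.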