- Adding `norm_to_scale_identity_weight_per_block` to the `multiply` and `update_cache` methods of the estimator, which allows `identity_weight` to be scaled differently for each block according to a norm (or norm-like function) of that block's curvature. #871
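
To illustrate the idea behind this entry, here is a minimal NumPy sketch of per-block scaling of the identity weight (damping). It is not the library's implementation: the helper names `scaled_identity_weights`, `damped_block_solve`, and `mean_trace` are hypothetical, and the choice of `trace / dim` as the norm-like function is only an assumption made for the example.

```python
import numpy as np


def mean_trace(block):
    # Hypothetical norm-like function: average diagonal entry of the block.
    return np.trace(block) / block.shape[0]


def scaled_identity_weights(blocks, identity_weight, norm_fn):
    # One damping value per block: the shared identity_weight times that
    # block's norm, so large-curvature blocks receive more damping.
    return [identity_weight * norm_fn(b) for b in blocks]


def damped_block_solve(blocks, vectors, identity_weight, norm_fn=None):
    # Solve (B_i + w_i * I) x_i = v_i for each block independently.
    if norm_fn is None:
        # Unscaled case: every block uses the same identity_weight.
        weights = [identity_weight] * len(blocks)
    else:
        weights = scaled_identity_weights(blocks, identity_weight, norm_fn)
    return [
        np.linalg.solve(b + w * np.eye(b.shape[0]), v)
        for b, w, v in zip(blocks, weights, vectors)
    ]


# Two toy curvature blocks with very different scales.
blocks = [np.diag([4.0, 4.0]), np.diag([0.01, 0.01])]
vectors = [np.ones(2), np.ones(2)]

weights = scaled_identity_weights(blocks, 0.1, mean_trace)
unscaled = damped_block_solve(blocks, vectors, 0.1)
scaled = damped_block_solve(blocks, vectors, 0.1, norm_fn=mean_trace)
```

With a single shared weight of `0.1`, the damping term dominates the small second block (`0.01 + 0.1`), whereas the scaled variant keeps the damping proportional to each block's own curvature (`4 + 0.4` and `0.01 + 0.001`).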