Handles restarting problem of adaptive time integration methods #1565

Merged
merged 124 commits into from
Sep 15, 2023
a975a25
Create test.jl
ArseniyKholod Mar 15, 2023
4680a7e
Delete test.jl
ArseniyKholod Mar 15, 2023
988d6b1
Merge branch 'trixi-framework:main' into main
ArseniyKholod May 6, 2023
16d93b9
Merge branch 'trixi-framework:main' into main
ArseniyKholod May 9, 2023
d2dbbb0
Merge branch 'trixi-framework:main' into main
ArseniyKholod May 12, 2023
acf8569
Merge branch 'trixi-framework:main' into main
ArseniyKholod May 26, 2023
41451f9
Merge branch 'trixi-framework:main' into main
ArseniyKholod Jun 16, 2023
47a4f69
Merge branch 'trixi-framework:main' into main
ArseniyKholod Jun 22, 2023
586b9d2
Merge branch 'trixi-framework:main' into main
ArseniyKholod Jul 11, 2023
ff59a97
loadcallback
ArseniyKholod Jul 13, 2023
68f3598
adding_parallel_support
ArseniyKholod Jul 14, 2023
8b07b2a
Merge branch 'main' into restart
ArseniyKholod Jul 21, 2023
5354faf
formatting
ArseniyKholod Jul 21, 2023
163a9b9
minimize dependencies
ArseniyKholod Jul 28, 2023
dd6bd1a
combine loadrestart and saverestart
ArseniyKholod Jul 30, 2023
68df7b6
fix
ArseniyKholod Jul 30, 2023
05c1246
Update test_threaded.jl
ArseniyKholod Jul 30, 2023
65517dc
fix
ArseniyKholod Jul 30, 2023
b61021e
test fix
ArseniyKholod Jul 30, 2023
3b96005
fix
ArseniyKholod Jul 30, 2023
85bcd87
MODULE add
ArseniyKholod Jul 30, 2023
be1f868
fix
ArseniyKholod Jul 30, 2023
8dd3302
runtime macros
ArseniyKholod Jul 30, 2023
9b6cc48
Update test_threaded.jl
ArseniyKholod Jul 30, 2023
7f2f9bd
Merge branch 'main' into restart
ArseniyKholod Jul 30, 2023
0f9439b
handle MPI issues
ArseniyKholod Jul 31, 2023
cdb3ad4
Merge branch 'restart' of https://github.com/ArseniyKholod/Trixi.jl i…
ArseniyKholod Jul 31, 2023
7f65b3e
enable PIDController test
ArseniyKholod Aug 1, 2023
53cb000
Update test_mpi_tree.jl
ArseniyKholod Aug 1, 2023
b240c9c
fix
ArseniyKholod Aug 1, 2023
e3813b2
add asserts
ArseniyKholod Aug 1, 2023
8315b5e
Update save_restart.jl
ArseniyKholod Aug 1, 2023
b78b932
add IController tests
ArseniyKholod Aug 2, 2023
e4d5756
Merge branch 'restart' of https://github.com/ArseniyKholod/Trixi.jl i…
ArseniyKholod Aug 2, 2023
556e91a
enable HDF5 parallel
ArseniyKholod Aug 2, 2023
88d9c57
fix shot
ArseniyKholod Aug 2, 2023
84133ee
fix shot 2
ArseniyKholod Aug 2, 2023
4ec808f
fix shot 3
ArseniyKholod Aug 2, 2023
55ba995
fix shot 4
ArseniyKholod Aug 2, 2023
46f51fd
fix shot 5
ArseniyKholod Aug 2, 2023
8e9cfa3
fix shot 6
ArseniyKholod Aug 2, 2023
3b090a1
fix shot 7
ArseniyKholod Aug 2, 2023
e8f28e4
fix shot 8
ArseniyKholod Aug 2, 2023
0dcabc1
fix shot 9
ArseniyKholod Aug 2, 2023
32965f6
fix shot 10
ArseniyKholod Aug 2, 2023
008ca58
fix shot 11
ArseniyKholod Aug 2, 2023
3a41ca9
fix shot 12
ArseniyKholod Aug 2, 2023
7576665
fix shot 13
ArseniyKholod Aug 2, 2023
9dac2da
fix shot 14
ArseniyKholod Aug 2, 2023
aa5447e
fix shot 15
ArseniyKholod Aug 2, 2023
0fcfff2
fix shot 16
ArseniyKholod Aug 2, 2023
63e607a
fix shot 17
ArseniyKholod Aug 2, 2023
43e240b
fix shot 18
ArseniyKholod Aug 2, 2023
d15c7c7
enable additional configuration only in mpi test on linux
ArseniyKholod Aug 3, 2023
451e42d
enable environment
ArseniyKholod Aug 3, 2023
a4b3544
test coverage issue
ArseniyKholod Aug 3, 2023
f66a1a2
disable mpi macOs CI because of failure
ArseniyKholod Aug 3, 2023
c8393fd
disable new configurations to test coverage
ArseniyKholod Aug 3, 2023
93a39ad
disable PID and I test to test coverage issue
ArseniyKholod Aug 3, 2023
5b75944
enable old coverage all
ArseniyKholod Aug 4, 2023
460c8df
undo last commit and enable coverage on windows
ArseniyKholod Aug 4, 2023
903f27e
enable new tests, mpi macOs and HDF5 parallel
ArseniyKholod Aug 4, 2023
6eb1bcc
fix
ArseniyKholod Aug 4, 2023
2ab9cb3
enable coverage on threads
ArseniyKholod Aug 4, 2023
e020129
test HDF5 parallel
ArseniyKholod Aug 4, 2023
9377349
test HDF5 parallel 2
ArseniyKholod Aug 4, 2023
40e5d45
test HDF5 parallel 3
ArseniyKholod Aug 4, 2023
68f41c3
fix
ArseniyKholod Aug 4, 2023
21d9702
Update save_restart_dg.jl
ArseniyKholod Aug 4, 2023
935c7d4
Merge branch 'restart' of https://github.com/ArseniyKholod/Trixi.jl i…
ArseniyKholod Aug 4, 2023
a0b5b3b
test HDF5 parallel 4
ArseniyKholod Aug 4, 2023
714ca91
test HDF5 parallel 5
ArseniyKholod Aug 4, 2023
32c3939
Update configure_packages.jl
ArseniyKholod Aug 5, 2023
27666c7
delete unnecessary changes
ArseniyKholod Aug 5, 2023
74583fb
Merge branch 'restart' of https://github.com/ArseniyKholod/Trixi.jl i…
ArseniyKholod Aug 5, 2023
7de973b
Update save_restart_dg.jl
ArseniyKholod Aug 5, 2023
5c91748
Update save_restart_dg.jl
ArseniyKholod Aug 5, 2023
d76653e
remove dependency on OrdinaryDiffEq
ArseniyKholod Aug 15, 2023
3a786d5
format
ArseniyKholod Aug 15, 2023
22607d5
discard unrelated changes
ArseniyKholod Aug 15, 2023
4a82228
Merge branch 'main' into restart
ArseniyKholod Aug 15, 2023
082fca4
delete barrier
ArseniyKholod Aug 15, 2023
503a50f
delete eval()
ArseniyKholod Aug 15, 2023
047db67
comments & delete mpi_parallel
ArseniyKholod Aug 15, 2023
55f30af
format
ArseniyKholod Aug 15, 2023
15cbc68
Update runtests.jl
ArseniyKholod Aug 15, 2023
8a726a4
Update runtests.jl
ArseniyKholod Aug 15, 2023
425681d
Merge branch 'restart' of https://github.com/ArseniyKholod/Trixi.jl i…
ArseniyKholod Aug 15, 2023
c3adcf8
Merge branch 'main' into restart
ArseniyKholod Aug 15, 2023
c91b68c
simplify tests
ArseniyKholod Aug 16, 2023
0db10e3
Merge branch 'restart' of https://github.com/ArseniyKholod/Trixi.jl i…
ArseniyKholod Aug 16, 2023
629d4da
test failing MPI on windiws and macOs
ArseniyKholod Aug 16, 2023
506abc5
test with RDPK3SpFSAL49
ArseniyKholod Aug 21, 2023
b83904e
test with RDPK3SpFSAL35
ArseniyKholod Aug 21, 2023
89d1ae7
change tests
ArseniyKholod Aug 21, 2023
34c9847
Merge branch 'main' into restart
ArseniyKholod Aug 21, 2023
d1b2802
fix and new test
ArseniyKholod Aug 23, 2023
bebc1ec
Update test_tree_2d_euler.jl
ArseniyKholod Aug 23, 2023
f085ab7
fix and delete unnecessary test
ArseniyKholod Aug 23, 2023
4837f46
add printing format
ArseniyKholod Aug 23, 2023
ea66328
Merge branch 'main' into restart
ArseniyKholod Aug 23, 2023
60388d1
add docstrings
ArseniyKholod Aug 23, 2023
a2602f9
Merge branch 'restart' of https://github.com/ArseniyKholod/Trixi.jl i…
ArseniyKholod Aug 23, 2023
08b0ee1
Update src/callbacks_step/save_restart.jl
ArseniyKholod Aug 25, 2023
8cee6f1
fix
ArseniyKholod Aug 25, 2023
a96c485
formatting
ArseniyKholod Aug 25, 2023
e17a61a
Update src/callbacks_step/save_restart_dg.jl
ArseniyKholod Aug 25, 2023
17cb086
Update src/callbacks_step/save_restart_dg.jl
ArseniyKholod Aug 25, 2023
e5942e9
Update src/callbacks_step/save_restart.jl
ArseniyKholod Aug 25, 2023
c024b73
Update src/callbacks_step/save_restart.jl
ArseniyKholod Aug 25, 2023
6ccd652
Update src/callbacks_step/save_restart.jl
ArseniyKholod Aug 25, 2023
2481433
suggested changes
ArseniyKholod Aug 25, 2023
f4c081b
new test
ArseniyKholod Aug 25, 2023
ebea8c9
fix
ArseniyKholod Aug 25, 2023
4ca23b2
fix
ArseniyKholod Aug 25, 2023
b3141f7
Update test_tree_2d_advection.jl
ArseniyKholod Aug 25, 2023
6303ff9
fix
ArseniyKholod Aug 25, 2023
526408d
fix error mpi on windows
ArseniyKholod Aug 25, 2023
adb5846
rerun
ArseniyKholod Aug 27, 2023
5c8f729
Update src/callbacks_step/save_restart.jl
ArseniyKholod Aug 30, 2023
05b1971
Add comments
ArseniyKholod Aug 30, 2023
8429f31
Merge branch 'main' into restart
ArseniyKholod Sep 12, 2023
b5f80b5
Merge branch 'main' into restart
ArseniyKholod Sep 14, 2023
a41f45f
format
ArseniyKholod Sep 14, 2023
7 changes: 4 additions & 3 deletions examples/tree_2d_dgsem/elixir_advection_extended.jl
@@ -54,7 +54,7 @@ analysis_callback = AnalysisCallback(semi, interval=analysis_interval,
alive_callback = AliveCallback(analysis_interval=analysis_interval)

# The SaveRestartCallback allows to save a file from which a Trixi.jl simulation can be restarted
-save_restart = SaveRestartCallback(interval=100,
+save_restart = SaveRestartCallback(interval=40,
save_final_restart=true)

# The SaveSolutionCallback allows to save the solution to a file in regular intervals
@@ -77,9 +77,10 @@ callbacks = CallbackSet(summary_callback,
# run the simulation

# OrdinaryDiffEq's `solve` method evolves the solution in time and executes the passed callbacks
-sol = solve(ode, CarpenterKennedy2N54(williamson_condition=false),
+alg = CarpenterKennedy2N54(williamson_condition=false)
+sol = solve(ode, alg,
dt=1.0, # solve needs some value here but it will be overwritten by the stepsize_callback
-save_everystep=false, callback=callbacks);
+save_everystep=false, callback=callbacks; ode_default_options()...);

# Print the timer summary
summary_callback()
19 changes: 12 additions & 7 deletions examples/tree_2d_dgsem/elixir_advection_restart.jl
@@ -3,9 +3,10 @@ using OrdinaryDiffEq
using Trixi

###############################################################################
-# create a restart file
-
-trixi_include(@__MODULE__, joinpath(@__DIR__, "elixir_advection_extended.jl"))
+# Define time integration algorithm
+alg = CarpenterKennedy2N54(williamson_condition=false)
+# Create a restart file
+trixi_include(@__MODULE__, joinpath(@__DIR__, "elixir_advection_extended.jl"), alg = alg, tspan = (0.0, 10.0))


###############################################################################
@@ -14,22 +15,26 @@ trixi_include(@__MODULE__, joinpath(@__DIR__, "elixir_advection_extended.jl"))
# Note: If you get a restart file from somewhere else, you need to provide
# appropriate setups in the elixir loading a restart file

-restart_filename = joinpath("out", "restart_000018.h5")
+restart_filename = joinpath("out", "restart_000040.h5")
mesh = load_mesh(restart_filename)

semi = SemidiscretizationHyperbolic(mesh, equations, initial_condition, solver)

-tspan = (load_time(restart_filename), 2.0)
+tspan = (load_time(restart_filename), 10.0)
dt = load_dt(restart_filename)
ode = semidiscretize(semi, tspan, restart_filename);

# Do not overwrite the initial snapshot written by elixir_advection_extended.jl.
save_solution.condition.save_initial_solution = false

-alg = CarpenterKennedy2N54(williamson_condition=false)
integrator = init(ode, alg,
dt=dt, # solve needs some value here but it will be overwritten by the stepsize_callback
-save_everystep=false, callback=callbacks)
+save_everystep=false, callback=callbacks; ode_default_options()...)
+
+# Load saved context for adaptive time integrator
+if integrator.opts.adaptive
+load_adaptive_time_integrator!(integrator, restart_filename)
+end

# Get the last time index and work with that.
load_timestep!(integrator, restart_filename)
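
The restart file name used in this elixir follows the zero-padded pattern that `save_adaptive_time_integrator` in this PR builds with `@sprintf("restart_%06d.h5", timestep)`. A minimal sketch of that naming, using only the standard library; the helper name `restart_path` is hypothetical and not part of Trixi.jl:

```julia
using Printf

# Hypothetical helper (not part of Trixi.jl): build the restart file path
# for a given step counter, zero-padded to six digits.
restart_path(output_directory, timestep) =
    joinpath(output_directory, @sprintf("restart_%06d.h5", timestep))

println(restart_path("out", 40))
```

Reading the diff, the switch from `restart_000018.h5` to `restart_000040.h5` appears to go together with `interval=40` in the extended elixir: a file for step 40 should exist whenever the run takes at least 40 steps, whereas the step count of the final restart file depends on the time integrator. This is an inference from the diff, not something the PR states.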
3 changes: 2 additions & 1 deletion src/Trixi.jl
@@ -255,7 +255,8 @@ export SummaryCallback, SteadyStateCallback, AnalysisCallback, AliveCallback,
GlmSpeedCallback, LBMCollisionCallback, EulerAcousticsCouplingCallback,
TrivialCallback, AnalysisCallbackCoupled

-export load_mesh, load_time, load_timestep, load_timestep!, load_dt
+export load_mesh, load_time, load_timestep, load_timestep!, load_dt,
+load_adaptive_time_integrator!

export ControllerThreeLevel, ControllerThreeLevelCombined,
IndicatorLöhner, IndicatorLoehner, IndicatorMax,
36 changes: 36 additions & 0 deletions src/callbacks_step/save_restart.jl
@@ -105,6 +105,11 @@ function (restart_callback::SaveRestartCallback)(integrator)
end

save_restart_file(u_ode, t, dt, iter, semi, restart_callback)
+# If using an adaptive time stepping scheme, store controller values for restart
+if integrator.opts.adaptive
+save_adaptive_time_integrator(integrator, integrator.opts.controller,
+restart_callback)
+end
end

# avoid re-evaluating possible FSAL stages
@@ -168,5 +173,36 @@ function load_restart_file(semi::AbstractSemidiscretization, restart_file)
load_restart_file(mesh_equations_solver_cache(semi)..., restart_file)
end

+"""
+load_adaptive_time_integrator!(integrator, restart_file::AbstractString)
+
+Load the context information for time integrators with error-based step size control
+saved in a `restart_file`.
+"""
+function load_adaptive_time_integrator!(integrator, restart_file::AbstractString)
+controller = integrator.opts.controller
+# Read context information for controller
+h5open(restart_file, "r") do file
+# Ensure that the necessary information was saved
+if !("time_integrator_qold" in keys(attributes(file))) ||
+!("time_integrator_dtpropose" in keys(attributes(file))) ||
+(hasproperty(controller, :err) &&
+!("time_integrator_controller_err" in keys(attributes(file))))
+error("Missing data in restart file: check the consistency of adaptive time controller with initial setup!")
+end
+# Load data that is required both for PIController and PIDController
+integrator.qold = read(attributes(file)["time_integrator_qold"])
+integrator.dtpropose = read(attributes(file)["time_integrator_dtpropose"])
+# Accept step to use dtpropose already in the first step
+integrator.accept_step = true
+# Reevaluate integrator.fsal_first on the first step
+integrator.reeval_fsal = true
+# Load additional parameters for PIDController
+if hasproperty(controller, :err) # Distinguish PIDController from PIController
+controller.err[:] = read(attributes(file)["time_integrator_controller_err"])
+end
+end
+end
+
include("save_restart_dg.jl")
end # @muladd
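
Why the controller state has to be restored can be illustrated with a toy model. The sketch below is not OrdinaryDiffEq's controller: the error estimate, the exponents, and the clamping bounds are assumptions chosen for illustration; only the names `qold` and `dtpropose` are taken from the diff. It shows that a PI-type rule, whose next step size depends on the current and the previous error estimate, reproduces the uninterrupted step sequence only if both values are carried over.

```julia
# Hypothetical error estimate: scales like dt^3 and varies smoothly in time.
error_estimate(t, dt) = (dt / 0.1)^3 * (1.5 + sin(t))

# Toy PI-type step size control (simplified; every step is accepted).
# Returns the step sizes used and the state needed to continue the run.
function advance(qold, dtpropose, t, nsteps; beta1 = 0.7 / 3, beta2 = 0.4 / 3)
    dts = Float64[]
    for _ in 1:nsteps
        dt = dtpropose
        est = error_estimate(t, dt)
        # The factor depends on the current estimate and on the stored one.
        q = clamp(est^beta1 / qold^beta2, 0.2, 5.0)
        dtpropose = dt / q
        qold = max(est, 1.0e-4)
        t += dt
        push!(dts, dt)
    end
    return (; dts, qold, dtpropose, t)
end

# Uninterrupted run of 10 steps.
full = advance(1.0e-4, 0.05, 0.0, 10)

# Run 5 steps, then continue from the saved controller state.
part1 = advance(1.0e-4, 0.05, 0.0, 5)
resumed = advance(part1.qold, part1.dtpropose, part1.t, 5)

# Restart that keeps the proposed step size but resets qold.
forgetful = advance(1.0e-4, part1.dtpropose, part1.t, 5)

println(vcat(part1.dts, resumed.dts) == full.dts)
println(forgetful.dts == resumed.dts)
```

In this toy setting the resumed run matches the uninterrupted one exactly, while the run that forgets `qold` diverges from the second step onward. This is the kind of mismatch the new tests guard against by comparing errors with `==`.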
24 changes: 24 additions & 0 deletions src/callbacks_step/save_restart_dg.jl
@@ -327,4 +327,28 @@ function load_restart_file_on_root(mesh::Union{ParallelTreeMesh, ParallelP4estMe

return u_ode
end
+
+# Store controller values for an adaptive time stepping scheme
+function save_adaptive_time_integrator(integrator,
+controller, restart_callback)
+# Save only on root
+if mpi_isroot()
+@unpack output_directory = restart_callback
+timestep = integrator.stats.naccept
+
+# Filename based on current time step
+filename = joinpath(output_directory, @sprintf("restart_%06d.h5", timestep))
+
+# Open file (preserve existing content)
+h5open(filename, "r+") do file
+# Add context information as attributes both for PIController and PIDController
+attributes(file)["time_integrator_qold"] = integrator.qold
+attributes(file)["time_integrator_dtpropose"] = integrator.dtpropose
+# For PIDController is necessary to save additional parameters
+if hasproperty(controller, :err) # Distinguish PIDController from PIController
+attributes(file)["time_integrator_controller_err"] = controller.err
+end
+end
+end
+end
end # @muladd
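
The save and load functions in this PR agree on three attribute names, and the loader refuses to continue if a required one is absent. A minimal sketch of that consistency check follows, with a plain `Dict` standing in for the HDF5 attributes so that it runs without HDF5.jl. The function names `required_keys` and `check_restart_attributes` are hypothetical, and the sample values are made up.

```julia
# Attribute names as they appear in the diff; `has_err` mimics
# `hasproperty(controller, :err)`, which the PR uses to detect a PIDController.
function required_keys(has_err::Bool)
    required = ["time_integrator_qold", "time_integrator_dtpropose"]
    has_err && push!(required, "time_integrator_controller_err")
    return required
end

# `attrs` is a plain Dict standing in for the attributes of a restart file.
function check_restart_attributes(attrs::AbstractDict, has_err::Bool)
    missing_keys = filter(key -> !haskey(attrs, key), required_keys(has_err))
    isempty(missing_keys) ||
        error("Missing data in restart file: " * join(missing_keys, ", "))
    return nothing
end

# Made-up contents of a file written with a PI-type controller ...
pi_attrs = Dict{String, Any}("time_integrator_qold" => 0.25,
                             "time_integrator_dtpropose" => 0.05)
# ... and of one written with a PID-type controller.
pid_attrs = merge(pi_attrs,
                  Dict{String, Any}("time_integrator_controller_err" => [1.0, 1.0, 1.0]))

check_restart_attributes(pi_attrs, false)  # passes
check_restart_attributes(pid_attrs, true)  # passes
```

Calling `check_restart_attributes(pi_attrs, true)` raises an error in this sketch, which corresponds to restarting with a PID-type controller from a file written with a PI-type one, the inconsistency the error message in `load_adaptive_time_integrator!` refers to.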
20 changes: 16 additions & 4 deletions test/test_mpi_tree.jl
@@ -23,10 +23,22 @@ CI_ON_WINDOWS = (get(ENV, "GITHUB_ACTIONS", false) == "true") && Sys.iswindows()
end

@trixi_testset "elixir_advection_restart.jl" begin
-@test_trixi_include(joinpath(EXAMPLES_DIR, "elixir_advection_restart.jl"),
-# Expected errors are exactly the same as in the serial test!
-l2 = [7.81674284320524e-6],
-linf = [6.314906965243505e-5])
+using OrdinaryDiffEq: RDPK3SpFSAL49
+Trixi.mpi_isroot() && println("═"^100)
+Trixi.mpi_isroot() && println(joinpath(EXAMPLES_DIR, "elixir_advection_extended.jl"))
+trixi_include(@__MODULE__, joinpath(EXAMPLES_DIR, "elixir_advection_extended.jl"),
+alg = RDPK3SpFSAL49(), tspan = (0.0, 10.0))
+l2_expected, linf_expected = analysis_callback(sol)
+
+Trixi.mpi_isroot() && println("═"^100)
+Trixi.mpi_isroot() && println(joinpath(EXAMPLES_DIR, "elixir_advection_restart.jl"))
+# Errors are exactly the same as in the elixir_advection_extended.jl
+trixi_include(@__MODULE__, joinpath(EXAMPLES_DIR, "elixir_advection_restart.jl"),
+alg = RDPK3SpFSAL49())
+l2_actual, linf_actual = analysis_callback(sol)
+
+Trixi.mpi_isroot() && @test l2_actual == l2_expected
+Trixi.mpi_isroot() && @test linf_actual == linf_expected
end

@trixi_testset "elixir_advection_mortar.jl" begin
39 changes: 25 additions & 14 deletions test/test_threaded.jl
@@ -12,27 +12,38 @@ Trixi.mpi_isroot() && isdir(outdir) && rm(outdir, recursive=true)
@testset "Threaded tests" begin
@testset "TreeMesh" begin
@trixi_testset "elixir_advection_restart.jl" begin
-@test_trixi_include(joinpath(examples_dir(), "tree_2d_dgsem", "elixir_advection_restart.jl"),
-# Expected errors are exactly the same as in the serial test!
-l2 = [7.81674284320524e-6],
-linf = [6.314906965243505e-5])
+elixir = joinpath(examples_dir(), "tree_2d_dgsem", "elixir_advection_extended.jl")
+Trixi.mpi_isroot() && println("═"^100)
+Trixi.mpi_isroot() && println(elixir)
+trixi_include(@__MODULE__, elixir, tspan = (0.0, 10.0))
+l2_expected, linf_expected = analysis_callback(sol)
+
+elixir = joinpath(examples_dir(), "tree_2d_dgsem", "elixir_advection_restart.jl")
+Trixi.mpi_isroot() && println("═"^100)
+Trixi.mpi_isroot() && println(elixir)
+# Errors are exactly the same as in the elixir_advection_extended.jl
+trixi_include(@__MODULE__, elixir)
+l2_actual, linf_actual = analysis_callback(sol)
+
+Trixi.mpi_isroot() && @test l2_actual == l2_expected
+Trixi.mpi_isroot() && @test linf_actual == linf_expected

-# Ensure that we do not have excessive memory allocations
-# (e.g., from type instabilities)
-let
-t = sol.t[end]
-u_ode = sol.u[end]
-du_ode = similar(u_ode)
-@test (@allocated Trixi.rhs!(du_ode, u_ode, semi, t)) < 5000
-end
+# Ensure that we do not have excessive memory allocations
+# (e.g., from type instabilities)
+let
+t = sol.t[end]
+u_ode = sol.u[end]
+du_ode = similar(u_ode)
+@test (@allocated Trixi.rhs!(du_ode, u_ode, semi, t)) < 5000
+end
end

@trixi_testset "elixir_advection_restart.jl with threaded time integration" begin
@test_trixi_include(joinpath(examples_dir(), "tree_2d_dgsem", "elixir_advection_restart.jl"),
alg = CarpenterKennedy2N54(williamson_condition = false, thread = OrdinaryDiffEq.True()),
# Expected errors are exactly the same as in the serial test!
-l2 = [7.81674284320524e-6],
-linf = [6.314906965243505e-5])
+l2 = [8.005068880114254e-6],
+linf = [6.39093577996519e-5])
end

@trixi_testset "elixir_advection_amr_refine_twice.jl" begin
20 changes: 16 additions & 4 deletions test/test_tree_2d_advection.jl
@@ -25,10 +25,22 @@ EXAMPLES_DIR = pkgdir(Trixi, "examples", "tree_2d_dgsem")
end

@trixi_testset "elixir_advection_restart.jl" begin
-@test_trixi_include(joinpath(EXAMPLES_DIR, "elixir_advection_restart.jl"),
-# Expected errors are exactly the same as in the parallel test!
-l2 = [7.81674284320524e-6],
-linf = [6.314906965243505e-5])
+using OrdinaryDiffEq: SSPRK43
+println("═"^100)
+println(joinpath(EXAMPLES_DIR, "elixir_advection_extended.jl"))
+trixi_include(@__MODULE__, joinpath(EXAMPLES_DIR, "elixir_advection_extended.jl"),
+alg = SSPRK43(), tspan = (0.0, 10.0))
+l2_expected, linf_expected = analysis_callback(sol)
+
+println("═"^100)
+println(joinpath(EXAMPLES_DIR, "elixir_advection_restart.jl"))
+# Errors are exactly the same as in the elixir_advection_extended.jl
+trixi_include(@__MODULE__, joinpath(EXAMPLES_DIR, "elixir_advection_restart.jl"),
+alg = SSPRK43())
+l2_actual, linf_actual = analysis_callback(sol)
+
+@test l2_actual == l2_expected
+@test linf_actual == linf_expected
end

@trixi_testset "elixir_advection_mortar.jl" begin