Added splines for camera movement, fixed threading conflict with write_frame(), more accurate runtime estimation, enabled FP16S by default
ProjectPhysX committed Sep 7, 2024
1 parent 892d5ef commit 0340760
Showing 12 changed files with 155 additions and 54 deletions.
33 changes: 33 additions & 0 deletions DOCUMENTATION.md
@@ -313,6 +313,39 @@
}
```
- To find suitable camera placement, run the simulation at low resolution in [`INTERACTIVE_GRAPHICS`](src/defines.hpp) mode, rotate/move the camera to the desired position, click the <kbd>Mouse</kbd> to disable mouse rotation, and press <kbd>G</kbd> to print the current camera settings as a copy-paste command in the console. <kbd>Alt</kbd>+<kbd>Tab</kbd> to the console and copy the camera placement command by selecting it with the mouse and right-clicking, then paste it into the [`main_setup()`](src/setup.cpp) function.
- To fly the camera along a smooth path through a list of provided keyframe camera placements, use `catmull_rom` splines:
```c
while(lbm.get_t()<=lbm_T) { // main simulation loop
	if(lbm.graphics.next_frame(lbm_T, 30.0f)) {
		const float t = (float)lbm.get_t()/(float)lbm_T;
		vector<float3> camera_positions = {
			float3(-0.282220f*(float)Nx,  0.529221f*(float)Ny,  0.304399f*(float)Nz),
			float3( 0.806921f*(float)Nx,  0.239912f*(float)Ny,  0.436880f*(float)Nz),
			float3( 1.129724f*(float)Nx, -0.130721f*(float)Ny,  0.352759f*(float)Nz),
			float3( 0.595601f*(float)Nx, -0.504690f*(float)Ny,  0.203096f*(float)Nz),
			float3(-0.056776f*(float)Nx, -0.591919f*(float)Ny, -0.416467f*(float)Nz)
		};
		vector<float> camera_rx = {
			116.0f,
			 25.4f,
			-10.6f,
			-45.6f,
			-94.6f
		};
		vector<float> camera_ry = {
			 26.0f,
			 33.3f,
			 20.3f,
			 25.3f,
			-16.7f
		};
		const float camera_fov = 90.0f;
		lbm.graphics.set_camera_free(catmull_rom(camera_positions, t), catmull_rom(camera_rx, t), catmull_rom(camera_ry, t), camera_fov);
		lbm.graphics.write_frame(get_exe_path()+"export/");
	}
	lbm.run(1u, lbm_T);
}
```
- The visualization mode(s) can be specified as `lbm.graphics.visualization_modes` with the [`VIS_...`](src/defines.hpp) macros. You can also set the `lbm.graphics.slice_mode` (`0`=no slice, `1`=x, `2`=y, `3`=z, `4`=xz, `5`=xyz, `6`=yz, `7`=xy) and reposition the slices with `lbm.graphics.slice_x`/`lbm.graphics.slice_y`/`lbm.graphics.slice_z`.
- Exported frames will automatically be assigned the current simulation time step in their name, in the format `bin/export/image-123456789.png`.
- To convert the rendered `.png` images to video, use [FFmpeg](https://ffmpeg.org/):
11 changes: 11 additions & 0 deletions README.md
@@ -182,6 +182,17 @@ The fastest and most memory efficient lattice Boltzmann CFD software, running on
- fixed minor graphical artifacts in `raytrace_phi()`
- fixed minor graphical artifacts in `ray_grid_traverse_sum()`
- fixed wrong printed time step count on raindrop sample setup
- [v2.19](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.19) (07.09.2024) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.18...v2.19) (camera splines)
- the camera can now fly along a smooth path through a list of provided keyframe camera placements, [using Catmull-Rom splines](https://github.com/ProjectPhysX/FluidX3D/blob/master/DOCUMENTATION.md#video-rendering)
- more accurate remaining runtime estimation that includes time spent on rendering
- enabled FP16S memory compression by default
- printed camera placement using key <kbd>G</kbd> is now formatted for easier copy/paste
- added benchmark chart in Readme using mermaid gantt chart
- placed memory allocation info during simulation startup at better location
- fixed threading conflict between `INTERACTIVE_GRAPHICS` and `lbm.graphics.write_frame();`
- fixed maximum buffer allocation size limit for AMD GPUs and in Intel CPU Runtime for OpenCL
- fixed wrong `Re<Re_max` info printout for 2D simulations
- minor fix in `bandwidth_bytes_per_cell_device()`

</details>

4 changes: 2 additions & 2 deletions src/defines.hpp
@@ -10,8 +10,8 @@
#define SRT // choose single-relaxation-time LBM collision operator; (default)
//#define TRT // choose two-relaxation-time LBM collision operator

//#define FP16S // compress LBM DDFs to range-shifted IEEE-754 FP16; number conversion is done in hardware; all arithmetic is still done in FP32
//#define FP16C // compress LBM DDFs to more accurate custom FP16C format; number conversion is emulated in software; all arithmetic is still done in FP32
#define FP16S // optional for 2x speedup and 2x VRAM footprint reduction: compress LBM DDFs to range-shifted IEEE-754 FP16; number conversion is done in hardware; all arithmetic is still done in FP32
//#define FP16C // optional for 2x speedup and 2x VRAM footprint reduction: compress LBM DDFs to more accurate custom FP16C format; number conversion is emulated in software; all arithmetic is still done in FP32

#define BENCHMARK // disable all extensions and setups and run benchmark setup instead

6 changes: 6 additions & 0 deletions src/graphics.cpp
@@ -544,9 +544,11 @@ INT WINAPI WinMain(_In_ HINSTANCE hInstance, _In_opt_ HINSTANCE, _In_ PSTR, _In_
DispatchMessage(&msg);
}
// main loop ################################################################
camera.rendring_frame.lock(); // block rendering for other threads until finished
camera.update_state(fmax(1.0/(double)camera.fps_limit, frametime));
main_graphics();
update_frame(frametime);
camera.rendring_frame.unlock();
frametime = clock.stop();
sleep(1.0/(double)camera.fps_limit-frametime);
clock.start();
@@ -723,9 +725,11 @@ int main(int argc, char* argv[]) {
double frametime = 1.0;
while(running) {
// main loop ################################################################
camera.rendring_frame.lock(); // block rendering for other threads until finished
camera.update_state(fmax(1.0/(double)camera.fps_limit, frametime));
main_graphics();
update_frame(frametime);
camera.rendring_frame.unlock();
frametime = clock.stop();
sleep(1.0/(double)camera.fps_limit-frametime);
clock.start();
@@ -780,9 +784,11 @@ int main(int argc, char* argv[]) {
get_console_font_size(fontwidth, fontheight);
while(running) {
// main loop ################################################################
camera.rendring_frame.lock(); // block rendering for other threads until finished
camera.update_state(fmax(1.0/(double)camera.fps_limit, frametime));
main_graphics();
update_frame(frametime);
camera.rendring_frame.unlock();
frametime = clock.stop();
sleep(1.0/(double)camera.fps_limit-frametime);
clock.start();
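The three main loops above now hold `camera.rendring_frame` while an interactive frame is drawn. The other side of that lock (presumably inside `write_frame()`) is not visible in this excerpt, so the following is only a sketch of the general pattern, assuming both code paths take the same `std::mutex`. `CameraSketch`, `render_interactive_frame` and `export_frame_sketch` are hypothetical names. `std::lock_guard` is used here instead of the explicit `lock()`/`unlock()` pair so that the mutex is released even on an early return or exception.

```cpp
#include <mutex>

struct CameraSketch { // hypothetical stand-in for the Camera class
	std::mutex rendring_frame; // identifier spelled as in the commit
	int frames_rendered = 0; // shared state that must not be touched by two threads at once
};

// interactive graphics thread: one iteration of the main loop
inline void render_interactive_frame(CameraSketch& camera) {
	std::lock_guard<std::mutex> lock(camera.rendring_frame); // blocks while another thread renders
	camera.frames_rendered++; // placeholder for update_state(), main_graphics(), update_frame()
} // lock released here

// simulation thread: exporting a frame to disk (assumed to take the same mutex)
inline void export_frame_sketch(CameraSketch& camera) {
	std::lock_guard<std::mutex> lock(camera.rendring_frame); // waits until the interactive frame is finished
	camera.frames_rendered++; // placeholder for rendering and writing the image
}
```

In a real program the two functions would be called from different threads; because both serialize on one mutex, the frame buffer is never rendered by both at the same time.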
6 changes: 5 additions & 1 deletion src/graphics.hpp
@@ -8,6 +8,7 @@
#include "defines.hpp"
#include "utilities.hpp"
#include <atomic>
#include <mutex>

extern vector<string> main_arguments; // console arguments
extern std::atomic_bool running;
@@ -34,8 +35,11 @@ class Camera {
bool vr=false, tv=false; // virtual reality mode (enables stereoscopic rendering), VR TV mode
float eye_distance = 8.0f; // distance between cameras in VR mode
bool autorotation = false; // autorotation
bool key_update = true; // a key variable has been updated
bool lockmouse = false; // mouse movement won't change camera view when this is true
std::atomic_bool key_update = true; // a key variable has been updated
std::atomic_bool allow_rendering = false; // allows interactive redering if true
std::atomic_bool allow_labeling = true; // allows drawing label if true
std::mutex rendring_frame; // a frame for interactive graphics is currently rendered

private:
float log_zoom=4.0f*log(zoom), target_log_zoom=log_zoom;
61 changes: 32 additions & 29 deletions src/info.cpp
@@ -3,28 +3,11 @@

Info info;

void Info::initialize(LBM* lbm) {
this->lbm = lbm;
#if defined(SRT)
collision = "SRT";
#elif defined(TRT)
collision = "TRT";
#endif // TRT
#if defined(FP16S)
collision += " (FP32/FP16S)";
#elif defined(FP16C)
collision += " (FP32/FP16C)";
#else // FP32
collision += " (FP32/FP32)";
#endif // FP32
cpu_mem_required = (uint)(lbm->get_N()*(ulong)bytes_per_cell_host()/1048576ull); // reset to get valid values for consecutive simulations
gpu_mem_required = lbm->lbm_domain[0]->get_device().info.memory_used;
}
void Info::append(const ulong steps, const ulong total_steps, const ulong t) {
if(total_steps==max_ulong) { // total_steps is not provided/used
this->steps = steps; // has to be executed before info.print_initialize()
this->steps_last = t; // reset last step count if multiple run() commands are executed consecutively
this->runtime_lbm_last = runtime_lbm; // reset last runtime if multiple run() commands are executed consecutively
this->runtime_total_last = this->runtime_total; // reset last runtime if multiple run() commands are executed consecutively
this->runtime_total = clock.stop();
} else { // total_steps has been specified
this->steps = total_steps; // has to be executed before info.print_initialize()
@@ -37,7 +20,8 @@ void Info::update(const double dt) {
this->runtime_total = clock.stop();
}
double Info::time() const { // returns either elapsed time or remaining time
return steps==max_ulong ? runtime_lbm : ((double)steps/(double)(lbm->get_t()-steps_last)-1.0)*(runtime_lbm-runtime_lbm_last); // time estimation on average so far
if(lbm==nullptr) return 0.0;
return steps==max_ulong ? runtime_total : ((double)steps/(double)(lbm->get_t()-steps_last)-1.0)*(runtime_total-runtime_total_last); // time estimation on average so far
//return steps==max_ulong ? runtime_lbm : ((double)steps-(double)(lbm->get_t()-steps_last))*runtime_lbm_timestep_smooth; // instantaneous time estimation
}
void Info::print_logo() const {
@@ -58,11 +42,27 @@ void Info::print_logo() const {
print("| "); print("\\ \\ / /", c); print(" |\n");
print("| "); print("\\ ' /", c); print(" |\n");
print("| "); print("\\ /", c); print(" |\n");
print("| "); print("\\ /", c); print(" FluidX3D Version 2.18 |\n");
print("| "); print("\\ /", c); print(" FluidX3D Version 2.19 |\n");
print("| "); print( "'", c); print(" Copyright (c) Dr. Moritz Lehmann |\n");
print("|-----------------------------------------------------------------------------|\n");
}
void Info::print_initialize() {
void Info::print_initialize(LBM* lbm) {
info.allow_printing.lock(); // disable print_update() until print_initialize() has finished
this->lbm = lbm;
#if defined(SRT)
collision = "SRT";
#elif defined(TRT)
collision = "TRT";
#endif // TRT
#if defined(FP16S)
collision += " (FP32/FP16S)";
#elif defined(FP16C)
collision += " (FP32/FP16C)";
#else // FP32
collision += " (FP32/FP32)";
#endif // FP32
cpu_mem_required = (uint)(lbm->get_N()*(ulong)bytes_per_cell_host()/1048576ull); // reset to get valid values for consecutive simulations
gpu_mem_required = lbm->lbm_domain[0]->get_device().info.memory_used;
const float Re = lbm->get_Re_max();
println("|-----------------.-----------------------------------------------------------|");
println("| Grid Resolution | "+alignr(57u, to_string(lbm->get_Nx())+" x "+to_string(lbm->get_Ny())+" x "+to_string(lbm->get_Nz())+" = "+to_string(lbm->get_N()))+" |");
@@ -91,10 +91,12 @@ void Info::print_initialize() {
println("'-----------------'-----------------------------------------------------------'");
#endif // INTERACTIVE_GRAPHICS_ASCII
clock.start();
allow_rendering = true;
info.allow_printing.unlock();
}
void Info::print_update() const {
if(allow_rendering) reprint(
if(lbm==nullptr) return;
info.allow_printing.lock();
reprint(
"|"+alignr(8, to_uint((double)lbm->get_N()*1E-6/runtime_lbm_timestep_smooth))+" |"+ // MLUPs
alignr(7, to_uint((double)lbm->get_N()*(double)bandwidth_bytes_per_cell_device()*1E-9/runtime_lbm_timestep_smooth))+" GB/s |"+ // memory bandwidth
alignr(10, to_uint(1.0/runtime_lbm_timestep_smooth))+" | "+ // steps/s
@@ -103,16 +105,17 @@ void Info::print_update() const {
);
#ifdef GRAPHICS
if(key_G) { // print camera settings
const string camera_position = "float3("+to_string(camera.pos.x/(float)lbm->get_Nx(), 6u)+"f*(float)Nx, "+to_string(camera.pos.y/(float)lbm->get_Ny(), 6u)+"f*(float)Ny, "+to_string(camera.pos.z/(float)lbm->get_Nz(), 6u)+"f*(float)Nz)";
const string camera_rx_ry_fov = to_string(degrees(camera.rx)-90.0, 1u)+"f, "+to_string(180.0-degrees(camera.ry), 1u)+"f, "+to_string(camera.fov, 1u)+"f";
const string camera_zoom = to_string(camera.zoom*(float)fmax(fmax(lbm->get_Nx(), lbm->get_Ny()), lbm->get_Nz())/(float)min(camera.width, camera.height), 6u)+"f";
if(camera.free) print_info("lbm.graphics.set_camera_free("+camera_position+", "+camera_rx_ry_fov+");");
else print_info("lbm.graphics.set_camera_centered("+camera_rx_ry_fov+", "+camera_zoom+");");
const string camera_position = "float3("+alignr(9u, to_string(camera.pos.x/(float)lbm->get_Nx(), 6u))+"f*(float)Nx, "+alignr(9u, to_string(camera.pos.y/(float)lbm->get_Ny(), 6u))+"f*(float)Ny, "+alignr(9u, to_string(camera.pos.z/(float)lbm->get_Nz(), 6u))+"f*(float)Nz)";
const string camera_rx_ry_fov = alignr(6u, to_string(degrees(camera.rx)-90.0, 1u))+"f, "+alignr(5u, to_string(180.0-degrees(camera.ry), 1u))+"f, "+alignr(5u, to_string(camera.fov, 1u))+"f";
const string camera_zoom = alignr(8u, to_string(camera.zoom*(float)fmax(fmax(lbm->get_Nx(), lbm->get_Ny()), lbm->get_Nz())/(float)min(camera.width, camera.height), 6u))+"f";
if(camera.free) println("\rlbm.graphics.set_camera_free("+camera_position+", "+camera_rx_ry_fov+");");
else println("\rlbm.graphics.set_camera_centered("+camera_rx_ry_fov+", "+camera_zoom+"); ");
key_G = false;
}
#endif // GRAPHICS
info.allow_printing.unlock();
}
void Info::print_finalize() {
allow_rendering = false;
lbm = nullptr;
println("\n|---------'-------------'-----------'-------------------'---------------------|");
}
13 changes: 6 additions & 7 deletions src/info.hpp
@@ -1,24 +1,23 @@
#pragma once

#include "utilities.hpp"
#include <mutex>

class LBM;
struct Info { // contains redundant information for console printing
LBM* lbm = nullptr;
bool allow_rendering = false; // allows interactive redering if true
bool allow_labeling = true; // allows drawing label if true
double runtime_lbm=0.0, runtime_total=0.0f; // lbm (compute) and total (compute + rendering + data evaluation) runtime
double runtime_lbm_timestep_last=1.0, runtime_lbm_timestep_smooth=1.0, runtime_lbm_last=0.0; // for printing simulation info
double runtime_lbm=0.0, runtime_total=0.0f, runtime_total_last=0.0; // lbm (compute) and total (compute + rendering + data evaluation) runtime
double runtime_lbm_timestep_last=1.0, runtime_lbm_timestep_smooth=1.0; // for printing simulation info
Clock clock; // for measuring total runtime
ulong steps=max_ulong, steps_last=0ull; // runtime_lbm_last and steps_last are there if multiple run() commands are executed consecutively
ulong steps=max_ulong, steps_last=0ull; // runtime_total_last and steps_last are there if multiple run() commands are executed consecutively
uint cpu_mem_required=0u, gpu_mem_required=0u; // all in MB
string collision = "";
void initialize(LBM* lbm);
std::mutex allow_printing; // to prevent threading conflicts when continuously printing updates to console
void append(const ulong steps, const ulong total_steps, const ulong t);
void update(const double dt);
double time() const; // returns either elapsed time or remaining time
void print_logo() const;
void print_initialize(); // enables interactive rendering
void print_initialize(LBM* lbm); // enables interactive rendering
void print_update() const;
void print_finalize(); // disables interactive rendering
};