Releases · ProjectPhysX/FluidX3D

31 Jul 14:52

ProjectPhysX

v2.9

5e89439

FluidX3D v2.9 (multithreading)

Thank you for using FluidX3D! The v2.9 update makes simulation startup a lot quicker, especially for large multi-GPU simulations:

added cross-platform parallel_for implementation in utilities.hpp using std::thread
significantly (>4x) faster simulation startup with multithreaded geometry initialization and sanity checks
faster calculate_force_on_object() and calculate_torque_on_object() functions with multithreading
refactoring
added total runtime and LBM runtime to lbm.write_status()
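
To illustrate the idea behind a std::thread-based parallel_for, here is a minimal sketch. It is not the implementation in utilities.hpp; the name parallel_for_sketch, its signature and the chunking scheme are assumptions made for this example. The index range is split into contiguous chunks, one per thread, and all threads are joined before returning.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical sketch (not the utilities.hpp code): call body(n) for every n in [0, N),
// with the range split into contiguous chunks that run on separate threads.
inline void parallel_for_sketch(const unsigned long long N, const unsigned int threads, const std::function<void(unsigned long long)>& body) {
    assert(threads > 0u);
    // never start more threads than there are indices (but at least one)
    const unsigned long long T = std::min<unsigned long long>(threads, N > 0ull ? N : 1ull);
    std::vector<std::thread> pool;
    pool.reserve(T);
    for (unsigned long long t = 0ull; t < T; t++) {
        const unsigned long long begin = N * t / T; // first index of chunk t
        const unsigned long long end = N * (t + 1ull) / T; // one past the last index of chunk t
        pool.emplace_back([begin, end, &body]() {
            for (unsigned long long n = begin; n < end; n++) body(n);
        });
    }
    for (std::thread& worker : pool) worker.join(); // wait until all chunks are done
}
```

The body must be safe to call concurrently for different n, for example by writing only to element n of an array, as an initialization loop over grid cells would.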

Bug fixes:

fixed bug in voxelization ray direction for re-voxelizing rotating objects
fixed bug in Mesh::get_bounding_box_size()
fixed bug in print_message() function in utilities.hpp

Let the cores go brrrr!

Have fun with the software!
-- Moritz

26 Jun 04:58

ProjectPhysX

v2.8

aa4f6fb

FluidX3D v2.8 (documentation + polish)

Thank you for using FluidX3D! The v2.8 update doesn't add many new features, but it finally brings more documentation, loads of refactoring and significant usability improvements:

finally added more documentation
cleaned up all sample setups in setup.cpp for much more beginner-friendly learning
added required extensions in defines.hpp as comments to all setups in setup.cpp
improved loading of composite .stl geometries, by adding an option to omit automatic repositioning of the mesh
added more functionality to Mesh struct in utilities.hpp
added uint3 resolution(float3 box_aspect_ratio, uint memory) function to compute simulation box resolution based on box aspect ratio and VRAM occupation in MB
added bool lbm.graphics.next_frame(...) function to export images for a specified video length in the main_setup compute loop
added VIS_... macros to ease setting visualization modes in headless graphics mode in lbm.graphics.visualization_modes
simulation box dimensions are now automatically made equally divisible by domains for multi-GPU simulations
made Info/Warning/Error message labels colored
added Cessna 172 propeller airplane and Bell 222 helicopter setups to showcase how loading of composite .stl geometries and revoxelization of moving parts works
added Ahmed body setup as an example of how body forces and drag coefficient are computed; expect absolute forces to be too large by up to a factor of 2, because even a large resolution is not enough to fully capture the turbulent boundary layer in this case; a wall function is needed, I'll scan literature on it
added optional semi-transparent rendering mode (#define GRAPHICS_TRANSPARENCY 0.7f in defines.hpp)
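
The calculation behind a function like resolution(float3 box_aspect_ratio, uint memory) can be sketched as follows. This is not the FluidX3D implementation: the name resolution_sketch, the Resolution struct, the extra bytes_per_cell parameter and the convention 1 MB = 1048576 bytes are assumptions made for this example. The memory budget is converted into a cell count, and the box is scaled so that its volume matches that count.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct Resolution { uint32_t x, y, z; }; // hypothetical stand-in for uint3

// Hypothetical sketch: largest box with aspect ratio ax:ay:az whose cells fit into memory_mb megabytes,
// assuming every grid cell occupies bytes_per_cell bytes of VRAM (illustrative parameter).
inline Resolution resolution_sketch(const float ax, const float ay, const float az, const uint32_t memory_mb, const uint32_t bytes_per_cell) {
    assert(ax > 0.0f && ay > 0.0f && az > 0.0f && bytes_per_cell > 0u);
    const double cells = (double)memory_mb * 1048576.0 / (double)bytes_per_cell; // total cell budget
    const double scale = std::cbrt(cells / ((double)ax * (double)ay * (double)az)); // cells per unit of aspect ratio
    // round each dimension down so the box stays within the budget
    return Resolution{ (uint32_t)(scale * ax), (uint32_t)(scale * ay), (uint32_t)(scale * az) };
}
```

For a 2:1:1 box, the returned x resolution is about twice the y and z resolution, and the product of all three stays at or below the cell budget.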

Bug fixes:

fixed flickering of streamline visualization in interactive graphics
improved smooth positioning of streamlines in slice mode
fixed bug where mass and massex in SURFACE extension were also allocated in CPU RAM (not required)
fixed bug in Q-criterion isosurface rendering of halo data in multi-GPU mode
reduced gap width between domains in Q-criterion isosurface rendering in multi-GPU mode
fixed crash/bug in local memory optimization in mesh voxelization kernel
removed shared memory optimization from mesh voxelization kernel, as it crashes on Nvidia GPUs with new GPU drivers and is incompatible with old OpenCL 1.0 GPUs
fixed Info/Warning/Error message formatting for loading files

Have fun with the software!
-- Moritz

29 May 07:37

ProjectPhysX

v2.7

d6485e7

FluidX3D v2.7 (visualization upgrade)

New features:

added slice visualization (key 2 / key 3 modes, then switch through slice modes with key T, move slice with keys Q/E)
made flag wireframe / solid surface visualization kernels toggleable with key 1
added surface pressure visualization (key 1 when FORCE_FIELD is enabled and lbm.calculate_force_on_boundaries(); is called)
added binary .vtk export function for meshes with lbm.write_mesh_to_vtk(Mesh* mesh);
added time_step_multiplicator for integrate_particles() function in PARTICLES extension

Bug fixes:

made correction of wrong memory reporting on Intel Arc more robust
fixed bug in write_file() template functions
reverted to a separate cl::Context for each OpenCL device, as the shared Context would otherwise allocate extra VRAM on all other unused Nvidia GPUs
removed Debug and x86 configurations from Visual Studio solution file (one less complication for compiling)
fixed bug where particles could get too close to walls and get stuck, or leave the fluid phase (added boundary force)

16 Apr 10:29

ProjectPhysX

v2.6

8c25a1f

FluidX3D v2.6 (Intel Arc patch)

FluidX3D is now fully operational on Intel Arc GPUs (I patched their OpenCL driver issues):

now VRAM allocations >4GB are possible
- this is necessary to use the full VRAM for simulations at the largest possible resolution
- performance impact is 1.5%, not significant
correct VRAM capacity is reported on Intel Arc A770, A750, A580, A380 (driver wrongly reports only 80% on Windows and 95% on Linux)

11 Apr 17:44

ProjectPhysX

v2.5

f9aba8b

FluidX3D v2.5 (raytracing overhaul)

Raytracing overhaul:

implemented light absorption in fluid for raytracing graphics (no performance impact, demo on YouTube)
improved raytracing framerate when camera is inside fluid
fixed skybox pole flickering artifacts
refactored raytracing code

Other bug fixes:

fixed bug where moving objects during re-voxelization would leave an erroneous trail of solid grid cells behind (increased mesh bounding box by 2 cells tolerance)

11 Mar 17:40

ProjectPhysX

v2.4

8c197a0

FluidX3D v2.4 (UI improvements)

UI improvements:

added a help menu with key H that shows keyboard/mouse controls, visualization settings and simulation stats
zoom control with keyboard is now keys +/- instead of ./,
printing camera settings to the console is now key G instead of H
a simple mouse click now frees/locks the cursor, in addition to key U
if the grid resolution is set larger than memory capacity allows, an error will now be printed, suggesting the largest possible grid resolution, so users don't have to guess how large the grid can be
all source files are now encoded in UTF-8

Minor optimizations:

the allocation size for the transfer buffers is now not the maximum of Ax/Ay/Az, but only the maximum of the areas that are actually communicated; saves a few MB VRAM on some occasions
the transfer buffer for fi is now laid out as a faster array of structures instead of a structure of arrays; performance difference is negligible
refactoring in smart_device_selection() function
upgraded OpenCL-Wrapper: devices from the same vendor are now in the same OpenCL Context, allowing migration of Memory objects; event-driven synchronisation can now be used
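
The transfer buffer sizing can be sketched as below. This is only an illustration under one stated assumption: a domain face counts as "actually communicated" only if there is more than one domain along its normal direction. The function name max_communicated_area and its parameters are hypothetical and not taken from the FluidX3D source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical sketch: number of cells in the largest domain face that is exchanged with a neighbor.
// Nx/Ny/Nz: resolution of one domain; Dx/Dy/Dz: number of domains in each direction.
// Assumption: a face is only communicated if there is more than one domain along its normal.
inline uint64_t max_communicated_area(const uint64_t Nx, const uint64_t Ny, const uint64_t Nz, const uint32_t Dx, const uint32_t Dy, const uint32_t Dz) {
    assert(Dx > 0u && Dy > 0u && Dz > 0u);
    const uint64_t Ax = Ny * Nz, Ay = Nz * Nx, Az = Nx * Ny; // face areas normal to x/y/z
    uint64_t area = 0u;
    if (Dx > 1u) area = std::max(area, Ax);
    if (Dy > 1u) area = std::max(area, Ay);
    if (Dz > 1u) area = std::max(area, Az);
    return area; // 0 for a single domain: nothing is communicated
}
```

For a 512x256x64 domain split only along x, this gives 256*64 = 16384 cells instead of the 512*256 = 131072 cells that the plain maximum of Ax/Ay/Az would reserve.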

Bug fixes:

fixed bug in temperature equilibrium function for temperature extension; lattice speed of sound in D3Q7 is 1/2 and not 1/sqrt(3)
replaced an erroneous double literal in the skybox color functions with a float literal (the double literal caused a bug on Intel iGPUs)
fixed bug in make.sh where multiple console parameters for multi-GPU device IDs would not get forwarded from the ./make.sh call to the bin/FluidX3D executable
fixed bug in mouse rotation on Windows where the free cursor kept getting re-centered during rotation
fixed bug in interactive graphics where text labels on the right side of the screen would not get drawn on both left/right eye screens in VR mode
fixed bug in LBM::voxelize_stl() size parameter standard initialization
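
The lattice speed of sound in the temperature fix can be checked numerically. The sketch below evaluates cs² = Σ w_i c_ix² for two velocity sets. The weights are the commonly used ones (D3Q7: 1/4 for the rest velocity and 1/8 for the six axis neighbors; D3Q19: 1/3, 1/18, 1/36); they are an assumption here, as the release notes do not list them, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Lattice speed of sound squared from the second moment of the weights: cs^2 = sum_i w_i * cx_i^2.
inline double cs_squared(const double* w, const int* cx, const int Q) {
    assert(Q > 0);
    double sum = 0.0;
    for (int i = 0; i < Q; i++) sum += w[i] * (double)(cx[i] * cx[i]);
    return sum;
}

// D3Q7 with assumed weights 1/4 (rest) and 1/8 (six axis neighbors); only +x and -x contribute.
inline double cs_squared_d3q7() {
    const double w[7] = { 1.0/4.0, 1.0/8.0, 1.0/8.0, 1.0/8.0, 1.0/8.0, 1.0/8.0, 1.0/8.0 };
    const int cx[7] = { 0, 1, -1, 0, 0, 0, 0 };
    return cs_squared(w, cx, 7);
}

// D3Q19 with the standard weights 1/3 (rest), 1/18 (axis neighbors), 1/36 (diagonal neighbors).
inline double cs_squared_d3q19() {
    const double a = 1.0/18.0, d = 1.0/36.0;
    const double w[19] = { 1.0/3.0, a, a, a, a, a, a, d, d, d, d, d, d, d, d, d, d, d, d };
    // order: rest; +x -x +y -y +z -z; four xy diagonals; four xz diagonals; four yz diagonals
    const int cx[19] = { 0, 1, -1, 0, 0, 0, 0, 1, -1, 1, -1, 1, -1, 1, -1, 0, 0, 0, 0 };
    return cs_squared(w, cx, 19);
}
```

With these weights, D3Q7 gives cs² = 1/4, so cs = 1/2, while D3Q19 gives cs² = 1/3, so cs = 1/sqrt(3).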

30 Jan 15:43

ProjectPhysX

v2.3

a736c8f

FluidX3D v2.3 (particles)

Particle update:

added particles with immersed-boundary method (either passive or 2-way-coupled, only supported with single-GPU)
minor optimization to GPU voxelization algorithm (workgroup threads outside mesh bounding-box return after ray-mesh intersections have been found)
displayed GPU memory allocation size is now fully accurate
fixed bug in write_line() function in src/utilities.hpp
removed .exe file extension for Linux/macOS
refactoring and cosmetics

20 Jan 08:57

ProjectPhysX

v2.2

b7d883f

FluidX3D v2.2 (velocity voxelization)

Velocity voxelization update:

simulation of moving/rotating geometry is now possible
- added option to voxelize moving/rotating geometry on GPU, with automatic velocity initialization for each grid point based on center of rotation, linear velocity and rotational velocity
- cells that are converted from solid->fluid during re-voxelization now have their DDFs properly initialized
- added option to not auto-scale mesh during read_stl(...), with negative size parameter
added kernel for solid boundary rendering with marching-cubes
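
The velocity assigned to a grid point of a moving/rotating body follows from rigid-body kinematics: v = v_linear + ω × (p − center). The sketch below shows this formula on the CPU; the Vec3 struct and the function names are hypothetical, and the actual GPU kernel in FluidX3D may be organized differently.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; }; // hypothetical stand-in for float3

inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{ a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

// Velocity of point p on a rigid body that translates with velocity linear
// and rotates with angular velocity omega around the point center.
inline Vec3 rigid_body_velocity(const Vec3& p, const Vec3& center, const Vec3& linear, const Vec3& omega) {
    const Vec3 r{ p.x - center.x, p.y - center.y, p.z - center.z }; // lever arm from center of rotation
    const Vec3 rotational = cross(omega, r); // omega x r
    return Vec3{ linear.x + rotational.x, linear.y + rotational.y, linear.z + rotational.z };
}
```

For example, a point one cell away from the center along +x, on a body rotating around the z-axis, moves in the +y direction.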

15 Jan 19:45

ProjectPhysX

v2.1

245303e

FluidX3D v2.1 (fast voxelization)

Fast GPU voxelization update:

new algorithm for .stl mesh GPU voxelization: ~500x faster now, from minutes to milliseconds
added unvoxelize kernel, to quickly remove all boundaries in the mesh bounding box
removed old hull voxelization algorithm

Old: naive GPU voxelization

For each voxel in the 3D grid, cast a ray from the voxel center in an arbitrary direction, and check with all mesh triangles for intersection.
Count the number of intersections.
Odd number of intersections means the voxel is inside.
Runtime: N³×Triangles

New: fast GPU voxelization

Only for the 2D bottom layer of grid points, shoot vertical rays upward and check with all mesh triangles for intersection.
The vertical rays pass through all voxels in the columns above, so these don't have to be checked for ray-mesh intersection at all.
Store all intersection distances in a short array in registers.
Sort this array with insertion sort.
Iterate through the vertical column of voxels.
- The first voxel is inside/outside depending on odd/even total intersection count.
- Each time one of the stored distances in the sorted array is passed, switch inside/outside state.
Optimizations
- Only check inside the bounding box of the mesh.
- Don't always start from the bottom (z-direction), but from the direction where the mesh bounding box has the smallest cross-section area, so the smallest number of ray-mesh intersections have to be tested.
- To avoid errors on the odd/even total number of intersections, shoot a second ray in the opposite direction and only count the intersection number. Both have to be odd for the bottom voxel to start in inside state.
Runtime: N²×Triangles, if N=500, this is 500x faster than naive voxelization
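
The column pass of the fast algorithm (sort the stored distances, then walk up the column and toggle the inside/outside state) can be sketched on the CPU as follows. This is a simplified illustration, not the OpenCL kernel: the function name voxelize_column is hypothetical, the ray-triangle intersection tests that produce the distances are omitted, and distances are assumed to be in cell units measured from the center of the first voxel. The second ray in the opposite direction and the bounding-box optimizations are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch of the column pass for one vertical ray.
// distances: ray-mesh intersection distances from the center of voxel 0, in cell units.
// Returns one inside/outside flag per voxel in the column.
inline std::vector<bool> voxelize_column(std::vector<float> distances, const std::size_t column_height) {
    // insertion sort: the list of intersections per ray is short, so this is cheap
    for (std::size_t i = 1; i < distances.size(); i++) {
        const float d = distances[i];
        std::size_t j = i;
        while (j > 0 && distances[j - 1] > d) { distances[j] = distances[j - 1]; j--; }
        distances[j] = d;
    }
    bool inside = (distances.size() % 2 == 1); // odd total count: the first voxel starts inside
    std::vector<bool> column(column_height, false);
    std::size_t next = 0; // index of the next intersection that has not been passed yet
    for (std::size_t k = 0; k < column_height; k++) {
        // every intersection below the center of voxel k flips the state
        while (next < distances.size() && distances[next] < (float)k) { inside = !inside; next++; }
        column[k] = inside;
    }
    return column;
}
```

With intersections at distances 2.5 and 5.5 in a column of 8 voxels, the count is even, so the column starts outside, and voxels 3, 4 and 5 are marked inside.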

Known issues:

voxelization might not always produce binary identical results in multi-GPU (floating-point round-off on ray-triangle intersection distances may differ for different ray origin) --> fixed in v2.16!

09 Jan 18:30

ProjectPhysX

v2.0

9f69df4

FluidX3D v2.0 (multi-GPU upgrade)

Big multi-GPU Update:

Multi-GPU simulations are now possible on a single node (PC/laptop/server), allowing VRAM from multiple GPUs to be pooled.
- Easy setup with minimal changes for the user: instead of LBM lbm(Nx, Ny, Nz, nu, ...);, use LBM lbm(Nx, Ny, Nz, Dx, Dy, Dz, nu, ...);, with Dx/Dy/Dz indicating how many domains (GPUs) to use in each spatial direction. By default, all identical GPUs are automatically assigned their domains; the GPUs can also be set manually with a list of their indices: ./make.sh 2 6 3 4 or bin/FluidX3D 2 6 3 4.
- All extensions are supported and validated to produce binary identical results compared to single-GPU simulations.
- Multi-GPU also works with non-identical GPUs, regardless of vendor. Yes, you can run FluidX3D on unholy combinations of Nvidia/AMD/Intel GPUs/CPUs at the same time. I only recommend similar memory capacity and bandwidth, as the weakest GPU will bottleneck performance.
- No SLI/Crossfire/NVLink/InfinityFabric is required. All communication runs over PCIe and is compatible with all hardware.
- No MPI installation is required.
- Total grid resolution must be equally divisible into domains, such that all domains are the same size.
- The resolution of each domain is restricted to 4.29 billion grid points (2³², 225GB VRAM), but domain number and thus total grid resolution is unrestricted.
- Under the hood: Complete re-write of C++ backend, to account for the domain decomposition architecture. The code is already fully optimized and shortened for maximum maintainability/upgradeability.
Grid resolution can now be arbitrary and is no longer restricted to the condition (Nx*Ny*Nz)%WORKGROUP_SIZE==0.
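
The two domain constraints above (equal divisibility and at most 2³² grid points per domain) can be checked with a small helper before setting up a multi-GPU run. The sketch below is only an illustration; DomainSize and split_into_domains are hypothetical names and not part of the FluidX3D API.

```cpp
#include <cassert>
#include <cstdint>

struct DomainSize { uint64_t x, y, z; bool valid; };

// Hypothetical helper: split a total grid of Nx*Ny*Nz cells into Dx*Dy*Dz equally sized domains.
// valid is false if a direction is not evenly divisible or if one domain exceeds 2^32 grid points.
// The product x*y*z is assumed not to overflow 64 bits.
inline DomainSize split_into_domains(const uint64_t Nx, const uint64_t Ny, const uint64_t Nz, const uint32_t Dx, const uint32_t Dy, const uint32_t Dz) {
    assert(Dx > 0u && Dy > 0u && Dz > 0u);
    const bool divisible = (Nx % Dx == 0u) && (Ny % Dy == 0u) && (Nz % Dz == 0u);
    const uint64_t x = Nx / Dx, y = Ny / Dy, z = Nz / Dz; // resolution of a single domain
    const uint64_t max_cells = 1ull << 32; // 4.29 billion grid points per domain
    const bool fits = (x * y * z <= max_cells);
    return DomainSize{ x, y, z, divisible && fits };
}
```

A 1024x512x512 grid with Dx=2 yields two valid 512³ domains, whereas a 4096³ grid needs at least 16 domains to stay within the per-domain limit.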

Known issues:

Raytracing graphics are disabled for multi-GPU. The simulated light rays would have to travel through the entire simulation box, crossing domain boundaries. This is not easily possible, because each GPU only keeps its own domain in VRAM.
