Recent Releases of fluidx3d

fluidx3d - FluidX3D v3.4 (bug fixes)

Thank you for using FluidX3D! I finally also have an AMD GPU in my possession, so I can test FluidX3D locally on AMD/Intel/Nvidia GPUs within the same PC, to guarantee full compatibility. This allowed me to identify and fix 2 critical bugs that were coding mistakes on my side, yet somehow were only exposed by AMD's driver.


Improvements - updated OpenCL driver install versions - minor refactoring in stream_collide()


Bug fixes - fixed bug in insertion-sort in voxelize_mesh() kernel causing crash on AMD GPUs - fixed bug in voxelize_mesh_on_device() host code causing initialization corruption on AMD GPUs - fixed dual CU and IPC reporting on AMD RDNA 1-4 GPUs


Have fun with the software! -- Moritz


PS: Here's a little demo of "SLI"-ing AMD+Intel+Nvidia GPUs with FluidX3D:

- C++
Published by ProjectPhysX 6 months ago

fluidx3d - FluidX3D v3.3 (faster .vtk export)

Thank you for using FluidX3D! Update v3.3 brings improvements to .vtk export and bug fixes:


Improvements - .vtk export now converts and writes data in chunks, to reduce memory footprint and time for large memory allocation - .vtk files now contain original file name as metadata in title - INTERACTIVE_GRAPHICS_ASCII now renders in 2x vertical resolution but with fewer colors - updated OpenCL-Wrapper: more robust dp4a detection, fixed core count reporting for RDNA4 GPUs


Bug fixes - fixed update_moving_boundaries() kernel not being called with flags other than TYPE_S - fixed corrupted first frame until resizing with INTERACTIVE_GRAPHICS_ASCII - fixed resolution() function for D2Q9 - fixed missing <chrono> header on some compilers - fixed bug in split_regex() - fixed compiler warning with min_int


Have fun with the software! -- Moritz

- C++
Published by ProjectPhysX 8 months ago

fluidx3d - FluidX3D v3.2 (fast force/torque summation)

Thank you for using FluidX3D! Update v3.2 brings the much requested GPU-accelerated force/torque summation:


Improvements
- implemented GPU-accelerated force/torque summation (~20x faster than the previous CPU-multithreaded implementation)
- simplified calculating object force/torque in setups; before:

      lbm.voxelize_mesh_on_device(mesh, TYPE_S|TYPE_X);
      const float3 lbm_com = lbm.calculate_object_center_of_mass(TYPE_S|TYPE_X);
      // ...
      lbm.calculate_force_on_boundaries();
      lbm.F.read_from_device(); // having to copy entire lbm.F from GPU VRAM to CPU RAM was slow!!
      const float3 lbm_force = lbm.calculate_force_on_object(TYPE_S|TYPE_X); // slow CPU-multithreaded summation
      const float3 lbm_torque = lbm.calculate_torque_on_object(lbm_com, TYPE_S|TYPE_X); // slow CPU-multithreaded summation

  now:

      lbm.voxelize_mesh_on_device(mesh, TYPE_S|TYPE_X);
      const float3 lbm_com = lbm.object_center_of_mass(TYPE_S|TYPE_X);
      // ...
      const float3 lbm_force = lbm.object_force(TYPE_S|TYPE_X); // fast GPU-accelerated summation, copy only result to CPU
      const float3 lbm_torque = lbm.object_torque(lbm_com, TYPE_S|TYPE_X); // fast GPU-accelerated summation, copy only result to CPU

- improved coloring in VIS_FIELD/ray_grid_traverse_sum()
- updated OpenCL-Wrapper: now compiles OpenCL C code with -cl-std=CL3.0 if available
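For readers who want to see what is being summed, here is a minimal serial CPU reference sketch. It is not the FluidX3D implementation: `Vec3`, `ForceTorque` and `sum_force_and_torque` are hypothetical names, and it assumes the torque is taken as the sum of (r_i - com) x F_i over all cells of the object.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; }; // hypothetical stand-in for float3

inline Vec3 operator+(const Vec3& a, const Vec3& b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

struct ForceTorque { Vec3 force, torque; };

// positions[i] is the position of object cell i, forces[i] is the force acting on it,
// com is the center of mass the torque is measured around
inline ForceTorque sum_force_and_torque(const std::vector<Vec3>& positions,
                                        const std::vector<Vec3>& forces,
                                        const Vec3& com) {
    assert(positions.size() == forces.size());
    ForceTorque result{{0.0, 0.0, 0.0}, {0.0, 0.0, 0.0}};
    for (std::size_t i = 0; i < positions.size(); i++) {
        result.force = result.force + forces[i]; // total force: plain sum
        result.torque = result.torque + cross(positions[i] - com, forces[i]); // lever arm x force
    }
    return result;
}
```

A GPU version would compute the same two sums as a parallel reduction, so that only the two resulting vectors need to be copied back to the CPU.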


Bug fixes - fixed compiling on macOS with new OpenCL headers


Have fun with the software! -- Moritz


Here is a showcase of the improved coloring in VIS_FIELD/ray_grid_traverse_sum(): [two images]

- C++
Published by ProjectPhysX 10 months ago

fluidx3d - FluidX3D v3.1 (more bug fixes)

Thank you for using FluidX3D! Update v3.1 brings two critical bug fixes/workarounds and various small improvements under the hood:


Improvements - faster enqueueReadBuffer() on modern CPUs with 64-Byte-aligned host_buffer - hardened ray intersection functions against planar ray edge case - updated OpenCL headers - better OpenCL device specs detection using vendor ID and Nvidia compute capability - better VRAM capacity reporting correction for Intel dGPUs - improved styling of performance mermaid gantt chart in Readme - added multi-GPU performance mermaid gantt chart in Readme - updated driver install guides


Bug fixes - fixed voxelization being broken on some GPUs - added workaround for compiler bug in Intel CPU Runtime for OpenCL that causes Q-criterion isosurface rendering corruption - fixed TFlops estimate for Intel Battlemage GPUs - fixed wrong device name reporting for AMD GPUs (unlike every sane GPU vendor they don't report device name as CL_DEVICE_NAME but need CL_DEVICE_BOARD_NAME_AMD extension instead)


Have fun with the software! -- Moritz

- C++
Published by ProjectPhysX 11 months ago

fluidx3d - FluidX3D v3.0 (larger CPU/iGPU simulations)

A little gift to you all: FluidX3D v3.0 enables 31% larger grid resolution when running on CPUs or iGPUs!


Improvements - reduced memory footprint on CPUs and iGPUs from 72 to 55 Bytes/cell (fused OpenCL host+device buffers for rho/u/flags), allowing 31% higher resolution in the same RAM capacity - faster hardware-supported and faster fallback emulation atomic floating-point addition for PARTICLES extension - hardened calculate_f_eq() against bad user input for D2Q9
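The 31% figure follows from the ratio 72/55 ≈ 1.31. The sketch below works through that arithmetic; the helper names `max_cells` and `max_cubic_edge` are hypothetical and not part of FluidX3D, and the calculation ignores any memory other than the per-cell buffers.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Number of grid cells that fit into a given memory capacity
inline std::uint64_t max_cells(std::uint64_t memory_bytes, std::uint64_t bytes_per_cell) {
    assert(bytes_per_cell > 0u);
    return memory_bytes / bytes_per_cell;
}

// Edge length N of the largest cubic N x N x N grid that fits into memory
inline std::uint64_t max_cubic_edge(std::uint64_t memory_bytes, std::uint64_t bytes_per_cell) {
    const std::uint64_t cells = max_cells(memory_bytes, bytes_per_cell);
    std::uint64_t n = static_cast<std::uint64_t>(std::cbrt(static_cast<double>(cells)));
    while ((n + 1u) * (n + 1u) * (n + 1u) <= cells) n++; // correct floating-point round-off
    while (n > 0u && n * n * n > cells) n--;
    return n;
}
```

With 16 GiB of RAM, for example, the cubic grid grows from 620³ to 678³ cells. Note that the 31% refers to the total cell count; the edge length of a cubic grid grows by roughly 9%.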


Bug fixes - fixed velocity voxelization for overlapping geometry with different velocity - fixed Remaining Time printout during paused simulation - fixed CPU/GPU memory printout for CPU/iGPU simulations - fixed bug that default_filename() would fail if there was a . in the file path


Have fun with the software! -- Moritz


PS: Here's a little demo of what FluidX3D v3.0 is capable of:

- C++
Published by ProjectPhysX about 1 year ago

fluidx3d - FluidX3D v2.19 (camera splines)

Thank you for using FluidX3D! Update v2.19 adds Catmull-Rom splines for smooth camera movement, and bug fixes:


Improvements - the camera can now fly along a smooth path through a list of provided keyframe camera placements, using Catmull-Rom splines - more accurate remaining runtime estimation that includes time spent on rendering - enabled FP16S memory compression by default - printed camera placement using key G is now formatted for easier copy/paste - added benchmark chart in Readme using mermaid gantt chart - placed memory allocation info during simulation startup at better location
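As an illustration of the spline technique, here is a minimal sketch of uniform Catmull-Rom interpolation for a single scalar camera parameter (apply it per component for position, rotation angles, zoom and so on). The function names are hypothetical and not the FluidX3D API, and duplicating the first/last keyframe at the path ends is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Uniform Catmull-Rom segment between p1 (t=0) and p2 (t=1); p0 and p3 shape the tangents
inline double catmull_rom(double p0, double p1, double p2, double p3, double t) {
    const double t2 = t * t, t3 = t2 * t;
    return 0.5 * (2.0 * p1
                + (-p0 + p2) * t
                + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t2
                + (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t3);
}

// Evaluate a path through all keyframes; s runs from 0 (first keyframe) to keys.size()-1 (last)
inline double spline_path(const std::vector<double>& keys, double s) {
    assert(keys.size() >= 2u);
    const double s_max = static_cast<double>(keys.size() - 1u);
    s = std::max(0.0, std::min(s, s_max)); // clamp to the valid range
    const std::size_t i = std::min(static_cast<std::size_t>(s), keys.size() - 2u); // segment index
    const double t = s - static_cast<double>(i); // local parameter within the segment
    const double p0 = keys[i > 0u ? i - 1u : 0u]; // duplicate first keyframe at the start
    const double p3 = keys[i + 2u < keys.size() ? i + 2u : keys.size() - 1u]; // duplicate last at the end
    return catmull_rom(p0, keys[i], keys[i + 1u], p3, t);
}
```

The curve passes exactly through every keyframe and has a continuous first derivative in between, which is what makes the camera motion look smooth rather than piecewise linear.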


Bug fixes - fixed threading conflict between INTERACTIVE_GRAPHICS and lbm.graphics.write_frame(); - fixed maximum buffer allocation size limit for AMD GPUs and in Intel CPU Runtime for OpenCL - fixed wrong Re<Re_max info printout for 2D simulations - minor fix in bandwidth_bytes_per_cell_device()


Have fun with the software! -- Moritz

- C++
Published by ProjectPhysX over 1 year ago

fluidx3d - FluidX3D v2.18 (more bug fixes)

Thank you for using FluidX3D! Update v2.18 brings support for high refresh rate monitors on Linux and bug fixes:


Improvements - added support for high refresh rate monitors on Linux - more compact OpenCL Runtime installation scripts in Documentation - driver/runtime installation instructions will now be printed to console if no OpenCL devices are available - added domain information to LBM::write_status() - added LBM::index function for uint3 input parameter


Bug fixes - fixed that very large simulations sometimes wouldn't render properly by increasing maximum render distance from 10k to 2.1M - fixed mouse input stuttering at high screen refresh rate on Linux - fixed graphical artifacts in free surface raytracing on Intel CPU Runtime for OpenCL - fixed runtime estimation printed in console for setups with multiple lbm.run(...) calls - fixed density oscillations in sample setups (too large lbm_u) - fixed minor graphical artifacts in raytrace_phi() - fixed minor graphical artifacts in ray_grid_traverse_sum() - fixed wrong printed time step count on raindrop sample setup


Have fun with the software! -- Moritz

- C++
Published by ProjectPhysX over 1 year ago

fluidx3d - FluidX3D v2.17 (unlimited domain resolution)

Thank you for using FluidX3D! Update v2.17 removes the limit on 2³² cells per domain and adds new field visualization:


Improvements - for GPUs/CPUs with >225 GB memory: domains are no longer limited to 4.29 billion (2³², 1624³) grid cells; if more are used, the OpenCL code will automatically compile with 64-bit indexing - new, faster raytracing-based field visualization for single-GPU simulations (thanks @snektron for the idea!) - added GPU Driver and OpenCL Runtime installation instructions to documentation - refactored INTERACTIVE_GRAPHICS_ASCII
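The switch to 64-bit indexing can be pictured with the following host-side sketch. The function names and the x-fastest linear index layout are assumptions for illustration, not code taken from FluidX3D.

```cpp
#include <cassert>
#include <cstdint>

// Total number of grid cells, computed in 64-bit to avoid overflow
inline std::uint64_t cell_count(std::uint32_t Nx, std::uint32_t Ny, std::uint32_t Nz) {
    return static_cast<std::uint64_t>(Nx) * Ny * Nz;
}

// A 32-bit index can address at most 2^32 cells (indices 0 .. 2^32-1)
inline bool needs_64bit_indexing(std::uint32_t Nx, std::uint32_t Ny, std::uint32_t Nz) {
    return cell_count(Nx, Ny, Nz) > 4294967296ull;
}

// Linear cell index in 64-bit (assumed layout: x runs fastest, then y, then z)
inline std::uint64_t linear_index(std::uint32_t x, std::uint32_t y, std::uint32_t z,
                                  std::uint32_t Nx, std::uint32_t Ny) {
    return static_cast<std::uint64_t>(x)
         + (static_cast<std::uint64_t>(y) + static_cast<std::uint64_t>(z) * Ny) * Nx;
}
```

A 1624³ domain (about 4.28 billion cells) still fits 32-bit indexing, while 1626³ (about 4.30 billion cells) does not.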


Bug fixes - fixed memory leak in destructors of floatN, floatNxN, doubleN, doubleNxN (all unused) - made camera movement/rotation/zoom behavior independent of framerate - fixed that smart_device_selection() would print a wrong warning if device reports 0 MHz clock speed

Have fun with the software! -- Moritz


A glimpse of the new raytracing-based field visualization: [image]

- C++
Published by ProjectPhysX over 1 year ago

fluidx3d - FluidX3D v2.16 (bug fixes)

I'm doing my part! With the v2.16 update I've put down all remaining known bugs for good. 🖖😎❌🪳❌ WOULD YOU LIKE TO KNOW MORE?


Bug fixes in this release: - fixed that voxelization failed in Intel OpenCL CPU Runtime due to array out-of-bounds access - fixed that voxelization did not always produce binary identical results in multi-GPU compared to single-GPU - fixed that velocity voxelization failed for free surface simulations - fixed terrible performance on ARM GPUs by macro-replacing fused-multiply-add (fma) with a*b+c - fixed that Y/Z keys were incorrect for QWERTY keyboard layout in Linux - fixed that free camera movement speed in help overlay was not updated in stationary image when scrolling - fixed that cursor would sometimes flicker when scrolling on trackpads with Linux-X11 interactive graphics - fixed flickering of interactive rendering with multi-GPU when camera is not moved - fixed missing XInitThreads() call that could crash Linux interactive graphics on some systems - fixed z-fighting between graphics_rasterize_phi() and graphics_flags_mc() kernels


Other improvements: - simplified 10% faster marching-cubes implementation with 1D interpolation on edges instead of 3D interpolation, which makes the edge table unnecessary - added faster, simplified marching-cubes variant for solid surface rendering where edges are always halfway between grid cells - refactoring in OpenCL rendering kernels


With GitHub I can track every bug from the day it was discovered/fixed back to the day it was first introduced. This allows me to graph the number of open bugs over time, along with a curve weighted by their individual severity (minor 0.25, low 0.5, medium 1.0, high 2.0, showstopper 4.0): [image]

Here is the distribution of days open, days till discovery and days till fix. I fixed 56% of bugs on the day of discovery. Notice the bimodal distribution of days open - a clear separation into "easy" and "nasty" bugs. [image]

Lessons learned:
- Since release there were 63 bugs in FluidX3D in total, with at most 41 open bugs at one time. 🖖😱 Now there are 0, at least until I find more. 🖖😎 For reference: FluidX3D is 12.1k lines of code.
- Most bugs were a byproduct of big feature updates, like v2.0 (multi-GPU) and v2.1/v2.2 (voxelization). Of course at the time of introduction I didn't know that bugs slipped through, and I (or users) only discovered them later.
- Only 17% of bugs were found by users, all the others I found myself with rigorous testing. It takes continuous poking around in the code to find these often super rare bugs.
- 30% of bugs were actually bugs in the compiler, driver or operating system that needed a workaround on application side.
- The latest v2.16 release is the best FluidX3D has ever been. The worst, most bugged version by this metric is v2.2. 🖖🤠

Have fun with the software! -- Moritz


PS: Here's an amusing FluidX3D video from @SLGY, he's doing his part too!

- C++
Published by ProjectPhysX over 1 year ago

fluidx3d - FluidX3D v2.15 (framerate boost)

Thank you for using FluidX3D! Update v2.15 boosts framerate in interactive graphics by 20-70%: - eliminated one frame memory copy and one clear frame operation in rendering chain - enabled g++ compiler optimizations for faster startup and higher rendering framerate


Bug fixes - fixed bug in multithreaded sanity checks - fixed wrong unit conversion for thermal expansion coefficient - fixed density to pressure conversion in LBM units - fixed bug that raytracing kernel could lock up simulation - fixed minor visual artifacts with raytracing - fixed that console sometimes was not cleared before INTERACTIVE_GRAPHICS_ASCII rendering started


Have fun with the software! -- Moritz

- C++
Published by ProjectPhysX over 1 year ago

fluidx3d - FluidX3D v2.14 (visualization upgrade)

Thank you for using FluidX3D! Update v2.14 brings an upgrade to visualization kernels and eases compiling: - coloring can now be switched between velocity/density/temperature with key Z - uniform improved color palettes for velocity/density/temperature visualization - color scale with automatic unit conversion can now be shown with key H - slice mode for field visualization now draws fully filled-in slices instead of only lines for velocity vectors - shading in VIS_FLAG_SURFACE and VIS_PHI_RASTERIZE modes is smoother now - make.sh now automatically detects operating system and X11 support on Linux and only runs FluidX3D if last compilation was successful


Bug fixes - fixed compiler warnings on Android - fixed make.sh failing on some systems due to nonstandard interpreter path - fixed that make would not compile with multiple cores on some systems


Here is a YouTube video (some screen recordings) to showcase the update, all real-time simulations on an Intel Arc A750: [video]

Have fun with the software! -- Moritz

- C++
Published by ProjectPhysX almost 2 years ago

fluidx3d - FluidX3D v2.13 (improved .vtk export)

Thank you for using FluidX3D! Update v2.13 improves .vtk export: - data in exported .vtk files is now automatically converted to SI units - ~2x faster .vtk export with multithreading - added unit conversion functions for TEMPERATURE extension


Bug fixes: - fixed graphical artifacts with axis-aligned camera in raytracing - fixed get_exe_path() for macOS - fixed X11 multi-monitor issues on Linux - workaround for Nvidia driver bug: enqueueFillBuffer is broken for large buffers on Nvidia GPUs - fixed slow numeric drift issues caused by -cl-fast-relaxed-math - fixed wrong Maximum Allocation Size reporting in LBM::write_status() - fixed missing scaling of coordinates to SI units in LBM::write_mesh_to_vtk()


Have fun with the software! -- Moritz

- C++
Published by ProjectPhysX almost 2 years ago

fluidx3d - FluidX3D v2.12

Thank you for using FluidX3D! Update v2.12 significantly reduces compile and startup time: - if make is installed, source code compiling on Linux is now ~3x faster using multiple CPU cores, from ~15s to ~5s - simulation initialization for single-GPU simulations is ~40% faster now - simulation initialization for multi-GPU simulations is ~15% faster now


Bug fixes - minor bug fix in Memory_Container::reset() function


Here is how launch time changed with FluidX3D versions: [image]

- Setup: 3D Taylor-Green vortices, single 384³ domain, D3Q19 SRT FP32/FP16S
- Hardware: Lenovo Y50-70, i7-4720HQ, 2x 8GB DDR3 1600 MT/s, GTX 960M 4GB

I compared all previous versions and found that v2.0 introduced a big jump in startup time. This was due to changing LBM data field access from direct array access to domain decomposition indexing, which turned out to be the main bottleneck during simulation startup. This is now fixed, with an indexing shortcut for single-GPU and pre-computed variables for multi-GPU indexing. Together with multi-core parallelization of initialization (v2.9) and faster buffer initialization (v2.11), launch time is now shorter than ever.

Have fun with the software! -- Moritz

- C++
Published by ProjectPhysX almost 2 years ago

fluidx3d - FluidX3D v2.11 (improved Linux graphics)

Thank you for using FluidX3D! I have recently upgraded my laptop from Windows 10 to ~~bugged and bloated Windows 11~~ kubuntu Linux with the amazing KDE Plasma desktop, and wanted the same FluidX3D interactive graphics capabilities as on Windows. FluidX3D could already do interactive graphics on Linux since v1.4, but only in 720p windowed mode. Update v2.11 changes that, and also adds two more minor improvements: - interactive graphics on Linux are now in fullscreen mode too, fully matching interactive graphics on Windows in functionality and user interface - made CPU/GPU buffer initialization significantly faster with std::fill and enqueueFillBuffer (overall ~8% faster simulation startup) - added operating system info to OpenCL device driver version printout


Bug fixes - fixed flickering with frustum culling at very small field of view - fixed bug where rendered/exported frame was not updated when visualization_modes changed


Have fun with the software! -- Moritz

- C++
Published by ProjectPhysX about 2 years ago

fluidx3d - FluidX3D v2.10 (frustum culling)

Thank you for using FluidX3D! Update v2.10 contains improvements to rasterization performance and bug fixes: - improved rasterization performance via frustum culling when only part of the simulation box is visible - improved switching between centered/free camera mode - refactored OpenCL rendering library - unit conversion factors are now automatically printed in console when units.set_m_kg_s(...) is used - faster startup time for FluidX3D benchmark


Bug fixes - minor bug fix in voxelize_mesh(...) kernel - fixed bug in shading(...) - replaced slow (in multithreading) std::rand() function with standard C99 LCG - more robust correction of wrong VRAM capacity reporting on Intel Arc GPUs - fixed some minor compiler warnings
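To illustrate the std::rand() replacement: a linear congruential generator that keeps its state in a caller-owned variable needs no shared global state between threads. The sketch below uses the constants from the sample rand() implementation printed in the C standard; whether FluidX3D uses exactly these constants and this output range is an assumption, and `lcg_next` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// One LCG step; the state lives with the caller, so each thread can own its own generator.
// Constants are those of the sample rand() in the C standard (assumed here for illustration).
inline std::uint32_t lcg_next(std::uint32_t& state) {
    state = state * 1103515245u + 12345u; // unsigned arithmetic wraps modulo 2^32
    return (state >> 16) & 0x7FFFu; // upper bits are of better quality; result is in [0, 32767]
}
```

Giving every thread a different seed yields independent, reproducible streams without any synchronization between threads.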


Have fun with the software! -- Moritz

- C++
Published by ProjectPhysX about 2 years ago

fluidx3d - FluidX3D v2.9 (multithreading)

Thank you for using FluidX3D! The v2.9 update makes simulation startup a lot quicker, especially for large multi-GPU simulations: - added cross-platform parallel_for implementation in utilities.hpp using std::thread - significantly (>4x) faster simulation startup with multithreaded geometry initialization and sanity checks - faster calculate_force_on_object() and calculate_torque_on_object() functions with multithreading - refactoring
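A parallel_for built on std::thread can be sketched as follows. This is an illustrative version with an assumed signature, not the one from utilities.hpp: it splits the index range into contiguous chunks, one per thread, and joins all threads before returning.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Run body(n) for all n in [0, N), distributed over the given number of threads
inline void parallel_for(std::size_t N, unsigned threads, const std::function<void(std::size_t)>& body) {
    if (threads == 0u) threads = 1u; // e.g. std::thread::hardware_concurrency() may return 0
    threads = static_cast<unsigned>(std::min<std::size_t>(threads, std::max<std::size_t>(N, 1u)));
    std::vector<std::thread> pool;
    pool.reserve(threads);
    for (unsigned t = 0u; t < threads; t++) {
        const std::size_t begin = N * t / threads; // contiguous chunk [begin, end) for thread t
        const std::size_t end = N * (t + 1u) / threads;
        pool.emplace_back([begin, end, &body]() {
            for (std::size_t n = begin; n < end; n++) body(n);
        });
    }
    for (std::thread& worker : pool) worker.join(); // wait until all chunks are done
}
```

The body must be safe to call concurrently for different n, for example by writing only to element n of an output array. On Linux, linking may require the -pthread compiler flag.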


New features - added total runtime and LBM runtime to lbm.write_status()


Bug fixes - fixed bug in voxelization ray direction for re-voxelizing rotating objects - fixed bug in Mesh::get_bounding_box_size() - fixed bug in print_message() function in utilities.hpp


Let the cores go brrrr!


Have fun with the software! -- Moritz

- C++
Published by ProjectPhysX over 2 years ago

fluidx3d - FluidX3D v2.8 (documentation + polish)

Thank you for using FluidX3D! The v2.8 update doesn't add too many new features, but finally more documentation, loads of refactoring and significant usability improvements:
- finally added more documentation
- cleaned up all sample setups in setup.cpp for much more beginner-friendly learning
- added required extensions in defines.hpp as comments to all setups in setup.cpp
- improved loading of composite .stl geometries, by adding an option to omit automatic repositioning of the mesh
- added more functionality to Mesh struct in utilities.hpp
- added uint3 resolution(float3 box_aspect_ratio, uint memory) function to compute simulation box resolution based on box aspect ratio and VRAM occupation in MB
- added bool lbm.graphics.next_frame(...) function to export images for a specified video length in the main_setup compute loop
- added VIS_... macros to ease setting visualization modes in headless graphics mode in lbm.graphics.visualization_modes
- simulation box dimensions are now automatically made equally divisible by domains for multi-GPU simulations
- made Info/Warning/Error message labels colored
- added Cessna 172 propeller airplane and Bell 222 helicopter setups to showcase how loading of composite .stl geometries and revoxelization of moving parts works
- added Ahmed body setup as an example on how body forces and drag coefficient are computed; expect absolute forces to be too large by up to a factor 2, because even large resolution is not enough to fully capture the turbulent boundary layer in this case; a wall function is needed, I'll scan literature on it


New features - added optional semi-transparent rendering mode (#define GRAPHICS_TRANSPARENCY 0.7f in defines.hpp)


Bug fixes - fixed flickering of streamline visualization in interactive graphics - improved smooth positioning of streamlines in slice mode - fixed bug where mass and massex in SURFACE extension were also allocated in CPU RAM (not required) - fixed bug in Q-criterion isosurface rendering of halo data in multi-GPU mode - reduced gap width between domains in Q-criterion isosurface rendering in multi-GPU mode - fixed crash/bug in local memory optimization in mesh voxelization kernel - removed shared memory optimization from mesh voxelization kernel, as it crashes on Nvidia GPUs with new GPU drivers and is incompatible with old OpenCL 1.0 GPUs - fixed Info/Warning/Error message formatting for loading files


Some showcases of what v2.8 is capable of: (click on images to show videos on YouTube)


Have fun with the software! -- Moritz

- C++
Published by ProjectPhysX over 2 years ago

fluidx3d - FluidX3D v2.7 (visualization upgrade)

New features - added slice visualization (key 2 / key 3 modes, then switch through slice modes with key T, move slice with keys Q/E) - made flag wireframe / solid surface visualization kernels toggleable with key 1 - added surface pressure visualization (key 1 when FORCE_FIELD is enabled and lbm.calculate_force_on_boundaries(); is called) - added binary .vtk export function for meshes with lbm.write_mesh_to_vtk(Mesh* mesh); - added time_step_multiplicator for integrate_particles() function in PARTICLES extension

Preview on YouTube: https://youtu.be/uL8usTb0Czg


Bug fixes - made correction of wrong memory reporting on Intel Arc more robust - fixed bug in write_file() template functions - reverted back to separate cl::Context for each OpenCL device, as the shared Context otherwise would allocate extra VRAM on all other unused Nvidia GPUs - removed Debug and x86 configurations from Visual Studio solution file (one less complication for compiling) - fixed bug that particles could get too close to walls and get stuck, or leave the fluid phase (added boundary force)


Known issues: - voxelization might not always produce binary identical results in multi-GPU (floating-point round-off on ray-triangle intersection distances may differ for different ray origin)

- C++
Published by ProjectPhysX over 2 years ago

fluidx3d - FluidX3D v2.6 (Intel Arc patch)

FluidX3D is now fully operational on Intel Arc GPUs (I patched their OpenCL driver issues) - now VRAM allocations >4GB are possible - this is necessary to use the full VRAM for simulations at the largest possible resolution - performance impact is 1.5%, not significant - correct VRAM capacity is reported on Intel Arc A770, A750, A580, A380 (driver wrongly reports only 80% on Windows and 95% on Linux)


Known issues: - voxelization might not always produce binary identical results in multi-GPU (floating-point round-off on ray-triangle intersection distances may differ for different ray origin)

- C++
Published by ProjectPhysX over 2 years ago

fluidx3d - FluidX3D v2.5 (raytracing overhaul)

Raytracing overhaul - implemented light absorption in fluid for raytracing graphics (no performance impact, demo on YouTube) - improved raytracing framerate when camera is inside fluid - fixed skybox pole flickering artifacts - refactored raytracing code


Other bug fixes - fixed bug where moving objects during re-voxelization would leave an erroneous trail of solid grid cells behind (increased mesh bounding box by 2 cells tolerance)


Known issues: - voxelization might not always produce binary identical results in multi-GPU (floating-point round-off on ray-triangle intersection distances may differ for different ray origin)

- C++
Published by ProjectPhysX over 2 years ago

fluidx3d - FluidX3D v2.4 (UI improvements)

UI improvements - added a help menu with key H that shows keyboard/mouse controls, visualization settings and simulation stats - zoom control with keyboard is now keys +/- instead of ./, - printing camera settings in console is now key G instead of H - a simple mouse click now frees/locks the cursor, in addition to key U - if the grid resolution is set larger than memory capacity allows, an error will now be printed, suggesting the largest possible grid resolution, so users don't have to guess how large the grid can be - all source files are now encoded in UTF-8


Minor optimizations - the allocation size for the transfer buffers is now not the maximum of Ax/Ay/Az, but only the maximum of the areas that are actually communicated; saves a few MB VRAM in some cases - the transfer buffer for fi is now used as faster array of structures instead of structure of arrays; performance difference is negligible - refactoring in smart_device_selection() function - upgraded OpenCL-Wrapper: devices from the same vendor are now in the same OpenCL Context, allowing migration of Memory objects; event-driven synchronisation can now be used


Bug fixes - fixed bug in temperature equilibrium function for temperature extension; lattice speed of sound in D3Q7 is 1/2 and not 1/sqrt(3) - made erroneous double literal in skybox color functions, which is a bug for Intel iGPUs, a float literal - fixed bug in make.sh where multiple console parameters for multi-GPU device IDs would not get forwarded from the ./make.sh call to the bin/FluidX3D executable - fixed bug in mouse rotation in Windows when cursor is free but kept getting centered during rotation - fixed bug in interactive graphics where text labels on the right side of the screen would not get drawn on both left/right eye screens in VR mode - fixed bug in LBM::voxelize_stl() size parameter standard initialization
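The speed-of-sound values in the first fix can be checked from the lattice weights via c_s² = Σ_i w_i c_ix². The sketch below assumes the commonly used weights (D3Q7: 1/4 for the rest direction and 1/8 for the six axis directions; D3Q19: 1/3, 1/18 and 1/36); the source does not list the weights, and all names here are hypothetical.

```cpp
#include <cassert>
#include <vector>

struct Direction { int cx, cy, cz; double w; }; // lattice velocity and its weight

// Build a velocity set from weights for the rest, axis and edge-diagonal directions;
// a weight of 0 leaves that group of directions out (corner directions are never included)
inline std::vector<Direction> velocity_set(double w_rest, double w_axis, double w_edge) {
    std::vector<Direction> set;
    for (int cz = -1; cz <= 1; cz++) for (int cy = -1; cy <= 1; cy++) for (int cx = -1; cx <= 1; cx++) {
        const int nonzero = (cx != 0) + (cy != 0) + (cz != 0);
        const double w = nonzero == 0 ? w_rest : nonzero == 1 ? w_axis : nonzero == 2 ? w_edge : 0.0;
        if (w > 0.0) set.push_back({cx, cy, cz, w});
    }
    return set;
}

inline std::vector<Direction> d3q7() { return velocity_set(1.0 / 4.0, 1.0 / 8.0, 0.0); }
inline std::vector<Direction> d3q19() { return velocity_set(1.0 / 3.0, 1.0 / 18.0, 1.0 / 36.0); }

// Squared lattice speed of sound: second moment of the weights along x
inline double cs_squared(const std::vector<Direction>& set) {
    double sum = 0.0;
    for (const Direction& d : set) sum += d.w * static_cast<double>(d.cx * d.cx);
    return sum;
}
```

Under these assumptions D3Q7 gives c_s² = 1/4, so c_s = 1/2, while D3Q19 gives c_s² = 1/3, so c_s = 1/sqrt(3), which matches the distinction made in the fix.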


Known issues: - voxelization might not always produce binary identical results in multi-GPU (floating-point round-off on ray-triangle intersection distances may differ for different ray origin)

- C++
Published by ProjectPhysX almost 3 years ago

fluidx3d - FluidX3D v2.3 (particles)

Particle update: - added particles with immersed-boundary method (either passive or 2-way-coupled, only supported with single-GPU) - minor optimization to GPU voxelization algorithm (workgroup threads outside mesh bounding-box return after ray-mesh intersections have been found) - displayed GPU memory allocation size is now fully accurate - fixed bug in write_line() function in src/utilities.hpp - removed .exe file extension for Linux/macOS - refactoring and cosmetics


Known issues: - voxelization might not always produce binary identical results in multi-GPU (floating-point round-off on ray-triangle intersection distances may differ for different ray origin)

- C++
Published by ProjectPhysX almost 3 years ago

fluidx3d - FluidX3D v2.2 (velocity voxelization)

Velocity voxelization update: - simulation of moving/rotating geometry is now possible, here is a demo - added option to voxelize moving/rotating geometry on GPU, with automatic velocity initialization for each grid point based on center of rotation, linear velocity and rotational velocity - cells that are converted from solid->fluid during re-voxelization now have their DDFs properly initialized - added option to not auto-scale mesh during read_stl(...), with negative size parameter - added kernel for solid boundary rendering with marching-cubes


Known issues: - voxelization might not always produce binary identical results in multi-GPU (floating-point round-off on ray-triangle intersection distances may differ for different ray origin)

- C++
Published by ProjectPhysX almost 3 years ago

fluidx3d - FluidX3D v2.1 (fast voxelization)

Fast GPU voxelization update: - new algorithm for .stl mesh GPU voxelization: ~500x faster now, from minutes to milliseconds - added unvoxelize kernel, to quickly remove all boundaries in the mesh bounding box. - removed old hull voxelization algorithm


Old: naive GPU voxelization
- For each voxel in the 3D grid, cast a ray from the voxel center in an arbitrary direction, and check with all mesh triangles for intersection.
- Count the number of intersections.
- Odd number of intersections means the voxel is inside.
- Runtime: N³×Triangles


New: fast GPU voxelization
- Only for the 2D bottom layer of grid points, shoot vertical rays upward and check with all mesh triangles for intersection.
- The vertical rays pass through all voxels in the columns above, so these don't have to be checked for ray-mesh intersection at all.
- Store all intersection distances in a short array in registers.
- Sort this array with insertion sort.
- Iterate through the vertical column of voxels.
- The first voxel is inside/outside depending on odd/even total intersection count.
- Each time one of the stored distances in the sorted array is passed, switch inside/outside state.
- Optimizations
  - Only check inside the bounding box of the mesh.
  - Don't always start from the bottom (z-direction), but from the direction where the mesh bounding box has the smallest cross-section area, so the smallest number of ray-mesh intersections have to be tested.
  - To avoid errors on the odd/even total number of intersections, shoot a second ray in the opposite direction and only count the intersection number. Both have to be odd for the bottom voxel to start in inside state.
- Runtime: N²×Triangles, if N=500, this is 500x faster than naive voxelization
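The per-column part of the fast algorithm can be sketched on the CPU as follows. This is a simplified illustration, not the OpenCL kernel: `voxelize_column` is a hypothetical name, the ray-triangle intersection distances are assumed to be given already, voxel k is assumed to have its center at distance k+0.5 from the ray origin, and the second ray in the opposite direction is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// hits: distances from the ray origin (bottom of the column) to all ray-mesh intersections
// Nz: number of voxels in the column
// returns for each voxel whether it is inside the mesh
inline std::vector<bool> voxelize_column(std::vector<double> hits, std::size_t Nz) {
    // insertion sort: efficient for the short arrays expected here
    for (std::size_t i = 1; i < hits.size(); i++) {
        const double key = hits[i];
        std::size_t j = i;
        while (j > 0u && hits[j - 1u] > key) {
            hits[j] = hits[j - 1u];
            j--;
        }
        hits[j] = key;
    }
    std::vector<bool> inside(Nz, false);
    bool state = (hits.size() % 2u) == 1u; // odd total count: ray origin is inside the mesh
    std::size_t next = 0u; // index of the next intersection that has not been passed yet
    for (std::size_t k = 0; k < Nz; k++) {
        const double z = static_cast<double>(k) + 0.5; // voxel center
        while (next < hits.size() && hits[next] < z) { // passed a surface: switch state
            state = !state;
            next++;
        }
        inside[k] = state;
    }
    return inside;
}
```

Each column needs only one ray-mesh intersection pass, and the loop over the voxels above it is a cheap walk through the sorted distances, which is where the reduction from N³×Triangles to N²×Triangles comes from.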


Known issues: - voxelization might not always produce binary identical results in multi-GPU (floating-point round-off on ray-triangle intersection distances may differ for different ray origin)

- C++
Published by ProjectPhysX almost 3 years ago

fluidx3d - FluidX3D v2.0 (multi-GPU)

Big multi-GPU Update:
- Multi-GPU simulations are now possible on a single node (PC/laptop/server), which allows pooling VRAM from multiple GPUs.
- Easy setup with minimal changes for the user: instead of LBM lbm(Nx, Ny, Nz, nu, ...);, use LBM lbm(Nx, Ny, Nz, Dx, Dy, Dz, nu, ...);, with Dx/Dy/Dz indicating how many domains (GPUs) in each spatial direction to use. By default, all identical GPUs will be automatically assigned their domains, however the GPUs can also be manually set with a list of their indices: ./make.sh 2 6 3 4 or /bin/FluidX3D 2 6 3 4.
- All extensions are supported and validated to produce binary identical results compared to single-GPU simulations.
- Multi-GPU also works with non-identical GPUs, regardless of vendor. Yes, you can run FluidX3D on unholy combinations of Nvidia/AMD/Intel GPUs/CPUs at the same time. I only recommend similar memory capacity and bandwidth, as the weakest GPU will bottleneck performance.
- No SLI/Crossfire/NVLink/InfinityFabric is required. All communication runs over PCIe and is compatible with all hardware.
- No MPI installation is required.
- Total grid resolution must be equally divisible into domains, such that all domains are the same size.
- The resolution of each domain is restricted to 4.29 billion grid points (2³², 225GB VRAM), but domain number and thus total grid resolution is unrestricted.
- Under the hood: Complete re-write of C++ backend, to account for the domain decomposition architecture. The code is already fully optimized and shortened for maximum maintainability/upgradeability.
- Grid resolution can now be arbitrary and is no longer restricted to the condition (Nx*Ny*Nz)%WORKGROUP_SIZE==0.
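The divisibility requirement on the domain decomposition can be expressed as a small check. The sketch below uses hypothetical names (`Size3`, `is_equally_divisible`, `domain_resolution`) and deliberately ignores any extra boundary layers a domain may need for communication with its neighbors.

```cpp
#include <cassert>
#include <cstdint>

struct Size3 { std::uint32_t x, y, z; };

// True if a grid of N cells can be split into D equally sized domains in each direction
inline bool is_equally_divisible(const Size3& N, const Size3& D) {
    return D.x > 0u && D.y > 0u && D.z > 0u
        && N.x % D.x == 0u && N.y % D.y == 0u && N.z % D.z == 0u;
}

// Resolution of a single domain (without any communication layers)
inline Size3 domain_resolution(const Size3& N, const Size3& D) {
    assert(is_equally_divisible(N, D));
    return {N.x / D.x, N.y / D.y, N.z / D.z};
}

// Number of domains, i.e. number of GPUs needed
inline std::uint32_t domain_count(const Size3& D) {
    return D.x * D.y * D.z;
}
```

For example, a 512×256×256 grid with Dx/Dy/Dz = 2/1/1 yields two domains of 256×256×256 cells each, whereas a 500×256×256 grid cannot be split into 3 equal domains along x.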


Known issues: - Raytracing graphics are disabled for multi-GPU. The simulated light rays would have to travel through the entire simulation box, crossing domain boundaries. This is not easily possible, because each GPU only keeps its own domain in VRAM.

- C++
Published by ProjectPhysX almost 3 years ago

fluidx3d - FluidX3D v1.4 (Linux graphics)

  • Big update for Linux users: Added interactive graphics mode on Linux with X11. No external dependencies, compiles out-of-the-box with the "compile on Linux with X11" command in make.sh.
  • Re-wrote C++ graphics library to minimize API dependencies
  • Colors are now signed int consistently.
  • Fixed streamline visualization in 2D.

- C++
Published by ProjectPhysX about 3 years ago

fluidx3d - FluidX3D v1.3

  • added OpenCL driver bug workaround for old AMD GPUs (binary number literals for flag bitmasks don't work, so change to hexadecimal literals)
  • FORCE_FIELD and VOLUME_FORCE can now be used independently
  • added unit conversion functions for torque

- C++
Published by ProjectPhysX about 3 years ago

fluidx3d - FluidX3D v1.2

  • added functions to compute force/torque on objects
  • added function to translate Mesh
  • added more benchmarks in Readme
  • added Stokes drag validation setup

- C++
Published by ProjectPhysX about 3 years ago

fluidx3d - FluidX3D v1.1

  • fixed broken triangle rendering with some Intel iGPUs (driver bug workaround in marching_cubes)
  • added new GPU voxelization
  • added tool to print current camera position (key_H)
  • refactoring

- C++
Published by ProjectPhysX over 3 years ago

fluidx3d - FluidX3D v1.0

- C++
Published by ProjectPhysX over 3 years ago