Recent Releases of NonuniformFFTs
NonuniformFFTs - v0.8.4
Merged pull requests:
- Add AMDGPU benchmarks + update CUDA/CPU ones (#65) (@jipolanco)
- CompatHelper: bump compat for AMDGPU in [weakdeps] to 2, (keep existing compat) (#66) (@github-actions[bot])
- Julia
Published by github-actions[bot] 7 months ago
NonuniformFFTs - v0.8.3
See CHANGELOG.md.
Merged pull requests: - Fix CPU performance issues with complex data (#64) (@jipolanco)
Published by github-actions[bot] 8 months ago
NonuniformFFTs - v0.8.2
See CHANGELOG.md.
Merged pull requests: - Add option for using atomic operations in CPU type-1 (#62) (@jipolanco)
Published by github-actions[bot] 8 months ago
NonuniformFFTs - v0.8.0
Breaking changes
- There are no breaking changes.
Other changes
Added
- Allow user-defined callbacks, which can be used to modify input and/or output transform values "on the fly". This can improve performance and save memory in certain applications, since it makes it possible to combine operations and to avoid allocating extra arrays.
Changed
- Tune performance on AMDGPU. We now default to Direct evaluation (as with CUDA), which can be much faster than FastApproximation. The difference is more visible in type-2 transforms, while type-1 doesn't change that much. Besides, direct evaluation of the (non-default) KaiserBesselKernel has been optimised.
Merged pull requests:
- Tune default parameters on AMDGPU (#59) (@jipolanco)
- Add benchmark suite using AirspeedVelocity.jl (#60) (@jipolanco)
- Allow user-defined callbacks (#61) (@jipolanco)
Published by github-actions[bot] 9 months ago
NonuniformFFTs - v0.7.3
Merged pull requests: - CI: test on aarch64 (#58) (@jipolanco)
Published by github-actions[bot] 10 months ago
NonuniformFFTs - v0.7.2
Closed issues:
- Type-1 conversion with complex-valued inputs : Values don't match after half-point (#56)
- Type-2 conversion with complex-valued inputs : Error increases after Half-Point (#57)
Published by github-actions[bot] 11 months ago
NonuniformFFTs - v0.7.1
Published by github-actions[bot] about 1 year ago
NonuniformFFTs - v0.7.0
See CHANGELOG.md for details.
Merged pull requests: - Avoid creating copy of non-uniform points (#55) (@jipolanco)
Published by github-actions[bot] about 1 year ago
NonuniformFFTs - v0.6.8
- Minor GPU performance improvements.
Merged pull requests:
- Add benchmark data and plots (#49) (@jipolanco)
- Fix displaying images in docs (#50) (@jipolanco)
- Docs: move examples to a separate page (#51) (@jipolanco)
- Improve performance of set_points! on GPU (#53) (@jipolanco)
- Slightly simplify GPU shared-memory spreading (#54) (@jipolanco)
Published by github-actions[bot] about 1 year ago
NonuniformFFTs - v0.6.7
Fixed
- Avoid error when creating high-accuracy GPU plans. This affected plans that cannot be treated using the :shared_memory method (because they require large memory buffers), such as plans with ComplexF64 data associated with a large kernel width (e.g. HalfSupport(8)). Such plans can still be computed using the :global_memory method, but this failed up to now.
Merged pull requests: - CompatHelper: bump compat for Atomix to 1, (keep existing compat) (#48) (@github-actions[bot])
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.6.6
- Improve parallel performance of set_points! with CPU backend. (#47)
Merged pull requests:
- Improve parallel performance of set_points! on CPU (#47) (@jipolanco)
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.6.5
Fixed
- Fix scalar indexing error on latest AMDGPU.jl (v1.1.1). It is unclear whether the error was caused by a recent change in AMDGPU.jl or in GPUArrays.jl.
Merged pull requests: - Bump codecov/codecov-action from 4 to 5 (#46) (@dependabot[bot])
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.6.4
Changed
- Avoid large GPU allocation in type-2 transforms when using the CUDA backend. The allocation was due to CUDA.jl creating a copy of the input in complex-to-real FFTs (see CUDA.jl#2249).
Merged pull requests: - Avoid GPU allocation in CUDA type-2 NUFFTs (#45) (@jipolanco)
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.6.3
Merged pull requests: - CompatHelper: bump compat for StructArrays to 0.7, (keep existing compat) (#44) (@github-actions[bot])
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.6.2
Changed
- Improve performance of atomic operations (affecting type-1 transforms) on AMD GPUs by using @atomic :monotonic.
- Change a few defaults on AMD GPUs to improve performance. This is based on experiments with an AMD MI210, where the new defaults should give better performance. We now default to fast polynomial approximation of kernel functions and to the backwards Kaiser-Bessel kernel (as on the CPU).
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.6.1
Fixed
- Fix type-2 transforms on the GPU when performing multiple transforms at once (ntransforms > 1) and when gpu_method = :shared_memory (which is not currently the default).
Merged pull requests: - Fix type-2 GPU shared memory with ntransforms > 1 (#43) (@jipolanco)
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.6.0
Added
- Add alternative implementation of GPU transforms based on shared-memory arrays. This is disabled by default, and can be enabled by passing gpu_method = :shared_memory when creating a plan (default is :global_memory).
- Add possibility to switch between fast approximation of kernel functions (previously the default and only choice) and direct evaluation (previously not implemented). These correspond to the new kernel_evalmode plan creation option. Possible values are FastApproximation() and Direct(). The default depends on the actual backend. Currently, FastApproximation() is used on CPUs and Direct() on GPUs, where it is sometimes faster.
- The AbstractNFFTs.plan_nfft function is now implemented for full compatibility with the AbstractNFFTs.jl interface.
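As an illustration of the kernel_evalmode option described above, the following sketch builds two CPU plans that differ only in how the kernel is evaluated and compares their type-1 outputs. The problem sizes and the helper function type1 are made up for the example, and the qualified names NonuniformFFTs.FastApproximation and NonuniformFFTs.Direct are used on the assumption that both are reachable from the package module; treat this as a sketch rather than the recommended usage.

```julia
using NonuniformFFTs

N = 32                        # number of Fourier modes (illustrative)
Np = 100                      # number of non-uniform points (illustrative)
xp = rand(Float64, Np) .* 2π  # non-uniform points in [0, 2π)
vp = randn(Float64, Np)       # values at the non-uniform points

# Two plans differing only in the kernel evaluation mode.
plan_fast = PlanNUFFT(Float64, N; kernel_evalmode = NonuniformFFTs.FastApproximation())
plan_direct = PlanNUFFT(Float64, N; kernel_evalmode = NonuniformFFTs.Direct())

# Hypothetical helper: run a type-1 transform and return the Fourier coefficients.
function type1(plan, points, values)
    set_points!(plan, points)
    us = Array{ComplexF64}(undef, size(plan))
    exec_type1!(us, plan, values)
    return us
end

us_fast = type1(plan_fast, xp, vp)
us_direct = type1(plan_direct, xp, vp)

# Both modes evaluate the same kernel, so the outputs should agree
# to roughly the accuracy of the transform itself.
reldiff = maximum(abs, us_fast .- us_direct) / maximum(abs, us_direct)
println("relative difference between evaluation modes: ", reldiff)
```

On a GPU backend the same keyword would presumably be combined with gpu_method; that case is not shown here because it requires a GPU.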
Changed
- BREAKING: Change default precision of transforms. By default, transforms on Float64 or ComplexF64 now have a relative precision of the order of $10^{-7}$. This corresponds to setting m = HalfSupport(4) and oversampling factor σ = 2.0. Previously, the default was m = HalfSupport(8) and σ = 2.0, corresponding to a relative precision of the order of $10^{-14}$.
- BREAKING: The PlanNUFFT constructor can no longer be used to create plans compatible with AbstractNFFTs.jl / NFFT.jl. Instead, a separate (and unexported) NonuniformFFTs.NFFTPlan type is now defined which may be used for this purpose. Alternatively, one can now use the AbstractNFFTs.plan_nfft function.
- On GPUs, we now default to direct evaluation of kernel functions (e.g. Kaiser-Bessel) instead of polynomial approximations, as this seems to be faster and uses far fewer GPU registers.
- On CUDA and AMDGPU, the default kernel is now KaiserBesselKernel instead of BackwardsKaiserBesselKernel. The direct evaluation of the KB kernel (based on Bessel functions) seems to be a bit faster than backwards KB, both on CUDA and AMDGPU. Accuracy doesn't change much since both kernels have similar precisions.
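Code that relied on the previous default accuracy can presumably recover it by passing the old parameters explicitly. The sketch below creates one plan with the new defaults and one with m = HalfSupport(8) and σ = 2.0, then compares their type-1 outputs on the same random data; the sizes and variable names are illustrative only.

```julia
using NonuniformFFTs

N = 64                        # number of Fourier modes (illustrative)
Np = 200                      # number of non-uniform points (illustrative)
xp = rand(Float64, Np) .* 2π  # non-uniform points in [0, 2π)
vp = randn(Float64, Np)       # values at the non-uniform points

# New defaults (v0.6.0): HalfSupport(4) and σ = 2.0.
plan_default = PlanNUFFT(Float64, N)

# Previous defaults, now requested explicitly.
plan_accurate = PlanNUFFT(Float64, N; m = HalfSupport(8), σ = 2.0)

set_points!(plan_default, xp)
set_points!(plan_accurate, xp)

us_default = Array{ComplexF64}(undef, size(plan_default))
us_accurate = Array{ComplexF64}(undef, size(plan_accurate))
exec_type1!(us_default, plan_default, vp)
exec_type1!(us_accurate, plan_accurate, vp)

# Taking the high-accuracy result as reference, this estimates the
# relative error of the default plan.
relerr = maximum(abs, us_default .- us_accurate) / maximum(abs, us_accurate)
println("relative error of default plan: ", relerr)
```

The larger kernel support costs more work per non-uniform point, so it is worth keeping the new default unless the extra digits are actually needed.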
Merged pull requests:
- CompatHelper: bump compat for GPUArraysCore to 0.2, (keep existing compat) (#36) (@github-actions[bot])
- Add shared-memory GPU implementations of spreading and interpolation (#37) (@jipolanco)
- Change default accuracy of transforms (#38) (@jipolanco)
- Use direct evaluation of kernel functions on GPU (#39) (@jipolanco)
- Allow choosing the kernel evaluation method (#40) (@jipolanco)
- Automatically determine batch size in shared-memory GPU transforms (#41) (@jipolanco)
- Define AbstractNFFTs.plan_nfft and create separate plan type (#42) (@jipolanco)
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.5.6
Merged pull requests:
- Simplify main GPU kernels using Adapt (#34) (@jipolanco)
- Minor optimisations to GPU kernels (#35) (@jipolanco)
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.5.5
Merged pull requests: - Make things work on AMD GPUs (#33) (@jipolanco)
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.5.4
Merged pull requests: - GPU: disable explicit synchronisation barriers by default (#32) (@jipolanco)
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.5.3
Merged pull requests: - Faster point sorting + performance tuning (#31) (@jipolanco)
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.5.2
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.5.1
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.5.0
Merged pull requests:
- Initial GPU implementation (#29) (@jipolanco)
- GPU: implement spatial sorting of non-uniform points (#30) (@jipolanco)
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.4.1
Merged pull requests: - Fix 1D case in set_points! (#28) (@tknopp)
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.4.0
Merged pull requests:
- Docs: fix comparisons with NFFT.jl (#25) (@jipolanco)
- Implement AbstractNFFTs interface (#27) (@jipolanco)
Closed issues:
- Comparisons with NFFT.jl (#24)
- Implement AbstractNFFT.jl (#26)
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.3.18
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.3.17
Merged pull requests:
- Bump julia-actions/setup-julia from 1 to 2 (#22) (@dependabot[bot])
- CompatHelper: bump compat for Bumper to 0.7, (keep existing compat) (#23) (@github-actions[bot])
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.3.16
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.3.15
Merged pull requests: - Fix roundoff errors when a point is near π (#21) (@jipolanco)
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.3.14
Published by github-actions[bot] over 1 year ago
NonuniformFFTs - v0.3.13
Merged pull requests:
- Optimise spreading and interpolation operations (#18) (@jipolanco)
- Bump julia-actions/cache from 1 to 2 (#19) (@dependabot[bot])
Published by github-actions[bot] almost 2 years ago
NonuniformFFTs - v0.3.11
- Fix issue related to round-off error when a non-uniform point is very close to $2\pi$ (such as x = prevfloat(2pi)).
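A small check of the edge case mentioned above can be sketched as follows. It places a single non-uniform point at prevfloat(2pi) with unit weight and runs a type-1 transform. Assuming the usual unnormalised type-1 convention (each output coefficient is a sum of weights times a complex exponential of the point position), every coefficient should be close to 1, because the point is indistinguishable from 0 up to periodicity. The grid size is arbitrary.

```julia
using NonuniformFFTs

N = 16               # number of Fourier modes (illustrative)
x = prevfloat(2pi)   # largest Float64 strictly below 2π
@assert x < 2pi

plan = PlanNUFFT(Float64, N)
set_points!(plan, [x])  # a single non-uniform point next to the domain boundary

us = Array{ComplexF64}(undef, size(plan))
exec_type1!(us, plan, [1.0])  # unit weight at that point

# With the point at (almost) 2π ≡ 0, each coefficient should be ≈ 1,
# up to the accuracy of the transform.
maxdev = maximum(abs, us .- 1)
println("maximum deviation from 1: ", maxdev)
```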
Published by github-actions[bot] almost 2 years ago
NonuniformFFTs - v0.3.10
Published by github-actions[bot] almost 2 years ago
NonuniformFFTs - v0.3.9
Merged pull requests:
- Bump julia-actions/setup-julia from 1 to 2 (#12) (@dependabot[bot])
- Bump actions/checkout from 3 to 4 (#13) (@dependabot[bot])
- Bump codecov/codecov-action from 3 to 4 (#14) (@dependabot[bot])
- Remove constant factors from kernel definitions (#15) (@jipolanco)
- Parallelise deconvolutions (#16) (@jipolanco)
Published by github-actions[bot] almost 2 years ago
NonuniformFFTs - v0.3.8
Merged pull requests: - Precompile common workloads with PrecompileTools (#11) (@jipolanco)
Published by github-actions[bot] almost 2 years ago
NonuniformFFTs - v0.3.7
Merged pull requests:
- Workaround issue detected by JET.@test_call (#10) (@jipolanco)
Published by github-actions[bot] almost 2 years ago
NonuniformFFTs - v0.3.6
Merged pull requests: - Change parallel sortperm implementation (#9) (@jipolanco)
Published by github-actions[bot] almost 2 years ago
NonuniformFFTs - v0.3.5
Merged pull requests: - Reduce compilation time when creating plans (#8) (@jipolanco)
Published by github-actions[bot] almost 2 years ago
NonuniformFFTs - v0.3.4
Merged pull requests: - Allow manually setting spreading kernel parameters (#7) (@jipolanco)
Published by github-actions[bot] about 2 years ago
NonuniformFFTs - v0.3.3
Merged pull requests: - Fix single-precision transforms (#6) (@jipolanco)
Published by github-actions[bot] about 2 years ago
NonuniformFFTs - v0.3.2
Merged pull requests: - Add option to work with sorted points (#5) (@jipolanco)
Published by github-actions[bot] about 2 years ago
NonuniformFFTs - v0.3.1
Merged pull requests: - Improve performance of multidimensional transforms (#4) (@jipolanco)
Published by github-actions[bot] about 2 years ago
NonuniformFFTs - v0.3.0
Merged pull requests: - Allow performing multiple transforms at once (#3) (@jipolanco)
Published by github-actions[bot] about 2 years ago
NonuniformFFTs - v0.2.1
Merged pull requests: - Add parallelism using threads (#1) (@jipolanco)
Closed issues: - TagBot trigger issue (#2)
Published by github-actions[bot] about 2 years ago