Recent Releases of Metal
Metal - v1.7.0
Metal v1.7.0
Merged pull requests:
- Add function to retrieve # of gpu cores in system (#626) (@christiangnrd)
- Support KA unified memory (#630) (@christiangnrd)
- Add GPUToolbox 0.3 compat (#639) (@christiangnrd)
- Return the old value from atomic_fetch_op_explicit. (#640) (@maleadt)
- Julia
Published by github-actions[bot] 7 months ago
Metal - v1.6.4
Metal v1.6.4
Merged pull requests:
- typo in MPSMatrixMultiplication comment (#622) (@jandrej)
- Remove unnecessary OS signposts (#623) (@christiangnrd)
- Accept alternate filename as optional argument (#629) (@christiangnrd)
- Support Float32 threadgroup atomics by bitcasting. (#636) (@maleadt)
Closed issues:
- @signpost_events make the code awfully slow (#621)
Published by github-actions[bot] 7 months ago
Metal - v1.6.3
Metal v1.6.3
Merged pull requests:
- More accumulation and reduction benchmarks (#614) (@christiangnrd)
- Remove the unnecessary reshape during mapreduce (#615) (@christiangnrd)
- Synchronize resources before cpu access of ManagedStorage resource (#617) (@christiangnrd)
- Fix linalg tests for MPS and MPSGraph (#618) (@christiangnrd)
- Don't warn on macOS 26 and bump version (#620) (@christiangnrd)
Published by github-actions[bot] 7 months ago
Metal - v1.6.2
Metal v1.6.2
Merged pull requests:
- Handle broadcasting when storage types are different (#605) (@limarta)
- Add JLD2 to test env (#606) (@christiangnrd)
- Tahoe versions (#607) (@christiangnrd)
- Add MemoryFlagDevice to KA.jl's synchronization primitive. (#609) (@maleadt)
- Update wrappers (#610) (@christiangnrd)
- Bump version (#611) (@christiangnrd)
Closed issues:
- KA.@synchronize -- threadgroup_barrier semantics (#608)
Published by github-actions[bot] 8 months ago
Metal - v1.6.1
Metal v1.6.1
Merged pull requests:
- Adding definition for KA.functional (#598) (@astrozot)
- Update requirements (#599) (@christiangnrd)
- Fix findall with empty MtlArray of Bool (#601) (@christiangnrd)
- [NFC] Typo (#602) (@christiangnrd)
- Add bare minimum for macOS 26 Tahoe (#604) (@christiangnrd)
Closed issues:
- Warnings when precompiling Metal with Julia 1.12 (#594)
Published by github-actions[bot] 9 months ago
Metal - v1.6.0
Metal v1.6.0
- Metal, MPS, and MPSGraph frameworks’ enums and objects are now automatically wrapped with Clang.jl
- Initial MPSGraph support. Currently used to replace MPS matrix multiplication on configurations where the previous method could fail (#381)
- Many more improvements and bug fixes
Merged pull requests:
- Add error checking to command buffer completion handler (#521) (@vovw)
- Enable add_functions! test under shader validation (#522) (@christiangnrd)
- Update wrapping readme (#525) (@christiangnrd)
- Automatically wrap Metal and MPS headers (feat. Properties) (#526) (@christiangnrd)
- Objective-C Availability support (#527) (@christiangnrd)
- Small followup to #526 (#528) (@christiangnrd)
- Add nextafter intrinsic (#529) (@christiangnrd)
- Improvements to float intrinsics (#531) (@christiangnrd)
- Fix Float16 sincos intrinsic (#533) (@christiangnrd)
- Fix format suggestion formatting when diff contains "``" (#534) (@christiangnrd)
- Fix rewriter for version-gated expressions (#535) (@christiangnrd)
- Move BFloat16 code out of extension (#536) (@christiangnrd)
- Version-related fixes to MPSNDArray (#537) (@christiangnrd)
- Use GPUToolbox.jl (#538) (@christiangnrd)
- [tests] Assume compatible system if xcode not installed (#539) (@christiangnrd)
- Add .git-blame-ignore-revs and a few other fixes (#541) (@christiangnrd)
- Integer & atomic Intrinsics improvements (#544) (@christiangnrd)
- [NFC] Indentation consistency (#545) (@christiangnrd)
- Update .git-blame-ignore-revs (#546) (@christiangnrd)
- Link to issue (#548) (@christiangnrd)
- Test both simd shuffle intrinsics. (#553) (@christiangnrd)
- Enabling previously disabled test and mark broken (#554) (@christiangnrd)
- Support pow with Int exponent (#557) (@christiangnrd)
- Fixes and more tests for unsafe_wrap (#558) (@christiangnrd)
- Pi and e to Float32 and Float16 (#559) (@christiangnrd)
- Update GPUToolbox compat (#560) (@christiangnrd)
- Split up intrinsics tests (#561) (@christiangnrd)
- Remove reference to 'dir' in profiling docs (#563) (@christiangnrd)
- Ensure synchronization before unsafewrapping of shared gpu array (#564) (@christiangnrd)
- [NFC] Move linear algebra wrappers out of MPS lib (#565) (@christiangnrd)
- Initial MPSGraph support (#566) (@christiangnrd)
- Remove copy when possible from cpu rnad using GPU RNG (#568) (@christiangnrd)
- Code coverage and misc fixes (#569) (@christiangnrd)
- Remove obsolete MtlLargerDeviceArray (#574) (@christiangnrd)
- Silence warning on 1.12+ (#575) (@christiangnrd)
- 15.4 SDK changes (#579) (@christiangnrd)
- Faster matmul sometimes (#580) (@christiangnrd)
- Fix erf and a few other improvements (#582) (@christiangnrd)
- Use an appropriate amount of threads in unified memory example (#583) (@christiangnrd)
- Clean up test imports (#584) (@christiangnrd)
- Fix some type ambiguities (#585) (@christiangnrd)
- Fix findall output type (#587) (@christiangnrd)
- Fix tests for macOS 13 (#591) (@christiangnrd)
- Minor findall and accumulate tests improvements (#592) (@christiangnrd)
- Bump version (#595) (@christiangnrd)
Closed issues:
- API Validation failures (#467)
- Handle MTLCommandBuffer Error Logs (#510)
- sincos intrinsic fails to compile with Float16 (#530)
- Can't compare Float32 with pi on Metal (#551)
- ^(::Float32, ::Integer) uses double precision (#552)
- Code coverage broken (#556)
- Remove MtlLargerDeviceArray? (#573)
- Installation of Metal does not work when DiffEqGPU is also installed. (#577)
- findall output always uses default storage mode instead of matching input storage mode (#578)
- Qualify definitions of BroadcastStyle (#586)
- New Dynamic Dispatch when using ColorTypes (#588)
- Release new version of Metal.jl (#593)
Published by github-actions[bot] 9 months ago
Metal - v1.5.1
Metal v1.5.1
Merged pull requests:
- Adapt to minver in ObjectiveC.jl (#513) (@christiangnrd)
- Add Runic action to suggest formatting changes. (#517) (@maleadt)
- Increase polling interval for benchmark action (#518) (@christiangnrd)
- Adapt to GPUArrays.jl changes. (#519) (@maleadt)
Closed issues:
- Benchmark CI failures due to too many requests (#516)
Published by github-actions[bot] about 1 year ago
Metal - v1.5.0
Metal v1.5.0
Metal.jl 1.5 is a relatively minor release, with the most important change happening behind the scenes: GPUArrays.jl v11 has switched to KernelAbstractions.jl (#461).
There is also one (technically) breaking change: code_agx and @device_code_agx have been removed (#512) because of the heavy Python dependency, and conflicts with PythonCall.jl. This functionality did not support recent M GPUs anyway, so it is unlikely to affect many users.
Features
- Improve performance of shared storage copies: #445
- Add an is_m4 function: #498 - #499
Bug fixes
- Fix fill: #496
Merged pull requests:
- Add more tests to api validation testing (#447) (@christiangnrd)
- Adapt to GPUArrays.jl transition to KernelAbstractions.jl. (#461) (@maleadt)
- Switch CI to 1.11. (#462) (@maleadt)
- Remove old code and test cleanup (#464) (@christiangnrd)
- Adapt to JuliaGPU/GPUArrays.jl#567. (#475) (@maleadt)
- Bump LLVM downgrader (#479) (@maleadt)
- Store more debug files when encountering compilation errors. (#482) (@maleadt)
- Use OncePerProcess in 1.12+ (#483) (@christiangnrd)
- Don't run benchmarks from fork (#485) (@christiangnrd)
- Still run GH Action when merged (#486) (@christiangnrd)
- Bump IR downgrader (#489) (@maleadt)
- Move MTL tests and add a few (#491) (@christiangnrd)
- Generate MTL and MPS structs and enums with Clang.jl (#492) (@christiangnrd)
- Fix copy tests (#493) (@christiangnrd)
- Simplify benchmark runner and pipelines (#494) (@maleadt)
- Fix global linear indexing (fill!) (#496) (@christiangnrd)
- Couple typos and is_m4 function (#498) (@christiangnrd)
- Initial support for MPSNDArray (#499) (@christiangnrd)
- Tweak benchmark CI job (#501) (@maleadt)
- Fix MPSNDArrayDescriptor wrapper (#502) (@christiangnrd)
- Metal library parsing: using CodecBzip2 feature to ignore padding. (#504) (@maleadt)
- Followup to #492: Enable C function wrapping (#505) (@christiangnrd)
- Rerun random tests with chance of false negative once. (#506) (@christiangnrd)
- Bump LLVM downgrader (#507) (@maleadt)
- Test loading of package on unsupported platforms (#509) (@christiangnrd)
- Remove device_code_agx (#512) (@christiangnrd)
- Fix typo in random tests (#514) (@christiangnrd)
- Fix Documenter failures (#515) (@christiangnrd)
Closed issues:
- KernelAbstractions: add Atomix back-end (#218)
- @device_code_agx errors when Metal Shader Validation is enabled (#463)
- fill broken after KA integration (#466)
- Compilation to native code failed: NSError: Undefined symbols (#480)
- ObjectiveC.Foundation.NSErrorInstance(ObjectiveC.id{ObjectiveC.Foundation.NSError}(0x000000014cb8bd90)) (#487)
- phi-related IR downgrade issue (#488)
- Circular dependency when precompiling (#495)
- Bad interaction between PyCall and Metal (#500)
- Add github actions CI for linux, windows and non-functional macOS to ensure that precompilation and loading works (#508)
Published by github-actions[bot] about 1 year ago
Metal - v1.4.2
Metal v1.4.2
Merged pull requests:
- Fix loading on unsupported platforms (#459) (@christiangnrd)
Closed issues:
- Relax package requirements (#22)
- [windows:] Metal does not precompile anymore when installation not functional (#457)
- [MacOS:] Metal.functional() wrongly returns true despite no GPUs available (#458)
Published by github-actions[bot] over 1 year ago
Metal - v1.4.1
Metal v1.4.1
Merged pull requests:
- Update Readme (#444) (@christiangnrd)
- Use CPU copy with SharedStorage (#445) (@christiangnrd)
- Disable nightly CI and fix invalid Metal API usage (#448) (@christiangnrd)
- Don't report benchmarks on main branch commits (#450) (@christiangnrd)
- Fix #451 and a couple other fixes (#452) (@christiangnrd)
- Only load BFloat16s extension on Apple systems (#454) (@christiangnrd)
- CompatHelper: bump compat for GPUCompiler to 1, (keep existing compat) (#455) (@github-actions[bot])
Closed issues:
- Don't run benchmarks on the master branch? (#449)
- unsafe_wrap(Array, ...) of a view does not preserve offset information (#451)
- Metal does not load any more without error when installation not functional (#453)
Published by github-actions[bot] over 1 year ago
Metal - v1.4.0
Metal v1.4.0
Merged pull requests:
- Use unified memory for scalar indexing of permutation matrices (#313) (@tgymnich)
- Add MPSMatrixRandom (#321) (@christiangnrd)
- [.gitignore] Also ignore versioned Manifests (#410) (@christiangnrd)
- Remove broken link in Docs (#413) (@christiangnrd)
- Remove unused [extras] section in Project.toml (#415) (@christiangnrd)
- Small fix and typos (#417) (@christiangnrd)
- Add Benchmarking CI (#420) (@christiangnrd)
- [NFC] Fix warning in topk docstrings (#421) (@christiangnrd)
- Allow initialisation of MTLSize with tuples of different integer types (#425) (@tgymnich)
- Add CI for macOS 15 (#426) (@christiangnrd)
- Simplify versioninfo() and report more packages. (#429) (@maleadt)
- Allow controlling compilation target versions. (#430) (@maleadt)
- Add a missing memory fence to a SIMD test. (#432) (@maleadt)
- Fix MPS.synchronize_state (#434) (@christiangnrd)
- Make lu results have same storage mode as input (#435) (@christiangnrd)
- Fix benchmarking CI and benchmark Shared and Private storage modes (#437) (@christiangnrd)
- NFC tweak to MPSMatrixCopy tests (#439) (@christiangnrd)
- Get more descriptive errors from flaky test (#440) (@christiangnrd)
Closed issues:
- Port the opportunistic synchronization from CUDA.jl (#317)
- Control flow-related miscompilation: (#401)
- More sporadic 1.11 hangs (#412)
- Support for LinearAlgebra.kron (#422)
- Can't use gemm! methods with Metal (#423)
- Error for thread/group size with different integer types (#424)
- README example broken (#427)
- Intermittent loadstoretg test failure (#428)
Published by github-actions[bot] over 1 year ago
Metal - v1.3.0
Metal v1.3.0
Merged pull requests:
- Fix typo in docs (#384) (@christiangnrd)
- Bump minimal Julia requirement to v1.10. (#385) (@maleadt)
- Remove Requires dependency (#386) (@christiangnrd)
- Reflection: Figure out kernel names by looking at metallib section. (#390) (@maleadt)
- Add tests for broadcasting minimum and maximum (#391) (@tgymnich)
- Don't export MTL (#392) (@christiangnrd)
- Add erfinv (#394) (@tgymnich)
- Add expm1 (#395) (@tgymnich)
- Cleanup some imports (#398) (@christiangnrd)
- Remove type-pirated function (#399) (@christiangnrd)
- Unexport some high-level MPS functionality from MPS (#400) (@christiangnrd)
- Adapt to new REPL precompile changes (JuliaLang/julia#55210) (#403) (@christiangnrd)
- Bump GPUCompiler. (#404) (@maleadt)
- Bump LLVM compat (#407) (@maleadt)
- Make 1.11 CI success mandatory. (#408) (@maleadt)
Closed issues:
- Audit exports/public symbols (#359)
- Compilation failure on 1.11 (#370)
- MTLBinaryArchive (#387)
- Metal.code_agx() failing in MacOS 15 Beta 3 (#388)
- Test for min / max broadcasting issue (#389)
- Type piracy (#396)
- Potentially unused code in gpuarrays.jl (#397)
- Shared vs SharedStorage in examples/unified_memory (#405)
- Unsuported call to an unknown function when calling Distributions (#406)
Published by github-actions[bot] over 1 year ago
Metal - v1.2.0
Metal v1.2.0
Merged pull requests:
- Avoid constructing MulAddMuls on Julia v1.12+ (#295) (@dkarrasch)
- Trigger the runtime profiler when a test times out. (#330) (@maleadt)
- Add MPSMatrixSoftMax (#333) (@christiangnrd)
- Reorganize and add some MPS tests (#335) (@christiangnrd)
- Typo fix (#336) (#337) (@101001000)
- Add error message for running Metal.jl under Rosetta (#339) (@tgymnich)
- Add MPSCommandBuffer (#340) (@christiangnrd)
- Bump julia-actions/setup-julia from 1 to 2 (#341) (@dependabot[bot])
- Revert error message for Rosetta (#342) (@tgymnich)
- Update to ObjectiveC.jl v3. (#343) (@maleadt)
- Add autoreleasepools to MPS interface methods. (#344) (@maleadt)
- Don't redundantly return the cmdbuf from commit methods. (#345) (@maleadt)
- Whitespace fixes (#346) (@christiangnrd)
- CompatHelper: bump compat for LLVM to 7, (keep existing compat) (#347) (@github-actions[bot])
- CompatHelper: add new compat entry for SpecialFunctions in [weakdeps] at version 2, (keep existing compat) (#352) (@github-actions[bot])
- [NFC] Fix indentation (#353) (@christiangnrd)
- Bump LLVM downgrader (#354) (@maleadt)
- Don't export non-existent contents (#356) (@christiangnrd)
- Remove/fix unused exports (#357) (@christiangnrd)
- Unexport SimpleVersion and AS (#360) (@christiangnrd)
- Add support for opaque pointers (#361) (@maleadt)
- Docstrings (#362) (@christiangnrd)
- Initial MacOS 15 support (#365) (@christiangnrd)
- Replace current_device() with device() (#366) (@christiangnrd)
- Support reading metallib v1.2.8 files from macOS 15. (#367) (@maleadt)
- Add metallib (dis)assembly helper scripts. (#368) (@maleadt)
- Simplify testing of examples. (#369) (@maleadt)
- Temporarily allow 1.11 to fail. (#371) (@maleadt)
- CompatHelper: add new compat entry for PrecompileTools at version 1, (keep existing compat) (#372) (@github-actions[bot])
- Define complex sqrt (#374) (@mtfishman)
- Check the macOS version during initialization. (#375) (@maleadt)
- CompatHelper: bump compat for LLVM to 8, (keep existing compat) (#376) (@github-actions[bot])
- Add accumulate implementation (#377) (@chengchingwen)
- fix derived device array (#378) (@chengchingwen)
- avoid ReshapedArray using Int128 in metal kernel (#379) (@chengchingwen)
- improve type stability of derived array (#380) (@chengchingwen)
- add findall implementation (#382) (@zhenwu0728)
- Bump version (#383) (@christiangnrd)
Closed issues:
- Tests sporadically timing out on 1.11 (#329)
- ReshapedArray indexing broken because of Int128 operation (#332)
- KernelAbstractions copyto! typo (#336)
- Segmentation Faults (#338)
- Port accmulate! and findall from CUDA.jl (#348)
- Tests failing with GPUCompiler v0.26.5 and LLVM v7.1 (#350)
- downgrades LLVM (#355)
- sqrt(::Complex) unsupported due to conversion exceptions (#364)
Published by github-actions[bot] over 1 year ago
Metal - v1.1.0
Metal v1.1.0
Merged pull requests:
- Add resize! (#279) (@mtfishman)
- Initial MTLTexture support (#280) (@christiangnrd)
- Avoid redundant pointer conversions for threadgroup memory. (#283) (@maleadt)
- Re-implement metallib generation in Julia. (#284) (@maleadt)
- CompatHelper: add new compat entry for SHA at version 0.7, (keep existing compat) (#286) (@github-actions[bot])
- Support more of the metallib format (#288) (@maleadt)
- Address potentiallly buggy mtl behaviour. (#290) (@christiangnrd)
- CompatHelper: add new compat entry for CodecBzip2 at version 0.8, (keep existing compat) (#292) (@github-actions[bot])
- Remove an unneeded pointer method. (#293) (@maleadt)
- Use NSAutoreleasePool to clean up memory. (#294) (@maleadt)
- adapt_storage-related improvements (#296) (@christiangnrd)
- CompatHelper: bump compat for ObjectiveC to 2, (keep existing compat) (#297) (@github-actions[bot])
- Add support for signposts (#300) (@maleadt)
- Retain NSError we rethrow to avoid an UAF. (#302) (@maleadt)
- Minor mapreduce improvements (#303) (@maleadt)
- Specialize broadcast to avoid integer divisions. (#304) (@maleadt)
- Better Support for Unified Memory (#305) (@tgymnich)
- Add 1.11 CI (#306) (@christiangnrd)
- Remove unused files (#307) (@tgymnich)
- Skip profiling tests on macOS 14.4/M1. (#310) (@maleadt)
- Increase test timeout limit to accomodate 1.8 (#311) (@christiangnrd)
- Test all storage modes (#314) (@christiangnrd)
- Fix doctests (#315) (@christiangnrd)
- Fix KernelAbstractions for Unified Memory (#316) (@tgymnich)
- CompatHelper: add new compat entry for Preferences at version 1, (keep existing compat) (#318) (@github-actions[bot])
- Minor cleanup (#319) (@christiangnrd)
- Create MtlArray using memory allocated by Array (#320) (@christiangnrd)
- Re-enable profiling tests on M1/14.4 when using Xcode 15.3. (#322) (@maleadt)
- Small typo and doc fixup (#325) (@christiangnrd)
- BFloat16s.jl extension and related improvements (#326) (@christiangnrd)
- Support for Julia 1.11 (#327) (@maleadt)
Closed issues:
- Validation-related back-end crash on macOS Ventura (#34)
- slow broadcast copy in 2D (#41)
- Poor performance of mapreduce (#46)
- Multiplication with SubArrays (#47)
- Add support to creating MtlArray using a memory allocated by Array (#62)
- Improve use of unified memory (#86)
- Use Autoreleasepools with Metal (#103)
- Unknown RFLT tag generated by macOS 13 Metal compiler (#167)
- mapreduce allocates a lot on the CPU (#211)
- Legalization errors with vectorized code (#257)
- Compilation Failure due to undefined symbols (#276)
- resize!, append! not defined (#277)
- tag new version (#278)
- Panic during profiling tests on 14.4 beta (#281)
- M3 backend cannot handle atomics with complicated pointer conversions (#282)
- Int128 does not compile (#287)
- Two suspicious mtl-related behaviours (#289)
- LU factorization: add allowsingular keyword argument (#299)
- Autorelease changes lead to use after free with errors (#301)
- Reductions don't work on Shared Arrays (#312)
Published by github-actions[bot] almost 2 years ago
Metal - v1.0.0
Metal v1.0.0
Merged pull requests:
- Matrix batches (#158) (@tgymnich)
- Add 1.10 CI. (#256) (@maleadt)
- Update manifest (#258) (@github-actions[bot])
- CompatHelper: bump compat for GPUCompiler to 0.25, (keep existing compat) (#259) (@github-actions[bot])
- Bump actions/checkout from 3 to 4 (#260) (@dependabot[bot])
- Update manifest (#261) (@github-actions[bot])
- CompatHelper: bump compat for CEnum to 0.5, (keep existing compat) (#262) (@github-actions[bot])
- Update manifest (#263) (@github-actions[bot])
- CompatHelper: add new compat entry for Artifacts at version 1, (keep existing compat) (#264) (@github-actions[bot])
- Reduce launch overhead by generating code to encode arguments. (#265) (@maleadt)
- Remove unused function argument (#266) (@tgymnich)
- Introduce application tracing profiler (#267) (@maleadt)
- Remove content(::MTLBuffer), use convert intead. (#268) (@maleadt)
- Allow more kwargs syntax with kernel launches (#269) (@maleadt)
- Don't re-use the IO object when shelling out to Python. (#271) (@maleadt)
- Preserve storage mode when broadcasting. (#273) (@maleadt)
Closed issues:
- Support for macOS Sonoma (#201)
- Error with Julia 1.10 (#274)
Published by github-actions[bot] about 2 years ago
Metal - v0.5.1
Metal v0.5.1
Merged pull requests:
- MPSMatrix improvements (#157) (@tgymnich)
- Update manifest (#221) (@github-actions[bot])
- Update manifest (#222) (@github-actions[bot])
- Update manifest (#224) (@github-actions[bot])
- Update manifest (#227) (@github-actions[bot])
- CompatHelper: bump compat for ObjectiveC to 1, (keep existing compat) (#228) (@github-actions[bot])
- Update manifest (#230) (@github-actions[bot])
- Fix argument types in sincos (#232) (@fjebaker)
- Update manifest (#233) (@github-actions[bot])
- Improve docs (#235) (@christiangnrd)
- Remove linear algebra section of MPS docs (#237) (@christiangnrd)
- CompatHelper: bump compat for GPUCompiler to 0.22, (keep existing compat) (#238) (@github-actions[bot])
- Port openlibm log1pf as log1p (#239) (@sotlampr)
- Port openlibm erf (#240) (@tgymnich)
- Remove 1.6-era override mechanism. (#241) (@maleadt)
- CompatHelper: add new compat entry for Requires at version 1, (keep existing compat) (#242) (@github-actions[bot])
- Update manifest (#243) (@github-actions[bot])
- enable dependabot for GitHub actions (#244) (@ranocha)
- Bump actions/checkout from 2 to 3 (#245) (@dependabot[bot])
- Bump peter-evans/create-pull-request from 3 to 5 (#246) (@dependabot[bot])
- Show METAL_CAPTURE_ENABLED in Metal.versioninfo() when the environment variable is set (#248) (@christiangnrd)
- Update manifest (#249) (@github-actions[bot])
- Adapt to GPUCompiler.jl, and other small updates. (#250) (@maleadt)
- Switch to GPUArrays buffer management. (#251) (@maleadt)
- Update manifest (#252) (@github-actions[bot])
- Update manifest (#253) (@github-actions[bot])
- Bump GPUCompiler (#255) (@maleadt)
Closed issues:
- Random access indexing into MtlArray views cause scalar indexing (#149)
- Q: How to debug kernels - KA.@print? (#223)
- Crash during MTLDispatchListApply (#225)
- Unable to compile trig functions through ForwardDiff (#229)
- symbol multiply defined! Bug/crash on Julia master, fine on 1.10 (#231)
- log1p fails on MtlArray{Float32} (#234)
- When precompiling, UndefVarError: CompilerConfig not defined (#247)
Published by github-actions[bot] over 2 years ago
Metal - v0.5.0
Metal v0.5.0
Metal.jl 0.5 is a feature release, bringing initial support for atomic operations (#168).
Low-level atomics that mimic Metal C are supported (atomic_store_explicit,
atomic_load_explicit, etc), as well as a higher-level Metal.@atomic that can be used to
update array values similar to how CUDA.jl's @atomic works. This uses native atomics when
supported, and falls back to a compare-exchange loop otherwise.
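The compare-exchange fallback mentioned above can be illustrated on the CPU. The following is only a sketch using Julia's `Base.Threads` atomics, not Metal.jl's device implementation; the helper name `fetch_op!` is hypothetical and chosen for this example. It applies an arbitrary binary operation atomically by retrying a compare-and-swap until no other thread has modified the value in between, and returns the previous value.

```julia
# CPU-side sketch of a compare-exchange loop; this is NOT Metal.jl's device code.
# `fetch_op!` is a hypothetical helper name used only for this illustration.
using Base.Threads: Atomic, atomic_cas!

function fetch_op!(op, cell::Atomic{T}, val::T) where {T}
    old = cell[]                            # read the current value
    while true
        new = convert(T, op(old, val))      # value we would like to store
        seen = atomic_cas!(cell, old, new)  # returns the value found in `cell`
        seen === old && return old          # swap succeeded: return the previous value
        old = seen                          # another thread changed it first; retry
    end
end

# Multiplication has no native atomic instruction here, so the loop emulates it.
cell = Atomic{Float32}(1.0f0)
previous = fetch_op!(*, cell, 3.0f0)
```

The loop compares with `===` so that floating-point values are matched bit for bit, which is also how the underlying compare-and-swap decides whether the exchange took place.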
Minor changes include an update for the @device_code_agx disassembler, the addition of a
type variable to MtlArray encoding the storage mode (#194), and support for MPSVector
(#199) which should accelerate matrix/vector multiplications.
Also note that Metal.jl now disallows the construction of Float64 arrays, as these are not supported by the Metal libraries.
Closed issues:
- Support for atomics (#79)
- Make MtlArray storage mode a type parameter (#190)
- Long stacktrace when trying to create Float64 rand arrays (#205)
- allowscalar equivalent for Metal.jl (#206)
- Define map! ? (#219)
Merged pull requests:
- Implement atomics using compiler intrinsics (#168) (@maleadt)
- Parameterize MtlArray storage mode (#194) (@christiangnrd)
- Implement MPSVector (#199) (@tgymnich)
- Update manifest (#200) (@github-actions[bot])
- Add Metal 3.1 to MTLLanguageVersion (#202) (@christiangnrd)
- Update manifest (#203) (@github-actions[bot])
- CompatHelper: bump compat for GPUCompiler to 0.21, (keep existing compat) (#204) (@github-actions[bot])
- Update manifest (#207) (@github-actions[bot])
- Disallow Float64 arrays entirely. (#209) (@maleadt)
- Adapt to LLVM.jl 6. (#213) (@maleadt)
- Update manifest (#215) (@github-actions[bot])
- Bump disassembler. (#216) (@maleadt)
Published by github-actions[bot] over 2 years ago
Metal - v0.4.1
Metal v0.4.1
Closed issues:
- Command buffer callbacks can cause bus error during thread adoption (#138)
- how to set up Project.toml (#185)
- Metal.rand() creates a CPU array (#187)
- fill! for Int8 errors when the value is negative (#192)
Merged pull requests:
- Refactor matmatmul code for faster load time (#186) (@dkarrasch)
- Add *.DS_Store to .gitignore (#188) (@christiangnrd)
- Add GPUArrays out-of-place random methods (#189) (@tgymnich)
- Revert "Don't rely on thread adoption for command buffer callbacks." (#191) (@maleadt)
- Fix fill! with negative Int8 values (#193) (@christiangnrd)
- disambiguate gemm_wrapper! with LinAlg.jl (#195) (@dkarrasch)
- Add type annotations for character args in matmatmul (#196) (@dkarrasch)
- Handle missing adjoint case. (#197) (@maleadt)
- Fix transposed matmul. (#198) (@maleadt)
Published by github-actions[bot] over 2 years ago
Metal - v0.4.0
Metal v0.4.0
Closed issues:
- Restore mtlcall (#17)
- mapreduce has poor performance (#87)
- Native code reflection (#95)
- rand! with Bools sometimes fails in tests in 1.9 (#141)
- LLVM assertion failures (#153)
- Time macro similar to CUDA.@time (#160)
- bug in rand!? (#162)
- Why not support threadIdx().x, blockIdx().x, blockDim().x etc? (#163)
- Incorrect(?) darwin version in 1.8 with Metal.versioninfo() (#179)
Merged pull requests:
- Add native code reflection. (#96) (@maleadt)
- Move MPSKernels into a dedicated file (#155) (@tgymnich)
- [LU decomposition] Fix types (#156) (@tgymnich)
- Update manifest (#161) (@github-actions[bot])
- Implement Time macro (#164) (@christiangnrd)
- Fix some references to CUDA (#165) (@christiangnrd)
- Fix GPUArrays RNG interface implementation. (#166) (@maleadt)
- Bump the LLVM back-end. (#169) (@maleadt)
- Update manifest (#170) (@github-actions[bot])
- Update manifest (#171) (@github-actions[bot])
- Update manifest (#172) (@github-actions[bot])
- Bump GPUCompiler to v0.20 (#173) (@christiangnrd)
- Detect mapreduce threadgroup limits instead of guessing. (#176) (@maleadt)
- Remove reference to no longer used library in README.md (#177) (@christiangnrd)
- Report package versions as part of versioninfo() (#180) (@christiangnrd)
- Fix Darwin version indentification (#181) (@christiangnrd)
- Topk for MPSMatrix (#182) (@christiangnrd)
- Update manifest (#183) (@github-actions[bot])
- Don't rely on thread adoption for command buffer callbacks. (#184) (@maleadt)
Published by github-actions[bot] almost 3 years ago
Metal - v0.3.0
Metal v0.3.0
Closed issues:
- Migrate to metal C++? (#2)
- Improved errors when calling device functions on CPU (#90)
- Improve Objective-C interfacing (#104)
- Rename grid to groups (#116)
- Add functionality check helper (#121)
- inputing non-isbits types (#128)
- @metal docstring out-of-date (#129)
- mapreduce kernel uses too many threads (#132)
- Powers don't work with complex floats (#142)
Merged pull requests:
- Add contributing documentation (#93) (@max-Hawkins)
- Reduce multiple consecutive values in each thread to improve efficiency (#112) (@maxwindiff)
- Remove libcmt, use native ObjectiveC FFI (#117) (@maleadt)
- Rename grid to groups (#119) (@habemus-papadum)
- Audit MRR (#122) (@maleadt)
- Faster in-place reduction by using broadcasting to initialize partial… (#123) (@maxwindiff)
- Add MPS matrix decompositions (#124) (@tgymnich)
- Minor documentation formatting (#125) (@asinghvi17)
- Switch default mode to private storage (#126) (@christiangnrd)
- Update manifest (#127) (@github-actions[bot])
- Add some MtlArray docs (#130) (@christiangnrd)
- Port MetalKernels (#131) (@maxwindiff)
- Adapt to GPUCompiler 0.18. (#134) (@maleadt)
- Support passing non-isbits arguments, as long as they're unused. (#135) (@maleadt)
- Do not change grain size after pipeline creation (#136) (@maxwindiff)
- Bump GPUArrays. (#137) (@maleadt)
- Specialize GPUArrays' globalsize query. (#139) (@maleadt)
- Catch errors that happen during command buffer callbacks. (#140) (@maleadt)
- Call the correct currentdevice() in reflection (#143) (@maxwindiff)
- Error when calling device functions on CPU (#144) (@christiangnrd)
- Implement MTLGPUFamily and use it to validate gpu (#146) (@christiangnrd)
- Add functional() (#147) (@christiangnrd)
- Update manifest (#148) (@github-actions[bot])
- CompatHelper: add new compat entry for StaticArrays at version 1, (keep existing compat) (#151) (@github-actions[bot])
- Update to LLVM.jl 5 and GPUCompiler 0.19. (#154) (@maleadt)
Published by github-actions[bot] almost 3 years ago
Metal - v0.2.0
Metal v0.2.0
Closed issues:
- Threadgroup memory breaks on small datatypes (#26)
- Int64 not supported on AMD GPUs? (#38)
- Base.unsafe_convert is ambiguous (#42)
- Support for multiple devices (#44)
- Add CITATION file (#55)
- XGBoost on Metal.jl (#82)
- first try at metal (#84)
- Copysign intrinsic possibly wrong (#89)
- Metal.jl fails to precompile on Linux (#97)
- Silent failure with unsupported(?) Intel Iris Graphics (#109)
- I have 2 question about Metal.jl and Flux.jl (#110)
Merged pull requests:
- Update manifest (#57) (@github-actions[bot])
- Add GPU profiling capabilities (#58) (@max-Hawkins)
- Automatically detect if we need cmt build from source. (#59) (@maleadt)
- Update manifest (#60) (@github-actions[bot])
- Add queue kernel launch argument (#61) (@tgymnich)
- Update manifest (#63) (@github-actions[bot])
- Switch pipeline to juliaecosystem (#64) (@vchuravy)
- Update manifest (#65) (@github-actions[bot])
- Add a function for setting the current device (#66) (@maxwindiff)
- Add documentation webpage (#67) (@max-Hawkins)
- Wrap simdgroup matrix functions (#70) (@maxwindiff)
- Support loading/saving simdgroup matrix from threadgroup memory (#71) (@maxwindiff)
- Conditionalize the MtlDeviceArray element-type workaround. (#72) (@maleadt)
- Add basic SIMD shuffle up/down (#73) (@max-Hawkins)
- Update manifest (#74) (@github-actions[bot])
- Optimize warp reduction for mapreduce (#75) (@max-Hawkins)
- Specialize GPUArrays.globalindex() to improve broadcast performance (#76) (@maxwindiff)
- Update manifest (#78) (@github-actions[bot])
- Add initial performance shader support (matmul) (#80) (@max-Hawkins)
- Use Ninja to build cmt. (#81) (@maleadt)
- Update manifest (#83) (@github-actions[bot])
- Support Julia 1.9 (#85) (@maleadt)
- Add queue parameter to unsafe_copyto (#88) (@tgymnich)
- Update manifest (#91) (@github-actions[bot])
- Add MPS tests. (#92) (@maleadt)
- Support for writing binary archives (#94) (@maleadt)
- Support precompilation and loading on non-Apple hardware (#98) (@maleadt)
- Update manifest (#99) (@github-actions[bot])
- Improve reduce performance by passing CartesianIndices and length statically (#100) (@maxwindiff)
- Do not release objects that are autoreleased. (#102) (@habemus-papadum)
- Fix path the cmt in Hacking Section of the Readme (#105) (@habemus-papadum)
- Add example showing Metal and Gtk4 integration (#106) (@habemus-papadum)
- Fix memory leak. (#107) (@habemus-papadum)
- Add a mtl function for simple recursive data conversions. (#114) (@maleadt)
- Write profile trace in the current folder. (#115) (@maleadt)
Published by github-actions[bot] almost 3 years ago
Metal - v0.1.2
Metal v0.1.2
Closed issues:
- installation issue (libz.1.dylib not found) +workaround
- Optimally choosing threads and grid (#54)
Merged pull requests:
- Use Base.active_project. (#43) (@maleadt)
- Update manifest (#45) (@github-actions[bot])
- Add aliases MtlVector and MtlMatrix (#48) (@amontoison)
- Update manifest (#49) (@github-actions[bot])
- Wrap at-metal's output in a let block. (#50) (@maleadt)
- Update manifest (#52) (@github-actions[bot])
- Update manifest (#56) (@github-actions[bot])
Published by github-actions[bot] over 3 years ago
Metal - v0.0.1
Metal v0.0.1
Closed issues:
- error when using (#1)
- Argument buffer encoding is fragile (#5)
- LLVMType of MtlDeviceArray needs changing/manipulation (#6)
- Errors running on M1 Max (#14)
- I get this, my name isn't Tim (#16)
- Thanks for the previous fix - had a go (#18)
- Custom IR verification (#25)
- cmt: Release build fails install (#27)
Merged pull requests:
- Add device_code_metallib macro (#3) (@max-Hawkins)
- Update README (#8) (@max-Hawkins)
- Implement GPUArrays launch heuristic (#9) (@max-Hawkins)
- Add docstrings (#12) (@max-Hawkins)
- Rework metadata generation (#13) (@maleadt)
- Add CI (#19) (@maleadt)
- Use sw_vers to query the macOS version. (#20) (@maleadt)
- Updates for macOS 13 (Ventura); use bindless argument buffers (#23) (@maleadt)
- Enable the GPUArrays test suite (#24) (@maleadt)
- Use cmt from pre-built JLL. (#28) (@maleadt)
- Package updates (#29) (@maleadt)
- First test with a locally-built cmt. (#30) (@maleadt)
- Use labels to determine whether to build local deps. (#31) (@maleadt)
- Bump GPUArrays. (#32) (@maleadt)
- MTL wrapper clean-ups (#33) (@maleadt)
Published by github-actions[bot] over 3 years ago