Recent Releases of ark

ark - ARK v0.5.0

  • Integrate with MSCCL++
  • Remove dependency on gpudma
  • Add AMD CDNA3 architecture support
  • Support communication for AMD GPUs
  • Optimize OpGraph scheduling
  • Add a multi-GPU Llama2 example

See details at https://github.com/microsoft/ark/issues/168.

- C++
Published by chhwang over 2 years ago

ark - ARK v0.4.1

What's Changed

  • Fix graph optimization by @chhwang in https://github.com/microsoft/ark/pull/171
  • v0.4.1 by @chhwang in https://github.com/microsoft/ark/pull/172

Full Changelog: https://github.com/microsoft/ark/compare/v0.4.0...v0.4.1

- C++
Published by chhwang over 2 years ago

ark - ARK v0.4.0

  • Support AMD GPUs (CDNA2, single-GPU only)
  • Add high-performance AllReduce & AllGather algorithms with MSLL
  • Fix major bugs in the scheduler

See details at https://github.com/microsoft/ark/issues/137.

- C++
Published by chhwang over 2 years ago

ark - ARK v0.3.0

  • Enable heuristic model graph optimization
  • Revise Python interfaces
  • Add more operators; support mixed-precision models and bfloat16
  • Add a Llama2-7B example
  • Fix connection setup bugs for large & distributed models
  • Fix correctness bugs in a few operators
  • Minor scheduler improvements

See details at https://github.com/microsoft/ark/issues/113.

- C++
Published by chhwang over 2 years ago

ark - ARK v0.2.1

- C++
Published by chhwang over 2 years ago

ark - ARK v0.2.0

Timeline

Release Date: Sep. 5th, 2023

Work Items (TBU)

Model

  1. [x] Interface: expose the underlying buffer info to Tensor (#79)

Communication Stack

  1. [x] Interface: hide GpuCommSw implementation from the interface (#81)
  2. [x] Interface: extend the current interface (#104)

Operators Support

  1. [x] Operator: add more operators (#62)
  2. [x] Operator: upgrade CUTLASS (#105)

Python

  1. [x] Interface: #96

Examples

  1. [x] Example: parallel matmul example (#64)

Bug Fix

  • [x] #65
  • [x] #66
  • [x] #75
  • [x] #77
  • [x] #90
  • [x] #94
  • [x] #98
  • [x] #97
  • [x] #99
  • [x] #100
  • [x] #103
  • [x] #101
  • [x] #35

Documents

  1. [x] Docs: update documents (#76, #78, #87)

CI

  1. [x] Code Coverage: add code coverage (#110)
  2. [x] Unit Tests: add a unit test pipeline (#88)
  3. [x] Unit Tests: #91

- C++
Published by chhwang over 2 years ago

ark - ARK v0.1.0

Features

  • Scheduler
    • The default scheduler
  • Communication
    • A simple software communication stack
  • Operators
    • Tensor
    • Reshape
    • Identity
    • Sharding
    • ReduceSum
    • ReduceMean
    • ReduceMax
    • Layernorm
    • Softmax
    • Transpose
    • Matmul
    • Im2col
    • Scale
    • Relu
    • Gelu
    • Add
    • Mul
    • Send
    • SendDone
    • Recv
    • SendMM
    • RecvMM
    • AllReduce
    • AllGather
  • Examples
    • Tutorials
    • A simple FFN training
    • Transformer inference
    • Megatron inference

New Contributors

  • @wusar made their first contribution in https://github.com/microsoft/ark/pull/1

Full Changelog: https://github.com/microsoft/ark/commits/v0.1.0

- C++
Published by chhwang almost 3 years ago