trackfm

A compiler to automatically transform applications into disaggregated memory apps.

https://github.com/compiler-disagg/trackfm

Keywords

asplos24 compilers disaggregated-memory

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: compiler-disagg
License: mit
Language: LLVM
Default Branch: main
Homepage:
Size: 56.2 MB

Statistics

Stars: 16
Watchers: 1
Forks: 2
Open Issues: 1
Releases: 1

Created over 2 years ago · Last pushed over 2 years ago

TrackFM transparently compiles application code to run on systems with disaggregated memory. TrackFM builds on top of the AIFM runtime to manage remote memory. TrackFM uses compiler analysis and runtime optimizations to perform within 10-20% of state-of-the-art disaggregated memory solutions that require manual programmer effort.

TrackFM

Paper

TrackFM: Far-out Compiler Support for a Far Memory World
Brian R. Tauro, Brian Suchy, Simone Campanoni, Peter Dinda, Kyle C. Hale
The 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '24; to appear).

Setting Up the Experimental Environment

CloudLab

All experiments in our paper were run on the CloudLab platform. Our CloudLab profile, which describes a two-node Ubuntu 18.04 setup, is in a separate repo. You'll first need to create a CloudLab account if you don't already have one. To instantiate our profile, find the "Experiments" button at the top of your CloudLab dashboard, click it, and choose "Create Experiment Profile." Provide a name of your choosing for the profile, and for the "Source code" option, click the button that says "Git Repo." Provide this URL: https://github.com/compiler-disagg/trackfm-cloudlab.

Launching the Experimental Environment

SSH Key Configuration

You'll first want to make sure you have a public key set up so that you can log in to the experiment node(s). On the top right of the CloudLab Dashboard, click the button with your username, and select "Manage SSH Keys." You can then upload a key of your choice.

Launching Instances

Once you've set up the CloudLab profile, you can launch the necessary instances by going to "Experiments -> Start Experiment." Choose "Change Profile," select the profile you created above, choose "Select Profile," and hit "Next." You should not need to change any options on the "Parameterize" tab, so hit "Next" again. On the "Finalize" tab, provide a name of your choosing, then hit "Next." We want to instantiate the nodes now, so do not provide a schedule; just hit "Finish." If sufficient resources are available in CloudLab, the nodes should launch successfully. We require the xl170 node type in CloudLab, since it has the InfiniBand hardware needed to support the Shenango runtime. You may want to request more than one day, since several of the experiments can run for several hours. We have only tested on CloudLab, so we do not currently support other platforms (e.g., AWS).

Once the nodes have launched successfully, you should be able to select "List View" to see the nodes, and to view the public hostnames that you can SSH to.

Node Connectivity

The compute node needs to be able to access the memory server. To set up this connectivity, update MEM_SERVER_SSH_IP in /home/TrackFM/runtime/AIFM/aifm/configs/ssh with the memory server's IP address.
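The edit described above can also be scripted. The following is a minimal sketch, assuming the config file stores the address as a shell-style `MEM_SERVER_SSH_IP=<addr>` line (the exact file format is an assumption); the helper name `set_mem_server_ip` and the address `10.10.1.2` are hypothetical placeholders.

```bash
# Sketch: set MEM_SERVER_SSH_IP in a shell-style config file.
# Assumption: the file holds a line of the form MEM_SERVER_SSH_IP=<addr>.
set_mem_server_ip() {
    local cfg="$1" ip="$2"
    if grep -q '^MEM_SERVER_SSH_IP=' "$cfg"; then
        # Rewrite the existing assignment via a temporary file.
        sed "s|^MEM_SERVER_SSH_IP=.*|MEM_SERVER_SSH_IP=${ip}|" "$cfg" > "$cfg.tmp" &&
            mv "$cfg.tmp" "$cfg"
    else
        # No entry yet: append one.
        printf 'MEM_SERVER_SSH_IP=%s\n' "$ip" >> "$cfg"
    fi
}

# Example (10.10.1.2 is a placeholder, use your memory server's address):
# set_mem_server_ip /home/TrackFM/runtime/AIFM/aifm/configs/ssh 10.10.1.2
```

Other entries in the file are left untouched, so the helper can be re-run whenever the memory server's address changes.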

You should also be able to ssh to the memory server from the compute node. You'll need to generate a key pair on the compute node (e.g., using ssh-keygen), then add the public key to the memory server, either by using ssh-copy-id or by adding the key to the .ssh/authorized_keys file on the memory server.
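If you add the key by hand rather than with ssh-copy-id, it helps to make the step idempotent so that repeating the setup does not duplicate entries. A minimal sketch follows; the helper name `add_authorized_key` and the key file names are hypothetical.

```bash
# Sketch: append a public key to an authorized_keys file only if it is absent.
add_authorized_key() {
    local pubkey_file="$1" auth_file="$2"
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    chmod 600 "$auth_file"
    # -x matches whole lines, -F treats the key as a literal string.
    if ! grep -qxF "$(cat "$pubkey_file")" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}

# On the compute node (generates ~/.ssh/id_ed25519 and ~/.ssh/id_ed25519.pub):
#   ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519
# On the memory server, after copying id_ed25519.pub over:
#   add_authorized_key id_ed25519.pub ~/.ssh/authorized_keys
```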

Build Instructions

The default storage allocation on CloudLab is insufficient for our experiments. We will instead use a custom /home directory, created by the profile. You'll first need to set up permissive access to this directory:

```bash
sudo chmod 777 /home
```

Note that unless otherwise explicitly stated, it is assumed that you will run these setup instructions on both nodes in CloudLab (the compute node and the memory server).

Build Prerequisites

We can first set up the TrackFM repo:

```bash
cd /home
git clone https://github.com/compiler-disagg/TrackFM
```

Dev Packages

We need to install the relevant packages for TrackFM, AIFM, Shenango, and our example applications. We'll first need to upgrade the Linux kernel version, which will necessitate a reboot:

```bash
sudo apt-get update
sudo apt-get install -y linux-headers-5.0.0-20 \
    linux-headers-5.0.0-20-generic \
    linux-hwe-edge-tools-5.0.0-20 \
    linux-image-5.0.0-20-generic \
    linux-modules-5.0.0-20-generic \
    linux-tools-5.0.0-20-generic
sudo reboot
```

After reboot, verify that the instance is using the 5.0 kernel by running uname -r. We can then install the necessary packages:
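The kernel check can be scripted so that later setup steps stop early on the wrong kernel. This is a small sketch; the helper name `is_kernel_5_0` is hypothetical, and it only inspects the release string printed by `uname -r`.

```bash
# Sketch: succeed only when a kernel release string starts with "5.0.".
is_kernel_5_0() {
    case "$1" in
        5.0.*) return 0 ;;
        *)     return 1 ;;
    esac
}

if is_kernel_5_0 "$(uname -r)"; then
    echo "kernel OK: $(uname -r)"
else
    echo "expected a 5.0 kernel, found $(uname -r)" >&2
fi
```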

```bash
sudo apt-get update
sudo apt-get -y --fix-broken install
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get install -y libnuma-dev \
    libmnl-dev \
    libnl-3-dev \
    libnl-route-3-dev \
    libcrypto++-dev \
    libcrypto++-doc \
    libcrypto++-utils \
    software-properties-common \
    gcc-9 \
    g++-9 \
    python-pip \
    python3-pip \
    libjpeg-dev \
    zlib1g-dev \
    libevent-dev
sudo apt-get -y purge cmake
```

Mellanox OFED Stack

Shenango and AIFM require support for the Mellanox InfiniBand cards. This needs to be set up manually:

```bash
wget "http://content.mellanox.com/ofed/MLNX_OFED-4.6-1.0.1.1/MLNX_OFED_LINUX-4.6-1.0.1.1-ubuntu18.04-x86_64.tgz"
tar xvf MLNX_OFED_LINUX-4.6-1.0.1.1-ubuntu18.04-x86_64.tgz
cd MLNX_OFED_LINUX-4.6-1.0.1.1-ubuntu18.04-x86_64
sudo ./mlnxofedinstall --add-kernel-support --dpdk --upstream-libs
sudo /etc/init.d/openibd restart
```

Note that when you run ./mlnxofedinstall above, you may see a message like Failed to install libibverbs-dev DEB. This is okay, and it should still work.

Python Toolchain

We require several Python modules for AIFM, for plot generation, and for our benchmarks:

```bash
cd /home/TrackFM
sudo pip3 install -r requirements.txt
sudo pip install --upgrade pip
sudo pip install cmake==3.22.0
```

Compiler Toolchain

TrackFM relies on the NOELLE compiler toolchain, which we need to install. Note that you only need to install this on the compute node (the node where you launch applications), not the memory server node. This will take a couple of hours, so sit back and enjoy a coffee.

```bash
./install-dependencies.sh
```

Building TrackFM, AIFM, and Shenango

The tooling for AIFM and Shenango (and thus our build system) uses bash. On CloudLab, bash is not the default shell, so you might want to make it so:

```bash
chsh -s /bin/bash
```

You can then build the TrackFM runtime, which relies on AIFM and Shenango:

```bash
cd /home/TrackFM/runtime
./build.sh
```

Note on Shenango Setup after Reboot

Each time a node reboots, Shenango has to be set up again, because it relies on certain kernel modules being loaded and sysctl parameters being set properly. The build.sh script used above performs this setup the first time, but after a reboot you'll need to run it again:

```bash
sudo /home/TrackFM/runtime/AIFM/shenango/scripts/setup_machine.sh
```

Compiling the TrackFM compiler passes

Note this only needs to be done on the compute node.

```bash
cd /home/TrackFM/runtime/compiler_passes/passes
# If LLVM and NOELLE are not installed in /home, update set_paths with the correct paths first.
source set_paths
make -j
```

Datasets

We use Kaggle to store our datasets. Follow one of the options below to download them.

First Option: Create Kaggle Account

To download datasets from the command line, you'll need a Kaggle account. Once your account is created, click on the profile button on the Kaggle homepage to generate an API token. This will download a JSON file. Copy the JSON file to /home/datasets on the compute node. Alternatively, you can download the datasets to your local machine and then upload them to /home/datasets on the compute node.

```bash
cd /home
mkdir datasets
# copy the JSON API token into this folder
cd datasets
pip3 install --user kaggle
# upload your Kaggle API token to the CloudLab server, for example using scp
mkdir -p ~/.kaggle   # make sure the target directory exists before copying
cp kaggle.json ~/.kaggle/
kaggle datasets download -d btauro/kmeans
kaggle datasets download -d btauro/nyc-dataframe
unzip nyc-dataframe.zip
unzip kmeans.zip
```

Second Option: Direct download

Alternatively, you can download the datasets directly through the public web interface without a Kaggle account:

Download and unzip the above datasets and upload them to /home/datasets on your CloudLab machine.

Note: Both options require the datasets directory in /home to already be created.

Testing

To confirm that TrackFM is set up correctly, run make smoke_test in the TrackFM root directory. This will run the transformed STREAM benchmark, and if TrackFM is set up correctly you will see a success message.

Reproducing Paper Results

Each experiment with an associated figure in the paper has a fig directory in exp/. To reproduce the results, run make <runtime>_<fig-num> from the top-level TrackFM directory, where runtime is the backend to use and fig-num is the figure number from the paper. Note: make <runtime>_<fig-num> also generates the figures automatically and places them in the figs folder in the root directory. The backends currently supported are local (a setup with only local memory), fastswap, AIFM, and trackfm. For example, to reproduce Figure 14a from the paper, you could do the following:

```
# generate TrackFM results
make trackfm_fig14a

# generate fastswap results
make fastswap_fig14a
```
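When several backends are needed for the same figure, the target names can be assembled in a loop. The sketch below assumes only the `<runtime>_<fig-num>` naming described above; the helper names `fig_target` and `run_figure` and the `RUN` dry-run variable are hypothetical and not part of the TrackFM Makefiles.

```bash
# Sketch: build the make target name for one backend and one figure.
fig_target() {
    # $1 = backend (e.g. trackfm), $2 = figure id (e.g. fig14a)
    printf '%s_%s\n' "$1" "$2"
}

# Sketch: run (or, with RUN=echo, only print) the target for each backend.
run_figure() {
    local fig="$1"
    shift
    local backend
    for backend in "$@"; do
        ${RUN:-} make "$(fig_target "$backend" "$fig")"
    done
}

# From the top-level TrackFM directory, as a dry run:
#   RUN=echo run_figure fig14a trackfm fastswap
```

Leaving `RUN` unset invokes make for real, one backend after another.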

One can also invoke the run.sh script located in the respective fig directory to generate the results. For example, for the same figure as above:

```
fig14a
├── fastswap
│   ├── Makefile
│   └── run.sh
└── TrackFM
    ├── Makefile
    └── run.sh

cd exp/fig14a
cd TrackFM

# reproduce TrackFM datapoints and generate plots
./run.sh
```

Reproducing Fastswap and AIFM results

To reproduce Fastswap and AIFM results, both systems must first be installed separately in /home on both the compute and memory nodes. We provide installation scripts and instructions for both AIFM and Fastswap in the top-level TrackFM directory.

Using TrackFM

TrackFM requires O1-optimized bitcode without vectorization. We provide sample Makefiles in TrackFM/sample_configs/TrackFM.

```
# Example workflow
cd /home/TrackFM/sampleconfigs/TrackFM
clang -c -O1 main.c -emit-llvm
make -f makechunk
```
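To make the "without vectorization" requirement explicit on the command line, clang's `-fno-vectorize` and `-fno-slp-vectorize` flags can be added to the bitcode step of the example workflow above. This is a sketch rather than the project's own build rule: the helper name `emit_bitcode` and the `RUN` dry-run variable are hypothetical, and whether the extra flags are strictly needed at -O1 depends on the clang version.

```bash
# Sketch: emit O1 bitcode for one C file with loop and SLP vectorization disabled.
emit_bitcode() {
    local src="$1"
    local out="${src%.c}.bc"   # main.c -> main.bc
    # RUN=echo prints the command instead of running it.
    ${RUN:-} clang -c -O1 -emit-llvm -fno-vectorize -fno-slp-vectorize "$src" -o "$out"
}

# Example:
#   emit_bitcode main.c
```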

TrackFM renames binary symbols to distinguish between TrackFM std lib calls and the runtime std lib calls. TrackFM can ignore symbols that do not use remote memory if you specify the symbol name in TrackFM/app_symbols.

For large code bases, one possible workflow is to compile the code with wllvm and pass the resulting single bitcode file to TrackFM.
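A wllvm-based build might look like the sketch below. It assumes the application's own Makefile honors the `CC` and `CFLAGS` variables, which may not hold for every project; the helper name `build_whole_program_bitcode`, the binary name, and the `RUN` dry-run variable are hypothetical. `LLVM_COMPILER` and `extract-bc` are part of wllvm itself.

```bash
# Sketch: build an application with wllvm and extract one whole-program bitcode file.
build_whole_program_bitcode() {
    local binary="$1"
    # wllvm reads LLVM_COMPILER to choose the underlying compiler.
    export LLVM_COMPILER=clang
    # Assumption: the project's Makefile picks up CC and CFLAGS.
    ${RUN:-} make CC=wllvm CFLAGS="-O1 -fno-vectorize -fno-slp-vectorize"
    # extract-bc writes <binary>.bc next to the linked binary.
    ${RUN:-} extract-bc "$binary"
}

# Example, as a dry run that only prints the commands:
#   RUN=echo build_whole_program_bitcode ./app
```

The resulting `.bc` file is the single bitcode file that would then be handed to the TrackFM passes.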

Limitations

TrackFM does not support external libraries that change memory pointers. However, TrackFM can ignore memory allocations passed to external libraries if you annotate the allocation sites (e.g., malloc) with __attribute__((annotate("local_malloc"))).
No support for mmap.
Multi-threaded applications are currently untested.

Acknowledgements

TrackFM was made possible with support from the United States National Science Foundation (NSF) via grants CCF-2028958, CNS-1763612, CNS-2239757, CNS-1763743, CCF-2028851, CCF-2119069, CCF-2107042, CNS-2211315, and CNS-2211508, the US Department of Energy (DoE) via the Exascale Computing Project (17-SC-20-SC) and with grant DE-SC0022268, as well as with generous support from Samsung Semiconductor, Inc.

Contact

The TrackFM codebase is currently maintained by Brian Tauro (btauro [at] hawk [dot] iit [dot] edu).

Owner

Name: compiler-disagg
Login: compiler-disagg
Kind: organization

Repositories: 1
Profile: https://github.com/compiler-disagg

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Tauro"
    given-names: "Brian"
  - family-names: "Suchy"
    given-names: "Brian"
  - family-names: "Campanoni"
    given-names: "Simone"
  - family-names: "Dinda"
    given-names: "Peter"
  - family-names: "Hale"
    given-names: "Kyle"
doi: ""
title: "TrackFM Compiler"
version: 1.0.0
repository-code: "https://github.com/compiler-disagg/trackfm"
license: MIT
message: "If you use this software, please cite both the article from preferred-citation and the software itself."
preferred-citation:
  type: conference-paper
  authors:
  - family-names: "Tauro"
    given-names: "Brian"
  - family-names: "Suchy"
    given-names: "Brian"
  - family-names: "Campanoni"
    given-names: "Simone"
  - family-names: "Dinda"
    given-names: "Peter"
  - family-names: "Hale"
    given-names: "Kyle"
  title: "TrackFM: Far-out Compiler Support for a Far Memory World"
  doi: ""
  publisher: "Association for Computing Machinery"
  conference:
    name: "ASPLOS '24"
    location: "San Diego, CA, USA"
  collection-title: "Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems"
  year: 2024
  month: 4
  start: 
  end:

GitHub Events

Total

Issues event: 1
Watch event: 5
Fork event: 1

Last Year

Issues event: 1
Watch event: 5
Fork event: 1

Issues and Pull Requests

All Time

Total issues: 2
Total pull requests: 0
Average time to close issues: 3 days
Average time to close pull requests: N/A
Total issue authors: 2
Total pull request authors: 0
Average comments per issue: 0.5
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 1
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 1
Pull request authors: 0
Average comments per issue: 0.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0