Getting Started using Nix#
Install Nix (for the first time)#
If you don’t have nix installed yet, fetch the download link from this page.
Nix is a package manager that lets us configure system-level dependencies, not just Python libraries.
Install environment using Nix#
Clone the MASE repository:
git clone git@github.com:DeepWok/mase.git
Create your own branch to work on:
cd mase
git checkout -b your_branch_name
Activate the nix shell:
nix-shell
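If it is unclear whether the shell was actually activated, one quick check is the IN_NIX_SHELL environment variable, which nix-shell sets to pure or impure inside the session. The helper name in_nix_shell below is our own and purely illustrative; this is a minimal sketch, not part of the MASE tooling:

```shell
# Hypothetical helper (name is ours): succeeds only inside a nix-shell session.
# nix-shell exports IN_NIX_SHELL as "pure" or "impure".
in_nix_shell() {
    [ -n "${IN_NIX_SHELL:-}" ]
}

if in_nix_shell; then
    echo "inside nix-shell (${IN_NIX_SHELL})"
else
    echo "not inside nix-shell; run nix-shell from the mase directory first"
fi
```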
Tested Systems#
darwin aarch64 mac
linux x86_64 cuda-enabled
windows x86_64 cuda-enabled
Troubleshooting#
Clang problem on darwin aarch64 systems

There is a legacy issue with porting clang on nix, or more generally with nix shells and Python packages with C++ extensions on macOS; you can find the issue detailed here.

macOS users should follow the standard procedure to install xcode, which provides Apple's clang/clang++ compilation tools. In our setup.py, several installation steps use this local system clang. You should be able to verify this by typing:

[nix-shell:~/Projects/mase]$ which clang
# The following is the expected output
# /usr/bin/clang
[nix-shell:~/Projects/mase]$ clang --version
# The following is the expected output
# Apple clang version 15.0.0 (clang-1500.3.9.4)
# Target: arm64-apple-darwin23.5.0
# Thread model: posix
# InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
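The manual check above can also be scripted. The following is a minimal sketch, assuming the expected path /usr/bin/clang and the "Apple clang" version banner shown in this section; the function names are hypothetical and chosen only for illustration:

```shell
# Hypothetical helpers (names are ours) mirroring the manual check above.

# Succeeds if the given path is the system clang that Xcode provides.
is_system_clang() {
    [ "$1" = "/usr/bin/clang" ]
}

# Succeeds if the given `clang --version` output starts with "Apple clang".
is_apple_clang_banner() {
    case "$1" in
        "Apple clang"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage inside the nix shell on macOS:
#   if is_system_clang "$(command -v clang)" && is_apple_clang_banner "$(clang --version)"; then
#       echo "using Apple clang"
#   else
#       echo "unexpected clang on PATH"
#   fi
```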
Using other clang variants, especially the llvm-backed nix clang, will cause installation or runtime issues with cocotb and verilator because of confusion over the std library paths.

g++ or glibc problem on WSL (Ubuntu-24.04 system)

When using verilator in the nix shell, some of the generated files are compiled by g++ (or gcc), and you might encounter errors such as "g++ can not find" or "glibc-xxx cannot find".

This is because the nix shell does not include g++ or glibc in its environment. When compiling files, it automatically calls the related libraries in your local environment.

When facing this kind of problem, you need to configure the local environment to match the nix shell requirements. For instance, installing or updating the related packages to the version verilator requires should work:

(.venv) cx922@DESKTOP-UAFT8QR:~/mase$ ldd --version
ldd (Ubuntu GLIBC 2.39-0ubuntu8.2) 2.39
...
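To see what the local system actually provides, the glibc version can be read from the first line of the ldd --version output, and the presence of g++ can be checked on PATH. This is a minimal sketch; both function names are hypothetical, and it assumes the banner format shown above, where the version is the last field of the first line:

```shell
# Hypothetical helpers (names are ours) for inspecting the local toolchain.

# Print the glibc version from `ldd --version` output passed as $1,
# i.e. the last whitespace-separated field of its first line.
glibc_version_from_banner() {
    printf '%s\n' "$1" | awk 'NR == 1 { print $NF }'
}

# Succeeds if the named tool is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Usage on WSL:
#   glibc_version_from_banner "$(ldd --version)"
#   have_tool g++ || echo "g++ is missing from the local system"
```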
verilator installation and its cocotb integration

We expect users to self-install verilator on their local system. The nix-shell will not install verilator for you. You can find the installation guide here.

One common problem we found with verilator-backed cocotb is that the cocotb flow fails to resolve certain paths correctly when your python is running in an advanced virtual environment. One such error has the following form:

- Verilator: Built from 0.000 MB sources in 0 modules, into 0.000 MB in 0 C++ files needing 0.000 MB
- Verilator: Walltime 0.003 s (elab=0.000, cvt=0.000, bld=0.000); cpu 0.000 s on 8 threads
INFO: Running command make -C /Users/name/Projects/mase/src/mase_components/activations/test/build/fixed_gelu/test_0 -f Vtop.mk in directory /Users/name/Projects/mase/src/mase_components/activations/test/build/fixed_gelu/test_0
make: *** No rule to make target `/usr/local/lib/python3.11/dist-packages/cocotb/share/lib/verilator/verilator.cpp', needed by `verilator.o'. Stop.
This error message says that the cocotb flow is looking for the verilator.cpp file in the wrong path. The verilator.cpp file is actually located in the cocotb package, which is installed in the site-packages directory of your python environment. The cocotb flow is instead looking for it in the native system directory, which is incorrect. One workaround is to change the check in the cocotb Makefile.

Edit the Makefile.inc of your cocotb installation, for example /Users/yz10513/anaconda3/envs/mase/lib/python3.11/site-packages/cocotb/share/makefiles/Makefile.inc (the prefix depends on your own environment), and change the following lines:

# Our comments: this is exactly the problem, sometimes we would like to use
# this ensures we use the same python as the one cocotb was installed into
ifeq ($(IS_VENV),True)
    # In a virtual environment, the Python binary may be a symlink, so it should not use realpath
    PYTHON_BIN ?= $(shell cocotb-config --python-bin)
else
    # disable the use of realpath!
    # realpath to convert windows paths to unix paths, like cygpath -u
    #PYTHON_BIN ?= $(realpath $(shell cocotb-config --python-bin))
    PYTHON_BIN ?= $(shell cocotb-config --python-bin)
endif
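Because the location of Makefile.inc depends on where cocotb is installed, it may be easier to ask cocotb itself than to guess the path. The sketch below relies on cocotb-config --makefiles, which prints the directory holding cocotb's makefiles. It assumes Makefile.inc sits directly in that directory, as in the path above. The helper name cocotb_makefile_inc is hypothetical:

```shell
# Hypothetical helper (name is ours): print the path of cocotb's Makefile.inc.
# An explicit makefiles directory may be passed as $1; otherwise cocotb-config
# is asked for it. Assumption: Makefile.inc lives directly in that directory.
cocotb_makefile_inc() {
    makefiles_dir="${1:-$(cocotb-config --makefiles)}"
    printf '%s/Makefile.inc\n' "$makefiles_dir"
}

# Usage, with the python environment that has cocotb activated:
#   grep -n 'PYTHON_BIN' "$(cocotb_makefile_inc)"
```

The grep in the usage comment lists the PYTHON_BIN assignments, so you can confirm which file you are about to edit.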
GPU-enabled torch install

It is possible that the torch package is not installed correctly with GPU support in the nix-shell. As you may have noticed, we create a virtual environment in the nix-shell and install the torch package in it. However, the torch package may not be installed with GPU support because of operating system compatibility issues. In these cases, you may choose to install the torch package yourself using pip:

# check your python
# it should point you to the python3 in the nix-shell
# ../mase/.venv/bin/python3
which python3
python3 -m pip install torch torchvision torchaudio
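Before and after reinstalling, it can help to confirm that python3 resolves to the virtual environment and that torch reports a usable GPU. This is a minimal sketch; both function names are hypothetical, and the path check assumes the .venv/bin/python3 layout mentioned above:

```shell
# Hypothetical helpers (names are ours).

# Succeeds if the given interpreter path ends in /.venv/bin/python3.
is_venv_python() {
    case "$1" in
        */.venv/bin/python3) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the installed torch version and whether CUDA is usable.
# Requires torch to be importable by python3.
torch_cuda_report() {
    python3 -c 'import torch; print(torch.__version__, torch.cuda.is_available())'
}

# Usage inside the nix shell:
#   is_venv_python "$(command -v python3)" || echo "python3 is not the .venv interpreter"
#   torch_cuda_report
```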
CUDA_HOME problem with deepspeed on cuda enabled systems

When installing deepspeed on cuda enabled systems, you might encounter the following error:

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [9 lines of output]
    Traceback (most recent call last):
      File "<string>", line 2, in <module>
      File "<pip-setuptools-caller>", line 34, in <module>
      File "/tmp/pip-install-lmczkuc5/deepspeed_2de5ecce4b1e495ea5546f4a526749f4/setup.py", line 101, in <module>
        cuda_major_ver, cuda_minor_ver = installed_cuda_version()
                                         ^^^^^^^^^^^^^^^^^^^^^^^^
      File "/tmp/pip-install-lmczkuc5/deepspeed_2de5ecce4b1e495ea5546f4a526749f4/op_builder/builder.py", line 50, in installed_cuda_version
        raise MissingCUDAException("CUDA_HOME does not exist, unable to compile CUDA op(s)")
    op_builder.builder.MissingCUDAException: CUDA_HOME does not exist, unable to compile CUDA op(s)
    [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.
This normally means that the cuda toolkit is missing or not installed properly, which you can check by running the following commands:
nvcc --version
which nvcc
If either command fails, this is an indication to reinstall the cuda toolkit. You might need to run sudo apt-get install cuda-toolkit on Ubuntu systems. You may also need to install the usual build tools for deepspeed and pycuda, such as gcc, g++, make and cmake. The tensorrt installation may also trigger an independent install if you are on wsl.
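If the toolkit is installed but CUDA_HOME is still reported as missing, one option is to derive it from the nvcc location and export it before retrying the install. This is a sketch under assumptions: it assumes the conventional layout where nvcc lives in the bin directory of the toolkit root, and both function names are hypothetical:

```shell
# Hypothetical helpers (names are ours).

# Derive a toolkit root from an nvcc path,
# e.g. /usr/local/cuda/bin/nvcc -> /usr/local/cuda (assumes <root>/bin/nvcc).
cuda_home_from_nvcc() {
    dirname "$(dirname "$1")"
}

# Print a best-guess CUDA toolkit root, or return 1 if none is found.
# An already-set CUDA_HOME takes precedence over nvcc on PATH.
guess_cuda_home() {
    if [ -n "${CUDA_HOME:-}" ]; then
        printf '%s\n' "$CUDA_HOME"
    elif command -v nvcc >/dev/null 2>&1; then
        cuda_home_from_nvcc "$(command -v nvcc)"
    else
        return 1
    fi
}

# Usage before installing deepspeed:
#   CUDA_HOME="$(guess_cuda_home)" && export CUDA_HOME \
#       || echo "no CUDA toolkit found; install cuda-toolkit first"
```

Exporting CUDA_HOME only helps when a toolkit is actually present; if guess_cuda_home fails, reinstalling the toolkit as described above is still required.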