nemo

NeMo Framework is a generative AI framework

https://github.com/logesh-works/nemo

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.4%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: logesh-works
  • License: apache-2.0
  • Language: Python
  • Default Branch: master
  • Size: 55.3 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed over 1 year ago

README.rst

Introduction
------------

NVIDIA NeMo Framework is a generative AI framework built for researchers and PyTorch developers
working on large language models (LLMs), multimodal models (MM), automatic speech recognition (ASR),
and text-to-speech synthesis (TTS).

This repo implements multi-softmax in ASR models. 
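
The README does not describe how multi-softmax is implemented. As an illustration only, one plausible reading is that the decoder shares a single vocabulary but normalizes over just the tokens of the active group (for example, one language). The sketch below is not this repository's code; the function names, the ``groups`` mapping, and the group labels are all hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def multi_softmax(logits, groups, active):
    """Normalize only the logits that belong to the active group.

    logits: one score per token in a shared vocabulary.
    groups: hypothetical mapping from group name to token indices.
    active: name of the group to decode with.
    Returns a full-length list; tokens outside the active group get 0.0.
    """
    indices = groups[active]
    probs = softmax([logits[i] for i in indices])
    out = [0.0] * len(logits)
    for i, p in zip(indices, probs):
        out[i] = p
    return out


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.5, 3.0]
    groups = {"hi": [0, 1], "ta": [2, 3]}  # hypothetical language groups
    print(multi_softmax(logits, groups, "hi"))
```

Under this reading, tokens from other groups can never be emitted, because their probability is exactly zero regardless of their raw score.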

Requirements
------------

1) Python 3.10 or above
2) PyTorch 1.13.1 or above
3) An NVIDIA GPU, if you intend to train models
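
The requirements above can be checked before installing anything else. The following is a minimal sketch, not part of NeMo: the helper names ``parse_version`` and ``check_requirements`` are made up for this example, and the GPU check only covers CUDA devices as reported by PyTorch.

```python
import importlib.util
import re
import sys

MIN_PYTHON = (3, 10)
MIN_TORCH = (1, 13, 1)


def parse_version(text):
    """Turn a version string such as '2.1.0+cu121' into a tuple like (2, 1, 0)."""
    numbers = []
    for piece in text.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def check_requirements():
    """Return a dict describing which of the listed requirements are met."""
    report = {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "torch_ok": False,
        "cuda_gpu_available": False,
    }
    # Only import torch if it is installed, so the check also runs on a bare env.
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch_ok"] = parse_version(torch.__version__) >= MIN_TORCH
        report["cuda_gpu_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for name, value in check_requirements().items():
        print(f"{name}: {value}")
```

A ``False`` for ``cuda_gpu_available`` does not block inference on CPU; per the list above, the GPU only matters for training.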


Getting help with NeMo
----------------------
The FAQ can be found on NeMo's Discussions board. You are welcome to ask questions or start discussions there.


From source
~~~~~~~~~~~
Use this installation mode if you are contributing to NeMo.

.. code-block:: bash

    # create env
    conda create -n temo python=3.10
    conda activate temo

    # clone nemo
    git clone https://github.com/NVIDIA/NeMo.git
    
    # check the CUDA version with `nvcc --version`, then install PyTorch
    pip3 install torch torchvision torchaudio

    conda install -c nvidia cuda-nvprof=12.1  # use the release that matches your CUDA version
    pip install packaging

    # install apex
    git clone https://github.com/NVIDIA/apex
    cd apex
    # if pip >= 23.1 (ref: https://pip.pypa.io/en/stable/news/#v23-1) which supports multiple `--config-settings` with the same key... 
    pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" --config-settings "--build-option=--fast_layer_norm" --config-settings "--build-option=--distributed_adam" --config-settings "--build-option=--deprecated_fused_adam" ./
    
    # install NeMo
    cd ../NeMo
    ./reinstall.sh


If you only want the toolkit without the additional conda-based dependencies, replace ``./reinstall.sh``
with ``pip install -e .``, run from the root of the NeMo repository.

Mac computers with Apple silicon
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To install NeMo on a Mac with an Apple M-series GPU:

- create a new Conda environment

- install PyTorch 2.0 or higher

- run the following code:

.. code-block:: shell

    # [optional] install mecab using Homebrew, to use sacrebleu for NLP collection
    # you can install Homebrew here: https://brew.sh
    brew install mecab

    # [optional] install pynini using Conda, to use text normalization
    conda install -c conda-forge pynini

    # install Cython manually
    pip install cython

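    # clone the repo and install the toolkit
    # (the command below installs the published package from PyPI;
    # for a development install of the clone, use: pip install -e '.[all]')
    git clone https://github.com/NVIDIA/NeMo
    cd NeMo
    pip install 'nemo_toolkit[all]'

    # Note: only the ASR collection is guaranteed to work on a MacBook,
    # so on a MacBook use: pip install 'nemo_toolkit[asr]'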

Windows Computers
~~~~~~~~~~~~~~~~~

One option is to use Windows Subsystem for Linux (WSL).

To install WSL:

- In PowerShell, run the following code:

.. code-block:: shell

    wsl --install
    # [note] If you run wsl --install and see the WSL help text, it means WSL is already installed.

Learn more about installing WSL in Microsoft's official documentation.

After installing your Linux distribution with WSL:

- **Option 1:** Open the distribution (Ubuntu by default) from the Start menu and follow the instructions.
- **Option 2:** Launch the Terminal application. If it is not installed, download it from Microsoft's Windows Terminal page.

Next, follow the instructions for Linux systems, as provided above. For example:

.. code-block:: bash

    sudo apt-get update && sudo apt-get install -y libsndfile1 ffmpeg
    git clone https://github.com/NVIDIA/NeMo
    cd NeMo
    ./reinstall.sh

RNNT
~~~~
Note that RNNT requires numba to be installed from conda.

.. code-block:: bash

  conda remove numba
  pip uninstall numba
  conda install -c conda-forge numba

Apex
~~~~
To install Apex, follow the instructions at https://github.com/NVIDIA/apex.git

If you have issues installing Apex or any other dependencies, it is highly recommended to use the NVIDIA PyTorch or NeMo container.

While installing, Apex may raise an error if the CUDA version on your system does not match the CUDA version torch was compiled with.
This error can be avoided by commenting out the check here: https://github.com/NVIDIA/apex/blob/master/setup.py#L32
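
Before building Apex, the two CUDA versions can be compared by hand. The sketch below is an illustration, not Apex's own check: the helper names are hypothetical, it assumes ``nvcc`` is on ``PATH``, and it compares only the major and minor numbers.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from `nvcc --version` text, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def parse_torch_cuda(version_string):
    """Turn torch.version.cuda (e.g. '12.1') into (12, 1); None for CPU-only builds."""
    if not version_string:
        return None
    major, minor = version_string.split(".")[:2]
    return (int(major), int(minor))


def system_cuda_release():
    """Run nvcc if it is on PATH and return its release as (major, minor)."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    try:
        import torch

        torch_cuda = parse_torch_cuda(torch.version.cuda)
    except ImportError:
        torch_cuda = None
    system_cuda = system_cuda_release()
    print("system CUDA:", system_cuda)
    print("torch CUDA: ", torch_cuda)
    print("match:", system_cuda is not None and system_cuda == torch_cuda)
```

If the two versions differ, installing a PyTorch build that matches the system toolkit is usually a safer first step than disabling the check.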

cuda-nvprof is needed to install Apex. The version should match the CUDA version that you are using:

.. code-block:: bash

  conda install -c nvidia cuda-nvprof=11.8

packaging is also needed:

.. code-block:: bash

  pip install packaging

With the latest versions of Apex, the ``pyproject.toml`` file in Apex may need to be deleted in order to install locally.

Examples
--------

Many examples can be found under the "Examples" folder.

Owner

  • Name: Logesh
  • Login: logesh-works
  • Kind: user
  • Location: Tamil Nadu, India
  • Company: @cyces-innvoation-labs

SDE

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "NeMo: a toolkit for Conversational AI and Large Language Models"
url: https://nvidia.github.io/NeMo/
repository-code: https://github.com/NVIDIA/NeMo
authors:
  - family-names: Harper
    given-names: Eric
  - family-names: Majumdar
    given-names: Somshubra
  - family-names: Kuchaiev
    given-names: Oleksii
  - family-names: Jason
    given-names: Li
  - family-names: Zhang
    given-names: Yang
  - family-names: Bakhturina
    given-names: Evelina
  - family-names: Noroozi 
    given-names: Vahid
  - family-names: Subramanian
    given-names: Sandeep
  - family-names: Nithin
    given-names: Koluguri
  - family-names: Jocelyn
    given-names: Huang
  - family-names: Jia
    given-names: Fei
  - family-names: Balam
    given-names: Jagadeesh
  - family-names: Yang
    given-names: Xuesong
  - family-names: Livne
    given-names: Micha
  - family-names: Dong
    given-names: Yi
  - family-names: Naren
    given-names: Sean
  - family-names: Ginsburg
    given-names: Boris

GitHub Events

Total
  • Create event: 2
Last Year
  • Create event: 2

Dependencies

Dockerfile docker
  • ${BASE_IMAGE} latest build
  • nemo-deps latest build
  • scratch latest build
pyproject.toml pypi
requirements/requirements.txt pypi
  • huggingface_hub >=0.20.3
  • numba *
  • numpy >=1.22
  • onnx >=1.7.0
  • python-dateutil *
  • ruamel.yaml *
  • scikit-learn *
  • setuptools >=65.5.1
  • tensorboard *
  • text-unidecode *
  • torch *
  • tqdm >=4.41.0
  • triton *
  • wget *
  • wrapt *
requirements/requirements_asr.txt pypi
  • braceexpand *
  • editdistance *
  • g2p_en *
  • ipywidgets *
  • jiwer *
  • kaldi-python-io *
  • kaldiio *
  • lhotse >=1.20.0
  • librosa >=0.10.0
  • marshmallow *
  • matplotlib *
  • packaging *
  • pyannote.core *
  • pyannote.metrics *
  • pydub *
  • pyloudnorm *
  • resampy *
  • ruamel.yaml *
  • scipy >=0.14
  • soundfile *
  • sox *
  • texterrors *
requirements/requirements_common.txt pypi
  • datasets *
  • inflect *
  • pandas *
  • sacremoses >=0.0.43
  • sentencepiece <1.0.0
requirements/requirements_docs.txt pypi
  • Jinja2 *
  • Sphinx *
  • boto3 *
  • latexcodec *
  • numpy *
  • pydata-sphinx-theme *
  • sphinx-book-theme *
  • sphinx-copybutton *
  • sphinxcontrib-bibtex *
  • sphinxext-opengraph *
  • urllib3 *
  • wrapt *
requirements/requirements_lightning.txt pypi
  • hydra-core >1.3,<=1.3.2
  • omegaconf <=2.3
  • pytorch-lightning >=2.2.1
  • torchmetrics >=0.11.0
  • transformers >=4.36.0
  • wandb *
  • webdataset >=0.2.86
requirements/requirements_multimodal.txt pypi
  • PyMCubes *
  • addict *
  • clip *
  • diffusers >=0.19.3
  • einops_exts *
  • imageio *
  • kornia *
  • nerfacc >=0.5.3
  • open_clip_torch *
  • taming-transformers *
  • torchdiffeq *
  • torchsde *
  • trimesh *
requirements/requirements_nlp.txt pypi
  • boto3 *
  • einops *
  • faiss-cpu *
  • fasttext *
  • flask_restful *
  • ftfy *
  • gdown *
  • h5py *
  • ijson *
  • jieba *
  • markdown2 *
  • matplotlib >=3.3.2
  • megatron_core ==0.2.0
  • nltk >=3.6.5
  • opencc <1.1.7
  • pangu *
  • rapidfuzz *
  • rouge_score *
  • sacrebleu *
  • sentence_transformers *
  • tensorstore <0.1.46
  • zarr *
requirements/requirements_slu.txt pypi
  • jiwer >=2.0.0
  • progress >=1.5
  • tabulate >=0.8.7
  • textdistance >=4.1.5
  • tqdm *
requirements/requirements_test.txt pypi
  • black ==19.10b0 test
  • click ==8.0.2 test
  • isort >5.1.0,<6.0.0 test
  • parameterized * test
  • pytest * test
  • pytest-runner * test
  • ruamel.yaml * test
  • sphinx * test
  • sphinxcontrib-bibtex * test
  • wandb * test
  • wget * test
  • wrapt * test
requirements/requirements_tts.txt pypi
  • attrdict *
  • einops *
  • jieba *
  • kornia *
  • librosa *
  • matplotlib *
  • nemo_text_processing *
  • nltk *
  • pandas *
  • pypinyin *
  • pypinyin-dict *
scripts/freesound_download_resample/freesound_requirements.txt pypi
  • joblib *
  • librosa *
  • requests *
  • requests_oauthlib *
  • sox *
setup.py pypi
tools/ctc_segmentation/requirements.txt pypi
  • ctc_segmentation ==1.7.1
  • num2words *
tools/nemo_forced_aligner/requirements.txt pypi
  • nemo_toolkit *
  • prettyprinter *
  • pytest *
tools/nmt_webapp/requirements.txt pypi
  • flask *
  • flask_cors *
  • nemo_toolkit >=1.0.0rc1
tools/speech_data_explorer/requirements.txt pypi
  • SoundFile *
  • dash >=2.1.0
  • dash_bootstrap_components >=1.0.3
  • diff_match_patch *
  • editdistance *
  • jiwer *
  • librosa >=0.9.1
  • numpy *
  • plotly *
  • tqdm *