Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.0%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: RuABraun
  • License: mit
  • Language: C++
  • Default Branch: master
  • Size: 2.01 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 10 months ago · Last pushed 10 months ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md

Flashlight Text: Fast, Lightweight Utilities for Text



Flashlight Text is a fast, minimal library for text-based operations. It features:
- a high-performance, unopinionated beam search decoder
- a fast tokenizer
- an efficient Dictionary abstraction
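For readers unfamiliar with the term, the sketch below illustrates the general beam search technique on a toy problem. It is not Flashlight Text's decoder and does not mirror its API: `Hypothesis`, `beamSearch`, and the `logProbs[t][v]` input layout are hypothetical names chosen for illustration. A real decoder would typically also fold in language-model scores (for example from KenLM), which this sketch omits.

```cpp
// Toy beam search over per-step token log-probabilities.
// Illustrative only: NOT Flashlight Text's decoder or its API.
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// One partial hypothesis: the tokens chosen so far and their summed log-prob.
struct Hypothesis {
  std::vector<int> tokens;
  double score = 0.0;
};

// logProbs[t][v] is the log-probability of token v at step t (assumed layout).
// Returns at most beamSize hypotheses, sorted best first.
std::vector<Hypothesis> beamSearch(
    const std::vector<std::vector<double>>& logProbs,
    std::size_t beamSize) {
  assert(beamSize > 0);
  std::vector<Hypothesis> beam{Hypothesis{}};  // start from the empty hypothesis
  for (const auto& step : logProbs) {
    std::vector<Hypothesis> candidates;
    candidates.reserve(beam.size() * step.size());
    // Extend every surviving hypothesis with every possible token.
    for (const auto& hyp : beam) {
      for (std::size_t v = 0; v < step.size(); ++v) {
        Hypothesis next = hyp;
        next.tokens.push_back(static_cast<int>(v));
        next.score += step[v];
        candidates.push_back(std::move(next));
      }
    }
    // Prune: keep only the beamSize highest-scoring candidates.
    const std::size_t keep = std::min(beamSize, candidates.size());
    std::partial_sort(
        candidates.begin(), candidates.begin() + keep, candidates.end(),
        [](const Hypothesis& a, const Hypothesis& b) {
          return a.score > b.score;
        });
    candidates.resize(keep);
    beam = std::move(candidates);
  }
  return beam;
}
```

Calling `beamSearch(logProbs, 2)` on a two-step input returns the two best token sequences with their accumulated scores. The pruning step is what bounds the work per step to `beamSize` times the vocabulary size instead of letting the hypothesis set grow exponentially.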

Quickstart

The Flashlight Text Python package containing beam search decoder and Dictionary components is available on PyPI:
```bash
pip install flashlight-text
```
To enable optional KenLM support in Python with the decoder, KenLM must be installed via pip:
```bash
pip install git+https://github.com/kpu/kenlm.git
```

See the full Python binding documentation for examples and more.

Building and Installing


Requirements

At minimum, C++ compilation requires:
- A C++ compiler with good C++17 support (e.g. gcc/g++ >= 7)
- CMake (version 3.16 or later) and make
- A Linux-based operating system

KenLM Support: If building with KenLM support, KenLM is required. To toggle KenLM support use the FL_TEXT_USE_KENLM CMake option or the USE_KENLM environment variable when building the Python bindings.

Tests: If building tests, Google Test >= 1.10 is required. The FL_TEXT_BUILD_TESTS CMake option toggles building tests.

Instructions for building/installing the Python bindings from source can be found here.

Building from Source

Building the C++ project from source is simple:
```bash
git clone https://github.com/flashlight/text && cd text
cmake -S . -B build
cmake --build build --parallel
cd build && ctest && cd .. # run tests
cmake --install build # install at the CMAKE_INSTALL_PREFIX
```
To disable KenLM while building, pass -DFL_TEXT_USE_KENLM=OFF to CMake. To disable building tests, pass -DFL_TEXT_BUILD_TESTS=OFF.

KenLM can be downloaded and installed automatically if not found on the local system. The FL_TEXT_BUILD_STANDALONE option controls this behavior: if it is disabled, missing dependencies are not downloaded or built as part of the build.

With vcpkg

Flashlight Text can also be installed and used downstream with the vcpkg package manager. The port contains an optional feature with which to build and install with KenLM support:
```bash
vcpkg install flashlight-text # no dependencies, or:
vcpkg install "flashlight-text[kenlm]" # install with KenLM
```

Adding Flashlight Text to a C++ Project

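Given a simple project.cpp file that includes and links to Flashlight Text:
```c++
#include <iostream>

#include <flashlight/lib/text/dictionary/Dictionary.h>

int main() {
  // Load a dictionary from a file and report how many entries it holds.
  fl::lib::text::Dictionary myDict("someFile.dict");
  std::cout << "Dictionary has " << myDict.entrySize() << " entries."
            << std::endl;
  return 0;
}
```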

The following CMake configuration links Flashlight and sets include directories:

```cmake
cmake_minimum_required(VERSION 3.10)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

add_executable(myProject project.cpp)

find_package(flashlight-text CONFIG REQUIRED)
target_link_libraries(myProject PRIVATE flashlight::flashlight-text)
```

To link against the library providing KenLM support, use the flashlight::flashlight-text-kenlm imported target:
```cmake
target_link_libraries(myProject PRIVATE
  flashlight::flashlight-text
  # transitively links KenLM
  flashlight::flashlight-text-kenlm
)
```
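The project.cpp example above loads a Dictionary and queries its entry count. As a rough illustration of what such an abstraction usually provides, here is a minimal, self-contained sketch of a two-way mapping between tokens and integer indices. `ToyDictionary` and its methods are hypothetical; this is not the implementation or interface of `fl::lib::text::Dictionary`.

```cpp
// Toy token <-> index dictionary.
// Illustrative only: NOT the fl::lib::text::Dictionary implementation.
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

class ToyDictionary {
 public:
  // Adds the token if it is unseen; returns its index either way.
  std::size_t add(const std::string& token) {
    auto it = index_.find(token);
    if (it != index_.end()) {
      return it->second;
    }
    const std::size_t idx = tokens_.size();
    tokens_.push_back(token);
    index_.emplace(token, idx);
    return idx;
  }

  // True if the token has been added before.
  bool contains(const std::string& token) const {
    return index_.count(token) > 0;
  }

  // Reverse lookup: index back to token. The index must be in range.
  const std::string& tokenAt(std::size_t idx) const {
    assert(idx < tokens_.size());
    return tokens_[idx];
  }

  // Number of distinct tokens stored.
  std::size_t size() const { return tokens_.size(); }

 private:
  std::vector<std::string> tokens_;                      // index -> token
  std::unordered_map<std::string, std::size_t> index_;   // token -> index
};
```

Keeping a vector for index-to-token lookups alongside a hash map for token-to-index lookups makes both directions constant time on average, at the cost of storing each token string twice.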

Contributing and Contact

Contact: jacobkahn@meta.com

Flashlight Text is actively developed. See CONTRIBUTING for more on how to help out.

Citing

You can cite Flashlight using:

@misc{kahn2022flashlight,
      title={Flashlight: Enabling Innovation in Tools for Machine Learning},
      author={Jacob Kahn and Vineel Pratap and Tatiana Likhomanenko and Qiantong Xu and Awni Hannun and Jeff Cai and Paden Tomasello and Ann Lee and Edouard Grave and Gilad Avidov and Benoit Steiner and Vitaliy Liptchinsky and Gabriel Synnaeve and Ronan Collobert},
      year={2022},
      eprint={2201.12465},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

License

Flashlight Text is under an MIT license. See LICENSE for more information.

Owner

  • Name: Rudolf A. Braun
  • Login: RuABraun
  • Kind: user
  • Location: Lausanne

Citation (CITATION)

@misc{kahn2022flashlight,
      title={Flashlight: Enabling Innovation in Tools for Machine Learning},
      author={Jacob Kahn and Vineel Pratap and Tatiana Likhomanenko and Qiantong Xu and Awni Hannun and Jeff Cai and Paden Tomasello and Ann Lee and Edouard Grave and Gilad Avidov and Benoit Steiner and Vitaliy Liptchinsky and Gabriel Synnaeve and Ronan Collobert},
      year={2022},
      eprint={2201.12465},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

GitHub Events

Total
  • Public event: 1
  • Push event: 2
Last Year
  • Public event: 1
  • Push event: 2

Dependencies

.github/workflows/wheels.yml actions
  • actions/checkout v4 composite
  • actions/download-artifact v4 composite
  • actions/setup-python v4 composite
  • actions/upload-artifact v4 composite
  • pypa/gh-action-pypi-publish v1.8.12 composite
bindings/python/test/requirements.txt pypi
  • numpy * test
setup.py pypi