https://github.com/coqui-ai/coqpit

Simple but maybe too simple config management through python data classes. We use it for machine learning.

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (13.9%) to scientific vocabulary

Keywords

config-management dataclasses json machine-learning python python-data serialization typing yaml

Keywords from Contributors

speaker-encoder glow-tts hifigan melgan multi-speaker-tts speech speech-synthesis tacotron text-to-speech tts

Last synced: 5 months ago

Repository


Basic Info

Host: GitHub
Owner: coqui-ai
License: mit
Language: Python
Default Branch: main
Homepage:
Size: 7.64 MB

Statistics

Stars: 107
Watchers: 6
Forks: 37
Open Issues: 11
Releases: 12

Topics

config-management dataclasses json machine-learning python python-data serialization typing yaml

Created almost 5 years ago · Last pushed almost 3 years ago

Metadata Files

Readme License

👩‍✈️ Coqpit

Simple, lightweight, dependency-free config handling through Python data classes, with serialization to and deserialization from JSON.

Currently it is being used by 🐸TTS.

❔ Why I need this

What I need from an ML configuration library...

- Fixing a general config schema in Python to guide users about expected values.

  Python is good but not universal. Sometimes you train an ML model and use it on a different platform, so you need your model configuration file to be importable by other programming languages.

- Simple dynamic value and type checking with default values.

  If you are a beginner in an ML project, it is hard to guess the right values for your ML experiment. Therefore it is important to have some default values and to know what range and type of input are expected for each field.

- Ability to decompose large configs.

  As you define more fields for the training dataset, data preprocessing, model parameters, etc., your config file tends to get quite large, but in most cases it can be decomposed, which improves flexibility and readability.

- Inheritance and nested configurations.

  These help to keep configurations consistent and easier to maintain.

- Ability to override values from the command line when necessary.

  For instance, you might need to define a path for your dataset, and this changes for almost every run. The user should then be able to override this value easily from the command line.

  It also allows easy hyper-parameter search without changing your original code. Basically, you can run different models with different parameters just by using command line arguments.

- Defining dynamic or conditional config values.

  Sometimes you need to define certain values depending on other values. Using Python helps to define the underlying logic for such config values.

- No dependencies.

  You don't want to install a ton of libraries just for configuration management. If you install one, it is better for it to be just native Python.
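
The points above about default values, expected ranges, and conditional values can be illustrated without any library. What follows is a minimal sketch that uses only the standard `dataclasses` and `json` modules; it is not Coqpit's implementation, and `TrainConfig`, its fields, and the range limits are hypothetical names chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TrainConfig:
    """Hypothetical training config; names and limits are illustrative only."""

    batch_size: int = 32  # a default guides beginners toward a sane value
    learning_rate: float = 1e-3
    use_warmup: bool = True
    warmup_steps: Optional[int] = None  # derived below when left unset

    def __post_init__(self) -> None:
        # Type and range checks run at construction time.
        if not isinstance(self.batch_size, int):
            raise TypeError("batch_size must be an int")
        if not 1 <= self.batch_size <= 4096:
            raise ValueError("batch_size must be in [1, 4096]")
        # Conditional value: it depends on other fields.
        if self.warmup_steps is None:
            self.warmup_steps = 100 * self.batch_size if self.use_warmup else 0


config = TrainConfig(batch_size=64)
# JSON keeps the config readable from other programming languages.
as_json = json.dumps(asdict(config), indent=2)
restored = TrainConfig(**json.loads(as_json))
```

Because the logic lives in ordinary Python, a derived field such as `warmup_steps` stays consistent with the fields it depends on, while the JSON export remains a flat, language-neutral file.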

🚫 Limitations

- `Union` type dataclass fields cannot be parsed from console arguments due to type ambiguity.
- JSON is the only supported serialization format, although others can be easily integrated.
- `List` types with multiple item type annotations are not supported (e.g. `List[int, str]`).
- `dict` fields are parsed from console arguments as a JSON str without type checking (e.g. `--val_dict '{"a":10, "b":100}'`).
- `MISSING` fields cannot be avoided when parsing console arguments.
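
The `Union` and `dict` limitations come from the console itself, which delivers plain text only. A small sketch with the standard `argparse` module shows the effect; it does not reproduce Coqpit's parser, and the `--val_union` flag is a hypothetical name used only for illustration.

```python
import argparse
import json

parser = argparse.ArgumentParser()
# A dict-valued flag is read as a JSON string; json.loads checks the syntax
# but not the types of the values inside the dict.
parser.add_argument("--val_dict", type=json.loads, default={})
# Hypothetical flag for a field annotated Union[int, str]: the console only
# delivers text, so a parser cannot tell whether "10" means 10 or "10".
parser.add_argument("--val_union", type=str, default=None)

args = parser.parse_args(
    ["--val_dict", '{"a": 10, "b": "100"}', "--val_union", "10"]
)
print(args.val_dict, repr(args.val_union))
```

Here `"b"` arrives as a string even if the config author expected an integer, and `--val_union` stays a string because nothing on the command line says otherwise.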

🔍 Examples

👉 Simple Coqpit

```python
import os
from dataclasses import asdict, dataclass, field
from typing import List, Union

from coqpit import MISSING, Coqpit, check_argument


@dataclass
class SimpleConfig(Coqpit):
    val_a: int = 10
    val_b: int = None
    val_d: float = 10.21
    val_c: str = "Coqpit is great!"
    # mandatory field
    # raises an error when the value is accessed before it has been changed
    val_k: int = MISSING
    # optional field
    val_dict: dict = field(default_factory=lambda: {"val_aa": 10, "val_ss": "This is in a dict."})
    # list of list
    val_listoflist: List[List] = field(default_factory=lambda: [[1, 2], [3, 4]])
    val_listofunion: List[List[Union[str, int]]] = field(default_factory=lambda: [[1, 3], [1, "Hi!"]])

    def check_values(
        self,
    ):  # you can define explicit constraints on the fields using `check_argument()`
        """Check config fields"""
        c = asdict(self)
        check_argument("val_a", c, restricted=True, min_val=10, max_val=2056)
        check_argument("val_b", c, restricted=True, min_val=128, max_val=4058, allow_none=True)
        check_argument("val_c", c, restricted=True)


if __name__ == "__main__":
    file_path = os.path.dirname(os.path.abspath(__file__))
    config = SimpleConfig()

    # try MISSING class argument
    try:
        k = config.val_k
    except AttributeError:
        print(" val_k needs a different value before accessing it.")
    config.val_k = 1000

    # try serialization and deserialization
    print(config.serialize())
    print(config.to_json())
    config.save_json(os.path.join(file_path, "example_config.json"))
    config.load_json(os.path.join(file_path, "example_config.json"))
    print(config.pprint())

    # try `dict` interface
    print(*config)
    print(dict(**config))

    # value assignment by mapping
    config["val_a"] = -999
    print(config["val_a"])
    assert config.val_a == -999
```

👉 Serialization

```python
import os
from dataclasses import asdict, dataclass, field
from typing import List, Union

from coqpit import Coqpit, check_argument


@dataclass
class SimpleConfig(Coqpit):
    val_a: int = 10
    val_b: int = None
    val_c: str = "Coqpit is great!"

    def check_values(self,):
        '''Check config fields'''
        c = asdict(self)
        check_argument('val_a', c, restricted=True, min_val=10, max_val=2056)
        check_argument('val_b', c, restricted=True, min_val=128, max_val=4058, allow_none=True)
        check_argument('val_c', c, restricted=True)


@dataclass
class NestedConfig(Coqpit):
    val_d: int = 10
    val_e: int = None
    val_f: str = "Coqpit is great!"
    sc_list: List[SimpleConfig] = None
    # default_factory is used here because dataclasses on Python 3.11+ reject
    # an unhashable instance such as SimpleConfig() as a plain default
    sc: SimpleConfig = field(default_factory=SimpleConfig)
    union_var: Union[List[SimpleConfig], SimpleConfig] = field(default_factory=lambda: [SimpleConfig(), SimpleConfig()])

    def check_values(self,):
        '''Check config fields'''
        c = asdict(self)
        check_argument('val_d', c, restricted=True, min_val=10, max_val=2056)
        check_argument('val_e', c, restricted=True, min_val=128, max_val=4058, allow_none=True)
        check_argument('val_f', c, restricted=True)
        check_argument('sc_list', c, restricted=True, allow_none=True)
        check_argument('sc', c, restricted=True, allow_none=True)


if __name__ == '__main__':
    file_path = os.path.dirname(os.path.abspath(__file__))
    # init 🐸 dataclass
    config = NestedConfig()

    # save to a json file
    config.save_json(os.path.join(file_path, 'example_config.json'))
    # create a config with values that differ from the defaults
    config2 = NestedConfig(val_d=None, val_e=500, val_f=None, sc_list=None, sc=None, union_var=None)
    # update the config with the json file.
    config2.load_json(os.path.join(file_path, 'example_config.json'))
    # now they should have the same values.
    assert config == config2

    # pretty print the dataclass
    print(config.pprint())

    # export values to a dict
    config_dict = config.to_dict()
    # create a new config with different values than the defaults
    config2 = NestedConfig(val_d=None, val_e=500, val_f=None, sc_list=None, sc=None, union_var=None)
    # update the config with the exported values from the previous config.
    config2.from_dict(config_dict)
    # now they should have the same values.
    assert config == config2
```
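
To see why nested configs need explicit handling when they are read back, here is a rough standard-library sketch of a JSON round trip. It is not how Coqpit implements `save_json` or `load_json`; the `Inner` and `Outer` classes and the `rebuild` helper are hypothetical and exist only for this illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Inner:
    val_a: int = 10
    val_c: str = "Coqpit is great!"


@dataclass
class Outer:
    val_d: int = 10
    sc: Inner = field(default_factory=Inner)
    sc_list: List[Inner] = field(default_factory=list)

    @classmethod
    def rebuild(cls, data: dict) -> "Outer":
        # asdict() turns nested dataclasses into plain dicts, so the nested
        # objects have to be reconstructed by hand when loading.
        return cls(
            val_d=data["val_d"],
            sc=Inner(**data["sc"]),
            sc_list=[Inner(**item) for item in data["sc_list"]],
        )


original = Outer(val_d=42, sc_list=[Inner(val_a=1), Inner(val_a=2)])
text = json.dumps(asdict(original))
restored = Outer.rebuild(json.loads(text))
```

With plain dataclasses the reconstruction step must be written for every nested class, which is the kind of bookkeeping a config library can derive from the type annotations instead.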

👉 `argparse` handling and parsing.

```python
import argparse
import os
import sys
from dataclasses import asdict, dataclass, field
from typing import List

from coqpit import Coqpit, check_argument


@dataclass
class SimplerConfig(Coqpit):
    val_a: int = field(default=None, metadata={'help': 'this is val_a'})


@dataclass
class SimpleConfig(Coqpit):
    val_req: str  # required field
    val_a: int = field(default=10, metadata={'help': 'this is val_a of SimpleConfig'})
    val_b: int = field(default=None, metadata={'help': 'this is val_b'})
    # default_factory is used here because dataclasses on Python 3.11+ reject
    # an unhashable instance such as SimplerConfig() as a plain default
    nested_config: SimplerConfig = field(default_factory=SimplerConfig)
    mylist_with_default: List[SimplerConfig] = field(
        default_factory=lambda: [SimplerConfig(val_a=100), SimplerConfig(val_a=999)],
        metadata={'help': 'list of SimplerConfig'})

    # mylist_without_default: List[SimplerConfig] = field(default=None, metadata={'help': 'list of SimplerConfig'})  # NOT SUPPORTED YET!

    def check_values(self, ):
        '''Check config fields'''
        c = asdict(self)
        check_argument('val_a', c, restricted=True, min_val=10, max_val=2056)
        check_argument('val_b',
                       c,
                       restricted=True,
                       min_val=128,
                       max_val=4058,
                       allow_none=True)
        check_argument('val_req', c, restricted=True)


def main():
    # reference config that we would like to match with the one parsed from argparse
    config_ref = SimpleConfig(val_req='this is different',
                              val_a=222,
                              val_b=999,
                              nested_config=SimplerConfig(val_a=333),
                              mylist_with_default=[
                                  SimplerConfig(val_a=222),
                                  SimplerConfig(val_a=111)
                              ])

    # create new config object from CLI inputs
    parsed = SimpleConfig.init_from_argparse()
    parsed.pprint()

    # check the parsed config with the reference config
    assert parsed == config_ref


if __name__ == '__main__':
    sys.argv.extend(['--coqpit.val_req', 'this is different'])
    sys.argv.extend(['--coqpit.val_a', '222'])
    sys.argv.extend(['--coqpit.val_b', '999'])
    sys.argv.extend(['--coqpit.nested_config.val_a', '333'])
    sys.argv.extend(['--coqpit.mylist_with_default.0.val_a', '222'])
    sys.argv.extend(['--coqpit.mylist_with_default.1.val_a', '111'])
    main()
```
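
The dotted flags in the example above address nested fields and list items by position. As a rough illustration of that addressing scheme, and not of Coqpit's actual parser, the sketch below walks a dotted key through plain dataclasses; `Leaf`, `Root`, and `apply_override` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Leaf:
    val_a: int = 0


@dataclass
class Root:
    val_a: int = 10
    nested_config: Leaf = field(default_factory=Leaf)
    mylist: List[Leaf] = field(default_factory=lambda: [Leaf(100), Leaf(999)])


def apply_override(config: Any, dotted_key: str, raw_value: str) -> None:
    """Hypothetical helper: set `a.b.0.c`-style keys on nested dataclasses."""
    *parents, leaf = dotted_key.split(".")
    target = config
    for part in parents:
        # Digits index into lists; anything else is an attribute name.
        target = target[int(part)] if part.isdigit() else getattr(target, part)
    # Cast the console string using the current value's type (simplistic:
    # this breaks for fields whose current value is None).
    current = getattr(target, leaf)
    setattr(target, leaf, type(current)(raw_value))


root = Root()
overrides = [("val_a", "222"), ("nested_config.val_a", "333"), ("mylist.1.val_a", "111")]
for key, value in overrides:
    apply_override(root, key, value)
```

Casting from the current value only works when a non-`None` default exists, which hints at why a library would rely on the declared field types rather than on runtime values.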

🤸‍♀️ Merging coqpits

```python
import os
from dataclasses import dataclass

from coqpit import Coqpit, check_argument


@dataclass
class CoqpitA(Coqpit):
    val_a: int = 10
    val_b: int = None
    val_d: float = 10.21
    val_c: str = "Coqpit is great!"


@dataclass
class CoqpitB(Coqpit):
    val_d: int = 25
    val_e: int = 257
    val_f: float = -10.21
    val_g: str = "Coqpit is really great!"


if __name__ == '__main__':
    file_path = os.path.dirname(os.path.abspath(__file__))
    coqpita = CoqpitA()
    coqpitb = CoqpitB()
    coqpitb.merge(coqpita)
    print(coqpitb.val_a)
    print(coqpitb.pprint())
```

Development

Install the pre-commit hook to automatically check your commits for style and hinting issues:

```bash
$ python .pre-commit-2.12.1.pyz install
```

Owner

Name: coqui
Login: coqui-ai
Kind: organization
Email: info@coqui.ai

Website: https://coqui.ai
Twitter: coqui_ai
Repositories: 17
Profile: https://github.com/coqui-ai

Coqui, a startup providing open speech tech for everyone 🐸

GitHub Events

Total

Watch event: 9
Fork event: 5

Last Year

Watch event: 9
Fork event: 5

Committers

Last synced: 9 months ago

All Time

Total Commits: 147
Total Committers: 5
Avg Commits per committer: 29.4
Development Distribution Score (DDS): 0.327

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Eren Gölge	e**e@c**i	99
Reuben Morais	r**s@g**m	40
Edresson Casanova	e**1@g**m	4
Agrin Hilmkil	a**l@s**m	3
WeberJulian	j**r@h**r	1

Committer Domains (Top 20 + Academic)

storytel.com: 1 coqui.ai: 1

Issues and Pull Requests

Last synced: 7 months ago

All Time

Total issues: 11
Total pull requests: 28
Average time to close issues: 1 day
Average time to close pull requests: 29 days
Total issue authors: 6
Total pull request authors: 6
Average comments per issue: 1.36
Average comments per pull request: 1.18
Merged pull requests: 23
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

reuben (3)
erogol (3)
WeberJulian (2)
georgettica (1)
mweinelt (1)
mosheman5 (1)

Pull Request Authors

erogol (14)
reuben (9)
agrinh (2)
Edresson (1)
WeberJulian (1)
kdavis-coqui (1)

Top Labels

Issue Labels

feature request (2)

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- pypi 177,691 last-month
Total docker downloads: 797

Total dependent packages: 8
Total dependent repositories: 38
Total versions: 18
Total maintainers: 2

pypi.org: coqpit

Simple (maybe too simple), light-weight config management through python data-classes.

Homepage: https://github.com/erogol/coqpit
Documentation: https://coqpit.readthedocs.io/
License: mit
Latest release: 0.0.17
published about 3 years ago

Versions: 18
Dependent Packages: 8
Dependent Repositories: 38
Downloads: 177,691 Last month
Docker Downloads: 797

Rankings

Downloads: 1.3%

Dependent packages count: 1.8%

Docker downloads count: 2.3%

Dependent repos count: 2.4%

Average: 4.1%

Stargazers count: 7.7%

Forks count: 8.9%

Maintainers (2)

coqui erogol

Last synced: 6 months ago

Dependencies

requirements.txt pypi

dataclasses *

requirements_dev.txt pypi

black * development
coverage * development
pylint * development
pytest * development

.github/workflows/codeql-analysis.yml actions

actions/checkout v2 composite
github/codeql-action/analyze v1 composite
github/codeql-action/autobuild v1 composite
github/codeql-action/init v1 composite

.github/workflows/main.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite

.github/workflows/pypi-release.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite

pyproject.toml pypi

setup.py pypi
