gpt

Generative Pre-trained Transformer in PyTorch from scratch

https://github.com/jmaczan/gpt

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.0%) to scientific vocabulary

Keywords

attention deep-learning from-scratch gpt machine-learning pytorch transformer
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 3
  • Watchers: 2
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Topics
attention deep-learning from-scratch gpt machine-learning pytorch transformer
Created over 1 year ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation

README.md

gpt

💜 See also the article on building a minimalistic GPT, "GPT in PyTorch", in Paged Out! issue #5, page 6.

Generative Pre-trained Transformer in PyTorch from scratch

Train

CLI

    python src/train.py

Options:

    --batch_size 64 --num-epochs 100 --lr 0.0001 --from-checkpoint checkpoint_path.pth
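As a rough illustration of how the options above could be parsed, here is a minimal `argparse` sketch. It is not taken from `src/train.py`: the function name `build_parser` is hypothetical, the flag spellings are copied from the README, and the defaults are assumptions based on the example values shown.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags listed in the README."""
    parser = argparse.ArgumentParser(description="Train the GPT model")
    # Flag spellings follow the README; defaults are assumed from its example values.
    parser.add_argument("--batch_size", type=int, default=64)
    parser.add_argument("--num-epochs", type=int, default=100)
    parser.add_argument("--lr", type=float, default=1e-4)
    parser.add_argument("--from-checkpoint", type=str, default=None)
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--num-epochs", "5", "--lr", "0.001"])
print(args.batch_size, args.num_epochs, args.lr, args.from_checkpoint)
```

Note that `argparse` turns dashes in option names into underscores, so `--num-epochs` is read back as `args.num_epochs`.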

The model is checkpointed after each epoch, and the checkpoints are stored in the checkpoints/ directory.
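Since a checkpoint is written after every epoch, it can be handy to pick the newest one to pass to `--from-checkpoint`. The sketch below is one possible way to do that. The helper name `latest_checkpoint` is hypothetical, and it assumes the checkpoints use the `.pth` extension shown in the examples. The README does not document the file naming scheme, so the helper compares modification times rather than names.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(directory: str = "checkpoints") -> Optional[Path]:
    """Return the most recently modified .pth file in `directory`, or None."""
    folder = Path(directory)
    if not folder.is_dir():
        return None
    candidates = list(folder.glob("*.pth"))
    # Modification time is used because the file naming scheme is not documented.
    return max(candidates, key=lambda path: path.stat().st_mtime, default=None)


checkpoint = latest_checkpoint()
if checkpoint is None:
    print("No checkpoint found; training would start from scratch.")
else:
    print(f"python src/train.py --from-checkpoint {checkpoint}")
```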

Code

```py
from train import train

train()
```

Run

CLI

    python src/run.py --from-checkpoint checkpoint_path.pth

Code

```py
from run import run

run(model_path="checkpoint_path.pth", prompt="Rick:\nMorty, where are you?")
```

Cite

If you use this software in your research, please use the following citation:

    @misc{Maczan_GPT_2024,
      title = "Generative Pre-trained Transformer in PyTorch",
      author = "{Maczan, Jędrzej Paweł}",
      howpublished = "\url{https://github.com/jmaczan/gpt}",
      year = 2024,
      publisher = {GitHub}
    }

License

GPL v3

Author

Jędrzej Maczan, 2024

Owner

  • Name: Jędrzej Maczan
  • Login: jmaczan
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software in your research, please cite it as below."
authors:
- family-names: "Maczan"
  given-names: "Jędrzej Paweł"
  orcid: "https://orcid.org/0000-0003-1741-6064"
title: "Generative Pre-trained Transformer in PyTorch"
date-released: 2024-05-05
url: "https://github.com/jmaczan/gpt"

GitHub Events

Total
  • Watch event: 3
  • Issue comment event: 1
  • Push event: 2
  • Pull request event: 2
Last Year
  • Watch event: 3
  • Issue comment event: 1
  • Push event: 2
  • Pull request event: 2

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 54
  • Total Committers: 2
  • Avg Commits per committer: 27.0
  • Development Distribution Score (DDS): 0.148
Past Year
  • Commits: 42
  • Committers: 2
  • Avg Commits per committer: 21.0
  • Development Distribution Score (DDS): 0.19
Top Committers
| Name | Email | Commits |
| --- | --- | --- |
| jmaczan | j****l@m****l | 46 |
| ms | xx@x****x | 8 |
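The all-time figures above can be reproduced from the commit counts in the table. The sketch below assumes the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is an assumption, although it matches the listed all-time value of 0.148. The function name is illustrative.

```python
def development_distribution_score(commits_per_author: dict[str, int]) -> float:
    """Assumed definition: 1 - (commits by the top committer / total commits)."""
    total = sum(commits_per_author.values())
    if total == 0:
        return 0.0  # no commits, so there is nothing to distribute
    return 1 - max(commits_per_author.values()) / total


# Commit counts taken from the Top Committers table.
commits = {"jmaczan": 46, "ms": 8}

print(round(development_distribution_score(commits), 3))  # 0.148
print(sum(commits.values()) / len(commits))  # 27.0 average commits per committer
```

A score near 0 means a single committer wrote almost everything; with two equally active committers it would be 0.5.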
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: about 11 hours
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 1.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: about 11 hours
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 1.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • michuhu (2)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

requirements.txt pypi
  • aiohttp =3.9.5=py310hd125d64_0
  • aiosignal =1.3.1=pyhd8ed1ab_0
  • appnope =0.1.4=pyhd8ed1ab_0
  • asttokens =2.4.1=pyhd8ed1ab_0
  • async-timeout =4.0.3=pyhd8ed1ab_0
  • attrs =23.2.0=pyh71513ae_0
  • aws-c-auth =0.7.22=h27bc0eb_5
  • aws-c-cal =0.6.15=h5db4892_0
  • aws-c-common =0.9.19=h99b78c6_0
  • aws-c-compression =0.2.18=h5db4892_6
  • aws-c-event-stream =0.4.2=h4de9e5c_13
  • aws-c-http =0.8.2=h2c662d3_2
  • aws-c-io =0.14.9=h8709d7d_2
  • aws-c-mqtt =0.10.4=h5fc5ab5_6
  • aws-c-s3 =0.5.10=h48f01f6_3
  • aws-c-sdkutils =0.1.16=h5db4892_2
  • aws-checksums =0.1.18=h5db4892_6
  • aws-crt-cpp =0.26.12=h9d69022_0
  • aws-sdk-cpp =1.11.329=h6bd5272_5
  • brotli-python =1.1.0=py310h1253130_1
  • bzip2 =1.0.8=h93a5062_5
  • c-ares =1.28.1=h93a5062_0
  • ca-certificates =2024.6.2=hf0a4a13_0
  • certifi =2024.6.2=pyhd8ed1ab_0
  • charset-normalizer =3.3.2=pyhd8ed1ab_0
  • colorama =0.4.6=pyhd8ed1ab_0
  • comm =0.2.2=pyhd8ed1ab_0
  • datasets =2.20.0=pyhd8ed1ab_0
  • debugpy =1.8.1=py310h692a8b6_0
  • decorator =5.1.1=pyhd8ed1ab_0
  • dill =0.3.8=pyhd8ed1ab_0
  • exceptiongroup =1.2.0=pyhd8ed1ab_2
  • executing =2.0.1=pyhd8ed1ab_0
  • filelock =3.14.0=pyhd8ed1ab_0
  • freetype =2.12.1=hadb7bae_2
  • frozenlist =1.4.1=py310hd125d64_0
  • fsspec =2024.5.0=pyhff2d567_0
  • gflags =2.2.2=hc88da5d_1004
  • glog =0.7.1=heb240a5_0
  • gmp =6.3.0=hebf3989_1
  • gmpy2 =2.1.5=py310h3bc658a_1
  • huggingface_hub =0.23.4=pyhd8ed1ab_0
  • idna =3.7=pyhd8ed1ab_0
  • importlib-metadata =7.1.0=pyha770c72_0
  • importlib_metadata =7.1.0=hd8ed1ab_0
  • ipykernel =6.29.4=pyh57ce528_0
  • ipython =8.25.0=pyh707e725_0
  • jedi =0.19.1=pyhd8ed1ab_0
  • jinja2 =3.1.4=pyhd8ed1ab_0
  • jupyter_client =8.6.2=pyhd8ed1ab_0
  • jupyter_core =5.7.2=py310hbe9552e_0
  • krb5 =1.21.2=h92f50d5_0
  • lcms2 =2.16=ha0e7c42_0
  • lerc =4.0.0=h9a09cb3_0
  • libabseil =20240116.2=cxx17_hebf3989_0
  • libarrow =16.1.0=h431211a_9_cpu
  • libarrow-acero =16.1.0=h00cdb27_9_cpu
  • libarrow-dataset =16.1.0=h00cdb27_9_cpu
  • libarrow-substrait =16.1.0=hc68f6b8_9_cpu
  • libblas =3.9.0=22_osxarm64_openblas
  • libbrotlicommon =1.1.0=hb547adb_1
  • libbrotlidec =1.1.0=hb547adb_1
  • libbrotlienc =1.1.0=hb547adb_1
  • libcblas =3.9.0=22_osxarm64_openblas
  • libcrc32c =1.1.2=hbdafb3b_0
  • libcurl =8.8.0=h7b6f9a7_0
  • libcxx =17.0.6=h5f092b4_0
  • libdeflate =1.20=h93a5062_0
  • libedit =3.1.20191231=hc8eb9b7_2
  • libev =4.33=h93a5062_2
  • libevent =2.1.12=h2757513_1
  • libffi =3.4.2=h3422bc3_5
  • libgfortran =5.0.0=13_2_0_hd922786_3
  • libgfortran5 =13.2.0=hf226fd6_3
  • libgoogle-cloud =2.25.0=hfe08963_0
  • libgoogle-cloud-storage =2.25.0=h3fa5b87_0
  • libgrpc =1.62.2=h9c18a4f_0
  • libjpeg-turbo =3.0.0=hb547adb_1
  • liblapack =3.9.0=22_osxarm64_openblas
  • libnghttp2 =1.58.0=ha4dd798_1
  • libopenblas =0.3.27=openmp_h6c19121_0
  • libparquet =16.1.0=hcf52c46_9_cpu
  • libpng =1.6.43=h091b4b1_0
  • libprotobuf =4.25.3=hbfab5d5_0
  • libre2-11 =2023.09.01=h7b2c953_2
  • libsodium =1.0.18=h27ca646_1
  • libsqlite =3.45.3=h091b4b1_0
  • libssh2 =1.11.0=h7a5bd25_0
  • libthrift =0.19.0=h026a170_1
  • libtiff =4.6.0=h07db509_3
  • libtorch =2.3.0=cpu_generic_hf1facdc_1
  • libutf8proc =2.8.0=h1a8c8d9_0
  • libuv =1.48.0=h93a5062_0
  • libwebp-base =1.4.0=h93a5062_0
  • libxcb =1.15=hf346824_0
  • libzlib =1.3.1=hfb2fe0b_1
  • llvm-openmp =18.1.6=hde57baf_0
  • lz4-c =1.9.4=hb7217d7_0
  • markupsafe =2.1.5=py310hd125d64_0
  • matplotlib-inline =0.1.7=pyhd8ed1ab_0
  • mpc =1.3.1=h91ba8db_0
  • mpfr =4.2.1=h41d338b_1
  • mpmath =1.3.0=pyhd8ed1ab_0
  • multidict =6.0.5=py310h8e9501a_0
  • multiprocess =0.70.16=py310hd125d64_0
  • ncurses =6.5=hb89a1cb_0
  • nest-asyncio =1.6.0=pyhd8ed1ab_0
  • networkx =3.3=pyhd8ed1ab_1
  • nomkl =1.0=h5ca1d4c_0
  • numpy =1.26.4=py310hd45542a_0
  • openjpeg =2.5.2=h9f1df11_0
  • openssl =3.3.1=hfb2fe0b_0
  • orc =2.0.1=h47ade37_1
  • packaging =24.0=pyhd8ed1ab_0
  • pandas =2.2.2=py310h2216879_1
  • parso =0.8.4=pyhd8ed1ab_0
  • pexpect =4.9.0=pyhd8ed1ab_0
  • pickleshare =0.7.5=py_1003
  • pillow =10.3.0=py310h81a8c2e_0
  • pip =24.0=pyhd8ed1ab_0
  • platformdirs =4.2.2=pyhd8ed1ab_0
  • prompt-toolkit =3.0.46=pyha770c72_0
  • psutil =5.9.8=py310hd125d64_0
  • pthread-stubs =0.4=h27ca646_1001
  • ptyprocess =0.7.0=pyhd3deb0d_0
  • pure_eval =0.2.2=pyhd8ed1ab_0
  • pyarrow =16.1.0=py310h24597f5_3
  • pyarrow-core =16.1.0=py310h2e300fa_3_cpu
  • pyarrow-hotfix =0.6=pyhd8ed1ab_0
  • pygments =2.18.0=pyhd8ed1ab_0
  • pysocks =1.7.1=pyha2e5f31_6
  • python =3.10.14=h2469fbe_0_cpython
  • python-dateutil =2.9.0=pyhd8ed1ab_0
  • python-tzdata =2024.1=pyhd8ed1ab_0
  • python-xxhash =3.4.1=py310h2aa6e3c_0
  • python_abi =3.10=4_cp310
  • pytorch =2.3.0=cpu_generic_py310hb190f2a_1
  • pytz =2024.1=pyhd8ed1ab_0
  • pyyaml =6.0.1=py310h2aa6e3c_1
  • pyzmq =26.0.3=py310h16e08c9_0
  • re2 =2023.09.01=h4cba328_2
  • readline =8.2=h92ec313_1
  • regex =2024.5.15=py310ha6dd24b_0
  • requests =2.32.3=pyhd8ed1ab_0
  • safetensors =0.4.3=py310hd442715_0
  • setuptools =70.0.0=pyhd8ed1ab_0
  • six =1.16.0=pyh6c4a22f_0
  • sleef =3.5.1=h156473d_2
  • snappy =1.2.0=hd04f947_1
  • stack_data =0.6.2=pyhd8ed1ab_0
  • sympy =1.12=pypyh9d50eac_103
  • tk =8.6.13=h5083fa2_1
  • tokenizers =0.19.1=py310hb145503_0
  • torchvision =0.18.0=cpu_py310h047b7c9_0
  • tornado =6.4.1=py310ha6dd24b_0
  • tqdm =4.66.4=pyhd8ed1ab_0
  • traitlets =5.14.3=pyhd8ed1ab_0
  • transformers =4.41.2=pyhd8ed1ab_0
  • typing-extensions =4.12.2=hd8ed1ab_0
  • typing_extensions =4.12.2=pyha770c72_0
  • tzdata =2024a=h0c530f3_0
  • urllib3 =2.2.1=pyhd8ed1ab_0
  • wcwidth =0.2.13=pyhd8ed1ab_0
  • wheel =0.43.0=pyhd8ed1ab_1
  • xorg-libxau =1.0.11=hb547adb_0
  • xorg-libxdmcp =1.1.3=h27ca646_0
  • xxhash =0.8.2=hb547adb_0
  • xz =5.2.6=h57fd34a_0
  • yaml =0.2.5=h3422bc3_2
  • yarl =1.9.4=py310hd125d64_0
  • zeromq =4.3.5=hcc0f68c_4
  • zipp =3.17.0=pyhd8ed1ab_0
  • zstd =1.5.6=hb46c0d2_0
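The entries above have a `name =version=build` shape, which looks like a conda-style export rather than pip's `name==version` syntax. As a sketch under that assumption, the hypothetical helpers below split one entry into its parts and rewrite it as a pip-style pin. Many of the listed packages (for example `bzip2` or `libcurl`) are native libraries, so a converted pin would not necessarily resolve on PyPI.

```python
def parse_conda_pin(line: str) -> tuple[str, str, str]:
    """Split 'name =version=build' into (name, version, build).

    Raises ValueError if the line has fewer than three '='-separated parts.
    """
    name, version, build = (part.strip() for part in line.split("=", 2))
    return name, version, build


def to_pip_pin(line: str) -> str:
    """Rewrite a conda-style entry as a pip-style 'name==version' pin."""
    name, version, _build = parse_conda_pin(line)
    return f"{name}=={version}"


print(parse_conda_pin("pytorch =2.3.0=cpu_generic_py310hb190f2a_1"))
print(to_pip_pin("numpy =1.26.4=py310hd45542a_0"))
```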
environment.yaml pypi