https://github.com/camfort/camfort-ai
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
○codemeta.json file
-
○.zenodo.json file
-
○DOI references
-
○Academic publication links
-
✓Committers with academic emails
1 of 1 committers (100.0%) from academic institutions -
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (15.1%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: camfort
- Language: Python
- Default Branch: master
- Size: 26.4 KB
Statistics
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Experiments with OpenAI, CamFort and Fortran source code
Installation requirements
pip install -r requirements.txt
Docker alternative
./build.shgenerates imageopenai:python3.10locally fromDockerfile.openai-python./run.sh [...]runs command[...]within the Docker container.
Getting an OpenAI API key
Sign up at https://openai.com/api/. You should automatically receive US$18 in credit valid for 3 months (as of this writing), which should be sufficient for testing. You can find your API key here: https://beta.openai.com/account/api-keys.
The key is a long string beginning with sk-. Put the OpenAI API key
into the env var OPENAPI_API_KEY or provide it using the --api-key
option to scripts.
Function / subroutine search
Compiling a database of vectors
Script get_vectors.py can be used to form a (sqlite3) database
keeping information about functions and subroutines in Fortran files
for later look-up.
It expects JSON describing function/subroutine locations in a certain
format either on stdin or given in a file via option -f. Each line
in the file looks like this:
{"name": "function_name", "path": "path/to/fortran/file.f90", "firstLine": 10, "lastLine": 25}
Example runs
fortran-src --dump-funs-and-subs src/ | ./run.sh python get_vectors.py --api-key "sk-..."fortran-src --dump-funs-and-subs file.f90 | ./run.sh python get_vectors.py --database mydatabase.db --api-key "sk-..."
Searching the database
Script search_vectors.py can be used to query the (sqlite3) database
using a general free-text search query to find the functions or
subroutines that most closely match the description.
Example runs
./run.sh python search_vectors.py --api-key "sk-..." --database mydatabase.db calculate wet-bulb temperatures./run.sh python search_vectors.py -n 10 precipitation in clouds
OpenAI transcripts and finetuning
This section discusses the code and examples found in the
openai-transcripts subdirectory.
Interactive OpenAI
OpenAI can be used interactively https://beta.openai.com/playground. Typical settings include:
- model: text-davinci-003 (most powerful) or code-davinci-002 (programming oriented).
- temperature: 0, for the most deterministic output
- maximum length: adjust as needed for longer output sequences (256 is not a bad start)
- stop sequence: I often create one to stop the AI from going on and on. I use either
!=endor""", lately.
File units-transcript1.txt contains a sample transcript from a
session of teaching OpenAI about units. In between text explanations
are series of examples. Usually, the final example in each series is a
result of prompting the AI to fill in the 'output' automatically. Then
after having built up some knowledge, I move on to the next concept.
It is possible to provide a series of prompt/completion data entries to 'finetune' the OpenAI model. Such finetuned models will appear in the list of models in the playground, or can be invoked from the programmatic API.
Converting to JSONL
The input to finetuning requires the JSONL format, which is just a series of lines in a text file, each line containing an independent JSON object. A sample script has been provided to convert from a simple text format to JSONL.
python3 txt-to-jsonl.py < finetune-examples.txt > finetune.jsonl
The expected text input is a bit verbose and is adapted from the units transcript, it looks like this:
Input: """
<partially completed code>
"""
Output: """
<fully completed code>
"""
###
[...]
The output:
{"prompt": "<partially completed code">, "completion": "<fully completed code>"}
[...]
Running the finetune process
Install the openai pip package and ensure your OPENAI_API_KEY env
var is set with your key, then you can invoke the functionality for
finetuning like so:
openai api fine_tunes.create -t openai-transcripts/finetune.jsonl -m davinci
This will put you on the queue (and eventually charge your account based on the number of tokens; it will tell you first before charging so you can cancel if you like).
Running the model
The generated finetune model will then be available in the playground
or on the command line (with OPENAI_API_KEY env var set):
openai api completions.create -m davinci:ft-personal-xx-yy-zz -p '<partially completed code for the query>'
where davinci:ft-personal-xx-yy-zz is the name of the created model.
Owner
- Name: camfort
- Login: camfort
- Kind: organization
- Website: http://camfort.github.io
- Repositories: 18
- Profile: https://github.com/camfort
Tooling for the static analysis and verification of Fortran code (joint project between the University of Kent, University of Cambridge, and Bloomberg LP)
GitHub Events
Total
Last Year
Committers
Last synced: 12 months ago
Top Committers
| Name | Commits | |
|---|---|---|
| Matthew Danish | m****5@c****k | 17 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 12 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0