https://github.com/dafny-lang/dafny-reportgenerator
A tool for analyzing and reporting on Dafny, especially the results of verification
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
✓codemeta.json file
Found codemeta.json file -
○.zenodo.json file
-
○DOI references
-
○Academic publication links
-
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (13.4%) to scientific vocabulary
Repository
A tool for analyzing and reporting on Dafny, especially the results of verification
Basic Info
- Host: GitHub
- Owner: dafny-lang
- License: mit
- Language: Dafny
- Default Branch: main
- Size: 47.9 KB
Statistics
- Stars: 3
- Watchers: 11
- Forks: 4
- Open Issues: 7
- Releases: 0
Metadata Files
README.md
dafny-reportgenerator
A tool for analyzing and reporting on Dafny, especially the results of verification.
The primary use of this tool (for now) is to flag more expensive verification tasks, as these are
more likely to a source of future verification instability: the phenomenon where code that was previously verified
no longer verifies after making seemingly unrelated changes. This problem can make consistent and predictable development
of Dafny code challenging, and hence we recommend adding a command to invoke this tool with a configured maximum
verification cost to any Dafny project's continuous integration. This is better than setting a more aggressive verification
cost bound through options like /timeLimit directly, as it allows users to know that their code is still correct,
but still blocks code changes that are too expensive to verify and hence likely to break in the future.
Note that this tool, which is itself implemented in Dafny, is no exception!
There are currently two different metrics that you can set a maximum bound on:
--max-duration-seconds N
Bounds the wall-clock time needed to complete a verification task. This is obviously intuitive for users, but will vary across different computing resources, especially how many verification tasks are running in parallel.
--max-resource-count N(Recommended)
Bounds the number of "resources" needed to complete a verification task. The resource count is an abstract measurement of
verification cost that depends only on the input formula and exact SMT solver configuration, and hence will be the same
across multiple verification jobs. Some SMT solvers (e.g. Z3) allow configuring a direct limit on this metric through
an :rlimit parameter, although the SMT-LIB standard introduces a similar :reproducible-resource-limit parameter
as of version 2.5 (https://smtlib.cs.uiowa.edu/papers/smt-lib-reference-v2.5-r2015-06-28.pdf, page 54).
--max-duration-stddev NBounds the standard deviation of multiple measurements of the wall-clock time needed to complete a verification task. This option is especially useful for measuring the stability of verification when varying the
/randomSeedparameter to Dafny. The output will only be meaningful if the collection of CSV files being analyzed includes results from more than one Dafny run. This will be true if you've run Dafny more than once on the same set of input files with without explicitly specifying a CSV output file name and without deleting theTestResultsdirectory or any of its contents between runs.--max-duration-cv-pct NBounds the normalized standard deviation (a.k.a., coefficient of variation, or CV) of multiple measurements of the wall-clock time needed to complete a verification task, as an integer percentage. This decouples variation from total execution time.
--max-resource-stddev NBounds the standard deviation of multiple measurements of the solver "resources" needed to complete a verification task. This is similar to
--max-duration-stddevbut more stable between different runs and across different platforms.--max-resource-cv-pct NBounds the normalized standard deviation (a.k.a. coefficient of variation, or CV) of multiple measurements of the solver "resources" needed to complete a verification task. This is similar to
--max-duration-cv-pctbut more stable between different runs and across different platforms.--allow-different-outcomesDon't fail if a given verification task has more than one type of outcome (e.g., success and timeout). Normally, this is considered an error, but during the process of making a project more stable it can sometimes be useful to allow.
Here is an example of the output of this tool when run against the results of verifying itself. The CSV files
were created by passing /verificationLogger:csv when invoking the dafny command-line tool. The maximum resource count
is set unrealistically low here, just to demonstrate the behavior when violated.
```
dafny-reportgenerator summarize-csv-results --max-resource-count 500000 . All results:
Impl$$Main.default.ParseCommandLineOptions(Passed) - Duration = 00:00:00.1940000, Resource Count = 581969 Impl$$Main.default.MainAux(Passed) - Duration = 00:00:00.2470000, Resource Count = 356337 CheckWellformed$$StandardLibrary.default.MergeSortedBy(Passed) - Duration = 00:00:00.1570000, Resource Count = 333138 Impl$$TestResult.default.ParseTestResults(Passed) - Duration = 00:00:00.2070000, Resource Count = 318950 CheckWellformed$$CSV.default.ParseDataWithHeader(Passed) - Duration = 00:00:00.9010000, Resource Count = 265695 CheckWellformed$$TestResult.default.ParseFromCSVRow(Passed) - Duration = 00:00:00.1430000, Resource Count = 202603 CheckWellformed$$CSV.default.ParseRowWithHeader(Passed) - Duration = 00:00:00.1200000, Resource Count = 194334 Impl$$CSVTests.default.ParseDataWithHeader(Passed) - Duration = 00:00:01.1220000, Resource Count = 182160 CheckWellformed$$StandardLibrary.default.MergeSortBy(Passed) - Duration = 00:00:00.0950000, Resource Count = 168961 CheckWellformed$$TestResult.default.GetCSVRowField(Passed) - Duration = 00:00:00.0940000, Resource Count = 116280 Impl$$StandardLibrary.default.LemmaNewFirstElementStillSortedBy(Passed) - Duration = 00:00:00.0820000, Resource Count = 112567 CheckWellformed$$StandardLibrary.default.Split(Passed) - Duration = 00:00:00.4470000, Resource Count = 99661 CheckWellformed$$CSV.default.ParseRow(Passed) - Duration = 00:00:00.0800000, Resource Count = 92111 CheckWellformed$$StandardLibrary.default.LemmaNewFirstElementStillSortedBy(Passed) - Duration = 00:00:00.0720000, Resource Count = 89524 CheckWellformed$$StandardLibrary.default.SortedBy(Passed) - Duration = 00:00:00.0710000, Resource Count = 87462 CheckWellformed$$TestResult.TestResult.ToString(Passed) - Duration = 00:00:00.4490000, Resource Count = 83944 Impl$$Main.default.defaultMain(Passed) - Duration = 00:00:00.6980000, Resource Count = 82411 Impl$$Main.__default.PrintTestResults(Passed) - Duration = 00:00:00.0520000, Resource Count = 66537
Some results have a resource count over the configured limit of 500000:
Impl$$Main.__default.ParseCommandLineOptions(Passed) - Duration = 00:00:00.1940000, Resource Count = 581969
Errors occurred: see above for details. ```
GitHub Events
Total
- Push event: 1
- Pull request review event: 1
- Pull request event: 2
- Create event: 1
Last Year
- Push event: 1
- Pull request review event: 1
- Pull request event: 2
- Create event: 1
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 6
- Total pull requests: 18
- Average time to close issues: 6 days
- Average time to close pull requests: 14 days
- Total issue authors: 2
- Total pull request authors: 4
- Average comments per issue: 2.33
- Average comments per pull request: 0.28
- Merged pull requests: 14
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 3
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 2
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- robin-aws (3)
- seebees (2)
Pull Request Authors
- atomb (11)
- robin-aws (4)
- fabiomadge (2)
- samuelchassot (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- Microsoft.Extensions.FileSystemGlobbing 6.0.0
- dafny.msbuild 1.0.0
- Microsoft.NET.Test.Sdk 16.11.0
- coverlet.collector 3.1.0
- dafny.msbuild 1.0.0
- xunit 2.4.1
- xunit.runner.visualstudio 2.4.3
- actions/checkout v2 composite
- actions/setup-dotnet v1 composite
- dafny-lang/setup-dafny-action v1.6.0 composite