Recent Releases of shantay
shantay - v0.5: Customize HTML reports, run with Pola.rs 1.31.0
This release adds two new customizations of generated HTML reports, fixes an incompatibility with Pola.rs 1.31.0, and improves test coverage, in part by enabling CI.
- Python
Published by apparebit 7 months ago
shantay - v0.4 Add Polish by Behaving More Consistently
This release improves the consistency between tasks, processing modes, and graphs. Notably, Shantay now stages input files before summarizing them for the full database and extracts alike. It now generates JSON metadata during summarization/distillation for the full database and extracts alike. It now writes back the metadata and summary statistics only after completing a task for single-process and multi-process mode alike. Finally, it now adds a second, more detailed panel to timelines with counts and delays alike.
This release also enriches the output of the info task and fixes a bug where Shantay required JSON metadata before it had collected the data.
Check out the Shantay-generated overview report, which covers the entire database and includes platform-specific summaries for Meta, TikTok, X, and YouTube, amongst others.
- Python
Published by apparebit 7 months ago
shantay - v0.3: a more usable, more robust tool
Shantay now has simpler command-line options, and you can do all the heavy lifting with only two subcommands (or "tasks"):
- summarize downloads daily releases of the DSA transparency database and extracts summary statistics
- visualize creates a report with lots of charts, illustrating many aspects of the entire database and of the more popular platforms
If you want more control, then the download, distill, recover, and info tasks as well as the --offline option may be of use. They are described in the project readme as well as the tool help.
With this release, Shantay collects richer statistics that keep their association with platforms and include free text fields for "other" categories. The generated reports for the entire database and statement categories are also much improved. They include more data and charts, organize that information better, and feature cleaner charts that omit unnecessary categories while ordering the remaining categories by size. Their appearance has also been improved.
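To illustrate the chart cleanup described above, here is a minimal sketch that drops empty categories and orders the rest by size. It is not Shantay's code: the function name `order_categories`, the `min_count` threshold, and the sample counts are all hypothetical.

```python
from collections import Counter


def order_categories(counts, min_count=1):
    """Drop categories below min_count, then sort the rest by descending size.

    Hypothetical helper for illustration only; not part of Shantay.
    """
    kept = {name: n for name, n in counts.items() if n >= min_count}
    # Largest count first; ties are broken alphabetically for a stable order.
    return sorted(kept.items(), key=lambda item: (-item[1], item[0]))


# Made-up sample data, not taken from the DSA transparency database.
sample = Counter({"category_a": 12, "category_b": 0, "category_c": 40})
ordered = order_categories(sample)
```

With the sample above, `category_b` is omitted because its count is zero, and `category_c` sorts ahead of `category_a`.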
- Python
Published by apparebit 8 months ago
shantay - v0.2: a working tool
Shantay now supports two workflows:
- one to analyze the entire DSA transparency database (the summarize task)
- one to analyze a category-specific or otherwise filtered view (the prepare and analyze tasks)
It optionally uses multiprocessing for data crunching (--multiproc), which does make a difference when using 2 or 3 worker processes.
The implementation makes extensive use of declarative specifications, notably to describe the DSA transparency database schema, to describe the extraction of relevant statistics from the database or a view, and to generate timeline graphs from the collected statistics.
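As a rough illustration of what a declarative specification for statistics extraction can look like, the sketch below describes each statistic as plain data and uses one small interpreter to compute them all. It assumes nothing about Shantay's actual specification format: `SPEC`, `collect`, and the field names `platform` and `automated` are hypothetical.

```python
from collections import defaultdict

# Hypothetical specification: each entry names a statistic, the field to group
# by, and an optional equality filter. None of these names come from Shantay.
SPEC = [
    {"name": "rows_per_platform", "group_by": "platform"},
    {
        "name": "automated_per_platform",
        "group_by": "platform",
        "where": {"automated": True},
    },
]


def collect(rows, spec):
    """Interpret the specification: count matching rows per group."""
    stats = {entry["name"]: defaultdict(int) for entry in spec}
    for row in rows:
        for entry in spec:
            where = entry.get("where", {})
            # A row matches when every filtered field has the required value.
            if all(row.get(key) == value for key, value in where.items()):
                stats[entry["name"]][row[entry["group_by"]]] += 1
    return {name: dict(groups) for name, groups in stats.items()}


# Made-up rows for illustration only.
rows = [
    {"platform": "A", "automated": True},
    {"platform": "A", "automated": False},
    {"platform": "B", "automated": True},
]
result = collect(rows, SPEC)
```

The point of this style is that adding a statistic means adding an entry to the specification rather than writing new traversal code.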
- Python
Published by apparebit 10 months ago