doc4tf

Automatic generation of Text-Fabric feature documentation.

https://github.com/tonyjurg/doc4tf

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (19.4%) to scientific vocabulary

Keywords

digital-humanities documentation-tool text-fabric

Repository


Basic Info
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 12
  • Releases: 6
Created about 2 years ago · Last pushed about 1 year ago
Metadata Files
Readme Citation

README.md

Project Status: Active. The project has reached a stable, usable state and is being actively developed. License: CC BY 4.0

Why Doc4TF?

Ideally, comprehensive end-user documentation should accompany the development of a Text-Fabric dataset. However, this task isn't always completed in the initial phase. Furthermore, changes to features often go unrecorded in the documentation, leading to mismatches between the actual data and its supposed description.

This Jupyter Notebook contains Python code to automatically generate a documentation set for any Text-Fabric dataset based on its actual data. It serves as a robust starting point for developing a brand-new documentation set, especially since the resulting documentation is fully hyperlinked, something that is laborious to do by hand. The tool can also be used to validate existing documentation.

Using Doc4TF

Since Doc4TF is implemented as a Jupyter Notebook, you will need an environment capable of running Jupyter Notebooks. Given that this tool is designed for use with Text-Fabric, you likely already have a suitable environment set up. If not, a good option is to install Anaconda.

To start using Doc4TF, download this Jupyter Notebook file and place it anywhere on your system where you can execute it. The notebook will guide you through the process, which consists of the following steps:

  • Load the Text-Fabric database you specify.
  • Execute the code present in the subsequent cells. The code will:
      • Construct a Python dictionary with relevant data from the TF dataset.
      • Create separate files for each feature.
      • Create index pages.
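The steps above can be sketched in plain Python. This is a minimal illustration, not the notebook's actual code: the function names `summarise_features` and `write_docs`, the input shape (a dictionary mapping each feature name to a list of its values), and the page layout are all assumptions made for the example. In real use, the input would come from the loaded Text-Fabric dataset.

```python
import tempfile
from collections import Counter
from pathlib import Path


def summarise_features(raw):
    """Map each feature name to its value frequencies, most frequent first.

    `raw` is a hypothetical stand-in for data read from a TF dataset:
    {feature_name: [value, value, ...]}.
    """
    summary = {}
    for name, values in raw.items():
        counts = Counter(values)
        # Sort by descending count, then by value, so output is deterministic.
        summary[name] = sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
    return summary


def write_docs(summary, out_dir):
    """Write one markdown page per feature plus a hyperlinked index page."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, freqs in summary.items():
        lines = [f"# Feature: {name}", "", "| Value | Frequency |", "| --- | --- |"]
        lines += [f"| {value} | {count} |" for value, count in freqs]
        lines += ["", "[Back to index](index.md)"]
        (out / f"{name}.md").write_text("\n".join(lines) + "\n", encoding="utf-8")
    index = ["# Feature index", ""]
    index += [f"- [{name}]({name}.md)" for name in sorted(summary)]
    (out / "index.md").write_text("\n".join(index) + "\n", encoding="utf-8")
    return out


# Toy data for illustration only; not taken from any real dataset.
raw = {"sp": ["noun", "verb", "noun"], "number": ["sg", "pl", "sg", "sg"]}
summary = summarise_features(raw)
docs_dir = write_docs(summary, tempfile.mkdtemp())
```

Each feature page links back to the index and the index links to every feature page, which is the kind of cross-linking that is tedious to maintain by hand.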

The tool outputs a set of markdown files (extension '.md'), the standard format for Text-Fabric feature documentation. Viewing these files in a standard web browser requires post-processing, specifically rendering the markdown into HTML. One method is to upload the files to a GitHub repository, which renders markdown files for viewing in any browser. Alternatively, you can install a browser extension such as a markdown viewer, which lets you view the rendered markdown files directly in your browser.

The script can also generate a set of HTML files. These can be stored on a local drive and browsed with any web browser.

An example documentation set created by this script can be found in the results directory.

Determining the delta between two TF datasets

An additional tool has been created to identify changes between two Text-Fabric datasets, including differences in features, feature values, node types, and ranges. The tool is available as a Jupyter Notebook: determineDeltaBetweenVersions.ipynb. It generates a dynamic report that allows for in-depth exploration and can be downloaded as an HTML file. This report can be used to quickly identify necessary documentation updates. It also facilitates regression testing of code updates to detect any adverse effects.
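The core idea of such a comparison can be sketched as follows. This is a simplified illustration, not the code in determineDeltaBetweenVersions.ipynb: the function name `feature_delta` and the input shape (feature name mapped to a dictionary of value counts) are assumptions, and the sketch covers only features and feature values, not node types or ranges.

```python
def feature_delta(old, new):
    """Compare two hypothetical {feature: {value: count}} mappings.

    Returns the features added, the features removed, and for each shared
    feature the values whose counts differ as (old_count, new_count).
    """
    old_names, new_names = set(old), set(new)
    delta = {
        "added": sorted(new_names - old_names),
        "removed": sorted(old_names - new_names),
        "changed": {},
    }
    for name in sorted(old_names & new_names):
        values = set(old[name]) | set(new[name])
        diff = {
            value: (old[name].get(value, 0), new[name].get(value, 0))
            for value in values
            if old[name].get(value, 0) != new[name].get(value, 0)
        }
        if diff:
            delta["changed"][name] = diff
    return delta


# Toy data for illustration only; not taken from any real dataset.
old = {"sp": {"noun": 10, "verb": 5}, "gloss": {"word": 1}}
new = {"sp": {"noun": 10, "verb": 6, "adjv": 2}, "lemma": {"logos": 3}}
delta = feature_delta(old, new)
```

An empty result on all three keys after a code change would suggest that the change did not alter the feature data, which is the regression-testing use described above.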

About Text-Fabric

Text-Fabric is a powerful Python library and framework designed to facilitate the analysis and manipulation of large-scale textual data, particularly in the context of ancient languages and biblical texts. It provides a comprehensive set of tools for processing and querying structured text data efficiently. Text-Fabric was developed by Dirk Roorda. The software package is accessible at https://github.com/annotation/text-fabric.

BibTeX Citation

@software{Jurg_Doc4TF_-_Automatic_2024,
  author = {Jurg, Tony},
  doi = {10.5281/zenodo.12705876},
  month = jul,
  title = {{Doc4TF - Automatic generation of Text-Fabric feature documentation}},
  url = {https://github.com/tonyjurg/Doc4TF},
  version = {0.5.2},
  year = {2024}
}

Owner

  • Name: Tony Jurg
  • Login: tonyjurg
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Jurg"
  given-names: "Tony"
  orcid: "https://orcid.org/0000-0002-0343-1346"
title: "Doc4TF - Automatic generation of Text-Fabric feature documentation"
version: 0.5.2
doi: 10.5281/zenodo.12705876
date-released: 2024-07-10
url: "https://github.com/tonyjurg/Doc4TF"

GitHub Events

Total
  • Issues event: 2
  • Issue comment event: 1
  • Push event: 10
  • Create event: 1
Last Year
  • Issues event: 2
  • Issue comment event: 1
  • Push event: 10
  • Create event: 1