https://github.com/fermi-ad/linac-logger-device-cleaner
Scripts for generating a list of valid and unique devices from the Linac data loggers.
Linac Logger Device Cleaner
This project was used to produce a list of valid and unique devices that are being data logged.
Download device requests file
The device requests file is versioned to make explicit what requests are being made. The data analysis team can associate a specific version of the device requests file with the analysis performed.
Releases are managed with tagged commits. The latest release is available at https://github.com/fermi-controls/linac-logger-device-cleaner/releases/latest and the device requests file can be downloaded directly from https://github.com/fermi-controls/linac-logger-device-cleaner/releases/latest/download/linac_logger_drf_requests.txt.
Requirements
ACL is required to get the rates from the data logger lists associated with the devices.
Workflow
Generate a list of Linac data logger devices
./linac_logger_devices.acl generates a file with all the devices from all the nodes, ./output/linac_logger_devices.txt.
Find unique devices
Executing python parse_data_logger_devices.py generates ./output/linac_logger_unique_devices.txt.
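The deduplication step can be sketched as follows. This is a minimal illustration, not the contents of parse_data_logger_devices.py: the function name `unique_devices` is hypothetical, and it assumes the input holds one device name per line and that names are compared as exact strings.

```python
def unique_devices(lines):
    """Return device names in first-seen order, without duplicates or blank lines.

    Assumes one device name per line (an assumption about the file layout).
    """
    seen = set()
    unique = []
    for line in lines:
        name = line.strip()
        # Skip blank lines and names that were already recorded.
        if name and name not in seen:
            seen.add(name)
            unique.append(name)
    return unique
```

Passing an open file handle for ./output/linac_logger_devices.txt as `lines` and writing the result one name per line would give a file shaped like ./output/linac_logger_unique_devices.txt.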
Generate a list of Linac data logger requests
Running acl ./linac_logger_lists.acl queries the data loggers for the request rates of Linac logger devices and writes them to ./output/linac_logger_rates.txt.
./parse_acl_logger_rates.py then converts that output into DRF requests for the persistent ML data pipeline and writes them to ./output/linac_logger_drf_requests.txt.
Duplicate device count
./output/linac_logger_duplicate_count.txt was generated by running sort output/linac_logger_devices.txt | uniq -c | sort > output/linac_logger_duplicate_count.txt from a unix command line.
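For environments without the Unix tools, roughly the same count can be produced in Python. This is a sketch under the assumption of one device name per line; `duplicate_counts` is a hypothetical name, and the result is ordered by ascending count rather than reproducing the exact text ordering of the shell pipeline.

```python
from collections import Counter


def duplicate_counts(lines):
    """Count how often each line occurs, like `sort | uniq -c`.

    Returns (device, count) pairs ordered by ascending count, then by name.
    """
    counts = Counter(line.rstrip("\n") for line in lines)
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))
```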
Validating device lists
There is concern that our list of relevant devices is not comprehensive, since most sources of devices are human-maintained. The following strategies help determine which devices may have been left out.
Device database machine field
The device database has a machine field that can be Linac. ./gen_device_lists.acl generates three lists: all devices with the machine field Linac, ./output/machine_linac_devices.txt; the devices from that list that are logged, ./output/logged_linac_devices.txt; and those that are not logged, ./output/unlogged_linac_devices.txt.
Note: Queries to the data logger configuration are slow, so the ACL script takes several minutes to run.
Node database area field
The front-end node database has an area field that can be Linac. ./gen_linac_area_nodes.acl generates a list of nodes where area is Linac, ./output/area_linac_nodes.txt, and lists of devices associated with those nodes, ./output/node_devices/.
Those files can be combined with cat output/node_devices/* > output/area_linac_nodes_devices.txt, which produces ./output/area_linac_nodes_devices.txt.
grep -vi '^Z' output/area_linac_nodes_devices.txt > output/area_linac_nodes_devices_no_z.txt then removes any Z:% devices from the list, producing ./output/area_linac_nodes_devices_no_z.txt.
./validate_devices.acl is then used to validate that the devices in ./output/area_linac_nodes_devices_no_z.txt aren't deleted or obsolete, producing ./output/area_linac_nodes_devices_valid.txt.
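The Z-device filter above can also be expressed in Python. This is only a sketch of what grep -vi '^Z' does; `drop_z_devices` is a hypothetical helper name, and it assumes one device name per line.

```python
def drop_z_devices(lines):
    """Keep only lines that do not start with 'Z' or 'z' (like `grep -vi '^Z'`)."""
    return [line for line in lines if line[:1].upper() != "Z"]
```

Feeding it the concatenated lines of every file under ./output/node_devices/ would cover both the cat step and the grep step in one pass.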
Strategy comparison
The machine field strategy results in 8532 devices.
The area field strategy results in 9057 devices.
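The counts alone do not show which devices account for the difference. A set comparison along these lines could list them; this is a sketch, `compare_device_lists` is a hypothetical name, and it assumes both inputs hold one device name per line.

```python
def compare_device_lists(machine_devices, area_devices):
    """Split two device lists into shared and strategy-specific names."""
    machine = {d.strip() for d in machine_devices if d.strip()}
    area = {d.strip() for d in area_devices if d.strip()}
    return {
        "both": sorted(machine & area),          # found by both strategies
        "machine_only": sorted(machine - area),  # only via the machine field
        "area_only": sorted(area - machine),     # only via the area field
    }
```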
Remove Z devices; Remove devices from a list to be ignored; Remove devices giving [16 -13] errors.
The script output/DeviceListAuditTool.py produces a proposed new copy of a given device list file with at least the Z* devices removed.
- Optionally takes a file such as nanny.log, identifies (and counts occurrences of) [16 -13] "No such property" error codes, and removes those devices from the proposed new list.
- Optionally takes a CSV file of devices to ignore, and removes those devices from the proposed new list.
- Produces a numeric breakdown of the counts of devices removed for any one, two, or three of these reasons.
- Optionally produces files listing the devices removed by each of these list refinements.
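The breakdown by removal reason described above can be illustrated with a short sketch. This is not the code of DeviceListAuditTool.py: `removal_breakdown` and the reason labels are hypothetical, and the sets of error devices and ignored devices are assumed to have been extracted already from the log and the CSV file.

```python
from collections import Counter


def removal_breakdown(devices, error_devices, ignored_devices):
    """Return (kept devices, Counter of removals keyed by combined reasons)."""
    breakdown = Counter()
    kept = []
    for device in devices:
        reasons = []
        if device[:1].upper() == "Z":
            reasons.append("z")        # Z* device
        if device in error_devices:
            reasons.append("error")    # reported a [16 -13] error
        if device in ignored_devices:
            reasons.append("ignored")  # listed in the ignore file
        if reasons:
            breakdown["+".join(reasons)] += 1
        else:
            kept.append(device)
    return kept, breakdown
```

A key such as "z+error" then counts devices removed for exactly those two reasons, which gives the one, two, or three reason breakdown.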
TODOs
- Only use the highest periodic rate for a device.
- Account for different properties in the datalogger.
- Currently we assume that all entries are logged on their reading.
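The first TODO could be approached as sketched below. The input shape is an assumption: `entries` is a hypothetical iterable of (device, period) pairs already parsed from the rates file, where a shorter period means a higher rate. The actual layout of ./output/linac_logger_rates.txt is not assumed here.

```python
def highest_rate_per_device(entries):
    """Map each device to its shortest period, i.e. its highest periodic rate."""
    best = {}
    for device, period in entries:
        # Keep the smallest period seen so far for this device.
        if device not in best or period < best[device]:
            best[device] = period
    return best
```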
Questions to answer
- What and how many are all the Linac devices?
- What and how many Linac devices are logged?
- Do we keep event based logging?
- How do we determine the rate of events?
- What do we do when the devices in the logger change?
Strategies to validate
- L:%
- Node database area field
- Node database system field
- Device database machine field
- Data logger databases devices
Owner
- Name: Fermilab Accelerator Directorate
- Login: fermi-ad
- Kind: organization
- Location: United States of America
- Website: https://ad.fnal.gov/
- Repositories: 1
- Profile: https://github.com/fermi-ad
Fermilab Accelerator Systems