go-sanitize

Lightweight Go library providing robust string sanitization and normalization utilities

https://github.com/mrz1836/go-sanitize

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • ✓
    CITATION.cff file
    Found CITATION.cff file
  • ✓
    codemeta.json file
    Found codemeta.json file
  • ✓
    .zenodo.json file
    Found .zenodo.json file
  • ○
    DOI references
  • ○
    Academic publication links
  • ○
    Committers with academic emails
  • ○
    Institutional organization owner
  • ○
    JOSS paper metadata
  • ○
    Scientific vocabulary similarity
    Low similarity (8.3%) to scientific vocabulary

Keywords

golang golang-package gomod gomodule normalize sanitization sanitizer strings strings-manipulation

Keywords from Contributors

interactive mesh interpretability profiles sequences generic projection standardization optim embedded
Last synced: 6 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: mrz1836
  • License: mit
  • Language: Go
  • Default Branch: master
  • Homepage:
  • Size: 1.2 MB
Statistics
  • Stars: 44
  • Watchers: 3
  • Forks: 1
  • Open Issues: 0
  • Releases: 39
Topics
golang golang-package gomod gomodule normalize sanitization sanitizer strings strings-manipulation
Created about 7 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing Funding License Code of conduct Citation Codeowners Security Support Agents

README.md

go-sanitize

Lightweight Go library providing robust string sanitization and normalization utilities


📦 Installation

go-sanitize requires a supported release of Go. Install it with:

`go get -u github.com/mrz1836/go-sanitize`


💡 Usage

Here is a basic example of how to use go-sanitize in your Go project:

```go
package main

import (
	"fmt"

	"github.com/mrz1836/go-sanitize"
)

func main() {
	// Sanitize a string to remove unwanted characters
	input := "Hello, World! @2025"
	sanitized := sanitize.AlphaNumeric(input, false) // true to keep spaces

	// Output: Sanitized String: HelloWorld2025
	fmt.Println("Sanitized String:", sanitized)
}
```

  • Explore additional usage examples for practical integration patterns
  • Review benchmark results to assess performance characteristics
  • Examine the comprehensive test suite for validation and coverage
  • Fuzz tests are available to ensure robustness against unexpected inputs


📚 Documentation

View the generated documentation

Heads up! go-sanitize is intentionally light on dependencies. The only external package it uses is the excellent testify suite, and that's just for our tests. You can drop this library into your projects without dragging along extra baggage.


Features

  • Alpha and alphanumeric sanitization with optional spaces
  • Bitcoin and Bitcoin Cash address sanitizers
  • Custom regular expression helper for arbitrary patterns
  • Precompiled regex sanitizer for repeated patterns
  • Decimal, domain, email and IP address normalization
  • HTML and XML stripping with script removal
  • URI, URL and XSS sanitization
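
As an illustration of the kind of normalization listed above, here is a minimal standard-library sketch of IP address handling. The helper name `normalizeIP` is hypothetical and this is not go-sanitize's implementation; it only shows one plausible approach, assuming that invalid input should yield an empty string.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// normalizeIP is a hypothetical helper (not part of go-sanitize).
// It trims surrounding whitespace, parses the input as IPv4 or IPv6,
// and returns the canonical text form, or "" when the input is invalid.
func normalizeIP(input string) string {
	ip := net.ParseIP(strings.TrimSpace(input))
	if ip == nil {
		return ""
	}
	return ip.String()
}

func main() {
	fmt.Println(normalizeIP(" 192.168.0.1 "))                            // 192.168.0.1
	fmt.Println(normalizeIP("2001:0db8:0000:0000:0000:ff00:0042:8329")) // 2001:db8::ff00:42:8329
	fmt.Printf("%q\n", normalizeIP("not-an-ip"))                        // ""
}
```

Returning an empty string for invalid input keeps the helper's signature simple; a caller that needs to distinguish "empty" from "invalid" would likely prefer an additional error return.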

Functions

  • Alpha: Remove non-alphabetic characters, optionally keep spaces
  • AlphaNumeric: Remove non-alphanumeric characters, optionally keep spaces
  • BitcoinAddress: Filter input to valid Bitcoin address characters
  • BitcoinCashAddress: Filter input to valid Bitcoin Cash address characters
  • Custom: Use a custom regex to filter input (legacy)
  • CustomCompiled: Use a precompiled custom regex to filter input (suggested)
  • Decimal: Keep only decimal or float characters
  • Domain: Sanitize domain, optionally preserving case and removing www
  • Email: Normalize an email address
  • FirstToUpper: Capitalize the first letter of a string
  • FormalName: Keep only formal name characters
  • HTML: Strip HTML tags
  • IPAddress: Return sanitized and valid IPv4 or IPv6 address
  • Numeric: Remove all but numeric digits
  • PhoneNumber: Keep digits and plus signs for phone numbers
  • PathName: Sanitize to a path-friendly name
  • Punctuation: Allow letters, numbers and basic punctuation
  • ScientificNotation: Keep characters valid in scientific notation
  • Scripts: Remove scripts, iframe and object tags
  • SingleLine: Replace line breaks and tabs with spaces
  • Time: Keep only valid time characters
  • URI: Keep characters allowed in a URI
  • URL: Keep characters allowed in a URL
  • XML: Strip XML tags
  • XSS: Remove common XSS attack strings
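
Most of the functions above are character filters: they keep an allowed set of characters and drop the rest. The sketch below shows that idea with the standard library only. The name `keepAlphaNumeric` is hypothetical, and the code is an assumption about how such a filter could be written, not a copy of the library's `AlphaNumeric`.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// keepAlphaNumeric is a hypothetical stand-in for a character-filter sanitizer.
// It keeps letters and digits, and keeps spaces only when keepSpaces is true.
func keepAlphaNumeric(input string, keepSpaces bool) string {
	var b strings.Builder
	b.Grow(len(input)) // the output is never longer than the input
	for _, r := range input {
		if unicode.IsLetter(r) || unicode.IsDigit(r) || (keepSpaces && r == ' ') {
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(keepAlphaNumeric("Hello, World! @2025", false)) // HelloWorld2025
	fmt.Println(keepAlphaNumeric("Hello, World! @2025", true))  // Hello World 2025
}
```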

Additional Documentation & Repository Management

Library Deployment
This project uses [goreleaser](https://github.com/goreleaser/goreleaser) for streamlined binary and library deployment to GitHub. To get started, install it via:

```bash
brew install goreleaser
```

The release process is defined in the [.goreleaser.yml](.goreleaser.yml) configuration file. To generate a snapshot (non-versioned) release for testing purposes, run:

```bash
make release-snap
```

Before tagging a new version, update the release metadata in the `CITATION.cff` file:

```bash
make citation version=0.2.1
```

Then create and push a new Git tag using:

```bash
make tag version=x.y.z
```

This process ensures consistent, repeatable releases with properly versioned artifacts and citation metadata.
Makefile Commands
View all `makefile` commands:

```bash
make help
```

List of all current commands:

```text
bench                 ## Run all benchmarks in the Go application
build-go              ## Build the Go application (locally)
citation              ## Update version in CITATION.cff (use version=X.Y.Z)
clean-mods            ## Remove all the Go mod cache
coverage              ## Show test coverage
diff                  ## Show git diff and fail if uncommitted changes exist
fumpt                 ## Run fumpt to format Go code
generate              ## Run go generate in the base of the repo
godocs                ## Trigger GoDocs tag sync
govulncheck-install   ## Install govulncheck (pass VERSION= to override)
govulncheck           ## Scan for vulnerabilities
help                  ## Display this help message
install-go            ## Install using go install with specific version
install-releaser      ## Install GoReleaser
install-stdlib        ## Install the Go standard library for the host platform
install-template      ## Kick-start a fresh copy of go-template (run once!)
install               ## Install the application binary
lint-version          ## Show the golangci-lint version
lint                  ## Run the golangci-lint application (install if not found)
loc                   ## Total lines of code table
mod-download          ## Download Go module dependencies
mod-tidy              ## Clean up go.mod and go.sum
pre-build             ## Pre-build all packages to warm cache
release-snap          ## Build snapshot binaries
release-test          ## Run release dry-run (no publish)
release               ## Run production release (requires github_token)
tag-remove            ## Remove local and remote tag (use version=X.Y.Z)
tag-update            ## Force-update tag to current commit (use version=X.Y.Z)
tag                   ## Create and push a new tag (use version=X.Y.Z)
test-ci-no-race       ## CI test suite without race detector
test-ci               ## CI test runs tests with race detection and coverage (no lint - handled separately)
test-cover-race       ## Runs unit tests with race detector and outputs coverage
test-cover            ## Unit tests with coverage (no race)
test-fuzz             ## Run fuzz tests only (no unit tests)
test-no-lint          ## Run only tests (no lint)
test-parallel         ## Run tests in parallel (faster for large repos)
test-race             ## Unit tests with race detector (no coverage)
test-short            ## Run tests excluding integration tests (no lint)
test                  ## Default testing uses lint + unit tests (fast)
uninstall             ## Uninstall the Go binary
update-linter         ## Upgrade golangci-lint (macOS only)
update-releaser       ## Reinstall GoReleaser
update                ## Update dependencies
vet-parallel          ## Run go vet in parallel (faster for large repos)
vet                   ## Run go vet only on your module packages
```
GitHub Workflows
### 🎛️ The Workflow Control Center

All GitHub Actions workflows in this repository are powered by a single configuration file: [**.env.shared**](.github/.env.shared), your one-stop shop for tweaking CI/CD behavior without touching a single YAML file! 🎯

This magical file controls everything from:

- **🚀 Go version matrix** (test on multiple versions or just one)
- **🏃 Runner selection** (Ubuntu or macOS, your wallet decides)
- **🔬 Feature toggles** (coverage, fuzzing, linting, race detection)
- **🛡️ Security tool versions** (gitleaks, nancy, govulncheck)
- **🤖 Auto-merge behaviors** (how aggressive should the bots be?)
- **🏷️ PR management rules** (size labels, auto-assignment, welcome messages)

> **Pro tip:** Want to disable code coverage? Just flip `ENABLE_CODE_COVERAGE=false` in [.env.shared](.github/.env.shared) and push. No YAML archaeology required!

| Workflow Name | Description |
|---------------|-------------|
| [auto-merge-on-approval.yml](.github/workflows/auto-merge-on-approval.yml) | Automatically merges PRs after approval and all required checks, following strict rules. |
| [codeql-analysis.yml](.github/workflows/codeql-analysis.yml) | Analyzes code for security vulnerabilities using [GitHub CodeQL](https://codeql.github.com/). |
| [dependabot-auto-merge.yml](.github/workflows/dependabot-auto-merge.yml) | Automatically merges [Dependabot](https://github.com/dependabot) PRs that meet all requirements. |
| [fortress.yml](.github/workflows/fortress.yml) | Runs the GoFortress security and testing workflow, including linting, testing, releasing, and vulnerability checks. |
| [pull-request-management.yml](.github/workflows/pull-request-management.yml) | Labels PRs by branch prefix, assigns a default user if none is assigned, and welcomes new contributors with a comment. |
| [scorecard.yml](.github/workflows/scorecard.yml) | Runs [OpenSSF](https://openssf.org/) Scorecard to assess supply chain security. |
| [stale.yml](.github/workflows/stale-check.yml) | Warns about (and optionally closes) inactive issues and PRs on a schedule or manual trigger. |
| [sync-labels.yml](.github/workflows/sync-labels.yml) | Keeps GitHub labels in sync with the declarative manifest at [`.github/labels.yml`](./.github/labels.yml). |
| [update-python-dependencies.yml](.github/workflows/update-python-dependencies.yml) | Updates Python dependencies for pre-commit hooks in the repository. |
| [update-pre-commit-hooks.yml](.github/workflows/update-pre-commit-hooks.yml) | Automatically updates versions for [pre-commit](https://pre-commit.com/) hooks. |

Updating Dependencies
To update all dependencies (Go modules, linters, and related tools), run:

```bash
make update
```

This command ensures all dependencies are brought up to date in a single step, including Go modules and any tools managed by the Makefile. It is the recommended way to keep your development environment and CI in sync with the latest versions.


🧪 Examples & Tests

All unit tests and examples run via GitHub Actions and use Go version 1.24.x. View the configuration file.

Run all tests (fast):

`make test`

Run all tests with race detector (slower):

`make test-race`


⚡ Benchmarks

Run the Go benchmarks:

`make bench`


Benchmark Results

| Benchmark               | Iterations |   ns/op | B/op | allocs/op |
|-------------------------|------------|--------:|-----:|----------:|
| Alpha                   | 14,018,806 |   84.89 |   24 |         1 |
| Alpha_WithSpaces        | 12,664,946 |   94.25 |   24 |         1 |
| AlphaNumeric            | 9,161,546  |   130.6 |   32 |         1 |
| AlphaNumeric_WithSpaces | 7,978,879  |   150.8 |   32 |         1 |
| BitcoinAddress          | 8,843,929  |   137.1 |   48 |         1 |
| BitcoinCashAddress      | 5,892,612  |   196.2 |   48 |         1 |
| Custom (Legacy)         | 938,733    | 1,249.0 |  913 |        16 |
| CustomCompiled          | 1,576,502  |   762.3 |   96 |         5 |
| Decimal                 | 16,285,825 |   73.91 |   24 |         1 |
| Domain                  | 4,784,115  |   251.6 |  176 |         3 |
| Domain_PreserveCase     | 5,594,325  |   213.9 |  160 |         2 |
| Domain_RemoveWww        | 4,771,556  |   251.0 |  176 |         3 |
| Email                   | 8,380,172  |   144.2 |   48 |         2 |
| Email_PreserveCase      | 13,468,302 |   90.06 |   24 |         1 |
| FirstToUpper            | 57,342,418 |   20.60 |   16 |         1 |
| FormalName              | 14,557,754 |   83.12 |   24 |         1 |
| HTML                    | 2,558,787  |   468.5 |   48 |         3 |
| IPAddress               | 11,388,638 |   102.7 |   32 |         2 |
| IPAddress_IPV6          | 3,434,715  |   350.9 |   96 |         2 |
| Numeric                 | 22,661,516 |   52.92 |   16 |         1 |
| PhoneNumber             | 17,502,224 |   68.84 |   24 |         1 |
| PathName                | 13,881,150 |   86.58 |   24 |         1 |
| Punctuation             | 7,377,070  |   162.3 |   48 |         1 |
| ScientificNotation      | 19,399,621 |   61.62 |   24 |         1 |
| Scripts                 | 2,060,790  |   580.6 |   16 |         1 |
| SingleLine              | 9,777,549  |   123.5 |   32 |         1 |
| Time                    | 21,270,655 |   55.92 |   16 |         1 |
| URI                     | 9,005,937  |   133.4 |   32 |         1 |
| URL                     | 8,989,400  |   135.2 |   32 |         1 |
| XML                     | 4,351,617  |   275.7 |   48 |         3 |
| XSS                     | 3,302,917  |   362.9 |   40 |         2 |

Performance benchmarks for the core functions in this library, executed on an Apple M1 Max (ARM64). Most functions complete in under 200 ns/op with one or two allocations per call. The regex-based Custom (Legacy) and CustomCompiled helpers are the slowest, at roughly 1,249 and 762 ns/op respectively.
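
The gap between Custom (Legacy) and CustomCompiled in the table is consistent with the cost of compiling a regular expression on every call. The sketch below contrasts the two patterns using only the standard `regexp` package. The function names `stripPerCall` and `stripCompiled` are hypothetical and are not the library's implementation.

```go
package main

import (
	"fmt"
	"regexp"
)

// nonDigits is compiled once at package initialization and reused by every call.
var nonDigits = regexp.MustCompile(`[^0-9]`)

// stripPerCall (hypothetical) compiles the pattern on each invocation,
// which adds parsing work and allocations to every call.
func stripPerCall(input, pattern string) string {
	return regexp.MustCompile(pattern).ReplaceAllString(input, "")
}

// stripCompiled (hypothetical) reuses an already compiled expression.
func stripCompiled(input string, re *regexp.Regexp) string {
	return re.ReplaceAllString(input, "")
}

func main() {
	fmt.Println(stripPerCall("Order #1234-56", `[^0-9]`)) // 123456
	fmt.Println(stripCompiled("Order #1234-56", nonDigits)) // 123456
}
```

Both calls return the same result; the difference is only in how often the pattern is compiled. A compiled `*regexp.Regexp` is safe for concurrent use, so a package-level variable is a common place to keep it.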


🛠️ Code Standards

Read more about this Go project's code standards.


🤖 AI Compliance

This project documents expectations for AI assistants using a few dedicated files:

  • AGENTS.md: canonical rules for coding style, workflows, and pull requests used by Codex.
  • CLAUDE.md: quick checklist for the Claude agent.
  • .cursorrules: machine-readable subset of the policies for Cursor and similar tools.
  • sweep.yaml: rules for Sweep, a tool for code review and pull request management.

Edit AGENTS.md first when adjusting these policies, and keep the other files in sync within the same pull request.


👥 Maintainers

| MrZ |
|:---:|
| MrZ |


🤝 Contributing

View the contributing guidelines and please follow the code of conduct.

How can I help?

All kinds of contributions are welcome 🙌! The most basic way to show your support is to star 🌟 the project, or to raise issues 💬. You can also support this project by becoming a sponsor on GitHub 👏 or by making a bitcoin donation to ensure this journey continues indefinitely! 🚀



📝 License

License

Owner

  • Name: Mr. Z
  • Login: mrz1836
  • Kind: user
  • Location: BitCoin
  • Company: @skyetel @buxorg @BitcoinSchema @tonicpow @bitcoin-sv

#DevOps #Go #BitCoin

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you find go-sanitize useful in your software or research, please cite this project using the following citation metadata. A citation helps support future development and encourages best practices in software reuse."
title: "go-sanitize"
version: "1.5.2"
url: "https://github.com/mrz1836/go-sanitize"
repository: "https://github.com/mrz1836/go-sanitize"
repository-code: "https://github.com/mrz1836/go-sanitize"
date-released: 2024-04-24
license: "Apache-2.0"
type: software
authors:
  - family-names: "Z"
    given-names: "Mr"
    name: "mrz1836"
keywords:
  - data-cleaning
  - data-sanitization
  - data-validation
  - go
  - go-module
  - go-package
  - golang
  - golang-library
  - html-sanitization
  - input-sanitization
  - input-validation
  - normalization
  - safe-input
  - sanitize
  - string-normalization
  - string-sanitization
  - url-validation
  - user-input
  - xss
abstract: "A lightweight Go library for sanitizing and normalizing strings, HTML, and URLs. It offers robust utilities for input validation, data cleaning, and safe string handling in Go applications."

GitHub Events

Total
  • Release event: 7
  • Watch event: 8
  • Delete event: 65
  • Issue comment event: 78
  • Push event: 207
  • Pull request review comment event: 2
  • Pull request review event: 35
  • Pull request event: 130
  • Fork event: 1
  • Create event: 65
Last Year
  • Release event: 7
  • Watch event: 8
  • Delete event: 65
  • Issue comment event: 78
  • Push event: 207
  • Pull request review comment event: 2
  • Pull request review event: 35
  • Pull request event: 130
  • Fork event: 1
  • Create event: 65

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 308
  • Total Committers: 3
  • Avg Commits per committer: 102.667
  • Development Distribution Score (DDS): 0.211
Past Year
  • Commits: 35
  • Committers: 2
  • Avg Commits per committer: 17.5
  • Development Distribution Score (DDS): 0.429
Top Committers
Name Email Commits
mrz1836 m****8@g****m 243
dependabot[bot] 4****] 63
Luke r****z@g****m 2

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 2
  • Total pull requests: 237
  • Average time to close issues: 2 minutes
  • Average time to close pull requests: about 3 hours
  • Total issue authors: 2
  • Total pull request authors: 4
  • Average comments per issue: 0.5
  • Average comments per pull request: 0.87
  • Merged pull requests: 215
  • Bot issues: 0
  • Bot pull requests: 91
Past Year
  • Issues: 1
  • Pull requests: 175
  • Average time to close issues: 2 minutes
  • Average time to close pull requests: about 1 hour
  • Issue authors: 1
  • Pull request authors: 3
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.87
  • Merged pull requests: 154
  • Bot issues: 0
  • Bot pull requests: 30
Top Authors
Issue Authors
  • ningzio (1)
Pull Request Authors
  • mrz1836 (132)
  • dependabot[bot] (97)
  • gouravkhunger (2)
  • rohenaz (1)
Top Labels
Issue Labels
Pull Request Labels
codex (130) feature (120) chore (84) update (14) automerge (8) dependencies (4) dependabot (4) size/XS (4) prod-dependency (4) test (3) minor-update (3) documentation (2) github-actions (2) gomod (2) idea (1) patch-update (1)

Packages

  • Total packages: 1
  • Total downloads: unknown
  • Total docker downloads: 1,701
  • Total dependent packages: 37
  • Total dependent repositories: 22
  • Total versions: 39
proxy.golang.org: github.com/mrz1836/go-sanitize

Package sanitize (go-sanitize) implements a simple library of various sanitization methods for data transformation. This package provides a collection of functions to sanitize and transform different types of data, such as strings, URLs, email addresses, and more. It is designed to help developers clean and format input data to ensure it meets specific criteria and is safe for further processing.

Features:
  • Sanitize alpha and alphanumeric characters
  • Sanitize Bitcoin and Bitcoin Cash addresses
  • Custom regex-based sanitization
  • Sanitize decimal numbers and scientific notation
  • Sanitize domain names, email addresses, and IP addresses
  • Remove HTML/XML tags and scripts
  • Sanitize URIs and URLs
  • Handle XSS attack strings

Usage: To use this package, import it and call the desired sanitization function with the input data. Each function is documented with examples in the `sanitize_example_test.go` file. If you have any suggestions or comments, please feel free to open an issue on this project's GitHub page.

  • Versions: 39
  • Dependent Packages: 37
  • Dependent Repositories: 22
  • Docker Downloads: 1,701
Rankings
Dependent packages count: 0.7%
Dependent repos count: 1.2%
Average: 5.2%
Stargazers count: 7.6%
Forks count: 11.2%
Last synced: 6 months ago

Dependencies

go.mod go
  • github.com/stretchr/testify v1.8.0
go.sum go
  • github.com/davecgh/go-spew v1.1.0
  • github.com/davecgh/go-spew v1.1.1
  • github.com/pmezard/go-difflib v1.0.0
  • github.com/stretchr/objx v0.1.0
  • github.com/stretchr/objx v0.4.0
  • github.com/stretchr/testify v1.7.1
  • github.com/stretchr/testify v1.8.0
  • gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405
  • gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c
  • gopkg.in/yaml.v3 v3.0.1
.github/workflows/codeql-analysis.yml actions
  • actions/checkout v3 composite
  • github/codeql-action/analyze v2 composite
  • github/codeql-action/autobuild v2 composite
  • github/codeql-action/init v2 composite
.github/workflows/release.yml actions
  • actions/checkout v3 composite
  • actions/setup-go v3 composite
  • goreleaser/goreleaser-action v4.1.0 composite
.github/workflows/run-tests.yml actions
  • actions/cache v3 composite
  • actions/checkout v3 composite
  • actions/setup-go v3 composite
  • codecov/codecov-action v3.1.1 composite
.github/workflows/sync-labels.yml actions
  • actions/checkout v3 composite
  • micnncim/action-label-syncer v1.3.0 composite