Updated 5 months ago
https://github.com/adbar/courlan
Clean, filter and sample URLs to optimize data collection – Python & command-line – Deduplication, spam, content and language filters
Updated 6 months ago
adaR
Wrapper for ada-url, a WHATWG-compliant and fast URL parser written in modern C++