htmldate
htmldate: A Python package to extract publication dates from web pages - Published in JOSS (2020)
gpt-researcher
LLM-based autonomous agent that conducts deep local and web research on any topic and generates a long report with citations.
mcp-selenium-grid
[DEV PRE-RELEASE, NOT READY, TESTING IN MCP CLIENTS] A Model Context Protocol (MCP) server that enables AI agents to request and manage Selenium browser instances through a secure API. Perfect for your automated browser testing needs! 🚀
https://github.com/aryashah2k/fakenewsdetectionspacy
A customized fake news detection system built using Streamlit with a spaCy NLP pipeline. The web app contains frontend, backend, and database integration. Features: detect fake news for COVID and the Afghanistan crisis, sign in/log in, view history, and get the sentiment of a news article.
https://github.com/ahasverus/gpack
R package to web scrape G**gle services
opennews
OpenNews is a REST API made in Python to extract news from Portuguese journals. It is intended for academic use.
https://github.com/aariq/cupcakes-vs-muffins
Are cupcakes empirically different than muffins? Let's find out!
glimpser
A simple tool for real-time video monitoring and summarization with LLMs
https://github.com/adithya-s-k/discoverydino
A robust and scalable system that proactively identifies new generally available (GA) software products and checks their availability on the G2 software marketplace. The goal is to compile a list of products that are not yet listed on G2, simplifying the process of onboarding them onto the platform.
https://github.com/cimentadaj/scrapex
Completely self-standing web scraping/API examples for eternity