metafinder
Search for documents in a domain through search engines (Google, Bing and Baidu). The objective is to extract metadata.
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (10.5%) to scientific vocabulary
Keywords
Repository
Basic Info
Statistics
- Stars: 219
- Watchers: 7
- Forks: 34
- Open Issues: 5
- Releases: 0
Topics
Metadata Files
README.md
MetaFinder
Search for documents in a domain through Search Engines. The objective is to extract metadata.
Installation:
```
pip3 install metafinder
```
Upgrades are also available using:
```
pip3 install metafinder --upgrade
```
Usage
MetaFinder can be used in 2 ways:
CLI
metafinder -d domain.com -l 20 -o folder [-t 10] -go -bi -ba
Parameters:
- d: Specifies the target domain.
- l: Specifies the maximum number of results to retrieve from the search engines.
- o: Specifies the path where the report is saved.
- t: Optional. Sets the number of threads (4 by default).
- v: Shows the MetaFinder version.
- Search engines to select (Google by default):
  - go: Optional. Search in Google.
  - bi: Optional. Search in Bing.
  - ba: Optional. Search in Baidu (experimental).
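When the CLI is driven from a script, it can help to assemble the argument list programmatically. The following is a minimal sketch that uses only the flags listed above; the helper name `build_metafinder_command` and the engine keys `google`, `bing` and `baidu` are assumptions made for illustration and are not part of MetaFinder.

```python
import shlex

# Engine flags as listed in the parameters above; the dictionary keys are
# illustrative names chosen for this sketch.
ENGINE_FLAGS = {"google": "-go", "bing": "-bi", "baidu": "-ba"}


def build_metafinder_command(domain, limit, output_dir, threads=None,
                             engines=("google",)):
    """Return a metafinder invocation as an argument list (hypothetical helper)."""
    unknown = [e for e in engines if e not in ENGINE_FLAGS]
    if unknown:
        raise ValueError(f"unknown search engine(s): {unknown}")

    cmd = ["metafinder", "-d", domain, "-l", str(limit), "-o", output_dir]
    if threads is not None:
        # -t is optional; when omitted the tool uses its default.
        cmd += ["-t", str(threads)]
    cmd += [ENGINE_FLAGS[e] for e in engines]
    return cmd


cmd = build_metafinder_command("domain.com", 20, "folder", threads=10,
                               engines=("google", "bing", "baidu"))
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once MetaFinder is installed; passing a list rather than a single string avoids shell quoting issues.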
In Code
```
import metafinder.extractor as metadata_extractor

documents_limit = 5
domain = "targetdomain"

# Search one engine; swap in one of the commented lines to use another.
result = metadata_extractor.extract_metadata_from_google_search(domain, documents_limit)
# result = metadata_extractor.extract_metadata_from_bing_search(domain, documents_limit)
# result = metadata_extractor.extract_metadata_from_baidu_search(domain, documents_limit)

authors = result.get_authors()
software = result.get_software()

# Print every document found, its URL and each metadata field.
for k, v in result.get_metadata().items():
    print(f"{k}:")
    print(f"| URL: {v['url']}")
    for metadata, value in v['metadata'].items():
        print(f"|__ {metadata}: {value}")

# Extract metadata from a single local document.
document_name = "test.pdf"
try:
    metadata_file = metadata_extractor.extract_metadata_from_document(document_name)
    for k, v in metadata_file.items():
        print(f"{k}: {v}")
except FileNotFoundError:
    print("File not found")
```
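The loop in the snippet above implies that the metadata mapping has the shape `{name: {"url": ..., "metadata": {...}}}`. Assuming that shape, a short post-processing sketch could count how often each value of one field appears. The helper `count_field`, the field name `"Author"` and the sample data are all hypothetical; real field names depend on the documents that are found.

```python
from collections import Counter


def count_field(documents, field):
    """Count how often each value of `field` appears across documents."""
    counts = Counter()
    for info in documents.values():
        value = info.get("metadata", {}).get(field)
        if value:  # skip documents where the field is absent or empty
            counts[str(value)] += 1
    return counts


# Hypothetical data in the same shape the printing loop iterates over.
sample = {
    "report.pdf": {
        "url": "https://example.com/report.pdf",
        "metadata": {"Author": "alice", "Producer": "LibreOffice"},
    },
    "slides.pptx": {
        "url": "https://example.com/slides.pptx",
        "metadata": {"Author": "alice"},
    },
    "sheet.xlsx": {
        "url": "https://example.com/sheet.xlsx",
        "metadata": {"Author": "bob"},
    },
}

for author, n in count_field(sample, "Author").most_common():
    print(f"{author}: {n}")
```

With a real search result, the same function could be called on the mapping returned by the extractor instead of `sample`.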
Author
This project has been developed by:
- Josué Encinar García -- @JosueEncinar
Contributors
- Félix Brezo Fernández -- @febrezo
Disclaimer!
The software is designed to leave no trace in the documents we upload to a domain. The author is not responsible for any illegitimate use.
Owner
- Name: Josué Encinar
- Login: Josue87
- Kind: user
- Location: Madrid
- Company: IriusRisk
- Website: www.boomernix.com
- Twitter: JosueEncinar
- Repositories: 8
- Profile: https://github.com/Josue87
Security Researcher / Offensive Security Enthusiast
GitHub Events
Total
- Watch event: 24
- Issue comment event: 1
- Pull request event: 1
- Fork event: 3
Last Year
- Watch event: 24
- Issue comment event: 1
- Pull request event: 1
- Fork event: 3
Committers
Last synced: over 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Josue87 | j****7@g****m | 24 |
| Josué Encinar | J****7 | 5 |
| Lucas Fernandez | l****n@g****m | 2 |
| abdallaEG | 5****G | 1 |
| febrezo | f****o@d****g | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 9
- Total pull requests: 7
- Average time to close issues: 9 days
- Average time to close pull requests: 6 days
- Total issue authors: 9
- Total pull request authors: 7
- Average comments per issue: 1.0
- Average comments per pull request: 0.43
- Merged pull requests: 4
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 1.0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- truesamurai (1)
- vicmac-github (1)
- alanEG (1)
- Bartates (1)
- lucferbux (1)
- wb4r (1)
- landaboot (1)
- mlinton (1)
- sec0ps (1)
Pull Request Authors
- zblurx (2)
- mlinton (2)
- lucferbux (1)
- six2dez (1)
- alanEG (1)
- AxylumRust (1)
- febrezo (1)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads:
  - pypi: 1,547 last month
- Total dependent packages: 0
- Total dependent repositories: 87
- Total versions: 13
- Total maintainers: 1
pypi.org: metafinder
MetaFinder - Metadata search through Search Engines
- Homepage: https://github.com/Josue87/MetaFinder
- Documentation: https://metafinder.readthedocs.io/
- License: GNU GPLv3+
- Latest release: 1.2 (published over 4 years ago)
Rankings
Maintainers (1)
Dependencies
- beautifulsoup4 >=4.9.3
- openpyxl >=3.0.5
- pikepdf >=2.5.2
- prompt-toolkit >=3.0.5
- python-docx >=0.8.6
- python-pptx >=0.6.18
- requests >=2.25.1
- urllib3 >=1.26.4
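To check whether an environment meets the minimum versions listed above, the standard library's `importlib.metadata` can be queried. This is a rough sketch under stated assumptions: `parse_version` is a deliberately simple helper that compares only the leading numeric components, and `check_dependencies` is a hypothetical name; for strict version semantics the `packaging` library would be the better choice.

```python
from importlib import metadata

# Minimum versions copied from the dependency list above.
REQUIRED = {
    "beautifulsoup4": "4.9.3",
    "openpyxl": "3.0.5",
    "pikepdf": "2.5.2",
    "prompt-toolkit": "3.0.5",
    "python-docx": "0.8.6",
    "python-pptx": "0.6.18",
    "requests": "2.25.1",
    "urllib3": "1.26.4",
}


def parse_version(text):
    """Keep only the leading numeric components, e.g. '2.5.2rc1' -> (2, 5, 2)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_dependencies(required, get_version=metadata.version):
    """Map each package name to 'ok', 'missing' or a 'too old' message."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if parse_version(installed) >= parse_version(minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


for name, status in check_dependencies(REQUIRED).items():
    print(f"{name}: {status}")
```

The `get_version` parameter exists only so the lookup can be replaced in tests; by default the installed distributions are queried.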