Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (9.2%) to scientific vocabulary
Repository
A small library to access files from SEC's edgar
Basic Info
Statistics
- Stars: 240
- Watchers: 11
- Forks: 51
- Open Issues: 4
- Releases: 8
Metadata Files
README.md
EDGAR
A small library to access files from SEC's edgar.
Installation
pip install edgar
Example
To get a company's latest 5 10-Ks, run
```python
from edgar import Company

company = Company("Oracle Corp", "0001341439")
tree = company.get_all_filings(filing_type="10-K")
docs = Company.get_documents(tree, no_of_documents=5)
```
or
```python
from edgar import Company, TXTML

company = Company("INTERNATIONAL BUSINESS MACHINES CORP", "0000051143")
doc = company.get_10K()
text = TXTML.parse_full_10K(doc)
```
To get all companies and find a specific one, run
```python
from edgar import Edgar

edgar = Edgar()
possible_companies = edgar.find_company_name("Cisco System")
```
To avoid pulling all company data from sec.gov when Edgar is initialized, pass in a local path to the data
```python
from edgar import Edgar

edgar = Edgar("/path/to/cik-lookup-data.txt")
possible_companies = edgar.find_company_name("Cisco System")
```
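As a rough illustration of what a local lookup file might contain, the sketch below parses text in the `NAME:CIK:` line format. Both that line format and the helper `parse_cik_lookup` are assumptions made for this example; they are not taken from the library's code. The sample rows reuse the company names and CIKs from the examples above.

```python
from typing import Dict


def parse_cik_lookup(text: str) -> Dict[str, str]:
    """Map company name -> CIK for lines shaped like 'NAME:CIK:' (assumed format)."""
    companies = {}
    for line in text.splitlines():
        # Drop surrounding whitespace and the trailing colon, if any.
        line = line.strip().rstrip(":")
        if not line:
            continue
        # Company names may themselves contain colons, so split on the last one.
        name, _, cik = line.rpartition(":")
        if name and cik.isdigit():
            companies[name] = cik
    return companies


# Illustrative sample data, not an excerpt from the real file.
sample = (
    "ORACLE CORP:0001341439:\n"
    "INTERNATIONAL BUSINESS MACHINES CORP:0000051143:\n"
)
lookup = parse_cik_lookup(sample)
print(lookup["ORACLE CORP"])
```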
To get XBRL data, run
```python
from edgar import Company, XBRL, XBRLElement

company = Company("Oracle Corp", "0001341439")
results = company.get_data_files_from_10K("EX-101.INS", isxml=True)
xbrl = XBRL(results[0])
XBRLElement(xbrl.relevant_children_parsed[15]).to_dict()  # returns a dictionary of name, value, and schemaRef
```
API
Company
```python
Company(name, cik, timeout=10)
```
* name (company name)
* cik (company CIK number)
* timeout (optional) (default: 10)
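The examples in this README pass the CIK as a 10-character, zero-padded string. If your CIKs come from another source as integers or short strings, a small helper along these lines could normalize them first. `normalize_cik` is a hypothetical helper for illustration, not part of the library, and whether `Company` accepts unpadded values is not stated here.

```python
def normalize_cik(cik) -> str:
    """Return the CIK as a 10-digit, zero-padded string."""
    digits = str(cik).strip()
    # Reject anything that is not purely numeric or is too long to be padded.
    if not digits.isdigit() or len(digits) > 10:
        raise ValueError(f"not a valid CIK: {cik!r}")
    return digits.zfill(10)


print(normalize_cik(1341439))       # 0001341439
print(normalize_cik("0000051143"))  # 0000051143
```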
Methods
get_filings_url(self, filing_type="", prior_to="", ownership="include", no_of_entries=100) -> str
Returns a URL to fetch filings data
* filing_type: The type of document you want, e.g. 10-K, S-8, 8-K. If not specified, it'll return all documents
* prior_to: Time prior to which documents are to be retrieved. If not specified, it'll return all documents
* ownership: defaults to include. Options are include, exclude, only.
* no_of_entries: defaults to 100. The number of entries to be returned. Maximum is 100.
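To make the parameters above concrete, here is a minimal sketch of how such a URL could be assembled with the standard library. The `browse-edgar` endpoint, the query parameter names, and the function `build_filings_url` are assumptions for illustration; the library's own implementation may differ.

```python
from urllib.parse import urlencode

# Assumed endpoint for EDGAR's company filing browser.
BASE_URL = "https://www.sec.gov/cgi-bin/browse-edgar"


def build_filings_url(cik, filing_type="", prior_to="",
                      ownership="include", no_of_entries=100):
    """Build a filings-listing URL (hypothetical stand-in for get_filings_url)."""
    params = {
        "action": "getcompany",
        "CIK": cik,
        "type": filing_type,
        "dateb": prior_to,
        "owner": ownership,
        # The documentation above states a maximum of 100 entries.
        "count": min(no_of_entries, 100),
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_filings_url("0001341439", filing_type="10-K", no_of_entries=5)
print(url)
```

Empty values for `filing_type` and `prior_to` are sent as empty query parameters here, which mirrors the "if not specified, return all documents" behaviour described above only under the assumption that the endpoint treats empty filters as absent.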
get_all_filings(self, filing_type="", prior_to="", ownership="include", no_of_entries=100) -> lxml.html.HtmlElement
Returns the HTML in the form of lxml.html
* filing_type: The type of document you want, e.g. 10-K, S-8, 8-K. If not specified, it'll return all documents
* prior_to: Time prior to which documents are to be retrieved. If not specified, it'll return all documents
* ownership: defaults to include. Options are include, exclude, only.
* no_of_entries: defaults to 100. The number of entries to be returned. Maximum is 100.
get_10Ks(self, no_of_documents=1, as_documents=False) -> List[lxml.html.HtmlElement]
Returns the concatenation of all the documents in the 10-K, as HTML in the form of lxml.html
* no_of_documents (default: 1): number of documents to be retrieved
* When as_documents is set to True, it returns -> List[edgar.document.Documents] a list of Documents
get_10Ks_metadata(self) -> List[dict]
Returns the metadata of all the documents in the 10-K, as a list of dictionaries
get_document_type_from_10K(self, document_type, no_of_documents=1) -> List[lxml.html.HtmlElement]
Returns the document within the 10-K, as HTML in the form of lxml.html
* document_type: The type of document you want, e.g. 10-K, EX-3.2
* no_of_documents (default: 1): number of documents to be retrieved
get_data_files_from_10K(self, document_type, no_of_documents=1, isxml=False) -> List[lxml.html.HtmlElement]
Returns the data file within the 10-K, as HTML in the form of lxml.html
* document_type: The type of document you want, e.g. EX-101.INS
* no_of_documents (default: 1): number of documents to be retrieved
* isxml (default: False): by default, the file is parsed with html in lxml, which is not case sensitive. If this is True, it is parsed with etree, which is case sensitive
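The case-sensitivity difference behind `isxml` matters for XBRL, where element names are mixed case. The sketch below shows the general effect using Python's standard-library parsers as stand-ins for lxml's html and etree parsers; it is an analogy under that assumption, not the library's code. `TagCollector` is a hypothetical helper.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Record the name of every start tag the HTML parser sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


snippet = "<xbrl><NetIncomeLoss>100</NetIncomeLoss></xbrl>"

# HTML parsing: tag names are converted to lower case.
collector = TagCollector()
collector.feed(snippet)
html_tags = collector.tags

# XML parsing: tag names keep their original case.
xml_tags = [element.tag for element in ET.fromstring(snippet).iter()]

print(html_tags)  # ['xbrl', 'netincomeloss']
print(xml_tags)   # ['xbrl', 'NetIncomeLoss']
```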
Class Method
get_documents(self, tree: lxml.html.HtmlElement, no_of_documents=1, debug=False, as_documents=False) -> List[lxml.html.HtmlElement]
Returns a list of strings, where each string contains the body of the specified document from the input
- tree: lxml.html form that is returned from Company.get_all_filings
- no_of_documents: number of documents returned. If it is 1, the returned result is just one string, instead of a list of strings. Defaults to 1.
- debug (default: False): if True, displays the URL and form
- When as_documents is set to True, it returns -> List[edgar.document.Documents] a list of Documents
Edgar
Gets all companies from EDGAR
get_cik_by_company_name(company_name: str) -> str: Returns the CIK if given the exact name of the company
get_company_name_by_cik(cik: str) -> str: Returns the company name if given the CIK (including the leading zeros)
find_company_name(words: str) -> List[str]: Returns a list of company names by exact word matching
find_company_name_cik(words: str) -> List[tuple[str, str]]: Return a list of company names and their CIK values
match_company_by_company_name(self, name, top=5) -> List[Dict[str, Any]]: Returns a list of dictionaries, with company names, CIK, and their fuzzy match score
* top (default: 5): the number of top fuzzy matches to return. If set to None, it'll return the whole list (which is a lot)
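The idea of ranked fuzzy matching with a `top` cutoff can be sketched with the standard library. The dependency list names fuzzywuzzy, but this example swaps in `difflib.SequenceMatcher` so that it runs without extra packages, which means the scores will not equal the library's. The function `match_company` and the dictionary keys `company_name`, `cik`, and `score` are assumptions for illustration.

```python
from difflib import SequenceMatcher
from typing import Any, Dict, List, Optional


def match_company(name: str, companies: Dict[str, str],
                  top: Optional[int] = 5) -> List[Dict[str, Any]]:
    """Rank known companies by similarity to `name`, best match first."""
    scored = []
    for company_name, cik in companies.items():
        # Case-insensitive similarity ratio, scaled to 0-100.
        ratio = SequenceMatcher(None, name.lower(), company_name.lower()).ratio()
        scored.append({"company_name": company_name, "cik": cik,
                       "score": round(100 * ratio)})
    scored.sort(key=lambda match: match["score"], reverse=True)
    # top=None returns every candidate, mirroring the behaviour described above.
    return scored if top is None else scored[:top]


# Illustrative lookup table built from the companies used in this README.
companies = {
    "ORACLE CORP": "0001341439",
    "INTERNATIONAL BUSINESS MACHINES CORP": "0000051143",
}
matches = match_company("Oracle Corporation", companies, top=1)
print(matches[0]["company_name"])
```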
XBRL
Parses data from XBRL
Properties
relevant_children
* get children that are not context
relevant_children_parsed
* get children that are not context, unit, schemaRef
* cleans tags
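One plausible reading of these two properties is: drop the bookkeeping children (context, unit, schemaRef) and strip namespace prefixes from the remaining tag names. The sketch below implements that reading with the standard-library ElementTree on a tiny hand-written document. The function names, the output shape, and the `example.com` namespace are assumptions for illustration, not the library's implementation.

```python
import xml.etree.ElementTree as ET

# Children that carry bookkeeping rather than reported facts.
EXCLUDED = {"context", "unit", "schemaRef"}


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix ElementTree adds to qualified tags."""
    return tag.rsplit("}", 1)[-1]


def relevant_children_parsed(root):
    """Return name/value pairs for children that are not context, unit, or schemaRef."""
    return [
        {"name": local_name(child.tag), "value": (child.text or "").strip()}
        for child in root
        if local_name(child.tag) not in EXCLUDED
    ]


# Hand-written sample; the us-gaap namespace URI is a placeholder.
sample = """<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:link="http://www.xbrl.org/2003/linkbase"
      xmlns:us-gaap="http://example.com/us-gaap">
  <link:schemaRef/>
  <context id="c1"/>
  <unit id="usd"/>
  <us-gaap:Revenues contextRef="c1" unitRef="usd">1000</us-gaap:Revenues>
</xbrl>"""

facts = relevant_children_parsed(ET.fromstring(sample))
print(facts)  # [{'name': 'Revenues', 'value': '1000'}]
```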
Documents
Filing and Documents Details for the SEC EDGAR Form (such as 10-K)
```python
Documents(url, timeout=10)
```
Properties
url: str: URL of the document
content: dict: Dictionary of metadata of the document
content['Filing Date']: str: Document filing date
content['Accepted']: str: Document accepted datetime
content['Period of Report']: str: The date period that the document is for
element: lxml.html.HtmlElement: The HTML element for the Document (from the url) so it can be further parsed
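Because the `content` values are strings, you may want to convert them to date objects before comparing filings. The sketch below assumes the strings look like `YYYY-MM-DD` and `YYYY-MM-DD HH:MM:SS`; that format, the helper `parse_filing_dates`, and the sample values are assumptions for illustration and should be checked against real output.

```python
from datetime import datetime


def parse_filing_dates(content: dict) -> dict:
    """Convert the date strings in a Documents.content-style dict (assumed formats)."""
    return {
        "filing_date": datetime.strptime(content["Filing Date"], "%Y-%m-%d").date(),
        "accepted": datetime.strptime(content["Accepted"], "%Y-%m-%d %H:%M:%S"),
        "period_of_report": datetime.strptime(
            content["Period of Report"], "%Y-%m-%d"
        ).date(),
    }


# Made-up sample values, shaped like the properties listed above.
sample_content = {
    "Filing Date": "2020-06-22",
    "Accepted": "2020-06-22 16:05:23",
    "Period of Report": "2020-05-31",
}
parsed = parse_filing_dates(sample_content)
print(parsed["filing_date"].isoformat())
```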
Contribution
Owner
- Name: Joey
- Login: joeyism
- Kind: user
- Location: Toronto, Canada
- Website: http://www.joeyism.com
- Repositories: 55
- Profile: https://github.com/joeyism
Machine Learning Engineer, with a lot of CLI Dev Tools
GitHub Events
Total
- Watch event: 15
- Fork event: 1
Last Year
- Watch event: 15
- Fork event: 1
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 23
- Total pull requests: 7
- Average time to close issues: 4 months
- Average time to close pull requests: 3 months
- Total issue authors: 17
- Total pull request authors: 6
- Average comments per issue: 2.7
- Average comments per pull request: 0.71
- Merged pull requests: 6
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- gregjasonroberts (4)
- Deanc419 (2)
- rpocase (2)
- victor4shen (2)
- auggunner (1)
- eabase (1)
- ipl31 (1)
- bdog1385 (1)
- bosanipietro (1)
- Jurga14 (1)
- chrislakumb (1)
- compusaurusrex (1)
- kostadtk (1)
- lascott (1)
- joezein (1)
Pull Request Authors
- colfax4 (2)
- nickderobertis (1)
- ipl31 (1)
- Koen-kun (1)
- eabase (1)
- kbennatti (1)
Packages
- Total packages: 1
- Total downloads:
  - pypi 2,803 last-month
- Total dependent packages: 0
- Total dependent repositories: 13
- Total versions: 63
- Total maintainers: 1
pypi.org: edgar
Scrape data from SEC's EDGAR
- Homepage: https://github.com/joeyism/py-edgar
- Documentation: https://edgar.readthedocs.io/
- License: gpl-3.0
- Latest release: 5.6.3 (published over 1 year ago)
Rankings
Maintainers (1)
Dependencies
- pytest * development
- fuzzywuzzy *
- lxml *
- requests *
- tqdm *
- package.split *
