edgar

A small library to access files from SEC's edgar

https://github.com/joeyism/py-edgar

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (9.2%) to scientific vocabulary

Keywords

cik edgar sec

Last synced: 6 months ago

Repository

A small library to access files from SEC's edgar

Basic Info

Host: GitHub
Owner: joeyism
License: gpl-3.0
Language: Python
Default Branch: master
Homepage:
Size: 103 KB

Statistics

Stars: 240
Watchers: 11
Forks: 51
Open Issues: 4
Releases: 8

Topics

cik edgar sec

Created over 8 years ago · Last pushed over 1 year ago

Metadata Files

Readme License

EDGAR

A small library to access files from SEC's edgar.

Installation

pip install edgar

Example

To get a company's latest 5 10-Ks, run

```python
from edgar import Company

company = Company("Oracle Corp", "0001341439")
tree = company.get_all_filings(filing_type="10-K")
docs = Company.get_documents(tree, no_of_documents=5)
```

or

```python
from edgar import Company, TXTML

company = Company("INTERNATIONAL BUSINESS MACHINES CORP", "0000051143")
doc = company.get_10K()
text = TXTML.parse_full_10K(doc)
```

To get all companies and find a specific one, run

```python
from edgar import Edgar

edgar = Edgar()
possible_companies = edgar.find_company_name("Cisco System")
```

To avoid pulling all company data from sec.gov when Edgar is initialized, pass in a local path to the data

```python
from edgar import Edgar

edgar = Edgar("/path/to/cik-lookup-data.txt")
possible_companies = edgar.find_company_name("Cisco System")
```
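To make the local-file option more concrete, here is a minimal sketch of reading such a lookup file into a name-to-CIK mapping. It assumes each line has the form `NAME:CIK:`, which is not stated in this README, and `parse_cik_lookup` is a hypothetical helper rather than part of the library. The sample lines are made up.

```python
def parse_cik_lookup(lines):
    """Hypothetical helper: map company names to CIK strings.

    Assumes every non-empty line looks like 'NAME:CIK:'.
    """
    companies = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split from the right, because a company name may itself contain ':'.
        name, cik, _ = line.rsplit(":", 2)
        companies[name] = cik
    return companies


# Made-up sample lines, not real SEC data.
sample_lines = [
    "EXAMPLE SYSTEMS INC:0000000001:",
    "EXAMPLE: HOLDINGS CO:0000000002:",
]
companies = parse_cik_lookup(sample_lines)
```

With a real file, the same function could be fed an open file handle, since iterating over a file yields its lines.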

To get XBRL data, run

```python
from edgar import Company, XBRL, XBRLElement

company = Company("Oracle Corp", "0001341439")
results = company.get_data_files_from_10K("EX-101.INS", isxml=True)
xbrl = XBRL(results[0])
XBRLElement(xbrl.relevant_children_parsed[15]).to_dict()  # returns a dictionary of name, value, and schemaRef
```

API

Company

```python
Company(name, cik, timeout=10)
```
* name (company name)
* cik (company CIK number)
* timeout (optional) (default: 10)
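The examples in this README pass the CIK as a 10-character string with leading zeros. If you hold a CIK as an integer or an unpadded string, a small helper along these lines could normalize it first. `pad_cik` is a hypothetical name, and whether the library actually requires the padded form is not stated here.

```python
def pad_cik(cik):
    """Hypothetical helper: return the CIK as a 10-digit, zero-padded string."""
    digits = str(cik).strip()
    if not digits.isdigit() or len(digits) > 10:
        raise ValueError(f"not a valid CIK: {cik!r}")
    return digits.zfill(10)


oracle_cik = pad_cik("1341439")  # same value as used in the Oracle example
ibm_cik = pad_cik(51143)         # same value as used in the IBM example
```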

Methods

get_filings_url(self, filing_type="", prior_to="", ownership="include", no_of_entries=100) -> str

Returns a URL to fetch filings data
* filing_type: The type of document you want, e.g. 10-K, S-8, 8-K. If not specified, it'll return all documents
* prior_to: Time prior to which documents are to be retrieved. If not specified, it'll return all documents
* ownership: defaults to include. Options are include, exclude, only.
* no_of_entries: defaults to 100. The number of entries to be returned. Maximum is 100.
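As a rough illustration of how these parameters could map onto a query string, the sketch below builds a URL with `urllib.parse.urlencode`. The endpoint and the query parameter names (`action`, `CIK`, `type`, `dateb`, `owner`, `count`) are assumptions based on EDGAR's public company-browse page, and `build_filings_url` is a hypothetical stand-in, not the library's implementation.

```python
from urllib.parse import urlencode

# Assumed endpoint; the URL the library really uses may differ.
BASE_URL = "https://www.sec.gov/cgi-bin/browse-edgar"


def build_filings_url(cik, filing_type="", prior_to="",
                      ownership="include", no_of_entries=100):
    """Hypothetical stand-in that mirrors the documented parameters."""
    if ownership not in ("include", "exclude", "only"):
        raise ValueError("ownership must be include, exclude, or only")
    params = {
        "action": "getcompany",
        "CIK": cik,
        "type": filing_type,
        "dateb": prior_to,
        "owner": ownership,
        # The documentation above gives 100 as the maximum.
        "count": min(no_of_entries, 100),
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_filings_url("0001341439", filing_type="10-K", no_of_entries=5)
```

Clamping `no_of_entries` here is a design choice of the sketch; the README only says that the maximum is 100.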

get_all_filings(self, filing_type="", prior_to="", ownership="include", no_of_entries=100) -> lxml.html.HtmlElement

Returns the HTML in the form of lxml.html
* filing_type: The type of document you want, e.g. 10-K, S-8, 8-K. If not specified, it'll return all documents
* prior_to: Time prior to which documents are to be retrieved. If not specified, it'll return all documents
* ownership: defaults to include. Options are include, exclude, only.
* no_of_entries: defaults to 100. The number of entries to be returned. Maximum is 100.

get_10Ks(self, no_of_documents=1, as_documents=False) -> List[lxml.html.HtmlElement]

Returns the HTML, in the form of lxml.html, of the concatenation of all the documents in the 10-K
* no_of_documents (default: 1): number of documents to be retrieved
* When as_documents is set to True, it returns a list of Documents instead (-> List[edgar.document.Documents])

get_10Ks_metadata(self) -> List[dict]

Returns the metadata of all the documents in the 10-K as a list of dictionaries

get_document_type_from_10K(self, document_type, no_of_documents=1) -> List[lxml.html.HtmlElement]

Returns the HTML, in the form of lxml.html, of the document within the 10-K
* document_type: The type of document you want, e.g. 10-K, EX-3.2
* no_of_documents (default: 1): number of documents to be retrieved

get_data_files_from_10K(self, document_type, no_of_documents=1, isxml=False) -> List[lxml.html.HtmlElement]

Returns the HTML, in the form of lxml.html, of the data file within the 10-K
* document_type: The type of document you want, e.g. EX-101.INS
* no_of_documents (default: 1): number of documents to be retrieved
* isxml (default: False): by default, parsing is not case sensitive and is done with html in lxml. If this is True, the file is parsed with etree, which is case sensitive

Class Method

get_documents(self, tree: lxml.html.HtmlElement, no_of_documents=1, debug=False, as_documents=False) -> List[lxml.html.HtmlElement]

Returns a list of strings, each string contains the body of the specified document from input
* tree: lxml.html form that is returned from Company.get_all_filings
* no_of_documents: number of documents returned. If it is 1, the returned result is just one string, instead of a list of strings. Defaults to 1.
* debug (default: False): if True, displays the URL and form
* When as_documents is set to True, it returns a list of Documents instead (-> List[edgar.document.Documents])
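Because the result is described as a single item when no_of_documents is 1 and a list otherwise, calling code may want to normalize both shapes before looping. A minimal sketch, using a hypothetical helper `as_list` that is not part of the library:

```python
def as_list(result):
    """Hypothetical helper: wrap a single result so callers can always iterate."""
    if isinstance(result, (list, tuple)):
        return list(result)
    return [result]


# Stand-in values; in practice these would come from Company.get_documents.
single = as_list("body of one document")
several = as_list(["first body", "second body"])
```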

Edgar

Gets all companies from EDGAR

get_cik_by_company_name(company_name: str) -> str: Returns the CIK if given the exact name of the company

get_company_name_by_cik(cik: str) -> str: Returns the company name if given the CIK (with the 000s)

find_company_name(words: str) -> List[str]: Returns a list of company names by exact word matching

find_company_name_cik(words: str) -> List[tuple[str, str]]: Returns a list of company names and their CIK values

match_company_by_company_name(self, name, top=5) -> List[Dict[str, Any]]: Returns a list of dictionaries, with company names, CIK, and their fuzzy match score
* top (default: 5): returns the top number of fuzzy matches. If set to None, it'll return the whole list (which is a lot)
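To show the general idea of ranking names by a fuzzy score and keeping the top results, here is a minimal sketch that uses the standard library's `difflib.SequenceMatcher` as a stand-in. The package lists fuzzywuzzy as a dependency, so its real scoring is likely different. The function name, the dictionary keys, and the sample companies below are all made up for illustration.

```python
from difflib import SequenceMatcher


def match_by_name(name, companies, top=5):
    """Hypothetical sketch: rank (company_name, cik) pairs by similarity to name."""
    scored = []
    for company_name, cik in companies:
        ratio = SequenceMatcher(None, name.lower(), company_name.lower()).ratio()
        scored.append({
            "company_name": company_name,
            "cik": cik,
            "score": round(100 * ratio),  # 0-100 scale, chosen for this sketch
        })
    scored.sort(key=lambda match: match["score"], reverse=True)
    # top=None returns every candidate, mirroring the documented behaviour.
    return scored if top is None else scored[:top]


# Made-up sample data, not real SEC entries.
companies = [
    ("EXAMPLE SYSTEMS INC", "0000000001"),
    ("EXAMPLE FOODS CORP", "0000000002"),
    ("UNRELATED HOLDINGS", "0000000003"),
]
matches = match_by_name("Example System", companies, top=2)
```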

XBRL

Parses data from XBRL

Properties

relevant_children
* get children that are not context

relevant_children_parsed
* get children that are not context, unit, schemaRef
* cleans tags
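The filtering described by these two properties can be sketched with the standard library's `xml.etree.ElementTree`. This is not the library's implementation (it depends on lxml), the sample document is made up, and interpreting "cleans tags" as stripping the namespace prefix is an assumption.

```python
import xml.etree.ElementTree as ET

# Made-up, heavily simplified XBRL-like instance document.
SAMPLE = """<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:link="http://www.xbrl.org/2003/linkbase"
      xmlns:us-gaap="http://fasb.org/us-gaap/2019-01-31">
  <link:schemaRef/>
  <context id="c1"/>
  <unit id="usd"/>
  <us-gaap:Revenues contextRef="c1" unitRef="usd">1000</us-gaap:Revenues>
</xbrl>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tags."""
    return tag.rsplit("}", 1)[-1]


def relevant_children(root):
    """Children that are not context elements."""
    return [child for child in root if local_name(child.tag) != "context"]


def relevant_children_parsed(root):
    """Children that are not context, unit, or schemaRef, with cleaned tag names."""
    skipped = {"context", "unit", "schemaRef"}
    return [
        {"name": local_name(child.tag), "value": (child.text or "").strip()}
        for child in root
        if local_name(child.tag) not in skipped
    ]


root = ET.fromstring(SAMPLE)
children = relevant_children(root)
parsed = relevant_children_parsed(root)
```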

Documents

Filing and Documents Details for the SEC EDGAR Form (such as 10-K)

```python
Documents(url, timeout=10)
```

Properties

url: str: URL of the document

content: dict: Dictionary of metadata of the document

content['Filing Date']: str: Document filing date

content['Accepted']: str: Document accepted datetime

content['Period of Report']: str: The date period that the document is for

element: lxml.html.HtmlElement: The HTML element for the Document (from the url) so it can be further parsed
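Since the three content entries are strings, code that compares dates will probably want to parse them first. The sketch below assumes the formats `YYYY-MM-DD` for dates and `YYYY-MM-DD HH:MM:SS` for the accepted datetime. Those formats are an assumption, not something this README states, and the sample dictionary is made up.

```python
from datetime import datetime

# Made-up sample; real values would come from Documents(url).content.
content = {
    "Filing Date": "2020-06-22",
    "Accepted": "2020-06-22 16:31:28",
    "Period of Report": "2020-05-31",
}

# Assumed string formats for each key.
ASSUMED_FORMATS = {
    "Filing Date": "%Y-%m-%d",
    "Accepted": "%Y-%m-%d %H:%M:%S",
    "Period of Report": "%Y-%m-%d",
}


def parse_content_dates(content):
    """Hypothetical helper: convert the known date strings into datetime objects."""
    parsed = {}
    for key, fmt in ASSUMED_FORMATS.items():
        if key in content:
            parsed[key] = datetime.strptime(content[key], fmt)
    return parsed


dates = parse_content_dates(content)
days_to_file = (dates["Filing Date"] - dates["Period of Report"]).days
```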

Contribution

Owner

Name: Joey
Login: joeyism
Kind: user
Location: Toronto, Canada

Website: http://www.joeyism.com
Repositories: 55
Profile: https://github.com/joeyism

Machine Learning Engineer, with a lot of CLI Dev Tools

GitHub Events

Total

Watch event: 15
Fork event: 1

Last Year

Watch event: 15
Fork event: 1

Committers

Last synced: 9 months ago

All Time

Total Commits: 109
Total Committers: 6
Avg Commits per committer: 18.167
Development Distribution Score (DDS): 0.064

Past Year

Commits: 7
Committers: 1
Avg Commits per committer: 7.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
joeyism	s**y@g**m	102
pprice	p**e@d**m	2
Koen Oussoren	K**n@n**m	2
kecarus	k**n@i**t	1
kbennatti	3****i	1
eabase	5****e	1

Committer Domains (Top 20 + Academic)

ipl31.net: 1
nnip.com: 1
dese.com: 1

Issues and Pull Requests

Last synced: 8 months ago

All Time

Total issues: 23
Total pull requests: 7
Average time to close issues: 4 months
Average time to close pull requests: 3 months
Total issue authors: 17
Total pull request authors: 6
Average comments per issue: 2.7
Average comments per pull request: 0.71
Merged pull requests: 6
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

gregjasonroberts (4)
Deanc419 (2)
rpocase (2)
victor4shen (2)
auggunner (1)
eabase (1)
ipl31 (1)
bdog1385 (1)
bosanipietro (1)
Jurga14 (1)
chrislakumb (1)
compusaurusrex (1)
kostadtk (1)
lascott (1)
joezein (1)

Pull Request Authors

colfax4 (2)
nickderobertis (1)
ipl31 (1)
Koen-kun (1)
eabase (1)
kbennatti (1)

Top Labels

Issue Labels

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- pypi 2,803 last-month

Total dependent packages: 0
Total dependent repositories: 13
Total versions: 63
Total maintainers: 1

pypi.org: edgar

Scrape data from SEC's EDGAR

Homepage: https://github.com/joeyism/py-edgar
Documentation: https://edgar.readthedocs.io/
License: gpl-3.0
Latest release: 5.6.3
published over 1 year ago

Versions: 63
Dependent Packages: 0
Dependent Repositories: 13
Downloads: 2,803 Last month

Rankings

Dependent repos count: 4.0%

Stargazers count: 4.6%

Forks count: 5.7%

Average: 6.6%

Downloads: 8.6%

Dependent packages count: 10.0%

Maintainers (1)

joeyism

Last synced: 6 months ago

Dependencies

requirements-dev.txt pypi

pytest * development

requirements.txt pypi

fuzzywuzzy *
lxml *
requests *
tqdm *

setup.py pypi

package.split *
