Questions tagged [edgar]

EDGAR is an information system of the U.S. Securities and Exchange Commission (SEC) that holds company data. Use this tag for questions about parsing and querying that data and about EDGAR's public APIs.

EDGAR stands for Electronic Data Gathering, Analysis, and Retrieval. The system uses several data formats, including the classic SGML-based format, the XML-based XBRL format for business reporting, and many more.

120 questions

1 answer

word count from web text document result in 0

I tried the Python code from Rasha Ashraf's article "Scraping EDGAR with Python". The article uses urllib2, which I guess is no longer available in Python 3, so I changed it to urllib. I could retrieve the following EDGAR web page. However, the number of…

asked Nov 12 '20 at 21:40

Jason SJ Yim
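
For the urllib2 question above, a minimal sketch of a Python 3 equivalent using the standard-library urllib.request. It assumes the zero word count might come from counting on undecoded bytes, which is only one possible cause, since the asker's code is not shown here. The helper names (build_request, fetch_text, count_words) and the placeholder User-Agent string are hypothetical.

```python
import urllib.request


def build_request(url, user_agent="Sample Name sample@example.com"):
    """Build a request carrying an identifying User-Agent header.

    The default value is a placeholder; replace it with your own contact details.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_text(url):
    """Download a document and decode it to str (urlopen returns bytes in Python 3)."""
    with urllib.request.urlopen(build_request(url)) as response:
        return response.read().decode("utf-8", errors="replace")


def count_words(text):
    """Count whitespace-separated tokens in already-decoded text."""
    return len(text.split())
```

Calling count_words(fetch_text(url)) would then operate on a str rather than on raw bytes. Whether this resolves the zero count depends on details of the original code that the excerpt does not show.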

1 answer

How to parse 10-Q reports from EDGAR API in python?

I'm trying to use the EDGAR API to retrieve the 10-Q for any given company (corresponding to the CIK value provided). This code retrieves the most recent 10-Q for Tesla. There are about 30 methods attached to this object, such as keys, values, items, and…

python xml edgar

asked Aug 03 '20 at 20:50

jbuddy_13

0 answers

Parsing unstructured txt files and extracting tables

I would like to parse old-style EDGAR txt files from the SEC containing different filings with free financial data, but it is far from trivial to parse a txt file that only resembles a table and extract this data. Here is the link to the example file I…

python text-parsing edgar

asked Jun 24 '20 at 22:53

kuatroka

0 answers

Cleaning SEC filings

I am currently trying to clean 10-K filings (2690 to be exact) in order to get the pure text (without HTML tags etc.). Among other things, I would like to calculate readability scores in the next step. However, cleaning the text is becoming a larger…

python html edgar

asked May 21 '20 at 13:00

Sebastian
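
For the filing-cleaning question above, a minimal sketch of tag stripping with the standard-library html.parser. It is not the asker's method; the names TextExtractor and html_to_text are hypothetical, and real filings are likely to need more handling (tables, embedded documents, page headers) than this shows.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if not self._skip_depth:
            self._parts.append(data)

    def text(self):
        # Join the fragments and collapse all runs of whitespace.
        return " ".join(" ".join(self._parts).split())


def html_to_text(markup):
    """Return the visible text of an HTML fragment as a single line."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return parser.text()
```

Joining fragments with a space keeps adjacent table cells apart, at the cost of splitting a word that is partly wrapped in an inline tag. Which trade-off suits readability scoring is a judgment call.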

0 answers

SEC EDGAR 20-F Form - How to process text that contains html tags

I have downloaded the following 20-F Form from SEC EDGAR: https://www.sec.gov/Archives/edgar/data/1729089/000121390019021541/0001213900-19-021541.txt As you can see, the .txt file contains multiple HTML tags such as:

python html regex edgar

asked May 17 '20 at 09:30

adrCoder

0 answers

Google Sheets: querying sec.gov for the latest filings for a given company

I've had a ton of help recently from the SO community and I'd first just like to say thank you to everyone! My latest Google Sheet pursuit is querying sec.gov for the latest filing for a given ticker. I'm not trying to scrape the site, I just want…

google-sheets google-sheets-formula google-sheets-api google-sheets-query edgar

asked Apr 24 '20 at 21:31

SOtoTheRescue

0 answers

Count keywords in SEC Edgar 10-K filings text-body with Python

I am trying to parse the text section of the SEC Edgar texts in Python 3, e.g.: https://www.sec.gov/Archives/edgar/data/796343/0000796343-14-000004.txt My goal is to collect the number of occurrences in the visible text body of the 10-K statements…

python parsing edgar

asked Apr 11 '20 at 08:46

dernuco
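
For the keyword-count question above, a minimal sketch of counting whole-word occurrences with the standard library. It assumes the text has already been reduced to the visible body and that every keyword is a single word; count_keywords is a hypothetical name, not part of any EDGAR tooling.

```python
import re
from collections import Counter


def count_keywords(text, keywords):
    """Count case-insensitive whole-word occurrences of each keyword."""
    # Tokenise into lowercase runs of letters, digits, and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter(words)
    # Counter returns 0 for words that never appear.
    return {keyword: counts[keyword.lower()] for keyword in keywords}
```

For example, count_keywords("Risk factors: risk, RISK and uncertainty.", ["risk", "uncertainty", "loss"]) returns {"risk": 3, "uncertainty": 1, "loss": 0}. Multi-word phrases would need a different approach, such as a regex search per phrase.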

2 answers

REGEX extract information from EDGAR SC-13 form

I am trying to extract information from the latest SEC EDGAR Schedule 13 forms filings. Link of the filing as an example: 1) Saba Capital_27-Dec-2019_SC13 The information I am trying to extract (and the parts of the filing with the information)…

python regex beautifulsoup finance edgar

asked Dec 30 '19 at 11:43

Lko

1 answer

Repairing broken html table extracted with BS4 in python

I am parsing html tables from administrative filings. It is tricky as the html is often broken and this results in poorly constructed tables. Here is an example of table that I load into a pandas dataframe: 0 1 2 3 4 …

python pandas edgar

asked Aug 09 '19 at 13:12

user1029296

1 answer

Cycle names thru list

I have 3 simple lines of code which pull S-1 filings from the SEC's "Edgar" database and put them into a folder I specify. This uses the "sec Edgar downloader." It works great, but I have to do this for about 1400 companies. I have the list of…

python python-3.x indexing edgar

asked Apr 11 '19 at 09:06

Steve Arguin

2 answers

Regex capture lines A, B, or C in any order only when not preceded by D

I have a file with content something like this: SUBJECT COMPANY: COMPANY DATA: COMPANY CONFORMED NAME: MISCELLANEOUS SUBJECT CORP CENTRAL INDEX KEY: 0000000000 STANDARD INDUSTRIAL CLASSIFICATION: …

regex text multiline edgar

asked Aug 21 '18 at 19:36

Matthew
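
For the header-parsing question above, a minimal sketch that pulls "NAME: value" lines out of a header with a multiline regex. It covers only the two field names visible in the excerpt and does not attempt the "not preceded by D" condition, whose details are cut off. FIELD and header_fields are hypothetical names, and the one-field-per-line layout is an assumption.

```python
import re

# One field per line: optional indentation, a known field name, a colon, the value.
FIELD = re.compile(
    r"^[ \t]*(COMPANY CONFORMED NAME|CENTRAL INDEX KEY):[ \t]*(.+?)[ \t]*$",
    re.MULTILINE,
)


def header_fields(header_text):
    """Return a dict of the recognised header fields.

    If a field name occurs more than once, the last occurrence wins.
    """
    return {name: value for name, value in FIELD.findall(header_text)}


sample = (
    "SUBJECT COMPANY:\n"
    "\tCOMPANY DATA:\n"
    "\t\tCOMPANY CONFORMED NAME:\t\tMISCELLANEOUS SUBJECT CORP\n"
    "\t\tCENTRAL INDEX KEY:\t\t0000000000\n"
)
fields = header_fields(sample)
```

Here fields maps "COMPANY CONFORMED NAME" to "MISCELLANEOUS SUBJECT CORP" and "CENTRAL INDEX KEY" to "0000000000". Keeping the CIK as a string preserves its leading zeros.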

1 answer

Retrieve EBIT from XBRL documents

It appears that EBIT information is not very uniform across different XBRL documents. Cross comparing data with other sources, such as Yahoo, I have seen some XBRL use the fact us-gaap:OperatingIncomeLoss to store it if using US-GAAP, or…

xbrl edgar

asked Jul 26 '18 at 16:20

Guilherme Caminha

1 answer

Use Arelle to export XLSX file

I'm trying to use Arelle to export a XLSX file from a zip of XBRL files. It works just fine when I use the EdgarRenderer plugin. ./arelleCmdLine -f data/goog-20151231.xml.zip --plugins EdgarRenderer --disclosureSystem efm-pragmatic --validate -r…

python finance xbrl edgar arelle

asked Dec 15 '17 at 12:49

AppTest

1 answer

How to scrape individual paragraphs from SEC 10-Ks

I am working on a project where I need to break up 10-Ks into their constituent paragraphs. For some 10-Ks I am able to do something simple like soup.find_all('p'), but I am also seeing other 10-Ks that use <div> for everything instead of <p> tags.…

python html css beautifulsoup edgar

asked May 09 '17 at 18:44

Leo

1 answer

Querying Securities and Exchange Commission (SEC) using EDGAR

I'm working on a project that allows the user to pull information both from the SEC and on the company's traded stock using the company's stock ticker. Now, in order for me to be able to retrieve information from the SEC using the stock ticker ONLY,…

ruby-on-rails edgar

asked Feb 18 '17 at 15:40

Crashtor
