1

I have a code that grabs a table from this url

https://www.reuters.com/companies/AAPL.OQ/financials/income-statement-annual

The code is OK and no problem at all except one point. The code gets the table but doesn't get the header

    With http
    .Open "Get", sURL, False
    .send
    html.body.innerHTML = .responseText
End With

   Set tbl = html.getElementsByTagName("Table")(0)

        For Each rw In tbl.Rows
            r = r + 1: c = 1
            For Each cl In rw.Cells
                ws.Cells(r, c).Value = cl.innerText
                c = c + 1
            Next cl
    Next rw

When inspecting the URL, I found that API URL supported

https://www.reuters.com/companies/api/getFetchCompanyFinancials/AAPL.OQ

How can I extract the desired data "annual" for "income" from the JSON response?

I tried to refer to the section I desire but got an error

Const strUrl As String = "https://www.reuters.com/companies/api/getFetchCompanyFinancials/AAPL.OQ"

Sub Test()
Dim a, json As Object, colData As Collection, sFile As String, i As Long

With CreateObject("MSXML2.ServerXMLHTTP.6.0")
    .Open "GET", strUrl
    .send
    Set json = JSONConverter.ParseJson(.responseText)
End With


Set colData = json("market_data")("financial_statements")

Stop
End Sub
0m3r
  • 12,286
  • 15
  • 35
  • 71
YasserKhalil
  • 9,138
  • 7
  • 36
  • 95
  • what was the error? There are numerous income lines- which do you require? – QHarr Apr 30 '20 at 14:39
  • 1
    It is nested dicts with path json►market_data►financial_statements►income►annual – QHarr Apr 30 '20 at 14:45
  • I am lost with those nested dictionaries in fact. – YasserKhalil Apr 30 '20 at 14:59
  • what error are you getting? Your code above looks fine with the exception of whether you need user-agent and if refreshed since headers and you are missing FALSE argument from .Open – QHarr Apr 30 '20 at 15:20
  • As for the first code, it doesn't grab the dates as headers for the data. And as for the second code, I couldn't manipulate the JSON response properly. – YasserKhalil Apr 30 '20 at 15:40
  • For part 2 try: https://pastebin.com/nQw3XEVr. not tested. – QHarr Apr 30 '20 at 16:03
  • ^^ that is written out length ways but easy to transpose – QHarr Apr 30 '20 at 16:15
  • Thanks a lot my tutor. As for the first code, try running it and note the header `Trend` in G1 while B1 to F1 has no headers and this is the point I am talking about. When inspecting the page, you will find dates (these are considered headers in B1 to F1 – YasserKhalil Apr 30 '20 at 17:37
  • As for the second code, I got type mismatch at this line `Set data = json("market_data")("financial_statements")("financial_statements")("income")("annual") ' dict of collections`.. I can solve that point (it seems repeated ..) `Set data = json("market_data")("financial_statements")("income")("annual")` // but got an error object required at this part `Set block = data("key")` – YasserKhalil Apr 30 '20 at 17:40
  • I tried to study the code well and I could make it work `Set block = data(key)`. But in this way, the results are in three columns. I want to get the same output as the table on the link .. the dates would be headers and the Keys would be in column A and the values in the suitable places. Thanks a lot for great and awesome help. – YasserKhalil Apr 30 '20 at 17:44
  • I can't run the code I now have to write vba from head to here. – QHarr Apr 30 '20 at 18:12
  • Why not just try using the [clipboard](https://stackoverflow.com/a/60906400/6241235) and use _clipboard.SetText html.querySelector("table").outerHTML_ – QHarr Apr 30 '20 at 18:14
  • Oops sorry about the "key" that was stupid. – QHarr Apr 30 '20 at 18:15
  • 1
    This is also done in my head I'm afraid https://pastebin.com/JSVDViNu – QHarr Apr 30 '20 at 18:25

2 Answers2

2

logic similar to this should work in vba:

Dim data As Scripting.Dictionary, key As Variant, block As Collection, r As Long, item As Object

Set data = json("market_data")("financial_statements")("financial_statements")("income")("annual") ' dict of collections

r = 1

For Each key In data.keys
    Set block = data(key)  'each block (section of info) is a row
    r = r + 1: c= 2
    For each item In block 'loop columns in block         
        With Activesheet
            If r = 2 then 'write out headers to row 1,starting col2 and then values to row 2 starting from col 2, and key goes in row , col 1
                .Cells(1,c) = item("date")
            End If
            .Cells(r,1) = Key
            .Cells(r,c) = item("value")
        End With
        c = c + 1
    Next
Next

I can't test in VBA but if I write the python (long hand) equivalent I get the same table:

import requests
import pandas as pd

json = requests.get('https://www.reuters.com/companies/api/getFetchCompanyFinancials/AAPL.OQ').json()
data = json["market_data"]["financial_statements"]["income"]["annual"]
rows = len(data.keys()) + 1
columns = len(data["Revenue"]) + 1
r = 0
df = pd.DataFrame(["" for c in range(columns)] for r in range(rows))

for key in data.keys():
    block = data[key]
    r+=1 ; c = 1
    for item in block:
        if r == 1:
            df.iloc[0 , c] = item["date"]
        df.iloc[r,c] = item["value"]
        df.iloc[r,0] = key
        c+=1
print(df)
QHarr
  • 83,427
  • 12
  • 54
  • 101
  • Thanks a lot my tutor. Note the item "Total Extraordinary Items" the value -5151 is in 28-09-2019 while in the JSON response it is `2018-09-29`. It is supposed each value to the related date. May be there are more than 6 dates. – YasserKhalil Apr 30 '20 at 19:55
  • Is it? In python it is in the right place. Can you screenshot the output from vba? – QHarr Apr 30 '20 at 20:26
  • Added a screen shot. – YasserKhalil Apr 30 '20 at 20:31
  • 1
    That is correct. That is how it appears on the webpage – QHarr Apr 30 '20 at 20:33
  • Yes I know that but as for the API that is not correct. Review the item `Total Extraordinary Items` and the related date.. to make sure By the way, is API response more accurate or what appears on the table on the website? – YasserKhalil Apr 30 '20 at 20:36
  • I have put my solution but of course too long as I am trying to solve it anyway. Can you have a look at the post and guide me to a better approach? – YasserKhalil Apr 30 '20 at 20:47
  • 1
    where have you put your solution? Also, working code is a likely candidate for code review site. As for the mismatch that is not something I can answer. Perhaps json response shows reference date and this default to first column? Perhaps someone messed up with website. I really don't know and don't understand Corporate Finance beyond reading a book on it ~20 years ago. – QHarr Apr 30 '20 at 20:57
  • I have put my solution in the main post. – YasserKhalil Apr 30 '20 at 20:58
  • 1
    You should post it as an answer not edit it into your post. Also, remove the image. It keeps the post tidy and makes it easier for people to follow. – QHarr Apr 30 '20 at 21:04
2

After so many hours, I could adjust it like that

Const strUrl As String = "https://www.reuters.com/companies/api/getFetchCompanyFinancials/"

Sub GetData()
    Dim ws As Worksheet, sSection As String

    For Each ws In ThisWorkbook.Worksheets(Array("IS", "BS", "CF"))
        Select Case ws.Name
            Case "IS": sSection = "income"
            Case "BS": sSection = "balance_sheet"
            Case "CF": sSection = "cash_flow"
        End Select

        GetReuters ws, "tbl" & ws.Name, Sheets("Data").Range("B1").Value, sSection, Sheets("Data").Range("B2").Value
    Next ws
End Sub

Sub GetReuters(ByVal ws As Worksheet, ByVal tblName As String, ByVal sTicker As String, ByVal sSection As String, ByVal sTime As String)
    Dim a, ky, col As Collection, json As Object, data As Object, dic As Object, rng As Range, i As Long, k As Long, c As Long

    With CreateObject("MSXML2.ServerXMLHTTP.6.0")
        .Open "GET", strUrl & sTicker
        .send
        Set json = JSONConverter.ParseJson(.responseText)
    End With

    ReDim b(1 To 10000, 1 To 7)
    c = 1: b(1, c) = "Dates"

    Set data = json("market_data")("financial_statements")(sSection)(sTime)
    Set dic = CreateObject("Scripting.Dictionary")
    dic.CompareMode = 1

    For Each ky In data.Keys
        Set col = data(ky)
        a = CollectionToArray(col)
        k = k + 1
        b(k + 1, 1) = ky

        For i = LBound(a) To UBound(a)
            If Not dic.Exists(CStr(a(i, 1))) Then
                dic(CStr(a(i, 1))) = c
                c = c + 1

                b(1, c) = CStr(a(i, 1))
                b(k + 1, c) = a(i, 2)

            Else
                b(k + 1, dic.item(CStr(a(i, 1))) + 1) = a(i, 2)
            End If
        Next i

        Erase a
    Next ky

    Application.ScreenUpdating = False
        With ws
            On Error Resume Next
                .ListObjects(tblName).Delete
            On Error GoTo 0
            .Range("A1").Resize(k + 1, UBound(b, 2)).Value = b
            With .Range("A1").CurrentRegion
                Set rng = .Offset(1, 1).Resize(.Rows.Count - 1, .Columns.Count - 1)
                rng.NumberFormat = "#,##0.00;(#,##0.00)"
                rng.Rows(1).Offset(-1).NumberFormat = "dd-mmm-yy"
                .Columns.AutoFit
            End With

            .ListObjects.Add(xlSrcRange, .Range("A1").CurrentRegion, , xlYes).Name = tblName
        End With
    Application.ScreenUpdating = True
End Sub

Function CollectionToArray(ByVal c As Collection) As Variant()
    Dim a(), i As Long
    ReDim a(1 To c.Count, 1 To 2)

    For i = 1 To c.Count
        a(i, 1) = c.item(i)("date")
        a(i, 2) = c.item(i)("value")
    Next i

    CollectionToArray = a
End Function
YasserKhalil
  • 9,138
  • 7
  • 36
  • 95
  • I have posted the whole code and removed the comments. Just comments for the values in Data sheet to guide the others. – YasserKhalil Apr 30 '20 at 21:18
  • 1
    Yes in my PC it is perfect but when copying and pasting here I am suffering of that point. I don't know till now how to post the code as it is on my one side. The only perfect one is to post as HTML code – YasserKhalil Apr 30 '20 at 21:43
  • Much better! You can remove _Set json = Nothing: Set colData = Nothing: Set jsonMarket = Nothing: Set jsonSection = Nothing: Set rng = Nothing_ as you exit sub so these go out of scope and are de-referenced. – QHarr Apr 30 '20 at 21:46
  • Thanks a lot for pointing out that. I though this is a good practice to clear memory after each worksheet manipulation. – YasserKhalil Apr 30 '20 at 21:47
  • It is unnecessary here because those objects immediately go out of scope and will be removed anyway. – QHarr Apr 30 '20 at 21:48
  • Now I have noticed you edited it properly and at the same time keep the indentation well. How did you do that? – YasserKhalil Apr 30 '20 at 21:48
  • 1
    I copied your pastebin code in, and with all code highlighted, pressed Ctrl + K to indent the code block the required 4 spaces for code inserts. – QHarr Apr 30 '20 at 21:48
  • Final point is that all your signature params can be passed ByVal. – QHarr Apr 30 '20 at 21:52
  • Can you show me what do you mean exactly by signature params? – YasserKhalil Apr 30 '20 at 21:53
  • Thanks a lot for relieving me from the suffering of posting the code properly. – YasserKhalil Apr 30 '20 at 21:55
  • 1
    Pleasure. This is a really useful post by the way: https://stackoverflow.com/a/41813615/6241235 for ByRef/ByVal and this,https://stackoverflow.com/help/formatting, for formatting. – QHarr Apr 30 '20 at 21:57
  • Using ByVal to the worksheet variable only or to all the other variables too? – YasserKhalil Apr 30 '20 at 22:25
  • 1
    Do you mean like that `Sub GetReuters(ByVal ws As Worksheet, ByVal tblName As String, ByVal sTicker As String, ByVal sSection As String, ByVal sTime As String)`? – YasserKhalil Apr 30 '20 at 22:40
  • 1
    I have updated the code .. according to your instructions. I appreciate that a lot. – YasserKhalil Apr 30 '20 at 22:53