
I started working with the Ethereum blockchain two days ago, so my knowledge is still a bit all over the place. Nevertheless, I managed to connect to a node, pull some general block data, and so on. As the next level of difficulty, I tried building event filters to look at more specific types of historical data (to be clear, I don't want to fetch live data; I want to query the entire chain and get historical sample extracts for various types of data).

Here is my first attempt at building an event filter for the USDC Uniswap V2 contract to collect Swap events (it's not about speed or efficiency right now, I just want to make it work):

import json

import requests
from web3 import Web3

w3 = Web3(Web3.HTTPProvider(NODE_ADDRESS))

# uniswap v2 USDC
address = w3.toChecksumAddress('0xb4e16d0168e52d35cacd2c6185b44281ec28c9dc')

# get the ABI for uniswap v2 pair events
resp = requests.get("https://unpkg.com/@uniswap/v2-core@1.0.0/build/IUniswapV2Pair.json")
resp.raise_for_status()  # fail loudly instead of leaving `abi` undefined
abi = resp.json()['abi']

# create contract object
contract = w3.eth.contract(address=address, abi=abi)

# get topics by hashing abi event signatures
res = contract.events.Swap.build_filter()

# put this into a filter input dictionary
filter_params = {'fromBlock':int_to_hex(12000000),'toBlock':int_to_hex(12010000),**res.filter_params}
# res.filter_params contains: 'topics' and 'address'

# create a filter id (i.e. a hashed version of the filter data, representing the filter)
method = 'eth_newFilter'
params = [filter_params]
resp = self.block_manager.general_sample_request(method,params)
if 'error' in resp: 
    print(resp)
else: 
    filter_id = resp['result']

# pass on the filter id, in order to query the respective logs
params = [filter_id]
method = 'eth_getFilterLogs'
resp = self.block_manager.general_sample_request(method,params)
# takes about 10-12s for about 12000 events

The resulting array contains event logs with this structure:

resp['result'][0]
>>>
{'address': '0xb4e16d0168e52d35cacd2c6185b44281ec28c9dc',
 'topics': ['0xd78ad95fa46c994b6551d0da85fc275fe613ce37657fb8d5e3d130840159d822',
  '0x0000000000000000000000007a250d5630b4cf539739df2c5dacb4c659f2488d',
  '0x0000000000000000000000000ffd670749d4179558b6b367e30e72ce2efea28f'],
 'data': '0x0000000000000000000000000000000000000000000000000000000000000000000000000000000000000\
00000000000000000000000000034f0f8a0c7663264000000000000000000000000000000000000000000000\
000000000019002d5b60000000000000000000000000000000000000000000000000000000000000000',
 'blockNumber': '0xb71b01',
 'transactionHash': '0x76403053ee0300411b68fc223b327b51fb4f1a26e1f6cb8667e05ec370e8176e',
 'transactionIndex': '0x22',
 'blockHash': '0x4bd35cb48395e77fd317a0309342c95d6687dbc4fcb85ada2d635fe266d1e769',
 'logIndex': '0x16',
 'removed': False}

As far as I understand, I can somehow apply the ABI to decode the 'data' field. I tried this function:

contract.decode_function_input(resp['result'][0]['data'])

but it gives me this error:

>>> ValueError: Could not find any function with matching selector

It seems there is some problem with decoding the data. However, I'm so close to getting the real data now that I don't want to give up xD. Any help would be appreciated!

Thanks!

  • Thanks. By the way, your library trading strategy is awesome, great stuff! I work in qf; let me know if you want to exchange, and I'll get in touch via LinkedIn. – user19976975 Oct 29 '22 at 10:02

1 Answer

import json
import traceback
from pprint import pprint

from eth_utils import event_abi_to_log_topic, to_hex
from hexbytes import HexBytes
from web3._utils.events import get_event_data
from web3.auto import w3


def decode_tuple(t, target_field):
    output = dict()
    for i in range(len(t)):
        if isinstance(t[i], (bytes, bytearray)):
            output[target_field[i]['name']] = to_hex(t[i])
        elif isinstance(t[i], (tuple)):
            output[target_field[i]['name']] = decode_tuple(t[i], target_field[i]['components'])
        else:
            output[target_field[i]['name']] = t[i]
    return output

def decode_list_tuple(l, target_field):
    output = l
    for i in range(len(l)):
        output[i] = decode_tuple(l[i], target_field)
    return output

def decode_list(l):
    output = l
    for i in range(len(l)):
        if isinstance(l[i], (bytes, bytearray)):
            output[i] = to_hex(l[i])
        else:
            output[i] = l[i]
    return output

def convert_to_hex(arg, target_schema):
    """
    utility function to convert byte codes into human readable and json serializable data structures
    """
    output = dict()
    for k in arg:
        if isinstance(arg[k], (bytes, bytearray)):
            output[k] = to_hex(arg[k])
        elif isinstance(arg[k], (list)) and len(arg[k]) > 0:
            target = [a for a in target_schema if 'name' in a and a['name'] == k][0]
            if target['type'] == 'tuple[]':
                target_field = target['components']
                output[k] = decode_list_tuple(arg[k], target_field)
            else:
                output[k] = decode_list(arg[k])
        elif isinstance(arg[k], (tuple)):
            target_field = [a['components'] for a in target_schema if 'name' in a and a['name'] == k][0]
            output[k] = decode_tuple(arg[k], target_field)
        else:
            output[k] = arg[k]
    return output

def _get_topic2abi(abi):
    if isinstance(abi, (str)):
        abi = json.loads(abi)

    event_abi = [a for a in abi if a['type'] == 'event']
    topic2abi = {event_abi_to_log_topic(_): _ for _ in event_abi}
    return topic2abi

def _sanitize_log(log):
    for i, topic in enumerate(log['topics']):
        if not isinstance(topic, HexBytes):
            log['topics'][i] = HexBytes(topic)

    if 'address' not in log:
        log['address'] = None

    if 'blockHash' not in log:
        log['blockHash'] = None

    if 'blockNumber' not in log:
        log['blockNumber'] = None

    if 'logIndex' not in log:
        log['logIndex'] = None

    if 'transactionHash' not in log:
        log['transactionHash'] = None

    if 'transactionIndex' not in log:
        log['transactionIndex'] = None


def decode_log(log, abi):
    if abi is not None:
        try:
            # get a dict with all available events from the ABI
            topic2abi = _get_topic2abi(abi)

            # ensure the log contains all necessary keys
            _sanitize_log(log)

            # get the ABI of the event in question (stored as the first topic)
            event_abi = topic2abi[log['topics'][0]]

            # get the event name
            evt_name = event_abi['name']

            # get the event data
            data = get_event_data(w3.codec, event_abi, log)['args']
            target_schema = event_abi['inputs']
            decoded_data = convert_to_hex(data, target_schema)


            return (evt_name, decoded_data, target_schema)
        except Exception:
            return ('decode error', traceback.format_exc(), None)

    else:
        return ('no matching abi', None, None)

Example usage:

output = decode_log(
    {'data': '0x000000000000000000000000000000000000000000000000000000009502f90000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000093f8f932b016b1c',
     'topics': [
         '0xd78ad95fa46c994b6551d0da85fc275fe613ce37657fb8d5e3d130840159d822',
         '0x0000000000000000000000007a250d5630b4cf539739df2c5dacb4c659f2488d',
         '0x000000000000000000000000242301fa62f0de9e3842a5fb4c0cdca67e3a2fab'],
     },
    pair_abi
)
print(output[0])
pprint(output[1])
# Swap
# {'amount0In': 2500000000,
#  'amount0Out': 0,
#  'amount1In': 0,
#  'amount1Out': 666409132118600476,
#  'sender': '0x7a250d5630B4cF539739dF2C5dAcb4c659F2488D',
#  'to': '0x242301FA62f0De9e3842A5Fb4c0CdCa67e3A2Fab'}

Or in your case:

output = decode_log(resp['result'][0], pair_abi)
print(output[0])
pprint(output[1])
# Swap
# {'amount0In': 0,
#  'amount0Out': 6711072182,
#  'amount1In': 3814822253806629476,
#  'amount1Out': 0,
#  'sender': '0x7a250d5630B4cF539739dF2C5dAcb4c659F2488D',
#  'to': '0x0Ffd670749D4179558b6B367E30e72ce2efea28F'}
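
For intuition about what get_event_data is doing here, and why decode_function_input could not work: a log's 'data' field carries no 4-byte function selector, only the ABI-encoded non-indexed event arguments, while the indexed arguments sit in 'topics'. Below is a minimal hand-rolled sketch for this one event, assuming the Uniswap V2 signature Swap(address indexed sender, uint amount0In, uint amount1In, uint amount0Out, uint amount1Out, address indexed to) and assuming topics arrive as plain hex strings. The name decode_swap_log is hypothetical, and the example payload is rebuilt from the decoded values shown above rather than copied from the raw log.

```python
def decode_swap_log(log):
    """Decode a Uniswap V2 Swap log by hand (illustrative sketch only).

    Assumes log['topics'] holds hex strings and log['data'] holds exactly
    four 32-byte words: amount0In, amount1In, amount0Out, amount1Out.
    """
    topics = log["topics"]
    data = log["data"]
    if data.startswith("0x"):
        data = data[2:]
    if len(topics) != 3 or len(data) != 4 * 64:
        raise ValueError("log does not look like a Uniswap V2 Swap event")

    # Each non-indexed uint256 occupies one 32-byte word (64 hex characters).
    words = [int(data[i:i + 64], 16) for i in range(0, len(data), 64)]

    def topic_to_address(topic):
        # An indexed address is left-padded to 32 bytes; keep the last 20 bytes.
        return "0x" + topic[-40:]

    return {
        "sender": topic_to_address(topics[1]),
        "amount0In": words[0],
        "amount1In": words[1],
        "amount0Out": words[2],
        "amount1Out": words[3],
        "to": topic_to_address(topics[2]),
    }


# Example payload rebuilt from the decoded values of the first usage example.
example_log = {
    "data": "0x" + "".join(
        f"{value:064x}" for value in (2500000000, 0, 0, 666409132118600476)
    ),
    "topics": [
        "0xd78ad95fa46c994b6551d0da85fc275fe613ce37657fb8d5e3d130840159d822",
        "0x" + "0" * 24 + "7a250d5630b4cf539739df2c5dacb4c659f2488d",
        "0x" + "0" * 24 + "242301fa62f0de9e3842a5fb4c0cdca67e3a2fab",
    ],
}
decoded = decode_swap_log(example_log)
```

Unlike get_event_data, this sketch returns lowercase addresses without the checksum casing and handles only this single event layout, so the generic decode_log above remains the better choice for anything beyond a quick sanity check.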

Note that you need to provide the pair_abi variable yourself. Which ABI to use depends on the type of smart contract you are decoding. I've found that with Uniswap V3, the UniswapV2Pair ABI worked for some events, while the UniswapV3Pool ABI worked for others, in particular for the Swap event, which I've found the most useful.
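
One possible way to obtain pair_abi is to reuse the build artifact that the question already downloads. The sketch below assumes that URL still serves a JSON document with a top-level 'abi' key; the helper names abi_from_artifact, event_names and fetch_pair_abi are hypothetical, and only the standard library is used.

```python
import json
from urllib.request import urlopen

# URL taken from the question; assumed to still serve the V2 pair build artifact.
PAIR_ARTIFACT_URL = (
    "https://unpkg.com/@uniswap/v2-core@1.0.0/build/IUniswapV2Pair.json"
)


def abi_from_artifact(artifact_text):
    """Return the 'abi' list from a build artifact given as JSON text."""
    abi = json.loads(artifact_text)["abi"]
    if not isinstance(abi, list):
        raise ValueError("artifact 'abi' entry is not a list")
    return abi


def event_names(abi):
    """List the event names an ABI can decode, to check it fits the logs."""
    return sorted(entry["name"] for entry in abi if entry.get("type") == "event")


def fetch_pair_abi(url=PAIR_ARTIFACT_URL, timeout=30):
    """Download the artifact and extract its ABI (requires network access)."""
    with urlopen(url, timeout=timeout) as resp:
        return abi_from_artifact(resp.read().decode("utf-8"))
```

With this in place, pair_abi = fetch_pair_abi() could be passed straight to decode_log, and event_names(pair_abi) gives a quick check that the event you expect is actually covered by the ABI you loaded.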


After a few hours of digging I managed to find this solution, which is a slightly modified version of the one proposed in https://towardsdatascience.com/decoding-ethereum-smart-contract-data-eed513a65f76. Big thumbs up to its author. You can read more there on parsing the transaction input too.

Voy