
I read a PGN file, extract some information, and then write my results back to a file. Why does the Python process use way more RAM than my variables combined? For example: after loading 10,000 chess games, Python needs 700 MB of RAM, but the list is only 85 KB. 200,000 games break my machine.

import chess.pgn
from tqdm import tqdm

def load_games(n_games: int) -> list[chess.pgn.Game]:
    """Load n games from the pgn file and return them as a list"""
    with open("files/lichess_elite_2022-04.pgn") as pgn_file:
        #  Downloaded from: https://database.nikonoel.fr/
        games = []
        for i in tqdm(range(n_games), desc="Loading games", unit=" games"):
            game = chess.pgn.read_game(pgn_file)
            if game is not None:
                games.append(game)
            else:
                break

    return games

games = load_games(10000)

print(games.__sizeof__()/1000)
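To see where the memory actually goes, the standard library's `tracemalloc` can report real allocations. A minimal sketch (it builds plain lists as a stand-in, since the point is independent of the chess library):

```python
import tracemalloc

# Measure the real allocation cost of building a list of lists,
# rather than the size of the outer list's pointer array alone.
tracemalloc.start()
data = [list(range(1_000)) for _ in range(100)]
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"list.__sizeof__(): {data.__sizeof__()} bytes")  # just ~100 pointers
print(f"traced allocations: {current} bytes")           # the real footprint
```

The gap between the two numbers is exactly the effect described in the question: `__sizeof__` reports only the outer list, while the traced total includes every object the list points to.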
  • When opening a file, Python attempts to load as much of the file as it can into ram for quick access. – Trevor Hurst Jul 21 '22 at 15:18
  • `games.__sizeof__()` tells you how much space is used by the *object references* in the list `games`. It doesn't tell you how much space is used by those objects themselves. For example, try this: `a=list(range(100000)); b=[a]`, and then compare `a.__sizeof__()` with `b.__sizeof__()`. – slothrop Jul 21 '22 at 15:24
  • This is not about notebook, python itself has runtime memory consumption. You can measure it with psutil library for example. – ferdy Jul 26 '22 at 01:11
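The comparison suggested in the comments can be run directly; `b` holds a single pointer to `a`, so its `__sizeof__` stays tiny no matter how large `a` is (a sketch using only the standard library):

```python
import sys

a = list(range(100_000))  # 100,000 int objects
b = [a]                   # a one-element list: one reference to a

print(a.__sizeof__())  # size of a's pointer array, roughly 800 KB
print(b.__sizeof__())  # a few dozen bytes: one pointer

# A figure closer to the real footprint adds the elements themselves:
print(sys.getsizeof(a) + sum(sys.getsizeof(x) for x in a))
```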

1 Answer


`getsizeof` does not give the size of the list elements. I solved it by using a `yield` statement to turn the loader into a generator:

    def load_games(pgn_file):
        while True:
            game = chess.pgn.read_game(pgn_file)
            if game is not None:
                yield game
            else:
                return
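The generator pattern above can be sketched in a self-contained way. This uses a stand-in `read_record` function instead of `chess.pgn.read_game` (so it runs without the chess library installed), and `itertools.islice` replaces the `n_games` counter from the original `load_games`:

```python
from io import StringIO
from itertools import islice

def read_record(f):
    """Stand-in for chess.pgn.read_game: one record per line, None at EOF."""
    line = f.readline()
    return line.strip() or None

def iter_records(f):
    # Generator: records are produced one at a time and can be
    # discarded after processing, so memory use stays flat.
    while True:
        rec = read_record(f)
        if rec is None:
            return
        yield rec

f = StringIO("g1\ng2\ng3\ng4\n")
first_two = list(islice(iter_records(f), 2))  # cap at 2, like n_games
print(first_two)  # ['g1', 'g2']
```

Only the records you actually keep occupy memory; iterating and discarding them lets you process files far larger than RAM.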