Python - Generate an array where elements are a function of a varying parameter, without building the array each time

Question

TLDR : How can I generate an array whose elements depend on some arbitrary (float) value, k, without having to go through the extremely time-expensive process of constructing the array from scratch every time I change the value of k.

What I want to achieve would look like:

I am generating a huge Hamiltonian in the atomic basis of a 2D lattice (N x N numpy array). Filling this array requires comparing positions (xyz) of the atomic sites multiple times for each different coupling types that I want to include, becoming very time-expensive as the system size grows. (typically N > 16,000 sites).

Elements of this array have a dependence upon some other float type variable, k (in the physical context of the program this is a quantum counting number that I want to iterate over). I need to calculate this array many times for a range of 1000 k-values.

i.e generate the 256,000,000 element array 1000 times... eurgh

At present I have to create the array every time I change to a new k-value which is obviously very inefficient. The basic structure of this looks (very generally) like:

class Device():

    def __init__(self, xyz, energy, ... other input parameters ...):

        self.xyz = xyz           # N x 3 array of positions
        self.energy = energy     # Length N list of energies
        # + A range of other parameters that define the device


     # -------- OTHER OPERATIONS ON THE DEVICE -------- #


     def get_H(self, k):
        """ Get the Hamiltonian for a given k - value """

        # Initialise the N x N array
        h = np.zeros((len(self.xyz), len(self.xyz)))

        # - Fill THE HAMILTONIAN BY COMPARING ALL ATOMIC POSITIONS IN self.xyz - #

        return h

which requires that I call the entire construction process each time.

I want to know if there is a way to generate this array once, with k having been left as a free parameter which can then be filled in later. i.e return an array which is a function of k. The priority is to only have to construct the array once since testing indicates that this takes a significant portion of my total run-time.

Below is a minimal (non-working) example of what I want to achieve by acting on a test array. I want to make the Hamiltonian an object variable rather than a method that has to be made each time, but with some dependence on k (I realise that this will be syntactically disastrous but hopefully will be a good start for an answer).

class Test_device():

def __init__():

    self.get_H = self.make_H()

def make_H(self):

    h = np.linspace(1,9,9).reshape((3,3)) # Test array

    # The below clearly will not work because k is not defined, but this is
    # where I want to achieve this

    h[1,1] += k # Give some k-dependence to the middle element of the array

    def ham(k, h = h):

        # Somehow set the value of k in h

        return h

    return ham

Which I would then access via

device = Test_device()
device.get_H(k = k_value)

Thanks in advance!

I love this question and can probably help but I am not clear on exactly what your stumbling block is - is it that you only want to initialize the Hamiltonian array once and then fill it on-the-fly using an arbitrary functional dependence on some `k`? Why can't `Device` have an update method? — jtlz2, Aug 21 '18 at 15:05
The process to generate the array and fill it is the same every time (for a given lattice structure), but the value of k differs on each iteration. Essentially, I want an array that is just a function of k as the output of my make_H() method — MrTLane, Aug 21 '18 at 15:23
Have you looked at e.g. https://docs.scipy.org/doc/numpy/reference/generated/numpy.fromfunction.html ? — jtlz2, Aug 21 '18 at 16:32
Yes. But the function which decides where `k` should be in the array is the time expensive part that I only want to do once, rather than each time as it is now. — MrTLane, Aug 21 '18 at 16:32
Also: You can't get around the fact that the array is an array with N elements - it needs populating, whether it's via a for-loop or subsequent optimization - but it remains an optimization problem does it not? — jtlz2, Aug 21 '18 at 16:34
I think that the function which populates the array in the first instance is just about as optimised as I can make it. It is all vectorised using only numpy functions to fill the elements. I will take a look at `fromfunction` to see if that is what I need. Thanks a load for the continuing advice! — MrTLane, Aug 21 '18 at 16:37
Always happy to help a fellow physicist (even if it is TCM :D), but apologies for still not having a handle on the problem definition very well / yet — jtlz2, Aug 21 '18 at 16:38
If you mean in the physical context, there are no sources of decoherence in the system and so the kernel would just be the size of my system which in this case is 16000 atoms. If you mean in the coding context I am afraid I am not sure what you mean by the 'coherence length of the kernel'. Also, I am getting a pop-up that this should be moved to chat, but I do not have enough reputation for that, what is the best thing to do? — MrTLane, Aug 21 '18 at 16:42
Oh no problem - I can't see how to move it to chat either though in this case. You can probably find my email via my profile / google if you like otherwise we can just carry on here I suppose. — jtlz2, Aug 21 '18 at 16:45
Well, the code is written as a general tight-binding calculation for any lattice structure, but my current system consists of bilayer graphene (yes, I am sadly part of that gang) with a spatially modulated potential (which would just enter via the on-diagonal elements). I have edited the question a little. Perhaps that might clarify things. — MrTLane, Aug 21 '18 at 16:50
What shape is `k`? Is the N x N Hamiltonian sparse / mostly zeros (or approximately so at some level of precision?)? — jtlz2, Aug 22 '18 at 05:31
Could you also convolve in Fourier space and then transform back? — jtlz2, Aug 22 '18 at 05:33
Device() only ever sees a single, floating point, `k` value at a time for each iteration of this code. Outside of the Device() class it is a list of such values which I parallelise over. It is not possible to vectorise this part since the complete matrix would be too much to hold in memory at any one time (only have access to about 450Gb of RAM). It would be possible to transform into fourier/reciprocal space, but this would not reduce the size of the system so I am not sure what it would gain me other than adding in another possibly expensive operation. Thanks — MrTLane, Aug 22 '18 at 09:10
Are your `k`s predefined btw e.g. on a grid, or can they take any value at any time e.g. MCMC? — jtlz2, Aug 22 '18 at 10:48
`k` lies between negative and positive pi and is always real valued — MrTLane, Aug 22 '18 at 10:52
But can you precompute the `k`s, or rather you only know what they will be as you iterate? — jtlz2, Aug 22 '18 at 10:56
Calculating the `k`'s is trivial, it is just `np.linspace(-np.pi, np.pi, 1000)` in the case of 1000 points. The issue is that each element in the Hamiltonian may have some f(k) depedence where f() is a function. The long algorithm that I have to run through to get he Hamiltonian just tells me where the f(k) should be and what their coefficients are. — MrTLane, Aug 22 '18 at 12:12

score 0 · Answer 1 · answered Aug 21 '18 at 15:05

0

This isn't a working example but...

class Test_device():

    def __init__(self, h):
        self.k = 0
        self.h = h

    def get_H(self):

        self.h = np.linspace(1,9,9).reshape((3,3)) # Test array

        # The below clearly will not work because k is not defined, but this is
        # where I want to achieve this

        self.h[1,1] += self.k # Give some k-dependence to the middle element of the array

    def ham(self):

        # Somehow set the value of k in h
        return self.h

You can do something like this:

device = Test_Device(10)
device.get_H()
device.h = 12
device.get_H()
h = device.ham()

You can change h or k at any time by just changing the value inside the class like device.h = 14. The same with k.

answered Aug 21 '18 at 15:05

eatmeimadanish

3,809
1
14
20

The problem here is that get_H() will still have to go through the process of generating the array each time. In practice, the line that I wrote as h[1,1] += k is about 200 lines of code across 4 different methods, each for a unique coupling type. Given that h is an array, I am also not sure why you would initialise it with an int value. Did you mean to pass Test_device k instead? – MrTLane Aug 21 '18 at 15:26
1

As an additional comment, I do not want to have k be a property of the class since it is not a physical quantity and so does not belong here. (I am a physicist, so this kind of thing maybe bothers me more than it should from a coding stand-point XD) – MrTLane Aug 21 '18 at 15:36
I honestly couldn't make sense of your initial code example. I was just giving you an idea on how to modify the value during execution. – eatmeimadanish Aug 22 '18 at 19:16

score 0 · Answer 2 · answered Aug 22 '18 at 10:46

0

On reflection I do not think np.fromfunction() is what you want. Rather try:

import numpy as np

class Test_device(object):

    def __init__(self, h_shape):
        self.h_shape = h_shape

        # Create array once
        self.h = np.linspace(1, h_shape[0] ** 2, h_shape[1] ** 2).reshape(self.h_shape)

    def get_H(self, k, locn=(1, 1)):
        self.h[locn] += k  # Give some k-dependence to the middle element of the array

        # Somehow set the value of k in h
        pass

In what follows, initialize device once (given an intended shape for h). Then choose a k (and a location locn, if you want). Then call the get_H method of device. k is never attributed to device, but could be if you like for reference with self.k=k (as stated by eatmeinadanish).

Then you can access device.h whenever you like.

h_shape = (3, 3)
device = Test_device(h_shape)
k = 1.3
locn = (1, 1)
device.get_H(k, locn=locn)
print device.h

I'm not sure this helps you much in the end, whether it's what you actually intended, and note that it doesn't really add much to eatmeinadanish's answer.

answered Aug 22 '18 at 10:46

jtlz2

7,700
9
64
114

Thanks for a having a think about my problem. In your solution I think that I would still have to calculate where `k` goes in each case since I don't have `locn` saved at any point. Indeed, saving an `locn` array for every coupling that I include would eat a massive amount of memory so may not be a viable option at all. What I need is something like in Mathematica where you can leave a variable unevaluated in an expression, but this may well not be possible in python. – MrTLane Aug 22 '18 at 12:29
No problem - sorry not to be more help. You can do it in python using TensorFlow declarative programming (see e.g. https://stackoverflow.com/questions/35677724/tensorflow-why-was-python-the-chosen-language), but I do wonder whether it will save you any computing resources in the end - because you can't really escape the number crunching. Is every single element of `h` affected for every `k`, or can you truncate somehow (that's why I was wondering about coherence length) so that you only touch a few of the values? – jtlz2 Aug 22 '18 at 12:38
The matrix is somewhat sparse, so there could be a speedup there, but looking into discussions of the scipy sparse modules here on stack overflow indicates that you probably need a much more sparse matrix than I actually have (~1-2% values which are non-zero unlike mine which is more like 20%). Yes, a significant number of the non-zero elements depend directly on k since electrons pick up a phase factor `e^{i k.x}` etc. when jumping between atomic sites. – MrTLane Aug 22 '18 at 12:58
I am not sure it is a good idea to accept either of the answers as they are, since I do not want to mislead anyone else who comes here either looking for answers t their own questions or wanting to try answering the question themselves, but I cannot thank you enough for having a think about it and coming up with so many good ideas. Especially since I am just in TCM ; ) – MrTLane Aug 22 '18 at 13:00
No problem at all - good luck - keep it open until you are happy with the answer - or - if you solve it suddenly, which often seems to happen after a break from thinking about it, post and accept an answer to your own Q. Good luck! :) – jtlz2 Aug 22 '18 at 13:32

Python - Generate an array where elements are a function of a varying parameter, without building the array each time

2 Answers2