
I have written code for a lazy simple random walk on an n-dimensional hypercube. To visualize it, think of a 2D square or a 3D cube whose corners are the vertices of a graph. The walk starts at the origin; at each step it stays at the current vertex with probability 0.5 and otherwise moves to one of its n neighbors, each with probability 1/(2n). The code runs, but I have a technical question. I explain what I did below. The distance between the stationary distribution (which is uniform) and the distribution obtained by simulation should decrease to 0 as time increases. In my simulation, for larger n, say n=10, the distance between these distributions tapers off at a non-zero value. It would be great if someone could point out where I went wrong.
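For comparison, the exact distance can be computed without any simulation. The following is a minimal sketch, separate from the simulation code in this post. It assumes the symmetry that, for a walk started at the origin, all vertices with the same number of ones are equally likely, so it is enough to track the Hamming weight of the walk. The names `exact_tv_distance` and `w` are made up for illustration.

```python
from math import comb


def exact_tv_distance(n, t):
    """Exact total variation distance to uniform after t steps of the
    lazy walk on {0,1}^n started at the origin (illustrative sketch)."""
    # w[k] = probability that the current vertex has exactly k ones.
    w = [0.0] * (n + 1)
    w[0] = 1.0  # the walk starts at the all-zeros vertex
    for _ in range(t):
        new = [0.0] * (n + 1)
        for k, p in enumerate(w):
            if p == 0.0:
                continue
            new[k] += 0.5 * p  # lazy step: stay put
            if k < n:
                new[k + 1] += p * (n - k) / (2 * n)  # flip one of the n-k zeros
            if k > 0:
                new[k - 1] += p * k / (2 * n)  # flip one of the k ones
        w = new
    # Vertices of equal weight are equally likely, so the sum over all
    # 2^n vertices collapses to a sum over the n+1 weight classes.
    return 0.5 * sum(abs(w[k] - comb(n, k) / 2**n) for k in range(n + 1))


if __name__ == "__main__":
    for n in (5, 10):
        print(n, [round(exact_tv_distance(n, t), 4) for t in (1, 10, 29)])
```

Plotting this curve next to the simulated one would show whether the simulated distance follows the exact one and only departs from it at small values.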

  1. This function generates all vertices of an n-dim hypercube. The output is a list of bit lists: each vertex of the hypercube is an n-bit string, and there are 2^n such vertices.

     def generateAllBinaryStrings(n, arr, l, i):  
    
         if i == n: 
             l.append(arr[:]) 
             return
    
         arr[i] = 0
         generateAllBinaryStrings(n, arr, l, i + 1)  
    
         arr[i] = 1
         generateAllBinaryStrings(n, arr, l, i + 1)  
    
         return l
    
  2. Creating a dictionary whose keys are vertices and values are the neighbors (each vertex has n neighbors)

     def dictionary(v):
         d={}
         for i in range(len(v)):
             d[str(v[i])]=[]
             temp=[]
             for j in range(len(v[i])):  # dimension taken from the vertex, not from a global n
                 temp=v[i][:]
                 if v[i][j]==1:
                     temp[j]=0
                 else:
                     temp[j]=1
                 d[str(v[i])].append(temp)
         return d
    
  3. Lazy simple random walk on the n-dim cube starting at the origin (the output is the list of vertices visited up to time t)

     def srw(d,n,t):
         h=[[0 for i in range(n)]]
         w=[1/(2*n) for i in range(n)]
         w.append(0.5)
         for i in range(t):
             temp=d[str(h[-1])][:]
             temp.append(h[-1])
             h.append(random.choices(temp,weights=w)[-1])
    
         return h
    
  4. Finding the total variation distance (half the L1 distance) between the stationary distribution and the simulated distribution

     def tvDist(d,n,t,num):
         finalstate=[]
         for i in range(num):
             temp=srw(d,n,t)
             finalstate.append(temp[-1])
         temp2=[list(item) for item in set(tuple(row) for row in finalstate)]
         Xt={}
         for state in temp2:
             Xt[str(state)]=None
         for state in temp2:
             count=0
             for i in finalstate:
                 if i==state:
                     count+=1
             Xt[str(state)]=count/num
    
    
         dist=0
         for state in d:
             if state in Xt:
                 dist+=abs(Xt[str(state)]-(1/(2**n)))
             else:
                 dist+=1/(2**n)
         dist=0.5*(dist)
    
         return dist
    
  5. To reproduce, copy and paste the code above and then run the following (it plots the distance vs. time for n=5 and n=10)

     import numpy as np
     import matplotlib.pyplot as plt
     import random
     import collections as cs
    
     time=30
     numsim=5000
    
     for n in range(5,11,5):
    
         l = []  
         arr = [None] * n 
         vertices=generateAllBinaryStrings(n, arr, l, 0)
         d=dictionary(vertices)
         tvdistance=[]
         for t in range(1,time):
             tvdistance.append(tvDist(d,n,t,numsim))
         plt.plot(tvdistance)
    
     plt.show()
    
  6. I get this plot: (n=5 looks right but n=10 tapers off at a non-zero value)

[Plot: distance vs. time for n=5 and n=10]
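One thing that may be worth checking (an assumption, not a confirmed diagnosis): with a finite number of samples, the empirical distribution sits at some distance from uniform even when the samples are drawn exactly from the uniform distribution. The sketch below draws `num` uniform vertices directly, with no walk involved, and measures the same distance as step 4. The function name `empirical_tv_of_uniform_samples` and the `seed` parameter are hypothetical and only serve this illustration.

```python
import random
from collections import Counter


def empirical_tv_of_uniform_samples(n, num, seed=0):
    """TV distance between the uniform distribution on 2^n vertices and the
    empirical distribution of `num` exactly uniform samples (sketch)."""
    rng = random.Random(seed)
    size = 2 ** n
    # Encode each vertex as an integer in [0, 2^n); count how often each appears.
    counts = Counter(rng.randrange(size) for _ in range(num))
    # Vertices that were sampled at least once.
    seen = sum(abs(c / num - 1 / size) for c in counts.values())
    # Vertices never sampled each contribute 1/2^n, as in step 4.
    unseen = (size - len(counts)) / size
    return 0.5 * (seen + unseen)


if __name__ == "__main__":
    for n in (5, 10):
        print(n, empirical_tv_of_uniform_samples(n, 5000))
```

If the value printed for n=10 with 5000 samples is close to the plateau in the plot, the plateau may be sampling noise rather than a bug in the walk; in that case increasing `numsim` should lower it.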

trickymaverick
  • This is a very interesting question. I'm reading your code. By the way, I think `if h[-1] not in temp:` will always be true, because a vertex is never its own neighbor in this case. Am I wrong? So you can always do `temp.append(h[-1])` in step 3. – Jorge Morgado Oct 08 '20 at 19:32
  • Another thing: you are doing `for state in temp2:` twice in step 4. I believe you can initialize `Xt` with only one `for state in temp2:` loop. I'm still reading the code... :) – Jorge Morgado Oct 08 '20 at 19:40
  • @JorgeMorgado Thanks, yes, it is. In fact, the distance versus time curve should be a phase transition curve for large n i.e. it should drop from 1 to 0 sharply. I don't remember the error I got in step 3, but I had to use that to fix the error. – trickymaverick Oct 08 '20 at 19:41
  • I'm trying to understand step 4. Can you summarize the idea? As I see it, `temp2` is a set of the items from `finalstate` and `Xt` is the relative frequency of each item. Is that correct? If so, it can be done more easily, and I will show you how... I'm still reading the code :) – Jorge Morgado Oct 08 '20 at 19:51
  • @JorgeMorgado Take your time, thanks. Step 4 finds the probability distribution of the simulated walk at time t. `temp` has the set of vertices visited. `finalstate` is the last state/vertex visited over different simulations. Because a vertex can be visited more than once, `temp2` is the set of the list `finalstate`. Then the probability of each element in `temp2` is `count/num` i.e. number of times that vertex is visited divided by number of simulations, which is stored in the dictionary `Xt`. Hope that helps, let me know if there is a question! – trickymaverick Oct 08 '20 at 20:09
  • @JorgeMorgado After that, `dist` measures the L1 distance between the probability distribution found above and the uniform probability distribution i.e. `1/2^n` for each vertex. – trickymaverick Oct 08 '20 at 20:16
  • I see. Of course, that has to decrease to 0. OK, let me test the code and see if I can find the error. :) – Jorge Morgado Oct 08 '20 at 20:20
  • @JorgeMorgado Perfect, appreciate that. – trickymaverick Oct 08 '20 at 20:24
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/222743/discussion-between-jorge-morgado-and-trickymaverick). – Jorge Morgado Oct 08 '20 at 20:41
  • I really like this question. I had tried a similar random walk in x dimensions after doing a random walk in three, but I could never quite bring myself to trust my distance function which I based on vector mathematics. Your post and these comments are inspiring! – user10637953 Oct 08 '20 at 22:24
  • @user10637953 Thanks, yeah the code looks right for the most part except for some glitch. – trickymaverick Oct 08 '20 at 22:35
  • @trickymaverick I think I have an answer for you. Open the [chat](https://chat.stackoverflow.com/rooms/222743/discussion-between-jorge-morgado-and-trickymaverick) when you see this. – Jorge Morgado Oct 09 '20 at 03:37

0 Answers