
I've solved a problem on SPOJ, but my solution is still too slow to be accepted.

I've also tried to make it use multiprocessing, but that attempt failed because it was even slower. The basic implementation, even with PyPy, gets "time limit exceeded" on SPOJ. So, how can I improve it?

And what is wrong with the multiprocessing implementation?

#  -- shipyard
from collections import defaultdict
#W = 100  total weight
#N = 2    number of types
#value | weight
#1       1
#30      50
# result -> 60 = minimum total value
#c = [1, 30]
#w = [1, 50]

def knap(W, N, c, w):

   f    = defaultdict(int)   # f[j]: minimum total cost to reach exactly weight j
   g    = defaultdict(bool)  # g[j]: True if weight j is exactly reachable
   g[0] = True
   for i in xrange(N):
      for j in xrange(W):
         if g[j]:
            g[j+w[i]] = True
            #print "g("+str(j+w[i])+") = true"
            if ( f[j+w[i]]==0 or f[j+w[i]]>f[j]+c[i]):
               f[j+w[i]] = f[j]+c[i]
               #print " f("+str(j+w[i])+") = ",f[j+w[i]]

   if g[W]:
      print f[W]
   else:
      print -1

def start():
   while True:
      num_test = int(raw_input())

      for i in range(num_test):
         totWeight = int(raw_input())
         types    = int(raw_input())
         costs    = defaultdict(int)
         weights  = defaultdict(int)
         for t in range(int( types )):
             costs[t], weights[t] = [int(i) for i in raw_input().split()]

         knap(totWeight, types, costs, weights)
      return

if __name__ == '__main__': 
    start()

And here is the multiprocessing version:

#  -- shipyard
from multiprocessing import Process, Queue
from collections import defaultdict
from itertools import chain   

W  = 0
c  = {} #[]
w  = {} #[]

def knap(i, g, f, W, w, c, qG, qF):

   for j in xrange( W ):
      if g[j]:
         g[j+w[i]] = True
         #print "g("+str(j+w[i])+") = true"
         if ( f[j+w[i]]==0 or f[j+w[i]]>f[j]+c[i]):
            f[j+w[i]] = f[j]+c[i]
            #print " f("+str(j+w[i])+") = ",f[j+w[i]]
   qG.put( g)
   qF.put( f)

def start():
   global f, g, c, w, W

   while True:
      num_test = int(raw_input())

      for _ in range(num_test):
         qG = Queue()
         qF = Queue()
         W  = int(raw_input())
         N  = int(raw_input()) # types
         c  = {} #[0 for i in range(N)]
         w  = {} #[0 for i in range(N)]
         f  = defaultdict(int)
         g  = defaultdict(bool)
         g[0] = True

         for t in range( N ):
             c[t], w[t] = [int(i) for i in raw_input().split()]

         # let's go parallel
         for i in xrange(0, N, 2):
            k1 = Process(target=knap, args=(i,   g, f, W, w, c, qG, qF))
            k2 = Process(target=knap, args=(i+1, g, f, W, w, c, qG, qF))
            k1.start()
            k2.start()
            k1.join()
            k2.join()
            #while k1.is_alive(): # or k2.is_alive():
            #   None
            #g2 = defaultdict(bool, chain( g.iteritems(), qG.get().iteritems(), qG.get().iteritems()))
            #f2 = defaultdict(int,  chain( f.iteritems(), qF.get().iteritems(), qF.get().iteritems()))
            g2 = defaultdict(bool, g.items()+ qG.get().items()+ qG.get().items())
            f2 = defaultdict(int,  f.items()+ qF.get().items()+ qF.get().items())

            g = g2
            f = f2

            print "\n g: ", len(g), "\n f: ", len(f),"\n"

         if g[W]:
            print f[W]
         else:
            print -1   


      return

if __name__ == '__main__': 
    start()

I probably haven't understood how to make two processes work efficiently on the same dictionary.
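
For reference, this toy sketch is the kind of shared dictionary I thought I needed, using multiprocessing.Manager (not my actual solution; mark_multiples is just a stand-in worker, and since every access to the proxy dict goes through the manager process, this may well be even slower than a single process):

from multiprocessing import Process, Manager

def mark_multiples(step, limit, shared):
    # toy worker: marks multiples of `step` as reachable in the shared dict
    for j in xrange(0, limit + 1, step):
        shared[j] = True

if __name__ == '__main__':
    manager = Manager()
    g = manager.dict()                 # proxy dict shared between processes
    p1 = Process(target=mark_multiples, args=(2, 20, g))
    p2 = Process(target=mark_multiples, args=(5, 20, g))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print sorted(g.keys())             # updates from both workers are visible here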

tuxErrante
  • Do you have to use `defaultdict`? Since the range of your list is known beforehand, it's better to use `[None] * int( types )` to initialize your arrays. – syockit Jun 28 '16 at 17:45
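
Following that suggestion, here is a minimal sketch of the same DP rewritten with preallocated lists instead of defaultdict (Python 2, to match the code above). The name knap_list and the INF sentinel are introduced here only for illustration; it prints 60 for the example in the comment block, but it has not been run against the judge.

def knap_list(W, N, c, w):
    INF = float('inf')
    # f[j] = minimum total cost to reach exactly weight j, INF if unreachable
    f = [INF] * (W + 1)
    f[0] = 0
    for i in xrange(N):
        wi, ci = w[i], c[i]          # cache item lookups outside the inner loop
        for j in xrange(wi, W + 1):  # unbounded: each type may be used many times
            if f[j - wi] + ci < f[j]:
                f[j] = f[j - wi] + ci
    return -1 if f[W] == INF else f[W]

# usage, with the example from the comment block above:
#   print knap_list(100, 2, [1, 30], [1, 50])   # -> 60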

2 Answers


Some programming contests explicitly ban or block multithreading, so look elsewhere for speed. My approach in these cases is to use a profiling tool to see where your code is struggling. You could try the built-in cProfile (`python -m cProfile -o <outputfilename> <script-name> <options>`) and then this wonderful visualization tool: http://www.vrplumber.com/programming/runsnakerun/

Once you have your visualization, look around and dig into the boxes. Sometimes there are things that are not directly evident but make sense once you inspect the running times. For example, a common problem (not sure if it's your case) is checking for list membership: it's much faster to use a set here, and it often pays to keep a separate list and set if you need the order later.

There are many more tips, e.g. about pulling global names into the local scope; you can find a list here: https://wiki.python.org/moin/PythonSpeed/PerformanceTips
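
For a quick text-only look at the same data, here is a minimal sketch that reads a cProfile dump with the standard pstats module instead of RunSnakeRun. It assumes you already produced a stats file, e.g. with `python -m cProfile -o knap.prof shipyard.py < big_input.txt` (knap.prof, shipyard.py and big_input.txt are placeholder names, not from the original post).

import pstats

stats = pstats.Stats('knap.prof')
stats.strip_dirs()               # shorten file paths in the report
stats.sort_stats('cumulative')   # most expensive call chains first
stats.print_stats(10)            # show only the top ten entries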

Josep Valls

Many people who use Python face the same problem on programming contest sites. I've found that it's best to simply ditch Python altogether for problems that take large inputs and require building and iterating over a big data structure, and to re-implement the same solution in C or C++. Python is known to be 10 to 100 times slower than well-optimised C/C++ code.

Your code looks good, and there's really little you can do to gain more speed (apart from big-O improvements or jumping through hoops such as multiprocessing). If you have to use Python, try to avoid creating unnecessarily large lists and use the most efficient algorithm you can think of. You can also generate and run large test cases before submitting your solution.
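
For example, here is a rough sketch that generates a large random input in the format the question's start() reads. The sizes and value ranges are arbitrary guesses rather than the judge's real limits, and big_input.txt / shipyard.py are placeholder names.

import random

def write_case(path, num_tests=1, max_w=10000, num_types=500):
    with open(path, 'w') as out:
        out.write('%d\n' % num_tests)
        for _ in xrange(num_tests):
            out.write('%d\n' % max_w)        # total weight
            out.write('%d\n' % num_types)    # number of types
            for _ in xrange(num_types):
                cost = random.randint(1, 1000)
                weight = random.randint(1, max_w)
                out.write('%d %d\n' % (cost, weight))

if __name__ == '__main__':
    write_case('big_input.txt')
    # then time it with:  python shipyard.py < big_input.txt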

Rabih Kodeih
  • This is definitely the best option, but not always an option. There is always the intermediate step of coding parts of your program as C/C++ extension modules. See here: https://docs.python.org/2/extending/extending.html – Josep Valls May 21 '15 at 15:31
  • I'd definitely go for a C extension module in a commercial project if speed were my concern in some parts, but is it really worth it for a contest question? – Rabih Kodeih May 21 '15 at 18:25