Questions tagged [disco]

Disco is an open-source distributed computing framework based on the MapReduce paradigm. It was developed by Nokia Research Center to solve real problems in handling massive amounts of data. Disco distributes and replicates your data and schedules your jobs efficiently.

35 questions
7
votes
1 answer

Disco/MapReduce: Using results of previous iteration as input to new iteration

I am currently implementing PageRank on Disco. Because it is an iterative algorithm, the results of one iteration are used as input to the next iteration. I have a large file which represents all the links, with each row representing a page and the values in…
muckabout
  • 1,923
  • 1
  • 19
  • 31
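
The iteration pattern described in the PageRank question above can be illustrated without any cluster. The following is a minimal pure-Python sketch, not Disco code: the names `map_page`, `reduce_page`, `iterate`, and `pagerank` are hypothetical, the damping factor 0.85 is an assumed conventional value, and every page is assumed to have its own row in the link file. The point it shows is that each pass returns data in the same shape it consumed, so the output of one pass can be fed straight into the next.

```python
from collections import defaultdict

DAMPING = 0.85  # assumed conventional damping factor, not taken from the question


def map_page(page, rank, links):
    """Emit the page's own link list plus a rank share for each outgoing link."""
    yield page, ("links", links)
    for target in links:
        yield target, ("rank", rank / len(links))


def reduce_page(page, values, n_pages):
    """Combine the incoming rank shares for one page into its new rank."""
    links, incoming = [], 0.0
    for kind, value in values:
        if kind == "links":
            links = value
        else:
            incoming += value
    new_rank = (1 - DAMPING) / n_pages + DAMPING * incoming
    return page, new_rank, links


def iterate(state):
    """Run one map/reduce pass; the result has the same shape as the input."""
    grouped = defaultdict(list)
    for page, (rank, links) in state.items():
        for key, value in map_page(page, rank, links):
            grouped[key].append(value)
    new_state = {}
    for page, values in grouped.items():
        _, rank, links = reduce_page(page, values, len(state))
        new_state[page] = (rank, links)
    return new_state


def pagerank(graph, iterations=20):
    """graph maps each page to the list of pages it links to."""
    state = {page: (1.0 / len(graph), links) for page, links in graph.items()}
    for _ in range(iterations):
        state = iterate(state)  # output of this pass is the input of the next
    return {page: rank for page, (rank, _) in state.items()}
```

In a real Disco job the loop body would presumably submit a job and pass its results to the next one; the sketch only models the data flow.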
5
votes
1 answer

Web Service Proxy Code Generated by WSDL.exe Versus "Update Web Reference" - Should I Care?

Using Visual Studio 2010, we have a solution with several web sites (not web application projects) and command line and winforms projects. All target .Net 2.0. Many of the projects have web references to the ASMX web services in the web sites. The…
Tom Winter
  • 1,813
  • 3
  • 18
  • 23
5
votes
2 answers

Celery for Map-Reduce, or other alternatives in Python?

I have expensive jobs that are well suited to the map-and-reduce model (long story short, the goal is to aggregate a few hundred rankings that were previously calculated via some time-consuming algorithm). I wanted to parallelize the jobs on…
Leonth
  • 711
  • 1
  • 9
  • 19
3
votes
2 answers

no module named disco.core

I've been following the tutorial here: http://discoproject.org/doc/disco/start/install.html and have been successful up to the point where I run the script. I get the error: no module named disco.core. I have installed disco according to the…
dvreed77
  • 2,217
  • 2
  • 27
  • 42
3
votes
1 answer

Python - Map / Reduce - How do I read JSON specific field in using DISCO count words example

I'm following along with the DISCO example for counting words from a file: "Counting Words as a map/reduce job". I have no issues getting this working; however, I want to try reading in a specific field from a text file that contains JSON strings. The…
secumind
  • 1,141
  • 1
  • 17
  • 38
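
For the JSON question above, one way to adapt a word-count style map function is to parse each line as JSON and split only the field of interest. This is a hedged sketch in plain Python rather than a verified Disco job: `map_json_field`, `reduce_counts`, and `count_words` are hypothetical names, `params` is modelled as an ordinary dict with an assumed `"field"` key, and `itertools.groupby` stands in for whatever grouping helper the real example uses.

```python
import json
from itertools import groupby


def map_json_field(line, params):
    """Yield (word, 1) for every word in one JSON field of the input line."""
    try:
        record = json.loads(line)
    except ValueError:
        return  # skip lines that are not valid JSON
    if not isinstance(record, dict):
        return
    text = record.get(params["field"], "")
    for word in str(text).split():
        yield word, 1


def reduce_counts(pairs, params):
    """Sum the counts for each word; input is sorted so equal keys are adjacent."""
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)


def count_words(lines, field):
    """Local stand-in for running the map and reduce phases."""
    params = {"field": field}
    pairs = [kv for line in lines for kv in map_json_field(line, params)]
    return dict(reduce_counts(pairs, params))
```

Skipping malformed lines is a design choice made for the sketch; a job that must not lose records might log or count them instead.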
2
votes
1 answer

Generating WSDL & disco files

I want to generate a WSDL file and a disco file automatically (e.g. via a .bat file). These files will be generated if a service reference is added to a (test) project. wsdl.exe and disco.exe are missing. Can you tell me the best practice?
mnemonic
  • 1,605
  • 2
  • 17
  • 26
2
votes
1 answer

how to get job results disco python

How to get job results from disco python? I have tried disco jobs: jmunsch@disco-master-5147:~$ disco jobs KeyCount@5ca:2d323:53093 KeyCount@5ca:2bcb5:4f479 disco results: jmunsch@disco-master-5147:~$ disco results…
jmunsch
  • 22,771
  • 11
  • 93
  • 114
2
votes
1 answer

STM32 odd timer1 behavior in Master-Slave Configuration - mb code issue

I'm currently working on an embedded system which is meant to output a specific pulse sequence at equidistant timings. Therefore, I use the STM32 - DISCO board with the FreeRTOS kernel, as a first test. I configured TIM3 as Master triggering TIM1.…
eimer
  • 81
  • 7
2
votes
1 answer

Disco/MapReduce: Using chain_reader on split data

My algorithm currently uses nr_reduces 1 because I need to ensure that the data for a given key is aggregated. To pass input to the next iteration, one should use "chain_reader". However, the results from a mapper come as a single result list, and…
muckabout
  • 1,923
  • 1
  • 19
  • 31
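
The concern in the question above, that all data for a given key must end up in one place, is usually addressed in MapReduce systems by partitioning on a hash of the key. The sketch below illustrates that general idea in plain Python; it is not Disco's partitioner, and `partition`, `shuffle`, and `reduce_bucket` are hypothetical names. `zlib.crc32` is used only because it gives a hash that is stable across processes.

```python
import zlib
from collections import defaultdict


def partition(key, nr_partitions):
    """Map a key to a partition index; equal keys always get the same index."""
    return zlib.crc32(str(key).encode("utf-8")) % nr_partitions


def shuffle(pairs, nr_partitions):
    """Route every (key, value) pair to the bucket chosen by its key."""
    buckets = defaultdict(list)
    for key, value in pairs:
        buckets[partition(key, nr_partitions)].append((key, value))
    return buckets


def reduce_bucket(bucket):
    """Sum values per key within a single bucket."""
    totals = defaultdict(int)
    for key, value in bucket:
        totals[key] += value
    return dict(totals)
```

Because a key never spans two buckets, each bucket can be reduced independently and the per-bucket results can simply be concatenated.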
2
votes
0 answers

Confusion about file accesses in disco

I have a simple 2 node cluster (master on one, workers on both). I tried using: python disco/util/distrfiles.py bigtxt /etc/nodes > bigtxt.chunks To distribute the files (which worked ok). I expected this to mean that the processes would spawn and…
muckabout
  • 1,923
  • 1
  • 19
  • 31
2
votes
1 answer

Reading Data from DDFS ValueError: No JSON object could be decoded

I'm running dozens of map reduce jobs for a number of different purposes using disco. My data has grown enormous and I thought I would try using DDFS for a change rather than standard txt files. I've followed the DISCO map/reduce example Counting…
secumind
  • 1,141
  • 1
  • 17
  • 38
1
vote
1 answer

Specifying output uri for a Disco mapreduce job

I would like to have a completed Disco job write directly to mongodb. Is there an easy way to specify an output url for Disco to send its data to?
Andruf
  • 275
  • 1
  • 2
  • 7
1
vote
1 answer

Disco Diffusion: PytorchStreamReader failed reading zip archive: failed finding central directory

I'm trying to run Disco Diffusion v5 with basically default settings and getting this error when I try to create the image. Does anyone know how to fix this? Starting Run: Horse 02(0) at frame 0 Prepping model... RuntimeError …
Trevor Alyn
  • 687
  • 2
  • 8
  • 20
1
vote
1 answer

What does Disco's "Could not parse worker event:" error mean?

I'm trying to run a Disco job using map and reduce functions that are deserialized after being passed over a TCP socket using the marshal library. Specifically, I'm unpacking them with code = marshal.loads(data_from_tcp) func =…
nickname
  • 1,187
  • 1
  • 9
  • 20
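
The question above mentions rebuilding functions from bytes with `marshal`. The excerpt is truncated, so the following is only an assumption about the general technique, using the standard-library calls `marshal.dumps`, `marshal.loads`, and `types.FunctionType`; the helper names `serialize_function` and `deserialize_function` are hypothetical.

```python
import marshal
import types


def serialize_function(func):
    """Serialize only the function's code object (no defaults, no closure)."""
    return marshal.dumps(func.__code__)


def deserialize_function(payload, name="remote_func"):
    """Rebuild a callable from marshalled code, bound to this module's globals."""
    code = marshal.loads(payload)
    return types.FunctionType(code, globals(), name)
```

Two caveats apply to this approach in general: the marshal format is specific to the Python version, so sender and receiver need matching interpreters, and `marshal.loads` should never be called on data from an untrusted source. Any modules the function uses must also be importable on the receiving side, since only the code object travels.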
1
vote
2 answers

Running a Disco map-reduce job on data stored in Discodex

I have a large amount of static data that needs to offer random access. Since I'm using Disco to digest it, I'm using the very impressive looking Discodex (key, value) store on top of the Disco Distributed File System. However, Disco's…
nickname
  • 1,187
  • 1
  • 9
  • 20