
We have a huge file of different URLs (~500K to ~1M URLs).
We want to use Grinder 3 to distribute these URLs to the Workers so that every Worker invokes a single, different URL.

In the Jython script we could:

  • Read the file once per Agent

  • Allocate line-number ranges per Agent

  • Have every Worker get a line/URL from its Agent's line-number range, according to its run ID

This still means loading a huge file into memory and writing custom code for a problem that is probably common to many.
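For illustration, a rough Jython sketch of that hand-rolled approach might look like the following (the file path, slice size, and indexing scheme are all assumptions, and grinder.logger assumes a recent Grinder 3.x where it is an SLF4J logger):

    from net.grinder.script.Grinder import grinder

    URLS_PER_AGENT = 100000              # assumed slice size per Agent
    FILE_PATH = "/data/urls.txt"         # assumed master file location

    # Load this Agent's line range once, when the worker process starts --
    # note this still reads the whole file into memory before slicing.
    start = grinder.getAgentNumber() * URLS_PER_AGENT
    f = open(FILE_PATH)
    urls = [line.strip() for line in f][start : start + URLS_PER_AGENT]
    f.close()

    class TestRunner:
        def __call__(self):
            # Give every thread/run combination a distinct index into the slice.
            runs = grinder.getProperties().getInt("grinder.runs", 1)
            index = grinder.getThreadNumber() * runs + grinder.getRunNumber()
            grinder.logger.info("would hit: " + urls[index])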

Any ideas for a simpler, ready-made solution?

user3139774

2 Answers


I used Grinder in a similar fashion a while back, and wrote a utility for multi-threaded, one-time ingestion of URLs from a large file.

See https://bitbucket.org/travis_bear/file_util -- in particular, the sequential reader.

I'd recommend using the split command-line utility (or similar) to give separate chunks of the master file to each agent prior to executing your Grinder run.
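For instance, GNU split can do the chunking (the 250,000-line chunk size is just an assumption; match it to your agent count):

    split -l 250000 urls.txt agent_urls_
    # produces agent_urls_aa, agent_urls_ab, ... -- deploy one chunk per agent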

Travis Bear
  • This definitely helps with the actual reading of the file, but we'd still presume there's standard code for distributing the work to the Agents and Workers. – user3139774 Mar 06 '16 at 06:19

Since it's such a huge file, I would take a different approach. How many threads are you planning to spawn? You probably already know that you can call grinder.getThreadNumber() to get the number of the currently executing thread. You could divide the file with a pre-processor into one file per thread, each with an equal number of records, naming the files 0, 1, 2, etc. so that each name matches a thread number.
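A minimal sketch of such a pre-processor, as a plain Python script run before the test (the thread count and file names are assumptions):

    # Deal the master file's lines out round-robin, one output file per thread.
    THREADS = 20                                          # assumed thread count
    outputs = [open(str(i), "w") for i in range(THREADS)] # files named 0, 1, 2, ...
    master = open("urls.txt")                             # assumed master file name
    for n, line in enumerate(master):
        outputs[n % THREADS].write(line)                  # equal counts, +/- one line
    master.close()
    for out in outputs:
        out.close()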

The reason I'm suggesting this is that processing the file looks like a pre-processing task; what matters at run time is its contents. File processing should not interfere with the threads while they are executing.

This way each thread has its own file and there are no collisions.

For example: 20 threads, 20 files. Choose the number of threads carefully, though; expected peak concurrency plus 50% may be a reasonable sizing.
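On the Grinder side, each thread could then load only its own file; a minimal Jython sketch (the directory is an assumption, and grinder.logger again assumes a recent Grinder 3.x):

    from net.grinder.script.Grinder import grinder

    class TestRunner:
        def __init__(self):
            # Grinder creates one TestRunner instance per thread, so this runs
            # once per thread: open only the file matching the thread number.
            f = open("/data/urls/%d" % grinder.getThreadNumber())
            self.urls = [line.strip() for line in f]
            f.close()

        def __call__(self):
            # One URL per run; wrap around if runs outnumber the file's lines.
            url = self.urls[grinder.getRunNumber() % len(self.urls)]
            grinder.logger.info("would hit: " + url)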

user666