0

OK, so I've built a DSL, and part of it requires the user of the DSL to define what I call a 'writer block':

  writer do |data_block|
    CSV.open("data.csv", "wb") do |csv|
      headers_written = false
      data_block do |hash|
        (csv << headers_written && headers_written = true) unless headers_written
        csv << hash.values
      end
    end
  end

The writer block gets called like this:

  def pull_and_store
    raise "No writer detected" unless @writer
    @writer.call( -> (&block) {
      pull(pull_initial,&block)
    })
  end

The problem is twofold. First, is this the best way to handle this kind of thing? Second, I'm getting a strange error:

undefined method `data_block' for Servo_City:Class (NoMethodError)

It's strange because I can see data_block right there, or at least it exists before the CSV block at any rate.

What I'm trying to create is a way for the user to write a wrapper block that both wraps around another block and yields a block to the block being wrapped. Wow, that's a mouthful.

Thermatix
  • Hmm, I don't really understand what you are trying to accomplish. There are multiple possible causes of failure, some possibly in the code you don't show. Just an educated guess: You call `writer` with a block that accepts a block `data_block` which in turn accepts a block. Seems very complicated for a DSL. Try dereferencing it like so: `writer do |&data_block|` and then invoke it with `data_block.call do |hash|`. What I don't see is where the `hash` is being passed in. (Does `pull` do that?) Why are you not passing it to the writer block in the first place like `writer do |data_block, hash|`? – Raffael Mar 05 '17 at 19:49
  • This is why I'm asking, it's a bit complicated. `pull` takes in a block and yields a hash to it; hence `@writer.call` takes a block which is the block passed to `data_block`. The reason I'm doing it this way is that it's the only way I can think of to wrap the `pull` method inside of the `CSV` block. Otherwise I'd have to create a csv object, append the hash to the csv file and then close it for every hash yielded. I get the feeling I'm overcomplicating this; could I maybe pass an IO object or something along and use that instead? – Thermatix Mar 05 '17 at 19:57
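
For reference, here is a minimal sketch of why the error appears. It assumes a hypothetical `pull` that yields two hashes, since the real method is not shown. Because `@writer.call` passes the lambda as an ordinary argument, `data_block` is a local variable inside the writer block. Ruby parses `data_block do |hash| ... end` as a call to a method named `data_block`, which is where the NoMethodError comes from. Invoking the lambda explicitly with `.call` avoids that.

```ruby
# Hypothetical stand-in for the DSL's pull method: yields one hash per record.
def pull
  yield({ name: "servo", price: 10 })
  yield({ name: "gear", price: 3 })
end

# Roughly what pull_and_store hands to the writer block:
# a lambda that accepts a block and forwards it to pull.
data_block = ->(&block) { pull(&block) }

rows = []

# A lambda held in a local variable must be invoked explicitly.
# Writing `data_block do |hash| ... end` would be treated as a method call.
data_block.call do |hash|
  rows << hash.values
end

p rows
```

Inside the writer block from the question, the equivalent change would be `data_block.call do |hash|` in place of `data_block do |hash|`.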

2 Answers

1

Inner me does not want to write an answer before the question is clarified.
Other me wagers that code examples will help to clarify the problem.


I assume that the writer block has the task of persisting some data. Could you pass the data into the block in an enumerable form? That would allow the DSL user to write something like this:

writer do |data|
  CSV.open("data.csv", "wb") do |csv|
    csv << header_row
    data.each do |hash|
      data_row = hash.values
      csv << data_row
    end
  end
end

No block passing required.

Note that you can pass in a lazy collection if you are dealing with very large data sets.
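
As a rough sketch of how a block-yielding method could be exposed as an enumerable, the standard `Enumerator` class can wrap it. The `pull` below is a hypothetical stand-in that simply yields two hashes; the real one is not shown in the question.

```ruby
# Hypothetical stand-in for a pull method that only yields hashes to a block.
def pull
  yield({ name: "servo", price: 10 })
  yield({ name: "gear", price: 3 })
end

# Wrap the block-yielding call in an Enumerator. Nothing is pulled
# until the consumer starts iterating.
data = Enumerator.new do |yielder|
  pull { |hash| yielder << hash }
end

# The writer block can now treat data like any other enumerable,
# including lazily for large data sets.
rows = data.lazy.map(&:values).to_a

p rows
```

With this shape, the `data.each do |hash|` loop in the writer block above works unchanged, and the records are still produced one at a time.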

Does this solve your problem?

Raffael
  • Sorry for not responding till now, `data.each` won't work because it doesn't yield an iterator, rather it calls the block and passes a hash to the block. – Thermatix Mar 07 '17 at 19:43
  • That's the point. Make it yield an enumerable. Why does it have to be blocks? – Raffael Mar 08 '17 at 20:01
  • Because it's just not possible to make it enumerable, not with the way the code is structured, and the block goes a few method calls deep before it is even used. – Thermatix Mar 08 '17 at 21:56
  • I'd love to help you make this work and even simplify it for the DSL user. I am fairly certain that it IS possible. Would you be willing to show me your code so I can show you how to do this? I'll probably suggest to slightly change "how the code is structured". Be brave ;) – Raffael Mar 09 '17 at 20:10
  • Eck, I'd honestly forgotten about this question, been busy with other things. Umm, tbh it's kind of irrelevant now due to some changes I made. First, I now store all the data I'm scraping in a file-based DB using DBM and move on from there (that way I can resume, and scraping is faster because already-scraped URLs are skipped), and then just do a `.each` on the DBM object. Second, I've now created a hard-coded CSV writer and the user selects it with just a `writer({csv: "servo_city.csv"})`. As a result the question has now become irrelevant. – Thermatix Mar 16 '17 at 18:57
  • This is what the DSL looks like now when loaded as a script: https://gist.github.com/Thermatix/ce89e65f65c65d35944c7534ea4f4063 – Thermatix Mar 16 '17 at 18:57
  • Even simpler. I like it! – Raffael Mar 16 '17 at 19:03
0

Trying to open the CSV file every time you want to write a record seems overly complex and likely to cause bad performance (unless writing is intermittent). It will also overwrite the CSV file each time unless you change the file mode from wb to ab.

I think something simple like:

csv = CSV.open('data.csv', 'wb')
csv << headers
writer do |hash|
  csv << hash.values
end

would be more understandable.
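
A minimal sketch of the same open-once idea, using the block form of `CSV.open` so the file is closed when writing finishes. The `records` array and the temporary path are assumptions made only for illustration; in the DSL the hashes would come from whatever feeds the writer.

```ruby
require "csv"
require "tmpdir"

# Hypothetical records standing in for the hashes the DSL yields.
records = [
  { name: "servo", price: 10 },
  { name: "gear", price: 3 }
]

path = File.join(Dir.mktmpdir, "data.csv")

# Open the file once. The block form closes it even if writing raises.
CSV.open(path, "wb") do |csv|
  records.each_with_index do |hash, index|
    csv << hash.keys if index.zero? # header row taken from the first record
    csv << hash.values
  end
end

puts File.read(path)
```

Taking the header row from the first hash's keys is one possible choice; it assumes every record has the same keys in the same order.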

Marc Rohloff