
I'm trying to consume an ATOM feed of concert data and output it to JSON for a bit nicer consumption.

So far I've been using request to fetch the data and feedparser to parse it, and it seems to be working as I'd like.

var request = require('request'),
    FeedParser = require('feedparser'),
    fs = require('fs');

// data
var feed = 'http://mix.chimpfeedr.com/630a0-dcshows';
var wstream = fs.createWriteStream('data.json');

var req = request(feed);
var feedparser = new FeedParser({
        addmeta: false
    });

req.on('response', function(res) {
    var stream = this;
    if (res.statusCode != 200) return this.emit('error', new Error('Bad status code'));
    stream.pipe(feedparser);
});

feedparser.on('readable', function() {
    var stream = this;
    var item;

    while (item = stream.read()) {
        // ... do some business work to get a `data` object

        wstream.write( JSON.stringify(data) + ',' );
    }
});

This writes a file that's literally a concatenated list of these data objects:

{
    object1
}, {
    object2
}, {
    etc
},

This is cool, but I'd like the output to be wrapped in an array, and I'd like the last item not to have a trailing comma. I'm sure there are ways I could hack around this, but I think I'm missing a core concept of the stream approach and what's actually happening.

So my question is: How do I transform a readable stream of XML and output an array of valid JSON?

imjared

1 Answer


The problem with your approach is that you are adding the comma at the end of every JSON element you write to the stream. That fails because, at that point, you cannot know whether more data will come out of the read stream.

A better approach is to add the comma at the beginning of a JSON element, but only if you have already processed at least one element before. For this, you can keep a flag (or a counter of processed elements) and use it to decide whether you are on the first element.

If you are on the first element, write "[" to the stream to mark the beginning of the array, then write the element itself. For the second, third, or any later element, write a comma first and then the element.

Finally, add a listener for the 'end' event on your read stream. That way you are notified when the data is exhausted, and you can write the closing bracket "]" to the write stream and complete a valid JSON array.

I have created a simplified version of this example, using a local file on my hard disk. I am pretty sure you can adapt it to your case.

var FeedParser = require('feedparser'),
    fs = require('fs'), 
    feed = __dirname+'/rss2sample.xml';

var ws = fs.createWriteStream('data.json');
var first = true;
fs.createReadStream(feed)
  .on('error', function (error) {
    console.error(error);
  })
  .pipe(new FeedParser())
  .on('error', function (error) {
    console.error(error);
  })
  .on('readable', function() {
    var stream = this, item;
    while (item = stream.read()) {
      if(first){
        ws.write('[');
        first = false;
      } else {
        ws.write(',');
      }
      ws.write(JSON.stringify(item));
    }
  })
  .on('end', function(){
    // close the array and the write stream together
    ws.end(']');
  });

This produces a valid JSON file.
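The first-element flag can also be checked in isolation, without a feed at all. This is a minimal sketch of the same comma-prefix pattern applied to a plain array standing in for the parsed feed items (the `items` array here is just made-up example data):

```javascript
// Build a JSON array string with the comma-before-element pattern:
// "[" before the first element, "," before every later one, "]" at the end.
var items = [{ id: 1 }, { id: 2 }, { id: 3 }];

var out = '';
var first = true;
items.forEach(function (item) {
  if (first) {
    out += '[';       // opening bracket, only once
    first = false;
  } else {
    out += ',';       // separator before every non-first element
  }
  out += JSON.stringify(item);
});
out += ']';           // closing bracket after the last element

// The result round-trips through JSON.parse without errors.
console.log(out);                      // [{"id":1},{"id":2},{"id":3}]
console.log(JSON.parse(out).length);   // 3
```

Because the separator is decided when an element is written, not after it, it never matters that a stream cannot tell you in advance which element is the last one.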

Edwin Dalorzo