2

I have this jQuery function, which I found on a tutorial site, that takes a CSV file input and turns it into a table. I tried feeding it a large CSV file (10,000 KB) and my browser crashes. I saw there's a parser library called Papa Parse to handle this, but is there any other approach to keep my browser from crashing while doing this?

Here's the relevant code:

 $("#upload").bind("click", function () {

        var regex = /^([a-zA-Z0-9\s_\\.\-:])+(.csv|.txt)$/;
        if (regex.test($("#fileUpload").val().toLowerCase())) {
                            var reader = new FileReader();
                reader.onload = function (e) {
                    var table = $("<table id='mytable' class='table table-striped table-bordered'/>");
                    var rows = e.target.result.split("\n");
                    text1=e.target.result;
                    var csvString = $.trim(text1);
                    var csvRows    = csvString.split(/\n/);
                    var csvHeaders = csvRows.shift().split(';');
                    var headerstr=String(csvHeaders);
                    var count = (headerstr.match(/,/g) || []).length+1;
                    for (var i = 0; i < rows.length; i++) {
                        var row = $("<tr />");
                        var cells = rows[i].split(",");
                        for (var j = 0; j < count; j++) {

                            if(cells[j]){
                            var cell = $("<td/>");

                            cell.html(cells[j]);
                            }
                            else{
                            var cell = $("<td class='info'/>");
                            cell.html("empty");}
                            row.append(cell);

                        }
                        table.append(row);

                    }


                    $("#dvCSV").html('');
                    $("#dvCSV").append(table);
                }

    });

How do I implement this functionality without crashing my browser? Thanks in advance.

Tania
  • Try using a profiler (Chrome's Dev Tools has one) and find out what eats up all the memory. I'm pretty sure those `var`s inside the for loop are particularly bad. – Sergiu Paraschiv Mar 09 '15 at 09:38
  • Hi @Sergiu. Removing the `var`s inside the for loop didn't help. I am working with Firefox. :( I just need this functionality intact, but without crashing my browser – Tania Mar 09 '15 at 09:45
  • I did not say removing the `var`s inside the loop will fix this. What I was saying is that doing that is particularly taxing on memory. You are doing _a lot_ of other stuff that's not memory and CPU efficient. Repeatedly building the same HTML structure inside the loop is also bad. Try building it outside of the loop and only inject the data inside it, then append it to the DOM. Also try to add batches of table lines at a time, not one at a time. jQuery knows `$(element).append([list of nodes])` and optimizes it using document fragments (see the sketch after these comments). – Sergiu Paraschiv Mar 09 '15 at 10:28
  • The point is even though PapaParse _handles_ the data you are then responsible for displaying it. I'd first think about paging if I were you. If paging is not OK then you really, really need to optimize your jQuery code. – Sergiu Paraschiv Mar 09 '15 at 10:32
  • Hi. I understand. I have to rewrite this code so the HTML is created outside the for loops. Can you please add your suggestions with a supporting piece of code as a solution? – Tania Mar 09 '15 at 11:04
  • See my response below. – Sergiu Paraschiv Mar 09 '15 at 12:47
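
A minimal sketch of the batching approach described in the comments above (hypothetical code, not from the original thread): build each row's markup as a string, collect the `<tr>` nodes in an array, and append them to the table in a single call so jQuery can batch the insertion.

// Hypothetical sketch of the "build outside the loop, append in batches" advice.
// Assumes `rows` (CSV lines), `count` (column count), and an empty #mytable
// table already in the DOM, all taken from the question's code.
var trNodes = [];
for (var i = 0; i < rows.length; i++) {
    var cells = rows[i].split(",");
    var html = "";
    for (var j = 0; j < count; j++) {
        // Build plain strings instead of creating a jQuery object per cell.
        html += cells[j] ? "<td>" + cells[j] + "</td>" : "<td class='info'>empty</td>";
    }
    trNodes.push($("<tr/>").html(html)[0]);
}
// One append call with an array of nodes lets jQuery use a document fragment.
$("#mytable").append(trNodes);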

2 Answers

4

Two big issues in tackling this problem:

1) The CSV parser. Papa Parse is great. It has support for workers and is a streaming parser - the only way to go with large files.

2) The way you display the data. Simply outputting each row in a table won't work. I crashed my PC twice trying to come up with a working solution. The only way of doing this, and the one used by basically any system dealing with large files, is to use virtualized lists. I ended up using this one. It's simple and the code is easy to understand.

Here is my JS:

$("#fUpload").bind("change", function(evt) {
    var bigFile = evt.target.files[0];
    var rows = [];
    Papa.parse(bigFile, {
        delimiter: ",",
        newline: "\n",
        header: false,
        dynamicTyping: false,
        worker: false,
        chunk: function(results) {
            // Accumulate each parsed chunk; concat returns a new array, so reassign it.
            rows = rows.concat(results.data);
        },
        complete: function() {
            var list = new VirtualList({
              h: 300,
              itemHeight: 30,
              totalRows: rows.length,
              generatorFn: function(row) {
                  var el = document.createElement("div");
                  el.innerHTML = "<p>ITEM " + row + ' -> ' + rows[row].join(' - ') + "</p>";
                  return el;
              }
            });
            document.body.appendChild(list.container);
        }
    });
});

The HTML contains this input: `<input type="file" id="fUpload" />`

How I configured Papa:

  • delimiter and newline: if you allow it to try and detect them it will fail or take longer;

  • worker: this will spawn a worker process. It will be slower but will keep the UI responsive (the UI thread won't do any work). You'll probably want to set this to true in production. (This won't work on JSFiddle because of browser cross-domain security protocol!);

  • chunk: instead of a callback for each parsed row have one for a larger set of rows. This is faster;

The virtual list config is the default one.

You can run it here.

I tested with a 9.4 MB CSV file containing 1,Foo,100 repeated on each line.
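
To reproduce a test file of that shape locally, a throwaway Node.js sketch along these lines works (not part of the original answer; the size is approximate):

// Generates a roughly 9-10 MB CSV with "1,Foo,100" repeated on every line.
var fs = require("fs");
var line = "1,Foo,100\n";
var times = Math.ceil(9.4 * 1024 * 1024 / line.length);
fs.writeFileSync("big.csv", line.repeat(times));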

Here's the same thing but using a table to output the data, with -1 added to VirtualList's totalRows to compensate for the actual row count.
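
The updated fiddle isn't reproduced here, but roughly the table variant amounts to changing generatorFn (and totalRows); a sketch under those assumptions:

// Sketch only - approximates the table-output variant described above.
// Each virtual item is rendered as its own one-row table so the markup stays
// valid inside the list's positioned container.
var list = new VirtualList({
    h: 300,
    itemHeight: 30,
    totalRows: rows.length - 1, // -1 as described above
    generatorFn: function(row) {
        var el = document.createElement("table");
        el.className = "table table-bordered";
        el.innerHTML = "<tr><td>" + rows[row].join("</td><td>") + "</td></tr>";
        return el;
    }
});
document.body.appendChild(list.container);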

Sergiu Paraschiv
  • Thanks for the working solution. Is it possible to tablify this output? And how do I link Papa Parse to my code? Is there a CDN mirror to do it? – Tania Mar 09 '15 at 13:15
  • You can download Papa Parse from their website: http://papaparse.com/ It's hosted on GitHub. – Sergiu Paraschiv Mar 09 '15 at 13:35
  • I downloaded and added the papaparse folder inside my working folder, but it doesn't work. – Tania Mar 09 '15 at 13:41
  • You have to load it in your HTML: `<script src="lib/whatever.js"></script>`, and probably a second `<script>` tag for vlist.js too (where `lib/whatever.js` is the relative path to the js file). Also see my updated answer for a table version. – Sergiu Paraschiv Mar 09 '15 at 13:44
  • Thank you for the updated code. I tried placing the js files alongside the HTML document I am working with, and also added them in a directory, but papaparse.js and vlist.js aren't recognized. :( I can see it works well in the fiddle but I am not able to do it in my HTML page – Tania Mar 09 '15 at 14:09
  • Well, open the debug console in Firefox and see what the error message is. Your paths are probably wrong. – Sergiu Paraschiv Mar 09 '15 at 14:11
  • I just get this status 304 message for jQuery, but Papa Parse doesn't seem to work. Is there any way I can post my code so you can see what messes things up? – Tania Mar 10 '15 at 04:25
  • You'd have to show us your whole project structure and your HTML. I'd advise you to open a new question for that. – Sergiu Paraschiv Mar 10 '15 at 09:03
  • http://stackoverflow.com/questions/28959947/papaparse-vfile-doesnt-work this is the link to my new question. Can you please check it out and tell me what's wrong? – Tania Mar 10 '15 at 09:22
0

I recommend using console.log('...') to identify where it's locking up or breaking, maybe in the first variables (rows, csvString) because of their size. That way you know where to attack.

If it's not in the block before the loop, then I think using jQuery is too expensive in this context, so try a direct DOM approach (inside the loop, at least):

$("#upload").bind("click", function () {
    var regex = /^([a-zA-Z0-9\s_\\.\-:])+(.csv|.txt)$/;
    if (regex.test($("#fileUpload").val().toLowerCase())) {
                        var reader = new FileReader();
            reader.onload = function (e) {
                var table = $("<table id='mytable' class='table table-striped table-bordered'/>");
                var rows = e.target.result.split("\n");
                text1=e.target.result;
                var csvString = $.trim(text1);
                var csvRows    = csvString.split(/\n/);
                var csvHeaders = csvRows.shift().split(';');
                var headerstr=String(csvHeaders);
                var count = (headerstr.match(/,/g) || []).length+1;
                for (var i = 0; i < rows.length; i++) {
                    var row = document.createElement("tr");
                    var cells = rows[i].split(",");
                    for (var j = 0; j < count; j++) {
                        var cell = document.createElement("td");
                        if(cells[j]){

                        cell.appendChild(document.createTextNode(cells[j]));
                        }
                        else{
                            cell.setAttribute('class','info');
                            cell.appendChild(document.createTextNode("empty"));
                        }
                        row.appendChild(cell);

                    }
                    table.append(row);

                }


                $("#dvCSV").html('');
                $("#dvCSV").append(table);
            }

});

I haven't tested this code.

ton