The problem is more with other browsers not really supporting this event correctly.
The progress event should fire during the read operation, when a new Blob chunk has been read into memory but before the reader has finished reading the whole Blob, and thus before it has any result available.
- If chunkPromise is fulfilled with an object whose done property is false and whose value property is a Uint8Array object, run these steps:
    [...]
    3. If roughly 50ms have passed since these steps were last invoked, queue a task to fire a progress event called progress at fr.
- Otherwise, if chunkPromise is fulfilled with an object whose done property is true, queue a task to run the following steps and abort this algorithm:
    [...]
    4. Else:
        1. Set fr’s result to result.
So as you can see, the FileReader's (fr) result is only set when the done property of chunkPromise is true, and that is after the progress event named progress has been fired.
If you want to access the reader's result, then listen for the load event, not for progress.
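For instance, here is a minimal sketch of where the result actually becomes available (the Blob is just a placeholder):

const someBlob = new Blob( [ 'hello world' ] );
const fr = new FileReader();
fr.addEventListener( 'progress', () => {
  // still null here: the result isn't set until the read completes
  // (for such a tiny Blob, progress may not even fire)
  console.log( fr.result );
} );
fr.addEventListener( 'load', () => {
  console.log( fr.result ); // "hello world", the full decoded text
} );
fr.readAsText( someBlob );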
If you really need to read this Blob by chunks, then you'll have to build your own FileReader, using the TextDecoder API.
Even if the browser did expose the internal buffered data in the progress event, this wouldn't be text yet. The package data algorithm is the one responsible for actually converting the bytes to the output format (here text), and it runs only two sub-steps before the aforementioned step 10.5.4, i.e. only when chunkPromise is fulfilled with an object whose done property is true.
In other words, the process is to first gather all the data as a full ArrayBuffer, and only then convert that full ArrayBuffer into whatever output format was requested.
Given how Unicode text encoding works, you can't even directly read as text the chunks of the Blob you'd create with Blob.slice(), because a chunk boundary could very well fall in the middle of a multi-byte character and mangle the whole chunk.
Fortunately for us, the TextDecoder API is able to read a stream of data, thanks to the stream member of its decode() method's options parameter. This means we can pass it chunks of data, and it will be able to decode them without mangling characters.
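As a minimal sketch of that option in action, here we feed the two UTF-8 bytes of 'é' (0xC3 0xA9) to a decoder in separate calls:

const decoder = new TextDecoder();
// first byte alone: the decoder buffers it instead of emitting a replacement character
console.log( decoder.decode( new Uint8Array( [ 0xC3 ] ), { stream: true } ) ); // ""
// second byte: the buffered sequence is now complete
console.log( decoder.decode( new Uint8Array( [ 0xA9 ] ), { stream: true } ) ); // "é"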
So now, all we have to do is read chunks of our Blob as ArrayBuffers (using Blob.arrayBuffer() is straightforward, but we could use a FileReader as a fallback for older browsers), and fire a progress event at every new chunk.
class StreamTextReader extends EventTarget {
  constructor() {
    super();
    this.result = "";
  }
  async read( blob, chunksize = blob.size, encoding ) {
    // dispatch asynchronously, like the native FileReader does
    const queueEvent = (name) => {
      const evt = new ProgressEvent( name );
      setTimeout( () => this.dispatchEvent( evt ) );
    };
    try {
      const decoder = new TextDecoder( encoding );
      this.result = "";
      let current_byte = 0;
      const last_byte = blob.size;
      while( current_byte < last_byte ) {
        // read the next chunk as an ArrayBuffer
        const chunk = blob.slice( current_byte, current_byte + chunksize );
        const buf = await chunk.arrayBuffer();
        // stream: true keeps incomplete byte sequences buffered for the next call
        this.result += decoder.decode( buf, { stream: true } );
        current_byte += chunksize;
        queueEvent( 'progress' );
      }
      // flush anything the decoder may still be holding
      this.result += decoder.decode();
      queueEvent( 'load' );
      return this.result;
    }
    catch( err ) {
      console.log( err );
      queueEvent( 'error' );
      throw err;
    }
  }
}
const blob = new Blob( [ 'fooÀÂâà'.repeat( 10 ) ] );
const reader = new StreamTextReader();
reader.addEventListener( 'progress', (evt) => {
  console.log( "in progress", reader.result );
} );
reader.addEventListener( 'load', (evt) => {
  console.log( "in load", reader.result );
} );
reader.addEventListener( 'error', (evt) => {
  console.log( 'An error occurred' );
} );
// read by chunks of 8 bytes
reader.read( blob, 8 );
And to prove FileReader's inability to handle such streaming processing, here is the result of reading the first chunk of the same Blob:
const blob = new Blob( [ 'fooÀÂâà'.repeat( 10 ) ] );
const reader = new FileReader();
reader.addEventListener( 'load', (evt) => console.log( reader.result ) );
reader.readAsText( blob.slice( 0, 8 ) );
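This logs "fooÀÂ�": the 8-byte slice cuts 'â' in half, so its lone leading byte gets decoded as the replacement character, which is exactly the mangling the stream option avoids.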
Final note
Beware that JavaScript engines do put a max length on strings: in SpiderMonkey I think it's about 1GB, but in V8 it's only 512MB. So if you are going to read very big files, that's something you'll need to handle, but I leave this to you as an exercise.