4

For a little project I would like to convert an image to a binary representation (zeroes and ones). Because I want to use this on a machine that is not configured to run PHP, .NET, etc., I would like to do this in JavaScript.

So far I have managed to read the file using FileReader, which gives me an ArrayBuffer.

But I am clueless as to how to convert this ArrayBuffer to a bit representation.

My current version can be found on this jsfiddle.

function ArrayBufferToBit(buffer) {
    // How to convert my array buffer to a textual bit-representation? 0 1 1 0 0 0...
    return buffer;
}

Your help is much appreciated!

Jules

2 Answers

2

Use a DataView:

function ArrayBufferToBit(buffer) {
    var dataView = new DataView(buffer);
    var response = "", offset = (8/8);
    // I assume we will read the entire file as a series of 8-bit integers,
    // i.e. byte by byte, hence my choice of offset.
    for (var i = 0; i < dataView.byteLength; i += offset) {
        // toString(2) is the secret sauce here. getUint8 avoids negative values
        // (getInt8 would print "-1" for the byte 0xFF), and padStart keeps the
        // leading zeros so that every byte yields exactly 8 digits.
        response += dataView.getUint8(i).toString(2).padStart(8, "0");
    }
    return response;
}

DataViews let you read and write numeric data. getUint8 reads the byte at the given position in the ArrayBuffer (here i, the loop index) as an unsigned 8-bit integer, toString(2) converts that integer to a binary string of 1s and 0s, and padStart(8, "0") restores the leading zeros that toString(2) drops.
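A quick way to see why signedness and padding matter when building the bit string is sketched below. The helper name `byteToBits` and the two sample bytes are made up for illustration.

```javascript
// Hypothetical helper, not part of the answer's code.
function byteToBits(byte) {
  // Mask to 0..255 so a signed input such as -1 is treated as the raw byte 0xFF.
  return (byte & 0xff).toString(2).padStart(8, "0");
}

const view = new DataView(new Uint8Array([5, 255]).buffer);

console.log(view.getInt8(0).toString(2));  // "101" (leading zeros are dropped)
console.log(view.getInt8(1).toString(2));  // "-1"  (0xFF read as signed is -1)
console.log(byteToBits(view.getUint8(0))); // "00000101"
console.log(byteToBits(view.getUint8(1))); // "11111111"
```

Without the padding, the bytes cannot be told apart in the concatenated string, because each one contributes a different number of digits.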

The 'magic' offset value follows from reading the file one byte (8 bits) at a time: 8/8 = 1 byte per step. If we instead read the bytes as 32-bit integers, each read would cover 32/8 = 4 bytes, so the byte offset would be 4.
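To make the offset arithmetic concrete, here is a minimal sketch that steps through a buffer in 32-bit reads. The function name `bufferTo32BitWords` is hypothetical, big-endian byte order (the DataView default) is assumed, and trailing bytes that do not fill a whole 32-bit word are simply ignored.

```javascript
function bufferTo32BitWords(buffer) {
  const view = new DataView(buffer);
  const offset = 32 / 8; // 4 bytes per 32-bit read
  const words = [];
  // Stop before a read would run past the end of the buffer.
  for (let i = 0; i + offset <= view.byteLength; i += offset) {
    // getUint32 reads big-endian by default; pad so each word keeps 32 digits.
    words.push(view.getUint32(i).toString(2).padStart(32, "0"));
  }
  return words;
}

const sample = new Uint8Array([0, 0, 0, 1, 255, 0, 0, 0]).buffer;
const words = bufferTo32BitWords(sample);
console.log(words.length); // 2
console.log(words[0]);     // 31 zeros followed by a single "1"
console.log(words[1]);     // eight "1"s followed by 24 zeros
```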


This, in addition to typed arrays, is the recommended way to read/write from an ArrayBuffer:

The ArrayBuffer object is used to represent a generic, fixed-length raw binary data buffer. You cannot directly manipulate the contents of an ArrayBuffer; instead, you create one of the typed array objects or a DataView object which represents the buffer in a specific format, and use that to read and write the contents of the buffer.

Besides 8-bit integers, a DataView can read a variety of other representations (such as Int32 or Float64). For integer types, toString(2) shows the value in binary, though the length of the binary string changes with the width of the type. For float types, however, toString(2) prints the numeric value in base 2 rather than the raw bit pattern stored in the file, so stick to unsigned integer reads if you want the file's actual bits.
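A small sketch of how the chosen representation affects what toString(2) prints. The sample buffer and variable names are invented for this example.

```javascript
// Eight bytes holding the float64 value 0.5 (big-endian, the DataView default).
const buffer = new ArrayBuffer(8);
const view = new DataView(buffer);
view.setFloat64(0, 0.5);

// Reading as a float prints the numeric value in base 2, not the stored bits.
console.log(view.getFloat64(0).toString(2)); // "0.1"

// Reading byte by byte as unsigned integers shows the stored bits.
let bits = "";
for (let i = 0; i < view.byteLength; i++) {
  bits += view.getUint8(i).toString(2).padStart(8, "0");
}
console.log(bits.length);       // 64
console.log(bits.slice(0, 12)); // "001111111110" (sign bit, then 11 exponent bits)
```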

Note that in this example I have chosen to represent the entire file as a series of 8-bit integers, i.e. reading byte by byte. In general, however, DataViews facilitate the mixing of heterogeneous types: you can read the first 12 bytes as 32-bit integers and the remainder as 64-bit values, for instance. DataViews are usually preferred when handling files because different file formats can be handled this way and because DataViews also handle the endianness of files from different architectures.
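As a rough illustration of mixing types and byte orders, the sketch below parses a made-up 6-byte header: a 32-bit big-endian field followed by a 16-bit little-endian field. The layout and the names `parseHeader`, `magic`, and `version` are invented for this example and do not correspond to any real file format.

```javascript
function parseHeader(buffer) {
  const view = new DataView(buffer);
  return {
    // Second argument false (the default) means big-endian.
    magic: view.getUint32(0, false),
    // Second argument true means little-endian.
    version: view.getUint16(4, true),
  };
}

const bytes = new Uint8Array([0x12, 0x34, 0x56, 0x78, 0x02, 0x00]);
const header = parseHeader(bytes.buffer);
console.log(header.magic.toString(16)); // "12345678"
console.log(header.version);            // 2
```

A typed array such as Uint32Array always uses the byte order of the machine running the code, which is why the explicit flag on the DataView getters is useful for file formats with a fixed byte order.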

A task like this one can be handled by typed arrays, as in @le_m's answer, or by DataViews. DataViews, however, can also deal with endianness issues (for example, when files are transferred over a network between different CPUs) and with file formats that mix types (e.g. PDF files, which have some byte headers before the main content).

Akshat Mahajan
  • Thanks for your reply. I think there's still something wrong in your example though. The `dataView.getInt32(0).toString(2)` gives me as a result `1000111010010010100011000111000` for an image. Seems a bit short... – Jules Jul 13 '16 at 11:28
  • @Jules I have revised and updated my answer with something better. – Akshat Mahajan Jul 13 '16 at 16:54
  • @AkshatMahajan I ran into the same problem as the OP recently and followed your answer and explanation. It is good and detailed enough for a beginner like me. I then wrote a full demo that prints the binary/hex data on screen, but I am really new to binary handling. I get a result, **but I am not sure whether I am understanding/doing it right. Could you check whether there is anything I can improve? [fiddle here](https://jsfiddle.net/69c5wcmL/1/)** – Lien Jan 19 '17 at 07:59
1

You need to iterate through the content of the array buffer. You could use a DataView for that or a typed array.

Since we are going to read individual bytes, we don't need to worry about the big or little endianness of our system and can safely use a Uint8Array, whose reduce() method lets us combine all elements into a single binary string.

To convert a byte to binary, we can use the Number.toString(base) method.

function arrayBufferToBinary(buffer) {
  var uint8 = new Uint8Array(buffer);
  // Pad each byte to 8 digits; toString(2) on its own drops leading zeros.
  return uint8.reduce((binary, byte) => binary + byte.toString(2).padStart(8, "0"), "");
}

function fileToBinary(file, callback) {  
  var reader = new FileReader();
  reader.onload = (event) => callback(arrayBufferToBinary(reader.result));
  reader.readAsArrayBuffer(file);
}

var input = document.getElementById("file");
var output = document.getElementById("binary");

input.addEventListener("change", function(event) {
  var file = input.files[0];
  if (file) fileToBinary(file, (binary) => output.textContent = binary);
});

CSS:

#binary {
  word-wrap: break-word;
}

HTML:

<input id="file" type="file">
<div id="binary"></div>

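If you want the bytes visually separated in the output, one possible variant builds one string per byte and joins them. The name `arrayBufferToBinaryGroups` and the space separator are assumptions for this sketch, not part of the snippet above.

```javascript
function arrayBufferToBinaryGroups(buffer, separator = " ") {
  // Array.from with a map function converts each byte to an 8-digit string.
  return Array.from(new Uint8Array(buffer), (byte) =>
    byte.toString(2).padStart(8, "0")
  ).join(separator);
}

const demo = new Uint8Array([1, 2, 255]).buffer;
console.log(arrayBufferToBinaryGroups(demo));     // "00000001 00000010 11111111"
console.log(arrayBufferToBinaryGroups(demo, "")); // "000000010000001011111111"
```
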
le_m