I have a server that replaces some strings in a file. It looks like this:

var stringToBeReplacedWith = "Cool text";

var data = fs.readFileSync(file, 'utf-8');

var regex = new RegExp("Stringtobereplaced", 'g'); // global search; named `regex` so the built-in RegExp is not shadowed
data = data.replace(regex, stringToBeReplacedWith);

fs.writeFileSync(file, data); 

The code works, but the MIME type/encoding of the file changes.

How can I make sure the MIME type is preserved when replacing the string? I've noticed there are plenty of libraries to read the MIME type, but (so far) I haven't found a library that does the opposite.

Mdlc

1 Answer

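It's because .class files are binary, and you're reading the file in as a UTF-8 string. Byte sequences that aren't valid UTF-8 get replaced with the Unicode replacement character (U+FFFD) during decoding, so when you write the string back out, different bytes end up in the file (hence the change in magic number).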

As long as you aren't trying to replace multi-byte characters, you could change

var data = fs.readFileSync(file, 'utf-8');

to

var data = fs.readFileSync(file, 'binary');

and

fs.writeFileSync(file, data);

to

fs.writeFileSync(file, data, { encoding: 'binary' });

or

fs.writeFileSync(file, Buffer.from(data, 'binary')); // on old Node versions without Buffer.from: new Buffer(data, 'binary')

and it should work as you expect.
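Putting the two changes together, a minimal sketch could look like the following. The helper name `replaceInBinaryFile` and the temporary demo file are made up for illustration, and it assumes a Node version that accepts `'latin1'` (the newer name for the `'binary'` encoding). It also uses `split`/`join` instead of a regular expression, so the search string needs no escaping.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: replace every occurrence of `search` with `replacement`
// in a file, treating the content as raw bytes rather than UTF-8 text.
function replaceInBinaryFile(file, search, replacement) {
  // 'latin1' (alias 'binary') maps each byte to exactly one character
  // (0x00-0xFF), so decoding and re-encoding leaves untouched bytes as they were.
  const data = fs.readFileSync(file, 'latin1');
  const result = data.split(search).join(replacement);
  fs.writeFileSync(file, result, 'latin1');
}

// Usage: a temporary file that starts with the .class magic number 0xCAFEBABE.
const demoFile = path.join(os.tmpdir(), 'binary-replace-demo.bin');
fs.writeFileSync(demoFile, Buffer.concat([
  Buffer.from([0xca, 0xfe, 0xba, 0xbe]),
  Buffer.from('Stringtobereplaced', 'latin1'),
]));
replaceInBinaryFile(demoFile, 'Stringtobereplaced', 'Cool text');

// Read the result back as a Buffer to inspect the raw bytes.
const demoBytes = fs.readFileSync(demoFile);
```

This only holds while both strings consist of characters in the 0x00-0xFF range; anything outside that range would be truncated on write.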

mscdex
  • I'm getting: "Error: Unknown Encoding" when writing to the file. I see there are ways to "encode" the file anyways using a library (http://stackoverflow.com/questions/14551608/cant-find-encodings-for-node-js) but this requires the encoding? Would you perhaps know what encoding to use? – Mdlc Feb 05 '15 at 19:37
  • What version of node are you using? You can also try the alternative use of `fs.writeFileSync` that I've now included in my answer. – mscdex Feb 05 '15 at 20:09
  • Just using 'binary' instead of { encoding: 'binary' } fixed it for me; the code executed. But when compiling the code I get the following error: "unknown tag byte: 6f" (instead of 3f for some files also 54 and 3f). Any idea why this could be? P.S. I'm on node 10 – Mdlc Feb 05 '15 at 20:24
  • I'm not familiar with the Java `.class` format. Perhaps it stores strings preceded by a length value and you're making the strings larger or smaller but not updating the length value? So it ends up reading past or inside the string and finds an unexpected byte value? – mscdex Feb 05 '15 at 20:49
  • JAVA is ISO 8859-1 with \uXXXX escape sequences, denoting Unicode BMP characters. Consecutive pairs of \uXXXX escape sequences in the surrogate range, as in UTF-16, denote Unicode characters outside the BMP. - https://github.com/bnoordhuis/node-iconv/blob/master/deps/libiconv/lib/java.h – Mdlc Feb 05 '15 at 20:52
  • That's something else, not for `.class` files, probably for the `.java` source files. The first hit on google for `java .class format` led to a page on Oracle's site that describes exactly as I predicted, that [string constants are stored as TLV tuples](http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.html#jvms-4.4.7). So it won't be enough to just do a simple find and replace. You have to make sure to update the string's length field so that the file can be interpreted correctly again. – mscdex Feb 05 '15 at 21:15
  • Found the cause of the problem and fixed it. I'll edit my question and accept your answer. – Mdlc Feb 06 '15 at 15:44
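
To illustrate the length-field point from the comments: in the class file format, a `CONSTANT_Utf8` constant is a tag byte (1), a two-byte big-endian length, and then that many bytes of string data. The sketch below rewrites one such entry and updates its length. The helper `replaceUtf8Entry` is hypothetical, it assumes you already know the byte offset of the entry (finding it means walking the constant pool, which is not shown), and it assumes plain ASCII text, since the format's modified UTF-8 differs from standard UTF-8 for NUL and for characters outside the BMP.

```javascript
const CONSTANT_UTF8_TAG = 1;

// Hypothetical helper: return a copy of `buf` in which the CONSTANT_Utf8 entry
// starting at `offset` holds `newText`, with its length field updated to match.
function replaceUtf8Entry(buf, offset, newText) {
  if (buf[offset] !== CONSTANT_UTF8_TAG) {
    throw new Error('No CONSTANT_Utf8 entry at offset ' + offset);
  }
  // The old length tells us where the entry ends in the original buffer.
  const oldLength = buf.readUInt16BE(offset + 1);

  // Assumption: ASCII-only text, where UTF-8 and modified UTF-8 agree.
  const newBytes = Buffer.from(newText, 'utf8');
  if (newBytes.length > 0xffff) {
    throw new Error('String too long for a two-byte length field');
  }

  // Rebuild the three-byte header: tag, then the new big-endian length.
  const header = Buffer.alloc(3);
  header.writeUInt8(CONSTANT_UTF8_TAG, 0);
  header.writeUInt16BE(newBytes.length, 1);

  return Buffer.concat([
    buf.subarray(0, offset),              // everything before the entry
    header,
    newBytes,
    buf.subarray(offset + 3 + oldLength), // everything after the old entry
  ]);
}

// Usage: one entry holding "Stringtobereplaced" (18 bytes), followed by a
// stand-in byte representing whatever comes next in the file.
const entry = Buffer.concat([
  Buffer.from([CONSTANT_UTF8_TAG, 0, 18]),
  Buffer.from('Stringtobereplaced', 'utf8'),
  Buffer.from([0x07]),
]);
const patched = replaceUtf8Entry(entry, 0, 'Cool text');
```

Because the new string is shorter, the patched buffer shrinks and every later byte shifts, which is why a plain find and replace without touching the length leaves the file unreadable.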