Background

We need to port a JavaScript hashing algorithm to Perl, which means reproducing JavaScript's bitwise shift operators <<, >>, and >>> in Perl. We have the algorithms for the conversion itself, but JavaScript's shift operators work on 32-bit integers, so we also need to emulate that 32-bit behavior in Perl.

Python solution

Based on this post, https://stackoverflow.com/a/41610348, we learned that this can be done in Python using ctypes. For example, to left-shift an integer by x bits:

import ctypes
print (ctypes.c_int(integer << x ^ 0).value)
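To make the target behavior concrete, here is a minimal sketch of all three operators built on the same ctypes idea. The helper names (`to_int32`, `to_uint32`, `js_shl`, `js_sar`, `js_shr`) are made up for illustration, and `c_int32`/`c_uint32` are used instead of `c_int` on the assumption that an explicit 32-bit width is what is wanted. JavaScript only uses the low five bits of the shift count, so the sketch masks the count with `0x1F`. The `^ 0` in the snippet above has no effect in Python and is left out.

```python
import ctypes

def to_int32(n):
    """Wrap a Python int to a signed 32-bit value (hypothetical helper)."""
    return ctypes.c_int32(n).value

def to_uint32(n):
    """Wrap a Python int to an unsigned 32-bit value (hypothetical helper)."""
    return ctypes.c_uint32(n).value

def js_shl(n, count):
    """Emulates JS `n << count`: the result is a signed 32-bit integer."""
    return to_int32(to_int32(n) << (count & 0x1F))

def js_sar(n, count):
    """Emulates JS `n >> count`: arithmetic shift, the sign bit is copied in."""
    return to_int32(n) >> (count & 0x1F)

def js_shr(n, count):
    """Emulates JS `n >>> count`: logical shift, zeros are shifted in."""
    return to_uint32(n) >> (count & 0x1F)

if __name__ == "__main__":
    print(js_shl(0x08123456, 4))  # negative, because bit 31 ends up set
    print(js_sar(-8, 1))
    print(js_shr(-1, 28))
```

These functions could serve as a reference for generating test values to compare a Perl port against.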

Perl question

My understanding is that we need to use XS to do this. Does anyone have a quick way to implement it? We don't know XS. We could start learning it, but my impression is that the learning curve is steep and that gaining any mastery of it could take a while. Of course, a non-XS solution would be ideal, if one exists. Any solutions or hints would be greatly appreciated.

Workaround

Since we have a Python solution already, we could implement this module in Python and then call it from Perl. Performance isn't really an issue, so this "hack" is acceptable, although somewhat undesirable. In other words, we would prefer to maintain the whole program (which consists of several modules) in Perl only.
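For illustration, the Python side of such a workaround might look like the sketch below. Everything about its shape is an assumption: the script name `shift32.py`, the argument order (operator, value, shift count), and the `run` helper are all hypothetical. It uses plain masking rather than ctypes so that it has no dependencies beyond the standard library.

```python
import sys

MASK32 = 0xFFFFFFFF

def to_int32(n):
    """Keep the low 32 bits and reinterpret them as a signed value."""
    n &= MASK32
    return n - 0x100000000 if n & 0x80000000 else n

# One entry per JavaScript shift operator; the count is masked to five bits.
OPS = {
    "<<": lambda n, c: to_int32(n << (c & 31)),
    ">>": lambda n, c: to_int32(n) >> (c & 31),
    ">>>": lambda n, c: (n & MASK32) >> (c & 31),
}

def run(argv):
    """argv is [operator, value, count]; returns the result as a string."""
    op, value, count = argv
    # int(text, 0) accepts decimal as well as 0x-prefixed input.
    return str(OPS[op](int(value, 0), int(count, 0)))

if __name__ == "__main__" and len(sys.argv) == 4:
    print(run(sys.argv[1:]))
```

A Perl program could then capture the output of something like `python3 shift32.py '<<' 0x08123456 4` with backticks. Spawning a process per shift is slow, which is only tolerable because performance is not a concern here.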

robasia
  • Wouldn't it be enough to truncate the result to 32 bits? As in `use constant { I32 => 0xffff_ffff }; ... ($x << $n) & I32`? – melpomene Sep 17 '18 at 11:13
  • Can you show us the JavaScript hashing code? – melpomene Sep 17 '18 at 11:20
  • @melpomene Thanks for your suggestion, but that will not work. We have already tried some similar implementations. Larger ints can overflow the 32-bit boundary after being shifted, so simply truncating the result doesn't always give the right answer. Several other similar solutions have been given in Python solutions on SO, but from testing, the only solution that works every time is the one that uses ctypes.c_int – robasia Sep 17 '18 at 12:16
  • Does anyone on your team write C? Then you can use either https://metacpan.org/pod/C::Blocks or https://metacpan.org/pod/distribution/Inline-C/lib/Inline/C.pod – Diab Jerius Sep 17 '18 at 14:53
  • @DiabJerius `Inline::C` looks like it will work. It comes with a cookbook, too: [https://metacpan.org/pod/distribution/Inline-C/lib/Inline/C/Cookbook.pod](https://metacpan.org/pod/distribution/Inline-C/lib/Inline/C/Cookbook.pod) Thanks for this suggestion. I will give it a try later. – robasia Sep 17 '18 at 15:59
  • @robasia "*Larger ints can overflow the 32-bit boundary after being shifted, so simply truncating the result doesn't always give the right answer.*" That makes no sense. Please show the hash algorithm. – melpomene Sep 17 '18 at 18:13
  • @melpomene, I think they mean that 0x08123456 << 4 should give a negative number, but truncating doesn't do that. – ikegami Sep 17 '18 at 19:56
  • @DiabJerius We were able to implement it using Inline C as you suggested. Thank you. We added your solution to our summarized answer below. – robasia Sep 18 '18 at 10:03
  • @melpomene Thanks for your feedback and sorry for the incorrect explanation of the problem. Our algorithm is a variation of md5. We originally tried to use `Digest::MD5`, but it was not producing the same results. Therefore, we decided to translate it. However, the code is nearly 200 lines, so I was afraid that posting it would distract people from my actual question. Thanks again for your help. – robasia Sep 18 '18 at 10:15

1 Answer

sub lshr32 { ( $_[0] & 0xFFFFFFFF ) >> $_[1] }                           # >>> in JS
sub lshl32 { ( $_[0] << $_[1] ) & 0xFFFFFFFF }                           # logical shift left; no JS operator

sub ashr32 { ( $_[0] - ( $_[0] % ( 1 << $_[1] ) ) ) / ( 1 << $_[1] ) }   # >> in JS
sub ashl32 { unpack "l", pack "l", $_[0] * ( 1 << $_[1] ) }              # << in JS

It doesn't make sense to pass a negative number to a logical shift unless the number isn't really a number but a collection of bits. Given that you are porting a hashing algorithm, this is very likely. It also means you're creating a lot of extra work for yourself by matching JavaScript this closely because you're recreating hacks used to address limits in JavaScript that don't exist in Perl. It should be far simpler to use 32-bit unsigned values, << truncated using & 0xFFFFFFFF, and >> truncated using & 0xFFFFFFFF.
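As a rough illustration of the unsigned approach, here is a sketch in Python (the language of the working reference in the question); the same masking carries over to Perl directly. The helper names are made up, and the rotate is only an assumption about what an MD5-style algorithm needs, since the actual hashing code isn't shown. Values stay in the range 0..0xFFFFFFFF throughout, and `as_signed32` is needed only if a result must be compared against a signed number printed by JavaScript.

```python
MASK32 = 0xFFFFFFFF

def shl32(x, n):
    """Shift left, keeping only the low 32 bits."""
    return (x << n) & MASK32

def shr32(x, n):
    """Logical shift right of the low 32 bits."""
    return (x & MASK32) >> n

def rotl32(x, n):
    """Rotate left by n bits; in JS this is typically (x << n) | (x >>> (32 - n))."""
    return shl32(x, n) | shr32(x, 32 - n)

def add32(a, b):
    """Addition modulo 2**32."""
    return (a + b) & MASK32

def as_signed32(x):
    """Reinterpret an unsigned 32-bit value as signed, only for comparing with JS output."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

if __name__ == "__main__":
    print(hex(rotl32(0x08123456, 4)))
    print(as_signed32(shl32(0x08123456, 4)))  # same value JS gives for 0x08123456 << 4
```

The bit patterns are identical to what the signed JavaScript arithmetic produces; only the interpretation of bit 31 differs, and that matters only when a value is printed or compared as a number.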

ikegami
  • It doesn't make sense to pass negative numbers to a logical shift, but the behaviour of `>>>` does appear to be defined for negative numbers. You simply need to cast the input, which I've added to my lshr32. – ikegami Sep 17 '18 at 19:04
  • Your change just makes the sub more complex without changing the output – ikegami Sep 18 '18 at 16:47
  • The size of `int` must accommodate -32767..32767. It can be smaller than 32 bits. It can be larger than 32 bits. – ikegami Sep 19 '18 at 06:31
  • What cases? Positive and negative numbers? It already handled both. Again, your addition literally has no effect. – ikegami Sep 19 '18 at 10:51
  • We finally came to understand your explanation. Because you were lumping `lshr32` and `lshl32` together under the `#>>> in JS` comment, we thought that they needed to be used in combination. We originally tested on negative numbers and got a good result from `lshr32`, so we mistakenly thought that `lshl32` was a positive number implementation. Our lack of technical knowledge caused us confusion, and regrettably, caused you frustration. Sorry for making this question tedious for you. Thanks again for your precious help and time. – robasia Sep 19 '18 at 13:09
  • Re "*Because you were lumping `lshr32` and `lshl32` together under the `# >>> in JS` comment*", You mean you were mistakenly lumping the two together. As the comment indicates, JS's `>>>` operator performs a 32-bit logical shift right (`lshr32`). JS doesn't provide an operator for performing a logical shift left (`lshl32`), but I provided an implementation in case any future reader needs it – ikegami Sep 19 '18 at 13:19
  • Yes, you are right. I shouldn't have worded my comment that way. Our misunderstanding of your solution was completely due to our lack of knowledge in this field. – robasia Sep 20 '18 at 10:13