Questions tagged [string-hashing]

90 questions
3
votes
2 answers

Can I get (portable) access to the C++ standard library's hash implementation?

The GNU C++ standard library has: struct _Hash_impl { static size_t hash(const void* __ptr, size_t __clength, size_t __seed = static_cast<size_t>(0xc70f6907UL)) { return _Hash_bytes(__ptr, __clength, __seed); } /* etc. */ …
einpoklum
3
votes
3 answers

Understanding Skiena's description of "Hashing and Strings"

In Skiena's "Algorithm Design Manual", the following paragraph appears on page 80 under heading 3.7, "Hashing and Strings": Let α be the size of the alphabet on which a given string S is written. Let char(c) be a function that maps each…
Talespin_Kit
3
votes
5 answers

Javascript hashing algorithm

I'm trying to learn how to do some basic hashing in Javascript and I've come across the following algorithm: var hash = 0; for (i = 0; i < this.length; i++) { char = str.charCodeAt(i); hash = ((hash<<5)-hash)+char; hash = hash &…
Tenescu Andrei
3
votes
1 answer

Hash distribution, why is 0 always heavily weighted?

I wrote a quick canvas visualization to see the distribution of a hashing algorithm that I ported from C++ to JavaScript. I'm seeing odd behavior in that no matter what I mod the hash by, 0 is heavily biased, in that it is selected exactly twice as…
user578895
2
votes
1 answer

How do I hash strings in C++?

I'm currently learning about hash tables. Hashing integers is easy, but my assignment is to hash strings. I have been given these strings: 25674316-6058714 56105665-7450612 96917015-1417157 48189873-3313151 …
jvsper
2
votes
1 answer

How exactly does feature hashing work?

I have read many online articles on feature hashing of categorical variables for machine learning. Unfortunately, I still couldn't grasp the concept and understand how it works. I will illustrate my confusion through the sample dataset and hashing…
Stanleyrr
2
votes
1 answer

MD5 Hash in AIX OS

Is it possible to calculate the MD5 checksum of a string with a native command on AIX? On Linux systems you can use the md5sum command, but it appears to be missing on AIX.
Duncan_McCloud
2
votes
0 answers

Generate "reliable" random numbers in Javascript. (No collisions)

Would like to generate some random strings to be used as database keys. Essentially these will be UUIDs, however they do not need to conform to any particular spec aside from the necessary 122-128 bits of randomness. Most of the code will run in a…
Chris Dutrow
1
vote
0 answers

What is a good way in .NET to compute a short hash string of a long string with relatively few collisions?

I need to compute a hash of an identifier string that looks something like 00000E11002F68FF21B459BFA33A1BFCB50E0070011167CCBF9AD994E8AAE2BFEBEEE17EC00000000010C000011167CCBF9AD994E8AAE2BFEBEEE17EC0000F083227C000000000000E11002F68FF21B459BFA33A1. I…
Shane
1
vote
1 answer

64bit (Long) hash of a string in Scala

I need a uniform string hash that produces longs, for use in a bloom filter. Where can I find an algorithm or a library for this? Thanks.
Zachary Oldham
1
vote
1 answer

Can I use PASSWORD_HASH for login names?

I want to have a pretty secure login on my website. I decided to use 3 inputs for my form. Login name and password to log into your account and a Username that will be shown on the website. So if someone gets access to the table, he will need to…
1
vote
2 answers

Partial uuids a good idea?

I need to generate and store an identifier per row in a distributed database (high write throughput). There are constraints on the length of the Id, and it should be as small as possible. The Id must be in UTF-8. I was considering generating a uuidv4,…
1
vote
0 answers

Need to check how much memory my hash table has taken after inserting 1 million key values

I am using the khash.h library for hashing. I want to check how much memory it has consumed after inserting my 1 million keys. Here is the code: https://github.com/attractivechaos/klib/blob/master/khash.h What I want: I am entering n unique entry in…
hunt009
1
vote
1 answer

Hashing methodology for collection of strings and integer ranges

I have data such as the following: I need to match the content against the input provided for the Content & Range fields and return the matching rows. As you can see, the Content field is a collection of strings & the Range field is a range…
Sanal
1
vote
1 answer

When caching files based on hash characters, why use the last characters rather than the first?

I have seen it suggested that you get better "load-balancing" within a cache by using the last characters of a hashed filename - it's what nginx does for example (proxy cache module). Can anyone explain why the last characters are used? EDIT: For…
jerd