I need an associative data structure with floating point keys in which keys with nearly equal values are binned together. I'm working in C++, but the language doesn't really matter.
Basically my current strategy is to:
1. only handle single-precision floating point numbers
2. use an unordered_map with a custom key type
3. define the hash function on the key type as follows:
   a. given a float v, divide v by some tolerance, such as 0.0005, at double precision, yielding k
   b. cast k to a 64-bit integer, yielding ki
   c. return std::hash of ki
First of all, is there a standard, named data structure that does something like this? If not, is there a better way to do this than my general approach?
The main thing I do not like about the implementation below is that it is not intuitive which floating point values will be binned together. I cope with this by having a general sense of which values in my input I want to count as the same value and then testing various tolerances. It would be nicer if, after adding 12.0453 to the container with a tolerance of 0.0005, every value in 12.0453 +/- 0.0005 were considered equal, but this is not the case. I don't think such behavior is even possible on top of unordered_map, because the hash function would then depend on the values already in the table.
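To illustrate the mismatch, here is a minimal sketch that compares "same bin" with "within tolerance". The helper names bin_index, same_bin and within_tolerance are hypothetical and only for illustration; bin_index mirrors the divide-then-truncate step of the key type, and the default tolerance of 0.0005 is the example value used above.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper mirroring the key computation in the question:
// divide at double precision, then truncate to a 64-bit integer.
inline long long bin_index(float v, double eps = 0.0005) {
    assert(eps > 0.0);
    return static_cast<long long>(v / eps);
}

// True when both values land in the same grid cell (what the map actually tests).
inline bool same_bin(float a, float b, double eps = 0.0005) {
    return bin_index(a, eps) == bin_index(b, eps);
}

// True when the values differ by less than eps (the behavior that would be nicer).
inline bool within_tolerance(float a, float b, double eps = 0.0005) {
    return std::fabs(static_cast<double>(a) - static_cast<double>(b)) < eps;
}
```

With these helpers, 12.0453f and 12.0457f are within tolerance of each other (they differ by about 0.0004) but fall into different bins, whereas 12.0451f and 12.0454f share a bin.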
Basically, my implementation divides the number line into a 1D grid in which each cell is epsilon units wide, and then assigns each floating point value to the zero-based index of the cell it falls into. My question is: is there a better way to implement an associative container of floating point values with tolerance that is also O(1)? And are there problems with the implementation below?
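    #include <cstddef>
    #include <functional>
    #include <iostream>
    #include <string>
    #include <unordered_map>

    // Map keyed on floats in which keys that fall into the same epsilon-wide bin compare equal.
    // P is the number of decimal digits of precision; the bin width is 5 * 10^-P (P >= 1).
    template<typename V, int P = 4>
    class float_map
    {
    private:
        struct key {
        public:
            long long val;  // index of the bin the float falls into

            // 0.5 for one digit, then a factor of ten smaller per extra digit
            static constexpr double epsilon(int digits_of_precision)
            {
                return (digits_of_precision == 1) ? 0.5 : 0.1 * epsilon(digits_of_precision - 1);
            }
            static constexpr double eps = epsilon(P);

            // The division is done at double precision; the cast truncates toward zero.
            key(float fval) : val(static_cast<long long>(fval / eps))
            {}

            bool operator==(key k) const {
                return val == k.val;
            }
        };

        // Hash the bin index, so every float in the same bin hashes identically.
        struct key_hash
        {
            std::size_t operator()(key k) const {
                return std::hash<long long>{}(k.val);
            }
        };

        std::unordered_map<key, V, key_hash> impl_;

    public:
        V& operator[](float f) {
            return impl_[key(f)];
        }
        const V& at(float f) const {
            return impl_.at(key(f));
        }
        bool contains(float f) const {
            return impl_.find(key(f)) != impl_.end();
        }
        double epsilon() const {
            return key::eps;
        }
    };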
    int main()
    {
        float_map<std::string> test;
        test[12.0453f] = "yes";
        std::cout << "epsilon = " << test.epsilon() << std::endl; // 0.0005
        std::cout << "12.0446f => " << (test.contains(12.0446f) ? "yes" : "no") << std::endl; // no
        std::cout << "12.0447f => " << (test.contains(12.0447f) ? "yes" : "no") << std::endl; // no
        std::cout << "12.0448f => " << (test.contains(12.0448f) ? "yes" : "no") << std::endl; // no
        std::cout << "12.0449f => " << (test.contains(12.0449f) ? "yes" : "no") << std::endl; // no
        std::cout << "12.0450f => " << (test.contains(12.0450f) ? "yes" : "no") << std::endl; // yes
        std::cout << "12.0451f => " << (test.contains(12.0451f) ? "yes" : "no") << std::endl; // yes
        std::cout << "12.0452f => " << (test.contains(12.0452f) ? "yes" : "no") << std::endl; // yes
        std::cout << "12.0453f => " << (test.contains(12.0453f) ? "yes" : "no") << std::endl; // yes
        std::cout << "12.0454f => " << (test.contains(12.0454f) ? "yes" : "no") << std::endl; // yes
        std::cout << "12.0455f => " << (test.contains(12.0455f) ? "yes" : "no") << std::endl; // yes
        std::cout << "12.0456f => " << (test.contains(12.0456f) ? "yes" : "no") << std::endl; // no
        std::cout << "12.0457f => " << (test.contains(12.0457f) ? "yes" : "no") << std::endl; // no
        std::cout << "12.0458f => " << (test.contains(12.0458f) ? "yes" : "no") << std::endl; // no
        std::cout << "12.0459f => " << (test.contains(12.0459f) ? "yes" : "no") << std::endl; // no
        std::cout << "12.0460f => " << (test.contains(12.0460f) ? "yes" : "no") << std::endl; // no
    }
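To make the grid picture more concrete, the following sketch computes the approximate interval covered by the cell a value falls into, which is one way to see why the test output flips between 12.0449f and 12.0450f and again between 12.0455f and 12.0456f. The helper bin_range is hypothetical and not part of float_map. It assumes non-negative inputs, because the cast truncates toward zero and negative values would need separate handling, and the computed bounds are subject to the usual floating point rounding.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Hypothetical helper: approximate half-open interval [lo, hi) of the grid
// cell that holds v, using the same divide-then-truncate rule as the key type.
// Assumes v >= 0; static_cast truncates toward zero, so negative values are
// not handled here.
inline std::pair<double, double> bin_range(float v, double eps = 0.0005) {
    assert(v >= 0.0f && eps > 0.0);
    const long long idx = static_cast<long long>(v / eps);
    return {idx * eps, (idx + 1) * eps};
}
```

For 12.0453f with a tolerance of 0.0005 this gives roughly [12.045, 12.0455), so the stored key sits closer to the upper edge of its cell than to the lower one, and the matching values are not centered on it.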