1

let A be:

class A {
  std::vector<double> values_;
public:
  A(const std::vector<double> &values) : values_(values){};

  void bumpAt(std::size_t i, const double &df) {
    values_[i] += df;

  virtual method1();
  virtual method2();
  ...
}

class B : public A {
  overrides methods
  ...
}

for the sake of simplicity consider the function:

double foo(input1, input2, ..., const A &a, const B &b, inputK, ...) {
 /* do complex stuff */
 return ...;
}

we want to differentiate foo() with respect to its arguments. Therefore the first order sensitivity d foo/d a is a std::vector<double> with size equal to a.size(). Same reasoning goes for d foo/d b.

A naive implementation would go as follows:


// compute d foo/d a 
std::vector<double> computeDfDa(input1, input2, ..., const A &a, const B &b, inputK, ..., double da = 1.0){
  std::vector<double> dfda = {};

  auto aUp = a.copy();
  auto aDown = a.copy();
  for (auto i = 0; i < a.size(); ++i) {

      // bump up
      aUp.bumpAt(i, da);
      // bump down
      aDown.bumpAt(i, -da);

      auto up = foo(input1, input2, ..., aUp, b, inputK, ...);
      auto down = foo(input1, input2, ..., aDown, b, inputK, ...);

      auto derivative = (up - down) / 2.0 / da;
      dfda.pushback(derivative);

      // revert bumps
      aUp.bumpAt(i, -da);
      aDown.bumpAt(i, da);
  }
   return dfda;
}

// compute d foo/d b 
std::vector<double> computeDfDb(input1, input2, ..., const A &a, const B &b, inputK, ..., double db = 0.01){
  std::vector<double> dfdb = {};

  auto bUp = b.copy();
  auto bDown = b.copy();
  for (auto i = 0; i < a.size(); ++i) {

      // bump up
      bUp.bumpAt(i, db);
      // bump down
      bDown.bumpAt(i, -db);

      auto up = foo(input1, input2, ..., a, bUp, inputK, ...);
      auto down = foo(input1, input2, ..., a, bDown, inputK, ...);

      auto derivative = (up - down) / 2.0 / db;
      dfdb.pushback(derivative);

      // revert bumps
      bUp.bumpAt(i, -db);
      bDown.bumpAt(i, db);
  }
   return dfdb;
}

This works well however we have basically the same code for computeDfDa() and for computeDfDb().

Is there any design pattern that would allow to have a unique (maybe templated) function that would understand automatically which input to bump?

Please note the position of a and b in the inputs is not commutative.

If the complexity and the number of inputs of foo() are much greater the naive solution would generate a lot of useless code as we'd have to write a computeDfDx() function for every input x of foo().

mastro
  • 619
  • 1
  • 8
  • 17
  • Is `computeDfDa( a, b, 1.0)` has the same result as `computeDfDa( b, a, 1.0)`? You did not show `compute`, is argument of `compute` interchangable? – Louis Go Aug 26 '20 at 05:09
  • 2
    @LouisGo no, the real case is where `a` and `b` are complex objects and compute is not commutative – mastro Aug 26 '20 at 05:11
  • From a design perspective, your differentiating logic should not be tied to your classes. There is a function with N arguments and you want a pair of float vectors of length N which contain the local derivative of that function at some point. If you code that independent of your classes, most likely your code complexity goes down and the code duplication as well. – BitTickler Aug 26 '20 at 05:57
  • @BitTickler the computation within `foo()` is highly dependent on class types – mastro Aug 26 '20 at 06:01
  • That is the mistake I try to point out. It should not be. >>> Design problem. – BitTickler Aug 26 '20 at 06:02
  • @BitTickler I am aware there is a huge experience mismatch here so please take my question as a honest question: isn't that the point of polymorphism to change behaviour depending on type? Also, are you implying the entire underlying design of the library should be reviewed? – mastro Aug 26 '20 at 06:06
  • 1
    @mastro It is about keeping code simple. One tool for that is separation of concerns. Differencing of some function at a point is a disjoint problem which should not get interwoven with whatever else your classes are supposed to do. The classes then call the differentiation function and provide the arguments as they see fit. Also easier to test (unit tests). And to reuse. You mentioned it is safety critical. So having good tests is important. Last not least. Not everything has to be OOP. This is a great example for that. – BitTickler Aug 26 '20 at 06:10

2 Answers2

1

Since compute order is the same but iteration loop through different containers, you may refactor this function.

std::vector<double> computeLoop( std::vector<double> &v, std::vector<double> const &computeArg1, std::vector<double> const &computeArg2, double d = 1.0 )
{
  std::vector<double> dfd = {};
  for (auto i = 0; i < v.size(); ++i) {
      // bump up
      v[i] += d;
      auto up = compute(computeArg1, computeArg2);
      v[i] -= d;

      // bump down
      v[i] -= d;
      auto down = compute(computeArg1, computeArg2);
      v[i] += d;

      auto derivative = (up - down) / 2.0 / d;
      dfd.pushback(derivative);
  }
}

Actual call.

auto dfda = computeLoop( a, a, b );
auto dfdb = computeLoop( b, a, b );

Let v pass by reference but it might cause maintenance problem. Because v might be the same reference as computeArg1 or computeArg2, however in computeLoop this is not obvious. Someone might unconsciously break the code in the future.

Louis Go
  • 2,213
  • 2
  • 16
  • 29
  • the above computation will result in the derivative being always 0 - the you are bumping `v` but not using it anywhere – mastro Aug 26 '20 at 05:39
  • @mastro, I edit my answer to let `v` pass by reference, so it could be changed as well. – Louis Go Aug 26 '20 at 05:41
  • thanks, however that is not safe, the code above is used in a safety-critical application – mastro Aug 26 '20 at 05:51
0

IMHO, the problem is, that there is a shifting of the level of abstraction.

Those classes A, B are either ...

  1. just vectors in disguise
  2. not vectors but something else entirely.

In case (1), it should be possible to rewrite to a form similar to this:

#include <... the usual suspects ... >

using VecF64 = std::vector<double>;
using VecVecF64 = std::vector<VecF64>;

// foo, dropping allusions of encapsulations. It KNOWS those A, B are vectors.
// Hence we can write it without knowledge of A, B types.
double foo(const VecVecF64& args) {
  return <result-of-complicated-computations>;
}

// With that, it is also easier to write the differentiation function
VecVecF64 fooGradients( const VecVecF64& args, double delta = 0.01) {
  VecVecF64 result;
  result.reserve(args.size());
  // for each arg vector in args, do your thing.
  // .. 
  return result;
}

In case (2), if you are all encapsulated and secretive about the nature of A, B, how do you know the number of doubles, foo can use for its computation? Which leads to the question of the size of your gradient vector for A and renders that bump idea impossible to realize.

My guess is, you have a case 1 kind of problem.

BitTickler
  • 10,905
  • 5
  • 32
  • 53
  • thanks, it's case (2) - inside foo other methods from A to extract relevant data via interpolation, i.e. something like auto v = a.interpolate(input1). Foo does not need to know the size of A but it can still do with A.size() method (as done in the loop). All of this is currently working more than fine so I would say it's impossible, maybe I just haven't expressed myself clearly – mastro Aug 26 '20 at 08:03