
I am trying to implement the incremental stochastic gradient descent (ISGD) algorithm for logistic regression. I have already coded the logistic regression loss function and its gradient, and I have a rough idea of how to proceed with the rest of the workflow, but I don't understand how to apply the sequential, sample-by-sample operation that incremental SGD requires. How can I implement this sequential operation for logistic regression in Python?

Objective: logistic regression's loss function and gradient

(Image of the formulas. From the code below, the intended loss is L(w) = Σᵢ log(1 + exp(−yᵢ xᵢᵀ w)) and its gradient is ∇L(w) = −Σᵢ yᵢ xᵢ / (1 + exp(yᵢ xᵢᵀ w)).)

My initial implementation

    import numpy as np
    from scipy import special as ss

    # logistic regression loss: sum_i log(1 + exp(-y_i * x_i^T w))

    def lossFunc(X, y, w):
        w = w.reshape((w.shape[0], 1))
        y = y.reshape((y.shape[0], 1))

        # log1p(1 + expm1(z)) == log(1 + exp(z)) with z = -y * Xw
        perSample = ss.log1p(1 + np.nan_to_num(ss.expm1(-y * np.dot(X, w))))
        return float(np.sum(perSample))

    # gradient of the loss: sum_i -y_i * x_i * exp(-y_i * x_i^T w) / (1 + exp(-y_i * x_i^T w))

    def gradFnc(X, y, w):
        w = w.reshape((w.shape[0], 1))
        y = y.reshape((y.shape[0], 1))

        expTerm = 1 + np.nan_to_num(ss.expm1(-y * np.dot(X, w)))   # exp(-y * Xw), shape (n, 1)
        perSample = -(y * X) * (expTerm / (1 + expTerm))           # per-sample gradients, shape (n, d)
        return perSample.sum(axis=0)                               # full-batch gradient, shape (d,)

    class LogRegGD(object):
        def __init__(self, learnRate=0.0001, num_iter=100, verbose=False):
            self.w = None
            self.learnRate = learnRate
            self.verbose = verbose
            self.num_iter = num_iter

        def fitt(self, X, y):
            n, d = X.shape
            self.w = np.zeros(shape=(d,))

            for i in range(self.num_iter):
                print("\nIteration:", i)

                grd = gradFnc(X, y, self.w)                 # full-batch gradient, shape (d,)
                self.w = self.w - self.learnRate * grd
                print("Loss:", lossFunc(X, y, self.w))

            return self
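
For completeness, this is roughly how I call it on some made-up toy data (the data below exists only to exercise the code; labels are in {-1, +1} to match the loss above):

    # toy data, invented only to exercise the code above
    rng = np.random.RandomState(0)
    X = rng.randn(100, 3)
    y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, 1.0, -1.0)   # labels in {-1, +1}

    model = LogRegGD(learnRate=0.01, num_iter=10).fitt(X, y)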

Now I have no idea how to apply the sequential, sample-by-sample operation of incremental SGD to this logistic regression code. Is there an efficient way to implement a sequential incremental SGD algorithm for logistic regression? Any better idea? Thanks.
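
To make my question concrete, this is the kind of per-sample (incremental) update I imagine, written as another method on the class above. The method name `fitSGD`, the shuffling, and the epoch loop are only my guesses, not something I know to be correct:

    def fitSGD(self, X, y):
        # sketch only: one gradient step per sample, visiting samples in a random order each epoch
        n, d = X.shape
        self.w = np.zeros(shape=(d,))

        for epoch in range(self.num_iter):
            for i in np.random.permutation(n):
                xi = X[i:i + 1, :]              # keep the 2-D shape (1, d)
                yi = y[i:i + 1]
                grd = gradFnc(xi, yi, self.w)   # gradient from a single sample
                self.w = self.w - self.learnRate * grd

            if self.verbose:
                print("Epoch:", epoch, "Loss:", lossFunc(X, y, self.w))

        return self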

The particular interpretation of incremental SGD that I have in mind is described here: the Hogwild! algorithm for logistic regression.
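
My rough understanding of Hogwild! is that several workers each run the sequential per-sample loop on a shared weight vector without any locking. Just to illustrate that idea (this is a toy sketch, not a real speed-up in pure Python because of the GIL, and the function names, worker count and update schedule are all my own assumptions):

    import threading

    def hogwildWorker(X, y, w, learnRate, num_updates):
        # each worker repeatedly picks a random sample and updates the shared w in place, without locks
        n = X.shape[0]
        for _ in range(num_updates):
            i = np.random.randint(n)
            grd = gradFnc(X[i:i + 1, :], y[i:i + 1], w)
            w -= learnRate * grd                # lock-free in-place update of the shared vector

    def hogwildFit(X, y, learnRate=0.0001, num_updates=1000, num_workers=4):
        w = np.zeros(X.shape[1])
        threads = [threading.Thread(target=hogwildWorker,
                                    args=(X, y, w, learnRate, num_updates))
                   for _ in range(num_workers)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return w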

What would be an efficient programming pipeline to accomplish the task stated above? Any idea?

  • (1) What is *incremental stochastic gradient descent* and what exactly are you asking? (2) *fitt* looks like GD at the moment. (3) You are not using your learning-rate. – sascha Jan 25 '18 at 09:29
  • (1) I think you are on the wrong track. Imho it's either SGD or Incremental GD. Both are the same (as [he says](https://www.quora.com/What-is-the-difference-between-incremental-gradient-and-stochastic-gradient-descent)). If it's not, it's still unclear what ISGD should be, as you still did not define it. (2) Hogwild, while simple in its description, is very hard to implement, especially in Python! That's not what you want to do, unless you are good with C/C++/wrapping of those in Python + OpenMP. (3) If you are struggling with learning rates or GD -> SGD, take a step back and reread introductions – sascha Jan 25 '18 at 10:00
  • Furthermore: the only thing Hogwild does (compared to plain SGD) is use more cores to improve speed. Your first task should be more reduced / simple. – sascha Jan 25 '18 at 10:02
  • It's too broad. It's not in the scope of StackOverflow to take your incomplete GD code and make it SGD. There are tons of resources regarding SGD and log-reg (even here). There are tons of ML courses. You have to be patient while processing those. And don't start trying to implement stuff like Hogwild without understanding the consequences. There is no *good* Python-based Hogwild implementation I know of (even after discussion within sklearn). There is the official code by the authors (C or C++?). Theory and some numpy will be enough to implement SGD; but for Hogwild, some low-level programming is needed! – sascha Jan 25 '18 at 10:13
  • @sascha re: the only thing that Hogwild does: this is not true. Hogwild "works" by making slower threads overwrite gradient updates made earlier by faster threads. The paper cautions that this is why only small gradient updates work best. – erip Jan 29 '18 at 10:58
  • @erip Hi, thanks for looking at my post. I have difficulty implementing the incremental `stochastic gradient descent` `(SGD)` that can be further used in a `HogWild!` implementation. I really don't know how to program `sequential and incremental SGD` in Python. Is there any feasible workaround to make this happen in Python? Any more thoughts? – Andy.Jian Jan 29 '18 at 11:02
  • @Andy.Jian You might like [this](https://srome.github.io/Async-SGD-in-Python-Implementing-Hogwild!/) blog post. – erip Jan 29 '18 at 11:03
  • @erip Thanks for pointing me to this. Regarding my implementation above, what would be a possible pipeline for the rest of the workflow to implement HogWild!? Also, I am still unsure what `sequential incremental SGD` means from an implementation point of view. Any idea, or possibly some scratch Python code that works through this question? Thank you very much. – Andy.Jian Jan 29 '18 at 11:12
  • @erip I'm aware of that. I just outlined that Hogwild! is just a parallelization approach for vanilla SGD under some sparsity assumptions (so without using many cores there is no merit; different from other popular things like variance reduction and co.). As OP still struggles to incorporate the learning rate into GD, or to implement SGD, I wondered why he would go for Hogwild! as a first step, which does not make much sense. – sascha Jan 29 '18 at 13:05
