7

Our app daily receives around 1k crashes based on bug mentioned on OneSignal's github issues.


Bug explanation:

Unfortunately, I can't reproduce this issue. All crashes come from Crashlytics reports. SDK version 3.12.4

Devices:

1) Samsung: Galaxy A5(2017), Galaxy S8, Galaxy A50, Galaxy S10+, Galaxy S10  
2) Xiaomi: Mi A2, Mi A2 lite, Mi A1, Mi A3, Redmi Note 5 Pro 
3) Oneplus: ONEPLUS A6010, OnePlus5T, GM191011, GM19008, OnePlus58

Stacktrace:

Caused by java.lang.IllegalThreadStateException
       at java.lang.Thread.start(Thread.java:724)
       at com.onesignal.OneSignalPrefs$WritePrefHandlerThread.startDelayedWrite(OneSignalPrefs.java:117)
       at com.onesignal.OneSignalPrefs.startDelayedWrite(OneSignalPrefs.java:183)
       at com.onesignal.OneSignal.setAppContext(OneSignal.java:601)
       at com.onesignal.OneSignalSyncServiceUtils.doBackgroundSync(OneSignalSyncServiceUtils.java:175)
       at com.onesignal.SyncJobService.onStartJob(SyncJobService.java:40)
       at android.app.job.JobService$1.onStartJob(JobService.java:62)
       at android.app.job.JobServiceEngine$JobHandler.handleMessage(JobServiceEngine.java:108)
       at android.os.Handler.dispatchMessage(Handler.java:106)
       at android.os.Looper.loop(Looper.java:280)
       at android.app.ActivityThread.main(ActivityThread.java:6748)
       at java.lang.reflect.Method.invoke(Method.java)
       at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:493)
       at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:858)

Question:

The main problem is that they tagged it as medium priority and bug exists around 3 months. Our vitals are going for a toss, just because of this issue. It's costing us a lot.

Does there exist any workaround that can temporary solve the problem?


P.S:

I'm ready to provide more related info, if required. Thanks in advance!

えるまる
  • 2,409
  • 3
  • 24
  • 44

2 Answers2

1

It is possible(if not likely) that you have an old version of OneSignal where synchronized void startDelayedWrite() is not synchronized. In this case I would update, and see if that fixes things. If not see the following:

The exception in question occurs here:

synchronized void startDelayedWrite() {
            if (mHandler == null) {
                start();
                mHandler = new Handler(getLooper());
            }
//rest of function irrelevant.

The Javadoc for Thread.start,states the following:

It is never legal to start a thread more than once. In particular, a thread may not be restarted once it has completed execution.

From this and the implementation of Thread.start, we can infer that the thread in question is being started twice.

For this to be started twice in startDelayedWrite, new Handler(getLooper()) needs to throw an exception, otherwise we won't enter this if statement again. I don't see any way for getLooper() to throw an exception, but new Handler, certainly can if getLooper() is null.

The implementation of getLooper() is as follows:

public Looper getLooper() {
        if (!isAlive()) {
            return null;
        }
//rest omitted b/c irrelevant

The only way the thread in question could have exited is if Thread.run exited early.

The thread's run implementation looks like this:

@Override
    public void run() {
        mTid = Process.myTid();
        Looper.prepare();
        synchronized (this) {
            mLooper = Looper.myLooper();
            notifyAll();
        }
        Process.setThreadPriority(mPriority);
        onLooperPrepared();
        Looper.loop();
        mTid = -1;
    }

Looper.loop, is intended to be a semi-infinite loop which reads from a message queue.

Note the following in Looper.loop:

for (;;) {
            Message msg = queue.next(); // might block
            if (msg == null) {
                // No message indicates that the message queue is quitting.
                return;
            }

So if msg == null, then our thread exits quickly, possibly causing an exception in mHandler = new Handler(getLooper());, resulting in Thread.start being called twice.

There are other plausible explanations. For example it is possible that Looper.loop crashes at some earlier point.

To mitigate this I would add a try{} finally{} block around mHandler = new Handler(getLooper());, to handle the case where the Looper has already exited. Alternatively I may be wrong and this race condition may be caused by something completely different.

PiRocks
  • 1,708
  • 2
  • 18
  • 29
  • I was going to post the same answer yesterday, unfortunately, there is no workaround solution for this. Some part of this answer is accurate as `start()` but I don't think that it's a looper issue as the condition to start the thread again is applied on handler as `if (mHandler == null) {` so if the hander is null then it will start it again and cause the exception and many has mentioned that they are using `3.12.6` version with `synchronized` fix. PS: even if the message queue quites, handler instance won't become `null`, right? – Pavneet_Singh Mar 06 '20 at 10:57
  • >I was going to post the same answer yesterday, unfortunately, there is no workaround solution for this. Yeah, if they can't edit the library source I don't think there are any good workarounds, though something might be possible with reflection. – PiRocks Mar 06 '20 at 11:05
  • The point is that `mHandler` remains null after `start();mHandler = new Handler(getLooper());`, because the constructor for `Handler` throws an exception. – PiRocks Mar 06 '20 at 11:06
  • >PS: even if the message queue quites, handler instance won't become null, right? If the message queue quits before the `Handler` instance is created, then it remains null. – PiRocks Mar 06 '20 at 11:07
  • they did release a version with `synchronized` fix(mentioned earlier) but it's still being called twice. Secondly, I don't any try catch to handle exception so in case of exception, it should crash after not on `start()` method. For solution, question is what is the root cause? the usage of old version has been ruled out(based on github comments) so left with the static thread object and class's instance i/e mhandler initialization issue. `Handler instance is created, then it remains null` how it will reset handler to `null`? – Pavneet_Singh Mar 06 '20 at 11:13
  • Right. The `mHandler` is likely not be initialized properly, and the only way that can happen is if `new Handler(getLooper())` throws an exception. The only way that can happen is if `getLooper()` returns null. The most likely reason for that happening is the thread in question no longer being alive, which in turn implies the thread threw or exited early, which is most likely caused by `msg` being null. – PiRocks Mar 06 '20 at 11:18
  • > Handler instance is created, then it remains null how it will reset handler to null? The handler starts out being null. It is initialized like so: `mHandler = new Handler(getLooper());`, but if `new Handler` throws an exception it remains uninitialized – PiRocks Mar 06 '20 at 11:22
  • `new Handler(getLooper())` return `null`, [Read this and previous comment](https://android.googlesource.com/platform/frameworks/base/+/master/core/java/android/os/Handler.java#225), where is `RuntimeException`? – Pavneet_Singh Mar 06 '20 at 11:23
  • The exception would be on this line: https://android.googlesource.com/platform/frameworks/base/+/master/core/java/android/os/Handler.java#257 And would be an NPE, because `getLooper` returns null – PiRocks Mar 06 '20 at 11:25
  • Yeah, different constructor, that's the point, it should have been a NPE not illegal state. – Pavneet_Singh Mar 06 '20 at 11:27
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/209152/discussion-between-pirocks-and-pavneet-singh). – PiRocks Mar 06 '20 at 11:27
  • @PiRocks @Pavneet_Singh What do you think about sending pull request to `OneSignal` repo? – えるまる Mar 06 '20 at 11:28
0

Bug is fixed


The IllegalThreadStateException will no longer be thrown in the 3.13.0 Release. You will now see the root exception being thrown instead so these crashes can be diagnosed further.

えるまる
  • 2,409
  • 3
  • 24
  • 44