Hi, below is a sample dataset from the quality control test results of electronic parts:

UnderTesting=array([47098, 46729, 45612, 43297, 40085, 36365, 32562, 28947, 25992,
       23615, 21475, 19964, 18952, 18138, 17393, 16659, 16117, 15656,
       15186, 14715, 14300, 13678, 12344, 11664, 11159, 10669, 10155,
        9688,  9066,  8443,  7838,  7121,  6542,  6045,  5535,  5078,
        4569,  4205,  3884,  3549,  3276,  3010,  2783,  2576,  2379,
        2165,  1940,  1796,  1518,  1377,  1237,  1123,  1044,   982,
         933,   886,   836,   777,   718,   678,   635,   603,   571,
         546,   509,   473,   448,   416,   398,   379,   362,   338,
         319,   310,   296,   286,   273,   260,   219,   199,   188,
         181,   172,   168,   164,   156,   146,   142,   139,   137,
         134,   129,   125,   122,   120,   108,   100,    97,    94,
          91,    88,    85,    84,    82,    77,    75,    71,    67,
          66,    65,    63,    63,    63,    62,    58,    57,    54,
          53,    52,    50])

DailyFailure = array([11855, 11704, 11257, 10484,  9493,  8428,  7374,  6351,  5423,
        4727,  4094,  3619,  3238,  2915,  2627,  2349,  2145,  2009,
        1864,  1737,  1621,  1492,  1363,  1279,  1209,  1138,  1065,
         997,   922,   864,   821,   778,   734,   703,   654,   606,
         561,   529,   501,   465,   436,   394,   361,   340,   323,
         302,   290,   275,   267,   249,   233,   220,   212,   203,
         199,   186,   181,   173,   167,   164,   162,   158,   152,
         148,   137,   130,   127,   121,   116,   111,   109,   105,
          99,    98,    95,    89,    86,    82,    81,    77,    72,
          70,    67,    67,    66,    64,    60,    60,    60,    59,
          59,    56,    55,    54,    54,    46,    43,    42,    41,
          40,    40,    40,    40,    39,    36,    36,    35,    32,
          32,    32,    32,    32,    32,    32,    30,    30,    29,
          29,    29,    28])

Both arrays are ordered by day, from Day 0 to Day 119 after the continuous test started. For each day, UnderTesting gives the number of parts under the stress test, and DailyFailure gives the number of those parts that failed.

We are now trying to build a prediction of the failure percentage from this sample dataset, computed as (DailyFailure/UnderTesting)*100.
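As a minimal sketch of the percentage calculation, the snippet below applies (DailyFailure/UnderTesting)*100 to the first three and last three days of the arrays. The names `under_testing`, `daily_failure`, `days`, and `failure_percentage` are illustrative and not part of the original data; plain Python lists are used here only to keep the example dependency-free.

```python
# First and last three days copied from the two arrays (Day 0-2 and Day 117-119).
days = [0, 1, 2, 117, 118, 119]
under_testing = [47098, 46729, 45612, 53, 52, 50]
daily_failure = [11855, 11704, 11257, 29, 29, 28]


def failure_percentage(failed, tested):
    """Return the share of tested parts that failed, as a percentage."""
    return failed / tested * 100


# One percentage per day, in the same order as `days`.
pct = [failure_percentage(f, n) for f, n in zip(daily_failure, under_testing)]

for day, n, p in zip(days, under_testing, pct):
    print(f"Day {day:3d}: {p:6.2f}% of {n} parts under test")
```

With the full NumPy arrays, the same calculation is the elementwise expression `(DailyFailure / UnderTesting) * 100`. Printing the sample size next to each percentage shows that the late-day values rest on about 50 parts, while the early-day values rest on tens of thousands.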

Question 1: How can we tell whether a day breakout is required when the data changes behavior, but the sample percentages are not significant enough to rely on that breakout?

Question 2: How can we predict the failure percentage from this imbalanced distribution while avoiding bias?

SA_H