
I have a pandas dataframe like this:

    tail_n    | flight_route | Percentage_delay
    ----------|--------------|-----------------
    'N14125'  | '(VB, MI)'   | 0.1
              | '(CC, SK)'   | 0.5
              | '(KF, KC)'   | 0.3
    'N351JB'  | '(AZ, AL)'   | 0.2
              | '(AU, NY)'   | 1
    'N938DN'  | '(ALB, TPA)' | 0.1
              | '(ORD, JAC)' | 0.1

I also have a list of tail numbers (aircraft IDs) like this:

    tail_n = ['N14125', 'N351JB', 'N938DN', 'N592AS', 'N614MQ', 'N8654B', 'N997DL', 'N852AA', 'N794SW', 'N37274', 'N899AT', 'N8315C', 'N479CA', 'N961DN', 'N3LPAA', 'N27205', 'N317US', 'N8653A', 'N454UA', 'N5CKAA', 'N904DA', 'N854UA', 'N73270', 'N33264', 'N3LEAA', 'N931DN', 'N6704Z', 'N944UW', 'N929JB', 'N626AW', 'N73276', 'N16976', 'N108UW', 'N905SW', 'N610WN', 'N437SW', 'N440LV']

My objective is to minimize the total delay:

Minimize sum(Percentage_delay * a), where a is a binary decision variable (0 or 1).

The constraint is that the number of selected tails must be less than 3752 and more than 3000.
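
Written out as a rough sketch, assuming one binary variable $a_n$ per tail number $n$ and writing $p_n$ for the delay value attached to that tail (how $p_n$ is aggregated from the per-route rows is an assumption, not something fixed by the text above):

$$
\min_{a} \; \sum_{n} p_n \, a_n
\quad \text{subject to} \quad
3000 < \sum_{n} a_n < 3752,
\qquad a_n \in \{0, 1\} \ \text{for all } n .
$$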

I was planning to use CPLEX with Python.

I understand it is a really difficult problem, but if someone would be so kind as to help me, I would be really grateful.

coelidonum
  • If you let `a` be the constant zero, your objective is zero. Given that all numbers are positive, this is a minimum. Am I missing something? And what is the range of the sum? And how do the tails appear in the sum? – fuglede Dec 11 '19 at 10:35
  • Hi, a is the decision variable. It decides whether an airplane will fly. Every airplane is identified by a 'tail_n'. A minimum of 3000 airplanes is required, though, so you cannot set the decision variable to zero everywhere. I hadn't thought about the range of the sum; I guess it's just from 0 to +infinity. Thank you very much for your time. – coelidonum Dec 11 '19 at 10:53
  • What is the relation between a and the tails? Is there an a per tail? If so, there seem to be multiple values of Percentage_delay per tail; which ones do you pick? One of them, their sum, or something else? – fuglede Dec 11 '19 at 11:03
  • Yes, there is an a per tail. Thank you for making me think about it; I wasn't specific. I'd pick the mean of them. – coelidonum Dec 11 '19 at 11:06
  • So for each tail_n $n$, you have a $p_n$ which is given by grouping according to $n$ and taking the mean of Percentage_delay in your data. Then, you are trying to minimize $\sum_n a_n p_n$ where $a_n$ is binary and such that between 3000 and 3752 of the $a_n$ are positive. Is that right? – fuglede Dec 11 '19 at 11:08
  • Yes, that's correct – coelidonum Dec 11 '19 at 11:10
  • In that case, the greedy algorithm provides the correct result, and there is no need to use a general-purpose optimization solver of any kind. Since all $p_n$ are presumably non-negative, simply find the 3000 lowest $p_n$s and let $a_n$ be 1 for these. Any solution with more than 3000 positive $a_n$s can be improved (or at least not made worse) by removing any given $n$, so an optimal solution needs no more than 3000 of them. Among selections of that size, picking the smallest $p_n$s clearly minimizes their sum. – fuglede Dec 11 '19 at 11:12
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/204045/discussion-between-coelidonum-and-fuglede). – coelidonum Dec 11 '19 at 11:14
  • What is the meaning of "flight route"? I have a feeling that this column has to be used in some way in your problem description. Maybe you also have a list of points that must be connected by a flight? – Daniel Junglas Feb 28 '20 at 13:39
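
The greedy approach suggested in the comments can be sketched in pandas roughly as follows. This is only an illustration under stated assumptions: `tail_n` is taken to be a regular column filled in on every row, `select_tails` and `n_min` are hypothetical names, the toy data simply mirrors the table in the question, and all delays are assumed non-negative. In the real problem `n_min` would be the lower bound on the number of tails.

```python
import pandas as pd

# Toy data mirroring the table in the question; tail_n is assumed to be
# present on every row (a plain column, not a blank-padded index).
df = pd.DataFrame({
    "tail_n": ["N14125", "N14125", "N14125",
               "N351JB", "N351JB",
               "N938DN", "N938DN"],
    "flight_route": ["(VB, MI)", "(CC, SK)", "(KF, KC)",
                     "(AZ, AL)", "(AU, NY)",
                     "(ALB, TPA)", "(ORD, JAC)"],
    "Percentage_delay": [0.1, 0.5, 0.3, 0.2, 1.0, 0.1, 0.1],
})


def select_tails(df, n_min):
    """Hypothetical helper: choose the n_min tails with the lowest mean delay."""
    # p_n: mean Percentage_delay per tail number
    p = df.groupby("tail_n")["Percentage_delay"].mean()
    if n_min > len(p):
        raise ValueError("fewer distinct tails than the required minimum")
    # Greedy choice: with non-negative p_n, the n_min smallest means
    # give the smallest possible sum.
    chosen = p.nsmallest(n_min)
    # a_n = 1 for the chosen tails, 0 for all others
    a = pd.Series(p.index.isin(chosen.index).astype(int), index=p.index, name="a")
    return a, chosen.sum()


a, total = select_tails(df, n_min=2)
print(a)
print(total)
```

A solver such as CPLEX should reach the same selection for this formulation; the sort-based version just avoids building a model when the only constraint is the count of selected tails.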
