
I am trying to predict a job's monthly cost from its estimated total cost and its duration. My raw data had the start date and end date of each job, the job's total cost, and every cost pertaining to the job along with the date that cost was incurred. I figured the raw dates do not make much sense, so I computed the number of days in 5% of each job's duration and then found the costs incurred in each such increment of time. When I try scatter plots, I get something like the picture shows. My question is: How do I get the escape data points stacking up in lines? I get the same problem when I plot the total cost vs. the monthly cost, since the total cost is the same for all the payments made during a given job.

[Image: scatter plot]

structure(list(`5% Days` = c(58.5500000000029, 58.5500000000029, 
58.5500000000029, 58.5500000000029, 58.5500000000029, 58.5500000000029, 
58.5500000000029, 58.5500000000029, 58.5500000000029, 58.5500000000029, 
58.5500000000029, 58.5500000000029, 58.5500000000029, 58.5500000000029, 
58.5500000000029, 58.5500000000029, 58.5500000000029, 58.5500000000029, 
58.5500000000029, 58.5500000000029, 32.1999999999971, 32.1999999999971, 
32.1999999999971, 32.1999999999971, 32.1999999999971, 32.1999999999971, 
32.1999999999971, 32.1999999999971, 32.1999999999971, 32.1999999999971, 
32.1999999999971, 32.1999999999971, 32.1999999999971, 32.1999999999971, 
32.1999999999971, 32.1999999999971, 32.1999999999971, 32.1999999999971, 
32.1999999999971, 32.1999999999971, 45.4000000000015, 45.4000000000015, 
45.4000000000015, 45.4000000000015, 45.4000000000015, 45.4000000000015, 
45.4000000000015, 45.4000000000015, 45.4000000000015, 45.4000000000015, 
45.4000000000015, 45.4000000000015, 45.4000000000015, 45.4000000000015, 
45.4000000000015, 45.4000000000015, 45.4000000000015, 45.4000000000015, 
45.4000000000015, 45.4000000000015, 51.5500000000029, 51.5500000000029, 
51.5500000000029, 51.5500000000029, 51.5500000000029, 51.5500000000029, 
51.5500000000029, 51.5500000000029, 51.5500000000029, 51.5500000000029, 
51.5500000000029, 51.5500000000029, 51.5500000000029, 51.5500000000029, 
51.5500000000029, 51.5500000000029, 51.5500000000029, 51.5500000000029, 
51.5500000000029, 51.5500000000029, 29.5999999999985, 29.5999999999985, 
29.5999999999985, 29.5999999999985, 29.5999999999985, 29.5999999999985, 
29.5999999999985, 29.5999999999985, 29.5999999999985, 29.5999999999985, 
29.5999999999985, 29.5999999999985, 29.5999999999985, 29.5999999999985, 
29.5999999999985, 29.5999999999985, 29.5999999999985, 29.5999999999985, 
29.5999999999985, 29.5999999999985, 30.6999999999971, 30.6999999999971, 
30.6999999999971, 30.6999999999971, 30.6999999999971, 30.6999999999971, 
30.6999999999971, 30.6999999999971, 30.6999999999971, 30.6999999999971, 
30.6999999999971, 30.6999999999971, 30.6999999999971, 30.6999999999971, 
30.6999999999971, 30.6999999999971, 30.6999999999971, 30.6999999999971, 
30.6999999999971, 30.6999999999971, 42.9499999999971, 42.9499999999971, 
42.9499999999971, 42.9499999999971, 42.9499999999971, 42.9499999999971, 
42.9499999999971, 42.9499999999971, 42.9499999999971, 42.9499999999971, 
42.9499999999971, 42.9499999999971, 42.9499999999971, 42.9499999999971, 
42.9499999999971, 42.9499999999971, 42.9499999999971, 42.9499999999971, 
42.9499999999971, 42.9499999999971), Intervals = c(1, 2, 3, 4, 
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 1, 
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 
20, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 
18, 19, 20, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 
16, 17, 18, 19, 20, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 
14, 15, 16, 17, 18, 19, 20, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 
12, 13, 14, 15, 16, 17, 18, 19, 20, 1, 2, 3, 4, 5, 6, 7, 8, 9, 
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20), Cost = c(37393.26, 
94402.72, 254472.52, 68352.09, 57305.47, 145057.39, 109348.68, 
117798.31, 562743.58, 1254942.13, 1257165.4, 1608822.4716, 1464465.2218, 
1151072.14, 1260578.39, 727788.660000001, 328083.95, 20033.62, 
-37691.85, -4519.71, 9356, 13592, 637, 48872.73, 13436.04, 5690.07, 
3359.42, 5017.5, 20604.02, 11311.08, 58289.9, 66368.69, 92337.93, 
7531.47, 721.87, 35828.79626, 5409.97339, -1547.71, 89.72, -3124.27, 
13654.84, 18547.55, 37470.14, 53262.8, 119556.16, 19213.39, 42285.05, 
118176.77, 155294.74, 84179.25, 103324.74, 41346.01626, 6567.06064, 
1846.78759, 668.51, -4937.56, 0, 757.36, 0, 3187.4, 522530.62, 
456349.31, 453759.73, 1261379.87, 1750672.05, 662884.16, 522026.4, 
832515.01, 465700.53, 513119.36, 112372.6, 42677.03, 19558.67, 
-11399.45, -17538.46, -686.83, -103.4, 238.14, 178.49, 146.11, 
360.8, 3777.79474, 3592.31615, 5621.01113, 8845.72825, 23488.33373, 
13649.75, 26835.24, 6962.24, 12252.71, 8114.44, 13961.85, 22113, 
8078.51, 27797.78, 28399.15, 36292.99, 9173.92, 4772.47, 3459.84, 
874.51, 7357.22, 4524.49, 1569.4, 4549.69, 746.22, 1270.88, 15734.31, 
1768, 10088.35, 16825.78, 15214.86, 19643.4, 43737.74, 45669.93, 
17960.44, 363.89, 5251.72, -131123.53, 624, 141061.78, 776803.76, 
14324.23, 15211.05, 30669.85, 125067.3, 363648.07, 192211.84049, 
617111.48037, 404960.99069, 561975.96322, 440356.85, 348916.26, 
185208.47, 137126.1, 46848.08, 17561.12, -15884.29, 9698.93, 
11595.22), `Total Cost` = c(10477614.4434, 10477614.4434, 10477614.4434, 
10477614.4434, 10477614.4434, 10477614.4434, 10477614.4434, 10477614.4434, 
10477614.4434, 10477614.4434, 10477614.4434, 10477614.4434, 10477614.4434, 
10477614.4434, 10477614.4434, 10477614.4434, 10477614.4434, 10477614.4434, 
10477614.4434, 10477614.4434, 392916.679650001, 392916.679650001, 
392916.679650001, 392916.679650001, 392916.679650001, 392916.679650001, 
392916.679650001, 392916.679650001, 392916.679650001, 392916.679650001, 
392916.679650001, 392916.679650001, 392916.679650001, 392916.679650001, 
392916.679650001, 392916.679650001, 392916.679650001, 392916.679650001, 
392916.679650001, 392916.679650001, 814401.00449, 814401.00449, 
814401.00449, 814401.00449, 814401.00449, 814401.00449, 814401.00449, 
814401.00449, 814401.00449, 814401.00449, 814401.00449, 814401.00449, 
814401.00449, 814401.00449, 814401.00449, 814401.00449, 814401.00449, 
814401.00449, 814401.00449, 814401.00449, 7586379.94, 267549.874, 
86735.7000000001, 4426382.97477, 305531.76, 1623521.98576, 2023575.50399, 
878403.537809998, 272291.81984, 57808.97944, 502983.580000001, 
10632667.0823001, 884170.511820001, 70206.80899, 4491048.47898997, 
284114.110000001, 44222.37, 1948932.00513, 299710.95, 722706.59595, 
267549.874, 86735.7000000001, 4426382.97477, 305531.76, 1623521.98576, 
2023575.50399, 878403.537809998, 272291.81984, 57808.97944, 502983.580000001, 
10632667.0823001, 884170.511820001, 70206.80899, 4491048.47898997, 
284114.110000001, 44222.37, 1948932.00513, 299710.95, 722706.59595, 
3257899.22349, 86735.7000000001, 86735.7000000001, 86735.7000000001, 
86735.7000000001, 86735.7000000001, 86735.7000000001, 86735.7000000001, 
86735.7000000001, 86735.7000000001, 86735.7000000001, 86735.7000000001, 
86735.7000000001, 86735.7000000001, 86735.7000000001, 86735.7000000001, 
86735.7000000001, 86735.7000000001, 86735.7000000001, 86735.7000000001, 
86735.7000000001, 4426382.97477, 4426382.97477, 4426382.97477, 
4426382.97477, 4426382.97477, 4426382.97477, 4426382.97477, 4426382.97477, 
4426382.97477, 4426382.97477, 4426382.97477, 4426382.97477, 4426382.97477, 
4426382.97477, 4426382.97477, 4426382.97477, 4426382.97477, 4426382.97477, 
4426382.97477, 4426382.97477)), row.names = c(NA, -140L), class = c("tbl_df", 
"tbl", "data.frame"))
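
For reference, the 5% bucketing described in the question could be sketched roughly as follows in R. This is only an illustration under assumptions: the dates, the costs, and the names `job_start`, `job_end`, `costs`, and `per_interval` are made up and are not the asker's actual code or data.

```r
# Hypothetical raw data for a single job: one row per cost entry.
job_start <- as.Date("2019-01-01")
job_end   <- as.Date("2019-12-31")
costs <- data.frame(
  date = as.Date(c("2019-01-10", "2019-01-25", "2019-06-30", "2019-12-31")),
  cost = c(100, 250, 400, 50)
)

# Fraction of the job's duration that has elapsed at each cost date (0 to 1).
elapsed <- as.numeric(costs$date - job_start) / as.numeric(job_end - job_start)

# Map each fraction to one of 20 intervals of 5% each.
# A cost booked exactly on the end date is kept in interval 20.
costs$interval <- pmin(floor(elapsed * 20) + 1, 20)

# Sum the cost per interval; intervals without any cost become 0.
per_interval <- tapply(costs$cost, factor(costs$interval, levels = 1:20), sum)
per_interval[is.na(per_interval)] <- 0
```

Each job then contributes exactly 20 rows regardless of its length, which matches the `Intervals` column (1 to 20) in the data above.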
  • Is this R or Python? – Reeza Mar 11 '20 at 16:28
  • Looks like R, edited your question, probably worth adding the code you're using to generate the image if that's relevant. – Reeza Mar 11 '20 at 16:32
  • It's R. I'm only looking for ideas to help me fix my data. The dput(Question) was just so I could post the data. Thank you! – CaptainTREX Mar 11 '20 at 17:09
  • Honestly, I don't know what the question is here. This part is unclear 'How do I get the escape data points stacking up in lines'. Not sure what that means. Are you looking for how to jitter the plot? In general for prediction, take the total, minus the current spend and then allocate the rest out to the remaining months based on expected project completion and if you have nothing for that, just divide it by the number of months equally. – Reeza Mar 11 '20 at 17:45
  • Thank you Reeza, I did divide up my data into segments of 5% time and feel like I am getting somewhere. – CaptainTREX Mar 11 '20 at 21:43
  • Can you elaborate on what you want to plot? It's not very clear. Is it 20 lines grouped by interval? – StupidWolf Mar 12 '20 at 12:01
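
The allocation suggested in the comments (total minus current spend, divided equally over the remaining months when nothing better is known) could look something like this in R. It is a minimal sketch, not a tested forecasting method: `allocate_remaining` and its arguments are hypothetical names, and the figures are invented.

```r
# Spread the unspent part of a job's estimated total cost
# evenly over the months that remain.
allocate_remaining <- function(total_cost, spent_to_date, months_remaining) {
  stopifnot(months_remaining > 0)
  remaining <- total_cost - spent_to_date
  rep(remaining / months_remaining, months_remaining)
}

# Made-up example: 120000 estimated, 30000 already spent, 6 months left.
forecast <- allocate_remaining(total_cost = 120000,
                               spent_to_date = 30000,
                               months_remaining = 6)
```

Here `forecast` holds one predicted cost per remaining month. If an expected completion profile is available, the equal split could be replaced by weights that sum to 1.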
