Using pandas with Python to display the percentage of times each car did not stop for gas

Question

I need help using pandas with Python to write a small program that displays the percentage of times each car did not stop for gas (represented by NOT). I used df.groupby and .value_counts to produce the output in the screenshot (captioned "what I currently have displayed"). There are only two columns (CAR and GAS); the remaining numbers are the counts of trips for each colored car. I'm trying to combine every value other than NOT and show, for each car, the percentage of trips on which it stopped for gas and the percentage on which it didn't, giving two results (positive and negative) per car. The output should look something like:

RED Percentage stopped for gas: 50% Percentage didn't stop for gas: 50%
BLUE Percentage stopped for gas: 50% Percentage didn't stop for gas: 50%
GREEN Percentage stopped for gas: 50% Percentage didn't stop for gas: 50%

I tried df.groupby('CAR').GAS.value_counts().loc[:, 'NOT'] and df.groupby('CAR').GAS.value_counts(), which produced the list you see, but I'm having trouble assigning the NOT count to one variable and the combined count of all other GAS values to a "positive" variable.
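The step described above (one variable for the NOT count, one for everything else) can be sketched as follows. This is only an illustration under assumptions: the sample rows and the station names `SHELL` and `BP` are made up, and the column names `CAR` and `GAS` are taken from the question. It also assumes at least one row in the data is labeled `NOT`; otherwise the `xs` lookup raises a `KeyError`.

```python
import pandas as pd

# Hypothetical sample data; the real frame is assumed to have the same two columns
df = pd.DataFrame({
    "CAR": ["RED", "RED", "RED", "RED", "BLUE", "BLUE", "GREEN", "GREEN"],
    "GAS": ["NOT", "SHELL", "NOT", "BP", "NOT", "SHELL", "BP", "BP"],
})

counts = df.groupby("CAR").GAS.value_counts()  # Series indexed by (CAR, GAS)
total = df.groupby("CAR").size()               # number of trips per car

# NOT count per car; cars that never skipped gas get 0 instead of going missing
not_count = counts.xs("NOT", level="GAS").reindex(total.index, fill_value=0)
got_count = total - not_count                  # all other GAS values combined

pct_not = (not_count / total * 100).round(2)
pct_got = (got_count / total * 100).round(2)

for car in total.index:
    print(f"{car} Percentage stopped for gas: {pct_got[car]}% "
          f"Percentage didn't stop for gas: {pct_not[car]}%")
```

Subtracting the NOT count from the per-car total avoids having to list every gas station name explicitly.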

SimoN SavioR · Answer 1 · 2023-01-26T09:39:21.603


I have edited my answer.

import pandas as pd
import numpy as np

size = 200
d = {
    "cars":np.random.choice(["red","blue","green"], size),
    "station":np.random.choice(6*["not"]+[f"gas{i}" for i in range(9)], size)
}

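def trans_func(x):
    # Collapse every station except "not" into a single "got" label.
    # The case is normalised so that "NOT" and "not" end up in the same group.
    if str(x).lower() != "not":
        return "got"
    return "not"

states = {"not":-1, "got":1}

def get_result(df:pd.DataFrame) -> None:
    # Note: this renames the columns of the frame that is passed in
    columns = ["car", "station", "count"]
    df.columns = columns[:df.columns.size]

    df["station"] = df["station"].transform(trans_func)
    df["count"] = df["station"].transform(lambda x: states[str(x).lower()])
    # One per row, so the grouped sum below is the number of trips
    df["percentage"] = 1

    group = df.groupby(["car","station"]).aggregate("sum").groupby("car")

    result = pd.DataFrame()
    result_text = "\n"

    for car, gr in group:
        # Work on a copy so the grouped data is not modified in place
        gr = gr.copy()

        gr["percentage"] = (gr["percentage"] / gr["percentage"].sum()*100).map(lambda x: f"{x:0.2f}%")

        result = pd.concat([result, gr])

        # Look up by label rather than position: a car may lack one category.
        # The values already end in "%", so the template does not add another.
        pct = gr["percentage"].droplevel("car")
        result_text += "*{0}* Percentage stopped for gas: {1} Percentage didn't stop for gas: {2}\n".format(
            str(car).upper(),
            pct.get("got", "0.00%"),
            pct.get("not", "0.00%"))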

    print(result)
    print(result_text)

df = pd.DataFrame(d)
get_result(df)
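For comparison, the same percentages can be computed more compactly by turning the GAS column into a boolean and averaging it per car. This is a sketch under assumptions, not the method used in the answer above: the function name `gas_percentages`, its parameters, and the output column names are hypothetical, and the column names `CAR` and `GAS` come from the question.

```python
import pandas as pd

def gas_percentages(df, car_col="CAR", gas_col="GAS", not_label="NOT"):
    """Return one row per car with the share of trips with and without a gas stop."""
    # True where the car stopped for gas (any value other than the NOT label);
    # whitespace and case are normalised before comparing
    stopped = df[gas_col].astype(str).str.strip().str.upper().ne(not_label)
    # The mean of a boolean column is the fraction of True values
    pct_stopped = stopped.groupby(df[car_col]).mean() * 100
    return pd.DataFrame({
        "stopped_pct": pct_stopped.round(2),
        "not_stopped_pct": (100 - pct_stopped).round(2),
    })

# Hypothetical sample; car names can be any label, e.g. "R600" or "RED"
sample = pd.DataFrame({
    "CAR": ["R600", "R600", "R600", "R600", "RED", "RED"],
    "GAS": ["NOT", "gas1", "gas2", "gas3", "not", "gas1"],
})
summary = gas_percentages(sample)
print(summary)
```

Because the grouping key is just the car column, nothing here depends on what the car names look like.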

edited Jan 26 '23 at 09:39

answered Jan 25 '23 at 09:41


Thank you, I'll give this a try. The CSV file I'm working with has about 900 rows, and sometimes the cars are called "R600" or "R926". Would this method work if the car name is not a plain word like "Red"? – Tony Porpora Jan 25 '23 at 11:31
Of course. All you have to do is pass the dataframe as an argument to the function. If it gives an error, let me know. – SimoN SavioR Jan 25 '23 at 13:35
Thanks, I tried it and it didn't really work as desired. I'm trying to have the output be a table or list showing each car with the positive percentage of times it gassed up and the negative percentage of times it didn't. The output from what you gave me just showed a table with R as the only value in the first column, a gas station column with a repeating value of "gas_station0" (with the 0 going up by one down the column), and a count for each value in the third column. – Tony Porpora Jan 25 '23 at 17:35
I'd like the first column to be the name of the car (R927, R562, and so on) and the second column to show just a positive and negative percentage for whether they got gas or not. Pretty much all the gas stations not labeled "NOT" are positive, with "NOT" being negative. – Tony Porpora Jan 25 '23 at 17:35
You can check my new answer. – SimoN SavioR Jan 26 '23 at 09:40
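The table described in the comments (one row per car, a positive and a negative percentage) could also be produced with `pd.crosstab`. This is a rough sketch under assumptions: the CSV text below is made-up stand-in data, the header names `CAR` and `GAS` are taken from the question, and the labels `positive` and `negative` are chosen here for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for the real 900-row file
csv_text = """CAR,GAS
R600,NOT
R600,gas_station0
R600,gas_station1
R600,NOT
R926,gas_station2
R926,NOT
R926,gas_station0
R926,gas_station3
"""
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Label each trip: "negative" if GAS is NOT, "positive" for any gas station
outcome = (
    df["GAS"].str.upper().eq("NOT")
    .map({True: "negative", False: "positive"})
    .rename("outcome")
)

# normalize="index" turns the counts in each car's row into fractions of that row
table = (pd.crosstab(df["CAR"], outcome, normalize="index") * 100).round(2)
# Keep both columns even if one outcome never occurs in the data
table = table.reindex(columns=["positive", "negative"], fill_value=0.0)
print(table)
```

With a real file, `pd.read_csv` would be given the file path instead of the `io.StringIO` wrapper.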
