Questions tagged [data-wrangling]
1242 questions
0
votes
1 answer
Transforming a variable using the if-else if function
I have a data set for which I want to calculate z-scores by year.
Example:
Year Score
1999 120
1999 132
1998 120
1997 132
2000 120
2002 132
1998 160
1997 142
....etc
What I want is:
Year …
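
A minimal sketch of per-year z-scores, using only the Python standard library and the sample rows from the excerpt. The desired output is cut off in the excerpt, so the result shape here is an assumption, as is the use of the sample standard deviation. The name `zscores_by_year` is hypothetical. Years with a single row (2000 and 2002 in the sample) have no defined standard deviation, so they get `None`.

```python
from collections import defaultdict
from statistics import mean, stdev

# Sample rows copied from the question: (year, score).
rows = [
    (1999, 120), (1999, 132), (1998, 120), (1997, 132),
    (2000, 120), (2002, 132), (1998, 160), (1997, 142),
]

def zscores_by_year(rows):
    """Return one z-score per row, computed within that row's year."""
    by_year = defaultdict(list)
    for year, score in rows:
        by_year[year].append(score)

    # Per-year mean and sample standard deviation (None when undefined).
    stats = {}
    for year, scores in by_year.items():
        sd = stdev(scores) if len(scores) > 1 else None
        stats[year] = (mean(scores), sd)

    result = []
    for year, score in rows:
        mu, sd = stats[year]
        # A year with one row, or with identical scores, has no usable z-score.
        result.append((score - mu) / sd if sd else None)
    return result

z = zscores_by_year(rows)
```

The list `z` lines up with `rows` by position, so the two can be zipped back together into a Year / Score / z-score table.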

JeffB
- 139
- 1
- 10
0
votes
2 answers
Creating a new dataframe based on two conditions from two other dataframes
I'm fairly new to coding languages and have been asked to create a new dataframe based on two existing dataframes. Dataframe 1 is the original and dataframe 2 is a subset of the original. The new data frame needs to be a copy of the original with…

Victoria
- 1
- 1
0
votes
1 answer
How to add ':' after every 2 characters in a pandas Series
I have a pandas df that looks like this:
I'm trying to split up the Gun_Time and Net_Time columns by adding a ':' after every 2 numbers.
I've tried some regex with a simple function but have been unable to come up with the correct solution to…
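
A minimal sketch of the regex part, assuming the values are digit strings such as "012345" (the actual data is not visible in the excerpt, so the sample values and the helper name `add_colons` are hypothetical). The lookahead stops a trailing ':' from being added after the last pair.

```python
import re

# Two digits that are followed by at least one more digit.
PAIR = re.compile(r"(\d{2})(?=\d)")

def add_colons(value):
    """Insert ':' after every 2 digits, except after the final group."""
    return PAIR.sub(r"\1:", value)

times = ["012345", "1530", "07"]  # hypothetical values
formatted = [add_colons(t) for t in times]
```

In pandas, the same pattern should work with `Series.str.replace(r"(\d{2})(?=\d)", r"\1:", regex=True)`, provided the column holds strings. If the columns were read as numbers, leading zeros would already be lost, so that is worth checking first.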

Dasax121
- 23
- 8
0
votes
1 answer
Adding a column to a data frame
I want to add a column into my dataframe that identifies which group my data is from (in this case, year). What I have is:
Var1 Var2 Var3
X X X
X X X
X X X
What I want is:
Var1 Var2 Var3 Year
X X X …

JeffB
- 139
- 1
- 10
0
votes
1 answer
Updating a File in R by adding a column/vector
Is there any way to update an existing .csv file by adding a column/vector that I have scraped from the web? I have a webscraper that pulls COVID-19 data and I am trying to create a file that has positive cases in columns and each column is…

Nathan May
- 1
- 1
0
votes
2 answers
How to Iterate & Count Each Categorical Value Based on Some Condition
I am working on a dataset and want to iterate through each value to find the count of job and marital status based on the deposit.
Example:

Mukul Singhal
- 11
- 2
0
votes
1 answer
I want to rearrange my data in R Studio (sorting and cbind)
If I have a data frame as follows:
R/C DD CC1 CC2
RR1 a 36 37
RR1 b 21 22
RR1 c 24 25
RR1 d 196 198
RR2 e 37 38
RR2 f 17 17
RR2 g 16 16
RR2 h 48 51
RR3 i 89 90
RR3 j 79 80
RR3 k 26 26
RR3 h 48 51…

Dongchul Park
- 165
- 7
0
votes
0 answers
How to reset the row index of dataset
I'm new to the whole data thing, and I'm trying to clean my data and get it into a form I can work with.
df = pd.read_excel('https://query.data.world/s/s3t37yqxxeoabyocyh6g33fojskwvq')
df.head()
What should I do to restore the…

gyungrokna
- 25
- 4
0
votes
3 answers
How to remove an entire row if it contains zero in a specific range of columns in R?
If my data frame is:
A1 A2 B1 B2 C1 C2
row1 67 8 0 99 67 84
row2 8 22 25 5 72 0
row3 0 83 35 68 17 13
row4 69 37 52 93 67 78
row5 68 64 68 90 61 38
row6 16 30 2 19 40 1
row7 …

Dongchul Park
- 165
- 7
0
votes
2 answers
Stack/Melt Sets of Columns
I'm trying to take a single table with 10 columns and merge/union/stack it into 2 columns. The current layout is like this:
ID_1 | Name_1 | ID_2 | Name_2 | ID_3 | Name_3
I'm trying to get this into a format with the column headers "ID" and "Name":
ID |…
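
A minimal sketch of the stacking step in plain Python, assuming every column follows the `ID_n` / `Name_n` naming shown in the excerpt. The sample values and the helper name `stack_pairs` are hypothetical, and the output order (all pairs of the first row, then the second row) is an assumption, since the target layout is cut off.

```python
import re
from collections import defaultdict

# Matches column names such as "ID_1" or "Name_3".
SUFFIXED = re.compile(r"^(ID|Name)_(\d+)$")

def stack_pairs(rows):
    """Turn ID_n / Name_n column pairs into rows with 'ID' and 'Name' keys."""
    stacked = []
    for row in rows:
        # Group this row's values by numeric suffix.
        groups = defaultdict(dict)
        for column, value in row.items():
            match = SUFFIXED.match(column)
            if match:
                groups[int(match.group(2))][match.group(1)] = value
        for suffix in sorted(groups):
            stacked.append({"ID": groups[suffix].get("ID"),
                            "Name": groups[suffix].get("Name")})
    return stacked

wide = [
    {"ID_1": 1, "Name_1": "a", "ID_2": 2, "Name_2": "b", "ID_3": 3, "Name_3": "c"},
    {"ID_1": 4, "Name_1": "d", "ID_2": 5, "Name_2": "e", "ID_3": 6, "Name_3": "f"},
]
long = stack_pairs(wide)
```

With a pandas DataFrame, `pandas.wide_to_long` or a `pandas.concat` over the renamed column pairs would likely be the more idiomatic route; the loop above only illustrates the reshaping logic.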

Joseph Wooster
- 17
- 4
0
votes
3 answers
R Rearrange columns in dataframe based on date values in column names
I've got a dataframe that has monthly survey scores for certain hospitals. Each month, we store the score obtained by the hospital (_Score column) and the corresponding average score for all hospitals for that month (_Average column).
Here's a…

Varun
- 1,211
- 1
- 14
- 31
0
votes
2 answers
Create a random binary variable for a subset of observations assigning 1 to a specific proportion of rows
I have a dataframe...
df <- tibble(
  id = 1:10,
  family = c("a","a","b","b","c", "d", "e", "f", "g", "h")
)
Families will only contain 2 members at most (so they're either individuals or pairs).
For individuals (families with only one row,…

Tom
- 279
- 1
- 12
0
votes
1 answer
Creating a bar chart using ggplot with a manual data frame (one row, 5 columns)
This is my data frame. I could not find a way to construct a bar plot using ggplot with the column names and x as the values indicated.

bk2nt
- 1
0
votes
1 answer
Problem with pipe within purrr::map2 and mutate
nested_numeric <- model_table %>%
  group_by(ano_fiscal) %>%
  select(-c("ano_estudo", "payout", "div_ratio","ebitda", "name.company",
            "alavancagem","div_pl", "div_liq", "div_total")) %>%
  nest()
nested_numeric
# A tibble: 7 x 2
#…

Walber Moreira
- 41
- 5
0
votes
1 answer
dataframe using list vs dictionary
import pandas as pd
pincodes = [800678,800456]
numbers = [2567890, 256757]
labels = ['R','M']
first = pd.DataFrame({'Number':numbers, 'Pincode':pincodes},
                     index=labels)
print(first)
The above code gives me the following…

Gaganrajdeep Singh
- 19
- 6