Questions tagged [data-wrangling]

1242 questions
-1
votes
1 answer

Python build dict from a mixture of dict keys and list values

Input body is: {'columns': ['site_ref', 'site_name', 'region'], 'data': [['R005000003192', 'AIRTH DSR NS896876', 'WEST'], ['R005000003195', 'AIRTHREY DSR NS814971', 'WEST']]} How could I build a new dict that will take the column values as keys and…
RAH
  • 395
  • 2
  • 9
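
One possible reading of the question above is that each row in `data` should become its own dict keyed by the names in `columns`. The excerpt is cut off before the desired output, so that target shape is an assumption, and `body` and `records` are illustrative names only. A minimal sketch under that assumption:

```python
# Input copied from the question excerpt.
body = {
    "columns": ["site_ref", "site_name", "region"],
    "data": [
        ["R005000003192", "AIRTH DSR NS896876", "WEST"],
        ["R005000003195", "AIRTHREY DSR NS814971", "WEST"],
    ],
}

# Assumption: one dict per row, pairing each column name with the
# value at the same position in that row.
records = [dict(zip(body["columns"], row)) for row in body["data"]]

print(records[0])
```

If a single dict keyed by `site_ref` were wanted instead, the same `zip` pairing could feed a dict comprehension.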
-1
votes
1 answer

Troubleshooting resampling of a survey dataframe

I am having trouble resampling my dataframe and need some help. I have a household survey in country X. Country X is divided into 3000 counties of different population sizes. The % of sampled households varied by county size. Smaller counties were…
YouLocalRUser
  • 309
  • 1
  • 9
-1
votes
2 answers

Convert a dataframe to a dictionary with some columns (as a list) as keys and one column as value

I have a pandas dataframe df that looks like this: col1 col2 col3 A X 1 B Y 2 C Z 3 I want to convert this into a dictionary with col1 and col2 in a list as key and col3 as value. So, the output would look…
Scratch
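
The mapping described above can be sketched in plain Python. Since the excerpt is truncated before the expected output, the exact target shape is an assumption; the sketch uses tuples for the keys because a Python list is unhashable and cannot serve as a dict key. The names `rows` and `mapping` are illustrative.

```python
# Rows mirror the small table in the question excerpt: (col1, col2, col3).
rows = [("A", "X", 1), ("B", "Y", 2), ("C", "Z", 3)]

# Lists cannot be dict keys, so (col1, col2) is stored as a tuple.
mapping = {(col1, col2): col3 for col1, col2, col3 in rows}

print(mapping)
```

With pandas, `df.set_index(["col1", "col2"])["col3"].to_dict()` should produce the same tuple-keyed dict.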
-1
votes
1 answer

Assign a "value" to a particular observation in R

I have frequency counts that line up with a set number of states of the world Data= S <- c("a","b","c","d","e") n <- c(1,2,3,4,5) df<- data.frame(S,n) I want to create some values that line up with the n values for each, named with the relevant…
Gilrob
  • 93
  • 7
-1
votes
2 answers

Create a list of namedtuples from a dataframe

I have a dataframe like this: df1 Name Category Age Harry A 11 James B 23 Will A 19 I want to create a list of tuples using namedtuple from collections. The list should be like this: output_list =…
star_it8293
  • 399
  • 3
  • 12
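
A rough sketch of the namedtuple idea above, written without pandas so it stays self-contained. The record type name `Person` is hypothetical (the excerpt cuts off before showing `output_list`), and the field names are taken from the column headers shown in the question.

```python
from collections import namedtuple

# Hypothetical record type; field names follow the column headers in the excerpt.
Person = namedtuple("Person", ["Name", "Category", "Age"])

# Rows copied from the df1 table in the excerpt.
rows = [("Harry", "A", 11), ("James", "B", 23), ("Will", "A", 19)]

# Unpack each row into the namedtuple's positional fields.
output_list = [Person(*row) for row in rows]

print(output_list[0])
```

When the data is already in a dataframe, `list(df1.itertuples(index=False, name="Person"))` yields namedtuples directly.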
-1
votes
2 answers

Create new column based on other columns from a different dataframe

I have 2 dataframes: df1 Time Apples Pears Grapes Peachs 10:00 3 5 5 2 11:00 1 0 2 9 12:00 20 2 7 3 df2 Class Item Factor A Apples 3 A Peaches 2 A …
star_it8293
  • 399
  • 3
  • 12
-1
votes
1 answer

R Concatenate Columns from Excel file based on sheet name and Column's name

Hello, I have an Excel file with multiple sheets, and these sheets don't always have the same structure. I want to be able to read the Excel file, read only some specific sheets, select some specific columns, and then create a…
R_Student
  • 624
  • 2
  • 14
-1
votes
3 answers

R Aggregate data frame based on column values

I have a data set that looks like this: > newex Name Volume Period 1 oil 29000 Jun 21 2 gold 800 Mar 22 3 oil 21000 Jul 21 4 gold 1100 Sep 21 5 gold 3000 Feb 21 6 depower 3 Q1 21 7 oil…
Saïd Maanan
  • 511
  • 4
  • 14
-1
votes
1 answer

Pandas deleting partly duplicate rows with wrong values in specific columns

I have a large dataframe from a csv file which has a few dozen columns. I have another csv file which I concatenated to the original. Now, the second file has exactly the same structure but a particular column may have incorrect values. I want to…
darzan
  • 17
  • 4
-1
votes
1 answer

Add column in first data frame based upon two columns in second data frame

I am trying to add a column to a first data frame based upon a second data frame. Basically, data frame 1 has values that also exist in data frame 2, but data frame 2 has additional information that I would like to extract into data frame 1. Down…
U_jex
  • 83
  • 6
-1
votes
1 answer

Compare a column name with a row value and get another row value

I have a dataframe like this: request_created_at sponsor_tier is_active status cash_in 2019/10 ... 2021/07 0 2019/10 2019/10 2.0 True 1 8901.00 ... …
h1tom1
  • 1
  • 2
-1
votes
2 answers

Can anyone explain what the value in square brackets, [2], after str.split("|", expand=True) actually means?

df1["state"] = df1["place_with_parent_names"].str.split("|",expand=True)[2] What does the [2] actually indicate after the string split method?
makt
  • 89
  • 2
  • 15
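
For context on the question above: with `expand=True`, pandas `str.split` returns a DataFrame with one column per piece, labelled 0, 1, 2, and so on, so `[2]` selects the column labelled 2, i.e. the third piece of every row. The plain-Python analogue below shows what that means for a single value. The sample string is hypothetical, since the excerpt does not show the contents of `place_with_parent_names`.

```python
# Hypothetical value in the shape "|country|state|city|"; the real data is not shown.
place = "|Argentina|Capital Federal|Palermo|"

# Splitting on "|" yields one piece per separator-delimited field.
pieces = place.split("|")

# pieces[0] is the empty string before the leading "|",
# so index 2 is the second non-empty field.
state = pieces[2]

print(pieces)
print(state)
```

Because of the leading separator in this sample, index 2 is the second named field rather than the third; whether that holds for the real column depends on its actual format.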
-1
votes
2 answers

How to unite multiple columns (character data) without concatenating?

Within my data I have a subset of data that look like this:

| Incident | Year | Person1 | Person2 |
| :---- | :---: | :------: | -----: |
| 1 | 2014 | A | B |
| 2 | 2014 | A | |
| 3 | 2016 | B | C |
…
burphound
  • 161
  • 7
-1
votes
1 answer

For every unit increase in one column's value, another column's entries increase

I have a simulation dataset with 500 replicates - each replicate contains 300 ids. When rep = 1, id ranges from 1-300; when rep = 2, id again ranges from 1-300 and so on. I want to get the following: when rep = 1: id 1-300; when rep = 2: id 301-600…
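
The renumbering described above amounts to offsetting each id by the number of ids in all earlier replicates. A minimal sketch of that arithmetic, assuming exactly 300 ids per replicate as stated in the excerpt; `IDS_PER_REP` and `global_id` are illustrative names, not from the question.

```python
IDS_PER_REP = 300  # from the excerpt: each replicate contains 300 ids


def global_id(rep: int, local_id: int) -> int:
    """Map (rep, id within rep) to a running id: rep 1 -> 1..300, rep 2 -> 301..600."""
    return (rep - 1) * IDS_PER_REP + local_id


print(global_id(1, 1), global_id(1, 300), global_id(2, 1), global_id(2, 300))
```

The same expression, `(rep - 1) * 300 + id`, can be applied column-wise in a dataframe library.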
-1
votes
1 answer

Data wrangling in Python, calculate value from some conditions

I have a dataframe in Python below: import pandas as pd df = pd.DataFrame({ 'CRDACCT_DLQ_CYC_1_MNTH_AGO' : [3, 2, 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C'], …