Highest Voted 'data-munging' Questions

1

vote

1 answer

Is there a way to read columns with strings as strings when using RGoogleDocs

I use RGoogleDocs a lot. I use it to read in data that is private or only shared with a few people. I know that read.table and read.csv allow one to use stringsAsFactors=FALSE. I want to do something similar in RGoogleDocs. Here is my typical…

r google-docs data-munging

asked Oct 11 '12 at 02:20

Farrel

10,244
19
61
99

0

votes

1 answer

PHP: How to match all occurrences of a regex pattern in a document

I am doing some data munging on documents which may or may not contain occurrence(s) of a regular expression pattern. I would like to write a PHP function to process the documents - the job of the function…

php regex preg-match-all data-munging

asked Nov 27 '11 at 13:22

Homunculus Reticulli

65,167
81
216
341

0

votes

2 answers

How to select only the last hour of weather data from each week in R?

I have a weather dataset with observations collected at 15-minute intervals for several weeks. I would like to extract only the last hour of weather data for each week and disregard the rest. In week 15, for example, I only want to keep rows from…

r dataframe dplyr data-manipulation data-munging

asked Jul 28 '23 at 18:08

Ahsk

241
1
7

0

votes

3 answers

pandas series mark all the rows between two values

I have a series (a single column in a df) with 3 possible values: Stable, Increase, Decrease, and I want to mark all the areas between an Increase and the subsequent Decrease. So for the…

pandas dataframe data-science data-munging

asked Jun 06 '23 at 13:17

Cranjis

1,590
8
31
64

0

votes

4 answers

pandas dataframe get rows when list values in specific columns meet certain condition

I have a dataframe:

df =
A  B
1  [0.2, 0.8]
2  [0.6, 0.9]

I want to get only the rows where all the values of B are >= 0.5. So here:

new_df =
A  B
2  [0.6, 0.9]

What is the best way to do it?

python pandas dataframe data-munging

asked Apr 09 '23 at 11:31

Cranjis

1,590
8
31
64
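One possible approach to the list-column filter asked about above, as a minimal sketch rather than the accepted answer: build a boolean mask by checking every element of each list in B. The sample data is taken from the excerpt; the name THRESHOLD is introduced here only for illustration.

```python
import pandas as pd

# Sample data as given in the question excerpt.
df = pd.DataFrame({"A": [1, 2], "B": [[0.2, 0.8], [0.6, 0.9]]})

THRESHOLD = 0.5  # illustrative name for the ">= 0.5" condition

# For each list in column B, check that every element meets the threshold.
mask = df["B"].apply(lambda values: all(v >= THRESHOLD for v in values))

# Keep only the rows whose list passed the check.
new_df = df[mask]
print(new_df)
```

Note that `all()` returns True for an empty list, so rows with an empty B would be kept; whether that is wanted depends on the data.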

0

votes

1 answer

Running AppleScript Regex using sed problems

I'm trying to use sed under AppleScript to execute regex commands. The goal is to run a series of regex commands to clean up text. I've tried to use the TextEdit app as the source for the data to be passed on to the regex commands. It will not run…

regex sed applescript data-munging

asked Mar 26 '23 at 14:36

jeff

9

0

votes

1 answer

Pandas conditional join and calculation

I have two Pandas dataframes, df_stock_prices and df_sentiment_mean. I would like to do the following: Left join/merge these two dataframes into one dataframe, joined by Date and by ticker. In df_stock_prices, ticker is the column name, for…

python pandas dataframe data-munging

asked Jan 25 '23 at 14:00

billv1179

323
5
15

0

votes

1 answer

R code that iteratively creates a "rank_order" column for every column in a given dataframe

Given a data frame such as the following, how do I get a rank-order column (e.g. an integer column ranking the values in descending order as "1,2,3") for every single column without writing out every single column? df <- data.frame( col1 =…

r data-munging

asked Jan 06 '23 at 21:55

jaykay

41
1

0

votes

1 answer

Adding a (fixed) new row to the top of each dataset in a list of N datasets using apply

I have N data sets which were loaded into RStudio and stored in the list object "datasets". The problem is that what I want as the top row, or the header, of each of them is currently in its third row. The initial version of this…

r data-manipulation transformation data-wrangling data-munging

asked Dec 31 '22 at 06:37

Marlen

171
11

0

votes

1 answer

How to apply the equivalent of standard subsetting operations but to a list of dataframes instead of to a single dataframe

I have a set of 40 different datasets within a file folder which I have loaded into my workspace in RStudio with: datasets <- lapply(filepaths_list, read.csv, header = FALSE) This object datasets is a list of 40 dataframes. I would like to run code…

r transformation data-preprocessing data-munging

asked Dec 30 '22 at 12:18

Marlen

171
11

0

votes

2 answers

pandas apply subtractions on columns function when indexes are not equal, based on alignment in another column

I have two dataframes:

df1 =
   C0  C1  C2
4  AB   1   2
5  AC   7   8
6  AD   9   9
7  AE   2   6
8  AG   8   9

df2 =
    C0  C1  C2
8   AB   0   1
9   AE   6   3
10  AD   1   2

I want to apply a subtraction between these two dataframes,…

python pandas dataframe data-science data-munging

asked Dec 03 '22 at 11:27

Cranjis

1,590
8
31
64
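The excerpt above is cut off, so the exact desired output is unknown. As a hedged sketch of one reading of the title (subtract C1 and C2 row by row, matching rows on C0 instead of on the row index), the frames can be re-indexed by C0 before subtracting. The sample values are reconstructed from the excerpt, and leaving unmatched rows as NaN is an assumption.

```python
import pandas as pd

# Sample frames reconstructed from the question excerpt; note the row
# indexes (4..8 and 8..10) do not line up between the two frames.
df1 = pd.DataFrame(
    {"C0": ["AB", "AC", "AD", "AE", "AG"], "C1": [1, 7, 9, 2, 8], "C2": [2, 8, 9, 6, 9]},
    index=[4, 5, 6, 7, 8],
)
df2 = pd.DataFrame(
    {"C0": ["AB", "AE", "AD"], "C1": [0, 6, 1], "C2": [1, 3, 2]},
    index=[8, 9, 10],
)

# Use C0 as the alignment key instead of the original row index.
left = df1.set_index("C0")
right = df2.set_index("C0")

# Reindex df2 to df1's keys so the result keeps df1's row order;
# keys missing from df2 (AC, AG) become NaN (assumption: no match = no result).
diff = left - right.reindex(left.index)
print(diff)
```

If unmatched rows should instead be treated as subtracting zero, `left.sub(right.reindex(left.index), fill_value=0)` is one alternative.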

0

votes

0 answers

Pandas dataframe sort and groupby columns and add columns that are based on calculations from previous group

I have a df:

df =
Date                     id1  amount  is_winner
2022-07-14 02:34:20.348  A    87.11   False
2022-07-14 02:34:20.348  B    77.12   True
2022-07-14 02:37:20.348  A    89.11   False
2022-07-14 02:37:20.348  B    87.12   True
2022-07-14…

python-3.x pandas dataframe group-by data-munging

asked Dec 01 '22 at 15:16

Cranjis

1,590
8
31
64

0

votes

1 answer

Pandas add column of count of another column across all the dataframe

I have a dataframe: df = C1 C2 E 1 2 3 4 9 1 3 1 1 8 2 8 8 1 2 I want to add another column that will hold the count of the value in column 'E' across the whole dataframe (in the column…

python pandas dataframe data-science data-munging

asked Nov 30 '22 at 18:34

Cranjis

1,590
8
31
64

0

votes

0 answers

Label a variable in R from a Survey

I'm trying to analyze a survey in R. When I import the data and look at the table, I can see in the heading very nicely that there is the variable name (e.g. 'gender') and below it R put the question originally asked in the survey (e.g. 'What gender do…

r label data-manipulation data-munging

asked Nov 24 '22 at 12:27

Sofia

21
3

0

votes

1 answer

Pandas histogram of number of occurrences of other columns after groupby

I have a dataframe:

df =
Batch_ID  DateTime                   Code  A1  A2
ABC       '2019-01-02 17:03:41.000'  230   2   4
ABC       '2019-01-02 17:03:41.000'  230   1   5
ABC       '2019-01-02 17:03:42.000'  231   1   4
…

pandas dataframe group-by data-science data-munging

asked Nov 22 '22 at 19:07

Cranjis

1,590
8
31
64
