A Python package that mimics R's dplyr-style data manipulation functionality.
Questions tagged [dfply]
20 questions
0 votes, 0 answers
Creating user defined function for joins (Python)
I am looking for an easy way to define a function that will consecutively join tables when run. I am pretty new to Python, but have been given the task of building out a package that relies heavily on joins to work successfully.
I have done plenty…

hamil_jp1
- 41
- 2
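One way to sketch the "consecutive joins" idea from the question above is to fold a list of tables together with `functools.reduce`. This uses plain pandas rather than dfply so that it runs without extra dependencies. The helper name `join_all`, the key column `id`, and the three sample tables are hypothetical and do not come from the question.

```python
from functools import reduce

import pandas as pd


def join_all(tables, on, how="inner"):
    """Join a non-empty list of DataFrames left to right on the same key column(s)."""
    # reduce() merges the first two tables, then merges that result with the third, and so on.
    return reduce(lambda left, right: left.merge(right, on=on, how=how), tables)


# Hypothetical sample tables sharing an "id" key.
customers = pd.DataFrame({"id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})
orders = pd.DataFrame({"id": [1, 2], "total": [10.0, 25.5]})
regions = pd.DataFrame({"id": [1, 3], "region": ["North", "South"]})

joined = join_all([customers, orders, regions], on="id")
print(joined)
```

With the default inner join, only keys present in every table survive. Passing `how="left"` would keep every row of the first table instead.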
0 votes, 2 answers
How can I group_by and sum values in a data frame?
I have this data frame (please refer to the table below)
| State | County | Homicides
|--------------|---------------|-----------
| Ags | Calvillo | 4
| Mexico City | Alvaro O | 2
| Mexico City | Alvaro O …

coding
- 917
- 2
- 12
- 25
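A minimal sketch of grouping and summing for a table shaped like the one in the question above, written in plain pandas so it runs without dfply. Only the first two rows are taken from the excerpt. The third row is made up for illustration, because the original table is truncated.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "State": ["Ags", "Mexico City", "Ags"],
        "County": ["Calvillo", "Alvaro O", "Calvillo"],
        # The last value is a hypothetical placeholder, not data from the question.
        "Homicides": [4, 2, 1],
    }
)

# Sum homicides per (State, County) pair; as_index=False keeps the keys as columns.
totals = df.groupby(["State", "County"], as_index=False)["Homicides"].sum()
print(totals)
```

In dfply the same idea would presumably be expressed as a pipe, along the lines of `df >> group_by(X.State, X.County) >> summarize(total=X.Homicides.sum())`. That form is offered as an untested assumption about the dfply API.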
0 votes, 0 answers
How to use a custom function in the dfply package in Python
I tried to use the dfply package to create an accumulator column given a condition, but it failed with a custom function.
Using the diamonds data as an example:
I'd like to create an accumulator column such that if price is larger than 500, then +1, else…
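A rough sketch of one possible accumulator, in plain pandas rather than dfply. The excerpt is cut off before the "else" branch, so this assumes the counter simply stays unchanged when the price is 500 or less. The prices below are hypothetical and are not taken from the diamonds data.

```python
import pandas as pd

# Hypothetical prices standing in for the diamonds "price" column.
df = pd.DataFrame({"price": [326, 520, 480, 700, 510]})

# Assumption: add 1 whenever price > 500, otherwise carry the previous count forward.
# A boolean Series sums as 0/1, so cumsum() yields the running count.
df["acc"] = (df["price"] > 500).cumsum()
print(df)
```

If the truncated "else" branch does something different, such as resetting or decrementing the counter, a vectorised `cumsum` would no longer be enough and an explicit loop or a grouped cumulative sum would be needed.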
0 votes, 1 answer
How can I use the mask command to include more than one parameter?
I'm currently doing a machine learning project (a very basic one), and using baseball data from 1871-2015. I want to use a specific set of years to test my prediction on. I'm using the dfply package and then the mask command to take out a certain…

Yaz229
- 3
- 2
0 votes, 2 answers
Python dfply: unable to mask on multiple conditions
I am an R user learning how to use Python's dfply, the Python equivalent to R's dplyr. My problem: in dfply, I am unable to mask on multiple conditions in a pipe. I seek a solution involving dfply pipes rather than multiple lines of subsetting.
My…

Neko
- 11
- 4
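The two mask questions above both come down to filtering rows on several conditions at once inside a pipeline. The sketch below shows that pattern in plain pandas using `DataFrame.pipe`. The helper `keep_rows` and the columns `year` and `wins` are hypothetical names chosen for illustration.

```python
import pandas as pd


def keep_rows(df, *conditions):
    """Keep rows where every boolean Series in `conditions` is True."""
    combined = pd.Series(True, index=df.index)
    for cond in conditions:
        combined &= cond  # logical AND of all conditions
    return df[combined]


# Hypothetical data; not taken from either question.
games = pd.DataFrame({"year": [1995, 2005, 2012, 2015], "wins": [90, 70, 95, 88]})

# Filter on two conditions in a single pipeline step.
subset = games.pipe(lambda d: keep_rows(d, d["year"] >= 2000, d["wins"] > 80))
print(subset)
```

dfply's own `mask` is documented as accepting several comma-separated conditions that are combined with AND, for example `games >> mask(X.year >= 2000, X.wins > 80)`. That line has not been run here and should be checked against the installed dfply version.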