Questions tagged [pandas]

Pandas is a Python library for data manipulation and analysis, e.g. dataframes, multidimensional time series and cross-sectional datasets commonly found in statistics, experimental science results, econometrics, or finance. Pandas is one of the main data science libraries in Python.

Pandas (from PAN-el DA-ta) is a Python library for data manipulation and analysis, e.g. multidimensional time series and cross-sectional data sets commonly found in statistics, experimental science results, econometrics, or finance. Pandas is implemented primarily using NumPy and Cython and is intended to integrate easily with NumPy-based scientific libraries, such as statsmodels.

To create a reproducible Pandas example:

Main Features:

Data structures: Series and DataFrames, for one- and two-dimensional labeled datasets respectively. Some of their main features include:
- Automatic data alignment and interpolation
- Handling of missing observations in calculations
- Convenient slicing and reshaping ("reindexing") functions
- Categorical data types
- 'Group by' aggregation and transformation functionality
- Tools for merging and joining data sets
- Simple Matplotlib integration for plotting and graphing
- Multi-indexing, which gives indices a structure that can represent an arbitrary number of dimensions
Date tools: objects for expressing date offsets or generating date ranges. Dates can be aligned to a specific time zone and converted or compared at will.
Statistical models: convenient ordinary least squares and panel OLS implementations for in-sample or rolling time series and cross-sectional regressions, intended as a starting point for implementing models.
Intelligent Cython offloading: complex computations are performed rapidly due to these optimizations.
Static and moving statistical tools: mean, standard deviation, correlation, and covariance.
Rich user documentation, built with Sphinx.
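
As a rough illustration of several of the features listed above (label alignment, missing-data handling, group-by, merging, and date ranges), here is a minimal sketch. All data and variable names (s1, s2, df, labels) are made up for the example and are not part of the tag description.

```python
import numpy as np
import pandas as pd

# Two Series with partially overlapping labels (illustrative data).
s1 = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])
s2 = pd.Series([10.0, 20.0, 30.0], index=["b", "c", "d"])

# Arithmetic aligns on index labels; a label missing on either side gives NaN.
aligned_sum = s1 + s2

# Reductions skip missing observations by default.
total = aligned_sum.sum()

# 'Group by' aggregation on a small DataFrame.
df = pd.DataFrame({"key": ["x", "y", "x", "y"], "value": [1, 2, 3, 4]})
group_means = df.groupby("key")["value"].mean()

# Merging: attach a label to each row by joining on the shared "key" column.
labels = pd.DataFrame({"key": ["x", "y"], "label": ["first", "second"]})
merged = df.merge(labels, on="key", how="left")

# Date tools: a time-zone-aware daily date range.
dates = pd.date_range("2024-01-01", periods=3, freq="D", tz="UTC")

print(aligned_sum)
print(group_means)
print(merged)
```

Here aligned_sum has NaN for the labels "a" and "d", which appear in only one of the two Series, while the sum still works because NaN values are skipped.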

Asking Questions:

Before asking a question, make sure you have gone through the 10 Minutes to pandas introduction, which covers all the basic functionality of Pandas.
See this question on asking good questions: How to make good reproducible pandas examples
Please provide the versions of Pandas and NumPy, and platform details (if appropriate), in your questions.
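
One possible way to collect the version details requested above is sketched here. The dictionary name version_info is just an illustrative choice; pd.show_versions() prints a fuller report if more detail is needed.

```python
import platform

import numpy as np
import pandas as pd

# Gather the details that are usually worth pasting into a question.
version_info = {
    "pandas": pd.__version__,
    "numpy": np.__version__,
    "python": platform.python_version(),
    "platform": platform.platform(),
}

for name, value in version_info.items():
    print(f"{name}: {value}")
```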

Answering Questions:

How can I effectively load data on Stack Overflow questions using Pandas read_clipboard? (useful for copy pasting data from questions into your terminal as DataFrames)
Copying MultiIndex dataframes with pd.read_clipboard?
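
In an interactive session, pd.read_clipboard() parses whatever text is on the clipboard, splitting on whitespace by default. Because a clipboard is not available everywhere, the sketch below stands in for it by parsing the same kind of pasted text from a string with pd.read_csv and a whitespace separator. The sample table is invented for the example.

```python
import io

import pandas as pd

# Text as it might be copied from a question (illustrative data).
pasted = """\
   a  b  c
0  1  2  3
1  4  5  6
"""

# Whitespace-separated parsing; the unlabeled first column becomes the index
# because each data row has one more field than the header row.
df = pd.read_csv(io.StringIO(pasted), sep=r"\s+")

print(df)
```

With the text actually on the clipboard, df = pd.read_clipboard() would be the interactive equivalent.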

Useful Canonicals:

Resources and Tutorials:

Books:

282843 questions

5 answers

Pandas Dataframe Find Rows Where all Columns Equal

I have a dataframe that has characters in it - I want a boolean result by row that tells me if all columns for that row have the same value. For example, I have df = [ a b c d 0 'C' 'C' 'C' 'C' 1 'C' 'C' 'A' 'A' 2 'A' 'A'…

python pandas

asked Mar 28 '14 at 00:11

Lisa L

2 answers

pandas - how to access cell in pandas, equivalent of df[3,4] in R

If I have a pandas DataFrame object, how do I simply access a cell? In R, assuming my data.frame is called df, I can access the 3rd row and 4th column by df[3,4] What is the equivalent in python?

indexing pandas dataframe

asked Jan 27 '14 at 22:58

bill999

Reputation 2,147; badges 8, 51, 103

8 answers

Pandas dataframe hide index functionality?

Is it possible to hide the index when displaying pandas DataFrames, so that only the column names appear at the top of the table? This would need to work for both the html representation in ipython notebook and to_latex() function (which I'm using…

python pandas jupyter-notebook

asked Jan 21 '14 at 10:52

J Grif

Reputation 1,003; badges 2, 12, 16

7 answers

Pandas: Get duplicated indexes

Given a dataframe, I want to get the duplicated indexes, which do not have duplicate values in the columns, and see which values are different. Specifically, I have this dataframe: import pandas as pd wget…

python indexing pandas

asked Nov 25 '13 at 17:15

Olga Botvinnik

Reputation 1,564; badges 1, 14, 32

3 answers

Unpivot Pandas Data

I currently have a DataFrame laid out as: Jan Feb Mar Apr ... 2001 1 12 12 19 2002 9 ... 2003 ... and I would like to "unpivot" the data to look like: Date Value Jan 2001 1 Feb 2001 1 Mar 2001 12 ... Jan 2002 …

python pandas numpy dataframe

asked Aug 15 '13 at 18:18

Alex Rothberg

Reputation 10,243; badges 13, 60, 120

11 answers

Pandas ParserError EOF character when reading multiple csv files to HDF5

Using Python3, Pandas 0.12 I'm trying to write multiple csv files (total size is 7.9 GB) to a HDF5 store to process later onwards. The csv files contain around a million of rows each, 15 columns and data types are mostly strings, but some floats.…

python csv python-3.x pandas hdf5

asked Aug 02 '13 at 11:40

Matthijs

5 answers

Get column name where value is something in pandas dataframe

I'm trying to find, at each timestamp, the column name in a dataframe for which the value matches with the one in a timeseries at the same timestamp. Here is my dataframe: >>> df col5 col4 col3 col2 …

python dataframe pandas

asked Feb 06 '13 at 17:04

leroygr

Reputation 2,349; badges 4, 18, 18

3 answers

Python pandas, Plotting options for multiple lines

I want to plot multiple lines from a pandas dataframe and setting different options for each line. I would like to do something…

python plot pandas

asked Jan 06 '13 at 00:58

Joerg

4 answers

Filter out groups with a length equal to one

I am creating a groupby object from a Pandas DataFrame and want to select out all the groups with > 1 size. Example: A B 0 foo 0 1 bar 1 2 foo 2 3 foo 3 The following doesn't seem to work: grouped =…

python pandas group-by

asked Oct 31 '12 at 21:03

Abhi

Reputation 6,075; badges 10, 41, 55

12 answers

Creating dummy variables in pandas for python

I'm trying to create a series of dummy variables from a categorical variable using pandas in python. I've come across the get_dummies function, but whenever I try to call it I receive an error that the name is not defined. Any thoughts or other…

python pandas

asked Jul 20 '12 at 22:33

user1074057

Reputation 1,772; badges 5, 20, 30

5 answers

pandas convert from datetime to integer timestamp

Considering a pandas dataframe in python having a column named time of type integer, I can convert it to a datetime format with the following instruction. df['time'] = pandas.to_datetime(df['time'], unit='s') so now the column has entries like:…

python pandas timestamp datetime-conversion

asked Jan 22 '19 at 16:43

roschach

Reputation 8,390; badges 14, 74, 124

2 answers

Pandas filter data frame rows by function

I want to filter a dataframe by a more complex function based on different values in the row. Is there a possibility to filter DF rows by a boolean function like you can do it e.g. in ES6 filter function? Extreme simplified example to illustrate the…

python-3.x pandas dataframe function filter

asked Jul 30 '18 at 08:09

Karl Adler

Reputation 15,780; badges 10, 70, 88

3 answers

pandas, melt, unmelt preserve index

I've got a table of clients (coper) and asset allocation (asset) A = [[1,2],[3,4],[5,6]] idx = ['coper1','coper2','coper3'] cols = ['asset1','asset2'] df = pd.DataFrame(A,index = idx, columns = cols) so my data look like asset1 …

python python-2.7 pandas linear-programming

asked May 25 '18 at 12:15

Mohammad Athar

Reputation 1,953; badges 1, 15, 31

2 answers

Pandas add column with value based on condition based on other columns

I have the following pandas dataframe: import pandas as pd import numpy as np d = {'age' : [21, 45, 45, 5], 'salary' : [20, 40, 10, 100]} df = pd.DataFrame(d) and would like to add an extra column called "is_rich" which captures if a person…

python pandas dataframe performance conditional-statements

asked May 16 '18 at 16:36

Rutger Hofste

Reputation 4,073; badges 3, 33, 44

5 answers

Read JSON to pandas dataframe - ValueError: Mixing dicts with non-Series may lead to ambiguous ordering

I am trying to read in the JSON structure below into pandas dataframe, but it throws out the error message: ValueError: Mixing dicts with non-Series may lead to ambiguous ordering. Json data: { "status": { "statuscode": 200, …

python json pandas

asked Mar 27 '18 at 06:35

userPyGeo

Reputation 3,631; badges 4, 14, 24
