How to access the nominal values and uncertainties in a Pandas DataFrame?

Question

I am using the uncertainties module along with pandas. At present, I can write the DataFrame to a spreadsheet with each value and its uncertainty combined in a single cell. My main objective is to write each uncertainty to a column adjacent to its value. How can I access the nominal values or the uncertainties within a DataFrame? An MWE is given below.

Present output

A	B
63.2+/-0.9	75.4+/-0.9
41.94+/-0.05	53.12+/-0.21
4.1+/-0.4	89.51+/-0.32
28.2+/-0.5	10.6+/-0.6
25.8+/-0.9	39.03+/-0.08
27.26+/-0.09	44.61+/-0.35
25.04+/-0.13	37.7+/-0.6
2.4+/-0.5	50.0+/-0.8
0.92+/-0.21	3.1+/-0.5
57.69+/-0.34	21.8+/-0.8

Desired output

A	+/-	B	+/-
63.2	0.9	75.4	0.9
41.94	0.05	53.12	0.21
4.1	0.4	89.51	0.32
28.2	0.5	10.6	0.6
25.8	0.9	39.03	0.08
27.26	0.09	44.61	0.35
25.04	0.13	37.7	0.6
2.4	0.5	50	0.8
0.92	0.21	3.1	0.5
57.69	0.34	21.8	0.8

MWE

from uncertainties import unumpy
import pandas as pd
import numpy as np


A_n = 100 * np.random.rand(10)
A_s = np.random.rand(10)

B_n = 100 * np.random.rand(10)
B_s = np.random.rand(10)

AB = pd.DataFrame({'A':unumpy.uarray(A_n, A_s), 'B': unumpy.uarray(B_n, B_s)})


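# pandas >= 1.3 takes engine options via engine_kwargs; the bare options= keyword was removed in pandas 2.0
AB_writer = pd.ExcelWriter('A.xlsx', engine='xlsxwriter', engine_kwargs={'options': {'strings_to_numbers': True}})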
AB.to_excel(AB_writer, sheet_name = 'Data', index=False, na_rep='nan')
AB_writer.close()

Update

I forgot to mention that AB is not created as shown in the MWE; it is the result of previous calculations that are not included here. I constructed AB directly only for the sake of the MWE. In short, I do not have separate access to the nominal values and uncertainties of A and B.

If you have no access to the nominal values and uncertainties of A and B, how do you process and transform these two columns? Or do you only have their combined values, e.g. `63.2+/-0.9`, as text? – SeaBean Oct 22 '21 at 19:08
I have only their combined values (after many calculations). – Tom Kurushingal Oct 22 '21 at 21:42

Paul · Answer 1 · 2023-07-29T11:56:30.390

2

None of the other answers takes into account that the OP is using the uncertainties package. According to this post and the user manual, the correct approach is to use

unumpy.nominal_values(arr)

and

unumpy.std_devs(arr)

where arr is your pandas column.

edited Jul 29 '23 at 11:56

answered Jul 29 '23 at 11:53

Paul

dsillman2000 · Answer 2 · 2021-10-22T17:17:08.313

0

Just split them into different columns:

Au = unumpy.uarray(A_n, A_s)
Bu = unumpy.uarray(B_n, B_s)
AB = pd.DataFrame({
    'A': unumpy.nominal_values(Au),
    'A+/-': unumpy.std_devs(Au),
    'B': unumpy.nominal_values(Bu),
    'B+/-': unumpy.std_devs(Bu),
})

edited Oct 22 '21 at 17:17

answered Oct 22 '21 at 17:11

dsillman2000

P. van der Laan · Answer 3 · answered Oct 22 '21 at 17:14

0

You can map the column(s) to get the outcome you're looking for. The following code maps column A; if you repeat it for other columns, take care not to assign two columns to the same key '+/-'.

AB[['A', '+/-']] = AB.A.apply(lambda x: str(x).split('+/-')).to_list()

answered Oct 22 '21 at 17:14

P. van der Laan

SeaBean · Answer 4 · 2021-10-23T19:58:44.640

0

You can use str.split() to split each column into one column of nominal values and one column of uncertainties, as follows:

# add the column labels here if you have more columns to process
# e.g. `for col in AB[['A', 'B', 'C']]:` if you want to process columns `A`, `B` and `C`
for col in AB[['A', 'B']]:     
    AB[[col, f'{col}+/-']] = AB[col].str.split(r'\+/-', expand=True)

# sort the columns to put the related columns together
AB = AB.sort_index(axis=1)

Having two columns with the same label in one DataFrame is not recommended. Here, each +/- column is named after its source column so that the two can be told apart.

We also use .sort_index() to sort the columns by name so that related columns end up next to each other.

Result:

print(AB)

       A  A+/-      B  B+/-
0   63.2   0.9   75.4   0.9
1  41.94  0.05  53.12  0.21
2    4.1   0.4  89.51  0.32
3   28.2   0.5   10.6   0.6
4   25.8   0.9  39.03  0.08
5  27.26  0.09  44.61  0.35
6  25.04  0.13   37.7   0.6
7    2.4   0.5   50.0   0.8
8   0.92  0.21    3.1   0.5
9  57.69  0.34   21.8   0.8
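
The split leaves both columns as text. A minimal sketch, assuming every cell is plain text of the form `nominal+/-uncertainty` with no exponent or parentheses, of converting the pieces to floats so that a spreadsheet receives numbers (the `astype(str)` cast is an added precaution for columns that hold objects rather than strings, and the sample frame is illustrative):

```python
import pandas as pd

# Text data in the "nominal+/-uncertainty" form shown in the question.
AB = pd.DataFrame({
    'A': ['63.2+/-0.9', '41.94+/-0.05', '4.1+/-0.4'],
    'B': ['75.4+/-0.9', '53.12+/-0.21', '89.51+/-0.32'],
})

for col in ['A', 'B']:
    # Cast to str first so the .str accessor always sees text,
    # then split on the literal '+/-' (the '+' is escaped for the regex).
    parts = AB[col].astype(str).str.split(r'\+/-', expand=True)
    AB[col] = pd.to_numeric(parts[0])          # nominal value as float
    AB[f'{col}+/-'] = pd.to_numeric(parts[1])  # uncertainty as float

# Sort the labels so each '+/-' column follows its source column.
AB = AB.sort_index(axis=1)
print(AB.dtypes)
```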

edited Oct 23 '21 at 19:58

answered Oct 22 '21 at 18:23

SeaBean

I get the error `ValueError: Columns must be same length as key` – Tom Kurushingal Oct 23 '21 at 19:07
@TomKurushingal Do some of your columns lack the `+/-` sign? If so, please loop only over the columns that contain it, as mentioned in the code comment, e.g. `for col in AB[['A', 'B']]` when only columns `A` and `B` contain the `+/-` sign. – SeaBean Oct 23 '21 at 19:18
@TomKurushingal The loop is there to automate processing all columns that contain the `+/-` sign. If you have only one or two columns, you can of course process them one at a time, e.g. `AB[['A', 'A+/-']] = AB['A'].str.split(r'\+/-', expand=True)` and `AB[['B', 'B+/-']] = AB['B'].str.split(r'\+/-', expand=True)`. – SeaBean Oct 23 '21 at 19:34
@TomKurushingal I've edited my code above to process only columns `A` and `B`, as in your sample data. If you have other columns to process, just add them to the list, e.g. use `for col in AB[['A', 'B', 'C']]:` as the first line to process columns `A`, `B` and `C`. – SeaBean Oct 23 '21 at 20:01
@TomKurushingal How does the solution work for you? Please let me know. Thanks! – SeaBean Oct 28 '21 at 09:09
