Questions tagged [py-datatable]

Use this tag for questions related to the `datatable` Python library. Consider tagging your questions with [python] as well. Do not use this tag to ask questions about generic "tables of data".

datatable is a Python library for manipulating two-dimensional data tables (called Frames). It is similar in spirit to pandas in Python and data.table in R.

108 questions
2
votes
2 answers

Is there a way to print python datatable without waiting for user input at the end

I am printing a python datatable Frame. When I do that it pages and waits for my input at the end, even for very small Frames. For example, In [12]: DT = dt.Frame(A=range(5)) In [13]: DT A --- -- 0 0 1 1 2 2 3 3 4 …
1
vote
1 answer

Datatable for ARM architecture?

`import datatable as dt` throws an ImportError: ImportError: dlopen(miniforge3/lib/python3.9/site-packages/datatable/lib/_datatable.cpython-39-darwin.so, 0x0002): tried:…
1
vote
1 answer

Create many lagged variables

I have the following Python datatable: import datatable import numpy as np np.random.seed(42) dt = datatable.Frame({"A":np.repeat(np.arange(0, 2), 5), "B":np.random.normal(0, 1, 10)}) dt # A B #0 0 −0.342855 #1 …
Shawn Brar
1
vote
1 answer

Python datatable - collection in a column

Can python datatable have any collection as datatype for a column? import datatable as dt dt_with_collection = dt.Frame(A=range(5), B=[1,5,7,2,3], c=[(1,2), (3,4), (5,6), (7,8), (9,10)]) print(dt_with_collection) TypeError: Cannot create column…
l a s
1
vote
2 answers

switch column locations in python datatable

What is the most efficient way to switch the locations of two columns in python datatable? I wrote the function below that does what I want, but this may not be the best way, especially if my actual table is big. Is it possible to do this in place?…
langtang
1
vote
1 answer

py-datatable module: How to transform rows to columns?

Is there a way to transform rows in Datatable to columns in python? For example, given a datatable like the one below A 2 B 3 C 5 I want to transform it to A B C 2 3 5 and merge it with another datatable that looks like A X Y Z 2 5 0 3 So that…
l a s
1
vote
1 answer

subset datatable by column

Trying to subset a datatable a couple of different ways: DT1 = dt.Frame(A=range(5)) DT1[f.A > 2] ## select rows where A greater than 2 DT1[DT1['A'] > 2] ## select rows where A greater than 2 DT1[DT1['A'] in 2] ## select rows where A equal to…
Rafael
1
vote
3 answers

Python datatable/pandas reshaping problem

I need to reshape my df. This is my input df: import pandas as pd import datatable as dt DF_in = dt.Frame(name=['name1', 'name1', 'name1', 'name1', 'name2', 'name2', 'name2', 'name2'], date=['2021-01-01', '2021-01-02', '2021-01-03',…
peter
1
vote
0 answers

Out-of-memory operations in Python datatable package: How to do it?

The Python datatable documentation page states that the package supports out-of-memory datasets. I could not find examples of such operations, so I am looking for some. Thank you.
GitHunter0
1
vote
1 answer

How to select row by key column in O(1)? (Python Datatable)

How can I get the row by key value in O(1)? The only way I found in the docs to select rows is the row selector, which does not seem to take advantage of the keyed status of the column. For example in this table: size = 10**4 DT =…
user6110729
1
vote
1 answer

Python datatable: sum, groupby, column < 0

Hi, I am struggling to translate some R code into Python code. This is my R code: df_sum <- df[, .( Inflow = sum(subset(Amount, Amount>0)), Outflow = sum(subset(Amount, Amount<0)), Net = sum(Amount) ), by = Account] This is my Python…
peter
1
vote
1 answer

Replace all 'NA' with 0 in complete DT (Python Datatable)

Hi, I am working with the Python datatable package and need to replace all the 'NA' after joining two DTs. Sample data: DT = data.table(x=rep(c("b","a","c"),each=3), y=c(1,3,6), v=1:9) X = data.table(x=c("c","b"), v=8:7, foo=c(4,2)) X[DT,…
peter
1
vote
2 answers

How to apply aggregations (sum, mean, max, min, etc.) across columns in pydatatable?

I have a datatable as, DT_X = dt.Frame({ 'issue':['cs-1','cs-2','cs-3','cs-1','cs-3','cs-2'], 'speech':[1,1,1,0,1,1], 'narrative':[1,0,1,1,1,0], 'thought':[0,1,1,0,1,1] }) it can be viewed as, Out[5]: |…
myamulla_ciencia
1
vote
1 answer

How to extend the columns of a pydatatable with a dictionary containing the values in a list?

I have created a sample datatable as, DT_EX = dt.Frame({'recency': ['current','savings','fixex','current','savings','fixed','savings','current'], 'amount': [4200,2300,1500,8000,1200,6500,4500,9010], 'no_of_pl':…
myamulla_ciencia
1
vote
1 answer

How to filter NA values of a group of columns in pydatatable?

I have created a datatable with 3 different groups of observations as, DT_EX= dt.Frame({ 'country':['a','a','a','a','b','b','c','c'], 'id':[3,3,3,3,4,4,4,4], 'shop':['dmart','dmart','dmart','dmart','amzn','amzn','amzn','amzn'], …
myamulla_ciencia