Questions tagged [py-datatable]

Use this tag for questions related to the `datatable` Python library. Consider tagging your questions with [python] as well. Do not use this tag to ask questions about generic "tables of data".

Datatable is a Python library for manipulating two-dimensional data tables (called Frames). It is similar in spirit to pandas in Python and data.table in R.

108 questions
0
votes
1 answer

Why are duplicate columns created after grouping on multiple columns in pydatatable?

I have a pydatatable Frame defined as DT = dt.Frame( A=[1, 3, 2, 1, 4, 2, 1], B=['A','B','C','A','D','B','A'], C=['myamulla','skumar','cary','myamulla','api','skumar','myamulla']) Out[7]: | A B C -- + -- -- -------- 0 | 1 A …
myamulla_ciencia
  • 1,282
  • 1
  • 8
  • 30
0
votes
1 answer

How to join two dataframes with different key column names in pydatatable?

I have a dataframe X defined as DT_X = dt.Frame({ 'date':['2020-09-01','2020-09-02','2020-09-03'], 'temp':[35.3,32.9,43.2] }) Out[4]: | date temp -- + ---------- ---- 0 | 2020-09-01 35.3 1 | 2020-09-02 32.9 2 |…
myamulla_ciencia
  • 1,282
  • 1
  • 8
  • 30
0
votes
0 answers

Test suite fails to run

I followed the instructions in https://datatable.readthedocs.io/en/latest/start/install.html to build datatable from the git repo. My tests are failing when I run step 5 in the section Install datatable in editable mode. I have attached a part of my…
PKrish
  • 1
  • 1
0
votes
1 answer

How to build Python datatable from the git repo

I have cloned the git repo from https://github.com/h2oai/datatable on my Linux machine. How do I build datatable from the cloned copy saved on my local machine? Thanks.
PKrish
  • 1
  • 1
0
votes
1 answer

py-datatable: replace empty strings in a column with NaN

In a Python datatable, I want to replace empty strings with NaN. When I try, I get the error below. It works with pandas. Thanks in advance for the help. Datatable syntax I tried: dt[:,"column_name"].replace('',np.nan) Error received: Cannot…
jeganathan velu
  • 189
  • 2
  • 12
0
votes
1 answer

Python Datatable/Pydatatable: How to filter rows in datatable by regex and assign value to new variable according to filter

I want to assign values to a new column, based on the regex match in another column in python-datatable syntax. DT[get rows by regex , assign value to new column, ] import pandas as pd import datatable as dt from datatable import f, Frame import re…
Zappageck
  • 122
  • 9
0
votes
1 answer

How to select columns created with unformatted names in pydatatable?

I have created a datatable as DT_EX = dt.Frame({'Year sold':[2000,2002,2004,2006],'Year Construction':[1990,1992,1994,1996]}) and it displays as Out[4]: | Year sold Year Construction -- + --------- ----------------- 0 | 2000 …
myamulla_ciencia
  • 1,282
  • 1
  • 8
  • 30
0
votes
2 answers

How to filter NA values in columns of pydatatable?

I have a datatable defined as DT_EX= dt.Frame({ 'country':['a','a','a','a'], 'id':[3,3,3,3], 'shop':['dmart','dmart','dmart','dmart'], 'beef':[23,None,None,None], …
myamulla_ciencia
  • 1,282
  • 1
  • 8
  • 30
0
votes
1 answer

How to define custom function to generate summary stats in pydatatable?

I'm trying to build a custom function to generate summary stats for a given field, as shown in the code snippet. def estadistica_dt_summario(dt,col,por): dt_summary=…
myamulla_ciencia
  • 1,282
  • 1
  • 8
  • 30
0
votes
1 answer

Python datatable to correlation matrix

Is there an equivalent of the pandas corr() function in Python datatable, to find the correlation matrix of the Frame columns? Thanks
SJain
  • 11
  • 1
0
votes
3 answers

Issues with pydatatable frame output in google colab

I'm doing data wrangling on a dataset using pydatatable in Google Colab notebooks. On executing code chunks, it displays two different output formats of a frame, whereas the same dataframe with pandas displays a single output. I'm attaching hera…
myamulla_ciencia
  • 1,282
  • 1
  • 8
  • 30
0
votes
1 answer

How to apply str library functions in pydatatable?

I would like to know how str module functions can be applied directly to a pydatatable Frame without explicitly converting it to a pandas DataFrame. Sample DT: DT_py = dt.Frame ( { 'ciudad':['PERTH','SYDNEY','PORT','MELBOURNE','DARWIN'], 'pop'…
myamulla_ciencia
  • 1,282
  • 1
  • 8
  • 30
0
votes
1 answer

How to type cast a dataframe column in pydatatable?

I'm trying to explore datatypes of a frame in pydatatable. Here I have a dataframe: ventas_duda_dt = dt.Frame( {"cust_id":[893232.34],"sales":['$123,4532.93'],"profit_perc":['10%']}) and its types: ventas_duda_dt.stypes and the datatypes of…
myamulla_ciencia
  • 1,282
  • 1
  • 8
  • 30
0
votes
2 answers

How to filter by date with Python's datatable

I have the following datatable, which I would like to filter by dates greater than "2019-01-01". The problem is that the dates are strings. dt_dates = dt.Frame({"days_date": ['2019-01-01','2019-01-02','2019-01-03']}) This is my best attempt.…
Alex
  • 2,603
  • 4
  • 40
  • 73
0
votes
1 answer

Is the jay file format specific to Python datatable?

I can't find information on the jay file format mentioned here. Is it a datatable-only format?
xiaodai
  • 14,889
  • 18
  • 76
  • 140