Questions tagged [py-datatable]

Use this tag for questions related to the `datatable` Python library. Consider tagging your questions with [python] as well. Do not use this tag to ask questions about generic "tables of data".

`datatable` is a Python library for manipulating two-dimensional data tables (called Frames). It is similar in spirit to pandas in Python and data.table in R.

108 questions

1 answer

How to aggregate columns of type `dict`

I have a Frame as follows: `x = dt.Frame(k = [1, 1, 2], v = [{'a':1, 'b':2}, {'a':3}, {'b':4}])` which looks like this:

       k  v
    ▪▪▪▪  ▪▪▪▪▪▪▪▪
       1  {'a': 1, 'b': 2}
       1  {'a': 3}
       2  {'b': 4}

What I'm trying to do is to…

python datatable py-datatable

asked Sep 04 '20 at 19:36

R. Zhu

1 answer

How to combine (merge) two datatable Frames in Python

Given two datatable Frames, how can I combine (merge) them into one Frame?

    dt_f_A =
    +--------+--------+--------+-----+--------+
    | A_at_1 | A_at_2 | A_at_3 | ... | A_at_m |
    +--------+--------+--------+-----+--------+
    | v_1    |        |        |     | …

python dataframe concatenation py-datatable

asked Aug 18 '20 at 17:34

ibra

4 answers

How to correctly convert a datatable Frame of integers (from the Python datatable library) to a pandas DataFrame

I am using Python datatable (https://github.com/h2oai/datatable) to read a csv file that contains only integer values. After that I convert the Frame to a pandas DataFrame. During the conversion, the columns that contain only 0/1 are considered as…

python pandas dataframe csv py-datatable

asked Jul 20 '20 at 13:14

ibra

1 answer

How to lump together factor levels of a string type column into another in pydatatable?

I have a datatable Frame as follows: DT_X = dt.Frame({'variety': ['Caturra', 'Bourbon', 'Typica', 'Catuai', 'Hawaiian Kona', 'Yellow Bourbon', 'Mundo Novo', 'Catimor', 'SL14', 'SL28', 'Pacas', 'Gesha', 'Pacamara', 'SL34', 'Arusha', …

python py-datatable

asked Jul 09 '20 at 16:01

myamulla_ciencia

2 answers

How to deselect pydatatable columns based on their types?

I have created a datatable Frame as follows: DT_X = dt.Frame({'x':[1,2,3,4,5], 'y':[0.1,0.5,0.9,1.5,4.3], 'z':['a','b','c','d','e'], 'u':[True,False,True,False,False], 'v':[10,20,30,40,50], …

python py-datatable

asked Jun 19 '20 at 13:08

myamulla_ciencia

1 answer

How to find and mark duplicates in a python datatable

I would like to identify the duplicated rows in a py-datatable Frame by group (and create a helper column C with a bool). It should work along the lines of this: DT = dt.Frame(A=[1, 2, 1, 2, 2, 1], B=list("XXYYYY")) I get -> TypeError: Expected a Frame,…

python py-datatable

asked Jun 15 '20 at 16:14

Zappageck

1 answer

Converting string column to date format in datatable frame in python

A simple example:

    import datatable as dt
    import pandas as pd
    from datetime import datetime
    d_t = dt.Frame(pd.DataFrame({"Date": ["04/05/2020", "04/06/2020"]}))

There is only one column, named Date, with two values of type str32. How could I…

python datetime python-datetime date-manipulation py-datatable

asked May 17 '20 at 12:06

Denny Chen
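
For the string-to-date question above, one library-independent route is to parse the values with the standard library before or after they go into a Frame. The sketch below is not a datatable API. It assumes the strings are in month/day/year order (the excerpt does not say which), and `parse_dates` is a hypothetical helper name introduced only for illustration.

```python
from datetime import date, datetime


def parse_dates(values, fmt="%m/%d/%Y"):
    """Parse date strings into datetime.date objects.

    The default format is an assumption (month/day/year);
    pass "%d/%m/%Y" if the data is day/month/year instead.
    """
    return [datetime.strptime(v, fmt).date() for v in values]


# The two values from the question's example column
dates = parse_dates(["04/05/2020", "04/06/2020"])
print(dates)
```

If the column is day-first, only the `fmt` argument changes; `strptime` raises `ValueError` on any string that does not match, which helps surface mixed formats early.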

3 answers

How to find unique values by group in datatable Frame

I have created a datatable frame as follows, DT_EX = dt.Frame({'cid':[1,2,1,2,3,2,4,2,4,5], 'cust_life_cycle':['Lead','Active','Lead','Active','Inactive','Lead','Active','Lead','Inactive','Lead']}) Here I have three unique…

python py-datatable

asked May 03 '20 at 16:45

myamulla_ciencia

0 answers

Apply an aggregate function to a python datatable column after group by

Is it possible to "apply" a user function to a python datatable after groupby? For example:

    import datatable as dt
    from datatable import f, by, sum
    df = dt.Frame(SYM=['A','A','A','B','B'], xval=[1.1,1.2,2.3,2.4,2.5])
    print(df[:, sum(f.xval),…

python pandas-groupby apply py-datatable

asked Mar 07 '20 at 03:10

balaks

1 answer

Apply aggregate function to a datatable column and return value, not datatable

Perhaps a dumb question but.. In R data.table, if I want to get the mean of a column, I can reference a column vector like foo$x and calculate its mean with something like mean(foo$x). I can't figure out how to do this operation with Python…

python py-datatable

asked Oct 12 '19 at 16:39

Ben

1 answer

Python data.table row filter by regex

What is the data.table for python equivalent of %like%? Short example: dt_foo_bar = dt.Frame({"n": [1, 3], "s": ["foo", "bar"]}) dt_foo_bar[re.match("foo",f.s),:] #works to filter by "foo" I had expected something like this to…

python py-datatable

asked Feb 10 '19 at 21:29

Jed Gore

3 answers

Analyse huge csv file in R/Python and sampling X% according to the distribution of the file?

I have a large csv file (6 GB) and I want to sample 20% of it. This 20% should have the same distribution as the original file. For example, take Kaggle's data: https://www.kaggle.com/c/avazu-ctr-prediction/data I thought about chunks but how…

python r dataframe py-datatable

asked Aug 20 '18 at 11:37

SteveS
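
One way to read "same distribution" in the sampling question above is per-category proportions, which a streaming stratified sample can preserve without loading the file. The sketch below uses only the standard library. It keeps every fifth row of each stratum (a systematic sample, not a random one, so it can be biased if rows follow a periodic order), and the column names `click` and `site` plus the helper `stratified_sample` are hypothetical stand-ins.

```python
import csv
import io
from collections import Counter


def stratified_sample(rows, key, percent=20):
    """Yield about `percent`% of the rows of each stratum in one pass.

    Memory use is one counter per distinct value of `key`.
    """
    seen = Counter()
    for row in rows:
        stratum = row[key]
        seen[stratum] += 1
        n = seen[stratum]
        # Keep the row whenever the integer quota for its stratum goes up.
        if (n * percent) // 100 > ((n - 1) * percent) // 100:
            yield row


# Tiny in-memory stand-in for the large file (hypothetical columns)
csv_text = "click,site\n" + "".join(f"{i % 2},s{i}\n" for i in range(100))
reader = csv.DictReader(io.StringIO(csv_text))
sample = list(stratified_sample(reader, key="click"))
print(len(sample), Counter(r["click"] for r in sample))
```

For a real file, `io.StringIO(csv_text)` would be replaced by an open file handle; the generator still reads one row at a time.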

1 answer

How to roll up duplicate observation in pydatatable?

I have a data frame as follows: my_dt = dt.Frame({'last_name':['mallesh','bhavik','jagarini','mallesh','jagarini'], 'first_name':['yamulla','vemulla','yegurla','yamulla','yegurla'], …

python py-datatable

asked Aug 17 '22 at 04:20

myamulla_ciencia

2 answers

Remove rows which have na values

I have the following datatable in Python:

    #    A  B          B_lag_1  B_lag_2  B_lag_3  B_lag_4
    #0   0  −0.342855  NA       NA       NA       NA
    #1   0  …

python py-datatable

asked Jun 02 '22 at 09:21

Shawn Brar

2 answers

datatable: process 2 frames

    data_df = pd.DataFrame({"AAA": [1, 2, 1, 3], "BBB": [1, 1, 2, 2], "CCC": [2, 1, 3, 1]})
    lookup_df = pd.DataFrame({"key": [1,2,3], "value" : ["Alpha", "Beta", "Charlie"]})
    data_dt =…

python python-3.x pandas dataframe py-datatable

asked Mar 26 '22 at 16:52

l a s
