Questions tagged [py-datatable]

Use this tag for questions related to the `datatable` Python library. Consider tagging your questions with [python] as well. Do not use this tag to ask questions about generic "tables of data".

Datatable is a Python library for manipulating two-dimensional data tables (called Frames). It is similar in spirit to Python's pandas and R's data.table.

108 questions
1 vote, 1 answer

Recommendation on selection of required fields using f expressions in pydatatable dataframe

I have created a datatable frame as follows: DT_EX = dt.Frame({'sales':[103.07, 47.28, 162.15, 84.47, 44.97, 46.97, 34.99, 9.99, 29.99, 64.98], 'quantity':[6, 2, 8, 3, 3, 3, 1, 1, 1, 2], …
myamulla_ciencia
1 vote, 1 answer

How to create a column and fill in values based on a condition (ifelse) in a pydatatable data frame?

I have created a datatable frame as follows: `DT_EX = dt.Frame({'income':[1000,2000,3000,2500,5000]})`. Here I would like to add a new column (profit_or_loss) to it based on a specific condition: if the income is greater than 2500, a value 'Profit' should…
myamulla_ciencia
1 vote, 1 answer

Datatable installation from github failing to find the version

I have uninstalled and re-installed the latest version of datatable from the repo: 16:42:49/seirdc2.March8.in $ sudo pip3 install 'datatable==0.10.1' Successfully installed datatable-0.10.1. Let's see the version: import datatable as…
WestCoastProjects
1 vote, 1 answer

How to select columns based on their data types in pydatatable?

I'm creating a datatable as follows: `spotify_songs_dt = dt.fread('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-21/spotify_songs.csv')` and its column types are: `spotify_songs_dt.stypes`. Here I would like to…
myamulla_ciencia
1 vote, 1 answer

How to round off floating values in pydatatable?

I'm carrying out some mathematical operations on datatable fields as follows. Sample DT: py_DT = dt.Frame({'junction' : ['BroadwayCycleTrack-N','BroadwayCycleTrack-N', 'Burke Gilman Trail','Burke Gilman Trail','Elliot Bay','Elliot…
myamulla_ciencia
1 vote, 2 answers

How to modify/update column values on a condition in Pydatatable?

In pydatatable, I'm trying to modify a column's values by specifying a condition in i, i.e. DT[i=="text", j="some"]. Sample DT: `py_DT = dt.Frame({'crossing':['ABC','A','B','B','A','A','ABC'], 'total' :[2,4,5,6,8,10,12]})`. Here I would like…
myamulla_ciencia
1 vote, 3 answers

How to join multiple tab files using Python

I have multiple tab files with the same name in different folders, like this: F:/RNASEQ2019/ballgown/abundance_est/RBRN02.sorted.bam\t_data.ctab F:/RNASEQ2019/ballgown/abundance_est/RBRN151.sorted.bam\t_data.ctab Each file has 5-6 common columns and I…
jit c
1 vote, 1 answer

How to set a key on a dataframe column in pydatatable?

I'm practicing how to perform a join operation on pydatatable's dataframes. The first DT is created as follows: `DT_1=dt.Frame({"title": np.array(['stat','math','stat','math','esp']), "score": np.array([23,43,21,50,16])})` The second DT is…
myamulla_ciencia
1 vote, 2 answers

I received an error message while trying to install datatable in Python 3.7.4 using python -m pip install datatable

When I tried to install datatable using `python -m pip install datatable` in Python 3.7.4, I received the following error message: Complete output (26 lines): Start setup.py command = `install` Find an LLVM installation Environment variable…
1 vote, 1 answer

How to use seaborn library with pydatatable?

I have started using pydatatable for one of my data analysis projects, and I have faced a few issues in making charts of pydatatable objects using the seaborn library. Does pydatatable support seaborn visualizations in its current version, 0.8? I have…
myamulla_ciencia
0 votes, 0 answers

Error installing py-datatable 1.0.0 on Windows 10

I'm running Python 3.10.9 in a conda environment with the Windows 10 OS. I'm trying to install the py-datatable package using pip, but I'm getting an error message. The pip install command successfully downloads the tarball from an internal mirror…
William Chiu
0 votes, 0 answers

Is py-datatable compatible with Dask?

Does Python datatable work with distributed big data frameworks like Dask? I have plenty of data.table experience in R, but not in Python. For a person familiar with the tools this might sound like a stupid question; apologies for that.
Jylpah
0 votes, 0 answers

Reading Google Storage files directly in Python using datatable's fread

I'm using JupyterLab and a Python 3 ipykernel. In pandas this is very simple: `df = pd.read_csv("gs://bucket/folder/file.csv")`. However, in datatable I can't find a solution: `DT = dt.fread("gs://bucket/folder/file.csv")` ValueError: File…
0 votes, 2 answers

What is the Python equivalent to R data.table column creation?

I'm in the process of converting my R scripts to Python. Is there a way of creating new columns similar to what R's data.table does in the j step? Below is my example code in R: dat[,Returned_on_time:=…
Zachqwerty
0 votes, 0 answers

How to read xlsx files using fread from pydatatable?

I have thousands of Excel files with .xlsx extensions, and I'm trying to import them using fread from pydatatable: `fread('sample.xlsx')`. I have also installed the xlrd library in my environment; however, it gives an error as below. 24 import…
myamulla_ciencia