Questions tagged [vaex]

Vaex is a Python library for lazy out-of-core DataFrames (similar to Pandas), used to visualize and explore big tabular datasets. It can calculate statistics such as mean, sum, count, and standard deviation on an N-dimensional grid for more than a billion (10^9) objects/rows per second. Visualization is done using histograms, density plots, and 3D volume rendering, allowing interactive exploration of big data. Vaex uses memory mapping, a zero-memory-copy policy, and lazy computations for best performance (no memory wasted).
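
To illustrate the memory-mapping idea mentioned in the description, here is a minimal sketch using only the Python standard library. It is not Vaex's implementation or API; the helper names write_column and mapped_mean, and the raw float64 file layout, are assumptions made purely for illustration.

```python
import array
import mmap
import os
import tempfile


def write_column(path, values):
    """Write values to `path` as raw native-endian float64 (hypothetical layout)."""
    with open(path, "wb") as f:
        array.array("d", values).tofile(f)


def mapped_mean(path):
    """Compute the mean of a raw float64 file through a read-only memory map.

    The file contents are viewed in place via the mapping; no copy of the
    whole column is built as a Python list.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Both views must be released before the map can be closed,
            # which the nested context managers take care of.
            with memoryview(mm) as raw, raw.cast("d") as view:
                return sum(view) / len(view)


# Small demonstration on a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "column.bin")
    write_column(demo_path, [1.0, 2.0, 3.0, 4.0])
    result = mapped_mean(demo_path)

print(result)
```

The operating system pages data in on demand, which is the general reason memory-mapped columns can be larger than RAM; how Vaex itself organizes its files and computations is not shown here.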

181 questions

0 answers

Dump SQL table to FILE and applying a custom function?

I have a situation where writing a PL/pgSQL function solution is too slow and cumbersome, and probably impossible because I need many Python modules. That's why I want to opt for VAEX or DASK. The plan: dump the SQL table to a file, then apply…

asked Jun 11 '21 at 04:25

sten

1 answer

ValueError: operand '!=' not supported for string comparison

I want to compare a value with a string. I did df = df[df.s1 != 'NON eq'] and got this error: ValueError: operand '!=' not supported for string comparison

python vaex

asked Jun 04 '21 at 14:50

biwia

1 answer

vaex extract one column of str.split()

I want nearly the same as answered here for pandas, but want to run it in vaex. As vaex copies lazily, it would be okay for me to save (my two) columns of str.split into the vaex df. But there is nothing like expand=True.

python vaex

asked May 18 '21 at 10:00

Bastian Ebeling

0 answers

Clustering millions of large binary vectors?

I want to generate millions of large binary vectors (10_000 ... 100_000 bits). Then I want to cluster them by OVERLAP (AND). After that I want to reorder the vectors according to the clustering and save the result for later. SciPy has a clustering method…

python vector binary hierarchical-clustering vaex

asked May 05 '21 at 16:26

sten

1 answer

In Python Vaex library how can I replace values of columns with allowed custom values of that columns

I have a dictionary whose keys are column names and whose values are lists of the allowed values for those columns. How can I replace values that do not occur in the dictionary's list with '0'? FinalCat_ is the list of column names; CombinedCat is Vaex…

python vaex

asked Apr 28 '21 at 10:13

Nishant Chandel

1 answer

vaex: How to limit number of cores/threads/processes?

How can one limit the number of cores/threads/processes that are being used by vaex? Some operations have a boolean parallel switch, but I don't see a way to have more fine-grained control (which is important on larger shared servers). Code snippet…

python-3.x bigdata vaex

asked Apr 15 '21 at 11:32

kuropan

1 answer

Can't open HDF5 file bigger than memory... ValueError

I have many NYC taxi .csv files from nyc.gov, one .csv per year-month. I take about 15 of these CSVs and make HDF5 files from them: import h5py import pandas as pd import os import glob import numpy as np import vaex from tqdm import tqdm_notebook as…

python-3.x pandas bigdata hdf5 vaex

asked Apr 10 '21 at 13:12

314mip

0 answers

Displaying full integers instead of scientific notation when printing out Vaex HDF5 data

my code: myfile = vaex.open('myfile.hdf5') myfile['customer_id'] output: Length: 4,259,376 dtype: int64 (column) 0 9.4618e+08 1 9.43324e+08 2 9.43325e+08 3 9.43333e+08 4 9.43333e+08 ... How can I change the output format…

python output vaex

asked Mar 29 '21 at 19:56

SophieLD

1 answer

Can featuretools be used on a vaex dataframe?

I'm trying to play with automated feature engineering. I've got it to work on raw dataframes, but I'm not sure how to do it on out-of-memory dataframes such as vaex. My purpose is to find a way to use automated feature engineering when data frame…

python pandas featuretools vaex

asked Mar 18 '21 at 17:43

Lostsoul

2 answers

convert csv to hdf5 by using vaex.from_csv Error: 'DataFrameArrays' object has no attribute 'dtype'

I have a csv file with more than 13 million rows that I want to convert to hdf5. I can run this code: df_chunk = vx.from_csv(r'df.csv', nrows=20_000_000) but if I run the following code: df_chunk.export(r'df.hdf5') I get an error: AttributeError:…

python hdf5 vaex

asked Mar 13 '21 at 14:05

SophieLD

1 answer

Vaex Dataframe and Expression: Filter every nth row (Python)

I have some pretty big HDF files (10e9 rows, about 100Gb) containing [X,Y,Z,Sensor_0,...,Sensor_n] values. For processing I am using vaex, which gives me nice and fast results. However, I am struggling with the following issue: I haven't found a way…

python dataframe vaex

asked Mar 08 '21 at 06:48

AM_Guy

1 answer

Can we load .txt files to vaex?

I have a folder of .txt files totaling 52.6 GB. The .txt files are located in various subfolders. Each subfolder has a unique label: "F", "G", etc. Each subfolder contains many .txt files. I need to combine all the .txt files of each unique…

bigdata vaex

asked Feb 22 '21 at 16:23

shadow kh

2 answers

Extract and combine data from 3 large tsv/csv files

I have 3 big tsv files with the following structure: file1: id,f1,f2,name,f3 file2: id,f4,blah1,f5 file3: id,f5,f6,blah2 I want to create a file that is extracted from the others: result: id,name,blah1,blah2 Currently I can't because…

python pandas csv vaex

asked Feb 17 '21 at 18:42

sten

1 answer

Vaex unable to open hdf5 created by pandas

I am getting this error: OSError: Could not open file: test/pd.hdf5, did you install vaex-hdf5? Is the format supported? Yes, I have installed vaex-hdf5. Here is a screenshot of the hdf5 I am attempting to open in vaex, opened in pandas: Any help is…

python hdf5 vaex

asked Jan 06 '21 at 23:58

Hairy

1 answer

ModuleNotFoundError: No module named 'vaex.remote'

I was trying to install the vaex application from Anaconda Navigator, but it fails to launch with an error: ModuleNotFoundError: No module named 'vaex.remote'. Everything is installed, and I even reinstalled everything, with no better results: ~$…

python anaconda vaex

asked Dec 24 '20 at 15:51

mrgou
