Questions tagged [apache-arrow]

Apache Arrow™ enables execution engines to take advantage of the latest SIMD (Single Instruction, Multiple Data) operations included in modern processors, for native vectorized optimization of analytical data processing.

Arrow memory format supports zero-copy reads for lightning-fast data access without serialization overhead.
Columnar layout of data also allows for better use of CPU caches by placing all data relevant to a column operation in as compact a format as possible.
Arrow acts as a new high-performance interface between various systems. It is also focused on supporting a wide variety of industry-standard programming languages. Implementations in Java, C, C++, and Python are underway, and more languages are expected soon.
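
The columnar layout and zero-copy reads described above can be illustrated with a minimal sketch that uses only the Python standard library. It does not use the Arrow API; the `rows`, `ids`, and `prices` names and their values are made up for illustration, and `array`/`memoryview` merely stand in for Arrow's contiguous column buffers.

```python
from array import array

# Row-oriented layout: each record is a separate Python tuple (hypothetical data).
rows = [(1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0)]

# Column-oriented layout: each column lives in one contiguous typed buffer.
ids = array("q", (r[0] for r in rows))     # 64-bit signed integers
prices = array("d", (r[1] for r in rows))  # 64-bit floats

# Zero-copy read: a memoryview exposes the column's bytes without copying them.
view = memoryview(prices)
tail = view[2:]  # slicing a memoryview creates another view, not a copy

# Writing through the original buffer is visible through the view,
# which shows that both refer to the same memory.
prices[3] = 45.0

# A column operation only touches the compact buffer for that column.
total = sum(prices)
```

Here `tail[1]` reflects the update made through `prices`, because the view and the array share one buffer. Arrow applies the same idea across processes and languages through a standardized memory format.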

For installation details see this

595 questions

votes

1 answer

Struct of Arrays in flatbuffer?

Let's say I have the following flatbuffer IDL file: table Monster { mana:short = 150; inventory:[ubyte]; // Vector of scalars. } And that I want to serialize an array of 2 Monster objects in a buffer. Apparently it is possible to create the…

flatbuffers apache-arrow

asked May 17 '18 at 22:35

lezebulon

7,607
11
42
73

votes

1 answer

Construct an Arrow DoubleArray from double pointer

I have a two-dimensional double array pointer. I can cast it to u_int8_t, fetch it to mutable_data() of Arrow Pool Buffer, and construct an Arrow DoubleArray. However, when I get value from Value(), raw_values() of the array, I cannot get correct…

c++ apache-arrow

asked Dec 14 '17 at 06:38

mrz1603

-1

votes

1 answer

How to convert a buffer of data into an arrow::Table without intermediate file creation in C++?

I have a pipeline of processes that do different stuff. One of the pipes reads a file and de-compresses it into a buffer. The buffer in question contains an arrow Table. There is a component that takes this buffer and returns a table with the…

c++ apache-arrow

asked Aug 13 '23 at 14:47

mohabouje

3,867
2
14
28

-1

votes

1 answer

Create DataFrame from Object HuggingFace

I recently downloaded a dataset from HuggingFace. I've used datasets.Dataset.load_dataset() and it gives me a Dataset backed by an Apache Arrow table. So I have problems exporting the data into a DataFrame to work with pandas. The…

python-3.x pandas dataframe apache-arrow huggingface-datasets

asked Mar 30 '23 at 12:44

M.og.op.gpt

-1

votes

2 answers

How to convert list into schema in r?

The code is as below: schema = schema(`Key`=int64(), Sex = string(), `Age` = int64(), `Date of Birth` = date32(), `Institution` = string(), `Admission Date` =…

r list schema apache-arrow

asked Mar 23 '23 at 09:28

doraemon

-1

votes

1 answer

Spark dataframe creation through already distributed in-memory data sets

I am new to the Spark community. Please ignore if this question doesn't make sense. My PySpark DataFrame takes just a fraction of the time (in ms) in 'Sorting', but moving data is much more expensive (> 14 sec). Explanation: I have a huge Arrow…

apache-spark pyspark apache-arrow

asked Jun 16 '20 at 13:59

Tanveer Ahmad

-2

votes

1 answer

What's the best practice for swap apache arrow data between different processes?

I have a data API, written in Rust and running as an independent service process, that fetches streaming data. I plan to write several client processes to read the data, because the client processes have some functions based on Apache Arrow data types. I think this might be…

rust ipc shared-memory mmap apache-arrow

asked Jan 12 '23 at 03:10

Hakase

-2

votes

1 answer

Difference Between apache-arrow-flight and apache-kafka (accessing large datasets over a network)

As far as I know, both platforms support big data ingestion (streaming). What are the advantages and disadvantages of each platform?

apache-kafka dataset bigdata analytics apache-arrow

asked Jan 10 '20 at 10:01

sailfish009

2,561
1
24
31

-2

votes

3 answers

apache arrow - reading csv file

Hi all, I'm working with Apache Arrow now. When reading a CSV file with the arrow::csv::TableReader::Read function, I want to read the file as one with no header. But it reads the CSV file and treats the first row as the CSV header (field names). Are there any options…

c++ apache-arrow

asked Jan 18 '19 at 01:59

makepossible99

-4

votes

1 answer

How to create arrow array of dates using ArrayFromJSON

Basically, I want to create an array of date32 type using the nice ArrayFromJSON function, which is super handy for writing unit tests. I've tried: auto dateArray = arrow::ArrayFromJSON(arrow::date32(), R"(["2017-11-01"])"); But this doesn't work, at least…

c++ apache-arrow

asked Jan 21 '21 at 15:54

Kirill Lykov

1,293
2
22
39
