
I have a 2.7 MB Hive table, stored in Parquet format. When I use impala-shell to convert this Hive table to Kudu, the /tserver/ folder grows by around 300 MB. On closer inspection, the /tserver/wals/ folder accounts for most of this increase. This is a serious problem for me: if a 2.7 MB file generates a 300 MB WAL, I cannot realistically work with larger data. Is there a solution to this?

My Kudu version is 1.1.0 and my Impala version is 2.7.0.

David Ebbo
Zzrot

1 Answer


I have never used Kudu, but I can search on a few keywords and read some documentation.

From the Kudu configuration reference section "Unsupported flags"...

--log_preallocate_segments
Whether the WAL should preallocate the entire segment before writing to it
Default true

--log_segment_size_mb
The default segment size for log roll-overs, in MB
Default 64

--log_min_segments_to_retain
The minimum number of past log segments to keep at all times, regardless of what is required for durability. Must be at least 1.
Default 2

--log_max_segments_to_retain
The maximum number of past log segments to keep at all times for the purposes of catching up other peers.
Default 10

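It looks like the WAL alone needs at least (2+1) x 64 MB = 192 MB of disk per tablet: the two retained past segments plus, presumably, the segment currently being written. It can grow up to 10 x 64 MB = 640 MB per tablet if some tablet replicas are straggling and cannot catch up.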

On top of that, you need some temporary disk space for compaction and similar tasks.
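
To make the arithmetic above concrete, here is a minimal sketch in Python. The function `wal_footprint_mb` and its parameters are hypothetical names used only for illustration; they are not part of Kudu. It assumes the Kudu 1.1 defaults quoted above and that one extra segment is open for writing, which is my reading of the flags rather than something the documentation states outright.

```python
# Hypothetical helper (not a Kudu API): rough per-tablet WAL disk estimate.
def wal_footprint_mb(segment_size_mb, retained_segments, active_segments=1):
    """Return the estimated WAL size in MB for a single tablet.

    retained_segments: past segments kept on disk
    active_segments:   segments currently being written (assumed to be 1)
    """
    return (retained_segments + active_segments) * segment_size_mb


# Defaults quoted from the Kudu configuration reference above.
LOG_SEGMENT_SIZE_MB = 64
LOG_MIN_SEGMENTS_TO_RETAIN = 2
LOG_MAX_SEGMENTS_TO_RETAIN = 10

# Lower bound: 2 retained segments + 1 active segment.
floor_mb = wal_footprint_mb(LOG_SEGMENT_SIZE_MB, LOG_MIN_SEGMENTS_TO_RETAIN)

# Upper bound as stated in the answer: 10 segments, counted without an extra one.
ceiling_mb = wal_footprint_mb(
    LOG_SEGMENT_SIZE_MB, LOG_MAX_SEGMENTS_TO_RETAIN, active_segments=0
)

print(f"per-tablet WAL: at least {floor_mb} MB, up to {ceiling_mb} MB")
```

Multiply the result by the number of tablets hosted on the tablet server to get a rough total. The size of the source data does not enter the estimate at all.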


[Edit] These default values changed in Kudu 1.4 (released in June 2017); quoting the release notes:

The default size for Write Ahead Log (WAL) segments has been reduced from 64MB to 8MB. Additionally, in the case that all replicas of a tablet are fully up to date and data has been flushed from memory, servers will now retain only a single WAL segment rather than two.
These changes are expected to reduce the average consumption of disk space on the configured WAL disk by 16x
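
The 16x figure can be checked with the same kind of back-of-the-envelope arithmetic. This is only a sketch: the variable names are mine, and it counts just the retained segments mentioned in the release notes (two before, one after), for a tablet that is fully up to date and flushed.

```python
# Segment size (MB) and retained segment count, before and after Kudu 1.4,
# as described in the release notes quoted above.
old_segment_mb, old_retained = 64, 2
new_segment_mb, new_retained = 8, 1

old_wal_mb = old_segment_mb * old_retained  # 64 * 2
new_wal_mb = new_segment_mb * new_retained  # 8 * 1

# Smaller segments give 8x, retaining one segment instead of two gives 2x.
reduction = old_wal_mb // new_wal_mb

print(f"{old_wal_mb} MB -> {new_wal_mb} MB, a {reduction}x reduction")
```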

Samson Scharfrichter