Questions tagged [alluxio]

Alluxio is an open source memory-centric distributed file system written in Java. It acts as an in-memory data caching layer between applications and data storage systems. The software is published under the Apache License.

Alluxio (formerly Tachyon) is an open source memory-speed distributed file system. It is a data layer between compute and storage, abstracting the files or objects in underlying persistent storage systems and providing a shared data access layer for compute applications. Alluxio was developed at the University of California, Berkeley AMPLab.

Alluxio can be used as a distributed shared caching service for big data analytics frameworks, so that compute applications talking to Alluxio can transparently cache frequently accessed data, especially data from remote locations, to provide in-memory I/O throughput.
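The caching idea described above can be illustrated with a minimal read-through cache sketch. This is not Alluxio's API or its eviction policy: `ReadThroughCache`, `fake_remote_store`, and the `/data/...` paths are hypothetical names made up for this example, and LRU eviction is an assumption chosen only to keep the sketch short.

```python
from collections import OrderedDict
from typing import Callable, List


class ReadThroughCache:
    """Tiny in-memory read-through cache with LRU eviction (illustration only)."""

    def __init__(self, fetch_remote: Callable[[str], bytes], capacity: int = 2) -> None:
        self._fetch_remote = fetch_remote  # stand-in for a slow remote store
        self._capacity = capacity          # maximum number of cached entries
        self._blocks: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, path: str) -> bytes:
        # Cache hit: serve from memory and mark the entry as recently used.
        if path in self._blocks:
            self.hits += 1
            self._blocks.move_to_end(path)
            return self._blocks[path]
        # Cache miss: fetch from the remote store, then keep a copy in memory.
        self.misses += 1
        data = self._fetch_remote(path)
        self._blocks[path] = data
        # Evict the least recently used entry once capacity is exceeded.
        if len(self._blocks) > self._capacity:
            self._blocks.popitem(last=False)
        return data


remote_calls: List[str] = []


def fake_remote_store(path: str) -> bytes:
    """Hypothetical remote storage; records every call so misses are visible."""
    remote_calls.append(path)
    return f"contents of {path}".encode()


cache = ReadThroughCache(fake_remote_store, capacity=2)
cache.read("/data/a")  # miss, fetched remotely
cache.read("/data/a")  # hit, served from memory
cache.read("/data/b")  # miss
cache.read("/data/c")  # miss, evicts /data/a
```

The caller only ever calls `read`, which is the sense in which the caching is transparent: repeated reads of the same path stop reaching the remote store until the entry is evicted.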

Alluxio can also simplify cloud and object storage adoption: Cloud and object storage systems use different semantics that have performance implications compared to traditional file systems. For example, when accessing data in cloud storage there is no node-level locality or cross-application caching. There are also different performance characteristics in common file system operations like directory listing (‘ls’) and ‘rename’, which often add significant overhead to analytics. Deploying Alluxio with cloud or object storage can close the semantics gap and achieve significant performance gains.

Alluxio is written in Java and hosted on GitHub.

The latest stable version:

Recommended reference sources:

90 questions
1
vote
0 answers

spark-ec2 and Tachyon hadoop version disparity

I tried to use spark-ec2 to launch an EC2 cluster with Hadoop version 2.x, so I ran: ./spark-ec2 -k spark -i ~/.ssh/spark.pem -s 1 --hadoop-major-version=2 launch my-spark-cluster Then I found out there are errors in the Tachyon setting up…
user3684014
  • 1,175
  • 12
  • 26
0
votes
1 answer

Alluxio - How to check log4j version used in Alluxio release (alluxio-2.8.1)

We are using Alluxio (alluxio-2.8.1) and are curious to understand which version of log4j is used in it. Please suggest where we can get that information.
0
votes
1 answer

How to get rid of warnings with MEM and SSD tiers

I have two tiers: MEM+SSD. The MEM layer is almost always at 90% full and sometimes the SSD tier is also full. Now this (kind of) message is sometimes spamming my log: 2022-06-14 07:11:43,607 WARN TieredBlockStore - Target tier:…
ChanChan Mao
  • 157
  • 8
0
votes
1 answer

How to check if folder is cached in Alluxio?

How can I tell if a folder is cached or not in Alluxio? And if a folder is cached, how can I uncache it?
ChanChan Mao
  • 157
  • 8
0
votes
1 answer

setting up Alluxio cluster, path points to Java 8 but Alluxio thinks it's java 7?

I am setting up an Alluxio cluster. I ran the script ./bin/alluxio-start.sh all SudoMount, but got this error: Error: Alluxio requires Java 8 or Java 11, currently Java 1.7.0_321 found. I already set JAVA_HOME and added it to $PATH to point to Java 8.…
ChanChan Mao
  • 157
  • 8
0
votes
1 answer

How can I maintain a list of constant masters and workers under conf/masters and conf/workers in a managed Scaling cluster?

I am using an AWS EMR cluster with Alluxio installed on every node. I now want to deploy Alluxio in High Availability mode. https://docs.alluxio.io/os/user/stable/en/deploy/Running-Alluxio-On-a-HA-Cluster.html#start-an-alluxio-cluster-with-ha I am…
ChanChan Mao
  • 157
  • 8
0
votes
1 answer

Is hadoop3.1.1 not supported?

When I build Alluxio with the following command: mvn -T 4C clean install -pl underfs/hdfs/ \ -Dmaven.javadoc.skip=true -DskipTests -Dlicense.skip=true \ -Dcheckstyle.skip=true -Dfindbugs.skip=true \ -Pufs-hadoop-3 -Dufs.hadoop.version=3.1.1 It…
ChanChan Mao
  • 157
  • 8
0
votes
1 answer

Mount HDFS Data on Alluxio

Is it possible to mount HDFS data on Alluxio and have Alluxio copy/persist data onto an S3 bucket? Or use Alluxio to copy data between HDFS and S3 (without storing data in the Alluxio cache)?
ChanChan Mao
  • 157
  • 8
0
votes
1 answer

Does Alluxio on Kubernetes (EKS) support workers only as a DaemonSet? Or can we deploy workers as a Deployment and specify the number of workers?

We are running Alluxio on our apps EKS cluster. The cluster deployment creates worker pods on each EKS node because the worker deployment kind is DaemonSet. Thus worker pods are consuming resources on all EKS nodes. We want to limit the worker pods to…
0
votes
1 answer

Is there any way I can avoid specifying the port in the alluxio fs uri?

If I have a domain name for my Alluxio master, is there any way I can avoid specifying the port in the Alluxio fs URI? Like instead of alluxio://<hostname>:<port>/ just alluxio://<hostname>/ Also if I have a…
ChanChan Mao
  • 157
  • 8
0
votes
1 answer

How do I get Alluxio POSIX to run with version 3 libfuse?

The POSIX documentation says that libfuse version 2.9.3 is required. On my EMR 6.2.0 systems, only 2.9.2 is offered. I have removed the 2.9.2 libfuse and installed the version 3 libfuse. However, Alluxio does not mount fuse as I suppose it is still…
ChanChan Mao
  • 157
  • 8
0
votes
1 answer

Will journal be lost if there is no time to flush to the journal writer before jvm hangs?

Many journals are written asynchronously to the related journal writer through AsyncJournalWriter. If the journal is in AsyncJournalWriter.mQueue but there is no time to flush to the journal writer before jvm hangs, will the journal be lost?
ChanChan Mao
  • 157
  • 8
0
votes
1 answer

Is the s3 connection configurable with role arn alone?

Does Alluxio on Kubernetes (EKS) support an S3 connection without an AWS accessKey and secretKey? Is the S3 connection configurable with a role ARN alone? We are installing Alluxio on EKS using S3 as an underlying storage layer. Alluxio cluster is up and running…
ChanChan Mao
  • 157
  • 8
0
votes
1 answer

Set Default Block Size for Specific Path

How can I set the default block size for a specific path? I don't know which property to set in the './alluxio fsadmin pathConf' command.
ChanChan Mao
  • 157
  • 8
0
votes
1 answer

format Alluxio: No Under File System Factory found for: hdfs://nameservice1/alluxio/journal/BlockMaster

I want to deploy Alluxio on a cluster with HA. My CDH version: 3.0.0+cdh6.3.2. I built Alluxio with a specific Hadoop release version: mvn install -Phadoop-3 -Dhadoop.version=3.0.0 -DskipTests I put…