Questions tagged [python-s3fs]

Use this tag for questions related to the Python s3fs library.

Not to be confused with the separate tag for mounting an S3 bucket on a local mount point with s3fs, which has nothing to do with Python.

85 questions
0 votes
0 answers

Is it possible to list all versions of a file with the s3fs S3FileSystem Python API?

We're building a Python abstraction on top of AWS S3 and MinIO, and we're using the s3fs API to transparently talk to either. While both the boto and the minio APIs allow us to retrieve all the versions of a given file (object), we can't find a way…
Gabriele Giuseppini
0 votes
0 answers

Using s3fs with MinIO: FileNotFoundError when running s3fs locally, but works with boto3 and also with s3fs in Docker-compose setup

I'm trying to access files in a MinIO bucket using s3fs. The code works with boto3 locally and also with s3fs when using a docker-compose setup. However, when I try to run s3fs locally, I get a FileNotFoundError. Here's the working code using…
Jonas Kemper
0 votes
2 answers

s3fs FileNotFoundError

I am only able to gain limited, top-level access to my AWS S3 storage. I can see the buckets but not their contents, neither subfolders nor files. I'm running everything from inside a conda environment. I've tried accessing files in private and public…
0 votes
0 answers

s3fs with pandas: can we cache files automatically with the native implementation?

I recently migrated my workflow from AWS to a local computer. The files I need are still stored in private S3 buckets. I've been able to set up my environment variables correctly, so that all I need to do is import s3fs and then I can read files…
jeffery_the_wind
0 votes
0 answers

AWS Glue: s3fs throws FileNotFoundError

I am trying to read a Parquet file from S3 with pyarrow, using the s3fs file system, but I get NoSuchKey or FileNotFoundError. def read_parquet_pd(path): s3 = s3fs.S3FileSystem() path = path.rstrip('/') logger.info(f"Path is: {path}") …
Mradul Yd
0 votes
0 answers

Want to create S3 Filesystem in python with stable library

I want to create an S3 file system for uploading files to an S3 bucket using pyarrow's write_to_dataset function fs = s3fs.S3FileSystem() pa.parquet.write_to_dataset(table, root_path=output_folder, filesystem=fs, …
0 votes
1 answer

Copy only new objects from S3 to on-premise server

I have an S3 bucket where objects are generated from Salesforce on a daily basis. I want to copy those objects from the S3 bucket to a local Linux server. An application will run on that Linux server which will reference those objects to generate a new…
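One way to approach "copy only what is new" is to compare the remote key listing against the files already present locally. The following is a minimal sketch under that assumption, not a definitive implementation; the function name `keys_missing_locally` and the layout (each key mirrored as a relative path under a local root) are hypothetical and not taken from the question.

```python
from pathlib import Path
from typing import Iterable, List


def keys_missing_locally(remote_keys: Iterable[str], local_root: Path) -> List[str]:
    """Return the remote keys that have no matching file under local_root.

    Assumption: each key is mirrored locally as local_root / key.
    """
    missing = []
    for key in remote_keys:
        # Keys ending in "/" are directory placeholders, not real objects.
        if key.endswith("/"):
            continue
        if not (local_root / key).exists():
            missing.append(key)
    return missing
```

The key listing could come from something like `s3fs.S3FileSystem().find(...)`, and each missing key could then be fetched with the file system's `get(remote_path, local_path)`. This only detects objects that are absent locally; objects that were modified in place would need a size or timestamp comparison as well.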
0 votes
1 answer

Trouble with formatting when writing a csv to s3 with s3fs

I'm pushing a DataFrame to an S3 bucket using s3fs with the following code: s3fs = s3fs.S3FileSystem(anon=False) with s3fs.open(f"bucket-name/csv-name.csv",'w') as f: my_df.to_csv(f) The action completes successfully, but the csv has…
ire
0 votes
1 answer

Read Parquet files with Pandas from S3 bucket directory with Proxy

I would like to read an S3 directory containing multiple Parquet files with the same schema. The implemented code works outside the proxy, but when the proxy is enabled I run into the following issue. Traceback (most recent call last): …
HouKaide
0 votes
1 answer

Can I use s3fs to perform "free data transfer" between AWS EC2 and S3?

I am looking to deploy a Python Flask app on an AWS EC2 (Ubuntu 20.04) instance. The app fetches data from an S3 bucket (in the same region as the EC2 instance) and performs some data processing. I prefer using s3fs to achieve the connection to my…
mfcss
0 votes
1 answer

s3fs library cannot be imported in Python

I get this error when trying to import s3fs in Python 3.10.2 on Windows: ImportError: cannot import name 'is_valid_ipv6_endpoint_url' from 'botocore.endpoint' I found this question on GitHub that advises using pip install urllib3==1.25.10. I did it…
HuLu ViCa
0 votes
0 answers

Python3: ImportError: cannot import name 'InvalidProxiesConfigError' from 'botocore.httpsession'

My use case is that I am trying to write my DataFrame to an S3 bucket, for which I installed s3fs==2015.5.0 using pip3. Now when I run the code import s3fs def my_func(): # my logic my_func() It returns the following error: Traceback (most…
muazfaiz
0 votes
1 answer

Use boto for gzipping files instead of s3fs

import contextlib import gzip import s3fs AWS_S3 = s3fs.S3FileSystem(anon=False) # AWS env must be set up correctly source_file_path = "/tmp/your_file.txt" s3_file_path = "my-bucket/your_file.txt.gz" with contextlib.ExitStack() as stack: …
x89
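As a rough illustration of gzipping without s3fs, the compression step can be done with the standard library alone, producing an in-memory stream that an upload call can consume. This is a sketch under assumptions: the helper name `gzip_to_buffer` is hypothetical, and the upload step is only described in a comment rather than executed.

```python
import gzip
import io
import shutil


def gzip_to_buffer(source_file_path: str) -> io.BytesIO:
    """Compress a local file into an in-memory gzip stream."""
    buffer = io.BytesIO()
    with open(source_file_path, "rb") as src:
        # Closing the GzipFile flushes the gzip trailer but leaves
        # the underlying buffer open for later reading.
        with gzip.GzipFile(fileobj=buffer, mode="wb") as gz:
            shutil.copyfileobj(src, gz)
    # Rewind so that a consumer reads the stream from the beginning.
    buffer.seek(0)
    return buffer


# Assumed upload step, not executed here:
#   boto3.client("s3").upload_fileobj(buffer, "my-bucket", "your_file.txt.gz")
```

The whole compressed file is held in memory, so this suits small to medium files; very large files would be better compressed to a temporary file first.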
0 votes
2 answers

A conflicting conditional operation is currently in progress against this resource. (bucket already created)

Using s3fs, I am uploading a file to the already created s3 bucket (not deleting the bucket). On execution, the following error is thrown: [Operation Aborted]: A conflicting conditional operation is currently in progress against this…
Roxy
0 votes
0 answers

Accessing S3 bucket object in TFX pipeline with S3FS

I'm building a TFX pipeline that contains images as input from an S3 bucket. At the TF Transform component step, I'm attempting to read in a series of images with their URLs stored in TFX's SparseTensor format. I'm trying to use the S3FS Python…
John Sukup