
I am investigating whether an application at work can benefit from an S3-based storage system rather than traditional NFS. So I downloaded MinIO onto another computer attached to my local LAN and wrote a quick S3 PutObject test with the AWS SDK.

I grabbed the contents of /etc/passwd as test data (about 5 KB):

#include <sys/time.h>
#include <iostream>
#include <QByteArray>
#include <QCryptographicHash>
#include <QFile>
#include <aws/s3/S3Client.h>
#include <aws/s3/model/PutObjectRequest.h>

// Test data: the contents of /etc/passwd (about 5 KB)
QFile passwdFile("/etc/passwd");
passwdFile.open(QIODevice::ReadOnly);
QByteArray ba = passwdFile.readAll();

for (int i = 0; i < 100; i++)
{
    // Derive a unique object key by hashing the current time
    struct timeval tv;
    gettimeofday(&tv, nullptr);
    QByteArray tvData = QByteArray::fromRawData(reinterpret_cast<char *>(&tv), sizeof(struct timeval));
    QByteArray filename = QCryptographicHash::hash(tvData, QCryptographicHash::Sha1).toHex();
    // Replace two hex characters with slashes so the keys fan out
    // into a directory-like hierarchy
    filename[1] = '/';
    filename[4] = '/';

    Aws::S3::Model::PutObjectRequest req;
    req.SetBucket(primaryBucket);
    req.SetKey(filename.constData());

    // Copy the test data into a stream for the request body
    const std::shared_ptr<Aws::IOStream> input_data =
            Aws::MakeShared<Aws::StringStream>("SampleAllocationTag", ba.constData());
    req.SetBody(input_data);

    auto outcome = myClient->PutObject(req);
    if (!outcome.IsSuccess())
    {
        std::cout << "PutObject error: "
                  << outcome.GetError().GetExceptionName() << " - "
                  << outcome.GetError().GetMessage() << std::endl;
        return false;
    }
}

This loop of 100 small PutObject calls takes 8 seconds to run, which seems ridiculously slow to me. Does anyone have any ideas, or am I missing something huge here? Again, I'm running MinIO on a computer on the local network (actually on the same network switch), just the default setup pointing at a directory. The AWS S3 SDK is built from source, and both machines run Fedora 31. I'm looking for something that can handle hundreds of files per second (write, read, and delete), sometimes bursting into the thousands, and this is orders of magnitude too slow out of the box.
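To narrow down where the time goes (one-time connection setup vs. steady-state per-request latency), each PutObject call can be timed individually. A minimal sketch, wrapping the call from the loop above with std::chrono and assuming the same req and myClient as before:

#include <chrono>
#include <iostream>

// Time a single PutObject; if the first call is much slower than the
// rest, the cost is connection setup rather than per-object overhead.
auto t0 = std::chrono::steady_clock::now();
auto outcome = myClient->PutObject(req);
auto t1 = std::chrono::steady_clock::now();
std::cout << "PutObject took "
          << std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count()
          << " ms" << std::endl;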

  • There's no need to tag yourself "NOOB" or anything derogatory. Stack Overflow is a place to learn, and we're all learning. – tadman Jul 12 '20 at 21:42
  • You're going to have to profile this better to see where the slow spots are. Is it the cryptographic stuff? Is there some kind of network timeout that eventually gets resolved? Is DNS giving you trouble? Do other S3 tools work as slowly? – tadman Jul 12 '20 at 21:43
  • I've been an old-school C++ programmer for about 20 years and don't know a lot of the cloud terminology yet. I do know that since it's MinIO-based, I had to set up the connection options in a non-default manner, and when constructing that, it pauses for a good second or so. That's a good point, though; let me try 50, and if it's 4 seconds, it's the overall efficiency; if it's more, it's an initialization issue. – CRB Jul 12 '20 at 21:57
  • Yup, 50 objects is 3.9s. I was just hoping I was doing something wrong in the above code. Listing the buckets is instant, and retrieving a list of objects takes 0.03 seconds. So put/write is VERY slow. – CRB Jul 12 '20 at 21:59
  • OK, if I put each store into its own thread and create a new client connection for each one, I get 100 stores down to 3 seconds (a rough sketch of this approach follows these comments). That's more like how this would be in production anyway. Let me see if there is a quick MinIO option so I can tell whether it's MinIO or S3. – CRB Jul 13 '20 at 00:51
  • @CRB did you manage to get good performance? – Aurélien Gasser Apr 17 '21 at 20:02
  • No, we just moved on and are still using NFS. CentOS 7 fixed the NFS bug with 7.8. We will leave it up to our customers if they want to use something like that for data storage. – CRB Apr 18 '21 at 21:09
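A rough sketch of the threaded approach described in the comment above: one thread and one S3 client per store. Here clientConfig, primaryBucket, and ba are assumed to match the question's setup, and makeKey() is a hypothetical stand-in for the SHA-1 key-generation code:

#include <thread>
#include <vector>
#include <aws/s3/S3Client.h>
#include <aws/s3/model/PutObjectRequest.h>

// One store per thread, each with its own client connection.
// clientConfig and primaryBucket are assumed to match the question's
// setup; makeKey() is a hypothetical stand-in for the key code above.
std::vector<std::thread> workers;
for (int i = 0; i < 100; i++)
{
    workers.emplace_back([&, i]() {
        Aws::S3::S3Client client(clientConfig);   // fresh connection per store
        Aws::S3::Model::PutObjectRequest req;
        req.SetBucket(primaryBucket);
        req.SetKey(makeKey(i));
        auto body = Aws::MakeShared<Aws::StringStream>("tag", ba.constData());
        req.SetBody(body);
        client.PutObject(req);                    // error handling omitted
    });
}
for (auto &t : workers) t.join();

In production, a fixed-size thread pool (or the SDK's asynchronous PutObjectAsync interface) would cap the number of simultaneous connections rather than spawning 100 threads at once.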

0 Answers