
The docs say that the size of a document is composed of:

  1. The document name size
  2. The sum of the string size of each field name
  3. The sum of the size of each field value
  4. 32 additional bytes

The following document example:

  • "type": "Personal"
  • "done": false
  • "priority": 1
  • "description": "Learn Cloud Firestore"

Has the size of 147.
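
As a rough check of the 147 figure, here is a minimal Java sketch of the four rules above. The question does not state the document path, so `users/jeff/tasks/my_task_id` is a hypothetical path chosen because its name size works out to the 44 bytes that the total implies. The sketch also assumes the per-type sizes from the Firestore docs: 1 byte for a boolean, 8 bytes for an integer, and UTF-8 bytes + 1 for a string. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Sketch of the documented size rules applied to the four-field example.
final class ExampleDocSize {
    // A string costs its UTF-8 byte count plus 1.
    static int stringSize(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length + 1;
    }

    // Document name: every path segment sized as a string, plus 16 bytes.
    static int nameSize(String path) {
        int size = 16;
        for (String segment : path.split("/")) {
            size += stringSize(segment);
        }
        return size;
    }

    static int exampleTotal() {
        // Hypothetical path (assumption): its name size is 44 bytes.
        int name = nameSize("users/jeff/tasks/my_task_id");

        int type = stringSize("type") + stringSize("Personal");   // 5 + 9
        int done = stringSize("done") + 1;                        // 5 + 1 (boolean)
        int priority = stringSize("priority") + 8;                // 9 + 8 (integer)
        int description = stringSize("description")
                + stringSize("Learn Cloud Firestore");            // 12 + 22

        // Name + field names and values + 32 additional bytes.
        return name + type + done + priority + description + 32;
    }

    public static void main(String[] args) {
        System.out.println(exampleTotal()); // 147
    }
}
```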

Question:

When calculating the size of a document, is there anything else I should take into account? Perhaps some metadata? Because when I use this calculation, there is certainly something missing.

I have this class:

class Points {
    public List<GeoPoint> geoPoints;

    public Points() {}

    public Points(List<GeoPoint> geoPoints) {
        this.geoPoints = geoPoints;
    }
}

And this is how I create the list and how I write it to the database:

List<GeoPoint> geoPoints = new ArrayList<>();
for (int i = 1; i <= 40_327 ; i++) {
    geoPoints.add(new GeoPoint(11.22, 33.44));
}
DocumentReference geoRef = db.collection("points").document("geo");
geoRef.set(new Points(geoPoints)).addOnCompleteListener(new OnCompleteListener<Void>() {
    @Override
    public void onComplete(@NonNull Task<Void> task) {
        if (task.isSuccessful()) {
            Log.d("TAG", "geoPoints added successfully");
        } else {
            Log.d("TAG", task.getException().getMessage());
        }
    }
});

Edit with example:

My reference is:

db.collection("points").document("geo");
  1. (6 + 1) + (3 + 1) + 16 = 27

The field name (the array) is called geoPoints

  1. 9 + 1 = 10

I store 40,327 geopoints in that array

  1. 40,327 * 16 = 645,232

There are 32 additional bytes for each document

  1. 32

So it makes a total of:

Total: 27 + 10 + 645,232 + 32 = 645,301 bytes
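
The same arithmetic can be written as a small Java sketch so that the element count is easy to change. It only restates the steps listed above (path segments + 1 each, plus 16; field name + 1; 16 bytes per geopoint; 32 extra bytes). The class name `GeoArraySize` is hypothetical.

```java
// Sketch of the calculation above for a document at points/geo
// holding one array field called geoPoints.
final class GeoArraySize {
    static final int GEOPOINT_BYTES = 16;
    static final int DOCUMENT_OVERHEAD = 32;

    static long total(int geoPointCount) {
        // length() equals the UTF-8 byte count here because the names are ASCII.
        long name = ("points".length() + 1) + ("geo".length() + 1) + 16; // 27
        long fieldName = "geoPoints".length() + 1;                       // 10
        long values = (long) geoPointCount * GEOPOINT_BYTES;
        return name + fieldName + values + DOCUMENT_OVERHEAD;
    }

    public static void main(String[] args) {
        System.out.println(total(40_327)); // 645301
    }
}
```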

Nowhere in the docs is it specified that each element in the array counts for more than its own size:

Field value size

The following table shows the size of field values by type.

Type Size

Array The sum of the sizes of its values

Even so, if I need to add bytes for every position (for example, 1 for a one-digit number, 2 for a two-digit number, and so on, plus an additional 1 byte as in the case of Strings), I should add 230,852 to the total.

So it makes a new total of 645,301 + 230,852 = 876,153.

A list of 40,327 geopoints is the maximum that is accepted. A list of 40,328 will be rejected.

Either way, that is still less than the maximum of 1,048,576 bytes allowed.
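
The per-position hypothesis can be checked with a short Java sketch. It charges every zero-based array index like a string (its decimal digits plus 1 byte). That rule is only my guess and is not stated in the docs. The class name `IndexOverhead` is hypothetical.

```java
// Sketch: extra bytes if each array position were charged like a string key.
final class IndexOverhead {
    static long overhead(int elementCount) {
        long bytes = 0;
        for (int i = 0; i < elementCount; i++) {
            // Decimal digits of the index, plus 1 byte as for strings.
            bytes += String.valueOf(i).length() + 1;
        }
        return bytes;
    }

    public static void main(String[] args) {
        long base = 645_301L; // total from the calculation without index overhead
        System.out.println(base + overhead(40_327)); // 876153
    }
}
```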

Pathis Patel
  • I'm not sure what you're asking. I don't suspect there is anything missing from the documentation. If you observe something that contradicts what it says, you can file a bug report with Firebase support. https://support.google.com/firebase/contact/support – Doug Stevenson Apr 08 '20 at 19:04
  • @DougStevenson Thank you for that. Yes, that's why, I mentioned the other answer because I've seen that we cannot use all space that is provided in the docs. – Pathis Patel Apr 08 '20 at 19:20
  • In the question you linked the OP was attempting to store a massive amount of data in a single document, which is not generally best practice. Why are you asking this question? Can you elaborate on your use case? If you're interested in billing, with Firestore it's the number of reads/writes/deletes, not so much the quantity of data. See [Billing](https://firebase.google.com/pricing) – Jay Apr 08 '20 at 22:02
  • @Jay Thanks for trying to help. **OP was attempting to store a massive amount of data in a single document**. That is my question too. Why do you say "massive amount of data"? They say we are allowed, but that is actually **not** true. If it's so massive, why do they say in the docs that it's allowed? **interested in billing** It's not about the billing it's about the fact that they say we can store up to 1,048,576 bytes but according to my calculations this is not possible. Have you tried to store a fully 1MiB? I'm afraid you can't! – Pathis Patel Apr 08 '20 at 22:27
  • The point of Firestore is NOT to see just HOW much data you can cram into a document - it is SIMPLY NOT INTENDED TO BE USED AS MASS STORAGE. It is a semi-structure store of information documents; if your goal is mass storage USE SOMETHING ELSE. – LeadDreamer Apr 09 '20 at 01:03
  • @LeadDreamer I simply cannot understand why you guys are talking about mass storage. They say a document can hold up to 1 MiB; is this considered mass storage? If yes, why would they say that and not 100 KB? – Pathis Patel Apr 09 '20 at 07:40
  • To clarify, 'massive amount' means that for a single document, storing the maximum allowed amount 1Mb. While technically you *can* store that, it's not always wise to do so - if your data is that 'large' there may be other, better options. For example; suppose you want to store a 1Mb picture. While you *could* store it directly in a Firestore Document, you'll be way better off storing it in Firebase Storage (for a number of reasons). So I think the point here is just because you *can* do it, doesn't mean you *should* do it. If you can elaborate on the use case, we can probably provide better direction. – Jay Apr 09 '20 at 15:23
  • *Have you tried to store a fully 1MiB? I'm afraid you can't!*. We have done that, yes, and it does support 1Mb documents, however, as I mentioned there are usually other options that better fit that model. – Jay Apr 09 '20 at 15:26
  • @Jay 645,301 bytes from 1,048,576 bytes does **not** represent in my opinion a "massive amount". However, 1,020,000 bytes for example, does. I understand what you say about the picture, but it's not the case. My use-case: I need to display 50k geo points. According to a simple calculation, all those points fit in less than the max size of a document. Creating 50k documents to store only a lat and a lng, isn't a solution at all. How about, I sell you an apartment with 4 bedrooms, but you can only use 2 of them. Using all 4 bedrooms, does it mean I'm massively living in that apartment? – Pathis Patel Apr 10 '20 at 07:44
  • I think you may not be understanding the general message here. Regardless of how much data CAN be stored in a document, the issue is whether it SHOULD be. Storing 50k of anything in a single document is probably not a 'good' design pattern in a NoSQL database. *Creating 50k docs to store only a lat and a lng, isn't a solution at all* - but it actually IS a good solution and a proven design pattern that works - and that's how Firestore was designed to operate. Best practice is to denormalize data when needed, spread your data across nodes to make it readable and queryable for your use case. – Jay Apr 10 '20 at 15:32
  • Check out this posting from @DougStevenson [6. Document size limit of 1MB](https://medium.com/firebase-developers/the-top-10-things-to-know-about-firestore-when-choosing-a-database-for-your-app-a3b71b80d979) section 6. It's good info and expands on some of the pitfalls of stuffing a 'massive' amount of data into a single document. (massive being relative to what SHOULD be stored vs what COULD be stored) – Jay Apr 10 '20 at 15:34
  • @Jay In that article Doug Stevenson says: "a very popular user is going to run into the 1MB document limit, and further writes of that document will fail." This is actually **NOT** correct!!! The write operations start to fail from **645,301 bytes**. There is no way you can get over it. You also say to store 50k location in 50k documents, really? So I should pay 50k reads instead of a single one? Come on... This is absolutely not acceptable!!! I only need to display them. No need to query, or filter them. – Pathis Patel Apr 11 '20 at 11:02
  • Well, @DougStevenson is pretty much the authority on the topic as he's part of the Firebase team. I posted an answer which may provide some clarity and shows how to actually upload 1Mb of data. You can certainly store your data in whatever way works for your use case - a document can hold 1Mb of data. Hope it helps. – Jay Apr 11 '20 at 13:51

1 Answer


TL;DR

The original question did not involve Geopoints, but as the discussion went on it became clear that storing them was the ultimate goal. The issue is not that Firestore Documents can't hold 1Mb of data (they can, as clearly shown below); the actual issue is how the OP is calculating the size of the data they want to store.

A Geopoint takes 16 bytes, but the rest of the calculation has to be added in as well. Here's a summary of how to calculate the size of a document:

docNameSize = 8 //suppose it's called 'geoArray'
fieldNameSize = 5 // this is an array, so the first element name is 0 = 1 byte, 
                  // element 10 name would be two bytes
                  // up to 5 bytes for the higher numbers
geoPointSize = 16 * number of geopoints
addlSize = 32

So suppose there are 1000 geopoints

8 + (bytes depending on the field name length) + (16 * # of geopoints) + addl Size

So as you can see, the discussion is not around how much data a document will hold but about how the document size for a geopoint is calculated.
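
To put a number on the 1000-geopoint case, here is a minimal Java sketch of the summary above. It mirrors this answer's simplified model as written (8 bytes for a document called 'geoArray', the digits of each array index counted as a field name, 16 bytes per geopoint, 32 additional bytes); treat that model as an assumption rather than the official formula. The class and method names are hypothetical.

```java
// Sketch of the simplified model from the summary above.
final class AnswerModel {
    // Bytes spent on index "names": 0..9 cost 1 byte, 10..99 cost 2, and so on.
    static long indexNameBytes(int geoPointCount) {
        long bytes = 0;
        for (int i = 0; i < geoPointCount; i++) {
            bytes += String.valueOf(i).length();
        }
        return bytes;
    }

    static long size(String docName, int geoPointCount) {
        long docNameSize = docName.length();           // 8 for "geoArray"
        long geoPointSize = 16L * geoPointCount;
        long addlSize = 32;
        return docNameSize + indexNameBytes(geoPointCount) + geoPointSize + addlSize;
    }

    public static void main(String[] args) {
        System.out.println(size("geoArray", 1_000)); // 8 + 2890 + 16000 + 32 = 18930
    }
}
```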

A quick calculation:

var s = ""
for i in 0..<10000 {
    s += String(i)
}
print(s.count)

shows that if you want to store 10000 Geopoints, 38890 bytes go to field names alone.

Discussion

This answer shows how to calculate the size of a Firestore document as well as the code to demonstrate how a file (an image in this case) of size 1Mb can be uploaded to a Firestore document.

Note that this is NOT how it should be done in real world use! - images and files should be stored in Storage, not Firestore, so take this as an example case.

An additional note: storing datasets that max out the capacity of a document may hinder overall performance and negates the ability to query or sort that data server side, which puts a lot more strain on the app's resources. If there is concern about the cost per number of writes/reads, I suggest looking at the Realtime Database, as its costs are per amount of data, not per read/write.

First we start with a 1Mb jpg called Mountain


To calculate the actual amount of data being uploaded, we use the following from the Firestore documentation on storage size calculations:

The size of a document is the sum of:

  • The document name size
  • The sum of the string size of each field name
  • The sum of the size of each field value (we have only one field in
    this example)
  • 32 additional bytes

In the following code, the document name is 'mountain_image', which is 14 bytes; the field name is 'imageData', which is 9 bytes; the size of the field value is calculated at run time (shown below); and 32 bytes are added on top.

For this example, I've dragged the 1Mb image into my App bundle. Here's the (macOS) code that reads that image, converts it to an NSData type for Firestore and uploads the file.

func uploadImageToFirestore() {
    let image = NSImage(imageLiteralResourceName: "Mountain.jpeg")

    guard let asTiffData = image.tiffRepresentation else { return }
    let data = NSData(data: asTiffData)

    let imgRep = NSBitmapImageRep(data: data as Data)
    guard let jpgData = imgRep?.representation(using: NSBitmapImageRep.FileType.jpeg, properties:  [:]) else { return }

    let docNameSize = 14
    let fieldNameSize = 9
    let dataSize = jpgData.count
    let addlSize = 32

    let totalSize = docNameSize + fieldNameSize + dataSize + addlSize

    print("allowed size: \(1048487)")
    print("total size:   \(totalSize)")
    let imageCollection = self.db.collection("images")
    let thisImage = imageCollection.document("mountain_image")
    let dict:[String:Any] = ["imageData": jpgData]
    thisImage.setData(dict, completion: { error in
        if let err = error {
            print(err.localizedDescription)
            return
        }

        print("upload success")
    })
}

The console output is:

allowed size: 1048487
total size:   1040221
upload success

So as can be seen, the total size is just under the allowed size in a Firestore document.

To summarize, this code uploads a 1Mb file to a Firestore Document.

For completeness, here's the code that reads back that data object, converts back to an image and displays in the UI

func readImageFromFirestore() {
    let imageCollection = self.db.collection("images")
    imageCollection.getDocuments(completion: { snapshot, error in
        if let err = error {
            print(err.localizedDescription)
            return
        }

        guard let snap = snapshot else { return }

        for doc in snap.documents {
            // Skip documents whose imageData field is missing or not binary data.
            guard let imageData = doc.get("imageData") as? Data else { continue }
            let image = NSImage(data: imageData)
            self.myImageView.image = image
        }

    })
}

Keep in mind that the size of a text string is its number of UTF-8 encoded bytes + 1, so 'Hello' would be 6 total (5 + 1).
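
For anyone doing this calculation on Android, here is a minimal Java sketch of that string rule. It matters mostly for non-ASCII text, where the UTF-8 byte count is larger than the character count. The class name `StringSize` is hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Sketch: size of a text string as UTF-8 encoded bytes + 1.
final class StringSize {
    static int of(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length + 1;
    }

    public static void main(String[] args) {
        System.out.println(of("Hello"));       // 6: five ASCII bytes + 1
        System.out.println(of("h\u00e9llo"));  // 7: the accented e takes two bytes
    }
}
```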

EDIT:

The OP added some additional information about storing Geopoints. A Geopoint is a specific data type in Firestore and requires a single field to store a geopoint. Attempting to store multiple geopoints in a single field is not an option.

That being said, if you want to store 1Mb of geopoints, it can still be done.

Here's some math: the largest value a single field can hold is 1048487 bytes (the 1 MiB document limit minus 89 bytes), and if each geopoint uses 16 bytes, quick division shows that approximately 65530 geopoints can be stored.
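
The division can be double-checked with a couple of lines of Java. This sketch only counts 16 bytes per geopoint and ignores every other part of the size calculation. The class name `GeoPointCapacity` is hypothetical.

```java
// Sketch: how many 16-byte geopoints fit into a given byte budget.
final class GeoPointCapacity {
    static final int GEOPOINT_BYTES = 16;

    static int maxPoints(int byteBudget) {
        return byteBudget / GEOPOINT_BYTES; // integer division rounds down
    }

    static int leftover(int byteBudget) {
        return byteBudget % GEOPOINT_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(maxPoints(1_048_487)); // 65530
        System.out.println(leftover(1_048_487));  // 7 bytes left unused
    }
}
```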

So if I can upload 65530 geopoints, that shows a document can hold approximately 1Mb of data, right? Here's the code that does that.

The following code creates 65530 geopoints, converts them to a string and stores them in a single Firestore document.

func uploadGeopoints() {
    var geoArray = [GeoPoint]()
    let point = GeoPoint(latitude: 1.0, longitude: 1.0)
    for _ in 0..<65530 {
        geoArray.append(point)
    }

    let geoString = geoArray.map { String("\($0.latitude)\($0.longitude)") }

    let combinedString = geoString.joined()

    let geoCollection = self.db.collection("geoStrings")
    let thisGeoString = geoCollection.document()
    let dict:[String: Any] = ["geoString": combinedString]
    thisGeoString.setData(dict, completion: { error in
        if let err = error {
            print(err.localizedDescription)
            return
        }

        print("upload success")
    })
}
Jay
  • While this answer might help future visitors, it doesn't help me. I'm not looking to store images, I'm simply looking to store data. If you try to store geo points or any other type of data but not images and not in this way, you'll see that you cannot store more than **645,301 bytes**. You should try it yourself. Voted-up though. – Pathis Patel Apr 11 '20 at 13:59
  • @PathisPatel Images are just data. See this line `let imgRep = NSBitmapImageRep(data: data as Data)`? That takes the image and breaks it down into *bytes of data* - 1Mb (that's a million bytes) in fact. Data is data. So I am not storing an image at all, I am storing bytes of data. An image is data, a string is data, a geopoint is data. In the end everything is just bytes of data and this code does just that - stores bytes of data. – Jay Apr 11 '20 at 14:05
  • Please note that this statement - *you'll see that you cannot store more than 645,301 bytes* - is not accurate as shown in my coding example where I've clearly stored much more than that - see where it says total size: **1040221**. Perhaps there's an issue with the code you're using? Maybe if you can update the question showing your code we can spot the error. – Jay Apr 11 '20 at 14:11
  • Totally agree, "bytes of data". You are sending bytes to Firestore. This is **not** what I'm looking for. I'm looking to send geo points. It's true that a GeoPoint weighs 16 bytes but when I send them I send them as GeoPoint objects. Ok, I'll add the code so you can also try it. But it's in Java. – Pathis Patel Apr 11 '20 at 14:15
  • Please check my code. I hope you can understand Java. Try to write 40_328 list, it doesn't work :| 40_327 is the max. Hope you agree that 40_328 list of geo points is far less than 1,048,576 bytes. – Pathis Patel Apr 11 '20 at 14:25
  • @PathisPatel Your initial question doesn't mention Geopoints but the math is the same. However, there's a flaw in the logic as a Geopoint field can hold only one Geopoint. You're proposing to store multiple Geopoints in a single field so there's only one read and that's not possible if you want to use Geopoint fields. However, it can be done by converting them to bytes of data (a string for example) and storing that. – Jay Apr 11 '20 at 14:43
  • I was talking about the other question [Firestore document does NOT have 1 MiB (1,048,576 bytes)](https://stackoverflow.com/questions/61103830/firestore-document-does-not-have-1-mib-1-048-576-bytes). No, I'm proposing to store multiple Geopoints in a list, as the array is a supported data-type. I have just tested with Strings, same result. Cannot store more than 700k. – Pathis Patel Apr 11 '20 at 14:54
  • Can it be a problem only on Android clients? – Pathis Patel Apr 11 '20 at 15:09
  • @PathisPatel It's not a problem, it's how you're calculating the size of the data you're trying to store. See the top part of my updated answer. – Jay Apr 11 '20 at 15:13
  • First of all, thanks so much for taking the time to help me with that, but please see the last part of my edited question, with a concrete example. – Pathis Patel Apr 11 '20 at 15:50