0

I wanted to confirm the approach that I am implementing to model my ERP database into Elastic search. That means I am indexing my source ERP data to the destination Elastic search. The source ERP database has some main entities like products, invoices, orders, etc. As Elastic search encourages data demoralization patterns, the source ERP database data is already stored in de-normalize pattern. Let me share an example of its main flow i.e. order transaction process. In the order transaction process, there are mainly 2 Mongodb collections involved namely “orders” and “invoices”.

MongoDB Orders collection has denormalize/nested data

Apart from an order related attributes this collection’s object has some additional de-normalize/nested information which is,

  • Users information
  • Shipment information
  • Order products items

MongoDB Invoices collection has denormalized/nested data

Apart from an invoice related attributes, this collection’s object has some additional de-normalize/nested information which is,

  • Invoiced products items
  • Invoice promotions

Looking at the current workflow for an order transaction process we know that “an order has multiple invoices and for each invoice, there are multiple products” and for all this process “ORDER ID “reference connects both the MongoDB collections which are “orders” and “invoices”.

My ES index design

ES design index design image

For this ERP transaction process, I intend to design an ES global index on the basis of a common ORDER ID. The idea is to move all related de-normalize data into the ES search index called “order_transaction”. So that we can query by any reference like,

  • Can query from the invoices date range.
  • Can query from the order date range.
  • Can query from any customer details.
  • Can query from any product information.
  • Can query from any invoice attribute

ES order_transaction index example

{
                    "invoiceDatesKeys": [
                        {
                            "invoiceCreatedAt": "2020-05-16T22:57:47.641Z"
                        }
                    ],
                    "invoiceDetials": [
                        {
                            "_id": 164,
                            "createdAt": "2020-05-16T22:57:47.641Z",
                            "invoice_items": [
                                {
                                    "_id": "5ec06feb2332475fdc7f5654",
                                    "claimed_qty": 0,
                                    "class_code": "simple",
                                    "combined_tax_rate": 0,
                                    "discount": 0,
                                    "name": "AABC Client",
                                    "ordered": 2,
                                    "picked": 0,
                                    "price": 10,
                                    "product": 1202,
                                    "qty": 2,
                                    "return_qty": 0,
                                    "row_order_discount": 0,
                                    "row_total": 20,
                                    "scrap_qty": 0,
                                    "shipped": 0,
                                    "shipping_fee": 0,
                                    "shipping_fee_tax": 0,
                                    "sku": " XYZ123",
                                    "tax_category": 0,
                                    "tax_price": 0
                                }
                            ],
                            "shipping_amount": 0,
                            "status": "pending",
                            "subtotal_price": 20,
                            "total_paid": 20,
                            "total_tax_price": 0
                        }
                    ],
                    "invoiceIDsKeys": [
                        {
                            "invoiceID": 164
                        }
                    ],
                    "productsKeys": [
                        {
                            "prodID": 1202
                        }
                    ],
                    "skuKeys": [
                        {
                            "sku": "XYZ123"
                        }
                    ],
                    "orderDetails": {
                        "createdAt": "2020-05-16T22:57:47.634Z",
                        "coupon_code": "",
                        "orderNumber": "AG-10304",
                        "orderID": 204,
                        "is_gift_order": false,
                        "shipping_address": {
                            "country_code": "US",
                            "city": "city abc",
                            "street": [
                                “ abc 3rd st ",
                                ""
                            ],
                            "postcode": "91111",
                            "last_name": "last name",
                            "telephone": "0123456789",
                            "state": "WA",
                            "first_name": "john"
                        },
                        "customer": {
                            "last_name": "ABC",
                            "first_name": "XYZ",
                            "email": "abc@gmail.com",
                            "group": 1
                        },
                        "order_items": [
                            {
                                "unit_handling_tax": 0
                            },
                            {
                                "cost": 0,
                                "unit_handling_tax": 0
                            }
                        ]
                    }
                }
        

So as a concluding question I just want to confirm my ES index-creating approach for an e-commerce reporting system solution that is mentioned above is correct? OR is there any standard to design an ERP data model in ES ???

1 Answers1

0

some general ideas on how to approach it in Elasticsearch land would be;

  1. treat the order and the invoice (and other related transactions) as separate documents in Elasticsearch and "link" them together based on the order number being in each one. updating your original order document as you add invoices details is a relatively expensive exercise for Elasticsearch, as an update is actually a delete and a reindex
  2. each of these document types would contain all the info of the customer. might seem like duplication (which, yeah, it is) but it means that you don't need to do post query process to merge this from other sources like another index
  3. it may make sense to add all the details of the order, eg the product info, into your other documents for the same reason as the point above
  4. assuming you take this approach, normalise and flatten your document structure. eg have '"document_type": "order"or whatever it needs to be. get rid of nested/object structures likeinvoiceDatesKeysand then haveCreatedAt` (and whatever other timestamp fields you need) as a normal field. this will save you mapping and querying complexity
  5. 2 and 3 and 4 make aggregations a tonne easier
warkolm
  • 1,933
  • 1
  • 4
  • 12