My scenario is as follows:
- I'm using the Bing News API, which returns a list of objects like the following:
{
  "name": "Eterna Resenha contará com as participações de Neto e Vampeta",
  "url": "https://www.terra.com.br/esportes/lance/eterna-resenha-contara-com-as-participacoes-de-neto-e-vampeta,82e493e511734febfcdfda6fbd22c105xjafr9k2.html",
  "image": {
    "contentUrl": "http://p2.trrsf.com/image/fget/cf/800/450/middle/images.terra.com/2020/05/27/5ece8e302d1fb.jpeg",
    "thumbnail": {
      "contentUrl": "https://www.bing.com/th?id=ON.4E1CF6986982B70A3D6009F435822EF2&pid=News",
      "width": 700,
      "height": 393
    }
  },
  "description": "Durante a quarentena, as lives tomaram conta do país, tentando arrecadar doações para ajudar quem sofre com o coronavírus...",
  "provider": [
    {
      "_type": "Organization",
      "name": "Terra"
    }
  ],
  "datePublished": "2020-05-28T00:00:00.0000000Z",
  "category": "Entertainment"
}
- Note that there is no id field in this object, so I improvised one: I convert the datePublished field to a Date, call getTime to get a numeric timestamp, and prepend the news language, as follows:
const time = new Date(news.datePublished).getTime()
const id = `${language}${time}`
await database.collection(`news`).doc(`${id}`).set(news, { merge: true })
- This approach breaks down when the Bing API returns the same news item with an updated date: the id changes, so the object is duplicated in my Firestore database.
The solution I plan to use
Transform the news url into a hash using the SHA-1 algorithm, as follows:
const CryptoJS = require("crypto-js");
const id = `${CryptoJS.SHA1(news.url)}`
await database.collection(`news`).doc(`${id}`).set(news, { merge: true })
The Firestore best practices guide for document creation leaves room for ids in this format. My main concern is performance with such a long id (e.g. d40e5b8df6462e138fe617a84ddabae7f78360a6), since I will have thousands of news items in at least 5 languages.
Remember: I need traceable ids (derived from some property of the object), because the same news item can be returned by Bing News with identical content but a different datePublished, and in that case I need to update the existing document.
Are there any counterpoints that should make me choose another solution?