
I have a problem with "dump" then "delete" of a large amount of data in MongoDB.

The DB already has indexes on the fields needed for the query.

Total data: ~50m records

Filtered data to dump and delete: ~5m records

3 servers:

-- MONGO: t2.medium

-- SIDEKIQ: t2.small

-- OTHER SERVER: t2.small (multiple instances)

I run a cronjob at the lowest-traffic time, but it takes too long to complete, ~6-8 hours, and while it is running the other server can't connect to MongoDB, so it goes into degraded status (Elastic Beanstalk with Docker).

When the server goes down, I check MongoDB with mongostat: CPU is at ~95-96%. The other server's logs say "can not connect to db".

Could someone with MongoDB experience please help me work this out?

Peter89

1 Answer


Perhaps you should try another strategy and do it in a few steps. Split this task into many small jobs and run them in the background at low priority. In your place, I would take the following steps:

1st step:
1) Create a temporary DB (collection) for storing the data to dump.
2) Select the required data from the original collection in small portions. How big? It depends on your server; for example, 5000 entries at a time (limit, offset).
3) Save the data in the temporary DB (a sketch follows the list).
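
A minimal sketch of steps 1-3, assuming pymongo; the connection string, DB/collection names ("mydb", "events", "events_to_dump") and the filter field ("created_at") are hypothetical placeholders. Instead of an ever-growing skip/offset it pages on _id, which stays cheap on a ~50m-record collection:

```python
# Batched copy into a temporary collection (steps 1-3); names are placeholders.
import time
from datetime import datetime
from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://mongo-host:27017")   # assumed connection string
db = client["mydb"]
source = db["events"]            # original collection
temp = db["events_to_dump"]      # temporary collection (step 1)

query = {"created_at": {"$lt": datetime(2016, 1, 1)}}  # hypothetical filter
BATCH = 5000
last_id = None

while True:
    q = dict(query)
    if last_id is not None:
        q["_id"] = {"$gt": last_id}          # resume after the previous portion
    batch = list(source.find(q).sort("_id", ASCENDING).limit(BATCH))  # step 2: one small portion
    if not batch:
        break
    temp.insert_many(batch)                  # step 3: save the portion in the temp collection
    last_id = batch[-1]["_id"]
    time.sleep(0.5)                          # pause between portions so other clients can still reach mongod
```

Run this as many small background jobs (or one low-priority loop); the sleep between portions is what keeps CPU free for the other servers.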

Now you can try to dump the temporary DB (e.g. with mongodump pointed at that collection via --db/--collection). If that does not work, you may try partitioning.
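
The question also needs the matching documents removed from the original collection. One way to do that with the same small-batch idea, only after the dump has been verified (pymongo assumed, same placeholder names as above):

```python
# Batched cleanup of the original collection, driven by the _ids already
# copied into the temporary collection. Run only after the dump is verified.
import time
from pymongo import MongoClient

client = MongoClient("mongodb://mongo-host:27017")   # assumed connection string
db = client["mydb"]
source = db["events"]            # original collection
temp = db["events_to_dump"]      # temporary collection that was dumped

BATCH = 5000

while True:
    ids = [doc["_id"] for doc in temp.find({}, {"_id": 1}).limit(BATCH)]
    if not ids:
        break
    source.delete_many({"_id": {"$in": ids}})   # remove one small portion from the original
    temp.delete_many({"_id": {"$in": ids}})     # drop the processed _ids so the loop advances
    time.sleep(0.5)                             # keep CPU headroom for the other servers
```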

sig
  • I think this approach is really hard to apply, because the data is over 5m records. Even though I indexed the fields so queries are as fast as possible, sorting and counting still take a lot of time in MongoDB, and splitting the data like that is inconvenient for importing again – Peter89 Feb 09 '17 at 04:16