I'm new to Cassandra and Gremlin. I am using Gremlin to write and retrieve data from Cassandra. I want to take a backup and restore it on a new node. I took a snapshot using nodetool. I am also using Elasticsearch for indexing. Please help me with some links or documents.
-
Hi Syed, I hope this link helps you: https://stackoverflow.com/questions/67639544/how-do-i-replicate-a-cassandras-local-node-for-other-cassandras-remote-node – Luciana Oliveira Mar 07 '22 at 18:39
-
I have the steps, if you need it, let me know. – Luciana Oliveira Mar 07 '22 at 18:40
-
Yes, I need the steps. It would be a great help to me. – syed adeeb Mar 08 '22 at 08:21
-
I shared with you the process I do. I hope it helps you. – Luciana Oliveira Mar 08 '22 at 15:22
2 Answers
I used the second approach from this post: How do I replicate a Cassandra's local node for other Cassandra's remote node?
If the structure of the tables is the same, you could create two bash scripts like the ones below:
1. Export the data using these commands:
nodetool flush <your-keyspace-name>
nodetool cleanup <your-keyspace-name>
nodetool -h localhost -p 7199 snapshot <your-keyspace-name>
zip -r /tmp/bkp.zip /var/lib/cassandra/data/<your-keyspace-name>/
sshpass -p <password> scp -v /tmp/bkp.zip root@<ip>:/tmp
2. Import the data:
unzip /tmp/bkp.zip -d /
nodetool cleanup <your-keyspace-name>
cd /var/lib/cassandra/data/<your-keyspace-name>/ && find /var/lib/cassandra/data/<your-keyspace-name>/ -maxdepth 5 -type d -exec sstableloader -v --nodes 127.0.0.1 {} \;
If the import is slow, please check this other post: Cassandra's sstableloader too slow in import data
Important: You should adapt this information to your own environment.
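The find command in step 2 calls sstableloader on every directory it visits, including the keyspace directory itself and any snapshots or backups subdirectories. The following is only a minimal sketch of a narrower selection, assuming live SSTables sit directly inside each table directory as *-Data.db files. The function name list_sstable_dirs is hypothetical and not part of Cassandra.

```shell
#!/bin/sh
# Hypothetical helper (not a Cassandra tool): print every table directory
# directly under a keyspace directory that contains at least one
# SSTable data file (*-Data.db). Only first-level directories are checked,
# so snapshots/ and backups/ subdirectories are never printed.
list_sstable_dirs() (
    keyspace_dir=$1
    for dir in "$keyspace_dir"/*/; do
        [ -d "$dir" ] || continue
        dir=${dir%/}
        for f in "$dir"/*-Data.db; do
            if [ -f "$f" ]; then
                printf '%s\n' "$dir"
                break
            fi
        done
    done
)
```

Each printed path could then be passed to sstableloader -v --nodes 127.0.0.1 in a while read loop, in place of the find -exec call.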

I followed the steps below and the restore worked.
For backup:
Go to the data directory: cd /var/lib/cassandra/data
Then take the snapshots using the commands below:
nodetool snapshot janusgraph -cf edgestore -t edgestore_mar6
nodetool snapshot janusgraph -cf graphindex -t graphindex_mar6
I took a backup of all the folders present in the janusgraph directory under /var/lib/cassandra/data.
Now move to the folder with cd /var/lib/cassandra/data/janusgraph and run ls -lrth.
The most recent folders are listed at the bottom. Go into each of those folders and then into the snapshots folder inside it.
e.g.
cd /var/lib/cassandra/data/janusgraph/graphindex-8e147200236f11edbecf211c2dd12670/snapshots
and copy graphindex_mar6 to a new directory.
I repeated this for all the other folders under the janusgraph keyspace directory, copying every snapshot folder with today's date to the new directory, and then compressed the new directory with tar:
tar cvzf janusgraph_mar6.tar.gz janusgraph
Here janusgraph is the directory I created, into which I copied the snapshots of all the folders under the janusgraph keyspace directory.
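The manual copy steps above could be scripted roughly as follows. This is only a sketch, assuming every snapshot tag ends with the same suffix (such as _mar6) and that the snapshots live under <table-dir>/snapshots/<tag>. The function name collect_snapshots and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: copy every snapshot directory whose name ends with
# the given suffix from all table directories of a keyspace into one
# staging directory, keeping the snapshot directory names unchanged.
collect_snapshots() (
    keyspace_dir=$1   # e.g. /var/lib/cassandra/data/janusgraph
    suffix=$2         # e.g. _mar6
    dest=$3           # staging directory, e.g. ./janusgraph
    mkdir -p "$dest" || exit 1
    for snap in "$keyspace_dir"/*/snapshots/*"$suffix"; do
        [ -d "$snap" ] || continue
        cp -r "$snap" "$dest/" || exit 1
    done
)
```

For example, collect_snapshots /var/lib/cassandra/data/janusgraph _mar6 janusgraph followed by the tar command above would produce the same archive layout as the manual steps.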
For restoring:
Copy janusgraph_mar6.tar.gz to the remote machine where you want to restore the data.
Uncompress the archive:
tar xvzf janusgraph_mar6.tar.gz
Then, under the janusgraph folder, rename the snapshot folders to the table names,
e.g. edgestore_mar6 to edgestore:
mv edgestore_mar6 edgestore
and graphindex_mar6 to graphindex:
mv graphindex_mar6 graphindex
Repeat for all the folders.
Then restore using the commands:
sstableloader -d cassandra-ip /home/ubuntu/janusgraph/graphindex/
sstableloader -d cassandra-ip /home/ubuntu/janusgraph/edgestore/
Here cassandra-ip is the node address shown by nodetool status. Run the above command for all the other folders, then restart Cassandra:
sudo service cassandra restart
My data was restored.
Since I use Elasticsearch for indexing in my backend, I ran my reindexing script in the Gremlin console after the restore.
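The rename and sstableloader steps of the restore could be combined in a loop like the one below. This is only a sketch under a few assumptions: the extracted folders all end with the same suffix, the parent directory is named after the keyspace (sstableloader takes the keyspace and table names from the last two path components), and the function name restore_tables and the SSTABLELOADER variable are hypothetical, the latter added only so the loader can be swapped out for a dry run.

```shell
#!/bin/sh
# Hypothetical helper: rename each <table><suffix> directory to <table>
# and stream it to the cluster with sstableloader -d <host>.
restore_tables() (
    base_dir=$1    # e.g. /home/ubuntu/janusgraph (must be named after the keyspace)
    suffix=$2      # e.g. _mar6
    host=$3        # a node address taken from `nodetool status`
    loader=${SSTABLELOADER:-sstableloader}   # assumption: override for dry runs
    for src in "$base_dir"/*"$suffix"; do
        [ -d "$src" ] || continue
        table_dir=${src%"$suffix"}
        if [ -e "$table_dir" ]; then
            echo "refusing to overwrite $table_dir" >&2
            exit 1
        fi
        mv "$src" "$table_dir" || exit 1
        "$loader" -d "$host" "$table_dir/" || exit 1
    done
)
```

Calling restore_tables /home/ubuntu/janusgraph _mar6 cassandra-ip would correspond to the mv and sstableloader commands listed above. The Cassandra restart and the Elasticsearch reindexing are still separate steps.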

-
If you use a Docker container, execute the commands inside the container. – syed adeeb Mar 06 '23 at 12:49