Questions tagged [distributed-database]

Anything related to distributed databases and the techniques and tools used to manage them. A distributed database is one whose data is not stored in a single physical location but is spread across multiple devices, often placed far apart.

188 questions
1
vote
2 answers

Which database for a web crawler, and how do I use MySQL in a distributed environment?

Which database engine should I use for a web crawler, InnoDB or MyISAM? I have two PCs, each with a 1 TB hard drive. If one fills up, I'd like new data to be saved to the other PC automatically, while reads still go to the PC that holds the data. How do I do that?
Jesvin
  • 491
  • 1
  • 5
  • 15
1
vote
1 answer

Join order in a distributed database

I am currently studying query optimization in replicated distributed database systems, but I am confused about the join order of the relations. If a query execution path is [15E,4F,6I,8B,14D,11G,16J,9H,6A,13C] or…
Light
  • 199
  • 1
  • 3
  • 9
1
vote
1 answer

How is data stored in distributed databases? In Apache Cassandra it is distributed equally; is that the case in other distributed DBMSs?

I have read DataStax articles about Apache Cassandra and noticed that whatever data we write is distributed equally among all the nodes. Is that the case in all other distributed database management systems?…
1
vote
0 answers

Neo4j and MongoDB as data sources in Grails

I am using spring-data-neo4j in my Grails application because of the limited functionality of Neo4j GORM (the neo4j plugin). I am planning to use both Neo4j and MongoDB as my data sources. Is there any way that I could map a single domain class in both Neo4j and…
1
vote
1 answer

Do all the nodes in a Cassandra cluster know each other's "partition key ranges"?

Let's say I have a Cassandra cluster with the following scheme:

(76-100) Node1 - Node2 (0-25)
           |       |
 (51-75) Node4 - Node3 (26-50)

Each node is primarily responsible for a range of partition keys. For example, for a total range…
brain storm
  • 30,124
  • 69
  • 225
  • 393
1
vote
3 answers

What alternatives do I have if I want a distributed multi-master database?

I am building a system in which I want to reduce single points of failure, and I need a database. Are there any (free) relational database systems that handle multi-master setups well (i.e., where it is easy to add and remove nodes), or is it better to…
Jonas
  • 121,568
  • 97
  • 310
  • 388
1
vote
1 answer

MySQL distributed database with MySQL access to each node

I have a task to implement a particular database structure: multiple MySQL servers holding data under the same schema, where each server can see and edit only its own part of the data, and one master server with its own data that can run queries using data…
genpj
  • 11
  • 2
1
vote
3 answers

Tool to create a MongoDB sharded cluster

I need a tool to manage a cluster of MongoDB instances. As the number of machines increases, it is hard to maintain each machine without a tool. More details: the database grows by around 50 MB per day, which is approximately 1.5 GB per month. The MongoDB…
1
vote
1 answer

Understanding HBase data storage (webpage) for Nutch

I am using HBase as the storage for data crawled by Apache Nutch. My storage is located at /data/hbase/webpage, where I can see a lot of folders like: 64b2feb30073eec24d9dba65d421e7f…
Jan Bouchner
  • 875
  • 1
  • 14
  • 38
1
vote
2 answers

Consistency for reads from distributed databases

I have a set of databases distributed across multiple locations in the network and, for example, one client that needs to store some data in those databases. I need to make sure my data will always be stored. I can't organize a replica set with…
1
vote
0 answers

Implementing a distributed database system using Microsoft Excel (back-end) and JSP (front-end)

I am working on a project which is, in short, a distributed database project. **Requirements** Module 1: 1) Read and update an Excel sheet using an Excel sheet which already contains some data. Status: implemented. Brief details: I have used the POI…
cracknut
  • 61
  • 1
  • 5
0
votes
1 answer

What kind of database, with regard to the CAP theorem, should be used for an MMO?

Let's say an MMO is being created with a first-person view, a single world that everyone plays in (like EVE Online), and sandbox gameplay. What database would best suit its needs in terms of the CAP theorem: CA, AP, or CP, and why?
Xavier
  • 8,828
  • 13
  • 64
  • 98
0
votes
0 answers

How can I tune GlusterFS volume options to make it strongly consistent?

I am using GlusterFS as the shared storage for a shared-disk database. I have two operations. Operation 1: use lseek to get a file's total size. Operation 2: write and fsync immediately to extend a file. (o1 and o2 are mutually exclusive.) Here comes…
jaden
  • 1
0
votes
1 answer

What happens to long-running ClickHouse updates if the client dies?

Column type modifications in ClickHouse can take a long time, since ClickHouse has to migrate all data to the new type. Let's say a JDBC client initiates a column type update, and then the JVM process is killed. What happens to the update task?…
thewolf
  • 417
  • 1
  • 5
  • 10
0
votes
1 answer

Legal hierarchical quorums in ZooKeeper

I am trying to understand hierarchical quorums in ZooKeeper. I may not understand the example shown in the documentation (here). Are votes [from at least two servers from each of two different groups] enough to form a legal quorum? In my opinion,…