0

According to the following database schema. What is the most effective solution for deleting a folder as well as all it's subfolders and subfiles using MySQL 6?

enter image description here

user6178502
  • 715
  • 1
  • 5
  • 7
  • Use InnoDB with foreign keys `ON DELETE CASCADE` and just delete the folder – Psi Mar 19 '17 at 22:55
  • @Psi - ON DELETE CASCADE is the most dangerous operation that can destroy data in your DB. I'd suggest to use triggers or delete the records using 4 sql queries. so at least you have some control over the delete operation. – Desolator Mar 19 '17 at 22:58
  • @RobinVanPersi Why do you think so? If you don't delete wildly and set up your foreign keys correctly, what makes it dangerous in your eyes? By the way, it is intended to "destroy" data, that's what a deletion normally is supposed to do – Psi Mar 19 '17 at 22:59
  • 1
    And by the way, the most "dangerous" operation in my opinion is `drop database`. Also a legal and necessary tool ;) – Psi Mar 19 '17 at 23:06
  • 3
    @RobinVanPersi: I totally agree with Psi. DELETE CASCADE is super useful and using triggers is not the right approach here. – juergen d Mar 19 '17 at 23:09
  • I'm not sure this is the best schema you can have for this sort of thing. RDMBS systems aren't always the best at recursive eager loading. One thing that might make more sense is giving each file multiple paths, as plain old strings, so you can do things like `SELECT * FROM file_paths WHERE path LIKE '/example/path%'` and get everything from one point down in the tree. – tadman Mar 19 '17 at 23:17
  • Or, if this is a frequent operation, consider the [nested set model](https://en.m.wikipedia.org/wiki/Nested_set_model) for the hierarchy. This is how I structure things in my MySQL-based service that tracks objects in Amazon S3 buckets -- the "folders" (which are called *key prefixes* in S3 parlance) have a parent_key_prefix_id as well as a left_id and right_id and depth for hierarchical selects, grouping, counting, summing sizes, etc. – Michael - sqlbot Mar 20 '17 at 01:30
  • @Psi - it's not the best way (maybe the easiest way), one mistake and all of your data are gone. you have to be very careful when it comes to cascade deletion. also it's a big pain to do logging and tracking if you're working on a large scaled projects. I'd still go for 4 queries that I can control over the cascade deletion anytime in any database. triggers are not the best solution I'd agree with that but still you have sorta control with the logic. I'm saying that based on experience not based on (the easiest way) – Desolator Mar 20 '17 at 19:52
  • @RobinVanPersi It is the best way _because_ it is the easiest way. And it is the easiest way because that scenario is _exactly_ what cascading foreign keys are designed for. If you fear making mistakes... well, just don't make mistakes. If you fear putting in the wrong queries, you better do never work on a database in a production system. Before I do critical manual operations, I _always_ do a backup beforehand. If you know what you are doing, using the tools solving these problems they are designed for to solve, is always the best, the easiest, and the most straightforward way! – Psi Mar 20 '17 at 19:59
  • @Psi - it's the best way because it's the easiest way? that statement is 100% false man. it has its flaws and limitations just like any other method. it's designed for that purpose yeah, but that doesn't make it the best approach. it limits your ability to implement any kind of logic around it. once the data are destroyed you can't put it back unless you use your backup and it's there for testing purposes not for production purposes. I'm not fearing of making mistakes and I always make backups when I perform such operation but that's totally not the case when it's in production. – Desolator Mar 20 '17 at 20:08
  • @Psi - I'm stating facts of lots of people been through that pain. not based on documentations and answers you read online. – Desolator Mar 20 '17 at 20:09
  • I agree with @Psi that if it's done carefully there not problem to use `ON DELETE CASCADE` on the foreign key. However, this solution will not delete subfolders and subfiles. – user6178502 Mar 20 '17 at 20:12
  • @tadman Thanks but `SELECT * FROM file_paths WHERE path LIKE '/example/path%'` will be very slow on a large database. – user6178502 Mar 20 '17 at 20:14
  • With an index it's not as bad as you'd think. Prefix scans are often fairly quick, the index has the entries in order. Postfix ones are brutal, e.g. `%/path` is going to be painfully slow. As always, when prototyping a schema, test it under production scale load and then some to see how it performs. – tadman Mar 20 '17 at 20:16
  • @user6178502 It will also delete subfolders, assumed, all relations are cascaded – Psi Mar 20 '17 at 20:21
  • @RobinVanPersi So you suggest to avoid `on delete cascade` because the user might be too incompetent to use it? Your suggested triggers would do exactly the same, so they would destroy the same data... does not make sense at all for me. That means, you blame a standard feature for doing its job because some users were just not able to use them correctly? Sorry, but then they should have chosen a different job. Also, using triggers to do the job a constraint is supposed to do makes the schema a lot harder to maintain for people who know what they are doing – Psi Mar 20 '17 at 20:30

0 Answers0