0

I have company, location and product details in table R:

company  location      product
------------------------------
abc      hilltop       alpha
abc      hilltop       beta
abc      riverside     alpha
abc      riverside     beta
buggy    underbridge   gama
buggy    underbridge   theta
buggy    underbridge   omega

The relationships are multivalued. And the data needs to be decomposed according to normalization as the MVD's are

  1. not derived from a candidate key (company ->> location and company ->> product where company is not a candidate key)

or

  1. the union does not make the whole set (company U location < R and so with product).

But my colleague disagrees with me, who insists that for a relation to have multivalued dependency at least four values should exist in the company column for each company:

t1(company) = t2(company) = t3(company) = t4(company)

For company abc this is true. But for company buggy, which has only one product in three locations, this is untrue.

For the formal definition and similar examples I referenced https://en.wikipedia.org/wiki/Multivalued_dependency and https://en.wikipedia.org/wiki/Fourth_normal_form .

I too started seeing the same question after reading the formal definition.

How does this relation still have this MVD even though it does not satisfy the formal definition?

(I am not asking how to normalize this data in to 4NF. I need to break it into two tables--company-location and company-product.)

philipxy
  • 14,867
  • 6
  • 39
  • 83
samshers
  • 1
  • 6
  • 37
  • 84
  • The point is that it is not possible to see if there is or not a dependency (functional or multivalued) from a few rows of an example table. The dependencies are only a way to express formally the meaning of the data, and you should concentrate on this. For instance if you have splitted the table in two, this means that each company produces each product in any location, or that produces different products in different locations, but you are not interested in this information. If you were interested, then you should keep together the information about companies, locations and products! – Renzo Aug 28 '16 at 07:58
  • @Renzo PS The dependencies don't express the meaning of the data. They express truths about the variable values that can arise, and corresponding truths about the situtations that can arise. The meaning of the data is of the form "company abc at hilltop supplies alpha". But a dependency expresses things like "FOR ALL COMPANY, LOCATION, PRODUCT: company COMPANY is at LOCATION AND company COMPANY supplies PRODUCT IMPLIES company COMPANY at LOCATION supplies PRODUCT". – philipxy Aug 28 '16 at 18:21
  • Please clarify what you have been told and what you have to give. Have you been *told* that those MVDs hold in the table variable? Have you been *given* the CK(s) of the table variable? Or have you just been given that table value and you are supposed to show that it violates 4NF? Do they & you realize that if it does then that doesn't mean that the *variable* they have in mind does? If you clarify then I can address your problem but otherwise the "answer" becomes a book chapter on normalization. (These issues are what @Renzo was commenting on but that comment is not clear.) – philipxy Aug 29 '16 at 15:55
  • @philipxy. you see it right - what I have is the data given (and from business knowledge i know it is a MVD, but i can't use this information). No specific CK are given as the table needs all the three columns together to be treated as CK). The mathematical or formal definition fails as company **"buggy"** does business at one location only though for all products which is less than four. How do i get this solved in to 4NF. (PS: I know it's a clear MVD but with out formal proof ,not theoretical, i can't normalize) – samshers Aug 29 '16 at 16:28
  • @Renzo - i agree with you. Relations come from real world and can't be determined by table data alone. So, here I know it's a MVD from domain knowledge but what's coming in between my way is the **formal defination** which fails. So is the formal definition really valid in all cases. You see it fail's every where, when for X , Y, Z attributes there exist (1) only one possible values for Y and 3 or less possible values for Z (2) 2 possible values for Y and 1 possible values for Z in-spite these possibly being MVD's – samshers Aug 29 '16 at 16:38
  • @samshers, MVDs from *a formal point of view* are a very complex thing. The intuition is that they arise when there is a set of values depending from a value. But this intuition is *wrong* if we stop at this level. You can look for instance to [this presentation](http://www3.cs.stonybrook.edu/~icyoon/teaching/cse532-2015s/lec11-cse532-2015spr.pdf), that I find reasonably clear. – Renzo Aug 29 '16 at 17:06
  • @Renzo - the link is not accessible "Forbidden You don't have permission to access /~icyoon/teaching/cse532-2015s/lec11-cse532-2015spr.pdf on this server." Will you mind sharing it through any file sharing site if it's not going be a trouble. – samshers Aug 29 '16 at 17:42
  • The sentence with items 1 & 2 & "or" is not clear. – philipxy Mar 08 '23 at 01:18

1 Answers1

1

"There exist" says some values exist, and they don't have to be different. EXISTS followed by some name(s) says that there exist(s) some value(s) referred to by the name(s), for which a condition holds. Multiple names can refer to the same value. (FOR ALL can be expressed in terms of EXISTS.)

The notion of MVD can be applied to both variables and values. In fact the form of the linked definition is that a MVD holds in the variable sense when it holds in the value sense "in any legal relation". To know that a particular value is legal, you need business knowledge. You can then show whether that value satisfies an MVD. But to show whether its variable satisfies the MVD you have to show that the MVD is satisfied "in any legal relation" value that the variable can hold. One valid value can tell you that a MVD doesn't hold in (it and) its variable, but it can't tell you that a MVD does hold in its variable. That requires more business knowledge.

You can show that this value violates 4NF by using that definition of MVD. The definition says that a relation variable satisfies a MVD when a certain condition holds "for any valid relation" value:

for all pairs of tuples t1 & t2 in r such that t1[a] = t2[a] there exist tuples t3 & t4 [...]

For what MVD and values for t1 & t2 does your colleague claim there doesn't exist values for t3 & t4? There is no such combination of MVD and values for t1 & t2. Eg for {company} ↠ {product} and t1 & t2 both (buggy, underbridge, gamma), we can take (company, underbridge, gamma) as a value for both t3 & t4, and so on for all other choices for t1 & t2.

Another definition for F ↠ T holding is that binary JD (join dependency) *{F U T, F U (A - T)} holds, ie that the relation is equal to the join of its projections on F U T & F U (A - T). This definition might be more immediately helpful to you & your colleague in that it avoids the terminology that you & they are misinterpreting. Eg your example data is the join of these two of its projections:

company  location
--------------------
abc      hilltop
abc      riverside
buggy    underbridge

company  product
----------------
abc      alpha
abc      beta
buggy    gamma
buggy    theta
buggy    omega

So it satisfies the JD *{{company, location}, {company, product}}, so it satisfies the MVDs {company} ↠ {location} and {company} ↠ {product} (among others). (Maybe you will be able to think of examples of relations with zero, one, two, three etc tuples for which one or more (trivial and/or non-trivial) MVDs hold.)

Of course, the two definitions are two different ways of describing the same condition.

PS 1 Whenever a FD F → T holds, the MVD F ↠ T holds. For a relation in BCNF, the MVDs that violate 4NF & 5NF are those not so associated with FDs.

PS 2 A relation variable is meant to hold a tuple if and only if it makes a true statement in business terms when its values are substituted into a given statement template, or predicate. That plus the JD definition for MVD gives conditions for a relation variable satisfying a MVD in business terms. Here our predicate is of the form ...company...location...product.... (Eg company namedcompanyis located atlocationand makes productproduct.) It happens that this MVD holds for a variable when for all valid business situations, FOR ALL company, location, product,

        EXISTS product [...company...location...product...]
    AND EXISTS location [...company...location...product...]
IMPLIES ...company...location...product...
philipxy
  • 14,867
  • 6
  • 39
  • 83
  • I hopefully see what you point towards and what the definitions imply. The problem I have is I can't use the business knowledge here (though all combinations apply - I am not supposed to know or use them). So given the sample data, i need to determine and prove **mathematically** that the table violates 4nf and after that split it in to two tables. Is there any way i can prove this. – samshers Aug 29 '16 at 07:53