"
Relations are, themselves, values too, and relation attributes can therefore be declared to be of another relation type. Such attributes are called 'Relation-valued attributes' (RVA's for short).
In the RA, two operators are available that allow us to manipulate relations in connection with RVA's: GROUP and UNGROUP
"
Like I said, I'm a bit out of my depth here so take the above as evidence rather than proof that such things existed, but I'm pretty sure I saw this, hand-drawn, in one of Codd's original papers.
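For what it's worth, here is a toy sketch of what GROUP and UNGROUP do, with relations modeled as lists of dicts. This is purely illustrative, not the API of any real DBMS, and the attribute names ("order", "item", "qty", "lines") are invented for the example:

```python
# GROUP folds some columns into a nested relation-valued attribute (RVA);
# UNGROUP flattens it back out. Toy code, not any DBMS's actual operators.

def group(rows, rva_attrs, rva_name):
    """GROUP: fold the rva_attrs columns of each row into a nested
    relation stored under the single attribute rva_name."""
    buckets = {}
    for r in rows:
        key = tuple(sorted((k, v) for k, v in r.items() if k not in rva_attrs))
        inner = tuple(sorted((k, r[k]) for k in rva_attrs))
        buckets.setdefault(key, set()).add(inner)
    return [dict(key, **{rva_name: frozenset(inner)})
            for key, inner in buckets.items()]

def ungroup(rows, rva_name):
    """UNGROUP: flatten the nested relation back into ordinary columns."""
    out = []
    for r in rows:
        base = {k: v for k, v in r.items() if k != rva_name}
        out.extend(dict(base, **dict(inner)) for inner in r[rva_name])
    return out

orders = [
    {"order": 1, "item": "apple", "qty": 2},
    {"order": 1, "item": "pear",  "qty": 1},
    {"order": 2, "item": "apple", "qty": 5},
]
nested = group(orders, {"item", "qty"}, "lines")   # one row per order
flat = ungroup(nested, "lines")                    # round-trips to 3 rows
```

The round trip is the key property: UNGROUP(GROUP(r)) gives back the original flat relation.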
Edit: you are right
"Codd proposed a normal form that he called first normal form (1NF), and he included a requirement for 1NF in his definitions for 2NF, 3NF, and subsequently BCNF. Under 1NF as he defined it, relation-valued attributes were “outlawed”; that is to say, a relvar having such an attribute was not in 1NF."
No, it doesn't mean he's right. The "normal forms" could merely be suggestions for a database designer, not a technical limitation enforced by the software itself.
No one has provided convincing evidence that Codd intended to exclude nested tables entirely. People seem to be conflating (i) good database design, as recommended by Codd, with (ii) the feature set of a DBMS, also as suggested by Codd.
> The "normal forms" could merely be suggestions for a database designer, not a technical limitation enforced by the software itself.
I think most of the motivation for normal forms is to avoid 'update anomalies', which essentially means: don't represent the same information in two places in your base relation variables (aka tables, in SQL). So you can have repeated values or nested relations in queries, and you can have them in base tables which are morally normalized, as long as there's no possibility that they lead to the same information being recorded in two distinct places.
When people talk about 'denormalizing' and it's justified, I think they mean breaking this rule about representing information in two or more places in exchange for performance. If you do that, the application programmer has to be careful to keep the multiple locations in sync, a kind of consistency you don't have to think about in a clean database design. I think database management software in general cannot enforce normalisation; it can only make working with normalized databases easier or more difficult.
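The 'same information in two places' problem can be shown in a few lines. The tables and names here are invented for illustration, not from the thread:

```python
# A denormalized table repeats the customer's city in every order row.
# The same fact now lives in two places, so a careless update can leave
# the database contradicting itself: the classic update anomaly.

denorm = [
    {"order": 1, "customer": "ada", "city": "London"},
    {"order": 2, "customer": "ada", "city": "London"},
]

denorm[0]["city"] = "Paris"   # updated one copy, forgot the other
cities = {r["city"] for r in denorm if r["customer"] == "ada"}
assert cities == {"Paris", "London"}   # two answers to one question

# Normalized design: the city is stored exactly once, so this anomaly
# cannot arise no matter how the update is written.
orders = [{"order": 1, "customer": "ada"}, {"order": 2, "customer": "ada"}]
customers = {"ada": {"city": "London"}}
customers["ada"]["city"] = "Paris"
assert {customers[r["customer"]]["city"] for r in orders} == {"Paris"}
```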
In theory, the DBMS itself could directly support 'physical denormalization' and make this performance optimisation easier to implement and transparent to the application code. I think some SQL DBMSs have attempted to do things like this.
(Posted under a different account because I'm being slow-posted again by HN)
> In theory, the DBMS itself could directly support 'physical denormalization' and make this performance optimisation easier to implement and transparent to the application code. I think some SQL DBMSs have attempted to do things like this.
Automatic, application-transparent physical denormalisation, managed entirely by the database, is something I am very, very interested in. Unfortunately I've been able to find pretty much nothing describing what it would look like or how it would be done. If you can provide any links, that would be incredibly helpful!
It gets mentioned in the Date/Darwen books as being the right way to do things, but no actual information seems to be given.
I'm a bit fuzzy on the details, but I think Vertica allows storing duplicate copies of a table in multiple sort orders; the appropriate copy is then picked automatically by the query optimiser. So this works not that differently from an index (which is also DBMS-managed performance denormalisation).
There are also materialized views: if you have automatic, incrementally updated materialized views which are transparently substituted into queries, that's along these lines. I think a lot of progress is being made here, and plenty of compromises used in the field have been in production for a long time.
I think there's some ambitious work on materialized views being done in postgres.
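To make the 'incrementally updated materialized view' idea concrete, here is a rough sketch of the mechanism in miniature. All names are invented, and real systems differ enormously in the details; the point is only that maintenance work is proportional to the change, not to the base table:

```python
# A tiny incrementally maintained materialized view:
# SELECT key, SUM(amount) ... GROUP BY key, kept up to date per insert.

class SumView:
    def __init__(self):
        self.totals = {}          # the "materialized" query result

    def insert(self, key, amount):
        # Incremental maintenance: touch only the affected group,
        # instead of re-running the aggregate over the whole base table.
        self.totals[key] = self.totals.get(key, 0) + amount

    def query(self, key):
        # The application just reads; the precomputed answer is returned
        # directly, which is the performance win being discussed above.
        return self.totals.get(key, 0)

v = SumView()
for key, amount in [("a", 10), ("b", 5), ("a", 7)]:
    v.insert(key, amount)
```

A DBMS doing this transparently would additionally rewrite matching queries to read from the view, so application code never knows it exists.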
> It gets mentioned in the Date/Darwen books as being the right way to do things, but no actual information seems to be given.
I don't think they ever convincingly got into the details on it.
> Automatically managed, application-transparent, physical denormalisation entirely managed by the database is something I am very, very interested in.
> I think most of the motivation for normal forms is to avoid 'update anomalies', which is essentially, don't represent the same information in two places
This is true for the second and higher normal forms, but not for first normal form. First normal form is about eliminating nested tables, not about eliminating redundant data.
> No one has provided convincing evidence that Codd intended to exclude nested tables entirely.
See Codd's original paper (linked in a sibling comment), section 1.4.
Note that the relational algebra developed by Codd does not support querying nested tables, which would make them practically useless, even if allowed.
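A toy way to see the point (relations as dicts again; the "lines" RVA is invented for the example): classical restriction compares whole attribute values, so with a nested attribute you cannot ask about its *contents*; after flattening to 1NF, the same question is an ordinary restriction.

```python
# Classic relational restriction: keep the rows satisfying a predicate.
def restrict(rows, pred):
    return [r for r in rows if pred(r)]

nested = [
    {"order": 1, "lines": frozenset({("apple", 2), ("pear", 1)})},
    {"order": 2, "lines": frozenset({("pear", 3)})},
]

# On the nested form, equality on "lines" can only match a whole inner
# relation at once; "which orders contain an apple?" is not expressible
# as a plain attribute comparison.
whole = restrict(nested, lambda r: r["lines"] == frozenset({("pear", 3)}))
assert whole == [nested[1]]

# After flattening (unnesting to 1NF), the question becomes an ordinary
# restriction on an ordinary column.
flat = [{"order": r["order"], "item": i, "qty": q}
        for r in nested for (i, q) in r["lines"]]
apple_orders = {r["order"]
                for r in restrict(flat, lambda r: r["item"] == "apple")}
assert apple_orders == {1}
```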
also https://shark.armchair.mb.ca/~erwin/RA_Intro.htm
https://fliphtml5.com/qprz/cxon/basic/201-235