Anyone who has read Tales Of The Oak Table, not to mention Tony Andrews' blog or any of the countless articles and discussions on the subject on AskTom and elsewhere, will know two things about the fabled "Entity-Attribute-Value" approach to database design, in which you model all "things" in one table with a "thing ID" and a "thing type", plus a second table holding one row per "attribute", and thus create an application that can model any conceivable type of thing, ever:
- It seems like a clever idea at first.
- It isn't.
But wait. A poster on OTN forums ("SIMPLE Database Design Problem") has solved the major problems inherent in the original Entity-Attribute-Value approach, by simply denormalising away the Attribute-Value part.
Now the ENTITIES table will have all the columns for every entity type. Maybe a lot of them will be null because "INVOICE" rows will use mainly different columns to "TROPICAL_DISEASE" rows, but disk space is cheap, and look at the simplification we have achieved by not having to babysit all those old-fashioned tables. And it's not a generic design any more, is it? It's concrete.
The table would look something like this:
ENTITYID ENTITYTYPE NAME PRICE DIET COLOUR ANNUAL_TURNOVER -------- ---------- --------- ----- ----- ------ --------------- 1 PERSON William 2 FRUIT Banana Yellow 3 COMPANY Megacorp 100000000 4 ANIMAL Fruitbat Fruit 5 SNACK Snickers 0.4
accompanied by a generic RELATIONS table like this:
ENTITY1 ENTITY2 RELATIONSHIP ------- ------- ------------ 3 1 EMPLOYS 1 2 EATS 1 5 EATS
Want to list the snacks eaten by Megacorp employees? Simple:
SELECT emp.entityid, emp.name, snack.name, snack.price FROM entities emp JOIN relations emprel ON emprel.entity2 = emp.entityid AND emprel.relationship = 'EMPLOYS' JOIN entities com ON com.entityid = emprel.entity1 AND com.entitytype = 'COMPANY' JOIN relations snrel ON snrel.entity1 = emp.entityid AND snrel.relationship = 'EATS' JOIN entities snack ON snack.entityid = snrel.entity2 AND snack.entitytype = 'SNACK' WHERE emp.entitytype = 'PERSON' AND com.name = 'Megacorp';
Want to make FRUITBAT an employee of SNICKERS?
INSERT INTO relations VALUES (5, 4, 'EMPLOYS');
The thread becomes increasingly surreal as more and more posters suggest likely issues, from performance (he's prototyped it and the slowdown is insignificant) to complexity (the code will be generated dynamically from an object library) and the limited number of columns per table in Oracle (he might go with MySQL) while Erdem remains cheerfully confident that it will work (it won't).
My thanks to 3360 for sharing this. Send your WTFs to us at OracleWTF@bigfoot.com.
2 comments:
There is a genuinely difficult problem that guy is trying to solve here.
Not all application development is bespoke to a clients needs, so in the situations where flexibility is required while maintaining some sort of 'core' code base, some sort of extensibility needs to be built into the solution. It just isn't always practical or desirable to redesign the application schema for every additional column.
One of the key things that customers nearly always say in their requirements or RFPs is that they require the flexibility without resorting to the vendor for code changes.
The 'Concrete Elephant' solution proposed here is clearly nonsense, however an approach where the core entities, their attributes and entities are modelled in the schema itself, but whereby some felixibility is introduced by allowing the addition of attributes using an attribute-value approach does have some merit.
Comment
Post a Comment