The concept-oriented model (COM) of data is a general-purpose unified model. In this post we describe one aspect of this model: how it can unite two branches that currently exist separately in computer science, value (or domain) modeling and relation modeling. This is achieved by introducing a new data modeling and programming construct, called a concept, which is used for typing both domains and relations.
1. Relations and domains
In the relational model, a domain is a set of values and a tuple is a combination of values from some domains. For example, a domain could consist of all integer values like 1, 2, 3 and so on.
A relation is defined over some domains via its schema and tuples. A relation schema is a list of domains, one for each relation attribute; these domains are also called attribute types. Relation tuples are composed of values taken from these domains. For example, a ColorTable could be defined as a set of triples, each composed of three integers taken from one domain. To define a new relation we have to specify the domains of its attributes.
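As a minimal sketch of this definition, the ColorTable relation can be written in Python as a set of triples. The name `ColorRow` and the attribute names `red`, `green` and `blue` are assumptions made for illustration; the integer domain is modeled by the built-in `int` type.

```python
from typing import NamedTuple, Set


class ColorRow(NamedTuple):
    """Relation schema: three attributes, all typed by the integer domain."""
    red: int
    green: int
    blue: int


# A relation is a set of tuples that all conform to the schema.
color_table: Set[ColorRow] = {
    ColorRow(255, 0, 0),
    ColorRow(0, 255, 0),
    ColorRow(0, 0, 255),
}

# Set semantics: adding a tuple that already exists does not change the relation.
color_table.add(ColorRow(255, 0, 0))
```

Here the schema only lists domains; nothing in it can refer to another relation, which is the limitation discussed below.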
This classical approach clearly separates relations from domains. Here relations and domains are modeled differently by using different modeling constructs and patterns. Data modeling is broken into two isolated areas: relation modeling and domain (or value) modeling. Relations are normally modeled using the relational model while domains are modeled using object-oriented methods. For instance, domains can be extended. In figures, relations will be shown in light blue color and domains will be shown in light green color.
2. Complex values
Domains can be used not only to define relations. They also can be used to define complex domains which are sets of complex values. A complex value is a combination of several simpler values taken from other domains. Thus complex values may have arbitrary structure which is defined in terms of existing domains. For example, we could define a new domain for colors where one value is composed of three integers. It is very similar to how we have defined a new color relation except that now colors are represented as values within a domain rather than tuples within a relation.
These complex values can now be used in relations as if they were primitive values. For instance, the ColorTable could have an attribute with the type of the complex domain.
Thus complex domains (also known as user-defined types) allow us to model domains with arbitrary structure, and these complex domains can then be used to define relations.
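A complex domain and its use in a relation might be sketched in Python as follows. This is only an illustration under stated assumptions: `Color` is modeled as a frozen dataclass so that its instances behave like immutable values, and the `name` attribute of the relation is hypothetical.

```python
from dataclasses import dataclass
from typing import NamedTuple, Set


@dataclass(frozen=True)
class Color:
    """Complex domain: one value is composed of three integers."""
    red: int
    green: int
    blue: int


class NamedColorRow(NamedTuple):
    """Relation schema with one primitive and one complex-typed attribute."""
    name: str      # hypothetical attribute, added for illustration
    color: Color   # attribute typed by the complex domain


color_table: Set[NamedColorRow] = {
    NamedColorRow("red", Color(255, 0, 0)),
    NamedColorRow("green", Color(0, 255, 0)),
}
```

Two `Color` instances with the same components compare equal, as values should, and the relation treats a `Color` exactly like a primitive value.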
So existing domains can be used to define either new relations or new domains. In other words, relations and domains are defined in terms of already existing domains.
The problem here is that it is not possible to use existing relations when defining new relations or domains.
Attributes in both domains and relations are typed using only domains and there is no possibility to have relation-typed attributes. Thus relations and domains are not only isolated but they are also asymmetric in their use because only domains are used when extending a schema.
Another problem is that relations cannot be extended like domains using the traditional object-oriented approach. For example, we can extend the domain People when defining a new domain Employees by adding more specific attributes but we cannot naturally extend the relation People by introducing a new relation Employees.
There exist some solutions to this problem.
- One consists in introducing objects, which are modeled by classes, instead of using tuples and relations. In contrast to relations, classes can be used as attribute types. In this approach, however, we are not able to model custom references with arbitrary structure. In addition, this essentially means switching to the object-oriented approach, which has always been controversial in data modeling. For example, it is not very suitable for set-oriented operations.
- Another solution consists in using foreign keys. Yet a foreign key is not a type; it is a constraint. Therefore it can be used as a pattern or workaround but not as a principled solution.
Our goal is to make relations and domains fully symmetric. So the main question is whether it is possible to combine relation modeling and domain modeling by making them symmetric with respect to each other and integral parts of one construct. This is a non-trivial problem which touches the foundations not only of data modeling but also of other branches of computer science.
The solution provided within the concept-oriented model consists in introducing a new construct, called a concept:
A concept is defined as a couple of two classes: one identity class and one entity class.
Identity and entity classes are also referred to as reference and object classes, respectively, in concept-oriented programming (COP). The main difference between them is that identities are always values and are passed and stored by-value while entities are passed by-reference.
Concept instances are identity-entity couples which are informally analogous to complex numbers in mathematics. Indeed, complex numbers also have two constituents but are manipulated as one whole. A domain in this case is defined as a set of identity-entity couples rather than either values or tuples. As a result, there is no need to distinguish between value domains and relations. Concepts are used instead of both relation types and domain types, thereby unifying relation modeling and value modeling.
Concept-typed attributes contain references in the format of the identity class. Simultaneously, they reference an object in the format of the entity class. In this way we can freely vary between the by-value and by-reference constituents of data. If a concept has an empty entity class then its instances are values. If a concept has an empty identity class then its instances are represented by primitive references, like ordinary objects.
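The identity-entity couple can be approximated in a conventional language. The following Python sketch is not the concept-oriented syntax itself; it only emulates the idea, and all names (`AccountIdentity`, `AccountEntity`, `AccountConcept`, `create`, `resolve`) are hypothetical. The identity is an immutable value stored by value, while the entity is a mutable object reached through that identity.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AccountIdentity:
    """Identity class: an immutable value with domain-specific structure."""
    bank_code: str
    number: int


@dataclass
class AccountEntity:
    """Entity class: a mutable object accessed by reference."""
    balance: float = 0.0


class AccountConcept:
    """Emulates a concept as a set of identity-entity couples."""

    def __init__(self) -> None:
        self._couples: Dict[AccountIdentity, AccountEntity] = {}

    def create(self, identity: AccountIdentity, entity: AccountEntity) -> AccountIdentity:
        self._couples[identity] = entity
        return identity  # callers store only the identity (the reference)

    def resolve(self, identity: AccountIdentity) -> AccountEntity:
        return self._couples[identity]


accounts = AccountConcept()
ref = accounts.create(AccountIdentity("DE01", 42), AccountEntity(100.0))

# An equal identity value built elsewhere denotes the same entity.
accounts.resolve(AccountIdentity("DE01", 42)).balance += 50.0
```

An attribute typed by this concept would hold an `AccountIdentity` value, much like a foreign key, but here the key format is part of the type rather than a separate constraint.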
In summary, concepts in the concept-oriented model allow us to unify domain and relation modeling by using only one construct for both purposes. Concepts provide a type-based mechanism for modeling domain-specific references or foreign keys. It is also important that concepts generalize conventional classes and are used also in concept-oriented programming.
More information on the concept-oriented model and concept-oriented programming can be found on this site: http://conceptoriented.org
1. Concept-oriented model: unifying domain and relation modeling. Youtube video
2. Concept-oriented model: unifying domain and relation modeling. Slideshare slides
3. A. Savinov, Concept-Oriented Model: Classes, Hierarchies and References Revisited, Journal of Emerging Trends in Computing and Information Sciences 3(4), 456-470, 2012. PDF
Since DOM elements are nested, the main question is in which direction events should be processed: downwards in the direction of child elements or upwards in the direction of the root element. The first (downward) event propagation strategy, where parent event handlers have precedence over child handlers, is referred to as capturing. The second (upward) event propagation strategy, where child handlers have precedence over their parents, is referred to as bubbling. For example, if we click a button within a panel, which of them will process this event first: the button (child) handler or the panel (parent) handler? These two opposite approaches were implemented in different browsers: Netscape chose the capturing (downward) model and Microsoft chose the bubbling (upward) model. The W3C event model supports both strategies, so that event processing consists of two phases: first capturing down to the target element and then bubbling back up through its parent elements.
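The two-phase order can be illustrated with a small simulation. This is a simplified Python sketch, not browser code: the `Element` class and `propagation_order` function are hypothetical, and the target element is simply listed once in each phase.

```python
from typing import List, Optional


class Element:
    """Minimal stand-in for a nested DOM element."""

    def __init__(self, name: str, parent: Optional["Element"] = None) -> None:
        self.name = name
        self.parent = parent


def propagation_order(target: Element) -> List[str]:
    """Return the order in which handlers would be visited for an event on target."""
    # Collect the path from the target up to the root.
    path: List[Element] = []
    node: Optional[Element] = target
    while node is not None:
        path.append(node)
        node = node.parent

    capturing = [f"capture:{e.name}" for e in reversed(path)]  # root down to target
    bubbling = [f"bubble:{e.name}" for e in path]              # target up to root
    return capturing + bubbling


document = Element("document")
panel = Element("panel", parent=document)
button = Element("button", parent=panel)

order = propagation_order(button)
```

For a click on the button, the panel is visited before the button during capturing and after it during bubbling.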
Values are opposed to objects because they have opposite properties, that is, values are not objects and objects are not values. For example, a value is an immutable element which is passed by-copy (by-value) while objects are passed by-reference. In programming languages, values are normally stored on the stack while objects are allocated on the heap. Values are visible from and used by only one context while objects can be shared by many contexts. Values do not have a location in space (a reference) while objects are characterized by a permanent reference.
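The difference in behavior can be seen in a short Python sketch. Python does not distinguish stack and heap storage the way the paragraph describes, so this is only an approximation: an immutable tuple plays the role of a value and a mutable list plays the role of a shared object.

```python
# A value: "changing" it produces a new value and leaves the original intact.
point_value = (1, 2)
changed_value = (point_value[0] + 10, point_value[1])

# An object: two names refer to the same object, so a change is visible to both.
shared_object = [1, 2]
alias_object = shared_object
alias_object[0] += 10
```

After this code runs, `point_value` is still `(1, 2)`, while `shared_object` has become `[11, 2]` because it was modified through its alias.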
Since OOP is about objects, values are not fully supported in this programming paradigm. In particular, the main purpose of classes in OOP consists in describing object types rather than value types. The only thing that is supported in most programming languages are primitive values with primitive types like integers. Another very important use of values supported in most OO languages is referencing. However, we are not able to create application-specific values and application-specific references -- they have a platform-specific format and behavior. (One exception is C++ where classes can be used for describing both values and objects.)
In concept-oriented programming (COP), both values and objects are first-class citizens with the same rights. This is not simply a decorative enhancement but a recognition of the importance of values. Values in COP can account for a great deal, or even most, of the program complexity.
For modeling values and objects COP introduces a novel programming construct, called a concept. A concept is defined as a couple of two classes: one reference class and one object class. The reference class describes values while the object class describes objects. (We call it a reference class because the main role of values is to serve as references.) What is really new here is that these two classes cannot be used separately, that is, the reference class and the object class form one whole. The change of paradigm is that programming is reduced to manipulating value-object couples. In this context, COP informally relates to OOP as complex numbers relate to real numbers in mathematics. If in OOP an element has only one constituent, an object, then in COP an element always has two constituents, one value and one object, which are informally analogous to the two constituents of complex numbers (the imaginary and real parts).
The next change is that the inclusion relation is used instead of classical inheritance. In OOP terms, this means that a base element (value-object couple) may have many extensions. If values are used as references, then this is a basis for describing application-specific hierarchical address spaces like conventional postal addresses. For example, assume that concept Street is included in concept City, which in turn is included in Country. Then each element is represented by a reference consisting of three segments: country (high), city (middle), street (low). This complex reference is a value which represents a complex object. The main difference from OOP is that one country (base) can be shared among many cities (extensions), which in turn can be shared among many streets. Elements in COP live in a hierarchical space (described by the concept inclusion hierarchy) while in OOP they live in a flat space, although their classes are hierarchically ordered.
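The three-segment reference from the postal address example can be approximated in Python. This sketch models only the reference (value) side of the concepts and uses plain composition to emulate inclusion; the class names `CountryRef`, `CityRef` and `StreetRef` and the `segments` method are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CountryRef:
    code: str  # high segment

    def segments(self) -> Tuple[str, ...]:
        return (self.code,)


@dataclass(frozen=True)
class CityRef:
    country: CountryRef  # parent element, shared by many cities
    name: str            # middle segment

    def segments(self) -> Tuple[str, ...]:
        return self.country.segments() + (self.name,)


@dataclass(frozen=True)
class StreetRef:
    city: CityRef  # parent element, shared by many streets
    name: str      # low segment

    def segments(self) -> Tuple[str, ...]:
        return self.city.segments() + (self.name,)


germany = CountryRef("DE")
bonn = CityRef(germany, "Bonn")
main_street = StreetRef(bonn, "Main Street")
station_road = StreetRef(bonn, "Station Road")
```

Both streets share the same city and country elements, which is what distinguishes inclusion from inheritance, where each extension would carry its own copy of the base part.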
In programming, we are used to thinking of references as something primitive and platform-specific that is provided by the compiler as a means for representing objects. In contrast to objects, which have domain-specific structure and behavior, references do not expose their structure and do not show any activity. We use objects represented by references as if they were directly accessible, and programming is reduced to modeling exclusively objects while references are absolutely passive elements.
But what if we assume that references, like objects, may have arbitrary domain-specific structure and behavior? In short, we get a novel approach, called concept-oriented programming (COP). References in COP are as important as objects because both can modularize domain-specific structure and behavior. The first question here is what advantages we get by using active references. Here are some applications where they could be useful:
- Modeling a domain-specific address space rather than adapting platform-specific surrogates. Indeed, why not use domain-specific references directly in the program to identify its objects? It is simpler and more natural.
- Modeling cross-cutting concerns. An important observation about references is that their functions cross-cut many classes of objects. For example, when a Java object is being accessed, JVM executes one and the same procedure independent of the object class. If references could be modeled from the program then we could use them to execute functions which are common to many object classes.
- Modeling persistent, remote and transactional objects. References could be responsible for implementing intermediate functionality which is specific to these uses. For example, each time an object is about to be accessed, its reference will load its state from persistent storage or send the request to its remote location or start a transaction.
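The persistent-object case from the list above can be emulated with a proxy. This is a hedged Python sketch, not COP syntax: `PersistentRef` is a hypothetical active reference, and the `STORAGE` dictionary stands in for real persistent storage. The reference loads the object state the first time the object is accessed.

```python
from typing import Any, Dict, Optional

# Hypothetical stand-in for persistent storage, keyed by a domain-specific id.
STORAGE: Dict[int, Dict[str, Any]] = {7: {"owner": "Alice", "balance": 100}}


class PersistentRef:
    """An active reference that intercepts access to the object it represents."""

    def __init__(self, key: int) -> None:
        self._key = key
        self._state: Optional[Dict[str, Any]] = None
        self.loads = 0  # counts how often the state was loaded

    def _target(self) -> Dict[str, Any]:
        # Intermediate functionality executed before the object is accessed.
        if self._state is None:
            self._state = dict(STORAGE[self._key])
            self.loads += 1
        return self._state

    def __getattr__(self, name: str) -> Any:
        # Called only when normal attribute lookup fails, i.e. for object fields.
        try:
            return self._target()[name]
        except KeyError:
            raise AttributeError(name) from None


account = PersistentRef(7)
loads_before = account.loads
owner = account.owner      # triggers loading from storage
balance = account.balance  # served from the already loaded state
```

The loading logic lives in the reference and is independent of the class of the object being accessed, which is the cross-cutting property described in the list.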
As programmers, we know that any object has a representative, called a reference. For decades objects have relied exclusively on references provided by the compiler, which in turn used the underlying environment, some kind of standard library or middleware for their generation and management. For example, the compiler could use a global heap where allocated objects are represented by a system-specific memory handle. Objects could be represented by remote references generated by an EJB container like JBoss or WebLogic. In Java, references are generated by the JVM. Other interpreters use their own run-time environment and hence their own specific references. Thus most contemporary programming languages have one common property:
References are not an integral part of the program and cannot be adapted to the purposes of this application and this problem domain.
In other words, in conventional programming languages we are destined to use only standard references and their access mechanisms provided by the available hardware, OS, middleware or library. There is no easy way to develop application-specific references for a concrete domain as an integral part of the program.
The system can then be viewed as divided into two layers:
- The first layer is responsible for generating and managing references and for providing the object access procedure.
- The second layer is the program itself, where these references are used independently of the peculiarities of the first layer.
Almost all existing programming paradigms isolate these two layers so that programming is reduced to the second layer. This means that the programmer can develop objects and their methods but is not able to influence how these objects are represented and how they will be accessed. Concept-oriented programming (COP) is a novel approach to programming which changes this view and allows for describing both layers within one program. COP makes references first-class citizens of the object world while retaining the transparency of access (isolation of the layers). References and objects have equal rights in a concept-oriented program. In particular, COP is based on using a new programming construct, called a concept, which is defined as a couple of two classes: one reference class and one object class. As a consequence, both references and objects have arbitrary domain-specific structure and behaviour. COP also uses the inclusion relation instead of classical inheritance. (Concepts and inclusion generalize classes and inheritance, respectively.)
One of the main advantages of COP is that the programmer can develop domain-specific containers with a virtual address space where objects will live. In other words, the programmer can implement functions which normally belong to hardware, OS, middleware, run-time environment or a library. Another interesting application consists in modeling cross-cutting concerns.
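A container with its own virtual address space can be sketched in ordinary Python. The example below is an illustration under stated assumptions rather than a COP implementation: `Handle` is a hypothetical application-specific reference made of a slot number and a generation counter, and `Container` plays the role of the first layer that generates references and resolves them.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass(frozen=True)
class Handle:
    """Application-specific reference: an address inside one container."""
    slot: int
    generation: int


class Container:
    """Owns the address space; objects live in slots (None marks a free slot)."""

    def __init__(self) -> None:
        self._slots: List[Tuple[int, Any]] = []  # (generation, object)
        self._free: List[int] = []

    def allocate(self, obj: Any) -> Handle:
        if self._free:
            slot = self._free.pop()
            generation = self._slots[slot][0] + 1  # invalidates old handles
            self._slots[slot] = (generation, obj)
        else:
            slot, generation = len(self._slots), 0
            self._slots.append((generation, obj))
        return Handle(slot, generation)

    def resolve(self, handle: Handle) -> Any:
        generation, obj = self._slots[handle.slot]
        if generation != handle.generation or obj is None:
            raise LookupError("stale reference")
        return obj

    def release(self, handle: Handle) -> None:
        self.resolve(handle)  # reject stale handles
        generation, _ = self._slots[handle.slot]
        self._slots[handle.slot] = (generation, None)
        self._free.append(handle.slot)


container = Container()
first = container.allocate("first object")
container.release(first)
second = container.allocate("second object")  # reuses the slot, new generation
```

Because the reference format is defined by the program, the container can detect that `first` is stale even though `second` occupies the same slot, something a platform-provided reference would not reveal.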