Frequently Asked Questions
on the Concept-Oriented Query Language (COQL)

Alexandr Savinov
http://conceptoriented.org/

First started: 07.12.2004
Last updated: 15.08.2005

1 Collections

1.1 What are basic elements of COQL?
1.2 What is a collection?
1.3 What is an instance variable?
1.4 What is a difference between instance variables and collection?
1.5 How to use collections as a type of a variable?
1.6 How a collection can be defined?
1.7 How the source collections are specified?
1.8 How syntax of the collection is specified?
1.9 How collection definition is interpreted?

2 Dimensions and Inverse Dimensions

2.1 How item semantics is accessed?
2.2 How dimension values are retrieved?
2.3 What values dimensions return?
2.4 How inverse dimension values are retrieved?
2.5 What is the difference between dimension value and inverse dimension value?
2.6 Can an inverse dimension be used as a source collection?
2.7 How internal collections are used?

3 Restrictions

3.1 How restrictions are specified?
3.2 What if there are more than one source collection?

1 Collections

1.1 What are basic elements of COQL?

The concept-oriented query language (COQL) manipulates collections of items. A concept in the database is a particular case of a collection. The language is intended to provide convenient means for generating custom collections according to various criteria.

1.2 What is a collection?

A collection is a set of items built from other items in the database or from other collections. In contrast to ordinary concepts, collections produced by a query are scoped to that query only. In particular, they are not visible from other collections or from the database concepts. Within one query scope a collection may have its own name. More generally, a collection can be viewed as a dynamic class whose items are the class instances. In particular, collections can be used to specify the type of variables.

1.3 What is an instance variable?

It is a variable that stores one item at a time (by reference). Instance variables are typed by some collection, i.e., they may store only items from one collection.

1.4 What is a difference between instance variables and collection?

Variables reference single items, while collections reference sets of items. A deeper and more subtle difference is that collections can be used as types or classes for instance variables, because instance variables reference items, which live in collections rather than exist in a vacuum.

1.5 How to use collections as a type of a variable?

In contrast to static classes in programming languages, collections are dynamic types or classes, i.e., we can create a collection and then declare variables of that type. Such a variable will reference items from this collection only:

MyCollectionOrConcept myItemReference;

This declaration means that the instance variable myItemReference may store references to items from the collection MyCollectionOrConcept.

1.6 How a collection can be defined?

In order to define a new collection it is necessary to specify two things: the way its items are computed (semantics) and their syntactic structure (syntax). COQL provides means to define collections from other collections or database concepts.

In COQL a collection is defined by means of curly brackets {...} with the concrete semantic specification inside. The COQL curly brackets correspond to the SELECT clause in SQL because both return a set of records. The curly brackets may be followed by angle brackets <...> with the syntactic specification inside. Some elements of the specification are optional. It is important that both constituents - semantics and syntax - are needed to define a collection (as well as any concept).

1.7 How the source collections are specified?

To define a new collection we normally need some source collections or concepts. Their names are specified within curly brackets, e.g.,

MyCollection = {SourceCollection1, SourceCollection2}

Frequently we need an instance variable defined for a source collection. It is declared with a colon (or some keyword like 'in'), e.g.,

MyCollection = {s1:SourceCollection1, s2:SourceCollection2}

With no other information such a collection will return the (Cartesian) product of the source collections.
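
The product semantics can be illustrated outside COQL. The following is a minimal Python sketch, not part of COQL itself; the sample data and the names source_collection1 and source_collection2 are assumptions made for illustration only.

```python
from itertools import product

# Hypothetical sample data standing in for two source collections.
source_collection1 = ["x1", "x2"]
source_collection2 = ["y1", "y2", "y3"]

# Models MyCollection = {s1:SourceCollection1, s2:SourceCollection2}:
# with no restriction, every combination (s1, s2) becomes an item.
my_collection = [
    (s1, s2) for s1, s2 in product(source_collection1, source_collection2)
]

print(len(my_collection))
```

The number of resulting items is the product of the sizes of the source collections, which is why restrictions are normally added in practice.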

1.8 How syntax of the collection is specified?

In the angle brackets we can explicitly provide a list of dimensions for the new collection. These dimensions are normally produced from instance variables, e.g.,

MyCollection = {s:SourceCollection}<s.a, s.b, s.a.c>

We can also specify computed fields in the syntax part:

MyCollection = {s:SourceCollection}<s.a, s.b, s.a+s.b>

Custom names could be assigned by introducing new variables:

MyCollection = {s:SourceCollection}<nameA=s.a, nameB=s.b, sum=s.a+s.b>
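
As a rough analogy, the syntactic part behaves like a projection applied to every source item. The Python sketch below models the last definition under the assumption that a and b are numeric dimensions; the dictionaries and their values are hypothetical sample data, not COQL constructs.

```python
# Hypothetical items of SourceCollection; a and b are assumed to be numeric.
source_collection = [{"a": 1, "b": 2}, {"a": 10, "b": 20}]

# Models MyCollection = {s:SourceCollection}<nameA=s.a, nameB=s.b, sum=s.a+s.b>:
# each source item s yields one new item with three named dimensions.
my_collection = [
    {"nameA": s["a"], "nameB": s["b"], "sum": s["a"] + s["b"]}
    for s in source_collection
]

print(my_collection)
```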

1.9 How collection definition is interpreted?

A collection definition has two flavors: declarative and procedural. From the declarative point of view it defines a set of items. From the procedural point of view it is a loop through all elements of the collection. This loop can be written as follows:

foreach(s in SourceCollection) { /* do something */ }

In SQL the procedural part is almost absent (although it exists in such extensions as T-SQL and PL/SQL). There is always a trade-off between these two interpretations: the declarative approach is simple and convenient for interactive use, while the procedural approach is powerful and provides full control over the database. COQL tries to combine these two interpretations (to the extent possible). For that purpose we have to be able to view the whole collection definition as one unit and at the same time to control all the peculiarities of its syntax and semantics. It is important to understand that in the general case the declarative approach does not allow us to define an arbitrary collection (easily), so at some point we have to switch to the procedural view of the query. The task of any query language in this sense is to make this transition more transparent. The key point here is that each collection is a loop.
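
The two views can be compared in a general-purpose language. The Python sketch below is only an analogy with assumed sample data: a comprehension plays the role of the declarative definition, and an explicit loop plays the role of the procedural one.

```python
# Hypothetical source collection of primitive items.
source_collection = [3, 8, 1]

# Declarative view: the new collection is described as a whole.
declarative = [(s, s * 2) for s in source_collection]

# Procedural view: the same collection is built by a loop in which
# each iteration fixes one item s and "does something" with it.
procedural = []
for s in source_collection:
    procedural.append((s, s * 2))

print(declarative == procedural)
```

Both forms produce the same items; the loop form simply exposes the place where additional per-item logic could be added.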

2 Dimensions and Inverse Dimensions

2.1 How item semantics is accessed?

An item is represented by reference, and in isolation it is semantically empty or primitive. In order to distinguish items semantically we need to get and compare their properties. However, in the concept-oriented approach item properties are other items. Thus, in order to compare items semantically we need to get other items, which are in turn characterized by other items, and so on. This process can be continued until we get some primitive items with semantically interpretable references (normally numbers or text strings).

In the concept-oriented model (COM) there exist two dual methods for accessing item semantics: dimensions and inverse dimensions.

All other operations in COQL are implemented by only using these two access methods normally applied to and producing collections. In other words, COQL provides convenient means for applying these two access methods to collections of items rather than to individual items.

A unique and important feature of COQL is a special operator for producing an inverse dimension from a dimension. This operator inverts the corresponding arrow in the diagram and interprets it in the opposite direction.

2.2 How dimension values are retrieved?

Dimension values are retrieved by specifying the source item (reference) name followed by one or more dimension names, starting from the collection of the source item:

newItem = s.a.b

Here s is an item reference or instance variable, a is a dimension in the source concept of s, and b is the dimension in the domain of a. The result is stored in item reference newItem.
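
Dimension access resembles following object references. The Python sketch below is an analogy under assumed names: the classes S, A and B and the field name are hypothetical and only mirror the chain s.a.b from the example.

```python
from dataclasses import dataclass


@dataclass
class B:
    name: str


@dataclass
class A:
    b: B  # dimension b, whose domain is B


@dataclass
class S:
    a: A  # dimension a, whose domain is A


# One source item with its chain of referenced items.
s = S(a=A(b=B(name="target")))

# Models newItem = s.a.b: each step returns exactly one item reference.
new_item = s.a.b

print(new_item.name)
```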

2.3 What values dimensions return?

Each dimension returns a single item reference, which can be processed in-line or stored in a variable.

2.4 How inverse dimension values are retrieved?

Inverse dimensions are dual to dimensions. In particular, they are directed downward (rather than upward, as dimensions are) and return multiple values rather than a single value. To get a value of an inverse dimension we have to specify an item (reference) name and then, in curly brackets, a normal sequence of dimensions starting from the target subconcept and ending with the concept of this item:

NewCollection = s.{Subconcept.a.b}

Here s is an item reference or instance variable, a is a dimension in the target concept or collection Subconcept, and b is the dimension in the domain of a. The result is stored in collection reference NewCollection.

Alternatively, inverse dimensions can be retrieved as follows:

NewCollection = {Subconcept.a.b=s}

Here we explicitly indicate that each item from the target Subconcept has to reference the source item s via the two consecutive dimensions.

In both cases the curly brackets emphasize that the result is a collection of items rather than a single item. Curly brackets can also be interpreted as an operator for inverting a dimension or arrow, which results in the inverse dimension with the same rank.
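
One way to picture an inverse dimension is as a search for all items that reach a given item through a chain of dimensions. The Python sketch below models {Subconcept.a.b=s} under that reading; the Item class, the inverse function and all sample items are hypothetical and are not part of COQL.

```python
class Item:
    """A hypothetical item; keyword arguments become its dimensions."""

    def __init__(self, name, **dimensions):
        self.name = name
        for dim_name, target in dimensions.items():
            setattr(self, dim_name, target)


s = Item("s")
other = Item("other")
mid1 = Item("mid1", b=s)      # mid1.b references s
mid2 = Item("mid2", b=other)  # mid2.b references another item
subconcept = [Item("x1", a=mid1), Item("x2", a=mid2), Item("x3", a=mid1)]


def inverse(source_item, target_collection):
    # Models {Subconcept.a.b=s}: keep the items whose a.b is the source item.
    return [x for x in target_collection if x.a.b is source_item]


new_collection = inverse(s, subconcept)
names = [x.name for x in new_collection]
print(names)
```

Items are compared by identity (is) because the text describes items as being represented by reference.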

2.5 What is the difference between dimension value and inverse dimension value?

A dimension returns a single item stored in an item reference, while an inverse dimension returns multiple items stored in a collection reference.

2.6 Can an inverse dimension be used as a source collection?

Yes, there is no difference between database concepts and collections produced by queries. In particular, inverse dimensions can be used as source collections of queries. The only specific feature is that inverse dimensions are associated with some concrete item (the current value of an instance variable), which has to be specified in the external context. (In the case of concepts such an item is formally the top item, which is the parent for all other items in the database.)

2.7 How internal collections are used?

Each collection has a syntactic and semantic definition of its items. Additionally, this definition has a procedural semantics as a loop over the collection items. In other words, it is supposed that the collection is built by iterating through all possible items and selecting those satisfying the restriction. On each step of the iteration one item is fixed and its reference is available for further use. In particular, it is possible to produce internal collections associated with this item:

MyCollection = {s:SourceCollection | size({Subconcept.dim=s})<5 }

or equivalently

MyCollection = {s:SourceCollection | size(s.{Subconcept.dim})<5 }

Here it is supposed that Subconcept has a dimension dim with the domain in SourceCollection. Thus the internal collection {Subconcept.dim=s} contains the group of subitems from Subconcept belonging to the current item s. As the instance variable s changes its value, the internal collection is built again and contains another group. The aggregation function size returns the cardinality of this group, and the condition on the size of the group is analogous to the HAVING clause in SQL.
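
The rebuilding of the internal collection for each value of s can be sketched in Python. This is only an illustration with assumed data: items are modeled as strings and dictionaries, references are compared by value, and the helper internal_collection is a hypothetical name.

```python
# Hypothetical source items and subitems; "dim" references a source item.
source_collection = ["s1", "s2"]
subconcept = [{"id": i, "dim": "s1"} for i in range(6)] + [
    {"id": 6, "dim": "s2"},
    {"id": 7, "dim": "s2"},
]


def internal_collection(s):
    # Models {Subconcept.dim=s}: the group of subitems referencing s.
    return [x for x in subconcept if x["dim"] == s]


# Models MyCollection = {s:SourceCollection | size({Subconcept.dim=s})<5 }:
# the internal collection is rebuilt for every value of s.
my_collection = [s for s in source_collection if len(internal_collection(s)) < 5]

print(my_collection)
```

In this sample s1 has six subitems and is rejected, while s2 has two and is kept.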

3 Restrictions

3.1 How restrictions are specified?

Restrictions can be imposed in different ways. The simplest way is to specify them after the bar symbol as a predicate, i.e., a condition on various properties:

MyCollection = {s:SourceCollection | s.a<5 && s.b>10}

Thus before the bar symbol we specify all the potential source items, while after the bar we provide the criteria for selecting them. This is analogous to the WHERE clause in SQL.

The predicate is evaluated on each iteration, i.e., for each potential new item. As usual, in practice this process is optimized, so it does not mean that each source item is actually built or retrieved before the predicate is evaluated.
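
A restriction of this kind corresponds to a filter condition evaluated once per candidate item. The Python sketch below models {s:SourceCollection | s.a<5 && s.b>10} with assumed sample data; it shows only the logical result and says nothing about how a real COQL engine would optimize the evaluation.

```python
# Hypothetical items of SourceCollection with numeric dimensions a and b.
source_collection = [
    {"a": 1, "b": 20},
    {"a": 7, "b": 30},
    {"a": 2, "b": 5},
]

# The part before "if" lists the candidate items; the part after it is the
# predicate, evaluated for each candidate.
my_collection = [s for s in source_collection if s["a"] < 5 and s["b"] > 10]

print(my_collection)
```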

3.2 What if there are more than one source collection?

If there are several source collections (not necessarily concepts from the database, but any existing collection or a collection produced in this query) then the predicate is evaluated for each combination of their items. In other words, the set of all potentially valid items is the product of the source collections. This product is returned as the result collection if no restrictions are imposed. Otherwise, only combinations satisfying the restrictions are included in the result collection:

MyCollection = {s1:SourceCollection1, s2:SourceCollection2 | s1.a<5 && s2.b>10}

In the case of many source collections we can also produce JOIN-like queries:

MyCollection = {s1:SourceCollection1, s2:SourceCollection2 | s1.id = s2.id}

This method simulates the join operation of relational databases. Yet in the concept-oriented model such an operation is a secondary means because items are supposed to be linked in the model in a natural manner. In other words, in the relational model join is one of the primary mechanisms, while in the concept-oriented model it is used as a secondary facility only in situations where the model does not reflect the corresponding relationship. Since any well-designed concept-oriented model should specify all main structural connections in its schema, the join is used only for custom (unusual) queries.
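
The JOIN-like query can be pictured as a restricted product. The Python sketch below models {s1:SourceCollection1, s2:SourceCollection2 | s1.id = s2.id} using assumed sample data; the extra fields x and y are hypothetical and serve only to make the matched pairs visible.

```python
from itertools import product

# Hypothetical source collections that share an "id" property.
source_collection1 = [{"id": 1, "x": "p"}, {"id": 2, "x": "q"}]
source_collection2 = [
    {"id": 2, "y": "u"},
    {"id": 3, "y": "v"},
    {"id": 2, "y": "w"},
]

# Every combination (s1, s2) is a candidate; the predicate keeps only
# those combinations whose id values are equal.
my_collection = [
    (s1, s2)
    for s1, s2 in product(source_collection1, source_collection2)
    if s1["id"] == s2["id"]
]

pairs = [(s1["x"], s2["y"]) for s1, s2 in my_collection]
print(pairs)
```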

 
