Complex Values and Relationships

Compound Value Types

So far, the property values you have seen are mostly strings and numbers, which seem quite different from topics: topics are typed and have identities (GUIDs or namespace/key IDs). However, property values are not always bare strings and numbers. Consider the "population" property of a city like San Francisco. We might start out with a simple string property value, as in the top left model in this diagram:

It soon becomes obvious that the population changes over time, so to record population figures over several years, we need to record the year corresponding to each number. The top left model is therefore upgraded to the top right model, and adding another year's population figure gives the bottom model. The diamonds are still nodes within the Freebase graph, so they are typed and given GUIDs, but they are not considered topics of discourse, meaning that they are not typed /common/topic. Their only function is to hold the pairings of year and population number. The types of such nodes (/measurement_unit/dated_integer in this example) are called Compound Value Types, or CVTs for short. Many of them are listed at the bottom of this Schemas Explorer page.
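The upgraded model can be sketched in Python as a plain in-memory structure. This is only an illustration of the idea, not Freebase's storage format or API: the class names, the `population_in` helper, and the field names `year` and `number` are assumptions made for this sketch, and the population figures are made-up placeholders rather than census data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatedInteger:
    """Hypothetical stand-in for a /measurement_unit/dated_integer CVT node.

    It has no meaning on its own; it only pairs a year with a number.
    """
    year: int
    number: int


@dataclass
class City:
    """Hypothetical stand-in for a city topic."""
    name: str
    # The property now points at CVT nodes instead of a single bare value.
    population: List[DatedInteger] = field(default_factory=list)

    def population_in(self, year: int) -> Optional[int]:
        """Return the figure recorded for `year`, or None if there is none."""
        for entry in self.population:
            if entry.year == year:
                return entry.number
        return None


sf = City("San Francisco")
# Placeholder figures for illustration only, not real census numbers.
sf.population.append(DatedInteger(year=2000, number=700_000))
sf.population.append(DatedInteger(year=2010, number=800_000))
```

Adding another year's figure is just another `DatedInteger` appended to the list, which mirrors how the bottom model in the diagram grows from the top right one.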

You might notice that modeling the population with the /measurement_unit/dated_integer type is still an incomplete solution. For example, population figures collected by censuses are not perfectly accurate, and as an improvement we could introduce yet another level of CVTs to capture the inaccuracy, as follows:

However, the pursuit of data modeling perfection can easily lead to extremely detailed models that are hard to understand, use, and maintain, and yet serve very few realistic use cases. A balance must be struck somewhere, and for the population property it has been decided that accuracy won't be recorded. The property can be changed in the future if enough compelling use cases call for more detailed modeling.

Mediator Types

Compound value types are used for capturing additional properties on seemingly simple property values. In the same vein, there needs to be something for capturing additional properties on seemingly simple relationships between topics. Consider the "starring" relationship between films and actors, such as between The Dark Knight and Heath Ledger. We might start with the simple model at the top of the following diagram:

As soon as we want to indicate which character Heath Ledger played, we need to introduce another graph node to capture the pairing between the actor and the character. This gives the model at the bottom of the diagram. The types of such nodes (/film/performance in this example) are called mediator types, because they mediate between the source and the target of the original, simple relationship.
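The difference between the simple relationship and the mediated one can be sketched as follows. This is a minimal illustration under assumed names: the `Topic` and `Performance` classes and their fields are hypothetical stand-ins for graph nodes, not the actual Freebase schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    """Hypothetical stand-in for a node typed /common/topic."""
    name: str


@dataclass(frozen=True)
class Performance:
    """Hypothetical stand-in for a /film/performance mediator node.

    It sits between the film and the actor and carries the extra
    information (the character) that a direct edge cannot hold.
    """
    film: Topic
    actor: Topic
    character: Topic


dark_knight = Topic("The Dark Knight")
ledger = Topic("Heath Ledger")
joker = Topic("The Joker")

# Simple model: a direct "starring" edge from the film to the actor.
simple_edge = (dark_knight, "starring", ledger)

# Mediated model: the film and the actor are linked through a Performance.
performance = Performance(film=dark_knight, actor=ledger, character=joker)
```

The direct edge has nowhere to record the character, whereas the mediator node can gain further fields later without changing the film or actor topics.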

The /film/performance mediator type allows us to model more interesting cases, such as an actor playing several characters in the same movie:

or the same character being played by several different actors in different movies:
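Both cases fall out naturally once each performance is its own record. The sketch below is illustrative only: the `Performance` tuple and the two helper functions are hypothetical, and the Dr. Strangelove and Batman entries are examples chosen for this sketch, not necessarily the ones shown in the diagrams.

```python
from collections import defaultdict
from typing import NamedTuple


class Performance(NamedTuple):
    """Hypothetical flat record standing in for a /film/performance node."""
    film: str
    actor: str
    character: str


performances = [
    # One actor, several characters, same film.
    Performance("Dr. Strangelove", "Peter Sellers", "Group Captain Lionel Mandrake"),
    Performance("Dr. Strangelove", "Peter Sellers", "President Merkin Muffley"),
    Performance("Dr. Strangelove", "Peter Sellers", "Dr. Strangelove"),
    # One character, different actors, different films.
    Performance("The Dark Knight", "Heath Ledger", "The Joker"),
    Performance("Batman", "Jack Nicholson", "The Joker"),
]


def characters_by_actor_in_film(perfs):
    """Group character names by (film, actor) pair."""
    grouped = defaultdict(list)
    for p in perfs:
        grouped[(p.film, p.actor)].append(p.character)
    return dict(grouped)


def actors_by_character(perfs):
    """Group (actor, film) pairs by character name."""
    grouped = defaultdict(list)
    for p in perfs:
        grouped[p.character].append((p.actor, p.film))
    return dict(grouped)
```

For instance, `actors_by_character(performances)["The Joker"]` yields both the Heath Ledger and the Jack Nicholson entries, something a single "starring" edge per film could not express.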

Note that at the time of writing, /film/performance nodes are not considered topics of discourse, meaning that they are not also typed /common/topic. However, it seems reasonable that in the future we might want to discuss a particular performance. For example, we might note that Heath Ledger's performance as The Joker surpassed even Jack Nicholson's performance as the same character. Here we compare only their performances of that character; we are not comparing them as actors in general. Modeling performances also allows us to say, for example, that an actor tends to perform better as a protagonist than as an antagonist, which is more detailed than a sweeping statement that the actor is good or bad without any specifics. Fine-grained data modeling allows for fine-grained insights.