Basic Concepts
If you have never worked with Freebase data before, this section covers the basic data-modeling concepts that you cannot afford to miss.
Topics
A lot of data in Freebase corresponds to information you might find on Wikipedia. Corresponding to a Wikipedia article is a Freebase topic (as in "topic of discourse"). For example, the Wikipedia article on Bob Dylan corresponds to the Freebase topic on Bob Dylan. Of course, Freebase also contains many topics that have no counterpart in Wikipedia.
The term "topic" was chosen purposely for its vagueness because there are all kinds of topics. Topics can range from
- physical entities, e.g., Bob Dylan, the Louvre Museum, the planet Saturn, to
- artistic/media creations, e.g., The Dark Knight (film), Hotel California (song), to
- classifications, e.g., noble gas, Chordate, to
- abstract concepts, e.g., love, to
- schools of thought or artistic movements, e.g., Impressionism.
Some topics are important because they hold a lot of data (e.g., Wal-Mart), and some are important because they link to many other topics, potentially in different domains of information. For example, because such abstract topics as love, poverty, chivalry, etc. can be book subjects, poetry subjects, film subjects, and so forth, given a book, we can find poems and films about the same subject(s) as the book.
Types and Properties
As we mention in the introduction of this Getting Started guide, there can be many aspects to the same topic:
- Bob Dylan was a song writer, singer, performer, book author, and film actor;
- Leonardo da Vinci was a painter, a sculptor, an anatomist, an architect, an engineer, ...;
- Love is a book subject, film subject, play subject, poetry subject, ...;
- Any city is a location, potentially a tourist destination, and an employer of civil servants.
In order to capture this multi-faceted nature of many topics, we introduce the concept of types in Freebase. The topic about Bob Dylan is assigned several types: the song writer type, the music composer type, the music artist (singer) type, the book author type, etc. Each type carries a different set of properties germane to that type. For example,
- the music artist type contains a property that lists all the albums that Bob Dylan has produced, as well as a property for all the musical instruments he is known to play;
- the book author type contains a property that lists all the books Bob Dylan has written or edited, as well as one for his school of thought or literary movement;
- the company type contains many properties for listing a company's founders, board members, parent company, divisions, employees, products, year-by-year revenue and profit records, etc.
Thus, a type can be thought of as a conceptual container of properties that are most commonly needed for describing a particular aspect of information. (You can think of a type as analogous to a relational table, and each "type" table has a foreign key into the one "identity" table that uniquely defines each topic.)
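As a rough illustration of the table analogy above, the following Python sketch models a topic as one identity record plus a separate bundle of properties for each assigned type. The class, its method names, and the property names (album, instruments, works_written) are hypothetical, and the type IDs /music/artist and /book/author are assumed here for illustration only. This is not how Freebase stores data internally.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """One row in the hypothetical 'identity' table: a single topic."""
    name: str
    # Maps a type ID to the property values recorded under that type,
    # like one row per topic in each per-type table.
    types: dict = field(default_factory=dict)

    def assign_type(self, type_id, **properties):
        """Assign a type to this topic, with that type's property values."""
        self.types[type_id] = dict(properties)

    def get(self, type_id, prop):
        """Return a property value, or None if the type or property is absent."""
        return self.types.get(type_id, {}).get(prop)


# Illustrative data; type IDs and property names are assumptions.
dylan = Topic("Bob Dylan")
dylan.assign_type(
    "/music/artist",
    album=["Blonde on Blonde"],
    instruments=["Guitar", "Harmonica"],
)
dylan.assign_type("/book/author", works_written=["Chronicles: Volume One"])

print(sorted(dylan.types))
print(dylan.get("/music/artist", "instruments"))
```

Each assigned type contributes its own set of properties, while the topic itself stays a single identity that all of them point to.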
Hands-on exercise:
- See all types assigned to the Bob Dylan topic and the properties available for those types using the Edit view
- See all properties available for the book author type using the Schemas Explorer
Domains and IDs
Just as properties are grouped into types, types themselves are grouped into domains. Think of domains as the sections in your favorite newspaper: Business, Life Style, Arts and Entertainment, Politics, Economics, etc. Each domain is given an ID (identifier), e.g.,
- /business is the ID of the Business domain
- /music - the Music domain
- /film - the Film domain
- /medicine - the Medicine domain
- etc.
The ID of a domain looks like a file path, or a path in a web address.
Each type is also given an ID, and its ID is based on the domain in which it belongs. For example, the Company type belongs in the Business domain, and it's given the ID /business/company. Here are some other examples:
- /music/album is the ID of the (Music) Album type, belonging in the Music domain
- /film/actor - the Actor type in the Film domain
- /medicine/disease - the Disease type in the Medicine domain
Hands-on exercise:
- See all domains using the Schemas Explorer
- See the types available in the Business domain
Just as a type inherits the beginning of its ID from its domain, a property also inherits the beginning of its ID from the type it belongs to. For example, the Industry property of the Company type (used for specifying which industry a company is in) is given the ID /business/company/industry. Here are some other examples:
- /automotive/engine/horsepower is the ID of the Horsepower property of the (Automotive) Engine type
- /astronomy/star/planet_s is the ID of the Planets property of the Star type (used for listing planets around a star)
- /language/human_language/writing_system is the ID of the Writing System property of the Human Language type
Thus, domains, types, and properties are given IDs conceptually arranged in a file directory-like hierarchy.
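The hierarchy described above can be sketched in a few lines of Python. The function name describe_schema_id is hypothetical, and the sketch assumes schema IDs have at most three levels (domain, type, property), as in the examples in this section. It is meant only for schema IDs; it cannot tell a domain apart from other kinds of namespaces.

```python
LEVELS = ("domain", "type", "property")


def describe_schema_id(schema_id):
    """Map each level of a schema ID to the ID prefix that names it.

    Assumes at most three slash-separated keys: domain, type, property.
    """
    if not schema_id.startswith("/"):
        raise ValueError("a schema ID must start with '/'")
    keys = schema_id[1:].split("/")
    if len(keys) > len(LEVELS) or any(not key for key in keys):
        raise ValueError(f"not a domain, type, or property ID: {schema_id!r}")
    # A property ID starts with its type's ID, which starts with its domain's ID.
    return {
        level: "/" + "/".join(keys[: depth + 1])
        for depth, level in enumerate(LEVELS[: len(keys)])
    }


print(describe_schema_id("/business/company/industry"))
# {'domain': '/business', 'type': '/business/company', 'property': '/business/company/industry'}
```

Reading the result from left to right shows how each ID inherits its beginning from the level above it.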
Namespaces, Keys, and Topic IDs
The file directory-like hierarchy of domain, type, and property IDs is just one application of a more general concept: namespaces and keys. A namespace is like a file directory, and a key is like a file name. Just as all file names within a particular file directory must be unique among themselves, all keys within a particular namespace must also be unique among themselves.
As a more specific example, /business is the namespace corresponding to the Business domain. Within it, Business-related types are given keys (e.g., company) that are unique among themselves. Each type's ID is formed by appending its key to the namespace's ID (e.g., /business/company).
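To make the directory analogy concrete, here is a minimal Python sketch of a namespace that enforces key uniqueness and builds IDs by appending a key to the namespace's own ID. The Namespace class and its methods are hypothetical names invented for this illustration, not part of any Freebase API.

```python
class Namespace:
    """A toy namespace: a directory-like container of unique keys."""

    def __init__(self, namespace_id):
        self.id = namespace_id
        self._entries = {}  # key -> whatever the key names

    def add_key(self, key, target):
        """Register a key and return the full ID it forms."""
        if key in self._entries:
            # Keys must be unique within one namespace, like file names
            # within one directory.
            raise ValueError(f"key {key!r} already exists in {self.id}")
        self._entries[key] = target
        return f"{self.id}/{key}"

    def resolve(self, key):
        """Return whatever the key names in this namespace."""
        return self._entries[key]


business = Namespace("/business")
company_id = business.add_key("company", "the Company type")
print(company_id)
print(business.resolve("company"))
```

The same key may appear in a different namespace without conflict, because uniqueness is only required among the keys of a single namespace.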
There are several kinds of namespaces besides those that correspond to domains and types. The most important and most frequently encountered is the /en namespace. This is the English namespace, in which most well-known topics are given unique keys to form human-readable English IDs. For example, the prolific Bob Dylan is so well known that his topic in Freebase is given the key bob_dylan in the /en namespace, and so the topic's ID is /en/bob_dylan. This ID allows you to access his topic on Freebase.com with a simple URL.
We will continue this discussion in depth in the Namespaces and Keys section.
Topic GUIDs
While a topic might or might not be identifiable by namespace/key IDs, it can always be identified with a GUID (Globally Unique Identifier), written as 32 hexadecimal digits following a #. For example, the GUID of the Gone With the Wind movie is #9202a8c04000641f800000000081e23b. Each topic has exactly one GUID, and each GUID maps to exactly one topic.
So that GUIDs and namespace/key IDs can be used in a uniform manner, Freebase APIs can understand the virtual namespace /guid. That is, if you were to ask Freebase for the topic with the ID /guid/9202a8c04000641f800000000081e23b, then Freebase understands that you want the topic with the GUID #9202a8c04000641f800000000081e23b. For example, if you were to navigate your browser to
http://www.freebase.com/view/guid/9202a8c04000641f800000000081e23b
then Freebase would recognize the virtual /guid namespace, identify the topic with that GUID, try to find an /en key for the topic, and redirect you to a more human-readable URL.
Thus, for all practical purposes, you can think of /guid/9202a8c04000641f800000000081e23b as an ID composed of the key 9202a8c04000641f800000000081e23b in the namespace /guid.
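The mapping between the two notations is mechanical, as the following Python sketch shows. The function names guid_to_id and id_to_guid are hypothetical, and the sketch assumes GUIDs are written with lowercase hexadecimal digits, as in the example above.

```python
import re

# "#" followed by exactly 32 hexadecimal digits (lowercase assumed).
GUID_PATTERN = re.compile(r"#([0-9a-f]{32})")


def guid_to_id(guid):
    """Turn '#<32 hex digits>' into an ID in the virtual /guid namespace."""
    match = GUID_PATTERN.fullmatch(guid)
    if match is None:
        raise ValueError(f"not a GUID: {guid!r}")
    return "/guid/" + match.group(1)


def id_to_guid(topic_id):
    """Turn '/guid/<32 hex digits>' back into '#<32 hex digits>'."""
    prefix = "/guid/"
    if not topic_id.startswith(prefix):
        raise ValueError(f"not in the /guid namespace: {topic_id!r}")
    guid = "#" + topic_id[len(prefix):]
    if GUID_PATTERN.fullmatch(guid) is None:
        raise ValueError(f"malformed key in /guid namespace: {topic_id!r}")
    return guid


print(guid_to_id("#9202a8c04000641f800000000081e23b"))
print(id_to_guid("/guid/9202a8c04000641f800000000081e23b"))
```

Because the conversion only swaps the leading # for the /guid/ prefix, a round trip through both functions returns the original value.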
More on Properties
The last basic concept to discuss is a major difference between Freebase properties and their analogue in relational database technologies, namely relational table columns. For each row, a relational table column can hold only one value. For example, consider a typical "book" relational table with a column named "author". For each row in the "book" table, the "author" column can hold only one foreign key into an "author" table. If a book happens to have several authors, then this simple relational schema design does not work, and we would have to create a new table to model the authorships. That is, we would need one "book" table, one "author" table, and one "authorship" table to store the n-to-n relationships between books and authors. The way you retrieve data also changes quite radically as you switch from one schema design to the other.
In contrast with conventional database technologies, Freebase considers multi-value properties to be so desirable in modeling real-life data that it supports them by default. That is, when the /book/written_work/author property was created, it was assumed to allow for multiple authors per book, and you can query a multi-value property and a single-value property in exactly the same way. There is no need to consider whether you must join through a third table that models the n-to-n relationship.
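The practical effect can be illustrated with a small Python sketch in which a property always holds a list of values, so the same lookup works for one author or several. The dictionary layout and the get_values helper are assumptions made for this illustration; they do not represent Freebase's storage or its query language.

```python
AUTHOR = "/book/written_work/author"


def get_values(topic, property_id):
    """Return a property's values as a list, whether it holds zero, one, or many."""
    return list(topic.get(property_id, []))


# Illustrative topics: every property value is stored as a list.
single_author_book = {"name": "To Kill a Mockingbird", AUTHOR: ["Harper Lee"]}
multi_author_book = {"name": "Good Omens", AUTHOR: ["Neil Gaiman", "Terry Pratchett"]}

# The same call serves both books; no separate "authorship" table or join is needed.
for book in (single_author_book, multi_author_book):
    print(book["name"], "->", get_values(book, AUTHOR))
```

Adding a second author to a book changes only the data, not the shape of the lookup.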
Summary
- A type is a conceptual container of related properties commonly needed to describe a certain aspect of a topic.
- A topic can be assigned one or more types (the default type being /common/topic).
- As properties are grouped into types, types are grouped into domains.
- Domains, types, and properties are given IDs in a namespace/key hierarchy.
- Common well-known topics are given IDs in the /en namespace, which are human-readable English strings.
- Topics are uniquely identified within Freebase by GUIDs.
- Properties are multi-value by default, and multi-value properties and single-value properties can be queried in the same way.