Discussions on Sidebar
-
-
Having thought about the Metaweb/Freebase concept for a while (glad to see some people smarter than I am implemented it :)), I was always concerned with the definitions of data ownership. In particular, I'm coming at it from the perspective of "I want all data to be free". And I'm assuming the Creative Commons license works for that from the sharing perspective. The question I have, though, is: What data can I upload? For instance, here's a list of potential data items (along with groups that may feel they "own" the data):
1) All US ZIP codes and their corresponding GIS location-related data (polygons or even grid-based information on region mappings) - The US Postal Service
2) Historical stock price movements - NYSE, NASDAQ, etc.
3) Privately sponsored research data coming from public universities - Sponsor (drug/defense contractor/etc.)
Now I can make the argument that I could look through old newspapers (or Yahoo! Finance) and get historical stock quotes, and hell, my OpenBSD system comes with a default ZIP-to-City,State mapping file... but where does the line get drawn? I feel like at a bare minimum, any output of government should be in the public domain (and therefore uploadable), but even there things get interesting. And personally, I feel like anything that is published in a mass media outlet should be fair game (which would include at least daily market information).
Anyways, it's a big question, and I'm assuming you've thought it through... any guidelines? Since most people won't just be uploading personal information, where do you draw the line on what information can truly be free?
Cheers,
Caleb
-
The law on this subject is pretty complex. While a single fact is not copyrightable, a collection of facts can be. Also, social constraints are sometimes more stringent than legal ones. For instance, we try to use data only from people who want to share it, not just data that we can legally use. Also, we always attribute our sources, even if there is no legal requirement to do so.
-
Actually, the concept of a collection of facts is not, in and of itself, necessarily dispositive. From the standpoint of copyright law, the copyright interest arises in the specific expression or representation of that collection of facts. Another dimension to the question, I think, is the point at which storing, serving, and accessing the data as a "publication" becomes a commercial use or transaction: the aggregator's work, once "converted" by a third party posting it to Freebase, could become a commercial use by a member who uploads it for a commercial purpose. Freebase itself is not engaged in commercial distribution; however, a member's use could be. I recognize this thread is rather long in the tooth, dating back to March; however, in the spirit of supporting the goals of the service, it might serve the member community to develop reference materials or guidelines for evaluating possible gray areas or areas of potential risk and exposure when it comes to possibly protected data.
-
breiter, I've been poking around for a while looking for just such guidelines. Seems to me that we should at least cover some of the obvious "this is okay" cases. For example, I've been toying with the idea of importing complete Fraggle Rock episode lists, but I'd have to get those lists from somewhere (probably the muppets wiki over at wikia), but I don't want to metawebify their content if that's not okay. And to my layman self, it's not clear if that's a protected collection of facts.
-
sblom, one thing I've heard posited is that if you get the facts from two or more sources that agree with each other, then they're not copyright(able). For instance, if one-and-only-one website has a list of Fraggle Rock eps, then it might be a copyrighted collection. However, if that same list exists on the Muppets Wikia, Wikipedia, and IMDB (which I assume it does), and you look at all three of them, find they agree, and then input the data, then those facts probably aren't copyrighted, but are merely facts.
Furthermore, Wikipedia is GFDL, and Wikia requires an open license too (GFDL or CC-BY-SA, IIRC). We extract a great deal of data from Wikipedia under GFDL, as I'm sure you know. Here is the List of Fraggle Rock episodes from Wikipedia.
-
-
-
I was adding to the entry for Isaac Newton. It did not have his birthplace.
So I entered "Woolsthorpe-by-Colsterworth, Lincolnshire, England" not knowing about all the subtypes? of location you have.
Actually there does not seem to be a subtype that fits that.
My expectation was that the system would parse through what I entered, realize it was a "Hierarchical Location", and do something like: "Aha! A new location 'Woolsthorpe-by-Colsterworth' that we are told is in 'Lincolnshire', which we have, and the user means the one that is in 'England', which we have."
Instead, there now seems to be a monolithic type "Woolsthorpe-by-Colsterworth, Lincolnshire, England" created. That's not what I expected or intended.
-- Mike Berrow
-
I guess I actually mean something like "monolithic location type instance" in that last sentence.
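For illustration, the behaviour Mike expected could be sketched roughly as follows. This is only a guess at the idea, not how Freebase works: `Place`, `resolve_location`, and the `known` lookup table are hypothetical names, and matching existing places by name alone is a simplifying assumption (a real system would need to disambiguate places that share a name).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Place:
    """A hypothetical location record that points at its containing place."""
    name: str
    parent: Optional["Place"] = None


def resolve_location(text: str, known: Dict[str, Place]) -> Place:
    """Split 'A, B, C' into parts and link each part to the one after it.

    Parts already present in `known` are reused; missing parts are created
    and attached to the broader place that follows them in the text.
    """
    parts = [p.strip() for p in text.split(",") if p.strip()]
    if not parts:
        raise ValueError("empty location")

    parent: Optional[Place] = None
    # Walk from the broadest place (last part) to the most specific (first).
    for name in reversed(parts):
        place = known.get(name)
        if place is None:
            place = Place(name, parent)
            known[name] = place
        parent = place
    return parent


# Example: England and Lincolnshire already exist, the village does not.
england = Place("England")
lincolnshire = Place("Lincolnshire", england)
known = {"England": england, "Lincolnshire": lincolnshire}
birthplace = resolve_location(
    "Woolsthorpe-by-Colsterworth, Lincolnshire, England", known
)
```

Under these assumptions, only the village is created as a new record, and it is linked to the existing Lincolnshire and England entries rather than stored as one monolithic string.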
-
You must type fast. :) There's already a "Woolsthorpe-by-Colsterworth" in Freebase typed as a Location. The idea is that a user would start typing "woolsth..." and auto-complete would prompt with what's already in the system, including "Woolsthorpe-by-Colsterworth". This, by the way, would fit the Type "City/Town" in addition to the more generic Location type it has acquired.
Non-US cities used to be named with the city name only (e.g. "Paris"), but a recent improvement appended country names to foreign city names for clarification (e.g. "Paris, France"). My guess is that because "Woolsthorpe-by-Colsterworth" was only typed as Location but not City/Town, it was omitted in the process that examined cities. I'll type it City/Town right now, and add "England" to the end, following the "city, country" convention we have for non-US cities.
FYI, US cities are displayed as "city, state" (e.g. "Paris, Arkansas") and co-typed "US City/Town" (in addition to Location and City/Town).
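As a rough sketch of the naming and typing convention just described: the function names and the use of the string "United States" to detect US cities are assumptions made for this example, not part of any Freebase API. The type names are the ones mentioned in this thread.

```python
from typing import List, Optional


def display_name(city: str, country: str, us_state: Optional[str] = None) -> str:
    """Return 'city, state' for US cities and 'city, country' otherwise."""
    if country == "United States":
        if us_state is None:
            raise ValueError("a US city needs a state for its display name")
        return f"{city}, {us_state}"
    return f"{city}, {country}"


def city_types(country: str) -> List[str]:
    """Return the types a city would carry under the convention above."""
    types = ["Location", "City/Town"]
    if country == "United States":
        types.append("US City/Town")
    return types
```

For example, `display_name("Paris", "France")` gives "Paris, France", while `display_name("Paris", "United States", "Arkansas")` gives "Paris, Arkansas".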
You can find more information on the various Location types at the Location Domain:
http://www.freebase.com/view/domain?id=/location
-
Ah! That's it. I actually don't type *that* fast. But I did a cut/paste from Wikipedia, which amounts to the same thing.
-
-
-
I'd love to know what to do about merging two topics that are actually one topic. Like this one:
http://www.freebase.com/view?id=/wikipedia/en_id/10576
http://www.freebase.com/view?id=/user/metaweb/datasource/MusicBrainz/2027f439-a29c-4d28-9b97-c05b350b18da
-
See this help topic: http://www.freebase.com/view/helptopic?id=%239202a8c04000641f8000000003c1b1e8
-
-