Mass Data Operation

table started by robert for the Data World Commons
'Mass Data Operation' tracks the large scale data tasks carried out by the data team.
   
x name x image x Started Operation x Operator x Ended Operation x article
x Television infoboxes 29 Nov 2006 (example)   Nov 29, 2006 tristan Nov 29, 2006
This is an example, not an actual data load. The spreadsheet had 4,353 entries. 31 /tv/shows existed before the load and 4,388 existed after it, so 4,357 were added during load time. I believe that the discrepancy of 4 was end-user additions during the load.
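The counts in the example entry above can be cross-checked with a little arithmetic. The following is a minimal sketch; the function name `reconcile_load` and its parameters are hypothetical and not part of any Freebase tool.

```python
def reconcile_load(source_rows, existing_before, existing_after):
    """Compare a load's source row count with the observed growth in topics.

    Returns (added, discrepancy), where discrepancy is the number of topics
    that appeared during the load but are not accounted for by the source.
    """
    added = existing_after - existing_before
    discrepancy = added - source_rows
    return added, discrepancy


# Figures taken from the example entry above.
added, discrepancy = reconcile_load(
    source_rows=4353, existing_before=31, existing_after=4388
)
print(added, discrepancy)
```

A nonzero discrepancy does not identify its cause; attributing it to end-user additions, as the entry does, is a separate judgment.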
x Location Refactoring 5 January 2007   Jan 5, 2007 colin Jan 5, 2007
Rename "Province" to "Canadian Province"; rename "Census Designated Place" to "US Census Designated Place"; co-type all "US County"; add and co-type "US City/Town" type; add "cities" and "counties" to "US State"; remove "County" types; remove "State"...
x Mass typing (phase 1)          
x Basketball Domain Creation 15 January 2007   Jan 15, 2007 danm Jan 15, 2007
Schemas created by hand and loaded via iLoader utility. Information source is public domain information about NBA players and basketball in general.
x Multiple Domain creation 19 January 2007   Jan 19, 2007 jeff Jan 19, 2007
Schemas for the domains book, geography, meteorology, tv, broadcast, and transportation added; schemas for cheese and beverages added to the food domain. Schemas were uploaded using the iloader tool.
x Created 7bit aliases for compound unicode names 19 Jan 2007   Jan 19, 2007 jamie Jan 20, 2007
Operation reviewed all guids for display names that contained a compound Unicode character. The name was recomputed, dropping any combining characters. Back quotes were also removed during this operation. The "cleaned" name was then added to...
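The cleanup described above can be approximated with Python's standard `unicodedata` module. This is a sketch, not the code used for the operation: the function name `seven_bit_alias` is hypothetical, and the final ASCII check is an assumption based only on the "7bit" wording of the operation's title.

```python
import unicodedata


def seven_bit_alias(name):
    """Return a cleaned alias for name, or None if no usable alias results."""
    # Decompose compound characters into a base character plus combining marks.
    decomposed = unicodedata.normalize("NFD", name)
    # Drop the combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Back quotes are removed as well.
    stripped = stripped.replace("`", "")
    if stripped == name:
        return None  # nothing changed, so no alias is needed
    if not stripped.isascii():
        return None  # assumption: only pure 7-bit results count as aliases
    return stripped


print(seven_bit_alias("Beyoncé"))
```

Letters that have no decomposition, such as "Ł", survive the stripping step, so under the ASCII assumption a name like "Łódź" yields no alias.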
x Typed some Topics for the new domains   Jan 20, 2007 jg Jan 21, 2007
Typed 475 tv_network from Infobox TV channel and 584 from Infobox Network; typed 2094 tv_show from Infobox television; typed 139 /transportation/road from Infobox Interstate; typed 74 cheeses from Template:Cheese; typed 59 teas from Infobox Tea.
x Added webpages for tv networks and companies   Jan 21, 2007 jg Jan 21, 2007
added ~5000 company webpages from Infobox Company
x Added schema to business domain   Jan 26, 2007 jeff Jan 26, 2007
Schema definition added for business domain (including adding stock exchange to the finance domain and the "employment history" property to /common/person) using iloader.
x Added company stock symbols        
Added 1439 stock symbols for US companies from: Template:Nasdaq, Template:Nyse, Template:NYSE, Template:Amex, Template:NASDAQ, Template:AMEX.
x Added film release years and IMDB references via article text extraction   Jan 31, 2007 2:00am jamie Jan 31, 2007 3:30am
Approx 17k films now have release dates
x Typed 36458 Topics via MegaType Process   Feb 1, 2007 9:02:00pm jamie Feb 2, 2007 8:10:00am
Infobox information with natural language verification. 13 id: /american_football/football_coach 48 id: /american_football/football_player 21 id: /aviation/airliner_accident 4033 id: /aviation/airport 2 id: /aviation...
x 484 City/Town given new names or aliases   Feb 5, 2007 11:18:57pm jamie Feb 5, 2007 11:19:32pm
Hand selected names based on USBGN used to rename certain cities or used as aliases. When replacing the name, if the existing name was not a shortened version of the new name and the old name provided useful context, the existing name was moved to...
x Added schemas for education and visual art   Feb 2, 2007 jeff Feb 2, 2007
Uploaded schemas (via iloader) for education and visual art; added adapted and adapted_work types to the common domain.
x Retyped alternate albums as releases        
All instances of /music/album which had the orig_artist property rather than artist, which is to say that they were believed to be releases of other albums, were retyped as instances of /music/release instead, with their properties reset accordingly....
x Added restaurant data   Feb 20, 2007 8:00pm darin Feb 21, 2007 1:00am
Added ~100,000 restaurants and locations, reconciled with existing data; typed 1,100 existing topics as restaurants; created new dining domain for restaurant and cuisine types; created business_chain and retail_location types in the business domain.
x Deleted 4488 duplicate birth dates   Feb 21, 2007 11:00pm jamie Feb 21, 2007 11:40pm
/people/person/date_of_birth is not a unique property. About 4700 people topics have two or more birthdays listed. This is the beginning of an effort to remove duplicate birthdays so /people/person/date_of_birth can be converted to a unique...
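Finding the affected topics amounts to grouping birth-date values by person and keeping the people with more than one value. The sketch below assumes the data is available as (person id, birth date) pairs; the function name, the pair format, and the sample ids are all hypothetical, and it does not show how the actual deletion was carried out.

```python
from collections import defaultdict


def people_with_multiple_birth_dates(assertions):
    """Group birth dates by person and keep people with two or more values.

    assertions is an iterable of (person_id, birth_date) pairs.
    """
    by_person = defaultdict(list)
    for person_id, birth_date in assertions:
        by_person[person_id].append(birth_date)
    return {p: dates for p, dates in by_person.items() if len(dates) >= 2}


# Hypothetical sample data: "a" repeats one value, "c" has two different ones.
sample = [
    ("a", "1950-01-01"),
    ("a", "1950-01-01"),
    ("b", "1960"),
    ("c", "1970-02-03"),
    ("c", "1970"),
]
print(people_with_multiple_birth_dates(sample))
```

Exact repeats (as for "a") can be removed mechanically, while conflicting values (as for "c") need a rule or a human decision about which one to keep.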
x Education mass typing operation   Feb 20, 2007 1:46pm jamie Feb 20, 2007 1:58pm
   (using account wp_typer) 103 id: /education/fraternity_sorority - inserted    2581 id: /education/institution - inserted    2461 id: /education/school - inserted    183 id: /education/school...
x Composer and lyricist properties moved from track to song   Mar 5, 2007 crism Mar 5, 2007
The composer property was added to /music/song. All existing uses of the composer or lyricist properties of /music/track were moved: if the track was known to be a recording of a song, the composer and lyricist property values were moved to the...
x Extended medical schema   Mar 6, 2007   Mar 6, 2007
Extended the medical domain to connect symptoms to diseases, expand information about drugs, and include treatments and medical trials.
x Added US City/Town co-type to cities that were missed the first time   Mar 6, 2007 12:30am colin Mar 6, 2007 12:45am  
x Seinfeld episode infobox   Mar 12, 2007 12:10pm colin Mar 12, 2007 12:25pm
Loaded data from the seinfeld infoboxes.
x Uplift of non-English Latin-character display names   Mar 12, 2007 6:30pm crism Mar 12, 2007 9:07pm
For all topics that had no English display name, but which had display names in other languages, if the foreign display name was at least 80% Latin characters (including unaccented Latin letters as well as Latin letters with eastern or western...
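The 80% test described above can be sketched with the standard `unicodedata` module. This is an illustration under stated assumptions, not the code used for the operation: the function names are hypothetical, a character counts as Latin when its Unicode name starts with "LATIN", and the percentage is taken over letters only (the entry does not say what the denominator was).

```python
import unicodedata


def latin_fraction(name):
    """Fraction of the letters in name that are Latin-script letters."""
    letters = [ch for ch in name if ch.isalpha()]
    if not letters:
        return 0.0
    latin = sum(
        1 for ch in letters if unicodedata.name(ch, "").startswith("LATIN")
    )
    return latin / len(letters)


def qualifies_for_uplift(name, threshold=0.8):
    """True if at least threshold of the letters in name are Latin."""
    return latin_fraction(name) >= threshold


print(qualifies_for_uplift("München"), qualifies_for_uplift("Москва"))
```

Accented letters such as "ü" have names like "LATIN SMALL LETTER U WITH DIAERESIS", so they count as Latin under this test.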
x 94739 Geolocations Added   Mar 19, 2007 10:25:54pm jamie Mar 20, 2007
Initial run of the geocode_bot.  This bot will run nightly to cover new /location/mailing_addresses that have been created.
x Glacier Infoboxes     jeff Mar 20, 2007 12:10pm

Uploaded data from Wikipedia "Infobox Glacier", including glaciers, locations, type, status, and terminus, but not numeric values. Used account mwcl_infobox.
x 14,000 CityTowns   Mar 20, 2007 4:40pm jamie Mar 20, 2007 5:30pm

Using account: mwcl_wpgeo. Non-US locations. Data was cleaned, selected and organized from: http://de.wikipedia.org/wiki/Wikipedia:WikiProjekt_Georeferenzierung/Wikipedia-World/en
x Lake Infobox Load   Mar 21, 2007 9:15am colin Mar 21, 2007 9:17am  
x Artist Infobox Load   Mar 21, 2007 10:00am colin Mar 21, 2007 10:02am  
x Painting Infobox Load   Mar 21, 2007 10:07am colin Mar 21, 2007 12:00am  
x NBA Basketball Teams and Rosters   Mar 21, 2007 1:15pm danm Mar 21, 2007 1:47pm
Added NBA team information such as city, coach, date founded, division, conference, league. This process created one new person and typed an additional 28 people as basketball coaches. Also added current rosters to each team. This process created 10...
x Writer Infobox Loaded   Mar 21, 2007 3:50pm colin Mar 21, 2007 3:55pm  
x Mountain infobox   Mar 29, 2007 4:30pm colin Mar 29, 2007 4:32pm  
x Beer Infobox   Mar 21, 2007 5:40pm colin Mar 21, 2007 5:42pm  
x Brewery infobox   Mar 21, 2007 6:43pm colin Mar 21, 2007 6:45pm  
x 2861 non-US CityTowns   Mar 22, 2007 9:10:51pm jamie Mar 22, 2007 9:12:30pm
Added as user: mwcl_geonames. Pre-reconciled entries from GeoNames.org.
x Import missing MusicBrainz albums and tracks   Mar 23, 2007 1:03am crism Mar 23, 2007 2:00am
Added 16,082 MusicBrainz albums and associated tracks based on MusicBrainz attributes field of {0,100}; two-valued attributes field was skipped during initial load.
x Theater Domain created   Mar 23, 2007 5:42pm jeff Mar 23, 2007 5:42pm
Created the theater domain, and loaded the schema via iloader.
x Capitals from Country Infobox   Mar 23, 2007 1:20pm colin Mar 23, 2007 1:25pm  
x state capitals with us state infobox   Mar 23, 2007 1:50pm colin Mar 23, 2007 1:55pm  
x Imported "Infobox Play" data   Mar 23, 2007 jeff Mar 23, 2007
Imported play and playwright data from WP infoboxes.
x more country capitals   Mar 23, 2007 5:00pm colin Mar 23, 2007 5:05pm  
x 111957 objects typed /location/geocode   Mar 24, 2007 10:00pm jamie Mar 24, 2007 11:00pm
User: geocode_bot. Typed objects that were created during the last geocode_bot, mwcl_geonames, and mwcl_wpgeo operations. These objects were created and given lat/lon properties, but failed to get typed /location/geocode.
x Uploaded Data for Tropical Cyclone Categories   Mar 27, 2007 1:47pm jeff Mar 27, 2007 1:47pm
Uploaded a spreadsheet of tropical cyclone category information, including wind force, corresponding Beaufort numbers, and meteorological services. Data was derived from the Wikipedia page Tropical Cyclone Scales.  Uploaded using the Metaweb...
x Move death-related properties from person to deceased person   Mar 28, 2007 5:26am crism Mar 28, 2007 5:47am
Given 100,000 uses of /people/person/date_of_death, it did not make sense to delete and recreate these property values. Accordingly, the death-related properties of /people/person were carefully moved to /people/deceased_person, taking their hints...
x Musical infoboxes uploaded   Mar 29, 2007 jeff Mar 29, 2007
Uploaded play, composer, lyricist, bookwriter, actors, and directors from Infobox Musical and Infobox Musical 2; did not link actors and directors to performances.  Used mwcl_infobox account.
x Airports & Airport Codes   Mar 29, 2007 7:00pm colin Mar 29, 2007 8:00pm  
x Created chemistry domain     typelibrarian Mar 30, 2007 1:05pm
Chemistry domain and schema created using iloader.
x Typed chemical elements     jeff Mar 30, 2007
Typed all chemical elements and uploaded their CAS ids using mwcl_infobox.
x Baseball Player data from infoboxes   Apr 1, 2007 10:00pm dm_wikipedia_loader Apr 1, 2007 11:00pm
Added 75 baseball players from Wikipedia infobox templates.
x Tennis player data from Wikipedia infoboxes   Apr 2, 2007 1:29pm dm_wikipedia_loader Apr 2, 2007 1:31pm
Added roughly 270 tennis players extracted from Wikipedia infobox templates. Simple load included birth dates and typed instances as person and tennis player.
x Moved "chemist" to chemistry domain     jeff Apr 3, 2007 9:10am
Moved the chemist type from /science to /chemistry.
x Opera domain created     jeff Apr 5, 2007
Created the opera domain and schema via schemaloader.
x Mass typing for opera data     jeff Apr 5, 2007
Mass typed existing Wikipedia articles as opera composer, librettist, opera director, opera company, and opera house.  Used the account mwcl_infobox.
x Music albums merged   Apr 6, 2007 kurt Apr 2008
15285 /music/album instances were merged. 
x Book Infobox   Apr 11, 2007 2:55pm colin Apr 11, 2007 3:10pm  
x Television Episode Infoboxes   Apr 11, 2007 5:25pm colin Apr 11, 2007 5:35pm  
x Wikipedia image import Mona Lisa Apr 12, 2007 7:39:12am jg Apr 20, 2007 7:36:53pm
324560 new images were imported from Wikipedia Commons and Wikipedia en. 373412 topics were given new images. Approximately 84309 topics already had images from prior loads. The name of the image, if provided, is derived from the caption...
x Skyscrapers from around the world   Apr 15, 2007 12:25pm   Apr 2008
User 'robert' did the load using the data pusher.
x Literature awards schema     jeff Apr 15, 2007 1:35pm
Uploaded (via schemaloader) types for literature awards to the book domain.
x Re-typed chemical elements   Apr 17, 2007 jeff Apr 17, 2007
Typed all chemical elements as "chemical element" (again -- data was deleted after last typing) and added CAS IDs, using Metaweb Importer. Used account mwcl_infobox.
All topics in this collection are typed as Mass Data Operation