Wikidata Game: The Reference Treasure Hunt Dashboard

The following tables provide statistics on the decisions made in The Wikidata Game to deliver reference sets. The Item x Property x Value Dataset is probably the most relevant table reported here: for each item, each property, and each value extracted for an item x property pair, it reports the decisions to accept or reject the suggested value. The criterion used to determine whether a proposed value is accepted is defined in this Phab ticket: (1) a proposed value can be accepted only if it was reviewed by at least five (5) decision makers in the game, and (2) a proposed value is accepted if it has at least a 95% acceptance rate. The second condition implies that any proposed value with up to 19 decisions must be accepted unanimously, since a single rejection among 19 decisions leaves an acceptance rate of 18/19, or about 94.7%. The is_accepted column in the Item x Property x Value Dataset is based on this criterion. For clarity, very long values in the Extracted Data (JSON) column of this table are truncated, as indicated by (...truncated...). The complete dataset, with non-truncated values, can be downloaded as a .CSV file from the bottom of the table.
The remaining tables provide aggregated statistics per Wikidata property and per datatype.
Note. We do not yet process Wikidata external identifiers separately because we do not have enough decisions on them.
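
The two-part acceptance criterion described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the dashboard's actual code; the function name and parameter names are assumptions made for this example.

```python
def is_accepted(accepted: int, rejected: int,
                min_decisions: int = 5, min_percent: int = 95) -> bool:
    """Apply the two stated conditions to one proposed value.

    Hypothetical helper: the names and defaults are illustrative and
    are not taken from the dashboard's source code.
    """
    total = accepted + rejected
    # Condition (1): at least five decisions must have been made.
    if total < min_decisions:
        return False
    # Condition (2): acceptance rate of at least 95%.
    # Integer arithmetic avoids rounding issues at exactly 95%.
    return accepted * 100 >= min_percent * total


# One rejection among 19 decisions gives 18/19 (about 94.7%): not accepted.
print(is_accepted(18, 1))
# One rejection among 20 decisions gives exactly 95%: accepted.
print(is_accepted(19, 1))
```

With 20 or more decisions a small number of rejections becomes tolerable, which is why unanimity is only required up to 19 decisions.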

NOTE. This dashboard has moved to https://wikidata-analytics.wmcloud.org/app/WD_GameReferenceHunt. Please update your bookmarks. This service will be discontinued.


Per Property Statistics

The following table presents the data aggregated per Wikidata property. Columns: Property: the Wikidata property; accepted: the number of decisions to accept the proposed value; rejected: the number of decisions to reject the proposed value; ratio: accepted divided by rejected; percent_accepted: accepted as a percentage of all decisions (accepted + rejected); total_decisions: the total number of decisions provided.

Download (csv)

Per Datatype Statistics

The following table presents the data aggregated per Wikidata datatype. Columns: datatype: the Wikidata datatype; accepted: the number of decisions to accept the proposed value; rejected: the number of decisions to reject the proposed value; ratio: accepted divided by rejected; percent_accepted: accepted as a percentage of all decisions (accepted + rejected); total_decisions: the total number of decisions provided.

Download (csv)
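
The per-property and per-datatype tables above share the same derived columns. As a rough sketch of how such an aggregation could be computed from per-value decision counts, consider the following. The function name, the input row layout, and the counts are invented for illustration. Returning None for ratio when there are no rejections is an assumption; how the dashboard itself handles that case is not stated.

```python
from collections import defaultdict


def aggregate(rows, key):
    """Sum accepted/rejected counts per group and derive the table columns.

    Hypothetical helper: `rows` is a list of dicts carrying `key`,
    "accepted" and "rejected"; this is not the dashboard's own code.
    """
    sums = defaultdict(lambda: {"accepted": 0, "rejected": 0})
    for row in rows:
        sums[row[key]]["accepted"] += row["accepted"]
        sums[row[key]]["rejected"] += row["rejected"]

    table = []
    for name, counts in sums.items():
        accepted, rejected = counts["accepted"], counts["rejected"]
        total = accepted + rejected
        table.append({
            key: name,
            "accepted": accepted,
            "rejected": rejected,
            # Assumption: ratio is undefined (None) when nothing was rejected.
            "ratio": accepted / rejected if rejected else None,
            "percent_accepted": 100 * accepted / total if total else None,
            "total_decisions": total,
        })
    return table


# Made-up counts, for illustration only.
example_rows = [
    {"Property": "P569", "accepted": 3, "rejected": 1},
    {"Property": "P569", "accepted": 6, "rejected": 2},
    {"Property": "P21", "accepted": 4, "rejected": 0},
]
per_property = aggregate(example_rows, "Property")
```

Passing "datatype" as the key (with rows that carry a datatype field) would produce the per-datatype variant in the same way.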

Contact: Goran S. Milovanovic, Data Scientist, WMDE
e-mail: goran.milovanovic_ext@wikimedia.de
IRC: goransm


Item x Property x Value x Source x Choice Dataset

Columns: Item: the Wikidata item; Property: the Wikidata property of the Item; Extracted Data (JSON): the value extracted for the respective property and item; accepted: the number of decisions to accept the proposed value; rejected: the number of decisions to reject the proposed value; ratio: accepted divided by rejected; percent_accepted: accepted as a percentage of all decisions (accepted + rejected); total_decisions: the total number of decisions provided; is_accepted: whether the proposed change is acceptable given the criterion (described in the dashboard's header).

Download (csv)

Note. Please specify the single quote (') as the string delimiter when opening this .csv file.
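
For readers who open the file programmatically, the note above corresponds to setting the quote character in the CSV reader. The sketch below uses Python's standard csv module on a small in-memory sample. The sample row, the placeholder identifiers Q0 and P0, and the JSON content are invented for illustration and do not come from the real dataset.

```python
import csv
import io
import json


def read_single_quoted_csv(handle):
    """Read CSV rows whose string fields are delimited by single quotes."""
    return list(csv.DictReader(handle, quotechar="'"))


# Invented sample: the JSON field contains double quotes and a comma,
# so it only parses as one field when ' is treated as the string delimiter.
SAMPLE = (
    "Item,Property,Extracted Data (JSON)\n"
    "Q0,P0,'{\"key\": \"value, with comma\"}'\n"
)

rows = read_single_quoted_csv(io.StringIO(SAMPLE))
extracted = json.loads(rows[0]["Extracted Data (JSON)"])
print(extracted["key"])
```

For a downloaded file, the same function can be given a handle from open(path, newline="", encoding="utf-8"), where path is wherever the file was saved.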


Technical Notes. The data for this dashboard are served from wd-ref-island.toolforge.org: a live, up-to-date dataset is retrieved from there and processed each time the dashboard initializes. During initialization, the dashboard also sends a SPARQL query to WDQS to fetch a fresh list of Wikidata external identifiers.
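
The exact SPARQL query the dashboard sends is not shown here. As one plausible sketch, the properties with the external identifier datatype can be listed through the public WDQS endpoint as below. The query text, the function names, and the User-Agent string are assumptions made for this example, not the dashboard's implementation.

```python
import json
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed query: all properties whose datatype is external identifier.
QUERY = (
    "SELECT ?property WHERE { "
    "?property wikibase:propertyType wikibase:ExternalId . }"
)


def build_url(query=QUERY):
    """Build a GET URL asking WDQS for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return WDQS_ENDPOINT + "?" + params


def parse_property_ids(payload):
    """Extract property IDs (e.g. 'P214') from a SPARQL JSON result."""
    bindings = payload["results"]["bindings"]
    return [b["property"]["value"].rsplit("/", 1)[-1] for b in bindings]


def fetch_external_id_properties():
    """Query WDQS over the network and return the property IDs."""
    request = urllib.request.Request(
        build_url(),
        # Hypothetical User-Agent; replace with your own tool and contact.
        headers={"User-Agent": "example-script/0.1 (illustrative only)"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return parse_property_ids(json.load(response))
```

Calling fetch_external_id_properties() performs the live request; parse_property_ids can be exercised offline on any saved SPARQL JSON response.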