Project Summary. Wikidata usage overview for a specific project, including: the distribution of usage in Wikidata semantic categories, total Wikidata usage volume, similar projects, and the top Wikidata items. Use this tab to get a quick overview of WD usage on the project of interest.




Note: We study the distribution of Wikidata usage across the semantic categories to determine which client projects use Wikidata in a similar way. In this graph, each project points towards the one most similar to it. The selected project has a different color. The results are relevant only in the context of the current selection: the selected project and its 20 nearest semantic neighbours.



Note: If an item has no English label, its Wikidata item ID is used instead.




WDCM Usage :: Wikidata, WMDE 2019

Contact: Goran S. Milovanovic, Data Scientist, WMDE
e-mail: goran.milovanovic_ext@wikimedia.de
IRC: goransm



Category Summary. Wikidata usage overview for a specific category, including the distribution of category usage across projects and the top Wikidata items per category.




Note: If an item has no English label, its Wikidata item ID is used instead.




Categories General Overview

Wikidata item usage per semantic category

Note: The current selection of semantic categories does not encompass all Wikidata items.







Wikidata item usage per semantic category in each project type

Note: Item usage count is given on a logarithmic scale.









WD Usage Tabs/Crosstabs. Here you can make selections of client projects and semantic categories to learn about Wikidata usage across them.
Note: You can search for projects and add them to the Search projects field by using (a) project names (e.g. enwiki, dewiki, sawikiquote, and similar) or (b) project types that start with "_" (underscore; e.g. _Wikipedia, _Wikisource, _Commons, and similar; try typing anything into the Select projects field that starts with an underscore). Please note that by selecting a project type (again: _Wikipedia, _Wikiquote, and similar) you are selecting all client projects of the respective type, and that is potentially a lot of data. The Dashboard will pick unique projects from whatever you have inserted into the Search projects field. The selection of projects will be intersected with the selection of semantic categories from the Select categories field, and the obtained results will refer only to the Wikidata items from the current selection of client projects and semantic categories. In other words: disjunction operates inside each of the two search fields, while conjunction operates across the two search fields.
Note: The Dashboard will initialize a choice of three project types (Wikipedia, Wikinews, and Wiktionary) and a random choice of six semantic categories. All charts will present at most the top 25 projects with respect to Wikidata usage, relative to the current selection; however, complete data sets for the selection are available for download (.csv) beneath each chart.
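The selection logic described in the notes above can be illustrated with a minimal Python sketch. It is not the Dashboard's actual code: the `PROJECT_TYPE` mapping, the `USAGE` rows, and the function names are hypothetical and exist only to show how entries starting with an underscore expand to all projects of a type, how duplicates collapse, and how the project selection is intersected with the category selection.

```python
# Hypothetical mapping from client project to project type (illustration only).
PROJECT_TYPE = {
    "enwiki": "Wikipedia",
    "dewiki": "Wikipedia",
    "enwikiquote": "Wikiquote",
    "sawikiquote": "Wikiquote",
}

# Hypothetical usage rows: (project, semantic category, usage).
USAGE = [
    ("enwiki", "Human", 120),
    ("enwiki", "Taxon", 80),
    ("dewiki", "Human", 90),
    ("sawikiquote", "Human", 5),
    ("enwikiquote", "Book", 7),
]


def expand_projects(selection):
    """Turn a mix of project names and "_Type" entries into a set of unique projects."""
    projects = set()
    for entry in selection:
        if entry.startswith("_"):
            wanted_type = entry[1:]
            # A project type selects every client project of that type.
            projects.update(p for p, t in PROJECT_TYPE.items() if t == wanted_type)
        else:
            projects.add(entry)
    return projects


def select_usage(project_selection, category_selection):
    """Disjunction inside each field, conjunction across the two fields."""
    projects = expand_projects(project_selection)
    categories = set(category_selection)
    return [row for row in USAGE if row[0] in projects and row[1] in categories]


# "enwiki" is named twice (directly and via _Wikipedia) but is counted once.
result = select_usage(["_Wikipedia", "enwiki", "sawikiquote"], ["Human"])
```

Here `result` keeps only the rows whose project is in the expanded project set and whose category is in the category set, which mirrors the "unique projects, then intersect" behaviour described above.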



Projects

Data (csv)

Categories

Data (csv)

Project Types

Data (csv)

Project vs Categories

Data (csv)

Project Types vs Categories

Data (csv)






WD Usage Tables. Here you can access some tabulated and cross-tabulated raw data on Wikidata usage.
All tables can be searched and sorted by any of the respective columns.


Table A. Project Totals.

Table B. Category Totals.

Table C. Project vs Category Cross-Tabulation.


Table D. Project Type Totals.

Table E. Project Type vs Category Cross-Tabulation.







WDCM Usage Dashboard

Description


Introduction


This Dashboard is a part of the Wikidata Concepts Monitor (WDCM). The WDCM system provides analytics on Wikidata usage across the Wikimedia sister projects. The WDCM Usage Dashboard focuses on providing detailed statistics on Wikidata usage in particular sister projects or selected subsets of them. The three tabs that present analytical results in this Dashboard are described here: (1) WD Usage, (2) Tabs/Crosstabs, and (3) Tables. But first, definitions.


Definitions


N.B. The current Wikidata item usage statistic is defined as the number of pages in a particular client project where the respective Wikidata item is used. Thus, the current definition ignores the usage aspects completely. This definition is motivated by the current constraints on Wikidata usage tracking across the client projects (see Wikibase/Schema/wbc entity usage). As the Wikidata usage tracking system matures, the definition is subject to change.
The term Wikidata usage volume is reserved for total Wikidata usage (i.e. the sum of usage statistics) in a particular client project, group of client projects, or semantic categories.
By a Wikidata semantic category we mean a selection of Wikidata items that is operationally defined by a respective SPARQL query, returning a selection of items that intuitively match a human, natural semantic category. The structure of Wikidata does not necessarily match any intuitive human semantics. In WDCM, an effort is made to select the semantic categories so as to match intuitive, everyday semantics as much as possible, in order to assist anyone involved in analytical work with this system. However, the choice of semantic categories in WDCM is not necessarily exhaustive (i.e. the categories do not necessarily cover all Wikidata items), nor are the categories necessarily mutually exclusive.
The Wikidata ontology is very complex and the product of the work of many people, so there is an optimization price to be paid in every attempt to adapt or simplify its present structure to the needs of a statistical analytical system such as WDCM. The current set of WDCM semantic categories is thus not normative in any sense and is subject to change at any moment, depending upon the analytical needs of the community.

The currently used WDCM Taxonomy of Wikidata items encompasses the following 14 semantic categories: Geographical Object, Organization, Architectural Structure, Human, Wikimedia, Work of Art, Book, Gene, Scientific Article, Chemical Entities, Astronomical Object, Thoroughfare, Event, and Taxon.
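
To make the definitions above concrete, here is a minimal Python sketch of the usage statistic (the number of pages in a client project where an item is used, ignoring usage aspects) and the usage volume (the sum of usage statistics). The row layout, the example values, and the function names are assumptions made for illustration; they do not reproduce the WDCM pipeline or the actual usage-tracking schema.

```python
from collections import defaultdict

# Hypothetical usage-tracking rows: (project, page_id, item_id, aspect).
# The same item may be tracked on one page under several aspects.
rows = [
    ("enwiki", 1, "Q42", "S"),
    ("enwiki", 1, "Q42", "T"),
    ("enwiki", 2, "Q42", "S"),
    ("enwiki", 2, "Q64", "S"),
    ("dewiki", 7, "Q64", "T"),
]


def usage_statistics(rows):
    """Count distinct pages per (project, item); the aspect is ignored."""
    pages = defaultdict(set)
    for project, page_id, item_id, _aspect in rows:
        pages[(project, item_id)].add(page_id)
    return {key: len(page_ids) for key, page_ids in pages.items()}


def usage_volume(stats, projects):
    """Sum the usage statistics over a group of client projects."""
    return sum(count for (project, _item), count in stats.items() if project in projects)


stats = usage_statistics(rows)
enwiki_volume = usage_volume(stats, {"enwiki"})
total_volume = usage_volume(stats, {"enwiki", "dewiki"})
```

In this toy data, Q42 is tracked twice on page 1 of enwiki (two aspects) but contributes only one page to the statistic, which is what "ignores the usage aspects" means in practice.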


Usage


The Usage tab provides elementary statistics on Wikidata usage across the semantic categories (left column) and sister projects (right column).
To the left, we first encounter a general overview of Basic Facts: the number of Wikidata items that are encompassed by the current WDCM taxonomy (in effect, this is the number of items that are encompassed by all WDCM analyses), the number of sister projects that have client-side Wikidata usage tracking enabled (currently, that means that Wikibase/Schema/wbc entity usage is present there), the number of semantic categories in the current version of the WDCM Taxonomy, and the number of different sister project types (e.g. Wikipedia, Wikinews, etc).
The Category Report subsection allows you to select a specific semantic category and generate two charts beneath the selection: (a) the category top 30 projects chart, and (b) the category top 30 Wikidata items chart. The first chart will display 30 sister projects that use Wikidata items from this semantic category the most, with the usage data represented on the horizontal axis, and the project labels on the vertical axis. The percentages next to the data points in this chart refer to the proportion of total category usage that takes place in the respective project. The next chart will display the 30 most popular items from the selected semantic category: item usage is again placed on the horizontal axis, item labels are on the vertical axis, and item IDs are placed next to the data points themselves.
The Categories General Overview subsection is static and allows no selection; it introduces two concise overviews of Wikidata usage across the semantic categories of Wikidata items. The Wikidata item usage per semantic category chart presents semantic categories on the vertical axis and item usage statistics on the horizontal axis; the percentages tell us the proportion of total Wikidata usage that the respective semantic category carries. Beneath, the Wikidata item usage per semantic category in each project type chart provides a cross-tabulation of semantic categories vs. sister project types. The categories are color-coded and represented on the horizontal axes, while each chart represents one project type. The usage scale, represented on the vertical axes, is logarithmic to ease comparison and enable practical data visualization.
To the right, you can inspect Wikidata usage in a single Wikimedia project. The Project Report section allows you to select a single Wikimedia project and obtain results on it. The first section that will be generated upon making a selection provides a concise narrative summary of Wikidata usage in the selected project alongside a chart presenting an overview of Wikidata usage per semantic category. The next chart, Wikidata usage rank, shows the rank position of the selected project among other sister projects with respect to the Wikidata usage volume. Beneath, a more complex structure, Semantic Neighbourhood, is given. In this network, or directed graph if you prefer, each project points towards the one most similar to it, that is, the one which utilizes Wikidata in the most similar way. The selected project has a different color. The results are relevant only in the context of the current selection: only the selected project and its 20 nearest semantic neighbours are presented. The top 30 Wikidata items chart presents the top 30 Wikidata items in the selected project: item labels are given on the vertical axis, Wikidata usage on the horizontal axis, and the item IDs are labeled close to the data points themselves.
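
The Semantic Neighbourhood idea can be sketched in a few lines of Python: describe each project by its distribution of usage across semantic categories, then let each project point to the other project whose distribution is closest. The example counts are made up, and the choice of the Hellinger distance is an assumption for this sketch; the text above does not state which similarity measure WDCM uses.

```python
import math

CATEGORIES = ["Human", "Taxon", "Book"]

# Hypothetical usage counts per semantic category (illustration only).
usage = {
    "enwiki": {"Human": 500, "Taxon": 300, "Book": 200},
    "dewiki": {"Human": 450, "Taxon": 350, "Book": 200},
    "specieswiki": {"Human": 50, "Taxon": 900, "Book": 50},
    "enwikiquote": {"Human": 800, "Taxon": 0, "Book": 200},
}


def distribution(counts):
    """Normalize category counts to proportions that sum to 1."""
    total = sum(counts.get(c, 0) for c in CATEGORIES)
    return [counts.get(c, 0) / total for c in CATEGORIES]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (assumed metric)."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) / 2)


def nearest_neighbour_edges(usage):
    """Map each project to the single other project most similar to it."""
    dists = {name: distribution(counts) for name, counts in usage.items()}
    edges = {}
    for name, p in dists.items():
        candidates = [(hellinger(p, q), other) for other, q in dists.items() if other != name]
        edges[name] = min(candidates)[1]
    return edges


edges = nearest_neighbour_edges(usage)
```

The resulting `edges` dictionary is a directed graph with exactly one outgoing edge per project. Note that "most similar" is not symmetric in general: a project can point to a neighbour that itself points elsewhere.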


Tabs/Crosstabs


Here we have the most direct opportunity to study the Wikidata usage statistics across the sister projects. A selection of projects and semantic categories will be intersected and only results within the scope of the intersection will be returned. The charts should be self-explanatory: the usage statistic is always represented by the vertical axis, while the horizontal axis and sub-panels play various roles depending on whether a category vs project or a category vs project type cross-tabulation is provided. Data points are labeled in millions (M) or thousands (K) of pages (see the Wikidata usage definition above). While charts can display only a limited number of data points, relative to the size of the selection, each of them is accompanied by a Data (csv) button that will initiate a download of the full respective data set as a comma-separated file.


Tables


The section presents searchable and sortable tables and crosstabulations with self-explanatory semantics. Access full WDCM usage datasets from here.







WDCM Navigation

Your orientation in the WDCM Dashboards System


  • WDCM Portal
    The entry point to WDCM Dashboards.

  • WDCM Overview
    The big picture. Fundamental insights into how Wikidata is used across the client projects.

  • WDCM Semantics
    Detailed insights into the WDCM Taxonomy (a selection of semantic categories from Wikidata), its distributional semantics, and the way it is used across the client projects. If you are looking for Topic Models - that’s where they live.

  • WDCM Usage
    Fine-grained information on Wikidata usage across client projects and project types. Cross-tabulations and similar.

  • WDCM Geo
    Wikidata items interactive maps.

  • WDCM Structure
    A method to investigate the WDCM Taxonomy and improve the choice of items that undergo analyses.

  • WDCM Biases
    The WDCM gender bias and north-south divide statistics.

  • WDCM (S)itelinks
    The WDCM (S)itelinks usage aspect statistics.

  • WDCM (T)itles
    The WDCM (T)itles usage aspect statistics.


  • WDCM System Technical Documentation
    The WDCM Wikitech Page.

  • WDCM Wikidata Project Page
    The WDCM Wikidata Project Page.

  • The WDCM Journal
    A regularly updated selection of the most interesting empirical findings from WDCM.



