100,000 video games on Wikidata

Wikidata’s WikiProject Video games has just passed a major milestone: 100,000 video game (Q7889) items on Wikidata. As we did for the 50K milestone exactly two years ago, let’s use that opportunity to draw up a quick mid-year report (in the very same format, for both ease of comparison and ease of writing :-þ).

Description

Let’s look at how these items are described along some basic properties − asking the Wikidata Query Service for some pretty graphs, and using my trusted inteGraality for some more advanced statistics.

Over 89% of the items have a platform (P400) statement (which does not mean that we have 89% completion on that topic, since many games are published on several platforms, and we may only have recorded one or a couple of them).

85% of the items have a publication date (P577).

37% have a genre (P136) − we have a very long tail of 730 distinct values used as genres, which we should still clean up (minus indie game (Q2762504), which we recently moved to has characteristic (P1552)).

Almost 33% have a country of origin (P495).

About 34% of the items have a developer (P178) and 37% a publisher (P123).
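The percentages above can be approximated directly against the Wikidata Query Service. Below is a minimal sketch, not the method used to produce the figures in this post: the helper names (`coverage_query`, `total_query`, `run_count`, `coverage_percent`) and the User-Agent string are made up for illustration. It counts only "truthy" (best-rank) statements via the `wdt:` prefix, so results may differ slightly from inteGraality's numbers.

```python
import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"
VIDEO_GAME = "Q7889"


def coverage_query(property_id: str, item_class: str = VIDEO_GAME) -> str:
    """Count items that are instances of item_class AND have at least one
    truthy statement for property_id (e.g. "P400" for platform)."""
    return (
        "SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE { "
        f"?item wdt:P31 wd:{item_class} ; wdt:{property_id} [] . "
        "}"
    )


def total_query(item_class: str = VIDEO_GAME) -> str:
    """Count all items that are direct instances of item_class."""
    return (
        "SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE { "
        f"?item wdt:P31 wd:{item_class} . "
        "}"
    )


def run_count(query: str) -> int:
    """Send a counting query to WDQS and return the single integer result.
    The User-Agent value is a placeholder; use one that identifies you."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"}
    )
    request = urllib.request.Request(
        url, headers={"User-Agent": "coverage-sketch/0.1 (illustrative example)"}
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return int(data["results"]["bindings"][0]["count"]["value"])


def coverage_percent(with_property: int, total: int) -> float:
    """Share of items carrying the property, as a percentage."""
    return 100.0 * with_property / total if total else 0.0
```

Calling `coverage_percent(run_count(coverage_query("P400")), run_count(total_query()))` would then give the platform coverage at query time; swapping in P577, P136, P495, P178 or P123 covers the other properties listed above.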

Links to Wikipedia

42% of the items are linked to an article in at least one language-version of Wikipedia − English comes first (27%), then French (15%), Ladin (14%) and Japanese (13%).

What I also find interesting is to look at items linked to only one Wikipedia language version: some 5K (5%) only have an article in the English-language or Japanese-language Wikipedia, then come the French-language and Ladin-language Wikipedias with 1K (1%) of items.

External identifiers

Over at Wikidata we link to hundreds of other video game databases.

Still on top is Internet Game Database game ID (P5794), used on 74% of items. Lutris game ID (P7597) follows with 60% (which makes sense, as the Lutris database is seeded with IGDB). Steam application ID (P1733) completes the podium with 56%. The new entrant SteamGridDB ID (P12561) snatches fourth place in barely 3 months, also with 56%. RAWG game ID (P9968) and MobyGames game ID (P11688) stand at 51%. PCGamingWiki ID (P6337) is at 38%. Both Giant Bomb ID (P5247) and HowLongToBeat ID (P2816) are at 33%. OGDB game title ID (P7564) and GameFAQs game ID (P4769) are at 15%, speedrun.com game ID (P6783) and Mod DB game ID (P6774) at 13%. StopGame ID (P10030) and myabandonware.com game ID (P12652) are at 11%… followed by a very long tail of over 360, sometimes highly specialized, databases.

(The most represented are English-language databases, but the list above includes one database in German and two in Russian.)

Some caveats

1/ At the time of writing, we have already reached 102,577 items.

2/ Last time, I had cautioned that looking strictly at instance of (P31)=video game (Q7889) items does not tell the full story, as we have a long tail of subclasses also used as P31: some refer to distinct concepts (the 956 DLCs or 3,242 expansion packs), while others are indeed games. I’m happy that we have successfully weeded out the hundreds of instances of video game remaster and video game remake (moved to based on (P144)), as well as free and open-source video game (moved to has characteristic (P1552)). There is still some work to do to refine our P31s, but we are going in the right direction.
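To see which subclasses are still used as P31 values, one option is a grouped count over the subclass tree. This is a sketch only: `p31_breakdown_query` is a hypothetical helper, and it assumes the classes of interest are reachable from video game (Q7889) through subclass of (P279), which may not hold for every class mentioned above.

```python
def p31_breakdown_query(root_class: str = "Q7889", limit: int = 50) -> str:
    """Build a SPARQL query listing strict subclasses of root_class that are
    used as instance of (P31) values, with the number of items for each."""
    return f"""
SELECT ?class (COUNT(?item) AS ?items) WHERE {{
  ?class wdt:P279+ wd:{root_class} .   # one or more subclass-of hops
  ?item wdt:P31 ?class .               # items typed with that subclass
}}
GROUP BY ?class
ORDER BY DESC(?items)
LIMIT {limit}
""".strip()


# The resulting string can be pasted into the Wikidata Query Service UI.
query = p31_breakdown_query(limit=20)
```

The `wdt:P279+` property path deliberately excludes Q7889 itself, so the result shows only the long tail rather than the 100,000 items typed directly as video game.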

3/ With 100,000 items, we are getting somewhere (surpassing the 85,000 of Giant Bomb or the 63,000 of OGDB, for example). But this is still under the 153,000 games in Metacritic, far from the 278,000 entries in MobyGames or the 281,000 entries in IGDB (and dwarfed by the 868K entries in RAWG).
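For a sense of scale, the comparison can be expressed as Wikidata's item count relative to each database. The snippet below is simple arithmetic on the figures quoted in this caveat; it uses the round 100,000 milestone rather than the live count, and the names `relative_size` and `ratios` are only illustrative.

```python
# Entry counts as quoted in the text above (approximate, rounded figures).
WIKIDATA_ITEMS = 100_000
OTHER_DATABASES = {
    "OGDB": 63_000,
    "Giant Bomb": 85_000,
    "Metacritic": 153_000,
    "MobyGames": 278_000,
    "IGDB": 281_000,
    "RAWG": 868_000,
}


def relative_size(ours: int, theirs: int) -> float:
    """Our item count as a percentage of another database's entry count."""
    return round(100.0 * ours / theirs, 1)


ratios = {name: relative_size(WIKIDATA_ITEMS, count)
          for name, count in OTHER_DATABASES.items()}

# Print from the databases we have surpassed down to the ones that dwarf us.
for name, percent in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {percent}%")
```

A value above 100% means Wikidata already has more items than that database has entries; by these figures, Wikidata stands at a bit over a third of the size of IGDB or MobyGames.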

4/ The astute reader may have noticed that some data points went down compared to two years ago: a developer (P178), publisher (P123) or country of origin (P495) on a third of items (down from half), and a genre (P136) from two-thirds to a mere one-third. So while we should celebrate this significant milestone in breadth of coverage, we should keep in mind the depth of coverage on the road ahead of us.
