Utilisateur:Darkdadaah/Analyses/Interwikis

Analysis of interwiki (language) links whose target title differs from the name of the page they appear on, across all Wiktionaries (dumps from 2016-04-07; script used: Anagrimes: interwiki_analysis.pl).

apostrophe 2343
capital 3646
hyphen 805
not a letter 260
part of title 1774
partial title 2008
empty 1174
placeholder 100624
other 115578

Cases

Apostrophe

Differing community rules.

  • Most of the apostrophe interwikis are between fr and other wikis. This is due to the French Wiktionary policy of using the typographic apostrophe in titles and text. Some other wikis seem to have similar rules, although their counts may instead be due to bot imports.
  • It should be easy to match the corresponding entries algorithmically. An exception should be made for the articles about the punctuation marks themselves.
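The matching mentioned above could be sketched as follows. This is a minimal illustration, not the logic of interwiki_analysis.pl: the function names and the list of apostrophe variants are assumptions made for the example.

```python
# Illustrative sketch only; not taken from interwiki_analysis.pl.

# Assumption: these are the apostrophe variants worth folding together.
APOSTROPHE_VARIANTS = {
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK (typographic apostrophe)
    "\u02bc": "'",  # MODIFIER LETTER APOSTROPHE
}
_TABLE = str.maketrans(APOSTROPHE_VARIANTS)


def normalize_apostrophes(title: str) -> str:
    """Replace every known apostrophe variant with the ASCII apostrophe."""
    return title.translate(_TABLE)


def is_apostrophe_entry(title: str) -> bool:
    """True for entries whose title is itself an apostrophe character."""
    return title == "'" or title in APOSTROPHE_VARIANTS


def differs_only_by_apostrophe(page_title: str, link_title: str) -> bool:
    """True when the two titles differ, but only in apostrophe style."""
    if page_title == link_title:
        return False
    # Exception: entries about the punctuation marks must stay distinct.
    if is_apostrophe_entry(page_title) or is_apostrophe_entry(link_title):
        return False
    return normalize_apostrophes(page_title) == normalize_apostrophes(link_title)
```

With this sketch, a title written with the typographic apostrophe matches its straight-apostrophe counterpart, while the entries for the two apostrophe characters themselves are never merged.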

Some details:

fr 889
ku 224
vi 103
ru 80
de 50
ko 47
mg 45
es 28
fi 26
lt 5
cs 3
eu 3
tr 2
et 1
nah 1
nl 1
wa 1

Capital

Should not happen.

All Wiktionary projects distinguish between a lowercase and an uppercase first letter, so titles that differ only in capitalization are separate pages.


Hyphen

Should not happen.

The parts of the title are separated differently, with a space or a hyphen. Wiktionaries have different pages for those.

Not a letter

Differing community rules.

The titles differ by something that is not a letter: a dot (with/without), a comma (with/without), or an ellipsis (… or ...). Other cases are probably errors, such as a missing apostrophe.
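One possible way to detect this kind of difference is to drop punctuation from both titles and compare what remains. The sketch below assumes that Unicode general categories are a good enough definition of "not a letter". It keeps dashes (category Pd) so that hyphen differences are not absorbed into this category. The function names are hypothetical, and this is not necessarily how interwiki_analysis.pl classifies links.

```python
import unicodedata


def strip_punctuation(title: str) -> str:
    """Drop punctuation except dashes, then collapse runs of whitespace."""
    kept = []
    for ch in title:
        category = unicodedata.category(ch)
        # Categories starting with "P" are punctuation; "Pd" is dash punctuation.
        if category.startswith("P") and category != "Pd":
            continue
        kept.append(ch)
    return " ".join("".join(kept).split())


def differs_only_by_punctuation(page_title: str, link_title: str) -> bool:
    """True when the titles differ, but only in non-dash punctuation."""
    if page_title == link_title:
        return False
    return strip_punctuation(page_title) == strip_punctuation(link_title)
```

Because apostrophes are punctuation too, a pair that differs only by a missing apostrophe is also reported by this check.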

Partial text

Should not happen.

One title is only part of the other (e.g. plural forms like pt:saca-rolha -> fr:saca-rolhas).

Empty

Should not happen.

Two wikis have some empty links: te (1027) and mg (147). Probable bot errors.

Placeholder

Should not happen.

One wiki, te, has 100624 links with the same content: కొత్త తెలుగు పదం (simply translated as "new Telugu word").
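A placeholder like this one can be spotted because the same link text appears on many unrelated pages. The sketch below assumes the links of one wiki are available as (page title, link text) pairs. The function name and the threshold are hypothetical choices for illustration, not part of the original script.

```python
from collections import defaultdict


def find_placeholders(links, min_pages=3):
    """Return {link_text: page_count} for texts shared by many pages.

    links: iterable of (page_title, link_text) pairs from a single wiki.
    min_pages: hypothetical threshold; a real dump would need a far higher one.
    """
    pages_by_text = defaultdict(set)
    for page_title, link_text in links:
        pages_by_text[link_text].add(page_title)
    return {
        text: len(pages)
        for text, pages in pages_by_text.items()
        if len(pages) >= min_pages
    }
```

An empty link text repeated across pages is reported in the same way, under the key "".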

Other

Should not happen.

Further analysis of this category shows the following:

  • 15041 contain colons, e.g. "Sjabloon:a", meaning that they have a template in place of the title.
    • This is the case for hu, mg, ay.
    • Other such cases are links to project pages (errors).
  • The majority of the rest is most likely an erroneous use of those links, where the link is actually a translation. This is the case for ky, mg, sa, kn, or.
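The colon check can be approximated by looking at the text before the first colon. This is a minimal sketch with hypothetical function names. A genuine entry title can also contain a colon, so the matches are only candidates for review.

```python
from collections import Counter


def prefix_before_colon(link_title: str):
    """Return the text before the first colon, or None if there is no usable prefix."""
    prefix, sep, rest = link_title.partition(":")
    if sep and prefix and rest:
        return prefix
    return None


def count_prefixes(link_titles):
    """Count how often each prefix occurs, to separate templates from project pages."""
    prefixes = (prefix_before_colon(title) for title in link_titles)
    return Counter(p for p in prefixes if p is not None)
```

Grouping by prefix makes it easier to tell template links (such as the "Sjabloon" prefix in the example above) from links to project pages.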

Summary

Only two categories of differing interwikis seem acceptable:

  • Apostrophe: some wikis (mostly fr). This applies not only to French entries but to any language where a typographic apostrophe is used in place of « ' ».
  • Punctuation: for phrases, some wikis add punctuation (commas, periods, exclamation marks...).