Utilisateur:Darkdadaah/Analyses/Interwikis
Analysis of interwiki (language) links whose target differs from the title of the page they are on, across all Wiktionaries (dumps from 2016-04-07; script used: Anagrimes: interwiki_analysis.pl).
apostrophe | 2343 |
capital | 3646 |
hyphen | 805 |
not a letter | 260 |
part of title | 1774 |
partial title | 2008 |
empty | 1174 |
placeholder | 100624 |
other | 115578 |
Cases
Apostrophe
Differing community rules.
- Most of the apostrophe mismatches are between fr and other wikis. This is due to the French Wiktionary policy of using the typographic apostrophe in titles and text. Other wikis seem to have similar rules, although this may be due to bot imports.
- It should be easy to match the corresponding entries algorithmically. An exception should be made for the entries on the punctuation marks themselves.
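The matching described in this list could look like the following Python sketch. It is only an illustration and not the Anagrimes script: the names `normalize_apostrophes` and `same_entry` are hypothetical, and the set of characters treated as apostrophes is an assumption.

```python
# Characters treated as typographic apostrophes (an assumption, extend as needed):
# U+2019 RIGHT SINGLE QUOTATION MARK, U+02BC MODIFIER LETTER APOSTROPHE.
TYPOGRAPHIC_APOSTROPHES = "\u2019\u02bc"

# Entries about the punctuation marks themselves must never be merged.
PUNCTUATION_ENTRIES = {"'"} | set(TYPOGRAPHIC_APOSTROPHES)


def normalize_apostrophes(title: str) -> str:
    """Replace every typographic apostrophe with the ASCII apostrophe."""
    for ch in TYPOGRAPHIC_APOSTROPHES:
        title = title.replace(ch, "'")
    return title


def same_entry(page_title: str, link_title: str) -> bool:
    """Return True if both titles name the same entry, ignoring apostrophe style."""
    if page_title in PUNCTUATION_ENTRIES or link_title in PUNCTUATION_ENTRIES:
        # Exception: the entries on the apostrophe characters are compared exactly.
        return page_title == link_title
    return normalize_apostrophes(page_title) == normalize_apostrophes(link_title)
```

With this sketch, a title written with the ASCII apostrophe and the same title written with the typographic one are treated as the same entry, while the entries for the two apostrophe characters stay distinct.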
Some details:
fr | 889 |
ku | 224 |
vi | 103 |
ru | 80 |
de | 50 |
ko | 47 |
mg | 45 |
es | 28 |
fi | 26 |
lt | 5 |
cs | 3 |
eu | 3 |
tr | 2 |
et | 1 |
nah | 1 |
nl | 1 |
wa | 1 |
Capital
Should not happen.
All Wiktionary projects distinguish between a lowercase and an uppercase first letter in page titles.
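One possible way to detect this category, as a minimal Python sketch: the function name `differs_only_by_case` is hypothetical, and using `str.casefold` as the comparison is an assumption about what "same letters, different case" should mean.

```python
def differs_only_by_case(page_title: str, link_title: str) -> bool:
    """Return True if the two titles are different strings but equal once case is ignored."""
    if page_title == link_title:
        return False  # identical titles are not a mismatch at all
    # casefold() is a more aggressive lower() intended for caseless comparison.
    return page_title.casefold() == link_title.casefold()
```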
Hyphen
Should not happen.
The parts are separated differently, with a space on one side and a hyphen on the other: Wiktionaries have separate pages for those.
Not a letter
Differing community rules.
The titles differ by a character that is not a letter: dot (with/without), comma (with/without), ellipsis (… or ...). Other cases are probably errors, such as a missing apostrophe.
Partial text
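A minimal Python sketch of such a comparison, assuming that "letter" means any character in a Unicode letter (L*) or mark (M*) category. The function names are hypothetical. Note that this test also matches apostrophe and space/hyphen differences, so those categories would have to be checked first.

```python
import unicodedata


def letters_only(title: str) -> str:
    """Keep only letters and combining marks, dropping punctuation, spaces and digits."""
    return "".join(
        ch for ch in title
        if unicodedata.category(ch)[0] in ("L", "M")
    )


def differs_only_by_non_letters(page_title: str, link_title: str) -> bool:
    """Return True if the titles differ, but only in characters that are not letters."""
    if page_title == link_title:
        return False
    return letters_only(page_title) == letters_only(link_title)
```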
Should not happen.
One title is only a part of the other (e.g. plural forms like pt:saca-rolha -> fr:saca-rolhas).
Empty
Should not happen.
Two wikis have some empty links: te (1027) and mg (147). These are probably bot errors.
Placeholder
Should not happen.
One wiki, te, has 100624 links with the same content: కొత్త తెలుగు పదం (roughly translated as "new Telugu word").
Other
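A placeholder of this kind can be spotted by counting how often each link text occurs in one wiki. The following Python sketch assumes the links are available as (page title, link text) pairs; the function name `find_placeholders` and the `min_share` threshold are hypothetical choices, not part of the original script.

```python
from collections import Counter


def find_placeholders(links, min_share=0.5):
    """Return the link texts that make up at least min_share of a wiki's differing links.

    links: iterable of (page_title, link_text) pairs for a single wiki.
    """
    counts = Counter(link_text for _page_title, link_text in links)
    total = sum(counts.values())
    if total == 0:
        return []
    return [text for text, n in counts.items() if n / total >= min_share]
```

A genuine interwiki target should be nearly unique per page, so any text shared by a large fraction of links is suspicious.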
Should not happen.
Further analysis of this category shows the following:
- 15041 contain a colon, e.g. "Sjabloon:a", meaning that they point to a template instead of an entry title.
- This is the case for hu, mg, ay.
- Other such cases are links to project pages (errors).
- The majority of the rest is most likely an erroneous usage of those links, where the link is actually a translation. This is the case for: ky, mg, sa, kn, or.
Summary
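The colon check in the first item could be sketched as below in Python. The function name `namespace_prefix` is hypothetical, and treating any text before the first colon as a namespace is an assumption: a real check would compare the prefix against each wiki's actual namespace list, since legitimate entry titles can contain colons.

```python
def namespace_prefix(link_title: str):
    """Return the text before the first colon, or None if the title has no usable prefix."""
    prefix, sep, rest = link_title.partition(":")
    if not sep or not prefix or not rest:
        # No colon, or the colon is the first or last character (e.g. ":-)").
        return None
    return prefix
```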
Only two categories of differing interwikis seem acceptable:
- Apostrophe: some wikis (mostly fr). This applies not only to French but to any language, wherever a typographic apostrophe is used in place of « ' ».
- Punctuation: for phrases, some wikis add punctuation (commas, periods, exclamation marks...).