Sorry if this seems a stupid question, but...
I recently got curious as to how much of the internet was taken up by various languages (note I'm talking about specific media published as opposed to the first language of a user.)
From this source (https://w3techs.com/technologies/overview/content_language) English is stated to take the largest share of the pie with nearly 62% - no shock there. What surprised me is that Russian came in second with around 8% - beating out French, Spanish, German, Chinese dialects, Japanese, Hindi and Arabic (among various others.)
So, I'm simply curious, how is it Russian beats out so many other languages with similar or much larger speaker bases for content on the internet? Is it due to Russian acting as a lingua franca across Eastern Europe?
Why is so much internet content in Russian?
-
- Green Belt
- Posts: 464
- Joined: Fri Nov 16, 2018 2:58 am
- Location: England
- Languages: English (N), Dutch (A2 - July 2021), working towards B1
- x 1093
5 x
Languages: English (N), Dutch (passed A2 exam in May 2021, failed B1 in May 2023 - never sit an exam when you have food poisoning!)
Seeking: Linguaphone Polish and Linguaphone Afrikaans
- Xenops
- Brown Belt
- Posts: 1446
- Joined: Mon Nov 30, 2015 10:33 pm
- Location: Boston
- Languages: English (N), Danish (A2), Japanese (rusty), Nansha (constructing)
On break: Japanese (approx. N4), Norwegian (A2) - Language Log: https://forum.language-learners.org/vie ... 15&t=16797
- x 3574
- Contact:
Re: Why is so much internet content in Russian?
Thank you for sharing: upon seeing this chart, I understand now that I (clearly) need to submit to my language lusts, and start learning Turkish and Persian.
6 x
Check out my comic at: https://atannan.com/
-
- Orange Belt
- Posts: 137
- Joined: Wed Oct 10, 2018 5:05 am
- Languages: Know: English (N), German (B2), Spanish (B2/C1), Italian, Portuguese
Study (on and off): Persian, Russian - x 449
Re: Why is so much internet content in Russian?
I have to say I am skeptical of this chart. There isn't really a clear source of the data and some of that jumps out at me as just wildly improbable given the number of internet users for various languages. Whatever it is sampling from does not seem to be a good representation of the internet as a whole.
Can Vietnamese speakers really be producing more online content than Chinese speakers, given that there are at least 10 times as many of the latter?
If that really is the case and these numbers are true I would be fascinated to know why, but I am going to need to hear some reasoning before I'm prepared to believe that.
3 x
- Axon
- Blue Belt
- Posts: 775
- Joined: Thu Jun 16, 2016 12:29 am
- Location: California
- Languages: Native English, in order of comfort: Mandarin, German, Indonesian,
Spanish, French, Russian,
Cantonese, Vietnamese, Polish. - Language Log: viewtopic.php?f=15&t=5086
- x 3291
Re: Why is so much internet content in Russian?
Lemus wrote:
I have to say I am skeptical of this chart. There isn't really a clear source of the data and some of that jumps out at me as just wildly improbable given the number of internet users for various languages. Whatever it is sampling from does not seem to be a good representation of the internet as a whole.
Can Vietnamese speakers really be producing more online content than Chinese speakers, given that there are at least 10 times as many of the latter?
If that really is the case and these numbers are true I would be fascinated to know why, but I am going to need to hear some reasoning before I'm prepared to believe that.
https://w3techs.com/technologies
https://w3techs.com/faq
Based on their methodologies, it appears that what's happening here is that Vietnamese speakers are producing slightly more individual websites that rank in the top 10 million worldwide than Chinese speakers are. If we look at the amount of actual text/audio online including comment sections, forums, wikis, social media posts, and videos, the language rankings may look quite different.
I wouldn't be surprised to learn that Mandarin ranks in the top 5 or higher when it comes to video content online, for instance. Most Chinese regional capitals and TV stations have official YouTube channels with thousands of videos on each, and that's not even looking at other video streaming sites catering to Chinese audiences exclusively!
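To make the distinction concrete, here is a minimal Python sketch with invented numbers. The site list, the page counts, and the function names are all hypothetical; none of this is W3Techs data or their actual method. It only shows how a language can lead when you count websites yet trail when you count the amount of content.

```python
from collections import Counter

# Hypothetical sample: (content language, number of pages on the site).
# These figures are invented purely for illustration.
SITES = [
    ("vi", 200), ("vi", 150), ("vi", 100),
    ("zh", 5000), ("zh", 4000),
    ("en", 1000), ("en", 800), ("en", 600), ("en", 400), ("en", 200),
]


def share_by_site_count(sites):
    """Fraction of sites per language: every site counts once."""
    counts = Counter(lang for lang, _ in sites)
    total = sum(counts.values())
    return {lang: n / total for lang, n in counts.items()}


def share_by_content_volume(sites):
    """Fraction of total pages per language: big sites weigh more."""
    volume = Counter()
    for lang, pages in sites:
        volume[lang] += pages
    total = sum(volume.values())
    return {lang: v / total for lang, v in volume.items()}


if __name__ == "__main__":
    for name, shares in [
        ("by site count", share_by_site_count(SITES)),
        ("by content volume", share_by_content_volume(SITES)),
    ]:
        ranked = sorted(shares.items(), key=lambda kv: kv[1], reverse=True)
        print(name, [(lang, round(s, 3)) for lang, s in ranked])
```

In this toy sample Vietnamese has more sites than Chinese, but Chinese has far more pages, so the two rankings disagree.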
8 x
-
- Orange Belt
- Posts: 242
- Joined: Wed Mar 21, 2018 6:54 pm
- Languages: English, Portuguese, Spanish, Catalan, French, Persian, Arabic, Mandarin, Japanese.
- x 444
Re: Why is so much internet content in Russian?
Perhaps it is because, to escape censorship, they don't host content on their own country domain but on US domains (.com), and this has somehow biased the research.
I don't know about Russian, but this is a possible explanation for why Persian is so high. I can tell from experience that the Persian internet is not anywhere near Spanish or French.
Or perhaps the Alexa toolbar is popular in Russia for some reason.
0 x
- verdastelo
- Orange Belt
- Posts: 202
- Joined: Sun Jan 31, 2016 1:20 pm
- Languages: Punjabi (N), Hindi-Urdu (near-native), English (C1+), Russian (B1+), French (A2+), Chinese (A1+), Kannada (A0+)
- x 740
Re: Why is so much internet content in Russian?
That's probably because Russian is one of a handful of languages in which you can find books on virtually any imaginable subject. As a child, I read in a 1982 issue of Soviet Science Fiction (a magazine published in the USSR) that about 90% of what was known to mankind had been translated into Russian. How they came up with that number is beyond my comprehension. Nonetheless, the idea and the number stuck and are validated by my quotidian experiences. If I don't like an article on the English Wikipedia, I can always refer to the Encyclopedia Britannica. Russian offers a similar capability. Those who don't like the Russian Wikipedia can consult the Большая Российская Энциклопедия. It's legal and free.
Russian has an Академический словарь for Encyclopedia.com, a Rutracker for The Pirate Bay, half a dozen major social networking platforms, and tons of websites where you can read classics for free. An example is Sima Qian (司馬遷), whose works are available on Project Gutenberg in Classical Chinese but on Востлит in Russian. That means you can read the works of the father of Chinese historiography in a modern language. Many a time I have consulted the Новая философская энциклопедия to learn about European, Indian, and Chinese philosophical terms. Romantic ideas about education are probably a reason so much content, and consequently so many websites, are in Russian.
With that said, I don't think the W3Techs chart is correct. It could be off by a huge margin. W3Techs lists Baidu.com as one of the most popular English-language websites, which seems wrong to me. Baidu is China's most popular search engine. I have seen its Traditional Chinese version, but never an English interface.
I skimmed through the FAQ and Technologies pages Axon shared. The pages don't tell you much. How are multilingual websites categorized? What if there are only a handful of Persian pages on a Spanish website? Many such questions remain unanswered. For this reason, I have my own subjective criteria for measuring a language's strength online:
- Encyclopedia and Reference Materials: That includes both peer-edited and expert-vetted content. Chinese peer-edited encyclopedias, such as 互动百科 and 百度百科, are at least three times larger than the English Wikipedia. Korean also boasts 나무위키 and two other popular peer-edited encyclopedias. When it comes to expert-vetted content, Korean has a wonderful collection on 네이버 지식백과. Japanese has コトバンク and Italian has Treccani.
- Torrent Sites, Online Libraries and Forums: These tell you how active people are in learning. I love Forum Littéraire and Les Mathematiques. The latter's existence shows that intelligent conversations on mathematics can occur outside Mathematics Stack Exchange.
- College Lectures, such as those available from the Collège de France and MOOC platforms.
- Web Search Engines. English, Chinese, Russian, and Korean offer a great choice. I'm ignorant of reliable and functional search engines in other languages.
Some languages, such as Hindi, are rich in audio-visual content, but they are dwarves when it comes to non-fiction texts. That's because the elite is fond of English. I would be genuinely surprised if you could find a book on the history of Tamil Nadu, China, or Kenya in Hindi. You can read the Mahabharata in Chinese, but you cannot read Confucius in Hindi. Such is the sad state. Hindi is rich in entertainment, movies, music, and news. That's it. Tagalog, Swahili, Tamil, and many other languages, I believe, fall in this category.
11 x
The life of man is but a succession of vain hopes and groundless fears. — Monte(s)quieu
-
- Orange Belt
- Posts: 137
- Joined: Wed Oct 10, 2018 5:05 am
- Languages: Know: English (N), German (B2), Spanish (B2/C1), Italian, Portuguese
Study (on and off): Persian, Russian - x 449
Re: Why is so much internet content in Russian?
Axon wrote:
Based on their methodologies, it appears that what's happening here is that Vietnamese speakers are producing slightly more individual websites that rank in the top 10 million worldwide than Chinese speakers are. If we look at the amount of actual text/audio online including comment sections, forums, wikis, social media posts, and videos, the language rankings may look quite different.
I wouldn't be surprised to learn that Mandarin ranks in the top 5 or higher when it comes to video content online, for instance. Most Chinese regional capitals and TV stations have official YouTube channels with thousands of videos on each, and that's not even looking at other video streaming sites catering to Chinese audiences exclusively!
So it sounds like the Chinese internet is concentrated into a smaller number of websites. The average amount of content per website would then be significantly higher for a language like Chinese and lower for one like Vietnamese.
Or, of course, the data was simply wrong to begin with.
0 x
- Axon
- Blue Belt
- Posts: 775
- Joined: Thu Jun 16, 2016 12:29 am
- Location: California
- Languages: Native English, in order of comfort: Mandarin, German, Indonesian,
Spanish, French, Russian,
Cantonese, Vietnamese, Polish. - Language Log: viewtopic.php?f=15&t=5086
- x 3291
Re: Why is so much internet content in Russian?
Lemus wrote:
So it sounds like the Chinese internet is more intensely concentrated into a smaller number of websites then. The average content per website then is significantly higher for a language like Chinese and lower for something like Vietnamese.
Or, of course, the data was simply wrong to begin with.
Perhaps! Or maybe the large numbers of Chinese sites don't rank in the top 10 million worldwide by these metrics. I randomly searched for a few small cities near where I used to live and turned up their local news sites. I can't imagine that they get enough traffic to rank very highly, but they're still full of content just like local news station websites are in the US.
1 x
- Serpent
- Black Belt - 3rd Dan
- Posts: 3657
- Joined: Sat Jul 18, 2015 10:54 am
- Location: Moskova
- Languages: heritage
Russian (native); Belarusian, Polish
fluent or close: Finnish (certified C1), English; Portuguese, Spanish, German, Italian
learning: Croatian+, Ukrainian; Romanian, Galician; Danish, Swedish; Estonian
exploring: Latin, Karelian, Catalan, Dutch, Czech, Latvian - x 5181
- Contact:
Re: Why is so much internet content in Russian?
Ug_Caveman wrote:
Is it due to Russian acting as a lingua franca across Eastern Europe?
I don't think it's common for non-natives to use Russian as a lingua franca, except maybe in Central Asia. In Europe people use it because it's their native language or strongest L2, or to interact with monolingual speakers.
Some other reasons I can think of:
- English proficiency or a lack thereof. FIGS speakers are more likely to speak English and create content in English.
- Illegal content, including mirror sites. Interestingly, this includes sites where the interface is in Russian but the actual content is in another language: movies, downloads etc. People who are learning a language are still likely to google (or yandex) materials by using Russian search terms.
- There are still plenty of major sites which haven't been replaced by social media, for stuff like reviews/opinions, information, gossip and whatnot. On many subjects there's no single go-to site but several equally good ones. Heck, I was surprised to see how many legit weather forecast sites there are.
- Speaking of social media, the user base is indeed split between a bunch of sites, both global and Russian. If you're starting any kind of project (commercial or some kind of fansite), you'll probably need to have a webpage and crosspost to several social media pages.
- Emigrants/expats often continue using Russian sites and may create new sites for their community. If you're moving abroad there's probably an unofficial site/page where you can find information about the bureaucratic stuff in Russian. (This stuff is slowly moving over to social media/messenger chats. At least that's the case with Croatia.)
- Perhaps some of these sites are businesses from China or Turkey trying to attract Russian(-speaking) customers? Not only from these countries, obviously.
- Outdated SEO techniques are still widely used? Again, not sure how much influence this has unless enough people click these links/search results.
5 x
-
- Orange Belt
- Posts: 245
- Joined: Thu Jan 18, 2018 7:56 pm
- Languages: Dutch (N), English (C1), German (B1), Korean (high A2-low B1?)
- Language Log: https://forum.language-learners.org/vie ... php?t=7574
- x 330
Re: Why is so much internet content in Russian?
At the risk of sounding stupid (I don't quite understand the methodology used), but don't some countries have their own "internet platform"?
I'm not at home in technical internet terminology, but what I mean is that from what I know some countries have a platform/engine on which a large amount of sites, blogs, video channels etc are stored and accessed and which to my knowledge are not very easily accessible by "outsiders" or crawling software.
Just thought that if this is the case then many websites might not show up in the data at all.
Am I making sense or talking nonsense?
0 x
2020 resolution words learned:
Pages read at end of 2020: