example.srt:
1
00:00:06,000 --> 00:00:12,074
Advertise your product or brand here contact www.OpenSubtitles.org today

2
00:00:26,424 --> 00:00:29,087
Writers basically take readers hostage.

3
00:00:29,297 --> 00:00:33,261
You're forcing someone to spend 5, 6, 7, 8 hours in your brain.

4
00:00:34,182 --> 00:00:35,763
People have less time now.

5
00:00:35,764 --> 00:00:35,804
People have less time now.

6
00:00:36,054 --> 00:00:38,686
10 years ago you had a minor bestseller.

7
00:00:38,687 --> 00:00:38,727
10 years ago you had a minor bestseller.

8
00:00:39,227 --> 00:00:41,269
Now people are saturated with info.

9
00:00:41,519 --> 00:00:43,351
They have every excuse not to read.
...
The new file with both subtitles is a single .ssa or .ass (SubStation Alpha: https://en.wikipedia.org/wiki/SubStation_Alpha) subtitle file, which VLC and other major video players can run.
example.ssa:
[Script Info]
ScriptType: v4.00+
Collisions: Normal
PlayDepth: 0
Timer: 100,0000
Video Aspect Ratio: 0
WrapStyle: 0
ScaledBorderAndShadow: no
[V4+ Styles]
Format: Name,Fontname,Fontsize,PrimaryColour,SecondaryColour,OutlineColour,BackColour,Bold,Italic,Underline,StrikeOut,ScaleX,ScaleY,Spacing,Angle,BorderStyle,Outline,Shadow,Alignment,MarginL,MarginR,MarginV,Encoding
Style: Default,Arial,10,&H00FFFFFF,&H00FFFFFF,&H00000000,&H00000000,-1,0,0,0,100,100,0,0,1,1,0,2,10,10,10,0
Style: Top,Arial,10,&H00F9FFFF,&H00FFFFFF,&H00000000,&H00000000,-1,0,0,0,100,100,0,0,1,1,0,8,10,10,10,0
Style: Mid,Arial,18,&H0000FFFF,&H00FFFFFF,&H00000000,&H00000000,-1,0,0,0,100,100,0,0,1,2,0,5,10,10,10,0
Style: Bot,Arial,18,&H00F9FFF9,&H00FFFFFF,&H00000000,&H00000000,-1,0,0,0,100,100,0,0,1,2,0,2,10,10,10,0
[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:06.00,0:00:12.07,Top,,0000,0000,0000,,Advertise your product or brand here contact www.OpenSubtitles.org today
Dialogue: 0,0:00:06.00,0:00:12.07,Bot,,0000,0000,0000,,Annoncez votre produit ou votre marque ici, contactez www.OpenSubtitles.org dès aujourd'hui
Dialogue: 0,0:00:26.42,0:00:29.08,Top,,0000,0000,0000,,Writers basically take readers hostage.
Dialogue: 0,0:00:26.42,0:00:29.08,Bot,,0000,0000,0000,,Les écrivains prennent essentiellement les lecteurs en otage.
Dialogue: 0,0:00:29.29,0:00:33.26,Top,,0000,0000,0000,,You're forcing someone to spend 5, 6, 7, 8 hours in your brain.
Dialogue: 0,0:00:29.29,0:00:33.26,Bot,,0000,0000,0000,,Vous forcez quelqu'un à passer 5, 6, 7, 8 heures dans votre cerveau.
Dialogue: 0,0:00:34.18,0:00:35.76,Top,,0000,0000,0000,,People have less time now.
Dialogue: 0,0:00:34.18,0:00:35.76,Bot,,0000,0000,0000,,Les gens ont moins de temps maintenant.
Dialogue: 0,0:00:35.76,0:00:35.80,Top,,0000,0000,0000,,People have less time now.
Dialogue: 0,0:00:35.76,0:00:35.80,Bot,,0000,0000,0000,,Les gens ont moins de temps maintenant.
Dialogue: 0,0:00:36.05,0:00:38.68,Top,,0000,0000,0000,,10 years ago you had a minor bestseller.
Dialogue: 0,0:00:36.05,0:00:38.68,Bot,,0000,0000,0000,,Il y a 10 ans, vous aviez un best-seller mineur.
Dialogue: 0,0:00:38.68,0:00:38.72,Top,,0000,0000,0000,,10 years ago you had a minor bestseller.
Dialogue: 0,0:00:38.68,0:00:38.72,Bot,,0000,0000,0000,,Il y a 10 ans, vous aviez un best-seller mineur.
Dialogue: 0,0:00:39.22,0:00:41.26,Top,,0000,0000,0000,,Now people are saturated with info.
Dialogue: 0,0:00:39.22,0:00:41.26,Bot,,0000,0000,0000,,Maintenant, les gens sont saturés d'informations.
Dialogue: 0,0:00:41.51,0:00:43.35,Top,,0000,0000,0000,,They have every excuse not to read.
Dialogue: 0,0:00:41.51,0:00:43.35,Bot,,0000,0000,0000,,Ils ont toutes les excuses pour ne pas lire.
...
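To make the relationship between the two listings concrete, here is a minimal Python sketch of how one .srt cue could map to a pair of Dialogue lines. It is not the script's actual code (the program itself is a shell script); the function names `ass_time` and `dialogue_lines` are hypothetical, and truncating milliseconds to centiseconds is an assumption inferred from the listings above (for example, 00:00:29,087 appears as 0:00:29.08).

```python
import re

# SRT timing line, e.g. "00:00:26,424 --> 00:00:29,087"
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2}),(\d{3}) --> (\d+):(\d{2}):(\d{2}),(\d{3})"
)


def ass_time(h, m, s, ms):
    """Format an SRT time as H:MM:SS.cc.

    Assumption: milliseconds are truncated (not rounded) to centiseconds,
    as the example listings suggest.
    """
    return f"{int(h)}:{m}:{s}.{int(ms) // 10:02d}"


def dialogue_lines(timing_line, top_text, bot_text):
    """Return the two Dialogue lines (Top and Bot styles) for one cue."""
    match = TIMING.fullmatch(timing_line.strip())
    if match is None:
        raise ValueError(f"not an SRT timing line: {timing_line!r}")
    groups = match.groups()
    start, end = ass_time(*groups[:4]), ass_time(*groups[4:])
    # Field order follows the [Events] Format line:
    # Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
    return [
        f"Dialogue: 0,{start},{end},{style},,0000,0000,0000,,{text}"
        for style, text in (("Top", top_text), ("Bot", bot_text))
    ]


if __name__ == "__main__":
    for line in dialogue_lines(
        "00:00:26,424 --> 00:00:29,087",
        "Writers basically take readers hostage.",
        "Les écrivains prennent essentiellement les lecteurs en otage.",
    ):
        print(line)
```

Run on cue 2 from example.srt, this prints the same two Dialogue lines that example.ssa shows for that cue.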
This is what it looks like in my VLC player:
You first need to give it a source .srt file to translate. The program does not create .srt files; it only translates existing ones. Translations typically take 1-2 minutes.
The idea of the program is to be as simple and concise as possible, keeping maintenance low and reliability high. The program is a simple shell script with almost no dependencies (aside from extremely common command line tools that are included in nearly all Linux distributions). It has around 300 lines of code (compared to other projects with thousands), is a single file, and requires no installation. It uses Google Translate for its translations. The program is in ALPHA stage, and I've little idea when or if it'll reach BETA. Error detection is incomplete, and there is little instruction; however, it has worked very well for the over 100 .srt files I've put into it.
This program will never cost money, never be sold, never take on a copyright, and never ask for donations (nor do I want them). You may edit and distribute this program, but please notify me of bugs or code changes so we can all benefit. Again, if you try the program and it doesn't work, please let me know. Simply run the program with the debug flag (-d) and give me the *.log file.
Several options can be chosen at runtime, such as a very simple GUI mode, automatic source and target language detection, and the ability to change the translation engine to any other CLI engine. There are lots of other little tweaking options, too...
Currently, my tests indicate that it can translate English into 102 languages:
Afrikaans Afrikaans [af]
Albanian Shqip [sq]
Amharic አማርኛ [am]
Arabic العربية [ar]
Armenian Հայերեն [hy]
Azerbaijani Azərbaycanca [az]
Basque Euskara [eu]
Belarusian беларуская [be]
Bengali বাংলা [bn]
Bosnian Bosanski [bs]
Bulgarian български [bg]
Catalan Català [ca]
Cebuano Cebuano [ceb]
Chichewa Nyanja [ny]
Chinese Simplified 简体中文 [zh-CN]
Chinese Traditional 正體中文 [zh-TW]
Corsican Corsu [co]
Croatian Hrvatski [hr]
Czech Čeština [cs]
Danish Dansk [da]
Dutch Nederlands [nl]
Esperanto Esperanto [eo]
Estonian Eesti [et]
Filipino Tagalog [tl]
Finnish Suomi [fi]
French Français [fr]
Frisian Frysk [fy]
Galician Galego [gl]
Georgian ქართული [ka]
German Deutsch [de]
Greek Ελληνικά [el]
Gujarati ગુજરાતી [gu]
Haitian Creole Kreyòl Ayisyen [ht]
Hausa Hausa [ha]
Hawaiian ʻŌlelo Hawaiʻi [haw]
Hebrew עִבְרִית [he]
Hmong Hmoob [hmn]
Hungarian Magyar [hu]
Icelandic Íslenska [is]
Igbo Igbo [ig]
Indonesian Bahasa Indonesia [id]
Irish Gaeilge [ga]
Italian Italiano [it]
Japanese 日本語 [ja]
Javanese Basa Jawa [jv]
Kannada ಕನ್ನಡ [kn]
Kazakh Қазақ тілі [kk]
Kinyarwanda Kinyarwanda [rw]
Korean 한국어 [ko]
Kurdish Kurdî [ku]
Kyrgyz Кыргызча [ky]
Lao ລາວ [lo]
Latin Latina [la]
Latvian Latviešu [lv]
Lithuanian Lietuvių [lt]
Luxembourgish Lëtzebuergesch [lb]
Macedonian Македонски [mk]
Malagasy Malagasy [mg]
Malay Bahasa Melayu [ms]
Malayalam മലയാളം [ml]
Maltese Malti [mt]
Maori Māori [mi]
Mongolian Монгол [mn]
Norwegian Norsk [no]
Pashto پښتو [ps]
Persian فارسی [fa]
Polish Polski [pl]
Portuguese Português [pt]
Romanian Română [ro]
Russian Русский [ru]
Samoan Gagana Sāmoa [sm]
Scots Gaelic Gàidhlig [gd]
Serbian (Cyrillic) српски [sr-Cyrl]
Serbian (Latin) srpski [sr-Latn]
Sesotho Sesotho [st]
Shona chiShona [sn]
Sindhi سنڌي [sd]
Sinhala සිංහල [si]
Slovak Slovenčina [sk]
Slovenian Slovenščina [sl]
Somali Soomaali [so]
Spanish Español [es]
Sundanese Basa Sunda [su]
Swahili Kiswahili [sw]
Swedish Svenska [sv]
Tajik Тоҷикӣ [tg]
Tamil தமிழ் [ta]
Tatar татарча [tt]
Telugu తెలుగు [te]
Thai ไทย [th]
Turkish Türkçe [tr]
Turkmen تۆرکمنچه [tk]
Ukrainian Українська [uk]
Urdu اُردُو [ur]
Uyghur ئۇيغۇر تىلى [ug]
Uzbek Oʻzbek tili [uz]
Vietnamese Tiếng Việt [vi]
Welsh Cymraeg [cy]
Xhosa isiXhosa [xh]
Yiddish ייִדיש [yi]
Yoruba Yorùbá [yo]
Zulu isiZulu [zu]
The program will translate the following languages but very slowly (line by line):
Hindi हिन्दी [hi]
Khmer ភាសាខ្មែរ [km]
Marathi मराठी [mr]
Myanmar (Burmese) မြန်မာစာ [my]
Nepali नेपाली [ne]
Odia (Oriya) ଓଡ଼ିଆ [or]
Punjabi ਪੰਜਾਬੀ [pa]
It can likely translate the languages above back into English or into the others in the list, but I've yet to test that; there are just too many combinations to test. It's been heavily tested in all combinations of English, French, and Russian.
The newest version of the code can be found here: https://gitlab.com/bedtime_/srtssa/-/tree/master
*** EDIT ***
The program now has the ability to translate and merge .txt, .pdf, and .epub (html and xhtml-based!) files into a dual-language .txt file, which can be read in a reader or made into flashcards. The text is formatted in such a way as to break down paragraphs and sentences to make reading easier. The file may be any size.
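As a rough illustration of the idea, here is a minimal Python sketch of breaking a paragraph into sentences and pairing each one with its translation. This is only an assumption about what such a dual-language layout could look like, not the program's actual output format or code; `split_sentences` and `interleave` are hypothetical names, and the sentence splitter is deliberately naive (it will mis-split abbreviations such as "Dr.").

```python
import re


def split_sentences(paragraph):
    """Naively split a paragraph after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def interleave(source_sentences, translated_sentences):
    """Pair each source sentence with its translation.

    Assumed layout: source line, translated line, then a blank line.
    """
    if len(source_sentences) != len(translated_sentences):
        raise ValueError("sentence counts differ; cannot pair them up")
    blocks = [
        f"{src}\n{dst}"
        for src, dst in zip(source_sentences, translated_sentences)
    ]
    return "\n\n".join(blocks)


if __name__ == "__main__":
    english = split_sentences(
        "People have less time now. They have every excuse not to read."
    )
    french = [
        "Les gens ont moins de temps maintenant.",
        "Ils ont toutes les excuses pour ne pas lire.",
    ]
    print(interleave(english, french))
```

Keeping each sentence next to its translation is what makes the result usable both for side-by-side reading and for cutting into flashcards.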
An epub file was translated into a text document here: