My Chinese works

Post by Ericounet » Sun May 13, 2018 8:32 am

Hi,

I've been studying Chinese for 2 years now, and I've been working on vocabulary lists (the HSK lists and other frequency lists).
I converted them to a common format so they are easier to work with (importing into Anki, writing programs that use them).

I generated 3 data formats: CSV, XML and JSON.

Here is an example of each:

JSON format:

Code:

{
    "hanzi": "桌子 ",
    "traditional": "桌子 ",
    "pinyin": "zhuōzi",
    "translation": "table / bureau ",
    "classifier": null,
    "lesson": "HSK1",
    "sound": "[sound:cmn-60f2dada.ogg]",
    "origin": "Chinwa"
}


CSV format:

Code:

爱    愛    ài   aimer / affection / apprécier       HSK1   [sound:cmn-2d9d12c4.ogg]   Chinwa
八    八    bā   huit / 8       HSK1   [sound:cmn-5b366cae.ogg]   Chinwa
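For anyone who wants to load these lines in their own program, here is a minimal Node.js sketch of turning one line into a record shaped like the JSON example. It is not the author's code. It assumes the columns are separated by tabs (the separator is not visible in the forum rendering) and that they follow the same order as the JSON and XML fields, with an empty classifier column. The names `FIELDS` and `parseLine` are made up for illustration.

```javascript
// Field order assumed from the JSON and XML examples in this post.
const FIELDS = ["hanzi", "traditional", "pinyin", "translation",
                "classifier", "lesson", "sound", "origin"];

// Turn one line into a record object.
// Assumption: columns are tab-separated; pass another separator if not.
function parseLine(line, separator = "\t") {
  const cells = line.split(separator);
  const record = {};
  FIELDS.forEach((name, i) => {
    // Trim stray spaces around each cell; missing cells count as empty.
    const value = (cells[i] || "").trim();
    // Empty cells become null, like "classifier": null in the JSON example.
    record[name] = value === "" ? null : value;
  });
  return record;
}

// Example line rebuilt with explicit tabs (note the empty classifier cell).
const line = ["爱", "愛", "ài", "aimer / affection / apprécier", "",
              "HSK1", "[sound:cmn-2d9d12c4.ogg]", "Chinwa"].join("\t");
console.log(JSON.stringify(parseLine(line), null, 2));
```

Note that this sketch trims whitespace, so a value like "桌子 " would lose its trailing space.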


XML format:

Code:

<enregistrement>
    <hanzi>爱 </hanzi>
    <traditional>愛 </traditional>
    <pinyin>ài</pinyin>
    <translation>aimer / affection / apprécier </translation>
    <classifier/>
    <lesson>HSK1</lesson>
    <sound>[sound:cmn-2d9d12c4.ogg]</sound>
    <origin>Chinwa</origin>
</enregistrement>
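Since the three formats carry the same fields, going from a record object to the XML form is mostly string building. The following is only a sketch under assumptions, not the author's converter: `escapeXml` and `recordToXml` are hypothetical names, and it assumes that null or empty values should be written as self-closing tags, as `<classifier/>` is in the example.

```javascript
// Escape the characters that are special in XML text content.
function escapeXml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Serialize one record as an <enregistrement> element.
// null or empty values become self-closing tags, like <classifier/>.
function recordToXml(record) {
  const lines = Object.entries(record).map(([name, value]) =>
    value === null || value === ""
      ? `    <${name}/>`
      : `    <${name}>${escapeXml(String(value))}</${name}>`
  );
  return ["<enregistrement>", ...lines, "</enregistrement>"].join("\n");
}

const record = {
  hanzi: "爱", traditional: "愛", pinyin: "ài",
  translation: "aimer / affection / apprécier",
  classifier: null, lesson: "HSK1",
  sound: "[sound:cmn-2d9d12c4.ogg]", origin: "Chinwa",
};
console.log(recordToXml(record));
```

The element order follows the key order of the object, so the record should be built with its fields in the order wanted in the file.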


The lists are available with French, English and German translations (though not every list has all three).

It would be nice to have a sound for every record. I used the ones from the Shtooka databases, but many sounds are missing. I also did some audio splitting on the FSI Chinese recordings, but there is still a lot of work to do.

If anyone knows a free source of audio for Chinese words, it would be great to link it in the files.

So, if anyone is interested in these lists, I can put them on the website (for private use only: I gathered the original lists a long time ago and don't know their copyright status, and some of the websites no longer exist).

The origin is embedded in the files (when available).

Some of the lists available:
wikidictionary
Chinwa
HSK academy
Official HSK
...

The MaineEdu website has a ton of sentences recorded by different speakers, with English translations. I scraped the site and created big CSV, XML and JSON files with all the content (basically one big list of sentences with audio and translations).

An example of a JSON record:

Code:

 {
      "phrase": {
        "topic": "Talking with Children 孩子 - Character Review",
        "hanzi": {
          "simplified": "你 耳 朵 疼 吗 、使 劲 儿 咽 几 下 儿 。",
          "traditional": "你 耳 朵 疼 嗎 、使 勁 兒 咽 幾 下 兒 。"
        },
        "pinyin": "nǐ ěrduō téng ma?  shǐjìnr yàn jǐxiàr.",
        "translations": [
          {
            "translation": {
              "langue": "en",
              "texte": "Do your ears hurt?  Swallow very hard."
            }
          }
        ],
        "recordings": [
          {
            "recording": {
              "langue": "zh",
              "locuteur": "Cao Lihong",
              "audio": "../../Language/Sound19a/19103cao.wav"
            }
          },
          {
            "recording": {
              "langue": "zh",
              "locuteur": "Shao Jingxian",
              "audio": "../../Language/Sound19b/19103sjx.wav"
            }
          },
          {
            "recording": {
              "langue": "zh",
              "locuteur": "Ren Shuang",
              "audio": "../../Language/Sound19c/19103rs.wav"
            }
          },
          {
            "recording": {
              "langue": "zh",
              "locuteur": "Zhao Mo",
              "audio": "../../Language/Sound19d/19103zm.wav"
            }
          },
          {
            "recording": {
              "langue": "zh",
              "locuteur": "Li Xinzhou",
              "audio": "../../Language/Sound19e/19103lxz.wav"
            }
          },
          {
            "recording": {
              "langue": "en",
              "locuteur": "Cashmeira",
              "audio": "../../Language/Sounde19a/19103csh.wav"
            }
          },
          {
            "recording": {
              "langue": "en",
              "locuteur": "Cherie",
              "audio": "../../Language/Sounde19b/19103ca.wav"
            }
          }
        ]
      }
    },
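Because each sentence carries several recordings in two languages, a training program will probably want them grouped by language first. Here is a small sketch of that, assuming every entry has the shape shown above. The field names (`phrase`, `recordings`, `recording`, `langue`, `locuteur`, `audio`) come from the record itself, while the function name `recordingsByLanguage` and the output shape are made up for illustration.

```javascript
// Group the audio paths of one sentence entry by language code.
function recordingsByLanguage(entry) {
  const groups = {};
  // Tolerate entries without any recordings.
  for (const item of entry.phrase.recordings || []) {
    const { langue, locuteur, audio } = item.recording;
    if (!groups[langue]) groups[langue] = [];
    groups[langue].push({ speaker: locuteur, audio });
  }
  return groups;
}

// Shortened copy of the record above (two of the seven recordings).
const entry = {
  phrase: {
    topic: "Talking with Children 孩子 - Character Review",
    recordings: [
      { recording: { langue: "zh", locuteur: "Cao Lihong",
                     audio: "../../Language/Sound19a/19103cao.wav" } },
      { recording: { langue: "en", locuteur: "Cherie",
                     audio: "../../Language/Sounde19b/19103ca.wav" } },
    ],
  },
};
console.log(recordingsByLanguage(entry));
```

With the full record, the "zh" group would hold five speakers and the "en" group two, which makes it easy to pick a random Chinese voice for each review.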


I also converted the following dictionaries:

cedict
cfdict
handedict
chdict

I forgot to mention some audio EPUBs I made from websites like grammar-wiki, rudger's chinese, etc. Some still need some work.

http://divers.yojik.eu/

I'm writing my own training programs to use the lists.

Here are some screenshots (the programs are not finished yet, but they are in good shape!)

http://divers.yojik.eu/c1.png
http://divers.yojik.eu/c2.png
http://divers.yojik.eu/c3.png

Nothing original, just my own versions of programs found on the web, which I can customize with my own lists and exercises.

I also finished the first 2 modules of FSI-Chinese (here in PDF format only).
http://divers.yojik.eu/FSI-Chinese.pdf

I forgot to mention that I wrote all the formatting programs in Node.js. They are available (as source, of course) if anyone wants them.

As usual, send me any comments or updates ...

Eric
administrator of fsi-languages.yojik.eu;
French(N), German, English, Russian.
Chinese and Korean beginner

Re: My Chinese works

Post by crush » Wed May 16, 2018 12:03 am

I believe I've uploaded all the vocabulary from the FSI courses to Memrise; it might be easier to use that than to do it all yourself. I also added it to Skritter, I believe.