Iversen's Guide to Learning Languages (version 3b)

All about language programs, courses, websites and other learning resources
Black Belt - 4th Dan
Posts: 4870
Joined: Sun Jul 19, 2015 7:36 pm
Location: Denmark
Languages: Monolingual travels in Danish, English, German, Dutch, Swedish, French, Portuguese, Spanish, Catalan, Italian, Romanian and (part time) Esperanto
Ahem, not yet: Norwegian, Afrikaans, Platt, Scots, Russian, Serbian, Bulgarian, Albanian, Greek, Latin, Irish, Indonesian and a few more...
Language Log: viewtopic.php?f=15&t=1027
x 15359

Re: Iversen's Guide to Learning Languages (version 3b)

Postby Iversen » Fri Jan 29, 2016 7:38 pm

3. Third part - How to learn grammar

3.1. Grammar in general: Morphology, Syntax and Chaos

When it comes to learning grammar you should consider two cases: the easily specifiable morphology and the more elusive syntax (and the even more elusive idiomatics).

To learn morphology I make simplified tables according to my own ideas. When I have made up my mind about how some part of the morphological system functions I often write my own version down on thick green paper so that it doesn't get lost among all the white paper sheets I produce every day. While I'm still learning the morphology of a language I try to keep these green sheets within sight so that I can quickly check an obscure ending when I need to - that's at least as efficient memorywise as repeating conjugations and declensions all day long. But of course I have to study the tables in the books thoroughly in order to write my green sheets, and that is also part of the learning process. When I write "simplified" I take it to mean that I cut out everything that is based on only a few words. Exceptions should be learnt as exceptions, they shouldn't clutter your 'general case' tables. I normally don't use example words, but just indicate the infixes and the endings, plus maybe an indication of forms with likely vowel changes etc. (the exact formulations have to be decided for each language). The reason is that the sheets are to be used in situations where you have a concrete example in mind, not for rote memorization.

Syntax should generally be learnt in close conjunction with actual reading and listening, but some things can also be written down on green sheets - like for instance the cases used with certain prepositions or the main situations where you have to choose a perfective verb in a Slavic language.

To get a hold on some specific grammatical topic area the most efficient way is to see (and think about) a lot of examples in real life. Unfortunately even the thickest grammars generally lack the requisite avalanche of examples of a particular phenomenon which could hammer it into your head. For space reasons most grammars just have one or two examples of each phenomenon, and monographs have a tendency to concentrate on rare and maybe theoretically significant constructions rather than the common ones. Some language learning systems try to fill the void with drills (FSI - I'm looking at you!), but you have to be a masochist to find such systems amusing. Language guides typically try to vary their sentences, and they do so to such an extent that you can't see any system behind their sample sentences. One of the better in this respect is the small book called "Conversational English - Cebuano" - cf. the example below, but it would be even better if each example had a hyperliteral translation and an explanation.

My point is that it would be nice to have some well-structured collections that illustrate a few very common constructions - with the cases you actually will see over and over.


One way to use a grammar is to read about one topic and then try to look for examples - and maybe even make a collection. It definitely won't cover all the constructions you will see in your grammar book, but the important thing is to teach yourself to be more vigilant so that you actually notice the relevant things you are confronted with in real life. In other words: one of the main lessons you should get from studying grammar books is the system of categories you need to analyze concrete examples. Another important thing is to avoid being focused entirely on the 'macro meaning'. If your goal is to learn to produce relative clauses then it is irrelevant whether the heroine in your book gets eaten alive by giant zombie maggots or married to the hero. Only the semantic structure of the sentence is relevant.

The most influential grammar school among professional linguists in the world right now seems to be some kind of transformational grammar. This kind of linguistics popped up like a bombshell from outer space in 1957 with Noam Chomsky's book "Syntactic Structures", where he compared three main types of grammars: finite state grammars (roughly grammars that see utterances as structures constructed in a linear fashion), phrase structure grammars (i.e. models based on immediate constituent analysis) and his own proposal, the transformational generative grammar, which since 1957 has been revised again and again. But basically he expected to reach a point with this last kind of grammar, where it could produce all grammatical sentences and none of the ungrammatical ones (as judged by a native speaker). And this claim to generality is the thing that bothers me most. My own gut feeling tells me to start with something that has meaning (basically words, but affixes and certain combinations of words also qualify). The entities have certain construction possibilities, and on that basis I want to construct all the sentences in the universe. This is in fact how an old-fashioned grammar functions.

It is often stated that Chomsky proved that neither finite state grammars nor phrase structure grammars could deliver similar complete descriptions of a language. That's correct, but also somewhat misleading. He delivered some rather convincing evidence that a 'left-to-right' grammar would run into problems, but his arguments against phrase structure grammars were mostly based on the fact that they had to be very complicated to explain all complicated sentences, and they couldn't explain the differences in meaning between sentences with superficially identical structures. And he remedied this by adding a transformational layer. However after close to forty years it has to be said that transformational grammar(s) have failed to deliver. As far as I know NO complete transformational grammar of any language has ever been published (please tell me if you have seen one), whereas there are traditional grammars from the preceding generations which come close to being all-encompassing descriptions of their targets - like for instance Maurice Grevisse's "Le Bon Usage".

My own explanation for this sad state of affairs is that Chomsky was right in introducing a transformational layer, but he put it on top of the constituent structure grammar he just had dismissed as insufficient. The main difference between such a grammar and a valency (or dependency) grammar of some kind is that in the former you postulate a structure and then fill out the holes with concrete words, which you afterwards put in a certain morphological form. In a valency grammar you have words with some construction potentials, and you build sentences by combining the possibilities of the concrete words. Because the words are there from the beginning you don't have to ask yourself where the semantics come from, and the semantics are still there when you have made your transformations. This is illustrated in a very pedagogical way in Wikipedia (although the analysis of "with a gun" as part of the syntagm built around "man" is dubious):


One of the dubious elements Chomsky took over from the constituent structure grammars is the tendency to see binary structures everywhere, which includes the dubious NP + VP dichotomy. In contrast valency grammars generally see the finite verb as the one that organizes the main structure of the sentence - and that includes the subjects, different kinds of objects and adverbials. If there isn't a finite verb then the uppermost level must of course be another kind of word, for instance a substantive, or it could be an interjection. And sometimes the finite verb is a dummy or even missing, even though the structure of the rest of the sentence suggests that there should have been one. For instance you will typically omit the copula verb in the present in Russian (a 'copula verb' is something like 'to be' in English), but include it in the past tense - and apart from some complications around the case of the subject predicative the sentence structures are the same with and without a visible verb. But this is not the main argument - the point is that at least in the Indoeuropean languages the structure of the sentence is dictated by the verbal and not by one of the nominal clauses in it. And the choice of verb dictates the possible layouts of the sentence.
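
The claim that the choice of verb dictates the possible layouts can be sketched in a few lines of Python. This is only a toy illustration: the three-verb lexicon, the field names and the function `check_sentence` are invented for the purpose and are not taken from any published valency grammar. Adverbials are left out because they are normally optional.

```python
# Hypothetical mini-lexicon: each verb lists the fields it opens.
# The verbs and field names are illustrative assumptions only.
VALENCY = {
    "sleep": ("subject",),
    "see": ("subject", "direct object"),
    "give": ("subject", "direct object", "indirect object"),
}

def check_sentence(verb, filled):
    """Compare the fields a verb opens with the fields actually filled.

    `filled` maps a field name to the words that occupy it.
    Returns (missing fields, fields the verb has no room for).
    """
    required = VALENCY[verb]
    missing = [field for field in required if field not in filled]
    extra = [field for field in filled if field not in required]
    return missing, extra

# "She gives a book" lacks a receiver; "He sleeps it" has one field too many.
print(check_sentence("give", {"subject": "she", "direct object": "a book"}))
print(check_sentence("sleep", {"subject": "he", "direct object": "it"}))
```

The point of the sketch is that nothing like NP + VP is postulated in advance: the layout is read off the verb, and the words are there from the beginning.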

You might have expected a postulated grammar machine à la Chomsky to be ideal for constructing machine translation systems, but ironically the most successful system among these, Google Translate, has not a shred of transformational grammar in it. And no commercial language teaching system has ever been based entirely on transformational grammar.

My own stance on this is that the structure of simple sentences hasn't ever been described more efficiently AND pedagogically than with valency grammars, supplemented with elements from field structure grammars where applicable, while transformational descriptions are indispensable if you wish to describe and explain more complicated structures. And I have a gut feeling that this is the way language actually works: our language production mechanism is based on concrete words and affixes with built-in semantic and grammatical properties which may suffice to form the simplest structures - but then we use transformations to complicate things.

For me sentences are organized like Chinese boxes (or Russian матрёшка dolls), and at each level the central 'organizer' normally is a verb, - at least in the Indoeuropean languages. Attached to this verb are some fields, which can be filled out with single words or organized structures, generically known as syntagmas (syntagmata in correct Greek), and the most important type of syntagma is the nominal syntagma which has a noun as its core and articles, adjectives and some other kinds of words around it.

A sentence is a box with concrete words or boxes at different levels, and words are connected by rods which sometimes point out of a box. One kind of rod connects a verb and (for instance) a direct object, another kind connects pronouns with their 'antecedents', i.e. the things they point to (with interrogative pronouns this is expected to occur in the answer). But the structures can also be seen as linear constructs, and the rules that govern word order may refer to grammatical functions, although these don't suffice to dictate every aspect of word order.

For instance conjunctions have a tendency to stand first in a sentence, and by occupying this position they tend to push other elements to later positions - as evidenced by the word order in subordinate clauses in German:

"Der Mann ist hier" ---> "Ich habe gesehen, daß der Mann hier ist"
(the man is here ---> *I have seen, that the man here is)

But other languages don't have this rule, and the language learner has to be able to spot and analyze the differences in word order in different languages. That's part of learning a foreign language, and it would be much easier if your grammars and textbooks gave you a simple explanation and supplemented that with a lot of examples that clearly showed the mechanics.
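
The German rule from the example above can be written down as a tiny mechanism. The sketch below is a deliberately naive illustration: the function name `to_subordinate` is made up, the caller has to say which word is the finite verb, and it only handles simple clauses with one finite verb and no separable prefixes.

```python
def to_subordinate(main_clause, finite_verb, conjunction="daß"):
    """Turn a simple German main clause into a subordinate clause.

    The conjunction takes the first position and the finite verb is
    pushed to the end. Assumes the finite verb occurs once in the clause.
    """
    words = main_clause.split()
    words.remove(finite_verb)       # take the finite verb out of second position
    words.append(finite_verb)       # ... and put it last
    return conjunction + " " + " ".join(words)

print("Ich habe gesehen, " + to_subordinate("der Mann ist hier", "ist"))
```

A collection of examples organized around one such rule at a time is exactly what most textbooks don't give you.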

So to deal with a standard grammar imagine that you should make a short summary of each chapter, where you put the statements into some kind of table or a tree structure. Sort out what you really have to know and what mostly is there to placate the author's colleagues. Try comparing the statements of two grammars concerning some specific topic. You may find that they even differ on quite elementary things like morphology. Take an example: in Irish there are simple verbal forms and compound verbal forms, and different grammars and textbooks do absolutely not agree on where the dividing line is. Grammars should be taken with a pinch of salt: they are simple tools and shouldn't be regarded as infallible holy books.

Re: Iversen's Guide to Learning Languages (version 3b)

Postby Iversen » Fri Jan 29, 2016 8:08 pm

3.2. Morphology in general

So just call me old-fashioned, but I'll start out with some words about morphology - just as any old grammar did before Chomsky. Actually there is one layer below it, the phonemic layer, but grammar starts where meaning starts. And now I have made myself unpopular not only with those transformationalists who see meaning as an element you add late in the process, but also with those traditionalists who state that endings and other flective elements don't have a meaning in themselves. Nonsense - if I add a plural 's' to an English noun there suddenly are more than one of whatever that substantive refers to. That's a change in the meaning. Or if I insert an apostrophe before the 's' to mark that it's a genitive marker, that entity suddenly 'owns' something - well maybe in some figurative sense, but that doesn't matter - the apostrophe still provoked a change in the meaning. Sometimes it is hard to see any effect on the meaning, but this should only be seen as an aberration where a mechanism has become so automatized that you don't have alternatives.

There may be different genitive endings in other languages than English, but normally the choice is dictated by common consent once you have chosen a given substantive. So using an -a with a Russian masculine substantive doesn't convey a different meaning than using an -i with a feminine one. You may say: if some prepositions govern one and only one case, then you don't have a choice and the meaning has disappeared in a puff of green smoke. Well, it is definitely easier to see a meaning if you have contrasts between several possibilities to look at, but sometime long ago the speakers of that particular language chose a certain case for some reason, and that reason must have been semantically loaded. Automatizing constructions makes discernible meanings fade away, but at the origin there was a meaning.

So what causes automatization? It can simply be common consent which weeds out some of the available alternatives, but it can also be phonetic considerations - please note that I didn't write 'phonematic': here it is the actual sounds that interfere with the meaning bearing units. If you look at for instance Polish verbs you'll find a lot of apparent irregularities. But there are basically two kinds of irregularities: 1) irregularities caused by clashes between competing forms, 2) sound changes caused by phonetic circumstances.

In many languages some of the most common words are actually constructed like patchwork - witness the present tense forms of the English verb "to be": I am, you are, he IS, we are, you are, they are. Where did that "is" come from? Well, from another verb that long ago collided with a verb based on the 'a'-sound. The site Etymonline has these two explanations:

am (v.)
Old English eom "to be, to remain," (Mercian eam, Northumbrian am), from PIE *esmi- (cognates: Old Norse emi, Gothic im, Hittite esmi, Old Church Slavonic jesmi, Lithuanian esmi), from root *es-, the S-ROOT, which also yielded Greek esti-, Latin est, Sanskrit as-, and German ist.

In Old English it existed only in present tense, all other forms being expressed in the W-BASE (see were, was). This cooperative verb is sometimes referred to by linguists as *es-*wes-. Until the distinction broke down 13c., *es-*wes- tended to express "existence," with beon meaning something closer to "come to be" (see be).

Old English am had two plural forms: 1. sind/sindon, sie and 2. earon/aron The s- form (also used in the subjunctive) fell from use in English in the early 13c. (though it continues in German sind, the 3rd person plural of "to be") and was replaced by forms of be, but aron (aren, arn, are, from Proto-Germanic *ar-, probably a variant of PIE root *es-) continued, and as am and be merged it encroached on some uses that previously had belonged to be. By the early 1500s it had established its place in standard English. Art became archaic in the 1800s.

is (v.)
third person singular present of be, Old English is, from Germanic stem *es- (cognates: Old High German, German, Gothic ist, Old Norse es, er), from PIE *es-ti- (cognates: Sanskrit asti, Greek esti, Latin est, Lithuanian esti, Old Church Slavonic jesti), from PIE root *es- "to be." Old English lost the final -t-. See be. Until 1500s, pronounced to rhyme with kiss. Phrase it is what it is, indicating resigned acceptance of an unpleasant but inevitable situation or circumstance about which nothing positive really can be said, is attested by 2001.

"PIE" means Protoindoeuropean, which is the name for a reconstructed 'language' based on comparisons between forms in the oldest extant Indoeuropean languages and assumed to resemble their common ancestor.

It is more than likely that there is some phonetic explanation for the opposition between *es- and *wes-, but it must lie very long back in time, and as this source suggests the pesky little "is" may actually be traced back to the same root as "am". But while this may be interesting for the historical linguists it is the similarities or lack of similarities between the current forms which should be of interest to a modern language learner, even when this means that you have to disregard some historical facts.

Some linguists are too reluctant to cut the line back to the past, and one curious example of this is found in Polish grammar. Polish is a language with much more morphology than English, and it is also a language where the endings of verbal stems are very likely to be affected by the endings. But there is still a system behind the madness, and your task as a language learner is to internalize that system. Native speakers may have each form stored in their brains with a suitable set of construction instructions, but it took them many years to get so far. You are a language learner, and by seeing the logic behind the tables you can make a shortcut to learning them.

To ease my way into this arcane language I have bought Oscar E. Swan's "A Grammar of Contemporary Polish". Under the verbs it distinguishes four conjugations, and within each of these it stipulates a number of types named after the last part of their stems. Under the 2. conjugation Mr. Swan operates with a type based on "ł" (which is pronounced something like English "wh"), and he writes as follows:


So actually this group is named after a consonant that totally has disappeared from the verbal paradigm, but left its mark on at least one derived substantive. And in the analysis of the 1. person singular the author postulates a 'j', which is totally invisible - except that it is blamed for the shift from "ł" to a flat "l". In many cases it can however explain a sound change, and that's why linguists still operate with this shadow of a 'j', that may have been a real sound long ago.

I do appreciate etymology and explanations based on etymology, but for learning purposes it should only be invoked where it makes things clearer. If we only look at the present forms, the grand picture says that we have a system with four conjugations, each with a number of types with different stems, where we can expect a lot of sound changes - but there is some kind of logic in the changes. In fact it seems that almost all forms can be deduced if we just are informed about the shape of the 1. and 2. person singular. Do we then get these two forms served on a silver plate when we consult our Polish dictionaries? No, unfortunately not - shame on them! So we have to learn the most likely sound changes (and resulting spelling changes) by heart, and this is best done by noticing the relevant forms when we see or hear them.

We can use the Polish verbs to illustrate another type of mechanism, namely the combinatorics of the past tense. For this we use an illustration from "Pons: Grammatik kurz und bündig, Polnisch" (p. 31):


As you can see the forms contain either the letter "ł" or the letter combination "li", followed by an ending that reflects the gender in the singular and a distinction between personal and impersonal forms in the plural. Those of you who know Russian will recognize the 'l' sounds followed by essentially adjectival endings - but Polish has added a personal ending taken from the finite verbal forms. The "i" versus "y" that distinguish between personal and impersonal forms are some of the possible plural endings of adjectives, and the Ø - a - o in the singular 3. person are clearly also the nominative singular endings of an adjective. But how come that adjectival endings pop up in the morphological table for a verb? Well, the reason can be seen in Sorbian and the South Slavic languages. The weird l-form is actually an old active past participle. We don't have such a thing in the Romance or the Germanic languages, so it may surprise newcomers to the Slavic languages, but once you know the reasons for its development it doesn't seem as confusing any more.
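
The combinatorics of the Polish past tense can be sketched as a small function. This is a minimal sketch for a regular verb like pracować only: the function name `polish_past` and the labels "personal" and "impersonal" for the two plural genders are choices made for this illustration, and stems with vowel changes are not handled.

```python
def polish_past(stem, person, number, gender):
    """Combine stem + l-element + adjectival ending + personal ending.

    number: "sg" or "pl"
    gender: "m", "f" or "n" in the singular,
            "personal" or "impersonal" in the plural.
    """
    if number == "sg":
        # old participle: ł plus the nominative singular adjective ending
        participle = stem + "ł" + {"m": "", "f": "a", "n": "o"}[gender]
        ending = {1: "m", 2: "ś", 3: ""}[person]
        if ending and gender == "m":
            ending = "e" + ending   # support vowel after the bare ł
    else:
        participle = stem + ("li" if gender == "personal" else "ły")
        ending = {1: "śmy", 2: "ście", 3: ""}[person]
    return participle + ending

for person in (1, 2, 3):
    print(polish_past("pracowa", person, "sg", "m"),
          polish_past("pracowa", person, "sg", "f"),
          polish_past("pracowa", person, "pl", "personal"),
          polish_past("pracowa", person, "pl", "impersonal"))
```

Four short lookup tables are enough to produce every form in the table for this verb, which is the whole argument for looking at the mechanism instead of the list.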

Let's have a closer look at the construction with an auxiliary, using the verb "imenovati" ('to name') in Croatian. The source is Pons' "Verbtabellen Kroatisch" (in German). The layout is different, but contains the same elements as the one given above for pracować. The new thing is the finite verb form in front of the participle, and these are actually the short present forms of the verb for 'to be', "biti". There is also a long series: "jesam, jesi, jest, jesmo, jeste, jesu", but as an auxiliary verb the short series is used. So "ja sam imenovao" actually would mean something like "I am been-naming", which became the most common past tense form, pushing out the other two in the process: one named aorist and another named imperfect. In Croatian the auxiliary verb survived (as in for instance Serbian and Bulgarian), but it disappeared in most of the other Slavic languages.


The Southern Slavic languages can be used to explain another puzzling item in Polish and Russian. To the left below you see the conditional of Croatian "imenovati", and here the auxiliary form is actually the otherwise extinct aorist of "biti". In Russian you will recognize this as the indeclinable "бы", which now is seen as some kind of particle and not a verbal form. And as usual the Poles have chosen the most complicated solution they could think up, namely incorporating their "by" into the Polish conditional, which therefore consists of both the usual adjectival elements, the "by"-thing (which at least is left uninflected) followed by an inflected personal ending (seen to the right):
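
The Polish conditional described above can be sketched in the same spirit: the old participle, then the uninflected "by", then the personal ending. As before this is only an illustration for the regular verb pracować; the function name `polish_conditional` and the gender labels are assumptions made for the sketch.

```python
# Personal endings that follow "by" (the same set as in the past tense)
PERSONAL_ENDINGS = {
    ("sg", 1): "m", ("sg", 2): "ś", ("sg", 3): "",
    ("pl", 1): "śmy", ("pl", 2): "ście", ("pl", 3): "",
}

def polish_conditional(stem, person, number, gender):
    """Build participle + "by" + personal ending.

    gender: "m", "f" or "n" in the singular,
            "personal" or "impersonal" in the plural.
    """
    if number == "sg":
        participle = stem + "ł" + {"m": "", "f": "a", "n": "o"}[gender]
    else:
        participle = stem + ("li" if gender == "personal" else "ły")
    return participle + "by" + PERSONAL_ENDINGS[(number, person)]

print(polish_conditional("pracowa", 1, "sg", "m"))          # pracowałbym
print(polish_conditional("pracowa", 3, "pl", "personal"))   # pracowaliby
```

Because "by" ends in a vowel, the personal ending needs no support vowel here, so the mechanism is if anything simpler than in the past tense.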


Warning: If you try to learn all the forms of the Polish verbs by rote memorizing you will drown! You'll find the nearest high tower and jump, screaming all the way down! Your rescue is to pin down the mechanisms that produce the thousands of forms under specific conditions. Then you can start combining the elements in the hope that they add up to the forms actually in use. Maybe you can do this by some kind of subconscious magic, but I prefer looking at the forms to see the pattern. Native speakers have seen all the forms so often that they have stored them as items in their brain, but if you are a newcomer to Slavic grammar then you don't have direct access to such a warehouse of readymade forms - but once you understand the logic behind the constructions you can learn to handle them with grace and ease.

In other languages there will be other mechanisms and other patterns. The examples from the Slavic languages just serve to illustrate how the history of a language still is present in its modern form. You may not know about the history, but you can still benefit from looking for the structures it left behind.

Re: Iversen's Guide to Learning Languages (version 3b)

Postby Iversen » Fri Jan 29, 2016 8:17 pm

3.3. Green sheets

There are language learners who claim never to study grammar, and evidently children don't do it, so learning a grammar book by heart can obviously not be the only way to learn grammar - and maybe it isn't even a viable method. If you want to learn grammar - and now I'm not only thinking of morphology, but also syntax and idiomatics - you have to do it in an interplay between the discovery of rules and testing them out in practice, and you have to do this in stages. Here I'll assume you aren't allergic to grammar books, but they can be intimidating - how can you survive grammar?

First, try to get more than one grammar. If you can't get a fullsize grammar then even the short sketchy grammatical sections of travelers' language guides will be better than nothing, but it will be easier to see the patterns if you see them from more than one angle. Your first task will be to look through the morphological sections and compare them. Do they agree on the terminology? The order of cases? The number of declensions? Do they divide the verbs into corresponding groups, and do they list the same verbal forms? Probably not, which may come as a nasty surprise to some learners.

Now look at the adjectives and the substantives. Do their endings in the different cases - if there are cases at all - look almost similar or not? Are there articles? Look at the verbs in the same way, - try to get a comprehensive view of the whole morphology in this way. Then leave the morphology aside and read the syntactical sections with the same critical attitude. Which kinds of subordinate sentences are there, and which constructions with infinite verbal forms do you find, which may or may not correspond to subordinate constructions and vice versa in the languages you already know. Remember, you are not supposed to learn any of these things by heart yet, just find out what there is to learn later.

Next step, - you have to learn something by heart, sorry. But don't do it without also having some texts to use at the same time. I say texts because I find it easier to read than to understand spoken words in the beginning, - if you have a teacher then by all means listen to him/her, but find some things to read also, - preferably bilingual texts. The internet may be a good source for parallel texts, or you can get some from text books or touristical guidebooks, but DO try to use bilingual texts in the beginning - this will spare you a lot of misunderstandings and a lot of half-understood constructions along the way. And most translations aren't so precise that they will do all the work for you - you will still have to look things up. If you can find hyperliteral translations then just be happy, but they are rare. If necessary, read the chapter about bilingual texts to learn to make them yourself.

Among the first things to learn by heart would be the main forms of the most common verbs including the auxiliaries, the pronouns, prepositions and things like that. You will have to learn them by heart eventually so you can just as well start now. Do what most people do: read them aloud many times, write them, find them in your texts and identify the forms, make associations (if you can) and so forth. Do the same kind of forced slave labour with some of the forms of articles and - and then put the grammar or grammars away and look for the forms in your texts. You could be learning all the forms in the book without being able to identify them or use them in actual sentences. And the problem is worse with grammar than with words. So the process where you skip back and forth between the grammar book and the world outside the book should start as early as possible.

Next step: look at the morphological tables and try to find the main distinctions and separate the exceptions from the regular forms - taking into account the problem I described in the preceding chapter, namely that phonetic rules may cause regular changes in the forms that were supposed to be regular. This means that the 100% regular paradigms may be rare, but then you have to see what you can rescue from the mess. There must be something regular, otherwise the language would be unlearnable.

When you think you see a regular pattern you should try to write it down on paper. Maybe just ordinary white paper in the beginning, but eventually you should use some better paper which makes these sheets easily recognizable and more durable. I use thick green paper, and therefore my word for these records has become 'green sheets'. The idea is that you keep these colored sheets within sight whenever you work with the language - personally I use a note stand.

If you see a form that bothers you (or you need a specific form while writing) then look at your collection of coloured sheets. Making these sheets yourself makes you think about each single form, and looking at them daily for maybe a month will make them into something like an extension of your brain. Therefore it is also extremely important that you settle for a specific way of presenting the facts, because the system you choose with a bit of luck eventually will be the system that gets ingrained into your brain. Maybe there are a few small errors - OK, then correct them, but if you get unhappy with the layout itself then you may have to change the system.

As you probably have noted your grammars aren't in total agreement. Maybe you can even spot some inconsistencies. Now think hard about a way to organize the forms of articles and adjectives and nouns on one sheet (maybe two), and all the verbal forms on another - and do it in a logical fashion. For instance all Germanic languages have strong and weak verbs (the first group basically changes the verb stem through the tenses, the second depends entirely on endings for its distinctions). Your tables should show that double system in some way - use colors or special signs or different kinds of dividing lines for such things. Don't put irregular forms into your tables for the regular forms, - if a set of endings is used only by two or three verbs then leave them for a list over irregular nouns or verbs or whatever - these tables aren't meant to contain everything, but only the basic things which you must learn as patterns sooner or later.

When I first wrote about my 'green sheets' almost everybody criticized that I only wrote the endings. But this criticism was misguided: by using whole example words you tie the tables to a few example words. That isn't a bad idea if you expect people to learn grammar by studying the tables. However in practice you will almost always have a specific word in mind when you use the tables on the green sheets, so it doesn't matter that they only contain the endings. And with only the endings you can make the tables much more compact so that you ideally can fit the whole regular part of the morphology of any (friendly) language into maybe 4 or 5 sheets. Plus a number of sheets for pronouns and other more or less irregular adjectives, nouns and verbs.

For instance it takes many pages to get through the forms of the adjectives and the substantives in a Latin grammar. This allows the authors to mention exceptions and give examples of words that adhere to a certain pattern. The result is that it takes time to find the relevant spot if you are in doubt about a certain form. And to boot the rules for the use of each form are typically found in another chapter. One green sheet can contain the most common of all the forms in the book, and the price for this is that it doesn't include the exceptions or lists of words in each category (and even less the syntactical rules) - but you can find the information it does contain in a flash.

To learn syntax you can to some degree make similar 'green sheets' - like lists of the cases governed by different prepositions or the verbal forms used in different kinds of subordinate clauses or non-finite verbal constructions. But most of the syntax has to be learnt using other methods because those things depend on structures that can't easily be given in table form.

Let's stick with morphology and have a look at some concrete examples. First the Icelandic verbs:


From top to bottom you see first the forms of the present indicative, below them the subjunctive, then the past indicative followed by the past conditional. And at the bottom the non-finite forms. From left to right you first see five tables with weak verbs, then one full table with a strong verb and a few forms of one more (mostly to point out that the 2nd person singular may have a special form). Along the right margin you see eight patterns of vowel changes in strong verbs plus the compound verbal forms, exemplified by the first person singular.

You will notice that truly irregular verbs like "að vera" (to be) aren't included. The strong verbs are there because they aren't really irregular - they just have a vowel shift, which is mostly predictable. And if not, then you simply have to learn the pattern with each verb. There is one detail which you should notice: the forms köllum and kölluðum from the verb "kalla" (call) look irregular, but it is actually the rule that a "u" in the following syllable changes an "a" into "ö".
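The rule that a "u" in the ending turns an "a" into "ö" is mechanical enough to be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names are invented for the occasion, the stem "kall-" is cut out of the example above, and the sketch only looks at the last vowel of the stem and the first vowel of the ending, ignoring everything else that can happen to Icelandic vowels.

```python
VOWELS = set("aáeéiíoóuúyýæö")


def first_vowel(text):
    """Return the first vowel in text, or None if there is none."""
    for ch in text:
        if ch in VOWELS:
            return ch
    return None


def attach_ending(stem, ending):
    """Attach an ending, turning the last stem vowel 'a' into 'ö' before a 'u'."""
    if first_vowel(ending) == "u":
        # Walk backwards to the last vowel of the stem (simplifying assumption:
        # only that vowel is affected).
        for i in range(len(stem) - 1, -1, -1):
            if stem[i] in VOWELS:
                if stem[i] == "a":
                    stem = stem[:i] + "ö" + stem[i + 1:]
                break
    return stem + ending


for ending in ("a", "ar", "um", "uðum"):
    print(attach_ending("kall", ending))
```

With this in your head the forms "köllum" and "kölluðum" are no longer exceptions, but just the regular endings plus one rule of thumb.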

The vowel changes in the paradigms of the strong verbs were actually also caused by endings, but at an earlier stage: the endings that caused them belonged to the "Proto-Norse" language which preceded "Old Norse", and they disappeared before the Old Norse stage - but left their mark in the vowel changes of the strong verbs.

And by the way: if you think the system above reminds you of High German then you aren't quite off the mark: Icelandic and High German have both preserved their Medieval inflections almost unchanged, where other Germanic languages have simplified their system.

The table below contains just about anything you need to know about the nominal syntagms of Icelandic, including pre- and postpositioned articles, regular adjectives and regular weak and strong substantives. The first column gives the case names (nominative, accusative, dative and genitive), the following three columns give the 'strong' forms of the adjectives, then after the double vertical line you see the prepositioned articles (derived from a demonstrative pronoun), the 'weak' adjectival forms, the strong substantives, the weak substantives and the postpositioned articles for the masculine, followed by the same forms for the feminine and the neuter.


There is one truly rotten case of terminology at play here, but I can't avoid it: when you speak about weak and strong verbs and substantives you speak about different classes of verbs and substantives, but with adjectives you have the same adjectives in roles where they use either the weak or the strong set of endings: the weak set when there is an article or something similar in the slot before them in the substantival syntagm, the strong set when there isn't such a thing or when they are used alone. The presence of an adjective also determines the position of the article: without an adjective you generally use the postclitic article, but with an adjective you have to use the prepositioned independent form.

On the Icelandic sheet right above you have probably noticed that some columns have been marked "AJ" (adjective), but the rest of the columns have not been marked with word class - after all I made the sheet for my own personal use. But this gives me an opportunity to explain some principles behind the setup.

First you have three columns marked with the signs for masculine, feminine and .. OK, my own invention .. neuter (¤). These columns refer to adjectives standing alone, most likely as some kind of predicative. Inside the following three sections - one for each gender and separated by double vertical lines - you first see the prepositioned definite article (there is no indefinite article in Icelandic) plus one form for the adjective. This is a reflection of the structure of the language: you use the prepositioned article if there is an adjective in a substantival syntagm. Otherwise you use the postclitic article, which is shown after the substantive endings. And that article has its own endings, so in a substantive with such an article there are actually two endings in one word: substantive + substantive ending + article + article ending. The endings of the substantive are divided by a vertical dotted line into endings for strong substantives and endings for weak substantives. In the dative you see a cross over an -m. That's because there is an ending -um when there isn't a postclitic article, but the last -m is dropped in the presence of the article: -um ---> -u(m)num = -unum. Apart from that spot it's all a matter of simple combinatorics. There are no other assimilation phenomena.
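The 'simple combinatorics' of substantive ending plus postclitic article can be sketched in Python as follows. Treat it as a rough illustration: the example word "hestur" (horse) and the endings in the usage lines are taken from ordinary Icelandic grammars rather than read off the green sheet, and the function name is invented for the illustration.

```python
def definite_form(stem, noun_ending, article):
    """Glue stem + substantive ending + postclitic article together."""
    # The one assimilation mentioned above: the dative plural -um loses
    # its -m in front of the article (-um + -num ---> -unum).
    if noun_ending == "um" and article:
        noun_ending = "u"
    return stem + noun_ending + article


print(definite_form("hest", "ur", "inn"))  # nominative singular with article
print(definite_form("hest", "um", "num"))  # dative plural with article
print(definite_form("hest", "um", ""))     # dative plural without article
```

The point of the sketch is that one special case is enough; everything else is plain concatenation of the endings on the sheet.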

When I made the corresponding green sheet for Latin it turned out to be necessary to give it another structure. Latin doesn't use articles (which simplifies matters greatly!), and the endings of the adjectives are very close to those of the substantives. And given the plethora of endings for each group it was logical to show the substantives and the adjectives on top of each other. In Icelandic it was possible and desirable to show the structure of the noun phrase, whereas that's less relevant in Latin. On the other hand Latin has more 'columns' to fit into the table both for adjectives and nouns, but the sets of endings are mostly the same for the two word classes.


The sheet below shows the forms of the adjectives and substantives in Modern Greek (the articles should really have been there too, but they don't combine with the substantives like they did in Icelandic so their absence is less problematic):


The main feature is one you can't see, namely that Greek adjectives have three genders, and there are some typical 'ending sets': if the masculine singular nominative is ος, then the feminine and neuter can be α ο or η ο, but if it is υς then they will almost certainly be η and υ or ια and υ. So the columns with adjectival endings go from top to bottom, whereas the substantives have one gender each, so here a column contains masculine, feminine and neuter substantives with similar endings. The table is not complete - I have left out some Katharevousa forms - but it contains just about everything a beginner needs to know. Except the articles and other determiners of course.

The last green sheet is different: it doesn't show morphology, but syntax in the shape of the cases governed by different prepositions in Polish. And the important thing here is to illustrate some simple rules of thumb: with one exception (z/ze) the prepositions governing the genitive ONLY govern this case. Those that govern the instrumental also govern the accusative apart from one that only governs the instrumental and one that governs the genitive instead of the accusative. Similarly those that govern the locative also govern the accusative apart from one that only governs the locative ... and finally there are a few which are reserved for the dative:

Postby Iversen » Fri Jan 29, 2016 9:01 pm

3.4. Syntax: subordinate clauses

There is something called a pronoun and another thing called a conjunction. A pronoun is a word that points to something else. If it has an implied reference in the real world it is a demonstrative, if the reference is just something vague somewhere then it is an indefinite, if the solution is expected to come in an answer to a question then it is interrogative, and if the reference goes to something in the (main) sentence then it can in certain cases be a relative pronoun, or it can be a demonstrative. I'm not going into any philosophical hairsplitting about these traditional categories - show me a language where they are irrelevant, and we'll work out a solution from there.

Conjunctions are things that tie subordinate sentences to something in the main sentence, or they can tie two or more elements together within one sentence. Right now we only deal with conjunctions at the sentence level, and the conjunctions here can be pronouns, but also words which from a synchronic point of view don't seem to have any trace of anaphoric function. However even these conjunctions may once have had a reference - it is a strange coincidence that most conjunctions in argument clauses (or whatever you call them - "phrases complétives" in French) look like demonstratives or had older forms that did. So maybe the idea once upon a time was that they had a reference, namely the whole clause following them: "I say that: this is a weird idea" ---> "I say that this is a weird idea". The Danish "at" is one of the conjunctions that look meaningless, but it goes back to Old Norse "þat", which is still used in Icelandic as "það" - the same word as English "that".

We have to broaden the definition of the word 'pronoun'. This word literally says that it is something that replaces a substantive, though most grammars in spite of the name tacitly also accept adjectival pronouns. For me there is no reason to exclude words that typically have adverbial functions from the pronouns, so I consider words like "then", "how", "so" and "sometimes" as just as much pronouns as "me" or "what" or "somebody". Or I call them pro-words to placate the manes of those bygone grammarians who coined the word 'pronoun'. This change in terminology has consequences for the analysis of certain subordinate clauses, including a number formerly called adverbial ("I'll tell you more when I'm ready").

I will now just sketch the main types of subordinate clauses, using the notions that I have just introduced.

The first group are the argument clauses, which are used in typically substantival functions - mostly with verbs that indicate questions or affirmations. There are three kinds:
1) those where the clause represents a simple declarative sentence
it is weird ----> I say that it is weird
2, 3) those where it is based on 'full' or 'partial' questions
is it weird? ----> I don't know whether this is weird
why is it weird? ---> I don't know why this is weird

The second group is formed by the relative clauses (in a broad sense), where the conjunction is a relative pro-word with some element in the surrounding main clause functioning as its antecedent. If you would isolate the subordinate clause then you would have to replace the relative pronoun with its antecedent:

This sentence may sound weird ----> I have said a sentence, which may sound weird
My sentence may sound weird ----> I am the person, whose sentence may sound weird
I said a weird sentence today ----> This is the day, where I said a weird sentence

NB: in most grammatical traditions the last sentence would be seen as adverbial because the antecedent is analyzed as a temporal adverbial, but the parallel with the preceding sentences is obvious, and besides the so-called 'adverbials' which indicate time and amount are more often than not filled in with a substantive.

The third group contains a mixed construction without antecedents, but also not something which somebody would have said as a sentence or question. Some have called them 'independent relative clauses', but this has no basis in reality - except maybe that they often can be replaced by true relative clauses with a dummy antecedent. Their conjunction is mostly filled in by an interrogative pronoun:

nonrelative: Some day I'll say a weird sentence ---> You will be somewhere else when I say a weird sentence
relative: You will be somewhere else the day (when) I say a weird sentence

The message in all this is once again that you should try to dissolve complicated structures into their components, and in the case of embedded sentences this means that you should try to find the logic in the way normal sentences change when they are incorporated as subordinate clauses in other sentences. And as far as I can see such structural arguments lead to a more logical categorization system than the traditional one, which is based on word classes.

Relative pro-words in English and other Western languages typically have two roles: one as conjunction and another as pronoun. The system with combined conjunctions-cum-pronouns can break down in a number of ways, such as when the roles of pronoun and conjunction are separated or when a word in the subordinate clause duplicates the pronoun aspect of the conjunction-cum-pronoun. Actually it is weird that we have words with two roles, and many languages outside the Indo-European group haven't got them, but any grammatical system worth its salt should be able to deal with such phenomena. For a lot of examples see A. Murelli's dissertation "Relative Constructions in European Languages: A Look at Non-Standard". I would like to mention a couple of constructions in Danish which are considered errors - but they are so common that they could just as well be accepted as correct and inevitable parts of Danish.

We have a construction with "der" as 'preliminary' subject:
"Der kommer en dame nu" (* there comes a lady now)

.. and we have substantival clauses with 'at':
"Jeg hører at der kommer en dame nu" (* I hear that there comes a lady now)

We have relative clauses with "som" and "der" and nothing. "Som" (roughly = 'as') can be subject and object, "der" (roughly = "there") can only be subject and nought (■) can be object, but not subject. The funny thing is that 'at' in informal language can penetrate into relative constructions, but even this system obeys a clear logic - as if the 'at' tries to usurp the role of general conjunction in all kinds of subordinate clauses, which leads to problems with the double nature of the pro-word.

Standard constructions:

subject: Her er damen, der/som kommer nu (* here is theLady, that/who comes now)
object: Her er damen, som/■ jeg har set komme (* here is theLady, whom/■ I have seen come)

Non-standard constructions:

subject: Her er damen, som at der kommer nu (* here is theLady, as that who comes now)
object: Her er damen, som at ■ jeg har set komme (* here is theLady, as that I have seen come)

The interesting thing here is that only "som at der" (or just "som at") is possible when the pronoun(s) fill the role of subject - not "der at som". This hints at the possibility that the construction with the 'preliminary' subject is somehow involved - in that construction only "der" can be used. However there ought not to be any need for a preliminary subject. This interpretation can be reconciled with the observation that "der" can't be included if the relative "som" is used as object in the clause.

In some other languages the combined 'conjunctional' and 'pronominal' roles of the relative pro-word are the norm, but once in a while the language users apparently feel the need to reinforce the latter by adding another pronoun later in the sentence. As in Greek (authentic example):

Η γυναίκα του γιάτρου που τον είχαμε γνωρίζει στο Ρόδο
the woman of theDoctor whom him we-had (learnt-to-)know in-the Rhodes

Here "που" is an indeclinable relative pronoun, and maybe it is the wish to identify the doctor as the one somebody met in Rhodes (and not his wife) that leads the speaker to add the extra 'τον'. But examples like this are apparently legal in Greek because I have seen them several times.

In still other languages this 'extra' representation of the pro-word is not only normal, but even the norm - see Murelli's text. Which all goes to show that the category known as relative clause isn't a monolithic construction. We can't assume that all languages behave as the Western languages do.

And the point of this is not to teach you about Danish or other languages, but to point out that you at some level HAVE to analyze sentence structures in order to understand the differences between two languages, and simple tools like sticks and boxes and pronominal references to outside elements are practical tools to do this. All the fine words in the grammars basically have to be translated into such models to be applicable in practice - the words alone aren't enough.

And finally: I wrote in the preceding chapter that word order can't simply be deduced from dependency structure, but there is clearly an interplay between the two. In those cases where word order takes precedence over dependency structures this results in field models, where you fill out a number of slots. A classical example of this is the difference between word order in main and subordinate clauses in German. Here the finite verb in a subordinate clause takes refuge at the last position in the clause, and in the case of compound verbal forms it is the finite part that goes to the very last position.

Dies ist schön (this is nice/pretty) ---> Ich sagte, daß dies schön sei
Dies ist schön gewesen ---> Ich habe gesagt, daß dies schön gewesen sei (aber nicht mehr)
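A field model like this one can be written down quite mechanically. The following Python sketch is only meant to show the idea of 'filling slots': the function and parameter names are hypothetical, the caller has to say which word is the finite verb, and the switch to the subjunctive ("ist" ---> "sei") is simply passed in by hand.

```python
def subordinate_clause(words, finite_index, conjunction="daß", finite_form=None):
    """Move the finite verb of a main clause to the last slot of a subordinate clause."""
    # Optionally swap in another form of the finite verb (e.g. the subjunctive).
    finite = finite_form or words[finite_index]
    # Everything except the finite verb keeps its relative order.
    rest = words[:finite_index] + words[finite_index + 1:]
    return " ".join([conjunction] + rest + [finite])


print(subordinate_clause(["dies", "ist", "schön"], 1, finite_form="sei"))
print(subordinate_clause(["dies", "ist", "schön", "gewesen"], 1, finite_form="sei"))
```

The second call shows the case with a compound verbal form: the participle stays where it is, and only the finite part moves to the very last position.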

Another example: in Greek adjectives and possessive pronouns are mostly placed after the substantives they qualify if only one of them is present, but if you have certain words in front of the substantive or there is both a possessive and an adjective then the adjective is moved to the position between them: "οι διακοπές μας", but "οι καλοκαιρινές μας διακοπές" ('the holidays our' resp. 'the summerly our holidays'). Again, I don't mention this to inform you about something in Greek, but to emphasize that all grammatical rules have to be formulated in simple mechanical terms to be useful. And if your grammar doesn't do this for you, you have to do it yourself.

Postby Iversen » Fri Jan 29, 2016 9:10 pm

3.5. Syntax: tying a knot

One way of studying a grammatical problem is first to read about it in a grammar book and then proceed to find some genuine examples. If it is a common phenomenon this shouldn't be too hard, especially in those cases where you can use a search engine to find them. If you haven't seen any example outside your grammar then it might still be relevant to learn about it, but just as with words and expressions the more common things should take precedence.

One little, but important warning: don't waste time on writing down the examples in their full length, but cut them down to the important part - and don't try to remember the examples you find as full sentences, but cut them down to short mnemonic formulas such as "to do something to somebody" (with suitable dummy words). If you want to find them again you can add a reference.

Later on you should keep a notebook for funny syntactical items, not least those that you have been looking in vain for. This will keep you alert, and being alert is one of the most efficient things when it comes to learning languages. This also applies to idiomatic expressions, which is in a sense the continuation of syntax when it has become too fragmented to put into a fixed structure.

For me the key to memorizing syntactical patterns is reduction of these patterns to something I can boil down to a rule of thumb and/or visualize in the form of sticks and boxes (or bubbles, as below).

To take a concrete example: assume that I have seen the French relative pronoun 'lequel' for the first time, and now I want to learn to use it. My grammar tells me that it is both a relative pronoun and an interrogative pronoun, and a big fat old grammar like Grevisse would also give some examples from venerated and famous authors - including some written in a style that would feel obsolete and stilted even to native French writers of today. I am perfectly aware that these quotes are necessary for the linguist who has to research something before writing a paper about it, but they are too complicated for me, who just wants to learn the use of the word "lequel". So what I want to see is the morphological information that "lequel" is inflected ("laquelle" if the reference is feminine, "lesquels" and "lesquelles" in the plural, "duquel" and "auquel" when combined with the prepositions "de" resp. "à") plus some simplified examples of its use in different constructions:

As an interrogative pronoun:

Lequel des [noun phrase in plural] … ?

As a relative pronoun, nominal function:

[antecedent], lequel ….
[antecedent], lequel ….

As a relative pronoun, adjectival function:

[antecedent], lequel [noun]…. (stone dead, but can be seen in older literature)
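The morphological part of this can be boiled down to a tiny table plus one rule for the fusion with "de" and "à". Here is a minimal sketch in Python, where the function name and the way gender and number are coded ("m"/"f", "sg"/"pl") are just assumptions made for the illustration:

```python
# Base forms by (gender, number)
BASE = {
    ("m", "sg"): "lequel",
    ("f", "sg"): "laquelle",
    ("m", "pl"): "lesquels",
    ("f", "pl"): "lesquelles",
}

# "de" and "à" fuse with the le-/les- part, just like they do with the article
CONTRACTIONS = {
    ("de", "sg"): "du",
    ("à", "sg"): "au",
    ("de", "pl"): "des",
    ("à", "pl"): "aux",
}


def lequel(gender, number, preposition=None):
    """Return the form of 'lequel', optionally combined with a preposition."""
    form = BASE[(gender, number)]
    if preposition is None:
        return form
    fused = CONTRACTIONS.get((preposition, number))
    # The feminine singular "laquelle" never fuses: "de laquelle", "à laquelle"
    if fused and (gender, number) != ("f", "sg"):
        article_length = 2 if number == "sg" else 3  # strip "le" or "les"
        return fused + form[article_length:]
    return preposition + " " + form


print(lequel("m", "sg", "de"))    # duquel
print(lequel("f", "pl", "à"))     # auxquelles
print(lequel("f", "sg", "pour"))  # pour laquelle
```

Knowing all these shapes matters for the next step, because each of them has to be looked for separately.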

OK, I have become interested in "lequel", and based on the grammars I have now concocted a list of the constructions it is supposed to occur in. The next phase in the evil old days would have been to be on the lookout for examples for several weeks, and during that time I would have jumped for joy every time I found one. With the advent of the search engines and the internet this has become much simpler: I still have to know the different forms of lequel, but I can now make a quick search for each of them. Let's try not "lequel" itself, but the feminine form "laquelle". Leaving aside company names and references to dictionaries and wiktionaries and other rubbish, I find these quotes:

Histoire des insectes; dans laquelle ces animaux sont rangés suivant un ordre méthodique…
relative pronoun (NB: book title from 1799, so maybe a bit oldfashioned)

Percussion! mais laquelle? - interrogative pronoun, - used without "de", but in a very short question

Laquelle des trois M te ressemble le plus? - interrogative pronoun used with "de" in a nominal phrase in a more extended phrase

Raison pour laquelle je blogue -- substantival relative pronoun ("[this is the] reason for which I blog")

This should be enough, - the following examples are just repeats of these patterns. But looking through such a series of examples with the simplified patterns in your head is one sure way of making you understand the mechanics of at least this corner of French grammar. In my opinion it is the combination of a simple catalogue of patterns and a lot of relevant examples that is the best way of learning syntax.
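If you have collected some text of your own instead of relying on a search engine, the same hunt can be done with a regular expression. The sketch below is an assumption about how one might do it in Python, not a description of what I did above; the quotes are the ones just listed, and the names FORMS and find_forms are invented.

```python
import re

FORMS = ["lequel", "laquelle", "lesquels", "lesquelles",
         "duquel", "desquels", "desquelles",
         "auquel", "auxquels", "auxquelles"]

# \b keeps us from matching inside longer words; case is ignored
PATTERN = re.compile(r"\b(" + "|".join(FORMS) + r")\b", re.IGNORECASE)


def find_forms(text):
    """Return all forms of 'lequel' found in text, in lower case."""
    return [match.group(1).lower() for match in PATTERN.finditer(text)]


quotes = [
    "Histoire des insectes; dans laquelle ces animaux sont rangés suivant un ordre méthodique",
    "Percussion! mais laquelle?",
    "Laquelle des trois M te ressemble le plus?",
    "Raison pour laquelle je blogue",
]

for quote in quotes:
    print(find_forms(quote), "<---", quote)
```

The search only finds the candidates; deciding whether a hit is interrogative or relative, nominal or adjectival, is still something you have to do with the simplified patterns in your head.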

The example with 'lequel' was easy because you could search for a single word in a limited number of shapes. Other syntactical phenomena are harder to find through a simple search, like the rules that describe word order. But the proposed method is still to consider how the rules should be formulated and then to have a look in the real world to check the usefulness of those rules. And if you simply can't find any examples of a phenomenon from your grammar in the real world then it isn't strictly necessary to learn it here and now.

For something slightly more complex, let's have a look at some constructions which are based on transformations - including the thing called a sentence knot.

Think of a sentence as something organized around a verb. Think about subordinate phrases as phrases which function as a member (call it 'constituent' or whatever - I don't care) in another sentence - and normally there is a conjunction of some sort which often also is a pronoun with a reference somewhere. In other words a subordinate phrase is like a little box in a big box. And in the little box there can be an even smaller box. The conjunction (whether pronominal or 'empty') is typically at the start of the phrase to which it belongs.

Now imagine that you have a three-tier system, but the connecting word is actually a relative pronoun which - when judged on its function - clearly belongs to the lowest level. So you have V1 rel3 V2 V3:

something could happen --->
I told you (that) something could happen --->
This is exactly the thing which I told you could happen

Number three is a sentence knot.

Notice that the first phrase (a variant of V3) starts out having an indefinite pronoun as its subject, and in the final construction the subject is a relative pronoun which points to "thing" at the topmost layer. So in spite of claims in bad grammars the linking pronoun does not 'belong' to the nearest verb V2, but to V3.

And that's what makes the sentence knot a knot - not the word order alone, as stated in some grammars (insofar as they even mention the phenomenon).
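The boxes-in-boxes picture can also be written down as a small data structure. The following Python sketch rests on my own simplifying assumptions: a clause is just a verb plus its embedded clauses, and a flag marks the clause in which the linking pro-word actually has its role (subject, object or whatever). The class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Clause:
    verb: str
    children: List["Clause"] = field(default_factory=list)
    # True if the linking pro-word fills a role (subject, object ...) in this clause
    hosts_link_role: bool = False


def link_depth(clause: Clause, depth: int = 0) -> Optional[int]:
    """Return how many boxes down the clause lies in which the pro-word has its role."""
    if clause.hosts_link_role:
        return depth
    for child in clause.children:
        found = link_depth(child, depth + 1)
        if found is not None:
            return found
    return None


def is_sentence_knot(main: Clause) -> bool:
    """A knot: the pro-word belongs at least two levels below its antecedent."""
    depth = link_depth(main)
    return depth is not None and depth >= 2


# "This is exactly the thing which I told you could happen"
knot = Clause("is", [Clause("told", [Clause("could happen", hosts_link_role=True)])])

# "I have said a sentence, which may sound weird"
plain = Clause("have said", [Clause("may sound", hosts_link_role=True)])

print(is_sentence_knot(knot), is_sentence_knot(plain))
```

In an ordinary relative clause the pro-word belongs to the box right below its antecedent; in the knot it skips the "told" box and belongs to "could happen".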


Sentence knots can be found in many Romance and Germanic languages, and in Danish they are especially common - maybe because we under certain conditions can drop the conjunctions, which makes the whole thing even more Gordian-like:

Jeg ville sige noget om et eller andet --->
Jeg sagde, at jeg ville sige noget om et eller andet --->
Det var just det (som) jeg sagde (at) jeg ville sige noget om
It was exactly that (which) I said (that) I would say something about

This thing about sentence knots may seem like a rather unimportant curiosity - especially if your target language doesn't permit them - but it is just one among several things that tend to be overlooked in normal grammars. In sentence knots there is a reference from a deeply buried sentence to an element in a sentence several layers higher up in the hierarchy, skipping an intermediate layer.

Another, more frequent construction is the 'cleft sentence' where some element is pulled up to a higher level where it becomes the subject predicative:

I want to say this ---> what I want to say is this---> this is what I want to say
that this is the wrong colour is evident ---> it's evident that this is the wrong colour

You may remember group 3 of subordinate clauses from chapter 3.4: "The third group contains a mixed construction without antecedents, but also not something which somebody would have said as a sentence or question. Some have called them 'independent relative clauses', but this has no basis in reality - except maybe that they often can be replaced by true relative clauses with a dummy antecedent. Their conjunction is mostly filled in by an interrogative pronoun:". The sentence "this is what I want to say" contains such a clause, and the clause in "it's evident that this is the wrong colour" is an 'argument clause' (group one). Both "this" and "it" are pronouns, demonstrative resp. personal. You might ask: what is the role of the demonstrative or personal pronoun in a cleft sentence? It is obvious that it only pops up once the sentence has been transformed, and that it is coreferential with the subordinate clause. In Danish we say "foreløbigt grundled", i.e. 'provisional subject', and that is as good a name as any.

Cleft constructions obviously also exist in other languages, and one of the most conspicuous phrases in French is actually a cleft sentence turned interrogative:

"qu'est-ce que c'est que ça?" (what is-it that it-is (ahem) that? = what is it?) .. from an affirmative "C'est ça que c'est", which again is a transformation of the simpler, but less used sentence "[ce que c'est] est ça", in which "que c'est" is an ordinary relative clause with "ce" as its antecedent.

This "qu'est-ce que c'est (que ça)?" is so common - both in its full and its 'light' version - that even newbie learners almost certainly would learn it by heart without even thinking about its complicated inner structure. But it's a productive construction with new examples being produced all the time, and .. well, I can't know what you think, but for me it is easier to deal with such constructions in a new language if I can reduce them to sticks and boxes. And I actually do think cleft sentences are funny.

Postby Iversen » Fri Jan 29, 2016 9:38 pm

3.6. Word order

A sentence has a main structure that comprises a number of syntagmata and clauses, and each of these may have an inner structure and so on. If the world was a simple place no component would be discontinuous, but you can't count on that.

Let's start from the easy end: prepositional phrases, which consist of a preposition plus a substantival syntagm (which can have its own inner structure). One name used for the substantival syntagm in a prepositional phrase is regimen, but in English this name is also used about other parts of a sentence, and the same applies to the currently used alternative word 'object'. So the terminology in the English grammatical tradition is rotten, and I stick with 'regimen' as the least bad choice.

Actually you can also have 'prepositions' that stand after the substantival syntagm, and then they are called postpositions, and both kinds together are called adpositions. In some languages some words can be both pre- and postpositions (like "nach" and "wegen" in German), but according to Wikipedia you have to go to something as exotic as the native Californian Timbisha language to find an 'inposition', where the adposition settles within the substantival syntagm. But you don't have to go further than English to find a language where prepositional phrases are split - to the teeth-gnashing chagrin of a few die-hard purists. But frankly, a construction used by almost all native speakers should be defined as correct. Instead of trying to kill a correct construction, it is more relevant to study the reasons why the phrase was split in the first place, and the typical reason is that the role of regimen is taken by an interrogative or relative pronoun, which has to take the initial position in a question or subordinate clause: "what does he think of? Well, obviously the thing (which) he is thinking of".

"Arma virumque cano, Troiae qui primus ab oris
Italiam, fato profugus, Laviniaque venit litora, "

I sing of arms and the man, he who, exiled by fate,
first came from the coast of Troy to Italy, and to Lavinian shores

This is the beginning of the Aeneid (translation borrowed from poetryintranslation.com). As you probably know Latin is notorious for its discontinuous syntagms, though the problem as expected is worst in poetry. When you know that this is possible then it isn't as hard to piece the elements together as you might think, but in my opinion the reason isn't the one usually given, namely the agreement in case, number and gender, but more the semantics - you expect some qualification of a noun, and then you find something suitable. But that is of course difficult to formulate in operational terms, and therefore Latin teachers have always emphasized the morphological criteria.

Notice also the two instances of "-que" - something as unusual as a postclitic conjunction: "arms man-And (I-)sing, (from)Troia who ..."

A substantival syntagm in English will typically have the substantive last and a number of other elements in front of it, and the order of these elements can conveniently be described by a field model - but if you look at other languages their field structures will often be different, and there can be several competing models. In French most adjectives are placed after the substantive, but a number of very common adjectives are prepositioned or used before or after with a difference in meaning: "un pauvre homme" (a poor wretched fellow) vs. "un homme pauvre" (an impecunious person). But you won't ever see *pauvre un homme or *homme un pauvre, which shows that there is a field model behind the word order, and as a language learner you need to find out about this model, which may exist in several versions. The constraints are most visible in cases where your target language differs from your native one, but even with related languages you have to be on guard.

In chapter 3.2 I mentioned the postclitic articles of the Nordic languages, but there are some subtle differences in their use. The Swedish author Strindberg wrote a book called "Röda rummet" - but in Danish that's not possible: here it must be "Det røde rum" (with a prepositioned definite article because of the adjective). In Swedish you can also have a double article, as in "Det röda rummet", but that's not allowed in Danish.

We can introduce other kinds of words in a substantival syntagm: numerals, demonstratives etc. - and you can also add relative clauses and prepositional clauses. I'm not going to write a lot about those possibilities, but can't resist quoting a couple of German sentences (from Herdamitfestival.com - but they are by no means unusual):

Die auf dieser Website veröffentlichten Inhalte unterliegen dem deutschen Urheber- und Leistungsschutzrecht. Jede vom deutschen Urheber- und Leistungsschutzrecht nicht zugelassene Verwertung bedarf der vorherigen schriftlichen Zustimmung des Anbieters oder jeweiligen Rechteinhabers.

The on this homepage published contents aresubjectto the German author- and outputprotectionlaw. Every fromThe German author- and outputprotectionlaw not permitted use needs the previous written accept ofThe provider or current rightsOwner.

And the funny thing is that once you have learnt German well you will find this system both natural and efficient. The trick is to look for the internal system of sticks and boxes, and contrary to Latin each box is usually kept together - *'Die auf veröffentlichten dieser Website Inhalte' is not allowed.

In some languages there are elements which serve roles not found in English, and I would like to mention a case in Greek. Let's first see the effect of the introduction of a possessive pronoun:

"The small house" can be translated into "το μικρό σπίτι" (/to mikró spiti/)
"My small house" can be translated into "το μικρό σπίτι μου" (/to mikró spiti mu/, the small house my - 8 hits on Google), but it is more common to introduce a so-called emphatic possessive: "το δικό μου μικρό σπίτι" (/to dhikó mu mikró spiti/, 236 hits). OK, small numbers, but the tendency is clear.

Let's leave it at that. The message is: look for a field model with slots for different kinds of words. And even exceptions or alternatives should be defined in terms of such a model. But it also has to be stressed that to actually use the relevant patterns in speech you don't have time to consult a grammar: those patterns have to be stored and automatized in your mind.

Let's leave the substantival syntagms and look at the verbal ones. In a dependency or valency grammar the whole sentence is seen as an extended verbal syntagm, but there is also a narrower environment around the verb. Here you find among other things the unstressed personal pronouns, whose precise combination possibilities can become a nightmare. When I was taught French long ago we were shown something called "the circles of Sandfeld" which practically solved the problem:


You can combine the pronouns within one circle, but not from the parts of the circles that don't overlap. It may not always be possible to draw circles, but the combination rules of a language can normally be rationalized to make them easier to remember. And even a rationalization that doesn't cover each and every weird combination can be valuable.

Finally: what about word order at the sentence level? It is a fairly well known factoid that the verbs in German have a tendency to be postponed for as long as possible, making them pile up at the very end of the speech and causing endless problems for simultaneous interpreters who have to tread water while they wait for the pot of gold at the end of the rainbow. Actually this special word order is reserved for subordinate clauses, which of course can be bad enough. The cause is the rule that puts a conjunction in the beginning of the clause, which leads to some internal rearranging:

She arrives today --> sie kommt heute
*I hope that she today arrives --> ich hoffe, daß sie heute kommt
*She is arrived yesterday --> sie ist gestern gekommen
*I know that she yesterday arrived is --> ich weiß, daß sie gestern gekommen ist

You can't discuss word order at the sentence level without mentioning the typology based on basic word order, which results in types like SVO (most Western European languages), SOV (Latin) and VSO (Irish) (quoted from Wikipedia):

There are six theoretically possible basic word orders for the transitive sentence: subject–verb–object (SVO), subject–object–verb (SOV), verb–subject–object (VSO), verb–object–subject (VOS), object–subject–verb (OSV) and object–verb–subject (OVS). The overwhelming majority of the world's languages are either SVO or SOV, with a much smaller but still significant portion using VSO word order. The remaining three arrangements are exceptionally rare, with VOS being slightly more common than OSV, and OVS being significantly more rare than the two preceding orders

One problem is that not all languages are based on those categories. For instance an ergative language will only have a subject with an intransitive verb. With transitive verbs it has an object as expected, but it corresponds to the subject of an intransitive verb, and instead of a subject it then has something called an agent, which is in another case than the subject of an intransitive verb. But according to Wikipedia "The label [SVO] is often used for ergative languages which do not have subjects, but have an agent–verb–object order." - problem solved.

The idea of a basic word order is shaky - we have just seen that German subordinates aren't SVO, but more like SOV, and Latin word order is close to being anarchic - in poetry it IS anarchic. The main excuse for using the typology is the excellent work of linguists like Greenberg, who have shown that there is solid statistical evidence for its relevance. These connexions can be formulated as universals, and it is worth noting that the universals - contrary to those of Chomsky - aren't supposed to be inborn parts of the language mechanism, but instead something that has been formulated on the basis of empirical fieldwork. I'll quote a few rules from an article written by W.Bisang ("Typed of Universals, Greenberg's Universals"):

Universal 1: In declarative sentences with nominal subject and object, the dominant order is almost always one in which the subject precedes the object.

On alternative word order types within one and the same language:

Universal 6: All languages with dominant VSO order have SVO as an alternative or as the only alternative basic order.

Postby Iversen » Fri Jan 29, 2016 9:41 pm

3.7. The rhythm of grammar studies

When you start learning a new language you basically need a little of everything: morphology, syntax and idiomatics. But you have to start somewhere. Contrary to most textbooks I prefer getting a total overview from the start, but without trying to learn the specific forms - and certainly not the exceptions. A language guide like those from Berlitz or Lonely Planet is quite sufficient for this, but if you already have some experience with real grammar books then reading cursorily through one of these is even better. But sooner or later you have to get some flesh on that skeleton, and for most people just reading through the whole grammar and then trying to apply it isn't the way to go.

Some things like morphology can to a large extent be put into tables, and I have earlier described how you can make 'green sheets' to learn this part of the grammar. However even morphology should be learnt with some real texts within reach (or a person who is willing to speak really slowly and repeat endlessly, such as the average parent). Let's say you see a verbal ending and you wonder what it is. OK, check your green sheet (or if that isn't enough, consult your true grammar). When you have found it then look at some related forms while you are at it. If it was a third person singular preterite indicative form, then run through the other forms in the same tense, and maybe you should also briefly remind yourself of its uses - if you are in doubt then consult your grammar.

Now back to the example: did your identification function in the context? Did your general impression of the uses of this form suit the present case? If not, then back to the sheets (or grammar books) again. Maybe it is a totally different form, maybe it is a known form in a new context, maybe you can't make up your mind and have to leave the case for later, but you have definitely learnt something now. And this is of course also the right moment to look for more examples of the same kind. At least with morphology it is likely that similar forms will pop up soon so you just have to read or listen on with your attention focused on this particular form (and being alert in general is a good thing). With syntax you may have to be patient, or you can try to figure out a way to find more examples (for instance through Google or by consulting one more grammar).

The same technique can be used on syntactical patterns. Actually syntax could be described as the part of grammar that isn't as prone to table building as the morphology, but still so regular that you can infer rules. Try to think about those rules in the same terms as I described for morphological green sheets: which are the central rules? Which are the exceptions? What do you most urgently need to learn?

Learning the more fluid part of syntax will be fairly close to learning the idiomatics of your target language - i.e. you have to do it from real texts. And that can be done, provided that you stay alert to form and don't focus entirely on the meaning. One trick to do this is to choose a certain grammatical phenomenon and then keep that on your mind while reading or listening. It really doesn't matter which phenomenon you choose (it could for instance be prepositions after verbs in English), because being alert to one phenomenon will also make you notice other things in the form of the message. The important thing is to avoid being focused entirely on the meaning.

But those things in grammar that can be formalized and put into a system will be more efficiently learned if you do consult a grammar when you are intrigued by something. And you can only consult a grammar efficiently if you know where to find what in that book. That's one more reason for looking through a true grammar book before even trying to learn the language - and yes, I know that there are people who can't stand reading about grammar, but I don't like celery, and yet there are people who eat the stuff and pretend that it is good for you. It is exactly the same thing with grammar books, except that here it is me who extols the virtues of something.

One of the things you have to learn as a grammar book user is that their authors sometimes complicate things to avoid criticism from their colleagues. This means that they have to include rare and complicated topics which aren't really necessary for you - at least not in the beginning. So if something demands half a page of explanations and then you still haven't got a clear idea about what the heck the man or lady is speaking about, then postpone it until you have a better developed 'Sprachgefühl'. There are so many things to learn about grammar that you might just as well start out by picking the lowest-hanging fruit first.

Another thing is that exceptions to exceptions are an abomination escaped from the old black school of hell. When you see such a thing in a grammar then try to reformulate the whole passage in your mind (or on paper) so that it consists of a claim with only positive exceptions. And try to pinpoint the examples for both the main claim and for each positive exception. If you have a passage that says that something is A except when it fulfils condition X where it becomes B, unless it belongs to the subgroup Y where it will be C (or maybe D), then ... OK, I had nearly said whack the author, but then there might not be any grammar authors left. Instead do the necessary clean-up yourself: the main rule is A, but in case of X it is B, and with X+Y it is C. And you have absolutely no clue as to what it would have been with just Y and not X - blame that on the author!
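
For those who like to see such a clean-up written out, here is a minimal sketch in Python. The names RULES and outcome are purely hypothetical, and A, B, C, X and Y are just the placeholders from the paragraph above. Each line of the table is a positive condition, and the case the grammar never mentions (Y without X) is left open instead of being guessed.

```python
from typing import Optional

# Each entry is a positive condition plus its outcome; the first match wins.
# X and Y stand for the two conditions in the text, A/B/C for the outcomes.
RULES = [
    (lambda x, y: x and y, "C"),          # X together with Y
    (lambda x, y: x and not y, "B"),      # X alone
    (lambda x, y: not x and not y, "A"),  # the main rule
]

def outcome(x: bool, y: bool) -> Optional[str]:
    """Return the outcome for a form, or None where the grammar is silent."""
    for condition, result in RULES:
        if condition(x, y):
            return result
    return None  # Y without X: the author never told us


if __name__ == "__main__":
    for x in (False, True):
        for y in (False, True):
            print(f"X={x}, Y={y}: {outcome(x, y)}")
```

The point of the table form is that every line can be checked against an example on its own, which is roughly what a green sheet does on paper.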

Postby Iversen » Sun Jan 31, 2016 1:16 pm

4. Fourth part - How to use translations

4.1. How to use translations as nutcrackers

Reading something in Basque or Finnish is well-nigh impossible if you haven't learnt those languages, - but you may have a guess concerning a few scattered loanwords. Making sense of this meager information is hard work, at least for me, and I can't trust my guesses. On the other hand reading Chinese is not merely hard, but impossible if you don't know those pretty Chinese signs. Which actually makes it less hard because you just have to give up.

If it's a matter of missing a few words here and there in order to get the meaning then it isn't too bad, and then I would use the term "comprehensible input" (cfr. Krashen). Maybe the missing words aren't crucial, maybe I can guess their meaning, and if the number is limited I can look them up in a dictionary (a digital pop-up dictionary would obviously be faster). But with too many unknown words it might turn out to be difficult to piece them together. The idea of using bilingual texts to overcome this problem is not new, but with the advent of the internet it has become much easier to find - or produce - bilingual texts to use in intensive reading. Finding transcripts of texts is even harder, but still easier than in the evil old pre-digital days (I sometimes wonder how we survived them). Longer texts are sometimes available in several languages, but generally as separate books and often with serious discrepancies between the versions.

At this point it has to be mentioned that there are language teachers and opinion makers who are squarely against the use of your own language in language learning. Sometimes this attitude results in excellent teaching systems - like the textbook "Lingva latina per se illvstrata" by my honoured compatriot H.H. Ørberg, where even the grammatical explanations are in Latin. But in spite of all my enthusiasm for this work I have to point out that I learnt Latin using Kristian Mikkelsen's "Latinsk Læsebog" from 1878, which is a pure example of the old-fashioned setup with texts, wordlists for each text and grammar as used by a teacher versed in the grammar-translation method. So both kinds are possible, but my ideal textbook would combine elements from both.

The opinions concerning the use of translations in language learning may reflect differences in the way languages are stored in the brain.

In 1953 Uriel Weinreich wrote a book "Languages in Contact" in which he identified three general categories of bilingual speakers: compound, coordinate and subordinate bilinguals. Later Ervin and Osgood took up the distinction, and nowadays the last type is seen as a subgroup within the coordinate group. In the summary of the article "Three Types of Bilingualism" by M.R. D'Acierno the three types are defined as follows:

A compound bilingual is an individual who learns two languages in the same environment so that he/she acquires one notion with two verbal expressions. A coordinate bilingual acquires the two languages in different contexts (e.g., home and school), so the words of the two languages belong to separate and independent systems. In a sub-coordinate bilingual, one language dominates.

The distinction between compound and coordinate bilingualism has come under scrutiny. When studies are done of multilinguals, most are found to show behavior intermediate between compound and coordinate bilingualism. Some authors have suggested that the distinction should only be made at the level of grammar rather than vocabulary, others use "coordinate bilingual" as a synonym for one who has learned two languages from birth, and others have proposed dropping the distinction altogether. My own hunch is that the distinction is valid, but it isn't global. Some individuals have almost no references between words in different languages, not even when these words actually refer to the same physical entities in the world. Others have a strong interplay between their languages, and word meanings in one language are used not only to learn, but also to maintain the definitions in another.

In the long article "Different Typologies of Bilingualism" by L.G. Tarabocchia you can read this on page 127:

While monolinguals revealed the expected left-hemisphere dominance for language, bilinguals showed less lateralized language representation. Whereas early bilinguals, however, showed no statistically significant differences between L1 and L2 in either hemisphere, late bilinguals revealed left-hemisphere dominance only for L1 and greater right-hemisphere involvement for L2. Cerebral language representation, therefore, differs between early and late bilinguals (Sussman et al., 1982).

and this on page 129:

The organization of one or more languages in the brain underlies the definition of compound and coordinate bilinguals. The original notion - that of a single language system comprising both languages in the compound bilingual versus dual noninterfering language systems in the coordinate bilingual - has been modified following experimental studies (Kolers, 1968; Diller, 1974), which showed that age of acquisition, manner of acquisition and manner of practice affect the way languages are represented neurologically. It is therefore suggested that individuals lie along a continuum between the two poles - the compound and the coordinate.

If this is correct then it has some consequences for the way language learners (and their teachers!) see the use of translations and relations between languages in general. For me these things are quite innocuous while you are trying to conquer a new language, and you can just stop using translations when you don't need them any more. But you shouldn't use them indiscriminately. I'm as negative as everybody else about the idea of formulating every single thought in your native language and translating it before speaking - almost as if you were reading a speech from a piece of paper. It won't function, you haven't got time for that extra step when you are in the middle of a heated discussion and it is a bad habit even when writing. However I have never had problems dropping the umbilical cord when the newborn doesn't need it any more, so I don't understand why some people are so very much against using translations at all. But the difference in brain organization suggested above might explain it.

A well-known/native language interspersed with elements from a target language is called an interlanguage, and it is not something that I would recommend in general. But in the phase where you want to get a foothold in a new language it could serve a purpose. When I formulated a plain English text and successively changed sections of it into Scots as early as March 2009 I learnt quite a lot about Scots, not least because I got some quite concrete needs - certain formulations which I wanted to incorporate. This made me much more attentive to the genuine materials in which I made my searches, and I was much more likely to remember the things I found in my dictionaries, one (small) on paper and the excellent "Online Scots Dictionary". But this is not active use of a language - it is construction of utterances based on a limited passive competence plus liberal use of external resources.

At least in theory it is possible to push an interlanguage in the direction of the intended target, but the risk is that you end up with a version of the target language which is contaminated by elements from your native language - and that you don't even know where you are in conflict with a truly native version of the language. The cure against this is in my view even more hardcore study of the vocabulary, grammar, phonology and cultural text of the target language, coupled with forays into genuine texts/speech, and then I think the end result could be satisfactory. But not overnight.

The alternative would be to try to eliminate all references to your native language while you learn your target language from scratch ... or in other words: try to learn as a child. As some teachers and theoreticians advocate. But grown-ups are not children, and in spite of looking for it I haven't seen any convincing evidence that a monolingual approach is more efficient than using the mix I suggested above: bilingual intensive studies and increasingly monolingual extensive activities.

There is another, more concrete perspective to this: if you have two related languages (or dialects) and you don't know how to formulate something, what is then your best bet? Well, it would be to assume that it would be 'the same thing' in the other language. So if I don't know a word in Afrikaans then I would assume that it is the same word as in Dutch, with some minor changes due to another sound system and simpler morphology. The problem is of course that you may forget that this was just a guess, and that you MUST try to check whether your guess really is correct in the target language. Often it isn't (so it was just a 'false friend'), but more often it is correct, especially when we talk about somewhat technical words. The risk of making grave errors is larger when it comes to expressions than it is with single words, but sometimes the odds are on your side.

To compensate for this bit of heresy I would mention that I generally prefer to work on genuine materials - both intensively and extensively - and with time you will develop a sense for what can be said or written in the target language and what can't. And with time you can also scrap the references to your native language while you formulate yourself in a target language. But Rome wasn't built in one day. The use of bilingual texts is undeniably a great time saver when it comes to understanding difficult texts in a target language, and if you didn't have this opportunity your only alternative would be to decode those texts sentence by sentence using a dictionary and a grammar book - or to give up. Having a translation may be just the thing that makes an incomprehensible text comprehensible.

Postby Iversen » Sun Jan 31, 2016 1:33 pm

4.2. How to make parallel texts

In the preceding chapter I have argued that the use of parallel texts is a very practical way to deal with texts which objectively seen are above your level in your target language. The question is now how to get them. If you have a book in the original and a translation, both on paper, there isn't much you can do about it. I have a notestand near my favorite armchair where I can put one of them, but it is not a convenient solution. If you use Google translate to see a homepage in two different languages - or you find a homepage with the same texts in several languages - then you operate with several windows on your screen, and there are ways to make them both visible at the same time. This chapter deals with three methods which can be used for making short bilingual texts on paper - which is the format I use for almost all my intensive text studies.

The first format I used was the interlinear format, inspired by language guides and grammatical treatises where a text might be accompanied by a translation in the line below it. Like

(English) The first format I used ...
(Danish) Det første format jeg brugte ...

I remember that I found a collection of tales by Hans Christian Andersen in a number of languages, and I wanted to combine them into one text with alternating languages in alternating lines. So I read two versions, each in its own window in MS Word, and placed one above the other. Then I went through the texts and cut the lines at the same points in the two versions (which only was possible if I already could identify corresponding passages).

I found a collection of tales by Hans Christian Andersen
J'ai trouvé une collection de contes de Hans Christian Andersen

in a number of languages, and I wanted to combine them ...
en plusieurs langues, et je voudrais les combiner ...

After that I transferred the two texts to Excel (any spreadsheet would do) and added line numbers and a letter as shown:


And finally I combined the two sets of columns and sorted them as shown below. Problem solved.
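
The spreadsheet steps (number the lines, add a letter per language, combine and sort) can also be sketched in a few lines of Python. This is only an illustration under the assumption that both versions have already been cut at the same points; the function name interleave is hypothetical, and the sample lines are the ones quoted above.

```python
def interleave(original_lines, translation_lines):
    """Merge two aligned line lists into one interlinear list.

    Each line gets a line number and a letter ('a' for the original,
    'b' for the translation), and the combined rows are then sorted,
    just as in the spreadsheet version.
    """
    if len(original_lines) != len(translation_lines):
        raise ValueError("both versions must be cut into the same number of lines")
    rows = []
    for number, text in enumerate(original_lines, start=1):
        rows.append((number, "a", text))
    for number, text in enumerate(translation_lines, start=1):
        rows.append((number, "b", text))
    rows.sort(key=lambda row: (row[0], row[1]))  # sort by number, then letter
    return [text for _, _, text in rows]


if __name__ == "__main__":
    english = [
        "I found a collection of tales by Hans Christian Andersen",
        "in a number of languages, and I wanted to combine them ...",
    ]
    french = [
        "J'ai trouvé une collection de contes de Hans Christian Andersen",
        "en plusieurs langues, et je voudrais les combiner ...",
    ]
    print("\n".join(interleave(english, french)))
```

The sorting itself is trivial. The time-consuming part remains the manual cutting of the two texts at matching points.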


Was it worth the effort? No.

Later on I experimented with Google translate and found a trick that could produce an 'interspersed' format, similar to the one used by Ilya Frank:

Dear Friends, Geagte Vriende, You are now visiting Ilya Frank's site, where you can find books and texts in different languages with their literal translations into English and brief linguistic comments. Jy is nou 'n besoek Ilya Frank se webwerf, waar jy boeke en tekste in verskillende tale met hul letterlike vertalings in Engels en kort taalkundige kommentaar kan vind.

I let Google Translate do its magick on a homepage, marked a translated text and inserted it 'as text' in Word. By some weird quirk in that program all the comments now became regular parts of the text. But if both languages used the same alphabet the result was confusing so I then had to add colours by hand, sentence by sentence - and that took a long time for texts with more than just a few lines. Besides there was a problem with dots after abbreviations and numbers - Google thought all dots separated sentences, and the result could be rather muddled.
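
To show why the dots cause trouble, here is a rough sketch of a sentence splitter that refuses to cut after a known abbreviation or after a number followed by a dot, plus a helper that produces the interspersed format. Nothing here reflects how Google actually works; the abbreviation list and the function names are assumptions made for the example, and a sentence that really ends in a number or an abbreviation will be glued to the next one.

```python
import re

# Hypothetical, deliberately tiny abbreviation list; extend it per language.
ABBREVIATIONS = {"e.g.", "i.e.", "etc.", "cf.", "Dr.", "Mr.", "Mrs.", "No."}

def split_sentences(text):
    """Split on ., ! and ? but not after abbreviations or 'number + dot'."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        ends_with_stop = token.endswith((".", "!", "?"))
        is_abbreviation = token in ABBREVIATIONS
        is_numbered = re.fullmatch(r"\d+\.", token) is not None
        if ends_with_stop and not is_abbreviation and not is_numbered:
            sentences.append(" ".join(current))
            current = []
    if current:  # keep any trailing text without a final stop
        sentences.append(" ".join(current))
    return sentences

def intersperse(source_sentences, target_sentences):
    """Alternate sentences: source 1, target 1, source 2, target 2, ..."""
    if len(source_sentences) != len(target_sentences):
        raise ValueError("the two versions have different numbers of sentences")
    pairs = zip(source_sentences, target_sentences)
    return " ".join(f"{source} {target}" for source, target in pairs)


if __name__ == "__main__":
    text = "Dr. Frank wrote this. It is the 3. edition, cf. page 12 there. It works!"
    print(split_sentences(text))
```

A splitter that cuts at every dot would break this sample after "Dr.", "3." and "cf.", which is the kind of muddle described above.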

Was it worth the effort? Definitely not. And when I tried it a moment ago the trick didn't function.

Nowadays I only use a side by side approach, where clips in each language are placed in tables in Word or Libre Writer or some other word processor. Original and translation rarely have the same length, but this can be regulated by changing the width of the columns, and the texts can be further finetuned by manipulating the font size or maybe even the choice of font. The translation can be a genuine human-made translation or a machine translation, with or without formatting. And it is very fast to make bilingual texts using this method (source: the Asturian Uiquipedia and my own translation):


In the example above I have deliberately used a translation which has elements that are expressed quite differently from what you saw in the original version (for instance "communitariamente" somehow became the first section of the compound "community project"), and even where the elements are similar their order can be changed. This is typical for human-made translations, where the goal of the translator is to make a text that is easy and pleasant to read for people who can't understand the original version, not to help scholars or language learners who want some help to understand the original. Some translations are more literal than others - for instance the translators of the Harry Potter books have clearly been instructed not to deviate more than necessary from the original, whereas some other literary translations leave the impression that the translator would have preferred to write his/her own book from scratch.
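
For those who would rather generate the side by side layout than build it by hand, here is a small sketch that writes a two-column HTML table, which most word processors and browsers can open and print. This is a substitute for the table in Word or Libre Writer described above, not a description of it; the function name side_by_side_html and the left_width parameter are made up for the example.

```python
from html import escape

def side_by_side_html(pairs, left_width=50):
    """Build a two-column HTML table from (original, translation) pairs.

    left_width is the width of the original's column in percent, so the
    columns can be adjusted when one version is longer than the other.
    """
    right_width = 100 - left_width
    rows = []
    for original, translation in pairs:
        rows.append(
            "<tr>"
            f'<td style="width:{left_width}%">{escape(original)}</td>'
            f'<td style="width:{right_width}%">{escape(translation)}</td>'
            "</tr>"
        )
    return "<table>\n" + "\n".join(rows) + "\n</table>"


if __name__ == "__main__":
    clips = [
        ("Det første format jeg brugte ...", "The first format I used ..."),
    ]
    print(side_by_side_html(clips, left_width=55))
```

Changing left_width plays the same role as dragging the column border in a word processor. Font size and font choice would still have to be adjusted in the program that opens the file.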

There are systems on the internet that can be used to make parallel translations of whole books, but it should be clear that copyright reasons often will prevent you from sharing a translation of any kind.

According to Wikipedia machine translations have been around since the 50s:

In the 1950s Machine translation, (MT), became a reality in research, although references to subject can be found as early as the 17th century. The Georgetown experiment, which successfully involved fully automatic translation of more than sixty Russian sentences into English in 1954 was one of the earliest recorded projects. Researchers of the Georgetown experiment asserted their beliefs that within three to five years, machine translation would be a solved problem.

Well, it didn't happen. I haven't studied the methods used by the Georgetown experimenters or other pioneers, but when I was studying languages in the 70s we tacitly expected that functioning machine translation systems - if they ever became reality - would have to be programmed by hand, and that you would need a separate dictionary and a suitable grammar - probably based on some variant of Chomsky's generative grammar, which seemed to be more suitable for machines than for humans.

According to the same source, Google Translate was launched in 2006, and it has now by a wide margin become the dominating system - even to the point that other homepages which claim to offer their own translations actually just tap into Google Translate. Google used a system called Systran for a year or so (and it may still be used by other machine translation systems), but in 2007 it switched to a statistically based system based entirely on comparisons between huge amounts of parallel texts. It is surprising that it has been possible at all, but some persistent errors can be traced back to this technique - like the tendency to substitute capitals, institution names and language names in a text in a base language with others which have the same role and are more frequent in the target language. The lack of a grammatical analysis also causes other persistent and typical errors, like for instance loss of negations and weird changes in the word order which result in the disruption of nominal paradigms.

How can I then advocate using such a system to make bilingual texts? Well, for three reasons.

The first reason is that I don't have an alternative - it is hard enough to find a decent translation of Shakespeare, but try finding a translation of an article from the magazine of the Serbian tuba players' organization.

The second reason is that human-made translations can be just as disloyal to the original - not because they make more overt translation errors, but because they have a tendency to use expressions that transmit the gist, but not the letter of the original. And sometimes they even add new elements or remove whole passages from the original without any warning.

The third reason is that I only use Google and its competitors to produce translations into a language I know well - never to produce texts in my target language which I then proceed to study.

On the internet you can sometimes find texts in a target language which on closer inspection turn out to be machine translations that have barely been edited before publication. For instance I once found a collection of articles about popular science in Irish - but it didn't take long to notice that the publishers hadn't even bothered to move the verbs to the initial position, where they belong in Irish. Such texts shouldn't be used for any kind of study - they prevent you from acquiring any kind of real feeling for the language in question.

The other direction is a totally different story. In practice a Google translation from a target language into something you know well can be more useful than a free human-made translation, both because its most glaring errors are clearly visible and because you only use it to clear up murky points in the original. The quality of the translation depends on the size of the language - the more texts Google has had to analyze, the higher the chance that it has settled on correct translations.

The worst case by far of those I have been able to try myself has been Latin, where the lack of a comprehensive modern bilingual text base AND the free word order of the source language have combined to produce translations that would be an abomination even if they had been made by a newbie learner. The translations from Dutch have also sometimes surprised me, which is harder to understand. After all, a language with around twenty million native speakers should have produced a sufficiently large amount of parallel text for analysis by the software used by Google. But to be fair, I have also seen many absolutely correct translations, and Google adds new languages faster than I can learn them.

Let's finally have a look at a concrete example, a quote from the bilingual in-flight magazine "Elevate" of Air Serbia, October 2014, with a Google translation:

Serbian: U "rajski vrt" u kome se biljke uzgajaju biodinamičkim putem, ulazi se samo u tačno određeno vreme - biljke i pčele se ne smeju uznemiravati!

English: Visitors can only enter the "Garden of Eden", where plants are grown through biodynamic agriculture, at scheduled times - plants and bees mustn't be disturbed!

Google T: In the "Garden of Eden" in which plants are grown bio-dynamic way, enters only at specific time - the plants and the bees must not disturb!

As you can see, the Google Translate translation lacks a subject for "enters" and interprets "se ne smeju uznemiravati" (disturb) as an active construction, where the English version from the magazine correctly assigns a passive meaning to this construction. But if you were a struggling language learner, which version would help you most?

Postby Iversen » Sun Jan 31, 2016 1:44 pm

4.3 Hyperliteral translations

Hyperliteral translations are translations from language A to language B that try to stick to the words and the constructions of language A, even when this means that the result in language B isn't grammatical and in some cases not even meaningful when judged by the standards of language B.

It is a known fact that languages aren't parallel. If you say that "cheval" in French is a translation of "horse" in English, then it just means that both refer to a certain four-legged mammal. However there may be cases where 'cheval' in French doesn't correspond to 'horse' in English. Take for instance the expression "être très à cheval sur quelqu'un" = "to be strict with someone", "be a stickler with someone". Often these derived meanings have some connection with the 'core meaning' (in this case the four-legged animal, - to be strict with someone is almost like sitting on them as on a horse), but some words don't even have a single dominating core meaning, and then there is no reason to expect that their meaning(s) can be covered by one single word in another language.

This situation is also found with grammatical constructions. For instance many languages have reflexive pronouns, i.e. pronouns that by definition refer back to an explicit or implied subject. In Danish "tage sin hat" means 'take one's own hat', i.e. the hat of the subject (reflexive "sin"), while "tage hendes hat" means 'take her hat', i.e. some other (female) person's hat. English has no unstressed reflexive possessive pronouns, so the context will dictate the meaning in a concrete sentence. (PS: in this case I have decided to write a generic reflexive 'self', just as in Old English)

Ordinary translations systematically try to cover up these problems by reformulating phrases or guessing at the intended meaning and choosing one out of several possible interpretations. Depending on the skill and ambitions of the translator this can mean that the general meaning is preserved, but all direct parallels between the original and the translation are lost. This can be a problem for a language learner who wants to understand the role of each element in the original version. A hyperliteral translation has no literary pretensions at all, but tries to 'imitate' the original version at all levels.

So "être très à cheval sur quelqu'un" would in a hyperliteral translation be something like "(to) be very on horse on someone", and you would add a corresponding idiomatic expression in language B if the meaning can't be guessed ('be a stickler'). However even this version isn't a perfect hyperliteral translation: the 'to' is normally necessary in English, while the French infinitive can stand alone, - therefore the word 'to' is put between parentheses. Even this simple example shows that there is some judgment involved in making a hyperliteral translation, just as in making an ordinary 'literary' translation. When explaining exotic constructions in remote languages you may even have to add morphological markers in some places. This has to be decided in the concrete case.
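If you want to line up such word-for-word glosses under the original yourself, a few lines of Python can do it. The following is only a minimal sketch and not something taken from Kauderwelsch, Assimil or any other guide: the function name `interlinear` is made up for the illustration, and it assumes that you supply exactly one gloss for each word in the original.

```python
def interlinear(source_words, gloss_words):
    """Return two lines of text where each gloss sits under its source word.

    Hypothetical helper for illustration; it assumes one gloss per source word.
    """
    if len(source_words) != len(gloss_words):
        raise ValueError("need exactly one gloss per source word")
    # Each column is as wide as the longer of the two items in it.
    widths = [max(len(s), len(g)) for s, g in zip(source_words, gloss_words)]
    top = "  ".join(s.ljust(w) for s, w in zip(source_words, widths)).rstrip()
    bottom = "  ".join(g.ljust(w) for g, w in zip(gloss_words, widths)).rstrip()
    return top + "\n" + bottom


# The French example from the text, with '(to)' kept in parentheses.
french = ["être", "très", "à", "cheval", "sur", "quelqu'un"]
gloss = ["(to) be", "very", "on", "horse", "on", "someone"]
print(interlinear(french, gloss))
```

The idiomatic meaning ('be a stickler') still has to be added by hand on a separate line, because it belongs to the phrase as a whole and not to any single word.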

Hyperliteral translations in combination with 'normal' translations have been used in some language guides, such as the German series Kauderwelsch and the French language guides from Assimil. An example from Kauderwelsch 90 Irisch-Gälisch p.85:


(in English: Would-be good with-you cup tea? Is! better with-me coffee)

The second line is the pronunciation, the third is the German hyperliteral translation, and the fourth line contains the expression a native German would have used. And that's exactly the problem with 'free' translations: they tell you what a speaker of your own language would have said, not what the foreign speaker actually said (or wrote). But part of learning a foreign language is to know what native speakers REALLY say, and what the logic in their constructions and expressions REALLY is.

Hyperliteral translations are also commonly used in scientific articles and books, especially those that deal with grammatical questions. Below you see a couple of examples from "A survey of relative pronouns and their uses in natural and artificial languages" by Alan Libert and Christo Moskovsky:


In some cases a simple rendering in words isn't sufficient, and then you can add symbols to indicate things like case, gender, role in the sentence or distinctions that don't even exist in the target language. Let's see another example from the same article, this time from Russian:


Lines 1 and 3 contain the transliterated version of the Russian text (мужчина, с которым я вчера познакомился), and the hyperliteral translation in lines 2 and 4 runs like this: "the man-NOM-3SG-MASC with who-DAT-3SG-MASC I yesterday got-acquainted". It is clear that too many details make the translation almost illegible, but the added information in the quote is actually all embedded in the endings of the Russian words, and it represents the information you have to be able to extract from the Russian original to understand it.
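One way to keep such a gloss legible is to store the markers together with the words, but hide them when you just want to read the bare word-for-word version. Here is a rough sketch of that idea in Python. The function name `split_gloss` is invented for the illustration, and the rule it uses (a marker is a hyphen-separated piece of at least two characters written in capitals, like NOM or 3SG) is my own assumption about the notation, not a rule taken from the article.

```python
def split_gloss(token):
    """Split 'man-NOM-3SG-MASC' into ('man', ['NOM', '3SG', 'MASC']).

    Assumption: grammatical markers are written in capitals and have at
    least two characters, so the English pronoun 'I' counts as a word.
    """
    word_parts, tags = [], []
    for part in token.split("-"):
        # Once the first marker is seen, everything after it is a marker too.
        if tags or (part.isupper() and len(part) > 1):
            tags.append(part)
        else:
            word_parts.append(part)
    return "-".join(word_parts), tags


gloss = "the man-NOM-3SG-MASC with who-DAT-3SG-MASC I yesterday got-acquainted"

# Bare word-for-word version without the grammatical markers.
plain = " ".join(split_gloss(token)[0] for token in gloss.split())
print(plain)

# The markers can still be shown for each word when you need them.
for token in gloss.split():
    word, tags = split_gloss(token)
    if tags:
        print(word, tags)
```

Hyphenated words such as "got-acquainted" stay in one piece, because their parts are not written in capitals.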

I would like to mention one more concrete case, which I had to deal with myself. In Irish the verbs are placed at the beginning of the sentences, and normally only conjunctions, a few verbal particles and interrogative pronouns can precede them (except in some constructions with a copula verb). But inversion is used for interrogative sentences in most of my languages, so in the beginning the Irish word order irritated me. So I got the idea to put an exclamation mark after the verb to 'neutralize' the interrogative effect - but only until I had become accustomed to the weird ways of the Irish:

"Oro, sé do bheatha a bhaile, is fearr liom tu ná céad bo bain(ne)" ---> (Wikipedia:) Oro, welcome home, I would rather have you than a hundred milch cows ---> (me:) ..is! better with me than hundred milk cows

(see also the discussion at Wikipedia)

It is clear that hyperliteral translations should only be used in the early stages of learning a new language. They are much better than ordinary translations at conveying the structure of the phrases in the original language, but as soon as you can understand the general meaning of spoken and written texts in language A the best strategy would normally be not to make or use translations at all, except when you look up unknown words or idiomatic phrases in dictionaries and other sources. From that moment on translations would primarily be done for the benefit of others, and then it is logical to try to make translations from language A that please your customers, even though they are misleading from a strictly linguistic point of view.