New auxlang: Atlas

General discussion about learning languages
tommus
Blue Belt
Posts: 957
Joined: Sat Jul 04, 2015 3:59 pm
Location: Kingston, ON, Canada
Languages: English (N), French (B2), Dutch (B2)
x 1937

Re: New auxlang: Atlas

Postby tommus » Wed Aug 16, 2017 5:13 pm

I have updated my grammar, vocabulary and software to the changes from 12 August and 14 August. I have a suggestion for the format of dates for both Atlas and for managing Atlas-related files, versions etc. I find that the date format YYYY-MM-DD (like today 2017-08-16) is very useful to keep things organized. For example, I use this format for many files: "2017-08-16 Vocabulary.txt". Same where appropriate for folders. Then everything is listed in a directory in chronological order and does great service for versioning. No ambiguity. Note the leading "0" where needed.
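A minimal sketch of why the YYYY-MM-DD prefix keeps files in order, assuming Python and a hypothetical helper called `dated_name` (not part of any Atlas tooling). The three dates are the ones mentioned in the post.

```python
from datetime import date

def dated_name(day: date, label: str) -> str:
    """Prefix a file label with an ISO 8601 date (YYYY-MM-DD, zero-padded)."""
    return f"{day.isoformat()} {label}"

names = [
    dated_name(date(2017, 8, 16), "Vocabulary.txt"),
    dated_name(date(2017, 8, 12), "Vocabulary.txt"),
    dated_name(date(2017, 8, 14), "Vocabulary.txt"),
]

# Plain string sorting is already chronological: the most significant
# field (year) comes first and every field has a fixed, zero-padded width.
for name in sorted(names):
    print(name)
```

Without the leading "0", a name such as "2017-8-16" would sort after "2017-10-01", which is why the padding matters.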

Some notes and comments on these updates:

1. "dodine" has been changed to "vodine". However, vodine, tedine and houdine are still listed in the vocabulary under adverbs, and are specifically identified as adverbs. Don't you agree these three words (yesterday, today and tomorrow) are abstract nouns, not adverbs? And note "to-day" should be "today".

2. In the vocabulary spreadsheet, the new or changed words are shown in red. However, with just the words in red, it is very difficult to see what the change was (a deletion, a spelling change, a totally new word). The only option, when updating other wordlists (vocabulary), is to bring up the previous version of the spreadsheet and compare the old and the new side by side, which is very awkward and very slow. For future changes, in addition to the red words in the spreadsheet, could we get a simple list of each change, such as: dodine -> vodine, xyz = new word, and what the root(s) is/are?
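A change list of that kind could be produced automatically from two versions of the vocabulary. The sketch below is only an illustration: it assumes each version is available as a Python dict mapping a word to its English meaning, and the function name `change_list` and the sample entries are hypothetical. A removed word and an added word that share a meaning are reported as a respelling.

```python
def change_list(old: dict, new: dict) -> list:
    """Describe the differences between two {word: meaning} vocabularies."""
    removed = sorted(set(old) - set(new))
    added = sorted(set(new) - set(old))
    lines = []
    for word in removed:
        # A removed word whose meaning reappears under a new spelling
        # is treated as a spelling change rather than a deletion.
        match = next((w for w in added if new[w] == old[word]), None)
        if match is not None:
            lines.append(f"{word} -> {match}")
            added.remove(match)
        else:
            lines.append(f"{word} = deleted")
    for word in added:
        lines.append(f"{word} = new word ({new[word]})")
    for word in sorted(set(old) & set(new)):
        if old[word] != new[word]:
            lines.append(f"{word}: meaning changed from {old[word]} to {new[word]}")
    return lines

old_vocab = {"dodine": "yesterday", "tedine": "today"}
new_vocab = {"vodine": "yesterday", "tedine": "today", "xyz": "placeholder"}
for line in change_list(old_vocab, new_vocab):
    print(line)
```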

3. "kuk" (cook) was changed to the derived word for cook (civazes). However, the derived word "kukarte" (gastronomy) was not changed.

4. "air" (air) was a change, but I still don't understand how the word dakaire can mean "blow". "dak" is hit, collision.

5. "zel" (order, instruct) and "zell" (cell). I still think these two are too similar and will cause errors.
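Pairs like this can be found mechanically by comparing every pair of roots and flagging those within a small edit distance. This is a rough sketch in Python. The helper names are hypothetical and the root list is just a handful of roots mentioned in this thread, not the real dictionary.

```python
from itertools import combinations

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute
        prev = curr
    return prev[-1]

def near_pairs(roots, max_distance=1):
    """Return all pairs of roots that differ by at most max_distance edits."""
    return [(a, b) for a, b in combinations(sorted(roots), 2)
            if edit_distance(a, b) <= max_distance]

sample_roots = ["zel", "zell", "dak", "air", "kuk"]
print(near_pairs(sample_roots))
```

Spelling distance is only a proxy for how confusable two words are in speech, so the output would be a list of candidates to review rather than a verdict.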

I have now combined all the nouns, verbs and all the other dictionaries into a single dictionary with all the variations of nouns, verbs, adjectives and some others (demonstratives, for example, can get an "-n"). In order to generate all possible words (or parse all possible words) it is important to know which other parts of speech are considered to be roots that can take a suffix. The answer probably is "as appropriate", but I think we need something more definitive for learners and for computers, or they will not be able to learn, parse or create such words. So could there be guidelines on which parts of speech are considered roots, and perhaps what suffixes they could take? For example, the vocabulary says connectors can be derived from roots by adding "-ev". Does that mean "any suitable root"?
Dutch: 01 September -> 31 December 2020
Watch 1000 Dutch TV Series Videos : 40 / 1000

Rodiniye
Orange Belt
Posts: 120
Joined: Mon Jun 12, 2017 4:27 pm
Languages: Spanish, English, Italian, German, Rodinian
x 90

Re: New auxlang: Atlas

Postby Rodiniye » Thu Aug 17, 2017 10:28 am

Thanks, Tommus. I will correct these mistakes asap and I will take your suggestions on board.

I ended up having to go away from home for a few days, so I have not been able to complete the post on countries. I will try today or tomorrow if I can. Extremely busy week at work.

tommus wrote:I have now combined all the nouns, verbs and all the other dictionaries into a single dictionary with all the variations of nouns, verbs, adjectives and some others (demonstratives, for example, can get an "-n"). In order to generate all possible words (or parse all possible words) it is important to know which other parts of speech are considered to be roots that can take a suffix. The answer probably is "as appropriate", but I think we need something more definitive for learners and for computers, or they will not be able to learn, parse or create such words. So could there be guidelines on which parts of speech are considered roots, and perhaps what suffixes they could take? For example, the vocabulary says connectors can be derived from roots by adding "-ev". Does that mean "any suitable root"?


Ok, roots are those listed in the dictionary, so the 525 (I think we have at the moment). From them, you can pretty much add any suffix, as long as it makes sense (except -ar, -ev):

direb-o: tree

but:

*zunz-o? a bird that is a plant? that does not make sense (not that I can think of).

So basically from roots you form initially nouns (a, o, u, e) - and from them, plural (n) for instance - adjectives (i), adverbs (em)... I consider them to be productive suffixes, because you can produce endless lists of words with them.

So you can add productive suffixes to any root. What is not a root? a preposition, a correlative, a demonstrative, a connector, a conjunction, a number... Generally speaking, these parts of speech are never inflected (don't take any affixes). There is, however, a prefix that can be added to prepositions in order to make them accept other suffixes: "me-". Add "me-" to a preposition and you will be able to attach suffixes. So you have from "li" (inside), "meliu" (interior), or from "der" (close), "medera" (neighbor), etc.

There is another suffix that can be added to these parts of speech mentioned above: the plural -n. It is only added to demonstratives (when not attached to a noun) in order to avoid ambiguity: "te" (this), "ten" (these), but "te-baitun" (these houses).

Suffix -ar is a marker for correlatives, so it is not really a productive suffix.

Same thing for -ev. It is not productive because the number of connectors is limited. -Ev can be added in fact even to numbers and prepositions (ekev - firstly, houev - after that...), so as I was saying it is not part of the productive suffixes.

So basically there is a distinction between what is a productive suffix added to roots (category, plural...) and what is a marker (correlative, connector...). Generally speaking, only roots take productive suffixes, but prepositions can take the prefix "me-" and demonstratives the suffix "-n".
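For software, the rules in this post could be encoded roughly as follows. This is a sketch under stated assumptions, not a definitive implementation: the function `derive` and the part-of-speech labels are hypothetical, the suffix inventory is limited to the ones named above (a, o, u, e, i, em, plural n), and nothing here checks whether a combination "makes sense".

```python
NOUN_ENDINGS = ("a", "o", "u", "e")       # noun categories
PRODUCTIVE = NOUN_ENDINGS + ("i", "em")   # plus adjective and adverb

def derive(word: str, pos: str, suffix: str) -> str:
    """Attach a suffix if the rules described in the post allow it."""
    if pos == "root" and suffix in PRODUCTIVE:
        return word + suffix
    if pos == "noun" and suffix == "n":
        return word + suffix              # plural of a noun
    if pos == "preposition" and suffix in PRODUCTIVE:
        return "me" + word + suffix       # "me-" unlocks the suffixes
    if pos == "demonstrative" and suffix == "n":
        return word + suffix              # "te" -> "ten"
    raise ValueError(f"{pos} '{word}' cannot take -{suffix}")

print(derive("direb", "root", "o"))       # direbo
print(derive("li", "preposition", "u"))   # meliu
print(derive("der", "preposition", "a"))  # medera
print(derive("te", "demonstrative", "n")) # ten
```

The markers -ar and -ev are deliberately left out, since they are described as closed-class markers rather than productive suffixes.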

Does this make sense?

Re: New auxlang: Atlas

Postby tommus » Thu Aug 17, 2017 2:50 pm

Rodiniye wrote:roots are those listed in the dictionary, so the 525 (I think we have at the moment). From them, you can pretty much add any suffix, as long as it makes sense (except -ar, -ev):

That is the reason that I believe it makes sense for the computer to be able to parse any suffix (even if it doesn't make sense), because how is the computer to know if it makes sense. The only way for the computer to know if it makes sense would be for people to go through every possible root and every possible suffix and make a judgement as to whether it makes sense, and provide the computer with that "complete" vocabulary. But what makes sense to the people reviewing the vocabulary may be different than a writer who needs to produce a word that is not in the dictionary. But it will be a long time, if ever, that people review all the possible roots plus suffixes. So for me, the easiest way for the computer to handle all these possibilities is to generate all of them and then have them instantly available in an internal vocabulary.
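The "generate everything" approach described here can be sketched in a few lines of Python. The suffix list is an assumption based on the suffixes named earlier in the thread, the function name `generate_forms` is hypothetical, and the two roots are only examples.

```python
NOUN_ENDINGS = ("a", "o", "u", "e")   # noun categories, each with a plural in -n
OTHER_ENDINGS = ("i", "em")           # adjective and adverb

def generate_forms(roots):
    """Map every generated surface form to (root, ending, is_plural)."""
    forms = {}
    for root in roots:
        for ending in NOUN_ENDINGS:
            forms[root + ending] = (root, ending, False)
            forms[root + ending + "n"] = (root, ending, True)
        for ending in OTHER_ENDINGS:
            forms[root + ending] = (root, ending, False)
    return forms

forms = generate_forms(["direb", "zunz"])
print(len(forms))          # 10 forms per root
print(forms["direbon"])    # ('direb', 'o', True)
```

Looking a word up in the resulting table is then a single dictionary access. One caveat: if two different roots ever generate the same surface form, the later one silently overwrites the earlier one in this sketch, so a real tool would want to detect that.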
Rodiniye wrote:*zunz-o? a bird that is a plant? that does not make sense (not that I can think of).

There is a well-known plant called "bird of paradise": Wikipedia
An English native writing in Atlas might derive the word "zunzo de dormwen-baxwene" (bird of dream landscape). So would this bird be a plant?

This example where I derived a word for paradise (dream landscape) brings up a question. If a derived word is made up of compound words, how are hyphens handled? One of the compound words could have three or more roots, thus with hyphens of its own.
Rodiniye wrote:I consider them to be productive suffixes, because you can produce endless lists of words with them.

OK. That is useful.
Rodiniye wrote:There is, however, a prefix that can be added to prepositions in order to make them accept other suffixes: "me-". Add "me-" to a preposition and you will be able to attach suffixes. So you have from "li" (inside), "meliu" (interior), or from "der" (close), "medera" (neighbor), etc.

So again, all the 25 prepositions with a prefix of "me-" could potentially take any suffix. Especially since the prepositions are so short (many are one letter), adding the "me" and suffix makes it a very different word that is hard to process. So I believe every possible version needs to be added to the computer internal dictionary.
Rodiniye wrote:Suffix -ar is a marker for correlatives, so it is not really a productive suffix.

In the grammar under Part 14 Negation, it says:

There are some words that are negative per se: hakun (no-nothing), nithe (neither):
Wi imis hakunar vrendan – I have no friends.
Wi imis zeen nithe – I don’t have them either - I have them neither.

Is hakun (no-nothing) a root? What about nithe (neither)? Why does hakun get an "ar" suffix but nithe does not? And then under Part 16 Correlatives, hakunar is there, but nithe is not? In these two sentences, are hakunar and nithe the same part of speech? I know nothing about them. I know neither of them?
Rodiniye wrote:Does this make sense?

Yes. Very informative. When you have time, it would be helpful to have some of this additional explanation added to the grammar.

Another question. The grammar Section 3 Nouns contains the following:

Voluntartily, a dual number might be expressed by the addition of the suffix –k:
nekoa (cat) - nekoak (two cats)
deribo (tree) - deribok (two trees)
itsa (person) - itsak (two people)

Voluntartily? I assume this is Voluntarily. Perhaps a better word would be Optionally, if that is what is meant. But I do not see the advantage of this "-k" to make two of something. "doi nekoan" seems perfectly good for "two cats" so why introduce an exception (option) that just complicates Atlas? The "nak" suffix already exists for eknak (first), doinak (second), etc. and that is fine.

Sorry for asking so many questions, and perhaps being too "nit picky" about details. But if I am going to tell my software how to interpret Atlas, then details are critical. So far, I have only been concerned with software that would translate Atlas into English. I haven't really looked at English into Atlas, but I think it will be much more difficult. So far, my software has an Atlas dictionary of 5671 words and roots, plus 622 derived words. So translating those to English is relatively easy compared with translating English to Atlas. Oxford Dictionaries say there are 171,476 words in current use in their main English dictionary. Counting inflections, etc, it is estimated that there are more than a million English words.

Re: New auxlang: Atlas

Postby tommus » Fri Aug 18, 2017 2:11 am

I have been doing a lot of computer programming, and a lot of searching through the grammar and the dictionary to try to reduce the number of derived words that my computer cannot yet parse, and/or that I cannot understand. It is very time-consuming. I have reduced the number of those the computer can't parse from 230 down to 190. Here are ten derived words that are a challenge and/or hard to understand:

bazvodibue (engineering): 3 components: baz-vo-dibue: factory-before-drawing
ekdiambinu (block of flats): 3 components: ek-diam-binu: one-live-building
ilemkitabu (encyclopedia): ilem-kitabu: ????-book
zahare (ruin): za-hare: rather than-lose, miss
xerkeqa (judge): xer-keqa: legal-work
xinnirmanu (wall): xin-nirmanu: line-build
mittciu (sweet): mitt-ciu: sweet-eat
houciu (dessert): hou-ciu: after-eat
quduane (summary): qu-duane: take-short
vunkhuqe (call by telephone): vunk-huqe: broadcast-call

Some of the parsing problems are caused by derived words of more than two components but which have no hyphens.
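One way to handle unhyphenated compounds is to try every way of cutting the word into known roots followed by a final ending. The sketch below is illustrative only. The root spellings ("bax", "ci" and so on) are inferred from the words quoted in this thread and may not match the real dictionary, and the function names are hypothetical.

```python
ENDINGS = ("a", "o", "u", "e", "i", "em")
ROOTS = ("xer", "verm", "bax", "hou", "ci")   # assumed root spellings

def _split_stem(stem, roots):
    """Return every way to write `stem` as a sequence of known roots."""
    if not stem:
        return [[]]
    out = []
    for root in roots:
        if stem.startswith(root):
            for rest in _split_stem(stem[len(root):], roots):
                out.append([root] + rest)
    return out

def split_compound(word, roots=ROOTS, endings=ENDINGS):
    """Return every reading of `word` as one or more roots plus an ending."""
    results = []
    for ending in endings:
        if word.endswith(ending) and len(word) > len(ending):
            stem = word[: -len(ending)]
            for parts in _split_stem(stem, roots):
                results.append(parts + [ending])
    return results

print(split_compound("xervermbaxu"))
print(split_compound("houciu"))
print(split_compound("ilemkitabu"))   # no reading: "ilem" is unknown
```

An empty result flags a word for manual review, and more than one result flags a genuinely ambiguous compound, which is the case where hyphens would help most.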

For ilemkitabu (encyclopedia): ilem-kitabu: ????-book, I cannot figure out what "ilem" is.

I have a major concern that is growing stronger about Atlas derived words. And there will be many, many Atlas derived words because there are only about 500 main words. With the very limited number of roots, and the very complex grammar (in my opinion), many of the derived words are going to be strange. Just have a look at these ten above. They are supposed to be easy to remember because they are derived from the small number of roots. But I think a learner's head is really going to hurt to try to (1) figure out how to parse it (2) figure out how and why a derived word was constructed (3) figure out how to memorize it, because it may well be a pretty vague representation of the intended word (4) not think of several different or better ways of composing that derived word (5) and try to bring back that complex picture when you need to produce the word. I think it may well be much more difficult to memorize and remember this complex construction and reasoning than simply to memorize unique words that are not constructed from two or more roots and a lot of grammar.

Certainly hyphenating even the 2-component words would help, but that makes derived words of two or more 2-component words a bigger challenge.

As I work through some more of the so-far un-parsable words, I hope my optimism improves.

Re: New auxlang: Atlas

Postby tommus » Fri Aug 18, 2017 3:03 pm

Good progress in parsing derived words.

I added a lot more words to my Atlas dictionary, internal to my software. I think I have now generated just about every possible word that can have the whole range of suffixes. And that produced dividends. I reduced the number of derived words that can't be parsed by the computer from 190 to 36. I haven't got time at the moment to analyse any of these 36. I'll get to that later today. But I'll post them here.

Code:

xerkeqa
vunkhuqe
carxines
lizaic
weizaic
hesanzaic
anzaic
bezaic
xahzue
sesende
wowomi
avsaic
xervermbaxu
vahsa
vikti
cardxnaides
anzeites
bellzunza
viktanku
weiaizhewana
sardmebelu
xahheze
keqhissu
sestelu
tarkaixe
darizidee
hinviaqe
cangdirebo
sardnosu
samuilnosu
vraivexu, darvexu
vekvexu
iadarti
quga
herikvese
qiensoge

Re: New auxlang: Atlas

Postby Rodiniye » Fri Aug 18, 2017 9:11 pm

First of all, my apologies for not being able to update the blog with countries as I promised.

As some of you might know, my home city is Barcelona, and it was hit by an awful terrorist attack yesterday. I wanted to spend that afternoon working on Atlas but obviously I had other things in mind. It was an awful experience and feeling and I am really sorry for all the people affected by the attack.

I said I would have a news page in Atlas, updating it with important world news. Well I would have never wanted to translate this, but I wrote about the attack in Atlas:

https://atlas-language.blogspot.com.es/p/news.html

It is more peace and love that we need, not violence.

Anyway, I will be away now until Monday! (holidays). But I have a bit of time to answer a few questions.

Again, thank you for your hard work. The blog is attracting more people than ever possibly (without "marketing campaigns") so thank you everyone.

Countries post when I come back!

tommus wrote:If a derived word is made up of compound words, how are hyphens handled? One of the compound words could have three or more roots, thus with hyphens of its own.


Interesting. I would keep compound words together with no hyphen in this case.

Any thoughts Crush?

tommus wrote:I reduced the number of derived words that can't be parsed by the computer from 190 to 36


That's very good news then. I will have a look at them when I can!

tommus wrote:Is hakun (no-nothing) a root? What about nithe (neither)? Why does hakun get an "ar" suffix but nithe does not? And then under Part 16 Correlatives, hakunar is there, but nithe is not? In these two sentences, are hakunar and nithe the same part of speech? I know nothing about them. I know neither of them?


hakun is no root, "hakunar" is a correlative.

nithe is no root either, it is an adverb (found in adverbs). There are a few adverbs that do not come from roots and therefore they do not take -em endings (nithe, itzei... etc).

"neither" (nithe) is used when it is the opposite of "too" or "as well".

tommus wrote:Oxford Dictionaries say there are 171,476 words in current use in their main English dictionary. Counting inflections, etc, it is estimated that there are more than a million English words.


My previous project had around 12.000 words and I am pretty sure you can handle every situation with around 13-15.000. That is not inflected words. I mean, I am sure those 170.000 English words count every possible modification of the word "watch" (watches, watched, watch, watching..... etc), and I am only talking about one version of them.

Vocabulary will grow. I believe Esperanto had only around 2000 words when it was published?

tommus wrote:Voluntartily? I assume this is Voluntarily. Perhaps a better word would be Optionally, if that is what is meant. But I do not see the advantage of this "-k" to make two of something. "doi nekoan" seems perfectly good for "two cats" so why introduce an exception (option) that just complicates Atlas? The "nak" suffix already exists for eknak (first), doinak (second), etc. and that is fine.


Optional might be better, you are right. I do not think it is overcomplicating. If people know the dual number and want to use it, use it. If not, just use the plural. "doi nekoan" is as good as "doi nekoak", or only "nekoak".

tommus wrote:So again, all the 25 prepositions with a prefix of "me-" could potentially take any suffix. Especially since the prepositions are so short (many are one letter), adding the "me" and suffix makes it a very different word that is hard to process. So I believe every possible version needs to be added to the computer internal dictionary.


This system is not that productive but it is useful for some words, as the ones I mentioned in the example. Actually, it will only be used in a few words. Having said this, not many roots begin with "me-". I have only found one that could be mistaken for a preposition "me-" compound (mes - month). So maybe changing that root to "mess" solves the problem quite easily.

qu/qu


Just found out that there is a duplication here too. "qu" (take) and "qu" (preposition) are the same. I will probably change the verb into something very similar very soon in order to avoid confusion. The root "qu" as a verb "take" does not appear in many compound words.
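Both kinds of clash mentioned in this post (a root spelled like a preposition, and a root that looks like "me-" plus a preposition) can be checked automatically. The following Python sketch assumes the roots and prepositions are available as sets. The function name is hypothetical, and treating "s" as a preposition is an inference from the "mes" example rather than something confirmed in the vocabulary.

```python
def find_collisions(roots, prepositions):
    """Report roots that clash with prepositions or with "me-" compounds."""
    problems = []
    for root in sorted(roots):
        if root in prepositions:
            problems.append((root, "same spelling as a preposition"))
        if root.startswith("me") and root[2:] in prepositions:
            problems.append((root, f"could be read as me- + {root[2:]}"))
    return problems

sample_roots = {"qu", "mes", "dak"}
sample_prepositions = {"qu", "s", "li", "der"}   # "s" is assumed here
for root, reason in find_collisions(sample_roots, sample_prepositions):
    print(root, "->", reason)
```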

Re: New auxlang: Atlas

Postby tommus » Fri Aug 18, 2017 9:42 pm

Rodiniye wrote:As some of you might know, my home city is Barcelona, and it was hit by an awful terrorist attack yesterday. I wanted to spend that afternoon working on Atlas but obviously I had other things in mind. It was an awful experience and feeling and I am really sorry for all the people affected by the attack.

As I said to you in a private message, it is very sad what has happened in Barcelona and I wish the people of the area all the best as they try to recover from this violence.

Re: New auxlang: Atlas

Postby tommus » Fri Aug 18, 2017 11:09 pm

Some more analysis of derived words that could not be parsed:

xerkeqa (judge): xer-kaqa: legal-work: "keqa" is spelled incorrectly. Should be "kaqa". Legal-work doesn't uniquely point to "judge".

lizaic should be "lizaii" (the "c" was dropped and double letters ("ii") are now allowed).

Same problem with:
weizaic
hesanzaic
anzaic
bezaic
avsaic

lizaii (intrinsic): li-zaii: inside, during, while - be
weizaii (extrinsic): wei-zaii: outside, except - be
hesanzaii (comfortable): hesan-zaii: easy - be
anzaii (available): an-zaii: on - be
bezaii (comfortable): be-zaii: good - be
avsaii (puzzled): av-zaii: off, inactive - be

xahheze (democracy): xah-heze: ???-power (what is "xah"?)

wowomi (artificial): wo-womi: ??? - man, woman (what is "wo"?)

xervermbaxu (prison): spelling wrong in vocabulary ("prision")
xervermbaxu (prison): xer-verm-baxu: legal-close-building (3-component word with no hyphen)

vahsa (adult): vah-sa: ??? - so (what is "vah"?)

vikti (heavy): this is a word (adjective), not a derived word. Problem is that in the vocabulary, the root is shown as vikti (it should be vikt as the root)

cardxnaides (sculpt): card-xnaides: ??? - cut (what is "card"?)

anzeites (take time): should be "anzaites"
anzaites (take time): an-zaites: on, active - time

bellzunza (swan): should be "belzunza"
belzunza (swan): bell-zunza: beautiful - bird

Many of these Atlas words are very difficult for me to relate to their meanings. As I said earlier, I can't see that these words will be easy to learn.

The remaining words that my software hasn't been able to parse are shown below. I haven't gotten to them yet.

Code:

viktanku
weiaizhewana
sardmebelu
keqhissu
sestelu
tarkaixe
darizidee
hinviaqe
cangdirebo
sardnosu
samuilnosu
vraivexu, darvexu (in vocabulary like this.)
vekvexu
iadarti
quga
herikvese
qiensoge

Re: New auxlang: Atlas

Postby tommus » Sat Aug 19, 2017 1:44 am

Well, I have finished analysing the rest of the words that the software could not parse. It was tedious.

weiaizhewana (deer): wei-aiz-hewana: outside-bone-animal (that is weird)

sardmebelu (fridge): sard-mebelu: ???-furniture
Should it be sarmebelu? cold furniture

keqhissu (office): keq-hissu: ???-section
probably should be kaqhissu: work-section

sestelu (satellite): se-stelu: ???-star
Should be s-stelu: around-star

tarkaixe (commerce): tar-kaixe: ???-corporation

darizidee (intention): dariz-idee: catch-idea
The root of idea is "idei" so the word should be "darizideie"

hinviaqe (turism): should be tourism.
hinviaqe (tourism): hin-viaqe: ???-travel

cangdirebo (pine): cang-direbo: ???-tree

sardnosu (coat): sard-nosu: ???-cloth
Should be: sarnosu (coat): sar-nosu: cold-cloth

samuilnosu (sunglasses): samuil-nosu: common,typical-eye-cloth (weird)

vraivexu, darvexu (in vocabulary like this.)
vraivexu (?chain): vrai-vexu: free-object
darvexu (?chain): dar-vexu: catch-object (I don't understand)

vekvexu (pan): vek-vexu: ???-object

iadarti (artisan): iad-arti: ???-art

quga (player): This is a single-component word, not a derived word.

herikvese (flow): herik-vese: ???-go

qiensoge (economy): Spelling? Should it be:
qiensonge (economy): qien-songe: paper-move (why?)

So there are about 35 words in the last two or three of my posts that have problems that cause the parsing problems. Rodiniye, if you can address these, I will make the changes in my dictionary and my list of your derived words. When I get that processed, I'll make available the complete lists.
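Several of the problems above are one-letter misspellings of a known root ("keq" for "kaq", "sard" for "sar"). A tool could propose such corrections itself using Python's standard `difflib` module. In this sketch the root list is a small sample with spellings inferred from this thread, and the function name `suggest` is hypothetical.

```python
import difflib

# Sample of root spellings inferred from the thread, not the real dictionary.
KNOWN_ROOTS = ["kaq", "xer", "hiss", "sar", "verm"]

def suggest(component: str, roots=KNOWN_ROOTS, cutoff: float = 0.6):
    """Return known roots that closely resemble an unrecognised component."""
    return difflib.get_close_matches(component, roots, n=3, cutoff=cutoff)

for unknown in ("keq", "sard", "xah"):
    print(unknown, "->", suggest(unknown))
```

A component with no suggestion at all (such as "xah" with this sample list) is more likely a missing root than a typo, so it would still need an answer from the language's author.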

Re: New auxlang: Atlas

Postby tommus » Sat Aug 19, 2017 4:13 pm

Rodiniye wrote:
If a derived word is made up of compound words, how are hyphens handled? One of the compound words could have three or more roots, thus with hyphens of its own.

Interesting. I would keep compound words together with no hyphen in this case.

OK. I think that will work.
Rodiniye wrote:
Is hakun (no-nothing) a root? What about nithe (neither)? Why does hakun get an "ar" suffix but nithe does not? And then under Part 16 Correlatives, hakunar is there, but nithe is not? In these two sentences, are hakunar and nithe the same part of speech? I know nothing about them. I know neither of them?

hakun is no root, "hakunar" is a correlative.
nithe is no root either, it is an adverb (found in adverbs). There are a few adverbs that do not come from roots and therefore they do not take -em endings (nithe, itzei... etc).

"neither" (nithe) is used when it is the opposite of "too" or "as well".

In the grammar in section 14 Negation, this sentence shows "hakun" by itself without "ar" so it looks like it is a root. Probably this has to be changed.
I think the references to hakunar and nithe under Negation should show that hakunar is a correlative and nithe is an adverb to help explain why nithe doesn't have an "ar" suffix.
Rodiniye wrote:I am pretty sure you can handle every situation with around 13-15.000. That is not inflected words.

So far, Atlas has about 500 root words and about 600 derived words. A lot more are going to be needed. I think all of those 13-15,000 need to be established as legitimate (approved) Atlas words, not left to the users to invent. Otherwise, there will be multiple derived words for each single word, and it will be a nightmare. Control will then be lost. So the advantage of the small number of roots will have to be in the ease of remembering or figuring out words, not in the ability to invent all the words a user needs. A user would still be free to invent words if the user doesn't know the Atlas word, or the Atlas word does not yet exist.
Rodiniye wrote:If not, just use the plural. "doi nekoan" is as good as "doi nekoak", or only "nekoak".

As you know, I am against exceptions. I simply do not see the advantage "k" provides that is worth having this 3-way exception.