Author Topic: Baluchi language in the Unicode  (Read 2911 times)


Offline Digital Walk

  • Global Moderator
  • ******
  • Posts: 195
  • Karma: 24
    • Kechsoft
Baluchi language in the Unicode
« on: September 25, 2008, 10:07:23 PM »
9:36 PM 25-09-2008

Salam o draahbaat!

Omeit eN shomaa droaaheN Baask wash o jod et. Modaam Wash jod baatay.

Note for poets and linguists: Again, I don't claim to be any kind of linguist, so please point out any errors and correct me where I'm wrong.

Dear all,
As we all know, we write Baluchi in the Arabic script, but with a reduced set of letters.
This means that some Arabic letters stand for sounds we do not pronounce in Baluchi, so we should not use those letters in writing. For example, we do not pronounce (ص) 'Saad': no native Baluchi word contains this letter, and any word beginning with it is pronounced with (س) 'Seen' instead. Usually only personal names (mostly Islamic ones) that came from Arabic, or words borrowed from foreign languages (again, mostly Arabic), contain 'Saad'.
The following letters are usually not part of the Baluchi language:
ث، ح، خ، ص، ض، ط، ظ، ع، غ، ف، ق، ه
(Please don't confuse this Heh, as in 'hindi', with Heh Do-Chashmi. Baluchi words use Heh Do-Chashmi in every position in the word.)

Now here is the list of the letters of the Baluchi alphabet:
آ، ا، ب، پ، ت، ٹ، ج، چ، د، ڈ، ر، ز، ژ، ڑ، س، ش، ک، گ، ل، م، ن، و، ھ، ی، ے
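
A minimal sketch of how the reduced letter set could be checked by a program, assuming Python 3. The function name `find_non_baluchi_letters` and the sample name are only my illustration (hypothetical, not from any existing tool); the excluded letters are copied from the list given earlier in this post.

```python
# Letters listed in this post as not normally part of Baluchi spelling.
# (Illustrative only; copied from the list above.)
NON_BALUCHI_LETTERS = "ثحخصضطظعغفقه"


def find_non_baluchi_letters(text):
    """Return the excluded letters that occur in text, in order, without repeats."""
    found = []
    for ch in text:
        if ch in NON_BALUCHI_LETTERS and ch not in found:
            found.append(ch)
    return found


if __name__ == "__main__":
    # A sample Arabic-origin name spelled with Saad (hypothetical test word).
    print(find_non_baluchi_letters("صابر"))
    # The same name spelled with Seen instead.
    print(find_non_baluchi_letters("سابر"))
```

A word that comes back with an empty list uses none of the excluded letters; it says nothing about whether the spelling is otherwise correct.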

Note: Could someone please give the correct Baluchi pronunciation of these letters, with examples?

More details on the main topic, Unicode, will be covered later.
Thanks for your time,
shomay kaster,
To be continued...

Offline Digital Walk

Re: Baluchi language in the Unicode
« Reply #1 on: September 26, 2008, 01:02:06 AM »
Note: This part of my post contains technical terms, so be prepared to encounter a bunch of them!
And this is going to be a long, dry and boring post. :hehe:

To learn more about a term, open Google and type "define:your_keyword",
where your_keyword is the word you want Google to define for you.

Now the question: What is Unicode?

The Unicode Consortium answers it like this:

Unicode provides a unique number for every character,
no matter what the platform,
no matter what the program,
no matter what the language.

Fundamentally, computers just deal with numbers. They store letters and other characters by assigning a number for each one. Before Unicode was invented, there were hundreds of different encoding systems for assigning these numbers. No single encoding could contain enough characters: for example, the European Union alone requires several different encodings to cover all its languages. Even for a single language like English no single encoding was adequate for all the letters, punctuation, and technical symbols in common use.
These encoding systems also conflict with one another. That is, two encodings can use the same number for two different characters, or use different numbers for the same character. Any given computer (especially servers) needs to support many different encodings; yet whenever data is passed between different encodings or platforms, that data always runs the risk of corruption.
for more visit:

Now, the ASCII table contains 128 characters (0-127), which include the upper- and lower-case letters of English, the digits 0 to 9, punctuation marks, and control codes used by the system. The 8-bit extensions of ASCII go up to 256 characters (0-255), but that is still far too few: there are hundreds of languages in the world with their own scripts. The Unicode Consortium was established to solve this problem.
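
To make the conflict between the old encodings concrete, here is a small sketch, assuming Python 3 and its built-in codecs. The helper name `decode_byte` is my own; the three encodings are just examples I picked (Western European, Cyrillic and Arabic).

```python
def decode_byte(value, encoding):
    """Decode a single byte value (0-255) using the named encoding."""
    return bytes([value]).decode(encoding)


if __name__ == "__main__":
    # Plain ASCII only defines the numbers 0 to 127.
    print(decode_byte(65, "ascii"))

    # A byte above 127 has no meaning in ASCII ...
    try:
        decode_byte(0xE4, "ascii")
    except UnicodeDecodeError:
        print("0xE4 is not an ASCII character")

    # ... and each older 8-bit encoding gave it a different meaning.
    for name in ("latin-1", "cp1251", "iso8859-6"):
        ch = decode_byte(0xE4, name)
        print(name, ch, "U+%04X" % ord(ch))
```

The same stored number turns into a different letter depending on which table the reader assumes, which is exactly the corruption risk described in the quote.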

You should read the Unicode website thoroughly to get an idea of what Unicode really is, and compare it with ASCII (American Standard Code for Information Interchange) to see the difference.

Unicode supports many scripts, and one script may serve several languages. For example, Baluchi, Farsi, Urdu, etc. are all based on the Arabic script, which means all of these languages share a common writing system.

Every script has a code (a four-letter ISO 15924 code, such as Arab for the Arabic script), while the languages written in a script are identified separately by their own ISO 639 language codes.
The Unicode website maintains and updates the list of supported scripts.

The ISO 639-2 language code for Baluchi is bal.

Besides, Unicode also has a huge repository of the letters of (almost) every language of the world. As said in the beginning, each letter is given a unique code.
The Arabic block is given the range 0600 to 06FF. This means 0600 (the Unicode code for ؀) is the first character in the range and 06FF (the Unicode code for ۿ) is the last. Within this range are the letters used for Arabic, Persian, Urdu, Sindhi, etc. (Syriac has its own block, 0700 to 074F.)
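
A rough sketch for looking at this range yourself, assuming Python 3 and its standard `unicodedata` module. The helper names `in_arabic_block` and `describe` are mine, and the letter string is the Baluchi alphabet as listed in my first post.

```python
import unicodedata

# The Arabic block as described above: 0600 to 06FF.
ARABIC_BLOCK_START = 0x0600
ARABIC_BLOCK_END = 0x06FF

# Baluchi letters, copied from the list in the first post.
BALUCHI_LETTERS = "آابپتٹجچدڈرزژڑسشکگلمنوھیے"


def in_arabic_block(ch):
    """True if the character's code lies between 0600 and 06FF."""
    return ARABIC_BLOCK_START <= ord(ch) <= ARABIC_BLOCK_END


def describe(ch):
    """Return the code and official Unicode name of a character."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch, "<unnamed>"))


if __name__ == "__main__":
    # First and last characters of the block.
    print(describe(chr(ARABIC_BLOCK_START)))
    print(describe(chr(ARABIC_BLOCK_END)))
    # Every Baluchi letter with its code, name and block membership.
    for letter in BALUCHI_LETTERS:
        print(letter, describe(letter), in_arabic_block(letter))
```

Running it prints one line per letter, so you can see which code each Baluchi letter has been given.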

If we try we can (actually we should) submit missing characters such as   to Unicode.

...and for fun or knowledge:
Open MS Word and type a 4-digit code made of the digits 0-9 and the letters A to F.
Try this:
Type 062C, place the cursor just to the right of the C, hold down the Alt key, press X, and release both keys. Tell me what you see! OK?
You can then try the next consecutive code, which is 062D.
To be continued...
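
If you don't have Word at hand, the same code-to-letter conversion can be sketched in Python 3. The function names `code_to_char` and `char_to_code` are just names I made up for this example.

```python
def code_to_char(hex_code):
    """Turn a hexadecimal code such as '062C' into its character."""
    return chr(int(hex_code, 16))


def char_to_code(ch):
    """Turn a single character back into its 4-digit hexadecimal code."""
    return "%04X" % ord(ch)


if __name__ == "__main__":
    # The two codes from the Word experiment above.
    for code in ("062C", "062D"):
        print(code, code_to_char(code))
```

`char_to_code` goes the other way, which is handy when you want to find the code of a letter you already have on screen.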