There are many open source projects waiting to be done in Tamil computing.
Some of them are listed here.
1. Font conversion to Unicode
There are a lot of Tamil fonts in TSCII and ASCII format.
Examples:
TAM, TAB, Vanavil, Bamini, indoword, softview, kabilan, kaniyan, shri TAM, Shri lipi, ilango, mayilai, anu, senthamizh, etc. These are used in DTP centers.
A huge number of documents have been created in these fonts.
To view them, we need to install these fonts locally.
These documents should be converted to Unicode so that anyone can view them without installing any special fonts.
NHM Converter is an online service that does this.
http://software.nhm.in/services/converter
We need to create a FOSS application for this.
We can do this in Python, PHP, Ruby, etc.
It can be a desktop or a web application.
The TACE format should also be considered.
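A minimal Python sketch of how such a converter could work is given here. Legacy fonts assign Tamil glyphs to ordinary keyboard characters, so conversion is mostly a table lookup. These fonts also typically store the vowel signs ெ, ே and ை before the consonant (visual order), while Unicode stores them after it, so a reordering step is needed. The mapping table below is a tiny illustrative placeholder and not a verified table for any of the fonts listed above; the names `LEGACY_TO_UNICODE`, `convert` and `reorder_prefix_vowels` are hypothetical. A real converter would need one complete table per font, and would also have to handle two-part vowel signs, which this sketch does not.

```python
# Illustrative placeholder table: legacy keystroke -> Unicode string.
# ASSUMPTION: these pairs are for demonstration only; a real converter
# needs a complete, verified table for each legacy font.
LEGACY_TO_UNICODE = {
    "j": "\u0ba4",  # TAMIL LETTER TA
    "k": "\u0bae",  # TAMIL LETTER MA
    "o": "\u0bb4",  # TAMIL LETTER LLLA
    "p": "\u0bbf",  # TAMIL VOWEL SIGN I
    ";": "\u0bcd",  # TAMIL SIGN VIRAMA (pulli)
    "n": "\u0bc6",  # TAMIL VOWEL SIGN E (typed before the consonant)
}

# Vowel signs that are displayed to the left of the consonant.
PREFIX_VOWEL_SIGNS = {"\u0bc6", "\u0bc7", "\u0bc8"}


def is_tamil_consonant(ch):
    """True for code points in the Tamil consonant range KA..HA."""
    return "\u0b95" <= ch <= "\u0bb9"


def reorder_prefix_vowels(text):
    """Move a left-side vowel sign to after the consonant that follows it."""
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i] in PREFIX_VOWEL_SIGNS and is_tamil_consonant(chars[i + 1]):
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the pair that was just swapped
        else:
            i += 1
    return "".join(chars)


def convert(text, table=LEGACY_TO_UNICODE):
    """Convert legacy-encoded text to Unicode using longest-match lookup."""
    keys = sorted(table, key=len, reverse=True)  # try longer keys first
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            out.append(text[i])  # unmapped characters pass through unchanged
            i += 1
    return reorder_prefix_vowels("".join(out))
```

With this placeholder table, `convert("jkpo;")` returns தமிழ். Keeping the table as plain data means support for another font can be added without touching the conversion code.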
2. Spellchecker
We can create a new spellchecker or extend an existing one such as aspell, hunspell, or Project SILPA.
Explore these:
www.silpa.in/
https://groups.google.com/forum/?fromgroups#!topic/freetamilcomputing/dEQgHESN9us
http://saranyaselvaraj.wordpress.com/2009/09/17/aspell-and-hunspell/
3. Grammar checker - santhi pizai thirutthi (sandhi error corrector)
4. Dictionary with Tamil meanings, English meanings, antonyms, and synonyms
5. Number to string converter
Example: 100 = nooru
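A small Python sketch of such a converter, limited to the range 1 to 100, could look like this. The function name `number_to_tamil` and the table layout are assumptions made for illustration. To stay simple, it writes compound numbers as two separate words (tens in their combining form, then the unit) and does not apply the sandhi joining used in fully written-out forms. Larger numbers would need more rules.

```python
# Number words for 1..9.
UNITS = {
    1: "ஒன்று", 2: "இரண்டு", 3: "மூன்று", 4: "நான்கு", 5: "ஐந்து",
    6: "ஆறு", 7: "ஏழு", 8: "எட்டு", 9: "ஒன்பது",
}

# 10..19 are irregular, so they are listed explicitly.
TEENS = {
    10: "பத்து", 11: "பதினொன்று", 12: "பன்னிரண்டு", 13: "பதின்மூன்று",
    14: "பதினான்கு", 15: "பதினைந்து", 16: "பதினாறு", 17: "பதினேழு",
    18: "பதினெட்டு", 19: "பத்தொன்பது",
}

# Tens digit -> (standalone form, combining form used before a unit word).
TENS = {
    2: ("இருபது", "இருபத்து"), 3: ("முப்பது", "முப்பத்து"),
    4: ("நாற்பது", "நாற்பத்து"), 5: ("ஐம்பது", "ஐம்பத்து"),
    6: ("அறுபது", "அறுபத்து"), 7: ("எழுபது", "எழுபத்து"),
    8: ("எண்பது", "எண்பத்து"), 9: ("தொண்ணூறு", "தொண்ணூற்று"),
}


def number_to_tamil(n):
    """Return the Tamil words for an integer between 1 and 100."""
    if not 1 <= n <= 100:
        raise ValueError("this sketch only handles 1..100")
    if n == 100:
        return "நூறு"  # nooru
    if n < 10:
        return UNITS[n]
    if n < 20:
        return TEENS[n]
    tens, unit = divmod(n, 10)
    standalone, combining = TENS[tens]
    if unit == 0:
        return standalone
    # ASSUMPTION: parts are kept as separate words, without sandhi joining.
    return combining + " " + UNITS[unit]
```

For example, `number_to_tamil(100)` returns நூறு (nooru).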
6. OCR for Tamil
The following projects are at an early stage.
http://gtamilocr.sourceforge.net
https://launchpad.net/tamilocr
Test and extend them.
7. Tamil Corpus
A web application should be developed that shows a word along with all the possible grammar tags.
Logged-in users can select the relevant grammar tag for that word.
Thus, when many people contribute, a complete tagged Tamil corpus will be built.
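One possible way to turn the contributed selections into corpus entries is a simple majority vote, sketched below in Python. Everything here is an assumption made for illustration: the `(user, word, tag)` vote format, the tag names, the `min_votes` threshold, and the rule that a user's later vote replaces an earlier one. A word is accepted only when its top tag has enough votes and is not tied with another tag.

```python
from collections import Counter, defaultdict


def aggregate_tags(votes, min_votes=2):
    """Build {word: tag} from an iterable of (user, word, tag) votes.

    A word is included only if its most-voted tag has at least
    `min_votes` votes and is not tied with the runner-up.
    """
    # Keep one vote per (user, word); a later vote replaces an earlier one.
    latest = {}
    for user, word, tag in votes:
        latest[(user, word)] = tag

    # Count votes per tag for every word.
    counts = defaultdict(Counter)
    for (_user, word), tag in latest.items():
        counts[word][tag] += 1

    corpus = {}
    for word, counter in counts.items():
        (top_tag, top_count), *rest = counter.most_common(2)
        tied = bool(rest) and rest[0][1] == top_count
        if top_count >= min_votes and not tied:
            corpus[word] = top_tag
    return corpus


# Hypothetical sample votes.
sample_votes = [
    ("user_a", "மரம்", "noun"),
    ("user_b", "மரம்", "noun"),
    ("user_c", "மரம்", "verb"),
    ("user_a", "ஓடு", "verb"),
    ("user_b", "ওடு".replace("ও", "ஓ"), "noun"),
]
```

Words that stay undecided, like the tied second word in the sample, can simply be shown to more users until a clear majority appears.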
8. Rule-based auto-complete for Tamil
9. Automatic machine translation
10. GUI and web-based tools for beginners to learn Tamil
11. Text to Speech for Tamil
http://dhvani.sourceforge.net/
Test and enhance it.
12. Project for Wiktionary
Wiktionary is a wiki-based dictionary for all languages.
Example:
http://ta.wiktionary.org
We can add voice files to Wiktionary.
We need to create a web application, and desktop and mobile clients, that display each word and ask the user to record its pronunciation.
Once recorded, the Ogg sound file should be uploaded to Wikimedia Commons (commons.wikimedia.org) and then linked back to the page for the same word in Wiktionary.
Thus, any user can record a word and have the audio uploaded automatically.
13. Tamil Wikipedia needs a JavaScript-based on-screen keyboard, so
that users can click and type easily.
Some of these projects are discussed in the following collection of research papers.
http://ti2012.infitt.org/sites/default/files/Conference-book.part1.rar
http://ti2012.infitt.org/sites/default/files/Conference-book.part2.rar
Engineering college students who need a base paper for their projects can use these papers and build applications on top of them.
Thanks to T.Shrinivasan, ILUGC.