ILEX Home Page
International Interlinked Lexicon
This is the home page for the ILEX database open source project.
Hi there from Robert!
I am just getting started on making this ILEX project open source.
Previously the data was in SFM (Standard Format Marker) text files on my computer, with a C++ program to compress them into an undocumented format which my program TED used. Now I'm just learning about SourceForge.net and how to use its facilities, so please be patient with me (or else offer to help me).
I am looking to make the multilingual dictionary more comprehensive, more general and more universal and to get others involved in helping add to this database.
Main Features of the ILEX database
- MULTILINGUAL: The largest word base is English, but there are a reasonable number of words from about a dozen languages. I hope to greatly expand the number of languages by harvesting online dictionaries.
- INTERNATIONALIZED: Everything is in UTF-8, and I would like it to handle as many of the world's languages as possible (not just the major ones). It also handles dialects, e.g., British English and American English.
- TAGGED: Words are not only tagged with a part of speech, but also with semantic components, e.g., yellow is a colour/color, oak is a type of tree, piston is part of an engine.
- INTERLINKED: Word entries are interlinked within each language, e.g., English cats is the plural of cat, and also linked by translation between languages, e.g., the Cebuano for cat is iring.
- FUZZY: Able to handle real-world data which is not always computer friendly, e.g., "synonyms" in languages are not usually exact synonyms.
- OPEN SOURCE: To be released for everyone to use freely (tentatively under GPL v3).
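To give a feel for how the tagged, interlinked and fuzzy features above might fit together in SQLite3, here is a minimal sketch in Python. It is not the actual ILEX layout: the table names (word, tag, link), the column names, the language codes and the "strength" value used for fuzzy links are all assumptions made up for illustration.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only,
# not the real ILEX database layout.
SCHEMA = """
CREATE TABLE word (
    id   INTEGER PRIMARY KEY,
    lang TEXT NOT NULL,   -- language or dialect code, e.g. 'en', 'ceb'
    form TEXT NOT NULL,   -- the word itself (UTF-8 text)
    pos  TEXT             -- part of speech
);
CREATE TABLE tag (
    word_id  INTEGER NOT NULL REFERENCES word(id),
    relation TEXT NOT NULL,   -- e.g. 'is_a', 'type_of', 'part_of'
    target   TEXT NOT NULL    -- e.g. 'colour', 'tree', 'engine'
);
CREATE TABLE link (
    from_id  INTEGER NOT NULL REFERENCES word(id),
    to_id    INTEGER NOT NULL REFERENCES word(id),
    relation TEXT NOT NULL,             -- e.g. 'plural_of', 'translation'
    strength REAL NOT NULL DEFAULT 1.0  -- 1.0 = exact, lower = fuzzier
);
"""


def build_demo():
    """Create an in-memory database holding the examples from the list above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO word VALUES (?, ?, ?, ?)", [
        (1, "en", "cat", "noun"),
        (2, "en", "cats", "noun"),
        (3, "ceb", "iring", "noun"),
        (4, "en", "oak", "noun"),
    ])
    conn.execute("INSERT INTO tag VALUES (4, 'type_of', 'tree')")
    conn.executemany("INSERT INTO link VALUES (?, ?, ?, ?)", [
        (2, 1, "plural_of", 1.0),    # cats is the plural of cat
        (1, 3, "translation", 0.9),  # made-up strength for a near-exact match
    ])
    return conn


def translations(conn, form, from_lang, to_lang):
    """Return (form, strength) pairs for translations, strongest first."""
    return conn.execute(
        """SELECT w2.form, l.strength
           FROM word w1
           JOIN link l ON l.from_id = w1.id AND l.relation = 'translation'
           JOIN word w2 ON w2.id = l.to_id
           WHERE w1.form = ? AND w1.lang = ? AND w2.lang = ?
           ORDER BY l.strength DESC""",
        (form, from_lang, to_lang),
    ).fetchall()
```

Calling translations(build_demo(), "cat", "en", "ceb") looks up the Cebuano link for cat. Keeping a strength on each link is just one possible way to record that a "synonym" or translation is not exact.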
Related Projects (also on SourceForge)
- ILEX TOOLS: A library to use (and possibly edit) this ILEX database. Might include a spell checking program (partly as a way of expanding the database).
- TED-AI: A fun (toy) AI program using ILEX Tools for multilingual conversation (similar to the original English Eliza program) and also translation.
To Do
- Put more useful general information on this webpage.
- Determine the best licence for this database. (Tentatively GPLv3 but is that suitable for a data project like this one?)
- Upload the initial SQLite3 database into Subversion so others can check it out and play with it.
- Put this website into Subversion and write a cron script to automatically update the web server with rsync.
- Write a script to convert the SQLite3 database into MySQL.
- Write a PHP web interface to display the existing data here on SourceForge.
- Write a PHP web interface to allow existing data to be edited and new data to be entered (with some control over it somehow).
- Write scripts to harvest/mine new entries and cross references from online sources, e.g., Wiktionary, the online Interlingua dictionary, etc.
- Make an icon for this project.
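As a possible starting point for the SQLite3-to-MySQL item above, here is a rough Python sketch that reads every table from a SQLite3 database and prints MySQL-style INSERT statements. The function names (mysql_literal, dump_inserts) are made up for this example. It only covers the data: the CREATE TABLE statements would still need translating by hand, and the string escaping assumes MySQL's default SQL mode (backslash escapes enabled).

```python
import sqlite3


def mysql_literal(value):
    """Format one SQLite value as a MySQL literal (assumes default SQL mode)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, bytes):
        return "X'" + value.hex() + "'"
    # Escape backslashes first, then double any single quotes.
    text = str(value).replace("\\", "\\\\").replace("'", "''")
    return "'" + text + "'"


def dump_inserts(conn):
    """Yield one MySQL INSERT statement per row of every user table."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name")]
    for table in tables:
        cursor = conn.execute('SELECT * FROM "%s"' % table.replace('"', '""'))
        # MySQL quotes identifiers with backticks.
        columns = ", ".join(
            "`%s`" % d[0].replace("`", "``") for d in cursor.description)
        for row in cursor:
            values = ", ".join(mysql_literal(v) for v in row)
            yield "INSERT INTO `%s` (%s) VALUES (%s);" % (
                table.replace("`", "``"), columns, values)
```

Something like `for line in dump_inserts(sqlite3.connect("ilex.db")): print(line)` would write the statements out, where ilex.db is a placeholder filename.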
This page last updated: Thursday, 7 February 2008
Copyright © 2008—Robert—All Rights Reserved