Text processing: Text Cat
Cesar D. Rodas
This class can be used to guess the language of a given text.
The class reads data files that contain ranking information about characters that are most likely to be found in texts of several languages.
The text being analyzed is converted to Unicode to be compared with the language character ranking data.
The class returns an array of the languages sorted by ranking.
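To make the ranking idea concrete, here is a minimal sketch in Python. It is not the class's own PHP code or API: the function names, the tiny training sentences, and the "out-of-place" distance used to compare rankings are all assumptions chosen for illustration. The real class reads its rankings from data files instead of building them from sample sentences.

```python
from collections import Counter


def rank_profile(text, top=300):
    """Map each letter to its frequency rank (0 = most frequent)."""
    counts = Counter(ch for ch in text.lower() if ch.isalpha())
    # Sort by descending count, then by character, so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {ch: rank for rank, (ch, _) in enumerate(ordered[:top])}


def out_of_place(sample, reference):
    """Sum of rank differences; letters missing from the reference
    get a fixed penalty equal to the size of the reference profile."""
    penalty = len(reference)
    return sum(
        abs(rank - reference[ch]) if ch in reference else penalty
        for ch, rank in sample.items()
    )


def guess_language(text, profiles):
    """Return the language names sorted from best to worst match."""
    if isinstance(text, bytes):
        # Assumption: byte input is UTF-8; decode it to Unicode first.
        text = text.decode("utf-8")
    sample = rank_profile(text)
    return sorted(profiles, key=lambda lang: out_of_place(sample, profiles[lang]))


# Hypothetical training data, far too small for real use.
TRAINING = {
    "english": "the quick brown fox jumps over the lazy dog",
    "russian": "съешь ещё этих мягких французских булок",
}
PROFILES = {lang: rank_profile(text) for lang, text in TRAINING.items()}

print(guess_language("hello world, how are you", PROFILES))
# → ['english', 'russian']
```

With only two languages in different scripts the guess is easy. Telling apart languages that share an alphabet would need much larger reference rankings.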
Currently the supported languages are: Arabic, Belarusian, Chinese, Czech, Danish, Dutch, English, Esperanto, French, German, Greek, Hebrew, Italian, Japanese, Russian, and Spanish.
Have a lot of fun with this!