Karl Wiberg wrote:
> On Fri, Dec 4, 2009 at 1:29 AM, Carl Worth <cworth@cworth.org> wrote:
>> And a step beyond that would support different languages for
>> different emails, but that sounds like something "hard" to identify.
>
> But probably not as hard as identifying spam. It could probably be
> done with a simple Bayesian filter counting word frequencies---but
> it'd be much better if somebody else had already solved the problem,
> since this smells suspiciously like something that ought to be a
> separate project and put in a library ... does anyone know if such a
> project already exists? I know Google can do it ...
>
> It'd be very cool to have notmuch automatically tag messages according
> to what language they're in.

What we should have is an interface to run an external program to
classify a message when it's newly introduced and another that runs when
tags are changed so that machine learning can be made to work when the
user changes tags.

Baruch
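
For illustration, the "simple Bayesian filter counting word frequencies"
mentioned in the quoted text could look roughly like the sketch below. It
is a toy under loud assumptions: the training sentences, the language
codes and the names SAMPLES, tokenize, train and classify are all made up
here, and none of it is notmuch code or an existing library. An external
classifier program of the kind proposed above could wrap something
similar.

```python
import math
import re
from collections import Counter

# Hypothetical toy training text; a usable filter would need far larger samples.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and the cat is in the house",
    "de": "der schnelle braune fuchs springt über den faulen hund und die katze ist im haus",
    "sv": "den snabba bruna räven hoppar över den lata hunden och katten är i huset",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def train(samples):
    """Count word frequencies per language."""
    return {lang: Counter(tokenize(text)) for lang, text in samples.items()}


def classify(model, text):
    """Return the language with the highest naive Bayes log-likelihood.

    Assumes a uniform prior over languages and uses add-one (Laplace)
    smoothing so that unseen words do not zero out a language.
    """
    vocab_size = len(set().union(*model.values()))
    words = tokenize(text)
    best_lang, best_score = None, -math.inf
    for lang, counts in model.items():
        total = sum(counts.values())
        score = sum(
            math.log((counts[word] + 1) / (total + vocab_size))
            for word in words
        )
        if score > best_score:
            best_lang, best_score = lang, score
    return best_lang
```

With these toy samples, classify(train(SAMPLES), "der hund ist im haus")
picks "de". Retraining when the user changes a tag would amount to adding
that message's words to the Counter for the corrected language.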