The simple analyzer introduced in #1010 can lead to confusion. From Will:
> If my file is called biology-data/CSFV-GFP01_3_R3D and I search for
> Name: "GFP*" it won't be found. (the * is a wild card).
> This is because, currently, CSFV-GFP01 is treated as a single token.
> So, if we "tokenise" using "-" , then the search index will have
> the words "GFP" and "CSFV", and not GFP-CSFV. If I search with
> "GFP" it will find this image.
> However, if I search for "CSFV-GFP", then this search string will
> also be tokenised the same way, so I'll be searching for "CSFV" and
> "GFP" separately. I guess this will find the file OK, but it won't
> distinguish between a file named biology-data/CSFV-GFP01_3_R3D and
> CSFV-data/H2B-GFP01_3_R3D since both have "CSFV" and "GFP" in the
> name. Although it may favour the former, based on the proximity of
> the words?
> Maybe, in the grand scheme of things, this is not a problem.
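
To make the scenario above concrete, here is a minimal Python sketch of the two behaviours Will describes. It is not the actual analyzer from #1010: `tokenize`, `matches_all`, and the delimiter sets `"/_"` and `"/_-"` are hypothetical stand-ins, chosen only so that `CSFV-GFP01` stays a single token in the first case and is split on "-" in the second. Ranking by word proximity is not modelled.

```python
import re
from fnmatch import fnmatchcase


def tokenize(text, delimiters):
    """Split text on any run of the given delimiter characters and lowercase it.

    Hypothetical stand-in for an analyzer; the real one may split differently.
    """
    pattern = "[" + re.escape(delimiters) + "]+"
    return [tok.lower() for tok in re.split(pattern, text) if tok]


def matches_all(query, name, delimiters):
    """True if every query term (with * as a wildcard) matches some token of name.

    The query is tokenised with the same delimiters as the indexed name.
    """
    tokens = tokenize(name, delimiters)
    terms = tokenize(query, delimiters)
    return all(any(fnmatchcase(tok, term) for tok in tokens) for term in terms)


FILE_A = "biology-data/CSFV-GFP01_3_R3D"
FILE_B = "CSFV-data/H2B-GFP01_3_R3D"

# Assumed current behaviour: "-" is not a delimiter, so "csfv-gfp01" is one token.
print(matches_all("GFP*", FILE_A, "/_"))        # False: no token starts with "gfp"

# Tokenising on "-" as well: "gfp01" is now its own token.
print(matches_all("GFP*", FILE_A, "/_-"))       # True

# The query is split the same way, so both files match "csfv" and "gfp*".
print(matches_all("CSFV-GFP*", FILE_A, "/_-"))  # True
print(matches_all("CSFV-GFP*", FILE_B, "/_-"))  # True: the two files are not distinguished

# Without splitting on "-", only the file containing "CSFV-GFP..." matches.
print(matches_all("CSFV-GFP*", FILE_A, "/_"))   # True
print(matches_all("CSFV-GFP*", FILE_B, "/_"))   # False
```

The last two pairs show the trade-off: splitting on "-" makes `GFP*` findable but loses the ability to tell `CSFV-GFP` apart from names that merely contain both words.
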
> One question I have is whether you can have different tokenisers
> for the file name and for text annotations (comments, descriptions
> etc). Ideally, if I've got CSFV-GFP in a text description, I'd like
> to be able to find it by searching for "CSFV-GFP". (When using this
> search string for a text annotation search, it would have to NOT be
> tokenised on "-"). So, if you have more than one tokeniser for
> indexing, you'd have to have more than one for searching different