I need to escape special characters in strings that are sent to Apache Lucene.
Since the code will run on a production server, I want it to be as fast as possible.
I've seen multiple ways to do it:
- Using Pattern
- Using Replace
- Using Library
See: http://www.javalobby.org/java/forums/t86124.html
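For reference, here is a minimal sketch of what I understand the Pattern approach to look like. The class name `LuceneEscapeRegex` and the method name `escape` are made up for illustration and are not part of Lucene. The pattern is compiled once and reused, since recompiling it on every call would presumably dominate the cost.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a Lucene class.
final class LuceneEscapeRegex {
    // Matches one special character, or the two-character operators && and ||.
    private static final Pattern SPECIAL =
            Pattern.compile("([+\\-!(){}\\[\\]^\"~*?:\\\\]|&&|\\|\\|)");

    private LuceneEscapeRegex() {
    }

    // Puts a single backslash in front of every match.
    static String escape(String query) {
        // In a replacement string, "\\\\" is one literal backslash and $1 is the match.
        return SPECIAL.matcher(query).replaceAll("\\\\$1");
    }
}
```

With this sketch, `LuceneEscapeRegex.escape("a+b")` gives `a\+b`, `x&&y` becomes `x\&&y`, and a lone `&` or `|` is left unchanged.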
I'm wondering:
- For trivial cases such as this, should I use a regex or custom code?
- Can the code below be optimized further?

    /*
     * Lucene supports escaping special characters that are part of the
     * query syntax. The current list of special characters is:
     * + - && || ! ( ) { } [ ] ^ " ~ * ? : \
     *
     * To escape these characters, put a \ before the character.
     */
    String query = "http://This+*is||a&&test(whatever!!!!!!)";
    // Worst case: every character is escaped, doubling the length.
    char[] queryCharArray = new char[query.length() * 2];
    char c;
    int length = query.length();
    int currentIndex = 0;
    for (int i = 0; i < length; i++) {
        c = query.charAt(i);
        switch (c) {
            case ':':
            case '\\':
            case '?':
            case '+':
            case '-':
            case '!':
            case '(':
            case ')':
            case '{':
            case '}':
            case '[':
            case ']':
            case '^':
            case '"':
            case '~':
            case '*':
                queryCharArray[currentIndex++] = '\\';
                queryCharArray[currentIndex++] = c;
                break;
            case '&':
            case '|':
                if (i + 1 < length && query.charAt(i + 1) == c) {
                    // "&&" or "||": escape the pair and skip its second character.
                    queryCharArray[currentIndex++] = '\\';
                    queryCharArray[currentIndex++] = c;
                    queryCharArray[currentIndex++] = c;
                    i++;
                } else {
                    // A lone '&' or '|' is not special; copy it through unchanged.
                    queryCharArray[currentIndex++] = c;
                }
                break;
            default:
                queryCharArray[currentIndex++] = c;
        }
    }
    query = new String(queryCharArray, 0, currentIndex);
    System.out.println("TEST=" + query);
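
To compare the hand-written loop against the other approaches, it could be wrapped in a method along these lines. This is only a sketch: the class name `LuceneEscapeLoop` is hypothetical, and the lazy `StringBuilder` rests on my assumption that most queries contain nothing to escape, in which case the input is returned without any copying. A lone `&` or `|` is kept as-is.

```java
// Hypothetical helper for illustration only; not a Lucene class.
final class LuceneEscapeLoop {
    private LuceneEscapeLoop() {
    }

    static String escape(String query) {
        int length = query.length();
        StringBuilder sb = null; // created only once something needs escaping
        for (int i = 0; i < length; i++) {
            char c = query.charAt(i);
            boolean special;
            switch (c) {
                case ':': case '\\': case '?': case '+': case '-': case '!':
                case '(': case ')': case '{': case '}': case '[': case ']':
                case '^': case '"': case '~': case '*':
                    special = true;
                    break;
                case '&': case '|':
                    // Only the doubled forms && and || are operators.
                    special = i + 1 < length && query.charAt(i + 1) == c;
                    break;
                default:
                    special = false;
            }
            if (special) {
                if (sb == null) {
                    // First hit: copy the untouched prefix in one call.
                    sb = new StringBuilder(length + 16).append(query, 0, i);
                }
                sb.append('\\').append(c);
                if (c == '&' || c == '|') {
                    // Emit the second character of the pair and skip it.
                    sb.append(c);
                    i++;
                }
            } else if (sb != null) {
                sb.append(c);
            }
        }
        // Nothing was escaped: return the input unchanged.
        return sb == null ? query : sb.toString();
    }
}
```

Calling `LuceneEscapeLoop.escape("http://This+*is||a&&test(whatever!!!!!!)")` should produce `http\://This\+\*is\||a\&&test\(whatever\!\!\!\!\!\!\)`.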