Know your flags
withoutString
: Given two strings, base
and remove
, return a version of the base
string where all non-overlapping instances of the remove
string have been removed (not case sensitive).public String withoutString(String base, String remove) { return base.replaceAll( "(?i)" + java.util.regex.Pattern.quote(remove), ""); }The
'i'
flag enables case-insensitive matching; Pattern.quote
is used (unnecessarily for codingbat tests) to match remove
literally, giving no special meaning to any regex special metacharacters. Know your character classes
countYZ
: Given a string, count the number of words ending in 'y'
or 'z'
(not case sensitive). We'll say that a 'y'
or 'z'
is at the end of a word if it's not followed by a letter (Note: Character.isLetter
tests if a char is a letter.)public int countYZ(String str) { return str.split("[yzYZ](?!\\p{javaLetter})", -1).length - 1; }Any non-deprecated
Character.isXXX
method maps to \p{javaXXX}
character class.Know your bounds
notReplace
: Given a string, return a string where every appearance of the lowercase word "is"
has been replaced with "is not"
. The word should not be immediately preceded or followed by a letter. public String notReplace(String str) { return str.replaceAll("\\b(is)\\b", "is not"); }Do note that while the above solution passes all tests, it is technically incorrect, because it uses
\b
word boundary anchor. Regex words consist of [a-zA-Z_0-9]
. When it is applicable, though, you should use the anchor, because it results in a clean regex like above.Capture only when necessary
startOz
: Given a string, return a string made of the first 2 chars (if present), however include first char only if it is 'o'
and include the second only if it is 'z'
.public String startOz(String str) { return str.replaceAll("(?:(o)|.)(?:(z)|.).*", "$1$2"); }Note that
deFront
is essentially the same problem.Measuring lengths
everyNth
: Given a non-empty string and an int N, return the string made starting with char 0, and then every Nth char of the string. public String everyNth(String str, int N) { return str.replaceAll(String.format("(.).{0,%d}", N-1), "$1"); }
conCat
: Given two strings, return their concatenation. However, if the concatenation creates a double-char, then omit one of the chars.public String conCat(String a, String b) { return (a + b).replaceAll( String.format("(?<=^.{%d})(?<=(.))\\1", a.length()), ""); }Admittedly both solutions are awkward in regex, but it shows how one can use finite repetition to measure lengths (see also the solution to
repeatEnd
in Part 3).2-for-1
startWord
: Given a string and a second non-empty "word" string, we'll say that the word matches the string if it appears at the front of the string, except its first char does not need to match exactly. On a match, return the front of the string; otherwise return the empty string.
public String startWord(String str, String word) { return (word + "#" + str) .replaceAll(".(.*)#(.\\1)?.*", "$2"); }Regex tasks involving two strings can sometimes be done on their concatenation instead, using a special delimiter (which must not occur in either string) to mark where one ends and the other begins. Note that if no special delimiter is safely available, you can always measure length instead (see above, and compare with below).
An alternate solution for
conCat
using this technique would look something like this:
public String conCat(String a, String b) { return (a + "#" + b) .replaceAll("(.)?#(?=\\1|(.?)).?", "$1$2"); }
2-steps
stringE
: Return true if the given string contains between 1 and 3 'e'
chars.
public boolean stringE(String str) { return str.replaceAll("[^e]", "").matches("e{1,3}"); }Sometimes a task becomes a lot easier to solve if you naturally break it down into smaller steps, such is the case here. For reference, this is how you'd solve this one in one step:
public boolean stringE(String str) { return str.matches("([^e]*e){1,3}[^e]*"); }See also
gHappy
in Part 3.
Normalize input to eliminate exceptional cases
lastChars
: Given 2 strings, a
and b
, return a new string made of the first char of a
and the last char of b
. If either string is empty, use '@'
for its missing char.
public String lastChars(String a, String b) { return (a + "@@" + b).replaceAll("(?<!^).*(?!$)", ""); }Note that the way we concatenate the strings is such that the regex doesn't have to consider any special case. This is a technique that is often applicable to many other things as well.
It should be mentioned that while a lot of the regex solutions given here aren't the most efficient way of solving these problems, this one in particular makes it obvious: it's linear when a constant-time solution exists.
No comments:
Post a Comment