regex - How to match a minimal pattern ending at the end of input in JavaScript?


This is a follow-up to my previous question.

I need to find the minimal sequence of characters of length > n that starts at a word boundary and ends at the end of the input.

For example:

n = 5, input = "aaa bbb cccc dd", result = "cccc dd"

I tried \b.{5,}?$ but it matches the whole input rather than the minimal part.

What regex would you suggest?

The problem this time isn't greediness, it's eagerness. Regexes naturally try to find the earliest possible match, and getting them to find the last one can be tricky. The easiest way is the one @arcadien demonstrated: use .* to gobble up the whole string, then use backtracking to find a match on the rebound.
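To make the eagerness point concrete, here is a minimal sketch comparing the original attempt with the ".* then backtrack" idea. The second pattern is only an assumed form of that technique, not necessarily the exact regex from the earlier answer.

```javascript
var input = "aaa bbb cccc dd";

// Lazy quantifier, but the engine still begins at the earliest position
// where a match is possible (index 0), so the whole input comes back.
var eager = /\b.{5,}?$/.exec(input);
console.log(eager[0]);   // "aaa bbb cccc dd"

// Assumed form of the backtracking idea: the greedy .* runs to the end of
// the string, then gives characters back until the group can match.
var rebound = /^.*(\b.{5,})$/.exec(input);
console.log(rebound[1]); // "cccc dd"
```

The lazy quantifier only shortens a match from the right; it has no say in where the match starts, which is why the first attempt cannot work here.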

I have some questions about your requirements, though. \b can match at the beginning or the end of a word, so if (for example) n=5 and the string ends with "foo1 bar2", the result will be " bar2" (notice the leading space). Do you want the match to start at the end of a word, or should it drop the space, or back up to the beginning of "foo1"? Also, will the words consist entirely of word characters? If there are non-word characters, \b will be able to match in more surprising places.
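Both \b concerns can be reproduced with a short sketch. The pattern here is an assumed \b-based form of the backtracking approach, and the sample strings are made up for illustration.

```javascript
// Assumed \b-based pattern: backtrack from the end until a word boundary
// is followed by at least five characters.
var boundaryRegex = /^.*(\b.{5,})$/;

// \b also matches at the END of "foo1", so the match keeps the space.
var withSpace = boundaryRegex.exec("foo1 bar2")[1];
console.log(JSON.stringify(withSpace)); // " bar2"

// A hyphen is a non-word character, so \b matches inside "well-known".
var midToken = boundaryRegex.exec("a well-known")[1];
console.log(JSON.stringify(midToken));  // "known"
```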

For the regex below, I redefined "word" to mean any complete chunk of non-whitespace characters. The .* starts out by consuming the whole string, then the lookahead - (?=.{5,}) - forces it to backtrack at least five positions before it tries to match anything. The \s forces the match to start at the beginning of a word, and the rest of the regex captures one or more complete words.

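    /^.*(?=.{5,})\s(\S+(?:\s+\S+)*$)/

    var n = 5;
    // Build the pattern from a string so that n can vary; backslashes are doubled.
    var regex = new RegExp("^.*(?=.{" + n + ",})\\s(\\S+(?:\\s+\\S+)*$)");
    var match = regex.exec(subject);
    // Group 1 holds the trailing word(s); fall back to "" when nothing matches.
    var result = (match != null) ? match[1] : "";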

This regex won't match anything that's less than five characters long or that doesn't contain whitespace. If that's a problem, let me know and I'll tweak it.
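Those limits are easy to check by wrapping the pattern in a function. This is only a sketch: lastWords is a hypothetical helper name, and n is assumed to be a positive integer.

```javascript
// lastWords is a hypothetical name; n is assumed to be a positive integer.
function lastWords(subject, n) {
  // Greedy .* runs to the end, the lookahead demands n remaining characters,
  // \s anchors the start of a word, and group 1 captures whole words to the end.
  var regex = new RegExp("^.*(?=.{" + n + ",})\\s(\\S+(?:\\s+\\S+)*$)");
  var match = regex.exec(subject);
  return (match != null) ? match[1] : "";
}

console.log(lastWords("aaa bbb cccc dd", 5)); // "cccc dd"
console.log(lastWords("abcdefgh", 5));        // "" (no whitespace)
console.log(lastWords("a bc", 5));            // "" (shorter than five characters)
console.log(lastWords("foo1 bar2", 5));       // "bar2"
```

Note that the lookahead starts counting at the whitespace character, so the captured text can be one character shorter than n, as the last call shows. If that matters, the count in the lookahead would need adjusting.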

