c# - Generating sample data from regex to verify input strings by focussing on boundary cases defined in regex


There are several tools that generate sample data from a given regex.

However, while these may be sufficient to seed a dataset, that doesn't help much with testing code that depends on the regex itself, such as validation.

Assume you have a code generator that generates a model with a property. The user specifies a regex to validate that property. Now assume the code generator is attempting to generate tests which ensure that validation succeeds and fails appropriately. It seems reasonable for the tool to focus on the boundary cases within the regex, to avoid generating unnecessary data.

For example, consider the regex ^([a-z]{3,6})$. The boundary cases include:

  • any string consisting of [a-z] with a length of 2 (failure)
  • any string consisting of [a-z] with a length of 3 (success)
  • any string consisting of [a-z] with a length of 4 (success)
  • any string consisting of [a-z] with a length of 5 (success)
  • any string consisting of [a-z] with a length of 6 (success)
  • any string consisting of [a-z] with a length of 7 (failure)
  • any string not consisting of [a-z] (failure)
  • any string that does not start with [a-z] but ends with [a-z] (failure)
  • any string that starts with [a-z] but does not end with [a-z] (failure)
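The cases listed above can be written out by hand for this one regex, which gives a rough idea of what a generated test would need to contain. This is only a sketch: `BoundaryCases`, `Samples` and `CountMismatches` are hypothetical names made up for illustration, not part of any existing tool, and the sample strings are hard-coded rather than derived from the pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not an existing tool): boundary cases written by hand
// for the single pattern ^([a-z]{3,6})$.
public static class BoundaryCases
{
    public const string Pattern = "^([a-z]{3,6})$";

    // Pairs each sample input with the outcome validation should report.
    public static IEnumerable<KeyValuePair<string, bool>> Samples()
    {
        // Lengths 2 through 7 cover min - 1 up to max + 1, using only [a-z].
        for (int length = 2; length <= 7; length++)
        {
            bool inRange = length >= 3 && length <= 6;
            yield return new KeyValuePair<string, bool>(new string('a', length), inRange);
        }

        yield return new KeyValuePair<string, bool>("1234", false); // no character in [a-z]
        yield return new KeyValuePair<string, bool>("1bcd", false); // bad first character
        yield return new KeyValuePair<string, bool>("abc1", false); // bad last character
    }

    // Runs every sample through the regex and returns how many disagree
    // with the expected outcome.
    public static int CountMismatches()
    {
        int mismatches = 0;
        foreach (KeyValuePair<string, bool> sample in Samples())
        {
            bool matched = Regex.IsMatch(sample.Key, Pattern);
            Console.WriteLine("{0,-8} expected={1} actual={2}", sample.Key, sample.Value, matched);
            if (matched != sample.Value)
            {
                mismatches++;
            }
        }
        return mismatches;
    }
}
```

Calling `BoundaryCases.CountMismatches()` from a test or a `Main` method prints one line per sample. A real generator would have to derive the same cases from the parsed regex instead of hard-coding them, which is the hard part the question is asking about.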

The reason for focussing on boundary cases is that any string consisting of [a-z] with a length greater than 6 verifies the upper boundary of the string length defined in the regex. Testing strings of length 7, 8 and 9 is really just testing the same (boundary) condition.

This is an arbitrary regex chosen for simplicity, but any reasonable regex may act as input.

Does a framework or tool exist that the code generator can use to generate input strings for test cases of the different layers of the systems being generated? The test cases come into their own when the system is no longer generated and is modified later in the development cycle.

If I understand your question correctly, you want to generate input to the system based on the validation regex so that you can automate unit testing.

Doesn't that defeat the purpose of unit testing, though? If someone changes the regex, wouldn't you want the validation to fail?

In any case, the simple answer is that generating a string from a regex is impossible. If it can be done, it would be extremely complex. For example, consider this regex:

(?<=\G\d{0,3})(?>[a-z]+)(?<=(?<foo>foo)|)(?(foo)(?!))

It is simple for me to think of a string that will match (and/or generate matches):

abc123def456ghi789jkl123foo456pqr789stu123vwx456yz

The matches would be:

  • "abc"
  • "def"
  • "ghi"
  • "jkl"

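But how would you generate a string from the expression? There is no clear starting point: it takes some extreme (for a computer) intelligence plus a dash of creativity to work out a solution. That is simple for a human, but very, very hard for a computer. Even if you could come up with a computer algorithm that generates a matching string, the result could easily look like this: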

a

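This would generate a match, but it does a poor job of exercising the regex. The \d{0,3} is never really tried, and the \G is only ever used to match the beginning of the input (rather than the end of the last match). (?<=(?<foo>foo)) is never tested (and if it was, it would result in a non-match).

It would also be easy to generate a string that does not match: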

1

But, again, this doesn't really put the regex through its paces.
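To try the inputs discussed above, a small helper can list what the regex matches in a given string. This is a minimal sketch, assuming the pattern is meant for the .NET regex engine with `\G` (end of the previous match) as its anchor; `AnswerRegexDemo` and `MatchValues` are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for experimenting with the regex from the answer.
public static class AnswerRegexDemo
{
    // The pattern, written as a verbatim string so backslashes stay literal.
    // Assumption: \G is the intended anchor.
    public const string Pattern =
        @"(?<=\G\d{0,3})(?>[a-z]+)(?<=(?<foo>foo)|)(?(foo)(?!))";

    // Returns the text of every match found in the input, in order.
    public static string[] MatchValues(string input)
    {
        return Regex.Matches(input, Pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }
}
```

`AnswerRegexDemo.MatchValues("a")` yields a single match and `AnswerRegexDemo.MatchValues("1")` yields none, yet neither input touches the `\d{0,3}` or `foo` parts of the pattern. Passing the longer sample string from earlier is the way to see those parts take effect.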

I don't know computer theory well enough to prove it, but I believe this falls into the P vs NP class of problems. It is relatively easy to generate a regex to match a collection of complex strings, but difficult to generate a collection of complex strings to match a regex.

