The best solution?

Have you ever wondered what the best solution is? For example, what if somebody asks you: 'What is the correct regex pattern that matches any three digits from 0-9, but does not match if those three digits are all zeros? E.g. 123 is a match but 000 is not a match.'

So what is the best solution? Is it:

([1-9]\d\d|\d[1-9]\d|\d\d[1-9])

Because this pattern is not anchored, a longer string like 1234 would still be accepted.

Or this one:
^([1-9]{3}|[1-9][0-9][0-9]|[0-9][1-9][0-9]|00[1-9])

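It is anchored only at the start, so it still allows ambiguity: anything may follow the first three digits, and 1234 is again accepted.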
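To see the problem with the first two attempts, here is a small Python sketch. It assumes the patterns are used to search inside a string (re.search); the variable names are mine and only for illustration.

```python
import re

# The two patterns quoted above, copied unchanged.
unanchored = re.compile(r"([1-9]\d\d|\d[1-9]\d|\d\d[1-9])")
start_only = re.compile(r"^([1-9]{3}|[1-9][0-9][0-9]|[0-9][1-9][0-9]|00[1-9])")

# Print whether each pattern finds a match in each candidate string.
for candidate in ["123", "000", "1234"]:
    print(
        candidate,
        unanchored.search(candidate) is not None,
        start_only.search(candidate) is not None,
    )
```

Both patterns reject 000 as intended, but both also report a match for 1234, because nothing forces the match to stop after the third digit.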

Then this regex:
^([0][1-9]{2})?([0]{2}[1-9])?([1-9]{2}[0])?([1-9][0]{2})?([0-9][1-9][0-9])?([1-9]{3})?([1-9][0][1-9])?$

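This does work for three-digit input: every combination is accepted except 000, assuming that what is tested is a complete line, not a word. But look closely: every group is optional, so it also accepts an empty line and concatenations such as 011001.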
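A pattern like this is hard to verify by eye, so a brute-force check helps. The following is a rough Python sketch; the helper name accepts is hypothetical, and I assume each input is one complete line without a trailing newline.

```python
import re

# The long pattern from above, split over two lines only for readability.
long_pattern = re.compile(
    r"^([0][1-9]{2})?([0]{2}[1-9])?([1-9]{2}[0])?([1-9][0]{2})?"
    r"([0-9][1-9][0-9])?([1-9]{3})?([1-9][0][1-9])?$"
)

def accepts(text: str) -> bool:
    """Return True if the long pattern matches the given text."""
    return long_pattern.search(text) is not None

# Try all 1000 three-digit strings and collect the ones that are rejected.
three_digit = [f"{n:03d}" for n in range(1000)]
rejected = [s for s in three_digit if not accepts(s)]
print(rejected)  # ['000']

# Every group is optional, so these slip through as well.
print(accepts(""))        # True
print(accepts("011001"))  # True
```

So for three-digit input only 000 is rejected, but the empty string and some six-digit strings are accepted too. That is exactly the kind of surprise that is hard to spot when reading the pattern.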

Now, is it the best solution? I don't think so. Would the person who takes over the code after you still be able to read that regex? How long did it take to come up with it? And what will you do if the requirements change?

What I want to illustrate is that sometimes it's better to think again about the question. Why not write two lines of code? First check that the input consists of exactly 3 digits, with a regex along the lines of ^\d{3}$, and then immediately check whether it is '000'. It is so much easier to read, less error-prone, quicker to write etc.
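The two-step idea could look something like this in Python. It is only a minimal sketch: the function name is_valid_code is made up, and I assume the input must be exactly three ASCII digits with nothing around them.

```python
import re

# Step 1 pattern: exactly three ASCII digits.
THREE_DIGITS = re.compile(r"[0-9]{3}")

def is_valid_code(text: str) -> bool:
    """Accept any three-digit string except '000'."""
    # Step 1: the whole string must be three digits, nothing more.
    if THREE_DIGITS.fullmatch(text) is None:
        return False
    # Step 2: reject the one forbidden value.
    return text != "000"

print(is_valid_code("123"))   # True
print(is_valid_code("000"))   # False
print(is_valid_code("1234"))  # False
```

If the requirements change, say a second forbidden value is added, only the second step needs to be touched and the regex stays as simple as it is.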

So in short: keep in mind what is best for your client instead of what you can pull off technically.

I just wanted to share this based on a discussion I had with some colleagues, especially Douglas Johnston.
