
Recipe 13.6. Validating User Input with Common Patterns

Problem

You want to make sure that a user correctly entered information, such as an email address, social security number, telephone number, or Zip/Postal Code.

Solution

Use one of the common patterns included in this recipe.

Discussion

Regular expressions are extremely useful for validating a wide range of user input. For example, you might have a form that allows a user to enter an email address to sign up for your latest online game, and you need to ensure the email address is valid. Or you might want to make sure that a birth date was entered correctly, or verify that a credit card number was input properly. The following list of regular expressions will help:

  • Match a date in the format ##/##/####, where the day and month values can each be one or two digits, and the year can be either two digits or four digits starting with 19 or 20:

  • ^\d{1,2}\/\d{1,2}\/(\d{2}|(19|20)\d{2})$
    

  • Match a social security number in the format ###-##-####, where the dashes are optional and the three groups can have optional spacing between them:

  • ^\d{3}\s*-?\s*\d{2}\s*-?\s*\d{4}$
    

  • Match a five-digit U.S. Zip Code with an optional dash and four-digit extension:

  • ^\d{5}(-\d{4})?$
    

  • Match a Canadian Postal Code in the format L#L #L# (where L is a letter). There is a restriction placed on the first letter in the Postal Code to ensure that a valid province, territory, or region is specified:

  • ^[ABCEGHJKLMNPRSTVXY]\d[A-Z] \d[A-Z]\d$
    

  • Match a U.S. telephone number in the format (###) ###-####, where the area code is optional, the parentheses around the area code are optional and could be replaced with a dash, and there is optional spacing between the number groups:

  • ^(\(\s*\d{3}\s*\)|(\d{3}\s*-?))?\s*\d{3}\s*-?\s*\d{4}$
    

  • Match a U.S. telephone number, like the previous expression, except allow for an optional one- to five-digit extension specified with an "x", "ext", or "ext." and optional spacing:

  • ^(\(\s*\d{3}\s*\)|(\d{3}\s*-?))?\s*\d{3}\s*-?\s*\d{4}\s*((x|ext|ext\.)\s*\d{1,5})?$
    

  • Match a credit card number with four groups of four digits separated by optional dashes and optional spacing between the groups:

  • ^(\d{4}\s*\-?\s*){3}\d{4}$
    

  • Match U.S. currency starting with a $ and followed by one or more digits, with an optional decimal point and one or two decimal digits:

  • ^\$\d+(\.\d{1,2})?$
    

  • Match an email address where the domain is not an IP address and may contain any number of optional subdomains. When creating this regex, it's a good idea to set the ignoreCase flag (the i flag) to true:

  • ^[a-z0-9][-._a-z0-9]*@([a-z0-9][-_a-z0-9]*\.)+[a-z]{2,6}$
    

  • Match an IP address when you are only concerned that the address is formatted correctly with four groups of one to three digits separated by periods:

  • ^(\d{1,3}\.){3}\d{1,3}$
    

The preceding regex matches 999.999.999.999, which technically is not a valid IP address, but it has the correct IP address formatting.


  • Match an IP address when you are concerned that the address is formatted correctly and that each number group only ranges from 0-255:

  • ^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$
    

  • Match an email address when the domain is either a domain name consisting of any number of optional subdomains or an IP address:

  • ^[a-z0-9][-._a-z0-9]*@(([a-z0-9][-_a-z0-9]*\.)+[a-z]{2,6}|((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?))$
    

To use one of the preceding regular expressions in the context of validating user input, use the RegExp.test( ) method described in Recipe 13.2:

// Create a regular expression to test for a valid Zip Code
var zipCode:RegExp = /^\d{5}(-\d{4})?$/;

// Check to see if the user input is a valid Zip Code
if ( !zipCode.test( "12384-1231" ) ) {
  // Zip Code is not valid, alert the user of an error here
} else {
  // Zip Code is valid, probably don't need to do anything here
}
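
The same approach works for patterns that need the ignoreCase flag. The following is a minimal sketch, assuming the first email pattern from the preceding list; the variable name and the sample addresses are illustrative only. The i after the closing slash sets ignoreCase to true, so uppercase letters in the input still match:

```actionscript
// Email pattern from the preceding list, created with the i flag
// so that ignoreCase is true
var email:RegExp = /^[a-z0-9][-._a-z0-9]*@([a-z0-9][-_a-z0-9]*\.)+[a-z]{2,6}$/i;

trace( email.test( "Player.One@Example.com" ) ); // true
trace( email.test( "player.one@example" ) );     // false, no top-level domain
```

Without the i flag, the first address would be rejected because the pattern lists only lowercase letters.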

When creating regular expressions, there is usually a tradeoff between accuracy and complexity. Consider the two IP address regexes provided in the preceding list. The first IP address regex is fairly easy to understand; it matches one to three digits followed by a period three times, and then matches another group of one to three digits. Although this regex is simple, it's not entirely accurate. IP addresses limit each number group to the range 0 to 255, whereas this regex accepts any number between 0 and 999. Therefore, the first regex is only good for determining whether the IP address is formatted correctly; it is not reliable for determining whether the IP address is actually valid.

The second IP address regex is much more complex, but also more accurate. It limits the number groups to values from 0 to 255, and also ensures that four number groups exist and are separated by periods. It's up to you to determine which regex is best for your situation.
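
To make the difference concrete, here is a short sketch that runs both IP address patterns from the preceding list against the same input. The variable names ipFormat and ipRange are illustrative only and are not part of the recipe:

```actionscript
// Both IP address patterns from the preceding list
var ipFormat:RegExp = /^(\d{1,3}\.){3}\d{1,3}$/;
var ipRange:RegExp = /^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$/;

trace( ipFormat.test( "192.168.0.1" ) );     // true
trace( ipRange.test( "192.168.0.1" ) );      // true

trace( ipFormat.test( "999.999.999.999" ) ); // true, only the format is checked
trace( ipRange.test( "999.999.999.999" ) );  // false, 999 is greater than 255
```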

Complex regexes are hard to understand and create, but tend to offer the most accuracy. Simple regexes are much easier to understand and create, but may either match more than what you really want them to (false positives), or may not match something that you want them to (false negatives).


Another example of the complexity and accuracy tradeoff is the regex for matching a date. The regex provided to match a date in the format ##/##/#### allows for false positives. For instance, 01/99/2006 is accepted by the pattern, but January does not have 99 days in it.

Unfortunately, trying to create a regex to match a date in a specific format that accepts only valid month, day, and year combinations is virtually impossible because of the conditional dependencies. In these situations, use a combination of regex grouping and ActionScript code. You can inspect the groups matched in the regex and perform logic to ensure that the group values make sense:

// Verify a date based on a string input from the user
var inputDate:Date = extractDate( theInputString );

// Test to see if the date was valid or not
if ( inputDate == null ) {
  // Could not parse the date correctly, invalid
} else {
  // Valid date
}

The following is the definition for extractDate( ):

// Attempts to extract a date value from a string in the 
// format of ##/##/#### or ##/##/##. Returns a Date object
// described by the string value if successful, or null
// otherwise.
public function extractDate( possibleDate:String ):Date {
  var datePattern:RegExp = /^(\d{1,2})\/(\d{1,2})\/(\d{2}|(19|20)\d{2})$/;

  // Use the regex to filter out badly formatted dates right away
  var result:Array = datePattern.exec( possibleDate );
  
  // A null result means the format was invalid
  if ( result == null ) {
    return null;
  }
  
  // At this point, the date is formatted correctly and the result
  // array contains the matched substring as well as all of the matched
  // groups. If the possibleDate is "02/08/2006", then the result array
  // contains: 02/08/2006,02,08,2006,20
  
  // Convert the string values to ints
  var month:int = parseInt( result[1] );
  var day:int = parseInt( result[2] );
  var year:int = parseInt( result[3] );
  
  // Perform additional logic to make sure month, day, and year all make sense
  if ( month > 12 || day > 31 || month == 0 || day == 0 ) {
    // Month or day value is too high or too low - not valid
    return null;
  } else if ( day == 31 && ( month == 9 || month == 4 || month == 6 
                           || month == 11 || month == 2 ) ) {
    // 31 days for September, April, June, November, or February - not valid
    return null;
  } else if ( day == 30 && month == 2 ) {
    // 30 days for February - not valid
    return null;
  } else if ( day == 29 && month == 2 
              && !( year % 4 == 0 && ( year % 100 != 0 || year % 400 == 0 ) ) ) {
    // 29 days in February, but not a leap year - not valid
    return null;
  } else {
    // Handling two-digit years is tricky. The year 99 should be 1999, but 06
    // should be 2006. Using 06 in the Date constructor yields 1906, so pick
    // an arbitrary cutoff, say, 15, and convert every year at or below it to
    // 20xx. Every year above the cutoff becomes 19xx by default.
    if ( year <= 15 ) {
      year += 2000;    
    }
    // Logically, month, day, and year all make sense, so return the 
    // proper Date object. Subtract 1 from the month because months
    // are zero indexed in ActionScript.
    return new Date( year, month - 1, day );
  }
}
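
An alternative to the chain of comparisons in extractDate( ) is to let the Date class do the calendar arithmetic. The following is a sketch only and is not part of the recipe's code: isRealDate( ) is a hypothetical helper that assumes a four-digit year. It builds a Date from the values and then checks that none of them rolled over into a different month or year:

```actionscript
// Hypothetical helper: returns true if month, day, and year form a real
// calendar date. Assumes a four-digit year; month is 1-based (1 = January).
function isRealDate( month:int, day:int, year:int ):Boolean {
  var candidate:Date = new Date( year, month - 1, day );
  // An out-of-range day or month makes the Date roll over, so at least
  // one of these comparisons fails
  return candidate.fullYear == year
      && candidate.month == month - 1
      && candidate.date == day;
}

trace( isRealDate( 2, 29, 2004 ) ); // true, 2004 is a leap year
trace( isRealDate( 2, 29, 2005 ) ); // false, rolls over to March 1
trace( isRealDate( 1, 99, 2006 ) ); // false, rolls over into April
```

This keeps the month-length and leap-year rules out of your own code, at the cost of being less explicit about why a particular date was rejected.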

See Also

Recipes 13.1 and 13.2

