What We Will Cover
Elucidations
Homework Questions?
Homework Discussion Questions
- The examples in the textbook show validation of one error at a time. If the user makes more than one error, they do not find out about the additional errors until they try to submit the form again. Does this seem like the best approach? If so, why? If not, then what do you suggest to improve the validation process for the user?
- It is very common and useful to do client-side validation of form data. But some users may block JavaScript in their browsers or they may use older browsers that support older versions of JavaScript. How will this affect the forms that have client-side validation? What can you do about it?
- Server-side and client-side validations both have their advantages and disadvantages. What are some advantages and disadvantages of each type of validation?
10.1: Introducing Regular Expressions
Learner Outcomes
At the end of the lesson the student will be able to:
- Create a regular expression
- Use the properties and methods of the regular expression object
- Use incremental refinement to develop a regular expression
10.1.1: About Regular Expressions
- Many programming problems require matching a pattern in text strings
- Verifying HTML form data is one such problem
- For example, if you look at the new zip codes you see that they follow a pattern:
95003-3119
- The pattern is five digits followed by a dash and four more digits
- To validate form data such as this, you can use a regular expression
- A regular expression is a string that describes a character pattern
- Once you define the character pattern, you can test to see if a string matches the pattern
- Also, you can use them to extract substrings, insert new text, or replace old text
- In this lesson, we will learn how to create regular expressions and how to apply them
- We will focus on testing strings against a regular expression for form validation
10.1.2: Creating a Regular Expression
- You create a regular expression in JavaScript using the command:
re = /pattern/;
- Where pattern is a text string describing the regular expression
- The hard part of a regular expression is defining the pattern
- To help us learn how to define patterns, the textbook author created a web page to help us: demo_regexp.htm
- We will use this page to try different patterns
Matching a Substring
Setting Regular Expression Flags
- By default, regular expressions are case sensitive
- To make a regular expression not sensitive to case, use the i flag:
/pattern/i
- Also by default, a regular expression stops after the first match
- To allow a global search for all matches in a test string, use the g flag:
/pattern/g
- You can apply both flags at the same time:
/pattern/ig
- For example, we can try the following pattern on our demo page:
/are/ig
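As a quick check of the two flags, the sketch below applies /are/ with and without them using the standard String match() method. The sample sentence is made up for illustration and is not from the textbook.

```javascript
// Hypothetical sample text, chosen so that "are" appears in three spellings
var sample = "Where are the Area maps? They ARE here.";

// No flags: case sensitive, and only the first match is reported
var firstOnly = sample.match(/are/);

// i and g together: ignore case and collect every match
var allMatches = sample.match(/are/ig);

console.log(firstOnly[0]);  // are
console.log(allMatches);    // [ 'are', 'Are', 'ARE' ]
```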
10.1.3: Defining Character Patterns
- So far our regular expressions only match exact strings
- We can do this using just string object methods
- In this section we look at using special characters to allow us to match text in other ways
- These special characters are often called metacharacters
Character Boundaries
Character | Description | Example
^ | Matches the beginning of a text string | /^GPS/ matches "GPS-ware" but not "Products from GPS"
$ | Matches the end of a text string | /ware$/ matches "GPS-ware" but not "GPS-ware Products"
\b | Matches a word boundary | /\bart/ matches "art" and "artist" but not "kart" or "start"
\B | Matches the absence of a word boundary | /art\B/ matches "artist" but not "art"
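The boundary examples in the table can be checked directly with the test() method. This is only a sketch; the helper name check is an assumption introduced here for brevity.

```javascript
// Hypothetical helper: true when the pattern matches somewhere in the text
function check(re, text) {
  return re.test(text);
}

console.log(check(/^GPS/, "GPS-ware"));            // true
console.log(check(/^GPS/, "Products from GPS"));   // false
console.log(check(/ware$/, "GPS-ware"));           // true
console.log(check(/ware$/, "GPS-ware Products"));  // false
console.log(check(/\bart/, "artist"));             // true
console.log(check(/\bart/, "start"));              // false
console.log(check(/art\B/, "artist"));             // true
console.log(check(/art\B/, "art"));                // false
```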
Character Types
- You can use regular expressions to indicate the type of a character
- There are three general types of characters: digits, word characters, and whitespace
Character | Description | Example
\d | A digit (from 0 to 9) | /\dth/ matches "5th" but not "path"
\D | A non-digit | /\Dth/ matches "path" but not "5th"
\w | A word character (letter, digit or underscore) | /\w\w/ matches "to" or "A1" but not "$x" or "*"
\W | A non-word character | /\W/ matches "$" or "&" but not "A", "b" or "3"
\s | A whitespace character (space, tab, newline, carriage return or form feed) | /\s\w\s/ matches " A " but not "A"
\S | A non-whitespace character | /\S\S\S/ matches "123" or "abc" but not "1 3" or "a c"
. | Any character except newline | /./ matches any single character except newline
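A few of the character-type examples from the table, run through test() as a sanity check. The variable names are illustrative only.

```javascript
var digitThenTh = /\dth/;
var nonDigitThenTh = /\Dth/;
var twoWordChars = /\w\w/;
var spacedLetter = /\s\w\s/;
var threeNonSpaces = /\S\S\S/;

console.log(digitThenTh.test("5th"));      // true
console.log(digitThenTh.test("path"));     // false
console.log(nonDigitThenTh.test("path"));  // true
console.log(twoWordChars.test("A1"));      // true
console.log(twoWordChars.test("$x"));      // false: only one word character
console.log(spacedLetter.test(" A "));     // true
console.log(threeNonSpaces.test("1 3"));   // false: the space breaks the run
```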
Character Classes
- While character types are useful, sometimes you want to limit the allowed characters to a few select letters or digits
- For this you can use square brackets [] to specify a character class
- A character class defines a set of characters that can match a single character of the text string
- For example, to match all the vowels in a string:
/[aeiou]/ig
- You can negate (reverse) the meaning of a character class by putting a ^ as the first symbol of the class
- For example, to match every character that is not a vowel (the consonants, but also digits, spaces and punctuation):
/[^aeiou]/ig
- Also, you can define a range of characters by separating the starting and ending characters with a dash
- Since letters and digits are ordered by their character codes, you can specify all lowercase letters using:
/[a-z]/
- For uppercase letters you would use:
/[A-Z]/
- For both lowercase and uppercase letters you would use:
/[a-zA-Z]/
- For all letters and digits you can use:
/[0-9a-zA-Z]/
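The following sketch shows what the vowel classes actually match. Note that a negated class matches anything outside the set, not just letters; the consonant-only class at the end is one possible way to restrict it and is not taken from the textbook.

```javascript
var phrase = "Regular Expressions";

// Every vowel, ignoring case
var vowels = phrase.match(/[aeiou]/ig);

// A negated class also picks up digits, spaces and punctuation
var nonVowels = "a1 b!".match(/[^aeiou]/ig);

// One way to match letters only: list the ranges between the vowels
var consonants = "a1 b!".match(/[b-df-hj-np-tv-z]/ig);

console.log(vowels);      // [ 'e', 'u', 'a', 'E', 'e', 'i', 'o' ]
console.log(nonVowels);   // [ '1', ' ', 'b', '!' ]
console.log(consonants);  // [ 'b' ]
```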
Repeating Characters
- So far our regular expressions match a single character
- However, you can use metacharacters to specify repetition
Character(s) | Description | Example
* | Repeat 0 or more times | /\s*/ matches 0 or more consecutive whitespace characters
+ | Repeat 1 or more times | /\s+/ matches 1 or more consecutive whitespace characters
? | Repeat 0 or 1 times | /colou?r/ matches "color" or "colour"
{n} | Repeat exactly n times | /\d{5}/ matches five consecutive digits
{n,} | Repeat n or more times | /\d{5,}/ matches five or more consecutive digits
{0,m} | Repeat no more than m times (JavaScript has no {,m} shorthand, so the lower bound must be written) | /\d{0,5}/ matches up to five consecutive digits
{n,m} | Repeat at least n but no more than m times | /\d{5,9}/ matches 5 to 9 consecutive digits
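The quantifiers are easiest to compare when the pattern is anchored at both ends, so the sketch below adds ^ and $ to a few of the table's examples. The last line illustrates an assumption worth verifying in your own browser: JavaScript treats {,5} as literal text rather than as a quantifier.

```javascript
var exactlyFive = /^\d{5}$/;
var fiveToNine = /^\d{5,9}$/;
var upToFive = /^\d{0,5}$/;

console.log(exactlyFive.test("95003"));      // true
console.log(exactlyFive.test("9500"));       // false
console.log(fiveToNine.test("950033119"));   // true
console.log(fiveToNine.test("1234567890"));  // false: ten digits
console.log(upToFive.test(""));              // true: zero digits is allowed
console.log(upToFive.test("123456"));        // false
console.log(/colou?r/.test("colour"));       // true

// Without a lower bound the braces are matched literally
console.log(/\d{,5}/.test("12345"));         // false
console.log(/\d{,5}/.test("1{,5}"));         // true
```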
Escape Sequences
Escape Sequence | Example
\/ | /\d\/\d/ matches "2/3" but not "23"
\\ | /\d\\\d/ matches "2\3" but not "23"
\. | /\d\.\d\d/ matches "1.23" but not "123"
\* | /\d\*\d/ matches "1*2" but not "12"
\+ | /\d\+\d/ matches "1+2" but not "12"
\? | /\w{5}\?/ matches "hello?" but not "hello"
\n | /\n/ matches a new line in the text string
\t | /\t/ matches a tab in the text string
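To see why the backslash matters, the sketch below compares an escaped dot with an unescaped one and checks a few other escape sequences. In the string literals, "\\" is a single backslash and "\t" is a tab character.

```javascript
var slash = /\d\/\d/;
var backslash = /\d\\\d/;
var escapedDot = /\d\.\d\d/;
var plainDot = /\d.\d\d/;   // unescaped: the dot matches any character
var tab = /\t/;

console.log(slash.test("2/3"));         // true
console.log(slash.test("23"));          // false
console.log(backslash.test("2\\3"));    // true
console.log(escapedDot.test("1.23"));   // true
console.log(escapedDot.test("1234"));   // false
console.log(plainDot.test("1234"));     // true
console.log(tab.test("name\tvalue"));   // true
```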
Alternate Patterns and Grouping
- Sometimes you want to define two or more patterns for the same text string
- For this we can use the alternation symbol ('|')
- The alternation symbol matches either the pattern on the left or the right
- For example, if we want to match either Dave or David we could use:
/Dave|David/
- Notice that the alternation applies to all the characters on the left or right side, and not just a single character
- If we want the alternation to apply to a subpattern, then we can group characters with parentheses
- For example, we could modify our Dave or David example to use:
/Dav(e|id)/
- Another benefit of grouping is that the text matched inside the parentheses is remembered and can be reused elsewhere in the pattern
- This is known as creating a back-reference
- You can reference these grouped subpatterns using the syntax
\groupNumber
- Where groupNumber is the number of the grouping counted from left to right
- For example, suppose you want to find repeated words (a common error) in a text string like:
products for for sale
- To create a pattern to find a single instance of a word, we use something like:
/(\b\w+\b)/
- Since we are looking for consecutive words separated by a space, we add a space and back-reference to the pattern:
/(\b\w+\b)\s+\1/
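Here is the back-reference pattern applied with exec(), which returns the whole match at index 0 and the first group at index 1. The second pattern, stricter, is a suggested refinement and not from the textbook: it adds a word boundary after \1 so that "the theory" is not reported as a repeated "the".

```javascript
var repeated = /(\b\w+\b)\s+\1/;
var result = repeated.exec("products for for sale");

console.log(result[0]);  // for for
console.log(result[1]);  // for

// Assumed refinement: require the repeated word to end at a word boundary
var stricter = /\b(\w+)\s+\1\b/;

console.log(repeated.test("the theory"));             // true (false alarm)
console.log(stricter.test("the theory"));             // false
console.log(stricter.test("products for for sale"));  // true
```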
10.1.4: Applying Regular Expressions
- Once you develop a regular expression you apply it using methods of the String and RegExp objects
- There are several methods commonly used with regular expressions as shown in the table below
- Usually, you package your regular expressions and the methods for applying them into a function
Commonly Used Regular Expression Methods
Method | Description
re.exec(text) | Executes a search for a match on text using the regular expression in re. Returns a result array if found, or null otherwise.
re.test(text) | Executes a search for a match on text using the regular expression in re. Returns true if found and false otherwise.
text.match(re) | Finds matches in text using the regular expression re. Without the g flag it returns an array holding the first match and any parenthesized groups; with the g flag it returns an array of all the matches. Returns null if there is no match.
text.replace(re,newText) | Finds a match in text using the regular expression re. Returns a new string where the matched substring (every match, if the g flag is set) is replaced with the substring newText.
text.search(re) | Finds a match in text using the regular expression re. If a match is found, it returns the index of the match in the string; otherwise it returns -1.
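The five methods can be compared side by side on one string. The address below is a made-up sample built around the ZIP code used earlier in these notes.

```javascript
var address = "Aptos, CA 95003-3119";   // hypothetical sample text
var fiveDigits = /\d{5}/;

var found = fiveDigits.test(address);          // true or false
var details = fiveDigits.exec(address);        // result array or null
var position = address.search(fiveDigits);     // index or -1
var allNumbers = address.match(/\d+/g);        // every run of digits
var masked = address.replace(/\d/g, "#");      // new string, original unchanged

console.log(found);       // true
console.log(details[0]);  // 95003
console.log(position);    // 10
console.log(allNumbers);  // [ '95003', '3119' ]
console.log(masked);      // Aptos, CA #####-####
```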
Regular Expression Example
function checkZip2(zip) {
  // Accept five digits, optionally a dash and four more digits, or an empty string
  var regex = /^\d{5}(-\d{4})?$|^$/;
  return regex.test(zip);
}
10.1.5: Building Regular Expressions That Work
- Regular expressions are very powerful -- but can be almost unreadable
- To build complex regular expressions, you start with a simple expression
- Once the simple expression works, you refine your regular expression incrementally
- Build it one piece at a time and test each addition as you go
Incremental Refinement Example
- This example incrementally builds a regular expression for form verification
- We want to verify that a form field meets requirements for email addresses
- The steps that follow detail a process for building this verification incrementally
- Determine the precise rules for what you need to verify
john.doe@hotmail.com
You determine what is valid and invalid input by examining email addresses and reading specifications. Some of the rules you come up with are:
- User names can have almost any printable ASCII character
- An @ symbol separates the user name from the domain name
- Domain names can have letters, digits, and hyphens
- Each part of a domain name is separated by a dot
- Set up your test environment
You can use the demo_regexp.htm page to test your regular expressions against various text.
- Code a simple expression to get you started
You look at the rules and code a simple regular expression like this:
regex = /.+@.+\.\w+/i;
Then you test the regular expression to verify it works.
- Set anchors if you can
Add the ^ and $ anchors where possible. This prevents invalid characters before and after the acceptable pattern.
regex = /^.+@.+\.\w+$/i;
- Get more specific, if you can, testing each addition carefully
For instance, you refine the domain name to include only letters, digits, dots, and hyphens. This would give you a regular expression like:
regex = /^.+@[a-z0-9.-]+\.\w+$/i;
You may decide to restrict the top-level domain (TLD) to only those that actually exist. This turns out to be quite complicated. Almost every two-letter combination is used by some country. In addition to the well-known generic TLDs of com, edu, net, org, mil and gov, there are many new TLDs: biz, info, name, coop, aero and museum. More are being suggested and adopted every year. Thus the best approach is probably to restrict the top-level domain to between 2 and 6 characters. This would give you a regular expression like:
regex = /^.+@[-.a-z0-9]+\.\w{2,6}$/i;
- Once you have a working and tested regular expression, you create a function for your form-validation library:
function checkEmail(email) {
  // Declared with var so the pattern does not leak into the global scope
  var regex = /^.+@[a-z0-9.-]+\.\w+$/i;
  return regex.test(email);
}
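To see what each refinement buys you, the sketch below runs the four stages of the email pattern against a few test addresses. The addresses other than john.doe@hotmail.com are invented for this illustration, and the helper firstRejectingStage is a hypothetical name.

```javascript
// The four stages from the steps above, in order
var stages = [
  /.+@.+\.\w+/i,                  // simple starting point
  /^.+@.+\.\w+$/i,                // anchored
  /^.+@[a-z0-9.-]+\.\w+$/i,       // restricted domain characters
  /^.+@[-.a-z0-9]+\.\w{2,6}$/i    // restricted TLD length
];

// Hypothetical helper: index of the first stage that rejects the address,
// or -1 if every stage accepts it
function firstRejectingStage(email) {
  for (var i = 0; i < stages.length; i++) {
    if (!stages[i].test(email)) {
      return i;
    }
  }
  return -1;
}

console.log(firstRejectingStage("john.doe@hotmail.com"));    // -1
console.log(firstRejectingStage("john.doe@hotmail.com!!"));  // 1
console.log(firstRejectingStage("john@hot!mail.com"));       // 2
console.log(firstRejectingStage("john@hotmail.c"));          // 3
```

Each invalid address slips past the earlier stages and is caught only once the corresponding refinement is added, which is the point of testing after every change.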
10.1.6: Summary
- In this section we looked at the language of regular expressions
- We looked at how to match substrings using:
/characters/
- In addition, we looked at using the i and g flags to ignore case and perform global searches
- Then we looked at how to define character patterns using techniques such as:
- Character boundaries:
^, $, \b, \B
- Character types:
\d, \D, \w, \W, \s, \S
- Character classes:
/[a-zA-Z0-9_.-]/
- Repeating characters:
*, +, ?, {n}, {n,}, {n,m}
- Escape sequences (using a backslash: '\')
- Alternate patterns:
/Dav(e|id)/
- Grouping:
/(\b\w+\b)\s+\1/
- We learned about the methods associated with regular expressions so we can apply a regular expression to a number of tasks
- In addition, we looked at how to build regular expressions that work
Check Yourself
- What is a regular expression to match the first occurrence of "abc"?
- What is the regular expression to match every occurrence of the substring "abc" in a string, regardless of case?
- Write a regular expression to match a social security number, which has nine digits.
- Create a regular expression to match either "apple", "banana" or "orange".
Also, look at the Quick Check questions in the textbook on page JVS 401.
Activity 10.1
Take one minute to read over the following Quick Quiz questions. We will discuss the questions in one minute.
Quick Quiz
10.2: Working with Regular Expressions
Learner Outcomes
At the end of the lesson the student will be able to:
- Apply regular expressions to ZIP code fields
- Remove blank spaces from form fields
- Validate Credit Card Numbers
- Apply the Luhn Formula to validate credit card numbers
10.2.1: Validating a ZIP code
- Let us apply our knowledge of regular expressions to validate zip codes
- Recall that zip codes follow a pattern like this:
95003-3119
- The pattern is five digits followed by a dash and four more digits
- However, the dash and last 4 digits are optional
- Also, a ZIP code is not required for delivery
- Thus, we need to develop a regular expression that matches:
- A five digit ZIP code
- A nine-digit ZIP code
- An empty text string
- We can develop this expression using the demo_regexp.htm page
One possible answer
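One pattern that meets all three requirements is the one used in the checkZip2() function from section 10.1.4. The sketch below wraps it in a function with an assumed name, isAcceptableZip, and tries it on a few inputs.

```javascript
// Assumed helper name; the pattern itself is the one from checkZip2()
function isAcceptableZip(zip) {
  // Five digits, optionally a dash and four more digits, or nothing at all
  var reZip = /^\d{5}(-\d{4})?$|^$/;
  return reZip.test(zip);
}

console.log(isAcceptableZip("95003"));       // true
console.log(isAcceptableZip("95003-3119"));  // true
console.log(isAcceptableZip(""));            // true
console.log(isAcceptableZip("9500"));        // false
console.log(isAcceptableZip("95003-31"));    // false
console.log(isAcceptableZip("950033119"));   // false: the dash is missing
```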
10.2.2: Applying the ZIP code Regular Expression
- Let us use our ZIP code regular expression in a Web form
- Turn your computer on and save the following files to a convenient place on your computer (like the Desktop):
- Complete the exercise on pages JVS 388-389 of the textbook
- If you have difficulty, ask a classmate or the instructor for help
- When you are finished, let the instructor know
10.2.3: Validating Financial Information
- The third form in the examples of Tutorial 7 is the payment form
- A credit card must always be validated on the server
- However, you can do some validation on the client side to weed out some problems before going to the server
- For instance, you can verify that:
- The customer has selected a credit card
- The customer has entered the name and number appearing on the card
- The form has the following functions to verify this information
- Function checkForm3() gets called when the user submits the form
Functions Verifying form3.html
function checkForm3() {
  // Report the first problem found and cancel the submission
  if (selectedCard() == -1) {
    alert("You must select a credit card");
    return false;
  } else if (document.form3.cname.value.length == 0) {
    alert("You must enter the name on your card");
    return false;
  } else if (document.form3.cnumber.value.length == 0) {
    alert("You must enter the number on your card");
    return false;
  } else {
    return true;
  }
}

function selectedCard() {
  // Return the index of the checked card option, or -1 if none is checked
  var card = -1;
  for (var i = 0; i < 5; i++) {
    if (document.form3.ccard[i].checked) {
      card = i;
    }
  }
  return card;
}
Exercise Instructions
- Save form3txt.htm as form3.htm to a convenient place on your computer (like the Desktop)
- Complete the exercise on page JVS 390 of your textbook
10.2.4: Removing Blank Spaces from Form Fields
Exercise Instructions
10.2.5: Validating Credit Card Number Patterns
Exercise Instructions
- Let us check the credit card numbers by completing the exercises on pages JVS 392-393 of the textbook
- You can use the following fake credit card numbers for testing, or make up your own:
- American Express: 341234567890123
- Diners Club: 30012345678901
- Discover: 6011123456789012
- MasterCard: 5112345678901234
- Visa: 4123456789012
- If you have difficulty, ask a classmate or the instructor for help
- When you are finished, let the instructor know
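The sketch below checks the fake numbers above against the card patterns listed in the section summary, after stripping white space. It is not the textbook's solution: the lookup object cardPatterns and the function matchesCardPattern are assumed names, and a page would read the number from the form rather than take it as a parameter.

```javascript
// Patterns taken from the section summary, keyed by card name (assumed layout)
var cardPatterns = {
  "American Express": /^3[47]\d{13}$/,
  "Diners Club": /^30[0-5]\d{11}$|^3[68]\d{12}$/,
  "Discover": /^6011\d{12}$/,
  "MasterCard": /^5[1-5]\d{14}$/,
  "Visa": /^4(\d{12}|\d{15})$/
};

// Hypothetical helper: strip white space, then test against the card's pattern
function matchesCardPattern(cardName, rawNumber) {
  var cnum = rawNumber.replace(/\s/g, "");
  var re = cardPatterns[cardName];
  return re ? re.test(cnum) : false;
}

console.log(matchesCardPattern("American Express", "34 12345 67890 123"));  // true
console.log(matchesCardPattern("Diners Club", "30012345678901"));           // true
console.log(matchesCardPattern("Discover", "6011123456789012"));            // true
console.log(matchesCardPattern("MasterCard", "5112345678901234"));          // true
console.log(matchesCardPattern("Visa", "4123456789012"));                   // true
console.log(matchesCardPattern("MasterCard", "4123456789012"));             // false
```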
10.2.6: About the Luhn Formula
- We can do one more check of the credit card on the client to weed out mistakes a customer may make
- All credit card numbers must satisfy the Luhn Formula, or "Mod 10" algorithm
- The Luhn Formula was developed by IBM scientist Hans Peter Luhn, who patented it in 1960, to validate account numbers
- It is designed to protect against accidental error and almost all financial institutions use account numbers that satisfy the Luhn Formula
- The Luhn Formula works by:
- Starting with the next-to-last digit and moving left, take every other digit as one group; the remaining digits form the second group
1234567897 -> (1 3 5 7 9) + (2 4 6 8 7)
- Double each digit in the group that contains the digit you started with
(1 3 5 7 9) -> (2 6 10 14 18)
- Add up the individual digits in each group, so that 10 counts as 1 + 0
2 + 6 + 1 + 0 + 1 + 4 + 1 + 8 = 23
2 + 4 + 6 + 8 + 7 = 27
- Add the sums of both groups together
23 + 27 = 50
- If the total is evenly divisible by 10, the number passes the check
50 / 10 = 5
- You can see an example of this in the following diagram from page JVS 395 of the textbook
Example of Using the Luhn Formula
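The steps above can be sketched in JavaScript as follows. This is one possible implementation and not necessarily the textbook's; the function name passesLuhn is an assumption. It walks the digits from right to left, doubling every second one, and uses the shortcut that subtracting 9 from a doubled value above 9 gives the same result as adding its two digits.

```javascript
// Hypothetical function name; returns true when the number satisfies the Luhn Formula
function passesLuhn(cardNumber) {
  var digits = cardNumber.replace(/\s/g, "");   // ignore embedded spaces
  if (!/^\d+$/.test(digits)) {
    return false;                               // reject empty or non-numeric input
  }
  var total = 0;
  var doubleIt = false;                         // the last digit is not doubled
  for (var i = digits.length - 1; i >= 0; i--) {
    var d = parseInt(digits.charAt(i), 10);
    if (doubleIt) {
      d = d * 2;
      if (d > 9) {
        d = d - 9;                              // e.g. 14 -> 1 + 4 = 5
      }
    }
    total += d;
    doubleIt = !doubleIt;                       // alternate between the two groups
  }
  return total % 10 === 0;
}

console.log(passesLuhn("1234567897"));          // true (the total is 50)
console.log(passesLuhn("34 12345 67890 127"));  // true
console.log(passesLuhn("34 12345 67890 123"));  // false
```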
10.2.7: Using the Luhn Formula
Exercise Instructions
- Complete the exercise on page JVS 396 of the textbook
- If you have difficulty, ask a classmate or the instructor for help
- For an example of American Express that fails the Luhn Formula you can use:
34 12345 67890 123
- For an example of American Express that passes the Luhn Formula you can use:
34 12345 67890 127
- When you are finished, let the instructor know
10.2.8: Summary
- In this section we looked at using regular expressions in a number of different ways
- First we looked at using regular expressions to validate ZIP codes:
function checkZip2(zip) {
  var regex = /^\d{5}(-\d{4})?$|^$/;
  return regex.test(zip);
}
- Then we looked at removing white space from a form field:
wsre = /\s/g;
cnum = document.form3.cnumber.value.replace(wsre, "");
- After this we looked at the number patterns of five credit cards and how to verify that a credit-card number meets these patterns
switch (selectedCard()) {
  case 0: re = /^3[47]\d{13}$/; break;
  case 1: re = /^30[0-5]\d{11}$|^3[68]\d{12}$/; break;
  case 2: re = /^6011\d{12}$/; break;
  case 3: re = /^5[1-5]\d{14}$/; break;
  case 4: re = /^4(\d{12}|\d{15})$/; break;
}
pattern = re.test(cnum);
- Finally, we looked at a method to catch accidental data entry errors for credit cards named the Luhn Formula
- Most financial institutions use account numbers that satisfy the Luhn Formula and we looked at how to use the formula
Check Yourself
- What regular expression would you use to validate a five or nine digit ZIP code?
- What JavaScript command would you use to test whether the string "95003" matches the pattern in the regular expression object reZip?
- What JavaScript commands would you use to remove all the white space from a form field named cardnum in the form named cards?
- What is the Luhn Formula?
Activity 10.2
Take one minute to read over the following Quick Quiz questions. We will discuss the questions in one minute.
Quick Quiz
10.3: Passing Data Between Pages
Learner Outcomes
At the end of the lesson the student will be able to:
- Pass data from one page to another by appending data to the URL
- Discuss the limitations of appending data to a URL
10.3.1: About Passing Data Between Pages
- Sometimes you need to pass data from one Web page to another
- For instance, you may want to break up a large form into several smaller forms so that entering data does not look too daunting
- Passing data from one page to another is surprisingly difficult
- This is because HTTP is a stateless protocol
- Once a server responds to a request, it keeps no record of that browser and, in the simplest case, drops the connection
- You can see this in the following diagram:

Passing Data Using the URL
- There are some ways to get around the statelessness of HTTP
- One way is by appending the data to the URL
- You can use either hyperlinks or forms to send the data
- Then you can use JavaScript and the DOM to extract the data on the receiving page
- Another way is to use cookies, which we will discuss later in the course
Data Passing Example
- You can see the process of appending data to the URL using the demo_form1.htm page provided by the textbook
- Whatever data you type into the form is sent to another page
- The second page uses JavaScript to extract the data
10.3.2: Appending Data to a URL
Retrieving Appended Data
Example of Extracting Data Using slice()
- The slice() method extracts part of a string and returns that part as a new string
- Syntax:
string.slice(start, [end])
- Where string is the string object, start is the starting index and end is the optional ending index
- For example:
<script type="text/javascript">
var data = "?GPS-ware";
document.write(data.slice(1));
</script>
- Produces the output:
GPS-ware
10.3.3: Limitations of Appending Data
- There are several limitations to the technique of appending data to a URL
- First, URLs are limited in their length
- For instance, IE 6 and earlier limit a URL to 2,083 characters
- Another problem is that characters other than letters and numbers cannot be passed in the URL without modification
- Because URLs cannot contain blank spaces, for example, a blank space is converted to the character code %20
- Thus, the link:
<a href="form2.htm?GPS-ware Products">Go to form2</a>
- Would get translated into:
http://server/path/form2.htm?GPS-ware%20Products
- You can replace these "escaped characters" by using the method unescape()
- For instance:
unescape("GPS-ware%20Products");
- Returns the text string:
GPS-ware Products
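The notes and textbook use unescape(). As a side note, current JavaScript also provides the standard functions encodeURIComponent() and decodeURIComponent(), which are generally preferred today because they handle characters outside plain ASCII correctly. The sketch below compares them; the "café" string is an invented example.

```javascript
// Round trip with the standard URI functions
var encoded = encodeURIComponent("GPS-ware Products");
var decoded = decodeURIComponent(encoded);

// unescape() gives the same answer for a simple escape such as %20
var legacy = unescape("GPS-ware%20Products");

// The two differ for multi-byte characters (hypothetical example)
var modernCafe = decodeURIComponent("caf%C3%A9");
var legacyCafe = unescape("caf%C3%A9");

console.log(encoded);                    // GPS-ware%20Products
console.log(decoded);                    // GPS-ware Products
console.log(legacy);                     // GPS-ware Products
console.log(modernCafe === "caf\u00E9"); // true: the accented e is restored
console.log(legacyCafe === "caf\u00E9"); // false: each byte became its own character
```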
10.3.4: Example of Appending and Retrieving Form Data
- You can use the technique of appending data to the URL with Web forms as well
- To append data to a form:
- Set a form's action attribute to the URL of the page to which you want to pass the data
- Set the method of the form to "get"
- You can see an example by viewing the source of: demo_form1.htm
Retrieving the Form Data
- To retrieve the form data, you use the location.search property and the slice() method to extract only the text string of the field names and values
- For instance:
searchString = location.search.slice(1);
- Then you use the unescape() function to decode any escaped characters in the text string
searchString = unescape(searchString);
- Next, you convert each occurrence of the + symbol to a blank space
formString = searchString.replace(/\+/g, " ");
- Following this, you need to extract each name and value pair
- A convenient way to store the data is in an array
- Thus, you can split the text string at every occurrence of an = or & character, and store the substrings into an array using:
data = formString.split(/[&=]/g);
- Finally, you process the retrieved data
- The following is a function showing this process that is part of: demo_form2.htm
Example Function for Retrieving Form Data
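function retrieveData() {
  // Drop the leading "?" from the appended data
  var searchString = location.search.slice(1);
  // Decode escaped characters such as %20
  searchString = unescape(searchString);
  // Forms encode blank spaces as +
  var formString = searchString.replace(/\+/g, " ");
  // Names land at even indexes and values at odd indexes
  var data = formString.split(/[&=]/g);
  document.dform2.name.value = data[1];
  document.dform2.age.value = data[3];
  document.dform2.city.value = data[5];
}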
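The function above depends on the order of the fields and decodes the whole string before splitting it, so a value that contains an encoded &, = or + can be misread. The sketch below is an alternative under stated assumptions, not the textbook's method: it takes the query string as a parameter (a page would pass location.search), splits first, decodes each piece afterwards, and returns the values by field name. The name parseQuery and the sample field names are hypothetical.

```javascript
// Hypothetical helper: turn "?a=1&b=2" into { a: "1", b: "2" }
function parseQuery(search) {
  var fields = {};
  var query = search.charAt(0) === "?" ? search.slice(1) : search;
  if (query === "") {
    return fields;
  }
  var pairs = query.split("&");
  for (var i = 0; i < pairs.length; i++) {
    var parts = pairs[i].split("=");
    // Convert + to a space before decoding, so an encoded %2B stays a plus sign
    var name = decodeURIComponent(parts[0].replace(/\+/g, " "));
    var rawValue = parts.slice(1).join("=");
    fields[name] = decodeURIComponent(rawValue.replace(/\+/g, " "));
  }
  return fields;
}

var form = parseQuery("?name=John+Doe&age=42&city=Santa%20Cruz");
console.log(form.name);  // John Doe
console.log(form.age);   // 42
console.log(form.city);  // Santa Cruz

console.log(parseQuery("?sum=1%2B1%3D2").sum);  // 1+1=2
```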
10.3.5: Summary
Check Yourself
- What character marks the start of appended data?
- What character separates each name=value pair of appended data?
- What object property can you use to extract data appended to a URL of the current Web page?
Activity 10.3
Take one minute to review the Check Yourself questions. We will discuss the questions as time permits.
Wrap Up
Due Next: A10: Regular Expressions (11/13/06)
Discussion 9 and Quiz 9 (11/13/06)
Last Updated: November 18 2006 @17:39:53