What We Will Cover
Continuations
Making Select Lists
- The algorithm for making a select list from a SQL query is straightforward:
- Write a SQL query that returns the rows with the data you need.
$sql = "SELECT ProductName FROM Products";
- Use a loop to extract the data from the result set and create an array.
$prodList = array();
while ($row = mysql_fetch_assoc($result)) {
    $prodList[$row["ProductName"]] = $row["ProductName"];
}
- Use the array to code the HTML data.
echo $f->makeSelect('products', $prodList);
- Here is an example querying the products table of the Artzy database:
Cannot find file: examples/selectlistdb.php
Try it and view the source before and after you press the Select button
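The last step can be tried without the course's form class. The following is a minimal sketch, assuming a hypothetical standalone function makeSelectList() (not the $f->makeSelect() method, whose code is not shown here) and a hard-coded array with made-up product names in place of the query results:

```php
<?php
// Hypothetical helper: builds a <select> element from a value => label array.
function makeSelectList($name, array $options)
{
    $html = '<select name="' . htmlspecialchars($name) . '">' . "\n";
    foreach ($options as $value => $label) {
        // Escape both parts so names containing <, > or & stay valid HTML.
        $html .= '  <option value="' . htmlspecialchars((string) $value) . '">'
               . htmlspecialchars((string) $label) . "</option>\n";
    }
    return $html . "</select>\n";
}

// Stand-in for the array built from the result set (product names are invented).
$prodList = array('Clay Pot' => 'Clay Pot', 'Ink & Brush' => 'Ink & Brush');
echo makeSelectList('products', $prodList);
```

Escaping with htmlspecialchars() is a design choice of this sketch: it keeps a product name such as "Ink & Brush" from breaking the generated HTML.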
14.1: Improving Verification With Patterns
Learner Outcomes
At the end of the lesson the student will be able to:
- Use PHP pattern-matching functions
- Code regular expressions to match string patterns
14.1.1: About Regular Expressions
- Many programming problems require matching a pattern in string variables
- Verifying the data received from HTML forms is one such problem
- For example, if you are expecting an email address, your script needs to verify the string meets requirements for email addresses
john.doe@hotmail.com
Regular Expression Standards
- There are two main standards for regular expressions: POSIX and Perl
- PHP originally supported both standards, but the POSIX functions (ereg() and related) were removed in PHP 7
- We will use the Perl-compatible functions and focus on using preg_match()
Commonly Used Pattern-Matching Functions (Perl Compatible)
| Function | Description |
| preg_match() | Searches a string for matches to a regular expression. |
| preg_replace() | Searches a string for matches to a regular expression and replaces them with the specified text. |
| preg_split() | Searches a string for boundaries matched by a regular expression and splits the string into an array of strings along the boundaries. |
14.1.2: Using the preg_match() Function
- You use preg_match() to search a string for matches to a regular expression
- If the regular expression pattern matches a part of the string, then it returns the number 1 (meaning true)
Basic Syntax
int preg_match(string pattern, string subject)
pattern: regular expression pattern
subject: the string to search for pattern matches
- returns the number 1 if the pattern matches, 0 if it does not, or false if an error occurs
For Example
Cannot find file: examples/regex1.php
- Put the regular expression pattern between forward slashes:
/ /
- If the pattern "se" is found, then $found is set to the number 1
- Otherwise, $found is set to the number 0
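A minimal sketch of such a test follows. The sample sentence and the $notFound variable are assumptions added for illustration; only $found and the "se" pattern come from the notes above.

```php
<?php
// Sample subject string (an assumption for illustration).
$subject = 'She sells sea shells';

// The pattern goes between forward slashes inside a PHP string.
$found = preg_match('/se/', $subject);
echo "Pattern 'se' found: $found\n";

// A pattern that does not occur anywhere in the subject.
$notFound = preg_match('/xyz/', $subject);
echo "Pattern 'xyz' found: $notFound\n";
```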
14.1.3: Using Regular Expressions with preg_match()
- PHP has a special set of pattern-matching characters (meta characters)
- These characters form a small language with each character having a special meaning
- These characters are part of an industry standard
Commonly Used Pattern-Matching Characters
| Symbol | Description |
| ^ | Matches when the characters that follow start the string. |
| $ | Matches when the preceding characters end the string. |
| * | Matches zero or more occurrences of the preceding character. |
| + | Matches one or more occurrences of the preceding character. |
| ? | Matches zero or one occurrence of the preceding character. |
| . | A wildcard symbol that matches any one character. |
| | (vertical bar) | Alternation symbol (OR) that matches either the pattern on the left or the right. |
For Example
- We can test our regular expressions with a simple form script
- To match when starting with "She":
^She
- To match when ending with "shells":
shells$
- To match an "se" followed by one or more l's:
sel+
- To match an "se" followed by zero or more a's:
sea*
- To match any character followed by an "e":
.e
- To match "He" or "She":
He|She
- Note that you can ignore case by putting an "i" after the closing slash
/^she/i
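The patterns above can be tried from a short script instead of a form. This is a minimal sketch, assuming the sample sentence "She sells sea shells" as the string to search; the $patterns and $results names are illustrative.

```php
<?php
$subject = 'She sells sea shells';   // sample text, assumed for illustration

$patterns = array(
    '/^She/',     // starts with "She"
    '/shells$/',  // ends with "shells"
    '/sel+/',     // "se" followed by one or more l's
    '/sea*/',     // "se" followed by zero or more a's
    '/.e/',       // any character followed by "e"
    '/He|She/',   // "He" or "She"
    '/^she/',     // lowercase "she" at the start: case-sensitive
    '/^she/i',    // the same pattern, ignoring case
);

$results = array();
foreach ($patterns as $pattern) {
    $results[$pattern] = preg_match($pattern, $subject);
    echo str_pad($pattern, 12), $results[$pattern] ? 'match' : 'no match', "\n";
}
```

Every pattern matches the sample sentence except the case-sensitive /^she/, which fails because the sentence starts with a capital "S".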
"Escaped" Character Literals
- You can match one of the special characters
- However, you must prefix it with the backslash character
/\.\*/
To match one backslash, your regular expression should include "\\"
The backslash is also used to specify non-printing characters like:
| Sequence | Meaning |
| \a | Alert |
| \f | Formfeed |
| \n | Newline |
| \r | Carriage return |
| \t | Horizontal tab |
Further Information: Backslash
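Escaping is easiest to see in a short script. The sketch below assumes single-quoted PHP strings for the patterns, because single quotes leave most backslashes untouched. The test strings and variable names are invented for illustration.

```php
<?php
// In single-quoted PHP strings only \\ and \' are treated specially,
// so \. and \* reach the regular expression engine unchanged.
$hasDotStar = preg_match('/\.\*/', 'a.*b');   // subject contains a literal ".*"
$plainText  = preg_match('/\.\*/', 'abc');    // subject has no literal ".*"

// To match one backslash the regular expression needs \\,
// which is written as \\\\ inside a PHP string.
$hasBackslash = preg_match('/\\\\/', 'C:\\temp');

// \t in the pattern matches a horizontal tab in the subject.
$hasTab = preg_match('/\t/', "name\tvalue");

echo "$hasDotStar $plainText $hasBackslash $hasTab\n";
```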
14.1.4: Grouping Characters
- Regular expressions use parentheses, curly brackets and square brackets to group characters
- Each type of grouping character has different meanings
- You can combine these grouping characters with other special characters to get flexible and specific matching patterns
Using Parentheses to Group Characters
- Use parentheses to group characters in a regular expression
- For example, to match "Dave" or "David" in a string:
/Dav(e|id)/
- To match "Dave" or "David" whether the name starts with a "D" or "d":
/(D|d)av(e|id)/
Further Information: Subpatterns
Using Curly Brackets to Specify Repetitions
- You use curly brackets to specify a range of repetitions for the preceding character
- You can specify a range of values such as between 3 and 5 z's
/^z{3,5}$/
- You can specify a minimum value such as 3 or more z's
/^z{3,}$/
- You can specify a maximum value such as 3 or fewer z's
/^z{0,3}$/
- You can specify an exact value such as exactly 3 z's
/^z{3}$/
Further Information: Repetition
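The grouping and repetition patterns above can be checked with a few sample strings. This is a minimal sketch; the test strings ("Davy", "zz" and so on) and variable names are assumptions chosen to show one failing case per pattern.

```php
<?php
// Parentheses group alternatives: "Dave" or "David", starting with "D" or "d".
$namePattern = '/(D|d)av(e|id)/';
foreach (array('Dave', 'david', 'Davy') as $name) {
    echo $name, ': ', preg_match($namePattern, $name), "\n";
}

// Curly brackets set how many times the preceding character may repeat.
$rangePattern = '/^z{3,5}$/';   // between 3 and 5 z's and nothing else
foreach (array('zz', 'zzz', 'zzzzz', 'zzzzzz') as $text) {
    echo $text, ': ', preg_match($rangePattern, $text), "\n";
}
```

The anchors in /^z{3,5}$/ matter: without them, a string of six z's would still match because it contains a run of five.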
Using Square Brackets to Specify Character Classes
- You use square brackets to specify a character class
- A class matches exactly one of the characters found between the square brackets
- For example, to match either sea or sel:
/se[al]/
- A more common use is to specify a range of values to match
- To specify a range, use a dash (-)
- For example, to specify the digits from 0 to 9: /[0-9]/
- To specify a capital letter from "A" to "Z": /[A-Z]/
- You can specify multiple ranges within one pair of square brackets
/[0-9a-zA-Z_]/
- When the caret symbol (^) appears first inside the brackets, it reverses the meaning
- Thus, to match any character that is not a digit from 0 to 9: /[^0-9]/
Further Information: Square brackets
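Character classes are often wrapped in small checking functions. The sketch below assumes two hypothetical helpers, isSimpleUserName() and hasNonDigit(), which are not part of the course code and exist only to show the classes above in use.

```php
<?php
// [al] matches exactly one character: "a" or "l".
echo preg_match('/se[al]/', 'sea'), "\n";
echo preg_match('/se[al]/', 'set'), "\n";

// Hypothetical helper: true if the text holds only letters, digits and underscores.
function isSimpleUserName($text)
{
    return preg_match('/^[0-9a-zA-Z_]+$/', $text) === 1;
}

// Hypothetical helper: a leading caret negates the class,
// so this is true when at least one non-digit character is present.
function hasNonDigit($text)
{
    return preg_match('/[^0-9]/', $text) === 1;
}
```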
14.1.5: Building Regular Expressions That Work
- Regular expressions are very powerful -- but can be almost unreadable
- To build complex regular expressions, start with a simple expression
- After a simple start, refine your regular expression incrementally
- Build it one piece at a time and test each addition as you go
Incremental Refinement Example
- This example incrementally builds a regular expression for form verification
- We want to verify that a form field meets requirements for email addresses
- The steps that follow detail a process for building this verification incrementally
- Determine the precise rules for your field
john.doe@hotmail.com
You determine what is valid and invalid input by examining email addresses and reading specifications. Some of the rules you come up with are:
- User names can have almost any printable ASCII character
- An @ symbol separates the user name from the domain name
- Domain names can have letters, digits, and hyphens
- Each part of a domain name is separated by a dot
- Set up your test environment
Next you build a form with an element to verify and the receiving function. You decide to use the FormVerifier class and add a verification function like that shown. Make sure these work before you add regular expressions.
function isEmailAddress($field, $msg) {
    $value = $this->getValue($field);
    $pattern = "/.+/";
    if (preg_match($pattern, $value)) {
        return true;
    } else {
        $this->addError($field, $value, $msg);
        return false;
    }
}
- Code the most specific term possible
You look at the rules and code the most specific line you can easily come up with. Then you test the regular expression to verify it works.
$pattern = "/[_a-z0-9+.-]+@([a-z0-9-]+\.)+com/i";
- Set anchors if you can
Add the ^ and $ anchors where possible. This rejects input that has extra characters before or after the acceptable pattern.
$pattern = "/^[_a-z0-9+.-]+@([a-z0-9-]+\.)+com$/i";
- Get more specific if you can, testing each addition carefully
You may decide to restrict the top-level domain (TLD) to only those authorized. This turns out to be quite complicated. Almost every two-letter combination is used by some country. In addition to the well-known generic TLDs of com, edu, net, org, mil and gov, there are many new TLDs: biz, info, name, coop, aero and museum. More are being suggested and adopted every year.
We leave the coding of a TLD regular expression as an exercise for the student.
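The effect of the anchors in the steps above can be seen by running both versions of the pattern against the same inputs. This is a minimal sketch. The two patterns are the ones shown above, written as single-quoted strings, while the sample inputs and variable names are assumptions chosen for illustration.

```php
<?php
// Patterns from the steps above, before and after adding anchors.
$unanchored = '/[_a-z0-9+.-]+@([a-z0-9-]+\.)+com/i';
$anchored   = '/^[_a-z0-9+.-]+@([a-z0-9-]+\.)+com$/i';

// Sample inputs (assumptions chosen for illustration).
$inputs = array(
    'john.doe@hotmail.com',        // valid
    'john doe@hotmail.com',        // space in the user name
    'john.doe@hotmail.community',  // extra characters after "com"
);

foreach ($inputs as $input) {
    printf("%-28s unanchored=%d anchored=%d\n",
        $input,
        preg_match($unanchored, $input),
        preg_match($anchored, $input));
}
```

The unanchored pattern accepts all three inputs because it only needs to find a valid address somewhere inside the string. The anchored pattern accepts only the first.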
14.1.6: Summary
- Regular expressions enable a script to look for character patterns in a string
- PHP provides many functions for working with regular expressions
- The most useful function for verifying user input is preg_match()
- Regular characters are matched in an expression like:
/She sells/i
- Special "meta" characters are used to form a small language for matching patterns
- You use parentheses to group characters
/(D|d)av(e|id)/
- You use curly brackets to specify a range of repetitions for the preceding character
/^z{3,5}$/
- You use square brackets to specify a character class:
/[0-9a-z_]/i
- Regular expressions are very powerful -- but can be almost unreadable
- You must build them carefully by starting with simple expressions that work
- Refine and test your regular expression incrementally
- Build it one piece at a time and test each addition as you go
Activity 14.1
- Develop a regular expression for verifying top-level-domain (TLD) names.
- Make sure your regular expression works with the email pattern we have developed so far.
$pattern = "/^[_a-z0-9+.-]+@([a-z0-9-]+\.)+com$/i";
- You may use this test script to test your changes.
Cannot find file: examples/regexer.php
14.2: Scripting the Internet
Learner Outcomes
At the end of the lesson the student will be able to:
- Send email from PHP scripts
- Work with URLs
- Read and parse web pages
14.2.1: Sending Email
- Sometimes you want to send email:
- New password
- Order confirmation
- Survey results
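- PHP provides a function called mail() that sends e-mail through the server's mail system (a local sendmail-compatible program on Unix-like systems, an SMTP server on Windows)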
Basic Syntax
bool mail(toAddress, subject, message);
toAddress: destination address of the e-mail
subject: subject line of the email
message: text of the email message
For Example
Cannot find file: examples/email.php
Security Considerations
- Do not use a web form for the toAddress
- Also, do not read a form variable for the toAddress
- This would let anyone use your mail server to send anything
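The security advice above can be sketched as follows: the recipient is fixed in the code and only the message text varies. The function names, the SURVEY_RECIPIENT constant and the example.com address are all assumptions made for illustration, not part of the course code.

```php
<?php
// Hypothetical fixed recipient: set in code, never taken from form input.
define('SURVEY_RECIPIENT', 'webmaster@example.com');

// Illustrative helper: builds the subject and body for a survey notification.
function buildSurveyMail(array $answers)
{
    $body = "New survey results:\n";
    foreach ($answers as $question => $answer) {
        $body .= "$question: $answer\n";
    }
    return array('Survey results', $body);
}

// Illustrative helper: sends the notification.
// mail() returns true if the message was accepted for delivery, false otherwise.
function sendSurveyMail(array $answers)
{
    list($subject, $message) = buildSurveyMail($answers);
    return mail(SURVEY_RECIPIENT, $subject, $message);
}
```

A receiving script would call sendSurveyMail(array('Favorite color' => 'blue')) after verifying the form. Whether the message is actually delivered depends on how mail is configured on the server.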
More Information
- mail: PHP function documentation
14.2.2: Verifying Network Information
- Sometimes you need to verify network information
- For example, you want to verify that an email address or URL is valid
- With PHP, you can look up hostnames, IP addresses and MX records
- An MX record is short for mail exchange record
- MX records are stored in DNS and are looked up like a hostname
- If no MX record exists, there is nowhere for the email to go
- There can be more than one MX record, so the function getmxrr() fills an array with the results
- Note that getmxrr() is not implemented on Windows in PHP versions before 5.3
Commonly Used Functions to Verify Network Information
| Function | Description |
| gethostbyaddr(ipAddress) | Returns the host name of the Internet host specified by the string ipAddress. |
| gethostbyname(hostname) | Returns the IP address of the Internet host specified by the string hostname. |
| getmxrr(hostName, mxArray) | Fills mxArray with the MX host names found for hostName and returns true if any were found. (Not implemented on Windows before PHP 5.3) |
| parse_url(url) | Returns from the URL string an associative array with the following indexes (if present): scheme, host, port, user, pass, path, query, and fragment. |
Example Checking a URL
Cannot find file: examples/urlcheck.php
Example Checking an Email for MX Records
Cannot find file: examples/emailmxcheck.php
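The functions in the table can be combined roughly as follows. This is a minimal sketch, not the original example scripts: emailDomain() and hasMailExchanger() are hypothetical helpers, the URL is an invented example value, and the MX lookup needs network access when it is called.

```php
<?php
// parse_url() splits a URL into its parts (the URL is an example value).
$parts = parse_url('http://www.example.com:8080/catalog/index.php?id=42#top');
echo $parts['scheme'], ' ', $parts['host'], ' ', $parts['port'], "\n";

// Hypothetical helper: returns the part of an email address after the last "@".
function emailDomain($email)
{
    $at = strrpos($email, '@');
    return ($at === false) ? '' : (string) substr($email, $at + 1);
}

// Hypothetical helper: true if DNS lists at least one MX record for the domain.
// Needs network access; the function_exists() guard covers platforms without getmxrr().
function hasMailExchanger($email)
{
    $domain = emailDomain($email);
    if ($domain === '' || !function_exists('getmxrr')) {
        return false;
    }
    $mxHosts = array();
    return getmxrr($domain, $mxHosts) && count($mxHosts) > 0;
}
```

A form script might call hasMailExchanger($value) only after the address has passed the regular expression check, since a DNS lookup is much slower than a pattern match.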
14.2.3: Reading Pages from a URL
- You can easily read a page from a URL
$page = file_get_contents($url);
- Many of PHP's Filesystem functions work with Internet sources
Some Functions that Read from URLs
| Function | Description |
| file(url) | Returns an array containing the contents read from the string url, with each element of the array corresponding to a line in the file. |
| file_get_contents(url) | Returns a string containing the contents read from the string url. Note: Needs PHP version 4.3 or later and so does not work on classroom computers. |
Example Script to Read From a URL
Cannot find file: examples/urlread.php
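The difference between the two functions is easiest to see side by side. Both accept a local path as well as a URL, so this minimal sketch reads a small temporary file that stands in for a remote page and runs without a network connection. The file contents and variable names are assumptions, and the sketch needs a current PHP version.

```php
<?php
// Create a small local file to stand in for a remote page.
$path = tempnam(sys_get_temp_dir(), 'page');
file_put_contents($path, "<html>\n<title>Demo</title>\n</html>\n");

// file() returns an array with one element per line.
$lines = file($path, FILE_IGNORE_NEW_LINES);

// file_get_contents() returns the whole source as one string.
$page = file_get_contents($path);

echo count($lines), " lines read\n";
echo $lines[1], "\n";

// Remove the temporary file again.
unlink($path);
```

To read a real page, replace $path with a URL such as the $url variable used above. Reading URLs this way requires the allow_url_fopen setting to be enabled in php.ini.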
14.2.4: Parsing a Web Page
- You can use information from other parts of the web in your own pages
- In general, the steps you follow are:
- Find an original source URL
- Read the information from the URL
- Parse (extract) the data you want to use
- Finding the information might involve some detective work
- We looked at how to read the information in the previous section
- To parse the information, you often use regular expressions
- The preg_match() function accepts an extra parameter that receives the matches to the pattern
Syntax
int preg_match(string pattern, string subject, array matches)
pattern: regular expression pattern
subject: the string to search for pattern matches
matches: optional argument that is filled with the results of the search
Example Script to Parse a URL
Cannot find file: examples/urlparse.php
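A minimal sketch of the parsing step follows. It assumes an inline HTML string in place of a page read with file_get_contents($url), and it extracts the page title as an example of the data to pull out. The sample HTML is invented for illustration.

```php
<?php
// Inline sample standing in for a page read with file_get_contents($url).
$page = "<html><head><title>Artzy Products</title></head><body>...</body></html>";

// (.*?) captures the text between the tags.
// The i modifier ignores case and the s modifier lets . match newlines.
$pattern = '/<title>(.*?)<\/title>/is';

if (preg_match($pattern, $page, $matches)) {
    echo $matches[0], "\n";   // the whole matched text, tags included
    echo $matches[1], "\n";   // only the text captured by the parentheses
}
```

Element 0 of the matches array always holds the full match. Each pair of parentheses in the pattern adds one more element, numbered from 1 in the order the opening parentheses appear.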
14.2.5: Uploading Files
- Most browsers let you upload files using the POST method
- PHP is capable of receiving and processing uploaded files
An Upload Form
- The following HTML creates a file upload form
Cannot find file: examples/upload.html
- The attribute enctype="multipart/form-data" is needed to load files into PHP's $_FILES superglobal array
- The optional hidden field MAX_FILE_SIZE tells the browser the maximum file size to upload
- The MAX_FILE_SIZE field must be placed before the file field
- This field can be ignored by the browser and is easy to circumvent
- Thus you will need to verify the value in your script as well
- The file field creates the upload form element
<input type="file" name="uploadFile">
Processing the Uploaded File
- The following is a minimal script to process an uploaded file
Cannot find file: examples/upload1.php
- PHP first places the uploaded file in a temporary directory using a temporary name
- Your code should move the file to its permanent location before the script finishes processing
move_uploaded_file($tmp_name, $name);
- PHP stores all the uploaded file information in the $_FILES array
- Each file has its own array of information in the $_FILES array
- Thus, you need to specify both the form field name and the data element to retrieve a value
$tmp_name = $_FILES["uploadFile"]["tmp_name"];
- Explanations of all the available data values are listed in the documentation for Handling file uploads
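The move step can be wrapped in a small function. This is a minimal sketch under stated assumptions: saveUploadedFile() is a hypothetical helper, not the original example script, and the destination directory in the usage comment is a placeholder.

```php
<?php
// Hypothetical helper: moves one uploaded file into $destDir and returns
// the new path, or false on failure.
function saveUploadedFile(array $file, $destDir)
{
    // Reject anything PHP did not itself receive as an HTTP upload.
    if (!is_uploaded_file($file['tmp_name'])) {
        return false;
    }
    // basename() drops any directory part the client put in the file name.
    $target = rtrim($destDir, '/') . '/' . basename($file['name']);
    return move_uploaded_file($file['tmp_name'], $target) ? $target : false;
}

// Typical call inside the script that receives the form
// (the directory is a placeholder):
// $path = saveUploadedFile($_FILES['uploadFile'], '/path/to/uploads');
```

Using basename() on the client-supplied name is a design choice of this sketch. It keeps a name such as "../../index.php" from placing the file outside the destination directory.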
Error Checking and Validation
- We need to both validate the file upload and check for errors
- First of all, you should require a user to authenticate before uploading files
- That way you can keep records of anyone attacking your uploading system
- Note that move_uploaded_file() checks to ensure that the file is a valid upload file
- If any error occurs, then the file is not moved
- Thus you can check for errors easily with the following code
Cannot find file: examples/upload2.php
- Part of the information returned about the file is an error code
$errorCode = $_FILES['uploadFile']['error'];
- Error codes are explained in: Error Messages Explained
- These codes can help you to check for user error and other problems
- You can use the other file information to check file types and sizes as well
- The following code shows how to check file type and size before moving the uploaded file
Cannot find file: examples/upload.php
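The checks described above might be combined as in the following minimal sketch. The checkUpload() function, its messages, the size limit and the list of allowed extensions are all assumptions to adapt. Only the UPLOAD_ERR_* constants and the $_FILES element names come from PHP itself.

```php
<?php
// Hypothetical validator: returns an error message, or null if the entry
// from $_FILES looks acceptable.
function checkUpload(array $file, $maxBytes, array $allowedExtensions)
{
    if ($file['error'] === UPLOAD_ERR_NO_FILE) {
        return 'No file was selected.';
    }
    if ($file['error'] === UPLOAD_ERR_INI_SIZE || $file['error'] === UPLOAD_ERR_FORM_SIZE) {
        return 'The file is too large.';
    }
    if ($file['error'] !== UPLOAD_ERR_OK) {
        return 'The upload failed (error code ' . $file['error'] . ').';
    }
    // Check the size reported by PHP against the script's own limit.
    if ($file['size'] > $maxBytes) {
        return 'The file is too large.';
    }
    // Compare the file name extension against a list of allowed types.
    $extension = strtolower(pathinfo($file['name'], PATHINFO_EXTENSION));
    if (!in_array($extension, $allowedExtensions, true)) {
        return 'This file type is not allowed.';
    }
    return null;
}
```

A script would call checkUpload($_FILES['uploadFile'], 500000, array('jpg', 'png')) and move the file only when the result is null. The extension check is a convenience rather than a guarantee, since the client controls the file name.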
Large File Uploads
- PHP has a limit on file upload sizes -- usually about 2 MB
- You can change this limit in the php.ini file
- Also, the web server may limit the amount of information processed during one POST operation
- Sometimes this limit is as low as 512 KB
- To upload large files, you may need another solution like Java or Perl
- More info: PHP Upload Configuration
- Includes links to other solutions like Applets and Perl scripts
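The limits in effect on a server can be read from a script. This is a minimal sketch using PHP's upload_max_filesize and post_max_size settings from php.ini. The variable names are illustrative, and the values printed depend on the server's configuration.

```php
<?php
// Report the limits that apply to uploads on this server.
$uploadLimit = ini_get('upload_max_filesize');  // largest single uploaded file
$postLimit   = ini_get('post_max_size');        // largest complete POST request

echo "upload_max_filesize = $uploadLimit\n";
echo "post_max_size = $postLimit\n";
```

Both values are returned as strings in php.ini notation, such as "2M" for 2 megabytes.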
14.2.6: Summary
- PHP has numerous functions for using the Internet
- PHP provides a function called mail() that sends e-mail through the server's mail system
- Function parse_url() parses a URL and returns its various parts
- You can use PHP functions to verify user-supplied information
gethostbyname(): returns the IP address of a host, if found
getmxrr(): returns the MX records for an email host, if found
- Also, you can read entire pages off the web:
$page = file_get_contents($url);
- Once you read the page, you can use regular expressions to extract information
preg_match($pattern, $page, $matches);
- The information extracted is returned in the $matches array
echo $matches[0];
- Browsers allow you to upload files using HTML forms
- PHP can receive and process the uploaded files
- You should require a user to authenticate before uploading files
- That way you can keep records of anyone attacking your uploading system
- Also, you need to both validate the file upload and check for errors
Activity 14.2
- Modify the following script to extract information from a web page of your choosing.
Cannot find file: examples/urlparse2.php
14.3: Finishing the Course
Learner Outcomes
At the end of the lesson the student will be able to:
- Discuss the final preparation for the project presentation
- Advise the instructor on how to improve future courses
14.3.1: About the Final Project Presentation
Before the Presentation
- Submit your project to WebCT before the presentation:
- Bring a written report on paper to give to the instructor before the presentation
During the Presentation
The presentation should have the following:
- Your name and your project's name
- A brief introduction describing the purpose of your project
- A demonstration and discussion of the user interface including:
- Entry page
- Page layout
- Navigation features
- A demonstration of a multi-form sequence where you pass information from one page to another
- A demonstration of user-input error handling
- Checking of form input for errors
- Highlighting of errors so users easily see them
- Explanation to user of how to correct errors
- Retention of prior entries on error (except passwords)
- A discussion or demonstration of user authentication
- How the database is used for authentication
- How passwords are encrypted in the database
- A discussion or demonstration of security features
- How data types are checked before insertion into a database
- How data sizes are checked before insertion into a database
- How taint checking of special characters is implemented (e.g. '"$#)
- How special symbols and spaces do not cause database errors
- A discussion or demonstration of cool features
- Point them out so we can all appreciate them
- Feel free to display your written report during the presentation
- Keep the presentation to 10 minutes or less
After the Presentation
- Feel free to leave (or stay) after your presentation
- You can present to the instructor alone after the other presentations are through
14.3.2: Lecture Finale
- During the semester we have covered many topics and learned at least two languages: SQL and PHP
- With this knowledge you can develop professional-looking database-driven Web sites
- Your project will allow you to demonstrate what you have learned
- There is no substitute for a working application!
- I hope that everyone has enjoyed taking the course as much as I have enjoyed presenting it
- I am always open to suggestions for improving the course
Wrap Up
Due Next: Final Project Report and Presentation (6/1/07)
When class is over, please shut down your computer if it is on
Last Updated: May 21 2007 @19:17:07