14. Networking and Other Topics

What We Will Cover


Continuations

Questions from last class?

14.1: Improving Verification With Patterns

Objectives

At the end of the lesson the student will be able to:

  • Use PHP pattern-matching functions
  • Code regular expressions to match string patterns

14.1.1: About Regular Expressions

  • Many programming problems require matching a pattern in string variables
  • Verifying the data received from HTML forms is one such problem
  • For example, if you are expecting an email address, your script needs to verify the string meets requirements for email addresses
  • john.doe@hotmail.com

Regular Expression Standards

  • There are two main standards for regular expressions: POSIX and Perl
  • PHP originally supported both standards; the POSIX (ereg) functions were deprecated in PHP 5.3 and removed in PHP 7
  • We will use Perl-compatible functions and focus on using preg_match()

Commonly Used Pattern-Matching Functions (Perl Compatible)

Function          Description
preg_match()      Searches a string for matches to a regular expression.
preg_replace()    Searches a string for matches to a regular expression and replaces them with the specified text.
preg_split()      Searches a string for boundaries matched by a regular expression and splits the string into an array of strings along the boundaries.

14.1.2: Using the preg_match() Function

  • You use preg_match() to search a string for matches to a regular expression
  • If the regular expression pattern matches a part of the string, then it returns the number 1

Basic Syntax

int preg_match(string pattern, string subject)
  • pattern: regular expression pattern
  • subject: the string to search for pattern matches

For Example

<?php
$pattern = "/se/";
$subject = "She sells sea shells";
$found = preg_match($pattern, $subject);
if ($found) {
    echo "Matches";
} else {
    echo "No match";
}
?>

  • Put the regular expression pattern between forward slashes: / /
  • If the pattern "se" is found, then $found is set to the number 1
  • Otherwise, $found is set to the number 0
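As a small sketch of these return values: besides 1 and 0, preg_match() can also return false when the pattern itself is invalid, so a strict comparison distinguishes the three cases. The helper name describeMatch() is made up for this illustration.

```php
<?php
// describeMatch() is a hypothetical helper used only for this illustration.
function describeMatch($pattern, $subject) {
    // @ hides the warning PHP raises when the pattern is invalid
    $found = @preg_match($pattern, $subject);
    if ($found === 1) {
        return "Matches";
    } elseif ($found === 0) {
        return "No match";
    }
    return "Invalid pattern";   // preg_match() returned false
}

echo describeMatch("/se/", "She sells sea shells"), "\n";
echo describeMatch("/xyz/", "She sells sea shells"), "\n";
echo describeMatch("/se", "She sells sea shells"), "\n";  // closing slash missing
```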

14.1.3: Using Regular Expressions with preg_match()

  • PHP has a special set of pattern-matching characters (meta characters)
  • These characters form a small language with each character having a special meaning
  • These characters are part of an industry standard

Commonly Used Pattern-Matching Characters

Symbol    Description
^         Matches when the characters that follow start the string
$         Matches when the preceding characters end the string
*         Matches zero or more occurrences of the preceding character
+         Matches one or more occurrences of the preceding character
?         Matches zero or one occurrence of the preceding character
.         A wildcard symbol that matches any one character
|         Alternation symbol (OR) that matches either the pattern on the left or the right

For Example

  • We can test our regular expressions with a simple form script
  • To match when starting with "She": ^She
  • To match when ending with "shells": shells$
  • To match an "se" followed by one or more l's: sel+
  • To match an "se" followed by zero or more a's: sea*
  • To match any character followed by an "e": .e
  • To match "He" or "She": He|She
  • Note that you can ignore case by putting an "i" after the closing slash
  • /^she/i
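If no form script is at hand, a short command-line sketch like the following can run each pattern from the list against the sample sentence. The loop and variable names are only illustrative.

```php
<?php
$subject = "She sells sea shells";

// Each pattern from the list above; all of them should match $subject
$patterns = array(
    "/^She/",      // starts with "She"
    "/shells$/",   // ends with "shells"
    "/sel+/",      // "se" followed by one or more l's
    "/sea*/",      // "se" followed by zero or more a's
    "/.e/",        // any one character followed by "e"
    "/He|She/",    // "He" or "She"
    "/^she/i",     // starts with "she", ignoring case
);

foreach ($patterns as $pattern) {
    $result = preg_match($pattern, $subject) ? "match" : "no match";
    echo "$pattern: $result\n";
}

// Without the "i" modifier the lowercase pattern does not match
echo "/^she/: ", preg_match("/^she/", $subject) ? "match" : "no match", "\n";
```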

"Escaped" Character Literals

  • You can match one of the special characters
  • However, you must prefix it with the backslash character
  • /\.\*/
  • To match one backslash, your regular expression should include "\\"
  • The backslash is also used to specify non-printing characters like:
    Sequence  Meaning
    \a        Alert
    \f        Formfeed
    \n        Newline
    \r        Carriage return
    \t        Horizontal tab

  • Further Information: Backslash
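A short sketch of escaping in practice. It assumes single-quoted PHP strings for the patterns, which keeps the number of backslashes manageable, and uses preg_quote(), a standard PHP function that escapes special characters for you. The sample strings are made up.

```php
<?php
// Match a literal ".*" by escaping both special characters
echo preg_match('/\.\*/', "rm *.* now"), "\n";        // ".*" is present
echo preg_match('/\.\*/', "no specials here"), "\n";  // ".*" is absent

// The regular expression needs \\ to match one backslash; inside a
// single-quoted PHP string that is typed as four backslashes
echo preg_match('/\\\\/', 'C:\temp'), "\n";

// preg_quote() adds the backslashes for you; "/" is the delimiter in use
$literal = preg_quote("1.5*2", "/");
echo $literal, "\n";
echo preg_match("/^" . $literal . "$/", "1.5*2"), "\n";
```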

14.1.4: Grouping Characters

  • Regular expressions use parentheses, curly brackets and square brackets to group characters
  • Each type of grouping character has different meanings
  • You can combine these grouping characters with other special characters to get flexible and specific matching patterns

Using Parentheses to Group Characters

  • Use parentheses to group characters in a regular expression
  • For example, to match "Dave" or "David" in a string:
  • /Dav(e|id)/
  • To match "Dave" or "David" whether the name starts with a "D" or "d":
  • /(D|d)av(e|id)/
  • Further Information: Subpatterns
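The grouped pattern can be tried against a few names with a sketch like this one. The ^ and $ anchors are added here as an assumption, so that the whole name must match rather than just a part of it, and the list of names is made up.

```php
<?php
$names = array("Dave", "david", "Davy", "Dav");
foreach ($names as $name) {
    // (D|d) accepts either case for the first letter; (e|id) accepts either ending
    $result = preg_match("/^(D|d)av(e|id)$/", $name) ? "match" : "no match";
    echo "$name: $result\n";
}
```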

Using Curly Brackets to Specify Repetitions

  • You use curly brackets to specify a range of repetitions for the preceding character
  • You can specify a range of values such as between 3 and 5 z's
  • /^z{3,5}$/
  • You can specify a minimum value such as 3 or more z's
  • /^z{3,}$/
  • You can specify a maximum value such as 3 or fewer z's
  • /^z{0,3}$/
  • You can specify an exact value such as exactly 3 z's
  • /^z{3}$/
  • Further Information: Repetition
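One way to check repetition counts is to test strings of increasing length, as in this sketch. It writes the maximum-only case with an explicit lower bound of 0, which works across PCRE versions.

```php
<?php
// Test strings of 0 to 6 z's against each repetition pattern
$patterns = array('/^z{3,5}$/', '/^z{3,}$/', '/^z{0,3}$/', '/^z{3}$/');
foreach ($patterns as $pattern) {
    $lengths = array();
    for ($n = 0; $n <= 6; $n++) {
        if (preg_match($pattern, str_repeat("z", $n))) {
            $lengths[] = $n;   // remember every length that matched
        }
    }
    echo $pattern, " matches lengths: ", implode(", ", $lengths), "\n";
}
```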

Using Square Brackets to Specify Character Classes

  • You use square brackets to specify a character class
  • Classes match only one of the characters found between the square brackets
  • For example, to match either sea or sel:
  • /se[al]/
  • A more common use is to specify a range of values to match
  • To specify a range, use a dash (-)
  • For example, to specify the numbers from 0 to 9: /[0-9]/
  • To specify a capital letter from "A" to "Z": /[A-Z]/
  • You can specify multiple ranges within one square bracket
  • /[0-9a-zA-Z_]/
  • When the caret symbol (^) appears first inside the brackets, it reverses the meaning
  • Thus, to match any character that is not a digit from 0 to 9: /[^0-9]/
  • Further Information: Square brackets
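A brief sketch that exercises the character classes above. The test strings are made up, and anchors and + are added to two of the patterns (an assumption) so that every character of the string is checked.

```php
<?php
// Each entry pairs a pattern with a string to test it against
$tests = array(
    array('/se[al]/',          "sea shells"),  // "sea" is present
    array('/se[al]/',          "set"),         // "t" is not in the class
    array('/^[0-9]+$/',        "2004"),        // digits only
    array('/^[0-9a-zA-Z_]+$/', "user_42"),     // digits, letters, underscore
    array('/[^0-9]/',          "12a4"),        // contains a non-digit
    array('/[^0-9]/',          "1234"),        // no non-digit to find
);
foreach ($tests as $test) {
    list($pattern, $subject) = $test;
    $result = preg_match($pattern, $subject) ? "match" : "no match";
    echo "$pattern on \"$subject\": $result\n";
}
```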

14.1.5: Building Regular Expressions That Work

  • Regular expressions are very powerful -- but can be almost unreadable
  • To build complex regular expressions, start with a simple expression
  • After a simple start, refine your regular expression incrementally
  • Build it one piece at a time and test each addition as you go

Incremental Refinement Example

  • This example incrementally builds a regular expression for form verification
  • We want to verify that a form field meets requirements for email addresses
  • The steps that follow detail a process for building this verification incrementally
  1. Determine the precise rules for your field

    For example: john.doe@hotmail.com

    You determine what is valid and invalid input by examining email addresses and reading specifications. Some of the rules you come up with are:

    • User names can have almost any printable ASCII character
    • An @ symbol separates the user name from the domain name
    • Domain names can have letters, digits, and hyphens
    • Each part of a domain name is separated by a dot

  2. Set up your test environment

    Next you build a form with an element to verify and the receiving function. You decide to use the FormVerifier class and add a verification function like the one shown here. Make sure these work before you add regular expressions.

    function isEmailAddress($field, $msg) {
        $value = $this->getValue($field);
        $pattern = "/.+/";
        if(preg_match($pattern, $value)) {
            return true;
        } else {
            $this->addError($field, $value, $msg);
            return false;
        }
    }

  3. Code the most specific term possible

    You look at the rules and code the most specific line you can easily come up with. Then you test the regular expression to verify it works.

    $pattern = "/[_a-z0-9+.-]+@([a-z0-9-]+\.)+com/i";

  4. Set anchors if you can

    Add the ^ and $ anchors where possible. This rejects input that has extra characters before or after the acceptable pattern.

    $pattern = "/^[_a-z0-9+.-]+@([a-z0-9-]+\.)+com$/i";

  5. Get more specific if you can, testing each addition carefully

    You may decide to restrict the top level domain (TLD) to only those authorized. This turns out to be quite complicated. Almost every two-letter combination is used by some country. In addition to the well-known generic TLDs of com, edu, net, org, mil and gov, there are many new TLDs: biz, info, name, coop, aero and museum. More are being suggested and adopted every year.

    We leave the coding of a TLD regular expression as an exercise for the student.
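To test each refinement, it can help to run the pattern against a handful of good and bad inputs. The sketch below does this for the anchored pattern above, using a standalone function instead of the FormVerifier class. The function name looksLikeEmail() and the sample addresses are made up for illustration.

```php
<?php
// Standalone version of the check, without the FormVerifier class.
// looksLikeEmail() is a hypothetical name.
function looksLikeEmail($value) {
    $pattern = "/^[_a-z0-9+.-]+@([a-z0-9-]+\.)+com$/i";
    return preg_match($pattern, $value) === 1;
}

$samples = array(
    "john.doe@hotmail.com",       // valid
    "John.Doe@Mail.Example.COM",  // valid: case is ignored, subdomains allowed
    "john.doe@hotmail",           // no dot and top level domain
    "john doe@hotmail.com",       // space in the user name
    "john.doe@hotmail.com.evil",  // extra text after "com"
);
foreach ($samples as $sample) {
    echo $sample, ": ", looksLikeEmail($sample) ? "accepted" : "rejected", "\n";
}
```

Rerunning the same list after every change to the pattern shows quickly whether a refinement broke a case that used to work.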

14.1.6: Summary

  • Regular expressions enable a script to look for character patterns in a string
  • PHP supports many functions useful for use with regular expressions
  • The most useful function for verifying user input is preg_match()
  • Regular characters are matched in an expression like:
  • /She sells/i
  • Special "meta" characters are used to form a small language for matching patterns
  • You use parentheses to group characters
  • /(D|d)av(e|id)/
  • You use curly brackets to specify a range of repetitions for the preceding character
  • /^z{3,5}$/
  • You use square brackets to specify a character class:
  • /[0-9a-z_]/i
  • Regular expressions are very powerful -- but can be almost unreadable
  • You must build them carefully by starting with simple expressions that work
  • Refine and test your regular expression incrementally
  • Build it one piece at a time and test each addition as you go

Exercise 14.1

  1. Develop a regular expression for verifying top-level-domain (TLD) names.
  2. Make sure your regular expression works with the email pattern we have developed so far:

    $pattern = "/^[_a-z0-9+.-]+@([a-z0-9-]+\.)+com$/i";

  3. You may use the test script to test your changes.

14.2: Scripting the Internet

Objectives

At the end of the lesson the student will be able to:

  • Send email from PHP scripts
  • Work with URLs
  • Read and parse web pages

14.2.1: Sending Email

  • Sometimes you want to send email:
    • New password
    • Order confirmation
    • Survey results
  • PHP provides a function called mail() that sends e-mail via SMTP

Basic Syntax

    boolSuccess mail(toAddress, subject, message);
  • toAddress: destination address of the e-mail
  • subject: subject line of the email
  • message: text of the email message

For Example

<?php
$to = "someone@somewhere.com";
$subject = "Today's Wisdom";
$message = "
A Person Who Asks A Question
Is A Fool For Five Minutes.
A Person Who Doesn't
Is A Fool Forever";

echo mail($to, $subject, $message);
?>

Security Considerations

  • Do not use a web form for the toAddress
  • Also, do not read a form variable for the toAddress
  • This would let anyone use your mail server to send anything
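One way to follow this advice is to fix the recipient in the script and take only the subject and message from the form, as in this sketch. The constant FEEDBACK_TO, the function buildFeedbackMail() and the form field names are all assumptions made for illustration. Removing line breaks from the subject is an extra precaution added here, not something the notes above require.

```php
<?php
// Hypothetical feedback handler: the recipient is fixed in the script,
// only the subject and message come from the form.
define("FEEDBACK_TO", "someone@somewhere.com");

function buildFeedbackMail($form) {
    $subject = isset($form['subject']) ? $form['subject'] : "";
    $message = isset($form['message']) ? $form['message'] : "";
    // Remove line breaks so the subject stays on a single header line
    $subject = preg_replace('/[\r\n]+/', " ", $subject);
    return array(FEEDBACK_TO, "Feedback: " . trim($subject), $message);
}

// Simulated form data; a 'to' field is present but is never read
$form = array(
    'to'      => "victim@example.com",
    'subject' => "Hello\r\nBcc: x@example.com",
    'message' => "Nice site!",
);
list($to, $subject, $message) = buildFeedbackMail($form);
echo $to, "\n", $subject, "\n";
// mail($to, $subject, $message);   // uncomment to actually send
```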

14.2.2: Verifying Network Information

  • Sometimes you need to verify network information
  • For example, you want to verify that an email address or URL is valid
  • With PHP, you can look up hostnames, IP addresses and MX records
  • An MX record is short for mail exchange record
  • MX records are stored at the DNS and are looked up like a hostname
  • If no MX record exists, there is nowhere for the email to go
  • There can be more than one MX record, so the function getmxrr() fills an array with the results

Commonly Used Functions to Verify Network Information

Function                      Description
gethostbyaddr(ipAddress)      Returns the host name of the Internet host specified by the string ipAddress.
gethostbyname(hostname)       Returns the IP address of the Internet host specified by the string hostname, or the unmodified hostname if the lookup fails.
getmxrr(hostName, mxArray)    Fills mxArray with the MX host names found for the email host hostName and returns true if any were found. (Not available on Windows before PHP 5.3)
parse_url(url)                Returns from the URL string an associative array with the following indexes (if present): scheme, host, port, user, pass, path, query, and fragment.

Example Checking a URL

<?php
$url = "http://www.edparrish.com/cis165/04s/lesson13.php";
$urlArray = parse_url($url);
$host = $urlArray['host'];
$ip = gethostbyname($host);
if ($ip != $host) {
    echo "Host for URL has a valid IP";
} else {
    echo "Host for URL does not have a valid IP";
}
?>
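To see every part that parse_url() can return, the sketch below parses a longer URL. The port, query string and fragment are added to the lesson URL purely for illustration. No network access is needed, since parse_url() only examines the string.

```php
<?php
// The port, query and fragment are made up to show the extra indexes
$url = "http://www.edparrish.com:8080/cis165/04s/lesson13.php?week=14#summary";
$parts = parse_url($url);
foreach ($parts as $name => $value) {
    echo "$name: $value\n";
}

// Indexes that are not present in the URL are simply missing from the array
echo isset($parts['user']) ? "has user" : "no user", "\n";
```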

Example Checking an Email for MX Records

<?php
$email = "someone@totallyBogusEmailServerName.com";
$emailArray = explode('@', $email);
$emailHost = $emailArray[1];
$result = getmxrr($emailHost, $mxhosts);
if ($result) {
    echo "MX host exists";
} else {
    echo "MX host not found";
}
?>
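The example above assumes the address contains an @ symbol; without one, $emailArray[1] does not exist. A slightly more defensive sketch separates extracting the host from the DNS lookup. The function names emailHost() and hasMailServer() are made up, and the lookup itself needs a working network connection.

```php
<?php
// emailHost() and hasMailServer() are hypothetical helper names.
function emailHost($email) {
    $at = strrpos($email, "@");          // position of the last @
    if ($at === false || $at === strlen($email) - 1) {
        return false;                    // no @, or nothing after it
    }
    return substr($email, $at + 1);
}

function hasMailServer($email) {
    $host = emailHost($email);
    if ($host === false) {
        return false;
    }
    // getmxrr() performs a DNS lookup, so this part requires a network
    return getmxrr($host, $mxhosts);
}

echo emailHost("someone@totallyBogusEmailServerName.com"), "\n";
var_dump(emailHost("no-at-sign"));
```

A script would then call hasMailServer($email) after the regular expression check has passed.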

14.2.3: Reading Pages from a URL

  • You can easily read a page from a URL
  • $page = file_get_contents($url);
    
  • Many of PHP's Filesystem functions work with Internet sources

Some Functions that Read from URLs

Function                  Description
file(url)                 Returns an array containing the contents read from the string url, with each element of the array corresponding to a line in the file.
file_get_contents(url)    Returns a string containing the contents read from the string url. Note: Needs PHP version 4.3 or later and so does not work on classroom computers.

For Example

<?php
$url = "http://www.edparrish.com/index.html";
$page = file_get_contents($url);
echo $page;
?>
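The sketch below compares the two functions and checks for failure, since both return false when the source cannot be read. To run without a network it reads a temporary local file. The same calls accept a URL when PHP's allow_url_fopen setting is enabled. The file contents are made up.

```php
<?php
// Write a small temporary file so the example runs without a network
$source = tempnam(sys_get_temp_dir(), "page");
file_put_contents($source, "<html>\n<body>Hello</body>\n</html>\n");

// file_get_contents() returns one string, or false on failure
$page = file_get_contents($source);
if ($page === false) {
    echo "Could not read $source\n";
} else {
    echo strlen($page), " characters read\n";
}

// file() returns an array with one element per line
$lines = file($source);
echo count($lines), " lines read\n";
echo "Second line: ", trim($lines[1]), "\n";

unlink($source);   // remove the temporary file
```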

14.2.4: Parsing a Web Page

  • You can use information from other parts of the web in your own pages
  • In general, the steps you follow are:
    1. Find an original source URL
    2. Read the information from the URL
    3. Parse (extract) the data you want to use
  • Finding the information might involve some detective work
  • We looked at how to read the information in the previous section
  • To parse the information, you often use regular expressions
  • Function preg_match() allows you to include an extra parameter for matches to the pattern

Syntax

int preg_match(string pattern, string subject, array matches)
  • pattern: regular expression pattern
  • subject: the string to search for pattern matches
  • matches: optional argument that is filled with the results of search

For Example

<?php
$symbol = "AMZN";
$url = "http://www.amex.com/equities/listCmp/"
      ."EqLCDetQuote.jsp?Product_Symbol=$symbol";
$page = file_get_contents($url);
$pattern = "/\\\$[0-9]+\\.[0-9]+/i";
if (preg_match($pattern, $page, $matches)) {
    echo "$symbol last sold at: ";
    echo $matches[0];
} else {
    echo "No quote available";
}
echo "<br>Information retrieved from:<br>"
    ."<a href=\"$url\">$url</a><br>"
    ."on ".(date('l jS F Y g:i a T'));
?>
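The matches array can be explored without downloading anything by running the pattern against a fixed string. In this sketch the page fragment is made up, and parentheses are added to the price pattern (an assumption) to show that each group gets its own element in the array. preg_match_all() is a standard PHP function that collects every match instead of only the first.

```php
<?php
// A made-up fragment standing in for a downloaded page
$page = '<td>Last Sale</td><td class="price">$38.45</td>'
       .'<td>Change</td><td>$0.32</td>';

// The price pattern, with parentheses added around the dollars
// and the cents so each part is captured separately
$pattern = '/\$([0-9]+)\.([0-9]+)/';

if (preg_match($pattern, $page, $matches)) {
    echo "Whole match: ", $matches[0], "\n";   // text matched by the full pattern
    echo "Dollars: ", $matches[1], "\n";       // first parenthesized group
    echo "Cents: ", $matches[2], "\n";         // second parenthesized group
}

// preg_match() stops at the first match; preg_match_all() collects all of them
preg_match_all($pattern, $page, $all);
echo implode(", ", $all[0]), "\n";
```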

14.2.5: Summary

  • PHP has numerous functions for using the Internet
  • PHP provides a function called mail() that sends e-mail via SMTP
  • Function parse_url() parses a URL and returns its various parts
  • You can use PHP functions to verify user-supplied information
  • gethostbyname(): returns the IP address of a host, if found
  • getmxrr(): returns the MX records for an email host, if found
  • Also, you can read entire pages off the web:
  • $page = file_get_contents($url);
    
  • Once you read the page, you can use regular expressions to extract information
  • preg_match($pattern, $page, $matches);
    
  • The information extracted is returned in the $matches array
  • echo $matches[0];
    

Exercise 14.2

  1. Modify the script examples/urlparse2.php to extract information from a web page of your choosing.

14.3: Lecture Finale

Objectives

At the end of the lesson the student will be able to:

  • Discuss the final preparation for the project presentation
  • Advise the instructor on how to improve future courses

14.3.1: What We Have Learned

During the course, we have learned how to:

  1. Query a database using SQL
  2. Create databases and tables
  3. Insert into a database and update data already in a database
  4. Design a database and put it into an optimal form
  5. Improve the performance of a database with indexing
  6. Use database functions for grouping, aggregating and other procedures
  7. Create PHP pages and display database data in a Web page
  8. Work with PHP variables and process form data
  9. Save form data into a database
  10. Use conditional statements to make our applications appear intelligent
  11. Use loops to repeat code
  12. Use arrays to group our data
  13. Use database meta-data to format data in Web pages
  14. Write functions to group related code
  15. Use functions and include files to organize Web applications
  16. Create classes and objects using PHP
  17. Use classes that make processing forms easier for users
    • FormVerifier
    • FormLib
  18. Pass data from one page to another using:
    • Hidden fields
    • Hypertext Links
    • Cookies
    • Sessions
  19. Apply these techniques to a multi-page authentication system
  20. Handle database errors
  21. Implement a shopping cart
  22. Improve the security of our Web applications
    • Including our responsibilities as developers
  23. Safeguard our database from user input
    • Especially from SQL injection
  24. Use regular expressions to validate data
  25. Send email from an application
  26. Read and parse web pages
  • With this knowledge you can develop professional-looking database-driven Web sites
  • Your project will allow you to demonstrate what you have learned
  • Suggestions for improvement?

14.3.2: End of Course Survey

  • Please take a few minutes to answer this short survey
  • This will help the instructor to improve future courses
  • Survey respondent answers are anonymous
  • The link to WebCT is here
  • You may want to rate the textbook at your favorite book seller
  • Also, you can rate me at Katsu's site: Student Feedback

14.3.3: About the Project Presentation

Before the Presentation

  • Submit your project to WebCT before the presentation:
  • Bring a written report on paper to give to the instructor before the presentation

During the Presentation

The presentation should have the following:

  • A brief introduction describing the purpose of your database application
  • A demonstration of your project that includes:
    • A multi-form sequence where information is retained across pages
    • User authentication
    • User-error handling
  • A description of your database design
    • A list of table names
    • A brief description of the data that the tables contain
  • A demonstration of any extra-credit features
    • Point out the extras so we can all appreciate them
  • Feel free to display your written report during the presentation
  • Keep the presentation to 15 minutes or less

After the Presentation

  • Feel free to leave (or stay) after your presentation
  • You can present to the instructor alone after the other presentations are through

Wrap Up

    Reminders

    Due Next Class: Final Project Report and Presentation (12/13/04)

  • When class is over, please shut down your computer
  • There is no need to turn in this week's exercises
  • Work on your project!


Last Updated: December 08 2004 @15:38:41