14. Networking and Other Topics

What We Will Cover


Continuations

Questions from last class?

Making Select Lists

  • The algorithm for making a select list from a SQL query is straightforward:
    1. Write a SQL query that returns the rows with the data you need.
       $sql = "SELECT ProductName FROM Products";
    2. Use a loop to extract the data from the result set and create an array.
       $prodList = array();
       while ($row = mysql_fetch_assoc($result)) {
           $prodList[$row["ProductName"]] = $row["ProductName"];
       }
    3. Use the array to code the HTML data.
       echo $f->makeSelect('products', $prodList);
  • Here is an example querying the products table of the Artzy database:

<?php
include_once("includes/formlib.php");

$title = "Select DB Test Page";
main($title);

// Control the operation of a page
function main($title = "") {
    $f = new FormLib("red", HORIZONTAL);
    require("includes/header.php");
    showContent($title, $f);
    require("includes/footer.php");
}

// Display the content of a page
function showContent($title, $f) {
    require_once("includes/dbconvars.php");
    $dbCnx = mysql_connect($dbhost, $dbuser, $dbpwd)
        or die("Could not connect: ".mysql_error());
    mysql_select_db($dbname, $dbCnx)
        or die("Could not select database: ".mysql_error());
    $sql = "SELECT ProductName FROM products";
    $result = mysql_query($sql)
        or die("Query failed: ".mysql_error());
    // Start with a placeholder entry, then add one entry per row
    $prodList = array("- Select  One -"=>"");
    while ($row = mysql_fetch_assoc($result)) {
        $prodList[$row["ProductName"]] = $row["ProductName"];
    }

    echo "<h1>$title</h1>\n";
    echo $f->reportErrors();
    echo $f->start();
    echo $f->makeSelect('products', $prodList);
    echo $f->makeButton("Select");
    echo $f->finish();
}
?>

Try it and view the source before and after you press the Select button.
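
The FormLib class is not shown in these notes, so the following is only a minimal sketch of what a method like makeSelect() might do with the array. The function name makeSelectHtml(), its output format, and the product name "Easel" are assumptions made for illustration. They are not the actual FormLib code or Artzy data.

```php
<?php
// Hypothetical stand-in for FormLib's makeSelect(); not the real library code.
// $options maps the visible label to the value submitted with the form.
function makeSelectHtml($name, $options) {
    $html = '<select name="' . htmlspecialchars($name) . '">' . "\n";
    foreach ($options as $label => $value) {
        // Escape both parts so labels containing & or < do not break the page
        $html .= '  <option value="' . htmlspecialchars($value) . '">'
               . htmlspecialchars($label) . "</option>\n";
    }
    return $html . "</select>\n";
}

// Same array shape as $prodList in the example above (made-up product name)
$prodList = array("- Select One -" => "", "Easel" => "Easel");
echo makeSelectHtml('products', $prodList);
```

Viewing the page source should show one option element per array entry, which is a quick way to confirm that the loop over the result set worked.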

14.1: Improving Verification With Patterns

Objectives

At the end of the lesson the student will be able to:

  • Use PHP pattern-matching functions
  • Code regular expressions to match string patterns

14.1.1: About Regular Expressions

  • Many programming problems require matching a pattern in string variables
  • Verifying the data received from HTML forms is one such problem
  • For example, if you are expecting an email address such as john.doe@hotmail.com, your script needs to verify that the string meets the requirements for email addresses

Regular Expression Standards

  • There are two main standards for regular expressions: POSIX and Perl
  • PHP supports both standards
  • We will use Perl-compatible functions and focus on using preg_match()

Commonly Used Pattern-Matching Functions (Perl Compatible)

Function        Description
preg_match()    Searches a string for matches to a regular expression.
preg_replace()  Searches a string for matches to a regular expression and replaces them with the specified text.
preg_split()    Searches a string for boundaries matched by a regular expression and splits the string into an array of strings along the boundaries.
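
Since the rest of this lesson concentrates on preg_match(), here is a brief sketch of the other two functions in the table. The sample string and the replacement text are arbitrary choices for illustration.

```php
<?php
$subject = "She sells sea shells";

// preg_replace(): replace every match of the pattern with new text
$replaced = preg_replace('/se/', 'SE', $subject);
echo $replaced . "\n";   // She SElls SEa shells

// preg_split(): split the string wherever the pattern matches (here, a space)
$words = preg_split('/ /', $subject);
print_r($words);         // four elements: She, sells, sea, shells
```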

14.1.2: Using the preg_match() Function

  • You use preg_match() to search a string for matches to a regular expression
  • If the regular expression pattern matches a part of the string, then it returns the number 1 (meaning true)

Basic Syntax

int preg_match(string pattern, string subject)
  • pattern: regular expression pattern
  • subject: the string to search for pattern matches
  • returns the number 1 if the pattern matches, 0 if it does not, or false if an error occurs

For Example

<?php
$pattern = "/se/";
$subject = "She sells sea shells";
$found = preg_match($pattern, $subject);
if ($found) {
    echo "Matches";
} else {
    echo "No match";
}
?>

  • Put the regular expression pattern between forward slashes: / /
  • If the pattern "se" is found, then $found is set to the number 1
  • Otherwise, $found is set to the number 0

14.1.3: Using Regular Expressions with preg_match()

  • PHP has a special set of pattern-matching characters (meta characters)
  • These characters form a small language with each character having a special meaning
  • These characters are part of an industry standard

Commonly Used Pattern-Matching Characters

Symbol  Description
^       Matches when the characters that follow start the string
$       Matches when the preceding characters end the string
*       Matches zero or more occurrences of the preceding character
+       Matches one or more occurrences of the preceding character
?       Matches zero or one occurrence of the preceding character
.       A wildcard symbol that matches any one character
|       Alternation symbol (OR) that matches either the pattern on the left or the right

For Example

  • We can test our regular expressions with a simple form script
  • To match when starting with "She": ^She
  • To match when ending with "shells": shells$
  • To match an "se" followed by one or more l's: sel+
  • To match an "se" followed by zero or more a's: sea*
  • To match any character followed by an "e": .e
  • To match "He" or "She": He|She
  • Note that you can ignore case by putting an "i" after the closing slash
  • /^she/i
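
As a quick check of the patterns in this list, the following sketch runs each one against the sample string from the earlier example. The loop and the $results array are only one possible way to organize such a test.

```php
<?php
$subject = "She sells sea shells";

// Single quotes keep PHP from trying to interpret $ inside the patterns
$patterns = array('/^She/', '/shells$/', '/sel+/', '/sea*/',
                  '/.e/', '/He|She/', '/^she/i', '/^she/');

$results = array();
foreach ($patterns as $pattern) {
    $results[$pattern] = preg_match($pattern, $subject);
    echo $pattern . " => "
        . ($results[$pattern] ? "Matches" : "No match") . "\n";
}
```

Every pattern matches except the last one, /^she/, which fails because the subject starts with a capital "S" and the "i" modifier is missing.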

"Escaped" Character Literals

  • You can match one of the special characters
  • However, you must prefix it with the backslash character
  • /\.\*/
  • To match one backslash, your regular expression should include "\\"
  • The backslash is also used to specify non-printing characters like:
  • Sequence Meaning
    \a Alert
    \f Formfeed
    \n Newline
    \r Carriage return
    \t Horizontal tab

  • Further Information: Backslash
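
Escaping gets confusing because PHP strings and regular expressions both use the backslash. The sketch below uses single-quoted strings, where PHP only treats \\ and \' specially. It also uses preg_quote(), a standard PHP function not covered in these notes, which adds the backslashes for you. The sample strings are made up for illustration.

```php
<?php
// Match a literal ".*" rather than "any characters"
$hasDotStar = preg_match('/\.\*/', 'list *.* files');

// The regex needs \\ to match one backslash; inside a single-quoted
// PHP string each of those backslashes must itself be doubled
$hasBackslash = preg_match('/\\\\/', 'C:\temp');

// preg_quote() escapes the special characters in ordinary text
$quoted = preg_quote('3.5*2', '/');
echo $quoted . "\n";   // 3\.5\*2

// The quoted pattern only matches the literal text, so "3x5*2" fails
$literalOnly = preg_match('/' . $quoted . '/', '3x5*2');
```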

14.1.4: Grouping Characters

  • Regular expressions use parentheses, curly brackets and square brackets to group characters
  • Each type of grouping character has a different meaning
  • You can combine these grouping characters with other special characters to get flexible and specific matching patterns

Using Parentheses to Group Characters

  • Use parentheses to group characters in a regular expression
  • For example, to match "Dave" or "David" in a string:
  • /Dav(e|id)/
  • To match "Dave" or "David" whether the name starts with a "D" or "d":
  • /(D|d)av(e|id)/
  • Further Information: Subpatterns
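
A short sketch of the second grouping pattern in action. The list of test names is made up for illustration.

```php
<?php
$pattern = '/(D|d)av(e|id)/';
$names = array("Dave", "david", "Davy", "Davis");

$results = array();
foreach ($names as $name) {
    // 1 when the name contains Dave, dave, David or david
    $results[$name] = preg_match($pattern, $name);
}
print_r($results);
```

"Davy" and "Davis" fail because the text after "Dav" is neither "e" nor "id".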

Using Curly Brackets to Specify Repetitions

  • You use curly brackets to specify a range of repetitions for the preceding character
  • You can specify a range of values such as between 3 and 5 z's
  • /^z{3,5}$/
  • You can specify a minimum value such as 3 or more z's
  • /^z{3,}$/
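  • You can specify a maximum value such as 3 or fewer z's by writing the lower bound of 0 explicitly (in most PHP versions {,3} is matched as literal text rather than treated as a repetition)
  • /^z{0,3}$/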
  • You can specify an exact value such as exactly 3 z's
  • /^z{3}$/
  • Further Information: Repetition
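
The following sketch tries three of the repetition patterns against strings of different lengths. The labels used as array keys are arbitrary names chosen for this example.

```php
<?php
$patterns = array(
    "3 to 5"    => '/^z{3,5}$/',
    "3 or more" => '/^z{3,}$/',
    "exactly 3" => '/^z{3}$/',
);
$subjects = array("zz", "zzz", "zzzzz", "zzzzzz");

$results = array();
foreach ($patterns as $label => $pattern) {
    foreach ($subjects as $subject) {
        // The ^ and $ anchors make the count apply to the whole string
        $results[$label][$subject] = preg_match($pattern, $subject);
    }
}
print_r($results);
```

Without the ^ and $ anchors, /z{3}/ would also match "zzzzzz", because three z's appear somewhere inside it.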

Using Square Brackets to Specify Character Classes

  • You use square brackets to specify a character class
  • Classes match only one of the characters found between the square brackets
  • For example, to match either sea or sel:
  • /se[al]/
  • A more common use is to specify a range of values to match
  • To specify a range, use a dash (-)
  • For example, to specify the numbers from 0 to 9: /[0-9]/
  • To specify a capital letter from "A" to "Z": /[A-Z]/
  • You can specify multiple ranges within one square bracket
  • /[0-9a-zA-Z_]/
  • When the caret symbol (^) appears first inside the square brackets, it reverses the meaning
  • Thus, to match any character not between 0 and 9: /[^0-9]/
  • Further Information: Square brackets
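
Character classes become more useful when combined with anchors and repetition. This sketch checks whether a string is made up only of letters, digits and underscores, and whether a string contains any non-digit. The variable names and sample values are made up for illustration.

```php
<?php
// One or more characters from the class, and nothing else
$wordChars = '/^[0-9a-zA-Z_]+$/';
$okName  = preg_match($wordChars, "user_42");   // 1
$badName = preg_match($wordChars, "user 42");   // 0, the space is not in the class

// A negated class finds any character that is NOT a digit
$hasNonDigit = preg_match('/[^0-9]/', "12a4");  // 1, the "a" matches
$allDigits   = preg_match('/[^0-9]/', "1234");  // 0, no non-digit found
```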

14.1.5: Building Regular Expressions That Work

  • Regular expressions are very powerful -- but can be almost unreadable
  • To build complex regular expressions, start with a simple expression
  • After a simple start, refine your regular expression incrementally
  • Build it one piece at a time and test each addition as you go

Incremental Refinement Example

  • This example incrementally builds a regular expression for form verification
  • We want to verify that a form field meets requirements for email addresses
  • The steps that follow detail a process for building this verification incrementally
  1. Determine the precise rules for your field

    john.doe@hotmail.com

    You determine what is valid and invalid input by examining email addresses and reading specifications. Some of the rules you come up with are:

    • User names can have almost any printable ASCII character
    • An @ symbol separates the user name from the domain name
    • Domain names can have letters, digits, and hyphens
    • Each part of a domain name is separated by a dot

  2. Set up your test environment

    Next you build a form with an element to verify and the receiving function. You decide to use the FormVerifier class and add a verification function like that shown. Make sure these work before you add regular expressions.

    function isEmailAddress($field, $msg) {
        $value = $this->getValue($field);
        $pattern = "/.+/";
        if(preg_match($pattern, $value)) {
            return true;
        } else {
            $this->addError($field, $value, $msg);
            return false;
        }
    }

  3. Code the most specific term possible

    You look at the rules and code the most specific line you can easily come up with. Then you test the regular expression to verify it works.

    $pattern = "/[_a-z0-9+.-]+@([a-z0-9-]+\.)+com/i";

  4. Set anchors if you can

    Add the ^ and $ anchors where possible. This prevents strings with extra characters before or after the acceptable pattern from being accepted.

    $pattern = "/^[_a-z0-9+.-]+@([a-z0-9-]+\.)+com$/i";

  5. Get more specific if you can, testing each addition carefully

    You may decide to restrict the top level domain (TLD) to only those authorized. This turns out to be quite complicated. Almost every two-letter combination is used by some country. In addition to the well-known generic TLDs of com, edu, net, org, mil and gov, there are many new TLDs: biz, info, name, coop, aero and museum. More are being suggested and adopted every year.

    We leave the coding of a TLD regular expression as an exercise for the student.
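
Testing each refinement is easier with a small list of addresses that should pass and fail. The following sketch checks the anchored pattern from the steps above. The test addresses are made up, and the pattern is written in single quotes, which produces the same regular expression as the double-quoted version.

```php
<?php
$pattern = '/^[_a-z0-9+.-]+@([a-z0-9-]+\.)+com$/i';

$addresses = array(
    "john.doe@hotmail.com",        // valid
    "John.Doe@Mail.Example.COM",   // valid: "i" ignores case, subdomains allowed
    "john.doe@hotmail",            // invalid: no .com at the end
    "john doe@hotmail.com",        // invalid: space in the user name
    "john.doe@hotmail.org",        // invalid until the TLD part is extended
);

$results = array();
foreach ($addresses as $address) {
    $results[$address] = preg_match($pattern, $address);
    echo $address . ": " . ($results[$address] ? "valid" : "invalid") . "\n";
}
```

Keeping a list like this lets you rerun the same checks after every change, so you notice right away if a refinement breaks an address that used to pass.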

14.1.6: Summary

  • Regular expressions enable a script to look for character patterns in a string
  • PHP provides many functions for working with regular expressions
  • The most useful function for verifying user input is preg_match()
  • Regular characters are matched in an expression like:
  • /She sells/i
  • Special "meta" characters are used to form a small language for matching patterns
  • You use parentheses to group characters
  • /(D|d)av(e|id)/
  • You use curly brackets to specify a range of repetitions for the preceding character
  • /^z{3,5}$/
  • You use square brackets to specify a character class:
  • /[0-9a-z_]/i
  • Regular expressions are very powerful -- but can be almost unreadable
  • You must build them carefully by starting with simple expressions that work
  • Refine and test your regular expression incrementally
  • Build it one piece at a time and test each addition as you go

Exercise 14.1

  1. Develop a regular expression for verifying top-level-domain (TLD) names.
  2. Make sure your regular expression works with the email pattern we have developed so far:
     $pattern = "/^[_a-z0-9+.-]+@([a-z0-9-]+\.)+com$/i";

  3. You may use this test script to test your changes.

<?php
main("Regular Expression Tester");

// Control the operation of a page
function main($title = "") {
    $msg = "";
    $pattern = "/^[_a-z0-9+.-]+@([a-z0-9-]+\.)+com$/i";
    $subject = "john.doe@hotmail.com";
    if (isset($_POST["submit"])) {
        $pattern = $_POST["pattern"];
        if (get_magic_quotes_gpc()) {
            $pattern = stripslashes($pattern);
        }
        $subject = $_POST["subject"];
        // @ suppresses warnings from malformed patterns typed into the form
        $found = @preg_match($pattern, $subject);
        if ($found) {
            $msg = '<font color="green">Matches</font>';
        } else {
            $msg = '<font color="red">No match</font>';
        }
    }
    showContent($title, $msg, $pattern, $subject);
}

// Display the content of a page
function showContent($title, $msg, $pattern, $subject) {
    echo<<<HTML
<html>
<head><title>$title</title></head>
<body>
<h1>$title</h1>
<p>Enter a regular expression in <b>Pattern</b>
<br>and the string to search in <b>Subject</b>
<br>and then press the <b>Test</b> button.</p>

<form method="POST" action="regexer.php">
<p>Pattern: <input type="text" name="pattern" value="$pattern" size="40">
<p>Subject: <input type="text" name="subject" value="$subject" size="40">
<p>$msg</p>
<p><input type="submit" name="submit" value="Test"></form>
</body>
</html>

HTML;
}
?>

14.2: Scripting the Internet

Objectives

At the end of the lesson the student will be able to:

  • Send email from PHP scripts
  • Work with URLs
  • Read and parse web pages

14.2.1: Sending Email

  • Sometimes you want to send email:
    • New password
    • Order confirmation
    • Survey results
  • PHP provides a function called mail() that sends e-mail via SMTP

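Basic Syntax

bool mail(string toAddress, string subject, string message)
  • toAddress: the email address of the recipient
  • subject: the subject line of the message
  • message: the body of the message
  • returns true if the message was accepted for delivery, otherwise false
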
For Example

<?php
$to = "someone@somewhere.com";
$subject = "Today's Wisdom";
$message = "
A Person Who Asks A Question
Is A Fool For Five Minutes.
A Person Who Doesn't
Is A Fool Forever";

echo mail($to, $subject, $message);
?>

Security Considerations

  • Do not use a web form for the toAddress
  • Also, do not read a form variable for the toAddress
  • This would let anyone use your mail server to send anything
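
One way to follow this advice while still letting users choose who receives a message is to have the form submit a short key and let the script look up the address. The sketch below is only an illustration. The form field name "dept", the function lookupRecipient() and the example.com addresses are all assumptions, not part of the course code.

```php
<?php
// Hypothetical lookup table: the form submits only a key such as "sales",
// and the script decides which address that key maps to
$recipients = array(
    "sales"   => "sales@example.com",
    "support" => "support@example.com",
);

// Returns the address for a known key, or null for anything else
function lookupRecipient($key, $recipients) {
    return isset($recipients[$key]) ? $recipients[$key] : null;
}

$key = isset($_POST["dept"]) ? $_POST["dept"] : "";
$to = lookupRecipient($key, $recipients);
if ($to !== null) {
    mail($to, "Question from the web site", "Message text goes here");
} else {
    echo "Unknown department";
}
```

Because the addresses live only in the script, a user who tampers with the form can at most pick a different entry from your own list.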

More Information

  • mail: PHP function documentation

14.2.2: Verifying Network Information

  • Sometimes you need to verify network information
  • For example, you want to verify that an email address or URL is valid
  • With PHP, you can look up hostnames, IP addresses and MX records
  • An MX record is short for mail exchange record
  • MX records are stored in the DNS and are looked up like a hostname
  • If no MX record exists, there is nowhere for the email to go
  • There can be more than one MX record, so the function getmxrr() returns an array
  • Note that getmxrr() is not implemented on Windows

Commonly Used Functions to Verify Network Information

Function                    Description
gethostbyaddr(ipAddress)    Returns the host name of the Internet host specified by the string ipAddress.
gethostbyname(hostname)     Returns the IP address of the Internet host specified by the string hostname. If the lookup fails, returns hostname unchanged.
getmxrr(hostName, mxArray)  Fills mxArray with the MX host names found for an email hostName and returns true if any records were found. (Not implemented on Windows)
parse_url(url)              Returns from the URL string an associative array with the following indexes (if present): scheme, host, port, user, pass, path, query, and fragment.

Example Checking a URL

<?php
$url = "http://www.edparrish.com/cis165/04s/lesson13.php";
$urlArray = parse_url($url);
$host = $urlArray['host'];
// gethostbyname() returns the host name unchanged when the lookup fails
$ip = gethostbyname($host);
if ($ip != $host) {
    echo "Host for URL has a valid IP";
} else {
    echo "Host for URL does not have a valid IP";
}
?>
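
To see every index that parse_url() can return, it helps to try a URL that has all the parts. The URL below is made up for illustration and does not need to exist, because parse_url() only examines the string and never contacts the network.

```php
<?php
// Made-up URL containing every part that parse_url() can report
$url = "http://user:secret@www.example.com:8080/cis165/lesson.php?week=14#summary";
$parts = parse_url($url);
print_r($parts);

// Parts missing from the URL are simply absent from the array
$simple = parse_url("http://www.example.com/index.html");
$hasPort = isset($simple['port']);   // false
```

Note that the port comes back as an integer, and that you should test with isset() before using an index that may not be present.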

Example Checking an Email for MX Records

<?php
$email = "someone@totallyBogusEmailServerName.com";
// The host name is the part of the address after the @ symbol
$emailArray = explode('@', $email);
$emailHost = $emailArray[1];
$result = getmxrr($emailHost, $mxhosts);
if ($result) {
    echo "MX host exists";
} else {
    echo "MX host not found";
}
?>

14.2.3: Reading Pages from a URL

  • You can easily read a page from a URL
  • $page = file_get_contents($url);
    
  • Many of PHP's Filesystem functions work with Internet sources

Some Functions that Read from URL's

Function                Description
file(url)               Returns an array containing the contents read from the string url, with each element of the array corresponding to a line in the file.
file_get_contents(url)  Returns a string containing the contents read from the string url. Note: Needs PHP version 4.3 or later and so does not work on classroom computers.

Example Script to Read From a URL

<?php
$url = "http://www.edparrish.com/index.html";
$page = file_get_contents($url);
echo $page;
?>

14.2.4: Parsing a Web Page

  • You can use information from other parts of the web in your own pages
  • In general, the steps you follow are:
    1. Find an original source URL
    2. Read the information from the URL
    3. Parse (extract) the data you want to use
  • Finding the information might involve some detective work
  • We looked at how to read the information in the previous section
  • To parse the information, you often use regular expressions
  • Function preg_match() allows you to include an extra parameter for matches to the pattern

Syntax

int preg_match(string pattern, string subject, array matches)
  • pattern: regular expression pattern
  • subject: the string to search for pattern matches
  • matches: optional argument that is filled with the results of search

Example Script to Parse a URL

<?php
$symbol = "AMZN";
$url = "http://www.amex.com/equities/listCmp/"
      ."EqLCDetQuote.jsp?Product_Symbol=$symbol";
$page = file_get_contents($url);
// Matches a dollar sign followed by a decimal number, such as $12.34
$pattern = "/\\\$[0-9]+\\.[0-9]+/i";
if (preg_match($pattern, $page, $matches)) {
    echo "$symbol last sold at: ";
    echo $matches[0]."\n";
} else {
    echo "No quote available\n";
}
echo "<br>Information retrieved from:<br>\n"
    ."<a href=\"$url\">$url</a><br>\n"
    ."on ".(date('l jS F Y g:i a T'))."\n";
?>
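
You can experiment with the matches array without a network connection by testing against a string. In the sketch below, the HTML fragment and its prices are invented stand-ins for a downloaded page. The pattern adds parentheses, which also store each grouped part in the array.

```php
<?php
// Invented fragment standing in for a page read with file_get_contents()
$page = '<td>Last Sale</td><td>$36.42</td><td>Change</td><td>$0.15</td>';

// Single quotes keep PHP from treating $ as the start of a variable
$pattern = '/\$([0-9]+)\.([0-9]+)/';
if (preg_match($pattern, $page, $matches)) {
    echo "Whole match: " . $matches[0] . "\n";   // text matched by the full pattern
    echo "Dollars: " . $matches[1] . "\n";       // first parenthesized group
    echo "Cents: " . $matches[2] . "\n";         // second parenthesized group
}
```

preg_match() stops at the first match, so only the first price in the string is reported.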

14.2.5: Uploading Files

  • Most browsers let you upload files using the POST method
  • PHP is capable of receiving and processing uploaded files

An Upload Form

  • The following HTML creates a file upload form
  • <form action="upload.php" method="post" enctype="multipart/form-data">
    <input type="hidden" name="MAX_FILE_SIZE" value="15000">
    <br>Type (or browse to) Filename:<br>
    <input type="file" name="uploadFile"><br>
    <input type="submit" value="Upload File">
    </form>

  • The attribute enctype="multipart/form-data" is needed to load files into PHP's $_FILES superglobal array
  • The optional hidden field MAX_FILE_SIZE tells the browser the maximum file size to upload
    • The MAX_FILE_SIZE field must be placed before the file field
    • This field can be ignored by the browser and is easy to circumvent
    • Thus you will need to verify the value in your script as well
  • The file field creates the upload form element
  • <input type="file" name="uploadFile">

Processing the Uploaded File

  • The following is a minimal script to process an uploaded file
  • <?php
    define('UPLOAD_DIR', 'uploads/');
    $tmp_name = $_FILES['uploadFile']['tmp_name'];
    // basename() strips any directory part from the client-supplied name
    $name = UPLOAD_DIR.basename($_FILES['uploadFile']['name']);
    move_uploaded_file($tmp_name, $name);
    ?>

  • PHP first places the uploaded file in a temporary directory using a temporary name
  • Your code should move the file to its permanent location before the script finishes processing
  • move_uploaded_file($tmp_name, $name);
  • PHP stores all the uploaded file information in the $_FILES array
  • Each file has its own array of information in the $_FILES array
  • Thus, you need to specify both the form field name and the data element to retrieve the value
  • $tmp_name = $_FILES["uploadFile"]["tmp_name"];
  • Explanations of all the available data values are listed in the documentation for Handling file uploads

Error Checking and Validation

  • We need to both validate the file upload and check for errors
  • First of all, you should require a user to authenticate before uploading files
    • That way you can keep records of anyone attacking your uploading system
  • Note that move_uploaded_file() checks to ensure that the file is a valid upload file
  • If any error occurs, then the file is not moved
  • Thus you can check for errors easily with the following code
  • <?php
    define('UPLOAD_DIR', 'uploads/');

    $tmp_name = $_FILES['uploadFile']['tmp_name'];
    // basename() strips any directory part from the client-supplied name
    $name = UPLOAD_DIR.basename($_FILES['uploadFile']['name']);
    if (move_uploaded_file($tmp_name, $name)) {
        echo "File is valid, and was successfully uploaded.\n";
    } else {
        echo "Possible file upload attack!\n";
    }
    echo 'Here is some more debugging info:';
    echo '<pre>';
    print_r($_FILES);
    echo "</pre>";
    // Messages for the error codes PHP reports in $_FILES
    $error = array(
        0=>"There is no error, the file uploaded successfully",
        1=>"The uploaded file exceeds the upload_max_filesize directive in php.ini",
        2=>"The uploaded file exceeds the MAX_FILE_SIZE directive that was specified in the HTML form",
        3=>"The uploaded file was only partially uploaded",
        4=>"No file was uploaded",
        6=>"Missing a temporary folder"
    );
    echo $error[$_FILES['uploadFile']['error']];
    ?>

  • Part of the information returned about the file is an error code
  • $errorCode = $_FILES['uploadFile']['error'];
  • Error codes are explained in: Error Messages Explained
  • These codes can help you to check for user error and other problems
  • You can use the other file information to check file types and sizes as well
  • The following code shows how to check the file type and size before accepting the uploaded file

<?php
define('UPLOAD_DIR', 'uploads/');

if (($_FILES["uploadFile"]["type"] == "image/gif") &&
        ($_FILES["uploadFile"]["size"] < 15000)) {
    $tmp_name = $_FILES['uploadFile']['tmp_name'];
    // basename() strips any directory part from the client-supplied name
    $name = UPLOAD_DIR.basename($_FILES['uploadFile']['name']);
    if (move_uploaded_file($tmp_name, $name)) {
        echo "File is valid and was successfully uploaded.\n";
    } else {
        echo "Possible file upload attack!\n";
    }
    echo 'Here is some debugging info you can remove later';
    echo '<pre>';
    print_r($_FILES);
    echo "</pre>";
    // Messages for the error codes PHP reports in $_FILES
    $error = array(
        0=>"There is no error, the file uploaded successfully",
        1=>"The uploaded file exceeds the upload_max_filesize directive in php.ini",
        2=>"The uploaded file exceeds the MAX_FILE_SIZE directive that was specified in the HTML form",
        3=>"The uploaded file was only partially uploaded",
        4=>"No file was uploaded",
        6=>"Missing a temporary folder"
    );
    echo $error[$_FILES['uploadFile']['error']];
} else {
    echo "Sorry, we only accept .GIF images under 15 KB.";
}
?>
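
The checks in the script above can be collected into a function, which makes them easy to test without a browser. This is only a sketch. The function name checkUpload(), its messages and the hand-built test array are assumptions made for illustration. Keep in mind that the type value is reported by the browser, so it is a hint rather than proof of the file's contents.

```php
<?php
// Hypothetical helper: returns an error message, or "" if the entry is acceptable.
// $file is one entry of $_FILES, such as $_FILES['uploadFile'].
function checkUpload($file, $maxBytes, $allowedTypes) {
    if ($file['error'] != 0) {
        return "Upload failed with error code " . $file['error'];
    }
    if ($file['size'] >= $maxBytes) {
        return "File must be smaller than $maxBytes bytes";
    }
    if (!in_array($file['type'], $allowedTypes)) {
        return "File type " . $file['type'] . " is not accepted";
    }
    return "";
}

// A hand-built array in the same shape PHP uses, for testing without a form
$fake = array('name' => 'logo.gif', 'type' => 'image/gif',
              'tmp_name' => '/tmp/phpXXXX', 'error' => 0, 'size' => 12000);
echo checkUpload($fake, 15000, array('image/gif')) === "" ? "OK" : "Rejected";
```

In a real script you would call checkUpload($_FILES['uploadFile'], 15000, array('image/gif')) and only call move_uploaded_file() when the result is an empty string.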

Large File Uploads

  • PHP has a limit on file upload sizes -- usually about 2 MB
  • You can change this limit in the php.ini file
  • Also, the web server may limit the amount of information processed during one POST operation
    • Sometimes this limit is as low as 512 KB
  • To upload large file sizes, you may need another solution like Java or Perl
  • More info: PHP Upload Configuration
  • Includes links to other solutions like Applets and Perl scripts

14.2.6: Summary

  • PHP has numerous functions for using the Internet
  • PHP provides a function called mail() that sends e-mail via SMTP
  • Function parse_url() parses a URL and returns its various parts
  • You can use PHP functions to verify user-supplied information
  • gethostbyname(): returns the IP address of a host, if found
  • getmxrr(): returns the MX records for an email host, if found
  • Also, you can read entire pages off the web:
  • $page = file_get_contents($url);
    
  • Once you read the page, you can use regular expressions to extract information
  • preg_match($pattern, $page, $matches);
    
  • The information extracted is returned in the $matches array
  • echo $matches[0];
    
  • Browsers allow you to upload files using HTML forms
  • PHP can receive and process the uploaded files
  • You should require a user to authenticate before uploading files
  • That way you can keep records of anyone attacking your uploading system
  • Also, you need to both validate the file upload and check for errors

Exercise 14.2

  1. Modify the following script to extract information from a web page of your choosing.

    <?php
    $symbol = "AMZN";
    $url = "http://www.amex.com/equities/listCmp/"
          ."EqLCDetQuote.jsp?Product_Symbol=$symbol";
    $lines = file($url);
    $price = "";
    $pattern = "/\\\$[0-9]+\\.[0-9]+/i";
    // Search line by line and stop at the first price found
    foreach ($lines as $line) {
        if (preg_match($pattern, $line, $matches)) {
            $price = $matches[0];
            break;
        }
    }
    if ($price) {
        echo "$symbol last sold at: $price\n";
    } else {
        echo "No quote available\n";
    }
    echo "<br>Information retrieved from:<br>\n"
        ."<a href=\"$url\">$url</a><br>\n"
        ."on ".(date('l jS F Y g:i a T'))."\n";
    ?>

14.3: Finishing the Course

Objectives

At the end of the lesson the student will be able to:

  • Discuss the final preparation for the project presentation
  • Advise the instructor on how to improve future courses

14.3.1: About the Final Project Presentation

Important Final Exam Information

Date and Time: 7:00-9:50 P.M. Wednesday, May 31
Location: Room 516 (regular classroom)

Before the Presentation

  • Submit your project to WebCT before the presentation:
  • Bring a written report on paper to give to the instructor before the presentation

During the Presentation

The presentation should have the following:

  • Your name and your project's name
  • A brief introduction describing the purpose of your project
  • A demonstration and discussion of the user interface including:
    • Entry page
    • Page layout
    • Navigation features
  • A demonstration of a multi-form sequence where you pass information from one page to another
  • A demonstration of user-input error handling
    • Checking of form input for errors
    • Highlighting of errors so users easily see them
    • Explanation to user of how to correct errors
    • Retention of prior entries on error (except passwords)
  • A discussion or demonstration of user authentication
    • How the database is used for authentication
    • How passwords are encrypted in the database
  • A discussion or demonstration of security features
    • How data types are checked before insertion into a database
    • How data sizes are checked before insertion into a database
    • How taint checking of special characters is implemented (e.g. '"$#)
    • How special symbols and spaces do not cause database errors
  • A discussion or demonstration of cool features
    • Point them out so we can all appreciate them
  • Feel free to display your written report during the presentation
  • Keep the presentation to 10 minutes or less

After the Presentation

  • Feel free to leave (or stay) after your presentation
  • You can present to the instructor alone after the other presentations are through

14.3.2: Lecture Finale

  • During the semester we have covered many topics and learned at least two languages: SQL and PHP
  • With this knowledge you can develop professional-looking database-driven Web sites
  • Your project will allow you to demonstrate what you have learned
    • There is no substitute for a working application!
  • I hope that everyone has enjoyed taking the course as much as I have enjoyed presenting it
  • I am always open to suggestions for improving the course
  • In the survey you can provide anonymous feedback as well
  • After the survey, I will help anyone with final project problems

End of Course Survey

  • Please take a few minutes to answer this short survey
  • This will help the instructor to improve future courses
  • Survey respondent answers are anonymous
  • The link to WebCT is here

Wrap Up

    Reminders

    Due Next: Final Project Report and Presentation (5/31/06)

  • When class is over, please shut down your computer
  • There is no need to turn in this week's exercises
  • Work on your project!

Last Updated: May 22 2006 @16:39:43