PHPitfalls: Five Beginner Mistakes to Avoid

By Daryl L. L. Houston

Published on October 6, 2004

What, a beginner make mistakes?

Cruising through developer forums, we’ve all seen it: A new post with a subject line that reads:

newbie needs URGENT help

and body text something like:

omg iv been working on this for 4 days solid and afraid im going to get fired. i dont know php and my boss asked me to write a shopping cart. i cant seem to make it work and keep geting errors can you PLEASE PLEASE help me. i have to finish this today.

Although such posts—filled with typographical and grammatical errors, and lacking code snippets, a descriptive subject line, and those pesky error messages—are among the most irritating sort, they often turn out to be useful markers for threads filled with beginner mistakes.

It’s not only in programming that the uninitiated stumble, of course. Who didn’t teeter into a ditch full of water when learning to ride a bike or accidentally hang up on someone in the office when trying to transfer a call for the first time? There’s nothing wrong with making beginner mistakes if you’re a beginner, but we might as well try to head some mistakes off at the pass.

My perusal of various developer forums has helped me spot several trends in beginner mistakes, and I’ll address some of the most common problems I’ve run across while sifting through nearly incomprehensible forum postings in search of beginners in distress.

Comparison != Assignment

The equal sign is egalitarian in that, used incorrectly, it’ll break anybody’s code. It also happens to be a potentially confusing operator because it’s used for both comparison and assignment. Further confusing things is the fact that the evaluation of assignment operations can be used in logical comparison operations. Let’s consider an example:

/* Set two variables to the results of an assignment operation. */
$true_val=($temp = true);
/* Assignment evaluates to true */
$false_val=($temp = false);
/* Assignment evaluates to false */
/* Test true_val to confirm what it evaluated to. */
if($true_val==true){ print "\ntrue_val is true\n"; }
else{ print "\ntrue_val is false\n"; }
/* Test false_val to confirm what it evaluated to. */
if($false_val==true){ print "\nfalse_val is true\n"; }
else{ print "\nfalse_val is false\n"; }

The script produces the following output:

true_val is true
false_val is false

In the code, we set the variables $true_val and $false_val to the results of an assignment operation. What I’m demonstrating here is that an assignment operation can evaluate to a true or a false value, which is where the fact that the equal sign can be used in both assignment and comparison operations can be problematic. Take the following code, with only one change in the second if statement, for example:

/* Set two variables to the results of an assignment operation. */
$true_val=($temp = true);
/* Assignment evaluates to true */
$false_val=($temp = false);
/* Assignment evaluates to false */
/* Test true_val to confirm what it evaluated to. */
if($true_val==true){ print "\ntrue_val is true\n"; }
else{ print "\ntrue_val is false\n"; }
/* Test false_val to confirm what it evaluated to. */
/* Note that we've changed the comparison operator (==) to the assignment operator (=). */
if($false_val=true){ print "\nfalse_val is true\n"; }
else{ print "\nfalse_val is false\n"; }

The modified code produces the following result:

true_val is true
false_val is true

Because the second if statement contains an assignment that evaluates to true rather than a comparison operation, the condition is always true and the else block we intended to run never executes. This can be a particularly tricky error to catch because it’s a logic error rather than a syntax error. That is, it skews the results of your program without causing the interpreter to halt with a parse error. Moreover, it’s a hard error to catch because it shows up only some of the time. There are many cases in which it’s perfectly valid for a given if statement to evaluate to true. In these cases, the error won’t manifest as an error. In the other cases, you’ll simply notice that the else block, which should be executed, isn’t being executed. Because the error tends to surface sporadically, it can be very hard to trace.

The moral of the story is that you should type very carefully when using logical operators in if statements. And if you find your code skipping an else block inexplicably, be sure to look up a few lines at the comparison statement in the if block to make sure it’s not always evaluating to true thanks to an omitted equal sign.
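One defensive habit, offered here as a sketch rather than something this article prescribes, is to put the literal on the left-hand side of the comparison. If an equal sign then goes missing, PHP refuses to parse the line instead of quietly performing an assignment. The variable name is borrowed from the example above, and the strict === operator is an extra precaution.

```php
<?php
// Flag borrowed from the example above, used only for illustration.
$false_val = false;

// With the literal on the left, an accidental single "=" would read
// "if (true = $false_val)", which is a parse error rather than a
// silent assignment.
if (true === $false_val) {
    $message = "false_val is true";
} else {
    $message = "false_val is false";
}

print $message . "\n";
```

Run as written, the else branch is taken, which is the behavior the original comparison intended.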

Where there’s scope, there’s hope

One reason PHP is so easy for non-programmers to pick up is that it is very lax about variable types and namespaces. A given variable can contain any data type of any length and can then have data of any type and length reassigned to it without producing an error. Compare this to strictly-typed languages in which a variable declaration must include a type and a size. The larger a Web application gets, the greater the likelihood of variable name collision, which, like assignment/comparison operator confusion, can cause non-fatal, hard-to-trace errors. Imagine an application that pulls in three includes, as follows:

include('test_one.php');
include('test_two.php');
include('test_three.php');

The code for test_one.php runs as follows:

$name='My Web Application';

The code for test_two.php runs as follows:

if($show_name==true){ $name='John Doe'; }

The code for test_three.php runs as follows:

print $name;

As the code currently stands, this page will print out:

My Web Application

But say somebody comes along later and, not seeing the if statement in test_two.php, adds some code to test_one.php that sets a variable named (coincidentally) $show_name to true. The output, which was originally intended to display an application name, will display a person’s name. Now, this particular example isn’t very likely to occur because it includes a tiny and very simple code base, but it is very common indeed for programmers to use more collision-prone variable names ($temp, $x and $count, for example) that are easily overlooked later, especially as code begins to fill more lines and files. In some cases, the sort of value replacement demonstrated here is perfectly valid, but it often happens inadvertently thanks to careless naming conventions, and it can be a real bear to hunt these errors down in larger applications.

The most workable solution is to refer to scoped variables within their scope and to scope your variables if they’re unscoped. For example, although many server configurations allow you to refer to GET and POST variables by name (the form field “address” can be referred to as “$address”), you can make your code clearer and less prone to variable name collisions if you use the $_GET and $_POST arrays to refer to these variables. The form field “address” should be referred to as $_GET['address'] or $_POST['address']. If you refer to the values within the scope of these arrays rather than by their bare names, then there’s no chance that you’ll later overwrite a needed value by carelessly using $address as a variable name for another purpose.
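As a minimal sketch of that advice, the snippet below reads the “address” field from $_GET while an unrelated $address variable exists alongside it. The assignment to $_GET at the top only simulates a request so the script can run from the command line, and the values and the $form_address name are made up for illustration.

```php
<?php
// Simulate a request such as ?address=1313+Mockingbird+Lane for a
// command-line run; on a web server PHP fills $_GET itself.
$_GET['address'] = '1313 Mockingbird Lane';

// Unrelated local variable that happens to share the field's name.
$address = 'mail.example.com';

// Read the form field from its own array, falling back to an empty
// string if the field was not submitted.
$form_address = isset($_GET['address']) ? $_GET['address'] : '';

print $form_address . "\n";
```

The two values never touch each other, because one lives in the $_GET array and the other in a plain variable.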

Where no default scope exists, you can create your own. I often do this by writing classes to contain and manipulate related data. You can also do this by storing certain sets of data in arrays. For example:

$application_info=array( "name" => "My Web Application", "url" => "http://localhost/myapp.php" );
$user_info=array( "name" => "John Doe", "address" => "1313 Mockingbird Lane" );

In this example, we have overlapping array keys where different types of data have overlapping attributes, but we’ve scoped their values so it’s easy to know when we’re dealing with application information and when we’re dealing with user information. Naturally, you could achieve a similar result by naming simple variables carefully—$app_name and $user_name, for example—but as Web applications become more advanced, more complex data structures such as arrays and objects become more and more useful.
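To make the point concrete, here is a short sketch that reads the overlapping “name” key from each array. The $title and $greeting variables are hypothetical names added for this illustration.

```php
<?php
$application_info = array(
    "name" => "My Web Application",
    "url"  => "http://localhost/myapp.php"
);
$user_info = array(
    "name"    => "John Doe",
    "address" => "1313 Mockingbird Lane"
);

// Both arrays have a "name" key, but the array name says which one is meant.
$title    = $application_info['name'];
$greeting = 'Welcome, ' . $user_info['name'];

print $title . "\n" . $greeting . "\n";
```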

Speaking in tongues

Like all languages, human or machine, PHP can be mixed with other languages only under certain circumstances. If your boss suddenly began speaking Tagalog mid-sentence and then jumped back to English, there would be a disconnect for most people (and it could be said that the conversation needed debugging). The same is true of mixing PHP with other languages.

Most cases of language confusion I’ve seen have involved the intermingling of PHP and JavaScript. Consider the following snippet, for example:

<script language="JavaScript">
<?php $name=get_name(); ?> document.write(name);
</script>

Here, the developer uses PHP to set a variable’s value and then attempts to access this PHP variable in a JavaScript function. I’ve seen similar mistakes with the document.write() call using name and $name. In the former case, name is a valid JavaScript variable format, but the variable hasn’t been defined within the JavaScript namespace. In the latter, the PHP variable notation can’t be understood by the JavaScript parser. The developer has interspersed PHP within JavaScript in a nonsensical way and will net nonsensical results. Two simple ways of getting around this issue are readily available:

<script language="JavaScript">
document.write('<?php echo get_name(); ?>');
</script>

In this case, we’re echoing the results of the (fake) PHP get_name() function within our JavaScript, but because we’ve enclosed the function call in <?php ?> tags and are echoing the result, the PHP code is evaluated and has its value written in the JavaScript inline. In the example below, we’re doing something similar, but rather than dumping the evaluated result of the PHP code in our document.write() JavaScript call, we’re using it to assign a value to a JavaScript variable, which we then use in the document.write() call.

<script language="JavaScript">
var name='<?php echo get_name(); ?>';
document.write(name);
</script>

The distinction many people who run into this problem miss is that JavaScript and PHP have different namespaces and are in fact run at entirely different times. PHP is processed on the server side and is used to generate HTML and JavaScript code. Accordingly, JavaScript code can’t be mingled with PHP code except as a string value to be printed out by the PHP code. Similarly, PHP code can’t be included raw within JavaScript code, though it can be embedded within JavaScript code in a PHP page if enclosed within <?php ?> tags. PHP code is interpreted on the Web server and JavaScript is interpreted by the browser. Therefore, PHP has to have finished executing before the JavaScript it generated ever runs in the browser.
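One caveat with echoing a value between hand-written quotes is that a quote inside the value would end the JavaScript string early. A possible way around this, sketched here under the assumption that PHP’s json_encode() function is available, is to let PHP produce the whole JavaScript string literal. The get_name() function below is a stand-in for the article’s fake one, and the script is built as a PHP string only so the generated output is easy to inspect.

```php
<?php
// Stand-in for the article's (fake) get_name() function.
function get_name() {
    return "Conan O'Brien";
}

// json_encode() adds the surrounding quotes and escapes characters that
// would otherwise end the JavaScript string early.
$script = "<script>\n"
    . "var name = " . json_encode(get_name()) . ";\n"
    . "document.write(name);\n"
    . "</script>\n";

print $script;
```

By the time the browser sees this, all that remains is ordinary JavaScript with the name already written in as a string.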

This stuff is tricky to wrap your brain around. Let’s go back to our example of the Tagalog-interjecting boss and modify the scenario slightly so that she’s not speaking to you but is instead reading to you from an email she received. As she’s reading aloud, she notes that the text changes mid-sentence into Tagalog, and she knows that this will never do, as you speak only Swahili, Esperanto, Catalan, and English. Luckily, she does speak Tagalog, and she’s able to do a translation on the fly and read the email to you in seamless English. Your boss here is acting as the PHP engine reading over a PHP file including some JavaScript that has well-formed PHP calls embedded in it. As PHP engine/translator, she evaluates the PHP calls/Tagalog and reads out to you (the browser) the interpreted version.

To err is human, to debug divine

We all make mistakes—that’s precisely why compilers and interpreters are equipped to display error messages. These messages can be hard to decipher, though, and there are several types of errors I’ve seen beginners stumble over.

Perhaps the most common is the “unexpected T_STRING” error. For example, this code:

<?php
/* Build a greeting from two pieces. */
$x="he;
$y="llo";
$greeting=$x . $y;
?>

Generates this error:

Parse error: parse error, unexpected T_STRING on line 4.

But line four, which reads $y="llo";, appears to be well formed. In fact, it is. What many new programmers don’t understand is that compilers and interpreters can report syntax errors but can’t always interpret them as humans do. In this example, a human will see that the error actually occurs on line three, where the closing double quote is omitted. The PHP interpreter, however, sees line three as perfectly valid (quoted text can spill over multiple lines, after all) and treats everything up to the next double quote, the one that opens "llo" on line four, as a single string. The bare word llo that follows is what it can’t make sense of, so it registers an error on line four. PHP interprets the end result rather than the intention, in other words, but because human beings are inclined to think in terms of intent, error messages like this can be difficult to understand. Many people have stared at and tweaked line four for an hour before resorting to a message board for help.

The key with such errors (and there’s a whole family of them, including the “unexpected T_PRINT” and “unexpected T_VARIABLE” siblings) is not to get stuck looking at the line number the interpreter spits back out at you. In most cases, the line just above the reported line number is actually the one causing the problem.

Another very common stumbling block for beginner debuggers is the display of non-fatal warning messages. Consider the following code:

<? print "Non-existent variable: " . $non_existent; ?>

Under the right (or maybe wrong) circumstances, this code produces the following output:

Notice: Undefined variable: non_existent in /var/www/html/error_example.php on line 3
Non-existent variable:

This non-fatal error occurs because we have attempted to use a variable ($non_existent) that we haven’t initialized. There are several ways to get rid of that obnoxious error message. We could initialize the variable by adding the line $non_existent=''; above the existing line, for example. Or we could add an if statement that checks to see if $non_existent has been set. These solutions fix the issue programmatically. Generally, this isn’t the sort of issue that should cause alarm or merit a great deal of attention, and for such cases, there are other solutions. For instance, prepending @ to an expression (most often a function call) suppresses its error output, so changing our line to @print "Non-existent variable: " . $non_existent; is a quick fix.
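The if-statement approach might look something like the following sketch. The “(not set)” placeholder text and the $line variable are made up for the illustration.

```php
<?php
// $non_existent is deliberately never assigned in this script.
if (isset($non_existent)) {
    $line = "Non-existent variable: " . $non_existent;
} else {
    // isset() does not trigger a notice for an undefined variable.
    $line = "Non-existent variable: (not set)";
}

print $line . "\n";
```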

However, error messages are produced for a reason, so you need to be very careful about suppressing them. It’s common practice to display errors during development and to suppress them once an application goes into production. Such a practice renders prepending @ to function names impractical, as all function calls would have to be edited before going into production. Two other common practices can really help maximize debugging ability while minimizing debugging output once in production.

The first is to use the @ syntax in conjunction with die() statements. Consider a case in which you’re fetching results from a database. In some instances, you may get notices or warnings that don’t hinder the program execution but cause ugly display issues, and in others, you may get valid errors. Prepending @ to the mysql_query() function (and other relevant functions) will suppress error messages, and adding a die() statement (so $result=@mysql_query($params) or die(mysql_error())) to be executed if the function returns false allows you to handle fatal errors.
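The same pattern works with any function that signals failure by returning false. Because a database example needs a live server, the sketch below substitutes fopen() on a path chosen so that the call fails. The path and the $status variable are hypothetical, and a message is recorded where the article’s version would call die().

```php
<?php
// Hypothetical path chosen so that the call fails.
$path = sys_get_temp_dir() . '/phpitfalls_no_such_dir/missing.txt';

// @ hides the warning fopen() would otherwise emit; the return value
// still tells us whether the call worked.
$handle = @fopen($path, 'r');

if ($handle === false) {
    // The article's pattern would call die() here; this sketch records
    // a message instead so the script keeps running.
    $status = "Could not open file";
} else {
    $status = "Opened file";
    fclose($handle);
}

print $status . "\n";
```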

The second practice is to set the error reporting level in your php.ini config file. The config file does a good job of documenting how to handle error reporting, but I’ll cover the basics here. The directives involved are error_reporting and display_errors. The former governs what types of errors PHP reports, and the latter governs whether or not to display them at all. The display_errors directive takes simply “On” or “Off” and does or doesn’t display errors accordingly. You have more granular control with error_reporting, which takes an argument defining exactly what kinds of errors to catch. If you set it to error_reporting = E_ALL, for example, all errors will be reported (including the harmless undefined variable notice shown above). But if you set it to error_reporting = E_ALL & ~E_NOTICE, PHP will catch all errors except notices. Similarly, error_reporting = E_ERROR|E_PARSE will catch fatal run- and compile-time errors but will produce no output for others. So, by specifying the granularity of error reporting across the board and then turning display_errors off once in production, you can avoid unsightly output in a production environment without compromising your ability to develop with the full benefit of error messages within a development environment.
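If you can’t edit php.ini, the same two settings can usually be changed at the top of a script with the error_reporting() and ini_set() functions. This is a rough sketch. The $in_production switch is a hypothetical flag that a real application would read from its own configuration.

```php
<?php
// Hypothetical switch; a real application would read this from its
// own configuration.
$in_production = false;

// Report everything except notices, matching the second php.ini
// setting described above.
error_reporting(E_ALL & ~E_NOTICE);

// Show errors while developing, hide them in production.
ini_set('display_errors', $in_production ? '0' : '1');

print "error_reporting level: " . error_reporting() . "\n";
```

Calling error_reporting() with no argument returns the current level, which is handy for confirming that the setting took effect.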

Laziness is a virtue, or: RTFM

Perl creator Larry Wall suggests that the three cardinal virtues of a programmer are laziness, impatience and hubris. The general idea is that programmers tend to be too lazy to make existing tools jump through the necessary hoops to get a job done, they’re impatient about the time involved to do so, and they have the hubris to believe that they can do the job better themselves. I think he’s got it pretty much right, but there are times when it’s appropriate to be lazy in a different way.

The first time I tried to write anything in PHP, I hadn’t really done my homework. I knew that the control structures were more or less like those of other languages I had played around with, and I knew that variables looked a lot like Perl’s variables, and that was about it. When confronted with the need to extract variables from the query string, I didn’t know of an existing mechanism for doing so. So, I wrote my own little code snippet that grabbed the URL, split on the question mark and then split the second element of that array on ampersands. Each of these elements I split on equal signs, and voilà!, I had my query string variables. I was able to use this code without any problems for the application in question, but the values weren’t validated or decoded and I had no mechanism for grabbing POST variables if they existed. Had I known that a mechanism for grabbing these variables without any work existed (using the $_GET and $_POST arrays), I could not only have saved myself some time but could also have had more reliable code doing my work for me. So while it’s often beneficial to take Wall’s virtues to heart and do the grunt work yourself (as paradoxically lazy as that seems), for most things a beginner developer is going to try to do, more reliable solutions for fundamental functionality have already been developed. In this case, laziness consists not of going out of your way to write code to get around having to work laboriously with existing code, but rather of taking a few minutes to review the PHP manual to see if what you’re wanting to do has been done already.
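Inside a page request, $_GET already holds the decoded values, so there is nothing to write at all. For a URL held in a string, the built-in parse_url() and parse_str() functions do the same splitting and decoding my hand-rolled snippet attempted. Here is a small sketch, using a made-up URL:

```php
<?php
// Hypothetical URL used only for this illustration.
$url = 'http://localhost/myapp.php?name=John+Doe&address=1313%20Mockingbird%20Lane';

// parse_url() pulls out the query string; parse_str() splits it on
// ampersands and equal signs and URL-decodes each value.
$query = parse_url($url, PHP_URL_QUERY);
parse_str($query, $params);

print $params['name'] . "\n";
print $params['address'] . "\n";
```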

Daryl L. L. Houston is a literature geek by training and a computer geek by trade. He’s done public Web site and private business systems programming for well-known and respected Fortune 500 companies in recent years. PHP is his weapon of choice, but he’s done production work in Perl, Python, and JSP as well.