An Intro To PDO

Posted 2012-07-15
Written by Matt Frost
Category code

So if you haven't heard or don't know yet, you shouldn't be using the MySQL extension anymore (mysql_connect, mysql_query, etc). If for no other reason, PHP is in the process (albeit slowly) of deprecating and removing the functionality, the more you continue to use it; the more mind-numbing work you'll have in front of you when that day comes. The way PHP has been really responding to feature requests and implementing some really good ones; the more I think that staying a version back to continue using this extension isn't a viable option. A lot of PHP/MySQL tutorials use the mysql extension to demonstrate functionality; which is fine, but you should know how to "translate" your existing code into PDO.

Setup Information PDO is available as PECL extension for PHP 5.0 and ships with PHP 5.1+. If you are not using PHP 5.0 for some reason, and I can't imagine what that reason would be...then you can't use PDO.

Prepared statements
Selecting things was always pretty easy, SELECT {columns} from {table} where {field}={a hopefully secured post/get variable}. The problem is a lot of people didn't do anything to their $_POST or $_GET variables and that is one of the reasons we hear about SQL Injection all the time. Prepared statements are a great way of protecting yourself from SQL injection and creating query templates that can be reused throughout your application. Let's consider the following scenarios:

$query = "SELECT * from table where id=$id";
$result = mysql_query($query);

In order to run this query securely, you'll have to make sure you run mysql_real_escape_string on the variable if you're getting it from the user. Forgetting to do so leaves yourself vulnerable to injection. Let's consider the prepared statement alternative.

$stmt = $dbh->prepare("SELECT * from table where id=:id");
$stmt->bindParam(':id',$id);

$id = 1;
$stmnt->execute(); //will excute the the query with 1 inserted for the ':id' placeholder

The nice feature here is that, in the event I have to use this query again with different values, I just have to assign the new $id variable again and execute the query again.

$id=2;
$stmt->execute(); //this will excute the prepared statement with the new value for $id

Now I can see you saying, wait what used to take me 2 line to do now takes me multiple lines, how is that better? You see those bindParam calls, they can call go away and be implemented directly in the execute method (with the statement above)

$stmt->execute(array(':id'=>$id));
Walla! Now your prepared statements require one statement and executing them with valid information requires one line of code. If you remember above, I don't have to remember the mysql_real_escape_string() everything I use.

Let's take a look at how that works to insert...

$insert = $dbh->prepare("INSERT into table (id,name,content) values(:id,:name,:content);
$insert->bindParam(':id',$id);
$insert->bindParam(':name',$name);
$insert->bindParam(':content',$content);
//assign values to the variables and... $insert->execute();
Now as you guessed, I don't have to use mysql_real_escape_string on any of this stuff, because the driver takes care of that for me! Being safe and efficient is super fun and cool!

Data Source Names vs. mysql_connect()
There aren't prepared statements though without a data connection. My annoyance with the mysql extension was always that it required two statements to connect to the database, for practical purposes the 2 calls were usually encapsulated into a connect method in your database layer. The fact is that using mysql_connect and mysql_select_db always struck me as a strange way to do things and I never really liked it. Enter the Data Source Name (DSN) - which is simply laid out as prefix:host=hostname;dbname=database. This string sets up what driver to use, on what host, and what database to connect to. Let's assume for the sake of example, that we want to use MySQL to connect to a database called application on localhost.

$dsn = 'mysql:host=localhost;dbname=application';
$user = 'user';
$password = 'password';
$dbh = new PDO($dsn, $user, $password);

Now you have your PDO object and you can go through the prepared statement process mentioned above, the mysql_connect and mysql_select_db are now merged into the constructor of the PDO Object.

Changing swiftly
Now let's imagine you have a version of your application that is done with a a MySQL datasource. As your application grows, you find that you need to switch from MySQL to Postgres (not suggesting this is what you should do by any means, just using MySQL and Postgres as examples of switching data sources) - if you had to make this change with the mysql extension, you would have to replace all the mysql_*() calls with the extension for the database you're connecting to. Not only that, but you have to investigate the calls to make sure that they take the same parameters in the same order, doesn't sound like fun to me. If you abstracted your database layer, you could always just provide a new library and provide the implementations for your abstract database class; my guess is that if you're here you haven't done this.

PDO takes care of a lot of this abstraction for you, now keep in mind it's not all roses. In the example provided, we'll know that we're creating our PDO objects with the DSN, Username, and Password arguments, but in Postgres we can concatenate the the user/pass into the DSN.

pgsql:host=localhost;dbname=application;user=user;password=password

so you'll end up having to refactor your instantiations of the PDO object. Depending on the state of your code base, this can be an undertaking as well; but I'm sure you'd agree it's certainly less work than having to replace every single function call.

You won't ever change data sources you say? Can you be so sure? It can be very dangerous to rely on something you don't have control over...

PDO vs mysqli
PDO isn't your only option in this fight, mysqli is available and is also a much better alternative that the MySQL extension. It's my view that PDO is superior to mysqli for the flexibility reason mentioned above. MySQLi is only going to work with MySQL, where PDO can work with a number (12 right now I believe) of different DB Engines. You never know when management is going to want to change engines on you and if you can be prepared for something you can't control; that puts you ahead of the game. When you consider that the APIs are very similar in how they operate, that gives PDO the edge in my book. I do want to reiterate that MySQLi is a very viable option and certainly is better than the mysql extension. For more info on how to use mysqli, check out this piece from codular.com

Conclusion
PDO is a huge step up from the mysql extension, the fact that it be injected into other objects using Dependency Injection right out of the box is a huge plus. The work that would generally go into doing this involves wrapping the mysql functionality - why do that if you don't have to? It may be unreasonable for you to go back through a large code base and remove all the use of the mysql extension; this is assuming that you've taken all the proper precautions to protect yourself against SQL injection. If that's you, you need this knowledge going forward. As a quick recap, this is why you should use PDO over the MySQL extension:

  • PHP is going to be deprecating/removing the mysql extension
  • Escaping user input by binding parameters eliminates the need to manually escape variables
  • PDO encapsulates the details of different data sources - so changes can be made without tearing your code apart.

Comments

Gravatar
Jacques

2012-07-17

Great post and overview of the PDO. I actually haven't heard of the move from the MySQL extension to the PDO until now- yes I've been living under a rock.

Thanks for making the post.

Posting comments after has been disabled.