How to use AuthComponent in CakePHP

September 24th, 2007

Wherever I went - IRC, Trac, Google Groups and certain blogs - I noticed people complaining about either not knowing how to use the AuthComponent or having problems with it.

After reading the few tutorials available on the blogs I usually rely on for my CakePHP needs, I realized there was a gap I could fill. The reasons are:

  1. Lack of a CPC way for quick and easy implementation.
  2. No real explanation or solution for using it in a real application environment (multiple controllers, beforeFilters, actions, etc.).
  3. Not covering the different authorization methods it’s capable of: default, controller, actions, crud, model and object.
I know I can’t cover everything at once, as I just don’t have the time for it, but I will start with the first two and probably cover the third in a series of posts.

CakePHP URL validation bug fix and enhancement

September 23rd, 2007

Update: 25/10/2007 Bug appears to have been fixed in the latest pre-beta release. Looks like this was the correct regex.

When I started parsing the millions of scraped Google pages, I came across a bug with the URL validation. To fix that, I overrode the url validation method in app_model.php:

   function url($check)
   {
      $validation =& new Validation;
      $validation->check = $check;
      $validation->regex = '/^((https?|ftps?|file|news|gopher):\/\/)?'  //protocol
                            . '('
                              . '(?:(?:25[0-5]|2[0-4]\d|(?:(?:1\d)?|[1-9]?)\d)\.){3}(?:25[0-5]|2[0-4]\d|(?:(?:1\d)?|[1-9]?)\d)' //ip 199.194.52.184
                              . '|' //ip or domain
                              . '([0-9a-z]{1}[0-9a-z\-]*\.)*' //subdomain(s) www. (the * quantifiers are assumed)
                              . '([0-9a-z]{1}[0-9a-z\-]{0,56})\.' //domain
                              . '([a-z]{2,6}|[a-z]{2}\.[a-z]{2,6})' //tld
                              . '(:[0-9]{1,4})?' //port
                           . ')'
                           . '('
                              . '\/?|' //ending-slash
                              . '\/[\w\-.,\'@?^=%&:;\/~+#]*[\w\-@?^=%&\/~+#]' //path
                           . ')$/i';
      return $validation->check();
   }
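
To try the pattern outside CakePHP, here is a minimal standalone sketch. It assumes the backslash escapes and the subdomain quantifiers were as reconstructed below, and looksLikeUrl() is a hypothetical helper name, not part of CakePHP:

```php
<?php
// Standalone sketch of the same pattern, using preg_match() directly
// instead of CakePHP's Validation class.
// Assumption: the escapes and the "*" quantifiers on the subdomain group
// are reconstructed; adjust them if your copy of the regex differs.
function looksLikeUrl($candidate) // hypothetical helper name
{
    $regex = '/^((https?|ftps?|file|news|gopher):\/\/)?' // protocol
        . '('
        . '(?:(?:25[0-5]|2[0-4]\d|(?:(?:1\d)?|[1-9]?)\d)\.){3}'
        . '(?:25[0-5]|2[0-4]\d|(?:(?:1\d)?|[1-9]?)\d)' // IPv4 address
        . '|'
        . '([0-9a-z]{1}[0-9a-z\-]*\.)*' // subdomain(s)
        . '([0-9a-z]{1}[0-9a-z\-]{0,56})\.' // domain
        . '([a-z]{2,6}|[a-z]{2}\.[a-z]{2,6})' // tld
        . '(:[0-9]{1,4})?' // port
        . ')'
        . '('
        . '\/?|' // optional trailing slash
        . '\/[\w\-.,\'@?^=%&:;\/~+#]*[\w\-@?^=%&\/~+#]' // path
        . ')$/i';

    return preg_match($regex, $candidate) === 1;
}

var_dump(looksLikeUrl('http://www.example.com/'));
var_dump(looksLikeUrl('192.168.0.1'));
var_dump(looksLikeUrl('not a url'));
```

The first two calls should be accepted and the last one rejected. Note that the port is only allowed after a domain name, not after an IP address, because of where the alternation sits.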

CakePHP’s advanced model fields validation

September 20th, 2007

After checking different blogs and tutorials, the bakery, API and IRC channel, it was obvious that some kind of documentation for the validation methods available in the Model was necessary. I can’t say that I will fulfill this mission but I’ll at least share what I came up with for future reference.

Here is the ‘User’ model I will be using in my example:

class User extends AppModel
{
   var $name = 'User'; //optional

   var $validate = array(
      'username' => array(
         array(
            'allowEmpty' => false,
            'required'   => true,
            'rule'       => 'alphaNumeric',
            'message'    => 'Username should only contain alpha-numeric characters.',
         ),
         array(
            'rule'    => array('between', 3, 10),
            'message' => 'Username should be between 3 and 10 characters long.',
         ),
         array(
            'rule'    => 'isUnique',
            'message' => 'Username is already in use.',
         ),
      ),
      'passwd' => array(
         'alphaNumeric' => array(
            'allowEmpty' => false,
            'required'   => true,
            'rule'       => 'alphaNumeric',
            'message'    => 'Password should only contain alpha-numeric characters.',
         ),
         'validLength' => array(
            'rule'    => array('between', 3, 10),
            'message' => 'Password should be between 3 and 10 characters long.',
         ),
      ),
      'website' => array(
         array(
            'rule'    => 'url',
            'on'      => 'update',
            'message' => 'Invalid URL.',
         ),
      ),
      'agree_tos' => array(
         array(
            'allowEmpty' => false,
            'required'   => true,
            'on'         => 'create',
         ),
      ),
   );
}

That’s a lot of validation rules, I know - I just wanted to try covering the multiple ways of using the Model->validates() method.

Domain TLD Parser

September 17th, 2007

Parsing URLs in PHP isn’t perfect. Don’t get me wrong here: it does the job when it comes to breaking the URL into logical parts, but it doesn’t have any option to parse the host into domain name, TLD and sub-domain(s). Most probably that is because new TLDs come out from time to time and they want to avoid having to update that same function with every new TLD release.
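
To make the limitation concrete, here is a small illustration using PHP’s built-in parse_url(). The URL is a made-up example:

```php
<?php
// parse_url() splits a URL into its components, but the host is returned
// as a single string: there is no separate domain, TLD or sub-domain key.
$parts = parse_url('http://blog.example.co.uk/archive/2007?page=2');

echo $parts['scheme'], "\n"; // http
echo $parts['host'], "\n";   // blog.example.co.uk
echo $parts['path'], "\n";   // /archive/2007
echo $parts['query'], "\n";  // page=2
```

Everything after this point (deciding that ‘co.uk’ is the TLD and ‘blog’ the sub-domain) is left to your own code.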

To overcome this limitation, and because I needed some way of extracting the domain, sub-domain and TLD out of any given URL, I came up with the following class: Domain TLD Parser

It parses hosts with all kinds of different TLDs, even the country-specific ones like ‘.co.za’, ‘.ne.jp’ or ‘.ltd.uk’. Here is an example:

<?php
// The referring URL sent by the browser (may be missing)
$url = $_SERVER['HTTP_REFERER'];
include('/path/to/domaintldparser.class.php');
$domain = new DomainTldParser;
echo '<pre>';
print_r($domain->parse($url));
echo '</pre>';
?>
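
For readers curious about how such a split can work, here is a rough sketch of the general idea. It is not the DomainTldParser class itself: splitHost() is a hypothetical function and the suffix list passed to it is a tiny hand-picked sample, whereas a real parser needs a complete list of TLDs.

```php
<?php
// Hypothetical sketch of suffix-based host splitting.
// Assumption: $knownSuffixes is a small illustrative list, not a full TLD list.
function splitHost($host, $knownSuffixes)
{
    $host = strtolower($host);

    // Try longer suffixes first so that 'co.za' wins over plain 'za'.
    usort($knownSuffixes, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    foreach ($knownSuffixes as $suffix) {
        $tail = '.' . $suffix;
        if (substr($host, -strlen($tail)) === $tail) {
            // Everything before the suffix: last label is the domain,
            // anything left over is the sub-domain part.
            $rest   = substr($host, 0, -strlen($tail));
            $labels = explode('.', $rest);
            $domain = array_pop($labels);

            return array(
                'subdomain' => implode('.', $labels),
                'domain'    => $domain,
                'tld'       => $suffix,
            );
        }
    }

    return null; // suffix not in the list
}

print_r(splitHost('www.example.co.za', array('com', 'za', 'co.za', 'ne.jp', 'ltd.uk')));
```

The longest-suffix-first ordering is the important design choice here; without it ‘www.example.co.za’ would be split at ‘.za’ and ‘co’ would be reported as the domain.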

DocBlock and svn:keywords

September 4th, 2007

In the application’s coding conventions (which I am almost done writing), I took the time to elaborate on code documentation and the use of phpDocumentor. Among the things discussed is the DocBlock, the header template of each PHP file, which looks something like this:

/**
 * Short description for file.
 *
 * Long description for file
 *
 * PHP 5
 *
 * Copyright (c) 2007, Company Name
 *                     Street address
 *                     City, State, Zip
 *
 *
 * @filesource   $HeadURL$
 * @copyright    Copyright (c) 2007, Company Name
 * @link         http://www.companywebsite.com CompanyName
 * @package      #### PACKAGE NAME ####
 * @subpackage   #### SUBPACKAGE NAME ####
 * @since        #.#.#  //Correct version number as needed
 * @version      $Revision$
 * @author       Your Name
 * @modifiedby   $LastChangedBy$
 * @lastmodified $Date$
 */

Now you might be asking yourself what all those $HeadURL$, $Revision$, etc. are. Those are ‘keywords’ for Subversion which can be dynamically updated on every commit. By default, Subversion doesn’t substitute those keywords, but you can easily enable that directly from the shell using:

$ svn propset --recursive svn:keywords 'HeadURL Revision LastChangedBy Date' /path/to/repo

Or, in case you are using TortoiseSVN, from the right-click menu of your repository’s folder - TortoiseSVN > Properties > Add. You can then enter ‘svn:keywords’ in the ‘property name’ field and ‘HeadURL Revision LastChangedBy Date’ in the ‘property value’ field. Don’t forget to check the ‘apply property recursively’ option; otherwise, make sure you are setting the property on a file and not on a directory.

From the output of the svn help propset shell command:

The svn:keywords, svn:executable, svn:eol-style, svn:mime-type and svn:needs-lock properties cannot be set on a directory. A non-recursive attempt will fail, and a recursive attempt will set the property only on the file children of the directory.
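
Once substitution is enabled, Subversion expands a keyword such as $Revision$ into something like ‘$Revision: 1234 $’. If you ever want to use that value in code, a small sketch like the following can pull the number out. svnRevision() is a hypothetical helper name, not something provided by Subversion or phpDocumentor:

```php
<?php
// Hypothetical helper: extract the revision number from an expanded
// svn keyword string, or return null when it has not been expanded yet.
function svnRevision($keyword)
{
    // In a single-quoted string "\$" is a literal dollar sign for the regex;
    // the final "$" is the end-of-string anchor.
    if (preg_match('/^\$Revision:\s*(\d+)\s*\$$/', $keyword, $matches)) {
        return (int) $matches[1];
    }

    return null;
}

var_dump(svnRevision('$Revision: 1234 $')); // expanded form
var_dump(svnRevision('$Revision$'));        // unexpanded placeholder
```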

SELECT DISTINCT in CakePHP

August 31st, 2007

Even though CakePHP’s model already includes many of the database query functions, I found that SELECT DISTINCT was missing. OK, I know that you can always do it using either Model->query('SELECT DISTINCT c1, c2') or Model->findAll(null, 'DISTINCT c1, c2'), but that would be like saying use Model->query() instead of Model->findAll().

The cool thing in CakePHP is that you can add your own functions to use in your app on top of the ones that come bundled in the core. For the Model, you just create an ‘app_model.php’ file that you place in your app’s main folder. The empty file should look like this:

/**
 * Custom AppModel that adds functionality to the core Model
 */
class AppModel extends Model
{
   //empty
}
Now inside your new AppModel class, add the following function:
   /**
    * Returns a resultset array with DISTINCT fields from database matching given conditions.
    *
    * @param   mixed    $conditions SQL conditions as a string or as an array('field' =>'value',...)
    * @param   mixed    $fields Either a single string of a field name, or an array of field names
    * @return  array    Array of records
    */
   function findDistinct($conditions = null, $fields = null)
   {
      $db =& ConnectionManager::getDataSource($this->useDbConfig);

      // Build the field list, e.g. DISTINCT `c1`, `c2`
      $str = 'DISTINCT ';
      if (!is_array($fields))
      {
         $str .= '`' . $fields . '`';
      }
      else
      {
         foreach ($fields as $field)
         {
            $str .= '`' . $field . '`, ';
         }
         $str = substr($str, 0, -2); // drop the trailing ', '
      }

      $queryData = array(
                     'conditions'   => $conditions,
                     'fields'       => $str,
                     );
      $data = $db->read($this, $queryData, false);

      return $data;
   }

You can now use Model->findDistinct('c1') or Model->findDistinct(array('c1', 'c2', 'c3')) to retrieve DISTINCT columns values.
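
If you want to check the field-list logic without a database, here is a standalone sketch of just that part. distinctFieldList() is a hypothetical helper that mirrors the string building in findDistinct(); it is not a CakePHP function, and the backtick quoting assumes MySQL:

```php
<?php
// Hypothetical standalone helper mirroring the field-list logic above.
// Assumption: backtick quoting, as used by MySQL.
function distinctFieldList($fields)
{
    $quoted = array();
    // Casting to array lets a single field name and a list share one code path.
    foreach ((array) $fields as $field) {
        $quoted[] = '`' . $field . '`';
    }

    return 'DISTINCT ' . implode(', ', $quoted);
}

echo distinctFieldList('c1'), "\n";
echo distinctFieldList(array('c1', 'c2', 'c3')), "\n";
```

Using implode() avoids having to trim the trailing separator by hand. Unlike the method above, this sketch returns a bare ‘DISTINCT ’ when no field is given, so pass at least one field name.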

Hoping you find it useful.

nusoap class or SOAP extension?

August 3rd, 2007

Back when the web app was to be hosted on the VPS (using PHP4), I had started coding some of the scrapers and parsers to retrieve data from different affiliate networks. After moving to the new servers and setting them up with the latest stable versions of PHP and MySQL, it was time to clean up, optimize and document my code. But no: before I could start doing that, and while I was still testing the code to refresh my memory on the different methods available within the DtSoap class, I started getting errors!


My code required nusoap to communicate with the networks’ WSDL. However, PHP5’s SOAP extension includes a class named ‘soapclient’, just like nusoap does, and that caused the conflict. The solution was simple: rename the nusoap class and its constructor to ‘nusoapclient’, then change my code to call the new name instead of the old one.

That was the lazy-guy in me reasoning. As soon as the geek kicked in, you guessed it, or maybe not: I decided to update the code and use the built-in soap extension available in PHP5.
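
The name clash described above can be detected at runtime before any library is included. Here is a minimal sketch of that check; the commented-out include path is a placeholder, and the $soapBackend variable is only there for illustration:

```php
<?php
// Sketch: check whether PHP's built-in SOAP extension is loaded, and with
// it the SoapClient class. Class names are case-insensitive in PHP, so a
// library declaring 'soapclient' collides with the built-in 'SoapClient'.
if (extension_loaded('soap') && class_exists('SoapClient', false)) {
    $soapBackend = 'ext-soap';
} else {
    $soapBackend = 'nusoap';
    // require_once '/path/to/nusoap.php'; // hypothetical path
}

echo $soapBackend, "\n";
```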

Data Manipulation 101

July 30th, 2007

One of the application’s core functionalities is monitoring changes on different sources, some of which have some kind of web services available while others don’t. By monitoring I mean getting, validating, storing and comparing the data over time. For that, different data manipulations are required:

  • fetching
  • scraping
  • parsing
  • cleansing
  • storing
  • mining
I will share in this series of posts different resources, code snippets, benchmarks and techniques related to one or more of the above. Maybe if I get enough time, I can cover them all and group them in one easy-to-understand chapter: data manipulation 101.

The sandbox environment I am using:

And now, on with the show. Next up, Data Scraping.

Configuring CakePHP for easy deployment

July 29th, 2007

Whenever you are developing an application, you are normally using at least 2 environments: development and production. Depending on how big your app and database are, deployment may become a long list of to-dos in order to have everything set up correctly (database changes, code merges, Apache configuration, etc.). When you want to do things the right way, you usually have a 3rd environment, staging. Yes, another almost identical to-do list.

Now imagine you are developing in a team with each member working on a different part of the application in individual branches on the repository. New features are only merged with the team’s branch (development) after they are tested individually. So now, you have to deploy from individual to development, then to staging before finally pushing updates to production.

This time-consuming process is definitely not the best solution, not to mention that it is bound to break with any mistake made while reconfiguring everything. Today, we will automate CakePHP’s configuration for easy deployment.
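
As a rough idea of what automating the configuration can mean, here is a generic sketch that picks an environment from the host name and looks up its settings. This is not necessarily the approach the full post takes: detectEnvironment(), the host names, the database names and the debug values are all made-up placeholders.

```php
<?php
// Hypothetical sketch: map the current host name to an environment name.
// Assumption: every host name and setting below is a placeholder.
function detectEnvironment($host)
{
    $map = array(
        'www.example.com'     => 'production',
        'staging.example.com' => 'staging',
    );

    // Anything not listed is treated as a development machine.
    return isset($map[$host]) ? $map[$host] : 'development';
}

// One block of settings per environment, selected by name.
$settings = array(
    'development' => array('debug' => 2, 'database' => 'app_dev'),
    'staging'     => array('debug' => 1, 'database' => 'app_staging'),
    'production'  => array('debug' => 0, 'database' => 'app_live'),
);

$env = detectEnvironment('staging.example.com');
echo $env, ' uses database ', $settings[$env]['database'], "\n";
```

In a real application the host name would come from the server environment rather than a hard-coded string, and the selected settings would feed the framework’s own configuration files.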

Detailed documentation - to the rescue!

July 22nd, 2007

Having worked by myself on the vast majority of my coding projects, I never realized the necessity of having lots of documentation. Good documentation takes time to write and time is something I happen to always be short on. I am not only talking about code commenting here, but rather all kinds of documentation: coding conventions, database structure, choice of configuration, etc.

When this project was started, I only had one other person involved in the coding part and it was mostly for some outside classes we needed. With the growing code needs (all the new features, etc.), I believed it would be wise to add a new person to the team. Given my past experience with site/script development, which I had never planned to hand over to other coders, I was expecting the worst when it came to explaining everything the web application is about: features, users, structure, etc. It definitely always sounds much simpler when you are the author/creator, and when you are also the end-user, things become ridiculously easy to understand and put together - which is unfortunately not the case for someone who has never heard of or used anything similar before.