Detailed documentation - to the rescue!

Posted by Jad on July 22, 2007

Having worked by myself on the vast majority of my coding projects, I never realized the necessity of having lots of documentation. Good documentation takes time to write and time is something I happen to always be short on. I am not only talking about code commenting here, but rather all kinds of documentation: coding conventions, database structure, choice of configuration, etc.

When this project started, only one other person was involved in the coding, mostly for some outside classes we needed. With the growing code needs (all the new features, etc.), I believed it would be wise to add a new person to the team. Given my past experience with site/script development, which I had never planned to hand over to other coders, I was expecting the worst when it came to explaining everything the web application is about: features, users, structure, etc. It always sounds much simpler when you are the author/creator, and when you are also the end-user, things become ridiculously easy to understand and put together - which is unfortunately not the case for someone who has never heard of or used anything similar before.

Hardware, connectivity and load estimation

Posted by Jad on July 18, 2007

When developing a web application, you can’t only think of software (programming language, operating system, etc.); you also have to think of hardware, connectivity and load management.

In one of my early posts, I said that I would leave the web app on a VPS - I rather quickly understood that I was totally wrong to think it could handle the load, which depends not only on the number of users or the app’s success but also on the amount of data tracked, the system’s core functionality.

As a matter of fact, adding more features certainly added more value, but it definitely added more overhead too. So why don’t I just scratch those new features, at least for now? After all, Getting Real (by 37signals), a book I believe gets many things right, repeats that advice over and over again. While I can’t deny that much of the advice in that book will prove to be very beneficial, sometimes you need to put aside a couple of general rules and follow your instinct.

And that’s what I decided to do.

Some details

The application’s business logic is pretty extensive and processes data all day long, no matter if users are online or not. It also communicates with different child servers all over the web.

The database will start with millions of entries and over 30 tables, growing by millions of new records every day.
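
As a rough illustration of what that kind of growth means for storage, here is a back-of-envelope sketch. Every figure in it (starting rows, rows added per day, average row size) is a made-up placeholder rather than the application's real numbers, and the function name is hypothetical. It also ignores indexes and overhead, so treat the result as a floor.

```php
<?php
// Back-of-envelope storage estimate. All figures are hypothetical placeholders.

/**
 * Estimate the row count and raw data size after a number of days,
 * assuming a constant number of new rows per day.
 */
function estimateGrowth(int $initialRows, int $rowsPerDay, int $avgRowBytes, int $days): array
{
    $rows  = $initialRows + $rowsPerDay * $days;
    $bytes = $rows * $avgRowBytes;

    return [
        'rows'      => $rows,
        'gigabytes' => round($bytes / (1024 ** 3), 1),
    ];
}

// Assumed inputs: 5 million starting rows, 2 million new rows per day, 200 bytes per row.
foreach ([30, 90, 365] as $days) {
    $estimate = estimateGrowth(5000000, 2000000, 200, $days);
    printf(
        "After %3d days: %d rows, about %.1f GB of raw data\n",
        $days,
        $estimate['rows'],
        $estimate['gigabytes']
    );
}
```

Swapping in measured values once the application is running would turn this from a guess into a usable capacity check.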

Sketching the virtual network

[Image: development_phase_-_servers_setup.gif]

Choice of hardware

The application server needs to cope with heavy-duty tasks, as fast as possible. The database server doesn’t only need storage space but also processing speed. And finally, the web server needs to handle multiple concurrent connections while serving HTML.

After comparing multiple not-so-solid benchmarks, reading Web Hosting Talk and exchanging a couple of emails with some ISPs, I finally settled on the hardware configuration illustrated in green above.

Choosing your ISP

With all the hosting reviews and comparison sites available, you’d think that the task is easy. Let me tell you it’s not. While some will strongly encourage you to look for a provider in your city (and for good reasons), I opted to go for pricing, staff experience and perfect reliability/support track record.

I started researching a bunch. Some of the names that stuck: rackspace.com, theplanet.com, 365main.com, cari.net, softlayer.com. I won’t go into too much detail about how I finally made my choice because I am no expert, but let’s say that rackspace.com is over-priced (and I am being nice) and theplanet.com seems to have lost its touch by growing too big, otherwise it would have done a great job.

SoftLayer is the one I decided to go with. Established in 2005, with a management team that has worked together for many years, it looked pretty solid to me. Their staff, pricing structure, website and reputation in the market just reinforced that feeling.

UPDATE: Looks like I eliminated 365Main just in time! Less than 10 days after I had made my choice, 365Main suffered a power outage that left BIG customers (Craigslist, Technorati, Six Apart, AdBrite, Yelp, RedEnvelope and others) offline for several hours.
Sources:
outage at 365 Main’s San Francisco datacenter
365 Main datacenter power outage - Six Apart Technorati Craigslist
San Francisco Power Outage - A Case Study in Downtime

Estimating your needs

For this web application, the counters just don’t start at 0. The basic server configuration takes that into account but, as with any other application, the actual load inflicted by concurrent users’ requests can’t really be estimated or benchmarked in advance.

Not so long ago, only 2 solutions would have been available: start with just enough and be ready to upgrade fast, or start strong enough to handle the first couple of months while you discover your averages.

Today, there is a third solution: Amazon’s EC2 and S3 services. With pay-as-you-go pricing for storage and processor usage, what better way to avoid seeing your application crash under heavy load, or your bank account drained by hefty initial costs that turn out to be unnecessary?

Conclusion

I am no expert in that field to start with but I feel confident that I made the right choices. Time will tell I guess.

Agree, disagree?

Delays - very often, uncontrollable

Posted by Jad on July 17, 2007

In my last post, I said I would start posting more frequently after being absent for a while and here I am, 7 days later, with no posts to show.

I haven’t been procrastinating, nor have I been making more changes to the application’s plan, no - I was recovering from an unexpected surgery!

Aside from affecting the project’s timeline, this hospital stay gave me time to take a couple steps back from the development/planning side of things. Time I spent refreshing my memory with books like Building Scalable Web Sites (by Cal Henderson) and Prioritizing Web Usability (by Jakob Nielsen and Hoa Loranger).

A couple of the things that got either set up, approved or coded during my absence:

  1. New company (legal documents, bank account, etc.)
  2. A new Google AdWords API account (GAA) - never heard back from them concerning the old account’s inquiry.
  3. A new Yahoo! Search Marketing API (YSMA) sandbox account.
  4. 75% of the data cleansing classes we need.

Lesson of the day:

Early in the development process, having a detailed plan (even if you know that delays will occur), combined with taking action on simple third-party requests (placing orders, registering accounts, etc.), helps keep things rolling when you encounter unexpected events or delays.

Sticking to the timeline

Posted by Jad on June 22, 2007

One of the things I hate the most when developing an application is counting on 3rd parties to deliver or reply on time, be it freelancers, API developers or sometimes your own partners (banks, lawyers, etc.). In this case, it’s the API developers. A couple of days ago, I emailed Google for some help on finding the ‘My Client Center’ link under the ‘Account’ tab in my AdWords account. After waiting 4 days, when I finally received an email from their support team, I was excited, but not for long. Even though I had sent a specific request, explaining that I had already received my developer token but couldn’t log in to my developer account, all I got back was a quick-reply style of email with instructions on how to apply for a developer token! Looks like when you’re Google you can allow yourself anything, from late replies to stupid staff.

Anyhow, this obviously affects the development of a vendor class I wanted to finish, but nothing that could stop me from moving on and pushing it back a couple of weeks. In the worst case scenario, if they just tell me that my developer token is not valid anymore (inactivity for a long period is the only reason I could think of) and since they don’t give out developer tokens anymore, I will find myself obliged to completely scrap that feature from the application. It sucks, I know, but that’s the price to pay when some of your features depend on 3rd parties I guess.

On the positive side of things, everything has been going according to the initial timeline with a couple of improvements to be made on the design and finalizing the teaser page for early invitations.

Choosing a development framework

Posted by Jad on June 21, 2007

During the past couple of years, Ruby on Rails has been getting massive attention and I never really got the time to dive into it - learned a couple things here and there, but haven’t really developed any application with it. I liked it for the same reasons that everyone else does: it’s a rapid development framework using the MVC pattern and ActiveRecord.

Learning something new (in this case two things, the Ruby language and the Rails framework) is definitely exciting, but having to do that while developing a pretty advanced web application that needs to go live in less than 2 months raises the bar pretty high.

So I opted for good ol’ PHP. I have built all kinds of little scripts, hacked many open-source projects (as in modified them) and released 4 of my own turnkey solutions, all based on PHP, but never had I used an open-source framework for that. This time, things will change. In the first post I mentioned CakePHP, which is the framework I selected for this application and probably for more to come.

As Jonathan Snook says it so well:

I almost fear putting this kind of post together as it’s bound to pull the fanatics (in the negative sense of the word) out of the woodworks.

So instead of giving a comparison of the available PHP MVC frameworks, I will list what I was looking for and believed CakePHP would cover. Here it is:

  • Open Source / Free License
  • KISS
  • OO and MVC
  • ORM / Active Record
  • Security
  • Ajax interoperability
  • Good controller structure
  • Good helpers available
  • Scaffolding

I did try a couple before making my final decision, but here are the resources that helped me make up my mind:

- Taking a look at ten different PHP frameworks, by Dennis Pallett
- CodeIgniter vs. CakePHP, by Jonathan Snook
- New Year’s Benchmarks and A Bit About Benchmarks, by Paul M. Jones
- Glue vs. Full Stack and More framework fun, by Chris Hartjes
- Comparing Frameworks, by Tim Bray

AdWords API - only the beginning

Posted by Jad on June 19, 2007

Been pretty busy lately with csstester.com which is soon to be launched, finalizing stuff, etc.

I know, I know - I should only be concentrating on one application at a time, but that ain’t my kind. I am not smarter or anything, I just get bored too fast from working on the same project every single day, all day.

Anyhow, back to the reason for this post, Google AdWords API and the awful experience I had so far.

Back in early 2006, I had applied for and received my developer token for the API (which Google stopped distributing as of October 2006). Developer token, email account to which the token was emailed and password in hand, I head to the login page. To my big surprise, Google replies with a message along the lines of:

This account does not exist

Ok, so you emailed me a developer token and an API key in a separate email, to that exact same email address I am using to access my account, but you don’t believe I have an account. Great!

Since my developer email access is not the main admin access to the account, I log in with my other email and invite the address Google had sent the developer token to. A couple of steps later, I am logged into my account using the developer credentials.

According to AdWords’ welcome email, I am supposed to find a ‘My Client Center’ link under the ‘Account’ tab. But nothing there.

I go looking for help, reading pages and pages of information I could have skipped for the time being, but couldn’t find any kind of help about the issue I was having.

Last resort: email Google.

It’s been nearly 12 hours, and still no reply. One would have believed that between their free meals and exercising, someone could have a look and just fix that in my account.

Waiting…

Parsing the use cases XML

Posted by Jad on June 11, 2007

The other day I discussed writing the use cases. I also mentioned that I was going to write them all in XML so they could be easily parsed later on.

As you might have noticed by looking at only 3 use cases, the raw format gets pretty long and hard to read, so the first code to write had to be a parser for that.

Solution: Clean XML To Array by Ivan Enderlin, found on PHPClasses.

All you need is to download lib.xml.php and create a new file with the little code shown below.

<?php
include('lib.xml.php');
$xml = new Xml;
$out = $xml->parse('file.xml', 'FILE');
// print_r() needs true as its second argument to return the dump as a string
echo '<pre>'.print_r($out, true).'</pre>';
?>

All that was left to do was to render it in an easy-to-read format. This will do for now as I don’t want to spend too much time on it.

<?php
include('lib.xml.php');
$xml = new Xml;
$out = $xml->parse('usercase.xml', 'FILE');
//echo '<pre>'.print_r($out, true).'</pre>';exit();

// Array keys are quoted: bare words such as $usecase[name] are read as
// undefined constants and raise an error in current PHP versions.
foreach ($out as $usecases) {
	for ($i=0; $i<count($usecases); $i++) {
		$usecase = $usecases[$i];
		// A single <step> may come back as a string, so normalise to an array.
		$steps = (array) $usecase['scenario']['step'];

		echo '<div class="usecase">';
		echo '<h3>'.$usecase['name'].': '.$usecase['description'].'</h3>';
		echo '<p class="overview">';
		echo '<strong>Sitting:</strong> '.$usecase['sitting'].'<br />';
		echo '<strong>Primary Actor:</strong> '.$usecase['primaryactor'].'<br />';
		echo '<strong>Scope:</strong> '.$usecase['scope'].'<br />';
		echo '<strong>Level:</strong> '.$usecase['level'].'<br />';
		echo '<strong>Minimal Guarantee:</strong> '.$usecase['minimalguarantee'].'<br />';
		echo '<strong>Success Guarantee:</strong> '.$usecase['successguarantee'].'<br />';
		echo '</p>';

		echo '<h4>Stakeholders</h4><p class="stakeholders">';
		foreach($usecase['stakeholders']['stakeholder'] as $id => $stakeHolder){
			echo '<strong>'.$stakeHolder.':</strong> '.$usecase['stakeholders']['interest'][$id].'<br />';
		}
		echo '</p>';

		// Steps are numbered from 1 to match the <extends> values in the XML
		// (this assumes the parser returns arrays indexed from 0).
		echo '<h4>Scenario</h4><p class="scenario">';
		foreach($steps as $id => $step) echo ($id + 1).'. '.$step.'<br />';
		echo '</p>';

		echo '<h4>Extensions</h4><p class="extensions">';
		foreach($usecase['extensions']['extension'] as $extension){
			// <extends> 0 means "before the first step", so there may be no step to show.
			$target = $extension['extends'] - 1;
			$label = isset($steps[$target]) ? ' ('.$steps[$target].')' : '';
			echo '<strong>Extends:</strong> '.$extension['extends'].$label.'<br />';
			echo $extension['xcase'].'<br />';
			echo '<strong>Steps:</strong><ol>';
			// The <ol> does the numbering, so each step is a plain list item.
			foreach((array) $extension['step'] as $step) echo '<li>'.$step.'</li>';
			echo '</ol>';
		}
		echo '</p>';
		echo '</div>';
	}
}

?>
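
For anyone who would rather not depend on lib.xml.php, the same idea can be sketched with PHP's bundled SimpleXML extension. This is only an illustration and not the code used for the project: the function name `loadUseCases` and the shape of the returned array are assumptions, and the inline sample is a trimmed copy of one use case.

```php
<?php
// Sketch only: uses PHP's bundled SimpleXML extension instead of lib.xml.php.
// The function name and the returned array layout are assumptions for illustration.

function loadUseCases(string $xmlString): array
{
    $root = simplexml_load_string($xmlString);
    if ($root === false) {
        throw new InvalidArgumentException('Could not parse the use case XML');
    }

    $useCases = [];
    foreach ($root->usecase as $useCase) {
        $steps = [];
        foreach ($useCase->scenario->step as $step) {
            $steps[] = (string) $step;
        }

        // <stakeholder> and <interest> alternate as siblings, so pair them by position.
        $stakeholders = [];
        $interests = $useCase->stakeholders->interest;
        $position = 0;
        foreach ($useCase->stakeholders->stakeholder as $stakeholder) {
            $stakeholders[(string) $stakeholder] = (string) $interests[$position];
            $position++;
        }

        $useCases[] = [
            'name'         => (string) $useCase->name,
            'description'  => (string) $useCase->description,
            'stakeholders' => $stakeholders,
            'steps'        => $steps,
        ];
    }

    return $useCases;
}

$sample = <<<XML
<usecases>
	<usecase>
		<name>Use Case 2</name>
		<description>Log User</description>
		<stakeholders>
			<stakeholder>User</stakeholder>
			<interest>Accessing his account</interest>
			<stakeholder>System</stakeholder>
			<interest>Identifying and authorizing the user</interest>
		</stakeholders>
		<scenario>
			<step>System gets credentials from user</step>
			<step>System writes cookie and session</step>
		</scenario>
	</usecase>
</usecases>
XML;

foreach (loadUseCases($sample) as $useCase) {
    echo $useCase['name'] . ': ' . $useCase['description'] . "\n";
    foreach ($useCase['steps'] as $index => $step) {
        // Number steps from 1 so they line up with the <extends> values.
        echo '  ' . ($index + 1) . '. ' . $step . "\n";
    }
}
```

Returning plain arrays keeps the rendering code independent of the parser, so either library could feed the same HTML output.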

Designing the database

Posted by Jad on June 06, 2007

I’ve been really vague about the app we’re building here but that’s because I know it’s something that many in the affiliate marketing industry are trying to do right now and I prefer keeping everything like that until pre-launch.

Since 90% of what it does is data tracking and analysis from hundreds of sites, you can imagine all the settings and tables involved, and all the database complexity. To start off, I identified groups based on similarities found at those sites. I then eliminated all the parts that could just be cloned and slightly modified to be added later on. All I had left was the strict minimum for the app to function and be able to run an example of each use case I had already written.

Time to design the database, identify the relations, etc. Here’s a scaled down version of what I came up with after an hour on DBDesigner 4. I am sure certain things will get added or modified as we go but I believe this is a very solid start.

[Image: database design]

Trac + Subversion on CentOS

Posted by Jad on June 04, 2007

The new app will be running on a VDS from Myriad Network for the moment. To be more precise, I find that their $56.95 per month package gives enough room to test the waters without any big initial investment. Myriad runs CentOS and to be honest, we ran into a couple of problems installing what was supposed to be very easy.

Here is a quick documentation of how we finally got it to run.

IMPORTANT: The following steps have been tested and worked for us. In no way do I guarantee they will work for every CentOS configuration and strongly suggest you do things carefully and read more documentation on the Trac and Subversion sites.

Installing Trac

  • Make sure you have at least Python 2.4 (otherwise, get ready to face some serious issues)
  • Download the latest Trac version - here
  • Download ClearSilver - here
  • Download Swig - here
  • Download SQLite - here
  • Download PySQLite - here
  • Install Trac
    • Extract files
    • Change to extracted files’ directory
    • Run the following command: python setup.py install
  • Install ClearSilver, Swig, SQLite, PySQLite. For each:
    • Extract files
    • Change to extracted files’ directory
    • Run: ./configure
    • Run: make && make install
  • Copy svnlib to the Python libraries directory: cp -r /path/to/svnlib /path/to/python2.x/site-packages/
  • Test svn bindings from Python CLI
    • Run: python
    • Run: import svn.core
    • If everything is ok, you shouldn’t get any errors

Installing Subversion

  • Download the latest version of Subversion - here
  • Download the latest deps file - here
  • Extract subversion
  • Extract deps
  • Copy/move extracted deps folder content into Subversion root directory
  • Run: ./configure --without-neon
  • Run: make && make swig-py && make install

Creating a repository

  • Run: svnadmin create /path/to/repo
  • Change to /path/to/repo/conf and edit svnserve.conf, disable authz-db
  • To start SVN as a standalone server, run: svnserve -d -r /path/to/repo
  • You can optionally add --listen-port PORT to the command arguments

Set up Trac for the repository

  • Run: trac-admin /path/to/trac-env-root initenv
    • Select a project name
    • Enter path to your subversion repository
  • Run: tracd -d -p [PORT] /path/to/trac-env-root /path/to/another-trac-env-root

Subversion cheatsheets:

Notes: When tracd runs as a daemon, Python’s error output (stderr) doesn’t reach your terminal, which makes it hard to debug. Run tracd in interactive mode instead by simply taking out the ‘-d’ from the tracd -d -p [PORT] command.

Writing use cases

Posted by Jad on June 02, 2007

Today I wanted to start writing the detailed use cases after having sketched most of the wireframe. I’ve been doing my usual research and here are the interesting documents I found for ‘use case’:

Now, some developers might disagree with me, but I opted not to do all the different UML diagrams for 2 reasons:

  1. As with any first version of a product, no matter how much I believe in its success, I prefer investing as little time as possible when starting and instead focus on getting it built. When the application proves to be successful, a complete rebuild can be made. That costs more than just upgrading it, but the difference is negligible since it would pay for itself.
  2. I don’t know UML well enough to be sure the diagrams would be solid. I’d rather concentrate on learning more about it instead of playing with it.

That being said, Writing Effective Use Cases inspired me to create an XML format for writing those use cases so they could be parsed later on to generate documentation, usability tests, etc. And those of course will be reusable.

Here is a sample of the first 3 use cases I wrote.

<usecases>
	<usecase>
		<name>Use Case 1</name>
		<description>Register User</description>
		<sitting>Instant</sitting>
		<primaryactor>User</primaryactor>
		<scope>System, Beanstream API</scope>
		<level>User goal</level>
		<stakeholders>
			<stakeholder>User</stakeholder>
			<interest>Getting a membership access</interest>
			<stakeholder>System</stakeholder>
			<interest>Saving all valid user data and creating account</interest>
			<stakeholder>Company</stakeholder>
			<interest>Getting paid</interest>
		</stakeholders>
		<minimalguarantee>Sufficient validation to detect wrong inputs and failed payments</minimalguarantee>
		<successguarantee>Beanstream API has acknowledged the purchase, the user's account is created</successguarantee>
		<scenario>
			<step>User selects membership plan</step>
			<step>System gets billing information from user</step>
			<step>System sends data to Beanstream API</step>
			<step>Beanstream API bills user</step>
			<step>System creates user account</step>
			<step>System writes cookie and session for user</step>
			<step>System forwards user to account</step>
		</scenario>
		<extensions>
			<extension>
				<extends>2</extends>
				<xcase>User enters invalid billing information</xcase>
				<step>System alerts user of the error(s)</step>
				<step>System gets new billing information from user</step>
			</extension>
			<extension>
				<extends>4</extends>
				<xcase>Beanstream API transaction fails</xcase>
				<step>System alerts user of the error(s)</step>
				<step>System gets new billing information from user</step>
			</extension>
		</extensions>
	</usecase>

	<usecase>
		<name>Use Case 2</name>
		<description>Log User</description>
		<sitting>Instant</sitting>
		<primaryactor>User</primaryactor>
		<scope>System</scope>
		<level>User goal</level>
		<stakeholders>
			<stakeholder>User</stakeholder>
			<interest>Accessing his account</interest>
			<stakeholder>System</stakeholder>
			<interest>Identifying and authorizing the user</interest>
		</stakeholders>
		<preconditions>
			<precondition>User must have a valid account</precondition>
		</preconditions>
		<minimalguarantee>Sufficient validation for existing account and right credentials</minimalguarantee>
		<successguarantee>System has authorized access, user gets access to account</successguarantee>
		<scenario>
			<step>System gets credentials from user</step>
			<step>System writes cookie and session</step>
			<step>System logs access datetime</step>
			<step>System redirects user to account</step>
		</scenario>
		<extensions>
			<extension>
				<extends>0</extends>
				<xcase>System validates session</xcase>
				<step>System redirects to account</step>
			</extension>
			<extension>
				<extends>1</extends>
				<xcase>User enters invalid credentials</xcase>
				<step>System alerts user of the error(s)</step>
				<step>System gets new credentials from user</step>
			</extension>
		</extensions>
	</usecase>

	<usecase>
		<name>Use Case 3</name>
		<description>Reset Password</description>
		<sitting>Instant</sitting>
		<primaryactor>User</primaryactor>
		<scope>System</scope>
		<level>User goal</level>
		<stakeholders>
			<stakeholder>User</stakeholder>
			<interest>Retrieving access to his account</interest>
			<stakeholder>System</stakeholder>
			<interest>Making sure the request comes from the real account holder</interest>
		</stakeholders>
		<preconditions>
			<precondition>User must have a valid account</precondition>
		</preconditions>
		<minimalguarantee>Sufficient validation to check existing account</minimalguarantee>
		<successguarantee>System resets password</successguarantee>
		<scenario>
			<step>System gets email address from user</step>
			<step>System sends confirmation email to user</step>
			<step>User follows email's link</step>
			<step>System resets password</step>
			<step>System emails new password to user</step>
			<step>System logs action</step>
		</scenario>
		<extensions>
			<extension>
				<extends>1</extends>
				<xcase>User enters invalid email</xcase>
				<step>System alerts user of the error(s)</step>
				<step>System gets new email address from user</step>
			</extension>
			<extension>
				<extends>3</extends>
				<xcase>User follows invalid link</xcase>
				<step>System alerts user of the error(s)</step>
				<step>System gives option to manually enter the confirmation code</step>
			</extension>
			<extension>
				<extends>3</extends>
				<xcase>User enters invalid confirmation code</xcase>
				<step>System alerts user of the error(s)</step>
				<step>System gives option to manually re-enter the confirmation code</step>
			</extension>
		</extensions>
	</usecase>
</usecases>
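
Since each extension points at a scenario step by number, a small consistency check can catch references that no longer match after steps are added or removed. The sketch below is an assumption-laden illustration, not part of the project: it presumes steps are counted from 1 with 0 meaning "before the first step" (which is how the samples above read), and `findBrokenExtensions` is a hypothetical name.

```php
<?php
// Sketch only: checks that every <extends> value points at an existing scenario step.
// Assumes steps are numbered from 1 and that 0 means "before the first step".

function findBrokenExtensions(string $xmlString): array
{
    $root = simplexml_load_string($xmlString);
    if ($root === false) {
        throw new InvalidArgumentException('Could not parse the use case XML');
    }

    $problems = [];
    foreach ($root->usecase as $useCase) {
        $stepCount = count($useCase->scenario->step);
        foreach ($useCase->extensions->extension as $extension) {
            // "extends" is a PHP keyword, so the element is read with brace syntax.
            $target = (int) $extension->{'extends'};
            if ($target < 0 || $target > $stepCount) {
                $problems[] = sprintf(
                    '%s: "%s" extends step %d, but only %d steps exist',
                    (string) $useCase->name,
                    (string) $extension->xcase,
                    $target,
                    $stepCount
                );
            }
        }
    }

    return $problems;
}

// Trimmed, made-up sample: the second extension points past the last step on purpose.
$sample = <<<XML
<usecases>
	<usecase>
		<name>Use Case 2</name>
		<scenario>
			<step>System gets credentials from user</step>
			<step>System writes cookie and session</step>
		</scenario>
		<extensions>
			<extension>
				<extends>1</extends>
				<xcase>User enters invalid credentials</xcase>
			</extension>
			<extension>
				<extends>5</extends>
				<xcase>Deliberately broken reference</xcase>
			</extension>
		</extensions>
	</usecase>
</usecases>
XML;

foreach (findBrokenExtensions($sample) as $problem) {
    echo $problem . "\n";
}
```

Running a check like this before generating documentation keeps the rendered "Extends" labels from pointing at the wrong step.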