Web Frameworks - Workbook
Revision as of 14:12, 2 April 2009
Workshop schedule:
- Workshop - week 01 - Getting started with Zend
- Workshop - week 02 - Controllers
- Workshop - week 03 - Views and templates
- Workshop - week 04 - Models and databases
- Workshop - week 05 - CRUD (Create Read Update Delete) by Simon Baker (0718432)
- Workshop - week 06 - Web 2.0 - Web feeds
- Workshop - week 07 - Web 2.0 - Web services
- Workshop - week 08 - Web 2.0 - Ajax
- Workshop - week 09 - Automated testing
- Workshop - week 10 - Free to work on assessment
- Workshop - week 11 - Demonstrations
- Workshop - week 12 - Demonstrations
Useful information:
- Binay Randhawa (0719961)
CRUD functions (add, edit, delete) - http://weierophinney.net/matthew/uploads/2007-02-28-FrameworkPresentation.pdf
- Google map - http://maps.google.co.uk/maps?hl=en&ie=UTF8&ll=53.800651,-4.064941&spn=18.252022,39.50 (0719796)
- Home Installation - Apache, PHP, MYSQL, Zend
- Working With Zend and CSS - Working with Zend and CSS
- Ajax With Parameters - Using parameters in ajax with prototype.js by Nick Banford (0506508)
- Searching in Zend - Searching articles in Zend by Nick Banford (0506508)
- Zend Image Root Path - Relating images to your personal file store by Adam Orton (0708875)
- Google Maps using API - This would allow people to add a map to their website by Amar Joshi (0709436).
- Zend Yahoo Service - This would allow people to add functions from Yahoo to their website by Amar Joshi (0709436).
- Zend Service Audioscrobbler - This would allow people to add functions from AudioScrobbler to their website by Amar Joshi (0709436).
- Google Search using AJAX and API - This will add Google search to your site by Andre Treutlein (0705104).
Some useful links I have found (0610970)
http://www.developertutorials.com/tutorials/php/zend-framwork-tutorial-8-08-13/page1.html
http://blog.astrumfutura.com/archives/353-An-Example-Zend-Framework-Blog-Application-Part-2-The-MVC-Application-Architecture.html - detailed information about the MVC pattern
http://zendguru.wordpress.com/category/zend-framework/ - explanation of Zend forms
http://www.killerphp.com/zend-framework/videos/ - video tutorial about the MVC pattern
http://webdeveloper.econsultant.com/ajax-demos-examples-code-samples/ - Ajax tutorial
http://ajbrown.org/blog/2009/01/04/automated-testing-using-zend-framework-part-1.html - Zend automated testing information
(0707202) Useful link about Automatic testing of MVC applications created with Zend Framework - http://www.alexatnet.com/node/12
(0622597)
Roll your own search with Zend_Search_Lucene
Creating the index
<?php
require_once 'Zend/Feed.php';
require_once 'Zend/Search/Lucene.php';

// strip markup from feed text and escape what remains
function sanitize($input)
{
    return htmlentities(strip_tags($input));
}

// create the index (the second argument, true, creates a new index)
$index = new Zend_Search_Lucene('/tmp/feeds_index', true);

$feeds = array(
    'http://feeds.feedburner.com/ZendDeveloperZone',
    'http://www.planet-php.net/rss/',
    'http://www.sitepoint.com/blogs/category/php/feed/',
);

// grab each feed
foreach ($feeds as $feed) {
    $channel = Zend_Feed::import($feed);
    echo $channel->title() . "\n";

    // index each item; in Zend Framework 1.x the feed object itself is iterable
    foreach ($channel as $item) {
        if ($item->link() && $item->title() && $item->description()) {
            $doc = new Zend_Search_Lucene_Document();
            $doc->addField(Zend_Search_Lucene_Field::Keyword('link', sanitize($item->link())));
            $doc->addField(Zend_Search_Lucene_Field::Text('title', sanitize($item->title())));
            $doc->addField(Zend_Search_Lucene_Field::UnStored('contents', sanitize($item->description())));

            echo "\tAdding: " . $item->title() . "\n";
            $index->addDocument($doc);
        }
    }
}

$index->commit();
echo $index->count() . " Documents indexed.\n";
Next, we specify the RSS feeds we are interested in and fetch them in a loop. Then, for each feed, we loop through the articles and index each one as a separate Zend_Search_Lucene document.
$feeds = array(
    'http://feeds.feedburner.com/ZendDeveloperZone',
    'http://www.planet-php.net/rss/',
    'http://www.sitepoint.com/blogs/category/php/feed/',
);

// grab each feed
foreach ($feeds as $feed) {
    $channel = Zend_Feed::import($feed);
    echo $channel->title() . "\n";

    // index each item; in Zend Framework 1.x the feed object itself is iterable
    foreach ($channel as $item) {
        if ($item->link() && $item->title() && $item->description()) {
            // create and index a Zend_Search_Lucene document
        }
    }
}

To add a document to our index, we create the document object and specify content for the document's fields. Zend_Search_Lucene provides different ways to analyze and store fields depending on how we need to search them and return the results. In this example, for each RSS item, we want to index the link, title, and description.

$doc = new Zend_Search_Lucene_Document();
$doc->addField(Zend_Search_Lucene_Field::Keyword('link', sanitize($item->link())));
$doc->addField(Zend_Search_Lucene_Field::Text('title', sanitize($item->title())));
$doc->addField(Zend_Search_Lucene_Field::UnStored('contents', sanitize($item->description())));

echo "\tAdding: " . $item->title() . "\n";
$index->addDocument($doc);

Field type   Stored?   Indexed?   Tokenized?   Binary?
Keyword      yes       yes        no           no
UnIndexed    yes       no         no           no
Binary       yes       no         no           yes
Text         yes       yes        yes          no
UnStored     no        yes        yes          no

Keyword fields are stored and indexed, meaning I can search them as well as display them back in my search results. They are not split up into separate words by tokenization. My link field is a good candidate for a Keyword because I might want to search articles by link URL, and I definitely want to display the link in the search results since the link is serving as my external identifier for the document. Enumerated database fields usually translate well to Keyword fields in Zend_Search_Lucene.

It's usually a good idea to store an identifier for each document that can be used as a lookup mechanism in the search results. For this example, it makes sense to use the RSS item's link. If we were building an index from an existing relational database, we would want to store the primary key of the record, and if we were indexing a file system we would probably want to store the path to the file.

UnIndexed fields are not searchable, but they are returned with search hits. Database timestamps, primary keys, file system paths, and other external identifiers are good candidates for UnIndexed fields.

Binary fields are not tokenized or indexed, but are stored for retrieval with search hits. They can be used to store any data encoded as a binary string, such as an image icon.

Text fields are stored, indexed, and tokenized. Text fields are appropriate for storing information like subjects and titles that need to be searchable as well as returned with search results. In my example, the title field of each RSS article is indexed as a Text field.

UnStored fields are tokenized and indexed, but not stored in the index. Large amounts of text are best indexed using this type of field.
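The five field types described above can be seen side by side in a small sketch of the database scenario mentioned earlier. It assumes Zend Framework 1.x is on the include path. The function name buildArticleDocument() and the row layout (id, url, title, body, icon) are hypothetical, chosen only for illustration, and are not part of the library.

```php
<?php
require_once 'Zend/Search/Lucene.php';
require_once 'Zend/Search/Lucene/Document.php';
require_once 'Zend/Search/Lucene/Field.php';

/**
 * Build a search document from one database row.
 * The row layout (id, url, title, body, icon) is a made-up example.
 */
function buildArticleDocument(array $row)
{
    $doc = new Zend_Search_Lucene_Document();

    // UnIndexed: stored but not searchable; used to look the row up again
    $doc->addField(Zend_Search_Lucene_Field::UnIndexed('id', $row['id']));

    // Keyword: stored and indexed as one untokenized term
    $doc->addField(Zend_Search_Lucene_Field::Keyword('url', $row['url']));

    // Text: stored, indexed and tokenized
    $doc->addField(Zend_Search_Lucene_Field::Text('title', $row['title']));

    // UnStored: indexed and tokenized, but the text itself is not kept
    $doc->addField(Zend_Search_Lucene_Field::UnStored('contents', $row['body']));

    // Binary: stored only, never indexed or tokenized
    $doc->addField(Zend_Search_Lucene_Field::Binary('icon', $row['icon']));

    return $doc;
}

// Hypothetical row, as it might come back from a database query
$doc = buildArticleDocument(array(
    'id'    => '42',
    'url'   => 'http://example.com/articles/42',
    'title' => 'Searching with Zend',
    'body'  => 'A long article body that only needs to be searchable.',
    'icon'  => 'raw-image-bytes',
));

// Print the flags of each field so they can be compared with the table
foreach ($doc->getFieldNames() as $name) {
    $field = $doc->getField($name);
    printf(
        "%-8s stored=%s indexed=%s tokenized=%s binary=%s\n",
        $name,
        var_export($field->isStored, true),
        var_export($field->isIndexed, true),
        var_export($field->isTokenized, true),
        var_export($field->isBinary, true)
    );
}
```

The returned document can be passed to $index->addDocument() exactly like the RSS documents built in this article.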
Storing data creates a larger index on disk, so if you need to search but not redisplay the data, use an UnStored field. In my example, the RSS description, the main body of text, is indexed as an UnStored field. UnStored fields are particularly practical when using a Zend_Search_Lucene index in combination with a relational database. You can index large data fields with UnStored fields for searching, and retrieve them from your relational database by using a separate field as an identifier.

It's also important to note that we named the field that holds the description 'contents'. This is no accident: it is the field name that Zend_Search_Lucene searches by default. Internal discussion with the Framework development team is leading to the idea that Zend_Search_Lucene may break away from the Lucene norm and implement a simple way to search all fields instead of just the 'contents' field.

Searching the Index

Now that we have created a Zend_Search_Lucene index, let's put it to use by performing some searches. You can implement search on an index in about a dozen lines of code:

<?php
require_once 'Zend/Search/Lucene.php';

// open the index
$index = new Zend_Search_Lucene('/tmp/feeds_index');

$query = 'framework';
$hits = $index->find($query);

echo "Index contains " . $index->count() . " documents.\n\n";
echo "Search for '" . $query . "' returned " . count($hits) . " hits\n\n";

foreach ($hits as $hit) {
    echo $hit->title . "\n";
    echo "\tScore: " . sprintf('%.2f', $hit->score) . "\n";
    echo "\t" . $hit->link . "\n\n";
}
?>

Could it be any easier? We include the library, open our index, search for a term, and iterate through the result set. You should note that since we used the default case-insensitive text analyzer to build the index, the search query should be lowercase.

The Zend_Search_Lucene query format is powerful but simple. It's a snap to specify multiple query terms with a special syntax. To search our RSS index for articles that must contain the word 'framework' in the 'contents' field:

$query = '+framework';

For articles with 'Zend' in the title:

$query = 'title:zend';

For articles containing the word 'framework' but without the word 'Zend' in the title:

$query = 'framework -title:zend';

Conclusion

In these simple examples, we have seen that the Zend_Search_Lucene module provides an easy way to add customized search functionality to any PHP application without a dependence on external software packages. As the Zend_Search_Lucene module matures, it will no doubt prove to be a prized component of the Zend Framework. In future articles I hope to explore advanced indexing and search capabilities of Zend_Search_Lucene, and put the module through some real-life benchmarks using large data sets, comparing indexing and search performance against some other current popular methods.
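As a postscript to the Searching the Index section, the three query strings can be tried end to end without touching the feed index. This is a minimal sketch, assuming Zend Framework 1.x is on the include path. The scratch directory name and the two sample articles are made up for illustration. Which fields an unqualified term such as 'framework' searches may depend on the library version.

```php
<?php
require_once 'Zend/Search/Lucene.php';
require_once 'Zend/Search/Lucene/Document.php';
require_once 'Zend/Search/Lucene/Field.php';

// Hypothetical scratch location for a throwaway index
$dir = sys_get_temp_dir() . '/query_syntax_demo_index';
$index = Zend_Search_Lucene::create($dir);

// Two made-up articles, using the same field layout as the feed indexer
$articles = array(
    array(
        'link'     => 'http://example.com/1',
        'title'    => 'Zend Framework released',
        'contents' => 'The framework ships today',
    ),
    array(
        'link'     => 'http://example.com/2',
        'title'    => 'PHP news',
        'contents' => 'Another framework appears',
    ),
);

foreach ($articles as $article) {
    $doc = new Zend_Search_Lucene_Document();
    $doc->addField(Zend_Search_Lucene_Field::Keyword('link', $article['link']));
    $doc->addField(Zend_Search_Lucene_Field::Text('title', $article['title']));
    $doc->addField(Zend_Search_Lucene_Field::UnStored('contents', $article['contents']));
    $index->addDocument($doc);
}
$index->commit();

// Run each query string from the article and list the matching titles
$queries = array('+framework', 'title:zend', 'framework -title:zend');
$hitCounts = array();

foreach ($queries as $query) {
    $hits = $index->find($query);
    $hitCounts[$query] = count($hits);

    echo "Query: " . $query . " (" . count($hits) . " hits)\n";
    foreach ($hits as $hit) {
        echo "\t" . $hit->title . "\n";
    }
}
```

Building a tiny index inside the script keeps the experiment repeatable; pointing $dir at '/tmp/feeds_index' and opening it instead of creating it would run the same queries against the real feed data.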