Tuesday, 25 August 2009

Prefix Match Search using Ternary Search Tree Dictionary in C#

I got a thank you email last week from someone who found my TST prefix match search code useful. I posted that code 2 and half years ago on CodeProject, very please to know someone is interested in it :-)

Ternary Search Tree is an useful data structure for certain type of project. It combines attributes of binary search trees and digital search tries. Like tries, they proceed character by character. Like binary search trees, they are space efficient, though each node has three children, rather than two. A search compares the current character in the search string with the character at the node. If the search character is less, the search goes to the left child; if the search character is greater, the search goes to the right child. When the search character is equal, though, the search goes to the middle child, and proceeds to the next character in the search string.

The idea behind ternary search trees dates back at least as far as 1964. In "Randomized Binary Searching with Tree Structures" (Communications of the ACM, March 1964), H.A. Clampett sketched a primitive version of the structure. Computer scientists have proven many theorems about the trees; for instance, searching for a string of length k in a ternary search tree with n strings will require at most O(log n+k) comparisons.

Useful ariticles:
http://www.comp.nus.edu.sg/~xujia/mirror/algorithm.myrice.com/resources/technical_artile/ternary_search_tree/terstree.htm
http://en.wikipedia.org/wiki/Ternary_search_tree

Ternary Search Tree is wildly used in computer science. There are implementation in C++, Java and .NET. I have found a C# version in http://www.codeproject.com/KB/recipes/tst.aspx.

This is a great C# implementation, although there are a couple of issues (please read the comments of the article), the main one is the partial search it provided is not what I expect, for example, provide words zymase, zyme and zymic, partial search for zym* only return zyme, partial search for zym** will return zymic. What I need is a prefix match search that will return all three words for zym. There is a guy provide a wildcard search function, but it didn't get back what I expect as well. So I wrote the prefixMatch feature based on a Java Ternary Tree example.

Feel free to change it to suite your requirements and polish it (e.g. exception handling etc ^_^). Currently it return a StringCollection. In my project I actually changed the dictionary to generic so can store and return different type of value object. I use a keysNumReturnValues to limit the number of results that returned, -1 means return all. I do this for my testing purpose.

I stored the words that in the words.txt in the TstDictionary(included in the Tst source code), when I search for zym, it give me the following results:
------------------------------------------------------------
type a prefix to do prefix match or type exit to exit
zym
-- end 19 results --
zymase
zyme
zymic
zymogen
zymogene
zymogenic
zymologic
zymological
zymologist
zymology
zymolysis
zymome
zymometer
zymophyte
zymoscope
zymose
zymosimeter
zymosis
zymotic

Hope you find the code useful.

------------------------ Source Code -------------------------------

///
/// Returns alphabetical list of all keys in the tree that begin with prefix.
///

///
The prefix.
///
public virtual ICollection PrefixMatch(string prefix)
{
return PrefixMatch(prefix, -1);
}

///
/// Returns alphabetical list of all keys in the tree that begin with prefix.
///

/// The prefix.
/// The number of values to return.
///
public virtual ICollection PrefixMatch(string prefix, int numReturnValues)
{
if (prefix==null)
throw new ArgumentNullException("prefix");
if (prefix.Length==0)
throw new ArgumentException("prefix is empty");
if (numReturnValues < -1 || numReturnValues == 0) throw new ArgumentException("invalid numReturnValues"); TstDictionaryEntry startNode = Find(prefix); if(startNode == null) return null; int keysNumReturnValues = numReturnValues; StringCollection keyList = new StringCollection(); if(startNode.IsKey) { keyList.Add(startNode.Key); DecreaseRecursionCount(ref keysNumReturnValues); } if(startNode.EqChild != null && (keysNumReturnValues == -1 || keysNumReturnValues > 0))
{
GetKeysRecursion(startNode.EqChild, ref keyList, ref keysNumReturnValues);
}

return keyList;
}


internal void GetKeysRecursion(TstDictionaryEntry currentNode, ref StringCollection keyList, ref int keysNumReturnValues)
{
if(currentNode == null) return;
if(currentNode.LowChild != null && (keysNumReturnValues == -1 || keysNumReturnValues > 0))
GetKeysRecursion(currentNode.LowChild, ref keyList, ref keysNumReturnValues);
if(currentNode.IsKey && (keysNumReturnValues == -1 || keysNumReturnValues > 0))
{
keyList.Add(currentNode.Key);
DecreaseRecursionCount(ref keysNumReturnValues);
}
if(currentNode.EqChild != null && (keysNumReturnValues == -1 || keysNumReturnValues > 0))
GetKeysRecursion(currentNode.EqChild, ref keyList, ref keysNumReturnValues);
if(currentNode.HighChild != null && (keysNumReturnValues == -1 || keysNumReturnValues > 0))
GetKeysRecursion(currentNode.HighChild, ref keyList, ref keysNumReturnValues);
}


internal void DecreaseRecursionCount(ref int keysNumReturnValues){
if(keysNumReturnValues > 0)
keysNumReturnValues--;

}

Friday, 21 August 2009

How to create P3P policy and implement P3P header

For website that require cookies, if using the web site in an Iframe, and parent website is on a different domain, then 3rd party(cross domain) cookies that do not have a compact policy will be blocked on some common browser privacy settings, e.g. IE when privacy setting is set to Medium.

This result on the web site inside the Iframe won't working properly, as session won't work if cookie is not enabled.

P3P compact policy can be applied to website as HTTP header so user agents such as IE can detect the P3P policy and decide whether or not to allow cookies from a particular site.

Please refer to http://en.wikipedia.org/wiki/P3P and http://www.w3.org/TR/P3P/ for further details on P3P definition.

If you are building a test site, you can grab some dummy P3P compact policy online(e.g. CP="CAO PSA OUR"), but if you need a proper one for your production website, generate your own one would be better :-)

There are some softwares or online generators can be used to generate P3P policy. I used the P3P Policy Editor from IBM http://www.alphaworks.ibm.com/tech/p3peditor/ to generate P3P policy.

Unzip the p3pExecutableJar and execute the p3p.jar to load up the software.

You can create a new policy from scratch or create one from one of the templates which I found is easier.

Please refer to this Knowledge Gene Create and Use P3P Policy for Website for detailed instructions.
* To view text views, select a knowde and click Attached Text icon on the right panel header.

Once the policy is created, it includes a Privacy Policy html file and the Compact Policy text, you need to deploy the Privacy Policy html to your website, and use the Compact Policy text in individual pages header.

For example, Page.Response.AddHeader("p3p","CP=\"CAO PSA OUR\""); for ASP.NET C#,
Adam Young's blog has a very good post about syntax on different languages.
Alternatively, you can set it in IIS for the whole web site. The KGene include instruction on how to set up HTTP header in IIS as well :-)

Wednesday, 19 August 2009

Start crocheting in a round

Knowledge Gene: Start crocheting in a round (PDF export available)

Many crochet projects require start crocheting in a round, for example, Amigurumi(Japanese art of knitting or crocheting small stuffed animals and anthropomorphic creatures), coasters, containers, hats, doily, individual pattern for crochet patchwork projects etc.
In general, there are 3 ways of doing this. Some people might have developed their own method.
Start with Multiple Chain stitches
Use multiple chain stitches to form a loop, e.g. use 6 or 8 ch. Example below is using 8 ch.


* picture is taken from KAGIBARI AMI NANDEMO Q&A - KONNATOKIANNATOKI Copyright© N.Seto 2003
Start from a slip knot then make 8 ch (chain stitich), use sl st(slip stitch) from the first ch to form a loop.
The disadvange of this method is the hole in the middle is quite big and not adjustable.

Start with single chain stitch
Use the first ch(chain stitich) as a loop.


* picture is taken from My First Love of Crocheting, author: Pan Meiling, published by www.winfortune.com.tw
The hole in the middle is smaller using this method.
Start with double ring
This is my favourite method. It used frequently in Japanese crochet projects.
* The demo picture below is taken from http://i-s.milkcafe.to/ami/kiso/ichiniju.html

1. Wrap the yarn twice around your finger (I usually wrap it around my two fingers instead).




2. Carefully took off the created loops and use two fingers to pinch them. Crochet the yarn out of the loop.

3. Then make a ch(chain stitch), note: this ch is not counted as one stitich in the 1st row. It just an upright stitch. Use the left hand finger to pull the yarn to control the tightness of the base of this ch.

4. Now start to crochet a sc(single stitch).

Now the first stitch of the 1st row is finished. Continue crochet 5 sc if the 1st row requires 6 stitches.

Next step is to adjust the loop. The left hand side thread is connect to the yarn ball, I named it T1. In the loop itself there are two threads, I named it T2 and T3. You need to pull one of them(T2 or T3) to tighten the loop. To find out which one you need to pull, slightly pull the left hand side yarn thread T1, don't pull too much, just pull it little bit so you can notice one of the yarn thread in the loop get tighter, assume it's T2, this is the thread you need to pull to make the loop smaller. So pull T2 and after the loop is minimized, you can see that T3 is the remaining thread in the loop, pull T1 again which will then tighten up the this remaining thead in the loop.

Then do a sl st(slip stitch) from the first ch(marked in green below) to form the circle. I normally will pull this sl st stitch tighter so it won't be too obvious and the project will look better.


How to Install East Asian Language Support(e.g. Chinese) to BlackBerry

I managed to install East Asian Language support to my BlackBerry 8900 Curve. I have been searching through lots of posts in order to find out how to just install the language support without upgrade the OS, and finished my installation without any heart attack, so I will share my installation steps here to save some effort if anybody needs it.

You can remove unwanted input after installation, e.g. I only keep Chinese input. To switch between input languages, press Entre while holding down Alt key.

For full installation details and screen shots, please look at this Knowledge Gene - [link: 'Install East Asian Language Support for BlackBerry 8900']

You can export the KGene and document using the pdf export icon, or hover over the first knowde, then click 'Learn more', navigate through the KGene, or click 'Show Full Document' to view the full doc :-)

Below is the outline steps:

1. Download Multi Language OS
For my 8900 Curve, I downloaded the 8900-v4.6.1.199_P4.2.0.102 version of OS.

2. Install the downloaded OS to your PC, you will need the files from this new OS later on to add multi language support to your BlackBerry device.
A new 8900-v4.6.1.199_P4.2.0.102 folder will be created under C:\Program Files\Common Files\Research In Motion\Shared\Loader Files.
The CJK.alx is the file we need to use later on.

3. Use Desktop Manager to backup your BlackBerry
Just in case... ^__^

4. Use AppLoader to add extra language support to BB.
Before the installation, go to your AppLoader folder, e.g. C:\Program Files\Common Files\Research In Motion\AppLoader, move the Vendor.xml to a backup directory, so you have kept a copy of the file.
Use the loader.exe to load up the AppLoader, follow the wizard, you will be able to add the CJK.alx file to the list at some point and select the desire languages.