
What are hash objects?
What is the purpose of a hash table?
Why bother learning them for table lookup operations?
1. Hash

From Wikipedia:
- In computing, a hash table (also hash map) is a data structure used to implement an associative array, a structure that can map keys to values.
From here you will get the idea:
"The basic idea behind hashing is to take a field in a record, known as the key, and convert it through some fixed process to a numeric value, known as the hash key, which represents the position to either store or find an item in the table. The numeric value will be in the range of 0 to n-1, where n is the maximum number of slots (or buckets) in the table.
The fixed process to convert a key to a hash key is known as a hash function. This function will be used whenever access to the table is needed.
One common method of determining a hash key is the division method of hashing."
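To make the division method concrete, here is a minimal sketch in Python (used only as a neutral illustration language). The names `division_hash`, `string_to_int` and the table size of 7 are assumptions for this example, not anything SAS uses internally.

```python
def division_hash(key: int, n_slots: int) -> int:
    """Division method: map an integer key to a slot in 0..n_slots-1."""
    if n_slots <= 0:
        raise ValueError("n_slots must be positive")
    return key % n_slots


def string_to_int(key: str) -> int:
    """Illustrative only: fold a string into an integer by summing its bytes."""
    return sum(key.encode("utf-8"))


N_SLOTS = 7  # assumed table size; a prime number is a common choice

# Slot assigned to each name used later in this post
slots = {
    name: division_hash(string_to_int(name), N_SLOTS)
    for name in ("John", "Ronald", "Barbara", "Alice")
}
print(slots)
```

Whatever the key is, the resulting slot always falls between 0 and n-1, which is what lets it be used directly as an array index.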
2. What for?
If you are in a computer science course, your professor may tell you the reason:
"Develop a structure that will allow user to insert/delete/find records in constant average time:
- structure will be a table (relatively small)
- table completely contained in memory
- implemented by an array
- capitalizes on ability to access any element of the array in constant time"
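The structure described in that list can be sketched as a small in-memory table backed by an array. This is a hedged illustration in Python, assuming collisions are handled by chaining (one of several possible strategies); the class name `ChainedHashTable` and its methods are hypothetical.

```python
class ChainedHashTable:
    """Fixed-size array of buckets; collisions are resolved by chaining."""

    def __init__(self, n_slots: int = 8) -> None:
        self._buckets = [[] for _ in range(n_slots)]

    def _bucket(self, key):
        # Built-in hash() plus the division method selects the array slot.
        return self._buckets[hash(key) % len(self._buckets)]

    def insert(self, key, value) -> None:
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def find(self, key):
        """Return the stored value, or None when the key is absent."""
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None

    def delete(self, key) -> bool:
        """Remove the key; return True if it was present."""
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return True
        return False


table = ChainedHashTable()
table.insert("John", 40)
table.insert("Alice", 60)
print(table.find("Alice"))   # 60
print(table.find("Nobody"))  # None
```

Each operation touches only one bucket, so as long as the buckets stay short the cost does not grow with the number of records stored.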
3. Why bother with "Hash" in SAS?
A time-consuming part of many SAS programs is looking up a value from one data set in another data set.
SAS 6 lookup methods, such as SET with KEY= or a format, are good for many applications. In SAS 9, however, there is a better tool: the DATA step hash object. The hash object provides a fast, easy way to perform in-memory table lookups without sorting or indexing. It can also store duplicate keys, and a find-frequency counter has been added.
For more reasons, you can read this ppt.
Simply put, DATA step hashing gives you speed and flexibility.
4. Learn by Example
Now we have two data sets "subject" and "weight".


We will:
- create the data sets;
- sort "subject" by the composite key (more than one variable) "age" and "name";
- look up the table by the key "name".


Here is the code to do it:
data subject;
  input name $ gender:$1. treatment $ age;
  datalines;
John M Placebo 40
Ronald M Drug-A 50
Barbara M Drug-B 40
Alice F Drug-A 60
;
run;
data _null_;
  if _n_ = 1 then do;
    /* declare the hash object; ordered:"a" keeps entries in ascending key order */
    declare hash hashsort(ordered:"a");
    /* identify variables age and name to use as the composite key */
    hashsort.DefineKey("age", "name");
    /* identify columns of data */
    hashsort.DefineData("name", "gender", "treatment", "age");
    /* complete the hashsort table definition */
    hashsort.DefineDone();
  end;
  /* one SET is enough: it defines the host variables at compile time and reads the rows */
  set subject end = eof;
  /* add data with key to hash object */
  hashsort.add();
  /* after the last row, write the data in key order using hashsort */
  if eof then
    hashsort.output(dataset:"sorted_subject");
run;
data weight (drop = i);
  input date:DATE9. @;
  do i = 1 to 4;
    input name $ weight @;
    output;
  end;
  datalines;
05May2006 Barbara 125 Alice 130 Ronald 170 John 70
04Jun2006 Barbara 122 Alice 133 Ronald 168 John 155
;
run;
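data result;
  /* define the host variables that the hash object fills in */
  length name treatment $8 gender $1 age 3;
  if _n_ = 1 then do;
    /* create the hash table from sorted_subject */
    declare hash h(dataset:"sorted_subject");
    h.defineKey("name");
    h.defineData("gender", "treatment", "age");
    h.defineDone();
    /* initialize the host variables to avoid "uninitialized" notes */
    call missing(gender, treatment, age);
  end;
  /* only weight drives the step, so every weight row is looked up */
  set weight;
  /* find() returns zero to indicate a key was found, otherwise a non-zero value */
  if h.find() = 0 then output;
run;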
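For readers who think in terms of general-purpose languages, the lookup in the last DATA step corresponds roughly to a dictionary lookup keyed on "name". The sketch below is a Python analogy, not SAS code. It reuses the data shown above, and the variable names `subject`, `weight` and `result` are chosen here only to mirror the data sets.

```python
# In-memory table keyed by name (plays the role of the hash object h)
subject = {
    "John": {"gender": "M", "treatment": "Placebo", "age": 40},
    "Ronald": {"gender": "M", "treatment": "Drug-A", "age": 50},
    "Barbara": {"gender": "M", "treatment": "Drug-B", "age": 40},
    "Alice": {"gender": "F", "treatment": "Drug-A", "age": 60},
}

# Rows of the weight data set: (date, name, weight)
weight = [
    ("05May2006", "Barbara", 125), ("05May2006", "Alice", 130),
    ("05May2006", "Ronald", 170), ("05May2006", "John", 70),
    ("04Jun2006", "Barbara", 122), ("04Jun2006", "Alice", 133),
    ("04Jun2006", "Ronald", 168), ("04Jun2006", "John", 155),
]

result = []
for date, name, pounds in weight:
    found = subject.get(name)  # analogous to h.find()
    if found is not None:      # analogous to find() returning 0
        result.append({"date": date, "name": name, "weight": pounds, **found})

for row in result:
    print(row)
```

No sorting or indexing of either input is needed: each weight row costs one lookup in the in-memory table.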
5. References:
- Getting Started with the DATA Step Hash Object
- Better Hashing in SAS 9.2
- Hash Objects - Why bother?