A Performant, Parallel, Random Acyclic-Hypergraph Approach
Current status: draft. Last update: 10th June, 2018.
TL;DR
This article documents a recent assignment regarding the implementation of a
perfect hash table. I discuss the initial goal as a set of requirements, then
capture design decisions and provide an implementation overview.
Requirements
Author a perfect hash table component that provides offline table
generation and fast lookup performance. Assume a key and value width of
ULONG (32 bits). Optimize for key set sizes ranging from
10,000 to 40,000 keys on average, with up to 100,000 keys on the high end.
Assume a key distribution similar to that of shared library address offsets;
not linear, but definitely not random, either. Do not make assumptions about
the presence of leading or trailing zeros (i.e. alignment), although feel free
to optimize for this if it doesn't detract from the component's performance on
less predictable key sets.
Prioritize lookup speed over generation speed. Optimize the lookup algorithm
to minimize latency involved in a single Value = Lookup(Key)
call. Table generation will be done ahead of time, in a separate process, and
only needs to ensure that tables can be generated in a reasonable amount of
time given the size of the input key set. Linear overhead is ideal, quadratic
less so, exponential would be infeasible.
Prioritize lookup speed over table size (memory requirements). A larger table
size, within reason, is an acceptable trade-off if it yields a faster lookup
speed. A minimal perfect hash table, where each key maps to exactly one table
location, is not a requirement, nor is it prohibited.
Note the inevitable tradeoff in size and performance with regards to the masking
method used by the implementation. Modulus-oriented solutions, those that use
the % operator in C, tend to be slower (modulus division can take upward of 90
cycles), but yield smaller table sizes. Solutions relying on power-of-2 based
table sizes boast much faster masking routines (e.g. Input &
(Size-1)), but incur greater table size overhead.
The offline generation process takes, as its input, a key file. The file will
be an array of ULONG keys (i.e. binary data). The number of
elements in the array can be ascertained by dividing the file size by
sizeof(ULONG). It produces, as its output, a perfect hash table
file that can be subsequently loaded and used in separate processes.
Callers wishing to use a given perfect hash table will need to load the file
produced in the step above. This will yield an interface through which the
hash table can be used. At a bare minimum, the interface should support
the following semantics:
The interface requirements are flexible and can be extended as long as the
criteria above are met at a bare minimum.
The behavior of looking up or inserting a key that wasn't in the original input
set is undefined. That is, the implementation is not required to detect or
protect against this scenario — that is the responsibility of the caller.
Feel free to review existing works on the topic, particularly the cmph open
source project, the GNU gperf library, and the plethora of papers on the subject
of perfect hashing and minimal perfect hashing.
This was an interesting project. I'd never written a perfect hash table before,
nor was I familiar with the landscape for doing such a thing. I spent about
three days reviewing existing work, including the cmph project's source code. (I ended up
collecting about 147 (!) documents on the topic (papers, PhD theses, slides,
etc.) in my PDFs repo over the course
of the project.)
Algorithm Decisions
The algorithm I settled on is the acyclic random 2-part hypergraph (or r-graph,
where r = 2). The algorithm works as follows: for each key, generate two
independent hash values. Mask these values such that they fall within the
confines of the number of vertices picked for the table (more on this
later; for now, assume the number of vertices exceeds the number of
keys, or edges by at least 2x). These masked hash values now become
the two vertices, and are added to a graph structure by a connecting edge. The
edge is simply the 0..N index being used for enumeration, e.g.:
Once constructed, the graph is assessed to determine whether or not it is
acyclic. (In practice, this is tested by repeatedly removing vertices with
a degree of 1, along with their incident edges; if the graph can be reduced
to nothing this way, it is acyclic.) We want an acyclic graph. If it's not
acyclic, the attempt has failed, the graph is thrown away, and a new attempt
is made, using new random seed data to drive the two hash functions. Once
an acyclic graph is found, it's relatively straightforward to convert this
into a data structure that can be used as a perfect hash table.
There is one thing that stood out in the authors' 1996 paper (page 9) that I
was able to verify experimentally (after finally hacking the
CHM algorithm
in the cmph project into a working state). To summarize, sans heavy math notation: the
probability that we find a perfect hash solution by identifying an acyclic
r-graph (r = 2) is 99.9% within 18 iterations. On average, a solution is found
in √3 attempts.
This is referred to as a geometric
distribution, something I hadn't come across before. It is a very desirable
trait, especially for this particular problem. In essence, the more we do it,
the more likely we'll figure it out. Assuming there is a solution (i.e. the
hash function is performing properly), the chance of us not solving it is provably
infinitesimal, which is neat.
Consider a gambler who is going all-in on heads against the Universe in an
infinite game of coin toss. We are the Universe. Play it long enough, and
we'll eventually see a tail.
I made the following notes around day 3 of the project:
In my experiments, even with 10 million random keys, graph creation took about
6-7 seconds on the 64-bit release build. On average it found a solution usually
within 1-3 iterations. The worst-case I saw was 7 iterations.
The nice thing about the graph creation step is that each iteration can be
palmed off to a threadpool worker, such that you can attempt to find a graph
solution in parallel up to NCPU. On my 12 core box at home, there is a very
high probability I'll find a solution in the first batch of 12 iterations
submitted in parallel — thus, my clock time for solving the perfect hash
stays relatively consistent at 6-7 seconds, give or take.
This algorithm is not the fastest, nor the most state-of-the-art, nor does
it have the lowest bits-per-key storage requirement, nor will I be aiming
for a minimal perfect hash function. However, it's simple (relatively),
I understand it, I can explain it on a whiteboard without having to
continually reference a paper, and it's definitely fast enough in the
generation stage based on our target static key sets. It's also old; graph
theory has flourished since the 60s, and this particular algorithm came onto
the scene in 1992 and has been cited widely.
(The current state-of-the-art, depicted in papers like
Fast Scalable Construction of (Minimal Perfect Hash) Functions (Genuzio,
Ottaviano, Vigna, 2016), uses "lazy Gaussian elimination" to tackle
the minimal perfect hash problem on ultra-large key sets (in the billions
and above). That is interesting, but not a wise choice for me to tackle in
a week, nor does it improve on our target static key sets, which are very
modest in size in comparison.)
Another reason I'm favoring the algorithm I've chosen: because the
generation stage is so cheap, relatively, and we have that nice
probabilistic guarantee that we'll "probably" find a solution by iteration
18, we get a lot of leeway with regards to experimenting with the
underlying hash functions used. I envision there being much faster, non-mod
based hash functions we can experiment with, that actually have relatively
poor "randomness" qualities unless a particularly good seed is found.
Combined with the threadpool infrastructure for submitting iterations in
parallel, I have a hunch that I'll be able to find some very fast hash
functions that can still be solved in an acceptable amount of time. This
will help greatly with our evaluation time, reducing the latency and CPU
cycles required to perform the lookup.
There is one large risk item associated with my current plan: the key
validation step of the cmph command line program just flat-out isn't working
properly. The graph generation step *appears* to be doing
the right thing, at least with regards to finding cyclic graphs and
discarding them, etc. The validation step works on small key set sizes;
however, after about 500 keys, I'm seeing rampant conflicts and severe
degradation of the underlying hash functions (i.e. everything is hashing to
394 or 85 or something).
The last paragraph depicts some of the issues I had trying to get the cmph
program to validate the perfect hash tables it was supposedly generating. Try
as I might, I just couldn't get the thing to generate solutions that were
actually valid (despite it reporting that they were valid) after a few hours
of fiddling.
More impressive, though, is that the bug survived and still exists in my
complete reimplementation of their initial modulus-oriented masking approach.
This only became apparent in the latter stages of the project, when I authored
more robust validation and test logic. Thankfully though, my power-of-2 based
approach does work, and it's a lot faster, so, who knows. The
modulus functionality was only implemented for a reference point, I didn't
anticipate using it as the final "fast" version of the perfect hash solution, so
I haven't investigated why it doesn't work properly any further. It's a little
disconcerting, though.
Initial Design Decisions
A few of the initial guiding sentiments regarding the design follow:
I wanted to be able to easily mix and match different algorithms, hash
functions and masking types. I figured this would make experimentation
easier, and generally promote sensible de-coupling of internal
components. Thus, the main interface for creating a perfect hash table
is parameterized by three enums for the desired
algorithm,
hash function, and
masking type, respectively. These identifiers, depicted below, are
also stored with the on-disk representation of the perfect hash table,
such that the loading component knows which implementations to select
when preparing an interface for use.
//
// Define an enumeration for identifying which backend algorithm variant to
// use for creating the perfect hash table.
//
typedef enum _PERFECT_HASH_TABLE_ALGORITHM_ID {

    //
    // Explicitly define a null algorithm to take the 0-index slot.
    // This makes enum validation easier.
    //

    PerfectHashTableNullAlgorithmId = 0,

    //
    // Begin valid algorithms.
    //

    PerfectHashTableDefaultAlgorithmId = 1,
    PerfectHashTableChm01AlgorithmId = 1,

    //
    // End valid algorithms.
    //

    //
    // N.B. Keep the next value last.
    //

    PerfectHashTableInvalidAlgorithmId,

} PERFECT_HASH_TABLE_ALGORITHM_ID;
typedef PERFECT_HASH_TABLE_ALGORITHM_ID *PPERFECT_HASH_TABLE_ALGORITHM_ID;
//
// Provide a simple inline algorithm validation routine.
//
FORCEINLINE
BOOLEAN
IsValidPerfectHashTableAlgorithmId(
    _In_ PERFECT_HASH_TABLE_ALGORITHM_ID AlgorithmId
    )
{
    return (
        AlgorithmId > PerfectHashTableNullAlgorithmId &&
        AlgorithmId < PerfectHashTableInvalidAlgorithmId
    );
}
//
// Define an enumeration for identifying which hash function variant to use.
//
typedef enum _PERFECT_HASH_TABLE_HASH_FUNCTION_ID {

    //
    // Explicitly define a null hash function to take the 0-index slot.
    // This makes enum validation easier.
    //

    PerfectHashTableNullHashFunctionId = 0,

    //
    // Begin valid hash functions.
    //

    PerfectHashTableHashCrc32RotateFunctionId = 1,
    PerfectHashTableDefaultHashFunctionId = 1,
    PerfectHashTableHashJenkinsFunctionId = 2,

    //
    // N.B. The following three hash functions are purposefully terrible from
    //      the perspective of generating a good distribution of hash values.
    //      They all have very simple operations and are intended to test the
    //      theory that even with a poor hash function, once we find the right
    //      seed, the hash quality is unimportant.
    //

    PerfectHashTableHashRotateXorFunctionId = 3,
    PerfectHashTableHashAddSubXorFunctionId = 4,
    PerfectHashTableHashXorFunctionId = 5,

    //
    // End valid hash functions.
    //

    //
    // N.B. Keep the next value last.
    //

    PerfectHashTableInvalidHashFunctionId,

} PERFECT_HASH_TABLE_HASH_FUNCTION_ID;
typedef PERFECT_HASH_TABLE_HASH_FUNCTION_ID
*PPERFECT_HASH_TABLE_HASH_FUNCTION_ID;
//
// Provide a simple inline hash function validation routine.
//
FORCEINLINE
BOOLEAN
IsValidPerfectHashTableHashFunctionId(
    _In_ PERFECT_HASH_TABLE_HASH_FUNCTION_ID HashFunctionId
    )
{
    return (
        HashFunctionId > PerfectHashTableNullHashFunctionId &&
        HashFunctionId < PerfectHashTableInvalidHashFunctionId
    );
}
//
// Define an enumeration for identifying the type of table masking used by the
// underlying perfect hash table. This has performance and size implications.
// Modulus masking typically results in smaller tables at the expense of slower
// modulus-based hash functions. Non-modulus masking requires power-of-2 sized
// tables, which will be larger, but the resulting mask function can be done
// by logical AND instructions, which are fast.
//
typedef enum _PERFECT_HASH_TABLE_MASK_FUNCTION_ID {

    //
    // Null masking type.
    //

    PerfectHashTableNullMaskFunctionId = 0,

    //
    // Begin valid masking types.
    //

    PerfectHashTableModulusMaskFunctionId = 1,
    PerfectHashTableAndMaskFunctionId = 2,
    PerfectHashTableDefaultMaskFunctionId = 2,
    PerfectHashTableXorAndMaskFunctionId = 3,
    PerfectHashTableFoldAutoMaskFunctionId = 4,
    PerfectHashTableFoldOnceMaskFunctionId = 5,
    PerfectHashTableFoldTwiceMaskFunctionId = 6,
    PerfectHashTableFoldThriceMaskFunctionId = 7,

    //
    // End valid masking types.
    //

    //
    // N.B. Keep the next value last.
    //

    PerfectHashTableInvalidMaskFunctionId,

} PERFECT_HASH_TABLE_MASK_FUNCTION_ID;
typedef PERFECT_HASH_TABLE_MASK_FUNCTION_ID
*PPERFECT_HASH_TABLE_MASK_FUNCTION_ID;
//
// Provide a simple inline masking type validation routine.
//
FORCEINLINE
BOOLEAN
IsValidPerfectHashTableMaskFunctionId(
    _In_ PERFECT_HASH_TABLE_MASK_FUNCTION_ID MaskFunctionId
    )
{
    return (
        MaskFunctionId > PerfectHashTableNullMaskFunctionId &&
        MaskFunctionId < PerfectHashTableInvalidMaskFunctionId
    );
}
//
// Masking tends to fall into one of two buckets: modulus and not-modulus.
// Provide an inline routine that is guaranteed to match all current and
// future modulus masking function IDs.
//

FORCEINLINE
BOOLEAN
IsModulusMasking(
    _In_ PERFECT_HASH_TABLE_MASK_FUNCTION_ID MaskFunctionId
    )
{
    return MaskFunctionId == PerfectHashTableModulusMaskFunctionId;
}
The problem parallelizes incredibly well. The more threads you can have
looking for an acyclic graph, the better. There was no way I was
implementing this as a single-threaded solution; it's rare to get a
problem so well suited to a multithreaded approach, and the threadpool
scaffolding provided by NT is sublime, so, that was a no-brainer.
(In fact, the way that it is currently written, you can't actually solve
the graph *without* using a threadpool. You can
dictate the level of concurrency you want, and specifying 1 means the
threadpool only gets one thread, which makes debugging easier, but there
is no way to run the solving process outside of a threadpool. This is by
design.)
Memory map all the things. All file system interaction is achieved via
sections and memory maps. There is a separate threadpool used for
handling file-oriented work (such as saving the solution to disk in
parallel whilst the main thread verifies it is correct — which
it will be unless we've got internal bugs). Memory mappings are used
for the source .keys file, the resulting .pht1 file for the perfect
hash table, and the .pht1:Info NTFS stream used to capture metadata
about the perfect hash table (such as which algorithm, hash function,
and masking function was used, sizes, stats etc).
For handling testing, given the time constraints, I went for a
big-bang systems level "self-test". Coupled with aggressive internal
ASSERT()ing, this worked out pretty well. A self-test .exe is provided,
which, when pointed at a data directory and given a set of algorithm,
hash and mask IDs, will process all *.keys files in the directory; for
each one, a perfect hash file is created, the on-disk version is then
loaded and subsequently tested. This is all handled by the routine
SelfTestPerfectHashTable, which also serves as a good example for
how to exercise the entire system end-to-end.
Current Performance
(Note: this section is an active work-in-progress. It documents observed
performance at the time of writing: 10th June, 2018. It is the most recently
authored section in this article, and thus depicts newer code samples and
implementation details compared to the subsequent sections. For example,
the FastIndex() and FastIndexUnsafe() routines
were only recently added as part of performance tuning and benchmarking work;
these routines aren't mentioned in subsequent sections yet.)
The performance of the solution can be evaluated in two distinct areas: the
performance of the perfect hash table generation functionality, i.e. how quickly
can we create a perfect hash table for any given key set, and the lookup
performance of the resulting table, i.e. how many cycles does it take to look up
a key (best case, worst case), how many memory references are made for a key
lookup, etc.
Performance TL;DR
Generation is really fast. Although the parallelism helps immensely, the
underlying acyclic hypergraph algorithm described in the original papers
deserves all of the credit.
(That being said, I couldn't get their modulus-based algorithm to
generate valid solutions, either in their original code or my
reimplementation of their algorithm. I'm glad I got my AND-based
masking version working, otherwise this assignment would have been
a disaster.)
The hash function I use has three CRC32 instructions, one rotate, and
one XOR. The latency for just the hashing logic is sub 10 cycles, give
or take.
The Jenkins hash function, for comparison, clocks in at about 30 cycles
for just the hashing logic. If Jenkins is paired with the original
modulus masking, the minimum latency jumps to over 100 cycles.
I investigated even cheaper hash functions, such as simply XOR'ing
the input with two random seeds. They often work on smaller graphs,
but the CRC32 + rotate + XOR already offers very attractive latency
and good randomness, so it's a better general purpose
option. Besides, there is little to gain in reducing the hashing cost
from sub 10 to, say, sub 6 cycles; at that point we're completely limited
by the overhead of the current function signature and table layout (i.e.
we could shave off more if we passed the seeds in as parameters via r8
and r9 instead of constantly loading them from the table).
The FastIndex() routine (i.e. what Lookup()
and Insert() call in order to obtain the array offset for
a given key) requires a minimum of two memory references within the
assigned array in order to construct the final index. All other memory
references within the routine will be served from the L1 cache with high
probability (e.g. the seed values).
The best case scenario, where everything results in an L1 cache hit,
is about 23 cycles of latency. I think I could get this down to about
15 cycles with an optimized LEAF_ENTRY routine written in assembly,
however, it doesn't make sense to pursue that until more robust tools
are in place for measuring performance.
Worst case scenario is cache misses for both lookups, resulting in a
fetch from main memory. That'll be hundreds of cycles of penalty.
Smaller table sizes should reduce the likelihood of misses on the lookup
fast path, however, access patterns will factor heavily into this; if,
say, the most frequent keys accessed fall within a smaller subset of the
total key set — the penalty for a larger table is offset. (E.g.
90% of the lookups are on 10% of the keys.) Making real-world key
frequency metrics available during table creation would definitely
add value when attempting to assess the cost of a larger table size.
Additional areas I'd look at to improve performance:
Analysis of the values in the input key set. What's the first
value, what's the last value, what's the minimum leading zeros,
maximum leading zeros, ditto for trailing zeros. What's the
value distribution look like? What's the gap between values
look like? Certain patterns will emerge for target key sets
that will allow for specific optimizations (like shifting away
known min trailing zeros, etc). I currently don't look at a
single key value during the creation step from the perspective
of directing the underlying algorithm.
Investigate prefetching. We could potentially dispatch a
prefetch (especially a prefetch with write-intent in the
Insert() case) early enough to offset some of
the memory latency involved in the two-part vertex lookup.
Write a LEAF_ENTRY in assembly that has zero error checking, no
COM dependencies (returns the index in rax, not HRESULT + output
parameter), and takes the seed values in registers. (Regarding
error checking: our FastIndex() routine still does
the sanity check that Vertex1 != Vertex2 —
the only way to trigger an E_FAIL return code in
this case is passing in a key that isn't in the original input
set, AND having it trigger a collision with the
selected hash function and random seed data such that the same
value is produced for both vertices. The chances of this are
very, very slim with reasonable hash functions and good seed
data. So, we could omit the S_OK or
E_FAIL return codes in favor of returning the index
directly in rax.)
Currently, the user is responsible for providing the algorithm,
hash function and masking type to use in order to create the
table. With the current work on benchmarking functions, this
could be improved such that the creation routine is given a goal
to maximize, e.g. lowest latency for FastIndex(),
and then it tries all known algo/hash/masking variants until it
finds the best solution. This wouldn't take that much more
effort and would ensure that the most optimal combo is always
being picked for a given key set.
Generation Performance
Broadly speaking, generation performance is very good. The parallel approach to
searching for acyclic graphs has paid off quite nicely — the thing just
flies. The current implementation is also incredibly defensive, and includes
numerous expensive operations that could be relegated to a debug build or
removed completely.
For example, in order to check the incoming keys are unique, I simply allocate a
512MB bitmap and set a bit for every key I see in the range of 0-MAX_ULONG. I
verify I only see every key once, and then as an extra check, count the number
of set bits to ensure they match the reported number of keys. This is overly
defensive and very abusive to the cache.
Similarly, once a graph has been found, an initial internal verification routine
walks the entire graph and verifies none of the assigned indices for the entire
key set appear more than once. This happens before the final solution is saved
to disk. Then, after we load the table from the on-disk instance, we do another
thorough system test, exercising all of the public API routines (e.g.
Index(), Lookup(), Insert(), etc),
verifying the table is behaving correctly and keys are being mapped to unique
indices.
Even with all this additional overhead in place, a solution is generally found
within about 5 seconds on my machines, even for the largest key sets we support
(currently around ~750,000 for no particular reason other than I haven't tested
larger yet). For the key set sizes more representative of our target domain
(e.g. around 10,000 to 100,000), the generation time feels like it's
basically instant, at least from the user perception of running the command
line and waiting for it to report back that it solved the graph.
One aspect worth noting is related to the initial hunch I expressed earlier:
.... I envision there being much faster, non-mod based hash functions we
can experiment with, that actually have relatively poor "randomness"
qualities unless a particularly good seed is found. Combined with the
threadpool infrastructure for submitting iterations in parallel, I have a
hunch that I'll be able to find some very fast hash functions that can still
be solved in an acceptable amount of time.
Just to elaborate a bit further on that note — I was anticipating a
situation where, when fed some overly-simple hash functions, the graph solving
would take far longer than the "99.9% probability of success by iteration 18"
guarantee cited by the academic work on the subject.
For example, assume we've got a 37,000 key set that solves on the first set of
attempts (i.e. within 1..NCPU), when fed a good, but expensive, hashing
function.
I pictured a scenario whereby, if fed a poorer-but-cheaper hash function, a
solution would be found in, say, 2 minutes. Or 5 minutes. Or basically any
time period that falls within our gut feeling of being acceptable for the given
input key set size.
That hunch appears to be wrong; I have not seen any evidence of this in
practice. A graph will either be solved within those first ~20 attempts,
or it will never be solved. On numerous occasions, I left the solver running
overnight, such that I had 11 out of 12 cores searching for solutions at 5GHz
for 8 hours, and... nada. No solution was ever found. We often got really
close — I put in functionality to track the highest number of deleted
edges we'd seen during solving, and remember seeing that we were regularly
hitting 499,987 for a key set size of 500,000 (so 13 away from finding a
solution). Graphs would either solve very quickly, or not at all.
Something else I found interesting is how quickly graphs would be solved
once I implemented support for detecting when the attempts have exceeded
a certain threshold (currently 100) and then resizing the table such that
we double the number of underlying vertices, and re-trying the graph solving
in parallel. Once this was in place, graphs would always be rapidly solved.
Supplying poor hash functions would result in a number of table resize events,
but as soon as the right size was found, the graph would be solved within the
first batch of attempts.
Lookup Performance
Ironically, the most critical piece of this project — optimizing the
performance of the Insert() and Lookup() routines
— received the least amount of attention, at least in relation to the
other components like the graph solving or threadpool scaffolding.
The first hash function, which I half-heartedly wrote as a quick stop-gap, is,
amusingly, still the best performing one in terms of offering the best trade-off
between lookup latency and hash quality (with the latter being proportional to
underlying table size). (It uses three CRC32 instructions, a rotate, and an
XOR. I creatively gave it the moniker Crc32Rotate.)
Smaller tables are ideal — the minimum table size we can hope for when
using power-of-2 masking is the number of keys, rounded up to a power of two,
yielding our edge count, and then rounding that up to the next power-of-2,
yielding our vertices count, and thus, table size.
If two hash functions yield a solution from the same vertices count, the
function with the lower latency (cycles per lookup) is preferable, because
there is simply no benefit to spending extra cycles trying to improve
the quality of the hash.
However, the same isn't necessarily true for two hash functions that yield
different table sizes. Every power-of-2 increase in table size results in
a larger memory footprint, which translates to a larger cache footprint,
which translates to an increased likelihood of cache misses when looking
up the two vertex values from the assigned array, which directly impacts
performance. The type of access pattern greatly affects performance in this
scenario, too; if a subset of keys is looked up far more frequently than
the others, those keys are likely to stay resident in the cache. However,
if the access pattern is truly random, the likelihood of a cache miss during
lookup is increased, because there's a larger footprint of possible values
the keys could be hashing to.
In order to give the Crc32Rotate hash function something to
be evaluated against, I eventually ended up authoring a version of the
Jenkins hash, which is widely used and cited, and is the hash function
used by the authors of the cmph package. I wasn't particularly attracted
to it when first picking a hash function as it has long dependency chains
in its instruction stream, which inhibit the out-of-order and superscalar
nature of contemporary CPUs (the Jenkins hash was authored in the late 90s,
for what it's worth).
Let's take a look at both implementations before reviewing performance.
We'll review the FastIndex() implementations of each routine,
which inline the AND-based masking such that we don't have to make three
function calls through the COM vtbl pipeline to perform what equates to a
very simple instruction (Value & (TableSize - 1)).
(Note: the FastIndex() idiom was only added very recently as
part of the benchmarking work; thus, it's not mentioned in the later code
walkthrough sections yet.)
_Use_decl_annotations_
HRESULT
PerfectHashTableFastIndexImplChm01Crc32RotateHashAndMask(
    PPERFECT_HASH_TABLE Table,
    ULONG Key,
    PULONG Index
    )
/*++
Routine Description:
Looks up given key in a perfect hash table and returns its index. This
is a fast version of the normal Index() routine that inlines the Crc32Rotate
hash function and AND masking.
N.B. If Key did not appear in the original set the hash table was created
from, the behavior of this routine is undefined. (In practice, the
key will hash to either an existing key's location or an empty slot,
so there is potential for returning a non-unique index.)
Arguments:
Table - Supplies a pointer to the table for which the key lookup is to be
performed.
Key - Supplies the key to look up.
Index - Receives the index associated with this key. The index will be
between 0 and Table->HashSize-1, and can be safely used to offset
directly into an appropriately sized array (e.g. Table->Values[]).
Return Value:
S_OK on success, E_FAIL if the underlying hash function returned a failure.
This will happen if the two hash values for a key happen to be identical.
It shouldn't happen once a perfect graph has been created (i.e. it only
happens when attempting to solve the graph). The Index parameter will
be cleared in the case of E_FAIL.
--*/
{
    ULONG A;
    ULONG B;
    ULONG C;
    ULONG D;
    ULONG Seed1;
    ULONG Seed2;
    ULONG Seed3;
    ULONG Input;
    PULONG Seeds;
    ULONG Masked;
    ULONG Vertex1;
    ULONG Vertex2;
    PULONG Assigned;
    ULONG MaskedLow;
    ULONG MaskedHigh;
    ULONGLONG Combined;

    //IACA_VC_START();

    //
    // Initialize aliases.
    //

    Seeds = &Table->Header->FirstSeed;
    Seed1 = Seeds[0];
    Seed2 = Seeds[1];
    Seed3 = Seeds[2];
    Input = Key;
    Assigned = Table->Data;

    //
    // Calculate the individual hash parts.
    //

    A = _mm_crc32_u32(Seed1, Input);
    B = _mm_crc32_u32(Seed2, _rotl(Input, 15));
    C = Seed3 ^ Input;
    D = _mm_crc32_u32(B, C);

    //IACA_VC_END();

    Vertex1 = A;
    Vertex2 = D;

    if (Vertex1 == Vertex2) {
        goto Error;
    }

    //
    // Mask each hash value such that it falls within the confines of the
    // number of vertices.
    //

    MaskedLow = Vertex1 & Table->HashMask;
    MaskedHigh = Vertex2 & Table->HashMask;

    //
    // Obtain the corresponding vertex values for the masked high and low hash
    // values. These are derived from the "assigned" array that we construct
    // during the creation routine's assignment step (GraphAssign()).
    //

    Vertex1 = Assigned[MaskedLow];
    Vertex2 = Assigned[MaskedHigh];

    //
    // Combine the two values, then perform the index masking operation, such
    // that our final index into the array falls within the confines of the
    // number of edges, or keys, in the table. That is, make sure the index
    // value is between 0 and Table->Keys->NumberOfElements-1.
    //

    Combined = (ULONGLONG)Vertex1 + (ULONGLONG)Vertex2;
    Masked = Combined & Table->IndexMask;

    //
    // Update the caller's pointer and return success. The resulting index
    // value represents the array offset index for this given key in the
    // underlying table, and is guaranteed to be unique amongst the original
    // keys in the input set.
    //

    *Index = Masked;

    //IACA_VC_END();

    return S_OK;

Error:

    //
    // Clear the caller's pointer and return failure. We should only hit this
    // point if the caller supplies a key that both: a) wasn't in the original
    // input set, and b) happens to result in a hash value where both the high
    // part and low part are identical, which is rare, but not impossible.
    //

    *Index = 0;
    return E_FAIL;
}
_Use_decl_annotations_
HRESULT
PerfectHashTableFastIndexImplChm01JenkinsHashAndMask(
PPERFECT_HASH_TABLE Table,
ULONG Key,
PULONG Index
)
/*++
Routine Description:
Looks up given key in a perfect hash table and returns its index. This
is a fast version of the normal Index() routine that inlines the Jenkins
hash function and AND masking.
N.B. If Key did not appear in the original set the hash table was created
from, the behavior of this routine is undefined. (In practice, the
key will hash to either an existing key's location or an empty slot,
so there is potential for returning a non-unique index.)
Arguments:
Table - Supplies a pointer to the table for which the key lookup is to be
performed.
Key - Supplies the key to look up.
Index - Receives the index associated with this key. The index will be
between 0 and Table->HashSize-1, and can be safely used to offset
directly into an appropriately sized array (e.g. Table->Values[]).
Return Value:
S_OK on success, E_FAIL if the underlying hash function returned a failure.
This will happen if the two hash values for a key happen to be identical.
It shouldn't happen once a perfect graph has been created (i.e. it only
happens when attempting to solve the graph). The Index parameter will
be cleared in the case of E_FAIL.
--*/
{
ULONG A;
ULONG B;
ULONG C;
ULONG D;
ULONG E;
ULONG F;
PBYTE Byte;
ULONG Seed1;
ULONG Seed2;
ULONG Input;
PULONG Seeds;
ULONG Masked;
ULONG Vertex1;
ULONG Vertex2;
PULONG Assigned;
ULONG MaskedLow;
ULONG MaskedHigh;
ULONGLONG Combined;
//
// Initialize aliases.
//
//IACA_VC_START();
Seeds = &Table->Header->FirstSeed;
Seed1 = Seeds[0];
Seed2 = Seeds[1];
Input = Key;
Byte = (PBYTE)&Input;
//
// Generate the first hash.
//
A = B = 0x9e3779b9;
C = Seed1;
A += (((ULONG)Byte[3]) << 24);
A += (((ULONG)Byte[2]) << 16);
A += (((ULONG)Byte[1]) << 8);
A += ((ULONG)Byte[0]);
A -= B; A -= C; A ^= (C >> 13);
B -= C; B -= A; B ^= (A << 8);
C -= A; C -= B; C ^= (B >> 13);
A -= B; A -= C; A ^= (C >> 12);
B -= C; B -= A; B ^= (A << 16);
C -= A; C -= B; C ^= (B >> 5);
A -= B; A -= C; A ^= (C >> 3);
B -= C; B -= A; B ^= (A << 10);
C -= A; C -= B; C ^= (B >> 15);
Vertex1 = C;
//
// Generate the second hash.
//
D = E = 0x9e3779b9;
F = Seed2;
D += (((ULONG)Byte[3]) << 24);
D += (((ULONG)Byte[2]) << 16);
D += (((ULONG)Byte[1]) << 8);
D += ((ULONG)Byte[0]);
D -= E; D -= F; D ^= (F >> 13);
E -= F; E -= D; E ^= (D << 8);
F -= D; F -= E; F ^= (E >> 13);
D -= E; D -= F; D ^= (F >> 12);
E -= F; E -= D; E ^= (D << 16);
F -= D; F -= E; F ^= (E >> 5);
D -= E; D -= F; D ^= (F >> 3);
E -= F; E -= D; E ^= (D << 10);
F -= D; F -= E; F ^= (E >> 15);
//IACA_VC_END();
Vertex2 = F;
if (Vertex1 == Vertex2) {
goto Error;
}
//
// Mask each hash value such that it falls within the confines of the
// number of vertices.
//
MaskedLow = Vertex1 & Table->HashMask;
MaskedHigh = Vertex2 & Table->HashMask;
//
// Obtain the corresponding vertex values for the masked high and low hash
// values. These are derived from the "assigned" array that we construct
// during the creation routine's assignment step (GraphAssign()).
//
Assigned = Table->Data;
Vertex1 = Assigned[MaskedLow];
Vertex2 = Assigned[MaskedHigh];
//
// Combine the two values, then perform the index masking operation, such
// that our final index into the array falls within the confines of the
// number of edges, or keys, in the table. That is, make sure the index
// value is between 0 and Table->Keys->NumberOfElements-1.
//
Combined = (ULONGLONG)Vertex1 + (ULONGLONG)Vertex2;
Masked = Combined & Table->IndexMask;
//
// Update the caller's pointer and return success. The resulting index
// value represents the array offset index for this given key in the
// underlying table, and is guaranteed to be unique amongst the original
// keys in the input set.
//
*Index = Masked;
//IACA_VC_END();
return S_OK;
Error:
//
// Clear the caller's pointer and return failure. We should only hit this
// point if the caller supplies a key that both: a) wasn't in the original
// input set, and b) happens to result in a hash value where both the high
// part and low part are identical, which is rare, but not impossible.
//
*Index = 0;
return E_FAIL;
}
As you can see, the CRC32 routine is doing far less work to derive the
two 32-bit hash values for a given input key. The IACA profiles for each
routine tell a similar story:
For the sake of comparison, let's take a look at the IACA profile of the
modulus-masking based approach used in the original academic papers. (Note that
I wasn't able to get their modulus-masking algorithm to generate valid graphs
without collisions — both in the original CMPH project and my
reimplementation of their algorithm. So, the following routine isn't used
anywhere. It is useful for depicting the impact of the modulus operator,
though.)
Converting the AND-masking to a %-mask for the Jenkins hash function results in
a diff that looks like this:
% diff -u JenkinsAnd.c JenkinsMod.c
--- JenkinsAnd.c 2018-06-10 20:48:49.330504900 -0700
+++ JenkinsMod.c 2018-06-10 20:50:06.373304600 -0700
@@ -1,6 +1,6 @@
_Use_decl_annotations_
HRESULT
-PerfectHashTableFastIndexImplChm01JenkinsHashAndMask(
+PerfectHashTableFastIndexImplChm01JenkinsHashModMask(
PPERFECT_HASH_TABLE Table,
ULONG Key,
PULONG Index
@@ -130,8 +130,8 @@
// number of vertices.
//
- MaskedLow = Vertex1 & Table->HashMask;
- MaskedHigh = Vertex2 & Table->HashMask;
+ MaskedLow = Vertex1 % Table->HashModulus;
+ MaskedHigh = Vertex2 % Table->HashModulus;
//
// Obtain the corresponding vertex values for the masked high and low hash
@@ -153,7 +153,7 @@
Combined = (ULONGLONG)Vertex1 + (ULONGLONG)Vertex2;
- Masked = Combined & Table->IndexMask;
+ Masked = Combined % Table->IndexModulus;
//
// Update the caller's pointer and return success. The resulting index
The three div operations bump the µop count from 119 up to 168, a 1.4x
increase. The block throughput is even worse, jumping from 43 to 119, a ~2.7x
increase. At 19 cycles, the CRC32-Rotate version is clearly in the lead: 2.2x
better than Jenkins, and 6.2x better than Jenkins with modulus masking as per
the original paper.
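The size/speed tradeoff described above ultimately boils down to a single instruction: when the table size is a power of 2, the modulus can be replaced by a bitwise AND against size minus one. The following standalone snippet (not from the project) demonstrates the equivalence:

```c
#include <assert.h>
#include <stdint.h>

/*
 * For a power-of-2 table size, masking with (Size - 1) keeps the low
 * bits of the hash, which is exactly Hash % Size -- but it compiles to
 * a single AND rather than a multi-cycle div.
 */
static uint32_t
MaskAnd(uint32_t Hash, uint32_t SizePowerOf2)
{
    return Hash & (SizePowerOf2 - 1);
}

static uint32_t
MaskMod(uint32_t Hash, uint32_t Size)
{
    return Hash % Size;
}
```

The equivalence only holds for power-of-2 sizes; a prime-modulus table can be smaller, which is exactly the tradeoff noted in the requirements.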
In Defense of Jenkins...
It definitely produces a better quality hash than the CRC32-based routine. This
is evidenced by the fact that, across our 50 auto-generated linear/random key
files of varying sizes, I've not seen Jenkins result in a single table
resize event.
On the other hand, on the same data set, the CRC32 routine typically needs
table resize events on around 15 of the files in order to find a solution. The
affected 15 files are all random ones (I've not seen any linear ones needing
resizing, even the large ones). There could be some pathological issue that is
easily fixed, but I haven't investigated in detail yet. On all of the key sets
that are more representative of our target, the CRC32 routine obtains a solution
easily.
Implementation Notes
(The following section serves as a general introduction to the project's
organization, concepts, files, etc. It aims to provide a bit of
background context to some of the idioms that will be observed when reviewing
the code. The next section, Code Walkthrough,
digs into the implementation details in a more end-to-end type fashion.)
From an implementation perspective, I decided to develop the component as part
of my tracer project for two
reasons: a) the tracer project already has a lot of scaffolding in place
(especially for the boring bits) that I'd be able to re-use, which is useful
when time-constrained, and b) a perfect hash table is actually a pretty
neat component that I can see myself using down the track.
The implementation is creatively-named PerfectHashTable,
and is a DLL.
Development of the project was done in Visual Studio 2017 via the
PerfectHashTable.sln
solution, which includes Debug, Release, PGInstrument and PGOptimize
configurations.
The coding style and conventions are the same as the rest of the tracer project:
Cutler Normal Form C, heavily commented, and generally very NT-esque in style.
Components in the tracer project have an interesting restriction in that they
can't use the C runtime library in any way (due to the need to potentially be
remote injected into a target process, and not wanting to deal with varying
levels of C runtime availability). Instead, they make extensive, exclusive
use of the NT runtime primitives afforded by ntdll.dll,
kernel32.dll, and ntoskrnl.exe. This
functionality is wrapped up by a component named
Rtl (inspired by the
NT runtime library of the same name). There is a huge structure named
RTL,
which contains the kitchen sink of things we need across all components, as well
as function pointers to DDK-type functionality normally not available if you're
including <Windows.h> (such as bitmaps, AVL tables, prefix
tables, etc).
(Note: one of the original design objectives of the RTL structure was to
facilitate writing components that could run in both kernel and user mode
without needing recompilation or #ifdefs. That is, you could author and test
the vast majority of your functionality in user mode, where development,
debugging and testing is easier, without needing to deploy it into the kernel
until much later in the development lifecycle.)
The reason for mentioning this is that the first two parameters for every public
function of the PerfectHashTable component are usually Rtl and
Allocator. E.g.:
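A minimal compilable sketch of that shape is shown below. PRTL, PALLOCATOR, and the table type are stubbed (the real definitions live in the Rtl and PerfectHashTable headers), and the routine body is a placeholder; the parameter order mirrors the LOAD_PERFECT_HASH_TABLE typedef from the public header:

```c
#include <assert.h>
#include <stddef.h>

/* Stub types standing in for the real Rtl/Allocator/table definitions. */
typedef struct _RTL *PRTL;
typedef struct _ALLOCATOR *PALLOCATOR;
typedef struct _PERFECT_HASH_TABLE *PPERFECT_HASH_TABLE;
typedef int BOOLEAN;

/*
 * Sketch of a typical public entry point: Rtl first, Allocator second,
 * then the routine-specific parameters. The body is a placeholder.
 */
BOOLEAN
LoadPerfectHashTableSketch(
    PRTL Rtl,
    PALLOCATOR Allocator,
    const wchar_t *Path,
    PPERFECT_HASH_TABLE *TablePointer
    )
{
    (void)Rtl; (void)Allocator; (void)Path;
    *TablePointer = NULL;
    return 1;
}
```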
The ALLOCATOR
structure encapsulates the memory management functions like
Calloc() and Free(), such that we're not dependent
upon CRT linkage to equivalent functions. Initially, it just had the basic set
of routines required to mimic Python's PyMemAllocatorEx structure:
malloc(), calloc(), realloc() and
free(). However, since then, it has grown to support a plethora of
routines, many of them driven by the TraceStore component, but also more
general functions, such as aligned memory allocation.
Short story long: if you see Rtl or Allocator
variables anywhere, it's just our runtime glue or memory allocator
stuff.
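To make the allocator side concrete, here is a rough sketch of the function-pointer arrangement involved, mirroring the PyMemAllocatorEx shape mentioned above. The field and routine names are illustrative, not the project's actual ALLOCATOR layout:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Illustrative sketch of the ALLOCATOR idea: a context pointer plus
 * function pointers for the basic routines, so components can allocate
 * memory without any CRT linkage. Names here are hypothetical.
 */
typedef struct _ALLOCATOR_SKETCH {
    void *Context;
    void *(*Malloc)(void *Context, size_t Size);
    void *(*Calloc)(void *Context, size_t NumberOfElements, size_t Size);
    void  (*Free)(void *Context, void *Address);
} ALLOCATOR_SKETCH;

/* A trivial backing implementation that forwards to the system heap. */
static void *SketchMalloc(void *C, size_t S) { (void)C; return malloc(S); }
static void *SketchCalloc(void *C, size_t N, size_t S) { (void)C; return calloc(N, S); }
static void  SketchFree(void *C, void *A) { (void)C; free(A); }

static void
InitializeAllocatorSketch(ALLOCATOR_SKETCH *Allocator)
{
    Allocator->Context = NULL;
    Allocator->Malloc = SketchMalloc;
    Allocator->Calloc = SketchCalloc;
    Allocator->Free = SketchFree;
}
```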
Moving on, all public functions get a SAL-annotated function pointer typedef in
the main public header file (PerfectHashTable.h).
Functions are defined for creating and destroying keys and perfect hash tables,
creating and destroying contexts (a context is required to create a table), and
loading a previously-created table.
typedef struct _Struct_size_bytes_(SizeOfStruct) _PERFECT_HASH_TABLE_API {
//
// Size of the structure, in bytes. This is filled in automatically by
// LoadPerfectHashTableApi() based on the initial SizeOfAnyApi parameter.
//
_In_range_(sizeof(struct _PERFECT_HASH_TABLE_API),
sizeof(struct _PERFECT_HASH_TABLE_API_EX)) ULONG SizeOfStruct;
//
// Number of function pointers contained in the structure. This is filled
// in automatically by LoadPerfectHashTableApi() based on the initial
// SizeOfAnyApi parameter divided by the size of a function pointer.
//
ULONG NumberOfFunctions;
//
// Begin function pointers.
//
union {
PVOID FirstFunctionPointer;
PSET_C_SPECIFIC_HANDLER SetCSpecificHandler;
};
PLOAD_PERFECT_HASH_TABLE_KEYS LoadPerfectHashTableKeys;
PDESTROY_PERFECT_HASH_TABLE_KEYS DestroyPerfectHashTableKeys;
PCREATE_PERFECT_HASH_TABLE_CONTEXT CreatePerfectHashTableContext;
PDESTROY_PERFECT_HASH_TABLE_CONTEXT DestroyPerfectHashTableContext;
PCREATE_PERFECT_HASH_TABLE CreatePerfectHashTable;
PLOAD_PERFECT_HASH_TABLE LoadPerfectHashTable;
PTEST_PERFECT_HASH_TABLE TestPerfectHashTable;
PINITIALIZE_PERFECT_HASH_TABLE_ALLOCATOR InitializePerfectHashAllocator;
PINITIALIZE_PERFECT_HASH_TABLE_ALLOCATOR_FROM_RTL_BOOTSTRAP
InitializePerfectHashAllocatorFromRtlBootstrap;
} PERFECT_HASH_TABLE_API;
typedef PERFECT_HASH_TABLE_API *PPERFECT_HASH_TABLE_API;
And the loader routine looks like this:
FORCEINLINE
BOOLEAN
LoadPerfectHashTableApi(
_In_ PRTL Rtl,
_Inout_ HMODULE *ModulePointer,
_In_opt_ PUNICODE_STRING ModulePath,
_In_ ULONG SizeOfAnyApi,
_Out_writes_bytes_all_(SizeOfAnyApi) PPERFECT_HASH_TABLE_ANY_API AnyApi
)
/*++
Routine Description:
Loads the perfect hash table module and resolves all API functions for
either the PERFECT_HASH_TABLE_API or PERFECT_HASH_TABLE_API_EX structure.
The desired API is indicated by the SizeOfAnyApi parameter.
Example use:
PERFECT_HASH_TABLE_API_EX GlobalApi;
PPERFECT_HASH_TABLE_API_EX Api;
Success = LoadPerfectHashTableApi(Rtl,
NULL,
NULL,
sizeof(GlobalApi),
(PPERFECT_HASH_TABLE_ANY_API)&GlobalApi);
ASSERT(Success);
Api = &GlobalApi;
In this example, the extended API will be provided as our sizeof(GlobalApi)
will indicate the structure size used by PERFECT_HASH_TABLE_API_EX.
Arguments:
Rtl - Supplies a pointer to an initialized RTL structure.
ModulePointer - Optionally supplies a pointer to an existing module handle
for which the API symbols are to be resolved. May be NULL. If not
NULL, but the pointed-to value is NULL, then this parameter will
receive the handle obtained by LoadLibrary() as part of this call.
If the module is no longer needed, but the program will
keep running, the caller should issue a FreeLibrary() against this
module handle.
ModulePath - Optionally supplies a pointer to a UNICODE_STRING structure
representing a path name of the module to be loaded.
If *ModulePointer is not NULL, it takes precedence over this parameter.
If NULL, and no module has been provided via *ModulePointer, loading
will be attempted via LoadLibraryA("PerfectHashTable.dll").
SizeOfAnyApi - Supplies the size, in bytes, of the underlying structure
pointed to by the AnyApi parameter.
AnyApi - Supplies the address of a structure which will receive resolved
API function pointers. The API furnished will depend on the size
indicated by the SizeOfAnyApi parameter.
Return Value:
TRUE on success, FALSE on failure.
--*/
This is a little quirky, but I've found it to be a useful way to
dynamically load and use small components at runtime without requiring
any static linking or dll export/import glue.
Publicly, the keys and context structures are opaque. Routines are exposed to
create them and destroy them, and that's it. The structure details are reserved
for the component's main private header,
PerfectHashTablePrivate.h,
which is available for inclusion by all of our internal .c files. Pre-compiled
headers are used, such that the only file each individual .c file needs to
include is stdafx.h.
The perfect hash table interface that implements the required
Insert() and Lookup() routines is exposed as a
COM-style vtbl structure named
PERFECT_HASH_TABLE_VTBL. It's not technically a COM interface, as we don't
expose a QueryInterface() function pointer in the first slot, but
it has the same initial IUnknown layout, and leverages the
reference counting facilities implied by AddRef() and
Release() to manage lifetime. (This differs from all of the other
structures, which will typically have both a Create and Destroy-type method.)
The structure, with an accompanying definition of the loader function pointer
and all public vtbl methods for interacting with the perfect hash table, looks
like this:
//
// Forward definition of the interface.
//
typedef struct _PERFECT_HASH_TABLE_VTBL PERFECT_HASH_TABLE_VTBL;
typedef PERFECT_HASH_TABLE_VTBL *PPERFECT_HASH_TABLE_VTBL;
typedef PERFECT_HASH_TABLE_VTBL **PPPERFECT_HASH_TABLE_VTBL;
//
// Define a minimal vtbl encapsulation structure if we're a public
// (i.e. non-internal) build. The actual structure is defined in
// PerfectHashTablePrivate.h.
//
#ifndef _PERFECT_HASH_TABLE_INTERNAL_BUILD
typedef struct _PERFECT_HASH_TABLE {
PPERFECT_HASH_TABLE_VTBL Vtbl;
} PERFECT_HASH_TABLE;
#else
typedef struct _PERFECT_HASH_TABLE PERFECT_HASH_TABLE;
#endif
typedef PERFECT_HASH_TABLE *PPERFECT_HASH_TABLE;
typedef
_Check_return_
_Success_(return != 0)
BOOLEAN
(NTAPI LOAD_PERFECT_HASH_TABLE)(
_In_ PRTL Rtl,
_In_ PALLOCATOR Allocator,
_In_opt_ PPERFECT_HASH_TABLE_KEYS Keys,
_In_ PCUNICODE_STRING Path,
_Out_ PPERFECT_HASH_TABLE *TablePointer
);
typedef LOAD_PERFECT_HASH_TABLE *PLOAD_PERFECT_HASH_TABLE;
//
// Define the public perfect hash table functions.
//
typedef
HRESULT
(NTAPI PERFECT_HASH_TABLE_INSERT)(
_In_ PPERFECT_HASH_TABLE Table,
_In_ ULONG Key,
_In_ ULONG Value,
_Out_opt_ PULONG PreviousValue
);
typedef PERFECT_HASH_TABLE_INSERT *PPERFECT_HASH_TABLE_INSERT;
typedef
HRESULT
(NTAPI PERFECT_HASH_TABLE_LOOKUP)(
_In_ PPERFECT_HASH_TABLE Table,
_In_ ULONG Key,
_Out_ PULONG Value
);
typedef PERFECT_HASH_TABLE_LOOKUP *PPERFECT_HASH_TABLE_LOOKUP;
typedef
HRESULT
(NTAPI PERFECT_HASH_TABLE_DELETE)(
_In_ PPERFECT_HASH_TABLE Table,
_In_ ULONG Key,
_Out_opt_ PULONG PreviousValue
);
typedef PERFECT_HASH_TABLE_DELETE *PPERFECT_HASH_TABLE_DELETE;
//
// Given a key, this routine returns the relative index of the key in the
// underlying hash table. This is guaranteed to be within the bounds of the
// table size.
//
typedef
HRESULT
(NTAPI PERFECT_HASH_TABLE_INDEX)(
_In_ PPERFECT_HASH_TABLE Table,
_In_ ULONG Key,
_Out_ PULONG Index
);
typedef PERFECT_HASH_TABLE_INDEX *PPERFECT_HASH_TABLE_INDEX;
typedef
HRESULT
(NTAPI PERFECT_HASH_TABLE_HASH)(
_In_ PPERFECT_HASH_TABLE Table,
_In_ ULONG Input,
_Out_ PULONGLONG Hash
);
typedef PERFECT_HASH_TABLE_HASH *PPERFECT_HASH_TABLE_HASH;
typedef
HRESULT
(NTAPI PERFECT_HASH_TABLE_MASK_HASH)(
_In_ PPERFECT_HASH_TABLE Table,
_In_ ULONG Input,
_Out_ PULONG Masked
);
typedef PERFECT_HASH_TABLE_MASK_HASH *PPERFECT_HASH_TABLE_MASK_HASH;
typedef
HRESULT
(NTAPI PERFECT_HASH_TABLE_MASK_INDEX)(
_In_ PPERFECT_HASH_TABLE Table,
_In_ ULONGLONG Input,
_Out_ PULONG Masked
);
typedef PERFECT_HASH_TABLE_MASK_INDEX *PPERFECT_HASH_TABLE_MASK_INDEX;
//
// Loaded hash tables are reference counted using the AddRef()/Release() COM
// semantics. The number of AddRef() calls should match the number of Release()
// calls. The resources will be released when the final Release() is called.
//
typedef
ULONG
(NTAPI PERFECT_HASH_TABLE_ADD_REF)(
_In_ PPERFECT_HASH_TABLE Table
);
typedef PERFECT_HASH_TABLE_ADD_REF *PPERFECT_HASH_TABLE_ADD_REF;
typedef
ULONG
(NTAPI PERFECT_HASH_TABLE_RELEASE)(
_In_ PPERFECT_HASH_TABLE Table
);
typedef PERFECT_HASH_TABLE_RELEASE *PPERFECT_HASH_TABLE_RELEASE;
//
// The interface as a vtbl. Note that we're *almost* a valid COM interface,
// except for the NULL pointer that will occupy the first slot where the impl
// for QueryInterface() is meant to live.
//
typedef struct _PERFECT_HASH_TABLE_VTBL {
PVOID Unused;
PPERFECT_HASH_TABLE_ADD_REF AddRef;
PPERFECT_HASH_TABLE_RELEASE Release;
PPERFECT_HASH_TABLE_INSERT Insert;
PPERFECT_HASH_TABLE_LOOKUP Lookup;
PPERFECT_HASH_TABLE_DELETE Delete;
PPERFECT_HASH_TABLE_INDEX Index;
PPERFECT_HASH_TABLE_HASH Hash;
PPERFECT_HASH_TABLE_MASK_HASH MaskHash;
PPERFECT_HASH_TABLE_MASK_INDEX MaskIndex;
} PERFECT_HASH_TABLE_VTBL;
typedef PERFECT_HASH_TABLE_VTBL *PPERFECT_HASH_TABLE_VTBL;
The COM-style vtbl pattern worked quite nicely here, and provided the perfect
amount of exposure between the public and private definitions of the table
interface. It allows us to mix and match who provides what with regards to
the vtbl function pointers, based on the selected algorithm, hash function
and masking type. The
PerfectHashTableConstants.h
file exposes an internal routine named
InitializeExtendedVtbl,
which looks like this:
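(The listing itself is omitted here; the idea can be sketched as follows, with simplified types and hypothetical leaf-routine names. The real InitializeExtendedVtbl works against PERFECT_HASH_TABLE_VTBL and selects the function pointers based on the chosen algorithm, hash function and masking type.)

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for the real vtbl. */
typedef struct _VTBL_SKETCH {
    void *Unused;    /* QueryInterface slot; intentionally NULL. */
    int (*Index)(void *Table, unsigned Key, unsigned *Index);
} VTBL_SKETCH;

/* Hypothetical backend-specific leaf routine. */
static int
IndexImplSketch(void *Table, unsigned Key, unsigned *Index)
{
    (void)Table;
    *Index = Key & 0xff;    /* Placeholder masking. */
    return 0;
}

/* Wire the generic slots plus the backend-specific Index() routine. */
static void
InitializeVtblSketch(VTBL_SKETCH *Vtbl)
{
    Vtbl->Unused = NULL;    /* Almost-COM: no QueryInterface. */
    Vtbl->Index = IndexImplSketch;
}
```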
Note that, thanks to COM, the first five routines can actually be serviced by
generic, algorithm-agnostic implementations. That is, because the
Insert(),
Lookup(),
and
Delete()
routines simply use the Index() routine to
obtain the array offset for the given input key, they don't
need to know anything about the backend algorithm, hash or
masking type. This actually makes all of these functions
very simple, as can be seen below:
_Use_decl_annotations_
HRESULT
PerfectHashTableInsert(
PPERFECT_HASH_TABLE Table,
ULONG Key,
ULONG Value,
PULONG PreviousValue
)
/*++
Routine Description:
Inserts the given key/value pair into the perfect hash table, optionally
returning the previous value for the key via the PreviousValue parameter.
If no prior insertion has taken place for this key, the previous value is
guaranteed to be 0.
N.B. If Key did not appear in the original set the hash table was created
from, the behavior of this routine is undefined. (In practice, the
key will hash to either an existing key's location or an empty slot,
so there is potential to corrupt the table in the sense that previously
inserted values will be trampled over.)
Arguments:
Table - Supplies a pointer to the table to insert the key/value into.
Key - Supplies the key to insert.
Value - Supplies the value to insert.
PreviousValue - Optionally supplies a pointer that will receive the previous
value at the relevant table location prior to this insertion. If no
prior insertion, the previous value is guaranteed to be 0.
Return Value:
S_OK in all normal operating conditions. E_FAIL may be returned in some
cases when passed a key not in the original input set. The PreviousValue
parameter, if non-NULL, will be cleared in this case.
--*/
{
ULONG Index;
ULONG Existing;
HRESULT Result;
//
// Obtain the index for this key.
//
Result = Table->Vtbl->Index(Table, Key, &Index);
if (FAILED(Result)) {
//
// Clear the caller's pointer if applicable and return error.
//
if (ARGUMENT_PRESENT(PreviousValue)) {
*PreviousValue = 0;
}
return E_FAIL;
}
//
// Get the existing value.
//
Existing = Table->Values[Index];
//
// Write the new value.
//
Table->Values[Index] = Value;
//
// Update the caller's pointer if applicable.
//
if (ARGUMENT_PRESENT(PreviousValue)) {
*PreviousValue = Existing;
}
return S_OK;
}
_Use_decl_annotations_
HRESULT
PerfectHashTableLookup(
PPERFECT_HASH_TABLE Table,
ULONG Key,
PULONG Value
)
/*++
Routine Description:
Looks up given key in a perfect hash table and returns the value set by
the Insert() routine. If no insertion has taken place for this key, this
routine guarantees to return 0 as the value.
N.B. If key did not appear in the original set the hash table was created
from, the behavior of this routine is undefined. (In practice, the
value returned will be the value for some other key in the table that
hashes to the same location -- or potentially an empty slot in the
table.)
Arguments:
Table - Supplies a pointer to the table for which the key lookup is to be
performed.
Key - Supplies the key to look up.
Value - Receives the value for the given key.
Return Value:
S_OK in all normal operating conditions. E_FAIL may be returned in some
cases when passed a key not in the original input set. The Value parameter
will be cleared in this case.
--*/
{
ULONG Index;
HRESULT Result;
//
// Obtain the index for this key.
//
Result = Table->Vtbl->Index(Table, Key, &Index);
if (FAILED(Result)) {
//
// Clear the caller's pointers and return the error code.
//
*Value = 0;
return E_FAIL;
}
*Value = Table->Values[Index];
return S_OK;
}
_Use_decl_annotations_
HRESULT
PerfectHashTableDelete(
PPERFECT_HASH_TABLE Table,
ULONG Key,
PULONG PreviousValue
)
/*++
Routine Description:
Deletes a key from a perfect hash table, optionally returning the value
prior to deletion back to the caller. Deletion simply clears the value
associated with the key, and thus, is a simple O(1) operation. Deleting
a key that has not yet been inserted has no effect other than potentially
returning 0 as the previous value. That is, a caller can safely issue
deletes of keys regardless of whether or not said keys were inserted first.
N.B. If key did not appear in the original set the hash table was created
from, the behavior of this routine is undefined. (In practice, the
key will hash to either an existing key's location or an empty slot,
so there is potential to corrupt the table in the sense that a
previously inserted value for an unrelated, valid key will be cleared.)
Arguments:
Table - Supplies a pointer to the table for which the key is to be deleted.
Key - Supplies the key to delete.
PreviousValue - Optionally supplies a pointer that will receive the previous
value at the relevant table location prior to this deletion. If no
prior insertion, the previous value is guaranteed to be 0.
Return Value:
S_OK in all normal operating conditions. E_FAIL may be returned in some
cases when passed a key not in the original input set. The PreviousValue
parameter, if non-NULL, will be cleared in this case.
--*/
{
ULONG Index;
ULONG Existing;
HRESULT Result;
//
// Obtain the index for this key.
//
Result = Table->Vtbl->Index(Table, Key, &Index);
if (FAILED(Result)) {
//
// Clear the caller's pointer if applicable and return error.
//
if (ARGUMENT_PRESENT(PreviousValue)) {
*PreviousValue = 0;
}
return E_FAIL;
}
//
// Get the existing value.
//
Existing = Table->Values[Index];
//
// Clear the value.
//
Table->Values[Index] = 0;
//
// Update the caller's pointer if applicable.
//
if (ARGUMENT_PRESENT(PreviousValue)) {
*PreviousValue = Existing;
}
return S_OK;
}
_Use_decl_annotations_
ULONG
PerfectHashTableAddRef(
PPERFECT_HASH_TABLE Table
)
/*++
Routine Description:
Increments the reference count for a perfect hash table.
Arguments:
Table - Supplies a pointer to the table for which the reference count
is to be incremented.
Return Value:
The new reference count.
--*/
{
return InterlockedIncrement(&Table->ReferenceCount);
}
_Use_decl_annotations_
ULONG
PerfectHashTableRelease(
PPERFECT_HASH_TABLE Table
)
/*++
Routine Description:
Decrements the reference count associated with a perfect hash table. If
this is the last reference, the table is destroyed.
Arguments:
Table - Supplies a pointer to the table for which the reference count
is to be decremented.
Return Value:
The new reference count.
--*/
{
ULONG Count = InterlockedDecrement(&Table->ReferenceCount);
PPERFECT_HASH_TABLE TablePointer = Table;
if (Count > 0) {
return Count;
}
DestroyPerfectHashTable(&TablePointer, NULL);
return Count;
}
I prefer this COM-style vtbl approach to the cmph project's approach, which
uses giant switch statements to select appropriate backends, e.g.:
cmph_t *cmph_new(cmph_config_t *mph)
{
cmph_t *mphf = NULL;
double c = mph->c;
DEBUGP("Creating mph with algorithm %s\n", cmph_names[mph->algo]);
switch (mph->algo)
{
case CMPH_CHM:
DEBUGP("Creating chm hash\n");
mphf = chm_new(mph, c);
break;
case CMPH_BMZ: /* included - Fabiano */
DEBUGP("Creating bmz hash\n");
mphf = bmz_new(mph, c);
break;
case CMPH_BMZ8: /* included - Fabiano */
DEBUGP("Creating bmz8 hash\n");
mphf = bmz8_new(mph, c);
break;
case CMPH_BRZ: /* included - Fabiano */
DEBUGP("Creating brz hash\n");
if (c >= 2.0) brz_config_set_algo(mph, CMPH_FCH);
else brz_config_set_algo(mph, CMPH_BMZ8);
mphf = brz_new(mph, c);
break;
case CMPH_FCH: /* included - Fabiano */
DEBUGP("Creating fch hash\n");
mphf = fch_new(mph, c);
break;
case CMPH_BDZ: /* included - Fabiano */
DEBUGP("Creating bdz hash\n");
mphf = bdz_new(mph, c);
break;
case CMPH_BDZ_PH: /* included - Fabiano */
DEBUGP("Creating bdz_ph hash\n");
mphf = bdz_ph_new(mph, c);
break;
case CMPH_CHD_PH: /* included - Fabiano */
DEBUGP("Creating chd_ph hash\n");
mphf = chd_ph_new(mph, c);
break;
case CMPH_CHD: /* included - Fabiano */
DEBUGP("Creating chd hash\n");
mphf = chd_new(mph, c);
break;
default:
assert(0);
}
return mphf;
}
int cmph_dump(cmph_t *mphf, FILE *f)
{
switch (mphf->algo)
{
case CMPH_CHM:
return chm_dump(mphf, f);
case CMPH_BMZ: /* included - Fabiano */
return bmz_dump(mphf, f);
case CMPH_BMZ8: /* included - Fabiano */
return bmz8_dump(mphf, f);
case CMPH_BRZ: /* included - Fabiano */
return brz_dump(mphf, f);
case CMPH_FCH: /* included - Fabiano */
return fch_dump(mphf, f);
case CMPH_BDZ: /* included - Fabiano */
return bdz_dump(mphf, f);
case CMPH_BDZ_PH: /* included - Fabiano */
return bdz_ph_dump(mphf, f);
case CMPH_CHD_PH: /* included - Fabiano */
return chd_ph_dump(mphf, f);
case CMPH_CHD: /* included - Fabiano */
return chd_dump(mphf, f);
default:
assert(0);
}
assert(0);
return 0;
}
The Index() routine is algorithm-specific, and forms the backbone of
the perfect hash table functionality: the ability to take a key and return a
unique index that can be used to offset into an array of data.
Here's what the Index()
routine looks like for our CHM algorithm implementation. Note that it too
leverages the COM interface for driving the hash and masking routines.
_Use_decl_annotations_
HRESULT
PerfectHashTableIndexImplChm01(
PPERFECT_HASH_TABLE Table,
ULONG Key,
PULONG Index
)
/*++
Routine Description:
Looks up given key in a perfect hash table and returns its index.
N.B. If Key did not appear in the original set the hash table was created
from, the behavior of this routine is undefined. (In practice, the
key will hash to either an existing key's location or an empty slot,
so there is potential for returning a non-unique index.)
Arguments:
Table - Supplies a pointer to the table for which the key lookup is to be
performed.
Key - Supplies the key to look up.
Index - Receives the index associated with this key. The index will be
between 0 and NumberOfKeys-1, and can be safely used to offset directly
into an appropriately sized array (e.g. Table->Values[]).
Return Value:
S_OK on success, E_FAIL if the underlying hash function returned a failure.
This will happen if the two hash values for a key happen to be identical.
It shouldn't happen once a perfect graph has been created (i.e. it only
happens when attempting to solve the graph). The Index parameter will
be cleared in the case of E_FAIL.
--*/
{
ULONG Masked;
ULONG Vertex1;
ULONG Vertex2;
ULONG MaskedLow;
ULONG MaskedHigh;
PULONG Assigned;
ULONGLONG Combined;
ULARGE_INTEGER Hash;
//
// Hash the incoming key into the 64-bit representation, which is two
// 32-bit ULONGs in disguise, each one driven by a separate seed value.
//
if (FAILED(Table->Vtbl->Hash(Table, Key, &Hash.QuadPart))) {
goto Error;
}
//
// Mask each hash value such that it falls within the confines of the
// number of vertices. That is, make sure the value is between 0 and
// Table->NumberOfVertices-1.
//
if (FAILED(Table->Vtbl->MaskHash(Table, Hash.LowPart, &MaskedLow))) {
goto Error;
}
if (FAILED(Table->Vtbl->MaskHash(Table, Hash.HighPart, &MaskedHigh))) {
goto Error;
}
//
// Obtain the corresponding vertex values for the masked high and low hash
// values. These are derived from the "assigned" array that we construct
// during the creation routine's assignment step (GraphAssign()).
//
Assigned = Table->Data;
Vertex1 = Assigned[MaskedLow];
Vertex2 = Assigned[MaskedHigh];
//
// Combine the two values, then perform the index masking operation, such
// that our final index into the array falls within the confines of the
// number of edges, or keys, in the table. That is, make sure the index
// value is between 0 and Table->Keys->NumberOfElements-1.
//
Combined = (ULONGLONG)Vertex1 + (ULONGLONG)Vertex2;
if (FAILED(Table->Vtbl->MaskIndex(Table, Combined, &Masked))) {
goto Error;
}
//
// Update the caller's pointer and return success. The resulting index
// value represents the array offset index for this given key in the
// underlying table, and is guaranteed to be unique amongst the original
// keys in the input set.
//
*Index = Masked;
return S_OK;
Error:
//
// Clear the caller's pointer and return failure. We should only hit this
// point if the caller supplies a key that both: a) wasn't in the original
// input set, and b) happens to result in a hash value where both the high
// part and low part are identical, which is rare, but not impossible.
//
*Index = 0;
return E_FAIL;
}
This routine lives in our
Chm_01.c
file, which is our only algorithm backend at the moment. This file, together
with its counterpart
Chm_01.h
header, features all of the algorithm-specific structures and routines related
to the implementation. It is here we define the graph structures specific to
the CHM algorithm: GRAPH_DIMENSIONS, GRAPH_INFO, and
GRAPH.
//
// Define the primary dimensions governing the graph size.
//
typedef struct _GRAPH_DIMENSIONS {
//
// Number of edges in the graph. This corresponds to the number of keys
// in our input set. If modulus masking is active, the number of keys and
// the number of edges will be identical. Otherwise, the number of edges
// will be the number of keys rounded up to a power of 2.
//
ULONG NumberOfEdges;
//
// Total number of edges in the graph. This will be twice the size of the
// NumberOfEdges value above, due to the quirky way the underlying r-graph
// algorithm captures two hash values in the same list and offsets the
// second set after the first, e.g.:
//
// Edge2 = Edge1 + Graph->NumberOfEdges;
//
ULONG TotalNumberOfEdges;
//
// Number of vertices in the graph. This will vary based on the masking
// type. It is doubled every time a graph resize event is encountered.
//
ULONG NumberOfVertices;
//
// The number of edges in the graph, rounded up to a power of 2, and then
// shifted the appropriate amount to extract the exponent part for 2^n.
//
BYTE NumberOfEdgesPowerOf2Exponent;
//
// As above, but rounded up to the next power of 2 first.
//
BYTE NumberOfEdgesNextPowerOf2Exponent;
//
// The same exponent logic applied to the number of vertices as per the
// NumberOfVertices field above.
//
BYTE NumberOfVerticesPowerOf2Exponent;
//
// And again for the next value.
//
BYTE NumberOfVerticesNextPowerOf2Exponent;
} GRAPH_DIMENSIONS;
C_ASSERT(sizeof(GRAPH_DIMENSIONS) == 16);
typedef GRAPH_DIMENSIONS *PGRAPH_DIMENSIONS;
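To make the relationships between these fields concrete, here is a small Python sketch of how the dimensions work out under non-modulus masking, using the 37,209-key sample file that appears later in the article (the `round_up_pow2` helper is hypothetical; the vertex rounding rule follows the field comments above and the NumberOfTableElements description later on):

```python
def round_up_pow2(n):
    """Smallest power of two greater than or equal to n."""
    power = 1
    while power < n:
        power <<= 1
    return power

num_keys = 37209
num_edges = round_up_pow2(num_keys)           # 65536 (2^16)
total_num_edges = num_edges * 2               # 131072; second hash is offset
num_vertices = round_up_pow2(num_edges + 1)   # next power of 2: 131072 (2^17)

# The exponent fields capture just the "n" in 2^n.
edges_exponent = num_edges.bit_length() - 1        # 16
vertices_exponent = num_vertices.bit_length() - 1  # 17
```

These numbers line up with the self-test output shown later: 37,209 keys produce a table with 131,072 vertices.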
//
// Define various memory offsets associated with a given graph structure.
// This allows parallel worker threads to reset their local GRAPH instance
// back to the initial state each time they want to try a new random seed.
//
typedef struct _GRAPH_INFO {
//
// Number of pages consumed by the entire graph and all backing arrays.
//
ULONG NumberOfPagesPerGraph;
//
// Page size (e.g. 4096, 2MB).
//
ULONG PageSize;
//
// Total number of graphs created. This will match the maximum concurrency
// level of the upstream context.
//
ULONG NumberOfGraphs;
//
// Number of RTL_BITMAP structures used by the graph.
//
USHORT NumberOfBitmaps;
//
// Size of the GRAPH structure.
//
USHORT SizeOfGraphStruct;
//
// System allocation granularity. We align the memory map for the on-disk
// structure using this value initially.
//
ULONG AllocationGranularity;
//
// If a masking type other than modulus is active, the AbsoluteEdge() routine needs
// a way to mask edge values that exceed the number of edges in the table.
// It does this via EdgeMask, which is initialized to the number of edges
// (which will be power-of-2 sized for non-modulus masking), minus 1, such
// that all lower bits will be set.
//
ULONG EdgeMask;
//
// Also capture the mask required to isolate vertices.
//
ULONG VertexMask;
//
// Graph dimensions. This information is duplicated in the graph due to
// it being accessed frequently.
//
GRAPH_DIMENSIONS Dimensions;
//
// Pointer to the owning context.
//
PPERFECT_HASH_TABLE_CONTEXT Context;
//
// Base address of the entire graph allocation.
//
union {
PVOID BaseAddress;
struct _GRAPH *FirstGraph;
};
//
// Array sizes.
//
ULONGLONG EdgesSizeInBytes;
ULONGLONG NextSizeInBytes;
ULONGLONG FirstSizeInBytes;
ULONGLONG PrevSizeInBytes;
ULONGLONG AssignedSizeInBytes;
ULONGLONG ValuesSizeInBytes;
//
// Deleted edges bitmap buffer size.
//
ULONGLONG DeletedEdgesBitmapBufferSizeInBytes;
//
// Visited vertices bitmap buffer size.
//
ULONGLONG VisitedVerticesBitmapBufferSizeInBytes;
//
// Assigned bitmap buffer size.
//
ULONGLONG AssignedBitmapBufferSizeInBytes;
//
// Index bitmap buffer size.
//
ULONGLONG IndexBitmapBufferSizeInBytes;
//
// The allocation size of the graph, including structure size and all
// array and bitmap buffer sizes.
//
ULONGLONG AllocSize;
//
// Allocation size rounded up to the nearest page size multiple.
//
ULONGLONG FinalSize;
} GRAPH_INFO;
typedef GRAPH_INFO *PGRAPH_INFO;
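The FinalSize field at the end is the standard round-up-to-page-multiple calculation. A one-line sketch (assuming, as the graph allocator can, a power-of-2 page size):

```python
def round_up_to_page(alloc_size, page_size=4096):
    # Valid only when page_size is a power of two: adding page_size - 1 and
    # clearing the low bits rounds up to the next page boundary.
    return (alloc_size + page_size - 1) & ~(page_size - 1)
```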
//
// Define the graph structure. This represents an r-graph, or a hypergraph,
// or an r-partite 2-uniform graph, or any other seemingly unlimited number
// of names floating around in academia for what appears to be exactly the
// same thing.
//
typedef struct _Struct_size_bytes_(SizeOfStruct) _GRAPH {
//
// List entry used to push the graph onto the context's work list.
//
SLIST_ENTRY ListEntry;
//
// Edge and vertex masks that can be used when non-modulus masking is in
// place. Both of these values are duplicated from the info structure as
// they are accessed frequently.
//

//
ULONG EdgeMask;
ULONG VertexMask;
//
// Duplicate the mask type, as well, as this directs AbsoluteEdge()'s
// decision to use the two masks above.
//
PERFECT_HASH_TABLE_MASK_FUNCTION_ID MaskFunctionId;
//
// Duplicate the number of keys, as this is also frequently referenced.
//
ULONG NumberOfKeys;
//
// Structure size, in bytes.
//
_Field_range_(==, sizeof(struct _GRAPH)) ULONG SizeOfStruct;
//
// Graph flags.
//
GRAPH_FLAGS Flags;
//
// Pointer to the info structure describing various sizes.
//
PGRAPH_INFO Info;
//
// Graph attempt. This ID is derived from an interlocked increment against
// Context->Attempts, and represents the attempt number across all threads.
//
ULONGLONG Attempt;
//
// A localized attempt number that reflects the number of attempts made
// by just this thread.
//
ULONG ThreadAttempt;
//
// Thread ID of the thread that owns us. Each callback thread is provided
// a single graph, and will attempt to solve the perfect hash table until
// told otherwise. Thus, there's a 1:1 relationship between graph instance
// and owning thread.
//
ULONG ThreadId;
//
// Counter that is incremented each time we delete an edge during the
// acyclic graph detection stage.
//
ULONG DeletedEdgeCount;
//
// Counter that is incremented each time we visit a vertex during the
// assignment stage.
//
ULONG VisitedVerticesCount;
//
// Capture collisions during assignment step.
//
ULONG Collisions;
//
// Inline the GRAPH_DIMENSIONS structure. This is available from the
// GRAPH_INFO structure, however, it's accessed frequently, so we inline
// it to avoid the extra level of indirection.
//
union {
struct {
ULONG NumberOfEdges;
ULONG TotalNumberOfEdges;
ULONG NumberOfVertices;
BYTE NumberOfEdgesPowerOf2Exponent;
BYTE NumberOfEdgesNextPowerOf2Exponent;
BYTE NumberOfVerticesPowerOf2Exponent;
BYTE NumberOfVerticesNextPowerOf2Exponent;
};
GRAPH_DIMENSIONS Dimensions;
};
//
// Duplicate the context pointer. (This is also available from Info.)
//
PPERFECT_HASH_TABLE_CONTEXT Context;
//
// Edges array. The number of elements in this array is governed by the
// TotalNumberOfEdges field, and will be twice the number of edges.
//
PEDGE Edges;
//
// Array of the "next" edge array, as per the referenced papers. The number
// of elements in this array is also governed by TotalNumberOfEdges.
//
PEDGE Next;
//
// Array of vertices. Number of elements is governed by the
// NumberOfVertices field.
//
PVERTEX First;
//
// The original CHM paper in 1996 references a "prev" array to "facilitate
// fast deletion". However, the chmp project appears to have switched to
// using bitmaps. Let's reserve a slot for the "prev" array anyway.
//
PVERTEX Prev;
//
// Array of assigned vertices. Number of elements is governed by the
// NumberOfVertices field.
//
PVERTEX Assigned;
//
// Array of values indexed by the offsets in the Assigned array. This
// essentially allows us to simulate a loaded table that supports the
// Insert(), Index() and Lookup() routines as part of graph validation.
//
PULONG Values;
//
// Bitmap used to capture deleted edges as part of the acyclic detection
// stage. The SizeOfBitMap will reflect TotalNumberOfEdges.
//
RTL_BITMAP DeletedEdges;
//
// Bitmap used to capture vertices visited as part of the assignment stage.
// The SizeOfBitMap will reflect NumberOfVertices.
//
RTL_BITMAP VisitedVertices;
//
// Bitmap used to test the correctness of the Assigned array.
//
RTL_BITMAP AssignedBitmap;
//
// Bitmap used to track indices during the assignment step.
//
RTL_BITMAP IndexBitmap;
//
// Capture the seeds used for each hash function employed by the graph.
//
ULONG NumberOfSeeds;
struct {
union {
struct {
union {
ULONG Seed1;
ULONG FirstSeed;
};
ULONG Seed2;
};
ULARGE_INTEGER Seeds12;
};
union {
struct {
ULONG Seed3;
union {
ULONG Seed4;
ULONG LastSeed;
};
};
ULARGE_INTEGER Seeds34;
};
};
} GRAPH;
typedef GRAPH *PGRAPH;
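The fields above hint at the two stages of a solving attempt: acyclic-graph detection (repeatedly deleting edges that have a degree-1 vertex) and assignment (walking the peeled edges backwards and fixing vertex values so each key's two vertices sum to its index). The following single-threaded Python sketch captures the essence only: it uses adjacency sets instead of the Edges/Next/First arrays, modulus instead of power-of-2 masking, and made-up multiplicative hashes, so treat it as illustrative rather than a faithful port:

```python
import random
from collections import defaultdict

def peel_and_assign(edges, num_vertices, index_mod):
    """Acyclic test via edge peeling, then CHM-style assignment.
    Assumes no self-loop edges (the caller filters vertex collisions).
    Returns the assigned array, or None if the graph is cyclic."""
    adj = defaultdict(set)              # vertex -> ids of incident edges
    for i, (u, v) in enumerate(edges):
        adj[u].add(i)
        adj[v].add(i)
    order = []                          # (free vertex, edge id) in peel order
    deleted = [False] * len(edges)
    stack = [v for v in adj if len(adj[v]) == 1]
    while stack:
        v = stack.pop()
        if len(adj[v]) != 1:
            continue                    # stale entry; degree changed
        (i,) = adj[v]
        deleted[i] = True
        order.append((v, i))
        for x in edges[i]:
            adj[x].discard(i)
            if len(adj[x]) == 1:
                stack.append(x)
    if not all(deleted):
        return None                     # a cycle remains; try new seeds
    # Assignment: walk the peel order backwards so that, for each edge, the
    # non-free vertex already has its final value; force g[u] + g[v] == i.
    g = [0] * num_vertices
    for v, i in reversed(order):
        u, w = edges[i]
        other = w if v == u else u
        g[v] = (i - g[other]) % index_mod
    return g

def solve(keys, num_vertices, max_attempts=100):
    """Retry random seed pairs until the keys induce an acyclic 2-graph."""
    for attempt in range(max_attempts):
        rng = random.Random(attempt)
        s1, s2 = rng.getrandbits(32), rng.getrandbits(32)
        h1 = lambda k: ((k * 2654435761) ^ s1) % num_vertices
        h2 = lambda k: ((k * 2246822519) ^ s2) % num_vertices
        edges = [(h1(k), h2(k)) for k in keys]
        if any(u == v for u, v in edges):
            continue                    # vertex collision; next seed pair
        g = peel_and_assign(edges, num_vertices, len(keys))
        if g is not None:
            return s1, s2, g
    return None
```

The parallel version is essentially many worker threads each running the body of solve() against their own GRAPH instance with independent random seeds, racing to be the first to find an acyclic graph.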
In general, the structures and functions defined in our
PerfectHashTablePrivate.h
file are algorithm agnostic. That is, they don't deal with the notion of a
graph, as that's an implementation detail of the CHM algorithm.
To generate some test data, you have two options. If you have Python and NumPy
installed, you can generate a bunch of linear and random key files via the
generate.py
file that lives in the data subdirectory of the
PerfectHashTable component. Alternatively, you can check out a
small git repository that houses some existing sample test data.
Python Generated
The generate.py file follows. It will generate linear and random
arrays of given sizes, saving a .txt representation of the values, a .keys
representation of the binary data (which is the actual file we load as part of
perfect hash table creation), and, if matplotlib is installed, a .png plot of
the data, which allows a quick sanity check that the value distribution is what
we expect.
(Note: the random routine creates a double-sized array, then calls
np.unique() against it to ensure all the key values are unique,
then isolates the [:size] portion we're actually interested in.
One side effect of calling np.unique() is that our resulting array
is sorted. We don't stipulate whether key files are required to be sorted at
the moment, as it makes no difference to the underlying perfect hash table
algorithm we're using. i.e. we don't suffer the sort of pathological worst-case
performance issues some data structures exhibit when inserting data in sorted
order. Nevertheless, if you look at the random files compared to the linear
files, they're usually quite similar, given everything is sorted.)
from __future__ import print_function
import numpy as np

try:
    import matplotlib.pyplot as plt

    # Turn off interactive mode to disable plots being displayed
    # prior to saving them to disk.

    plt.ioff()

    def save_array_plot_to_png_file(filename, a):
        plt.plot(a)
        plt.savefig(filename)
        print("Generated %s." % filename)

except ImportError:
    def save_array_plot_to_png_file(filename, a):
        pass

def save_array_to_text_file(filename, a):
    with open(filename, 'wb') as f:
        for i in a:
            f.write(str(i).zfill(9).encode('ascii'))
            f.write(b'\n')
    print("Generated %s." % filename)

def save_array_to_binary_file(filename, a):
    fp = np.memmap(filename, dtype='uint32', mode='w+', shape=a.shape)
    fp[:] = a[:]
    del fp
    print("Generated %s." % filename)

def save_array(prefix, a):
    l = len(a)
    png_filename = '%s-%d.png' % (prefix, l)
    text_filename = '%s-%d.txt' % (prefix, l)
    binary_filename = '%s-%d.keys' % (prefix, l)
    save_array_plot_to_png_file(png_filename, a)
    save_array_to_text_file(text_filename, a)
    save_array_to_binary_file(binary_filename, a)

def gen_random_unique_array(size):
    a = np.random.randint(0, (1 << 32)-1, size * 2, dtype='uint32')
    a = np.unique(a)[:size]
    return a if len(a) == size else None

def gen_linear_array(size):
    a = np.linspace(0, size, num=size, dtype='uint32')
    return a

sizes = (
    2000, 4000, 4050, 5000, 10000, 15000, 17000, 25000, 31000, 33000, 50000,
    63000, 65500, 75000, 100000, 121000, 125000, 150000, 200000, 225000, 245000,
    255000, 265000, 389161, 472374,
)

functions = (
    ('linear', gen_linear_array),
    ('random', gen_random_unique_array),
)

def main():
    for (name, func) in functions:
        for size in sizes:
            a = func(size)
            if a is None:
                continue
            save_array(name, a)

if __name__ == '__main__':
    main()
If you don't have Python and NumPy installed, or you can't get the
generate.py script to work, you can clone a small git repository
named perfecthash, which
contains a data directory that has a test file in it named
mshtml-37209.keys:
C:\Users\Trent\Home\src> git clone https://github.com/tpn/perfecthash
...
C:\Users\Trent\Home\src> cd perfecthash\data
C:\Users\Trent\Home\src\perfecthash\data> dir
...
Running the Self-Test
If you clone the tracer project
and open
PerfectHashTable.sln
in Visual Studio 2017, you should be able to build the project from scratch.
Selecting the Debug build configuration will make the code easier to step
through if you'd like to follow along with an interactive debugging session.
Let's run the debug build of the self-test program against the data directory we
prepared earlier. (I'll use the perfecthash\data directory for now
as it only has two key files in it, mshtml-10.keys and
mshtml-37209.keys.) It's worth noting up-front how
not user friendly the PerfectHashTableSelfTest.exe
program is. It's the bare minimum command line wrapper around the
SelfTestPerfectHashTable() routine. It takes five mandatory parameters:
the test directory containing the *.keys files you wish to process, the
algorithm ID, the hash function ID, the mask function ID, and the maximum
concurrency (specifying 0 will use all cores). (You can optionally add a
dummy character as the final parameter which will print a little Press
any key to continue. message and then wait for a keypress before
exiting — this is useful when debugging via Visual Studio and you want the
console window to stay open so you can review results.) Usage:
C:\Users\Trent\Home\src\tracer>x64\Debug\PerfectHashTableSelfTest.exe
Usage: PerfectHashTableSelfTest.exe <TestDataDirectory (must be fully-qualified)>
<AlgorithmId>
<HashFunctionId>
<MaskFunctionId>
<MaximumConcurrency (0-ncpu)>
[PauseBeforeExit (can be any character)]
E.g.: PerfectHashTableSelfTest.exe C:\Users\Trent\Home\src\perfecthash\data 1 1 2 0
Let's try running it against our perfecthash\data directory. We'll use
the CHM algorithm (1... the only option), the CRC32-Rotate hash function (1),
the AND-based masking function (2), and a maximum concurrency of 0, which
defaults to all cores (4 on this Surface Book).
S:\tracer>x64\Debug\PerfectHashTableSelfTest.exe S:\perfecthash\data 1 1 2 0
Starting perfect hash self-test for directory: S:\perfecthash\data.
Processing key file: mshtml-10.keys
Successfully created perfect hash table: S:\perfecthash\data\mshtml-10.pht1.
Successfully loaded perfect hash table: S:\perfecthash\data\mshtml-10.pht1.
Algorithm: Chm01 (1).
Hash Function: Crc32Rotate (1).
Mask Function: And (2).
Successfully tested perfect hash table.
Concurrency: 4.
Number of attempts: 1.
Number of failed attempts: 0.
Number of solutions found: 1.
Number of keys: 10.
Number of table elements (vertices): 32.
Seed 1: 1086980845.
Seed 2: 3416976708.
Seed 3: 2208417421.
Seed 4: 3947037297.
Cycles to solve: 178412.
Cycles to verify: 3924.
Cycles to prepare file: 821965.
Cycles to save file: 9093112.
Microseconds to solve: 63.
Microseconds to verify: 1.
Microseconds to prepare file: 292.
Microseconds to save file: 3238.
Processing key file: mshtml-37209.keys
Successfully created perfect hash table: S:\perfecthash\data\mshtml-37209.pht1.
Successfully loaded perfect hash table: S:\perfecthash\data\mshtml-37209.pht1.
Algorithm: Chm01 (1).
Hash Function: Crc32Rotate (1).
Mask Function: And (2).
Successfully tested perfect hash table.
Concurrency: 4.
Number of attempts: 3.
Number of failed attempts: 0.
Number of solutions found: 3.
Number of keys: 37209.
Number of table elements (vertices): 131072.
Seed 1: 43757427.
Seed 2: 283337092.
Seed 3: 1819527900.
Seed 4: 1437745965.
Cycles to solve: 71649060.
Cycles to verify: 6385313.
Cycles to prepare file: 41221291.
Cycles to save file: 51229166.
Microseconds to solve: 25516.
Microseconds to verify: 2274.
Microseconds to prepare file: 14680.
Microseconds to save file: 18244.
Your output should look similar to the output above. There will be two new
files, mshtml-10.pht1 and mshtml-37209.pht1 in that
perfecthash\data directory. The .pht1 file is the array of
"assigned" values we use as part of index construction for the CHM algorithm.
When the output says Successfully loaded perfect hash table ..., it
is referring to the .pht1 files.
The metadata for the perfect hash table is stored in an NTFS stream named
:Info that is attached to the .pht1 file. We can view this
via the Get-Item -Path mshtml-37209.pht1 -Stream * PowerShell
command:
You can see the normal file data in the first stream (the default file stream)
named $DATA. This is reported as length 524,288 bytes. If we
divide this by 4 (the sizeof(ULONG)), we get 131,072 — which
corresponds to the number of vertices reported for the final table in the
output:
...
Number of table elements (vertices): 131072.
...
The :Info Stream
Following that is our custom :Info NTFS stream of length 256 bytes.
As mentioned earlier, all of our file I/O is done through memory maps. During
the creation step, the :Info stream is created and mapped at a base
address accessible through Table->InfoStreamBaseAddress (we'll
discuss the Table structure soon). In our CHM implementation
routine, we literally cast the base address to an on-disk structure
named GRAPH_INFO_ON_DISK, which is required to embed a common
header structure named TABLE_INFO_ON_DISK_HEADER as its first
member.
Let's look at these two header structures, as well as some initialization and
finalization code in the CHM implementation that depicts how we work with them.
The initialization code is taken from the
CreatePerfectHashTableImplChm01 routine, and the finalization code is taken
from the FileWorkCallbackChm01 routine.
//
// Define an on-disk representation of the graph's information. This is stored
// in the NTFS stream extending from the backing file named :Info. It is
// responsible for storing information about the on-disk mapping such that it
// can be reloaded from disk and used as a perfect hash table. The structure
// must always embed the TABLE_INFO_ON_DISK_HEADER structure such that the
// generic loader routine can access on-disk versions saved by different algos
// in order to extract the algorithm ID and determine a suitable loader func to
// use.
//
typedef struct _Struct_size_bytes_(Header.SizeOfStruct) _GRAPH_INFO_ON_DISK {
//
// Include the required header.
//
TABLE_INFO_ON_DISK_HEADER Header;
//
// Additional information we capture is mostly just for informational
// and debugging purposes.
//

//
// Inline the GRAPH_DIMENSIONS structure.
//
union {
struct {
ULONG NumberOfEdges;
ULONG TotalNumberOfEdges;
ULONG NumberOfVertices;
BYTE NumberOfEdgesPowerOf2Exponent;
BYTE NumberOfEdgesNextPowerOf2Exponent;
BYTE NumberOfVerticesPowerOf2Exponent;
BYTE NumberOfVerticesNextPowerOf2Exponent;
};
GRAPH_DIMENSIONS Dimensions;
};
} GRAPH_INFO_ON_DISK;
C_ASSERT(sizeof(GRAPH_INFO_ON_DISK) <= PAGE_SIZE);
typedef GRAPH_INFO_ON_DISK *PGRAPH_INFO_ON_DISK;
//
// Metadata about a perfect hash table is stored in an NTFS stream named :Info
// that is tacked onto the end of the perfect hash table's file name. Define
// a structure, TABLE_INFO_ON_DISK_HEADER, that literally represents the on-disk
// layout of this metadata. Each algorithm implementation must write out an
// info record that conforms with this common header. They are free to extend
// it with additional details.
//
typedef union _TABLE_INFO_ON_DISK_HEADER_FLAGS {
struct {
//
// Unused bits.
//
ULONG Unused:32;
};
LONG AsLong;
ULONG AsULong;
} TABLE_INFO_ON_DISK_HEADER_FLAGS;
C_ASSERT(sizeof(TABLE_INFO_ON_DISK_HEADER_FLAGS) == sizeof(ULONG));
typedef struct _Struct_size_bytes_(SizeOfStruct) _TABLE_INFO_ON_DISK_HEADER {
//
// A magic value used to identify the structure.
//
ULARGE_INTEGER Magic;
//
// Size of the structure, in bytes.
//
// N.B. We don't annotate this with a _Field_range_ SAL annotation as the
// value will vary depending on which parameters were used to create
// the table.
//
ULONG SizeOfStruct;
//
// Flags.
//
TABLE_INFO_ON_DISK_HEADER_FLAGS Flags;
//
// Algorithm that was used.
//
PERFECT_HASH_TABLE_ALGORITHM_ID AlgorithmId;
//
// Hash function that was used.
//
PERFECT_HASH_TABLE_HASH_FUNCTION_ID HashFunctionId;
//
// Masking type.
//
PERFECT_HASH_TABLE_MASK_FUNCTION_ID MaskFunctionId;
//
// Size of an individual key element, in bytes.
//
ULONG KeySizeInBytes;
//
// The concurrency level used to generate the hash.
//
ULONG Concurrency;
//
// Pad out to 8 bytes.
//
ULONG Padding;
//
// Number of keys in the input set. This is used to size an appropriate
// array for storing values.
//
ULARGE_INTEGER NumberOfKeys;
//
// Final number of elements in the underlying table. This will vary
// depending on how the graph was created. If modulus masking is in use,
// this will reflect the number of keys (unless a custom table size was
// requested during creation). Otherwise, this will be the number of keys
// rounded up to the next power of 2. (That is, take the number of keys,
// round up to a power of 2, then round that up to the next power of 2.)
//
ULARGE_INTEGER NumberOfTableElements;
//
// Capture the hash and index details required by MaskHash(), MaskIndex()
// and Index() routines.
//
ULONG HashSize;
ULONG IndexSize;
ULONG HashShift;
ULONG IndexShift;
ULONG HashMask;
ULONG IndexMask;
ULONG HashFold;
ULONG IndexFold;
ULONG HashModulus;
ULONG IndexModulus;
//
// Seed data.
//
ULONG NumberOfSeeds;
union {
ULONG Seed1;
ULONG FirstSeed;
};
ULONG Seed2;
ULONG Seed3;
union {
ULONG Seed4;
ULONG LastSeed;
};
//
// Capture statistics about the perfect hash table solution that can be
// useful during analysis and performance comparisons.
//

//
// Number of attempts at solving the graph.
//
ULONGLONG NumberOfAttempts;
//
// Number of failed attempts at solving the graph.
//
ULONGLONG NumberOfFailedAttempts;
//
// If solutions are being sought in parallel, more than one thread may
// find a solution before it detects that someone else has already found
// a solution (in which case, it stops solving and returns from the pool
// callback). This counter measures the number of solutions that were
// found in parallel. It corresponds to the Context->FinishedCount value.
//
// With a good hashing function and large concurrency value, this value
// may often be relatively large. E.g. with a concurrency level of 12, it
// would not be surprising to see 12 attempts reported, and 10 solutions
// found. (Meaning that our concurrency level was a tad unnecessary, or
// our hash function is unnecessarily good (which usually means expensive).)
//
// N.B. Finding multiple solutions in parallel is harmless, if not a little
// wasteful of CPU time. The threads can detect if they are the first
// ones to find a solution prior to continuing with the assignment
// step, such that the assignment can be avoided if another thread has
// beaten them to that point. They also check at regular intervals
// during graph solving, and will do a fast-path exit as soon as they
// detect another thread has found a solution.
//
ULONGLONG NumberOfSolutionsFound;
//
// If a solution can't be found within a configurable threshold, a "table
// resize" event will be generated. This results in doubling the number
// of vertices being used for the backing table and trying the solution
// again. The following field captures how many times this occurred.
//
ULONGLONG NumberOfTableResizeEvents;
//
// This counter captures the sum of all prior attempts at solving the
// graph before giving up and resizing the table. It excludes the
// attempts made by the winning table size (captured by NumberOfAttempts).
// It will be zero if no resize events occurred. We don't keep a separate
// failed counter here, as the resize event implies all attempts failed.
//
ULONGLONG TotalNumberOfAttemptsWithSmallerTableSizes;
//
// The following counter captures the initial table size that was attempted
// in order to solve the graph. It will differ from the final table size
// if there were resize events. As we simply double the size of the table
// on each resize event, we can extrapolate the different table sizes that
// were tried prior to finding a winning one by looking at the initial size
// attempted and the number of resize events.
//
ULONGLONG InitialTableSize;
//
// The following counter captures the closest we came to solving the graph
// in previous attempts before a resize event occurred. This is calculated
// by taking the number of edges and subtracting the value of the context's
// HighestDeletedEdgesCount counter. The value represents the additional
// number of 1 degree edges we needed to delete in order to obtain an
// acyclic graph. A very low number indicates that we came very close to
// solving it as there were very few hash collisions with the seed values
// we picked. A very high number indicates we had no chance and there were
// collisions galore.
//
ULONGLONG ClosestWeCameToSolvingGraphWithSmallerTableSizes;
//
// Number of cycles it took to solve the graph for the winning thread.
// (This does not factor in the total cycle time consumed by all threads.)
//
ULARGE_INTEGER SolveCycles;
//
// Number of microseconds taken to solve the graph.
//
ULARGE_INTEGER SolveMicroseconds;
//
// Number of cycles taken to verify the solution.
//
ULARGE_INTEGER VerifyCycles;
//
// Number of microseconds taken to verify the solution.
//
ULARGE_INTEGER VerifyMicroseconds;
//
// Number of cycles taken to prepare the file.
//
ULARGE_INTEGER PrepareFileCycles;
//
// Number of microseconds taken to prepare the file.
//
ULARGE_INTEGER PrepareFileMicroseconds;
//
// Number of cycles taken to save the file.
//
ULARGE_INTEGER SaveFileCycles;
//
// Number of microseconds taken to save the file.
//
ULARGE_INTEGER SaveFileMicroseconds;
} TABLE_INFO_ON_DISK_HEADER;
typedef TABLE_INFO_ON_DISK_HEADER *PTABLE_INFO_ON_DISK_HEADER;
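Since this structure is literally the bytes on disk, reading it back from another language is just a matter of unpacking at the right offsets. A hedged Python sketch of parsing the fixed-size prefix of the header (offsets derived from the layout above, assuming little-endian byte order and no compiler-inserted padding beyond the explicit Padding field; the actual magic constant values aren't reproduced here):

```python
import struct

# Fixed-size prefix of TABLE_INFO_ON_DISK_HEADER, per the layout above:
#   0  Magic (ULARGE_INTEGER, 8 bytes)
#   8  SizeOfStruct   12 Flags          16 AlgorithmId
#  20  HashFunctionId 24 MaskFunctionId 28 KeySizeInBytes
#  32  Concurrency    36 Padding
#  40  NumberOfKeys   48 NumberOfTableElements (8 bytes each)
HEADER_PREFIX = struct.Struct('<Q8L2Q')   # 56 bytes total

def parse_info_prefix(buf):
    fields = HEADER_PREFIX.unpack_from(buf)
    names = ('Magic', 'SizeOfStruct', 'Flags', 'AlgorithmId',
             'HashFunctionId', 'MaskFunctionId', 'KeySizeInBytes',
             'Concurrency', 'Padding', 'NumberOfKeys',
             'NumberOfTableElements')
    return dict(zip(names, fields))
```

A real loader would first compare Magic against the expected constant and sanity-check SizeOfStruct before trusting anything else, which is exactly the kind of header validation LoadPerfectHashTable() performs.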
//
// Save the on-disk representation of the graph information. This is a
// smaller subset of data needed in order to load a previously-solved
// graph as a perfect hash table. The data resides in an NTFS stream named
// :Info off the main perfect hash table file. It will have been mapped for
// us already at Table->InfoStreamBaseAddress.
//
OnDiskInfo = (PGRAPH_INFO_ON_DISK)Table->InfoStreamBaseAddress;
ASSERT(OnDiskInfo);
OnDiskHeader = &OnDiskInfo->Header;
OnDiskHeader->Magic.LowPart = TABLE_INFO_ON_DISK_MAGIC_LOWPART;
OnDiskHeader->Magic.HighPart = TABLE_INFO_ON_DISK_MAGIC_HIGHPART;
OnDiskHeader->SizeOfStruct = sizeof(*OnDiskInfo);
OnDiskHeader->Flags.AsULong = 0;
OnDiskHeader->Concurrency = Context->MaximumConcurrency;
OnDiskHeader->AlgorithmId = Context->AlgorithmId;
OnDiskHeader->MaskFunctionId = Context->MaskFunctionId;
OnDiskHeader->HashFunctionId = Context->HashFunctionId;
OnDiskHeader->KeySizeInBytes = sizeof(ULONG);
OnDiskHeader->HashSize = Table->HashSize;
OnDiskHeader->IndexSize = Table->IndexSize;
OnDiskHeader->HashShift = Table->HashShift;
OnDiskHeader->IndexShift = Table->IndexShift;
OnDiskHeader->HashMask = Table->HashMask;
OnDiskHeader->IndexMask = Table->IndexMask;
OnDiskHeader->HashFold = Table->HashFold;
OnDiskHeader->IndexFold = Table->IndexFold;
OnDiskHeader->HashModulus = Table->HashModulus;
OnDiskHeader->IndexModulus = Table->IndexModulus;
OnDiskHeader->NumberOfKeys.QuadPart = (
Table->Keys->NumberOfElements.QuadPart
);
OnDiskHeader->NumberOfSeeds = ((
FIELD_OFFSET(GRAPH, LastSeed) -
FIELD_OFFSET(GRAPH, FirstSeed)
) / sizeof(ULONG)) + 1;
//
// This will change based on masking type and whether or not the caller
// has provided a value for NumberOfTableElements. For now, keep it as
// the number of vertices.
//
OnDiskHeader->NumberOfTableElements.QuadPart = (
NumberOfVertices.QuadPart
);
CopyMemory(&OnDiskInfo->Dimensions, Dim, sizeof(*Dim));
case FileWorkSaveId: {
PULONG Dest;
PULONG Source;
PGRAPH Graph;
ULONG WaitResult;
BOOLEAN Success;
ULONGLONG SizeInBytes;
LARGE_INTEGER EndOfFile;
PPERFECT_HASH_TABLE Table;
PTABLE_INFO_ON_DISK_HEADER Header;
//
// Indicate the save event has completed upon return of this
// callback.
//
SetOnReturnEvent = SavedEvent;
//
// Initialize aliases.
//
Table = Context->Table;
Dest = (PULONG)Table->BaseAddress;
Graph = (PGRAPH)Context->SolvedContext;
Source = Graph->Assigned;
Header = Table->Header;
SizeInBytes = (
Header->NumberOfTableElements.QuadPart *
Header->KeySizeInBytes
);
//
// The graph has been solved. Copy the array of assigned values
// to the mapped area we prepared earlier (above).
//
CopyMemory(Dest, Source, SizeInBytes);
//
// Save the seed values used by this graph. (Everything else in
// the on-disk info representation was saved earlier.)
//
Header->Seed1 = Graph->Seed1;
Header->Seed2 = Graph->Seed2;
Header->Seed3 = Graph->Seed3;
Header->Seed4 = Graph->Seed4;
//
// Kick off a flush file buffers now before we wait on the verified
// event. The flush will be a blocking call. The wait on verified
// will be blocking if the event isn't signaled. So, we may as well
// get some useful blocking work done, before potentially going into
// another wait state where we're not doing anything useful.
//
ASSERT(FlushFileBuffers(Table->FileHandle));
//
// Stop the save file timer here, after flushing the file buffers,
// but before we potentially wait on the verified state.
//
CONTEXT_END_TIMERS(SaveFile);
//
// Wait on the verification complete event. This is done in the
// main thread straight after it dispatches our file work callback
// (that ended up here). We need to block on this event as we want
// to save the timings for verification to the header.
//
WaitResult = WaitForSingleObject(Context->VerifiedEvent, INFINITE);
ASSERT(WaitResult == WAIT_OBJECT_0);
//
// When we mapped the array in the work item above, we used a size
// that was aligned with the system allocation granularity. We now
// want to set the end of file explicitly to the exact size of the
// underlying array. To do this, we unmap the view, delete the
// section, set the file pointer to where we want, set the end of
// file (which will apply the file pointer position as EOF), then
// close the file handle.
//
ASSERT(UnmapViewOfFile(Table->BaseAddress));
Table->BaseAddress = NULL;
ASSERT(CloseHandle(Table->MappingHandle));
Table->MappingHandle = NULL;
EndOfFile.QuadPart = SizeInBytes;
Success = SetFilePointerEx(Table->FileHandle,
EndOfFile,
NULL,
FILE_BEGIN);
ASSERT(Success);
ASSERT(SetEndOfFile(Table->FileHandle));
ASSERT(CloseHandle(Table->FileHandle));
Table->FileHandle = NULL;
//
// Stop the save file timers, then copy all the timer values into
// the header (before closing the :Info stream).
//
CONTEXT_SAVE_TIMERS_TO_HEADER(Solve);
CONTEXT_SAVE_TIMERS_TO_HEADER(Verify);
CONTEXT_SAVE_TIMERS_TO_HEADER(PrepareFile);
CONTEXT_SAVE_TIMERS_TO_HEADER(SaveFile);
//
// Save the number of attempts and number of finished solutions.
//
Header->NumberOfAttempts = Context->Attempts;
Header->NumberOfFailedAttempts = Context->FailedAttempts;
Header->NumberOfSolutionsFound = Context->FinishedCount;
//
// Finalize the :Info stream the same way we handled the backing
// file above; unmap, delete section, set file pointer, set eof,
// close file.
//
ASSERT(UnmapViewOfFile(Table->InfoStreamBaseAddress));
Table->InfoStreamBaseAddress = NULL;
ASSERT(CloseHandle(Table->InfoStreamMappingHandle));
Table->InfoStreamMappingHandle = NULL;
//
// The file size for the :Info stream will be the size of our
// on-disk info structure.
//
EndOfFile.QuadPart = sizeof(*OnDiskInfo);
Success = SetFilePointerEx(Table->InfoStreamFileHandle,
EndOfFile,
NULL,
FILE_BEGIN);
ASSERT(Success);
ASSERT(SetEndOfFile(Table->InfoStreamFileHandle));
ASSERT(CloseHandle(Table->InfoStreamFileHandle));
Table->InfoStreamFileHandle = NULL;
break;
}
So, to summarize, the :Info NTFS stream is memory mapped and then
directly cast into a GRAPH_INFO_ON_DISK structure, which is
required to embed the TABLE_INFO_ON_DISK_HEADER structure as its
first element. The
LoadPerfectHashTable() routine is then able to memory map the same
:Info stream at a later point (or in a separate process), validate
the header semantics, extract the algorithm ID, then call the algorithm
implementation's custom loader routine.
This is how we satisfy the original offline generation requirement. (That is,
making sure the creation step persists something that can be loaded and used at
an arbitrary future time, in a different process, potentially on a different
machine.)
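The loader-side validation can be sketched as follows. This is a minimal, portable illustration rather than the component's actual code: the struct layout, field names, magic value, and helper name are all hypothetical stand-ins for TABLE_INFO_ON_DISK_HEADER, but the shape of the check (verify the mapped size, the structure size, and the enum IDs before dispatching to an algorithm-specific loader) mirrors what LoadPerfectHashTable() needs to do with the :Info stream.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative stand-in for the on-disk header embedded at the start of
 * the :Info stream. The real TABLE_INFO_ON_DISK_HEADER carries many more
 * fields (sizes, seeds, timers); these names are hypothetical.
 */
typedef struct _INFO_HEADER {
    uint32_t Magic;          /* Identifies the stream as one of ours.  */
    uint32_t SizeOfStruct;   /* Must match the compiled-in size.       */
    uint32_t AlgorithmId;    /* Used to dispatch to the loader impl.   */
    uint32_t HashFunctionId;
    uint32_t MaskFunctionId;
    uint64_t NumberOfKeys;
} INFO_HEADER;

#define INFO_HEADER_MAGIC 0x31544850u /* "PHT1" as little-endian bytes. */

/*
 * Validate the header semantics before trusting anything else in the
 * mapped stream: check the magic, the structure size, and that the
 * algorithm ID falls within the known range.
 */
static bool
ValidateInfoHeader(const void *Base, size_t MappedSize, uint32_t MaxAlgorithmId)
{
    INFO_HEADER Header;

    if (Base == NULL || MappedSize < sizeof(Header)) {
        return false;
    }

    /* Copy out rather than cast, to sidestep alignment concerns. */
    memcpy(&Header, Base, sizeof(Header));

    if (Header.Magic != INFO_HEADER_MAGIC) {
        return false;
    }
    if (Header.SizeOfStruct != (uint32_t)sizeof(INFO_HEADER)) {
        return false;
    }
    if (Header.AlgorithmId == 0 || Header.AlgorithmId > MaxAlgorithmId) {
        return false;
    }
    return true;
}
```

Only once these checks pass would a loader extract the algorithm ID and hand the rest of the mapped stream to that algorithm's custom loader routine.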
SelfTestPerfectHashTable
Let's step back and take a look at the self test routine, as this exercises all
of the main public API routines, and serves as a good starting point. I'll
provide two code samples; the first is a vastly simplified pseudo-code depiction
of the routine, stripped of all details other than the absolute bare minimum; the
second is the full routine.
Pseudo
Full
_Use_decl_annotations_
BOOLEAN
SelfTestPerfectHashTable(
PRTL Rtl,
PALLOCATOR Allocator,
PPERFECT_HASH_TABLE_ANY_API AnyApi,
PCUNICODE_STRING TestDataDirectory,
PULONG MaximumConcurrency,
PERFECT_HASH_TABLE_ALGORITHM_ID AlgorithmId,
PERFECT_HASH_TABLE_HASH_FUNCTION_ID HashFunctionId,
PERFECT_HASH_TABLE_MASK_FUNCTION_ID MaskFunctionId
)
/*++
Routine Description:
Performs a self-test of the entire PerfectHashTable component.
Arguments:
Rtl - Supplies a pointer to an initialized RTL structure.
Allocator - Supplies a pointer to an initialized ALLOCATOR structure that
will be used for all memory allocations.
AnyApi - Supplies a pointer to an initialized PERFECT_HASH_TABLE_ANY_API
structure. Note that this must be an instance of the extended API;
this is verified by looking at the Api->SizeOfStruct field and ensuring
it matches our expected size of the extended API structure.
TestDataDirectory - Supplies a pointer to a UNICODE_STRING structure that
represents a fully-qualified path of the test data directory.
MaximumConcurrency - Optionally supplies a pointer to a variable that
contains the desired maximum concurrency to be used for the underlying
threadpool. If NULL, or non-NULL but points to a value of 0, then the
number of system processors will be used as a default value.
N.B. This value is passed directly to SetThreadpoolThreadMinimum() and
SetThreadpoolThreadMaximum().
AlgorithmId - Supplies the algorithm to use.
HashFunctionId - Supplies the hash function to use.
MaskFunctionId - Supplies the type of masking to use.
Return Value:
TRUE on success, FALSE on failure.
--*/
{
//
// Enumerate over all keys in the test data directory.
//
for (KeysPath in FindFiles(TestDataDirectory, "*.keys")) {
//
// Create a new perfect hash table context.
//
Api->CreatePerfectHashTableContext(Rtl,
Allocator,
MaximumConcurrency,
&Context);
//
// Load the keys file.
//
Api->LoadPerfectHashTableKeys(Rtl,
Allocator,
&KeysPath,
&Keys);
//
// Keys were loaded successfully. Construct the equivalent path name
// for the backing perfect hash table when persisted to disk. Although
// this can be automated for us as part of CreatePerfectHashTable(),
// having it backed by our memory simplifies things a little further
// down the track when we want to load the table via the path but have
// destroyed the original table that came from CreatePerfectHashTable().
//
CreateTablePathFromKeysPath(&Keys, &TablePath);
//
// We now have the fully-qualified path name of the backing perfect
// hash table file living in TablePath. Continue with creation of the
// perfect hash table, using this path we've just created and the keys
// that were loaded.
//
Api->CreatePerfectHashTable(Rtl,
Allocator,
Context,
AlgorithmId,
MaskFunctionId,
HashFunctionId,
NULL,
Keys,
&TablePath);
//
// Load the perfect hash table we just created.
//
Api->LoadPerfectHashTable(Rtl,
Allocator,
Keys,
&TablePath,
&Table);
//
// Table was loaded successfully from disk. Obtain the names of all
// the enumeration IDs. Currently these should always match the same
// enums provided as input parameters to this routine.
//
Api->GetAlgorithmName(Table->AlgorithmId, &AlgorithmName);
Api->GetHashFunctionName(Table->HashFunctionId, &HashFunctionName);
Api->GetMaskFunctionName(Table->MaskFunctionId, &MaskFunctionName);
//
// Test the table. The TRUE parameter corresponds to the DebugBreakOnFailure
// boolean parameter, indicating that we want an immediate __debugbreak() if
// a failure is detected.
//
Api->TestPerfectHashTable(Table, TRUE);
//
// Dump some stats from the header.
//
DumpHeaderStats(Table->Header);
//
// Destroy the table, keys, and context.
//
Table->Vtbl->Release(Table);
Api->DestroyPerfectHashTableKeys(&Keys);
Api->DestroyPerfectHashTableContext(&Context, NULL);
}
}
_Use_decl_annotations_
BOOLEAN
SelfTestPerfectHashTable(
PRTL Rtl,
PALLOCATOR Allocator,
PPERFECT_HASH_TABLE_ANY_API AnyApi,
PCUNICODE_STRING TestDataDirectory,
PULONG MaximumConcurrency,
PERFECT_HASH_TABLE_ALGORITHM_ID AlgorithmId,
PERFECT_HASH_TABLE_HASH_FUNCTION_ID HashFunctionId,
PERFECT_HASH_TABLE_MASK_FUNCTION_ID MaskFunctionId
)
/*++
Routine Description:
Performs a self-test of the entire PerfectHashTable component.
Arguments:
Rtl - Supplies a pointer to an initialized RTL structure.
Allocator - Supplies a pointer to an initialized ALLOCATOR structure that
will be used for all memory allocations.
AnyApi - Supplies a pointer to an initialized PERFECT_HASH_TABLE_ANY_API
structure. Note that this must be an instance of the extended API;
this is verified by looking at the Api->SizeOfStruct field and ensuring
it matches our expected size of the extended API structure.
TestDataDirectory - Supplies a pointer to a UNICODE_STRING structure that
represents a fully-qualified path of the test data directory.
MaximumConcurrency - Optionally supplies a pointer to a variable that
contains the desired maximum concurrency to be used for the underlying
threadpool. If NULL, or non-NULL but points to a value of 0, then the
number of system processors will be used as a default value.
N.B. This value is passed directly to SetThreadpoolThreadMinimum() and
SetThreadpoolThreadMaximum().
AlgorithmId - Supplies the algorithm to use.
HashFunctionId - Supplies the hash function to use.
MaskFunctionId - Supplies the type of masking to use.
Return Value:
TRUE on success, FALSE on failure.
--*/
{
PWSTR Dest;
PWSTR Source;
USHORT Length;
USHORT BaseLength;
USHORT NumberOfPages;
BOOLEAN Success;
BOOLEAN Failed;
BOOLEAN IsProcessTerminating = FALSE;
PWCHAR Buffer;
PWCHAR BaseBuffer;
PWCHAR WideOutput;
PWCHAR WideOutputBuffer;
HANDLE FindHandle = NULL;
HANDLE WideOutputHandle;
HANDLE ProcessHandle = NULL;
ULONG Failures;
ULONG BytesWritten;
ULONG WideCharsWritten;
ULONGLONG BufferSize;
ULONGLONG WideOutputBufferSize;
LONG_INTEGER AllocSize;
LARGE_INTEGER BytesToWrite;
LARGE_INTEGER WideCharsToWrite;
WIN32_FIND_DATAW FindData;
UNICODE_STRING SearchPath;
UNICODE_STRING KeysPath;
UNICODE_STRING TablePath;
PPERFECT_HASH_TABLE Table;
PPERFECT_HASH_TABLE_KEYS Keys;
PTABLE_INFO_ON_DISK_HEADER Header;
PPERFECT_HASH_TABLE_CONTEXT Context;
PPERFECT_HASH_TABLE_API Api;
PPERFECT_HASH_TABLE_API_EX ApiEx;
PUNICODE_STRING AlgorithmName;
PUNICODE_STRING HashFunctionName;
PUNICODE_STRING MaskFunctionName;
UNICODE_STRING Suffix = RTL_CONSTANT_STRING(L"*.keys");
UNICODE_STRING TableSuffix = RTL_CONSTANT_STRING(L"pht1");
//
// Validate arguments.
//
if (!ARGUMENT_PRESENT(Rtl)) {
return FALSE;
}
if (!ARGUMENT_PRESENT(Allocator)) {
return FALSE;
}
if (!ARGUMENT_PRESENT(AnyApi)) {
return FALSE;
} else {
Api = &AnyApi->Api;
if (Api->SizeOfStruct == sizeof(*ApiEx)) {
ApiEx = &AnyApi->ApiEx;
} else {
//
// The API should be the extended version. (Otherwise, how did
// the caller even find this function?)
//
return FALSE;
}
}
if (!ARGUMENT_PRESENT(TestDataDirectory)) {
return FALSE;
}
if (!IsValidMinimumDirectoryUnicodeString(TestDataDirectory)) {
return FALSE;
}
if (!IsValidPerfectHashTableAlgorithmId(AlgorithmId)) {
return FALSE;
}
//
// Arguments have been validated, proceed.
//
//
// Create a buffer we can use for stdout, using a very generous buffer size.
//
NumberOfPages = 10;
Success = Rtl->CreateBuffer(Rtl,
&ProcessHandle,
NumberOfPages,
NULL,
&WideOutputBufferSize,
&WideOutputBuffer);
if (!Success) {
return FALSE;
}
WideOutput = WideOutputBuffer;
//
// Create a buffer we can use for temporary path construction. We want it
// to be MAX_USHORT in size, so (1 << 16) >> PAGE_SHIFT converts this into
// the number of pages we need.
//
NumberOfPages = (1 << 16) >> PAGE_SHIFT;
Success = Rtl->CreateBuffer(Rtl,
&ProcessHandle,
NumberOfPages,
NULL,
&BufferSize,
&BaseBuffer);
if (!Success) {
return FALSE;
}
Buffer = BaseBuffer;
//
// Get a reference to the stdout handle.
//
WideOutputHandle = GetStdHandle(STD_OUTPUT_HANDLE);
ASSERT(WideOutputHandle);
//
// Calculate the size required for a new concatenated wide string buffer
// that combines the test data directory with the "*.keys" suffix. The
// 2 * sizeof(*Dest) accounts for the joining slash and trailing NULL.
//
AllocSize.LongPart = TestDataDirectory->Length;
AllocSize.LongPart += Suffix.Length + (2 * sizeof(*Dest));
ASSERT(!AllocSize.HighPart);
SearchPath.Buffer = (PWSTR)Buffer;
if (!SearchPath.Buffer) {
goto Error;
}
//
// Copy incoming test data directory name.
//
Length = TestDataDirectory->Length;
CopyMemory(SearchPath.Buffer,
TestDataDirectory->Buffer,
Length);
//
// Advance our Dest pointer to the end of the directory name, write a
// slash, then copy the suffix over.
//
Dest = (PWSTR)RtlOffsetToPointer(SearchPath.Buffer, Length);
*Dest++ = L'\\';
CopyMemory(Dest, Suffix.Buffer, Suffix.Length);
//
// Wire up the search path length and maximum length variables. The max
// length will be our AllocSize, length will be this value minus 2 to
// account for the trailing NULL.
//
SearchPath.MaximumLength = AllocSize.LowPart;
SearchPath.Length = AllocSize.LowPart - sizeof(*Dest);
ASSERT(SearchPath.Buffer[SearchPath.Length >> 1] == L'\0');
//
// Advance the buffer past this string allocation, up to the next 16-byte
// boundary.
//
Buffer = (PWSTR)(
RtlOffsetToPointer(
Buffer,
ALIGN_UP(SearchPath.MaximumLength, 16)
)
);
WIDE_OUTPUT_RAW(WideOutput,
L"Starting perfect hash self-test for directory: ");
WIDE_OUTPUT_UNICODE_STRING(WideOutput, TestDataDirectory);
WIDE_OUTPUT_RAW(WideOutput, L".\n");
WIDE_OUTPUT_FLUSH();
//
// Create a find handle for the <test data>\*.keys search pattern we
// created.
//
FindHandle = FindFirstFileW(SearchPath.Buffer, &FindData);
if (!FindHandle || FindHandle == INVALID_HANDLE_VALUE) {
ULONG LastError;
//
// Check to see if we failed because there were no files matching the
// wildcard *.keys in the test directory. In this case, GetLastError()
// will report ERROR_FILE_NOT_FOUND.
//
LastError = GetLastError();
if (LastError == ERROR_FILE_NOT_FOUND) {
WIDE_OUTPUT_RAW(WideOutput,
L"No files matching pattern '*.keys' found in "
L"test data directory.\n");
WIDE_OUTPUT_FLUSH();
goto End;
} else {
//
// We failed for some other reason.
//
WIDE_OUTPUT_RAW(WideOutput,
L"FindFirstFileW() failed with error code: ");
WIDE_OUTPUT_INT(WideOutput, LastError);
WIDE_OUTPUT_LF(WideOutput);
WIDE_OUTPUT_FLUSH();
goto Error;
}
}
//
// Initialize the fully-qualified keys path.
//
KeysPath.Buffer = Buffer;
CopyMemory(KeysPath.Buffer, TestDataDirectory->Buffer, Length);
//
// Advance our Dest pointer to the end of the directory name, then write
// a slash.
//
Dest = (PWSTR)RtlOffsetToPointer(KeysPath.Buffer, Length);
*Dest++ = L'\\';
//
// Update the length to account for the slash we just wrote, then make a
// copy of it in the variable BaseLength.
//
Length += sizeof(*Dest);
BaseLength = Length;
//
// Zero the failure count and begin the main loop.
//
Failures = 0;
do {
//
// Clear the failure flag at the top of every loop invocation.
//
Failed = FALSE;
WIDE_OUTPUT_RAW(WideOutput, L"Processing key file: ");
WIDE_OUTPUT_WCSTR(WideOutput, (PCWSZ)FindData.cFileName);
WIDE_OUTPUT_LF(WideOutput);
WIDE_OUTPUT_FLUSH();
//
// Create a new perfect hash table context.
//
Success = Api->CreatePerfectHashTableContext(Rtl,
Allocator,
MaximumConcurrency,
&Context);
if (!Success) {
//
// We can't do anything without a context.
//
WIDE_OUTPUT_RAW(WideOutput, L"Fatal: failed to create context.\n");
WIDE_OUTPUT_FLUSH();
Failures++;
break;
}
//
// Copy the filename over to the fully-qualified keys path.
//
Dest = (PWSTR)RtlOffsetToPointer(KeysPath.Buffer, BaseLength);
Source = (PWSTR)FindData.cFileName;
while (*Source) {
*Dest++ = *Source++;
}
*Dest = L'\0';
Length = (USHORT)RtlOffsetFromPointer(Dest, KeysPath.Buffer);
KeysPath.Length = Length;
KeysPath.MaximumLength = Length + sizeof(*Dest);
ASSERT(KeysPath.Buffer[KeysPath.Length >> 1] == L'\0');
ASSERT(&KeysPath.Buffer[KeysPath.Length >> 1] == Dest);
Success = Api->LoadPerfectHashTableKeys(Rtl,
Allocator,
&KeysPath,
&Keys);
if (!Success) {
WIDE_OUTPUT_RAW(WideOutput, L"Failed to load keys for ");
WIDE_OUTPUT_UNICODE_STRING(WideOutput, &KeysPath);
WIDE_OUTPUT_RAW(WideOutput, L".\n");
WIDE_OUTPUT_FLUSH();
Failures++;
goto DestroyContext;
}
//
// Keys were loaded successfully. Construct the equivalent path name
// for the backing perfect hash table when persisted to disk. Although
// this can be automated for us as part of CreatePerfectHashTable(),
// having it backed by our memory simplifies things a little further
// down the track when we want to load the table via the path but have
// destroyed the original table that came from CreatePerfectHashTable().
//
//
// Align the Dest pointer up to a 16-byte boundary. (We add 1 to its
// current value to advance it past the terminating NULL of the keys
// path.)
//
Dest = (PWSTR)ALIGN_UP(Dest + 1, 16);
//
// We know the keys file ended with ".keys". We're going to use the
// identical name for the backing hash table file, except replace the
// ".keys" extension at the end with ".pht1". So, we can just copy the
// lengths used for KeysPath, plus the entire buffer, then just copy the
new extension over the old one. The 1 doesn't have any significance
other than padding the extension length out such that it matches
// the length of ".keys". (Although it may act as a nice versioning
// tool down the track.)
//
TablePath.Length = KeysPath.Length;
TablePath.MaximumLength = KeysPath.MaximumLength;
TablePath.Buffer = Dest;
//
// Copy the keys path over.
//
CopyMemory(Dest, KeysPath.Buffer, KeysPath.MaximumLength);
//
// Advance the Dest pointer to the end of the buffer, then retreat it
// five characters, such that it's positioned on the 'k' of keys.
//
Dest += (KeysPath.MaximumLength >> 1) - 5;
ASSERT(*Dest == L'k');
ASSERT(*(Dest - 1) == L'.');
//
// Copy the "pht1" extension over "keys".
//
Source = TableSuffix.Buffer;
while (*Source) {
*Dest++ = *Source++;
}
*Dest = L'\0';
//
// Sanity check invariants.
//
ASSERT(TablePath.Buffer[TablePath.Length >> 1] == L'\0');
ASSERT(&TablePath.Buffer[TablePath.Length >> 1] == Dest);
//
// We now have the fully-qualified path name of the backing perfect
// hash table file living in TablePath. Continue with creation of the
// perfect hash table, using this path we've just created and the keys
// that were loaded.
//
Success = Api->CreatePerfectHashTable(Rtl,
Allocator,
Context,
AlgorithmId,
MaskFunctionId,
HashFunctionId,
NULL,
Keys,
&TablePath);
if (!Success) {
WIDE_OUTPUT_RAW(WideOutput, L"Failed to create perfect hash "
L"table for keys: ");
WIDE_OUTPUT_UNICODE_STRING(WideOutput, &KeysPath);
WIDE_OUTPUT_RAW(WideOutput, L".\n");
WIDE_OUTPUT_FLUSH();
Failed = TRUE;
Failures++;
goto DestroyKeys;
}
WIDE_OUTPUT_RAW(WideOutput, L"Successfully created perfect "
L"hash table: ");
WIDE_OUTPUT_UNICODE_STRING(WideOutput, &TablePath);
WIDE_OUTPUT_RAW(WideOutput, L".\n");
//
// Load the perfect hash table we just created.
//
Success = Api->LoadPerfectHashTable(Rtl,
Allocator,
Keys,
&TablePath,
&Table);
if (!Success) {
WIDE_OUTPUT_RAW(WideOutput, L"Failed to load perfect hash table: ");
WIDE_OUTPUT_UNICODE_STRING(WideOutput, &TablePath);
WIDE_OUTPUT_RAW(WideOutput, L".\n");
WIDE_OUTPUT_FLUSH();
Failures++;
goto DestroyKeys;
}
//
// Table was loaded successfully from disk. Obtain the names of all
// the enumeration IDs. Currently these should always match the same
// enums provided as input parameters to this routine.
//
// N.B. I'm being lazy with the ASSERT()s here instead of reporting the
// error properly like we do with other failures.
//
ASSERT(Api->GetAlgorithmName(Table->AlgorithmId, &AlgorithmName));
ASSERT(Api->GetHashFunctionName(Table->HashFunctionId,
&HashFunctionName));
ASSERT(Api->GetMaskFunctionName(Table->MaskFunctionId,
&MaskFunctionName));
WIDE_OUTPUT_RAW(WideOutput, L"Successfully loaded perfect "
L"hash table: ");
WIDE_OUTPUT_UNICODE_STRING(WideOutput, &TablePath);
WIDE_OUTPUT_RAW(WideOutput, L".\n");
WIDE_OUTPUT_RAW(WideOutput, L"Algorithm: ");
WIDE_OUTPUT_UNICODE_STRING(WideOutput, AlgorithmName);
WIDE_OUTPUT_RAW(WideOutput, L" (");
WIDE_OUTPUT_INT(WideOutput, Table->AlgorithmId);
WIDE_OUTPUT_RAW(WideOutput, L").\n");
WIDE_OUTPUT_RAW(WideOutput, L"Hash Function: ");
WIDE_OUTPUT_UNICODE_STRING(WideOutput, HashFunctionName);
WIDE_OUTPUT_RAW(WideOutput, L" (");
WIDE_OUTPUT_INT(WideOutput, Table->HashFunctionId);
WIDE_OUTPUT_RAW(WideOutput, L").\n");
WIDE_OUTPUT_RAW(WideOutput, L"Mask Function: ");
WIDE_OUTPUT_UNICODE_STRING(WideOutput, MaskFunctionName);
WIDE_OUTPUT_RAW(WideOutput, L" (");
WIDE_OUTPUT_INT(WideOutput, Table->MaskFunctionId);
WIDE_OUTPUT_RAW(WideOutput, L").\n");
WIDE_OUTPUT_FLUSH();
//
// Test the table.
//
Success = Api->TestPerfectHashTable(Table, TRUE);
if (!Success) {
WIDE_OUTPUT_RAW(WideOutput, L"Test failed for perfect hash table "
L"loaded from disk: ");
WIDE_OUTPUT_UNICODE_STRING(WideOutput, &TablePath);
WIDE_OUTPUT_RAW(WideOutput, L".\n");
WIDE_OUTPUT_FLUSH();
Failures++;
Failed = TRUE;
goto DestroyTable;
}
WIDE_OUTPUT_RAW(WideOutput, L"Successfully tested perfect hash "
L"table.\n");
//
// Initialize header alias.
//
Header = Table->Header;
//
// Define some helper macros here for dumping stats.
//
#define STATS_INT(String, Name) \
WIDE_OUTPUT_RAW(WideOutput, String); \
WIDE_OUTPUT_INT(WideOutput, Table->Header->##Name##); \
WIDE_OUTPUT_RAW(WideOutput, L".\n")
#define STATS_QUAD(String, Name) \
WIDE_OUTPUT_RAW(WideOutput, String); \
WIDE_OUTPUT_INT(WideOutput, Table->Header->##Name##.QuadPart); \
WIDE_OUTPUT_RAW(WideOutput, L".\n")
if (Header->NumberOfTableResizeEvents > 0) {
STATS_INT(L"Number of table resize events: ",
NumberOfTableResizeEvents);
STATS_INT(L"Total number of attempts with smaller table sizes: ",
TotalNumberOfAttemptsWithSmallerTableSizes);
STATS_INT(L"First table size attempted: ",
InitialTableSize);
STATS_INT(L"Closest we came to solving the graph in previous "
L"attempts by number of deleted edges away: ",
ClosestWeCameToSolvingGraphWithSmallerTableSizes);
}
STATS_INT(L"Concurrency: ", Concurrency);
STATS_INT(L"Number of attempts: ", NumberOfAttempts);
STATS_INT(L"Number of failed attempts: ", NumberOfFailedAttempts);
STATS_INT(L"Number of solutions found: ", NumberOfSolutionsFound);
STATS_QUAD(L"Number of keys: ", NumberOfKeys);
STATS_QUAD(L"Number of table elements (vertices): ",
NumberOfTableElements);
STATS_INT(L"Seed 1: ", Seed1);
STATS_INT(L"Seed 2: ", Seed2);
STATS_INT(L"Seed 3: ", Seed3);
STATS_INT(L"Seed 4: ", Seed4);
STATS_QUAD(L"Cycles to solve: ", SolveCycles);
STATS_QUAD(L"Cycles to verify: ", VerifyCycles);
STATS_QUAD(L"Cycles to prepare file: ", PrepareFileCycles);
STATS_QUAD(L"Cycles to save file: ", SaveFileCycles);
STATS_QUAD(L"Microseconds to solve: ", SolveMicroseconds);
STATS_QUAD(L"Microseconds to verify: ", VerifyMicroseconds);
STATS_QUAD(L"Microseconds to prepare file: ", PrepareFileMicroseconds);
STATS_QUAD(L"Microseconds to save file: ", SaveFileMicroseconds);
WIDE_OUTPUT_RAW(WideOutput, L"\n\n");
WIDE_OUTPUT_FLUSH();
//
// Destroy the table.
//
DestroyTable:
Table->Vtbl->Release(Table);
DestroyKeys:
Success = Api->DestroyPerfectHashTableKeys(&Keys);
if (!Success) {
WIDE_OUTPUT_RAW(WideOutput, L"Failed to destroy keys for ");
WIDE_OUTPUT_UNICODE_STRING(WideOutput, &KeysPath);
WIDE_OUTPUT_RAW(WideOutput, L".\n");
WIDE_OUTPUT_FLUSH();
Failures++;
}
DestroyContext:
Success = Api->DestroyPerfectHashTableContext(&Context, NULL);
if (!Success) {
//
// Failure to destroy a context is a fatal error that we can't
// recover from. Bomb out now.
//
WIDE_OUTPUT_RAW(WideOutput, L"Fatal: failed to destroy context.\n");
WIDE_OUTPUT_FLUSH();
Failures++;
break;
}
} while (FindNextFile(FindHandle, &FindData));
//
// Self test complete!
//
if (!Failures) {
Success = TRUE;
goto End;
}
//
// Intentional follow-on to Error.
//
Error:
Success = FALSE;
//
// Intentional follow-on to End.
//
End:
//
// We can't do much if any of these routines error out, hence the NOTHINGs.
//
if (WideOutputBuffer) {
if (!Rtl->DestroyBuffer(Rtl, ProcessHandle, &WideOutputBuffer)) {
NOTHING;
}
WideOutput = NULL;
}
if (BaseBuffer) {
if (!Rtl->DestroyBuffer(Rtl, ProcessHandle, &BaseBuffer)) {
NOTHING;
}
Buffer = NULL;
}
if (FindHandle) {
if (!FindClose(FindHandle)) {
NOTHING;
}
FindHandle = NULL;
}
return Success;
}
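A recurring subtlety in the routine above is that UNICODE_STRING lengths are byte counts, not character counts, which is why every sanity check indexes the buffer with Length >> 1. The same bookkeeping can be illustrated portably (the struct here is a simplified stand-in for UNICODE_STRING, using uint16_t characters in place of WCHAR, and the helper name is made up for this sketch):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t WCHAR16; /* Stand-in for WCHAR on non-Windows hosts. */

/* Simplified stand-in for UNICODE_STRING: both lengths are in BYTES. */
typedef struct {
    uint16_t Length;         /* Bytes, excluding the terminating NUL. */
    uint16_t MaximumLength;  /* Bytes, including the terminating NUL. */
    WCHAR16 *Buffer;
} USTRING16;

/*
 * Initialize from a NUL-terminated 16-bit string, mirroring how the
 * self-test wires up Length/MaximumLength after copying characters.
 */
static void
InitUString(USTRING16 *String, WCHAR16 *Buffer)
{
    size_t Chars = 0;

    while (Buffer[Chars] != 0) {
        Chars++;
    }
    String->Buffer = Buffer;
    String->Length = (uint16_t)(Chars * sizeof(WCHAR16));
    String->MaximumLength = (uint16_t)(String->Length + sizeof(WCHAR16));
}
```

With this convention, `String.Buffer[String.Length >> 1]` always lands on the terminating NUL, which is exactly the invariant the routine ASSERTs after each path is constructed.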
The self-test routine creates a context, loads a keys file, creates a perfect
hash table from the keys using the context, tests the resulting table,
then destroys the table, the keys, and the context. Let's take a look at each routine
and any corresponding data structures involved.
Context Creation
This routine is responsible for creating a
PERFECT_HASH_TABLE_CONTEXT structure, which we use to encapsulate
all of the details regarding the threadpool machinery we provide to the backend
algorithm implementations such that solutions can be sought in parallel. The
context is passed to the CreatePerfectHashTable() routine (along
with the keys we load in the next step).
PERFECT_HASH_TABLE_CONTEXT
The public header gets an opaque structure for the context. We don't need to
expose any internal details of the structure to consumers; all they need is an
opaque pointer they can subsequently pass to the creation routine, and
then to the DestroyPerfectHashTableContext() routine once they're
finished creating perfect hash tables.
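The opaque-pointer pattern in play here can be sketched in portable C. The COUNTER type and its routines below are purely illustrative, not part of the component; the point is that the public header only declares the type, so consumers can hold and pass pointers but never touch fields:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * --- Public "header": the type is declared but never defined here, so
 * consumers can only traffic in pointers to it. ---
 */
typedef struct _COUNTER COUNTER, *PCOUNTER;

PCOUNTER CreateCounter(void);
int IncrementCounter(PCOUNTER Counter);
void DestroyCounter(PCOUNTER *CounterPointer);

/* --- Private "implementation": the full definition lives here. --- */
struct _COUNTER {
    int Value;
};

PCOUNTER
CreateCounter(void)
{
    return (PCOUNTER)calloc(1, sizeof(COUNTER));
}

int
IncrementCounter(PCOUNTER Counter)
{
    return ++Counter->Value;
}

/*
 * Takes a pointer-to-pointer and NULLs it out on success, mirroring the
 * DestroyPerfectHashTableContext(&Context, ...) calling convention.
 */
void
DestroyCounter(PCOUNTER *CounterPointer)
{
    if (CounterPointer && *CounterPointer) {
        free(*CounterPointer);
        *CounterPointer = NULL;
    }
}
```

Passing `&Context` to the destroy routine and having it clear the caller's pointer is a cheap way to prevent use-after-free bugs in consumers.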
//
// Define a runtime context to encapsulate threadpool resources. This is
// passed to CreatePerfectHashTable() and allows for algorithms to search for
// perfect hash solutions in parallel.
//
typedef union _PERFECT_HASH_TABLE_CONTEXT_FLAGS {
struct {
ULONG Unused:32;
};
LONG AsLong;
ULONG AsULong;
} PERFECT_HASH_TABLE_CONTEXT_FLAGS;
C_ASSERT(sizeof(PERFECT_HASH_TABLE_CONTEXT_FLAGS) == sizeof(ULONG));
typedef PERFECT_HASH_TABLE_CONTEXT_FLAGS *PPERFECT_HASH_TABLE_CONTEXT_FLAGS;
typedef struct _Struct_size_bytes_(SizeOfStruct) _PERFECT_HASH_TABLE_CONTEXT {
//
// Size of the structure, in bytes.
//
_Field_range_(==, sizeof(struct _PERFECT_HASH_TABLE_CONTEXT))
ULONG SizeOfStruct;
//
// Flags.
//
PERFECT_HASH_TABLE_CONTEXT_FLAGS Flags;
//
// The algorithm in use.
//
PERFECT_HASH_TABLE_ALGORITHM_ID AlgorithmId;
//
// The masking type in use.
//
PERFECT_HASH_TABLE_MASK_FUNCTION_ID MaskFunctionId;
//
// The hash function in use.
//
PERFECT_HASH_TABLE_HASH_FUNCTION_ID HashFunctionId;
//
// Pointer to an initialized RTL structure.
//
PRTL Rtl;
//
// Pointer to an initialized allocator.
//
PALLOCATOR Allocator;
//
// Pointer to the API structure in use.
//
PPERFECT_HASH_TABLE_ANY_API AnyApi;
//
// Pointer to the active perfect hash table.
//
struct _PERFECT_HASH_TABLE *Table;
//
// The highest number of deleted edges count encountered by a worker thread.
// This is useful when debugging a poorly performing hash/mask combo that is
// failing to find a solution.
//
volatile ULONG HighestDeletedEdgesCount;
//
// The number of attempts we'll make at trying to solve the graph before
// giving up and resizing with a larger underlying table.
//
ULONG ResizeTableThreshold;
//
// Limit on how many times a resize will be kicked off.
//
ULONG ResizeLimit;
//
// Define the events used to communicate various internal state changes
// between the CreatePerfectHashTable() function and the algorithm-specific
// creation routine.
//
// N.B. All of these events are created with the manual reset flag set to
// TRUE, such that they stay signalled even after they have satisfied
// a wait.
//
//
// A global "shutdown" event handle that threads can query to determine
// whether or not they should continue processing at various internal
// checkpoints.
//
union {
HANDLE ShutdownEvent;
PVOID FirstEvent;
};
//
// This event will be set if an algorithm was successful in finding a
// perfect hash. Either it or the FailedEvent will be set; never both.
//
HANDLE SucceededEvent;
//
// This event will be set if an algorithm failed to find a perfect hash
// solution. This may be due to the algorithm exhausting all possible
// options, hitting a time limit, or potentially as a result of being
// forcibly terminated or some other internal error. It will never be
// set if SucceededEvent is also set.
//
HANDLE FailedEvent;
//
// The following event is required to be set by an algorithm's creation
// routine upon completion (regardless of success or failure). This event
// is waited upon by the CreatePerfectHashTable() function, and thus, is
// critical in synchronizing the execution of parallel perfect hash solution
// finding.
//
HANDLE CompletedEvent;
//
// The following event is set when a worker thread detects that the number
// of attempts has exceeded a specified threshold, and that the main thread
// should cancel the current attempts and try again with a larger vertex
// table size.
//
HANDLE TryLargerTableSizeEvent;
//
// The following event is set when a worker thread has completed preparing
// the underlying backing file in order for the solved graph to be persisted
// to disk.
//
HANDLE PreparedFileEvent;
//
// The following event is set by the main thread when it has completed
// verification of the solved graph. It is used to signal to the save
// file worker that verification has finished such that cycle counts can
// be captured in order to calculate the number of cycles and microseconds
// it took to verify the graph.
//
HANDLE VerifiedEvent;
//
// The following event is set when a worker thread has completed saving the
// solved graph to disk.
//
union {
HANDLE SavedFileEvent;
PVOID LastEvent;
};
//
// N.B. All events are created as named events, using the random object
// name generation helper Rtl->CreateRandomObjectNames(). This will
// fill out an array of PUNICODE_STRING pointers. The next field
// points to the first element of that array. Subsequent fields
// capture various book-keeping items about the random object names
// allocation (provided by the Rtl routine).
//
PUNICODE_STRING ObjectNames;
PPUNICODE_STRING ObjectNamesPointerArray;
PWSTR ObjectNamesWideBuffer;
ULONG SizeOfObjectNamesWideBuffer;
ULONG NumberOfObjects;
//
// Number of attempts made by the algorithm to find a solution.
//
volatile ULONGLONG Attempts;
//
// Counters used for capturing performance information. We capture both a
// cycle count, using __rdtsc(), plus a "performance counter" count, via
// QueryPerformanceCounter(). The former provides a finer resolution, but
// can't be used to calculate elapsed microseconds due to turbo boost and
// variable frequencies. The latter provides a coarser resolution, but
// can be used to convert into elapsed microseconds (via the frequency,
// also captured below).
//
LARGE_INTEGER Frequency;
//
// Capture the time required to solve the perfect hash table. This is not
// a sum of all cycles consumed by all worker threads; it is the cycles
// consumed between the "main" thread (i.e. the CreatePerfectHashTable()
// impl routine (CreatePerfectHashTableImplChm01())) dispatching parallel
// work to the threadpool, and a solution being found.
//
ULARGE_INTEGER SolveStartCycles;
LARGE_INTEGER SolveStartCounter;
ULARGE_INTEGER SolveEndCycles;
LARGE_INTEGER SolveEndCounter;
ULARGE_INTEGER SolveElapsedCycles;
ULARGE_INTEGER SolveElapsedMicroseconds;
//
// Capture the time required to verify the solution. This involves walking
// the entire key set, applying the perfect hash function to derive an index
// into the Assigned array, and verifying that we only saw each index value
// at most once.
//
// This is a reasonably good measure of the combined performance of the
// chosen hash and mask algorithm, with lower cycles and counter values
// indicating better performance.
//
ULARGE_INTEGER VerifyStartCycles;
LARGE_INTEGER VerifyStartCounter;
ULARGE_INTEGER VerifyEndCycles;
LARGE_INTEGER VerifyEndCounter;
ULARGE_INTEGER VerifyElapsedCycles;
ULARGE_INTEGER VerifyElapsedMicroseconds;
//
// Capture the time required to prepare the backing .pht1 file in the file
// work threadpool.
//
ULARGE_INTEGER PrepareFileStartCycles;
LARGE_INTEGER PrepareFileStartCounter;
ULARGE_INTEGER PrepareFileEndCycles;
LARGE_INTEGER PrepareFileEndCounter;
ULARGE_INTEGER PrepareFileElapsedCycles;
ULARGE_INTEGER PrepareFileElapsedMicroseconds;
//
// Capture the time required to save the final Assigned array to the backing
// file prepared in an earlier step. This is also dispatched to the file
// work thread pool, and consists of a memory copy from the assigned array
// of the graph to the base address of the backing file's memory map, then
// flushing the map, unmapping it, closing the section, and closing the
// file.
//
ULARGE_INTEGER SaveFileStartCycles;
LARGE_INTEGER SaveFileStartCounter;
ULARGE_INTEGER SaveFileEndCycles;
LARGE_INTEGER SaveFileEndCounter;
ULARGE_INTEGER SaveFileElapsedCycles;
ULARGE_INTEGER SaveFileElapsedMicroseconds;
//
// Number of failed attempts at solving the graph across all threads.
//
volatile ULONGLONG FailedAttempts;
//
// The main threadpool callback environment, used for solving perfect hash
// solutions in parallel.
//
TP_CALLBACK_ENVIRON MainCallbackEnv;
PTP_CLEANUP_GROUP MainCleanupGroup;
PTP_POOL MainThreadpool;
PTP_WORK MainWork;
SLIST_HEADER MainWorkListHead;
ULONG MinimumConcurrency;
ULONG MaximumConcurrency;
//
// The algorithm is responsible for registering an appropriate callback
// for main thread work items in this next field.
//
PPERFECT_HASH_TABLE_MAIN_WORK_CALLBACK MainWorkCallback;
//
// A threadpool for offloading file operations.
//
TP_CALLBACK_ENVIRON FileCallbackEnv;
PTP_CLEANUP_GROUP FileCleanupGroup;
PTP_POOL FileThreadpool;
PTP_WORK FileWork;
SLIST_HEADER FileWorkListHead;
//
// Provide a means for file work callbacks to indicate an error back to
// the creation routine by incrementing the following counter.
//
volatile ULONG FileWorkErrors;
volatile ULONG FileWorkLastError;
//
// The algorithm is responsible for registering an appropriate callback
// for file work threadpool work items in this next field.
//
PPERFECT_HASH_TABLE_FILE_WORK_CALLBACK FileWorkCallback;
//
// If a threadpool worker thread finds a perfect hash solution, it will
// enqueue a "Finished!"-type work item to a separate threadpool, captured
// by the following callback environment. This allows for a separate
// threadpool worker to schedule the cancellation of other in-progress
// and outstanding perfect hash solution attempts without deadlocking.
//
// This threadpool environment is serviced by a single thread.
//
// N.B. This cleanup only refers to the main graph solving thread pool.
// The file threadpool is managed by the implicit lifetime of the
// algorithm's creation routine (e.g. CreatePerfectHashTableImplChm01).
//
TP_CALLBACK_ENVIRON FinishedCallbackEnv;
PTP_POOL FinishedThreadpool;
PTP_WORK FinishedWork;
SLIST_HEADER FinishedWorkListHead;
//
// If a worker thread successfully finds a perfect hash solution, it will
// push its solution to the FinishedListHead above, then submit a finished
// work item via SubmitThreadpoolWork(Context->FinishedWork).
//
// This callback will be processed by the finished group above, and provides
// a means for that thread to set the ShutdownEvent and cancel outstanding
// main work callbacks.
//
// N.B. Although we only need one solution, we don't prevent multiple
// successful solutions from being pushed to the FinishedListHead.
// Whatever the first solution is that the finished callback pops
// off that list is the solution that wins.
//
volatile ULONGLONG FinishedCount;
//
// Similar to the Finished group above, provide an Error group that also
// consists of a single thread. If a main threadpool worker thread runs
// into a fatal error that requires termination of all in-progress and
// outstanding threadpool work items, it can just dispatch a work item
// to this particular pool (e.g. SubmitThreadpoolWork(Context->ErrorWork)).
//
// There is no ErrorListHead as no error information is captured that needs
// communicating back to a central location.
//
TP_CALLBACK_ENVIRON ErrorCallbackEnv;
PTP_POOL ErrorThreadpool;
PTP_WORK ErrorWork;
//
// An opaque pointer that can be used by the algorithm to stash additional
// context.
//
PVOID AlgorithmContext;
//
// An opaque pointer that can be used by the hash function to stash
// additional context.
//
PVOID HashFunctionContext;
//
// An opaque pointer to the winning solution (i.e. the solved graph).
//
PVOID SolvedContext;
} PERFECT_HASH_TABLE_CONTEXT;
typedef PERFECT_HASH_TABLE_CONTEXT *PPERFECT_HASH_TABLE_CONTEXT;
//
// Define helper macros for marking start/end points for the context's
// cycle/counter fields. When starting, we put __rdtsc() last, and when
// stopping we put it first, as its resolution is finer than that of the
// QueryPerformanceCounter() routine.
//
#define CONTEXT_START_TIMERS(Name) \
QueryPerformanceCounter(&Context->##Name##StartCounter); \
Context->##Name##StartCycles.QuadPart = __rdtsc()
#define CONTEXT_END_TIMERS(Name) \
Context->##Name##EndCycles.QuadPart = __rdtsc(); \
QueryPerformanceCounter(&Context->##Name##EndCounter); \
Context->##Name##ElapsedCycles.QuadPart = ( \
Context->##Name##EndCycles.QuadPart - \
Context->##Name##StartCycles.QuadPart \
); \
Context->##Name##ElapsedMicroseconds.QuadPart = ( \
Context->##Name##EndCounter.QuadPart - \
Context->##Name##StartCounter.QuadPart \
); \
Context->##Name##ElapsedMicroseconds.QuadPart *= 1000000; \
Context->##Name##ElapsedMicroseconds.QuadPart /= ( \
Context->Frequency.QuadPart \
)
#define CONTEXT_SAVE_TIMERS_TO_HEADER(Name) \
Header->##Name##Cycles.QuadPart = Context->##Name##ElapsedCycles.QuadPart; \
Header->##Name##Microseconds.QuadPart = ( \
Context->##Name##ElapsedMicroseconds.QuadPart \
)
_Use_decl_annotations_
BOOLEAN
CreatePerfectHashTableContext(
PRTL Rtl,
PALLOCATOR Allocator,
PULONG MaximumConcurrencyPointer,
PPERFECT_HASH_TABLE_CONTEXT *ContextPointer
)
/*++
Routine Description:
Creates and initializes a PERFECT_HASH_TABLE_CONTEXT structure, which
creates a main threadpool with a maximum thread count indicated by the
MaximumConcurrency parameter, if applicable.
Arguments:
Rtl - Supplies a pointer to an initialized RTL structure.
Allocator - Supplies a pointer to an initialized ALLOCATOR structure that
will be used for all memory allocations.
MaximumConcurrencyPointer - Optionally supplies a pointer to a variable that
contains the desired maximum concurrency to be used for the underlying
threadpool. If NULL, or non-NULL but points to a value of 0, then the
number of system processors will be used as a default value.
N.B. This value is passed directly to SetThreadpoolThreadMinimum() and
SetThreadpoolThreadMaximum().
ContextPointer - Supplies the address of a variable that receives the
address of the newly created PERFECT_HASH_TABLE_CONTEXT structure on
success, NULL on failure.
Return Value:
TRUE on success, FALSE on failure.
--*/
{
ValidateArguments();
//
// Calculate allocation size required by the structure.
//
CalculateAllocationSize(&AllocSize);
//
// Allocate it.
//
Context = AllocateContext(AllocSize);
//
// Create the random object names for our underlying events.
//
Success = Rtl->CreateRandomObjectNames(Rtl,
Allocator,
Allocator,
NumberOfContextObjectPrefixes,
64,
NULL,
Context->ObjectNamesPointerArray,
Prefixes,
&SizeOfNamesWideBuffer,
&NamesWideBuffer);
//
// Create named events.
//
for (Index = 0; Index < NumberOfEvents; Index++, Event++, Name++) {
//
// We want all of our events to be manual reset, such that they stay
// signaled even after they've satisfied a wait.
//
BOOLEAN ManualReset = TRUE;
*Event = Rtl->CreateEventW(NULL,
ManualReset,
FALSE,
Name->Buffer);
}
//
// Create our threadpools: main work, file work, finished, error.
//
CreateMainWorkThreadpoolResources(Context);
CreateFileWorkThreadpoolResources(Context);
CreateFinishedWorkThreadpoolResources(Context);
CreateErrorWorkThreadpoolResources(Context);
*ContextPointer = Context;
return TRUE;
}
_Use_decl_annotations_
BOOLEAN
CreatePerfectHashTableContext(
PRTL Rtl,
PALLOCATOR Allocator,
PULONG MaximumConcurrencyPointer,
PPERFECT_HASH_TABLE_CONTEXT *ContextPointer
)
/*++
Routine Description:
Creates and initializes a PERFECT_HASH_TABLE_CONTEXT structure, which
creates a main threadpool with a maximum thread count indicated by the
MaximumConcurrency parameter, if applicable.
Arguments:
Rtl - Supplies a pointer to an initialized RTL structure.
Allocator - Supplies a pointer to an initialized ALLOCATOR structure that
will be used for all memory allocations.
MaximumConcurrencyPointer - Optionally supplies a pointer to a variable that
contains the desired maximum concurrency to be used for the underlying
threadpool. If NULL, or non-NULL but points to a value of 0, then the
number of system processors will be used as a default value.
N.B. This value is passed directly to SetThreadpoolThreadMinimum() and
SetThreadpoolThreadMaximum().
ContextPointer - Supplies the address of a variable that receives the
address of the newly created PERFECT_HASH_TABLE_CONTEXT structure on
success, NULL on failure.
Return Value:
TRUE on success, FALSE on failure.
--*/
{
BYTE Index;
BYTE NumberOfEvents;
ULONG LastError;
PHANDLE Event;
BOOLEAN Success;
PBYTE Buffer;
PBYTE ExpectedBuffer;
ULONG MaximumConcurrency;
ULONG NumberOfProcessors;
ULONG SizeOfNamesWideBuffer;
PWSTR NamesWideBuffer;
PTP_POOL Threadpool;
ULARGE_INTEGER AllocSize;
ULARGE_INTEGER ObjectNameArraySize;
ULARGE_INTEGER ObjectNamePointersArraySize;
PUNICODE_STRING Name;
PPUNICODE_STRING Names;
PPUNICODE_STRING Prefixes;
PPERFECT_HASH_TABLE_CONTEXT Context = NULL;
//
// Validate arguments.
//
if (!ARGUMENT_PRESENT(Rtl)) {
return FALSE;
}
if (!ARGUMENT_PRESENT(Allocator)) {
return FALSE;
}
if (!ARGUMENT_PRESENT(ContextPointer)) {
return FALSE;
}
//
// Default the maximum concurrency to 0 if a NULL pointer was provided.
//
if (ARGUMENT_PRESENT(MaximumConcurrencyPointer)) {
MaximumConcurrency = *MaximumConcurrencyPointer;
} else {
MaximumConcurrency = 0;
}
//
// Argument validation complete. Clear the caller's pointer up-front and
// continue with creation.
//
*ContextPointer = NULL;
//
// Calculate the size required by the array of UNICODE_STRING structures
// that trail the context, then the array of addresses to those structures.
//
ObjectNameArraySize.QuadPart = (
(NumberOfContextObjectPrefixes * sizeof(UNICODE_STRING))
);
ObjectNamePointersArraySize.QuadPart = (
(NumberOfContextObjectPrefixes * sizeof(PUNICODE_STRING))
);
ASSERT(!ObjectNameArraySize.HighPart);
ASSERT(!ObjectNamePointersArraySize.HighPart);
//
// Calculate allocation size required by the structure.
//
AllocSize.QuadPart = (
//
// Account for the size of the context structure itself.
//
sizeof(PERFECT_HASH_TABLE_CONTEXT) +
//
// Account for the object name overhead.
//
ObjectNameArraySize.QuadPart +
ObjectNamePointersArraySize.QuadPart
);
//
// Sanity check we haven't overflowed.
//
ASSERT(!AllocSize.HighPart);
//
// Allocate space for the context.
//
Context = (PPERFECT_HASH_TABLE_CONTEXT)(
Allocator->Calloc(Allocator->Context,
1,
AllocSize.LowPart)
);
if (!Context) {
return FALSE;
}
//
// Allocation was successful, continue with initialization.
//
Context->SizeOfStruct = sizeof(*Context);
Context->Rtl = Rtl;
Context->Allocator = Allocator;
Context->Flags.AsULong = 0;
//
// The context structure will be trailed by an array of UNICODE_STRING
// structures that will be filled in by Rtl->CreateRandomObjectNames().
// This is then followed by an array of PUNICODE_STRING pointers, with
// each one initialized to point to the corresponding offset into the
// first array. This is a bit quirky, but it's required by the Rtl
// method (which allows for prefixes to be ignored for certain objects
// if desired).
//
Buffer = (PBYTE)Context;
Buffer += sizeof(*Context);
Context->ObjectNames = (PUNICODE_STRING)Buffer;
Buffer += ObjectNameArraySize.LowPart;
Context->ObjectNamesPointerArray = (PPUNICODE_STRING)Buffer;
Buffer += ObjectNamePointersArraySize.LowPart;
//
// If our pointer arithmetic was correct, Buffer should match the base
// address of the context plus the total allocation size at this point.
// Assert this invariant now.
//
ExpectedBuffer = RtlOffsetToPointer(Context, AllocSize.LowPart);
ASSERT(Buffer == ExpectedBuffer);
//
// Wire up the pointer array to the object names.
//
Names = Context->ObjectNamesPointerArray;
for (Index = 0; Index < NumberOfContextObjectPrefixes; Index++) {
Names[Index] = Context->ObjectNames + Index;
}
//
// Create the random object names for our underlying events.
//
Prefixes = (PPUNICODE_STRING)&ContextObjectPrefixes;
Success = Rtl->CreateRandomObjectNames(Rtl,
Allocator,
Allocator,
NumberOfContextObjectPrefixes,
64,
NULL,
Context->ObjectNamesPointerArray,
Prefixes,
&SizeOfNamesWideBuffer,
&NamesWideBuffer);
if (!Success) {
goto Error;
}
//
// Wire up the wide buffer pointer and containing size.
//
Context->ObjectNamesWideBuffer = NamesWideBuffer;
Context->SizeOfObjectNamesWideBuffer = SizeOfNamesWideBuffer;
Context->NumberOfObjects = NumberOfContextObjectPrefixes;
//
// Initialize the event pointer to the first handle, and the name pointer
// to the first UNICODE_STRING pointer. Obtain the number of events.
//
Event = (PHANDLE)&Context->FirstEvent;
Name = &Context->ObjectNames[0];
NumberOfEvents = GetNumberOfContextEvents(Context);
for (Index = 0; Index < NumberOfEvents; Index++, Event++, Name++) {
//
// We want all of our events to be manual reset, such that they stay
// signaled even after they've satisfied a wait.
//
BOOLEAN ManualReset = TRUE;
*Event = Rtl->CreateEventW(NULL,
ManualReset,
FALSE,
Name->Buffer);
LastError = GetLastError();
if (!*Event || LastError == ERROR_ALREADY_EXISTS) {
//
// As the event names are random, a last error that indicates the
// name already exists is evidence of a pretty serious problem.
// Treat this as fatal.
//
goto Error;
}
}
//
// If the maximum concurrency is 0, default to the number of processors.
//
NumberOfProcessors = GetMaximumProcessorCount(ALL_PROCESSOR_GROUPS);
if (MaximumConcurrency == 0) {
MaximumConcurrency = NumberOfProcessors;
}
Context->MinimumConcurrency = MaximumConcurrency;
Context->MaximumConcurrency = MaximumConcurrency;
//
// Create the Main threadpool structures. This threadpool creates a fixed
// number of threads equal to the maximum concurrency specified by the user
// (e.g. min threads is set to the same value as max threads).
//
Threadpool = Context->MainThreadpool = CreateThreadpool(NULL);
if (!Threadpool) {
goto Error;
}
if (!SetThreadpoolThreadMinimum(Threadpool, MaximumConcurrency)) {
goto Error;
}
SetThreadpoolThreadMaximum(Threadpool, MaximumConcurrency);
//
// Initialize the Main threadpool environment, set its priority to
// low, then associate it with the Main threadpool.
//
InitializeThreadpoolEnvironment(&Context->MainCallbackEnv);
SetThreadpoolCallbackPriority(&Context->MainCallbackEnv,
TP_CALLBACK_PRIORITY_LOW);
SetThreadpoolCallbackPool(&Context->MainCallbackEnv,
Context->MainThreadpool);
//
// Create a cleanup group for the Main threadpool and register it.
//
Context->MainCleanupGroup = CreateThreadpoolCleanupGroup();
if (!Context->MainCleanupGroup) {
goto Error;
}
SetThreadpoolCallbackCleanupGroup(&Context->MainCallbackEnv,
Context->MainCleanupGroup,
CleanupCallback);
//
// Create a work object for the Main threadpool.
//
Context->MainWork = CreateThreadpoolWork(MainWorkCallback,
Context,
&Context->MainCallbackEnv);
if (!Context->MainWork) {
goto Error;
}
//
// Initialize the main work list head.
//
InitializeSListHead(&Context->MainWorkListHead);
//
// Create the File threadpool structures. We currently use the default
// number of system threads for this threadpool. We don't have to clamp
// it at the moment as we don't bulk-enqueue work items (that can cause
// the threadpool machinery to think more threads need to be created).
//
Threadpool = Context->FileThreadpool = CreateThreadpool(NULL);
if (!Threadpool) {
goto Error;
}
//
// Initialize the File threadpool environment and associate it with the
// File threadpool.
//
InitializeThreadpoolEnvironment(&Context->FileCallbackEnv);
SetThreadpoolCallbackPool(&Context->FileCallbackEnv,
Context->FileThreadpool);
//
// Create a cleanup group for the File threadpool and register it.
//
Context->FileCleanupGroup = CreateThreadpoolCleanupGroup();
if (!Context->FileCleanupGroup) {
goto Error;
}
SetThreadpoolCallbackCleanupGroup(&Context->FileCallbackEnv,
Context->FileCleanupGroup,
CleanupCallback);
//
// Create a work object for the File threadpool.
//
Context->FileWork = CreateThreadpoolWork(FileWorkCallback,
Context,
&Context->FileCallbackEnv);
if (!Context->FileWork) {
goto Error;
}
//
// Initialize the file work list head.
//
InitializeSListHead(&Context->FileWorkListHead);
//
// Create the Finished and Error threadpools and associated resources.
// These are slightly easier as we only have 1 thread maximum for each
// pool and no cleanup group is necessary.
//
Context->FinishedThreadpool = CreateThreadpool(NULL);
if (!Context->FinishedThreadpool) {
goto Error;
}
if (!SetThreadpoolThreadMinimum(Context->FinishedThreadpool, 1)) {
goto Error;
}
SetThreadpoolThreadMaximum(Context->FinishedThreadpool, 1);
Context->FinishedWork = CreateThreadpoolWork(FinishedWorkCallback,
Context,
&Context->FinishedCallbackEnv);
if (!Context->FinishedWork) {
goto Error;
}
//
// Initialize the finished list head.
//
InitializeSListHead(&Context->FinishedWorkListHead);
//
// Create the Error threadpool.
//
Context->ErrorThreadpool = CreateThreadpool(NULL);
if (!Context->ErrorThreadpool) {
goto Error;
}
if (!SetThreadpoolThreadMinimum(Context->ErrorThreadpool, 1)) {
goto Error;
}
SetThreadpoolThreadMaximum(Context->ErrorThreadpool, 1);
Context->ErrorWork = CreateThreadpoolWork(ErrorWorkCallback,
Context,
&Context->ErrorCallbackEnv);
if (!Context->ErrorWork) {
goto Error;
}
//
// We're done! Jump to the end of the routine to finish up.
//
Success = TRUE;
goto End;
Error:
Success = FALSE;
//
// Call the destroy routine on the context if it is non-NULL.
//
if (Context) {
if (!DestroyPerfectHashTableContext(&Context, NULL)) {
//
// There's nothing we can do here.
//
NOTHING;
}
//
// N.B. DestroyPerfectHashTableContext() should clear the Context
// pointer. Assert that invariant now.
//
ASSERT(Context == NULL);
}
//
// Intentional follow-on to End.
//
End:
//
// Update the caller's pointer and return.
//
// N.B. Context could be NULL here, which is fine.
//
*ContextPointer = Context;
return Success;
}
The Rtl->CreateRandomObjectNames() function warrants a little
extra explanation. Its role can best be appreciated when viewing the named
event handles created for the context. The following screenshot was taken
with Process Hacker 2, which has a convenient means for interacting with a
process's named events:
Being able to select an event and set, reset or pulse it can be very useful
during development and debugging, especially in a multithreaded environment
where the events will typically have one or more threads waiting on them at
various points.
The CreateRandomObjectNames() routine, when given an array of
object name prefixes, will create names suitable for passing directly to any of
the NT routines that allow naming of resources (events, sections, etc). It
appends a configurable amount of Base64-encoded random data to each name,
ensuring uniqueness across the entire machine.
Key Loading
The key loader routine is one of our simpler routines. It memory-maps the
given .keys file such that it can be accessed as a linear 1D array of
ULONG values by the creation routines down the track. It produces
an instance of a PERFECT_HASH_TABLE_KEYS structure, for which an
opaque pointer is returned to the caller.
PERFECT_HASH_TABLE_KEYS
//
// Cap the maximum key set size we're willing to process.
//
#define MAXIMUM_NUMBER_OF_KEYS 500000
//
// Define the PERFECT_HASH_TABLE_KEYS_FLAGS structure.
//
typedef union _PERFECT_HASH_TABLE_KEYS_FLAGS {
struct _Struct_size_bytes_(sizeof(ULONG)) {
//
// Unused bits.
//
ULONG Unused:32;
};
LONG AsLong;
ULONG AsULong;
} PERFECT_HASH_TABLE_KEYS_FLAGS;
C_ASSERT(sizeof(PERFECT_HASH_TABLE_KEYS_FLAGS) == sizeof(ULONG));
typedef PERFECT_HASH_TABLE_KEYS_FLAGS *PPERFECT_HASH_TABLE_KEYS_FLAGS;
//
// Define the PERFECT_HASH_TABLE_KEYS structure.
//
typedef struct _Struct_size_bytes_(SizeOfStruct) _PERFECT_HASH_TABLE_KEYS {
//
// Reserve a slot for a vtable. Currently unused.
//
PPVOID Vtbl;
//
// Size of the structure, in bytes.
//
_Field_range_(==, sizeof(struct _PERFECT_HASH_TABLE_KEYS))
ULONG SizeOfStruct;
//
// Flags.
//
PERFECT_HASH_TABLE_KEYS_FLAGS Flags;
//
// Pointer to an initialized RTL structure.
//
PRTL Rtl;
//
// Pointer to an initialized ALLOCATOR structure.
//
PALLOCATOR Allocator;
//
// Pointer to the API structure in use.
//
PPERFECT_HASH_TABLE_ANY_API AnyApi;
//
// Number of keys in the mapping.
//
ULARGE_INTEGER NumberOfElements;
//
// Handle to the underlying keys file.
//
HANDLE FileHandle;
//
// Handle to the memory mapping for the keys file.
//
HANDLE MappingHandle;
//
// Base address of the memory map.
//
union {
PVOID BaseAddress;
PULONG Keys;
};
//
// Fully-qualified, NULL-terminated path of the source keys file.
//
UNICODE_STRING Path;
} PERFECT_HASH_TABLE_KEYS;
typedef PERFECT_HASH_TABLE_KEYS *PPERFECT_HASH_TABLE_KEYS;
LoadPerfectHashTableKeys
Pseudo
Full
_Use_decl_annotations_
BOOLEAN
LoadPerfectHashTableKeys(
PRTL Rtl,
PALLOCATOR Allocator,
PCUNICODE_STRING Path,
PPERFECT_HASH_TABLE_KEYS *KeysPointer
)
/*++
Routine Description:
Loads a keys file from the file system and returns a PERFECT_HASH_TABLE_KEYS
structure.
Arguments:
Rtl - Supplies a pointer to an initialized RTL structure.
Allocator - Supplies a pointer to an initialized ALLOCATOR structure that
will be used for all memory allocations.
Path - Supplies a pointer to a UNICODE_STRING structure that represents
a fully-qualified path of the keys to use for the perfect hash table.
N.B. Path must be NULL-terminated, which is not normally required for
UNICODE_STRING structures. However, the underlying buffer is passed
to CreateFileW(), which requires a NULL-terminated wstr.
KeysPointer - Supplies the address of a variable that will receive the
address of the newly created PERFECT_HASH_TABLE_KEYS structure if the
routine is successful, or NULL if the routine fails.
Return Value:
TRUE on success, FALSE on failure.
--*/
{
ValidateArguments();
//
// Open the file, create a file mapping, then map it into memory.
//
FileHandle = Rtl->CreateFileW(
Path->Buffer,
GENERIC_READ,
FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
NULL,
OPEN_EXISTING,
FILE_FLAG_SEQUENTIAL_SCAN | FILE_FLAG_OVERLAPPED,
NULL
);
//
// Verify file details such as size.
//
VerifyFileDetails();
//
// Create the file mapping.
//
MappingHandle = CreateFileMappingW(FileHandle,
NULL,
PAGE_READONLY,
FileInfo.EndOfFile.HighPart,
FileInfo.EndOfFile.LowPart,
NULL);
//
// Successfully created a file mapping. Now map it into memory.
//
BaseAddress = MapViewOfFile(MappingHandle,
FILE_MAP_READ,
0,
0,
FileInfo.EndOfFile.LowPart);
//
// The file has been mapped successfully. Calculate the size required for
// the keys structure and a copy of the Path's underlying unicode string
// buffer.
//
CalculateAllocSize(&AllocSize);
//
// Allocate the structure.
//
Keys = AllocKeys(AllocSize);
//
// Initialize fields then duplicate the keys path.
//
InitializeFields(Keys);
CopyKeysPath(Keys, Path);
*KeysPointer = Keys;
return Success;
}
_Use_decl_annotations_
BOOLEAN
LoadPerfectHashTableKeys(
PRTL Rtl,
PALLOCATOR Allocator,
PCUNICODE_STRING Path,
PPERFECT_HASH_TABLE_KEYS *KeysPointer
)
/*++
Routine Description:
Loads a keys file from the file system and returns a PERFECT_HASH_TABLE_KEYS
structure.
Arguments:
Rtl - Supplies a pointer to an initialized RTL structure.
Allocator - Supplies a pointer to an initialized ALLOCATOR structure that
will be used for all memory allocations.
Path - Supplies a pointer to a UNICODE_STRING structure that represents
a fully-qualified path of the keys to use for the perfect hash table.
N.B. Path must be NULL-terminated, which is not normally required for
UNICODE_STRING structures. However, the underlying buffer is passed
to CreateFileW(), which requires a NULL-terminated wstr.
KeysPointer - Supplies the address of a variable that will receive the
address of the newly created PERFECT_HASH_TABLE_KEYS structure if the
routine is successful, or NULL if the routine fails.
Return Value:
TRUE on success, FALSE on failure.
--*/
{
BOOLEAN Success;
PVOID BaseAddress = NULL;
HANDLE FileHandle = NULL;
HANDLE MappingHandle = NULL;
LARGE_INTEGER AllocSize;
LARGE_INTEGER NumberOfElements;
FILE_STANDARD_INFO FileInfo;
PPERFECT_HASH_TABLE_KEYS Keys = NULL;
//
// Validate arguments.
//
if (!ARGUMENT_PRESENT(Rtl)) {
return FALSE;
}
if (!ARGUMENT_PRESENT(Allocator)) {
return FALSE;
}
if (!ARGUMENT_PRESENT(KeysPointer)) {
return FALSE;
}
if (!ARGUMENT_PRESENT(Path)) {
return FALSE;
}
if (!IsValidMinimumDirectoryNullTerminatedUnicodeString(Path)) {
return FALSE;
}
//
// Clear the caller's pointer up-front.
//
*KeysPointer = NULL;
//
// Open the file, create a file mapping, then map it into memory.
//
FileHandle = Rtl->CreateFileW(
Path->Buffer,
GENERIC_READ,
FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
NULL,
OPEN_EXISTING,
FILE_FLAG_SEQUENTIAL_SCAN | FILE_FLAG_OVERLAPPED,
NULL
);
if (!FileHandle || FileHandle == INVALID_HANDLE_VALUE) {
return FALSE;
}
//
// N.B. Error handling should 'goto Error' from this point onward now that
// we have resources that may need to be cleaned up.
//
Success = GetFileInformationByHandleEx(
FileHandle,
(FILE_INFO_BY_HANDLE_CLASS)FileStandardInfo,
&FileInfo,
sizeof(FileInfo)
);
if (!Success) {
goto Error;
}
//
// Make sure the file is a multiple of our key size.
//
if (FileInfo.EndOfFile.QuadPart % 4ULL) {
goto Error;
}
//
// The number of elements in the key file can be ascertained by right
// shifting by 2.
//
NumberOfElements.QuadPart = FileInfo.EndOfFile.QuadPart >> 2;
//
// Sanity check the number of elements. There shouldn't be more than
// MAX_ULONG.
//
ASSERT(!NumberOfElements.HighPart);
//
// Create the file mapping.
//
MappingHandle = CreateFileMappingW(FileHandle,
NULL,
PAGE_READONLY,
FileInfo.EndOfFile.HighPart,
FileInfo.EndOfFile.LowPart,
NULL);
if (!MappingHandle || MappingHandle == INVALID_HANDLE_VALUE) {
goto Error;
}
//
// Successfully created a file mapping. Now map it into memory.
//
BaseAddress = MapViewOfFile(MappingHandle,
FILE_MAP_READ,
0,
0,
FileInfo.EndOfFile.LowPart);
if (!BaseAddress) {
goto Error;
}
//
// The file has been mapped successfully. Calculate the size required for
// the keys structure and a copy of the Path's underlying unicode string
// buffer.
//
AllocSize.QuadPart = (
//
// Account for the keys structure size.
//
sizeof(*Keys) +
//
// Account for the length of the UNICODE_STRING buffer and terminating
// NULL.
//
Path->Length + sizeof(Path->Buffer[0])
);
//
// Sanity check our size.
//
ASSERT(!AllocSize.HighPart);
//
// Proceed with allocation.
//
Keys = (PPERFECT_HASH_TABLE_KEYS)(
Allocator->Calloc(Allocator->Context,
1,
AllocSize.LowPart)
);
if (!Keys) {
goto Error;
}
//
// Fill out the main structure details.
//
Keys->SizeOfStruct = sizeof(*Keys);
Keys->Rtl = Rtl;
Keys->Allocator = Allocator;
Keys->FileHandle = FileHandle;
Keys->MappingHandle = MappingHandle;
Keys->BaseAddress = BaseAddress;
Keys->NumberOfElements.QuadPart = NumberOfElements.QuadPart;
//
// Initialize the path length variables, point the buffer at the space after
// our structure, then copy the string over and NULL-terminate it.
//
Keys->Path.Length = Path->Length;
Keys->Path.MaximumLength = Path->Length + sizeof(Path->Buffer[0]);
Keys->Path.Buffer = (PWSTR)RtlOffsetToPointer(Keys, sizeof(*Keys));
CopyMemory(Keys->Path.Buffer, Path->Buffer, Path->Length);
Keys->Path.Buffer[Path->Length >> 1] = L'\0';
//
// We've completed initialization, indicate success and jump to the end.
//
Success = TRUE;
goto End;
Error:
Success = FALSE;
//
// Clean up any resources we may have allocated.
//
if (BaseAddress) {
UnmapViewOfFile(BaseAddress);
BaseAddress = NULL;
}
if (MappingHandle) {
CloseHandle(MappingHandle);
MappingHandle = NULL;
}
if (FileHandle) {
CloseHandle(FileHandle);
FileHandle = NULL;
}
if (Keys) {
Allocator->FreePointer(Allocator->Context, &Keys);
}
//
// Intentional follow-on to End.
//
End:
//
// Update the caller's pointer and return.
//
// N.B. Keys could be NULL here, which is fine.
//
*KeysPointer = Keys;
return Success;
}
With a context and keys structure in hand, we're ready to try to create a perfect
hash table.
Perfect Hash Table Creation
Table creation, if successful, does not return an opaque pointer to a structure
that can be subsequently interacted with by other routines — unlike the
context creation and key loading routines we've examined up to now.
Instead, it writes out the .pht1 perfect hash table file to disk,
such that it can be loaded (via the filename) in a subsequent call to the
LoadPerfectHashTable() routine. Now, that being said, both the
create and load routines use the same C structure for the table:
PERFECT_HASH_TABLE. The public definition of this structure, as we
saw earlier, was simply a wrapper around a vtbl pointer. The private definition
in PerfectHashTablePrivate.h still adheres to the vtbl interface, but extends
it with all the additional data we need.
Let's take a look at the table structure first, then examine the creation routine.
PERFECT_HASH_TABLE
//
// Define the PERFECT_HASH_TABLE_FLAGS structure.
//
typedef union _PERFECT_HASH_TABLE_FLAGS {
struct _Struct_size_bytes_(sizeof(ULONG)) {
//
// When set, indicates the table came from CreatePerfectHashTable().
//
// Invariant:
//
// - If Created == TRUE:
// Assert Loaded == FALSE
//
ULONG Created:1;
//
// When set, indicates the table came from LoadPerfectHashTable().
//
// Invariant:
//
// - If Loaded == TRUE:
// Assert Created == FALSE
//
ULONG Loaded:1;
//
// Unused bits.
//
ULONG Unused:30;
};
LONG AsLong;
ULONG AsULong;
} PERFECT_HASH_TABLE_FLAGS;
C_ASSERT(sizeof(PERFECT_HASH_TABLE_FLAGS) == sizeof(ULONG));
typedef PERFECT_HASH_TABLE_FLAGS *PPERFECT_HASH_TABLE_FLAGS;
//
// Define the PERFECT_HASH_TABLE structure.
//
typedef struct _Struct_size_bytes_(SizeOfStruct) _PERFECT_HASH_TABLE {
//
// Our extended vtbl slot comes first, COM-style.
//
struct _PERFECT_HASH_TABLE_VTBL_EX *Vtbl;
//
// Size of the structure, in bytes.
//
_Field_range_(==, sizeof(struct _PERFECT_HASH_TABLE)) ULONG SizeOfStruct;
//
// Flags.
//
PERFECT_HASH_TABLE_FLAGS Flags;
//
// Generic singly-linked list entry.
//
SLIST_ENTRY ListEntry;
//
// Pointer to an initialized RTL structure.
//
PRTL Rtl;
//
// Pointer to an initialized ALLOCATOR structure.
//
PALLOCATOR Allocator;
//
// Reference count. Affected by AddRef() and Release().
//
volatile ULONG ReferenceCount;
//
// Capture the number of elements in the underlying perfect hash table.
// This refers to the number of vertices for the CHM algorithm, or can
// mean the rounded-up power-of-2 size. The masking implementations need
// an agnostic way to access this value, which is why it is provided here
// at the table level (despite being obtainable from things like the number
// of vertices or Keys->NumberOfElements).
//
ULONG HashSize;
ULONG IndexSize;
//
// Similarly, provide a convenient way to access the table "shift" amount
// if a shifting mask routine is active. This value is the number of
// trailing zeros of the Size above for tables whose size is a power of 2.
// It is not used if modulus masking is active.
//
ULONG HashShift;
ULONG IndexShift;
//
// Mask.
//
ULONG HashMask;
ULONG IndexMask;
//
// The following value represents how many times we need to XOR the high
// part of a word with the low part of a word -- where word is being used
// in the general computer word sense (i.e. not a 2-byte short) -- such
// that the final value is within the bounds of the table mask. It is
// also the value of log2(Table->Shift).
//
ULONG HashFold;
ULONG IndexFold;
//
// If modulus masking is active, this represents the modulus that will be
// used for masking, e.g. Input %= Table->Modulus.
//
ULONG HashModulus;
ULONG IndexModulus;
//
// If a caller provided the number of table elements as a parameter to the
// CreatePerfectHashTable() function, that value will be captured here. It
// overrides the default sizing heuristics. (If non-zero, it will be at
// least equal to or greater than the number of keys.)
//
ULARGE_INTEGER RequestedNumberOfTableElements;
//
// The algorithm in use.
//
PERFECT_HASH_TABLE_ALGORITHM_ID AlgorithmId;
//
// The masking type in use.
//
PERFECT_HASH_TABLE_MASK_FUNCTION_ID MaskFunctionId;
//
// The hash function in use.
//
PERFECT_HASH_TABLE_HASH_FUNCTION_ID HashFunctionId;
//
// Pointer to the keys corresponding to this perfect hash table. May be
// NULL.
//
PPERFECT_HASH_TABLE_KEYS Keys;
//
// Pointer to the PERFECT_HASH_TABLE_CONTEXT structure in use.
//
PPERFECT_HASH_TABLE_CONTEXT Context;
//
// Handle to the backing file.
//
HANDLE FileHandle;
//
// Handle to the memory mapping for the backing file.
//
HANDLE MappingHandle;
//
// Base address of the memory map for the backing file.
//
union {
PVOID BaseAddress;
PULONG Data;
};
//
// Fully-qualified, NULL-terminated path of the backing file. The path is
// automatically derived from the keys file.
//
UNICODE_STRING Path;
//
// Capture the mapping size of the underlying array, which will be aligned
// up to a system allocation granularity.
//
ULARGE_INTEGER MappingSizeInBytes;
//
// Handle to the info stream backing file.
//
HANDLE InfoStreamFileHandle;
//
// Handle to the memory mapping for the backing file.
//
HANDLE InfoStreamMappingHandle;
//
// Base address of the memory map for the :Info stream.
//
union {
PVOID InfoStreamBaseAddress;
struct _TABLE_INFO_ON_DISK_HEADER *Header;
};
//
// Fully-qualified, NULL-terminated path of the :Info stream associated with
// the path above.
//
UNICODE_STRING InfoStreamPath;
//
// Capture the mapping size and actual structure size for the :Info stream.
//
ULARGE_INTEGER InfoMappingSizeInBytes;
ULARGE_INTEGER InfoActualStructureSizeInBytes;
//
// If a table is loaded successfully, an array will be allocated for storing
// values (as part of the Insert()/Lookup() API), the base address for which
// is captured by the next field.
//
union {
PVOID ValuesBaseAddress;
PULONG Values;
};
//
// During creation, a large bitmap is created to cover the entire range of
// possible ULONG keys. This is used to ensure no duplicate keys appear in
// the input key set, and also assists in debugging.
//
RTL_BITMAP KeysBitmap;
} PERFECT_HASH_TABLE;
typedef PERFECT_HASH_TABLE *PPERFECT_HASH_TABLE;
CreatePerfectHashTable()
The creation routine is the first point where we dispatch to algorithm-specific
implementation routines.
Full
BOOLEAN
CreatePerfectHashTable(
PRTL Rtl,
PALLOCATOR Allocator,
PPERFECT_HASH_TABLE_CONTEXT Context,
PERFECT_HASH_TABLE_ALGORITHM_ID AlgorithmId,
PERFECT_HASH_TABLE_MASK_FUNCTION_ID MaskFunctionId,
PERFECT_HASH_TABLE_HASH_FUNCTION_ID HashFunctionId,
PULARGE_INTEGER NumberOfTableElementsPointer,
PPERFECT_HASH_TABLE_KEYS Keys,
PCUNICODE_STRING HashTablePath
)
/*++
Routine Description:
Creates and initializes a PERFECT_HASH_TABLE structure from a given set
of keys, using the requested algorithm.
Arguments:
Rtl - Supplies a pointer to an initialized RTL structure.
Allocator - Supplies a pointer to an initialized ALLOCATOR structure that
will be used for all memory allocations.
Context - Supplies a pointer to an initialized PERFECT_HASH_TABLE_CONTEXT
structure that can be used by the underlying algorithm in order to
search for perfect hash solutions in parallel.
AlgorithmId - Supplies the algorithm to use.
MaskFunctionId - Supplies the type of masking to use. The algorithm and hash
function must both support the requested masking type.
HashFunctionId - Supplies the hash function to use.
NumberOfTableElementsPointer - Optionally supplies a pointer to a
ULARGE_INTEGER structure that, if non-zero, indicates the number of
elements to assume when sizing the hash table. If a non-NULL pointer
is supplied, it will receive the final number of elements in the table
if a solution could be found.
N.B. If this value is non-zero, it must be equal to or greater than
the number of keys indicated by the Keys parameter. (It should be
at least 2.09 times the number of keys; the higher the value, the
larger the hash table, the faster a perfect hash solution will be
found.)
Keys - Supplies a pointer to a PERFECT_HASH_TABLE_KEYS structure obtained
from LoadPerfectHashTableKeys().
HashTablePath - Optionally supplies a pointer to a UNICODE_STRING structure
that represents the fully-qualified, NULL-terminated path of the backing
file used to save the hash table. If NULL, the file name of the keys
file will be used, with ".pht1" appended to it.
Return Value:
TRUE on success, FALSE on failure.
If TRUE, the table will be persisted at the path described by the
HashTablePath parameter above. It can subsequently be interacted with
once loaded via LoadPerfectHashTable().
--*/
{
BOOLEAN Success;
BOOLEAN UsingKeysPath;
PWSTR Dest;
PWSTR Source;
PBYTE Buffer;
PWSTR PathBuffer;
ULONG Key;
ULONG Index;
ULONG Result;
ULONGLONG Bit;
ULONG ShareMode;
ULONG LastError;
ULONG NumberOfKeys;
ULONG DesiredAccess;
ULONG NumberOfSetBits;
ULONG InfoMappingSize;
ULONG FlagsAndAttributes;
PLONGLONG BitmapBuffer;
USHORT VtblExSize;
PVOID BaseAddress;
HANDLE FileHandle;
HANDLE MappingHandle;
HANDLE ProcessHandle;
SYSTEM_INFO SystemInfo;
ULARGE_INTEGER AllocSize;
ULONG_INTEGER PathBufferSize;
ULONGLONG KeysBitmapBufferSize;
ULONG IncomingPathBufferSizeInBytes;
ULONG_INTEGER AlignedPathBufferSize;
ULONG_INTEGER InfoStreamPathBufferSize;
ULONG_INTEGER AlignedInfoStreamPathBufferSize;
ULARGE_INTEGER NumberOfTableElements;
PPERFECT_HASH_TABLE_VTBL_EX Vtbl;
PPERFECT_HASH_TABLE Table = NULL;
UNICODE_STRING Suffix = RTL_CONSTANT_STRING(L".pht1");
UNICODE_STRING InfoStreamSuffix = RTL_CONSTANT_STRING(L":Info");
//
// Validate arguments.
//
if (!ARGUMENT_PRESENT(Rtl)) {
return FALSE;
}
if (!ARGUMENT_PRESENT(Allocator)) {
return FALSE;
}
if (!ARGUMENT_PRESENT(Context)) {
return FALSE;
} else if (Context->Table) {
//
// We don't support a context being used more than once at the moment.
// If it has been used before, Context->Table will have a value, and
// thus, we need to error out.
//
return FALSE;
}
if (!ARGUMENT_PRESENT(Keys)) {
return FALSE;
} else {
//
// Ensure the number of keys is within our maximum tested limit.
//
if (Keys->NumberOfElements.QuadPart > MAXIMUM_NUMBER_OF_KEYS) {
return FALSE;
}
}
if (!IsValidPerfectHashTableMaskFunctionId(MaskFunctionId)) {
return FALSE;
}
if (ARGUMENT_PRESENT(NumberOfTableElementsPointer)) {
//
// If the number of table elements is non-zero, verify it is greater
// than or equal to the number of keys present.
//
NumberOfTableElements.QuadPart = (
NumberOfTableElementsPointer->QuadPart
);
if (NumberOfTableElements.QuadPart > 0) {
if (NumberOfTableElements.QuadPart <
Keys->NumberOfElements.QuadPart) {
//
// Requested table size is too small, abort.
//
return FALSE;
}
}
} else {
NumberOfTableElements.QuadPart = 0;
}
if (ARGUMENT_PRESENT(HashTablePath) &&
!IsValidMinimumDirectoryNullTerminatedUnicodeString(HashTablePath)) {
return FALSE;
}
if (!IsValidPerfectHashTableAlgorithmId(AlgorithmId)) {
return FALSE;
}
if (!IsValidPerfectHashTableHashFunctionId(HashFunctionId)) {
return FALSE;
}
//
// Calculate the allocation size required for the structure, including the
// memory required to take a copy of the backing file name path.
//
AllocSize.QuadPart = sizeof(*Table);
if (ARGUMENT_PRESENT(HashTablePath)) {
//
// Use the length of the caller-provided path, plus a trailing NULL.
//
PathBufferSize.LongPart = (
HashTablePath->Length +
sizeof(HashTablePath->Buffer[0])
);
UsingKeysPath = FALSE;
IncomingPathBufferSizeInBytes = HashTablePath->Length;
PathBuffer = HashTablePath->Buffer;
} else {
//
// No path has been provided by the caller, so we'll use the path of
// the keys file with ".pht1" appended. Perform a quick invariant check
// first: maximum length should be 1 character (2 bytes) larger than
// length. (This is handled in LoadPerfectHashTableKeys().)
//
ASSERT(Keys->Path.Length + sizeof(Keys->Path.Buffer[0]) ==
Keys->Path.MaximumLength);
PathBufferSize.LongPart = (Keys->Path.MaximumLength + Suffix.Length);
UsingKeysPath = TRUE;
IncomingPathBufferSizeInBytes = Keys->Path.Length;
PathBuffer = Keys->Path.Buffer;
}
//
// Align the path buffer up to a 16 byte boundary.
//
AlignedPathBufferSize.LongPart = ALIGN_UP(PathBufferSize.LongPart, 16);
//
// Sanity check we haven't overflowed MAX_USHORT for the path buffer size.
//
ASSERT(!AlignedPathBufferSize.HighPart);
//
// Add the path buffer size to the structure allocation size, then check
// we haven't overflowed MAX_ULONG.
//
AllocSize.QuadPart += AlignedPathBufferSize.LowPart;
ASSERT(!AllocSize.HighPart);
//
// Calculate the size required by the :Info stream that will be created
// for the on-disk metadata. We derive this by adding the length of the
// path to the length of the :Info suffix, plus an additional trailing NULL.
//
InfoStreamPathBufferSize.LongPart = (
PathBufferSize.LowPart +
InfoStreamSuffix.Length +
sizeof(InfoStreamSuffix.Buffer[0])
);
//
// Align the size up to a 16 byte boundary.
//
AlignedInfoStreamPathBufferSize.LongPart = (
ALIGN_UP(
InfoStreamPathBufferSize.LowPart,
16
)
);
//
// Sanity check we haven't overflowed MAX_USHORT.
//
ASSERT(!AlignedInfoStreamPathBufferSize.HighPart);
//
// Add the stream path size to the total size and perform a final overflow
// check.
//
AllocSize.QuadPart += AlignedInfoStreamPathBufferSize.LowPart;
ASSERT(!AllocSize.HighPart);
//
// Account for the vtbl interface size.
//
VtblExSize = GetVtblExSizeRoutines[AlgorithmId]();
AllocSize.QuadPart += VtblExSize;
ASSERT(!AllocSize.HighPart);
//
// Allocate space for the structure and backing paths.
//
Table = (PPERFECT_HASH_TABLE)(
Allocator->Calloc(Allocator->Context,
1,
AllocSize.LowPart)
);
if (!Table) {
return FALSE;
}
//
// Allocation was successful, continue with initialization.
//
Table->SizeOfStruct = sizeof(*Table);
Table->Rtl = Rtl;
Table->Allocator = Allocator;
Table->Flags.AsULong = 0;
Table->Keys = Keys;
Table->Context = Context;
Context->Table = Table;
//
// Our main enumeration IDs get replicated in both structures.
//
Table->AlgorithmId = Context->AlgorithmId = AlgorithmId;
Table->MaskFunctionId = Context->MaskFunctionId = MaskFunctionId;
Table->HashFunctionId = Context->HashFunctionId = HashFunctionId;
//
// Carve out the backing memory structures for the unicode string buffers
// for the path names.
//
Buffer = RtlOffsetToPointer(Table, sizeof(*Table));
Table->Path.Buffer = (PWSTR)Buffer;
CopyMemory(Table->Path.Buffer, PathBuffer, IncomingPathBufferSizeInBytes);
if (!UsingKeysPath) {
//
// Inherit the lengths provided by the input parameter string.
//
Table->Path.Length = HashTablePath->Length;
Table->Path.MaximumLength = HashTablePath->MaximumLength;
} else {
//
// Replace the ".keys" suffix with ".pht1".
//
Dest = Table->Path.Buffer;
Dest += (Keys->Path.MaximumLength >> 1) - 5ULL;
ASSERT(*Dest == L'k');
ASSERT(*(Dest - 1) == L'.');
Source = Suffix.Buffer;
while (*Source) {
*Dest++ = *Source++;
}
*Dest = L'\0';
//
// We can use the Keys->Path lengths directly.
//
Table->Path.Length = Keys->Path.Length;
Table->Path.MaximumLength = Keys->Path.MaximumLength;
}
//
// Advance past the aligned path buffer size such that we're positioned at
// the start of the info stream buffer.
//
Buffer += AlignedPathBufferSize.LongPart;
Table->InfoStreamPath.Buffer = (PWSTR)Buffer;
Table->InfoStreamPath.MaximumLength = InfoStreamPathBufferSize.LowPart;
Table->InfoStreamPath.Length = (
Table->InfoStreamPath.MaximumLength -
sizeof(Table->InfoStreamPath.Buffer[0])
);
//
// Advance the buffer to the vtbl interface area and initialize it.
//
Buffer += AlignedInfoStreamPathBufferSize.LongPart;
Vtbl = (PPERFECT_HASH_TABLE_VTBL_EX)Buffer;
InitializeExtendedVtbl(Table, Vtbl);
//
// Copy the full path into the info stream buffer.
//
CopyMemory(Table->InfoStreamPath.Buffer,
Table->Path.Buffer,
Table->Path.Length);
Dest = Table->InfoStreamPath.Buffer;
Dest += (Table->Path.Length >> 1);
ASSERT(*Dest == L'\0');
//
// Copy the :Info suffix over.
//
Source = InfoStreamSuffix.Buffer;
while (*Source) {
*Dest++ = *Source++;
}
*Dest = L'\0';
//
// We've finished initializing our two unicode string buffers for the
// backing file and its :Info counterpart. Now, let's open file handles
// to them.
//
//
// Open the file handle for the backing hash table store.
//
ShareMode = (
FILE_SHARE_READ |
FILE_SHARE_WRITE |
FILE_SHARE_DELETE
);
DesiredAccess = (
GENERIC_READ |
GENERIC_WRITE
);
FlagsAndAttributes = FILE_FLAG_OVERLAPPED;
FileHandle = Rtl->CreateFileW(Table->Path.Buffer,
DesiredAccess,
ShareMode,
NULL,
OPEN_ALWAYS,
FlagsAndAttributes,
NULL);
LastError = GetLastError();
Table->FileHandle = FileHandle;
if (!FileHandle || FileHandle == INVALID_HANDLE_VALUE) {
//
// Failed to open the file successfully.
//
goto Error;
} else if (LastError == ERROR_ALREADY_EXISTS) {
//
// The file was opened successfully, but it already existed. Clear the
// local last error variable then truncate the file.
//
LastError = ERROR_SUCCESS;
Result = SetFilePointer(FileHandle, 0, NULL, FILE_BEGIN);
if (Result == INVALID_SET_FILE_POINTER) {
LastError = GetLastError();
goto Error;
}
Success = SetEndOfFile(FileHandle);
if (!Success) {
LastError = GetLastError();
goto Error;
}
//
// We've successfully truncated the file. The creation routine
// implementation can now allocate the space required for it as part
// of successful graph solving.
//
}
//
// The :Info stream is slightly different. As it's a fixed size metadata
// record, we can memory map an entire section up-front prior to calling
// the algorithm implementation. So, do that now.
//
FileHandle = Rtl->CreateFileW(Table->InfoStreamPath.Buffer,
DesiredAccess,
ShareMode,
NULL,
OPEN_ALWAYS,
FlagsAndAttributes,
NULL);
Table->InfoStreamFileHandle = FileHandle;
LastError = GetLastError();
if (!FileHandle || FileHandle == INVALID_HANDLE_VALUE) {
//
// Failed to open the file successfully.
//
goto Error;
} else if (LastError == ERROR_ALREADY_EXISTS) {
//
// The file was opened successfully, but it already existed. Clear the
// local last error variable then truncate the file.
//
LastError = ERROR_SUCCESS;
Result = SetFilePointer(FileHandle, 0, NULL, FILE_BEGIN);
if (Result == INVALID_SET_FILE_POINTER) {
LastError = GetLastError();
goto Error;
}
Success = SetEndOfFile(FileHandle);
if (!Success) {
LastError = GetLastError();
goto Error;
}
//
// We've successfully truncated the :Info file.
//
}
//
// Get the system allocation granularity, as we use this to govern the size
// we request of the underlying file mapping.
//
GetSystemInfo(&SystemInfo);
InfoMappingSize = SystemInfo.dwAllocationGranularity;
ASSERT(InfoMappingSize >= PAGE_SIZE);
//
// Create a file mapping for the :Info stream.
//
MappingHandle = CreateFileMappingW(FileHandle,
NULL,
PAGE_READWRITE,
0,
InfoMappingSize,
NULL);
Table->InfoStreamMappingHandle = MappingHandle;
Table->InfoMappingSizeInBytes.QuadPart = InfoMappingSize;
if (!MappingHandle || MappingHandle == INVALID_HANDLE_VALUE) {
goto Error;
}
//
// We successfully created a file mapping. Proceed with mapping it into
// memory.
//
BaseAddress = MapViewOfFile(MappingHandle,
FILE_MAP_READ | FILE_MAP_WRITE,
0,
0,
InfoMappingSize);
Table->InfoStreamBaseAddress = BaseAddress;
if (!BaseAddress) {
goto Error;
}
//
// Set the number of table elements requested by the user (0 is a valid
// value here).
//
Table->RequestedNumberOfTableElements.QuadPart = (
NumberOfTableElements.QuadPart
);
//
// Allocate a 512MB buffer for the keys bitmap.
//
KeysBitmapBufferSize = ((1ULL << 32ULL) >> 3ULL);
//
// Try a large page allocation for the bitmap buffer.
//
ProcessHandle = GetCurrentProcess();
BaseAddress = Rtl->TryLargePageVirtualAllocEx(ProcessHandle,
NULL,
KeysBitmapBufferSize,
MEM_COMMIT,
PAGE_READWRITE);
Table->KeysBitmap.Buffer = (PULONG)BaseAddress;
if (!BaseAddress) {
//
// Failed to create a bitmap buffer, abort.
//
LastError = GetLastError();
goto Error;
}
//
// Initialize the keys bitmap.
//
Table->KeysBitmap.SizeOfBitMap = (ULONG)-1;
BitmapBuffer = (PLONGLONG)Table->KeysBitmap.Buffer;
ASSERT(!Keys->NumberOfElements.HighPart);
NumberOfKeys = Keys->NumberOfElements.LowPart;
//
// Loop through all the keys, obtain the bitmap bit representation, verify
// that the bit hasn't been set yet, and set it.
//
for (Index = 0; Index < NumberOfKeys; Index++) {
Key = Keys->Keys[Index];
Bit = Key + 1;
ASSERT(!BitTestAndSet64(BitmapBuffer, Bit));
}
//
// Count all bits set. It should match the number of keys.
//
NumberOfSetBits = Rtl->RtlNumberOfSetBits(&Table->KeysBitmap);
ASSERT(NumberOfSetBits == NumberOfKeys);
//
// Common initialization is complete, dispatch remaining work to the
// algorithm's creation routine.
//
Success = CreationRoutines[AlgorithmId](Table);
if (!Success) {
goto Error;
}
//
// Update the caller's number of table elements pointer, if applicable.
//
if (ARGUMENT_PRESENT(NumberOfTableElementsPointer)) {
NumberOfTableElementsPointer->QuadPart = Table->HashSize;
}
//
// We're done! Set the reference count to 1 and finish up.
//
Table->ReferenceCount = 1;
goto End;
Error:
Success = FALSE;
//
// Intentional follow-on to End.
//
End:
//
// Call the destroy routine on the table if one is present.
//
// N.B. We currently always delete the table if it is created successfully
// so as to ensure the only way to use a table is by loading one from
// disk via LoadPerfectHashTable().
//
if (Table) {
if (!DestroyPerfectHashTable(&Table, NULL)) {
//
// There's nothing we can do here.
//
NOTHING;
}
//
// N.B. DestroyPerfectHashTable() should clear the Table pointer.
// Assert that invariant now.
//
ASSERT(Table == NULL);
}
return Success;
}
Algorithm-Specific Implementation Details
This section covers details specific to our implementation of the CHM
algorithm.
CreatePerfectHashTableImplChm01
This is the implementation of our CHM algorithm's creation routine. It is
responsible for: determining the appropriate memory requirements for a given
graph used to try to solve the perfect hash; allocating and initializing the
appropriate data structures; submitting threadpool work for file preparation,
parallel graph solving, and file saving (once a solution has been found); and
verifying that the found solution is actually correct. Additionally, it supports
detecting table resize events, which will be signaled by a worker
when it detects that the number of attempts at solving a graph has exceeded a
threshold. This results in the routine doubling the number of vertices used for
the underlying graph and re-attempting to solve.
It makes extensive use of threadpools. It can be viewed as the main "parent"
coordinator thread, dispatching relevant work to the file-work and main-work
threadpools as required.
_Use_decl_annotations_
BOOLEAN
CreatePerfectHashTableImplChm01(
PPERFECT_HASH_TABLE Table
)
/*++
Routine Description:
Attempts to create a perfect hash table using the CHM algorithm and a
2-part random hypergraph.
Arguments:
Table - Supplies a pointer to a partially-initialized PERFECT_HASH_TABLE
structure.
Return Value:
TRUE on success, FALSE on failure.
--*/
{
PRTL Rtl;
USHORT Index;
PULONG Keys;
PGRAPH Graph;
PBYTE Buffer;
BOOLEAN Success;
USHORT PageSize;
USHORT PageShift;
ULONG_PTR LastPage;
ULONG_PTR ThisPage;
BYTE NumberOfEvents;
PVOID BaseAddress = NULL;
ULONG WaitResult;
GRAPH_INFO Info;
PBYTE Unusable;
ULONG LastError;
ULONG NumberOfKeys;
BOOLEAN CaughtException;
PALLOCATOR Allocator;
HANDLE ProcessHandle;
PHANDLE Event;
USHORT NumberOfGraphs;
USHORT NumberOfPagesPerGraph;
USHORT NumberOfGuardPages;
ULONG TotalNumberOfPages;
USHORT NumberOfBitmaps;
PGRAPH_DIMENSIONS Dim;
PSLIST_ENTRY ListEntry;
SYSTEM_INFO SystemInfo;
FILE_WORK_ITEM SaveFile;
FILE_WORK_ITEM PrepareFile;
PGRAPH_INFO_ON_DISK OnDiskInfo;
PTABLE_INFO_ON_DISK_HEADER OnDiskHeader;
ULONGLONG Closest;
ULONGLONG LastClosest;
ULONGLONG NextSizeInBytes;
ULONGLONG PrevSizeInBytes;
ULONGLONG FirstSizeInBytes;
ULONGLONG EdgesSizeInBytes;
ULONGLONG ValuesSizeInBytes;
ULONGLONG AssignedSizeInBytes;
ULONGLONG TotalBufferSizeInBytes;
ULONGLONG UsableBufferSizeInBytesPerBuffer;
ULONGLONG ExpectedTotalBufferSizeInBytes;
ULONGLONG ExpectedUsableBufferSizeInBytesPerBuffer;
ULONGLONG GraphSizeInBytesIncludingGuardPage;
PERFECT_HASH_TABLE_MASK_FUNCTION_ID MaskFunctionId;
ULARGE_INTEGER AllocSize;
ULARGE_INTEGER NumberOfEdges;
ULARGE_INTEGER NumberOfVertices;
ULARGE_INTEGER TotalNumberOfEdges;
ULARGE_INTEGER DeletedEdgesBitmapBufferSizeInBytes;
ULARGE_INTEGER VisitedVerticesBitmapBufferSizeInBytes;
ULARGE_INTEGER AssignedBitmapBufferSizeInBytes;
ULARGE_INTEGER IndexBitmapBufferSizeInBytes;
PPERFECT_HASH_TABLE_CONTEXT Context = Table->Context;
HANDLE Events[5];
//
// Validate arguments.
//
if (!ARGUMENT_PRESENT(Table)) {
return FALSE;
}
//
// The following label is jumped to by code later in this routine when we
// detect that we've exceeded a plausible number of attempts at finding a
// graph solution with the given number of vertices, and have bumped up
// the vertex count (by adjusting Table->RequestedNumberOfTableElements) and
// want to try again.
//
RetryWithLargerTableSize:
//
// Initialize aliases.
//
Rtl = Table->Rtl;
Keys = (PULONG)Table->Keys->BaseAddress;
Allocator = Table->Allocator;
Context = Table->Context;
MaskFunctionId = Context->MaskFunctionId;
//
// If no threshold has been set, use the default.
//
if (!Context->ResizeTableThreshold) {
Context->ResizeTableThreshold = GRAPH_SOLVING_ATTEMPTS_THRESHOLD;
Context->ResizeLimit = GRAPH_SOLVING_RESIZE_TABLE_LIMIT;
}
//
// Explicitly reset all events. This ensures everything is back in the
// starting state if we happen to be attempting to solve the graph after
// a resize event.
//
Event = (PHANDLE)&Context->FirstEvent;
NumberOfEvents = GetNumberOfContextEvents(Context);
for (Index = 0; Index < NumberOfEvents; Index++, Event++) {
ASSERT(*Event && *Event != INVALID_HANDLE_VALUE);
ASSERT(ResetEvent(*Event));
}
//
// The number of edges in our graph is equal to the number of keys in the
// input data set if modulus masking is in use. It will be rounded up to
// a power of 2 otherwise.
//
NumberOfEdges.QuadPart = Table->Keys->NumberOfElements.QuadPart;
//
// Sanity check we're under MAX_ULONG.
//
ASSERT(!NumberOfEdges.HighPart);
NumberOfKeys = NumberOfEdges.LowPart;
//
// Determine the number of vertices. If a caller has requested a certain
// table size, Table->RequestedNumberOfTableElements will be non-zero, and
// takes precedence.
//
if (Table->RequestedNumberOfTableElements.QuadPart) {
NumberOfVertices.QuadPart = (
Table->RequestedNumberOfTableElements.QuadPart
);
if (IsModulusMasking(MaskFunctionId)) {
//
// Nothing more to do with modulus masking; we'll verify the number
// of vertices below.
//
NOTHING;
} else {
//
// For non-modulus masking, make sure the number of vertices are
// rounded up to a power of 2. The number of edges will be rounded
// up to a power of 2 from the number of keys.
//
NumberOfVertices.QuadPart = (
RoundUpPowerOf2(NumberOfVertices.LowPart)
);
NumberOfEdges.QuadPart = (
RoundUpPowerOf2(NumberOfEdges.LowPart)
);
}
} else {
//
// No table size was requested, so we need to determine how many
// vertices to use heuristically. The main factor is what type of
// masking has been requested. The chm.c implementation, which is
// modulus based, uses a size multiplier (c) of 2.09, and calculates
// the final size via ceil(nedges * (double)2.09). We can avoid the
// need for doubles and linking with a math library (to get ceil())
// and just use ~2.25, which we can calculate by adding the result
// of left shifting the number of edges by 1 (doubling it) to the result
// of right shifting said edge count by 2 (simulating multiplication by
// 0.25).
//
// If we're dealing with modulus masking, this will be the exact number
// of vertices used. For other types of masking, we need the edges size
// to be a power of 2, and the vertices size to be the next power of 2.
//
if (IsModulusMasking(MaskFunctionId)) {
NumberOfVertices.QuadPart = NumberOfEdges.QuadPart << 1ULL;
NumberOfVertices.QuadPart += NumberOfEdges.QuadPart >> 2ULL;
} else {
//
// Round up the edges to a power of 2.
//
NumberOfEdges.QuadPart = RoundUpPowerOf2(NumberOfEdges.LowPart);
//
// Make sure we haven't overflowed.
//
ASSERT(!NumberOfEdges.HighPart);
//
// For the number of vertices, round the number of edges up to the
// next power of 2.
//
NumberOfVertices.QuadPart = (
RoundUpNextPowerOf2(NumberOfEdges.LowPart)
);
}
}
//
// Another sanity check we haven't exceeded MAX_ULONG.
//
ASSERT(!NumberOfVertices.HighPart);
//
// The r-graph (r = 2) nature of this implementation results in various
// arrays having twice the number of elements indicated by the edge count.
// Capture this number now, as we need it in various size calculations.
//
TotalNumberOfEdges.QuadPart = NumberOfEdges.QuadPart;
TotalNumberOfEdges.QuadPart <<= 1ULL;
//
// Another overflow sanity check.
//
ASSERT(!TotalNumberOfEdges.HighPart);
//
// Make sure vertices > edges.
//
ASSERT(NumberOfVertices.QuadPart > NumberOfEdges.QuadPart);
//
// Calculate the size required for the DeletedEdges bitmap buffer. One
// bit is used per TotalNumberOfEdges. Convert the bits into bytes by
// shifting right 3 (dividing by 8) then align it up to a 16 byte boundary.
// We add 1 before shifting to account for 1-based bitmaps vs 0-based indices.
//
DeletedEdgesBitmapBufferSizeInBytes.QuadPart = (
ALIGN_UP(((ALIGN_UP(TotalNumberOfEdges.QuadPart + 1, 8)) >> 3), 16)
);
ASSERT(!DeletedEdgesBitmapBufferSizeInBytes.HighPart);
//
// Calculate the size required for the VisitedVertices bitmap buffer. One
// bit is used per NumberOfVertices. Convert the bits into bytes by
// shifting right 3 (dividing by 8) then align it up to a 16 byte boundary.
// We add 1 before shifting to account for 1-based bitmaps vs 0-based indices.
//
VisitedVerticesBitmapBufferSizeInBytes.QuadPart = (
ALIGN_UP(((ALIGN_UP(NumberOfVertices.QuadPart + 1, 8)) >> 3), 16)
);
ASSERT(!VisitedVerticesBitmapBufferSizeInBytes.HighPart);
//
// Calculate the size required for the AssignedBitmap bitmap buffer. One
// bit is used per NumberOfVertices. Convert the bits into bytes by shifting
// right 3 (dividing by 8) then align it up to a 16 byte boundary.
// We add 1 before shifting to account for 1-based bitmaps vs 0-based indices.
//
AssignedBitmapBufferSizeInBytes.QuadPart = (
ALIGN_UP(((ALIGN_UP(NumberOfVertices.QuadPart + 1, 8)) >> 3), 16)
);
ASSERT(!AssignedBitmapBufferSizeInBytes.HighPart);
//
// Calculate the size required for the IndexBitmap bitmap buffer. One
// bit is used per NumberOfVertices. Convert the bits into bytes by shifting
// right 3 (dividing by 8) then align it up to a 16 byte boundary.
// We add 1 before shifting to account for 1-based bitmaps vs 0-based indices.
//
IndexBitmapBufferSizeInBytes.QuadPart = (
ALIGN_UP(((ALIGN_UP(NumberOfVertices.QuadPart + 1, 8)) >> 3), 16)
);
ASSERT(!IndexBitmapBufferSizeInBytes.HighPart);
//
// Calculate the sizes required for each of the arrays. We collect them
// into independent variables as it makes carving up the allocated buffer
// easier down the track.
//
EdgesSizeInBytes = (
ALIGN_UP_POINTER(sizeof(*Graph->Edges) * TotalNumberOfEdges.QuadPart)
);
NextSizeInBytes = (
ALIGN_UP_POINTER(sizeof(*Graph->Next) * TotalNumberOfEdges.QuadPart)
);
FirstSizeInBytes = (
ALIGN_UP_POINTER(sizeof(*Graph->First) * NumberOfVertices.QuadPart)
);
PrevSizeInBytes = (
ALIGN_UP_POINTER(sizeof(*Graph->Prev) * TotalNumberOfEdges.QuadPart)
);
AssignedSizeInBytes = (
ALIGN_UP_POINTER(sizeof(*Graph->Assigned) * NumberOfVertices.QuadPart)
);
//
// Calculate the size required for the values array. This is used as part
// of verification, where we essentially do Insert(Key, Key) in combination
// with bitmap tracking of assigned indices, which allows us to detect if
// there are any colliding indices, and if so, what was the previous key
// that mapped to the same index.
//
ValuesSizeInBytes = (
ALIGN_UP_POINTER(sizeof(*Graph->Values) * NumberOfVertices.QuadPart)
);
//
// Calculate the total size required for the underlying graph, such that
// we can allocate memory via a single call to the allocator.
//
AllocSize.QuadPart = ALIGN_UP_POINTER(
//
// Account for the size of the graph structure.
//
sizeof(GRAPH) +
//
// Account for the size of the Graph->Edges array, which is double
// sized.
//
EdgesSizeInBytes +
//
// Account for the size of the Graph->Next array; also double sized.
//
NextSizeInBytes +
//
// Account for the size of the Graph->First array. This is sized
// proportional to the number of vertices.
//
FirstSizeInBytes +
//
// Account for the size of the Graph->Prev array, also double sized.
//
PrevSizeInBytes +
//
// Account for Graph->Assigned array of vertices.
//
AssignedSizeInBytes +
//
// Account for the Table->Values array of values for the perfect hash
// table, indexed via the result of the table's Index() method.
//
ValuesSizeInBytes +
//
// Account for the size of the bitmap buffer for Graph->DeletedEdges.
//
DeletedEdgesBitmapBufferSizeInBytes.QuadPart +
//
// Account for the size of the bitmap buffer for Graph->VisitedVertices.
//
VisitedVerticesBitmapBufferSizeInBytes.QuadPart +
//
// Account for the size of the bitmap buffer for Graph->AssignedBitmap.
//
AssignedBitmapBufferSizeInBytes.QuadPart +
//
// Account for the size of the bitmap buffer for Graph->IndexBitmap.
//
IndexBitmapBufferSizeInBytes.QuadPart +
//
// Keep a dummy 0 at the end such that the last item above can use an
// addition sign at the end of it, which minimizes the diff churn when
// adding a new size element.
//
0
);
//
// Capture the number of bitmaps here, where it's close to the lines above
// that indicate how many bitmaps we're dealing with. The number of bitmaps
// accounted for above should match this number. Visually confirm this any
// time a new bitmap buffer is accounted for.
//
// N.B. We ASSERT() in InitializeGraph() if we detect a mismatch between
// Info->NumberOfBitmaps and a local counter incremented each time
// we initialize a bitmap.
//
NumberOfBitmaps = 4;
//
// Sanity check the size hasn't overflowed.
//
ASSERT(!AllocSize.HighPart);
//
// Calculate the number of pages required by each graph, then extrapolate
// the number of guard pages and total number of pages. We currently use
// 4KB for the page size (i.e. we're not using large pages).
//
PageSize = PAGE_SIZE;
PageShift = (USHORT)TrailingZeros(PageSize);
NumberOfGraphs = (USHORT)Context->MaximumConcurrency;
NumberOfPagesPerGraph = ROUND_TO_PAGES(AllocSize.LowPart) >> PageShift;
NumberOfGuardPages = (USHORT)Context->MaximumConcurrency;
TotalNumberOfPages = (
(NumberOfGraphs * NumberOfPagesPerGraph) +
NumberOfGuardPages
);
GraphSizeInBytesIncludingGuardPage = (
(ULONGLONG)PageSize +
((ULONGLONG)NumberOfPagesPerGraph * (ULONGLONG)PageSize)
);
//
// Create multiple buffers separated by guard pages using a single call
// to VirtualAllocEx().
//
ProcessHandle = NULL;
Success = Rtl->CreateMultipleBuffers(Rtl,
&ProcessHandle,
PageSize,
NumberOfGraphs,
NumberOfPagesPerGraph,
NULL,
NULL,
&UsableBufferSizeInBytesPerBuffer,
&TotalBufferSizeInBytes,
&BaseAddress);
if (!Success) {
LastError = GetLastError();
return FALSE;
}
//
// N.B. Subsequent errors must 'goto Error' at this point to ensure our
// cleanup logic kicks in.
//
//
// Assert the sizes returned by the buffer allocation match what we're
// expecting.
//
ExpectedTotalBufferSizeInBytes = (
(ULONGLONG)TotalNumberOfPages *
(ULONGLONG)PageSize
);
ExpectedUsableBufferSizeInBytesPerBuffer = (
(ULONGLONG)NumberOfPagesPerGraph *
(ULONGLONG)PageSize
);
ASSERT(TotalBufferSizeInBytes == ExpectedTotalBufferSizeInBytes);
ASSERT(UsableBufferSizeInBytesPerBuffer ==
ExpectedUsableBufferSizeInBytesPerBuffer);
//
// Initialize the GRAPH_INFO structure with all the sizes captured earlier.
// (We zero it first just to ensure any of the padding fields are cleared.)
//
ZeroStruct(Info);
Info.PageSize = PageSize;
Info.AllocSize = AllocSize.QuadPart;
Info.Context = Context;
Info.BaseAddress = BaseAddress;
Info.NumberOfPagesPerGraph = NumberOfPagesPerGraph;
Info.NumberOfGraphs = NumberOfGraphs;
Info.NumberOfBitmaps = NumberOfBitmaps;
Info.SizeOfGraphStruct = sizeof(GRAPH);
Info.EdgesSizeInBytes = EdgesSizeInBytes;
Info.NextSizeInBytes = NextSizeInBytes;
Info.FirstSizeInBytes = FirstSizeInBytes;
Info.PrevSizeInBytes = PrevSizeInBytes;
Info.AssignedSizeInBytes = AssignedSizeInBytes;
Info.ValuesSizeInBytes = ValuesSizeInBytes;
Info.AllocSize = AllocSize.QuadPart;
Info.FinalSize = UsableBufferSizeInBytesPerBuffer;
Info.DeletedEdgesBitmapBufferSizeInBytes = (
DeletedEdgesBitmapBufferSizeInBytes.QuadPart
);
Info.VisitedVerticesBitmapBufferSizeInBytes = (
VisitedVerticesBitmapBufferSizeInBytes.QuadPart
);
Info.AssignedBitmapBufferSizeInBytes = (
AssignedBitmapBufferSizeInBytes.QuadPart
);
Info.IndexBitmapBufferSizeInBytes = (
IndexBitmapBufferSizeInBytes.QuadPart
);
//
// Capture the system allocation granularity. This is used to align the
// backing memory maps used for the table array.
//
GetSystemInfo(&SystemInfo);
Info.AllocationGranularity = SystemInfo.dwAllocationGranularity;
//
// Copy the dimensions over.
//
Dim = &Info.Dimensions;
Dim->NumberOfEdges = NumberOfEdges.LowPart;
Dim->TotalNumberOfEdges = TotalNumberOfEdges.LowPart;
Dim->NumberOfVertices = NumberOfVertices.LowPart;
Dim->NumberOfEdgesPowerOf2Exponent = (BYTE)(
TrailingZeros64(RoundUpPowerOf2(NumberOfEdges.LowPart))
);
Dim->NumberOfEdgesNextPowerOf2Exponent = (BYTE)(
TrailingZeros64(RoundUpNextPowerOf2(NumberOfEdges.LowPart))
);
Dim->NumberOfVerticesPowerOf2Exponent = (BYTE)(
TrailingZeros64(RoundUpPowerOf2(NumberOfVertices.LowPart))
);
Dim->NumberOfVerticesNextPowerOf2Exponent = (BYTE)(
TrailingZeros64(RoundUpNextPowerOf2(NumberOfVertices.LowPart))
);
//
// If non-modulus masking is active, initialize the edge and vertex masks.
//
if (!IsModulusMasking(MaskFunctionId)) {
Info.EdgeMask = NumberOfEdges.LowPart - 1;
Info.VertexMask = NumberOfVertices.LowPart - 1;
//
// Sanity check our masks are correct: their popcnts should match the
// exponent value identified above whilst filling out the dimensions
// structure.
//
ASSERT(_mm_popcnt_u32(Info.EdgeMask) ==
Dim->NumberOfEdgesPowerOf2Exponent);
ASSERT(_mm_popcnt_u32(Info.VertexMask) ==
Dim->NumberOfVerticesPowerOf2Exponent);
}
//
// Set the Modulus, Size, Shift, Mask and Fold fields of the table, such
// that the Hash and Mask vtbl functions operate correctly.
//
// N.B. Shift, Mask and Fold are meaningless for modulus masking.
//
// N.B. If you change these fields, you'll probably need to change something
// in LoadPerfectHashTableImplChm01() too.
//
Table->HashModulus = NumberOfVertices.LowPart;
Table->IndexModulus = NumberOfEdges.LowPart;
Table->HashSize = NumberOfVertices.LowPart;
Table->IndexSize = NumberOfEdges.LowPart;
Table->HashShift = TrailingZeros(Table->HashSize);
Table->IndexShift = TrailingZeros(Table->IndexSize);
Table->HashMask = (Table->HashSize - 1);
Table->IndexMask = (Table->IndexSize - 1);
Table->HashFold = Table->HashShift >> 3;
Table->IndexFold = Table->IndexShift >> 3;
//
// If auto folding has been requested, set the appropriate function ID based
// on the table fold and update the vtbl.
//
if (MaskFunctionId == PerfectHashTableFoldAutoMaskFunctionId) {
ASSERT(Table->HashFold >= 0 && Table->HashFold <= 4);
switch (Table->HashFold) {
case 4:
case 3:
case 2:
MaskFunctionId = PerfectHashTableFoldOnceMaskFunctionId;
break;
case 1:
MaskFunctionId = PerfectHashTableFoldTwiceMaskFunctionId;
break;
case 0:
MaskFunctionId = PerfectHashTableFoldThriceMaskFunctionId;
break;
}
Table->MaskFunctionId = MaskFunctionId;
Context->MaskFunctionId = MaskFunctionId;
Table->Vtbl->MaskHash = MaskHashRoutines[MaskFunctionId];
Table->Vtbl->MaskIndex = MaskIndexRoutines[MaskFunctionId];
}
//
// Save the on-disk representation of the graph information. This is a
// smaller subset of data needed in order to load a previously-solved
// graph as a perfect hash table. The data resides in an NTFS stream named
// :Info off the main perfect hash table file. It will have been mapped for
// us already at Table->InfoStreamBaseAddress.
//
OnDiskInfo = (PGRAPH_INFO_ON_DISK)Table->InfoStreamBaseAddress;
ASSERT(OnDiskInfo);
OnDiskHeader = &OnDiskInfo->Header;
OnDiskHeader->Magic.LowPart = TABLE_INFO_ON_DISK_MAGIC_LOWPART;
OnDiskHeader->Magic.HighPart = TABLE_INFO_ON_DISK_MAGIC_HIGHPART;
OnDiskHeader->SizeOfStruct = sizeof(*OnDiskInfo);
OnDiskHeader->Flags.AsULong = 0;
OnDiskHeader->Concurrency = Context->MaximumConcurrency;
OnDiskHeader->AlgorithmId = Context->AlgorithmId;
OnDiskHeader->MaskFunctionId = Context->MaskFunctionId;
OnDiskHeader->HashFunctionId = Context->HashFunctionId;
OnDiskHeader->KeySizeInBytes = sizeof(ULONG);
OnDiskHeader->HashSize = Table->HashSize;
OnDiskHeader->IndexSize = Table->IndexSize;
OnDiskHeader->HashShift = Table->HashShift;
OnDiskHeader->IndexShift = Table->IndexShift;
OnDiskHeader->HashMask = Table->HashMask;
OnDiskHeader->IndexMask = Table->IndexMask;
OnDiskHeader->HashFold = Table->HashFold;
OnDiskHeader->IndexFold = Table->IndexFold;
OnDiskHeader->HashModulus = Table->HashModulus;
OnDiskHeader->IndexModulus = Table->IndexModulus;
OnDiskHeader->NumberOfKeys.QuadPart = (
Table->Keys->NumberOfElements.QuadPart
);
OnDiskHeader->NumberOfSeeds = ((
FIELD_OFFSET(GRAPH, LastSeed) -
FIELD_OFFSET(GRAPH, FirstSeed)
) / sizeof(ULONG)) + 1;
//
// This will change based on masking type and whether or not the caller
// has provided a value for NumberOfTableElements. For now, keep it as
// the number of vertices.
//
OnDiskHeader->NumberOfTableElements.QuadPart = (
NumberOfVertices.QuadPart
);
CopyMemory(&OnDiskInfo->Dimensions, Dim, sizeof(*Dim));
//
// Set the context's main work callback to our worker routine, and the algo
// context to our graph info structure.
//
Context->MainWorkCallback = ProcessGraphCallbackChm01;
Context->AlgorithmContext = &Info;
//
// Set the context's file work callback to our worker routine.
//
Context->FileWorkCallback = FileWorkCallbackChm01;
//
// Prepare the initial "file preparation" work callback. This will extend
// the backing file to the appropriate size.
//
ZeroStruct(PrepareFile);
PrepareFile.FileWorkId = FileWorkPrepareId;
InterlockedPushEntrySList(&Context->FileWorkListHead,
&PrepareFile.ListEntry);
CONTEXT_START_TIMERS(PrepareFile);
SubmitThreadpoolWork(Context->FileWork);
//
// Capture initial cycles as reported by __rdtsc() and the performance
// counter. The former is used to report a raw cycle count, the latter
// is used to convert to microseconds reliably (i.e. unaffected by turbo
// boosting).
//
QueryPerformanceFrequency(&Context->Frequency);
CONTEXT_START_TIMERS(Solve);
//
// We're ready to create threadpool work for the graph.
//
Buffer = (PBYTE)BaseAddress;
Unusable = Buffer;
for (Index = 0; Index < NumberOfGraphs; Index++) {
//
// Invariant check: at the top of the loop, Buffer and Unusable should
// point to the same address (which will be the base of the current
// graph being processed). Assert this invariant now.
//
ASSERT(Buffer == Unusable);
//
// Carve out the graph pointer, and bump the unusable pointer past the
// graph's pages, such that it points to the first byte of the guard
// page.
//
Graph = (PGRAPH)Buffer;
Unusable = Buffer + UsableBufferSizeInBytesPerBuffer;
//
// Sanity check the page alignment logic. If we subtract 1 byte from
// Unusable, it should reside on a different page. Additionally, the
// two pages should be separated by exactly one page size.
//
ThisPage = ALIGN_DOWN(Unusable, PageSize);
LastPage = ALIGN_DOWN(Unusable-1, PageSize);
ASSERT(LastPage < ThisPage);
ASSERT((ThisPage - LastPage) == PageSize);
//
// Verify the guard page is working properly by wrapping an attempt to
// write to it in a structured exception handler that will catch the
// access violation trap.
//
// N.B. We only do this if we're not actively being debugged, as the
// traps get dispatched to the debugger engine first as part of
// the "first-pass" handling logic of the kernel.
//
if (!IsDebuggerPresent()) {
CaughtException = FALSE;
TRY_PROBE_MEMORY{
*Unusable = 1;
} CATCH_EXCEPTION_ACCESS_VIOLATION{
CaughtException = TRUE;
}
ASSERT(CaughtException);
}
//
// Guard page is working properly. Push the graph onto the context's
// main work list head and submit the corresponding threadpool work.
//
InterlockedPushEntrySList(&Context->MainWorkListHead,
&Graph->ListEntry);
SubmitThreadpoolWork(Context->MainWork);
//
// Advance the buffer past the graph size and guard page. Copy the
// same address to the Unusable variable as well, such that our top
// of the loop invariants hold true.
//
Buffer += GraphSizeInBytesIncludingGuardPage;
Unusable = Buffer;
//
// If our key set size is small and our maximum concurrency is large,
// we may have already solved the graph, in which case, we can stop
// submitting new solver attempts and just break out of the loop here.
//
if (!ShouldWeContinueTryingToSolveGraph(Context)) {
break;
}
}
//
// Wait on the context's events.
//
Events[0] = Context->SucceededEvent;
Events[1] = Context->CompletedEvent;
Events[2] = Context->ShutdownEvent;
Events[3] = Context->FailedEvent;
Events[4] = Context->TryLargerTableSizeEvent;
WaitResult = WaitForMultipleObjects(ARRAYSIZE(Events),
Events,
FALSE,
INFINITE);
//
// If the wait result indicates the try larger table size event was set,
// deal with that, first.
//
if (WaitResult == WAIT_OBJECT_0+4) {
//
// The number of attempts at solving this graph has exceeded the
// threshold. Set the shutdown event in order to trigger all worker
// threads to abort their current attempts and wait on the main thread
// work to complete.
//
SetEvent(Context->ShutdownEvent);
WaitForThreadpoolWorkCallbacks(Context->MainWork, TRUE);
//
// Perform a blocking wait for the prepare file work to complete. (It
// would be highly unlikely that this event hasn't been set yet.)
//
WaitResult = WaitForSingleObject(Context->PreparedFileEvent, INFINITE);
ASSERT(WaitResult == WAIT_OBJECT_0);
//
// There are no more threadpool callbacks running. However, a thread
// could have finished a solution between the time the try larger table
// size event was set, and this point. So, check the finished count
// first. If it indicates a solution, jump to that handler code.
//
if (Context->FinishedCount > 0) {
goto FinishedSolution;
}
//
// Destroy the existing buffer we allocated for this attempt. We'll
// need a new, larger one to accommodate the resize.
//
ASSERT(Rtl->DestroyBuffer(Rtl, ProcessHandle, &BaseAddress));
//
// Increment the resize counter and update the total number of attempts
// in the header. Then, determine how close we came to solving the
// graph, and store that in the header as well if it's the best so far
// (or no previous version is present).
//
OnDiskHeader->NumberOfTableResizeEvents++;
OnDiskHeader->TotalNumberOfAttemptsWithSmallerTableSizes += (
Context->Attempts
);
Closest = NumberOfEdges.LowPart - Context->HighestDeletedEdgesCount;
LastClosest = (
OnDiskHeader->ClosestWeCameToSolvingGraphWithSmallerTableSizes
);
if (!LastClosest || Closest < LastClosest) {
OnDiskHeader->ClosestWeCameToSolvingGraphWithSmallerTableSizes = (
Closest
);
}
//
// If this is our first resize, capture the initial size we used.
//
if (!OnDiskHeader->InitialTableSize) {
OnDiskHeader->InitialTableSize = NumberOfVertices.QuadPart;
}
//
// Reset the remaining counters.
//
Context->Attempts = 0;
Context->FailedAttempts = 0;
Context->HighestDeletedEdgesCount = 0;
//
// Double the vertex count. If we have overflowed max ULONG, abort.
//
Table->RequestedNumberOfTableElements.QuadPart = (
NumberOfVertices.QuadPart
);
Table->RequestedNumberOfTableElements.QuadPart <<= 1ULL;
if (Table->RequestedNumberOfTableElements.HighPart) {
Success = FALSE;
goto End;
}
//
// Unmap the existing mapping and close the section.
//
ASSERT(UnmapViewOfFile(Table->BaseAddress));
Table->BaseAddress = NULL;
ASSERT(CloseHandle(Table->MappingHandle));
Table->MappingHandle = NULL;
//
// Jump back to the start and try again with a larger vertex count.
//
goto RetryWithLargerTableSize;
}
//
// The wait result did not indicate a resize event. Ignore the wait
// result for now; determine if the graph solving was successful by the
// finished count of the context. We'll corroborate that with whatever
// events have been signaled shortly.
//
Success = (Context->FinishedCount > 0);
if (!Success) {
//
// Invariant check: if no worker thread registered a solved graph (i.e.
// Context->FinishedCount == 0), then verify that the shutdown event was
// set. If our WaitResult above indicates WAIT_OBJECT_0+2, we're done.
// If not, verify explicitly.
//
if (WaitResult != WAIT_OBJECT_0+2) {
//
// Manually test that the shutdown event has been signaled.
//
WaitResult = WaitForSingleObject(Context->ShutdownEvent, 0);
ASSERT(WaitResult == WAIT_OBJECT_0);
}
//
// Clean up the main thread work group members and clear the work field.
// This will block until all the worker threads have returned. We need
// to put this in place prior to jumping to the End: label as that step
// will destroy the buffer we allocated earlier for the parallel graphs,
// which we mustn't do if any threads are still working.
//
Context->MainWork = NULL;
CloseThreadpoolCleanupGroupMembers(Context->MainCleanupGroup,
TRUE,
NULL);
//
// Perform the same operation for the file work threadpool. Note that
// the only work we've dispatched to this pool at this point is the
// initial file preparation work.
//
Context->FileWork = NULL;
CloseThreadpoolCleanupGroupMembers(Context->FileCleanupGroup,
TRUE,
NULL);
goto End;
}
//
// Pop the winning graph off the finished list head.
//
FinishedSolution:
ListEntry = InterlockedPopEntrySList(&Context->FinishedWorkListHead);
ASSERT(ListEntry);
Graph = CONTAINING_RECORD(ListEntry, GRAPH, ListEntry);
//
// Note this graph as the one solved to the context. This is used by the
// save file work callback we dispatch below.
//
Context->SolvedContext = Graph;
//
// Graphs always pass verification in normal circumstances. The only time
// they don't is if there's an internal bug in our code. So, knowing that
// the graph is probably correct, we can dispatch the file work required to
// save it to disk to the file work threadpool whilst we verify it has been
// solved correctly.
//
ZeroStruct(SaveFile);
SaveFile.FileWorkId = FileWorkSaveId;
//
// Before we dispatch the save file work, make sure the preparation has
// completed.
//
WaitResult = WaitForSingleObject(Context->PreparedFileEvent, INFINITE);
if (WaitResult != WAIT_OBJECT_0 || Context->FileWorkErrors > 0) {
__debugbreak();
Success = FALSE;
goto End;
}
//
// Push this work item to the file work list head and submit the threadpool
// work for it.
//
CONTEXT_START_TIMERS(SaveFile);
InterlockedPushEntrySList(&Context->FileWorkListHead, &SaveFile.ListEntry);
SubmitThreadpoolWork(Context->FileWork);
//
// Capture another round of cycles and performance counter values, then
// continue with verification of the solution.
//
CONTEXT_START_TIMERS(Verify);
Success = VerifySolvedGraph(Graph);
CONTEXT_END_TIMERS(Verify);
//
// Set the verified event (regardless of whether or not we succeeded in
// verification). The save file work will be waiting upon it in order to
// write the final timing details to the on-disk header.
//
SetEvent(Context->VerifiedEvent);
ASSERT(Success);
//
// Wait on the saved file event before returning.
//
WaitResult = WaitForSingleObject(Context->SavedFileEvent, INFINITE);
if (WaitResult != WAIT_OBJECT_0 || Context->FileWorkErrors > 0) {
__debugbreak();
Success = FALSE;
}
End:
//
// Destroy the buffer we created earlier.
//
// N.B. Although we used Rtl->CreateMultipleBuffers(), we can still free
// the underlying buffer via Rtl->DestroyBuffer(), as only a single
// VirtualAllocEx() call was dispatched for the entire buffer.
//
if (BaseAddress) {
Rtl->DestroyBuffer(Rtl, ProcessHandle, &BaseAddress);
}
//
// Explicitly reset all events before returning.
//
Event = (PHANDLE)&Context->FirstEvent;
NumberOfEvents = GetNumberOfContextEvents(Context);
for (Index = 0; Index < NumberOfEvents; Index++, Event++) {
ASSERT(*Event && *Event != INVALID_HANDLE_VALUE);
ASSERT(ResetEvent(*Event));
}
return Success;
}
Appendix
Rtl->CreateRandomObjectNames()
This helper routine assists with the creation of event names that have a set
of prefixes we specify (e.g.
PerfectHashTableContext_ShutdownEvent), followed by a configurable
amount of trailing Base64-encoded random data, such that the final name is
unique within the current session with high probability.
_Use_decl_annotations_
BOOL
CreateRandomObjectNames(
PRTL Rtl,
PALLOCATOR TemporaryAllocator,
PALLOCATOR WideBufferAllocator,
USHORT NumberOfNames,
USHORT LengthOfNameInChars,
PUNICODE_STRING NamespacePrefix,
PPUNICODE_STRING NamesArrayPointer,
PPUNICODE_STRING PrefixArrayPointer,
PULONG SizeOfWideBufferInBytes,
PPWSTR WideBufferPointer
)
/*++
Routine Description:
This routine writes Base64-encoded random data to an existing buffer in a
format suitable for subsequent use of UNICODE_STRING-based system names for
things like events or shared memory handles.
Arguments:
Rtl - Supplies a pointer to an initialized RTL structure. If the crypto
subsystem hasn't yet been initialized, this routine will also initialize
it.
TemporaryAllocator - Supplies a pointer to an initialized ALLOCATOR struct
that this routine will use for temporary allocations. (Any temporarily
allocated memory will be freed before the routine returns, regardless
of success/error.)
WideBufferAllocator - Supplies a pointer to an initialized ALLOCATOR struct
that this routine will use to allocate the final wide character buffer
that contains the base64-encoded random data. This data will then have
the object namespace+prefix and trailing NULL characters overlaid on
top of it. (That is, the UNICODE_STRING structures pointed to by the
NamesArray will have their Buffer addresses point within this buffer
space.) The caller is responsible for freeing this address (which will
be received via the output param WideBufferPointer).
NumberOfNames - Supplies the number of names that will be carved out of the
provided WideBuffer by the caller. This parameter is used in concert
with the LengthOfNameInChars parameter to ensure the buffer is laid out
in the correct format.
LengthOfNameInChars - Supplies the desired length of each name string in
characters. This length is assumed to include the trailing NULL and
the prefix -- that is, the space required to contain the prefix and
trailing NULL will be subtracted from this parameter. For optimal
layout, this parameter should be a power of 2 -- with 64 and 128 being
good default values.
NamespacePrefix - Optionally supplies a pointer to a UNICODE_STRING to
use as the namespace (prefix) for each string. If NULL, this value
defaults to L"Local\\". (If L"Global\\" is used, the caller is responsible
for ensuring the SeCreateGlobalPrivilege privilege is enabled.)
NamesArrayPointer - Supplies a pointer to the first element of an array of
pointers to UNICODE_STRING structures that will be filled out with
the details of the corresponding object name. Sufficient space should
be allocated such that the array contains sizeof(UNICODE_STRING) *
NumberOfNames in space.
PrefixArrayPointer - Optionally supplies a pointer to the first element of
an array of pointers to UNICODE_STRING structures which can be used to
further customize the name of the object after the namespace but before
the random character data. If a NULL pointer resides at a given array
element, it is assumed no prefix is desired for this element.
SizeOfWideBufferInBytes - Receives the size in bytes of the buffer allocated
to store the object names' wide character data.
WideBufferPointer - Receives the base address of the object names wide char
buffer.
Return Value:
TRUE on success, FALSE on error.
--*/
{
BOOL Success;
USHORT Index;
USHORT Count;
LONG PrefixLengthInChars;
LONG NumberOfWideBase64CharsToCopy;
LONG CharsRemaining;
LONG CharsUsed;
LONG RandomCharsUsed;
LONG FinalCharCount;
ULONG CryptFlags;
ULONG SizeOfBinaryBufferInBytes;
ULONG SizeOfWideBase64BufferInBytes = 0;
ULONG OldLengthOfWideBase64BufferInChars;
ULONG LengthOfWideBase64BufferInChars;
PBYTE BinaryBuffer;
PWCHAR Dest;
PWCHAR WideBase64Buffer = NULL;
PUNICODE_STRING String;
PUNICODE_STRING Prefix;
PPUNICODE_STRING Prefixes;
PUNICODE_STRING Namespace;
UNICODE_STRING LocalNamespace = RTL_CONSTANT_STRING(L"Local\\");
//
// Validate arguments.
//
if (ARGUMENT_PRESENT(NamespacePrefix)) {
if (!IsValidUnicodeStringWithMinimumLengthInChars(NamespacePrefix,7)) {
return FALSE;
}
Namespace = NamespacePrefix;
} else {
Namespace = &LocalNamespace;
}
//
// Namespace length should be (far) less than the desired name length.
//
if (Namespace->Length >= (LengthOfNameInChars << 1)) {
__debugbreak();
return FALSE;
}
//
// If the namespace ends with a trailing NULL, omit it by reducing the
// length by one wide character. Then, verify the final character is a
// slash.
//
if (Namespace->Buffer[(Namespace->Length >> 1) - 1] == L'\0') {
Namespace->Length -= sizeof(WCHAR);
}
if (Namespace->Buffer[(Namespace->Length >> 1) - 1] != L'\\') {
__debugbreak();
return FALSE;
}
if (ARGUMENT_PRESENT(PrefixArrayPointer)) {
Prefixes = PrefixArrayPointer;
} else {
Prefixes = NULL;
}
//
// Make sure the crypto subsystem is available.
//
if (!Rtl->Flags.Crypt32Initialized) {
if (!InitCrypt32(Rtl)) {
return FALSE;
}
}
//
// Allocate a buffer for the initial binary data; we generate more random
// data than we need here, but it's easier than trying to get everything
// exact up-front (i.e. base64->binary size conversions).
//
// N.B. We use Allocator->Malloc() instead of Calloc() here, as any
// existing memory data will contribute as a seed value.
//
SizeOfBinaryBufferInBytes = NumberOfNames * LengthOfNameInChars;
BinaryBuffer = (PBYTE)(
TemporaryAllocator->Malloc(
TemporaryAllocator->Context,
SizeOfBinaryBufferInBytes
)
);
if (!BinaryBuffer) {
return FALSE;
}
//
// Allocate a wide character buffer for the base64-encoded binary data that
// is double the length of the binary buffer -- this is simpler than trying
// to get the exact conversion right.
//
SizeOfWideBase64BufferInBytes = ALIGN_UP_POINTER(
NumberOfNames *
LengthOfNameInChars *
sizeof(WCHAR) *
2
);
WideBase64Buffer = (PWCHAR)(
WideBufferAllocator->Calloc(
WideBufferAllocator->Context,
1,
SizeOfWideBase64BufferInBytes
)
);
if (!WideBase64Buffer) {
goto Error;
}
//
// We successfully allocated space for our two buffers. Fill the first one
// with random data now.
//
Success = Rtl->CryptGenRandom(Rtl,
SizeOfBinaryBufferInBytes,
(PBYTE)BinaryBuffer);
if (!Success) {
Rtl->LastError = GetLastError();
__debugbreak();
goto Error;
}
//
// Convert the entire binary data buffer into base64-encoded wide character
// representation. We calculate the number of wide characters the buffer
// can receive by shifting the byte size right by one, then copy that value
// into a second variable that CryptBinaryToStringW() can overwrite with
// the actual length converted.
//
OldLengthOfWideBase64BufferInChars = SizeOfWideBase64BufferInBytes >> 1;
LengthOfWideBase64BufferInChars = OldLengthOfWideBase64BufferInChars;
CryptFlags = CRYPT_STRING_BASE64 | CRYPT_STRING_NOCRLF;
Success = Rtl->CryptBinaryToStringW(BinaryBuffer,
SizeOfBinaryBufferInBytes,
CryptFlags,
WideBase64Buffer,
&LengthOfWideBase64BufferInChars);
if (!Success) {
Rtl->LastError = GetLastError();
__debugbreak();
goto Error;
}
//
// Conversion of the binary data into base64-encoded data was successful,
// so we can free the binary buffer now.
//
TemporaryAllocator->FreePointer(
TemporaryAllocator->Context,
&BinaryBuffer
);
//
// Loop through the array of pointers to UNICODE_STRING structures and fill
// each one out, adding the namespace and prefixes accordingly.
//
RandomCharsUsed = 0;
for (Index = 0; Index < NumberOfNames; Index++) {
//
// Resolve the next unicode string pointer.
//
String = *(NamesArrayPointer + Index);
//
// Reset counters. CharsUsed has one subtracted from it in addition
// to the namespace length to account for the trailing NULL.
//
CharsUsed = (Namespace->Length >> 1) + 1;
CharsRemaining = (LONG)LengthOfNameInChars - CharsUsed;
if (Prefixes && ((Prefix = *(Prefixes + Index)) != NULL)) {
//
// Omit any trailing NULLs from the custom prefix provided by the
// caller, then subtract the prefix length from the remaining bytes
// and verify we've got a sensible number left.
//
PrefixLengthInChars = Prefix->Length >> 1;
if (Prefix->Buffer[PrefixLengthInChars - 1] == L'\0') {
PrefixLengthInChars -= 1;
}
CharsUsed += PrefixLengthInChars;
CharsRemaining -= PrefixLengthInChars;
} else {
Prefix = NULL;
PrefixLengthInChars = 0;
}
if (CharsRemaining <= 0) {
__debugbreak();
goto Error;
}
//
// Final sanity check that the lengths add up.
//
NumberOfWideBase64CharsToCopy = CharsRemaining;
FinalCharCount = (
(Namespace->Length >> 1) +
PrefixLengthInChars +
NumberOfWideBase64CharsToCopy +
1
);
if (FinalCharCount != LengthOfNameInChars) {
__debugbreak();
goto Error;
}
//
// Everything checks out, fill out the unicode string details and copy
// the relevant parts over: namespace, optional prefix and then finally
// the random characters.
//
String->Length = (USHORT)(FinalCharCount << 1) - sizeof(WCHAR);
String->MaximumLength = String->Length + sizeof(WCHAR);
Dest = String->Buffer = (WideBase64Buffer + RandomCharsUsed);
//
// Copy the namespace and optionally prefix into the initial part of
// the random character data buffer, then NULL terminate the name and
// update counters.
//
Count = Namespace->Length >> 1;
__movsw(Dest, Namespace->Buffer, Count);
Dest += Count;
RandomCharsUsed += Count;
if (Prefix) {
Count = (USHORT)PrefixLengthInChars;
__movsw(Dest, Prefix->Buffer, Count);
Dest += Count;
RandomCharsUsed += Count;
}
Count = (USHORT)NumberOfWideBase64CharsToCopy + 1;
Dest += Count - 1;
*Dest = L'\0';
RandomCharsUsed += Count;
}
//
// We're done, indicate success and finish up.
//
Success = TRUE;
goto End;
Error:
Success = FALSE;
//
// Intentional follow-on to End.
//
End:
if (BinaryBuffer) {
TemporaryAllocator->FreePointer(
TemporaryAllocator->Context,
&BinaryBuffer
);
}
*WideBufferPointer = WideBase64Buffer;
*SizeOfWideBufferInBytes = SizeOfWideBase64BufferInBytes;
return Success;
}
Rtl->CreateMultipleBuffers()
This routine is used to create multiple buffers of equal size (where the size
is specified in numbers of pages), each followed by a "guard" page that has all
permissions removed from it (such that any attempt to read or write memory in
that page will trap).
It is used to allocate all of the memory required for concurrent graph solving
via a single VirtualAllocEx() call.
_Use_decl_annotations_
BOOL
CreateMultipleBuffers(
PRTL Rtl,
PHANDLE TargetProcessHandle,
ULONG PageSize,
ULONG NumberOfBuffers,
ULONG NumberOfPagesPerBuffer,
PULONG AdditionalProtectionFlags,
PULONG AdditionalAllocationTypeFlags,
PULONGLONG UsableBufferSizeInBytesPerBuffer,
PULONGLONG TotalBufferSizeInBytes,
PPVOID BufferAddress
)
/*++
Routine Description:
Allocates a single buffer using VirtualAllocEx() sized using the following:
(PageSize * NumberOfPagesPerBuffer * NumberOfBuffers) +
(NumberOfBuffers * PageSize)
This includes a guard page following each buffer's set of pages. Memory
protection is set on the guard page such that any access will trap.
Arguments:
Rtl - Supplies a pointer to an initialized RTL structure.
TargetProcessHandle - Optionally supplies a pointer to a variable that
contains a process handle for which the memory is to be allocated.
If non-NULL, but pointed-to value is 0, this will receive the handle
of the current process.
PageSize - Supplies the page size used in conjunction with the
NumberOfPagesPerBuffer parameter. This should be either 4KB or 2MB. If
2MB, the caller
should also ensure a) MEM_LARGE_PAGES is also provided as the value in
the parameter AdditionalAllocationTypeFlags, and b) relevant privileges
have been obtained (e.g. Rtl->EnableManageVolumePrivilege() and
Rtl->EnableLockMemoryPrivilege()).
NumberOfBuffers - Supplies the number of individual buffers being serviced
by this request. This will also reflect the number of implicit guard
pages included in the allocation.
NumberOfPagesPerBuffer - Supplies the number of pages to assign for each
buffer.
AdditionalProtectionFlags - Optionally supplies flags that will be OR'd
with the flProtect flags parameter of VirtualAllocEx() (e.g.
PAGE_NOCACHE or PAGE_WRITECOMBINE).
AdditionalAllocationTypeFlags - Optionally supplies flags that will be OR'd
with the flAllocationType flags parameter of VirtualAllocEx(). This
should include MEM_LARGE_PAGES if large pages are desired.
UsableBufferSizeInBytesPerBuffer - Supplies the address of a variable that
receives the number of usable bytes per buffer.
TotalBufferSizeInBytes - Supplies the address of a variable that receives
the total allocation size provided to VirtualAllocEx().
BufferAddress - Supplies the address of a variable that receives the
base address of the allocation if successful.
Return Value:
TRUE on success, FALSE on failure.
--*/
{
BOOL Success;
ULONG Index;
PVOID Buffer;
PBYTE Unusable;
HANDLE ProcessHandle;
ULONG ProtectionFlags;
ULONG OldProtectionFlags;
ULONG AllocationFlags;
ULARGE_INTEGER TotalNumberOfPages;
ULARGE_INTEGER AllocSizeInBytes;
ULONGLONG UsableBufferSizeInBytes;
//
// Validate arguments.
//
if (!ARGUMENT_PRESENT(Rtl)) {
return FALSE;
}
if (!NumberOfPagesPerBuffer) {
return FALSE;
}
if (!NumberOfBuffers) {
return FALSE;
}
//
// Verify page size is either 4KB or 2MB.
//
if (PageSize != 4096 && PageSize != (1 << 21)) {
return FALSE;
}
//
// If page size is 2MB, verify MEM_LARGE_PAGES has also been requested.
//
if (PageSize == (1 << 21)) {
if (!ARGUMENT_PRESENT(AdditionalAllocationTypeFlags)) {
return FALSE;
}
if (!(*AdditionalAllocationTypeFlags & MEM_LARGE_PAGES)) {
return FALSE;
}
}
if (!ARGUMENT_PRESENT(TargetProcessHandle)) {
ProcessHandle = GetCurrentProcess();
} else {
if (!*TargetProcessHandle) {
*TargetProcessHandle = GetCurrentProcess();
}
ProcessHandle = *TargetProcessHandle;
}
if (!ARGUMENT_PRESENT(UsableBufferSizeInBytesPerBuffer)) {
return FALSE;
} else {
*UsableBufferSizeInBytesPerBuffer = 0;
}
if (!ARGUMENT_PRESENT(TotalBufferSizeInBytes)) {
return FALSE;
} else {
*TotalBufferSizeInBytes = 0;
}
if (!ARGUMENT_PRESENT(BufferAddress)) {
return FALSE;
} else {
*BufferAddress = NULL;
}
TotalNumberOfPages.QuadPart = (
//
// Account for the buffer related pages.
//
((ULONGLONG)NumberOfPagesPerBuffer * (ULONGLONG)NumberOfBuffers) +
//
// Account for the guard pages; one for each buffer.
//
((ULONGLONG)NumberOfBuffers)
);
//
// Calculate the total allocation size required.
//
AllocSizeInBytes.QuadPart = (
TotalNumberOfPages.QuadPart * (ULONGLONG)PageSize
);
ProtectionFlags = PAGE_READWRITE;
if (ARGUMENT_PRESENT(AdditionalProtectionFlags)) {
ProtectionFlags |= *AdditionalProtectionFlags;
}
AllocationFlags = MEM_RESERVE | MEM_COMMIT;
if (ARGUMENT_PRESENT(AdditionalAllocationTypeFlags)) {
AllocationFlags |= *AdditionalAllocationTypeFlags;
}
//
// Validation of parameters complete. Proceed with buffer allocation.
//
Buffer = Rtl->VirtualAllocEx(ProcessHandle,
NULL,
AllocSizeInBytes.QuadPart,
AllocationFlags,
ProtectionFlags);
if (!Buffer) {
ULONG LastError = GetLastError();
return FALSE;
}
//
// Buffer was successfully allocated. Any failures after this point should
// `goto Error` to ensure the memory is freed.
//
UsableBufferSizeInBytes = (ULONGLONG)(
(ULONGLONG)NumberOfPagesPerBuffer *
(ULONGLONG)PageSize
);
//
// Loop through the assigned memory and toggle protection to PAGE_NOACCESS
// for the guard pages.
//
Unusable = (PBYTE)Buffer;
ProtectionFlags = PAGE_NOACCESS;
for (Index = 0; Index < NumberOfBuffers; Index++) {
Unusable += UsableBufferSizeInBytes;
Success = Rtl->VirtualProtectEx(ProcessHandle,
Unusable,
PageSize,
ProtectionFlags,
&OldProtectionFlags);
if (!Success) {
goto Error;
}
//
// Advance past this guard page.
//
Unusable += PageSize;
}
//
// We're done, indicate success and finish up.
//
Success = TRUE;
goto End;
Error:
Success = FALSE;
//
// Buffer should be non-NULL at this point. Assert this invariant, free the
// allocated memory, clear the buffer pointer and set the alloc size to 0.
//
ASSERT(Buffer);
Rtl->VirtualFreeEx(ProcessHandle, Buffer, 0, MEM_RELEASE);
Buffer = NULL;
//
// Intentional follow-on to End.
//
End:
//
// Update caller's parameters and return.
//
*BufferAddress = Buffer;
if (Success) {
*TotalBufferSizeInBytes = AllocSizeInBytes.QuadPart;
*UsableBufferSizeInBytesPerBuffer = UsableBufferSizeInBytes;
}
return Success;
}