utils/data-structures
Custom data structures.
These are only used internally, meaning an end-user shouldnβt need to access anything here.
- utils/data-structures
- static
- .PriorityQueue
new PriorityQueue(comparator)
.size
.isEmpty()
βboolean
.peek()
βany
.push(...values)
βnumber
.extend(values)
βnumber
.pop()
βany
.replace(value)
β*
._siftUpFrom(node)
- .CharTrie
- .TokenLattice
new TokenLattice(sentence, bosTokenId, eosTokenId)
.insert(pos, length, score, tokenId)
.viterbi()
βArray.<TokenLatticeNode>
.piece(node)
βstring
.tokens()
βArray.<string>
.tokenIds()
βArray.<number>
- .PriorityQueue
- inner
- ~CharTrieNode
new CharTrieNode(isLeaf, children)
.default()
βCharTrieNode
- ~TokenLatticeNode
new TokenLatticeNode(tokenId, nodeId, pos, length, score)
.clone()
βTokenLatticeNode
- ~CharTrieNode
- static
utils/data-structures.PriorityQueue
Efficient Heap-based Implementation of a Priority Queue.
It uses an array-based binary heap, where the root is at index 0
, and the
children of node i
are located at indices 2i + 1
and 2i + 2
, respectively.
Adapted from the following sources:
- https://stackoverflow.com/a/42919752/13989043 (original)
- https://github.com/belladoreai/llama-tokenizer-js (minor improvements)
Kind: static class of utils/data-structures
- .PriorityQueue
new PriorityQueue(comparator)
.size
.isEmpty()
βboolean
.peek()
βany
.push(...values)
βnumber
.extend(values)
βnumber
.pop()
βany
.replace(value)
β*
._siftUpFrom(node)
new PriorityQueue(comparator)
Create a new PriorityQueue.
Param | Type | Description |
---|---|---|
comparator | function | Comparator function to determine priority. Defaults to a MaxHeap. |
priorityQueue.size
The size of the queue
Kind: instance property of PriorityQueue
priorityQueue.isEmpty() β <code> boolean </code>
Check if the queue is empty.
Kind: instance method of PriorityQueue
Returns: boolean
- true
if the queue is empty, false
otherwise.
priorityQueue.peek() β <code> any </code>
Return the element with the highest priority in the queue.
Kind: instance method of PriorityQueue
Returns: any
- The highest priority element in the queue.
priorityQueue.push(...values) β <code> number </code>
Add one or more elements to the queue.
Kind: instance method of PriorityQueue
Returns: number
- The new size of the queue.
Param | Type | Description |
---|---|---|
...values | any | The values to push into the queue. |
priorityQueue.extend(values) β <code> number </code>
Add multiple elements to the queue.
Kind: instance method of PriorityQueue
Returns: number
- The new size of the queue.
Param | Type | Description |
---|---|---|
values | Array.<any> | The values to push into the queue. |
priorityQueue.pop() β <code> any </code>
Remove and return the element with the highest priority in the queue.
Kind: instance method of PriorityQueue
Returns: any
- The element with the highest priority in the queue.
priorityQueue.replace(value) β <code> * </code>
Replace the element with the highest priority in the queue with a new value.
Kind: instance method of PriorityQueue
Returns: *
- The replaced value.
Param | Type | Description |
---|---|---|
value | * | The new value. |
priorityQueue._siftUpFrom(node)
Helper function to sift up from a given node.
Kind: instance method of PriorityQueue
Param | Type | Description |
---|---|---|
node | number | The index of the node to start sifting up from. |
utils/data-structures.CharTrie
A trie structure to efficiently store and search for strings.
Kind: static class of utils/data-structures
charTrie.extend(texts)
Adds one or more texts
to the trie.
Kind: instance method of CharTrie
Param | Type | Description |
---|---|---|
texts | Array.<string> | The strings to add to the trie. |
charTrie.push(text)
Adds text to the trie.
Kind: instance method of CharTrie
Param | Type | Description |
---|---|---|
text | string | The string to add to the trie. |
charTrie.commonPrefixSearch(text)
Searches the trie for all strings with a common prefix of text
.
Kind: instance method of CharTrie
Param | Type | Description |
---|---|---|
text | string | The common prefix to search for. |
utils/data-structures.TokenLattice
A lattice data structure to be used for tokenization.
Kind: static class of utils/data-structures
- .TokenLattice
new TokenLattice(sentence, bosTokenId, eosTokenId)
.insert(pos, length, score, tokenId)
.viterbi()
βArray.<TokenLatticeNode>
.piece(node)
βstring
.tokens()
βArray.<string>
.tokenIds()
βArray.<number>
new TokenLattice(sentence, bosTokenId, eosTokenId)
Creates a new TokenLattice instance.
Param | Type | Description |
---|---|---|
sentence | string | The input sentence to be tokenized. |
bosTokenId | number | The beginning-of-sequence token ID. |
eosTokenId | number | The end-of-sequence token ID. |
tokenLattice.insert(pos, length, score, tokenId)
Inserts a new token node into the token lattice.
Kind: instance method of TokenLattice
Param | Type | Description |
---|---|---|
pos | number | The starting position of the token. |
length | number | The length of the token. |
score | number | The score of the token. |
tokenId | number | The token ID of the token. |
tokenLattice.viterbi() β <code> Array. < TokenLatticeNode > </code>
Implements the Viterbi algorithm to compute the most likely sequence of tokens.
Kind: instance method of TokenLattice
Returns: Array.<TokenLatticeNode>
- The most likely sequence of tokens.
tokenLattice.piece(node) β <code> string </code>
Kind: instance method of TokenLattice
Returns: string
- The array of nodes representing the most likely sequence of tokens.
Param | Type |
---|---|
node | TokenLatticeNode |
tokenLattice.tokens() β <code> Array. < string > </code>
Kind: instance method of TokenLattice
Returns: Array.<string>
- The most likely sequence of tokens.
tokenLattice.tokenIds() β <code> Array. < number > </code>
Kind: instance method of TokenLattice
Returns: Array.<number>
- The most likely sequence of token ids.
utils/data-structures~CharTrieNode
Represents a node in a character trie.
Kind: inner class of utils/data-structures
- ~CharTrieNode
new CharTrieNode(isLeaf, children)
.default()
βCharTrieNode
new CharTrieNode(isLeaf, children)
Create a new CharTrieNode.
Param | Type | Description |
---|---|---|
isLeaf | boolean | Whether the node is a leaf node or not. |
children | Map.<string, CharTrieNode> | A map containing the node's children, where the key is a character and the value is a |
CharTrieNode.default() β <code> CharTrieNode </code>
Returns a new CharTrieNode
instance with default values.
Kind: static method of CharTrieNode
Returns: CharTrieNode
- A new CharTrieNode
instance with isLeaf
set to false
and an empty children
map.
utils/data-structures~TokenLatticeNode
Kind: inner class of utils/data-structures
- ~TokenLatticeNode
new TokenLatticeNode(tokenId, nodeId, pos, length, score)
.clone()
βTokenLatticeNode
new TokenLatticeNode(tokenId, nodeId, pos, length, score)
Represents a node in a token lattice for a given sentence.
Param | Type | Description |
---|---|---|
tokenId | number | The ID of the token associated with this node. |
nodeId | number | The ID of this node. |
pos | number | The starting position of the token in the sentence. |
length | number | The length of the token. |
score | number | The score associated with the token. |
tokenLatticeNode.clone() β <code> TokenLatticeNode </code>
Returns a clone of this node.
Kind: instance method of TokenLatticeNode
Returns: TokenLatticeNode
- A clone of this node.
< > Update on GitHub