clang-tools
9.0.0
Namespaces | |
detail | |
Classes | |
struct | Chunk |
NOTE: This is an implementation detail.
class | Corpus |
class | Dex |
In-memory Dex trigram-based index implementation.
class | Iterator |
Iterator is the interface for a Query Tree node.
class | PostingList |
PostingList is the storage of DocIDs, which can be inserted into the Query Tree as a leaf by constructing an Iterator over the PostingList object.
class | Token |
A Token represents an attribute of a symbol, such as a particular trigram present in the name (used for fuzzy search).
Typedefs | |
using | DocID = uint32_t |
Symbol position in the list of all index symbols sorted by a pre-computed symbol quality.
Functions | |
std::vector< std::string > | generateProximityURIs (llvm::StringRef URIPath) |
Returns Search Tokens for a number of parent directories of the given path.
std::vector< std::pair< DocID, float > > | consume (Iterator &It) |
Advances the iterator until it is exhausted.
std::vector< Token > | generateIdentifierTrigrams (llvm::StringRef Identifier) |
Returns a list of unique fuzzy-search trigrams from an unqualified symbol.
std::vector< Token > | generateQueryTrigrams (llvm::StringRef Query) |
Returns a list of unique fuzzy-search trigrams given a query.
using clang::clangd::dex::DocID = typedef uint32_t |
Symbol position in the list of all index symbols sorted by a pre-computed symbol quality.
Definition at line 46 of file Iterator.h.
std::vector< std::pair< DocID, float > > clang::clangd::dex::consume | ( | Iterator & | It | )
Advances the iterator until it is exhausted.
Returns pairs of document IDs with the corresponding boosting score.
Boosting can be seen as a compromise between two extremes: retrieving too many items and calculating the final score for each of them (which might be very expensive), and not retrieving enough items, so that items with a very high final score would never be processed. The boosting score is a computationally efficient way to acquire preliminary scores for the requested items.
Definition at line 350 of file Iterator.cpp.
References clang::clangd::dex::Iterator::advance(), clang::clangd::dex::Iterator::consume(), clang::clangd::dex::Iterator::peek(), clang::clangd::dex::Iterator::reachedEnd(), and Result.
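The loop described above can be sketched with a small stand-in for the Iterator interface. This is a minimal illustration, not the clangd code: `SketchIterator`, `VectorIterator` and `consumeAll` are hypothetical names, and only the four members listed in the References line (`advance()`, `consume()`, `peek()`, `reachedEnd()`) are modeled.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using DocID = std::uint32_t;

// Hypothetical stand-in for the Iterator interface; only the members
// referenced by consume() are modeled.
class SketchIterator {
public:
  virtual ~SketchIterator() = default;
  virtual bool reachedEnd() const = 0; // true once no items remain
  virtual void advance() = 0;          // move to the next item
  virtual DocID peek() const = 0;      // DocID of the current item
  virtual float consume() = 0;         // boosting score of the current item
};

// Toy leaf iterator over a list of DocIDs that all share one boost value.
class VectorIterator : public SketchIterator {
public:
  VectorIterator(std::vector<DocID> Docs, float Score)
      : IDs(std::move(Docs)), Boost(Score) {}
  bool reachedEnd() const override { return Pos == IDs.size(); }
  void advance() override { ++Pos; }
  DocID peek() const override { return IDs[Pos]; }
  float consume() override { return Boost; }

private:
  std::vector<DocID> IDs;
  float Boost;
  std::size_t Pos = 0;
};

// Drains the iterator, pairing each DocID with its boosting score.
std::vector<std::pair<DocID, float>> consumeAll(SketchIterator &It) {
  std::vector<std::pair<DocID, float>> Result;
  for (; !It.reachedEnd(); It.advance())
    Result.emplace_back(It.peek(), It.consume());
  return Result;
}
```

Calling `consumeAll` on a `VectorIterator` built from `{1, 4, 9}` with boost `2.0f` yields three pairs and leaves the iterator at its end. A caller would then compute the expensive final score only for the retrieved items.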
std::vector< Token > clang::clangd::dex::generateIdentifierTrigrams | ( | llvm::StringRef | Identifier | ) |
Returns a list of unique fuzzy-search trigrams from an unqualified symbol.
The trigrams give the 3-character query substrings this symbol can match.
The symbol's name is broken into segments, e.g. "FooBar" has two segments. Trigrams can start at any character in the input. Then we can choose to move to the next character or to the start of the next segment.
Short trigrams (length 1-2) are used for short queries. These are:
- prefixes of the identifier, of length 1 and 2
- the first character + next head character
For "FooBar" we get the following trigrams: {f, fo, fb, foo, fob, fba, oob, oba, bar}.
Trigrams are lowercase, as trigram matching is case-insensitive. Trigrams in the returned list are deduplicated.
Definition at line 23 of file Trigram.cpp.
References clang::clangd::calculateRoles(), clang::clangd::Head, clang::clangd::Tail, and clang::clangd::dex::Token::Trigram.
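The traversal described above (start at any character, then step to the next character or to the start of the next segment) can be sketched as follows. This is a simplified illustration under stated assumptions, not the clangd implementation: `sketchIdentifierTrigrams` is a hypothetical name, the input is assumed to be an alphanumeric CamelCase identifier, segment starts are approximated by "uppercase after non-uppercase" instead of `calculateRoles()`, and the short tokens follow the pattern visible in the "FooBar" example (f, fo, fb).

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical sketch; returns a std::set of strings instead of Tokens.
std::set<std::string> sketchIdentifierTrigrams(const std::string &Name) {
  const std::size_t N = Name.size();
  const std::size_t None = N; // sentinel: "no such position"
  std::set<std::string> Out;
  if (N == 0)
    return Out;

  // Lowercase copy, since trigram matching is case-insensitive.
  std::string Lower;
  for (unsigned char C : Name)
    Lower.push_back(static_cast<char>(std::tolower(C)));

  // Assumption: a segment starts at index 0 or at an uppercase letter that
  // follows a non-uppercase character.
  std::vector<bool> Head(N, false);
  Head[0] = true;
  for (std::size_t I = 1; I < N; ++I)
    Head[I] = std::isupper(static_cast<unsigned char>(Name[I])) &&
              !std::isupper(static_cast<unsigned char>(Name[I - 1]));

  // NextHead[I]: index of the first segment start after I, or None.
  std::vector<std::size_t> NextHead(N, None);
  std::size_t Seen = None;
  for (std::size_t I = N; I-- > 0;) {
    NextHead[I] = Seen;
    if (Head[I])
      Seen = I;
  }

  // Short tokens: prefixes of length 1 and 2, and first char + next head.
  Out.insert(Lower.substr(0, 1));
  if (N >= 2)
    Out.insert(Lower.substr(0, 2));
  if (NextHead[0] != None)
    Out.insert(std::string{Lower[0], Lower[NextHead[0]]});

  // From position I we may move to I + 1 or to the next segment start.
  auto Successors = [&](std::size_t I) {
    std::vector<std::size_t> S;
    if (I + 1 < N)
      S.push_back(I + 1);
    if (NextHead[I] != None && NextHead[I] != I + 1)
      S.push_back(NextHead[I]);
    return S;
  };
  for (std::size_t A = 0; A < N; ++A)
    for (std::size_t B : Successors(A))
      for (std::size_t C : Successors(B))
        Out.insert(std::string{Lower[A], Lower[B], Lower[C]});
  return Out;
}
```

For "FooBar" this sketch produces the same nine tokens listed in the example above. The set container handles deduplication, for instance when "oba" is reached along two different paths.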
std::vector< std::string > clang::clangd::dex::generateProximityURIs | ( | llvm::StringRef | URIPath | ) |
Returns Search Tokens for a number of parent directories of the given path.
Should be used within the index build process.
This function is exposed for testing only.
Definition at line 299 of file Dex.cpp.
References Limit, clang::clangd::URI::parse(), Result, and clang::clangd::toString().
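The idea of walking up the parent directories of a URI can be sketched with plain string operations. This is only an illustration under assumptions: `sketchProximityURIs` is a hypothetical helper, the `Limit` parameter is an assumed cap whose real value is not stated here, the input is assumed to have the form `scheme://authority/path`, and the sketch stops before the root directory (whether the real function emits the root is not covered by this page).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: collects up to Limit parent directories of the file
// named by URI, nearest parent first.
std::vector<std::string> sketchProximityURIs(const std::string &URI,
                                             std::size_t Limit) {
  std::vector<std::string> Result;
  // Split "scheme://authority" from the path body.
  std::size_t SchemeEnd = URI.find("://");
  if (SchemeEnd == std::string::npos)
    return Result;
  std::size_t PathStart = URI.find('/', SchemeEnd + 3);
  if (PathStart == std::string::npos)
    return Result;
  const std::string Prefix = URI.substr(0, PathStart); // e.g. "unittest://"
  std::string Path = URI.substr(PathStart);            // e.g. "/a/b/c.h"

  while (Result.size() < Limit) {
    std::size_t Slash = Path.rfind('/');
    if (Slash == std::string::npos || Slash == 0)
      break;           // only the root is left; this sketch does not emit it
    Path.erase(Slash); // drop the last path component
    Result.push_back(Prefix + Path);
  }
  return Result;
}
```

With the input `unittest:///a/b/c.h` and a limit of 5, the sketch returns `unittest:///a/b` followed by `unittest:///a`. Capping the number of parents keeps the number of proximity tokens per symbol bounded for deeply nested files.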
std::vector< Token > clang::clangd::dex::generateQueryTrigrams | ( | llvm::StringRef | Query | ) |
Returns a list of unique fuzzy-search trigrams given a query.
The query is segmented using the FuzzyMatch API and converted to lowercase. Then the simplest trigrams (sequences of three consecutive letters and digits) are extracted and returned after deduplication.
For short queries (fewer than 3 characters with Head or Tail roles in the FuzzyMatch segmentation), this returns a single trigram with the first characters (up to 3) to perform a prefix match.
Definition at line 86 of file Trigram.cpp.
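A rough sketch of the query side, assuming that "letters and digits" can be approximated with `std::isalnum` rather than the FuzzyMatch role segmentation. `sketchQueryTrigrams` is a hypothetical name, and the result is a vector of strings rather than Tokens.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical sketch: lowercases the query, keeps letters and digits, and
// returns the unique 3-character windows in order of first appearance.
std::vector<std::string> sketchQueryTrigrams(const std::string &Query) {
  std::string Clean;
  for (unsigned char C : Query)
    if (std::isalnum(C))
      Clean.push_back(static_cast<char>(std::tolower(C)));

  std::vector<std::string> Out;
  if (Clean.empty())
    return Out;
  // Short query: a single token holding the available prefix.
  if (Clean.size() < 3) {
    Out.push_back(Clean);
    return Out;
  }
  std::set<std::string> Seen; // used only for deduplication
  for (std::size_t I = 0; I + 3 <= Clean.size(); ++I) {
    std::string Trigram = Clean.substr(I, 3);
    if (Seen.insert(Trigram).second)
      Out.push_back(Trigram);
  }
  return Out;
}
```

For the query "FooBar" the sketch returns foo, oob, oba and bar, each of which is among the identifier trigrams listed for "FooBar" above. A two-character query such as "ab" yields the single token "ab".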