Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add PrefixTree and RegisterHandler to support TDFA simulation. #56

Merged
merged 355 commits into from
Dec 6, 2024

Conversation

SharafMohamed
Copy link
Contributor

@SharafMohamed SharafMohamed commented Nov 26, 2024

Description

  • Add PrefixTreeNode and PrefixTree classes to maintain a prefix tree representation of the registers. One tree represents all registers.
  • Add Register class which tracks an index in the prefix tree. The vector of positions represented by a register can be obtained by going to its index in the prefix tree and traversing in reverse to the root.
  • Add RegisterHandle class which maintains a vector of registers, as well as the prefix tree. This class handle the set, copy, append, and get operators.

Validation performed

  • Unit-tests added for every new class.

Summary by CodeRabbit

  • New Features

    • Introduced a new prefix tree data structure for managing tag positions.
    • Added a register management system for handling tag locations associated with regex.
    • Enhanced error handling for out-of-bounds access in both PrefixTree and RegisterHandler.
    • Implemented a method to retrieve reversed positions in the prefix tree.
    • Updated build configuration to include new source files and libraries for improved functionality.
  • Tests

    • Added comprehensive unit tests for the PrefixTree and RegisterHandler classes to validate functionality and error handling.

SharafMohamed and others added 30 commits October 23, 2024 15:38
…ransitions) return nullopt if state_ids is malformed.
…ake it clear to the reader that both failures are handled the same way and return nullopt. For more complicated return cases it would warrant the reader looking at the doc for the individual functions, but here I think we can make their life easier.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
src/log_surgeon/finite_automata/PrefixTree.hpp (2)

12-12: Fix typo in documentation

"posiiton" should be "position".


65-65: Simplify boolean expression

The expression false == m_parent_node_id.has_value() can be simplified.

-        [[nodiscard]] auto is_root() const -> bool { return false == m_parent_node_id.has_value(); }
+        [[nodiscard]] auto is_root() const -> bool { return !m_parent_node_id.has_value(); }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between f651a24 and 1f66918.

📒 Files selected for processing (3)
  • src/log_surgeon/finite_automata/PrefixTree.hpp (1 hunks)
  • tests/test-prefix-tree.cpp (1 hunks)
  • tests/test-register-handler.cpp (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/test-prefix-tree.cpp
  • tests/test-register-handler.cpp
🧰 Additional context used
📓 Learnings (1)
src/log_surgeon/finite_automata/PrefixTree.hpp (2)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#56
File: src/log_surgeon/finite_automata/PrefixTree.hpp:80-80
Timestamp: 2024-11-27T22:24:42.637Z
Learning: The method `get_reversed_positions` is implemented in `src/log_surgeon/finite_automata/PrefixTree.cpp`.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#56
File: src/log_surgeon/finite_automata/PrefixTree.cpp:0-0
Timestamp: 2024-12-04T11:20:27.990Z
Learning: In the `PrefixTree` implementation in `src/log_surgeon/finite_automata/PrefixTree.cpp`, the root node serves as a common placeholder for all paths, and its position should be ignored in methods like `get_reversed_positions`.
🔇 Additional comments (3)
src/log_surgeon/finite_automata/PrefixTree.hpp (3)

22-25: LGTM! Well-defined types and constants

Good use of type aliases and constants to improve code maintainability and readability.


80-80: LGTM! Clean implementation of tree storage

The vector-based storage is an efficient choice for this use case.


56-56: Verify implementation of get_reversed_positions

Let's verify the implementation matches the declaration.

✅ Verification successful

Implementation matches declaration and follows expected behavior

The implementation in PrefixTree.cpp correctly matches the declaration in the header file:

  • Returns std::vector<position_t> as declared
  • Takes node_id parameter and is marked const
  • Implements proper bounds checking
  • Correctly builds reversed positions by traversing from node to root
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify get_reversed_positions implementation matches declaration

# Search for the implementation
rg "get_reversed_positions.*const.*std::vector<position_t>" src/log_surgeon/finite_automata/PrefixTree.cpp -A 10

Length of output: 604

src/log_surgeon/finite_automata/PrefixTree.hpp Outdated Show resolved Hide resolved
src/log_surgeon/finite_automata/PrefixTree.cpp Outdated Show resolved Hide resolved
src/log_surgeon/finite_automata/PrefixTree.hpp Outdated Show resolved Hide resolved
src/log_surgeon/finite_automata/RegisterHandler.hpp Outdated Show resolved Hide resolved
tests/test-prefix-tree.cpp Outdated Show resolved Hide resolved
tests/test-register-handler.cpp Outdated Show resolved Hide resolved
tests/test-register-handler.cpp Outdated Show resolved Hide resolved
tests/test-register-handler.cpp Outdated Show resolved Hide resolved
tests/test-prefix-tree.cpp Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (4)
src/log_surgeon/finite_automata/RegisterHandler.hpp (1)

20-26: Add method documentation for public methods

Consider adding documentation for each public method to clarify:

  • Parameters and their purpose
  • Return values and their meaning
  • Exceptions that may be thrown
  • Any preconditions or postconditions

Example format:

/**
 * @brief Adds a new register to the handler
 * @param prefix_tree_parent_node_id Parent node ID in the prefix tree
 * @param position Position in the lexed string
 * @throw std::out_of_range if prefix_tree_parent_node_id is out of range
 */

Also applies to: 28-30, 32-34, 36-39, 41-44

src/log_surgeon/finite_automata/PrefixTree.hpp (3)

12-20: Add root node documentation

Consider adding a note about the root node's special characteristics:

  • Its position is always -1
  • It has no parent (std::nullopt)
  • It serves as a common placeholder for all paths

Also applies to: 28-28


36-43: Consider reserving vector capacity

The insert method could benefit from reserving capacity in the vector to reduce reallocations, especially if you know the approximate number of nodes that will be inserted.

// In constructor
m_nodes.reserve(expected_size);

78-80: Consider making member variables const where appropriate

If m_parent_id never changes after construction, consider making it const:

-    std::optional<id_t> m_parent_id;
+    std::optional<id_t> const m_parent_id;
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 1f66918 and ed70bd5.

📒 Files selected for processing (5)
  • src/log_surgeon/finite_automata/PrefixTree.cpp (1 hunks)
  • src/log_surgeon/finite_automata/PrefixTree.hpp (1 hunks)
  • src/log_surgeon/finite_automata/RegisterHandler.hpp (1 hunks)
  • tests/test-prefix-tree.cpp (1 hunks)
  • tests/test-register-handler.cpp (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/log_surgeon/finite_automata/PrefixTree.cpp
  • tests/test-register-handler.cpp
  • tests/test-prefix-tree.cpp
🧰 Additional context used
📓 Learnings (2)
src/log_surgeon/finite_automata/PrefixTree.hpp (2)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#56
File: src/log_surgeon/finite_automata/PrefixTree.hpp:80-80
Timestamp: 2024-11-27T22:24:42.637Z
Learning: The method `get_reversed_positions` is implemented in `src/log_surgeon/finite_automata/PrefixTree.cpp`.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#56
File: src/log_surgeon/finite_automata/PrefixTree.cpp:0-0
Timestamp: 2024-12-04T11:20:27.990Z
Learning: In the `PrefixTree` implementation in `src/log_surgeon/finite_automata/PrefixTree.cpp`, the root node serves as a common placeholder for all paths, and its position should be ignored in methods like `get_reversed_positions`.
src/log_surgeon/finite_automata/RegisterHandler.hpp (2)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#56
File: src/log_surgeon/finite_automata/RegisterHandler.hpp:0-0
Timestamp: 2024-11-27T22:25:35.608Z
Learning: In the `RegisterHandler` class in `src/log_surgeon/finite_automata/RegisterHandler.hpp`, the methods `add_register` and `append_position` rely on `emplace_back` and `m_prefix_tree.insert` to handle exceptions correctly and maintain consistent state without requiring additional exception handling.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#56
File: src/log_surgeon/finite_automata/RegisterHandler.hpp:0-0
Timestamp: 2024-11-27T21:56:13.425Z
Learning: In the `log_surgeon` project, header guards in C++ header files should include `_HPP` at the end to match the filename. For example, in `RegisterHandler.hpp`, the header guard should be `LOG_SURGEON_FINITE_AUTOMATA_REGISTER_HANDLER_HPP`.

Copy link
Member

@LinZhihao-723 LinZhihao-723 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last two comments before we merge

src/log_surgeon/finite_automata/PrefixTree.hpp Outdated Show resolved Hide resolved
tests/test-register-handler.cpp Outdated Show resolved Hide resolved
Copy link
Member

@LinZhihao-723 LinZhihao-723 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the PR title, how about:

feat: Add `PrefixTree` and `RegisterHandler` to support TDFA simulation.

@SharafMohamed SharafMohamed changed the title feat: Add functionality to support registers needed by the TDFA. feat: Add PrefixTree and RegisterHandler to support TDFA simulation. Dec 6, 2024
@SharafMohamed SharafMohamed merged commit 99b5b08 into y-scope:main Dec 6, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants