-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python bindings #426
Python bindings #426
Conversation
Just updated the PR with a missing documentation file that I had forgotten to commit :-D. No changes in the code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks amazing, thanks a lot for working on improving the initial support! It's a lot better with great new features :)
I don't have any strong comment, I'm sure we can smooth the Python usage a bit but that will come over time; the biggest one is probably about the event vs series representation and use but that might be out of scope here. (Also there's a dedicated comment).
In addition to the individual comments I think we should improve the Python documentation, eg. mention the skb-tracking
helpers, show more use of what can be done, raw vs non-raw etc. That can be done later tbf (I have some fun examples locally and can make a PR later).
nit: Commit "python: add python event representation." has the same author & co-author (and a trailing '.').
One thing that I find hard to work with is the event object is a dict, fields can be accessed like With a few additions we can actually access section fields (and below) directly as Python object; which helps with the UX to get them and allows to use per-type helpers: e = next(events)
print(e['skb'].packet.len)
tcp = e['skb'].tcp
print(tcp.dport)
print(tcp.dport.service_name()) # We could do this for example. I haven't added a Patch: diff --git a/retis-derive/src/lib.rs b/retis-derive/src/lib.rs
index 355dadf73ce0..5ec26a8c68c2 100644
--- a/retis-derive/src/lib.rs
+++ b/retis-derive/src/lib.rs
@@ -38,6 +38,19 @@ fn enum_is_simple(item: &Item) -> bool {
}
}
+fn has_named_fields(item: &Item) -> bool {
+ match &item {
+ Item::Struct(item) => match &item.fields {
+ Fields::Named(_) => true,
+ _ => false,
+ },
+ Item::Enum(item) => {
+ item.variants.iter().all(|v| matches!(v.fields, Fields::Named(_)))
+ }
+ _ => false,
+ }
+}
+
#[proc_macro_attribute]
pub fn event_type(
_args: proc_macro::TokenStream,
@@ -50,6 +63,8 @@ pub fn event_type(
// using underlying integers: See https://pyo3.rs/main/doc/pyo3/attr.pyclass.html.
let pyclass_args = if enum_is_simple(&parsed) {
"(eq, eq_int)"
+ } else if has_named_fields(&parsed) {
+ "(get_all)"
} else {
""
};
diff --git a/retis-events/src/kernel.rs b/retis-events/src/kernel.rs
index 806fb5bb448e..01bd5b76de3c 100644
--- a/retis-events/src/kernel.rs
+++ b/retis-events/src/kernel.rs
@@ -58,3 +58,9 @@ impl EventFmt for StackTrace {
}
}
}
+
+impl pyo3::ToPyObject for StackTrace {
+ fn to_object(&self, py: pyo3::Python<'_>) -> pyo3::PyObject {
+ pyo3::IntoPy::into_py(self.0.clone(), py)
+ }
+}
diff --git a/retis-events/src/net.rs b/retis-events/src/net.rs
index 6001b94829cd..bf5ac795fb2f 100644
--- a/retis-events/src/net.rs
+++ b/retis-events/src/net.rs
@@ -53,9 +53,17 @@ pub(crate) fn protocol_str(protocol: u8) -> Option<&'static str> {
}
/// Represents a raw packet. Stored internally as a `Vec<u8>`.
+// We don't use #[event_type] as we're implementing serde::Serialize and
+// serde::Deserialize manually.
#[derive(Clone, Debug)]
pub struct RawPacket(pub Vec<u8>);
+impl pyo3::ToPyObject for RawPacket {
+ fn to_object(&self, py: pyo3::Python<'_>) -> pyo3::PyObject {
+ pyo3::IntoPy::into_py(self.0.clone(), py)
+ }
+}
+
impl serde::Serialize for RawPacket {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
--
2.46.1 |
I tried something similar at the beginning and found many quirks so I left it as a follow up improvement. I can revisit now that things look a bit more stable. Thanks |
Doing this as a follow up is fine, I was just happy to see there's a way forward. Up to you then. |
Now I remember
Now I remember, the problem is enums, e.g: for ovs:
|
I have found a workaround. We can "merge" the type into the enum (same for ip.V4) at the cost of loosing helpers (which we don't have right now): diff --git a/retis-events/src/ovs.rs b/retis-events/src/ovs.rs
index ff061f66..90667e06 100644
--- a/retis-events/src/ovs.rs
+++ b/retis-events/src/ovs.rs
@@ -31,7 +31,18 @@ pub enum OvsEventType {
/// Upcall event. It indicates the begining of an upcall. An upcall can have multiple enqueue
/// events.
#[serde(rename = "upcall")]
- Upcall(UpcallEvent),
+ Upcall {
+ /// Upcall command. Holds OVS_PACKET_CMD:
+ /// OVS_PACKET_CMD_UNSPEC = 0
+ /// OVS_PACKET_CMD_MISS = 1
+ /// OVS_PACKET_CMD_ACTION = 2
+ /// OVS_PACKET_CMD_EXECUTE = 3
+ cmd: u8,
+ /// Upcall port.
+ port: u32,
+ /// Cpu ID
+ cpu: u32,
+ },
|
Oh, did not realized this was the same limitation for I'm not too much in favor of the above, it prevents from having a per-variant type and thus helpers. We should investigate more; eg. maybe we can do the |
I know, it's not nice. I agree, if we need helpers, we cannot just remove the type. However, in such case we could have this: diff --git a/retis-events/src/ovs.rs b/retis-events/src/ovs.rs
index ff061f66..0546f9e2 100644
--- a/retis-events/src/ovs.rs
+++ b/retis-events/src/ovs.rs
@@ -31,7 +31,10 @@ pub enum OvsEventType {
/// Upcall event. It indicates the begining of an upcall. An upcall can have multiple enqueue
/// events.
#[serde(rename = "upcall")]
- Upcall(UpcallEvent),
+ Upcall {
+ #[serde(flatten)]
+ upcall: UpcallEvent
+ },
This workaround:
The result (with an unchanged json file):
Both options (removing the struct and adding an extra field) might be enough for now, WDYT? |
The above is quite nice as a workaround IMO. The only downside is having to handle this when working with the event in the Rust side, but that's manageable and something we can easily change later if needed. |
f3e48bd
to
090172b
Compare
The latest version has:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reworks!
Could we improve the following error msg by calling file_type()
? Or even just print an already sorted series (sorting something sorted should just return the input)?
$ retis sort retis.sorted
Error: Incorrect filetype. Use next_series().
Or maybe just improving the underlying error message, it is currently developer-centered but could be user-centered. I guess this was designed with Python use in mind, but maybe such things should be done in the Python part only?
Ack. I'll write something that can serve both purposes.
For this specific case, how about:
|
Sounds good, thanks! |
b7c8b51
to
9db8a08
Compare
@atenart FYI It turns out pyo3 requires python >= 3.7 which means it doesn't work with centos-8 :-(. So we do need a non-python build after all. |
And as stable distro gets older pyo3 as chances to start being incompatible at some point (not only w/ c8s). Being able to build w/o Python is a fix, but removing support when is was previously available is a bummer. Not that I have a solution, I guess that's more a packaging issue than an upstream one. Also thanks for trying to make all of this work, rebasing and making the CI to pass looks like a long journey :( |
It is not required by the core or by any other piece of infrastructure. Only some events need to derive from Default because of the way they are constructed. Signed-off-by: Adrian Moreno <[email protected]>
OVS module currently makes no assumptions on the order of the event chunks and tries to build the event even if out-of-order pieces are received. This makes little sense in practice as these chuncks are sent in the same hook. Removing that unneeded requirement and assuming the base action event (OVS_DP_ACTION) will be received before specific action argument events (e.g: OVS_DP_ACTION_OUTPUT) makes decoding simpler. This also it avoids requiring a Default version of the event which is currently implemented using an "undefined" value for enums. This mechanism is not supported by pyo3. Also, create a dummy object to avoid having mixed complex unit variants in enums. [1] Upstream discussions: PyO3/pyo3#3582 (comment) PyO3/pyo3#3749 Signed-off-by: Adrian Moreno <[email protected]>
It is useful to be able to define a dummy event to test their creation (factories) and manipulation. However, in preparation to add python support, we must ensure pyo3 dependencies only come from retis-events. Therefore, all events must be defined there, including TestEvent. Create a feature to enable this dummy event that is enabled only when "cargo test" is run on the main crate. Signed-off-by: Adrian Moreno <[email protected]>
Signed-off-by: Adrian Moreno <[email protected]>
Signed-off-by: Adrian Moreno <[email protected]>
It would anyway be resolved to 'a and nightly toolchain prefers it to be explicit. Signed-off-by: Adrian Moreno <[email protected]>
It currently uses a struct EventResult that was originally thought to be used when creating events from ebpf as it has a timeout. The file factory is not using this so we can express the result as Result<Option<Event>>. Signed-off-by: Adrian Moreno <[email protected]>
Use pyo3 to represent events and event sections. Notably, this requires some simple enums to implement PartialEq. Also, SectionId::from_str() has to be recovered. EventTypes are given a __repr__() helper that uses the debug formatter. It seems a good first start. EventSections are given a __repr__() helper that prints a dictionary based on the json object as well as a show() helper that formats the event. Co-authored-by: Adrian Moreno <[email protected]> Signed-off-by: Antoine Tenart <[email protected]> Signed-off-by: Adrian Moreno <[email protected]>
Add an python shell as post-process command. Signed-off-by: Antoine Tenart <[email protected]> Co-authored-by: Antoine Tenart <[email protected]> Signed-off-by: Adrian Moreno <[email protected]>
Add a helper to iterate through the available sections of an event. Expose such helper to the python representation. Signed-off-by: Adrian Moreno <[email protected]>
Currently, EventSeries are created, printed and even encoded into json but we do not support reading it back from json. This patch adds support for EventSeries in FileEventFactory. The type of file that is being processed is autodetected upon creation and a differnt interface (next_series()) is used to retrieve EventSeries so that typing is clear. By also adding TrackingInfo to the types that unmarshal, we can now do: ``` retis collect [...] retis sort -o sorted.json retis print sorted.json ``` Signed-off-by: Adrian Moreno <[email protected]>
The features used for pyo3 are different (and not-compatible) for the embedded interpreter and python bindings, so move the interpreter into an independent file that is easy to conditionally compile off. Use maturin to build with stable abi3. Signed-off-by: Adrian Moreno <[email protected]>
Store the result as artifact to ease testing. Signed-off-by: Adrian Moreno <[email protected]>
Signed-off-by: Adrian Moreno <[email protected]>
Use tox to handle virtual env and pytest to as test runner. Add a task in cirrus CI to run them. Signed-off-by: Adrian Moreno <[email protected]>
Python support provided by pyo3 requires linking against python >= 3.6. Building retis on distros with older python versions would fail. In order to work around this issues, make python an optional dependency but enable it by default. On non-supported distros, CARGO_CMD_OPTS="--no-default-features" can be used to build without python support. Signed-off-by: Adrian Moreno <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM; thanks for the hard work, I know this has been quite challenging! I believe this is a great addition.
Python bindings
This PR introduces python bindings and makes them available in two main ways:
In addition, a few refactorings and changes are included first. The most notable of them is adding support to read sorted events (produced by
retis sort
). This is also added to the python bindings.Regarding the python bindings themselves, the main functional differences compared with the RFC are:
Python shell
The embedded shell exposes a global variable called "events" which is of type "EventReader". This class is capable of
reading retis-generated files and iterate through its content.
It's not fully perfect but it's usable for quick prototyping.
Python library
The library is built using maturin and compatible with manylinux2014.
CI integration and (super-basic) unit tests (using tox and pytest) are added for the library. Also, the documentation has instructions to publish to pypi and install in a development environment.
Follow ups