Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python bindings #426

Merged
merged 16 commits into from
Oct 24, 2024
Merged

Python bindings #426

merged 16 commits into from
Oct 24, 2024

Conversation

amorenoz
Copy link
Contributor

@amorenoz amorenoz commented Sep 19, 2024

Python bindings

This PR introduces python bindings and makes them available in two main ways:

  • retis python: This command creates an embedded python shell to explore the content of an event file
  • retis python library: A new library (that will be) available in pypi and users can just "pip install" and use.

In addition, a few refactorings and changes are included first. The most notable of them is adding support to read sorted events (produced by retis sort). This is also added to the python bindings.

Regarding the python bindings themselves, the main functional differences compared with the RFC are:

  • Use FileEventsFactory to read events one by one using the iterator protocol (instead of reading the entire file into a list).
  • Bindings for all EventSection and EventTypes.
  • A few additional helper functions (such as section iterators).

Python shell

The embedded shell exposes a global variable called "events" which is of type "EventReader". This class is capable of
reading retis-generated files and iterate through its content.

It's not fully perfect but it's usable for quick prototyping.

Python library

The library is built using maturin and compatible with manylinux2014.

CI integration and (super-basic) unit tests (using tox and pytest) are added for the library. Also, the documentation has instructions to publish to pypi and install in a development environment.

Follow ups

  • Improve access to tuple enums
  • Add some canned scripts
  • Publish retis_events crate and refer to its API documentation in the python module

@amorenoz
Copy link
Contributor Author

amorenoz commented Sep 25, 2024

Just updated the PR with a missing documentation file that I had forgotten to commit :-D. No changes in the code.

Copy link
Contributor

@atenart atenart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks amazing, thanks a lot for working on improving the initial support! It's a lot better with great new features :)

I don't have any strong comment, I'm sure we can smooth the Python usage a bit but that will come over time; the biggest one is probably about the event vs series representation and use but that might be out of scope here. (Also there's a dedicated comment).

In addition to the individual comments I think we should improve the Python documentation, eg. mention the skb-tracking helpers, show more use of what can be done, raw vs non-raw etc. That can be done later tbf (I have some fun examples locally and can make a PR later).

nit: Commit "python: add python event representation." has the same author & co-author (and a trailing '.').

retis-events/src/ovs.rs Outdated Show resolved Hide resolved
retis-events/src/ovs.rs Outdated Show resolved Hide resolved
retis/src/module/ovs/bpf.rs Outdated Show resolved Hide resolved
retis/src/process/tracking.rs Outdated Show resolved Hide resolved
retis-events/src/events.rs Show resolved Hide resolved
retis-events/README.md Outdated Show resolved Hide resolved
retis-events/pyproject.toml Outdated Show resolved Hide resolved
retis-events/pyproject.toml Outdated Show resolved Hide resolved
retis-events/pyproject.toml Outdated Show resolved Hide resolved
retis-derive/src/lib.rs Outdated Show resolved Hide resolved
@atenart
Copy link
Contributor

atenart commented Sep 27, 2024

One thing that I find hard to work with is the event object is a dict, fields can be accessed like e['skb'] but then it's impossible to work with the SkbEvent fields directly (this was also the case in my initial version). This also prevents from having event types helpers after the event section level. The current workaround was to use the raw version (coming from JSON).

With a few additions we can actually access section fields (and below) directly as Python object; which helps with the UX to get them and allows to use per-type helpers:

e = next(events)

print(e['skb'].packet.len)

tcp = e['skb'].tcp
print(tcp.dport)
print(tcp.dport.service_name())  # We could do this for example.

I haven't added a __repr__ and show implementation for event_type so print(e['skb'].ip) and print(e['skb'].ip.show()) do not work. We should do this too. Also, there is at least one point not working as expected, such as accessing ip.version.V4.id (not sure why, haven't looked at it).

Patch:

diff --git a/retis-derive/src/lib.rs b/retis-derive/src/lib.rs
index 355dadf73ce0..5ec26a8c68c2 100644
--- a/retis-derive/src/lib.rs
+++ b/retis-derive/src/lib.rs
@@ -38,6 +38,19 @@ fn enum_is_simple(item: &Item) -> bool {
     }
 }
 
+fn has_named_fields(item: &Item) -> bool {
+    match &item {
+        Item::Struct(item) => match &item.fields {
+            Fields::Named(_) => true,
+            _ => false,
+        },
+        Item::Enum(item) => {
+            item.variants.iter().all(|v| matches!(v.fields, Fields::Named(_)))
+        }
+        _ => false,
+    }
+}
+
 #[proc_macro_attribute]
 pub fn event_type(
     _args: proc_macro::TokenStream,
@@ -50,6 +63,8 @@ pub fn event_type(
     // using underlying integers: See https://pyo3.rs/main/doc/pyo3/attr.pyclass.html.
     let pyclass_args = if enum_is_simple(&parsed) {
         "(eq, eq_int)"
+    } else if has_named_fields(&parsed) {
+        "(get_all)"
     } else {
         ""
     };
diff --git a/retis-events/src/kernel.rs b/retis-events/src/kernel.rs
index 806fb5bb448e..01bd5b76de3c 100644
--- a/retis-events/src/kernel.rs
+++ b/retis-events/src/kernel.rs
@@ -58,3 +58,9 @@ impl EventFmt for StackTrace {
         }
     }
 }
+
+impl pyo3::ToPyObject for StackTrace {
+    fn to_object(&self, py: pyo3::Python<'_>) -> pyo3::PyObject {
+        pyo3::IntoPy::into_py(self.0.clone(), py)
+    }
+}
diff --git a/retis-events/src/net.rs b/retis-events/src/net.rs
index 6001b94829cd..bf5ac795fb2f 100644
--- a/retis-events/src/net.rs
+++ b/retis-events/src/net.rs
@@ -53,9 +53,17 @@ pub(crate) fn protocol_str(protocol: u8) -> Option<&'static str> {
 }
 
 /// Represents a raw packet. Stored internally as a `Vec<u8>`.
+// We don't use #[event_type] as we're implementing serde::Serialize and
+// serde::Deserialize manually.
 #[derive(Clone, Debug)]
 pub struct RawPacket(pub Vec<u8>);
 
+impl pyo3::ToPyObject for RawPacket {
+    fn to_object(&self, py: pyo3::Python<'_>) -> pyo3::PyObject {
+        pyo3::IntoPy::into_py(self.0.clone(), py)
+    }
+}
+
 impl serde::Serialize for RawPacket {
     fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
     where
-- 
2.46.1

@amorenoz
Copy link
Contributor Author

One thing that I find hard to work with is the event object is a dict, fields can be accessed like e['skb'] but then it's impossible to work with the SkbEvent fields directly (this was also the case in my initial version). This also prevents from having event types helpers after the event section level. The current workaround was to use the raw version (coming from JSON).

With a few additions we can actually access section fields (and below) directly as Python object; which helps with the UX to get them and allows to use per-type helpers:

e = next(events)

print(e['skb'].packet.len)

tcp = e['skb'].tcp
print(tcp.dport)
print(tcp.dport.service_name())  # We could do this for example.

I haven't added a __repr__ and show implementation for event_type so print(e['skb'].ip) and print(e['skb'].ip.show()) do not work. We should do this too. Also, there is at least one point not working as expected, such as accessing ip.version.V4.id (not sure why, haven't looked at it).

I tried something similar at the beginning and found many quirks so I left it as a follow up improvement. I can revisit now that things look a bit more stable.

Thanks

@atenart
Copy link
Contributor

atenart commented Sep 30, 2024

One thing that I find hard to work with is the event object is a dict, fields can be accessed like e['skb'] but then it's impossible to work with the SkbEvent fields directly (this was also the case in my initial version). This also prevents from having event types helpers after the event section level. The current workaround was to use the raw version (coming from JSON).
With a few additions we can actually access section fields (and below) directly as Python object; which helps with the UX to get them and allows to use per-type helpers:

e = next(events)

print(e['skb'].packet.len)

tcp = e['skb'].tcp
print(tcp.dport)
print(tcp.dport.service_name())  # We could do this for example.

I haven't added a __repr__ and show implementation for event_type so print(e['skb'].ip) and print(e['skb'].ip.show()) do not work. We should do this too. Also, there is at least one point not working as expected, such as accessing ip.version.V4.id (not sure why, haven't looked at it).

I tried something similar at the beginning and found many quirks so I left it as a follow up improvement. I can revisit now that things look a bit more stable.

Doing this as a follow up is fine, I was just happy to see there's a way forward. Up to you then.

@atenart atenart added this to the v1.5 milestone Oct 2, 2024
@amorenoz
Copy link
Contributor Author

amorenoz commented Oct 2, 2024

One thing that I find hard to work with is the event object is a dict, fields can be accessed like e['skb'] but then it's impossible to work with the SkbEvent fields directly (this was also the case in my initial version). This also prevents from having event types helpers after the event section level. The current workaround was to use the raw version (coming from JSON).
With a few additions we can actually access section fields (and below) directly as Python object; which helps with the UX to get them and allows to use per-type helpers:

e = next(events)

print(e['skb'].packet.len)

tcp = e['skb'].tcp
print(tcp.dport)
print(tcp.dport.service_name())  # We could do this for example.

I haven't added a __repr__ and show implementation for event_type so print(e['skb'].ip) and print(e['skb'].ip.show()) do not work. We should do this too. Also, there is at least one point not working as expected, such as accessing ip.version.V4.id (not sure why, haven't looked at it).

I tried something similar at the beginning and found many quirks so I left it as a follow up improvement. I can revisit now that things look a bit more stable.

Now I remember

One thing that I find hard to work with is the event object is a dict, fields can be accessed like e['skb'] but then it's impossible to work with the SkbEvent fields directly (this was also the case in my initial version). This also prevents from having event types helpers after the event section level. The current workaround was to use the raw version (coming from JSON).
With a few additions we can actually access section fields (and below) directly as Python object; which helps with the UX to get them and allows to use per-type helpers:

e = next(events)

print(e['skb'].packet.len)

tcp = e['skb'].tcp
print(tcp.dport)
print(tcp.dport.service_name())  # We could do this for example.

I haven't added a __repr__ and show implementation for event_type so print(e['skb'].ip) and print(e['skb'].ip.show()) do not work. We should do this too. Also, there is at least one point not working as expected, such as accessing ip.version.V4.id (not sure why, haven't looked at it).

I tried something similar at the beginning and found many quirks so I left it as a follow up improvement. I can revisit now that things look a bit more stable.

Doing this as a follow up is fine, I was just happy to see there's a way forward. Up to you then.

Now I remember, the problem is enums, e.g: for ovs:

In [3]: e["ovs"]                                                    
Out[3]: OvsEvent { event: Some(Upcall(UpcallEvent { cmd: 1, port: 3366920467, cpu: 1 })) }
In [4]: e["ovs"].event                                              
Out[4]: Upcall(UpcallEvent { cmd: 1, port: 3366920467, cpu: 1 })                                                                         

In [5]: e["ovs"].event.cmd                                          
---------------------------------------------------------------------------                                                                                                                                                                                                       
AttributeError                            Traceback (most recent call last)                                                                                                                                                                                                       
Cell In[5], line 1                                                  
----> 1 e["ovs"].event.cmd                                          

AttributeError: 'builtins.OvsEventType_Upcall' object has no attribute 'cmd'                                                                                                                                                                                                      

In [6]: e["ovs"].event._0                                           
Out[6]: UpcallEvent { cmd: 1, port: 3366920467, cpu: 1 }                                                                                 
                                                                          
In [7]: e["ovs"].event._0.cmd                                       
Out[7]: 1                                     

@amorenoz
Copy link
Contributor Author

amorenoz commented Oct 2, 2024

I haven't added a __repr__ and show implementation for event_type so print(e['skb'].ip) and print(e['skb'].ip.show()) do not work. We should do this too. Also, there is at least one point not working as expected, such as accessing ip.version.V4.id (not sure why, haven't looked at it).

I tried something similar at the beginning and found many quirks so I left it as a follow up improvement. I can revisit now that things look a bit more stable.

Doing this as a follow up is fine, I was just happy to see there's a way forward. Up to you then.

Now I remember, the problem is enums, e.g: for ovs:

In [3]: e["ovs"]                                                    
Out[3]: OvsEvent { event: Some(Upcall(UpcallEvent { cmd: 1, port: 3366920467, cpu: 1 })) }
In [4]: e["ovs"].event                                              
Out[4]: Upcall(UpcallEvent { cmd: 1, port: 3366920467, cpu: 1 })                                                                         

In [5]: e["ovs"].event.cmd                                          
---------------------------------------------------------------------------                                                                                                                                                                                                       
AttributeError                            Traceback (most recent call last)                                                                                                                                                                                                       
Cell In[5], line 1                                                  
----> 1 e["ovs"].event.cmd                                          

AttributeError: 'builtins.OvsEventType_Upcall' object has no attribute 'cmd'                                                                                                                                                                                                      

In [6]: e["ovs"].event._0                                           
Out[6]: UpcallEvent { cmd: 1, port: 3366920467, cpu: 1 }                                                                                 
                                                                          
In [7]: e["ovs"].event._0.cmd                                       
Out[7]: 1                                     

I have found a workaround. We can "merge" the type into the enum (same for ip.V4) at the cost of loosing helpers (which we don't have right now):

diff --git a/retis-events/src/ovs.rs b/retis-events/src/ovs.rs
index ff061f66..90667e06 100644
--- a/retis-events/src/ovs.rs
+++ b/retis-events/src/ovs.rs
@@ -31,7 +31,18 @@ pub enum OvsEventType {
     /// Upcall event. It indicates the begining of an upcall. An upcall can have multiple enqueue
     /// events.
     #[serde(rename = "upcall")]
-    Upcall(UpcallEvent),
+    Upcall {
+        /// Upcall command. Holds OVS_PACKET_CMD:
+        ///   OVS_PACKET_CMD_UNSPEC   = 0
+        ///   OVS_PACKET_CMD_MISS     = 1
+        ///   OVS_PACKET_CMD_ACTION   = 2
+        ///   OVS_PACKET_CMD_EXECUTE  = 3
+        cmd: u8,
+        /// Upcall port.
+        port: u32,
+        /// Cpu ID
+        cpu: u32,
+    },
 

@atenart
Copy link
Contributor

atenart commented Oct 2, 2024

I have found a workaround. We can "merge" the type into the enum (same for ip.V4) at the cost of loosing helpers (which we don't have right now):

diff --git a/retis-events/src/ovs.rs b/retis-events/src/ovs.rs
index ff061f66..90667e06 100644
--- a/retis-events/src/ovs.rs
+++ b/retis-events/src/ovs.rs
@@ -31,7 +31,18 @@ pub enum OvsEventType {
     /// Upcall event. It indicates the begining of an upcall. An upcall can have multiple enqueue
     /// events.
     #[serde(rename = "upcall")]
-    Upcall(UpcallEvent),
+    Upcall {
+        /// Upcall command. Holds OVS_PACKET_CMD:
+        ///   OVS_PACKET_CMD_UNSPEC   = 0
+        ///   OVS_PACKET_CMD_MISS     = 1
+        ///   OVS_PACKET_CMD_ACTION   = 2
+        ///   OVS_PACKET_CMD_EXECUTE  = 3
+        cmd: u8,
+        /// Upcall port.
+        port: u32,
+        /// Cpu ID
+        cpu: u32,
+    },
 

Oh, did not realized this was the same limitation for ip.V4/6. This was a known limitation when started playing with pyo3 back then.

I'm not too much in favor of the above, it prevents from having a per-variant type and thus helpers. We should investigate more; eg. maybe we can do the get_all implementation ourselves for such enums using variant names as keys.

@amorenoz
Copy link
Contributor Author

amorenoz commented Oct 2, 2024

I have found a workaround. We can "merge" the type into the enum (same for ip.V4) at the cost of loosing helpers (which we don't have right now):

diff --git a/retis-events/src/ovs.rs b/retis-events/src/ovs.rs
index ff061f66..90667e06 100644
--- a/retis-events/src/ovs.rs
+++ b/retis-events/src/ovs.rs
@@ -31,7 +31,18 @@ pub enum OvsEventType {
     /// Upcall event. It indicates the begining of an upcall. An upcall can have multiple enqueue
     /// events.
     #[serde(rename = "upcall")]
-    Upcall(UpcallEvent),
+    Upcall {
+        /// Upcall command. Holds OVS_PACKET_CMD:
+        ///   OVS_PACKET_CMD_UNSPEC   = 0
+        ///   OVS_PACKET_CMD_MISS     = 1
+        ///   OVS_PACKET_CMD_ACTION   = 2
+        ///   OVS_PACKET_CMD_EXECUTE  = 3
+        cmd: u8,
+        /// Upcall port.
+        port: u32,
+        /// Cpu ID
+        cpu: u32,
+    },
 

Oh, did not realized this was the same limitation for ip.V4/6. This was a known limitation when started playing with pyo3 back then.

I'm not too much in favor of the above, it prevents from having a per-variant type and thus helpers. We should investigate more; eg. maybe we can do the get_all implementation ourselves for such enums using variant names as keys.

I know, it's not nice.
I did run some experiments but failed to implement such manual getter. I need to dig a bit digger but, given in theory V4(SkbIpv4Event) is a tuple (i.e: we could have V4(SkbIpv4Event, SomeOtherThing), each element is named _0, _1, etc. This might change in the future and there might be a better way to access tuple fields. Those attributes seem not available from the rust side...

I agree, if we need helpers, we cannot just remove the type. However, in such case we could have this:

diff --git a/retis-events/src/ovs.rs b/retis-events/src/ovs.rs
index ff061f66..0546f9e2 100644
--- a/retis-events/src/ovs.rs
+++ b/retis-events/src/ovs.rs
@@ -31,7 +31,10 @@ pub enum OvsEventType {
     /// Upcall event. It indicates the begining of an upcall. An upcall can have multiple enqueue
     /// events.
     #[serde(rename = "upcall")]
-    Upcall(UpcallEvent),
+    Upcall {
+        #[serde(flatten)]
+        upcall: UpcallEvent
+     },
 

This workaround:

  1. keeps the field "named"
  2. keeps an explicit struct that can have helpers
  3. serde flattens the json representation removing the extra step

The result (with an unchanged json file):

>>> e
{'skb': {}, 'kernel': {'probe_type': 'raw_tracepoint', 'symbol': 'openvswitch:ovs_dp_upcall'}, 'ovs': {'event_type': 'upcall', 'cpu': 1, 'port': 3366920467, 'cmd': 1}, 'common': {'timestamp': 23868955262984, 'task': {'pid': 1360, 'comm': 'napi/phy0-8197', 'tgid': 1360}, 'smp_id': 2}, 'skb-tracking': {'skb': 18446616546107689472, 'timestamp': 23868955262984, 'orig_head': 18446616576100907520}}
>>> e["ovs"]
OvsEvent { event: Some(Upcall { upcall: UpcallEvent { cmd: 1, port: 3366920467, cpu: 1 } }) }
>>> e["ovs"].event
Upcall { upcall: UpcallEvent { cmd: 1, port: 3366920467, cpu: 1 } }
>>> e["ovs"].event.upcall
UpcallEvent { cmd: 1, port: 3366920467, cpu: 1 }
>>> e["ovs"].event.upcall.cmd
1

Both options (removing the struct and adding an extra field) might be enough for now, WDYT?

@atenart
Copy link
Contributor

atenart commented Oct 3, 2024

diff --git a/retis-events/src/ovs.rs b/retis-events/src/ovs.rs
index ff061f66..0546f9e2 100644
--- a/retis-events/src/ovs.rs
+++ b/retis-events/src/ovs.rs
@@ -31,7 +31,10 @@ pub enum OvsEventType {
     /// Upcall event. It indicates the begining of an upcall. An upcall can have multiple enqueue
     /// events.
     #[serde(rename = "upcall")]
-    Upcall(UpcallEvent),
+    Upcall {
+        #[serde(flatten)]
+        upcall: UpcallEvent
+     },
 

This workaround:

1. keeps the field "named"

2. keeps an explicit struct that can have helpers

3. serde flattens the json representation removing the extra step

The result (with an unchanged json file):

>>> e
{'skb': {}, 'kernel': {'probe_type': 'raw_tracepoint', 'symbol': 'openvswitch:ovs_dp_upcall'}, 'ovs': {'event_type': 'upcall', 'cpu': 1, 'port': 3366920467, 'cmd': 1}, 'common': {'timestamp': 23868955262984, 'task': {'pid': 1360, 'comm': 'napi/phy0-8197', 'tgid': 1360}, 'smp_id': 2}, 'skb-tracking': {'skb': 18446616546107689472, 'timestamp': 23868955262984, 'orig_head': 18446616576100907520}}
>>> e["ovs"]
OvsEvent { event: Some(Upcall { upcall: UpcallEvent { cmd: 1, port: 3366920467, cpu: 1 } }) }
>>> e["ovs"].event
Upcall { upcall: UpcallEvent { cmd: 1, port: 3366920467, cpu: 1 } }
>>> e["ovs"].event.upcall
UpcallEvent { cmd: 1, port: 3366920467, cpu: 1 }
>>> e["ovs"].event.upcall.cmd
1

Both options (removing the struct and adding an extra field) might be enough for now, WDYT?

The above is quite nice as a workaround IMO. The only downside is having to handle this when working with the event in the Rust side, but that's manageable and something we can easily change later if needed.

@amorenoz amorenoz force-pushed the pylib3 branch 4 times, most recently from f3e48bd to 090172b Compare October 4, 2024 11:39
@amorenoz
Copy link
Contributor Author

amorenoz commented Oct 7, 2024

The latest version has:

  • A rework of the OVS events to at least not make them worse to access in python
  • Basic support for direct field access (improvements will come as follow-up series)
  • A rework of iterators allowing different iterators for Series and Event
  • Use standard python interactive shell instead of trying to rewrite
  • CLI arguments change. In preparation of shipping canned scripts, switched the arguments retis [-i events.json] [SCRIPT]

Copy link
Contributor

@atenart atenart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reworks!

Could we improve the following error msg by calling file_type()? Or even just print an already sorted series (sorting something sorted should just return the input)?

$ retis sort retis.sorted 
Error: Incorrect filetype. Use next_series().

Or maybe just improving the underlying error message, it is currently developer-centered but could be user-centered. I guess this was designed with Python use in mind, but maybe such things should be done in the Python part only?

docs/python.md Outdated Show resolved Hide resolved
retis-events/src/python.rs Outdated Show resolved Hide resolved
retis-events/src/python_embed.rs Outdated Show resolved Hide resolved
docs/python.md Outdated Show resolved Hide resolved
.cirrus.yml Outdated Show resolved Hide resolved
retis-events/src/file.rs Outdated Show resolved Hide resolved
RELEASE.md Outdated Show resolved Hide resolved
docs/python.md Outdated Show resolved Hide resolved
docs/python.md Outdated Show resolved Hide resolved
docs/python.md Outdated Show resolved Hide resolved
@amorenoz
Copy link
Contributor Author

amorenoz commented Oct 7, 2024

Thanks for the reworks!

Could we improve the following error msg by calling file_type()? Or even just print an already sorted series (sorting something sorted should just return the input)?

$ retis sort retis.sorted 
Error: Incorrect filetype. Use next_series().

Ack. I'll write something that can serve both purposes.

Or maybe just improving the underlying error message, it is currently developer-centered but could be user-centered. I guess this was designed with Python use in mind, but maybe such things should be done in the Python part only?

For this specific case, how about:

$ retis sort retis_sorted.data 
File already sorted

@atenart
Copy link
Contributor

atenart commented Oct 8, 2024

For this specific case, how about:

$ retis sort retis_sorted.data 
File already sorted

Sounds good, thanks!

@amorenoz amorenoz added the run-functional-tests Request functional tests to be run by CI label Oct 9, 2024
@amorenoz amorenoz force-pushed the pylib3 branch 4 times, most recently from b7c8b51 to 9db8a08 Compare October 18, 2024 12:01
@amorenoz
Copy link
Contributor Author

@atenart FYI It turns out pyo3 requires python >= 3.7 which means it doesn't work with centos-8 :-(. So we do need a non-python build after all.

@atenart atenart closed this Oct 18, 2024
@atenart atenart reopened this Oct 18, 2024
@atenart
Copy link
Contributor

atenart commented Oct 18, 2024

@atenart FYI It turns out pyo3 requires python >= 3.7 which means it doesn't work with centos-8 :-(. So we do need a non-python build after all.

And as stable distro gets older pyo3 as chances to start being incompatible at some point (not only w/ c8s). Being able to build w/o Python is a fix, but removing support when is was previously available is a bummer. Not that I have a solution, I guess that's more a packaging issue than an upstream one.

Also thanks for trying to make all of this work, rebasing and making the CI to pass looks like a long journey :(

amorenoz and others added 11 commits October 21, 2024 19:11
It is not required by the core or by any other piece of infrastructure.
Only some events need to derive from Default because of the way they are
constructed.

Signed-off-by: Adrian Moreno <[email protected]>
OVS module currently makes no assumptions on the order of the event
chunks and tries to build the event even if out-of-order pieces are
received. This makes little sense in practice as these chuncks are sent
in the same hook.

Removing that unneeded requirement and assuming the base action event
(OVS_DP_ACTION) will be received before specific action argument events
(e.g: OVS_DP_ACTION_OUTPUT) makes decoding simpler.

This also it avoids requiring a Default version of the event which is
currently implemented using an "undefined" value for enums. This
mechanism is not supported by pyo3.

Also, create a dummy object to avoid having mixed complex unit variants
in enums.

[1] Upstream discussions:
PyO3/pyo3#3582 (comment)
PyO3/pyo3#3749

Signed-off-by: Adrian Moreno <[email protected]>
It is useful to be able to define a dummy event to test their creation
(factories) and manipulation.

However, in preparation to add python support, we must ensure pyo3
dependencies only come from retis-events. Therefore, all events must be
defined there, including TestEvent.

Create a feature to enable this dummy event that is enabled only when
"cargo test" is run on the main crate.

Signed-off-by: Adrian Moreno <[email protected]>
It would anyway be resolved to 'a and nightly toolchain prefers it to be
explicit.

Signed-off-by: Adrian Moreno <[email protected]>
It currently uses a struct EventResult that was originally thought to be
used when creating events from ebpf as it has a timeout. The file
factory is not using this so we can express the result as
Result<Option<Event>>.

Signed-off-by: Adrian Moreno <[email protected]>
Use pyo3 to represent events and event sections.

Notably, this requires some simple enums to implement PartialEq.
Also, SectionId::from_str() has to be recovered.

EventTypes are given a __repr__() helper that uses the debug formatter.
It seems a good first start.

EventSections are given a __repr__() helper that prints a dictionary
based on the json object as well as a show() helper that formats the
event.

Co-authored-by: Adrian Moreno <[email protected]>
Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: Adrian Moreno <[email protected]>
Add an python shell as post-process command.

Signed-off-by: Antoine Tenart <[email protected]>
Co-authored-by: Antoine Tenart <[email protected]>
Signed-off-by: Adrian Moreno <[email protected]>
Add a helper to iterate through the available sections of an event.
Expose such helper to the python representation.

Signed-off-by: Adrian Moreno <[email protected]>
Currently, EventSeries are created, printed and even encoded into json
but we do not support reading it back from json.

This patch adds support for EventSeries in FileEventFactory.
The type of file that is being processed is autodetected upon creation
and a differnt interface (next_series()) is used to retrieve EventSeries
so that typing is clear.

By also adding TrackingInfo to the types that unmarshal, we can now do:

```
retis collect [...]
retis sort -o sorted.json
retis print sorted.json
```

Signed-off-by: Adrian Moreno <[email protected]>
docs/python.md Show resolved Hide resolved
docs/python.md Outdated Show resolved Hide resolved
The features used for pyo3 are different (and not-compatible)
for the embedded interpreter and python bindings, so move the
interpreter into an independent file that is easy to conditionally
compile off.

Use maturin to build with stable abi3.

Signed-off-by: Adrian Moreno <[email protected]>
Store the result as artifact to ease testing.

Signed-off-by: Adrian Moreno <[email protected]>
Signed-off-by: Adrian Moreno <[email protected]>
Use tox to handle virtual env and pytest to as test runner.
Add a task in cirrus CI to run them.

Signed-off-by: Adrian Moreno <[email protected]>
Python support provided by pyo3 requires linking against python >= 3.6.
Building retis on distros with older python versions would fail. In
order to work around this issues, make python an optional dependency but
enable it by default.

On non-supported distros, CARGO_CMD_OPTS="--no-default-features" can be
used to build without python support.

Signed-off-by: Adrian Moreno <[email protected]>
Copy link
Contributor

@atenart atenart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; thanks for the hard work, I know this has been quite challenging! I believe this is a great addition.

@atenart atenart merged commit 9a4d810 into retis-org:main Oct 24, 2024
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run-functional-tests Request functional tests to be run by CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants