Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add VARRAY_FOREACH and VPTRS_ITER macros to iterate over arrays and pointer arrays #4239

Closed

Conversation

nigoroll
Copy link
Member

These patches add two macros, semantic patches to start using them and the respective changes:

The VARRAY_FOREACH macro is intended to simplify loops over elements of an array using pointer arithmetic instead of an array index.

The usual pattern of iterating over the array index and then accessing the i-th element of the array is replaced with calculating the end pointer and looping over the array elements.

The macro basically implements

for (pointer = array, end = array + length; pointer < end; pointer++)

This pattern is more efficient but less clear to read when written out. The macro is similar to the *_FOREACH pattern which we already use in many other places with trees, lists and queues, but declares the loop variable locally.

The second macro VPTRS_ITER further simplifies our "array of pointers" case as, for example, used in struct strands.

Here, the loop variable is to be declared outside the macro to support keeping found elements from within the loop.

This macro is intended to simplify loops over elements of an array using pointer
arithmetic instead of an array index.

The usual pattern of iterating over the array index and then accessing the i-th
element of the array is replaced with calculating the end pointer and looping
over the array elements.

The macro basically implements

	for (pointer = array, end = array + length; pointer < end; pointer++)

This pattern is more efficient but less clear to read when written out. The
macro is similar to the *_FOREACH pattern which we already use in many other
places with trees, lists and queues, but declares the loop variable locally.

Remarks on the implementation:

- typeof() is now formalized with C23 but has been a GNU extension for long. We
  use typeof() since e1ac593 as of March 2021.

- The construct 'typeof(*(arr)) * var' (using the element type declaring a
  pointer on it) is used to un-const arr while keeping the constness of the
  element type: If arr is const, '(typeof arr) var = arr' does not work because
  var can not be modified.

  The additional (typeof(var)) cast is to avoid a Flexelint warning (reference
  to pointer).
The coccinelle patch only catches some cases, makes spatch enter a spinloop and
the patch result needs manual polish on top, but at this point the purpose is
only to show use cases for the new macro.
Based on VARRAY_FOREACH, this additional macro further simplifies the case of
arrays of pointer as, for example, used in struct strands.

Here, the loop variable is to be declared outside the macro to support keeping
found elements from within the loop.
we add a semantic patch to turn VARRAY_FOREACH cases into VPTRS_ITER cases where
applicable.
@nigoroll
Copy link
Member Author

nigoroll commented Dec 2, 2024

Bugwash: Overall not sure if this is a net positive (I share this sentiment), but there might be potential with regards to bounds checking.
Ideas to ponder:

@nigoroll
Copy link
Member Author

nigoroll commented Dec 2, 2024

So I pondered the question of better bounds checking and could no come up with anything interesting. An argument could be made that, when iterating with a pointer, out of bounds access might be more obvious than through an array index.
regarding the "generic array type" , we could use

struct varray {
    unsigned magic;
    unsigned length;
    TYPE array[] __attribute__((counted_by(length)));
}

this could be used for strands and possibly other cases, but it would not be embedable in structs and we would need to take care of the flexarray sizeof issue...
On the pro side, this could be passed more easily as a single argument (instead of array + length as separate arguments) and we would get the counted_by checks.
Obviously, this would be wrapped in VARRAY(TYPE) macro.

@nigoroll
Copy link
Member Author

nigoroll commented Dec 9, 2024

bugwash discussed, not currently actionable

@nigoroll
Copy link
Member Author

nigoroll commented Jan 3, 2025

I am going ahead with the idea in new code, and some generalization will arise naturally. We can then see if we want to elevate it later.

@nigoroll nigoroll closed this Jan 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant