Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend SetU #4262

Closed
wants to merge 6 commits into from
Closed

Extend SetU #4262

wants to merge 6 commits into from

Conversation

Rot127
Copy link
Member

@Rot127 Rot127 commented Feb 18, 2024

Your checklist for this pull request

  • I've read the guidelines for contributing to this repository
  • I made sure to follow the project's coding style
  • I've documented or updated the documentation of every function and struct this PR changes. If not so I've explained why.
  • I've added tests that prove my fix is effective or that my feature works (if possible)
  • I've updated the rizin book with the relevant information (if needed)

Detailed description

Did not add the doxygen yet. Wanted to get some feedback first.

  • Add a getter for length of SetU.
  • Add a foreach macro.

Test plan

Added

Closing issues

...

librz/util/set.c Outdated Show resolved Hide resolved
librz/include/rz_util/set.h Outdated Show resolved Hide resolved
@Rot127
Copy link
Member Author

Rot127 commented Feb 19, 2024

How about we change this PR to re-implement the sets properly? SetU works fine, if it uses HtUU (not in this implementation thoug, push some commits later). And we can implement SetP by simply using SetU but casting the pointers to ut64 before.

@Rot127
Copy link
Member Author

Rot127 commented Feb 19, 2024

Also I couldn't find any unit tests specifically for the hash tables. Which should be definitely added.

@ret2libc
Copy link
Member

Also I couldn't find any unit tests specifically for the hash tables. Which should be definitely added.

There are some in test/unit/test_sdb_hash.c (that's becase Ht was originally in SDB, then we moved the code to the main rizin repo but some stuff are still tied to SDB :( )

@Rot127
Copy link
Member Author

Rot127 commented Feb 19, 2024

@ret2libc Any opinion on the foreach macro construction? I found the implemented foreach loop via callbacks really unpractical. But It looks a little wonky. But if you in general approve it, we could have something like this for hash tables as well.

Copy link
Member

@ret2libc ret2libc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the approach with the iter if it works well ;)
Before using it here, though, i'd like to have it in the hashtable as well so we can really test if it works well (with ASAN).

librz/include/rz_util/set.h Outdated Show resolved Hide resolved
librz/include/rz_util/set.h Show resolved Hide resolved
XVilka

This comment was marked as resolved.

@XVilka

This comment was marked as resolved.

librz/include/rz_util/set.h Outdated Show resolved Hide resolved
librz/util/set.c Outdated Show resolved Hide resolved
librz/util/set.c Show resolved Hide resolved
@XVilka XVilka requested a review from pelijah April 21, 2024 02:22
@pelijah
Copy link
Contributor

pelijah commented Apr 21, 2024

Ideally foreach macro should be implemented for HT and be adapted for sets then. But it might be a tough task to implement universal macro considering custom HTs like SdbHt. So for now it's ok to have foreach for sets only. And yet we also need the same for SetS.

@XVilka

This comment was marked as resolved.

Copy link
Contributor

@pelijah pelijah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

foreach and size method for SetS

Comment on lines +87 to +92
SetUIter it;
set_u_foreach (set_u, it) {
x++;
bool matches = it.v == 0x5050505 || it.v == 0x6060606;
mu_assert_true(matches, "Set contained ill-formed value.");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer rz_list_foreach like variant with separate value variable:

Suggested change
SetUIter it;
set_u_foreach (set_u, it) {
x++;
bool matches = it.v == 0x5050505 || it.v == 0x6060606;
mu_assert_true(matches, "Set contained ill-formed value.");
}
SetUIter it;
ut64 val;
set_u_foreach (set_u, it, val) {
x++;
bool matches = val == 0x5050505 || val == 0x6060606;
mu_assert_true(matches, "Set contained ill-formed value.");
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ret2libc your opinion?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't really have strong feelings about it. Separated variable means the type is more explicit, which is probably worth the trade against the extra line. But as long as we have a foreach without a callback, I am happy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for (it = list->head; it && (pos = it->elem, 1); it = it->next)

Adapted for your case: (val = iter.v, 1)

Copy link
Member

@XVilka XVilka Apr 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the one with the explicit value is better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be more important for the SetP, where we have a room for type ambiguity.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the value should always be a pointer. so you can read and write the value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the value should always be a pointer. so you can read and write the value.

For HT, yes. But in case of sets value is a key.

@Rot127
Copy link
Member Author

Rot127 commented Apr 22, 2024

Then let me extend the PR to include more foreach macros:

So the todo list would be:

  • Use @pelijah suggested changes
  • Add foreach for:
    • HtUU (and in extend SetU)
    • HtUS (and in extend SetS)
    • HtSS, HtSU
    • Testing
  • Value should be always be a pointer, so writes are possible.

@pelijah
Copy link
Contributor

pelijah commented Apr 22, 2024

If you are going to implement foreach for HTs, it would be better to rename ht_xx_foreach APIs to avoid code refactoring.
Here an idea of iterator type for ht_inc.h based on yours SetU iterator:

typedef struct Ht_(iter_t) {
	st64 ti; ///< Table index
	ut64 bi; ///< Bucket index
	HT_(Kv) *kv; ///< Current KV-pair.  Stop iteration if kv is NULL. 
} HT(Iter);

Also you need to provide in ht_inc.c some sort of Ht_(iter_init) and Ht_(iter_next) so you can just use the expression below in foreach macros:

for (Ht_(iter_init)(&iter); iter.kv != NULL && ... ; Ht_(iter_next)(&iter))

@Rot127
Copy link
Member Author

Rot127 commented May 18, 2024

Implemented by #4500

@Rot127 Rot127 closed this May 18, 2024
@Rot127 Rot127 deleted the setu branch May 18, 2024 12:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants