Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regex: update relevant files in compat/regex #1641

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

Chand-ra
Copy link

No description provided.

Chandra Pratap added 2 commits January 13, 2024 00:38
re-apply commit a997bf4: get the regex engine to compile within git

re-apply c01499e: have space around && and || operators

re-apply 19716b2: fix possible overflow errors in binary search

re-apply commit 0bdaacf: define intptr_t and uintptr_t on NonStop

re-apply commit f65d07f: use compat regex with SANITIZE=address

Signed-off-by: Chandra Pratap <[email protected]>

if (BE (length1 < 0 || length2 < 0 || stop < 0, 0))
if (__glibc_unlikely ((length1 < 0 || length2 < 0 || stop < 0
|| ckd_add (&len, length1, length2))))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ckd_add() (or anything else from stdckdint.h) is C23, that isn't going to be accepted any time soon.

re_compile_fastmap (bufp);

if (BE (bufp->no_sub, 0))
if (__glibc_unlikely (bufp->no_sub))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__glibc_unlikely() and __glibc_likely() are probably very simple maceo wrappers around __builtin_expect(), but it might make sense to stick with BE() instead.

@@ -51,30 +62,15 @@
# include "../locale/localeinfo.h"
#endif

#if defined (_MSC_VER)
#include <stdio.h> /* for size_t */
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be the reason for the size_t errors you mentioned.

re_string_t *pstr,
RE_TRANSLATE_TYPE trans, int icase,
const re_dfa_t *dfa) internal_function;
RE_TRANSLATE_TYPE trans, bool icase,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There will be a test baloon for bool in Git 2.44.0, but it's better to not introduce bool in more places right now.

@dscho
Copy link
Member

dscho commented Jan 24, 2024

I gave this a whirl two weeks ago, but wasn't able to finish. The current WIP is here: https://github.com/git/git/compare/dscho:compat-regex-update.

The strategy I followed was to replace the files with newer versions, and then reapplying Git's patches individually, to ease future upgrades. To that end, I ran git cherry-pick $(git rev-list --reverse --no-merges a997bf423d4b..HEAD^ -- compat/regex). The following patches were actually skipped:

  • SKIPPED (already in gnulib): de83172 (Change regerror() declaration from K&R style to ANSI C (C89))
  • SKIPPED (RE_ENABLE_I18N is no longer a thing): b50f370 (compat/regex: define out variables only used under RE_ENABLE_I18N)
  • SKIPPED (already in gnulib): 178b331 (compat/regex: get rid of old-style definition)
  • SKIPPED (already in gnulib): ce518bb (Fix compat/regex ANSIfication on MinGW)
  • SKIPPED (already in gnulib): ce9171c (compat/regex: fix spelling and grammar in comments)
  • SKIPPED (already in gnulib): 749f763 (typofix: in-code comments)
  • SKIPPED (already in gnulib): 56a1a3a (Silence GCC's "cast of pointer to integer of a different size" warning)
  • SKIPPED (already in gnulib): bd8f005 (regex: fix a SIZE_MAX macro redefinition warning)
  • SKIPPED (already in gnulib): 6412757 (Spelling fixes)
  • SKIPPED (already in gnulib, using https:// instead of http:// even): 4842579 (Replace Free Software Foundation address in license notices)
  • SKIPPED (already in gnulib): 03670c8 (Fix spelling errors in no-longer-updated-from-upstream modules)

A couple of thoughts:

  • Instead of creating code blocks to avoid the error "ISO C90 forbids mixed declarations and code", as I have done in the WIP commit, it would probably make more sense to add -Wno-declaration-after-statement to the COMPAT_CFLAGS specifically, potentially limited to compat/regex/* similar to how we do things with mimalloc.
  • I don't know where struct regmatch_list is declared, but this declaration/definition (as well as the regmatch_list_*() support functions) need to be added, too.
  • ckd_add() can probably be defined as a macro that makes use of Git's st_add().
  • The /* fallthru */ construct is used elsewhere in Git's source code, not sure why the FALLTHROUGH definition isn't enough.
  • It might make more sense to try upgrading to gawk's variant of the source code first, and only once that works, try to go directly for gnulib. Not sure. Maybe we're too close already for that to make sense, but then, maybe not.

@dscho
Copy link
Member

dscho commented Jan 24, 2024

The current WIP is here: https://github.com/git/git/compare/dscho:compat-regex-update.

FWIW I got it to compile, but there's quite a lot to clean up and/or change, still. For example, defining ckd_add() only for GCC is a no-go, I would think. Also, the __attribute_* macros need fixin'.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants