Add SPARQL query for extracting Urdu adjectives (#242) #248

Merged


Ekikereabasi-Nk
Contributor

Contributor checklist


Description

This pull request adds a new SPARQL query that retrieves Urdu adjectives and their grammatical forms from Wikidata. The query is based on the SPARQL code from "Expand Scribe-Data Hindi verb queries" by KesharwaniArpita, and it aims to improve our coverage and representation of Urdu lexicographical data.

Key features of the query:

  1. Fetches the adjective's lemma (base form) from Wikidata.
  2. Retrieves various grammatical forms associated with each adjective.
  3. Filters results to include only forms where the language is set to "ur" (Urdu).
  4. Displays the lexeme ID, lemma, and optional grammatical features for each result.
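
As a rough illustration of the four features listed above, the sketch below assembles a query of this general shape in Python. It is not the query added by this pull request: the function name `build_adjective_query`, the variable names, and the QIDs Q1617 (Urdu) and Q34698 (adjective) are assumptions that should be checked against Wikidata and the merged file. The prefixes `wd:`, `wikibase:`, `dct:` and `ontolex:` are assumed to be predefined, as they are on the Wikidata Query Service.

```python
URDU_QID = "Q1617"        # assumed Wikidata item for the Urdu language
ADJECTIVE_QID = "Q34698"  # assumed Wikidata item for the adjective category


def build_adjective_query(language_qid: str, category_qid: str, lang_code: str) -> str:
    """Return a SPARQL query listing lexemes with their lemma, forms, and features."""
    return f"""
SELECT
    (REPLACE(STR(?lexeme), "http://www.wikidata.org/entity/", "") AS ?lexemeID)
    ?lemma
    ?representation
    ?feature
WHERE {{
    # Feature 1: the lexeme's language, lexical category, and lemma.
    ?lexeme dct:language wd:{language_qid} ;
        wikibase:lexicalCategory wd:{category_qid} ;
        wikibase:lemma ?lemma .
    # Feature 3: keep only lemmas tagged with the requested language code.
    FILTER(LANG(?lemma) = "{lang_code}")
    # Features 2 and 4: forms and their grammatical features are optional.
    OPTIONAL {{
        ?lexeme ontolex:lexicalForm ?form .
        ?form ontolex:representation ?representation ;
            wikibase:grammaticalFeature ?feature .
        FILTER(LANG(?representation) = "{lang_code}")
    }}
}}
""".strip()


query = build_adjective_query(URDU_QID, ADJECTIVE_QID, "ur")
print(query)
```

The printed text can be pasted into the Wikidata Query Service to inspect the results. Because the form block is wrapped in `OPTIONAL`, adjectives without any recorded forms would still appear with their lexeme ID and lemma.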

Context and significance:

  • This query is part of a broader effort to document and analyze Hindustani (Q11051), also known as Hindi-Urdu, a language spoken widely in India and Pakistan.

  • While this specific query focuses on Urdu adjectives, it complements existing work on Hindi verbs, reflecting the close relationship between Hindi and Urdu within the Hindustani language complex.

The SPARQL query has been validated using the Wikidata Query Service.

Related issue


github-actions bot commented Oct 4, 2024

Thank you for the pull request!

The Scribe team will do our best to address your contribution as soon as we can. The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :)

If you're not already a member of our public Matrix community, please consider joining! We'd suggest using Element as your Matrix client, and definitely join the General and Data rooms once you're in. Also consider joining our bi-weekly Saturday dev syncs. It'd be great to have you!

Maintainer checklist

  • The commit messages for the remote branch should be checked to make sure the contributor's email is set up correctly so that they receive credit for their contribution

    • The contributor's name and icon in remote commits should be the same as what appears in the PR
    • If there's a mismatch, the contributor needs to make sure that the email they use for GitHub matches what they have for git config user.email in their local Scribe-Data repo
  • The linting and formatting workflows within the PR checks do not indicate new errors in the files changed

  • The CHANGELOG has been updated with a description of the changes for the upcoming release and the corresponding issue (if necessary)

Contributor

@KesharwaniArpita KesharwaniArpita left a comment


@Ekikereabasi-Nk You may want to check out the following QIDs for your grammatical features:
Singulative Numeral: Q110786
Collective Numeral: Q146786
Oblique Case: Q1233197

@Ekikereabasi-Nk
Contributor Author

@KesharwaniArpita Thank you very much for the information. I will look into it.

@andrewtavis andrewtavis self-requested a review October 5, 2024 18:42
Contributor Author

@Ekikereabasi-Nk Ekikereabasi-Nk left a comment


Hi @andrewtavis, the files have been restored to their original state as requested. All unnecessary files and folders have been removed, so the pull request now includes only the intended work.

@andrewtavis
Member

I did a couple of minor edits here, @Ekikereabasi-Nk, but looking at some of the data, it seems like there are many forms that we could still include, and the existing ones could be made more specific by adding more properties to each form. Do you want to expand the current query and then we'll be good to merge? 😊

@Ekikereabasi-Nk
Contributor Author

Hi @andrewtavis, thanks for reviewing and providing feedback. I appreciate your suggestion to include more forms and add more properties to each form, and I'm happy to expand the query accordingly.

@andrewtavis
Member

Sounds great, @Ekikereabasi-Nk! Looking forward to the further commits! 😊

@andrewtavis andrewtavis added the hacktoberfest-accepted Accepted as a part of Hacktoberfest label Oct 7, 2024
@Ekikereabasi-Nk
Contributor Author

Hi @andrewtavis, I have expanded and updated the queries for Urdu adjectives. Please take a look. Thank you.

Member

@andrewtavis andrewtavis left a comment


As with the other ones, we need to combine the properties to get individual forms. Let's work towards doing this in future queries, @Ekikereabasi-Nk 😊 Thanks so much for the efforts here!
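
One possible reading of "combine the properties" is that each output column should match a single form carrying several grammatical features at once, rather than one feature per row. The sketch below generates such a block in Python. The helper name `combined_form_block` and the variable name `numberCase` are hypothetical, and the QIDs are simply two of those quoted in the review comment earlier in this thread, so their labels should be verified on Wikidata before use.

```python
def combined_form_block(variable: str, feature_qids: list[str]) -> str:
    """Return an OPTIONAL block matching one form that has every listed feature."""
    # In SPARQL, "p o1, o2" asserts the same predicate for both objects,
    # so the form must carry all of the listed grammatical features.
    features = ", ".join(f"wd:{qid}" for qid in feature_qids)
    return (
        "OPTIONAL {\n"
        f"    ?lexeme ontolex:lexicalForm ?{variable}Form .\n"
        f"    ?{variable}Form ontolex:representation ?{variable} ;\n"
        f"        wikibase:grammaticalFeature {features} .\n"
        "}"
    )


# QIDs taken from the review comment in this thread; labels not verified here.
block = combined_form_block("numberCase", ["Q110786", "Q1233197"])
print(block)
```

Each generated block yields one named column per feature combination. Note that this pattern does not exclude forms that carry additional features beyond the ones listed.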

@andrewtavis andrewtavis merged commit a73e98d into scribe-org:main Oct 10, 2024
3 checks passed
Labels
hacktoberfest-accepted Accepted as a part of Hacktoberfest

3 participants