Expand Scribe-Data Hindi queries #212
Comments
Hi @andrewtavis, while I am working on this, do you think you can assign this issue to me so there's no confusion later?
Yes definitely, @KesharwaniArpita :) Thanks for your willingness to help!
Hi @andrewtavis, I’ve raised a PR for expanding the Hindi verb extraction query. Initially I was working with verb tenses, but after digging into the data on Wikidata through lexeme IDs, I realized that for Hindi the available data focuses more on "कारक" (Kārak) forms (which express the relationships between words in a sentence) rather than tenses. I believe the same is going to be the case for other languages (like Urdu and Bengali). So I shifted gears and built a modified query based on these Kārak forms, things like direct case, gerund, intransitive phase, and more. I’ve tested the updated query and saved the results too. Would love to get your thoughts on this approach and any suggestions you might have! Also, can I check out the other languages?
Thanks so much for your hard work, @KesharwaniArpita! Do you want to put a note about this into the Urdu issue and check for Bengali as well? I'm pretty sure that Bengali verbs are modeled the way that they should be, as the Wikidata Bengali community is very good 😊 By all means check out other languages as you already have! We'll get to the review in the coming days :)
@andrewtavis Sure. Thank you!!!! Should I raise the issue for Bengali?
The Bengali verbs have already been checked a while ago, so maybe you can check that query and see if you'd change/expand it in any way :) CC @mhmohona, who originally wrote the Bengali query :)
Sure!!! I'll look into that.
I want to participate in this issue too, can I do that? I am new to SPARQL but could collaborate and contribute. Thanks! 😊
Hi @SethiShreya! 😊 I'd love to collaborate. I'm new to SPARQL as well. Let's work together! Looking forward to your thoughts. Thanks! 🙌
Yeah, that would be great, let's connect sometime to collaborate further.
It would be helpful if you could tell me how much progress has been made on this issue and which features still need to be added.
Thanks for sharing, I will look into it @KesharwaniArpita
As discussed with @KesharwaniArpita, there are things that we can expand for the Hindi language: gender for nouns, adjectives, prepositions, adverbs, etc. We have discussed collaborating, so I will be working on gender for nouns and she will work on adjectives. Is that correct, @andrewtavis?
Sounds great, @SethiShreya! Thank you both for the collaboration and coordination!
@KesharwaniArpita I reviewed the files on Hindi nouns and gender is already done, right?
@andrewtavis I want to work on a query for the Punjabi language (an Indian language), can you please make an issue for that?
@SethiShreya, you can try working on conjunctions or prepositions and their cases, if you'd like?
Just added a list of the data types that we want to include to this issue :) I've marked those that are already done or have PRs open, and we can work on the others 😊 If a data type can't work, then we can move on to the others and open up specific issues later :)
Description
This issue would expand the queries for Hindi that are found in src/scribe_data/language_data_extraction/Hindi. As of now the nouns query is likely fairly good, but we need to add verb conjugations to the verbs query as is done in other languages :)
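For anyone new to SPARQL, here is a rough sketch of how a verbs query with one optional block per form could be assembled. It is not the query from the PR or the one in the Scribe-Data repo: the helper name `build_verb_form_query`, the variable names and the feature QIDs in the usage note are hypothetical placeholders. It only assumes the standard Wikidata lexeme predicates (`dct:language`, `wikibase:lexicalCategory`, `wikibase:lemma`, `ontolex:lexicalForm`, `ontolex:representation`, `wikibase:grammaticalFeature`), whose prefixes are predefined on the Wikidata Query Service.

```python
VERB_QID = "Q24905"  # Wikidata item for the lexical category "verb"


def build_verb_form_query(language_qid, form_features, lang_code="hi"):
    """Return a SPARQL query string for verb lexemes and selected forms.

    language_qid  -- QID of the lexeme language (to be looked up on Wikidata)
    form_features -- maps an output column name to the list of
                     grammatical-feature QIDs a form must carry
    lang_code     -- language tag used to filter lemmas and representations
    """
    for name, qids in form_features.items():
        if not name.isidentifier():
            raise ValueError(f"not a usable variable name: {name!r}")
        if not qids:
            raise ValueError(f"no grammatical features given for {name!r}")

    select_vars = " ".join(f"?{name}" for name in form_features)
    lines = [
        f"SELECT ?lexeme ?lemma {select_vars}".rstrip(),
        "WHERE {",
        f"  ?lexeme dct:language wd:{language_qid} ;",
        f"          wikibase:lexicalCategory wd:{VERB_QID} ;",
        "          wikibase:lemma ?lemma .",
        f'  FILTER(LANG(?lemma) = "{lang_code}")',
    ]

    # One OPTIONAL block per form, so a verb that lacks a form is still returned.
    for name, qids in form_features.items():
        features = ", ".join(f"wd:{qid}" for qid in qids)
        lines += [
            "  OPTIONAL {",
            f"    ?lexeme ontolex:lexicalForm ?{name}Form .",
            f"    ?{name}Form ontolex:representation ?{name} ;",
            f"      wikibase:grammaticalFeature {features} .",
            f'    FILTER(LANG(?{name}) = "{lang_code}")',
            "  }",
        ]

    lines.append("}")
    return "\n".join(lines)
```

Calling something like `build_verb_form_query("Q_LANGUAGE", {"directCase": ["Q_FEATURE"]})` gives a query string that can be pasted into the Wikidata Query Service once the placeholders are swapped for real QIDs. Which features are actually attached to Hindi verb forms has to be checked on the lexemes themselves.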
Data types to include:
Contribution
Happy to support with this and answer any questions that come up! Also happy to review when it's time 😊