Is there a way to extract a model's download stats (e.g., last 30 days) in time series format? #2390

Open
frr717 opened this issue Jul 14, 2024 · 7 comments

Comments

@frr717

frr717 commented Jul 14, 2024

Is your feature request related to a problem? Please describe.
Currently, I can only find code like the snippet below, which returns a single static data point (the download count for the last 30 days as of today):
from huggingface_hub import model_info
info = model_info("bert-base-uncased")
model_info(info.modelId).downloads

Describe the solution you'd like
I wonder whether Hugging Face could provide a method that takes a date as input, such as:

model_info(info.modelId).get_downloads('20240131')

Describe alternatives you've considered
None so far.
I appreciate any help from all of you!

@Wauplin
Contributor

Wauplin commented Jul 15, 2024

Hi @frr717, thanks for your interest. There is currently no way to get this data as time series. The only information you can get is the downloads in the last 30 days and overall downloads. What would be your use case for a timeseries format?
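Since only the rolling 30-day figure is exposed, one possible workaround (not an official feature) is to record that figure once a day and build the series yourself going forward. The sketch below is a minimal illustration: `fetch_downloads` and `append_snapshot` are hypothetical helper names, the CSV layout is an assumption, and scheduling the daily run (cron or similar) is left out.

```python
import csv
from datetime import date
from pathlib import Path
from typing import Callable, Optional


def fetch_downloads(repo_id: str) -> int:
    """Return the rolling last-30-days download count reported by the Hub."""
    # Imported lazily so the rest of the sketch works without the package installed.
    from huggingface_hub import model_info

    return model_info(repo_id).downloads


def append_snapshot(
    repo_id: str,
    csv_path: str,
    fetch: Callable[[str], int] = fetch_downloads,
    today: Optional[date] = None,
) -> None:
    """Append one (date, repo_id, downloads) row to a CSV file (hypothetical layout)."""
    today = today or date.today()
    path = Path(csv_path)
    is_new = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            # Write the header only when the file is first created.
            writer.writerow(["date", "repo_id", "downloads_last_30_days"])
        writer.writerow([today.isoformat(), repo_id, fetch(repo_id)])
```

Calling `append_snapshot("bert-base-uncased", "downloads.csv")` once per day would accumulate a series of rolling 30-day totals. It cannot recover history from before the first snapshot.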

@frr717
Author

frr717 commented Jul 15, 2024

> Hi @frr717, thanks for your interest. There is currently no way to get this data as time series. The only information you can get is the downloads in the last 30 days and overall downloads. What would be your use case for a timeseries format?

Thank you for your reply.

I am working on a research project that needs this time series to run regression analyses on the companies these models belong to. So I would like to know whether your team plans to implement it.
Thanks!

@frr717
Author

frr717 commented Jul 15, 2024

BTW, I'd like to ask another question about the chart (an SVG element in a model page's HTML) shown in the top-right corner next to "Downloads last month", for example on https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5:
[Screenshot: Snipaste_2024-07-15_18-16-22]
What is the frequency of the data points in the chart?
Take the model linked above as an example: it was created on 2024-05-19. Does the line in the SVG show each day's last-30-day download count since the model's creation?
Thank you!

@julien-c
Member

it's each day in the last 30 days

@frr717
Author

frr717 commented Jul 15, 2024

> it's each day in the last 30 days

thank you!

@frr717
Author

frr717 commented Jul 19, 2024

> it's each day in the last 30 days

hi, @julien-c
The data points in this image have been compressed to the range [0, 100].

Could you kindly tell me the formula it uses?

Thank you!

@julien-c
Member

0 means 0 downloads, i.e., we don't move the origin.

So yes, you can get the daily downloads from the last-30-days total + the graph. It's a bit hacky but it'll work.
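For anyone landing here later, the approach described above can be sketched roughly as follows. Because the origin is not moved, each point's height should be proportional to that day's downloads, so the heights can be rescaled to sum to the last-30-days total. This is only an illustration under assumptions: `estimate_daily_downloads` is a hypothetical helper, the heights are assumed to have already been read from the SVG (SVG y-coordinates grow downward, so they may need to be inverted first), and rounding in the rendered graph makes the result approximate.

```python
from typing import List, Sequence


def estimate_daily_downloads(
    heights: Sequence[float], total_last_30_days: int
) -> List[float]:
    """Rescale graph heights (0 = 0 downloads) so they sum to the 30-day total."""
    if any(h < 0 for h in heights):
        raise ValueError("heights must be non-negative")
    height_sum = sum(heights)
    if height_sum == 0:
        # A flat line at zero carries no information about the daily split.
        return [0.0] * len(heights)
    # Same scale factor for every day, since the origin is fixed at 0.
    scale = total_last_30_days / height_sum
    return [h * scale for h in heights]


# Made-up heights for a 4-day toy example (a real graph would have 30 points).
example_heights = [0, 50, 100, 50]
print(estimate_daily_downloads(example_heights, 400))
# prints [0.0, 100.0, 200.0, 100.0]
```

The 30-day total would come from `model_info(...).downloads`. The heights themselves have to be extracted from the page's SVG, which is not covered here.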
