ivdatahub/api-to-dataframe

API to DataFrame

Python library that simplifies obtaining data from API endpoints by converting the responses directly into pandas DataFrames. It also provides configurable retry strategies for failed requests.

Project Stack

Python, Docker, Poetry, GitHub Actions, Codecov, PyPI, pandas, pytest

Installation

To install the package using pip, use the following command:

pip install api-to-dataframe

To install the package using poetry, use the following command:

poetry add api-to-dataframe
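
After installing, it can be useful to confirm that the package is visible to the interpreter you intend to use. The snippet below is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, not part of api-to-dataframe.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("api-to-dataframe"))
```

If this prints None, the package was most likely installed into a different environment than the one running the script.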

User Guide

# Import the library
from api_to_dataframe import ClientBuilder, RetryStrategies

# Create a client for simple data ingestion from an API (1-second timeout)
client = ClientBuilder(endpoint="https://api.example.com")

# To set a timeout, use LINEAR_RETRY_STRATEGY, and send custom headers:
headers = {
    "application_name": "api_to_dataframe"
}
client = ClientBuilder(
    endpoint="https://api.example.com",
    retry_strategy=RetryStrategies.LINEAR_RETRY_STRATEGY,
    connection_timeout=2,
    headers=headers,
)

# To use EXPONENTIAL_RETRY_STRATEGY with a custom retry count and initial delay:
client = ClientBuilder(
    endpoint="https://api.example.com",
    retry_strategy=RetryStrategies.EXPONENTIAL_RETRY_STRATEGY,
    connection_timeout=10,
    headers=headers,
    retries=5,
    initial_delay=10,
)

# Get data from the API
data = client.get_api_data()

# Convert the data to a DataFrame
df = client.api_to_dataframe(data)

# Display the DataFrame
print(df)
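
To give a feel for what the conversion step produces, the sketch below builds DataFrames from JSON-like records using pandas directly. It does not show the library's own conversion logic, which is not documented here; the payloads are made up for illustration and no request is sent.

```python
import pandas as pd

# Made-up records standing in for a decoded JSON response (not real API data).
records = [
    {"id": 1, "name": "alpha", "active": True},
    {"id": 2, "name": "beta", "active": False},
]

# A list of flat dictionaries maps to one row per record and one column per key.
df = pd.DataFrame(records)
print(df)

# Nested objects can be flattened into dotted column names with json_normalize.
nested = [{"id": 1, "user": {"name": "alpha"}}]
flat = pd.json_normalize(nested)
print(list(flat.columns))
```

Whether a real response maps cleanly to rows and columns depends on its shape, so it is worth inspecting the raw data before relying on the resulting DataFrame.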

Important notes:

  • Optional parameters: connection_timeout, retry_strategy, and headers are optional.

  • Default parameter values: By default, the number of retries is 3 and the delay between retries is 1 second; both can be set manually.

  • Maximum retries: To protect the API server, the number of retries is capped. The cap is currently 5 and is defined in a library constant. You can pass any value in the retries parameter, but the library will retry at most 5 times.

  • Exponential Retry Strategy: The wait time between retries is the value passed in the initial_delay parameter * 2 * the retry number, e.g. with initial_delay=2:

    RetryNumber  WaitingTime
    1            2s
    2            4s
    3            6s
    4            8s
    5            10s

  • Linear Retry Strategy: The wait time between retries is constant and equal to the value passed in initial_delay, e.g. with initial_delay=2:

    RetryNumber  WaitingTime
    1            2s
    2            2s
    3            2s
    4            2s
    5            2s
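
The retry cap and the two wait schedules in the notes above can be sketched in a few lines. This is an illustration only, not the library's code: `wait_schedule`, `MAX_RETRIES`, and the strategy labels are hypothetical names. The growing schedule assumes the wait is initial_delay * retry_number, which is the rule that reproduces the table values for initial_delay=2; the formula given in the note carries an extra factor of 2, so treat the exact multiplier as unverified.

```python
from typing import List

# Cap described in the notes; the name of the library's constant is an assumption.
MAX_RETRIES = 5


def wait_schedule(strategy: str, retries: int = 3, initial_delay: int = 1) -> List[int]:
    """Return the assumed wait time in seconds before each retry."""
    # Requests for more retries than the cap are silently limited to the cap.
    attempts = min(retries, MAX_RETRIES)
    if strategy == "linear":
        # Same wait before every retry.
        return [initial_delay for _ in range(attempts)]
    if strategy == "growing":
        # Assumed rule matching the table: wait grows with the retry number.
        return [initial_delay * n for n in range(1, attempts + 1)]
    raise ValueError(f"unknown strategy: {strategy!r}")


print(wait_schedule("linear", retries=5, initial_delay=2))
print(wait_schedule("growing", retries=5, initial_delay=2))
print(wait_schedule("growing", retries=10, initial_delay=2))
```

The last call asks for 10 retries but yields only five waits, mirroring the documented cap.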
