Efficiency, Accuracy, Assurance - pick any three.

alanhyue/pandasboost

pandasboost

Data analysis with no compromises. Efficiency, Accuracy, Assurance - pick any three.

Documentation

Sub-modules

Functions

Function check_keep

def check_keep(
    frame,
    query,
    desc
)

Filter a dataframe with query and report the number of rows affected.

Parameters

query : str : Query for filtering the dataframe. Will be passed to pandas.DataFrame.query.

desc : str : Description of the filter.
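As a rough illustration of the behaviour described above, the following sketch filters with pandas.DataFrame.query and prints how many rows were kept and dropped. The name check_keep_sketch, the report wording, and the choice to return the filtered frame are assumptions made for this example, not the library's actual implementation.

```python
import pandas as pd


def check_keep_sketch(frame, query, desc):
    """Hypothetical stand-in: filter `frame` with `query` and report row counts."""
    kept = frame.query(query)
    dropped = len(frame) - len(kept)
    # The report format below is an assumption, not pandasboost's real output.
    print(f"{desc}: kept {len(kept)} of {len(frame)} rows ({dropped} dropped)")
    return kept


df = pd.DataFrame(
    {"age": [15, 22, 37, 41], "country": ["US", "US", "CA", "US"]}
)
adults = check_keep_sketch(df, "age >= 18", "Keep adults only")
```

Reporting the count at each filtering step makes it easier to spot a filter that removes far more (or fewer) rows than expected.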

Function levels

def levels(
    dataframe,
    show_values=True
)

Report the number of unique values (levels) for each variable. Useful to inspect categorical variables.

Parameters

show_values : bool : Whether to report a short sample of level values.
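A minimal sketch of this kind of report, built with plain pandas, might look like the following. The function name levels_sketch, the sample_size parameter, and the layout of the returned table are hypothetical; the real function may present its report differently.

```python
import pandas as pd


def levels_sketch(dataframe, show_values=True, sample_size=3):
    """Hypothetical stand-in: count unique non-missing values per column."""
    rows = []
    for col in dataframe.columns:
        uniques = dataframe[col].dropna().unique()
        row = {"variable": col, "levels": len(uniques)}
        if show_values:
            # Show only the first few levels so wide categoricals stay readable.
            row["sample"] = ", ".join(str(v) for v in uniques[:sample_size])
        rows.append(row)
    return pd.DataFrame(rows).set_index("variable")


df = pd.DataFrame(
    {"country": ["US", "US", "CA", "US"], "age": [15, 22, 37, 41]}
)
report = levels_sketch(df)
print(report)
```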

Function nmissing

def nmissing(
    dataframe,
    show_all=False
)

Count the missing values in each column of the dataframe.

Parameters

show_all : bool : Whether to report all columns. False to show only columns with one or more missing values.
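The idea can be approximated in plain pandas as shown below. This is only a sketch: nmissing_sketch is a hypothetical name, and returning a Series of counts is an assumption about the output shape.

```python
import pandas as pd


def nmissing_sketch(dataframe, show_all=False):
    """Hypothetical stand-in: count missing values per column."""
    counts = dataframe.isna().sum()
    if not show_all:
        # Keep only the columns that have at least one missing value.
        counts = counts[counts > 0]
    return counts


df = pd.DataFrame(
    {
        "a": [1.0, None, 3.0],
        "b": ["x", "y", "z"],
        "c": [None, None, 1.0],
    }
)
print(nmissing_sketch(df))
print(nmissing_sketch(df, show_all=True))
```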

Module pandasboost.formatter

Functions

Function bignum

def bignum(
    n,
    precision=0
)

Format a large number in business-style shorthand (for example, thousands as K).

Example
>>> bignum(123456)
Output: 123K
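One way such a formatter could work is sketched below. Only the K suffix is confirmed by the example above; the M, B, and T suffixes, the name bignum_sketch, and the rounding behaviour are assumptions for illustration.

```python
def bignum_sketch(n, precision=0):
    """Hypothetical stand-in: shorten a large number with a magnitude suffix."""
    # Suffixes other than "K" are assumed, not taken from pandasboost.
    scales = ((1e12, "T"), (1e9, "B"), (1e6, "M"), (1e3, "K"))
    for threshold, suffix in scales:
        if abs(n) >= threshold:
            return f"{n / threshold:.{precision}f}{suffix}"
    # Numbers below one thousand are returned without a suffix.
    return f"{n:.{precision}f}"


print(bignum_sketch(123456))
print(bignum_sketch(2500, 1))
```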

Function format_percentage

def format_percentage(
    n,
    precision='auto'
)

Format a decimal number as a percentage.

Parameters

n : float : The number to format.

precision : int, str, default 'auto' : The number of decimal places in the result. With the default 'auto', the smallest precision at which the result is not zero is chosen automatically.

Examples

format_percentage(0.001) ==> '0.1%'
format_percentage(-0.0000010009) ==> '-0.0001%'
format_percentage(0.001, 4) ==> '0.1000%'
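The 'auto' rule can be read as: increase the number of decimal places until the formatted value stops being zero. The sketch below implements that reading in plain Python. The name format_percentage_sketch and the max_precision cap are hypothetical, and the real function may handle edge cases differently.

```python
def format_percentage_sketch(n, precision="auto", max_precision=15):
    """Hypothetical stand-in: format a decimal fraction as a percentage string."""
    pct = n * 100
    if precision != "auto":
        return f"{pct:.{precision}f}%"
    # Try 0, 1, 2, ... decimal places and stop at the first non-zero result.
    for places in range(max_precision + 1):
        text = f"{pct:.{places}f}"
        if float(text) != 0:
            return text + "%"
    # Zero (or a value too small for max_precision) falls back to "0%".
    return "0%"


print(format_percentage_sketch(0.001))
print(format_percentage_sketch(-0.0000010009))
print(format_percentage_sketch(0.001, 4))
```

The max_precision cap is there so that an input of exactly zero cannot loop forever.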

Function cut_groups

def cut_groups(
    srs,
    rules,
    right=True,
    missing='missing'
)

Function frequency

def frequency(
    srs,
    business=True,
    ascending=None,
    by_index=False
)

Report frequency of values.

Parameters

ascending : boolean, default None : Whether to sort in ascending order. If None, the result is sorted ascending when sorting by index and descending when sorting by frequency.

by_index : boolean, default False : Whether to sort the result by index rather than by frequency.
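The sorting rule for ascending and by_index can be sketched with pandas.Series.value_counts as follows. The name frequency_sketch and the count/share columns are assumptions, and the business parameter is left out because its behaviour is not documented here.

```python
import pandas as pd


def frequency_sketch(srs, ascending=None, by_index=False):
    """Hypothetical stand-in: tabulate value counts with the default sort rule."""
    counts = srs.value_counts()
    if ascending is None:
        # Ascending when sorting by index, descending when sorting by frequency.
        ascending = by_index
    if by_index:
        counts = counts.sort_index(ascending=ascending)
    else:
        counts = counts.sort_values(ascending=ascending)
    table = counts.to_frame(name="count")
    table["share"] = table["count"] / table["count"].sum()
    return table


srs = pd.Series(["b", "a", "b", "c", "b", "a"])
by_freq = frequency_sketch(srs)
by_idx = frequency_sketch(srs, by_index=True)
print(by_freq)
print(by_idx)
```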
