Skip to content

Commit

Permalink
Merge pull request #1815 from ranaroussi/dev
Browse files Browse the repository at this point in the history
sync dev -> main
  • Loading branch information
ValueRaider authored Jan 6, 2024
2 parents 8975689 + 477dc6e commit 49d8dfd
Show file tree
Hide file tree
Showing 19 changed files with 717 additions and 478 deletions.
13 changes: 13 additions & 0 deletions .github/workflows/ruff.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
name: Ruff
on:
pull_request:
branches:
- master
- main
- dev
jobs:
ruff:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: chartboost/ruff-action@v1
67 changes: 28 additions & 39 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,26 @@ Yahoo! finance API is intended for personal use only.**

---

## Installation

Install `yfinance` using `pip`:

``` {.sourceCode .bash}
$ pip install yfinance --upgrade --no-cache-dir
```

[With Conda](https://anaconda.org/ranaroussi/yfinance).

To install with optional dependencies, replace `optional` with: `nospam` for [caching-requests](#smarter-scraping), `repair` for [price repair](https://github.com/ranaroussi/yfinance/wiki/Price-repair), or `nospam,repair` for both:

``` {.sourceCode .bash}
$ pip install yfinance[optional]
```

[Required dependencies](./requirements.txt) , [all dependencies](./setup.py#L62).

---

## Quick Start

### The Ticker module
Expand Down Expand Up @@ -87,6 +107,9 @@ msft.quarterly_cashflow
msft.major_holders
msft.institutional_holders
msft.mutualfund_holders
msft.insider_transactions
msft.insider_purchases
msft.insider_roster_holders

# Show future and historic earnings dates, returns at most next 4 quarters and last 8 quarters by default.
# Note: If more are needed use msft.get_earnings_dates(limit=XX) with increased limit argument.
Expand Down Expand Up @@ -155,9 +178,10 @@ data = yf.download("SPY AAPL", period="1mo")

### Smarter scraping

To use a custom `requests` session (for example to cache calls to the
API or customize the `User-agent` header), pass a `session=` argument to
the Ticker constructor.
Install the `nospam` packages for smarter scraping using `pip` (see [Installation](#installation)). These packages help cache calls such that Yahoo is not spammed with requests.

To use a custom `requests` session, pass a `session=` argument to
the Ticker constructor. This allows for caching calls to the API as well as a custom way to modify requests via the `User-agent` header.

```python
import requests_cache
Expand All @@ -168,7 +192,7 @@ ticker = yf.Ticker('msft', session=session)
ticker.actions
```

Combine a `requests_cache` with rate-limiting to avoid triggering Yahoo's rate-limiter/blocker that can corrupt data.
Combine `requests_cache` with rate-limiting to avoid triggering Yahoo's rate-limiter/blocker that can corrupt data.
```python
from requests import Session
from requests_cache import CacheMixin, SQLiteCache
Expand Down Expand Up @@ -232,41 +256,6 @@ yf.set_tz_cache_location("custom/cache/location")

---

## Installation

Install `yfinance` using `pip`:

``` {.sourceCode .bash}
$ pip install yfinance --upgrade --no-cache-dir
```

Test new features by installing betas, provide feedback in [corresponding Discussion](https://github.com/ranaroussi/yfinance/discussions):
``` {.sourceCode .bash}
$ pip install yfinance --upgrade --no-cache-dir --pre
```

To install `yfinance` using `conda`, see
[this](https://anaconda.org/ranaroussi/yfinance).

### Requirements

- [Python](https://www.python.org) \>= 2.7, 3.4+
- [Pandas](https://github.com/pydata/pandas) \>= 1.3.0
- [Numpy](http://www.numpy.org) \>= 1.16.5
- [requests](http://docs.python-requests.org/en/master) \>= 2.31
- [lxml](https://pypi.org/project/lxml) \>= 4.9.1
- [appdirs](https://pypi.org/project/appdirs) \>= 1.4.4
- [pytz](https://pypi.org/project/pytz) \>=2022.5
- [frozendict](https://pypi.org/project/frozendict) \>= 2.3.4
- [beautifulsoup4](https://pypi.org/project/beautifulsoup4) \>= 4.11.1
- [html5lib](https://pypi.org/project/html5lib) \>= 1.1
- [peewee](https://pypi.org/project/peewee) \>= 3.16.2

#### Optional (if you want to use `pandas_datareader`)

- [pandas\_datareader](https://github.com/pydata/pandas-datareader)
\>= 0.4.0

## Developers: want to contribute?

`yfinance` relies on community to investigate bugs and contribute code. Developer guide: https://github.com/ranaroussi/yfinance/discussions/1084
Expand Down
4 changes: 4 additions & 0 deletions setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,10 @@
'lxml>=4.9.1', 'appdirs>=1.4.4', 'pytz>=2022.5',
'frozendict>=2.3.4', 'peewee>=3.16.2',
'beautifulsoup4>=4.11.1', 'html5lib>=1.1'],
extras_require={
'nospam': ['requests_cache>=1.0', 'requests_ratelimiter>=0.3.1'],
'repair': ['scipy>=1.6.3'],
},
# Note: Pandas.read_html() needs html5lib & beautifulsoup4
entry_points={
'console_scripts': [
Expand Down
14 changes: 6 additions & 8 deletions tests/context.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,18 +4,20 @@
import datetime as _dt
import sys
import os
import yfinance
from requests import Session
from requests_cache import CacheMixin, SQLiteCache
from requests_ratelimiter import LimiterMixin, MemoryQueueBucket
from pyrate_limiter import Duration, RequestRate, Limiter

_parent_dp = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
_src_dp = _parent_dp
sys.path.insert(0, _src_dp)

import yfinance


# Optional: see the exact requests that are made during tests:
# import logging
# logging.basicConfig(level=logging.DEBUG)


# Use adjacent cache folder for testing, delete if already exists and older than today
testing_cache_dirpath = os.path.join(_ad.user_cache_dir(), "py-yfinance-testing")
yfinance.set_tz_cache_location(testing_cache_dirpath)
Expand All @@ -27,12 +29,8 @@


# Setup a session to rate-limit and cache persistently:
from requests import Session
from requests_cache import CacheMixin, SQLiteCache
from requests_ratelimiter import LimiterMixin, MemoryQueueBucket
class CachedLimiterSession(CacheMixin, LimiterMixin, Session):
pass
from pyrate_limiter import Duration, RequestRate, Limiter
history_rate = RequestRate(1, Duration.SECOND*2)
limiter = Limiter(history_rate)
cache_fp = os.path.join(testing_cache_dirpath, "unittests-cache")
Expand Down
14 changes: 7 additions & 7 deletions tests/data/CNE-L-1d-bad-stock-split-fixed.csv
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,10 @@ Date,Open,High,Low,Close,Adj Close,Volume,Dividends,Stock Splits
2023-05-18 00:00:00+01:00,193.220001220703,200.839996337891,193.220001220703,196.839996337891,196.839996337891,653125,0,0
2023-05-17 00:00:00+01:00,199.740005493164,207.738006591797,190.121994018555,197.860000610352,197.860000610352,822268,0,0
2023-05-16 00:00:00+01:00,215.600006103516,215.600006103516,201.149993896484,205.100006103516,205.100006103516,451009,243.93939,0.471428571428571
2023-05-15 00:00:00+01:00,215.399955531529,219.19995640346,210.599967302595,217.399987792969,102.39998147147,1761679.3939394,0,0
2023-05-12 00:00:00+01:00,214.599988664899,216.199965558733,209.599965558733,211.399977329799,99.573855808803,1522298.48484849,0,0
2023-05-11 00:00:00+01:00,219.999966430664,219.999966430664,212.199987357003,215.000000871931,101.269541277204,3568042.12121213,0,0
2023-05-10 00:00:00+01:00,218.199954659598,223.000000435965,212.59995640346,215.399955531529,101.457929992676,5599908.78787879,0,0
2023-05-09 00:00:00+01:00,224,227.688003540039,218.199996948242,218.399993896484,102.87100982666,1906090,0,0
2023-05-05 00:00:00+01:00,220.999968174526,225.19996686663,220.799976457868,224.4,105.697140066964,964523.636363637,0,0
2023-05-04 00:00:00+01:00,216.999989972796,222.799965558733,216.881988961356,221.399965994698,104.284055655343,880983.93939394,0,0
2023-05-15 00:00:00+01:00,456.9090,464.9696,446.7272,461.1515,217.2121,830506.0000,0,0
2023-05-12 00:00:00+01:00,455.2121,458.6060,444.6060,448.4242,211.2173,717655.0000,0,0
2023-05-11 00:00:00+01:00,466.6666,466.6666,450.1212,456.0606,214.8142,1682077.0000,0,0
2023-05-10 00:00:00+01:00,462.8484,473.0303,450.9696,456.9090,215.2138,2639957.0000,0,0
2023-05-09 00:00:00+01:00,475.1515,482.9746,462.8485,463.2727,218.2112,898585.2857,0,0
2023-05-05 00:00:00+01:00,468.7878,477.6969,468.3636,476.0000,224.2061,454704.0000,0,0
2023-05-04 00:00:00+01:00,460.3030,472.6060,460.0527,469.6363,221.2086,415321.0000,0,0
91 changes: 18 additions & 73 deletions tests/prices.py
Original file line number Diff line number Diff line change
Expand Up @@ -144,7 +144,6 @@ def test_pricesEventsMerge(self):

def test_pricesEventsMerge_bug(self):
# Reproduce exception when merging intraday prices with future dividend
tkr = 'S32.AX'
interval = '30m'
df_index = []
d = 13
Expand All @@ -160,7 +159,7 @@ def test_pricesEventsMerge_bug(self):
future_div_dt = _dt.datetime(2023, 9, 14, 10)
divs = _pd.DataFrame(data={"Dividends":[div]}, index=[future_div_dt])

df2 = yf.utils.safe_merge_dfs(df, divs, interval)
yf.utils.safe_merge_dfs(df, divs, interval)
# No exception = test pass

def test_intraDayWithEvents(self):
Expand Down Expand Up @@ -235,8 +234,10 @@ def test_dailyWithEvents(self):
self.assertTrue((df_divs.index.date == dates).all())
except AssertionError:
print(f'- ticker = {tkr}')
print('- response:') ; print(df_divs.index.date)
print('- answer:') ; print(dates)
print('- response:')
print(df_divs.index.date)
print('- answer:')
print(dates)
raise

def test_dailyWithEvents_bugs(self):
Expand Down Expand Up @@ -282,60 +283,6 @@ def test_dailyWithEvents_bugs(self):
self.assertTrue(df_merged[df_prices.columns].iloc[1:].equals(df_prices))
self.assertEqual(df_merged.index[0], div_dt)

def test_intraDayWithEvents(self):
tkrs = ["BHP.AX", "IMP.JO", "BP.L", "PNL.L", "INTC"]
test_run = False
for tkr in tkrs:
start_d = _dt.date.today() - _dt.timedelta(days=59)
end_d = None
df_daily = yf.Ticker(tkr, session=self.session).history(start=start_d, end=end_d, interval="1d", actions=True)
df_daily_divs = df_daily["Dividends"][df_daily["Dividends"] != 0]
if df_daily_divs.shape[0] == 0:
continue

last_div_date = df_daily_divs.index[-1]
start_d = last_div_date.date()
end_d = last_div_date.date() + _dt.timedelta(days=1)
df_intraday = yf.Ticker(tkr, session=self.session).history(start=start_d, end=end_d, interval="15m", actions=True)
self.assertTrue((df_intraday["Dividends"] != 0.0).any())

df_intraday_divs = df_intraday["Dividends"][df_intraday["Dividends"] != 0]
df_intraday_divs.index = df_intraday_divs.index.floor('D')
self.assertTrue(df_daily_divs.equals(df_intraday_divs))

test_run = True

if not test_run:
self.skipTest("Skipping test_intraDayWithEvents() because no tickers had a dividend in last 60 days")

def test_intraDayWithEvents_tase(self):
# TASE dividend release pre-market, doesn't merge nicely with intra-day data so check still present

tase_tkrs = ["ICL.TA", "ESLT.TA", "ONE.TA", "MGDL.TA"]
test_run = False
for tkr in tase_tkrs:
start_d = _dt.date.today() - _dt.timedelta(days=59)
end_d = None
df_daily = yf.Ticker(tkr, session=self.session).history(start=start_d, end=end_d, interval="1d", actions=True)
df_daily_divs = df_daily["Dividends"][df_daily["Dividends"] != 0]
if df_daily_divs.shape[0] == 0:
continue

last_div_date = df_daily_divs.index[-1]
start_d = last_div_date.date()
end_d = last_div_date.date() + _dt.timedelta(days=1)
df_intraday = yf.Ticker(tkr, session=self.session).history(start=start_d, end=end_d, interval="15m", actions=True)
self.assertTrue((df_intraday["Dividends"] != 0.0).any())

df_intraday_divs = df_intraday["Dividends"][df_intraday["Dividends"] != 0]
df_intraday_divs.index = df_intraday_divs.index.floor('D')
self.assertTrue(df_daily_divs.equals(df_intraday_divs))

test_run = True

if not test_run:
self.skipTest("Skipping test_intraDayWithEvents_tase() because no tickers had a dividend in last 60 days")

def test_weeklyWithEvents(self):
# Reproduce issue #521
tkr1 = "QQQ"
Expand Down Expand Up @@ -427,9 +374,9 @@ def test_tz_dst_ambiguous(self):
raise Exception("Ambiguous DST issue not resolved")

def test_dst_fix(self):
# Daily intervals should start at time 00:00. But for some combinations of date and timezone,
# Daily intervals should start at time 00:00. But for some combinations of date and timezone,
# Yahoo has time off by few hours (e.g. Brazil 23:00 around Jan-2022). Suspect DST problem.
# The clue is (a) minutes=0 and (b) hour near 0.
# The clue is (a) minutes=0 and (b) hour near 0.
# Obviously Yahoo meant 00:00, so ensure this doesn't affect date conversion.

# The correction is successful if no days are weekend, and weekly data begins Monday
Expand All @@ -452,8 +399,8 @@ def test_dst_fix(self):
raise

def test_prune_post_intraday_us(self):
# Half-day before USA Thanksgiving. Yahoo normally
# returns an interval starting when regular trading closes,
# Half-day before USA Thanksgiving. Yahoo normally
# returns an interval starting when regular trading closes,
# even if prepost=False.

# Setup
Expand Down Expand Up @@ -489,8 +436,8 @@ def test_prune_post_intraday_us(self):
self.assertEqual(len(late_open_dates), 0)

def test_prune_post_intraday_omx(self):
# Half-day before Sweden Christmas. Yahoo normally
# returns an interval starting when regular trading closes,
# Half-day before Sweden Christmas. Yahoo normally
# returns an interval starting when regular trading closes,
# even if prepost=False.
# If prepost=False, test that yfinance is removing prepost intervals.

Expand Down Expand Up @@ -540,7 +487,6 @@ def test_prune_post_intraday_omx(self):
def test_prune_post_intraday_asx(self):
# Setup
tkr = "BHP.AX"
interval = "1h"
interval_td = _dt.timedelta(hours=1)
time_open = _dt.time(10)
time_close = _dt.time(16, 12)
Expand Down Expand Up @@ -578,7 +524,7 @@ def test_aggregate_capital_gains(self):
end = "2019-12-31"
interval = "3mo"

df = dat.history(start=start, end=end, interval=interval)
dat.history(start=start, end=end, interval=interval)


class TestPriceRepair(unittest.TestCase):
Expand All @@ -601,7 +547,6 @@ def test_reconstruct_2m(self):
tkrs = ["BHP.AX", "IMP.JO", "BP.L", "PNL.L", "INTC"]

dt_now = _pd.Timestamp.utcnow()
td_7d = _dt.timedelta(days=7)
td_60d = _dt.timedelta(days=60)

# Round time for 'requests_cache' reuse
Expand All @@ -611,7 +556,7 @@ def test_reconstruct_2m(self):
dat = yf.Ticker(tkr, session=self.session)
end_dt = dt_now
start_dt = end_dt - td_60d
df = dat.history(start=start_dt, end=end_dt, interval="2m", repair=True)
dat.history(start=start_dt, end=end_dt, interval="2m", repair=True)

def test_repair_100x_random_weekly(self):
# Setup:
Expand Down Expand Up @@ -856,7 +801,7 @@ def test_repair_zeroes_daily(self):
self.assertFalse(repaired_df["Repaired?"].isna().any())

def test_repair_zeroes_daily_adjClose(self):
# Test that 'Adj Close' is reconstructed correctly,
# Test that 'Adj Close' is reconstructed correctly,
# particularly when a dividend occurred within 1 day.

tkr = "INTC"
Expand Down Expand Up @@ -926,10 +871,10 @@ def test_repair_zeroes_hourly(self):
self.assertFalse(repaired_df["Repaired?"].isna().any())

def test_repair_bad_stock_split(self):
# Stocks that split in 2022 but no problems in Yahoo data,
# Stocks that split in 2022 but no problems in Yahoo data,
# so repair should change nothing
good_tkrs = ['AMZN', 'DXCM', 'FTNT', 'GOOG', 'GME', 'PANW', 'SHOP', 'TSLA']
good_tkrs += ['AEI', 'CHRA', 'GHI', 'IRON', 'LXU', 'NUZE', 'RSLS', 'TISI']
good_tkrs += ['AEI', 'GHI', 'IRON', 'LXU', 'NUZE', 'RSLS', 'TISI']
good_tkrs += ['BOL.ST', 'TUI1.DE']
intervals = ['1d', '1wk', '1mo', '3mo']
for tkr in good_tkrs:
Expand Down Expand Up @@ -991,8 +936,8 @@ def test_repair_bad_stock_split(self):
# print(repaired_df[c] - correct_df[c])
raise

# Had very high price volatility in Jan-2021 around split date that could
# be mistaken for missing stock split adjustment. And old logic did think
# Had very high price volatility in Jan-2021 around split date that could
# be mistaken for missing stock split adjustment. And old logic did think
# column 'High' required fixing - wrong!
sketchy_tkrs = ['FIZZ']
intervals = ['1wk']
Expand Down
Loading

0 comments on commit 49d8dfd

Please sign in to comment.