A whole bunch of stuff. #4

Open: segfaultmagnet wants to merge 69 commits into master

segfaultmagnet

Basically an entire rewrite of getmagiccardprices.py.

It now has some configurable elements, can take a collection of cards exported from deckstats.net, and handles prints of the same card from different editions.

segfaultmagnet and others added 30 commits November 9, 2015 15:43
Figure out why Qt craps out after first row of input.
Need to add handling for commas in card names.
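
One possible way to handle commas in card names is to let Python's `csv` module do the splitting, since it respects quoted fields. This is only a sketch: it assumes the name is the first column, the quantity is the second, and that names containing commas are quoted in the input. `read_cards` is a hypothetical helper, not a function from this PR.

```python
import csv
import io


def read_cards(handle):
    """Yield (name, quantity) pairs from a CSV of card names and quantities.

    Assumed layout (not confirmed by the PR): name first, quantity second.
    """
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        name, quantity = row[0].strip(), row[1].strip()
        if not quantity.isdigit():
            continue  # skip a header row or a non-numeric quantity
        yield name, int(quantity)


# A quoted name keeps its comma instead of being split into two fields.
sample = io.StringIO('"Jace, the Mind Sculptor",2\nForest,10\n')
cards = list(read_cards(sample))
```

If the exporter does not quote names, this will still mis-split them, so the quoting assumption is worth checking against a real export.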
Reads CSV with name and quantity of card. Scrapes for prices based on
exact name match (does not handle multiple matches). Only returns first
result / latest printing of card (see http://magiccards.info/search.html
and http://magiccards.info/syntax.html).

New:
Added usage.

Todo:
Add search by specific edition.
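
For reference, an exact-name lookup like the one described above might build its search URL roughly as follows. The `!` prefix for exact names comes from the magiccards.info syntax page; the `/query` endpoint and the `q` parameter name are assumptions here, and `exact_name_query` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed search endpoint; adjust if the site uses a different path.
SEARCH_URL = "http://magiccards.info/query"


def exact_name_query(name):
    """Return a search URL that asks for an exact name match ("!" prefix)."""
    # urlencode percent-encodes "!" and non-ASCII characters as UTF-8.
    return SEARCH_URL + "?" + urlencode({"q": "!" + name})


url = exact_name_query("Forest")
```

Encoding the query this way also keeps names with spaces or accented characters from breaking the URL.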
Reads CSV with name and quantity of card. Scrapes for prices based on
exact name match. Only returns first result / latest printing of card
(see http://magiccards.info/search.html
and http://magiccards.info/syntax.html).

Depends:
PySide

New:
Added usage, proper file handling.

Todo:
Add search by specific edition.
Cleaned up, added progress.
Reads CSV with name and quantity of card. Scrapes for prices based on
exact name match. Only returns first result / latest printing of card
(see http://magiccards.info/search.html
and http://magiccards.info/syntax.html).

Depends:
docopt, PySide

New:
Added usage, proper file handling.

Todo:
Add search by specific edition.
Add formatting for deckstats.net input and output.
Reads CSV with name and quantity of card. Scrapes for prices based on
exact name match. Only returns first result / latest printing of card
(see http://magiccards.info/search.html and
http://magiccards.info/syntax.html).

Depends:
docopt, PySide

New:
Added usage, proper file handling.

Todo:
Add search by specific edition.
Add formatting for deckstats.net input and output.
Clean up documentation (Python style guide??).
Reads CSV with name and quantity of card. Scrapes for prices based on
exact name match. Only returns first result / latest printing of card
(see http://magiccards.info/search.html and
http://magiccards.info/syntax.html).

Depends:
docopt, PySide

New:
Added usage, proper file handling.

Todo:
Add search by specific edition.
Add handling for multiple instances of same card (duplicates, different
printing, foil).
Improve formatting for deckstats.net input and output.
Clean up documentation (Python style guide??).
Added config file (conf.ini).
Minor improvements to results accuracy. Seeing lots of misses due to
multiple results. Can't use both exact name "!" and edition "e:" in same
query. At ~97% accuracy with personal collection.

Known causes of misses:
Non-alphanumeric characters (e.g. Æther Burst, R&D's Secret Lair,
Legions of Lim-Dûl). Some of these work when the correct character is
used; others do not (sometimes depends on whether input CSV has correct
character).
Multiple results (e.g. Forest 10e also returns Karplusan Forest 10e).
Query does not work with both full name and edition?
No price returned at all (e.g. Purphoros's Emissary).
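
Two of the misses listed above (multiple results, and names that differ only in Unicode form) could in principle be reduced by filtering the result list after the query. A minimal sketch, assuming the scraper can already produce a list of result names; `normalize` and `pick_exact` are hypothetical helpers, not part of this PR.

```python
import unicodedata


def normalize(name):
    """Fold case and Unicode composition so equal-looking names compare equal."""
    return unicodedata.normalize("NFC", name).casefold().strip()


def pick_exact(wanted, result_names):
    """Return the single result whose name equals `wanted`, else None."""
    target = normalize(wanted)
    matches = [n for n in result_names if normalize(n) == target]
    # Zero or several matches are both treated as a miss.
    return matches[0] if len(matches) == 1 else None


# "Forest" no longer collides with "Karplusan Forest".
chosen = pick_exact("Forest", ["Karplusan Forest", "Forest"])
```

This does not help when the input CSV uses a different character altogether (for example "AE" typed in place of "Æ"); that case would need an explicit substitution table.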

Possible solutions:
Re-try with exact name only on initial miss (still questionable).
Retrieve additional info from mtgjson via DeckBrew's API (i.e. number
within printing), then try exact URL of card (e.g.
http://magiccards.info/cstd/en/29.html).
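
The two ideas above could be combined along these lines: build the exact card URL when the set code and number are known, and fall back to other lookups otherwise. The URL pattern is taken from the example in the commit message; `card_url` and `first_price` are hypothetical names, and the lookups are passed in as plain callables so the sketch stays independent of the scraper.

```python
def card_url(set_code, number, lang="en"):
    """Build the exact card page URL from set code, language, and number."""
    return "http://magiccards.info/{}/{}/{}.html".format(set_code, lang, number)


def first_price(lookups):
    """Call each zero-argument lookup in order; return the first non-None price."""
    for lookup in lookups:
        price = lookup()
        if price is not None:
            return price
    return None  # every strategy missed


url = card_url("cstd", 29)
```

A caller might pass something like `[by_exact_url, by_name_and_edition, by_exact_name]`, so the most precise strategy is tried first and the questionable name-only retry comes last.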
This reverts commit dff8b3d.
Moved bulk of price fetching functionality into new class GetPrices
(mtgs_getprices.py). Moved webpage rendering to new class WebRenderer
(mtgs_webrenderer.py). New error classes (mtgs_error.py).
Need to figure out Unicode.
Converted to Python 3.
Added MTG JSON.
No longer uses numpy.
More comprehensive set_defs.csv

Much to do.
Dropped semicolon-delimited input for simplicity. May consider adding
back in future.
Found that card entries from MTG JSON are wildly inconsistent in
availability of "numbers" and "mciNumbers" fields.
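
Given that inconsistency, one defensive approach is to try each field in order and report a miss when neither is present. This is a sketch only: the key names `mciNumber` and `number` are assumptions about the MTG JSON card entries, and `collector_number` is a hypothetical helper.

```python
def collector_number(card, keys=("mciNumber", "number")):
    """Return the first non-empty number field of a card entry, else None.

    `card` is one card dict from MTG JSON; the default key names are
    assumptions and may need adjusting to the data actually in use.
    """
    for key in keys:
        value = card.get(key)
        if value:
            return str(value)
    return None  # caller should fall back to a name-based search


number = collector_number({"name": "Forest", "number": "29"})
```

Returning `None` rather than raising keeps the decision about the fallback (name search, skip, log) with the caller.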

Made MTGCard a little easier to work with from the outside.
Improved scraping to use exact URL whenever possible (hampered by
inconsistent data).
Updated write and summary methods.
Better sample files.
Debug output goes to "./debug.log" instead of terminal.
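
Routing debug output to a file can be done with the standard `logging` module. A minimal sketch, assuming a dedicated logger is acceptable; the logger name `mtgs.debug` and the helper `make_debug_logger` are hypothetical and may not match what the PR actually does.

```python
import logging


def make_debug_logger(path="./debug.log"):
    """Return a logger that writes DEBUG and above to `path` only."""
    logger = logging.getLogger("mtgs.debug")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep debug lines off the terminal
    if not logger.handlers:  # avoid adding a second handler on repeat calls
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Usage would be along the lines of `log = make_debug_logger()` followed by `log.debug("miss: %s", card_name)`.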
Cleaned up debugging.
Cleaned up some documentation.
Added sample files SOI.csv and SOI_out.csv and corresponding debug.log.
Note that the 7 misses were all due to no price data on the
correctly-scraped pages.