feat(csvpy): Add --no-number-ellipsis, --sniff-limit, --no-inference. fix(csvpy): Support --locale, --blanks, --null-value, --date-format, --datetime-format, --skip-lines. Remove --linenumbers, --zero. (closes #1231)
jpmckinney committed Feb 10, 2024
1 parent c42d0db commit 78c5bb2
Showing 3 changed files with 58 additions and 5 deletions.
16 changes: 16 additions & 0 deletions CHANGELOG.rst
@@ -1,8 +1,24 @@
 Unreleased
 ----------

+* feat: :doc:`/scripts/csvpy` adds the options:
+
+  * :code:`--no-number-ellipsis`, to disable the ellipsis (``…``) if max precision is exceeded, for example, when using ``table.print_table()``
+  * :code:`--sniff-limit``
+  * :code:`--no-inference``
+
+* feat: :doc:`/scripts/csvpy` removes the ``--linenumbers`` and ``--zero`` output options, which had no effect.
 * feat: :doc:`/scripts/in2csv` adds a :code:`--reset-dimensions` option to `recalculate <https://openpyxl.readthedocs.io/en/stable/optimized.html#worksheet-dimensions>`_ the dimensions of an XLSX file, instead of trusting the file's metadata. csvkit's dependency `agate-excel <https://agate-excel.readthedocs.io/en/latest/>`_ 0.4.0 automatically recalculates the dimensions if the file's metadata expresses dimensions of "A1:A1" (a single cell).
 * fix: :doc:`/scripts/csvlook` only reads up to :code:`--max-rows` rows instead of the entire file.
+* fix: :doc:`/scripts/csvpy` supports the existing input options:
+
+  * :code:`--locale`
+  * :code:`--blanks`
+  * :code:`--null-value`
+  * :code:`--date-format`
+  * :code:`--datetime-format`
+  * :code:`--skip-lines`
+
 * fix: :doc:`/scripts/in2csv`: :code:`--write-sheets` no longer errors when standard input is an XLS or XLSX file.
 * Update minimum agate version to 1.6.3.
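
Review note: to illustrate what ``--no-number-ellipsis`` changes in printed output, here is a minimal standard-library sketch of precision-limited number formatting. It is not agate's implementation: ``format_number``, ``max_precision`` and ``truncation_chars`` are hypothetical names chosen for this example, and the sketch cuts digits off rather than rounding them. The only point is that an empty truncation string removes the trailing ellipsis.

```python
from decimal import Decimal, ROUND_DOWN


def format_number(value, max_precision=3, truncation_chars="…"):
    """Hypothetical helper: show at most max_precision decimal places.

    When digits are cut off, truncation_chars is appended so the reader
    can tell the printed value is not exact.
    """
    value = Decimal(value)
    exponent = value.as_tuple().exponent
    decimal_places = -exponent if exponent < 0 else 0
    if decimal_places <= max_precision:
        # Nothing to cut, so no marker is needed.
        return str(value)
    quantum = Decimal(1).scaleb(-max_precision)  # e.g. 0.001 for 3 places
    return str(value.quantize(quantum, rounding=ROUND_DOWN)) + truncation_chars


print(format_number("3.14159"))                       # marker appended
print(format_number("3.14159", truncation_chars=""))  # marker disabled
print(format_number("2.5"))                           # short enough, unchanged
```

Passing an empty string plays the same role here as setting the truncation characters to ``''`` does for the new option.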

39 changes: 34 additions & 5 deletions csvkit/utilities/csvpy.py
@@ -3,40 +3,69 @@
 import sys

 import agate
+from agate import config

 from csvkit.cli import CSVKitUtility


 class CSVPy(CSVKitUtility):
     description = 'Load a CSV file into a CSV reader and then drop into a Python shell.'
+    override_flags = ['l', 'zero']

     def add_arguments(self):
-        self.argparser.add_argument('--dict', dest='as_dict', action='store_true',
-                                    help='Load the CSV file into a DictReader.')
-        self.argparser.add_argument('--agate', dest='as_agate', action='store_true',
-                                    help='Load the CSV file into an agate table.')
+        self.argparser.add_argument(
+            '--dict', dest='as_dict', action='store_true',
+            help='Load the CSV file into a DictReader.')
+        self.argparser.add_argument(
+            '--agate', dest='as_agate', action='store_true',
+            help='Load the CSV file into an agate table.')
+        self.argparser.add_argument(
+            '--no-number-ellipsis', dest='no_number_ellipsis', action='store_true',
+            help='Disable the ellipsis if the max precision is exceeded.')
+        self.argparser.add_argument(
+            '-y', '--snifflimit', dest='sniff_limit', type=int, default=1024,
+            help='Limit CSV dialect sniffing to the specified number of bytes. '
+                 'Specify "0" to disable sniffing entirely, or "-1" to sniff the entire file.')
+        self.argparser.add_argument(
+            '-I', '--no-inference', dest='no_inference', action='store_true',
+            help='Disable type inference when parsing the input. This disables the reformatting of values.')

     def main(self):
         if self.input_file == sys.stdin:
             self.argparser.error('csvpy cannot accept input as piped data via STDIN.')

+        if self.args.no_number_ellipsis:
+            config.set_option('number_truncation_chars', '')
+
         # Attempt reading filename, will cause lazy loader to access file and raise error if it does not exist
         filename = self.input_file.name

         if self.args.as_dict:
             klass = agate.csv.DictReader
             class_name = 'agate.csv.DictReader'
             variable_name = 'reader'
+            input_file = self.skip_lines()
+            kwargs = {}
         elif self.args.as_agate:
             klass = agate.Table.from_csv
             class_name = 'agate.Table'
             variable_name = 'table'
+            input_file = self.input_file
+
+            sniff_limit = self.args.sniff_limit if self.args.sniff_limit != -1 else None
+            kwargs = dict(
+                skip_lines=self.args.skip_lines,
+                sniff_limit=sniff_limit,
+                column_types=self.get_column_types(),
+            )
         else:
             klass = agate.csv.reader
             class_name = 'agate.csv.reader'
             variable_name = 'reader'
+            input_file = self.skip_lines()
+            kwargs = {}

-        variable = klass(self.input_file, **self.reader_kwargs)
+        variable = klass(input_file, **kwargs, **self.reader_kwargs)

         welcome_message = 'Welcome! "{}" has been loaded in an {} object named "{}".'.format(
             filename, class_name, variable_name)
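
Review note: the `--agate` branch maps the command-line value `-1` to `None` before passing `sniff_limit` on. The following standard-library sketch shows one way the three cases (`0`, a positive limit, and `None`) could be interpreted. `normalize_sniff_limit` and `sniff_dialect` are hypothetical helpers written for this note, not csvkit or agate functions, and the sketch counts characters of an in-memory string rather than bytes of a file.

```python
import csv


def normalize_sniff_limit(cli_value):
    """Map the command-line convention to the library one: -1 becomes None."""
    return None if cli_value == -1 else cli_value


def sniff_dialect(text, sniff_limit):
    """Return a sniffed dialect, or None when sniffing is disabled.

    Assumed semantics: None sniffs the whole input, a positive number
    sniffs only that many leading characters, and 0 skips sniffing.
    """
    if sniff_limit is None:
        sample = text
    elif sniff_limit > 0:
        sample = text[:sniff_limit]
    else:
        return None
    # Restrict the candidate delimiters to keep the guess predictable.
    return csv.Sniffer().sniff(sample, delimiters=";,\t|")


data = "a;b;c\n1;2;3\n4;5;6\n"
dialect = sniff_dialect(data, normalize_sniff_limit(-1))
print(dialect.delimiter)  # prints ";"
print(sniff_dialect(data, normalize_sniff_limit(0)))  # prints "None"
```

Keeping the `-1` to `None` translation in one small function makes the command-line convention easy to test separately from the sniffing itself.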
8 changes: 8 additions & 0 deletions docs/scripts/csvpy.rst
@@ -14,6 +14,7 @@ Loads a CSV file into a :class:`agate.csv.Reader` object and then drops into a P
                 [-S] [--blanks] [--null-value NULL_VALUES [NULL_VALUES ...]]
                 [--date-format DATE_FORMAT] [--datetime-format DATETIME_FORMAT]
                 [-H] [-K SKIP_LINES] [-v] [-l] [--zero] [-V] [--dict] [--agate]
+                [--no-number-ellipsis] [-y SNIFF_LIMIT] [-I]
                 [FILE]

    Load a CSV file into a CSV reader and then drop into a Python shell.
@@ -26,6 +27,13 @@ Loads a CSV file into a :class:`agate.csv.Reader` object and then drops into a P
      -h, --help            show this help message and exit
      --dict                Load the CSV file into a DictReader.
      --agate               Load the CSV file into an agate table.
+     --no-number-ellipsis  Disable the ellipsis if the max precision is exceeded.
+     -y SNIFF_LIMIT, --snifflimit SNIFF_LIMIT
+                           Limit CSV dialect sniffing to the specified number of
+                           bytes. Specify "0" to disable sniffing entirely, or
+                           "-1" to sniff the entire file.
+     -I, --no-inference    Disable type inference when parsing the input. This
+                           disables the reformatting of values.

 This tool will automatically use the IPython shell if it is installed, otherwise it will use the running Python shell.
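
Review note: the three options documented above can be reproduced with plain `argparse` to check how they parse, including the `-y -1` form. This is a standalone sketch, not csvkit's parser: `build_parser` and the `csvpy-sketch` program name are made up for the example, and the help strings are omitted.

```python
import argparse


def build_parser():
    """Hypothetical parser declaring only the three options added here."""
    parser = argparse.ArgumentParser(prog="csvpy-sketch")
    parser.add_argument(
        "--no-number-ellipsis", dest="no_number_ellipsis", action="store_true")
    parser.add_argument(
        "-y", "--snifflimit", dest="sniff_limit", type=int, default=1024)
    parser.add_argument(
        "-I", "--no-inference", dest="no_inference", action="store_true")
    parser.add_argument("file", metavar="FILE", nargs="?")
    return parser


args = build_parser().parse_args(
    ["--no-number-ellipsis", "-y", "-1", "-I", "data.csv"])
print(args.no_number_ellipsis, args.sniff_limit, args.no_inference, args.file)
# prints: True -1 True data.csv
```

Because none of the declared flags looks like a negative number, `argparse` treats `-1` as the value of `-y` rather than as an unknown option.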
