From fb33cb50f6ba9e7245cb60eb232781cb1c35d8a8 Mon Sep 17 00:00:00 2001
From: James McKinney <26463+jpmckinney@users.noreply.github.com>
Date: Sun, 28 Apr 2024 00:02:34 -0400
Subject: [PATCH] docs: Document how to handle long rows, #1237

---
 docs/scripts/csvclean.rst | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/docs/scripts/csvclean.rst b/docs/scripts/csvclean.rst
index 8613d6868..7dd6de36b 100644
--- a/docs/scripts/csvclean.rst
+++ b/docs/scripts/csvclean.rst
@@ -52,6 +52,10 @@ Cleans a CSV file of common syntax errors:
       1,Alice,
       2,Bob,CA
 
+   .. tip::
+
+      :doc:`csvcut` without options also adds missing delimiters!
+
    To change the value used to fill short rows, use :code:`--fillvalue`. For example, with :code:`--fillvalue "US"`:
 
    .. code-block:: none
@@ -117,6 +121,10 @@ Test a file with known bad rows:
    1,"Expected 3 columns, found 4 columns",1,27,,I'm too long!
    2,"Expected 3 columns, found 2 columns",,I'm too short!
 
+.. note::
+
+   If any data rows are longer than the header row, you need to add columns manually: for example, by adding one or more delimiters (``,``) to the end of the header row. :code:`csvclean` can't do this, because it is designed to work with standard input, and correcting an error at the start of the CSV data based on an observation later in the CSV data would require holding all the CSV data in memory – which is not an option for large files.
+
 To change the line ending from line feed (LF or ``\n``) to carriage return and line feed (CRLF or ``\r\n``) use:
 
 .. code-block:: bash