Columns
How to manipulate columns
| Commands(s) | Operation |
|---|---|
! |
pins the current column on the left as a key column |
H L |
slides the current column one position to the left/right |
gH gL |
slides the current column all the way to the left/right of its section |
How to hide (remove) and unhide (return) columns
How to hide columns
- Press
-(hyphen) to hide the current column.
or
- Press
Shift+Con the source sheet to open its Columns sheet. - Move the cursor right to the width column.
- Move the cursor down to the row which represents the column you wish to hide.
- Press
efollowed by0to set the width for that column to 0. - Press
qto return to the source sheet.
How to unhide columns
- Press
gvto unhide all columns on current sheet.
or
- Press
Shift+Con the source sheet to open its Columns sheet. - Move the cursor right to the width column.
- Move the cursor down to the row which represents the column you wish to unhide. Currently, that cell should contain the value 0.
- Press
efollowed by a positive number to set the width. - Press
qto return to the source sheet.
How to specify column types
| Command | Type |
|---|---|
~ |
string |
# |
int |
% |
float |
$ |
currency |
@ |
date |
z# |
vlen |
z~ |
anytype |
Columns usually begin as untyped (anytype). Errors when working with numerical or datetime data is often due to values being considered as strings, and the problem is solved by setting the correct type.
The float type uses Python's builtin float() constructor to parse the string, and it parses by using the decimal separator.
The currency type is a bit of a misnomer. It filters out any non-numeric characters, and then parses the remainder of the cell value as a float. The reasons to prefer using float over currency, is performance (it is quite a bit slower than native parsing with float), or if any non-float characters should get reported as an error.
The date type parses dates into ISO8601 format. Those columns can then be used in mathematical calculations, and the calculations are interpreted for dates. E.g. 2020-01-01 + 1, is 2020-01-02.
The vlen type formats the cell value to the length of the content. For example, if the cell content is a list of length 3, then when vlen typed it will display a value of 3.
There is also the floatlocale type, which uses Python's locale.atof to parse the Column values. With floatlocale, you can set the LC_NUMERIC environment variable appropriately (before launching VisiData), such that atof() will parse the number based on your locale setting. There is a type-floatlocale command, which is unbound by default, because parsing this way is significantly slower than using the builtin float type.
If you need locale-specific float parsing regularly, you may want to rebind % or z% (or maybe some other keystroke) to type-floatlocale instead.
The following example uses the file sample.tsv.
How to batch specify column types for more than one column
- Press
Shift+Cto open the Columns sheet. - Press
sortto select the rows referencing the columns you wish to type. - Type
gfollowed by the any of the above typing keystrokes to set the type for all selected columns on the source sheet.
How to format columns
Note: Un-typed file formats, like tsvs and csvs, will save as they are displayed.
Some types have an option for their default display formatting.
| Type | Option | Default |
|---|---|---|
| int | dispintfmt | {:.0f} |
| float | dispfloatfmt | {:.02f} |
| currency | dispcurrencyfmt | %0.2f |
| date | dispdatefmt | %Y-%m-%d |
Ways to adjust the display formatting:
- The
fmtstrcolumn on the Columns Sheet allows you to specify the formatting for specific columns within that session, without affecting the default for the others. - The
disp_TYPE_fmtoption can be changed on the Options Sheet to set the formatting for all columns of typeTYPEin that session. - The
--disp-TYPE-fmtargument can be passed through the commandline to set the formatting for all columns of typeTYPEin that session. - The
options.disp_TYPE_fmtcan be set in the~/.visidatarcto change the default formatting for all columns of typeTYPEfor all sessions.
There are several formatting styles offered:
- Formatting that starts with
'%'(e.g. %0.2f) will use locale.format_string(). - Otherwise (e.g. {:.02f}), formatting will be passed to Python's string.format().
- Date fmtstr are passed to strftime.
The default for currency uses locale.format_string(). The default for int/float/date use string.format().
How to format a specific numeric columns to contain a thousands separator within a session?
- Set a column to a numeric type by pressing
#(int),%(float), or$(currency). - Press
Shift+Cto open the Columns Sheet. - Move to the row referencing the column whose display you wish to format. Move the cursor to the fmtstr column.
- Type
efollowed by {:,.0f} for aninttype and {:,.02f} for a floating point type.
How to quickly adjust the precision of a float or date?
- Ensure the column is either typed as a float (
%), floatsi, currency ($), or date (@). - Press
SPACEand type the longnamessetcol-precision-moreorsetcol-precision-lessto adjust the precision in the current column.
Bind the longnames to keys, if using these commands frequently.
How to set all date columns to be month/day/year.
The default can be set in a ~/.visidatarc.
options.disp_date_fmt = '%m/%d/%Y'
or passed through the commandline
vd --disp-date-fmt='%m/%d/%Y'
or set in the Options Sheet.
- Press
Shift+Oto open the Options Sheet. - Move the cursor down to the relevant dispdatefmt option.
- Type
efollowed by %m/%d/%Y.
How to specify a comma decimal separator when typing floating point numbers?
- Before launching VisiData, set the shell environment variable
LC_NUMERICto a locale which interprets commas as decimals. Any European locale should do; an example that works isen_DK.UTF-8. - Within VisiData, set a column to type
floatlocaleby pressingSpacefollowed by type-floatlocale.
Note that type-floatlocale is significantly slower than type-float. However, if you wish to replace the current binding for type-float with type-floatlocale, add to your ~/.visidatarc:
Sheet.unbindkey('%')
Sheet.bindkey('%', 'type-floatlocale')
or if you never use type-floatsi, you can do
Sheet.unbindkey('z%')
Sheet.bindkey('z%', 'type-floatlocale')
How to split a column
Python regular expressions provide more finetuned column splitting. The following example uses the commands for column splitting and transformation with xd/puzzles.tsv.
:adds new columns derived from splitting the current column at positions defined by a regex pattern.options.default_sample_size(default: 100) rows around the cursor will be used to determine the number of columns that will be created.;adds new columns derived from pulling the contents of the current column which match the regex within capture groups. The new columns are named using the capture group index, or if named capture groups are used, the capture group names. This command uses theoptions.default_sample_size(default:100) rows around the cursor as sample rows.*followed by regex/substring replaces the text which matches the capture groups in regex with the contents of substring. substring may include backreferences (\1 etc).
[How do I substitute text in my column]
The * command can be used to do content transformations of cells. The g* variant transforms in-place, instead of creating a new column.
The following example uses benchmarks.csv.
Question Transform the SKU values of food to nutri.
- Move cursor to SKU column.
- Press
gsto select all rows. - Type
g*followed by food/nutri.
- tests/transform-cols.vd
How to expand columns that contain nested data
If a column includes container data such as JSON objects or arrays, the ( family of commands can expand the child values into top-level columns:
| Command | Operation |
|---|---|
( |
expand current column |
g( |
expand all visible columns fully |
z( |
expand current column to a specific depth (prompt for input) |
gz( |
expand all visible columns to a specific depth (prompt for input) |
) |
contract (unexpand) the current column |
zM |
expand current column row-wise within that column |
The following demo shows ( commands applied to this data:
[
[ "short", "array" ],
[ "slightly", "longer", "array" ],
{ "nested": "data" },
{ "more": { "deeply": { "nested": "data" } } }
]
Note that by default the expansion logic will look for nested columns in up to options.default_sample_size (Default: 100) rows surrounding the cursor. This behavior can be controlled by adjusting default_sample_size in the Options Sheet, or setting options.default_sample_size in the ~/.visidatarc file.
How to create derivative columns
The = command takes a Python expression as input and creates a new column, where each cell evaluates the expression in the context of its row.
These variables and functions are available in the scope of an expression:
- Column names evaluate to the typed value of the cell in the named column for the same row.
vdattributes and methods; useCtrl+X vdto view the vd object, or see the API.Sheetattributes and methods; useg Ctrl+Yto view the sheet object (or see the API).- Global functions and variables (add your own in your .visidatarc).
- modules that have been
imported in Python- if you need a module that hasn't already been imported at runtime, use
g Ctrl+X import <modname>.
- if you need a module that hasn't already been imported at runtime, use
sheet: the current sheet (a TableSheet object)col: the current column (as a Column object; use for Column metadata)row: the current row (a Python object of the internal rowtype)- curcol: evaluate to the typed value of this row in the column that the cursor was on at the time that the expression column was added.
- cursorCol: evaluate to the typed value of this row for the column the cursor is on. Changes as the cursor moves for
=. Uses the column from the time the calculation was made forg=,gz=, andz=.
Additional attributes can be added to sheets and columns.
col deliberately returns a Column object, but any other Column object is interpreted as the value within that column for the same row. For example, both curcol and cursorcol return values, not the object itself.
For example, this customizes addcol-expr to set the curcol attribute on the new ExprColumn to a snapshot of the current cursor column (at the time the expression column is added):
Sheet.addCommand('=', 'addcol-expr', 'addColumnAtCursor(ExprColumn(inputExpr("new column expr="), curcol=cursorCol))', 'create new column from Python expression, with column names as variables')
Then, an expression can use curcol as though it referred to the value in the saved column.
Tab autocompletion when inputting an expression will cycle through valid column names only.
The following examples use the file sample.tsv.
Question On which days have we sold more than 10 Items?
- Scroll to the Units column. Set the type of the Units column by pressing
#(int). - Type
=followed byUnits > 10. A new column will be created. The cells in this column will contain the value True in rows where the number of Units are greater than 10 and False otherwise. - Move to the new derived column.
- Type
|followed byTrueto select all rows where there were more than 10 Units sold. - Press
"to open a duplicate sheet with only those selected rows.
Question I have a dataset with separate columns for Year, Month and Day. How can I concatenate them into a single date column?
- Type
=followed byYear + '-' + Month + '-' + Day. - Set the type of the new derived column by pressing
@(date). - Type
^followed byDateto rename the column to Date.
Question I have a dataset with Date column that is missing a prefix of '2020-'. How do I add it to the Date column?
When using =, and wanting to reference the current column, we recommend using curcol. When using g=, gz=, and z=, we recommend cursorCol. =, unlike the others, is dynamic and changes with adjustment of underlying values, which means it will change along with the movement of the cursor (tracked by cursorCol). curcol is a special attribute of a new ExprColumn, which remembers the cursorCol at the time of creation.
- Move the cursor to Date.
- Type
g=followed by f"2020-{cursorCol}".
Question I have a dataset with file names. How do I create a new column with the file names lower cased?
- Move the cursor to file names column.
- Type
=followed by curcol.casefold(). - Move to the newly created column, and rename it with
^, followed by the desired name.
How to configure multiple columns
Properties of columns on the source sheet can be changed by using standard editing commands on its Columns sheet (accessed with Shift+C). In particular, it facilitates the selection of multiple columns, followed by utilising one of the g-prefixed commands to modify all of them.
For a full list of available commands, see the man page. Some example workflows follow.
The following examples use the file sample.tsv.
How to set multiple statistical aggregators
Question What is the average daily revenue from sales of each Item?
- Set the type of the Units column by pressing
#(int). - Set the type of the Total column by pressing
%(float). - Press
Shift+Cto open the Columns sheet. - Press
sorton the rows referencing the source sheet Units column and the Total column to select them. - Type
g+followed byavgto add a avg statistical aggregator to the selected rows. - Press
qto exit and return to the source sheet. - Scroll to the Item column. Press
Shift+Fto open the Frequency table.
Question What are the daily average and sum total number of Units sold for each Item?
Press
Shift+Cto open the Columns sheet.Move the cursor to the row referencing the source sheet Units column.
a. Press
sortto select it.b. Set the type for the source sheet Units columns by pressing
g#(int).c. Move the cursor to the aggregators column.
d. Type
eto enter edit mode, followed by sum avg.Press
qto exit and return to the source sheet.Move the cursor to the Item column. Press
Shift+Fto open the Frequency table.
