CSV Format

Use CSV when you want the most portable raw dataset and do not mind larger text files.


Choose CSV when compatibility matters more than storage efficiency.

It is the easiest raw format to inspect manually, ingest with custom scripts, and move through generic tools that do not understand Parquet or NinjaTrader formats.

At A Glance

Property | Value
extension | .csv.gz
wrapper | gzip
delimiter | semicolon (;)
header row | yes
raw data or summary | raw event stream
timestamps inside rows | UTC, ISO 8601, nanosecond precision
file day convention | YYYYMMDD, grouped by America/New_York market day
best fit | interoperability, inspection, custom parsers, simple ETL

What You Get

Each file contains the full daily event stream for one symbol and one expiration.

SFTP path example:

bash
data/csv/ES/06-25/20250601.csv.gz

Every row follows the same eight-field layout:

csv
level;mdt;timestamp;operation;depth;market_maker;price;volume
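A minimal sketch of streaming one daily file with only the Python standard library. The function name `read_events` is illustrative and not part of the data service, and UTF-8 encoding is an assumption; the gzip wrapper, semicolon delimiter, and header row come from the properties listed above.

```python
import csv
import gzip
from typing import Dict, Iterator


def read_events(path: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per data row, keyed by the header names in the file."""
    # "rt" decompresses the gzip wrapper and decodes text in one step.
    # newline="" leaves line-ending handling to the csv module.
    # ASSUMPTION: the file is UTF-8 (plain ASCII content is also valid UTF-8).
    with gzip.open(path, "rt", encoding="utf-8", newline="") as fh:
        # The delimiter is a semicolon, not a comma.
        yield from csv.DictReader(fh, delimiter=";")
```

Every value arrives as a string, and blank fields arrive as empty strings, so a call such as `next(read_events("20250601.csv.gz"))` on a local copy returns a dict with the eight keys from the header row.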

How To Read The Rows

Two row families appear in the same file:

  • L1: top of book, trades, and session statistics
  • L2: order book depth updates

Typical examples:

csv
L1;2;2025-06-01T17:19:57.725675300Z;;;;5913.75;1
L2;0;2025-06-01T15:45:52.803532500Z;0;3;;5914.5;8

Interpretation:

  • the first row is a trade print
  • the second row is a depth update on the ask side
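The two example rows can be turned into typed values with a small helper. This is a sketch, not a reference parser: the `Event` class and `parse_row` function are hypothetical names, and it assumes fields never contain a quoted semicolon. Blank fields become `None`, and prices are kept as `Decimal` to avoid floating-point rounding.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Event:
    level: str                 # "L1" or "L2"
    mdt: int                   # market data type code
    timestamp: str             # kept as text to preserve nanosecond precision
    operation: Optional[int]   # blank on L1 rows
    depth: Optional[int]       # blank on L1 rows
    market_maker: str          # reserved, currently empty
    price: Optional[Decimal]
    volume: Optional[int]


def _opt(text: str, cast):
    """Convert a field with `cast`, mapping a blank field to None."""
    return cast(text) if text else None


def parse_row(line: str) -> Event:
    # ASSUMPTION: no quoting or escaped delimiters, so a plain split is enough.
    fields = line.rstrip("\r\n").split(";")
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    level, mdt, ts, op, depth, mm, price, volume = fields
    return Event(level, int(mdt), ts, _opt(op, int), _opt(depth, int),
                 mm, _opt(price, Decimal), _opt(volume, int))


trade = parse_row("L1;2;2025-06-01T17:19:57.725675300Z;;;;5913.75;1")
ask_update = parse_row("L2;0;2025-06-01T15:45:52.803532500Z;0;3;;5914.5;8")
```

Here `trade.operation` and `trade.depth` are `None`, while `ask_update` carries operation 0 at depth 3.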

MDT Codes

mdt | Meaning
0 | ask
1 | bid
2 | trade
3 | daily high
4 | daily low
5 | cumulative volume
6 | session open
7 | previous close
8 | open interest
9 | settlement price
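One way to keep these codes readable in client code is an enumeration that mirrors the table. The names `Mdt` and `mdt_label` are illustrative choices, not identifiers defined by the data format.

```python
from enum import IntEnum


class Mdt(IntEnum):
    """Market data type codes, copied from the table above."""
    ASK = 0
    BID = 1
    TRADE = 2
    DAILY_HIGH = 3
    DAILY_LOW = 4
    CUMULATIVE_VOLUME = 5
    SESSION_OPEN = 6
    PREVIOUS_CLOSE = 7
    OPEN_INTEREST = 8
    SETTLEMENT_PRICE = 9


def mdt_label(code: int) -> str:
    """Return the human-readable meaning; raises ValueError for unknown codes."""
    return Mdt(code).name.lower().replace("_", " ")
```

Failing loudly on an unknown code is a deliberate choice in this sketch, so that a future addition to the code list is noticed rather than silently ignored.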

For L2 rows:

  • operation is 0 (add), 1 (update), or 2 (remove)
  • depth is the price level index, where 0 is the best level
  • market_maker is reserved and currently empty

For L1 rows:

  • operation, depth, and market_maker are blank
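The L2 fields can drive a simple in-memory book. The sketch below assumes positional semantics that this page does not confirm: an add inserts a level at the given index and shifts deeper levels down, an update overwrites the level at that index, and a remove deletes it and shifts deeper levels up. The function name `apply_depth` is hypothetical; check the behaviour against real data before relying on it.

```python
from decimal import Decimal
from typing import List, Tuple

ADD, UPDATE, REMOVE = 0, 1, 2

Level = Tuple[Decimal, int]  # (price, volume)


def apply_depth(side: List[Level], operation: int, depth: int,
                price: Decimal, volume: int) -> None:
    """Mutate one side of the book in place; index 0 is the best level."""
    # ASSUMPTION: depth is a list position, not a key that survives shifts.
    if operation == ADD:
        side.insert(depth, (price, volume))   # deeper levels move down by one
    elif operation == UPDATE:
        side[depth] = (price, volume)         # IndexError if the level is missing
    elif operation == REMOVE:
        del side[depth]                       # deeper levels move up by one
    else:
        raise ValueError(f"unknown operation {operation}")


asks: List[Level] = []
apply_depth(asks, ADD, 0, Decimal("5914.5"), 8)
apply_depth(asks, ADD, 1, Decimal("5914.75"), 12)
apply_depth(asks, UPDATE, 0, Decimal("5914.5"), 5)
```

A full book would keep one such list for ask rows (mdt 0) and another for bid rows (mdt 1).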

Why Users Choose CSV

  • it works almost everywhere
  • it is easy to inspect without special tooling
  • it is a safe interchange format when the destination is unknown
  • it is easier than Parquet for quick conversions into custom formats

Common Gotchas

  • The delimiter is ;, not a comma.
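  • The file is delivered as .csv.gz, so decompress it or read it through a gzip-aware reader before parsing.
  • The date in the file name follows the America/New_York market day, while row timestamps are UTC, so rows near a day boundary can carry a UTC date that differs from the file name.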
  • CSV and Parquet carry the same logical dataset. The difference is representation, not coverage.
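The timestamp and file-day gotchas can be handled with the standard library. Python's `datetime` stores microseconds at most, so this sketch returns the nanosecond part separately. The helper names are illustrative, and `new_york_date` computes only the New York local calendar date of a timestamp; whether the market day used for file names rolls over at local midnight or at a session boundary is not stated here, so treat that mapping as an assumption.

```python
from datetime import datetime, timezone
from typing import Tuple
from zoneinfo import ZoneInfo  # Python 3.9+, needs time zone data on the system

NEW_YORK = ZoneInfo("America/New_York")


def parse_timestamp(ts: str) -> Tuple[datetime, int]:
    """Return (UTC datetime at whole-second resolution, nanoseconds in that second)."""
    body = ts[:-1] if ts.endswith("Z") else ts
    seconds_part, _, fraction = body.partition(".")
    dt = datetime.strptime(seconds_part, "%Y-%m-%dT%H:%M:%S")
    dt = dt.replace(tzinfo=timezone.utc)
    # Pad or trim the fraction to exactly nine digits before converting.
    nanos = int(fraction.ljust(9, "0")[:9]) if fraction else 0
    return dt, nanos


def new_york_date(ts: str) -> str:
    """Return the New York local calendar date of a UTC timestamp as YYYYMMDD."""
    dt, _ = parse_timestamp(ts)
    return dt.astimezone(NEW_YORK).strftime("%Y%m%d")
```

For example, `2025-06-02T02:30:00.000000000Z` is 22:30 on 1 June in New York, so its UTC date and its New York date differ.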

If your workflow is analytics-first rather than compatibility-first, Parquet is usually the better default.
