You exported a sales report from your accounting tool and got two options: download as CSV or XLSX. You picked one, opened it, and something looked wrong — formulas were gone, accents turned into garbage, or the dates flipped between European and American format. This is the daily reality of spreadsheet file formats, and the choice between XLSX and CSV is rarely as obvious as it seems.
Both formats store tabular data, but they were designed for different jobs. Understanding what each one preserves — and what it silently drops — saves hours of debugging and a lot of frustration.
What CSV actually is
CSV stands for Comma-Separated Values. It is the oldest, simplest tabular format still in widespread use, dating back to mainframe data exchange in the 1970s.
A CSV file is a plain text file. Each line is a row, and within a row, values are separated by a delimiter — usually a comma, sometimes a semicolon, tab, or pipe. There is no formatting, no formula, no styling, and no concept of multiple sheets. Just rows and columns of text.
This simplicity is CSV's superpower. Any tool that handles tabular data, from a 50-year-old COBOL program to a modern Python script, can read CSV. There is no proprietary parser, no version compatibility to worry about, and no licensing. Open it in a text editor and you can read it.
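To make "just rows and columns of text" concrete, here is a minimal sketch using Python's standard csv module. The sample data and the helper name parse_csv are made up for illustration.

```python
import csv
import io

# Hypothetical sample data; any delimiter-separated text works the same way.
raw = "name,city,amount\nAda,Paris,12.50\nLin,Osaka,7\n"

def parse_csv(text, delimiter=","):
    """Return every row as a list of strings; CSV itself carries no types."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

rows = parse_csv(raw)
print(rows[0])  # the header row
print(rows[1])  # every value, including the amount, is a plain string
```

Note that the amount comes back as the string "12.50", not as a number. Deciding what each column means is left entirely to the reader of the file.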
What XLSX actually is
XLSX is the Office Open XML Spreadsheet format, introduced by Microsoft in 2007 to replace the older binary .xls format. Despite the Microsoft origin, XLSX is an open ISO standard (ISO/IEC 29500), and most modern spreadsheet applications support it natively.
An XLSX file is not a single monolithic document: it is a ZIP archive containing a set of XML documents. Inside, you find:
- The cell data and formulas
- Formatting (fonts, colours, borders, number formats)
- Multiple sheets, each with its own grid
- Charts, pivot tables, named ranges, conditional formatting
- Embedded images (macros are not allowed in .xlsx; they require the macro-enabled .xlsm variant)
You can rename a .xlsx file to .zip, unzip it, and inspect the XML yourself. This makes XLSX both rich and inspectable.
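The same inspection can be done without renaming anything, since Python's zipfile module reads the archive directly. The sketch below is only an illustration: the helper name list_xlsx_parts is made up, and the archive it lists is a tiny stand-in built in memory (not a valid workbook) so the example runs without a real file.

```python
import io
import zipfile

def list_xlsx_parts(source):
    """Return the sorted names of the parts stored inside an XLSX package.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as package:
        return sorted(package.namelist())

# Stand-in archive so the sketch is self-contained. With a real workbook,
# pass its path instead, e.g. list_xlsx_parts("report.xlsx") (hypothetical).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as package:
    package.writestr("[Content_Types].xml", "<Types/>")
    package.writestr("xl/workbook.xml", "<workbook/>")
    package.writestr("xl/worksheets/sheet1.xml", "<worksheet/>")

parts = list_xlsx_parts(demo)
print(parts)
```

In a workbook saved by a spreadsheet application you would typically see more parts, such as styles and shared strings, alongside one XML file per sheet.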
The honest comparison
| Capability | CSV | XLSX |
|---|---|---|
| Stores plain values | ✅ | ✅ |
| Preserves formulas | ❌ | ✅ |
| Preserves formatting (bold, colour) | ❌ | ✅ |
| Multiple sheets | ❌ | ✅ |
| Charts and pivot tables | ❌ | ✅ |
| Number formats (currency, dates) | ❌ (raw text only) | ✅ |
| Universal compatibility | ✅ | ✅ (modern apps) |
| Human-readable in a text editor | ✅ | ❌ (it's a ZIP) |
| File size for plain data | Small (plain text, no overhead) | Varies (compressed, but with fixed package overhead) |
| Risk of locale issues | High (commas, dates) | Low (types are explicit) |
| Streamable for huge files | ✅ | ⚠️ (must unzip first) |
When CSV is the right answer
Use CSV when you need any of these properties:
- Maximum compatibility. Importing into a database, feeding a script, sending to a partner whose tools you don't know — CSV will work everywhere.
- Massive datasets. A million-row CSV streams nicely; a million-row XLSX may push memory limits and can hit Excel's hard cap of 1,048,576 rows per sheet.
- Version control. CSV diffs cleanly in Git. XLSX shows up as a binary blob.
- Pure data exchange. When you only need the values and the receiving system will apply its own formatting.
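The streaming point is easy to sketch: a CSV reader can process one row at a time, so memory use stays flat however long the file is. The function name sum_column and the sample data below are assumptions for illustration.

```python
import csv
import io

def sum_column(lines, column):
    """Sum one numeric column while holding only a single row in memory."""
    total = 0.0
    for row in csv.DictReader(lines):
        total += float(row[column])
    return total

# Hypothetical stand-in for a large file. With a real file, pass
# open("sales.csv", newline="", encoding="utf-8") instead of the StringIO.
sample = io.StringIO("region,amount\nnorth,10.5\nsouth,4.5\n")
print(sum_column(sample, "amount"))  # 15.0
```

Because csv.DictReader pulls lines lazily from whatever iterable it is given, the same function works unchanged on a file with millions of rows.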
When XLSX is the right answer
Switch to XLSX when any of the following matters:
- Formulas need to survive. A budget with =SUM(B2:B30) becomes a static number in CSV.
- Multiple sheets. A monthly tracker with one sheet per month collapses to a single sheet in CSV, because a CSV file holds only one grid.
- Formatting carries meaning. Bold totals, colour-coded categories, currency symbols, percentage formats: all lost in CSV.
- Type-safe dates and numbers. XLSX stores 2026-05-02 as a date type. CSV stores it as text, and the next tool decides how to interpret it (often badly).
- The recipient is a human. Humans read XLSX files. Programs read CSV.
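The loss of types is easy to see in a round trip. The sketch below, with a made-up record, writes a date and a number to CSV and reads them back using only the standard library.

```python
import csv
import io
from datetime import date

# Hypothetical record holding a real date object and a real float.
record = {"invoice_date": date(2026, 5, 2), "total": 1234.5}

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(record))
writer.writeheader()
writer.writerow(record)

# Read the same data back: both values are now plain strings.
buffer.seek(0)
restored = next(csv.DictReader(buffer))
print(restored)

# The reader has to know, out of band, how to turn the text back into types.
parsed_date = date.fromisoformat(restored["invoice_date"])
parsed_total = float(restored["total"])
```

This works here only because the reading code knows the date is in ISO format. Nothing in the CSV file itself says so.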
The traps that catch everyone
A few specific issues bite users repeatedly:
The locale comma trap. In French, German, and many other locales, the decimal separator is a comma, not a period, so 1,5 means 1.5. But CSV uses commas as field separators, so Excel in those locales saves CSV with semicolons instead, which then breaks when the file is imported into a tool expecting commas. The result: numbers in the wrong columns, or whole rows merged.
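Here is a small sketch of the locale trap and one way around it, assuming you already know the file uses semicolons and decimal commas. The sample data and the helper read_semicolon_csv are hypothetical, and the conversion does not handle thousands separators such as 1.234,5.

```python
import csv
import io

# Hypothetical export from a locale that writes decimals with a comma.
raw = "produit;prix\nCafé;1,5\nThé;2,25\n"

# Parsed with the default comma delimiter, the decimal comma splits the value.
wrong = list(csv.reader(io.StringIO(raw)))
print(wrong[1])  # ['Café;1', '5']

def read_semicolon_csv(text):
    """Parse semicolon-delimited text and convert decimal commas to floats."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [
        {"produit": row["produit"], "prix": float(row["prix"].replace(",", "."))}
        for row in reader
    ]

prices = read_semicolon_csv(raw)
print(prices)
```

Stating the delimiter explicitly is usually safer than guessing it, because a guess that fails tends to fail silently.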
The date format trap. A CSV with 03/04/2026 is ambiguous: is it 3 April or 4 March? Excel interprets it according to the system locale, sometimes silently rewriting the date. XLSX stores dates as serial numbers (days counted from 1900 in the default date system) together with a date number format, which removes the ambiguity.
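Both halves of this trap can be shown in a few lines. The sketch below parses the same text under two conventions, then converts between dates and Excel-style serial numbers. The helper names are made up, and the conversion assumes Excel's default 1900 date system and dates on or after 1 March 1900.

```python
from datetime import date, datetime, timedelta

text = "03/04/2026"
as_european = datetime.strptime(text, "%d/%m/%Y").date()  # 3 April 2026
as_american = datetime.strptime(text, "%m/%d/%Y").date()  # 4 March 2026
print(as_european, as_american)

# Using 1899-12-30 as day zero matches Excel's 1900 date system for serials
# from 61 (1 March 1900) onward; earlier serials are shifted by Excel's
# historical treatment of 1900 as a leap year.
EXCEL_EPOCH = date(1899, 12, 30)

def excel_serial_to_date(serial):
    """Convert a whole-day Excel serial number to a date."""
    return EXCEL_EPOCH + timedelta(days=serial)

def date_to_excel_serial(day):
    """Convert a date to its whole-day Excel serial number."""
    return (day - EXCEL_EPOCH).days

print(excel_serial_to_date(45292))  # 2024-01-01
```

The serial number is unambiguous because it never passes through a day/month text representation at all.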
The leading zero trap. A phone number, ZIP code, or product SKU that starts with 0 survives in XLSX when the cell is stored as text. In CSV, Excel parses it as a number on reopening and the leading zero disappears; saving the file again makes the loss permanent.
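The zero is not lost by the CSV file itself but by whichever reader converts the text to a number. A short sketch with made-up identifiers:

```python
import csv
import io

# Hypothetical identifiers that happen to look like numbers.
raw = "sku,zip\n00417,02134\n"

row = next(csv.DictReader(io.StringIO(raw)))
print(row["zip"])       # the csv module keeps the text intact: 02134

as_number = int(row["zip"])
print(as_number)        # numeric conversion drops the zero: 2134

# Padding can restore the zero only if the original width is known.
print(str(as_number).zfill(5))  # 02134
```

The practical takeaway is to keep identifiers as strings from end to end and convert only the columns that are genuinely numeric.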
The encoding trap. A CSV exported as ANSI on Windows looks fine until a French name with é or a Japanese filename arrives. Always export CSV as UTF-8 with BOM if Excel will reopen it, or as plain UTF-8 if a script consumes it.
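In Python, the two variants correspond to the "utf-8-sig" and "utf-8" codecs. The sketch below works on bytes in memory, with a made-up name, to show what the BOM is and what the classic garbage looks like.

```python
text = "name\nRenée\n"

with_bom = text.encode("utf-8-sig")  # UTF-8 with BOM, for files Excel reopens
plain = text.encode("utf-8")         # plain UTF-8, for scripts and pipelines

print(with_bom[:3])                  # the BOM itself: b'\xef\xbb\xbf'

# Reading UTF-8 bytes as Windows-1252 ("ANSI") produces the familiar garbage.
garbled = plain.decode("cp1252")
print(garbled.splitlines()[1])       # RenÃ©e

# Decoding with "utf-8-sig" strips the BOM; plain "utf-8" leaves it in place.
clean = with_bom.decode("utf-8-sig")
leftover = with_bom.decode("utf-8")
```

When writing a file, the same codec names go in the encoding argument, for example open(path, "w", newline="", encoding="utf-8-sig"), where path is whatever file you are exporting.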
A practical rule of thumb
Use this simple test:
- Will the file be opened by a human? → XLSX
- Will the file be consumed by a program or pipeline? → CSV
- Are there formulas, multiple sheets, or formatting? → XLSX
- Is the file going to a database, an API, or a partner with unknown tools? → CSV
- Is the file bigger than 100 MB or 500,000 rows? → CSV
When in doubt, keep the master in XLSX and export to CSV when needed. Going the other way, keeping the master in CSV and trying to add formulas and formatting in XLSX afterwards, loses the advantages of each format.
Going further
If you work with spreadsheets daily, two short tutorials cover the common workflows in your browser:
- How to Edit XLSX Spreadsheets Online Without Excel — open, edit, run formulas, and export, no Excel required.
- How to Convert Between JSON, YAML, and CSV — round-trip data between common formats without losing structure.
Both run entirely in your browser and never upload your files to any server.
