5 Common Invoice File Formats and How to Convert Between Them

Understand the most common invoice file formats (PDF, CSV, XML, JSON, EDI) and learn how to convert between them for different business needs.

Invoices come in different file formats, and each format serves different purposes. Knowing which format to use—and how to convert between them—saves time and prevents compatibility headaches.

Here are the five most common invoice file formats and when to use each one.

1. PDF (Portable Document Format)

PDF is the universal format for sharing invoices. It preserves formatting exactly, looks the same on any device, and prevents accidental edits.

Characteristics

  • Read-only appearance: Looks professional and discourages casual edits (only a digitally signed PDF is genuinely tamper-evident)
  • Universal compatibility: Opens on any computer, phone, or tablet
  • Print-ready: What you see is what prints
  • Data is locked: Information can't be easily extracted for processing

Best Used For

  • Sending invoices to customers
  • Archiving final invoice copies
  • Legal and compliance documentation
  • Sharing with anyone who just needs to view or print

Limitations

  • Can't edit data without special software
  • Can't import directly into accounting systems
  • Requires extraction to use data in other applications

Converting From PDF

To get data out of a PDF invoice:

  1. Upload to ConvertMyInvoice
  2. AI extracts line item data automatically
  3. Download as CSV, XML, or JSON

This converts the "locked" visual format into usable structured data.

2. CSV (Comma-Separated Values)

CSV is the simplest data format. It's plain text with commas separating values and line breaks separating records.

Position,Description,Quantity,UnitPrice,Total
1,Office Chair,2,199.99,399.98
2,Desk Lamp,5,34.50,172.50

Characteristics

  • Human-readable: Open in any text editor
  • Spreadsheet-native: Opens directly in Excel and Google Sheets (watch for auto-formatting that strips leading zeros or reinterprets dates)
  • Widely compatible: Nearly every business application supports CSV import
  • Minimal file size: No overhead, just data

Best Used For

  • Importing into accounting software (QuickBooks, Xero, Wave)
  • Spreadsheet analysis and reporting
  • Data exchange between systems
  • Bulk data entry replacement

Limitations

  • No formatting (no bold, colors, or styling)
  • Flat structure only (no hierarchy or nesting)
  • Special characters can cause issues (commas, quotes)
  • No data type information (everything is text)
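
The last two limitations are easier to see in code. Here is a minimal sketch using Python's standard csv module. The second line item, with a comma and a quote mark in its description, is made up for illustration. A CSV library handles the quoting for you, but every value still comes back as text and has to be converted explicitly.

```python
import csv
import io
from decimal import Decimal

# Hypothetical line items; the second description contains a comma and a quote.
rows = [
    {"Position": 1, "Description": "Office Chair", "Quantity": 2, "UnitPrice": Decimal("199.99")},
    {"Position": 2, "Description": 'Desk Lamp, 12" arm', "Quantity": 5, "UnitPrice": Decimal("34.50")},
]

# Write: the csv module quotes any field that contains a comma or a quote.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["Position", "Description", "Quantity", "UnitPrice"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

# Read: every value arrives as a string, so types are restored by hand.
parsed = []
for record in csv.DictReader(io.StringIO(csv_text)):
    parsed.append({
        "Position": int(record["Position"]),
        "Description": record["Description"],
        "Quantity": int(record["Quantity"]),
        "UnitPrice": Decimal(record["UnitPrice"]),
    })

print(csv_text)
```

Splitting lines on commas by hand would break on the second row, which is why a real CSV parser is worth using even for small files.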

Converting To/From CSV

PDF → CSV: Use ConvertMyInvoice to extract invoice data into CSV format

CSV → Excel: Simply open the file in Excel, or use File → Import

CSV → JSON/XML: Use an online converter or a short script. Excel can export XML through its XML mapping feature, but it has no built-in JSON export
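
If you would rather script the CSV → JSON step, a rough sketch in Python might look like this. It reuses the sample CSV from earlier in this section. The function name csv_to_json and the lower-camel-case JSON keys are assumptions for illustration, not a standard.

```python
import csv
import io
import json

csv_text = """Position,Description,Quantity,UnitPrice,Total
1,Office Chair,2,199.99,399.98
2,Desk Lamp,5,34.50,172.50
"""


def csv_to_json(text: str) -> str:
    """Convert line-item CSV text into a JSON document (illustrative layout)."""
    line_items = []
    for row in csv.DictReader(io.StringIO(text)):
        line_items.append({
            "position": int(row["Position"]),
            "description": row["Description"],
            "quantity": int(row["Quantity"]),
            # Amounts stay as strings so no floating-point rounding creeps in.
            "unitPrice": row["UnitPrice"],
            "total": row["Total"],
        })
    return json.dumps({"lineItems": line_items}, indent=2)


print(csv_to_json(csv_text))
```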

For a deeper comparison of data formats, see CSV vs XML vs JSON: Which Format is Best for Invoice Data?

3. XML (eXtensible Markup Language)

XML uses tags to structure data hierarchically. It's verbose but self-describing—the format explains what the data means.

<?xml version="1.0"?>
<Invoice>
  <InvoiceNumber>INV-001</InvoiceNumber>
  <LineItems>
    <Item>
      <Description>Office Chair</Description>
      <Quantity>2</Quantity>
      <UnitPrice>199.99</UnitPrice>
    </Item>
  </LineItems>
</Invoice>

Characteristics

  • Self-describing: Tags explain what each value represents
  • Hierarchical: Supports nested, complex data structures
  • Validatable: XML schemas can enforce data rules
  • Enterprise standard: Common in large organizations and B2B

Best Used For

  • ERP system integration (SAP, Oracle, Microsoft Dynamics)
  • B2B data interchange
  • Government and regulatory submissions
  • Automated workflow processing
  • Long-term data archival

Limitations

  • Verbose (larger files than CSV or JSON)
  • Harder to read for non-technical users
  • Requires XML-aware tools for proper viewing
  • Overkill for simple spreadsheet tasks

Converting To/From XML

PDF → XML: ConvertMyInvoice offers XML as an output option

XML → CSV: Excel can import XML (Data → Get Data → From File → From XML), then save the result as CSV

XML → JSON: Many online converters available, or use programming libraries
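
As one example of the programming-library route, the sketch below uses Python's built-in xml.etree.ElementTree to turn the sample invoice above into JSON. It assumes the XML follows exactly the element names shown in that sample. Real invoice schemas differ, so treat the paths as placeholders.

```python
import json
import xml.etree.ElementTree as ET

xml_text = """<?xml version="1.0"?>
<Invoice>
  <InvoiceNumber>INV-001</InvoiceNumber>
  <LineItems>
    <Item>
      <Description>Office Chair</Description>
      <Quantity>2</Quantity>
      <UnitPrice>199.99</UnitPrice>
    </Item>
  </LineItems>
</Invoice>"""


def invoice_xml_to_dict(text: str) -> dict:
    """Map the sample XML layout onto a JSON-style dictionary."""
    root = ET.fromstring(text)
    return {
        "invoiceNumber": root.findtext("InvoiceNumber"),
        "lineItems": [
            {
                "description": item.findtext("Description"),
                "quantity": int(item.findtext("Quantity")),
                # Kept as a string to avoid floating-point rounding of money.
                "unitPrice": item.findtext("UnitPrice"),
            }
            for item in root.iterfind("LineItems/Item")
        ],
    }


invoice = invoice_xml_to_dict(xml_text)
print(json.dumps(invoice, indent=2))
```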

4. JSON (JavaScript Object Notation)

JSON is the data format of the modern web. It's concise, readable, and native to web applications.

{
  "invoiceNumber": "INV-001",
  "lineItems": [
    {
      "description": "Office Chair",
      "quantity": 2,
      "unitPrice": 199.99
    }
  ]
}

Characteristics

  • Lightweight: Less verbose than XML
  • Web-native: Default format for APIs and web applications
  • Developer-friendly: Works naturally in JavaScript, Python, and most languages
  • Supports complex structures: Arrays, objects, nesting

Best Used For

  • API integrations
  • Custom application development
  • Modern software stacks
  • NoSQL database storage
  • Webhook payloads

Limitations

  • No native Excel support (requires conversion)
  • Few accounting applications accept JSON directly
  • Less familiar to non-technical business users
  • No built-in schema validation (JSON Schema exists as a separate specification and needs a validator library)
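
Because the format itself does not enforce structure, whoever receives the JSON has to check it. The sketch below shows hand-written checks against the sample invoice layout above. The function name validate_invoice and the specific rules are assumptions for illustration. A dedicated validator library would be the more robust choice in production.

```python
import json

# Expected type for each line-item field (illustrative rules only).
REQUIRED_ITEM_FIELDS = {"description": str, "quantity": int, "unitPrice": (int, float)}


def _has_type(value, expected) -> bool:
    # bool is a subclass of int in Python, so reject it explicitly.
    return isinstance(value, expected) and not isinstance(value, bool)


def validate_invoice(payload: str) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top level must be an object"]

    problems = []
    if not isinstance(data.get("invoiceNumber"), str):
        problems.append("invoiceNumber must be a string")
    items = data.get("lineItems")
    if not isinstance(items, list) or not items:
        problems.append("lineItems must be a non-empty array")
        return problems
    for index, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append(f"lineItems[{index}] must be an object")
            continue
        for field, expected in REQUIRED_ITEM_FIELDS.items():
            if not _has_type(item.get(field), expected):
                problems.append(f"lineItems[{index}].{field} is missing or has the wrong type")
    return problems


good = '{"invoiceNumber": "INV-001", "lineItems": [{"description": "Office Chair", "quantity": 2, "unitPrice": 199.99}]}'
print(validate_invoice(good))
```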

Converting To/From JSON

PDF → JSON: ConvertMyInvoice provides JSON output option

JSON → CSV: Many online converters; Excel's Power Query can also load JSON (Data → Get Data → From File → From JSON)

JSON → XML: Programming libraries or online converters
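
For the JSON → CSV direction, the main job is flattening the nested structure into rows. The sketch below is one way to do it in Python for the sample invoice above. The column names and the choice to repeat the invoice number on every row are assumptions, since CSV has no agreed layout for invoices.

```python
import csv
import io
import json
from decimal import Decimal

invoice_json = """{
  "invoiceNumber": "INV-001",
  "lineItems": [
    {
      "description": "Office Chair",
      "quantity": 2,
      "unitPrice": 199.99
    }
  ]
}"""


def json_to_csv(payload: str) -> str:
    """Flatten the invoice's line items into CSV text, one row per item."""
    # parse_float=Decimal keeps prices exact instead of using binary floats.
    invoice = json.loads(payload, parse_float=Decimal)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["InvoiceNumber", "Description", "Quantity", "UnitPrice"])
    for item in invoice["lineItems"]:
        # CSV is flat, so the invoice number is repeated on every row.
        writer.writerow([
            invoice["invoiceNumber"],
            item["description"],
            item["quantity"],
            item["unitPrice"],
        ])
    return buffer.getvalue()


print(json_to_csv(invoice_json))
```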

5. EDI (Electronic Data Interchange)

EDI is the oldest electronic invoice format, dating to the 1960s. It packs data into terse, delimiter-separated segments identified by short codes, which machines read easily but humans struggle with.

ISA*00*          *00*          *ZZ*SENDER...
GS*IN*SENDER*RECEIVER*20231015*1234*1*X*004010
ST*810*0001
BIG*20231015*INV-001
...
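
To show how a machine reads those segments, here is a small Python sketch that splits the complete segments from the sample above into elements. It assumes one segment per line and "*" as the element separator, as in the sample. Real X12 files declare their own separators in the ISA header. This is a teaching aid, not a substitute for EDI software.

```python
# The three complete segments from the sample above.
sample = """GS*IN*SENDER*RECEIVER*20231015*1234*1*X*004010
ST*810*0001
BIG*20231015*INV-001"""


def parse_segments(text: str, element_sep: str = "*") -> list[list[str]]:
    """Split EDI text into segments, each a list of elements."""
    segments = []
    for line in text.splitlines():
        # Drop surrounding whitespace and a trailing "~" terminator if present.
        line = line.strip().rstrip("~")
        if line:
            segments.append(line.split(element_sep))
    return segments


segments = parse_segments(sample)

# In an X12 810 invoice, the BIG segment carries the invoice date and number.
big = next(seg for seg in segments if seg[0] == "BIG")
invoice_date, invoice_number = big[1], big[2]

print(invoice_number, invoice_date)
```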

Characteristics

  • Machine-optimized: Designed for computer processing, not human reading
  • Standardized: ANSI X12 and UN/EDIFACT standards
  • Established: Decades of use in large enterprises
  • Integrated: Direct system-to-system transmission

Best Used For

  • Large retailer requirements (Walmart, Amazon Vendor Central)
  • Automotive industry supply chains
  • Healthcare claims processing
  • Any industry with established EDI mandates

Limitations

  • Extremely difficult to read or debug manually
  • Requires specialized EDI software or services
  • High implementation cost
  • Overkill for small business needs

Converting To/From EDI

EDI conversion typically requires specialized software or third-party services. It's not a DIY format for most businesses. If a trading partner requires EDI, they usually provide specifications and sometimes conversion services.

For small businesses receiving EDI requirements, options include:

  • EDI service providers (SPS Commerce, TrueCommerce)
  • Trading partner portals that convert to/from EDI
  • Accounting software with built-in EDI support

Format Comparison Summary

Format | Readability | Excel Compatible | Accounting Import | Developer Use | File Size
PDF    | Excellent   | No               | No                | No            | Medium
CSV    | Good        | Excellent        | Excellent         | Limited       | Smallest
XML    | Moderate    | Limited          | Some systems      | Good          | Largest
JSON   | Good        | No               | Rare              | Excellent     | Small
EDI    | Poor        | No               | Specialized       | Specialized   | Small

Choosing the Right Format

Use this decision guide:

You need to send an invoice to a customer → PDF

You need to import data into accounting software → CSV

You need to integrate with enterprise ERP → XML

You need to feed data into custom applications or APIs → JSON

Your trading partner mandates it → EDI (get help)

You're not sure → Start with CSV (most flexible)

Converting Between Formats: Practical Workflow

Most invoice format conversion follows this pattern:

Starting with a PDF Invoice

  1. Extract data using ConvertMyInvoice → outputs CSV, XML, or JSON
  2. Use the extracted format directly, or convert further

Starting with CSV

  • To Excel: Open directly (double-click or File → Open)
  • To JSON: Use an online converter or a short script (Excel has no built-in JSON export)
  • To XML: Use online converters or Excel's XML mapping

Starting with JSON or XML

  • To CSV: Online converters, or Excel's import features
  • To each other: Online converters or programming libraries

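The key insight: for flat, tabular invoice data, CSV is the closest thing to a universal interchange format. When in doubt, get your line items into CSV first, then convert to whatever you need. Nested details that don't fit a single table are better kept in XML or JSON.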

Frequently Asked Questions

Which format preserves the original invoice appearance?

Only PDF preserves exact visual appearance. All other formats (CSV, XML, JSON, EDI) contain only data, not formatting. If you need a visual record, keep the original PDF alongside any data extractions.

Can I convert an invoice back to PDF after extracting data?

You can create a new PDF from data, but it won't look like the original invoice. PDF extraction is essentially one-way for visual fidelity—you get the data out, but regenerating the exact original layout isn't practical.

What format should I use for archiving invoices?

Keep original PDFs for legal/visual archives. For data archives that support analysis and searching, CSV and XML are good choices due to their simplicity and long-term compatibility.

Why does my accounting software only accept CSV?

CSV is the most widely supported data format. It's simple, well understood, and readable by virtually every programming language and business application. Accounting software vendors implement CSV import because nearly any other tool can produce it.

How do I handle invoices that arrive in different formats from different vendors?

Convert everything to a common format (usually CSV or your accounting software's preferred format). This standardization happens at the point of receipt—whether manually or through extraction tools—so your downstream processes work consistently.
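
If you automate that standardization yourself, the idea can be sketched as a single loader that accepts CSV, JSON, or XML and always returns the same structure. The field names, the load_line_items function, and the input layouts (borrowed from the samples in this article) are all assumptions. Real vendor files will need their own mappings.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET
from decimal import Decimal


def _item(description, quantity, unit_price) -> dict:
    # One shared shape for every source format.
    return {
        "description": description,
        "quantity": int(quantity),
        "unit_price": Decimal(str(unit_price)),
    }


def load_line_items(text: str, fmt: str) -> list[dict]:
    """Normalize line items from CSV, JSON, or XML text into one structure."""
    if fmt == "csv":
        return [_item(r["Description"], r["Quantity"], r["UnitPrice"])
                for r in csv.DictReader(io.StringIO(text))]
    if fmt == "json":
        return [_item(i["description"], i["quantity"], i["unitPrice"])
                for i in json.loads(text)["lineItems"]]
    if fmt == "xml":
        return [_item(e.findtext("Description"), e.findtext("Quantity"), e.findtext("UnitPrice"))
                for e in ET.fromstring(text).iterfind("LineItems/Item")]
    raise ValueError(f"unsupported format: {fmt}")


csv_in = "Description,Quantity,UnitPrice\nOffice Chair,2,199.99\n"
json_in = '{"lineItems": [{"description": "Office Chair", "quantity": 2, "unitPrice": 199.99}]}'
xml_in = ("<Invoice><LineItems><Item><Description>Office Chair</Description>"
          "<Quantity>2</Quantity><UnitPrice>199.99</UnitPrice></Item></LineItems></Invoice>")

for text, fmt in [(csv_in, "csv"), (json_in, "json"), (xml_in, "xml")]:
    print(fmt, load_line_items(text, fmt))
```

Whatever the vendor sends, downstream code only ever sees the one normalized shape.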


Need to convert PDF invoices to CSV, XML, or JSON? ConvertMyInvoice extracts invoice data and exports to your preferred format. Upload a PDF, choose your output format, and download—free, no signup required.