5 Common Invoice File Formats and How to Convert Between Them
Invoices come in different file formats, and each format serves different purposes. Knowing which format to use—and how to convert between them—saves time and prevents compatibility headaches.
Here are the five most common invoice file formats and when to use each one.
1. PDF (Portable Document Format)
PDF is the universal format for sharing invoices. It preserves formatting exactly, looks the same on any device, and prevents accidental edits.
Characteristics
- Read-only appearance: Looks professional and tamper-resistant
- Universal compatibility: Opens on any computer, phone, or tablet
- Print-ready: What you see is what prints
- Data is locked: Information can't be easily extracted for processing
Best Used For
- Sending invoices to customers
- Archiving final invoice copies
- Legal and compliance documentation
- Sharing with anyone who just needs to view or print
Limitations
- Can't edit data without special software
- Can't import directly into accounting systems
- Requires extraction to use data in other applications
Converting From PDF
To get data out of a PDF invoice:
- Upload to ConvertMyInvoice
- AI extracts line item data automatically
- Download as CSV, XML, or JSON
This converts the "locked" visual format into usable structured data.
2. CSV (Comma-Separated Values)
CSV is the simplest data format. It's plain text with commas separating values and line breaks separating records.
Position,Description,Quantity,UnitPrice,Total
1,Office Chair,2,199.99,399.98
2,Desk Lamp,5,34.50,172.50
Characteristics
- Human-readable: Open in any text editor
- Spreadsheet-native: Opens directly in Excel and Google Sheets
- Broadly compatible: Nearly every business application supports CSV import
- Minimal file size: No overhead, just data
Best Used For
- Importing into accounting software (QuickBooks, Xero, Wave)
- Spreadsheet analysis and reporting
- Data exchange between systems
- Bulk data entry replacement
Limitations
- No formatting (no bold, colors, or styling)
- Flat structure only (no hierarchy or nesting)
- Special characters can cause issues (commas, quotes)
- No data type information (everything is text)
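The last two limitations can be handled with a standard CSV parser plus explicit type conversion. The following is a minimal sketch using Python's built-in `csv` module. The sample data reuses the column names from the example above; the second description (with an embedded comma and quote) and the `read_line_items` helper are made up for illustration.

```python
import csv
import io
from decimal import Decimal

# Illustrative data: the second description contains a comma and a quote,
# so the field is wrapped in quotes and the inner quote is doubled.
SAMPLE = '''Position,Description,Quantity,UnitPrice,Total
1,Office Chair,2,199.99,399.98
2,"Desk Lamp, 12"" arm",5,34.50,172.50
'''

def read_line_items(text):
    """Parse invoice CSV text and convert numeric columns from strings."""
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        items.append({
            "position": int(row["Position"]),
            "description": row["Description"],
            "quantity": int(row["Quantity"]),
            # Decimal avoids floating-point rounding on money values.
            "unit_price": Decimal(row["UnitPrice"]),
            "total": Decimal(row["Total"]),
        })
    return items

items = read_line_items(SAMPLE)
```

Splitting each line on commas by hand would break the second row; a real CSV parser reads the quoted field correctly.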
Converting To/From CSV
PDF → CSV: Use ConvertMyInvoice to extract invoice data into CSV format
CSV → Excel: Simply open the file in Excel, or use File → Import
CSV → JSON/XML: Use an online converter or a short script; Excel can also export XML through its XML mapping feature, but it has no built-in JSON export
For a deeper comparison of data formats, see CSV vs XML vs JSON: Which Format is Best for Invoice Data?
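If an online converter is not an option, the CSV → JSON step takes only a few lines of code. This sketch uses Python's standard library; the function name `csv_to_json` is an assumption for illustration, not part of any tool mentioned here.

```python
import csv
import io
import json

def csv_to_json(csv_text):
    """Turn CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Every value stays a string, because CSV carries no type information.
    return json.dumps(rows, indent=2)

sample_csv = (
    "Position,Description,Quantity,UnitPrice,Total\n"
    "1,Office Chair,2,199.99,399.98\n"
)
json_text = csv_to_json(sample_csv)
```

Note that `Quantity` comes out as the string `"2"`, not the number 2. Convert the numeric columns before serializing if the consumer expects numbers.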
3. XML (eXtensible Markup Language)
XML uses tags to structure data hierarchically. It's verbose but self-describing—the format explains what the data means.
<?xml version="1.0"?>
<Invoice>
  <InvoiceNumber>INV-001</InvoiceNumber>
  <LineItems>
    <Item>
      <Description>Office Chair</Description>
      <Quantity>2</Quantity>
      <UnitPrice>199.99</UnitPrice>
    </Item>
  </LineItems>
</Invoice>
Characteristics
- Self-describing: Tags explain what each value represents
- Hierarchical: Supports nested, complex data structures
- Validatable: XML schemas can enforce data rules
- Enterprise standard: Common in large organizations and B2B
Best Used For
- ERP system integration (SAP, Oracle, Microsoft Dynamics)
- B2B data interchange
- Government and regulatory submissions
- Automated workflow processing
- Long-term data archival
Limitations
- Verbose (larger files than CSV or JSON)
- Harder to read for non-technical users
- Requires XML-aware tools for proper viewing
- Overkill for simple spreadsheet tasks
Converting To/From XML
PDF → XML: ConvertMyInvoice offers XML as an output option
XML → CSV: Excel can import XML (Data → Get Data → From File → From XML); save the resulting sheet as CSV
XML → JSON: Many online converters available, or use programming libraries
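For the programming-library route, here is a rough sketch of XML → CSV using Python's built-in `xml.etree.ElementTree`. It assumes the element names from the sample invoice above; real invoice XML will usually follow a different schema, often with namespaces, so treat the paths as placeholders.

```python
import csv
import io
import xml.etree.ElementTree as ET

XML_TEXT = """<?xml version="1.0"?>
<Invoice>
  <InvoiceNumber>INV-001</InvoiceNumber>
  <LineItems>
    <Item>
      <Description>Office Chair</Description>
      <Quantity>2</Quantity>
      <UnitPrice>199.99</UnitPrice>
    </Item>
  </LineItems>
</Invoice>"""

def xml_to_csv(xml_text):
    """Flatten each <Item> into one CSV row, repeating the invoice number."""
    root = ET.fromstring(xml_text)
    invoice_number = root.findtext("InvoiceNumber")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["InvoiceNumber", "Description", "Quantity", "UnitPrice"])
    for item in root.iterfind("LineItems/Item"):
        writer.writerow([
            invoice_number,
            item.findtext("Description"),
            item.findtext("Quantity"),
            item.findtext("UnitPrice"),
        ])
    return out.getvalue()

csv_text = xml_to_csv(XML_TEXT)
```

Because CSV is flat, header-level values such as the invoice number have to be repeated on every line-item row.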
4. JSON (JavaScript Object Notation)
JSON is the data format of the modern web. It's concise, readable, and native to web applications.
{
  "invoiceNumber": "INV-001",
  "lineItems": [
    {
      "description": "Office Chair",
      "quantity": 2,
      "unitPrice": 199.99
    }
  ]
}
Characteristics
- Lightweight: Less verbose than XML
- Web-native: Default format for APIs and web applications
- Developer-friendly: Works naturally in JavaScript, Python, and most languages
- Supports complex structures: Arrays, objects, nesting
Best Used For
- API integrations
- Custom application development
- Modern software stacks
- NoSQL database storage
- Webhook payloads
Limitations
- No native Excel support (requires conversion)
- Few accounting applications accept JSON directly
- Less familiar to non-technical business users
- No built-in schema validation (JSON Schema exists, but as a separate, optional specification)
Converting To/From JSON
PDF → JSON: ConvertMyInvoice provides JSON output option
JSON → CSV: Many online converters; Excel Power Query can handle it
JSON → XML: Programming libraries or online converters
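As one possible library-based approach, JSON → CSV can be done with Python's `json` and `csv` modules. The sketch below assumes the field names from the sample JSON above (`invoiceNumber`, `lineItems`, and so on); the `json_to_csv` helper is hypothetical.

```python
import csv
import io
import json
from decimal import Decimal

JSON_TEXT = """{
  "invoiceNumber": "INV-001",
  "lineItems": [
    {"description": "Office Chair", "quantity": 2, "unitPrice": 199.99}
  ]
}"""

def json_to_csv(json_text):
    """Write one CSV row per line item, repeating the invoice number."""
    # parse_float=Decimal keeps prices exact instead of binary floats.
    invoice = json.loads(json_text, parse_float=Decimal)
    fields = ["invoiceNumber", "description", "quantity", "unitPrice"]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fields, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for item in invoice["lineItems"]:
        writer.writerow({"invoiceNumber": invoice["invoiceNumber"], **item})
    return out.getvalue()

csv_text = json_to_csv(JSON_TEXT)
```

With `extrasaction="ignore"`, any line-item keys outside the chosen columns are silently dropped, so extend `fields` if the real data carries more.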
5. EDI (Electronic Data Interchange)
EDI is the oldest electronic invoice format, dating to the 1960s. It uses terse, delimiter-separated segment codes that machines read easily but humans struggle with.
ISA*00* *00* *ZZ*SENDER...
GS*IN*SENDER*RECEIVER*20231015*1234*1*X*004010
ST*810*0001
BIG*20231015*INV-001
...
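To make the structure less opaque, here is a minimal sketch of how an X12-style message splits into segments and elements. It assumes `*` as the element separator and `~` as the segment terminator, which are common but are actually declared per interchange in the ISA segment. The sample string reuses only the complete ST and BIG segments from the excerpt above. This is a reading aid, not a substitute for real EDI software.

```python
def parse_segments(edi_text, element_sep="*", segment_term="~"):
    """Split X12-style text into a list of segments, each a list of elements."""
    flat = edi_text.replace("\r", "").replace("\n", "")
    return [seg.split(element_sep) for seg in flat.split(segment_term) if seg]

# Assumed sample: two segments joined with the "~" terminator.
sample = "ST*810*0001~BIG*20231015*INV-001~"
segments = parse_segments(sample)

# In an 810 (invoice) transaction, BIG carries the invoice date and number.
big = next(seg for seg in segments if seg[0] == "BIG")
invoice_date, invoice_number = big[1], big[2]
```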
Characteristics
- Machine-optimized: Designed for computer processing, not human reading
- Standardized: ANSI X12 and UN/EDIFACT standards
- Established: Decades of use in large enterprises
- Integrated: Direct system-to-system transmission
Best Used For
- Large retailer requirements (Walmart, Amazon Vendor Central)
- Automotive industry supply chains
- Healthcare claims processing
- Any industry with established EDI mandates
Limitations
- Extremely difficult to read or debug manually
- Requires specialized EDI software or services
- High implementation cost
- Overkill for small business needs
Converting To/From EDI
EDI conversion typically requires specialized software or third-party services. It's not a DIY format for most businesses. If a trading partner requires EDI, they usually provide specifications and sometimes conversion services.
For small businesses receiving EDI requirements, options include:
- EDI service providers (SPS Commerce, TrueCommerce)
- Trading partner portals that convert to/from EDI
- Accounting software with built-in EDI support
Format Comparison Summary
| Format | Readability | Excel Compatible | Accounting Import | Developer Use | File Size |
|---|---|---|---|---|---|
| PDF | Excellent | No | No | No | Medium |
| CSV | Good | Excellent | Excellent | Limited | Smallest |
| XML | Moderate | Limited | Some systems | Good | Largest |
| JSON | Good | Limited (Power Query) | Rare | Excellent | Small |
| EDI | Poor | No | Specialized | Specialized | Small |
Choosing the Right Format
Use this decision guide:
You need to send an invoice to a customer → PDF
You need to import data into accounting software → CSV
You need to integrate with enterprise ERP → XML
You need to feed data into custom applications or APIs → JSON
Your trading partner mandates it → EDI (get help)
You're not sure → Start with CSV (most flexible)
Converting Between Formats: Practical Workflow
Most invoice format conversion follows this pattern:
Starting with a PDF Invoice
- Extract data using ConvertMyInvoice → outputs CSV, XML, or JSON
- Use the extracted format directly, or convert further
Starting with CSV
- To Excel: Open directly (double-click or File → Open)
- To JSON: Use an online converter or a short script (Excel has no built-in JSON export)
- To XML: Use online converters or Excel's XML mapping
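For the CSV → XML direction, a script is also an option. This is a sketch under the assumption that the target layout matches the sample XML shown earlier; the `csv_to_xml` helper and its `invoice_number` parameter are illustrative only.

```python
import csv
import io
import xml.etree.ElementTree as ET

def csv_to_xml(csv_text, invoice_number):
    """Build an <Invoice> document with one <Item> per CSV row."""
    invoice = ET.Element("Invoice")
    ET.SubElement(invoice, "InvoiceNumber").text = invoice_number
    line_items = ET.SubElement(invoice, "LineItems")
    for row in csv.DictReader(io.StringIO(csv_text)):
        item = ET.SubElement(line_items, "Item")
        for column in ("Description", "Quantity", "UnitPrice"):
            ET.SubElement(item, column).text = row[column]
    return ET.tostring(invoice, encoding="unicode")

sample_csv = (
    "Position,Description,Quantity,UnitPrice,Total\n"
    "1,Office Chair,2,199.99,399.98\n"
)
xml_text = csv_to_xml(sample_csv, "INV-001")
```

The invoice number is passed in separately because the flat line-item CSV has no column for header-level data.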
Starting with JSON or XML
- To CSV: Online converters, or Excel's import features
- To each other: Online converters or programming libraries
The key insight: CSV is the universal interchange format. When in doubt, get your data into CSV first, then convert to whatever you need.
Frequently Asked Questions
Which format preserves the original invoice appearance?
Only PDF preserves exact visual appearance. All other formats (CSV, XML, JSON, EDI) contain only data, not formatting. If you need a visual record, keep the original PDF alongside any data extractions.
Can I convert an invoice back to PDF after extracting data?
You can create a new PDF from data, but it won't look like the original invoice. PDF extraction is essentially one-way for visual fidelity—you get the data out, but regenerating the exact original layout isn't practical.
What format should I use for archiving invoices?
Keep original PDFs for legal/visual archives. For data archives that support analysis and searching, CSV or XML are good choices due to their simplicity and long-term compatibility.
Why does my accounting software only accept CSV?
CSV is the most widely supported data format. It's simple, well understood, and readable from virtually every programming language and spreadsheet. Accounting software vendors implement CSV import because nearly every other system can produce it.
How do I handle invoices that arrive in different formats from different vendors?
Convert everything to a common format (usually CSV or your accounting software's preferred format). This standardization happens at the point of receipt—whether manually or through extraction tools—so your downstream processes work consistently.
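One way to picture that standardization step is a small function that maps each incoming format onto the same row shape. The sketch below is hypothetical: the `normalize` function and the output keys are invented for illustration, and the input field names are assumed to match the CSV, JSON, and XML samples in this article.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET
from decimal import Decimal

def normalize(fmt, text):
    """Return line items as dicts with description, quantity, unit_price."""
    if fmt == "csv":
        raw = [(r["Description"], r["Quantity"], r["UnitPrice"])
               for r in csv.DictReader(io.StringIO(text))]
    elif fmt == "json":
        raw = [(i["description"], i["quantity"], i["unitPrice"])
               for i in json.loads(text)["lineItems"]]
    elif fmt == "xml":
        raw = [(i.findtext("Description"), i.findtext("Quantity"),
                i.findtext("UnitPrice"))
               for i in ET.fromstring(text).iterfind("LineItems/Item")]
    else:
        raise ValueError(f"unsupported format: {fmt}")
    # Convert to common types so rows compare equal across sources.
    return [
        {"description": d, "quantity": int(q), "unit_price": Decimal(str(p))}
        for d, q, p in raw
    ]
```

Downstream code then only deals with one row shape, regardless of how each vendor's invoice arrived.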
Need to convert PDF invoices to CSV, XML, or JSON? ConvertMyInvoice extracts invoice data and exports to your preferred format. Upload a PDF, choose your output format, and download—free, no signup required.