Get extracted content from a processed file
GET /files/{fileId}/extracted-content
Returns the content extracted when the file was processed. For PDFs, the response contains markdown text and, optionally, a structured extraction. For XLSX and CSV files, it contains TOON-formatted content.
Authorizations
Parameters
Path Parameters
Unique identifier of the file to retrieve
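As a rough illustration of how the path parameter fits into the request, the sketch below builds (but does not send) the GET request using only the Python standard library. The base URL, the bearer-token scheme, and the JSON `Accept` header are assumptions made for the example and are not confirmed by this page; `build_extracted_content_request` is a hypothetical helper name.

```python
from urllib.parse import quote
from urllib.request import Request


def build_extracted_content_request(base_url: str, file_id: str, token: str) -> Request:
    """Build (but do not send) a GET request for a file's extracted content."""
    # Percent-encode the identifier so it is safe as a single path segment.
    path = f"/files/{quote(file_id, safe='')}/extracted-content"
    return Request(
        base_url.rstrip("/") + path,
        method="GET",
        headers={
            # Assumption: bearer-token auth; the actual scheme may differ.
            "Authorization": f"Bearer {token}",
            # Assumption: the response body is JSON.
            "Accept": "application/json",
        },
    )
```

The returned object can be passed to `urllib.request.urlopen` to perform the call once the real base URL and credentials are known.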
Responses
Extracted content found
object
File type category
Unique identifier of the extraction record, or null if not processed
Extracted content, or null if not processed or extraction failed
object
Extracted markdown/text content (PDF, DOCX)
Total number of pages in the PDF
Total number of text blocks in the PDF
Total number of paragraphs (DOCX)
Total number of tables (DOCX)
TOON-formatted content (XLSX/CSV)
Total number of sheets (XLSX)
Total number of rows
CSV metadata (delimiter, encoding, etc.)
XLSX sheets metadata
Raw text content (text files)
Content type (text/markdown or text/plain)
Character count (text files)
Line count (text files)
Structured extraction (PDF only), or null if not available
object
Unique identifier of the structured extraction
Status: ‘complete’ (all sections succeeded), ‘partial’ (some failed but below threshold), ‘failed’ (too many sections failed)
Document sections/hierarchy
Per-section JSON schemas (null for failed sections)
Per-section extracted data (null for failed sections)
Metadata about extraction including error details for failed sections
object
Total number of sections in the document
Number of successfully extracted sections
Number of failed sections
Ratio of failed content (0.0 to 1.0)
Indices of failed sections
Error details for failed sections
object
Threshold used for failure determination (0.3)
Version of the processor used
When the extraction was performed
Whether the extracted content is empty
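The three structured-extraction statuses described above can be read as a function of the failed-section count, the failure ratio, and the threshold. The sketch below is one plausible reading, not the processor's actual logic: the function and parameter names are hypothetical, and treating a ratio exactly equal to the threshold as 'failed' is an assumption based on the wording "below threshold".

```python
def classify_extraction_status(
    failed_sections: int,
    failure_ratio: float,
    threshold: float = 0.3,
) -> str:
    """Return 'complete', 'partial', or 'failed' for a structured extraction."""
    if not 0.0 <= failure_ratio <= 1.0:
        raise ValueError("failure_ratio must be between 0.0 and 1.0")
    # All sections succeeded.
    if failed_sections == 0:
        return "complete"
    # Some sections failed, but the failed share stays below the threshold.
    # Assumption: a ratio exactly at the threshold is not "below" it.
    if failure_ratio < threshold:
        return "partial"
    # Too many sections failed.
    return "failed"
```

For example, one failed section with a failure ratio of 0.1 would be classified as 'partial' under the documented threshold of 0.3.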
Bad Request - Validation error or invalid input
object
Unauthorized - Authentication required or invalid token
object
Forbidden - Insufficient permissions
object
Not Found - Resource does not exist
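A client might translate the responses listed above into readable messages along the following lines. The numeric status codes are not stated on this page; the sketch assumes the conventional HTTP codes for each description (200, 400, 401, 403, 404), and `explain_status` is a hypothetical helper.

```python
# Assumption: the endpoint uses the conventional HTTP status codes
# for the responses described in this section.
RESPONSE_MEANINGS = {
    200: "Extracted content found",
    400: "Bad Request - Validation error or invalid input",
    401: "Unauthorized - Authentication required or invalid token",
    403: "Forbidden - Insufficient permissions",
    404: "Not Found - Resource does not exist",
}


def explain_status(status_code: int) -> str:
    """Map an HTTP status code to the description documented for this endpoint."""
    return RESPONSE_MEANINGS.get(status_code, f"Unexpected status {status_code}")
```

Any code outside the documented set falls through to a generic message rather than raising, which keeps the helper safe to use in logging paths.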