🗂️
Extract
Official · Active · by UniSkill Team
Structured data extraction from documents, PDFs, and web pages.
Description
The Extract skill uses LLM-guided structured extraction to pull typed data from unstructured sources — PDFs, HTML documents, and plain text. Define your target JSON schema and the skill will parse, validate, and return clean typed output. Supports nested objects, arrays, and optional fields with confidence scores. Perfect for automating data entry, invoice processing, and document intelligence pipelines.
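As an illustration of a target schema with nested objects, arrays, and optional fields, here is a minimal sketch in Python. The invoice field names are hypothetical, and which JSON Schema keywords the skill honors (for example `format`) is an assumption, not something this page confirms.

```python
import json

# Hypothetical invoice schema; every field name here is illustrative only.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        # Nested object
        "vendor": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "tax_id": {"type": "string"},
            },
            "required": ["name"],
        },
        # Array of objects
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "quantity": {"type": "number"},
                    "unit_price": {"type": "number"},
                },
                "required": ["description"],
            },
        },
        "total": {"type": "number"},
        # Assumption: the "format" keyword may or may not be enforced.
        "due_date": {"type": "string", "format": "date"},
    },
    # Anything not listed here is optional.
    "required": ["invoice_number", "vendor", "total"],
}


def optional_fields(schema):
    """Return top-level properties that are not listed under "required"."""
    required = set(schema.get("required", []))
    return sorted(name for name in schema.get("properties", {}) if name not in required)


print(json.dumps(INVOICE_SCHEMA, indent=2))
print(optional_fields(INVOICE_SCHEMA))  # ['due_date', 'line_items']
```

Optional fields are expressed in the usual JSON Schema way, by leaving them out of `required`.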
API Reference
Input Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| source | string | Yes | URL or raw text content to extract from |
| schema | object | Yes | JSON Schema defining the extraction target structure |
| source_type | 'url' \| 'text' \| 'pdf' | No | Type of input (auto-detected if omitted) |
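The three parameters above can be assembled into a request body as in the sketch below. The helper name `build_extract_payload` and the client-side checks are assumptions for illustration; the service may validate input differently.

```python
import json

# Values listed for source_type in the parameter table.
ALLOWED_SOURCE_TYPES = {"url", "text", "pdf"}


def build_extract_payload(source, schema, source_type=None):
    """Assemble a request body from the documented input parameters.

    Hypothetical helper: it only mirrors the parameter table and does not
    contact the API.
    """
    if not isinstance(source, str) or not source:
        raise ValueError("source must be a non-empty string")
    if not isinstance(schema, dict):
        raise ValueError("schema must be a JSON Schema object (dict)")

    payload = {"source": source, "schema": schema}

    # source_type is optional; when omitted the service auto-detects it.
    if source_type is not None:
        if source_type not in ALLOWED_SOURCE_TYPES:
            raise ValueError(
                f"source_type must be one of {sorted(ALLOWED_SOURCE_TYPES)}"
            )
        payload["source_type"] = source_type
    return payload


payload = build_extract_payload(
    "Invoice INV-1042, total due: 118.50 USD",
    {"type": "object", "properties": {"total": {"type": "number"}}},
    source_type="text",
)
print(json.dumps(payload, indent=2))
```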
Response Schema
| Field | Type | Description |
|---|---|---|
| data | object | Extracted data conforming to the provided schema |
| confidence | number | Overall extraction confidence (0–1) |
| missing_fields | string[] | Schema fields not found in source |
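One way a caller might act on `confidence` and `missing_fields` is sketched below. The 0.8 threshold, the `review_extraction` helper, and the sample response are all hypothetical; only the three field names come from the table above.

```python
def review_extraction(response, min_confidence=0.8):
    """Decide whether an extraction result can be accepted automatically.

    Flags the result when the overall confidence is below the threshold
    or when any schema fields were reported missing.
    """
    reasons = []
    confidence = response["confidence"]
    missing = response.get("missing_fields", [])

    if confidence < min_confidence:
        reasons.append(f"confidence {confidence:.2f} below {min_confidence:.2f}")
    if missing:
        reasons.append("missing fields: " + ", ".join(missing))

    return {"data": response["data"], "accepted": not reasons, "reasons": reasons}


# Hypothetical response, shaped like the Response Schema table.
sample = {
    "data": {"invoice_number": "INV-1042", "total": 118.5},
    "confidence": 0.72,
    "missing_fields": ["due_date"],
}

result = review_extraction(sample)
print(result["accepted"])  # False
print(result["reasons"])
```

Routing low-confidence or incomplete results to manual review is a design choice of this sketch, not documented behavior of the skill.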
Use Cases
- Invoice processing — extract line items, totals, vendor info
- Resume parsing — pull structured candidate profiles
- Contract analysis — extract dates, parties, and obligations
- Product catalog enrichment from supplier PDFs
Pricing
Cost per Request: 2 CR
Credits are deducted per successful API call.
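For budgeting, the per-request price turns into simple arithmetic. The sketch below assumes that failed calls cost nothing, which is how the line above reads; the function names are hypothetical.

```python
CREDITS_PER_REQUEST = 2  # price stated in the Pricing section


def credits_used(successful_calls):
    """Credits deducted for a given number of successful calls."""
    if successful_calls < 0:
        raise ValueError("successful_calls must be non-negative")
    return successful_calls * CREDITS_PER_REQUEST


def calls_affordable(credit_balance):
    """Number of successful calls a credit balance can cover."""
    if credit_balance < 0:
        raise ValueError("credit_balance must be non-negative")
    return credit_balance // CREDITS_PER_REQUEST


print(credits_used(500))     # 1000
print(calls_affordable(25))  # 12
```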
Performance
Avg. Latency: ~1.0s
Success Rate: 98.4%
Integration
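curl -X POST https://api.uniskill.io/v1/extract \
  -H "Authorization: Bearer <YOUR_API_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"source": "Invoice INV-1042, total due: 118.50 USD", "source_type": "text", "schema": {"type": "object", "properties": {"total": {"type": "number"}}}}'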
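The same call can be sketched in Python with only the standard library. The endpoint and headers are taken from the curl command, and the body uses the `source` and `schema` fields from the Input Parameters table. The helper names, the 30-second timeout, and the assumption that the response body is JSON are illustrative.

```python
import json
import urllib.request

API_URL = "https://api.uniskill.io/v1/extract"  # endpoint from the curl example


def build_request(token, payload):
    """Build the POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def extract(token, payload, timeout=30):
    """Send the request and decode the reply.

    Assumption: the service answers with a JSON body containing
    data, confidence, and missing_fields.
    """
    request = build_request(token, payload)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


example_payload = {
    "source": "Invoice INV-1042, total due: 118.50 USD",
    "source_type": "text",
    "schema": {"type": "object", "properties": {"total": {"type": "number"}}},
}

# Uncomment with a real token to make the network call:
# result = extract("<YOUR_API_TOKEN>", example_payload)
# print(result["data"], result["confidence"], result["missing_fields"])
```

Splitting request construction from sending keeps the first half easy to inspect or test without network access.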