CSV & JSON Connectors
The CSV and JSON connectors enable bulk data import into the knowledge graph from structured files. Both support file upload and URL-based fetching, with automatic schema inference and configurable graph node generation.
CSV Connector
Configuration
File Upload
```json
{
  "name": "Customer Data",
  "source_type": "csv",
  "config": {
    "file_path": "/data/customers.csv",
    "node_label": "Customer",
    "delimiter": ",",
    "has_header": true,
    "id_column": "customer_id"
  }
}
```
URL-Based
```json
{
  "name": "Public Dataset",
  "source_type": "csv",
  "config": {
    "url": "https://data.example.com/exports/products.csv",
    "node_label": "Product",
    "delimiter": ",",
    "has_header": true,
    "id_column": "sku"
  }
}
```
Schema Inference
The CSV connector automatically infers column types by sampling the first 100 rows:
| Inferred Type | Detection | Example |
|---|---|---|
| string | Default fallback | "John Doe" |
| integer | All numeric, no decimal point | 42, 1000 |
| float | Numeric with decimal point | 19.99, 3.14 |
| boolean | true/false, yes/no, 1/0 | true |
| datetime | ISO 8601 or common date formats | 2025-01-15T10:30:00Z |
| email | Contains @ with domain | user@example.com |
| url | Starts with http:// or https:// | https://example.com |
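The detection rules in the table can be sketched in Python roughly as follows. This is an illustrative approximation, not the connector's actual code: the function name `infer_column_type`, the order in which types are tried, the choice to treat a mixed integer/decimal column as float, and the restriction of datetime detection to ISO 8601 are all assumptions.

```python
import re
from datetime import datetime

SAMPLE_SIZE = 100  # the docs state that the first 100 rows are sampled

BOOL_TOKENS = {"true", "false", "yes", "no", "1", "0"}
INT_RE = re.compile(r"[+-]?\d+")
FLOAT_RE = re.compile(r"[+-]?\d*\.\d+|[+-]?\d+\.\d*")
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def _is_datetime(value: str) -> bool:
    # Assumption: only ISO 8601 is handled here; the "common date formats"
    # mentioned in the table are not specified, so they are left out.
    try:
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def infer_column_type(values: list[str]) -> str:
    """Return the first type whose rule holds for every sampled non-empty value."""
    sample = [v.strip() for v in values[:SAMPLE_SIZE] if v.strip()]
    if not sample:
        return "string"
    checks = [
        # Assumption: boolean is tried before integer, so a column holding
        # only 1/0 is reported as boolean.
        ("boolean", lambda v: v.lower() in BOOL_TOKENS),
        ("integer", lambda v: INT_RE.fullmatch(v) is not None),
        # Reached only when at least one value has a decimal point.
        ("float", lambda v: bool(FLOAT_RE.fullmatch(v) or INT_RE.fullmatch(v))),
        ("datetime", _is_datetime),
        ("email", lambda v: EMAIL_RE.fullmatch(v) is not None),
        ("url", lambda v: v.startswith(("http://", "https://"))),
    ]
    for name, check in checks:
        if all(check(v) for v in sample):
            return name
    return "string"  # default fallback, as in the table
```

Because every sampled value must satisfy a rule, one stray value (for example `N/A` in a numeric column) makes the whole column fall back to string in this sketch.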
CSV Configuration Reference
| Field | Type | Default | Description |
|---|---|---|---|
| file_path | string | -- | Local file path (mutually exclusive with url) |
| url | string | -- | URL to fetch the CSV from |
| node_label | string | required | Neo4j node label for imported rows |
| delimiter | string | , | Column delimiter |
| has_header | bool | true | First row contains column names |
| id_column | string | -- | Column to use as the unique node identifier |
| encoding | string | utf-8 | File encoding |
| skip_rows | int | 0 | Number of rows to skip from the top |
| max_rows | int | -- | Maximum rows to import |
| column_mapping | object | -- | Rename columns: {"old_name": "new_name"} |
| relationship_columns | array | [] | Columns that reference other nodes (see below) |
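To make the interaction of these fields concrete, here is a minimal Python sketch of how a reader might apply them. The helper names `validate_source` and `read_rows` are hypothetical, and several behaviours are assumptions the reference does not spell out: `skip_rows` is applied before the header row is read, `max_rows` counts data rows only, and headerless files get synthetic `column_N` names.

```python
import csv
import io


def validate_source(config: dict) -> None:
    """Check the two constraints the reference table states explicitly."""
    if ("file_path" in config) == ("url" in config):
        raise ValueError("set exactly one of file_path or url")
    if not config.get("node_label"):
        raise ValueError("node_label is required")


def read_rows(text: str, config: dict) -> list[dict]:
    """Apply delimiter, skip_rows, has_header, column_mapping and max_rows."""
    reader = csv.reader(io.StringIO(text), delimiter=config.get("delimiter", ","))
    # Assumption: skipped rows are dropped before the header is looked for.
    rows = list(reader)[config.get("skip_rows", 0):]
    if not rows:
        return []
    if config.get("has_header", True):
        header, data = rows[0], rows[1:]
    else:
        # Assumption: synthetic column names when there is no header row.
        header, data = [f"column_{i}" for i in range(len(rows[0]))], rows
    mapping = config.get("column_mapping") or {}
    header = [mapping.get(name, name) for name in header]
    max_rows = config.get("max_rows")
    if max_rows is not None:
        data = data[:max_rows]
    return [dict(zip(header, row)) for row in data]
```

With a semicolon-delimited export that starts with one comment line, `skip_rows: 1` and `delimiter: ";"` would let the second line serve as the header in this sketch.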
JSON Connector
Configuration
File Upload
```json
{
  "name": "API Export",
  "source_type": "json",
  "config": {
    "file_path": "/data/services.json",
    "node_label": "Service",
    "root_path": "$.services",
    "id_field": "id"
  }
}
```
URL-Based
```json
{
  "name": "Remote Config",
  "source_type": "json",
  "config": {
    "url": "https://api.example.com/config.json",
    "node_label": "ConfigEntry",
    "root_path": "$",
    "id_field": "key",
    "headers": {
      "Authorization": "Bearer token"
    }
  }
}
```
JSON Path Support
Use JSONPath expressions to target specific parts of the JSON document:
| Path | Description | Example Input | Matches |
|---|---|---|---|
| $ | Root (whole document) | [{...}, {...}] | All items |
| $.data | Nested key | {"data": [{...}]} | Items in data |
| $.results[*] | Array items | {"results": [{...}]} | Each result |
| $.teams[*].members | Nested arrays | {"teams": [{"members": [...]}]} | All members |
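The four path shapes in the table can be resolved with a very small evaluator. The sketch below is a hypothetical helper (`select_items`) that supports only `$`, `.key` and `[*]`; it is not a full JSONPath implementation and is not taken from the connector. The final step, which expands any matched array into its elements, is an assumption chosen so that the results line up with the "Matches" column.

```python
import re

# One path segment: either ".key" or "[*]".
TOKEN_RE = re.compile(r"\.([A-Za-z_][\w-]*)|(\[\*\])")


def select_items(document, root_path: str = "$") -> list:
    """Return the items that root_path selects from a parsed JSON document."""
    if not root_path.startswith("$"):
        raise ValueError("path must start with $")
    rest = root_path[1:]
    pos = 0
    nodes = [document]
    while pos < len(rest):
        match = TOKEN_RE.match(rest, pos)
        if match is None:
            raise ValueError(f"unsupported JSONPath segment at {rest[pos:]!r}")
        key = match.group(1)
        if key is not None:
            # ".key": descend into every object that has this key.
            nodes = [n[key] for n in nodes if isinstance(n, dict) and key in n]
        else:
            # "[*]": replace every array with its elements.
            nodes = [item for n in nodes if isinstance(n, list) for item in n]
        pos = match.end()
    # Assumption: a matched array stands for its elements, since the
    # connector imports one node per object.
    items = []
    for node in nodes:
        if isinstance(node, list):
            items.extend(node)
        else:
            items.append(node)
    return items
```

An empty result from a helper like this corresponds to the "JSONPath matched no items" error listed under Troubleshooting.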
Schema Inference
For JSON objects, schema inference maps JSON types to Neo4j property types:
| JSON Type | Neo4j Type | Notes |
|---|---|---|
| string | string | Direct mapping |
| number (integer) | integer | No decimal point |
| number (float) | float | Has decimal point |
| boolean | boolean | Direct mapping |
| array | string[] | Stored as string array property |
| object | Flattened | Nested objects are flattened with dot notation |
| null | -- | Null values are omitted |
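As an illustration of this mapping, the following Python sketch flattens one JSON object into a flat property dictionary. The function name `flatten` is hypothetical. Two details are assumptions rather than documented behaviour: objects nested deeper than the `max_depth` option (default 3 in the configuration reference) are stored as JSON strings, and non-string array elements are converted with `json.dumps`.

```python
import json


def flatten(obj: dict, max_depth: int = 3, prefix: str = "", depth: int = 1) -> dict:
    """Flatten nested objects into dot-notation keys, following the table above."""
    props = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if value is None:
            continue  # null values are omitted
        if isinstance(value, dict):
            if depth < max_depth:
                props.update(flatten(value, max_depth, name, depth + 1))
            else:
                # Assumption: anything past max_depth is kept as a JSON string.
                props[name] = json.dumps(value, sort_keys=True)
        elif isinstance(value, list):
            # Arrays become string arrays.
            props[name] = [v if isinstance(v, str) else json.dumps(v) for v in value]
        else:
            props[name] = value  # string, number and boolean map directly
    return props
```

For example, `{"owner": {"team": "core"}}` would produce a single property named `owner.team` in this sketch.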
JSON Configuration Reference
| Field | Type | Default | Description |
|---|---|---|---|
| file_path | string | -- | Local file path (mutually exclusive with url) |
| url | string | -- | URL to fetch the JSON from |
| node_label | string | required | Neo4j node label for imported objects |
| root_path | string | $ | JSONPath to the array of objects to import |
| id_field | string | -- | Field to use as the unique node identifier |
| headers | object | {} | HTTP headers for URL-based fetching |
| flatten_nested | bool | true | Flatten nested objects with dot notation |
| max_depth | int | 3 | Maximum nesting depth to flatten |
| max_items | int | -- | Maximum items to import |
| relationship_fields | array | [] | Fields that reference other nodes |
Graph Node Generation
Both connectors create typed nodes in Neo4j with all columns/fields as properties.
Basic Import
Given this CSV:
```csv
customer_id,name,email,plan,signup_date
C001,Alice Smith,alice@example.com,pro,2025-01-15
C002,Bob Jones,bob@example.com,free,2025-02-01
```
With node_label: "Customer" and id_column: "customer_id", this creates:
```cypher
(:Customer {customer_id: "C001", name: "Alice Smith", email: "alice@example.com",
            plan: "pro", signup_date: datetime("2025-01-15")})
(:Customer {customer_id: "C002", name: "Bob Jones", email: "bob@example.com",
            plan: "free", signup_date: datetime("2025-02-01")})
```
Relationship Columns
Define columns/fields that reference other nodes to automatically create relationships:
```json
{
  "config": {
    "node_label": "Order",
    "id_column": "order_id",
    "relationship_columns": [
      {
        "column": "customer_id",
        "target_label": "Customer",
        "target_id_property": "customer_id",
        "relationship_type": "PLACED_BY"
      },
      {
        "column": "product_sku",
        "target_label": "Product",
        "target_id_property": "sku",
        "relationship_type": "CONTAINS"
      }
    ]
  }
}
```
This creates relationships between imported nodes and existing nodes in the graph:
```cypher
(:Order {order_id: "O100"})-[:PLACED_BY]->(:Customer {customer_id: "C001"})
(:Order {order_id: "O100"})-[:CONTAINS]->(:Product {sku: "PROD-42"})
```
TIP
Relationship columns work across connector boundaries. You can import a CSV of deployment metadata and link it to Kubernetes pods or GitHub repositories already in the graph.
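One way to picture what a relationship-column import does per row is as a set of parameterized Cypher statements. The sketch below only builds the statement text and parameters; it does not talk to Neo4j, and `build_statements` is a hypothetical helper, not the connector's API. Using `MERGE` for the imported node, `MATCH` for the existing target, and keeping the reference column as a node property are all assumptions. Labels and relationship types cannot be passed as Cypher parameters, so they are validated and interpolated.

```python
import re

IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _ident(name: str) -> str:
    """Allow only plain identifiers, because these are interpolated into Cypher."""
    if not IDENT_RE.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def build_statements(row: dict, config: dict) -> list[tuple[str, dict]]:
    """Return (cypher, parameters) pairs for one imported row."""
    label = _ident(config["node_label"])
    id_col = _ident(config["id_column"])
    statements = [(
        f"MERGE (n:{label} {{{id_col}: $id}}) SET n += $props",
        {"id": row[id_col], "props": dict(row)},
    )]
    for rel in config.get("relationship_columns", []):
        value = row.get(rel["column"])
        if not value:
            continue  # assumption: empty references create no relationship
        target = _ident(rel["target_label"])
        target_prop = _ident(rel["target_id_property"])
        rel_type = _ident(rel["relationship_type"])
        statements.append((
            f"MATCH (n:{label} {{{id_col}: $id}}) "
            f"MATCH (t:{target} {{{target_prop}: $target_id}}) "
            f"MERGE (n)-[:{rel_type}]->(t)",
            {"id": row[id_col], "target_id": value},
        ))
    return statements
```

Because the target is matched rather than created, a row that references a node not yet in the graph produces no relationship in this sketch, so import order would matter under these assumptions.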
Use Cases
- Import team rosters — CSV of team members linked to GitHub users
- Load infrastructure inventories — JSON exports from CMDB systems
- Enrich the graph — add business context (cost centers, SLAs, ownership) to technical resources
- Bulk data migration — import historical data from legacy systems
- Configuration audits — import and query JSON config files for consistency checks
Troubleshooting
| Error | Cause | Fix |
|---|---|---|
| File not found | File path is incorrect or file does not exist | Verify the file path is absolute and accessible |
| Failed to fetch URL | URL is unreachable or returned an error | Check the URL and any authentication headers |
| Column not found | id_column references a non-existent column | Verify column names match the CSV header |
| Invalid JSON | JSON file is malformed | Validate the JSON with a linter |
| JSONPath matched no items | root_path does not match any array in the document | Test the JSONPath expression against your data |
| Encoding error | File uses non-UTF-8 encoding | Set the encoding field (e.g., latin-1, utf-16) |
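Two of these errors, "Encoding error" and "Column not found", can be caught locally before an import is submitted. The following Python sketch shows one possible pre-flight check; the function name `preflight` and its message wording are hypothetical and only mirror the table above.

```python
import csv
import io


def preflight(raw: bytes, config: dict) -> list[str]:
    """Return a list of problems found in a CSV file's bytes; empty means none."""
    encoding = config.get("encoding", "utf-8")
    try:
        text = raw.decode(encoding)
    except UnicodeDecodeError:
        return [f"Encoding error: file is not valid {encoding}"]
    problems = []
    reader = csv.reader(io.StringIO(text), delimiter=config.get("delimiter", ","))
    header = next(reader, [])
    id_column = config.get("id_column")
    # Only meaningful when the first row is a header (has_header defaults to true).
    if config.get("has_header", True) and id_column and id_column not in header:
        problems.append(f"Column not found: {id_column!r} is not in {header}")
    return problems
```

A file saved as Latin-1 fails the default UTF-8 decode as soon as it contains an accented character, which is the situation the `encoding` field is meant to resolve.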