CSV & JSON Connectors

The CSV and JSON connectors enable bulk data import into the knowledge graph from structured files. Both support file upload and URL-based fetching, with automatic schema inference and configurable graph node generation.

CSV Connector

Configuration

File Upload

json

{
  "name": "Customer Data",
  "source_type": "csv",
  "config": {
    "file_path": "/data/customers.csv",
    "node_label": "Customer",
    "delimiter": ",",
    "has_header": true,
    "id_column": "customer_id"
  }
}

URL-Based

json

{
  "name": "Public Dataset",
  "source_type": "csv",
  "config": {
    "url": "https://data.example.com/exports/products.csv",
    "node_label": "Product",
    "delimiter": ",",
    "has_header": true,
    "id_column": "sku"
  }
}

Schema Inference

The CSV connector automatically infers column types by sampling the first 100 rows:

Inferred Type	Detection	Example
`string`	Default fallback	`"John Doe"`
`integer`	All numeric, no decimal point	`42`, `1000`
`float`	Numeric with decimal point	`19.99`, `3.14`
`boolean`	`true`/`false`, `yes`/`no`, `1`/`0`	`true`
`datetime`	ISO 8601 or common date formats	`2025-01-15T10:30:00Z`
`email`	Contains `@` with domain	`user@example.com`
`url`	Starts with `http://` or `https://`	`https://example.com`
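
The detection rules in the table can be sketched roughly as follows. This is an illustrative approximation, not the connector's actual code: the function name `infer_column_type`, the order in which checks run (boolean before integer, so a column of only `1`/`0` counts as boolean), the treatment of mixed integer/decimal columns as `float`, and the restriction to ISO 8601 dates are all assumptions.

```python
import re
from datetime import datetime

SAMPLE_SIZE = 100  # the connector samples the first 100 rows

_BOOL = {"true", "false", "yes", "no", "1", "0"}
_INT = re.compile(r"^[+-]?\d+$")
# Also matches plain integers so a mixed "20" / "19.99" column becomes float (assumption).
_FLOAT = re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)$")
_EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def _is_datetime(value):
    # Only ISO 8601 is checked; the "common date formats" are not specified in the docs.
    try:
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


# Checked in order; the first type that fits every sampled value wins (assumed order).
CHECKS = [
    ("boolean", lambda v: v.lower() in _BOOL),
    ("integer", lambda v: bool(_INT.match(v))),
    ("float", lambda v: bool(_FLOAT.match(v))),
    ("datetime", _is_datetime),
    ("email", lambda v: bool(_EMAIL.match(v))),
    ("url", lambda v: v.startswith(("http://", "https://"))),
]


def infer_column_type(values):
    """Return the inferred type name for one column of raw CSV strings."""
    sample = [v.strip() for v in values[:SAMPLE_SIZE] if v and v.strip()]
    if not sample:
        return "string"
    for name, check in CHECKS:
        if all(check(v) for v in sample):
            return name
    return "string"  # default fallback
```

Because only the sampled rows are inspected, a value further down the file that does not fit the inferred type would go unnoticed by a sketch like this one.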

CSV Configuration Reference

Field	Type	Default	Description
`file_path`	string	--	Local file path (mutually exclusive with `url`)
`url`	string	--	URL to fetch the CSV from (mutually exclusive with `file_path`)
`node_label`	string	required	Neo4j node label for imported rows
`delimiter`	string	`,`	Column delimiter
`has_header`	bool	`true`	First row contains column names
`id_column`	string	--	Column to use as the unique node identifier
`encoding`	string	`utf-8`	File encoding
`skip_rows`	int	`0`	Number of rows to skip from the top
`max_rows`	int	--	Maximum rows to import
`column_mapping`	object	--	Rename columns: `{"old_name": "new_name"}`
`relationship_columns`	array	`[]`	Columns that reference other nodes (see below)
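
As a way to read the reference table, the sketch below validates a CSV config and fills in the documented defaults. The helper name `resolve_csv_config` and the error messages are hypothetical, and the rule that one of `file_path` or `url` must be present is an assumption; the table only states that the two are mutually exclusive.

```python
import copy

# Defaults copied from the reference table above.
CSV_DEFAULTS = {
    "delimiter": ",",
    "has_header": True,
    "encoding": "utf-8",
    "skip_rows": 0,
    "relationship_columns": [],
}


def resolve_csv_config(config):
    """Validate a CSV connector config and return it with defaults applied."""
    has_file = bool(config.get("file_path"))
    has_url = bool(config.get("url"))
    if has_file and has_url:
        raise ValueError("file_path and url are mutually exclusive")
    if not (has_file or has_url):
        # Assumption: a source is mandatory; the table does not say so explicitly.
        raise ValueError("set either file_path or url")
    if not config.get("node_label"):
        raise ValueError("node_label is required")
    # Explicit settings override the defaults; deepcopy avoids sharing the list.
    return {**copy.deepcopy(CSV_DEFAULTS), **config}
```

For example, `resolve_csv_config({"file_path": "/data/customers.csv", "node_label": "Customer"})` returns the same config with `delimiter`, `has_header`, `encoding`, `skip_rows`, and `relationship_columns` filled in.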

JSON Connector

Configuration

File Upload

json

{
  "name": "API Export",
  "source_type": "json",
  "config": {
    "file_path": "/data/services.json",
    "node_label": "Service",
    "root_path": "$.services",
    "id_field": "id"
  }
}

URL-Based

json

{
  "name": "Remote Config",
  "source_type": "json",
  "config": {
    "url": "https://api.example.com/config.json",
    "node_label": "ConfigEntry",
    "root_path": "$",
    "id_field": "key",
    "headers": {
      "Authorization": "Bearer token"
    }
  }
}

JSON Path Support

Use JSONPath expressions to target specific parts of the JSON document:

Path	Description	Example Input	Matches
`$`	Root (whole document)	`[{...}, {...}]`	All items
`$.data`	Nested key	`{"data": [{...}]}`	Items in `data`
`$.results[*]`	Array items	`{"results": [{...}]}`	Each result
`$.teams[*].members`	Nested arrays	`{"teams": [{"members": [...]}]}`	All members
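
To make the table concrete, here is a minimal resolver for only the path forms shown above (`$`, `.key`, and `[*]`). It is a sketch under assumptions, not the connector's JSONPath engine: the function names `select` and `items_to_import` are hypothetical, and the step that unwraps a matched array into individual items is inferred from the "Matches" column.

```python
import re

# Supports only ".key" and "[*]" segments; filters, slices, and indexes are out of scope.
_SEGMENT = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[\*\]")


def select(document, path):
    """Return the list of nodes matched by a small JSONPath subset."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    nodes = [document]
    pos = 1
    while pos < len(path):
        match = _SEGMENT.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported JSONPath syntax at offset {pos}: {path!r}")
        key = match.group(1)
        if key is not None:
            # ".key": descend into every matched object that has the key.
            nodes = [n[key] for n in nodes if isinstance(n, dict) and key in n]
        else:
            # "[*]": expand every matched array into its elements.
            nodes = [item for n in nodes if isinstance(n, list) for item in n]
        pos = match.end()
    return nodes


def items_to_import(document, path="$"):
    """Unwrap matched arrays one level so each element becomes one item (assumed behavior)."""
    items = []
    for node in select(document, path):
        if isinstance(node, list):
            items.extend(node)
        else:
            items.append(node)
    return items
```

With this sketch, `$.teams[*].members` applied to `{"teams": [{"members": [...]}]}` yields the members of every team as one flat list, which corresponds to the last row of the table.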

Schema Inference

For JSON objects, schema inference maps JSON types to Neo4j property types:

JSON Type	Neo4j Type	Notes
`string`	`string`	Direct mapping
`number` (integer)	`integer`	No decimal point
`number` (float)	`float`	Has decimal point
`boolean`	`boolean`	Direct mapping
`array`	`string[]`	Stored as string array property
`object`	Flattened	Nested objects are flattened with dot notation
`null`	--	Null values are omitted
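
The mapping rules above can be illustrated with a short flattening routine. This is a sketch only: the function name `to_properties` is hypothetical, `max_depth` mirrors the config field of the same name, and serializing objects that sit deeper than `max_depth` to a JSON string is an assumption, since the documentation does not say what happens to them.

```python
import json


def to_properties(obj, max_depth=3, _prefix="", _depth=0):
    """Flatten one JSON object into a dict of node properties."""
    props = {}
    for key, value in obj.items():
        name = f"{_prefix}{key}"
        if value is None:
            continue  # null values are omitted
        if isinstance(value, dict):
            if _depth < max_depth:
                # Nested objects are flattened with dot notation.
                props.update(to_properties(value, max_depth, name + ".", _depth + 1))
            else:
                # Assumption: anything deeper than max_depth is kept as a JSON string.
                props[name] = json.dumps(value)
        elif isinstance(value, list):
            # Arrays are stored as string arrays.
            props[name] = [v if isinstance(v, str) else json.dumps(v) for v in value]
        else:
            props[name] = value  # string, number, boolean: direct mapping
    return props
```

For instance, `{"id": "svc-1", "owner": {"team": "core"}, "tags": ["a", 1], "note": None}` becomes `{"id": "svc-1", "owner.team": "core", "tags": ["a", "1"]}`.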

JSON Configuration Reference

Field	Type	Default	Description
`file_path`	string	--	Local file path (mutually exclusive with `url`)
`url`	string	--	URL to fetch the JSON from (mutually exclusive with `file_path`)
`node_label`	string	required	Neo4j node label for imported objects
`root_path`	string	`$`	JSONPath to the array of objects to import
`id_field`	string	--	Field to use as the unique node identifier
`headers`	object	`{}`	HTTP headers for URL-based fetching
`flatten_nested`	bool	`true`	Flatten nested objects with dot notation
`max_depth`	int	`3`	Maximum nesting depth to flatten
`max_items`	int	--	Maximum items to import
`relationship_fields`	array	`[]`	Fields that reference other nodes

Graph Node Generation

Both connectors create typed nodes in Neo4j with all columns/fields as properties.

Basic Import

Given this CSV:

csv

customer_id,name,email,plan,signup_date
C001,Alice Smith,alice@example.com,pro,2025-01-15
C002,Bob Jones,bob@example.com,free,2025-02-01

With `node_label: "Customer"` and `id_column: "customer_id"`, the import creates:

cypher

(:Customer {customer_id: "C001", name: "Alice Smith", email: "alice@example.com",
            plan: "pro", signup_date: datetime("2025-01-15")})
(:Customer {customer_id: "C002", name: "Bob Jones", email: "bob@example.com",
            plan: "free", signup_date: datetime("2025-02-01")})
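
One way to picture this step is as one parameterized Cypher statement per CSV row. The sketch below is an assumption about how such an import could be written, not a description of the connector's internals: the function name `statements_from_csv` is hypothetical, `MERGE` is chosen here so that re-importing a file updates nodes instead of duplicating them, and values are left as raw strings (type conversion such as `datetime(...)` is omitted).

```python
import csv
import io


def statements_from_csv(text, node_label, id_column):
    """Return a list of (cypher_query, parameters) pairs, one per CSV row."""
    reader = csv.DictReader(io.StringIO(text))
    if id_column not in (reader.fieldnames or []):
        raise KeyError(f"Column not found: {id_column}")
    # Label and key are interpolated into the query text; values go through parameters.
    query = f"MERGE (n:`{node_label}` {{`{id_column}`: $id}}) SET n += $props"
    statements = []
    for row in reader:
        props = {k: v for k, v in row.items() if k != id_column}
        statements.append((query, {"id": row[id_column], "props": props}))
    return statements
```

Each pair could then be passed to a Neo4j driver session. Keeping the identifier in the `MERGE` pattern and the remaining columns in `SET` means the `id_column` value alone decides whether a node already exists.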

Relationship Columns

Define columns/fields that reference other nodes to automatically create relationships:

json

{
  "config": {
    "node_label": "Order",
    "id_column": "order_id",
    "relationship_columns": [
      {
        "column": "customer_id",
        "target_label": "Customer",
        "target_id_property": "customer_id",
        "relationship_type": "PLACED_BY"
      },
      {
        "column": "product_sku",
        "target_label": "Product",
        "target_id_property": "sku",
        "relationship_type": "CONTAINS"
      }
    ]
  }
}

This creates relationships between imported nodes and existing nodes in the graph:

cypher

(:Order {order_id: "O100"})-[:PLACED_BY]->(:Customer {customer_id: "C001"})
(:Order {order_id: "O100"})-[:CONTAINS]->(:Product {sku: "PROD-42"})
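
The relationship step can be sketched in the same style. The function name `relationship_statements` is hypothetical, and the choice of `MATCH` for both endpoints is an assumption: it means a row whose target node is not already in the graph produces no relationship, which fits the description of linking to existing nodes but may differ from what the connector actually does.

```python
def relationship_statements(node_label, id_column, row, relationship_columns):
    """Return (cypher_query, parameters) pairs linking one imported row to its targets."""
    statements = []
    for rel in relationship_columns:
        target_id = row.get(rel["column"])
        if not target_id:
            continue  # empty reference: nothing to link
        target_label = rel["target_label"]
        target_prop = rel["target_id_property"]
        rel_type = rel["relationship_type"]
        query = (
            f"MATCH (src:`{node_label}` {{`{id_column}`: $src_id}}) "
            f"MATCH (dst:`{target_label}` {{`{target_prop}`: $dst_id}}) "
            f"MERGE (src)-[:`{rel_type}`]->(dst)"
        )
        statements.append((query, {"src_id": row[id_column], "dst_id": target_id}))
    return statements
```

Called with the `Order` config above and a row such as `{"order_id": "O100", "customer_id": "C001", "product_sku": "PROD-42"}`, this produces one statement for `PLACED_BY` and one for `CONTAINS`.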

TIP

Relationship columns work across connector boundaries. You can import a CSV of deployment metadata and link it to Kubernetes pods or GitHub repositories already in the graph.

Use Cases

Import team rosters — CSV of team members linked to GitHub users
Load infrastructure inventories — JSON exports from CMDB systems
Enrich the graph — add business context (cost centers, SLAs, ownership) to technical resources
Bulk data migration — import historical data from legacy systems
Configuration audits — import and query JSON config files for consistency checks

Troubleshooting

Error	Cause	Fix
`File not found`	File path is incorrect or file does not exist	Verify the file path is absolute and accessible
`Failed to fetch URL`	URL is unreachable or returned an error	Check the URL and any authentication headers
`Column not found`	`id_column` references a non-existent column	Verify column names match the CSV header
`Invalid JSON`	JSON file is malformed	Validate the JSON with a linter
`JSONPath matched no items`	`root_path` does not match any array in the document	Test the JSONPath expression against your data
`Encoding error`	File uses non-UTF-8 encoding	Set the `encoding` field (e.g., `latin-1`, `utf-16`)
