A command-line utility for migrating data from Neo4j to Neptune.
Conversion only:
java -jar neo4j-to-neptune.jar convert-csv \
--input /path/to/neo4j-export.csv \
--dir output \
--infer-typesConversion only using configuration file:
java -jar neo4j-to-neptune.jar convert-csv \
--conversion-config /path/to/conversion-config.yaml \
--dir output \
--infer-typesConvert and bulk load to Neptune:
java -jar neo4j-to-neptune.jar convert-csv \
--input /tmp/neo4j-export.csv \
--dir output \
--neptune-endpoint my-cluster.cluster-abc123.us-east-2.neptune.amazonaws.com \
--bucket-name my-neptune-bucket \
--iam-role-arn arn:aws:iam::123456789012:role/NeptuneLoadFromS3 \
--infer-typesConvert and bulk load using configuration file:
java -jar neo4j-to-neptune.jar convert-csv \
--input /tmp/neo4j-export.csv \
--dir output \
--bulk-load-config /path/to/bulk-load-config.yaml \
--infer-typesConvert and bulk load to Neptune:
java -jar neo4j-to-neptune.jar convert-csv \
-i /tmp/neo4j-export.csv \
-d output \
--neptune-endpoint my-cluster.cluster-abc123.us-east-2.neptune.amazonaws.com \
--bucket-name my-neptune-bucket \
--iam-role-arn arn:aws:iam::123456789012:role/NeptuneLoadFromS3 \
--infer-typesConvert and bulk load using configuration file:
java -jar neo4j-to-neptune.jar convert-csv \
-i /tmp/neo4j-export.csv \
-d output \
--bulk-load-config bulk-load-config.yaml \
--infer-typesmvn clean install
Migration of data from Neo4j to Neptune can be accomplished in a few ways:
- Export CSV from Neo4j – Use the APOC export procedures to export data from Neo4j to CSV.
- Convert CSV – Use the
convert-csvcommand-line utility to convert the exported CSV into the Neptune Gremlin bulk load CSV format. - Bulk Load into Neptune – Use the Neptune bulk load API to load data into Neptune.
- Export CSV from Neo4j – Use the APOC export procedures to export data from Neo4j to CSV.
- Convert and Bulk Load – Use the
convert-csvcommand with either--bulk-load-configor the required combination of--bucket-name,--neptune-endpoint, and--iam-role-arnto automatically convert the CSV, upload it to S3, and initiate a bulk load into Neptune.
- Stream and Convert CSV – Use the
convert-csvcommand with either--dotenv-fileor all of--uri,--usernameand--passwordto provide Neo4j credentials and stream the data without manually exporting it. - Bulk Load - Specify either
--bulk-load-configor the required combination of--bucket-name,--neptune-endpoint, and--iam-role-arnas the bulk load parameters.
Use the apoc.export.csv.all procedure from neo4j's APOC library to export data from Neo4j to CSV.
Follow the instructions for installing the APOC library for either Neo4j Desktop or Neo4j Server.
Create or update the apoc.conf configuration file to enable exports:
apoc.export.file.enabled=true
CALL apoc.export.csv.all(
"neo4j-export.csv",
{d:','}
)
Note: When running this command please use the syntax above, do not include {stream:true} in the command. Streaming the results back to the browser and then downloading them as a CSV will result in a file that will not be correctly processed by the conversion utility.
The path that you specify for the export file will be resolved relative to the Neo4j import directory. apoc.export.csv.all creates a single CSV file containing data for all nodes and relationships.
Use the convert-csv command-line utility to convert the CSV files exported from Neo4j into the Neptune Gremlin bulk load CSV format.
The utility requires two sets of parameters:
- Input method from Neo4j — Provide one of the following:
- [Manual] A path to locally exported Neo4j CSV files using the
--inputor-iflag. - [Stream] A path to a
.envfile using the--dotenv-fileor-dfflag. This file must define the variablesNEO4J_URI,NEO4J_USERNAME, andNEO4J_PASSWORD, corresponding to the Neo4j URI, username, and password. - [Stream] Directly specify the Neo4j URI, username, and password using the flags
--urior-u,--usernameor-n, and--passwordor-pw
- Output directory — The path to the local directory where the converted CSV files will be written.
There are also optional parameters that allow you to specify node, relationship multi-valued property policies, turn on data type inferencing, or to use a configuration YAML file for more granular conversion manipulation.
Neo4j allows 'homogeneous lists of simple types' to be stored as properties on both nodes and edges. These lists can contain duplicate values.
Neptune provides for set and single cardinality for vertex properties, and single cardinality for edge properties. Hence, there is no straightforward migration of Neo4j node list properties containing duplicate values into Neptune vertex properties, or Neo4j relationship list properties into Neptune edge properties.
The --node-property-policy and --relationship-property-policy parameters allow you to control the migration of multi-valued properties into Neptune.
--node-property-policy takes one of four values, the default being PutInSetIgnoringDuplicates:
LeaveAsString– Store a multi-valued Neo4j node property as a string representation of a JSON-formatted listHalt– Halt (throw an exception) if a multi-valued Neo4j node property is encounteredPutInSetIgnoringDuplicates– Convert a multi-valued Neo4j node property to a set cardinality Neptune property, discarding duplicate valuesPutInSetButHaltIfDuplicates– Convert a multi-valued Neo4j node property to a set cardinality Neptune property, discarding duplicate values but halt (throw an exception) if a multi-valued Neo4j node property containing duplicate values is encountered
--relationship-property-policy takes one of two values, the default being LeaveAsString:
LeaveAsString– Store a multi-valued Neo4j relationship property as a string representation of a JSON-formatted listHalt– Halt (throw an exception) if a multi-valued Neo4j relationship property is encountered
When importing data into Neptune using the bulk loader, you can specify the data type for each property. If you supply an --infer-types flag to convert-csv, the utility will attempt to infer the narrowest supported type for each column in the output CSV.
Note that convert-csv will always use a double for values with decimal or scientific notation.
To convert a local CSV that was exported from Neo4j, provide the required parameters:
java -jar neo4j-to-neptune.jar convert-csv \
--input /path/to/neo4j-export.csv \
--dir /path/to/output \
[additional options]To stream data from Neo4j to Neptune
java -jar neo4j-to-neptune.jar convert-csv \
--dotenv-file /path/to/dotenv/file \
--dir /path/to/output \
[additional options]To stream data from Neo4j to Neptune
java -jar neo4j-to-neptune.jar convert-csv \
--uri <Neo4j connection uri> \
--username <Neo4j username> \
--password <Neo4j password> \
--dir /path/to/output \
[additional options]The convert-csv utility conversion process can be configured through the YAML file for ID transformation, label mapping, and filtering:
java -jar neo4j-to-neptune.jar convert-csv \
--input /path/to/neo4j-export.csv \
--dir /path/to/output \
--conversion-config conversion-config.yaml \
[additional options]For an example of the conversion config YAML file refer to docs/example-conversion-config.yaml
Templates can reference original data fields using placeholders:
- Vertex templates:
{_id},{_labels},{property_name} - Edge templates:
{_type},{_start},{_end},{~from},{~to},{property_name}
"{_labels}_{name}_{_id}"→"Person_John_123""e!{~label}_{~from}_{~to}"→"e!KNOWS_Person_123_Person_456"
The utility uses two-pass processing: first transforming vertices and building ID mappings, then processing edges with correct vertex references. When vertices are skipped, connected edges are automatically skipped to maintain graph consistency.
Use the Neptune bulk loader to load data into Neptune from the converted CSV files.
The convert-csv command supports an integrated bulk loading option that automatically uploads converted CSV files to S3 and initiates the Neptune bulk load process. You can configure bulk loading using either CLI parameters or a YAML configuration file.
Before using the bulk load feature, ensure you have:
-
AWS Credentials: Configure AWS credentials using one of the following methods:
- AWS CLI:
aws configure - Environment variables:
AWS_ACCESS_KEY_IDandAWS_SECRET_ACCESS_KEY - IAM roles (if running on EC2)
- AWS credentials file
- AWS CLI:
-
S3 Bucket: An S3 bucket where CSV files will be uploaded, where the region is the same as your Neptune Cluster
-
IAM Role: An IAM role with permissions to:
- Read from the S3 bucket
- Access Neptune cluster
- Perform bulk load operations
-
Neptune Cluster: A running Neptune cluster with bulk load enabled, where the region is the same as your S3 bucket
To enable bulk load using CLI parameters, provide the required Neptune configuration:
java -jar neo4j-to-neptune.jar convert-csv \
--input /path/to/neo4j-export.csv \
--dir /path/to/output \
--neptune-endpoint your-cluster.cluster-abc123.region.neptune.amazonaws.com \
--bucket-name your-s3-bucket \
--iam-role-arn arn:aws:iam::account:role/YourNeptuneRole \
[additional options]Create a YAML configuration file with your bulk load settings:
bulk-load-config.yaml:
# S3 Configuration
bucket-name: "my-neptune-bulk-load-bucket"
s3-prefix: "neptune"
# Neptune Configuration
neptune-endpoint: "my-neptune-cluster.cluster-abc123def456.us-east-1.neptune.amazonaws.com"
neptune-port: "8182"
# IAM Configuration
iam-role-arn: "arn:aws:iam::123456789012:role/NeptuneLoadFromS3Role"
# Performance Settings
parallelism: "OVERSUBSCRIBE" # Options: LOW, MEDIUM, HIGH, OVERSUBSCRIBE
# Monitoring
monitor: trueThen use the configuration file:
java -jar neo4j-to-neptune.jar convert-csv \
--input /path/to/neo4j-export.csv \
--dir /path/to/output \
--bulk-load-config bulk-load-config.yaml \
[additional options]You can use a configuration file and override specific parameters via CLI:
java -jar neo4j-to-neptune.jar convert-csv \
--input /path/to/neo4j-export.csv \
--dir /path/to/output \
--bulk-load-config bulk-load-config.yaml \
--neptune-endpoint different-cluster.cluster-xyz789.us-west-2.neptune.amazonaws.com \
--parallelism HIGH \
[additional options][!IMPORTANT] Parameter Precedence: CLI parameters override configuration file values, which override default values.
The following parameters must be provided either via CLI or configuration file:
- Neptune endpoint:
--neptune-endpointorneptune-endpointin YAML - S3 bucket name:
--bucket-nameorbucket-namein YAML - IAM role ARN:
--iam-role-arnoriam-role-arnin YAML
- Neptune port:
--neptune-portorneptune-portin YAML (default: "8182") - S3 prefix:
--s3-prefixors3-prefixin YAML - Parallelism:
--parallelismorparallelismin YAML (default: "OVERSUBSCRIBE")- Options:
LOW,MEDIUM,HIGH,OVERSUBSCRIBE
- Options:
- Monitor progress:
--monitorormonitorin YAML (default: false)
When bulk load parameters are provided (either via --bulk-load-config or --neptune-endpoint), the tool validates all bulk load parameters before starting the conversion process. If any required parameters are missing or invalid, the conversion will be aborted with a clear error message indicating which parameters are missing.
- Validate: All bulk load parameters are validated before conversion starts
- Convert: Neo4j CSV is converted to Gremlin load data format
- Upload: Converted CSV files are uploaded to S3
- Load: Neptune bulk load job is initiated
- Monitor: (Optional) Progress is monitored until completion or timeout
java -jar target/neo4j-to-neptune.jar convert-csv -d output --uri bolt://localhost:7687 --username neo4j --password password --bulk-load-config bulk-load-config.yaml
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Sep. 04, 2025 12:14:34 P.M. com.amazonaws.services.neptune.io.Neo4jStreamWriter <init>
INFO: Successfully connected to Neo4j database at: bolt://localhost:7687
Sep. 04, 2025 12:14:34 P.M. com.amazonaws.services.neptune.io.Neo4jStreamWriter streamToFile
INFO: Starting data export to file: /tmp/output/1757013274850/neo4j-stream-data-temp.csv
Sep. 04, 2025 12:14:35 P.M. com.amazonaws.services.neptune.io.Neo4jStreamWriter streamToFile
INFO: Successfully exported 1 records (425 lines) to file: /tmp/output/1757013274850/neo4j-stream-data-temp.csv
Sep. 04, 2025 12:14:35 P.M. com.amazonaws.services.neptune.io.Neo4jStreamWriter close
INFO: Neo4j driver closed successfully
Vertices: 171
Edges : 253
Output : output/1757013274850
output/1757013274850
Completed in 0 second(s)
S3 Bucket: my-bucket
S3 Prefix: neptune
AWS Region: us-west-2
IAM Role ARN: arn:aws:iam::123456789100:role/NeptuneReadS3
Neptune Endpoint: db-neptune-1.cluster-xxxxxxxxxxxx.us-west-2.neptune.amazonaws.com
Neptune Port: 8182
Bulk Load Parallelism: OVERSUBSCRIBE
Bulk Load Monitor: true
Uploading Gremlin load data to S3...
Starting sequential upload of files from /tmp/output/1757013274850 to s3://my-bucket/neptune/1757013274850
Uploading file 1 of 2: vertices.csv
Starting upload with compression of /tmp/output/1757013274850/vertices.csv to s3://my-bucket/neptune/1757013274850/vertices.csv.gz
File size: 7.2 KB
Initiating Transfer Manager upload...
Upload with compression completed for /tmp/output/1757013274850/vertices.csv
Successfully uploaded vertices.csv (1/2)
Uploading file 2 of 2: edges.csv
Starting upload with compression of /tmp/output/1757013274850/edges.csv to s3://my-bucket/neptune/1757013274850/edges.csv.gz
File size: 17.6 KB
Initiating Transfer Manager upload...
Upload with compression completed for /tmp/output/1757013274850/edges.csv
Successfully uploaded edges.csv (2/2)
Successfully uploaded all 2 files from /tmp/output/1757013274850
Files uploaded successfully to S3. Files available at: s3://my-bucket/neptune/1757013274850/
Starting Neptune bulk load...
Testing connectivity to Neptune endpoint...
Successfully connected to Neptune. Status: 200 healthy
Neptune bulk load started successfully with load ID: a478c673-xxxx-xxxx-xxxx-77a70a430be8
Monitoring load progress for job: a478c673-xxxx-xxxx-xxxx-77a70a430be8
Neptune bulk load status: LOAD_IN_PROGRESS
Neptune bulk load status: LOAD_IN_PROGRESS
Neptune bulk load status: LOAD_IN_PROGRESS
Neptune bulk load status: LOAD_IN_PROGRESS
Neptune bulk load completed with status: LOAD_COMPLETED
If you prefer the manual approach or need more control over the process, you can still use the Neptune bulk loader directly to load data into Neptune from the converted CSV files.
This approach gives you more flexibility in terms of:
- Custom S3 upload strategies
- Advanced bulk load configurations
- Integration with existing CI/CD pipelines
- Custom monitoring and error handling