cypher - How can I speed up creation/update of large nodes in Neo4j?


I'm trying to create (or do nothing if the node already exists) 1000+ nodes that are large in the sense of the amount of data I'm storing as properties on each. The properties of each node look like:

props: {       'hpsi0713i-aehn_2_qc1hip-839.hpsi0813i-ffdb_3_qc1hip-1.1': ['19','26','1.00','qc1hip-839'],       [... 1431 more ...] } 

and take up 650k if I store the data in a text file.

If I create a node (using merge on a constrained unique property, so nothing is created if it already exists) with properties like these, it takes 20-40 seconds per node.

To debug, I split node creation from setting the properties: creating the node first, getting the node id back, then matching on that id to set the properties. Node creation is fast, as expected. Here's the debugging output from setting the properties:

will run cypher: match (n) where id(n) = 198058 set n = { props } return n - setting 1432 new properties on node with unique property 88aa3f215e73daea9bf65147e630cbd7_qc1hip-1 took 19 seconds
will run cypher: match (n) where id(n) = 198059 set n = { props } return n - setting 1432 new properties on node with unique property 88aa3f215e73daea9bf65147e630cbd7_qc1hip-10 took 22 seconds

One odd thing I've noticed during debugging is that if I delete these nodes like so:

match (n:`labelforthesenodes`) optional match (n)-[r]-() delete n, r 

and then try adding them again, it's fast for the nodes I've already done, but slow again for the nodes I haven't:

will run cypher: match (n) where id(n) = 198063 set n = { props } return n - setting 1432 new properties on node with unique property 88aa3f215e73daea9bf65147e630cbd7_qc1hip-1 took 0 seconds
will run cypher: match (n) where id(n) = 198064 set n = { props } return n - setting 1432 new properties on node with unique property 88aa3f215e73daea9bf65147e630cbd7_qc1hip-10 took 1 seconds
will run cypher: match (n) where id(n) = 198068 set n = { props } return n - setting 1432 new properties on node with unique property 88aa3f215e73daea9bf65147e630cbd7_qc1hip-1016 took 24 seconds

What can be done to speed this operation up? At the moment I have a simple loop that creates/sets the properties on a single node each time, and it will take at least 6hrs to create 1000 nodes.

As you are dealing with large datasets, I suggest you use batch insertion instead of creating a single node at a time.

In the case of updating, maintain a dictionary of nodes instead of updating the database every time, since creating/updating nodes in the database takes more time than modifying a dictionary. When the node-related modifications are over, you can send the entire dictionary for batch insertion.
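A minimal sketch of that idea in Python, under some assumptions: the unique property is called uid here (a placeholder, since the question does not name it), the pending changes are kept in a plain dict mapping each unique value to its property map, and run_query is a hypothetical callable you supply yourself, for example a thin wrapper around your driver's method for running a parameterised query. The Cypher uses unwind to handle many nodes per round trip. The $rows parameter syntax is the newer form; older Neo4j versions write it as {rows}. Unlike the question's set n = { props }, the += form only adds or overwrites properties and leaves the unique property in place. Whether this removes the per-node slowdown seen in the question is not guaranteed, but it does cut the number of round trips.

```python
from itertools import islice

# Assumption: 'uid' stands in for the real unique property key, and the
# label is the one used in the question's delete query.
BATCH_QUERY = (
    "UNWIND $rows AS row "
    "MERGE (n:`labelforthesenodes` {uid: row.uid}) "
    "SET n += row.props"
)


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def build_batches(pending, batch_size=50):
    """Turn the in-memory dict {uid: props} into lists of rows for UNWIND."""
    rows = [{"uid": uid, "props": props} for uid, props in pending.items()]
    return list(chunked(rows, batch_size))


def flush(pending, run_query, batch_size=50):
    """Send each batch through run_query(query, parameters).

    Returns the number of batches sent.
    """
    batches = build_batches(pending, batch_size)
    for rows in batches:
        run_query(BATCH_QUERY, {"rows": rows})
    return len(batches)
```

With 1432 properties per node, a smaller batch_size (tens of nodes rather than thousands) keeps each request to a manageable size; the right value is something to measure rather than assume.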

