Django - Simple RDF MySQL backend
Okay, I have a more abstract MySQL kind of question, and I'm not sure where to begin because there are not a lot of resources online in this area. Warning: I do not have a lot of experience with MySQL.
What I have built so far is a simple Django interface into which I have incorporated the cTAKES NLP pipeline. The pipeline runs when a user submits a block of text associated with a study name, and it spits out an array of triples automatically generated using a semantic role labeler and a dependency parser:
[(weight_gain, mayleadto, obesity), (cancer, maybe, present), (study, observedchangein, sleep_patterns), ....]
What I would like to do is store the individual triples in the simplest MySQL schema possible, where the key is the study name and each row contains a subject, predicate, and object field corresponding to the triple structure.
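To make the target layout concrete, here is a minimal sketch of a one-row-per-triple table. It is only an illustration under assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server, and the table name `triple`, the helpers `store_triples` and `triples_for_study`, and the study name `sleep_study` are all hypothetical. Because a study has many triples, the study name cannot be the sole primary key, so the sketch uses a surrogate `id` and filters on `study_name`.

```python
import sqlite3

# Hypothetical schema: one row per triple, looked up by study name.
# sqlite3 stands in for MySQL here; the column layout is the point.
SCHEMA = """
CREATE TABLE IF NOT EXISTS triple (
    id INTEGER PRIMARY KEY,
    study_name VARCHAR(250) NOT NULL,
    subject VARCHAR(255) NOT NULL,
    predicate VARCHAR(255) NOT NULL,
    object VARCHAR(255) NOT NULL
)
"""


def store_triples(conn, study_name, triples):
    """Insert each (subject, predicate, object) tuple as its own row."""
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO triple (study_name, subject, predicate, object) "
        "VALUES (?, ?, ?, ?)",
        [(study_name, s, p, o) for (s, p, o) in triples],
    )
    conn.commit()


def triples_for_study(conn, study_name):
    """Return all triples stored for one study, in insertion order."""
    cur = conn.execute(
        "SELECT subject, predicate, object FROM triple "
        "WHERE study_name = ? ORDER BY id",
        (study_name,),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
store_triples(
    conn,
    "sleep_study",  # hypothetical study name
    [
        ("weight_gain", "mayleadto", "obesity"),
        ("study", "observedchangein", "sleep_patterns"),
    ],
)
rows = triples_for_study(conn, "sleep_study")
```

In a MySQL deployment the same columns would apply, and an index on `study_name` would probably be worth adding for lookups.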
So far I am just saving the output as one direct chunk (triple_store).
This is the code that handles the script and stores the data:
import subprocess

if form.is_valid():
    instance = form.save(commit=False)
    instancevalueslist = [instance.study_abstract, instance.study_methods, instance.study_results]
    instancevaluesstring = u" ".join(instancevalueslist).encode('utf-8').strip()

    # Write the submitted text to a temporary input file for the pipeline.
    tempfileinlocation = '/tmp/' + randomword(8) + '.txt'
    tempfilein = open(tempfileinlocation, 'w')
    tempfilein.write(instancevaluesstring)
    tempfilein.close()

    # Run the pipeline; it writes its triples to the output file.
    tempfileoutlocation = '/tmp/' + randomword(8) + '.txt'
    arglist = [tempfileinlocation, tempfileoutlocation]
    p = subprocess.Popen(['run_pipeline.sh'] + arglist, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    (stdoutdata, stderrdata) = p.communicate()

    # Store the whole pipeline output in a single text field.
    tempfileout = open(tempfileoutlocation, 'r')
    instance.processed_data = tempfileout.read()
    #instance.processed_data = stdoutdata
    instance.save()
    p.stdin.close()
    p.stdout.close()
    tempfileout.close()
The model code:
from django.db import models

class Query(models.Model):
    study_name = models.CharField(max_length=250, default='')
    study_abstract = models.TextField(default='')
    study_methods = models.TextField(default='')
    study_results = models.TextField(default='')
    processed_data = models.TextField(default='')
    updated = models.DateTimeField(auto_now=True, auto_now_add=False)
    timestamp = models.DateTimeField(auto_now=False, auto_now_add=True)
What this gives me in the processed_data field is the whole array of triples, stored as is.
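Since processed_data holds the array as raw text, moving to one row per triple would first require splitting that text back into individual triples. The following is a rough sketch of that step. It assumes the pipeline output looks exactly like the example above (parenthesized, comma-separated, unquoted terms that contain no commas or parentheses); the name `parse_triples` is hypothetical, and the real pipeline output may need a different parser.

```python
import re

# Assumed format: "[(subject, predicate, object), ...]" with unquoted terms
# that contain no commas or parentheses.
TRIPLE_RE = re.compile(
    r"\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*,\s*([^,()]+?)\s*\)"
)


def parse_triples(text):
    """Return a list of (subject, predicate, object) tuples found in text."""
    return [match.groups() for match in TRIPLE_RE.finditer(text)]


sample = (
    "[(weight_gain, mayleadto, obesity), (cancer, maybe, present), "
    "(study, observedchangein, sleep_patterns)]"
)
triples = parse_triples(sample)
```

Each resulting tuple could then be saved as its own row instead of writing the whole string into processed_data.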
I have been exploring relational DB schemas for simple RDF implementations online and have come across several solutions, but I am unsure of how to implement them. One that seemed promising:
CREATE TABLE triple (
    property VARCHAR(255),
    resource VARCHAR(255),
    value BLOB,
    hint CHAR(1)
);
How would I go about switching my code to implement this?
Thanks in advance for the help!