Week 1

Let's firstly set up the evaluation measure!

Posted by Stuart Chen on June 1, 2019

The Goals for Week 1

Let’s have a look into this week’s topics:

  1. qald evaluation against GERBIL interface

  2. building the semantic tree structue for the natural language sentence

The balance between whether to write a script to de the evaluation or to wrap the whole system into GERBIL

Method 1: evaluation.sh + evaluation.py V.S. Mehod 2: Wrapper of the QA System

For Method 1, the basic idea was that, based on the two files that I committed to my branch, the evaluation depends on running the GERBIL system on local machine to get a json-format response as comparison. It works on two parts parallel:

    1)first, in the evaluation.sh , it calls the NSpM model to give the output based on trained model;

    2)next, it retrieve the comparative output from the GERBIL to do the evaluation. The evaluation is conducted upon the data file after generation operation.  

For Method 2, it is that the GERBIL needs to wrap the whole self-built QA system that’s written in python into the existed system that’s written in java via jython. And I was adviced that this would be more straightforward to spares us from implementing the evaluation module of our own.

Besides, I have worked on cleaning up and refining a dataset from QALD that may be useful sooner, which contains a set of natural language queries and SPARQL queries with their answer’s value.

When talking the question, those natural language queries, it highlights the importance of sparsing a sentence while generating the template.

Entity Recognition & Semantic Parsing

To illustrate the ideas, I would hope to show you this figure that was extracted from the data below:

        {
                                "id": "3",
                                "answertype": "resource",
                                "aggregation": false,
                                "onlydbo": true,
                                "hybrid": false,
                                "question": [{
                                                "language": "en",
                                                "string": "Who was the wife of U.S. president Lincoln?",
                                                "keywords": "U.S. president, Lincoln, wife"
                                        },
        
        "query": {
        "sparql": 
        "PREFIX dbo: <http://dbpedia.org/ontology/>\nPREFIX res: <http://dbpedia.org/resource/>\nSELECT DISTINCT ?uri \nWHERE {\n\tres:Abraham_Lincoln dbo:spouse ?uri.\n}"
                                },
                                "answers": [{
                                        "head": {
                                                "vars": [
                                                        "uri"
                                                ]
                                        },
                                        "results": {
                                                "bindings": [{
                                                        "uri": {
                                                                "type": "uri",
                                                                "value": "http://dbpedia.org/resource/Mary_Todd_Lincoln"
                                                        }

This show the idelogical processing about how we detect the entities from the sentence and do the parsing, because we need to mapp them towards the RDF in DBpedia.