Blank Nodes in Graph Patterns #59

Open
no-reply opened this issue Aug 14, 2015 · 7 comments

@no-reply
Member

SPARQL::Client::Repository#query_pattern runs afoul of a restriction in SPARQL about the allowed blank node labels in queries. I suspect most SPARQL implementations will just interpret these as two unique blank nodes without problem, but I noticed that Blazegraph throws errors and it seems technically correct to do so.

We have the option to change the node labels within the method, here; better might be to find a place somewhere in Pattern to change the blank node labels so they won't be repeated.

Thoughts?

@gkellogg
Member

It's not clear exactly what restriction you're referring to. The link you provided shows the creation of a SPARQL CONSTRUCT using the supplied patterns. Is it that the serialization of a BNode element might render something which is not valid SPARQL? Do you have a specific example?

Any BNodes generated automatically should be fine; those that are created by a client may fail on some servers, in which case the server may complain with a failure code, but that would seem to be just fine to me, as the client is in charge of creating such nodes.

@no-reply
Member Author

Sorry, I gave the wrong link. The restriction I intended to reference is this one:

> When using blank nodes of the form _:abc, labels for blank nodes are scoped to the basic graph pattern. A label can be used in only a single basic graph pattern in any query.

When given patterns with blank nodes, the code linked in the original issue description creates queries like:

CONSTRUCT { _:one _:two _:three . } WHERE { _:one _:two _:three . }

Blazegraph (apparently correctly) rejects this. From a description of the behavior I sent them:

In short, some upstream code uses the same bnode label in both CONSTRUCT and WHERE. The error you throw (included below) appears correct, and we'll fix this on the RDF.rb side, but I thought you might be interested in the issue. It seems like it would be harmless to interpret these as two unique bnodes in two separate scopes.

ERROR: BigdataRDFServlet.java:214: cause=java.util.concurrent.ExecutionException: org.openrdf.query.MalformedQueryException: com.bigdata.rdf.sail.sparql.ast.VisitorException: BNodeID already used in another scope: g69995647769040, query=SPARQL-QUERY: queryStr=CONSTRUCT { _:g69995647769040 http://xmlns.com/foaf/0.1/mbox_sha1sum ?g69995650401960 . } WHERE { _:g69995647769040 http://xmlns.com/foaf/0.1/mbox_sha1sum ?g69995650401960 . }
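The scoping check behind that error can be approximated on the client side before a query is sent. The following is a minimal sketch, not code from SPARQL::Client or Blazegraph: `shared_bnode_labels` and `BNODE_LABEL` are hypothetical names, the template and WHERE bodies are assumed to be available as plain strings, and the regular expression is a simplification that does not account for `_:` appearing inside IRIs or literals.

```ruby
# Hypothetical helper, not part of RDF.rb or SPARQL::Client.
# Simplified label matcher; the real SPARQL grammar allows more characters.
BNODE_LABEL = /_:([A-Za-z0-9_]+)/

# Returns the blank node labels that appear in both the CONSTRUCT
# template and the WHERE clause, i.e. labels reused across scopes.
def shared_bnode_labels(template, where)
  template.scan(BNODE_LABEL).flatten & where.scan(BNODE_LABEL).flatten
end

template = '_:g69995647769040 <http://xmlns.com/foaf/0.1/mbox_sha1sum> ?o .'
where    = '_:g69995647769040 <http://xmlns.com/foaf/0.1/mbox_sha1sum> ?o .'

p shared_bnode_labels(template, where)
```

A non-empty result means the query reuses a label in two scopes, which is the condition Blazegraph reports as "BNodeID already used in another scope".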

@gkellogg
Member

I don't think there are any official tests for this, and my implementation certainly doesn't raise an error. (Of course, BNodes in predicate locations are never okay).

In the case of #query_pattern, we could simply fail if any element of the pattern is a BNode; arguably, the rdf-spec tests for Repository shouldn't use these patterns, as you generally can't remotely work with BNodes without skolemizing them. It only really works for in-memory Repositories, or those making a guarantee about BNode label stability (we're considering this for a hypothetical normalized dataset).

@no-reply
Member Author

> In the case of #query_pattern, we could simply fail if any element of the pattern is a BNode

I'm thinking this would be overkill. Something like CONSTRUCT { _:node ?predicate ?object . } WHERE { ?node ?predicate ?object . FILTER(isBlank(?node)) } seems like a legitimate pattern.

The rest of what you've said rings true. The bnode handling I have in the Blazegraph work thus far is okay-ish, but carries some big caveats. I think there are solutions here without leaving the realm of SPARQL Update.

@gkellogg
Member

Yes, I was thinking much the same (?node instead of _:node, of course).

@no-reply
Member Author

I think they are semantically equivalent, with each constructing "fresh" blank nodes in CONSTRUCT for each solution.

In any case, two tests fail with this error when run over Blazegraph. One can just be changed, since I don't think it's intended to test anything to do with blank nodes. The other is:

... behaves like an RDF::Repository when querying statements behaves like an RDF::Queryable RDF::Queryable#first_value returns the correct value when the pattern matches
     Failure/Error: expect(subject.first_value(matching_pattern)).to eq subject.first_literal(matching_pattern).value
     SPARQL::Client::MalformedQuery:
       SPARQL-QUERY: queryStr=CONSTRUCT { _:t519724 <http://xmlns.com/foaf/0.1/mbox_sha1sum> ?g70276900592100 . } WHERE { _:t519724 <http://xmlns.com/foaf/0.1/mbox_sha1sum> ?g70276900592100 . }
       java.util.concurrent.ExecutionException: org.openrdf.query.MalformedQueryException: com.bigdata.rdf.sail.sparql.ast.VisitorException: BNodeID already used in another scope: t519724

I have a patch that uses a different node ID in WHERE, but would be happy to try/submit one that switches to a filtered variable shared between the patterns.
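The two options mentioned here can be sketched side by side. This is an illustrative sketch only, not the actual patch: terms are assumed to be plain strings rather than RDF::Term objects, and `bnode?`, `relabel_query`, `filtered_variable_query`, the `_w` suffix, and the `bn_` variable prefix are all hypothetical choices made for the example.

```ruby
# Hypothetical sketch; terms are plain strings, not RDF::Term objects.
def bnode?(term)
  term.start_with?('_:')
end

# Option 1: keep the blank node in CONSTRUCT, but give it a
# different label in WHERE so no label is shared between scopes.
def relabel_query(pattern)
  where = pattern.map { |t| bnode?(t) ? "#{t}_w" : t }
  "CONSTRUCT { #{pattern.join(' ')} . } WHERE { #{where.join(' ')} . }"
end

# Option 2: replace each blank node with a variable shared by both
# patterns, and restrict that variable to blank nodes with a FILTER.
def filtered_variable_query(pattern)
  filters = []
  terms = pattern.map do |t|
    next t unless bnode?(t)
    var = "?bn_#{t.delete_prefix('_:')}"
    filters << "FILTER(isBlank(#{var}))"
    var
  end
  triple = "#{terms.join(' ')} ."
  "CONSTRUCT { #{triple} } WHERE { #{[triple, *filters.uniq].join(' ')} }"
end

pattern = ['_:t519724', '<http://xmlns.com/foaf/0.1/mbox_sha1sum>', '?o']
puts relabel_query(pattern)
puts filtered_variable_query(pattern)
```

One difference worth keeping in mind: a blank node label in WHERE behaves like a variable and can match any term, whereas the `isBlank` filter in the second option only admits blank nodes, so the two queries are not guaranteed to return the same solutions.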

@gkellogg
Member

@no-reply I'm assigning this to you to apply your patch.
