My takeaway from this year’s ISWC
- Less is sufficient
- Theory and practice are coming together more and more, and it’s getting rewarded
- We need to think bigger
- Semantic and Knowledge Graph technologies are doing well in industry
So let’s start! This was a very exciting and busy ISWC for me.
For the past two years, Claudio Gutierrez and I have been researching the history of knowledge graphs (see http://knowledgegraph.today/). This work culminated in a paper and a tutorial at ISWC. It was very well received:
I also gave the keynote “The socio-technical phenomena of data integration” at the Ontology Matching Workshop
Part of my message was to push the need for the Knowledge Scientist role.
I gave a talk on our in-use paper “A Pay-as-you-go Methodology to Design and Build Enterprise Knowledge Graphs from Relational Databases”
In conjunction with Dave Griffith, I also gave an industry presentation on how we are building a hybrid data cloud at data.world. Finally, I was also on an industry panel.
Oh, and data.world had a table
And our socks were a hit
Let’s not forget about Knowledge
Jerome Euzenat’s keynote was philosophical. His key message was that we have become too focused on data and are forgetting knowledge and how knowledge evolves.
I agree with him. At the beginning of the semantic web, I would argue, the focus was on ontologies (i.e. knowledge). From the mid-2000s, the focus shifted to data (Linked Data, LOD), and that is where we have been. We should not forget about knowledge. And it’s because of this:
I would actually rephrase Jerome’s message: not only should we not forget about knowledge, we should not forget about combining data and knowledge at scale.
And don’t forget:
Outrageous Idea
There was a track called Outrageous Idea, and the outrageous issue was that most of the submissions were rejected because the reviewers did not consider them outrageous enough. This led to an interesting panel discussion.
The semantic web community has a track record of being outrageous:
- The idea of linking data on the web was crazy and many thought it would not happen.
- Even though SPARQL is not a predominant query language, one of the largest repositories of knowledge, Wikidata, is all in RDF and SPARQL.
- Querying the web as if it were a database was envisioned in the early 90s, and given the linked data on the web, it actually became possible (see Olaf Hartig’s PhD dissertation and all the work it spawned; no wonder his 2009 paper received the 10-year award this year. Congrats, my friend!).
- Heck, the semantic web itself is an outrageous idea (that hasn’t yet been fulfilled).
However, there is a sentiment that this community is stuck and focused on incremental advances. Something needs to change. For example, we should have a venue/track to publish work that may lack a bit of scientific rigor because it is visionary, may not have well-defined research questions or a clearly stated hypothesis (because we are dreaming!), or may have a preliminary evaluation because it’s not yet understood how to evaluate it or what to compare against. Rumor has it that there will be some sort of vision track next year. Let’s see!
Pragmatism in Science
It was great to see scientific contributions combining theory and implementation, thus being more pragmatic. A catalyst, in my opinion, was the Reproducibility Initiative. Several papers had a “Reproduced” tag noting that the results were implemented, the code was available, and a third party had reproduced the results. One of the best research paper nominees, “Absorption-Based Query Answering for Expressive Description Logics,” won the Best Reproducibility award. These researchers are well-respected theoreticians, and it’s very interesting to see them bridging their theory with practice.
The best research paper, “Validating SHACL constraints over a SPARQL endpoint,” is highly theoretical but also comes with experimental results and available code: SHACL2SPARQL
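The core intuition of validating SHACL by rewriting constraints into SPARQL can be shown with a toy example. This is my own minimal sketch of the general rewriting idea, not the paper’s algorithm; all prefixes and data below are invented:

```python
# Toy illustration of the SHACL-to-SPARQL idea (a sketch, not the paper's
# algorithm): a sh:minCount 1 constraint ("every Person needs a name") is
# rewritten into a query that selects the violating focus nodes.
def min_count_violations_query(target_class, path):
    return (
        f"SELECT ?focus WHERE {{ ?focus a {target_class} . "
        f"FILTER NOT EXISTS {{ ?focus {path} ?v }} }}"
    )

# Evaluate the same constraint over in-memory triples standing in for
# the SPARQL endpoint; the data is invented.
triples = {
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "rdf:type", "ex:Person"),  # bob lacks ex:name -> violation
}

def violations(triples, target_class, path):
    focus = {s for s, p, o in triples if p == "rdf:type" and o == target_class}
    named = {s for s, p, o in triples if p == path}
    return sorted(focus - named)

print(min_count_violations_query("ex:Person", "ex:name"))
print(violations(triples, "ex:Person", "ex:name"))  # ['ex:bob']
```

The appeal of this style of approach is that the validation work is pushed to the endpoint itself, so any standards-compliant SPARQL store can act as the validator.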
I’m seeing this trend in the database community too. For example, the Graph Query Language (GQL) standardization process for Property Graphs will be accompanied by a definition of formal semantics, which is being led by theoreticians including Leonid Libkin.
I’m also starting to see this interest in the other direction: researchers who focus on building systems are becoming more rigorous with their theory and experiments. For example, the best student research paper was “RDF Explorer: A Visual SPARQL Query Builder” (see rdfexplorer.org). The computer science team partnered with an HCI researcher to conduct a user study, bringing scientific rigor to their work (and ultimately getting nominated for and winning an award).
Bottom line: my perception is that theoreticians want to make sure their theory is actually used, and systems builders are focusing more and more on the science and not just the engineering. This is FANTASTIC!
Table to Knowledge Graph Matching
One of the big topics at the conference was the “Tabular Data to Knowledge Graph Matching” challenge. The challenge consisted of three tasks:
- CTA: Assigning a class (:Actor) from a Knowledge Graph to a column
- CEA: Matching a cell to an entity (:HarrisonFord) in the Knowledge Graph
- CPA: Assigning a property (:actedIn) from the Knowledge Graph to the relationship between two columns
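To make the three tasks concrete, here is a toy two-column table with the kind of annotation each task produces (the classes, entities, and properties are illustrative, not actual challenge data):

```python
# A toy movie table and the annotation each of the three tasks produces.
# The :Actor/:HarrisonFord/:actedIn identifiers are illustrative only.
table = [
    ["Harrison Ford", "Blade Runner"],
    ["Carrie Fisher", "Star Wars"],
]

cta = {0: ":Actor", 1: ":Film"}           # CTA: a class per column
cea = {                                    # CEA: an entity per cell
    (0, 0): ":HarrisonFord", (0, 1): ":BladeRunner",
    (1, 0): ":CarrieFisher", (1, 1): ":StarWars",
}
cpa = {(0, 1): ":actedIn"}                # CPA: a property per column pair

print(cta[0], cea[(0, 0)], cpa[(0, 1)])  # :Actor :HarrisonFord :actedIn
```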
The matching was to DBpedia. The summary of the challenge in one slide:
For example, at a high level, the team from USC, Tabularisi, created candidate matches using DBpedia Spotlight and ranked them with TF-IDF, and that was sufficient to get decent results.
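The general “candidate lookup + TF-IDF” recipe can be sketched in a few lines. This is my own toy version of the idea, not the actual Tabularisi pipeline; the candidates and descriptions are invented:

```python
# Hedged sketch of cell-to-entity matching via candidate lookup + TF-IDF.
# A real system would get candidates from an index such as DBpedia Spotlight;
# here they are hard-coded, and the descriptions are made up.
import math
from collections import Counter

def tfidf(docs):
    """TF-IDF weight vectors (dicts) for a list of token lists."""
    df = Counter(t for d in docs for t in set(d))
    n = len(docs)
    return [{t: c * math.log(n / df[t]) for t, c in Counter(d).items()}
            for d in docs]

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Row context for the cell "Ford", plus two invented candidate descriptions.
context = "ford blade runner 1982".split()
candidates = {
    ":HarrisonFord": "american actor star wars blade runner".split(),
    ":HenryFord": "american industrialist founder ford motor company".split(),
}

vecs = tfidf([context] + list(candidates.values()))
scores = {e: cosine(vecs[0], v) for e, v in zip(candidates, vecs[1:])}
best = max(scores, key=scores.get)
print(best)  # :HarrisonFord
```

The row context (“blade runner”) is what disambiguates the ambiguous surface form “Ford”, which is essentially why such simple pipelines get decent results.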
The winner of the challenge, MTab, in my opinion, over-engineered their approach for DBpedia, which is how they were able to win the challenge.
DAGOBAH, from Orange Labs, had two approaches: a baseline that used DBpedia Spotlight, and a more sophisticated approach using embeddings. The embedding approach was slightly better, but more expensive.
There were other approaches such as CSV2KG and MantisTable:
My takeaway: “less is sufficient.” Seems like we can get sufficient quality by not being too sophisticated. In a way, this is good.
More notes
Olaf Hartig gave a keynote at Workshop on Querying and Benchmarking the Web of Data (QuWeDa). His message:
Albert Meroño presented work on modeling and querying lists in RDF Graphs (Paper, Slides). Really interesting.
I really need to check out VLog, a new rule-based reasoner for Knowledge Graphs (VLog code), and the Java library based on the VLog rule engine (Paper).
SHACL and Validation
OSTRICH is an RDF triple store that allows multiple versions of a dataset to be stored and queried at the same time. Code. Slides.
Interesting to see how the Microsoft Academic Knowledge Graph was created in RDF: http://ma-graph.org/
An interesting dataset, FoodKG: A Semantics-Driven Knowledge Graph for Food Recommendation https://foodkg.github.io/index.html
Translating SPARQL to Spark SQL is getting more attention. Clever stuff in the poster: Exploiting Wide Property Tables Empowered by Inverse Properties for Efficient Distributed SPARQL Query Evaluation
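The wide property table idea behind that line of work can be illustrated with a small sketch. This is my own invented example of the general storage layout, not the poster’s system:

```python
# Sketch of the wide property table layout: instead of one
# subject/predicate/object triple table, where every triple pattern in a
# query needs a self-join, each subject becomes one row with a column per
# property, so a star-shaped SPARQL pattern compiles to a single scan.
rows = [
    {"s": ":BladeRunner", "type": ":Film", "director": ":RidleyScott", "year": 1982},
    {"s": ":StarWars",    "type": ":Film", "director": ":GeorgeLucas", "year": 1977},
]

# SPARQL: SELECT ?s WHERE { ?s a :Film ; :year ?y . FILTER(?y > 1980) }
# becomes a join-free filter over the property table:
result = [r["s"] for r in rows
          if r.get("type") == ":Film" and r.get("year", 0) > 1980]
print(result)  # [':BladeRunner']
```

Avoiding the self-joins is exactly what makes this layout attractive on a distributed engine like Spark, where joins mean expensive shuffles.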
There wasn’t much material on machine learning and embeddings, only one session (not surprising, because I guess that type of work gets sent to machine learning conferences). A few things I saw (not a complete list):
- Ampligraph.org, an “open source library based on TensorFlow that predicts links between concepts in a knowledge graph.”
- The poster Embedding OWL ontologies with OWL2Vec (code) was intriguing.
- The DAGOBAH (see Table to KG Matching) used embeddings for their approach.
- Pre-trained entity embeddings for all publications of the Microsoft Academic Graph using RDF2Vec
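The scoring intuition behind much of this KG embedding work can be shown with a TransE-style toy example. This is a generic sketch of the translation-based scoring idea, with hand-picked (not learned) vectors:

```python
# Minimal sketch of TransE-style scoring, the kind of link prediction idea
# behind many KG embedding libraries: a triple (h, r, t) is plausible when
# head + relation lands near tail. The 3-d vectors are hand-picked toys.
def score(h, r, t):
    # Negative Euclidean distance ||h + r - t||: higher = more plausible.
    return -sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)) ** 0.5

emb = {
    ":HarrisonFord": [1.0, 0.0, 0.0],
    ":actedIn":      [0.0, 1.0, 0.0],
    ":BladeRunner":  [1.0, 1.0, 0.0],
    ":Paris":        [0.0, 0.0, 1.0],
}

plausible   = score(emb[":HarrisonFord"], emb[":actedIn"], emb[":BladeRunner"])
implausible = score(emb[":HarrisonFord"], emb[":actedIn"], emb[":Paris"])
print(plausible > implausible)  # True
```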
Need to check out http://ottr.xyz/
I missed the GraphQL tutorial.
Industry
Even though this is an academic/scientific conference, there were still a number of industry attendees.
Dougal Watt (former IBM NZ Chief Technologist and founder of Meaningful Technology) gave a keynote, where he was preaching Dave McComb’s message of being data centric. I liked how he introduced the phrase “knowledge centric” which is where we should be heading.
Pinterest and Stanford won the best in-use paper award for “Use of OWL and Semantic Web Technologies at Pinterest”
Bosch presented their use case of combining semantics and NLP. They are creating a search engine for materials scientists to find documents.
Google was present:
Joint work between Springer and KMi was presented
The Amazon Neptune team presented a demo, “Enabling an Enterprise Data Management Ecosystem using Change Data Capture with Amazon Neptune,” and an industry talk, “Transactional Guarantees for SPARQL Query Execution with Amazon Neptune.”
I learned about Ampligraph.org from Accenture, an “Open source library based on TensorFlow that predicts links between concepts in a knowledge graph.”
Great to see Orange Labs participating in the table to knowledge graph matching challenge (more above).
Always great to connect with Peter Haase from Metaphacts and meet new folks like Jonas Almeida from NCI.
And that’s a wrap
ISWC is always a lot of fun. In addition to all the scientific and technical content, there is also a sense of community. I always enjoy being part of the mentoring lunch:
We had a fantastic gala dinner
And we even got all the Hispanics together:
Take a look at other trip reports (I can now read them, since I’ve published mine!):
Avijit Thawani: https://medium.com/@avijitthawani/iswc-2019-new-zealand-bd15fe02d3d4
Sven Lieber: https://sven-lieber.org/en/2019/11/05/iswc-2019/
Cogan Shimizu: https://daselab.cs.ksu.edu/blog/iswc-2019
Armin Haller: https://www.linkedin.com/pulse/knowledge-graphs-modelling-took-center-stage-iswc-2019-armin-haller/
With that, see you next year: