Why the U.S. Semantic Technologies Symposium was a big deal

I recently attended the 1st U.S. Semantic Technologies Symposium. I published my trip report quickly because I wanted to get it out ASAP; otherwise I would never have gotten it out (procrastination!). It’s been a week since my trip report, and I can’t stop thinking about this event. After a week of reflection, I realized that there is a big picture that did not come across in my trip report: THE US2TS WAS A VERY BIG DEAL!
Why? Three reasons:
1) There is a thriving U.S. community interested in semantic technologies
2) We are not alone! We have support!
3) The time is right for semantic technologies
Let me drill into each of these points.

1) There is a thriving US community interested in semantic technologies

Before the event, I asked myself: what does success look like personally and for the wider community after this event? I didn’t know what to expect and honestly, I had low expectations. I was treating this event as a social gathering with friends and colleagues that I hadn’t seen in a while.
It was much more than a social gathering. I realized that there is a thriving U.S. community interested in semantic technologies outside of the usual suspects (academics at USC/ISI, Stanford, RPI, Wright State, and UCSB, and industry players such as IBM and Franz). At the first coffee break, I told Pascal, “I didn’t know there were all these people in the US interested in semantic technologies!” Apparently many people shared the same comment with Pascal. US2TS was, in my opinion, the first step toward unifying an existing community in the US that was not yet connected.
I’ve known about the semantic work that Inovex and GE Research have been doing. I was very glad to see them come to an event like this and publicize what they are doing to the wider community.
It was very exciting to meet new people and see what they are doing, coming from places such as Maine, U Idaho, UNC, Cornell, UC Davis, Pitt, Duke, UGA, UTEP, Oregon, Bosch, NIST, the US Air Force, USGS, and independent consultancies.
Additionally, it was very exciting to see the different application domains. Life science has always been prominent. I learned about the complexity of geospatial and humanities data. I’m sure there are many more similarly complex use cases out there.

2) We are not alone! We have support!

The US government has been funding work in semantic technologies through different agencies such as NSF, DARPA, and NIH. Chaitan Baru, Senior Advisor for Data Science at the National Science Foundation, had a clear message: NSF thinks of semantic technologies as a central component of one of its Ten Big Ideas, Harnessing the Data Revolution.
How do we harness the data revolution? Chaitan and others have been working through NITRD to promote an Open Knowledge Network that would be built by many contributors and offer content and services for the research community and for industry. I am convinced that an Open Knowledge Network is a key component to harnessing the data revolution! (More on the Open Knowledge Network below.)
Basically, NSF is dangling a $60 million carrot in front of the entire US Semantic Technologies community.
Chaitan’s slides will be made available soon through the US2TS web site.

3) Time is right for Semantic Technologies

Semantic technologies work! They solve problems that require integrating data from heterogeneous sources, where a clear understanding of the meaning of the data is crucial. Craig Knoblock’s keynote described how to create semantics-driven applications end to end in different application domains. Semantic technologies are key to addressing these problems.
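To make the integration idea concrete, here is a minimal sketch in plain Python (the vocabulary namespace and field names are hypothetical, and a real system would use RDF and an ontology rather than hand-written mappings): once two heterogeneous sources are expressed as subject–predicate–object triples over shared identifiers, integrating them is just a union.

```python
# Minimal sketch: integrating two heterogeneous sources by mapping both
# to triples over a shared (hypothetical) vocabulary.

EX = "http://example.org/vocab#"  # hypothetical vocabulary namespace

# Source A: a relational-style record
source_a = {"employee_id": 7, "full_name": "Ada Lovelace", "dept": "R&D"}

# Source B: a JSON API using different field names for the same entity
source_b = {"id": 7, "worksOn": "Knowledge Graphs"}

def a_to_triples(rec):
    subject = f"http://example.org/person/{rec['employee_id']}"
    return {(subject, EX + "name", rec["full_name"]),
            (subject, EX + "department", rec["dept"])}

def b_to_triples(rec):
    subject = f"http://example.org/person/{rec['id']}"
    return {(subject, EX + "project", rec["worksOn"])}

# Because both mappings agree on subject IRIs and vocabulary terms,
# integration is a set union -- no schema reconciliation at query time.
graph = a_to_triples(source_a) | b_to_triples(source_b)

# Query: everything we know about person 7, across both sources
person = "http://example.org/person/7"
facts = {p.split("#")[1]: o for (s, p, o) in graph if s == person}
print(facts)  # name and department from A, project from B, in one view
```

The point of the sketch is that the meaning lives in the shared vocabulary, not in either source’s local schema; that is what makes the union meaningful.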
One of the themes was that we need better tools. Existing tools are made for citizens of the semantic city. Nevertheless, we know that the technology works. I argue that this is the right opportunity to learn from our experiences and improve our toolkits. There may not be much science in this effort, and that’s fine; I think it is a perfect fit for industry and startups. I really enjoyed talking to Varish and learning how he is pushing GE Research to open source and disseminate the tools they are creating. The same goes for Inovex. One of our goals at Capsenta is to bridge the chasm between the semantic and non-semantic cities by creating tools and methodologies. Tools don’t have to come just from academia. It’s clear to me that the time is right for industry to work on tools.
One of the highlights of the event was Yolanda Gil’s keynote. Her slides are available at https://tinyurl.com/Gil-us2ts-2018. Yolanda made three important points:
1) The thirst for semantics is growing: we are seeing interest from other areas of computer science, namely Machine Learning/Data Science (slide 3), Natural Language Processing (slide 4), Image Processing (slide 5), Big Data (slide 6), and industry, through knowledge graphs (slide 7). If the thirst for semantics is growing, the question is: how are we quenching that thirst? We are seeing deep learning workshops at semantic web conferences. It’s time we do it the other way around: semantics/knowledge graph papers and workshops at deep learning conferences.
2) Supercomputing analogy: in 1982, Peter Lax chaired a report on “Large Scale Computing in Science and Engineering”. At that time, supercomputing had seen major investments by other countries, was dominated by large industry players, offered limited access to academia, and suffered from a lack of training. The report recommended that NSF invest in supercomputing. The result was the National Science Foundation Network (NSFNET) and the supercomputing centers that exist today. We face an analogous situation with semantics and knowledge graphs: major investments by other countries (Europe), domination by large industry players (Google, etc.), limited access for academia (not mainstream in other areas of CS), and a lack of training (we need to improve the tools). I found this analogy brilliant!
3) We need an Open Knowledge Network: as you can imagine, to continue the analogy, we need to create a data and engineering infrastructure around knowledge, similar to the supercomputing centers. An Open Knowledge Network would be supported by centers at universities; support research and content creation by the broader community; be always accessible and reliable for academia, industry, and anyone else; and enable new scientific discoveries and new commercial applications. For this, we need to think of semantics/knowledge graphs as reliable infrastructure, train the next generation of researchers, and treat the Open Knowledge Network as a valuable resource worthy of collective investment.
Do yourself a favor and take a look at Yolanda’s slides.

Conclusion

This is perfect timing. We have a thriving semantics technology community in the US. Semantic technologies work: we are seeing a thirst for semantics and interest from different areas of computer science. Finally, the NSF has a budget and is eager to support the US Semantic technologies community.

Trip Report: 1st U.S. Semantic Technologies Symposium (#US2TS)

I attended the 1st U.S. Semantic Technologies Symposium (#US2TS), hosted by Wright State University in Dayton, Ohio on March 1-2, 2018. The goal of this meeting was to bring together the U.S. community interested in semantic technologies. I was extremely happy to see 120 people get together in Dayton, Ohio to discuss semantics for two days. I’m glad to see such a vibrant community in the U.S. … and not just academics. Actually, I would say that academics were in the minority. I saw a lot of familiar faces and met a lot of people from different areas.

The program was organized around the following topics: Cross Cutting Technologies, Publishing and Retrieving, Space and Time, and Life Sciences. Each topic had a set of panelists, and each panelist gave a 10-minute talk. There was plenty of time for discussion and a breakout session. It was very lively. The program can be found here: http://us2ts.org/posts/program/

I gave a 10-minute version of my talk “Integrating Relational Databases with the Semantic Web: a journey between two cities“. The takeaway message: in order to use semantic technologies to address the data challenges of business intelligence and data integration, we need to fulfill the role of the knowledge engineer and empower them with new tools and methodologies. It looks like I did a good job, because the talk was well received 😃

Two main topics: Ontologies and Tools

The complexity and usability of ontologies was a topic throughout the two days. The hallway talk is that lightweight semantics is enough (I was happily surprised to hear this). However, the life science and spatial domains need heavyweight semantics (more below). CIDOC-CRM is the ontology used in the museum domain; apparently it is very complicated, and a lot of people don’t like it but have to use it.

Linked Open USABLE Data (LOUD): we need to find a balance between usability and complexity.

I was part of a breakout session on ontologies and reuse. I really appreciated Peter Fox’s comment on ontologies (paraphrasing): there are three aspects that we need to take into account: 1) expressivity, 2) maintainability, and 3) evolvability.

I shared our pay-as-you-go methodology to create ontologies and mappings in a poster and in hallway discussions. It was well received.

Tools, tools, TOOLS: we need better tools. That was another theme of the meeting. There seemed to be agreement with my claim that existing tools are made for the semantic city.

JSON-LD came up a lot. People love it.
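For readers who haven’t seen JSON-LD, its appeal is that an @context maps friendly JSON keys onto shared IRIs, so ordinary-looking JSON becomes linked data. Here is a toy sketch of that expansion step in plain Python (a real application would use a full JSON-LD processor, which handles @id, @type, nesting, and much more):

```python
import json

# A JSON-LD document: plain JSON plus an @context that maps the short
# keys onto globally unique IRIs (schema.org terms, in this example).
doc = json.loads("""
{
  "@context": {
    "name": "http://schema.org/name",
    "homepage": "http://schema.org/url"
  },
  "name": "US2TS",
  "homepage": "http://us2ts.org/"
}
""")

def toy_expand(document):
    """Toy version of JSON-LD expansion: replace each short key with
    the IRI its @context maps it to. Real processors do far more."""
    ctx = document.get("@context", {})
    return {ctx.get(k, k): v for k, v in document.items() if k != "@context"}

expanded = toy_expand(doc)
print(expanded)
# {'http://schema.org/name': 'US2TS', 'http://schema.org/url': 'http://us2ts.org/'}
```

The developer-friendly keys and the web-scale identifiers coexist in one document, which is a big part of why people like it.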

Application Areas of Semantics

As expected, Life science was present at this meeting. Melissa Haendel from Oregon Health & Science University showed some really cool results that were possible thanks to semantics. Chris Mungall from Lawrence Berkeley National Laboratory gave an overview of the Gene Ontology.

It was really interesting to learn that data in the geography domain (spatial data) is complex and requires heavyweight semantics, just like in life science.

There were interesting observations about humanities data, too; I see the need for semantics there.

I need to check out perio.do: “A gazetteer of period definitions for linking and visualizing data“. One of the project leads is a fellow longhorn, Prof. Adam Rabinowitz. I want to meet him!

Meeting people

It was great chatting with Varish Mulwad from GE Research and learning about all the semantic work that is going on there. I need to check out SemTK (Semantics Toolkit) and these papers:

SemTK: An Ontology-first, Open Source Semantic Toolkit for Managing and Querying Knowledge Graphs

Integrated access to big data polystores through a knowledge-driven framework

I enjoyed meeting Alessandro Oltramari and learning about the semantic work going on at Bosch.

Great to finally meet Vinh Nguyen. Her PhD was on Contextualized Knowledge Graphs (I should take a look at her PhD dissertation) and she is now organizing an ISWC 2018 workshop on this topic.

Happy Birthday Craig Knoblock!! He gave a fantastic keynote on his birthday!

Glad to have bumped into Ora Lassila. It’s been a long time!!

Takeaways from the Meeting

This is an event that was missing in the U.S. I’m glad that it was organized (Fantastic job Pascal and Krzysztof!). Looking forward to this event next year!