Thank you Capsenta! Hello data.world!

My PhD advisor, Prof. Daniel Miranker at the University of Texas at Austin, and I founded Capsenta in 2014 for the following reasons:

1) We truly believe that there is a commercial opportunity for companies wanting to connect their relational databases with semantic web technologies.
2) We are both passionate about commercializing research via startups.

Over the years we have been using Ultrawrap, and recently Gra.fo, to address data integration and business intelligence problems using semantic web technologies. Business users do not understand their complex data sources. IT struggles to understand the thousands of tables, millions of attributes and how the data all works together. We deliver a beautiful view of these myriad, complex relational data sources by designing an ontology (i.e., knowledge graph schema), mapping it to the complex data sources via a pay-as-you-go methodology and then using the mappings to integrate the data in a virtual (NoETL) or materialized (ETL) way. Our ultimate goal is to take complex data and turn it into beautiful data.
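To make the virtual versus materialized distinction concrete, here is a minimal sketch in Python. It is not how Ultrawrap is implemented. The table name, the `ex:` terms, the `MAPPINGS` dictionary and both functions are hypothetical, made up for illustration. The idea is only that the same mapping can either be run up front to produce triples (ETL) or be used to rewrite a question into SQL at query time (NoETL).

```python
import sqlite3

# Hypothetical legacy source table with opaque names (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_cust_01 (c_id INTEGER PRIMARY KEY, c_nm TEXT)")
conn.executemany("INSERT INTO t_cust_01 VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

# Hypothetical mapping: ontology property -> SQL yielding (s, o) pairs.
MAPPINGS = {
    "ex:customerName": "SELECT c_id AS s, c_nm AS o FROM t_cust_01",
}


def subject_iri(key):
    """Build an identifier for a customer from its primary key."""
    return f"ex:Customer/{key}"


def materialize(conn):
    """ETL style: run every mapping up front and keep the resulting triples."""
    triples = []
    for prop, sql in MAPPINGS.items():
        for key, value in conn.execute(sql):
            triples.append((subject_iri(key), prop, value))
    return triples


def query_virtual(conn, prop, value):
    """NoETL style: answer 'which subjects have prop = value?' by wrapping
    the mapping's SQL in a filter and running it against the source."""
    sql = f"SELECT s FROM ({MAPPINGS[prop]}) WHERE o = ?"
    return [subject_iri(key) for (key,) in conn.execute(sql, (value,))]


store = materialize(conn)
from_store = [s for (s, p, o) in store if p == "ex:customerName" and o == "Acme"]
from_source = query_virtual(conn, "ex:customerName", "Acme")
```

Both `from_store` and `from_source` name the same customer. The difference is where the work happens: the materialized path copies data out of the source once, while the virtual path leaves the data in place and pushes each question down to the database.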

We are now experiencing more interest and uptake from the industry. Knowledge Graphs are the new cool kid on the block. Graph databases are hot. The Semantic Web community keeps providing evidence that semantic technology works in the real world (just see all the papers in the in-use and industry tracks at ISWC and ESWC, the recent Knowledge Graph Conference, etc.). The industry is really starting to care!

There is one company that really, really cares: data.world.

And I am extremely excited to share that

Capsenta has been acquired by data.world!

Who is data.world?

data.world is a data platform where anybody can add their data, integrate it, share it, query it and much more. Data on the data.world platform becomes part of a web of linked data. The coolest thing is that it runs 100% on semantic web technology. Literally! They have made use of many research results from the semantic web community. For example, every single dataset is stored as RDF HDT. They make use of Apache Jena. You can query all the data in SPARQL, and even federate queries across different datasets. When you import relational or CSV data, they use the RDB2RDF and CSV2RDF direct mappings. They have even created their own SQL to SPARQL translator, thus enabling tabular data to be queried in SQL in addition to SPARQL. All changes are tracked and the provenance is represented in PROV-O and queryable. Heck, they even support SHACL! data.world is a true semantic web platform.
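For readers who have not seen a direct mapping before, here is a simplified sketch in Python of the idea behind the W3C Direct Mapping of relational data to RDF. It is not data.world's code. The function name, the `http://example.org/` base and the sample row are assumptions for illustration, and it ignores foreign keys, tables without a primary key and datatype handling. Each row becomes a subject identified by its table and primary key, each non-NULL column becomes a triple, and the row is typed with its table.

```python
from urllib.parse import quote

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def direct_map_row(base, table, primary_key, row):
    """Return (subject, predicate, object) triples for one relational row.

    Simplified: no foreign keys, no datatypes, primary key required.
    """
    table_iri = base + quote(table, safe="")
    # Row identifier: <base/Table/col=value>, multiple key columns joined by ';'.
    key_part = ";".join(
        f"{quote(col, safe='')}={quote(str(row[col]), safe='')}"
        for col in primary_key
    )
    subject = f"{table_iri}/{key_part}"

    triples = [(subject, RDF_TYPE, table_iri)]
    for column, value in row.items():
        if value is None:
            continue  # a NULL cell produces no triple
        predicate = f"{table_iri}#{quote(column, safe='')}"
        triples.append((subject, predicate, value))
    return triples


# Hypothetical row from a People table whose primary key is ID.
triples = direct_map_row(
    "http://example.org/",
    "People",
    ["ID"],
    {"ID": 7, "fname": "Bob", "addr": None},
)
for triple in triples:
    print(triple)
```

The point of a direct mapping is that it needs no human input: the graph mirrors the relational schema exactly. That makes it a good automatic starting point, after which a hand-designed ontology and custom mappings can give the data a more meaningful shape.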

data.world started out in 2016 by creating a community of open data, which has been called a kind of “GitHub for data”. Now, data.world is the world’s largest collaborative data community and that community has come together to upload and curate hundreds of thousands of datasets.

data.world is also a Public Benefit Corporation with the following ambitious mission:

– Build the most meaningful, collaborative and abundant data resource in the world in order to maximize data’s societal problem-solving utility.
– Advocate publicly for improving the adoption, usability, and proliferation of open data and linked data. (YES, you read that correctly! Their mission is to improve the adoption of linked data!!!!)
– Serve as an accessible historical repository of the world’s data.

It’s now time to start the next phase of taking data.world to the enterprise. This is where Capsenta comes in.

Why am I excited?

There are two main reasons why I am excited:

Perfect Technology Match: We both breathe and eat semantic web. Ultrawrap is a component that will help data.world create a hybrid data platform. We have enterprise customers who want to keep their data in place and not move it to the cloud. This is where Ultrawrap NoETL plays a crucial role. Furthermore, we both acknowledge that we need to make semantic web technology easy to use. data.world’s consumer-grade UI is a valuable differentiator. At Capsenta we created Gra.fo because there wasn’t an easily-usable ontology/knowledge graph schema editor for business users.

Perfect Mission/Vision Match: We are both heading towards the same goal. The way data is managed within enterprises is ugly and complicated. We have to address this problem from a holistic point of view. At Capsenta, our goal is to change the way the world models, governs and integrates data by generating beautiful data that business users can consume to start solving their business problems. We want to democratize data or, as data.world puts it, humanize the data. It’s clear to us that data integration is not just about the technology but also about the people. We need to empower the different stakeholders to be part of the conversation. That is why data.world is all about collaboration. Capsenta’s Gra.fo allows users to share their documents and have conversations via comments.

Oh, and we are both in Austin! How cool is that!

How did we get here?

When I transferred to UT Austin to finish my undergrad in Computer Science in 2006, by serendipity, I met Prof. Daniel Miranker. He was also intrigued by the Semantic Web. Our research started with a very basic question: what is the relationship between relational databases and the semantic web? It was clear to us that if the semantic web were to be successful, it must incorporate relational databases because that is where the majority of data is located. After I finished my undergrad, I wanted to continue this same line of research and keep working with Dan. One of the main reasons I wanted to do a PhD was the potential to start a company from our research. If semantic web technologies were to take off, then we would be seeing a lot of companies wanting to integrate their relational databases with the semantic web… and we would have the solution! Capsenta was founded to commercialize my PhD research.

With this fantastic technical basis in hand, Wayne Heideman joined the journey as CEO to guide the commercialization of these ideas, the technology and its productization. Since then we have demonstrated that our technology works to integrate data within very large enterprises in industries such as healthcare, e-commerce, oil and gas, and pharma, and we have enjoyed commercial success with millions of dollars of customer revenue.

Personally, I have learned A LOT about how to work with data in large enterprise settings (our smallest customer is a company with a billion dollars in revenue), from both the technical and social aspects. It is very satisfying to see our research being used in the real world to solve challenging data integration problems… and that we get paid to do it.

In order to scale Capsenta’s business, we needed more fuel. Given the alignment that we have with data.world, it makes complete sense to join forces.

What’s next?

The entire Capsenta team has joined data.world! I am now Principal Scientist at data.world. I continue to wear my scientific hat: collaborating with many research partners, attending and presenting at conferences, participating in program committees and editorial boards, supervising students and more. I also wear a business hat: I support engineering and technical sales, and work with customers to understand their problems and tie them back to R&D.

Capsenta and data.world had already been working together for over a year as partners and Ultrawrap NoETL was already integrated as the virtualization mechanism for data.world before the acquisition. It will be very fun to further integrate Capsenta’s technology within data.world. We also plan to continue to support all of our customers and continue development and support for Ultrawrap and Gra.fo.

Parting Thoughts

With my scientific hat

I am very proud to be part of a startup coming out of research done at the Department of Computer Science at the University of Texas at Austin. I’m looking forward to seeing more startups coming out of UTCS.

There is so much fun research to be done! It’s going to be fun organizing our research plans for the short, medium and long term. Stay tuned!

With my business hat

This is a huge win for companies who are looking to deploy an Enterprise Knowledge Graph. If you are learning and starting small, we can help you. If you are advanced and you know exactly what you want, we can help you too. Together, we now have the best platform in the world to create knowledge graphs!

Personally,

Thanks to the entire Capsenta team, past and present. We are starting this new chapter thanks to all of you.

Thanks to the investors for trusting us in this endeavor.

Thanks to Dan Miranker for believing in me.

Thanks to Wayne Heideman for teaching me so much about business and technology.

Thanks to my family for supporting me every step of the way.

It’s clear that both Capsenta and data.world are heading in the same direction. We are honored and humbled to be invited to be part of the data.world journey and are excited at what it holds for us all.

Thank you Capsenta!

Hello data.world!

We are one team now.