The Tuva Project Manifesto

Healthcare data has been challenging to work with since the inception of claims data in the mid-90s. The Tuva Project is our attempt to fix it - Tuva or Bust.

If you're reading this, you probably know about the awesome potential healthcare data has for discovering new treatments, lowering the cost of healthcare, and increasing its quality. But you're probably also familiar with the challenges of working with it.

Here's the problem: Healthcare data is complex. It's perhaps more complex than data from any other industry. Think about it: healthcare data attempts to capture human biology and the delivery of healthcare in a database. There are thousands (millions?) of body systems, diagnostic tests, diseases, and treatments, all of which are captured in healthcare data.

The complexity is compounded by the fact that the systems that capture the data (e.g. EMRs) are highly variable. The end result is that healthcare data is extremely heterogeneous, which makes it impractical to perform analysis directly on raw healthcare data.

As an industry we've understood this for some time, but we've been busy solving other problems. The first problem we solved in the late 2000s / early 2010s was making healthcare data electronic. This was a big step because it's tough to do analysis with paper records.

The second problem, which we started solving a few years ago, is interoperability. Since becoming digital, healthcare data has been extremely siloed and difficult to access. But thanks to the 21st Century Cures Act and the regulations promulgated under it, this is changing, and it's now easier to access healthcare data than ever. (Although if you're a vendor requesting claims data from a payer, this is still a massive pain.)

This brings us to the present and the single biggest problem that remains to be solved: data transformation. Solving this problem is all about figuring out the best way to go from raw healthcare data sources (i.e. claims, medical records, and other clinical sources like labs) to enriched, quality-tested data that is ready for analysis and machine learning.

Every healthcare data person is familiar with the data transformation problem. We've all been re-learning and re-inventing solutions to this problem for the past 30 years. The goal of the Tuva Project is to solve the data transformation problem, and because this problem is fundamentally a knowledge problem, an open source approach is the only way we'll get there.

If you're a healthcare data person, you have tremendous knowledge and expertise to contribute. We hope you'll join us on this mission and help unlock the true potential of healthcare data to improve health and healthcare for all of us.

Why the Name Tuva?

We are massive Richard Feynman fans. Feynman embodied so many great traits that are critical for deeply understanding a subject and doing science.

Tuva is an allegory for Feynman. Tuva was once an independent country and later became part of the Soviet Union. For more than a decade before his death, Feynman and his friend Ralph Leighton tried to travel there. What started as a joke became a mission, and a challenging one: getting a visa to the Soviet Union during the Cold War was next to impossible. Ultimately Feynman died a few weeks before their visas came, but Ralph traveled to Tuva and chronicled the trip, and their adventure trying to get there, in his book Tuva or Bust.

Ralph helped pen a number of other books about Feynman, which we highly recommend. If you're new to Feynman, start with Surely You're Joking, Mr. Feynman.
