Getting started
Access to open drug data is valuable. Open drug data is not easy to work with.
Background
By "open drug data" we mean data about drugs openly available to the public - typically (but not only) from US government sources like the Food and Drug Administration (FDA), National Library of Medicine (NLM), and Centers for Medicare and Medicaid Services (CMS). These organizations do a reasonably good job at presenting and sharing their own siloed data; however, they all seem to use different data formats, structures, and update frequencies.
When we say open drug data is valuable but not easy to work with, here are a few examples of what we mean.
We could go on, but you probably get the idea. Open drug data holds a lot of value, but reliably accessing this value on an ongoing basis requires either a well thought-out data pipeline infrastructure or a lot of error-prone manual work.
So what are we doing about it?
Introduction
We built a platform of one-click data pipelines that automatically extract open drug data that is current to the day, load it all into a common database so it's easier to work with, and transform it into curated marts. These marts contain the polished end result of a complex series of novel combinations and reorganizations of the original drug data.
Oh - and we open sourced it all.
More details about how SageRx works behind the scenes are available in the aptly named How SageRx works.
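To make the extract, load, transform flow described above a little more concrete, here is a minimal sketch of the general pattern. It is not SageRx's actual code: the table names, column names, and sample rows are all hypothetical, and an in-memory SQLite database stands in for the common database.

```python
import sqlite3

# Hypothetical rows standing in for two extracted source files.
# These are made-up values, not real FDA or CMS data.
FDA_ROWS = [
    ("00000-0001", "examplol 10 mg tablet"),
    ("00000-0002", "samplarin 5 mg capsule"),
]
CMS_ROWS = [
    ("00000-0001", 0.12),
    ("00000-0003", 1.50),
]


def run_pipeline(conn):
    """Load raw rows into one database, then build a mart with SQL."""
    # Load: copy each source into its own table, unchanged.
    conn.execute("CREATE TABLE fda_products (ndc TEXT, description TEXT)")
    conn.execute("CREATE TABLE cms_prices (ndc TEXT, unit_price REAL)")
    conn.executemany("INSERT INTO fda_products VALUES (?, ?)", FDA_ROWS)
    conn.executemany("INSERT INTO cms_prices VALUES (?, ?)", CMS_ROWS)

    # Transform: combine the sources into a curated table (a "mart").
    conn.execute(
        """
        CREATE TABLE mart_product_prices AS
        SELECT p.ndc, p.description, c.unit_price
        FROM fda_products AS p
        JOIN cms_prices AS c ON c.ndc = p.ndc
        """
    )
    query = (
        "SELECT ndc, description, unit_price "
        "FROM mart_product_prices ORDER BY ndc"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    for row in run_pipeline(connection):
        print(row)
```

Only the product that appears in both hypothetical sources survives the join, which is the kind of cross-source combination that is tedious to do by hand when each source uses its own format.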
So what's the big deal?
SageRx is different from the commercial drug databases you might be familiar with.
- For one, not only are the code and SQL that do the data transformations completely open source, the documentation is also open and written by pharmacist / developer hybrids who know how to translate pharmacy domain knowledge into developer-friendly concepts.
- Second, it is fairly lightweight, easy to spin up (using Docker), and pretty much runs itself. Even not-super-technical people can contribute by adding their own custom data transformations requiring only SQL. And - if you think your work could benefit others - you could even contribute a pull request to the overall open-source SageRx project.
- Lastly, at its core, SageRx is based on open common standards that promote interoperability - instead of licensed, proprietary coding systems that make it difficult to share data between organizations.
To be clear, we're not a huge organization of people scrubbing the source data and phoning manufacturers to fill in gaps... but it's not our intention to be that. We want to build something sustainable with very little overhead that might make drug data more accessible and understandable for people who need to work with it.
Who needs it?
Our hope is that SageRx can benefit (at the very least):
- Startup founders
- Researchers
- Data analysts
- Maybe you?
If any of this interests you, please star the repo, join our Slack, and/or shoot us an email. Oh - and please be patient as we try to get our documentation in order. If you have questions or need help getting started in the meantime, the #proj-sagerx channel of our Slack is an excellent resource for support.