Joey LeGrand
Founder, CodeRx

Introducing the CodeRx Drug Database

· 5 min read
Joey LeGrand
Founder, CodeRx

The drug data solution that actually makes sense.

Why choose between unusable and unaffordable when there's a better way?

For years, anyone building healthcare software, conducting pharmacy research, or analyzing medication data has faced an unfair choice. On one side: free government data that requires months of learning obscure formats, parsing XML from nested zip files, and becoming an expert in RxNorm's abstract table structures. On the other: enterprise drug databases with six-figure price tags and vendor lock-in.

We built the CodeRx Drug Database because there should be a better choice.

Pictured: a server pharm. </joke>

The problem we're solving

Five years ago, when we needed drug data for a healthcare software development project, we discovered what thousands of developers and analysts already knew: there's no middle ground. You either spend months learning to work with raw data sources like RxNorm, FDA, and DailyMed, or you sign a contract that costs more than most early-stage startups can afford.

We chose to build our own data pipeline. We learned the hard way that "open" doesn't mean "easy." We figured out the best way to parse DailyMed's 50,000+ XML files buried in nested zip files. We became fluent in RxNorm's SABs and TTYs. We normalized NDC formats across five different data sources. We automated and enhanced weekly pricing updates from CMS NADAC data.

When our project was complete, we realized something: we had built exactly what the healthcare community needed. Not another GitHub repo that tackles one narrow problem. Not another stagnant open-source project. But a complete, supported, modern drug data product that anyone could use.

Why the CodeRx Drug Database is different

Better than raw open data

You don't need to become a government data format expert. We've done that part.

Instead of spending weeks learning RxNorm's abstract RXNCONSO, RXNREL, and RXNSAT tables, you get purpose-built data marts like drugs, ingredients, packages, and classes that actually make sense to pharmacists and developers.

No more parsing XML from DailyMed zip files. No more Excel formulas to normalize NDC formats every time you download fresh data from FDA and CMS. No more writing complex SQL joins just to answer basic questions about drug products.
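To make the NDC chore concrete, here is a minimal Python sketch of the kind of normalization involved. It assumes the input is a hyphenated NDC in one of the 10-digit layouts (4-4-2, 5-3-2, or 5-4-1) and pads it to the 11-digit 5-4-2 layout. The function name `normalize_ndc` is hypothetical and is not part of the CodeRx Drug Database.

```python
# Hypothetical helper for illustration only; not the CodeRx implementation.
VALID_LAYOUTS = {(4, 4, 2), (5, 3, 2), (5, 4, 1), (5, 4, 2)}


def normalize_ndc(ndc: str) -> str:
    """Pad a hyphenated NDC to 11 digits (5-4-2) with the hyphens removed."""
    parts = ndc.strip().split("-")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        # Unhyphenated 10-digit NDCs are ambiguous: the segment boundaries
        # are needed to know where the leading zero belongs.
        raise ValueError(f"expected a hyphenated three-segment NDC, got {ndc!r}")
    labeler, product, package = parts
    if (len(labeler), len(product), len(package)) not in VALID_LAYOUTS:
        raise ValueError(f"unrecognized NDC layout: {ndc!r}")
    # Left-pad each segment with zeros up to its 5-4-2 width.
    return labeler.zfill(5) + product.zfill(4) + package.zfill(2)
```

A call such as `normalize_ndc("12345-678-90")` pads the product segment, while a 4-4-2 input gets its labeler segment padded instead. That per-layout branching is exactly what tends to end up in a fragile spreadsheet formula.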

Setup in minutes, not months. Browse our open documentation to explore the exact data structure, or subscribe to weekly updates for production use. Either way, you're working with clean, integrated data from day one.

Documentation built for humans. Our web-hosted, searchable documentation includes specific use cases with SQL examples. No lengthy PDFs to search through. No generic technical documentation written for data scientists. Just clear explanations focused on pharmacy applications.

An oldie but a goodie. We finally built a better option.

More affordable than proprietary databases

Costs 90% less than alternatives. That's not a typo.

For early-stage startups, researchers, and data analysts, enterprise drug databases are simply out of reach. The CodeRx Drug Database costs at least 90% less while providing the core data you actually need.

Modern integration. No complex vendor contracting. No legacy data formats. Just straightforward CSV (or Parquet!) files you can load into any modern database or analytics tool.

Analytics-ready data. Proprietary databases provide terminology that requires transformation for common use cases. The CodeRx Drug Database gives you semantic drug concepts organized around real-world pharmacy questions — identifying drug classes, calculating days' supply, grouping therapeutics — without writing complex transformations every time.

Community-driven innovation. When you have an idea for a new data mart or feature, we can build it together. No waiting for a vendor's product team to add it to their multi-year roadmap.

What you get

The CodeRx Drug Database includes pre-built data marts that solve real pharmacy problems:

  • Packages: NDC-to-drug mappings with brand vs generic indicator and pricing data
  • Drugs: Unified view of brand and clinical products with dose forms, ingredients, and brand relationships
  • Classes: Multiple classification systems to aggregate drugs by therapeutic class or indication
  • Ingredients: Detailed ingredient strength information, including precise ingredient classifications
  • Excipients: Inactive ingredient tracking with special flags for preservatives, dyes, and gluten
  • Synonyms: Multi-source synonym aggregation for improved search and matching
  • Plus analytics-ready data marts for common pharmacy use cases

All data is integrated from multiple authoritative sources: FDA (NDC Directory), NLM (RxNorm and DailyMed), CMS (NADAC), and more.

Who this is for

Early-stage health tech startups that need professional drug data without enterprise pricing

Pharmacy researchers and analysts who want to focus on insights, not data wrangling

Healthcare developers building medication-related features who need reliable, well-structured drug data

Anyone who's ever thought "there has to be a better way than learning RxNorm from scratch"

The best of all worlds

You shouldn't have to choose between spending months learning government data formats and spending six figures on enterprise software. You shouldn't have to reinvent the wheel just to get basic drug information into your application.

The CodeRx Drug Database is the solution we wish had existed five years ago. It's affordable, modern, easy to use, and built by pharmacists who code and developers who care about healthcare.

Get started now

Want to see what we've built? All our documentation is available right now on our website:

Explore the Docs →

Need weekly updates? Let's talk about a subscription that makes sense for your organization.

Subscription Options →

Have questions? Shoot us a message or jump into the CodeRx Slack and we will help answer them.

Contact Us →

The hidden dependency in drug pricing transparency

· 5 min read
Joey LeGrand
Founder, CodeRx

Why transparent drug pricing relies on proprietary pack size data

Last week, a pharmacy manager called me with what seemed like a simple question: "How can I compare the prices I'm paying to NADAC to see where I might be overpaying?" It's a reasonable request—NADAC (National Average Drug Acquisition Cost) exists to provide transparent pricing benchmarks, and this pharmacy contributes their own purchase data to the system. Surely they should be able to use it for internal analysis, right?

Comparing Advils to Advils. OK I know Advil probably wouldn't be in a prescription vial, but work with me here…
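As a rough illustration of why pack size matters for the pharmacy manager's question, here is a minimal Python sketch. It assumes the NADAC benchmark is a per-unit price (per tablet, mL, or gram) while the pharmacy's invoice price is for a whole package, so the package size is needed before the two numbers can be compared. The `Purchase` class, the `compare_to_nadac` function, and all values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Purchase:
    ndc: str              # 11-digit NDC of the package purchased
    invoice_price: float  # price paid for the whole package
    package_size: float   # billing units in the package (tablets, mL, grams)


def compare_to_nadac(purchase: Purchase, nadac_per_unit: float) -> dict:
    """Convert an invoice price to a per-unit cost and compare it to NADAC."""
    if purchase.package_size <= 0:
        raise ValueError("package_size must be positive")
    if nadac_per_unit <= 0:
        raise ValueError("nadac_per_unit must be positive")
    unit_cost = purchase.invoice_price / purchase.package_size
    difference = unit_cost - nadac_per_unit
    return {
        "unit_cost": unit_cost,
        "difference_per_unit": difference,
        # Positive means the pharmacy paid more than the benchmark.
        "percent_over_nadac": difference / nadac_per_unit * 100,
    }
```

Without a trustworthy `package_size`, the division in the middle of this function cannot be done, and that is the dependency the title refers to.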

Working with drug product data

· 7 min read
Joey LeGrand
Founder, CodeRx

Drug products are a concept foundational to working with drug data in almost any capacity. They are the hub around which many types of analyses pertinent to pharmacy and the medication use process are organized. Drug information databases all have their own proprietary way of working with drug products, but the fundamental concepts are all the same. In this article, we explain those fundamental concepts in plain English and connect them with open standard identifiers.

A drug product can be both the concept of a medication and the physical manifestation of it that a patient could take.

The problem with drug information

· 7 min read
Joey LeGrand
Founder, CodeRx

The unfair choice between not easy to use and not easy to afford.

I remember years ago hearing for the first time that in order to know how much drugs cost, people had to pay for a license to a drug information provider. And I don't mean "how much do drugs cost with a GoodRx coupon" — I mean how much does it generally cost a pharmacy to purchase a bottle of a specific medication product.

Drug pricing information

What information can you get from open drug data?

· 4 min read
Joey LeGrand
Founder, CodeRx

More than you might think...

Open drug data is a powerful resource for healthcare, pharmacy, and research professionals. While it has some gaps, it serves as a foundation for innovation, providing key insights without the barriers of proprietary systems. With the right tools to fill in these gaps, open drug data can rival — and even surpass — commercial databases in accessibility, interoperability, and fostering innovation.

What's this pill?

The elusiveness of drug package size data

· 7 min read
Joey LeGrand
Founder, CodeRx

It's weirdly hard to know how much drug product is inside a given drug package. We dive into why it's challenging and how we plan to make it a lot easier.

Just like you can buy different package sizes of, say, pop at the grocery store (yes - I call it pop - I'm originally from the Midwest), pharmacies can also stock different package sizes of drugs. Just like pop comes in 12 or 24 packs of 12 oz cans and also single 20 oz or 2 liter bottles, drug products can come in varying package sizes. The same oral solid drug product from the same manufacturer could be available in say 100, 500, and 1000 count bottles. The same vaccine from the same manufacturer can come in multi-dose 5 mL vials, or pre-filled 0.5 mL syringes - each with perhaps the option of buying a 1 or 10 pack.

Grocery store shelves are not terribly dissimilar from pharmacy shelves. Different products from different manufacturers with different pack sizes.

Open does not mean easy when it comes to drug data

· 7 min read
Joey LeGrand
Founder, CodeRx

Sometimes you get what you pay for. Sometimes the alternative is too expensive.

For the past year or so, we've been going down rabbit holes discovering more and more sources of open drug data, each with its own differences and quirks. By "drug data" we mean data about drugs - typically (but not only) from US government sources like the Food and Drug Administration (FDA), National Library of Medicine (NLM), and Centers for Medicare and Medicaid Services (CMS). These organizations do a reasonably good job of presenting and sharing their own siloed data; however, they all seem to use different data formats, structures, and update frequencies.

You would have to be some weird combination of data scientist, software engineer, and clinician to sustainably aggregate and combine data from these sources in a meaningful way.

Luckily we are.

An actual image of me trying to explain to someone what I've been working on for the past year.

Restructuring RxNorm for humans

· 4 min read
Joey LeGrand
Founder, CodeRx

RxNorm is an invaluable resource created and maintained by the National Library of Medicine (NLM). It is a standard nomenclature for drug products that provides semantic interoperability across many different drug vocabularies and fuels medication-related clinical decision support. However, the structure of the data within RxNorm is pretty abstract and really difficult to understand without spending several hours reading multiple pages of documentation on NLM's website.

For instance, take a look at the three RxNorm database tables below and tell me how you would find all of the national drug codes (NDCs) for all of the clinical drug products that contain lisinopril as an ingredient. Bet you can't figure it out without reviewing at least 4 different NLM sources of documentation.

  1. RxNorm Technical Documentation
  2. RxNorm Term Types
  3. RxNorm Relationships
  4. RxNorm Attributes

The three tables available in the RxNorm Current Prescribable Content release.
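To give a feel for the shape of the answer to the lisinopril question, here is a toy Python sketch using an in-memory SQLite database. It is a sketch under stated assumptions, not the definitive query: the tables carry only a handful of columns, every identifier and NDC is made up, and it assumes relationship rows read as "RXCUI2 RELA RXCUI1", that products reach ingredients through a component concept (`consists_of`, then `has_ingredient`), and that NDCs are stored as `ATN = 'NDC'` attributes. Check each of those against the NLM documentation before using anything like this on a real RxNorm release. The function names are hypothetical.

```python
import sqlite3


def build_toy_rxnorm() -> sqlite3.Connection:
    """Create a tiny stand-in for RXNCONSO, RXNREL, and RXNSAT (fake data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE rxnconso (rxcui TEXT, sab TEXT, tty TEXT, str TEXT);
        CREATE TABLE rxnrel (rxcui1 TEXT, rela TEXT, rxcui2 TEXT);
        CREATE TABLE rxnsat (rxcui TEXT, sab TEXT, atn TEXT, atv TEXT);

        INSERT INTO rxnconso VALUES
            ('IN1',   'RXNORM', 'IN',   'lisinopril'),
            ('SCDC1', 'RXNORM', 'SCDC', 'lisinopril 10 MG'),
            ('SCD1',  'RXNORM', 'SCD',  'lisinopril 10 MG Oral Tablet'),
            ('IN2',   'RXNORM', 'IN',   'metformin'),
            ('SCDC2', 'RXNORM', 'SCDC', 'metformin 500 MG'),
            ('SCD2',  'RXNORM', 'SCD',  'metformin 500 MG Oral Tablet');

        -- Assumed reading: rxcui2 <rela> rxcui1
        INSERT INTO rxnrel VALUES
            ('IN1',   'has_ingredient', 'SCDC1'),
            ('SCDC1', 'consists_of',    'SCD1'),
            ('IN2',   'has_ingredient', 'SCDC2'),
            ('SCDC2', 'consists_of',    'SCD2');

        INSERT INTO rxnsat VALUES
            ('SCD1', 'RXNORM', 'NDC', '00000000001'),
            ('SCD1', 'RXNORM', 'NDC', '00000000002'),
            ('SCD2', 'RXNORM', 'NDC', '00000000003');
    """)
    return conn


def find_ndcs(conn: sqlite3.Connection, ingredient_name: str) -> list:
    """Return (product name, NDC) pairs for clinical drugs with an ingredient."""
    query = """
        SELECT DISTINCT product.str, ndc.atv
        FROM rxnconso AS ingredient
        JOIN rxnrel AS r1
          ON r1.rxcui1 = ingredient.rxcui AND r1.rela = 'has_ingredient'
        JOIN rxnrel AS r2
          ON r2.rxcui1 = r1.rxcui2 AND r2.rela = 'consists_of'
        JOIN rxnconso AS product
          ON product.rxcui = r2.rxcui2
         AND product.tty = 'SCD' AND product.sab = 'RXNORM'
        JOIN rxnsat AS ndc
          ON ndc.rxcui = product.rxcui AND ndc.atn = 'NDC'
        WHERE ingredient.tty = 'IN' AND ingredient.sab = 'RXNORM'
          AND lower(ingredient.str) = lower(?)
        ORDER BY ndc.atv
    """
    return conn.execute(query, (ingredient_name,)).fetchall()
```

Even in toy form, the query needs five joins across three tables plus knowledge of term types, relationship labels, and attribute names. That is the documentation scavenger hunt described above.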