Why Cannabis Genetics Doesn't Need Reference Standards
Chad Ternes
A common objection to genomic fingerprinting in cannabis is that the field lacks validated reference standards: agreed-upon genetic benchmarks against which unknown samples can be compared. It sounds like a reasonable scientific concern. But it reflects a misunderstanding of how population genetics and genomics actually work, and it overstates what reference standards would even give us in a crop with cannabis's history.
People say we need references and standards for cannabis genetics. Yes, that would be great, but I think we're too far past that point for most strains. That is especially true if you want OG-type strains included.
There's a good chance we will never have those types of references. And there's a reason for that.
Cannabis is still illegal at the federal level in the United States. Historically, most cannabis operations were clandestine out of necessity. During that time, breeders weren't thinking about preserving strains for use as genetic reference standards. They were thinking about staying out of prison, all while producing the flower that defined generations of cannabis culture. Archival rigor wasn't the priority. Survival was.
But here's what the "we need standards" argument misses: population genetics and genomics are specifically designed to characterize genetic relationships without requiring a fixed reference point.
When we sequence a set of cannabis samples and compare their SNP profiles, we aren't measuring each sample against a gold standard and asking "how close is this to the real thing?" We're asking a different and more answerable question: how do these samples relate to each other? That's a population-level inference, and it doesn't require a reference to be valid. The same mathematical and statistical frameworks used to reconstruct genetic and evolutionary relationships in wild plant populations — where no curated reference exists and maybe never will — apply directly to cannabis cultivar identity analysis. Phylogenetic reconstruction, principal component analysis, admixture modeling, and pairwise genomic distance calculations all operate on the relationships within the data, not on the distance from an external standard.
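To make the "relationships within the data" point concrete, here is a minimal Python sketch of two of the analyses named above: pairwise genomic distance and principal component analysis. It is an illustration under stated assumptions, not a production pipeline. The sample names and genotype values are made up, the function names are hypothetical, and genotypes are assumed to be coded as alternate-allele counts (0, 1, 2) with no missing data. Real workflows add steps such as quality filtering and missing-data handling.

```python
import numpy as np


def pairwise_allele_distance(genotypes: np.ndarray) -> np.ndarray:
    """Mean per-SNP allele-count difference between every pair of samples.

    genotypes: (n_samples, n_snps) array of alternate-allele counts (0, 1, 2).
    Returns an (n_samples, n_samples) symmetric matrix scaled to [0, 1].
    """
    g = genotypes.astype(float)
    # Broadcast to compare every sample against every other sample.
    diff = np.abs(g[:, None, :] - g[None, :, :])
    return diff.mean(axis=2) / 2.0


def pca_coordinates(genotypes: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project samples onto the top principal components of the SNP matrix."""
    g = genotypes.astype(float)
    centered = g - g.mean(axis=0)  # center each SNP on its mean in this dataset
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Made-up toy data: four hypothetical samples, six SNPs.
samples = ["A1", "A2", "B1", "B2"]
genotypes = np.array([
    [0, 0, 2, 2, 1, 0],
    [0, 0, 2, 2, 1, 0],
    [2, 2, 0, 0, 1, 2],
    [2, 1, 0, 0, 1, 2],
])

dist = pairwise_allele_distance(genotypes)
coords = pca_coordinates(genotypes)

print(np.round(dist, 3))
for name, (pc1, pc2) in zip(samples, coords):
    print(f"{name}: PC1={pc1:.2f}, PC2={pc2:.2f}")
```

In this toy matrix A1 and A2 have identical genotypes, so their distance is 0, while B1 and B2 differ at a single SNP and sit far from the A samples. Neither function takes a reference sample as input: every number is computed from the samples relative to one another.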
And even if reference standards did exist, there's a deeper problem: who decides which one is correct?
Cannabis strain identity has been transmitted informally for decades through clone trades, seed swaps, and word of mouth, with no chain of custody and no central registry. The "original" Sour Diesel or OG Kush exists in dozens of claimed lineages, held by different people, with different stories. Any reference standard built on that foundation would itself be based on anecdotal evidence at best. There would be legitimate disagreement about which reference was authoritative, and that disagreement would undermine the standard's legal and scientific utility almost immediately.
The more honest and scientifically defensible position is this: we may never have true reference standards for most cannabis cultivars. But depending on the context and application, we don't necessarily need them. What we need is a rigorous, reproducible methodology for characterizing genetic identity and relationships from the samples we actually have. Population genomics already provides that.
The absence of a reference isn't a gap in the science. It's a result of the biological and legal history of this crop, and the analytical framework exists to work around it.