Copyright status of genetic sequences

From Wikipedia, the free encyclopedia

The copyright status of genetic sequences is a subject of debate among companies, legal scholars, and policymakers. The growth of the synthetic biology field in recent decades has sparked interest in the idea of copyright as an alternative to patent protection for artificially created DNA sequences. Arguments for extending copyright to engineered DNA sequences rest on their biological role as an information storage medium[1] and analogies to computer programs, which in many countries are copyrightable as "literary works".[2][3]

Background

DNA as an information medium

The process by which the information in DNA is transformed to make gene products (proteins and non-coding RNAs).

The idea that artificial DNA sequences may be copyrightable rests on the observation that DNA embodies genetic information that dictates the structure and behavior of living organisms. This information is stored in sequences of nucleotides, which are normally represented as the letters A (adenine), C (cytosine), G (guanine), and T (thymine). In the conventional process of gene expression, DNA is first transcribed into messenger RNA and then translated into sequences of amino acids called proteins.[4] Some genes do not code for proteins and are instead transcribed into non-coding RNAs (ncRNAs), which include functional RNA molecules such as transfer RNAs (tRNAs) and ribosomal RNAs (rRNAs) that play various roles in gene expression.[5] Other regions of non-coding DNA help regulate gene expression, like promoters, enhancers, silencers, and insulators; still others are structural elements of chromosomes like telomeres and centromeres.[6]
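The two steps of gene expression described above can be sketched in a few lines of Python. This is a toy illustration rather than a bioinformatics tool: the function names `transcribe` and `translate` are hypothetical, the codon table is a small hand-picked subset of the standard genetic code, and the input is assumed to be the coding strand with no introns or regulatory regions.

```python
# Partial codon table (mRNA codon -> one-letter amino acid code).
# Only the five codons used in the example are listed; "*" marks a stop codon.
CODON_TABLE = {
    "AUG": "M",  # methionine (start codon)
    "GCC": "A",  # alanine
    "UGG": "W",  # tryptophan
    "AAA": "K",  # lysine
    "UAA": "*",  # stop
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTGGAAATAA")
peptide = translate(mrna)
print(mrna)     # AUGGCCUGGAAAUAA
print(peptide)  # MAWK
```

A codon missing from the toy table raises a `KeyError`; a complete table would list all 64 codons.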

Several authors have made analogies between DNA and computer programs. Nina Srejovic describes cells as "protein-producing machines" with multiple possible chemical inputs and outputs. The DNA functions as the cell's "operating system", directing it to produce different proteins ("outputs") depending on the cell's configuration of regulatory proteins and enzymes ("inputs").[1] In The Birth of the Mind, Gary Marcus describes an organism's genome as a computer program that drives the process of embryogenesis, in which each gene functions like an individual line of code. Medical scientist[7] Pradeep Mutalik writes that various genes, promoters, regulators, and inhibitors have functionality akin to control flow constructs like if–then statements, loops, and subroutine calls.[8]

DNA has also been used as a general-purpose information storage medium for a wide variety of digital data, such as books, photographs, films, and music.[1] In 2012, researchers synthesized 54,898 pieces of DNA containing binary data representing the HTML code for a book written by George Church, together with images and JavaScript code. Each binary zero was encoded as an A or a C nucleotide, and each one as a G or a T. The DNA molecules could be decoded using DNA sequencing techniques.[9] More recent applications have used a rotating ternary (base 3) encoding to avoid sequencing issues arising from repeated nucleotides (homopolymers).[10][11]
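The two encodings mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the published methods. The article says only that zeros map to A or C and ones to G or T; the rule used here for choosing between the two options (pick whichever differs from the previous base) is an assumption. Likewise, the ternary rotation below uses a simple alphabetical ordering rather than the mapping table from the cited paper, and all function names are hypothetical.

```python
ZERO_BASES = "AC"  # a 0 bit may be written as A or C
ONE_BASES = "GT"   # a 1 bit may be written as G or T
BASES = "ACGT"


def encode_bits(bits: str) -> str:
    """Map each bit to a base, avoiding a repeat of the previous base (assumed rule)."""
    dna = []
    for bit in bits:
        if bit not in "01":
            raise ValueError(f"not a bit: {bit!r}")
        options = ZERO_BASES if bit == "0" else ONE_BASES
        base = options[1] if dna and dna[-1] == options[0] else options[0]
        dna.append(base)
    return "".join(dna)


def decode_bits(dna: str) -> str:
    """A or C decodes to 0; G or T decodes to 1."""
    return "".join("0" if base in ZERO_BASES else "1" for base in dna)


def text_to_dna(text: str) -> str:
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    return encode_bits(bits)


def dna_to_text(dna: str) -> str:
    bits = decode_bits(dna)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


def encode_trits(trits: list[int], start: str = "A") -> str:
    """Rotating base-3 code: each trit selects one of the three bases
    that differ from the previous base, so no base ever repeats."""
    prev, dna = start, []
    for trit in trits:
        candidates = [b for b in BASES if b != prev]
        prev = candidates[trit]
        dna.append(prev)
    return "".join(dna)


def decode_trits(dna: str, start: str = "A") -> list[int]:
    prev, trits = start, []
    for base in dna:
        candidates = [b for b in BASES if b != prev]
        trits.append(candidates.index(base))
        prev = base
    return trits


print(encode_bits("0011"))                    # ACGT
print(dna_to_text(text_to_dna("Hi")))         # Hi
print(encode_trits([0, 0, 2, 1]))             # CATC
```

In both sketches, consecutive bases always differ, which is the property the rotating encoding is meant to guarantee. The binary scheme stores one bit per base, while the ternary scheme stores one trit per base.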

Since the 1980s, academics have proposed that synthetic DNA may be eligible for copyright protection for reasons similar to those that apply to computer software.[4][2] In the United States and several other jurisdictions, computer programs are treated as literary works and thus protected under copyright law. The legal definition of "literary work" extends beyond the common-sense meaning of the term (literature such as poetry and prose) and includes any work "expressed in words, numbers, or other verbal or numerical symbols or indicia, regardless of the nature of the material objects, such as books... tapes, disks, or cards, in which they are embodied" (17 U.S.C. § 101). In Apple Computer, Inc. v. Franklin Computer Corp. (1983), the Court of Appeals for the Third Circuit held that a program is protected as a literary work whether expressed as source code or object code, and irrespective of the medium (such as read-only memory) in which it is embodied.[12]

At the same time, copyright extends only to the expressive authorship in a computer program, not the methods or processes embodied in it. Section 102(b) was included in the Copyright Act of 1976 to codify the idea–expression distinction in part out of concern that this boundary might be blurred in the software context:

In no case does copyright protection for an original work of authorship extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery, regardless of the form in which it is described, explained, illustrated, or embodied in such work.

Section 102(b) addresses both high-level abstractions (ideas, concepts, and principles) and "more complex, detailed, and functional information innovations" (procedures, processes, systems, methods of operation, and discoveries). According to Samuelson (2007), the latter categories are excluded from copyright protection to prevent copyright from being used to circumvent the more stringent requirements for patent protection.[13] Furthermore, the merger doctrine limits copyright protection for computer programs because they are primarily functional works. Under this doctrine, when there are only a few reasonable ways to express an idea, the expression is said to "merge" with the idea, and neither is copyrightable.[14]

Copyrightability of DNA

In 1982, law professor Irving Kayton argued that engineered DNA compounds are copyrightable as literary works. Under the Copyright Act of 1976, computer programs are literary works because they are expressed in "verbal or numerical symbols or indicia", such as the domains of a magnetic storage device. For "genetically engineered works", the indicia are the nucleotides (A, C, G, and T) that make up the DNA molecules in which they are embodied. Alternatively, genetically engineered works may comprise a sui generis category of copyrightable subject matter; according to Kayton, the Copyright Act states that copyrightable works of authorship include the eight categories enumerated in 17 U.S.C. § 102(a) but are not necessarily limited to them.[note 1] Kayton argued that cells or cell cultures are also "tangible media of expression" in which genetic works can be fixed.[4]

Kayton posited that copyright could even extend to recombinant DNA molecules—those that are built from existing, often naturally occurring DNA fragments from different sources. Such recombinant DNA sequences could be considered compilations—works of authorship resulting from the creative selection, coordination, and arrangement of pre-existing materials, whether or not the underlying materials are protected by copyright. For example, a plasmid containing genes from two different bacteria and an E. coli bacterium with the plasmid added would both be copyrightable compilations.[4]

As of 2016, genetic sequences were not recognized as copyrightable subject matter by any jurisdiction.[3] The United States Copyright Office's position is that "DNA sequences and other genetic, biological, or chemical substances or compounds, regardless of whether they are man-made or produced by nature," are ideas, systems, or discoveries rather than copyrightable works of authorship.[15]: 23 

Nina Srejovic has argued that DNA molecules themselves are not "works" that can be subject to copyright protection, but rather an information storage medium in which copies of genetic sequences and works of authorship can be "fixed". Copyright protection for a work extends to its information content and does not depend on the type of material object in which it is embodied. Thus, a work such as a motion picture is copyrightable whether fixed in a DVD, film, video tape, or a strand of DNA. Srejovic disagrees with the Copyright Office's position, contending that it would categorically "disqualify works of authorship from copyright protection simply because they are fixed as a DNA compound."[1]

"Prancer" test case

In 2012, the artificial gene synthesis company DNA 2.0 attempted to register a copyright for an engineered DNA sequence called "Prancer", in collaboration with law professors Christopher Holman and Andrew Torrance; the U.S. Copyright Office refused registration. In an appeal, the team contended that human-designed DNA sequences, like computer programs, qualify as "literary works" under existing copyright law. Section 102(a) of the Copyright Act of 1976 lists eight categories of copyrightable subject matter, such as literary, musical, and audiovisual works; referencing the legislative history of the Copyright Act, the team argued that these categories were intended to be flexible and non-exhaustive.[3]

Naturally occurring DNA sequences

Naturally occurring DNA sequences are not protectable under United States copyright law because they are "discoveries" rather than works of human authorship.[16] Holman et al. (2016) state that this is a point of consensus among proponents and detractors of copyright protection for engineered DNA.[3] However, proponents have argued that a creative selection, coordination, and arrangement of naturally occurring DNA sequences could constitute an original, copyrightable work.[4][17]

By contrast, UK copyright law considers works that are created through an author's skill, judgment, and effort, and not copied from a pre-existing work, to be original works of authorship. Thus, San Martin and Hurdle (2017) argue that a textual representation of a DNA molecule as a sequence of letters (A, C, G, and T) could be copyrightable under UK law, analogous to translating a pre-existing work from a foreign language into a form that the reader can understand. However, this would not stop another person from independently sequencing the same DNA compound and securing a copyright for their own representation of its genetic code, even if it is identical to another's pre-existing representation of the same genetic sequence.[18]

Policy arguments

The public policy benefits and costs of extending copyright protection to sequences of DNA have also been debated. Proponents argue that copyright is a better means of protecting engineered DNA sequences than patents for several reasons.

First, copyright protection is easier for creators of synthetic DNA sequences to obtain and enforce against commercial actors copying or using them without authorization. Unlike patents, which must be granted by the patent office and can take years to obtain, copyright is granted automatically when a work is "fixed in a tangible medium of expression." In cases of "piracy" of genetically engineered products, Christopher Holman argues that proving copyright infringement would be more straightforward than proving patent infringement if the DNA were protected by copyright. Furthermore, copyright law unlocks a wider variety of remedies for infringement, including criminal penalties and blocking importation of the infringing copies.

Second, because copyright protection is thinner, Holman argues that copyrights would still provide "meaningful protection" against pirates without frustrating innovation in synthetic biology as much as patents, especially open source innovation. Due to the idea–expression divide, copyright would only cover a specific DNA sequence, not its functionality. Thus, another researcher could create an alternative DNA sequence with the same functionality as an existing DNA sequence without infringing its copyright. Likewise, two researchers could create the same genetic sequence independently without violating each other's copyrights, as independent creation is an absolute defense under copyright law but not under patent law.[17]
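The claim that a different sequence can have the same functionality has a simple biological basis: the genetic code is redundant, so several codons specify the same amino acid. The sketch below shows two distinct DNA sequences that encode the same short peptide. The sequences, the function name `translate_dna`, and the partial codon table are hypothetical illustrations. The example shows only that the encoded amino acid sequence is identical; it does not establish that the two sequences would behave identically in a cell, nor how a court would treat them.

```python
# Partial DNA codon table (coding strand -> one-letter amino acid code).
# Several codons map to the same amino acid; "*" marks a stop codon.
CODON_TABLE = {
    "ATG": "M",              # methionine (start codon)
    "CTG": "L", "TTA": "L",  # leucine
    "TCT": "S", "AGC": "S",  # serine
    "CGT": "R", "AGA": "R",  # arginine
    "TAA": "*", "TGA": "*",  # stop
}


def translate_dna(dna: str) -> str:
    """Translate a coding-strand DNA sequence until the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


original = "ATGCTGTCTCGTTAA"     # hypothetical engineered sequence
alternative = "ATGTTAAGCAGATGA"  # synonymous codons swapped in

# Count the positions at which the two equal-length sequences differ.
differing = sum(a != b for a, b in zip(original, alternative))

print(translate_dna(original))     # MLSR
print(translate_dna(alternative))  # MLSR
print(original != alternative, differing > 0)
```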

On the other hand, law professor Dan Burk contends that the thinness of copyright makes it an impractical legal tool for protecting genetic sequences. Burk argues that courts have had difficulty separating the functionality of computer programs from their potentially copyrightable expression, and would face similar challenges in ascertaining the copyrightability of engineered DNA.[19]

Notes

  1. ^ When Kayton's paper was published in 1982, the Copyright Act enumerated seven categories of copyrightable works. The eighth category, architectural works, was introduced by the Architectural Works Copyright Protection Act of 1990.

References

  1. Srejovic, Nina (2022). "Copyright Protection for Works in the Language of Life". Washington Law Review. 97 (2). Retrieved 2025-06-22.
  2. Burk, Dan L. (April 2018). "DNA Copyright in the Administrative State" (PDF). UC Davis Law Review. 51 (4): 1297–1349. Retrieved 2025-06-24.
  3. Holman, Christopher M.; Gustafsson, Claes; Torrance, Andrew W. (2016). "Are Engineered Genetic Sequences Copyrightable?: The U.S. Copyright Office Addresses a Matter of First Impression". Biotechnology Law Report. 35 (3). Retrieved 2025-06-22.
  4. Kayton, Irving (January 1982). "Copyright in Living Genetically Engineered Works" (PDF). George Washington Law Review. 50 (2). Retrieved 2025-06-22.
  5. Mattick, John S.; Makunin, Igor V. (2006-04-15). "Non-coding RNA". Human Molecular Genetics. 15 (suppl_1): R17–R29. doi:10.1093/hmg/ddl046. PMID 16651366.
  6. "What is noncoding DNA?". MedlinePlus. United States National Library of Medicine. Retrieved 2025-06-25.
  7. "Pradeep Mutalik, MD". Yale School of Medicine. Retrieved 2025-06-24.
  8. Mutalik, Pradeep (2018-04-05). "How the DNA Computer Program Makes You and Me". Quanta Magazine. Retrieved 2025-06-24.
  9. Rojahn, Susan Young (2012-08-16). "An Entire Book Written in DNA". MIT Technology Review. Retrieved 2025-06-24.
  10. Goldman N, Bertone P, Chen S, Dessimoz C, LeProust EM, Sipos B, Birney E (February 2013). "Towards practical, high-capacity, low-maintenance information storage in synthesized DNA". Nature. 494 (7435): 77–80. Bibcode:2013Natur.494...77G. doi:10.1038/nature11875. PMC 3672958. PMID 23354052.
  11. Lee HH, Kalhor R, Goela N, Bolot J, Church GM (June 2019). "Terminator-free template-independent enzymatic DNA synthesis for digital information storage". Nature Communications. 10 (1): 2383. Bibcode:2019NatCo..10.2383L. doi:10.1038/s41467-019-10258-1. PMC 6546792. PMID 31160595.
  12. Apple Computer, Inc. v. Franklin Computer Corporation, 714 F.2d 1240 (3d Cir. 1983).
  13. Samuelson, Pamela (2007). "Why Copyright Law Excludes Systems and Processes from the Scope of its Protection" (PDF). Texas Law Review. 85 (1). Retrieved 2025-06-22.
  14. This article incorporates public domain material from Hickey, Kevin J. (2021-05-10). Google v. Oracle: Supreme Court Rules for Google in Landmark Software Copyright Case. Congressional Research Service. Retrieved 2025-06-22.
  15. This article incorporates public domain material from this U.S. government document: "Chapter 300 – Copyrightable Authorship: What Can Be Registered" (PDF). Compendium of U.S. Copyright Office Practices. United States Copyright Office. 2021-01-28. Retrieved 2025-06-22.
  16. Kasunic, Robert J. (2014-02-11). Copyright Office letter affirming refusal to register the "Prancer DNA Sequence". United States Copyright Office – via Wikisource.
  17. Holman, Christopher M. (2017-07-19). "Copyright for Engineered DNA (Part 2)". GQ Life Sciences. Retrieved 2025-06-22.
  18. San Martin, Beatriz; Hurdle, Heidi (2017). "An alternative to patents: can DNA be protected by copyright and design right law?". Cell & Gene Therapy Insights. 3 (8): 639–649. doi:10.18609/cgti.2017.062. Retrieved 2025-06-22.
  19. Neilson, Susie (2016-06-14). "Copyrighting DNA Is a Bad Idea". Retrieved 2025-06-22.