Table 2 One possible set of DNA sequence data compression factors for the various experimental classes

From: The future of DNA sequence archiving

| Class | Description | Rate for Physically Unique Samples | Rate for Physically Archived/Archivable Samples |
|---|---|---|---|
| 1 | Historical sampling of environment or time specific elements | 1.0 | 1.0 |
| 2 | Very rare objects | 1.0 | 1.0 |
| 3 | Longitudinal studies which could in theory be rerun in the future but have a > 10 year horizon to recreate | 1.0 | 2.0 |
| 4 | Samples acquired from patients or animals with a high individual acquisition cost, but a conceptually continuous generation | 1.0 | 10.0 |
| 5 | A complex experiment with > 6 month resource development | 10.0 | 100.0 |
| 6 | A routine experiment with < 6 month resource development | 20.0 | 200.0 |
| 7 | Verification experiment as a component in an overall flow | 1000.0 | ∞ (Infinite compression of data indicates no data archiving; it may, however, be useful simply to record that the experiment was carried out.) |

  1. Compression is higher for data that are easy or inexpensive to reproduce, and lower for data derived from unique or irreproducible samples.
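
As a rough illustration of how these factors might be applied, the sketch below looks up the factor for a class and sample type and divides a raw data size by it. The division rule (archived size = raw size / compression factor) is an assumption about how the factors are meant to be read, not something the table states, and the names `COMPRESSION_FACTORS` and `archived_size` are hypothetical.

```python
import math

# Compression factors transcribed from Table 2, keyed by class number.
# Each value is (physically unique samples, physically archived/archivable samples).
COMPRESSION_FACTORS = {
    1: (1.0, 1.0),
    2: (1.0, 1.0),
    3: (1.0, 2.0),
    4: (1.0, 10.0),
    5: (10.0, 100.0),
    6: (20.0, 200.0),
    7: (1000.0, math.inf),  # infinite factor: no sequence data is archived
}


def archived_size(raw_size, experiment_class, sample_archived):
    """Estimate the archived data size for one experiment.

    Assumes archived size = raw size / compression factor (an
    interpretation of the table, not a rule taken from it).
    raw_size may be in any unit; the result is in the same unit.
    """
    unique_rate, archived_rate = COMPRESSION_FACTORS[experiment_class]
    factor = archived_rate if sample_archived else unique_rate
    # Dividing a finite size by math.inf yields 0.0, i.e. nothing is stored.
    return raw_size / factor
```

Under that assumption, `archived_size(100.0, 6, True)` would keep 0.5 units of a routine experiment on an archivable sample, while the same call for class 1 would keep all 100.0 units.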