Samples
Generate sample snails with genomes and locations.
samples(options)
Main driver for snailz snail creation.
- options.genomes: genomes data file.
- options.grids: grids parameter file.
- options.params: path to parameter file (see params.SampleParams for fields).
- options.outfile: optional path to saved output file.
- options.surveys: survey CSV parameter file.
- options.sites: sites CSV parameter file.
Generated data is written as CSV to the specified output file, or shown on standard output if no output file is given.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
options | Namespace | see above. | required |
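A minimal sketch of how the `options` namespace might be assembled, assuming it is an `argparse.Namespace` as the Type column suggests. Every file name below is a hypothetical placeholder, and the call to `samples` is shown only as a comment because the import path is not confirmed here.

```python
from argparse import Namespace

# Hypothetical file names; substitute the files produced by earlier steps.
options = Namespace(
    genomes="genomes.json",   # genomes data file
    grids="grids.csv",        # grids parameter file
    params="samples.yml",     # parameter file (format is an assumption)
    outfile="samples.csv",    # optional output file
    surveys="surveys.csv",    # survey CSV parameter file
    sites="sites.csv",        # sites CSV parameter file
)

# With the package installed, the driver would presumably be invoked as:
#     samples(options)
```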
Source code in snailz/samples.py
_generate_samples(options, genomes, grids)
Generate snail samples.
For each previously-generated genome:
- Select a survey and a random point in that survey's area, and determine if that point is contaminated.
- Determine the range of possible snail sizes based on genotype and contamination.
- Generate a size.
- Append a record to a list that is later converted to a dataframe.
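The steps above can be sketched as a simple loop. This is an illustration under stated assumptions, not the package's code: `generate_samples`, `pick_point`, `size_limit`, `min_size`, and the record field names are all hypothetical, sizes are assumed to be drawn uniformly, and a plain list of dictionaries stands in for the dataframe.

```python
import random


def generate_samples(sequences, pick_point, size_limit, min_size=0.5, rng=random):
    """Build one sample record per genome sequence (hypothetical helper)."""
    records = []
    for i, seq in enumerate(sequences):
        # Select a survey and a point, and learn whether it is contaminated.
        survey_id, (lon, lat), contaminated = pick_point()
        # Upper bound on size depends on genotype and contamination.
        upper = size_limit(seq, contaminated)
        # Assumption: size is uniform between min_size and the upper bound.
        size = rng.uniform(min_size, upper)
        records.append(
            {
                "sample_id": i + 1,
                "survey_id": survey_id,
                "lon": lon,
                "lat": lat,
                "sequence": seq,
                "size": size,
            }
        )
    # A list of records like this can later be converted to a dataframe.
    return records


# Usage with stand-in callables for the point picker and size limit.
records = generate_samples(
    ["ACGT", "ACGA"],
    pick_point=lambda: ("S1", (-124.2, 48.8), True),
    size_limit=lambda seq, bad: 10.0 if bad and seq.endswith("A") else 5.0,
    rng=random.Random(0),
)
```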
Parameters:
Name | Type | Description | Default |
---|---|---|---|
options | Namespace | see samples(). | required |
genomes | dict | JSON representation of previously-generated genomes. | required |
grids | dict | key-to-grid dictionary whose grids are NumPy arrays. | required |
Returns:
Type | Description |
---|---|
DataFrame | Dataframe with sample ID, survey ID, longitude, latitude, sequence, and snail size. |
Source code in snailz/samples.py
_load_grids(options)
Load all grid files.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
options | Namespace | see samples(). | required |
Returns:
Type | Description |
---|---|
dict | Key-to-NumPy array map of contamination grids. |
Source code in snailz/samples.py
_random_geo(sites, surveys, grids)
Select random point from a randomly-selected sample grid.
- Select site.
- Select random grid cell.
- Determine whether that cell is contaminated.
- Use site center point and survey spacing to determine longitude and latitude of cell.
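One way these steps could fit together is sketched below. It rests on several assumptions that the text does not confirm: plain dictionaries and nested lists stand in for the dataframes and NumPy grids, the field names (`site_id`, `survey_id`, `spacing`, `lon`, `lat`) are hypothetical, a non-zero cell is treated as contaminated, and spacing is treated as degrees offset from the site center.

```python
import random


def random_geo(sites, surveys, grids, rng=random):
    """Pick a survey, then a random cell in the matching grid (hypothetical)."""
    survey = rng.choice(surveys)
    site = sites[survey["site_id"]]
    grid = grids[survey["site_id"]]

    # Select a random grid cell by row (y) and column (x).
    height, width = len(grid), len(grid[0])
    y = rng.randrange(height)
    x = rng.randrange(width)

    # Assumption: any non-zero cell value marks contamination.
    contaminated = grid[y][x] != 0

    # Assumption: cells are offset from the site center by `spacing` degrees.
    lon = site["lon"] + (x - width / 2) * survey["spacing"]
    lat = site["lat"] + (y - height / 2) * survey["spacing"]
    return survey["survey_id"], (lon, lat), contaminated


# Usage with one site, one survey, and a 2x2 grid.
sites = {"A": {"lon": -124.0, "lat": 48.0}}
surveys = [{"survey_id": "S1", "site_id": "A", "spacing": 0.01}]
grids = {"A": [[0, 3], [0, 0]]}
survey_id, (lon, lat), contaminated = random_geo(
    sites, surveys, grids, rng=random.Random(1)
)
```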
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sites | DataFrame | dataframe of site data. | required |
surveys | DataFrame | dataframe of surveys. | required |
grids | dict | key-to-grid dictionary whose grids are NumPy arrays. | required |
Returns:
Type | Description |
---|---|
tuple | Selected survey ID, (lon, lat) point, and whether point is contaminated |
Source code in snailz/samples.py
_save(options, samples)
Save results to file or show on standard output.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
options | Namespace | controlling options. | required |
samples | DataFrame | dataframe of generated samples. | required |
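The file-or-standard-output behavior might look like the sketch below, which uses the standard `csv` module on a list of dictionaries rather than a dataframe. The names `save` and `write_csv` and the choice of an empty or missing `outfile` as the signal for standard output are assumptions made for illustration.

```python
import csv
import io
import sys


def write_csv(stream, records, fieldnames):
    """Write records as CSV rows to an open text stream."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)


def save(outfile, records, fieldnames):
    """Write to outfile if one is given, otherwise to standard output."""
    if outfile:
        with open(outfile, "w", newline="") as stream:
            write_csv(stream, records, fieldnames)
    else:
        write_csv(sys.stdout, records, fieldnames)


# Usage: write one record to an in-memory buffer.
buffer = io.StringIO()
write_csv(buffer, [{"sample_id": 1, "size": 2.5}], ["sample_id", "size"])
```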
Source code in snailz/samples.py
_size_limit(options, genomes, seq, contaminated)
Calculate upper bound on snail size.
If the genome has the significant mutation in the right location and the site is contaminated, the snail may have the mutant size. Otherwise, it has the normal size.
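The rule above can be sketched as a single conditional. The dictionary keys `susceptible_loc` and `susceptible_base` and the parameters `max_normal` and `max_mutant` are hypothetical names chosen for illustration; the actual fields in the genomes JSON and parameter file may differ.

```python
def size_limit(genomes, seq, contaminated, max_normal, max_mutant):
    """Return the upper bound on size for one snail (hypothetical helper)."""
    loc = genomes["susceptible_loc"]    # hypothetical key: mutation position
    base = genomes["susceptible_base"]  # hypothetical key: mutant base
    # Mutant size applies only when both conditions hold.
    if contaminated and seq[loc] == base:
        return max_mutant
    return max_normal


# Usage: the significant mutation is "G" at index 2.
genomes = {"susceptible_loc": 2, "susceptible_base": "G"}
mutant_contaminated = size_limit(genomes, "ACGT", True, 5.0, 10.0)
mutant_clean = size_limit(genomes, "ACGT", False, 5.0, 10.0)
normal_contaminated = size_limit(genomes, "ACTT", True, 5.0, 10.0)
```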
Parameters:
Name | Type | Description | Default |
---|---|---|---|
options | Namespace | controlling options. | required |
genomes | dict | JSON containing overall information about genomes. | required |
seq | str | specific sequence of this snail. | required |
contaminated | bool | is sample location contaminated? | required |
Returns:
Type | Description |
---|---|
float | Parameter value for upper bound on normal or mutant snail size. |
Source code in snailz/samples.py