Scripts and Pipelines
GCtoMAT is a collection of utility shell scripts that harvest retention times and peaks from multiple Agilent GC-MS output files and build matrices for statistical analysis. A user manual, sample files, and several scripts (for different file formats) are provided.
extract is a simple Perl script that extracts unmapped short reads (e.g., Illumina) from a SAM file and writes them back out in FASTQ format.
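The idea behind extract can be sketched in a few lines. The following is a minimal Python illustration, not the Perl script itself: the function name `unmapped_to_fastq` and the demo records are hypothetical, and it assumes that a read counts as unmapped when bit 0x4 of the SAM FLAG field is set. When the SAM record carries no qualities (`*`), the sketch fills in the lowest Phred+33 character as a placeholder, which is also an assumption.

```python
import io

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unmapped_to_fastq(sam_lines):
    """Yield one 4-line FASTQ record per unmapped read found in SAM text."""
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
        if not flag & UNMAPPED_FLAG:
            continue  # read is mapped, so it is not reported
        if qual == "*":
            # Assumption: no qualities stored, so use a placeholder string.
            qual = "!" * len(seq)
        yield f"@{name}\n{seq}\n+\n{qual}\n"


# Hypothetical demo input: one header line, two unmapped reads, one mapped read.
DEMO_SAM = (
    "@HD\tVN:1.6\n"
    "r1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n"
    "r2\t0\tchr1\t10\t60\t4M\t*\t0\t0\tTTTT\tIIII\n"
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tGGCC\t*\n"
)

for record in unmapped_to_fastq(io.StringIO(DEMO_SAM)):
    print(record, end="")
```

For a real file, the same generator can be fed an open file handle instead of the in-memory demo string.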
Euglossa dilemma genome assembly v1.0 is accompanied by the official gene set `Edil_OGS_v1.0`, which was created using a combination of homology-based and _de novo_ gene predictions. The homology-based predictions are derived from the honey bee [OGSv3.2](http://hymenopteragenome.org/beebase/?q=download_sequences). The file `Edil_OGS_v1.0_apis_gene_homology.txt` lists the honey bee genes used as parents for the genes identified by the homology-based approach.