Run Times
=========
2016-02-25
==========
* Modify the ExpressOrtho Perl script to conform to Perl style standards.
* Fetch BAMs or FASTQs.
* Specify file accession for control and experiment instead of experiment accession.
* Add IDR for 2 replicates. Waiting on a response to figure out the details (IDR / FDR?).
* MongoDB read-only database to clone. MongoDB helper scripts to update the database. Need to create our own DB instance for saving results.
* Don't need to save output files back to the database.
* Generate report log at the end.
* Remove-duplicates option (yes / no) & duplicate handling.
* Pooled & pseudo replicates: pooled = replicate reads concatenated, pseudo = reads shuffled and split in half (BAM -> SAM for plaintext). Configurable option (see the pseudo-replicate sketch after this list).
* Sphinx documentation at the same time.
* Most conditions don't exist; they exist only for some experiments.
* Collapse Bed & Peak Collections
* Only need one of score or signal_value
* Need to derive read length from the downloaded FASTQ (see the read-length sketch after this list).
* Randomly select which experiment to use as the control for peak calling; only use possible_controls.
* Only need human / mouse
* DFBS: keep everything the same except one of condition OR cell type (ignore for now).
* Restrict to the database; add a genome collection.
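
A minimal sketch of the pooled / pseudo-replicate idea from the list above, assuming single-end reads and samtools on PATH; the function name, output naming, and splitting details are assumptions, not the pipeline's actual implementation:

```python
import random
import subprocess

def make_pseudoreplicates(pooled_bam, prefix, seed=0):
    # Hypothetical helper: shuffle the reads of a pooled BAM and split them
    # into two pseudo-replicates, going through plaintext SAM as noted above.
    # Everything is held in memory, which is fine for a sketch but not for
    # production-sized BAMs; paired-end data would need mates kept together.
    header = subprocess.run(["samtools", "view", "-H", pooled_bam],
                            capture_output=True, text=True, check=True).stdout
    reads = subprocess.run(["samtools", "view", pooled_bam],
                           capture_output=True, text=True,
                           check=True).stdout.splitlines()

    random.seed(seed)
    random.shuffle(reads)
    half = len(reads) // 2

    for idx, chunk in enumerate((reads[:half], reads[half:]), start=1):
        sam_text = header + "\n".join(chunk) + "\n"
        # Convert the shuffled plaintext SAM back to BAM.
        subprocess.run(["samtools", "view", "-b", "-o",
                        f"{prefix}.pr{idx}.bam", "-"],
                       input=sam_text, text=True, check=True)
```

Pooling itself is just the concatenation of the replicate BAMs (e.g. with samtools merge).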
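
And a small sketch for deriving the read length from a downloaded FASTQ (gzipped or not); the function name and record count are hypothetical:

```python
import gzip

def read_length(fastq_path, n_records=1000):
    # Peek at the first few records of a (possibly gzipped) FASTQ and report
    # the longest sequence line as the nominal read length.
    opener = gzip.open if fastq_path.endswith(".gz") else open
    longest = 0
    with opener(fastq_path, "rt") as handle:
        for i, line in enumerate(handle):
            if i >= n_records * 4:
                break
            if i % 4 == 1:  # line 2 of every 4-line FASTQ record is the sequence
                longest = max(longest, len(line.strip()))
    return longest
```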
2016-03-03
============
* Run on Myc / Max -- wait on correct experiments to use!
* Nothing else!
2016-03-10
==============
* Don't use the assembly from the sample's record; use the GRCh assembly.
* bowtie2 standard error contains quality measures!
* For now, focus on bowtie2.
* MXI1
* Add aggregation pipelines to meta to extract the relevant transcription factors (see the aggregation sketch after this list).
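
A minimal sketch of such an aggregation pipeline with pymongo; the database, collection, and field names ("chipseq", "experiments", "assay_term_name", "target.investigated_as", "target.label") are assumptions and need to match the actual metadata schema:

```python
from pymongo import MongoClient

db = MongoClient("localhost", 27017)["chipseq"]  # assumed database name

# Group ChIP-seq experiments by transcription-factor target and count them,
# so the relevant TFs can be pulled out of the metadata in one query.
pipeline = [
    {"$match": {"assay_term_name": "ChIP-seq",
                "target.investigated_as": "transcription factor"}},
    {"$group": {"_id": "$target.label", "n_experiments": {"$sum": 1}}},
    {"$sort": {"n_experiments": -1}},
]

for row in db.experiments.aggregate(pipeline):  # "experiments" is assumed
    print(row["_id"], row["n_experiments"])
```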
2016-03-17
============
* Potentially do a comparison paper of NoSQL -> SQL for bioinformatics: run a similar setup/design/analysis in SQL and see how it compares.
* 3 Papers (Biological paper, Pegasus paper, MongoDB paper)!?
2016-04-05
============
* Created dummy sample entries to run pipeline faster!
* Look into better downloading tools to increase speed
* Organize files by organism -> cell_type -> tf/hm -> (biorep1, biorep2, idr); see the layout sketch after this list.
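
A small sketch of that layout as a path helper; the base directory name and the function name are hypothetical:

```python
import os

def result_dir(base, organism, cell_type, target, kind):
    # Build the organism -> cell_type -> tf/hm -> (biorep1 | biorep2 | idr)
    # hierarchy on disk and return the leaf directory.
    path = os.path.join(base, organism, cell_type, target, kind)
    os.makedirs(path, exist_ok=True)
    return path

# e.g. results/human/K562/CHD2/idr
print(result_dir("results", "human", "K562", "CHD2", "idr"))
```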
2016-04-12
=============
* Add gem
* For DFBS: macs2, csaw, jmosaic
* Human: CHD2 for K562 and H1-hESC cells.
2016-09-22
===============
* macs2 paired-end reads need to be run differently; check accordingly (see the macs2 sketch after this list).
* Convert/sort/IDR for individual samples (see the IDR sketch after this list).
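
A sketch of the paired-end distinction for macs2: with paired-end BAMs the input format should be BAMPE so macs2 uses real fragment sizes instead of modelling them. The wrapper function and its defaults are assumptions; the macs2 callpeak flags themselves are the standard ones:

```python
import subprocess

def call_peaks(treatment_bam, control_bam, name, paired_end, genome="hs"):
    # Switch macs2 callpeak between single-end (BAM) and paired-end (BAMPE)
    # input so paired-end experiments are handled correctly.
    fmt = "BAMPE" if paired_end else "BAM"
    subprocess.run(["macs2", "callpeak",
                    "-t", treatment_bam,
                    "-c", control_bam,
                    "-f", fmt,
                    "-g", genome,
                    "-n", name],
                   check=True)
```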
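
And a sketch of the per-replicate IDR step, assuming the kundajelab idr command-line tool with narrowPeak inputs; the file names and the choice of ranking column are placeholders:

```python
import subprocess

def run_idr(rep1_peaks, rep2_peaks, output_file):
    # Compare two replicate peak sets with IDR, ranking peaks by signal value.
    subprocess.run(["idr",
                    "--samples", rep1_peaks, rep2_peaks,
                    "--input-file-type", "narrowPeak",
                    "--rank", "signal.value",
                    "--output-file", output_file,
                    "--plot"],
                   check=True)

run_idr("rep1_peaks.narrowPeak", "rep2_peaks.narrowPeak", "rep1_vs_rep2.idr.txt")
```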
2016-09-15
=============
End Goals:
* Merged replicates in workflow
* Don't need GUI