2017-05-02
- Finish adding the peak calling tools.
- Start work on the Pegasus paper.
- 3 papers: dual computational (Pegasus pipeline) / biological papers, plus the ExpressOrtho resubmission.
2017-04-13
- PePr, hiddenDomains, & MUSIC tests are working; need to get them implemented.
- More documentation! Especially getting Read the Docs set up.
- XSEDE stuff too! Very close.
2017-03-31
- Zerone
- PePr
- hiddenDomains
- MUSIC
- Documentation
- Create skeleton of paper
2017-03-15
- Need to decide how to handle SPP broad peaks.
- Further work to get MUSIC, PePr, CHILLIN, & Zerone running.
- Download the updated ENCODE JSON dump.
2017-03-09
- Download the updated ENCODE JSON dump and update the database.
- Get broad/narrow peaks & IDR in by next week.
- Bridges
- Clean up database tables.
2017-02-23
- Work toward getting Bridges up & running by the end of next week.
- Decide on full research allocation for Bridges.
- Get new peak calling tools.
2017-02-09
- No NIH :(
- ExpressOrtho paper: review & resubmit.
- Mini tool comparison for ~10 tools, talk to developers
- March 13th: Great Lakes Conference abstract due.
- Dual paper submission; have Pegasus ready for submission.
2017-01-26
- Great Lakes Conference in Chicago May 15-17th
- Review paper by Monday; the submission deadline is next Wednesday.
- Generate new swooshy flow chart for the workflow
- Think about a scoring algorithm for the different peak calling tools.
- Add IDR granularity.
- Keep using random pairing for control / signal.
- Adjust signal input to a list.
- Validate IDR specification / narrow & broad peak specification.
- Generate the read distribution from the BAM, otherwise use the default (see the sketch after this list).
- Create a default CCAT config file.
- Remove the PeakRanger gene annotation file.
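A minimal sketch of the read-distribution fallback above, assuming pysam is available; the path, the `max_reads` cap, and the default length are placeholders rather than values from the pipeline:

```python
# Sketch: derive a read-length distribution from a BAM, falling back to a
# default when the file can't be read or yields nothing usable.
from collections import Counter

import pysam

DEFAULT_READ_LENGTH = 50  # placeholder default, not the pipeline's actual value

def read_length_distribution(bam_path, max_reads=100000):
    """Count read lengths over the first max_reads mapped reads."""
    lengths = Counter()
    try:
        with pysam.AlignmentFile(bam_path, "rb") as bam:
            for i, read in enumerate(bam):
                if i >= max_reads:
                    break
                if not read.is_unmapped:
                    lengths[read.query_length] += 1
    except (OSError, ValueError):
        return Counter({DEFAULT_READ_LENGTH: 1})  # unreadable file: use default
    return lengths or Counter({DEFAULT_READ_LENGTH: 1})
```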
2017-01-13
- Modify the ExpressOrtho Perl script to conform to Perl style standards.
- Fetch BAMs or FASTQs.
- Specify file accession for control and experiment instead of experiment accession.
- Add IDR for 2 replicates; waiting on a response to figure out details. (IDR / FDR?)
- Read-only MongoDB database to clone, with MongoDB helper scripts to update it. Need to create our own DB instance for saving results.
- Don't need to save output files back to the database.
- Generate report log at the end.
- Remove Duplicates yes / no & duplicates.
- Pooled & pseudo replicates: pooled = concatenated, pseudo = shuffled (BAM -> SAM for plaintext). Configurable option; see the sketch after this list.
- Sphinx documentation at the same time.
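A minimal sketch of the pooled / pseudo replicate options above, operating on plaintext SAM read lines (assuming the BAM -> SAM conversion has already happened); the function names and fixed seed are illustrative, not from the pipeline:

```python
import random

def pooled_replicate(rep1_reads, rep2_reads):
    """Pooled replicate: simply concatenate the reads of both replicates."""
    return rep1_reads + rep2_reads

def pseudo_replicates(rep1_reads, rep2_reads, seed=0):
    """Pseudo replicates: pool, shuffle, then split the pool in half."""
    pooled = rep1_reads + rep2_reads
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    rng.shuffle(pooled)
    half = len(pooled) // 2
    return pooled[:half], pooled[half:]
```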
2016-12-13
- Organize files by organism -> cell_type -> tf/hm -> (biorep1, biorep2, idr); see the sketch below.
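A minimal sketch of that layout with pathlib; the base directory and the example names (human / K562 / CHD2, taken from the notes below) are illustrative:

```python
from pathlib import Path

def result_dir(base, organism, cell_type, target):
    """Directory for one organism -> cell_type -> tf/hm combination."""
    return Path(base) / organism / cell_type / target

base = "results"  # placeholder base directory
for leaf in ("biorep1", "biorep2", "idr"):
    (result_dir(base, "human", "K562", "CHD2") / leaf).mkdir(
        parents=True, exist_ok=True
    )
```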
2016-12-07
- Human: CHD2 for K562 / H1-hESC cells.
2016-09-22
- MACS2 paired-end reads need to be run differently; check accordingly (see the sketch after this list).
- Convert/sort/IDR for individual samples.
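A minimal sketch of the paired-end handling: MACS2's `-f BAMPE` mode uses the real fragment spans instead of modeling fragment size, so the format flag has to be chosen per BAM. Assumes pysam and `macs2` on PATH; paths and names are placeholders:

```python
import subprocess

import pysam

def bam_is_paired(bam_path):
    """Peek at the first read to guess whether the BAM is paired-end."""
    with pysam.AlignmentFile(bam_path, "rb") as bam:
        for read in bam:
            return read.is_paired
    return False

def run_macs2(treatment, control, name, genome="hs"):
    """Call peaks, switching format between single- and paired-end input."""
    fmt = "BAMPE" if bam_is_paired(treatment) else "BAM"
    subprocess.run(
        ["macs2", "callpeak", "-t", treatment, "-c", control,
         "-f", fmt, "-g", genome, "-n", name],
        check=True,
    )
```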
2016-09-15
- Grant application in April, hear back in May
- Mirrored papers: ChIPathlon / biological.
- Next steps: re-run human / mouse with individual samples; run IDR on the output.
- CH12.lx / GM12878 if time permits. End Goals:
- IDR in workflow
- Pooled/Pseudo replicates in workflow
- Merged replicates in workflow
- Don't need GUI
2016-04-12
- Add GEM.
- For DFBS: MACS2, csaw, jMOSAiCS.
2016-04-05
- Created dummy sample entries to run pipeline faster!
- Look into better downloading tools to increase speed
2016-03-17
- Potentially do a comparison paper of NoSQL vs. SQL for bioinformatics: run a similar setup/design/analysis in SQL and see how it compares.
- 3 Papers (Biological paper, Pegasus paper, MongoDB paper)!?
2016-03-10
- Don't use the assembly from the samples record; use the GRCh assembly.
- bowtie2 standard error contains quality measures!
- For now, focus on bowtie2.
- MXI1
- Add aggregation pipelines to the metadata to extract relevant transcription factors (see the sketch below).
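A minimal sketch of such an aggregation with pymongo; the database / collection names and the field layout (`target.label`) are assumptions about the ENCODE dump, not the confirmed schema:

```python
from pymongo import MongoClient

client = MongoClient("localhost", 27017)
meta = client["chipathlon"]["experiments"]  # hypothetical db / collection names

pipeline = [
    {"$match": {"target.label": {"$exists": True}}},      # keep records with a target
    {"$group": {"_id": "$target.label", "count": {"$sum": 1}}},  # one row per TF
    {"$sort": {"count": -1}},
]
for row in meta.aggregate(pipeline):
    print(row["_id"], row["count"])
```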
2016-03-03
- Run on Myc / Max -- wait on correct experiments to use!
- Nothing else!
2016-02-25
- Most conditions don't exist; they exist only for some.
- Collapse Bed & Peak Collections
- Only need one of score or signal_value
- Need to derive read length from the downloaded FASTQ (see the sketch after this list).
- Randomly select an experiment-to-control pairing for peak calling; only use possible_controls.
- Only need human / mouse.
- DFBS: keep everything the same except one of condition OR cell type (ignore for now).
- Restrict to the database; add a genome collection.
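A minimal sketch of deriving the read length from the first record of a (possibly gzipped) FASTQ; the path handling is illustrative:

```python
import gzip

def fastq_read_length(path):
    """Length of the sequence line in the first FASTQ record."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as fq:
        fq.readline()                 # @header line
        seq = fq.readline().strip()   # sequence line
    return len(seq)
```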