A US-based contract research organization (CRO) handling approximately 200 sequencing samples per day sought to enhance its data processing capabilities. The goal was to reduce turnaround time and operational costs while minimizing manual intervention.
Challenge
The existing workflow relied on manual data handling, which slowed down processing and increased labor costs. With the need for rapid results in a competitive market, the CRO required an automated solution for data analysis and pipeline execution.
Approach
The project introduced automation at multiple stages:
- Automatic sequencing data upload to a monitored S3 bucket for real-time processing.
- Data analysis using Nextflow on AWS Batch, running on Spot Instances to lower compute costs.
- Customized pipeline monitoring with real-time notifications for immediate status updates.
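
The upload-triggered step above can be sketched as an AWS Lambda function that receives the S3 event notification and submits one AWS Batch job per uploaded object. This is a minimal illustration under stated assumptions, not the CRO's actual implementation: the queue name, job definition name, pipeline name, and the `--input` pipeline parameter are all hypothetical placeholders, and the use of Spot capacity is assumed to be configured on the Batch compute environment rather than in this code.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical resource names (assumptions, not from the case study).
JOB_QUEUE = "sequencing-spot-queue"         # Batch queue backed by Spot capacity
JOB_DEFINITION = "nextflow-head-job"        # Batch job definition with Nextflow installed
PIPELINE = "example-org/example-pipeline"   # placeholder Nextflow pipeline


def extract_uploads(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    uploads = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 notifications.
        uploads.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return uploads


def build_job_request(bucket, key):
    """Build keyword arguments for the AWS Batch SubmitJob call."""
    # Batch job names must start with an alphanumeric character and may contain
    # only letters, digits, hyphens, and underscores (max 128 characters).
    job_name = ("seq-" + re.sub(r"[^A-Za-z0-9_-]", "-", key))[:128]
    return {
        "jobName": job_name,
        "jobQueue": JOB_QUEUE,
        "jobDefinition": JOB_DEFINITION,
        "containerOverrides": {
            # "--input" is an assumed pipeline parameter, not a Nextflow option.
            "command": ["nextflow", "run", PIPELINE, "--input", f"s3://{bucket}/{key}"],
        },
    }


def handler(event, context):
    """Lambda entry point: submit one Batch job per uploaded object."""
    import boto3  # imported lazily so the helpers above work without AWS access

    batch = boto3.client("batch")
    job_ids = []
    for bucket, key in extract_uploads(event):
        response = batch.submit_job(**build_job_request(bucket, key))
        job_ids.append(response["jobId"])
    return job_ids
```

Keeping the event parsing and request building separate from the `boto3` call means both can be checked locally with a sample event before anything is deployed.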

Results
- Turnaround time reduced by up to 12 hours.
- Data processing costs reduced by a factor of three.
- Fully automated data processing, triggered on data arrival.
- Automatic results upload, ensuring seamless data delivery to customers.
Conclusion
By implementing cloud-based automation, the CRO significantly improved operational efficiency. The enhanced workflow freed bioinformaticians to focus on developing new pipelines rather than managing routine data processing, fostering greater innovation.