Analyzing microscopic image datasets is difficult. It is even more so with the rapid increase of PixelBiotech’s customer base. We are constantly receiving requests for HuluREAD to analyze datasets that are acquired using different microscopes, under various imaging conditions and on different species. In this article, we discuss important lessons we learned in building HuluREAD into an efficient and scalable algorithmic platform for customized HuluFISH image data analysis.
Lesson #1: Fully embrace AI
Machine learning especially deep learning has made significant progress in the past years. In HuluREAD, we have applied many machine learning models to their full extent. Machine learning has drastically increased the capability of model adaptation and helped to avoid a lot of tedious manual steps. For example, we have built anomaly detection models to detect low-quality datasets due to overexposure or focus failure, saving unnecessary computational spending. Various deep learning models have been built and deployed for the segmentation and classification of FISH spots, nuclei and cells. With techniques like incremental learning and domain adaptation, we can quickly train or re-train a model to adapt to new datasets.
Segmentation of cell nuclei with different shape, appearance and density (real customer data).
Lesson #2: Prepare a full arsenal of techniques
Machine learning such as deep learning is largely a statistical method based on low-level image features. But often high-level structural information is needed during the analysis. So, one needs to be prepared to have a full arsenal of image analysis techniques: digital image processing, computational geometry, morphological image analysis, among others. As a rule of thumb, we found that a robust and high-quality solution usually consists of multiple interacting components. For example, the deep learning model extracts and aggregates the low-level cues, while some geometric or morphological models governs the higher level consistency and maintains meaningful output.
Detection of HuluFISH spots (real customer data).
Lesson #3: Practice, practice, practice
In theory there is no difference between theory and practice. In practice there is. – Benjamin Brewster
Our best teacher during the process of building HuluREAD has been practice. It is the thousands of datasets we analyzed that guided every improvement of HuluREAD. We have learnt the true advantages and disadvantages of many techniques at a much larger scale both in terms of volume and in terms of complexity. The solutions that are live in HuluREAD today have survived thousands of real tests. To ensure highly efficient algorithm development iterations, we found the following practice very critical. First, build a very good computational infrastructure and make full use of parallel and GPU computation. Just like experimental science, repetitions matter. Second, we paid strict attention to engineering quality including system architecting, design patterns, programming paradigm and test coverage, among others. Third, pay attention to usability from day one. Build good GUIs and CLIs. Good usability encourages us to maintain high version compatibility and process standardization, which in turn yields better efficiency in algorithm development iterations.
More segmentation of cells and cell nuclei under various imaging conditions (real customer data).
While we try to best serve our customers with better image analysis, we also want to share our experience with the community. If you are interested or have questions, feel free to reach out to us at firstname.lastname@example.org.