Fig. 5

From: Automatic identification of scientific publications describing digital reconstructions of neural morphology

Selecting the most convenient thresholds.
A Proportion of irrelevant articles incorrectly classified as relevant (false relevant) as a function of the fraction of labor saved by accepting the automated classification without review. The inset enlarges the range between 75% and 100% of saved labor.
B Proportion of relevant articles incorrectly classified as irrelevant (false irrelevant) as a function of the fraction of labor saved by accepting the automated classification without review. The inset enlarges the range between 75% and 100% of saved labor.
C Using labeled test data, we select optimal thresholds to maximize saved labor while minimizing misclassification errors and the number of publications to be manually reviewed.
D Once the classifier is deployed, we analyze the results by type of text: high-quality text obtained from publishers’ APIs and low-quality raw text extracted from PDFs.
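
The threshold selection described for panel C can be illustrated with a small grid search over a pair of score cut-offs. This is only a sketch under stated assumptions, not the authors' implementation: the function name `select_thresholds`, the two error tolerances, the convention that scores lie in [0, 1], and the choice to measure each error rate as a fraction of its true class are all hypothetical.

```python
from itertools import product


def select_thresholds(scores, labels, max_false_relevant=0.05,
                      max_false_irrelevant=0.05):
    """Grid-search (low, high) cut-offs that maximize saved labor.

    Assumed conventions (not taken from the article):
      - scores are classifier outputs in [0, 1]; labels are 1 (relevant)
        or 0 (irrelevant)
      - score >= high  -> accepted as relevant without review
      - score <  low   -> rejected as irrelevant without review
      - anything in between is sent to manual review
    Returns (saved_labor, low, high), or None if no pair meets the limits.
    """
    n_relevant = sum(labels)
    n_irrelevant = len(labels) - n_relevant
    candidates = sorted(set(scores) | {0.0, 1.0})
    best = None
    for low, high in product(candidates, repeat=2):
        if low > high:
            continue
        auto_rel = [y for s, y in zip(scores, labels) if s >= high]
        auto_irr = [y for s, y in zip(scores, labels) if s < low]
        # False relevant: irrelevant articles accepted automatically.
        false_relevant = (
            (len(auto_rel) - sum(auto_rel)) / n_irrelevant if n_irrelevant else 0.0
        )
        # False irrelevant: relevant articles rejected automatically.
        false_irrelevant = sum(auto_irr) / n_relevant if n_relevant else 0.0
        if (false_relevant > max_false_relevant
                or false_irrelevant > max_false_irrelevant):
            continue
        # Saved labor: fraction of articles that skip manual review.
        saved = (len(auto_rel) + len(auto_irr)) / len(scores)
        if best is None or saved > best[0]:
            best = (saved, low, high)
    return best
```

Tightening both tolerances to zero widens the manual-review band between the two cut-offs, while loosening them lets the band shrink and the saved labor grow, which is the trade-off that panels A and B plot.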