Figure 4 | Source Code for Biology and Medicine

Figure 4

From: Layout-aware text extraction from full-text PDF of scientific articles

Text Flow Evaluations. The graph above shows the relative alignment cost of LA-PDFText and PDF2Text with respect to the gold standard. Each green dot represents the difference between the normalized alignment scores of LA-PDFText and PDF2Text for one paper in the PLoS Biology dataset. Plus (+) markers show the normalized alignment scores produced by LA-PDFText, and minus (-) markers show those produced by PDF2Text. The results indicate that LA-PDFText extracts text with better alignment scores, relative to the gold standard, than PDF2Text for 91% of the documents tested (p < 0.001).
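
As a rough illustration of the quantities in this caption, the per-paper difference (the green dots) and the share of papers on which one tool scores better can be sketched as follows. This is a minimal sketch, not the authors' evaluation code: the function names, the `lower_is_better` flag, and the sample values are all hypothetical, and it assumes that a lower alignment cost means a closer match to the gold standard, which the caption does not state explicitly.

```python
from typing import List, Sequence


def score_differences(la: Sequence[float], other: Sequence[float]) -> List[float]:
    """Per-paper difference between two tools' normalized alignment scores."""
    if len(la) != len(other):
        raise ValueError("both tools must be scored on the same papers")
    return [a - b for a, b in zip(la, other)]


def fraction_better(la: Sequence[float], other: Sequence[float],
                    lower_is_better: bool = True) -> float:
    """Share of papers on which the first tool scores better than the second.

    ASSUMPTION: scores are treated as costs (lower is better) by default.
    Pass lower_is_better=False if higher scores mean better alignment.
    """
    diffs = score_differences(la, other)
    if not diffs:
        raise ValueError("no papers to compare")
    if lower_is_better:
        wins = sum(1 for d in diffs if d < 0)
    else:
        wins = sum(1 for d in diffs if d > 0)
    return wins / len(diffs)


# Made-up scores for four papers, for illustration only (not data from the study).
la_scores = [0.10, 0.25, 0.40, 0.15]
pdf2text_scores = [0.30, 0.20, 0.55, 0.35]

print(score_differences(la_scores, pdf2text_scores))
print(fraction_better(la_scores, pdf2text_scores))
```

The significance test behind the reported p-value is not named in the caption, so it is left out of this sketch.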