Class DocumentParser


  • public class DocumentParser
    extends java.lang.Object
    Extracts document content as plain text. The current implementation is based on Apache Tika.
    • Constructor Summary

      Constructors 
      Constructor Description
      DocumentParser()  
    • Method Summary

      Modifier and Type Method Description
      static java.lang.String parse(DocumentDB doc)
      Extracts the text of a document.
      static java.lang.String parse(DocumentDB doc, int writeLimit)
      Extracts the text of a document, subject to a write limit.
      static java.lang.String parse(java.io.File file)
      Extracts the text of a file.
      static java.lang.String parse(java.io.File file, int writeLimit)
      Extracts the text of a file, subject to a write limit.
      static java.lang.String parse(java.lang.String path)
      Extracts the text of the document at the given path.
      static java.lang.String parse(java.lang.String path, byte[] data)
      Extracts the text of a document supplied as a byte array.
      static java.lang.String parse(java.lang.String path, byte[] data, int writeLimit)
      Extracts the text of a document supplied as a byte array, subject to a write limit.
      static java.lang.String parse(java.lang.String path, int writeLimit)
      Extracts the text of the document at the given path, subject to a write limit.
      • Methods inherited from class java.lang.Object

        equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • DocumentParser

        public DocumentParser()
    • Method Detail

      • parse

        public static java.lang.String parse(DocumentDB doc,
                                             int writeLimit)
        Extracts the text of a document, subject to a write limit.
        Parameters:
        doc - Document to parse
        writeLimit - Limit on the amount of extracted text
        Returns:
        Extracted text
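
The write limit is not described further here. As a rough illustration only, the sketch below assumes it behaves like the character limit of Tika's `BodyContentHandler`, where at most that many characters are kept and `-1` disables the limit. `WriteLimitSketch` and `applyWriteLimit` are hypothetical names and not part of this API.

```java
public class WriteLimitSketch {
    // Hypothetical helper: keeps at most writeLimit characters of the text.
    // Assumption: a negative limit (Tika uses -1) is treated as "no limit".
    public static String applyWriteLimit(String extracted, int writeLimit) {
        if (writeLimit < 0 || extracted.length() <= writeLimit) {
            return extracted;
        }
        return extracted.substring(0, writeLimit);
    }

    public static void main(String[] args) {
        // Limited to 9 characters.
        System.out.println(applyWriteLimit("Quarterly report 2023", 9));
        // No limit applied.
        System.out.println(applyWriteLimit("Quarterly report 2023", -1));
    }
}
```

A caller that only needs a preview or an index snippet would pass a small limit; whether the real method truncates silently or signals that the limit was hit is not stated in this reference.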
      • parse

        public static java.lang.String parse(DocumentDB doc)
        Extracts the text of a document.
        Parameters:
        doc - Document to parse
        Returns:
        Extracted text
      • parse

        public static java.lang.String parse(java.lang.String path,
                                             int writeLimit)
        Extracts the text of the document at the given path, subject to a write limit.
        Parameters:
        path - Document path; a relative path is resolved against the local doc dir
        writeLimit - Limit on the amount of extracted text
        Returns:
        Extracted text
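
The relative-path rule for `path` can be illustrated as follows. This is a sketch only: `PathResolutionSketch`, `resolveDocPath`, and the `docs` directory are hypothetical stand-ins for however the local doc dir is actually configured.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class PathResolutionSketch {
    // Hypothetical helper: absolute paths are kept as they are, relative
    // paths are resolved against the given base directory.
    public static Path resolveDocPath(String path, Path localDocDir) {
        // Path.resolve returns its argument unchanged when it is absolute.
        return localDocDir.resolve(path).normalize();
    }

    public static void main(String[] args) {
        // Assumed base directory, for illustration only.
        Path localDocDir = Paths.get("docs").toAbsolutePath();
        System.out.println(resolveDocPath("reports/q1.pdf", localDocDir));
        String absolute = Paths.get("other.pdf").toAbsolutePath().toString();
        System.out.println(resolveDocPath(absolute, localDocDir));
    }
}
```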
      • parse

        public static java.lang.String parse(java.lang.String path)
        Extracts the text of the document at the given path.
        Parameters:
        path - Document path; a relative path is resolved against the local doc dir
        Returns:
        Extracted text
      • parse

        public static java.lang.String parse(java.io.File file,
                                             int writeLimit)
        Extracts the text of a file, subject to a write limit.
        Parameters:
        file - Document file to parse
        writeLimit - Limit on the amount of extracted text
        Returns:
        Extracted text
      • parse

        public static java.lang.String parse(java.io.File file)
        Extracts the text of a file.
        Parameters:
        file - Document file to parse
        Returns:
        Extracted text
      • parse

        public static java.lang.String parse(java.lang.String path,
                                             byte[] data,
                                             int writeLimit)
        Extracts the text of a document supplied as a byte array, subject to a write limit.
        Parameters:
        path - Document path
        data - Document content as a byte array
        writeLimit - Limit on the amount of extracted text
        Returns:
        Extracted text
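
A caller holding a document in memory passes its bytes together with a path. The sketch below shows that calling pattern with a stand-in: `InMemoryParseSketch.parse` is hypothetical and only decodes UTF-8 plain text, whereas the documented method is based on Tika. Treating a negative `writeLimit` as unlimited is also an assumption.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class InMemoryParseSketch {
    // Stand-in with the same parameter list as parse(path, data, writeLimit).
    // It handles UTF-8 plain text only. The path is unused here; its role in
    // the real method is not documented in this reference.
    public static String parse(String path, byte[] data, int writeLimit) {
        String text = new String(data, StandardCharsets.UTF_8);
        if (writeLimit < 0 || text.length() <= writeLimit) {
            return text;
        }
        return text.substring(0, writeLimit);
    }

    public static void main(String[] args) throws IOException {
        // Write a small text file, then read it back into memory.
        Path tmp = Files.createTempFile("note", ".txt");
        Files.write(tmp, "hello parser".getBytes(StandardCharsets.UTF_8));
        byte[] data = Files.readAllBytes(tmp);

        // Pass the in-memory bytes along with the path they came from.
        System.out.println(parse(tmp.toString(), data, 5));
        Files.delete(tmp);
    }
}
```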
      • parse

        public static java.lang.String parse(java.lang.String path,
                                             byte[] data)
        Extracts the text of a document supplied as a byte array.
        Parameters:
        path - Document path
        data - Document content as a byte array
        Returns:
        Extracted text