Apache Tika
Apache Tika - a content analysis toolkit
The Apache Tika toolkit detects and extracts metadata and text content from various documents - from PPT to CSV to PDF - using existing parser libraries. Tika unifies these parsers under a single interface to allow ...
tika.apache.org