[7.x] Attachment ingest processor: add resource_name field (#64389) #66301

Merged


danhermann
Contributor

With the current ingest-attachment plugin, a text file cannot be read properly if its encoding is not UTF-8 and it contains non-ASCII characters.
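
To illustrate the failure mode, here is a minimal sketch using only the Java standard library (it does not touch Tika or the plugin code). The class name `EncodingMismatchDemo` and the sample string are hypothetical, chosen for illustration. It shows that bytes written in ISO-8859-1 do not survive being decoded as UTF-8 once they contain a non-ASCII character.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Decode raw bytes with the given charset. Malformed input is replaced
    // with U+FFFD rather than raising an exception.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "cafe" with an acute accent on the final letter.
        String original = "caf\u00e9";

        // Simulate a text file that was saved as ISO-8859-1, not UTF-8.
        byte[] latin1Bytes = original.getBytes(StandardCharsets.ISO_8859_1);

        // Decoding with the wrong charset corrupts the non-ASCII character.
        String wrong = decode(latin1Bytes, StandardCharsets.UTF_8);
        // Decoding with the charset the file was written in round-trips.
        String right = decode(latin1Bytes, StandardCharsets.ISO_8859_1);

        System.out.println("decoded as UTF-8 matches original: " + wrong.equals(original));
        System.out.println("decoded as ISO-8859-1 matches original: " + right.equals(original));
    }
}
```

The ASCII prefix decodes identically under both charsets. Only the non-ASCII byte is lost, which matches the symptom described above.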

I looked into Tika, which ingest-attachment uses, and found that if we tell Tika the file's name, it can recognize the file better. So I added an attachment option, file_name: if a field is defined as file_name, its value is sent to Tika to improve the result.

One thing does not look right, though: the gradle check. I wrote unit tests that read text in different encodings, but there seems to be a rule against committing non-UTF-8 files.

Backport of #64389

@danhermann danhermann added >enhancement :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP backport v7.11.0 labels Dec 14, 2020
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Dec 14, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (Team:Core/Features)

@danhermann danhermann merged commit f026c75 into elastic:7.x Dec 14, 2020
@danhermann danhermann deleted the bacport_7x_64389_resource_name_attachment branch December 14, 2020 21:43
3 participants