Ignore files starting with "_" #13

Open
ghost opened this issue Jul 27, 2016 · 0 comments
ghost commented Jul 27, 2016

Dear devs,

I wanted first to thank you for this piece of software, really great!

I have one request I would like to raise, if possible. Could you please change the code so that files starting with "_" are ignored? The use case is as follows:
I have a data source that is quite slow, and I use Apache Flume to store its data in HDFS. Because the data rate is low, I set up Flume to roll to a new file every 10 minutes. This creates a lot of small files, which your crusher handles perfectly.
The issue is that Flume's temporary files (i.e. files that are not closed yet) start with "_" and have ".tmp" appended. If such a file is closed while the crusher is running, the crusher can no longer find it. I would also like to avoid errors on Flume's side, so I would rather the crusher not touch those files at all.

The request is thus to either have a new option to ignore files starting with "_" or just ignore them by default.
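
To make the request concrete, here is a minimal sketch of the filtering I have in mind. It is plain Python and not the crusher's actual code; the function names, the `prefixes` parameter and the example paths are all made up for illustration.

```python
import posixpath

# Hypothetical helper: decide whether a path should be skipped because its
# file name starts with one of the ignored prefixes ("_" by default).
def is_ignorable(path, prefixes=("_",)):
    # HDFS paths use "/" separators, so posixpath is used on every platform.
    name = posixpath.basename(path.rstrip("/"))
    return name.startswith(tuple(prefixes))

# Hypothetical helper: keep only the files that are safe to process.
def crushable(paths, prefixes=("_",)):
    return [p for p in paths if not is_ignorable(p, prefixes)]

# Made-up listing: one file still being written by Flume, two closed files.
listing = [
    "/flume/events/_FlumeData.0001.tmp",
    "/flume/events/FlumeData.0002",
    "/flume/events/FlumeData.0003",
]
selected = crushable(listing)
```

With an option, `prefixes` would come from the command line; with the "ignore by default" variant, it would simply stay at `("_",)`.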

Thanks a lot!
