
We were wowed by NiFi when we first looked at it. Once we put it in a local environment to build test flows, we found that the most complex data-flow tasks were fairly simple to set up, while the simplest tasks ended up requiring complex workarounds because the system was trying to be extra smart about what it was doing. In the end, we decided not to use it in production, since our workload was roughly an 80/20 split of simple to complex tasks.

Hopefully it's better now than it was in the Jan/Feb timeframe.



+1 to mpayne's comment. It would be interesting to know what you mean by "extra smart". Can you give an example? Thanks!


Interesting. Any details on the things that you found particularly easy or particularly difficult to do with it?


Simple use cases that were more complicated than expected:

- We wanted to collect files from various locations and push them into HDFS. NiFi seemed like a good way to build a self-service setup. Once we had set up the source and sink, if we read, processed, and removed a file from the destination, NiFi copied it again. We did not always have the control needed to remove source files as soon as they were copied to the destination. The components we used were GetFile and PutFile, along with some of the Conflict Resolution settings for these components.

- Inspect the file name, run a script to generate new subdirectories for partitions, and place the file in the appropriate partition. Attaching a script was easy. Changing the destination path on the fly was not.
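To make the two cases above concrete, here is a rough sketch in plain Python of the behavior we were after. This is not NiFi code or its API. The file-name convention (`<dataset>_<YYYY>-<MM>-<DD>.<ext>`), the partition layout, and the in-memory `seen` set are all assumptions made for illustration; a real setup would persist that state somewhere.

```python
import re
import shutil
from pathlib import Path

# Hypothetical file-name convention: <dataset>_<YYYY>-<MM>-<DD>.<ext>
NAME_PATTERN = re.compile(
    r"^(?P<dataset>[a-z0-9]+)_(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})\.\w+$"
)


def partition_dir(root: Path, file_name: str) -> Path:
    """Derive a partition directory such as root/events/year=2016/month=02/day=14."""
    match = NAME_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"unexpected file name: {file_name}")
    return (root / match["dataset"] / f"year={match['y']}"
            / f"month={match['m']}" / f"day={match['d']}")


def collect_once(source: Path, root: Path, seen: set) -> list:
    """Copy each source file at most once, even if the copy is removed downstream."""
    copied = []
    for path in sorted(source.iterdir()):
        if not path.is_file():
            continue
        stat = path.stat()
        # Identify a file by name, size, and modification time (an assumption;
        # a content hash would be stricter).
        key = (path.name, stat.st_size, stat.st_mtime_ns)
        if key in seen:
            continue  # already delivered once, do not copy again
        target_dir = partition_dir(root, path.name)
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target_dir / path.name)
        seen.add(key)
        copied.append(target_dir / path.name)
    return copied
```

Calling `collect_once` repeatedly with the same `seen` set leaves an already delivered file alone, even after the downstream job has deleted its copy and while the source file is still sitting there.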

Some complex cases (there are other ways to do these, but setting them up in NiFi was a breeze):

- Set up file collection from FTP, SFTP, and file copy from 20+ locations. This was painless, a few minutes per source

- Add REST interaction within a data flow easily

- Read CSV files and convert to Avro/Sequence files

- Read files and route part of the data to different processors
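For anyone unsure what "route part of the data" means in practice, the idea is roughly the following. This is a plain-Python sketch, not NiFi's API; the function name, column names, and route names are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def route_rows(csv_text, column, routes, default="unmatched"):
    """Group CSV rows by route name, chosen from the value in one column."""
    routed = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Rows whose value has no configured route go to the default route.
        routed[routes.get(row[column], default)].append(row)
    return dict(routed)
```

Each route would then feed a different downstream step, for example error rows to an alerting step and everything else to an archive.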

We also ran into some strange bugs where NiFi got stuck in some kind of loop and kept copying data over and over again.

We were able to do all this testing in 2 weeks. Give it a shot, it might work out for your use case.


I'm REALLY interested in hearing more about this.



