Ask HN: What does your BI stack look like?

edmundsauto · on Feb 22, 2021

A quick plug for dbt here - it's a tool that lets you define your data model transforms and store them in source code. (You then use those for the transformations in your warehouse.)

https://www.getdbt.com/

cbradford · on Feb 22, 2021

Streamlit for Python and PowerBI. Great tools

Jugurtha · on Feb 22, 2021

Would you elaborate on your workflow and how you use these? Thank you.

nonameiguess · on Feb 23, 2021

I'm not going to answer this for my current company, but man was it a mess at my last one. I was involved in writing custom ETL tooling and attempting to onboard product management teams onto it. Financial data was all stored in APEX, which could only be queried using Windows APIs, so the one guy we had who knew Visual Basic did all of that. I wrote a bunch of Python libraries that defined a custom intermediate format for data extracted from Jira, Rational Team Concert, and Azure DevOps boards, since there was no consistent standard for what product teams actually used but it tended to be one of those three things. We offered the ability to feed the data to InfluxDB and visualize it in Grafana, with Ansible automation that would deploy the DB, dashboards, and configuration for the custom ETL tools together so they stayed in sync with each other. That was for on-premises, single-team usage.

For enterprise, company-wide aggregation, we were supposed to be feeding an enterprise measures system with long-term storage in Amazon Redshift and standardized dashboards built in PowerBI that used standard metrics definitions everyone was supposed to converge on. But that product was two years behind schedule and eventually cancelled when the company merged with an airline manufacturer, Covid hit a month later, and capital funds suddenly dried up. So luckily our on-premises solution was entirely free and allowed users to define whatever collection strategies and metrics they wanted.

I hope someone is still actually using this. It was just me and two other guys who made all this tooling and automation and deployed it to seven pilot product teams middle of last year. Not sure how they're doing without me since I wrote all of the core libraries myself with no external input or review because we were in such a hurry to demonstrate value.
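
To illustrate the "custom intermediate format" idea from the comment above, here is a minimal Python sketch of normalizing work items from Jira, Rational Team Concert, and Azure DevOps into one shared record type. Everything in it is an assumption made for illustration: the `WorkItem` fields, the function names, and the simplified payload shapes (the RTC shape in particular is made up). It is not the commenter's actual library.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class WorkItem:
    """Hypothetical tool-agnostic record; not the original library's schema."""
    source: str
    item_id: str
    title: str
    status: str


def from_jira(raw: Dict[str, Any]) -> WorkItem:
    # Assumes a simplified Jira issue payload: key plus nested fields.
    fields = raw["fields"]
    return WorkItem("jira", raw["key"], fields["summary"],
                    fields["status"]["name"].lower())


def from_azure_devops(raw: Dict[str, Any]) -> WorkItem:
    # Assumes a simplified Azure DevOps work item with "System.*" fields.
    fields = raw["fields"]
    return WorkItem("azure_devops", str(raw["id"]), fields["System.Title"],
                    fields["System.State"].lower())


def from_rtc(raw: Dict[str, Any]) -> WorkItem:
    # Purely hypothetical flat shape for Rational Team Concert exports.
    return WorkItem("rtc", str(raw["id"]), raw["title"], raw["state"].lower())


# One normalizer per source tool; adding a tool means adding one entry here.
NORMALIZERS: Dict[str, Callable[[Dict[str, Any]], WorkItem]] = {
    "jira": from_jira,
    "azure_devops": from_azure_devops,
    "rtc": from_rtc,
}


def normalize(source: str, raw: Dict[str, Any]) -> WorkItem:
    """Convert a raw record from a known source into the shared format."""
    if source not in NORMALIZERS:
        raise ValueError(f"unsupported source: {source}")
    return NORMALIZERS[source](raw)
```

With a layer like this, anything downstream (for example, a writer that pushes metrics to InfluxDB) only has to understand `WorkItem`, regardless of which tracker a given team uses.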

MrApathy · on Feb 23, 2021

Context: bulge bracket bank.

Within my area, data sourced from just about everywhere, Oracle, Maria, SAP, SharePoint, flat files, etc. Alteryx and Informatica used to structure data, with some Python at the more advanced level.

On the visualization side most everything is Tableau or Qlik, with the former having the larger footprint (maybe 3x as large). Tableau is oriented more toward less technical business teams, Qlik Sense toward more technically inclined teams (more powerful, at the expense of being more difficult), and QlikView covers a lot of legacy dashboards -- we're currently reviewing whether to close it off to new development.

On the reporting side, primarily Cognos.

We rely heavily on vendor applications since we can typically ensure support is available.

Only new tool we've evaluated and picked up recently (last couple of years) has been Looker, which was licensed for a small use case and hasn't worked as expected. Interest has been primarily in the scripting/ETL component, less so on the visualization side, but the license model makes it very expensive for what we use it for.

hm-nah · on Feb 22, 2021

Myriad of inputs from ancient on-prem FTP and mainframe to data uploaded to Azure Storage containers. Azure Data Factory migrating all inputs to Azure Data Lake(s). Different visualization tools for on-prem and cloud. On-prem is a messy turd of BOBJ (barf!), custom CGI scripts, SAS generating html/emails/etc. Tableau and PowerBI for other things. Nothing cohesive.

thorin · on Feb 23, 2021

Seems like these days everyone uses a completely different bunch of stuff.

About 5 years ago when I was a BI lead we were using JasperReports, which was acquired by Tibco. I also trialled stuff like QlikView and Tableau, which seemed great for their particular use-cases.

Before that we used Oracle Discoverer, Oracle Apex and Business Objects.

markus_zhang · on Feb 23, 2021

Vertica and HDFS for storage.

Home grown ETL processes in Python, Spark and Airflow with Rundeck and Jenkins.

Tableau for dashboards.

Python + Dash for more complicated dashboards.

max_hammer · on Feb 22, 2021

Snowflake on AWS for storage.

Looker for reporting.

Airflow for scheduling.

Sagemaker for ML

twolf910616 · on Feb 24, 2021

Have you looked into Step Functions? Just curious since it's supposed to be a competing product to Airflow. Also Sagemaker for us has been way too expensive.

aed · on Feb 23, 2021

Stitch for ETL (really just the E and L part)

Redshift for storage

dbt to transform the raw data into something warehousey

Metabase for viz

eyeball · on Feb 22, 2021

Lots of SQL & Python feeding Oracle feeding Qlik Sense

msencenb · on Feb 22, 2021

At my day job we use Redshift + Metabase