#17 Singer SDK, Data as a Product, Readmes for Data; ThDPTh #17
Finally a Singer SDK, the data-as-a-product webinar, and using readme-driven development for better data-related work products.
Data will power every piece of our existence in the near future. I collect "Data Points" to help understand & shape this future.
If you want to support this, please share it on Twitter, LinkedIn, or Facebook.
Singer SDK + SingerHub + Spec Extension
I do believe that the future of data, be it BI or data integration & EL(T) workflows, is open source, simply due to the nature of the task. So it's great to see the GitLab Meltano team tackle the three major challenges that come with one of the current options.
Let's step back for a second. Currently, there really is only one open-source data "connector" standard, which is called Singer. But Singer has a bunch of problems. Still, the Meltano team decided to build its tool on top of Singer. The great news? Meltano is starting to tackle all of Singer's major problems.
The first problem is how to find Singer taps. The team will work on the "SingerHub" in April, which will start to address this problem. Other tools like Airbyte already have something like this by integrating all "taps" into one repository. The second problem is developing a new tap, which just became a lot easier with their new advanced cookiecutter template. The third problem is the actual Singer spec. The spec is too generic to make Singer a successful open-source project, simply because it leaves too much freedom to each tap. Airbyte solves this issue by wrapping all taps in Docker. The Meltano team will work on extending the Singer spec, so I'm looking forward to their approach to fixing these problems.
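To make the "too generic" point concrete, here is a minimal sketch of what a tap does under the Singer spec: it writes JSON messages of type SCHEMA, RECORD, and STATE to stdout, one per line. This is hand-rolled Python and does not use the Meltano SDK. The stream name, the sample rows, and the `last_id` bookmark are made up for illustration; the spec does not prescribe what goes inside the STATE value, which is exactly the kind of freedom mentioned above.

```python
import json
import sys


def build_messages(stream, rows):
    """Build the Singer messages for one stream: SCHEMA, then RECORDs, then STATE."""
    schema = {
        "type": "SCHEMA",
        "stream": stream,
        # JSON Schema describing each record (hypothetical fields).
        "schema": {
            "type": "object",
            "properties": {
                "id": {"type": "integer"},
                "name": {"type": "string"},
            },
        },
        "key_properties": ["id"],
    }
    records = [{"type": "RECORD", "stream": stream, "record": row} for row in rows]
    # The shape of "value" is up to the tap; "last_id" is an assumption of this sketch.
    last_id = max((row["id"] for row in rows), default=None)
    state = {"type": "STATE", "value": {stream: {"last_id": last_id}}}
    return [schema, *records, state]


def emit(messages, out=sys.stdout):
    """Write each message as a single line of JSON."""
    for message in messages:
        out.write(json.dumps(message) + "\n")


if __name__ == "__main__":
    sample_rows = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
    emit(build_messages("users", sample_rows))
```

A target reads these lines from stdin, so a pipeline is typically just a shell pipe from a tap into a target. Everything beyond the message envelope (bookmark layout, config keys, how streams are selected) is decided tap by tap.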
Meltano Launches v0.1.0 of the Singer Tap SDK - Meltano
The Meltano team launches their Singer Tap SDK - the easiest way to build and maintain high quality data extractors compatible with the Singer ecosystem.
meltano.com
Data as a Product Webinar
I recently watched Zhamak Dehghani's webinar on "Data as a Product", in which she really takes the time to focus on just one of the principles of the data mesh paradigm shift. I really like the focus, because seeing data as a product that has to be managed with product management techniques, and not just as a by-product, is in my experience the single hardest thing about a data mesh. Everything else is just technical problems and things that derive from this principle.
Zhamak extends the already known "DATSIS" framework for creating good data products to include a few more items. One that I found important is the idea that a data product should be "valuable on its own" and "more valuable joined together".
That's an important idea, and it should tell you that if you think your analytical database tables full of IDs are data products, you're most likely mistaken. For the rest, watch the webinar.
Webinar | Data as a Product | ThoughtWorks
Zhamak Dehghani explored during a recent webinar the principle of "data as a product" and described how this simple change in perspective has a deep …
www.thoughtworks.com
Readme Driven Development for Data
Documenting things isn't the most fun exercise for data analysts, analytics engineers, or data scientists. And yet it's a crucial step in getting your work product used, be it an API, a dashboard, or a dbt model.
GitHub co-founder Tom Preston-Werner explains a great approach that makes keeping documentation up to date much easier: "Readme Driven Development" (RDD). The benefits are best explained by the Terratest library, which uses RDD:
"[RDD] ensures the documentation stays up to date and allows you to think through the problem at a high level before you get lost in the weeds of coding." (Terratest Contributing Guidelines)
I use this approach myself and find it very useful no matter what you're developing. If you're building a new dashboard, write the description first. Make sure it's short and concise, and suddenly your dashboard will become much less cluttered and much more focused.
…By the same principle a beautifully crafted library with no documentation is also damn near worthless. If your software solves the wrong problem or nobody can figure out how to use it, there's something very bad going on.
tom.preston-werner.com
In Other News & Thanks
Thanks for reading this far! I'd also love it if you shared this newsletter with people whom you think might be interested in it.
P.S.: I share things that matter, not the most recent ones. I share books, research papers, and tools. I try to provide a simple way of understanding all these things. I tend to be opinionated. You can always hit the unsubscribe button!
Data; Business Intelligence; Machine Learning, Artificial Intelligence; Everything about what powers our future.