Q2 2021 Release Notes
That was fast! Another quarter just breezed by. So what have we been working on at Intent HQ?
The IHQ Platform uses a set of configurations to define the data processing and enrichment steps performed during a pipeline run.
Thanks to its versatility, Python is popular with users such as data scientists and data engineers, who use it to generate new value by combining information from multiple data sources. Users can also utilise the Intent HQ DSL to help Python reference data sources and process information efficiently.
In most scenarios, testing and validating configuration or feature engineering on a production-sized dataset is expensive, time-consuming and highly inefficient at scale.
In this release, we have added the capability to run a pipeline for a chosen set of customer profiles. For example, this may be a subset of eligible users selected as a sample and then used as a test or validation dataset during feature engineering.
Whether you want to validate configuration or enrichments (in DSL or Python) on a smaller sample before going to production, or intend to use the subset for further work, we encourage sampling for data exploration, given the cost and effort associated with full datasets.
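As an illustration of the idea, here is a minimal Python sketch of validating an enrichment on a sampled subset of profiles before running it on everything. It does not use the IHQ Platform API or the Intent HQ DSL; the names `sample_profile_ids`, `run_on_subset` and the dictionary-based profile layout are hypothetical and chosen only for this example.

```python
import random


def sample_profile_ids(profile_ids, fraction, seed=0):
    """Pick a reproducible random subset of profile IDs (hypothetical helper)."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ids = sorted(profile_ids)  # sort first so the same seed gives the same sample
    if not ids:
        return set()
    k = max(1, round(len(ids) * fraction))
    return set(random.Random(seed).sample(ids, k))


def run_on_subset(profiles, chosen_ids, enrich):
    """Apply `enrich` only to the profiles whose ID is in `chosen_ids`."""
    return {pid: enrich(p) for pid, p in profiles.items() if pid in chosen_ids}
```

A fixed seed keeps the sample stable between runs, so the same subset can be reused later as a test or validation dataset.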
To make data transformation easier and more automated, Intent HQ has added functionality to trigger data processing (pipelines) once file ingestion has completed.
This means the platform will automatically recognise and reference a given pipeline configuration as soon as the ‘expected’ data has been received and ingested, and will then proceed to generate enriched profiles.
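The behaviour described above can be pictured with a small Python sketch. This is not the platform's implementation or API: the `IngestionTrigger` class, its `on_file_ingested` method and the `run_pipeline` callback are hypothetical names, and the sketch assumes that 'expected' data means a known set of file names.

```python
class IngestionTrigger:
    """Run a pipeline once every expected file has been ingested (hypothetical)."""

    def __init__(self, expected_files, run_pipeline):
        self.expected = set(expected_files)
        self.received = set()
        self.run_pipeline = run_pipeline  # callback that starts the pipeline
        self.triggered = False

    def on_file_ingested(self, name):
        """Record an ingested file; return True if this call started the pipeline."""
        self.received.add(name)
        if not self.triggered and self.expected <= self.received:
            self.triggered = True  # assumption: fire only once per batch
            self.run_pipeline()
            return True
        return False
```

In this sketch the pipeline starts on the call that completes the expected set, and later duplicate notifications do not start it again.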