Insights from the Gartner Data & Analytics Summit in London

May 28, 2024

Analytics Data Integration Event


I had the opportunity to attend the Gartner Data & Analytics Summit in London from May 13th to 15th. This three-day event featured over 100 sessions, many of which ran concurrently, making it impossible to capture everything. Still, I would like to share my top takeaways from this insightful conference.

D&A generates value

Evaluating the return on the governance and management of Data & Analytics (D&A) is often challenging. Gartner has conducted numerous studies and polls to provide concrete evidence.

According to those studies, strong D&A maturity can boost a firm’s financial performance by 30%.

Governance is a critical cornerstone of this D&A maturity, yet its value is frequently underestimated. To harness the full potential of our D&A initiatives, it’s essential that we shift our focus from traditional ROI metrics to broader business outcomes.

Moreover, prioritizing execution over strategy can drive more tangible results. By emphasizing practical implementation and operational excellence, we can ensure that our D&A governance delivers maximum value to the organization.

AI and GenAI – the elephant in the room

Everyone is talking about AI and GenAI, and it’s widely accepted that GenAI represents a disruption on par with the creation of the internet itself. The pressing question is: how can we harness this disruption to meet our own needs?

Studies show that firms which have designated AI as a strategic priority have outperformed their peers by 80% over the past nine years. This highlights the transformative potential of AI when integrated into a company’s core strategy.

The concept of “AI-ready data” came up in many sessions. Compared with analytics data, and beyond quality, compliance, and accessibility, AI requires more metadata, context, and labeling. This can only be achieved through mature governance practices.

Learning & collective intelligence

In today’s fast-paced and data-rich environment, strong centralized models struggle to keep up with the volume of data and the speed at which decisions need to be made. The concept of collective intelligence addresses this by distributing decision-making power to local groups rather than centralizing it with top management.

This approach can be effectively implemented by focusing on several key areas:

Access to the Right Data: Ensure that all team members have access to the relevant and necessary data. This empowers local groups to make informed decisions quickly and accurately.

Sense of Purpose: Clearly communicate the organization’s vision and goals. When teams understand the bigger picture and their role within it, they are more motivated and aligned with the company’s objectives.

Autonomy: Granting teams the autonomy to make decisions fosters innovation and responsiveness. Local groups are often closer to the issues at hand and can act more swiftly than a centralized authority.

Literacy: Invest in training programs that enhance data and AI literacy skills. Equipping teams with the knowledge to understand and leverage data effectively is crucial for informed decision-making.

Data Fabric and Data Mesh

During the summit, two key architectural approaches were frequently discussed: Data Fabric and Data Mesh. While sometimes viewed as opposing strategies, they can also be seen as complementary.

Data Fabric focuses on leveraging the extended metadata provided by our existing platforms. Its primary goal is to “activate” this metadata to facilitate automated enhancements and suggestions.

Data Mesh, on the other hand, decentralizes data delivery and empowers business-driven D&A initiatives. Its core principles include treating Data as a Product.

Combining these approaches can lead to a scalable, flexible data architecture.

Data Fabric’s automation capabilities can enhance the efficiency of decentralized data management within a Data Mesh framework.

The extended metadata in Data Fabric can support the productization efforts of Data Mesh, ensuring that data products are enriched with comprehensive metadata.

Data and Analytics Governance Is Key to Your Success

As a final takeaway, and as the previous sections suggest, governance was a central theme in many of the sessions I attended. The modern approach to governance emphasizes a federated or cooperative model rather than a traditional centralized one. This approach aligns more closely with our business strategy and desired outcomes, rather than focusing solely on the data itself.

Governance should be driven by the business strategy and desired outcomes. By focusing on what we aim to achieve as an organization, we can ensure that our governance efforts are relevant and impactful.

Author: Xavier GROSFILS, COO at AKABI


Enhancing Real-Time Data Processing with Databricks: Apache Kafka vs. Apache Pulsar

May 28, 2024

Analytics Data Integration Microsoft Azure


In the era of big data, real-time data processing is essential for organizations seeking immediate insights and the ability to respond swiftly to changing market conditions. Apache Kafka and Apache Pulsar are two of the most popular platforms for managing streaming data. Their integration with Databricks, a powerful analytics platform built on Apache Spark, enhances these capabilities, providing robust solutions for real-time data management. This article explores the features of Kafka and Pulsar, compares their strengths, and provides guidance on which to choose based on specific use cases.

Apache Kafka: A Standard in Data Streaming

Apache Kafka is a distributed streaming platform originally developed by LinkedIn and later donated to the Apache Software Foundation. Kafka’s architecture is based on a distributed log, where data is written to “topics” divided into partitions.

Topics act as categories for data streams, while partitions are the individual logs that store records sequentially. This division allows Kafka to scale horizontally, enabling high throughput and parallel processing. Each partition can also be replicated across multiple brokers, which provides fault tolerance and maintains data integrity and availability.
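To make the topic and partition idea more concrete, here is a minimal Python sketch of how keyed records could be spread across partitions while keeping per-key ordering. It is an illustration only: the function names are hypothetical, and CRC32 is used as a stand-in hash, whereas Kafka's own default partitioner relies on a different hash function (murmur2).

```python
import zlib
from collections import defaultdict


def assign_partition(key: str, num_partitions: int) -> int:
    """Map a record key to a partition index with a stable hash.

    Assumption: CRC32 is a stand-in; Kafka's default partitioner uses
    murmur2, so the actual indices in a real cluster would differ.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def append_records(records, num_partitions):
    """Append (key, value) records to per-partition logs in arrival order."""
    logs = defaultdict(list)
    for key, value in records:
        logs[assign_partition(key, num_partitions)].append((key, value))
    return dict(logs)


# Hypothetical events: all records sharing a key land in the same partition,
# so their relative order is preserved inside that partition's log.
records = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "click"),
    ("user-1", "logout"),
]
logs = append_records(records, num_partitions=3)
```

Because the hash is deterministic, every event for `user-1` ends up in one log, which is why ordering guarantees in this model hold per partition rather than per topic.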

Kafka excels in scenarios that require rapid ingestion and real-time processing of large volumes of data. Its ecosystem includes Kafka Streams for stream processing, Kafka Connect for integrating various data sources, and KSQL for querying data streams with SQL-like syntax. These features make Kafka ideal for applications such as monitoring, log aggregation, and real-time analytics.

Key Features of Kafka:

  • High Throughput and Low Latency: Capable of handling millions of messages per second with minimal delay, making it suitable for applications that require quick data processing.
  • Durable Storage: Messages can be stored for a configurable retention period, allowing for replay and historical analysis.
  • Mature Ecosystem: Includes robust tools for stream processing, data integration, and real-time querying.

Apache Pulsar: The Next Generation of Streaming

Apache Pulsar is a distributed messaging and streaming platform developed by Yahoo and now managed by the Apache Software Foundation. Pulsar’s architecture separates message delivery from storage using a two-tier system comprising Brokers and BookKeeper nodes. This design enhances flexibility and scalability.

Brokers handle the reception and delivery of messages, while BookKeeper nodes manage persistent storage. ZooKeeper plays a crucial role in this architecture by coordinating the metadata and configuration management. This separation allows Pulsar to scale storage independently from message handling, providing efficient resource management and improved fault tolerance. Brokers ensure smooth data flow, BookKeeper nodes ensure data durability, and ZooKeeper maintains system coordination and consistency.

Pulsar supports advanced features such as multi-tenancy, geographic replication, and transaction support. Its multi-tenant capabilities allow multiple teams to share the same infrastructure without interference, making Pulsar suitable for complex, large-scale applications. Additionally, Pulsar supports various APIs and protocols, facilitating seamless integration with different systems.

Key Features of Pulsar:

  • Multi-Tenancy: Supports multiple tenants with resource isolation and quotas, providing efficient resource management.
  • Advanced Features: Includes geographic replication for data availability across data centers and transaction support for consistent message delivery.
  • Flexible Integrations: Supports various APIs and protocols, enabling easy integration with different systems.
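Multi-tenancy is visible in the way Pulsar names topics: a fully qualified topic follows the pattern `persistent://tenant/namespace/topic`. The small Python sketch below parses such names to show how tenant and namespace become part of every topic's identity. The helper names and the example tenants are hypothetical, and the parser is deliberately simplified.

```python
from typing import NamedTuple


class TopicName(NamedTuple):
    domain: str      # "persistent" or "non-persistent"
    tenant: str
    namespace: str
    topic: str


def parse_topic(full_name: str) -> TopicName:
    """Split a fully qualified Pulsar topic name into its components."""
    domain, sep, rest = full_name.partition("://")
    if not sep or domain not in ("persistent", "non-persistent"):
        raise ValueError(f"not a fully qualified topic name: {full_name!r}")
    parts = rest.split("/", 2)  # tenant, namespace, remaining topic name
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic in {full_name!r}")
    return TopicName(domain, *parts)


def same_tenant(a: str, b: str) -> bool:
    """True when two topics belong to the same tenant."""
    return parse_topic(a).tenant == parse_topic(b).tenant


# Hypothetical tenants sharing one cluster without sharing topics.
clicks = parse_topic("persistent://marketing/campaigns/clicks")
```

Quotas and isolation policies are then attached at the tenant and namespace level, which is what lets several teams share the same brokers.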

Comparing Apache Kafka and Apache Pulsar

While both Kafka and Pulsar are designed for real-time data streaming, they have distinct characteristics that may make one more suitable than the other depending on specific use cases.

Performance and Scalability: Kafka is known for its high throughput and low latency, making it ideal for applications requiring rapid data ingestion and processing. It is well-suited for high-performance use cases where low latency is critical. Pulsar, on the other hand, offers similar performance levels but excels in scenarios requiring multi-tenancy and seamless scaling. Its architecture separating compute and storage makes Pulsar preferable for applications needing flexible scaling and multi-tenant support.

Architecture and Flexibility: Kafka uses a simpler, monolithic architecture which can be easier to deploy and manage for straightforward use cases. This simplicity can be advantageous for quick and efficient setup. In contrast, Pulsar’s two-tier architecture provides more flexibility, especially for applications requiring geographic replication and fine-grained resource management. Pulsar is better suited for complex architectures needing advanced features.

Feature Set: Kafka’s extensive ecosystem, including tools like Kafka Streams, Kafka Connect, and KSQL, makes it a comprehensive solution for stream processing and real-time querying. This makes Kafka ideal for use cases that leverage its mature set of tools. Pulsar includes advanced features like native multi-tenancy, message replication across data centers, and built-in transaction support. These features make Pulsar preferable for applications requiring sophisticated capabilities.

Community and Ecosystem: Kafka has a larger and more mature ecosystem with widespread adoption across various industries, making it a safer bet for long-term projects needing extensive community support. Pulsar, while rapidly growing, offers cutting-edge features particularly appealing for cloud-native and multi-cloud environments. Pulsar is more appropriate for modern, cloud-native applications.

Integration with Databricks

Databricks, built on Apache Spark, leverages both Kafka and Pulsar to provide powerful and scalable real-time data processing capabilities. Here’s how these integrations enhance Databricks:

Databricks offers built-in connectors for reading data from and writing data to Kafka, enabling users to build real-time data pipelines with Spark Structured Streaming. This facilitates the transformation and analysis of data streams in real time.
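As a rough illustration, the Kafka source in Spark Structured Streaming is configured through a handful of options. The sketch below only builds that option dictionary in plain Python so it can run anywhere; the broker addresses, the topic name, and the helper function are assumptions, and the commented lines show how the options would typically be used on a cluster where a `spark` session exists.

```python
def kafka_source_options(bootstrap_servers, topic, starting_offsets="latest"):
    """Build the options for Spark's Kafka source (hypothetical helper)."""
    return {
        "kafka.bootstrap.servers": ",".join(bootstrap_servers),
        "subscribe": topic,
        "startingOffsets": starting_offsets,
    }


# Hypothetical brokers and topic, for illustration only.
opts = kafka_source_options(
    ["broker-1:9092", "broker-2:9092"], "orders", starting_offsets="earliest"
)

# On Databricks, where a SparkSession named `spark` is available:
# stream = spark.readStream.format("kafka").options(**opts).load()
# events = stream.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
```

Keys and values arrive as binary columns, which is why the commented example casts them to strings before any further transformation.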

Similarly, Databricks supports Apache Pulsar, allowing for real-time data streaming with exactly-once processing semantics. Pulsar’s features such as geographic replication and transaction support enhance the resilience and reliability of streaming applications on Databricks.

Benefits of Integration

Integrating Kafka and Pulsar with Databricks provides several benefits. The scalability of both platforms allows for handling large volumes of real-time data without compromising performance. Pulsar’s multi-tenant capabilities and Kafka’s extensive features provide flexible integration tailored to specific business needs. Databricks also offers robust tools for access management and data governance, enhancing the security and reliability of streaming solutions.

Conclusion

Integrating Kafka and Pulsar with Databricks allows organizations to leverage leading streaming technologies to build efficient and scalable real-time data pipelines. By combining the power of Spark with Kafka’s resilience and Pulsar’s flexibility, Databricks provides a robust platform to meet the growing needs of real-time data processing.

For high-speed, low-latency applications, Kafka is the preferred choice. For complex, multi-tenant environments requiring advanced features like geographic replication and transaction support, Pulsar is more suitable.

Author: Pierre-Yves RICHER, Data Engineering Practice Leader at AKABI


Getting started with the new Power BI Visual Calculations feature!

March 25, 2024

Analytics Business Intelligence


Power BI’s latest feature release, Visual Calculations, represents a paradigm shift in how users interact with data.     

Rolled out in February 2024 as a preview, this groundbreaking addition enables users to craft dynamic calculations directly within visuals. It opens up a new era of simplicity, flexibility and power in data analysis.

Visual Calculations are different from traditional calculation methods in Power BI. They are linked to specific visuals instead of being stored within the model. This simplifies the creation process and improves maintenance and performance. Visual Calculations allow users to generate complex calculations seamlessly, without the challenges of filter context and model intricacies.

This article explores Visual Calculations, including their types, applications, and transformative impact for Power BI users. Visual Calculations can revolutionize the data analysis landscape within Power BI by simplifying DAX complexities and enhancing data interaction.

Enable visual calculations

To enable this preview feature, navigate to Options and Settings > Options > Preview features and select Visual calculations. After restarting the tool, Visual Calculations will be enabled.

Adding a visual calculation

To add a visual calculation, select a visual and then select the New calculation button in the ribbon.

The visual calculations window becomes accessible when you enter Edit mode. Within the Edit mode interface, you’ll encounter three primary sections, arranged sequentially from top to bottom:

  • The visual preview which shows the visual you’re working with
  • A formula bar where you can define your visual calculation
  • The visual matrix which shows the data used for the visual, and displays the results of visual calculations as you add them

To create a visual calculation, simply input the expression into the formula bar. For instance, within a visual displaying Net Sales by Year, you can add a visual calculation to determine the running total by entering:

Running total = RUNNINGSUM([Net Sales])

As you add visual calculations, they appear in the list of fields on the visual, and their results are displayed on the visual itself.

Without visual calculations, it’s a bit more complex: you must combine several DAX functions to get the same result. The DAX equivalent at model level would be the following formula:

Running total (model level) = 
VAR MaxDate = MAX('Order Date'[Date])
RETURN
    CALCULATE(
        SUM('Fact Orders'[Net Sales]),
        'Order Date'[Date] <= MaxDate,
        ALL('Order Date')
    )
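
For readers less familiar with DAX, the arithmetic behind both formulas can be sketched in a few lines of Python. This is not how Power BI evaluates the expression; it simply shows the cumulative sum over rows that are already ordered by year. The sales figures are made up for illustration.

```python
from itertools import accumulate


def running_total(rows):
    """Return (label, cumulative value) pairs for rows sorted by label."""
    labels = [label for label, _ in rows]
    totals = accumulate(value for _, value in rows)
    return list(zip(labels, totals))


# Hypothetical Net Sales by Year, already in chronological order.
net_sales_by_year = [("2021", 120.0), ("2022", 150.0), ("2023", 90.0)]
result = running_total(net_sales_by_year)
# result == [("2021", 120.0), ("2022", 270.0), ("2023", 360.0)]
```

The visual calculation works on exactly this kind of small, already aggregated table, which is why it needs no explicit filter manipulation.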


Use fields for calculations without showing them in the visual

In Edit mode for visual calculations, you can hide fields from the visual. For example, if you want to show only the running total visual calculation, you can hide Net Sales from the view.

Hiding fields doesn’t remove them from the visual, so your visual calculations can still refer to them and continue to work. A hidden field still appears in the visual matrix but not in the resulting visual. It’s a very good idea from Microsoft, and a very practical one! As a good practice, we recommend including hidden fields only if they are necessary for your visual calculations to work.

Templates available for common scenarios

To start with, several templates are already available, covering the most common scenarios:

  • Running sum: Calculates the sum of values, adding the current value to the preceding values. Uses the RUNNINGSUM function.
  • Moving average: Calculates an average of a set of values in a given window by dividing the sum of the values by the size of the window. Uses the MOVINGAVERAGE function.
  • Percent of parent: Calculates the percentage of a value relative to its parent. Uses the COLLAPSE function.
  • Percent of grand total: Calculates the percentage of a value relative to all values, using the COLLAPSEALL function.
  • Average of children: Calculates the average value of the set of child values. Uses the EXPAND function.
  • Versus previous: Compares a value to a preceding value, using the PREVIOUS function.
  • Versus next: Compares a value to a subsequent value, using the NEXT function.
  • Versus first: Compares a value to the first value, using the FIRST function.
  • Versus last: Compares a value to the last value, using the LAST function.
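
To give a feel for what some of these templates compute, here is a small Python sketch of three of them: moving average, percent of grand total, and versus previous. It mirrors the arithmetic only and does not use any Power BI API. One assumption to flag: the moving average below uses a trailing window that shrinks at the start of the series, and the input values are invented.

```python
def moving_average(values, window):
    """Average of the current value and up to window - 1 preceding values."""
    result = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


def percent_of_grand_total(values):
    """Share of each value in the total; assumes a non-zero total."""
    total = sum(values)
    if total == 0:
        raise ValueError("grand total is zero")
    return [value / total for value in values]


def versus_previous(values):
    """Difference with the preceding value; None for the first row."""
    if not values:
        return []
    return [None] + [cur - prev for prev, cur in zip(values, values[1:])]


# Hypothetical aggregated values, one per row of the visual matrix.
values = [100.0, 200.0, 300.0, 400.0]
averages = moving_average(values, window=2)   # [100.0, 150.0, 250.0, 350.0]
shares = percent_of_grand_total(values)       # [0.1, 0.2, 0.3, 0.4]
deltas = versus_previous(values)              # [None, 100.0, 100.0, 100.0]
```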

Conclusion

Visual Calculations bridge the gap between calculated columns and measures, offering the simplicity of context from calculated columns and the dynamic calculation flexibility of measures. Visual Calculations offer improved performance compared to detail-level measures when operating on aggregated data within visuals. They can refer directly to visual structure, providing users with unprecedented flexibility in data analysis.

This new feature will be very useful for those who are new to Power BI and for whom the DAX language can be a real challenge. It will simplify some calculation scenarios!

Author: Vincent HERMAL, Data Analytics Practice Leader at AKABI


Revolutionizing Data Engineering: The Power of Databricks’ Delta Live Tables and Unity Catalog

February 20, 2024

Business Intelligence Data Integration Microsoft Azure

Read in 5 minutes

Databricks has emerged as a pivotal platform in the data engineering landscape, offering a comprehensive suite of tools designed to tackle the complexities of data processing, analytics, and machine learning at scale. Among its innovative offerings, Delta Live Tables (DLT) and Unity Catalog stand out as transformative features that significantly enhance the efficiency and reliability of data pipelines. This article delves into these concepts, elucidating their functionalities, benefits, and their particular relevance to data engineers.

Delta Live Tables (DLT): Revolutionizing Data Pipelines

Delta Live Tables is an ETL framework built on top of Databricks, designed to streamline the development and maintenance of data pipelines. With DLT, data engineers can define declarative pipelines that automatically manage complex data transformations, dependencies, and error handling. This high-level abstraction allows engineers to focus on business logic and data transformations rather than the operational complexities of pipeline orchestration.


Key Features and Advantages:

  • Declarative Syntax: DLT allows data engineers to define transformations using SQL or Python, specifying what the data should look like rather than how to achieve it. This declarative approach simplifies pipeline development and maintenance.
  • Automated Error Handling: DLT provides robust error handling mechanisms, including automatic retries, dead-letter queues for unprocessable messages, and detailed error logging. This reduces the time data engineers spend on debugging and fixing pipeline issues.
  • Data Quality Controls: With DLT, data engineers can embed data quality checks directly into their pipelines, ensuring that data meets specified quality constraints before it moves downstream. This built-in validation mechanism enhances data reliability and trustworthiness.
  • Live Tables: DLT continuously monitors for new data and incrementally updates its outputs, ensuring that downstream users and applications always have access to fresh, high-quality data. This real-time processing capability is crucial for time-sensitive analytics and decision-making.
  • Change Data Capture (CDC): DLT supports the capture of changes made to source data, enabling seamless and efficient integration of updates into data pipelines. This feature ensures that data reflects the latest changes, crucial for accurate analytics and real-time reporting.
  • Historical and Live Views: Data engineers can create views that either maintain a history of data changes or display the most current data. This allows users to access data snapshots over time or see the present state of data, thereby facilitating thorough analysis and informed decision-making.
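
The data quality controls mentioned above are declared in DLT as expectations, for instance with the Python decorators `dlt.expect`, `dlt.expect_or_drop`, and `dlt.expect_or_fail`. The sketch below does not call that API, since it only runs inside a DLT pipeline. Instead, it imitates the three behaviours (keep and count, drop, fail) in plain Python with a hypothetical helper and made-up rows.

```python
def apply_expectation(rows, predicate, on_violation="warn"):
    """Check rows against a constraint and return (kept_rows, violations).

    on_violation: "warn" keeps every row, "drop" removes failing rows,
    "fail" raises as soon as at least one row violates the constraint.
    """
    if on_violation not in ("warn", "drop", "fail"):
        raise ValueError(f"unknown policy: {on_violation!r}")
    failed = [row for row in rows if not predicate(row)]
    if on_violation == "fail" and failed:
        raise ValueError(f"{len(failed)} row(s) violate the constraint")
    if on_violation == "drop":
        return [row for row in rows if predicate(row)], len(failed)
    return list(rows), len(failed)


# Hypothetical records: amounts are expected to be non-negative.
rows = [{"id": 1, "amount": 10}, {"id": 2, "amount": -5}, {"id": 3, "amount": 7}]


def non_negative(row):
    return row["amount"] >= 0


kept, violations = apply_expectation(rows, non_negative, on_violation="drop")
```

Choosing between the three policies is a trade-off between completeness and strictness: warnings preserve all data, drops protect downstream consumers, and failures stop the update until the source is fixed.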

Unity Catalog: Centralizing Data Governance

Unity Catalog enhances Databricks by introducing a unified governance framework for all data and AI assets in the Lakehouse, centralizing metadata management, access control, and auditing to streamline data governance and security at scale.

A data catalog acts as an organized inventory for an organization’s data assets, providing metadata, usage, and source information to facilitate data discovery and management. Unity Catalog realizes this by integrating with the Databricks Lakehouse, offering not just a cataloging function but also a unified approach to governance. This ensures consistent security policies, simplifies data access management, and supports comprehensive auditing, helping organizations navigate their data landscape more efficiently and in compliance with regulatory requirements.

Key Features and Advantages:

  • Unified Metadata Management: Unity Catalog consolidates metadata across various data assets, including tables, files, and machine learning models, providing a single source of truth for data governance.
  • Fine-grained Access Control: With Unity Catalog, data engineers can define precise access controls at the column, row, and table levels, ensuring that sensitive data is adequately protected and compliance requirements are met.
  • Cross-Service Policy Enforcement: Unity Catalog applies consistent governance policies across different Databricks workspaces and services, ensuring uniform security and compliance posture across the data landscape.
  • Data Discovery and Lineage: It facilitates easy discovery of data assets and provides comprehensive lineage information, enabling data engineers to understand data origins, transformations, and dependencies. This transparency is vital for troubleshooting, impact analysis, and compliance auditing.
  • Auditing: This feature tracks data interactions, offering insights into user activities and changes within the Databricks environment. This facilitates compliance and security by providing a detailed audit trail for accountability and analysis.
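
To illustrate what row-level and column-level controls mean in practice, here is a plain-Python sketch. It is not Unity Catalog code: in Unity Catalog these rules are declared as row filters and column masks on tables, whereas the function, group names, and customer records below are all hypothetical.

```python
def read_with_policy(rows, user_groups, row_filter, masked_columns, unmask_group):
    """Return the rows a user may see, with sensitive columns masked."""
    can_unmask = unmask_group in user_groups
    visible = []
    for row in rows:
        if not row_filter(row, user_groups):
            continue  # row-level control: the user never sees this row
        visible.append({
            # column-level control: mask values unless the user may unmask
            col: value if (col not in masked_columns or can_unmask) else "***"
            for col, value in row.items()
        })
    return visible


def country_filter(row, groups):
    """Hypothetical rule: analysts only see rows for their own country."""
    return f"{row['country'].lower()}_analysts" in groups


customers = [
    {"name": "Alice", "country": "BE", "email": "alice@example.com"},
    {"name": "Bob", "country": "FR", "email": "bob@example.com"},
]

view = read_with_policy(
    customers, {"be_analysts"}, country_filter, {"email"}, "pii_readers"
)
```

The point of centralizing such rules in a catalog is that they are defined once, next to the data, rather than re-implemented in each report or notebook.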

Integration: Synergy Between DLT and Unity Catalog

The integration of Delta Live Tables and Unity Catalog within Databricks provides a cohesive and powerful environment for data engineering. DLT’s streamlined pipeline management, combined with Unity Catalog’s robust governance framework, offers a comprehensive solution for building, managing, and securing data pipelines at scale.

  • Enhanced Data Reliability: DLT’s real-time processing and data quality checks, coupled with Unity Catalog’s governance capabilities, ensure that data pipelines produce accurate, reliable, and compliant data outputs.
  • Increased Productivity: The declarative nature of DLT and the centralized governance of Unity Catalog reduce the complexity and overhead associated with data pipeline development and management, allowing data engineers to focus on delivering value.
  • Scalability and Flexibility: Both DLT and Unity Catalog are designed to scale with the needs of the business, accommodating large volumes of data and complex data transformations without sacrificing performance or manageability.

Conclusion: Empowering Data Engineers

For data engineers, the combination of Delta Live Tables and Unity Catalog within Databricks represents a significant leap forward in terms of productivity, data quality, and governance. By abstracting away the complexities of pipeline development and data management, these features allow engineers to concentrate on solving business problems through data. The result is a more efficient, reliable, and secure data infrastructure that can drive insights and innovation at scale. As the data landscape continues to evolve, tools like DLT and Unity Catalog will be indispensable in empowering data engineers to meet the challenges of tomorrow.

It’s important to note that, although Delta Live Tables (DLT) and Unity Catalog are designed to work together seamlessly within the Databricks environment, it’s perfectly viable to pair DLT with a different data cataloging system. This versatility allows organizations to take advantage of DLT’s sophisticated capabilities for automating and managing data pipelines while still utilizing another data catalog that may align more closely with their existing infrastructure or specific needs. Databricks supports this flexible data management strategy, enabling businesses to leverage DLT’s real-time processing and data quality enhancements without being restricted to using only Unity Catalog.

As we explore the horizon of technological innovation, it’s evident that the future is unfolding before us. Engaging with the latest advancements in data management and governance is more than just keeping pace; it’s about seizing the opportunity to redefine how we interact with the vast universe of data. The moment has come to embrace these new possibilities, leveraging their power to drive forward our data-centric initiatives.

Author: Pierre-Yves RICHER, Data Engineering Practice Leader at AKABI


AKABI’s Consultants Share Insights from Dataminds Connect 2023

November 20, 2023

Analytics Business Intelligence Data Integration Event Microsoft Azure

Read in 5 minutes

Dataminds Connect 2023, a two-day event held in the charming city of Mechelen, Belgium, has proven to be a cornerstone event for IT professionals and Microsoft data platform enthusiasts. Partly sponsored by AKABI, it brings together professionals and experts who share their knowledge and insights on data.

With a special focus on the Microsoft Data Platform, Dataminds Connect has become a renowned destination for those seeking the latest advancements and best practices in the world of data. We were privileged to have some of our consultants attend this exceptional event and we’re delighted to share their valuable feedback and takeaways.

How to Avoid Data Silos – Reid Havens

In his presentation, Reid Havens emphasized the importance of avoiding data silos in self-service analytics. He stressed the need for providing end users with properly documented datasets, making usability a top priority. He suggested using Tabular Editor to hide fields or make them private to prevent advanced users from accessing data not meant for self-made reports. Havens’ insights provided a practical guide to maintaining data integrity and accessibility within the organization.

Context Transition in DAX – Nico Jacobs

Nico Jacobs took on the complex challenge of explaining the concept of “context” and circular dependencies within DAX. He highlighted that while anyone can work with DAX, not everyone can understand its reasoning. Jacobs’ well-structured presentation made it clear how context influences DAX and its powerful capabilities. Attendees left the session with a deeper understanding of this essential language.

Data Modeling for Experts with Power BI – Marc Lelijveld

Marc Lelijveld’s expertise in data modeling was on full display as he delved into various data architecture choices within Power BI. He effortlessly navigated topics such as cache, automatic and manual refresh, Import and Dual modes, Direct Lake, Live Connection, and Wholesale. Lelijveld’s ability to simplify complex concepts made it easier for professionals to approach new datasets with confidence.

Breaking the Language Barrier in Power BI – Daan Lambrechts

Daan Lambrechts addressed the challenge of multilingual reporting in Power BI. While the tool may not inherently support multilingual reporting, Lambrechts showcased how to implement dynamic translation mechanisms within Power BI reports using a combination of Power BI features and external tools like Metadata Translator. His practical, step-by-step live demo left the audience with a clear understanding of how to meet the common requirement of multilingual reporting for international and multilingual companies.

Lessons Learned: Governance and Adoption for Power BI – Paulien van Eijk & Teske van Maaren

This enlightening session focused on the (re)governance and (re)adoption of Power BI within organizations where Power BI is already in use, often with limited governance and adoption. Paulien van Eijk and Teske van Maaren explored various paths to success and highlighted key concepts to consider:

  • Practices: Clear and transparent guidance and control on what actions are permitted, why, and how.
  • Content Ownership: Managing and owning the content in Power BI.
  • Enablement: Empowering users to leverage Power BI for data-driven decisions.
  • Help and Support: Establishing a support system with training, various levels of support, and community.

Power BI Hidden Gems – Adam Saxton & Patrick Leblanc

Attending Adam Saxton and Patrick Leblanc’s “Power BI Hidden Gems” session was a truly enlightening experience. These YouTube experts presented topics such as query folding, preferring Dual to Import mode, model properties (discouraging implicit measures), semantic link, Deneb, and incremental refresh in a clear and engaging manner. Their presentation style made even the most intricate aspects of Power BI accessible and easy to grasp. The quality of the presentation, a hallmark of experienced YouTubers, made the learning experience both enjoyable and informative.

The Combined Power of Microsoft Fabric for Data Engineer, Data Analyst and Data Governance Manager – Ioana Bouariu, Emilie Rønning and Marthe Moengen

I had the opportunity to attend the session entitled “The Combined Power of Microsoft Fabric for Data Engineer, Data Analyst, and Data Governance Manager”. The speakers adeptly showcased the collaborative potential of Microsoft Fabric, illustrating its newfound relevance in our evolving data landscape. The presentation effectively highlighted the seamless collaboration facilitated by Microsoft Fabric among data engineering, analysis, and governance roles. In our environment, where these roles can be embodied by distinct teams or even a single versatile individual, Microsoft Fabric emerges as a unifying force. Its adaptability addresses the needs of diverse profiles, making it an asset for both specialized teams and agile individuals. Its potential promises to open exciting new perspectives for the future of data management.

Behind the Hype, Architecture Trends in Data – Simon Whiteley

I thoroughly enjoyed Simon Whiteley’s seminar on the impact of hype in technology trends. He offered valuable insights into critically evaluating emerging technologies, highlighting their journey from experimentation to maturity through Gartner’s hype curve model.

Simon’s discussion on attitudes towards new ideas, the significance of healthy skepticism, and considerations for risk tolerance was enlightening. The conclusion addressed the irony of consultants cautioning against overselling ideas, emphasizing the importance of skepticism. The section on trade-offs in adopting new technologies provided practical insights, especially in balancing risk and fostering innovation.

In summary, the seminar provided a comprehensive understanding of technology hype, offering practical considerations for navigating the evolving landscape. Simon’s expertise and engaging presentation style made it a highly enriching experience.

In Conclusion

Dataminds Connect 2023 was indeed a remarkable event that provided valuable insights into the world of data. We want to extend our sincere gratitude to the organizers for putting together such an informative and well-executed event. The knowledge and experiences gained here will undoubtedly contribute to our continuous growth and success in the field. We look forward to being part of the next edition and the opportunity to continue learning and sharing our expertise with the data community. See you next year!

Vincent Hermal, Azure Data Analytics Practice Leader
Pierre-Yves Richer, Azure Data Engineering Practice Leader
with the invaluable participation of Sophie Opsommer, Ethan Pisvin, Pierre-Yves Outlet and Arno Jeanjot

DP-500 : How to successfully pass the exam?

January 27, 2023

Analytics Microsoft Azure

Are you looking to earn the Microsoft certification DP-500: Designing and Implementing Enterprise-Scale Analytics Solutions Using Microsoft Azure and Microsoft Power BI? If so, you’re not alone! This certification is highly sought after by professionals looking to advance their careers in the field of data analytics. In this LinkedIn article, we’ll provide some expert tips to help you prepare for and pass this important certification exam.

First, let’s start by looking at what this certification covers. The DP-500 certification is geared towards professionals who are responsible for designing and implementing large-scale analytics solutions using Microsoft Azure Synapse Analytics and Microsoft Power BI. This includes tasks such as designing data pipelines, managing data storage, and creating dashboards and reports for business users.

To prepare for the DP-500 exam, it’s important to have a strong understanding of the following topics:

Microsoft Azure: This includes knowledge of Azure data storage options (such as Azure SQL Database and Azure Data Lake), as well as Azure data processing and analytics tools. You’ll also need to be familiar with Microsoft Purview.

Microsoft Power BI: This includes knowledge of Power BI desktop and online, as well as how to design and publish reports and dashboards using Power BI. You’ll also need to be familiar with Power BI data modeling and visualization techniques.

Data management and data governance: You’ll need to understand how to manage data at scale, including tasks such as data cleansing, data transformation, and data security.

Data visualization: You’ll need to be able to design data visualizations that communicate insights effectively to business users.

Some advice from one of our consultants

It is understandable to feel anxious or unsure about your chances of success on the DP-500 exam, especially if you have no previous experience with Azure Synapse Analytics and Microsoft Purview. Before preparing for the exam, I had never used either of those tools. Both are important technologies covered on the exam, so you may need to spend additional time studying and gaining familiarity with them in order to be fully prepared.

Four weeks of study is a reasonable amount of time to prepare for the exam, as long as you use your study time effectively and focus on the most important exam objectives.

So, what can you do to prepare for the DP-500 exam? Here are a few tips:

Use Microsoft’s official certification training materials: These materials are designed specifically to help you prepare for the DP-500 exam and are a great place to start.

Take online courses: There are many online courses available that can help you deepen your understanding of the topics covered on the DP-500 exam. One website that you might find helpful is Datamozart. This website offers a range of courses and resources for data professionals, including those preparing for the DP-500 exam.

Watch YouTube videos: There are many YouTube channels that offer helpful content for those preparing for the DP-500 exam. One channel that you might find particularly useful is Azure Synapse Analytics. This channel offers a range of videos on topics related to Azure Synapse Analytics, which is a key tool covered on the DP-500 exam.

Get insights from experts: Consider reaching out to experts in the field for advice on how to prepare for the DP-500 exam. Two Data Platform MVPs, Andy Cutler and Nikola Ilic, are known for their great explanations and insights on data platform topics. You might find it helpful to follow their blogs or watch their videos for additional guidance on preparing for the DP-500 exam.

Practice with sample questions: It is understandable to look for sample questions when preparing for the DP-500 exam. However, their quality and reliability can vary greatly. Some sample questions may not accurately reflect the content or difficulty level of the actual exam, so they should not be your sole source of preparation. Examtopic is a great website that provides information and resources for various IT certification exams. When I studied for the exam, the site did not contain any practice questions, but sample questions are now available there. They will probably help you a lot.

Gain hands-on experience: There’s no substitute for real-world experience when it comes to preparing for the DP-500 exam. Try working on projects using Azure and Power BI to get a feel for how these tools

I wish you the best of luck as you prepare for the DP-500 exam. Remember to stay focused, stay motivated, and keep up with your studies. With hard work and dedication, you can succeed on the exam and achieve your certification goals.

Power BI and QlikView Comparison

September 28, 2022

Business Intelligence

When we talk about Business Intelligence and Data Visualization, there are three leaders on the market today: Power BI, Qlik (QlikView & Qlik Sense) and Tableau.

Power BI was developed by Microsoft in 2010 as Project Crescent. The first version was released in 2011. Microsoft later renamed it Power BI and released it under that name in 2015. Today, Power BI is a leader among the Business Intelligence tools on the market.

QlikView was first called QuikView in 1994 (long before Microsoft's product) but was renamed in 1996. When QlikView was created, its main function was to provide in-depth analysis of data collected from different systems. It later evolved into Business Intelligence software and now enjoys top marks from several software review sites.

In this article, we provide an overall comparison of Power BI and QlikView across different criteria, based on field feedback (migrations from QlikView to Power BI, for example) and documentation.

Data Sources

Regarding connectivity, both tools allow extensive access to different types of data, whether located on-premises or in the cloud. However, QlikView requires its connectors to be downloaded and installed before they can be used. Overall, in both cases, connecting to data sources is not a major problem.

Filter, Slicer and Selection

One of QlikView’s strengths is certainly its associative experience and its filter management. We can make a selection and apply a filter with a simple click on any value shown in any visual. This is called cross filtering: all the visuals adapt to the interaction we have with another visual.

This feature is also available in Power BI, but unfortunately it only works within one tab. If we want our selection to persist across tabs, we have to use slicers and synchronize them. For QlikView users, this intermediate step can be less intuitive and more costly in terms of navigation.

Unlike query-based BI tools, QlikView uses in-memory associative technology: when users select a data point, no query is triggered. Instead, all other fields are instantly filtered and grouped based on the user’s selection. Selected values appear in green, values related to the selection appear in white, and values unrelated to the selection appear in gray. This gives users a tool that is both intuitive and user-friendly for browsing data and searching for information related to their activities.

Power BI, on the other hand, does not directly allow us to display data that is not linked to our selection.
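To make the three QlikView selection states concrete, here is a toy Python model. It is only an illustration and not how Qlik's engine is implemented: the sample rows, the function name `selection_states` and the state labels are all assumptions made for this sketch. As a simplification, unselected values of the selected field are simply marked as excluded.

```python
# Toy data: each row associates a country with a product (hypothetical values).
rows = [
    {"Country": "Belgium", "Product": "Bikes"},
    {"Country": "Belgium", "Product": "Helmets"},
    {"Country": "France", "Product": "Bikes"},
    {"Country": "France", "Product": "Locks"},
]


def selection_states(rows, field, value):
    """Classify every (field, value) pair after selecting one value.

    "selected"   -> shown in green
    "associated" -> shown in white (appears in a row matching the selection)
    "excluded"   -> shown in gray (never appears with the selection)
    """
    matching = [row for row in rows if row[field] == value]
    states = {}
    for row in rows:
        for name, cell in row.items():
            if name == field:
                # Simplification: other values of the selected field are excluded.
                state = "selected" if cell == value else "excluded"
            elif any(m[name] == cell for m in matching):
                state = "associated"
            else:
                state = "excluded"
            states[(name, cell)] = state
    return states


states = selection_states(rows, "Country", "Belgium")
for key in sorted(states):
    print(key, states[key])
```

In this model, selecting Belgium keeps Bikes and Helmets visible as associated values while Locks is still listed but grayed out, which is the kind of "unrelated data" view that Power BI does not offer directly.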

Data Processing and Transformation 

Both tools offer numerous features to process and transform data, but those of QlikView, thanks to its scripting language, are considered more advanced and offer more possibilities during development. However, a higher level of knowledge is required to master the QlikView language and to deliver our first models.

On the other hand, with its intuitive interface, Power BI is easy to use for less experienced people, especially those who do not have programming skills.

User Interface

Power BI is very intuitive, with drag and drop everywhere. A novice user can create visuals and dashboards very easily.

For its part, QlikView allows a more advanced level of customization than Power BI but is much less intuitive and more complex, especially for a new user.

However, if the investment is made to master QlikView and use its full potential, its highly customizable setup and its wide range of features can be a key advantage.

Price 

Power BI pricing is simple. The desktop version is free, while Power BI Pro costs less than $10 per user per month. The latest offering, Power BI Premium, provides capacity pricing that helps to optimize costs.

For QlikView, pricing is not so simple. The QlikView website offers two editions, Enterprise and Personal. While the Personal edition, for use on a personal computer, is free, the price of the Enterprise edition is only available after contacting their sales team. According to anecdotal experience, no solution can beat the cost effectiveness of Power BI, and QlikView is estimated to be 2-3 times more expensive.

Conclusion

In conclusion, Power BI and QlikView are two giants of data processing and visualization. In most cases, both tools will meet all requirements in terms of data exploration and analysis.

However, QlikView appears to be more complex and has a steeper learning curve, but offers more customization options.
Power BI, for its part, offers an easy-to-use interface that is familiar to users who know the Microsoft environment. Unlike QlikView, it is updated very often, with many interesting features added to its catalog. Its community should also be taken into consideration, since it is very active and can be a valuable source of support.

In addition, Power BI turns out to be much less expensive to acquire than QlikView, an important argument for companies today. It is certainly for these reasons that Microsoft is the leader of the BI market, ahead of QlikView.
However, Qlik has not given up, and in recent years has put the spotlight on its new flagship product, Qlik Sense, which seems to adopt many of the qualities of Power BI.

PowerBI CICD with Azure DevOps (Setting up the tenant) (2/3)

September 15, 2022

Business Intelligence

Previous post: Introduction & Implementation

Now, we need to allow the service to use the PowerBI API.

  1. Go to Settings
  2. Click on Admin portal
  3. Go to Tenant settings
  4. Then, in the Developer settings section, enable Allow service principal to use PowerBI API
Figure 5 Setting up tenant

Grant access to workspace

Now, we need to add the service as an admin of the workspace.

  1. Go to Workspace
  2. Click on the three dots and then on Workspace access
  3. Add the PowerBI service connection and give it the Admin role
  4. Finally click on Add
Figure 6 Grant access to workspace
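The same grant can also be scripted instead of clicked. The sketch below only builds the request for the Power BI REST API's Add Group User operation and does not send it. The workspace ID and the service principal's object ID are placeholder values, and the function name `build_add_admin_request` is hypothetical.

```python
import json

API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def build_add_admin_request(workspace_id, principal_object_id):
    """Return (url, json_body) that adds a service principal as workspace Admin."""
    url = f"{API_ROOT}/groups/{workspace_id}/users"
    payload = {
        # Object ID of the service principal (placeholder in this example).
        "identifier": principal_object_id,
        "groupUserAccessRight": "Admin",
        # "App" tells the API that the identifier is a service principal.
        "principalType": "App",
    }
    return url, json.dumps(payload)


url, body = build_add_admin_request("my-workspace-id", "my-principal-object-id")
print("POST", url)
print(body)
```

To actually apply it, the request would be sent as a POST with a bearer token from an account that is already an admin of the workspace.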

Adding the service as gateway owner

To be able to use the tenant created in the Azure Active Directory, we need to add the service as an owner of the gateway.

Add the extension to Azure DevOps

As explained earlier, this work is based on an extension. We need to add it to Azure DevOps.

  1. In the top right corner, click on the marketplace icon and then on Browse marketplace
  2. Search for PowerBI actions
  3. Click on Get it free
  4. Select your organization and download the extension

Creation of the service connection in Azure DevOps

To be able to use the tenant created in the Azure Active Directory, we need to create a service connection in Azure DevOps.

  1. Go to your organization settings and click on Service Connection
  2. Create a service connection by searching for Power BI Service Connection
  3. Select Service Principal and then fill in Tenant ID and Client ID with the credentials that you copied beforehand. If you did not copy them, you can find them in the Azure Active Directory under the app registration menu
Figure 7 Add service connections

Now that we have set up every tool, we can create the release pipeline.

PowerBI CICD with Azure DevOps (Introduction & Implementation) (1/3)

September 14, 2022

Business Intelligence

Introduction

The purpose of this document is to explain how to implement PowerBI CICD in Azure DevOps. This document is for those who are tired of publishing reports by hand on different environments.

To implement the solution, we used the “Power BI actions” extension which you can find here.

This document will walk you through the steps we took to implement the solution.

The extension

The PowerBI Actions extension is based on the PowerBI API created by Microsoft that you can find here. The extension allows you to automate several tasks, such as uploading or importing a PowerBI dashboard, creating a workspace, or updating a report connection. To perform these tasks, the extension needs a connection to the PowerBI service.

Implementation

To perform the following steps, you must have sufficient authorization. If you do not have sufficient authorization, you may need to contact someone who does.

Creation of the PowerBI service connection

  1. Sign in to Azure Portal
  2. Select Azure Active Directory and then App Registration
  3. Click on New Registration
Figure 1 Creation of PowerBI service connection first step
Figure 2 Creation of PowerBI service connection second step

On the next page, copy the application IDs (the Client ID and the Tenant ID) for later use.

Then we need to create a client secret:

  1. Go to Certificates & secrets
  2. Click on New client secret
  3. Add a description
  4. Click on Add
Figure 3 Creation of client secrets

Now, we need to give some permissions to the app.

  1. Go to API permissions
  2. Click on Add a permission
  3. Choose PowerBI Service
  4. Select Application permissions
  5. Check Tenant.Read.All and Tenant.ReadWrite.All
  6. Click on Add permissions
Figure 4 Add permissions

Now the app has been created and is ready to be used in PowerBI.
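As a rough illustration of how such an app registration is typically used, the sketch below builds an OAuth2 client-credentials token request from the tenant ID, client ID and client secret collected above. This is not taken from the extension itself: the function name `build_token_request` and the placeholder values are assumptions, and the request is only built, not sent.

```python
from urllib.parse import urlencode

# Scope generally used to request a token for the Power BI REST API.
POWER_BI_SCOPE = "https://analysis.windows.net/powerbi/api/.default"


def build_token_request(tenant_id, client_id, client_secret):
    """Return (url, form_body) for a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": POWER_BI_SCOPE,
    }
    # The token endpoint expects a URL-encoded form body.
    return url, urlencode(form)


# Placeholder values only; a real secret should come from a secure store.
url, body = build_token_request("my-tenant-id", "my-client-id", "my-secret")
print(url)
```

Sending this body as a POST with the content type `application/x-www-form-urlencoded` should return an access token that can then be passed as a bearer token to the PowerBI API.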

PowerBI CICD with Azure DevOps (Creation of the release pipeline) (3/3)

September 14, 2022

Business Intelligence

Previous post: Setting up the tenant

For this part, we assume that you already have a pipeline for Power BI. Moreover, the solution presented here fits our project; you may need to add or modify some steps for yours.

Deployment in DEV

Firstly, we need to add the Power BI Action task.

  1. Click on the + sign
  2. Search for Power BI Actions
  3. Add the task
Figure 8 Add a task
  1. Add the service connection
  2. Select the action Upload Power BI Report
  3. Add the name of the workspace
  4. Give the path of the folder where your reports are stored. Be careful: your report name cannot contain a dot.
  5. Check Overwrite Power BI File
Figure 9 Power BI Action: Publish
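Because a report name containing a dot is easy to miss, a small check can be run before the upload task. The following is a minimal sketch and not part of the extension: the function name `find_invalid_reports` is hypothetical, and it assumes that reports are stored as `.pbix` files directly in the folder given to the task.

```python
from pathlib import Path
import tempfile


def find_invalid_reports(folder):
    """Return the .pbix file names whose report name contains a dot."""
    invalid = []
    for path in sorted(Path(folder).glob("*.pbix")):
        # path.stem is the file name without the final ".pbix" extension.
        if "." in path.stem:
            invalid.append(path.name)
    return invalid


# Demonstration on a temporary folder with hypothetical report names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("Sales.pbix", "Sales.v2.pbix", "Finance.pbix"):
        (Path(tmp) / name).touch()
    bad = find_invalid_reports(tmp)

print(bad)
```

In a pipeline, a script step could call this function on the reports folder and fail the run when the returned list is not empty.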

Deployment in UAT

To deploy reports in UAT we use the same system, but we must add more steps because reports are published with the DEV dataset. We must therefore change it to the dataset associated with the target environment.

The first task is another Power BI Action: Publish. The configuration is identical to the one for DEV; we just change the Workspace name to UAT.

The second task gives ownership of all datasets to the service connection.

  1. Select the task Take dataset ownership
  2. Add the name of the workspace
  3. Check Update all datasets in workspace; the dataset name will then be ignored
Figure 10 Power BI Action: Take ownership

Then, we need to update the gateway.

  1. Select the task Update gateway
  2. Add the name of the workspace
  3. Fill in the name of your dataset and of the gateway. We cannot check the Update all option because the metrics report will make the job fail
Figure 11 Power BI Action: UpdateGateway

Then, we need to update the datasource.

  1. Select the task Update datasource connection
  2. Fill in the name of the dataset
  3. Choose the Datasource type
  4. Fill in the information that fits your project
Figure 13 Power BI Action: UpdateDatasource

The last two tasks must be replicated for all your reports since we can’t use the Update all option.
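For readers who want to see what the three UAT tasks described above (take ownership, update gateway, update datasource) roughly correspond to at the REST level, here is a sketch that builds the equivalent Power BI REST API calls for each dataset, in the same order. It does not describe how the extension is implemented. The extension works with workspace and dataset names, whereas the REST API expects IDs, so all IDs, servers and databases below are hypothetical placeholders, as is the function name `build_dataset_calls`. The data source type `Sql` is also an assumption for the example.

```python
API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def build_dataset_calls(workspace_id, dataset_id, gateway_id, dev_source, target_source):
    """Return the ordered (method, url, body) calls for one dataset."""
    base = f"{API_ROOT}/groups/{workspace_id}/datasets/{dataset_id}"
    update_body = {
        "updateDetails": [
            {
                # Which existing data source to change (the DEV one).
                "datasourceSelector": {
                    "datasourceType": "Sql",
                    "connectionDetails": dev_source,
                },
                # What it should point to afterwards.
                "connectionDetails": target_source,
            }
        ]
    }
    return [
        ("POST", f"{base}/Default.TakeOver", None),
        ("POST", f"{base}/Default.BindToGateway", {"gatewayObjectId": gateway_id}),
        ("POST", f"{base}/Default.UpdateDatasources", update_body),
    ]


# One entry per report's dataset, since no "update all" shortcut is used.
datasets = ["dataset-1", "dataset-2"]
dev = {"server": "dev-sql.example.com", "database": "SalesDev"}
uat = {"server": "uat-sql.example.com", "database": "SalesUat"}

plan = []
for dataset_id in datasets:
    plan.extend(build_dataset_calls("uat-workspace", dataset_id, "gateway-1", dev, uat))

for method, url, _ in plan:
    print(method, url)
```

Building the plan as plain data keeps the per-dataset repetition in one loop, which mirrors the need to replicate the gateway and datasource tasks for every report.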

Deployment in PROD

This part is the same as the UAT one; we just need to change the name of the workspace and the name of the database everywhere they appear in the different steps.

Report development

Now, when you develop a report, make sure to push it to your repository with the Data source settings set up with the DEV data source.
