Blog Archive - Find more from Datactics in our Datablogs
https://www.datactics.com/category/blog/

Datactics Snaps Up Award For Regulator-Ready Data
https://www.datactics.com/blog/datactics-snaps-up-award-for-regulator-ready-data/ (21 November 2024)

New York, Nov 21st 2024 

Datactics has secured the ‘Most Innovative Technology for Regulatory Compliance’ award at this year’s A-Team Group RegTech Insight Awards USA.

The Belfast-based firm, which has made its name specializing in innovative solutions to challenging data problems, developed its Data Readiness solution in response to the prevalence of data-driven regulations across both sides of the Atlantic. 

Data Readiness grew out of Datactics’ close work with banking customers in the UK, primarily around data used to identify and report on customer deposits in line with stringent regulation from the Financial Conduct Authority (FCA) and Prudential Regulation Authority (PRA).

In 2024, Datactics developed Data Readiness as a defined solution for measuring and preparing data against specific regulatory standards, offering it with a range of deployment options, including as-a-service.

Behind the solution

Matt Flenley, Head of Marketing at Datactics, noted, “This award is brilliant news for all our talented developers and engineers back home. We’ve a long record of working alongside customers to help them with specific problems they’re encountering, and for which the risk of bad data affecting their ability to demonstrate compliance is often significant.

“With our experience in developing rule-sets for customers seeking to comply with UK depositor protection regulations, alongside international standards such as BCBS 239, we felt the time was right to offer this solution as its own piece of regulatory compliance technology.

“We’d like to thank the A-Team panel and all those who saw fit to recognise the approach we’ve taken here – thank you all so much!” 

Angela Wilbraham, CEO at A-Team Group, and host of the 4th annual RegTech Insight Awards USA 2024, commented, “Congratulations to Datactics for winning the Most Innovative Technology for Regulatory Compliance award in this year’s A-Team Group RegTech Insight Awards USA 2024.

“These awards celebrate providers of leading RegTech solutions, services and consultancy and are uniquely designed to recognise both start-up and established providers who are creatively finding solutions to help with regulatory challenges, and span a wide range of regulatory requirements.

“Our congratulations for their achievement in winning this award in a highly competitive contest.” 

For more on Datactics, visit www.datactics.com/get-data-readiness 

For more on A-Team Group, visit https://a-teaminsight.com/category/regtech-insight/  

Nightmare on LLM Street: How To Prevent Poor Data Haunting AI
https://www.datactics.com/blog/nightmare-on-llm-street/ (25 October 2024)

Why risk taking a chance on poor data for training AI? If it’s keeping you awake at night, read on for a strategy to overcome the nightmare scenarios!

It’s October, the Northern Hemisphere nights are drawing in, and for many it’s the time of year when things take a scarier turn. But for public sector leaders exploring AI, that fright need not extend to your data. It certainly shouldn’t be something that haunts your digital transformation dreams.

With a reported £800m budget unveiled by the previous government to address ‘digital and AI’, UK public sector departments are keen to be the first to explore the sizeable benefits that AI and automation offer. The change of government in July 2024 has done nothing to indicate that this drive has lessened in any way; in fact, the Labour manifesto included the commitment to a “single unique identifier” to “better support children and families”[1].

While we await the first Budget of this Labour government, it’s beyond doubt that there is an urgent need to tackle this task amid a cost-of-living crisis, with economies still trying to recover from the economic shock of COVID and deal with energy price hikes amid several sizeable international conflicts.

However, like Hollywood’s best Halloween villains, old systems, disconnected data, and a lack of standardisation are looming large in the background.

Acting First and Thinking Later

It’s completely understandable that the pressures would lead us to this point. Societal expectations from the emergence of ChatGPT, among others, have only fanned the flames, swelling the sense that technology should just ‘work’ and leading to an overinflated belief in what is possible.

Recently, LinkedIn attracted some consternation[2] by automatically including members’ data in its AI models without seeking express consent first. For whatever reason, the possibility that people would not simply accept this change was overlooked. It took the UK’s Information Commissioner’s Office, the ICO, to intervene for the change to be withdrawn – in the UK, at least.

A dose of reality is the order of the day. Government systems lack integrated data, and clear consent frameworks of the kind LinkedIn actually possesses seldom exist in any consistent form. Already short of funds, the public sector needs to act carefully, and mindfully, to prevent its AI experiments (which is, after all, what they are) from leading to inaccuracies and wider distrust among the general public.

One solution is for Government departments to form a single, holistic set of consents concerning the use of data for AI, especially Large Language Models and Generative AI – similar to communication consents under the General Data Protection Regulation, GDPR.

The adoption of a flexible consent management policy, one which can be updated and maintained for future developments and tied to an interoperable, standardised single view of citizen (SCV), will serve to support the clear, safe development of AI models into the future. The risks of building models now, on shakier foundations, will only serve to erode public faith. The evidence of the COVID-era exam grades fiasco[3] demonstrates the risk that these models present to real human lives.

Of course, it’s not easy to do. Many legacy systems contain names, addresses and other citizen data in a variety of formats. This makes it difficult to be sure that when more than one dataset includes a particular name, that name actually refers to the same individual. Traditional solutions to this problem use anything from direct matching technology to the truly awful exercise of humans manually reviewing tens of thousands of records in spreadsheets. This is one recurring nightmare that society really does need to stop having.

Taking Refuge in Safer Models

Intelligent data matching uses a variety of matching algorithms and well-established machine learning techniques to reconcile data held in old systems, new ones, documents, even voice notes. Such approaches could help the public sector to streamline their SCV processes, managing consents more effectively. The ability to understand who has opted in, marrying opt-ins and opt-outs to demographic data is critical. This approach will help model creators to interpret the inherent bias in the models built on those consenting to take part, to understand how reflective of society the predictive models are likely to be – including whether or not it is actually safe to use the model at all.
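To make that concrete, here is a minimal sketch of the kind of fuzzy comparison such matching builds on, using only Python’s standard library. It is an illustration, not Datactics’ matching engine; the records, threshold values, and the blocking-on-postcode step are all hypothetical.

```python
# Illustrative sketch only, not Datactics' matching engine: the records,
# thresholds and the postcode "blocking" step are hypothetical.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Normalised similarity between two roughly cleaned-up names."""
    return SequenceMatcher(None, a.casefold().strip(), b.casefold().strip()).ratio()

legacy_record = {"name": "Mr Jonathan A. Smith", "postcode": "BT1 1AA"}
new_record = {"name": "SMITH, Jonathan", "postcode": "BT1 1AA"}

# Blocking on an exact field (postcode) narrows the candidate pairs
# before the more expensive fuzzy comparison of names.
if legacy_record["postcode"] == new_record["postcode"]:
    score = similarity(legacy_record["name"], new_record["name"])
    if score > 0.8:    # confident automatic match
        print(f"Likely same individual (score={score:.2f})")
    elif score > 0.5:  # borderline: route to human review
        print(f"Send to manual review (score={score:.2f})")
    else:
        print(f"Treat as distinct records (score={score:.2f})")
```

In practice, production matching layers combine many such algorithms with phonetic encodings and machine learning, and route borderline scores to human reviewers rather than to spreadsheets.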

It’s probable that this transparency of process could also build greater trust among the general public, and greater willingness to take part in data sharing of this kind. In the LinkedIn example, the news that data was being used without explicit consent raced around like wildfire on the platform itself. This cannot be the outcome LinkedIn anticipated, which is in itself a concern about the mindset of the model creators.

It Doesn’t Have to Be a Nightmare

It’s a spooky enough season without adding more fear to the bonfire; certainly, this article isn’t intended as a reprimand. The desire to save time and money to deliver better services to a country’s citizens is a major part of many a civil servant’s professional drive. And AI and automation offer so many opportunities for much better outcomes! For just one example, NHS England’s AI tool already uses image recognition to detect heart disease up to 30 times faster than a human[4]. Mid and South Essex (MSE) NHS Foundation Trust used a predictive machine learning model, Deep Medical, to reduce the rate at which patients either didn’t attend appointments or cancelled at short notice (referred to as Did Not Attend, or DNA). Its pilot project identified which patients were more likely to fall into the DNA category, developed personalised reminder schedules, and highlighted frail patients who were less likely to attend an appointment to the relevant clinical teams.[5]

The time for taking action is now. Public sector organisations, government departments and agencies should focus on developing systems that will preserve and maintain trust in an AI-led future. This blog has shown that better is possible, through a dedicated effort to align citizen data with consents to contact. In a society where people can trust the transparency of how their data will be used to train AI, the risk of nightmare scenarios can be averted and we’ll all sleep better at night.


[1] https://www.ropesgray.com/en/insights/viewpoints/102jc9k/labour-victory-the-implications-for-data-protection-ai-and-digital-regulation-i

[2] https://etedge-insights.com/in-focus/trending/linkedin-faces-backlash-for-using-user-data-in-ai-training-without-consent/

[3] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7894241/#:~:text=COVID%2D19%20prompted%20the%20UK,teacher%20assessed%20grades%20and%20standardisation.

[4] https://www.healthcareitnews.com/news/emea/nhs-rolls-out-ai-tool-which-detects-heart-disease-20-seconds

[5] https://www.nhsconfed.org/publications/ai-healthcare


Datactics Awards 2024: Celebrating Customer Innovation
https://www.datactics.com/blog/datactics-awards-2024-celebrating-customer-innovation/ (24 September 2024)

In 2024, our customers have been busy delivering data-driven return on investment for their respective organisations. We wanted to recognise and praise their efforts in our first-ever Datactics Customer Awards!

Datactics Customer Awards winners 2024 gather for a group photo.
(From L to R: Erikas Rimkus, RBC Brewin Dolphin; Rachel Irving, Daryoush Mohammadi-Zaniani, Nick Jones and Tony Cole, NHS BSA; Lyndsay Shields, Danske Bank UK; Bobby McClung, Renfrewshire Health and Social Care Partnership). Not pictured: Solidatus.

The oak-panelled setting of historic Toynbee Hall provided the venue for the 2024 Datactics Summit, which this year carried a theme of ‘Data-Driven Return on Investment.’

Attendees gathered for guest speaker slots covering:

  • Danske Bank UK’s Lyndsay Shields presenting a ‘Data Management Playbook’ covering the experiences of beginning with a regulatory-driven change for FSCS compliance, through to broader internal evangelisation on the benefits of better data;
  • Datactics’ own data engineer, Eugene Coakley, in a lively discussion on the data driving sport, drawing from his past career as a professional athlete and Olympic rower with Team Ireland;
  • and Renfrewshire HSCP’s Bobby McClung explaining how automation, and the saving of person-hours or even days in data remediation, was having a material impact on the level of care the organisation is now able to deliver to citizens making use of its critical services.

The Datactics Customer Awards in full

In recent months, the team at Datactics has worked to identify customers’ notable achievements in data over the past year. Matt Flenley, Head of Marketing at Datactics, presented each award with a specific citation, quoted below.

Data Culture Champion of the Year – Lyndsay Shields, Danske Bank UK

“We’re delighted to be presenting Lyndsay with this award. As one of our longest-standing customers, Lyndsay has worked tirelessly to embed a positive data culture at Danske Bank UK. Her work in driving the data team has helped inform and guide data policy at group level, bringing up the standard of data management across Danske Bank.

“Today’s launch of the Playbook serves to showcase the work Lyndsay and her team have put into driving the culture at Danske Bank UK, and the wider culture across Danske Bank.”

Data-Driven Social Impact Award – Renfrewshire Health and Social Care Partnership

“Through targeted use of automation, Renfrewshire Health and Social Care Partnership has been able to make a material difference to the operational costs of local government care provision.

“Joe Deary’s early vision and enthusiasm for the programme, and the drive of the team under and alongside Bobby, has effectively connected data automation to societally-beneficial outcomes.”

Data Strategy Leader of the Year – RBC Brewin Dolphin

“RBC Brewin Dolphin undertook a holistic data review towards the end of 2023, culminating in a set of proposals to create a rationalised data quality estate. The firm twinned this data strategy with technology innovation, including being an early adopter of ADQ from Datactics. They overcame some sizeable hurdles, notably supporting Datactics in our early stages of deployment. Their commitment to being an ambitious, creative partner makes them stand out.

“At Datactics we’re delighted to be giving the team this award and would also like to thank them for being exemplars of patience in the way they have worked with us this year in particular.”

Datactics Award for Partner of the Year – Solidatus

“Solidatus and Datactics have been partnered for the last two years but it’s really in 2023-2024 that this partnership took off.

“Ever since we jointly supported Maybank, in Malaysia, in their data quality and data lineage programme, we have worked together on joint bids and supported one another in helping customers choose the ‘best of breed’ option in procuring data management technology. We look forward to our next engagements!”

Datactics Data Champion of the Year – NHSBSA

“For all the efforts Tony, Nick and team have made to spread the word about doing more with data, we’d like to recognise NHS Business Services Authority with our Datactics Data Champion of the Year award.

“As well as their advocacy for our platform, applying it to identify opportunities for cost savings and efficiencies across the NHS, the team has regularly presented their work to other Government departments and acted as a reference client on multiple occasions. Their continued commitment to the centrality of data as a business resource is why they’re our final champions this year, the Datactics Data Champion 2024.”

Lyndsay from Danske Bank UK
Bobby from Renfrewshire HSCP
Erikas and Clive from RBC Brewin Dolphin
Tony, Rachel, Nick and Daryoush from NHS BSA

Toasting success at Potter & Reid

The event closed with its traditional visit to Shoreditch hot spot Potter & Reid. Over hand-picked canapés and sparkling drinks, attendees networked and mingled to share in the award winners’ achievements in demonstrating what data-driven culture and return on investment looks like in practice. Keep an eye out for a taster video from this year’s event!

Datactics journey to ISO 27001:2022 certification
https://www.datactics.com/blog/datactics-journey-to-iso-270012022-certification/ (4 September 2024)

Dave Brown, Head of Security and DevOps, Datactics

ISO 27001:2022 Certification for Information Security Management System

At Datactics, maintaining the highest information security standards has always been at the core of our operations. This unwavering commitment has recently been formally recognised with our achievement of the ISO 27001:2022 Certification for our Information Security Management System (ISMS).

It is a major milestone and a powerful validation of our continuous efforts to protect client data and ensure the integrity, confidentiality, and availability of our information assets. In this blog, I’ll share our journey to achieving ISO 27001:2022.

The journey toward ISO 27001:2022 Certification

Our path to ISO certification began in Q4 2023, with invaluable support from industry experts at Vertical Structure.

The process started with a thorough evaluation of our existing security posture, carefully measuring it against the rigorous requirements set by ISO 27001. This stage involved a deep dive into our policies, procedures, and infrastructure to identify potential vulnerabilities and areas for improvement.

Following the evaluation, the DevOps team implemented targeted improvements over several months to strengthen our security framework. These efforts came to a head in January 2024 with a successful Stage 1 audit conducted by NQA. This initial audit was instrumental, providing us with crucial feedback and pinpointing specific areas for improvement.

The rigorous Stage 2 audit

By June 2024, we had prepared for the critical Stage 2 audit and welcomed NQA back to the Datactics headquarters for an intensive review.

This audit was exhaustive. For five days, the auditors scrutinized every facet of our ISMS. The audit team delved into our operations, from software development processes to client support systems and internal IT protocols. The auditors even spot-tested Datactics staff on their knowledge and understanding of Information Security Management within the company. This thorough examination ensured that no stone was left unturned.

Thanks to the hard work of our entire team, we successfully passed the audit! Datactics earned the ISO 27001:2022 certification, reinforcing our compliance with global information security standards and demonstrating our proactive approach to maintaining a secure operational environment.

Beyond certification: A commitment to excellence

For Datactics, the ISO 27001 certification is more than just a formal recognition; it embodies our ongoing commitment to excellence in information security and sets the stage for future advancements.

Achieving ISO 27001 is a significant milestone for Datactics and the result of hard work from the entire team as we grow and improve our security posture, demonstrating our dedication to secure and reliable policies and procedures that protect both ourselves and our clients.

This milestone is more than just an endpoint; it marks the beginning of an ongoing journey. As we continue to innovate and expand our platform, maintaining a robust information security practice will remain a top priority. Backed strongly by our senior management team, we are ready to build on this foundation, creating a more secure, process-driven, and impactful data quality platform that will positively influence the industry.

As we continue that journey, the ISO 27001:2022 Certification reinforces our dedication to the work ahead. As a team, we are excited to see what comes next.

Dave Brown is the Head of Security and DevOps at Datactics. For more insights from Datactics, find us on LinkedIn.

About ISO 27001:2022

The ISO 27001 certification is recognised as the global benchmark for managing information security. Datactics’ accreditation has been issued by NQA, a leading global independently accredited certification body that has provided assessments (audits) of organisations to various management system standards since 1988. The process was supported by Vertical Structure, which conducts technical security training and helps companies achieve certification to international standards such as ISO 27001.

ISO 27001:2022 Certification Success
https://www.datactics.com/blog/datactics-achieves-certification-iso-27001/ (30 August 2024)

Datactics, a leader in data quality software, has achieved ISO 27001:2022 Certification for its Information Security Management System.

The ISO 27001 certification is recognised globally as a benchmark for managing information security. The rigorous certification process, conducted by NQA and Vertical Structure, involved an extensive evaluation of Datactics’ security policies, procedures, people, and controls. Achieving this certification demonstrates Datactics’ dedication to safeguarding client data and maintaining information assets’ integrity, confidentiality, and availability.

Victoria Wallace, Senior DevOps & Security Specialist, stated: “Security is at the heart of everything that Datactics does and achieving ISO 27001:2022 certification is a testament to the team’s unwavering commitment in this technical field. Showcasing the extensive work that went into this prestigious achievement proves that dedication and determination can lead to significant success, both within Datactics and across our client ecosystem. Achieving and maintaining this certification is a key part of Datactics’ progress in enhancing our secure, process-driven, and powerful data quality platform.”

Tom Shields, Cyber & Information Security Consultant at Vertical Structure, said: “It was a pleasure working with the team at Datactics. Their enthusiastic approach to ISO 27001 Information Security and the associated business risk mitigation was evident in every interaction. Involvement from top to bottom was prioritised from day one, allowing us to integrate into their team from the very outset. The opportunity to guide such organisations in certifying to ISO 27001 is a privilege for us, and we look forward to continuing to work alongside their team in the future.”

About ISO 27001:2022 Certification

Datactics’ accreditation has been issued by NQA, a leading global independently accredited certification body. NQA has provided assessments (audits) of organisations to various management system standards since 1988.

Founded in 2006, Vertical Structure is an independent cyber security consultancy with a ‘people-first’ approach. Vertical Structure specialises in providing people-focused security and penetration testing services for web applications, cloud infrastructure and mobile applications.

Vertical Structure also conducts technical security training, helping companies to achieve certification to international standards such as ISO 27001, Cyber Essentials and CAIQ, and is proud to be an Amazon Web Services® Select Consulting Partner.

Life after the Olympics: Finding my new team at Datactics
https://www.datactics.com/blog/life-after-the-olympics-finding-my-new-team-at-datactics/ (9 August 2024)

As we approach the end of the 2024 Olympic Games in Paris, I can’t help but reflect on my own experience at the Olympics, and on where it has led me: working within the data-driven world of technology as a senior data engineer.

2004 Athens Olympic Games

Twenty years ago, I represented Team Ireland in the 2004 Athens Olympic Games, in the men’s lightweight rowing four. The experience was both challenging and rewarding – training for the Olympics required immense dedication, teamwork, and resilience. But the camaraderie with my teammates, plus the thrill of representing my country on the world stage, was unparalleled.

Eugene Coakley (L) rowing for Team Ireland at the 2004 Athens Olympic Games

Whilst the experience of the Olympics was incredible, adjusting to life outside the professional world of sport was challenging. I spent many years searching for job satisfaction after retiring from international sport in 2008, until an opportunity for something new within the world of software development presented itself during the pandemic. I’d always been interested in tech, but balancing work and raising two young kids made it difficult to find the time to upskill and consider a career move.

Everything changed in April 2020, when I was furloughed from my job and discovered a Master’s in Software Development offered by Queen’s University, Belfast. The course provided an excellent opportunity to earn whilst I learned, allowing me to develop my skills without financial strain. As I progressed through the programme, I found myself particularly drawn to data analysis, and by the end of it (15 months later) I had the fortunate opportunity to meet Datactics’ CEO, Stuart Harvey.

Finding my new team

Stuart felt like the career guidance teacher I never had. During our first meeting, he recognised that my unique skill set and experience would lend themselves to a career in data and suggested that I join the Data Engineering team at Datactics. Working in the tech industry doesn’t hinge solely on writing code: skills in problem-solving, creative thinking, effective communication, and teamwork can be just as valuable, and all of these I had developed as a rower.

Not long into the role, I felt a familiar sense of camaraderie and teamwork underpinning daily life at Datactics. The collaborative environment mirrored the tight-knit dynamic of my rowing team, where each member’s contribution is crucial to success and every team member is valued. I also found myself applying the same principles I learnt in professional sport – precision and accuracy – to my work at Datactics. A typical day for me involves working across financial services and government, ensuring data quality and integrity for businesses worldwide. I still get the reward of working in a team, but my responsibilities now are more data quality rule-building than rowing; less oar technique and more data engineering skills.

A data-driven sport

In hindsight, my interest in data is not all that surprising. Rowing is a data-driven sport, with every stroke, race, and training session meticulously recorded and analysed. Having accurate and reliable data helped me understand my performance, identify areas for improvement, and strategise for future races. Now, instead of improving the performance of myself and my team, I use these same data-driven principles to help our customers enhance their performance and achieve their goals.

Transitioning from the world of sport to software has been uniquely challenging and rewarding, with rowing providing me with more transferable skills than I ever expected. I still get to leverage the lessons I learned on the water and enjoy the daily camaraderie of working with an incredible team. As the Paris Games mark twenty years since I represented my country, I’m reminded of some amazing times representing Team Ireland and look forward to seeing where the athletes of Paris 2024 will go in their careers. I’m also reminded that the Olympic spirit, and the value of teamwork, has never left me.

About Eugene

Eugene is a Senior Data Engineer at Datactics. Datactics provides leading data quality and matching software, augmented by machine learning and designed for non-technical business users. Eugene assists clients with their data quality ambitions, providing hands-on support and data quality expertise. At Datactics, we have developed our professional services offering, Datactics Catalyst, to deliver practical support in your data strategy. From augmenting your data team to work on specific data projects to delivering specialised training, our professional services team support data leaders in reaching their goals. Find out more here.

AI/ML Scalability with Kubernetes
https://www.datactics.com/blog/ai-ml-scalability-with-kubernetes/ (5 June 2024)

Kubernetes: An Introduction 

In the ever-evolving world of engineering, scalability isn’t just a feature—it’s a necessity. As businesses and data continue to grow, the ability to scale applications efficiently becomes critical. At Datactics, we are at the forefront of integrating cutting-edge AI/ML functionality that enhances our Augmented Data Quality solutions. To align with current standards and ensure optimal AI/ML scalability with Kubernetes, our AI/ML team has integrated K8s into our infrastructure and deployment strategies.

What is Kubernetes? 

Kubernetes, also known as K8s, is an open-source platform designed to automate the deployment, scaling, and management of containerised applications. It adjusts the number of containerised applications to match incoming traffic, ensuring adequate resources to handle requests seamlessly.

Docker containers, managed through an API layer often using FastAPI, function like fully equipped packages of software, including all necessary dependencies. Kubernetes enables ‘horizontal scaling’—increasing or decreasing the number of container instances based on demand—using various load balancing and rollout strategies to make the process appear seamless. This method helps evenly spread traffic among containers, preventing overload and optimising resources. 
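As a sketch of what one of those containerised services might look like, the snippet below shows a toy FastAPI application exposing a single feature behind an HTTP endpoint. The endpoint names, payload shape, and the naive outlier check are hypothetical, for illustration only; they are not Datactics’ actual API.

```python
# Hypothetical sketch of one AI/ML feature packaged as a FastAPI service;
# endpoint names, payloads and the naive 3-sigma check are illustrative.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="outlier-detection")  # one feature per container image

class Batch(BaseModel):
    values: list[float]

@app.post("/detect")
def detect(batch: Batch) -> dict:
    """Flag values more than three standard deviations from the mean."""
    n = len(batch.values)
    if n == 0:
        return {"outliers": []}
    mean = sum(batch.values) / n
    std = (sum((v - mean) ** 2 for v in batch.values) / n) ** 0.5
    return {"outliers": [v for v in batch.values if std and abs(v - mean) > 3 * std]}

@app.get("/healthz")
def healthz() -> dict:
    """Liveness probe target, so Kubernetes can restart an unhealthy pod."""
    return {"status": "ok"}
```

Because each instance is stateless, Kubernetes can place any number of identical replicas behind a Service and spread incoming requests across them.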

Kubernetes for Data Management

Every day, companies handle large volumes of complex data from different sources, arriving at different velocities and scales. This includes important tasks like cleaning, combining, matching, and resolving errors. It’s crucial to suggest and enforce Data Quality (DQ) rules in your data pipelines and to identify DQ issues efficiently, ensuring these processes are automated, scalable, and responsive to fluctuating demands.

Many organisations use Kubernetes (K8s) to automate deploying, scaling, and managing applications in containers across multiple machines. With features like service discovery, load balancing, self-healing, automated rollouts, and rollbacks, Kubernetes has become a standard for managing applications that are essential for handling complex data—both in the cloud and on-premise. Implementing AI/ML scalability with Kubernetes allows these organisations to process large volumes of data efficiently and respond quickly to changes in data flow and processing demands.

Real-World Scenario: The Power of Kubernetes 

It’s Friday at 5pm, and just as you’re about to leave the office, your boss informs you that transaction data for last month has been uploaded to the network share in a CSV document and it needs to be profiled immediately. The CSV file is massive—about a terabyte of data—and trying to open it in Excel would be disastrous. This is where Datactics and Kubernetes come to the rescue.  

You could run a Python application that might take all weekend to process, meaning you’d have to keep checking its progress and your weekend would be ruined. Instead, you could use Kubernetes to scale out Datactics’ powerful Profiling tools and complete the profiling before you even leave the building. Company saved. Weekend saved. 
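As a rough illustration of why chunked, scale-out processing wins here (and emphatically not the Datactics Profiling tool itself: the file name, chunk size, and null-count metric are placeholders), a profiler can stream the file in bounded pieces rather than loading a terabyte into memory, and each piece can be handed to a different pod:

```python
# Illustration only, not the Datactics Profiling tool: stream a huge CSV
# in memory-bounded chunks. Chunks are independent, which is what lets
# Kubernetes fan the same work out across many pods in parallel.
import pandas as pd

null_counts: dict[str, int] = {}
row_total = 0

for chunk in pd.read_csv("transactions.csv", chunksize=1_000_000):
    row_total += len(chunk)
    for col, nulls in chunk.isna().sum().items():
        null_counts[col] = null_counts.get(col, 0) + int(nulls)

print(f"{row_total:,} rows profiled")
for col, nulls in null_counts.items():
    print(f"{col}: {100 * nulls / row_total:.2f}% missing")
```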

Application of Kubernetes 

The world has grown progressively faster, and speed in the digital realm is king: speed in service delivery, speed in recovery in the event of a failure, and speed to production. We believe that the AI/ML features offered by Datactics should adhere to the same high standards. No matter how much data your organisation handles or how many data sources there are, it’s important to adjust resources to meet demand and reduce waste during the most critical moments. 

At Datactics, AI/ML features are deployed as Docker containers fronted by FastAPI. Depending on your particular environment, we might run these containers on a single machine, such as an AWS EC2 instance, and deploy a single instance of each AI/ML feature, which is suitable for experiments and proofs of concept. However, for a fully operational infrastructure capable of supporting a large organisation, Kubernetes is essential.

Kubernetes helps deploy Docker containers by providing a blueprint with deployment details, necessary resources, and any dependencies like external storage. This blueprint facilitates horizontal scaling to support additional instances of each AI/ML feature. 
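Below is a hedged sketch of such a blueprint, using the official `kubernetes` Python client; the image name, namespace, and resource figures are hypothetical placeholders, not our production configuration.

```python
# Sketch using the official `kubernetes` Python client; the image name,
# namespace and resource figures below are hypothetical placeholders.
from kubernetes import client, config

config.load_kube_config()  # use load_incluster_config() when run in-cluster
apps = client.AppsV1Api()

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "outlier-detection"},
    "spec": {
        "replicas": 2,  # baseline; raised or lowered with demand
        "selector": {"matchLabels": {"app": "outlier-detection"}},
        "template": {
            "metadata": {"labels": {"app": "outlier-detection"}},
            "spec": {
                "containers": [{
                    "name": "outlier-detection",
                    "image": "registry.example.com/outlier-detection:1.0",
                    "resources": {"requests": {"cpu": "500m", "memory": "1Gi"}},
                }],
            },
        },
    },
}
apps.create_namespaced_deployment(namespace="default", body=deployment)

# Horizontal scaling is then a one-line patch to the replica count:
apps.patch_namespaced_deployment_scale(
    name="outlier-detection",
    namespace="default",
    body={"spec": {"replicas": 8}},
)
```

In practice an autoscaler would adjust the replica count automatically against CPU or request-rate metrics, rather than relying on a hand-issued patch.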

Conclusion 

Kubernetes proved to be a game-changer for scaling Datactics’ AI/ML services, ultimately leading to a robust solution that ensures our AI/ML features can dynamically scale according to client needs. We tailor our deployment strategies to meet the diverse needs of our clients. Whether the requirement is a simple installation or a complex, scalable infrastructure, our commitment is to provide solutions that ensure our clients’ applications are efficient, reliable, and scalable. 

We aim to meet any specific requirements, always exploring various potential deployment setups preferred by our clients. If your organisation is looking to enhance its data processing capabilities, get in touch with us here. Let us help you optimise your data management strategies with the power of Kubernetes and our innovative AI/ML solutions. 

Insights from techUK’s Security and Public Safety SME Forum
https://www.datactics.com/blog/panel-discussion-techuk-security-and-public-safety-sme-forum/ (24 May 2024)

Chloe O’Kane, Project Manager at Datactics, recently spoke at techUK’s Security and Public Safety SME Forum, which included a panel discussion featuring speakers from member companies of techUK’s National Security and JES programmes. The forum provided an excellent opportunity to initiate conversations and planning for the future among its members.

Read Chloe’s Q&A from the panel session, ‘Challenges and opportunities facing SMEs in the security and public safety sectors’, below:

What made you want to join the forum?

For starters, techUK is always a pleasure to work with – my colleagues and I at Datactics have several contacts at techUK that we speak with regularly, and it’s clear that they care about the work they’re doing. It never feels like a courtesy call – you always come away with valuable actions to follow up on. Having had such positive experiences with techUK before, I felt encouraged to join the forum. Being a part of it is exciting – you’re in a room full of like-minded people who want to make a difference.

What are your main hopes and expectations from the forum?

I’ve previously participated in techUK events where senior stakeholders from government departments have led open and honest conversations about gaps in their knowledge. It’s refreshing to see them hold their hands up and say ‘We need help and we want to hear from SMEs’.

I think it would be great to see more of this in the Security and Public Safety SME forum, with people not being afraid to ask for help and demonstrating a desire to make a change.

What are, in your opinion, the main challenges faced by the SME community in the security and public safety sectors?

One of the challenges we face as SMEs is that we have to be deliberate about the work we do. We might see an opportunity that we know we’re a good fit for, but before we can commit, we need to think about more than just ‘do we fit the technical criteria?’. We need to think about how it’s going to affect wider aspects of the company – do we have sufficient staffing? Do they need security clearance? What is the delivery timeline?

If we aren’t being intentional, we risk disrupting our current way of working. We have a loyal and happy customer base and an excellent team of engineers, developers, and PMs to manage and support them. But even if a brilliant data quality deal lands on our desk, if it would take an army to deliver, we may not be able to commit the same resources that a big consultancy firm can and, ultimately, we may have to pass on it.

Moreover, our expertise lies specifically in data quality. As a leading DQ vendor, we excel in this area. However, if a project requires both data quality and additional data management services, we may not be the most suitable candidate, despite being the best at delivering the data quality component.

What are your top 3 areas of focus that the forum should address?

Ultimately, I think the goal of this forum should be steered by asking the question ‘How do we make people feel safe?’

A big challenge is always going to be striking the balance between tackling the issues that affect people’s safety, whilst navigating those bigger ‘headline’ stories that can have a lasting effect on the public. For instance, if you google ‘Is the UK a safe place to live?’, the answers will largely say that yes, the UK is a very safe place to live. However, people’s perceptions don’t always align with that. I remember reading an article last year about how public trust in the police has fallen to its lowest level ever, so I think that would be a good place to start.

From a member’s perspective, though, more selfishly, I’d like to get the following out of the forum:

  • Access to more SME opportunities 
  • Greater partnership opportunities 
  • More insights into procurement and access to the market

In your opinion, why is networking and collaboration so important? Have you any success stories to share?

Our biggest success in networking and collaboration is having so many customers willing to endorse us and share our joint achievements.

We focus on understanding our customers, learning how they use our product, and listening to their likes and dislikes. This feedback shapes our roadmap and shows customers how much we value their input. This approach not only creates satisfied customers, but also turns them into advocates for our product. They mention us at conferences, in speeches, and in reference requests, and even help other customers with their data management strategies.

For us, networking is about more than just making new contacts; it’s about helping our customers connect and build relationships. Our customers’ advocacy is incredibly valuable because prospective customers like to hear success stories from them, perhaps even more than from salespeople.

About Datactics

Datactics specialises in data quality solutions for security and public safety. Using advanced data matching, cleansing, and validation, we help law enforcement and public safety agencies manage and analyse large datasets. This ensures critical information is accurate and accessible, improving response times, reducing errors, and protecting communities from threats.

For more information on how we support security and public safety services, visit our GovTech and Policing page, or reach out to us via our contact us page.

Got three minutes? Get all you need to know on ADQ!
https://www.datactics.com/blog/adq-in-three-minutes/ (17 April 2024)

To save you scrolling through our website for the essential, all-you-need-to-know info on ADQ, we’ve created this handy infographic.

Our quick ADQ in three minutes guide can be downloaded from the button below the graphic. Happy reading! As always, don’t hesitate to get in touch if you’re looking for an answer that you can’t find here. Simply hit ‘Contact us’ with your query and let us do the rest.

ADQ in three minutes, part one – the augmented data quality process from Datactics: connect to data, profile data, leverage AI rule suggestion, and configure controls.
ADQ in three minutes, part two: measure data health; get alerts and remediations; generate AI-powered insights; and work towards a return on investment.

Wherever you are on your data journey, we have the expertise, the tooling and the guidance to help accelerate your data quality initiatives. From connecting to data sources, through rule building, measuring and into improving the quality of data your business relies on, let ADQ be your trusted partner.

If you would like to read some customer stories of how we’ve already achieved this, head on over to our Resources page where you’ll find a wide range of customer case studies, white papers, blogs and testimonials.

To get hold of this infographic, simply hit Download this! below.

Datactics placed in the 2024 Gartner® Magic Quadrant™ for Augmented Data Quality Solutions
https://www.datactics.com/blog/datactics-placed-in-the-2024-gartner-magic-quadrant-for-augmented-data-quality-solutions/ (5 April 2024)

Belfast, Northern Ireland – 5th April, 2024 – Datactics, a leading provider of data quality and matching software, has been recognised in the 2024 Gartner Magic Quadrant for Augmented Data Quality Solutions for a third year running.  

Gartner included only 13 data quality vendors in the report, where Datactics is named a Niche Player. Datactics’ Augmented Data Quality platform (ADQ) offers a unified and user-friendly experience, optimising data quality management and improving operational efficiencies. By augmenting data quality processes with advanced AI and machine learning techniques, such as outlier detection, bulk remediation, and rule suggestion, Datactics serves customers across highly regulated industries, including financial services and government.

In an era where messy, unreliable and inaccurate data poses a substantial threat to organisations, the demand for data quality solutions has never been greater. Datactics stands out for its user-friendly, scalable, and highly efficient data quality solutions, designed to empower business users to manage and improve data quality seamlessly. Its solutions leverage AI and machine learning to automate complex data management tasks, thereby significantly enhancing operational efficiency and data-driven decision-making across various industries. 

“We are thrilled to be included in the 2024 Gartner Magic Quadrant for Augmented Data Quality Solutions,” said Stuart Harvey, CEO of Datactics. “Our team’s dedication and innovative approach is solving the complex challenges of practical data quality for customers across industries.

“We believe the report significantly highlights our distinction from traditional observability solutions, showcasing Datactics’ focus on identifying, measuring and remediating broken data. We are committed to assisting our clients to create clean, ready-to-use data via the latest techniques in AI and have invested heavily in automation to reduce the manual effort required in rule building and management while retaining human-in-the-loop supervision. It is gratifying to note that Gartner recognises Datactics for its ability to execute and completeness of vision.”

Datactics’ solutions are designed to empower data leaders to trust their data for critical decision-making and regulatory compliance. For organisations looking to enhance their data quality and leverage the power of augmented data management, Datactics offers a proven platform that stands out for its ease of use, flexibility, and comprehensive support. 

Magic Quadrant reports are a culmination of rigorous, fact-based research in specific markets, providing a wide-angle view of the relative positions of the providers in markets where growth is high and provider differentiation is distinct. Providers are positioned into four quadrants: Leaders, Challengers, Visionaries and Niche Players. The research enables you to get the most from market analysis in alignment with your unique business and technology needs.

Gartner does not endorse any vendor, product or service depicted in its research publications and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

GARTNER is a registered trademark and service mark of Gartner and Magic Quadrant is a registered trademark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and are used herein with permission. All rights reserved.


Shaping the Future of Insurance: Insights from Tia Cheang
https://www.datactics.com/blog/shaping-the-future-of-insurance-with-tia-cheang/ (2 April 2024)

Tia Cheang, Director of IT Data and Information Services at Gallagher, recently gave an interview to Tech-Exec magazine, drawing on her knowledge and experience of shaping the future of the insurance industry at one of the world’s largest insurance brokers. You can read the article here.

Tia is also one of DataIQ’s Most Influential People In Data for 2024 (congratulations, Tia!). We took the opportunity to ask Tia a few questions of our own, building on some of the themes from the Tech-Exec interview.

In the article with Tech-Exec, you touched on your background, your drive and ambition, and what led you to your current role at Gallagher. What are you most passionate about in this new role?

In 2023, I started working at Gallagher after an extensive career in data across both the public and private sectors. This job was a logical next step for me, as it resonates with my longstanding interest in utilising data in creative ways to bring about beneficial outcomes. I was eager to manage a comprehensive data transformation at Gallagher to prepare for the future, aligning with my interests and expertise.

I am responsible for leading our data strategy and developing a strong data culture. We wish to capitalise on data as a route to innovation and strategic decision-making. Our organisation is therefore creating an environment where data plays a crucial role in our business operations, to allow us to acquire new clients and accomplish significant results rapidly. The role offers an exciting opportunity to combine my skills and lead positive changes in our thinking towards data and its role in the future of insurance.

The transition to making data an integral part of business operations is often challenging. How have you found the experience? 

At Gallagher, our current data infrastructure faces the typical challenges that arise when a firm is expanding. Our data warehouses collect data from many sources, which mirrors the diverse aspects of our brokerage activities. These encompass internal systems, such as customer relationship management (CRM), brokerage systems, and other business applications. We handle multiple data types in our data estate, ranging from structured numerical data to unstructured text. The vast majority of our estate is currently hosted on-premise using Microsoft SQL Server technology; however, we also manage various other departmental data platforms such as QlikView.

“…we want data capabilities that provide flexibility and agility, to enable us to quickly react to new market opportunities.”

A key challenge we face is quickly incorporating new data sources obtained through our mergers and acquisitions activity. This affects our data management efforts in terms of migration, seamless data integration, maintaining data quality, and providing data accessibility.

To overcome this, we want data capabilities that provide flexibility and agility, to enable us to quickly react to new market opportunities. Consequently, we are implementing a worldwide data transformation to update our data technology, processes, and skills to provide support for this initiative. This transformation will move Gallagher data to the cloud, using Snowflake to leverage the scalability and elasticity of the platform for advanced analytics. Having this flexibility gives us a major advantage, offering computational resources where and when they are required.

How does this technology strategy align with your data strategy, and how do you plan to ensure data governance and compliance while implementing these solutions, especially in a highly-regulated industry like insurance?

Gallagher’s data strategy aims to position us as the leader in the insurance sector. By integrating our chosen solutions within the Snowflake platform, we strive to establish a higher standard in data-driven decision-making. 

This strategy involves incorporating data management tools such as Collibra, CluedIn, and Datactics into our re-platforming efforts, with a focus on ensuring the compatibility and interoperability of each component. We are aligning each tool’s capabilities with Snowflake’s powerful data lake functionality with the support of our consulting partners to ensure that our set of tools function seamlessly within Snowflake’s environment.

“…we are contemplating upcoming AI and automation regulations and considering how to futureproof our products and approaches…”

We are meticulously navigating the waters of data governance and compliance. We carefully plan each stage to ensure that all components of our data governance comply with the industry regulations and legislation of the specific region. For example, we are contemplating upcoming AI and automation regulations and considering how to futureproof our products and approaches to comply with them.

The success of our programme requires cooperation across our different global regions, stakeholders, and partners. We are rethinking our data governance using a bottom-up approach tailored to the specific features of our global insurance industry. We review our documentation and test the methods we use to ensure they comply with regulations and maintain proper checks and balances. We seek to understand the operational aspects of a process in real-world scenarios and evaluate its feasibility and scalability.

Could you expand on your choice of multiple solutions for data management technology? What made you go this route over a one-stop shop for all technologies?

We have selected “best of breed” solutions for data quality, data lineage, and Master Data Management (MDM), based on a requirement for specialised, high-performance tools. We concentrated on high-quality enterprise solutions for easy integration with our current technologies. Our main priorities were security, scalability, usability, and compatibility with our infrastructure. 

By adopting this approach, we achieve enhanced specialisation and capabilities in each area, providing high-level performance. This strategy offers the necessary flexibility within the organisation to establish a unified data management ecosystem. This aligns with our strategic objectives, ensuring that our data management capability is scalable, secure, and adaptable.

Regarding the technologies we have selected, Collibra increases data transparency through efficient cataloguing and clear lineage; CluedIn ensures consistent and reliable data across systems; and Datactics is critical for maintaining high-quality data. 

“As we venture into advanced analytics, the importance of our data quality increases.”

In Datactics’ case, it provides data cleansing tools that ensure the reliability and accuracy of our data, underpinning effective decision-making and strategic planning. The benefits of this are immense, enhancing operating efficiency, reducing errors, and enabling well-informed decisions. As we venture into advanced analytics, the importance of our data quality increases. Therefore, Datactics was one of the first technologies we started using.

We anticipate gaining substantial competitive advantages from our strategic investment, such as improved decision-making capabilities, operational efficiency, and greater customer insights for personalisation. Our ability to swiftly adapt to market changes is also boosted. Gallagher’s adoption of automation and AI technologies will also strengthen our position, ensuring we remain at the forefront of technological progress.

On Master Data Management (MDM), you referred to the importance of having dedicated technology for this purpose. How do you see MDM making a difference at Gallagher, and what approach are you taking?

Gallagher is deploying Master Data Management to provide a single customer view. We expect substantial improvements in operational efficiency and customer service when it is completed. This will improve processing efficiency by removing duplicate data and offering more comprehensive, actionable customer insights. These improvements will benefit the insurance brokerage business and will enable improved data monetisation and stronger compliance, ultimately enhancing the client experience.

Implementing MDM at Gallagher is foundational to our ability to enable global analytics and automation. To facilitate it, we need to create a unified, accurate, and accessible data environment. We plan to integrate MDM seamlessly with our existing data systems, leveraging tools like CluedIn to manage reference data efficiently. This approach ensures that our MDM solution supports our broader data strategy, enhancing our overall data architecture.

“By including data quality activities in our approach, we anticipate significant benefits from the MDM initiative.”

Data quality is crucial in Gallagher’s journey to achieve this, particularly in establishing a unified consumer view via MDM. Accurate and consistent data is essential for consolidating several client data sources into a master profile; we see it as essential, as without good data quality the benefits of our transformation will be reduced. By including data quality activities in our approach, we anticipate significant benefits from the MDM initiative. We foresee a marked improvement in data accuracy and consistency throughout all business units. We want to empower users across the organisation to make more informed, data-driven decisions to facilitate growth. Furthermore, a single source of truth enables us to streamline our operations, leading to greater efficiencies by removing manual processes. Essentially, this strategic MDM implementation transforms data into a valuable asset that drives innovation and growth for Gallagher.
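
To illustrate the kind of consolidation MDM performs, here is a minimal sketch that merges duplicate client records into a single master profile by preferring the most recently updated value for each field. This is hypothetical Python for illustration only – the record shape and the ‘newest wins’ survivorship rule are assumptions, not our actual production logic:

    def build_master_profile(records):
        """Merge duplicate client records: the most recently updated non-empty value wins per field."""
        master = {}
        for rec in sorted(records, key=lambda r: r["updated"]):  # oldest first, so newest overwrites
            for field, value in rec.items():
                if field != "updated" and value:
                    master[field] = value
        return master

    duplicates = [
        {"name": "A. Client", "email": "old@example.com", "phone": "", "updated": "2022-01-10"},
        {"name": "Alex Client", "email": "new@example.com", "phone": "028 9000 0000", "updated": "2023-06-01"},
    ]
    print(build_master_profile(duplicates))
    # {'name': 'Alex Client', 'email': 'new@example.com', 'phone': '028 9000 0000'}

Real MDM tooling applies far richer matching and survivorship rules, but the principle is the same: resolve duplicates into one trusted record.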

Looking to the future of insurance, what challenges do you foresee in technology, data and the insurance market?

Keeping up with the fast pace of technological change can be challenging. We are conducting horizon scanning on new technologies to detect emerging trends. We wish to adopt new tools and processes that will complement and improve our current systems as they become ready.

“We prioritise the security of our data assets and our clients’ privacy because it is essential for our reputation and confidence in the market.”

Next is ensuring robust data security and compliance, particularly in view of legislative changes around AI and data protection. Our approach is to continuously strengthen our data policies as we grow and to proactively manage our data. We prioritise the security of our data assets and our clients’ privacy because it is essential for our reputation and confidence in the market.

Finally, we work closely with our technology partners to leverage their expertise. This collaborative approach ensures that we take advantage of new technologies to their maximum capacity while preserving the integrity and effectiveness of our current systems. 

Are there any other technologies or methodologies you are considering for improving data management in the future beyond what you have mentioned?

Beyond the technologies and strategies already mentioned, at Gallagher, we plan to align our data management practices with the principles outlined in DAMA/DMBOK (Data Management Body of Knowledge). This framework will ensure that our data management capabilities are not just technologically advanced but also adhere to the best practices and standards in the industry.

In addition to this, we are always on the lookout for emerging technologies and methodologies that could further enhance our data management. Whether it’s advancements in AI, machine learning, or new data governance frameworks, we are committed to exploring and adopting methodologies that can add value to our data management practices.

For more from Tia, you can find her on LinkedIn.



The post Shaping the Future of Insurance: Insights from Tia Cheang appeared first on Datactics.

]]>
FSCS compliance: The Future of Depositor Protection https://www.datactics.com/blog/fscs-compliance-the-future-of-depositor-protection/ Wed, 27 Mar 2024 16:44:31 +0000 https://www.datactics.com/?p=25044   Why does FSCS compliance matter? HSBC Bank plc (HBEU) and HSBC UK Bank plc (HBUK)’s January 2024 fine, imposed by the Prudential Regulation Authority (PRA) for historic failures in deposit protection identification and notification, alongside the 2023 United States banking crisis, jointly serve as stark reminders of the importance of depositor protection regulation. Both […]

The post FSCS compliance: The Future of Depositor Protection appeared first on Datactics.

]]>

Why does FSCS compliance matter?

HSBC Bank plc (HBEU) and HSBC UK Bank plc (HBUK)’s January 2024 fine, imposed by the Prudential Regulation Authority (PRA) for historic failures in deposit protection identification and notification, alongside the 2023 United States banking crisis, jointly serve as stark reminders of the importance of depositor protection regulation.

Both events, emblematic of the broader challenges faced by the banking sector, underscore the necessity of rigorous data governance and quality for FSCS compliance and depositor protection.

HSBC’s penalty, the second largest imposed by the PRA, highlights the consequences of inadequate data management practices, while the 2023 US banking crisis, characterised by the failure of three small-to-midsize banks, reveals the systemic risks posed by liquidity concerns and market instability.

These incidents draw attention not only to the pressing issues of today, but also to the enduring mechanisms put in place to safeguard financial stability. The Financial Services Compensation Scheme (FSCS), established in the United Kingdom, embodies such a mechanism, created to instil consumer confidence and prevent the domino effect of bank runs.

What is Single Customer View (SCV)?

The FSCS’s role becomes especially pivotal in times of uncertainty: if a bank collapses, the FSCS’s compensation mechanism needs to activate almost instantaneously to maintain this confidence.

According to the Prudential Regulation Authority (PRA) Rulebook (Section 12), firms are required to produce a Single Customer View (SCV) — a comprehensive record of eligible guaranteed deposits — within 24 hours of a bank’s failure or whenever the PRA or FSCS requests it.

This response, underscored by the accuracy and rapidity of depositor information, is a bulwark designed to avert a banking crisis by ensuring timely compensation for affected customers. Over time, as the FSCS has expanded depositor protection to cover up to £85,000 per individual, the 24-hour SCV mandate has marked a significant stride towards a more secure and robust financial sector, solidifying the foundation where depositor trust is paramount.

What data challenges does SCV pose?

When it comes to implementing the SCV regulation, the devil lies in the details. The demand for accuracy and consistency in depositor records translates into specific, often arduous, data quality challenges. Financial institutions must ensure that each depositor’s record is not only accurate but also aligned with SCV’s granular requirements.

Below are five data challenges associated with SCV:
  • Identification and rectification of duplicated records: duplication can occur due to disparate data entry points or legacy systems not communicating effectively (a small matching sketch follows this list).
  • Lack of consistency across records: customer details may have slight variations across different systems, such as misspelt names or outdated addresses, which can impede the quick identification of accounts under SCV mandates.
  • Data timeliness: SCV necessitates that data be updated within a 24-hour window, requiring real-time (or near-real-time) processing capabilities. Legacy systems, often built on batch processing, may struggle to adapt to this requirement.
  • Discrepancies in account status: determining whether an account is active, dormant, or closed must be resolved to prevent compensation delays or errors.
  • Aggregating siloed data: the comprehensive depositor information mandated by SCV involves aggregating data across multiple product lines, account types and, for international banks, geographical locations; a task that can be formidable given legacy data structures and the diversity of regulatory environments.
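
To make the first two challenges concrete, the sketch below flags two depositor records as likely duplicates despite minor inconsistencies. It is a minimal, hypothetical illustration in Python – the field names and the 0.9 similarity threshold are assumptions, not part of the SCV specification or any Datactics product:

    import difflib

    def normalise(record):
        """Lower-case and collapse whitespace so trivial variations don't block matching."""
        return {k: " ".join(str(v).lower().split()) for k, v in record.items()}

    def likely_duplicates(a, b, threshold=0.9):
        """Flag two depositor records as probable duplicates if name and address are near-identical."""
        a, b = normalise(a), normalise(b)
        name_sim = difflib.SequenceMatcher(None, a["name"], b["name"]).ratio()
        addr_sim = difflib.SequenceMatcher(None, a["address"], b["address"]).ratio()
        return name_sim >= threshold and addr_sim >= threshold

    records = [
        {"name": "Jonathan Smyth", "address": "12 High Street, Belfast"},
        {"name": "Jonathon Smyth", "address": "12 High St, Belfast"},
    ]
    print(likely_duplicates(records[0], records[1]))  # True: one depositor, two inconsistent entries

Production matching runs across millions of records with far more sophisticated blocking and scoring, but the principle is the same: normalise first, then compare.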

The HSBC fine, in particular, underscores the ramifications of inaccurate risk categorisation under the depositor protection rules and the insufficiency of stress testing scenarios tailored to depositor data. Without robust data quality controls, banks risk misclassifying depositor coverage, which could potentially lead to regulatory sanctions and reputational damage.

Why integrate SCV with wider data strategies?

“By incorporating meticulous data standards and validation processes as part of an enterprise strategy, banks can transform data management from a regulatory burden into a strategic asset.”

The crux of effective depositor protection lies not just in adhering to SCV requirements, but in embracing a broader perspective on data governance and quality. This means positioning SCV not in isolation but as a critical component of a comprehensive account and customer-level data strategy.

To overcome these challenges, financial institutions must not only deploy advanced data governance and quality tooling but also foster a culture of data stewardship where data quality is an enterprise-wide responsibility and not one that is siloed within IT departments. By incorporating meticulous data standards and validation processes as part of an enterprise strategy, banks can transform data management from a regulatory burden into a strategic asset.

An enterprise approach involves:
  • Unified Data Governance Frameworks: Establishing unified data governance frameworks that ensure data accuracy, consistency, and accessibility across the enterprise.
  • Advanced Data Quality Measures: Implementing advanced data quality measures that address inaccuracies and inconsistencies head-on, ensuring that all customer data is up-to-date and reliable (illustrated in the sketch after this list).
  • Integration with Broader Business Objectives: Aligning SCV and other regulatory data requirements with broader business objectives, including risk management, customer experience enhancement, and operational efficiency.
  • Leveraging Technology and Analytics: Employing cutting-edge technology and analytics to streamline data management processes, from data collection and integration to analysis and reporting.
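
As a concrete example of an advanced data quality measure, the sketch below aggregates balances per customer and derives the FSCS-covered amount, capped at £85,000 per individual. It is an illustrative Python fragment: the record layout is assumed, and it deliberately ignores real-world complications such as joint accounts, eligibility rules, and temporary high balances:

    from collections import defaultdict

    FSCS_LIMIT = 85_000  # protection limit per individual, in GBP

    def covered_amounts(accounts):
        """Aggregate balances per customer, then cap each total at the FSCS limit."""
        totals = defaultdict(float)
        for acct in accounts:
            totals[acct["customer_id"]] += acct["balance"]
        return {cid: min(total, FSCS_LIMIT) for cid, total in totals.items()}

    accounts = [
        {"customer_id": "C001", "balance": 60_000.0},
        {"customer_id": "C001", "balance": 40_000.0},
        {"customer_id": "C002", "balance": 12_500.0},
    ]
    print(covered_amounts(accounts))  # {'C001': 85000, 'C002': 12500.0}

Such a calculation is only as reliable as the customer identifiers feeding it, which is precisely why the deduplication and consistency challenges above matter so much.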

How does Datactics support FSCS compliance?

 

The recent HSBC fine and the 2023 US banking crisis serve as critical catalysts for reflection on the role of depositor protection regulation and the imperative of a holistic data strategy.

FSCS regulatory reporting compliance underscores the evolution of depositor protection in response to financial crises, whilst the challenges presented by these regulations highlight the need for advanced data governance and quality measures.

At Datactics, we understand that the challenges posed by regulations like SCV indicate broader issues within data management and governance.

Our approach transcends the piecemeal addressing of regulatory requirements; instead, we advocate for and implement a comprehensive data strategy that integrates SCV within the wider context of account and customer-level data management.

Our solutions are designed not only to support regulatory compliance but also to bolster the overall data governance and quality framework of financial institutions.


We work closely with our clients to:
  • Identify and Address Data Quality Issues: Through advanced analytics and machine learning, we pinpoint and rectify data quality issues, ensuring compliance and enhancing overall data integrity.
  • Implement Robust Data Governance Practices: We help institutions establish and maintain robust data governance practices that align with both regulatory requirements and business goals.
  • Foster a Culture of Data Excellence: Beyond technical solutions, we emphasise the importance of fostering a culture that values data accuracy, consistency, and transparency.

We are committed to helping our customers navigate FSCS compliance, not by addressing regulations in isolation but by integrating them into a broader, strategic and more sustainable framework of account and customer-level data management. By doing so, we ensure compliance and protection for depositors whilst paving the way for a more resilient, trustworthy, and efficient banking sector.

 


The post FSCS compliance: The Future of Depositor Protection appeared first on Datactics.

]]>
Four Essential Tips to Build a Data Governance Business Case https://www.datactics.com/blog/4-tips-on-how-to-build-a-data-governance-business-case/ Mon, 26 Feb 2024 14:30:36 +0000 https://www.datactics.com/?p=24721 In an era where data drives strategic decision-making, data governance and the quality of that data become increasingly vital. Building a business case for data governance can bring a number of enterprise-wide benefits. This is especially true in banking and financial services, where the risk-focused mindset can sometimes overshadow the potential to become data-driven. However, […]

The post Four Essential Tips to Build a Data Governance Business Case appeared first on Datactics.

]]>
How to build a business case for data governance

In an era where data drives strategic decision-making, data governance and the quality of that data become increasingly vital.

Building a business case for data governance can bring a number of enterprise-wide benefits. This is especially true in banking and financial services, where the risk-focused mindset can sometimes overshadow the potential to become data-driven.

However, it is often a challenge to communicate the value of investing in a data governance and analytics programme.

Successful data governance programmes are influenced by more than the deployment of advanced technologies or methodologies. They are also determined by fostering an organisational culture that fundamentally prioritises data governance. Often referred to as ‘data literacy’, this focus on encouraging a data-driven culture helps ensure that better data management efforts are adopted and sustained over time. Consequently, this can lead to improved data quality, better adherence to rules, and smarter decision-making across the company.

In a recent roundtable in London with some of our customers, we gained first-hand insight into how they are tackling the challenge of fostering a company culture that values data governance. Since these customers are thought leaders in their fields, we thought we’d share some of their insights, broken down into simple summaries below.

The Four Essential Tips

Here are four ways that our customers cultivate a company culture that prioritises data governance:

  1. Start with Data Quality
  2. Highlight Success Stories
  3. Use Positive Language
  4. Tap into the Human Side of Data Governance

1. Start with Data Quality

Our customers agreed that this is one of the most impactful steps. Data quality is the foundation, ensuring consistency and accuracy across the organisation’s data landscape; this is essential for any governance and analytics programme to succeed. This step helps make the benefits of data governance more apparent and relatable to all employees, as stakeholders see how data quality can enhance decision-making, reduce errors, and streamline processes. Equally, better data powers better decisions; more on that below.

2. Highlight Success Stories

When trying to gain buy-in internally, it’s important to create a compelling story that your key stakeholders can relate to. Success looks different for every organisation, and it’s particularly important to shout about the wins, big or small. For one organisation, proper data governance can drive efficiencies and profits. For another, it could result in more lives saved. Real-life examples of how improved data governance has led to better outcomes can be an excellent motivator for change.

3. Use Positive Language

The way data analytics and governance are talked about has the power to significantly influence key stakeholders. This can be as simple as talking about the opportunities and benefits of having a robust data governance programme, instead of framing it as something that’s merely necessary to comply with regulations. Compliance is critical, but so is growing your business; consequently, demonstrate the value your improved data quality is bringing through clear dashboards.

4. Tap into the Human Side of Data Governance

While it may be true that people will frequently resist change, it doesn’t have to derail your ambitions. To deal with this effectively, try to identify some of the areas of frustration felt by other teams across the organisation. To begin with, ask them about their daily work challenges. Oftentimes, these challenges are caused by underlying problems with data quality. Understanding this helps convince them of the value of investing more in data governance to make their day-to-day jobs easier. Our customers also commented on the value of having good interpersonal skills to work effectively with stakeholders and deal with push-back.

        Maintaining a Successful Data Governance Programme

        Once these initial steps have been taken, continue the conversation through ongoing education and training. Offering workshops, seminars, and online courses can help demystify data governance and analytics, making it more accessible across the business.

        Another way to sustain an enterprise data governance programme is by leveraging technology. User-friendly, no-code tools and platforms are a great way of democratising data governance, making it more accessible across the business. With AI, these tools can automate mundane tasks, extract valuable insights from the data, and ensure data accuracy. Accordingly, this makes it easier to encourage a company-wide culture that values data governance.

        Conclusion

Fostering a company culture that values data governance is a multifaceted process. With this in mind, it’s worth seeing how our customers have gone about it. In general, they achieve buy-in by starting with data quality; leveraging the power of storytelling; providing continuous education; and embracing data management technologies. By focusing on these areas, organisations can ensure that their data governance efforts move beyond compliance requirements to become strategic advantages driving better decision-making and operational efficiency.

        How Datactics can help

        Looking for advice on how to build a business case for data governance within your organisation? This is something we’ve done for our clients.

        We have developed Datactics Catalyst, our professional services offering, to deliver practical support in your data strategy. 

        From augmenting your data team to working on specific data projects, delivering training or providing a short-term specialist to solve a specific data quality problem, let Datactics Catalyst accelerate your ambitions, help you increase data literacy and foster a data-driven culture.

        Have a look at our Catalyst page to find out more: www.datactics.com/

        The post Four Essential Tips to Build a Data Governance Business Case appeared first on Datactics.

        ]]>
        What is a Data Quality Firewall?  https://www.datactics.com/blog/what-is-a-data-quality-firewall/ Thu, 01 Feb 2024 15:33:56 +0000 https://www.datactics.com/?p=24514 What is a Data Quality firewall? A data quality firewall is a key component of data management. It is a form of data quality monitoring, using software to prevent the ingestion of messy or bad data. It’s a set of measures or processes to ensure the integrity, accuracy, and reliability of data within an organisation, […]

        The post What is a Data Quality Firewall?  appeared first on Datactics.

        ]]>
        What is a Data Quality firewall?

        A data quality firewall is a key component of data management. It is a form of data quality monitoring, using software to prevent the ingestion of messy or bad data.

        It’s a set of measures or processes to ensure the integrity, accuracy, and reliability of data within an organisation, and helps support data governance strategies. This could involve controls and checks to prevent the entry of inaccurate or incomplete data from data sources into data stores, as well as mechanisms to identify and rectify any data quality issues that arise. 

In its simplest form, a data quality firewall could be data stewards manually checking the data. However, this isn’t recommended, as it’s highly inefficient and could introduce inaccuracies. A more effective approach is to use automation.

        An automated approach

Data quality metrics (e.g. completeness, duplication, validity) can be generated automatically and are useful for identifying data quality issues. At Datactics, with our expertise in AI-augmented data quality, we understand that the most value is derived from data quality rules that are highly specific to an organisation’s context. This includes rules focusing on Accuracy, Consistency, Duplication, and Validity. The ability to execute all of the above rules should be a part of any data quality firewall.

        The above is perfectly suited to an API giving an on-demand view of the data’s health before ingestion into the warehouse. This real-time assessment ensures that only clean, high-quality data is stored, significantly reducing downstream errors and inefficiencies.
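
As a simple illustration of the kind of pre-ingestion check such an API might perform, the Python sketch below computes completeness and validity percentages for a batch of records before they reach the warehouse. The field names and rules are hypothetical, not a Datactics API:

    def profile_batch(records, required_fields, validators):
        """Return completeness and validity percentages per field for a batch of records."""
        total = len(records)
        report = {}
        for field in required_fields:
            present = [r for r in records if r.get(field) not in (None, "")]
            valid = [r for r in present if validators.get(field, lambda v: True)(r[field])]
            report[field] = {
                "completeness_pct": 100 * len(present) / total,
                "validity_pct": 100 * len(valid) / max(len(present), 1),
            }
        return report

    batch = [
        {"sort_code": "12-34-56", "balance": "100.50"},
        {"sort_code": "", "balance": "abc"},
    ]
    rules = {
        "sort_code": lambda v: len(v) == 8 and v[2] == v[5] == "-",
        "balance": lambda v: v.replace(".", "", 1).isdigit(),
    }
    print(profile_batch(batch, ["sort_code", "balance"], rules))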

        What Features are Required for a Data Quality Firewall? 

         

        The ability to define Data Quality Requirements 

The ability to specify what data quality means for your organisation is key. For example, you may want to consider whether data should be processed in situ or passed through an API, depending on data volumes and other factors. Here are a couple of other questions worth considering when defining data quality requirements:
• Which rules should be applied to the data? It goes without saying that not all data is the same. Rules which are highly applicable to the specific business context will be more useful than a generic completeness rule, for example. This may involve checking data types, ranges, and formats, or validation against sources of truth, rejecting data that doesn’t meet the specified criteria.
• What should be done with broken data? Strategies for dealing with broken data should be flexible. Options might include quarantining the entire dataset, isolating only the problematic records, passing all data with flagged issues, or immediately correcting issues, like removing duplicates or standardising formats. All of the above should be options for the user of the API (the routing sketch after this list shows one way to express them). The point is, not every use case is the same, and a one-size-fits-all solution won’t be sufficient.
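
To show how those strategies might be expressed in practice, here is a hypothetical routing function that applies one of the flexible handling options described above. The strategy names and record shape are illustrative only:

    def route_records(records, is_valid, strategy="quarantine_records"):
        """Split a batch into (accepted, quarantined) according to the chosen strategy."""
        good = [r for r in records if is_valid(r)]
        bad = [r for r in records if not is_valid(r)]
        if strategy == "quarantine_dataset":
            # One bad record quarantines the whole batch.
            return ([], records) if bad else (records, [])
        if strategy == "quarantine_records":
            # Only the problematic records are held back.
            return good, bad
        if strategy == "flag_and_pass":
            # Everything flows through, but failures carry a flag for data stewards.
            return good + [dict(r, dq_flag=True) for r in bad], []
        raise ValueError(f"unknown strategy: {strategy}")

    batch = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": ""}]
    accepted, quarantined = route_records(batch, lambda r: bool(r["email"]))
    print(len(accepted), len(quarantined))  # 1 1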

        Key DQ Firewall Features:

        Data Enrichment 

        Data enrichment may involve adding identifiers and codes to the data entering the warehouse. This can help with usability and traceability. 
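
For instance, enrichment at the firewall might stamp each record with a deterministic identifier derived from its content, so the record can be traced through downstream systems. A hypothetical sketch:

    import hashlib

    def enrich(record):
        """Add a deterministic trace ID derived from the record's content."""
        digest = hashlib.sha256(repr(sorted(record.items())).encode()).hexdigest()
        return {**record, "dq_trace_id": digest[:12]}

    print(enrich({"account": "12345678", "sort_code": "12-34-56"}))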

        Logging and Auditing 

Robust logging and auditing mechanisms should be provided, recording all incoming and outgoing data, errors, and any data quality-related issues. This information can be valuable for troubleshooting and for monitoring data quality over time.

        Error Handling 

A comprehensive error-handling strategy should be provided, with clearly defined error codes and messages to communicate issues to consumers of the API, along with guidance on how to resolve or address data quality errors.
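
An error payload from such an API might look something like the following; the error codes and field names are purely illustrative:

    # A hypothetical data quality error response, serialised to JSON by the API.
    error_response = {
        "status": "rejected",
        "error_code": "DQ-102",  # e.g. DQ-101 = completeness failure, DQ-102 = validity failure
        "message": "Field 'sort_code' failed format validation.",
        "record_id": "row-4821",
        "remediation_hint": "Expected format NN-NN-NN; see the data quality documentation.",
    }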

        Reporting 

        Regular reporting on data quality metrics and issues, including trend analysis, helps in keeping track of the data quality over time.

        Documentation 

        The API documentation should include information about data quality expectations, supported endpoints, request and response formats, and any specific data quality-related considerations. 

         

        How Datactics can help 

         

        You might have noticed that the concept of a Data Quality Firewall is not just limited to data entering an organisation. It’s equally valuable at any point in the data migration process, ensuring quality as data travels within an organisation. Wouldn’t it be nice to know the quality of your data is assured as it flows through your organisation?

        Datactics can help with this. Our Augmented Data Quality (ADQ) solution uses AI and machine learning to streamline the process, providing advanced data profiling, outlier detection, and automated rule suggestions. Find out more about our ADQ platform here.

        The post What is a Data Quality Firewall?  appeared first on Datactics.

        ]]>
        The benefits of an Augmented Data Quality Solution https://www.datactics.com/blog/augmented-data-quality/the-benefits-of-an-augmented-data-quality-solution/ Mon, 22 Jan 2024 15:55:40 +0000 https://www.datactics.com/?p=24323 In the digital era, data is essential for every organisation, meaning good data management is needed to empower businesses to make well-informed decisions and operate efficiently. However, this can be a challenging landscape, encompassing catalogs, lineage, observability, master data management, and data quality.  We’re at a point now where institutions’ data estates are rapidly expanding. […]

        The post The benefits of an Augmented Data Quality Solution appeared first on Datactics.

        ]]>
        The benefits of an Augmented Data Quality Solution

In the digital era, data is essential for every organisation, meaning good data management is needed to empower businesses to make well-informed decisions and operate efficiently. However, this can be a challenging landscape, encompassing catalogues, lineage, observability, master data management, and data quality.

        We’re at a point now where institutions’ data estates are rapidly expanding. Stretching from legacy systems to cloud migrations and data warehouses, and spanning relational databases to unstructured documents, the importance of data quality has never been greater. This, coupled with the decentralisation of organisational data, has made it difficult for organisations to maintain good data quality. 

         

        From traditional to transformative Data Quality Solutions 

        Addressing data quality issues within a business has typically involved very labour-heavy, manual processes. The nature of the modern data landscape, with its complex and ever-growing data sets, is demanding much more in the way of transformative solutions. Consequently, data quality systems must now adapt to automate processes like data profiling, rule suggestion, and time-series analysis of data issues. This is where the revolutionary concept of ‘augmented data quality’ comes into play. 

         

Augmented Data Quality – What is it?

In short, augmented data quality is an approach that uses machine learning (ML) and artificial intelligence (AI) to automate and enhance data quality management. The aim is to automatically improve data quality by analysing data, identifying and fixing issues, and providing clear, transparent metrics on data quality and improvement actions across your entire data estate. As a result, our users have found that an augmented data quality approach makes their data assets more valuable, allowing them to maximise the value of their data at a low cost with minimal manual effort.

Augmented data quality promotes self-service data quality management, making it easier for business users to carry out tasks without the need for deep technical expertise or knowledge of data science techniques. Moreover, it offers many benefits, from improved data accuracy to increased efficiency and reduced costs. Rather than needing to carry out many specific tasks when assessing the quality of a set of data, augmented data quality automates this process, making it a valuable resource for enterprises dealing with big data.

Whilst AI and machine learning models can speed up routine DQ tasks, they cannot fully automate the whole process. In other words, augmented data quality does not eliminate the need for human oversight, decision-making, and intervention; instead, it complements them by leveraging human-in-the-loop technology. Advanced algorithms perform large volumes of checks and fixes, while human expertise is reserved for reviewing and tackling only the most difficult issues, ensuring the highest levels of accuracy.

         

        Datactics Augmented Data Quality Platform 

         


        Responding to these challenges, Datactics has developed the Augmented Data Quality platform (ADQ), which streamlines the data quality journey through a user-friendly interface. Our technology team has pioneered the use of AI/ML capabilities to make it easier for businesses to improve data quality. This includes: 

• Automated Data Profiling: Enabling you to efficiently onboard new sources of data or analyse existing ones, this feature allows the user to quickly understand their data, identify trends and outliers, and, when errors are found, automatically suggest and apply data quality rules (a simple outlier sketch follows this list).
• DQ Insights Hub: Making use of a wide range of our machine learning capabilities, this feature provides a summarised view of data quality across many sources, allowing you to create interactive and fully customisable dashboards. These dashboards highlight and track many DQ metrics, from the number of issues found with each data element to the average time it takes for these issues to be remediated and then re-occur.
• Predictive Features: We’ve developed a bespoke machine learning algorithm that learns from your data quality issues, allowing you to gain a deeper understanding of the root causes of problems and empowering you to take preventative measures to ensure they don’t reoccur. By training this exclusively on your data, you get the most accurate predictions whilst also ensuring your data remains fully secure.
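
As a flavour of what automated profiling can surface, the sketch below flags numeric outliers using the interquartile range, a common profiling heuristic. This is a generic illustration, not ADQ’s actual algorithm:

    import statistics

    def iqr_outliers(values, k=1.5):
        """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
        q1, _, q3 = statistics.quantiles(values, n=4)
        iqr = q3 - q1
        low, high = q1 - k * iqr, q3 + k * iqr
        return [v for v in values if v < low or v > high]

    balances = [102.5, 98.0, 101.2, 99.8, 100.4, 9_750.0]  # one suspicious entry
    print(iqr_outliers(balances))  # [9750.0]
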
        Benefits of the Datactics ADQ platform

These represent tangible benefits for our users. At the heart of ADQ’s success is a new user layer that simplifies all the key components of a good data quality solution, such as connectivity, integrations, rule authoring, remediation, and insights.

        The Datactics platform is designed with all levels of users in mind. ADQ’s interface is intuitive and user-friendly, ensuring that users, regardless of their technical proficiency, can easily navigate and utilise the platform to its full potential. With support for a spectrum of different technologies, ADQ is the perfect platform for any user, from a non-technical business user to expert data scientists. This approach democratises data quality management, making it accessible and manageable for a wider range of professionals within an organisation. 

The practical benefits of ADQ are evident in our client testimonials, with users reporting significant reductions in the cost and time associated with building data quality projects. Specifically, the rule suggestion feature has been a game-changer for many, identifying a substantial portion of business rules automatically, which results in considerable time savings. Essentially, it provides a pragmatic and practical real-world understanding of data quality.

         

         


        Empowering Organisations with Data 

In the future, we plan to enhance ADQ with more automated features, better insights, and additional integrations. Some of the new features coming this year include incorporating generative AI into the platform, allowing non-technical users to create data quality checks using natural language prompts. Suggestions for remediations, generated using historical fixes and our bespoke machine learning algorithm, will vastly boost the number of issues that can be automatically resolved, decreasing the likelihood of human error and leaving your data stewards free to tackle the most critical and problematic cases. Additionally, by enhancing our predictive capabilities, we will allow you to act pre-emptively before data quality issues occur, ensuring your organisation is always working with high-quality data.

         The release of ADQ marks a significant milestone at Datactics, in terms of innovation and supporting our customers. It embodies our commitment to providing state-of-the-art data management solutions, enabling organisations to fully leverage their data assets. We are proud of our team’s vision and dedication to delivering a platform that not only addresses current data quality challenges but also paves the way for future innovations. 

        For more information about the Datactics ADQ solution, take a look at this piece by A-Team Insight or reach out to us at www.datactics.com. 

         

         

        The post The benefits of an Augmented Data Quality Solution appeared first on Datactics.

        ]]>
        Net Promoter Score: H1 2024 https://www.datactics.com/blog/net-promoter-score-h1-2024/ Mon, 15 Jan 2024 16:22:32 +0000 https://www.datactics.com/?p=24393 Please find our latest Net Promoter Score below. Not seeing a form? Please contact your account manager for more information.

        The post Net Promoter Score: H1 2024 appeared first on Datactics.

        ]]>
        Please find our latest Net Promoter Score below. Not seeing a form? Please contact your account manager for more information.

        The post Net Promoter Score: H1 2024 appeared first on Datactics.

        ]]>
        The Importance of Data Quality in Machine Learning https://www.datactics.com/blog/the-importance-of-data-quality-in-machine-learning/ Mon, 18 Dec 2023 12:40:03 +0000 https://www.datactics.com/?p=18042 We are currently in an exciting area and time, where Machine Learning (ML) is applied across sectors from self driving cars to personalised medicine. Although ML models have been around for a while – for example, the use of algorithmic trading models from the 80’s, Bayes since 1700s – we are still in the nascent […]

        The post The Importance of Data Quality in Machine Learning appeared first on Datactics.

        ]]>

We are currently in an exciting era, where Machine Learning (ML) is applied across sectors from self-driving cars to personalised medicine. Although ML models have been around for a while – for example, algorithmic trading models since the 1980s, and Bayesian methods since the 1700s – we are still in the nascent stages of productionising ML.

From a technical viewpoint, this is ‘Machine Learning Ops’, or MLOps. MLOps involves figuring out how to build and deploy models via continuous integration and deployment, and how to track and monitor models and data in production.

From a human, risk, and regulatory viewpoint, we are grappling with big questions about ethical AI (Artificial Intelligence) systems and where and how they should be used. Risk, privacy and security of data, accountability, fairness, and adversarial AI all come into play in this topic. Additionally, the debate over supervised, semi-supervised, and unsupervised machine learning brings further complexity to the mix.

Much of the focus is on the models themselves, such as OpenAI’s GPT-4. Everyone can get their hands on pre-trained models or licensed APIs; what differentiates a good deployment is the data quality.

However, the one common theme that underpins all this work is the rigour required in developing production-level systems, and especially the data necessary to ensure they are reliable, accurate, and trustworthy. This is especially true for ML systems, given the role that data and processes play and the impact of poor-quality data on ML algorithms and learning models in the real world.

        Data as a common theme 

If we shift our gaze from the model side to the data side, considering questions such as:

        • Data management – what processes do I have to manage data end to end, especially generating accurate training data?
        • Data integrity – how am I ensuring I have high-quality data throughout?
        • Data cleansing and improvement – what am I doing to prevent bad data from reaching data scientists?
        • Dataset labeling – how am I avoiding the risk of unlabeled data?
        • Data preparation – what steps am I taking to ensure my data is data science-ready?

then a far greater understanding of performance and model impact (consequences) could be achieved. However, this is often viewed as less glamorous or exciting work and, as such, is often undervalued. For example, what is the impetus for companies or individuals to invest at this level – regulatory (e.g. BCBS), financial, reputational, or legal?

        Yet, as well defined in research by Google,

        “Data largely determines performance, fairness, robustness, safety, and scalability of AI systems…[yet] In practice, most organizations fail to create or meet any data quality standards, from under-valuing data work vis-a-vis model development.” 

        This has a direct impact on people’s lives and society, where “…data quality carries an elevated significance in high-stakes AI due to its heightened downstream impact, impacting predictions like cancer detection, wildlife poaching, and loan allocations”.

        What this looks like in practice

We have seen this in the past, with the exam grade predictions in the UK during Covid. In this case, teachers predicted the grades of their students, and the Office of Qualifications and Examinations Regulation then applied an algorithm to these predictions to downgrade any potential grade inflation. This algorithm was quite complex and non-transparent in the first instance. When the results were released, 39% of grades had been downgraded. The algorithm captured the distribution of grades from previous years, the predicted distribution of grades for past students, and then the current year.

In practice, this meant that if you were a candidate who had performed well at GCSE but attended a historically poor-performing school, it was challenging to achieve a top grade. Teachers had to rank their students in the class, resulting in a relative ranking system that could not equate to absolute performance. It meant that even if you were predicted a B but were ranked fifteenth out of 30 in your class, and the pupil ranked fifteenth in each of the last three years had received a C, you would likely get a C.

The application of this algorithm caused an uproar, not least because schools with small class sizes – usually private, fee-paying schools – were exempt from the algorithm, meaning the teachers’ predicted grades were used instead. Additionally, it baked in past socioeconomic biases, benefitting underperforming students in affluent (and previously high-scoring) areas while suppressing the results of high-performing students in lower-income regions.

        A major lesson to learn from this, therefore, was transparency in the process and the data that was used.

        An example from healthcare

Within the world of healthcare, poor data quality had an impact on ML cancer prediction with IBM’s ‘Watson for Oncology’, which partnered with The University of Texas MD Anderson Cancer Center in 2013 to “uncover valuable insights from the cancer center’s rich patient and research databases”. The system was trained on a small number of hypothetical cancer patients, rather than real patient data. This resulted in erroneous and dangerous cancer treatment advice.

        Significant questions that must be asked include:

• Where did it go wrong here – certainly in the data, but where else across the wider AI system?
        • Where was the risk assessment?
        • What testing was performed?
        • Where did responsibility and accountability reside?

        Machine Learning practitioners know well the statistic that 80% of ML work is data preparation. Why then don’t we focus on this 80% effort and deploy a more systematic approach to ensure data quality is embedded in our systems, and considered important work to be performed by an ML team?

This is a view recently articulated by Andrew Ng, who urges the ML community to be more data-centric and less model-centric. In fact, Andrew was able to demonstrate this using a steel-sheet defect detection use case, whereby a deep learning computer vision model achieved a baseline performance of 76.2% accuracy. By addressing inconsistencies in the training dataset and correcting noisy or conflicting dataset labels, the classification performance reached 93.1%. Interestingly and compellingly from the perspective of this blog post, minimal performance gains were achieved by addressing the model side alone.
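
A trivial example of this kind of data-centric fix is checking a training set for conflicting labels, i.e. identical inputs labelled differently. The sketch below is illustrative Python, not the approach used in the case above:

    from collections import defaultdict

    def conflicting_labels(dataset):
        """Find inputs that appear more than once with different labels."""
        seen = defaultdict(set)
        for features, label in dataset:
            seen[features].add(label)
        return {f: labels for f, labels in seen.items() if len(labels) > 1}

    dataset = [
        (("scratch", "deep"), "defect"),
        (("scratch", "deep"), "ok"),  # conflicts with the row above
        (("clean", "none"), "ok"),
    ]
    print(conflicting_labels(dataset))  # {('scratch', 'deep'): {'defect', 'ok'}}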

Our view is: if data quality is a key limiting factor in ML performance, then let’s focus our efforts on improving data quality – and ask whether ML itself can be deployed to address this. This is the central theme of the work the ML team at Datactics undertakes. Our focus is automating the manual, repetitive (often referred to as boring!) business processes of DQ and matching tasks, while embedding subject matter expertise into the process. To do this, most of our solutions employ a human-in-the-loop approach, where we capture human decisions and expertise and use this to inform and re-train our models. Having this human expertise is essential in guiding the process and providing context, improving both the data and the data quality process. We are keen to free up clients from manual, mundane tasks and instead use their expertise on tricky cases with simpler agree/disagree options.
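
A minimal sketch of that human-in-the-loop pattern might look like the following: the model auto-resolves confident matches and queues only the borderline cases for a simple agree/disagree decision, which is stored for retraining. All names and thresholds here are hypothetical:

    def triage(pairs, score_fn, auto_threshold=0.95, review_threshold=0.60):
        """Auto-accept confident matches; queue borderline ones for human review."""
        auto, review, rejected = [], [], []
        for pair in pairs:
            score = score_fn(pair)
            if score >= auto_threshold:
                auto.append(pair)
            elif score >= review_threshold:
                review.append(pair)  # shown to a steward as a simple agree/disagree
            else:
                rejected.append(pair)
        return auto, review, rejected

    # Steward verdicts on the review queue become labelled examples for retraining.
    training_examples = []
    def record_decision(pair, steward_agrees):
        training_examples.append((pair, steward_agrees))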

        To learn more about an AI-driven approach to Data Quality, read our press release about our Augmented Data Quality platform here. 

        The post The Importance of Data Quality in Machine Learning appeared first on Datactics.

        ]]>
        Datactics launches Augmented Data Quality Solution https://www.datactics.com/press-releases/datactics-launches-augmented-data-quality-solution/ Tue, 14 Nov 2023 10:47:44 +0000 https://www.datactics.com/?p=24027 Datactics, a leading provider of data quality software solutions, has taken a significant leap forward with the launch of its Augmented Data Quality Solution (ADQ).     This innovative solution makes faster and more efficient AI-powered data quality accessible and beneficial to all, through an enriched user interface and more expansive implementation of machine learning […]

        The post Datactics launches Augmented Data Quality Solution appeared first on Datactics.

        ]]>
        Datactics, a leading provider of data quality software solutions, has taken a significant leap forward with the launch of its Augmented Data Quality Solution (ADQ).

         


         

        This innovative solution makes faster and more efficient AI-powered data quality accessible and beneficial to all, through an enriched user interface and more expansive implementation of machine learning functions throughout.

        ADQ covers the full spectrum of end-to-end data quality management. The platform provides data profiling, cleaning, matching, and remediation without the need for coding, and leverages the power of AI to provide meaningful data quality insights on data breaks, causes, and detecting outliers.

        In today’s data-driven world, enterprises face the challenge of maintaining high-quality data across the organisation and ensuring that critical data is easily accessible to governance professionals. ADQ eliminates potential delays in data remediation workflows by using greater automation and no-code tooling, thereby reducing manual effort and increasing accuracy.

        ADQ seamlessly integrates with data catalogues and lineage systems such as Alation and Solidatus, accommodating both cloud-only and hybrid data architectures. This interoperability ensures that customer data management ecosystems remain harmonious and efficient.

The release of ADQ unveils a range of machine learning extensions that will allow customers of Datactics to deliver data quality improvements faster, more efficiently, and with maximum business impact. Users will benefit from improved profiling, including outlier detection and automated rule suggestion. Additionally, ADQ includes a new feature called ‘Insights Hub’, which allows customers to draw on long histories of data quality ‘break/fix’ activities and to analyse which remediations are having the most substantial business impact.

        Datactics CEO Stuart Harvey says,

        “ADQ makes use of the power of machine learning in a very practical way that will help a data governance professional do their job faster and better. ADQ can improve data quality time to value for an analyst struggling to choose the most efficient rules for complex data via automated rule suggestion.

        Likewise, having created rules to measure and remediate broken data, ADQ can be used to further root cause analysis by understanding whether data quality improvements are making a difference over time. All of this can be done without deep technical expertise on behalf of the ADQ user. We have several international clients already using the system live, and we look forward to rolling out to new and existing customers throughout 2023 and into 2024.”

         

        For further information about Datactics’ latest innovation, please visit www.datactics.com or read more in a recent article by the A-Team 

        The post Datactics launches Augmented Data Quality Solution appeared first on Datactics.

        ]]>