AI Strategies for Leveraging Data as a Competitive Advantage
After about fifty years of gestation, AI is now delivering a much-promised leap in human capital productivity. In this issue, we introduce a framework to help you assess your organisation’s AI opportunities. Because you can only exploit AI if you have extensive data, the framework is based on the characteristics of your data.
Key thoughts
- To exploit AI, you need an extensive system of record (data) to fuel the target application.
- Data obligations and data rights determine the potential for creating a strategic advantage.
- Systems of record and inquiry are needed to collect and mine data, respectively.
Data are today’s diamonds
A competitive strategy is based on building a capital asymmetry. For instance, a mining company might own a mineral-rich lode that it can exploit at a much lower cost per tonne than any alternative. This natural capital asymmetry enables it to make industry-leading profits and pour these into maintaining its exclusive position. For many decades, De Beers enjoyed a natural capital monopoly in the diamond industry that enabled it to control the global supply of diamonds. It complemented this imbalance by generating a symbolic capital asymmetry around diamonds to position them as the foremost gemstone. For decades, De Beers was a money machine without peer.
Data are today’s diamonds. Under the right conditions, AI can polish data to create a competitive advantage. An organisation’s wherewithal for creating a capital asymmetry depends on its systems of record and systems of inquiry (the five types of fundamental systems were introduced in an earlier posting).
AI needs vast amounts of data (a system of record) to feed a neural network (a system of inquiry), a mathematical model loosely inspired by the brain that learns patterns from data. Large language models (LLMs) require hundreds of gigabytes of text to train a neural network with over 100 billion parameters.
System of record
A system of record is typically a database or document store. When an enterprise creates and maintains a system of record for its exclusive use, it has a proprietary dataset. Alternatively, it might rely on a public system of record. If competitors use the same public dataset, the possibility of creating a competitive advantage is limited. It’s like mining the same diamond deposit as your competitors. The ideal situation is to build a proprietary database by instituting data obligations. A high-value system of record is typically proprietary and difficult or impossible for competitors to replicate. A public dataset is a low-value system of record because competitors have complete access to it.
Data obligations
Some companies oblige customers and suppliers to provide specific data to do business. For example, people downloading the Amazon shopping app agree to give away considerable personal data, including health and fitness data, for the right to buy from Amazon, as shown in the following figure. When you download an app from Apple’s App Store, you can view its privacy ‘nutrition label’, which reveals the data obligations you have accepted, often without being explicitly aware of the agreement.
Amazon app data obligations

Loyalty programs are opt-in variations of a data obligation. To accumulate points, you permit the collection of some of your behavioural data. Some programs, such as Woolworths’ Everyday Rewards, gather only limited transaction data. When no customer-identifying details are recorded, such loyalty programs create weak data obligations because the obligatory data are not connected to an individual.
A major purpose of establishing a data obligation or instituting a loyalty program is to build a high-value proprietary dataset that you can mine to learn about customers’ specific or general behaviour. The breadth of data collected determines what can be learned and how well messages can be tailored to specific customers or customers in general.
When people sign up with a social media site, they accept data obligations that enable the host to capture the data required to execute its business model. Social media companies convert the captured high-value data into economic capital through advertising.
Imagine the value of the viewing habit data collected by Netflix. It knows what you viewed, how long you watched a show before abandoning it, what you binge on, and much more. Consequently, when bidding for a movie or series, it can compute its worth (e.g., estimated total viewing hours) more accurately than the show’s producer. Netflix has a massive data asymmetry.
Internal systems that generate and store data also impose data obligations because relevant staff across the organisation are required to use them (e.g., recording sales, entering production output). These proprietary datasets are rarely intentionally released for public access. Their value depends on the potential capital loss if they are made public. A data breach that reveals customers’ personal data can result in a large loss of social and symbolic capital. For many organisations, there is also a significant invisible loss resulting from a failure to exploit their internal data. Many enterprises are data rich and information poor because they don’t mine their systems of record.
Data rights
Data rights determine who can use the data accumulated through a data obligation. Data rights are typically tightly guarded by the data collector because they are a source of competitive advantage. In the case of Amazon’s marketplace, Amazon has sometimes excluded vendors from accessing transaction data, such as not revealing a buyer’s address details to the vendor. As Amazon has over 100 private-label brands directly competing with its vendors, its control of data rights can be a tremendous competitive edge.
From a strategic perspective, ideally you want to exploit data obligations to create a high-value system of record to which you have exclusive access. In the worst case, you rely on public data accessible by all.
System of inquiry
A system of inquiry converts data into information. In the case of AI, large volumes of structured or unstructured data are needed to build a predictive model.
Structured data are typically tables with rows and columns that clearly define data attributes. Each column holds values of a single data type (e.g., all numeric). Because of this structure, it is less expensive to build a machine-learning model based on structured data.
Machine learning with structured data
A Metro Trains Melbourne maintenance facility is an 1882 timber portal-framed building of historical significance. Only limited structural or façade changes can be undertaken to stabilise the building. Under current Health and Safety Procedures, all staff must evacuate when prevailing wind conditions exceed 80 km/h, resulting in a significant loss of productivity. A system was needed to monitor wind speed and direction and manage workspace evacuation.
A wind-pressure modelling study (a system of inquiry) determined that not all building zones must be evacuated when winds exceed 80 km/h. Some zones can withstand winds over 140 km/h. Digital Frontier Partners designed and implemented an Internet of Things (IoT) solution whereby the building was divided into five distinct zones, each with a different safety threshold for wind speed.
In this case, structured data were available, or could be collected, for the parameters determining the stability of the building. This high-value system of record fed a machine-learning algorithm.
Machine-learning algorithms are readily available in open-source data-analytics ecosystems, such as Python and R. While some coding is required to deploy them, software customisation is low.
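As an illustration, the following minimal sketch shows how little bespoke code such an approach requires. It uses Python with the open-source scikit-learn package; the file name, column names, and evacuation labels are hypothetical, not details of the Metro Trains system.

```python
# A minimal sketch (not the actual Metro Trains system) of fitting an
# off-the-shelf model to structured sensor data with scikit-learn.
# The file name, columns, and labels are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical structured system of record: one row per sensor reading.
readings = pd.read_csv("wind_readings.csv")  # wind_speed, wind_direction, zone, evacuate

X = readings[["wind_speed", "wind_direction", "zone"]]
y = readings["evacuate"]  # 1 = zone should be evacuated, 0 = safe to occupy

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Default settings of a standard algorithm: minimal customisation.
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)

print("Hold-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

The point is not the particular algorithm but that, once a structured system of record exists, an off-the-shelf model can be fitted with only a few lines of code.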
Metro Trains gained a boost in internal efficiency when it exploited a high-value system of record using lightly customised software.
Text, video, and audio files are unstructured data. They are not decomposable into a tabular format. Building a machine-learning model, such as an LLM, from unstructured data is more expensive.
Machine learning with unstructured data
Local government administrations routinely grant permits to establish food preparation services. They need to ensure that an applicant’s plan complies with federal, state, and local provisions. Navigating these standards is challenging, especially if an applicant has limited English skills. It is quite common for cities to employ personnel to reduce the compliance complexity of dealing with unstructured legal documents.
DFP is working with a large local government in Victoria to create a localised LLM populated with the applicable food-safety regulations. The LLM-based application supports local government employees in answering an applicant’s questions and processing their application. Furthermore, it can use an AI translation app to respond in multiple languages; for example, a person who had migrated from Iran wanted answers in English and Farsi.
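A common pattern for such localised LLM applications is to retrieve the regulation passages most relevant to a question and supply them to the model as context. The sketch below, a minimal illustration in Python using scikit-learn’s TF-IDF vectoriser, shows only that retrieval step; the passages and question are invented, and DFP’s actual implementation may differ.

```python
# A minimal sketch of the retrieval step: find the regulation passages most
# relevant to an applicant's question before handing them to an LLM as context.
# The passages below are invented examples, not actual Victorian provisions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

passages = [
    "Food premises must provide hand-washing facilities in food preparation areas.",
    "Mobile food vans require a separate permit from the local council.",
    "Potentially hazardous food must be stored at or below 5 degrees Celsius.",
]

question = "Do I need a permit for a food truck?"

# Score each passage against the question using TF-IDF cosine similarity.
vectoriser = TfidfVectorizer().fit(passages + [question])
scores = cosine_similarity(vectoriser.transform([question]),
                           vectoriser.transform(passages))[0]

# The best-matching passage would be included in the prompt sent to the LLM.
print(passages[scores.argmax()])
```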
Other local governments can readily copy this pioneering application. The data are publicly available, and the app does not require a high level of tailoring. Consequently, this case falls into the table stakes quadrant, as it will become an expected service of local governments.
AI strategy
Based on the available data’s value and the software customisation level, we identify four generic strategies, as shown in the following figure.
AI strategies

Table stakes
Table stakes refers to what customers, citizens, or employees expect an organisation to provide. They typically include a website, a chat line, and a call centre. Some AI applications will soon emerge as table stakes, such as an intelligent agent to answer product queries and handle orders. Applications that are initially novel quickly become widely adopted because the necessary data are readily accessible and off-the-shelf or easily customised software is available.
Internal efficiency
Companies can exploit high-value, usually internal, data with easily deployed software components, such as a machine-learning package for structured data, to raise internal efficiency. This under-exploited option has recently become more accessible because of AI, yet many enterprises have effectively ignored the opportunity to mine their internal data for decades.
External effectiveness
An insurance company could raise its external effectiveness by creating a highly customised system based on public data. For example, it might invest heavily in software using freely available Bureau of Meteorology weather data and public land records to set flood insurance rates for specific buildings.
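As a purely hypothetical illustration of how such public datasets might be combined, the sketch below joins weather and land records and computes a crude flood-exposure score; the file names, columns, and scoring rule are invented, not an actual underwriting method.

```python
# A hypothetical sketch of combining public datasets to score flood exposure.
# The file names, columns, and scoring rule are invented for illustration;
# a production pricing system would be far more sophisticated.
import pandas as pd

rainfall = pd.read_csv("bom_annual_rainfall_by_locality.csv")  # locality, annual_rainfall_mm
land = pd.read_csv("public_land_records.csv")  # property_id, locality, elevation_m, floodplain

properties = land.merge(rainfall, on="locality", how="left")

# Crude illustrative score: heavier rainfall, lower elevation, and a
# floodplain location all increase the flood-risk score.
properties["flood_risk"] = (
    properties["annual_rainfall_mm"] / 1000
    + (50 - properties["elevation_m"]).clip(lower=0) / 10
    + properties["floodplain"].astype(int) * 2
)

print(properties[["property_id", "flood_risk"]].head())
```

The competitive effort here lies in the heavy software customisation, since the underlying data are available to every insurer.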
Competitive advantage
By combining high-value data with specialised software, a company can gain a capital asymmetry that is difficult to replicate. Amazon and Netflix have large proprietary data sets and highly customised AI software, facilitating monopoly power. They are effective AI strategists.
Next steps
Frameworks, such as the AI strategy quadrant we have introduced, are useful analytical tools. Their effectiveness increases when they are amply illustrated by practical application. Our next step is to find organisations that exemplify each of the quadrants; these will be reported in the next issue of Productive Thinking.
Critical reflections
- Are you exploiting your high-value data?
- What AI apps will become table stakes in your industry?
- Should you aim for internal efficiency or external effectiveness?
- Do you have the requisite high-value systems of record that, with highly tailored software, could create a strategic advantage?
Authors
Rick Watson
Research Director
Pieter Snyman
Partner, CTO & CSIO