Dataverse retention: The conversation nobody wants to have

Dataverse gives us security, relationships, forms, views, business rules, APIs, model-driven apps, Power Automate integration, Power Pages integration, ALM support, auditing, and a much better enterprise story than a random SharePoint list pretending to be a database.

But there is one conversation that teams often postpone for too long:

How long are we going to keep this data?

Not “can we store it?”
Not “can we build the app?”
Not “can we automate the process?”

But:

When should this data stop being active business data?
When should it be archived?
When should it be deleted?
Who is responsible for making that decision?

That is the Dataverse retention conversation nobody wants to have.

And the longer we avoid it, the more expensive and uncomfortable it becomes.

Dataverse is not a magical infinite drawer

Dataverse is a powerful enterprise data platform, but that does not mean every row should live forever in the active application database.

In real enterprise projects, data growth happens quietly.

A supplier onboarding app starts with a few hundred submissions.
A request management app grows into thousands of records.
A Power Pages portal collects forms, attachments, comments, approvals and audit history.
A Power Automate process creates logs, status rows, integration payloads and error traces.
A model-driven app becomes the operational system for an entire department.

At first, nobody cares.

The app works.
The process is automated.
The business is happy.

Then someone opens the Power Platform admin center and notices that Dataverse capacity is growing. Database storage is growing. File storage is growing. Log storage is growing. Suddenly, the conversation becomes less about innovation and more about ownership, cost, compliance and cleanup.

Microsoft’s capacity model separates Dataverse usage into database, file, and log storage, so it is important to understand what kind of data is actually growing before jumping to conclusions. You can read more in Microsoft’s documentation on Dataverse capacity-based storage.

This is usually the moment when somebody asks:

“Can’t we just delete the old records?”

Maybe.
But maybe not.

And that is exactly why retention should be discussed before the database becomes painful.

Retention is not just a technical setting

A common mistake is treating retention as a technical configuration.

It is not.

Retention is a business, compliance, legal, security, architectural and operational decision that only happens to be implemented technically.

Before creating any retention policy, someone needs to answer questions like:

Why do we store this data?
Who owns this data?
How long is it useful for daily operations?
How long must it be kept for audit, legal or regulatory reasons?
When does it become inactive?
Who is allowed to view historical records?
Can the data be anonymized instead of retained?
Can attachments be removed earlier than the main business record?
What happens if a user requests deletion?
What happens if the business wants to reopen an old case?
Does reporting still need this data?
Does another system already store the official record?

Those questions cannot be answered by a developer alone.

They need input from the business owner, data owner, security, privacy, compliance and platform team. In smaller organizations, one person may wear several of those hats. In large enterprises, this can become a proper governance conversation.

That is uncomfortable.

But it is much better than discovering after three years that nobody knows whether old Dataverse records can be deleted.

Active data, inactive data and deleted data

A useful way to explain retention is to split the data lifecycle into three stages.

1. Active data

This is data used by the application today.

Users search it, update it, approve it, reject it, report on it and trigger flows from it. Active data belongs in the normal Dataverse transactional store.

Examples:

open requests
active suppliers
current approvals
ongoing cases
recently submitted Power Pages forms
records still needed by business users in the app

2. Inactive data

This is data that is no longer needed for daily operations, but still needs to be kept.

Maybe it is required for audit.
Maybe it supports legal discovery.
Maybe the business needs read-only access for historical reference.
Maybe regulations say it must be retained for several years.

Inactive data should not necessarily consume the same active database space forever.

This is where Dataverse long-term retention can become relevant.

3. Deleted data

This is data that no longer needs to be retained.

The business process is complete.
The required retention period has passed.
There is no legal hold.
There is no reporting need.
There is no business justification to keep it.

At that point, deletion should be a normal part of the lifecycle, not an emergency cleanup project.

One important nuance: depending on environment settings, deleted records may be recoverable for a short configured period. Microsoft has a Keep deleted Dataverse records capability that can retain deleted records for a configurable period between 1 and 30 days, if enabled. But this should not be confused with a retention strategy or long-term archive.

Short-term recoverability is not the same thing as business data retention.
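To make the three stages concrete, here is a minimal sketch of how a record could be classified. It is plain Python, not a Dataverse API: the `RecordState` fields, the `legal_hold` flag and the idea of counting retention from a close date are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Stage(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"
    DELETABLE = "deletable"


@dataclass
class RecordState:
    closed_on: Optional[date]  # None means the business process is still open
    legal_hold: bool = False   # hypothetical flag maintained by the data owner


def lifecycle_stage(record: RecordState, today: date, retain_days: int) -> Stage:
    """Classify a record into one of the three lifecycle stages."""
    if record.closed_on is None:
        return Stage.ACTIVE
    retention_ends = record.closed_on + timedelta(days=retain_days)
    # A legal hold keeps the record retained even after the period has passed.
    if record.legal_hold or today < retention_ends:
        return Stage.INACTIVE
    return Stage.DELETABLE


print(lifecycle_stage(RecordState(closed_on=date(2020, 1, 1)), date(2025, 1, 1), 365))
```

The useful part is not the code itself but the inputs it forces you to name: a close date, a retention period and a hold flag. If nobody can supply those, the record cannot be classified.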

Backups are not retention

This is one of the most important points to explain to stakeholders.

Backups are not a retention strategy.

Backups protect you from operational problems: accidental deletion, failed deployments, broken changes, disaster recovery scenarios and environment restoration needs.

Retention answers a different question:

“What data should we keep, in what state, for how long, and why?”

Keeping a backup for a few days does not mean you have a compliant data retention strategy. It also does not mean users can conveniently access historical business data.

Microsoft documents environment backups separately from data retention. By default, system and manual backups are retained for up to seven days, and for production Managed Environments the retention period can be extended up to 28 days. You can read more in Back up and restore environments.

Backups and retention solve different problems.

So when someone says:

“We have backups, so we are covered.”

The answer should be:

“Covered for restore scenarios, maybe. Not necessarily covered for business data retention.”

Audit logs are also a separate conversation

Audit logs are another area people mix into the same bucket.

Dataverse auditing can help track user and system activity, changes to data and access-related events. That is important for compliance and investigations.

But audit retention is not the same as application data retention.

Your business record and the audit history of that record may have different requirements. For example:

a business request may need to be kept for 6 years
audit logs may need to be kept for a different period
plugin trace logs may have a much shorter operational value
attachments may have a different lifecycle again

Microsoft’s auditing documentation describes audit settings and audit log retention as a separate admin configuration. See Manage Dataverse auditing.

This is why saying “retain everything for seven years” sounds simple, but often becomes expensive, unclear and difficult to manage.

A better question is:

“What exactly needs to be retained, for what purpose, and in which storage type?”

The storage conversation: Database, file and log

Dataverse capacity is not one single bucket.

In practical terms, you need to think about three major storage types.

Database storage

Database storage is mostly structured Dataverse table data and platform data that is not counted as file or log storage.

This is where most business records live.

File storage

File storage includes attachments, notes with files, images and file columns.

This can grow quickly in Power Pages and request management scenarios, especially when users upload PDFs, photos, forms, contracts, screenshots or supporting evidence.

Log storage

Log storage includes records such as audit data and plug-in trace logs. Microsoft’s capacity documentation explains storage usage by database, file and log in the Power Platform admin center. More details are available in Dataverse capacity-based storage.

When designing retention, you need to understand which part of the solution is actually growing.

A table with 20,000 rows may not be the issue.
A table with 20,000 rows and 80,000 attachments might be.
A heavily audited environment with aggressive logging may have a completely different problem.

Retention starts with visibility.
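As a rough illustration of what that visibility can look like, the sketch below groups growth by storage type. The table names and megabyte figures are invented, and I am assuming you have copied two snapshots of the numbers by hand from the capacity report; nothing here calls a real admin API.

```python
from collections import defaultdict

# Hypothetical snapshot: (table, storage type, MB last quarter, MB now)
usage = [
    ("request", "database", 400, 450),
    ("request attachments", "file", 2_000, 5_500),
    ("audit", "log", 800, 1_200),
]


def growth_by_storage_type(rows):
    """Sum the growth in MB per storage type across all tables."""
    totals = defaultdict(int)
    for _table, storage_type, before_mb, now_mb in rows:
        totals[storage_type] += now_mb - before_mb
    return dict(totals)


growth = growth_by_storage_type(usage)
fastest = max(growth, key=growth.get)
print(growth, fastest)
```

With these made-up numbers the business table barely moves while file storage grows the most, which is the pattern the attachment example above describes.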

Long-term retention is useful, but it is not an undo button

Dataverse long-term retention is a platform capability designed to retain inactive data in a cost-efficient way while keeping it within Dataverse security boundaries.

But it must be understood properly.

Once data is retained as long-term inactive data, it cannot be moved back into the live Dataverse transactional store. This means long-term retention should not be treated as a reversible archive button. Microsoft states this clearly in the Dataverse long-term retention overview and in the documentation for setting a data retention policy for a table.

Once data is retained, it is not simply “moved to another folder” where business users can continue working with it as before. Long-term retained data is read-only and should be used for limited inquiry, compliance, audit and legal discovery scenarios.

That means long-term retention is not a solution for data that users still need to update.

If the business says:

“We might need to reopen those old records and continue processing them.”

Then you need to be very careful.

Maybe the records are not truly inactive.
Maybe the process needs a “closed but reopenable” state.
Maybe only some child data should be archived.
Maybe reporting should be moved to an analytical layer.
Maybe the business is not ready for long-term retention yet.

This is exactly why retention is an architecture conversation, not just an admin setting.

Long-term retention does not support everything

Another important caveat: long-term retention is not available for every table or every storage type.

Microsoft’s documentation says that Dataverse standard tables (except system tables) and custom tables can be retained, along with attachments and images. However, audit tables and elastic tables are currently not supported for long-term retention. See Types of data retained long term.

This matters because people often hear “Dataverse retention” and assume everything can be moved into the same long-term store.

That is not the case.

Audit retention still needs to be handled separately. Elastic table scenarios need separate consideration. Operational logs may need their own cleanup strategy. Attachments and images should be reviewed carefully, especially in Power Pages scenarios where external users may upload many files.
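One way to keep this check honest during design is to write the support list down once and test every candidate against it. The sketch below does that in plain Python. The category names are my own shorthand for the list summarized above, not identifiers from the platform, and the list should be verified against the current Microsoft documentation before you rely on it.

```python
from typing import Optional

# Categories as summarized in this article; verify against current documentation.
RETAINABLE = {"standard", "custom", "attachment", "image"}
NOT_RETAINABLE = {"system", "audit", "elastic"}


def long_term_retention_supported(data_kind: str) -> Optional[bool]:
    """Return True or False for known categories, None when a manual check is needed."""
    kind = data_kind.strip().lower()
    if kind in RETAINABLE:
        return True
    if kind in NOT_RETAINABLE:
        return False
    return None


for kind in ("custom", "audit", "plugin trace log"):
    print(kind, long_term_retention_supported(kind))
```

Returning `None` for anything unknown is deliberate: an unlisted data type should trigger a question, not a silent yes.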

The right question is not:

“Can Dataverse retain data?”

The better question is:

“Which data exactly are we talking about, and is that data type supported by the retention mechanism we want to use?”

Long-term retention is not reporting architecture

This deserves its own section.

Long-term retention should not be treated as the primary reporting layer for historical analytics.

If historical data is needed for frequent Power BI reporting, trend analysis, operational dashboards or complex queries across multiple related tables, long-term retention may not be the right primary design.

Microsoft documents several limitations when viewing long-term retained data. For example, retained data queries are limited, joins and aggregation functions are not allowed, and Microsoft suggests considering Fabric for complex queries and Power BI options. See View long-term retained data in Microsoft Dataverse.

Microsoft’s FAQ also states that using the Dataverse connector with Power BI for reporting long-term retained data is not supported, while live and retained data can be accessed with Microsoft Fabric. See Frequently asked questions about long-term data retention with Dataverse.

So if your requirement is:

“We need to analyze five years of historical data every day in Power BI.”

Then the answer is probably not:

“Let’s just use long-term retention.”

The better architectural conversation is:

“Should historical reporting be separated from the transactional Dataverse app?”

In many enterprise solutions, that may mean using Microsoft Fabric, a data lake, Synapse Link patterns, or another analytical layer approved by your organization.

Long-term retention is for rarely accessed inactive data that must still be retained. It is not a replacement for a proper reporting architecture.

The dangerous question: “Can we delete it?”

Eventually, every Dataverse-heavy solution faces this question.

And the answer should never be casual.

Before deleting records, ask:

Is this the system of record?
Does another system store the same information?
Are there legal or regulatory retention requirements?
Is there a business owner who approved deletion?
Are there related child records?
Are there attachments, notes, emails or activities?
Are there Power BI reports depending on this data?
Are there integrations expecting those records to exist?
Are there audit requirements?
Are there open support cases related to the data?
Was the deletion tested in a lower environment?

Deleting old rows without understanding relationships can break reporting, integrations, app logic and user trust.

The real problem is not deletion itself.

The real problem is deleting data without ownership.
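Part of the checklist above can be turned into a simple gate that a cleanup job has to pass before it runs. This is a sketch under assumptions: the `DeletionRequest` fields, the table name and the report name are hypothetical, and a real implementation would pull these answers from wherever your organization records approvals.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DeletionRequest:
    table: str
    owner_approved: bool = False
    retention_period_passed: bool = False
    legal_hold: bool = False
    tested_in_lower_environment: bool = False
    # Reports or integrations known to read these records (hypothetical names)
    dependents: List[str] = field(default_factory=list)


def deletion_blockers(request: DeletionRequest) -> List[str]:
    """Return the reasons a deletion should not proceed yet (empty list = no known blocker)."""
    blockers = []
    if not request.owner_approved:
        blockers.append("no business owner approval")
    if not request.retention_period_passed:
        blockers.append("retention period has not passed")
    if request.legal_hold:
        blockers.append("legal hold in place")
    if not request.tested_in_lower_environment:
        blockers.append("deletion not tested in a lower environment")
    for name in request.dependents:
        blockers.append(f"dependency not reviewed: {name}")
    return blockers


request = DeletionRequest(
    table="supplier_request",
    owner_approved=True,
    retention_period_passed=True,
    dependents=["Monthly supplier report"],
)
print(deletion_blockers(request))
```

Everything defaults to "not confirmed", so a request nobody filled in is blocked rather than waved through.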

Plugin trace logs are short-term troubleshooting data

Plugin trace logs deserve a special mention because they are often misunderstood.

They are useful for troubleshooting custom Dataverse plug-ins, but they are not a long-term business archive.

Microsoft documents that PluginTraceLog records have a finite lifetime. A background bulk deletion job runs once per day to delete records older than 24 hours, and Microsoft may disable logging to the Plug-in Trace Log if the plugintracelogbase table exceeds 100 GB. See Tracing and logging.

That is a strong signal: plugin trace logs should be treated as short-term troubleshooting data.

If someone wants to keep integration logs, error history or operational diagnostics for longer, that should be designed intentionally. It may require a separate logging table, Application Insights, Azure logging, external monitoring, or another approved operational logging approach.

But plugin trace logs themselves should not become the place where a solution quietly stores years of troubleshooting history.

A practical retention model for Dataverse solutions

For most enterprise Power Platform solutions, I like to define a simple retention model during design.

It does not have to be perfect on day one. But it should exist.

Here is a practical starting point.

1. Define the business record

What is the main record?

Examples:

supplier request
application access request
inspection case
contract review
incident report
employee submission
onboarding request

This matters because retention usually starts with the parent business record.

2. Define the active period

How long is the record actively used?

Examples:

until approval is complete
until the case is closed
until the supplier is onboarded
until the fiscal year ends
until the review cycle is complete

3. Define the inactive period

After the active period, how long should it remain available for read-only reference?

Examples:

1 year for operational reference
3 years for internal audit
6 years for legal or tax-related requirements
longer if required by regulation

4. Define the deletion rule

When can the data be permanently removed?

This should be explicit. “Never” is not a strategy unless there is a real legal or business reason behind it.

5. Define exceptions

Some records may need different treatment.

Examples:

legal hold
ongoing investigation
VIP/customer escalation
records linked to financial reporting
records involved in disputes
records with personal data deletion requests

6. Define ownership

Every retention rule needs an owner.

Not “IT”.
Not “the platform team”.
Not “the developer”.

A real business owner.
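The model can also be written down as a small data structure that refuses vague answers. The sketch below covers steps 1 to 4 and step 6; exceptions from step 5 are left out to keep it short. The field names and the list of placeholder owners are assumptions for illustration, not part of any Dataverse feature.

```python
from dataclasses import dataclass
from typing import List, Optional

# Answers that do not count as an owner in this model
PLACEHOLDER_OWNERS = {"", "it", "the platform team", "the developer"}


@dataclass(frozen=True)
class RetentionRule:
    business_record: str               # step 1
    active_until: str                  # step 2, e.g. "case closed"
    inactive_years: int                # step 3
    delete_after_years: Optional[int]  # step 4, None means "never"
    owner: str                         # step 6
    never_delete_reason: str = ""      # required when delete_after_years is None

    def problems(self) -> List[str]:
        """List what is still missing or inconsistent in this rule."""
        issues = []
        if self.owner.strip().lower() in PLACEHOLDER_OWNERS:
            issues.append("owner must be a real business owner")
        if self.delete_after_years is None:
            if not self.never_delete_reason.strip():
                issues.append('"never delete" needs a legal or business reason')
        elif self.delete_after_years < self.inactive_years:
            issues.append("deletion is scheduled before the inactive period ends")
        return issues


draft = RetentionRule("supplier request", "supplier onboarded", 6, None, "IT")
print(draft.problems())
```

A draft rule owned by "IT" with no deletion rule comes back with two problems, which is exactly the conversation the design review should have.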

A simple retention decision table

You can use a table like this during solution design.

Data type	Active use	Retain as inactive	Delete after	Owner	Notes
Main request record	Until closed	6 years	7 years	Process owner	Needed for audit
Attachments	Until closed	6 years	7 years	Process owner	Check sensitivity and file size
Approval comments	Until closed	6 years	7 years	Process owner	Part of decision history
Integration logs	30 days	90 days	180 days	App owner	Operational troubleshooting only
Plugin trace logs	Short-term troubleshooting only	Normally not retained	Default cleanup removes records older than 24 hours	Platform team	Avoid changing this without a clear admin reason
Audit logs	Depends on policy	Depends on compliance	Depends on audit retention	Compliance / admin	Separate from business record

The exact numbers are not the point.

The point is that the conversation becomes visible.
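Once the numbers are agreed, they can be turned into dates for a specific record. The sketch below takes the "6 years retained, delete after 7 years" row and assumes both periods are counted from the date the request was closed; that reading is my assumption, and your policy may count from a different event.

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Add whole years, falling back to 28 February when 29 February does not exist."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def milestones(closed_on: date, retain_years: int, delete_after_years: int) -> dict:
    """Compute the key lifecycle dates for one record."""
    return {
        "retained_until": add_years(closed_on, retain_years),
        "delete_on_or_after": add_years(closed_on, delete_after_years),
    }


plan = milestones(date(2024, 2, 29), retain_years=6, delete_after_years=7)
print(plan)
```

Even this tiny example raises a policy question (what happens to a record closed on 29 February?), which is a good reason to agree on the counting rule in writing.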

What I would ask before building the app

If I were reviewing a Dataverse solution design, I would ask (NOW 😊) these questions before giving a green light to a solution:

What is the main business record?
Is Dataverse the system of record?
What data is personal, confidential or regulated?
What is the active lifecycle of the record?
What happens after the process is completed?
Who owns the retention decision?
What should be archived?
What should be deleted?
What should never be stored in the first place?
Are attachments really needed in Dataverse?
Is audit enabled, and why?
How long are audit logs retained?
Are there Power BI reports depending on old records?
Are integrations depending on historical data?
Do we need a legal hold process?
Do we have a tested cleanup approach?
What will this solution look like after three years of usage?

That last question is often the most important one.

Power Platform makes it easy to build for today.
Enterprise architecture forces us to think about year three.

Common mistakes I see

Mistake 1: Treating Dataverse as permanent storage for everything

Dataverse is excellent, but not every file, log, payload or historical snapshot needs to live forever in the active database.

Mistake 2: Enabling audit without discussing retention

Auditing is valuable, but it has a cost and a lifecycle. “Turn it on for everything” is not always a responsible decision.

Mistake 3: Confusing backup retention with business retention

Backups help you restore environments. They do not replace a proper business data retention model.

Mistake 4: Ignoring attachments

Attachments are often the silent capacity killer, especially in Power Pages scenarios.

Mistake 5: Waiting until capacity becomes a problem

At that point, every cleanup decision becomes urgent, political and risky.

Mistake 6: No business owner

If nobody owns the data, nobody can safely approve its deletion.

Mistake 7: Treating long-term retention as reporting architecture

Long-term retention is not designed to replace a proper reporting layer for frequent analytics, trend analysis or operational dashboards.

My recommendation

For every serious Dataverse solution, add a small “Data Lifecycle & Retention” section to the solution design.

It does not need to be complicated.

Start with:

what data is stored
why it is stored
who owns it
how long it is active
how long it is retained
when it is deleted
whether audit is required
whether attachments are required
whether reporting needs historical data
what happens during offboarding or decommissioning

If long-term retention is considered, validate whether the target tables and storage types are supported. Microsoft’s documentation for Dataverse long-term retention and setting a retention policy for a table should be reviewed before implementation.

This one section can prevent a lot of pain later.

Because Dataverse retention is not only about saving storage.

It is about building solutions that are responsible, explainable and sustainable.

Final thought

The uncomfortable truth is this:

Power Platform makes it very easy to create business data.

It does not automatically make your organization good at managing the lifecycle of that data.

That part still needs architecture.
It needs ownership.
It needs governance.
It needs a conversation that many teams prefer to avoid.

But if your solution matters enough to store data in Dataverse, it matters enough to define what should happen to that data later.

That is the retention conversation.

And it is better to have it before the storage warning appears.

