Dataverse retention: The conversation nobody wants to have

Power Platform projects usually start with a beautiful sentence:

“Let’s store it in Dataverse.”

And honestly, most of the time, that is a good decision.

Dataverse gives us security, relationships, forms, views, business rules, APIs, model-driven apps, Power Automate integration, Power Pages integration, ALM support, auditing, and a much better enterprise story than a random SharePoint list pretending to be a database.

But there is one conversation that teams often postpone for too long:

How long are we going to keep this data?

Not “can we store it?”
Not “can we build the app?”
Not “can we automate the process?”

But:

When should this data stop being active business data?
When should it be archived?
When should it be deleted?
Who is responsible for making that decision?

That is the Dataverse retention conversation nobody wants to have.

And the longer we avoid it, the more expensive and uncomfortable it becomes.


Dataverse is not a magical infinite drawer

Dataverse is a powerful enterprise data platform, but that does not mean every row should live forever in the active application database.

In real enterprise projects, data growth happens quietly.

A supplier onboarding app starts with a few hundred submissions.
A request management app grows into thousands of records.
A Power Pages portal collects forms, attachments, comments, approvals and audit history.
A Power Automate process creates logs, status rows, integration payloads and error traces.
A model-driven app becomes the operational system for an entire department.

At first, nobody cares.

The app works.
The process is automated.
The business is happy.

Then someone opens the Power Platform admin center and notices that Dataverse capacity is growing. Database storage is growing. File storage is growing. Log storage is growing. Suddenly, the conversation becomes less about innovation and more about ownership, cost, compliance and cleanup.

Microsoft’s capacity model separates Dataverse usage into database, file, and log storage, so it is important to understand what kind of data is actually growing before jumping to conclusions. You can read more in Microsoft’s documentation on Dataverse capacity-based storage.

This is usually the moment when somebody asks:

“Can’t we just delete the old records?”

Maybe.
But maybe not.

And that is exactly why retention should be discussed before the database becomes painful.


Retention is not just a technical setting

A common mistake is treating retention as a technical configuration.

It is not.

Retention is a business, compliance, legal, security, architectural and operational decision that only happens to be implemented technically.

Before creating any retention policy, someone needs to answer questions like:

  • Why do we store this data?
  • Who owns this data?
  • How long is it useful for daily operations?
  • How long must it be kept for audit, legal or regulatory reasons?
  • When does it become inactive?
  • Who is allowed to view historical records?
  • Can the data be anonymized instead of retained?
  • Can attachments be removed earlier than the main business record?
  • What happens if a user requests deletion?
  • What happens if the business wants to reopen an old case?
  • Does reporting still need this data?
  • Does another system already store the official record?

Those questions cannot be answered by a developer alone.

They need input from the business owner, data owner, security, privacy, compliance and platform team. In smaller organizations, one person may wear several of those hats. In large enterprises, this can become a proper governance conversation.

That is uncomfortable.

But it is much better than discovering after three years that nobody knows whether old Dataverse records can be deleted.


Active data, inactive data and deleted data

A useful way to explain retention is to split the data lifecycle into three stages.

1. Active data

This is data used by the application today.

Users search it, update it, approve it, reject it, report on it and trigger flows from it. Active data belongs in the normal Dataverse transactional store.

Examples:

  • open requests
  • active suppliers
  • current approvals
  • ongoing cases
  • recently submitted Power Pages forms
  • records still needed by business users in the app

2. Inactive data

This is data that is no longer needed for daily operations, but still needs to be kept.

Maybe it is required for audit.
Maybe it supports legal discovery.
Maybe the business needs read-only access for historical reference.
Maybe regulations say it must be retained for several years.

Inactive data should not necessarily consume the same active database space forever.

This is where Dataverse long-term retention can become relevant.

3. Deleted data

This is data that no longer needs to be retained.

The business process is complete.
The required retention period has passed.
There is no legal hold.
There is no reporting need.
There is no business justification to keep it.

At that point, deletion should be a normal part of the lifecycle, not an emergency cleanup project.

One important nuance: depending on environment settings, deleted records may be recoverable for a short period. Microsoft’s “Keep deleted Dataverse records” capability, if enabled, keeps deleted records recoverable for a configurable period of 1 to 30 days. This should not be confused with a retention strategy or a long-term archive.

Short-term recoverability is not the same thing as business data retention.
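
The three stages above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the names `Stage`, `LifecycleRule` and `classify` are hypothetical, the retention period is counted from the day the record was closed, and a legal hold always blocks deletion. It is not a Dataverse API.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Stage(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"
    DELETABLE = "deletable"


@dataclass(frozen=True)
class LifecycleRule:
    # Hypothetical rule: days a closed record must be kept before deletion.
    retention_days: int


def classify(closed_on: Optional[date], legal_hold: bool,
             rule: LifecycleRule, today: date) -> Stage:
    """Return the lifecycle stage of one record on a given day."""
    if closed_on is None:
        # The business process is not finished: still active data.
        return Stage.ACTIVE
    if legal_hold:
        # A legal hold keeps the record regardless of its age.
        return Stage.INACTIVE
    if today >= closed_on + timedelta(days=rule.retention_days):
        # Retention period has passed and nothing blocks deletion.
        return Stage.DELETABLE
    return Stage.INACTIVE
```

Even a toy function like this forces the useful questions: what counts as “closed”, who sets the legal hold flag, and who decided the number of days.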


Backups are not retention

This is one of the most important points to explain to stakeholders.

Backups are not a retention strategy.

Backups protect you from operational problems: accidental deletion, failed deployments, broken changes, disaster recovery scenarios and environment restoration needs.

Retention answers a different question:

“What data should we keep, in what state, for how long, and why?”

Keeping a backup for a few days does not mean you have a compliant data retention strategy. It also does not mean users can conveniently access historical business data.

Microsoft documents environment backups separately from data retention. By default, system and manual backups are retained for up to seven days, and for production Managed Environments the retention period can be extended up to 28 days. You can read more in Back up and restore environments.

Backups and retention solve different problems.

So when someone says:

“We have backups, so we are covered.”

The answer should be:

“Covered for restore scenarios, maybe. Not necessarily covered for business data retention.”


Audit logs are also a separate conversation

Audit logs are another area people mix into the same bucket.

Dataverse auditing can help track user and system activity, changes to data and access-related events. That is important for compliance and investigations.

But audit retention is not the same as application data retention.

Your business record and the audit history of that record may have different requirements. For example:

  • a business request may need to be kept for 6 years
  • audit logs may need to be kept for a different period
  • plugin trace logs may have a much shorter operational value
  • attachments may have a different lifecycle again

Microsoft’s auditing documentation describes audit settings and audit log retention as a separate admin configuration. See Manage Dataverse auditing.

This is why saying “retain everything for seven years” sounds simple, but often becomes expensive, unclear and difficult to manage.

A better question is:

“What exactly needs to be retained, for what purpose, and in which storage type?”


The storage conversation: Database, file and log

Dataverse capacity is not one single bucket.

In practical terms, you need to think about three major storage types.

Database storage

Database storage is mostly structured Dataverse table data and platform data that is not counted as file or log storage.

This is where most business records live.

File storage

File storage includes attachments, notes with files, images and file columns.

This can grow quickly in Power Pages and request management scenarios, especially when users upload PDFs, photos, forms, contracts, screenshots or supporting evidence.

Log storage

Log storage includes records such as audit data and plug-in trace logs. Microsoft’s capacity documentation explains storage usage by database, file and log in the Power Platform admin center. More details are available in Dataverse capacity-based storage.

When designing retention, you need to understand which part of the solution is actually growing.

A table with 20,000 rows may not be the issue.
A table with 20,000 rows and 80,000 attachments might be.
A heavily audited environment with aggressive logging may have a completely different problem.

Retention starts with visibility.
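
As a small illustration of that visibility, the sketch below compares two capacity snapshots and reports which storage type grew the most. The snapshot dictionaries are an assumption: I am imagining figures copied by hand from the admin center, in GB, and the function names are hypothetical.

```python
from typing import Dict

STORAGE_TYPES = ("database", "file", "log")


def growth_by_type(previous_gb: Dict[str, float],
                   current_gb: Dict[str, float]) -> Dict[str, float]:
    """Growth in GB per storage type between two snapshots."""
    return {
        kind: current_gb.get(kind, 0.0) - previous_gb.get(kind, 0.0)
        for kind in STORAGE_TYPES
    }


def fastest_growing(previous_gb: Dict[str, float],
                    current_gb: Dict[str, float]) -> str:
    """Name of the storage type with the largest absolute growth."""
    growth = growth_by_type(previous_gb, current_gb)
    return max(growth, key=lambda kind: growth[kind])
```

If the answer is “file”, the retention conversation is probably about attachments. If it is “log”, it is probably about auditing and tracing, not about business records.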


Long-term retention is useful, but it is not an undo button

Dataverse long-term retention is a platform capability designed to retain inactive data in a cost-efficient way while keeping it within Dataverse security boundaries.

But it must be understood properly.

Once data is retained as long-term inactive data, it cannot be moved back into the live Dataverse transactional store. This means long-term retention should not be treated as a reversible archive button. Microsoft states this clearly in the Dataverse long-term retention overview and in the documentation for setting a data retention policy for a table.

Once data is retained, it is not simply “moved to another folder” where business users can continue working with it as before. Long-term retained data is read-only and should be used for limited inquiry, compliance, audit and legal discovery scenarios.

That means long-term retention is not a solution for data that users still need to update.

If the business says:

“We might need to reopen those old records and continue processing them.”

Then you need to be very careful.

Maybe the records are not truly inactive.
Maybe the process needs a “closed but reopenable” state.
Maybe only some child data should be archived.
Maybe reporting should be moved to an analytical layer.
Maybe the business is not ready for long-term retention yet.

This is exactly why retention is an architecture conversation, not just an admin setting.


Long-term retention does not support everything

Another important caveat: long-term retention is not available for every table or every storage type.

Microsoft’s documentation says that standard tables (except system tables), custom tables, attachments and images can be retained. However, audit tables and elastic tables are currently not supported for long-term retention. See Types of data retained long term.

This matters because people often hear “Dataverse retention” and assume everything can be moved into the same long-term store.

That is not the case.

Audit retention still needs to be handled separately. Elastic table scenarios need separate consideration. Operational logs may need their own cleanup strategy. Attachments and images should be reviewed carefully, especially in Power Pages scenarios where external users may upload many files.

The right question is not:

“Can Dataverse retain data?”

The better question is:

“Which data exactly are we talking about, and is that data type supported by the retention mechanism we want to use?”
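
One way to make that question concrete is to classify an inventory of the solution’s data before anyone promises an archive. The sketch below encodes only the support rules described in this section. The `DataKind` enum, the function name and the inventory entries are hypothetical, and the kind of each table has to be assigned by a person who knows the solution.

```python
from enum import Enum
from typing import Dict, List, Tuple


class DataKind(Enum):
    STANDARD_TABLE = "standard table"
    CUSTOM_TABLE = "custom table"
    ATTACHMENT = "attachment"
    IMAGE = "image"
    SYSTEM_TABLE = "system table"
    AUDIT_TABLE = "audit table"
    ELASTIC_TABLE = "elastic table"


# Kinds described in this section as eligible for long-term retention.
RETAINABLE = frozenset({
    DataKind.STANDARD_TABLE,
    DataKind.CUSTOM_TABLE,
    DataKind.ATTACHMENT,
    DataKind.IMAGE,
})


def split_by_support(inventory: Dict[str, DataKind]) -> Tuple[List[str], List[str]]:
    """Split inventory names into (supported, unsupported), each sorted."""
    supported = sorted(n for n, k in inventory.items() if k in RETAINABLE)
    unsupported = sorted(n for n, k in inventory.items() if k not in RETAINABLE)
    return supported, unsupported
```

Everything that lands in the second list needs its own plan, and the list should be rechecked against the current Microsoft documentation because supported types can change.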


Long-term retention is not reporting architecture

This deserves its own section.

Long-term retention should not be treated as the primary reporting layer for historical analytics.

If historical data is needed for frequent Power BI reporting, trend analysis, operational dashboards or complex queries across multiple related tables, long-term retention may not be the right primary design.

Microsoft documents several limitations when viewing long-term retained data. For example, queries against retained data are limited, joins and aggregation functions are not allowed, and Microsoft suggests considering Microsoft Fabric for complex queries and Power BI reporting. See View long-term retained data in Microsoft Dataverse.

Microsoft’s FAQ also states that using the Dataverse connector with Power BI for reporting long-term retained data is not supported, while live and retained data can be accessed with Microsoft Fabric. See Frequently asked questions about long-term data retention with Dataverse.

So if your requirement is:

“We need to analyze five years of historical data every day in Power BI.”

Then the answer is probably not:

“Let’s just use long-term retention.”

The better architectural conversation is:

“Should historical reporting be separated from the transactional Dataverse app?”

In many enterprise solutions, that may mean using Microsoft Fabric, a data lake, Synapse Link patterns, or another analytical layer approved by your organization.

Long-term retention is for rarely accessed inactive data that must still be retained. It is not a replacement for a proper reporting architecture.
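
To show how quickly a typical reporting query runs into those limits, here is a minimal sketch that inspects a FetchXML query for joins and aggregation. It checks only the two limitations mentioned above, the function name is hypothetical, and it is not an official validator. Always confirm the current limits in Microsoft’s documentation.

```python
import xml.etree.ElementTree as ET
from typing import List


def retained_query_problems(fetch_xml: str) -> List[str]:
    """List reasons a FetchXML query is unsuitable for retained data."""
    root = ET.fromstring(fetch_xml)
    problems = []

    # A link-entity element is how FetchXML expresses a join.
    if any(True for _ in root.iter("link-entity")):
        problems.append("joins (link-entity) are not allowed")

    # Aggregation is switched on at the fetch element and declared per attribute.
    fetch_aggregates = root.get("aggregate", "false").lower() == "true"
    attr_aggregates = any(a.get("aggregate") for a in root.iter("attribute"))
    if fetch_aggregates or attr_aggregates:
        problems.append("aggregation is not allowed")

    return problems
```

A dashboard query that joins requests to approvals and counts them per month fails both checks, which is usually the moment the team agrees that reporting belongs in an analytical layer.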


The dangerous question: “Can we delete it?”

Eventually, every Dataverse-heavy solution faces this question.

And the answer should never be casual.

Before deleting records, ask:

  • Is this the system of record?
  • Does another system store the same information?
  • Are there legal or regulatory retention requirements?
  • Is there a business owner who approved deletion?
  • Are there related child records?
  • Are there attachments, notes, emails or activities?
  • Are there Power BI reports depending on this data?
  • Are there integrations expecting those records to exist?
  • Are there audit requirements?
  • Are there open support cases related to the data?
  • Was the deletion tested in a lower environment?

Deleting old rows without understanding relationships can break reporting, integrations, app logic and user trust.

The real problem is not deletion itself.

The real problem is deleting data without ownership.
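
The checklist above can be turned into an explicit gate that a cleanup job has to pass before it touches anything. This is a sketch with hypothetical field names that condense the questions into six yes/no answers. A real process would record who answered each one and when.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class DeletionChecklist:
    owner_approved: bool
    retention_period_passed: bool
    no_legal_hold: bool
    related_records_reviewed: bool
    reports_and_integrations_reviewed: bool
    tested_in_lower_environment: bool


def blocking_items(checklist: DeletionChecklist) -> List[str]:
    """Names of checklist items that are not yet satisfied."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


def may_delete(checklist: DeletionChecklist) -> bool:
    """Deletion is allowed only when nothing is blocking."""
    return not blocking_items(checklist)
```

The value is not in the code. It is in the fact that “owner_approved” cannot be set to true by a developer alone.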


Plugin trace logs are short-term troubleshooting data

Plugin trace logs deserve a special mention because they are often misunderstood.

They are useful for troubleshooting custom Dataverse plug-ins, but they are not a long-term business archive.

Microsoft documents that PluginTraceLog records have a finite lifetime. A background bulk deletion job runs once per day to delete records older than 24 hours, and Microsoft may disable logging to the Plug-in Trace Log if the plugintracelogbase table exceeds 100 GB. See Tracing and logging.

That is a strong signal: plugin trace logs should be treated as short-term troubleshooting data.

If someone wants to keep integration logs, error history or operational diagnostics for longer, that should be designed intentionally. It may require a separate logging table, Application Insights, Azure logging, external monitoring, or another approved operational logging approach.

But plugin trace logs themselves should not become the place where a solution quietly stores years of troubleshooting history.


A practical retention model for Dataverse solutions

For most enterprise Power Platform solutions, I like to define a simple retention model during design.

It does not have to be perfect on day one. But it should exist.

Here is a practical starting point.

1. Define the business record

What is the main record?

Examples:

  • supplier request
  • application access request
  • inspection case
  • contract review
  • incident report
  • employee submission
  • onboarding request

This matters because retention usually starts with the parent business record.

2. Define the active period

How long is the record actively used?

Examples:

  • until approval is complete
  • until the case is closed
  • until the supplier is onboarded
  • until the fiscal year ends
  • until the review cycle is complete

3. Define the inactive period

After the active period, how long should it remain available for read-only reference?

Examples:

  • 1 year for operational reference
  • 3 years for internal audit
  • 6 years for legal or tax-related requirements
  • longer if required by regulation

4. Define the deletion rule

When can the data be permanently removed?

This should be explicit. “Never” is not a strategy unless there is a real legal or business reason behind it.

5. Define exceptions

Some records may need different treatment.

Examples:

  • legal hold
  • ongoing investigation
  • VIP/customer escalation
  • records linked to financial reporting
  • records involved in disputes
  • records with personal data deletion requests

6. Define ownership

Every retention rule needs an owner.

Not “IT”.
Not “the platform team”.
Not “the developer”.

A real business owner.


A simple retention decision table

You can use a table like this during solution design.

Data type | Active use | Retain as inactive | Delete after | Owner | Notes
Main request record | Until closed | 6 years | 7 years | Process owner | Needed for audit
Attachments | Until closed | 6 years | 7 years | Process owner | Check sensitivity and file size
Approval comments | Until closed | 6 years | 7 years | Process owner | Part of decision history
Integration logs | 30 days | 90 days | 180 days | App owner | Operational troubleshooting only
Plugin trace logs | Short-term troubleshooting only | Normally not retained | Default cleanup removes records older than 24 hours | Platform team | Avoid changing this without a clear admin reason
Audit logs | Depends on policy | Depends on compliance | Depends on audit retention | Compliance / admin | Separate from business record

The exact numbers are not the point.

The point is that the conversation becomes visible.
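
If the table lives next to the solution as data, it can also be checked automatically. The sketch below is one possible shape, not a standard. `RetentionRow` and its fields are hypothetical, periods are simplified to whole years, and I assume the “delete after” period is counted from the day the record was closed.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class RetentionRow:
    data_type: str
    retain_inactive_years: int
    delete_after_years: int
    owner: str


def validate(rows: List[RetentionRow]) -> List[str]:
    """Return readable problems; an empty list means the table is usable."""
    problems = []
    for row in rows:
        if not row.owner.strip():
            problems.append(f"{row.data_type}: no owner")
        if row.delete_after_years < row.retain_inactive_years:
            problems.append(f"{row.data_type}: deleted before retention ends")
    return problems


def earliest_deletion(closed_on: date, row: RetentionRow) -> date:
    """Date on which a record closed on closed_on may be deleted."""
    target_year = closed_on.year + row.delete_after_years
    try:
        return closed_on.replace(year=target_year)
    except ValueError:
        # 29 February in a non-leap target year: fall back to 28 February.
        return closed_on.replace(year=target_year, day=28)
```

A row without an owner fails validation, which is exactly the kind of gap that otherwise stays hidden until someone asks for a cleanup.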


What I would ask before building the app

If I were reviewing a Dataverse solution design today, I would ask these questions before giving it a green light:

  1. What is the main business record?
  2. Is Dataverse the system of record?
  3. What data is personal, confidential or regulated?
  4. What is the active lifecycle of the record?
  5. What happens after the process is completed?
  6. Who owns the retention decision?
  7. What should be archived?
  8. What should be deleted?
  9. What should never be stored in the first place?
  10. Are attachments really needed in Dataverse?
  11. Is audit enabled, and why?
  12. How long are audit logs retained?
  13. Are there Power BI reports depending on old records?
  14. Are integrations depending on historical data?
  15. Do we need a legal hold process?
  16. Do we have a tested cleanup approach?
  17. What will this solution look like after three years of usage?

That last question is often the most important one.

Power Platform makes it easy to build for today.
Enterprise architecture forces us to think about year three.


Common mistakes I see

Mistake 1: Treating Dataverse as permanent storage for everything

Dataverse is excellent, but not every file, log, payload or historical snapshot needs to live forever in the active database.

Mistake 2: Enabling audit without discussing retention

Auditing is valuable, but it has a cost and a lifecycle. “Turn it on for everything” is not always a responsible decision.

Mistake 3: Confusing backup retention with business retention

Backups help you restore environments. They do not replace a proper business data retention model.

Mistake 4: Ignoring attachments

Attachments are often the silent capacity killer, especially in Power Pages scenarios.

Mistake 5: Waiting until capacity becomes a problem

At that point, every cleanup decision becomes urgent, political and risky.

Mistake 6: No business owner

If nobody owns the data, nobody can safely approve its deletion.

Mistake 7: Treating long-term retention as reporting architecture

Long-term retention is not designed to replace a proper reporting layer for frequent analytics, trend analysis or operational dashboards.


My recommendation

For every serious Dataverse solution, add a small “Data Lifecycle & Retention” section to the solution design.

It does not need to be complicated.

Start with:

  • what data is stored
  • why it is stored
  • who owns it
  • how long it is active
  • how long it is retained
  • when it is deleted
  • whether audit is required
  • whether attachments are required
  • whether reporting needs historical data
  • what happens during offboarding or decommissioning

If long-term retention is considered, validate whether the target tables and storage types are supported. Microsoft’s documentation for Dataverse long-term retention and setting a retention policy for a table should be reviewed before implementation.

This one section can prevent a lot of pain later.

Because Dataverse retention is not only about saving storage.

It is about building solutions that are responsible, explainable and sustainable.


Final thought

The uncomfortable truth is this:

Power Platform makes it very easy to create business data.

It does not automatically make your organization good at managing the lifecycle of that data.

That part still needs architecture.
It needs ownership.
It needs governance.
It needs a conversation that many teams prefer to avoid.

But if your solution matters enough to store data in Dataverse, it matters enough to define what should happen to that data later.

That is the retention conversation.

And it is better to have it before the storage warning appears.

